
The shape of intelligence

March 2026

Does it make sense to say that some people are more athletic than others? Or should we always break athletic ability into narrower traits like strength, speed, endurance, coordination, and balance?

Both descriptions are useful. A sprinter, a gymnast, and a powerlifter are not interchangeable. They have different strengths. But it is still meaningful to talk about athletic ability in general, because many physical abilities are positively correlated. People who do well on one physical test often do better than average on others.

That does not mean athleticism is one simple substance in the body. It means that, across a broad population and a broad set of tests, physical performance has shared structure.

§ PCA

A simple way to see this is with principal component analysis.

Suppose we give many people many tests. Each row is a person, each column is a test. For athletic ability, the tests might include grip strength, sprint speed, endurance, balance, vertical jump, and coordination. Before analysis, we orient every score so that higher means better performance, then standardize each test to have mean 0 and variance 1.

This gives us a standardized matrix Z. From it, we compute the correlation matrix:

R = \frac{1}{M - 1} Z^\top Z

where M is the number of people.
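In code, the setup is only a few lines. Here is a minimal numpy sketch; the scores are made-up placeholder data standing in for a real test battery:

```python
import numpy as np

# Placeholder scores: rows are people, columns are tests,
# already oriented so that higher means better.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))  # M = 500 people, 6 tests

# Standardize each test to mean 0 and variance 1.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Correlation matrix R = Z^T Z / (M - 1).
M = Z.shape[0]
R = (Z.T @ Z) / (M - 1)

print(np.allclose(np.diag(R), 1.0))  # each test correlates 1 with itself
```

With this standardization, R is symmetric with ones on the diagonal, and its off-diagonal entries are the pairwise test correlations.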

Then we decompose the correlation matrix:

R = Q \Lambda Q^\top

The first eigenvector, q_1, is the direction that captures the most variance in the test battery. The corresponding eigenvalue tells us how much of the total standardized variation is explained by that direction.

Each person's score on this first principal component is:

s = Z q_1

or, for a single person:

s_k = z_k^\top q_1
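Putting the whole pipeline together, here is a toy sketch. The data are simulated from a single made-up latent factor (an assumption for illustration, not real test data), so the tests come out positively correlated:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: six positively correlated tests generated from one
# made-up latent factor plus independent noise.
latent = rng.normal(size=(500, 1))
X = 0.7 * latent + 0.7 * rng.normal(size=(500, 6))

# Standardize and form the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = (Z.T @ Z) / (Z.shape[0] - 1)

# Decompose R = Q Lambda Q^T; eigh returns eigenvalues in ascending order.
eigvals, Q = np.linalg.eigh(R)
q1 = Q[:, -1]                  # first eigenvector (largest eigenvalue)
lam1 = eigvals[-1]

s = Z @ q1                     # each person's first-PC score
share = lam1 / eigvals.sum()   # fraction of variance explained by the first PC
```

Because the trace of R equals the number of tests, `share` is simply the first eigenvalue divided by the test count.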

That score is not a mystical essence. It is a statistical summary. It says: given this set of tests, here is the single dimension that best summarizes broad performance.

If the tests were unrelated, this first dimension would not be very important. But when many tests are positively correlated, the first component matters. The data cloud is not spherical. It has a dominant axis.

That dominant axis is what we usually mean, in ordinary language, by "general athletic ability." It does not erase narrower traits. A person can be unusually strong, unusually fast, or unusually coordinated relative to their general level. But those narrower differences sit on top of a broader common pattern.

§ Intuition

Imagine each person as a point in high-dimensional space, with one axis per test. If the abilities were unrelated, the cloud of points would look roughly spherical. But when many abilities are positively correlated, the cloud stretches. It looks more like a cigar than a ball, stretched along one axis and narrow along the rest. The first principal component is the long axis of that cigar.

[Diagram: the data cloud stretched along its long axis, the direction of general athletic ability]
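The cigar picture can be checked numerically. In this sketch (simulated data, with an assumed one-factor generative model), uncorrelated tests give a first component explaining roughly 1/p of the variance, while a shared latent factor makes it dominant:

```python
import numpy as np

rng = np.random.default_rng(2)
M, p = 2000, 6

def first_pc_share(X):
    """Fraction of standardized variance along the first principal axis."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    eigvals = np.linalg.eigvalsh((Z.T @ Z) / (M - 1))
    return eigvals[-1] / eigvals.sum()

# Uncorrelated tests: a roughly spherical cloud.
sphere = rng.normal(size=(M, p))

# A shared latent factor stretches the cloud along one axis.
cigar = rng.normal(size=(M, 1)) + rng.normal(size=(M, p))

print(round(first_pc_share(sphere), 2))  # near 1/6, the spherical baseline
print(round(first_pc_share(cigar), 2))   # well above 1/6: a dominant axis
```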

§ Intelligence

The same logic applies to intelligence.

Give many people a broad battery of cognitive tests: vocabulary, arithmetic, memory, spatial reasoning, pattern recognition, processing speed, and so on. Standardize the tests. Build the correlation matrix. Run PCA. The first principal component is a general cognitive-performance dimension.

That PCA-derived dimension is what I mean here by g.

This does not mean that intelligence is one simple thing in the brain. Nor does it mean that verbal ability, memory, spatial reasoning, and processing speed are the same skill. They are not. The claim is narrower and more defensible: broad cognitive abilities are positively correlated, and the first principal component of a diverse cognitive test battery captures a meaningful part of that shared variation.

Nor is g merely an artifact of one particular test battery. Researchers can use different broad sets of cognitive tests and still tend to recover a similar general factor. The exact factor will vary somewhat depending on the tests, the sample, and the method, but the basic pattern is stable: diverse cognitive tests usually form a positive manifold, and a dominant general dimension repeatedly appears.
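One way to see this stability in miniature is to simulate two disjoint "batteries" driven by the same made-up latent factor, extract each battery's first component separately, and compare the scores. Everything below is simulated, and the loadings are arbitrary assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
M = 2000
g_true = rng.normal(size=(M, 1))  # made-up shared latent factor

def battery(n_tests, loading=0.8):
    """Simulated battery: each test = loading * latent factor + noise."""
    return loading * g_true + rng.normal(size=(M, n_tests))

def pc1_scores(X):
    """Scores on the first principal component of a battery."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, Q = np.linalg.eigh((Z.T @ Z) / (M - 1))
    return Z @ Q[:, -1]

A = battery(5)  # one set of five tests
B = battery(5)  # a completely different set of five tests

# Eigenvector sign is arbitrary, so compare the absolute correlation.
r = abs(np.corrcoef(pc1_scores(A), pc1_scores(B))[0, 1])
```

Even though A and B share no tests, their first-component scores correlate strongly, because both components are tracking the same underlying factor.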

That cross-battery consistency is the strongest evidence that g is not just "whatever this one IQ test happens to measure." The tests produce the score in the mechanical sense, just as physical tests produce an athletic-performance score. But the reason the score is interesting is that different reasonable test batteries keep revealing a similar structure.

So g is test-derived, but not arbitrary. It is a statistical summary of a stable empirical pattern.

The athletic analogy helps because it blocks a common confusion. Saying that athleticism exists does not deny that different sports require different traits. Likewise, saying that g exists does not deny that different mental tasks require different abilities. A mathematician, lawyer, novelist, mechanic, and chess player use different mixtures of skill and knowledge. But it can still be true that people who perform well on one broad class of cognitive tasks tend, on average, to perform well on others.

The practical value of g is in prediction, not explanation. g predicts; it does not fully explain. Performance also depends on domain knowledge, motivation, conscientiousness, health, personality, social skill, opportunity, and luck.

That distinction matters. g is not the whole person and not the whole mind. It is a useful general predictor of performance on many cognitively demanding tasks, especially when looking across the full population rather than inside already-selected groups.

Many people mostly observe restricted ranges. In a selective university, hospital, law firm, or engineering company, nearly everyone has already passed cognitive filters. Inside that narrower group, other traits become more visible. Ambition, discipline, taste, confidence, social fluency, and domain knowledge may explain much of the remaining variation. But that does not mean general cognitive ability was unimportant. It means selection already removed much of the variation.

This is like studying only professional basketball players and concluding that height barely matters. Among NBA players, height is not enough to distinguish the best from the worst. But that does not mean height is irrelevant to becoming an NBA player.

The same mistake happens with intelligence. Within elite environments, people notice the limits of intelligence. Across the wider population, its importance is harder to miss.

There is also a social reason people resist this. Cognitive differences are uncomfortable to discuss because people often confuse equal dignity with equal ability. But those are different claims. A society can hold that every person deserves the same rights and respect while also recognizing that people differ in memory, reasoning, learning speed, attention, and abstraction.

Avoiding direct discussion of intelligence does not make cognitive hierarchy disappear. It mostly pushes judgment onto proxies.

People still sort one another. They use credentials, fluency, accent, polish, confidence, cultural fit, recommendation letters, elite affiliations, and the ability to present oneself well. These signals are not meaningless. Some contain useful information. But they are indirect, noisy, and often easier for class advantage to shape.

That is why direct measures can sometimes be more honest than holistic judgment. Standardized tests are imperfect. They are noisy, coachable, and never measure the whole person. But the alternatives are also imperfect. Essays can be polished by others. Extracurriculars can be purchased. Recommendations depend on networks. Interviews reward confidence and cultural fluency. Credentials often reflect prior access.

So the relevant comparison is not between standardized tests and perfection. It is between standardized tests and the proxies that replace them.

If society needs to select people for cognitively demanding roles, then some kind of sorting will happen. The question is whether that sorting is relatively direct, transparent, and measurable, or whether it is hidden inside a bundle of softer signals that are harder to audit.

The strongest argument for g, then, is not that intelligence is simple. It is that broad cognitive performance has a real statistical structure. PCA gives a clean way to describe that structure: g, as used here, is the first principal component extracted from a broad, well-designed battery of cognitive tests.

That component does not explain everything. It does not define human worth. It does not make narrower talents irrelevant. But it is real, it is repeatedly recovered across broad test batteries, it predicts important outcomes, and pretending otherwise does not make society kinder. It often just replaces measurement with less honest forms of judgment.