SNP Heritability

SciencePedia

Key Takeaways

The "missing heritability" problem describes the significant gap between heritability estimated from family studies and the smaller portion explained by top hits from early Genome-Wide Association Studies (GWAS).
SNP heritability is estimated by methods like GREML and LD Score Regression, which collectively account for the small, additive effects of thousands of common genetic variants across the entire genome.
The concept of SNP heritability provides the foundation for powerful applications, including the creation of Polygenic Scores (PRS) for trait prediction and the calculation of genetic correlations to reveal shared genetic underpinnings between different traits.
Despite its utility, genetic prediction has critical limitations, including modest accuracy and performance biases across different ancestral populations, warning against interpretations of genetic determinism.

Introduction

The quest to understand the genetic basis of complex traits, from our height to our risk for heart disease, is a central challenge in modern science. For decades, family and twin studies suggested that genetics played a major role, a concept quantified as heritability. Yet, with the dawn of the genomic era, a perplexing mystery emerged: the most powerful genetic studies could only find a fraction of this expected genetic influence. This puzzle, famously known as the "missing heritability" problem, challenged researchers and spurred the development of brilliant new ways to analyze the human genome.

This article navigates the journey to find this missing genetic signal. It explains not only what SNP heritability is but also how it is measured and why it has become such a foundational concept. Across two chapters, you will gain a comprehensive understanding of this powerful tool. In "Principles and Mechanisms," we will dissect the statistical innovations, such as GREML and LD Score Regression, that allowed scientists to finally account for the distributed, polygenic nature of complex traits. Following this, "Applications and Interdisciplinary Connections" will explore the revolutionary impact of this knowledge, from building predictive polygenic scores and mapping the genetic links between diseases to illuminating biological pathways and navigating the profound ethical frontiers of genetic prediction.

Principles and Mechanisms

The Case of the Missing Inheritance

Imagine you're a detective. Decades of investigating families, and especially twins, have given you a very strong clue: a certain trait, say, your risk for a complex disease, is largely inherited. By comparing identical twins (who share 100% of their DNA) to fraternal twins (who share, on average, 50%), you've confidently estimated that genetics accounts for about 75% of the variation in risk across the population. This number, the proportion of trait variation due to genetic variation, is what we call heritability.
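The twin comparison can be turned into a number with a classic back-of-the-envelope rule usually credited to Falconer: double the difference between the identical-twin and fraternal-twin trait correlations. The text does not name this rule, so treat the sketch below as one common way to get such an estimate; the two correlations are hypothetical values chosen to reproduce the 75% figure.

```python
def falconer_heritability(r_mz: float, r_dz: float) -> float:
    """Falconer's estimate of heritability: h^2 = 2 * (r_MZ - r_DZ).

    r_mz: trait correlation between identical (monozygotic) twins.
    r_dz: trait correlation between fraternal (dizygotic) twins.
    """
    return 2.0 * (r_mz - r_dz)


# Hypothetical twin correlations, picked only to match the 75% in the story.
h2_twin = falconer_heritability(r_mz=0.75, r_dz=0.375)
print(f"twin-based heritability: {h2_twin:.2f}")
```

The rule leans on the same assumptions the detective does, most importantly that both kinds of twins share their environments to the same degree.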

Armed with this knowledge and the new tools of the genomic era, you launch a massive investigation. You scan the genomes of hundreds of thousands of people, looking for tiny genetic markers called Single Nucleotide Polymorphisms (SNPs)—places in our DNA where individuals differ by a single "letter." This powerful technique, a Genome-Wide Association Study (GWAS), is like a dragnet designed to catch the genetic culprits. The study is a success! You find 20 SNPs that are robustly associated with the disease. But when you tally up their combined effect, you're faced with a baffling result: together, they explain only 15% of the risk variation.

Where did the other 60% of the heritability go? This huge and perplexing gap between the heritability estimated from family studies and the heritability explained by significant GWAS "hits" became one of the central puzzles in modern genetics, famously known as the "missing heritability" problem. The search for this missing inheritance has led us to a much deeper and more beautiful understanding of how our genomes work.

A Forest of Tiny Effects

The first clue to solving the mystery was a radical shift in perspective. Early on, we hoped to find a few "genes for" a trait, like a handful of large trees dominating a landscape. But what if the genetic landscape is more like a vast meadow? What if complex traits are not governed by a few genes of large effect, but by the combined influence of thousands, or even tens of thousands, of genetic variants, each contributing a tiny, almost imperceptible amount?

This is the essence of the polygenic or infinitesimal model. In a GWAS, to avoid being fooled by chance, we set an incredibly high bar for statistical significance (typically a p-value less than $5 \times 10^{-8}$ ). If a trait's genetic basis is spread thinly across thousands of SNPs, then the individual effect of most of these true causal variants will be too small to clear this high bar. They are real, but they are hidden below the floor of statistical detection. A GWAS that reports "no significant findings" does not mean the trait has no genetic component; it may simply mean the study was not powerful enough to see the countless blades of grass that make up the meadow. The heritability isn't truly missing; it's just hiding in plain sight, distributed across the entire genome in tiny, individually non-significant effects.

From Individuals to a Map of Genetic Similarity

If we can't find the culprits one by one, perhaps we can find them collectively. This led to a brilliant conceptual pivot. Instead of asking, "Does this specific SNP affect the trait?", we started asking, "On the whole, do people who are more genetically similar also tend to be more similar in their traits?"

To answer this, geneticists developed a new tool: the Genomic Relationship Matrix (GRM). Imagine you have a list of a million common SNPs for two seemingly unrelated people. You can go through this list and count how many of these SNP "letters" they share. By doing this for every pair of individuals in a large study, you can build a massive table—the GRM—that quantifies the actual, or realized, genetic similarity between any two people based on these common SNPs. It’s a high-resolution map of genetic kinship across a whole population.
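As a rough illustration, a GRM can be computed in a few lines. The sketch below uses the widely used standardized form, in which each SNP is centered and scaled by its allele frequency before similarities are averaged; this is a refinement of the simple "count shared letters" picture. The function name and the toy data are assumptions made for the example.

```python
import numpy as np


def genomic_relationship_matrix(genotypes: np.ndarray) -> np.ndarray:
    """Return the n x n GRM from an n x m matrix of 0/1/2 allele counts."""
    freqs = genotypes.mean(axis=0) / 2.0               # allele frequency per SNP
    polymorphic = (freqs > 0.0) & (freqs < 1.0)        # drop SNPs with no variation
    g = genotypes[:, polymorphic].astype(float)
    p = freqs[polymorphic]
    z = (g - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))   # standardize each SNP
    return z @ z.T / z.shape[1]                        # average similarity over SNPs


# Toy data: 50 unrelated people, 500 SNPs with random allele frequencies.
rng = np.random.default_rng(0)
toy_freqs = rng.uniform(0.1, 0.9, size=500)
toy_genotypes = rng.binomial(2, toy_freqs, size=(50, 500))
grm = genomic_relationship_matrix(toy_genotypes)
print(grm.shape)
```

Diagonal entries sit near 1 and, for unrelated people, off-diagonal entries scatter tightly around 0; those small deviations are the signal the next step exploits.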

With this map in hand, we can use a statistical method called Genomic-Relatedness-based Restricted Maximum Likelihood (GREML). The idea is wonderfully intuitive. We take the total variation we see in a trait (like height) and try to partition it into two buckets. The first bucket is for variance that correlates with our genetic similarity map (the GRM). The second bucket is for everything else—unexplained variance we attribute to environment or other non-genetic factors.

The model can be expressed with beautiful simplicity: $y = g + e$. Here, the phenotype ( $y$ ) of an individual is the sum of a genetic component ( $g$ ) and a residual component ( $e$ ). The magic is in the assumption we make about the variance of the genetic parts: we assume that the covariance of the genetic values between any two people is proportional to their entry in the GRM. The proportion of the total phenotypic variance that gets explained by this genetic component is called the SNP-based heritability, or $h^2_{\mathrm{SNP}}$ .

This method was a breakthrough. It looks at all the common SNPs simultaneously, not just the "significant" ones, and aggregates their subtle, collective influence. For many traits, SNP heritability accounts for a much larger fraction of the total heritability than the old GWAS-hit-based methods, confirming that the polygenic view was largely correct.
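To see the "similar genomes, similar traits" logic at work without a full REML optimizer, the sketch below swaps in Haseman-Elston regression, a simpler moment-based relative of GREML: it regresses the product of two people's standardized phenotypes on their GRM entry, and the slope estimates $h^2_{\mathrm{SNP}}$ . This is not GREML itself, and the simulation settings (sample size, SNP count, true heritability) are illustrative assumptions.

```python
import numpy as np


def simulate_polygenic_trait(n, m, h2, rng):
    """Simulate standardized genotypes and a trait with SNP heritability h2."""
    freqs = rng.uniform(0.1, 0.9, size=m)
    genotypes = rng.binomial(2, freqs, size=(n, m)).astype(float)
    z = (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)
    effects = rng.normal(0.0, np.sqrt(h2 / m), size=m)   # every SNP has a tiny effect
    g = z @ effects                                      # genetic component
    e = rng.normal(0.0, np.sqrt(1.0 - h2), size=n)       # residual component
    return z, g + e


def haseman_elston_h2(z, y):
    """Regress phenotype cross-products on off-diagonal GRM entries."""
    n, m = z.shape
    grm = z @ z.T / m
    y_std = (y - y.mean()) / y.std()
    upper = np.triu_indices(n, k=1)                  # each pair of people once
    a = grm[upper]
    yy = np.outer(y_std, y_std)[upper]
    return float(np.sum(a * yy) / np.sum(a * a))     # slope through the origin


rng = np.random.default_rng(0)
z, y = simulate_polygenic_trait(n=1000, m=1000, h2=0.5, rng=rng)
h2_hat = haseman_elston_h2(z, y)
print(f"estimated SNP heritability: {h2_hat:.2f} (simulated truth: 0.50)")
```

No single simulated SNP needs to reach significance for the estimate to land near the truth, which is the point of the aggregate approach.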

A Deeper Look into the Gap

Even with this new tool, a gap often remains. For human height, twin studies might suggest an 80% heritability, while SNP heritability ( $h^2_{\mathrm{SNP}}$ ) from common variants might capture around 50%. So, where is the rest of the missing heritability? The investigation has pointed to several suspects.

The Streetlight Effect: The Role of Rare Variants. Standard SNP arrays are like streetlights illuminating a city: they light up the common avenues (common SNPs, with a minor allele frequency above roughly 1% to 5%, depending on the study) very well, but leave the narrow alleys (rare variants) in the dark. What if a significant portion of heritability comes from a vast number of rare genetic variants? These variants might have larger effects individually, but because they are so uncommon, they are missed by standard arrays and thus not included in the typical $h^2_{\mathrm{SNP}}$ calculation. In contrast, twin studies, which measure the total effect of shared DNA, implicitly capture the contribution of all variants, common and rare. This hypothesis is strongly supported by the fact that when we use Whole-Genome Sequencing (WGS), which captures rare variants, the estimated heritability increases, closing some of the gap.
Are the Twin Estimates Inflated? Another possibility is that the original 75% estimate from our detective work was a bit too high. Twin studies rely on a critical assumption called the "equal environments assumption"—that identical twins don't share a more similar environment than fraternal twins. If this assumption is violated (e.g., identical twins are treated more alike), some of the environmental similarity can be mistaken for genetic influence, inflating the heritability estimate. Furthermore, other factors like assortative mating (the tendency for people to choose partners with similar traits) and indirect genetic effects from parents can also cause traditional pedigree-based estimates to be overestimates.
Beyond Simple Addition: Non-Additive Effects. Our standard $h^2_{\mathrm{SNP}}$ model assumes that the effects of genes are additive—the effect of having two risk alleles is twice the effect of having one. But biology can be more complex. There can be dominance effects (interactions between alleles at the same gene) and epistasis (interactions between different genes). These non-additive effects are not well captured by standard SNP heritability models but do contribute to the overall similarity of twins, potentially explaining another slice of the gap.

In essence, the "missing heritability" is not one thing. It's a combination of the infinitesimal effects of many common variants, the contribution of unmeasured rare variants, a possible overestimation from family studies, and complex non-additive interactions.

Finding Heritability Hidden in the Noise

Just when it seemed the story could not get any more clever, geneticists came up with another remarkable trick. This method, called LD Score Regression, allows us to estimate SNP heritability using only the summary results from a GWAS, without needing individual-level genetic data. Even more impressively, it can distinguish true polygenicity from confounding bias.

The key idea is Linkage Disequilibrium (LD). Genes aren't shuffled completely randomly every generation; chunks of the genome are inherited together. This means that SNPs that are physically close to each other on a chromosome tend to be correlated. For any given SNP, we can calculate an LD score: the sum of its squared correlations ( $r^2$ ) with the SNPs around it, a number that represents how much nearby genetic variation that SNP tags. A SNP in a "busy" genomic neighborhood with lots of correlation has a high LD score; one in a "lonely" region has a low one.
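A toy calculation makes the "busy" versus "lonely" distinction concrete. The sketch below computes each SNP's LD score as the sum of its squared correlations with every SNP in a small matrix (itself included); real pipelines restrict the sum to a window around the SNP and correct $r^2$ for sampling noise, both of which are skipped here. The simulated neighborhood is an assumption made for illustration.

```python
import numpy as np


def ld_scores(genotypes: np.ndarray) -> np.ndarray:
    """LD score of each SNP: sum of squared correlations with all SNPs in the matrix."""
    r = np.corrcoef(genotypes, rowvar=False)   # m x m SNP-by-SNP correlation matrix
    return np.sum(r ** 2, axis=1)


rng = np.random.default_rng(0)
n = 2000
source = rng.binomial(2, 0.4, size=n)
# Five SNPs in a "busy" neighborhood: mostly copies of one underlying variant,
# with about 10% of genotypes redrawn at random to mimic imperfect correlation.
redraw = rng.random((n, 5)) < 0.1
busy = np.where(redraw, rng.binomial(2, 0.4, size=(n, 5)), source[:, None])
# Five SNPs in "lonely" regions: drawn independently of everything else.
lonely = rng.binomial(2, 0.4, size=(n, 5))
scores = ld_scores(np.hstack([busy, lonely]).astype(float))
print(np.round(scores, 2))
```

The first five scores come out well above 1, while the last five stay close to 1, since a lonely SNP is correlated with little besides itself.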

Here is the insight: in a highly polygenic trait, a SNP can appear associated with the trait for two reasons. It could be a true causal variant itself, or it could simply be a bystander, correlated with one or more other true causal variants in its neighborhood. A SNP with a high LD score has many neighbors, so it has more chances to be a bystander to a causal variant than a SNP with a low LD score.

Therefore, we should expect that, on average, the association statistic from a GWAS (a $\chi^2$ statistic) for a SNP will be higher if its LD score is higher. In fact, there's a beautiful linear relationship: $\mathbb{E}[\chi^2] = \left( \frac{N h^2_{\mathrm{SNP}}}{M} \right) \ell + (1 + N a)$ , where $N$ is the GWAS sample size, $M$ is the number of SNPs, and $a$ measures the contribution of confounding. This equation tells a simple story. The expected association statistic ( $\mathbb{E}[\chi^2]$ ) for a SNP is linearly related to its LD score ( $\ell$ ). The slope of this line is proportional to the SNP heritability ( $h^2_{\mathrm{SNP}}$ ) divided by the number of SNPs ( $M$ ). For a given sample size, the more heritable a trait is, the steeper the slope. The intercept of the line, however, captures something different. It captures effects that inflate the $\chi^2$ statistic of all SNPs equally, regardless of their LD score. This is the signature of confounding biases like population stratification.

By simply regressing the observed $\chi^2$ statistics against the pre-computed LD scores for all SNPs, we can estimate the slope and the intercept, and thereby disentangle true, widespread polygenicity from statistical artifacts. It is an exceptionally elegant way to find the signal of heritability hidden within the "noise" of millions of association statistics.
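The regression itself is short enough to sketch. The example below generates synthetic LD scores and $\chi^2$ statistics that follow the linear relationship above, then recovers the heritability from the slope and the confounding signal from the intercept. It uses plain unweighted least squares; published implementations use a weighted regression with jackknife standard errors, so this is a simplification, and all numerical settings are assumptions.

```python
import numpy as np


def simulate_chi2(ld, n_samples, n_snps, h2, intercept, rng):
    """Draw chi-square statistics whose mean is (N * h2 / M) * ld + intercept."""
    expected = n_samples * h2 / n_snps * ld + intercept
    return expected * rng.chisquare(df=1, size=len(ld))


def ldsc_fit(chi2, ld, n_samples, n_snps):
    """Unweighted least-squares fit of chi2 = slope * ld + intercept."""
    slope, intercept = np.polyfit(ld, chi2, deg=1)
    h2 = slope * n_snps / n_samples          # because slope = N * h2 / M
    return float(h2), float(intercept)


rng = np.random.default_rng(0)
N, M = 20_000, 1_000_000
ld = rng.gamma(shape=2.0, scale=50.0, size=M)      # synthetic LD scores
chi2 = simulate_chi2(ld, N, M, h2=0.5, intercept=1.05, rng=rng)
h2_hat, intercept_hat = ldsc_fit(chi2, ld, N, M)
print(f"h2 estimate: {h2_hat:.3f}, intercept: {intercept_hat:.3f}")
```

An intercept noticeably above 1 is the warning sign: it indicates inflation that does not track LD, which polygenicity alone would not produce.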

Seeing Through the Fog of Ancestry

The problem of confounding bias mentioned above, particularly population stratification, is a ghost that haunts all genetic studies. Suppose your study includes people from two different ancestral populations, say, Northern and Southern Europe. And suppose these two populations happen to have different average heights for purely environmental reasons, and they also have slightly different frequencies of certain SNPs for purely historical reasons. If you naively combine these groups, you will find spurious associations between those SNPs and height, even if the SNPs have nothing to do with height biologically. The genetic differences are simply acting as a proxy for the environmental differences.

This is where the geneticist must be a careful detective. A common and powerful way to correct for this is to use Principal Component Analysis (PCA) on the genetic data. PCA is a mathematical technique that finds the major axes of variation in a dataset. In genetics, the top principal components almost always correspond to axes of ancestry. By including these principal components as covariates in our statistical models, be it a GWAS or a GREML analysis, we are essentially telling the model, "Pay attention to the genetic variations that exist within ancestral groups, not the big average differences between them." This acts like a pair of statistical sunglasses, allowing us to see through the fog of ancestry and focus on the true genetic effects we seek to discover. LD Score regression, which works from summary statistics, inherits this correction from the underlying GWAS and flags any residual confounding through its intercept.
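The two-population scenario can be simulated directly. In the sketch below, no SNP affects height at all; the groups differ in average height for environmental reasons and in allele frequencies by chance. The top principal component recovers group membership, and regressing it out removes the spurious ancestry signal from the phenotype. Group sizes, frequency differences, and function names are assumptions chosen for the example.

```python
import numpy as np


def simulate_two_populations(n_per_pop, n_snps, rng):
    """Two groups with slightly different allele frequencies and a height gap."""
    freq_a = rng.uniform(0.2, 0.8, size=n_snps)
    freq_b = np.clip(freq_a + rng.normal(0.0, 0.15, size=n_snps), 0.05, 0.95)
    genotypes = np.vstack([
        rng.binomial(2, freq_a, size=(n_per_pop, n_snps)),
        rng.binomial(2, freq_b, size=(n_per_pop, n_snps)),
    ]).astype(float)
    population = np.repeat([0.0, 1.0], n_per_pop)
    # No SNP affects height here: the gap between groups is purely environmental.
    height = 2.0 * population + rng.normal(size=2 * n_per_pop)
    return genotypes, population, height


def top_principal_components(genotypes, k):
    """Top-k principal components (one row per person) of standardized genotypes."""
    z = (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :k] * s[:k]


def residualize(y, covariates):
    """Remove the part of y explained linearly by the covariates plus an intercept."""
    design = np.column_stack([np.ones(len(y)), covariates])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef


rng = np.random.default_rng(0)
genotypes, population, height = simulate_two_populations(200, 1000, rng)
pcs = top_principal_components(genotypes, k=2)
height_adj = residualize(height, pcs)
print("corr(PC1, population):     ", round(abs(np.corrcoef(pcs[:, 0], population)[0, 1]), 2))
print("corr(height, population):  ", round(np.corrcoef(height, population)[0, 1], 2))
print("corr(adjusted, population):", round(np.corrcoef(height_adj, population)[0, 1], 2))
```

In an actual association model the components enter as covariates alongside each SNP rather than being regressed out beforehand, but the effect on the ancestry signal is the same.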

Through this journey—from the initial puzzle of missing heritability to the development of sophisticated tools that view the genome holistically and carefully guard against confounding—we have come to appreciate the beautifully complex and subtle architecture of our most human traits.

Applications and Interdisciplinary Connections

Having grappled with the principles of SNP heritability, we now arrive at the truly exciting part of our journey. What can we do with this knowledge? If the previous chapter was about forging a new key, this one is about the multitude of doors it unlocks. You will see that measuring the genetic component of a trait is not an end in itself; it is the starting point for a cascade of applications that stretch from medicine and biology to the very study of our evolutionary history and the structure of our societies. This is where the abstract concept of heritability becomes a powerful lens for viewing the world.

The Art of Genetic Prophecy: Polygenic Scores

The most direct application of SNP heritability is prediction. If a significant portion of the variation in a trait is tied to common genetic markers, then surely we can use an individual’s markers to make a forecast about their trait. This is the simple, powerful idea behind a polygenic score (commonly abbreviated PRS, from the older name "polygenic risk score").

Imagine you have the results from a massive Genome-Wide Association Study (GWAS), which has estimated the tiny effect ( $\hat{\beta}_j$ ) of millions of SNPs on a trait, say, height. To calculate a person’s PRS, you simply go through their genome, and for each SNP, you multiply their genotype (0, 1, or 2 copies of the effect-raising allele) by its estimated effect size. Sum them all up, and you have a single number: a personalized estimate of their genetic predisposition to being tall.
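In code, the score is a single weighted sum per person. The sketch below uses a tiny made-up example (three people, four SNPs, invented effect sizes) purely to show the arithmetic.

```python
import numpy as np


def polygenic_score(genotypes, effect_sizes):
    """Sum over SNPs of genotype (0, 1, or 2 effect alleles) times estimated effect."""
    return np.asarray(genotypes, dtype=float) @ np.asarray(effect_sizes, dtype=float)


# Rows are people, columns are SNPs. All values are invented for illustration.
genotypes = np.array([
    [0, 1, 2, 1],
    [2, 0, 1, 0],
    [1, 1, 0, 2],
])
beta_hat = np.array([0.02, -0.01, 0.03, 0.005])   # per-allele effect estimates
scores = polygenic_score(genotypes, beta_hat)
print(scores)
```

With millions of SNPs the computation is the same matrix-vector product, only larger; the scientific difficulty lies in estimating the effect sizes, not in adding them up.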

But how good is this genetic forecast? Our intuition, and the mathematics, tells us that the accuracy must depend on two things: the strength of the genetic signal itself and the clarity with which we were able to measure it. The "signal" is none other than the SNP heritability, $h^2_{SNP}$ . The "clarity" is a function of the size of our GWAS, $N$ . A larger study reduces the statistical noise in our effect size estimates. A beautiful and simple relationship emerges for the expected accuracy (the correlation, $\rho$ , between the score and the actual trait):

$$ \rho \approx \frac{h^2_{SNP}}{\sqrt{h^2_{SNP} + \frac{M}{N}}} $$

where $M$ is the effective number of independent genetic variants influencing the trait. Look at this little formula! It’s wonderfully instructive. The accuracy of our prediction is a contest between the signal ( $h^2_{SNP}$ in the numerator) and the total variance, which is the signal plus a noise term ( $\frac{M}{N}$ ). To get a better prediction, you either need a trait with a stronger genetic signal (larger $h^2_{SNP}$ ) or you must shrink the noise term. Since we can't change a trait's heritability, the path forward is clear: increase $N$ . This simple equation is the engine driving the global race to build ever-larger biobanks and conduct GWAS with millions of participants. It explains why a PRS for height built from a 2018 study of 700,000 people is vastly more predictive than one from a 2010 study of 180,000 people.
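Plugging numbers into the accuracy formula above shows how sample size moves the needle. In the sketch below, only the two sample sizes come from the text; the heritability of 0.5 and the effective number of variants (60,000) are illustrative assumptions, so the printed correlations indicate the trend rather than actual published accuracies.

```python
import math


def expected_prs_correlation(h2_snp: float, n_gwas: float, m_effective: float) -> float:
    """Expected correlation between score and trait: h2 / sqrt(h2 + M / N)."""
    return h2_snp / math.sqrt(h2_snp + m_effective / n_gwas)


# h2_snp and m_effective are assumed values, not estimates from the text.
for n in (180_000, 700_000):
    rho = expected_prs_correlation(h2_snp=0.5, n_gwas=n, m_effective=60_000)
    print(f"N = {n:>7,}: expected correlation = {rho:.2f}")
```

As $N$ grows without bound, the noise term vanishes and the correlation approaches $\sqrt{h^2_{SNP}}$ , the ceiling set by the heritability itself.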

However, nature adds a delightful wrinkle. The total heritability is not the whole story. The genetic architecture—how that heritability is distributed—also matters. Imagine two diseases, both with the same total $h^2_{SNP}$ . "Oligogenia" has its heritability concentrated in a few SNPs with relatively large effects, while "Polygenia" has its heritability spread out like a fine dust over thousands of SNPs with minuscule effects. Our GWAS has a detection threshold; it can only "see" SNPs whose effects are large enough to stand out from the statistical noise. As a result, a PRS for Oligogenia will capture a much larger fraction of its total heritability and be a much better predictor than a PRS for Polygenia, whose genetic basis is too diffuse for our current tools to fully capture. Understanding SNP heritability is not just about the total amount, but also its pattern and structure.

The Shared Threads: Genetic Correlation and Pleiotropy

The world of traits is not a collection of independent islands. Diseases and behaviors are often correlated. Smokers are at higher risk for heart disease; depression and anxiety often co-occur; educational attainment is correlated with a longer lifespan. For centuries, we could only observe these correlations. With the tools of SNP heritability, we can now ask a deeper question: are these traits linked at a genetic level?

Enter Linkage Disequilibrium (LD) Score Regression. This wonderfully clever technique allows us to estimate the genetic correlation between two traits using only the summary statistics from their respective GWASs. The insight is this: for a truly polygenic trait, SNPs in regions of high LD (where many variants are correlated with each other) will, on average, tag more causal variants and thus show stronger GWAS association signals. Now, consider two different traits, say, schizophrenia and bipolar disorder. If we find that for both traits, the GWAS association statistics tend to be higher in the same high-LD regions, it suggests they are drawing from the same well of causal variants. LD Score regression quantifies this, allowing us to estimate the genetic covariance, and from that, the genetic correlation ( $r_g$ ).

This single number, $r_g$ , is a profound measure of pleiotropy, the tendency of single genes to affect multiple, seemingly distinct, traits. It unveils a hidden web of connections. For instance, Crohn's disease and ulcerative colitis are both inflammatory bowel diseases. LD Score regression (LDSC) reveals they have a substantial positive genetic correlation ( $r_g \approx 0.60$ ). This isn't just a curious fact; it has practical consequences. If you build a PRS for Crohn's disease and apply it to people with ulcerative colitis, you will find it has real predictive power for colitis. The amount of variance it can explain is directly related to the genetic correlation: for an ideal, noise-free score it is approximately $R^2 \approx r_g^2 h^2_{\text{UC}}$ , and a score built from a finite GWAS will explain less. The shared genetic architecture means a predictive tool for one can serve, albeit less powerfully, as a tool for the other.
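A quick calculation shows how steeply the genetic correlation discounts cross-trait prediction. The sketch below takes the relationship quoted above at face value; the genetic correlation is the value from the text, while the heritability of ulcerative colitis is a placeholder, not a published estimate.

```python
def cross_trait_r2(r_g: float, h2_target: float) -> float:
    """Variance in the target trait explained by a score built for a correlated trait."""
    return r_g ** 2 * h2_target


# h2_target = 0.25 is a placeholder value used only to illustrate the arithmetic.
r2_uc = cross_trait_r2(r_g=0.60, h2_target=0.25)
print(f"cross-trait R^2: {r2_uc:.3f}")
```

Because $r_g$ enters squared, a correlation of 0.60 passes on only 36% of the variance that a score built for the target trait itself could explain.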

Illuminating Biology and Evolution

So far, we have used SNP heritability to predict traits and map their connections. But we can push even further. We can use it as a flashlight to illuminate the biological machinery underlying a trait and even to peer into our own evolutionary history.

The tool for this is stratified LD Score regression. Instead of calculating a single LD score for each SNP, we can partition it based on where the SNP falls in the genome. We can create separate LD scores for SNPs inside genes, in regulatory regions known as "enhancers," near transcription start sites, and so on. By then looking at which of these annotation-specific LD scores best explains the GWAS signals for a trait, we can partition the trait's total heritability across these functional categories.

The results are stunning. For a brain-related trait like cortical thickness, we find that heritability is not spread evenly. Instead, it is significantly enriched in genomic regions that are functionally active in the brain, like transcription start sites and enhancers. This provides powerful, independent evidence that our statistical associations are pointing to real biology. We are bridging the vast gap between population-level statistics and molecular function.
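The partitioning step can be sketched as a multiple regression: the $\chi^2$ statistics are regressed on several annotation-specific LD scores at once, and each coefficient, divided by the sample size, estimates the per-SNP heritability of that category. The example below uses two non-overlapping synthetic categories ("enhancer" and "other") in which 10% of SNPs carry 40% of the heritability; it uses unweighted least squares and invented settings, so it illustrates the idea rather than a production analysis.

```python
import numpy as np


def simulate_stratified_chi2(n_snps, n_samples, tau, rng):
    """Synthetic annotation-specific LD scores and chi-square statistics (intercept 1)."""
    annot_ld = np.column_stack([
        rng.gamma(shape=2.0, scale=5.0, size=n_snps),    # LD to "enhancer" SNPs
        rng.gamma(shape=2.0, scale=45.0, size=n_snps),   # LD to all other SNPs
    ])
    expected = n_samples * (annot_ld @ tau) + 1.0
    return annot_ld, expected * rng.chisquare(df=1, size=n_snps)


def partitioned_h2(chi2, annot_ld, n_samples):
    """Fit chi2 = N * sum_c tau_c * ld_c + intercept; return per-SNP h2 (tau) and intercept."""
    design = np.column_stack([annot_ld, np.ones(len(chi2))])
    coef, *_ = np.linalg.lstsq(design, chi2, rcond=None)
    return coef[:-1] / n_samples, float(coef[-1])


def enrichment(tau, snps_per_category):
    """Share of heritability divided by share of SNPs (valid for disjoint categories)."""
    h2_c = tau * snps_per_category
    return (h2_c / h2_c.sum()) / (snps_per_category / snps_per_category.sum())


rng = np.random.default_rng(0)
snps_per_category = np.array([50_000.0, 450_000.0])     # enhancers: 10% of SNPs
true_tau = np.array([0.16, 0.24]) / snps_per_category   # enhancers: 40% of h2 = 0.4
annot_ld, chi2 = simulate_stratified_chi2(500_000, 100_000, true_tau, rng)
tau_hat, intercept_hat = partitioned_h2(chi2, annot_ld, 100_000)
print(np.round(enrichment(tau_hat, snps_per_category), 2))
```

An enrichment near 4 for the first category says its SNPs carry about four times their proportional share of heritability, which is the kind of signal reported for brain-active regions above.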

This approach can also be turned into a kind of genetic archaeology. The genomes of modern non-African humans contain small segments (around 2% in total) that were inherited from our ancient Neanderthal relatives. We can create a genomic annotation: "archaic" vs. "modern." Using stratified LD Score regression, we can then ask if the heritability of certain traits is enriched in these archaic segments. For some traits, like those related to immunity or skin pigmentation, we find exactly this. This provides a statistical clue that these introgressed genes may have been beneficial, perhaps helping our ancestors adapt to new pathogens and different levels of sunlight as they moved out of Africa. From a simple GWAS, we are learning about events in human evolution that took place tens of thousands of years ago.

From Correlation to Causation: A Glimpse into Mendelian Randomization

Perhaps the boldest application of genetics is to untangle correlation from causation. Does drinking alcohol cause lung cancer? They are correlated, but the correlation is likely confounded by smoking. Observational epidemiology is fraught with such challenges.

Genetics offers a unique way to tackle this: Mendelian Randomization (MR). At conception, the genetic variants we inherit from our parents are assigned randomly (a consequence of Mendel's laws of segregation and independent assortment). This means our genotype is a naturally "randomized" instrument. If a gene variant robustly influences an exposure (like alcohol consumption), and that same variant is also associated with an outcome (like heart disease), and it doesn't affect the outcome through any other pathway (the "no horizontal pleiotropy" assumption), then we can infer a causal link from the exposure to the outcome.
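A small simulation shows why the genetic instrument helps. The sketch below uses the simplest MR estimator, the Wald ratio (the variant's effect on the outcome divided by its effect on the exposure), which the text does not name and which is swapped in here as the most basic illustration. An unmeasured confounder drives both exposure and outcome, so the ordinary regression slope is badly biased, while the ratio recovers the simulated causal effect. All effect sizes are assumptions.

```python
import numpy as np


def simulate_mr(n, causal_effect, rng):
    """One variant, one exposure, one outcome, and a confounder affecting both."""
    genotype = rng.binomial(2, 0.3, size=n).astype(float)
    confounder = rng.normal(size=n)                     # unmeasured in practice
    exposure = 0.5 * genotype + confounder + rng.normal(size=n)
    outcome = causal_effect * exposure + confounder + rng.normal(size=n)
    return genotype, exposure, outcome


def naive_slope(exposure, outcome):
    """Ordinary regression slope of outcome on exposure (confounded)."""
    return float(np.cov(exposure, outcome)[0, 1] / np.var(exposure, ddof=1))


def wald_ratio(genotype, exposure, outcome):
    """(variant effect on outcome) / (variant effect on exposure)."""
    beta_gx = np.cov(genotype, exposure)[0, 1] / np.var(genotype, ddof=1)
    beta_gy = np.cov(genotype, outcome)[0, 1] / np.var(genotype, ddof=1)
    return float(beta_gy / beta_gx)


rng = np.random.default_rng(0)
g, x, y = simulate_mr(n=100_000, causal_effect=0.3, rng=rng)
print(f"naive slope: {naive_slope(x, y):.2f}")
print(f"Wald ratio:  {wald_ratio(g, x, y):.2f} (simulated causal effect: 0.30)")
```

The estimate is only as good as the assumptions: if the variant influenced the outcome through a second pathway, the ratio would be biased too.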

Finding good genetic instruments and testing the assumptions of MR is a complex field in itself. But the tools we have already developed play a crucial screening role. With summary statistics for hundreds of traits, we can run a "GWAS of causality." We can use LD Score regression to quickly scan all possible pairs of traits, filtering for those that have a significant genetic correlation and show no evidence of confounding from sample overlap. We can even develop directional heuristics to prioritize which direction of causality ( $X \to Y$ or $Y \to X$ ) is more plausible before diving into formal MR, helping to make this massive search computationally tractable. SNP heritability and genetic correlation thus become the foundational layer for a new kind of causal science.

The Human Element: Societal and Ethical Frontiers

Power invites responsibility. The ability to read and predict from the genome opens up social and ethical frontiers that we must navigate with extreme care. The very existence of SNP heritability and PRS has been misinterpreted as a justification for a new kind of genetic determinism. It is our duty as scientists to forcefully and clearly state the limitations.

Consider a proposal to use a PRS for educational attainment to stream children into different academic tracks. On its face, this is a horrifying prospect, and the science itself provides the most potent arguments against it. First, the predictive power is extremely modest. A PRS for educational attainment might explain 12% of the variance ( $R^2 \approx 0.12$ ). This means a full 88% of the variation in outcomes is due to everything else: environment, opportunity, luck, and genetic factors we haven't measured. Basing a life-altering decision on such a noisy predictor is statistically indefensible and guarantees high rates of misclassification. Second, heritability is a population statistic, not an individual's destiny. It describes "what is" in a specific population at a specific time, not "what must be" for an individual. Third, and most critically, these scores are not portable. A PRS trained in one ancestral group (e.g., European) performs poorly and can be systematically biased when applied to other groups (e.g., African or Asian). Using such a flawed tool in a diverse population would not be a meritocratic equalizer; it would be an engine for amplifying existing social and racial inequalities under a false veneer of scientific objectivity.
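The misclassification point can be made quantitative under a simple assumption. If the score and the outcome are treated as jointly normal with correlation $\sqrt{R^2}$ , a standard result for bivariate normal variables gives the share of people who land above the median on one and below it on the other. The sketch below applies it to the 12% figure; the bivariate-normal model and the median split are assumptions introduced for illustration, not part of any actual streaming proposal.

```python
import math


def median_split_discordance(r_squared: float) -> float:
    """Share of people whose above/below-median status differs between score and outcome.

    Assumes score and outcome are bivariate normal with correlation sqrt(r_squared).
    """
    rho = math.sqrt(r_squared)
    return 0.5 - math.asin(rho) / math.pi


share = median_split_discordance(0.12)
print(f"misclassified by a median split: {share:.1%}")
```

Under these assumptions, well over a third of children would be placed on the wrong side of the line, compared with half for a coin flip.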

Similar arguments apply to the even more contentious topic of using PRS to select IVF embryos. Beyond the profound ethical questions, the statistical realities are sobering. Because of pleiotropy, selecting an embryo to have a lower genetic risk for heart disease might unintentionally increase its risk for an autoimmune disorder. Furthermore, when selecting from a small batch of, say, five embryos, the expected gain is statistically modest. The reduction in disease liability achieved by picking the "best" embryo is far smaller than often portrayed, limited by both the predictive power of the score and the small amount of genetic variation available within a single family.

The concept of SNP heritability, then, is not a measure of our genetic chains. It is a scientific instrument of remarkable power and scope. It allows us to build predictive models, to see the hidden genetic threads that link different facets of our lives, to illuminate the cogs of our biology, and to read the faint echoes of our evolutionary past. But like any powerful instrument, its use requires wisdom, humility, and a clear-eyed understanding of its profound limitations.