Genetics: Inheritance and Variation
Mendel's Peas
Gregor Mendel was a 19th-century Augustinian friar with a mathematical streak. He grew thousands of pea plants in a monastery garden and studied how traits passed between generations.
He picked peas for good reasons: they have several clearly distinguishable traits (tall or short plant, smooth or wrinkled seeds, yellow or green peas), they self-pollinate easily, and they grow fast. He crossed plants with different traits and counted the offspring.
From the counts, he worked out the foundational rules of genetics, nearly a century before DNA was shown to be the material of heredity.
Mendel's basic observations
- Traits come in alternative forms (tall vs short). Each plant inherits one form from each parent
- Some forms are dominant; others are recessive. If a plant has one copy of each, the dominant form shows
- Different traits are inherited independently of each other (mostly)
This became "Mendelian genetics", and it's broadly correct for simple traits. Most interesting traits are more complex, but Mendel's rules are the starting point.
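Mendel's counting can be reproduced with a small Punnett-square calculation. The sketch below is illustrative only: the function names and the one-letter allele notation ("A" dominant, "a" recessive) are assumptions made for this example, and it covers a single gene with complete dominance.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1: str, parent2: str) -> dict[str, Fraction]:
    """Return offspring genotype probabilities for a single-gene cross.

    Each parent is a two-letter genotype such as "Aa". Every gamete
    carries one of the parent's two alleles with equal probability.
    """
    counts = Counter(
        # Sort the pair so "aA" and "Aa" count as the same genotype.
        "".join(sorted(a + b))
        for a, b in product(parent1, parent2)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


def dominant_fraction(genotypes: dict[str, Fraction], dominant: str = "A") -> Fraction:
    """Fraction of offspring showing the dominant trait (complete dominance)."""
    return sum((p for g, p in genotypes.items() if dominant in g), Fraction(0))


offspring = cross("Aa", "Aa")
print(offspring)                     # genotype probabilities: AA 1/4, Aa 1/2, aa 1/4
print(dominant_fraction(offspring))  # 3/4
```

Crossing two heterozygotes gives the 3:1 ratio of dominant to recessive phenotypes that Mendel counted in his second-generation plants.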
Modern Vocabulary
Translating Mendel into modern terms:
- Gene: a stretch of DNA encoding a functional product
- Allele: a specific version of a gene (e.g. the "tall" version, the "short" version)
- Genotype: the alleles an individual has
- Phenotype: the observable trait that results from the genotype
- Homozygous: two identical alleles at a locus
- Heterozygous: two different alleles at a locus
- Dominant: the allele whose trait shows when heterozygous
- Recessive: the allele whose trait is masked when heterozygous
- Locus: the position of a gene on a chromosome
For each autosomal gene (a gene on one of the 22 pairs of non-sex chromosomes), you inherit one allele from each parent. Your genotype is the pair. Your phenotype depends on how the pair interacts.
Dominance isn't always simple
Classical dominance (Mendel's peas) is the simple case. Real biology has variants:
- Complete dominance: Mendelian. AA and Aa look the same; aa looks different
- Incomplete dominance: Aa is an intermediate between AA and aa (red + white = pink flowers)
- Codominance: Aa shows both A and a traits (AB blood type expresses both A and B)
- Overdominance: Aa is fitter than either homozygote (classic example: the sickle-cell allele, where carriers are partly protected against malaria without having sickle-cell disease)
Dominance is not intrinsic to an allele; it's about how two alleles combine.
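One way to see that dominance describes a combination rather than an allele is to keep the genotype fixed and change only the rule that maps it to a phenotype. The sketch below uses hypothetical flower-colour alleles "R" and "r"; the function name, mode names and phenotype labels are made up for illustration.

```python
def phenotype(genotype: str, mode: str) -> str:
    """Map a two-letter flower-colour genotype to a phenotype.

    Hypothetical alleles: "R" (red pigment) and "r" (white).
    `mode` is one of "complete", "incomplete" or "codominant".
    """
    alleles = set(genotype)
    if len(alleles) == 1:
        # Homozygous: the dominance mode makes no difference.
        return "red" if "R" in alleles else "white"
    if mode == "complete":
        # Heterozygote looks like the dominant homozygote.
        return "red"
    if mode == "incomplete":
        # Heterozygote is intermediate between the two homozygotes.
        return "pink"
    if mode == "codominant":
        # Both alleles show side by side.
        return "red and white patches"
    raise ValueError(f"unknown dominance mode: {mode}")


for mode in ("complete", "incomplete", "codominant"):
    print(mode, "->", phenotype("Rr", mode))
```

The same "Rr" genotype yields three different phenotypes, while "RR" and "rr" are unaffected by the mode.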
Sex-Linked Inheritance
Humans have two sex chromosomes: XX (typically female) and XY (typically male). This produces special inheritance patterns.
The X chromosome has around 800 protein-coding genes. The Y is much smaller and carries far fewer genes, many of them involved in male sex determination and sperm production. Most genes on the X in a person with XY have no "partner" copy, so their alleles show regardless of dominance.
This is why:
- Colour blindness (an X-linked recessive trait) is more common in XY individuals than in XX
- Haemophilia in Queen Victoria's family line was X-linked: it affected one of her sons and several grandsons and great-grandsons, passed on through carrier daughters
- Some diseases skip generations in predictable patterns
Sex-linked genetics is not a separate field; it's Mendelian genetics applied to the unusual situation of the sex chromosomes.
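The same Punnett-square logic can be applied to an X-linked recessive trait. This is a minimal sketch under stated assumptions: "X" stands for an X chromosome with the typical allele, "x" for one carrying the recessive variant, and the function name and outcome labels are invented for this example.

```python
from fractions import Fraction
from itertools import product


def x_linked_cross(mother: tuple[str, str], father_x: str) -> dict[str, Fraction]:
    """Offspring outcomes for an X-linked recessive trait.

    mother:   her two X alleles, e.g. ("X", "x") for a carrier.
    father_x: the allele on his single X chromosome.
    """
    outcomes: dict[str, Fraction] = {}
    # The father passes on either his X (daughter) or his Y (son).
    for mother_allele, father_gamete in product(mother, (father_x, "Y")):
        if father_gamete == "Y":
            # A son has a single X, so its allele shows regardless of dominance.
            label = "affected son" if mother_allele == "x" else "unaffected son"
        else:
            n_variant = (mother_allele, father_gamete).count("x")
            label = ("unaffected daughter", "carrier daughter", "affected daughter")[n_variant]
        # Each of the four gamete pairings is equally likely.
        outcomes[label] = outcomes.get(label, Fraction(0)) + Fraction(1, 4)
    return outcomes


print(x_linked_cross(("X", "x"), "X"))  # carrier mother, unaffected father
print(x_linked_cross(("X", "X"), "x"))  # non-carrier mother, affected father
```

With a carrier mother and an unaffected father, each of the four outcomes (unaffected daughter, carrier daughter, unaffected son, affected son) has probability 1/4. With an affected father and a non-carrier mother, every daughter is a carrier and every son is unaffected, which is how the trait can appear to skip a generation.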
Sources of Genetic Variation
Why isn't everyone identical? Several mechanisms generate variation:
1. Mutation
DNA copying is accurate but imperfect. Every time a cell divides, a few errors creep in. Most are caught and repaired; some aren't. Over generations, mutations accumulate.
Kinds of mutations:
- Point mutation: a single letter changes (when such a variant is common in a population, it is called a single nucleotide polymorphism, or SNP)
- Insertion: letters added
- Deletion: letters removed
- Duplication: a region doubled
- Inversion: a region flipped
- Translocation: a region moved to a different chromosome
Most mutations are neutral (no effect). Some are harmful. A few are beneficial. Natural selection acts on the harmful and beneficial ones; chapter 7 covers this.
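The kinds of mutation listed above can be pictured as simple string edits on a DNA sequence. The sketch below is a toy model with made-up function names and a made-up sequence; it treats DNA as a single strand of letters and, as a simplification, flips an inverted region without taking the complement that a real double-stranded inversion would involve.

```python
def point_mutation(seq: str, pos: int, base: str) -> str:
    """Replace the single letter at `pos`."""
    return seq[:pos] + base + seq[pos + 1:]


def insertion(seq: str, pos: int, bases: str) -> str:
    """Add letters before position `pos`."""
    return seq[:pos] + bases + seq[pos:]


def deletion(seq: str, start: int, end: int) -> str:
    """Remove the letters in seq[start:end]."""
    return seq[:start] + seq[end:]


def duplication(seq: str, start: int, end: int) -> str:
    """Repeat the region seq[start:end] immediately after itself."""
    return seq[:end] + seq[start:end] + seq[end:]


def inversion(seq: str, start: int, end: int) -> str:
    """Flip the region seq[start:end] (simplified: reversal only)."""
    return seq[:start] + seq[start:end][::-1] + seq[end:]


def translocation(donor: str, start: int, end: int, recipient: str, pos: int) -> tuple[str, str]:
    """Move donor[start:end] into `recipient` before position `pos`."""
    segment = donor[start:end]
    return donor[:start] + donor[end:], recipient[:pos] + segment + recipient[pos:]


seq = "ATGCCGTA"
print(point_mutation(seq, 2, "A"))          # ATACCGTA
print(insertion(seq, 3, "TT"))              # ATGTTCCGTA
print(deletion(seq, 3, 5))                  # ATGGTA
print(duplication(seq, 3, 5))               # ATGCCCCGTA
print(inversion(seq, 1, 5))                 # ACCGTGTA
print(translocation(seq, 0, 3, "GGGG", 2))  # ('CCGTA', 'GGATGGG')
```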
2. Recombination
During meiosis (gamete formation), chromosomes pair up and exchange sections in a process called crossing over. This means sperm and egg cells carry shuffled versions of the parents' chromosomes.
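Each chromosome you inherited is a mosaic: the copy from your mother mixes parts from your maternal grandmother and maternal grandfather, and the copy from your father does the same for your paternal grandparents. Together with independent assortment, recombination is why siblings (who share the same parents) are not genetically identical.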
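A single crossover can be sketched by swapping the tails of two strings. This is a deliberately simplified picture with invented names: each letter stands for a stretch of chromosome, "M" marks material a parent received from their own mother and "P" from their own father, and real meiosis can involve more than one crossover per chromosome pair.

```python
def crossover(from_mother: str, from_father: str, point: int) -> tuple[str, str]:
    """Swap everything after `point` between two homologous chromosomes."""
    recombinant_1 = from_mother[:point] + from_father[point:]
    recombinant_2 = from_father[:point] + from_mother[point:]
    return recombinant_1, recombinant_2


# The parent's two copies of one chromosome, before meiosis.
copy_from_their_mother = "MMMMMMMM"
copy_from_their_father = "PPPPPPPP"

print(crossover(copy_from_their_mother, copy_from_their_father, 3))
# ('MMMPPPPP', 'PPPMMMMM')
```

Either recombinant can end up in a gamete, so the chromosome a child receives from this parent carries segments from both of that parent's parents.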
3. Independent assortment
When gametes form, each pair of chromosomes sorts independently. With 23 pairs, that's 2^23 possible combinations (over 8 million) from just independent assortment, before recombination adds more variety.
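The arithmetic is easy to check. The per-couple figure below is an extra step not stated in the text: it assumes any gamete from one parent can combine with any gamete from the other, and it still ignores recombination.

```python
CHROMOSOME_PAIRS = 23

# Each pair contributes one of two chromosomes to a gamete, independently.
combinations_per_gamete = 2 ** CHROMOSOME_PAIRS
print(f"{combinations_per_gamete:,}")  # 8,388,608

# Assumption: any sperm can meet any egg, so the counts multiply.
combinations_per_couple = combinations_per_gamete ** 2
print(f"{combinations_per_couple:,}")  # 70,368,744,177,664
```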
4. Gene flow
When individuals migrate between populations, they carry their genes with them. This mixes gene pools.
5. Genetic drift
Random change in allele frequencies over generations, especially in small populations. Chapter 7 goes deeper.
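Drift is easy to see in simulation. The sketch below uses the Wright-Fisher model, a standard textbook idealisation that this chapter does not itself introduce: each generation, every allele copy in a fixed-size diploid population is drawn at random from the previous generation's frequencies. The function name, parameters and seed handling are assumptions made for this example.

```python
import random


def drift(pop_size: int, start_freq: float, generations: int, seed: int = 0) -> list[float]:
    """Track one allele's frequency under pure random sampling.

    pop_size is the number of diploid individuals, so each generation
    holds 2 * pop_size allele copies. No selection, mutation or migration.
    """
    rng = random.Random(seed)  # seeded so a run can be repeated
    n_copies = 2 * pop_size
    freq = start_freq
    history = [freq]
    for _ in range(generations):
        # Each copy in the next generation carries the allele with probability `freq`.
        count = sum(rng.random() < freq for _ in range(n_copies))
        freq = count / n_copies
        history.append(freq)
    return history


small = drift(pop_size=10, start_freq=0.5, generations=50)
large = drift(pop_size=1000, start_freq=0.5, generations=50)
print("small population, final frequency:", small[-1])
print("large population, final frequency:", large[-1])
```

Running this with different seeds typically shows the small population wandering far from 0.5, often until the allele is lost or fixed, while the large population stays much closer to its starting frequency.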
Linkage and Linkage Disequilibrium
Mendel said traits sort independently. Mostly true, but not always. Genes that are close on the same chromosome tend to be inherited together, because recombination doesn't happen often enough to separate them.
Alleles at different loci that occur together more often than chance would predict are in linkage disequilibrium (LD). This matters for genetic studies: if you find one allele associated with a disease, you might really be detecting a linked neighbour that's the actual cause.
Genome-wide association studies (GWAS) rely on linkage disequilibrium extensively. They look at up to a few million markers across the genome, and because of LD, those markers tag many more positions than they directly cover.
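Linkage disequilibrium is usually quantified with two standard statistics: D, the difference between how often two alleles are seen together and how often they would be if independent, and r squared, a normalised version between 0 and 1. The sketch below computes both; the function name and the frequencies in the usage lines are made up for illustration.

```python
def linkage_disequilibrium(p_ab: float, p_a: float, p_b: float) -> tuple[float, float]:
    """Return (D, r_squared) for alleles A and B at two loci.

    p_ab: frequency of chromosomes carrying both A and B
    p_a:  frequency of allele A
    p_b:  frequency of allele B
    Both alleles must be variable (frequencies strictly between 0 and 1).
    """
    # D is zero when A and B are inherited independently.
    d = p_ab - p_a * p_b
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r_squared


# Independent loci: A and B co-occur exactly as often as chance predicts.
print(linkage_disequilibrium(p_ab=0.25, p_a=0.5, p_b=0.5))  # (0.0, 0.0)

# Perfectly linked: every chromosome with A also carries B.
print(linkage_disequilibrium(p_ab=0.5, p_a=0.5, p_b=0.5))   # (0.25, 1.0)
```

An r squared near 1 means a genotyped marker is a good stand-in for its neighbour, which is what lets a marker "tag" positions that were not measured directly.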
Polygenic Traits
Most interesting traits are polygenic: influenced by many genes, each with a small effect.
- Height: influenced by hundreds of genetic variants, plus nutrition, plus hormone levels, plus childhood health
- Intelligence: highly polygenic, with environmental contributions larger than for simpler traits
- Type 2 diabetes risk: hundreds of small contributions plus lifestyle
- Coronary artery disease: similar
Polygenic traits rarely follow simple Mendelian patterns. They tend to be normally distributed in populations and respond to environmental factors.
Polygenic risk scores combine many variants into a single predictor. They are improving but remain modest predictors for most complex diseases. Saying "your polygenic risk for diabetes is in the 85th percentile" gives useful but limited information.
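In its simplest form, a polygenic risk score is a weighted sum: for each variant, the number of risk-allele copies a person carries (0, 1 or 2) multiplied by an effect size estimated from association studies. The sketch below shows only that sum. The variant names and weights are invented, and real scores involve further steps (such as choosing variants and comparing against a reference population) that are not shown.

```python
def polygenic_score(dosages: dict[str, int], weights: dict[str, float]) -> float:
    """Weighted sum of risk-allele counts.

    dosages: variant name -> copies of the risk allele carried (0, 1 or 2)
    weights: variant name -> per-copy effect size
    Variants missing from `dosages` are treated as 0 copies.
    """
    return sum(weight * dosages.get(variant, 0) for variant, weight in weights.items())


# Hypothetical effect sizes, not taken from any real study.
weights = {"variant_1": 0.10, "variant_2": -0.05, "variant_3": 0.20}
person = {"variant_1": 2, "variant_2": 1, "variant_3": 0}

print(round(polygenic_score(person, weights), 2))  # 0.15
```

The raw number means little on its own; it becomes a statement like "85th percentile" only when ranked against scores from many other people.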
Penetrance and Expressivity
Two important words:
- Penetrance: the percentage of people with a given genotype who actually show the associated phenotype
- Expressivity: how strongly the phenotype shows, when it does
High penetrance (like Huntington's disease): if you have the allele, you almost certainly get the disease. Low penetrance (many cancer-risk alleles): having the risk variant raises your risk modestly, but many people with the variant never develop the disease.
Most "genetic" diseases have intermediate penetrance. "Having the gene for X" is usually an oversimplification.
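Penetrance is a simple proportion, which a short calculation makes concrete. The counts below are invented purely to contrast a high-penetrance and a low-penetrance genotype; the function name is an assumption for this example.

```python
def penetrance(affected_carriers: int, total_carriers: int) -> float:
    """Fraction of people with a genotype who show the associated phenotype."""
    if total_carriers <= 0:
        raise ValueError("total_carriers must be positive")
    return affected_carriers / total_carriers


# Made-up counts for illustration only.
print(f"{penetrance(19, 20):.0%}")   # 95%
print(f"{penetrance(30, 100):.0%}")  # 30%
```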
Human Genetic Variation
Humans are genetically very similar: about 99.9% identical to each other on average. In a genome of roughly 3 billion base pairs, that 0.1% difference amounts to about 3 million variants per person.
Most variants are:
- In non-coding DNA
- Silent (no effect on protein)
- Already present in the population (shared polymorphisms)
- Neutral in effect
Rare variants, especially in protein-coding DNA, are more likely to have an effect. Most known variants that cause rare inherited diseases change a protein-coding sequence.
Ancestry and genetics
Human populations have moved, mixed, and separated. Some variants are more common in some populations than others. This is the basis of:
- Ancestry testing: statistical comparison against reference populations
- Forensic genetics: identifying individuals by their variants
- Medical genetics: some variants have different frequencies in different populations (certain BRCA1/2 variants are more common among people of Ashkenazi Jewish ancestry, HbS in populations from malaria-endemic regions)
A widespread misconception is that sharp genetic boundaries separate human populations. They don't: genetic variation is continuous and changes gradually with geography. Racial categorisations are social constructs built on a few visible traits; they don't map cleanly to genetics.
Inherited Disease Patterns
Some common patterns:
Autosomal recessive
Two copies needed for disease. Unaffected carriers (heterozygotes) are common. Examples: cystic fibrosis, sickle-cell disease, Tay-Sachs.
When both parents are carriers, each child has a 1 in 4 chance of being affected.
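Because the 1 in 4 risk applies to each child independently, the number of affected children in a family follows a binomial distribution. The sketch below works through a three-child family; the function name and the family size are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import comb


def prob_k_affected(n_children: int, k: int, risk: Fraction = Fraction(1, 4)) -> Fraction:
    """Probability that exactly k of n children are affected.

    Assumes each child independently has probability `risk` of being affected.
    """
    return comb(n_children, k) * risk**k * (1 - risk) ** (n_children - k)


n = 3
print(prob_k_affected(n, 0))      # 27/64: no child affected
print(prob_k_affected(n, n))      # 1/64: every child affected
print(1 - prob_k_affected(n, 0))  # 37/64: at least one child affected
```

Having one affected child does not change the odds for the next: the risk is 1 in 4 at every pregnancy.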
Autosomal dominant
One copy enough for disease. Often shows in every generation. Examples: Huntington's disease, Marfan syndrome.
An affected person (typically heterozygous) has a 1 in 2 chance of passing it to each child.
X-linked recessive
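The gene is on the X chromosome, so the condition mostly affects males. Examples: haemophilia, Duchenne muscular dystrophy, colour blindness.
Each son of a carrier mother has a 1 in 2 chance of being affected, and each daughter a 1 in 2 chance of being a carrier. An affected father passes the variant to all of his daughters, who become carriers, and to none of his sons.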
Complex / multifactorial
Many genes plus environment. Cancer, heart disease, diabetes, psychiatric disorders. Most disease falls in this category.
The Limits of Genetic Testing
Genetic tests have improved enormously. They're still limited:
- Tests cover only known variants. Novel mutations may not register
- Most genetic conditions have incomplete penetrance. A risk allele is a risk, not a sentence
- Most complex diseases are only weakly predicted by genetics alone
- Environmental and lifestyle factors often matter more than genetic ones for common diseases
A direct-to-consumer test result ("you carry 7 variants associated with higher caffeine sensitivity") is usually accurate about which variants you carry but only modestly predictive of actual outcomes.
Genetics in Biotech
Genetics underlies:
- Diagnostic tests
- Targeted therapies (drugs that work on specific genetic variants of a disease)
- Gene therapy
- Agricultural biotech (breeding and GMOs)
- Forensic and ancestry applications
Almost everything in the rest of this tutorial touches on genetics somehow.
Common Pitfalls
"I have the gene for X." Everyone has "the gene for X" if it's a gene. You probably mean "I have a variant of gene X associated with a trait or disease". The distinction matters
"Genes determine destiny." Rarely. Heritability varies by trait. For most complex traits, genes are part of the picture, not the whole picture
"Identical twins are genetically identical." Mostly, but small mutation differences accumulate after conception. They also differ epigenetically, especially over time
"23andMe told me I'm X% Irish." Ancestry estimates are statistical comparisons to reference populations, which are themselves constructed. The percentages are approximations, sometimes misleading
"If both parents are carriers, every child has the disease." A 1 in 4 chance per child, independently, not a guarantee. "Carrier" means heterozygous and typically unaffected
Next Steps
Continue to 07-evolution.md for the theory that ties genetics to the entire living world.