# Darwin and Mixture Distributions

Although Charles Darwin understood that individuals of a species resemble their parents, he did not know the genetic basis of heredity: each trait of an individual comes from a gene, and each gene is inherited from exactly one parent, either the mother or the father. Instead, he imagined that each trait arises from a blending of the parents' traits. In his book The Greatest Show on Earth: The Evidence for Evolution, Richard Dawkins writes about Darwin:

> He was aware of course that characteristics tend to run in families, that offspring tend to resemble their parents and siblings. Heredity was a central plank of his theory of natural selection. But a gene pool is something else. A gene is an all or nothing entity. When you were conceived, what you received from your father was not a substance, to be mixed with what you received from your mother, as if mixing blue paint with red paint to make purple.

This is an interesting distinction because it amounts to the distinction between the average of two random variables

$X = \frac{1}{2}(X_1 + X_2)$

whose density, for independent $X_1$ and $X_2$, is a rescaled convolution of the two probability density functions, and a mixture distribution,

$Z \sim U\{+1,-1\}$

$X \mid Z = z \sim p_z$

where a latent Bernoulli-type random variable $$Z$$ (a fair coin flip) decides which parent, father or mother, a gene comes from, and the trait is then sampled from the corresponding one of the two distributions. This two-step process is repeated for each gene. Under averaging with random mating, the spread of a trait shrinks with every generation, and the repeated averages tend toward a narrow Gaussian by the central limit theorem, so after several generations all members of a species would look very similar, with only smooth variation between them. The mixture, on the other hand, keeps the parental values distinct, which leads to the formation of clusters.
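The contrast between the two inheritance rules can be checked with a small simulation. The following is a minimal sketch, not a model of real genetics: it assumes a single gene, a founding population whose trait is either -1 or +1, a constant population size, and uniformly random mating in which any individual can act as either parent. The names `next_generation` and `simulate` and all parameter values are hypothetical choices made for illustration.

```python
import random
import statistics


def next_generation(traits, rng, mode):
    """Create a new generation of the same size under random mating.

    Assumption: both parents are drawn uniformly at random (with
    replacement) from the current population.
    """
    n = len(traits)
    children = []
    for _ in range(n):
        father = traits[rng.randrange(n)]
        mother = traits[rng.randrange(n)]
        if mode == "blend":
            # Darwin's picture: the child's trait is the parents' average.
            child = 0.5 * (father + mother)
        elif mode == "mixture":
            # Genetic picture: a fair coin Z picks one parent's value.
            z = rng.choice((+1, -1))
            child = father if z == +1 else mother
        else:
            raise ValueError(f"unknown mode: {mode!r}")
        children.append(child)
    return children


def simulate(mode, generations=10, n=5000, seed=0):
    """Evolve a founding population with traits in {-1, +1}."""
    rng = random.Random(seed)
    traits = [rng.choice((-1.0, 1.0)) for _ in range(n)]
    for _ in range(generations):
        traits = next_generation(traits, rng, mode)
    return traits


blended = simulate("blend")
mixed = simulate("mixture")

for name, traits in (("blend", blended), ("mixture", mixed)):
    print(
        name,
        "variance:", statistics.pvariance(traits),
        "distinct values:", len(set(traits)),
    )
```

Under these assumptions, blending roughly halves the trait variance each generation, so the population collapses toward a single intermediate value, while the mixture rule only ever passes on the two founding values and so preserves two distinct clusters.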