PLoS Genetics has published a new population genetics paper. It uses a new statistical model to infer the order in which the world was peopled. This has been a big question in anthropology, and answers have often relied on archaeology, linguistics, and ethnography to supplement the genetic and physical data. I don’t mean to imply that the question has been completely answered with this new paper, but it is a new approach to asking a very critical question.

The paper is titled, “Inferring Human Colonization History Using a Copying Model.” This study is based on inheritance patterns of 2,000 SNPs from the 2006 Human Genome Diversity Project (HGDP) dataset. The dataset comes from 927 individuals from 53 different populations. Not all populations are included in this dataset, so there are gaps… But for any anthropologist out there who is interested in the tempo at which certain human populations radiated, as well as in their ancestry patterns, this open access paper is a must read.

The new “copying model” resolves much finer details because it compares the structure of chromosomes, i.e. how the haplotypes spread along a chromosome are inherited. This makes it possible to delve further back in time and identify smaller genetic contributions. You may know that other models have resorted to single loci, such as the Y-chromosome or mtDNA. It has been argued that these models oversimplify heredity. By analyzing shared parts of chromosomes across the entire human genome, the researchers believe their method can cope with much larger datasets, suggesting that over 500,000 genetic markers can be compared and contrasted in the future.

This paper has yielded both consistent and surprising results. For starters, the results are right in line with the Out of Africa model. In the video clips below, you can see that for yourself.

Inferred history of the peopling of the world.

Donors are listed at the bottom in order according to the mean number of individuals that are used. Click to see the original movie in high res.

Did you notice that the San are the beginning population? That’s because the San of Southern Africa are the first population in the ordering of chromosomes. According to Spencer Wells, the San are one of the oldest, if not the oldest, peoples in the world based upon the Y-chromosome. Exactly one month ago, a study of mitochondrial genetic diversity within Africa kinda challenged this claim. But because this study used the HGDP dataset from 2006, the results are restricted to the populations included in the sample. The San gave rise to the Biaka, Bantu, and Mbuti populations, which all live south of the Sahara.

The last lineage to arise in Africa is the Mozabites, and based upon the 2,000 SNPs they have less in common with other African populations than the other African populations have with one another. The authors suggest that this is because the Mozabites went through a bottleneck that is not shared by any other African population.

The Mozabites gave rise to all the Central Eurasian populations in the HGDP sample. The Mozabites also gave rise to the Central European populations. The first three populations to arise in Europe are the French, Tuscans, and Italians. Several Near Eastern and Central Asian populations also contributed to the peopling of Central Europe.

East Asians have an entirely distinct source of ancestry from European peoples. The Uygurs and Hazara gave rise to the Cambodian, Mongolian, Oroqen, Xibo, Yi, Tu, Daur, and Naxi people of East Asia. The Han also received their ancestry from the Xibo and other populations. Just how distinct is this cut-off? Well, less than 10% of Europeans show ancestry from the Uygurs. Almost no Europeans show ancestry from the Hazara. The authors suggest that this is because the East Asian populations were established independently from Europeans, and only relatively recent admixture accounts for the 10% Uygur-ness in European populations.

Many populations in Europe, such as the Basque, exhibit distinct genetic, cultural, and linguistic traits. This study has shown that the Sardinians, Russians, Orcadians, and the Basque show strong similarities to other Europeans, but have a lot more Near Eastern and Central Asian ancestry markers than other Europeans. For example, the Basque show that some of their ancestry comes from the Hezhen, a far Eastern population.

The Pacific Islanders receive ancestry from the Melanesians and Cambodians, which is not surprising. The first Native American population (the Colombians) shares ancestry with the Hazara, Han, and Xibo, also not surprising. But since modern people were sampled, the Colombians also show European ancestry, most likely because of the extensive European occupation of the Americas in the last 500 or so years.

The somewhat surprising finding (at least surprising to the authors, editors of the paper, and apparently the bloggers at the Spittoon) is that there’s a strong Mongolian ancestry signal in the Pima people. This is distinctly different from the Colombians, who have a much different ancestry. The authors write that this suggests independent waves of migration into the Americas, which contradicts ‘the current consensus.’

I believe that this statement should be revised because a more recent paper, published after this current paper was submitted, suggests that the Americas were peopled in multiple waves. I’m kinda surprised the editors didn’t catch this. I’m also surprised the bloggers behind the Spittoon, the 23andMe blog, didn’t catch this. They are in the population genetics and personal genomics business, so I expect them to keep current on their literature. Anyways, I was talking to Razib about this and he wondered whether some sort of Na-Dene phenomenon could be happening. Definitely possible… what do you think?

Inferred history of chromosomes for individual populations

Each frame shows the path that chromosomes took from their origin in Southern Africa in reaching the population labelled in each frame. The width of each line indicates the proportion of the chromosomes that travelled by that route, with the diameter of the circle indicating the total proportion of chromosomes that went via that location (diameter of San = 1.0). Values were estimated recursively, working backwards from the labelled population to the first by assuming that the amount of genetic material passed on by each population was proportional to the number of donor individuals it contributed. Click to see the original movie in high res.
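To make the recursion in that caption a bit more concrete, here is a minimal Python sketch of how I read it. Treat it as my own toy illustration, not the authors' code: the function name `trace_ancestry`, the data layout, and above all the donor counts are invented for the example and are not values from the paper. It starts with the labelled population at 1.0, walks backwards through the ordering, and splits each population's share among its donors in proportion to the number of donor individuals each one contributed.

```python
from collections import defaultdict


def trace_ancestry(order, donors, target):
    """Trace chromosome flow backwards from `target` to the first population.

    order  : populations listed from first to last in the inferred ordering
    donors : dict mapping a population to {donor population: donor individuals}
             (every donor must appear earlier in `order`)
    Returns (node_flow, edge_flow): the proportion of the target's chromosomes
    passing through each population, and along each donor -> recipient route.
    """
    node_flow = defaultdict(float)
    edge_flow = {}
    node_flow[target] = 1.0

    # Walk from the target back towards the first population in the ordering.
    for pop in reversed(order[: order.index(target) + 1]):
        flow = node_flow[pop]
        contributions = donors.get(pop, {})
        total = sum(contributions.values())
        if flow == 0.0 or total == 0:
            continue  # not on any path to the target, or no donors (the origin)
        for donor, count in contributions.items():
            # Split this population's share in proportion to donor individuals.
            share = flow * count / total
            edge_flow[(donor, pop)] = share
            node_flow[donor] += share

    return dict(node_flow), edge_flow


# Hypothetical toy input: the donor counts below are made up for illustration.
order = ["San", "Biaka", "Bantu", "Mozabite", "French"]
donors = {
    "Biaka": {"San": 4},
    "Bantu": {"San": 1, "Biaka": 3},
    "Mozabite": {"Bantu": 2, "Biaka": 2},
    "French": {"Mozabite": 5},
}

node_flow, edge_flow = trace_ancestry(order, donors, "French")
for pop in order:
    print(pop, round(node_flow.get(pop, 0.0), 3))
```

In this sketch `node_flow` plays the role of the circle diameters and `edge_flow` the line widths. Because every path in the toy input eventually runs back to the San, their value comes out at 1.0, which matches the "diameter of San = 1.0" note in the caption.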

This paper is not the first to work around the single-locus comparison critique, but it is successful and provides a template for others to build on. I’m really interested to see this same model applied to more SNPs and more populations.

    Hellenthal, G., Auton, A., Falush, D., Przeworski, M. (2008). Inferring Human Colonization History Using a Copying Model. PLoS Genetics, 4(5), e1000078. DOI: 10.1371/journal.pgen.1000078