How was the world peopled?

PLoS Genetics has published a new population genetics paper. It uses a new statistical model to infer the order in which the world was peopled. This has been a big question in anthropology, and answering it has often relied on archaeology, linguistics, and ethnography to supplement the genetic and physical data. I don’t mean to imply that the question has been completely answered with this new paper, but it is a new approach to asking a very critical question.

The paper is titled “Inferring Human Colonization History Using a Copying Model.” The study is based on inheritance patterns of 2,000 SNPs from the 2006 Human Genome Diversity Project (HGDP) dataset, which comes from 927 individuals from 53 different populations. Not all populations are included in this dataset, so there are gaps… But for any anthropologist out there who is interested in the tempo at which certain human populations radiated, as well as in their ancestry patterns, this open access paper is a must-read.

The new “copying model” resolves much finer details because it compares the structure of chromosomes, i.e. how the haplotypes spread along a chromosome are inherited. This makes it possible to delve further back in time and identify smaller genetic contributions. You may know that other models have relied on single loci, such as the Y-chromosome or mtDNA. It has been argued that these models oversimplify heredity. By analyzing shared parts of chromosomes across the entire human genome, the researchers believe their method can cope with much larger datasets, suggesting that over 500,000 genetic markers can be compared and contrasted in the future.

This paper has yielded both consistent and surprising results. For starters, the results are right in line with the Out of Africa model. In the video clips below, you can see that for yourself.

Inferred history of the peopling of the world.

Donors are listed at the bottom in order according to the mean number of individuals that are used. Click to see the original movie in high res.

Did you notice that the San are the starting population? That’s obviously because the San of Southern Africa are the first population in the ordering of chromosomes. According to Spencer Wells, the San are one of the oldest, if not the oldest, peoples in the world based upon the Y-chromosome. Exactly one month ago, a study of mitochondrial genetic diversity within Africa kinda challenged this claim. But because this study used the HGDP dataset from 2006, the results are restricted to the populations included in the sample. The San gave rise to the Biaka, Bantu, and Mbuti populations, all of which live south of the Sahara.

The last lineage to arise in Africa is the Mozabites, and based upon the 2,000 SNPs they have less in common with other African populations than the other African populations have with one another. The authors suggest that this observation reflects a bottleneck in the Mozabites that is not shared by any other African population.

The Mozabites gave rise to all the Central Eurasian populations in the HGDP sample. The Mozabites also gave rise to the Central European populations. The first three populations to arise in Europe are the French, Tuscans, and Italians. Several Near Eastern and Central Asian populations also contributed to the peopling of Central Europe.

East Asians have an entirely distinct source of ancestry from European peoples. The Uygurs and Hazara gave rise to the Cambodian, Mongolian, Oroqen, Xibo, Yi, Tu, Daur, and Naxi people of East Asia. The Han also received their ancestry from the Xibo and other populations. Just how distinct is this cut-off? Well, less than 10% of Europeans show ancestry from the Uygurs. Almost no Europeans show ancestry from the Hazara. The authors suggest that this observation is because the East Asian populations were established independently from Europeans and only relatively recent admixture has affected the 10% Uygur-ness in European populations.

Many populations in Europe, such as the Basque, exhibit distinct genetic, cultural, and linguistic traits. This study has shown that the Sardinians, Russians, Orcadians, and the Basque show strong similarities to other Europeans, but have a lot more Near Eastern and Central Asian ancestry markers than other Europeans. For example, the Basque show that some of their ancestry comes from the Hezhen, a far Eastern population.

The Pacific Islanders receive ancestry from the Melanesians and Cambodians, which is not surprising. The first Native American population (the Colombians) shares ancestry with the Hazara, Han, and Xibo, which is also not surprising. But since modern people were sampled, the Colombians also show European ancestry, most likely because of the extensive European occupation of the Americas over the last 500 or so years.

The somewhat surprising finding (at least surprising to the authors, editors of the paper, and apparently the bloggers at the Spittoon) is that there is a strong Mongolian ancestry signal in the Pima people. This is distinctly different from the Colombians, who have a much different ancestry. The authors write that this suggests independent waves of migration into the Americas, which contradicts ‘the current consensus.’

I believe that this statement should be revised because a more recent paper, published after this current paper was submitted, suggests that the Americas were peopled in multiple waves. I’m kinda surprised the editors didn’t catch this. I’m also surprised the bloggers behind the Spittoon, the 23andMe blog, didn’t catch this. They are in the population genetics and personal genomics business, so I expect them to keep current on their literature. Anyways, I was talking to Razib about this and he wondered whether some sort of Na-Dene phenomenon could be happening. Definitely possible… what do you think?

Inferred history of chromosomes for individual populations

Each frame shows the path that chromosomes took from their origin in Southern Africa in reaching the population labelled in each frame. The width of each line indicates the proportion of the chromosomes that travelled by that route, with the diameter of the circle indicating the total proportion of chromosomes that went via that location (diameter of San = 1.0). Values were estimated recursively, working backwards from the labelled population to the first by assuming that the amount of genetic material passed on by each population was proportional to the number of donor individuals it contributed. Click to see the original movie in high res.
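The recursive bookkeeping described in that caption can be sketched in a few lines of Python. This is only a rough illustration of the idea as the caption words it, not the authors' code: the function name `trace_paths`, the data layout, and the donor counts in the example are all made up for demonstration.

```python
def trace_paths(order, donors, target):
    """Work backwards from `target` to the first population in `order`.

    order  : population names in inferred colonization order (origin first).
    donors : maps each non-origin population to a dict of
             {donor population: mean number of donor individuals}.
             Donors are assumed to come earlier in `order`.
    Returns (through, edges):
      through[pop]        = proportion of target's chromosomes passing via pop
      edges[(donor, pop)] = proportion travelling along that route
    """
    through = {pop: 0.0 for pop in order}
    through[target] = 1.0
    edges = {}
    start = order.index(target)
    # Visit populations from the target back towards the origin, so that
    # everything flowing into a population is known before it is split.
    for pop in reversed(order[1:start + 1]):
        total = sum(donors[pop].values())
        for donor, n_individuals in donors[pop].items():
            # Material passed on is proportional to donor individuals used.
            share = through[pop] * n_individuals / total
            edges[(donor, pop)] = share
            through[donor] += share
    return through, edges


# Hypothetical numbers, for illustration only (not taken from the paper).
order = ["San", "Biaka", "Bantu", "Mozabite"]
donors = {
    "Biaka": {"San": 10},
    "Bantu": {"San": 4, "Biaka": 6},
    "Mozabite": {"Biaka": 2, "Bantu": 8},
}

through, edges = trace_paths(order, donors, "Mozabite")
for pop in order:
    print(pop, round(through[pop], 3))
```

Because every population other than the first draws all of its material from earlier populations, the proportions always sum back to 1.0 at the origin, which matches the caption's note that the San circle has diameter 1.0.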

This paper is not the first to work around the single-locus comparison critique, but it is successful and provides a template for others to build on. I’m really interested to see this same model applied to more SNPs and more populations.

    Hellenthal, G., Auton, A., Falush, D., Przeworski, M. (2008). Inferring Human Colonization History Using a Copying Model. PLoS Genetics, 4(5), e1000078. DOI: 10.1371/journal.pgen.1000078

9 thoughts on “How was the world peopled?”

  1. But why not? was my reasoning. Surely as early arrivals on their continent they would be a most revealing part of any pattern. And nowhere in India shows up. Another missing bit in the puzzle.

  2. Terry,

    Yeah, the Australian Aborigines are a critical population, but the HGDP has a lot of missing groups. I don’t know why they weren’t included in the main release. It seems surprising because the main goal of the HGDP is to map the genetic variation between humans, which is less than 1% different. Leaving out such a homogenous population seems like ignoring a very large part of their research focus. Maybe someone out there affiliated with the HGDP can shed some light on this question?

    I did some research, and there have been some claims that the HGDP halted because of protests from organizations like the ETC Group, which drained support for the HGDP. Many of the protests stemmed from concerns about how the HGDP results would be applied, i.e. fears about racism, consent, patenting, etc. If true, the resource draining by way of protests could be one reason why many groups were left out.

    I hope that the project will continue. I also hope the next release of the HGDP will include Aborigines as well as some other populations.


  3. Australian Aboriginal populations were originally planned to be included in the HGDP, but they were eventually left out due to major issues with consent and community approval. Indigenous Australians are extremely suspicious of attempts to use their biological material for research (with some justification), and it will take a long period of constructive engagement with this community by researchers before this attitude changes.

    The Genographic Project is attempting to obtain DNA (through community consultation) from both Australian and New Zealand natives. As far as I know they haven’t collected any Australian samples yet, and I’d expect it to be an uphill battle for them to do so.

  4. By the way, you might have noticed that Indian populations are also not included in the HGDP, another major omission. This is due to a ban by the Indian government on the export of human DNA samples out of the country. As a result, all of the “Central/Southern Asia” samples in the HGDP come from Pakistan.

  5. The fact that different teams see American Indian genetic variation in two mutually exclusive ways (as either founded by one migration, or by multiple migrations) may mean that actually American Indians represent a long-term isolate and a potentially autochthonous population, with “Colombians” and “Pima” sharing core Amerindian genetic identity and with at least two migrations coming out of America. This would be consistent with the evidence from kinship systems that I presented in “The Genius of Kinship” and in an earlier post on this blog.

    If I suspend my controversial theory for a split second, then the data presented in the paper under discussion correlates with the following three pieces of data.

    1) Ancient Amerindian skulls (Lagoa Santa, Sabana de Bogotá, Toca dos Coqueiros, etc.) tend to differ considerably from modern Amerindian skulls. Physical anthropologists compare the former to Australia, Oceania and the Ainu, and the latter to recent Mongoloids. Ancient American teeth show the same association with Australo-Melanesians as the skulls. I remember when I brought this issue up with Richard Klein, he was absolutely positive that the radical difference between the metrics of ancient and modern Amerindian skulls means two separate populations, representing two separate migrations, colliding in the Americas.

    2) North American and South American kinship systems are quite different from each other, with the first containing signals of long-term exogamy, and the latter of long-term endogamy. In South America, only the Mapuche show a clear North American pattern, with Ge and Panoan being ambiguous.

    3) American Indian mtDNA variation falls into two superclades/macrohaplogroups, M (C, D, M*) and N (A, X, B). Scholars have noticed that, although all haplogroups are represented from Tierra del Fuego to North America, the frequencies fluctuate considerably between South and North America. Roughly speaking, M lineages dominate in South America, while N lineages dominate in North America. (B is somewhat of an exception from this pattern, being mostly concentrated along the western coastal line, with some Andean populations having this lineage at fixation.) In certain areas of South America, such as Ecuador, the penetration of N lineages appears to have happened within the last 5,000 years. The colliding of two populations in America may have resulted in prolonged admixture and gene flow that leveled the frequencies out and made the Amerindian variation look as if it were a result of a single migration.

    These three pieces of evidence seem to sit well with the copying model analysis, with South America and North America representing epicenters of two distinct population dispersals separated in time by thousands of years. However, one of the challenges that mtDNA presents to any multiple-migration model is that American Indian lineages, although belonging to two different macroclades, are phylogenetically very closely related to each other. With the exception of B, all of them share the ancestral state of site 16223 (C), which in the Old World splits all Eurasian sequences into the Western (European) clades with 16223T and the Eastern (Asian) clades with 16223C.

  6. …on a side note, related to the missing Aboriginal Australians…I recently came across evidence that early human beings went from southern Africa (San) to Australia in a very short time. It could be that some strips of land may have been swallowed by the mighty Indian Ocean over the millennia. But I’m very curious about Maori and Southern African folklore about human beings and their relationship with whales. Call me crazy, but my latest hypothesis is that the San set sail for what is now known as Australia via the mouths of whales like Jonah.
    I’m no anthropologist, I’m a filmmaker. And this idea may very well surface in a few years’ time on screen. Any thoughts?
