In light of a discussion between Razib and Martin, I recently took up arms and battled the concepts behind race and identity, and the role human genetic variation plays in forming these concepts. In the comments, I was disgusted to see Martin throw in this rhetorical line,
“Genetics have long ago shown that people vary more within the major racial groups than these groups vary among themselves.”
I’ve heard this so many times that I want to puke. It means nothing to me. Based on this line and others he’s said in his discussions with Razib, I’ve picked up that he’s a big proponent of the “if it sounds like a mantra, then it must be true” school of thought. In this post, I will debunk this mentality. Several new publications have just come out in PLoS Genetics that show exactly how genetics can help identify groups, especially groups that are not demarcated by major social and phenotypic differences.
All of these publications, three in total, focus on identifying genetic markers that help distinguish populations of Europeans (there’s a bonus one on the genetic structure of Polynesians). They are open access, so you have freely readable firsthand literature to follow along with. As you know, Europeans are often viewed as a homogeneous category of classification. Yet we have a wealth of evidence of cultural admixture, wars, migrations, the formation and decline of states, etc. over thousands of years of European history. All of these social mechanisms have left an imprint on the cultural, biological, and linguistic composition of Europe. To further complicate things, and to address the identity crisis Martin brought up when he stated that he’s a Swede, I’ve seen US census reports just lump every European descendant together as ‘white.’ Such a label actually groups together multiple populations, which have diverse origins due to the complex history of Europeans.
The first genetic dissection of the population structure of European Americans that I will share with you involved a lot of researchers collaborating together. They focused their work on identifying the contributions from different genetic ancestries. Why did they do it? Like I said earlier, the primary motivation was to identify genes that can be associated with disease. Here’s an excerpt of the abstract that is useful in demonstrating how genetic markers can help identify groups of people,
“Here, we investigate empirical patterns of population structure in European Americans, analyzing 4,198 samples from four genome-wide association studies to show that components roughly corresponding to northwest European, southeast European, and Ashkenazi Jewish ancestry are the main sources of European American population structure. Building on this insight, we constructed a panel of 300 validated markers that are highly informative for distinguishing these ancestries.”
A sample size of roughly 4,200 is large, folks. They were able to find and validate 300 markers that distinguish regional ancestries, which helps narrow down ethnicity. But that’s not good enough. If we had more markers, we could sort more people who carry similar markers into finer ethnic groups. Well, a second publication from a collaborating group did exactly that,
“European population genetic substructure was examined in a diverse set of >1,000 individuals of European descent, each genotyped with >300 K SNPs. Both STRUCTURE and principal component analyses (PCA) showed the largest division/principal component (PC) differentiated northern from southern European ancestry. A second PC further separated Italian, Spanish, and Greek individuals from those of Ashkenazi Jewish ancestry as well as distinguishing among northern European populations. In separate analyses of northern European participants other substructure relationships were discerned showing a west to east gradient.”
So moving from 300 to 300,000 SNPs increased the resolution immensely. Now people could be classified as having Italian, Spanish, Greek, or Ashkenazi Jewish ancestry, and so on down the list. Not bad at all. These markers are called ancestry informative markers, or AIMs. These AIMs have been mixed and matched because doing so helps…
“.. distinguish the ancestries of these genetically very similar populations, whose real or perceived group differences may often be dominated by environmental, social, and cultural factors. Below, we outline the possible choices of marker sets for inferring various ancestries. In each case, a method such as structured association or principal components analysis can be applied to genotype data to correct for stratification.
To correct for stratification along the north–south (or northwest–southeast) cline, either the Price100 or Tian192 marker sets can be used. (The Tian192 markers, which were ascertained using northern European versus Ashkenazi Jewish ancestry, are effective in distinguishing north–south ancestry because southern Europeans attain intermediate ancestry values as compared to values at one extreme for northern Europeans.) To correct for stratification involving both north–south and Ashkenazi Jewish ancestry, one option is to use the Price100+Price200 marker sets, which together separate north, south, and Ashkenazi ancestry into three distinct clusters. Another option is to use the Tian192 marker set, which models these three ancestries along a single axis and will be sufficient in the case that the phenotype being analyzed has intermediate values for southern European as compared to northern European versus Ashkenazi Jewish ancestry. Finally, to correct for stratification involving a west–east gradient within northern Europe (e.g., Irish versus other northern European ancestry), the Tian1211 marker set is the only set of AIMs available.”
So there you have it. By widening the screen to include more markers and by using different combinations of markers, the researchers were able to identify genetic similarities and differences between groups of genetically similar people. I don’t know how anyone can go about saying something like “people vary more within the major racial groups than these groups vary among themselves.” If it can be done in supposedly homogeneous European groups, it can be done elsewhere too. It is probably being done elsewhere…
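To make the marker-count point concrete, here is a toy simulation I put together. It is only a sketch under my own assumptions and not the method or data from any of the papers: two made-up populations whose allele frequencies differ very slightly at each SNP, a plain PCA done with NumPy's SVD, and a crude rule that splits individuals at zero on the first principal component. The function names, the frequency range, and the size of the frequency shift are all hypothetical choices of mine.

```python
import numpy as np


def simulate_genotypes(n_per_pop, n_snps, freq_shift, rng):
    """Simulate 0/1/2 genotypes for two hypothetical populations.

    Each SNP gets a shared base allele frequency, and the two populations
    are nudged apart by a small random amount (standard deviation freq_shift).
    """
    base = rng.uniform(0.2, 0.8, size=n_snps)
    delta = rng.normal(0.0, freq_shift, size=n_snps)
    freq_a = np.clip(base + delta / 2, 0.01, 0.99)
    freq_b = np.clip(base - delta / 2, 0.01, 0.99)
    geno_a = rng.binomial(2, freq_a, size=(n_per_pop, n_snps))
    geno_b = rng.binomial(2, freq_b, size=(n_per_pop, n_snps))
    genotypes = np.vstack([geno_a, geno_b]).astype(float)
    labels = np.array([0] * n_per_pop + [1] * n_per_pop)
    return genotypes, labels


def within_group_share(genotypes, labels):
    """Average per-SNP fraction of genotype variance found within groups."""
    total_var = genotypes.var(axis=0)
    within_var = np.mean(
        [genotypes[labels == g].var(axis=0) for g in (0, 1)], axis=0
    )
    keep = total_var > 0  # skip SNPs that happen to be monomorphic
    return float(np.mean(within_var[keep] / total_var[keep]))


def first_pc_scores(genotypes):
    """Scores of each individual on the first principal component."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, 0] * s[0]


def separation_accuracy(genotypes, labels):
    """Share of individuals put in the right group by splitting PC1 at zero.

    The sign of a principal component is arbitrary, so take whichever
    orientation matches the labels better.
    """
    guess = (first_pc_scores(genotypes) > 0).astype(int)
    accuracy = float((guess == labels).mean())
    return max(accuracy, 1.0 - accuracy)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for n_snps in (30, 300, 3000):
        genotypes, labels = simulate_genotypes(200, n_snps, 0.05, rng)
        print(
            n_snps,
            "SNPs: within-group share of variance =",
            round(within_group_share(genotypes, labels), 3),
            "| PC1 accuracy =",
            round(separation_accuracy(genotypes, labels), 3),
        )
```

In a setup like this, nearly all of the variance at any single SNP sits within the groups, yet the first principal component should sort individuals far better with thousands of markers than with a few dozen. The catch is that real genotype data has linkage between markers, admixture, and uneven sampling, none of which this toy model includes.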
Maybe the reason it hasn’t been done before is that we just haven’t had as many SNPs to screen in the past. I see this shortcoming as one of the sources of the misinformation Martin repeated. With more and more people studying human variation, more and more people will be sampled. As sample sizes increase, and as projects like the HapMap and Genographic projects expand, I imagine we’ll identify tons of SNPs and markers. Also, in the past, I imagine mtDNA was the only easily researchable locus to screen for genetic variation and diversity… and mtDNA is small and does not store nearly as much variation as nuclear DNA. That could be where Martin picked up the “long ago” portion of his statement. Long ago in genetics was when mtDNA was the only accessible and reliable thing to study, and that wasn’t that long ago. But we now look for variation in the nuclear genome of humans, which contains many more base pairs and many more heritable markers.
As kinda icing on the cake, I want to move away from European populations and shine the spotlight on human genetic diversity in the Pacific because, as the authors write in their abstract,
“Human genetic diversity in the Pacific has not been adequately sampled, particularly in Melanesia. As a result, population relationships there have been open to debate.”
This is a cool study that’s made the rounds on a couple of blogs. If you didn’t catch it before, let me summarize it for you. The study was just as comprehensive as the European genetic studies. It involved almost 1,000 individuals from 41 populations. Using more than 800 genetic markers, the results revealed that Polynesians and Micronesians have almost no genetic relation to Melanesians; rather, Polynesians’ and Micronesians’ closest relationships are to Taiwan Aborigines and East Asians. The results also showed that groups living in the islands of Melanesia are remarkably diverse. The research also suggests that the ancestors of Polynesians moved through Island Melanesia relatively rapidly.
In conclusion, very recent papers are telling us exactly the opposite of what Martin said. Furthermore, I brought this up in my previous human genetic variation post, but I gotta bring it up again because it happened again: I really don’t appreciate the ‘long ago’ academic arrogance expressed when people say “long ago, we anthropologists decided race was a social construct,” or “long ago, genetics confirmed humans don’t vary much.” Phrasing statements like that implies that if I think otherwise then I’m dated, I’m not with the times. How unscientific. It is actually those who do not refresh their knowledge and keep current with advances in population studies who look dated and uneducated.