Both Razib and Dienekes have put up posts about this new Current Biology paper, “Correlation between Genetic and Geographic Structure in Europe.” The authors of the paper compare the genetic makeup of 2,514 individuals from Europe using the Affymetrix GeneChip Human Mapping 500K Array Set.

Always the overachiever of science blogging, Razib has dutifully labeled the populations on the graph. His modifications make it easier to visualize the genetic similarities and differences among the European populations tested. And there are some interesting patterns. There’s a similarity among northern European populations as well as a similarity among southern European populations.

The Finns tested are the group least similar to other European populations. Swedes and Spanish people are clearly different, while the Irish and British overlap heavily across the 500,000 SNPs tested. So what does that all mean? This result indicates that there is a genetic component to European ethnic groups.

Not entirely surprising, because in 2006, we saw the open access journal PLoS Genetics publish a typing of 5,000 SNPs among about 1,000 Europeans and European Americans. In that paper, the researchers were able to resolve the genetic differences between northern and southern European groups. Image below. Also, in January of this year I read and reviewed two papers that did similar tests, comparing 300,000 SNPs among approximately 4,198 European Americans. After some principal component analyses (PCA), there was a clear distinction between individuals with northern and southern European ancestry, as well as separation of Italian, Spanish, and Greek individuals from those of Ashkenazi Jewish ancestry.
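
For anyone curious what a PCA like this is doing under the hood, here is a minimal sketch on made-up data. It is not the analysis from any of these papers: the two "populations," their allele frequencies, and the function names `simulate_genotypes` and `pca_scores` are all my own assumptions for illustration. Each person is a row of 0/1/2 genotype counts, and the first principal component ends up pulling the two simulated groups apart.

```python
import numpy as np

def simulate_genotypes(n_per_group, n_snps, shift, rng):
    """Simulate 0/1/2 genotype counts for two hypothetical populations."""
    # Allele frequencies for the first group
    base = rng.uniform(0.2, 0.8, size=n_snps)
    # Second group: same frequencies nudged by random drift, kept inside (0, 1)
    other = np.clip(base + rng.normal(0.0, shift, size=n_snps), 0.05, 0.95)
    group_a = rng.binomial(2, base, size=(n_per_group, n_snps))
    group_b = rng.binomial(2, other, size=(n_per_group, n_snps))
    return np.vstack([group_a, group_b]).astype(float)

def pca_scores(genotypes, n_components=2):
    """Project individuals onto the top principal components via SVD."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

rng = np.random.default_rng(0)
n_per_group = 50
genotypes = simulate_genotypes(n_per_group, n_snps=1000, shift=0.2, rng=rng)
labels = np.array([0] * n_per_group + [1] * n_per_group)

scores = pca_scores(genotypes)
pc1 = scores[:, 0]

# Distance between the group means on PC1, compared with the spread inside each group
gap = abs(pc1[labels == 0].mean() - pc1[labels == 1].mean())
spread = max(pc1[labels == 0].std(), pc1[labels == 1].std())
print(f"PC1 gap between groups: {gap:.2f}, largest within-group spread: {spread:.2f}")
```

Real studies add steps this sketch skips, such as quality filtering of SNPs and individuals, so treat it only as a picture of the basic idea.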

PLoS Genetics has also recently published a similar paper, “Tracing Sub-Structure in the European American Population with PCA-Informative Markers,” which announces a purely computational method of identifying ancestry, one that doesn’t require polling individuals about their self-identified ethnic background. The researchers analyzed 1,521 individuals for more than 300,000 SNPs across the entire genome.

While the data set is not as robust as the one in the Current Biology paper, the authors were able to pluck out 200 ancestry informative SNPs that accurately predict fine structure in European American datasets, as identified by PCA. They did so by removing any redundant SNPs uncovered during the modeling process. Moreover, much of the genetic variation identified was between the northern and southern European ancestry groups.
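
To make the idea of "PCA-informative" SNPs a bit more concrete, here is a rough sketch, again on simulated data. I am assuming a simple scoring rule, ranking each SNP by its squared loading on the top principal component, and I skip the redundancy-removal step entirely, so this is a simplified stand-in and not the paper's actual algorithm. The names `snp_scores` and `top_snps`, the group sizes, and the allele frequencies are all hypothetical.

```python
import numpy as np

def snp_scores(genotypes, n_components=1):
    """Score each SNP by its summed squared loadings on the top components."""
    centered = genotypes - genotypes.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return (vt[:n_components] ** 2).sum(axis=0)

def top_snps(genotypes, n_keep, n_components=1):
    """Return the indices of the n_keep highest-scoring SNPs."""
    scores = snp_scores(genotypes, n_components)
    return np.argsort(scores)[::-1][:n_keep]

rng = np.random.default_rng(1)
n_per_group, n_snps, n_informative = 200, 300, 30

# Two hypothetical groups that differ only at the first 30 SNPs
freq_a = np.full(n_snps, 0.5)
freq_b = np.full(n_snps, 0.5)
freq_a[:n_informative] = 0.2
freq_b[:n_informative] = 0.8

group_a = rng.binomial(2, freq_a, size=(n_per_group, n_snps))
group_b = rng.binomial(2, freq_b, size=(n_per_group, n_snps))
genotypes = np.vstack([group_a, group_b]).astype(float)

selected = top_snps(genotypes, n_keep=n_informative)
recovered = len(set(selected.tolist()) & set(range(n_informative)))
print(f"{recovered} of the {n_informative} selected SNPs are truly informative")
```

The point of the exercise is that a small panel chosen this way carries most of the structure the full SNP set shows, which is why a few hundred markers can stand in for hundreds of thousands.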

Going back to the ‘is this surprising?’ point, in 1990, Barbujani et al. noted the delineation of northern and southern Europeans in the distribution of 63 allele frequencies, in “Zones of sharp genetic change in Europe are also linguistic boundaries,” and argued that the language affiliation of European populations plays a major role in maintaining, and probably causing, genetic differences. Makes sense.

