Nature and PNAS have each put out a fascinating paper on the evolution of language.

Nature’s “Quantifying the evolutionary dynamics of language” studies how irregular grammatical forms give way to regular rules over time, a process the authors call regularization. The authors specifically studied the regularization of English verbs over the past 1,200 years. Here’s a summary of what they concluded, from the abstract:

“We have generated a data set of verbs whose conjugations have been evolving for more than a millennium, tracking inflectional changes to 177 Old-English irregular verbs. Of these irregular verbs, 145 remained irregular in Middle English and 98 are still irregular today. We study how the rate of regularization depends on the frequency of word usage. The half-life of an irregular verb scales as the square root of its usage frequency: a verb that is 100 times less frequent regularizes 10 times as fast. Our study provides a quantitative analysis of the regularization process by which ancestral forms gradually yield to an emerging linguistic rule.”

The half-life result is what I consider most important, because this conclusion has a parallel in protein evolution. Proteins that are less vital often accumulate mutations much faster than vital ones. It is remarkable to see that the authors quantified a similar phenomenon in language evolution.
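To make the square-root scaling from the abstract concrete, here is a quick back-of-the-envelope sketch. It assumes nothing beyond what the abstract states (half-life proportional to the square root of usage frequency, and the verb counts 177, 145 and 98). The function name `relative_half_life` is my own invention for illustration, not something from the paper.

```python
import math


def relative_half_life(frequency_ratio: float) -> float:
    """Ratio of two verbs' half-lives, assuming half-life ~ sqrt(usage frequency).

    frequency_ratio is (frequency of verb A) / (frequency of verb B).
    The return value is (half-life of A) / (half-life of B).
    """
    if frequency_ratio <= 0:
        raise ValueError("frequency_ratio must be positive")
    return math.sqrt(frequency_ratio)


# A verb used 100 times less often than another one.
ratio = relative_half_life(1 / 100)
print(f"half-life is about {ratio:.2f}x as long")
print(f"so it regularizes about {1 / ratio:.0f}x as fast")

# Share of the 177 Old English irregular verbs that stayed irregular,
# using the counts quoted in the abstract.
for label, count in [("Middle English", 145), ("Modern English", 98)]:
    print(f"{label}: {count / 177:.1%} still irregular")
```

The point of the sketch is just that the relationship is sublinear: cutting usage by a factor of 100 only shortens the half-life by a factor of 10, so rarely used irregular verbs still fade on a scale of centuries.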

On that note, PNAS ran this paper about a week ago: “Coevolution of languages and genes on the island of Sumba, eastern Indonesia.” Here’s the abstract:

“Numerous studies indicate strong associations between languages and genes among human populations at the global scale, but all broader scale genetic and linguistic patterns must arise from processes originating at the community level. We examine linguistic and genetic variation in a contact zone on the eastern Indonesian island of Sumba, where Neolithic Austronesian farming communities settled and began interacting with aboriginal foraging societies ~3,500 years ago. Phylogenetic reconstruction based on a 200-word Swadesh list sampled from 29 localities supports the hypothesis that Sumbanese languages derive from a single ancestral Austronesian language. However, the proportion of cognates (words with a common origin) traceable to Proto-Austronesian (PAn) varies among language subgroups distributed across the island. Interestingly, a positive correlation was found between the percentage of Y chromosome lineages that derive from Austronesian (as opposed to aboriginal) ancestors and the retention of PAn cognates. We also find a striking correlation between the percentage of PAn cognates and geographic distance from the site where many Sumbanese believe their ancestors arrived on the island. These language–gene–geography correlations, unprecedented at such a fine scale, imply that historical patterns of social interaction between expanding farmers and resident hunter-gatherers largely explain community-level language evolution on Sumba. We propose a model to explain linguistic and demographic coevolution at fine spatial and temporal scales.”

Like genes, words can be compared to one another and scrutinized with phylogenetic analysis to understand their origins. In this case the authors found a correlation between the percentage of Y chromosome lineages that derive from Austronesian ancestors and the retention of Proto-Austronesian cognates, words so similar from one language to the next that they suggest both are variants of a single ancestral prototype.