Anthropology.net

Beyond bones & stones

Archive for January 17th, 2008

Christopher Columbus’ Package of Love, Syphilis

In my life, I’ve seen Christopher Columbus’ reputation take the downward spiral from hero to enemy. It has even affected me in a very superficial manner. See, even though the US government has commemorated the day he found the Americas as a holiday, I remember being devastated the year when my school district decided to nix the day off I was so looking forward to.

Since then, my education in anthropology hasn’t held him to a high standard either. He’s often vilified for starting the end of Native American existence. And now a new study in PLoS Neglected Tropical Diseases traces the emergence of syphilis in Europe to the time when Columbus returned from the Americas. Furthermore, a phylogenetic comparison of the syphilis-causing bacterium, Treponema pallidum, shows that it is a close cousin to the bacterium behind the South American tropical disease yaws.

Actually, for some time Columbus has been blamed for bringing syphilis to the East. But a skeleton of a man was found in northeastern Britain with signs of bone lesions similar to those caused by syphilis. Preliminary dates of this skeleton suggested that the man had died around 1442, exonerating Columbus for a bit. Here’s a link to that paper, ‘“The syphilis enigma”: the riddle resolved?’

Since then, anthropologists re-evaluated the date and suggested that the fishy diet of the region somehow affected the dating technique, making the skeleton seem older than it is. With all this confusion over paleopathology and dating techniques, a genetic analysis of the Treponema bacterium seemed much more logical.

In order to conduct the study, 22 human samples, including two yaws samples, were compared. Even though yaws is not sexually transmitted (it is transmitted through skin contact), it was included because it is thought that the South American variety is a good candidate for the source of the venereal disease. Since the bacteria are so fragile, only some sections of the genome could be recovered, including 17 base pairs that ended up being diagnostic of the different Treponema. Through a SNP analysis, it was found that syphilis and South American yaws shared 4 identical base pairs. Not entirely convincing, but the overlap with any other kind of Treponema was almost nonexistent… A phylogenetic tree of the Treponema samples also showed that syphilis had evolved most recently of the bacterial strains studied, and by most recently we’re talking about 500 or so years ago.

A network path for four informative substitutions shows that New World subsp. pertenue, or yaws-causing strains, are the closest relatives of modern subsp. pallidum strains.
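
To make the site-by-site comparison described above a little more concrete, here is a minimal Python sketch of counting how many diagnostic positions two strains share. The strain names and base strings are made-up toy values, not data from the study, and the function name shared_sites is just an illustrative choice.

```python
def shared_sites(a: str, b: str) -> int:
    """Count positions where two equal-length strings of diagnostic bases agree."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return sum(x == y for x, y in zip(a, b))


# Toy profiles (hypothetical, NOT from the paper): one base per diagnostic site.
query = "ACGTTGCA"
others = {
    "strain_b": "ACGTAGCT",
    "strain_c": "TGCATGCA",
}

# The strain sharing the most sites with the query is its closest match
# under this very crude measure.
closest = max(others, key=lambda name: shared_sites(query, others[name]))
print(closest, shared_sites(query, others[closest]))
```

A real phylogenetic analysis weighs which substitutions are informative and builds a tree or network from them; a raw count of matches like this is only the first intuition.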

Columbus’ crew is the only one known to have voyaged to the Americas during that time. And the first recorded epidemic of syphilis in Europe broke out among French troops in 1495, two years after Columbus returned from his first voyage across the Atlantic, which points the finger at him and his people. But, with our knowledge of how bacteria can share genes, I’m not entirely convinced it was him. Furthermore, a 4-base homology isn’t a lot.

Written by Kambiz Kamrani

January 17, 2008 at 12:58 pm

The 20,500 Protein-Encoding Genome We Call Our Own

Earlier this week, news of a new paper about the number of protein-encoding genes surfaced on Sandwalk and Henry. The paper’s title is straightforward, “Distinguishing protein-coding and noncoding genes in the human genome,” but the concepts behind it may not be.

As mentioned in Sandwalk, the initial estimate of the number of genes in the human genome was about 30,000. That was when the first drafts of the human genome became available in June of 2000. Since then the numbers have been fluctuating, and for many it may seem like geneticists and molecular biologists working on annotating the human genome are riding a roller coaster of indecision. In reality, it is not easy to calculate the exact number of genes in any genome.

Why is it not easy to calculate the number of genes? The human genome is around 3,000,000,000 bases long. That’s three thousand million, and the average human gene is 12,000 bases long! It is almost like finding a needle in a haystack, but thankfully there is some organization in the genome that helps us find genes faster. Large deserts of junk DNA exist, which helps narrow down where genes can be found. And since a gene has a start and a stop, we can harness the power of computers to scan and seek out these signals.

See, the current workflow to estimate the number of genes is to first isolate genomic DNA from the organism. The DNA is then sheared up into many fragments and, depending on the cloning mechanism, the fragments are amplified by PCR, in vector-carrying bacteria, or both! Once amplified, the fragments are then sequenced. This is called shotgun sequencing, the method that Craig Venter deployed to help accelerate the sequencing of the human genome. Since some fragments are larger than others and their sequences overlap, it is possible to create scaffolding based on the overlapping sequences, called contigs, to figure out where fragments fall in order. This is called the assembly of the genome.
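
As a rough illustration of the assembly idea above, here is a toy Python sketch that repeatedly merges the two fragments with the longest suffix-to-prefix overlap. It is a simplified greedy approach on made-up fragments, not the method any real assembler or the human genome projects used, and the names overlap and greedy_assemble are illustrative.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Merge fragments greedily by best overlap; returns the resulting contigs."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps any more: leftovers stay as separate contigs
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Toy fragments (hypothetical) cut from the sequence ATGGCGTACGTTAGC.
contigs = greedy_assemble(["ATGGCGTA", "CGTACGTT", "CGTTAGC"])
print(contigs)
```

Real genomes are full of repeats and sequencing errors, which is exactly where a naive greedy merge like this goes wrong and why assembly is hard.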

Once most of the fragments are assembled, it is also possible to annotate the genome. To annotate means to explain what the nucleotide sequence means. If a nucleotide sequence begins with a start codon and ends with a stop codon in frame, it raises a big flag that this sequence may be a gene. There are a lot of definitions of a gene, and for the sake of this post, let’s run with the one that defines a gene as any sequence of DNA that is transcribed. This segment of the genome is further scrutinized for splice sites and any other regions, such as regulatory sequences, to help figure out if it’s really a gene. The sequence is also compared to other known sequences using BLAST, a tool that compares the sequence to a massive database of sequences. If any significant matches to already known genes come up, the possibility that the unknown sequence is a gene increases, based on the observation that genes are generally highly conserved throughout evolutionary time.

If the sequence meets all the criteria of a gene, it is labeled an open reading frame, or ORF. ORFs are putative genes. In order to confirm an ORF, researchers often need to turn to the wet lab to find the gene expressed as either an RNA or a protein in an organism. With 30,000 or so ORFs, the process of validating each gene is enormous and time-consuming. Not every research lab is working on confirming whether an ORF is really a gene, so that also slows down the process.
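
The "start codon, then an in-frame stop codon" flag described above can be sketched in a few lines of Python. This is a deliberately simplified scan of the forward strand only, on a made-up sequence; it ignores introns, splice sites, and the reverse strand, all of which real gene finders have to handle. The function name find_orfs and the min_codons threshold are illustrative assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end) pairs for forward-strand stretches that begin with ATG
    and run to the first in-frame stop codon. end is exclusive and includes the stop."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon seen in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next start codon
    return sorted(orfs)


# Toy sequence (hypothetical): ATG AAA TTT TAG sits in the third reading frame.
toy = "CCATGAAATTTTAGGG"
hits = find_orfs(toy)
print(hits, [toy[s:e] for s, e in hits])
```

With the average human gene running to thousands of bases and mostly intron, a real scan would use a much larger minimum length and many more signals than this.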

The research conducted in the paper above involved scrutinizing 22,000 ORFs from the Ensembl database. The analysis revealed a lot of orphan DNA sequences. Orphan sequences look like they encode proteins because of their open reading frames, but they are not present in the mouse and dog genomes. Just because dogs and mice don’t have the ORFs doesn’t mean the ORFs aren’t real genes. They could be unique primate genes, arising during or after the primate lineage split from the rest of the mammals. Or, the genes could be more ancient creations that were lost in the mouse and dog lineages. Either way, if the ORFs are real genes, they should also appear when compared to primate genomes.
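
The orphan idea above boils down to a set difference: take the human ORFs and subtract everything that has a match in a comparison genome. Here is a minimal Python sketch with made-up ORF identifiers; the IDs, the species lists, and the function name find_orphans are hypothetical and are not how the paper's pipeline is actually written.

```python
def find_orphans(human_orfs, matches_by_species):
    """Return the ORF ids that have no match in any of the comparison genomes."""
    found = set().union(*matches_by_species.values())
    return sorted(set(human_orfs) - found)


# Toy data (hypothetical ids): which human ORFs have a counterpart in each genome.
human_orfs = ["orf1", "orf2", "orf3", "orf4"]
matches = {
    "mouse": {"orf1", "orf2"},
    "dog": {"orf2", "orf3"},
}

orphans = find_orphans(human_orfs, matches)
print(orphans)
```

The same function could then be pointed at primate genomes to see which orphans still come up empty.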

Comparing the ORFs to the chimpanzee and macaque genomes invalidated a total of about 5,000 ORFs that had been incorrectly added to the lists of protein-coding genes. This reduces the current estimate to roughly 20,500 genes that encode proteins in the human genome. That’s not much, but evolution isn’t a numbers game. Some of the variation in the genes, as well as the patterns of regulation and expression of these genes, is what makes us human. So if you’re thinking, “Why do humans have so few genes?” don’t fret, size doesn’t matter in this case.

Written by Kambiz Kamrani

January 17, 2008 at 11:55 am
