Episode 1: A ScreenCast Tutorial On How-To Do A Multiple Sequence Alignment & Draw A Phylogenetic Tree Using Swami

The last time I did a little tutorial on how to use bioinformatic tools in anthropological research was last October. I’ve had some ideas since then and have decided to restart this project. The biggest change is the screencast format, rather than a set of static instructions.

Today, I’d like to introduce the first installment in this series of tutorials on commonly used bioinformatic tools, starting with multiple sequence alignment and phylogenetic tree drawing. Multiple sequence alignments and phylogenetic trees are used in evolutionary analyses to understand the similarities and differences among sequences of DNA, RNA, or amino acids. The basic premise is that more similar sequences generally share a more recent common ancestor than dissimilar sequences do.
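To make that premise concrete, here is a minimal Python sketch of one common similarity measure, the p-distance (the proportion of sites that differ between two already-aligned sequences). This is not what Swami runs internally; the function name `p_distance` and the three toy sequences are my own assumptions, made up purely for illustration.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ("-") are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # ignore gap columns
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Hypothetical toy sequences, already aligned (NOT real D-Loop data).
toy = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "ACGAACTTAT",
}

for first, second in [("A", "B"), ("A", "C"), ("B", "C")]:
    print(first, second, p_distance(toy[first], toy[second]))
```

In this toy set, A and B differ at fewer sites than A and C, so a distance-based method would group A with B first. Real analyses usually apply a substitution model on top of a raw count like this.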

In this episode, I compare the D-Loop sequence of the mitochondrial genome of two Neandertals, one modern human, a chimpanzee, a gorilla, and an orangutan using Swami, a cohesive collection of commonly used tools. Swami allows us to do a multiple sequence alignment and generate a phylogenetic tree. The results are displayed above and to the right. I’ve recorded this 7-minute, 30-second screencast for you to follow. If you’d like to give it a run for yourself, here’s the array of primate D-Loop sequences I’ve used:

>Neandertal-1 (AF254446.1)
CCAAGTATTGACTCACCCATCAACAACCGCCATGTATTTCGTACATTACTGCCAGCCACCATGAATATTG
TACAGTACCATAATTACTTGACTACCTGTAATACATAAAAACCTAATCCACATCAACCCCCCCCCCCCAT
GCTTACAAGCAAGCACAGCAATCAACCTTCAACTGTCATACATCAACTACAACTCCAAAGACACCCTTAC
ACCCACTAGGATATCAACAAACCTACCCACCCTTGACAGTACATAGCACATAAAGTCATTTACCGTACAT
AGCACATTATAGTCAAATCCCTTCTCGCCCCCATGGATGACCCCCCTCAGATAGGGGTCCCTTGA

>Neandertal-2 (AF011222.1)
GTTCTTTCATGGGGGAGCAGATTTGGGTACCACCCAAGTATTGACTCACCCATCAGCAACCGCTATGTAT
CTCGTACATTACTGTTAGTTACCATGAATATTGTACAGTACCATAATTACTTGACTACCTGCAGTACATA
AAAACCTAATCCACATCAAACCCCCCCCCCCATGCTTACAAGCAAGCACAGCAATCAACCTTCAACTGTC
ATACATCAACTACAACTCCAAAGACGCCCTTACACCCACTAGGATATCAACAAACCTACCCACCCTTGAC
AGTACATAGCACATAAAGTCATTTACCGTACATAGCACATTACAGTCAAATCCCTTCTCGCCCCCATGGA
TGACCCCCCTCAGATAGGGGTCCCTTGAT

>Human (X90314.1)
TTCTTTCATGGGGAAGCAGATTTGGGTACCACCCAAGTATTGACTTACCCATCAACAACCGCTATGTATT
TCGTACATTACTGCCAGCCACCATGAATATTGCACGGTACCATAAATACTTGACCACCTGTAGTACATAA
AAACCCAATCCACATCAAAACCCCCTCCCCATGCTTACAAGCAAGTACAGCAATCAACCCTCAACTATCA
CACATCAACTGCAACTCCAAAGCCACCCCTCACCCACTAGGATACCAACAAACCTACCCACCCTTAACAG
TACATAGTACATAAAGCCATTTACCGTACATAGCACATTACAGTCAAATCCCTTCTCGTCCCCATGGATG
ACCCCCCTCA

>Chimpanzee (AF176766.1)
GTACCACCTAAGTATTGGCCTATTCATTACAACCGCTATGTATTTCGTACATTACTGCCAGCCACCATGA
ATATTGTACAGTACTATAACCACTCAACTACCTATAATACATTAAGCCCACCCCCACATTACAACCTCCA
CCCTATGCTTACAAGCACGCACAACAATCAACCCCCAACTGTCACACATAAAATGCAACTCCAAAGACAC
CCCTCTCCCACCCCGATACCAACAAACCTATGCCCTTTTAACAGTACATAGTACATACAGCCGTACATCG
CACATAGCACATTACAGTCAAATCCATCCTTGCCCCCACGGATGCCCCCCCTCAGATAGG

>Gorilla (AF089820.1)
TTCTTTCATGGGGAGACGAATTTGGGTGCCACCCAAGTATTAGTTAACCCACCAATAATTGTCATGTATG
TCGTGCATTACTGCCAGCCACCATGAATAATGTACAGTACCACAAACACTCCCCCACCTATAATACATTA
CCCCCCCTCACCCCCCATTCCCTGCTCACCCCAACGGCATACCAACCAACCTATCCCCTCACAAAAGTAC
ATAATACATAAAATCATTTACCGTCCATAGTACATTCCAGTTAAACCATCCTCGCCCCCACGGATGCCCC
CCTTCAGATAGGGATCCCTTAAA

>Orangutan (X97708.1)
TTCTTTCATGGGGGACCAGATTTGGGTGCCACCCCAGTACTGACCCATTTCTAACGGCCTATGTATTTCG
TACATTCCTGCTAGCCAACATGAATATCACCCAACACAACAATCGCTTAACCAACTATAATGCATACAAA
ACTCCAACCACACTCGACCTCCACACCCCGCTTACAAGCAAGTACCCCCCCATGCCCCCCCACCCAAACA
CATACACCGATCTCTCCACATAACCCCTCAACCCCCAGCATATCAACAGACCAAACAAACCTTAAAGTAC
ATAGCACATACTATCCTAACCGCACATAGCACATCCCGTTAAAACCCTGCTCATCCCCACGGATGCCCCC
CCTCAGTTAGTAATCCCTTACT
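Before pasting sequences like these into an alignment tool, it can help to check that they are of comparable length. The following is a rough Python sketch, not part of Swami: `parse_fasta`, `flag_length_outliers`, the 2x threshold, and the toy records are all assumptions of mine for illustration.

```python
from statistics import median


def parse_fasta(text: str) -> dict[str, str]:
    """Parse FASTA-formatted text into {header: sequence}."""
    parts: dict[str, list[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            name = line[1:].strip()
            parts[name] = []
        elif name is not None:
            parts[name].append(line)  # sequence may span several lines
    return {n: "".join(chunks) for n, chunks in parts.items()}


def flag_length_outliers(records: dict[str, str], factor: float = 2.0) -> list[str]:
    """Return names of sequences longer than `factor` times the median length."""
    mid = median(len(seq) for seq in records.values())
    return [name for name, seq in records.items() if len(seq) > factor * mid]


# Hypothetical toy records (NOT the D-Loop data above).
toy_text = """
>toy-1
ACGTACGT

>toy-2
ACGTA
CGTA

>toy-3
ACGTACGTACGTACGTACGT
"""

records = parse_fasta(toy_text)
for name, seq in records.items():
    print(name, len(seq))
print("possible outliers:", flag_length_outliers(records))
```

To run it on the real data, you could save the sequences above to a text file and pass its contents to `parse_fasta`. A flagged sequence is only a prompt to look closer, since unequal lengths are not necessarily an error.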

Please check it out and let me know what you think. Do you like this format? Did you find it useful? Was I moving too fast? Did I explain what I was doing thoroughly enough? And lastly, what would you like to see next?

5 thoughts on “Episode 1: A ScreenCast Tutorial On How-To Do A Multiple Sequence Alignment & Draw A Phylogenetic Tree Using Swami”

  1. Hmmm… I’m no expert in this bioinformatic tech, but I find it quite odd that gorillas look more distant than orangutans in that NJ tree (normally gorilla would be closer to chimp and human than to orangutan). I also find it odd that the orangutan’s sequence is more than twice as long as any of the others, and I wonder if that may have introduced errors.

  2. Hi Luis,

    Thanks for your comment. You raise some valid points that I overlooked. I’ve taken out the ‘extra’ sequences in the orangutan sample and updated the sample sequences as well as the image. The gorilla is now slightly closer than the orangutan, but Neandertal-1 is now more dissimilar than Neandertal-2.

    Kambiz

  3. Glad to be of some use. It looked strange, just that.

    The gorilla is now slightly closer than the orangutan, but Neandertal-1 is now more dissimilar than Neandertal-2.

    That’s pretty interesting. I wonder if that means something or is just “noise” introduced by the small sample. I can just assume that humans, chimps, etc. are not just points either but that they are distributed by some small range. Still the distance between both neanders seems pretty large. N-1 is almost twice as distant from the split with sapiens as his cousin N-2.

    It looks as if they had a large genetic diversity and also as if N-2 would be more “archaic” and N-1 more “evolved”. Are they from very different dates? Or maybe geographies? (I think I know which are the two specimens but too tired to search for them now).

    It cannot mean hybridization in any case, because that would not affect the mtDNA at all.

  4. Looking in more detail, it looks like chimpanzees (or at least the dot representing them) would be less than 50% more distant from us than N-1. That is even more intriguing.

    It seems to mean that, if 2+2=4 and that program works fine (and there are no errors in the sequences), N-1 is almost as distant from us as chimps, which goes against everything we think we know about Homo sp. and especially about our close cousins the Neanderthals.

    It makes little sense. I’d check for any possible input error first of all.

    If there are no such errors… then is it a discovery or just a misunderstanding?

  5. Erratum:

    I just wrote … would be less than 50% more distant…

    Actually I forgot to measure the smaller segment. It’s more like 55, maybe 60%.

    But the head scratching remains.

Comments are closed.