Scientific American has a news piece explaining the implications of one of the new studies on the human genome that I reported on last week. In a nutshell, the news piece explains how the identification of 525 new regions throughout the genome affects the current human reference genome, raising concerns that the reference genome may be faulty and that there may be yet-to-be-uncovered genes missing from it.

The Human Genome Project assembled the reference genome in question in 2003. It is an amalgamation of sequences from four people (two men and two women) and still has gaps in it. I look forward to seeing whether the reference genome will be amended based on these findings. If you think about it, assembling a more complete reference genome is going to be a really big challenge. To recap the conclusions of the study:

“The researchers identified 1,695 instances of structural variations, 800 of which had not been previously reported. Fifty percent of the regions affected by these mutations showed up in more than one of the people studied. Forty percent of the 525 regions found to be missing from the reference genome were due to copy number variations, which means that a crop of yet-to-be-discovered genes may be hiding within them.”

With so many variants floating around, both large ones like copy number variations and small ones like SNPs, the reference genome must be assembled with the most common sets of alleles. That’s going to take a lot of work: the genomes of many people from different ethnic backgrounds will need to be sequenced, assembled and folded into the current model.
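
To make the "most common sets of alleles" idea concrete, here is a toy sketch in Python. It is not how a reference genome is actually assembled, just an illustration of picking the majority allele at each site. The sample names, site names and allele calls are all made up, and the input layout (one dictionary of calls per person) is an assumption chosen for readability.

```python
from collections import Counter


def major_alleles(samples):
    """Return the most common allele at each site across all samples.

    `samples` maps a sample name to a dict of site -> allele.
    This layout is hypothetical and used only for illustration.
    """
    counts = {}
    for calls in samples.values():
        for site, allele in calls.items():
            # Tally how often each allele is seen at this site.
            counts.setdefault(site, Counter())[allele] += 1
    # Keep the top allele per site; ties go to whichever allele was seen first.
    return {site: c.most_common(1)[0][0] for site, c in counts.items()}


# Made-up calls for three people at two sites.
samples = {
    "person_a": {"site1": "A", "site2": "G"},
    "person_b": {"site1": "A", "site2": "T"},
    "person_c": {"site1": "C", "site2": "T"},
}

consensus = major_alleles(samples)
print(consensus)
```

With only three people, a single genome can swing the majority at a site, which is why sequencing many people from different backgrounds matters so much here.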