Sunday, October 31, 2010

Plenty of Genomes are Still Fair Game for Sequencing


I've been grossly neglecting this space for an entire month with only the usual excuses -- big work projects, a lot of reading, etc. None good enough. Worst of all, as usual, it's not that I haven't composed possible entries in my head -- they just never get past my fingertips.

Tonight is the night most associated with pumpkins, and an earlier highlight was attending the Topsfield Fair, where the pictured specimen was on display. Amazing as it is, it fell nearly 15 pounds shy of the world record. If you want to try to grow your own, the variety that has dominated the winners can be purchased each year. Nature isn't everything, though; champion pumpkin growing requires a lot of specialized culture, ranging from allowing only a single fruit to set to injecting nutrients just upstream of that fruit.

Sometime in recent memory, GenomeWeb noted two other blogs discussing whether there are any truly remarkable genome sequencing projects left. That got me pondering what makes a species very interesting to sequence. Now, both of the bloggers mentioned clearly were not fond of either "K" genome project -- the 1,000 humans or the 10,000 vertebrates. There were also some potshots taken at the "delicious or cute" genomes concept. One suggested that no interesting metazoa ("animals") are left.

So, what does make an interesting genome? Well, I can think of several broad categories. I'll try to throw out possible examples of each, though to be honest I wouldn't be surprised if some of these genomes are sequenced or nearly so -- it's very hard to keep track of complete genomes these days!

The first category, which I think would resonate with the authors of those two critical posts, is genomes with interesting histories -- genomes that might tell us stories purely about DNA. This was the bent of the posts I refer to. In particular, they were thinking of the many unicellular eukaryotes that are the result of multiple endosymbiont acquisition / genome fusion events. But I would definitely throw into this category a particular group of animals: the bdelloid rotifers, which have gone without recombination for a seeming eternity. Of course, to really understand such a genome, you'd need to also sequence one of the less chaste rotifers.

Another hugely interesting class of genomes would be those that shed light on development and its evolution (evo-devo). In particular, there are a lot of arthropod genomes yet unsequenced -- from what I've noted, it appears that most sequenced arthropods are either disease vectors, agricultural pests or economically important (plus, of course, the model Drosophila). Even so, I'd guess there are not many more than a dozen complete arthropod genomes so far -- quite a paucity considering the wealth of insects alone. And, if I'm not mistaken, mostly insects and an arachnid or two have gone fully through the sequencer -- where are all the others? By the way, I'd be happy to help with sample prep for the Homarus americanus genome!

Another huge space of genomes worth exploring is those where we are likely to find unusual biochemistry going on. Now, a lot of those genomes are bacterial or fungal, but there are also an awful lot of advanced plants that have interesting & useful biochemical syntheses.

All that said, I find it odd that some don't see the import and utility of sequencing many, many humans and a lot of vertebrates as well. It is important to remember that a lot of funding comes from the public, and the public considers many of these other pursuits less important than making medical advances. It is easy for those of us in the biology community to see the longer threads connecting these projects to human health, or just the importance of pursuing curiosity, but that doesn't always sell well in public.

An optimistic view is that all the frustrated sequencers should hunker down and patiently wait; data generation for new genomes is getting cheaper by the minute, with short reads to fill out the sequence and ultra-long reads to replace physical mapping. A more conservative view holds that bioinformatics & data storage will soon dominate the equation, which might still make it hard to get lots of worthy genomes sequenced.

Personally, I can't stroll a country fair without wanting to sequence just about everything I see on display -- the chickens that look like Philadelphia Mummers, the two-yard-long squash, the bizarrely shaped tomatoes -- and of course, the three-quarter-ton-plus pumpkins.