Tuesday, March 13, 2007

Sailing the Genomes Blue

Today's Wall Street Journal had an item on Craig Venter's new publication in PLoS Biology describing the collection and metagenomic sequencing of seawater from around the world. You'll need to have paid access to the WSJ, or find a print copy (my access), or perhaps it will show up on a free newspaper site at some point (many WSJ articles do via the wire services). Further information is available on the expedition's website, including pictures of their sailboat Sorcerer II.

The raw numbers are amazing: 6.3 Gbp of raw data -- or about 2 human genome equivalents, taking the haploid human genome as roughly 3.1 Gbp -- and all apparently by 'old-fashioned' fluorescent Sanger sequencing. Samples were collected at regular intervals along the sailing route.

There's a lot in the paper, and I won't pretend to have read all of it. One interesting bit is what the authors call 'extreme assembly'. Whereas most genome assembly schemes attempt to minimize the probability of getting chimaeric assemblies (with data glommed together that should be apart), this approach tries to get as big an assembly as possible -- as long as 900Kb from this dataset. While chimaeras are expected (and found), the hope is that you can untangle the knots later but that these extreme assemblies will be useful in collecting sequences together that should go together.
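To make that trade-off concrete, here is a toy sketch in Python. It is my own illustration, not the authors' assembler: the reads, the function names (`overlap`, `greedy_assemble`) and the `min_overlap` knob are all invented for the example. It greedily merges reads by their longest suffix/prefix overlap; a stringent overlap cutoff plays the role of a conventional assembly, and a permissive one plays the role of an 'extreme' assembly.

```python
# Toy greedy overlap assembler. Everything here is hypothetical and only
# illustrates the stringent-vs-permissive trade-off; real assemblers also
# handle sequencing errors, reverse complements, and mate pairs.

def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no such overlap of at least min_len bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_overlap):
    """Repeatedly merge the pair of contigs with the longest overlap."""
    contigs = list(reads)
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i == j:
                    continue
                k = overlap(a, b, min_overlap)
                if k > best_k:
                    best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left that overlaps enough
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


# Made-up reads: two from "organism X", two from "organism Y".
READS = [
    "AAGGCTTAC", "CTTACGGAT",   # X: the pair shares a 5-base overlap
    "ATCCGGTGT", "GGTGTCCTG",   # Y: the pair shares a 5-base overlap
]

conservative = greedy_assemble(READS, min_overlap=5)
extreme = greedy_assemble(READS, min_overlap=2)

print(len(conservative), "contigs with a stringent cutoff:", conservative)
print(len(extreme), "contig(s) with a permissive cutoff:", extreme)
```

With these toy reads, the stringent cutoff keeps X and Y as separate contigs, while the permissive cutoff glues them together through a chance two-base overlap: a bigger assembly, but a chimaeric one that would need untangling afterwards.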

One other nice bit: in addition to deposition at NCBI, the data & tool set will be made freely available at a site called CAMERA. One of my long-held idealistic beliefs about the genome project & bioinformatics is that they can be a great leveler of educational institutions (or more properly, a great boost for many smaller schools). With hardware which is increasingly cheap & ubiquitous, any undergraduate (or high school student!) can do interesting analyses using tools and data which are freely accessible. When I was an undergraduate, our budget for sequencing was about one kit per semester (and these were the pre-ABI days -- we're talking radioactive dideoxy here) -- and with a little bad luck we never got any useful data. I dabbled with public sequence data then -- but how little there was. Now, an undergraduate funded far worse than I was can have an endless supply of explorations.

The WSJ item brought out one interesting incident: at one point Venter and his crew were apparently placed under house arrest in a Pacific island nation (I forget which one; it was in the article). Treaties on bioprospecting give nations the right to regulate such activities in their territorial waters, and Venter apparently didn't have the correct permits. Of course, the seawater bugs are probably rather deficient in critical documents such as passports, nor do I expect they swear allegiance to any nation.

Venter has, of course, attained the career status many claim to dream of (particularly in the context of mega-lottery winnings): he is independently wealthy & gets to combine his favorite leisure activity with further promotion of his scientific interests. Color me several shades of green.


Jonathan Eisen said...

Well, since you did comment about the need for a subscription to the WSJ to read the article, I think it is worth pointing out that the articles themselves are all freely available, since PLoS Biology is an Open Access journal. See the collection here.

Keith Robison said...

Yes, my oversight -- PLoS is of course free.