Sunday, March 08, 2009

The next level in genomics term papers

I've been intrigued for a few months now since hearing about a St. Louis company called Cofactor Genomics. Right on their front webpage they advertise they will generate & assemble 680Mb of sequence (from an Illumina machine) for the paltry sum of $4.7K.

Wow! That would fit on my credit card when I was a graduate student (though it would have been a few months' stipend). 680Mb is 100+X coverage of an E. coli-class genome, or about 50X coverage of Saccharomyces. It's even well over 0.5X coverage of an awful lot of interesting eukaryotes.
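For anyone who wants to check that arithmetic, here is a minimal sketch. The genome sizes are my own round-number assumptions (about 4.6Mb for E. coli K-12 and about 12.1Mb for S. cerevisiae), not figures from Cofactor, and the function names are just illustrative.

```python
# Back-of-the-envelope fold coverage: total sequence yield / genome size.
# Genome sizes below are approximate, assumed values for illustration.

RUN_YIELD_MB = 680.0  # advertised yield of one run, in megabases

ASSUMED_GENOME_MB = {
    "E. coli K-12": 4.6,      # assumption: ~4.6 Mb
    "S. cerevisiae": 12.1,    # assumption: ~12.1 Mb
}


def fold_coverage(yield_mb: float, genome_mb: float) -> float:
    """Average number of times each base is covered by the run."""
    return yield_mb / genome_mb


def max_genome_mb(yield_mb: float, target_coverage: float) -> float:
    """Largest genome (in Mb) that still reaches the target coverage."""
    return yield_mb / target_coverage


if __name__ == "__main__":
    for name, size_mb in ASSUMED_GENOME_MB.items():
        print(f"{name}: {fold_coverage(RUN_YIELD_MB, size_mb):.0f}X")
    print(f"0.5X reaches genomes up to {max_genome_mb(RUN_YIELD_MB, 0.5):.0f} Mb")
```

With those assumed sizes the run works out to roughly 148X for E. coli and 56X for yeast, and 0.5X coverage stretches to genomes of about 1,360Mb.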

As an aside, I feel obligated to stress that I don't have any personal stake in, or direct relationship with, Cofactor Genomics. I also have no experience with them or any of their competitors. It's just that the ease of accessing their pricing matrix makes them easy to talk about.

At those prices, the idea of doing my own personal genome project can't be easily shooed away. Not a Personal Genome Project -- I worry I'd develop genomania -- but some small genome sequenced on my whim. There's probably still not a shortage of interesting genomes in species I could easily & safely grow up with some forbearance of my shop's management or at a friendly academic lab. There must be some left; there are even some industrially-interesting E. coli strains that seem to lack public sequences. However, even if it wouldn't violate my town's zoning laws to do it in my basement, neither growing biological samples nor the $5K budget would fly with my spouse.

So I'll float a different idea. My only wish is that anyone who tries it post back here, and if you're already doing the same thing I invite your response as well. If I can't do it, why not some class?

Now $5K isn't chicken feed. I'm sure that is far beyond the typical budget for lab experiments in a college class, let alone a high school. Maybe a donor could step in, but these days one is particularly tough to find. But suppose the cost were spread over a lot of students?

One scenario would be for a very large university to make this the project for an entire class. A really huge state school I would guess could have 500+ students a year taking first-year biology. Now we're talking less than $10/student -- perhaps still a significant hit (what is a typical per student budget for such a course?). Each student would get about 1/500th of the genome as their very own research project.
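A quick sketch of that split, using the advertised $4.7K price and my guess of 500 students. The 4.6Mb genome size is again an assumed E. coli-class figure, and the function names are hypothetical.

```python
# Splitting one sequencing project across a large intro biology class.
# Class size and genome size are assumptions for illustration only.

PROJECT_COST_USD = 4700.0      # advertised price for the 680Mb project
CLASS_SIZE = 500               # assumption: a very large first-year course
ASSUMED_GENOME_BP = 4_600_000  # assumption: an E. coli-class genome


def cost_per_student(total_cost: float, students: int) -> float:
    """Even split of the project cost across the class."""
    return total_cost / students


def bases_per_student(genome_bp: int, students: int) -> float:
    """Size of each student's slice of the finished genome."""
    return genome_bp / students


if __name__ == "__main__":
    print(f"${cost_per_student(PROJECT_COST_USD, CLASS_SIZE):.2f} per student")
    print(f"{bases_per_student(ASSUMED_GENOME_BP, CLASS_SIZE):,.0f} bp per student")
```

Under those assumptions each student pays $9.40 and gets about 9,200 bp to call their own, which at typical bacterial gene densities would be a handful of genes apiece.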

At a smaller school, could a genome project become a departmental initiative? A bioinformatics class could set up the analysis pipeline & develop reporting tools. A biochemistry class could map the ORFs to the known biochemical pathways and identify both missing pathways and predicted novel (to the species) enzyme activities. Genetics classes could focus on operon structure or identifying possible regions recently transferred horizontally from another species. Evolution classes could tackle that, or building a bazillion gene trees. It's a bit of a stretch to work this into a human physiology curriculum, though a comparative look at how another biological system manages homeostasis isn't completely absurd.

Of course, when it comes time to publish it will be a very long author list!

I think I've heard of a genome project being run as an undergraduate effort, but I'm guessing a lot of that involved doing the actual sequencing. While there's merit to that, these days even with free labor, large-scale Sanger sequencing isn't cost competitive. Perhaps some departments have one of the next-gen machines & are willing to let some undergraduates play with them -- but I'm guessing that's pretty rare (like a NotI site in an AT-rich genome).

Will sequencing costs ever crash low enough that someone will sequence a genome for a grade school science fair project? I'm not holding my breath, but I certainly wouldn't rule it out.


dd said...

Sally Elgin and Elaine Mardis have been teaching a Research Explorations in Genomics course here at Washington University in St. Louis for several years. There are also sequencing courses offered at Cold Spring Harbor Laboratory.

Anonymous said...

A recent paper in PLoS discusses something similar. The project didn't sequence a genome, but it did use undergraduates to annotate previously sequenced metagenomes. It was titled "Metagenome Annotation Using a Distributed Grid of Undergraduate Students." You can read it here at PLoS, and I discuss their software on our blog.

So, my take is that yeah, this sounds like a definite future possibility.

Anonymous said...

We offer a genome sequencing course exactly as you describe at the University of Florida, Dept of Microbiology. The ugrads sequence a bacterial genome with 454 pyrosequencing. We are working on writing up the manuscript, and yes, it will list all of the students as authors. The paper that describes the pilot effort is in the J. of Microbiology and Biology Education (

Unknown said...

We don't do bugs, but our life science undergrads do regularly get put on human gene sequencing projects, and have been involved in the publications that arise.

Cofactor said...

Hey Keith,

Sounds like a great idea. Let's do it!

Cofactor will ask course organizers for a 1-page description of how their ~700Mb sequencing project will be used as an effective teaching aid in their class. We will review and choose the best entries the first week of May. Those entries will be awarded a free sequencing project including project consultation, sample QC, library construction, sequencing, and computational analysis.

More details can be found at:

Thanks for your thoughts on this Keith. I was a big fan.

Jarret Glasscock, CTO
Cofactor Genomics

Anonymous said...

Cofactor Genomics is not only a technologically capable group, but it is refreshing to see their vibrant scientific curiosity, educational responsiveness, and community involvement.

Exciting times, indeed!

Anonymous said...

The HHMI Science Education Alliance has a national experiment that introduces college freshmen to genomics via an authentic research experience that is implemented in their introductory science laboratory course. They have almost 300 students engaged in the genomics project and will double that number next Fall.