Monday, February 06, 2012

2012: Enter the Nanopores?

This summer will make it twenty years since I first heard of the concept of nanopore sequencing.  A very affable post-doc in George Church's lab was starting some experiments on the concept.  Unbeknownst to any of us, another group at Harvard in the Biolabs (Dan Branton's) was also working on a nanopore sequencing technology.  In the time since then, the field has generated many papers and much speculation, but no workable sequencer.  I had started joking a few years ago that nanopores were the monorails of sequencing: always the technology of the future.  To be a bit more fair, nanopores had started to resemble nuclear fusion, a tantalizing vision always just out of technological reach.
Still, many kept working in the field and a number of companies were spawned.  Now, multiple companies are trying to raise expectations that nanopore sequencing will become a reality (at least for a select few) this year.  Most prominently, Oxford Nanopore Technologies is scheduled next week at AGBT to make a detailed presentation on their strand sequencing technology and their plans to launch it commercially during the year.  I will, alas, be monitoring this the way I have every AGBT for the life of this blog, via others' live blogs and tweets.

I haven't been following the nanopore saga closely, so I'm realizing I'd better catch up.  OxNano has generally been very coy, though occasionally their chief Techie Clive Brown will post morsels of temptation in public places.  But there is very little to go on.  So, what follows is some rank speculation on my part.  I attempt below to outline what support each guess has, but they are just guesses.  If I'm anywhere close to the mark I'll be amazed.  However, one weak justification for this activity is that it helps tune an appetite for details, so that I'm more likely to look for all the key things.

The Box: This is what we know the most about, since OxNano showed off prototypes last year.  It will fit in a standard computer rack, making it far smaller than any sequencer on the market.  Not tricorder size, but still quite small and certainly not requiring a reinforced floor.  Nothing but electricity and a disposable cartridge as physical inputs; no argon and no waste outlet (other than heat).
I don't believe they've yet proposed the cost.  Some of that will depend on the capabilities of the instrument; the hardware will presumably have some fancy signal processing gear to capture and clean what comes off the sequencing chip.  However, it would seem reasonable that it might be in the range of a big, specialized server: perhaps $30K-$50K.  

The Consumables: The cartridge would seem to have a specialized chip derivatized with several proteins, plus some unmodified nucleotides and enzyme.  Perhaps the best model here is the Ion Torrent chips or perhaps 454 (with the additional enzymes).  However, enzyme usage will be quite low as there won't be a need to flow vast amounts during a complete sequencing run.  So, guessing $100-$1000 here.

Read Length:  The "strand" chemistry involves a polymerase (probably phi29) driving the template strand through the nanopore.  phi29 is a wonderfully processive polymerase, and some of the papers from OxNano advisors suggest reads that end when the polymerase runs off the molecule.  So read length will be dominated by fragment length: really short for FFPE, a bit longer for cDNA and up to about 50Kb for DNA prepared by standard methods.   
However, one question will be whether the size distribution is critical.  PacBio reportedly can achieve super read lengths if the input DNA is uniformly large; otherwise the smaller fragments load preferentially.  This leads to a DNA preparation method which is labor intensive and wasteful of input sample.  If OxNano is not afflicted with similar issues, then no special preparation is needed and input amounts might be minuscule.  Otherwise, expect a very PacBio-like sample prep.  Alternatively, the sorts of methods used to prepare DNA for BAC libraries might be back in style, in order to get unimaginable read lengths.

Accuracy:  Intuition would suggest that as a single molecule method, nanopore strand sequencing can't help but be error-prone.  You get one shot to read things right; no ensemble averaging in your favor.  PacBio can adapt to this with the Circular Consensus strategy, which won't be available for strand sequencing.  But previous comments by folks associated with OxNano have raised expectations of high accuracy.  A system with 50Kb reads with even sub-PacBio accuracy could be quite useful in a limited number of fields; a system with 50Kb reads and Illumina-like accuracy would take a scythe to the current set of sequencers.
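
To put a rough number on the ensemble-averaging point, here is a minimal sketch in Python.  It rests on assumptions that are mine, not anything OxNano or PacBio has described: every pass over a base is right with the same probability, passes err independently, and a base only counts as correct when a strict majority of passes agree on the right call (a conservative rule, since it lumps all wrong calls together).  The function name and the 85% single-pass figure are purely illustrative.

```python
from math import comb


def majority_accuracy(single_pass_accuracy: float, passes: int) -> float:
    """Chance that a strict majority of independent passes call a base correctly.

    Assumes identical, independent per-pass accuracy (a simplification).
    """
    if passes < 1 or passes % 2 == 0:
        raise ValueError("use an odd number of passes so a strict majority exists")
    p = single_pass_accuracy
    needed = passes // 2 + 1  # smallest count of correct passes that wins the vote
    # Sum the binomial probabilities of getting `needed` or more correct passes.
    return sum(
        comb(passes, k) * p**k * (1 - p) ** (passes - k)
        for k in range(needed, passes + 1)
    )


# Hypothetical single-pass accuracy of 85%, read 1, 3, 5 and 9 times.
for n in (1, 3, 5, 9):
    print(n, round(majority_accuracy(0.85, n), 4))
```

Under those assumptions one pass stays at the single-pass figure, while each extra pair of passes pushes the consensus higher.  That is the advantage a one-shot strand read forgoes unless its raw accuracy is high to begin with.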

Extras: Papers from OxNano advisors have suggested the ability to read out modified nucleotides.  The market for this is perhaps limited -- but easy reading out of methylation could change that.

Numbers of Reads: For some applications, such as ChIP-Seq, it is the number of reads which counts, not their length.  How many reads can the system handle simultaneously?  Furthermore, can a pore "restart": if a polymerase runs off the end of a molecule, is that nanopore dead or can another complex find it (or the polymerase grab another DNA)?

Library Prep:  What library prep?  Except for the possibility of sizing mentioned above, it would seem from earlier publications that the system would need no adaptors, no end repair, no ligation.  Just nick the dsDNA and go.
For RNA analysis, it would appear that cDNA synthesis would be required.  That would leave a window for future systems that can directly sequence RNA, but is that much of a market?  Helicos demonstrated direct sequencing, but it's hardly kept them out of the financial ICU.

Speed:  This is real time sequencing.  I expect runs to be speedy; at most an hour or so.  

The Carnage:  As suggested multiple times above, almost any proposed system could be deadly to Pacific Biosciences.  Imagine a system which combined all the positives of PacBio into a package that costs less than a tenth as much and can be brought into the site on an ordinary hand truck?

But more broadly, the ripples could extend far beyond PacBio.  Mate pair sequencing isn't a huge driver in the sequencing world, but it is important for de novo sequencing, for haplotyping, for structural variation studies and there have been papers demonstrating its use in developing patient-specific cancer biomarkers.  Even if OxNano still required the tens of micrograms which mate pair systems require, why fool around with noisy mate pair libraries when you can get 50Kb reads to scaffold your sequence?  Of course, the utility for haplotyping would be dependent on high single-pass quality.

Even more broadly, look at all the inventive approaches that have blossomed to produce libraries for sequencing: shearing methods, size selection robots, library preparation robots and the like.  Now imagine a sequencing platform that sneers at all that, that simply slurps in some DNA and reads it.  A sequencer as complicated to operate as a pH meter.  That's the enticing vision of nanopore sequencing and next week we'll start to see whether that vision is a reality, or whether the first nanopore instruments are just yet another evolutionary step in the sequencing instrument market.


Kevin said...

I suspect that they are not at the point of single-base resolution yet, so reads will be very noisy. Long reads may be possible though, and I understand that considerable progress is being made both in terms of controlling the movement of the DNA through the nanopore and in the precision of the reads.

Anonymous said...

The first line of your post says it all: it's been 20 years. And now the pressure to release something, anything, is all coming from Illumina who need a riposte to LifeTech's Proton, not in capability but as a statement to show they're still in the game of pushing technology. I agree PacBio should be (more) worried.

GroovyGeek said...

OxNano will certainly use biological nanopores. You don't speculate on how much multiplexing one can get out of a single rack, and my guess is "not much". Making these at the right density and addressing them individually is certainly going to be difficult... and expensive. Sure, each rack is small and compact, but if it reads a few hundred cells then you end up in the same ballpark as all other existing tools.

Synthetic nanopores can change all that, but as a semiconductor gearhead I can assure you with absolute certainty that there is no current technology that can handle the size requirements with anything approaching commercial feasibility. This is particularly true about the membrane thickness required to achieve single-base resolution.

OxNano's paper at AGBT is reportedly on "sequencing a strand". I would expect exquisite read lengths and single-strand error rate better than PacBio, which I am guessing is limited by the readout. As far as data throughput is concerned my prediction is that this is going to be a total "meh" moment, far behind what even PacBio can do today, and not even on the same planet as Illumina.

Overall this will be yet another niche technology. If respectable it may be deadly to PacBio but probably for reasons that are yet to emerge.

Keith Robison said...

Anon: I forgot to put this in the original post, but the strand sequencing is unpartnered -- Illumina will benefit indirectly as an investor, but at the moment has no more stake than that.

Anmiv said...

Interesting article. I am currently doing my PhD on solid state nanopores and their application to single molecule detection and analysis. While I have not worked with DNA too much, I do know that such platforms also have other potential applications such as studying protein folding and even in ultra fine nanoparticle separation.


Anonymous said...

Hi all,

CTO of ONT here. Looking forward to next week, and I will clarify a lot of the speculation here. One thing I cannot let pass right now is the following comment:

"OxNano's paper at AGBT is reportedly on 'sequencing a strand'."

That is NOT what the title says. "Strand" sequencing is in fact the generic name given to passing strand(s) of DNA through a nanopore and reading them as contrasted with, say, "exo" sequencing where bases are chopped and dropped.


Anonymous said...



Accuracy at less than 80% is useless for any kind of sequencing that would be used for diagnostics regardless of readlength or coverage....WHY YOU ASK? Well, if you calculate the amount of computational power and time required to gain any useful information you would be running all of IBM's machines for 2 centuries before you got any meaningful information....ASK any bioinformatician to do the calculation for you...ONT is theoretically prohibited from sequencing with an accuracy better than Pacbio