Tuesday, January 05, 2016

When it comes to Nanopore, am I too GAGA?

In the comments to yesterday's piece on the reboot of Nabsys, a commenter used a truly colorful epithet in inquiring why I am so bullish on Oxford Nanopore, and in particular whether I am paid to be so. Whether I've lost objectivity on this (or any) subject is something I take seriously, and I think it is worth a look. To deal with the more serious allegation up front: I have never been paid by Oxford Nanopore, and if I ever do write on a company which has paid me in money or substantive gifts, or in which I hold stock, I would disclose that. My company has been a member of the MinION Access Program (MAP), and one could argue that Oxford has provided materials worth far in excess of the $1K entry fee. On the other hand, my co-workers and I have sunk quite a bit of time into trying to use bad flowcells and unstable kits, so there's hardly a windfall there to cloud my judgement.
Clearly I am enamored with Oxford and the MinION, but why?  That's a complex topic, but I'll outline the key reasons I can identify, none of which have to do with non-existent payments or favoritism from Oxford Nanopore.


Scientific dream incarnate

When I first rotated in George Church's lab in the summer of 1992, I naturally met each person and inquired as to their project. A post-doc named Rich told me a crazy idea of George's: that one might be able to sequence DNA using patch-clamp techniques as the DNA passed through a pore in a membrane. Later, Rich had a patch clamp setup -- it was on a vibration-isolating table enclosed in a Faraday cage. It was at least the size of a desk. Even with all that, Rich would say he could tell who was walking down the blind hallway adjacent to it, based on the cadence of the noise their footsteps created on his instrument.

Fast forward 22 years, and we have our MinION at Starbase, ready to go. No air table, no Faraday cage -- and I could stuff a shirt pocket with half a dozen of them. We loaded it with a library which my colleague created, and let it run. Then I looked at the data. Much of it was junk, and the yields of long reads that MinKNOW promised just weren't there -- but what was there was a single read which could be recognizably aligned to the entire 48.5 kilobases of lambda phage. Kaboom! That's one hell of an amazing read.

Since then, users of the MinION platform have generated a growing list of publications and preprints which have shown the system useful for a variety of tasks. Assembly of bacterial genomes either boosted with Illumina data or on its own? Hybrid assemblies of multiple bacteria (B. fragilis, Francisella, Salmonella, Pandoraea, and environmental isolates) plus yeast, and MinION-only of E. coli. Rapid sequencing for medical applications where short time-to-result is a game changer. Antibiotic resistance testing (and again). Field sequencing of a viral epidemic? Amplicon sequencing of viruses, viruses+bacterial rRNA, or to work out HLA haplotypes. cDNA sequencing (and again). Real-time identification of species in metagenomes, and real-time identification of antibiotic resistance markers. SVs in cancer. NIPT. Given what I heard at the community meeting, published detection of 5-methylcytosine should be in the literature in the very near future. In a very short time, a very large number of scientific uses for a small, inexpensive & fast long read sequencer have been demonstrated. It doesn't mean that MinION is ideal for each of these, but it does show that a wide swath of genomics can be tackled with these tiny sequencers.

It's also worth noting the huge edge for MinION in the educational area. I've seen first-hand that even a well-intentioned student can wreak havoc on an expensive scientific instrument (nobody pointed out to me how critical the correct screw length was for that vacuum flange!). Who's going to let a pack of high schoolers or first years play with a $100K sequencing instrument? But while $1K isn't chickenfeed, it isn't far in cost from the microscopes that are considered routine implements in teaching labs.


Replaying four years on a Starbase

Another driving source of excitement is replaying the last four years of my professional life. When the company first started, we outsourced bacterial growth, DNA prep and sequencing because obviously there was no choice; we were working out of offices on the toniest shopping street in Boston. We waited 6-8 weeks for our data to be turned around on a HiSeq, since that was the obvious platform. Admittedly, I had a bias against 454, which would have been more expensive for just slightly better results (I later caved and ran 454). Worse, my big jigsaw problem commenced, for the genomes we were sequencing have brutal repeats too large for short reads (and not easily solvable with mate pairs either, which were in these runs as well), and the biology we cared about was all in the repeats.

When we got some lab space, I lobbied to get a benchtop sequencer, but very quickly all our benchtops disappeared. We once arranged to demo a piece of equipment, and tried it out while it sat on the packing case the salesman had brought it in! Even worse, we cancelled one demo when we realized we had nowhere to put it. Besides, management really didn't want to sink $100K or more into a sequencer with limited throughput, when our sequencing needs were highly variable. Benchtop sequencers were now widely available. I made my first key discovery using Ion Torrent data (from my rehab hospital bed, using DNA I had prepped from bugs I had grown!). But the assemblies were hideous -- frameshift rich -- and forget about the repeat problem being solved. Then MiSeqs became available on the contract market, but the turnaround times prevented any sort of rapid experimental iteration, and read lengths were very limiting (2x125 on MiSeq in those days, if I remember correctly).

Then we moved to our current space, and the idea came up again -- or even buying something bigger. But again, the desire to avoid expensive and fast-depreciating capital purchases that would obtain only limited capacity, and also to preserve lab space, won out. PacBio became an option for solving repeats, demonstrating the beauty of long read sequencing.

Let's replay that scenario, assuming MinION had existed all along with exactly the capability it has now.

Nobody sequences in offices on Newbury St, do they?  Actually, more likely I would have just thrown together a lab in my kitchen (that's an option I contemplate regularly for personal play). Data back quickly, and long (but noisy) reads.  Probably would go polish that with Illumina data, but for getting the lay of the land a good MinION dataset would be fine.

Then we had lab operations. For really big sequencing trawls, we'd still outsource to get big iron. But for iterating quickly on various experiments, MinION would give us rapid feedback. Also note that at the prices I have to pay for PacBio, the cost difference isn't so huge, if there is one: 1Gb of data (about 100X coverage of a Streptomycete genome) will run me around $1K-$2K on each platform, once library prep is thrown in. Now, that is a bit of an unfair comparison, as the PacBio figure includes someone else preparing the libraries, while doing it internally obviously uses internal resources; but conversely we would gain complete scheduling control. In terms of space, all we'd need is a small computer sharing a monitor with other instrumentation computers (via a KVM switch).
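For anyone who wants to redo that back-of-the-envelope comparison with their own numbers, here is a minimal sketch in Python. The 10 Mb genome size is my assumption for illustration (it is the size at which 1Gb works out to the 100X quoted above); the function names are hypothetical, and the $1K-$2K range is simply the rough figure from the paragraph above, not a quote from any vendor.

```python
def fold_coverage(total_bases, genome_size):
    """Average depth of coverage: sequenced bases divided by genome length."""
    return total_bases / genome_size


def cost_per_fold(run_cost, total_bases, genome_size):
    """Dollars spent per 1X of coverage for a given run."""
    return run_cost / fold_coverage(total_bases, genome_size)


# Assumption: a 10 Mb genome, chosen so that 1 Gb gives about 100X.
GENOME_SIZE = 10_000_000
YIELD_BASES = 1_000_000_000  # 1 Gb of data

print(f"coverage: {fold_coverage(YIELD_BASES, GENOME_SIZE):.0f}X")
for run_cost in (1000, 2000):  # the rough $1K-$2K range, library prep included
    per_x = cost_per_fold(run_cost, YIELD_BASES, GENOME_SIZE)
    print(f"${run_cost} run -> ${per_x:.0f} per 1X of coverage")
```

Swap in your own genome size, yield and quoted price to see how the two platforms compare for your organism; a smaller genome simply needs less data for the same depth.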

Furthermore, if you need more capacity you can easily buy it: at $1K a pop, the MinION itself is capital efficient. Indeed, when Oxford made their original AGBT splash in 2012 (which I wish I had attended, because then I wouldn't have gone skiing and would have made that discovery from my desk), I sent a snarky email to upper management along the lines of: "You know how you won't buy me a sequencer? Here's a sequencer I don't have to ask permission to buy!" That's not a trivial consideration; empowering scientific staff to just go forward is huge, but so is not breaking the bank.


If you're in this business, you must pay attention to Could


I'm often making projections or predictions based on what companies such as Oxford have promised, which is well beyond what they have delivered. Now, one consideration is that Oxford has delivered on some previous promises: there were many caustic remarks on social media claiming that their system was fundamentally impossible. But conversely, Oxford announced optimistic timelines for all the gee whiz stuff they announced at London Calling, and little has advanced on the platform (other than a small pore speed boost called "not-so-slow-mode"). Reportedly fast mode is now undergoing alpha test with at least one even more blatant Nanopore fanboy.

But as someone who watches the industry for fun, and as someone who also needs to chart out genomics strategy at a company, the possible future must be considered seriously.  One can't bank on it, but nor can one ignore it. If I were at any other genomics company, and indeed for my current professional role, it would be a grave dereliction of duty to simply wave away Oxford's announcements as smoke-and-mirrors.  

Just a reminder, here is what could emerge from Oxford. Direct RNA sequencing, which could drastically lower the price of RNA-Seq while it could also greatly raise the accuracy by eliminating reverse transcription artifacts. Fast mode could be upon us soon, enabling higher yields and faster time-to-result for speed-sensitive applications, if the data is any good. An increase in the number of nanopores per chip also means higher yields, and could arrive this year. Crumpet chips, if they show up, could radically reshape the sequencing pricing model, perhaps enabling zero upfront cost to sequencing, which could mean huge inroads into educational and DIY-Bio markets. A 20 minute transposon 1D prep, if it is launched and if the data quality is acceptable, would reduce the labor cost for library prep by a huge margin. Library preps on VolTRAX, if this can really work reliably in the field, could enable clumsy lab bunglers like me to routinely prep their own libraries, or simply lower the labor cost of a library to nearly nothing. And if fast mode, denser chips and crumpet chips all come together, and if Oxford has solved the on-board base calling issue, then maybe PromethION will be a terabase-per-day terror to IT groups.

Note all the coulds, perhaps, ifs, maybes and other qualifiers.  None of these might be realized -- but also note that any one of these items on Oxford's ambitious agenda could mean MinION goes from a niche product for a bunch of crazed fans to a serious competitor to multiple existing platforms. 

It's also worth emphasizing that Oxford isn't some company making projections without having demonstrated anything.  Their platform is in the wild and generating papers.  It has a very specific edge over any other platform in the area of rapid time-to-result (obviously, so long as the amount of data fits in the bounds of a MinION).  It is uniquely portable, with the ability for use in very minimal laboratories demonstrated by multiple groups. The ability to adjust what is sequenced on the fly, with read-until, has been demonstrated.  It is also capable of spectacularly long reads, approaching 200Kb.  Now, as I commented yesterday, those are rare beasts, but the fact that they have been sighted is significant.  As PacBio has shown, a key to long read sequencing is mastering library preparation, and it would be gravely foolish to assume that nanopore preps have reached their apogee.  And, as I noted yesterday, any of the throughput improvements (more nanopores per chip, fast mode, PromethION) should enable catching more unicorn reads, since the same fraction of a bigger number is obviously larger.

So, in a nutshell, that's why I write perhaps excessively positive speculative pieces on Oxford, or work them into other pieces. Because I think the technology is truly astounding, and I have a good basis for that. Because I can see how the technology could have transformed what I have invested much blood, sweat and tears into these last few years. And because I think that it is unlikely for Oxford to flub all of their grand ambitions, and succeeding with any one of them is going to have broad effects on the genomics field. I appreciate constructive criticism on the tone of my pieces and will strive to label my conflicts-of-interest, but I also don't apologize for being excited about something truly exciting.

9 comments:

Keith Robison said...

I wasn't strictly trying to review all of the MinION experimental publications, but it is embarrassing that pure haste led me to ignore Charles Chiu's publication on viral diagnosis. Also, I left the Nitrospira paper out of the bacterial list, as I'm unclear on how MinION was used here (alas, don't have a current Nature subscription).

Rafal Marszalek said...

Also: http://www.ncbi.nlm.nih.gov/pubmed/26025440 (disclaimer: I am one of GB's editors). Don't need subscription for that one ;)

P.S. I do appreciate it was not a comprehensive rundown of nanopore papers.

Yaniv Erlich said...

Hi Keith,
Thanks for the post.
Readers interested in ONT and education might want to look at our new preprint on incorporating ONT sequencers in the classroom, which includes education material, protocols, and lessons learned.

Paper:
http://biorxiv.org/content/early/2015/12/24/035303

Dale Yuzuki said...

Hi Keith, many thanks for this comprehensive roundup of where ONT is (and valuable context from your personal experience), and don't let the random ad hominem comment bother you too much. Your take on the existing real-world status in this fast-changing market is valuable - your insight into the various use-cases / applications / emerging applications is proof enough.

Just a loyal reader,
Dale

Geneticist from the East said...

Let me be the bad guy that pours a bucket of cold water over your head.

At the current stage, while nanopore has the edge in real-time and on-field applications, PacBio has a big lead now with their Sequel in almost all other applications.

Remember also that MinION is yet to be fully commercialized. The $1000/box is not the true price at this moment. Maybe $2000 is more likely the true price.

Having said all that, I have to say that I am also very excited about the ONT MinION. I think that's what a sequencer should look like in the near future. MinION will be highly disruptive if its error rate can be brought down to PacBio levels. But if we take Ion Torrent as an example, that advancement can be a big if that might take forever to materialize.

Anyway, it is fun to be in this field to see how things unfold.

Duarte Molha said...

btw ... here is a 2 day paper on NIPT and nanopore to add to your publication list :)

http://www.genetics.org/content/202/1/37

Lakshmanan Iyer said...

Thanks for the post. I really look forward to your post as it is quite comprehensive and insightful!
Another loyal reader
Lax

Anonymous said...

@ Duarte:
negative, that paper is not on NIPT, see final words of the introduction: "amniocentesis and miscarriage samples". The 40,000-80,000 reads they used is too low for a robust and trustworthy NIPT IMHO.

Keith did link the ClinChem letter from Dennis Lo's group about NIPT and nanopore. But they also didn't really show much, except that X and Y was about what they would expect in pregnant women's serums with male or female fetuses. They had some 30,000-50,000 reads, again not enough for a good MPS-NIPT, which they didn't really even attempt....
In NIPT, you have on average a 1:10 dilution of fetal to maternal DNA in the cfDNA of the plasma. Normally, chromosome 21 will give you roughly 1,5% of reads. In case of fetal trisomy 21, you would get around 1,575% 21-reads in the maternal plasma (assuming again 10% fetal DNA). With 40,000 reads, that would be 600 reads vs. 630 reads. And that's a massive overestimation, since Lo and colleagues only had 16,9% unique mapped reads to use. Even considering the amount of repetitive DNA in the genome, that does seem low and might be in part due to the high error rate of nanopore seq. And sure enough, the four (pooled) samples they analysed already showed a 21-percentage variation way above a T21 diagnostic level. In statistics, it is no problem to detect small differences, but the smaller the difference the larger the sample size needs to be to detect the difference with high confidence.......

best wishes
Lars
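The arithmetic in Lars's comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1.5% chromosome 21 share, 10% fetal fraction, 40,000 reads and 16.9% usable reads are the figures from the comment, while treating the read count as Poisson (so its standard deviation is the square root of the count) is my simplification, and the function names are hypothetical.

```python
import math


def chr21_counts(total_reads, chr21_fraction=0.015, fetal_fraction=0.10):
    """Expected chr21 read counts for a euploid and a trisomy 21 pregnancy.

    In trisomy 21 the fetal share of chr21 DNA rises by half (3 copies
    instead of 2), so the overall chr21 share rises by fetal_fraction * 0.5.
    """
    euploid = total_reads * chr21_fraction
    trisomy = euploid * (1 + fetal_fraction * 0.5)
    return euploid, trisomy


def separation_in_sd(euploid, trisomy):
    """Expected difference in units of Poisson standard deviations.

    Assumption: counting noise only, sd = sqrt(expected euploid count).
    """
    return (trisomy - euploid) / math.sqrt(euploid)


# All 40,000 reads usable: about 600 vs. 630 chr21 reads.
eu, t21 = chr21_counts(40_000)
print(f"{eu:.0f} vs {t21:.0f} reads, {separation_in_sd(eu, t21):.2f} sd apart")

# Only 16.9% of reads uniquely mapped, as in the comment.
eu, t21 = chr21_counts(40_000 * 0.169)
print(f"{eu:.1f} vs {t21:.1f} reads, {separation_in_sd(eu, t21):.2f} sd apart")
```

Under these assumptions the expected difference sits at roughly one standard deviation of counting noise or less, which is the point of the comment: a difference that small needs far more reads to be called with confidence.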

Nik said...

@Lars - The Lo paper on NIPT was just an early feasibility trial, with nothing to suggest that NIPT via the MPSS method could be completed as of today. It would be interesting to see what other teams could do in terms of NIPT with the technology. It's a long way from a completed product, but if the goal was to reach NIPT Point of Care, as possibly suggested by Lo, this could be a potential game changer.

I really like the summary you made on his paper, it definitely helps when you break the numbers. Thank you.