About a week ago, Ion Torrent's President Greg Fergus and Head of Marketing Manesh Jain were kind enough to engage me in nearly an hour of discussion about the Ion Torrent platform. One agreement prior to our discussion was that I would withhold this piece until their announcement today of the 318 chip for the system (I also volunteered to let them see a draft of this in advance to ensure I had not misrepresented anything).
A key theme on their side is a certain degree of feeling that the wrong questions are being asked in the analysis of PGM versus MiSeq -- and an eagerness to shift the discussion. They wished to emphasize a number of points, and after the discussion I can see the validity of many of these.
First is the sample prep question. MiSeq is generally viewed as faster and requiring less hands-on time in this department, due to the bridge PCR approach versus Ion Torrent's emPCR (as well as, for genomic fragment libraries, the Nextera transposons vs. conventional shear-and-ligate library construction). A clear message from the Ion Torrent team is they are working intensely to cut down the current 6 hour time to more on the order of three hours. Some of this comes from simplifying certain steps (the emulsion breaking), and some from cutting down the number of PCR cycles. They also expressed plans to have an integrated prep device by year's end to cut down hands-on time. Furthermore, they pointed out that one scientist can process 6-8 samples off-line from the sequencer and in parallel as the sequencer is running; MiSeq can prep only one at a time and the bridge PCR is integrated with the sequencer, preventing off-line usage. The Ion Torrent team stated a goal of making it "irrelevant" to the user whether they were using bridge PCR (Illumina) or emPCR (PGM) in terms of hands-on time or difficulty.
Second, they wanted to contrast the advantage of working on a very new system ("just being born") versus one which may be approaching maturity. Given that they have just started trying to optimize the various parameters (read length, cycle time, accuracy, etc.), there is much to be done. In general, they are trying to tackle one major issue at a time, to avoid over-extending themselves. The current ~100 nucleotide read length requires a two hour run; with further adjustments they expect to have 200 basepair runs in the wild later this year but keeping the 2 hour time. This will be valuable as they push read lengths farther to perhaps 400 bases or beyond at 100 bases/hr.
The contrast they laid out, and I've pondered exploring before, is that Illumina may be reaching maturity. Given that the current paired end sequencing covers the entire insert, longer reads won't really add value unless the insert sizes are increased, which could have other ramifications (for example, longer inserts are reputed to give fatter, less intense clusters). Illumina has mentioned increasing the flowcell surface size, but that would seem to enable only a relatively modest increase and certainly not orders of magnitude. Packing clusters in tighter and/or ordering them would be another route, with significant challenges. I'm not claiming Illumina is done innovating on their platform, but it would seem that the slope of their improvement curve is unlikely to be very steep.
On the front of accuracy, Ion Torrent made the claim of 6X higher accuracy at the 100th base than Illumina. Overall, they state a raw accuracy of 99.5% (phred 23). On the homopolymer issue, they mentioned data presented at AGBT showing high consensus accuracy in reading runs of 8 to 9 bases. A more stringent test would be the detection of indels in such a run; I hope to run such an experiment in the not too distant future. GC bias is claimed to be minimal; this is ascribed to the use of native nucleotides (though some GC bias can enter through PCR). Something to explore: whether emPCR is less subject to this than cluster generation by bridge PCR.
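For anyone wanting to check the accuracy-to-phred conversion, here is a minimal sketch using the standard definition Q = -10 * log10(error rate). The function names are mine and purely illustrative; nothing here comes from Ion Torrent.

```python
import math


def accuracy_to_phred(accuracy: float) -> float:
    """Convert a per-base accuracy (between 0 and 1) to a Phred quality score."""
    error = 1.0 - accuracy
    if not 0.0 < error <= 1.0:
        raise ValueError("accuracy must be at least 0 and strictly below 1")
    return -10.0 * math.log10(error)


def phred_to_accuracy(q: float) -> float:
    """Convert a Phred quality score back to a per-base accuracy."""
    return 1.0 - 10.0 ** (-q / 10.0)


if __name__ == "__main__":
    # The quoted raw accuracy of 99.5% works out to roughly Q23.
    print(f"99.5% accuracy is about Q{accuracy_to_phred(0.995):.1f}")
```

Run on the quoted 99.5% figure, this lands at about Q23, which matches the number given above.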
I also got some better clarity on how one might build out a larger facility of Ion Torrents. You need one $16.5K Ion Server (Linux box, with lots of storage) for every three $50K PGMs. The current emulsion prep device is quite inexpensive ($1K) and isn't used very long per sample; one of these could handle a small army of PGMs. So the entry price for a PGM is about $68K, which requires saving a few more pennies than the oft-touted $50K price but is quite a bit under the $125K for MiSeq.
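The install arithmetic can be sketched as below, using the prices quoted above. The assumption that a single prep device covers the whole installation is mine (based on the "small army" remark), and the function and constant names are illustrative only.

```python
import math

# Prices in thousands of dollars, as quoted in the text.
PGM_PRICE_K = 50.0
ION_SERVER_PRICE_K = 16.5
PREP_DEVICE_PRICE_K = 1.0
PGMS_PER_SERVER = 3  # one Ion Server for every three PGMs


def install_cost_k(n_pgms: int, n_prep_devices: int = 1) -> float:
    """Total install cost in $K for n_pgms instruments.

    Assumes (my assumption) that one prep device serves the whole site
    unless n_prep_devices says otherwise.
    """
    if n_pgms < 1:
        raise ValueError("need at least one PGM")
    n_servers = math.ceil(n_pgms / PGMS_PER_SERVER)
    return (n_pgms * PGM_PRICE_K
            + n_servers * ION_SERVER_PRICE_K
            + n_prep_devices * PREP_DEVICE_PRICE_K)


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(f"{n} PGM(s): ${install_cost_k(n):.1f}K")
```

One PGM comes to $67.5K (the "about $68K" above) and two come to $117.5K; the fourth instrument is where a second server gets added.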
One other frustration I've had is finding important details for experiment design, in particular the sequences one would design into fusion primers. They apologized for this and stated it was due to still changing sequences on the beads; all of the necessary protocols would be in an online Ion Community in a matter of weeks. I also brought up the constellation of third-party library prep and enrichment tools customized for Illumina; they are working with many partners to bring those to the platform.
Of course, their big excitement was around the new 318 chip, which will generate about 1Gbase using 4-8 million reads of length 100-200. This would, of course, directly challenge the output of the MiSeq, but in a long work day from input DNA to data (with current sample prep) or perhaps by then just a long day. It's a bit challenging to make an apples to apples comparison, but to get 1Gb with 2x150 bases on MiSeq is projected at 27 hours.
Let's see the ramifications of this. Suppose a PGM with a 318 is put in a race against a MiSeq. The staff works a strict 8 hour day, the project is sequencing PCR amplicons with appropriate fusion primers, and we make the pessimistic assumption that Ion Torrent's sample prep is still 6 hours.
On Monday, both racers start. The Ion Torrent prep takes most of the day whereas the sample is loaded on the MiSeq and runs. However, Ion Torrent loads at the end of day for an unattended run into the night.
On Tuesday, the MiSeq finishes its first sample around lunchtime. A second sample is loaded onto MiSeq which will run until late afternoon on Day 3. In the eight hours in between the Ion Torrent entry bumps off 4 more samples plus loads another to run into the night.
On Wednesday, Ion Torrent finishes the batch before lunch with two morning runs. MiSeq accepts sample #3 in the late afternoon -- but it will finish after closing time on Thursday. Even if someone stays late for 5 minutes of loading, sample #4 would run late into the evening on Friday. If there is nobody coming in on the weekend or Friday nights, then Mon-Tue-Wed-Thu of the next week will be needed to finish the remaining samples.
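The MiSeq side of this race can be sketched as a toy scheduler. Everything in it is an assumption for illustration: a 27-hour run, a 9:00-17:00 Monday-to-Friday staffed window, an instrant reload whenever a run ends during staffed hours, and a first load at 9:00 on Monday. Times are whole hours counted from Monday 00:00. Setting `day_end=19` loosely models someone staying late to load.

```python
RUN_HOURS = 27   # projected MiSeq time for ~1Gb at 2x150 (from the text)
DAY_START = 9    # assumed start of the staffed day
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def next_staffed(t: int, day_end: int = 17) -> int:
    """Earliest hour >= t at which someone is present to load the instrument."""
    day, hour = divmod(t, 24)
    while True:
        if day % 7 < 5:  # Monday through Friday
            if hour < DAY_START:
                return day * 24 + DAY_START
            if hour < day_end:
                return day * 24 + hour
        day += 1   # otherwise try the start of the next day
        hour = 0


def miseq_finish_times(n_samples: int, start: int = 9, day_end: int = 17) -> list:
    """Finish time (hours since Monday 00:00) of each sequential run."""
    finishes = []
    free_at = start
    for _ in range(n_samples):
        load = next_staffed(free_at, day_end)
        free_at = load + RUN_HOURS
        finishes.append(free_at)
    return finishes


def label(t: int) -> str:
    """Render an hour count as week, weekday and clock time."""
    day, hour = divmod(t, 24)
    return f"wk{day // 7 + 1} {DAYS[day % 7]} {hour:02d}:00"


if __name__ == "__main__":
    for i, t in enumerate(miseq_finish_times(8), start=1):
        print(f"sample {i}: {label(t)}")
```

Under the strict 17:00 cutoff the third run ends an hour after closing on Thursday, so the fourth cannot be loaded until Friday morning. With the late-load variant the fourth run ends on Friday evening instead. Either way, eight samples spill well into a second week.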
Now, I've deliberately set up a particular comparison and due to instrument differences, they aren't completely comparable. MiSeq will have generated 2x150 runs rather than 1x100 -- though again, some of that sequence overlaps, which generates greater confidence but lower sampling. Perhaps some read length could be sacrificed to achieve 24 hour run times, fitting the very rigid employee schedule. For some applications, a race of Illumina in 3-hour 1x35 mode might be appropriate (but this time giving the data edge to Ion Torrent). On the flip side, an intensely-anxious graduate student could pull a PGM all-nighter and sprint through a bunch of samples. Also, the announced cost of a MiSeq is somewhat higher than a 2 PGM full install ($118K); with two PGMs, the Ion Torrent team would finish before tea time on Tuesday.
But it does illustrate what the Ion Torrent folks tried to highlight: PGM is potentially better suited to marching through samples in a given time, though with more hands on time. Also, if you were going for high-throughput then the same tech could presumably be prepping another batch of eight samples on Tuesday to keep the PGM humming. I can start picturing how a quite small lab with 2-3 techs and 1-3 PGMs could achieve a steady-state of about 4 Gbp/day -- with perhaps 8-16 Gbp possible with shorter cycle times and longer read lengths. HiSeq 2000 is spec-ed currently at 25Gb/day (clearly with much less hands on activity), giving some idea of the possible throughput. I've been scribbling scenarios right-and-left (none of which are quite apples-to-apples comparisons), which I'll try to surface later this week.
What would the 318 chip be good for? At 1Gb, it's probably just a little short for the typical 50Mb human exome kit -- though that could be either covered in two runs or by the read length going in the 200-300 range (yielding 2X-3X the data I've been penciling in). But, for a more focused custom design it could work well. Also, that would be a yeast-sized genome at 80+X. With around 10M reads, it's getting into the neighborhood needed for a lot of counting applications, such as ChIP-Seq or RNA-Seq. So it won't be ideal for every experiment, but will be a plausible option for many.
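The back-of-envelope numbers here are easy to reproduce. In this sketch the yeast genome size (about 12.1Mb for S. cerevisiae) is my own figure, not something from the discussion, and the coverage values are raw: they ignore off-target reads, duplicates and filtering, which is why a nominal 20X on an exome reads as "a little short". The names are illustrative.

```python
def run_yield_bases(n_reads: int, read_length: int) -> int:
    """Raw bases per run: number of reads times read length."""
    return n_reads * read_length


def fold_coverage(yield_bases: float, target_bases: float) -> float:
    """Raw fold coverage of a target, before any on-target or duplicate losses."""
    if target_bases <= 0:
        raise ValueError("target size must be positive")
    return yield_bases / target_bases


CHIP_YIELD_BASES = 1e9       # ~1Gbase per 318 run, as quoted
EXOME_TARGET_BASES = 50e6    # typical 50Mb exome kit, as quoted
YEAST_GENOME_BASES = 12.1e6  # assumption: approximate S. cerevisiae genome

if __name__ == "__main__":
    low = run_yield_bases(4_000_000, 100)
    high = run_yield_bases(8_000_000, 200)
    print(f"yield range: {low / 1e9:.1f} to {high / 1e9:.1f} Gb")
    print(f"exome: {fold_coverage(CHIP_YIELD_BASES, EXOME_TARGET_BASES):.0f}X raw")
    print(f"yeast: {fold_coverage(CHIP_YIELD_BASES, YEAST_GENOME_BASES):.0f}X raw")
```

The extremes of the quoted read count and read length bracket the 1Gbase figure, and the yeast case comes out a little over 80X.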
So, I guess I'm back to "ping" in my mental table tennis on these two platforms (But who knows? Perhaps someone from the competition will illuminate me as to why I should push my focus back to MiSeq). I think the MiSeq will be a valuable addition to the roster, particularly for shops very invested in Illumina technology, but it would appear that Ion Torrent is poised for enormous growth. MiSeq will also be favored for groups anticipating running smaller batches of sequences with far less hands on time. If Ion Torrent can launch a chip (presumably named 320) by end-of-year with 10Gb per run, then PGM really does start challenging HiSeq 2000 (though with much more labor), though HiSeq may go to about 1000Gb per 8 day run this year. But, a 100Gb chip in summer 2012 would actually approach one human genome per run (the promised 200-400bp reads by then would push it over). Ion Torrent also needs to drive hard to get all the subsidiary kits operating with PGM -- mate pairs, hybridization selection, etc.
Ion Torrent would definitely need a 320 chip to really go head-to-head with HiSeq 2000 on cost. But, with the 318 chip it will approach the weekly throughput (assuming 8-hour workdays) of the venerable GAIIx -- again, with GAIIx running unattended. I doubt many shops will plan to run their PGMs full tilt like that, but as a burst capability that is quite impressive. And for a service provider looking to provide inexpensive, rapid-turnaround sequencing, a fleet of PGMs could enable a very flexible toolset to handle projects.
In any case, the arrival of more PGMs in the field should start being reflected in more information on sites such as SEQAnswers. Plus, one of the winners of the European PGM giveaway is part of the blogging team at Pathogens: Genes and Genomics. And, I have been contacted by a service provider who is planning to launch a PGM service (atop their already successful second-gen service using another platform) about two months from now -- and I'm hoping to be very early in their queue (plus I think I've found a core lab that will run outside jobs). Looking forward to some regular real-world updates on the system.
[11:00 EST: corrected number of wells per chip]