Thursday, January 16, 2014

Illumina's New Lineup

Illumina made a brace of big hardware announcements at this week's J.P. Morgan conference, and Mick Watson has done a nice job of covering them.  I'll try to cover some different points that have occurred to me after letting the news ferment -- plus Illumina made yet another announcement tonight that scotched a portion of an earlier draft of this piece.

To review, the Illumina platform gets two new members and now looks like this:  

The MiSeq, now priced at $100K, is the entry level model.  The all-in price lines up very well with the Ion Torrent PGM, but outclasses it greatly on output and bases read per fragment.  MiSeq is great for amplicon sequencing, microbial sequencing and perhaps even dabbling with exome sequencing or RNA-Seq and fits on a benchtop.  Data can be pushed out to BaseSpace for analysis in the cloud (or there is apparently now a local version for those worried about HIPAA and such things).

Up the ladder is the NextSeq 500, priced at $250K, which can sequence multiple exomes or even one human genome in just over a day. The NextSeq uses a daring new 2-color encoding scheme, in which G bases have no dye. This enables simpler LED-based optics and reading the data in only two imaging runs rather than four. NextSeq also uses MiSeq-style reagent cartridges and a new ordered array flowcell, which apparently both packs more clusters in and is more tolerant of variation in the loading concentration of DNA.  So for the all-in price of an Ion Proton, you get a machine that completely blows it away on data throughput (at least until Ion gets through at least two more chip versions). [okay, I botched this in my original version -- ordered arrays only show up on HiSeq X; certainly a performance enhancement path going forward.  Many thanks to Shawn & SNPsaurus for pointing this out]

HiSeq 2500 comes next, around $690K or so (I think).  The workhorse for the typical high-throughput genomics shop, with no announcements at this time.

Now at the top comes the new HiSeq X, minimum order of 10 for an entry level of $10M.  Plus it apparently does a lot of analysis on the box, and is only configured for human full genomes -- but it can run 13K of them a year.  Ordered array flowcells apparently both pack more clusters in and are more tolerant of variation in the loading concentration of DNA, enabling 1.8 terabases of data per run -- and in only 3 days.  Of course, you'll need another $500K or so in library preparation gear, and to get the fabled $1K genome the beast must be fed with those 13K genomes each year. Throw in staffing, and it looks like $10.5M to get in the game and perhaps $25M/year in running costs.  They've already sold three, and aren't expecting huge volumes, but it will sequence "entire nations".

The two restrictions on the HiSeq X - only human genomes and 10 unit minimum order - are rankling some commentators.  Even after the upgrades of HiSeq 2500 to 1 terabase v4 chemistry, HiSeqs will deliver that much data in 6 days, vs. 50% more data per flowcell in a bit less than half the time on the HiSeq X.   

On the flip side, the spec sheets suggest the NextSeq 500 would actually be the superior instrument for rapid human genome sequencing versus the HiSeq 2500.  The new kid is claiming a 30X genome (or >90 gigabases) in 29 hours, whereas the current spec for Rapid Run on the 2500 is 40 hours.  However, there is trepidation in the bioinformatics community as to error rates given the unusual dye scheme, which only large datasets can allay (or which may lead to re-calibration of various models).
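To put the run figures quoted above on a common footing, here is a rough back-of-the-envelope sketch in Python. It only uses numbers stated in this post (1.8 terabases in 3 days, 1 terabase in 6 days, >90 gigabases in 29 hours); treating each figure as one complete run, and the helper name gigabases_per_day, are my assumptions rather than anything from a spec sheet.

```python
# Output and run time as quoted in the post.
# Assumption: each pair describes one complete run of the instrument.
runs = {
    "HiSeq X": (1800.0, 3 * 24.0),          # 1.8 terabases in 3 days
    "HiSeq 2500 (v4)": (1000.0, 6 * 24.0),  # 1 terabase in 6 days
    "NextSeq 500": (90.0, 29.0),            # >90 gigabases in 29 hours
}


def gigabases_per_day(gigabases, hours):
    """Average output per 24 hours for a single run (hypothetical helper)."""
    return gigabases / hours * 24.0


throughput = {
    name: gigabases_per_day(gigabases, hours)
    for name, (gigabases, hours) in runs.items()
}

# Print the instruments from fastest to slowest.
for name, rate in sorted(throughput.items(), key=lambda item: -item[1]):
    print(f"{name}: {rate:.0f} Gb/day")
```

On these numbers the HiSeq X comes out a bit over three and a half times the upgraded HiSeq 2500 per day, though this ignores turnaround between runs and says nothing about data quality.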

As a number of commentators have pointed out, this lineup is aimed at Ion Torrent (and probably QIAGEN) at the low end and Complete Genomics / BGI at the high end.  Illumina will continue its dominance.  Ion won't go away - for certain amplicon applications it is still end-to-end faster than Illumina, but it's hard to see why anyone would choose them for big exome or genome projects (except someone like Claritas that has tight ties to Ion).  Whoever sold the Saudis their Ion-driven genome project certainly earned a big bonus.

From what I can tell, though, Illumina wasn't talking at all about long reads -- no secret technology, no emphasis on Moleculo. It would seem like a real shame if the HiSeq X really can't do Moleculo due to informatics issues, but the emphasis on the $1K genome scan seems to have overshadowed touting a $2K (or less) fully phased genome.  This lack of emphasis on long reads would suggest that Illumina is not worried about PacBio or Oxford Nanopore stealing business; PacBio perhaps is mostly complementary (and OxNano a cipher for at least a few more weeks).

The ordered flowcells and 2-color encoding (with cheaper optics and reagents) suggest a possible next shoe or pair of shoes to drop.  If upstarts such as GnuBio (or perhaps QIAGEN) start looking serious, then one could imagine a baby NextSeq with capacity lower than a MiSeq targeting amplicon sequencing.  Or, perhaps a more direct competitor for the 20-barrelled QIAGEN box again based on the NextSeq technology which would also support non-synchronous running of multiple small flowcells.  

I also suspect that, given the FDA clearance on the MiSeq and Illumina applying for the HiSeq 2500, the NextSeq won't be far behind.  Illumina will be getting well practiced on negotiating the FDA's rules, and the NextSeq lops a bunch off the entry price, which should be attractive to smaller pathology labs.

Until this evening, Illumina hadn't said much about sample prep, so I had a whole prediction prepared around their acquisition of Advanced Liquid Logic last summer.  Too bad I didn't push it out: Illumina announced just that tonight.  The NeoPrep will create 16 sequencing libraries at a time in a box that looks similar in size to a MiSeq (or perhaps a bit bigger) -- indeed it would appear to have quite a bit larger footprint than NuGEN's Mondrian system (which is based on the same über-cool ALL digital microfluidics scheme). Will it fragment the DNA, or like Mondrian take fragmented DNA through to sequencer-ready libraries? Both DNA and RNA libraries are promised, and it looks like it may support both TruSeq and Nextera, but the announcement is bare on details.  If Illumina is going to place sequencers in already busy pathology labs, they will need to have this sort of simple, walkaway automation.  The sample throughput suggests targeting low-to-medium throughput labs; 16 bacterial samples on a MiSeq is underloading one substantially.

All in all, these were evolutionary announcements.  The price of human sequencing dropped, but for anything else there is no change for a while (though the 1 terabase upgrade for the HiSeq 2500 will presumably still occur this summer).  The cost of getting in at the low end dropped, and there is a new sample prep instrument that is barely unwrapped for viewing. Until something radical shows up from another vendor, Illumina will remain the dominant system for sequencing just about anything.


Shawn Baker said...

Keith, are you sure about the NextSeq 500 using patterned flow cells? I can't find the two mentioned together anywhere (and I've specifically heard from Illumina folks earlier that these flow cells would be restricted to human whole genomes).

Also, the number of genomes per year for a 10-pack of HiSeq X's is more like 18k (leading to a $62M sequencing bill over the four year amortization period!)

SNPsaurus said...

"NextSeq also uses MiSeq-style reagent cartridges and a new ordered array flowcell, which apparently both packs more clusters in and is more tolerant of variation in the loading concentration of DNA." -- I was under the impression that the NextSeq flow cell is not patterned.

Keith Robison said...

Thanks for catching my error!! Now edited to correct this.

AMac said...

For HiSeq X, $10.5m purchase price and $25m annual direct operating costs sounds plausible. Call the product cycle two years, so the cost for 26,000 genomes (13,000 * 2) would be ($10.5m + ($25m * 2)) or $60.5m.

Division gives $2,300 per raw genome, at 30x coverage I think.

With Shawn's figures (4-year amortization and 18k genomes/yr), it would work out to $860/genome. But I suspect his $62m sequencing bill is to cover consumables, and doesn't include other direct operating expenses.

Shawn said...

AMac, those aren't really 'my' figures - they're straight from Illumina. To get to the $1000 genome they assume ten instruments producing a total of 18k genomes/year over four years. 18k x 4 = 72k, or $72M. $10M for the instruments and $62M for the reagents (and a little bit of labor).
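The two sets of per-genome figures in the comments above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the function name cost_per_genome is hypothetical, and spreading the $62M evenly over the four years is an assumption made to fit the numbers into one formula.

```python
def cost_per_genome(capital, annual_running, years, genomes_per_year):
    """Total spend over the period divided by genomes produced (hypothetical helper)."""
    total_cost = capital + annual_running * years
    return total_cost / (genomes_per_year * years)


# AMac's scenario: $10.5M to get in, $25M/year running costs,
# a two-year product cycle and 13,000 genomes per year.
amac = cost_per_genome(10.5e6, 25e6, 2, 13_000)

# Figures attributed to Illumina: $10M of instruments plus $62M of reagents
# (and a little labor) over four years at 18,000 genomes per year.
# Assumption: the $62M is spread evenly across the four years.
illumina = cost_per_genome(10e6, 62e6 / 4, 4, 18_000)

# The $62M alone, without the instruments, over the same 72,000 genomes.
reagents_only = 62e6 / (18_000 * 4)

print(f"AMac scenario:      ${amac:,.0f} per genome")
print(f"Illumina scenario:  ${illumina:,.0f} per genome")
print(f"$62M alone:         ${reagents_only:,.0f} per genome")
```

The gap between the two scenarios comes almost entirely from the assumed running costs and genome counts, not from the purchase price.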

Anonymous said...

Do any of Illumina's new instruments (or Ion's, for that matter) offer automated analysis of 16S rRNA sequences on board or in the cloud? I'm thinking of something like mapping to Greengenes/SILVA.

Paul T Morrison said...

As always a nice wrap up. My only quibble would be "evolutionary" instead of "revolutionary". There are three things that are in these announcements that in toto blow up the entire NGS market and hand it all again to Illumina.
1) These new boxes and chemistry already have part numbers and are shipping today. NextSeq 500 delivered in three weeks!
2) Two dye.
3) ordered arrays.

These new boxes just handed out some serious buyer's remorse, as you point out: the Saudis, BGI with Complete Genomics, all the way down to labs who bought Protons and Ion Torrents. Two dye could be evolutionary; it may only cut run times in half, but the inherent simplicity of doing half the work with half the plumbing (the current MiSeq valve is still just too complicated) could really be huge if they keep error rates in hand. Just a bet, but I doubt dark Gs will be a problem (especially with ordered arrays).

And lastly, ordered arrays. When that shoe drops and moves down to the smaller instruments it could be mind blowing. Why? No overloading. Which means you top off the chip and you get the same number of reads day in and day out. Not really that simple but it makes the prep and the informatics that much easier. Illumina could keep this shoe for years (OK my bet 18 months) and if they saw a competitor on the horizon, plop, ordered arrays on MiSeq and NextSeq.

How did Illumina keep all of this so quiet? Not a peep out there. Even the Broad did not have beta testing of these boxes but the boxes are shipping today. Amazing.

Unknown said...

To "Anonymous" I know that Ion Torrent has a cloud based 16S analysis pipeline up and running. See

Paul T Morrison said...

To Anonymous, 16S on an Illumina MiSeq is now pretty cookbook, either in their BaseSpace cloud or offline. I think we are using CLCbio right now. For particulars just google MBCF and click on Zach and he will give you an update on the most recent way we do it.

Duarte said...

I see here that you guys are putting the reagent cost on top of the sequencing bill... but I thought the 1K USD cost already factored in $800 for consumables, $135 for machine depreciation and $65 for staff and overheads.

Can you guys explain to me where the 25 million extra is coming from?
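One way to line this per-genome breakdown up against the totals quoted earlier in the thread is to scale it by the genome count. The sketch below assumes the 18,000 genomes per year over four years from Shawn's comment; pairing those two sets of numbers is an assumption, not something stated in Illumina's announcement.

```python
# Per-genome breakdown of the $1K genome as given in the comment above.
per_genome = {
    "consumables": 800,
    "instrument depreciation": 135,
    "staff and overheads": 65,
}

# Assumption: 18,000 genomes/year over four years, as quoted earlier in the thread.
genomes = 18_000 * 4

totals = {item: dollars * genomes for item, dollars in per_genome.items()}
non_instrument = totals["consumables"] + totals["staff and overheads"]

for item, dollars in totals.items():
    print(f"{item}: ${dollars / 1e6:.2f}M")
print(f"consumables plus staff: ${non_instrument / 1e6:.2f}M")
```

Read this way, depreciation comes to roughly the $10M instrument price and the rest to roughly the $62M figure, so the two descriptions look like the same $1K genome counted two ways. The $25M/year is the post's own rough estimate of running costs and is not part of that breakdown.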

Anonymous said...

BTW: Any news about the NextSeq 500's potential for maximum read length improvement in the future? Will it ever go to 250-300 bp? Moleculo?
What about 2x-3x slower runs for longer MiSeq-style reads?

PS: I remember their promise in ~2011 of 2x400 bp for the MiSeq by 2013 ...

IMHO: Two-channel scanning is theoretically less resilient to dephasing estimation and is likely to be more sensitive to high-GC stuff (esp. near the end of the reads), as the signal drops.

Those things would be important for folks interested in de novo or looking into repetitive regions.
But as it is now, it could be quite interesting for the RNA-seq/reseq folks it seems to be aimed at.