Monday, January 09, 2017

Illumina Unveils HiSeq Successor NovaSeq

At today's J.P. Morgan Healthcare Conference, Illumina made a number of small announcements -- some new partnerships, Firefly on track for launch later this year, launch of the single cell workflow partnered with Bio-Rad.  Then CEO Francis deSouza dropped the big news: a new high-end sequencer architecture to ultimately replace all of the HiSeq instruments.  It sounds like an interesting evolution of the Illumina product line, but unfortunately too many headlines and tweets have focused on a distant goal of $100 human genomes.  Worse, not only did some commentators misconstrue the announcement as delivering on $100 genomes, but some also touted a sequencing speed of one hour for a genome, which isn't remotely true.

NovaSeq is currently a family of two instruments, the 5000 and 6000, so even the naming scheme marks these as successors to HiSeq.  Four different microfabricated (i.e. patterned) flowcells have been announced, S1, S2, S3 and S4, though the S3 and S4 are only usable with the 6000 box.  S1 and S2 have two lanes whereas S3 and S4 have four lanes.  5000s are field upgradable to 6000s.  Both instruments can run one or two flowcells, and the two flowcells can be of different types.  Much higher clustering densities are achieved than with prior patterned flowcells, though it will take active experimentation to see if problems with the prior exclusion amplification chemistry are better or worse (or unchanged) with these flowcells.  S2 will be available at product launch, with the other flowcells appearing during the rest of 2017.

According to Illumina's spec sheet, the following specs apply (Illumina's slide on the flowcells sported even grander numbers by a factor of 2, perhaps suggesting near-term performance improvements). 
  • S1   <=1.6B passing reads, <=167Gb in 2x50, <=333Gb in 2x100, <=500Gb in 2x150
  • S2   2.8-3.3B passing reads, 280-333Gb in 2x50, 560-667Gb in 2x100, 850-1000Gb in 2x150
  • S3   <=6.8B passing reads, <=2000Gb in 2x150
  • S4   <=10B passing reads, <=3000Gb in 2x150
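
As a rough sanity check on those numbers, the yields follow from read count times read length. The sketch below assumes that "passing reads" counts clusters, so each paired-end cluster contributes two reads' worth of bases; that is my reading of the spec sheet rather than something Illumina states, but the results land close to the quoted figures. The function and constant names are mine.

```python
# Upper bounds on passing reads (in billions) per flowcell, from the list above.
MAX_PASSING_READS_BILLIONS = {"S1": 1.6, "S2": 3.3, "S3": 6.8, "S4": 10.0}


def yield_gb(reads_billions: float, read_length: int, paired: bool = True) -> float:
    """Return output in gigabases.

    Assumption: `reads_billions` counts clusters, so a paired-end run
    produces 2 * read_length bases per cluster.
    """
    bases_per_cluster = read_length * (2 if paired else 1)
    # (billions of clusters) * (bases per cluster) is already in gigabases.
    return reads_billions * bases_per_cluster


for flowcell, reads in MAX_PASSING_READS_BILLIONS.items():
    print(f"{flowcell}: 2x150 gives about {yield_gb(reads, 150):.0f} Gb")
```

Under that assumption S4 comes out at 3000 Gb, exactly the spec-sheet figure, while S1, S2 and S3 land within a few percent of theirs, which suggests the spec sheet rounds either the read counts or the yields.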
Reagents are supplied via an RFID-tagged cartridge, much like the NextSeq (and MiSeq?).  The silly restrictions on sample types or other experiment parameters are removed; if you can build a compatible library this instrument will sequence it.  The instrument is also described as automating flowcell loading and clustering.

The chemistry is apparently two-color, similar to MiniSeq and NextSeq.  A concern this raises is that quality scores for the two-color chemistries are less well understood and generally less reliable.  Scanning is apparently at the diffraction limit, with completely redesigned optics.

Want a NovaSeq 6000?  Get ready to shell out $985K, but you can save $135K by settling for a 5000 at only $850K. That price hasn't deterred early customers, with 49 boxes sold already, apparently more than Illumina can build this quarter.  Unsurprisingly, gigantic sequencing centers have been named as early customers -- Human Longevity Inc, Novogene, Chan Zuckerberg Biohub, Regeneron, Baylor and The Broad.  Illumina is offering partial trade-in credit for Xs or HiSeqs purchased in the last two quarters, and complete credit on undelivered instruments.  You'll also need some space; this appears to be the size of a large refrigerator.  6000s are supposed to start shipping in March, with 5000s available only after "mid-year".

So how does this fit competitively?  Obviously Illumina is not a strong believer in the existence of a sequencer glut, evidence for which has been detailed by James Hadfield and Dan Koboldt.  In Dan's case, the apparent glut drove a change in institutions.  Releasing the instruments to work on any library could help enable better utilization, but this could hold back further purchases.  Also, many HiSeqs may be coming to the end of their design and accounting life, so there may be core groups which would slot these new instruments in as replacements.  BioIT World places the number of installed HiSeqs at approximately 1900 instruments at 800 customers (itself interesting data).

The BioIT World piece quotes deSouza saying that NovaSeq would be 20% less expensive per gigabase than HiSeq X, with 45% cost savings vs. HiSeq 4000 and 50% vs. HiSeq 2500.  It is unclear how this translates into a final consumable cost for a 30X human genome; is it now $800?  Sequencing labs will also benefit from simpler operation, which could reduce labor costs, but the huge throughput increase apparently doesn't translate into greatly reduced cost per basepair.  Most of the excitement from actual sequencing jocks has been around the idea of HiSeq X-like running costs for projects that were previously verboten on that platform: microbial genomes, RNA-Seq, exomes and so forth.
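
For what it's worth, the arithmetic behind that $800 guess, and what the three quoted savings imply about the older instruments relative to each other, can be sketched as below. The $1000 HiSeq X baseline is an assumption on my part (Illumina's widely advertised figure), not something from the announcement, and the function names are made up for illustration.

```python
def discounted_cost(baseline_cost: float, saving_fraction: float) -> float:
    """Cost after applying a fractional saving to a baseline cost."""
    return baseline_cost * (1.0 - saving_fraction)


def implied_relative_cost(saving_vs_reference: float, saving_vs_other: float) -> float:
    """Per-gigabase cost of the 'other' instrument in units of the reference instrument.

    If NovaSeq = (1 - a) * reference and NovaSeq = (1 - b) * other,
    then other = (1 - a) / (1 - b) * reference.
    """
    return (1.0 - saving_vs_reference) / (1.0 - saving_vs_other)


ASSUMED_HISEQ_X_GENOME_COST = 1000.0  # assumption, not from the announcement

print(discounted_cost(ASSUMED_HISEQ_X_GENOME_COST, 0.20))  # NovaSeq genome under that assumption
print(implied_relative_cost(0.20, 0.45))  # HiSeq 4000 per-Gb cost, in HiSeq X units
print(implied_relative_cost(0.20, 0.50))  # HiSeq 2500 per-Gb cost, in HiSeq X units
```

Taken at face value, the quoted savings imply that the HiSeq 4000 costs roughly 1.45 times as much per gigabase as the HiSeq X, and the HiSeq 2500 about 1.6 times as much.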

It does not appear that NovaSeq has a rapid mode, so it won't replace HiSeqs in ultra-rapid applications.  Unfortunately, CEO deSouza's comment that the NovaSeq can run at a genome per hour, meaning 40 genomes in 40 hours, has been taken completely literally by some tweeters and at least one stock site.  I tried to resist, but my Storify has a small Hall of Shame at the bottom for these.  CEO deSouza touted an ultimate target of $100 genomes, which was also mindlessly trimmed of context and earned HoS rights.  The path to $100 isn't clear in the tweets; presumably it will be a combination of increased read length, even higher density and perhaps expected cost reductions in flowcell manufacture over time.
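
The "genome per hour" confusion is the usual mix-up of throughput with turnaround time. A toy sketch, taking the 40-genomes-in-40-hours remark at face value and assuming runs are executed back to back on a single instrument (ignoring library prep and analysis), makes the distinction concrete. The names here are mine.

```python
import math

RUN_HOURS = 40        # from deSouza's "40 genomes in 40 hours" remark
GENOMES_PER_RUN = 40  # same remark


def hours_until_done(n_genomes: int) -> int:
    """Sequencing time for n genomes when only whole runs can be performed."""
    runs_needed = math.ceil(n_genomes / GENOMES_PER_RUN)
    return runs_needed * RUN_HOURS


for n in (1, 40, 41):
    print(f"{n} genome(s): {hours_until_done(n)} hours")
```

A single genome still takes the full 40 hours, and so does a batch of 40; the one-per-hour figure only describes the average over a full run.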

NovaSeq clearly represents Illumina doubling down on the big sequencing center model which has made them very successful.  They've apparently moved away from the minimum instrument order sizes that characterized the HiSeq X series, but this is still selling very big boxes to a limited set of sequencing centers and core labs.  Firefly is reported to be on track to launch at end-of-year, providing coverage of the low end of the market.  Perhaps Illumina is moving into a strategy of alternating emphasis, with low-end machines (Firefly, MiniSeq, MiSeq and NextSeq) the focus one year and the big performers the next.  If so, then next year we might see a major overhaul or replacement of MiSeq, which is the oldest instrument design standing.

But NovaSeq also shows Illumina not making a move into long read sequencing at this time; Illumina will continue to sell the idea that resequencing genomes with relatively low per base cost is superior to more expensive and potentially error-rich long read genomes de novo assembled.  I think this ups the odds that Illumina makes a stronger relationship (major partnership or acquisition) with one of the linked read companies such as 10X Genomics or iGenomX in the relatively near term.  Otherwise, Illumina isn't providing a good solution for haplotyping or structural variant identification.

But as Matthew Herper noted in his piece on NovaSeq, long reads are starting to make inroads.  PacBio today announced a new Sequel chemistry which they tout as increasing yields and reducing per base costs, though they did not provide a cost estimate for a human genome sequence.  Oxford Nanopore is moving quickly towards human genome sequencing; yesterday an announcement was made of the first de novo assembly of a human genome using only nanopore data.  As has been the rule with nanopore data to date, accuracy was limited.  Another group has performed an initial structural variant calling on a different human nanopore dataset, with promising results.  Illumina has a head start on market adoption and cost, but both of those edges could erode rapidly.

Illumina is potentially most vulnerable in the long term to erosion from the pricing model.  Suppose a genome on NovaSeq costs only $800.  But you can't do just one at that price; first you must buy the machine and then run 40 genomes in the run.  To give an idea of the impact of this capital cost, I ran a quick estimate against the ONT PromethION (assuming my skepticism on that box is proven wrong).  Assuming $4000 per genome on that box, which doesn't seem absurd, NovaSeq doesn't win the race until about 320 genomes have been sequenced; that stretches out to 800 genomes if ONT could get the cost down to $2000 per genome (all in, including the not insignificant compute cost).  For small biotechs wanting human sequencing completely under their own control but loath to spend capital, a working PromethION system could easily be cost effective.  Plus NovaSeq's economics depend on running large batches.  That works well at big, centralized centers, but isn't ideal for small labs or many healthcare settings.  NovaSeq risks being a mainframe in a world moving to tablets, holding onto but also restricted to a small cadre of dedicated users.
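
The break-even estimate above can be reproduced roughly with the sketch below. It assumes the $985K list price of the 6000 is the only capital cost, treats the competing platform as having no capital cost at all, and optionally rounds up to whole 40-genome runs; all three are simplifications of mine, and the function name is hypothetical. Plug in a different capital figure (service contracts, compute, the 5000's lower price) and the crossover shifts accordingly.

```python
import math


def break_even_genomes(capital_cost: float,
                       cost_per_genome: float,
                       rival_cost_per_genome: float,
                       batch_size: int = 1) -> int:
    """Smallest genome count at which capital plus per-genome cost
    no longer exceeds the rival's total, rounded up to whole batches."""
    saving_per_genome = rival_cost_per_genome - cost_per_genome
    if saving_per_genome <= 0:
        raise ValueError("no break-even: rival is at least as cheap per genome")
    genomes = capital_cost / saving_per_genome
    return math.ceil(genomes / batch_size) * batch_size


NOVASEQ_6000_PRICE = 985_000  # list price quoted earlier in the post
NOVASEQ_PER_GENOME = 800      # the supposition made above

for rival in (4000, 2000):
    exact = break_even_genomes(NOVASEQ_6000_PRICE, NOVASEQ_PER_GENOME, rival)
    batched = break_even_genomes(NOVASEQ_6000_PRICE, NOVASEQ_PER_GENOME, rival, batch_size=40)
    print(f"rival at ${rival}/genome: {exact} genomes ({batched} in whole 40-genome runs)")
```

With these inputs the crossover lands at 308 genomes (320 in whole runs) against a $4000 rival and 821 genomes (840 in whole runs) against a $2000 rival, in the same ballpark as the quick estimate in the text.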

All of this should make 2017 very, very interesting.  How frequently will Illumina convert old HiSeqs into new NovaSeqs, versus how often will prior customers decide to either outsource or hold out until PromethION gets fully on line and its quality improves?  Will Firefly really arrive on time and will it protect the low end of the market from the likes of MinION, QIAGEN GeneReader and Ion Torrent?  Will significant new entrants appear -- Roche/Genia escaping their patent suit or one of the stealth companies decloaking?  Even if 2017 isn't the year Illumina's long-running dominance of the sequencing market is challenged, it could well be the year the challenger(s) prime for their assault.

A big thanks to everyone who tweeted, particularly Eric Olivares (@SEQanswers) who had the highest number of technically detailed tweets.  I've collected the set I used in researching this in a Storify (which will get a link to this post once I publish it; no actual recursion though!).  BioIT World's article had some useful details I couldn't find elsewhere.

[2017-01-11 edit: fixed typo in Novagene per request in comments]


Anonymous said...

So the $100 genome will be on the NovaSeq 10000, in between 3 and 10 years. Why even talk about that? Only Illumina can be blamed for that being taken out of context. People are learning so many good lessons from Donald Trump. And can we resist talking about the Promethion until there's actual data? ONT has been promising magical unicorns for half a decade now. One has yet to materialize, and the ones that have are horses with prosthetic horns strapped on.

Anonymous said...

don't get it...they want labs to spend a million bucks but are saying firefly is not far away?

Rick said...

Firefly will be aimed towards a different market. If the Firefly is anything like the MiSeq, the run costs will be higher than the NovaSeq's and thus not that interesting for the big sequencing centers. The sequencing centers that can afford the NovaSeq and have enough sequencing coming through to justify the machine are now satisfied. Those centers and labs that cannot justify the high cost and high throughput of the NovaSeq now have a bone to chew on that will keep them from jumping ship to a different platform.

BTW: My back-of-the-envelope calculation is that if 100 NovaSeqs are out by the end of the year then the world as a whole will be able to sequence at least half a million humans (at 10x coverage) per year. Not a bad jump from 14 years ago and the first human genome.

Anonymous said...

"ONT has been promising magical unicorns for half a decade now. One has yet to materialize, and the ones that have are horses with prosthetic horns strapped on."

weird that people use blogs about new ILMN products as a premise to bash Oxford.

Anonymous said...

I think firefly is aimed at the "sample to answer" labs that run lots of "gene panel tests". Different mentality and economics.

Matt said...

Hi Keith, I like your idea of a focus on the low-end this coming year. What I'd really like to see is an update to the MiSeq or MiniSeq that brings 2x higher cluster density and enables more consistent loading/passing filter. 50M (100M even better) clusters would be ideal for most of our applications. Could this be possible with a patterned flow cell?

Anonymous said...

When you mention the first companies who purchase Novaseq, the company name should be "Novogene" instead of "Novagene". Do you mind correcting?


Chris said...

When will NovaSeq machines be released?
And is the Ion S5 released already?
Thank you very much! I'm writing a thesis about it.

Anonymous said...

Hi, I always find your blog posts very informative. I know that many people have questions about release dates and pricing for the NovaSeq. I got this info from my Illumina sales rep the other day and figured I'd share, as I couldn't find it anywhere on the web last week.

S1 - Q3 2017 - 300 cycle $9,000, 200 cycle $7,800, 100 cycle $5,700
S2 - available today - 300 cycle $15,750, 200 cycle $13,650, 100 cycle $9,975
S3 - 2018 - only supports 300 cycle, $21,600
S4 - Q3 2017 - only supports 300 cycle, $30,780