It looks like 2022 might be an exciting year for the short read genomics market, with new players taking on Illumina. J.P. Morgan will be virtual next week, so perhaps some of the players will make some announcements. Here are some thoughts on the situation as it stands now in a space where many have failed before -- QIAGEN, ThermoFisher (SOLiD) and Roche (454) -- and some have bailed out before even entering -- Agilent.
Here's a rundown of the known players who are in or might enter the market. Of course, one can't discount the possibility that additional companies have somehow stayed under the radar and might jump in, or that I've misestimated how close a given startup is to market.
Before getting into the companies, a quick rundown of my potential conflicts-of-interest: the CEO of PacBio is on the Board at my employer (though we've never met or talked) and they once gave me an exhibit pass to ASHG when it was in Boston; Illumina owns a stake in my employer; Oxford Nanopore once bought me drinks; Omniome once treated me to a weekend in San Diego (though I think everyone who hosted me long ago left); and you can always entertain the notion that non-disclosure agreements I've signed (or am bound to via my employer) start with the Fight Club Rule about non-disclosure agreements and so I can't disclose a COI.
Illumina is the king of the field and the one to beat, with offerings across the capacity spectrum ranging from iSeq all the way to NovaSeq. I won't say much in this section, but certainly nearly everything else in this market revolves around them.
I always feel a little guilty when I discover I've just failed to even mention Ion Torrent in this space, but on the other hand they seem to have long been satisfied to be a bit player and haven't challenged Illumina in any serious way. If they had a presence at 2020 AGBT I don't remember it -- though given the viral tsunami that washed over us just after that I have a lot of amnesia about 2020 AGBT. Still, as a movie character said recently, "If you expect disappointment, then you can never really be disappointed." - so that's how I'll avoid being disappointed by Ion Torrent.
Genapsys changed to a professional CEO last year, but not much other news. Right now mired in the iSeq end of the market; without actually delivering the higher density flowcells they keep promising they will stay there to fight for scraps. Which they don't appear to do effectively. The capacity and low capital cost of Genapsys would seem just about right for many small labs wanting to do SARS-CoV-2 sequencing, yet there are no sequences in GISAID from them.
Now that doesn't mean someone isn't using Genapsys for that purpose, but a failure to get some sequences into GISAID and trumpet them is in my opinion a stunningly missed marketing opportunity.
In researching this I discovered that Genapsys in December announced they are setting up manufacturing and R&D in Colorado. That's great news - spreading biotech outside the highly concentrated Boston and Bay Area centers is just good for the industry and good for the country.
As I noted in my summary of the Nanopore Community Meeting (NCM), Oxford Nanopore is again trying to sell themselves as an all-read platform capable of both short and long reads. Nanopore applications not requiring long read lengths have been around forever -- the first NCM in 2014 had a talk on using nanopore in reproductive medicine for rapid aneuploidy detection and that just relied on successfully mapping reads to entire chromosomes.
ONT has instruments all along the yield axis, from Flongle for tightly focused projects up to PromethION for whole human genomes and megametagenomes.
I'll refer to Omniome as such to distinguish it from Pacific Biosciences' long read technology, at least unless (when?) PacBio renames it something else (Non-continuous Short Reads? Clonal Non-Real Time sequencing? -- hmm, CNRT could be pronounced "snort" which probably isn't ideal).
As I've noted before, I'm not a fan of this acquisition as I think it is a distraction for PacBio. But they'd love to prove all the critics (such as moi) wrong by integrating the two systems, particularly on the analysis end -- that was the vision CEO Christian Henry presented in a Mendelspod interview in early December 2021.
Singular went public last year. They announced just before Christmas that their NextSeq-class sequencer will be . They are also working on a NovaSeq-class machine as well as a spatial 'omics imager, so not exactly the most singular focus.
Singular appears to be aiming for flexibility and speed as key selling points. The G4 features four flowcell slots and each flowcell has four independently addressable lanes -- though their marketing material doesn't make clear how independent each flowcell is. In other words, must all four start simultaneously, or can you start flowcell number two several hours after flowcell number one? NovaSeq, for example, has two flowcells that are semi-independent -- if one is clustering then the other can't be started. PromethION has true independence -- any flowcell position can be started at any time no matter what any other is doing.
On speed, Singular is claiming a human genome in 16-19 hours -- but only once their higher density F3 flowcells are launched.
BGI appears set to enter the US market in late summer, based on a recent patent lawsuit outcome that appears to have invalidated the key blocking Illumina patents, though Illumina won on other parts -- a Pyrrhic victory for Illumina. BGI has instruments across the entire range of scale, including concepts for instruments even bigger than NovaSeq.
One unfortunate issue BGI will have to deal with is the fraught geopolitical relationship between the U.S. and China. Also the logistics headaches seen this year with global shipping may loom over them, though realistically anyone domestic is going to have international shipping somewhere in their supply chain.
Element Biosciences
Element is headed by Illumina alumnus Molly He as CEO and has raised a mountain of capital, but the launch date isn't clear. They do have a ton of open positions and some are in reagent manufacturing, so a 2022 launch isn't out of the question.
Roche acquired Genia 7.5 years ago and has failed to launch a product. Will 2022 be different? Probably not, but the fact they've never killed off the Genia effort means the potential for a product hasn't gone away.
The Library Prep Challenge
DNA or RNA can't just be popped into a flowcell; some sort of library prep must go on. The number of different library prep schemes grows constantly, with all sorts of clever ways to convert different phenomena into library fragments. Many of these embed key information in different places within the molecule, with schemes typically optimized for the market leading Illumina scheme of two index reads and paired end sequencing of the library insert.
For any new entrant, availability of library reagents will be one factor influencing customer acceptance. Ideally, one could just use Illumina libraries -- Genapsys claims this. This won't work if you need something else (e.g. the many specialized bits on an ONT adapter) or if Illumina has IP that blocks such full mimicry. I have no idea if they do or if it is enforceable, but it could certainly be a block -- and it's always possible they've been letting Genapsys slide because they were too small to bother with.
The next best thing would be a way to easily convert Illumina libraries for your box, either by some post-library process (such as just ligating on ONT oligos to a finished Illumina library) or by substituting a platform-specific step such as your own barcoding PCR.
Now the question is whether anything has been lost in such a conversion or direct use of a library. If you can precisely mimic the Illumina read scheme or simply read through everything because you have no read length constraints (ONT!), then no problem. But if you can't do this, it may mean that certain library types will be degraded or unusable. Genapsys has been promising paired end sequencing since day one, but doesn't seem to have delivered -- so if a library has 300 basepair inserts with a barcode in the first 6 bases of the insert on each side, that's going to be a problem.
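To make the barcode example above concrete, here is a minimal sketch of why a single-end platform loses one of the two inline barcodes. The function name and the read lengths in the usage note are my own illustrative assumptions, not figures from any vendor; the only numbers taken from the text are the 300 basepair insert and the 6-base barcodes.

```python
BARCODE_LEN = 6  # inline barcode length from the example in the text


def barcodes_recovered(insert_len: int, read_len: int, paired_end: bool) -> int:
    """Count how many of the two inline barcodes a run reads completely.

    Hypothetical helper for illustration only. Assumes one barcode sits in
    the first BARCODE_LEN bases at each end of the insert and that reads
    start exactly at the insert boundary.
    """
    if read_len < BARCODE_LEN:
        return 0  # too short to cover even the near barcode
    if paired_end:
        return 2  # read 1 and read 2 each start at one end of the insert
    # Single-end: the far barcode is only covered if the read spans the insert.
    return 2 if read_len >= insert_len else 1
```

With an assumed 150-base read, `barcodes_recovered(300, 150, paired_end=True)` gives 2 while `barcodes_recovered(300, 150, paired_end=False)` gives 1, which is the degraded case described above. A platform with no practical read length limit is the `read_len >= insert_len` branch.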
A small aside: Illumina barcoding has some informatics quirks, because some platforms read one of the barcodes on one strand while other platforms read that barcode position in the other direction. So anyone trying to ape Illumina platforms must decide whether they want to match some of the strangeness -- where you must remember which chemistry you are using to decide whether the barcode sequence is given forward or reverse complement.
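The bookkeeping behind that quirk is small but easy to get wrong, so here is a rough sketch. The function names and the boolean flag are hypothetical; this is not any vendor's sample sheet logic, just the forward versus reverse complement decision written down.

```python
# Translation table for complementing DNA bases; N stays N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def index_as_reported(index_seq: str, read_on_opposite_strand: bool) -> str:
    """Return the barcode as the instrument would report it.

    Hypothetical helper: `read_on_opposite_strand` stands in for "which
    chemistry are you using", which the user has to know for the platform.
    """
    if read_on_opposite_strand:
        return reverse_complement(index_seq)
    return index_seq
```

So a barcode designed as `AACCGT` is reported as `AACCGT` on one chemistry and `ACGGTT` on the other, and a demultiplexer fed the wrong orientation will simply fail to find the samples.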
So suppose you can't just convert Illumina, what then? With so many library types, it is neither rational nor practical to try to cover every one, but there are several types that are key.
Ligation libraries are the most fundamental; many other library types build off of this. Ligation chemistry has been around since before NGS, but there have been all sorts of refinements and tweaks. So a new player partnering with an established library prep kit manufacturer to adapt well known kit brands would make sense in my opinion. The only downside is that now you must share the revenue stream, and library prep has the potential to be regular cash flow.
Bulk RNA-Seq is the next class of library that is in big demand, and again there are all sorts of tweaks. So another obvious place to partner.
Non-Ligation Shotgun Libraries
Rapid, easy libraries via non-ligation chemistries are a much more rarified space. Illumina's acquisition of the Nextera chemistry has paid recurring dividends. If you want to play here, there are only a few options -- seqWell and iGenomX are two that spring to mind (since I've written about them here) -- and there are some other Tn5-based kits revealed by Google.
Hybridization capture for exomes and other purposes remains in high demand and requires blocking oligos specific for the library adapters. Another place to partner with an existing vendor, ideally one with an established collection of key probe libraries.
Particularly at the lower output end, PCR-based targeting remains very important -- as several million SARS-CoV-2 sequences in GISAID can attest to. Anyone can design primers and primer sets, but getting really good ones and manufacturing at scale is not so trivial. Another good place to partner.
There's a long, long tail of other library types. But probably the most important -- if you are on the high output end -- are the various libraries from 10X Genomics. Single cell and spatial promise to be a nearly insatiable maw for sequencing capacity. 10X already supports BGI; if I were at another company I'd be looking to cut a deal.
Once someone has a box, they'd like to generate data. Realistically, the number of shops that want to build their own pipeline is probably shrinking -- bioinformatics talent is in very steep demand (as we're finding with our multiple job openings at the strain factory) and it's just extra risk to take. If you can deliver an end-to-end sample-to-answer environment, then a much larger pool of potential customers is out there.
Illumina already has BaseSpace. PacBio has announced they will have an integrated environment for both the SMRT and Omniome platforms. BGI presumably has their own analysis environment. ONT says they are moving in this direction for human genetic variation.
But right now Singular and Genapsys seem to offer FASTQ as the endpoint. I love a good FASTQ file and that's all I want, but plenty more people want something more processed. So partnering with an existing software platform would make sense.
There's also the question of how easily your data fits into existing Illumina-centric pipelines. Is your error profile the same or radically different? Do you provide tools to make adjustments or do better variant calling than some off-the-shelf tool tuned to Illumina?
An intriguing possible partner to watch is QIAGEN -- they acquired or built all the pieces for their sequencing platform and then bailed out -- but kept most if not all of the pieces. Might QIAGEN want to get back in the game by acquiring someone or more cautiously start doing deals with new sequencing platforms?
Speaking of Customers
Who a new entrant gets as early customers matters. I've already remarked on the invisibility of Genapsys in the SARS-CoV-2 sequencing space -- which is pretty much their state everywhere else. BGI signed up three or so customers for Revolocity and then gave up.
The challenge here is that anyone with a large installed Illumina base faces huge switching costs -- retraining personnel, adapting software systems, dealing with possible issues during the switchover and so forth. So those would be prize conversions, but as a new entrant you run a heightened risk you are being used as a bargaining chip by your potential customer to extract a better deal from Illumina. Or you risk getting into an environment like a certain high-throughput facility in Cambridge which dabbles with other platforms, but if you ain't Illumina you won't be used broadly.
Alternatively, you could try to identify a few ambitious and up-and-coming labs that aren't yet wedded to Illumina. That's been a key driver of Oxford Nanopore's rise -- I can think of only one leading edge Nanopore facility that was previously (and continues to be) a big Illumina lab -- Cold Spring Harbor. Ion Torrent had one or two labs early on that were big beta test sites. But identifying such future stars isn't easy -- and not everyone is really prepared for the role of constantly pushing the performance and application envelope of a sequencing platform.
Perhaps a truly interesting opportunity would be a large venture-backed startup with a genomics mission and an experienced genomics team. Talk somewhere like that into a new platform -- and no, I don't have candidates -- and a new entrant could make quite a splash. But that's a huge amount of risk for such a startup to take on.
So look carefully at early big customer announcements and don't discount the ones you've never heard of, but also don't overvalue some huge name genomics site that was willing to sign on to a press release.
Ordering and the Customer Experience
Customers won't buy your product if they find ordering the essentials a frustration. Setting up a good web front end is important, though you can succeed with a mediocre one (e.g. Oxford Nanopore; sorry, but the split-design drives me nuts).
Perhaps more important in the modern world is integrating seamlessly with "punchout" systems for online ordering within companies. Yes, you can order from anywhere off these systems, but it is much more tedious.
Customer experience is ultimately about delivery of the goods in a predictable, boring manner. The first year of ONT's MAP was nothing like that, with far too much excitement as to whether your consumables would arrive in usable condition. Particularly if you are going to ship over international borders, think very carefully about whether you're prepared to excel at this or you should be partnering with someone highly experienced in the logistics of delivering molecular biology reagents.
What's Illumina Going to Do?
So how is Illumina going to react to all this, and in what portions of the market? How much of the response will be legal (more IP fights), how much pricing (plenty of margin to sacrifice to preserve market share) and how much new products?
Illumina hasn't innovated much at the low end of the line -- MiSeq is nearly a decade old (but nicely embedded in clinical labs) and iSeq never seems to have advanced since the initial launch. Illumina has a whole set of instruments as stepping stones -- iSeq then MiniSeq then MiSeq then NextSeq and finally NovaSeq -- but would a smaller number of instruments with overlapping capacity make more sense? Could Illumina compress run times significantly without compromising on quality or data yield?
One of the great phantoms of the Illumina world has been 2x400 read chemistry on MiSeq, followed closely by anything longer than 2x250 on any other platform. Heck, 2x250 is only seen outside the MiSeq on the NovaSeq SP. Supporting longer reads could make a splash and perhaps enable a few more amplicon formats to defend against encroachment by Nanopore and PacBio into that market.
At the high end, there is a nice Twitter rumor of a NovaSeq 2 and/or S8 flowcell.
Rumblings of major Illumina launch in January. Novaseq2? S8 for $300 genome?— David Schlesinger (@david_schles) December 31, 2021
This would make good sense for keeping the high end. Migrating the superresolution imaging from the last NextSeq launch would make sense -- costs more time but higher cluster densities. Presumably existing NovaSeq customers would be offered an upgrade path -- presumably the change would be mostly in the optics. It also would be unsurprising for future NovaSeq releases to have the DRAGEN computation acceleration on board, again like the last NextSeqs. Heck, why not put DRAGEN on all the boxes if only for the marketing uniformity.
I think Illumina will continue to push the high throughput end, but it could easily be that their attention is being drawn steadily from hardware to the high value downstream applications like non-invasive testing both for pre-natal and cancer.
What Did I Forget? Get Absurdly Wrong?
I debated splitting this into multiple posts, but decided to lump it all together. What got lost in that decision? What do you think is totally wrong above? Let me know in comments or on Twitter!
[2022-01-04 8:20 EST -- corrected the incorrect statement that Singular used a SPAC to go public, an error pointed out by a commenter]
[2022-01-05 10:59 EST -- corrected my promotion of Christian Henry to Chairman of the Board at my employer; he is on the Board but not Chair]
[2022-01-05 11:02 EST -- corrected a dangling incomplete sentence about Genapsys that they have no sequences credited to their platform in GISAID]