Tuesday, May 31, 2022

Ultima Genomics Storms Out Of Stealth Promising $1/Gigabase Short Reads

To date, the new entrants targeting Illumina’s short read business have been aiming at the middle of Illumina’s range, trying to take on NextSeq.  Element Biosciences is touting high accuracy for a low price.  Omniome (now PacBio) has also positioned itself around accuracy.  Singular Genomics is claiming to enable great flexibility and fast runs.  But all are aimed at NextSeq.  As part of the run-up to AGBT, another company is decloaking from stealth mode: Ultima Genomics.  They are going not after NextSeq but full throttle after Illumina’s pinnacle, the NovaSeq running the S4 flowcell.  The value proposition is a large sequencing device that delivers S4 output at S1 prices for an overall cost of $1 per gigabase.  Note that the interview for this piece was conducted under a CDA and Ultima reviewed my copy for accuracy and to ensure I didn’t disclose anything they had marked confidential.  They were nice enough to offer to have me fly out to their facility, but I was forced by the damn coronavirus to cancel those plans the night before the trip.  A preprint summarizing the technology is also out on bioRxiv.  A trio of additional preprints have popped up as well, describing its application to a huge methylation sequencing dataset around colorectal adenocarcinoma, a huge Perturb-Seq dataset and large scale single cell RNA-Seq.


Ultima isn’t planning on truly launching until early next year, but they’re well on the way with paying early access customers.  Indeed, AGBT will feature multiple posters and talks describing the use of the Ultima instrument for a variety of genomics tasks.  And Ultima is confident that their architecture will support significant increases in future throughput, enabling per base costs to go even lower.


Ultima’s chemistry is flow based, using unterminated but fluorescently labeled nucleotides.  Only a fraction of the nucleotides are labeled in each reaction, reducing the reagent costs and minimizing molecular scar accumulation.  The reactions take place on beads whose templates are amplified via emulsion PCR - though for all the ePCR-haters out there Ultima will include a fully automated benchtop ePCR robot.  Once primed, the beads retain the DNA polymerase, so this expensive component can be conserved between flows.  The instrument is a single end reader – no paired ends – but compensates with a modal read length of around 300 bases, which should be enough to plow all the way across most short read inserts and their associated molecular indices.
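
For anyone who never lived through 454 or Ion Torrent, here is a minimal sketch of what "flow based" means in practice.  It is a toy, not Ultima's basecaller: the flow order `TGCA` and the function name `to_flowgram` are assumptions made for illustration, and real signals are noisy intensities rather than clean integers.

```python
def to_flowgram(seq, flow_order="TGCA"):
    """Return how many bases are incorporated at each successive flow.

    flow_order is an assumed, illustrative nucleotide order; the
    instrument's real flow order is not given in this post.
    """
    unknown = set(seq) - set(flow_order)
    if unknown:
        raise ValueError(f"bases not present in flow order: {sorted(unknown)}")

    flows = []
    pos = 0          # next base of the template still to be read
    flow_index = 0   # which flow we are on
    while pos < len(seq):
        base = flow_order[flow_index % len(flow_order)]
        run = 0
        # Unterminated nucleotides: the whole homopolymer run extends in one flow.
        while pos < len(seq) and seq[pos] == base:
            run += 1
            pos += 1
        flows.append(run)
        flow_index += 1
    return flows


print(to_flowgram("TTGCA"))        # one entry per flow, value = run length
print(to_flowgram("A" * 10))       # a long homopolymer is consumed in a single flow
```

The second call shows why homopolymer length is the hard part: ten A's produce one large signal in one flow, and the caller has to tell a 10 from a 9 or an 11 by intensity alone.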


The use of unterminated nucleotides has typically meant challenges in resolving homopolymers.  Ultima is tuning their system to call homopolymers of up to 12 bases; based on discussions with customers and their own experience, accurate counting of longer homopolymers is deemed insufficiently valuable to focus on vs. other design tradeoffs.  


But Ultima has found several ways in which unterminated flow chemistry can either have its weaknesses ameliorated or become downright boons.  First, while it can’t accurately measure long homopolymers it can go straight through very long ones in a single extension cycle – so poly-A tails in cDNA ends can be easily blitzed through.  This helps ensure reading all the way through inserts of things like single cell libraries.  Second, for short homopolymers Ultima embeds in the Q-scores a probability matrix of the length – basically the odds of minus one and plus one versions of the sequence.  This is leveraged by their customized version of GATK, developed with the Broad Institute.  Third is a clever approach of “cycle shift variant calling” that I’m still stunned has never appeared in the literature for any other flow chemistry – 454, Ion Torrent or Genapsys.  Cycle shift uses the known order of flows to increase the confidence in variant calls – particularly valuable for low coverage data such as cell-free DNA.  
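
To make the cycle shift idea concrete, here is one way to read it, sketched under loud assumptions.  This is not Ultima's algorithm: the flow order `TGCA`, the helper `to_flowgram` and the test `is_cycle_shift` are all hypothetical.  The sketch simply asks whether a substitution changes the number of flows needed to read through a short context, which would mean everything downstream lands in a later (or earlier) cycle.

```python
def to_flowgram(seq, flow_order="TGCA"):
    """Bases incorporated per flow, for an assumed (illustrative) flow order."""
    if set(seq) - set(flow_order):
        raise ValueError("sequence contains bases missing from the flow order")
    flows, pos, flow_index = [], 0, 0
    while pos < len(seq):
        base = flow_order[flow_index % len(flow_order)]
        run = 0
        while pos < len(seq) and seq[pos] == base:
            run += 1
            pos += 1
        flows.append(run)
        flow_index += 1
    return flows


def is_cycle_shift(ref, alt, flow_order="TGCA"):
    """True if ref and alt contexts need a different number of flows.

    Hypothetical criterion: both contexts must be non-empty, the same
    length, and end on the same base, so any difference in flow count is a
    whole number of cycles.
    """
    if not ref or len(ref) != len(alt) or ref[-1] != alt[-1]:
        raise ValueError("expects same-length contexts ending on the same base")
    return len(to_flowgram(ref, flow_order)) != len(to_flowgram(alt, flow_order))


print(is_cycle_shift("ATG", "ACG"))  # T>C delays the final G by a full cycle
print(is_cycle_shift("ATC", "AGC"))  # T>G changes two adjacent flows only
```

In the first pair the alternate allele needs four more flows than the reference, so faking it would take several flows being misread in concert, which is presumably why such calls can carry more confidence.  In the second pair both alleles finish on the same flow and differ only in two neighbouring signals.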


Another key driver of low cost and high density is the use of a spinning, open “flow cell” (really a 200mm diameter wafer) for both reagent addition and imaging.  Centrifugal force generated by the spinning (fake force, ha!) distributes the reagents as a very thin film, minimizing wastage.  Imaging as the wafer spins enables shooting many tiles without having to repeatedly accelerate and decelerate the flowcell as a rectilinear scanning scheme must do.  The speed difference adds up: Ultima can generate in 20 hours the same 3 terabases (10 billion reads of roughly 300 bases each) as a NovaSeq S4, but an S4 requires 44 hours to run – and Ultima believes they can shave that down to 16 hours.  Faster cycle times mean more runs per instrument – and each instrument runs two wafers simultaneously, each with its own chemistry station but sharing an imaging path.  The instrument features tanks for reagents which can be refilled, with a 24 hour capacity of each reagent.  Six different wafers can be queued for running, with built-in automation removing spent wafers and swapping in new wafers with new library pools.  
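
The arithmetic behind those figures is easy to check.  A quick sketch using only the numbers quoted above plus the headline $1 per gigabase; the names are illustrative, and the cost line assumes the $1/Gb figure applies uniformly to a full run, which nobody has spelled out.

```python
READS_PER_RUN = 10_000_000_000  # 10 billion reads
READ_LENGTH = 300               # roughly 300 bases each
COST_PER_GB = 1.0               # headline figure, USD per gigabase (assumed uniform)


def run_output_gb(reads=READS_PER_RUN, read_length=READ_LENGTH):
    """Total bases per run, expressed in gigabases."""
    return reads * read_length / 1e9


def gb_per_hour(output_gb, hours):
    """Average throughput over a run of the given duration."""
    return output_gb / hours


output_gb = run_output_gb()
for label, hours in [("Ultima today", 20), ("Ultima target", 16), ("NovaSeq S4", 44)]:
    print(f"{label}: {gb_per_hour(output_gb, hours):.1f} Gb/hour")

print(f"Run output: {output_gb / 1000:.1f} Tb, "
      f"about ${output_gb * COST_PER_GB:,.0f} per run at $1/Gb")
```

On these numbers the 20 hour run works out to a bit more than double the S4's hourly output, and the 16 hour target would push that to nearly triple.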


How might the system grow its output?  The patterned wafers place the beads at a very conservative pitch.  Larger diameter wafers are also a possible further option. Extending the read lengths is yet another possible expansion direction.


The instrument has onboard GPU compute power, which is currently used for basecalling and alignment and could ultimately also perform the variant calling work.  


Current accuracy is 0.1% error for substitutions and 0.5% for indels.  Most of the indel error is concentrated in homopolymers longer than 8 bases, with calling capped at 12.  When used with the specially modified GATK co-developed with the Broad, or other custom DeepVariant or Sentieon pipelines, SNP calling reaches 99.7% precision and 99.7% recall, while indel recall and precision range from 96-98% for small indels (excluding long homopolymers and low complexity regions).  Accuracy suffers in low complexity regions, which Ultima believes is an amplification chemistry issue rather than a sequencing chemistry issue, and they believe they can significantly improve on the current performance.  
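
To put those numbers on more familiar scales, per-base error rates convert to Phred-style quality values and a precision/recall pair collapses to an F1 score.  The formulas are the standard ones; the function names are illustrative, and treating the quoted rates as uniform per-base error probabilities is a simplification.

```python
import math


def phred(error_rate):
    """Phred-scaled quality: Q = -10 * log10(error probability)."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in (0, 1]")
    return -10 * math.log10(error_rate)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


print(f"Substitutions at 0.1% error: Q{phred(0.001):.0f}")
print(f"Indels at 0.5% error:        Q{phred(0.005):.0f}")
print(f"SNP F1 at 99.7%/99.7%:       {f1_score(0.997, 0.997):.3f}")
print(f"Indel F1 at 96%/98%:         {f1_score(0.96, 0.98):.3f}")
```

So the substitution rate sits at roughly Q30 and the indel rate at roughly Q23, as averages across all contexts.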


Ultima plans to offer their own kits for PCR-free and PCR-based sheared genomic libraries.  Libraries for other systems can be converted by a simple indexing PCR scheme - this has been done for TruSeq libraries and proof-of-concept experiments have been run for Nextera libraries.  


What could be done with such an instrument?  A pending publication uses Ultima and Illumina in parallel on the same 4 million cell Perturb-Seq experiment and finds the results equivalent between the platforms.  A large fraction of the Phase IV ENCODE HiC data was generated on Ultima.  An internal proof-of-concept experiment used deep sequencing of RNA from COVID-19 infected samples, recovering complete viral genomes after only ribosomal RNA depletion.  One of the AGBT abstracts demonstrates the ability of Ultima WGS to detect minimal residual disease at low levels by deep WGS of cell-free DNA, an approach academia and startups are actively exploring.  Additional AGBT abstracts describe population genetics studies, oncology, and rare disease sequencing.  Ultima has 10 paying Early Access customers, with 7 instruments installed to date – and these run the gamut from large academic genome centers to biopharma to government labs.  They hope to have “well into double digits” of customers at the time of the official launch.


To get here Ultima has raised over $550 million and hired over 350 employees.  The company has made steady progress from their start in 2016.  Ultima CSO Doron Lipson was previously part of the teams at Helicos and Foundation Medicine, so he has extensive experience both in building a sequencing platform and applying it at scale.  CEO Gilad Almogy has spent many years in the semiconductor manufacturing field - Ultima’s reaction wafers are patterned atop silicon substrates and the semiconductor industry also uses very high precision optical methods for both manufacturing and quality control.  


Illumina has for a long time now held an unassailed position as leader in sequencing in the US market as well as others.  Now that position is under pressure from all sides: Element and Singular are trying to squeeze the NextSeq market while Ultima is aiming for the top; Oxford Nanopore thinks their “short fragment mode” can compete as well and the patent shackles are being lifted from BGI.  At JP Morgan in January Illumina said their “Chemistry X” would offer improvements in accuracy, read length and output, but absolutely no details have been forthcoming – in particular, nothing on whether new instruments will be required to access Chemistry X benefits.  Perhaps the entry of Ultima and the others will add some urgency to Illumina communicating their future plans, lest customers start planning in earnest to opt for the new platforms.


For us consumers of sequence data, more competition and lower prices are a pure good.  Projects can continue to be increasingly ambitious, and the number of different phenomena which can be converted into a sequence measurement simply keeps growing.  More for less is never, ever going to become boring – it will always be enabling.  After a long period of very shallow slope in the notorious “better than Moore’s Law” slide, we appear to be entering a new period of plunging sequencing costs.  Time to start making plans to take advantage of it!


[20220608 corrected really embarrassing millions typo (should have been billions) which has been requoted all over Twitter]

7 comments:

Unknown said...

Thanks for this great write up. Always exciting to hear about new sequencing tech.

Anonymous said...

Ultima's quality does not seem yet on par with Illumina, particularly their INDEL performance.... their F1 score for INDELs is ~96%, even when they exclude long homopolymer and low complexity regions. This does not seem ready for clinical prime time yet.

Anonymous said...

your surprise is well founded, as ion torrent was using special constructed flow orders to reduce error rates 10 years ago...perhaps it was never published on, but it was thoroughly developed and used, and was very clear to us...it would not be surprising if it was in long-since published patents...

Anonymous said...

your surprise is well founded, as in fact ion torrent was using specially constructed flow orders to reduce error rates 10 years ago...perhaps it was never published on, but it was thoroughly developed and used, and it was very clear to us early on that the known flow order could constrain errors...it is possibly less clear to folks that you can further optimize this by constructing special patterns in the flow order...it would not be surprising if this was in long-since published patents...

Anonymous said...

"...clever approach of “cycle shift variant calling” that I’m still stunned has never appeared in the literature for any other flow chemistry" -- prepare to become un-stunned, it may not have been widely noted, but Ion Torrent thoroughly developed and implemented such methods for using knowledge of the flow order to constrain and reduce errors and increase the accuracy of variant detection...the methods developed there were quite sophisticated, and used special patterns of flows to further increase accuracy, the underlying coding theory is amusing, you would like it...all done more than a decade ago...while not in publications, it may well be in the patents published back then....

Anonymous said...

It's a behemoth. Like an RS2 mated with a Helicos. yer $1 per Gb but $2M for the box and only if you can keep it stuffed full of samples. Many expensive GPUs. it makes novaseq look cute and agile.

Dale Yuzuki said...

Exciting stuff Keith!

Thinking about the scanning, do I understand correctly that the imaging is done as the circular plate is spinning? I'm not clear how a 44h imaging time on the NovaSeq could be reduced to only 16h...