Tuesday, October 25, 2022

PacBio Revio: Same Footprint, 80% The Time, 15X The HiFi!

PacBio has been rolling out announcements around the ASHG meeting and now delivers a huge one: the next-generation SMRT instrument “Revio” will roll out next spring, and it’s a big step up in throughput. With Revio’s 15X boost in per-run throughput over Sequel IIe, PacBio is touting 30X HiFi genomes for under $1K in sequencing consumables per genome.

Three types of improvements boost Revio’s throughput. First, it will use 25M ZMW SMRT Cells, up from 8M. With an in-spec SMRTbell library, this will deliver a 30X HiFi genome on each $995 SMRT Cell. Second, it runs four of these SMRT Cells simultaneously. Third, only 24-hour movies, down from 30, will be required to generate this HiFi data - a feat enabled by moving Google’s DeepConsensus base re-calling onboard the instrument and supported by built-in NVIDIA GPUs. Methyl calling at CpG sites is also built into this workflow, which uses the same HiFi libraries as the existing instruments. All for a mere $779K per instrument.
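A quick back-of-the-envelope check shows how those three factors multiply out to roughly 15X. This is only a sketch of the arithmetic using the figures above; the variable names are mine, and it assumes yield scales linearly with ZMW count, that Sequel IIe sequences one SMRT Cell at a time, and that throughput is counted per hour of movie time. Whether PacBio computes its 15X exactly this way is my assumption.

```python
# Figures quoted in the post; names are illustrative, not PacBio's.
SEQUEL_ZMWS = 8_000_000
REVIO_ZMWS = 25_000_000
REVIO_CELLS_IN_PARALLEL = 4   # assumption: Sequel IIe runs one cell at a time
SEQUEL_MOVIE_HOURS = 30
REVIO_MOVIE_HOURS = 24


def throughput_boost():
    """Multiply the three independent gains into one fold-change."""
    zmw_gain = REVIO_ZMWS / SEQUEL_ZMWS                   # more wells per cell
    parallel_gain = REVIO_CELLS_IN_PARALLEL               # more cells at once
    time_gain = SEQUEL_MOVIE_HOURS / REVIO_MOVIE_HOURS    # shorter movies
    return zmw_gain * parallel_gain * time_gain


print(f"Estimated boost: {throughput_boost():.1f}X")
```

Dropping the movie-time factor leaves 12.5X, so the shorter movies are what carry the estimate past 15X.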



PacBio COO Mark Van Oene was kind enough to spend a half hour last Friday on the phone with me to walk through the announcements. PacBio had invited me to the launch party, but I’m on the wrong coast for it! Standard disclaimer: PacBio CEO Christian Henry sits on my employer’s Board of Directors. 

As far as timing, PacBio is taking orders now with plans to ship globally in February or March. Customers who took delivery of Sequel II or IIe systems will be offered loyalty discounts of an unspecified amount. 

On the informatics side, not only is DeepConsensus on board and boosted with NVIDIA GPUs and CpG calling standard, but Van Oene said the output file sizes have been reduced by 50% or more for a human genome. At this time, there is not a plan to make DeepConsensus standard on the older Sequel II instruments, since these instruments lack the GPUs required for rapid execution of this tool. 

Multiple improvements in operating convenience have been made. First, the four SMRT Cells are loaded in a tray for simple loading. Second, the cells are now enclosed with a cover to form a flowcell and, as a result, no longer require nitrogen gas to exclude oxygen. So the site prep checklist just lost a small nuisance. More important for operators, a new tray can be loaded while the instrument is still operating on a prior set (except during about 4 hours of the run when the robotics are in operation), enabling flexibility in queueing multiple runs to maintain high instrument utilization. PacBio has also consolidated reagents into a single sample plate, which delivers both the reagents and the samples and also collects waste, rather than requiring an operator to load multiple items separately.

PacBio’s estimated throughput of 1,300 30X genomes is predicated on only 325 runs a year, perhaps a little conservative. Van Oene noted that some labs are experimenting with less than 30X coverage to explore what variation can be extracted from HiFi sequencing at lower cost, so that number could be boosted with lower coverage. Of course, feeding the instrument requires preparing sufficient libraries. Van Oene cited improvements in the version 3.0 library preparation kits, which have leveraged the Circulomics team to eliminate gel-based size selection and have also cut the number of manual steps. Van Oene also noted that PacBio believes that both small genome sequencing and IsoSeq, particularly with the MAS-Seq concatemer protocol now supported with 10X Genomics single-cell RNA-Seq libraries, will drive demand for many more barcodes, so PacBio is proactively evaluating indexing sets with 384 options. 
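The 1,300 figure is easy to reproduce, and the same arithmetic shows what lower coverage would buy. This is a minimal sketch assuming each SMRT Cell yields the equivalent of one 30X genome and that coverage can be split across samples without loss (it ignores multiplexing overhead); the function and constant names are mine.

```python
CELLS_PER_RUN = 4
RUNS_PER_YEAR = 325
COVERAGE_PER_CELL = 30  # assumption: one 30X human genome per SMRT Cell


def genomes_per_year(target_coverage, runs_per_year=RUNS_PER_YEAR):
    """Genomes per year if total coverage is divided evenly among samples."""
    total_coverage = CELLS_PER_RUN * runs_per_year * COVERAGE_PER_CELL
    return total_coverage / target_coverage


for cov in (30, 20, 15):
    print(f"{cov}X: about {round(genomes_per_year(cov)):,} genomes per year")
```

Calling `genomes_per_year(30, runs_per_year=365)` shows what a run every single day would give under the same assumptions.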

For anyone interested in exploring Revio data, PacBio is releasing five human datasets generated from Genome In A Bottle samples. 

SMRT Platform Bolstered with Collaborations & A New Software Tool 

PacBio also had four other ASHG-targeted press releases around the SMRT platform. 

Nine leading labs have been marshaled into the Consortium for Long Read Sequencing, aka CoLoRS, with a goal of sequencing about 2,500 human samples with HiFi. Some labs are focused on samples from healthy individuals; others are targeting specific conditions.

A collaboration with Twist is offering two hybridization capture panels for PacBio libraries. The “dark genes” panel targets 400 genes with a reputation for sequencing difficulty, such as critical tandem repeats. A second panel targets 50 genes of high relevance to pharmacogenomics.

As mentioned above, the concatemer MAS-Seq approach for IsoSeq full length RNA sequencing, increasing the number of full length transcripts sequenced by about 15-fold, is now an official kit supporting single cell libraries made on the 10X Genomics Chromium. In brief, this uses amplification with uracil-containing primers and USER digestion to create specific sticky ends to drive the 15-fold concatemerization reaction, enabling more IsoSeq payloads to be sequenced per ZMW. 

A new software tool for genome-wide tandem repeat genotyping, TRGT (along with a visualization tool, TRVZ), has been released on GitHub. TRGT/TRVZ is intended not only to type repeats, but to cross-reference that information with methylation calls to detect repeat-expansion-associated changes in methylation.


Onso, PacBio’s Short Read Platform, On Target for 2023H1 

My conversation with Mark Van Oene, PacBio COO, also covered the PacBio benchtop short read platform - now named Onso. PacBio is still planning to launch in the first half of next year, with order taking to begin in the first quarter. Three beta sites were announced for Onso: The Broad Institute, Corteva Agriscience and Weill Cornell Medicine. Instruments will list for $259K. Onso is expected to deliver about 500 million 2x150 reads in less than 48 hours, with a 1x200 kit also available at launch. PacBio is considering other read formats; I briefly lobbied for something in the 2x300 range - I have some scientific partners who would love to have higher throughput than MiSeq in that range. Onso will launch with “a variety of library kits”, presumably covering the most popular applications. A conversion kit will enable using a small number of PCR cycles to adapt Illumina libraries.



PacBio is emphasizing Onso’s quality. Currently over 90% of the called bases are Q40 or better, and in many library preps Q50 bases outnumber Q40. PacBio hypothesizes that because the sequencing chemistry itself has such high accuracy, errors are starting to be dominated by those introduced during library prep or amplification. With such high accuracy, PacBio believes that coverage requirements for many applications can be shrunk.
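For a sense of scale on those quality values, the standard Phred definition converts a score Q into an error probability of 10^(-Q/10). The short sketch below just applies that formula; the function name is mine, and Q30 is included only as a familiar reference point.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to a per-base error probability."""
    return 10 ** (-q / 10)


for q in (30, 40, 50):
    p = phred_to_error_prob(q)
    print(f"Q{q}: about 1 error in {round(1 / p):,} bases")
```

Each 10-point step is a tenfold drop in expected errors, which is why small amounts of library prep or amplification damage can dominate at these levels.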



[Okay, this is an embarrassing pickle - I had 36 hours for the Sequel II movie time in an earlier draft & corrected the mistake -- but not the calculation from that that is in the title! 23:51 2022-10-25]

5 comments:

Anonymous said...

significant upgrade but still less data and more cost than an ONT P24.

Anonymous said...

Both have their merits. I see ONT and PacBio being different tools for different problems. Illumina is the cost effective solution where short reads will suffice (most resequencing applications). ONT is great where you need quick, cheap, and very long reads but base calling quality is not as important. PacBio seems to be positioning itself somewhere in between for cases where you need longer reads than Illumina while maintaining high accuracy.

John Didion said...

Hi Keith - thanks for the nice summary. Can you elaborate on the throughput? A 30x human genome per flowcell is ~90 Gb. At 25M ZMW per flowcell this works out to an average of 3.6 kb reads. But they are claiming read sizes of 15-18 kb. What am I missing?

Keith Robison said...

John: The 25M is the number of features on the device; loading efficiencies are typically in the 33% range, and then another fraction of the loaded ZMWs will not generate a HiFi read.

So probably 25% of the ZMW count will actually deliver HiFi reads, which squares the numbers (roughly, of course)
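As a rough check of the numbers in this exchange, the sketch below multiplies them out. It assumes a human genome of about 3.1 Gb and takes the 25% productive fraction and the 15-18 kb read lengths at face value; all names are illustrative.

```python
ZMWS_PER_CELL = 25_000_000
PRODUCTIVE_FRACTION = 0.25        # rough estimate from the reply above
GENOME_SIZE_BP = 3_100_000_000    # assumption: ~3.1 Gb human genome


def coverage(mean_read_bp):
    """Fold coverage from one SMRT Cell at a given mean HiFi read length."""
    hifi_reads = ZMWS_PER_CELL * PRODUCTIVE_FRACTION
    return hifi_reads * mean_read_bp / GENOME_SIZE_BP


for length in (15_000, 18_000):
    print(f"{length:,} bp reads: about {coverage(length):.0f}X")
```

With 15 kb reads this lands close to the 30X claim, so the two sets of numbers are consistent once the productive fraction is taken into account.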

deepika kapoor said...

For anyone interested in exploring Revio data, PacBio is releasing five human datasets generated from Genome In A Bottle samples.