Thursday, September 07, 2023

Two More Automation Partners Join PacBio Compatible Program

A significant challenge for the long read sequencing vendors has been that short read sequencing has a decade and a half head start in evolving a tools ecosystem.  New entrants such as Singular Genomics and Element Biosciences can take the strategy of building short bridges to existing tools designed for Illumina, whereas the long read players must often build anew, as tools and protocols sufficient for short reads often perform poorly on long reads.  At J.P. Morgan in January, PacBio announced a PacBio Compatible program to highlight products which specifically support PacBio sequencing.  This morning, two more automation vendors -- Revvity and Tecan -- added their liquid handling automation to the program.  I got a walkthrough of the new announcement from PacBio's Amit Patel yesterday.

PacBio library preparation is a somewhat laborious affair when performed manually.  While the version three kit significantly streamlined the number of pipetting steps, it is still a ligation protocol with multiple bead cleanups.  Since there is no PCR downstream and only truly covalently closed circular library molecules will function properly in the sequencing process, small process variations can have significant negative effects on library concentration and yield. Industrial settings also tend to put a premium on employee time as a valuable resource in contrast to the common academic ethos of grad students as free labor, so for companies walkaway automation is very prized -- that is certainly the case at The Strain Factory that employs me.

Liquid handlers tend to be large capital expense purchases, require skilled programming and maintenance, and have other characteristics that encourage sites to pick one or a few vendors and stick to them.  Hence it would be unwise for PacBio to choose a single vendor for the program - many sites interested in acquiring a PacBio sequencer are likely to have already locked in a liquid handling choice.  And that also fits the general ethos of the PacBio Compatible program - PacBio is trying to encourage a diverse supportive ecosystem with many partners and not pick an exclusive one in each functional area.  

The program also emphasizes PacBio's desire to focus their technical development in the immediate vicinity of the sequencer.  For example, in the library preparation space PacBio is focused on the actual HiFi library kits, on workflows for specific types of projects such as AAV, and on increasing flowcell utilization with their MAS-Seq insert concatenation scheme.  PacBio also continues to develop basic bioinformatics workflows, while partners such as BugSeq deliver complete solutions for specific use cases (in that case, microbial genome interpretation).

In the automation space, PacBio Compatible now includes one specialized unconventional instrument -- the single sample Integra Miro Canvas (formerly from Miroculus, acquired by Integra this past March) and three medium to high throughput conventional liquid handlers -- Hamilton's Microlab NGS Star, Revvity (lab automation assets of what was PerkinElmer) SciClone G3 NGSx, and Tecan DreamPrep NGS Compact.  

Miro Canvas uses electrowetting microfluidics for a single sample, which labs such as the Broad Institute have shown can be very interesting for method development.  Electrowetting also has the advantage of inflicting very low shearing forces on a sample.   Miro Canvas is also a very different beast with regard to moving parts -- basically none -- so maintenance and calibration aren't the recurring challenge found on large mechanical liquid handlers.

Tecan's DreamPrep NGS Compact is designed to process 6-48 samples at a time.   For a lot of labs in the human genomics space, that's likely a good number.  Revio was touted for delivering four 30X human genomes per 24 hour run, but I'm told by a very highly placed source at PacBio that many labs interested in population genomics are finding that 12-15X genomes are very useful, with the benefit of now packing 8-10 human samples per run.  So DreamPrep might be very attractive in a lab thinking of pushing out samples at this scale.
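
As a back-of-envelope check on those numbers, here is a minimal sketch that assumes run yield divides evenly across samples and that the quoted "four 30X genomes" figure is the total budget per run. The function name and its defaults are illustrative, not anything from PacBio.

```python
def samples_per_run(target_coverage, genomes_per_run=4, reference_coverage=30):
    """Estimate how many human genomes fit in one run at a given coverage.

    Assumption: total yield is fixed at genomes_per_run * reference_coverage
    "genome-coverage units" and splits evenly across multiplexed samples.
    """
    total_coverage_budget = genomes_per_run * reference_coverage  # 4 * 30 = 120X
    return total_coverage_budget // target_coverage


for coverage in (30, 15, 12):
    print(f"{coverage}X -> {samples_per_run(coverage)} samples per run")
```

Under that assumption, 15X and 12X work out to 8 and 10 samples per run, matching the range quoted above. Real runs will vary with library quality and barcode balance.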

The Revvity and Hamilton liquid handler protocols can handle up to 96 samples per run.  So human genome labs that have really leaned into Revio with multiple instruments, or labs using one of the target enrichment solutions from PacBio Compatible partner Twist Bioscience, could easily take advantage of this scale.  Or of course anyone packing many smaller genomes onto a Revio.

So far the PacBio Compatible program has been largely focused on Sequel/Revio HiFi long read sequencing.  Patel stated that the plan is for products targeting the Onso short read platform to also be under the same umbrella.


2 comments:

cdwScience said...

Thank you very much for this post.

With respect to "many labs interested in population genomics are finding that 12-15X genomes are very useful", I am not 100% sure if we are talking about the same thing. However, for my own HiFi data, I was not highly satisfied with data that I thought was roughly within that range.

You can see those results and links to the raw data if you scroll down on this page (to at least "Submit Sample to Generate HiFi Reads"):

https://github.com/cwarden45/DTC_Scripts/tree/master/Dante_Labs

I believe I have seen some similar claims before. However, if you counted Harvey et al. 2023 as one of those references, then I am not sure if the data itself shows some amount of benefit to higher coverage (such as greater than 15x). I might be placing more emphasis on the Nanopore results, but my individual opinion was that I might still prefer higher coverage to be safe.

Am I correctly understanding that "12-15X" is intended for HiFi coverage?

Thank you again!

Keith Robison said...

Hi cdwScience!

The coverage demands will depend on the scientific questions and the quality of the libraries. For human population genetics, according to my highly placed source (to be revealed next week :-) 12-15X is apparently sufficient to call a large fraction of the variation in the samples; these labs are presumably using imputation to some degree. For rare disease detection it might not be acceptable - I'd love to hear from experts on that.

With high-accuracy long (~20 kb) insert reads, 12-15X is probably going to fail mostly for reasons of sampling: just through chance, some alleles will not be sampled. A great intro comp bio programming problem would be to estimate how many sites in a human genome would not see both alleles sampled.
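
One rough way to set up that estimate is sketched below, under simplifying assumptions that are not from the discussion above. Read depth at a site is taken as Poisson with mean equal to the coverage, and each read is assumed to sample either allele with probability 1/2, so each allele's read count is Poisson with mean coverage/2. The count of about 3 million heterozygous sites per genome is a round placeholder, and the function names are made up for illustration.

```python
import math


def poisson_prob_below(k, lam):
    """P(X < k) for X ~ Poisson(lam)."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))


def frac_het_sites_missing_allele(mean_coverage, min_reads=1):
    """Fraction of heterozygous sites where at least one allele is seen
    in fewer than min_reads reads.

    Assumptions: Poisson depth, unbiased 50/50 allele sampling, and no
    sequencing or mapping error.
    """
    lam = mean_coverage / 2.0  # expected reads per allele
    p_allele_seen = 1.0 - poisson_prob_below(min_reads, lam)
    return 1.0 - p_allele_seen**2  # the two allele counts are independent


HET_SITES = 3_000_000  # assumed placeholder, not a measured value

for coverage in (12, 15, 30):
    frac = frac_het_sites_missing_allele(coverage)
    print(f"{coverage}X: {frac:.5f} of het sites, ~{frac * HET_SITES:,.0f} sites")
```

Under these assumptions roughly 0.5% of heterozygous sites miss an allele entirely at 12X, and about 0.1% at 15X. Raising `min_reads` to 2 or more, which is closer to what a variant caller would want, increases those fractions. Real coverage is also more dispersed than Poisson, so this is likely an optimistic floor.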