Omics! Omics!: Notes from a Conversation with PacBio's Christian Henry

Monday, January 31, 2022

Notes from a Conversation with PacBio's Christian Henry

PacBio CEO Christian Henry was kind enough to chat with me by videoconference just after JP Morgan. To get the obvious issue out of the way, let me say that while it is common to agree to meet with interview subjects at some future date when they are in Boston, he is the first one to suggest he would just stop by my desk and we'd head to a break room. Henry sits on my employer's board, so if you think that shades my opinions you are forewarned.

Short Reads

We first chatted about the Omniome short read technology, which Henry says is targeted for launch in the first half of next year. On the question of branding, it will get its own product name and there will be an effort to both allow it to be distinctive yet be part of a common set of marketing themes across the PacBio portfolio.

Henry was particularly enthusiastic about the high accuracy of the platform, with 0.01% error (phred 40) at the 200 base pair mark; in places the observed error rates (not estimated from a model, but observed via alignment to a reference) are Q50 or Q60. He expects this to differentiate the technology from other short read players. A new clustering technology was recently developed which enables higher density and leads to less phasing error. There will be a paired end option with read lengths of no less than 150 bases per side.
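For anyone who wants to check the arithmetic behind those quality values, here is a minimal sketch of the standard Phred relationship, Q = -10 * log10(p), where p is the per-base error probability. The function names are my own for illustration and have nothing to do with any PacBio or Omniome software.

```python
import math


def phred_from_error(p):
    """Convert a per-base error probability (0 < p <= 1) to a Phred score."""
    return -10.0 * math.log10(p)


def error_from_phred(q):
    """Convert a Phred score back to a per-base error probability."""
    return 10.0 ** (-q / 10.0)


# Error rates expressed as percentages, converted to probabilities first.
for pct in (0.1, 0.01):
    p = pct / 100.0
    print(f"{pct}% error -> Q{phred_from_error(p):.0f}")

# The quality values mentioned above, expressed as expected errors per base.
for q in (40, 50, 60):
    print(f"Q{q} -> {error_from_phred(q):.0e} errors per base")
```

By this relationship 0.01% error is Q40 while 0.1% would only be Q30, and Q50 and Q60 correspond to one error in 100,000 and one in 1,000,000 bases respectively.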

Henry attributes some of the improved accuracy to the two-phase sequencing chemistry and the lack of molecular scars on the molecule. Omniome's technology first binds a labeled base in a manner in which it cannot be added to the chain, then removes that base and extends with a terminated base. After removal of the blocking group from the terminator base, what is left is a native nucleotide. It should be noted that whether the molecular scars on bases really matter is a contentious issue, with little if any hard biophysical analysis in the open literature that I am aware of (please feel free to educate me to the contrary!).

The instrument is not yet in its final form, but Henry did say that they are definitely "beyond the breadboard" and closer to an alpha instrument. Moreover, instruments are operated on a regular basis, so PacBio is building a substantial trove of data with which to analyze performance.

Long Reads

I brought up the newly announced Google collaboration around DeepConsensus. Henry described this as a formalization of a collaboration which has been ongoing for a while -- and of course has already produced a preprint and a GitHub repository. He is optimistic that improvements in HiFi accuracy from this collaboration might be made available to users before year's end.

We also talked about the recent spate of papers on concatemer sequencing on the PacBio, a delayed subject for a blog post. PacBio is collaborating with the Broad Institute on their MAS-Seq effort, with a goal to produce a kit to enable this method. MAS-Seq's first focus is on Iso-Seq, PacBio's full length mRNA sequencing protocol, packing more raw transcripts into each HiFi read to achieve higher overall transcriptome coverage.

The newly announced collaboration with Berry Genomics to produce a desktop instrument ("MiSeq-ish footprint") excites him because while Berry will handle obtaining diagnostic use approval in China, PacBio can sell the same technology worldwide. Converting to a desktop format will involve significant streamlining of the required lab infrastructure -- no more dry nitrogen. Henry thinks an eighteen- to twenty-four-month development time will be required for this instrument.

The Invitae collaboration on high throughput sequencing has had more time to ferment and will result in a series of product rollouts, perhaps beginning next year. He is optimistic that data will start emerging late this year if not sooner. The ultimate goal is to convert current short read exome projects into long read complete genomes, with methylation calling (and methylation calling will soon be performed on board the Sequel IIe instruments).

We chatted before the announcement of a collaboration between PacBio and Hamilton on automated library prep (the hazards of procrastinating casting ideas into bytes!), so that wasn't talked about. But he did talk about how Invitae brings great expertise in high throughput sample prep, on the order of one hundred thousand samples per quarter. PacBio's goal is to take the know-how from the Invitae collaboration and package it up -- perhaps even a future in which PacBio sells complete "lab-in-a-box" packages of sequencer, automation and software. He sees this part of the puzzle as potentially harder than the PacBio technology, with more unknowns. Also fitting into the picture is the acquisition of Circulomics, with its automation-friendly high molecular weight DNA extraction products.

I brought up the question of reliability. The company is greatly increasing its investment in Quality Control functions as well as more R&D on how to reduce quality problems at their root. He also sees scaling as critical here: with greater scale there can be more "battle testing" of the entire process, with the end goal of reducing variability in what is introduced to the sequencer. He also pointed out that during his time at Illumina he oversaw the operations side of their push into clinical markets, a space in which consistency of product and performance is paramount.

A Parting Shot

We had a great, animated and friendly chat. At the tail, the subject of Illumina's Infinity technology came up and annoyance crept into his voice. The Wall Street analysts apparently were asking him constantly about Infinity. As with everyone else, Henry was working with the meager information available on the tech, but given the consensus that it is the Longas mutation technology he had a few pointed thoughts -- "short reads stitched together", "oversampling", limited to 10 kb when 20 kb is far more valuable for "getting to 99.5% of all structural variation", and skepticism on how it will handle long homopolymers and long VNTRs. His conclusion: "Interesting that they throw the shot across the bow" ... "this is kind of a nothingburger".

[2022-02-01 20:59 - obviously one hundred samples a quarter is nothing interesting -- the word "thousand" was inadvertently left out!!]

[2022-02-05 22:25 - as pointed out by a commenter, it is 0.01% error and phred 40 at position 200 -- my notes had it right as "0.01 - phred 40" but I transcribed the 0.01% erroneously as 0.1% and then did the phred conversion on the fly. Aieeeee!]

1 comment:

Anonymous said...: Hi Keith, there must be a misprint about 0.1% error at the 200 mark. Should be around 0.01% error or Q40 around the 200Base mark.; Saturday, February 05, 2022 9:35:00 PM
