Monday, November 30, 2020

PacBio's Renewed Energy

When confronted by antitrust regulators last year, the core thesis of Illumina and Pacific Biosciences was that PacBio could not survive as an independent company.  After giving up on the merger, PacBio decided to dispute its own thesis -- and seems to be succeeding so far.  As highlighted at their recent "Global Summit" meeting, they have a new CEO, new financing and a burnished product offering.
I don't like to spend much time looking at stock prices, but if you want a clear sign of renewed optimism in the company, look at their two stock offerings this year -- 19.4M shares offered at $4.47 in August and another 7.4M shares at $14.25 in mid-November -- so fewer than half as many shares sold at more than three times the price per share! For those wondering, I do not hold positions in any company in the space, though results like that certainly challenge my focus on broad index funds.
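
For anyone who wants to check that comparison, here is a minimal sketch using only the share counts and prices quoted above. The gross proceeds it prints are a simple shares-times-price product, which assumes no fees or over-allotment options; the function and variable names are just illustrative.

```python
# Figures quoted in the post: shares offered (in millions) and price per share (USD).
august = {"shares_m": 19.4, "price": 4.47}
november = {"shares_m": 7.4, "price": 14.25}


def gross_proceeds_m(offering):
    """Shares times price, in millions of USD (assumes no fees or over-allotment)."""
    return offering["shares_m"] * offering["price"]


# How the November offering compares with the August one.
share_ratio = november["shares_m"] / august["shares_m"]
price_ratio = november["price"] / august["price"]

print(f"November/August share count: {share_ratio:.2f}x")
print(f"November/August price:       {price_ratio:.2f}x")
print(f"August gross proceeds:   ${gross_proceeds_m(august):.1f}M")
print(f"November gross proceeds: ${gross_proceeds_m(november):.1f}M")
```

Under that simple product, the smaller November offering grosses more than the larger August one.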

PacBio has also refreshed their C-suite.  Genomics instrumentation trailblazer Michael Hunkapiller retired in favor of former Illumina executive Christian Henry (in the full disclosure department: Henry sits on the Board of Directors of my employer) -- which was foreshadowed by him being made Chairman of the Board earlier in the year. Exiting earlier was the Senior Vice President of Research, Michael Phillips, succeeded by an internal promotion.  Susan G Kim, whose previous experience is in tech not biotech, took over as CFO soon after Henry's elevation to CEO.  On the third quarter conference call, Henry spoke extensively about expanding the sales and marketing group with at least a doubling in size.

On the product front, this fall PacBio announced the Sequel IIe, which extends the existing Sequel II instrument with on-board computation.  IIe's built-in silicon can rapidly compute the HiFi reads that are clearly where PacBio thinks the market excitement is, eliminating the requirement (and complexity) to transfer raw data off the instrument to a cluster and then compute a consensus.  This also shaves time off the computation.  On the other hand, this is pretty much all it supports right now -- underscoring that PacBio sees HiFi reads as their prime selling point and not methylation detection.

Across the board the PacBio software now supports uploading data to the cloud.  Cloud capabilities can be easily overhyped, but here it matters.  When queueing multiple samples, there was previously a limit imposed by the onboard storage -- it was easily possible to choose a combination of queued samples and movie times that would overwhelm it.  With offload to the cloud, one has unlimited storage.  Now if only the system supported hot-swapping of samples to enable essentially continuous sequencing!

I watched some of the talks at the Global Summit and if you have a chance it is well worth taking in the scientific talks.  A challenge with not being physically present is that it is harder to pretend that conflicting work duties can be shirked until one's return.  But in any case, the outside scientific presentations demonstrated all sorts of clever uses of HiFi reads, though there certainly were more than a few mentions of other platforms. 15-25 kilobase reads with error rates of 1% or less really open up some cool possibilities.  One paper noted in a company presentation showed a 12 kilobase inversion in monozygotic twins leading to a developmental phenotype.  PacBio also just nailed down a collaboration with Invitae for pediatric epilepsy diagnostics, demonstrating buy-in by an established genome-based diagnostics company.

Where next for the company technically?  I'll engage in some speculation / projection.  Note: I wrote this section and then went back and listened to the third quarter earnings call, so some of my speculations are confirmed by Henry there, but in the most vague terms.

One option is the hardware front -- but a challenging one.  The current flowcells have 8M ZMWs (active sites).  More ZMWs mean more reads and more data per library.  But more ZMWs also demand higher quality optical processing.  Since it is a real time imaging system, more ZMWs not only mean possible flowcell fabrication and image resolution challenges, but also dealing with even more data gushing off the system.  Only a PacBio insider would know the options and full impacts here, but engineering for higher density would probably require re-engineering almost every aspect of the instrument.  Henry did tout the line that they are leveraging semiconductor technology with the ZMWs and further density improvements are planned -- hopefully they won't find, as Ion Torrent did, that increasing density can be hard due to complications not related to semiconductor nanofabrication techniques.

It's important to remember that PacBio greatly slows down DNA polymerase in order to image it.  Polymerases can operate at speeds of 10s of kilobases per hour, yet PacBio uses 30 hour movies to read only a few hundred kilobases (yeah, that statement still seems a bit ridiculous -- only a few hundred kilobases).  Speeding the polymerase back up would shorten the time required for a given number of passes (and therefore a given HiFi quality).  This wouldn't seem an easy route either, probably requiring further engineering on the polymerase and facing the challenges of acquiring data rapidly.
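
To make the relationship between polymerase speed, movie time and number of passes concrete, here is a rough back-of-envelope sketch. It assumes a constant effective rate of 10 kilobases per hour (roughly what a few hundred kilobases in a 30 hour movie implies), treats each pass as one trip across the insert, and ignores adapters, pausing and polymerases that die early. The function names and the 15 kilobase insert (the low end of the HiFi read lengths mentioned earlier) are illustrative choices, not PacBio specifications.

```python
def passes_in_movie(insert_kb, movie_hours, rate_kb_per_hour):
    """Approximate number of passes over an insert during one movie.

    Assumes a constant polymerase rate and ignores adapter sequence.
    """
    return movie_hours * rate_kb_per_hour / insert_kb


def movie_hours_needed(insert_kb, passes, rate_kb_per_hour):
    """Movie time needed to reach a target number of passes (same assumptions)."""
    return insert_kb * passes / rate_kb_per_hour


# Assumed effective rate: about 300 kb read in a 30 hour movie.
current_rate = 10.0   # kb per hour (assumption)
insert = 15.0         # kb, low end of the 15-25 kb HiFi range

baseline_passes = passes_in_movie(insert, 30.0, current_rate)
print(f"Passes in a 30 hour movie: {baseline_passes:.0f}")

# Hypothetical faster polymerases: same pass count, shorter movie.
for speedup in (1, 2, 3):
    hours = movie_hours_needed(insert, baseline_passes, current_rate * speedup)
    print(f"{speedup}x rate -> {hours:.0f} hours for the same number of passes")
```

In this simple model, movie time for a fixed number of passes scales inversely with polymerase speed, which is why even a modest speed-up would be attractive if the imaging could keep up.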

PacBio has historically spent a lot of effort increasing the lifetime of the polymerase during sequencing; a major drag on yield has been polymerases dying from photodamage.  I don't know how much more improvement is possible here.  Henry did talk on the earnings call about advancement of chemistry for higher yield, so perhaps there is still more to be gained here.  The other chemistry bit in the past was improving loading efficiency -- the fraction of available ZMWs with one and only one DNA molecule loaded.  There probably isn't a lot of headroom here; PacBio has optimized this extensively in the past.  But not a lot isn't zero, so perhaps further work can occur here.
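
As a point of reference for the loading question, here is a small sketch of what purely random loading would give. It assumes the number of DNA molecules per ZMW follows a Poisson distribution, which is a textbook idealization and not a description of PacBio's actual loading process; the function name is hypothetical.

```python
import math


def single_occupancy_fraction(mean_molecules_per_zmw):
    """Fraction of ZMWs holding exactly one molecule under Poisson loading.

    P(k = 1) = lam * exp(-lam), where lam is the mean molecules per ZMW.
    """
    lam = mean_molecules_per_zmw
    return lam * math.exp(-lam)


# Under-loading leaves wells empty; over-loading creates multiply loaded wells.
for lam in (0.25, 0.5, 1.0, 2.0):
    frac = single_occupancy_fraction(lam)
    print(f"mean {lam:.2f} molecules/ZMW -> {frac:.1%} singly loaded")
```

Under this idealization the singly loaded fraction peaks at a mean of one molecule per ZMW, at 1/e or about 37%, so any system that does better than that is doing something smarter than random loading.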

Besides the sequencing chemistry, PacBio has made significant improvements over the years in library prep technology.  Their ultra-low input library prep enables libraries from tens of nanograms of DNA, with several demonstrated examples of sequencing from single insects or other small invertebrates.  Further development here might include simplifying the workflow further and moving library prep onto automation. On the earnings call, Henry talked about both lowering sample input and automating library prep.

One I thought of after the earnings call that wasn't mentioned: what about the basecalling?  PacBio has never been chatty about their baseline basecalling, whereas that's an angle ONT has milked for a long time.  It is of course a different sort of signal, but perhaps there's some improvement to be mined.  Any improvement in first pass basecalling helps the CLR (Continuous Long Read) applications as well as any HiFi attempts that yield only low numbers of passes.

So PacBio is independent and re-invigorated.  They will continue to try to pull applications and mindshare away from Illumina while going elbow-to-elbow with Oxford Nanopore in the long read space while looking over their shoulder at any new long read technologies entering the market.


9 comments:

David Eccles said...

Polymerases can operate at speeds of 10s of kilobases per hour, yet PacBio uses 30 hour movies to read only a few hundred kilobases

Let's see. What happens if we multiply 10 kilobases per hour by 30 hours? We get... 300 kilobases!

As I've mentioned previously (I think on this blog), I think PacBio's strength will be in delivering a great high-accuracy, single-technology genome assembly solution. Not the cheapest, not the best overall quality / accuracy, but the simplest to manage.

PacBio's current technology has a fundamental limitation, which is that it relies on DNA synthesis to work. That makes it three-bases-per-second slow (which Illumina deals with by massively parallelising sequencing), and limited by the chemicals used in the synthesis reaction (i.e. fluorescing A/C/G/T molecules). The first point puts a hard physical limit on the yield per ZMW, and the second point puts a hard physical limit on the amount of information that can be obtained from the sequencing process (which makes non-standard bases and base modification difficult to detect).

Henri said...

Regarding: "Any improvement in first pass basecalling helps the CLR (Continuous Long Read) applications as well as any HiFi attempts that yield only low numbers of passes." I am under the impression that the basecalling is already close to the theoretical maximum; the problem might be that what you 'see' is not what is actually incorporated. Some excitation might happen randomly(?), and some bases start to be incorporated and are excited, but are then removed or given up on by the polymerase. That's at least how I understood it. Next, the single-pass accuracy seems to have leveled off for some time now, but you never know what one can do. Maybe they should make the raw movies open source and organize a basecalling hackathon.

Liang Zong said...

In Chinese proverb we call this "塞翁失馬,焉知非福", which literally means when you lose your horse, maybe you get fortune afterwards. But look back, the renewed energy of PB proved why ILMN wanted her, and for us, the PB customer, is also good news.

Keith Robison said...

I really should have looked at polymerase speeds better -- that's what I get for looking at one polymerase (Taq) and taking the low end (and yes, I did mean tens as in 30 kilobases/hour -- which would mean a 10 hour run instead of 30 hours). The SD derivative of Bst polymerase operates at 2 kilobases per minute -- so hundreds of kilobases per hour.

David Eccles said...

I expect PacBio would be eager to increase their synthesis speed, but suspect it's unlikely to be a low-hanging fruit given how little it has changed over the time they've been around.

For example, there may also be physical constraints to the sequencing speed, with regards to the time the base needs to be in the focal point of the ZMW in order to produce enough fluorescence signal above the background, as well as needing a polymerase that's stable enough to handle chonky fluorescent nucleotides.

My guess at the most likely improvement would be increasing the number of ZMWs per flow cell. I could imagine a future iteration where there's a tradeoff between sequence length and yield (e.g. 8M x 50kb vs 800M x 5kb).

Anonymous said...

Swapping polymerase now probably isn't easy given they've invested so much in modifying this one for photodamage mitigation... I thought it was Phi29 though and Phi29 is, from a quick google search, capable of 50-200 b/s. So they appear to have room to take the brakes off

Anonymous said...

ONT is well above Poisson loading of their flowcells. Since the PB consumable is an active chip they should have headroom there with electrophoretic pull-down. The biggest bang for the buck is, IMO, finding a way around one-and-done

Anonymous said...

I know non-compete clauses apparently cannot be enforced in Cali, but one member of the team that oversaw the PacBio negotiations and signed off on that ridiculous sweetheart deal has now joined PacBio as COO. So the guy who rescued PacBio's finances by fleecing Illumina with nothing to show for it, btw. potentially neglecting his fiduciary responsibilities towards Illumina, gets rewarded by the rescued party.
One wonders how the CMA got all these Illumina internal documents that they refer to when they nixed the acquisition...
Smells fishy to me. And don't get me started on the valuation of PacBio after ARKG and Softbank drove up the price to "CRAZY".

Anonymous said...

CMA and FTC are allowed full disclosure so getting documents really isn't that fishy, they just ask...