Tuesday, April 30, 2024

A Peek At QuantumSI's Protein Sequencer

A number of academic labs and startups have been trying to build new ways of sequencing large numbers of peptides in parallel, using schemes whose logic significantly resembles the highly parallel DNA sequencing schemes often highlighted in this space; QuantumSI is the first (and so far only) such company to actually commercialize one.  These approaches resemble NGS but are not identical to it - for a few important reasons.

The biggest challenge is the lack of anything resembling Watson-Crick basepairing in proteins. Sequencing chemistries almost invariably rely on basepairing, with the notable exceptions of Maxam-Gilbert reactions and nanopore sequencing.  Even ONT's scheme ends up leveraging basepairing at times, such as in the sequencing adapters and the various incarnations of double-stranded sequencing (2D, 1D^2, duplex). And very notably, there is not and probably will never be an equivalent of PCR for peptides; any peptide sequencing technology will inherently be a single-molecule approach.

Furthermore, peptide management enzymology just isn't as well developed.  There are some known proteases with degrees of specificity, but nothing like the wide catalog of restriction enzymes you can order from NEB or other vendors.  There are no polymerases of course, but even tools like ligases just don't have as wide a scope - though again, ligations are often driven by some basepairing.  Nature didn't make this space easy!

For these reasons, nearly all of the proposed chemistries are degradative in nature, with nanopore direct reading of peptides making up the rest. N-terminal degradation is an old concept; Edman developed his chemistry around the same time Fred Sanger was first solving the sequence of a protein (insulin) about 70 years ago.  Performing such analysis on single peptides, rather than pools, will clearly be challenging - though it does eliminate the phasing problem and the problem of dealing with mixed populations of input peptides, such as we did in a paper back yonder.

So the general concept will be to digest proteins into peptides, likely with trypsin, tether those peptides to a solid surface by their C-termini and then progressively read each N-terminal amino acid followed by removal of that terminal amino acid to expose the following one.
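To make that flow concrete, here is a minimal Python sketch of the first and last steps. It assumes the textbook trypsin rule (cut after lysine or arginine unless the next residue is proline) and ignores missed cleavages; the function names and the example sequence are mine, not anything from QuantumSI.

```python
import re

def tryptic_peptides(protein, min_len=1):
    """Split a protein at tryptic sites: after K or R, unless followed by P."""
    # Zero-width split: lookbehind for K/R, negative lookahead for P.
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    # A cut site at the very end leaves an empty trailing piece; drop it.
    return [p for p in pieces if len(p) >= min_len]

def read_order(peptide):
    """Yield residues in the order a degradative N-terminal read exposes them."""
    remaining = peptide
    while remaining:
        yield remaining[0]          # the current N-terminal residue is read...
        remaining = remaining[1:]   # ...then removed to expose the next one

# Made-up example sequence; note the R followed by P is not cut.
for pep in tryptic_peptides("MKWVTFISLLRPAGK"):
    print(pep, list(read_order(pep)))
```

Every peptide from such a digest ends in K or R (apart from the protein's own C-terminal piece), which is handy when the C-terminus is the end being tethered.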

One idea for next-gen protein sequencing, with one example pursuer Encodia, is to try to build what is in effect a "reverse translatase" - progressively disassemble a protein and encode the released amino acids as DNA to be sequenced on a high throughput sequencer.  Each amino acid is coded back into DNA using some sort of code words, based on oligo-tagged recognizers.  One challenge with such a concept is the difficulty of distinguishing closely related amino acids, with leucine vs. isoleucine perhaps the most tricky.  The next is that each amino acid must have its own recognizer.  Of course, it might be acceptable to have some compression - maybe isoleucine and leucine aren't distinguished and that is dealt with in downstream search software.  But, even if the amino acid sequence space must, by necessity, be compressed, the total space of interest is huge if common post-translational modifications are desired to be in scope.  And many of these modifications may complicate the selection of recognizers.
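The compression idea is easy to sketch in Python: collapse leucine and isoleucine to a single symbol before any lookup and let the search report every database peptide that collides. The symbol "J" and the toy peptides are illustrative choices of mine, not anything Encodia or QuantumSI has described.

```python
def compress(seq, merged=("I", "L"), symbol="J"):
    """Collapse residues that cannot be told apart into one symbol."""
    return "".join(symbol if aa in merged else aa for aa in seq)

def build_index(peptides):
    """Map each compressed sequence to all database peptides sharing it."""
    index = {}
    for pep in peptides:
        index.setdefault(compress(pep), []).append(pep)
    return index

# Toy database: the first two entries differ only by I/L swaps.
database = ["ELVISK", "EIVLSK", "LIVESK"]
index = build_index(database)

# A read whose I/L calls are unreliable still lands on the right candidates.
hits = index.get(compress("ELVLSK"), [])
print(hits)
```

The price of compression is ambiguity: the lookup returns both I/L variants and something downstream has to decide whether that matters.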

QuantumSI is detecting the recognizers directly using optics. Importantly, they are using the time domain as well -- something a reverse encoder probably can never leverage. In fact, they use the time domain two different ways.  

First, each recognizer is labeled with dyes with different fluorescent lifetimes but the same absorbance and emission spectra.  This enables a monochrome optical system, and monochrome is always simpler and higher resolution than a polychromatic system.  Put another way, they've shifted possible optical and/or mechanical complexity into the chemical domain.
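As a rough illustration of why lifetime alone can identify a dye, here is a Python sketch under some loud assumptions: each dye decays as a single exponential, the photon delays after an excitation pulse are available, and the dye names and lifetimes are invented. This is not QuantumSI's actual signal processing, just the underlying idea.

```python
import random

# Hypothetical dyes and lifetimes in nanoseconds (illustrative values only).
DYE_LIFETIMES_NS = {"dye_A": 1.0, "dye_B": 2.5, "dye_C": 4.0}

def estimate_lifetime(delays_ns):
    """For a single-exponential decay, the mean photon delay estimates the lifetime."""
    return sum(delays_ns) / len(delays_ns)

def classify_dye(delays_ns, lifetimes=DYE_LIFETIMES_NS):
    """Assign the dye whose reference lifetime is closest to the estimate."""
    tau = estimate_lifetime(delays_ns)
    return min(lifetimes, key=lambda dye: abs(lifetimes[dye] - tau))

def simulate_photons(tau_ns, n_photons, rng):
    """Draw photon delays after the excitation pulse from an exponential decay."""
    return [rng.expovariate(1.0 / tau_ns) for _ in range(n_photons)]

rng = random.Random(0)  # seeded so the example is repeatable
photons = simulate_photons(DYE_LIFETIMES_NS["dye_B"], 2000, rng)
print(classify_dye(photons))
```

All three dyes could share one excitation source and one emission filter here; only the timing differs, which is the sense in which the complexity has moved from the optics to the chemistry.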

Second, the dynamics of the recognizers binding the N-terminus of a peptide are a key part of the signal. Rather than some sort of 1:1 pairing of recognizers to amino acids, each recognizer will display a certain pattern of binding kinetics with each possible terminal amino acid.  QuantumSI says they can distinguish leucine from isoleucine, as they display different kinetic signals. The biggest advantage is that a small number of recognizers can potentially differentiate a very large number of amino acids - QuantumSI's latest chemistry uses just six recognizers.  They aren't yet claiming decoding all the funky amino acids - from my Millennium life I have not only a love for phosphorylation but also ubiquitination and its kin - but their system may have a shot at many of these without requiring a custom recognizer for each one.
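One way to picture kinetic decoding is a small Python sketch, assuming that a single recognizer's pulse durations are exponentially distributed with a mean that depends on the terminal residue. The recognizer, its mean dwell times and the pulse values below are all invented for illustration; the real signatures and the real classifier are QuantumSI's and are not reproduced here.

```python
import math

# Hypothetical mean pulse durations (seconds) for ONE recognizer
# against three possible N-terminal residues. Invented values.
MEAN_DWELL_S = {"L": 2.0, "I": 0.4, "V": 0.1}

def log_likelihood(pulses, mean_dwell):
    """Log-likelihood of the pulse durations under an exponential dwell model."""
    return sum(-math.log(mean_dwell) - d / mean_dwell for d in pulses)

def call_residue(pulses, table=MEAN_DWELL_S):
    """Pick the residue whose kinetic signature best explains the pulses."""
    return max(table, key=lambda aa: log_likelihood(pulses, table[aa]))

long_pulses = [1.8, 2.5, 1.2, 3.0]    # slow off-rate, looks like L here
short_pulses = [0.3, 0.5, 0.4, 0.2]   # faster off-rate, looks like I here
print(call_residue(long_pulses), call_residue(short_pulses))
```

In this toy model one recognizer separates three residues purely on timing, which is why a handful of recognizers can cover many more amino acids than a 1:1 scheme would.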

A very interesting design choice from QuantumSI is to make their system a single-pot chemistry; there is no chemical cycling as with their corporate cousin 454.bio.  This makes for a much simpler instrument - a great deal of microfluidic complexity avoided - and saves on reagents since none of the expensive components are lost.  Unlike 454.bio, QuantumSI doesn't even need to remove incorporated labels, since they are degrading the analyzed peptides.  

But, this does complicate things.  There's basically always a race going on for access to the N-terminus of each peptide. Recognizers will come and go, but eventually the N-terminal aminopeptidase strides in and clips off an amino acid - and hopefully leaves without clipping another.  In the ideal case, a set of recognizers flit in and out, giving a complex and useful signal, before the clipping - but there's no guarantee of that.  The scheme also seems a nightmare for any homopolymeric stretch - I doubt QuantumSI will be used to count glutamines within huntingtin.  But with looking up in a database, these should be manageable issues -- and the incumbent technique of mass spectrometry has its own challenges.
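The race can be put in numbers with a toy model, sketched below in Python. The assumptions are mine: recognizer binding and cleavage are independent memoryless processes with rates k_bind and k_cut, and cleavage only happens while the terminus is free. Under that model the number of recognition pulses before a cut is geometric, so a residue is clipped with no signal at all with probability k_cut / (k_bind + k_cut). The rates used are made up.

```python
import random

def prob_no_signal(k_bind, k_cut):
    """Chance the cut wins the very first race, so the residue is never seen."""
    return k_cut / (k_bind + k_cut)

def expected_pulses(k_bind, k_cut):
    """Mean number of recognizer pulses before the residue is clipped."""
    return k_bind / k_cut

def simulate_pulses(k_bind, k_cut, rng):
    """Count binding events until cleavage wins one of the repeated races."""
    pulses = 0
    while rng.random() < k_bind / (k_bind + k_cut):
        pulses += 1
    return pulses

rng = random.Random(1)  # seeded so the example is repeatable
trials = [simulate_pulses(9.0, 1.0, rng) for _ in range(20000)]
print(prob_no_signal(9.0, 1.0), expected_pulses(9.0, 1.0), sum(trials) / len(trials))
```

Slowing the cutter relative to the binders buys more pulses per residue but stretches the run, so in this picture the ratio of the two rates is the main tuning knob.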

How simple is the workflow?  QuantumSI says their communications guy ran it.  One hour of hands-on time to digest the sample and click-label the C-termini for attachment to the flowcell, followed by 10 hours of running.  Automation of this workflow is on their development roadmap.

On the recognizer front, QuantumSI has made steady progress.  Their publication in Science used only three recognizers; at launch they had five and the newest kits have six.  This really emphasizes how their kinetic analysis can extract a great deal of data from a small number of recognizers.  Some post-translational modifications can already be detected, though the high value space of detecting phosphorylation is still in development.

On the informatics side, QuantumSI provides a hierarchy of data, with "what proteins are we identifying" on top, counts of individual peptides the next rung down and detailed kinetic information on each residue at the bottom.

If QuantumSI is the Answer, What is the Question?

A core challenge with biological mixtures of proteins is the extreme of dynamic range. For example, with human blood (or serum or plasma) you can remove something like 99.99% of serum albumin and the dominant signal will still be serum albumin.  Solve serum albumin and a new set of abundant proteins must be batted down. The really interesting stuff is many orders of magnitude less abundant than all that.  Which is one of the reasons immunoassays such as home pregnancy tests are so amazing - they detect absurdly dilute targets in a sea of abundant proteins yet can be made cheaply and run with essentially no training.  

Some in the mass spec field have not been shy about pointing out this issue; indeed, some have been downright obnoxious about it. Unless you can sequence enormous numbers of peptides - or figure out some extremely clever ways to deal with those abundant proteins - sequencing approaches will be swamped by boring background.
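A back-of-envelope Python sketch shows how quickly the required read count grows. It assumes each peptide read is an independent draw from the mixture, so the chance of seeing at least one read from a protein making up fraction f of the peptides is 1 - (1 - f)^N after N reads. The fractions below are illustrative, not measurements.

```python
import math

def prob_detect(fraction, n_reads):
    """Chance of at least one read from a species at the given molar fraction."""
    return 1.0 - (1.0 - fraction) ** n_reads

def reads_needed(fraction, confidence=0.95):
    """Smallest N with prob_detect(fraction, N) >= confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-fraction))

# Illustrative fractions: 1 in 100, 1 in 10,000, 1 in a million.
for f in (1e-2, 1e-4, 1e-6):
    print(f, reads_needed(f))
```

Each extra order of magnitude of dynamic range costs roughly another order of magnitude of reads, and that is for a single sighting, not for quantitation.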

QuantumSI's answer to this is to not take on such difficult challenges, at least not yet.  What they are proposing is that biologists for ages have used tools such as Coomassie staining, Western blots and ELISAs to study abundant proteins in simplified mixtures, and QuantumSI can provide higher information content but with workflows that are simple to learn and use.  After all, one drawback to mass spectrometry is that it requires a very expensive set of instrumentation that takes a high degree of training to operate.  Mass spectrometers with associated liquid chromatographs are not something every lab is going to splurge on; doubly so on the mass spectrometrist to go with it.  QuantumSI claims their sample prep workflow is just a simple set of biochemical steps; no chromatography required if your inputs are simple.

At $85K an instrument, QuantumSI certainly isn't going to be as ubiquitous as a simple gel box. Perhaps more seriously, the current instrument processes only two samples at a time, with runtimes of roughly overnight.  That's much less throughput than a simple gel box.  QuantumSI says that for applications so far they are resolving more peptides than required, so expanding the number of samples is high on their priority list.  This also points to another place where the nucleic acids have a leg up - it's really easy to design barcoding schemes for DNA or RNA since we can easily design, synthesize and tack on such barcodes; the equivalent technology isn't well developed for direct peptide reading (the mass spectrometrists do have fancy mass-encoded tags).  But there are already case studies using QuantumSI to read out genetically encoded peptide barcodes, so there's already progress there.

Among applications mentioned by QuantumSI: reading out protein-protein interaction partners detected by immunoprecipitation, verifying protein engineering results, quality control for antibody production, and verifying if an engineered protein mutation is being correctly expressed.  All are applications where the number of abundant proteins is sufficiently low to avoid the signal of interest being swamped out.

QuantumSI commented on the sorts of conferences they've attended and the response.  The Festival of Genomics - I first saw a box in the wild at FOG Boston last autumn - has been very successful, as have other genomics-oriented conferences.  In their view, genomics practitioners are reluctant to invest in mass spectrometers.  They also go to proteomics-oriented conferences and encounter a much more mass spec oriented audience and the skepticism for NGS-like approaches held by that community.  Currently they are selling directly in North America and Europe and using distributors to sell into the Asia-Pacific region.

It will be interesting to watch the further development of this space.  QuantumSI launched at the end of 2022 and is still the only NGS-like protein sequencer that has launched.  The new kits just announced have increased the number of peptides read out by about two- to seven-fold.  Personally, I think having more sample chambers per run is likely to be very popular; nobody ever ran a two lane gel!  And it may take time to identify the "killer apps" which will drive labs to buy into the platform, though even a few splashy publications could create some significant buzz.

A final thought: it's interesting that QuantumSI gets attention at genomics-oriented meetings, but how much low-complexity protein sequencing are genome-focused labs interested in?  Perhaps it is a new direction that some are contemplating branching out in, but in general I don't see the QuantumSI approach - at its current level of sample throughput or tolerance for sample dynamic range - being a frequent companion for high throughput genome sequencing, RNA-Seq or spatial analysis.  There is an apparent fit for smaller scale synthetic biology and protein engineering labs perhaps - it remains to be seen how many such labs will try this technology out.  Rather than core labs, I suspect the better fit for QuantumSI is individual principal investigators or their equivalent in industry.  That is a very diffuse market with weaker network effects to drive adoption (versus genome labs that love to get on the latest bandwagon).

1 comment:

Anonymous said...

Still waiting for this explosion but it doesn’t seem to be catching on at all…. :(