Monday, May 23, 2022

London Calling 2022: Peptide Sequencing

London Calling was last week, and Clive Brown's big revelation was a peek at Oxford Nanopore's progress on enabling peptide sequencing on the platform.  Peptide sequencing and identification is a hot area right now, with multiple startups looking to provide alternatives to mass spectrometry approaches.  Clive stressed that the technology is very early in development.  It's definitely a clever fork of the existing DNA sequencing technology.  However, it also illustrates a significant organizational challenge which Oxford faces.  So I'm going to spend a post focused on this while I figure out how to slice up the rest of the meeting.

I've repeatedly toyed with writing a post proposing that Oxford Nanopore go through some significant splitting in order to maximize their ability to bring new technologies to market, but never quite convinced myself the idea was good.  ONT has been a bubbling pot of innovation, but many of those innovations seem to just disappear back into the pot.  Some of those I shed no tears for, but there are others that perhaps would fare better if they weren't competing for resources with the core DNA and RNA sequencing business.  For example, a number of years back ONT declared that they were jumping into the booming market for alternatives to phosphoramidite DNA synthesis with a "DNA Foundry" -- and nothing was ever heard of it again.

One of the advantages to not splitting up a company is sharing expertise and intellectual property.  Here the peptide sequencing is clearly at home in the parent company.  The approach relies on digesting proteins to peptides, conjugating a DNA tag onto both the N and C termini of the peptide and then using the "outie" nanopore configuration to read signal from the DNA-protein-DNA molecule.  Since outie is used, the possibility for repeatedly reading the same molecule (aka "flossing") is available.  

Signal deconvolution will likely prove complex, with 20 highly abundant amino acids instead of just four nucleotides.  That space grows much larger with post-translational modifications, a few of which are of very high value and should be on ONT's target list from the beginning: phosphoserine, phosphotyrosine and phosphothreonine.  ONT has shown great ability to train machine learning models for these sorts of problems and the model really must be just "good enough" to identify known proteins given a set of peptides -- straight-up de novo peptide sequencing is probably not required.  Identification given a database of proteins is what anyone in the field aspires to.  Nanopore sequencing might also be more quantitative than mass spectrometry, which suffers from the issue that the ability to ionize peptides varies widely by peptide, so the number of ions seen for a given peptide is a complex function of the peptide itself, the upstream processing and the peptide's original abundance.
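
To make the "good enough" point concrete, here is a minimal sketch of database-driven identification that tolerates an imperfect basecall.  Everything in it is my own illustration, not anything ONT has shown: the protein names, the toy database, the one-mismatch tolerance and the simple Hamming-distance matching are all assumptions, and a real pipeline would surely work from signal-level scores rather than called letters.

```python
from collections import Counter

# Hypothetical toy database: protein name -> peptides expected after digestion.
# Names are invented and the peptide strings are purely illustrative.
DATABASE = {
    "PROT_A": ["LVNELTEFAK", "AEFVEVTK", "YLYEIAR"],
    "PROT_B": ["VATVSLPR", "LSSPATLNSR"],
}

def hamming(a, b):
    """Number of mismatched positions, or None if the lengths differ."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))

def identify(reads, database, max_mismatches=1):
    """Count, per protein, the reads that match one of its peptides.

    A read counts only if it points to exactly one protein; reads that
    match nothing or match several proteins are ignored.
    """
    hits = Counter()
    for read in reads:
        matched = set()
        for protein, peptides in database.items():
            for peptide in peptides:
                distance = hamming(read, peptide)
                if distance is not None and distance <= max_mismatches:
                    matched.add(protein)
        if len(matched) == 1:
            hits[matched.pop()] += 1
    return hits

# Second read carries one miscalled residue (R instead of K); last read is junk.
reads = ["LVNELTEFAK", "AEFVEVTR", "VATVSLPR", "GGGGGG"]
print(dict(identify(reads, DATABASE)))
```

The read with a single wrong residue still lands on the right protein, which is the whole argument: the caller doesn't need to be perfect if the database does the disambiguating.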

If successful at this, then the protein sequencing business can leverage all the hardware advances of the DNA sequencing business, such as bigger machines and cheaper flowcells.  The DNA tags front and back also offer the opportunity of multiplexing samples.  So all good advantages to keeping this as just another product line at ONT.

Pretty much everything else argues for hiving off the peptide identification business into some sort of highly independent unit.

First, there is just going to be a completely different world of sample preparation.  Clive expressed his usual disdain for upstream separations, which are a key part of mass spectrometry workflows.  If the nanopore alternative can avoid having to simplify the mixtures kicked out by trypsin (or other site-specific protease) digestion then that would be a huge win.  But these are nanopore devices, and their delicate protein-in-membrane configurations are often degraded by components of real, interesting biological mixtures.  
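
For a sense of what trypsin kicks out, here is a small in silico digestion sketch.  It uses the common textbook rule (cleave after lysine or arginine, but not when the next residue is proline); the function name and the example sequence are made up for illustration, and real digests also have missed cleavages that this ignores.

```python
import re

def trypsin_digest(sequence, min_length=1):
    """Split a protein sequence at idealized trypsin cleavage sites.

    Textbook rule: cut on the C-terminal side of K or R, except when the
    following residue is P. Missed cleavages are not modeled.
    """
    # Zero-width split: position preceded by K/R and not followed by P.
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    # Drop the empty string produced when the sequence ends in K or R.
    return [p for p in pieces if len(p) >= min_length]

# Hypothetical sequence, chosen only to exercise the rule.
example = "AAKRPGGKLLR"
print(trypsin_digest(example))  # ['AAK', 'RPGGK', 'LLR']
```

Note that the R in "KRP" is not cut because proline follows it.  Run this over a whole proteome and the number of distinct peptides gets large quickly, which is why mass spec workflows lean so heavily on upstream separation.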

There's also the dynamic range problem of many interesting samples, most notoriously human blood plasma or serum.  Roughly speaking, if you don't do anything to this what you see is serum albumin.  Lying several orders of magnitude lower are other boring proteins -- it's only after clearing away an awful lot of overburden that one gets down to proteins of biological interest.  There are kits and techniques for this and they'll certainly apply to upstream prep for nanopore -- but that still means upstream prep.  Perhaps this can be automated on the VolTRAX, though that's a device that's hardly electrified the DNA and RNA sample prep space for Nanopore.
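
A back-of-the-envelope sketch shows why the overburden matters for a counting technology.  The abundances below are invented round numbers chosen only to span several orders of magnitude, not measured plasma values, and the model assumes every molecule is equally likely to be read, which is surely too generous.

```python
# Hypothetical relative abundances in arbitrary units; NOT real plasma data.
abundance = {
    "albumin": 1_000_000,
    "boring_protein": 10_000,
    "interesting_protein": 1,
}

def expected_reads(abundance, total_reads):
    """Expected reads per protein if sampling is proportional to abundance."""
    total = sum(abundance.values())
    return {name: total_reads * amount / total for name, amount in abundance.items()}

def reads_needed(abundance, target, min_reads):
    """Total reads required to expect at least min_reads of the target."""
    total = sum(abundance.values())
    # Integer ceiling division avoids floating-point rounding.
    return (min_reads * total + abundance[target] - 1) // abundance[target]

print(expected_reads(abundance, 1_000_000))
print(reads_needed(abundance, "interesting_protein", 10))
```

With these made-up numbers, a million reads would be expected to yield less than one read of the interesting protein, and expecting ten of them takes on the order of ten million reads.  Depletion changes the denominator, which is exactly why upstream prep isn't going away.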

It will also be interesting to see - as in "may you live in interesting times" - how certain post-translational modifications behave or misbehave in the nanopores.  A serum sample is going to have peptides with huge glycosylation modifications hanging off the side, and intracellular samples will sometimes have bits of peptides linked to side chains by isopeptide bonds -- ubiquitin is the best known but there's a small zoo of such modifier peptides.  Trypsin digestion of ubiquitin leaves only two amino acids glommed on; NEDD8 doesn't have any dibasic motifs for trypsin to act on.  

There's also the question of cost -- ONT will of course position this as not requiring expensive hardware, but what will the cost per sample be?  Clive didn't disclose much about the tagging process and my protein chemistry is rusty, but perhaps this uses peptide ligases so as to avoid tagging side chains.  The fancy oligos will add cost as well.  The field of non-mass spec proteomics is evolving and there aren't yet any launched full sequencing products to compare to -- though there are the Olink and SomaLogic specific protein identification kits out there. 

And what are the markets for this?  Let's just park the fantasy of a huge untapped market in do-it-yourself proteomics.  Yes, ONT will sell some kits to hobbyists just like they have for the nucleic acid business, and it won't even be a rounding error on the bottom line, just like the nucleic acid business.  There really should be someone dedicated to deeply understanding the existing markets for protein identification, the performance characteristics demanded by those markets, the fit of all the competing technologies into those markets and, most of all, looking for over-served and under-served markets an Oxford product might slot into.  

So sample prep and characterizing performance in various biological tissues are immediately two spaces where the peptide identification effort must have dedicated minds or it won't succeed on a commercial basis.

But perhaps the even bigger driver of some sort of separation can be seen in London Calling itself.  It's a great meeting and I found myself with pangs of regret for not being there -- though this year I was still testing positive for COVID by rapid antigen around when I probably would have planned to fly out, so it would have been a near-run thing to test out in time to get there.  But LC is filled with mostly the wrong people and the energy is around genomics, not proteomics.  There will certainly be overlap between the community using nanopore to study genomes and transcriptomes and those studying proteomes, but it is a reasonable hypothesis that those in the intersection on the Venn diagram are greatly outnumbered by those outside it.  The peptide crowd and the peptide advances will just be lost at London Calling or Nanopore Community Meeting.  To incubate such a new, different application space there should be an entirely peptide-focused set of conferences where "you can also use it for DNA" is a message very much in the background.

Even more so, Oxford Nanopore has gotten enormous mileage from a symbiotic relationship with a core of MinION Access Program (MAP) participants who were nearly all junior and unknown before MAP but who aggressively developed experimental and computational approaches that let the platform shine.  If ONT can repeat that phenomenon it would boost their fortunes significantly -- but with no disrespect to the MAP superstar alumni crowd, it won't be the same people -- a new crop of unknown-to-star transitions is what is needed.

All of this is a ways in the future.  My guess is that early access to the peptide identification technology is at a minimum a year away, and more likely closer to two years.  So this feels a bit like the 2012 AGBT announcement -- an early reveal to attract attention and that pool of future collaborators.  If you're an early career investigator in the protein space looking for a way to differentiate yourself, it's not a bad bet at all to pay close attention to ONT's progress and to jump on the first opportunity to access this new platform.
