Omics! Omics!: 2024

Tuesday, May 07, 2024

On The Expanding Versatility of Single Molecule Sequencing for Detecting Anomalous DNA

An exciting aspect of true single molecule sequencing has been the detection of methylated bases. Both Oxford Nanopore and Pacific Biosciences technology generate altered signals if methylated bases are present. For Oxford Nanopore this is hardly surprising, as it would seem any change in the DNA should alter the complex interaction with the protein pore and it should become just a computational challenge of recognizing that signal. PacBio is a bit more surprising, but the kinetics of base incorporation are apparently sensitive to the complementary base. I wanted to point out, though without much deep analysis, three recent preprints that demonstrate detection of other modifications to DNA and thereby enable some interesting applications (and of course, some wild speculations on my part). It's also interesting because of the overlap between the papers, as they are interconnected to a degree in their methods.

Wednesday, May 01, 2024

First Illumina Complete Long Reads Preprint

Readers of this space might have detected a significant slant towards skepticism in my coverage of Illumina Complete Long Reads (iCLR), exacerbated by now-deposed Illumina CEO Francis deSouza claiming it isn't a synthetic read technology. Illumina's posters on iCLR at AGBT this year seemed to reinforce my view that Illumina was marketing purely on short-read like terms - call SNPs in a few more hard-to-map regions of the genome, but not really compete head-to-head with the true long read platforms. But now there is a preprint out on medRxiv that reports iCLR results for a Genome In A Bottle (GIAB) sample as well as seven samples from individuals with potential genetic diseases of unresolved cause. The GIAB sample was also sequenced with some of the latest Oxford Nanopore chemistry (Duplex R10.4.1) and as HiFi libraries on PacBio Revio - enabling comparisons of the platforms. The preprint is probably going to be revised and expanded - I'm certainly hoping some of my comments are found constructive - but is very useful to see. And perhaps it will soften positions such as mine on iCLR's utility.

Tuesday, April 30, 2024

A Peek At QuantumSI's Protein Sequencer

A number of academic labs and startups have been trying to build new ways of parallel sequencing of large numbers of peptides using schemes that have significant resemblance in their logic to the highly parallel DNA sequencing schemes often highlighted in this space; QuantumSI is the first (and so far only) such company to actually commercialize in this space. Resemblances to NGS but not identity - for a few important reasons.

The biggest such challenge is the lack of anything resembling Watson-Crick basepairing in proteins. Sequencing chemistries almost invariably rely on basepairing, with the notable exceptions of Maxam-Gilbert reactions and nanopore sequencing. Even ONT's scheme ends up leveraging basepairing at times, such as the sequencing adapters and various incarnations of double-stranded sequencing (2D, 1D^2, duplex). And very notably, there is not and probably will never be an equivalent of PCR for peptides; any peptide sequencing technology will inherently be a single-molecule approach.

Furthermore, peptide management enzymology just isn't as well developed. There are some known proteases with degrees of specificity, but nothing like the wide catalog of restriction enzymes you can order from NEB or other vendors. There are no polymerases of course, but even tools like ligases just don't have as wide a scope - though again, ligations are often driven by some basepairing. Nature didn't make this space easy!

For these reasons, nearly all of the proposed chemistries are degradative in nature, with nanopore direct reading of peptides making up the rest. N-terminal degradation is an old concept; Edman developed his chemistry around the same time Fred Sanger was first solving the sequence of a protein (insulin) about 70 years ago. Performing such analysis on single peptides, rather than pools, will clearly be challenging - though it does eliminate the phasing problem and the problem of dealing with mixed populations of input peptides such as we did in a paper back yonder.

So the general concept will be to digest proteins into peptides, likely with trypsin, tether those peptides to a solid surface by their C-termini and then progressively read each N-terminal amino acid followed by removal of that terminal amino acid to expose the following one.
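As a rough illustration of that concept, here is a minimal in silico sketch. It assumes the common textbook rule for trypsin (cut after K or R unless the next residue is P) and models the readout as nothing more than reporting and then removing the exposed N-terminal residue. The function names and the example sequence are made up for illustration; this is not QuantumSI's software or chemistry.

```python
def trypsin_digest(protein):
    """Split a protein into peptides using the textbook trypsin rule:
    cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Trailing peptide that does not end in K or R
        peptides.append(protein[start:])
    return peptides


def read_n_terminal(peptide):
    """Yield (cycle, residue) pairs: report the exposed N-terminal residue,
    then remove it to expose the next one. The C-terminus is imagined to be
    tethered to the surface, so the peptide shrinks from the N-terminal end."""
    remaining, cycle = peptide, 1
    while remaining:
        yield cycle, remaining[0]
        remaining = remaining[1:]
        cycle += 1


if __name__ == "__main__":
    # Made-up example sequence, not a real protein
    for peptide in trypsin_digest("MKWVTFRPSLLK"):
        print(peptide, list(read_n_terminal(peptide)))
```

A real read would of course be noisy and often incomplete; the sketch only captures the order of operations.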

One idea for next-gen protein sequencing, with one example pursuer Encodia, is to try to build what is in effect a "reverse translatase" - progressively disassemble a protein and encode the released amino acids as DNA to be sequenced on a high throughput sequencer. Each amino acid is coded back into DNA using some sort of code words, based on oligo-tagged recognizers. One challenge with such a concept is the difficulty of distinguishing closely related amino acids, with leucine vs. isoleucine perhaps the most tricky. The next is that each amino acid must have its own recognizer. Of course, it might be acceptable to have some compression - maybe isoleucine and leucine aren't distinguished and that is dealt with in downstream search software. But, even if the amino acid sequence space must, by necessity, be compressed, the total space of interest is huge if common post-translational modifications are desired to be in scope. And many of these modifications may complicate the selection of recognizers.
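The "compression" idea can be sketched in a few lines, assuming the downstream search simply treats leucine and isoleucine as the same symbol. The sketch uses J, the standard one-letter ambiguity code for "Leu or Ile"; the function names and the toy database are hypothetical.

```python
def collapse_leu_ile(sequence):
    """Map both I and L to the ambiguity code J so the two residues
    are treated as indistinguishable during searching."""
    return sequence.replace("I", "J").replace("L", "J")


def find_matches(read, database):
    """Return the names of database proteins that contain the read,
    ignoring any Leu/Ile differences."""
    key = collapse_leu_ile(read)
    return [name for name, seq in database.items()
            if key in collapse_leu_ile(seq)]


if __name__ == "__main__":
    # Toy database with made-up sequences
    toy_db = {"protA": "MKLIVR", "protB": "MKAAVR"}
    print(find_matches("KIL", toy_db))  # the read "KIL" matches protA's "KLI"
```

The cost of the compression is visible in the example: the read and the database entry disagree at two positions, yet they still match.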

QuantumSI is detecting the recognizers directly using optics. Importantly, they are using the time domain as well -- something a reverse encoder probably can never leverage. In fact, they use the time domain two different ways.

First, each recognizer is labeled with dyes with different fluorescent lifetimes but the same absorbance and emission spectra. This enables a monochrome optical system, and monochrome is always simpler and higher resolution than a polychromatic system. Put another way, they've shifted possible optical and/or mechanical complexity into the chemical domain.

Second, the dynamics of the recognizers binding the N-terminus of a peptide are a key part of the signal. Rather than some sort of 1:1 pairing of recognizers to amino acids, each recognizer will display a certain pattern of binding kinetics with each possible terminal amino acid. QuantumSI says they can distinguish leucine from isoleucine, as they display different kinetic signals. The biggest advantage is that a small number of recognizers can potentially differentiate a very large number of amino acids - QuantumSI's latest chemistry uses just six recognizers. They aren't yet claiming decoding all the funky amino acids - from my Millennium life I have not only a love for phosphorylation but also ubiquitination and its kin - but their system may have a shot at many of these without requiring a custom recognizer for each one.
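To make the kinetic-signature idea concrete, here is a toy sketch in which two hypothetical recognizers distinguish four residues purely by how long each stays bound. Every number in the table is invented for illustration, and the nearest-signature rule on log durations is my own simplification, not QuantumSI's actual algorithm.

```python
import math

# HYPOTHETICAL mean bound times (seconds) for two made-up recognizers.
# These values are invented; they are not measured QuantumSI data.
SIGNATURES = {
    "L": {"rec1": 2.0, "rec2": 0.1},
    "I": {"rec1": 0.5, "rec2": 0.1},
    "F": {"rec1": 0.1, "rec2": 1.5},
    "Y": {"rec1": 0.1, "rec2": 0.4},
}


def classify(observed):
    """Return the residue whose signature is closest to the observed mean
    bound times, using squared distance on a log scale."""
    def distance(reference):
        return sum(
            (math.log(observed[rec]) - math.log(reference[rec])) ** 2
            for rec in reference
        )
    return min(SIGNATURES, key=lambda residue: distance(SIGNATURES[residue]))


if __name__ == "__main__":
    # rec1 binds for about half a second: closer to the I signature than L
    print(classify({"rec1": 0.6, "rec2": 0.12}))
```

Note that two recognizers cover four residues here because the call rests on the pattern of durations rather than on which recognizer bound at all.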

A very interesting design choice from QuantumSI is to make their system a single-pot chemistry; there is no chemical cycling as with their corporate cousin 454.bio. This makes for a much simpler instrument - a great deal of microfluidic complexity avoided - and saves on reagents since none of the expensive components are lost. Unlike 454.bio, QuantumSI doesn't even need to remove incorporated labels, since they are degrading the analyzed peptides.

But, this does complicate things. There's basically always a race going on for access to the N-terminus of each peptide. Recognizers will come and go, but eventually the N-terminal exopeptidase strides in and clips off an amino acid - and hopefully leaves without clipping another. In the ideal case, a set of recognizers flit in and out, giving a complex and useful signal, before the clipping - but there's no guarantee of that. The scheme also seems a nightmare for any homopolymeric stretch - I doubt QuantumSI will be used to count glutamines within huntingtin. But with looking up in a database, these should be manageable issues -- and the incumbent technique of mass spectrometry has its own challenges.

How simple is the workflow? QuantumSI says their communications guy ran it. One hour of hands-on time to digest the sample and click-label the C-termini for attachment to the flowcell, followed by 10 hours of running. Automation of this workflow is on their development roadmap.

On the recognizer front, QuantumSI has made steady progress. Their publication in Science used only three recognizers; at launch they had five and the newest kits have six. This really emphasizes how their kinetic analysis can extract a great deal of data from a small number of recognizers. Some post-translational modifications can already be detected, though the high value space of detecting phosphorylation is still in development.

On the informatics side, QuantumSI provides a hierarchy of data, with "what proteins are we identifying" on top, counts of individual peptides the next rung down and detailed kinetic information on each residue at the bottom.

If QuantumSI is the Answer, What is the Question?

A core challenge with biological mixtures of proteins is the extreme of dynamic range. For example, with human blood (or serum or plasma) you can remove something like 99.99% of serum albumin and the dominant signal will still be serum albumin. Solve serum albumin and a new set of abundant proteins must be batted down. The really interesting stuff is many orders of magnitude less abundant than all that. Which is one of the reasons immunoassays such as home pregnancy tests are so amazing - they detect absurdly dilute targets in a sea of abundant proteins yet can be made cheaply and run with essentially no training.
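A back-of-the-envelope sketch of the albumin problem, assuming plasma albumin at roughly 40 mg/mL (an approximate typical value) and a purely hypothetical protein of interest at 10 pg/mL. Exact fractions are used so the arithmetic isn't muddied by floating point rounding.

```python
from fractions import Fraction

ALBUMIN_PG_PER_ML = 40 * 10**9  # ~40 mg/mL in pg/mL (approximate typical value)
TARGET_PG_PER_ML = 10           # HYPOTHETICAL rare protein, chosen for illustration


def remaining_after_depletion(concentration, fraction_removed):
    """Concentration left over after removing the given fraction."""
    return concentration * (1 - fraction_removed)


albumin_left = remaining_after_depletion(ALBUMIN_PG_PER_ML, Fraction(9999, 10000))
excess = albumin_left / TARGET_PG_PER_ML

if __name__ == "__main__":
    print(f"Albumin after 99.99% depletion: {int(albumin_left):,} pg/mL")
    print(f"Mass excess over the rare target: {int(excess):,}-fold")
```

Under those assumptions, four nines of depletion still leaves albumin outweighing the rare target by several hundred thousand fold, before even considering the other abundant proteins.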

Some in the mass spec field have not been shy about pointing out this issue; indeed, some have been downright obnoxious about it. Unless you can sequence enormous numbers of peptides - or figure out some extremely clever ways to deal with those abundant proteins - sequencing approaches will be swamped by boring background.

QuantumSI's answer to this is to not take on such difficult challenges, at least not yet. What they are proposing is that biologists for ages have used tools such as Coomassie staining, Western Blots and ELISAs to study abundant proteins in simplified mixtures, and QuantumSI can provide higher information content but with workflows that are simple to learn and use. After all, one drawback to mass spectrometry is it requires a very expensive set of instrumentation that requires a high degree of training to operate. Mass spectrometers with associated liquid chromatographs are not something every lab is going to splurge on; doubly so on the mass spectrometrist to go with it. QuantumSI claims their sample prep workflow is just a simple set of biochemical steps; no chromatography required if your inputs are simple.

At $85K an instrument, QuantumSI certainly isn't going to be as ubiquitous as a simple gel box. Perhaps more seriously, the current instrument processes only two samples at a time, with runtimes of roughly overnight. That's much less throughput than a simple gel box. QuantumSI says that for applications so far they are resolving more peptides than required, so expanding the number of samples is high on their priority list. This also points to another place the nucleic acids have a leg up - it's really easy to design barcoding schemes for DNA or RNA since we can easily design, synthesize and tack on such barcodes; this technology isn't well developed for direct peptide reading (the mass spectrometrists do have fancy mass-encoded tags). But there are already case studies using QuantumSI to read out genetically encoded peptide barcodes, so there's already progress there.

Among applications mentioned by QuantumSI: reading out protein-protein interaction partners detected by immunoprecipitation, verifying protein engineering results, quality control for antibody production, and verifying if an engineered protein mutation is being correctly expressed. All applications where the number of abundant proteins is sufficiently low to avoid the signal of interest being swamped out.

QuantumSI commented on the sorts of conferences they've attended and the response. The Festival of Genomics - I first saw a box in the wild at FOG Boston last autumn - has been very successful, as have other genomics-oriented conferences. In their view, genomics practitioners are reluctant to invest in mass spectrometers. They also go to proteomics-oriented conferences and encounter a much more mass spec oriented audience and the skepticism for NGS-like approaches held by that community. Currently they are selling directly in North America and Europe and using distributors to sell into the Asia-Pacific geography.

It will be interesting to watch the further development of this space. QuantumSI launched at the end of 2022 and is still the only NGS-like protein sequencing that has launched. The new kits just announced have increased the number of peptides read out by about two to seven fold. Personally, I think having more sample chambers per run is likely to be very popular; nobody ever ran a two lane gel! And it may take time to identify the "killer apps" which will drive labs to buy into the platform, though even a few splashy publications could create some significant buzz.

A final thought: it's interesting that QuantumSI gets attention at genomics-oriented meetings, but how much low-complexity protein sequencing are genome-focused labs interested in? Perhaps it is a new direction that some are contemplating branching out in, but in general I don't see the QuantumSI approach - at its current level of sample throughput or tolerance for sample dynamic range - being a frequent companion for high throughput genome sequencing, RNA-Seq or spatial analysis. There is an apparent fit for smaller scale synthetic biology and protein engineering labs perhaps - it remains to be seen how many such labs will try this technology out. Rather than core labs, I suspect the better fit for QuantumSI is individual principal investigators or their equivalent in industry. That is a very diffuse market with weaker network effects to drive adoption (versus genome labs that love to get on the latest bandwagon).

Tuesday, April 23, 2024

Bruker Wins NanoString Auction

NanoString declaring bankruptcy on the eve of 2024's edition of AGBT was a shock to many at the meeting and then there was confusion: would one of the sponsors have a dark booth? The aggressive 10X Genomics legal strategy that forced the bankruptcy raised a degree of polite ire. But NanoString marketing carried on and CSO Joe Beechem delivered a fiery speech saying "we're not going anywhere". Then an investment firm, Patient Square Capital, appeared to be the front runner for acquiring the assets, with speculation they would combine NanoString with their other spatial omics portfolio company, Resolve Biosciences. But last week, as the genomics world was still processing PacBio's turmoil, news broke that Bruker had significantly outbid Patient Square - $392.6M vs $220M. So Bruker takes NanoString home - and it gives me an entree to float an ontology of spatial technologies I've been fermenting, as Bruker will now have instruments in the four major spatial approaches. And 10X now has a more formidable opponent in the ongoing patent wars.

Wednesday, April 17, 2024

PacBio Plummets

PacBio announced preliminary earnings yesterday, and the nearly immediate result was a 50% plunge in their share price. Along with the earnings, the company announced significant cost cutting. The details of those cuts were not made available, but some clever tea leaf parsers noted a significant omission from what the company said it would continue. The ASeq Discord channel on PacBio absolutely blew up, with opinions ranging from PacBio is in a death spiral to PacBio must be for sale, with significant numbers of "Christian Henry won't be CEO by year's end".

Wednesday, April 10, 2024

Thoughts on RNU4-2 Mutation Paper

A new preprint based on Genomics UK data has identified a set of single base insertion mutations (predominantly a specific A insertion) in a spliceosomal RNA which is responsible for about 0.5% of previously undiagnosed genetic cases of syndromic neurodevelopmental disorders. That's a remarkably high frequency mutation which has gone unnoticed to date, but the fact it was hiding in a non-protein-coding RNA (a spliceosome component called RNU4-2) had much to do with that - this gene won't be in any exome panels. The mutation always appears to be de novo and therefore the pathogenic phenotype is dominant. I'd like to write down a few other thoughts - mostly in the form of questions - with the caveat that I've never worked on a rare disease project and to describe me as a detached armchair voyeur of the field would be far too generous.

Thursday, March 28, 2024

Post-AGBT: VizGen & Scale Biosciences Partner

It's been just a few weeks since I sat poolside at AGBT with VizGen CEO Terry Lo and Scale Biosciences CEO Giovanna Prout to discuss the two companies' new partnership. Well, that would have been accurate about a month ago; getting the last AGBT threads together has been buried under post-AGBT day work, some family business, another vacation - and let's be serious, mega-scale procrastination and writer's block (and that's just a euphemism here for more procrastination). But that shouldn't detract from what these two RNA (and more!) profiling companies are trying to build together. Plus this is my last "Post-AGBT" tag for the year; now I can move on to "inspired by AGBT" that is a bit less tied to the meeting (and less obviously overdue).

Monday, March 11, 2024

BioNano In Peril Again

While I still have a pair of pre-AGBT and AGBT interviews to write up - plus a long list of post ideas inspired by AGBT - breaking news about BioNano Genomics takes precedence. The company has announced a major restructuring, with about 30% of its employees being laid off. I've been laid off twice and it's never enjoyable, so I hope what I write here is appropriately sensitive - but won't be surprised if I still commit a faux pas. Even with the restructuring, one analyst who likes BioNano estimated they will have about three quarters of cash - this is indeed a perilous time.

Thursday, February 29, 2024

Post-AGBT: Sequencing Hardware Roundup

Some updates on the sequencing instrument vendors, save Ultima Genomics and Element Biosciences which I've covered already.

Post-AGBT: Element AVITI Sequencing Updates

Element has been very busy over the past year and in the Silver Sponsor presentation covered updates since last AGBT as well as a number of completely new items. I covered their Teton approach to multiomic analysis of cell culture in the last piece; in this one I'll cover their sequencing platform evolution. Element was kind enough to loan me key members of their technical braintrust for an hour in the week before AGBT, which sadly I repaid by allowing their lunch to be scheduled over. Thankfully, they do have a recording available!

Tuesday, February 27, 2024

Post-AGBT: Both Element & Singular Want Spatial to Go With The Flow(cells)

Element Biosciences and Singular Genomics have often appeared to be on roughly parallel trajectories, though with key differences. Both companies launched sequencing instruments with NextSeq 2000-like specifications and largely aimed at the academic core lab and small biotech company market. At AGBT, both announced upgrades to their sequencing instruments that allow the instrument to perform spatial omics while still functioning as a sequencer. But there are key differences in their approach and what we know about each company and their degree of success so far in the sequencer market.

Tuesday, February 20, 2024

AGBT Follow-up: Ultima Genomics UG100, Volta Labs Callisto, N6Tec iconPCR

A confusion of ideas for AGBT follow-up have collided with the inevitable post-AGBT return-to-ordinary-life requirements. To try to avoid a huge project that never gets completed, I'm breaking these up into multiple pieces. First off, a look at reaction to the three big pieces I wrote before the conference or early during the conference: Ultima Genomics, Volta Labs Callisto and N6Tec iconPCR. My comments are based on further thoughts on my part, discussions with other AGBT attendees and feedback I've gotten via social media, blog comments and emails/DMs. Please keep it coming! One of the great values of writing this is getting feedback - it illuminates questions I haven't considered and highlights gaps in my thinking.

Wednesday, February 07, 2024

VoltaLabs Launches Callisto for DNA Extraction & Library Prep

Here at AGBT, VoltaLabs has unveiled their 24-sample DNA extraction and NGS library prep Callisto instrument, which is particularly suited for long read applications but is also suited for short read work. Volta has matured liquid handling automation to a novel open top electrowetting technology. Priced at $125K and planning to ship in the second quarter, Callisto is designed as a walk-away solution requiring no human interaction during a run. Personally, not only do I love a new medium-throughput instrument for HMW DNA extraction and manipulation, but I also can at least pretend I helped steer the company in that direction.

Tuesday, February 06, 2024

iconPCR: Super-Flexible qPCR Thermocycler Oft Dreamed, Now Delivered

Has there ever been a product you’ve just wanted to have, but it doesn’t exist? That keeps popping up in discussions - “if only we had X this project would go so much faster!”. Well, N6 Tec’s automation-friendly $99K 96-well iconPCR thermocycler is that to me. Launching at AGBT, it’s the gadget I’ve wanted repeatedly at Codon Devices, Warp Drive Bio and now Ginkgo Bioworks. It won’t solve all your PCR challenges, but it certainly gives new options to customize PCR like never before. And for many NGS labs, it offers major streamlining of PCR-based library construction protocols while also delivering superior data. How? By being a thermocycler where every well can run its own thermal profile and each well can go dormant once a desired level of amplification is achieved.

Monday, February 05, 2024

Want to Build A Sequencer? 454.bio Opens Up Their Plans

Just as the AGBT hype cycle was firing up (with me contributing multiple sparks), serial entrepreneur Jonathan Rothberg's latest sequencing startup 454.bio fully de-stealthed their technology this weekend, going so far as to release open source plans to build an instrument prototype. 454.bio is aiming to build a Keurig-sized device to retail for $100, with sequencing runs in the $20 range. To accomplish this, they're attempting a novel twist on sequencing-by-synthesis. It's an unconventional strategy by someone who has succeeded twice before in DNA sequencing (454 and Ion Torrent) and has multiple other companies going (if I've counted correctly) - QuantumSI in protein sequencing (a future topic for this space, I promise!), ButterflyNetworks with inexpensive, compact diagnostics ultrasound and Hyperfine with inexpensive, compact MRI diagnostic devices. Then I went to the 4Catalyzer site - Rothberg's incubator - and discovered a bunch of companies I hadn't heard of or had forgotten about -- Protein Evolution in synthetic biology for plastics production, Detect for home-based diagnostics instruments, AI Therapeutics in the rare disease space and Liminal with what looks like consumer brain scanning. That's quite a series of companies! But the one closest to my heart (sorry QuantumSI :-) is 454.bio, and their announcements have many interesting facets which I'll dive into.

[2024-02-06 01:41 - 'used"--> iSeq fix -- stupid autocorrect!]

Thursday, February 01, 2024

Ultima Launches

As part of the run-up to Gold sponsorship at AGBT, Ultima Genomics held a multi-day event in early December, with tours of the headquarters facility and factory floor in the Bay Area and a day at a beautiful Wine Country resort. The resort session included talks from the company, early access collaborators and a pair of big name early backers, with a few hundred current customers and many contemplating the leap. So confident was the company in their product, they even invited a blogger to moderate one of the panel discussions! The UG100 is now officially launched as a fully commercial product, with ambitions to replace panels, exomes and microarrays with whole genome sequences at $100 apiece. All in an instrument package designed for continuous industrial-scale operation. Please note that Ultima did review this piece to ensure I didn’t disclose information they did not wish public, but for the most part just gave me some very good proofreading support. Photos are my own, except as noted.

Monday, January 29, 2024

On Illumina's Moats Past & Present

Studying how Illumina came to dominate sequencing markets is certainly worthy of at least a Harvard Business School case study, and perhaps an entire graduate thesis. But I wanted to give a quick review of some of my thoughts on the matter, spurred by Nava Whiteford's repeated savaging of a piece in another space but also because many of these themes will show up in a flurry of pieces I'm planning (one's even nearly done!) in the next few weeks due to AGBT and some non-AGBT news.

Friday, January 05, 2024

2024: A Look Ahead

It's January, and that means the J.P. Morgan Healthcare Conference looms next week -- followed by AGBT just a month later. Indeed, I've been trying to mark out the "can't miss" talks for AGBT so I can resist over-scheduling them with meet-ups -- but many talks lack titles so that's not easy. JP Morgan seems to have Illumina, 10X and NanoString -- and not much else in the way of sequencing-space companies. But time to prognosticate before all the news happens!