Monday, June 02, 2025

Roche Gives SBX Updates - and a Name!

Last week I double-dipped on conferences, going from London Calling to the European Society of Human Genetics (ESHG) meeting in Milan.  I have a raft of notes and ideas from these, which I'll try to spool out over the next week or two before jumping to a long list of more whimsical ideas I've jotted down.  First up are some updates on Roche's SBX sequencing technology, which has now been christened Axelios, as Nava Whiteford reported in his ASeq newsletter.

Wednesday, May 21, 2025

Oxford Nanopore Should Spin Out Protein Sequencing

I've toyed with writing something on these lines for a long time but never quite pulled the trigger.  But the more I think about it, the more imperative my logic feels for spinning off the nascent protein sequencing effort.  I actually finally peeked at the agenda for the meeting and University of Washington's Jeff Nivala is basically closing the meeting with an update on his work in this space.  

Tuesday, May 20, 2025

London Calling 2025: What I'm Thinking About

London Calling fires up for real on Wednesday; Tuesday has the training courses I haven't signed up for.  I don't have any grand predictions, but rather some thoughts of things I am trying to keep my antennae particularly tuned for.  Sly tips of course are always welcome: I can be DMed on Discord, LinkedIn, or X, or you can email me at keith.e.robison at Gmail.com.

Monday, May 19, 2025

Clive Brown At ONT: A Belated Retrospective

London Calling is imminent, and this is a notable one: the first held without Clive Brown in an official capacity at Oxford Nanopore. I started drafting a piece on Clive's tenure at ONT as soon as Nava Whiteford first broke the rumor (soon confirmed) he was leaving - that was back in November.  But first there was writer's block and then a pair of elderly relatives had health crises, only one of which resolved desirably, and then the piece got stuck in my procrastination queue.  But London Calling was an absolute deadline I set for myself and here we are.

One contributor to my discarding early drafts was trying to set the right balance.  Clive was by far the most visible leader at ONT and so to him goes much kudos but also much criticism.  Of course, in many cases it was someone else who deserves the credit or the debit, or at least the story is complicated.  But I was never, as the song goes, in the room where it happened, so I am sadly blind to such nuance.  It's also the case that Clive of early ONT is almost certainly not Clive of late ONT, and some of the foibles I dredge up might not be repeated.  But, they are part of the story.

It cannot be over-emphasized that under Clive's technical leadership Oxford Nanopore condensed an incredible (as in, not to be believed) concept into an incredible (as in, OMG!) working sequencing technology. ONT sequencers enable sequencing about any place on this planet (and even above this planet!) a human can go due to the compact MinION design as well as its lack of moving parts, have stunningly low capital cost and are simple enough that even middle school students can be trained to sequence.  The early "barely can align" data quality has advanced by leaps-and-bounds so that many reads have error rates of under 1%. Data yields early in MAP were low tens of megabases; MinION flowcells now deliver a few tens of gigabases and the PromethION flowcells significantly more.  I was hooked seeing a full length lambda phage of 48 kilobases in our first MinION Access Program (MAP) run; the world record ONT read length is almost 100-times that - and no other technology reliably generates high accuracy reads of even 48 kilobases.

Clive's team did that.  He built the technical organization to make magic real.  Now, it must be said that being a member of that organization came with special requirements.  Some staff stayed a long time - but some did not.  At the first Nanopore Community Meeting at the New York Genome Center, ONT had downplayed the idea that Clive would break any technical news - and all of us customers were madly scribbling notes and photographing slides as he blatantly violated that guidance.  When it was over, one looked around to realize that ONT employees had done the same and some very much had "deer in the headlights" faces.  Indeed, mutterances of "first I've seen of that" or "there goes all my timelines" were heard afterwards.

Eventually, ONT settled on a solution to allow Clive to be minimally restrained but the company minimally exposed to his wilder extemporaneous comments.  After Clive's talk (which led with a disclaimer that he might color outside the lines), Rosemary Dokos would be the voice of reason and carefully shepherd expectations into what the company was actually committing to.

Under Clive's leadership, the MAP had some interesting aspects as well.  Clive has always projected a vision of the individual biological pioneer, in many cases an amateur with no training or background, casually using ONT gear to explore the world.  A parallel is often drawn, often dangerously bordering (or crossing into) fetishism, to the early days of personal computing.  I know that era: my first taste of electronics was learning the color codes on resistors so I could sort them upstream of my brother and father assembling a single-board DATAC-1000 computer.  But do-it-yourself can collide with amateur.  When MAP rolled out the first, thinly documented, version of MinKNOW, I started looking for what seemed obvious to me must be in the package.  MinKNOW would output data in an HDF5-based FAST5 format, and obviously there must be another tool to extract the reads as FASTA or FASTQ from the FAST5. Right?  Right? Surely????

No, there wasn't.  If you wanted to, y'know, analyze MAP data you had to dig into some very unfamiliar software guts.  In my case, Perl was a dead end because the HDF5 library choked on the FAST5s (by 2014 I was realizing Perl library development and maintenance was pretty much in zombie mode; around that time the PDF module in CPAN had syntax errors!).  Since I wanted to learn Julia, I quickly tested if it had a working HDF5 library and wrote a simple parser. Several years later I found out it didn't work anymore for the same reason I had been reluctant to open source it publicly - the implementation was tightly coupled to the FAST5 implementation.  Presumably the same issue befell Mick Watson's R solution and Nick Loman's Python one, though by then ONT had made FASTQ extraction a standard part of their pipeline.

Similarly, ONT for a long time avoided writing much in the way of documentation or having much in the way of technical support.  You threw your questions to the Nanopore Community and you'd often get answers.  But as the Community aged, it became very hard to tell which information was still valid and which was badly obsolete.  Which version of a protocol did the search find?  Who knows?

What I said about open sourcing my extractor leads into another big tension area: openness and secrecy. I knew ONT was sensitive about internal workings and so asked if I could release my extractor, and got a not emphatic reply that suggested they wouldn't love that.  Mick and Nick must not have asked, as they did open source theirs.  For a long time, much of the software from ONT was only available from the Nanopore Community site, which was far less convenient to download from than GitHub or similar.  Over time that went away.  In general, ONT has been very open which has helped drive much innovation by the community, but now is probably enabling BGI and other Chinese competitors who are launching similar nanopore platforms.  Of course, ONT in turn benefited from PacBio's great openness and particularly the long read software community that grew up around PacBio.

From secrecy we can slide over to an entertaining topic: Clive and ONT's penchant for combativeness which often edged into destructive corporate paranoia.  In this he wasn't alone: CEO Gordon Sanghera and majordomo Spike Willcocks would also engage in this behavior; I chided Willcocks here on an egregious case that threatened to spike customer relationships. If you talk to the older ONT crowd, they have some very colorful stories to tell of then Illumina CEO Jay Flatley's behavior around them when Illumina had an investment in ONT (and yes, I'd love to hear Flatley's or any other Illumina's folks take on this - as off the record as they would like).  Illumina would later yank their investment in ONT and then try to sink ONT with a patent lawsuit. One can understand the bitterness after such a scorch-the-earth style of corporate battling.

But some of that activity, and the lawsuits with PacBio, brought out behavior in Clive that must have put the entire ONT legal team on high doses of ACE inhibitors and proton pump inhibitors.  Clive would tweet out very sharp commentary on one of ONT's legal opponents, often with insults.  This would escalate a bit, until finally Clive's Twitter account would be "deleted".  There would be a pause for weeks or months then Clive would start tweeting innocuous stuff again - which would eventually get spicier. Rinse and repeat.

In a similar vein, I'll always remember the first time I met Clive face-to-face.  It was at AGBT in 2013, a year after the big AGBT splash.  I thought that presentation was exciting, but apparently Clive and other ONT folks were run through the wringer of criticism as purveyors of vaporware, hucksters trying to just fleece gullible investors and so forth.  It had really gotten to him.  We met in one of the little outside alcoves that existed at the old Marriott facility in Marco Island and he showed me a MinION - probably one revision back from what came out for MAP.  And my big takeaway from that was "man is that guy strung tight!".  He was intense - both the great kind of someone who is passionate about their work and the less desirable intensity of someone who feels hounded.  The next time I saw him, when I helped him navigate to the ONT demo that fall in Kendall Square after bumping into him on the street, he was a bit more relaxed - building towards success can do that.  But still passionate - always passionate.

Clive's passions and ethos of the individual explorer sometimes took the company in directions that were of questionable commercial relevance.  Thrilling to watch or ponder perhaps, but not much in the way of a path to profitability.  For example, last year I was certainly excited at Clive saying ONT had succeeded in sequencing some of the smaller yeast chromosomes as a single fragment.  But I work at a strain factory that loves to work on yeast; few are in such a situation.  And yeast chromosomes are a tiny fraction of the length of even the shortest mammalian chromosome. 

Perhaps the most egregious case of this was the "Ubiqibopsy" demo.  Live, on stage Clive had his cheek swabbed, then some noisy bead-based sample prep, a rapid prep and by the end of the presentation ... a handful of reads.  Supposedly they proved Clive is human though I don't believe any reviewer #3 got a look at the data.  The device was visible later and clearly hacked together from ubiquitous lab parts - if I recall correctly there was part of a centrifuge tube, a micropipette tip and maybe one other recognizable bit.  But was this the path to a real product?  Given the challenges involved in getting to a data yield that might be interesting, unlikely - and little was heard of this ever again.

But of course nobody thought the whole concept would ever work.  Then again, while ONT has shown technical success it still struggles with financial success.

Similarly, Clive's love of the VolTRAX electrowetting technology was never returned by that tech.  In one talk he presented the idea of putting the nanopores actually on the VolTRAX chip, which certainly solves the transfer problem from VolTRAX to flowcell.  But the number of pores was to be tiny - so what application wanted a few hundred dollar prep to get maybe a few megabases of data?  That was a question never answered - and it too disappeared from future presentations.

At last year's LC, Clive gave hints that he might not be at ONT by the next year - something along the line of "if I'm still here".  There were other signs - the ElysION (formerly TurBOT) robot is the epitome of what Clive disliked, a large expensive box specialized on a single task and marketed to large faceless labs.  

I hope Clive is enjoying his retirement.  His Twitter feed shows very brief life very rarely, but mostly about non-sequencing topics.  If there isn't a third act for him (he was on the team that brought Solexa's sequencing technology to life), then we all now have a name to scan for in the annual King's Honors List. Under Clive's leadership Oxford Nanopore launched a truly revolutionary platform which has delivered huge gains to genomics; we should all be very grateful for that.


[2025-05-21 fixed POD5->FAST5, as suggested by a commenter]

Wednesday, April 23, 2025

AGBT Flashback: Scale Biosciences’ QuantumScale

Back at AGBT two months ago, Scale Biosciences CEO Giovanna Prout was kind enough to spend thirty minutes of her hectic schedule with me discussing Scale's QuantumScale technology, though embarrassingly I've let those notes be sucked into a maelstrom of procrastination.  I was reminded of my delay by the announcement this week of Scale partnering with the Chan Zuckerberg Initiative (CZI) to apply the technology to CZI's "100 Million Cell Challenge" and planned "Billion Cells" single cell expression profiling projects.  The QuantumScale technology is now fully commercially available.

QuantumScale has eye-popping specs.  It is possible to profile 4 million cells in a single experiment, with those cells divided among up to 9216 samples using ScalePlex technology, which I covered last June.

Scale has partnered with Ultima to offer indexing kits that generate native Ultima libraries, rather than requiring a conversion step.

A quick review: Scale's approach to single cell profiling avoids any specialized encapsulation equipment à la 10X Genomics by fixing cells so that the cell remnant serves as the addressable unit for indexing reads.  After labeling, pooling a sample and then splitting it again enables combinatorial barcoding.
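To make the pool-and-split idea a little more concrete, here is a minimal Python sketch of why combinatorial barcoding scales so well. The well counts and cell numbers are purely illustrative assumptions, not Scale's actual plate layout, and the model assumes every cell lands in a uniformly random well at each round.

```python
"""Toy model of split-pool combinatorial barcoding (illustrative numbers only)."""
from math import prod


def barcode_space(wells_per_round):
    """Number of distinct barcode combinations across all barcoding rounds."""
    return prod(wells_per_round)


def expected_collision_fraction(n_cells, n_barcodes):
    """Expected fraction of cells sharing their full barcode combination with
    at least one other cell, assuming uniformly random well assignment."""
    if n_cells <= 1:
        return 0.0
    return 1.0 - (1.0 - 1.0 / n_barcodes) ** (n_cells - 1)


# Hypothetical layout (an assumption): two rounds of 96 wells, then one of 384.
rounds = [96, 96, 384]
space = barcode_space(rounds)
for cells in (10_000, 100_000, 1_000_000):
    frac = expected_collision_fraction(cells, space)
    print(f"{cells:>9,} cells in {space:,} combinations: {frac:.2%} expected collisions")
```

The point of the sketch is that the number of combinations multiplies with each round while the collision rate climbs with cell count, which is why pushing to millions of cells needs a much larger index space per well.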

Library preparation starts with the option of using ScalePlex to index multiple samples.  Fixed samples then are processed with reverse transcriptase and indexed on the first barcoding plate.  The entire set of indexed samples is then pooled and added to the QuantumScale barcoding plate.  QuantumScale, as I covered in September, doesn't just add a well-specific barcode to each new split of the pool, but instead each well has the capability to index at much higher, well, Scale.  In the fall Prout was keeping quiet about the exact means QuantumScale used to achieve this, but at AGBT she revealed it relies on indexing beads - 800K per well.  The entire workflow can be completed in a day and a half.  And she is confident the technology can go to even higher levels of indexing; they just need sequencing costs to drop further to enable such plans.  Though right now the cost per cell is quite good - library prep costs of 0.8 cents/cell for 4M cells or 1.0 cents per cell for 2M cells or just $5K all-in to profile 84K cells.  She also noted the entire protocol - including the microbeads - is very friendly for liquid handling robots.
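For anyone who wants the per-cell quotes above as totals, here is a quick back-of-the-envelope sketch in Python. The function names are my own, and note that the $5K figure was described as "all-in" while the other two are library prep only, so the per-cell numbers may not be directly comparable.

```python
def library_prep_total(cents_per_cell, n_cells):
    """Total cost in dollars for a given per-cell price in cents."""
    return cents_per_cell * n_cells / 100.0


def per_cell_cents(total_dollars, n_cells):
    """Per-cell cost in cents for a given total in dollars."""
    return total_dollars * 100.0 / n_cells


# Figures quoted in the text: (cents per cell, number of cells).
quotes = [(0.8, 4_000_000), (1.0, 2_000_000)]
for cents, cells in quotes:
    total = library_prep_total(cents, cells)
    print(f"{cells:,} cells at {cents} cents/cell: ${total:,.0f} library prep")

# The 84K-cell option was quoted as a total rather than per cell.
print(f"84,000 cells for $5,000: {per_cell_cents(5_000, 84_000):.2f} cents/cell")
```

That works out to roughly $32K and $20K of library prep for the two large configurations, and about 6 cents per cell for the small one.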

Prout also mentioned that Scale is developing a probe-based workflow for Formalin Fixed Paraffin Embedded (FFPE) samples, the dominant sample type of large tissue archives but also the nightmare of molecular profilers as the FFPE preparation process and the deparaffinization tend to both damage and break any nucleic acids.  It actually could be worse: very early FFPE protocols used unbuffered formalin and tended to completely destroy nucleic acids.   

I can still remember the first stirrings of single cell sequencing about a decade and a half ago, when it truly meant "single cell sequencing" - data was generated for individual cells in wells.  We're so spoiled in genomics by routine exponential climbing of the price/performance curve, but it's still astounding to contemplate profiling 4M cells and potentially dividing that largesse over a matrix of over 9K different conditions and replicates.  But that is the corollary of those fantastic growth curves: what is barely imaginable today will be fodder for proof-of-concept in a couple of years and then routine a decade later.  And with the growing power of machine learning models, the demand for such ginormous datasets seems to be on a nearly vertical upward curve.  QuantumScale is emblematic of such growth of genomics capabilities; now we can sit back and see what biological insights tumble forth from the experiments it enables.



Thursday, April 17, 2025

Stellaromics Dives into the Thick of Spatial Genomics

2025 has sometimes seemed like an unending dark winter of biotech, but just before AGBT spatial genomics company Stellaromics announced they had secured $80 million in Series B funding to advance their Pyxa platform.  Around that time, CEO Todd Dickinson and CTO Ye Fu chatted with me by phone, which any diligent and responsible scribe would have written up immediately.  Alas, that is not me.  But even two months of delay in writing cannot change the fact that Stellaromics is an interesting new entrant to the spatial genomics field.

Monday, April 14, 2025

Will the Result Be GeneXpertION?

One of the more intriguing and slightly mysterious genomics news stories of last week is Oxford Nanopore announcing a partnership with Danaher's Cepheid business to use Cepheid's GeneXpert system for sample preparation upstream of nanopore sequencing.  The press release was filled with grand visions but few details, which isn't exactly unusual for ONT.   I've often heard of Cepheid but never really dove into the technology, so I've been doing a crash dive on the company's website (which was also a raiding expedition for the images used below) as well as a detailed post Nava made in 2022.  Here's what I have learned and also some speculations about how it might fit with ONT - and if I've misunderstood anything I hope to receive constructive guidance on where I've gone wrong.  Please note I've made a COI disclosure at the bottom of this piece.

Tuesday, March 18, 2025

Mission Impossible: Methylomics

Good morning Mr. Hunt.

Today's briefing will have a bit more background than usual - you haven't tangled with biotechnology since that odd little company in Australia with the fancy office tower and actual labs in caves.

As you may know, DNA has four letters or bases which form pairs: A with T and C with G.  It is also possible for the C to be modified by methylation to form 5-methyl-C or even 5-hydroxymethyl-C.  In humans these are nearly always at C followed by G, called CpG, sequences.  There is great interest in reading the methylation of DNA from blood, as this "cell-free DNA" may be an oracle into current and future health conditions.

The best technologies for reading this are the single molecule sequencers from Oxford Nanopore and Pacific Biosciences, as they can read these marks directly with no additional preprocessing of the DNA beyond what the sequencer requires just to read the bases: the construction of sequencing libraries.  But these suffer from relatively high input requirements, and any amplification of DNA by PCR or similar techniques erases the methylation.

It is possible to read methylation on the popular short read sequencers of Illumina and other companies but only with a trick.  The most popular method is to treat the DNA chemically with bisulfite, a rather nasty reagent; even with your disdain of danger, you really should read and adhere to the MSDS on this stuff.  It converts all unmethylated Cs to Us, which pair and sequence like Ts; modified Cs are untouched.  Please do not bring up this technique with the bioinformatician joining your team; they are known to rage about bisulfite being "a weapon of mass sequence destruction".  Similar methods using enzymes produce the same result.  Conversion means this DNA can be amplified.  But it also means it is useless for calling genetic variants; a separate unmodified library must be used for that.

Watchmaker Genomics has a clever chemistry that performs a much more limited transformation - only methylated Cs are converted to Us.  An even more clever biochemistry is offered by Biomodal, which copies one strand of a DNA fragment into a second, linked strand and then treats with enzymes.  After sequencing both linked sides, the pattern of matching and mismatching between the two can call variants, 5-methyl-C and 5-hydroxymethyl-C.  
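For agents who prefer code to chemistry, the two conversion schemes described above can be sketched as a toy in-silico model. This is only an illustration under loud assumptions: it models a single strand, treats conversion as 100% efficient, ignores 5-hydroxymethyl-C, and the function and scheme names are hypothetical ones of my own choosing.

```python
def convert(seq, methylated, scheme):
    """Return the sequence as read after conversion and amplification.

    seq        -- original single-strand sequence (A, C, G, T)
    methylated -- set of 0-based positions carrying 5-methyl-C
    scheme     -- "bisulfite":   unmethylated C reads as T, methylated C stays C
                  "methyl_only": methylated C reads as T, unmethylated C stays C
    """
    if scheme not in ("bisulfite", "methyl_only"):
        raise ValueError(f"unknown scheme: {scheme}")
    out = []
    for i, base in enumerate(seq):
        if base == "C":
            is_meth = i in methylated
            if (scheme == "bisulfite") != is_meth:
                # bisulfite flips unmethylated Cs; methyl_only flips methylated Cs
                base = "T"
        out.append(base)
    return "".join(out)


reference = "ACGTCCGA"   # toy sequence, methylated at position 1
variant = "ACGTTCGA"     # same molecule with a genuine C>T variant at position 4
meth = {1}

for scheme in ("bisulfite", "methyl_only"):
    ref_read = convert(reference, meth, scheme)
    var_read = convert(variant, meth, scheme)
    print(scheme, ref_read, var_read, "variant visible:", ref_read != var_read)
```

In this toy case the bisulfite reads from the reference and the variant molecule are identical, since the genuine C>T change is swallowed by the blanket conversion, while the methyl-only scheme leaves it visible.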

Roche has recently unveiled a new single molecule sequencing technology called SBX, but it requires first copying the DNA of interest into a bizarre, highly modified "expandomer" form.  So it shouldn't be able to read methylation.  But at AGBT, SBX boffin Mark Kokoris offered, in response to a question on the topic, that there was clearly room for a fifth signal level in their traces and hinted Roche was working on using that fifth level.  But how, if SBX requires copying the input DNA first?

We first thought of sending you in to extract the secret from Roche, but even we can't just go performing espionage on a legitimate company with no apparent plans for world domination; monopolization of the sequencing market by Illumina has never triggered us to action so that isn't a justification.

Instead, your mission, should you choose to accept it, is to realize the impossible by devising a system for creating SBX molecules using a third basepair (such artificial basepairing schemes have been realized in the lab), so that the set of all 6 bases can be used to resolve A, C, G, T and 5-methyl-C; we will save 5-hydroxymethyl-C for a future mission.  Ideally this would be achieved by specifically converting 5-methyl-C into a base of the third basepair type, but if you devise schemes converting C or T to the extra basepair and then 5-methyl-C into whichever base you freed up, that's acceptable as well.

Our technical directorate has outlined two general strategies which could be used independently or together.

In one, a purely chemical conversion would change one of the bases to the new basepairing arrangement.  No chemical reaction has ever been proposed in the literature to do such a transformation of one base into an unnatural basepair.

In the other plan of attack, protein engineering would be used to generate one or more enzymes to perform the transformation.  No enzyme is known that would provide an obvious starting point for such a protein evolution campaign.  Perhaps one of the new machine learning models can take a crack, but this is far beyond anything demonstrated with AI-based protein engineering.  

Either way, in the event of success you may be required to disguise yourself when presenting the results at AGBT.

A note about commercial considerations, and we don't mean the usual of making sure you hold your beverages with the label facing outwards.  Whatever process you devise must not have an inordinate number of steps, should not require a fume hood and the cost-of-goods should not be exorbitant.  Nor should your spending during the project; if you should exceed your budget authority the secretary will disavow all knowledge of your 

This message will self-destruct in 5 seconds - or would have if we hadn't spent the micro-explosives budget on DNA sequencing reagents.
 

Saturday, February 22, 2025

Is Midi Read Sequencing A Thing?

Throughout the two pieces on Roche SBX sequencing I sprinkled the term “midi read sequencing”.  Here I’m going to explore in more detail this concept.


Thursday, February 20, 2025

Roche Ripple Predictions

In the prior piece, I covered the technical details unveiled by Roche for their SBX technology, but generally tried to avoid predicting its effects on the marketplace.  Here I put on the pundit’s hat.  The TL;DR is this is a major new sequencing platform and if you’re at one of the competitors you have about a year before it fully hits the market - though in reality the action has already started as Roche starts grabbing hearts-and-minds.  What can we anticipate about the effect on each of the current players? As noted in the prior piece, some key aspects - in particular purchase price and run cost - aren’t being disclosed by Roche and complicate prognostication.

Roche Xpounds on New Sequencing Technology

Bar bets can be a powerful force in human society.  One of the best known books on the planet, The Guinness Book of World Records, originated from the need to equitably settle wagers.  Many entries in that tome are questions of immense scale - the largest this or heaviest that.  Shortly before this posted, Roche unveiled a sequencing technology that per its inventors may be the result of such a bar bet: how large a dangling bit can you stick on a nucleotide and still have it incorporated by a polymerase?

Monday, January 27, 2025

Olink Reveal: Focused Proteomics, Simplified

I’ve covered a lot of genomics in this space, but there is an inherent challenge to studying biology via DNA - DNA is the underlying blueprint, but that blueprint must pass through multiple steps before actual biology of interest emerges.  RNA-Seq gets closer, but much of the real action is at the level of proteins (though much is not - let’s not forget all the metabolites!).  When I set out in this space 18 years ago, I thought I’d cover more proteomics but that didn’t materialize - time to plunk one piece on the proteomics side of the ledger!


Proteomics has multiple challenges, but two inherent ones are the diversity of proteoforms and the dynamic range within the proteome.


The diversity of proteins within a human is astounding, even if we discard the inherently hypervariable antibodies and T cell receptors which have specific means of diversification within an individual that include random generation of sequence during VDJ recombination and somatic hypermutation of antibodies.  The rest of the bunch are subject to transcript-level diversification by features such as alternative promoters, alternative splicing and RNA editing and then another wealth of post-translational proteolysis, phosphorylation, glycosylation and a heap more covalent modifications.  If we really wanted to make things complex, we’d worry about protein localization, who a protein is partnered with and even alternative protein conformations - but let’s just stick to primary proteoforms and a diversity that is estimated in excess of 1 million different forms.


The key part here is that there is no analytical method capable of resolving all of these.  Any proteomics method is to some degree ignoring much of the proteome entirely, and for many other proteins compressing many forms into a single signal.  Indeed, most proteomic tools look at very short windows of sequence or perhaps patches of three dimensional structure, and will rarely if ever be able to directly connect two such short windows or patches - they will be stuck correlating them.  The key takeaway here is that all proteomics methods work on a reduced representation of the proteome.


The dynamic range in the proteome is astounding, with some potentially challenging effects.  For example, blood serum is utterly dominated by a handful of proteins such as serum albumin, beta 2 microglobulin and immunoglobulins - for methods that look at the total proteome there is a serious danger of flooding out your signal with these abundant but relatively dull proteins and not being able to see interesting ones such as hormones that are many logs lower in concentration.


Proteomics has been dominated by mass spectrometry, which has had over three decades to develop into a mature science.  Mass spec is inherently a counting process and on its own can’t focus or filter out the dull stuff.  Even more so, you don’t fly intact proteins in a mass spec, but peptides and there’s only a few useful proteases out there.  Peptides don’t ionize consistently, so that adds a layer of challenge to quantitation.  But as noted, this has been an intensely developed field for multiple decades and so there are very good mass spectrometry proteomics techniques using liquid chromatography (LC-MS) and other methods to remove abundant dull proteins and fractionate complex peptide pools into manageable ones.


But, protein LC-MS is very much its own discipline, and most proteomics labs aren’t strong in genomics or vice versa - though there are certainly collaborations or dual-threat labs.  LC-MS setups require serious capital budgets for the instruments and their accompanying sample handling automation and highly skilled personnel. 


A number of companies are attempting to apply the strategies of high throughput DNA sequencing to peptide sequencing or identification.  Quantum-Si is the only one to make it to market but other startups such as Erisyon are plugging away.  These methods look a bit like mass spectrometry in their sample requirements, as they will also be counting peptides - and the current Quantum-Si doesn’t count nearly enough to be practical for complex samples such as serum or plasma.


The other “next gen proteomics” approach (one lesson not learned from the DNA sequencing world is the problem of calling something “next-gen”; this year will be the 20th anniversary of the commercial launch of 454 sequencing) is to use affinity reagents such as antibodies or aptamers and tag them with DNA barcodes, then sequence those barcodes on high throughput DNA sequencers.  By using affinity reagents, the problem of boring but abundant proteins goes away – just don’t give them any affinity reagents.  Dynamic range can be addressed as well - the exact details aren’t necessarily disclosed by manufacturers but one could imagine only labeling a fraction of a given antibody to tune how many counts are generated from a certain concentration of targeted analyte.


Olink Proteomics, now a component of Thermo Fisher, is one company offering a product in this space.  Olink’s Proximity Extension Assay (PEA) relies on two antibodies to each protein of interest and requires hybridization between the probes on both antibodies to enable extension by polymerase in order to generate a signal.  This increases the specificity of the signal and tamps down any signal from non-specific binding - or from just having antibodies in solution.


Olink has released a series of panels targeting increasing numbers of target proteins in the human proteome.  This is generally a good thing - except counting more proteins means generating more DNA tags which means a bigger sequencing budget per sample.  The other knock on Olink’s (and their competitor SomaLogic, now within Standard BioTools and also marketed by Illumina) approach is a complex laboratory workflow that mandates liquid handling automation.  So this has meant that the big Olink Explore discovery panels are inevitably going to be run at huge genome centers that have both the big iron sequencers and the liquid handling robots that are required.  And this strategy has started paying out scientific dividends - some of which were covered by the Olink Proteomics World online symposium last fall that featured speakers such as Kari Stefansson.  Olink’s and Ultima’s recent announcement on starting to process all of the UK Biobank is an example of such grand plans, and this will be run at Regeneron’s genome center.


Academic center core labs and smaller biotechs often power important biomedical advances, but if Olink Explore is only practical with NovaSeq/UG100 class machines and fancy liquid handlers, then few of these important scientific constituencies will be able to access the technology.   Which would be unfortunate, since small labs often cultivate very interesting sample sets that very large population-based projects like UK Biobank might not have.  Large population based and carefully curated small projects are complementary, but is only one able to access Olink’s technology?


And that’s where Olink’s newest product, Olink Reveal, comes in, enabling smaller labs to process 86 samples per kit.   First, a select set of about 1000 proteins is targeted, so that the sequencing required for a panel of samples plus controls fits on a NextSeq-class flowcell - only 1 billion reads required.  Second, the laboratory workflow has been made very simple and practical to execute with only multichannel pipettes.  The product is shipped with a 96-well plate that contains dried down PEA reagents; simply adding samples and controls to the wells activates the assay for an overnight incubation.  The next day, PCR reagents are added to graft sample index barcodes onto the extension products, and the wells are then pooled to form a sequencing library.  The library prep costs $98 per sample (list price) - $8,428 per kit.  Throw in sequencing costs of $2K-$5K per run (depending on the instrument) and this isn’t out of line with other genomics applications.
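For anyone who wants to sanity-check those numbers, here is a quick back-of-envelope sketch in Python.  The prices, sample count, read count and protein count are the ones quoted above; the assumption that all 96 wells (samples plus controls) share the run evenly, and that reads spread evenly across proteins, is mine and is only a crude average - real assays will be far from uniform.  The function names are just illustrative.

```python
# Back-of-envelope budget for one Olink Reveal kit, using figures quoted in the post.
LIST_PRICE_PER_SAMPLE = 98            # USD, library prep list price
SAMPLES_PER_KIT = 86                  # samples per kit
SEQ_RUN_COST_RANGE = (2_000, 5_000)   # USD per sequencing run, instrument dependent
READS_PER_RUN = 1_000_000_000         # "only 1 billion reads required"
PROTEINS_IN_PANEL = 1_000             # "about 1000 proteins"

# ASSUMPTION (not from the post): every well of the 96-well plate,
# samples and controls alike, gets an equal share of the reads.
WELLS_PER_PLATE = 96


def kit_cost() -> int:
    """List price of one kit: per-sample price times samples per kit."""
    return LIST_PRICE_PER_SAMPLE * SAMPLES_PER_KIT


def all_in_cost_per_sample(seq_run_cost: float) -> float:
    """Library prep plus an equal share of one sequencing run, per sample."""
    return LIST_PRICE_PER_SAMPLE + seq_run_cost / SAMPLES_PER_KIT


def mean_reads_per_well() -> float:
    """Average reads per well if the run is split evenly (crude assumption)."""
    return READS_PER_RUN / WELLS_PER_PLATE


def mean_reads_per_protein_per_well() -> float:
    """Average reads per protein per well, assuming a uniform spread."""
    return mean_reads_per_well() / PROTEINS_IN_PANEL


if __name__ == "__main__":
    print(f"Kit list price: ${kit_cost():,}")
    low, high = (all_in_cost_per_sample(c) for c in SEQ_RUN_COST_RANGE)
    print(f"All-in cost per sample: ${low:,.2f} to ${high:,.2f}")
    print(f"Mean reads per well: {mean_reads_per_well():,.0f}")
    print(f"Mean reads per protein per well: {mean_reads_per_protein_per_well():,.0f}")
```

The point of splitting the run cost over the 86 samples rather than the 96 wells is that controls cost sequencing but aren't billable samples; the per-well read figures, by contrast, do count the controls.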


Of course, this is a reduced representation of the larger “Explore” sets.  But Olink has selected the proteins to be a useful reduced representation.  They’ve used sources such as Reactome to prioritize proteins, and have also prioritized proteins that have been shown to have genetically-driven expression variability in the human population - protein QTLs aka pQTLs.  If the new panel is cross-referenced to studies using the larger panels, most of these studies would have found at least one protein showing a statistically significant change in concentration. This can be seen in the plot below, where each row is a study colored by disease area. On the left is the distribution of P-values for the actual Olink Explore data and on the right the same data filtered for proteins in the Olink Reveal panel.





It’s also robust - Olink has sent validation samples to multiple operators and compared the results, and the values from each lab are tightly correlated.


So Olink with their affinity proteomics approach is basically following the same playbook as genomics did with exomes.  When hybrid capture approaches for exome sequencing first came out, it was thought these would be used for only a few years and then be completely displaced by whole genome sequencing (WGS).  But exomes have proven too cost effective - even with drops in WGS costs, it is still possible to sequence more samples with exomes for the same budget.  Yes, the risk of missing causal variants outside the exome target set was always a concern - the recent excitement around lesions in non-coding RNAs such as RNU4-2 has demonstrated that - but many investigators saw exomes as enabling studies that otherwise wouldn’t happen.  Plus sometimes the bigger worry is biological noise obscuring a signal you could otherwise see, and that is dealt with by running more samples.


The new Olink Reveal product fills a gap between Olink’s large “Explore” discovery sets and very small custom panels.    In the Proteomics World talks many speakers described work run with PEA panels of only two dozen or so targets, often using PCR as a readout rather than sequencing.  This shows one bit of synergy in the Olink acquisition by Thermo Fisher, as Thermo has an extensive PCR product catalog including array-type formats.  Thus PEA follows the well-worn patterns in genomics: huge discovery panels for some studies, high value panels that balance cost and coverage for many studies, and focused custom panels for validating findings on very large cohorts.  The Proteomics World talks even suggested some of these focused panels might soon be seriously evaluated as in vitro diagnostics.   With developments like these, targeted proteomics via sequencing will be a very interesting space to watch.




Wednesday, January 22, 2025

Illumina & NVIDIA Team to Remake How to Train Your DRAGEN

If you've been in a movie theater recently, you may have seen a trailer for a mixed live action and animation spectacle called How To Train Your Dragon.  Having seen the purely animated original - and wished I had gone to a 3D showing as the flight scenes must have been amazing - it was a bit unsettling, as the animated dragon in the new film looks exactly like the one in the old.  It's apparently a shot-for-shot remake of the original, but this time with live human actors.  So effectively a port of a script from one cinematic language to another.  In a similar vein, at last week's J.P. Morgan Conference, Illumina and NVIDIA announced they will start porting Illumina's DRAGEN applications onto NVIDIA GPU hardware.