Omics! Omics!: 2021

Friday, December 31, 2021

Reflecting on Anniversaries and Changes

As the year closes out for me (as I write this, it may well have closed out for some of you!) I'm reflecting on some anniversaries that were concentrated in this year, particularly those that are multiples of an early evolutionary developmental decision millions of years ago.

ONT Community Meeting 2021

Oxford Nanopore held their annual Community Meeting online at the beginning of this month. As is typical for this stage of the ONT news cycle, most topics were confirmations and updates of earlier projections, with little brand new material. There was one surprise, a new concept for running nanopore with little to no auxiliary lab equipment. Oh, and perhaps also in the surprise category: Oxford appears to be finally moving away from the R9 pore, which has been their mainstay for many years now.

A Look at Two HiFi Polisher Preprints

PacBio has made its reputation delivering very high accuracy long reads, which they have branded HiFi. These are based on their circular consensus technology: each template DNA molecule is converted into a single continuous circle of DNA which can be read in a rolling circle reaction. The "movie" is converted to raw base calls and the adapters are clipped out, leaving "subreads" which can be aligned together to generate a consensus (CCS) read. With many passes over the same molecule and its complement, the relatively high (~15%) error rate of the raw data can be brought down substantially using an HMM-based scheme. PacBio calls a read HiFi at a 1% error rate, but their model estimates an overall quality for each read and it can keep getting better from there. Homopolymers still bedevil the technology, though not like they once did, and it turns out there is at least one more systematic error class. Consensus building is a powerful way to cut through error. But could you do better? Two recent preprints from large tech companies, with PacBio co-authors, apply deep learning to this problem and each comes up with the astounding result that they can do a bit over 40% better.
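To get a feel for why repeated passes cut error so sharply, here is a toy calculation in Python. It is emphatically not PacBio's HMM-based method: it assumes every pass errs independently at a given position with probability 0.15 (the raw rate quoted above) and pessimistically counts the consensus as wrong whenever more than half the passes are wrong. The function names are made up for illustration.

```python
from math import comb, log10


def majority_error(p, n_passes):
    """Chance that more than half of n_passes independent passes are wrong.

    Toy binomial model (an assumption, not the real CCS algorithm):
    each pass is wrong with probability p, independently of the others.
    """
    first_losing = n_passes // 2 + 1  # smallest count of wrong passes that outvotes the rest
    return sum(
        comb(n_passes, k) * p**k * (1 - p) ** (n_passes - k)
        for k in range(first_losing, n_passes + 1)
    )


def phred(error_rate):
    """Convert an error probability to a Phred-scaled quality value."""
    return -10 * log10(error_rate)


if __name__ == "__main__":
    # Odd pass counts only, so there are no tied votes to worry about.
    for n in (1, 3, 5, 9, 15):
        err = majority_error(0.15, n)
        print(f"{n:2d} passes: consensus error {err:.2e} (about Q{phred(err):.0f})")
```

Real subreads do not behave this nicely, since errors such as homopolymer miscounts are correlated across passes, which is exactly the sort of systematic error a simple vote cannot remove.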

PacBio Pulls Down Circulomics

I was on vacation early this week when the news broke that PacBio has acquired HMW DNA solid phase extraction kit maker Circulomics -- the kind of vacation that I need where the scenery is gorgeous and the internet access terrible. Where solid phase means monumental slabs of granite with diabase intrusions being attacked by a high salt liquid phase. Where I actually sighted Atlantic Puffins and didn't once think about sequencing their genomes ('til now!). But now I'm back to work and genomics.

PacBio Enters a Binding Agreement to Acquire Omniome

Pacific Biosciences announced today that they are slurping up short read sequencer startup Omniome for around $800M. Omniome has been developing an interesting clonal read technology. On the conflict-of-interest side, many years ago (and, I think, an entire management team ago) Omniome treated me and my family to a weekend in San Diego (it was my son's birthday weekend) so I could look at their technology back then -- my NDA has expired but so has most of my memory of what I saw at that meeting! Also the periodic reminder that PacBio CEO Christian Henry sits on the board of my employer, though we haven't met. Simon Barnett of ARK Investments (which is a major holder of PacBio stock) has a very nice explainer on the Omniome Sequencing-By-Binding (SBB) chemistry and his bullish perspective on the acquisition, and there is a proof-of-concept publication of the technology. I'll briefly explain the tech and then outline my somewhat more bearish view. It's also interesting to note that the FTC's actions on Illumina-PacBio and Illumina-Grail have analysts jumpy about this acquisition attempt.

ONT Sketches Paths to Long, Selective, Accurate Sequencing

Some sort of summary of London Calling in this space is grossly overdue after getting caught by multiple work firedrills and then several recursive rounds of procrastination. I'm not going to attempt to cover all the company announcements. I'm going to focus on a cluster of announcements that show a long range vision of inexpensive sequencing consisting of very accurate, very long reads. Well, a cluster of visions -- some parts can be mixed and matched and others cannot. This should be a prospect to grab the attention of any current or aspiring ONT competitors. Now before I'm accused of being a gullible shill for Oxford, I want to make it clear I think that running the table on these will be technically difficult and is many years in the future. But even if Oxford manages some of these but not all, they would substantially upgrade their platform.

New Clinical Human Genome Speed Record

I proposed last year that there should be a regular racing event for human genomics. The only real competitor in this interesting race seems to be Stephen Kingsmore's group at Rady Children's Hospital. I was sent an embargoed press release from Illumina about a new record by that group, which clocks in at 13.5 hours from patient sample to clinical report. A New England Journal of Medicine paper (hence the embargo, ending just before I post this) reports on the advance but wasn't in the packet I received.

Matt Meselson Needs a Biographer!

Yesterday was Matt Meselson's 91st birthday. I have only met him a few times and he wouldn't know me from Adam, but he is a particularly interesting individual I've had the good fortune to converse with. I'm putting out a plea now for a skilled biographer to write his life, because it certainly has been an interesting and impactful one, with scientific work stretching from the early beginnings of molecular genetics to a preprint just recently posted on BioRxiv.

My Latest London Calling Thoughts

The title really says it -- London Calling has actually already begun and here I am pretending to write a "before the conference" piece. Of course, since everything is virtual again this year I can actually do this, since I haven't watched anything yet nor seen any tweets -- and the big technology announcement section isn't for a few hours so I have loads of time to write! Sadly, nor have I gone and looked at what I've written before. Nor have I defended these two days very well -- my schedule is cluttered with meetings and appointments. So I haven't prepared in any way, shape or form -- but here go some thoughts.

GISAID Broken Down by Sequencing Hardware

The GISAID database has been the workhorse for storing and distributing SARS-CoV-2 sequences during the COVID-19 pandemic and recently passed one million entries. There was some Twitter chatter wondering about the hardware breakdown for this, as it isn't really easy to get out of GISAID. I had done a somewhat arduous partial take at this for my VIB talk last month, but in the meantime GISAID had granted me some additional access to metadata which I've been too busy to tackle. But knowing some others were curious, time to dive back in.

AGBT21: VizGen Unveils MERSCOPE

More spatial profiling news coming in from AGBT -- Harvard spin-out VizGen is launching in the U.S. an instrument implementing MERFISH technology. This sub-$300K instrument will initially enable panels of up to 500 genes to be profiled, with plans to expand that capacity to 1000. Users either pick from a menu of pre-designed panels or select genes using a Gene Panel Design Tool, after which VizGen manufactures the panel in around two weeks. VizGen CEO Terry Lo and Senior Director of Marketing Brittany Auclair were kind enough to give me a preview last Friday.

AGBT21: The LabRoots Presentation Platform is an Unmitigated Disaster

Rant is ON! I've been having an utterly miserable experience with the LabRoots conference software that AGBT is using for their virtual meeting. This year has exposed many of us to a wide variety of teleconference and virtual meeting software and many of the glitches are small and hard to pin down. Or matters of personal preference (though if you don't share mine, you are simply wrong!). But now on two major platforms I've come across major issues with LabRoots.

AGBT21: Rebus Esper for Spatial Sees Things You Wouldn't Believe

My prediction that spatial would be a hot topic at AGBT was easy to make knowing I was sitting on embargoed news in the spatial space. This morning Rebus Biosystems announced the launch of the Rebus Esper system for wide field spatial profiling of gene panels with subcellular resolution. Rebus is promising that this instrument will offer true walkaway automation from fluidics through imaging, and data processing, requiring only one hour of hands-on time.

AGBT21: A Few Pre-Conference Mutterings

Getting some miscellanea out before AGBT21 starts later this morning.

AGBT 2021: A Spatial Foundation

I'll call it now -- the big buzz at this year's AGBT will be around spatial profiling. Trust me, it's not just a hunch. The two current players in the field -- nanoString and 10X Genomics -- both have significant presence in the virtual conference. Don't be surprised to see more players on the field -- just sayin'

PacBio With SoftBank's $900M: How Might They Work?

Pacific Biosciences continued its roll of successful business development, snagging $900M from Japan's SoftBank two weeks ago. Combined with a recent secondary stock offering and a major deal with Invitae, PacBio has gone from their self-proclaimed near-derelict status during the Illumina acquisition attempt saga to rolling in cash.

More Details on 10X's Sample Profiling Trident

10X Genomics had an online event Wednesday called Xperience (as far as I could tell no Jimi Hendrix music was used, a missed opportunity!) to lay out their development roadmap. This largely paralleled the presentation given at J.P. Morgan, but there were a few new bits and of course much more technical detail to whet the appetites of scientists -- and judging from a number of very positive tweets I saw today they were successful in that goal. Some of the 10X management were kind enough to walk me through the deck earlier this week and to grant permission to borrow images from it, so this summary is based on that as well as watching the presentation. While their name is 10X, the company emphasized progress on three axes -- scale, resolution and access -- and showed that progress across their three different platforms.

Could I See Myself at J.P. Morgan?

There's a question that others pop my way pretty much every year around J.P. Morgan: would I ever attend myself? I'll confess it never occurred to me before I was asked, but that isn't necessarily a deal breaker. I foolishly didn't attend AGBT until 2013 when Alexis Borisy (then CEO of Warp Drive) suggested I go -- I think it was mostly because he thought it was a good investment and probably only secondarily to keep me off the ski slopes for a week -- I shattered my knee just after AGBT 2012 ended. It's an interesting but complex question which I will answer one way here, but freely admit that over coffee I could be nudged one way or the other.

Why I Hated One Genapsys Slide

I claimed in my Miscellanea piece that I was one post away from being done with J.P. Morgan -- oops, forgot I had drafted a minor screed on data display which I'll push out before the last piece -- particularly since I hinted I would be taking Genapsys to task on this subject. Unexpectedly good timing too: maybe new Genapsys CEO Jason Myers' first big initiative can be to fix this plot!

J.P. Morgan: Miscellanea

Before J.P. Morgan is truly a month ago I should clean up some loose ends as a penultimate post driven by this year's virtual conference (the last post isn't exactly time sensitive). In contrast to the single company focused items that preceded it, this is a grab bag of minor observations and notes.

J.P. Morgan: NanoString

Almost done with my J.P. Morgan summaries -- this will be the last focused on a specific company: nanoString. They wish to emphasize that they are becoming the company for spatial analysis of DNA, RNA and proteins in biological samples. They also want us to differentiate that space into two segments: profiling and imaging. Profiling gathers spatial information from regions of multiple cells; imaging in their lingo covers spatial techniques with single cell or subcellular localization. In both cases nanoString is betting heavily on oligo-tagged antibodies to enable deep multiplexing of protein detection to be integrated with RNA and DNA detection.

J.P. Morgan: Genapsys

Genapsys' J.P. Morgan presentation by CEO Hesaam Esfandyarpour focused on their story of delivering a compact sequencer based on electronic detection that offers low capital, low cost sequencing. There were two bits of specific product news, but mostly general painting of a rosy picture.

J.P. Morgan: PacBio

PacBio CEO Christian Henry’s presentation at J.P. Morgan wasn't rich in technical specifics. But he gave a very bullish portrait of a company aiming for the stars. A conflict reminder: he’s a member of the Board of the Strain Factory that employs me, though I haven’t yet had the pleasure of meeting him.

The biggest news is a broad partnership with Invitae for clinical human genome sequencing. The only specific here is that this is not the whole enchilada; platform development will take place both within the Invitae collaboration and outside it. What might that development be?

Between Henry’s comments in the Q&A and a few info crumbs on slides, there will be a push to further tune the existing platform. He mentioned efforts on dyes and further improving SMRTcell loading efficiency. There was chatter on Twitter about an overdue update to improve HiFi yields.

Henry talked of the importance of increasing ZMW packing, but gave no specifics other than to suggest this is more "development" than "innovation" -- this was in response to a question asking if technical breakthroughs are required. But we are left wondering on a timetable as well as what the next density might be; four-fold to 32M wouldn’t be surprising on naïve geometry grounds.
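The "naïve geometry" arithmetic can be sketched in a couple of lines of Python. This rests on my own simplifying assumptions, not anything Henry said: the chip area stays fixed, the ZMWs sit on a regular grid, and the spacing (pitch) shrinks by the same factor in both dimensions. The 8M starting count is simply what the four-fold-to-32M figure above implies, and the function name is hypothetical.

```python
def scaled_zmw_count(current_count, pitch_scale):
    """ZMW count after scaling the grid pitch, assuming fixed chip area.

    pitch_scale is new_pitch / old_pitch; density goes as 1 / pitch_scale**2
    because the grid tightens in both dimensions.
    """
    return current_count / pitch_scale**2


if __name__ == "__main__":
    # Halving the spacing in both directions quadruples the count.
    print(f"{scaled_zmw_count(8_000_000, 0.5):,.0f} ZMWs")
```

Whether the optics, loading and polymerase behavior cooperate at a tighter pitch is of course a separate question from the geometry.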

I suspect a huge area of joint effort with Invitae will be to automate HiFi library production. The current protocol is long, manual and labor intensive -- not at all appealing for large scale clinical use. How much of that will be retained as proprietary to Invitae remains to be seen. Henry claims that the Invitae effort will be separate but coordinated with existing development efforts; prior plans have not been shelved or diverted to support Invitae. A major software effort to support clinical operations is a given. PacBio has separate workflows for SNP and SV calling and those must be integrated and a clinician-friendly report generated.

Henry believes that the new Sequel IIe will be the dominant product shipped going forward. It will be interesting to see which of the older workflows PacBio updates and moves into the on-board compute. For example, if you want to call methylation you must export BAM files with kinetics data, which are predicted to be five-fold fatter. If the methylation calling happened on board, then that extra processing and extra data would be eliminated.

Similarly, workflows such as microbial assembly are still based around Continuous Long Reads (CLR). Henry didn't mention CLR once (I think). While I doubt they would ever dump it altogether like they did Strobe Reads, it would seem likely that it won't get much attention. Oxford Nanopore can beat CLR on very long reads and ONT's single molecule accuracy is much higher; far better to focus on the CCS/HiFi reads where PacBio can deliver much higher accuracy. It will be interesting to see if PacBio pushes the HiFi fragment read length longer. On the one hand, it will be more challenging to work with longer fragments and to routinely get enough circuits around them to deliver HiFi quality data. On the other, while twenty five kilobases is a nice size for many applications, there will always be incremental value in going to thirty or forty or beyond.

In response to a question about $1000 genomes, Henry described it as "just a number" around "where it makes sense" in high throughput applications. He says the Invitae collaboration will be able to drive prices below $1000. But he also pushed the idea that a PacBio genome is a truly clinical grade genome and has higher value than genomes produced on other platforms. He argued that this higher value, in terms of higher diagnostic yield for rare diseases, will be more attractive to payers and that there will be a net benefit to the healthcare industry by ending diagnostic odysseys sooner. He vowed to continue generating "diagnostic proof statements" to provide evidence to support the higher value claim.

Should be interesting to watch, particularly if you have a front row seat in front of a Sequel IIe.

J.P. Morgan: 10X Genomics

As I attempt to collate various incomplete thoughts about the J.P. Morgan presentations I have read and listened to from genomics instrument shops, one thing stands out about 10X Genomics: they actually announced new gadgets and kits! I should thank the company for supplying the slides after I snarked on Twitter about how they weren't archived in the J.P. Morgan webcast -- but now they are there. So either my eyes failed again or I had a personal IT failure (I think the website doesn't like iOS and I may have forgotten that). The slides were presented by CEO Serge Saxonov.

J.P. Morgan: Illumina

Illumina presented at J.P. Morgan on Monday, reminding us that they aren't just a sequencing instrument company but an interlocking set of businesses focused on genomics. CEO Francis deSouza spent much of his time discussing the Grail acquisition and some of the other ways in which Illumina is pushing rapidly to become an essential part of clinical medicine, but there was one slide on future improvements to sequencing technology and a few on the lineup of existing sequencers. Reminder: I'm working off public sources, as during the day we work closely with Illumina and they even sank some serious cash into my employer last May.

J.P. Morgan 2021

The J.P. Morgan Healthcare Conference has started this morning in virtual form, so I'd really better get this draft cleaned up and out (indeed, Roche is presenting as I hurriedly type, though about pharma not diagnostics). 2021 already feels like a darker continuation of 2020, between the appalling putsch attempt in my nation's center of government last Wednesday and the still buggy roll-out of the coronavirus vaccine. As I noted in my piece on the Oxford Nanopore Community Meeting, the many disruptions of 2020 make grading the progress of companies essentially impossible: many were disrupted by lockdowns, supply chain issues and the general distraction from the year of doomscrolling.

Advent of Code vs. FizzBuzz

A bunch of coding types at the Strain Factory participated in The Advent of Code, a clever 25-day set of programming challenges that runs each year before Christmas. Each day a new two-part programming challenge was posted. Technically it is a speed contest, but you won't find me on the public leaderboard as I'm not nearly quick enough to ever rate a point there. One of my major official activities last month was contributing towards screening candidates for three different computational positions, one of which we threw open to general data science experience. As a result, I've been thinking far too much about the FizzBuzz problem and my prejudices towards it.
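For anyone who hasn't run into it, FizzBuzz in its usual form asks you to print the numbers 1 through 100, replacing multiples of 3 with "Fizz", multiples of 5 with "Buzz" and multiples of both with "FizzBuzz". Here is a minimal Python sketch of one common solution; it is one of many possible answers and not necessarily the form in which anyone posed it to candidates.

```python
def fizzbuzz(n):
    """Return the FizzBuzz word for n, or n itself as a string."""
    word = ""
    if n % 3 == 0:
        word += "Fizz"
    if n % 5 == 0:
        word += "Buzz"
    # An empty word means n is a multiple of neither 3 nor 5.
    return word or str(n)


if __name__ == "__main__":
    for i in range(1, 101):
        print(fizzbuzz(i))
```

Building the word by concatenation sidesteps the classic slip of testing for 3 or 5 before testing for 15.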

Peri-New Year Nanopore Playing

Ever since the community meeting I've been toying with an idea, then never quite trying to code it.
So on New Year's Eve I started getting the dataset together and reducing it to a bunch of dataframes, and today I pushed that a bit further and started graphing some of it. It's very much a rough project -- some of the dataframes have some issues I'm still chasing down with redundant data not being initially collapsed, but I think the data is accurate. I also think I have my conventions consistent -- at one point I confused myself into inverting the labels on the plots! In other words, ApG would be labeled GpA -- not good! There are already some intriguing patterns, which are presumably the sort of signal tools like Medaka use to polish assemblies from FASTQ data aligned to draft references.

Friday, December 31, 2021

Monday, December 13, 2021

Tuesday, October 26, 2021

Wednesday, August 04, 2021

Tuesday, July 20, 2021

Tuesday, June 29, 2021

Wednesday, June 02, 2021

Tuesday, May 25, 2021

Thursday, May 20, 2021

Sunday, April 25, 2021

Tuesday, March 02, 2021

Monday, March 01, 2021

Sunday, February 28, 2021

Saturday, February 27, 2021

Friday, February 26, 2021

Tuesday, February 09, 2021

Monday, February 08, 2021

Saturday, February 06, 2021

Thursday, January 28, 2021

Monday, January 25, 2021

Tuesday, January 19, 2021

Saturday, January 16, 2021

Thursday, January 14, 2021

Monday, January 11, 2021

Sunday, January 03, 2021

Saturday, January 02, 2021
