Wednesday, May 17, 2023

Called Back To London Again

After a too long pandemic-induced hiatus, I'm in the UK for this year's edition of London Calling.  I talked myself out of going last year well in advance, which would have been interesting as my rapid tests were still coming up positive about the time I would have needed to fly from Boston over the Atlantic.  And while I've been watching remotely, I've been dismal over the past year in actually writing anything about it.  Which was foolish on my part as ONT has been going through an interesting transition.

Tuesday, March 21, 2023

Thoughts on Unexpected Sequences Found In COVID mRNA Vaccines

Writing this piece is not easy, not only because the topic is completely mired in the controversies around SARS-CoV-2 and the vaccines for it, but because the data was generated by someone whose outspoken opinions on any COVID-19 public health topic are nearly always ones I find myself in opposition to.  Someone who periodically lobs personal attacks on my ethics my way.  It doesn't help that these results will certainly be misused to attempt to undermine public confidence in the vaccines, or that this post will probably attract a lot of commentary that I don't wish to address because of the adage that generating misinformation takes far less energy than rationally correcting it.  But data is data, and in the end I believe that whatever our differences, the data generator is not someone who would construct a hoax.  And in any case, the results can be checked, so if somehow it were a vicious hoax it could be exposed.  And importantly, I feel that what has been found should be discussed, as no advanced technology is ever perfect - these results, I feel, suggest new standards for the design and implementation of mRNA therapeutics.

Tuesday, February 21, 2023

Is Illumina Delivering the MVP of Long Reads?

At AGBT last week Illumina released additional details on their still incubating Complete Long Reads (CLR) product (formerly known as Infinity) but is still holding back some interesting technical information as well as exact performance specifications.  Illumina is already floating some of their marketing messages, which in some cases depend on those still-in-flux specifications, and some of the claims may not withstand careful scrutiny.  And Illumina continues to make statements that irritate anyone with deep technical knowledge of the long read space.  The reaction by attendees was definitely mixed - one long read aficionado even offered me a very spicy title suggestion for this entry.  Alas, I can't use it, as it would be a bit of an inside joke based on a portion of a presentation that the presenter asked not be tweeted.  So instead you get the above title, which may not be what you think.

Sunday, February 05, 2023

What's AGBT Like?

AGBT begins in less than 24 hours, and the signs are everywhere here at the Diplomat Resort in Hollywood, Florida.  I arrived Friday with family, and the count of old friends I've chatted with is steadily climbing.  If you somehow forgot about the meeting, the insides of the elevator doors will remind you. This is the fifth time I've attended in person, plus heavy monitoring of about twice as many via Twitter.  It's one of the premier events of the genomics conference schedule, and if you haven't been, it's certainly fair to ask why.  Or whether you would want to go to a future edition.  So I'll try to capture what makes AGBT so irresistible to many, but also why it just might not be your cup of tea.

Tuesday, January 31, 2023

AGBT 2023 Is Nearly Upon Us!

AGBT is less than a week away in Hollywood, Florida - and I've been letting everything else get ahead of writing anything here.  The JP Morgan Conference at the beginning of this month didn't have major fireworks from the sequencing vendors, but did have some news.

Tuesday, October 25, 2022

PacBio Revio: Same Footprint, 80% The Time, 15X The HiFi!

PacBio has been rolling out announcements around the ASHG meeting and now delivers a huge one: the next generation SMRT instrument “Revio” will roll out next spring and it’s a big step up in throughput. With Revio’s 15X boost in per-run throughput over Sequel IIe, PacBio is touting this as 30X HiFi genomes for under $1K sequencing consumables per genome.

Wednesday, October 19, 2022

Better Than FizzBuzz: First Bioinformatics Problem in an Erratic Series of Indeterminant Length

Periodically in my work or during writing this blog I come across computational problems that have the aspects of making, at least in my mind, very good teaching problems.  Some of the characteristics are that the basic problem is relatively simple to explain, the skills required are reusable on other problems, the concepts are germane to other problems and that the posed problem can be expanded in steps to something much richer.  Such problems might even be the nucleus of undergraduate or even high school bioinformatics projects, though with the recent news of a high schooler sequencing his dead pet angelfish's genome the bar for high school projects has leapt a few notches! In contrast to a programming problem that doesn't fit these, I'm going to tag such posts as "Better than FizzBuzz".

Thursday, October 06, 2022

General Inception Aims to Ignite New Company Formation

A week ago, a company calling itself General Inception emerged from stealth as a new concept, which they call an “Igniter company”, to promote the formation of new life sciences companies.  As described to me in a phone conversation with General Inception CEO Paul Conley, General Inception provides a range of science and business expertise and support to enable embryonic ideas to condense into functional startups.


The igniter metaphor Conley offered me is the spark plug of a car: it is required to start the engine and continues to provide a key part of the functioning whole.  Conley extended this to say that General Inception plans to go for the ride, but let the scientific founders occupy the driver’s seat.  But another ignition metaphor occurred to me, the skill of turning a tiny spark from flint and steel into a raging campfire.  Without careful nurturing, most tiny sparks will never ignite a blaze; only with careful and staged addition of oxygen and fuel does this reliably occur.


General Inception is not a fund, Conley stressed, but structured as a corporation. The ultimate goal is to make systematic and reproducible the generally artisanal craft of discovering and nurturing new ideas – as well as sometimes terminating efforts early that do not appear successful.  


So what does General Inception offer?  First, some of the boring business functions, such as quotidian finance tasks like paying bills.  Also a wide range of expertise, from access to technical experts in a diverse set of biological disciplines to persons familiar with estimating markets and development paths. General Inception has lined up Contract Research Organizations and Contract Manufacturing Organizations, but not just as contract partners – General Inception has “meaningful” equity stakes and perhaps board seats in these companies.  A key goal of General Inception is to identify key experiments which can be run quickly at these partners to test project concepts.  This might be simply reproducing results from an academic lab or perhaps running a key experiment to de-risk the project.  An example of a CRO partner is Triple Ring Technologies, which offers company incubator services and facilities both in the San Francisco Bay and Boston areas.


Interestingly, General Inception is casting a very wide net.  Any life science concept is potentially a target: therapeutics, diagnostics, tool companies, agriculture, synthetic biology or whatever else is centered around biotechnology.  Within the company there are defined practice areas: Tools & Diagnostics; Cell Engineering & Synthetic Biology; Therapeutics.  But these are intended not to be siloed fiefdoms but rather foci which overlap each other and reinforce each other. After all, so many technologies are converging - a diagnostic may require synthetic biology or a therapeutic cell engineering.  Conley believes human health will continue to grow – but non-health life sciences might grow even faster and overtake healthcare in terms of total economic value, so he is positioning the company to support company formation in all areas.


In terms of geography, they are scouring labs not only in the US but also in Europe.  In the latter case, the companies that emerge might have R&D remaining in Europe but commercial operations headquartered in the US.  Conley says the venture environment in Europe remains more conservative than in the US, so General Inception can make a particularly large impact in Europe by helping ideas cross “the valley of death” to where a venture firm is comfortable investing in them.


Not only is General Inception stalking the halls of academia, but they are also talking to existing companies about dormant assets that might find new life in a startup. And General Inception hopes to form long-term relationships with innovators; having an idea fail early won’t be held as a demerit against the academic.  


In terms of deal structures, General Inception looks to set themselves up as a founder, with founder’s common stock.  Generally they would be funding companies at the seed or pre-seed stage, perhaps taking the place of angel investors or “friends and family” investments.  Venture capital firms often demand preferred stock, which gives them first rights to the financial carcass of failed ventures - General Inception will be taking the same risk as scientific founders of not getting anything from liquidated companies.


General Inception itself has raised $60M from a set of venture capital firms. Their goal is to reach a steady state in which around 25 companies are seeded and graduated annually. General Inception also sees itself as evolving to an information business – with a large experience base of startups they hope to glean new insights into predicting what ideas work and finding the best company structures to maximize the chance of success.  Venture firms that invest in General Inception will have early access to companies incubated by the company.  Conley has been piloting the company since February 2020, refining the approach and the set of expert resources which General Inception can draw on.


In my career I’ve interacted with startups in a variety of contexts - as a potential employee or consultant, an actual employee or consultant and as a potential or actual partner.  I’ve also daydreamed a rough business plan or two.  I will be the first to proffer that this is hardly a comprehensive exposure to the variety of ways that companies are seeded.  Thinking back on experiences such as Warp Drive, I can see the value of quick proof-of-concept experiments to either validate a company or simply nip off an idea that is unlikely to ever bear fruit – but I also know how complicated it can be to identify such experiments or assemble all the components to perform such an experiment.  So I’m intrigued by General Inception and wish them well, though I reserve a certain amount of skepticism that starting new companies will ever be anything other than artisanal.


Monday, October 03, 2022

Illumina Roadmap Part 2: Infinity Becomes Illumina Complete Long Reads

The Only Thing Clear About Infinity Is It Is Now Complete Long Reads. 
Illumina told us a new name for Infinity -- Illumina Complete Long Reads -- and an initial pair of products, but didn't reveal anything new about the underlying tech.  They threw out a number of claims, but very vague ones.  Particularly confusing is that it "isn't synthetic reads".  If not, then what is it?  

Wednesday, September 21, 2022

Notes From Coffee With MGI

A couple of weeks ago I sat down for coffee with a pair of MGI representatives - American Region CEO Yongwei Zhang and Director, Global Business Development Damon Zhang. Since I hadn’t been at AGBT 2022 (my 2023 application already filed!), Yongwei and I had planned to try to catch up the next time he was in the Boston area, so I braved our current subway issues (not one, but two major lines shut for extended maintenance!) and we covered a range of topics.

Tuesday, August 23, 2022

SRA Entries Should Not Ever Disappear Into Thin Air

I ran into an annoying problem last night and was quite steamed, but had the discipline to wait until morning to vent publicly about it.  Now I'm in more of a morose mood on the subject, not furious but still quite frustrated. The quick version of what happened is I'm belatedly trying to go through some nicely documented reproducible analysis code to explore some concerns I have with the analysis, and the code is working on an SRA entry -- and that SRA entry is the entire point of the analysis. And that SRA entry which I know once existed now doesn't - other than this code and the preprint to go with it, it's as though it never existed -- which is terrible.  And I'm irritated with everyone who contributed to that terrible result, starting with NCBI.

Tuesday, August 16, 2022

Supply Stall Slows Singular

Singular Genomics reported earnings last week and delivered an unpleasant surprise: the inability of suppliers to make timely deliveries of key (but unspecified) hardware components has slowed G4 instrument production to a crawl.  Given the lively competition in the desktop short read space, this is a serious setback for Singular's commercial launch.

Thursday, June 23, 2022

AGBT 2022: Overhanging Questions

AGBT broke up a couple of weeks ago and I've failed to write anything here so far.  It was frustrating not attending, but not registering for a meeting in February seemed prudent given the pattern of COVID waves - I hadn't considered (nor would have wanted to bank on) AGBT organizers reacting so well and rescheduling the meeting.  It sounds like a number of attendees did catch the virus at the meeting -- though I'm presumably still quite protected by my infection a month earlier.  Anyways, I'm going to organize this around one to two questions that hover in my head for the different sequencing providers.  AGBT also had a strong spatial angle, but I feel ill-equipped to cover that in the absence of being on the scene -- I don't work with spatial data and so don't have a deep feel for it.  As always, please flag me here or on Twitter or by email for any errors I made -- or any juicy sequencing company gossip you wish to share!

Wednesday, June 08, 2022

Admin: Feedburner to Follow.it Switch

A bit over a year ago Google made one of their dreaded announcements that they would be slowly killing off one of their acquisitions, in this case FeedBurner.  Well over a thousand of you have been using FeedBurner to follow me via email.  Follow.it has a wonderful free plan that can take over all of the previous functionality and I could just import the old subscription list.

Tuesday, May 31, 2022

Ultima Genomics Storms Out Of Stealth Promising $1/Gigabase Short Reads

To date, the new entrants targeting Illumina’s short read business have been aiming at the middle of Illumina’s range, trying to take on NextSeq.  Element Biosciences is touting high accuracy for a low price.  Omniome (now PacBio) also has positioned itself to tout accuracy.  Singular Genomics is claiming to enable great flexibility and fast runs.  But all aimed at NextSeq.  As part of the run up to AGBT another company is decloaking from stealth mode: Ultima Genomics.  However, they are going not after NextSeq but full throttle after Illumina’s pinnacle, the NovaSeq running the S4 flowcell.  The value proposition is a large sequencing device that delivers S4 output at S1 prices for an overall cost of $1 per gigabase.  Note that the interview for this piece was conducted under a CDA and Ultima reviewed my copy for accuracy and to ensure I didn’t disclose anything they had marked confidential.  They were nice enough to offer to have me fly out to their facility, but I was forced by the damn coronavirus to cancel those plans the night before the trip. A preprint summarizing the technology is also out in bioRxiv.  A trio of additional preprints have popped up as well, describing its application to generate a huge methylation sequencing dataset around colorectal adenocarcinoma, a huge Perturb-Seq dataset and large scale single cell RNA-Seq.


Ultima isn’t planning on truly launching until early next year, but they’re well on the way with paying early access customers.  Indeed, AGBT will feature multiple posters and talks describing the use of the Ultima instrument for a variety of genomics tasks.  And Ultima is confident that their architecture will support significant increases in future throughput, enabling per base costs to go even lower.


Ultima’s chemistry is flow based - using unterminated but fluorescently labeled nucleotides.  Only a fraction of the nucleotides are labeled in each reaction, reducing the reagent costs and minimizing molecular scar accumulation.  The reactions take place on beads whose templates are amplified via emulsion PCR - though for all the ePCR-haters out there Ultima will include a fully automated benchtop ePCR robot.  Once primed, the beads retain the DNA polymerase, so this expensive component can be conserved between flows.  The instrument is a single end reader – no paired ends – but compensates with a modal read length of around 300 bases, which should be enough to plow all the way across most short read inserts and their associated molecular indices.


The use of unterminated nucleotides has typically meant challenges in resolving homopolymers.  Ultima is tuning their system to call homopolymers of up to 12 bases; via discussions with customers and their own experience accurate counting of longer homopolymers is deemed insufficiently valuable to focus on vs. other design tradeoffs.  


But Ultima has found several ways in which unterminated flow chemistry can either have its weaknesses ameliorated or become downright boons.  First, while it can’t accurately measure long homopolymers it can go straight through very long ones in a single extension cycle – so poly-A tails in cDNA ends can be easily blitzed through.  This helps ensure reading all the way through inserts of things like single cell libraries.  Second, for short homopolymers Ultima embeds in the Q-scores a probability matrix of the length – basically the odds of minus one and plus one versions of the sequence.  This is leveraged by their customized version of GATK, developed with the Broad Institute.  Third is a clever approach of “cycle shift variant calling” that I’m still stunned has never appeared in the literature for any other flow chemistry – 454, Ion Torrent or Genapsys.  Cycle shift uses the known order of flows to increase the confidence in variant calls – particularly valuable for low coverage data such as cell-free DNA.
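
To make the role of flow order concrete, here is a toy sketch of flow-space encoding in Python.  The flow order TGCA and the function name to_flowgram are assumptions made purely for illustration; this is not Ultima's basecaller or their cycle shift caller, just a way to see that a homopolymer of any length is consumed in a single flow and that some substitutions change how many flows a read consumes.

```python
def to_flowgram(seq, flow_order="TGCA"):
    """Convert a base sequence into per-flow incorporation counts.

    Each flow offers one unterminated nucleotide; every consecutive matching
    base is incorporated in that single flow. flow_order is an assumed,
    illustrative order, not a vendor specification.
    """
    unknown = set(seq) - set(flow_order)
    if unknown:
        raise ValueError(f"bases not present in flow order: {sorted(unknown)}")

    flows = []
    pos = 0       # next base of seq still to be synthesized
    flow_idx = 0  # index of the current flow
    while pos < len(seq):
        base = flow_order[flow_idx % len(flow_order)]
        run = 0
        # Unterminated chemistry: extend through the entire homopolymer.
        while pos < len(seq) and seq[pos] == base:
            run += 1
            pos += 1
        flows.append(run)
        flow_idx += 1
    return flows


if __name__ == "__main__":
    # A 50-base poly-A tail is consumed in one flow.
    print(to_flowgram("T" + "A" * 50))
    # Reference TGCA needs 4 flows; the G>A substitution in TACA needs 8,
    # so every downstream base is read a full cycle later.
    print(len(to_flowgram("TGCA")), len(to_flowgram("TACA")))
```

In this toy model a single substitution can push all downstream signal into a later cycle, which gives a sense of why knowing the flow order could add confidence to a call.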


Another key driver of low cost and high density is the use of a spinning, open “flow cell” (really a 200mm diameter wafer) for both reagent addition and imaging.  Centrifugal force generated by the spinning (fake force, ha!) distributes the reagents as a very thin film, minimizing wastage.  Imaging as the wafer spins enables shooting many tiles without having to repeatedly accelerate and decelerate the flowcell as a rectilinear scanning scheme must do.  The speed difference adds up: Ultima can generate in 20 hours the same 3 terabases (10 billion reads of roughly 300 bases each) as a NovaSeq S4, but an S4 requires 44 hours to run – and Ultima believes they can shave that down to 16 hours.  Faster cycle times mean more runs per instrument – and each instrument runs two wafers simultaneously, each with its own chemistry station but sharing the imaging path.  The instrument features tanks for reagents which can be refilled, with a 24 hour capacity of each reagent.  Six different wafers can be queued for running, with built-in automation removing spent wafers and swapping in new wafers with new library pools.
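
The run figures quoted above can be sanity-checked with a few lines of Python.  This is only a back-of-envelope sketch: the function names are mine, and the runs-per-week number assumes back-to-back runs with no turnaround time, which no real lab achieves.

```python
def run_output_terabases(reads, read_length):
    """Total bases in a run, expressed in terabases (1e12 bases)."""
    return reads * read_length / 1e12


def runs_per_week(hours_per_run, hours_per_week=168.0):
    """Upper bound on runs per week per wafer or flowcell position.

    Assumption: runs are back to back with zero loading or wash time.
    """
    return hours_per_week / hours_per_run


if __name__ == "__main__":
    # 10 billion reads of roughly 300 bases each (figures from the text).
    print(f"output per run: {run_output_terabases(10e9, 300):.1f} Tb")
    for label, hours in [("Ultima today", 20), ("NovaSeq S4", 44), ("Ultima target", 16)]:
        print(f"{label}: {runs_per_week(hours):.1f} runs/week at {hours} h/run")
```

Under those idealized assumptions the 20 hour run time allows a bit more than twice as many runs per week as a 44 hour run.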


How might the system grow its output?  The patterned wafers place the beads at a very conservative pitch.  Larger diameter wafers are also a possible further option. Extending the read lengths is yet another possible expansion direction.


The instrument has onboard GPU compute power, which is currently used for basecalling and alignment and could ultimately also perform the variant calling work.  


Current accuracy is 0.1% error for substitutions and 0.5% for indels.  Most of the indel error is concentrated in homopolymers longer than 8 bases, with calling capped at 12.  When used with the specially modified GATK co-developed with the Broad, or other custom DeepVariant or Sentieon pipelines, SNP calling accuracy of 99.7% precision and 99.7% recall is achieved, and indel recall and precision range from 96-98% for small indels (excluding long homopolymers and low complexity regions).  Accuracy suffers in low complexity regions, which Ultima believes is an amplification chemistry issue rather than a sequencing chemistry issue, and they believe they can significantly improve on the current performance.
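
For readers who think in Phred scores, the quoted error rates convert as sketched below.  The helper names are hypothetical, and the expected-errors-per-read figure assumes errors are spread uniformly along the read, which the text itself says is not true for indels (they cluster in long homopolymers), so treat it as a rough average only.

```python
import math


def phred(error_rate):
    """Phred-scaled quality: Q = -10 * log10(p)."""
    if not 0.0 < error_rate <= 1.0:
        raise ValueError("error_rate must be in (0, 1]")
    return -10.0 * math.log10(error_rate)


def expected_errors(read_length, error_rate):
    """Mean errors per read, assuming a uniform per-base error rate."""
    return read_length * error_rate


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


if __name__ == "__main__":
    for label, rate in [("substitutions", 0.001), ("indels", 0.005)]:
        print(f"{label}: Q{phred(rate):.1f}, "
              f"~{expected_errors(300, rate):.1f} expected per 300-base read")
    print(f"SNP F1 at 99.7% precision and recall: {f1_score(0.997, 0.997):.3f}")
```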


Ultima plans to offer their own kits for PCR-free and PCR-based sheared genomic libraries.  Libraries for other systems can be converted by a simple indexing PCR scheme - this has been done for TruSeq libraries and proof-of-concept experiments have been run for Nextera libraries.  


What could be done with such an instrument?  A pending publication uses Ultima and Illumina in parallel on the same 4 million cell Perturb-Seq experiment and finds the results equivalent between the platforms.  A large fraction of the Phase IV ENCODE HiC data was generated on Ultima.  An internal proof-of-concept experiment utilized deep sequencing RNA from COVID-19 infected samples, recovering complete viral genomes after only ribosomal RNA depletion.  One of the AGBT abstracts demonstrates the ability of Ultima WGS to detect minimal residual disease at low levels by deep WGS of cell-free DNA, an approach academia and startups are actively exploring.  Additional AGBT abstracts describe population genetics studies, oncology, and rare disease sequencing.  Ultima has 10 paying Early Access customers, with 7 instruments installed to date – and these run the gamut from large academic genome centers to biopharma to government labs.  They hope to have “well into double digits” customers at the time of the official launch.


To get here Ultima has raised over $550 million and hired over 350 employees.  The company has made steady progress from their start in 2016.  Ultima CSO Doron Lipson previously was part of the teams at Helicos and Foundation Medicine, so he has extensive experience both in building a sequencing platform and applying it at scale. CEO Gilad Almogy has spent many years in the semiconductor manufacturing field - Ultima’s reaction wafers are patterned atop silicon substrates and the semiconductor industry also uses very high precision optical methods for both manufacturing and quality control.


Illumina for a long time now has had an unassailed position as leader in sequencing in the US market as well as others.  Now that position is under pressure from all sides: Element and Singular are trying to squeeze the NextSeq market while Ultima is aiming for the top; Oxford Nanopore thinks their “short fragment mode” can compete as well and the patent shackles are being lifted from BGI.  At JP Morgan in January Illumina said their “Chemistry X” would offer improvements in accuracy, read length and output, but absolutely no details have been forthcoming – and in particular whether new instruments will be required to access Chemistry X benefits.  Perhaps the entry of Ultima and the others will add some urgency to Illumina communicating their future plans, lest customers start planning in earnest to opt for the new platforms.


For us consumers of sequence data, more competition and lower prices are a pure good. Projects can continue to be increasingly ambitious, and the number of different phenomena which can be converted into a sequence measurement constantly grows.  More for less is never, ever going to become boring – it will always be enabling.  After a long period of very shallow slope in the notorious “better than Moore’s Law” slide, we appear to be entering a new period of plunging sequencing costs.  Time to start making plans to take advantage of it!


[20220608 corrected really embarrassing millions typo (should have been billions) which has been requoted all over Twitter]

Monday, May 23, 2022

London Calling 2022: Peptide Sequencing

London Calling was last week and Clive Brown's big revelation was a peek at Oxford Nanopore's progress on enabling peptide sequencing on the platform.  Peptide sequencing and identification is a hot area right now, with multiple startups looking to provide alternatives to mass spectrometry approaches.  Clive stressed that the technology is very early in development.  It's definitely a clever fork of the existing DNA sequencing technology.  However, it also illustrates a significant organizational challenge which Oxford faces. So I'm going to spend a post focused on this while I figure out how to slice up the rest of the meeting.

Friday, April 15, 2022

Nanopore Knights' Notes

Clive Brown gave a "mezzanine" update on Oxford Nanopore just over two weeks ago titled "The Knights Who Say Me".  Clive reiterated a lot of prior guidance but did make a few announcements that are relevant to the ongoing history of the Oxford Nanopore platform - and blessedly, he omitted for time's sake a deep coverage of that history or the usual Nanopore 101 tutorial.    In particular, two long-time components of the platform are now headed for the exits.

Thursday, March 31, 2022

The End of the Beginning of Human Genome Sequencing?

Today in Science a slew of papers have been published from the Telomere-to-Telomere (T2T) Consortium.  The flagship paper details the generation of a complete genome assembly from a Complete Hydatidiform Mole (CHM) cell line which is telomere-to-telomere for all 22 autosomes plus X (assembly T2T-CHM13); the companion papers apply this groundbreaking assembly to a number of biological questions.  PacBio CSO Jonas Korlach and I chatted yesterday about the PacBio contribution to the flagship and two of the other papers, as well as another T2T preprint on automated assembly and a related paper from Heng Li and colleagues that recently appeared in Nature Biotechnology.  I did not have advance access to the T2T paper but it had appeared in preprint form and Jonas assured me that no substantial information was added in the published version.