A computational biologist's personal views on new technologies & publications on genomics & proteomics and their impact on drug discovery
Wednesday, May 17, 2023
Called Back To London Again
Tuesday, March 21, 2023
Thoughts on Unexpected Sequences Found In COVID mRNA Vaccines
Tuesday, February 21, 2023
Is Illumina Delivering the MVP of Long Reads?
Sunday, February 05, 2023
What's AGBT Like?
Tuesday, January 31, 2023
AGBT 2023 Is Nearly Upon Us!
Tuesday, October 25, 2022
PacBio Revio: Same Footprint, 80% The Time, 15X The HiFi!
PacBio has been rolling out announcements around the ASHG meeting and now delivers a huge one: the next generation SMRT instrument “Revio” will roll out next spring and it’s a big step up in throughput. With Revio’s 15X boost in per-run throughput over Sequel IIe, PacBio is touting this as 30X HiFi genomes for under $1K in sequencing consumables per genome.
Wednesday, October 19, 2022
Better Than FizzBuzz: First Bioinformatics Problem in an Erratic Series of Indeterminant Length
Thursday, October 06, 2022
General Inception Aims to Ignite New Company Formation
A week ago, a company calling itself General Inception emerged from stealth as a new concept, which they call an “Igniter company”, to promote the formation of new life sciences companies. As described to me in a phone conversation with General Inception CEO Paul Conley, General Inception provides a range of science and business expertise and support to enable embryonic ideas to condense into functional startups.
The igniter metaphor Conley offered me is the spark plug of a car: it is required to start the engine and continues to provide a key part of the functioning whole. Conley extended this to say that General Inception plans to go for the ride, but let the scientific founders occupy the driver’s seat. But another ignition metaphor occurred to me, the skill of turning a tiny spark from flint and steel into a raging campfire. Without careful nurturing, most tiny sparks will never ignite a blaze; only with careful and staged addition of oxygen and fuel does this reliably occur.
General Inception is not a fund, Conley stressed, but structured as a corporation. The ultimate goal is to make systematic and reproducible the generally artisanal craft of discovering and nurturing new ideas – as well as sometimes terminating efforts early that do not appear successful.
So what does General Inception offer? First, some of the boring business functions, such as quotidian finance tasks like paying bills. Also a wide range of expertise, from access to technical experts in a diverse set of biological disciplines to persons familiar with estimating markets and development paths. General Inception has lined up Contract Research Organizations and Contract Manufacturing Organizations, but not just as contract partners – General Inception has “meaningful” equity stakes and perhaps board seats in these companies. A key goal of General Inception is to identify key experiments which can be run quickly at these partners to test project concepts. This might be simply reproducing results from an academic lab or perhaps running a key experiment to de-risk the project. An example of a CRO partner is Triple Ring Technologies, which offers company incubator services and facilities both in the San Francisco Bay and Boston areas.
Interestingly, General Inception is casting a very wide net. Any life science concept is potentially a target: therapeutics, diagnostics, tool companies, agriculture, synthetic biology or whatever else is centered around biotechnology. Within the company there are defined practice areas: Tools & Diagnostics; Cell Engineering & Synthetic Biology; Therapeutics. But these are intended not to be siloed fiefdoms but rather foci which overlap each other and reinforce each other. After all, so many technologies are converging - a diagnostic may require synthetic biology or a therapeutic cell engineering. Conley believes human health will continue to grow – but non-health life sciences might grow even faster and overtake healthcare in terms of total economic value, so he is positioning the company to support company formation in all areas.
In terms of geography, they are scouring labs not only in the US but also in Europe. In the latter case, the companies that emerge might have R&D remaining in Europe but commercial operations headquartered in the US. Conley says the venture environment in Europe remains more conservative than the US, so General Inception can make a particularly large impact in Europe by helping ideas cross “the valley of death” to where a venture firm is comfortable investing in them.
Not only is General Inception stalking the halls of academia, but they are also talking to existing companies about dormant assets that might find new life in a startup. And General Inception hopes to form long-term relationships with innovators; having an idea fail early won’t be held as a demerit against the academic.
In terms of deal structures, General Inception looks to set themselves up as a founder, with founder’s common stock. Generally they would be funding companies at the seed or pre-seed stage, perhaps taking the place of angel investors or “friends and family” investments. Venture capital firms often demand preferred stock, which gives them first rights to the financial carcass of failed ventures - General Inception will be taking the same risk as scientific founders of not getting anything from liquidated companies.
General Inception itself has raised $60M from a set of venture capital firms. Their goal is to reach a steady state in which around 25 companies are seeded and graduated annually. General Inception also sees itself as evolving to an information business – with a large experience base of startups they hope to glean new insights into predicting what ideas work and finding the best company structures to maximize the chance of success. Venture firms that invest in General Inception will have early access to companies incubated by the company. Conley has been piloting the company since February 2020, refining the approach and the set of expert resources which General Inception can draw on.
In my career I’ve interacted with startups in a variety of contexts - as a potential employee or consultant, an actual employee or consultant and as a potential or actual partner. I’ve also daydreamed a rough business plan or two. I will be the first to proffer that this is hardly a comprehensive exposure to the variety of ways that companies are seeded. Thinking back on experiences such as Warp Drive, I can see the value of quick proof-of-concept experiments to either validate a company or simply nip off an idea that is unlikely to ever bear fruit – but I also know how complicated it can be to identify such experiments or assemble all the components to perform such an experiment. So I’m intrigued by General Inception and wish them well, though I reserve a certain amount of skepticism that starting new companies will ever be anything other than artisanal.
Wednesday, October 05, 2022
Monday, October 03, 2022
Illumina Roadmap Part 2: Infinity Becomes Illumina Complete Long Reads
Sunday, October 02, 2022
Wednesday, September 21, 2022
Notes From Coffee With MGI
A couple of weeks ago I sat down for coffee with a pair of MGI representatives - American Region CEO Yongwei Zhang and Director, Global Business Development Damon Zhang. Since I hadn’t been at AGBT 2022 (my 2023 application already filed!), Yongwei and I had planned to try to catch up the next time he was in the Boston area, so I braved our current subway issues (not one, but two major lines shut for extended maintenance!) and we covered a range of topics.
Tuesday, August 23, 2022
SRA Entries Should Not Ever Disappear Into Thin Air
Tuesday, August 16, 2022
Supply Stall Slows Singular
Thursday, June 23, 2022
AGBT 2022: Overhanging Questions
Wednesday, June 08, 2022
Admin: Feedburner to Follow.it Switch
Tuesday, May 31, 2022
Ultima Genomics Storms Out Of Stealth Promising $1/Gigabase Short Reads
To date, the new entrants targeting Illumina’s short read business have been aiming at the middle of Illumina’s range, trying to take on NextSeq. Element Biosciences is touting high accuracy for a low price. Omniome (now PacBio) also has positioned itself to tout accuracy. Singular Genomics is claiming to enable great flexibility and fast runs. But all aimed at NextSeq. As part of the run up to AGBT another company is decloaking from stealth mode: Ultima Genomics. They are going not after NextSeq but full throttle after Illumina’s pinnacle, the NovaSeq running the S4 flowcell. The value proposition is a large sequencing device that delivers S4 output at S1 prices for an overall cost of $1 per gigabase. Note that the interview for this piece was conducted under a CDA and Ultima reviewed my copy for accuracy and to ensure I didn’t disclose anything they had marked confidential. They were nice enough to offer to have me fly out to their facility, but I was forced by the damn coronavirus to cancel those plans the night before the trip. A preprint summarizing the technology is also out on bioRxiv. A trio of additional preprints have popped up as well, describing its application to a huge methylation sequencing dataset around colorectal adenocarcinoma, a huge Perturb-Seq dataset and large scale single cell RNA-Seq.
Ultima isn’t planning on truly launching until early next year, but they’re well on the way with paying early access customers. Indeed, AGBT will feature multiple posters and talks describing the use of the Ultima instrument for a variety of genomics tasks. And Ultima is confident that their architecture will support significant increases in future throughput, enabling per base costs to go even lower.
Ultima’s chemistry is flow based - using unterminated but fluorescently labeled nucleotides. Only a fraction of the nucleotides are labeled in each reaction, reducing the reagent costs and minimizing molecular scar accumulation. The reactions take place on beads whose templates are amplified via emulsion PCR - though for all the ePCR-haters out there Ultima will include a fully automated benchtop ePCR robot. Once primed, the beads retain the DNA polymerase, so this expensive component can be conserved between flows. The instrument is a single end reader – no paired ends – but compensates with a modal read length of around 300 bases, which should be enough to plow all the way across most short read inserts and their associated molecular indices.
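To make "flow based" concrete, here is a minimal sketch of how a synthesized sequence maps onto a series of flows when the nucleotides are unterminated. The flow order `TGCA` and the function name `to_flowgram` are assumptions made up for illustration; the post does not state Ultima's actual flow order or software interfaces.

```python
from typing import List

# Hypothetical flow order, chosen only for illustration.
FLOW_ORDER = "TGCA"


def to_flowgram(seq: str, flow_order: str = FLOW_ORDER) -> List[int]:
    """Return how many bases are incorporated in each successive flow."""
    if any(base not in flow_order for base in seq):
        raise ValueError("sequence contains a base missing from the flow order")
    counts: List[int] = []
    pos = 0
    flow = 0
    while pos < len(seq):
        nucleotide = flow_order[flow % len(flow_order)]
        run = 0
        # Unterminated chemistry: extension continues for as long as the
        # next base to synthesize matches the nucleotide being flowed.
        while pos < len(seq) and seq[pos] == nucleotide:
            run += 1
            pos += 1
        counts.append(run)
        flow += 1
    return counts


if __name__ == "__main__":
    # A 20-base poly-A stretch is consumed by a single A flow.
    print(to_flowgram("TTGC" + "A" * 20 + "G"))  # [2, 1, 1, 20, 0, 1]
```

The homopolymer length shows up as the size of one flow's signal rather than as separate cycles, which is why counting long runs precisely is the hard part of this kind of chemistry.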
The use of unterminated nucleotides has typically meant challenges in resolving homopolymers. Ultima is tuning their system to call homopolymers of up to 12 bases; based on discussions with customers and their own experience, accurate counting of longer homopolymers is deemed insufficiently valuable to focus on vs. other design tradeoffs.
But Ultima has found several ways in which unterminated flow chemistry can either have its weaknesses ameliorated or become downright boons. First, while it can’t accurately measure long homopolymers it can go straight through very long ones in a single extension cycle – so poly-A tails in cDNA ends can be easily blitzed through. This helps ensure reading all the way through inserts of things like single cell libraries. Second, for short homopolymers Ultima embeds in the Q-scores a probability matrix of the length – basically the odds of minus one and plus one versions of the sequence. This is leveraged by their customized version of GATK, developed with the Broad Institute. Third is a clever approach of “cycle shift variant calling” that I’m still stunned has never appeared in the literature for any other flow chemistry – 454, Ion Torrent or Genapsys. Cycle shift uses the known order of flows to increase the confidence in variant calls – particularly valuable for low coverage data such as cell-free DNA.
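One way to picture the cycle shift idea is with a toy model: under a fixed, repeating flow order, some substitutions push every downstream base into a later cycle of flows, while others leave the downstream flows where they were. The sketch below is only an illustration of that intuition. The flow order `TGCA` and the helper names `flows_consumed` and `cycle_shift` are hypothetical, and nothing here should be taken as Ultima's actual caller.

```python
# Hypothetical flow order, chosen only for illustration.
FLOW_ORDER = "TGCA"


def flows_consumed(seq: str, flow_order: str = FLOW_ORDER) -> int:
    """Count the flows needed to synthesize seq under a cyclic flow order."""
    if any(base not in flow_order for base in seq):
        raise ValueError("sequence contains a base missing from the flow order")
    flow = 0
    pos = 0
    while pos < len(seq):
        nucleotide = flow_order[flow % len(flow_order)]
        # Consume the whole homopolymer run in this flow, if it matches.
        while pos < len(seq) and seq[pos] == nucleotide:
            pos += 1
        flow += 1
    return flow


def cycle_shift(ref: str, alt: str, flow_order: str = FLOW_ORDER) -> int:
    """Whole cycles by which alt is displaced relative to ref (0 means none).

    Both sequences must end on the same base (shared flanking context), so
    the difference in flows is a whole number of cycles.
    """
    if not ref or not alt or ref[-1] != alt[-1]:
        raise ValueError("ref and alt must be non-empty and share their last base")
    delta = flows_consumed(alt, flow_order) - flows_consumed(ref, flow_order)
    return delta // len(flow_order)


if __name__ == "__main__":
    print(cycle_shift("TGCA", "TACA"))  # G>A: downstream bases land one cycle later
    print(cycle_shift("TGCA", "TCCA"))  # G>C: no displacement
```

In this toy model a miscounted signal in one flow changes a homopolymer length but does not move the downstream signals to different flows, so a read showing a whole-cycle displacement is harder to explain as a single counting error. That is a plausible reading of why such variants would earn extra confidence, not a description of the published method.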
Another key driver of low cost and high density is the use of a spinning, open “flow cell” (really a 200mm diameter wafer) for both reagent addition and imaging. Centrifugal force generated by the spinning (fake force, ha!) distributes the reagents as a very thin film, minimizing wastage. Imaging as the wafer spins enables shooting many tiles without having to repeatedly accelerate and decelerate the flowcell as a rectilinear scanning scheme must do. The speed difference adds up: Ultima can generate in 20 hours the same 3 terabases (10 billion reads of roughly 300 bases each) as a NovaSeq S4, but an S4 requires 44 hours to run – and Ultima believes they can shave that down to 16 hours. Faster cycle times mean more runs per instrument – and each instrument runs two wafers simultaneously, each with its own chemistry station but sharing an imaging path. The instrument features tanks for reagents which can be refilled, with a 24 hour capacity of each reagent. Six different wafers can be queued for running, with built-in automation removing spent wafers and swapping in new wafers with new library pools.
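The run-time comparison is easy to check with a few lines of arithmetic. This sketch uses only the figures quoted above (10 billion reads of about 300 bases, and run times of 20, 44 and 16 hours); the constant and function names are made up for illustration, and it treats the quoted output as a per-run figure without trying to account for the two-wafer configuration.

```python
READS_PER_RUN = 10e9   # about 10 billion reads, as quoted
READ_LENGTH = 300      # about 300 bases per read, as quoted

TOTAL_BASES = READS_PER_RUN * READ_LENGTH  # 3e12 bases, i.e. 3 terabases


def gigabases_per_hour(total_bases: float, hours: float) -> float:
    """Average output rate of a run, in gigabases per hour."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return total_bases / 1e9 / hours


if __name__ == "__main__":
    runs = (
        ("Ultima, current", 20),
        ("NovaSeq S4", 44),
        ("Ultima, target", 16),
    )
    for label, hours in runs:
        rate = gigabases_per_hour(TOTAL_BASES, hours)
        print(f"{label}: {rate:.1f} Gb/hour over {hours} h")
```

At the quoted numbers that works out to roughly 150 gigabases per hour against about 68 for the S4, a 2.2-fold difference in elapsed time for the same output.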
How might the system grow its output? The patterned wafers place the beads at a very conservative pitch. Larger diameter wafers are also a possible further option. Extending the read lengths is yet another possible expansion direction.
The instrument has onboard GPU compute power, which is currently used for basecalling and alignment and could ultimately also perform the variant calling work.
Current accuracy is 0.1% error for substitutions and 0.5% for indels. Most of the indel error is concentrated in homopolymers longer than 8 bases, with calling capped at 12. When used with the specially modified GATK co-developed with the Broad, or other custom DeepVariant or Sentieon pipelines, SNP calling accuracy of 99.7% precision, 99.7% recall is achieved and indel recall and precision range from 96-98% for small indels (excluding long homopolymers and low complexity regions). Accuracy suffers in low complexity regions, which Ultima attributes to the amplification chemistry rather than the sequencing chemistry; they believe they can significantly improve on the current performance.
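For readers who think in Phred terms, the quoted error rates can be converted with the standard formula Q = -10 * log10(p), and precision and recall can be folded into a single F1 score. This is a generic sketch using the numbers quoted above; the function names are illustrative, and it says nothing about how Ultima's own quality values are encoded.

```python
import math


def phred(error_rate: float) -> float:
    """Standard Phred scaling: Q = -10 * log10(error probability)."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in (0, 1]")
    return -10 * math.log10(error_rate)


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(f"substitutions (0.1% error): Q{phred(0.001):.1f}")
    print(f"indels (0.5% error): Q{phred(0.005):.1f}")
    print(f"SNP F1 at 99.7% precision and recall: {f1_score(0.997, 0.997):.4f}")
```

So 0.1% substitution error corresponds to about Q30 and 0.5% indel error to about Q23.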
Ultima plans to offer their own kits for PCR-free and PCR-based sheared genomic libraries. Libraries for other systems can be converted by a simple indexing PCR scheme - this has been done for TruSeq libraries and proof-of-concept experiments have been run for Nextera libraries.
What could be done with such an instrument? A pending publication uses Ultima and Illumina in parallel on the same 4 million cell Perturb-Seq experiment and finds the results equivalent between the platforms. A large fraction of the Phase IV ENCODE HiC data was generated on Ultima. An internal proof-of-concept experiment utilized deep sequencing of RNA from COVID-19 infected samples, recovering complete viral genomes after only ribosomal RNA depletion. One of the AGBT abstracts demonstrates the ability of Ultima WGS to detect minimal residual disease at low levels by deep WGS of cell-free DNA, an approach academia and startups are actively exploring. Additional AGBT abstracts describe population genetics studies, oncology, and rare disease sequencing. Ultima has 10 paying Early Access customers, with 7 instruments installed to date – and these run the gamut from large academic genome centers to biopharma to government labs. They hope to have “well into double digits” customers at the time of the official launch.
To get here Ultima has raised over $550 million and hired over 350 employees. The company has made steady progress from their start in 2016. Ultima CSO Doron Lipson previously was part of the teams at Helicos and Foundation Medicine, so he has extensive experience both in building a sequencing platform and applying it at scale. CEO Gilad Almogy has spent many years in the semiconductor manufacturing field - Ultima’s reaction wafers are patterned atop silicon substrates and the semiconductor industry also uses very high precision optical methods for both manufacturing and quality control.
Illumina for a long time now has had an unassailed position as leader in sequencing in the US market as well as others. Now that position is under pressure from all sides: Element and Singular are trying to squeeze the NextSeq market while Ultima is aiming for the top; Oxford Nanopore thinks their “short fragment mode” can compete as well and the patent shackles are being lifted from BGI. At JP Morgan in January Illumina said their “Chemistry X” would offer improvements in accuracy, read length and output, but absolutely no details have been forthcoming – and in particular whether new instruments will be required to access Chemistry X benefits. Perhaps the entry of Ultima and the others will add some urgency to Illumina communicating their future plans, lest customers start planning in earnest to opt for the new platforms.
For us consumers of sequence data, more competition and lower prices are a pure good. Projects can continue to be increasingly ambitious, and the number of different phenomena which can be converted into a sequence measurement simply keeps growing. More for less is never, ever going to become boring – it will always be enabling. After a long period of very shallow slope in the notorious “better than Moore’s Law” slide, we appear to be entering a new period of plunging sequencing costs. Time to start making plans to take advantage of it!