Illumina is rebranding its Constellation mapped read technology as TruePath Genome and launching it into the marketplace. Priced at $395 per genome, TruePath Genome was presented as being superior to true long read approaches for generating haplotype blocks and for resolving difficult genetic disease loci. TruePath Genome relies on on-flowcell tagmentation of long input molecules, meaning that library prep moves entirely to the flowcell. TruePath Genome runs on NovaSeq X 10B flowcells, with one sample per lane.
After Illumina CTO Steve Barnard gave an overview of Constellation, Marcel Nelen of UMC Utrecht presented data on a few clinical cases which were resolved with Constellation. Nelen's group uses the VoltaLabs Callisto for DNA extraction upstream of TruePath Genome.
A quick review. TruePath Genome works by flowing DNA, ideally high molecular weight DNA, onto a patterned flowcell. Transposases in the flowcell nanowells tagment the DNA to add the i5 and i7 adapters, which allow a cluster to form in that nanowell. Clusters which are proximal in space on the flowcell will often be from the same input DNA, and so the onboard DRAGEN compute applies algorithms to identify reads which are likely to have come from the same input molecule. The images below from the workshop show fluorescently stained DNA (green) draped across flowcell nanowells (red).
Once the proximity information is extracted, observed distances can be plotted against expected distances, much like with Hi-C or other such chromosome position data, revealing chromosomal rearrangements and aberrations as deviations from a strict diagonal band, with exemplary patterns illustrated below.
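To make the idea concrete, here is a toy Python sketch of the logic. It is emphatically not Illumina's DRAGEN implementation, whose details were not presented. It assumes each cluster carries a flowcell coordinate and a mapped genomic position. The `Cluster` class, the function names, the micrometre units and both thresholds are all hypothetical choices for illustration. Pairs of clusters that sit close together on the flowcell but map far apart in the reference are the off-diagonal points that would hint at a rearrangement.

```python
from dataclasses import dataclass
from itertools import combinations
from math import hypot


@dataclass(frozen=True)
class Cluster:
    """One sequenced cluster (hypothetical record layout)."""
    x_um: float   # flowcell x position, micrometres (assumed units)
    y_um: float   # flowcell y position, micrometres (assumed units)
    chrom: str    # chromosome the read mapped to
    pos: int      # mapped reference coordinate


def proximal_pairs(clusters, max_flowcell_um=5.0):
    """Yield cluster pairs lying within max_flowcell_um on the flowcell."""
    for a, b in combinations(clusters, 2):
        if hypot(a.x_um - b.x_um, a.y_um - b.y_um) <= max_flowcell_um:
            yield a, b


def discordant_pairs(clusters, max_flowcell_um=5.0, max_molecule_bp=100_000):
    """Return flowcell-proximal pairs whose genomic separation is implausible
    for a single input molecule (different chromosome, or farther apart than
    max_molecule_bp). Both thresholds are illustrative guesses."""
    flagged = []
    for a, b in proximal_pairs(clusters, max_flowcell_um):
        if a.chrom != b.chrom or abs(a.pos - b.pos) > max_molecule_bp:
            flagged.append((a, b))
    return flagged


# Made-up example data: three neighbouring clusters plus one distant one.
clusters = [
    Cluster(0.0, 0.0, "chr1", 1_000_000),
    Cluster(1.0, 0.5, "chr1", 1_020_000),   # neighbour, consistent with one molecule
    Cluster(2.0, 1.0, "chr1", 5_000_000),   # neighbour on flowcell, far away in genome
    Cluster(500.0, 500.0, "chr2", 42),      # far away on the flowcell, never paired
]

flagged = discordant_pairs(clusters)
for a, b in flagged:
    print(f"{a.chrom}:{a.pos} <-> {b.chrom}:{b.pos}")
```

The brute-force all-pairs loop is fine for a handful of clusters but would need a spatial index for anything realistic. Neighbouring nanowells can presumably also capture unrelated molecules by chance, so a real caller would want evidence accumulated over many pairs rather than a single one.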
Barnard emphasized that the $395 price covers everything downstream of DNA extraction - on-flowcell library prep eliminates a host of consumables typically found in library prep workflows and the onboard DRAGEN software performs the read mapping, proximity recognition, variant calling, and haplotype calling.
Illumina also emphasized workflow aspects. New Chief Medical Officer (and AGBT legend) Eric Green was quoted as saying that even after decades away from the bench, he could still run TruePath Genome. A comparison with PacBio for sample-to-insight time rated TruePath Genome at 32 hours and PacBio at 59 hours, with both data acquisition and computational processing significantly faster.
Barnard also presented a roadmap for future TruePath Genome development, with multiplexing planned for 2027 and layering on the 5-base (5-methyl-cytosine conversion) chemistry in 2028 or later. He also declared that TruePath Genome would be extended to improve segmental duplication calling, reduce turnaround time, support plant and animal genomes, support de novo microbial assembly and run on Illumina platforms beyond the NovaSeq X.
A funny little experience happened at ASHG where I had met a member of the Constellation team who had a poster and we set a time to meet to discuss the poster. On showing up, I discovered myself surrounded by the Constellation team - but this was certainly a friendly meeting and I was very flattered to get such generous treatment.
One question I asked back in October was why Illumina was only now driving the approach to commercialization, when the foundational paper from Jay Shendure's lab was about a decade old. Interestingly, the answer is that the original goal of the project was simply the on-flowcell library prep, and only after that was working did the team, somewhat on a whim, look into whether any connectivity information was captured. This is a bit ironic given all the previous attempts by Illumina to capture long-read information on their platform and their failed attempt to acquire PacBio; unrecognized over all that time was that a very good long range information method was hiding in their vast IP portfolio.
Back at ASHG we also discussed the challenge of building a rich rogues gallery of difficult, can't-be-solved-by-short-reads variants. The team was going through the Coriell catalog for samples with known structural variants, ideally ones which had been validated with existing long read methods. While it wasn't quite so bad, one pessimistic view is that each sample incremented the rogues gallery by just one difficult abnormality.
Illumina's strong claims of equivalence or even superiority for some of the difficult regions and variants are unlikely to go unchallenged by "Company P" and "Company N" - not to be confused with short read "Company E" or "Company U" (who could possibly decode such clever cryptographic puzzles????). Ideally some clinical groups would run completely unbiased head-to-head comparisons - same extracted DNA - of TruePath Genome, HiFi and ONT on some of the worst samples. Illumina might also want to look carefully at regions poorly sequenced by their competitors - e.g. PacBio's weakness when regions are primarily purines on one strand, pyrimidines on the other.
TruePath Genome is part of an Illumina push to retain customers, arguing they can generate superior data on the same instrument they already own rather than investing in different platforms. Illumina is also emphasizing the power of onboard DRAGEN processing, eliminating the need for downstream compute. Barnard didn't emphasize the point strongly, but a preprint last fall in collaboration with Fritz Sedlazeck did note the much lower input requirements for TruePath Genome - as low as 350 nanograms - which is far smaller than the microgram-scale quantities required for long read platforms. That is, at least, according to the preprint and Illumina; PacBio fans were quick to cry foul at the multiple micrograms listed as the requirement there rather than 500 nanograms, and ONT would probably have similar complaints about the 1000 nanograms listed.
Illumina's TruePath Genome will now be out in the wild, so it will be interesting to see feedback from a wider range of laboratories than the collaborators used during development of the technology. Also, there will be responses from PacBio, Oxford Nanopore and their collaborators and partisans. So stay tuned, as both the technology and its application will continue to be worth watching.


4 comments:
Is this really a good value proposition if you need to burn an entire 10B lane for a single sample? 10 billion reads -> 3 terabases -> ~30 30x human genomes. A 10B has 8 lanes, so TruePath effectively cuts the flowcell capacity by a factor of 4. It's my understanding that the cost of a flowcell without any reagents is thousands of dollars, which leads me to believe that $395 is just the per-sample cost, and does not include the cost of the flowcell itself.
Yes, that is the per sample cost - as you point out it does represent more sampling and even just using 10B flow cells you can get more samples per instrument per year using standard WGS
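The capacity arithmetic in this exchange can be checked with a few lines of Python. This is only a back-of-envelope sketch: the 10 billion reads and 8 lanes come from the comment, while the 2 x 150 bp read length and the 3.1 Gb genome size are assumptions filled in to reproduce the commenter's 3 terabase figure.

```python
# Figures quoted in the comment thread
READ_PAIRS_PER_FLOWCELL = 10_000_000_000   # "10 billion reads" on a 10B flowcell
LANES_PER_FLOWCELL = 8                     # one TruePath Genome sample per lane

# Assumptions not stated in the thread
BASES_PER_READ_PAIR = 300                  # assumed 2 x 150 bp paired-end reads
HUMAN_GENOME_BP = 3_100_000_000            # assumed approximate haploid genome size
TARGET_COVERAGE = 30                       # the "30x" of standard WGS

# Total sequence yield of the flowcell, in bases
total_bases = READ_PAIRS_PER_FLOWCELL * BASES_PER_READ_PAIR

# How many standard 30x genomes that yield would cover
standard_genomes = total_bases / (HUMAN_GENOME_BP * TARGET_COVERAGE)

# Standard genomes per flowcell relative to TruePath Genome samples per flowcell
capacity_ratio = standard_genomes / LANES_PER_FLOWCELL

print(f"Total yield: {total_bases / 1e12:.1f} Tb")
print(f"Standard 30x genomes per flowcell: {standard_genomes:.0f}")
print(f"TruePath Genome samples per flowcell: {LANES_PER_FLOWCELL}")
print(f"Capacity ratio: about {capacity_ratio:.1f}x")
```

Under those assumptions the yield works out to roughly 32 standard genomes against 8 TruePath Genome samples, so the "factor of 4" estimate holds up.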
Still 4x Element Biosciences' Vitari costs. What a progress!
TruePath genome is delivering a very different result than VITARI or any other standard short read - phasing and structural data. So your comparison makes little sense