Canal Biosciences is a small startup in the library prep space that is making its debut at AGBT this year as a contributing sponsor. Based in the historic mill town of Lowell, Massachusetts, with its many canals, the company is bringing forward a ligation chemistry that needs no end repair, which offers many interesting advantages. Three of Canal's scientists (half the team!) chatted with me on the eve of AGBT about their TrueLink chemistry.
Enzymatic ligation lies at the heart of a large number of NGS library preparation protocols. Tagmentation is probably the next most common, and then there is a long tail of niche methods that introduce sequencing adaptors by random priming, click chemistry or other chemistry. Ligation traces back to the earliest days of molecular cloning. At first such techniques relied on the sticky overhangs generated by Type II restriction enzymes, but later methods were developed for blunt ends. DNA sheared mechanically or by non-specific nucleases can have a variety of ends, so "end polishing" or "end repair" protocols were developed, taking advantage of the properties of T4 DNA polymerase. Under the right conditions, this enzyme will extend a recessed 3' hydroxyl (filling in a 5' overhang) and trim back a 3' overhang, leaving a blunt end either way. Later came A-tailing and Y-adaptors to ensure directional generation of sequencing library fragments - no i5-i5 or i7-i7 fragments (to use Illumina nomenclature), which would either drop out during PCR or just waste flowcell space in a PCR-free protocol.
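To make the fill-in and trim-back behaviour concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions: each strand is reduced to a pair of reference coordinates, the two strands are assumed to overlap, and the `Fragment` and `end_repair` names are my own invention rather than anything from Canal or an existing library. It captures just the idealized outcome described above, in which the repaired duplex is defined by the two 5' ends.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """Toy duplex: each strand is a half-open interval on a reference."""
    top_start: int  # 5' end of the top strand (left side)
    top_end: int    # 3' end of the top strand (right side, exclusive)
    bot_start: int  # 3' end of the bottom strand (left side)
    bot_end: int    # 5' end of the bottom strand (right side, exclusive)


def end_repair(frag: Fragment) -> tuple[Fragment, int, int]:
    """Idealized end repair: return (blunt fragment, bases filled in, bases trimmed).

    A recessed 3' end is extended to match the opposite 5' end (newly
    synthesized bases); a 3' overhang is chewed back (original bases lost).
    """
    if max(frag.top_start, frag.bot_start) >= min(frag.top_end, frag.bot_end):
        raise ValueError("strands must overlap to form a duplex")

    # Left side: the bottom strand's 3' end is made flush with the top 5' end.
    filled = max(0, frag.bot_start - frag.top_start)
    trimmed = max(0, frag.top_start - frag.bot_start)

    # Right side: the top strand's 3' end is made flush with the bottom 5' end.
    filled += max(0, frag.bot_end - frag.top_end)
    trimmed += max(0, frag.top_end - frag.bot_end)

    blunt = Fragment(frag.top_start, frag.bot_end, frag.top_start, frag.bot_end)
    return blunt, filled, trimmed


# Hypothetical fragment: 4-base 5' overhang on the left, 5-base 3' overhang on the right.
example = Fragment(top_start=100, top_end=260, bot_start=104, bot_end=255)
repaired, filled, trimmed = end_repair(example)
print(repaired, filled, trimmed)
```

In this made-up example four bases are newly synthesized and five original bases are discarded, so neither terminus of the repaired molecule fully reflects what was in the tube.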
End repair into ligation (with possibly a stop at A-tailing) has served the sequencing community well for decades, so don't fix what isn't broken, right? Except there are issues becoming increasingly apparent with this approach, especially as DNA sequencing moves beyond genome assembly, transcriptome discovery and variant calling into epigenomic markings and naturally fragmented DNA. Plus the sequencing instruments deliver increasingly accurate reading of the library fragments they are given, so it is important to not alter the sequence from its native state.
Those cracks have been showing. First, there are fields using DNA that is extracted as fragments - ancient DNA, forensic DNA and cell-free DNA. Many of these fragments are short to start with, so discarding any sequence by trimming 3' overhangs is painful. Worse, the emerging field of fragmentomics has shown that the lengths and termini of cell-free DNA carry critical information about the processes that generated the cell-free DNA.
T4 DNA polymerase also lacks the fidelity of many polymerases, and so potentially generates base substitutions. Element Biosciences dealt with this in their Q50 protocol by running dark cycles across the bases most often filled in by T4, which they showed had lower accuracy. But the Broad has published data showing that end repair often rewrites DNA far from the ends of the molecule, with all the problems of rewriting the native DNA. And if you are interested in methylation marks, rewriting will erase them, and the filled-in bases won't carry the marks present on the opposite strand.
One solution to this is to denature the sample and use a single stranded DNA prep, such as the one marketed by Claret Biosciences. Canal is avoiding that step by creating a double stranded ligation chemistry which can still ligate to recessed 3' ends, recessed 5' ends or blunt ends. It also functions on single stranded DNA. The company is cagey about the exact chemistry it uses to accomplish this - tiny startups must be careful not to give fast follower competitors more information than strictly necessary - but their issued patents discuss using mutant ligases and modified oligonucleotides. One gets the impression that co-founder Yu Zheng - a veteran protein engineer with stops at IDT and NEB and a previous founder of a protein engineering startup - probably regularly dreams of atoms dancing the complex molecular choreography of a ligation reaction.
Canal's first TrueLink product is a PCR-indexed DNA kit, with 24 reactions for $1.2K and 96 for $4.35K. This is a bit of a premium over existing ligation kits, but you are getting a superior chemistry which won't trim ends or erase methylation. Soon to follow - making and validating all the indexing oligos isn't a small task - will be TrueLink Absolute PCR-Free. These kits are intended to take in DNA either from cell-free inputs or downstream of mechanical shearing; commercial nuclease fragmentation kits typically contain end repair enzymes, which would defeat the purpose. Canal plans to offer their own nuclease mix in the future to enable enzymatic shearing.
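For anyone budgeting, the per-reaction arithmetic on those two kit sizes is quick to check. The snippet below is just a back-of-the-envelope sketch using the list prices quoted above; the variable and function names are illustrative, and it ignores shipping, discounts and any other reagents you would need.

```python
# List prices as quoted in this post (USD), keyed by reactions per kit.
KIT_PRICES_USD = {24: 1200.00, 96: 4350.00}


def cost_per_reaction(reactions: int, price_usd: float) -> float:
    """Kit price divided evenly across its reactions."""
    return price_usd / reactions


per_rxn = {n: cost_per_reaction(n, price) for n, price in KIT_PRICES_USD.items()}

# Fractional saving per reaction when buying the 96-reaction kit instead of the 24.
saving = 1 - per_rxn[96] / per_rxn[24]

for n, cost in per_rxn.items():
    print(f"{n}-reaction kit: ${cost:.2f} per reaction")
print(f"Saving per reaction with the larger kit: {saving:.1%}")
```

That works out to $50.00 per reaction for the small kit and about $45.31 for the large one, a bit over 9% less per library.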
Canal's libraries nearly eliminate the higher error rates seen at the ends of reads. Interestingly, on Illumina platforms the first few bases still show a highly elevated error rate. Canal hasn't tested this on another platform - I'd suggest Element or PacBio HiFi would be very interesting. Is it just a basecalling algorithm weakness?
An interesting additional benefit of TrueLink is a reduction in the number of chimaeras - despite all the Y-adapter trickery, ligation libraries can contain as much as 1% chimaeric fragments. Those are certainly unwelcome when trying to detect structural variants, particularly in low coverage data.
Canal has plenty more in development - kits tuned for RNA-Seq (targeted for next year) and for methylation sequencing. The latter would support both cytosine-conversion chemistries and 5mC conversion chemistries.
So expect even more from this small company that wants to steer sequencing library preparation away from the time-tested, but problematic, paradigm of end repair followed by A-tailing.
