Tuesday, December 06, 2016

Reversible Terminators: Not Just For Sequencing

Reversible terminator nucleotides lie at the heart of sequencing-by-synthesis systems such as Illumina.  These nucleotides in their original state cannot be extended, terminating DNA polymerization.  But with the correct chemical treatment, the block is removed and polymerization can continue.  A recent paper moves the concept from sequencing to making large single mutation libraries.  The authors have apparently also applied for a patent (according to the Conflicts of Interest statement accompanying the paper), though that does not turn up on Google.
Testing a wide range of mutations in a known sequence is an excellent way to explore sequence-function relationships.  Such saturation mutagenesis libraries can identify which nucleotides or amino acids contribute to function or reveal mutations which can change the function of a region.

Several methods are widely used for generating saturation mutation libraries.  Error-prone PCR is a very inexpensive method to introduce many errors in a region.  By running PCR with one or more low fidelity polymerases, many alterations are introduced.  The downside is a lack of control. The types of mutations will be very biased, which is ameliorated only somewhat by using a mix of polymerases with different biases. Also, many templates will have multiple mutations and a few might have none.  I've collaborated on one such library at my current employer, and it wasn't clear for the application if the multiple mutations were a bug or a feature (the biased mutation spectrum was definitely a bug).  Multiple mutations, particularly for protein libraries, increase the odds of entirely non-functional constructs due to the introduction of stop codons or other serious mischief.  On the other hand, multiple mutations may explore a greater range of sequence/structure/function space, though they also make data analysis more complex.

At the other extreme are defined approaches which use specific oligonucleotides to introduce the desired mutations.  This can give much greater control, enabling single mutation libraries to be generated and even the option to avoid undesired mutations, such as changes at known required sites or synonymous mutations.  But the downside is a much greater expense for custom oligo synthesis and much more work to construct a library covering a large area.  Tooting my own horn again, a library built at Codon was eventually published which was shooting for every possible sense codon change in a single protein.  This library explored possible codon effects on folding of the protein (a G-protein coupled receptor) as well as single substitution effects.

The new paper proposes a strategy that has the simplicity and low expense found in error-prone PCR but a much higher level of control.  While specific single mutations can't be avoided, the system essentially guarantees that all mutant constructs will contain one and only one mutation (wild type constructs will also result).  This is accomplished via a reversible terminator, but one which will never be seen in a sequencing scheme.

Inosine is a nucleotide that acts as a wildcard, able to pair well with all four of the canonical bases. Spiking a reversibly-terminated inosine into a standard amplification mix enables a single primer to be used to create a collection of fragments which all end in an inosine.  Multiple cycles of such one-sided, linear amplification ensure coverage throughout the target region.  Chemically removing the terminating moiety from the inosine enables a final round of extension with a conventional nucleotide mix, completing the mutant strands.  A single extension with the reverse primer should introduce a mutation three fourths of the time via random incorporation across from the inosine in the first strand.
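The three-fourths figure is just counting, under the idealized assumption that the base ending up at the inosine position is drawn uniformly from the four canonical bases. Here is a minimal Python sketch of that arithmetic; the function name is hypothetical, and unbiased pairing is an assumption rather than a measured property of any polymerase.

```python
from fractions import Fraction

BASES = "ACGT"

def mutation_probability(original_base: str) -> Fraction:
    """Chance that a uniformly random base differs from the original one.

    Assumes idealized, unbiased incorporation opposite inosine
    (an assumption for illustration, not a measured value).
    """
    # Of the four equally likely outcomes, only one restores the original base.
    changed = [b for b in BASES if b != original_base]
    return Fraction(len(changed), len(BASES))

for base in BASES:
    print(base, mutation_probability(base))
```

Any bias in what a polymerase actually places opposite inosine would pull the real fraction away from this ideal.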

The terminator inosine is formed enzymatically from a commercially available reversible terminator adenosine.  Adenosine deaminase will convert adenosine to inosine, but not if the triphosphate is present.  So the terminator was first dephosphorylated, then deaminated and finally re-phosphorylated.

For a test case, the authors mutagenized the active site of an enzyme which breaks down penicillin and similar molecules, the TEM-1 beta-lactamase.  By varying the ratio of terminator to deoxynucleotides in the reaction, the fraction of single mutant molecules could be varied from 33% (1:1 ratio) to over 50% (4:1 terminator:normal), while maintaining low levels of multiple mutant clones (<1% for the 1:1 ratio, and still low for the 4:1 ratio).
Also important is a transition:transversion ratio of 0.48, whereas error-prone PCR created a ratio of 3.2.  A perfectly unbiased mutagenesis would create a ratio of 0.5, since of the 3 possible point mutations for a nucleotide, one is a transition and two are transversions (transitions are therefore one third of all mutations).
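The unbiased expectation can be checked by enumerating all twelve possible substitutions. In this short Python sketch (function names are hypothetical), transitions make up one third of all substitutions, which corresponds to a transition:transversion ratio of 0.5.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = "ACGT"

def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES

def unbiased_ts_tv():
    """Count transitions and transversions over all 12 possible substitutions."""
    transitions = transversions = 0
    for ref in BASES:
        for alt in BASES:
            if ref == alt:
                continue
            if is_transition(ref, alt):
                transitions += 1
            else:
                transversions += 1
    return transitions, transversions

ts, tv = unbiased_ts_tv()
print(ts, tv, ts / tv)  # 4 8 0.5
```

Against that baseline, the reported 0.48 is close to unbiased, while 3.2 for error-prone PCR is heavily skewed toward transitions.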

The original library used a very short insert, so that it could be fully sequenced on the Illumina platform. A second attempt used a longer segment encompassing the stop codon, with a unique molecular identifier (UMI) incorporated by the reverse primer.  This UMI is simply a randomized 20-mer barcode.  In the longer format, only a modest dropoff in mutation frequency with distance was seen.
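For a sense of scale, a randomized 20-mer has 4^20 possible sequences. The Python sketch below computes that number and draws one example barcode. The helper names are hypothetical, and uniform sampling is an idealization of what degenerate oligo synthesis delivers.

```python
import random

UMI_LENGTH = 20  # from the post: a randomized 20-mer

def umi_diversity(length: int = UMI_LENGTH) -> int:
    """Number of distinct barcodes of the given length over A, C, G, T."""
    return 4 ** length

def random_umi(length: int = UMI_LENGTH, rng=None) -> str:
    """Draw one barcode uniformly at random (idealized synthesis)."""
    rng = rng or random.Random()
    return "".join(rng.choice("ACGT") for _ in range(length))

print(umi_diversity())  # 1099511627776
print(random_umi())
```

With about 1.1 trillion possible barcodes, two independent molecules in a library of ordinary size are very unlikely to share a UMI by chance.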

The authors constrained themselves to single mutations, but one could imagine using the same system to create two or more mutations.  For example, if removal of the terminating nucleotide was followed by an extension with pure terminators, then two adjacent nucleotides would be mutagenized.  Since single mutations can't achieve all possible codon->codon swaps, this would explore a greater mutational space (though, of course, the pair of nucleotides could be at positions 1&2 of a codon, or 2&3, or the last base of one codon and the first of the next).  Or, by extending again with the original mixture, clones with 2 spaced mutations could be created.  By switching out templates, various other combinations can be imagined in which different zones of a target each contain a single mutation.
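The gain from adjacent double mutations can be put in numbers. This Python sketch (function names are hypothetical) counts the codons reachable from one starting codon by a single base change, and by changing two adjacent bases within the codon. It ignores the case where the pair straddles two codons.

```python
from itertools import product

BASES = "ACGT"

def single_mutation_neighbors(codon: str) -> set:
    """Codons that differ from `codon` at exactly one position."""
    neighbors = set()
    for pos, base in product(range(3), BASES):
        if base != codon[pos]:
            neighbors.add(codon[:pos] + base + codon[pos + 1:])
    return neighbors

def adjacent_double_neighbors(codon: str) -> set:
    """Codons reached by changing both bases at positions 1&2 or 2&3."""
    neighbors = set()
    for start in (0, 1):
        for b1, b2 in product(BASES, repeat=2):
            if b1 != codon[start] and b2 != codon[start + 1]:
                neighbors.add(codon[:start] + b1 + b2 + codon[start + 2:])
    return neighbors

codon = "ATG"
print(len(single_mutation_neighbors(codon)))  # 9 of the 63 other codons
print(len(adjacent_double_neighbors(codon)))  # 18 more
```

Together the two classes reach 27 of the 63 alternative codons. The rest need changes at positions 1 and 3 together, or at all three positions.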

Mutagenesis is an important way to understand the function of DNA, RNA and proteins, as well as to explore how proteins can be modified to unlock new functions or improve on existing ones.  Driven by companies such as Gen9 and Twist, the cost of designing highly controlled mutant libraries continues to drop.  Even so, such libraries remain expensive, leaving the door open for clever techniques which can inexpensively generate mutants.  This new technique morphs a technological by-product of advanced sequencing technology into a new tool for generating mutant libraries.


Anonymous said...

patents publish 18 months after filing of provisional application....

Anonymous said...

I had a brief look at the paper. I believe that something is wrong, because all polymerases copy an inosine in any template to a C. This has been long established by Kobayashi et al. (http://nass.oxfordjournals.org/content/48/1/225). I have experienced myself that inosine solely encodes for C when copied.
I suppose that some controls were missing and a retraction of the patent application will be pending (sic!).

Anonymous said...

Indeed, I tried to validate this method myself and found that deoxyinosine functions as a deoxyguanosine analog and pairs exclusively with deoxycytidine. The paper is deeply flawed.