Monday, August 06, 2007

Pre-WWW Hyperlinking

I recently attempted to rhapsodize on the wonders of restriction endonucleases. My exploration of this area has also reacquainted me with an amazing invention, what I might argue is the first artifact of what we now call synthetic biology.

An important early use for restriction enzymes, still going strong, is the cutting-and-pasting of DNA sequences. One heavily used early vector was pBR322, which was also one of the first DNA molecules to have its entire sequence determined. pBR322 was particularly useful because, for certain popular restriction enzymes, it contained only a single site, and that site was not in a critical region. This facilitated cloning into that site.

However, only a few restriction enzymes fit this description. In addition, a common problem with cloning into plasmids was that of empty vector, in which the plasmid reseals without capturing the DNA of interest. A clever scheme emerged somewhere along the way of cloning into a portion (the alpha peptide) of E. coli beta-galactosidase; if the plasmid captured an insert, then beta-Gal function would be disrupted. This loss of function would show up as white colonies when the E. coli were grown on media containing a synthetic compound, such as X-gal, that turns blue when cleaved by beta-Gal.

It turns out that this alpha peptide will accept a significant insertion of amino acids, and somewhere the germ of the idea of a polylinker emerged. The polylinker would contain many unique restriction sites and also enable blue-white cloning. For what I believe is the first time, a human sat down and designed a specific & novel DNA sequence for a specific & novel purpose and had it synthesized. Previous DNA synthesis efforts, such as the original effort by Har Gobind Khorana to make a tRNA gene or the synthesis of an artificial human hormone gene at UCSF, were intended to make something already extant in nature. The first polylinker was perhaps the first creative work of DNA!

That original polylinker was mirror-symmetric and had just 4 cloning sites, and the symmetry prevented cloning with pairs of different sites. Not long afterwards came the pUC polylinkers, which have each site represented only once and a very dense packing of sites. These have been propagated to many other vectors.

I've seen other polylinkers, but none seem to have the popularity of the pUC polylinkers. Shown is the pUC18 polylinker; one additional twist is that this sequence reads through (no stop codons) in either direction; pUC19 simply has the polylinker in the opposite orientation.

CAAGCTTGCATGCCTGCAGGTCGACTCTAGAGGATCCCCGGGTACCGAGCTCGAATTCGT
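The density of sites and the read-through property can both be checked directly against the sequence above. What follows is a minimal sketch in Python, not anything from the original pUC work: the function names (`reverse_complement`, `palindromes`, `stop_free_frames`) are invented here for illustration, and frame numbers are simply offsets from the first base of the sequence as printed, which is an assumption since the post does not say where the lacZ reading frame starts.

```python
# The pUC18 polylinker exactly as printed in this post.
SEQ = "CAAGCTTGCATGCCTGCAGGTCGACTCTAGAGGATCCCCGGGTACCGAGCTCGAATTCGT"

COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOPS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def palindromes(seq, length):
    """List (offset, site) for every window that equals its own reverse complement."""
    hits = []
    for i in range(len(seq) - length + 1):
        window = seq[i:i + length]
        if window == reverse_complement(window):
            hits.append((i, window))
    return hits


def stop_free_frames(seq):
    """Return the frame offsets (0, 1, 2) whose complete codons contain no stop."""
    frames = []
    for frame in range(3):
        codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
        if not any(codon in STOPS for codon in codons):
            frames.append(frame)
    return frames


for offset, site in palindromes(SEQ, 6):
    print(offset, site)
print("forward stop-free frames:", stop_free_frames(SEQ))
print("reverse stop-free frames:", stop_free_frames(reverse_complement(SEQ)))
```

Run on the sequence as printed, the scan should turn up ten 6-bp palindromes, one for each of the familiar sites from HindIII (AAGCTT) through EcoRI (GAATTC). The frame check is a reminder that "no stop codons" is a statement about particular frames: the TAG inside the XbaI site (TCTAGA) closes one of the three frames in each direction, while the other two stay open.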

Two pedagogic angles occur to me. For any biology class, it would be fun to follow up the session on restriction enzymes by handing each student the pUC polylinker sequence. The assignment is to find as many six- or eight-base-pair palindromes as possible. The other interesting assignment would be for an advanced bioinformatics class: write a program to take a set of restriction enzymes and build a polylinker with them, with shorter outputs scoring higher and bidirectionality scoring higher. Such an exercise will really underline the achievement of the pUC design, which I believe was done with pencil-and-paper, not by computer program.
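As a starting point for the bioinformatics assignment, here is a rough sketch of one possible approach: treat the recognition sites as strings and greedily merge the pair with the largest suffix-prefix overlap until one sequence remains. This is the standard greedy heuristic for the shortest-superstring problem, swapped in here for illustration; it is not how the pUC polylinker was designed, and the names `overlap`, `greedy_polylinker` and `site_counts` are hypothetical. It ignores reading frame, stop codons and bidirectionality, and it does not guarantee that each site ends up unique, so `site_counts` is included to check that afterwards.

```python
from itertools import permutations


def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_polylinker(sites):
    """Greedy superstring: repeatedly merge the best-overlapping pair of fragments."""
    frags = list(dict.fromkeys(sites))  # drop duplicates, keep input order
    # A site already contained in another site needs no room of its own.
    frags = [s for s in frags if not any(s != t and s in t for t in frags)]
    if not frags:
        return ""
    while len(frags) > 1:
        best_k, best_a, best_b = -1, None, None
        for a, b in permutations(frags, 2):
            k = overlap(a, b)
            if k > best_k:
                best_k, best_a, best_b = k, a, b
        frags.remove(best_a)
        frags.remove(best_b)
        frags.append(best_a + best_b[best_k:])
    return frags[0]


def site_counts(linker, sites):
    """Count occurrences of each site (overlapping ones included) in the linker."""
    return {
        s: sum(linker.startswith(s, i) for i in range(len(linker) - len(s) + 1))
        for s in sites
    }


# The ten 6-bp sites found in the pUC18 polylinker shown earlier.
PUC_SITES = ["AAGCTT", "GCATGC", "CTGCAG", "GTCGAC", "TCTAGA",
             "GGATCC", "CCCGGG", "GGTACC", "GAGCTC", "GAATTC"]

linker = greedy_polylinker(PUC_SITES)
print(len(linker), linker)
print(site_counts(linker, PUC_SITES))
```

A fuller answer to the assignment would add the scoring described above, rejecting candidates that duplicate a site or introduce a stop codon in the chosen frame, and rewarding ones that read through on both strands.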

2 comments:

Anonymous said...

It's amazing how many techniques have come and gone in the past 25-30 years. I can remember the pre-PCR struggles to locate appropriate restriction sites to insert a piece of DNA into a vector. Thanks for the reminder of how much ingenuity has been expended to make our scientific lives so much easier!

Schmootzie said...

Yes, it has been a long time since pBR322. This was fun to read. Remember the good old days when grants were not impossible to get and the cloning was a pain in the tail?

Thanks for your thoughts.