Gotta first toot a tiny little horn of my own here. The potential for nanopore sensing to detect base methylation was suggested long before the MAP put ONT devices into users' hands, but I did post what I think was the first example of this in the Nanopore Community. I realized that an E. coli dataset generated and posted by Nick Loman was from a Dam+ Dcm+ strain, and these two methylases have frequent sites in the genome. If I aligned the data back to the reference E. coli genome, computed miscall rates and then plotted those as a function of the distance from the nearest Dam or Dcm site (I think I did both; it was almost a decade ago), I might see a rise in the error rate around the methylation sites, since the ONT basecaller considered an extended window of sequence (and indeed, the pore sees more than one base at a time). And indeed that was the case!!! Alas, I think that was the old community, as a search for "Robison Dam" finds nothing.
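For the curious, here is a minimal Python sketch of that analysis as I'd reconstruct it today; it is not the original code, which is long gone. It assumes you already have per-position mismatch and coverage counts from an alignment pileup (the `mismatches` and `coverage` lists are hypothetical inputs, as are all the function names). The motifs are the standard ones: Dam methylates the A in GATC and Dcm the inner C in CCWGG.

```python
import re
from bisect import bisect_left
from collections import defaultdict

DAM_MOTIF = "GATC"      # Dam methylates the adenine in GATC
DCM_MOTIF = "CC[AT]GG"  # Dcm methylates the inner cytosine in CCWGG (W = A or T)


def motif_starts(reference, pattern):
    """Return sorted 0-based start positions of every (possibly overlapping) match."""
    return [m.start() for m in re.finditer(f"(?={pattern})", reference)]


def distance_to_nearest(pos, sorted_sites):
    """Absolute distance from pos to the closest site, or None if there are no sites."""
    i = bisect_left(sorted_sites, pos)
    candidates = []
    if i < len(sorted_sites):
        candidates.append(sorted_sites[i] - pos)
    if i > 0:
        candidates.append(pos - sorted_sites[i - 1])
    return min(candidates) if candidates else None


def miscall_rate_by_distance(reference, sites, mismatches, coverage, max_dist=10):
    """Pool per-position counts by distance to the nearest motif start.

    mismatches[i] and coverage[i] are hypothetical pileup counts for
    reference position i (assumed inputs, not produced here).
    """
    err = defaultdict(int)
    cov = defaultdict(int)
    for pos in range(len(reference)):
        d = distance_to_nearest(pos, sites)
        if d is None or d > max_dist:
            continue
        err[d] += mismatches[pos]
        cov[d] += coverage[pos]
    return {d: err[d] / cov[d] for d in sorted(cov) if cov[d] > 0}
```

Distances here are measured to the motif start and are unsigned; a fuller version would measure to the methylated base itself and keep the sign so that upstream and downstream effects can be told apart.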
Okay, back to the serious recent stuff. There's a whole collection of methods that have been published that I'm going to call "Long Read Chromatin Footprinting" (but won't call LRCF because I don't like to inflict ugly acronyms on the world) because the community can't seem to settle on a single name/acronym - my spreadsheet tracking these has 21 different names, though some are variations on a theme due to slightly different protocols with slightly different goals (e.g. SAMOSA vs. SAMOSA-CHAAT). But the basic idea is to extract chromatin from cells and then treat with something that labels the accessible DNA but not that shielded by chromatin proteins. Single molecule sequencing then reveals the marks and thereby which chromatin was open.
In most of these protocols, the marking method generates 6-methyladenine, which is nearly non-existent in eukaryotic genomes. That means that the detection of native 5-methylcytosine and 5-hydroxymethylcytosine is not confounded by the marking. And of course mutations can be detected - so these methods potentially generate three orthogonal signals. Early methods used adenine methylases with four-base specificity, but NEB has an Enzymes for Innovation product line of wonderfully strange enzymes and one of those is EcoGII methyltransferase - which seems to have no context specificity for methylating adenines. So now nearly every publication uses EcoGII.
Angelicin
As stated: nearly every publication. A recent preprint from Angela Brooks' lab introduces a new marking agent: angelicin, a plant natural product that intercalates into DNA and can be covalently cross-linked with UV exposure (and named for the common garden plant Angelica, not the senior author of the paper!). The proposed advantage of this small molecule for marking is that very short linker regions between nucleosomes are not labeled by methyltransferases due to steric clashes with the nucleosomes. Importantly, each angelicin molecule crosslinks only to a single DNA strand, so the DNA is still competent for nanopore sequencing. Angelicin shows a preference for intercalating in the order TA > AT >> TG > GT, as shown in their plot below - the TATATA k-mer shows a notable shift in nanopore raw signal not seen in GGCGCG or CGTTAC.
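To make the dinucleotide preference a little more concrete, here is a small Python sketch that simply counts the four preferred steps in a k-mer. It is only a tally on the strand as written, not a model of nanopore signal, and the function name `dinucleotide_steps` is my own invention rather than anything from the preprint.

```python
from collections import Counter

# Steps in the preference order quoted above: TA > AT >> TG > GT
PREFERRED_STEPS = ("TA", "AT", "TG", "GT")


def dinucleotide_steps(kmer):
    """Count occurrences of each preferred dinucleotide step in kmer (given strand only)."""
    kmer = kmer.upper()
    counts = Counter(kmer[i:i + 2] for i in range(len(kmer) - 1))
    return {step: counts.get(step, 0) for step in PREFERRED_STEPS}


for kmer in ("TATATA", "GGCGCG", "CGTTAC"):
    print(kmer, dinucleotide_steps(kmer))
```

TATATA comes out with three TA and two AT steps and GGCGCG with none at all. CGTTAC does contain one TA and one GT step, so counting alone clearly doesn't predict which k-mers show a signal shift; take it as a rough guide to where angelicin could land, nothing more.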
The second thread is the nucleotide analog BrdU - bromodeoxyuridine, which replaces the methyl group of the thymine base with bromine. BrdU is incorporated in place of thymine into DNA, but oddly will pair with guanine during replication (why the asymmetry? I haven't been able to find an explanation) and so is classed as a mutagen - though apparently a mild one. BrdU has been used for many DNA studies - for example, that heavy bromine atom can be used to assist phasing in X-ray crystallography. It can also be picked up by electron microscopy - and now that I think of it, was probably part of the labeling scheme for attempts at direct sequencing of DNA by electron microscopy - commercialization attempts that I've been told by an inside source were starved of funding after ONT's big 2012 AGBT announcement.
A number of papers have shown that BrdU can be detected in nanopore data; if used in a pulse-quench or pulse-chase experiment, the BrdU will label DNA synthesized during a specific time period - a method that long predates nanopore sequencing. So with nanopores, one can perform this genome-wide. The recent preprint that caught my eye showed that human replication initiation occurs at more sites than previously thought, but in preparing this piece I discovered the long trail of preprints detecting BrdU with nanopores.
Why not both?
A technique called RASAM uses both BrdU labeling of nascent replication and methyltransferase marking of open chromatin simultaneously, potentially providing in one assay four different 'omics readouts - sequence variation, native methylation, open chromatin and nascent DNA synthesis. RASAM builds on a chromatin footprinting protocol called SAMOSA; I knew before that the latter was a component of South Indian cuisine and now I understand the former is too (and the same group has the variant SAMOSA-CHAAT; I'm getting hungry writing this!). Interestingly, these use PacBio for detection - PacBio can also be trained to recognize BrdU.
Speculations
There are probably many more ways to leverage the detection of angelicin and BrdU in single molecule sequencing.
For example, there is an interesting technique I've considered writing up (sadly, I have mislaid a draft) called Strand-Seq, and it relies on BrdU labeling to provide phasing information in difficult-to-map regions of the genome. While long reads and particularly ultra-long read sequencing have largely solved this, there might still be a niche for long read Strand-Seq. The chromatin footprinting schemes may be a generally interesting approach to studying genomic samples in rare disease research, in order to detect allele-specific open chromatin and methylation simultaneously - the utility of this has already been demonstrated in one case.
I've also cooked up some educational uses. Sequencing of BrdU-labeled DNA synthesis could be demonstrated in a relatively early lab experience, making those diagrams in intro bio much more relatable. If single molecule DNA library preparation becomes streamlined and inexpensive, then students could perform an updated version of the "most beautiful experiment in biology" (watch the video at the link - it's amazing) - the Meselson-Stahl experiment, demonstrating that DNA replication is semiconservative. If it all works, then after one replication one should find BrdU-labeled fragments mapping to one strand in each cell and BrdU-clean fragments mapping to the other. That should rule out - check my logic - both the dispersive model (which is from Max Delbrück!) and the fully conservative one.
This idea of updating Meselson-Stahl to modern molecular biology is dedicated to the anonymous Harvard undergraduate who heard either Meselson himself or Bill Gelbart explain Seymour Benzer's elegant phage recombination mapping experiments that attained single base pair resolution - and asked why he hadn't "just sequenced them".
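Since I asked you to check my logic, here is a toy Python sketch of what each replication model would predict after one round of synthesis in BrdU. Everything in it is an assumption for illustration: a strand is just a list of thymine sites that are either BrdU-substituted or not, new synthesis is 100% substituted, and the dispersive model is crudely approximated by flipping a fair coin at every site.

```python
import random


def daughters(model, n_sites=1000, rng=None):
    """Return the two daughter duplexes after one round of replication in BrdU.

    Each duplex is a pair of strands; each strand is a list of booleans,
    True meaning BrdU at that thymine site (a deliberately crude model).
    """
    rng = rng or random.Random(0)
    old = [False] * n_sites  # parental strand, no BrdU
    new = [True] * n_sites   # newly synthesized strand, assumed fully substituted
    if model == "semiconservative":
        return [(old, new), (new, old)]
    if model == "conservative":
        return [(old, old), (new, new)]
    if model == "dispersive":
        # Assumption: every site is independently old or new with probability 0.5
        def mosaic():
            return [rng.random() < 0.5 for _ in range(n_sites)]
        return [(mosaic(), mosaic()), (mosaic(), mosaic())]
    raise ValueError(f"unknown model: {model}")


def labeled_fraction(strand):
    """Fraction of sites on a strand carrying BrdU."""
    return sum(strand) / len(strand)


def summarize(model):
    """Per-duplex pairs of BrdU fractions, rounded for readability."""
    return [tuple(round(labeled_fraction(s), 2) for s in duplex)
            for duplex in daughters(model)]


for model in ("semiconservative", "conservative", "dispersive"):
    print(model, summarize(model))
```

In this toy, dispersive replication is the easy one to exclude, because every single read would be a mosaic. Semiconservative and conservative both give a mix of fully labeled and fully clean strands, so telling them apart depends on knowing which strands belong together, which is why the "in each cell" part matters.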
Back in the experimental world, one of my favorite modifications to see a single molecule method trained on is phosphorothioate linkages. These are often used in biotechnology for oligos to be used in vivo, because they are resistant to many nucleases. As with many seemingly clever inventions of humans, biology beat us by a million years or more. Some bacteria incorporate phosphorothioate into their DNA in a limited fashion. It's thought to be a restriction-modification system, but isn't well understood. It's quite a serious change, since the phosphorothioate linkages must be incorporated by nicking the DNA and limited resynthesis.
What other DNA modifications will fall next? And I've focused on DNA - the world of RNA modifications is known to be vast and
Long Read Chromatin Footprinting References
Note: I believe this is complete at this time but it is easy to miss a reference. I don't plan to maintain this publicly, but if you'd like to leave any missed or new ones in the comments, please do so!
1 comment:
Are you forgetting the early paper "Error rates for nanopore discrimination among cytosine, methylcytosine, and hydroxymethylcytosine along individual DNA strands." Proceedings of the National Academy of Sciences of the United States of America. 2013 by Schreiber et al.?