Thursday, December 26, 2019

Long Overdue and Overly Short Notes on Clive's NCM 2019 Talk

A theme of the 2019 Nanopore Community Meeting in New York was the long and short of nanopore sequencing.  While the public sparring with Illumina/PacBio over the definitions of sequencing types wasn't explicitly discussed, ONT certainly wants to make sure that people understand they don't intend to ignore applications that are naturally short reads.  I've been slowly trying to get this summary to gel for a while, with the usual distractions this time of year: some trips, planning for holidays and a bout with a virus.  Plus general procrastination.

I'm just going to cover Clive's talk, though there were some really spectacular presentations (including one by someone who remarked that they hoped their upcoming thesis committee meeting would go well! That was one of multiple excellent platform talks by very junior scientists).  If you'd like to watch it yourself, the video of Clive's talk is online.  Watching it again is kind of fun, except for the distraction of seeing a guy with huge glasses trying to live tweet from the front row.  I've put rough timepoints in brackets for some of the topics; you may need to slide back or forth a bit to nail it exactly, but they can land you near the right spot.

Things We Might Be Able to Buy in 2020

R10.3 & R11

Perhaps the biggest, and certainly the most concrete, announcement is the rollout of another, semi-final version of the R10 pore chemistry.  It is dubbed R10.3, with the previous chemistry now called R10.0 (which means the pre-release version must have been R10 point negative one!).  In addition to better overall accuracy than R10.0, R10.3 has a capture efficiency just a bit below R9.4.1, whereas R10.0 was substantially worse [33:00].  The semi-final bit is that Clive's slide states that "best features from these two pores are being combined", so more iterations may show up in the future.  Clive also suggested that optimizing the software and biochemical conditions hasn't halted.  Another interesting tidbit is that R10.0 couldn't be fitted onto PromethION whereas R10.3 can be.  R10.3 is expected to ship in January for MinION/GridION and February for PromethION; fitting R10.3 onto Flongle wasn't even mentioned.

No release schedule yet, but Clive said ONT has an R11 pore well under development.  R10 can apparently be mixed with R11 in the same device, and R11 has a pattern of errors that is highly anti-correlated with R10's.

Flowcell Running Boosts

I'm particularly excited about three changes which will improve performance in a manner appropriate for a high throughput environment.  There are tricks which people use if they can babysit flowcells.  These include topping up fuel (aka ATP) and applying either washes or nuclease flushes to clear jammed pores.  But those require intervention, and if personnel time is considered the most irreplaceable resource, they're not really options.

As announced at London Calling, ONT is tackling the fuel problem with two strategies.  The first, promised for Q1 of next year, is to provide an ATP regeneration system in the running buffer.  This is a long-standing trick in high throughput screening: include some compound that can donate phosphate to ADP plus the enzyme to catalyze that reaction.  The second approach is to solve a major source of ATP depletion: idling motor proteins.  The current motor burns some ATP even if not docked at a pore.  Mutant motors have been developed which have 10% of the idle ATP consumption of the current motors.  These are only promised for next year [17:47].  The ATP burn issue is particularly a problem for short insert libraries, as these will have a higher number of motors sitting around.

The other advance is to embed nuclease on the trans side of the pores to chew up DNA after it has gone through the pore.  Secondary structures forming on the trans side have become the go-to hypothesis for why really long DNA kills pores and why chicken DNA kills pores.  Details on the system are still being worked out; according to one employee I talked to, there may be an inhibitor on the cis side to inactivate any nucleases that sneak over via a popped pore, though it may also be that the tiny volume of the trans side reservoirs (which are not common to all the pores; more on that below) makes that unnecessary. [5:04]

Clive also spoke about advances in new "trans-tethering" chemistry.  The current tether chemistry has been a double-edged sword.  Once the tether is in the membrane it helps library molecules find a pore faster by reducing the problem to a 2D search rather than a 3D one.  But the current tether is very greasy and, since it is pre-attached to library molecules, causes them to stick everywhere.  The new chemistry puts the tethers in the membrane, with ONT saying this will boost sensitivity by 200X.  Since the new tethers will be in the flowcell, library molecules presumably would no longer need to carry the greasy tether themselves.  No timeframe for commercial release though [45:40].


Don't expect the gaggle of basecallers to be thinned anytime soon; ONT announced yet another research-grade basecaller architecture at the meeting.  Bonito uses convolutional networks and has a shockingly compact Python codebase, though it also has very minimal documentation and is not for the faint of heart.  This joins the research grade Flappie and Runnie.  [24:20]

A related basecaller issue is that the adapter sequence causes a "lift" in the signal and this particularly degrades base accuracy for short inserts.  Scaling is also affected by base composition, with short reads having greater sensitivity to this.  A short read scaling mode for poly-T has been released into Guppy and further adapter scaling will be released into Guppy soon [15:32].

RNA basecaller is now of similar accuracy to DNA basecaller.  Look around 31:43 for a really crazy Venn diagram of RNA modifications.

DNA modification callers to date have been very context-specific and typically trained on CpG, GATC (Dam) and CCWGG (Dcm).  ONT has an all-contexts methylation caller in development, which would be quite impressive if successful.

Not exactly basecallers, but two other informatics notes are that ONT is enhancing the LIMS interface to their systems and has introduced VBZ compression for FAST5 files.  This is claimed to give substantial improvements (35% the file size with 10% the required compute in comparison to GZIP) and to be enabled by simply adding a plug-in to your HDF5 kit (I haven't tried this myself).  [8:56]


A related note is that ONT is bringing back 2D libraries with a new 2D"C" protocol to generate library molecules which have strands linked by a hairpin on one end, allowing forward and complementary-to-forward sequence data to be read.  A key aspect of the new scheme is that the hairpin adapter contains a nick which is extended so as to resynthesize the complementary strand.  This has an interesting twist that for palindromic modification sites one strand will be modified and one unmodified.  ONT also floated the idea that this enables "8B4" chemistry, in which incorporation of base analogues in the extension reaction could enable better homopolymer deconvolution.  In these libraries, 80% of the read molecules generate 2D-C reads.  [39:18]

To enable 2D-C ONT is working both on new basecallers and yet another flowcell type, R9.6 pore combined with E8.1 motors.  This combination delivers much higher raw accuracy for the complement strand; apparently the hairpin likes to re-form on the trans side and this alters the electrical signal.  Perhaps yet another boost in future from trans nuclease flowcells here.  The new basecallers consider both strands simultaneously but aren't ready for release, though Clive suggested it is "some weeks or months" away.  ONT is targeting better than Q20 for 2D-C accuracy [41:20].
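For anyone who doesn't think in Phred scores, here is a minimal sketch of what a "Q20" target means in terms of per-base error rate.  It uses only the standard Phred definition, Q = -10 * log10(p); the function names are my own and have nothing to do with any ONT software.

```python
import math

def phred_to_error(q):
    """Convert a Phred quality score to a per-base error probability."""
    return 10 ** (-q / 10)

def error_to_phred(p):
    """Convert a per-base error probability to a Phred quality score."""
    return -10 * math.log10(p)

# Q20 corresponds to 1 error in 100 bases, i.e. 99% accuracy
q20_error = phred_to_error(20)
print(f"Q20 error rate: {q20_error:.4f} (accuracy {1 - q20_error:.2%})")

# Going the other way: 99.9% accuracy is Q30
print(f"0.1% error rate: Q{error_to_phred(0.001):.0f}")
```

So "better than Q20" means fewer than one error per hundred bases in the 2D-C read.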

Next-Gen ASIC

Clive described again the next-generation ASICs announced at London Calling and went into some detail on devices these will enable.

The Plongle concept with 96 "flow wells" continues to be developed, with a target of $25-$50 per flow well [51:40].  Here ONT adopts the angle of single-cell RNA sequencing vendors of talking in the small multiple prices, rather than the aggregate to run an experiment -- $50/well is nearly $5K per entire grid.  Still, for a lot of applications 96 independent flowcells is superior, particularly with ONT's native barcoding mired at a low plex level of 24, and even with a promised output of only 0.5-1Gb per well.  No word on which pore chemistry would be formatted this way.
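The per-well versus per-plate arithmetic can be sketched quickly.  This only multiplies out the quoted targets ($25-$50 per well, 0.5-1Gb per well, 96 wells); the function name is hypothetical and it assumes every well on the plate is used.

```python
WELLS_PER_PLATE = 96  # Plongle format described in the talk

def plate_totals(price_per_well, gb_per_well, wells=WELLS_PER_PLATE):
    """Return (total cost in dollars, total output in Gb) for a full plate."""
    return price_per_well * wells, gb_per_well * wells

# Low and high ends of the quoted targets
low_cost, low_gb = plate_totals(25, 0.5)
high_cost, high_gb = plate_totals(50, 1.0)

print(f"Plate cost:   ${low_cost:,} to ${high_cost:,}")
print(f"Plate output: {low_gb:.0f} to {high_gb:.0f} Gb")

# Cost per Gb depends on which price pairs with which yield
best = 25 / 1.0    # cheapest well, highest yield
worst = 50 / 0.5   # priciest well, lowest yield
print(f"Cost per Gb:  ${best:.0f} to ${worst:.0f}")
```

A full plate would therefore land somewhere between $2,400 and $4,800 for roughly 48 to 96 Gb in total, if those targets hold.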

Clive unveiled two different MinION Mk2 flowcell concepts.  One is something looking like a tiny flat thumbdrive that he depicts plugging into a tiny USB dongle.  The other looks more the scale of a current MinION flowcell but includes a trail of positions in which on-board sample prep could occur [52:39].  Mk2 is promised to have no on-board electronics and would be fully disposable.  Clive also showed a picture suggesting the follow-on to GridION might simply look like a USB hub.  Personally I'd suggest exploring other geometries, particularly those that would play well with liquid handling robots (octagonal not a first choice there!).

Clive also makes some interesting comments about future production plans -- devices would be printed on flexible ribbons on reel-to-reel devices and then cut into specific units [53:20].

Odds & Ends

Miscellaneous commercial notes: ONT is making progress on catching up to their Flongle backorder challenge, though order volume continues to expand [12:46].  XL kits are launching for LSK109, with 48 reactions and around a 20% per-reaction discount, and for the wash kit [8:30].  MinION Mk1C is launched but the GSM (cellular) radio isn't working correctly [11:20].  Pore-C protocol for HiC calling works very well and should be out in the public.

DNA & Protein Sequencing on VolTRAX

Clive went over again the idea of sequencing on VolTRAX by capturing pores into membranes on instrument.  For DNA or RNA this would enable very low throughput operation but with the ability to access both sides of the pore.  So one could refuel repeatedly or perhaps even recover sequenced molecules from the trans side.  Clive talks of using this to fully sequence the contents of a cell. Droplet size is 10nL, perhaps to be reduced to 4.5nL.  [41:53]

Perhaps more exciting is the prospect of protein sequencing with this system. Building from work by Jeff Nivala, who spoke at London Calling, the idea is to use the ATP-dependent protein unfoldase / protease complex ClpXP on the trans side to pull peptides through.  Clive showed plots of signal and said that development of "amino callers" is well underway [44:20].

New Architecture

Clive dove into more detail on a new architecture for the pores [47:20].  A key difference is that these would sense voltage and not current, and that there would be full access to the trans side. Current devices pack pores at approximately 200 micron pitch; the new design has been tested at 50 and 20 microns. 
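To put the pitch numbers in perspective, here is a rough sketch of how pore density scales with pitch.  It assumes the pores sit on a simple square grid, which is my assumption and not something stated in the talk; the function name is likewise made up for illustration.

```python
def density_gain(old_pitch_um, new_pitch_um):
    """Fold increase in pores per unit area when pitch shrinks.

    Assumes a square grid, so density scales with 1 / pitch**2.
    """
    return (old_pitch_um / new_pitch_um) ** 2

CURRENT_PITCH_UM = 200  # approximate pitch of current devices

for new_pitch in (50, 20):
    gain = density_gain(CURRENT_PITCH_UM, new_pitch)
    print(f"{CURRENT_PITCH_UM} um -> {new_pitch} um pitch: {gain:.0f}x the pores per area")
```

Under that assumption the 50 micron design packs 16 times as many pores into the same area and the 20 micron design 100 times as many.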

Clive showed a table of gaudy numbers suggesting a Flongle-class device could have 10K channels and generate 390Gbp per day, a MinION-class device 100K channels and 3.9Tbp per day, and a big version ("XL") 1M channels and 39Tbp per day.  Of course, getting data off such beasties could be a real challenge -- indeed it was that challenge that proved to be a titanic iceberg for Ion Torrent's scale-up plans.
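A quick sanity check on those projections is to work out what each one implies per channel.  The sketch below only divides the quoted daily output by the quoted channel count; the device labels and function name are mine.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# (channels, bases per day) as quoted in the talk
projections = {
    "Flongle-class": (10_000, 390e9),
    "MinION-class": (100_000, 3.9e12),
    "XL": (1_000_000, 39e12),
}

def per_channel_rate(channels, bases_per_day):
    """Average bases per second each channel must deliver, around the clock."""
    return bases_per_day / channels / SECONDS_PER_DAY

for name, (channels, bases_per_day) in projections.items():
    rate = per_channel_rate(channels, bases_per_day)
    print(f"{name}: {bases_per_day / channels / 1e6:.0f} Mbp per channel per day, "
          f"about {rate:.0f} bases/s per channel")
```

All three rows work out to the same 39Mbp per channel per day, or roughly 450 bases per second sustained in every channel around the clock, so the table presumably reads as a ceiling rather than a typical yield.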

But a friendly ONT employee pointed out a different possible application for such huge pore arrays that wouldn't require stripping all the data off.  With clever multiplexing, a set of pores might be used as a very sensitive capture area for very low concentration libraries, with signal only needed to be streamed from the rare pore that grabbed a strand.  Or, particularly since with access to the trans side one can have an essentially infinite supply of electrochemical mediator, an array might be used for very long time periods as an environmental sensor, with live pores being rotated into use to replace dead pores.

In any case, ONT claims they will ship in 2020 devices based on this new architecture.  Clive also claimed low cross-talk for the design.

What Wasn't Discussed?

An ONT employee approached me before Clive's talk and asked me what I anticipated.  That particularly brought into focus my failure to keep a table of announcements and promises (in the best case, I'd have slip charts to refer to for some of the more tortuous ONT product releases such as VolTRAX and PromethION).  So it's hard to keep track of things that weren't discussed.  There are also scabs I'd rather not pick at.  But what wasn't in the air?

1D^2 is clearly going to be replaced with 2D-C; not only does it never seem to have become very popular (based on a dearth of Nanopore Community posts or Twitter traffic), but you can't even order the special R9.5 flowcells with the normal methods.

There isn't anything out there to combine different flowcell data in a principled way; Medaka takes R9 or R10 data but not both together.  Perhaps there isn't much to be gained there yet; R10.0 in my hands is just enormously better than R9.4.1 so I expect R10.3 to be better still (which is what ONT showed).

No word on ONT's efforts on DNA synthesis. Also, SmidgION seems to have gone without mention -- the tiny MinION Mk2 flowcell is presumably the heir to this effort.

Below is my incomplete and very approximate set of timepoints in Clive's talk in case you find them useful.  This is probably my last post for the 2010s -- I look forward to taking on the 2020s.

Time Item
05:04 trans nuclease
08:30 XL kits
08:56 LIMS, VBZ
11:20 Mk1C having GSM trouble
15:32 adapter scaling - improves short read accuracy
16:48 Regen
17:47 Fuel efficient adapter
19:12 1ng input claim for cDNA; 160M fl on Pro, 20M on Min
20:25 RNA Accuracy
21:30 generations of basecalling
22:02 flipflop
24:20 Bonito
27:43 R9,R10.0 accuracy claims
28:00 SNV calling
29:00 CNV
29:10 Mod bases
31:43 RNA modification Venn diagram
32:20 R9 vs R10 pretty pictures
32:30 R10.3
34:00 R10.3 accuracy plots by organism
35:30 R10.3 shipping
35:40 R10.3 consensus accuracy
35:47 R11
36:49 R11
38:00 Single molecule consensus
38:50 UMI R10.0 accuracy
39:30 2D"C" diagram
40:30 2DC signal diagrams
41:20 2D"C" accuracy
41:39 VolTRAX DNA sequencing
44:20 VolTRAX protein sequencing; amino callers
45:40 next gen tether
47:20 new architecture
51:40 next gen ASIC
