Library preparation has been an ongoing issue for anyone wanting to dive into high-throughput sequencing systems, particularly with the Illumina platform. Preparing libraries, quantifying them and normalizing multiple libraries (if multiplexing, which on Illumina platforms it is rare not to do) are steps which can have a significant learning curve and, even once learned, require substantial hands-on time. At the original starbase, even if we had wanted to commit capital and precious bench space to a benchtop sequencer, the labor commitment to make our own libraries was a clear killer.
The most typical library prep workflow begins by shearing purified DNA, usually in an acoustic shearing instrument such as a Covaris. Alternate shearing methods, such as enzyme cocktails or nebulizers, have also been used. The sheared DNA must be repaired, since these shearing methods can leave a wide variety of ends on the DNA. After end repair, adapters are ligated to the molecules, followed by PCR amplification of the library (except, obviously, in PCR-free protocols). Often there are bead cleanup steps or transfers interspersed, which improve quality (or are required to change buffers) but add complexity and potentially lose material.
NeoPrep is based on the electrowetting microfluidic technology which Illumina acquired by purchasing Advanced Liquid Logic (ALL). In electrowetting, regions of a surface can be made hydrophobic or hydrophilic by changing the applied charge, triggering liquids to leap from suddenly hydrophobic patches to adjacent hydrophilic patches. A fun video illustrating this can be found on the still-extant ALL webpage; the somewhat jerky motion is reminiscent of Frogger or other video games of my youth. By performing a series of such steps, and utilizing on-instrument heating and cooling elements, all of the post-shearing steps can be automated on the instrument.
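To make the hopping idea concrete, here is a toy Python sketch of a droplet moving along a single row of pads. It is purely illustrative and not a model of the actual NeoPrep cartridge: the one-dimensional layout, the function names `step_droplet` and `run_sequence`, and the rule that a droplet only moves to a directly adjacent energized pad are all my own simplifying assumptions.

```python
def step_droplet(position, energized_pad, n_pads):
    """Toy rule (an assumption): energizing a pad makes it hydrophilic,
    and the droplet hops there only if that pad is directly adjacent."""
    if not 0 <= energized_pad < n_pads:
        raise ValueError("energized pad is off the end of the row")
    if abs(energized_pad - position) == 1:
        return energized_pad
    return position  # pad too far away (or already under the droplet)


def run_sequence(start, pad_sequence, n_pads):
    """Apply a series of pad activations and record the droplet's path."""
    path = [start]
    position = start
    for pad in pad_sequence:
        position = step_droplet(position, pad, n_pads)
        path.append(position)
    return path


if __name__ == "__main__":
    # Walk a droplet rightwards; the activation of pad 5 is ignored
    # because the droplet is sitting on pad 3 at that point.
    print(run_sequence(start=0, pad_sequence=[1, 2, 3, 5, 4], n_pads=6))
    # prints [0, 1, 2, 3, 3, 4]
```

Stringing together many such single-pad hops is what produces the jerky, Frogger-like motion.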
So for $49K ($39K introductory pricing for the first 6 months), one gets a small instrument which can prepare 16 libraries at a time. A cartridge is loaded by the operator, guided by a plastic template, with reagents and sheared DNA. The instrument runs unattended, eventually generating and quantifying each library, ready for pooling. Data from the instrument is also uploaded to BaseSpace, so that information on library quality is available for downstream analyses. Run times vary by application and are many hours, but with an overnight run a single instrument can run thrice in a day to prepare 48 libraries. Roughly 30 minutes of hands-on time is required per run. Library costs are approximately the same as for manual preps (not including amortizing the instrument buy); Illumina thinks NeoPrep will become the dominant means of preparing libraries and thinks many early adopters will spring for more than one box.
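The per-day numbers above are simple arithmetic, sketched here in Python. The constants come straight from the figures quoted by Illumina; the function names are hypothetical and only there for illustration.

```python
LIBRARIES_PER_RUN = 16      # libraries per cartridge, as quoted
RUNS_PER_DAY = 3            # "thrice in a day", including an overnight run
HANDS_ON_MIN_PER_RUN = 30   # "roughly 30 minutes" per run


def libraries_per_day(runs_per_day=RUNS_PER_DAY,
                      libraries_per_run=LIBRARIES_PER_RUN):
    """Libraries one instrument can prepare in a day."""
    return runs_per_day * libraries_per_run


def hands_on_minutes_per_day(runs_per_day=RUNS_PER_DAY,
                             minutes_per_run=HANDS_ON_MIN_PER_RUN):
    """Total operator time per day across all runs."""
    return runs_per_day * minutes_per_run


def hands_on_minutes_per_library(minutes_per_run=HANDS_ON_MIN_PER_RUN,
                                 libraries_per_run=LIBRARIES_PER_RUN):
    """Operator time spread over the libraries in one full cartridge."""
    return minutes_per_run / libraries_per_run


if __name__ == "__main__":
    print(libraries_per_day())             # prints 48
    print(hands_on_minutes_per_day())      # prints 90
    print(hands_on_minutes_per_library())  # prints 1.875
```

So a fully loaded instrument works out to a bit under two minutes of hands-on time per library, assuming every cartridge is run full.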
Initial applications for the instrument are the TruSeq Nano genomic DNA prep and TruSeq stranded RNA-Seq, both able to work with 25 nanograms of input material. High quality library preps of small genomes can be made with substantially less (single digit nanograms) DNA, though with large genomes representational bias may start being apparent. Small input amounts are enabled by the small volumes (and also the consequently faster kinetics) within the system, which need only about 1/3 to 1/5 of the material required for manual preps. TruSeq PCR-Free genomic preps and a palette of targeted sequencing methods will be released later in the year.
NeoPrep has had a long gestation; Illumina gave a long spiel on it at last year's AGBT. During this time, the cartridges have undergone a series of design changes, with especial care paid to the plastics used and coatings employed. Hence, NeoPrep is a completely new design, radically different from the earlier ALL/NuGen Mondrian library prep instrument. Illumina is using experience gained in their other cartridge-based systems, such as NextSeq, in designing newer systems such as NeoPrep. The final iteration has undergone extensive testing; Illumina boasted that the demo instrument at AGBT has generated over 5,000 libraries.
I'd be interested to hear from hands-on folks as to how they see NeoPrep fitting in. I'm not familiar with the pricing of other boxes, but many I've seen take up substantially more space on the desktop, and have complex liquid handling robotics. NeoPrep's lack of moving parts and compact size would seem an advantage, and the stated ability to work with smaller input amounts a huge plus, as rarely does anyone have a surfeit of input material. But many labs will have already invested in those other robotic systems, and at only 16 samples per run the really big projects involving thousands of libraries might not see NeoPrep as a good fit (though Illumina pointed out that one NeoPrep could prepare well over 10K libraries in a year). NeoPrep also isn't solving the problem in the small genome community of needing to push the cost of library prep radically lower (and the number of barcodes radically higher), since one HiSeq run can now potentially deliver the data to assemble thousands of E. coli class genomes. I have a couple more thoughts on the library prep market, but those need to wait until I can publish a write-up on another phone conference I had today -- but that one is embargoed until later in the conference.
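As a back-of-envelope check on the "well over 10K libraries in a year" claim, and on what the instrument buy might add per library, here is a rough Python sketch. The 16 libraries per run, three runs per day and $49K list price are the figures quoted by Illumina; the 250 operating days, the one-year amortization window, the example project size and all function names are my own assumptions for illustration.

```python
import math


def libraries_per_year(operating_days, runs_per_day=3, libraries_per_run=16):
    """Yearly capacity of one instrument running full cartridges."""
    return operating_days * runs_per_day * libraries_per_run


def instrument_cost_per_library(list_price_usd, total_libraries):
    """Instrument price spread evenly over the libraries made on it
    (ignores service contracts, reagents and labor)."""
    return list_price_usd / total_libraries


def instruments_needed(project_libraries, project_days,
                       runs_per_day=3, libraries_per_run=16):
    """Boxes required to finish a project within a deadline."""
    per_box = project_days * runs_per_day * libraries_per_run
    return math.ceil(project_libraries / per_box)


if __name__ == "__main__":
    print(libraries_per_year(365))   # prints 17520 (running every day)
    print(libraries_per_year(250))   # prints 12000 (assumed working days)
    # Assumed: amortize the $49K list price over one 250-day year.
    print(round(instrument_cost_per_library(49_000, 12_000), 2))  # prints 4.08
    # Hypothetical project: 5,000 libraries in 30 days.
    print(instruments_needed(5_000, 30))  # prints 4
```

Either way of counting days clears 10K per year for a single box, but a project with thousands of libraries on a tight deadline would need several boxes running in parallel.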
NeoPrep is the biggest Illumina announcement at the meeting, but other currents are flowing as well. Shawn Levy of the HudsonAlpha Institute gave a comparison today of data quality from HiSeq v4 vs. HiSeq X v2 (incorporating exclusion amplification) and NextSeq v2 (using 2-color sequencing); the conclusion is that the three are nearly indistinguishable. In a tweet, James Hadfield pondered whether this means 4-color sequencing-by-synthesis is on its way out at Illumina. It will be interesting to see if this speculation, or my more grandiose concept of a 2-color, patterned flowcell mini-MiSeq, might show up in the next year.
Illumina is not giving details at this time, but is starting to spread the word that the AACR meeting this spring will feature a reveal of a new lineup of targeted sequencing products, promised to be best-in-class.
Illumina also mentioned to me that Kevin Gunderson from their research group has a poster on an on-flowcell method for generating long range information. This is a scheme in which long DNA is placed on a patterned flowcell and libraries made on the cell; spatially near clusters will frequently encode fragments which were near each other.
Well, that's my summary of my conversation with Illumina and a little bit of reaction. The market leader continues to innovate within their ecosystem, filling in some of the gaps in their offering (gaps that may have been filled by competitors).