Sunday, February 20, 2011

Will Cheap Gene Synthesis Squelch Cheaper Gene Synthesis?

Among the vast pile of items I've meant to write about but which have slipped are a paper from last year on gene synthesis and some subsequent announcements about attempts to commercialize the method described in that paper. This is an area in which I have past experience, though I would never claim this gives me indisputable authority or omniscience in the matter.

The paper, primarily by scientists from Febit but also with two scientists from Stanford and George Church in the author list, finally describes an interesting approach to dealing with some serious challenges in gene synthesis which substantially increase the costs. By finally, I mean that the idea has certainly been kicking around for a while and was mentioned when I visited Codon Devices in the fall of 2006 looking for employment.

To fill in some background first: gene synthesis is a powerful way to generate DNA constructs which can enable all sorts of experiments. The challenge is that the cost, currently starting at around $0.40 per base pair for very easy and short stuff (say, less than 2 kb), tends to restrict what you can use it for. I have a project concept right now that would be a slam dunk for gene synthesis -- but not at $0.40/bp (a price I suspect I couldn't even get for this project). Whack that price down by a few factors of two and the project becomes reasonable.
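To put rough numbers on that, here is the back-of-envelope arithmetic (the project size is hypothetical, chosen purely to illustrate how a few halvings change the picture):

```python
# Illustrative arithmetic only: the project size is hypothetical, chosen
# to show how a few factor-of-two price cuts change feasibility.
construct_bp = 25_000  # total synthetic DNA for a hypothetical project

for price_per_bp in (0.40, 0.20, 0.10, 0.05):
    print(f"${price_per_bp:.2f}/bp -> ${construct_bp * price_per_bp:,.0f}")
# $0.40/bp -> $10,000 ... $0.05/bp -> $1,250
```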

There are many cost components to commercial gene synthesis, and only someone who has carefully looked over the books while wearing a green eyeshade is going to have a proper handle on them. But three of the big expenses are the oligos themselves, the sequencing of constructs to find the correct ones, and labor. What the Febit paper illustrates is a nice way to tackle the first two in a manner that shouldn't require a lot of labor.

The oligo cost is a serious issue. Conventional oligos can be had for around $0.08 per base, or maybe a bit less. However, each base in the final construct requires close to 2 bases in the oligo set; some design strategies might get this down a bit. Furthermore, conventional columns generate far more oligo than you actually need. One approach which has been published (but not, as far as I know, commercialized) is to scale down the synthesis using microfluidics. This better matches the amount synthesized to the amount you need, though the length and quality of the oligos need refinement beyond what was reported in order to be truly useful. Microarrays are a means to synthesize huge numbers of oligos, but their quality also tends to be low and the quantity of each oligo species is much too small without further amplification. Amplification schemes have been worked out, but they add to the processing cost of the oligos.
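A quick sketch of why the oligo cost bites (both inputs are the rough figures from above, used purely for illustration):

```python
# Rough estimate of the oligo contribution to gene synthesis cost.
# Both inputs are the approximate figures mentioned above, not quotes.
oligo_price_per_base = 0.08  # conventional column synthesis, per oligo base
coverage = 2.0               # ~2 oligo bases consumed per construct base

cost_per_construct_bp = oligo_price_per_base * coverage
print(f"oligo cost alone: ${cost_per_construct_bp:.2f} per construct bp")
# -> $0.16/bp, a large bite out of a $0.40/bp sale price before any
#    sequencing or labor is counted
```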

What Febit and company have done is take those microarray-built oligos and screen them using 454 sequencing. The beads carrying amplicons with correct oligos are then plucked out of the 454 flowcell (with 90% success in retrieving the right bead) and used as starting material.
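One wrinkle worth a quick calculation: a 90% per-bead success rate compounds over the many oligos needed for a single construct (the oligo counts below are invented for illustration, not from the paper):

```python
# Back-of-envelope: 90% per-bead recovery compounds across a construct's
# oligo set. The oligo counts are illustrative, not from the paper.
p_bead = 0.90
for n_oligos in (5, 10, 20):
    print(f"{n_oligos:2d} oligos: {p_bead ** n_oligos:.1%} chance every pick is right")
# 5 oligos: 59.0%; 10 oligos: 34.9%; 20 oligos: 12.2%
```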

Now, this has several interesting angles. First, it has been challenging to marry the newer, non-Sanger sequencing technologies to gene synthesis. The new technologies tend to have short reads -- too short to read even a short construct. They also require library construction, and it is difficult to trace a given sequence back to a specific input DNA. In other words, short-read technologies are great at reading populations, but not individual wells in a gene synthesis output. Sanger, on the other hand, is ill-suited for populations but great for individual clones. One solution to this problem is clever pooling and barcoding strategies (sketched below), but these necessitate having enough different clones to be worth pooling and barcoding. In other words, second generation sequencing is difficult to adapt to retail gene synthesis, but looks practical for wholesale gene synthesis.
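To make the pooling idea concrete, here is a minimal sketch of barcode demultiplexing (the barcodes, reads, and function are all invented; a real scheme would also have to tolerate sequencing errors in the barcode itself):

```python
# Minimal sketch of barcoded pooling: tag each clone's DNA with a short
# barcode, pool, sequence, then assign reads back to clones by prefix.
# All sequences here are invented for illustration.
barcode_to_clone = {"ACGT": "clone_01", "TGCA": "clone_02", "GATC": "clone_03"}

def demultiplex(reads, table, bc_len=4):
    """Assign each read back to its source clone via its barcode prefix."""
    buckets = {clone: [] for clone in table.values()}
    for read in reads:
        clone = table.get(read[:bc_len])
        if clone is not None:  # unmatched barcodes are simply dropped here
            buckets[clone].append(read[bc_len:])
    return buckets

reads = ["ACGTTTGACCA", "TGCAGGATTCA", "ACGTTTGACCT"]
print(demultiplex(reads, barcode_to_clone))
```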

Getting the oligos right has important positive side-effects. While the stitching together of oligos into larger fragments (and larger fragments into still larger ones) can generate errors, an awful lot of the problems stem from bad input oligos. Not only can error rates be troublesome, but some of the erroneous sequences may have advantages over the correct ones in later steps. For example, a deleted fragment may PCR more efficiently than the full-length one, and slightly toxic gene products may be disfavored in cloning steps relative to frameshifted versions of the same reading frame. So, by putting the sequencing up front, it should be possible to reduce the sequencing needed downstream. Even if that downstream sequencing remains Sanger, there should be a lot less of it.
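A toy error model shows why this matters so much (both error rates below are assumptions for illustration, not measured figures):

```python
# Toy model: probability an assembled construct is error-free, assuming
# independent per-base errors. Both error rates are illustrative.
def frac_perfect(err_per_base, length_bp):
    """Chance a construct of the given length carries zero errors."""
    return (1.0 - err_per_base) ** length_bp

L = 1_000  # construct length, bp
for label, err in (("raw array oligos", 1/600), ("pre-verified oligos", 1/10_000)):
    p = frac_perfect(err, L)
    print(f"{label}: {p:.1%} perfect -> ~{1/p:.0f} clones to screen per good one")
# raw array oligos: 18.9% perfect -> ~5 clones
# pre-verified oligos: 90.5% perfect -> ~1 clone
```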

Okay, that's the science. Now some worries about the business. Febit announced in January that they are looking for investors to fire off a new company to commercialize this approach. This makes good business sense, since Febit itself must be encrusted with all sorts of business barnacles, having lurched from one business to another in trying to commercialize their microfluidic microarray system. Previous failed attempts include gene synthesis as well as microarray expression analysis and hybridization capture (I even ran one experiment with their system, whose results certainly didn't argue for them staying in that business!). The press release stated they hope to attain pricing in the $0.08 per base range, which would make my current experiment concept feasible. That would be great.

Now, they will need to refine their system and perhaps adapt it to other sequencers. A 454 Jr would probably not be a difficult adaptation, but moving to Ion Torrent must be tempting. Unfortunately, getting things to work for one paper and one set of genes is different from keeping things working across the entire spectrum of customer designs.

Which leads me to where I think they will face their greatest challenge, though one which I think can be finessed with the proper business approach. They will be bringing to market a methodology whose benefit is cost, but with the caveat that the cost advantage is attained only at sufficient volume. Initially, they will be unable to reliably predict delivery times (as kinks show up). Finally, they are adding some extra processing steps (454 sequencing, bead recovery, and oligo recovery from the beads) which may add to turnaround time.

The abyss into which this new company must plunge is a world in which very fast gene synthesis is available from a large number of vendors in the $0.40/bp price range. So they must find very large customers who are willing to be a bit patient and who can keep the pipeline filled. Such customers do exist, but they aren't always easy to find or to pry away from their existing suppliers. In theory, much cheaper synthesis would unleash new orders for projects (such as mine) which are too costly at current prices, but that is always a risky assumption on which to bank a company (cf. Codon Devices' gene synthesis business).

It's the alternative route that I predict this NewCo is likely to go down: linking up with an established provider in the field. Said provider, through their salespeople and ordering software, could offer each customer an option -- I can build your genes at $0.40 a base if you want them fast, or hack that down to $0.10 a base if you can wait. To preserve customer satisfaction, that longer timeline would need to include an insurance period to build the genes by the conventional route if the new route fails -- but of course, if you are frequently forced to build $0.40/bp genes for which you charged $0.10/bp, that is financial suicide.
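A toy model of that insurance problem (every number below is hypothetical; I have no inside knowledge of anyone's actual costs):

```python
# Toy economics of the fallback guarantee. All prices and costs are
# hypothetical; the point is how quickly rescue builds eat the margin.
price_charged = 0.10   # per bp, the cheap-route price quoted
cost_cheap    = 0.05   # assumed internal cost of the new route, per bp
cost_rebuild  = 0.40   # assumed cost of a conventional rescue build, per bp

for f in (0.00, 0.05, 0.10, 0.20):  # fraction of orders needing rescue
    blended = cost_cheap + f * cost_rebuild
    print(f"failure rate {f:.0%}: cost ${blended:.3f}/bp, "
          f"margin ${price_charged - blended:+.3f}/bp")
# At these numbers the margin vanishes once roughly one order in eight
# needs a conventional rebuild.
```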

So, in summary, I think this is a clever idea which needs to be pushed forward. But, after a long gestation in the lab, it faces a very rocky future in the production world. I hope they succeed, because it is not hard to imagine projects I would like to do which would be enabled by such a capability.

1 comment:

Austen Heinz said...

Your wait may not last too much longer. Be on the lookout for NewCo coming out of South Korea with a slightly different approach.