A computational biologist's personal views on new technologies & publications on genomics & proteomics and their impact on drug discovery
Wednesday, April 13, 2011
One of the perceived weak points of the Ion Torrent, in contrast to the Illumina, has been the use of emulsion PCR for template preparation. The original template prep protocol reportedly required around 10 hours of wall clock time, with a substantial amount of hands-on time. Improved template prep is the subject of one of the Grand Challenges. A new kit being released today and a new instrument announced today (but not generally available until late summer or sometime this fall) go after this issue: the new kit provides an inexpensive but substantial immediate improvement, and the modestly priced template prep instrument will offer a very low-labor solution once it arrives.
Tuesday, April 12, 2011
Vostok & Columbia
It has hardly gone unreported in the media that today marks the 50th anniversary of the first manned spaceflight and the 30th of the first space shuttle launch. It was an accident of scheduling delays that put that shuttle flight on the 20th anniversary of Gagarin's, but what an appropriate synchrony that is! My own contribution herein is to pen quick capsules of two books that deserve longer reviews: Two Sides of the Moon and Riding Rockets. If you have a strong interest in the history of spaceflight, you should consider reading each of these if you haven't already. The only caveat I'll throw in is that if you have a young friend who fits that category, but you do not, you should at least skim Riding Rockets before passing it on; some of the content (and most of the humor) is of a mature nature. While neither book is primarily about either of the events commemorated today, both bear on them.
Friday, April 08, 2011
Gnoteworthy But Gnot Gnoticed
A commenter on yesterday's piece on Intelligent Biosystems scolded me for not mentioning GnuBio and said they had released data. This had totally slipped my notice, and indeed it seemed to have slipped past the GenomeWeb and BioIT worlds as well, judging from some Google searches. There is certainly a press release out there, and it was covered by several outlets, but it was amazingly stealthy for a public relations effort. Strange! But many thanks to my anonymous correspondent for flagging this for me!
But the release does make a set of stunning claims. When GnuBio launched last year and announced they would have alpha instruments in collaborators' hands by the end of 2010, I was skeptical.
Another Low Cost Sequencer on the Horizon?
An article in GenomeWeb's In Sequence (which, alas, requires a subscription for which I've never sprung) covers Intelligent Biosystems (IBS), which at the X-Gen Congress meeting apparently announced plans to launch the "Pinpoint Mini" sequencer. The box would come in at $85K, putting it smack between Ion Torrent PGM and Illumina MiSeq pricing. The hope is to have boxes shipping to early access customers by the end of the year.
To be honest, I'm guilty of having mentally written off IBS, as they had been around a long time and been very quiet. In my defense, their website looks like it hasn't been updated since it first went up, and you'd think that now that they've made a big announcement it would be, but apparently not yet. On the other hand, while stale websites can indicate fading companies, the fact that it is still working suggests some life.
In any case, I was apparently hasty in my thoughts. The box as described has some interesting features. Supposedly it will crank out data at $75/Gbase. The claim is that one exome could be sequenced (reagent costs only, mind you) at 30X for $150 in about 1.5 days. No word on read lengths. The system will mount 20 flow cells, each of which can be run independently. Chemistry is based on reversible terminators that they have exclusively licensed; if I understand it correctly, the big advantage of their chemistry is simplicity. There's also a curious bit in the publication from the founders: they use a mix of unlabeled reversible terminators and labeled dideoxy terminators, with the same cleavage reaction removing both the terminator and the label. This was touted as a way to reduce the discrimination of the polymerase against the reversible terminators. Of course, an alternative would be to generate mutant polymerases which are more amenable to being fed terminators. Having four pairs of complicated compounds wouldn't seem to be a route to low cost, but perhaps the gains are worth it (or this chemistry isn't being used any more; it's very hard to tell from what I was able to read).
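As a sanity check, the $75/Gbase and $150-per-exome figures are at least roughly consistent with each other. Here is a minimal back-of-the-envelope sketch in Python; the ~50 Mb exome target is my assumption, and I'm ignoring off-target and duplicate reads since the announcement says nothing about either:

# Rough consistency check of the claimed $75/Gbase and $150 per 30X exome figures.
# Assumptions (mine, not IBS's): ~50 Mb exome target, no off-target or duplicate reads.
exome_target_bp = 50e6      # assumed exome target size
coverage = 30               # claimed coverage
cost_per_gbase = 75.0       # claimed reagent cost

bases_needed = exome_target_bp * coverage              # 1.5e9 bases
reagent_cost = bases_needed / 1e9 * cost_per_gbase
print(f"Data needed: {bases_needed / 1e9:.1f} Gbase")  # ~1.5 Gbase
print(f"Reagent cost: ${reagent_cost:.0f}")            # ~$112

That lands around $112, a bit under the claimed $150, which leaves some headroom for off-target reads and other losses.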
Sounds great, but of course there's a lot to do before such a machine can launch. There's no word about the sample prep method, which probably means it will use emulsion PCR, since the necessary licenses for that can apparently be obtained. There's also the problem of manufacturing the instrument and the reagent kits. The fact that IBS apparently planned to launch a PinPoint large-scale sequencer a few years back and couldn't get it out the door is not going to help them compete in the expectations market with the other boxes.
One solution to some of these issues would be a strategic partnership with (or outright acquisition by) a major reagents and/or equipment player. It's not hard to come up with a list of candidates based on nothing more than that description. Perhaps at some point Affymetrix will decide to move into next-gen. Roche could always decide to go for something cheaper than 454, but I doubt it. Agilent seems quite happy supplying picks and shovels, but perhaps they'll go for the big time. Then there are Perkin Elmer, GE (which is working on blue-sky sequencers) and a host of others. Picking the right partner will be key; Illumina has the advantage of an enormous installed base and a thriving ecosystem of associated vendors, whereas Ion Torrent has a lot of buzz and serious marketing muscle (not that Illumina is lacking there either).
It's also interesting to see this machine being touted as an exome sequencing workhorse for clinical use. The issue really deserves its own detailed post, but such an application brings some serious issues. Library preparation requires a lot of labor and a bunch of other instruments, or a bit less labor and some additional instruments specialized for prep. Right now, the market for exomes on HiSeq using either SureSelect or EZCap is quite competitive; I've recently gotten quotes ranging from $2K-$5K (the lower quotes tend to be from new entrants; promised coverage varies a bit too). For a 50Mb capture at 50X coverage, it would seem you could get around 50 exomes into one HiSeq flowcell, which at $10K per flowcell means about $200 in sequencing per exome (feel free to correct my math in the comments). That would suggest that very little of the cost of these exome captures is sequencing reagents; the majority is labor and the EZCap or SureSelect kits.
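For anyone who wants to check that math, here's the back-of-the-envelope version. The per-flowcell yield is my own assumption (a 2011-era HiSeq flowcell yielding in the low hundreds of Gbases), not a vendor figure:

# Back-of-envelope: how many 50X exomes fit on one HiSeq flowcell, and what the
# sequencing-reagent share of an exome quote would then be.
# Assumption (mine): ~125 Gbase usable yield per flowcell.
exome_target_bp = 50e6       # 50 Mb capture target
coverage = 50                # desired coverage
flowcell_yield_bp = 125e9    # assumed usable yield per flowcell
flowcell_cost = 10_000       # rough cost per flowcell, as above

bases_per_exome = exome_target_bp * coverage            # 2.5e9 bases per exome
exomes_per_flowcell = flowcell_yield_bp / bases_per_exome
cost_per_exome = flowcell_cost / exomes_per_flowcell
print(f"Exomes per flowcell: {exomes_per_flowcell:.0f}")        # ~50
print(f"Sequencing reagents per exome: ${cost_per_exome:.0f}")  # ~$200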
While some elegant library-free methods (really, methods which add the sequencing adaptors as they are capturing the targeted DNA) for exome sequencing have been published, none of these are commercially available on an exome scale. RainDance requires an expensive ($200K+) box and isn't quite up to exome scale. Halo Genomics has announced a library-free prep for "1000s of exons", so perhaps this will break the problem open. Whether this is really whole exome, or something smaller, remains to be seen. A safe rule, though, is that these methods are only efficient if read lengths are significant. At a minimum, the first part of a read is burned on getting through the targeting primers, and with very short reads the size of each targeted amplicon must be small, meaning that for a fixed number of amplicons (cost) you can capture a lot less DNA than with a longer-read technology.
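To make that trade-off concrete, here's a toy calculation; the primer length, amplicon count and read lengths are purely illustrative assumptions, not numbers from any of these vendors:

# Toy illustration: for a fixed number of targeting amplicons, how much unique
# target do short vs. long reads actually capture? All numbers are illustrative.
def captured_target_bp(read_length, primer_bp=25, n_amplicons=20_000):
    # Usable target per amplicon = read length minus the primer sequence
    # each read must first traverse.
    usable_per_amplicon = max(read_length - primer_bp, 0)
    return n_amplicons * usable_per_amplicon

for read_length in (50, 100, 400):
    mb = captured_target_bp(read_length) / 1e6
    print(f"{read_length:>3} bp reads: ~{mb:.1f} Mb of target from 20,000 amplicons")
# 50 bp reads: ~0.5 Mb;  100 bp: ~1.5 Mb;  400 bp: ~7.5 Mb

The same amplicon budget covers roughly an order of magnitude more target as reads go from very short to 454-class lengths, which is why read length matters so much for these amplicon-based capture schemes.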
So, another player in the field -- but with a far-off beta release and no track record. They'll be fun to watch (assuming they come out of possum mode), but probably won't be a real factor in the market for over a year.
Tuesday, April 05, 2011
Can we treat the kinase du jour?
For the second time in just over a week, the Sunday Boston Globe was discussing protein kinases in the context of cancer. A group from the Broad has just published a sequencing study (Sanger!) identifying mutations in the protein kinase DDR2 in about 4% of squamous cell carcinomas of the lung. This is a common form of smoking-induced non-small cell lung cancer (NSCLC), and one for which many therapies useful in other forms of lung cancer are contraindicated. The prior story concerned a study by another group at the Broad, published a bit over a week ago in Nature, detailing an extensive look at myelomas by sequencing, which found mutations in the kinase BRAF in 4% of myelomas.
The myeloma study is quite a watershed and in some ways raises the bar for cancer genomics publications. Whereas most papers have published a single cancer genome and a few have published single-digit numbers of genomes, this one looked at 38 myelomas. Not to overstate things: matched normal and myeloma whole genomes were sequenced in only 23 patients; for the other 15 patients just the exomes were sequenced (one additional exome pair was run in a patient who also had whole genome sequencing, to enable comparison). Clearly this is a serious scaling up of effort, enabled by dropping costs. By sequencing multiple genomes, the possibility exists both to discover rare variants and to get some rough mutation frequency information.
The myeloma study is curious in one aspect: the results were first discussed about a year ago at AACR, the big pre-clinical meeting going on right now, and were described as submitted at a conference I attended at MIT last June. Indeed, the paper states "Received 11 June 2010; accepted 17 January 2011". While there is a bit of functional investigation of one gene (siRNA vs. HOXA9) and some Western blots of coagulation factors, this is primarily a genomics paper. Is Nature becoming reluctant to publish such papers? What really held this up for so long?
In any case, the primary finding in both of these papers is low but measurable frequency mutation of protein kinases in human cancers. This should come as no surprise: in a previous paper from the same groups on lung adenocarcinoma (the other major class of NSCLC), multiple kinases were found to be mutated beyond the relatively high-frequency EGFR, again including BRAF but also a host of other kinases. The new DDR2 paper also found mutations in multiple kinases, though the follow-up was focused on DDR2. Another 5%-or-so slice of adenocarcinoma carries a fusion protein of the kinase ALK, which can be treated with inhibitors developed against ALK. There may also be an opportunity to target ALK by a different strategy, one which my company has explored (yes, I have a financial interest there!). The challenge is to determine which, if any, of these mutations are driving tumors and which are just passengers.
In the case of the DDR2 paper, the authors built a pretty nice story. One big bonus with protein kinases is that there have been extensive efforts over the last 30 or so years to study them, with many inhibitors available. A raging argument in the field is whether clinically useful inhibitors need to be exquisitely specific or can be as subtle as a wrecking ball, and the truth is that clinically approved kinase inhibitors run the gamut. Imatinib (Gleevec) is quite specific, though it still hits multiple kinases, and that has proven useful as it has enabled targeting multiple cancers. For example, some gastrointestinal stromal tumors are driven by c-KIT mutations and others by PDGFR mutations, but luckily imatinib hits both. Other inhibitors such as sorafenib and sunitinib are less discriminating, but still tolerable.
In the case of DDR2, the approved inhibitor dasatinib turns out to be effective, and the new paper shows this first in cell lines. Cell lines carrying DDR2 mutations are more sensitive to dasatinib than those without, and the trend continues both in mouse xenograft models and finally in a single human patient carrying a DDR2 mutation in her tumor. Alas, the patient apparently had to discontinue therapy due to side effects.
Now that genomics has demonstrated the ability to find these low frequency mutations, the question is quite open as to how to move them into clinical practice. One model would be to simply sequence extensively and treat each patient by the best guess for their mutations; this approach has been published and is apparently being used in the case of author Christopher Hitchens. While whole genome or exome sequencing might be too costly or slow for routine use, targeted mutation panels are another possible approach (though honestly, exome sequencing is getting down into the $2.5K range these days). Such targeted panels can attempt to focus on the most frequent and actionable mutations, though the DDR2 mutations in squamous cell carcinoma do not appear to favor any one position.
The alternative is to try to run clinical trials to carefully appraise the clinical utility of these approaches. When I mentioned the BRAF-in-myeloma story a while back to a co-worker (who happens to have developed multiple drugs, including an effective one in myeloma) and expressed the opinion that it is a slam-dunk to use a BRAF inhibitor (one is near approval in melanoma) in such cases, he took a more cautious view. How do you know these are really the important mutations? How will you know how long the treatment effect lasts? Perhaps the BRAF mutations in myeloma help the tumor but are not critical. How will you know the correct dose schedule? Whether combination therapy is needed? Whether the drug is getting into a very different tumor type? To truly answer these questions rigorously, trials are needed.
But the difficulty in running such trials should not be underestimated. For example, a company thinking of running a clinical trial looking at BRAF in squamous cell carcinoma faces quite a task. Now, the market is not small: according to Wikipedia (an easy lookup late at night, though perhaps with large error bars) there are about 500K new cases yearly, and a quarter of those are squamous. Presumably at least a quarter of those cases are in the U.S., so around 70K new cases per year. Four percent of 70K (error bars growing with each estimate) is 2.8K patients, which is possibly attractive but getting small.
However, to get the trial going you are going to need to screen to find that 4% of patients. Squamous is a standard diagnosis, so you can start there, but you will still need to recruit, consent and screen to get that small fraction. In the meantime, you are competing with every other trial out there to recruit, consent and screen patients. Sure, once patients miss out on another trial they might come to yours -- or might not. To top it off, a lot of patients either are never offered a trial or will never consent to one; the farther you are from a large academic cancer center, the less likely you are to have a trial available to you.
Now, if oncogenic mutation screening becomes a standard part of cancer care, as it has at MGH and probably some other leading institutions, and if these mutations are in the panels, then many patients may know their mutation status before you recruit them into your trial. But this will work only once such screening becomes widespread, and only if your gene of interest is sufficiently covered.
Yet another approach is to design trials which test multiple therapies. One prominent example in lung cancer is the BATTLE trial, which is trying four different therapies with an adaptive design that uses molecular testing as part of the therapy-assignment scheme. Designs such as BATTLE are quite complicated (well beyond my expertise to critique) and get only more so with more drug regimens; if lung cancer is driven by a dozen or so kinases, suggesting a slightly smaller number of therapies, can a trial to test these therapies be designed, patients accrued and useful results obtained? In such studies, will they be judged by whether the study overall improves survival, or can each treatment be viewed as a separate study?
For the sake of patients, these issues need to be tackled. They'll be hard in lung cancer, and far worse in a disease like myeloma. If we ballpark myeloma at 15K new cases per year in the U.S., 4% of that is getting to be a small group (600 patients). Any sizable trial is going to need to recruit a huge fraction of these patients. Now, with patient advocacy groups and publicity it may be possible to find that small population, but it will certainly be challenging. Indeed, the Multiple Myeloma Research Foundation (which sponsored the sequencing) is already talking about how to support such efforts.
So, in closing, these sequencing studies are suggesting very real therapeutic options for patients. However, driving these findings to routine clinical use, even when drugs are available off-the-shelf for the kinases of interest, will continue to challenge all of the scientists working on translational oncology research.