Today's Nature contains a great paper that takes cancer genomics one more step forward. Using Illumina sequencing, a group in British Columbia sequenced both the genome and the transcriptome of a metastatic lobular (estrogen receptor positive) breast cancer. They then searched a sample of the original tumor for the mutations found in the genome+transcriptome screen, in order to distinguish those that may have been present early from those that were acquired later.
From the combined genome sequence and RNA-Seq data they found 1456 non-synonymous changes, which were trimmed to 1178 after removing pseudogenes and HLA sequences. 1120 of these could be re-assayed by Sanger sequencing of PCR amplicons from both normal DNA and the metastatic sample, and 437 were confirmed. Most of the confirmed changes (405) were also found in the normal sample, which marks them as germline variants rather than somatic mutations. Of the 32 remaining, 2 were found only in the RNA-Seq data, a point I'll return to below. Strikingly, none of the mutated genes were found in the previous whole-exome sequencing (by PCR+Sanger) of breast cancer, though those samples were of a different subtype (estrogen receptor negative).
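To keep the numbers straight, here is a small sketch of that filtering funnel. The counts are the ones quoted above; the stage labels and variable names are my own shorthand, not anything from the paper.

```python
# Counts as quoted in the post; stage labels are informal shorthand.
funnel = [
    ("non-synonymous calls (genome + RNA-Seq)", 1456),
    ("after removing pseudogenes and HLA", 1178),
    ("re-assayable by PCR + Sanger", 1120),
    ("confirmed by Sanger", 437),
]

germline = 405                      # confirmed changes also seen in normal DNA
somatic = funnel[-1][1] - germline  # confirmed changes absent from normal DNA
rna_only = 2                        # somatic candidates seen only in RNA-Seq
dna_somatic = somatic - rna_only    # somatic candidates supported by genomic DNA


def drop_offs(stages):
    """Return (label, count, fraction of the previous stage kept) per stage."""
    rows = []
    prev = None
    for label, count in stages:
        kept = None if prev is None else count / prev
        rows.append((label, count, kept))
        prev = count
    return rows


for label, count, kept in drop_offs(funnel):
    kept_txt = "" if kept is None else f" ({kept:.1%} of previous stage)"
    print(f"{count:5d}  {label}{kept_txt}")
print(f"{somatic:5d}  somatic candidates, {rna_only} of them RNA-only")
```

The step that stands out is Sanger confirmation, where well under half of the re-assayed calls survive, followed by the germline filter, which removes most of what is left.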
There are a bunch of cool tidbits in the paper; I'm sure I won't do them full justice here, but I'll do my best. For example, several other papers using RNA-Seq on solid cancers have identified fusion proteins, but in this paper none of the fusion genes suggested by the original sequencing made it through the validation process. Most of the coding regions with non-synonymous mutations have not been seen to be mutated before in breast cancer, though ERBB2 (HER2, the target of Herceptin) is on the list along with PALB2, a gene which when mutated predisposes individuals to several cancers (and whose product is also associated with BRCA2). The algorithm (SNVMix) used for SNP identification & frequency estimation is a good example of an easter egg: a supplementary item that could easily be its own paper.
One great little story is HAUS3. This was found to have a truncating stop codon mutation, and the data suggest that the mutation is homozygous (but at normal copy number) in the tumor. A further screen of 192 additional breast cancers (112 lobular and 80 ductal) for several of the mutations found no copies of the same hits seen in this sample, but two more truncating mutations in HAUS3 were found (along with 3 more variations in ERBB2 within the kinase domain, a hotspot for cancer mutations). HAUS3 is particularly interesting because until about a year ago it was just C4orf15, an anonymous ORF on chromosome 4. Several papers have recently described a complex ("augmin") which plays a role in genome stability, and HAUS3 is a component of this complex. This starts smelling like a tumor suppressor (truncating mutations seen repeatedly; truncating mutation homozygous in the tumor; protein involved in a function often crippled in cancer), and I'll bet HAUS3 will be showing up in some functional studies in the not too distant future.
Resequencing of the primary tumor was performed using amplicons targeting the mutations found in the metastatic tumor. These amplicons were small enough to be spanned directly by paired-end Illumina reads, obviating the need for library construction (a trick which has shown up in some other papers). Using Illumina sequencing for this step also meant that the frequency of each mutation in the sample could be estimated. It is worth noting that the primary tumor sample was a formalin-fixed, paraffin-embedded (FFPE) slide; this way of preserving histology is notoriously harsh on biomolecules, and the resulting DNA is prone to sequencing artifacts. Appropriate precautions were taken, such as sequencing two different PCR amplifications from two different DNA extractions. The sequencing of the primary tumor suggests that only 10 of the mutations were present there, with only 4 of these showing a frequency consistent with being present in the primary clone and the others probably being minor components. This is another important filter for suggesting which genes are candidates for involvement in early tumorigenesis and which are more likely late players (or simply passengers).
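For anyone wondering how read counts turn into a frequency, here is a minimal sketch. It is not the paper's method (the authors have their own statistical machinery); it simply takes the fraction of reads carrying the variant and puts a Wilson score interval around it. The function name and the read counts in the usage lines are made up for illustration.

```python
import math


def allele_frequency(variant_reads, total_reads, z=1.96):
    """Return (estimate, lower, upper) for a variant allele frequency.

    The estimate is the plain fraction of reads carrying the variant;
    the bounds are a Wilson score interval (z=1.96 gives roughly 95%).
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= variant_reads <= total_reads:
        raise ValueError("variant_reads must lie between 0 and total_reads")
    n = total_reads
    p = variant_reads / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical read counts, not taken from the paper.
for name, alt, depth in [("site A", 450, 1000), ("site B", 30, 1000)]:
    est, lo, hi = allele_frequency(alt, depth)
    print(f"{name}: {est:.3f} (interval {lo:.3f} to {hi:.3f})")
```

In a sample that mixes tumor and normal cells, a heterozygous mutation carried by every tumor cell shows up at a frequency set largely by the tumor fraction, while a mutation confined to a minor subclone sits well below that. That gap is the sort of thing that lets a frequency estimate separate the primary clone from minor components.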
One more cool bit I parked above: the 2 variants seen only in the RNA-Seq library. These suggest RNA editing, and consistent with this, an RNA editase (ADAR) was found to be highly represented in the RNA-Seq data. Two genes (COG3 and SRP9) showed high-frequency editing. RNA editing is beginning to be recognized as a widespread phenomenon in mammals (e.g. the nice work by Jin Billy Li in the Church lab); the possibility that cancers can hijack this for nefarious purposes should be an interesting avenue to explore. COG3 is a Golgi protein, and links between the Golgi and cancer are starting to be teased out. SRP9 is part of the signal recognition particle involved in protein translocation into the ER -- which of course feeds the Golgi. Quite possibly this is coincidental, but it certainly rates investigating.
One final thought: the next year will probably be filled with a lot of similar papers. Cancer genomics is gearing up in a huge way, with Wash U alone planning 150 genomes well within the next year. It seems unlikely that those 150 genomes will end up as 150 distinct papers; moreover, it will be a challenge to do the level of follow-up seen in this paper on such a grand scale. A real challenge to the experimental community -- and the funding establishment -- is converting the tantalizing observations which will come pouring out of these studies into validated biological findings. With a little luck, biotech & pharma companies (such as my employer) will be able to convert those findings into new clinical options for doctors and patients.
Sohrab P. Shah, Ryan D. Morin, Jaswinder Khattra, Leah Prentice, Trevor Pugh, Angela Burleigh, Allen Delaney, Karen Gelmon, Ryan Guliany, Janine Senz, Christian Steidl, Robert A. Holt, Steven Jones, Mark Sun, Gillian Leung, Richard Moore, Tesa Severson, Greg A. Taylor, Andrew E. Teschendorff, Kane Tse, Gulisa Turashvili, Richard Varhol, René L. Warren, Peter Watson, Yongjun Zhao, Carlos Caldas, David Huntsman, Martin Hirst, Marco A. Marra, & Samuel Aparicio (2009). Mutational evolution in a lobular breast tumor profiled at single nucleotide resolution. Nature, 461, 809-813. DOI: 10.1038/nature08489