Friday, September 25, 2009

How many genomes did I just squash?

Yesterday was a good day for catching up on the literature; not only did I finally get around to the IL28B papers I blogged about yesterday, but I also took a run through the genome fusion paper which is being seen as the fitting marker of the end of the "Communicated by" mechanism of PNAS (sample coverage by In The Pipeline and Science, though the latter requires a subscription).

The paper, by Donald Williamson and communicated by Lynn Margulis, takes the position that "in animals that metamorphose, the basic types of larvae originated as adults of different lineages, i.e., larvae were transferred when, through hybridization, their genomes were acquired by distantly related animals". This is a whopper of a proposal and definitely interesting.

Margulis is famous for proposing the endosymbiont hypothesis to explain mitochondria, chloroplasts and other organelles. The gist of it is that some ancestral eukaryote took in a guest species and in the long run integrated it so fully into its operations that the two could not be separated. An important observation which this explained is the fact that mitochondria and chloroplasts have their own genomes, which encode (almost?) exclusively proteins and RNAs used in these structures. However, their genomes do not encode many of the proteins required -- indeed in metazoans such as ourselves only a tiny pittance of genes are encoded by the mitochondrial genome. A further observation which fits into this framework is the curious case of Cyanophora paradoxa, a photosynthetic organism whose chloroplast-like structure is surrounded by a rudimentary cell wall.

When I was an undergraduate, there was still significant controversy over the validity of the endosymbiont hypothesis. I remember this well, as I wrote a term paper on the subject. What really nailed it down was the careful comparison of gene trees in the cases where the same function is required both in the organelle and in the cytoplasm and both are nuclear encoded. In the vast majority of these cases, the two are evolutionarily distant from one another; in the case of chloroplasts, the gene whose protein goes to the chloroplast looks more like homologs in cyanobacteria, and the copy producing cytoplasmic protein looks more like homologs in non-photosynthetic eukaryotes. There are some fascinating exceptions, such as cases in which one gene does double duty -- via (for example) alternative splicing or promoters including or excluding the chloroplast targeting sequences.

Margulis and others have tried to extend this notion to other systems. There are definitely other successes -- unicellular organisms which appear to carry three genomes & the always challenging to classify Euglena, which appears to be a genome fusion. But there have also been some prominent non-successes, such as the eukaryotic flagellum/cilium. Also when I was an undergraduate a Cell paper made a big splash claiming to find a chromosome associated with the basal body, the organelle associated with flagellum synthesis. However, this work was never repeated and the publication of the Chlamydomonas genome failed to find such a chromosome.

After reading the paper at hand, I'm both confused and disappointed. The confusion is embarrassing, but the paper goes into a lot of detail on taxonomy and gross development, of which I'm horribly ignorant. Conversely, the disappointment comes from what I do understand and how cursorily that is treated. And since it is the stuff I understand which is the route Williamson proposes to test his hypothesis, that is a big letdown.

A key part that I do understand (minus a few terms I hadn't encountered before), with my emphasis:
Many corollaries of my hypothesis are testable. If insects acquired larvae by hybrid transfer, the total base pairs of DNA of exopterygote insects that lack larvae will be smaller than those of endopterygote (holometabolous) species that have both larvae and pupae. Genome sequences are known for the fruitfly, Drosophila melanogaster, the honeybee, Apis mellifera, the malarial mosquito, Anopheles gambiae, the red flour beetle, Tribolium castaneum, and the silkworm, Bombyx mori: holometabolous species, with marked metamorphoses. I predict that an earwigfly (Mecoptera Meropeidae), an earwig (Dermaptera), a cockroach (Dictyoptera), or a locust (Orthoptera) will have not necessarily fewer chromosomes but will have fewer base pairs of protein-coding chromosomal DNA than have these holometabolans. Also the genome of an onychophoran that resembles extant species will be found in insects with caterpillar or maggot-like larvae. Onychophoran genomes will be smaller than those of holometabolous insects. Urochordates, comprising tunicates and larvaceans, present a comparable case. Larvaceans are tadpoles throughout life. Garstang regarded larvaceans as persistent tunicate larvae, and, if so, their genomes would resemble those of tunicates. But if larvaceans provided the evolutionary source of marine tadpole larvae, their genomes would be smaller and included in those of adult tunicates. The genome of the larvacean Oikopleura dioica is about one-third that of the tunicate Ciona intestinalis, consistent with my thesis.

Williamson is obviously not an expert on genomics, but Margulis should have known better and pushed him to improve this section. In the "communicated by" path, the academy member can basically hand-pick the reviewers and is supposed to act as an editor would.

The first problem is a rather naive view of genome size and evolution. Genome sizes vary all over the map even among related species; Fugu and salmon differ several-fold, as do the fruit fly and the malaria vector. The latter pair is particularly relevant since these are both dipteran insects, and therefore in the same bin by Williamson's standard (as stated in the quoted text). Now, that is overall genome size; if you restrict to protein-coding regions these pairs are more similar, which leaves some wiggle room. But by the same token, the Oikopleura and Ciona genomes contain about the same number of genes (~15-16K), despite the roughly three-fold difference in total size that Williamson cites.

Furthermore, his hypothesis should be quite testable right now, at least in a basic form. If a genome fusion occurred, then genes active in larval stages and genes active in the adult should show different gene trees if they are homologs. Given that there is a lot of data annotating which Drosophila genes are active when, this should be a practical exercise. While I leave this as an exercise for the student, I would point out that it is already known that in Drosophila many proteins are active in both phases. This can probably also be tallied in some fashion. I'm guessing that the fraction of genes shared between stages will be quite large, which would not be very supportive of the fusion hypothesis.
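For the student who wants a head start on the tallying part, here is a minimal sketch of the bookkeeping. Everything in it is an assumption on my part: the function name `stage_overlap`, the stage labels "larva" and "adult", and above all the toy gene table, which is made up and is not real Drosophila data. A real attempt would load stage-specific expression annotations from an actual resource and would still need the gene tree comparison on top of this.

```python
def stage_overlap(expression):
    """Tally genes by the life stages in which they are detected.

    `expression` maps a gene identifier to the set of stage labels
    (hypothetical labels such as "larva" and "adult") where it is active.
    """
    larval = {gene for gene, stages in expression.items() if "larva" in stages}
    adult = {gene for gene, stages in expression.items() if "adult" in stages}
    shared = larval & adult   # active in both stages
    either = larval | adult   # active in at least one of the two stages
    return {
        "larva_only": len(larval - adult),
        "adult_only": len(adult - larval),
        "shared": len(shared),
        # Fraction of larva-or-adult genes that are active in both
        "shared_fraction": len(shared) / len(either) if either else 0.0,
    }


# Made-up toy annotations for illustration only, NOT real data
toy_expression = {
    "geneA": {"larva", "adult"},
    "geneB": {"larva"},
    "geneC": {"adult"},
    "geneD": {"larva", "adult"},
    "geneE": {"embryo"},  # ignored: active in neither stage of interest
}

result = stage_overlap(toy_expression)
print(result)
```

A large shared fraction on real data would be the pattern I'm guessing at above; a small one would at least leave the door open for the fusion idea.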

Should a paper like this get into a journal such as PNAS? Given what I've written above, I think not, simply on its demerits. On the other hand, crazy hypotheses do need a place to go because they are sometimes the right hypotheses -- Margulis's formulation of the endosymbiont hypothesis had very tough sledding on its path to the textbooks. However, in the modern world there is a place for odd speculations and journeying outside your expertise. It's called a blog!
Williamson DI (2009). Caterpillars evolved from onychophorans by hybridogenesis. Proceedings of the National Academy of Sciences of the United States of America PMID: 19717430

1 comment:

Jim said...

Hey Keith, On the contrary, I have always felt that PNAS was a great place to read some 'out there' ideas. I used to enjoy going through each new issue hoping to find some crazy communications. One of my favorites is the Eccles classic "Evolution of Consciousness":
Jim Deeds