There are few things you can appreciate better than something you have striven hard at and failed. For a bit of time I was a minor expert in G-protein coupled receptors (GPCRs) -- well, really just the curator of a private database.
GPCRs are molecular wonders. The human genome contains around a thousand or so, but a large fraction of these are olfactory receptors -- our detectors of scents. GPCRs are organized into at least three major sequence families -- there were always a few more trying to break in, and I've lost track of the current opinion on these unusual families.
GPCRs have two key characteristics. First, they signal by coupling to heterotrimeric GTP-binding proteins, or G-proteins. Second, they have seven membrane-spanning domains. Indeed, the main reason to claim some new looks-like-nothing-else protein as a GPCR was the prediction of this seven-transmembrane, or 7TM, character. That 7TM character also makes them crystallographic sinkholes -- I think it is still true that only one crystal structure has been reported (bovine rhodopsin).
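To make the idea of predicting 7TM character concrete, here is a minimal sketch of the classic sliding-window hydropathy approach. It averages Kyte-Doolittle hydropathy values over a 19-residue window and counts the stretches that rise above a cutoff. The window size, the 1.6 cutoff, the function names, and the toy sequence are all assumptions chosen for illustration; real predictors are considerably more sophisticated, and nothing here is meant to describe what any particular genomics group actually ran.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence.

    Raises KeyError on non-standard residue letters.
    """
    scores = [KD[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def count_hydrophobic_segments(seq, window=19, threshold=1.6):
    """Count separate runs of windows whose mean hydropathy meets the threshold.

    The window and threshold defaults are illustrative assumptions.
    """
    count = 0
    inside = False
    for value in hydropathy_profile(seq, window):
        if value >= threshold:
            if not inside:
                count += 1
                inside = True
        else:
            inside = False
    return count


# Hypothetical toy sequence: seven poly-leucine stretches separated by
# charged linkers. It is not a real receptor.
LINKER = "DKSGEDKSGEDKSGE"
toy_7tm = (LINKER + "L" * 20) * 7 + LINKER

print(count_hydrophobic_segments(toy_7tm))
```

A count of seven from a scan like this is only a hint: plenty of non-GPCR proteins have multiple hydrophobic stretches, and real helices are rarely as clean as poly-leucine.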
GPCRs have an amazing variety of ligands, ranging from small proteins to peptides to sugars to lipids to nucleotides to what have you. As mentioned above, our sense of smell is largely driven by GPCRs -- the discovery of this large subfamily led to a Nobel prize. All sorts of molecules have smells, suggesting the versatility of these proteins. Some fundamental tastes are also detected by GPCRs. Our very entry into this world is governed by a GPCR (oxytocin receptor). Perhaps the most amazing GPCRs are those that detect light and enable our vision. While a photon isn't truly the ligand for these receptors (a photoisomerization product of a covalently bound small molecule is), it is fun to think of it that way. If someday a physiological role is found for a noble gas, I wouldn't want to bet against a GPCR being the receptor for it.
GPCRs are also key drug targets. Many neurotransmitters are detected by GPCRs, along with many important hormones. Because they are such important drug targets, special care was taken by every genomics company in sifting through their data to ensure that no GPCR slipped through unnoticed. Many that were found resembled olfactory receptors and probably are -- though sometimes they are clearly expressed in rather peculiar places outside the nose.
Once a GPCR is found, life is not easy. In order to configure a high-throughput screen for a small molecule (a few GPCRs are antibody targets, namely the chemokine receptors), you really need to know what the input is and which G-protein the output is sent out on. Knowing these also doesn't hurt in deducing the physiological role for the GPCR. The G-protein is the easy side. The specificity is mostly in the alpha subunit, of which there are around 20, though these fall into a few subfamilies. Most GPCRs talk to only one of these subfamilies, and better yet for drug discovery there are mutants which seem to be rather promiscuous. So that's taken care of.
But finding a ligand: good luck! Again, since GPCRs seem to bind anything, you can assume a novel one might bind just about anything. Treeing them with their kinfolk can suggest possible ligand classes, as neighborhoods on the tree will often have similar ligands, but that's no help if your novel GPCR doesn't look much like the rest. So every lab would throw a small kitchen sink of candidate ligands at their 'orphan' GPCRs and look for a signal -- and based on our experience and what's in the literature, a signal didn't turn up very often. New ligands would appear, often in small cascades -- once a new class of ligand was identified (such as short chain fatty acids), a slew of papers would follow as a bunch of related molecules were explored on orphan receptors. But the last time I checked my database of receptors of interest without ligands, the list was still long.
One interesting possibility is that some of these receptors don't have specific ligands, because they may not function on their own. Heterodimerization of GPCRs has been reported, and other families of receptors (kinases, nuclear hormone receptors) show how proteins lacking in some key receptor functions can still be very important via heterodimerizing with close relatives.
So it is with a bit of envy I view the recent press release from Compugen, an Israeli company that built an informatics approach to identifying novel transcripts and splice variants. They report finding, and demonstrating the function of, eight novel peptide ligands for GPCRs, some for orphan GPCRs and others as additional ligands for previously characterized ones. This is a challenging problem -- one which I and several more clever people at MLNM beat our heads against -- and clearly Compugen has done well. Part of their identification relied on finding characteristic amino acid motifs recognized by the proteases which process these peptides -- many peptide GPCR ligands are clipped from larger precursors. Often, multiple ligands are encoded by the same precursor. Finding novel precursors is not trivial -- not only are they very short open reading frames, and therefore difficult to distinguish from random open reading frames appearing in DNA, but many are also on fast evolutionary clocks -- which means that finding these peptides by cross-searching the human and mouse (for example) genomes isn't always much help.
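As a rough illustration of the motif idea, the sketch below splits a precursor sequence at runs of two or more basic residues (K or R), the kind of dibasic site that prohormone-processing proteases commonly cut at, and keeps the fragments in a plausible peptide size range. The press release does not say which motifs Compugen used, so the motif, the length limits, the function name, and the toy precursor are all assumptions for illustration, not a description of their method.

```python
import re

# Runs of two or more basic residues (K/R), treated as one candidate
# cleavage site. Assumed motif for illustration only.
BASIC_RUN = re.compile(r"[KR]{2,}")


def candidate_peptides(precursor, min_len=4, max_len=40):
    """Split a precursor at basic-residue runs and keep mid-sized fragments.

    The length limits are arbitrary illustrative defaults.
    """
    fragments = BASIC_RUN.split(precursor)
    return [f for f in fragments if min_len <= len(f) <= max_len]


# Hypothetical toy precursor, not a real gene product.
toy_precursor = "MASLLLAAKRYGGFMRRSPEPTIDEKKAAW"

print(candidate_peptides(toy_precursor))
```

A scan this naive over-predicts badly, since dibasic pairs turn up by chance in almost any protein and not every such site is actually cut, which is part of why short, fast-evolving precursors are so hard to pick out of genomic noise.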
So hats off to Compugen. I would be shocked if we are done finding GPCR ligands, but to find eight at once is quite an achievement.