Thursday, November 08, 2018

No, the Grove Fallacy Can't be Retired Yet

Vijay Pande has a thought-provoking piece in Scientific American on the Grove Fallacy, though in the end I'm afraid mostly what he provokes in me is the thought that he's in most cases pretty far off base. In the piece, titled "How to Engineer Biology", he claims that the Grove Fallacy -- the idea that biology can't be tamed by engineering -- is quickly being put to rest.  And Pande isn't some naive Silicon Valley type, but a professor at Stanford whose lab works in experimental biology.  So he has some street cred -- but that doesn't mean he isn't mostly wrong.
Gotta get the space geek stuff done with first.  Pande comments that Apollo 1 didn't land on the moon; Apollo 11 did.  Not really a good choice for something extolling the wonders of engineering: Apollo 1 burned in a ground test, killing the three astronauts inside.  The investigation of the tragedy highlighted grievous engineering issues -- the hatch door opened inwards and there were no provisions to rapidly extract the astronauts*.  And the fire exposed issues in actually building the spacecraft** -- the blaze was started by poorly insulated wiring.  So even in a highly funded, heavily engineered program, there were still problems with execution and reproducibility.

But I do take Pande's point: the Apollo program and the prior Gemini program and associated unmanned probes were an exercise in systematically checking off a detailed to-do list.  Spacewalk? Gemini 4.  Rendezvous in space?  Gemini 6 & 7.  Docking?  Gemini 8.  Soft landing on the moon?  Surveyor.  All systems in order?  Apollo 10.

But there were a lot of mishaps along the way -- with the Apollo 1 inferno being the most horrific.  Gene Cernan exhausted himself on his Gemini 9 spacewalk trying to perform basic operations like using a wrench, because nobody had thought through what Newton's Laws of Motion really entailed in zero gravity.  Of course a fix was produced -- handholds and footholds let Buzz Aldrin accomplish these same tasks with ease on Gemini 12.  

But as I've said before, Apollo knew the majority of the challenges before the program even started.  It was possible to have a very stepwise program because almost all of the steps were known! What Pande is far too optimistic about is the difficulty of figuring that out, particularly when trying to deliver therapies.  Only a handful of Gemini missions discovered previously unknown problems (such as the need for handholds); nearly every time we put a new drug in people we seem to discover new biology that wasn't previously hinted at.

Pande notes that billion dollar bridges are designed but rarely fail, whereas billion dollar drugs frequently fail.  His prescription: "with design, however, we can plan and progress very systematically along a roadmap and make incremental innovations along the way".  At one level that is saying water is wet -- drug developers routinely make incremental innovations along the way.   But at another it completely ignores one of the fundamental problems in drug discovery: many problem drugs can only be identified as such in ginormous human populations.

Pande is also very fond of machine learning, aka artificial intelligence.  It's a very lively field, one I am very interested in, and one that is increasingly delivering important results.  But just today I was looking at a humorous list of examples from the ML literature in which the algorithms discovered one way or another to succeed while subverting the goal of the trainer. Ask the algorithm to minimize the energy for an arrangement of carbon atoms?  The algorithm finds no rule against putting them all in exactly the same three-dimensional location.  Many were trained using models of physics -- and succeeded by violating the actual rules of physics by exploiting errors in the models.  In other cases algorithms hid cookie crumbs in image data, or erased the input data so it matched a blank output. In still other cases the algorithms discovered true cluelessness in the programmers -- given two classes in perfectly alternating order, the algorithm learns to assign label A to odd-numbered examples and label B to even-numbered ones.  Pick the wrong figure of merit -- or a flawed one -- and delusions follow that are eventually dashed in the real world.
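That alternating-order trap is easy to reproduce.  Here's a minimal toy sketch (entirely hypothetical -- not from Pande's piece or the list I saw) of a "model" that keys on presentation order rather than any real feature:

```python
import random

# Hypothetical "model" that ignores the data entirely and keys on
# presentation order: label A for even-indexed examples, B for odd.
def parity_model(index):
    return "A" if index % 2 == 0 else "B"

# Curated benchmark: the two classes arrive in perfectly alternating order.
alternating = ["A" if i % 2 == 0 else "B" for i in range(100)]
acc_alternating = sum(parity_model(i) == y
                      for i, y in enumerate(alternating)) / 100

# "Real-world" data: the same labels, but in random order.
random.seed(0)
shuffled = alternating[:]
random.shuffle(shuffled)
acc_shuffled = sum(parity_model(i) == y
                   for i, y in enumerate(shuffled)) / 100

print(acc_alternating)  # 1.0 -- looks like a perfect classifier
print(acc_shuffled)     # ~0.5 -- it never learned anything real
```

Perfect on the benchmark, a coin flip in the wild -- exactly the kind of delusion a flawed figure of merit manufactures.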

Now this all sounds silly -- except these examples illustrate the challenge of developing good models.  Biology and drug development would be soooo much easier if the models were truly good.   We know most of our models are really problematic, and we don't know enough biology to necessarily fix them.  And how do you model exactly a genetically heterogeneous population with different co-morbidities and co-treatments?  Or the fact that every tumor is a heterogeneous mess that isn't even self-similar, let alone identical to anyone else's?

Animals of course are a help -- except they aren't little people.  Does your drug really cause kidney toxicity, or is it just causing it in rodents because of their highly concentrated urine?  Rodents and dogs don't have exactly the same cytochrome P450s under exactly the same transcriptional control, so should we expect them to precisely model ADMET in humans?  And then there's the immune system -- how are you supposed to model an immunomodulatory drug -- particularly an antibody -- in a mouse when so many of the little details are different***?  And it's those little details that often lead to big differences.

Pande claims that biomarkers have historically been discovered by "a bespoke, one-off process -- so the discovery of PSA for prostate cancer, for instance, does not suggest a biomarker for ovarian cancer".  Well, that could be because few if any biomarkers will be shared between cancers -- particularly ones you wish to detect with an antibody test in serum.  PSA has a checkered reputation, yet is there a better prostate marker?  And in any case, we've now seen several decades of academics and companies trying to discover biomarkers in industrial fashion -- that was the premise of Millennium Predictive Medicine (which I was never in but interacted with) and a few dozen other companies in the nineties and aughties -- which yielded very few if any major successes.  And at least one case where the touted results were purely due to the order of sample presentation.  Or other confounders: one Millennium clinical study of Velcade using Affymetrix arrays tried very hard to standardize the biopsy procedures, but still the strongest signal by far in the data was which medical center had taken the sample.
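That medical-center signal is a classic batch effect, and it takes only a few lines to see how easily one can dominate a dataset.  A toy sketch with made-up numbers (not the actual Millennium data): simulate expression profiles from two sites, add a modest per-site offset, and the first principal component tracks the site rather than any biology:

```python
import numpy as np

# Hypothetical setup: 40 patients, 200 genes, biopsies from two centers.
rng = np.random.default_rng(0)
center = np.repeat([0, 1], 20)          # which site took each biopsy
expr = rng.normal(size=(40, 200))       # the "biology" is pure noise here
expr += center[:, None] * 1.5           # modest site-specific offset, every gene

# First principal component via SVD of the column-centered matrix.
centered = expr - expr.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)
pc1 = u[:, 0] * s[0]

# PC1 correlates almost perfectly with the medical center, not the disease.
r = abs(np.corrcoef(pc1, center)[0, 1])
print(round(r, 2))  # close to 1.0
```

Standardizing the biopsy protocol helps, but a small consistent offset across thousands of probes is enough to swamp the signal you actually care about.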

Of course the worst part of all this is that I believe much of what Pande is saying -- but I believe it with a great deal of reluctance and restraint.  AI/ML is an important set of tools, but one that needs to be treated extremely skeptically.  Continuing to work out the hierarchy of cell types and the nature of genetic circuits is going to deliver results.  Automation and genomics give us great gains.  Further nailing down protocols, characterizing reagents like antibodies and documenting everything will help with reproducibility.  But these approaches all bump against the inherent mess that is biology and medicine, and it is most critical for those of us who believe in those approaches to stay soberly aware of their limits.

In the end, the only way we know what a drug does in large patient populations is to dose large patient populations.  Only then can rare side effects emerge (e.g. Baycol).  Or the rare beneficial side effect.  That is the ultimate reproducibility -- getting the same result in every patient -- and it is probably permanently out of reach due to the complexity of actual human populations.
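The arithmetic behind that point is brutally simple.  A back-of-envelope sketch with hypothetical rates: a 1-in-10,000 adverse event will probably never show up in a pivotal trial, but it almost certainly will in the post-marketing population:

```python
# Probability of observing at least one case of an adverse event with
# per-patient rate `rate` among n independent patients (hypothetical rates).
def p_at_least_one(rate, n_patients):
    return 1 - (1 - rate) ** n_patients

rare = 1 / 10_000  # a 1-in-10,000 side effect

print(round(p_at_least_one(rare, 3_000), 2))      # typical Phase III: ~0.26
print(round(p_at_least_one(rare, 1_000_000), 4))  # post-marketing: ~1.0
```

A trial of a few thousand patients has maybe a one-in-four chance of seeing even a single case -- which is why drugs like Baycol look clean until they reach the market.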

The modern wave of gene and cell therapy is very welcome, but it comes with serious strings attached.  Gene therapies are targeting relatively simple cases of disease in which altering a small number of cells of a single type is expected to have significant patient benefit.  CAR-T is amazing, but still only works in a few cancer subtypes, and we've already seen a case report in which a tumor cell was accidentally engineered in the process.  

Many good things will come from the trends Pande extols, but treating biology and drug development as purely an engineering and design problem isn't going to solve things.  

Let's go back to that bridge example.  Let's take the specific example of the new Tappan Zee bridge, a recently completed span over the Hudson River I use periodically.  The engineers for that structure would have taken a large body of knowledge on bridge design and integrated it with data on the winds and currents in the area, as well as the geology and the traffic projections for the span.  They then used this information to design the overall structure, break it down into component parts, and devise a construction schedule that assembled those parts in the correct order so that the structure was stable at all times prior to completion.  Wonderful -- and I do love a good bridge.

But perhaps far more relevant to this discussion is the bridge a half mile from me which carries my street over a railroad track.  Said bridge was condemned in August and is now open only to pedestrians.  Apparently the steel is badly corroded -- not surprising given that in winter the streets department tends to worship Lot's wife.  

The engineering solution to repairing the bridge is complex but fathomable -- find the most compromised elements and replace or reinforce them.  But what if we needed a drug discovery solution?  Let the bridge stand in for a cell type we want to modulate to improve some condition.

Well first off, you wouldn't have a great picture of what is wrong.  Something more like a pattern -- the bridge is flexing too much or has a weird shimmy if you speed over it.  But we can't look at the structural elements directly.  We have a hypothesis they've rusted and need reinforcement.  Alas, none of our existing drugs are good at bolstering things -- we have some leads on inhibiting further rust but they aren't proven yet.  Perhaps they are sensitive to the air or UV.  And we can't use a crane to deliver them -- no, we'll spray them on the hoods of cars headed that way and hope enough drips off to do good.  And that it doesn't wreck the concrete.

Whatever the problems with my analogy, bridges are very different from patients, tissues or cells.  We are limited in how we can collect information -- and worse, we are very limited in how we can intervene.   Look at how many things we have understood relatively well for quite a long time -- say KRAS in cancer**** -- and yet we sit powerless to act on it in a clinical setting.  

In the end the challenge is that biology, like this post, is riddled with footnotes and asterisks (and even footnotes to the footnotes, which no decent writer resorts to).  Industrial approaches to biology***** may better illustrate both the rules and the exceptions, but that won't change the complex nature of biology or the further legions of exceptions that will hide in that complexity.  

* yes, I know all about Gus Grissom's Mercury hatch door blowing -- if you haven't discovered my love for the book and movie The Right Stuff, now you know.  And it remains a major mystery why that door blew.  And Grissom would die in Apollo 1, which lacked such a door.

** I have a friend whose summer job was to wire lunar modules -- a teenager was a good size to get into the cramped spaces.  Lucky SOB!

*** I found a paper in Cell once that claimed a particular phosphorylation site on an apoptosis regulator is very important to immune regulation.  And they had a lot of data to argue the case.  So maybe it is  -- in mouse -- but the site isn't conserved in humans.  Cell wouldn't publish my note -- not important enough for their tastes.  Pre-blog, of course -- now I would have ranted here.

**** my last two employers have been working on this, though I haven't been

***** Pande's piece really lacks an appreciation for small-scale (aka "bespoke"******) science, which in biology can be powerful.  Some wonderful things come from clever people watching carefully.  But I'll leave that defense to another post -- or to other authors.

****** I pride myself in my vocabulary, but I must say I never heard this word until I left graduate school.  
