Tuesday, August 21, 2007

Personal breakthrough?

One of the contributing factors to a poor recent post frequency is an obsessive tackling of a particular problem at work, one that strayed into the borders of my programming competency. A complete solution is now coded; tomorrow I start trying to make it work.

Much of the programming in bioinformatics is pretty straightforward data slinging -- extract some data from a set of sources, cross-reference it, condense it, slice it, dice it, etc. Real algorithms are left to a small cadre of programmers working on, well, real algorithms.

Periodically though, one is faced with dusting off some algorithmics. In this case, I realized my problem could be formulated as a graph-walking problem, though with some painful rules about walking the graph. One way to think about it (which only occurred to me now, it probably would have been a help), is that the nodes come in different colors & there are rules for when you must or must not switch colors during the traverse. There's even another attribute (texture?) which has different alternation rules.

After figuring out the original graph idea, I started churning out code to tackle it. However, before long, I started struggling with the endgame of the algorithm -- I could set up a graph which would contain any valid solution, but I couldn't quite put together the code to pull out that solution. A sure sign that things were going south was that my classes & method signatures were becoming bloated, cluttered with lots of parameters & fields. On trying to get what I had running, memory blew up on me.

As is often the case, discussion with Miss Amanda suggested another approach. So I placed all the old code in a separate file & started on the new approach. I had figured out a clever way to reduce the memory requirements, both by a way to compress the representation of edges (because the problem results in many edges in the form S->E, S->E+1, S->E+2, etc) and an approach to avoid where I though the memory pig had really gone hogging.

Not helping any of this was the memory of a pointed article by my personal programming guru on the problems with recursive code. Graph & tree walking are recursive problems, but can be solved with non-recursive coding. Particularly in languages which support custom iterators (such as C# & Python), the non-recursive solutions have significant advantages. But, some such solutions occurred easily to me & others just became more knots of ugly, unproductive code.

But again, the endgame started unnerving me. New classes & methods sprung up, but it wasn't clear if they were really moving me forward or simply putting me in a Red Queen setup. So, another walk with my assistant & another approach.

After a few more days slog, that's the one that's ready to start testing. It feels good -- but nothing like what it will feel like if the thing actually WORKS!

No comments: