High-throughput sequencing of genomes is over twenty years old, and it demanded the development of automated pipelines for annotating the resulting data. I've worked on such pipelines since the early 1990s, implementing them as a student and at two different corporate stops. Indeed, we have been reviewing results from my pipeline against some of the other ones out there to see what could be done better. And unfortunately, I've found infuriating problems with RefSeq entries annotated with NCBI's bacterial genome annotation pipeline. Now I'm usually one to sing the praises of NCBI -- they are a key resource for biological research and they make multiple spectacular public services freely available to the entire world. But I'm afraid this time I need to vent.
Tuesday, August 15, 2017
Last week's news contained a story sure to raise eyebrows. A group of computer security researchers from the University of Washington claimed to have demonstrated that they could hijack a computer by sequencing a carefully constructed DNA fragment. Visions of NextSeqs rampaging through the streets immediately sprang to mind. The paper is interesting and has some useful warnings for the bioinformatics community, but the news coverage has certainly been strong on hype and alarmism.
Saturday, August 05, 2017
Over on Quora, a common type of question is "Can I be a computational biologist if I am now an X?" Personally, I take a very broad view and think just about anyone with intellectual curiosity can become any kind of scientist. A related type of question is "How skilled do I need to be in Y to succeed in computational biology?", where Y is most often programming, biology or math. I got to thinking about this and started wondering whether I am actually at all skilled in math. Here are the results of that analysis.