Thursday, May 19, 2011

Forums: Open Beats Closed Hands Down

Around the Internet, there are a number of communities in which scientists can swap useful information.   SEQAnswers is a very useful site I frequent; BioStar is one I don't but probably should.  Life Technologies has set up a community around Ion Torrent, and the contrast between that and SEQAnswers is a useful one.

SEQAnswers has a straightforward access policy.  Anyone can view content, but to post or reply you must register for a membership.  This approach appears to have been very successful, as there is a healthy number of individuals posting to the site.  You can browse around and figure out if the site applies, plus search engines such as Google can steer individuals in.  SEQAnswers boasts a number of authors of major second generation sequencing analysis packages, including Bowtie, Tophat, BFAST and Samtools, as regular contributors.  There is a significant network benefit to this; quality encourages quality and conversely such folks must be judicious in the number of forums they actively participate in.  The management of SEQAnswers applies a light hand, occasionally moving posts to more relevant forums and smothering all spam.  In addition to the forums, a key asset is a large wiki on second generation sequencing packages.

The Ion Community is set up on a very different basis.  It has two sections, each with its own membership restrictions.  PGM Users is open only to registered owners of the sequencing system; Torrent Dev is open to that group plus anyone registered for the Grand Challenges.  Each section has both discussion areas and documents.  The site is flashy, though more than a few links are indirect detours to what you really want.

Now, there are plenty of examples of how maintaining some control over a site can be productive.  I was recently trying to eradicate one of these nefarious fake antivirus viruses from our home computer, and on one major security software company's forums I found what looked suspiciously like a link that would infect the visitor with one of these viruses.  Keeping a single point of origin for documents can be useful as well, to reduce confusion.  For example, if you Google around for information on 454 fusion amplicon design it is easy to find outdated information.  We also wouldn't want any forum to devolve to the level of the Biotech Rumor Mill, which by its very nature must allow unregistered posters, and as a result is a mudpit of insults and near(?)-libel.

But, Ion's approach is in my opinion strongly self-defeating.  I won't go into detail here, but in the extreme form they have on two occasions (this thread and this one) argued for the suppression of information on PGM from SEQAnswers (I will try to tackle this soon, but after getting a chance to talk to at least one person with the Ion side of the story).  But it's also easy to argue from a purely practical standpoint, rather than a philosophical one, that this approach is not doing their platform any favors.

The first problem is that the closed access means that you can't lurk on the cheap there; either commit to being part of the community or stay out.  This prevents sucking people in slowly; for many the barrier of registering -- particularly since your registration does not become instantly active -- is too high.  Indeed, I would offer as evidence for this that the first major PGM-tuned software package not from Ion or a commercial partner was apparently not spurred by the Ion Community, but by Nick Loman's post on assembly of Ion data.  Nick's original post apparently received over 2K views, which must be at least an order of magnitude larger than the current Ion Community membership.

The second problem is the two-layer design.  Okay, I'll admit it -- it's maddening to be excluded from the PGM Users forum.  How do I know that there are not discussions there I could either benefit from or contribute to?  If someone starts a discussion there better suited for Torrent Dev, will it get bumped up?  If so, who decides?  But worse than that, many technical documents that I would find valuable are beyond that access barrier.

The third problem is that I find these rigid definitions completely at odds with scientific reality.  Just because I don't own a PGM doesn't mean I might not try to do everything but run the instrument.  As an example on another platform, my colleagues & I have run SureSelect on Illumina where we did all the steps except shear the DNA (outsourced), prep the flowcell and run the flowcell.  Furthermore, in a modern collaborative environment, one lab may own the machine while another lab works up the samples. It's also not clear what any of this "security by obscurity" is buying; given the number of groups using Ion, whomever they're trying to hide the information from can certainly find a leak.  Ion should heed the example of the music industry, which failed to provide legitimate means to supply a growing demand for digital music, and thereby spawned widespread illegal file sharing.  Plus, many browsers are potential buyers -- people want to know what they are really buying into and to start storyboarding what running an instrument would mean in terms of personnel and auxiliary equipment.

The fourth problem is discoverability.  How do I find information?  Well, for second generation sequencing stuff it is the trio of PubMed, Google and the SEQAnswers software wiki.  PubMed is great, but many packages show up online long before they are published.  Google is useful if you know what you are looking for, and SEQAnswers is great if someone has logged it there (and in general, once I see something I make sure it is logged there).

But with the Ion Community, those last two are problematic.  The objections Ion has raised to links going to the interior of the community mean that I don't dare put such links in the Wiki; but conversely a link just pointing to the community is not terribly useful.  But worst of all, since Ion apparently won't let Google in, their stuff is invisible to that valuable tool.  As evidence, at the moment if you Google for information on their TMAP aligner with "tmap source code ion torrent", nothing from the community comes up (but threads on SEQAnswers do!).

Ion isn't the first, and probably won't be the last, company to try to have its proprietary control cake yet eat its Internet openness too.  Life has SOLiD Community, Helicos has one for HeliScope, PacBio has DevNet for SMRT sequencing and so forth.  It takes some real courage to relax some control and invite the whole world to your party.  Some folks would no doubt see view-without-registration as a loss of useful marketing data, forgetting that the openness of a site will itself draw in customers.  Anyone launching a new platform should really ask themselves: will I be better off trying to control a rare destination for a few visitors, or perhaps just cultivating a sub-community over at SEQAnswers?


Anonymous said...

Ion Torrent is definitely being very heavy-handed, and I would recommend that no one sign any sort of collaboration agreement with them. There was a paper I was asked to do some bioinformatics work on that they threatened to sue to suppress—they didn't like that the numbers were not as favorable as their ads.

The paper is not being published, though the research in it is good. As a minor co-author, I had little or no say in this decision. I just know that I'll never sign any agreements with Ion Torrent, and will advise others to be very, very wary of any claims Ion Torrent makes. I think the company lost their way when they got sold.

Kevin said...

I too find it disturbing that I have to jump through so many hoops to explore data analysis software options available for the Ion Torrent.
Up to now I have yet to go back to the 'community' to find the information that I required.

I am sure it would have been easier if I had asked their FAS, but it does point out the fact that if you keep forums closed, they remain ineffective for promoting a product.
The SOLiD community is another one that features mainly company propaganda and evangelistic posts.
For real-life information, look elsewhere.