Wednesday, November 04, 2015

Comments on "The use and misuse of supplementary material in science publications"

Mihai Pop and Steven Salzberg have an opinion piece in BMC Bioinformatics titled "Use and mis-use of supplementary material in science publications", examining issues arising from the ever-growing data supplements accompanying papers, particularly in high-profile journals with strict article length limits.  Pop & Salzberg make a number of important points, but there are some topics they didn't cover that I think are also worth treatment.

Pop & Salzberg start out by stating that supplementary sections can be a useful way to improve the reading of a paper.  For example, the paper itself might have a Materials & Methods section that emphasizes readability, while the supplemental methods have the gory details.  Supplements can also have very rich tables that would be just dull in the text.  Pop and Salzberg decry that many supplements have their own reference lists, which aren't captured by various citation indexers, as well as the horrific "see supplemental text" references that simply dump the reader in 10s (or 100s) of pages of supplement.

I can agree on those points, though one aspect that irks me, and I do complain about it as a referee, is when truly key figures and tables are relegated to the supplement.  Clearly this is a judgement call, but I think it is clear in some cases that awful judgement is employed.  Worse is when key pieces of data are relegated to the supplement.  For example, during part of my Millennium career I was collecting phosphorylation sites for kinases we were interested in, most of which had low single digits of such target sites in the literature.  I learned quickly to skim through the supplements, as this was often where new phosphorylation sites were reported.

Even worse have been papers in which substantial details on entirely new methods, particularly computational methods, are deported to the supplements.  As Pop and Salzberg note, reviewers are less likely to scrutinize the supplements.  This also typifies the difficult consequence of highly interdisciplinary papers, which are unlikely to have all their facets reviewed by three reviewers of sufficient quality.  Now, a while back on Twitter one person commented that they like this coat-tail effect, because otherwise method-focused scientists such as she would never get Science/Nature/Cell papers.  But the flip side would be that should one of these papers blow up after insufficient review (e.g. Arsenic Life, Reactome), that's not going to feel very good.

Pop & Salzberg don't go after the problem of data in supplements; many times these are formatted or delivered very badly.  Putting primer sequences or gene name lists in a PDF is bad enough; more than once I've encountered cases in which these are present only in images of the original tables!

The last problem I've seen in papers, and my graduate adviser George Church is a repeat offender, is supplementary materials which effectively have one or more additional papers entombed within them.  For example, George's polony sequencing-by-ligation paper has a sizable supplement, in which they characterized the variability in cut site positioning of a Type IIS restriction endonuclease used to generate mate pairs.  If you are interested in Type IIS restriction endonucleases, it was certainly the most detailed study of this phenomenon up to that point, and might well still be -- but unless you read the supplement, you'll never know -- it certainly isn't captured in Medline.

Scientific journals are undergoing rapid shifts, as many become (either de facto or in actuality) purely on-line, and all sorts of new features are being tested.  A vigorous discussion of the appropriate uses and inappropriate abuses of supplementary materials is long overdue: Pop and Salzberg have done the community a service by getting the ball rolling.


Anonymous said...

Ironically, the supplemental table in Pop & Salzberg's paper cannot be opened in Excel, at least not on my computer...


Anonymous said...

Have you tried to access the supplementary table of the paper? Currently there is the Word version (.doc; no "x") of the main paper, and not the Excel sheet.

Keith Robison said...

ROFL! Did I check the supplementary material in the paper on supplementary material? No!

Chris said...

It seems to me that the real issue is the arms race between people trying to get into a top-tier journal. In the genomics field, for example, one might have to analyze 1000 or more genomes to get a result that merits Science or Nature. If one is to really report all data in the journal (variants, coverage, copy number, SV, statistical comparisons, yadda yadda) then the supplement necessarily has to grow to ludicrous lengths.

Sebastien Lemieux said...

In fact, the supplementary file is the manuscript in docx format. You can open it by downloading the .xlsx file and replacing the extension with .docx. Seems to me like a little blunder on BMC Bioinformatics' part.