There's a running problem with the plots: if you display adjacent plots with the same type of data from two different datasets, then unless you have a damn good reason the axes must be scaled the same! If you are plotting them adjacently, then you probably want me to compare the two plots. I might even want to compare the data even if you didn't intend it -- if you put plots next to each other you're inviting comparison.
Here's an example of a good side-by-side plot from the webinar.
Two different human libraries run on the 8M SMRT cells. The X-axis and Y-axis are scaled identically, so we can easily compare the two libraries and see they are largely the same, though with a bit of a quirk in the middle.
Now let's get to the rogues' gallery. Here's a three-graph comparison of libraries from different organisms run on the 8M SMRT cells. The X-axes are the same -- but note that the Y-axes aren't remotely so -- maxing out at 1.75M for E. coli, 3.5M for B. subtilis and 2M for O. sativa. So if you want to compare the distributions, have fun rescaling everything!
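To make the fix concrete, here is a minimal sketch of how one might compute a single Y range for all three panels before drawing any of them. The function name `shared_limits` and its `pad` parameter are my own inventions for illustration, and the numbers are just the axis maxima read off the slide, not the underlying data.

```python
def shared_limits(panels, pad=0.05):
    """Return one (low, high) range that covers every panel's values.

    panels: a list of lists of numbers, one inner list per plot.
    pad:    fraction of the overall span added as headroom at the top
            (a hypothetical default; tune to taste).
    """
    values = [v for panel in panels for v in panel]
    if not values:
        raise ValueError("need at least one value to compute limits")
    # Count-style data: pin the bottom of the axis at zero.
    low = min(0.0, min(values))
    high = max(values)
    return low, high + (high - low) * pad


# Axis maxima quoted above (E. coli, B. subtilis, O. sativa).
panel_maxima = [[1.75e6], [3.5e6], [2.0e6]]
y_low, y_high = shared_limits(panel_maxima)
# Every panel would then be drawn with this same range, for instance by
# passing it to each axis's set_ylim in matplotlib, or by creating the
# panels with plt.subplots(..., sharey=True) in the first place.
```

With a shared range the B. subtilis panel sets the ceiling, and the other two distributions visibly sit at roughly half its height, which is exactly the comparison the slide makes you work out by hand.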
But maybe you don't really care in that case; the exact shapes of the distributions say something about the different DNA preps. But how about trying to see the difference between 10 hour and 20 hour movies as with this slide? This time the Y-axis is held constant, but the X-axis ends at 140K for 10 hours and at 250K for 20 hours. The audio commentary makes it clear that what you're supposed to take away is that it isn't worth running a library with this distribution for the longer movie -- but with the bad scaling it's really difficult to see what is gained by the longer instrument time.
Okay, enough of these plots of total data versus read length. How about some plots of the read length distribution intended to show the advantage of using a single cell of the new chemistry rather than four cells of the old chemistry? This time the X-axis is correctly fixed but the Y max varies. Actually, for this plot I'd really rather have just curves rather than bars, overlaid on one set of axes, as that would make it even easier to compare the curves -- well, if they can be clearly distinguished. Since I have a very idiosyncratic color sensitivity I often struggle here, but with marker shapes one can make things clear.
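As a rough sketch of the marker idea: give every curve its own marker shape and line style so that nothing depends on color alone. The helper name `series_styles` and the series labels are hypothetical; the marker and line style codes are standard matplotlib ones, so each resulting dictionary could be passed along as keyword arguments to a plot call.

```python
from itertools import cycle

# Standard matplotlib marker and line style codes.
MARKERS = ["o", "s", "^", "D", "v", "x"]
LINESTYLES = ["-", "--", ":", "-."]


def series_styles(labels):
    """Map each series label to a distinct marker and line style.

    The lists are cycled, so styles repeat only once there are more
    series than marker shapes.
    """
    markers = cycle(MARKERS)
    linestyles = cycle(LINESTYLES)
    return {
        label: {"marker": next(markers), "linestyle": next(linestyles)}
        for label in labels
    }


# Hypothetical labels for the two curves being compared.
styles = series_styles(["old chemistry, 4 cells", "new chemistry, 1 cell"])
# e.g. ax.plot(read_lengths, counts, label=name, **styles[name])
```

Two curves drawn this way stay distinguishable in grayscale or for a reader who can't separate the chosen colors.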
Here's a truly egregious example -- the bottom plot represents three times as much data but you'd never be able to tell that, as the top plot maxes at 20K and the bottom at six times that! This Iso-Seq data also could really use some inset plots zooming in on that 40K-60K region; I'd really like to know what happened to that hump in the 2.1. Is that the real size distribution of the mRNA population or not?
Okay, one last plot to beat on. Yet again the sin is the X-axis (it's faint praise, but I couldn't find paired plots that differed in both X and Y axis limits).
Okay, I'm done -- probably because I couldn't find more paired plots.
There are many factors that can lead to poor plots -- rushed preparation, not running your slides by a naive audience in advance, etc. So it takes awareness and discipline to avoid making these mistakes. But you should also have strong incentives, starting with professional pride but also the desire to convince people -- and here convincing has sizable dollar signs associated with success. And at least these are probably errors of inattention, not active crimes of execution such as perspective pie charts.