I have just finished reading this excellent book, **Statistics Done Wrong: The Woefully Complete Guide** by **Alex Reinhart**. I’d recommend it to anyone interested in quantitative biology and particularly to PhD students starting out in biomedical science.

Statistics is a topic that many people find difficult to grasp. I think there are a couple of reasons for this that I’ll go into below. The aim of this book is to comprehensively cover the common mistakes and errors that continually crop up in data analysis. The author writes in an easy-to-understand style and – this is the important bit – he dispenses with nearly all the equations. The result is an accessible guide on “what not to do” in significance testing.

I think there are two main reasons why people find statistics tough: **uncertainty** and **mathematical anxiety**.

First, *uncertainty*. What I mean is the uncertainty over what statistical approach to take, rather than the uncertainty that can be studied using statistics! It is very easy to find fault with the statistical approaches that a biologist has used in a study. Why did they show the confidence interval and not the standard deviation? Why haven’t they corrected for multiple testing…? Statistics has a “gotcha” reputation. The reason for the uncertainty is that it is difficult to come up with a hard-and-fast set of guidelines, because the right approach depends a lot on the type of data that has been collected, what is being tested, and so on. And there are often several ways to do the same thing. This uncertainty doesn’t go away even with a firm grounding in statistics. The methods are nearly always up for debate as far as I can see. And I think it is this uncertainty that prevents people from really engaging with statistics. In the absence of clear direction, having in mind a set of “what not to do” seems like a useful approach to stats.

Second, *mathematical anxiety*, i.e. fear of maths. Biology has a reputation for being populated by people who ended up here through an affinity with science but a discomfort with physics and maths. This is unfair as there are many areas of biology where this is not true and statistical/quantitative approaches are right at the forefront. Nonetheless, there is a reason why there are umpteen “Statistics for Biologists” books in the bookshop. Now, the way that statistics is taught is to crunch through the equations that describe statistical concepts. Again, this means that people who really need to know about statistics for their research are held back if they don’t have a mathematical background or just find maths a bit daunting. The situation is well described by a recent post at Will Kurt’s excellent Count Bayesie blog on the teaching of statistics. His point is: insisting that students know these equations gets in the way of them understanding statistics. Nowadays, calculating something like the standard deviation is trivial using a computer *and* we are unlikely to need to know the derivation of an equation in order to do our work. We should just skip the equations and explain *why*.
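
To show how little effort this takes, here is a minimal sketch in Python using only the standard library’s `statistics` module. The measurements are made up purely for illustration and are not taken from the book.

```python
import statistics

# Hypothetical measurements, invented purely for illustration
measurements = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.mean(measurements)

# Sample standard deviation: divides by n - 1, the usual choice when
# the data are a sample from a larger population
sample_sd = statistics.stdev(measurements)

# Population standard deviation: divides by n, appropriate only when
# the data are the entire population of interest
population_sd = statistics.pstdev(measurements)

print(f"mean = {mean}")
print(f"sample SD = {sample_sd:.3f}")
print(f"population SD = {population_sd:.3f}")
```

The arithmetic is a single function call. The part that needs thought is the *why*: knowing which of the two versions suits your data.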

The nice thing about this book is that the author has collected together all the faux pas that you’re likely to encounter and explained how to avoid them. This goes some way to addressing *uncertainty* over which methods to use. The author has also dispensed with the equations, so the *mathematically anxious* can pick it up without fear. These features make this book different to other stats books that I’ve read.

You can find copies at many online retailers. It’s published by No Starch. I picked up a copy after reading about it on Nathan Yau’s Flowing Data blog.

—

The post title comes from “My Blank Pages” by Velvet Crush from their Teenage Symphonies to God LP.

Thank you for the recommendation, it looks like something that would be very useful on a lab bookshelf. I’m getting one for ours asap. Statistics learning is seriously underappreciated these days!

If I am allowed a recommendation myself: “Discovering Statistics Using SPSS” by Andy Field is a great handbook if you use SPSS.

Thanks for the comment and the recommendation, Alex.