Statistical hypothesis testing, commonly referred to as “statistics”, is a topic of consternation among cell biologists.
This is a short practical guide I put together for my lab. Hopefully it will be useful to others. Note that statistical hypothesis testing is a huge topic and one post cannot hope to cover everything that you need to know.
What statistical test should I do?
To figure out what statistical test you need to do, look at the table below. But before that, you need to ask yourself a few things.
 What are you comparing?
 What is n?
 What will the test tell you? What is your hypothesis?
 What will the p value (or other summary statistic) mean?
If you are not sure about any of these things, whichever test you do is unlikely to tell you much.
The most important question is: what type of data do you have? This will help you pick the right test.
 Measurement – most data you analyse in cell biology will be in this category. Examples are: number of spots per cell, mean GFP intensity per cell, diameter of nucleus, speed of cell migration…
 Normally-distributed – this means it follows a “bell-shaped curve”, otherwise called a “Gaussian distribution”.
 Not normally-distributed – data that doesn’t fit a normal distribution: skewed data, or data better described by other types of curve.
 Binomial – this is data where there are two possible outcomes. A good example here in cell biology would be a mitotic index measurement (the proportion of cells in mitosis). A cell is either in mitosis or it is not.
 Other – maybe you have ranked or scored data. This is not very common in cell biology. A typical example here would be a scoring chart for a behavioural effect with agreed criteria (0 = normal, 5 = epileptic seizures). For a cell biology experiment, you might have a scoring system for a phenotype, e.g. fragmented Golgi (0 = is not fragmented, 5 = is totally dispersed). These arbitrary systems are not a good idea, especially if the person scoring is not blinded to the experimental procedure. Try to come up with an unbiased measurement procedure.
| What do you want to do? | Measurement (Normal) | Measurement (not Normal) | Binomial |
|---|---|---|---|
| Describe one group | Mean, SD | Median, IQR | Proportion |
| Compare one group to a value | One-sample t-test | Wilcoxon test | Chi-square |
| Compare two unpaired groups | Unpaired t-test | Wilcoxon-Mann-Whitney two-sample rank test | Fisher’s exact test or Chi-square |
| Compare two paired groups | Paired t-test | Wilcoxon signed rank test | McNemar’s test |
| Compare three or more unmatched groups | One-way ANOVA | Kruskal-Wallis test | Chi-square test |
| Compare three or more matched groups | Repeated-measures ANOVA | Friedman test | Cochran’s Q test |
| Quantify association between two variables | Pearson correlation | Spearman correlation | |
| Predict value from another measured variable | Simple linear regression | Nonparametric regression | Simple logistic regression |
| Predict value from several measured or binomial variables | Multiple linear (or nonlinear) regression | | Multiple logistic regression |
Modified from Table 37.1 (p. 298) in Intuitive Biostatistics by Harvey Motulsky, 1995 OUP.
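If it helps to see the table as a lookup, here is a minimal sketch in Python (not Igor). It encodes only the "describe" and "compare" rows of the table above. The names `TESTS` and `pick_test` are made up for this illustration and are not part of any library.

```python
# Rows: what you want to do. Columns: the type of data you have.
# Entries are copied from the "describe" and "compare" rows of the table.
TESTS = {
    "describe one group": {
        "normal": "Mean, SD",
        "not normal": "Median, IQR",
        "binomial": "Proportion",
    },
    "compare one group to a value": {
        "normal": "One-sample t-test",
        "not normal": "Wilcoxon test",
        "binomial": "Chi-square",
    },
    "compare two unpaired groups": {
        "normal": "Unpaired t-test",
        "not normal": "Wilcoxon-Mann-Whitney two-sample rank test",
        "binomial": "Fisher's exact test or Chi-square",
    },
    "compare two paired groups": {
        "normal": "Paired t-test",
        "not normal": "Wilcoxon signed rank test",
        "binomial": "McNemar's test",
    },
    "compare three or more unmatched groups": {
        "normal": "One-way ANOVA",
        "not normal": "Kruskal-Wallis test",
        "binomial": "Chi-square test",
    },
    "compare three or more matched groups": {
        "normal": "Repeated-measures ANOVA",
        "not normal": "Friedman test",
        "binomial": "Cochran's Q test",
    },
}


def pick_test(goal, data_type):
    """Return the table entry for a goal and a data type (case-insensitive)."""
    return TESTS[goal.strip().lower()][data_type.strip().lower()]


print(pick_test("Compare two unpaired groups", "not normal"))
```

An unknown goal or data type raises a KeyError. That is deliberate: if you cannot say what you are comparing or what type of data you have, no test should be picked for you.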
What do “paired/unpaired” and “matched/unmatched” mean?
Most of the data you will get in cell biology is unpaired or unmatched. Individual cells are measured and you have, say, 20 cells in the control group and 18 different cells in the test group. These are unpaired (or unmatched in the case of more than one test group) because the cells are different in each group. If you had the same cell in two (or more) groups, the data would be paired (or matched). An example of a paired dataset would be where you have 10 cells that you treat with a drug. You take a measurement from each of them before treatment and another after. So you have paired measurements: one for cell A before treatment, one after; one for cell B before and after, and so on.
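To see why the pairing matters, here is a small sketch in Python (not Igor) that computes the t statistic both ways on the same numbers. The measurements are invented for illustration, and the function names `welch_t` and `paired_t` are mine. This only computes the statistic, not the p value.

```python
from math import sqrt
from statistics import mean, stdev


def welch_t(a, b):
    """t statistic for two unpaired groups (Welch's form, unequal variances)."""
    se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return (mean(a) - mean(b)) / se


def paired_t(before, after):
    """t statistic for paired data: a one-sample test on the per-cell differences."""
    if len(before) != len(after):
        raise ValueError("paired data need one 'before' and one 'after' value per cell")
    diffs = [y - x for x, y in zip(before, after)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Hypothetical measurements from the same 5 cells, before and after a drug.
before = [10.0, 12.0, 15.0, 20.0, 25.0]
after = [11.0, 13.5, 16.0, 21.5, 26.0]

print("unpaired t:", welch_t(after, before))  # small: cell-to-cell spread swamps the shift
print("paired t:  ", paired_t(before, after))  # large: each cell is its own control
```

Every cell goes up by a similar amount, but the cells differ a lot from each other. Treating the data as unpaired buries the effect in the cell-to-cell variation, whereas the paired analysis only looks at the change within each cell.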
How to do some of these tests in IgorPRO
The examples below assume that you have values in waves called data0, data1, data2, and so on. Substitute your actual wave names.
Is it normally distributed?
The simplest way is to plot the data and see. You can plot your data using Analysis>Histogram… or Analysis>Packages>Percentiles and BoxPlot… Another possibility is to look at the skewness or kurtosis of the dataset (you can do this with WaveStats, see below).
However, if you only have a small number of measurements, or you want to be sure, you can do a test. There are several tests you can do (Kolmogorov-Smirnov, Jarque-Bera, Shapiro-Wilk). The easiest to do and most intuitive (in Igor) is Shapiro-Wilk.
StatsShapiroWilkTest data0
If p < 0.05 then the test rejects normality, so treat the data as not normally distributed. Statistical tests that assume normally distributed data are called parametric, while those that do not make this assumption are nonparametric.
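As a rough illustration of what skewness measures, here is a sketch in Python (not Igor) using the simple population formula. Igor's WaveStats may use a slightly different estimator, so do not expect the numbers to match exactly. The function name `skewness` and the example values are invented.

```python
from statistics import mean


def skewness(values):
    """Population skewness: third central moment divided by the
    second central moment raised to the power 1.5."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5  # fails if all values are identical (m2 is zero)


symmetric = [1, 2, 3, 4, 5]            # balanced around the mean
long_right_tail = [1, 1, 2, 2, 3, 10]  # one large value drags the tail out

print(skewness(symmetric))        # zero for perfectly symmetric data
print(skewness(long_right_tail))  # positive for a long right tail
```

A value near zero is consistent with a symmetric distribution such as the normal. A clearly positive or negative value suggests a tail on one side, which is a hint to plot the data or run the test.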
Describe one group
To get the mean and SD (and lots of other statistics from your data):
WaveStats data0
To get the median and IQR:
StatsQuantiles/ALL data0
The mean and SD are also stored as variables (V_avg, V_sdev). StatsQuantiles calculates V_median, V_Q25, V_Q75, V_IQR, etc. Note that you can just get the median by typing Print StatsMedian(data0) or, in Igor 7, Print median(data0). There is often more than one way to do something in Igor.
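To show why the table pairs mean with SD and median with IQR, here is a sketch in Python (not Igor) that calculates all four for one group. The spot counts are invented and `describe` is just a name chosen for this example. Quartile conventions differ between programs, so the IQR here may not match Igor's V_IQR exactly.

```python
from statistics import mean, median, quantiles, stdev


def describe(values):
    """Return mean, SD, median and IQR for one group of measurements."""
    q25, _, q75 = quantiles(values, n=4)  # the three quartile cut points
    return {
        "mean": mean(values),
        "sd": stdev(values),  # sample SD (n - 1 denominator)
        "median": median(values),
        "iqr": q75 - q25,
    }


# Hypothetical spots-per-cell counts, with one extreme cell.
spots = [1, 2, 3, 4, 5, 6, 7, 8, 90]
summary = describe(spots)
print(summary)
```

The single extreme cell pulls the mean well above the median and inflates the SD, while the median and IQR barely notice it. That is why median and IQR are the safer description for data that are not normally distributed.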
Compare one group to a value
It is unlikely that you will need to do this. In cell biology, most of the time we do not have hypothetical values for comparison, we have experimental values from appropriate controls. If you need to do this:
StatsTTest/CI/T=1 data0
Compare two unpaired groups
Use this for normally distributed data where you have test versus control, with no other groups. For paired data, use the additional flag /PAIR.
StatsTTest/CI/T=1 data0,data1
The nonparametric equivalent is shown next. If n is large, the computation takes a long time; in that case use the additional flag /APRX=2. If the data are paired, use the additional flag /WSRT.
StatsWilcoxonRankTest/T=1/TAIL=4 data0,data1
For binomial data, your waves will have 2 points, where point 0 corresponds to one outcome and point 1 to the other. Note that you can compare to expected values here; for example, a genetic cross experiment can be compared to expected Mendelian frequencies. To do Fisher’s exact test, you need a 2D wave representing a contingency table. McNemar’s test for paired binomial data is not available in Igor.
StatsChiTest/S/T=1 data0,data1
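For the Mendelian example, the arithmetic behind a chi-square comparison of two outcomes against expected frequencies can be sketched in Python (not Igor) as follows. The counts are invented and the function name is mine. No continuity correction is applied, so the result may differ from what Igor reports with its own options.

```python
from math import erfc, sqrt


def chi_square_two_outcomes(observed, expected_ratio):
    """Chi-square goodness-of-fit for two outcomes (1 degree of freedom).

    observed: counts for the two outcomes, e.g. [290, 110]
    expected_ratio: expected proportions as a ratio, e.g. [3, 1]
    """
    total = sum(observed)
    expected = [total * r / sum(expected_ratio) for r in expected_ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p = erfc(sqrt(chi2 / 2))
    return chi2, p


# Hypothetical cross: 290 dominant and 110 recessive offspring, expected 3:1.
chi2, p = chi_square_two_outcomes([290, 110], [3, 1])
print("chi-square =", chi2, "p =", p)
```

Here the expected counts are 300 and 100, and the observed counts are close enough to them that p is well above 0.05, so there is no evidence against the 3:1 ratio.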
If you have more than two groups, do not do multiple versions of these tests; use the correct method from the table.
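The reason is that every extra test is another chance of a false positive. A quick sketch in Python (not Igor) of how this adds up, assuming for simplicity that the pairwise tests are independent (they are not quite, so treat the numbers as a guide). The function name `familywise_error` is invented for this example.

```python
from math import comb


def familywise_error(n_groups, alpha=0.05):
    """Chance of at least one false positive when every pair of groups
    is tested separately at the given alpha (tests assumed independent)."""
    n_tests = comb(n_groups, 2)  # number of distinct pairs of groups
    return 1 - (1 - alpha) ** n_tests


for k in (2, 3, 4, 5):
    print(k, "groups:", comb(k, 2), "tests, error rate", round(familywise_error(k), 3))
```

With four groups there are six pairwise comparisons, and the chance of at least one spurious "significant" result is roughly 26% rather than 5%.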
Compare three or more unmatched groups
For normally-distributed data, you need to do a one-way ANOVA followed by a post-hoc test. The ANOVA will tell you whether there are any differences among the groups; if there are, a post-hoc test will tell you which groups differ. There are several tests available, e.g. Dunnett’s is useful where you have one control value and a bunch of test conditions. We tend to use Tukey’s post-hoc comparison (the /NK flag also does the Newman-Keuls test).
StatsAnova1Test/T=1/Q/W/BF data0,data1,data2,data3
StatsTukeyTest/T=1/Q/NK data0,data1,data2,data3
The nonparametric equivalent is Kruskal-Wallis followed by a multiple comparison test. The Dunn-Holland-Wolfe method is used.
StatsKWTest/T=1/Q data0,data1,data2,data3
StatsNPMCTest/T=1/DHW/Q data0,data1,data2,data3
Compare three or more matched groups
It’s unlikely that this kind of data will be obtained in a typical cell biology experiment.
StatsANOVA2RMTest/T=1 data0,data1,data2,data3
For the nonparametric and binomial equivalents, there are also the operations StatsFriedmanTest and StatsCochranTest.
Correlation
This is a straightforward command that takes two waves or one 2D wave. The waves (or columns) must be of the same length.
Print StatsCorrelation(data0, data1)
At this point, you probably want to plot out the data and use Igor’s fitting functions. The best way to get started is with the example experiment, or just display your data and Analysis>Curve Fitting…
Hazard and survival data
In the lab we have, in the past, done survival/hazard analysis. This is a bit more complex and we used SPSS and would do so again as Igor does not provide these functions.
Notes for use
The good news is that all of this is a lot more intuitive in Igor 7! There is a new Menu item called Statistics, where most of these functions have a dialog with more information. In Igor 6.3 you are stuck with the command line. Igor 7 will be out soon (July 2016).
 Note that there are further options to most of these commands. If you need to see them:
 check the manual or Igor Help
 or type DisplayHelpTopic "StatsMedian" in the Command Window (put whatever command you want help with between the quotes).
 Extra options are specified by “flags”. These are things like “/Q” that come after the command. For example, /Q means “quiet”, i.e. don’t print the output into the history window.
 You should always either print the results to the history or put them into a table so that we can check them. Note that the table gets overwritten if you do the same test with different data, so printing in this case is a good idea.
 The defaults in Igor are set up OK for our needs. For example, Igor does two-tailed comparison, alpha = 0.05, Welch’s correction, etc.
 Most operations can handle waves of different length (or have flags set to handle this case).
 If you are used to doing statistical tests in Excel, you might be wondering about tails and equal variances. The flags are set in the examples to do two-tailed analysis, and unequal variances are handled by Welch’s correction.
 There’s a school of thought that says it is safest to always use nonparametric tests. However, these tests are less powerful, so it is best to use parametric tests (t-test, ANOVA) when the data allow.
—
Part of a series on the future of cell biology in quantitative terms.