Line Up: Waffle plots for colocalisation data

Quantifying the degree of colocalisation of two signals in microscopy images is very tricky. Lots has been written on this topic, including in my book The Digital Cell. The focus of this post is on visualising colocalisation.

One way to look at colocalisation is to think about two sets of objects and how many of each set overlap. This is sometimes referred to as co-occupancy or object-based colocalisation. Determining the objects (e.g. discrete spots) is typically done using thresholding or another detection method, and then a rule for defining overlap is used to determine the number of co-occupied spots (e.g. a maximum centre-to-centre distance of 200 nm).
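To make the overlap rule concrete, here is a minimal Python sketch of a centre-to-centre distance test. It is not the method behind the plots in this post: the function name `match_spots`, the greedy closest-first pairing and the spot coordinates are all assumptions for illustration.

```python
import math


def match_spots(ch1, ch2, max_dist):
    """Greedy one-to-one matching of spots no further apart than max_dist.

    ch1, ch2: lists of (x, y) spot centres, in the same units as max_dist.
    Returns a list of (i, j) index pairs, one per co-occupied spot.
    """
    # Collect every candidate pair within the distance threshold.
    candidates = []
    for i, (x1, y1) in enumerate(ch1):
        for j, (x2, y2) in enumerate(ch2):
            d = math.hypot(x1 - x2, y1 - y2)
            if d <= max_dist:
                candidates.append((d, i, j))

    # Accept the closest pairs first; each spot is used at most once.
    candidates.sort()
    used1, used2, pairs = set(), set(), []
    for d, i, j in candidates:
        if i not in used1 and j not in used2:
            used1.add(i)
            used2.add(j)
            pairs.append((i, j))
    return pairs


# Made-up spot centres in micrometres, for illustration only.
ch1 = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
ch2 = [(0.1, 0.0), (1.0, 1.15), (9.0, 9.0), (3.0, 3.0)]

pairs = match_spots(ch1, ch2, max_dist=0.2)  # 0.2 um = 200 nm
n_overlap = len(pairs)
print(n_overlap, "co-occupied spots:", pairs)
```

Pairing spots one-to-one means a single overlap count applies to both channels. Other rules are possible (for example, counting any channel 1 spot with a neighbour in channel 2), and they can give a different count for each channel.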

This gives us the number of spots in each channel (we’ll only consider two channels here) and the number that overlap, which also reveals how many spots in each channel are not overlapping.
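The bookkeeping is simple arithmetic. As a small Python sketch, assuming overlapping spots are paired one-to-one so that one overlap count applies to both channels (the function name and the counts are invented for illustration):

```python
def summarise_overlap(n_ch1, n_ch2, n_overlap):
    """Split two spot counts into channel-1-only, channel-2-only and overlap.

    Assumes one-to-one overlap, so n_overlap cannot exceed either count.
    """
    if n_overlap > min(n_ch1, n_ch2):
        raise ValueError("overlap cannot exceed the number of spots in a channel")
    return {
        "ch1_only": n_ch1 - n_overlap,
        "ch2_only": n_ch2 - n_overlap,
        "overlap": n_overlap,
        # Fraction of each channel's spots that are co-occupied.
        "frac_ch1_overlapping": n_overlap / n_ch1 if n_ch1 else float("nan"),
        "frac_ch2_overlapping": n_overlap / n_ch2 if n_ch2 else float("nan"),
    }


# Invented counts: 40 spots in channel 1, 25 in channel 2, 10 overlapping.
summary = summarise_overlap(n_ch1=40, n_ch2=25, n_overlap=10)
print(summary)
```

Note that the two fractions differ even though the overlap count is the same, which is why both directions are worth reporting.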

So how can we visualise this data?

  • Venn diagram – defines sets of interest: channel 1 only, channel 2 only, overlap.
  • Euler diagram – similar to Venn diagram, but where the area for each subset matches the size of the subset.
  • Series of plots to show: number of spots in each channel, fraction of channel 1 spots that overlap with channel 2 and vice versa.

The plots work well because we can see the variability between cells/experiments across the datasets. The Venn and Euler diagrams are OK to summarise the overall degree of overlap but they have two drawbacks.

  • Area of arbitrary shapes is hard to judge.
  • Hard to assess changes in the number of spots from one condition to another.

Is there an alternative?

Waffle plot

Waffle plots are used for data visualisation in the media but are rarely seen in scientific papers. They can replace Venn and Euler diagrams to summarise colocalisation data.

In R, there are a few implementations, notably ggwaffle, which works with ggplot. I have written some simple code to generate waffle plots in Igor Pro.

This is an example of three waffle plots to show colocalisation data in three different conditions. Green is channel 1 only, red is channel 2 only, and yellow is the number of spots that overlap.

In the example above, we have partial overlap between two sets of spots. In the middle and right conditions we have fewer green spots, and in the middle condition there is also proportionally less overlap between the two signals.

These changes are hard to visualise from a series of box plots or by trying to understand the areas in three Euler plots.

The number of spots can have a real meaning too. Here they are the actual number of spots in a 100 micron squared region of the cell.
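For anyone who does not use Igor Pro, the idea can be sketched in Python with matplotlib. This is not a port of the WafflePlots code: the function names, the grid width, the fill order (channel 1 only, then overlap, then channel 2 only) and the spot counts are all assumptions for illustration. One symbol is drawn per spot, and every panel uses the same grid limits so that a change in spot number is visible as a change in the size of the waffle.

```python
def waffle_cells(ch1_only, overlap, ch2_only, n_cols=10):
    """Return one (col, row, category) tuple per spot, filled row by row."""
    categories = (["ch1_only"] * ch1_only
                  + ["overlap"] * overlap
                  + ["ch2_only"] * ch2_only)
    return [(i % n_cols, i // n_cols, cat) for i, cat in enumerate(categories)]


# Colours as described in the post: green, yellow and red.
COLOURS = {"ch1_only": "green", "overlap": "yellow", "ch2_only": "red"}


def save_waffles(conditions, filename, n_cols=10, colours=COLOURS, marker="o"):
    """Draw one waffle per condition, side by side, and save to filename.

    conditions maps a label to (ch1_only, overlap, ch2_only) spot counts.
    """
    # Imported here so the layout function above has no dependencies.
    import matplotlib
    matplotlib.use("Agg")  # render to file, no display needed
    import matplotlib.pyplot as plt

    # Same number of rows in every panel keeps the waffles comparable.
    max_rows = max(-(-sum(counts) // n_cols) for counts in conditions.values())

    fig, axes = plt.subplots(1, len(conditions), squeeze=False,
                             figsize=(3 * len(conditions), 3))
    for ax, (label, counts) in zip(axes[0], conditions.items()):
        cells = waffle_cells(*counts, n_cols=n_cols)
        ax.scatter([c for c, _, _ in cells],
                   [r for _, r, _ in cells],
                   c=[colours[cat] for _, _, cat in cells],
                   marker=marker, s=60)
        ax.set_xlim(-1, n_cols)
        ax.set_ylim(max_rows, -1)  # reversed limits put the first row on top
        ax.set_aspect("equal")
        ax.set_title(label)
        ax.axis("off")
    fig.savefig(filename, dpi=150)
    plt.close(fig)


# Invented counts for three conditions: (ch1 only, overlap, ch2 only).
example = {
    "Condition A": (30, 20, 15),
    "Condition B": (18, 5, 16),
    "Condition C": (15, 12, 14),
}
# save_waffles(example, "waffles.png")  # uncomment to write the figure
```

Passing `marker="s"` gives the more traditional square waffle, and any matplotlib marker can be used in place of the dots.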

The colour blind issue

The colours above are not much use to colour blind people. So there are some alternatives built in, or the user can pick whatever colours they like!
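As an illustration of what an alternative colour scheme might look like, here is a small Python sketch using colours from the Okabe-Ito palette, which was designed to be distinguishable under common forms of colour blindness. The palette names and the assignment of colours to categories are my assumptions here, not the schemes built into the WafflePlots code.

```python
# Each palette maps a spot category to a hex colour.
PALETTES = {
    # Green / yellow / red, as in the example waffle plots.
    "classic": {"ch1_only": "#00AA00", "overlap": "#FFD700", "ch2_only": "#DD0000"},
    # Okabe-Ito blue and orange, with black for the overlapping spots.
    "okabe_ito": {"ch1_only": "#0072B2", "overlap": "#000000", "ch2_only": "#E69F00"},
}


def pick_colours(name="okabe_ito", **overrides):
    """Return a category-to-colour mapping, optionally overriding entries."""
    colours = dict(PALETTES[name])  # copy so the defaults are not modified
    unknown = set(overrides) - set(colours)
    if unknown:
        raise KeyError(f"unknown categories: {sorted(unknown)}")
    colours.update(overrides)
    return colours


# Use the colour-blind-friendly palette but choose a custom overlap colour.
colours = pick_colours("okabe_ito", overlap="#CC79A7")
print(colours)
```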

The dots work well for data visualisation when we want to talk about small puncta in cells. But other symbols can be used instead.

I’m wondering: is cell biology ready for the waffle plot?

Please go ahead and use/remix the WafflePlots code or feel free to make a suggestion.

The post title is taken from “Line Up” by Elastica from their debut LP “Elastica”.