Here is a fun post about using colour palettes in R. It starts with a computer game… After a few years of sporadically playing Super Mario World 2 – Yoshi’s Island on the Retropie, I made it to the final level. In the background, as Bowser approached, I noticed that those coloured bars in the […]
Tag: dataviz
Say It Ain’t So: using Weezer album cover colours in R
I’m a long-term fan of Weezer. Such was the brilliance of their first two albums that I have stuck with them through thick and thin. And dear me, there has been some very thin music. Nonetheless I own every album – thirteen of them. Among them are six albums entitled “Weezer”. These records are colloquially […]
All Around The World: Maps and Flags in R
Our lab is international. People born all over the world have come to work in my group. I’m proud of this fact, especially in the current political climate. I’ve previously used the Google Maps API to display a heat map on our lab webpage. It shows where in the world people in the lab come from. […]
The Sound of Clouds: wordcloud of tweets using R
Another post using R and looking at Twitter data. As I was typing out a tweet, I had the feeling that my vocabulary is a bit limited. Papers I tweet about are either “great”, “awesome” or “interesting”. I wondered what my most frequently tweeted words are. Like the last post you can (probably) do what […]
Elevation: accuracy of a Garmin Edge 800 GPS device
I use a Garmin 800 GPS device to log my cycling activity, including my commutes. Since I have now built up nearly 4 years of cycling the same route, I had a good dataset to look at how accurate the device is. I wrote some code to import all of the rides tagged with commute […]
Colours Running Out: Analysis of 2016 running
Towards the end of 2015, I started distance running. I thought it’d be fun to look at the frequency of my runs over the course of 2016. Most of my runs were recorded with a GPS watch. I log my cycling data using Rubitrack, so I just added my running data to this. This software is great but […]
Parallel Lines: Spatial statistics of microtubules in 3D
Our recent paper on “the mesh” in kinetochore fibres (K-fibres) of the mitotic spindle was our first adventure in 3D electron microscopy. This post is about some of the new data analysis challenges that were thrown up by this study. I promised a more technical post about this paper and here it is, better late […]
My Blank Pages III: The Art of Data Science
I recently finished reading The Art of Data Science by Roger Peng & Elizabeth Matsui. Roger, together with Jeff Leek, writes the Simply Statistics blog and he works at JHU with Elizabeth. The aim of the book is to give a guide to data analysis. It is not meant as a comprehensive data analysis “how to”, […]
The Great Curve: Citation distributions
This post follows on from a previous post on citation distributions and the wrongness of Impact Factor. Stephen Curry had previously made the call that journals should “show us the data” that underlie the much-maligned Journal Impact Factor (JIF). However, this call made me wonder what “showing us the data” would look like and how journals might […]
Waiting to happen II: Publication lag times
Following on from the last post about publication lag times at cell biology journals, I went ahead and crunched the numbers for all journals in PubMed for one year (2013). Before we dive into the numbers, a couple of points about this kind of information. Some journals “reset the clock” on the received date with manuscripts […]