## Timestretched: audio stretching on the command line

I was recently reminded of the wonders of paulstretch by a 8-fold slowed down version of Pyramid Song by Radiohead.

Paulstretch is an audio manipulation widget that can stretch or compress the time of an audio recording. Note that it doesn’t “slow down” or “speed up” a recording, it resamples the audio and recasts it over a different time scale while maintaining the pitch. There’s lots of examples on the web of how paulstretch can stretch a song, but fewer examples of the other way around. I wondered what time compression would sound like for a slow song.

There’s a plugin for Audacity, which allows stretching but does not allow compressing. There is a python version available to run paulstretch from the command line, and another user has added the ability to process lossless audio (in FLAC format). My fork is here (with a very minor change) for permanence. These scripts all allow time compression as well as time stretching.

I compressed Ebony Tears by Cathedral. A doom metal tune from 1991 which is at ~56 bpm. Two-fold compression (-s 0.5 with 0.25 sampling) recasts the song as a 112 bpm heavy metal tune.

For a more well known example of a slow song, I went for “Last Night I Dreamt That Somebody Loved Me” by The Smiths. Compression works OK with this song. Without sounding like a total philistine, the intro with the animal noises and Johnny Marr plonking away on the piano becomes mercifully shortened. Then the main song (0:58) turns into a more jolly reel at 0.5 compression.

…and a somehow more urgent moody tune (at 1:28) with 0.75. Disclaimer: I love the original version of this song, at the correct pace. I am not saying this is an improvement in any way.

Compressing songs was fun, but somehow not as fascinating as stretching. The trick is to pick songs with minimal percussion and vocals for the stretched version to sound like something other than prolonged noise. Here is a four-fold stretch of Into The Groove by Madonna. The vocals are OK but those gated 80s drums sound awful smudged over a longer time window.

And here are two-fold and four-fold versions of Joe Satriani’s instrumental The Forgotten (Part One). The original is an agitated guitar workout. It is transformed into an ambient soundscape with stretching.

The command line versions of paulstretch are easy to use and fun to experiment with. Feel free to comment with suggestions for good contenders for stretching or compressing.

The post title comes from “Timestretched” by The Divine Comedy from their Regeneration LP.

## Garmonbozia: Using R to look at Garmin CSV data

Garmin Connect has a number of plots built in, but to take a deeper dive into all your fitness data, you need to export a CSV and fire up R. This post is a quick guide to some possibilities for running data.

There’s a few things that I wanted to look at. For example, how does my speed change through the year? How does that compare to previous years? If I see some trends, is that the same for short runs and long runs? I wanted to look at the cumulative distance I’d run each year… There’s a lot of things that would be good to analyse.

Garmin Connect has a simple way to export data as a CSV. There are other ways to get your data, but the web interface is pretty straightforward. To export a CSV of your data, head to the Garmin Connect website, login and select Activities, All Activities. On this page, filter the activities for whatever you want to export. I clicked Running (you can filter some more if you want), and then scrolled down letting the data load onto the page until I went back as far as I wanted. In the top right corner, you click Export CSV and you will download whatever is displayed on the page.

The code to generate these plots, together with some fake data to play with can be found here.

Now in R, load in the CSV file

require(ggplot2)
require(dplyr)
require(hms)
file_name <- file.choose()
df1 <- read.csv(file_name, header = TRUE, stringsAsFactors = FALSE)


We have a data frame, but we need to rejig the Dates and a few other columns before we can start making plots.

# format Date column to POSIXct
df1$Date <- as.POSIXct(strptime(df1$Date, format = "%Y-%m-%d %H:%M:%S"))
# format Avg.Pace to POSIXct
df1$Avg.Pace <- as.POSIXct(strptime(df1$Avg.Pace, format = "%M:%S"))
# make groups of different distances using ifelse
df1$Type <- ifelse(df1$Distance < 5, "< 5 km", ifelse(df1$Distance < 8, "5-8 km", ifelse(df1$Distance < 15, "8-15 km", ">15 km")))
# make factors for these so that they're in the right order when we make the plot
df1$Type_f = factor(df1$Type, levels=c("< 5 km","5-8 km","8-15 km", ">15 km"))


Now we can make the first plot. The code for the first one is below, with all the code for the other plots shown below that.

# plot out average pace over time
p1 <- ggplot( data = df1, aes(x = Date,y = Avg.Pace, color = Distance)) +
geom_point() +
scale_y_datetime(date_labels = "%M:%S") +
geom_smooth(color = "orange") +
labs(x = "Date", y = "Average Pace (min/km)")


The remainder of the code for the other plots is shown below. The code is commented. For some of the plots, a bit of extra work on the data frame is required.

# plot out same data grouped by distance
p2 <- ggplot( data = df1, aes(x = Date,y = Avg.Pace, group = Type_f, color = Type_f)) +
geom_point() +
scale_y_datetime(date_labels = "%M:%S") +
geom_smooth() +
labs(x = "Date", y = "Average Pace (min/km)", colour = NULL) +
facet_grid(~Type_f)
# now look at stride length. first remove zeros
df1[df1 == 0] <- NA
# now find earliest valid date
date_v <- df1$Date # change dates to NA where there is no avg stride data date_v <- as.Date.POSIXct(ifelse(df1$Avg.Stride.Length > 0, df1$Date, NA)) # find min and max for x-axis earliest_date <- min(date_v, na.rm = TRUE) latest_date <- max(date_v, na.rm = TRUE) # make the plot p3 <- ggplot(data = df1, aes(x = Date,y = Avg.Stride.Length, group = Type_f, color = Type_f)) + geom_point() + ylim(0, NA) + xlim(as.POSIXct(earliest_date), as.POSIXct(latest_date)) + geom_smooth() + labs(x = "Date", y = "Average stride length (m)", colour = NULL) + facet_grid(~Type_f) df1$Avg.HR <- as.numeric(as.character(df1$Avg.HR)) p4 <- ggplot(data = df1, aes(x = Date,y = Avg.HR, group = Type_f, color = Type_f)) + geom_point() + ylim(0, NA) + xlim(as.POSIXct(earliest_date), as.POSIXct(latest_date)) + geom_smooth() + labs(x = "Date", y = "Average heart rate (bpm)", colour = NULL) + facet_grid(~Type_f) # plot out average pace per distance coloured by year p5 <- ggplot( data = df1, aes(x = Distance,y = Avg.Pace, color = Date)) + geom_point() + scale_y_datetime(date_labels = "%M:%S") + geom_smooth(color = "orange") + labs(x = "Distance (km)", y = "Average Pace (min/km)") # make a date factor for year to group the plots df1$Year <- format(as.Date(df1$Date, format="%d/%m/%Y"),"%Y") p6 <- ggplot( data = df1, aes(x = Distance,y = Avg.Pace, group = Year, color = Year)) + geom_point() + scale_y_datetime(date_labels = "%M:%S") + geom_smooth() + labs(x = "Distance", y = "Average Pace (min/km)") + facet_grid(~Year) # Cumulative sum over years df1 <- df1[order(as.Date(df1$Date)),]
df1 <- df1 %>% group_by(Year) %>% mutate(cumsum = cumsum(Distance))
p7 <- ggplot( data = df1, aes(x = Date,y = cumsum, group = Year, color = Year)) +
geom_line() +
labs(x = "Date", y = "Cumulative distance (km)")
# Plot these cumulative sums overlaid
# Find New Year's Day for each and then work out how many days have elapsed since
df1$nyd <- paste(df1$Year,"-01-01",sep = "")
df1$Days <- as.Date(df1$Date, format="%Y-%m-%d") - as.Date(as.character(df1$nyd), format="%Y-%m-%d") # Make the plot p8 <- ggplot( data = df1, aes(x = Days,y = cumsum, group = Year, color = Year)) + geom_line() + scale_x_continuous() + labs(x = "Days", y = "Cumulative distance (km)")  Finally, we can save all of the plots using ggsave. # save all plots ggsave("allPace.png", plot = p1, width = 8, height = 4, dpi = "print") ggsave("paceByDist.png", plot = p2, width = 8, height = 4, dpi = "print") ggsave("strideByDist.png", plot = p3, width = 8, height = 4, dpi = "print") ggsave("HRByDist.png", plot = p4, width = 8, height = 4, dpi = "print") ggsave("allPaceByDist.png", plot = p5, width = 8, height = 4, dpi = "print") ggsave("paceByDistByYear.png", plot = p6, width = 8, height = 4, dpi = "print") ggsave("cumulativeDistByYear.png", plot = p7, width = 8, height = 4, dpi = "print") ggsave("cumulativeDistOverlay.png", plot = p8, width = 8, height = 4, dpi = "print")  I think the code might fail if you don’t record all of the fields that I do. For example if heart rate data is missing or stride length is not recorded, I’m not sure what the code will do. The aim here is to give an idea of what sorts of plots can be generated just using the summary data in the CSV provided by Garmin. Feel free to make suggestions in the comments below. The post title comes from “Garmonbozia” by Superdrag from the Regretfully Yours album. Apparently Garmonbozia is something eaten by the demons in the Black Lodge in the TV series Twin Peaks. ## All Around The World: Maps and Flags in R Our lab is international. People born all over the world have come to work in my group. I’m proud of this fact, especially in the current political climate. I’ve previously used the GoogleMaps API to display a heat map on our lab webpage. It shows where in the world people in the lab come from. This was OK, but I wanted to get an R based solution to make this graphic to make it easier to automate updates. This post is an explainer and “how to” guide. Code and some example data are here. The idea is to create graphics to describe the origins of a group of people. For my use-case it is my research group, but it could be any group of people. All you need is a list of countries that the people come from. In the example for this post, I took the Top 100 Footballers voted for by Guardian readers in 2016. In my lab dataset, I store the countries of origin in ISO2 format. This means Spain is ES, Germany is DE and so on. I converted the Guardian dataset to ISO2 format using a lookup and then I was ready to put it into the R script. if (!require("rworldmap")) { install.packages("rworldmap") library(rworldmap) } # ggplot2, ggFlags, dplyr are needed for the bar charts library(ggplot2) library(dplyr) if (!require("ggflags")) { devtools::install_github("rensa/ggflags") library(ggflags) } # csv file with each person as a row and containing a column with the header Origin and # countries in 2-letter ISO format (change joinCode for other formats) file_name <- file.choose() df1 <- read.csv(file_name, header = TRUE, stringsAsFactors = FALSE) countries_lab <- as.data.frame(table(df1$Origin))
colnames(countries_lab) <- c("country", "value")
matched <- joinCountryData2Map(countries_lab, joinCode="ISO2", nameJoinColumn="country")


This part of the script sets up the libraries that are needed. We’ll use the rworldmap package first. This post was very useful for guidance. We load in the csv file which contains the countries of origin for the people we want to map out. It doesn’t need anything more than one column with the ISO2 codes. If it does it’s OK. As long as the header for the countries column is called “Origin”, all will be OK.

This column is extracted and a new dataframe is made from it which has each country as a row and the number of instances of that country as a second column. These are labelled “country” and “value” for convenience. Now rworldmap does its thing with the joinCountryData2Map line. Next we make the map!

# make png of the map
png(file = "labWorldMap.png",
width = 1024, height = 768)
par(mai=c(0,0,0.2,0))
mapCountryData(matched,
nameColumnToPlot="value",
mapTitle= "",
catMethod = "logFixedWidth",
colourPalette = "heat",
oceanCol="lightblue",
missingCountryCol="white",
lwd = 1)
dev.off()


This makes a nice map. I’ve hidden the legend which shows what the colours mean. The map can be customised in lots of ways. I liked the way this map looked and my other aim was to make a chart to show the relative numbers of people in each country. Speaking of which…

# make bar chart of lab members
countries_lab %>%
mutate(code = tolower(country)) %>%
ggplot(aes(x = reorder(country, value), y = value)) +
geom_bar(stat = "identity") +
geom_flag(y = -1, aes(country = code), size = 4) +
scale_y_continuous(expand = c(0.1, 1)) +
xlab("Country") +
ylab("Members") +
theme_bw() +
theme(legend.title = element_blank()) +
coord_flip()
ggsave("plot.png", plot = last_plot())


Using the data frame we made previously, we can pipe it to ggplot2 via the wonders of dplyr. I am using geom_flag here from the ggflags library. Note that this is a fork of ggflags which gives circular flags which look great on the graph. The geom_flag needs a lowercase entry for each ISO2 country code so the first step is to mutate the country column to make a new lowercase column called code.

That’s it! With a csv file and a few lines of R code you can generate some nice looking graphics.

The dataset shows that the country that produced the biggest fraction of the world’s best footballers (as voted for by Guardian readers) was Spain. There are no surprises in this dataset. The most prominent European and South American countries giving a strong showing.

The post title is taken from “All Around The World” by The Dead Milkmen. Many songs in my library with this title, but this paranoid extraterrestrial tune gets the pick this time.

## Installing open source PyMol on a Mac

This is a quick “how to” post. There is a licensed version of PyMol (MacPyMol) available, but the open source version can be installed on a Mac free of charge. The official page has a guide, which is not terribly detailed, and I found this excellent guide which is unfortunately out-of-date.

Here is an updated guide to installing PyMol using Homebrew on macOS Mojave 10.14.3

Step 1 is to install Homebrew

/usr/bin/ruby -e "\$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

Next step is to install the PyMol dependencies using Homebrew

brew install Caskroom/cask/xquartzbrew install tcl-tkbrew install python

Now install PyMol.

brew install brewsci/bio/pymol

You can now run PyMol by typing

pymol

The MPIBR-Bioinformatics page has the following guide to make a little executable app to launch PyMol straight from the Desktop.

• Open Automator, which is in Applications in macOS.
• Create new document, select “Application”.
• Select “Actions” and “Library” in the left pane. Select “Utilities” and “Run Shell Script”. Drag this into the main pane.
• Choose “/bin/bash” as a shell.
• Paste the following: /usr/local/bin/pymol -M. If this doesn’t work, check the path to pymol using which pymol in the terminal, and use this instead.
• Save the application (“File > Save”) to the Desktop and name it “PyMol”.

Now you can double-click this app to run PyMol.

## Tips from the blog XI: docx to pdf

A long time ago I posted a little Automator routine to convert Word doc/docx files to PDF. Not long after that, this routine ceased to work due to changes in Microsoft Word (I think). It’s still very useful to convert a whole folder of docx files to PDF in order to avoid Word and just use Preview on the Mac. For committee work or for marking students’ work, I often have a whole folder of docx files and would prefer it if they were in PDF format. I found this very nifty trick on the web and thought I’d post a link here to make up for the fact that my old post no longer works.

The full post is here. What is so nice about this Automator solution is that it uses a bash script to do the conversion. This means you don’t need Microsoft Word for it to work! From what I can see it uses the xml in the docx file (and presumably won’t work on older *.doc files) for the conversion. The post describes how to run it as a Service in macOS. Note, that it destroys the docx files, so it should only be used on a copy. It could be run from the command line rather than right-clink, the engine is this little script.

Thanks to Jacob Salmela for posting it.

This post is part of a series of short tips

## Multiplex: Small multiple artwork from GPX tracks

I’d seen the small multiple artwork of running and cycling routes from Marcus Volz’s R package Strava all over the web. Ads for “posters of your GPS tracks” pop up on Reddit and I’d notice a few #Rstats people put up their posters on Twitter. I’ve had the package bookmarked for a while and this week I finally got round to generating a small multiple poster of some of my cycling routes.

I was pleased with the result and wanted to post it here. But also, running the code was not straightforward as I’ll explain below. If you want to generate your own plot read on.

The idea behind the poster is really nice. You get a kind of generative art-style poster. It looks nice and you can identify individual routes which is fun to do.

The instructions on the GitHub page are absolutely correct and the code should run out-of-the-box. The idea is that you download your Strava data and then make your plots. Unfortunately, it seems that a change in Strava’s data export policy (possibly related to GDPR changes) has broken the package. I found that there are two problems. First, Strava’s “download your data” link gives you a mix of formats (in my case GPX and FIT files), the package only works with GPX. Second, if there is any elevation data missing from a track, the data frame that is needed to make the poster is not built properly.

Going GPX only: In my case, I don’t keep all my data in Strava and instead use a local repository managed with RubiTrack. This software allows me to filter for the tracks I want and export them in GPX format. The only problem is that it generates one huge file with all the tracks enclosed. This gets read by the package as a single track. To fix this, I split the file using awk.

awk '/<trk>/{close(file);n++;}{file="track"n".gpx";print >> file;}' untitled.gpx


I could then discard track1.gpx which just had the xml header and then use the directory of gpx files.

The elevation problem: this affected only some of the tracks, so in the end the R code needed to be modified. The elevation data is not needed to make the posters so the file process_data.R needs editing, line 28 can be commented out and then line 32 should read:

result <- data.frame(lat = lat, lon = lon, time = time, type = type) %<%

This issue is raised on GitHub and has been closed, but the code doesn’t work with elevation blanks. If you run into this problem, this is the way I found to fix it. The other plots in the package which do use elevation will not run, but at least the poster can be made.

I exported the poster as PDF and then made some changes in Illustrator to give the result above.

The post title comes from Multiplex from Oliver’s Standing Stone LP from 1974.

## Adventures in Code VI: debugging and silly mistakes

This deserved a bit of further explanation, due to the stupidity involved.

“Debugging is like being the detective in a crime movie where you are also the murderer.” – Filipe Fortes

My code was giving an unexpected result and I was having a hard time figuring out the problem. The unexpected result was that a resampled set of 2D coordinates were not being rotated randomly. I was fortunate to be able to see this otherwise I would have never found this bug and probably would’ve propagated the error to other code projects.

I narrowed down the cause but ended up having to write some short code to check that it really did cause the error.

I was making a rotation matrix and then using it to rotate a 2D coordinate set by matrix multiplication. The angle was randomised in the loop. What could go wrong? I looked at that this:

theta = pi*enoise(1)
rotMat = {{cos(theta),-sin(theta)},{sin(theta),cos(theta)}}

and thought “two lines – pah – it can be done in one!”. Since the rotation matrix is four numbers [-1,1], I thought “I’ll just pick those numbers at random, I just want a random angle don’t I?”

rotMat = enoise(1)

Why doesn’t an alarm go off when this happens? A flashing sign saying “are you sure about that?”…

My checks showed that a single point at 1,0 after matrix multiplication with this method gives.

When it should give

And it’s so obvious when you’ve seen why. The four numbers in the rotation matrix are, of course, not independent.

I won’t make that mistake again and I’m going to try to think twice when trying to save a line of code like in the future!

Part of a series on computers and coding.

## Frankly, Mr. Shankly

I read about Antonio Sánchez Chinchón’s clever approach to use the Travelling Salesperson algorithm to generate some math-art in R. The follow up was even nicer in my opinion, Pencil Scribbles. The subject was Boris Karloff as the monster in Frankenstein. I was interested in running the code (available here and here), so I thought I’d run it on a famous scientist.

By happy chance one of the most famous scientists of the 20th Century, Rosalind Franklin, shares a nominative prefix with the original subject. There is also a famous portrait of her that I thought would work well.

I first needed needed to clear up the background because it was too dark.

Now to run the TSP code.

The pencil scribbles version is nicer I think.

The R scripts basically ran out-of-the-box. I was using a new computer that didn’t have X11quartz on it nor the packages required, but once that they were installed I just needed to edit the line to use a local file in my working directory. The code just ran. The outputs FrankyTSP and Franky_scribbles didn’t even need to be renamed, given my subject’s name.

Thanks to Antonio for making the code available and so easy to use.

The post title comes from “Frankly, Mr. Shankly” by The Smiths which appears on The Queen is Dead. If the choice of post title needs an explanation, it wasn’t a good choice…

## Paintball’s Coming Home: generating Damien Hirst spot paintings

A few days ago, I read an article about Damien Hirst’s new spot paintings. I’d forgotten how regular the spots were in the original spot paintings from the 1990s (examples are on his page here). It made me think that these paintings could be randomly generated and so I wrote a quick piece of code to do this (HirstGenerator).

I used Hirst’s painting ‘Abalone Acetone Powder’ (1991), which is shown on this page as photographed by Alex Hartley. A wrote some code to sample the colours of this image and then a script to replicate it. The original is shown below  © Damien Hirst and Science Ltd. Click them for full size.

and then this is the replica:

Now that I had a palette of the colours used in the original. It was simple to write a generator to make spot paintings where the spots are randomly assigned.

The generator can make canvasses at whatever size is required.

The code can be repurposed to make spot paintings with different palettes from his other spot paintings or from something else. So there you have it. Generative Hirst Spot Paintings.

For nerds only

My original idea was to generate a palette of unique colours from the original painting. Because of the way I sampled them, each spot is represented once in the palette. This means the same colour as used by the artist is represented as several very similar but nonidentical colours in the palette. My original plan was to find the euclidean distances between all spots in RGB colour space and to establish a distance cutoff to decide what is a unique colour.

That part was easy to write but what value to give for the cutoff was tricky. After some reading, it seems that other colour spaces are better suited for this task, e.g. converting RGB to a CIE colour space. For two reasons, I didn’t pursue this. First, quantixed coding is time-limited. Second. assuming that there is something to the composition of these spot paintings (and they are not a con trick) the frequency of spots must have artistic merit and so they should be left in the palette for sampling in the generated pictures. The representation of the palette in RGB colour space had an interesting pattern (shown in the GIF above).

The post title comes from “Paintball’s Coming Home” by Half Man Half Biscuit from Voyage To The Bottom Of The Road. Spot paintings are kind of paintballs, but mostly because I love the title of this song.

## Esoteric Circle

Many projects in the lab involve quantifying circular objects. Microtubules, vesicles and so on are approximately circular in cross section. This quick post is about how to find the diameter of these objects using a computer.

So how do you measure the diameter of an object that is approximately circular? Well, if it was circular you would measure the distance from one edge to the other, crossing the centre of the object. It doesn’t matter along which axis you do this. However, since these objects are only approximately circular, it matters along which axis you measure. There are a couple of approaches that can be used to solve this problem.

Principal component analysis

The object is a collection of points* and we can find the eigenvectors and eigenvalues of these points using principal component analysis. This was discussed previously here. The 1st eigenvector points along the direction of greatest variance and the 2nd eigenvector is normal to the first. The order of eigenvectors is determined by their eigenvalues. We use these to rotate the coordinate set and offset to the origin.

Now the major axis of the object is aligned to the x-axis at y=0 and the minor axis is aligned with the y-axis at x=0 (compare the plot on the right with the one on the left, where the profiles are in their original orientation – offset to zero). We can then find the absolute values of the axis crossing points and when added together these represent the major axis and minor axis of the object. In Igor, this is done using a oneliner to retrieve a rotated set of coords as the wave M_R.

PCA/ALL/SEVC/SRMT/SCMT xCoord,yCoord

To find the crossing points, I use Igor’s interpolation-based level crossing functions. For example, storing the aggregated diameter in a variable called len.

FindLevel/Q/EDGE=1/P m1c0, 0
len = abs(m1c1(V_LevelX))
FindLevel/Q/EDGE=2/P m1c0, 0
len += abs(m1c1(V_LevelX))

This is just to find one axis (where m1c0 and m1c1 are the 1st and 2nd columns of a 2-column wave m1) and so you can see it is a bit cumbersome.

Anyway, I was quite happy with this solution. It is unbiased and also tells us how approximately circular the object is (because the major and minor axes tell us the aspect ratio or ellipticity of the object). I used it in Figure 2 of this paper to show the sizes of the coated vesicles. However, in another project we wanted to state what the diameter of a vesicle was. Not two numbers, just one. How do we do that? We could take the average of the major and minor axes, but maybe there’s an easier way.

Polar coordinates

The distance from the centre to every point on the edge of the object can be found easily by converting the xy coordinates to polar coordinates. To do this, we first find the centre of the object. This is the centroid $$(\bar{x},\bar{y})$$ represented by

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_{i}$$ and $$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_{i}$$

for n points and subtract this centroid from all points to arrange the object around the origin. Now, since the xy coords are represented in polar system by

$$x_{i} = r_{i}\cos(\phi)$$ and $$y_{i} = r_{i}\sin(\phi)$$

we can find r, the radial distance, using

$$r_{i} = \sqrt{x_{i}^{2} + y_{i}^{2}}$$

With those values we can then find the average radial distance and report that.

There’s something slightly circular (pardon the pun) about this method because all we are doing is minimising the distance to a central point initially and then measuring the average distance to this minimised point in the latter step. It is much faster than the PCA approach and would be insensitive to changes in point density around the object. The two methods would probably diverge for noisy images. Again in Igor this is simple:

Make/O/N=(dimsize(m1,0)-1)/FREE rW

rW[] = sqrt(m1[p][0]^2 + m1[p][1]^2)

len = 2 * mean(rW)
Here again, m1 is the 2-column wave of coords and the diameter of the object is stored in len.

How does this compare with the method above? The answer is kind of obvious, but it is equidistant between the major and minor axes. Major axis is shown in red and minor axis shown in blue compared with the mean radial distance method (plotted on the y-axis). In places there is nearly a 10 nm difference which is considerable for objects which are between 20 and 35 nm in diameter. How close is it to the average of the major and minor axis? Those points are in black and they are very close but not exactly on y=x.

So for simple, approximately circular objects with low noise, the ridiculously simple polar method gives us a single estimate of the diameter of the object and this is much faster than the more complex methods above. For more complicated shapes and noisy images, my feeling is that the PCA approach would be more robust. The two methods actually tell us two subtly different things about the shapes.

Why don’t you just measure them by hand?

In case there is anyone out there wondering why a computer is used for this rather than a human wielding the line tool in ImageJ… there are two good reasons.

1. There are just too many! Each image has tens of profiles and we have hundreds of images from several experiments.
2. How would you measure the profile manually? This approach shows two unbiased methods that don’t rely on a human to draw any line across the object.

* = I am assuming that the point set is already created.

The post title is taken from “Esoteric Circle” by Jan Garbarek from the LP of the same name released in 1969. The title fits well since this post is definitely esoteric. But maybe someone out there is interested!