I’m not following you II: Twitter data and R

My activity on twitter revolves around four accounts.

I try to segregate what happens on each account, but there’s inevitably some overlap. But what about overlap in followers?

What lucky people are following all four? How many only see the individual accounts?

It’s quite easy to look at this in R.

So there are 36 lucky people (or bots!) following all four accounts. I was interested in the followers of the quantixed account since it seemed to me that it attracts people from a slightly different sphere. It looks like about one-third of quantixed followers only follow quantixed, about one-third follow clathrin also and more or less the remainder are “all in” following three accounts or all four. CMCB followers are split about the same. The lab account is a bit different, with close to one-half of the followers also following clathrin.

Extra nerd points:

This is a Venn diagram and not an Euler plot. Venn just shows schematically the intersections and does not attempt to encode information in the area of each part. Euler plots for more than three groups are hard to generate and it is hard to make any sense of what is shown. It is a dataviz problem to look at the proportions of lots of groups. A solution here would be to generate a further four Venn diagrams. On each, display the proportion for one category as a fraction or percentage.

How to do it:

Last time, I described how to set up rtweet and make a Twitter app for use in R. You can use this to pull down lists of followers and extract their data. Using the intersect function you can work out the numbers of followers at each intersection. For four accounts, there will be 1 group of four, 4 groups of three, 6 groups of two. The VennDiagram package just needs the total numbers for all four groups and then details of the intersections, i.e. you don’t need to work out the groups minus their intersections – it does this for you.

library(rtweet)
library(httpuv)
library(VennDiagram)
## whatever name you assigned to your created app
appname <- "whatever_name"
## api key (example below is not a real key)
key <- "blah614h"
## api secret (example below is not a real key)
secret <- "blah614h"
## create token named "twitter_token"
twitter_token <- create_token(
  app = appname,
  consumer_key = key,
  consumer_secret = secret)
clathrin_followers <- get_followers("clathrin", n = "all")
clathrin_followers_names <- lookup_users(clathrin_followers)
quantixed_followers <- get_followers("quantixed", n = "all")
quantixed_followers_names <- lookup_users(quantixed_followers)
cmcb_followers <- get_followers("Warwick_CMCB", n = "all")
cmcb_followers_names <- lookup_users(cmcb_followers)
roylelab_followers <- get_followers("roylelab", n = "all")
roylelab_followers_names <- lookup_users(roylelab_followers)
# a = clathrin
# b = quantixed
# c = cmcb
# d = roylelab
## now work out intersections
anb <- intersect(clathrin_followers_names$user_id,quantixed_followers_names$user_id)
anc <- intersect(clathrin_followers_names$user_id,cmcb_followers_names$user_id)
and <- intersect(clathrin_followers_names$user_id,roylelab_followers_names$user_id)
bnc <- intersect(quantixed_followers_names$user_id,cmcb_followers_names$user_id)
bnd <- intersect(quantixed_followers_names$user_id,roylelab_followers_names$user_id)
cnd <- intersect(cmcb_followers_names$user_id,roylelab_followers_names$user_id)
anbnc <- intersect(anb,cmcb_followers_names$user_id)
anbnd <- intersect(anb,roylelab_followers_names$user_id)
ancnd <- intersect(anc,roylelab_followers_names$user_id)
bncnd <- intersect(bnc,roylelab_followers_names$user_id)
anbncnd <- intersect(anbnc,roylelab_followers_names$user_id)
## four-set Venn diagram
venn.plot <- draw.quad.venn(
  area1 = nrow(clathrin_followers_names),
  area2 = nrow(quantixed_followers_names),
  area3 = nrow(cmcb_followers_names),
  area4 = nrow(roylelab_followers_names),
  n12 = length(anb),
  n13 = length(anc),
  n14 = length(and),
  n23 = length(bnc),
  n24 = length(bnd),
  n34 = length(cnd),
  n123 = length(anbnc),
  n124 = length(anbnd),
  n134 = length(ancnd),
  n234 = length(bncnd),
  n1234 = length(anbncnd),
  category = c("Clathrin", "quantixed", "CMCB", "RoyleLab"),
  fill = c("dodgerblue1", "red", "goldenrod1", "green"),
  lty = "dashed",
  cex = 2,
  cat.cex = 1.5,
  cat.col = c("dodgerblue1", "red", "goldenrod1", "green"),
  fontfamily = "Helvetica",
  cat.fontfamily = "Helvetica"
);
# write to file
png(filename = "Quad_Venn_diagram.png");
grid.draw(venn.plot);
dev.off()
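
The counting of intersections can also be sketched in a few lines. This is only an illustration in Python rather than R, using made-up follower IDs and a hypothetical helper called all_intersections; it shows where the 6 pairs, 4 triples and 1 group of four come from.

```python
from itertools import combinations

# Toy follower IDs for four hypothetical accounts (not real Twitter data)
followers = {
    "a": {1, 2, 3, 4, 5},
    "b": {3, 4, 5, 6},
    "c": {4, 5, 7},
    "d": {5, 8},
}

def all_intersections(sets):
    """Map each combination of two or more set names to their shared members."""
    names = sorted(sets)
    result = {}
    for k in range(2, len(names) + 1):
        for combo in combinations(names, k):
            result[combo] = set.intersection(*(sets[name] for name in combo))
    return result

overlaps = all_intersections(followers)
for combo, members in overlaps.items():
    # Label each overlap in the same style as the R variables (anb, anbnc, ...)
    print("n".join(combo), len(members))
```

The lengths printed here play the same role as the n12 ... n1234 arguments passed to draw.quad.venn above.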

I’ll probably return to rtweet in future and will recycle the title if I do.

Like last time, the post title is from “I’m Not Following You”, the final track from the 1997 LP of the same name by Edwyn Collins.

Adventures in Code VI: debugging and silly mistakes

This deserved a bit of further explanation, due to the stupidity involved.

“Debugging is like being the detective in a crime movie where you are also the murderer.” – Filipe Fortes

My code was giving an unexpected result and I was having a hard time figuring out the problem. The unexpected result was that a resampled set of 2D coordinates was not being rotated randomly. I was fortunate to be able to see this, otherwise I would never have found this bug and probably would’ve propagated the error to other code projects.

I narrowed down the cause but ended up having to write some short code to check that it really did cause the error.

I was making a rotation matrix and then using it to rotate a 2D coordinate set by matrix multiplication. The angle was randomised in the loop. What could go wrong? I looked at this:

theta = pi*enoise(1)
rotMat = {{cos(theta),-sin(theta)},{sin(theta),cos(theta)}}

and thought “two lines – pah – it can be done in one!”. Since the rotation matrix is four numbers in the range [-1,1], I thought “I’ll just pick those numbers at random, I just want a random angle, don’t I?”

rotMat = enoise(1)

Why doesn’t an alarm go off when this happens? A flashing sign saying “are you sure about that?”…

My checks showed that a single point at 1,0 after matrix multiplication with this method gives.

When it should give

And it’s so obvious when you’ve seen why. The four numbers in the rotation matrix are, of course, not independent.
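
To make the dependence concrete, here is a small Python sketch (not the Igor code from my procedure; the function names are made up for illustration). A true rotation matrix has unit-length, mutually orthogonal columns and a determinant of 1, so it leaves the length of the point at 1,0 unchanged. Four independently chosen numbers in [-1,1] will almost never satisfy those constraints.

```python
import math

def rotation_matrix(theta):
    """Standard 2D rotation matrix for an angle theta (radians), row-major."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def apply(matrix, point):
    """Multiply a 2x2 matrix by a 2D point."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y,
            matrix[1][0] * x + matrix[1][1] * y)

def is_rotation(matrix, tol=1e-9):
    """Check the constraints that tie the four entries together."""
    (a, b), (c, d) = matrix
    unit_columns = abs(a * a + c * c - 1) < tol and abs(b * b + d * d - 1) < tol
    orthogonal = abs(a * b + c * d) < tol
    proper = abs(a * d - b * c - 1) < tol  # determinant of +1
    return unit_columns and orthogonal and proper

good = rotation_matrix(0.7)
bad = [[0.5, -0.2], [0.9, 0.1]]  # four arbitrary numbers in [-1, 1]

print(is_rotation(good), math.hypot(*apply(good, (1, 0))))
print(is_rotation(bad), math.hypot(*apply(bad, (1, 0))))
```

The arbitrary matrix scales and shears the point as well as turning it, which is why the resampled coordinates did not look randomly rotated.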

I won’t make that mistake again and I’m going to try to think twice when trying to save a line of code like this in future!

Part of a series on computers and coding.

Do It Yourself: Lab Notebook Archiving Project

A while back, the lab moved to an electronic lab notebook (details here and here). One of the drivers for this move was the huge number of hard copy lab note books that had accumulated in the lab over >10 years. Switching to an ELN solved this problem for the future, but didn’t make the old lab note books disappear. So the next step was to archive them and free up some space.

We access the contents of these books fairly regularly so archiving had to mean digitising them as well as putting them into storage. I looked at a few options before settling on a very lo-fi solution.

Option 1: call in the professionals

I got a quote from our University’s preferred data archiving firm. The lab notebooks we use have 188 pages and I had 89 to archive. The quote was over £4000 + VAT for scanning only. This was too expensive and so I next looked at DIY options.

Option 2: scan the books

At the University we have good MPDs that will scan documents and store them on a server as a multipage PDF. There are two resolutions at which you can scan, which are good-but-not-amazing quality. The scanners have a feeder which would automate the scan of a lab book, but it would mean destroying the books (which are hardbound) to scan them.

I tried scanning one book using this method. Disassembling a notebook with a razorblade was quite quick but the problem was that the scanner struggled with the little print outs that people stick in their lab books. Dealing with jams and misfired scans meant that this was not an option, and I didn’t want to destroy all of the books either.

Option 3: photography rigs

Next, I looked at book scanning projects to see how they were done. In these projects, the books are valuable and so can’t be destroyed, but it must be automated… I found that these projects use a cradle to sit the book in. A platen is pushed against the pages (to flatten the pages) and then two cameras take a picture of the two pages, triggered in sync using an external button or foot pedal. An example of one raspberry pi-powered rig is here. Building one of these appealed but would still require some expense (and time and effort). I asked around if anyone else wanted to help with the build, thinking that others may be wanting to archive their notebooks, but I got no takers.

Option 4: the zero-cost solution!

Inspiration came from a student who left my lab and wanted to photograph her lab books for future reference. She captured them on her camera phone by hand in a matter of minutes. Shooting two pages of a book from a single digital camera suspended above the notebook would be a good compromise. Luckily I had access to a digital camera and a few hundred Lego bricks. Total new spend = £0.

I know it looks terrible, but it was pretty effective!

I put the rig on a table (for ergonomic reasons), next to a window, and photographed each book using natural light. It took around 10 min to photograph one lab book. I took the images over a few weeks amongst doing other stuff so that the job didn’t become too onerous. I shot the books at the highest resolution and stored the raw images on the server. I wrote a quick script to stack the images, scale them down 25% and export to PDF to make an easy-to-consult PDF file for each lab book. Everyone in the lab can access these PDFs and if needed can pull down the high res versions. The lab books have now been stored in a sealed container. We can access the books if needed. However, having looked at the images, I think if something is not readable from the file, it won’t be readable in the hard copy.

Was it worth it?

I think so. It took a while to get everything digitised but I’m glad it’s done. The benefits are:

  1. Easy access to all lab books for every member of the lab.
  2. Clearing a load of clutter from my office.
  3. The rig can be rebuilt easily, but is not otherwise sitting around gathering dust.
  4. Some of the older lab books were deteriorating and so capturing them before they got worse was a good idea (see picture above for some sellotape degradation).

The post title is taken from the LP “Do It Yourself” by The Seahorses.

Dividing Line: not so simple division in ctenophores

This wonderful movie has repeatedly popped up into my twitter feed.

It was taken by Tessa Montague and is available here (tweet is here).

The movie is striking because of the way that cytokinesis starts at one side and moves to the other. Most model systems for cell division have symmetrical division.

Rob de Bruin commented that “it makes total sense to segregate this way”, implying that if a cell just gets cut in half, it deals with equal sharing of components. This got me thinking…

It does make sense to share n identical objects this way. For example, vesiculation of the Golgi generates many equally sized vesicles. Cutting the cell in half ensures that each cell gets approximately half of the Golgi (although there is another pathway that actively segregates vesicular material, reviewed here). However, for segregation of genetic material – where it is essential that each cell receives one (and exactly one) copy of the genome – a cutting-in-half mechanism simply doesn’t cut it (pardon the pun).

The error rate of such a mechanism would be approximately 50% which is far too high for something so important. Especially at this (first) division as shown in the movie.

I knew nothing about ctenophores (comb jellies) before seeing this movie and with a bit of searching I found this paper. In here they show that there is indeed a karyokinetic (mitotic) mechanism that segregates the genetic material and that this happens independently of the cytokinetic process which is actin-dependent. So not so different after all. The asymmetric division and the fact that these divisions are very rapid and synchronised is very interesting. It’s very different to the sorts of cells that we study in the lab. Thanks to Tessa Montague for the amazing video that got me thinking about this.

Footnote: the 50% error rate can be calculated as follows. Although segregation is in 3D, this is a 1D problem. Assume that the cell divides down the centre of the long axis and that object 1 and object 2 can be randomly situated along the long axis. There is an equal probability of each object ending up in each cell. So object 1 can end up in either cell 1 or cell 2, as can object 2. The probability that objects 1 and 2 end up in the same cell is 50%. This is because there is a 25% chance of each outcome (object 1 in cell 1, object 2 in cell 2; object 1 in cell 2, object 2 in cell 1; object 1 and object 2 in cell 1; object 1 and object 2 in cell 2). This per-pair figure doesn’t depend on how many objects we are talking about or the size of the cell. This is a highly simplified calculation but serves the purpose of showing that another solution is needed to segregate objects with identity during cell division.
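
The four outcomes in the footnote can be enumerated directly. A minimal Python sketch, under the same simplifying assumption that each object lands in either cell with equal probability:

```python
from fractions import Fraction
from itertools import product

cells = ("cell 1", "cell 2")

# Every way of placing object 1 and object 2 into the two daughter cells
outcomes = list(product(cells, repeat=2))

# An error is any outcome where both objects end up in the same cell
same_cell = [o for o in outcomes if o[0] == o[1]]

error_rate = Fraction(len(same_cell), len(outcomes))
print(outcomes)
print(error_rate)
```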

The post title comes from “Dividing Line” from the Icons of Filth LP Onward Christian Soldiers.

Frankly, Mr. Shankly

I read about Antonio Sánchez Chinchón’s clever approach to use the Travelling Salesperson algorithm to generate some math-art in R. The follow up was even nicer in my opinion, Pencil Scribbles. The subject was Boris Karloff as the monster in Frankenstein. I was interested in running the code (available here and here), so I thought I’d run it on a famous scientist.

By happy chance one of the most famous scientists of the 20th Century, Rosalind Franklin, shares a nominative prefix with the original subject. There is also a famous portrait of her that I thought would work well.

I first needed to clear up the background because it was too dark.

Now to run the TSP code.

The pencil scribbles version is nicer I think.

The R scripts basically ran out-of-the-box. I was using a new computer that didn’t have X11quartz on it nor the packages required, but once they were installed I just needed to edit the line to use a local file in my working directory. The code just ran. The outputs FrankyTSP and Franky_scribbles didn’t even need to be renamed, given my subject’s name.

Thanks to Antonio for making the code available and so easy to use.

The post title comes from “Frankly, Mr. Shankly” by The Smiths which appears on The Queen is Dead. If the choice of post title needs an explanation, it wasn’t a good choice…

Paintball’s Coming Home: generating Damien Hirst spot paintings

A few days ago, I read an article about Damien Hirst’s new spot paintings. I’d forgotten how regular the spots were in the original spot paintings from the 1990s (examples are on his page here). It made me think that these paintings could be randomly generated and so I wrote a quick piece of code to do this (HirstGenerator).

I used Hirst’s painting ‘Abalone Acetone Powder’ (1991), which is shown on this page as photographed by Alex Hartley. I wrote some code to sample the colours of this image and then a script to replicate it. The original is shown below © Damien Hirst and Science Ltd. Click them for full size.

and then this is the replica:

Now that I had a palette of the colours used in the original, it was simple to write a generator to make spot paintings where the spots are randomly assigned.

The generator can make canvasses at whatever size is required.
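
The idea behind the generator can be sketched as follows. This is a Python illustration and not the HirstGenerator code itself; the palette values and the function name generate_spot_grid are placeholders. Each position in a grid of any size is given a colour drawn at random from the palette.

```python
import random

# Placeholder RGB palette (made-up values, not sampled from the painting)
palette = [(214, 40, 57), (32, 96, 170), (250, 210, 60), (40, 150, 90)]

def generate_spot_grid(palette, rows, cols, rng=None):
    """Return a rows x cols grid with a randomly chosen palette colour per spot."""
    rng = rng or random.Random()
    return [[rng.choice(palette) for _ in range(cols)] for _ in range(rows)]

# A seeded generator makes a particular "painting" reproducible
grid = generate_spot_grid(palette, rows=5, cols=8, rng=random.Random(1))
for row in grid:
    print(row)
```

Because the colours are drawn with replacement, a colour that appears many times in the palette will turn up proportionally more often on the canvas.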

The code can be repurposed to make spot paintings with different palettes from his other spot paintings or from something else. So there you have it. Generative Hirst Spot Paintings.

For nerds only

My original idea was to generate a palette of unique colours from the original painting. Because of the way I sampled them, each spot is represented once in the palette. This means the same colour as used by the artist is represented as several very similar but nonidentical colours in the palette. My original plan was to find the euclidean distances between all spots in RGB colour space and to establish a distance cutoff to decide what is a unique colour.

That part was easy to write but what value to give for the cutoff was tricky. After some reading, it seems that other colour spaces are better suited for this task, e.g. converting RGB to a CIE colour space. For two reasons, I didn’t pursue this. First, quantixed coding is time-limited. Second, assuming that there is something to the composition of these spot paintings (and they are not a con trick) the frequency of spots must have artistic merit and so they should be left in the palette for sampling in the generated pictures. The representation of the palette in RGB colour space had an interesting pattern (shown in the GIF above).
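
For the curious, the distance-cutoff idea might look something like this. It is a Python sketch of one simple (greedy) way to do it, not the code I wrote; the colours and the cutoff of 10 are arbitrary examples, and the function name is made up.

```python
import math

def unique_colours(palette, cutoff):
    """Keep a colour only if it is further than cutoff from every colour kept so far."""
    unique = []
    for colour in palette:
        # math.dist gives the Euclidean distance between two RGB triplets
        if all(math.dist(colour, kept) > cutoff for kept in unique):
            unique.append(colour)
    return unique

# Two near-identical reds and one blue (made-up values)
sampled = [(255, 0, 0), (250, 3, 2), (0, 0, 255)]
print(unique_colours(sampled, cutoff=10))
```

The result depends on the order of the palette and on the cutoff, which is exactly the part that was tricky to decide.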

The post title comes from “Paintball’s Coming Home” by Half Man Half Biscuit from Voyage To The Bottom Of The Road. Spot paintings are kind of paintballs, but mostly because I love the title of this song.

Scoop: some practical advice

So quantixed occasionally gets correspondence from other researchers asking for advice. A recent email came from someone who had been “scooped”. What should they do?

Before we get into this topic we have to define what we mean by being scooped.

In the most straightforward sense being scooped means that an article appeared online before you managed to get your article online.

You were working on something that someone else was also working on – maybe you knew about this or not and vice versa – but they got their work out before you did. They are the scooper and you are the scoopee.

There is another use of the term, primarily used in highly competitive fields, which defines the act of scooping as the scooper having gained some unfair advantage to make the scoop. In the worst case, this can be done by receiving your article to review confidentially and then delaying your work while using your information to accelerate their own work (Ginsparg, 2016).

However it happens, the scoop can be classified as an overscoop or an underscoop. An overscoop is where the scooper has much more data and a far more complete story. Maybe the scooper’s paper appears in a high-profile journal while the scoopee was planning on submitting to a less-selective journal. Perhaps the scooper has the cell data, an animal model, the biochemical data and a crystal structure; while the scoopee had some nice data in cells and a bit of biochemistry. An underscoop is where a key observation that the scoopee was building into a full paper is partially revealed. The scoopee could have more data or better quality results and maybe the full mechanism, but the scooper’s paper gives away a key detail (Mole, 2004).

All of these definitions are different from the journalistic definition which simply means “the scoop” is the big story. What the science and journalistic terms share is the belief that being second with a story is worthless. In science, being second and getting the details right is valuable and more weight should be given to it than it currently is. I think follow-up work is valued by the community, but it is fair to say that it is unlikely to receive the same billing and attention as the scooper’s paper.

How often does scooping actually happen?

To qualify as being scooped, you need to have a paper that you are preparing for publication when the other paper appears. If you are not at that point, someone else was just working on something similar and they’ve published a paper. They haven’t scooped you. This is easiest to take when you have just had an idea or have maybe done a few experiments and then you see a paper on the same thing. It must’ve been a good idea! The other paper has saved you some time! Great. Move on. The problem comes when you have invested a lot of time doing a whole bunch of work and then the other paper appears. This is very annoying, but to reiterate, you haven’t really been scooped if you weren’t actually at the point of preparing your work for publication.

As you might have gathered, I am not even sure scooping is a real thing. For sure the fear of being scooped is real. And there are instances of scooping happening. But most of the time the scoopee has not actually been scooped. And even then, the scoopee does not just abandon their work.

So what is the advice to someone who has discovered that they have been scooped?

Firstly, don’t panic! The scooper’s paper is not going to go away and you have to deal with the fact you now have the follow-up paper. It can be hard to change your mindset, but you must rewrite your paper to take their work into account. Going into denial mode and trying to publish your work as though the other paper doesn’t exist is a huge mistake.

Second, read their work carefully. I doubt that the scooper has left you with no room for manoeuvre. Even in the case of the overscoop, you probably still have something that the other paper doesn’t have that you can still salvage. There’s bound to be some details on which your work does not agree and this can feature in your paper. If it’s an underscoop, you have even less to worry about. There will be a way forward – you just need to identify it and move on.

The main message is that “being scooped” is not the end. You just need to figure out your way forward.

How do I stop it from happening to me?

Be original! It’s a truism that if you are working on something interesting, it’s likely that someone else is too. And if you work in a highly competitive area, there might be many groups working on the same thing and it is more likely that you will be scooped. Some questions are obvious next steps and it might be worth thinking twice about pursuing them. This is especially true if you come up with an idea based on a paper you’ve read. Work takes so long to appear that the lab who published that paper is likely far ahead of you.

Having your own niche gives the best protection. If you have carved out your own question you probably have the lead and will be associated with work in this area anyway. Other labs will back off. If you have a highly specialised method, again you can contribute in ways that others can’t and so your chances of being scooped decrease.

Have a backup plan. Do you have a side project which you can switch to if too much novelty is taken away from your main project? You can insulate yourself from scoop damage by not working on projects that are all-or-nothing. Horror stories about scooping in structural biology (which is all about “the big reveal”) are commonplace. Investing energy in alternative approaches or new assays as well as getting a structure might help here.

If you find out about competition, maybe from a poster or a talk at a meeting, you need to evaluate whether it is worth carrying on. If you can, talk to the other lab. Most labs do not want to compete and would prefer to collaborate or at least co-ordinate submission of manuscripts.

Use preprints! If you deposit your work on a preprint server, you get a DOI and a date stamp. You can prove that your work existed on that date and in what form. This is the ultimate protection against being scooped. If someone else’s work appears online before you do this, then as I said above, you haven’t really been scooped. If work appears and you already have a DOI, well, then you haven’t been scooped either. Some journals see things this way. For example, EMBO J have a scoop protection policy that states that the preprint deposition timestamp is the date at which priority is assessed.

The post title is taken from “Scoop” by The Auctioneers. I have this track on an extended C86 3-Disc set.

Measured Steps: Garmin step adjustment algorithm

I recently got a new GPS running watch, a Garmin Fēnix 5. As well as tracking runs, cycling and swimming, it does “activity tracking” – number of steps taken in a day, sleep, and so on. The step goals are set to move automatically and I wondered how it worked. With a quick number crunch, the algorithm revealed itself. Read on if you are interested in how it works.

Step screen on the Garmin Fēnix 5

The watch started out with a step target of 7500 steps in one day. I missed this by 2801 and the target got reduced by 560 to 6940 for the next day. That day I managed 12480, i.e. 5540 over the target. So the target went up by 560 to 7500. With me so far? Good. So next I went over the target and it went up again (but this time by 590 steps). I missed that target by a lot and the target was reduced by 530 steps. This told me that I’d need to collect a bit more data to figure out how the goal is set. Here are the first few days to help you see the problem.

Actual steps   Goal   Deficit/Surplus   Adjustment for tomorrow
4699           7500   -2801             -560
12480          6940   5540              560
10417          7500   2917              590
2726           8090   -5364             -530
6451           7560   -1109             -220
8843           7340   1503              150
8984           7490   1494              300
9216           7790   1426              290

The data is available for download as a csv via the Garmin Connect website. After waiting to accumulate some more data, I plotted out the adjustment vs step deficit/surplus. The pattern was pretty clear.

There are two slopes here that pass through the origin. It doesn’t matter what the target was, the adjustment applied is scaled according to how close to the target I was, i.e. the step deficit or surplus. There was either a small (0.1) or large (0.2) scaling used to adjust the step target for the next day, but how did the watch decide which scale to use?
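
The two scalings can be seen without a plot by dividing each adjustment by the corresponding deficit or surplus. A quick Python sketch using only the eight days tabulated above (the variable names are just for illustration):

```python
# (actual steps, goal, adjustment for tomorrow) for the first eight days
days = [
    (4699, 7500, -560),
    (12480, 6940, 560),
    (10417, 7500, 590),
    (2726, 8090, -530),
    (6451, 7560, -220),
    (8843, 7340, 150),
    (8984, 7490, 300),
    (9216, 7790, 290),
]

ratios = []
for steps, goal, adjustment in days:
    difference = steps - goal  # negative = deficit, positive = surplus
    ratios.append(adjustment / difference)
    print(steps, goal, difference, adjustment, round(ratios[-1], 3))

# Each day's goal is the previous goal plus the previous adjustment
goals_chain = all(
    days[i][1] + days[i][2] == days[i + 1][1] for i in range(len(days) - 1)
)
print(goals_chain)
```

The ratios are not exactly 0.1 or 0.2 because the watch appears to round the adjustment to the nearest ten steps or so.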

The answer was to look back at the previous day’s activity as well as the current day.

So if today you exceeded the target and you also exceeded the target yesterday then you get a small scale increase. Likewise if you fell short today and yesterday, you get a small scale decrease. However, if you’ve exceeded today but fell short yesterday, your target goes up by the big scaling. Falling short after exceeding yesterday is rewarded with a big scale decrease. The actual size of the decrease depends on the deficit or surplus on that day. The above plot is coloured according to the four possibilities described here.

I guess there is a logic to this. The goal could quickly get unreachable if it increased by 20% on a run of two days exceeding the target, and conversely, too easy if the decreases went down rapidly with consecutive inactivity. It’s only when there’s been a swing in activity that the goal should get moved by the large scaling. Otherwise, 10% in the direction of attainment is fine.

I have no idea if this is the algorithm used across all of Garmin’s watches or if other watch manufacturers use different target-setting algorithms.

The post title comes from “Measured Steps” by Edsel from their Techniques of Speed Hypnosis album.

Esoteric Circle

Many projects in the lab involve quantifying circular objects. Microtubules, vesicles and so on are approximately circular in cross section. This quick post is about how to find the diameter of these objects using a computer.

So how do you measure the diameter of an object that is approximately circular? Well, if it was circular you would measure the distance from one edge to the other, crossing the centre of the object. It doesn’t matter along which axis you do this. However, since these objects are only approximately circular, it matters along which axis you measure. There are a couple of approaches that can be used to solve this problem.

Principal component analysis

The object is a collection of points* and we can find the eigenvectors and eigenvalues of these points using principal component analysis. This was discussed previously here. The 1st eigenvector points along the direction of greatest variance and the 2nd eigenvector is normal to the first. The order of eigenvectors is determined by their eigenvalues. We use these to rotate the coordinate set and offset to the origin.

Now the major axis of the object is aligned to the x-axis at y=0 and the minor axis is aligned with the y-axis at x=0 (compare the plot on the right with the one on the left, where the profiles are in their original orientation – offset to zero). We can then find the absolute values of the axis crossing points and when added together these represent the major axis and minor axis of the object. In Igor, this is done using a one-liner to retrieve a rotated set of coords as the wave M_R.

PCA/ALL/SEVC/SRMT/SCMT xCoord,yCoord

To find the crossing points, I use Igor’s interpolation-based level crossing functions. For example, storing the aggregated diameter in a variable called len.

FindLevel/Q/EDGE=1/P m1c0, 0
len = abs(m1c1(V_LevelX))
FindLevel/Q/EDGE=2/P m1c0, 0
len += abs(m1c1(V_LevelX))

This is just to find one axis (where m1c0 and m1c1 are the 1st and 2nd columns of a 2-column wave m1) and so you can see it is a bit cumbersome.
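
For readers without Igor, the same idea can be sketched in plain Python. This is an illustration only and not a translation of the PCA operation: it uses the closed-form principal angle of the 2x2 covariance matrix, assumes the points are ordered around a closed outline, and the function names are made up.

```python
import math

def axis_span(coords, zero_dim):
    """Distance between the outline's crossings of the axis where coords[zero_dim] is 0."""
    other = 1 - zero_dim
    crossings = []
    for i in range(len(coords)):
        p, q = coords[i], coords[(i + 1) % len(coords)]
        a, b = p[zero_dim], q[zero_dim]
        if (a < 0) != (b < 0):  # sign change between neighbouring points
            t = a / (a - b)     # linear interpolation to the crossing
            crossings.append(p[other] + t * (q[other] - p[other]))
    return max(crossings) - min(crossings)

def principal_axes_lengths(points):
    """Return (major, minor) axis lengths of a closed outline of (x, y) points."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centred = [(x - cx, y - cy) for x, y in points]
    # Entries of the covariance matrix
    sxx = sum(x * x for x, _ in centred) / n
    syy = sum(y * y for _, y in centred) / n
    sxy = sum(x * y for x, y in centred) / n
    # Direction of greatest variance (angle of the 1st eigenvector)
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    # Rotate by -theta so that the major axis lies along x
    rotated = [(x * c + y * s, -x * s + y * c) for x, y in centred]
    return axis_span(rotated, 1), axis_span(rotated, 0)

# Test outline: an ellipse with axes 6 and 4, rotated by 30 degrees and offset
tilt = math.radians(30)
outline = []
for k in range(360):
    t = math.radians(k)
    ex, ey = 3 * math.cos(t), 2 * math.sin(t)
    outline.append((5 + ex * math.cos(tilt) - ey * math.sin(tilt),
                    -2 + ex * math.sin(tilt) + ey * math.cos(tilt)))

major, minor = principal_axes_lengths(outline)
print(major, minor)
```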

Anyway, I was quite happy with this solution. It is unbiased and also tells us how approximately circular the object is (because the major and minor axes tell us the aspect ratio or ellipticity of the object). I used it in Figure 2 of this paper to show the sizes of the coated vesicles. However, in another project we wanted to state what the diameter of a vesicle was. Not two numbers, just one. How do we do that? We could take the average of the major and minor axes, but maybe there’s an easier way.

Polar coordinates

The distance from the centre to every point on the edge of the object can be found easily by converting the xy coordinates to polar coordinates. To do this, we first find the centre of the object. This is the centroid \((\bar{x},\bar{y})\) represented by

\(\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_{i} \) and \(\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_{i} \)

for n points and subtract this centroid from all points to arrange the object around the origin. Now, since the xy coords are represented in polar system by

\(x_{i} = r_{i}\cos(\phi_{i}) \) and \(y_{i} = r_{i}\sin(\phi_{i}) \)

we can find r, the radial distance, using

\(r_{i} = \sqrt{x_{i}^{2} + y_{i}^{2}}\)

With those values we can then find the average radial distance and report that.

There’s something slightly circular (pardon the pun) about this method because all we are doing is minimising the distance to a central point initially and then measuring the average distance to this minimised point in the latter step. It is much faster than the PCA approach and would be insensitive to changes in point density around the object. The two methods would probably diverge for noisy images. Again in Igor this is simple:

Make/O/N=(dimsize(m1,0)-1)/FREE rW
rW[] = sqrt(m1[p][0]^2 + m1[p][1]^2)
len = 2 * mean(rW)

Here again, m1 is the 2-column wave of coords and the diameter of the object is stored in len.
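
The same calculation as a Python sketch, for anyone not using Igor. Here the centroid subtraction is included explicitly, and the function name mean_diameter is just for illustration.

```python
import math

def mean_diameter(points):
    """Twice the mean radial distance of (x, y) points from their centroid."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    radii = [math.hypot(x - cx, y - cy) for x, y in points]
    return 2 * sum(radii) / n

# Test outline: a circle of radius 10 centred at (3, 4)
circle = [(3 + 10 * math.cos(math.radians(k)), 4 + 10 * math.sin(math.radians(k)))
          for k in range(360)]
diameter = mean_diameter(circle)
print(diameter)
```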

How does this compare with the method above? The answer is kind of obvious, but it lies midway between the major and minor axes. Major axis is shown in red and minor axis shown in blue compared with the mean radial distance method (plotted on the y-axis). In places there is nearly a 10 nm difference which is considerable for objects which are between 20 and 35 nm in diameter. How close is it to the average of the major and minor axes? Those points are in black and they are very close but not exactly on y=x.

So for simple, approximately circular objects with low noise, the ridiculously simple polar method gives us a single estimate of the diameter of the object and this is much faster than the more complex methods above. For more complicated shapes and noisy images, my feeling is that the PCA approach would be more robust. The two methods actually tell us two subtly different things about the shapes.

Why don’t you just measure them by hand?

In case there is anyone out there wondering why a computer is used for this rather than a human wielding the line tool in ImageJ… there are two good reasons.

  1. There are just too many! Each image has tens of profiles and we have hundreds of images from several experiments.
  2. How would you measure the profile manually? This approach shows two unbiased methods that don’t rely on a human to draw any line across the object.

* = I am assuming that the point set is already created.

The post title is taken from “Esoteric Circle” by Jan Garbarek from the LP of the same name released in 1969. The title fits well since this post is definitely esoteric. But maybe someone out there is interested!

Inspiration Information: some book recommendations for kids

As with children’s toys and clothes, books aimed at children tend to be targeted in a gender-stereotyped way. This is a bit depressing. While books about princesses can be inspirational to young girls – if the protagonist decides to give it all up and have a career as a medic instead (the plot to Zog by Julia Donaldson) – mostly they are not. How about injecting some real inspiration into reading matter for kids?

Here are a few recommendations. This is not a survey of the entire market, just a few books that I’ve come across that have been road-tested and received a mini-thumbs up from little people I know.

Little People Big Dreams: Marie Curie by Isabel Sanchez Vegara & Frau Isa

This is a wonderfully illustrated book that tells the story of Marie Curie. It follows her from a young girl growing up in Poland, through overcoming gender restrictions to go and study in France, to winning two Nobel Prizes and being a war hero! The front part of the book is written in simple language that kids can read while the last few pages are (I guess) for an adult to read aloud to the child, or for older children to read for themselves.

This book is part of a series which features inspirational women: Ada Lovelace, Rosa Parks, Emmeline Pankhurst, Amelia Earhart. What is nice is that the series also has books on women from creative fields: Coco Chanel, Audrey Hepburn, Frida Kahlo, Ella Fitzgerald. Often non-fiction books for kids are centred on science/tech/human rights, which is great but, let’s face it, not all kids will engage with these topics. The bigger message here is to show young people that little people with big dreams can change the world.

Ada Twist, Scientist by Andrea Beaty & David Roberts

A story about a young scientist who keeps on asking questions. The moral of the story is that there is nothing wrong with asking “why?”. The artwork is gorgeous and there are plenty of things to spot and look at on each page. The mystery of the book is not exactly solved either so there’s fun to be had discussing this as well as reading the book straight. Ada Marie Twist is named after Ada Lovelace and Marie Curie, two female giants of science.

This book is highly recommended. It’s fun and crammed full with positivity.

Rosie Revere, Engineer by Andrea Beaty & David Roberts

By the same author and illustrator, ‘Rosie Revere…’ tells the story of a young inventor. She overcomes ridicule when she is taken under the wing of her great aunt, who is an inspirational engineer. Her great aunt Rose is, I think, supposed to be Rosie the Riveter, the be-headscarfed feminist icon from WWII. A wonderful touch.

Rosie is a classmate of Ada Twist (see above) and there is another book featuring a young (male) architect which we have not yet road-tested. Rather than recruitment propaganda for Engineering degrees, the broader message of ‘Rosie Revere…’ is that persevering with your ideas and interests is a good thing, i.e. never give up.

Good Night Stories for Rebel Girls by Elena Favilli & Francesca Cavallo

A wonderful book that gives brief biographies of inspiring women. Each two-page spread has some text and an illustration of the rebel girl to inspire young readers. The book has a “This book belongs to…” page at the beginning but, in a move of pure genius, it also has two final pages for the owner of the book to write their own story. Just like the women featured in the book, the owner of the book can have their own one-page story and draw their own self-portrait.

This book is highly recommended.

EDIT: this book was added to the list on 2018-02-26

Who Was Charles Darwin? by Deborah Hopkinson & Nancy Harrison

This is a non-fiction book covering Darwin’s life from school days through the Beagle adventures and on to old age. It’s a book for children although compared to the books above, this is quite a dry biography with a few black-and-white illustrations. This says more about how well the books above are illustrated rather than anything particularly bad about “Who Was Charles Darwin?”. Making historical or biographical texts appealing to kids is a tough gig.

The text is somewhat inspirational – Darwin’s great achievements were made despite personal problems – but there is a disconnect between the life of a historical figure like Darwin and the children of today.

For older people

Quantum Mechanics by Jim Al-Khalili

Aimed at older children and adults, this book explains the basics behind the big concept of “Quantum Mechanics”. These Ladybird Expert books have a retro appeal, being similar to the original Ladybird books published over forty years ago. Jim Al-Khalili is a great science communicator and any young people (or adults) who have engaged with his TV work will enjoy this short format book.

Evolution by Steve Jones

This is another book in the Ladybird Expert series (there is one further book, on “Climate Change”). The brief here is the same: a short format explainer of a big concept, this time “Evolution”. The target audience is the same. It is too dry for young children but perfect for teens and for adults. Steve Jones is an engaging writer and this book doesn’t disappoint, although the format is limited to one-page large text vignettes on evolution with an illustration on the facing page.

It’s a gateway to further reading on the topic and there’s a nice list of resources at the end.

Computing for Kids

After posting this, I realised that we have lots of other children’s science and tech books that I could have included. The best of the rest is this “lift-the-flap” book on Computers and Coding published by Usborne. It’s a great book that introduces computing concepts in a fun, gender-free way. It can inspire kids to get into programming, perhaps making a step up from Scratch Jr or some other platform that they use at school.

I haven’t included any links to buy these books. Of course, they’re only a Google search away. If you like the sound of any, why not drop in to your local independent bookshop and support them by buying a copy there?

Any other recommendations for inspirational reading for kids? Leave a comment below.

The post title comes from the title track of the “Inspiration Information” LP by Shuggie Otis. The version I have is the re-release with ‘Strawberry Letter 23’ on it from ‘Freedom Flight’ – probably his best known track – as well as a host of other great tunes. Highly underrated, check it out. There’s another recommendation for you.