## All That Noise: The vesicle packing problem

This week Erick Martins Ratamero and I put up a preprint on vesicle packing. This post is a bit of backstory but please take a look at the paper, it’s very short and simple.

The paper started when I wanted to know how many receptors could fit in a clathrin-coated vesicle. Sounds like a simple problem – but it’s actually more complicated.

Of course, this problem is not as simple as calculating the surface area of the vesicle, the cross-sectional area of the receptor and dividing one by the other. The images above show the problem. The receptors would be the dimples on the golf ball… they can’t overlap… how many can you fit on the ball?

It turns out that a PhD student working in Groningen in 1930 posed a similar problem (known as the Tammes Problem) in his thesis. His concern was the even pattern of pores on a pollen grain, but the root of the problem is the Thomson Problem. This is the minimisation of energy that occurs when charged particles are on a spherical surface. The particles must distribute themselves as far away as possible from all other particles.

There are very few analytical solutions to the Tammes Problem (presently 3-14 and 24 are solved). Anyhow, our vesicle packing problem is the other way around. We want to know, for a vesicle of a certain size, and cargo of a certain size, how many can we fit in.

Fortunately stochastic Tammes solvers are available like this one, that we could adapt. It turns out that the numbers of receptors that could be packed is enormous: for a typical clathrin-coated vesicle almost 800 G Protein-Coupled Receptors could fit on the surface. Note, that this doesn’t take into account steric hinderance and assumes that the vesicle carries nothing else. Full details are in the paper.

Why does this matter? Many labs are developing ways to count molecules in cellular structures by light or electron microscopy. We wanted to have a way to check that our results were physically possible. For example, if we measure 1000 GPCRs in a clathrin-coated vesicle, we know something has gone wrong.

What else? This paper ticked a few things on my publishing bucket list: a paper that is solely theoretical, a coffee-break idea paper and one that is on a “fun” subject. Erick has previous form with theoretical/fun papers, previously publishing on modelling peloton dynamics in procycling.

We figured the paper was more substantial than a blog post yet too minimal to send to a journal. So unless a journal wants to publish it (and gets in touch with us), this will be my first preprint where bioRxiv is the final destination.

We got a sense that people might be interested in an answer to the vesicle packing problem because whenever we asked people for an estimate, we got hugely different answers! The paper has been well-received so far. We’ve had quite a few comments on Twitter and we’re glad that we wrote up the work.

The post title comes from the “All That Noise” LP by The Darkside. I picked this not because of the title, but because of the cover.

## Installing open source PyMol on a Mac

This is a quick “how to” post. There is a licensed version of PyMol (MacPyMol) available, but the open source version can be installed on a Mac free of charge. The official page has a guide, which is not terribly detailed, and I found this excellent guide which is unfortunately out-of-date.

Here is an updated guide to installing PyMol using Homebrew on macOS Mojave 10.14.3

Step 1 is to install Homebrew

/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)" Next step is to install the PyMol dependencies using Homebrew brew install Caskroom/cask/xquartzbrew install tcl-tkbrew install python Now install PyMol. brew install brewsci/bio/pymol You can now run PyMol by typing pymol The MPIBR-Bioinformatics page has the following guide to make a little executable app to launch PyMol straight from the Desktop. • Open Automator, which is in Applications in macOS. • Create new document, select “Application”. • Select “Actions” and “Library” in the left pane. Select “Utilities” and “Run Shell Script”. Drag this into the main pane. • Choose “/bin/bash” as a shell. • Paste the following: /usr/local/bin/pymol -M. If this doesn’t work, check the path to pymol using which pymol in the terminal, and use this instead. • Save the application (“File > Save”) to the Desktop and name it “PyMol”. Now you can double-click this app to run PyMol. ## Super Automatic: computer-based tools for research Since I have now written several posts on this. I thought I would summarise the computer-based tools that we are using in the lab to automate our work and organise ourselves. Electronic lab notebook – there are previous posts from me on picking an ELN platform and how to set up a WordPress ELN as well as Dave Mason’s guide and CAMDU’s guide to help you to get this set up. Slack – it took us ages to set up our lab slack. I wanted to try it out a few years ago but some people in the lab were reluctant (see below for some notes on getting people on board with new tech). I was also sceptical since we are a wet lab and I wondered whether slack would work as well for us as it does for bioinformaticians, for example. We just went for it one day and have not regretted it as it has improved lab communication in so many ways. There is a good guide available on the site to set up something that works for a lab. Trello – organising projects and assigning to-do items (see this post). Since we switched our communication to Slack, we use Trello a lot less. It is still very good for checking progress on defined tasks, for brainstorming during a meeting, and to make sure stuff doesn’t get forgotten about. Besides our lab boards, I own other boards for work outside of my lab and these also work well, in some cases better than our lab boards. Databases – infrastructure in the lab is centred on databases for key reagents and these can be cross-referenced in electronic lab notebooks. Keeping them updated is essential. We use FileMaker Pro to maintain them. Setup took some time, but very much worth it. OMERO – The lab is lucky to have excellent computing support from CAMDU who set up our OMERO database and server. Images from microscopes are piped direct to the database on a per user basis. I’m a big fan. Dropbox – an account is essential in academia. I have a shared folder with each person in the lab and a folder that all members share. File exchange is best done via our lab server, but Dropbox is still incredibly useful. Overleaf – writing a paper collaboratively in LaTeX with Overleaf is a joy (see this post). There are Google-based tools for doing this but I am not keen on using them. Dropbox now has a collaborative writing function that my colleagues are using and enjoying. Zotero – when writing collaboratively in Overleaf, a shared reference management system is essential. Zotero is a healthy alternative to Mendeley and it is now supported in Overleaf v2. I don’t store my PDFs in Zotero and I found no solution for an iTunes-for-PDF-files. Calendars – while I’m not a fan of Google-based tools, we use the calendar functions for our lab. We inherited this from the Centre where we work, where these calendars are used for booking equipment. We have lab calendars for microscopes, equipment, workstations and for general lab stuff so we know when people are away. I set up a dummy account that belongs to all the lab calendars and this is linked to our lab Slack so that we get a digest of the days bookings every morning, and notifications if new events are added. The University has an outlook-based calendar system, the use of which is patchy amongst academics. However, the admin people use it and so I have blocked out times in here when I’m busy to reduce diary conflicts. Filter, filter, filter – I set up many filters on my email, twitter… wherever I can… to keep out spam and irrelevant stuff. Automating the little stuff – a previous post on being organised as a PI mentioned that I advocate writing scripts and macros to automate little things that you do often. Of course there is an xkcd cartoon for this. I have scripts set up to do things like assembling PDFs or converting or compressing file formats. We’ve also been automating the big stuff too. Figures are a good example. We write scripts to produce (and reproduce) figure panels. Sharing code – A while back I set up an update site to distribute our ImageJ macros among the lab, but also people from outside can subscribe and get the latest updates easily. Our Igor code is shared within the lab via a cloud-based updater which allows code to compiled on-demand. Lab code is maintained via git at GitHub and our centre forks repos from published projects to its own account. Onboarding. None of these tools work unless everyone is on board. It is worth having a strategy to make this happen. Simple steps such as introducing on system at a time, providing initial training and some support for people who are slow to uptake. There’s a group effect to onboarding but getting to the tipping point can be hard. The title for this post “Super Automatic” comes from the album of that name by Myracle Brah (the name of this band is not endorsed by quantixed). ## Experiment Zero: Using a Raspberry Pi Zero camera This is the first post at quantixed about Raspberry Pi computing. Pi Zero is a minimalist Raspberry Pi that can be coupled to a camera. With this little rig, you can make time-lapse footage amongst other things. I’ve set up a couple of these now. One was to make a time-lapse movie of some plants growing through a plastic maze. The results were pretty good and I thought I’d upload the video and a brief how-to guide. After a delay, you can see four beans sprouting and then one eventually makes it to the top of the maze. This footage was shot over 27 days. The Pi took pictures every 5 min, but I sampled at 10 min in order to make the movie (after discarding the pictures after the sun went down). Everything was automated. The camera shoots at 3280 × 2464. I downsampled the images to make the video. The camera didn’t focus well on the maze which was a bit too close. Other units are shooting scenery and the autofocus on the unit is great. How I did it Pi Zero with camera module (without IR filter) and a case are available for around £40. I bought mine from the Pi Hut. Power supplies and SD cards are readily available. I put together the PiCam with a fresh Raspbian full image on a 16GB SD card. Another option is to use a smaller card and get the Pi to save the images to a server. I used PiBaker to format the SD Card, load on Raspbian and add a startup script that would connect the Pi Zero to WiFi and enable VNC. That meant I could plug it in and start using it headless. Well in theory! It turns out that VNC via Mac does not work with the UNIX style password which is the default on the Pi. I needed to connect to a monitor to rectify this by changing to VNC password in the VNC GUI. After this I could log in and use the Pi Zero remotely. A few more minor steps were needed for full functionality: 1. I enabled ssh and camera port in Raspberry Pi Configuration, disabled bluetooth and set the correct timezone (this can probably be done in PiBaker but I forgot). 2. Since I have several Raspberry Pis on the LAN. I needed to give this one its own identity to prevent network conflicts. 3. I needed to set up SMB sharing on the new Pi. Instructions for how to do these things are just one google search away. Now the Pi was ready to start taking images. I built a little stand for it out of Lego and set up the plant maze. Taking pictures with the Pi I wrote a shell script to take pictures using raspistill. I made a directory called camera in home/pi mkdir camera  Then made a camera.sh file home/pi that looked like this: #!/bin/bash DATE=$(date +"%Y-%m-%d_%H%M")
raspistill -o /home/pi/camera/$DATE.jpg  Then I made it executable chmod +x camera.sh  Using CRON, I execute the shell script on a schedule. I wanted to take pictures every 5 minutes. You can consult cronguru for your needs. */5 * * * * /home/pi/camera.sh 2>&1  That’s it! The Pi Zero will happily take pictures until you tell it to stop. Or there’s a…. crash. Dealing with crashes If you are going to do long-term time-lapse imaging, you need to defend against a crash that will prevent images from being acquired. In the worst case, the Pi could go offline and you wouldn’t know until you checked up on it. The first one I set up crashed quite often. I couldn’t determine the cause immediately. So I did the next best thing. I set up a watchdog to monitor for crashes and then reboot the Pi if/when it happens. Many guides online suggest bcm2708_wdog but this doesn’t work for a Pi Zero. Instead this worked for me: sudo modprobe bcm2835_wdt sudo nano /etc/modules  Adding the line “bcm2835_wdt” and saving the file Next I installed the watchdog daemon sudo apt-get install watchdog chkconfig chkconfig watchdog on sudo /etc/init.d/watchdog start sudo nano /etc/watchdog.conf  I uncommented two lines from this file: 1. watchdog-device = /dev/watchdog 2. the line that had max-load-1 in it Save the watchdog.conf file. There are guides online that describe how to set up the Pi so that it sends you an email or SMS when there’s a crash/reboot. I figured I didn’t need this – as long as it reboots OK. What now? Well, you wait for it to take photos! You can log in via VNC and check that the images are being acquired, or go in via ssh and watch the camera directory fill up. The size of the images is 3280 × 2464 and they are around 4.5 MB each, so the disk can quickly fill. After a while you’ll want to assemble a movie. I wrote a shell script on my Mac in order to to pull down the images, take a copy of the ones I want and then make a movie file and upload it to Dropbox so I could look at it on the go. #!/usr/bin/env bash # move to the location of the images cd /local/disk/folder2/ # pull down all images to a local folder - only new images are copied rsync -trv /Volumes/HOMEPI/camera/ /local/disk/folder/ # overnight images are dark and less than 1.5 MB # copy the ones we want to keep rsync -trv --min-size=1000K /local/disk/folder/ /local/disk/folder2/ # or you could filter on size like this - delete <2MB find . -name "*.jpg" -size -2000k -delete # scale the images down to 480 px wide and make movie ffmpeg -framerate 30 -pattern_type glob -i '*.jpg' -c:v libx264 -pix_fmt yuv420p -vf scale=480:-2 out.mp4 # move to dropbox mv out.mp4 /My/Dropbox/Folder/out.mp4  This script means that I had to manually delete the pictures from the Pi once they’d been copied but that was OK. My plan is to write a script to do this for the longer running projects so that it is automated. While it is possible to make the movies on the Pi itself, I did it on the Mac as that computer is beefier and is not busy taking pictures every 5 min! ffmpeg is a great tool for this and the documentation is impressive. For example if you have set up the camera in the wrong orientation you can do transposition in ffmpeg. If you don’t have ffmpeg, it is a simple install on the command line. ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)" < /dev/null 2> /dev/null
brew install ffmpeg


Hopefully this guide is useful to you. The Pi Zero Camera can be used for streaming video as well as taking a series of still images. I’m planning to test this out soon.

The post title “Experiment Zero” comes from the title of the album by Man or Astro-Man?

## Tips from the blog XI: docx to pdf

A long time ago I posted a little Automator routine to convert Word doc/docx files to PDF. Not long after that, this routine ceased to work due to changes in Microsoft Word (I think). It’s still very useful to convert a whole folder of docx files to PDF in order to avoid Word and just use Preview on the Mac. For committee work or for marking students’ work, I often have a whole folder of docx files and would prefer it if they were in PDF format. I found this very nifty trick on the web and thought I’d post a link here to make up for the fact that my old post no longer works.

The full post is here. What is so nice about this Automator solution is that it uses a bash script to do the conversion. This means you don’t need Microsoft Word for it to work! From what I can see it uses the xml in the docx file (and presumably won’t work on older *.doc files) for the conversion. The post describes how to run it as a Service in macOS. Note, that it destroys the docx files, so it should only be used on a copy. It could be run from the command line rather than right-clink, the engine is this little script.

Thanks to Jacob Salmela for posting it.

This post is part of a series of short tips

## New Lexicon: how to add a custom minted lexer in Overleaf

This quick post comes courtesy of LianTze Lim (an Overleaf TeXpert) and Kota Miura (a bioimage analyst).

I asked on the ImageJ forum some time ago how to add an ImageJ Macro lexer for a LaTeX document I was writing. Kota responded with this lexer for pygments. I then asked Overleaf if it was possible to add a custom lexer to an Overleaf document using the minted package. At the time this was not possible. However, I got a message from them today with a solution.

Steps to do this for your own Overleaf project:

1. Add Kota’s imagejmacro.py file to your project
2. Add minted to your preamble and then use
\begin{minted}{imagejmacro.py:ImageJMacroLexer -x}
// your code
\end{minted}

Here, imagejmacro.py is the name of the custom lexer saved in your project and ImageJMacroLexer is the name of the class in that file. If you want to use another custom lexer just replace as required. I have put up a read-only Overleaf example to show it working.

Thanks to LianTze for following up with me about this and special thanks to Kota who wrote the custom lexer.

The post title comes from the LP of the same name by Paint It Black.

## My Blank Pages VI: Programming in Igor Pro

It has been a long time since I wrote a book review.

A few months ago I read on IgorExchange that Martin Schmid had written a book about programming Igor. I snapped up a copy. I’m a competent Igor programmer but I was hoping that this book would be useful for lab members that want to learn.

Learning Igor – like most IDEs or programming languages – is tough going. There’s a booklet from WaveMetrics (the company that sells Igor Pro) called Getting Started – which is really good. There are a few other guides on the web (Payam’s guide, Thomas Braun’s coding conventions, quantixed’s own translator), but other resources are pretty scarce. The Igor Manual itself is excellent but it’s many, many pages long and is only meant to be consulted. So I was intrigued whether Martin Schmid’s book would fill the gap between Getting Started and more advanced guides.

What makes Igor Pro so fantastic is the way that you can use it for so many different things: image processing, statistics, graphing, curve fitting, instrument control and so on. Part of the challenge of writing a book on Igor Programming is deciding what to cover. Schmid deals with this by covering basic programming and core-intermediate topics such as dialogs, loops, string magic etc. The book stops short of any specialised applications. So it’s a really useful intermediate programming guide. It’s a great little book and is recommended for those who want to dig further after doing the Getting Started exercises.

I knew I would learn something from the book because there’s always alternative ways to do stuff in Igor: things that you didn’t know about or little tricks to do stuff faster. What I was surprised about was the first thing mentioned in the book was new to me. The author favours module-static programming. All of my Igor programming has been done in the global pragma and I have avoided this more C-like way of encapsulating programs that I’ve written so far. Module-static works well because it eliminates naming conflicts. I have dealt with name conflicts by using static functions which are called from the top of the stack, and the top has a unique name (arguably this is the same as module-static, but not identical). As the Igor Manual says “this gets tedious after a while” and that’s true. Although in my defence, name conflicts are generally not a problem for the way I work because I favour a reproducible approach. A new experiment is started – one user-written ipf is opened – and the code is run. This means naming conflicts are minimised. The book has actually convinced me that module-static is a good thing, especially since my Igor code is now deployed around the lab and naming conflicts could easily become a problem. It’s an advanced programming technique but is dealt with early by the author and it kind of works. After this, more basic programming topics are covered in depth.

There’s always room for improvement: there are several example programs at the back which need to be rekeyed to run, since this is a paper book and no electronic version is available AFAIK. The author has put one up here to save rekeying and another here, but otherwise you need to type in the examples to see what will happen. This is too long-winded. I’ve been spoiled reading texts about R where the examples can all be run from a markdown file inside RStudio. It would’ve been nice if the code was made available for this book. I don’t think it would compromise the value of the book since it is the text that is most valuable.

The book is available at Amazon for £7.99 at the time of writing.

My Blank Pages is a track by Velvet Crush. This is an occasional series of book reviews.

## Rip It Up: Grabbing movies from Twitter for use in ImageJ

Some great scientific data gets posted on Twitter. Sometimes I want to take a closer look and this post describes a strategy to do so.

Edit: I received a request to take down the 3D volume images derived from the example dataset I used in this post. I’ve edited the post below so that is now a general guide.

Grab the video

It can be a bit difficult to the grab video from Twitter. The best way I’ve found is using youtube-dl. This works for downloading video and audio from YouTube to view offline, but it also works for other embedded video content on other websites.

To download the video use:

youtube-dl -o '%(title)s.%(ext)s' https://twitter.com/username/status/tweetID

this downloads an mp4 file which is automatically named.

Convert to avi

Now, mp4 is a compressed file format which cannot be read directly by FIJI/ImageJ. Conversion to avi means that the file can be loaded. I like to use another command line tool, ffmpeg for video conversions.

ffmpeg -i originalFile.mp4 -pix_fmt nv12 -f avi -vcodec rawvideo convertedFile.avi

Now we have an avi file called convertedFile.avi that we can use.

Load into FIJI

The avi can be loaded into FIJI. At this point you can analyse the video. However, in the case of the video I was interested in, the data had been pseudocolored and was now in RGB format. I wanted to look at the original data. Converting to grayscale does not give the correct representation but conversion back to grayscale is possible if you know the LUT was applied. Even if you don’t, it’s possible to take a guess at the LUT and do the conversion.

Converting RGB to original values

I found a nice gist that does the conversion for a single image. I just modified this code to work for a stack. It requires the LUT to be displayed vertically in a window called LUT. Caution: this code runs very slowly because every pixel in every slice needs to be recalculated and ImageJ is slow… I took a guess that mpl-inferno was used (I don’t think is exactly right but it worked well enough). You can display the built-in LUTs in FIJI using Color > Display LUTs… and from there you can make the LUT window which the macro uses for the calculation. The macro to convert stacks to grayscale using the LUT is here.

I had a nice grayscale version of the data (inverted because I wanted to look at the volume). This let me see how the layers in the original video add together to make the full structure. I used ClearVolume which can be installed via Update Site in FIJI. I just made a quick video to show it in action (see below). You’ll have to take my word for it (video removed).

So extracting scientific data from Twitter or another online source is pretty straightforward. The extra complication was getting rid of the pseudocoloring, but once this was done, something very close to the original data was available.  Nonetheless this workflow is a fun way to take a closer look at some of the cool movies that people post on Twitter. I hope you find it useful.

The post title comes from “Rip It Up” by Orange Juice. A popular title in my library with versions from several different artists. I was thinking what is described is similar to ripping video content.

## Rollercoaster III: yet more on Google Scholar

In a previous post I made a little R script to crunch Google Scholar data for a given scientist. The graphics were done in base R and looked a bit ropey. I thought I’d give the code a spring clean – it’s available here. The script is called ggScholar.R (rather than gScholar.R). Feel free to run it and raise an issue or leave a comment if you have some ideas.

I’m still learning how to get things looking how I want them using ggplot2, but this is an improvement on the base R version.

As described earlier I have many Rollercoaster songs in my library. This time it’s the song and album by slowcore/dream pop outfit Red House Painters.

## Ten Years vs The Spread: Calculating publication lag times in R

There have been several posts on this site about publication lag times. You can read them here. Lag times are the delays in the dissemination of scientific data introduced by the process of publishing the paper in a journal. Nowadays, your paper can be online in a few hours using a preprint server. However, this work is not peer reviewed. Journals organise a formal peer review and provide some sort of certification of the work. They typeset the work and all of this adds delays the dissemination of work in a journal.

To look at publication delays, you can use PubMed data, which is incomplete but can give insight into how long these delays can be. Previous posts have involved the use of a ruby script to make a csv file from PubMed XML output and then use this in Igor to calculate the publication lag times. There is another method detailed in this excellent post by Daniel Himmelstein.

I recently posted a figure for Nature Communications lag times on Twitter and was asked to generate others. I figured that I should write an R script and people can make their own!

The PubMedLagR code is available here with instructions for use.

A query for Nature Communications data at PubMed, such as:

nat commun[ta] AND 2000 : 2018[pdat] AND journal article[pt]

Retrieves all paper for this journal. The range from 2010 to 2018 is for illustration, this journal has only been in operation for these years. Filtering for journal articles rather and attempting to get rid of reviews and front matter is wise, but doesn’t always work. Again this journal doesn’t carry this material so this is for illustration. Getting your query right is very important.

Save the results in XML format and then run the R script as directed. This should give a csv of the data and a png of the lag times.

This is data from Nature Communications. Colleagues had two separate papers accepted at this journal and experienced long delays. I was interested to see if papers were generally taking longer to publish here. Of course we do not know why. Delays are partly the fault of the authors, the reviewers and the journal and it is not possible to say why publication lag times are increasing for this journal year-on-year. The journal has grown in terms of number of papers published, has this introduced inefficiencies? Are reviewers being slow to review? Are they being more demanding? Are Editors not marshalling the referee reports and providing clear guidance to authors? Allowing too much time and too many rounds of revision? Are authors being too slow to do further experimental work? The answer will be yes to some of these questions for some of the papers.

This is not to focus on Nature Communications, it’s one of a few journals that many colleagues complain is too slow to publish their work. With this code you can have a look at the journal you are interested in submitting to and consider whether there is a more rapid venue for your work.

Update:

I changed the code slightly and prettified the plots just a little. Below are some plots for Nature Cell Biology, Nature Neuroscience. I also did a search for clathrin or CRISPR papers over the same time period. These keyword searches are fairly flat, whereas the journal-specific increase in publication lag time can be seen.

The lag times at Nature Neuroscience look artificially low and then seem to have jumped up in 2016 to be something similar to Nature Cell Biology or Nature Communications.

Edit

I neglected to point out that the code truncates the y-axis in the bottom right plot to 1000 days or the maximum lag time, whichever is smaller. This is because it gets difficult to see the data points if there is an outlier, which might be due to an error in PubMed data.

A reader commented on Twitter that some poor paper had a lag time almost 1000 days. Well, due to the y-axis truncation we don’t see that 9 papers in Nature Communications since 2010 have lag times (RecAcc) of > 1000 days. The record holder has a lag time of 1561 days! I checked that this was not a PubMed error by looking at the dates on the paper.

Notes

Date information is not available in PubMed for every paper unfortunately. This is especially true of older papers.

The date information is supplied to PubMed from the journal. These dates are not necessarily accurate: 1) you can see occasional errors in the data, 2) journals sometimes “reset the clock” on papers and treat resubmissions as new submissions.

The post title is taken from “10 Years vs The Spread” by Wing-Tipped Sloat from the LP Chewyfoot. Obviously the song has nothing to do with smoothed kernel density estimates of journal publication lag times, but the title was incredibly apt.