Scoop: some practical advice

So quantixed occasionally gets correspondence from other researchers asking for advice. A recent email came from someone who had been “scooped”. What should they do?

Before we get into this topic we have to define what we mean by being scooped.

In the most straightforward sense being scooped means that an article appeared online before you managed to get your article online.

You were working on something that someone else was also working on – maybe you knew about this or not and vice versa – but they got their work out before you did. They are the scooper and you are the scoopee.

There is another use of the term, primarily in highly competitive fields, which defines scooping as an act in which the scooper has gained some unfair advantage. In the worst case, this can be done by receiving your article to review confidentially and then delaying your work while using your information to accelerate their own (Ginsparg, 2016).

However it happens, the scoop can be classified as an overscoop or an underscoop. An overscoop is where the scooper has much more data and a far more complete story. Maybe the scooper’s paper appears in a high-profile journal while the scoopee was planning on submitting to a less-selective one. Perhaps the scooper has the cell data, an animal model, the biochemical data and a crystal structure, while the scoopee had some nice data in cells and a bit of biochemistry. An underscoop is where a key observation that the scoopee was building into a full paper is partially revealed. The scoopee could have more data or better quality results and maybe the full mechanism, but the scooper’s paper gives away a key detail (Mole, 2004).

All of these definitions are different from the journalistic one, in which “the scoop” simply means the big story. What the scientific and journalistic terms share is the belief that being second with a story is worthless. In science, being second and getting the details right is valuable, and it deserves more weight than it currently gets. I think follow-up work is valued by the community, but it is fair to say that it is unlikely to receive the same billing and attention as the scooper’s paper.

How often does scooping actually happen?

To qualify as being scooped, you need to have a paper that you are preparing for publication when the other paper appears. If you are not at that point, someone else was just working on something similar and they’ve published a paper. They haven’t scooped you. This is easiest to take when you have just had an idea or have maybe done a few experiments and then you see a paper on the same thing. It must’ve been a good idea! The other paper has saved you some time! Great. Move on. The problem comes when you have invested a lot of time doing a whole bunch of work and then the other paper appears. This is very annoying, but to reiterate, you haven’t really been scooped if you weren’t actually at the point of preparing your work for publication.

As you might have gathered, I am not even sure scooping is a real thing. For sure the fear of being scooped is real. And there are instances of scooping happening. But most of the time the scoopee has not actually been scooped. And even then, the scoopee does not just abandon their work.

So what is the advice to someone who has discovered that they have been scooped?

Firstly, don’t panic! The scooper’s paper is not going to go away and you have to deal with the fact that you now have the follow-up paper. It can be hard to change your mindset, but you must rewrite your paper to take their work into account. Going into denial mode and trying to publish your work as though the other paper doesn’t exist is a huge mistake.

Second, read their work carefully. I doubt that the scooper has left you with no room for manoeuvre. Even in the case of an overscoop, you probably still have something that the other paper doesn’t, which you can salvage. There are bound to be some details on which your work does not agree, and these can feature in your paper. If it’s an underscoop, you have even less to worry about. There will be a way forward – you just need to identify it and move on.

The main message is that “being scooped” is not the end. You just need to figure out your way forward.

How do I stop it from happening to me?

Be original! It’s a truism that if you are working on something interesting, it’s likely that someone else is too. And if you work in a highly competitive area, there might be many groups working on the same thing and it is more likely that you will be scooped. Some questions are obvious next steps and it might be worth thinking twice about pursuing them. This is especially true if you come up with an idea based on a paper you’ve read. Work takes so long to appear that the lab who published that paper is likely far ahead of you.

Having your own niche gives the best protection. If you have carved out your own question you probably have the lead and will be associated with work in this area anyway. Other labs will back off. If you have a highly specialised method, again you can contribute in ways that others can’t and so your chances of being scooped decrease.

Have a backup plan. Do you have a side project which you can switch to if too much novelty is taken away from your main project? You can insulate yourself from scoop damage by not working on projects that are all-or-nothing. Horror stories about scooping in structural biology (which is all about “the big reveal”) are commonplace. Investing energy in alternative approaches or new assays as well as getting a structure might help here.

If you find out about competition, maybe from a poster or a talk at a meeting, you need to evaluate whether it is worth carrying on. If you can, talk to the other lab. Most labs do not want to compete and would prefer to collaborate or at least co-ordinate submission of manuscripts.

Use preprints! If you deposit your work on a preprint server, you get a DOI and a date stamp. You can prove that your work existed on that date and in what form. This is ultimate protection against being scooped. If someone else’s work appears online before you do this, then as I said above, you haven’t really been scooped. If work appears and you already have a DOI, well, then you haven’t been scooped either. Some journals see things this way. For example, EMBO J have a scoop protection policy that states that the preprint deposition timestamp is the date at which priority is assessed.

The post title is taken from “Scoop” by The Auctioneers. I have this track on an extended C86 3-Disc set.

In a Word: LaTeX to Word and vice versa

Here’s a quick tech tip. We’ve been writing papers in TeX recently, using Overleaf as a way to write collaboratively. This works great, but sometimes a Word file is required by the publisher. So how do you convert from one to the other quickly and with the least hassle?

If you Google this question (as I did), you will find a number of suggestions which vary in the amount of effort required. Methods include latex2rtf or pandoc. Here’s what worked for me:

  • Exporting the TeX file as PDF from Overleaf
  • Opening it in Microsoft Word
  • That was it!

OK, that wasn’t quite it. It did not work at all on a Mac; I had to use a Windows machine running Word. The formatting was maintained and the pictures imported OK. Note that this was a short article with three figures and hardly any special notation (it’s possible this doesn’t work as well on more complex documents). A couple of corrections were needed: hyphenation at the end of a line was deleted during the import, which borked genuinely hyphenated words that happened to span two lines; and the units generated by siunitx were missing a space between the number and the unit. Otherwise it was pretty straightforward. So straightforward that I thought I’d write a quick post in case it helps other people.

What about going the other way?

Again, on Windows, I used Apache OpenOffice to open my Word document and save it as an .odt file. I then used the writer2latex filter to make a .tex file with all the embedded images saved in a folder. These could then be uploaded to Overleaf. With a bit of formatting work, I was up and running.
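
For anyone who prefers a scriptable route, here is a minimal sketch using pandoc (mentioned above), driven from Python. The file names are placeholders and I haven’t tested this on anything complex, so treat it as a starting point rather than a recipe.

```python
# Minimal sketch: convert between LaTeX and Word with pandoc.
# Assumes pandoc is installed and on the PATH; file names are placeholders.
import subprocess

# LaTeX -> Word (figures referenced in the .tex must be available locally)
subprocess.run(["pandoc", "main.tex", "-o", "main.docx"], check=True)

# Word -> LaTeX, extracting any embedded images into ./media
subprocess.run(
    ["pandoc", "manuscript.docx", "--extract-media=media", "-o", "manuscript.tex"],
    check=True,
)
```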

I had heard that many publishers, even those that say they accept manuscripts as TeX files, actually require a Word document for typesetting. This is because, I guess, they have workflows set up to make the publisher version which must start with a Word document and nothing else. What’s more worrying is that in these cases, if you don’t supply one, they will convert it for you before putting it into the workflow. It’s probably better to do this yourself and check the conversion to reduce errors at the proof stage.

The post title is taken from “In A Word” the compilation album by Nottingham noise-rockers Fudge Tunnel.

Some Things Last A Long Time II

Back in 2014, I posted an analysis of the time my lab takes to publish our work. That post is very popular, probably because it looks at the total time it takes us to publish. It was time for an update, so here is the latest version.

The colours have changed a bit but again the graphic shows the journey to publication in four “eras”:

  1. Pre-time (before 0 on the x-axis): this is the time from first submission to the first journal. A dark time which involves rejection.
  2. Submission at the final journal (starting at time 0). Again, the lime-coloured periods are when the manuscript is with the journal and the green ones, when it is with us (being revised).
  3. Acceptance! This is where the lime bar stops. The manuscript is then readied for publication (blank area).
  4. Published online. A red period that ends with final publication in print.

Since 2013 we have been preprinting our work, which means that the manuscript is available while it is under review. This means that the journey to publication only delays the work appearing in the journal, not its use by other scientists. If you want to find out more about preprints in biology, check out my posts here and here.

The mean time from first submission to the paper appearing online in the journal is 226 days (median 210). Which is shorter than the last time I did this analysis (250 days). Sadly though we managed to set a new record for longest time to publication with 450 days! This is sad for the first author concerned who worked hard (259 days in total) revising the paper when she could have been doing other stuff. It is not all bad though. That paper was put up on bioRxiv the day we first submitted it so the pain is offset somewhat.
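
For anyone wanting to do a similar analysis on their own papers, the calculation is just date arithmetic. Here is a minimal sketch in Python using made-up dates, not our actual dataset:

```python
# Sketch of the lag-time calculation with made-up dates (not our real data).
from datetime import date
from statistics import mean, median

# (first submission, online in journal) for a few hypothetical papers
papers = [
    (date(2015, 3, 2), date(2015, 10, 1)),
    (date(2016, 1, 11), date(2016, 8, 30)),
    (date(2016, 5, 23), date(2016, 11, 14)),
]

lags = [(online - submitted).days for submitted, online in papers]
print(f"mean lag: {mean(lags):.0f} days, median lag: {median(lags):.0f} days")
```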

What is not shown in the graphic is the other papers that are still making their way through the process. These manuscripts will change the stats again likely pushing up the times. As I said in the last post, I think the delays we experience are pretty typical for our field and if anything, my group are quite quick to publish.

If you’d like to read more about publication lag times see here.

Thanks to Jessica Polka for nudging me to update this post.

The post title comes again from Daniel Johnston’s track “Some Things Last A Long Time” from his “1990” LP.

Fusion confusion: new paper on FGFR3-TACC3 fusions in cancer

We have a new paper out! This post is to explain what it’s about.

Cancer cells often have gene fusions. This happens because the DNA in cancer cells is really messed up. Sometimes chromosomes can break and get reattached to a different one in a strange way. This means you get a fusion between one gene and another, which makes a new gene, called a gene fusion. There are famous fusions that are known to cause cancer, such as the Philadelphia chromosome in chronic myelogenous leukaemia. This rearrangement of chromosomes 9 and 22 results in a fusion called BCR-ABL. There are lots of different gene fusions and a few years ago, a new fusion was discovered in bladder and brain cancers, called FGFR3-TACC3.

Genes encode proteins and proteins do jobs in cells. So the question is: how are the proteins from gene fusions different to their normal versions, and how do they cause cancer? Many of the gene fusions that scientists have found result in a protein that continues to send a signal to the cell when it shouldn’t. It’s thought that this transforms the cell to divide uncontrollably. FGFR3-TACC3 is no different. FGFR3 can send signals and the TACC3 part probably makes it do this uncontrollably. But, what about the TACC3 part? Does that do anything, or is this all about FGFR3 going wrong?

What is TACC3?

Chromosomes getting shared to the two daughter cells

TACC3, or transforming acidic coiled-coil protein 3 to give it its full name, is a protein important for cell division. It helps to share the chromosomes to the two daughter cells when a cell divides. Chromosomes are shared out by a machine built inside the cell called the mitotic spindle. This is made up of tiny threads called microtubules. TACC3 stabilises these microtubules and adds strength to this machine.

We wondered if cancer cells with FGFR3-TACC3 had problems in cell division. If they did, this might be because the TACC3 part of FGFR3-TACC3 is changed.

We weren’t the first people to have this idea. The scientists that found the gene fusion suggested that FGFR3-TACC3 might bind to the mitotic spindle but not be able to work properly. We decided to take a closer look…

What did you find?

First of all, FGFR3-TACC3 is not actually bound to the mitotic spindle. It is at the cell membrane and in small vesicles in the cell. So if it is not part of the mitotic spindle, how can it affect cell division? One unusual thing about TACC3 is that it is a dimer, meaning two TACC3s are stuck together. Stranger than that, these dimers can stick to more dimers and multimerise into a much bigger protein. When we looked at the normal TACC3 in the cell, we noticed that the amount bound to the spindle had decreased. We wondered whether the FGFR3-TACC3 was hoovering the normal TACC3 off the spindle, preventing normal cell division.

We made the cancer cells express a bit more normal TACC3 and this rescued the faulty division. We also got rid of the FGFR3-TACC3 fusion, and that also put things back to normal. Finally, we made a fake FGFR3-TACC3 which had a dummy part in place of FGFR3 and this was just as good at hoovering up normal TACC3 and causing cell division problems. So our idea seemed to be right!

What does this mean for cancer?

This project was to look at what is going on inside cancer cells and it is a long way from any cancer treatments. Drug companies can develop chemicals which stop cell signalling from fusions, and these could work as anti-cancer agents. In the case of FGFR3-TACC3, what we are saying is: even if you stop the signalling, there will still be cell division problems in the cancer cells. So an ideal treatment might be to block TACC3 interactions as well as stopping the signalling. This is very difficult to do and is far in the future. Doing work like this is important to understand all the possible ways to tackle a specific cancer and to find any problems with potential treatments.

The people

Sourav Sarkar did virtually all the work for this paper and he is first author. Sourav left the lab before we managed to submit this paper and so the revision experiments requested by the peer reviewers were done by Ellis Ryan.

Why didn’t we post this paper as a preprint?

My group have generally been posting our new manuscripts as preprints while they undergo peer review, but we didn’t post this one. I was reluctant because many cancer journals at the time of submission did not allow preprints. This has changed a bit in the last few months, but back in February several key cancer journals did not accept papers that had appeared first as preprints.

The title of the post comes from “Fusion Confusion” 4th track on the Hazy EP by Dr Phibes & The House of Wax Equations.

Parallel lines: new paper on modelling mitotic microtubules in 3D

We have a new paper out! You can access it here.

The people

This paper really was a team effort. Faye Nixon and Tom Honnor are joint-first authors. Faye did most of the experimental work in the final months of her PhD and Tom came up with the idea for the mathematical modelling and helped to rewrite our analysis method in R. Other people helped in lots of ways. George did extra segmentation, rendering and movie making. Nick helped during the revisions of the paper. Ali helped to image samples… the list is quite long.

The paper in a nutshell

We used a 3D imaging technique called SBF-SEM to see microtubules in dividing cells, then used computers to describe their organisation.

What’s SBF-SEM?

Serial block face scanning electron microscopy. This method allows us to take an image of a cell and then remove a tiny slice, take another image and so on. We then have a pile of images which covers the entire cell. Next we need to put them back together and make some sense of them.

How do you do that?

We use a computer to track where all the microtubules are in the cell. In dividing cells – in mitosis – the microtubules are in the form of a mitotic spindle. This is a machine that the cell builds to share the chromosomes to the two new cells. It’s very important that this process goes right. If it fails, mistakes can lead to diseases such as cancer. Before we started, it wasn’t known whether SBF-SEM had the power to see microtubules, but we show in this paper that it is possible.

We can see lots of other cool things inside the cell too, like chromosomes, kinetochores, mitochondria and membranes. We made many interesting observations in the paper, although the focus was on the microtubules.

So you can see all the microtubules, what’s interesting about that?

The interesting thing is that our resolution is really good, and is at a large scale. This means we can determine the direction of all the microtubules in the spindle and use this for understanding how well the microtubules are organised. Previous work had suggested that proteins whose expression is altered in cancer cause changes in the organisation of spindle microtubules. Our computational methods allowed us to test these ideas for the first time.

Resolution at a large scale, what does that mean?

The spindle is made of thousands of microtubules. With a normal light microscope, we can see the spindle but we can’t tell individual microtubules apart. There are improvements in light microscopy (called super-resolution) but even with those improvements, right in the body of the spindle it is still not possible to resolve individual microtubules. SBF-SEM can do this. It doesn’t have the best resolution available though. A method called Electron Tomography has much higher resolution. However, to image microtubules at this large scale (meaning for one whole spindle), it would take months or years of effort! SBF-SEM takes a few hours. Our resolution is better than light microscopy, worse than electron tomography, but because we can see the whole spindle and image more samples, it has huge benefits.

What mathematical modelling did you do?

Cells are beautiful things but they are far from perfect. The microtubules in a mitotic spindle follow a pattern, but don’t do so exactly. So what we did was to create a “virtual spindle” where each microtubule had been made perfect. It was a bit like “photoshopping” the cell. Instead of straightening the noses of actresses, we corrected the path of every microtubule. How much photoshopping was needed told us how imperfect the microtubule’s direction was. This measure – which was a readout of microtubule “wonkiness” – could be done on thousands of microtubules and tell us whether cancer-associated proteins really cause the microtubules to lose organisation.
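
To give a flavour of the idea, here is a toy sketch in Python: for each microtubule, take the vector between its two endpoints and measure the angle it makes with the spindle’s pole-to-pole axis. This is not the analysis pipeline from the paper (which was written in R); the data and numbers below are invented purely for illustration.

```python
# Toy illustration of a "wonkiness" readout: angle between each microtubule's
# end-to-end vector and the spindle pole-to-pole axis. Invented data; this is
# not the analysis pipeline from the paper (which was written in R).
import numpy as np

rng = np.random.default_rng(0)
pole_axis = np.array([1.0, 0.0, 0.0])        # assume the poles lie along x

# invent endpoints for 1000 fake microtubules, roughly aligned with the axis
starts = rng.uniform(0, 10, size=(1000, 3))
ends = starts + np.array([3.0, 0.0, 0.0]) + rng.normal(0, 0.5, size=(1000, 3))

vectors = ends - starts
unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
cosines = np.abs(unit @ pole_axis)           # ignore microtubule polarity
angles = np.degrees(np.arccos(np.clip(cosines, -1.0, 1.0)))

print(f"median deviation from the pole axis: {np.median(angles):.1f} degrees")
```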

The publication process

The paper is published in Journal of Cell Science and it was a great experience. Last November, we put up a preprint on this work and left it up for a few weeks. We got some great feedback and modified the paper a bit before submitting it to a journal. One reviewer gave us a long list of useful comments that we needed to address. However, the other two reviewers didn’t think our paper was a big enough breakthrough for that journal. Our paper was rejected*. This can happen sometimes and it is frustrating as an author because it is difficult for anybody to judge which papers will go on to make an impact and which ones won’t. One of the two reviewers thought that because the resolution of SBF-SEM is lower than electron tomography, our paper was not good enough. The other one thought that because SBF-SEM will not surpass light microscopy as an imaging method (really!**) and because EM cannot be done live (the cells have to be fixed), it was not enough of a breakthrough. As I explained above, the power is that SBF-SEM is between these two methods. Somehow, the referees weren’t convinced. We did some more work, revised the paper, and sent it to J Cell Sci.

J Cell Sci is a great journal published by the Company of Biologists, a not-for-profit organisation that puts a lot of money back into cell biology in the UK. They are preprint friendly, they allow the submission of papers in any format, and most importantly, they have a fast-track*** option. This allowed me to send on the reviews we had, along with our response to them. They sent the paper back to the reviewer who had the list of useful comments, and they were happy with the changes we made. It was accepted just 18 days after we sent it in and it was online 8 days later. I’m really pleased with the whole publishing experience with J Cell Sci.


* I’m writing about this because we all have papers rejected. There’s no shame in that at all. Moreover, it’s obvious from the dates on the preprint and on the JCS paper that our manuscript was rejected from another journal first.

** Anyone who knows something about microscopy will find this amusing and/or ridiculous.

*** Fast-track is offered by lots of journals nowadays. It allows authors to send in a paper that has been reviewed elsewhere, together with the peer review file. How the paper has been revised in light of those comments is assessed by the Editor and one peer reviewer.

Parallel lines is of course the title of the seminal Blondie LP. I have used this title before for a blog post, but it matches the topic so well.

Tips from the blog XI: Overleaf

I was recently an external examiner for a PhD viva in Cambridge. As we were wrapping up, I asked “if you were to do it all again, what would you do differently?”. It’s one of my stock questions and normally the candidate says “oh I’d do it so much quicker!” or something similar. However, this time I got a surprise. “I would write my thesis in LaTeX!”, was the reply.

As a recent convert to LaTeX I could see where she was coming from. The last couple of manuscripts I have written were done in Overleaf and have been a breeze. This post is my summary of the site.


I have written ~40 manuscripts and countless other documents using Microsoft Word for Mac, with EndNote as a reference manager (although I have had some failed attempts to break free of that). I’d tried and failed to start using TeX last year, motivated by seeing nicely formatted preprints appearing online. A few months ago I had a new manuscript to write with a significant mathematical modelling component and I realised that now was the chance to make the switch. Not least because my collaborator said “if we are going to write this paper in Word, I wouldn’t know where to start”.

I signed up for an Overleaf account. For those that don’t know, Overleaf is an online TeX writing tool, with the TeX source on one half of the screen and a rendered version of your manuscript on the other. The learning curve is quite shallow if you are used to any kind of programming or markup. There are many examples on the site and finding out how to do stuff is quick thanks to the LaTeX wikibooks and StackExchange.

Beyond the TeX, the experience of writing a manuscript in Overleaf is very similar to editing a blog post in WordPress.


The best thing about Overleaf is the ability to collaborate easily. You can send a link to a collaborator and then work on the document together. Using Word in this way can be done with Dropbox, but versioning and track changes often cause more problems than they’re worth, and most people still email Word versions to each other, which is a nightmare. Overleaf changes this by having a simple interface that can be accessed by multiple people. I have never used Google Docs for writing papers, but it does offer similar functionality.

All projects are private by default, but you can put your document up on the site if you want to. You might want to do this if you have developed an example document in a certain style.



Depending on the type of account you have, you can roll back changes. It is possible to ‘save’ versions, so if you get to a first draft and want to send it round for comment, you can save a version and then use this to go back to, if required. This is a handy insurance in case somebody comes in to edit the document and breaks something.

You can download a PDF at any point, or for that matter take all the files away as a zip. No more finalfinalpaper3final.docx…

If you’re keeping score, that’s Overleaf 2, Word nil.


Placing figures in the text is easy and all major formats are supported. What is particularly nice is that I can generate figures in an Igor layout, output directly to PDF and put that straight into Overleaf. In Word, the placement of figures can be fiddly: everyone knows the sensation of moving a picture slightly, only for it to disappear inexplicably onto another page. LaTeX will put the figure where you want it, or in the next best place. It just works.


Equations are where LaTeX excels. Microsoft Word has an equation editor, which has varied over the years from terrible to just-about-usable; the current version actually uses elements of TeX (I think). The support for mathematical text in LaTeX is amazing, which is not surprising since this is the way that most papers in maths are written. Any biologist will find their needs met here.
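
As an example, display maths like this (a made-up equation, just to show the notation, not something from one of our papers) takes seconds to write and renders beautifully:

```latex
% a made-up display equation to show the notation, not from any of our papers
\begin{equation}
  \frac{\mathrm{d}[MT]}{\mathrm{d}t} = k_{\mathrm{on}}[T]\,(N - [MT]) - k_{\mathrm{off}}[MT]
\end{equation}
```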

Templates and formatting

There are lots of templates available on Overleaf and many more on the web. For example, there are nice PNAS and PLoS formats as well as others for theses and for CVs and other documents. The typesetting is beautiful. Setting out sections/subsections and table of contents is easy. To be fair to Word, if you know how to use it properly, this is easy too, but the problem is that most people don’t, and also styles can get messed up too easily.


References work by adding a BibTeX file to your project. You can do this with any reference manager. Because I have a huge EndNote database, I used this initially. For another manuscript I’ve been working on, my student started out with a Mendeley library and we used that. It’s very flexible. It is slightly more fiddly than with Word and EndNote, but I’ve had so many problems (and crashes) with that combination over the years that any alternative is a relief.


You can set the view on the right to compile automatically or you can force updates manually. Either way the document must compile. If you have made a mistake, it will complain and try to guess what you have done wrong and tell you. Errors that prevent the document from being compiled are red. Less serious errors are yellow and allow compilation to go ahead. This can be slow going at first, but I found that I was soon up to speed with editing.


The preamble is the name for the stuff at the header of a TeX document. You can add in all kinds of packages to cover proper use of units (siunitx) or chemical notation (mhchem), and they all have great documentation. All the basics, e.g. referencing, are included in Overleaf by default.


The entire concept of Overleaf is to work online. Otherwise you could just use TeXshop or some other program. But how about times when you don’t have internet access? I was concerned about this at the start, but I found that in practice, these days, times when you don’t have a connection are very few and far between. However, I was recently travelling and wanted to work on an Overleaf manuscript on the aeroplane. Of course, with Word, this is straightforward.

With Overleaf it is possible. You can do two things. The first is to download your files ahead of your period of internet outage; you can then edit your main.tex document in an editor of your choice. The second option is more sophisticated: you can clone your project with git and then work on that local clone. The instructions for how to do that are here (the instructions, from 2015, say it’s in beta, but it’s fully working). You can work on your document locally and then push changes back to Overleaf when you have access once more.


OK, nothing is perfect, and I noticed that typos and grammatical errors are more difficult for me to detect in Overleaf. I think this is because I am conditioned by years of Word use. The dictionary is smaller than in Word and it doesn’t try to correct your grammar like Word does (although this is probably a good thing!). Maybe I should try the rich text view and see if that helps. I guess the other downside is that the other authors need to know TeX rather than Word. As described above, if you are writing with a mathematician, this is not a problem. For biologists, though, this could be a challenge.

Back to the PhD exam

I actually think that writing a thesis is probably a once-in-a-lifetime chance to understand how Microsoft Word (and EndNote) really works. The candidate explained that she didn’t trust Word enough to do everything right, so her thesis was made of several different documents that were fudged to look like one long thesis. I don’t think this is that unusual. She explained that she had used Word because her supervisor could only use Word and she had wanted to take advantage of the Review tools. Her heart had sunk when her supervisor simply printed out drafts and commented using a red pen, meaning that she could have done it all in LaTeX and it would have been fine.


I have been totally won over by Overleaf. It beats Microsoft Word in so many ways… I’ll stick to Word for grant applications and other non-manuscript documents, but I’m going to keep using Overleaf for manuscripts, with the exception of papers written with people who will only use Word.

The Arcane Model

I’m currently writing two manuscripts that each have a substantial data modelling component. Some of our previous papers have included computer code, but it was straightforward enough to have the code as a supplementary file or in a GitHub repo and leave it at that. Now with more substantial computation in the manuscript, I was wondering how best to describe it. How much detail is required?

How much explanation should be in the main text, how much is in supplementary information and how much is simply via commenting in the code itself?

I asked for recommendations for excellent cell biology papers that had a modelling component, where the computation was well described.

I got many replies and I’ve collated this list of papers below so that I can refer to them and in case it is useful for anyone who is also looking for inspiration. I’ve added the journal names only so that you can see what journals are interested in publishing cell biology with a computational component. Here they are, in no particular order:

  • This paper on modelling kinetochore-microtubule attachment in pombe, published in JCB. There is also a GitHub repo for the software, kt_simul, written in Python. The authors used commenting and also put a PDF of the heavy detail on GitHub.
  • Modelling of signalling networks here in PLoS Comput Biol.
  • This paper using Voronoi tessellations to examine tissue packing of cells in EMBO J.
  • Two papers, this one in JCB featuring modelling of DNA repair and this one in Curr Biol on photoreceptors in flies.
  • Cell movements via depletion of chemoattractants in PLoS Biol.
  • Protein liquid droplets as organising centres for biochemical reactions is a hot topic. This paper in Cell was recommended.
  • The final tip was to look at PLoS Comput Biol for inspiration, searching for cell biology topics – papers like this one on Smoldyn 2.1.

Thanks to Hadrien Mary, Robert Insall, Joachim Goedhart, Stephen Floor, Jon Humphries, Luis Escudero, and Neil Saunders for the suggestions.

The post title is taken from “The Arcane Model” by The Delgados from their album Peloton.

Everything In Its Right Place

Something that has driven me nuts for a while is the bug in FIJI/ImageJ when making montages of image stacks. This post is about a solution to this problem.

What’s a montage?

You have a stack of images and you want to array them in m rows by n columns. This is useful for showing a gallery of each frame in a movie or to separate the channels in a multichannel image.

What’s the bug/feature in ImageJ?

If you select Image>Stacks>Make Montage… you can specify how you want to lay out your montage. You can specify a “border” for this. Let’s say we have a stack of 12 images that are 300 x 300 pixels. Let’s arrange them into 3 rows and 4 columns with 0 border.

So far so good. We have an image that is 1200 x 900. But it looks a bit rubbish; we need some grouting (white pixel space between the images). We don’t need a border, but let’s ignore that for the moment. So the only way to do this in ImageJ is to specify a border of 8 pixels.

Looks a lot better. OK, there’s a border around the outside, which is no use, but it looks good. But wait a minute! Check out the size of the image (1204 x 904). This is only 4 pixels bigger in x and y, yet we added all that grouting. What’s going on?

The montage is not pixel perfect.


So the first image is not 300 x 300 any more. It is 288 x 288. Hmmm, maybe we can live with losing some data… but what’s this?


The next image in the row is not even square! It’s 292 x 288. How much this annoys you will depend on how much you like things being correct… The way I see it, this is science, if we don’t look after the details, who will? If I start with 300 x 300 images, it’s not too much to ask to end up with 300 x 300 images, is it? I needed to fix this.


I searched for a while for a solution. It had clearly bothered other people in the past, but I guess people just found their own workaround.

ImageJ solution for multichannel array

So for a multichannel image, where the grayscale images are arrayed next to the merge, I wrote something in ImageJ to handle this. These macros are available here. There is a macro for doing the separation and arraying. Then there is a macro to combine these into a bigger figure.

Igor solution

For the exact case described above, where large stacks need to be tiled out into an m x n array, I have to admit I struggled to write something for ImageJ and instead wrote something for IgorPRO. Specifying 3 rows, 4 columns and a grout of 8 pixels gives the correct TIFF of 1224 x 916, with each frame shown in full and square. The code is available here; it works for 8-bit greyscale and RGB images.
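
For anyone who doesn’t use Igor or ImageJ, the same pixel-perfect arithmetic is easy to reproduce elsewhere. Here is a minimal numpy sketch (not the Igor or ImageJ code linked above) that tiles a stack into an m x n array with a specified grout and no outer border:

```python
# Pixel-perfect montage sketch (not the Igor/ImageJ code linked above):
# tile a stack of identically sized frames into rows x cols with white grout
# between frames and no outer border.
import numpy as np

def montage(stack, rows, cols, grout=8, fill=255):
    n, h, w = stack.shape[:3]
    extra = stack.shape[3:]               # () for greyscale, (3,) for RGB
    out_h = rows * h + (rows - 1) * grout
    out_w = cols * w + (cols - 1) * grout
    out = np.full((out_h, out_w) + extra, fill, dtype=stack.dtype)
    for i in range(min(n, rows * cols)):
        r, c = divmod(i, cols)
        y, x = r * (h + grout), c * (w + grout)
        out[y:y + h, x:x + w] = stack[i]
    return out

# 12 frames of 300 x 300 in 3 rows x 4 columns with 8 px grout
frames = np.zeros((12, 300, 300), dtype=np.uint8)
print(montage(frames, rows=3, cols=4).shape)   # (916, 1224), i.e. 1224 x 916
```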


I might update the code at some point to make sure it can handle all data types and to allow labelling and adding of a scale bar etc.

The post title is taken from “Everything In Its Right Place” by Radiohead from album Kid A.

Voice Your Opinion: Editors shopping for preprints is the future

Today I saw a tweet from Manuel Théry (an Associate Editor at Mol Biol Cell), saying that he had heard that the Editor-in-Chief of MBoC, David Drubin, shops for interesting preprints on bioRxiv to encourage the authors to submit to MBoC. This is not a surprise to me. I’ve read previously that authors of preprints on bioRxiv have been approached by journal Editors (here and here; there are many more). I’m pleased that David is forward-thinking and that MBoC are doing this actively.

I think this is the future.

Why? If we ignore for a moment the “far future” which may involve the destruction of most journals, leaving a preprint server and a handful of subject-specific websites which hunt down and feature content from the server and co-ordinate discussions and overviews of current trends… I actually think this is a good idea for the “immediate future” of science and science publishing. Two reasons spring to mind.

  1. Journals would be crazy to miss out: The manuscripts that I am seeing on bioRxiv are not stuff that’s been dumped there with no chance of “real publication”. This stuff is high profile. I mean that in the following sense: the work in my field that has been posted is generally interesting, it is from labs that do great science, and it is as good as work in any journal (obviously). For some reason I have found myself following what is being deposited here more closely than at any real journal. Journals would be crazy to miss out on this stuff.
  2. Levelling the playing field: For better or worse, papers are judged on where they are published. The thing that bothers me most about this is that manuscripts are only submitted to one (or a few) journals before “finding their home”. This process is highly noisy, and it means that if we accept that there is a journal hierarchy, your paper may or may not be deserving of the kudos it receives in its resting place. If all journals actively scour the preprint server(s), the authors can then pick the “highest bidder”. This would make things fairer in the sense that all journals in the hierarchy had a chance to consider the paper, and its resting place may actually reflect its true quality.

I don’t often post opinions here, but I thought this would take more than 140 characters to explain. If you agree or disagree, feel free to leave a comment!

Edit @ 11:46 16-05-26 Pedro Beltrao pointed out that this idea is not new – see a post of his from 2007.

Edit 16-05-26 Misattributed the track to Extreme Noise Terror (corrected). Also added some links thanks to Alexis Verger.

The post title comes from “Voice Your Opinion” by Unseen Terror. The version I have is from a Peel sessions compilation “Hardcore Holocaust”.

If I Can’t Change Your Mind

I have written previously about Journal Impact Factors (here and here). The response to these articles has been great and earlier this year I was asked to write something about JIFs and citation distributions for one of my favourite journals. I agreed and set to work.

Things started off so well. A title came straight to mind. In the style of quantixed, I thought The Number of The Beast would be amusing. I asked for opinions on Twitter and got an even better one (from Scott Silverman, @sksilverman): Too Many Significant Figures, Not Enough Significance. Next, I found an absolute gem of a quote to kick off the piece. It was from the eminently quotable Sydney Brenner.

Before we develop a pseudoscience of citation analysis, we should remind ourselves that what matters absolutely is the scientific content of a paper and that nothing will substitute for either knowing it or reading it.

That quote was from a Loose Ends piece that Uncle Syd penned for Current Biology in 1995. Wow, 1995… that is quite a few years ago I thought to myself. Never mind. I pressed on.

There’s a lot of literature on JIFs, research assessment and in fact there are whole fields of scholarly activity (bibliometrics) devoted to this kind of analysis. I thought I’d better look back at what has been written previously. The “go to” paper for criticism of JIFs is Per Seglen’s analysis in the BMJ, published in 1997. I re-read this and I can recommend it if you haven’t already seen it. However, I started to feel uneasy. There was not much that I could add that hadn’t already been said, and what’s more it had been said 20 years ago.

Around about this time I was asked to review some fellowship applications for another EU country. The applicants had to list their publications, along with the JIF. I found this annoying. It was as if SF-DORA never happened.

There have been so many articles, blog posts and more written on JIFs. Why has nothing changed? It was then that I realised that it doesn’t matter how many things are written – however coherently argued – people like JIFs and they like to use them for research assessment. I was wasting my time writing something else. Sorry if this sounds pessimistic. I’m sure new trainees can be reached by new articles on this topic, but acceptance of JIF as a research assessment tool runs deep. It is like religious thought. No amount of atheist writing, no matter how forceful, cogent, whatever, will change people’s minds. That way of thinking is too deeply ingrained.

As the song says, “If I can’t change your mind, then no-one will”.

So I declared defeat and told the journal that I felt like I had said all that I could already say on my blog and that I was unable to write something for them. Apologies to all like-minded individuals for not continuing to fight the good fight.

But allow me one parting shot. I had a discussion on Twitter with a few people, one of whom said they disliked the “JIF witch hunt”. This caused me to think about why the JIF has hung around for so long and why it continues to have support. It can’t be that so many people are statistically illiterate or that they are unscientific in choosing to ignore the evidence. What I think is going on is a misunderstanding. Criticism of a journal metric as being unsuitable to judge individual papers is perceived as an attack on journals with a high JIF. Now, for good or bad, science is elitist and we are all striving to do the best science we can. For many scientists, striving for the best means aiming to publish in journals which happen to have a high JIF. So an attack on JIFs as a research assessment tool feels like an attack on what scientists are trying to do every day.

Because of this intense focus on high-JIF journals, what people don’t appreciate is that the reality is much different. The distribution of JIFs across journals is as skewed as the citation distributions that underlie the metric itself. What this means is that focussing on the minuscule fraction of papers appearing in high-JIF journals is missing the point. Most papers are in journals with low JIFs. As I’ve written previously, papers in journals with a JIF of 4 get similar citations to those in a journal with a JIF of 6. So the JIF tells us nothing about citations to the majority of papers, and it certainly can’t predict the impact of these papers, which are the majority of our scientific output.

So what about those fellowship applicants? All of them had papers in journals with low JIFs (<8), so their papers were indistinguishable in that respect. What advice would I give to people applying to such a scheme? Well, I wouldn’t advise withholding the information asked for. To be fair to the funding body, they also asked for the number of citations for each paper, but for papers that are only a few months old this number is nearly always zero. My advice would be to try and make sure that your paper is freely available for anyone to read. Many of the applicants’ papers were outside my expertise and so the title and abstract didn’t tell me much about the significance of the paper. So I looked at some of these papers to assess the quality of the data in there… if I had access. Applicants who had published in closed access journals were at a disadvantage here, because if I couldn’t download the paper it was difficult to assess what they had been doing.

I was thinking that this post would be a meta-meta-blogpost. Writing about an article which was written about something I wrote on my blog. I suppose it still is, except the article was never finished. I might post again about JIFs, but for now I doubt I will have anything new to say that hasn’t already been said.

The post title is taken from “If I Can’t Change Your Mind” by Sugar from their LP Copper Blue. Bob Mould was once asked about song-writing and he said that the perfect song was like a maths puzzle (I can’t find a link to support this, so this is from memory). If you are familiar with this song, songwriting and/or mathematics, then you will understand what he means.

Edit @ 08:22 16-05-20 I found an interview with Bob Mould where he says song-writing is like city-planning. Maybe he just compares song-writing to lots of different things in interviews. Nonetheless I like the maths analogy.