Ferrous: new paper on FerriTagging proteins in cells

We have a new paper out. It’s not exactly news, because the paper has been up on bioRxiv since December 2016 and hasn’t changed too much. All of the work was done by Nick Clarke when he was a PhD student in the lab. This post is to explain our new paper to a general audience.

The paper in a nutshell

We have invented a new way to tag proteins in living cells so that you can see them by light microscopy and by electron microscopy.

Why would you want to do that?

Proteins do almost all of the jobs in cells that scientists want to study. We can learn a lot about how proteins work simply by watching them down the microscope. We want to know their precise location. With light microscopy the cells are alive and we can watch the proteins move around. It’s a great method but it has low resolution, so seeing a protein’s precise location is not possible. We can overcome this limitation by using electron microscopy. This gives us higher resolution, but the proteins are stuck in one location. When we correlate images from one microscope to the other, we can watch proteins move and then look at them at high resolution. All we need is a way to label the proteins so that they can be seen in both types of microscope. We do this with tagging.

Tagging proteins so that we can see them by light microscopy is easy. A widely used method is to use a fluorescent protein such as GFP. We can’t see GFP in the electron microscope (EM) so we need another method. Again, there are several tags available but they all have drawbacks. They are not precise enough, or they don’t work on single proteins. So we came up with a new one and fused it with a fluorescent protein.

What is your EM tag?

We call it FerriTag. It is based on Ferritin which is a large protein shell that cells use to store iron. Because iron scatters electrons, this protein shell can be seen by EM as a particle. There was a problem though. If Ferritin is fused to a protein, we end up with a mush. So, we changed Ferritin so that it could be attached to the protein of interest by using a drug. This meant that we could put the FerriTag onto the protein we want to image in a few seconds. In the picture on the right you can see how this works to FerriTag clathrin, a component of vesicles in cells.

We can watch the tagging process happening in cells before looking by EM. The movie on the right shows green spots (clathrin-coated pits in a living cell) turning orange/yellow when we do FerriTagging. The cool thing about FerriTag is that it is genetically encoded. That means that we get the cell to make the tag itself and we don’t have to put it in from outside which would damage the cell.

What can you use FerriTag for?

Well, it can be used to tag many proteins in cells. We wanted to precisely localise a protein called HIP1R which links clathrin-coated pits to the cytoskeleton. We FerriTagged HIP1R and carried out what we call “contextual nanoscale mapping”. This is just a fancy way of saying that we could find the FerriTagged HIP1R and map where it is relative to the clathrin-coated pit. This allowed us to see that HIP1R is found at the pit and surrounding membrane. We could even see small changes in the shape of HIP1R in the different locations.

We’re using FerriTag for lots of projects. Our motivation to make FerriTag was so that we could look at proteins that are important for cell division and this is what we are doing now.

Is the work freely available?

Yes! The paper is available here under CC-BY licence. All of the code we wrote to analyse the data and run computer simulations is available here. All of the plasmids needed to do FerriTagging are available from Addgene (a non-profit company, there is a small fee) so that anyone can use them in the lab to FerriTag their favourite protein.

How long did it take to do this project?

Nick worked for four years on this project. Our first attempt at using ribosomes to tag proteins failed, but Nick then managed to get Ferritin working as a tag. This paper has broken our lab record for longest publication delay from first submission to final publication. The diagram below tells the whole saga.


The publication process was frustratingly slow. It took a few months to write the paper and then we submitted to the first journal after Christmas 2016. We got a rapid desk rejection and sent the paper to another journal and it went out for review. We had two positive referees and one negative one, but we felt we could address the comments and checked with the journal who said that they would consider a revised paper as an appeal. We did some work and resubmitted the paper. Almost six months after first submission the paper was rejected, but with the offer of a rapid (ha!) publication at Nature Communications using the peer review file from the other journal.

Hindsight is a wonderful thing, but I now regret agreeing to transfer the paper to Nature Communications. It was far from rapid. They drafted in a new reviewer who came with a list of new questions, as well as being slow to respond. Sure, a huge chunk of the delay was caused by us doing revision experiments (the revisions took longer than they should have because Nick defended his PhD, was working on other projects and also became a parent). However, the journal was really slow. The Editor assigned to our paper left the journal, which didn’t help, and the reviewer they drafted in took 6 and 7 weeks to respond in the two rounds. Particularly at the end, after the paper was ‘accepted in principle’ it took them three weeks to actually accept the paper (seemingly a week to figure out what a bib file is and another to ask us something about chi-squared tests). Then a further three weeks to send us the proofs, and then another three weeks until publication. You can see from the graphic that we sent back the paper in the third week of February and only incurred a 9-day delay ourselves, yet the paper was not published until July.

Did the paper improve as a result of this process? Yes and no. We actually added some things in the first revision cycle (for Journal #2) that got removed in subsequent peer review cycles! And the message in the final paper is exactly the same as in the version on bioRxiv, posted 18 months previously. So in that sense, no, it didn’t. It wasn’t a total waste of time though: the extra reviewer convinced us to add some new analysis which made the paper more convincing in the end. Was this worth an 18-month delay? You can download our paper and the preprint and judge for yourself.

Were we unlucky with this slow experience? Maybe, but I know other authors who’ve had similar (and worse) experiences at this journal. As described in a previous post, the publication lag times are getting longer at Nature Communications. This suggests that our lengthy wait is not unique.

There’s lots to like about this journal:

  • It is open access.
  • It has the Nature branding (which, like it or not, impresses many people).
  • The peer review file is made available.
  • The papers look great (in print and online).

But there are downsides too.

  • The APC for each paper is £3300 ($5200). Obviously open access must cost something, but there are cheaper OA journals available (albeit without the Nature branding).
  • Ironically, paying a premium for this reputation is complicated since the journal covers a wide range of science and its kudos varies depending on subfield.
  • It’s also slow, especially when you consider that papers have often been transferred there from somewhere else.
  • It’s essentially a mega journal, so your paper doesn’t get the same exposure as it would in a community-focused journal.
  • There’s the whole ReadCube/SpringerNature thing…

Overall it was a negative publication experience with this paper. Transferring a paper along with the peer review file to another journal has worked out well for us recently and has been rapid, but not this time. Please leave a comment, particularly if you’ve had a positive experience, to redress the balance.

The post title comes from “Ferrous” by Circle from their album Meronia.

Ten Years vs The Spread: Calculating publication lag times in R

There have been several posts on this site about publication lag times. You can read them here. Lag times are the delays in the dissemination of scientific data introduced by the process of publishing the paper in a journal. Nowadays, your paper can be online in a few hours using a preprint server. However, this work is not peer reviewed. Journals organise formal peer review and provide some sort of certification of the work. They also typeset the work, and all of this delays the dissemination of the work in a journal.

To look at publication delays, you can use PubMed data, which is incomplete but can give insight into how long these delays can be. Previous posts have involved the use of a ruby script to make a csv file from PubMed XML output and then use this in Igor to calculate the publication lag times. There is another method detailed in this excellent post by Daniel Himmelstein.

I recently posted a figure for Nature Communications lag times on Twitter and was asked to generate others. I figured that I should write an R script and people can make their own!

The PubMedLagR code is available here with instructions for use.

A query for Nature Communications data at PubMed, such as:

nat commun[ta] AND 2000 : 2018[pdat] AND journal article[pt]

This retrieves all the papers for this journal. The exact date range is just for illustration: the journal has only been in operation since 2010, so any range covering 2010 to 2018 will do. Filtering for journal articles and attempting to get rid of reviews and front matter is wise, but doesn’t always work; again, this journal doesn’t carry much of that material, so it is shown for illustration. Getting your query right is very important.

Save the results in XML format and then run the R script as directed. This should give a csv of the data and a png of the lag times.
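I won’t reproduce the PubMedLagR code here, but to give a rough idea of what a script like this does under the hood, below is a minimal sketch in R. It parses the saved XML, pulls the received/accepted/published dates out of each record and calculates the lag times. The file names and the xml2-based approach are illustrative assumptions; the real script does more, including the per-journal plots.

# Minimal sketch of the lag-time calculation (not the actual PubMedLagR code).
# Assumes the PubMed results were saved as pubmed_result.xml in the working directory.
library(xml2)

doc <- read_xml("pubmed_result.xml")
articles <- xml_find_all(doc, ".//PubmedArticle")

# pull one date of a given type (e.g. "received", "accepted") out of the article history
get_date <- function(article, type) {
  xpath <- paste0(".//History/PubMedPubDate[@PubStatus='", type, "']")
  y <- xml_text(xml_find_first(article, paste0(xpath, "/Year")))
  m <- xml_text(xml_find_first(article, paste0(xpath, "/Month")))
  d <- xml_text(xml_find_first(article, paste0(xpath, "/Day")))
  as.Date(paste(y, m, d, sep = "-"), format = "%Y-%m-%d")
}

received  <- as.Date(sapply(articles, get_date, type = "received"), origin = "1970-01-01")
accepted  <- as.Date(sapply(articles, get_date, type = "accepted"), origin = "1970-01-01")
published <- as.Date(sapply(articles, get_date, type = "pubmed"), origin = "1970-01-01")

lags <- data.frame(
  RecAcc = as.numeric(accepted - received),    # received-to-accepted lag in days
  RecPub = as.numeric(published - received)    # received-to-online lag in days
)
write.csv(lags, "lagtimes.csv", row.names = FALSE)

# a quick look at the distribution
png("lagtimes.png")
hist(lags$RecAcc, breaks = 50, xlab = "Received-to-accepted lag (days)", main = "")
dev.off()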

This is data from Nature Communications. Colleagues had two separate papers accepted at this journal and experienced long delays. I was interested to see if papers were generally taking longer to publish here. Of course we do not know why. Delays are partly the fault of the authors, the reviewers and the journal, and it is not possible to say why publication lag times are increasing for this journal year-on-year. The journal has grown in terms of the number of papers published; has this introduced inefficiencies? Are reviewers being slow to review? Are they being more demanding? Are Editors not marshalling the referee reports and providing clear guidance to authors? Allowing too much time and too many rounds of revision? Are authors being too slow to do further experimental work? The answer will be yes to some of these questions for some of the papers.

This is not to single out Nature Communications; it’s one of a few journals that many colleagues complain are too slow to publish their work. With this code you can have a look at the journal you are interested in submitting to and consider whether there is a more rapid venue for your work.

Update:

I changed the code slightly and prettified the plots just a little. Below are some plots for Nature Cell Biology and Nature Neuroscience. I also did a search for clathrin or CRISPR papers over the same time period. These keyword searches are fairly flat, whereas the journal-specific increase in publication lag time can be seen.

The lag times at Nature Neuroscience look artificially low and then seem to have jumped up in 2016 to be something similar to Nature Cell Biology or Nature Communications.

Edit

I neglected to point out that the code truncates the y-axis in the bottom right plot to 1000 days or the maximum lag time, whichever is smaller. This is because it gets difficult to see the data points if there is an outlier, which might be due to an error in PubMed data.

A reader commented on Twitter that some poor paper had a lag time of almost 1000 days. Well, due to the y-axis truncation we don’t see that 9 papers in Nature Communications since 2010 have lag times (RecAcc) of > 1000 days. The record holder has a lag time of 1561 days! I checked that this was not a PubMed error by looking at the dates on the paper.
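If you want to hunt for these outliers in your own query, the truncation and the check amount to something like the following sketch (assuming the lagtimes.csv produced above, with a RecAcc column of received-to-accepted lags in days).

# Sketch: reproduce the y-axis truncation and find the hidden outliers
lags <- read.csv("lagtimes.csv")

# truncate at 1000 days or the maximum lag time, whichever is smaller
y_max <- min(1000, max(lags$RecAcc, na.rm = TRUE))

sum(lags$RecAcc > 1000, na.rm = TRUE)   # how many papers are hidden by the truncation
max(lags$RecAcc, na.rm = TRUE)          # the record holder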

Notes

Date information is not available in PubMed for every paper unfortunately. This is especially true of older papers.

The date information is supplied to PubMed from the journal. These dates are not necessarily accurate: 1) you can see occasional errors in the data, 2) journals sometimes “reset the clock” on papers and treat resubmissions as new submissions.

The post title is taken from “10 Years vs The Spread” by Wing-Tipped Sloat from the LP Chewyfoot. Obviously the song has nothing to do with smoothed kernel density estimates of journal publication lag times, but the title was incredibly apt.

Scoop: some practical advice

So quantixed occasionally gets correspondence from other researchers asking for advice. A recent email came from someone who had been “scooped”. What should they do?

Before we get into this topic we have to define what we mean by being scooped.

In the most straightforward sense being scooped means that an article appeared online before you managed to get your article online.

You were working on something that someone else was also working on – maybe you knew about this or not and vice versa – but they got their work out before you did. They are the scooper and you are the scoopee.

There is another use of the term, primarily in highly competitive fields, which defines the act of scooping as one where the scooper has gained some unfair advantage in order to make the scoop. In the worst case, this can mean receiving your article to review confidentially and then delaying your work while using your information to accelerate their own (Ginsparg, 2016).

However it happens, a scoop can be classified as an overscoop or an underscoop. An overscoop is where the scooper has much more data and a far more complete story. Maybe the scooper’s paper appears in a high-profile journal while the scoopee was planning on submitting to a less-selective journal. Perhaps the scooper has the cell data, an animal model, the biochemical data and a crystal structure, while the scoopee had some nice data in cells and a bit of biochemistry. An underscoop is where a key observation that the scoopee was building into a full paper is partially revealed. The scoopee may have more data or better quality results and maybe the full mechanism, but the scooper’s paper gives away a key detail (Mole, 2004).

All of these definitions are different from the journalistic one, in which “the scoop” simply means the big story. What the scientific and journalistic terms share is the belief that being second with a story is worthless. In science, being second and getting the details right is valuable, and more weight should be given to this than currently is. I think follow-up work is valued by the community, but it is fair to say that it is unlikely to receive the same billing and attention as the scooper’s paper.

How often does scooping actually happen?

To qualify as being scooped, you need to have a paper that you are preparing for publication when the other paper appears. If you are not at that point, someone else was just working on something similar and they’ve published a paper. They haven’t scooped you. This is easiest to take when you have just had an idea or have maybe done a few experiments and then you see a paper on the same thing. It must’ve been a good idea! The other paper has saved you some time! Great. Move on. The problem comes when you have invested a lot of time doing a whole bunch of work and then the other paper appears. This is very annoying, but to reiterate, you haven’t really been scooped if you weren’t actually at the point of preparing your work for publication.

As you might have gathered, I am not even sure scooping is a real thing. For sure the fear of being scooped is real. And there are instances of scooping happening. But most of the time the scoopee has not actually been scooped. And even then, the scoopee does not just abandon their work.

So what is the advice to someone who has discovered that they have been scooped?

Firstly, don’t panic! The scooper’s paper is not going to go away and you have to deal with the fact that you now have the follow-up paper. It can be hard to change your mindset, but you must rewrite your paper to take their work into account. Going into denial mode and trying to publish your work as though the other paper doesn’t exist is a huge mistake.

Second, read their work carefully. I doubt that the scooper has left you with no room for manoeuvre. Even in the case of the overscoop, you probably still have something that the other paper doesn’t have that you can salvage. There are bound to be some details on which your work does not agree, and these can feature in your paper. If it’s an underscoop, you have even less to worry about. There will be a way forward – you just need to identify it and move on.

The main message is that “being scooped” is not the end. You just need to figure out your way forward.

How do I stop it from happening to me?

Be original! It’s a truism that if you are working on something interesting, it’s likely that someone else is too. And if you work in a highly competitive area, there might be many groups working on the same thing and it is more likely that you will be scooped. Some questions are obvious next steps and it might be worth thinking twice about pursuing them. This is especially true if you come up with an idea based on a paper you’ve read. Work takes so long to appear that the lab who published that paper is likely far ahead of you.

Having your own niche gives the best protection. If you have carved out your own question you probably have the lead and will be associated with work in this area anyway. Other labs will back off. If you have a highly specialised method, again you can contribute in ways that others can’t and so your chances of being scooped decrease.

Have a backup plan. Do you have a side project which you can switch to if too much novelty is taken away from your main project? You can insulate yourself from scoop damage by not working on projects that are all-or-nothing. Horror stories about scooping in structural biology (which is all about “the big reveal”) are commonplace. Investing energy in alternative approaches or new assays as well as getting a structure might help here.

If you find out about competition, maybe from a poster or a talk at a meeting, you need to evaluate whether it is worth carrying on. If you can, talk to the other lab. Most labs do not want to compete and would prefer to collaborate or at least co-ordinate submission of manuscripts.

Use preprints! If you deposit your work on a preprint server, you get a DOI and a date stamp. You can prove that your work existed on that date and in what form. This is ultimate protection against being scooped. If someone else’s work appears online before you do this, then as I said above, you haven’t really been scooped. If work appears and you already have a DOI, well, then you haven’t been scooped either. Some journals see things this way. For example, EMBO J have a scoop protection policy that states that the preprint deposition timestamp is the date at which priority is assessed.

The post title is taken from “Scoop” by The Auctioneers. I have this track on an extended C86 3-Disc set.

In a Word: LaTeX to Word and vice versa

Here’s a quick tech tip. We’ve been writing papers in TeX recently, using Overleaf as a way to write collaboratively. This works great but sometimes, a Word file is required by the publisher. So how do you convert from one to the other quickly and with the least hassle?

If you Google this question (as I did), you will find a number of suggestions which vary in the amount of effort required. Methods include latex2rtf or pandoc. Here’s what worked for me:

  • Exporting the TeX file as PDF from Overleaf
  • Opening it in Microsoft Word
  • That was it!

OK, that wasn’t quite it. It did not work at all on a Mac; I had to use a Windows machine running Word. The formatting was maintained and the pictures imported OK. Note that this was a short article with three figures and hardly any special notation (it’s possible this doesn’t work as well on more complex documents). A couple of corrections were needed: hyphenation at the end of lines was deleted during the import, which borked genuinely hyphenated words that happened to span two lines; and the units generated by siunitx were missing a space between the number and the unit. Otherwise it was pretty straightforward. So straightforward that I thought I’d write a quick post in case it helps other people.
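If the PDF-into-Word trick doesn’t work for you (it didn’t on my Mac), the pandoc route mentioned above can be scripted, for example from R via the rmarkdown wrapper. This is a minimal sketch assuming pandoc is installed; the file names are placeholders and I haven’t tested how well it copes with more complex documents.

# Sketch: convert a .tex manuscript to .docx via pandoc (file names are placeholders)
library(rmarkdown)

pandoc_convert("manuscript.tex", to = "docx", output = "manuscript.docx")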

What about going the other way?

Again, on Windows, I used Apache OpenOffice to open my Word document and save it as an .odt file. I then used the writer2latex filter to make a .tex file with all the embedded images saved in a folder. These could then be uploaded to Overleaf. With a bit of formatting work, I was up-and-running.
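As an alternative to the OpenOffice route, pandoc can also go in this direction and will extract the embedded images for you. Again, a sketch with placeholder file names, assuming pandoc is installed.

# Sketch: convert a .docx to .tex, extracting the embedded images into ./media
library(rmarkdown)

pandoc_convert("manuscript.docx", to = "latex", output = "manuscript.tex",
               options = c("--standalone", "--extract-media=media"))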

I had heard that many publishers, even those that say they accept manuscripts as TeX files, actually require a Word document for typesetting. This is because, I guess, they have workflows set up to make the publisher version which must start with a Word document and nothing else. What’s more worrying is that in these cases, if you don’t supply one, they will convert it for you before putting it into the workflow. It’s probably better to do this yourself and check the conversion to reduce errors at the proof stage.

The post title is taken from “In A Word” the compilation album by Nottingham noise-rockers Fudge Tunnel.

Some Things Last A Long Time II

Back in 2014, I posted an analysis of the time my lab takes to publish our work. That post is very popular, probably because it looks at the total time it takes us to publish our work. It was time for an update, so here is the latest version.

The colours have changed a bit, but again the graphic shows the journey to publication in four “eras”:

  1. Pre-time (before 0 on the x-axis): this is the time from first submission to the first journal. A dark time which involves rejection.
  2. Submission at the final journal (starting at time 0). Again, the lime-coloured periods are when the manuscript is with the journal and the green ones, when it is with us (being revised).
  3. Acceptance! This is where the lime bar stops. The manuscript is then readied for publication (blank area).
  4. Published online. A red period that ends with final publication in print.

Since 2013 we have been preprinting our work, which means that the manuscript is available while it is under review. This procedure means that the journey to publication only delays the work appearing in the journal and not its use by other scientists. If you want to find out more about preprints in biology check out ASAPbio.org or my posts here and here.

The mean time from first submission to the paper appearing online in the journal is 226 days (median 210). Which is shorter than the last time I did this analysis (250 days). Sadly though we managed to set a new record for longest time to publication with 450 days! This is sad for the first author concerned who worked hard (259 days in total) revising the paper when she could have been doing other stuff. It is not all bad though. That paper was put up on bioRxiv the day we first submitted it so the pain is offset somewhat.

What is not shown in the graphic is the other papers that are still making their way through the process. These manuscripts will change the stats again, likely pushing up the times. As I said in the last post, I think the delays we experience are pretty typical for our field and, if anything, my group is quite quick to publish.

If you’d like to read more about publication lag times see here.

Thanks to Jessica Polka for nudging me to update this post.

The post title comes again from Daniel Johnston’s track “Some Things Last A Long Time” from his “1990” LP.

The Digital Cell: Workflow

The future of cell biology, even for small labs, is quantitative and computational. What does this mean and what should it look like?

My group is not there yet, but in this post I’ll describe where we are heading. The graphic below shows my current view of the ideal workflow for my lab.

Workflow

The graphic is pretty self-explanatory, but to walk you through:

  • A lab member sets up a microscopy experiment. We have standardised procedures/protocols in a lab manual and systems are in place so that reagents are catalogued to minimise error.
  • Data goes straight from the microscope to the server (and is backed up). Images and metadata are held in a database and object identifiers are used for referencing in electronic lab notebooks (and for auditing).
  • Analysis of the data happens with varying degrees of human intervention. The outputs of all analyses are processed automatically. Code for doing these steps is under version control using git (github).
  • Post-analysis, the processed outputs contain markers for QC and error checking. We can also trace back to the original data and check the analysis. Development of code happens here too, speeding up slow procedures via “software engineering”.
  • Figures are generated using scripts which are linked to the original data, with an auditable record of any modification to the image (a minimal sketch of such a script follows this list).
  • Project management, particularly of paper writing is via trello. Writing papers is done using collaborative tools. Everything is synchronised to enable working from any location.
  • This is just an overview and some details are missing, e.g. backup of analyses is done locally and via the server.
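To make the figure-generation step concrete, here is a minimal sketch of what one of those scripts could look like. The file names, column names and the idea of stamping the figure with the git commit hash are illustrative assumptions rather than our actual code.

# Illustrative sketch of a figure-generation script (not our actual code)
# Assumes a processed output file results.csv and that the script lives in a git repository.
library(ggplot2)

df <- read.csv("results.csv")

# record which version of the code produced this figure, for the audit trail
commit <- system("git rev-parse --short HEAD", intern = TRUE)

p <- ggplot(df, aes(x = condition, y = intensity)) +
  geom_boxplot() +
  labs(x = "Condition", y = "Fluorescence intensity (a.u.)",
       caption = paste("generated from results.csv at commit", commit))

ggsave("figure1.pdf", p, width = 4, height = 3)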

Just to reiterate: my team is not at this point yet, but we are reasonably close. We have not yet implemented three of these things properly in my group, but in our latest project (via collaboration) the workflow has worked as described above.

The output is a manuscript! In the future I can see that publication of a paper as a condensed report will give way to making the data, scripts and analysis available, together with a written summary. This workflow is designed to allow this to happen easily, but this is the topic for another post.

Part of a series on the future of cell biology in quantitative terms.

Zero Tolerance

We were asked to write a Preview piece for Developmental Cell. Two interesting papers which deal with the insertion of amphipathic helices in membranes to influence membrane curvature during endocytosis were scheduled for publication and the journal wanted some “front matter” to promote them.

Our Preview is paywalled – sorry about that – but I can briefly tell you why these two papers are worth a read.

The first paper – a collaboration between EMBL scientists led by Marko Kaksonen – deals with the yeast proteins Ent1 and Sla2. Ent1 has an ENTH domain and Sla2 has an ANTH domain. ENTH stands for Epsin N-terminal homology whereas ANTH means AP180 N-terminal homology. These two domains are known to bind membrane and in the case of ENTH to tubulate and vesiculate giant unilamellar vesicles (GUVs). Ent1 does this via an amphipathic helix “Helix 0” that inserts into the outer leaflet to bend the membrane. The new paper shows that Ent1 and Sla2 can bind together (regulated by PIP2) and that ANTH regulates ENTH so that it doesn’t make lots of vesicles, instead the two team up to make regular membrane tubules. The tubules are decorated with a regular “coat” of these adaptor proteins. This coat could prepattern the clathrin lattice. Also, because Sla2 links to actin, then actin can presumably pull on this lattice to help drive the formation of a new vesicle. The regular spacing might distribute the forces evenly over large expanses of membrane.

The second paper – from David Owen’s lab at CIMR in Cambridge – shows that CALM (a protein with an ANTH domain) actually has a secret Helix 0! They show that this forms on contact with lipid. CALM influences the size of clathrin-coated pits and vesicles, by influencing curvature. They propose a model where cargo size needs to be matched to vesicle size, simply due to the energetics of pit formation. The idea is that cells do this by regulating the ratio of AP2 to CALM.

You can read our preview and the papers by Skruzny et al and Miller et al in the latest issue of Dev Cell.

The post title and the title of our Preview is taken from “Zero Tolerance” by Death from their Symbolic LP. I didn’t want to be outdone by these Swedish scientists who have been using Bob Dylan song titles and lyrics in their papers for years.

Joining A Fanclub

When I started this blog, my plan was to write about interesting papers or at least blog about the ones from my lab. This post is a bit of both.

I was recently asked to write a “Journal Club” piece for Nature Reviews Molecular Cell Biology, which is now available online. It’s paywalled unfortunately. It’s also very short, due to the format. For these reasons, I thought I’d expand a bit on the papers I highlighted.

I picked two papers from Dick McIntosh’s group, published in J Cell Biol in the early 1990s as my subject. The two papers are McDonald et al. 1992 and Mastronarde et al. 1993.

Almost everything we know about the microanatomy of mitotic spindles comes from classical electron microscopy (EM) studies. How many microtubules are there in a kinetochore fibre? How do they contact the kinetochore? These questions have been addressed by EM. McIntosh’s group in Boulder, Colorado have published so many classic papers in this area, but there are many more coming from Conly Rieder, Alexey Khodjakov, Bruce McEwen and many others. Even with the advances in light microscopy which have improved spatial resolution (resulting in a Nobel Prize last year), EM is the only way to see individual microtubules within a complex subcellular structure like the mitotic spindle. The title of the piece, Super-duper resolution imaging of mitotic microtubules, is a bit of a dig at the fact that EM still exceeds the resolution available from super-resolution light microscopy. It’s not the first time that this gag has been used, but I thought it suited the piece quite well.

There are several reasons to highlight these papers over other electron microscopy studies of mitotic spindles.

It was the first time that 3D models of microtubules in mitotic spindles were built from electron micrographs of serial sections. This allowed spatial statistical methods to be applied to understand microtubule spacing and clustering. The software that was developed by David Mastronarde to do this was later packaged into IMOD. This is a great software suite that is actively maintained, free to download and is essential for doing electron microscopy. Taking on the same analysis today would be a lot faster, but still somewhat limited by cutting sections and imaging to get the resolution required to trace individual microtubules.

The paper actually showed that some of the microtubules in kinetochore fibres travel all the way from the pole to the kinetochore, and that interpolar microtubules invade the bundle occasionally. This was an open question at the time and was really only definitively answered thanks to the ability to digitise and trace individual microtubules using computational methods.

The final thing I like about these papers is that it’s possible to reproduce the analysis. The methods sections are wonderfully detailed and of course the software is available to do similar work. This is in contrast to most papers nowadays, where it is difficult to understand how the work has been done in the first place, let alone to try and reproduce it in your own lab.

David Mastronarde and Dick McIntosh kindly commented on the piece that I wrote and also Faye Nixon in my lab made some helpful suggestions. There’s no acknowledgement section, so I’ll thank them all here.

References

McDonald, K. L., O’Toole, E. T., Mastronarde, D. N. & McIntosh, J. R. (1992) Kinetochore microtubules in PTK cells. J. Cell Biol. 118, 369–383

Mastronarde, D. N., McDonald, K. L., Ding, R. & McIntosh, J. R. (1993) Interpolar spindle microtubules in PTK cells. J. Cell Biol. 123, 1475–1489

Royle, S.J. (2015) Super-duper resolution imaging of mitotic microtubules. Nat. Rev. Mol. Cell. Biol. doi:10.1038/nrm3937 Published online 05 January 2015

The post title is taken from “Joining a Fanclub” by Jellyfish from their classic second and final LP “Spilt Milk”.