Zero Tolerance

We were asked to write a Preview piece for Developmental Cell. Two interesting papers which deal with the insertion of amphipathic helices in membranes to influence membrane curvature during endocytosis were scheduled for publication and the journal wanted some “front matter” to promote them.

Our Preview is paywalled – sorry about that – but I can briefly tell you why these two papers are worth a read.

The first paper – a collaboration between EMBL scientists led by Marko Kaksonen – deals with the yeast proteins Ent1 and Sla2. Ent1 has an ENTH domain and Sla2 has an ANTH domain. ENTH stands for Epsin N-terminal homology whereas ANTH means AP180 N-terminal homology. These two domains are known to bind membrane and, in the case of ENTH, to tubulate and vesiculate giant unilamellar vesicles (GUVs). Ent1 does this via an amphipathic helix, “Helix 0”, that inserts into the outer leaflet to bend the membrane. The new paper shows that Ent1 and Sla2 can bind together (regulated by PIP2) and that ANTH regulates ENTH so that it doesn’t make lots of vesicles; instead, the two team up to make regular membrane tubules. The tubules are decorated with a regular “coat” of these adaptor proteins. This coat could prepattern the clathrin lattice. Also, because Sla2 links to actin, actin can presumably pull on this lattice to help drive the formation of a new vesicle. The regular spacing might distribute the forces evenly over large expanses of membrane.

The second paper – from David Owen’s lab at CIMR in Cambridge – shows that CALM (a protein with an ANTH domain) actually has a secret Helix 0! They show that this forms on contact with lipid. CALM influences the size of clathrin-coated pits and vesicles, by influencing curvature. They propose a model where cargo size needs to be matched to vesicle size, simply due to the energetics of pit formation. The idea is that cells do this by regulating the ratio of AP2 to CALM.

You can read our preview and the papers by Skruzny et al and Miller et al in the latest issue of Dev Cell.

The post title and the title of our Preview are taken from “Zero Tolerance” by Death from their Symbolic LP. I didn’t want to be outdone by these Swedish scientists who have been using Bob Dylan song titles and lyrics in their papers for years.

Sure To Fall

What does the life cycle of a scientific paper look like?

It stands to reason that after a paper is published, people download and read the paper and then if it generates sufficient interest, it will begin to be cited. At some point these citations will peak and the interest will die away as the work gets superseded or the field moves on. So each paper has a useful lifespan. When does the average paper start to accumulate citations, when do they peak and when do they die away?

Citation behaviours are known to be very field-specific. So to narrow things down, I focussed on cell biology and on one area, “clathrin-mediated endocytosis”, in particular. It’s an area that I’ve published in – of course this stuff is driven by self-interest. From Web of Science, I downloaded data for the 1000 papers that had accumulated the most citations. Reviews were excluded, as I assume their citation patterns are different from primary literature. The idea was just to take a large sample of papers on a topic. The data are pretty good, but there are some errors (see below).

Number-crunching (feel free to skip this bit): I imported the data into IgorPro, making a 1D wave for each record (paper). I deleted the last point, corresponding to cites in 2014 (the year is not complete). I aligned all records so that year of publication was 0. Next, the citations were normalised to the maximum number achieved in the peak year. This allows us to look at the lifecycle in a sensible way. Next I took out records for papers less than 6 years old, as I reasoned these would not have completed their lifecycle and could contaminate the analysis (it turned out to make little difference). The lifecycles were plotted and averaged. I also wrote a quick function to pull out the peak year for citations post hoc.
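For anyone who wants to try this outside IgorPro, here is a minimal Python sketch of the same steps. It is not the code used for the post: the function names, the `(publication year, {year: citations})` record layout and the example numbers are all made up for illustration, and ignoring citations dated before publication is an extra assumption.

```python
from statistics import mean

LAST_FULL_YEAR = 2013  # 2014 is dropped because the year is incomplete
MIN_AGE = 6            # papers younger than this are left out


def align_and_normalise(pub_year, cites_by_year, last_full_year=LAST_FULL_YEAR):
    """Return citations per year since publication (index 0 = publication year),
    scaled so that the peak year equals 1.0."""
    # Starting at pub_year ignores any citations dated before publication (assumption).
    counts = [cites_by_year.get(y, 0) for y in range(pub_year, last_full_year + 1)]
    peak = max(counts)
    if peak == 0:
        return []  # never cited: nothing to normalise
    return [c / peak for c in counts]


def peak_year_offset(curve):
    """Years after publication at which citations first reach their peak."""
    return curve.index(max(curve))


def average_lifecycle(curves):
    """Average the normalised curves year by year; shorter curves drop out
    once they run out of data."""
    longest = max(len(c) for c in curves)
    return [mean(c[i] for c in curves if len(c) > i) for i in range(longest)]


# Hypothetical records, not real Web of Science data.
records = [
    (2000, {2000: 2, 2001: 10, 2002: 20, 2003: 15, 2004: 10, 2014: 3}),
    (2005, {2005: 1, 2006: 4, 2007: 8, 2008: 8, 2009: 2}),
    (2010, {2010: 0, 2011: 5, 2012: 9}),  # too young, will be excluded
]

curves = [
    align_and_normalise(year, cites)
    for year, cites in records
    if LAST_FULL_YEAR - year >= MIN_AGE
]
curves = [c for c in curves if c]  # drop never-cited papers
peaks = [peak_year_offset(c) for c in curves]
average = average_lifecycle(curves)
```

Realigning on the peak year instead of the publication year only needs each curve sliced from `peak_year_offset(curve)` onwards before averaging.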

So what did it show?

Citations to a paper go up and go down, as expected (top left). When cumulative citations are plotted, most of the articles have an initial burst and then level off. The exceptions are ~8 articles that continue to rise linearly (top right). On average a paper generates its peak citations three years after publication (box plot). The fall after this peak period is pretty linear and it’s apparently all over somewhere >15 years after publication (bottom left). To look at the decline in more detail I aligned the papers so that year 0 was the year of peak citations. The average now loses almost 40% of those peak citations in the following year and then declines steadily (bottom right).

Edit: The dreaded Impact Factor calculation takes the citations to articles published in the preceding 2 years and divides by the number of citable items in that period. This means that each paper only contributes to the Impact Factor in years 1 and 2. This is before the average paper reaches its peak citation period. Thanks to David Stephens (@david_s_bristol) for pointing this out. The alternative 5 year Impact Factor gets around this limitation.
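The arithmetic is simple enough to sketch. The function below follows the definition given above (citations in a given year to items from the preceding window, divided by the citable items in that window); the function name, argument layout and all the numbers are invented for illustration and do not describe any real journal.

```python
def impact_factor(year, cites_to_pub_year, citable_items, window=2):
    """Impact Factor for `year`.

    cites_to_pub_year: {publication year: citations received during `year`}
    citable_items:     {publication year: number of citable items published}
    window:            2 for the standard metric, 5 for the 5-year version
    """
    years = range(year - window, year)  # the `window` years before `year`
    total_cites = sum(cites_to_pub_year.get(y, 0) for y in years)
    total_items = sum(citable_items.get(y, 0) for y in years)
    if total_items == 0:
        raise ValueError("no citable items in the window")
    return total_cites / total_items


# Hypothetical journal: citations received in 2013, split by publication year.
cites_2013 = {2008: 250, 2009: 350, 2010: 400, 2011: 300, 2012: 200}
items = {2008: 100, 2009: 100, 2010: 100, 2011: 100, 2012: 100}

two_year = impact_factor(2013, cites_2013, items)             # (300 + 200) / 200
five_year = impact_factor(2013, cites_2013, items, window=5)  # 1500 / 500
```

In this made-up example the papers from 2010, which are in their third year after publication, attract the most citations, yet they only count towards the 5-year figure.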

Perhaps lifecycle is the wrong term: papers in this dataset don’t actually ‘die’, i.e. go to 0 citations. There is always a chance that a paper will pick up the odd citation. Papers published 15 years ago are still clocking 20% of their peak citations. Looking at papers cited at lower rates would be informative here.

Two other weaknesses that affect precision are that 1) a year is a long time and 2) publication is subject to long lag times. The analysis would be improved by categorising the records based on the month-year when the paper was published and the month-year when each citation comes in. Papers published in January in one year probably have a different peak than those published in December of the same year, but this is lost when looking at year alone. Secondly, due to publication lag, it is impossible to know when the peak period of influence for a paper truly is.

Problems in the dataset. Some reviews remained despite being supposedly excluded, i.e. they are not properly tagged in the database. Also, some records have citations from years before the article was published! The numbers of citations are small enough not to worry about for this analysis, but it makes you wonder how accurate the whole dataset is. I’ve written before about how complete citation data may or may not be. These sorts of things are a concern for all of us who are judged by these things for hiring and promotion decisions.

The post title is taken from ‘Sure To Fall’ by The Beatles, recorded during The Decca Sessions.

All This And More

I was looking at the latest issue of Cell and marvelling at how many authors there are on each paper. It’s no secret that the raison d’être of Cell is to publish the “last word” on a topic (although whether it fulfils that objective is debatable). Definitive work needs to be comprehensive. So it follows that this means lots of techniques and ergo lots of authors. This means it is even more impressive when a dual author paper turns up in the table of contents for Cell. Anyway, I got to thinking: has it always been the case that Cell papers have lots of authors and if not, when did that change?

I downloaded the data for all articles published by Cell (and for comparison, J Cell Biol) from Scopus. The records required a bit of cleaning. For example, SnapShot papers needed to be removed and also the odd obituary etc. had been misclassified as an article. These could be quickly removed. I then went back through and filtered out ‘articles’ that were less than three pages as I think it is not possible for a paper to be two pages or fewer in length. The data could be loaded into IgorPro and boxplots generated per year to show how author number varied over time. Reviews that are misclassified as Articles will still be in the dataset, but I figured these would be minimal.
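The cleaning and grouping steps could look something like the Python sketch below. This is an illustration rather than the IgorPro procedure actually used: the record keys (`doc_type`, `title`, `page_start` and so on) are hypothetical and are not the real Scopus export column names, and the records themselves are invented.

```python
from collections import defaultdict
from statistics import median


def clean(records, min_pages=3):
    """Keep only records that look like genuine research articles."""
    kept = []
    for rec in records:
        if rec["doc_type"] != "Article":
            continue  # reviews, editorials etc.
        if rec["title"].startswith("SnapShot"):
            continue  # SnapShots misclassified as articles
        pages = rec["page_end"] - rec["page_start"] + 1
        if pages < min_pages:
            continue  # too short to be a real paper
        kept.append(rec)
    return kept


def author_counts_by_year(records):
    """Map each publication year to the list of author counts for that year."""
    by_year = defaultdict(list)
    for rec in records:
        by_year[rec["year"]].append(len(rec["authors"]))
    return dict(by_year)


def rec(year, title, n_authors, start, end, doc_type="Article"):
    """Helper to build one made-up record."""
    return {"year": year, "title": title, "authors": ["x"] * n_authors,
            "page_start": start, "page_end": end, "doc_type": doc_type}


records = [
    rec(1974, "Paper A", 3, 1, 10),
    rec(1974, "Paper B", 2, 11, 20),
    rec(2013, "SnapShot: Something", 2, 100, 103),
    rec(2013, "An obituary tagged as an article", 1, 200, 201),
    rec(2013, "Paper C", 8, 300, 312),
    rec(2013, "A review", 4, 400, 415, doc_type="Review"),
]

counts = author_counts_by_year(clean(records))
medians = {year: median(n) for year, n in counts.items()}
```

The per-year lists in `counts` are the sort of input a boxplot routine needs, one box per year.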

First off: Yes, there are more authors on average for a Cell paper versus a J Cell Biol paper. What is interesting is that both journals had similar numbers of authors when Cell was born (1974) and they crept up together until the early 2000s, when the number of Cell authors kept increasing, or J Cell Biol flattened off, whichever way you look at it.

I think the overall trend to more authors is because understanding biology has increasingly required multiple approaches and the bar for evidence seems to be getting higher over time. The initial creep to more authors (1974-2000) might be due to a cultural change where people (technicians/students/women) began to get proper credit for their contributions. However, this doesn’t explain the divergence between J Cell Biol and Cell in recent years. One possibility is Cell takes more non-cell biology papers and that these papers necessarily have more authors. For example, the polar bear genome was published in Cell (29 authors), and this sort of paper would not appear in J Cell Biol. Another possibility is that J Cell Biol has a shorter and stricter revision procedure, which means that multiple rounds of revision, collecting new techniques and new authors is more limited than it is at Cell. Any other ideas?

I also quickly checked whether more authors means more citations, but found no evidence for such a relationship. For papers published in the years 2000-2004, the median citation number for papers with 1-10 authors was pretty constant for J Cell Biol. For Cell, these data were more noisy. Three-author papers tended to be cited a bit more than those with two authors, but then four-author papers were also lower.
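A check like this boils down to grouping papers by author count and taking the median number of citations in each group. Here is a rough Python sketch, with a hypothetical `(year, number of authors, citations)` tuple layout and invented numbers rather than the real Scopus data.

```python
from collections import defaultdict
from statistics import median


def median_cites_by_author_count(papers, first_year=2000, last_year=2004,
                                 max_authors=10):
    """Median citations for each author count, restricted to the chosen
    publication years and to papers with 1..max_authors authors."""
    groups = defaultdict(list)
    for year, n_authors, cites in papers:
        if first_year <= year <= last_year and 1 <= n_authors <= max_authors:
            groups[n_authors].append(cites)
    return {n: median(c) for n, c in sorted(groups.items())}


# Made-up papers: (publication year, number of authors, citations)
papers = [
    (2001, 2, 40),
    (2002, 2, 60),
    (2003, 3, 90),
    (2004, 3, 70),
    (2003, 3, 80),
    (1999, 2, 500),   # outside the 2000-2004 window
    (2002, 29, 300),  # more than 10 authors
]

medians_by_authors = median_cites_by_author_count(papers)
```

Using the median rather than the mean keeps the odd very highly cited paper from dominating a group.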

The number of authors on papers from our lab ranges from 2 to 9, with a median of 3.5. This would put an average paper from our lab in the bottom quartile for JCB and in the lower 10% for Cell in 2013. Ironically, our 9-author paper (an outlier) was published in J Cell Biol. Maybe we need to get more authors on our papers before we can start troubling Cell with our manuscripts…

The post title is taken from ‘All This and More’ by The Wedding Present from their LP George Best.