This post is about a citation analysis that didn’t quite work out.
I liked this blackboard project by Manuel Théry looking at the influence of each paper authored by David Pellman’s lab on the future directions of the Pellman lab.
I'm working on a new project for @criparis students: "The Tree of a Lab"— THERY Manuel (@ManuelTHERY) March 21, 2018
= how thematics have evolved in a lab history. Trying to understand what makes a new branch, or end one. Here is Pellman lab tree. Red = > 200 citations (NCS papers). But branches started from JCB papers ! pic.twitter.com/CPaq3w3owG
It reminds me that papers can have impact in the field while others might be influential to the group itself. I wondered which of the papers on which I’m an author have been most influential to my other papers and whether this correlates with a measure of their impact on the field.
There’s no code in this post. I retrieved the relevant records from Scopus and used the difference in “with” and “without” self-citation to pull together the numbers.
Influence: I used the number of citations to a paper from any of our papers as the number for self-citation. This was divided by the total number of future papers. This means if I have 50 papers, and the 23rd paper that was published has collected 27 self-citations, this has a score of 1 (the 23rd paper nor any of the preceding 22 papers, can cite the 23rd paper, but the 27 that follow, could). This is our metric for influence.
Impact: As a measure of general impact I took the total number of citations for each paper and divided this by the number of years since publication to get average cites per year for each paper.
Reviews and methods papers are shown in blue, while research papers are in red. I was surprised that some papers have been cited by as much as half of the papers that followed.
Generally, the articles that were most influential to us were also the papers with the biggest impact. Although the correlation is not very strong. There is an obvious outlier paper that gets 30 cites per year (over a 12 year period, I should say) but this paper has not influenced our work as much as other papers have. This is partly because the paper is a citation magnet and partly because we’ve stopped working on this topic in the last few years.
Obviously, the most recent papers were the least informative. There are no future papers to test if they were influential and there are few citations so far to understand their impact.
It’s difficult to say what the correlation between impact and influence on our own work really means, if anything. Does it mean that we have tended to pursue projects because of their impact (I would hope not)? Perhaps these papers are generally useful to the field and to us.
In summary, I don’t think this analysis was successful. I had wanted to construct some citation networks – similar to the Pellman tree idea above – to look at influence in more detail, but I lost confidence in the method. Many of our self-citations are for methodological reasons and so I’m not sure if we’re measuring influence or utility here. Either way, the dataset is not big enough (yet) to do more meaningful number crunching. Having said this, the approach I’ve described here will work for any scholar and could be done at scale.
There are several song titles in the database called ‘For What It’s Worth’. This one is Chapterhouse on Rownderbout.