I was interested in the analysis by Frontiers on the lack of a correlation between the rejection rate of a journal and the “impact” (as measured by the JIF). There’s a nice follow-up here at Science Open. The Times Higher Education Supplement also reported on this with the line that “mass rejection of research papers by selective journals in a bid to achieve a high impact factor is an enormous waste of academics’ time”.
First off, the JIF is a flawed metric in a number of ways but even at face value, what does this analysis really tell us?
This plot is taken from the post by Jon Tennant at Science Open.
As others have pointed out:
- The rejection rate is dominated by desk rejects, which although very annoying, don’t take that much time.
- Without knowing the journal name it is difficult to know what to make of the plot.
The data are available from Figshare and – thanks to Thomson Reuters’ habit of reporting the JIF to 3 d.p. – we can easily pull the journal titles from a list using the JIF as a key. The list is here. Note that there may be errors due to this quick-and-dirty method.
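A minimal sketch of that quick-and-dirty lookup, assuming a JCR-style list ranked by JIF2014 (all journal names and numbers below are invented for illustration):

```python
# Hypothetical JCR-style list, ranked by JIF2014 (highest first).
# Titles and values are made up for illustration.
jcr_ranked = [
    ("Journal A", 32.1),
    ("Journal B", 5.228),
    ("Journal C", 5.228),   # JIFs are NOT guaranteed unique!
    ("Journal D", 1.042),
]

# Build a lookup keyed on the JIF rounded to 3 d.p. Because the list
# is ranked, setdefault() means the first (highest-ranked) journal
# wins on any collision -- the source of the matching errors noted above.
title_by_jif = {}
for title, jif in jcr_ranked:
    title_by_jif.setdefault(round(jif, 3), title)

# Rows from a rejection-rate dataset: (rejection_rate, JIF2014).
rows = [(0.88, 5.228), (0.10, 1.042)]

matched = [(rate, jif, title_by_jif.get(round(jif, 3), "unmatched"))
           for rate, jif in rows]
```

Note how the duplicated 5.228 silently resolves to “Journal B”, which is exactly the failure mode discussed in the comments below.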
The list takes on a different meaning when you can see the Journal titles alongside the numbers for rejection rate and JIF.
Whichever field you are in, looking for familiar journals will leave you disappointed. There’s an awful lot of noise in there – by which I mean journals that are outside of your field.
This is the problem with this analysis as I see it. It is difficult to compare Nature Neuroscience with Mineralium Deposita…
My plan with this dataset was to replot rejection rate versus JIF2014 for a few different journal categories, but I don’t think there’s enough data to do this and make a convincing case one way or the other. So, I think the jury is still out on this question.
It would be interesting to do this analysis on a bigger dataset. Journals releasing their numbers on rejection rates would be a step towards doing this.
One final note:
The Orthopedic Clinics of North America is a tough journal. It accepts only 2 papers in every 100, for an impact factor of 1!
The post title is from “Throes of Rejection” by Pantera from their Far Beyond Driven LP. I rejected the title “Satan Has Rejected my Soul” by Morrissey for obvious reasons.
11 thoughts on “Throes of Rejection: No link between rejection rates and impact?”
Did you get a sense of the quality of the rejection rate data? I imagine there’s much room for error, since many sources are third-party reports on publisher-provided stats across a wide timespan.
My intuition is that scientists playing the impact factor game submit according to an impact factor ladder. And each rejection, they move down a rung. In other words, if you are able to look by field, I expect an association with rejection rate to emerge.
My intuition is the same. Very few journals have high-IF, thousands have low-IF. A ladder with a very narrow top and very wide base accurately describes the relationship I think. Plus, there is a tendency for citations to follow the unattainability, which will reinforce the relationship.
I couldn’t tell about the rejection rate data. I doubt it is very meaningful as there is no standard to report to. An Editor at Science told me that there is a pre-screen for really crazy papers before the Editors even start to make a judgement. So would those be included? What fraction are they?
Thomson Reuters InCites gives us not only the JIF, but also the average JIF percentile category for each journal. (The JIF percentile, averaged over all the categories that Thomson Reuters thinks the journal belongs to).
For example, the ‘European Journal of Agronomy’ has a rejection rate of 0.88, apparently, and an impact factor of 2.704, which sounds a bit rubbish; except that by average JIF percentile category, it reaches the ‘88.272’th percentile (in whatever categories Thomson Reuters assigns it to).
I can’t see how to get all these numbers easily. Besides, I’m not sure about the quality of the rejection rate data. For example, one journal with a stated rejection rate of zero, ‘Current Opinion in Colloid and Interface Science’, “publishes invited articles only” … so the selection/rejection is going on at the stage of the invitations.
Thanks for the comment, Richard. Great idea to get a field-normalised view for the JIF numbers. I don’t know how to get those either… I didn’t even manage to get the categories from TR in a sensible format. Although I didn’t try too hard as this was just a quick, lunchtime exercise. I also don’t have that much confidence in the rejection rate data, which meant I wasn’t keen to keep crunching.
Richard makes a great point. I am also curious about whether accounting for journal category will change the findings. For this analysis, I suggest switching from the JCR Impact Factor to Journal Metrics. The SNIP and SJR measures adjust for subject field. Additionally, the Journal Metrics data is publicly available, although not openly licensed AFAIK. I’ve extracted tidy TSVs of the relevant data, which includes journal prestige values and categories.
The main roadblock I’m running into is mapping the 2014 Impact Factors from the Frontiers dataset to journal ISSNs or NLM Catalog IDs. It’s a real shame that Frontiers stripped their dataset of journal identities. Their considerable compilation effort is going to waste, while the community is unable to investigate their findings.
Can you explain more how you mapped 2014 impact factors to journal titles? When I joined unique impact factors from 2014 with the Frontiers dataset, only 117 (20.5%) journals remained (dataset, notebook).
Hmmm, very good point. I didn’t check exactly, so I think you’re correct. I checked that the JIF numbers in the Frontiers dataset were unique and then fished out the first match for journal name from a list of journals ranked by JIF2014. But I didn’t check the JIF2014 list itself to see whether JIFs are unique there (I assumed they were, which was a mistake). So in my list, any journal with a non-unique JIF is matched to the journal with the highest rank at that JIF.
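The difference between the two approaches can be sketched in a few lines (journal names and JIFs invented): restricting the join to unambiguous, unique JIFs – as in Daniel’s notebook – drops every duplicated value, which is roughly why so few of the journals survive an exact join.

```python
from collections import Counter

# Toy JCR list with a duplicated JIF (all values invented).
jcr = [("Journal X", 2.704), ("Journal Y", 2.704), ("Journal Z", 0.913)]

# Count how often each 3-d.p. JIF occurs in the list...
counts = Counter(round(jif, 3) for _, jif in jcr)

# ...and keep only journals whose JIF is unique. Duplicated JIFs
# cannot be matched unambiguously, so a strict join must drop them.
unique_titles = {round(jif, 3): title for title, jif in jcr
                 if counts[round(jif, 3)] == 1}
```

Here 2.704 is discarded entirely, whereas the first-match approach would silently have assigned it to “Journal X”.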
In my defence I did say that the method was quick-and-dirty and there might be errors. I didn’t think 80% would be unmatched though!!
Thanks for pointing this out.
I totally agree that they should have kept the journal names and also that this kind of analysis needs to be field-normalised (SNIP and SJR look good). But then, how good is the rejection rate data… I’m not sure this analysis is possible right now.
Great stuff, Steve and Daniel.
On one hand, Frontiers is seemingly willfully refusing to reveal what journals are in their data set or give details on the source of the rejection rate data. (Compare their silence on these issues with their willingness to share other data from their blog.) Moreover, as a publisher that makes it nearly impossible to reject a paper, they have an interest in arguing that rejection is a waste of time. So until they tell us how they’ve done the study, I don’t believe their results one bit more than any other advertising copy from any other commercial publisher.
On the other, it is an entertaining parlor game to try to reverse engineer their study and figure out what they did. You’ve done a good job of this.
One thing that is quite striking is how massively over-represented the Frontiers journals are in Daniel’s dataset, with 5 of 117 journals. By my count, there are 16 Frontiers journals total in the JCR, out of 11,149 total journals. Therefore I calculate that there is less than a 0.0001% chance (p<10^-6) that you would see 5 or more Frontiers journals in a sample of 117 journals from the JCR when drawing from the JCR at random.
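That tail probability can be checked exactly with a hypergeometric calculation, using the counts quoted in the comment (16 Frontiers journals among 11,149 in the JCR, a sample of 117); `scipy.stats.hypergeom.sf(4, 11149, 16, 117)` should give the same answer, but a stdlib-only sketch is:

```python
from math import comb

# P(X >= x) for X ~ Hypergeometric: drawing a sample of n journals
# without replacement from N total, of which K are Frontiers journals.
N, K, n = 11149, 16, 117

def p_at_least(x):
    """Exact upper-tail probability via binomial coefficients."""
    total = comb(N, n)
    return sum(comb(K, k) * comb(N - K, n - k)
               for k in range(x, min(K, n) + 1)) / total

p = p_at_least(5)   # probability of 5+ Frontiers journals by chance
```

The result lands below 10^-6, consistent with the p < 10^-6 figure above.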
Given those probabilities, I HEREBY CALL BULLSHIT on the methods description in the Frontiers blog post, where they write “In Figure 1, we plotted the impact factors of 570 randomly selected journals indexed in the 2014 Journal Citation Reports (Thomson Reuters, 2015), against their publicly stated rejection rates.”
It is highly improbable that these journals were actually selected at random. I'd conjecture that Frontiers added most or all of their own JCR-listed journals to the data set; the others may well have been selected at random from the JCR. I'd be happy to be proven wrong if Frontiers wishes to release the data, and would happily recant if shown to be incorrect.
Why does this matter? Well, for one thing, if we can’t trust the methods description, we can’t really trust anything about their analysis. But moreover, I think this detail of the methods could affect the conclusion if we do choose to take the rest on faith. The Frontiers journals may indeed have unusually high impact factors given their rejection rates. But by including this unusual set (which I bet makes up a sizable fraction of their very-low-rejection-rate journals) in an otherwise supposedly random set of journals, they are likely generating a misleading regression between impact factor and rejection rate. If I’d written the original post and the data had come out like I expect they did, I’d have concluded the reverse of what the blog post actually concludes: that impact factor DOES correlate with rejection rate, and that Frontiers journals are outliers. This is something they should be genuinely proud of, not something that should be getting covered up in sloppy (at best) data analysis.
Great points Carl. If I had to guess, by “randomly selected journals” Frontiers means “journals that we could find rejection rate data for and were in the 2014 JCR.” I think their method was to find data for a subset of all journals, but not a random selection. Random here is especially difficult because most journals don’t post rejection rates, so there is a considerable non-response bias. And when you find a resource containing rejection rates for many journals, do you include them all or just the journal you set out to find?
Regarding Frontiers’ motivation in withholding the complete dataset and methods, I’m not going to speculate. However, as a publisher interested in reproducible and open research, they are not leading by example with respect to this analysis. In addition, my comment requesting journal identities is hidden, still awaiting moderation after 20 days.
On the one hand you make a big deal of the fact that rejection rates are dominated by desk rejections, but on the other hand you completely fail to mention a serious confounding variable: the quality of the editor. Do you think there’s an off chance that the editors of Nature do a better job at separating potentially well-cited research from long-tail research? Is there a difference between a full-time editor and an academic doing editorial work? How would you even account for that?
Disclaimer: I’m a full-time editor.
Comments are closed.