Marcus Munafò and Björn Brembs: The impact factor agenda


Where do we find the latest, high-quality research to inform our own research or clinical practice? Although new media have introduced a range of alternatives, such as the National Elf Service, the primary vehicle for disseminating research continues to be the learned journal. And the primary means of rating the quality of a journal remains (and increasingly so) the Impact Factor, calculated as the mean number of citations received in a given year by the articles a journal published over the preceding two years. But to what extent is the Impact Factor fit for purpose?
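For readers unfamiliar with the calculation, the Impact Factor for a given year is, roughly, the ratio of citations received that year by a journal's recent articles to the number of those articles. For 2012, for example:

\[
\mathrm{IF}_{2012} = \frac{\text{citations in 2012 to items the journal published in 2010 and 2011}}{\text{number of citable items the journal published in 2010 and 2011}}
\]

So a 2012 Impact Factor of 10 means that articles the journal published in 2010 and 2011 were each cited, on average, ten times during 2012.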

We recently posted a critique of the Impact Factor (and the ranking of journals more generally) on arXiv. Several converging lines of evidence indicate that publications in high-ranking journals are not only more likely to be fraudulent than articles in lower-ranking journals, but also more likely to present discoveries that are less reliable (ie, are inflated, or cannot subsequently be replicated). Some of the sociological mechanisms behind these correlations have been documented, such as the pressure to publish (preferably positive results in high-ranking journals), which creates the potential for decreased ethical standards and increased publication bias in highly competitive fields.

Nevertheless, there is a strong common perception that high-ranking journals publish “better” or “more important” science, and that the Impact Factor captures this well. The assumption is that high-ranking journals are able to be highly selective and publish only the most important and best supported scientific discoveries, which will then, as a consequence of their quality, go on to be highly cited. However, it has been established for some time that journal rank is a measurable but unexpectedly weak predictor of future citations. A recent analysis of the correlation between journal rank and future citations between 1902 and 2009 revealed two interesting trends. First, while the predictive power of journal rank remained very low for the first two-thirds of the 20th century, it began to increase slowly shortly after the publication of the first Impact Factor data in the 1960s. Second, the correlation kept increasing until the advent of the internet and keyword search in the 1990s, from which point it fell back to pre-1960s levels by the end of the study period in 2009.

In our view, the Impact Factor generates an illusion of exclusivity and prestige based on an assumption that it will predict subsequent impact, an assumption not supported by empirical data. Alternatives to journal rank exist: we now have technology at our disposal that allows us to perform all of the functions journal rank is currently supposed to perform in an unbiased, dynamic way on a per article basis, giving the research community greater control over the selection, filtering, and ranking of scientific information. Since there is no technological reason to continue using journal rank, one implication of our critique is that we can remove the need for a journal hierarchy completely.

Indeed, alternatives to learned journals themselves exist. We now have the technology to bring scholarly communication back into research institutions. Software, raw data, and their text descriptions can be archived and made accessible after peer review. Study level metrics can be used to accrue reputation and assess impact. Such a system would be subject to standard scientific scrutiny, and would evolve to minimize gaming and maximize the alignment of researchers’ interests with those of science (which are currently misaligned). The funds currently spent on journal subscriptions could easily suffice to finance the initial conversion of scholarly communication, even if the savings accrue only in the long term.

What then for our cherished journals, and in particular those serving professional communities? One option would be for these to concentrate on providing exegetical reviews, research digests, and a forum for debate, something that journals such as Nature and the BMJ already do well. Judgements of the quality of research would then focus on the research itself, rather than on its vehicle for dissemination.

Competing Interests: None.

Marcus Munafò is professor of biological psychology at the University of Bristol, United Kingdom. His research interests are primarily in the area of behavioural and neurobiological mechanisms of tobacco and alcohol use. He completed his PhD in 2000 at the University of Southampton, and worked as a postdoctoral fellow at the University of Oxford and the University of Pennsylvania before taking up a permanent position at Bristol in 2005.

Björn Brembs is professor of neurogenetics at the University of Regensburg, Germany. He studies the neurobiological mechanisms underlying spontaneous behaviour and operant learning. He completed his PhD with Martin Heisenberg in 2000 in Würzburg, spent a postdoctoral period in John H Byrne’s lab in Houston, Texas, started his own lab in Berlin in 2004, and was appointed professor in Regensburg in 2012.