Iain Marshall: Fixing chains of trust in health research 

On 4 June, a major covid-19 study was retracted from the Lancet. The observational study, published only 13 days earlier, reported potential harm from hydroxychloroquine treatment in covid-19. The paper had a quick and dramatic impact, temporarily halting the World Health Organization’s Solidarity trial. A single author, Sapan Desai, who directed Surgisphere, a company describing itself as a healthcare data analytics firm, was solely responsible for acquiring the data—reportedly from 671 hospitals across six continents—and for conducting the statistical analyses. After publication, researchers found a large number of inconsistencies in the reported results, which required further explanation. The three other authors, all academics, were not convinced by Desai’s response, and requested the retraction.

It would be easy to criticise the journals’ processes and responses, but the Lancet does not appear to have acted unusually. On the contrary, expressions of concern were posted, and the paper was retracted quickly. We can’t scrutinise the peer review reports, which are confidential. It is also possible that preferentially publishing “practice-changing” results inadvertently invites dubious submissions; but these editorial practices are commonplace. The more worrying conclusion is that our system for quality control of science is not capable of picking up untrustworthy research. Given the likelihood that a small but important minority of scientists are engaging in dubious practices, how do we ensure the integrity of the scientific record?

In computer science, “trust” has been defined as existing where: “the user has first-hand experience of consistent, good, behaviour or the user trusts someone who vouches for consistent, good, behaviour.” Research publication involves chains of trust: journal readers trust the journal, the journal trusts the peer reviewers and the authors, and co-authors trust one another that their sub-tasks have been done correctly. 

Each link in the chain of trust is critical, and relies on trust in all the steps below it. Honest mistakes and fraud can occur at any step; either can break the chain, and the broken trust propagates all the way to the top. There is usually some resilience in the system. If, say, one of the peer reviewers acted in bad faith, we might still have two or more good reviews, so trust is maintained. Similarly, if one of the authors (each of whom is responsible for revising and signing off the manuscript) made an error, it would hopefully be picked up by one of the other authors, the journal editors, or the peer reviewers.

However, weak links can also creep into the chain: for example, when only one author has access to the data. Any problems at the data collection or statistical analysis steps could then be detected only if they were evident in the manuscript itself. In the Surgisphere case, this weakness was plainly visible in the author contributions section of the Lancet paper, and could have been raised by the editors or peer reviewers.

Better open science practices are an important way of building resilience into the system—it seems likely that, had journals insisted on scrutiny of the underlying data, the Surgisphere analysis would not have been published. The Open Science Framework (among many other similar platforms) allows researchers to collate and pre-register protocols, analysis plans, data, and code for free.

But how could the integrity of an important piece of research, one that can affect people’s lives, ever have come to rest on a single person?

Ultimately, the responsibility lies with legitimate researchers to make the changes that will prevent dubious practices. Legitimate researchers should no longer publish articles without asking a colleague to check their analyses; indeed, we should welcome such checks, since researchers make honest mistakes (often!). They should proudly share their data and analysis code. There are understandable reasons why, in some cases, data are not publicly shared (for example, where patients could be identified). However, there is no ethical obstacle to sharing such data with a peer reviewer or journal under appropriate data governance processes. In the retracted Lancet paper, the data were not even made available to the co-authors.

For co-authors, guarantors, peer reviewers, institutions, and journals—the key people responsible for maintaining scientific integrity—this should be our mantra: Can I, or someone I trust, vouch for each aspect of the study?

Iain Marshall is a south London GP and an MRC fellow in health informatics at King’s College London.
He investigates the use of computers to automate evidence synthesis. Twitter: @ijmarshall
Competing interests statement: I am employed by King’s College London, and my research is funded by grants from the Medical Research Council (UK) and the National Library of Medicine (US). My wife Rachel Marshall is a medical editor and has worked for a number of journals, including the Lancet Respiratory Medicine.