Top 10 Research Articles of 2025, #6 – 10

In 2025, we saw many outstanding articles published in BMJ Quality & Safety. The papers highlighted in this blog represent some of the very best from the past year and were selected based on engagement metrics and evaluation by our editorial board. The methodology used to identify these papers is described in a previous blog. This article will focus on the articles ranked sixth through tenth. You can read about those ranked first to fifth in a separate blog.

We would like to express our thanks to the authors of these exceptional papers, as well as to the entire BMJ Quality & Safety team for their support throughout the selection process – choosing amongst so many high-quality contributions was not a simple task!

10 – Use of structured handoff protocols for within-hospital unit transitions: a systematic review from Making Healthcare Safer IV

Accompanying editorial available here.

Transitions of care are a particularly vulnerable point in the patient journey. As patients move between specialties or clinical teams, responsibility for their care is transferred together with important clinical information. These handoffs (or handovers) occur multiple times a day in every hospital, yet they are not risk-free. Miscommunication during handoffs is common and can contribute to delays in patient care, duplication of work, and avoidable patient harm. Improving the quality of handoffs is therefore an important focus for healthcare organisations and policymakers.

In this systematic review from the Making Healthcare Safer IV programme, the authors explored the evidence base for structured handoff protocols used for within-unit handoffs. These handoffs typically occur when staff change shifts, and information is communicated between incoming and outgoing teams. Despite widespread availability of structured communication tools, only two tools – SBAR and I-PASS – had been evaluated in more than one study for within-unit handoffs. The authors concluded that some evidence existed to support the use of SBAR in improving patient outcomes, but that implementation fidelity was difficult to achieve, and the quality of evidence was low. More robust evidence supported the use of I-PASS, particularly in improving communication and reducing preventable adverse events; however, its impact on clinical outcomes such as preventable mortality and length of stay was more uncertain.

Structured handoff protocols are an intuitive solution to improve communication during transitions of care. However, their effectiveness is likely to depend upon implementation – including how they are embedded into workflows, supported by training, and reinforced by organisational culture. For example, many of the positive findings associated with I-PASS emerged from comprehensive programmes to support implementation. As healthcare systems become increasingly complex and fragmented, understanding how to minimise risk during transitions of care will be critical.

9 – Do patient safety incident investigations align with systems thinking? An analysis of contributing factors and recommendations

Accompanying editorial available here.

Preventable patient safety incidents are a major challenge across healthcare systems, and patient safety incident investigations are intended to facilitate organisational learning to prevent future harm. Safety science increasingly recognises that adverse events occur in complex sociotechnical systems, where outcomes are shaped by interactions between people, processes, technology, and environments. If we wish to truly understand why adverse events occur, the approach to investigation must reflect these complexities. However, healthcare organisations and policymakers often continue to search for simple solutions to complex problems, and such solutions rarely exist. Despite consistent advocacy for a systems thinking approach in incident investigation, it is unclear to what extent these principles are being applied.

This study examined 300 high-harm patient safety incident investigations across three Australian states to explore the extent to which these aligned with systems thinking principles. Using frameworks based on AcciMap and Systems Engineering Initiative for Patient Safety (SEIPS), the authors analysed the contributory factors identified in investigation reports and the strength of resulting recommendations. Most incidents were related to clinical processes and procedures such as diagnosis, assessment, and interventions, as well as falls and incidents related to patient or staff behaviour. On average, three contributory factors were identified per investigation, but nearly half of these were at the person level, and 32 investigations identified only person-level contributors. Similar themes emerged from the analysis of recommendations: only 6% were considered strong, with many focussing on the review and enhancement of guidelines, and on training and education.

These findings highlight the tension between the growing advocacy for systems thinking in patient safety investigations and the continued tendency for organisations to favour simple individual-level solutions to complex organisational problems. The methods chosen by organisations may partly explain this. Most investigations reviewed in this study relied on approaches such as root cause analysis, which can encourage linear thinking rather than an approach more aligned to systems thinking. Additionally, data collection often centred on staff interviews, which, when used alone, may not be sufficient to unravel the complexities of healthcare systems. If investigation methods are not purposively designed and selected to understand the complexities of the system, there is a risk that investigations continue to generate weak recommendations that do not promote organisational learning.

8 – Evaluation of the accuracy and safety of machine translation of patient-specific discharge instructions: a comparative analysis

Accompanying editorial available here.

Language barriers are common in modern healthcare, and if accurate translation to communicate pertinent clinical information is not provided, there is a significant risk to the quality of care. Accurate communication of instructions at discharge from hospital is particularly important, and often includes highly personalised information about medications, follow-up care, and warning signs that patients need to understand. Whilst standardised patient information such as leaflets and consent forms can be translated relatively easily, discharge instructions are a far bigger challenge given that they are often highly personalised. Previous research identified potentially harmful inaccuracies resulting from the use of Google Translate. However, the development of large language models has re-ignited interest in machine translation. Could they offer a safer and more practical solution?

This study evaluated the accuracy and safety of translations produced by GPT-4 and Google Translate when used to convert patient-specific discharge instructions from English into Spanish, Chinese, and Russian. The authors purposively selected 50 discharge summaries from two emergency departments in the US, whilst oversampling for common clinical complaints and discharge instructions with medication changes, recognising this as a common source of miscommunication. The translated summaries were back-translated into English by professional translators and assessed for accuracy and potential harm. At the sentence level, accuracy was judged to be over 90% for English to Spanish and Chinese using both tools, and slightly lower for Russian translation. Few of the inaccuracies reported were considered to be potential causes of patient harm. However, there were some important examples of potential for harm, including a translation that advised continuing, rather than withholding, medication.

This study raises an important question for healthcare systems – should translation facilitated by artificial intelligence be viewed as an imperfect but pragmatic solution to a widespread problem, or as an unacceptable shortcut that could cause patient harm? The current standard is likely far from perfect, and this could plausibly represent an improvement. Many patients receive discharge instructions that they struggle to understand, and access to professional translation is often insufficient. However, even rare translation errors could result in significant consequences for individuals. Whether or not these tools are ultimately considered ‘safe enough’ for clinical use will depend, amongst other things, on regulators’ and organisations’ attitudes to risk and governance.

7 – Time to de-implementation of low-value cancer screening practices: a narrative review

A frequently cited statistic in implementation science is the average time for evidence to translate into routine clinical practice – approximately 17 years. This statistic has justified increasing attention on how healthcare systems adopt beneficial innovation at pace. However, far less attention has been paid to the converse – how quickly we de-implement interventions, policies, and services that do not provide good value. This is particularly important in cancer screening, where unnecessary screening practices carry substantial potential for harm through overdiagnosis, anxiety, and unnecessary investigation and treatment.

In this study, the authors examined how quickly low-value cancer screening practices were de-implemented following recommendations from the United States Preventive Services Task Force. Low-value practices were defined as screening practices where there was moderate or high certainty evidence that the practice has no net benefit, or harms that outweigh the benefits. Examples included cervical cancer screening outside recommended age groups, prostate-specific antigen screening in people over 70, and asymptomatic screening for other specific cancers. The authors then evaluated the time between publication of evidence underpinning the recommendations, incorporation into clinical guidelines, and reductions in the practice. Substantial heterogeneity in time to de-implementation was highlighted, with some practices declining relatively quickly and others ongoing more than a decade after guideline recommendations were published. In several cases, there were insufficient data to determine whether practice had changed.

These findings show that organisations are not only slow to implement and adopt new evidence, but also to de-implement established practices. De-implementation may be particularly challenging in situations where something is being taken away that was previously promoted as important and beneficial. Public perception, healthcare professionals’ habits, and conflicting clinical guidance can all contribute to continued and inappropriate use of low-value practices. This study also highlights the lack of data available to effectively monitor the de-implementation of practices after evidence changes. If implementation science is to support the acceleration of evidence-based care, more attention should be paid to doing less of the wrong things, as well as doing more of the right things.

6 – Effectiveness of clinician-directed default nudges on reducing overuse of tests and treatments in healthcare: a systematic review of randomised controlled trials

From unnecessary blood tests to insensitive imaging requests, low-value care consumes significant resources and can expose patients to unnecessary harm. As healthcare systems increasingly rely on electronic prescribing and ordering systems, there has been growing interest in whether these systems can be designed to encourage better clinical decision-making. One such strategy is to guide healthcare professionals towards choices that are likely to be appropriate in most situations using default nudges, where appropriate options are pre-selected to influence use of investigations and treatments.

This systematic review synthesised randomised controlled trials evaluating clinician-directed default nudges to target the overuse of tests or treatments. Six randomised controlled trials conducted in the USA were identified. The nature of these default nudges differed substantially and included changes in the default prescription quantities, the default medication doses, and the default frequency of imaging. Overall, their findings suggested that default nudges can influence healthcare professional behaviour, but these influences were heterogeneous and highly context dependent. Default nudges were used to successfully reduce the use of large volume opioid prescriptions, and to reduce inappropriate imaging in palliative radiotherapy. However, not all nudges were successful. Interestingly, one study found that a default opioid prescription of 10 tablets reduced overprescribing, whereas a more restrictive 5-tablet default increased large-pack opioid prescribing, demonstrating the nuance in these interventions.

This study provided fascinating insights into the complexity of seemingly simple changes to electronic prescribing and test ordering. Although changing the default option may appear on the surface to be a minor technical adjustment, it is clear that these changes are interpreted in broader clinical and cultural contexts, and there may be negative unintended consequences. These findings also highlight a broader challenge in digital health research, with most evidence emerging from US healthcare systems, limiting our understanding of how such interventions translate to other settings. As healthcare organisations utilise these behavioural cues within electronic systems, robust evaluation and ongoing monitoring will be required to understand whether they work and how local context shapes their impact.

BMJ Blogs