We don’t need AI to pass the Turing Test to be helpful in healthcare

This week, Simon Stevens, chief executive of NHS England, announced that hospitals are to receive extra funding for substituting human labour with AI-provided analysis. We applaud the NHS for moving to utilise the benefits of AI and for being proactive in shaping its use. At the same time, we should be cautious about the notion some hold of replacing clinicians with computers. [1] The benefits of AI will come from automating certain mundane clinical tasks and augmenting human doctors’ abilities to work with machines.

Today it is 65 years since the death of Alan Turing. In his ‘Imitation Game’, also known as the Turing test, a machine would pass if a human evaluator could not tell it apart from a human in conversation. The intellectual legacy of this experiment is strong, but this principle does not always translate well into everyday practice.

The equivalent of the Turing Test in aerospace engineering would be to create an aeroplane which flies like a bird, which is clearly not how aviation developed. If engineers had sought to mimic birds’ flapping wings and feathers, they would have created an unsafe and ineffective transport mechanism. We should treat medicine like engineering, and focus on building robots that help doctors, the pilots of the medical profession, navigate through complexity to meet the ultimate objectives of more accurate diagnoses and better patient care. Clinicians need to oversee the process and interpret the results from computers, much as pilots oversee the flying of a plane.

Human plus machine will be better than one or the other. There are sound statistical reasons for this. As long as the AI system and the human clinician have complementary roles, combining these two sources of analysis can provide better overall guidance. Diversity helps. Two systems, each with 90% accuracy, can provide greater than 90% accuracy when combined, provided that their errors are not perfectly aligned with one another. This is also the logic of the second opinion.
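The arithmetic behind the second opinion can be sketched in a few lines. This is an illustration under assumptions that are ours rather than drawn from any cited study: the decision is binary, the machine and the clinician err independently of one another, and any disagreement is passed to an arbitration step (for example, a further review) that picks the right answer with probability `p_resolve`. The function name and parameters are hypothetical.

```python
def combined_accuracy(p_machine: float, p_clinician: float, p_resolve: float) -> float:
    """Accuracy of a two-reader binary decision with arbitration of disagreements.

    Illustrative assumptions: the two readers err independently, and when they
    disagree an arbitration step chooses the correct answer with probability
    p_resolve. When both agree, their shared answer is accepted as-is.
    """
    # Both readers are right: the agreed answer is correct.
    both_right = p_machine * p_clinician
    # Exactly one reader is right: the case is flagged as a disagreement.
    disagree = p_machine * (1 - p_clinician) + (1 - p_machine) * p_clinician
    # Both wrong (binary decision): they agree on the wrong answer, so no gain.
    return both_right + disagree * p_resolve


if __name__ == "__main__":
    for p_resolve in (0.5, 0.8, 1.0):
        accuracy = combined_accuracy(0.9, 0.9, p_resolve)
        print(f"p_resolve = {p_resolve:.1f} -> combined accuracy = {accuracy:.3f}")
```

Under these assumptions, two 90% readers agree and are right 81% of the time and disagree 18% of the time, so the combined accuracy is 0.81 + 0.18 × `p_resolve`. It exceeds 90% whenever disagreements are resolved better than a coin toss, and cannot exceed 99%. If the two readers always made the same mistakes, there would be no disagreements to resolve and no gain.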

When making a diagnosis, we can expect clinicians and machines to pick up on different relevant features, and hence statistically combining these will lead to better decisions than from either alone. It is highly likely that some stages of clinicians’ workflow will be automated to good effect, but major clinical disciplines are unlikely to be replaced altogether. [2]

Out of all the sectors, the public rank healthcare as having the greatest potential gains from machine learning, particularly AI’s ability to interpret large amounts of complex medical information faster and more accurately than clinicians in specific contexts. [3] AI will continue to make progress in complex domains.

However, there are substantial challenges (and limited appetite from patients) for AI to master the final stages of delivery of a diagnosis to the patient. After all, who wants to receive a terminal diagnosis from a robot? [4] This unfortunately came close to reality in the US, where a doctor used a robot with a video link to tell a patient that he was going to die. [5]

Patients with certain conditions, such as mental health problems, have also expressed concern that their condition will be overlooked by technology. [6] There is a part of medicine which needs to be able to contextualise a patient, which, at present, AI will find challenging. A machine has a limited internal representation of what a patient is, and a limited ability to display empathy in the consultation.

That said, there are potential upsides from digitising and automating how patients access medical advice: some patients prefer to disclose potentially embarrassing conditions to automated systems. [7] And decision algorithms that have learnt from a population-scale dataset can provide doctors with additional insights into similar patients, particularly those presenting with rare diseases. A single clinician may only see a handful of patients with a particular condition per year, but nationally there will be many more.

Overall, we see incredible potential for AI to improve the quality of decisions in healthcare, not by trying to replace clinicians, but by using AI systems to augment human clinical reasoning. Will machines ever think like doctors? Not yet, and they do not have to in order to be useful.

William Warr is a Doctoral Researcher in Digital Health and Primary Health Care at the University of Oxford.

Matthew Willetts is a Doctoral Researcher in Statistics at the University of Oxford and the Alan Turing Institute.

Chris Holmes is a professor of Biostatistics at the University of Oxford and Scientific Director for the Health Programme at the Alan Turing Institute, London.

We thank Professor Sir Peter Donnelly and Professor Sir John Bell for reviewing the piece and their feedback.

Competing interests: None declared.


[1] Shah, Nirav R. “Health Care in 2030: Will Artificial Intelligence Replace Physicians?” Annals of Internal Medicine (2019); Goldhahn, Jörg, Vanessa Rampton, and Giatgen A. Spinas. “Could artificial intelligence make doctors obsolete?” BMJ 363 (2018): k4563.

[2] Quer, Giorgio, et al. “Augmenting diagnostic vision with AI.” The Lancet 390.10091 (2017): 221.

[3] Ipsos Mori, Public Views on Machine Learning, Royal Society, https://royalsociety.org/-/media/policy/projects/machine-learning/publications/public-views-of-machine-learning-ipsos-mori.pdf

[4] https://www.bmj.com/content/363/bmj.k4669

[5] BBC, Man told he’s going to die by doctor on video-link robot, 8th March 2019.

[6] Ipsos Mori, Public Views on Machine Learning, Royal Society, https://royalsociety.org/-/media/policy/projects/machine-learning/publications/public-views-of-machine-learning-ipsos-mori.pdf

[7] Riper, H., et al., Effectiveness of E-self-help interventions for curbing adult problem drinking: a meta-analysis. Journal of Medical Internet Research, 2011. 13(2).