By Andrea Ferrario and Nikola Biller-Andorno.
A technology enthusiast (TE) and a medical ethicist (ME) walk into a bar. Over a few rounds of drinks, their discussion shifts to the topic of large language models (LLMs) and their use in medical ethics.
TE: Have you seen the latest? Technology built on LLMs, like OpenAI’s GPT-4, is diving into medical ethics. These models can even generate medical ethics use cases and elaborate on those provided by human users. LLMs are groundbreaking.
ME: Groundbreaking, but fraught with complexity. Here, I stand with Ferrario and Biller-Andorno and their arguments in their recent JME article. We need to start with the right perspective on this issue. It is not just about LLMs discussing medical ethics problems with humans. It is about investigating what LLMs can do when we need to understand the depths of human experiences, relate them to the norms that govern medical practice, and formulate ethics-compliant plans for clinical action.
TE: Framing the context is important, but I believe that LLMs actually understand ethical issues in medicine. If asked in the right way, they elaborate on ethical problems and even come up with pertinent examples. That is a huge leap forward. They can be much more than a support in medical ethics decision-making.
ME: Identifying issues, sure, LLMs have the ability to do that. In fact, they are trained on immense quantities of text, including medical ethics use cases. Retrieving medical ethics information is something they can do pretty well. But understanding medical ethics as an expert would? That is where this technology falls short. LLMs are not experts in medical ethics, even if they can sometimes simulate aspects of expertise in a rather convincing way.
TE: I disagree. You underestimate LLMs. If their answers are precise and context-relevant, it is because they understand what users need. Balas et al. recently showed that OpenAI’s GPT-4 can achieve a good understanding across different medical ethics problems. Therefore, LLMs display the core ability of human experts, and we should treat them as such.
ME: This is a crucial point: just because LLMs can deliver answers that are relevant to the context, it does not follow that they understand the questions. LLMs generate answers by computing word probabilities, not by grasping their semantics. An element of randomness plays a role in how LLMs compute their answers, and currently, we lack the means to fully control this aspect. This is not understanding. Moreover, expertise is not merely about understanding facts. It is also striving for excellence in managing knowledge by cultivating traits such as conscientiousness, curiosity, perseverance and humility. These are abilities we cannot ascribe to LLMs.
TE: I see. It seems to me that LLMs are excellent actors, then. Essentially, they perform an incredibly complex script with a touch of improvisation, but without understanding the content of it. Their ability lies in reciting the most appropriate parts of the script at the right time in a coherent dialogue with the audience. Do I get your point right?
ME: I like your metaphor. As human actors do, LLMs can elicit emotions in us, promote self-reflection and support our understanding of reality. Here lies their utility. However, we have to stay alert: we do not really know how much truth lies in the script they use in their improvisation theatre. It is up to us to find out.
TE: Then, we need to assess the theoretical limitations of LLMs and, by virtue of this, assign them their most appropriate role in medical ethics education and clinical decision-making. Medical students could, for instance, examine the responses generated by LLMs and compare them with those of medical ethicists. LLMs could also create fictional case reports for students’ ethical reflection. There is so much to do!
ME: Yes, there is much to do. But this is a matter for another day. Let us have a coffee.
Paper title: Large Language Models in Medical Ethics: Useful but not Expert
Authors: Andrea Ferrario,1 Nikola Biller-Andorno1
Affiliations: 1Institute of Biomedical Ethics and History of Medicine, University of Zurich
Competing interests: None declared
Social media accounts of post author: Andrea Ferrario (LinkedIn): https://www.linkedin.com/in/andrea-ferrario-43b58534/