By Jonathan Pugh, Dominic Wilkinson and Julian Savulescu.
The UK government has put lateral flow antigen tests (LFATs) at the forefront of its strategy to re-open schools. These tests can be used to detect current infections, and they can provide results quickly at the point of care. The tests themselves also have a low financial cost. In this regard, they have some advantages over ‘gold-standard’ PCR tests, which require the laboratory processing of samples to generate results.
The government’s use of LFATs for the purposes of mass testing has previously been subject to criticism in the pandemic, partly due to concerns about the accuracy of these tests. A pilot trial in Liverpool of an LFAT in asymptomatic individuals suggested that it failed to identify half of the positive cases in the trial population; that is, the test had a poor sensitivity of 48.89% compared to PCR testing, producing a high proportion of false negatives.
The use of an insensitive test makes false negative results likely, and these results have considerable moral costs due to the false reassurance that they can engender. Individuals who incorrectly believe they are not infected may unwittingly engage in behaviours that will transmit the virus.
However, the Liverpool pilot data also suggested that the LFAT under investigation did not lead to a high proportion of false positive results. That is, the data suggest the test was 99.93% specific compared to PCR testing. In the past couple of days, new data have been published on the specificity of LFATs, suggesting a specificity of 99.72% when compared to PCR testing, assuming the latter had 100% sensitivity. However, further analysis of other data in this release suggests that the LFAT could have a specificity as high as 99.97%, and that its specificity is likely to be at least 99.9%.
This is important because false positive results also have moral costs; recipients of such results may have their liberty restricted unnecessarily. In the current context, a false positive will prevent a child from being in school. This result will also have implications for the liberty of their close contacts, both in school and at home, who will also have to self-isolate. As such, a single false positive result can have ripple effects for many people beyond the recipient herself.
The use of an insensitive but highly specific test can be justifiable when it is used for certain limited purposes, and with clear public health messaging. In particular, it can be justifiable to use a test to identify cases that would otherwise have been missed by other forms of population screening, and to triage individuals who are sent for more accurate forms of testing.
However, the government has announced that, at least for the initial period of the school re-opening, a negative PCR test may not be used to overrule a positive LFAT result. This has led to considerable criticism, with some claiming that the policy is ruining the return to schools. However, the authors of the recently released data suggest that their results support the government policy of not requiring PCR confirmation of positive LFAT results.
Sensitivity, specificity, and the predictive value of testing in populations with low prevalence
Some commentators have previously criticised this policy by pointing out that the number of false positives that LFATs identify will currently outnumber the number of true positives, given the low prevalence of the virus. As we shall explain below, this may still be true if the LFAT is as specific as 99.72%, the lowest specificity figure suggested by the recent data. If the LFAT is as specific as 99.97%, as some interpretations of the recently released data suggest, this may not be true.
However, even if it is true, this would not be the complete explanation for why the policy is morally problematic. It is not unfeasible that the kind of serial testing regimen mooted as an alternative (i.e. requiring a positive result from a PCR to confirm a positive LFAT result in order to justify prolonged isolation) could also generate more false positives than true positives in some circumstances. The positive predictive value of a serial testing regimen may also fall below 50% when the virus has very low prevalence and neither test is 100% specific.
To very roughly illustrate the point, the current prevalence of the coronavirus is estimated by the ONS to be 1 in 220 in the UK, in the week beginning Mar 5. This is a prevalence of roughly 0.45%. So, if prevalence in schools reflects this national prevalence, then you might expect roughly 40,000 infected individuals in the total school pupil population in England of roughly 8,890,350 (according to the latest estimates, covering 2019-2020).
On these numbers, even if we assume that a serial testing regimen would be 100% sensitive (which is unlikely in reality), it could only identify roughly 40,000 positive cases in this population. The remaining 8,850,350 pupils in the population would be uninfected. Crucially, if the specificity of the testing regimen as a whole were much below (roughly) 99.55%, the number of false positive results it would generate amongst uninfected pupils would be greater than 40,000 (i.e. the number of true positive cases that the test could hope to identify in the total pupil population).
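The threshold in the paragraph above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the rounded population and infection figures quoted in the text; the function name is our own and purely illustrative.

```python
# Illustrative figures taken from the text above.
PUPILS = 8_890_350   # school pupil population in England (2019-2020 estimate)
INFECTED = 40_000    # rounded estimate at roughly 0.45% prevalence


def break_even_specificity(population: int, infected: int,
                           sensitivity: float = 1.0) -> float:
    """Specificity at which expected false positives equal expected true positives."""
    uninfected = population - infected
    true_positives = sensitivity * infected
    # False positives = (1 - specificity) * uninfected.
    # Setting this equal to the true positives and solving gives:
    return 1.0 - true_positives / uninfected


threshold = break_even_specificity(PUPILS, INFECTED)
print(f"Break-even specificity: {threshold:.2%}")  # → Break-even specificity: 99.55%
```

Below this specificity, a regimen that was perfectly sensitive would still be expected to return more false positives than true positives in this population.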
The precise sensitivity and specificity of PCR testing are somewhat contested. In any case, the above illustration shows that even assuming 100% sensitivity of a serial testing regimen, it would still have to pass a reasonably high threshold of specificity in order to have a positive predictive value of over 50%, when there is a very low prevalence of the virus.
What matters morally
But, in itself, whether or not the false positives outnumber the true positives doesn’t settle the moral argument about whether we should use a test; it can be ethically justifiable to use a testing regimen which identifies more false positives than true positives. It can be so if the benefits of identifying positive cases are sufficiently high, and the costs of the false positives are sufficiently low.
If it is very valuable to identify positive cases, we might be willing to accept a large number of false positive cases in the process of doing so. That is, we might be willing to accept the use of a testing regimen with a relatively low positive predictive value. The minimal predictive value of the testing regimen that we should use to achieve a public health goal is ultimately an ethical judgement about the trade-off between these moral costs and benefits.
Accordingly, what really matters for the ethical analysis of the serial testing regimen vs LFAT testing alone is not merely whether the false positive results would outnumber the true positives when using that strategy; it is instead the extent of this discrepancy on the different approaches, and how it compares to the discrepancy between the true negatives and false negatives.
Unless PCR testing is 100% sensitive, a serial testing regimen could potentially increase the number of false negative results in the tested population. Crucially though, this increase would be minimal if PCR testing approaches 100% sensitivity. The number of false negatives would still be hugely outnumbered by true negatives under a serial testing regimen requiring two positive results, when the prevalence of the virus is low and both tests have reasonable sensitivity.
In contrast, the serial testing regimen would likely reduce the number of false positive results. It would do so even if the serial testing regimen were only modestly more specific than the use of LFATs alone, if it were used in a large population in which the prevalence of the virus is very low. That is morally significant, since it means that it is more plausible that the public health benefits of identifying true positives will be sufficient to outweigh the costs of the false positive results that doing so may involve.
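The trade-off described in the two paragraphs above can be sketched numerically. This is an illustration only, under two loudly flagged assumptions: that the errors of the two tests are conditionally independent, and that PCR has the placeholder sensitivity and specificity used below (the text notes that the real figures are contested). The class and function names are our own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assay:
    sensitivity: float
    specificity: float


def serial(first: Assay, second: Assay) -> Assay:
    """Regimen that counts as positive only if both tests are positive.

    ASSUMPTION: the two tests' errors are conditionally independent.
    """
    return Assay(
        sensitivity=first.sensitivity * second.sensitivity,
        specificity=1 - (1 - first.specificity) * (1 - second.specificity),
    )


def expected_errors(assay: Assay, infected: int, uninfected: int):
    """Return (expected false negatives, expected false positives)."""
    false_negatives = (1 - assay.sensitivity) * infected
    false_positives = (1 - assay.specificity) * uninfected
    return false_negatives, false_positives


lfat = Assay(sensitivity=0.5, specificity=0.999)  # figures discussed in the text
pcr = Assay(sensitivity=0.95, specificity=0.999)  # HYPOTHETICAL placeholder values
combined = serial(lfat, pcr)

for name, assay in (("LFAT alone", lfat), ("LFAT then PCR", combined)):
    fn, fp = expected_errors(assay, infected=40_000, uninfected=8_850_350)
    print(f"{name}: ~{fn:,.0f} false negatives, ~{fp:,.0f} false positives")
```

On these placeholder values the serial regimen adds a comparatively small number of false negatives while removing almost all of the false positives; different PCR figures would change the size of both effects, but not their direction.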
Much here depends on the precise specificity of LFATs. On the figures outlined above, if LFATs themselves have a specificity as high as the very best case scenario estimate of 99.97%, they might be expected to lead to roughly 2,655 false positive results in the pupil population, given the low 0.45% prevalence of the virus at present, and the size of the population.
However, using the government’s reported estimate of the minimum specificity of these tests (as 99.9%), and the roughly 50% sensitivity of the test established in the Liverpool pilot data, the test would be likely to identify approximately 20,000 true positives and roughly 9,000 false positives; i.e. about 1 in 3 positive results would still be false.
Further decreases in the estimate of specificity can have dramatic effects on the number of false positives. When we decrease specificity to the lowest figure mentioned in the range suggested by the new data reported by the government (99.72%), the number of false positives we can expect jumps to roughly 24,780. Notice that this is greater than the expected number of true positives.
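The three scenarios above follow from the same simple calculation. Here is a minimal sketch, assuming the rounded figures used in the text (40,000 infected pupils, roughly 50% sensitivity) and treating each specificity as an absolute measure; the function name is our own.

```python
PUPILS = 8_890_350
INFECTED = 40_000
UNINFECTED = PUPILS - INFECTED
SENSITIVITY = 0.5  # roughly the Liverpool pilot figure


def expected_counts(specificity: float, sensitivity: float = SENSITIVITY):
    """Return (true positives, false positives, positive predictive value)."""
    true_positives = sensitivity * INFECTED
    false_positives = (1 - specificity) * UNINFECTED
    ppv = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, ppv


for spec in (0.9997, 0.999, 0.9972):
    tp, fp, ppv = expected_counts(spec)
    print(f"specificity {spec:.2%}: ~{tp:,.0f} true positives, "
          f"~{fp:,.0f} false positives, PPV {ppv:.0%}")
```

Under these assumptions the positive predictive value stays above 50% for the two higher specificity estimates and falls below it at 99.72%, which is the point at which false positives outnumber true positives.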
It should be borne in mind that we are treating the above LFAT specificity figures as absolute specificity measures in these calculations. The number of expected false positive cases could again rise if these figures are understood as specificity measures relative to a gold-standard that is not itself 100% specific.
However, the above false positive estimates are sufficient to make the further point we now wish to raise. Whatever the precise specificity of the test might be, the harms of false positives will be compounded by the ripple effects they have for close contacts. To adequately assess the harms of these results, we would need to multiply the number of false positives by the expected number of close contacts of an average individual in the population. Even with the most generous estimates of specificity, the harms of false positives in this population will affect many thousands of people.
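To give a rough sense of how these ripple effects scale, the multiplication described above can be sketched as follows. The number of close contacts per pupil is a hypothetical placeholder, since the text gives no figure for it, and the sketch assumes that the contacts of different false positive cases do not overlap.

```python
UNINFECTED = 8_850_350
CONTACTS_PER_PUPIL = 10  # HYPOTHETICAL: no figure is given in the text


def people_affected(false_positives: float, contacts_per_case: float) -> float:
    """Recipients of a false positive plus the close contacts isolating with them.

    ASSUMPTION: no two false positive cases share a close contact.
    """
    return false_positives * (1 + contacts_per_case)


for spec in (0.9997, 0.999, 0.9972):
    false_positives = (1 - spec) * UNINFECTED
    total = people_affected(false_positives, CONTACTS_PER_PUPIL)
    print(f"specificity {spec:.2%}: ~{total:,.0f} people isolating unnecessarily")
```

Whatever value is chosen for the number of contacts, the total scales linearly with it, so even the best case specificity estimate translates into many thousands of people affected.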
In view of this, even assuming an extremely high specificity for LFATs, there are strong moral reasons to further decrease the number of false positives with PCR testing when doing so will have a high negative predictive value.
Acknowledgements: The authors are working as part of the UKRI Pandemic Ethics Accelerator project. This project was funded by the Arts and Humanities Research Council (AHRC) as part of UKRI’s Covid-19 funding.
Authors: Jonathan Pugh, Dominic Wilkinson and Julian Savulescu.
Affiliations: Oxford Uehiro Centre for Practical Ethics, University of Oxford