Study designs, regulation, and transparency of in-vitro diagnostic tests matter and need to be embedded in pandemic planning, argues Sheila M. Bird
A recent report from the Royal Statistical Society’s Working Group on Diagnostic Tests has called for new standards for diagnostic tests to assure the performance of future tests. The report makes 22 recommendations on study design, regulation, and transparency. The Royal Statistical Society (RSS) convenes working parties infrequently, usually only when a major misuse of statistics needs correction, so it is noteworthy that the RSS has issued a report on this topic. [1-5]
Since the early 1980s, stricter regulations have applied to in-vitro diagnostic tests for infectious diseases that could contaminate the blood supply. Despite the global pandemic, and an initial infection-fatality-rate of around 1%, SARS-CoV-2 does not threaten the blood supply, and tests for it are not held to a common statistical benchmark of the kind used for drugs or vaccines. On 10 June 2021, the US Food and Drug Administration (FDA) issued a safety communication warning the public to stop using the Innova SARS-CoV-2 antigen rapid qualitative test over concerns about the performance of the test. A statement from the UK’s regulator, the Medicines and Healthcare products Regulatory Agency (MHRA), is expected, as the Innova lateral flow antigen tests have been in wide use in the UK, where the supply of the devices is only permitted by the Department of Health and Social Care.
Antigen tests for SARS-CoV-2 rely primarily on a nasopharyngeal swab and manufacturers are able to apply for emergency use authorization on the basis of self-certification, so-called Conformité Européenne (CE), with a similar process in the United States. Tests can be used in contexts which differ from their CE-use. For example, in the UK, the deployment of the Innova lateral flow antigen test for mass screening of asymptomatic people in Liverpool in November 2020 deviated from its CE-use, which was for people with symptoms of covid-19. The rollout of mass screening in the UK was preceded by only modest investigation in asymptomatic persons.
Crucially, in Liverpool a dual-swabbing study-design was used whereby 6,000 asymptomatic people agreed to provide contemporaneous swabs for the Innova rapid test (result within 30 minutes) and to be tested using quantitative reverse transcription Polymerase Chain Reaction tests (qRT-PCR). qRT-PCR tests are considered to be the UK’s gold standard test for diagnosing SARS-CoV-2 infection in symptomatic people. Associated with qRT-PCR is an amplification cycle-threshold (ct-value), which is lower when viral load is higher.
Liverpool’s study-design revealed that the Innova lateral flow test diagnosed only 28 (40%) of the 70 infections [95% CI: 28% to 52%] that were identified using qRT-PCR, but the rapid lateral flow test performed better at diagnosing “infectiousness”, that is: infections for which ct-value was below 25 (67%) [95% CI: 41% to 87%].
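As a rough check on intervals of this kind, the sketch below computes an exact (Clopper-Pearson) binomial confidence interval for 28 detected infections out of 70. The choice of the Clopper-Pearson method is an assumption, since the method behind the quoted intervals is not stated here, and the function names are illustrative only.

```python
from math import comb


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_of_increasing(g, tol=1e-10):
    """Bisection on [0, 1] for an increasing function g with g(0) < 0 < g(1)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for a proportion k/n (assumed method)."""
    # Lower limit: the p at which P(X >= k) equals alpha/2.
    lower = 0.0 if k == 0 else _root_of_increasing(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2
    )
    # Upper limit: the p at which P(X <= k) equals alpha/2.
    upper = 1.0 if k == n else _root_of_increasing(
        lambda p: alpha / 2 - binom_cdf(k, n, p)
    )
    return lower, upper


# Sensitivity of the lateral flow test against qRT-PCR in the Liverpool study.
low, high = exact_binomial_ci(28, 70)
print(f"28/70 = {28 / 70:.0%}, 95% CI {low:.0%} to {high:.0%}")
```

With only 70 qRT-PCR-positive cases the interval spans more than 20 percentage points, which is why reporting the interval alongside the point estimate matters.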
In March 2021, the UK government rolled out assisted-screening of secondary school pupils in England for SARS-CoV-2 on their return to school (three times in two weeks). This was another context-of-use for the Innova lateral flow antigen test due to pupils’ younger age and the substantially lower prevalence of asymptomatic infection in the community compared to November 2020, when mass testing was rolled out in Liverpool.
Not only was there no properly designed evaluation of this mass screening in schools, but confirmation of positive lateral flow test results using qRT-PCR tests was also suspended, despite the RSS warning that, because of the low prevalence of asymptomatic infection in the community, half of the positive lateral flow test results would be negative when adjudicated.
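The arithmetic behind a warning of this kind can be sketched with Bayes' rule: the share of positive results that are true positives (the positive predictive value) falls sharply as prevalence falls, even for a highly specific test. The sensitivity, specificity, and prevalence values below are hypothetical inputs chosen for illustration; they are not figures from the RSS or from any evaluation of the Innova test.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a person with a positive result is truly infected."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical test characteristics, for illustration only.
ASSUMED_SENSITIVITY = 0.50
ASSUMED_SPECIFICITY = 0.9995

# Hypothetical prevalences: 2% (higher) and 0.1% (lower).
for prevalence in (0.02, 0.001):
    ppv = positive_predictive_value(
        ASSUMED_SENSITIVITY, ASSUMED_SPECIFICITY, prevalence
    )
    print(f"prevalence {prevalence:.1%}: PPV {ppv:.0%}")
```

Under these assumed values, about half of positive results at the lower prevalence would be false positives, which is the situation in which confirmatory testing earns its keep.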
The UK has learned the hard way that, for diagnostic testing, study design matters and we cannot assume that test-performance can be generalised from one context of use to another (symptomatic to asymptomatic; adult to child; high to low prevalence; high viral load versus vaccine-reduced; or across different variants of concern).
The recent RSS report on new standards for diagnostic tests notes that robust studies of analytical performance provide necessary, but insufficient evidence to implement in vitro diagnostics. Confidence intervals for the sensitivity (% of infected persons who are correctly detected by the test) and specificity (% of uninfected persons correctly labelled by the test as uninfected) for each intended use should be standard practice, not exceptional. Lesser sensitivity and specificity can be adjusted for in surveillance studies.  Study designs for direct comparison of competing in vitro diagnostic tests should be a priority in the planning for future pandemics, which requires:
- Identification of multi-site networks to recruit patients or citizens willing to provide relevant biological specimens (including serially);
- Creation and maintenance of specimen banks;
- Multi-disciplinary input—statistical and regulatory—as well as clinical and laboratory, to agree evaluation strategies;
- Expedited study-protocol and ethical approvals before access to banked specimens.
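The earlier remark that lesser sensitivity and specificity can be adjusted for in surveillance studies can be illustrated with the standard Rogan-Gladen correction, which recovers an estimate of true prevalence from the proportion testing positive. This is a minimal sketch under the assumption that sensitivity and specificity are known exactly; it is not presented as the method recommended in the RSS report, and the numbers are hypothetical.

```python
def adjusted_prevalence(apparent_prevalence, sensitivity, specificity):
    """Rogan-Gladen estimate of true prevalence from an imperfect test.

    apparent_prevalence is the observed proportion of positive results.
    """
    discrimination = sensitivity + specificity - 1
    if discrimination <= 0:
        raise ValueError("test must perform better than chance")
    estimate = (apparent_prevalence + specificity - 1) / discrimination
    # The raw estimate can fall outside [0, 1] in small samples, so clip it.
    return min(1.0, max(0.0, estimate))


# Hypothetical survey: 2% test positive with an 80% sensitive, 99% specific test.
print(f"{adjusted_prevalence(0.02, 0.80, 0.99):.2%}")
```

In practice sensitivity and specificity are themselves estimated with uncertainty, so an interval for the adjusted prevalence should reflect that as well as the sampling error in the survey.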
In vitro diagnostic tests require stricter regulation because misrepresented or misapplied tests can be unsafe, as the FDA's warning last week made clear. The RSS looks to the MHRA, the UK’s regulator, to take a strong lead to ensure international harmonization on higher standards of diagnostic tests. In particular, the RSS recommends the strengthening of Target Product Profiles for each intended use and consideration of the consequences of a test result, which for SARS-CoV-2 range from inappropriately liberalised behaviour following a false negative result to deprivation of liberty via self-isolation following a positive result. Both of these outcomes have considerable implications for individuals and the community.
Stricter regulation helps in achieving greater transparency because protocols for field or clinical evaluation studies would be publicly available, well scrutinized, and ethically approved, and any post-hoc analyses would be clearly identifiable as exploratory. Regulators would also have a crucial role in vetting the information provided to the public about test performance in different contexts. People should not be confused, inadvertently or deliberately, about which test is being offered; and the public needs to be clear about when confirmatory testing is required and why, including for public health reasons such as genomic tracking.
Sheila M. Bird, formerly Programme Leader, MRC Biostatistics Unit, Cambridge
Competing interests: SMB is a member of the Royal Statistical Society’s COVID-19 Taskforce and chairs its RSS/DHSC Panel on NHS Test & Trace; member since January 2021 of PHE/NHS Test & Trace’s Testing Initiatives Evaluation Board; proposer and member of RSS Working Group on Diagnostic Tests; past member/chair of five out of six previous RSS Working Parties in the past 30 years.
- Covid-19: Tests must be more rigorously regulated to protect public, say statisticians
- Covid-19: US regulator raises “significant concerns” over safety of rapid lateral flow tests
- Report of a Royal Statistical Society Working Party on Official Statistics in the UK (chairman: Professor Peter G. Moore). Official Statistics: Counting with Confidence. Journal of the Royal Statistical Society, Series A 1991; 154: 23-44.
- Report of a Royal Statistical Society Working Party (chairman: Professor Stuart J. Pocock). Statistics and Statisticians in Drug Regulation in the United Kingdom. Journal of the Royal Statistical Society, Series A 1991; 154: 413-419.
- Report of a Royal Statistical Society Working Party (chairman: Professor David J. Bartholomew). The Measurement of Unemployment in the UK. Journal of the Royal Statistical Society, Series A 1995; 158: 363-417.
- Report of a Royal Statistical Society Working Party on Performance Monitoring in the Public Services (chair: Professor Sheila M. Bird). Performance indicators: good, bad, and ugly. Journal of the Royal Statistical Society, Series A 2005; 168: 1-27.
- Report of a Royal Statistical Society Working Party (chair: Professor Stephen Senn). Statistical issues in first-in-man studies. Journal of the Royal Statistical Society, Statistics in Society 2007; 170: 517-579.
- Bird SM. Covid tests in secondary schools: a statistical cause celebre. Significance 2021; 18: 42-45.
- Diggle PJ. Estimating prevalence using an imperfect test. Epidemiology Research International 2011; Article ID 608719. doi:10.1155/2011/608719.