Jeffrey Aronson: When I Use a Word . . . Modelling elections and covid-19

Last week I introduced the basic and effective reproduction numbers, respectively R₀ and R_e, that are estimated in studying the way a viral epidemic spreads through the population.

The basic reproduction number is defined as the number of cases that are expected to occur on average in a homogeneous population as a result of infection by a single individual, when the population is susceptible at the start of an epidemic, before widespread immunity starts to develop and before any attempt has been made at immunization. So if one person develops the infection and passes it on to two others, the R₀ is 2.

The effective reproduction number, R_e, is the number of people in a population who can be infected by an individual at any specific time. It changes as people become immune or die.

If the average R₀ in the population is greater than 1, the infection will spread exponentially. If R₀ is less than 1, or if R_e falls to less than 1 during the epidemic, the infection will spread only slowly, and will eventually die out.

R₀ is estimated from data collected in the field and entered into mathematical models. The estimate depends on the model used and the data that inform it. A typical model is based on three factors: individual susceptibility to the infection, the rate at which infections actually occur, and the rate of removal of infection from the population, by either recovery or death.
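The three factors above correspond loosely to the classic SIR (susceptible, infected, removed) compartmental model. What follows is a minimal sketch, assuming that model and purely illustrative parameter values: `beta`, `gamma`, and the population size are hypothetical and not taken from any study. It shows R₀ as the ratio of the transmission rate to the removal rate, and R_e falling as the susceptible fraction of the population shrinks.

```python
def simulate_sir(beta, gamma, population, initial_infected=1.0, days=160, dt=0.1):
    """Euler integration of a basic SIR model.

    beta  : transmission rate per day (hypothetical, illustrative value)
    gamma : removal rate per day, by recovery or death (hypothetical)
    Returns a list of (day, susceptible, infected, r_effective) tuples.
    """
    susceptible = population - initial_infected
    infected = initial_infected
    r0 = beta / gamma  # basic reproduction number in this simple model
    history = []
    for step in range(int(days / dt) + 1):
        # The effective reproduction number scales R0 by the susceptible fraction.
        r_effective = r0 * susceptible / population
        history.append((step * dt, susceptible, infected, r_effective))
        new_infections = beta * susceptible * infected / population * dt
        new_removals = gamma * infected * dt
        susceptible -= new_infections
        infected += new_infections - new_removals
    return history


# Illustrative values only: these give R0 = 0.5 / 0.25 = 2.
history = simulate_sir(beta=0.5, gamma=0.25, population=1_000_000)
r_e_start = history[0][3]
r_e_end = history[-1][3]
peak_day, _, peak_infected, r_e_at_peak = max(history, key=lambda row: row[2])

print(f"R_e at the start: {r_e_start:.2f}")
print(f"R_e when infections peak (day {peak_day:.0f}): {r_e_at_peak:.2f}")
print(f"R_e at the end: {r_e_end:.2f}")
```

In this sketch the number of infected people stops rising at about the moment R_e falls to 1, which is the threshold behaviour described earlier. Real models are considerably more elaborate, and their estimates depend heavily on the quality of the data fed into them.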

Since advice about how to behave during an epidemic, such as when to enforce isolation and when to relax restrictions, depends in part on estimates of R_e, it is important that the models used to calculate it should be robust. So how good are the mathematical models?

Here’s a strikingly good example—the accuracy with which exit polls have predicted the outcomes of general elections. The American statistician Warren Mitofsky introduced exit polls in the 1960s, in a Kentucky governorship election. The idea may have been based on the habit of interviewing movie preview audiences as they left the cinema. However, exit polls had taken place in the USA as early as the 1940s, although the earliest instance of the term recorded in the Oxford English Dictionary is from a June 1976 article in the New York Times.

How do exit polls work? Polling is based on taking a random sample of the population and assuming that it is representative of the whole population. In UK exit polls about 200 voters are randomly sampled as they leave each of 144 of the 40 000 or so polling stations around the country and are asked how they have just voted; they may be asked to cast their votes again in a sham procedure, perhaps making them more likely to report exactly how they have just voted at the ballot box. They are also asked how they voted last time. The data are then modelled according to known demographic characteristics and country-wide variations and scaled up to the whole voting population using complex statistical techniques.
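As a rough illustration of why a sample of this size can be informative, the sketch below computes an approximate 95% margin of error for a vote share, using the figures quoted above (about 200 voters at each of 144 stations). It assumes simple random sampling, which is a simplification made here for illustration only: real exit polls sample voters in clusters and model changes since the previous election, so their true uncertainty is not given by this formula.

```python
import math


def margin_of_error(share, sample_size, z=1.96):
    """Approximate 95% margin of error for a proportion.

    Assumes simple random sampling (an illustrative simplification).
    """
    return z * math.sqrt(share * (1.0 - share) / sample_size)


sample_size = 144 * 200  # stations x voters per station, as quoted in the text
worst_case = margin_of_error(0.5, sample_size)  # a share of 0.5 maximises the variance

print(f"n = {sample_size}, margin = ±{100 * worst_case:.2f} percentage points")
# prints: n = 28800, margin = ±0.58 percentage points
```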

The results of the 2019 UK general election are shown in Table 1. The exit poll came as close as one could to the actual outcome without suspecting sorcery; the data were of high quality and the expected variability was well modelled, under reasonable assumptions.

In contrast, modelling the spread of a virus is much more difficult. First, the data are incomplete; the number of infected individuals cannot be known accurately at any time because there are so many mild and asymptomatic cases; deaths cannot be completely counted and some deaths may be attributable to other causes, even in someone who has the infection. In addition, if other epidemiological knowledge, the equivalent of the well understood voting patterns and demographic effects in the exit poll models, is inadequate, modelling assumptions are likely to be inappropriate and inaccurate results are likely to emerge.

Thus, it has been exceedingly difficult to calculate accurately the reproduction numbers of SARS-CoV-2. Figure 1 illustrates this, with data taken from a systematic review of 21 studies, showing huge variation; the mean estimate was 3.32 (2.81–3.82) and the range 1.9–6.49.

We should therefore recall George Box’s wise words about statistical models: “All models are wrong but some models are useful”. Or as he put it elsewhere, less dramatically but more specifically: “Models, of course, are never true, but fortunately it is only necessary that they be useful. For this it is usually needful only that they not be grossly wrong”. But the models being used to estimate the reproduction numbers of SARS-CoV-2 may be “grossly wrong”.

Jeffrey Aronson is a clinical pharmacologist, working in the Centre for Evidence Based Medicine in Oxford’s Nuffield Department of Primary Care Health Sciences. He is also president emeritus of the British Pharmacological Society.

Competing interests: None declared.

This week’s interesting integer: 269

269 is a prime number, the 57th, P₅₇. It has many prime connections.

269 is a twin with 271, i.e. they are primes that differ by 2; this makes 269 a strong prime, i.e. one that is greater than the average of the two adjacent primes, in this case 263 and 271 (average 267). Note: in cryptography “strong prime” has a different definition, referring to a prime with properties that make it suitable for secure encoding.

It is a Pythagorean prime, i.e. one of the form 4n + 1; like all Pythagorean primes it is therefore the sum of two squares: 269 = 10² + 13², and therefore the square of the hypotenuse of a right-angled triangle whose legs are 10 and 13 units in length.

269 in base 11 is 317 in base 10 and 317 is another prime.

The sum of the digits of 269, 17, is also prime.

269 is the sum of three consecutive primes, 83 + 89 + 97. It is also the sum of two consecutive composite numbers, 134 + 135.

269 is a deletable prime, one that produces a series of primes when one digit is removed at a time: 269 → 29 → 2
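The properties listed so far can be checked mechanically. The following is a minimal sketch using simple trial division; the helper `is_prime` and the dictionary `checks` are names introduced here for illustration.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small numbers."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    divisor = 3
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True


primes = [n for n in range(2, 300) if is_prime(n)]
position = primes.index(269)
previous_prime, next_prime = primes[position - 1], primes[position + 1]
start = primes.index(83)

checks = {
    "57th prime": position + 1 == 57,
    "twin of 271": next_prime == 271 and next_prime - 269 == 2,
    "strong prime": 269 > (previous_prime + next_prime) / 2,
    "Pythagorean prime (4n + 1)": 269 % 4 == 1,
    "sum of two squares": 10**2 + 13**2 == 269,
    "269 read in base 11 is prime": is_prime(int("269", 11)),
    "digit sum is prime": is_prime(sum(int(digit) for digit in "269")),
    "sum of three consecutive primes": primes[start:start + 3] == [83, 89, 97]
    and 83 + 89 + 97 == 269,
    "sum of two consecutive composites": 134 + 135 == 269
    and not is_prime(134)
    and not is_prime(135),
    "deletable prime (269, 29, 2)": all(is_prime(n) for n in (269, 29, 2)),
}

for description, holds in checks.items():
    print(f"{description}: {holds}")
```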

269 can be expressed as the sum of three non-zero squares in exactly five different ways (ignoring order):

16² + 3² + 2²
14² + 8² + 3²
13² + 8² + 6²
12² + 11² + 2²
12² + 10² + 5²
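A brute-force search confirms this list. The sketch below enumerates every triple of positive integers a ≥ b ≥ c whose squares sum to a given number; the function name `three_square_sums` is introduced here for illustration.

```python
import math


def three_square_sums(n):
    """Return all (a, b, c) with a >= b >= c >= 1 and a^2 + b^2 + c^2 == n."""
    results = []
    a = 1
    while a * a <= n:
        b = 1
        while b <= a and a * a + b * b <= n:
            c_squared = n - a * a - b * b
            c = math.isqrt(c_squared)
            # Keep only non-zero c that is a perfect square root and no larger than b.
            if 1 <= c <= b and c * c == c_squared:
                results.append((a, b, c))
            b += 1
        a += 1
    return sorted(results, reverse=True)


ways = three_square_sums(269)
for a, b, c in ways:
    print(f"{a}² + {b}² + {c}²")
print(len(ways), "ways")  # prints: 5 ways
```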

Convert 269 to other bases:

Base     9    8    7    6     5     4      3       2
Value  328  415  533  1125  2034  10031  100222  100001101

Read as base-10 numbers, none of these is prime, although 269 itself is; for example, 415 = 5 × 83. There is only one smaller number with this property, 263.
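The conversions and the primality checks can be reproduced with a short script. This is a minimal sketch; `to_base` and `is_prime` are helper names introduced here. Note that a valid representation in base b uses only the digits 0 to b − 1.

```python
def to_base(n, base):
    """Return the digits of a non-negative integer n in the given base (2 to 10)."""
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)
        digits.append(str(remainder))
    return "".join(reversed(digits)) or "0"


def is_prime(n):
    """Trial-division primality test, adequate for numbers of this size."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    divisor = 3
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True


representations = {base: to_base(269, base) for base in range(9, 1, -1)}
for base, digits in representations.items():
    # Reinterpret the digit string as an ordinary base-10 number.
    verdict = "prime" if is_prime(int(digits)) else "composite"
    print(f"base {base}: {digits} ({verdict} when read in base 10)")
```

The same two helpers could be applied to each smaller prime in turn to examine the statement about 263.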
