Apologetic South African scientists admit flawed Covid-19 models in an enlightening research document


South Africa’s epidemiological models have proven to be wildly inaccurate. A new commentary by international modellers explains why Covid-19 models are a poor basis for decision-making, while South Africa’s own modellers try, but fail, to explain themselves.

President Cyril Ramaphosa and the secretive politburo in charge of lockdown routinely refer to model projections to assert the necessity of a range of draconian interventions to curtail the rights and liberties of citizens and severely restrict economic activity.

At the start of the lockdown, we were told to expect at least 88 000 and perhaps as many as 350 000 deaths. Later, the lower bound number was tempered to between 40 000 and 48 000 deaths, but even this does not appear to match the reality (of 6 769 deaths to date).

The most notable critic of the government’s various modelling efforts has been the Pandemics Data & Analytics group (Panda), which opened the betting with a paper that estimated the number of life years that will be lost to the lockdown to be 29 times as high as the number of life years lost to Covid-19.

This suggests that the world’s harshest and longest lockdown, which continues to have catastrophic consequences for people’s livelihoods, was an unjustified and absurdly disproportionate response in a country that could ill afford it.

They went on to publish a number of articles reviewing the evidence and concluding that the models continue to ‘grossly overestimate’ deaths.

As I pointed out almost three months ago, the problems with models are not unique to South Africa, and follow a pattern of emphasising worst-case scenarios based on uncertain assumptions, dubious code, and poor data.

At the time, I criticised the secrecy in which South Africa’s modelling exercises were shrouded. The models on which these projections were based were never disclosed for public peer review.

Secret models now published

A number of local modellers from the South African Covid-19 Modelling Consortium (SACMC) last week published an apologia for their work in modelling the course of the Covid-19 pandemic in South Africa.

In it, they claim that ‘we have aimed at making all model outputs and code available for public scrutiny from the beginning’.

They might have aimed at it, but they didn’t actually do it. As they admit: ‘…we are planning to make the model code public for additional scrutiny’.

They appear finally to have done so, on the quiet, on Friday 24 July. It took a Promotion of Access to Information Act request from Panda to elicit a website address where the model code and data for SACMC’s National COVID-19 Epi Model (NCEM) could be found.

The earlier models, created by the South African Centre for Epidemiological Modelling and Analysis (SACEMA), have been abandoned, and were never published.

Now that the current models are finally public, I urge anyone with the requisite coding, statistics, modelling or epidemiology background to review them in detail and publish their findings. Panda surely will, but the more reviewers, the better.

A modelling manifesto

For now, let’s have a closer look at the SACMC modellers’ apologia.

At the outset, they recognise – and concur with – a new commentary by a larger group of international modellers, published in the respected journal Nature, entitled Five ways to ensure that models serve society: a manifesto.

That paper points out many inherent shortcomings of computer models. It is worth quoting at some length:

‘…computer modelling is in the limelight, with politicians presenting their policies as dictated by “science”. Yet there is no substantial aspect of this pandemic for which any researcher can currently provide precise, reliable numbers. Known unknowns include the prevalence and fatality and reproduction rates of the virus in populations. There are few estimates of the number of asymptomatic infections, and they are highly variable. We know even less about the seasonality of infections and how immunity works, not to mention the impact of social-distancing interventions in diverse, complex societies.

‘Mathematical models produce highly uncertain numbers that predict future infections, hospitalizations and deaths under various scenarios. Rather than using models to inform their understanding, political rivals often brandish them to support predetermined agendas. To make sure predictions do not become adjuncts to a political cause, modellers, decision makers and citizens need to establish new social norms. Modellers must not be permitted to project more certainty than their models deserve; and politicians must not be allowed to offload accountability to models of their choosing.’ (Emphasis mine.)

It goes on to list five ‘best practices’ for quality modelling, summarised here:

Mind the assumptions. Make sure assumptions are reasonable, perform uncertainty and sensitivity analyses on them, and make it clear what the compounded uncertainties mean for the final output.
Mind the hubris. It is tempting to increase a model’s complexity to better capture reality, but each added element introduces new uncertainty, so increasing complexity can lead to an ‘uncertainty cascade’, making the results less, rather than more, accurate.
Mind the framing. A range of factors, from modellers’ own biases, interests and disciplines to the choice of tools can influence, and even determine, the outcome of the analysis. Models are easily manipulated to produce desired results. To put it in jargon, the authors write, ‘qualitative descriptions of multiple reasonable sets of assumptions can be as important in improving insight in decision makers as the delivery of quantitative results’.
Mind the consequences. Excessive focus on numbers ‘can push a discipline away from being roughly right towards being precisely wrong’. In the case of Covid-19, the issue is not only projected deaths, or projected need for hospital beds, but also projected unemployment, projected economic costs, projected non-Covid-related deaths, and projected consequences for civil liberties.
Mind the unknowns. When there is no clear or simple answer to a question, modellers should say so. Failure to acknowledge ignorance not only leads to wrong policy options, but also allows politicians to abdicate responsibility for their actions by saying ‘we only acted upon what the models told us’.

Ultimately, the paper concludes: ‘Mathematical models are a great way to explore questions. They are also a dangerous way to assert answers. Asking models for certainty or consensus is more a sign of the difficulties in making controversial decisions than it is a solution. … We are calling not for an end to quantification, nor for apolitical models, but for full and frank disclosure. Following these five points will help to preserve mathematical modelling as a valuable tool. Each contributes to the overarching goal of billboarding the strengths and limits of model outputs. Ignore the five, and model predictions become Trojan horses for unstated interests and values. Model responsibly.’

The modellers’ apologia

The SACMC modellers no doubt would like to think that they have been fully cognisant of these issues, and have clearly communicated the limitations of their models to their political clients. However, their explanation of their own results raises more questions than it answers.

If, as the local modellers concede, there is significant uncertainty regarding almost all central aspects of how the virus behaves, including its prevalence, reproduction rate, the proportion of infected people who remain asymptomatic, the role of seasonality, whether cross immunity to the virus from other infections exists, whether immunity to the virus itself persists, and the impact of non-pharmaceutical interventions such as physical distancing and mask wearing, on what basis can we have any confidence at all in the model outputs?

So-called Susceptible-Exposed-Infectious-Recovered (SEIR) models have failed miserably in countries that were well ahead of us on the infection curve. Why, then, do the SACMC modellers continue to rely on such a model, and why, given that they have historical data from those countries, is there still ‘considerable uncertainty regarding the likely trajectory of the Covid-19 epidemic in South Africa overall, and perhaps more so due to recent flattening of the growth in cases in the Western Cape’?

They claim to have incorporated international experience into the models, but their models would have failed in countries that are ahead of South Africa on the curve, so should be expected to fail in South Africa, too.

The modellers say that they ‘present [their] estimates with uncertainty bands reflective of variation in the parameters driving the model and the model process itself’. Yet even with very wide uncertainty bands, some of the modelled projections failed on the low side, within a month of publication.

Best effort rather poor

They chose one long-term projection and one short-term projection to demonstrate how well they did. Both estimated the death count on 1 July.

The long-term projection, published on 6 May, ranged from an optimistic scenario of 822 deaths (between 431 and 1 618) to a pessimistic scenario of 5 486 deaths (between 2 849 and 9 869).

To say ‘somewhere between 400 and 10 000-odd deaths’ is not a projection. It’s a guess, hedged to meaninglessness. Even ‘between 822 and 5 486 deaths’ is not a useful projection. That the actual death toll of 2 952 on 1 July falls in that massive range should hardly be a point of pride. Any undergraduate with a basic curve-fitting technique could have given you a similar answer.
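To put a number on how wide these ranges are, here is a minimal sketch using only the figures quoted above. It computes the ratio of the upper to the lower bound of each scenario and checks whether the reported 1 July toll falls inside it. The function and variable names are illustrative and are not taken from the SACMC code.

```python
# Figures quoted in the article for the 6 May long-term projection (deaths by 1 July).
ACTUAL_DEATHS = 2952

SCENARIOS = {
    "optimistic": (431, 1618),
    "pessimistic": (2849, 9869),
    "combined": (431, 9869),  # lowest lower bound to highest upper bound
}


def fold_width(low, high):
    """Ratio of the upper bound to the lower bound of a projection range."""
    return high / low


def covers(low, high, actual):
    """True if the actual value lies inside the projection range."""
    return low <= actual <= high


for name, (low, high) in SCENARIOS.items():
    width = fold_width(low, high)
    inside = covers(low, high, ACTUAL_DEATHS)
    print(f"{name}: {width:.1f}-fold range, covers actual: {inside}")
```

On these figures, the combined range spans roughly a 23-fold difference between its lowest and highest bound. The actual toll lies inside the pessimistic range and outside the optimistic one.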

Even the short-term projection, published on 12 June, had a massive error range. The estimate was 3 810 deaths (between 1 880 and 7 270), which reflects a 50% error bar on the downside, and a 90% error bar on the upside. The actual death toll does fall within this spectacularly wide range, as one would expect, but the projection still exceeded reality by a massive 30%.
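The percentages above can be checked with a few lines of arithmetic. The sketch below is only an illustration using the numbers quoted in this article. It expresses each error bar as a fraction of the central estimate, and the overshoot as a fraction of the actual toll. The function names are hypothetical.

```python
def relative_bars(estimate, low, high):
    """Return (downside, upside) error bars as fractions of the central estimate."""
    downside = (estimate - low) / estimate
    upside = (high - estimate) / estimate
    return downside, upside


def overshoot(estimate, actual):
    """Fraction by which the central estimate exceeds the actual outcome."""
    return (estimate - actual) / actual


# Short-term projection published on 12 June, for deaths by 1 July.
ESTIMATE, LOW, HIGH = 3810, 1880, 7270
ACTUAL = 2952

down, up = relative_bars(ESTIMATE, LOW, HIGH)
over = overshoot(ESTIMATE, ACTUAL)

print(f"downside bar: {down:.0%}")
print(f"upside bar:   {up:.0%}")
print(f"overshoot:    {over:.0%}")
```

This gives about 51% on the downside, 91% on the upside and an overshoot of about 29%, consistent with the rounded figures quoted above.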

Could you plan a party, knowing that you’d have to cater for between 18 and 72 people? Or worse, between four and 100 people? That’s how useful the SACMC model projections have been. Now imagine you’re the manager of a cash-strapped, understaffed hospital, facing these sorts of numbers and being expected to plan for them.

And this is their best effort, which they chose to highlight. That the models repeatedly failed on the downside is never explained.

Western Cape breaks models

The modellers observe that deaths in the Western Cape have plateaued for four weeks, and hospital admissions in the province peaked on 22 June. Their models did not predict this. In fact, they got the Western Cape’s numbers spectacularly wrong. The death count was lower than their most optimistic projection. They over-estimated the number of ICU beds that would be needed by between 12 and 16.5 times. The most optimistic projection for total beds needed was more than double the actual number at the peak.

They have no explanation for the plateau in the Western Cape, although Panda’s own very basic model, based on nothing but publicly available data, predicted it far more accurately.

Worse, the modellers wonder whether they over-estimated the Covid-19 deaths in the Western Cape, and conclude that they don’t know. How much more is there to know, except to note that actual deaths were lower than their lower bound, and their models failed to anticipate the peak? Clearly, they did over-estimate. There’s no uncertainty about that.

Although the Western Cape numbers surprised them, the modellers claim they cannot infer anything from the change, or say what it means for the rest of the country. It’s almost as if real-world data is an inconvenience to them, rather than a critical input into their models.

Trojan horses

They say that ‘individual behaviour changes are unpredictable and difficult to model, [so] these are not included in our models’. A few paragraphs later, they say that they regularly update ‘our assumptions about the adherence of the population to the restrictions’.

Which is it? Do they model behaviour changes, or don’t they? If not, how can they justify their failure to model one of the single biggest factors in the spread of the disease? And if they do, where do they get the data on these behaviour changes?

They claim they’ve had to decrease their assumptions about how fast the virus spreads, because there turned out to be fewer cases than expected. But they have no idea why this happened. They speculate, based on no evidence at all, that it might have to do with adherence to restrictions. But that would conflict with the evidence that the Western Cape’s numbers have started to decline while restrictions were being lifted.

The SACMC modellers promise that they’ll continue to update their numbers, and ‘continue to strive towards making our models as useful as possible’. However, it is clear that their models are not very useful at all. Merely updating the numbers periodically will not fix that.

They are exactly what the Nature paper warns us against: ‘Trojan horses for unstated interests and values’. And we all know what those interests and values are.