COVID-19 Models: Predicting the Unpredictable


COVID-19 is full of unknowns. An unknown origin. An even more unknown future. And what is the most fundamental fear of all living creatures known to mankind? The Unknown. Thus, as fearless tools, computers will no doubt play a fundamental role in stopping the pandemic, from modelling the virus itself to simulating its spread in different populations. One such model is known as the SIR model, and it can be used to simulate the effectiveness of lock-downs, social distancing, and other measures. Yet, as powerful as they are, will they prevail and shine a light onto humanity’s path to salvation? Or will they succumb to the universal predator?

Computer simulation is the process of running a mathematical model on a computer, often displaying the results visually, in order to predict events or outcomes in the real world. Models have been used to represent numerous human systems in economics, psychology and healthcare, as well as natural systems in physics, chemistry and biology. In a pandemic like COVID-19, computer simulations can be used to estimate the effectiveness of certain policies, and provide strong arguments for governments to take up these policies. They can also give governments ideas for when to change policies, or move into different phases of attack. Without simulations, imposing measures like lock-downs would be based on speculation and hope, rather than on more reliable facts and figures.

For decades, people have used computer simulations to model how a population changes over time. One famous example is Conway’s Game of Life, first popularised in 1970. In this simulation, a standard example of which is shown in Figure 1, each cell is either “alive” or “dead”, and its next state depends on the states of its neighbours (the eight cells around it). “Alive” cells stay “alive” if they have two or three “alive” neighbours, otherwise they “die”. Similarly, “dead” cells can only transition to the “alive” state if they have exactly three “alive” neighbours. On the surface, this seems like a very simple concept.
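The update rule is short enough to write out directly. The following is a minimal Python sketch, not the code behind Figure 1: it assumes an unbounded grid, stores the board as a set of (row, column) pairs for the “alive” cells, and uses the name `step` purely for illustration.

```python
from collections import Counter


def step(live):
    """Advance one generation; `live` is a set of (row, col) cells that are alive."""
    # For every cell next to at least one live cell, count its live neighbours.
    counts = Counter(
        (r + dr, c + dc)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Alive next generation: exactly three live neighbours,
    # or currently alive with exactly two.
    return {cell for cell, n in counts.items() if n == 3 or (n == 2 and cell in live)}


row_of_three = {(1, 0), (1, 1), (1, 2)}
next_generation = step(row_of_three)
print(sorted(next_generation))  # the horizontal row has become a vertical one
```

Only cells adjacent to a live cell are ever examined, so the cost of a generation depends on the number of live cells rather than on the size of the grid.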

Figure 1 – Simulation of the Game of Life
(white: “alive”, black: “dead”)

Yet, within this randomness, we begin to see patterns. These include:

  • Oscillators: Patterns that change but repeat themselves after a fixed number of iterations. Examples include a row of three cells that appears to “spin”.
  • Gliders: As shown in Figure 2, gliders will move across the environment as a single, unbroken entity.
  • Glider Guns: Figure 3 is an example. These structures generate a continuous stream of gliders.
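These behaviours can be checked in a few lines. The sketch below implements the update rule described earlier (on an assumed unbounded grid, with illustrative names such as `step` and `shift`) and confirms that the standard five-cell glider reappears one cell down and one cell to the right after four generations.

```python
from collections import Counter


def step(live):
    """Advance one generation of the Game of Life on an unbounded grid."""
    counts = Counter(
        (r + dr, c + dc)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    return {cell for cell, n in counts.items() if n == 3 or (n == 2 and cell in live)}


def shift(cells, dr, dc):
    """Translate every cell by (dr, dc)."""
    return {(r + dr, c + dc) for r, c in cells}


# The standard glider:
#   . X .
#   . . X
#   X X X
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}

state = glider
for _ in range(4):
    state = step(state)

# Same shape, moved one row down and one column right.
print(state == shift(glider, 1, 1))  # prints True
```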

These patterns show a sense of emergence, as well as self-organisation. Emergence refers to the idea that the whole is greater than the sum of its parts, and that it is better to work together. Self-organisation is defined as the spontaneous formation of patterns without guidance. Both of these traits are also present in communities of organisms, thus the name: the Game of Life. In this simulation, with fewer than five rules, we were able to produce a system resembling some kind of society, albeit an incredibly simplistic one. This shows the power that models have to generalise, yet also suggests the limitations they have when it comes to details and unpredictability.

One of the most basic models of the spread of an infectious disease over time is known as the SIR (Susceptible, Infectious, Recovered) model, a simulation of which can be found in Figure 4. This type of model is known as a compartmental model: a model where the population is split into distinct categories. We’ll take a look at the version without vital dynamics, i.e. where the population is fixed, and we’ll assume one unit of time to be one day.

Figure 4 – Spatial SIR Model Simulation

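The SIR model depends on a number of different variables:

  • s(t), i(t) and r(t): the fractions of the population that are susceptible, infectious and recovered on day t.
  • b: the number of susceptible people that one infectious person infects per day.
  • k: the proportion of the infectious population that recovers per day.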

Let’s derive some equations from these variables. In the following equations, dV/dt represents how a variable V changes from one day to the next.

The susceptible population can only transition into the infectious compartment, and this can only be done by an infectious person at a rate of b people per day. Writing s(t) and i(t) for the susceptible and infectious fractions on day t, the rate of decrease of the susceptible fraction can be represented as follows:

\[ \frac{ds}{dt} = -b\,s(t)\,i(t) \]

Then, only the infectious can become recovered, and a proportion k of them does so per day. Writing r(t) for the recovered fraction:

\[ \frac{dr}{dt} = k\,i(t) \]

However, since this model doesn’t account for vital dynamics, there should be no loss or gain of people. This means that the total population should remain constant, and that:

\[ s(t) + i(t) + r(t) = 1 \]

Either through differentiation, or by understanding that the net change in the different compartments will always be zero, we can then deduce that:

\[ \frac{di}{dt} = -\frac{ds}{dt} - \frac{dr}{dt} = b\,s(t)\,i(t) - k\,i(t) \]

Given these equations for each compartment, we can start simulating. To begin with, only one person is infectious and the rest are susceptible. Figure 5 shows how increasing b, the number of infections per day, both increases the peak in infections and brings it earlier.

Figure 5 – Series of simulations to show the effect of varying b on the SIR model

Figure 6 shows how increasing k, the proportion of the infectious recovering per day, flattens the curve by lowering the peak in infections.

Figure 6 – Series of simulations to show the effect of varying k on the SIR model
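The parameter sweeps behind Figures 5 and 6 can be approximated with a few lines of Python. This is a sketch under stated assumptions rather than the code used for the figures: it advances the three SIR equations (ds/dt = -b s i, di/dt = b s i - k i, dr/dt = k i) one day at a time with a simple forward Euler step, and the population of 1,000, the values of b and k, and the function names are all picked purely for illustration.

```python
def simulate_sir(b, k, population=1000, days=200):
    """Step the basic SIR model one day at a time.

    Returns a list of (s, i, r) fractions, one entry per day.
    The population size is an assumed, illustrative value.
    """
    # Start with a single infectious person; everyone else is susceptible.
    s, i, r = 1 - 1 / population, 1 / population, 0.0
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = b * s * i   # leaves s, enters i
        new_recoveries = k * i       # leaves i, enters r
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


def peak(history):
    """Return (day, infectious fraction) at the height of the outbreak."""
    day, (_, i, _) = max(enumerate(history), key=lambda item: item[1][1])
    return day, i


# Vary b with k fixed, then vary k with b fixed (illustrative values).
for b in (0.3, 0.5, 0.8):
    day, height = peak(simulate_sir(b, k=0.1))
    print(f"b={b}: peak of {height:.2f} on day {day}")

for k in (0.05, 0.1, 0.2):
    day, height = peak(simulate_sir(b=0.5, k=k))
    print(f"k={k}: peak of {height:.2f} on day {day}")
```

With these values the peak rises and arrives sooner as b grows, and falls as k grows. A daily Euler step is crude; a smaller step or a proper ODE solver would give smoother curves.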

This version of the SIR model frankly doesn’t provide much guidance as to how to deal with the pandemic, so we can adapt our model to include the incubation time, the death rate, and a lock-down after a specific number of days. Let’s define more compartments and variables:

Once a person is infectious, they either die (probability of a) or they recover (probability of 1-a). Each of the following equations is a simple probability times proportion times infected fraction.

We can leave the equation for s(t) unchanged, as none of the new conditions affects the number of susceptible people. However, rather than the susceptible population transitioning straight into the infectious population, they now have to go through the exposed phase, where they are in their incubation period. Since the incubation period is D days, an average of 1/D of the exposed population will become infectious per day. This means:

Now, the infectious population must have transitioned from the exposed and must transition to either the recovered or the dead. This gives our final equation:
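Collected in one place, the extended model can be written as the system below. This is one reading of the description above rather than a definitive statement of the model. The names e(t) and d(t) for the exposed and dead fractions are assumed, and the per-day proportion attached to deaths (the parameter the text calls r when it assigns values) is written as \rho here to keep it distinct from the recovered fraction r(t).

```latex
\begin{align*}
\frac{ds}{dt} &= -b\,s(t)\,i(t) \\
\frac{de}{dt} &= b\,s(t)\,i(t) - \frac{1}{D}\,e(t) \\
\frac{di}{dt} &= \frac{1}{D}\,e(t) - (1-a)\,k\,i(t) - a\,\rho\,i(t) \\
\frac{dr}{dt} &= (1-a)\,k\,i(t) \\
\frac{dd}{dt} &= a\,\rho\,i(t)
\end{align*}
```

The five right-hand sides sum to zero, so s(t) + e(t) + i(t) + r(t) + d(t) stays equal to 1, just as in the basic model.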

According to a study conducted by American scientists[1], in which 181 confirmed cases of COVID-19 were examined, the median incubation period for COVID-19 was 5.1 days, with only 1% of cases developing symptoms after 14 days. In order to account for the worst-case scenario, let D = 14. Based upon the infection fatality rate of the first Italian outbreak[2], let a = 0.00129. Finally, based upon assumption, let r = 0.05 and k = 0.143 (approx. 1/7).

Lock-down decreases the number of susceptible people that one infectious person infects per day: b. Since k is the proportion of infectious people who recover per day, 1/k would be the number of days before recovery for each infectious person. The total number of people an infectious person infects, R, is the number infected per day multiplied by the number of days whilst infectious, so:

\[ R = b \times \frac{1}{k} = \frac{b}{k} \]

Equivalently, b = Rk, which is how a chosen value of R is converted into a value of b.

After configuring the variables, the computer simulation was run, with population size equal to the UK population (67.8 million), only one infectious person to begin with, and R set to 5.0 before lock-down and 0.5 during lock-down. This is shown in Figure 7, which gives an idea as to just how effective imposing a lock-down as early as possible is. According to this model, with a lock-down 200 days into the pandemic, 305,803 people will die from COVID-19 in one year. A lock-down 150 days into the pandemic reduces this number to 206,813 deaths. Locking down a further 50 days earlier brings the death count down to 1,499, and if the lock-down were imposed within the first 50 days, the maximum death count would be 3.

Figure 7 – Series of simulations to show the effect of lock-down at different times in the pandemic on the number of infected and dead
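The lock-down experiment can be sketched along the same lines. The code below is an illustration under several assumptions, not the simulation that produced Figure 7, and it should not be expected to reproduce the death counts quoted above. It assumes that exposed people become infectious at a rate of 1/D per day, that the infectious recover at a rate of (1 - a)k and die at a rate of a times the assumed proportion 0.05 (named `rho` in the code; the text calls it r), that b = R × k, that the lock-down stays in force once imposed, and that the equations are advanced one day at a time. All function and variable names are invented for the example.

```python
def simulate_seird(lockdown_day, days=365, population=67.8e6,
                   r_before=5.0, r_during=0.5,
                   k=0.143, D=14, a=0.00129, rho=0.05):
    """Step an SEIRD model one day at a time and return the final
    (s, e, i, r, d) fractions. Parameter values are those quoted in
    the text; the structure of the death term is an assumption.
    """
    # One infectious person to begin with; nobody exposed, recovered or dead.
    s, e, i, r, d = 1 - 1 / population, 0.0, 1 / population, 0.0, 0.0
    for day in range(days):
        # R drops once the lock-down starts, and b = R * k follows it.
        reproduction = r_before if day < lockdown_day else r_during
        b = reproduction * k
        new_exposed = b * s * i            # s -> e
        new_infectious = e / D             # e -> i
        new_recovered = (1 - a) * k * i    # i -> r
        new_dead = a * rho * i             # i -> d
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered - new_dead
        r += new_recovered
        d += new_dead
    return s, e, i, r, d


uk_population = 67.8e6
for lockdown_day in (50, 100, 150, 200):
    dead_fraction = simulate_seird(lockdown_day)[4]
    print(f"lock-down on day {lockdown_day}: "
          f"{dead_fraction * uk_population:,.0f} deaths after one year")
```

Whatever the exact parameter choices, the qualitative pattern is the one the article describes: the earlier the switch from the higher to the lower R, the fewer deaths the model records within the year.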

Even with this simplistic SEIRD model, we can clearly understand that an early lock-down is crucial to flattening the curve and stopping the pandemic. Yet it has some fatal flaws and limitations when it comes to predicting the number of deaths. It assumes a universal fatality rate for all people, which is not the case. The fatality rate was estimated at 4.25% for people above 60 years of age, yet only 0.05% for those under 60 years of age[2]. This model also assumes that the recovered aren’t susceptible, i.e. that you cannot contract the virus twice. This is, once again, an unknown of COVID-19. In fact, most of the variables used in the model were assumed or estimated because they were unknown, and this is where COVID-19 models start to break down.

With an event as new and unprecedented as COVID-19, there will always be assumptions and therefore inaccuracies within models, simulations and predictions. For example, in order to predict the number of deaths in America due to COVID-19, numerous models published by infectious disease researchers have each added their own assumptions, leading to vastly different estimates. Three examples, their assumptions, and their predictions for May 30th are as follows:

  • The University of Texas model, which assumes that people’s movement levels won’t change from the previous week, predicts 93,000 deaths.
  • The Columbia University model, which assumes that there will be a 20% reduction in contact between people compared to the previous week, predicts 104,000 deaths.
  • The Institute for Health Metrics and Evaluation model, which assumes that current policies and movement patterns will continue until the number of new infections becomes very small, predicts 110,000 deaths.

We don’t know if COVID-19 can be spread during its incubation period. Nor do we know its true infection rate, or many other characteristics of the virus. Because predicting from minimal data is difficult and unreliable, these models are often updated daily to make use of newly collected statistics, whether on mobility, social distancing policies, or testing rates, so that the information people receive is as reliable as possible, as soon as possible.

COVID-19 is full of unknowns. This makes the development of models a fundamentally difficult task, yet through the use of computers, we have simulations that can make predictions with 95% accuracy and can guide us when decision-making. Modelling is crucial, as it is neither made to incite fear, nor to give false hope, but rather to openly provide information and help us better understand what we need to do and why. Whether or not models and computer simulation are susceptible to the unknown, we will need them in order to minimise the fatality rate, and to ensure that the maximum fraction of the infectious population recovers successfully. After all, cooperation is a key factor in the Game of Life.


  1. Lauer SA, Grantz KH, Bi Q, et al., “The Incubation Period of Coronavirus Disease 2019 (COVID-19) From Publicly Reported Confirmed Cases: Estimation and Application”, Annals of Internal Medicine, May 5, 2020
  2. Gianluca Rinaldi, Matteo Paradisi, “An empirical estimate of the infection fatality rate of COVID-19 from the first Italian outbreak”, medRxiv, April 23, 2020
  3. Elena Loli Piccolomini, Fabiana Zama, “Monitoring Italian COVID-19 spread by an adaptive SEIRD model”, medRxiv, April 6, 2020
  4. Lester Caudill, “Lack of data makes predicting COVID-19’s spread difficult but models are still vital”, The Conversation, April 15, 2020
  5. Ryan Best, “Where The Latest COVID-19 Models Think We’re Headed — And Why They Disagree”, FiveThirtyEight, May 8, 2020
  6. Eugene M. Izhikevich et al., “Game of Life”, Scholarpedia, 2015
  7. Henri Froese, “Infectious Disease Modelling: Beyond the Basic SIR Model”, Towards Data Science, April 11, 2020
  8. Harry Stevens, “Why outbreaks like coronavirus spread exponentially, and how to ‘flatten the curve’”, Washington Post, March 14, 2020

