The breaking news last night was that Chinese authorities decided to redefine what a “confirmed case” is. This led to an immediate jump in the number of confirmed cases as well as confirmed deaths. Does this mean that the model I laboured over for the past few days is ripe for the trash can? Not really!

Case redefinition

When studying an ongoing outbreak, a redefinition of what constitutes a “confirmed case” is troubling. On the one hand it messes up your statistics and hence makes it harder to forecast where things are going. On the other hand, especially when the redefinition increases the number of confirmed cases, it triggers discussions about the numbers that do not necessarily contribute to battling the outbreak.

But we are where we are! The redefinition by the Chinese authorities has a few positives. There were some indications that the reported case numbers were partly limited by constraints on the ability to administer viral-RNA tests. This change in case definition removes that constraint. The fact that Chinese authorities were willing to make it, despite the bad publicity of reporting a huge increase in cases, shows that they are not necessarily unwilling to share bad news.

The downside of the redefinition is that the new definition relies on either a positive viral-RNA test or clinical measurements such as a positive CT scan for pneumonia. However, these clinical measurements may be afflicted by (a) misdiagnoses leading to overestimates and (b) missed cases which show only moderate or weak clinical symptoms, leading to underestimates. So the downside really is that we start to lose track a little of which cases and which numbers we should take fully seriously. The added cases introduce additional uncertainties.


In my previous posts I indicated I saw little evidence in the official numbers for a mortality rate larger than 2.1%. Does the case redefinition change this? Here is a plot of the reported confirmed cases (where the last data-point uses the redefined cases) and the reported confirmed deaths.


The dots are the reported numbers and the line is a straight line which assumes a mortality of 2.2% and assumes that about 10 reported deaths are actually not attributable to COVID-19 at all. The new data point after redefinition fits neatly onto the line. The trend of a slow increase in the mortality rate, which seemed to emerge in the past 5 days of reporting, is not evident in the most recent data. As a result I still see little real evidence in the officially reported figures, including the figures after redefining what “confirmed infection” means, for mortality rates as large as 20% as reported by some.
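Written out, the line in the plot has the following form. The symbols are my own shorthand and not taken from the plot: C(t) is the cumulative number of confirmed cases reported on day t, D(t) the cumulative number of reported deaths, m the mortality and D_0 the number of reported deaths not attributable to the disease. Putting D_0 in as a positive offset is my reading of "not attributable".

```latex
D(t) \approx m \, C(t) + D_0, \qquad m \approx 0.022, \quad D_0 \approx 10
```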

If I assume that the reported deaths lag behind the reported cases by 6 days, i.e. that the correct mortality rate for day t should be computed from the cases reported at day t and the deaths reported at day t+6, then we get the following graph


which would indicate a mortality rate of 3.6% and a number of misallocated deaths of around 150. To push this to anything close to 12% mortality we would need to assume a lag in excess of 14 days, and this would require treating 400 of those deaths as misdiagnosed. But by effectively dropping 14 days from the range of days for which we have data, our estimates become increasingly spurious. I don’t see how the current data can be brought in line with a notion of a 20% mortality rate.
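The lagged comparison can be sketched as an ordinary least-squares line through the pairs (cases on day t, deaths on day t+lag). This is a minimal sketch under my own assumptions, not the code behind the graphs: the function name `fit_lagged_mortality` is hypothetical, and the inputs are assumed to be equally long lists of cumulative daily counts.

```python
def fit_lagged_mortality(cases, deaths, lag=0):
    """Fit deaths[t + lag] ~ m * cases[t] + d0 by ordinary least squares.

    cases, deaths: equally long sequences of cumulative daily counts.
    lag: assumed number of days by which deaths trail the case reports.
    Returns (m, d0, n): slope (mortality), intercept (deaths not explained
    by the slope) and the number of day-pairs that were actually usable.
    """
    if len(cases) != len(deaths):
        raise ValueError("cases and deaths must have the same length")
    if lag < 0:
        raise ValueError("lag must be non-negative")
    n = len(cases) - lag  # every day of lag removes one usable data point
    if n < 2:
        raise ValueError("need at least two day-pairs to fit a line")

    xs = list(cases[:n])             # cases on day t
    ys = list(deaths[lag:lag + n])   # deaths on day t + lag

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("case counts must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    m = sxy / sxx
    d0 = mean_y - m * mean_x
    return m, d0, n
```

Calling it with `lag=0` gives the unlagged line, with `lag=6` the lagged one. The returned `n` makes the caveat above concrete: each extra day of lag throws away one day of data, so a 14-day lag leaves very few points to fit.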

Adjusted model

The fact that the mortality does not seem to have changed as a result of the change in case definition points to something quite interesting. The cases confirmed through viral-RNA tests and those confirmed through clinical means are actually not that different. That would suggest that perhaps the underreporting is simply an underreporting by some fixed factor. If we compare, for yesterday’s reported numbers, the total confirmed cases under the new definition with the total reported cases under the old definition, then this ratio is roughly 1.28 to 1. If we scale the data of previous days by this factor of 1.28, so assuming that this has been a systematic but constant underreporting by 28%, and we analogously expand our quarantine figure of 25,000 as well as our quarantine-leak figures by the same 28%, we get the following graph:


In this graph all reported confirmed cases of infection of the past days have been multiplied by 1.28, reflecting that the new definition gives a 28% higher count. All the model parameters referring to numbers of patients have also been increased by 28%. But crucially, the probability of transmission of the infection and the mortality have been kept unchanged in all three scenarios plotted here. What we see is that the model we devised four days ago keeps doing very well. The redefinition of cases does not seem to require changing any assumptions we made about the infectiousness of the disease or about its mortality.
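The adjustment is simple enough to sketch in a few lines: counts get multiplied by 1.28, rates stay as they are. The class `ModelParams` and its field names are hypothetical, chosen only for illustration; they are not the model's actual variables.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ModelParams:
    # Hypothetical container; field names are illustrative only.
    quarantine_population: float      # a head count: scales with the definition
    leak_per_day: float               # a head count per day: scales as well
    transmission_prob_per_day: float  # a rate: left unchanged
    mortality: float                  # a rate: left unchanged


def rescale_for_new_definition(reported_cases, params, factor=1.28):
    """Scale all head counts by `factor`, leaving the rates untouched."""
    scaled_cases = [c * factor for c in reported_cases]
    scaled_params = replace(
        params,
        quarantine_population=params.quarantine_population * factor,
        leak_per_day=params.leak_per_day * factor,
    )
    return scaled_cases, scaled_params


# The 25,000 quarantine figure becomes 32,000 after scaling.
# The leak of 1,000 a day is a placeholder, not a value from the model.
old = ModelParams(25_000, 1_000, 0.41, 0.022)
_, new = rescale_for_new_definition([], old)
```

Keeping the rates out of the rescaling is the whole point: if the fit still works afterwards, the redefinition changed how many people are counted, not how the disease behaves.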

Quarantine and asymptomatic transmission

These three scenarios now work with a ‘quarantine population’ of initially 32,000 and a leak of between 1,400 and 1,800 people added every day. Note that the Chinese authorities are keeping about 3 times as many people in a more rigorous form of quarantine at the moment. My model’s ‘quarantine population’ consists exclusively of people who are quarantined together with people who carry and can pass on the infection! But the mortality and infectiousness of the three scenarios have not changed relative to the previously posted estimates of a mortality between 0% and 6%, a mortality lag between 0 and 3 days and a probability of infection of around 41% a day per confirmed case.

The reported figure that each patient infects about 2.6 others would be the result of each patient being infectious for about 7 days before being isolated, recovered or dead. So whether or not there is onward asymptomatic transmission depends on the time we estimate it takes for a patient with symptoms to be isolated, recovered or dead. In the model we assumed a 3-day lag in one of those scenarios, which would amount to 3 or 4 days of asymptomatic transmission.
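As a back-of-the-envelope check, here is the arithmetic under the simplest possible reading: the number of people one patient infects is the daily infection probability times the number of infectious days. That linear relation and the function name are my simplification for illustration; the model itself may compound things differently.

```python
def infectious_days(infections_per_patient, prob_per_day):
    """Days a patient must be infectious, assuming infections = prob * days."""
    return infections_per_patient / prob_per_day


total_days = infectious_days(2.6, 0.41)  # between 6 and 7 days
symptomatic_lag = 3                       # the 3-day lag of one scenario
asymptomatic_days = total_days - symptomatic_lag  # between 3 and 4 days

print(f"infectious for about {total_days:.1f} days, "
      f"of which about {asymptomatic_days:.1f} without symptoms")
```

So 2.6 infections at 41% a day needs a bit over 6 infectious days, and taking 3 days off for the symptomatic period leaves the 3 or 4 days mentioned above.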

Forecast & Conclusions

When we use the revised model, with the revised case definition implemented (no revision to the disease parameters was necessary), to calculate 30 days ahead, we get the following graph:


Due to the revised case definition and constant mortality we are now no longer talking about 77,000-80,000 cases by the end of the first week of March, but about 100,000-105,000. As a result the expected number of casualties by then would have increased to around 2,200. All of this is to be taken with a huge grain of salt because it assumes an unchanging quarantine leak of about 1,500 people a day.
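A rough sanity check on these figures, using only the 28% factor and the 2.2% mortality from above. This is plain scaling of the old forecast and not a rerun of the model, so it only needs to land in the same ballpark:

```latex
77{,}000 \times 1.28 \approx 98{,}600, \qquad
80{,}000 \times 1.28 = 102{,}400, \qquad
0.022 \times 100{,}000 = 2{,}200
```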

My conclusion remains that, even with the current change in case definition, the official figures still follow the very same reasonable narrative, except that it involves 28% more people than originally assumed. That still does not prove those numbers are true. But it does mean that exaggerated panic stories about global conspiracies lack any reasonable basis, and similarly that estimates of mortality above 15% still seem very much on the high side and not in line with what we see in the development of the spread.

