Original Article| Volume 136, P96-132, August 2021

# Effect estimates of COVID-19 non-pharmaceutical interventions are non-robust and highly model-dependent

Published: March 26, 2021

## Highlights

• Different SIR models developed by the same modeling team on the effectiveness of various non-pharmaceutical interventions (NPIs) for COVID-19 were compared.
• The model proposing major benefits from lockdown in European countries had the worst fit to the data.
• Models with better fit to the data showed little or no benefit from lockdown.
• Inferences on the effects of non-pharmaceutical interventions are non-robust and depend on model specification and selection.

## Abstract

### Objective

To compare the inference regarding the effectiveness of the various non-pharmaceutical interventions (NPIs) for COVID-19 obtained from different SIR models.

### Study design and setting

We explored two models developed by Imperial College that considered only NPIs without accounting for mobility (model 1) or only mobility (model 2), and a model accounting for the combination of mobility and NPIs (model 3). Imperial College applied models 1 and 2 to 11 European countries and to the USA, respectively. We applied these models to 14 European countries (original 11 plus another 3), over two different time horizons.

### Results

While model 1 found that lockdown was the most effective measure in the original 11 countries, model 2 showed that lockdown had little or no benefit as it was typically introduced at a point when the time-varying reproduction number was already very low. Model 3 found that the simple banning of public events was beneficial, while lockdown had no consistent impact. Based on Bayesian metrics, model 2 was better supported by the data than either model 1 or model 3 for both time horizons.

### Conclusion

Inferences on effects of NPIs are non-robust and highly sensitive to model specification. In the SIR modeling framework, the impacts of lockdown are uncertain and highly model-dependent.

## 1. Introduction

Until effective and safe vaccines can become widely available, the levers of policy makers to manage COVID-19 have included non-pharmaceutical interventions (NPIs), such as social distancing mandates, travel restrictions, self-isolation, banning of public events, closure of schools, and ultimately complete lockdown. These measures aim to reduce infections by decreasing contact between individuals. Given that multiple NPIs are often introduced in quick succession, it is difficult to separate their effects.
Here, we compare the inferences regarding the effectiveness of various NPIs obtained from different SIR (susceptible-infected-removed) models. The first model (model 1) was produced by the Imperial College COVID-19 Response Team and led to arguably the most influential publication to date in support of large benefits from total lockdown [Flaxman et al., Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe]. Its publication in Nature [Flaxman et al.] concluded that complete lockdown was responsible for 80% of the reduction in the time-varying reproduction number, $R_t$, and that 3.1 million deaths were avoided in 11 European countries due to lockdown.
The Imperial College team also developed and applied a different model (model 2) to the USA [Unwin et al., State-level tracking of COVID-19 in the United States], which assumes $R_t$ varies as a function of mobility. In model 2, there is no explicit causal link between an NPI and $R_t$; NPIs enter the model indirectly via their effects on mobility. Inference regarding the (complex) impact of NPIs is possible by observing the $R_t$ trajectory at the time of the intervention(s).
Our central point is that one has to resolve uncertainty not only about fundamental epidemiological variables such as the time-varying reproduction number, but also about the form or structure of the model used to estimate these variables. In Bayesian statistics this is known as model comparison, while in other fields it is known as structure learning. We emphasize the potential importance of model comparison in quantitative epidemiology: conclusions about the efficacy of various interventions depend sensitively on the ability to compare one model with another. We illustrate this point with a worked example based upon an early assessment of NPIs during the first wave of the current coronavirus outbreak.
In particular, we compare the results and performance (fit to the data) of models 1 and 2 when applied to the original 11 countries plus another 3 European countries for which data were available but which had not been included in the original publication. We also consider a third model (model 3), a hybrid of the first two, that considers both mobility and various NPIs together. We aim to understand whether inferences are robust to model specification and whether some models provide a better fit than others. It is important to note that all three models were proposed (and, in the case of the first two, implemented) by the Imperial College team.

## 2. Methods

### 2.1 Data

We compare the impact of NPIs and mobility on $R_t$ for three models, two time horizons and two sets of European countries. Specifically,
1. For all models, we examine the evolution of $R_t$ for two time horizons: up to May 5th (the end date chosen by Flaxman et al. [Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe]), and up to July 12th, to allow investigating both the imposition and lifting of various NPIs.
2. The original publication by Flaxman et al. [Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe] had included 11 European countries (Austria, Belgium, Denmark, France, Germany, Italy, Norway, Spain, Sweden, Switzerland, United Kingdom). However, suitable data were also available for the Netherlands, Portugal, and Greece; therefore we also consider 14 countries.
Seeding of new infections in all models is chosen to be 10 days before the day a given country has cumulatively observed 10 deaths, so that mobility data are available for all countries examined, thus allowing a fair comparison between models. Flaxman et al. [Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe] chose the seeding of new infections to be 30 days before a country has cumulatively observed 10 deaths. We alter the prior for the initial infection count, which is a model parameter inferred from the posterior distribution, to reflect this modification. Seeding dates appear in Table A.1.
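The seeding rule can be sketched in a few lines of Python. This is only an illustration: the function name `seeding_date`, its parameters, and the toy death counts are hypothetical and not taken from the paper or its code, and "has cumulatively observed 10 deaths" is assumed to mean the first day on which the cumulative count reaches at least 10.

```python
from datetime import date, timedelta

def seeding_date(daily_deaths, first_day, lead_days=10, death_threshold=10):
    """Return the date `lead_days` before cumulative deaths first reach the threshold.

    daily_deaths: daily death counts, starting on `first_day`.
    Returns None if the threshold is never reached.
    """
    cumulative = 0
    for offset, deaths in enumerate(daily_deaths):
        cumulative += deaths
        if cumulative >= death_threshold:
            return first_day + timedelta(days=offset - lead_days)
    return None

# Hypothetical daily death counts starting on 1 March 2020 (not real data).
toy_deaths = [0, 1, 0, 2, 3, 5, 8]

# 10-day lead (this study) versus 30-day lead (Flaxman et al.).
print(seeding_date(toy_deaths, date(2020, 3, 1), lead_days=10))
print(seeding_date(toy_deaths, date(2020, 3, 1), lead_days=30))
```

Moving from a 30-day to a 10-day lead shifts the seeding date 20 days later, which is why the prior on the initial infection count has to be adjusted.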
For mobility data we follow Unwin et al. [State-level tracking of COVID-19 in the United States], and use Google’s COVID-19 Community Mobility Report, which provides data measuring the percentage change in mobility compared to a baseline level for visits to: retailers and recreation venues, grocery markets and pharmacies, parks, transit stations, workplaces and residential places. We use the average change in mobility across all locations, excluding residential places and parks. Mobility indicators are proxies for changes in human behavior and of exposure risk (the number of close contacts and duration of contact). Behavior change could be due to one or more centrally imposed interventions or the product of individuals responding to the epidemic on their own initiative.
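As a minimal sketch of the averaging step described above, the following Python snippet computes the mean percentage change over the retained categories. The dictionary keys and the numbers are hypothetical placeholders, not the actual field names or values of the Google report.

```python
from statistics import mean

# Categories left out of the average, as described in the text.
EXCLUDED = {"residential", "parks"}

def average_mobility(day_report):
    """Mean percentage change across categories, skipping the excluded ones."""
    kept = [value for name, value in day_report.items() if name not in EXCLUDED]
    return mean(kept)

# Hypothetical one-day report for one country (percentage change from baseline).
toy_day = {
    "retail_and_recreation": -60.0,
    "grocery_and_pharmacy": -20.0,
    "parks": 15.0,
    "transit_stations": -50.0,
    "workplaces": -30.0,
    "residential": 12.0,
}

print(average_mobility(toy_day))
```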

### 2.2 Model 1 (all NPIs considered)

In model 1, the evolution of $R_t$ is given by
$$R_{t,m} = R_{0,m}\exp\left(-\sum_{k=1}^{6}\alpha_k I_{k,t,m} - \beta_m I^{*}_{t,m}\right), \tag{1}$$

where $R_{t,m}$ is the effective reproduction number for country $m$ at time $t$ and $I_{k,t,m}$ is an indicator variable, where $I_{k,t,m}=1$ if NPI $k$ is in place at time $t$ for country $m$ and $I_{k,t,m}=0$ otherwise, for $k=1,\dots,6$. The subscript $k$ refers to the various NPIs (Table A.2), whose timeline and definition are given in Supplementary Table 2 of Flaxman et al. [Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe]. The covariate $I^{*}_{t,m}$ is an indicator variable for the last imposed intervention, allowing for country-specific random effects given by $\beta_m$. In all countries except Sweden, this was lockdown; see Flaxman et al. [Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe] for details. For the analysis up to July 12th, when some of the NPIs have been lifted, we allow the impact of lifting an NPI on $R_t$ to be different in magnitude from the impact of imposing that NPI in the first place. The timing of lifting NPIs in different countries appears in Table A.3.
In Equation (1), the proportional variation of $R_t$ from the initial $R_0$ is modeled as a step function and only allowed to change, immediately so, in response to an intervention. Therefore, any decrease in $R_t$ (even if this decrease is a result of the increasing proportion of the population who are infected, changes in human behavior, clustered contact structures and/or pre-existing immunity [Grifoni et al., Targets of T cell responses to SARS-CoV-2 coronavirus in humans with COVID-19 disease and unexposed individuals]) must, by construction, be attributed to interventions; the impact of a new intervention is immediate without time lag or gradual change. This assumption is clearly made for simplicity but is unrealistic.
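To make the step-function behavior of Equation (1) concrete, here is a small Python sketch that evaluates it for one country on one day. The function name and all numerical values are hypothetical and chosen only for illustration; they are not estimates from any of the models.

```python
import math

def rt_model1(r0, alphas, indicators, beta_m, last_indicator):
    """Evaluate Equation (1) for one country on one day.

    alphas, indicators: one entry per NPI (six in model 1).
    last_indicator: I*_{t,m}, the flag for the last imposed intervention.
    """
    linear = sum(a * i for a, i in zip(alphas, indicators))
    linear += beta_m * last_indicator
    return r0 * math.exp(-linear)

# Hypothetical parameter values (illustration only, not fitted values).
R0 = 3.0
ALPHAS = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
BETA_M = 0.5

before = rt_model1(R0, ALPHAS, [0, 0, 0, 0, 0, 0], BETA_M, 0)
after = rt_model1(R0, ALPHAS, [1, 1, 1, 1, 1, 1], BETA_M, 1)
print(before, after)
```

Because the indicators are the only time-varying inputs, the value returned can change only on days when an indicator flips, which is the step-function property criticized in the text.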

### 2.3 Model 2 (Mobility Only Considered)

In model 2, the proportional variation of $R_t$ from $R_0$ is allowed to vary with mobility. Model 2 does not presume $R_t$ follows a step function and is therefore capable of capturing more gradual changes over time. The impact of mobility on $R_t$ is allowed to vary across countries by use of country-specific random effects terms. Specifically,
$$R_{t,m} = R_{0,m}\cdot f\left(-\alpha X_{t,m} - \beta_{1,m} - \beta_{2,m}X_{t,m} - \epsilon_{m,w(m)(t)}\right), \tag{2}$$

where $f(x)=\dfrac{2\exp(x)}{1+\exp(x)}$ is twice the inverse of the logit function, $X_{t,m}$ is the average change in mobility, excluding residential places and parks, at time $t$ for country $m$, and $\epsilon_{m,w(m)(t)}$ is a weekly AR(2) process centered around zero. In Equation (2), $\alpha$ is a measure of the impact of the average change in mobility on $R_t$ which is common to all countries, while $\beta_{2,m}$ measures country-specific deviations from this common value. The advantage of model 2 is that it gives a more flexible estimate of $R_t$, allowing it to change with mobility trends. Although NPIs are not explicitly included in the model, the impact of an NPI can be measured by observing the value of $R_t$, and its subsequent change, when specific interventions were imposed.
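The link function in Equation (2) can be sketched as follows. This is an illustrative Python snippet under stated assumptions: the function names are hypothetical, the AR(2) term is passed in as a plain number rather than simulated, and no parameter values from the paper are used.

```python
import math

def f(x):
    """Twice the inverse logit: maps the real line to the interval (0, 2)."""
    return 2.0 * math.exp(x) / (1.0 + math.exp(x))

def rt_model2(r0, alpha, beta1_m, beta2_m, mobility, eps=0.0):
    """Evaluate Equation (2) for one country on one day.

    mobility: X_{t,m}, the average change in mobility.
    eps: value of the weekly AR(2) term (supplied by the caller in this sketch).
    """
    return r0 * f(-alpha * mobility - beta1_m - beta2_m * mobility - eps)

# With no mobility change and zero random effects the argument of f is 0,
# f(0) = 1, and R_t equals R_0.
print(rt_model2(3.0, alpha=0.5, beta1_m=0.0, beta2_m=0.0, mobility=0.0))
```

Since $f$ is bounded above by 2, this form caps $R_t$ at twice $R_0$, and because mobility enters as a continuous covariate, $R_t$ can drift gradually rather than jump.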

### 2.4 Model 3 (Mobility and NPIs jointly considered)

After communication with the Imperial College team, we also consider a third model (model 3) which jointly includes mobility and NPIs. The motivation behind the formulation of model 3 is to attempt to untangle the impacts of mobility, lockdown and other NPIs. In our communication, the Imperial College team proposed a similar model but only included mobility and a single NPI – lockdown – in their model. Given that our goal is to quantify the relative contributions of several NPIs, we consider all NPIs, and mobility.
However, we caution against using model 3 as a tool for inference. NPIs may impact mobility in possibly non-linear, non-additive, lagged and interactive fashions, with possibly complex feedback. We include this model here to compare its performance against models 1 and 2. The functional form of $R_{t,m}$ in model 3 is:
$$R_{t,m} = R_{0,m}\cdot f\left(-\alpha_0 X_{t,m} - \sum_{k=1}^{5}\alpha_k I_{k,t,m} - \beta_{1,m} - \beta_{2,m}X_{t,m} - \epsilon_{m,w(m)(t)}\right). \tag{3}$$

In brief, we use Bayesian model inversion to evaluate the evidence for a particular model and the posterior density over the parameters of that model. The models in question generate new confirmed cases and daily deaths reported from a series of countries. We use a conventional SIR (susceptible-infected-removed) model that, given initial conditions and a time-varying reproduction number, enables us to generate the expected incidence of new cases and fatalities over a specified time period. In these models, the time-varying reproduction number is parameterized in terms of known events or fluctuations (here, the onset and offset of NPIs or fluctuations in mobility using proxy measures). The functional form relating these known fluctuations to the time-varying reproduction number defines the structure of various models. Once that form has been specified, one can then use standard sampling procedures (e.g., Stan) to evaluate the posterior over the model parameters that best explain the incidence of new cases and deaths. Finally, the quality of the model can be assessed with the model evidence (also known as marginal likelihood). Here, we approximate model evidence with standard information criteria, acknowledging their limitations (please see discussion in Appendix B.2).
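The generative step (from a reproduction-number trajectory to expected incidence) can be illustrated with a textbook discrete-time SIR recursion. This is a generic sketch, not the implementation used for models 1 to 3: the function name, the recovery rate `gamma`, and the relation `beta = R_t * gamma` are assumptions introduced here for illustration.

```python
def simulate_sir(rt_series, population, initial_infected, gamma=0.2):
    """Daily new infections from a discrete-time SIR model with time-varying R_t.

    Assumes a transmission rate beta_t = R_t * gamma, where gamma is a
    hypothetical daily removal rate (not a value from the paper).
    """
    susceptible = float(population - initial_infected)
    infected = float(initial_infected)
    new_infections = []
    for rt in rt_series:
        beta = rt * gamma
        infections = min(susceptible, beta * susceptible * infected / population)
        removals = gamma * infected
        susceptible -= infections
        infected += infections - removals
        new_infections.append(infections)
    return new_infections

# Two hypothetical R_t trajectories over the same ten days.
growing = simulate_sir([3.0] * 10, population=1_000_000, initial_infected=100)
shrinking = simulate_sir([0.5] * 10, population=1_000_000, initial_infected=100)
print(growing[-1], shrinking[-1])
```

In a full analysis the `rt_series` argument would come from Equation (1), (2) or (3), and the simulated incidence would be linked to observed deaths through a likelihood; those parts are omitted here.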
For a more technical discussion of prior specification and Bayesian measures of model fit for all models, see Appendix B.

## 3. Results

### 3.1 Mobility

Figs. 1 and A.1 show that for most countries the initial reduction in mobility preceded the date of the first lockdown. This suggests that people’s behavior changed in response to earlier, less severe interventions such as banning of public events and social distancing, and/or as a result of individual choices in the face of an unknown, but potentially catastrophic, pandemic.

### 3.2 Convergence diagnostics

Convergence diagnostics (trace plots and $\hat{R}$ [Gelman and Rubin, Inference from iterative simulation using multiple sequences; Tanner, Tools for statistical inference]) for all three models and both time horizons appear in Fig. A.2, providing strong evidence that the Markov chains have converged.
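For readers unfamiliar with $\hat{R}$, the classic potential scale reduction factor of Gelman and Rubin compares between-chain and within-chain variance. The Python sketch below implements that classic formula for a single scalar parameter; it is not necessarily the exact variant computed by the sampling software used for the analyses (modern tools often use split or rank-normalized versions), and the toy chains are hypothetical.

```python
import math
from statistics import mean, variance

def gelman_rubin(chains):
    """Classic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [mean(chain) for chain in chains]
    within = mean([variance(chain) for chain in chains])   # W
    between = n * variance(chain_means)                     # B
    pooled = (n - 1) / n * within + between / n             # estimate of posterior variance
    return math.sqrt(pooled / within)

# Hypothetical draws: two chains exploring the same region vs. two stuck apart.
mixed = [[0.1, -0.2, 0.3, 0.0], [0.0, 0.2, -0.1, 0.1]]
stuck = [[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]]
print(gelman_rubin(mixed), gelman_rubin(stuck))
```

Values close to 1 indicate that the chains agree; values well above 1 indicate that they have not mixed.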

### 3.3 Comparison of models up to May 5th

While model 1 and model 2 give very different trajectories of $R_t$ (Figs. 2a, Fig. A.3, Fig. A.4, Fig. A.11, Fig. A.12, Appendix C), both models produce visually similar fits to the observed daily death counts, i.e., different trajectories of $R_t$ may give rise to the same data and hence different inference surrounding the impact of various NPIs. As pointed out in Flaxman et al. [Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe], the disparity between the observed and predicted number of cases is due to asymptomatic and non-documented infections and limited testing capacities.
For the 11 countries (Table 1), the inference from model 1 indicates that lockdown had the biggest impact of all the interventions in all countries, with an average reduction in $R_t$ of 80%. In contrast, model 2 shows clearly that $R_t$ was falling well before lockdown. Excluding Sweden, which had no lockdown, $R_t<1.0$ at the time of lockdown in 4 of the other 10 countries and was only $1.0$ to $1.3$ in another 3 countries (all three 95% CIs contained 1.0).
Table 1. Comparison of the value of $R_t$ at lockdown (LD) and its 95% CIs between models 1 and 2 for all eleven countries analyzed in Flaxman et al. [Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe] for the time horizon March 4th to May 5th. Values of the basic reproduction number $R_0$ and of $R_t$ immediately after the introduction of other NPIs for both models are given in Table A.5 in the Appendix.

| Country | Model 1: $R_t$ one day before LD | Model 1: $R_t$ at LD | Model 1: % change | Model 2: $R_t$ at LD |
| --- | --- | --- | --- | --- |
| UK | 3.39 (2.84, 3.94) | 0.68 (0.55, 0.81) | −79.67 (−85.29, −72.96) | 1.11 (0.75, 1.60) |
| Austria | 2.96 (1.67, 4.50) | 0.52 (0.40, 0.64) | −81.42 (−88.80, −69.47) | 0.87 (0.42, 1.55) |
| Belgium | 4.30 (2.87, 6.06) | 0.90 (0.78, 1.02) | −78.31 (−85.99, −67.26) | 4.83 (3.47, 6.45) |
| Denmark | 3.25 (1.98, 4.81) | 0.68 (0.57, 0.80) | −78.11 (−86.01, −65.70) | 0.58 (0.28, 1.05) |
| France | 4.06 (2.98, 4.95) | 0.71 (0.61, 0.82) | −82.08 (−87.07, −74.21) | 1.69 (1.16, 2.39) |
| Germany | 3.68 (2.94, 4.51) | 0.73 (0.60, 0.85) | −79.99 (−85.84, −72.48) | 1.02 (0.68, 1.47) |
| Italy | 2.90 (2.17, 3.46) | 0.70 (0.63, 0.78) | −75.35 (−80.98, −66.51) | 1.30 (0.86, 1.76) |
| Norway | 2.42 (1.36, 3.71) | 0.40 (0.25, 0.57) | −82.30 (−91.04, −69.16) | 0.50 (0.27, 0.79) |
| Spain | 4.29 (3.35, 5.39) | 0.67 (0.59, 0.75) | −84.05 (−88.43, −78.72) | 1.78 (1.22, 2.42) |
| Sweden | | | | |
| Switzerland | 2.67 (1.93, 3.48) | 0.55 (0.44, 0.68) | −78.61 (−86.43, −67.32) | 0.93 (0.62, 1.31) |

When we considered 3 additional countries (Table C.1), the average reduction in $R_t$ from lockdown shrank to 73% in model 1. Model 2 shows $R_t<1.0$ in 7 countries and $1.0$ to $1.3$ in another 3 countries when lockdown was imposed. In particular, the three added countries already had $R_t<1.0$ at the time of lockdown. For Greece and Portugal, $R_t$ was already so low (0.34 and 0.67, respectively) that even the 95% CIs excluded 1.0.
Model 3 provides different inference yet again. Only the mobility and banning of public events had 95% CIs for regression coefficients which excluded zero (Fig. A.2). The impact of lockdown was not statistically significant (95% CI is -0.23, 4.25).
In comparing the models, Table A.4 shows that model 2 provides a lower RMSE for eight of the eleven original countries considered by Flaxman et al. [Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe], for the period March 4th to May 5th. The three countries for which model 1 had a lower RMSE are the UK, Germany and Norway.
Table 2 demonstrates that model 2 is the best supported by the data for all three information criteria: WAIC1, WAIC2 and DIC (see Appendix B.2). Model 3 is the next best supported by the data, while model 1 published in Nature is the least supported.
Table 2. Estimates and standard errors of the differences of various information criteria against model 1: the Watanabe-Akaike information criteria, $\mathrm{WAIC}_1=-2\,\mathrm{lppd}+2p_{\mathrm{WAIC}_1}$ and $\mathrm{WAIC}_2=-2\,\mathrm{lppd}+2p_{\mathrm{WAIC}_2}$, which use $\mathrm{lppd}$ as a measure of fit with $p_{\mathrm{WAIC}_1}$ and $p_{\mathrm{WAIC}_2}$, respectively, as the effective number of parameters to penalize the fit; and the Deviance information criterion, $\mathrm{DIC}=-2\log p(y\mid\hat{\theta}_{\mathrm{Bayes}})+2p_{\mathrm{DIC}}$, which uses $\log p(y\mid\hat{\theta}_{\mathrm{Bayes}})$ as the measure of fit and $p_{\mathrm{DIC}}$ as the penalty. Note that a negative value implies a better predictive model compared to model 1, and the preferred model for each criterion and time period is shown in bold. See Appendix B for computational details.

| Model | Time period | $\Delta\mathrm{WAIC}_1$ | $\Delta\mathrm{WAIC}_2$ | $\Delta\mathrm{DIC}$ |
| --- | --- | --- | --- | --- |
| 2 | Up to May 5th | **−31.21 ± 0.30** | **−29.95 ± 0.34** | **−30.46 ± 0.28** |
| 3 | Up to May 5th | −24.03 ± 0.31 | −22.49 ± 0.36 | −23.29 ± 0.29 |
| 2 | Up to July 12th | **−54.27 ± 1.78** | **−49.93 ± 3.42** | **−51.95 ± 0.37** |
| 3 | Up to July 12th | −36.74 ± 1.30 | −32.24 ± 3.22 | −34.97 ± 0.37 |
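
As a rough illustration of how the two WAIC variants can be computed from posterior draws, the Python sketch below follows the caption's form $-2\,\mathrm{lppd}+2p$. The two penalty terms use the commonly cited definitions (a log-mean minus mean-log difference for $p_{\mathrm{WAIC}_1}$, and the posterior variance of the log-likelihood for $p_{\mathrm{WAIC}_2}$); these are assumed here and may differ in detail from Appendix B. The function name and the toy log-likelihood values are hypothetical.

```python
import math
from statistics import mean, variance

def waic(log_lik):
    """Return (WAIC1, WAIC2) from pointwise log-likelihoods.

    log_lik[s][i] is the log-likelihood of observation i under posterior draw s.
    """
    n_obs = len(log_lik[0])
    lppd = p_waic1 = p_waic2 = 0.0
    for i in range(n_obs):
        column = [draw[i] for draw in log_lik]
        log_mean_lik = math.log(mean([math.exp(v) for v in column]))
        lppd += log_mean_lik
        p_waic1 += 2.0 * (log_mean_lik - mean(column))
        p_waic2 += variance(column)
    return -2.0 * lppd + 2.0 * p_waic1, -2.0 * lppd + 2.0 * p_waic2

# Hypothetical log-likelihoods: 3 posterior draws x 2 observations.
toy = [[-1.0, -2.0], [-1.2, -1.8], [-0.9, -2.1]]
waic1, waic2 = waic(toy)
print(waic1, waic2)
```

A difference such as $\Delta\mathrm{WAIC}_1$ in Table 2 would then be the criterion for model 2 or 3 minus that for model 1, so a negative value favors the alternative model.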

### 3.4 Comparison of models up to July 12th

The analysis of the time horizon March 4th to July 12th leads to very similar conclusions (Figs. 2b, A.3b–A.12b, A.13A.15). Table 3 indicates that the impact of lockdown on the relative reduction in $R_t$ was 64% for model 1, while in model 2, seven countries already had $R_t \le 1.0$ and only two countries had 95% CIs for $R_t$ exceeding 1.0 at the time of lockdown. In model 3, in contrast to the period until May 5th, lockdown was statistically significant with the longer follow-up (95% CI is 0.23, 1.43).
Table 3. Comparison of the value of $R_t$ at lockdown (LD) and its 95% CIs between models 1 and 2 for all eleven countries analyzed in Flaxman et al. [Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe] and an additional three countries (Greece, Netherlands and Portugal), for the time horizon March 4th to July 12th.

| Country | Model 1: $R_t$ one day before LD | Model 1: $R_t$ at LD | Model 1: % change | Model 2: $R_t$ at LD |
| --- | --- | --- | --- | --- |
| UK | 3.08 (2.32, 3.78) | 0.81 (0.76, 0.86) | −73.25 (−79.28, −64.03) | 1.20 (0.72, 1.82) |
| Austria | 1.82 (1.16, 2.81) | 0.61 (0.55, 0.67) | −64.58 (−78.02, −47.53) | 0.72 (0.30, 1.42) |
| Belgium | 2.10 (1.46, 2.98) | 0.70 (0.67, 0.73) | −65.58 (−76.83, −51.27) | 1.43 (0.90, 2.05) |
| Denmark | 1.73 (1.16, 2.48) | 0.68 (0.60, 0.76) | −59.12 (−72.79, −41.89) | 0.56 (0.25, 1.05) |
| France | 2.26 (1.59, 3.12) | 0.71 (0.67, 0.75) | −67.37 (−77.65, −53.86) | 1.77 (1.11, 2.60) |
| Germany | 3.31 (2.51, 4.19) | 0.71 (0.66, 0.76) | −78.13 (−83.73, −70.87) | 1.12 (0.69, 1.67) |
| Italy | 1.74 (1.26, 2.32) | 0.75 (0.71, 0.79) | −55.66 (−68.31, −39.35) | 1.41 (0.88, 2.03) |
| Norway | 1.52 (0.97, 2.22) | 0.57 (0.48, 0.66) | −60.72 (−74.83, −40.59) | 0.53 (0.27, 0.88) |
| Spain | 3.47 (2.51, 4.46) | 0.75 (0.72, 0.79) | −77.74 (−83.34, −69.56) | 1.74 (1.07, 2.49) |
| Sweden | | | | |
| Switzerland | 1.76 (1.25, 2.41) | 0.61 (0.57, 0.64) | −64.49 (−75.75, −50.23) | 0.96 (0.58, 1.39) |
| Greece | 1.46 (0.90, 2.05) | 0.69 (0.63, 0.74) | −51.03 (−67.21, −22.64) | 0.35 (0.16, 0.61) |
| Netherlands | 1.77 (1.34, 2.25) | 0.66 (0.61, 0.70) | −62.14 (−72.27, −49.34) | 1.00 (0.61, 1.44) |
| Portugal | 1.74 (1.12, 2.39) | 0.83 (0.80, 0.86) | −50.31 (−65.50, −25.24) | 0.66 (0.36, 1.07) |

In comparing the models, Table A.4 shows that model 2 provides a lower RMSE than model 1 for all countries for the period March 4th to July 12th, except Austria and Norway. Similarly, Table 2 again demonstrates that model 2 is the best supported by the data for all three information criteria: WAIC1, WAIC2 and DIC.

### 3.5 Change of start date

Inferences regarding the impact of the imposition of NPIs are not substantially affected by the start date nor the priors for the initial infection count (Fig. A.16).

## 4. Discussion

We demonstrate that estimated effects of NPIs are non-robust and highly sensitive to model specification, assumptions and the data employed to fit models. We obtained very different inferences regarding the effectiveness of lockdown measures in terms of curbing the epidemic wave and reducing fatalities. Lockdown appeared the most effective measure to save lives in the original analysis of 11 European countries performed by the Imperial College team through model 1. This analysis was published in Nature and has probably contributed substantially to sustaining a mentality among policy makers that lockdown should be used during the second waves that emerged in many countries in the fall of 2020. However, model 2 (which was also originally developed by the same team) suggested little or no benefit from lockdown in most of the same countries.
Importantly, model 2 typically outperformed model 1 in data fit. Consideration of longer follow-up that also included the lifting of many measures still suggested that the originally claimed effects of lockdown [Flaxman et al., Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe] were grossly overstated. Fitting a third model resulted in yet further variant conclusions, with only mobility and event ban having regression coefficients with 95% CIs that did not contain 0 for the period until May 5th.
The different results and inferences of these models may be partly explained by the highly correlated structure of NPIs and mobility data, as well as the dense time clustering of the different NPIs being applied typically in close sequence. NPIs largely reduce $Rt$ by reducing contact among individuals. An indirect measure of the reduction in individual contact is the mobility data, and so these data will be highly correlated with NPIs, making any inference difficult by default. Moreover, as different NPIs are typically introduced in close sequence, their exact time lag before impact is difficult to model. Interaction effects between different NPIs may also exist. The effectiveness of different NPIs may also vary across locations and across time based on adherence, acceptability, and enforcement. Any collateral harms may also affect acceptability and adherence.
Given that the inference around the effectiveness of various NPIs is highly model-dependent and that more aggressive NPIs have more adverse effects on other aspects of health, society, and economy [Woolf et al., Excess deaths from COVID-19 and other causes, March-April 2020; VanderWeele, Challenges estimating total lives lost in COVID-19 decisions: consideration of mortality related to unemployment, social isolation, and depression; De Filippo et al., Reduced rate of hospital admissions for ACS during Covid-19 outbreak in northern Italy; Metzler et al., Decline of acute coronary syndrome admissions in Austria since the outbreak of COVID-19: the pandemic response causes cardiac collateral damage; Ioannidis, Global perspective of COVID-19 epidemiology for a full-cycle pandemic; Czeisler et al., Mental health, substance use, and suicidal ideation during the COVID-19 pandemic–United States, June 24-30, 2020; Melnick and Ioannidis, Should governments continue lockdown to slow the spread of COVID-19?; Brooks et al., The psychological impact of quarantine and how to reduce it: rapid review of the evidence; Sud et al., Collateral damage: the impact on outcomes from cancer surgery of the COVID-19 pandemic; Stephenson, Sharp drop in routine vaccinations for US children amid COVID-19 pandemic; Docherty et al., Deaths from COVID-19: who are the forgotten victims?; Moser et al., Years of life lost due to the psychosocial consequences of COVID-19 mitigation strategies based on Swiss data; Roesch et al., Violence against women during covid-19 pandemic restrictions; Boman and Gallupe, Has COVID-19 changed crime? Crime rates in the United States during the pandemic; Picheta, Coronavirus pandemic will cause global famines of ‘biblical proportions,’ UN warns; Zumla et al., COVID-19 and tuberculosis–threats and opportunities; Ribeiro and Leist, Who is going to pay the price of Covid-19? Reflections about an unequal Brazil; Fu et al., The consequences of delaying elective surgery: surgical perspective; Del Vecchio Blanco et al., The impact of COVID-19 pandemic in the colorectal cancer prevention; Avery et al., Policy implications of models of the spread of coronavirus: perspectives and opportunities for economists], it is ill-advised to ignore the substantial model uncertainty. Failing to report this uncertainty may ultimately undermine the public’s trust in the value of policy decisions based on statistical modeling. Flaxman et al. [Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe] made the statement “We find that, across 11 countries, since the beginning of the epidemic, 3,100,000 [2,800,000–3,500,000] deaths have been averted due to intervention”. Both the point estimate and the narrow interval accompanying it are themselves uncertain. When results vary widely based on model specification, strong inferences should be avoided. Equally careful modeling and evaluation of uncertainty also need to be performed for the potential postulated harms of lockdown and other NPIs.
Given that modeling studies are typically not pre-registered, multiple analytical approaches and model specifications may be used on the same data [Ioannidis et al., Forecasting for COVID-19 has failed], and data and results may be filtered by modelers according to whether they fit their prior beliefs. Clearly, an important issue in model comparison is the selection of the models to be compared [Friston et al., Dynamic causal modelling of COVID-19]. In one sense, this selection can be cast in terms of priors over models. For example, investigators who just report one model may have prior beliefs that this is the only plausible model. The key argument made in this paper is that there are formal procedures for evaluating these prior assumptions that may lead to very different conclusions. Similarly, one can use Bayesian model comparison to optimize the priors over the parameters of any given structural form. In other words, a model can be specified in terms of the priors over parameters and the priors themselves can then be optimized with respect to Bayesian model evidence. When the functional form of the posterior is known, there are procedures that can do this very quickly and efficiently. For example, Bayesian model reduction allows one to optimize priors analytically by using a generalization of the Savage-Dickey ratio.
We do not claim that lockdown measures definitely had no impact in the first wave of COVID-19. Indeed, model 2 showed that $R_t$ was still above 1 in some countries, and thus it is possible that in these locations lockdown had some impact on the course of the epidemic wave. Other investigators using a different analytical approach have also suggested some benefits from lockdown; however, these benefits were of a smaller magnitude (e.g., 13% relative risk reduction [Islam et al., Physical distancing interventions and incidence of coronavirus disease 2019: natural experiment in 149 countries]). Benefits of such modest size would be less likely to offset the harms induced by complete lockdown in a careful decision analysis. Another modeling approach has found that benefits can be reaped by simple self-imposed interventions such as washing hands, wearing masks, and some social distancing [Teslya et al., Impact of self-imposed prevention measures and short-term government-imposed social distancing on mitigating and delaying a COVID-19 epidemic: a modelling study]. Brauner et al. [Inferring the effectiveness of government interventions against COVID-19] analyze lockdown as a continuum with various measures to restrict contacts.
Some limitations of our work should be acknowledged. Besides model fit and parsimony metrics, theoretical and subjective considerations, as well as experience from other countries, should be considered in model choice. However, given the observational nature of the data and the dynamic course of epidemic waves, one should avoid strong priors about effectiveness of different NPIs. Similarly, our results should not be interpreted with a nihilistic lens, i.e., that NPIs are totally ineffective. Decreasing exposures makes sense as a way to reduce epidemic wave propagation and eventually fatalities. However, if exposures can be reduced with less aggressive measures and fewer or no harms, this would be optimal. Finally, we did not examine very long-term time horizons. In theory, even effective measures may achieve only temporary mitigation, and epidemic waves may surge again when measures are relieved. We did observe this for the lifting of measures in the July 12th analyses, and empirical data from the emergence of second waves in many European countries and the USA in the fall of 2020 validate this hypothesis [Ioannidis et al., Second versus first wave of COVID-19 deaths: shifts in age distribution and in nursing home fatalities]. Availability of effective and safe vaccines may also affect risk-benefit ratios of NPI measures of different aggressiveness and different duration of implementation. It is noted that other investigators [Kuhbandner and Homburg, Commentary: estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe; Soltesz et al., The effect of interventions on COVID-19; Wood, Did COVID-19 infections decline before UK lockdown? 2020; https://arxiv.org/abs/2005.02090] have raised similar concerns using alternative approaches.
Overall, observational data that are fed into complex epidemic models should be dissected very carefully and substantial uncertainty may remain despite the best efforts of modelers [
• Ioannidis J.P.A.
• Cripps S.
• Tanner M.A.
Forecasting for COVID-19 has failed.
,
• Jewell N.P.
• Lewnard J.A.
• Jewell B.L.
Predictive mathematical models of the COVID-19 pandemic: underlying principles and value of projections.
]. While there has been resistance to testing NPIs with randomized trials, such trials are feasible, and more thought and effort should be devoted to how to complement the available, tenuous observational data [
• Cristea I.A.
• Naudet F.
• Ioannidis J.P.A.
Preserving equipoise and performing randomised trials for COVID-19 social distancing interventions.
]. Regardless, causal interpretations from non-robust models should be avoided. In any decision analysis, the accurate quantification of the size, not just the existence, of the impact of lockdown on $R_t$ is also critical. This is a difficult task when one considers all the confounding between NPIs and mobility, as well as the several behavioral changes such as hand washing and wearing masks. This is an interesting area for research, and crucial for the management of future pandemics.

## Author Contributions

All authors contributed equally to this work. VC performed all the computations and produced all the graphics. SC wrote the initial draft. JI and MT wrote subsequent drafts. All authors discussed the results and implications and commented on the manuscript at all stages.

## Code Availability

All source code for the replication of our results is available from https://github.com/dare-centre/imperial-covid19-model.

## Acknowledgments

We congratulate the Imperial College Response Team for openly sharing the code for their models and for the overall transparency of their work, which made these analyses possible. We thank Hadi Ashfar for his suggestions to improve the computational efficiency of the HMC scheme. We also thank Jack Wood for his help in the construction of Table A.3. We especially thank the three reviewers and the Editor for their thoughtful and deep comments, which greatly improved the quality of this paper. We acknowledge the Sydney Informatics Hub and the University of Sydney’s high performance computing cluster Artemis for providing the high performance computing resources that have contributed to the research results reported within this paper.

## Appendix A. Additional Figures and Tables

Table A.1 Seeding dates of new infections. Two seeding dates were used for Belgium, March 9th and March 4th for the data up to May 5th and July 12th respectively, due to a reporting correction in the data.

| Country | Seeding date |
| --- | --- |
| Austria | March 13th |
| Belgium | March 9th/March 4th |
| Denmark | March 12th |
| France | February 27th |
| Germany | March 6th |
| Greece | March 12th |
| Italy | February 16th |
| Netherlands | March 5th |
| Norway | March 15th |
| Portugal | March 12th |
| Spain | February 29th |
| Sweden | March 9th |
| Switzerland | March 5th |

Table A.2 Correspondence of subscripts for $k$ to each NPI.

| $k$ | NPI |
| --- | --- |
| 1 | School closure |
| 2 | Event ban |
| 3 | Lockdown |
| 4 | Self-isolation |
| 5 | Social distancing |
| 6 | Government intervention |

Table A.3 End dates for school closure [Hale et al., Oxford COVID-19 government response tracker], event ban [Hale et al., Oxford COVID-19 government response tracker] and lockdown in each country [Our World in Data, Policy responses to the coronavirus pandemic; SBS News, Denmark reports no spike in coronavirus cases since lifting lockdown; The Local, AFTER LOCKDOWN: are Denmark’s and Norway’s restrictions now like Sweden’s?]. NPIs that are still in place as of July 12th are shown with ✓, while NPIs that were not implemented are shown with ✗.
CountrySchool closureEvent banLockdown
UK$✓$$✓$May 13th
AustriaMay 18th$✓$May 1st
BelgiumJuly 1st$✓$June 7th
Denmark$✓$$✓$April 20th
FranceJune 22nd$✓$May 11th
GermanyJuly 7th$✓$May 6th
GreeceJune 1stJune 15thMay 30th
Italy$✓$$✓$May 4th
NetherlandsJune 15thJuly 1stMay 11th
NorwayMay 11thJune 2ndApril 21st
Portugal$✓$$✓$July 5th
Spain$✓$$✓$May 26th
Sweden$✓$
SwitzerlandJune 6th$✓$June 21st
Table A.4 RMSE of daily death counts for models 1 and 2 for the data up to May 5th and July 12th. A lower RMSE between models 1 and 2 for each country is shown in bold.
CountryUp to May 5thUp to July 12th
Model 1Model 2Model 1Model 2
UK145.41145.64134.26129.68
Austria5.885.884.484.57
Belgium71.1652.9125.2015.84
Denmark3.273.082.422.39
France242.07227.22187.33168.34
Germany48.6248.7537.0436.32
Italy85.9671.2963.4757.42
Norway3.063.072.212.22
Spain95.2392.43143.82135.03
Sweden35.8235.5533.1233.09
Switzerland14.6114.3410.3710.31
Greece1.721.51
Netherlands21.4821.01
Portugal6.295.75
Table A.5 Basic reproduction number $R_0$ and time-varying reproduction number $R_t$ immediately after the introduction of NPIs given by models 1 and 2 using data up to May 5th for all eleven countries analyzed in Flaxman et al. [Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe]. These NPIs are self-isolation (SI), social distancing (SD), school closure (SC), event ban (EB) and lockdown (LD). 95% credible intervals are given in parentheses below the corresponding point estimates. Countries where the seeding of new infections occurs after the introduction of NPIs are denoted with an asterisk.
Country$R0$$Rt$ immediately after NPIs introduction
SISDSCEBLD
Model 1.
UK3.553.453.423.390.680.68
(2.99, 4.27)(2.95, 4.00)(2.92, 3.96)(2.84, 3.94)(0.55, 0.81)(0.55, 0.81)
Austria*3.140.520.522.960.52
(1.91, 4.66)(0.40, 0.64)(0.40, 0.64)(1.67, 4.50)(0.40, 0.64)
Belgium4.724.594.304.304.380.90
(3.38, 6.46)(3.23, 6.32)(2.87, 6.06)(2.87, 6.06)(2.92, 6.23)(0.78, 1.02)
Denmark3.563.313.253.253.310.68
(2.27, 5.06)(2.01, 4.84)(1.98, 4.81)(1.98, 4.81)(2.01, 4.84)(0.57, 0.80)
France4.454.064.064.184.220.71
(3.78, 5.27)(2.98, 4.95)(2.98, 4.95)(3.14, 4.99)(3.20, 5.03)(0.61, 0.82)
Germany3.863.753.723.680.730.73
(3.07, 4.90)(3.00, 4.65)(2.97, 4.58)(2.94, 4.51)(0.60, 0.85)(0.60, 0.85)
Italy3.182.902.903.132.900.70
(2.80, 3.61)(2.17, 3.46)(2.17, 3.46)(2.69, 3.57)(2.17, 3.46)(0.63, 0.78)
Norway*2.652.442.420.40
(1.57, 3.99)(1.36, 3.73)(1.36, 3.71)(0.25, 0.57)
Spain4.390.674.344.290.670.67
(3.49, 5.50)(0.59, 0.75)(3.43, 5.43)(3.35, 5.39)(0.59, 0.75)(0.59, 0.75)
Sweden2.051.991.980.86
(1.51, 2.74)(1.48, 2.60)(1.48, 2.57)(0.63, 1.10)
Switzerland*2.942.672.692.720.55
(2.18, 3.86)(1.93, 3.48)(1.95, 3.51)(1.96, 3.57)(0.44, 0.68)

Model 2.
UK4.174.264.082.341.111.11
(2.62,6.39)(3.28,5.35)(3.13,5.11)(1.76,3.00)(0.75,1.60)(0.75,1.60)
Austria*3.340.870.871.880.87
(1.46,6.09)(0.42,1.55)(0.42,1.55)(0.92,3.33)(0.42,1.55)
Belgium4.334.384.524.524.814.83
(2.55,6.72)(2.77,6.48)(2.87,6.69)(2.87,6.69)(3.06,7.11)(3.47,6.45)
Denmark2.431.510.870.871.510.58
(1.16,4.87)(0.80,2.66)(0.45,1.57)(0.45,1.57)(0.80,2.66)(0.28,1.05)
France4.103.773.774.365.101.69
(2.66,6.11)(3.00,4.65)(3.00,4.65)(3.52,5.37)(4.04,6.35)(1.16,2.39)
Germany4.564.434.394.121.021.02
(2.72,7.11)(2.72,6.69)(2.69,6.63)(2.97,5.56)(0.68,1.47)(0.68,1.47)
Italy4.552.122.122.912.121.30
(2.76,6.98)(1.47,2.80)(1.47,2.80)(2.20,3.65)(1.47,2.80)(0.86,1.76)
Norway*2.100.460.680.50
(1.06,4.39)(0.21,0.87)(0.33,1.26)(0.27,0.79)
Spain4.681.784.973.771.781.78
(2.98,6.96)(1.22,2.42)(3.81,6.28)(2.85,4.80)(1.22,2.42)(1.22,2.42)
Sweden3.493.252.501.55
(1.91,5.96)(1.93,5.25)(1.73,3.51)(1.11,2.06)
Switzerland*3.482.622.573.160.93
(1.84,5.85)(1.79,3.66)(1.76,3.59)(2.13,4.44)(0.62,1.31)

## Appendix B. Priors and Measures of Fit

### B.1 Priors

For posterior inference in model 1, we use the same priors as in Flaxman et al. [
• Flaxman S.
• Mishra S.
• Gandy A.
• Unwin H.J.T.
• Mellan T.A.
• Coupland H.
• et al.
Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe.
] for the analysis up to May 5th and July 12th. For model 2, we use the same prior distributions as in Unwin et al. [
• Unwin H.J.T.
• Mishra S.
• Gandy A.
• Mellan T.A.
• Coupland H.
• et al.
State-level tracking of COVID-19 in the United States.
] except for $R_0$ and $\alpha$ in Equation (2).
For $R_0$, we use a weakly informative prior: a normal distribution truncated below at 1, with mean 3.28 and standard deviation 2. This prior is chosen so that approximately 95% of the prior density is between 1 and 7 [
• Liu Y.
• Gayle A.A.
• Wilder-Smith A.
• Rocklöv J.
The reproductive number of COVID-19 is higher compared to SARS coronavirus.
], and that $R_0$ is above the critical value of 1 at the start of the epidemic.
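The quoted prior mass can be checked with a few lines of standard-library Python. This is only a sketch of the arithmetic, assuming the prior is exactly a normal distribution with mean 3.28 and standard deviation 2 truncated below at 1, as described above; the helper name `truncated_cdf` is hypothetical and is not taken from the authors' code.

```python
from statistics import NormalDist

# Prior for R0 as described in the text: normal with mean 3.28 and
# standard deviation 2, truncated below at 1.
MEAN, SD, LOWER = 3.28, 2.0, 1.0
base = NormalDist(mu=MEAN, sigma=SD)


def truncated_cdf(x: float) -> float:
    """CDF of the normal distribution truncated below at LOWER (hypothetical helper)."""
    if x <= LOWER:
        return 0.0
    kept_mass = 1.0 - base.cdf(LOWER)  # mass remaining after truncation
    return (base.cdf(x) - base.cdf(LOWER)) / kept_mass


# Prior probability that R0 lies between 1 and 7.
mass_1_to_7 = truncated_cdf(7.0) - truncated_cdf(1.0)
print(f"Prior mass on [1, 7]: {mass_1_to_7:.3f}")
```

Under these assumptions the computed mass is slightly above 0.95, in line with the "approximately 95%" stated in the text.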
For $\alpha$, we examine the sensitivity of the posterior to two priors. The first prior that we consider is that used by Unwin et al. [
• Unwin H.J.T.
• Mishra S.
• Gandy A.
• Mellan T.A.
• Coupland H.
• et al.
State-level tracking of COVID-19 in the United States.
]. This prior is very informative, with $\alpha \sim N(0, 0.5)$; that is, a priori they assume that $\alpha$ lies in the interval $[-1, 1]$ with probability 0.95. In contrast, the second prior we consider is an uninformative prior, $\alpha \sim N(0, 5)$, and the posterior mode of $\alpha$ in model 2 up to May 5th is found to be approximately $-4$. This means that the prior used by Unwin et al. [
• Unwin H.J.T.
• Mishra S.
• Gandy A.
• Mellan T.A.
• Coupland H.
• et al.
State-level tracking of COVID-19 in the United States.
] places almost no mass where this posterior distribution is concentrated. This has two consequences. First, it makes convergence of the Markov chain very difficult and sensitive to starting values. Second, it shrinks the value of $\alpha$ towards zero, underestimating the impact of mobility on $R_t$. The second prior, $\alpha \sim N(0, 5)$, makes the convergence of the chain more robust to poor starting values.
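The mismatch between the informative prior and the posterior mode can be made concrete with a short calculation. The sketch below assumes that the second argument of $N(0, \cdot)$ is a standard deviation (this reading reproduces the 0.95 probability for the interval from -1 to 1) and takes the posterior mode to be exactly -4; the variable names are illustrative only.

```python
import math
from statistics import NormalDist

# Assumption: the second argument of N(0, s) is a standard deviation.
informative = NormalDist(mu=0.0, sigma=0.5)  # prior of Unwin et al.
diffuse = NormalDist(mu=0.0, sigma=5.0)      # wider prior considered here

posterior_mode = -4.0  # approximate value reported in the text

# Prior probability that alpha lies in [-1, 1] under the informative prior.
mass_informative = informative.cdf(1.0) - informative.cdf(-1.0)

# Distance of the posterior mode from the prior mean, in prior standard deviations.
z_informative = abs(posterior_mode) / informative.stdev
z_diffuse = abs(posterior_mode) / diffuse.stdev

# Difference in log prior density at the posterior mode (diffuse minus informative).
log_density_gap = math.log(diffuse.pdf(posterior_mode)) - math.log(
    informative.pdf(posterior_mode)
)

print(f"P(-1 <= alpha <= 1) under the informative prior: {mass_informative:.3f}")
print(f"Posterior mode in prior SDs: {z_informative:.1f} vs {z_diffuse:.1f}")
print(f"Log prior density gap at the mode: {log_density_gap:.1f}")
```

Under these assumptions the mode sits eight prior standard deviations from zero under the informative prior but less than one under the wider prior, which is one way to read the statement that the informative prior pulls $\alpha$ strongly towards zero.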
We also change the prior for the initial infection count at the start of the time period, for two reasons. First, due to data constraints, we chose to start the seeding of infections only 10 days before the date of the 10th cumulative death. In contrast, Flaxman et al. [
• Flaxman S.
• Mishra S.
• Gandy A.
• Unwin H.J.T.
• Mellan T.A.
• Coupland H.
• et al.
Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe.
] chose to start the seeding of infections 30 days prior to the date of the 10th cumulative death. Flaxman et al. [
• Flaxman S.
• Mishra S.
• Gandy A.
• Unwin H.J.T.
• Mellan T.A.
• Coupland H.
• et al.
Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe.
] chose a relatively tight prior for the initial infection count: the probability that the initial infection count was greater than 500 is $\approx 0$. Using this prior for the number of infections 20 days later is not realistic and, again, leads to convergence problems. We therefore chose a less informative prior for the initial infection count. Plots of these prior distributions can be found in Fig. B.1, and the posterior distributions of the parameter in the 11 countries for the analysis until May 5th are shown in Fig. B.2.
Notably, Bayesian data analysts typically examine a variety of priors to gauge the sensitivity of results to the prior specification [
• Tanner M.
Tools for statistical inference.
].

### B.2 Bayesian measures of model fit

To compare the fit of the three models to the data, we consider four metrics: three estimates of various information criteria, as well as the root mean square error (RMSE). The information criteria are two versions of the Watanabe-Akaike information criterion [
• Watanabe S.
A widely applicable Bayesian information criterion.
], denoted by WAIC1 and WAIC2, and the deviance information criterion (DIC) [
• Spiegelhalter D.J.
• Best N.G.
• Carlin B.P.
• Van Der Linde A.
Bayesian measures of model complexity and fit.
]. Both WAIC1 and WAIC2 use the log pointwise predictive density (lppd) as a measure of fit.
When comparing the evidence for one model relative to another, one is effectively comparing the marginal likelihood of the data under a particular model with the marginal likelihood of the same data under a different model. This can be taken as the relative likelihood of the two models if both models were equally likely a priori. Crucially, the log evidence can always be decomposed into accuracy and complexity, where complexity is the Kullback-Leibler divergence between the prior and the posterior. Generally, this is extremely difficult to evaluate using sampling procedures, and is usually approximated with a function of the number of free parameters. This leads to various information criteria, some of which we report in this work (i.e., the WAIC and DIC).
The differences in these information criteria can be taken as the log odds ratio of two models. For example, a difference in the DIC of three corresponds roughly to an odds ratio of 20 to one, in terms of the marginal likelihood of the two models. For completeness, we also report the accuracy in terms of the root mean square error (RMSE). A key aspect of model comparison is the evaluation of model complexity. In other words, simply maximizing accuracy will generally lead to overfitting and a failure to generalize, which goes hand in hand with poor predictive validity.
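The arithmetic behind the "20 to one" figure can be written out as follows. This is a sketch that assumes the difference of three is expressed in natural-log units of model evidence; the labels $M_1$ and $M_2$ for the two models are introduced here only for illustration.

```latex
\frac{p(y \mid M_1)}{p(y \mid M_2)}
  = \exp\bigl\{\log p(y \mid M_1) - \log p(y \mid M_2)\bigr\}
  = \exp(3) \approx 20.1
```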
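To penalize the fit, WAIC1 uses

$$p_{\mathrm{WAIC1}} = 2\sum_{i=1}^{n}\left(\log \mathrm{E}_{\mathrm{post}}\left[p(y_i \mid \theta)\right] - \mathrm{E}_{\mathrm{post}}\left[\log p(y_i \mid \theta)\right]\right), \tag{4}$$

as an estimate of the effective number of parameters, where $\mathrm{E}_{\mathrm{post}}$ denotes the expectation over the posterior distribution of model parameters $\theta$ given the observed data $y = \{y_i;\ i = 1, \dots, n\}$. The criterion WAIC2 uses

$$p_{\mathrm{WAIC2}} = \sum_{i=1}^{n} \mathrm{V}_{\mathrm{post}}\left(\log p(y_i \mid \theta)\right), \tag{5}$$

where $\mathrm{V}_{\mathrm{post}}$ denotes the variance over the posterior distribution of $\theta$. The DIC metric uses $\log p(y \mid \hat{\theta}_{\mathrm{Bayes}})$, with $\hat{\theta}_{\mathrm{Bayes}}$ being the posterior mean of $\theta$, as a measure of fit and

$$p_{\mathrm{DIC}} = 2\left(\log p(y \mid \hat{\theta}_{\mathrm{Bayes}}) - \mathrm{E}_{\mathrm{post}}\left[\log p(y \mid \theta)\right]\right), \tag{6}$$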

as the penalty. It is well known [
• Friston K.
• Costello A.
• Pillay D.
’Dark matter’, second waves and epidemiological modelling.
] that it is notoriously difficult, both computationally and mathematically, to evaluate model evidence from sample distributions (especially in hierarchical Bayesian models, where it is difficult to count the true number of parameters required by metrics such as AIC or BIC). This may be why model comparison is generally lacking in epidemiology.
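The penalty terms in Equations (4) to (6) can be estimated from posterior draws once the pointwise log-likelihood has been evaluated for each draw. The sketch below illustrates this on a toy normal-mean model with a single free parameter; it is not the authors' implementation. The function names, the toy data, and the scaling conventions WAIC = -2(lppd - p) and DIC = -2(log p(y | posterior mean) - p_DIC) are assumptions made for the example.

```python
import math
import random
from statistics import fmean, variance


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(fmean([math.exp(v - m) for v in values]))


def waic_terms(loglik):
    """loglik[s][i] = log p(y_i | theta_s) for posterior draw s and observation i."""
    n = len(loglik[0])
    lppd = p_waic1 = p_waic2 = 0.0
    for i in range(n):
        column = [draw[i] for draw in loglik]
        log_e = log_mean_exp(column)      # log E_post[p(y_i | theta)]
        e_log = fmean(column)             # E_post[log p(y_i | theta)]
        lppd += log_e                     # log pointwise predictive density
        p_waic1 += 2.0 * (log_e - e_log)  # Equation (4)
        p_waic2 += variance(column)       # Equation (5), sample variance over draws
    return lppd, p_waic1, p_waic2


def p_dic(loglik, loglik_at_mean):
    """Equation (6): 2 * (log p(y | theta_hat) - E_post[log p(y | theta)])."""
    e_log = fmean([sum(draw) for draw in loglik])
    return 2.0 * (sum(loglik_at_mean) - e_log)


def normal_logpdf(y, mu, sigma=1.0):
    return -0.5 * math.log(2 * math.pi * sigma**2) - 0.5 * ((y - mu) / sigma) ** 2


# Toy example (an assumption for illustration): y_i ~ N(theta, 1) with a flat
# prior on theta, so the posterior of theta is N(mean(y), 1/n).
random.seed(0)
y = [random.gauss(2.0, 1.0) for _ in range(50)]
draws = [random.gauss(fmean(y), 1 / math.sqrt(len(y))) for _ in range(2000)]
loglik = [[normal_logpdf(yi, th) for yi in y] for th in draws]
theta_hat = fmean(draws)  # posterior mean of theta
loglik_at_mean = [normal_logpdf(yi, theta_hat) for yi in y]

lppd, p1, p2 = waic_terms(loglik)
p_d = p_dic(loglik, loglik_at_mean)
waic1 = -2.0 * (lppd - p1)
waic2 = -2.0 * (lppd - p2)
dic = -2.0 * (sum(loglik_at_mean) - p_d)
print(f"pWAIC1={p1:.2f}  pWAIC2={p2:.2f}  pDIC={p_d:.2f}")
print(f"WAIC1={waic1:.1f}  WAIC2={waic2:.1f}  DIC={dic:.1f}")
```

In this toy model there is one free parameter, so all three penalties should come out close to one. For a hierarchical model the same functions apply unchanged, provided the pointwise log-likelihood matrix is available from the sampler.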

## Appendix C. Analysis up to May 5th for all 14 countries

Table C.1 Comparison of the value of $R_t$ at lockdown (LD) and its 95% CIs between models 1 and 2 for all eleven countries analyzed in Flaxman et al. [Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe] and an additional three countries (Greece, Netherlands, and Portugal), for the time horizon March 4th to May 5th.

| Country | Model 1: $R_t$ one day before LD | Model 1: $R_t$ at LD | Model 1: % change | Model 2: $R_t$ at LD |
| --- | --- | --- | --- | --- |
| UK | 3.31 | 0.68 | −79.18 | 1.11 |
| | (2.55, 3.87) | (0.57, 0.80) | (−84.65, −70.85) | (0.74, 1.60) |
| Austria | 2.08 | 0.52 | −73.01 | 0.87 |
| | (1.17, 3.57) | (0.41, 0.64) | (−85.48, −57.22) | (0.41, 1.55) |
| Belgium | 2.90 | 0.72 | −74.14 | 1.46 |
| | (2.06, 4.01) | (0.62, 0.83) | (−83.75, −61.69) | (1.00, 1.99) |
| Denmark | 2.28 | 0.68 | −68.63 | 0.57 |
| | (1.39, 3.53) | (0.57, 0.79) | (−80.92, −51.75) | (0.28, 1.04) |
| France | 3.03 | 0.75 | −74.61 | 1.70 |
| | (2.18, 4.14) | (0.65, 0.84) | (−83.60, −63.22) | (1.16, 2.40) |
| Germany | 3.65 | 0.73 | −79.78 | 1.02 |
| | (2.90, 4.40) | (0.62, 0.84) | (−85.08, −72.05) | (0.68, 1.44) |
| Italy | 2.11 | 0.71 | −65.31 | 1.28 |
| | (1.51, 2.86) | (0.64, 0.78) | (−75.81, −51.48) | (0.86, 1.73) |
| Norway | 1.72 | 0.44 | −72.77 | 0.50 |
| | (0.99, 2.77) | (0.28, 0.60) | (−85.98, −55.88) | (0.28, 0.79) |
| Spain | 4.19 | 0.68 | −83.53 | 1.78 |
| | (3.09, 5.24) | (0.60, 0.75) | (−87.98, −77.23) | (1.22, 2.42) |
| Sweden | | | | |
| Switzerland | 2.15 | 0.60 | −71.14 | 0.93 |
| | (1.57, 2.90) | (0.49, 0.71) | (−82.00, −57.72) | (0.62, 1.30) |
| Greece | 1.20 | 0.36 | −68.90 | 0.34 |
| | (0.68, 1.89) | (0.21, 0.51) | (−83.34, −48.58) | (0.18, 0.54) |
| Netherlands | 1.97 | 0.62 | −68.09 | 0.93 |
| | (1.58, 2.42) | (0.50, 0.73) | (−78.27, −55.73) | (0.63, 1.28) |
| Portugal | 2.04 | 0.65 | −66.93 | 0.67 |
| | (1.32, 2.92) | (0.53, 0.76) | (−79.89, −47.70) | (0.42, 0.99) |

## References

• Flaxman S.
• Mishra S.
• Gandy A.
• Unwin H.J.T.
• Mellan T.A.
• Coupland H.
• et al.
Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe.
Nature. 2020; 584: 257-261
• Unwin H.J.T.
• Mishra S.
• Gandy A.
• Mellan T.A.
• Coupland H.
• et al.
State-level tracking of COVID-19 in the United States.
Nat Commun. 2020; 11

• Grifoni A.
• Weiskopf D.
• Ramirez S.I.
• Mateus J.
• Dan J.M.
• Moderbacher C.R.
• et al.
Targets of T cell responses to SARS-CoV-2 coronavirus in humans with COVID-19 disease and unexposed individuals.
Cell. 2020; 181: 1489-1501
• Gelman A.
• Rubin D.
Inference from iterative simulation using multiple sequences.
Stat Sci. 1992; 7: 457-472
• Tanner M.
Tools for statistical inference.
Springer, 1996
• Woolf S.
• Chapman D.
• Sabo R.
• Weinberger D.
• Hill L.
Excess deaths from COVID-19 and other causes, March-April 2020.
JAMA. 2020; 324: 510-513
• VanderWeele T.
Challenges estimating total lives lost in COVID-19 decisions: consideration of mortality related to unemployment, social isolation, and depression.
JAMA. 2020; 324: 445-446
• De Filippo O.
• D'Ascenzo F.
• Angelini F.
• Bocchino P.P.
• Conrotto F.
• Saglietto A.
• et al.
Reduced rate of hospital admissions for ACS during Covid-19 outbreak in northern Italy.
N Engl J Med. 2020; 383: 88-89
• Metzler B.
• Siostrzonek P.
• Binder R.
• Bauer A.
Decline of acute coronary syndrome admissions in Austria since the outbreak of COVID-19: the pandemic response causes cardiac collateral damage.
Eur Heart J. 2020; 41: 1852-1853
• Ioannidis J.P.A.
Global perspective of COVID-19 epidemiology for a full-cycle pandemic.
Eur J Clin Invest. 2020; 50
• Czeisler M.É
• Lane R.I.
• Petrosky E.
• Wiley J.F.
• Christensen A.
• Njai R.
• et al.
Mental health, substance use, and suicidal ideation during the COVID-19 pandemic–United States, June 24-30, 2020.
Morbid Mortality Weekly Rep. 2020; 69: 1049-1057
• Melnick E.R.
• Ioannidis J.P.A.
Should governments continue lockdown to slow the spread of COVID-19?.
BMJ. 2020; 369
• Brooks S.K.
• Webster R.K.
• Smith L.E.
• Woodland L.
• Wessely S.
• Greenberg N.
• et al.
The psychological impact of quarantine and how to reduce it: rapid review of the evidence.
Lancet. 2020; 395: 912-920
• Sud A.
• Jones M.E.
• Broggio J.
• Loveday C.
• Torr B.
• Garrett A.
• et al.
Collateral damage: the impact on outcomes from cancer surgery of the COVID-19 pandemic.
Ann Oncol. 2020; 31: 1065-1074
• Stephenson J.
Sharp drop in routine vaccinations for US children amid COVID-19 pandemic.
JAMA Health Forum. 2020;
• Docherty K.F.
• Butt J.H.
• de Boer R.A.
• Dewan P.
• Køber L.
• Maggioni A.P.
• et al.
Deaths from COVID-19: who are the forgotten victims?.
medRxiv. 2020;
• Moser D.A.
• Glaus J.
• Frangou S.
• Schechter D.S.
Years of life lost due to the psychosocial consequences of COVID-19 mitigation strategies based on Swiss data.
Eur Psychiat. 2020; 63
• Roesch E.
• Amin A.
• Gupta J.
• García-Moreno C.
Violence against women during covid-19 pandemic restrictions.
BMJ. 2020; 369
• Boman J.
• Gallupe O.
Has COVID-19 changed crime? Crime rates in the United States during the pandemic.
American Journal of Criminal Justice. 2020; 45: 537-545
• Picheta R.
Coronavirus pandemic will cause global famines of ‘biblical proportions,’ UN warns.
CNN. 2020;
• Zumla A.
• Marais B.J.
• McHugh T.D.
• Maeurer M.
• Zumla A.
• Kapata N.
• et al.
COVID-19 And tuberculosis–threats and opportunities.
Int J Tuberculosis Lung Dis. 2020; 24: 757-760
• Ribeiro F.
• Leist A.
Who is going to pay the price of Covid-19? Reflections about an unequal Brazil.
Int J Equity Health. 2020; 19
• Fu S.
• George E.
• Maggio P.
• Hawn M.
• Nazerali R.
The consequences of delaying elective surgery: surgical perspective.
Ann Surg. 2020; 272
• Del Vecchio Blanco G.
• Calabrese E.
• Biancone L.
• Monteleone G.
• Paoluzi O.A.
The impact of COVID-19 pandemic in the colorectal cancer prevention.
Int J Colorectal Dis. 2020; 35: 1951-1954
• Avery C.
• Bossert W.
• Clark A.
• Ellison G.
• Ellison S.F.
Policy implications of models of the spread of coronavirus: perspectives and opportunities for economists.
Natl Bureau Econ Res Working PapSer. 2020;
• Ioannidis J.P.A.
• Cripps S.
• Tanner M.A.
Forecasting for COVID-19 has failed.
Int J Forecast. 2020;
• Friston K.J.
• Parr T.
• Zeidman P.
• Razi A.
• Flandin G.
• Daunizeau J.
• et al.
Dynamic causal modelling of COVID-19.
Wellcome Open Res. 2020; 5:89
• Islam N.
• Sharp S.J.
• Chowell G.
• Shabnam S.
• Kawachi I.
• Lacey B.
• et al.
Physical distancing interventions and incidence of coronavirus disease 2019: natural experiment in 149 countries.
BMJ. 2020; 370
• Teslya A.
• Pham T.M.
• Godijk N.G.
• Kretzschmar M.E.
• Bootsma M.C.J.
• Rozhnova G.
• et al.
Impact of self-imposed prevention measures and short-term government-imposed social distancing on mitigating and delaying a COVID-19 epidemic: a modelling study.
PLoS Med. 2020;
• Brauner J.M.
• Mindermann S.
• Sharma M.
• Johnston D.
• Salvatier J.
• Gaveniak T.
• et al.
Inferring the effectiveness of government interventions against COVID-19.
Science. 2021; 371
• Ioannidis J.P.A.
• Axfors C.
• Contopoulos-Ioannidis D.G.
Second versus first wave of COVID-19 deaths: shifts in age distribution and in nursing home fatalities.
Environ Res. 2021; : 110856
• Kuhbandner C.
• Homburg S.
Commentary: estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe.
Front Med (Lausanne). 2020; 7
• Soltesz K.
• Gustafsson F.
• Timpka T.
• Jaldén J.
• Jidling C.
• Heimerson A.
• et al.
The effect of interventions on COVID-19.
Nature. 2020; 588: E26-E28
• Wood S.
Did COVID-19 infections decline before UK lockdown?
2020; https://arxiv.org/abs/2005.02090

• Jewell N.P.
• Lewnard J.A.
• Jewell B.L.
Predictive mathematical models of the COVID-19 pandemic: underlying principles and value of projections.
JAMA. 2020; 323: 1893-1894
• Cristea I.A.
• Naudet F.
• Ioannidis J.P.A.
Preserving equipoise and performing randomised trials for COVID-19 social distancing interventions.
Epidemiol Psychiatr Sci. 2020; 29
• Hale T., Webster S., Petherick A., Phillips T., Kira B. Oxford COVID-19 government response tracker. Retrieved from: https://github.com/OxCGRT/covid-policy-tracker; 2020. Last accessed: July 15, 2020.
• Our World in Data. Policy responses to the coronavirus pandemic. Retrieved from: https://ourworldindata.org/policy-responses-covid; 2020. Last accessed: July 15, 2020.
• SBS News. Denmark reports no spike in coronavirus cases since lifting lockdown. 2020. Retrieved from: https://www.sbs.com.au/news/denmark-reports-no-spike-in-coronavirus-cases-since-lifting-lockdown; Last accessed: July 15, 2020.
• The Local. AFTER LOCKDOWN: are Denmark’s and Norway’s restrictions now like Sweden’s? Retrieved from: https://www.thelocal.com/20200421/explained-are-denmark-and-norways-restrictions-still-tougher-than-swedens; 2020. Last accessed: July 15, 2020.

• Liu Y.
• Gayle A.A.
• Wilder-Smith A.
• Rocklöv J.
The reproductive number of COVID-19 is higher compared to SARS coronavirus.
J Travel Med. 2020; 27
• Watanabe S.
A widely applicable Bayesian information criterion.
J Mach Learn Res. 2013; 14: 867-897
• Spiegelhalter D.J.
• Best N.G.
• Carlin B.P.
• Van Der Linde A.
Bayesian measures of model complexity and fit.
J R Stat Soc B. 2002; 64: 583-639
• Friston K.
• Costello A.
• Pillay D.
’Dark matter’, second waves and epidemiological modelling.
BMJ Global Health. 2020; 5: e003978