Estimating a region's SARS-CoV-2 prevalence by accounting for under-reporting of cases

Patel, Virag and Taylor, Rachel A. and McCarthy, Catherine M. and Moir, Ruth and Kelly, Louise A. and Snary, Emma L. (2024) Estimating a region's SARS-CoV-2 prevalence by accounting for under-reporting of cases. PLOS One. ISSN 1932-6203 (In Press)

Text. Filename: Patel-etal-PLOS-One-2024-Estimating-a-regions-SARS-CoV-2-prevalence.pdf
Accepted Author Manuscript
Restricted to Repository staff only until 1 January 2099.

The first year since the identification of SARS-CoV-2 (December 2019 to December 2020) saw more than 83 million cases of COVID-19 worldwide. Comparisons of prevalence between regions can inform policy decisions and help implement appropriate control measures. However, estimating prevalence can be challenging because officially reported figures are likely to be underestimates. Previous methods for estimating prevalence fail to incorporate differences between populations or testing rates, so direct comparisons can be misleading. We present an improved methodology for estimating the prevalence of SARS-CoV-2. We apply an age adjustment, since the age of the population influences the case-fatality rate and the asymptomatic rate, both of which influence predicted prevalence. Further, we estimate an under-reporting factor, based on testing rates, and use this to adjust our prevalence estimates. We estimate the prevalence for 146 countries and all 50 US states. Our estimates show that on 30th December 2020, the countries with the highest estimated prevalence were Slovenia (1.86%, 95% CI: 1.59 – 2.22), Czech Republic (1.71%, 95% CI: 1.47 – 2.03) and Montenegro (1.64%, 95% CI: 1.41 – 1.94). We conclude that 47.84% (95% CI: 40.97 – 56.51) of all global cases were from India, Brazil and the USA at that time. Our results suggest that Guyana (0.10 tests per predicted case), Afghanistan (0.11 tests per predicted case) and French Polynesia (0.11 tests per predicted case) had the highest under-reporting within the top 25 countries. These results can be used to compare risk between different geographical areas and highlight where the prevalence of SARS-CoV-2 is increasing most rapidly. Our UK estimates were validated against the UK Office for National Statistics prevalence survey and showed consistent agreement over the course of 2020.
The methodology can be adapted for use in future pandemics, especially in the early stages prior to vaccination, or for other diseases with similar under-reporting issues.