Skip to content

Commit

Permalink
Remove normal approximation (#153)
Browse files Browse the repository at this point in the history
* Remove normal approximation

Given instability of the normal approximation for many values (especially given asymmetric likelihood), and because binomial implementation is quick, this is being removed to ensure accurate outputs.

* Update documentation

Output NA if total_outcomes<=total_deaths

* Update README example

Focus on early stage

* Fix typo and add news

* Update tests and linting

* Address linting issues

* Fix remaining linting

* Run styler

* Automatic readme update

* Remove pkgdown: as_is: true

Knitted vignettes appear to show equations OK once removed.

---------

Co-authored-by: GitHub Action <[email protected]>
  • Loading branch information
adamkucharski and actions-user authored Jul 17, 2024
1 parent db8b768 commit 29ee12a
Show file tree
Hide file tree
Showing 19 changed files with 60 additions and 95 deletions.
7 changes: 7 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,12 @@
# cfr (development version)

# cfr 0.1.2

Updated version to fix instability in normal approximation with displayed Ebola example. This release includes:

1. Removal of normal approximation from the `.estimate_severity()` function, instead using the binomial likelihood unless criteria for a Poisson approximation met.
2. Updated README example focusing on first 30 days of outbreak, to emphasise effects of not accounting for delays to outcome.

# cfr 0.1.1

Maintainer is changing to @adamkucharski (#143).
Expand Down
65 changes: 30 additions & 35 deletions R/estimate_severity.R
Original file line number Diff line number Diff line change
Expand Up @@ -30,16 +30,8 @@
#' the estimate and confidence intervals cannot be calculated and the output
#' `<data.frame>` contains only `NA`s.
#'
#' - When `total_cases == total_deaths` _and_ `total_outcomes <= total_deaths`,
#' while `total_cases < poisson_threshold`, the confidence intervals cannot be
#' calculated and are returned as `NA`. The severity is returned as the lowest
#' possible value for the method used when cases are below the Poisson
#' threshold, which is 0.001.
#'
#' - When `total_outcomes == total_deaths` while
#' `total_cases < poisson_threshold` the confidence intervals cannot be
#' calculated and are returned as `NA`s while the severity estimate is returned
#' as `0.999`.
#' - When `total_outcomes <= total_deaths`, estimate and confidence intervals
#' cannot be reliably calculated and are returned as `NA`.
.estimate_severity <- function(total_cases,
total_deaths,
total_outcomes,
Expand Down Expand Up @@ -68,16 +60,38 @@
)

# maximum likelihood estimation for corrected severity
# using increments of 0.1% severity
pprange <- seq(from = 1e-4, to = 1.0, by = 1e-4)

# if more expected outcomes than observed deaths, set outcomes equal to deaths
if (total_outcomes >= total_deaths) {
total_outcomes_checked <- total_outcomes
} else {
total_outcomes_checked <- NA
message(
"Total deaths = ", total_deaths,
" and expected outcomes = ", round(total_outcomes),
" so setting expected outcomes = NA. If we were to assume
total deaths = expected outcomes, it would produce an estimate of 1."
)
}

# get likelihoods using selected function
lik <- func_likelihood(total_outcomes, total_deaths, pprange)
lik <- func_likelihood(total_outcomes_checked, total_deaths, pprange)

# maximum likelihood estimate - if this is empty, return NA
# Otherwise return 95% confidence interval of likelihood
severity_estimate <- pprange[which.max(lik)]

# 95% confidence interval of likelihood
severity_lims <- range(pprange[lik >= (max(lik) - 1.92)])
if (length(severity_estimate) == 0) {
severity_estimate <- NA
severity_lims <- c(NA, NA)
} else {
severity_lims <- range(
pprange[lik >=
(max(lik, na.rm = TRUE) - 1.92)],
na.rm = TRUE
)
}

# return a vector for easy conversion to data
severity_estimate <- c(severity_estimate, severity_lims)
Expand Down Expand Up @@ -111,9 +125,6 @@
#' - Poisson approximation: when `total_cases >= poisson_threshold` but
#' when `p_mid` < 0.05;
#'
#' - Normal approximation: when `total_cases >= poisson_threshold` and
#' `p_mid >=` 0.05.
#'
#' @return A function with three arguments, `total_outcomes`, `total_deaths`,
#' and `pp`, which is used to generate the profile likelihood.
#' Also prints messages to the screen when a Poisson or Normal approximation
Expand All @@ -123,7 +134,7 @@
# NOTE: internal function is not input checked
# switch likelihood function based on total cases and p_mid
# Binomial approx
if (total_cases < poisson_threshold) {
if (total_cases < poisson_threshold || (p_mid >= 0.05)) {
func_likelihood <- function(total_outcomes, total_deaths, pp) {
lchoose(round(total_outcomes), total_deaths) +
(total_deaths * log(pp)) +
Expand All @@ -132,7 +143,7 @@
}

# Poisson approx
if ((total_cases >= poisson_threshold) && p_mid < 0.05) {
if ((total_cases >= poisson_threshold) && (p_mid < 0.05)) {
func_likelihood <- function(total_outcomes, total_deaths, pp) {
stats::dpois(
total_deaths, pp * round(total_outcomes),
Expand All @@ -145,21 +156,5 @@
)
}

# Normal approx
if ((total_cases >= poisson_threshold) && p_mid >= 0.05) {
func_likelihood <- function(total_outcomes, total_deaths, pp) {
stats::dnorm(
total_deaths,
mean = pp * round(total_outcomes),
sd = pp * (1 - pp) * round(total_outcomes),
log = TRUE
)
}
message(
"Total cases = ", total_cases, " and p = ", signif(p_mid, 3),
": using Normal approximation to binomial likelihood."
)
}

func_likelihood
}
7 changes: 5 additions & 2 deletions README.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -66,12 +66,15 @@ library(cfr)
# Load the Ebola 1976 data provided with the package
data(ebola1976)
# Focus on the first 20 days the outbreak
ebola1976_first_30 <- ebola1976[1:30, ]
# Calculate the static CFR without correcting for delays
cfr_static(data = ebola1976)
cfr_static(data = ebola1976_first_30)
# Calculate the static CFR while correcting for delays
cfr_static(
data = ebola1976,
data = ebola1976_first_30,
delay_density = function(x) dgamma(x, shape = 2.40, scale = 3.33)
)
```
Expand Down
17 changes: 7 additions & 10 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -73,21 +73,21 @@ library(cfr)
# Load the Ebola 1976 data provided with the package
data(ebola1976)

# Focus on the first 20 days the outbreak
ebola1976_first_30 <- ebola1976[1:30, ]

# Calculate the static CFR without correcting for delays
cfr_static(data = ebola1976)
cfr_static(data = ebola1976_first_30)
#> severity_estimate severity_low severity_high
#> 1 0.955102 0.9210866 0.9773771
```

``` r
#> 1 0.4740741 0.3875497 0.5617606

# Calculate the static CFR while correcting for delays
cfr_static(
data = ebola1976,
data = ebola1976_first_30,
delay_density = function(x) dgamma(x, shape = 2.40, scale = 3.33)
)
#> severity_estimate severity_low severity_high
#> 1 0.9742 0.8356 0.9877
#> 1 0.9422 0.8701 0.9819
```

### Change in real-time estimates of overall severity during the 1976 Ebola outbreak
Expand Down Expand Up @@ -118,9 +118,6 @@ head(rolling_cfr_naive)
#> 4 1976-08-28 0 0 0.975
#> 5 1976-08-29 0 0 0.975
#> 6 1976-08-30 0 0 0.975
```

``` r

# Calculate the rolling daily CFR while correcting for delays
rolling_cfr_corrected <- cfr_rolling(
Expand Down
10 changes: 2 additions & 8 deletions man/dot-estimate_severity.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 0 additions & 2 deletions man/dot-select_func_likelihood.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Binary file modified man/figures/README-fig-rolling-cfr-ebola-1.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
8 changes: 2 additions & 6 deletions tests/testthat/_snaps/estimate_ascertainment.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,20 +11,16 @@
Code
estimate_ascertainment(data = ebola1976, delay_density = function(x) dgamma(x,
shape = 2.4, scale = 3.33), severity_baseline = 0.7)
Message
Total cases = 245 and p = 0.959: using Normal approximation to binomial likelihood.
Output
ascertainment_estimate ascertainment_low ascertainment_high
1 0.7185383 0.7087172 0.8377214
1 0.7297748 0.7147963 0.7530931

# Static ascertainment from vignette

Code
estimate_ascertainment(data = covid_uk, delay_density = function(x) dlnorm(x,
meanlog = 2.577, sdlog = 0.44), severity_baseline = 0.014)
Message
Total cases = 283420 and p = 0.206: using Normal approximation to binomial likelihood.
Output
ascertainment_estimate ascertainment_low ascertainment_high
1 0.09810792 0.02316347 0.2167183
1 0.06779661 0.06734007 0.06829268

2 changes: 1 addition & 1 deletion tests/testthat/_snaps/estimate_severity.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@
severity_estimate
Output
severity_estimate severity_low severity_high
0.9742 0.8356 0.9877
0.9592 0.9295 0.9793

---

Expand Down
2 changes: 1 addition & 1 deletion tests/testthat/_snaps/estimate_static.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,5 +12,5 @@
scfr_corrected
Output
severity_estimate severity_low severity_high
1 0.9742 0.8356 0.9877
1 0.9592 0.9295 0.9793

2 changes: 1 addition & 1 deletion tests/testthat/test-estimate_ascertainment.R
Original file line number Diff line number Diff line change
Expand Up @@ -95,7 +95,7 @@ test_that("Ascertainment > 1.0 throws a warning", {
estimate_ascertainment(
data = ebola1976,
delay_density = function(x) dgamma(x, shape = 2.40, scale = 3.33),
severity_baseline = 0.9
severity_baseline = 0.99
),
regexp = "Ascertainment ratios > 1.0 detected, setting these values to 1.0"
)
Expand Down
19 changes: 3 additions & 16 deletions tests/testthat/test-estimate_severity.R
Original file line number Diff line number Diff line change
Expand Up @@ -131,22 +131,9 @@ test_that("Special cases of `.estimate_severity()`", {
poisson_threshold = 100
),
c(
severity_estimate = 1e-4, # lowest possible severity under this method
severity_low = NA_real_,
severity_high = NA_real_
)
)

total_outcomes <- 99
expect_identical(
.estimate_severity(
total_cases, total_deaths, total_outcomes,
poisson_threshold = 100
),
c(
severity_estimate = 1 - 1e-4, # highest possible severity
severity_low = NA_real_,
severity_high = NA_real_
severity_estimate = NA, # set NA because not valid calculation
severity_low = NA,
severity_high = NA
)
)

Expand Down
Binary file added tests/testthat/testthat-problems.rds
Binary file not shown.
2 changes: 0 additions & 2 deletions vignettes/cfr.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -4,8 +4,6 @@ output:
bookdown::html_vignette2:
fig_caption: yes
code_folding: show
pkgdown:
as_is: true
bibliography: resources/library.json
link-citations: true
vignette: >
Expand Down
2 changes: 0 additions & 2 deletions vignettes/data_from_incidence2.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -4,8 +4,6 @@ output:
bookdown::html_vignette2:
fig_caption: yes
code_folding: show
pkgdown:
as_is: true
vignette: >
%\VignetteIndexEntry{Handling data from {incidence2}}
%\VignetteEngine{knitr::rmarkdown}
Expand Down
2 changes: 0 additions & 2 deletions vignettes/delay_distributions.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -4,8 +4,6 @@ output:
bookdown::html_vignette2:
fig_caption: yes
code_folding: show
pkgdown:
as_is: true
bibliography: resources/library.json
link-citations: true
vignette: >
Expand Down
2 changes: 0 additions & 2 deletions vignettes/estimate_ascertainment.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -4,8 +4,6 @@ output:
bookdown::html_vignette2:
fig_caption: yes
code_folding: show
pkgdown:
as_is: true
bibliography: resources/library.json
link-citations: true
vignette: >
Expand Down
4 changes: 1 addition & 3 deletions vignettes/estimate_static_severity.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -4,8 +4,6 @@ output:
bookdown::html_vignette2:
fig_caption: yes
code_folding: show
pkgdown:
as_is: true
bibliography: resources/library.json
link-citations: true
vignette: >
Expand Down Expand Up @@ -172,7 +170,7 @@ head(df_covid_uk)
We retrieve the appropriate distribution for Covid-19 from @linton2020; this is a lognormal distribution with $\mu$ = 2.577 and $\sigma$ = 0.440.

::: {.alert .alert-warning}
**Note that** @linton2020 fitted a discrete lognormal distribution and we use a continuous distribution, and that we are ignoring uncertainty in the distribution parameters and likely under-estimating uncertainty in the CFR.
**Note that** @linton2020 fitted a discrete lognormal distribution and we use a continuous distribution, and that we are ignoring uncertainty in the distribution parameters and hence likely under-estimating uncertainty in the CFR.
:::

### Estimating the naive and corrected CFR
Expand Down
2 changes: 0 additions & 2 deletions vignettes/estimate_time_varying_severity.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -4,8 +4,6 @@ output:
bookdown::html_vignette2:
fig_caption: yes
code_folding: show
pkgdown:
as_is: true
bibliography: resources/library.json
link-citations: true
vignette: >
Expand Down

0 comments on commit 29ee12a

Please sign in to comment.