Method for Disaggregating Grouped Mortality Rates by Single Ages #59
Comments
See https://robjhyndman.com/publications/monotonic-splines-2/
Actually, the method in that paper works for population, deaths, or any other count variable, but not for rates.
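A small illustration of why rates behave differently from counts may help here. The numbers below are made up for the example, and the variable names are hypothetical. Counts for an age group are the sum of the single-age counts, whereas the group rate is a population-weighted mean of the single-age rates, so a method that splits a total into parts that add up does not carry over directly.

```r
# Toy single-age data for a 3-year age group (all values are invented)
pop    <- c(1000, 2000, 3000)  # population at each single age
deaths <- c(10, 30, 60)        # deaths at each single age
rate   <- deaths / pop         # single-age mortality rates

# Counts aggregate by simple summation
group_deaths <- sum(deaths)
group_pop    <- sum(pop)

# The group rate is total deaths over total population, which equals the
# population-weighted mean of the single-age rates
group_rate <- group_deaths / group_pop
weighted   <- weighted.mean(rate, pop)

# It is neither the sum nor the unweighted mean of the single-age rates
print(c(group = group_rate, weighted = weighted,
        unweighted = mean(rate), summed = sum(rate)))
```

Because the weights (populations by single age) are themselves unknown when only grouped rates are published, some reference age pattern has to be assumed to recover single-age rates.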
Here's an attempt that might give you a start, using the vital package:

```r
library(dplyr)
library(ggplot2)
library(vital)

# USA 2022: crude mortality rates by age group
usa2022 <- tibble(
  age = c("00-14", "15-64", "65-74", "75-84", "85+"), # Age group
  cmr = c(51.5, 381, 1966, 4738, 15404)               # Crude mortality rate per 100k
) |>
  mutate(
    year = 2022,
    # Lower bound is the first number in the label
    lower_age = readr::parse_number(age),
    # Upper bound is the trailing number (possibly followed by "+")
    upper_age = stringr::str_extract(age, "[\\d]*[+]*$"),
    upper_age = readr::parse_number(upper_age),
    # Treat the open-ended last group as running to age 110
    upper_age = if_else(upper_age == max(upper_age), 110, upper_age)
  )

# Reference population for disaggregation:
# smoothed single-age mortality rates from the latest year of Norwegian data
reference <- vital::norway_mortality |>
  filter(Year == max(Year), Sex == "Total") |>
  vital::smooth_mortality(Mortality) |>
  as_tibble() |>
  select(age = Age, pop = Population, cmr = .smooth)

# Disaggregate the age groups
disaggregate <- function(data, reference) {
  years <- unique(data[["year"]])
  agegroups <- unique(data[["age"]])
  ages <- unique(reference[["age"]])
  # One row per single age and year, carrying the reference rates
  disagg <- expand.grid(age = ages, year = years) |>
    left_join(reference, by = "age")
  for (i in seq_along(agegroups)) {
    lo <- data$lower_age[i]
    hi <- data$upper_age[i]
    # Crude mortality rate for the given age group in the reference data
    age_group_ref <- reference |>
      filter(age >= lo, age <= hi) |>
      mutate(deaths = cmr * pop)
    total_cmr <- sum(age_group_ref$deaths) / sum(age_group_ref$pop)
    # Scale age-specific mortality rates so the group matches the observed rate
    disagg <- disagg |>
      mutate(cmr = if_else(age >= lo & age <= hi,
        cmr * data$cmr[i] / total_cmr, cmr
      ))
  }
  # Smooth across ages to remove the jumps at the group boundaries
  disagg <- disagg |>
    vital::as_vital(index = "year", key = "age", .age = "age",
                    .pop = "pop") |>
    smooth_mortality(cmr)
  disagg |> select(year, age, cmr = .smooth)
}

# Plot the disaggregated rates (black) against the grouped rates (red steps)
disaggregate(usa2022, reference) |>
  ggplot() +
  aes(x = age, y = cmr / 100000) +
  geom_line() +
  geom_step(data = usa2022, aes(x = lower_age - 1), direction = "hv", color = "red") +
  geom_step(data = usa2022, aes(x = upper_age), color = "red", direction = "vh") +
  scale_y_log10()
```

Created on 2024-12-07 with reprex v2.1.1
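The scaling step inside the loop can be checked in isolation with base R. The sketch below is only an illustration: the reference populations and rates for ages 65 to 74 are invented (they are not the Norwegian data), and the names `ref`, `target_group_rate`, and `ref_group_rate` are hypothetical. It shows that multiplying every single-age rate in a group by the ratio of the observed group rate to the reference group rate reproduces the observed group rate under the reference population weights.

```r
# Hypothetical reference schedule for single ages 65-74 (invented values)
ref <- data.frame(
  age  = 65:74,
  pop  = seq(50000, 32000, by = -2000),  # reference population by age
  rate = 0.012 * exp(0.09 * (0:9))       # reference mortality rate by age
)

# Observed crude rate for the 65-74 group, converted from per 100k to per person
target_group_rate <- 1966 / 1e5

# Crude rate for the group implied by the reference schedule
ref_group_rate <- sum(ref$rate * ref$pop) / sum(ref$pop)

# Rescale every single-age rate in the group by the same factor
ref$scaled <- ref$rate * target_group_rate / ref_group_rate

# The rescaled rates reproduce the observed group rate (reference weights)
implied <- sum(ref$scaled * ref$pop) / sum(ref$pop)
print(all.equal(implied, target_group_rate))
```

Two caveats follow from the code above rather than from this sketch: the agreement is exact only with respect to the reference population's age structure, not the target country's, and the final smoothing pass changes the scaled rates, so the smoothed curve need not reproduce the grouped rates exactly.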
Dear Prof. Hyndman,
I am currently looking for a method to disaggregate and estimate mortality rates by single years of age from aggregated data, such as that available from mortality.org or similar resources, which typically provide age-grouped data (e.g., 5-year or 10-year groups).
Here is an example illustrating the type of challenge I am working on:
I was wondering whether this package, or any of your other R packages, supports this type of use case, or whether you might have any additional recommendations or pointers that could guide me in the right direction.
Thank you very much for your time and any insights you might share.
Ben M