Method for Disaggregating Grouped Mortality Rates by Single Ages #59

Open
USMortality opened this issue Dec 6, 2024 · 3 comments

USMortality commented Dec 6, 2024

Dear Prof. Hyndman,

I am currently looking for a method to disaggregate and estimate mortality rates by single years of age from aggregated data, such as that available from mortality.org or similar resources, which typically provide age-grouped data (e.g., 5-year or 10-year groups).

Here is an example illustrating the type of challenge I am working on:

[Image: plot-1]

# USA 2022
tibble(
  age = c("00-14", "15-64", "65-74", "75-84", "85+"), # Age Group
  cmr = c(51.5, 381, 1966, 4738, 15404) # Crude Mortality Rate per 100k
)

I was wondering whether this or any of your other R packages supports this type of use case, or whether you might have any recommendations or pointers that could guide me in the right direction.

Thank you very much for your time and any insights you might share.

Ben M

robjhyndman commented Dec 7, 2024

See https://robjhyndman.com/publications/monotonic-splines-2/
Method implemented in demography::cm.spline()
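
For count data, one common way to apply monotone interpolation is to interpolate the cumulative counts at the age-group boundaries and then difference the result. The sketch below illustrates that idea only: it uses base R's `splinefun(method = "hyman")` as a stand-in rather than `demography::cm.spline()`, and the death counts are made up for illustration.

```r
# Hypothetical death counts by age group (made-up numbers, illustration only)
upper  <- c(15, 65, 75, 85, 111)          # exclusive upper bound of each group
deaths <- c(300, 9000, 7000, 10000, 12000)

# Cumulative deaths at the group boundaries are non-decreasing by construction
bounds     <- c(0, upper)
cum_deaths <- c(0, cumsum(deaths))

# Monotone cubic interpolation of the cumulative curve
# (base R stand-in; not necessarily the same algorithm as cm.spline)
cum_fun <- splinefun(bounds, cum_deaths, method = "hyman")

# Single-age counts are differences of the interpolated cumulative curve
ages   <- 0:110
single <- cum_fun(ages + 1) - cum_fun(ages)

# Each group's single-age counts add back up to the group total
group_sums <- tapply(single, cut(ages, bounds, right = FALSE), sum)
```

Because the interpolant passes through the cumulative totals and never decreases, the single-age counts are non-negative and sum to the original group counts.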

robjhyndman commented

Actually that paper works for population or deaths, or some other count variable, but not for rates.
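
A small base R sketch of why rates behave differently from counts, using made-up numbers: deaths add up across ages, whereas a group rate is a population-weighted average of the single-age rates, so it cannot be split by the same cumulative-sum argument.

```r
# Made-up single-age populations and death rates for a three-age group
pop  <- c(1000, 2000, 3000)
rate <- c(0.001, 0.002, 0.004)   # deaths per person

deaths <- pop * rate

# Counts aggregate by simple summation
group_deaths <- sum(deaths)

# The group rate is total deaths over total population,
# i.e. a population-weighted mean of the single-age rates
group_rate <- group_deaths / sum(pop)

# It matches the weighted mean, not the sum or the plain mean of the rates
c(group_rate = group_rate,
  weighted   = weighted.mean(rate, pop),
  plain_mean = mean(rate),
  sum_rates  = sum(rate))
```

Recovering single-age rates from a group rate therefore needs some information about the age pattern within the group, which is what a reference population supplies.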

robjhyndman commented

Here's an attempt that might give you a start, using the vital package. It uses a reference population: the disaggregation is roughly proportional to the mortality pattern in the reference population. I used 2022 Norway as the reference, but you should use a population whose mortality curve is likely to be close to the right shape.

library(dplyr)
library(ggplot2)
library(vital)

# USA 2022
usa2022 <- tibble(
  age = c("00-14", "15-64", "65-74", "75-84", "85+"), # Age Group
  cmr = c(51.5, 381, 1966, 4738, 15404) # Crude Mortality Rate per 100k
) |>
  mutate(
    year = 2022,
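    # Lower bound: first number in the label, e.g. 15 from "15-64"
    lower_age = readr::parse_number(age),
    # Upper bound: trailing number in the label, e.g. "64" or "85+"
    upper_age = stringr::str_extract(age, "[\\d]*[+]*$"),
    upper_age = readr::parse_number(upper_age),
    # Treat the open-ended last group ("85+") as running up to age 110
    upper_age = if_else(upper_age == max(upper_age), 110, upper_age)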
  )

# Reference population for disaggregation
reference <- vital::norway_mortality |>
  filter(Year == max(Year), Sex == "Total") |>
  vital::smooth_mortality(Mortality) |>
  as_tibble() |>
  select(age = Age, pop = Population, cmr = .smooth)

# Disaggregate the age groups
disaggregate <- function(data, reference) {
  years <- unique(data[["year"]])
  agegroups <- unique(data[["age"]])
  ages <- unique(reference[["age"]])
  disagg <- expand.grid(age = ages, year = years) |>
    left_join(reference, by = "age")
  for (i in seq_along(agegroups)) {
    lo <- data$lower_age[i]
    hi <- data$upper_age[i]
    # Crude mortality rate for the given age group in the reference data
    age_group_ref <- reference |>
      filter(age >= lo, age <= hi) |>
      mutate(deaths = cmr * pop)
    total_cmr <- sum(age_group_ref$deaths) / sum(age_group_ref$pop)
    # Scale age-specific mortality rates
    disagg <- disagg |>
      mutate(cmr = if_else(age >= lo & age <= hi,
        cmr * data$cmr[i] / total_cmr, cmr
      ))
  }
  # Smooth the scaled rates across ages to remove jumps at group boundaries
  disagg <- disagg |>
    vital::as_vital(index = "year", key = "age", .age = "age", .pop = "pop") |>
    vital::smooth_mortality(cmr)
  disagg |> select(year, age, cmr = .smooth)
}

disaggregate(usa2022, reference) |>
  ggplot() +
  aes(x = age, y = cmr / 100000) +
  geom_line() +
  geom_step(data = usa2022, aes(x = lower_age - 1), direction = "hv", color = "red") +
  geom_step(data = usa2022, aes(x = upper_age), color = "red", direction = "vh") +
  scale_y_log10()

Created on 2024-12-07 with reprex v2.1.1
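
The scaling step inside the loop can be checked in isolation. The base R sketch below uses a tiny made-up reference population and made-up target group rates (all names and numbers are hypothetical) to show that, before any smoothing, the scaled single-age rates re-aggregate to the target group rates when weighted by the reference population.

```r
# Made-up reference data: single ages 0..4 (illustration only)
ref <- data.frame(
  age  = 0:4,
  pop  = c(100, 120, 150, 130, 90),
  rate = c(0.004, 0.001, 0.002, 0.003, 0.006)
)

# Made-up target group rates for age groups 0-1 and 2-4
target <- data.frame(lower = c(0, 2), upper = c(1, 4), rate = c(0.003, 0.005))

scaled <- ref$rate
for (i in seq_len(nrow(target))) {
  in_group <- ref$age >= target$lower[i] & ref$age <= target$upper[i]
  # Group rate implied by the reference population
  ref_group_rate <- sum(ref$rate[in_group] * ref$pop[in_group]) /
    sum(ref$pop[in_group])
  # Rescale the reference shape so the group matches the target rate
  scaled[in_group] <- ref$rate[in_group] * target$rate[i] / ref_group_rate
}

# Re-aggregate the scaled rates using the reference population as weights
reagg <- sapply(seq_len(nrow(target)), function(i) {
  in_group <- ref$age >= target$lower[i] & ref$age <= target$upper[i]
  weighted.mean(scaled[in_group], ref$pop[in_group])
})
```

Two caveats follow from this: the match is exact only with the reference population's age structure as weights, and the final smoothing pass in `disaggregate()` will generally move the rates slightly away from an exact match.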
