Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forecasting age specific mortality rates with prediction intervals produced from the bootstrap variants using percentiles #58

Open
ThisaakhyaJ opened this issue Dec 6, 2024 · 9 comments

Comments

@ThisaakhyaJ
Copy link

Hi I am a PhD student currently attached to the School of Mathematics and Statistics of the University of Melbourne. For my PhD projec,t I need to forecast age specific mortality and fertility rates of specific countries. These outputs will be used to answer my main research questions related to an application of the Markovian binary tree. I am able to do this without any problem when I use the default parametric method. This gives me the forecasts with prediction intervals computed using the normal distribution. However when I try to forecast by using bootstrap variants it fails. I wanted to implement this method after reading your paper "Functional time series forecasting". To remedy this I tried using the FTSA package as well. However, it does not work.

Is there a way for me to forecast age specific rates by obtaining bootstrap samples using the Demography package itself? The code I am using is shown below.

library(demography)
library(tidyverse)
library(dplyr)
library(readxl)
library("stringr")
library(openxlsx)
library(reshape2) # To convert data into demogdata object
library(forecast)

Importing fertility and mortality data of USA

data<-read_excel("us_rates.xlsx")

Making demogdata object

Matrix of rates

data_matrix <- dcast(data, age ~ year, value.var = "mortality")
rownames(data_matrix) <- data_matrix$age
data_matrix <- data_matrix[, -1] # Remove the age column now that it's used as row names

Matrix of population

pop_matrix <- dcast(data, age ~ year, value.var = "population")
rownames(pop_matrix) <- pop_matrix$age
pop_matrix <- pop_matrix[, -1] # Remove the age column now that it's used as row names

Vectors of age and years

ages<-sort(as.numeric(rownames(pop_matrix)))
years<-sort(as.numeric(colnames(pop_matrix)))

Convert data into deomogdata object

demog<-demogdata(data = data_matrix, pop=pop_matrix, ages=ages, years=years,
type="mortality", label="United States of America", name="female")

Smoothing mortality rates

smus <- smooth.demogdata(demog)
plot(demog, years=1950, series="female", type="p", pch=1, col="gray")
lines(smus, years=1950, series="female")

Fitting FPC to mortality rates

fdm.female_1 <- fdm(smus, series="female",ages = ages, max.age = max(ages), method="classical", weight=TRUE, beta=0.2, order=6)

Forecasting with normal prediction intervals

forecast.fdm.female_parametric <- forecast(fdm.female_1, h=20, level = 90)
plot(smus, series="female", ylim=c(-10,0), lty=2)
lines(forecast.fdm.female_parametric)

Forecasting with bootstrap prediction intervals

forecast.fdm.female_non_parametric <- forecast(fdm.female_1, h=20, level = 90, pimethod="nonparametric", B=100, usedata = nrow(fdm.female_1$coeff))

The last command gives me the following error,

Error in if (n <= 0L) stop("'spline' requires n >= 1") :
argument is of length zero

I cannot seem to find the problem. Would you be able to help me with this? Or is there an alternative method to do the same?

Thank you in advance,
Thisaakhya

@robjhyndman
Copy link
Owner

Please provide a minimal reproducible example.

@ThisaakhyaJ
Copy link
Author

ThisaakhyaJ commented Dec 6, 2024

Hi Rob,

Will this do?

library(demography)

usa <- hmd.mx("USA", "username", "password", "USA")
usa1950.100 <- extract.years(extract.ages(usa, 0:100), years=1950:max(usa$year))

Smoothing mortality rates
smus1950.100 <- smooth.demogdata(usa1950.100)
plot(usa1950.100, years=2003, series="female", type="p", pch=1, col="gray")
lines(smus1950.100, years=2003, series="female")

Fitting FPC to mortality rates
fdm.female <- fdm(smus1950.100, series=
"female"
, method=
"classical"
, weight=TRUE, beta=0.1)

Forecasting with normal prediction intervals (the default method)
forecast.fdm.female_parametric <- forecast(fdm.female, h=20, level = 90)
plot(smus1950.100, series="female", ylim=c(-10,0), lty=2)
lines(forecast.fdm.female_parametric)

Forecasting with bootstrap prediction intervals
forecast.fdm.female_non_parametric <- forecast(fdm.female, h=20, level = 90, pimethod="nonparametric")

This is the error I get after the last R command above

Error in if (n <= 0L) stop("'spline' requires n >= 1") :
argument is of length zero

@robjhyndman
Copy link
Owner

Thanks. Fixed in 8200edd

@ThisaakhyaJ
Copy link
Author

ThisaakhyaJ commented Dec 6, 2024

Thank you very much Rob. I reinstalled the demography package from CRAN and github. However, the error still seems to appear. Is there any other way I could get the version with the adjustment?

@robjhyndman
Copy link
Owner

The CRAN version won't be updated any time soon. Install from github. Then restart your R session.

@ThisaakhyaJ
Copy link
Author

ThisaakhyaJ commented Dec 7, 2024

Hi Rob

Thank you very much. I Just had two more suggestions.

  1. Is there a way to get the bootstrap prediction forecasts as as an output value of the forecast.fdm function similar to the forecast function in the ftsm package? Hoping that this would be a good addition to the existing function.
    I mean something like this as an output,
    bootsamp: n array of dimension = c(p, B, h)dimension=c(p,B,h) containing the bootstrapped point forecasts. p is the number of variables. B is the number of bootstrap samples. h is the forecast horizon.

  2. Allowing forecasting with bootstrap prediction intervals when another lambda value is used for the box cox transformation. For example using lambda = 0.4 instead of 0 for fertility rates. I actually tried this out and the forecasts seems to be in a different scale when pimethod= "nonparametric".

Hoping that these suggestions might improve the flexibility of the demography package.

Thank you very much.
Thisaakhya

@robjhyndman
Copy link
Owner

These features are possible using the vital package, which is the successor to the demography package. The demography package is only being updated to fix bugs; I won't be adding new features. Please consider using the vital package instead, which I am actively developing.

@robjhyndman
Copy link
Owner

Here is how to do it using vital:

library(vital)
library(dplyr)
library(ggplot2)

usa <- read_hmd("USA", "[email protected]", "r#ggE5KkACGAhd6f")
usa1950.100 <- usa |>
  filter(Age <= 100, Year >= 1950)

# Smoothing mortality rates
smus1950.100 <- usa1950.100 |>
  smooth_mortality(Mortality)

usa1950.100 |>
  filter(Year == 2003, Sex == "Female") |>
  ggplot(aes(x = Age, y = Mortality)) +
  geom_point() +
  geom_line(data = smus1950.100 |> filter(Year == 2003, Sex == "Female")) +
  scale_y_log10()

# Fitting FPC to mortality rates
fdm <- smus1950.100 |>
  model(fdm = FDM(log(Mortality)))

# Forecasting with normal prediction intervals
forecast.fdm <- forecast(fdm, h = 20)

forecast.fdm |>
  filter(Sex == "Female", Year == 2030) |>
  mutate(
    lo = quantile(Mortality, 0.025),
    hi = quantile(Mortality, 0.975)
  ) |>
  ggplot(aes(x = Age)) +
  geom_line(aes(y = .mean)) +
  geom_ribbon(aes(ymin = lo, ymax = hi), alpha = 0.5) +
  scale_y_log10()

# Forecasting with bootstrap prediction intervals
forecast.fdm.bootstrap <- fdm |>
  filter(Sex == "Female") |>
  forecast(h = 20, times = 500, bootstrap = TRUE)

forecast.fdm.bootstrap |>
  filter(Sex == "Female", Year == 2030) |>
  mutate(
    lo = quantile(Mortality, 0.025),
    hi = quantile(Mortality, 0.975)
  ) |>
  ggplot(aes(x = Age)) +
  geom_line(aes(y = .mean)) +
  geom_ribbon(aes(ymin = lo, ymax = hi), alpha = 0.5) +
  scale_y_log10()

Created on 2024-12-08 with reprex v2.1.1

@ThisaakhyaJ
Copy link
Author

Hi Rob. Thank you so much for this. I really appreciate it. Will use this package to do it right away.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants