Feature request: support of svykm objects #224

Open
larmarange opened this issue Sep 26, 2024 · 6 comments

@larmarange

larmarange commented Sep 26, 2024

When dealing with complex and weighted datasets, survey::svykm() should be used instead of survival::survfit(). Would it be relevant to add support for such objects?
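
For context, here is a minimal sketch of the two calls side by side. It adapts the `pbc` example from the survey::svykm() help page. The object names (`dpbc`, `km_unweighted`, `km_weighted`, `km_by_bili`) are illustrative, and the sampling probabilities come from a working model rather than a real survey.

```r
library(survival)
library(survey)

# pbc ships with survival; treat the randomized patients as a biased sample
data(pbc, package = "survival")
pbc$randomized <- with(pbc, !is.na(trt) & trt > 0)
biasmodel <- glm(randomized ~ age * edema, data = pbc)
pbc$randprob <- fitted(biasmodel)

# Survey design with unequal inclusion probabilities
dpbc <- svydesign(
  id = ~1, prob = ~randprob, strata = ~edema,
  data = subset(pbc, randomized)
)

# Unweighted Kaplan-Meier estimate (ignores the design)
km_unweighted <- survfit(
  Surv(time, status > 0) ~ 1,
  data = subset(pbc, randomized)
)

# Design-based Kaplan-Meier estimates, overall and by stratum
km_weighted <- svykm(Surv(time, status > 0) ~ 1, design = dpbc)
km_by_bili <- svykm(Surv(time, status > 0) ~ I(bili > 6), design = dpbc)
```

The unstratified call returns a single svykm object, while the stratified call returns a list with one svykm element per stratum, so code written for survfit objects cannot be reused as is.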

@ddsjoberg
Collaborator

Hey @larmarange !! I think it can be supported here.

Just like in gtsummary, where we've been developing an ARD-first approach, I plan on updating ggsurvfit to be ARD-first as well. Once that is done, we'd just need an ARD function for survey::svykm(), and then it can simply be incorporated into the pipeline.

Unfortunately, I don't have a timeline for this (there are more pressing matters to deal with at the moment). Also, once we generalize to accept ARD inputs, I wonder whether it'll be best to support survey directly here or in a spinoff package that takes advantage of this infrastructure. That's something we can discuss and decide when we get to that point (though I'm not sure when that will be).

@larmarange
Author

Hi @ddsjoberg

Thanks for your feedback. I also noted that gtsummary::tbl_survfit() is not compatible with svykm().

Would a first step be to add an ard_survey_svykm() function in cardx?

@ddsjoberg
Collaborator

Yeah, that would be among the first steps for sure. The BIG item is going to be updates in ggsurvfit to handle ARDs.

Regarding ARDs, I still need to land on a consistent method for reporting variable-level and model-level statistics and for linking them. This applies to regression models, where coefficients are associated with variables in the model, and we also have model-level stats like AIC (and many others).

Anyway, what I want to say is that if you write a cardx::ard_survey_svykm() function now, we'll probably need to update it in the future once we work out some storage details.

@larmarange
Author

Let me know if you think it's too early for cardx::ard_survey_svykm(). There is no urgency. But in that case, it could be useful to keep an issue open as a reminder.

So far, I've been teaching my students the classic approach for weighted KM, and I give them a small function to get the data at given time points: https://larmarange.github.io/guide-R/analyses_avancees/analyse-survie.html#analyse-de-survie-pond%C3%A9r%C3%A9e

But in the longer term, it would be nice to have a unified way to do it, regardless of whether the data are weighted or not.

@jinseob2kim

How about https://github.com/jinseob2kim/jskm ?

@larmarange
Author

Thanks @jinseob2kim. I have added a reference to jskm in my teaching material.

@ddsjoberg Just as a reminder for when support of svykm() is developed in cardx: I have drafted two exploratory functions to get times and probs from such objects.

svykm_probs <- function(x,
                        probs = c(1, .75, .5, .25),
                        ci_level = .95,
                        strata = NULL) {
  if (inherits(x, "svykm")) {
    if (is.null(ci_level) || is.null(x$varlog)) {
      res <- quantile(x, probs, ci = FALSE) |> 
        dplyr::as_tibble(rownames = "prob")
    } else {
      tmp <- quantile(
        x,
        probs,
        ci = TRUE,
        level = ci_level
      )
      ci <- attr(tmp, "ci") |> 
        dplyr::as_tibble(rownames = "prob") |> 
        dplyr::rename(conf.low = 2, conf.high = 3)
      res <- tmp |> 
        dplyr::as_tibble(rownames = "prob") |> 
        dplyr::left_join(ci, by = "prob")
    }
    if (!is.null(strata))
      res$strata <- strata
    res
  } else {
    x |> 
      seq_along() |> 
      lapply(
        \(i) {
          svykm_probs(
            x[[i]],
            probs = probs,
            ci_level = ci_level,
            strata = names(x)[[i]]
          )
        }
      ) |> 
      dplyr::bind_rows()
  }
}

svykm_times <- function(x,
                        times,
                        ci_level = .95,
                        strata = NULL) {
  if (inherits(x, "svykm")) {
    idx <- sapply(
      times,
      function(t) max(which(x$time <= t))
    )
    if (is.null(ci_level) || is.null(x$varlog)) {
      res <- dplyr::tibble(
        time = times,
        value = x$surv[idx]
      )
    } else {
      ci <- confint(x, parm = times, level = ci_level)
      res <- dplyr::tibble(
        time = times,
        value = x$surv[idx],
        conf.low = ci[, 1],
        conf.high = ci[, 2]
      )
    }  
    if (!is.null(strata))
      res$strata <- strata
    res
  } else {
    x |> 
      seq_along() |> 
      lapply(
        \(i) {
          svykm_times(
            x[[i]],
            times = times,
            ci_level = ci_level,
            strata = names(x)[[i]]
          )
        }
      ) |> 
      dplyr::bind_rows()
  }
}

Some examples of use here: https://larmarange.github.io/guide-R/analyses_avancees/analyse-survie.html#courbes-de-kaplan-meier
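
The two functions above are thin wrappers around the quantile() and confint() methods for svykm objects. A minimal sketch of those underlying calls, adapting the `pbc` example from the survey::svykm() help page (the object names `dpbc`, `km` and `q` are illustrative). Standard errors are only available when the curve is fitted with se = TRUE:

```r
library(survival)
library(survey)

# Build a design with unequal inclusion probabilities from the pbc data
data(pbc, package = "survival")
pbc$randomized <- with(pbc, !is.na(trt) & trt > 0)
pbc$randprob <- fitted(glm(randomized ~ age * edema, data = pbc))
dpbc <- svydesign(
  id = ~1, prob = ~randprob, strata = ~edema,
  data = subset(pbc, randomized)
)

# se = TRUE stores the variance estimates needed for confidence intervals
km <- svykm(Surv(time, status > 0) ~ I(bili > 6), design = dpbc, se = TRUE)

# Stratified fits are lists with one svykm element per stratum
names(km)

# Times at which survival drops to the given probabilities (first stratum),
# with the confidence limits stored in the "ci" attribute
q <- quantile(km[[1]], probs = c(.75, .5, .25), ci = TRUE)
q
attr(q, "ci")

# Confidence intervals for survival at 1 to 5 years (second stratum)
confint(km[[2]], parm = 365 * (1:5))
```

With se = FALSE (the default), the object has no varlog component, which is why both helper functions fall back to point estimates in that case.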
