Do we need functions for working with our hierarchical results? #301

Open
ddsjoberg opened this issue Aug 26, 2024 · 7 comments
@ddsjoberg
Collaborator

  • Filtering based on the prevalence?
  • Sorting them by prevalence?
@ddsjoberg
Collaborator Author

Make separate functions we can pipe into to filter/sort.

Make this generalizable for ard_categorical() as well.

@ddsjoberg
Collaborator Author

Question for the devs: Do we ever need to filter on anything other than the last grouping variable?

@ddsjoberg
Collaborator Author

@bzkrouse @jtalboys take a look at your shells and list the types of filtering we may need.

@ddsjoberg
Collaborator Author

ddsjoberg commented Jan 27, 2025

@bzkrouse @jtalboys Would an API like this work for your needs? I think it meets our needs at Roche...

Can you let us know by Wednesday, so we can get started on the implementation?

ard <- 
  cards::ard_stack_hierarchical(
    cards::ADAE,
    variables = c(AESOC, AEDECOD),
    by = TRTA,
    denominator = cards::ADSL |> dplyr::rename(TRTA = ARM),
    id = USUBJID,
    filter = <>, 
  )

# `filter = p > 0.05`: we keep all rows with at least one AE with a prevalence above 5%
# `filter = n > 10`: we keep all rows with at least one AE with a count above 10
# `filter = sum(n) > 10`: we keep all rows where the sum of the counts in the row is above 10

Here, the rows are across `by` groups.
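
To make the three proposed semantics concrete, here is a minimal base R sketch on a toy table. It is not how `cards` implements (or will implement) the `filter` argument: the data are made up, and `keep_terms()` is a hypothetical helper that evaluates the expression once per AE term across all `by` groups and keeps the term if any element of the result is `TRUE`.

```r
# Toy long-format results: one record per AE term and treatment arm.
# All values are invented for illustration; this is not cards output.
toy <- data.frame(
  AEDECOD = rep(c("HEADACHE", "NAUSEA", "RASH"), each = 2),
  TRTA    = rep(c("Placebo", "Drug"), times = 3),
  n       = c(12, 3, 6, 5, 1, 2),
  p       = c(0.12, 0.03, 0.06, 0.05, 0.01, 0.02),
  stringsAsFactors = FALSE
)

# Hypothetical helper (assumption, not part of cards): evaluate `expr` once
# per AE term, with the statistics from every `by` group visible, and keep
# the term when at least one element of the result is TRUE.
keep_terms <- function(data, expr) {
  expr <- substitute(expr)
  groups <- split(data, data$AEDECOD)
  keep <- vapply(groups, function(g) any(eval(expr, g)), logical(1))
  data[data$AEDECOD %in% names(keep)[keep], , drop = FALSE]
}

# A count above 10 in at least one arm
unique(keep_terms(toy, n > 10)$AEDECOD)

# A prevalence above 5% in at least one arm
unique(keep_terms(toy, p > 0.05)$AEDECOD)

# Counts summed across arms above 10
unique(keep_terms(toy, sum(n) > 10)$AEDECOD)
```

In this toy data, NAUSEA has no single arm with a count above 10, but its counts sum to 11 across arms, so `n > 10` and `sum(n) > 10` select different terms.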

Image

@ddsjoberg
Collaborator Author

I know we casually mentioned using follow-up functions, but by that point we lose the detailed information about which columns were grouping variables and which were variables in the original call. That is why the filtering is now proposed as part of the primary function. Let us know by EOD if you'd like to discuss further. If not, we'll get started on the implementation tomorrow!

@jtalboys
Collaborator

Hi @ddsjoberg, sorry for the delay in replying but this looks great! Would the p's and n's in your example come from the stat_label? So the user would have the flexibility to filter by any stat types in the table?

@ddsjoberg
Collaborator Author

@jtalboys yes, users can filter on any of the statistics. The n and p would come from the stat_name, rather than the stat_label, however.
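
As a rough illustration of why the names would come from `stat_name`, the sketch below builds a toy ARD-like table in base R and evaluates a filter expression against statistics keyed by `stat_name`. The values are invented, the column set is a simplified assumption about the ARD layout, and none of this reflects the actual implementation.

```r
# Toy ARD-like records: one row per statistic. Values are made up and the
# columns are a simplified assumption; real cards output has more columns.
ard_toy <- data.frame(
  variable_level = rep(c("HEADACHE", "NAUSEA"), each = 3),
  stat_name      = rep(c("n", "N", "p"), times = 2),
  stat_label     = rep(c("n", "N", "%"), times = 2),
  stat           = c(12, 100, 0.12, 4, 100, 0.04),
  stringsAsFactors = FALSE
)

# Collect each term's statistics into a named list keyed by stat_name, so an
# expression such as `p > 0.05` can look up `p` by name.
stats_by_term <- lapply(
  split(ard_toy, ard_toy$variable_level),
  function(d) as.list(setNames(d$stat, d$stat_name))
)

# Evaluate the same filter expression for each term
eval(quote(p > 0.05), stats_by_term$HEADACHE)
eval(quote(p > 0.05), stats_by_term$NAUSEA)
```

A label such as `%` is not a syntactic R name, which is one practical reason to key the expression on `stat_name` rather than `stat_label`.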
