Analysis/Session Provenance #18

Open
cjsifuen opened this issue Dec 2, 2022 · 3 comments
Labels
enhancement New feature or request

Comments


cjsifuen commented Dec 2, 2022

Enable users to capture with ease, fidelity, and accuracy the actions/analysis performed on a dataset or sets of datasets.

Information to capture

  • Files uploaded, data filtering steps and parameters used
  • Dataset/visualization subsetting and parameters
  • Selection of visualization types, positions, sizes, etc.

Potential implementations

  • Capture information in a file that can be used to "rerun" what was done
  • Capture and save as an "instance", in a more perpetual nature
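
As a rough illustration of the first option, the sketch below replays a list of recorded steps against a small in-memory table. It is written in Python only to keep it language-neutral; the step names (`keep_rows`, `drop_columns`), the JSON layout, and the sample rows are all hypothetical and not taken from the project.

```python
import json


# Hypothetical step handlers; a real app would map these to its own filters.
def keep_rows(rows, column, values):
    """Keep only rows whose `column` value is in `values`."""
    return [r for r in rows if r[column] in values]


def drop_columns(rows, columns):
    """Remove the named columns from every row."""
    return [{k: v for k, v in r.items() if k not in columns} for r in rows]


HANDLERS = {"keep_rows": keep_rows, "drop_columns": drop_columns}


def replay(rows, steps_json):
    """Apply each recorded step, in order, to the starting rows."""
    for step in json.loads(steps_json):
        rows = HANDLERS[step["action"]](rows, **step["params"])
    return rows


# The "file" that captures what was done (kept as a string here for brevity).
steps_json = json.dumps([
    {"action": "keep_rows", "params": {"column": "group", "values": ["control"]}},
    {"action": "drop_columns", "params": {"columns": ["batch"]}},
])

rows = [
    {"id": 1, "group": "control", "batch": "a"},
    {"id": 2, "group": "treated", "batch": "b"},
]
result = replay(rows, steps_json)
```

Keeping the steps as plain data rather than code means the same file could be read by a hosted or a local version, which is one of the considerations listed below.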

Considerations

  • This might look different for datasets of different sizes
  • This might look different for a hosted vs local version
cjsifuen added the "enhancement" (New feature or request) label Dec 2, 2022
@cjsifuen
Author

I spoke with the imaging group about learning from napari. Their strategy is different in that they support a local instance only. They capture the commands that are run, but I'm not sure that is actually the type of provenance we're talking about.

Member

ergonyc commented Feb 2, 2023

From what I understand about SODA's current workflow, this should be pretty straightforward. The "output" of SODA is either a downloaded dataset or a visualization, so I think there are just four states to log or capture:

  1. data origin (file name / path / source?) + metadata
  2. sample filter
  3. feature filter
  4. visualization
    • vis type + parameters
    • subset selection

I think a generic R logging module can handle the capture, so the log just needs to be added to the metadata and saved along with the visualization / data.
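
To make the four states concrete, here is a minimal sketch of what one saved record could look like. It is in Python purely as a language-neutral illustration (an R logging module would look different), and every field name and sample value is an assumption rather than SODA's actual structure.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass
class ProvenanceRecord:
    # Hypothetical field names; they mirror the list above, not SODA's code.
    data_origin: dict                                   # file name / path / source + metadata
    sample_filter: dict = field(default_factory=dict)   # which samples were kept
    feature_filter: dict = field(default_factory=dict)  # which features were kept
    visualization: dict = field(default_factory=dict)   # vis type, parameters, subset
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        """Serialize the record so it can be stored next to the output."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Made-up values, only to show the shape of a record.
record = ProvenanceRecord(
    data_origin={"file": "example.csv", "source": "upload"},
    sample_filter={"column": "group", "keep": ["control", "treated"]},
    feature_filter={"min_variance": 0.1},
    visualization={"type": "pca", "params": {"components": 2}, "subset": "all"},
)
print(record.to_json())
```

The JSON string could then be written into the metadata of the downloaded dataset or saved as a sidecar file next to the exported visualization.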

Author

cjsifuen commented Feb 3, 2023

> From what I understand about SODA's current workflow, this should be pretty straightforward. The "output" of SODA is either a downloaded dataset or a visualization, so I think there are just four states to log or capture:
>
>   1. data origin (file name / path / source?) + metadata
>   2. sample filter
>   3. feature filter
>   4. visualization
>     • vis type + parameters
>     • subset selection
>
> I think a generic R logging module can handle the capture, so the log just needs to be added to the metadata and saved along with the visualization / data.

This would be a light way to implement the first option.

A few more things to flag if this approach were taken:

  • Could add a "save" or "log" button to actively log metadata, but would also want to log changes automatically.
  • Might want to ensure no additional filtering takes place in the UI outside of what is logged
  • Should check that the R logging captures interactive visualizations
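
The first point above (automatic logging plus an explicit button) could be sketched roughly as follows. This is a language-neutral Python illustration, not Shiny code; the class name, method names, and event fields are all hypothetical.

```python
import json
from datetime import datetime, timezone


class ProvenanceLog:
    """Append-only event log; all names here are illustrative."""

    def __init__(self):
        self.events = []

    def record(self, action, **params):
        # Intended to be called from every UI handler, so that changes
        # are logged automatically without the user doing anything.
        self.events.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "params": params,
        })

    def snapshot(self, label="manual save"):
        # Intended to back an explicit "save"/"log" button: it marks the
        # current point in the log and returns a serialized copy.
        self.record("snapshot", label=label)
        return json.dumps(self.events, indent=2)


log = ProvenanceLog()
log.record("sample_filter", column="group", keep=["control"])
log.record("set_visualization", type="heatmap")
saved = log.snapshot()
```

Routing every filter through one `record` call is also one way to approach the second point: any filtering that bypasses the log would have to bypass the handler too.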

A possible way to do more complex logging/debugging could be to use a shiny logger to capture events and interactions -- though perhaps this is unnecessary. Just wanted to add some options that I found here.

2 participants