fixing carina
accidentally overwrote carina with tammy. now they're both there
codebeaker committed Jul 17, 2024
1 parent 7d66b32 commit 3d17c6d
Showing 2 changed files with 128 additions and 12 deletions.
26 changes: 14 additions & 12 deletions family/cards/Carina-the-clinical-researcher.qmd
@@ -1,5 +1,5 @@
---
-title: "Carina the Clinical Researcher"
+title: "Carina the clinical researcher"
toc-expand: 2
---

@@ -9,20 +9,22 @@ toc-expand: 2
:::

::: {.g-col-12 .g-col-xl-8}
-- Carina **needs** direct access to structured and unstructured clinical data so she can design trials, conduct clinical research, and assess the feasibility of studies
-- She **struggles** with getting all her analyses done fast enough to meet her grant deadlines
-- **We can help** her by providing direct access to clinical data for retrospective chart review studies, prospective cohort studies, randomized trials, and other clinical research. Secondary use of patient data collected during the course of care will dramatically improve research efficiency and lower costs.
+- Carina **needs** direct access to structured and unstructured clinical data so she can design trials, conduct clinical research, and assess the feasibility of studies.
+- She **struggles** with getting all her analyses done fast enough to meet her grant deadlines
+- **We can help** by providing direct access to clinical data for retrospective chart review studies, prospective cohort studies, randomized trials, and other clinical research. Secondary use of patient data collected during the course of care will dramatically improve research efficiency and lower costs.
:::
:::

::: lightblue-highlight
## Carina needs to get her data set or analysis ready in time for grant deadlines

-Time, tide, and grant deadlines wait for no one, as Carina knows very well. In addition to offering patient care, Carina is always developing ideas for how to make a positive impact on the world and analyzing data to spot critical needs that must be met. Although she can access patient records through Epic, she often needs help from Tammy the Translational Analytics Data Scientist to extract all the data she needs for analysis. Tammy sends her the data she needs in a csv, and Carina converts it to Excel to make the annotations she needs, and stores all the data on OneDrive. Tammy is fantastic, but Carina **wishes she could have direct access to the clinical data herself**. Sometimes **she’s not exactly sure of the right things to ask for**, and it takes a little back and forth with Tammy to nail down just what she needs. Other times she just wants to do some initial analysis to assess whether a study is even feasible, for instance by seeing how many patients at her institution meet particular criteria. Once she needed data access rapidly for a multi-institutional partnership. Some turnover time was needed for the Data Governance Analyst to make sure that her extramural partners had all the permissions in place to access the data, and Carina felt that if she had been able to access the data herself instead of going through gatekeepers she might have been able to save her wonderful colleagues a little time and ease their way. Since Carina doesn’t know how to code herself, she would love **an easy to use GUI to access and visualize data**, and a way to import data to other secure systems so she can combine the patient data she pulls with other data sets, such as survey or biospecimen repository data.
+Time, tide, and grant deadlines wait for no one, as Carina knows very well. In addition to offering patient care, Carina is always developing ideas for how to make a positive impact on the world and analyzing data to spot critical needs that must be met. Although she can access patient records through Epic, she often needs help from Tammy the Translational Analytics Data Scientist to extract all the data she needs for analysis. Tammy sends her the data she needs in a csv, and Carina converts it to Excel to make the annotations she needs, and stores all the data on OneDrive. Tammy is fantastic, but Carina wishes she could have **direct access to the clinical data** herself. Sometimes she’s **not exactly sure of the right things to ask for,** and it takes a little back and forth with Tammy to nail down just what she needs. Other times she just wants to do some initial analysis to assess whether a study is even feasible, for instance by seeing how many patients at Hutch meet particular criteria. Once she needed data access rapidly for a multi-institutional partnership. Some turnover time was needed for the Data Governance Analyst to make sure that her extramural partners had all the permissions in place to access the data, and Carina felt that if she had been able to access the data herself instead of going through gatekeepers she might have been able to save her wonderful colleagues a little time and ease their way. Since Carina doesn’t know how to code herself, she would love an **easy to use GUI to access and visualize data,** and a way to import data to other secure systems so she can combine the patient data she pulls with other data sets, such as survey or biospecimen repository data.
:::

::: darkblue-highlight
-Collaborators: Data Scientist, Data Governance Analyst Downstream users: clinical research community
+Collaborators: Data Scientist, Data Governance Analyst, Biostatistician
+
+Downstream users: clinical research community
:::

::: grid
@@ -33,32 +35,32 @@ Collaborators: Data Scientist, Data Governance Analyst Downstream users: clinica

- Lack of direct access to data makes it hard for her to see what data is available and conduct exploratory analysis, reducing her opportunities for discovery and inspiration

- Converting unstructured data (e.g., data from clinical notes) to structured data for research

- Combining data with data from other institutions, and making sure the resulting data sets are structured similarly to her extramural collaborators’ data

-- There is no PHI-approved cloud computing platform, so all data must be stored locally and all analyses must be performed locally, limiting options and posing a security risk
+- There is no PHI-approved cloud computing platform, so all data must be stored locally and all analyses must be performed locally, limiting options and posing a security risk\
:::

::: {.g-col-12 .g-col-xl-6}
# Needs and Wants

- Direct access to clinical data, with clear information on who to reach out to for help if she needs it

-- A PHI-approved cloud computing platform to store and analyze data, that supports best practices for code review and version control for reproducible science
+- A PHI-approved cloud computing platform to store and analyze data, that supports best practices for code review and version control for reproducible science

-- A de-identified data asset and chart review tool that is available without additional governance review to enable exploration and discovery
+- A de-identified data asset and chart review tool that is available without additional governance review to enable exploration and discovery\
:::
:::

# Types of data used

- Data about patients from EHR or other systems including patient demographics, conditions, comorbidities, treatments, location

- Novel, non-clinically reported data, such as research-use-only genetic assay results

- Survey and case report form type datasets

<div>

-Image attribution: "[Women In Tech - 38](https://www.flickr.com/photos/136629440@N06/21911138603)" by [wocintechchat.com](https://www.flickr.com/photos/136629440@N06) is licensed under [CC BY 2.0](https://creativecommons.org/licenses/by/2.0/?ref=openverse).
+Image attribution: "[Women In Tech - 53](https://www.flickr.com/photos/136629440@N06/22344625928)" by [wocintechchat.com](https://www.flickr.com/photos/136629440@N06) is licensed under [CC BY 2.0](https://creativecommons.org/licenses/by/2.0/?ref=openverse).

</div>
114 changes: 114 additions & 0 deletions family/cards/Tammy-the-data-scientist.qmd
@@ -0,0 +1,114 @@
---
title: "Tammy the Data Scientist"
toc-expand: 2
---

::: grid
::: {.g-col-12 .g-col-xl-3}
<img src="/img/tammy.jpg" width="200" height="200"/>
:::

::: {.g-col-12 .g-col-xl-8}
- Tammy **needs** efficient ways to help users across the Hutch with analysis and access to healthcare data.
- She **struggles** to rapidly distribute data needed by researchers. Because there is no governed system to provision access and allow researchers to access data on their own and write their own queries, dataset development falls on Tammy.
- **We can help** her by creating a central platform that unifies data from disparate sources in an open (non-proprietary) common data model and provides modern data science tooling for statistical/ML model development and production, so that Tammy can spend more of her time using her skillset in statistical modeling and data science to build data products (e.g., NLP systems, predictive models) that address the data needs of multiple groups at once.
:::
:::

::: lightblue-highlight
## Tammy needs a queryable centralized data repository

Researchers at the Hutch need access to adult oncology program clinical data, and Tammy is here to help! She supports researchers with a wide range of data science experience and aims to create solutions that can be leveraged by researchers regardless of the target or disease they study. Requests range from curated datasets for specific research projects, to reproducible reports/dashboards, to output from statistical/ML models and NLP systems. Tammy can’t always get them what they need because she **doesn’t have access to all the relevant clinical data assets** and doesn’t have a **PHI-secure, cloud-based computing environment for developing statistical/ML workflows**. Tammy is focused on improving the interoperability of datasets and increasing democratization of data (along with the Data Governance Analyst) so that **researchers who have data skills can query data directly rather than wait on Tammy** to send data for their projects. Rather than spend time writing code for these projects, she would like to focus on building workflows and data science products that can be leveraged across research groups.
:::

::: darkblue-highlight
Collaborators: Data Engineers, Data Governance Analyst, Analytics Engineers, Clinical Analyst

Downstream users: Clinical/Translational Researcher, Biostatistician, Program and Service Line Managers
:::

::: grid
::: {.g-col-12 .g-col-xl-6}
# Key Challenges

- Understanding the landscape of clinical data applications at Fred Hutch, where data is stored, and how to acquire access

- Local machines are not the best computing environment for clinical data science; some clinical databases cannot be accessed from a Mac and many computing environments for reproducible analysis cannot be re-created on Windows

- Educating and nudging researchers towards best practices for clinical data science

- The lack of self-service tools for researchers means that Tammy spends more time building one-off datasets than building data science tools that can be used by many researchers

- There is no way to clearly attach information about data use agreements and access permissions to each dataset or project

- Availability of time and staff limits the pace and volume of help provided

- There is no unified system with all the relevant data; data must be collated from multiple systems
:::

::: {.g-col-12 .g-col-xl-6}
# Needs and Wants

- An efficient way to store and retrieve past models/queries for future reference

- A more efficient way to access multimodal clinical data that...

    - is PHI-approved

    - displays information about provenance, lineage, and data governance (e.g., whether a column contains PHI, what access restrictions are on the data)

    - supports best practices for dataset documentation

- Cloud computing environments for managing statistical/machine learning workflows

- Secure platform to publish and share deliverables (e.g., Quarto/Jupyter notebooks, dashboards, datasets)

- A way to help users help themselves to expand capacity of the department
:::
:::

# Types of data used

- Structured and unstructured data from the current EHR (Epic/Clarity) and historical EHR systems (ORCA/Cerner, etc.)

<!-- -->

- Cancer registry data (CNeXT)

<!-- -->

- Sunquest lab data

<!-- -->

- Mosaiq radiation oncology data

<!-- -->

- OnCore Clinical Trials Management (CTMS) system data

<!-- -->

- Pyxis medication administration

<!-- -->

- Gateway transplant and immunotherapy data

<!-- -->

- Novel, non-clinically reported data, such as research-use-only genetic assay results

<!-- -->

- Survey and case report form type datasets

- Validated lists of genomic data such as tumor mutations or structural variants

<div>

Image attribution: "[Women In Tech - 53](https://www.flickr.com/photos/136629440@N06/22344625928)" by [wocintechchat.com](https://www.flickr.com/photos/136629440@N06) is licensed under [CC BY 2.0](https://creativecommons.org/licenses/by/2.0/?ref=openverse).

</div>
