Revert "Provenance final pages (elixir-europe#364)" (elixir-europe#366)
This reverts commit 561a970.
bedroesb authored Oct 2, 2024
1 parent 561a970 commit 666fa53
Showing 4 changed files with 35 additions and 85 deletions.
81 changes: 0 additions & 81 deletions provenance/general_provenance.md

This file was deleted.

26 changes: 25 additions & 1 deletion provenance/human-clinical-and-health-data.md
@@ -1,7 +1,7 @@
---
title: Human clinical and health data
description: Tracking data and analysis steps.
contributors: [Rudolf Wittner]
contributors: [Rudolf Wittner, Stian Soiland-Reyes, Simone Leo]
page_id: hchd_provenance
redirect_from: /human-clinical-and-health-data/provenance
rdmkit:
@@ -14,10 +14,34 @@ training:
# More information on how to fill in this metadata section can be found here https://www.infectious-diseases-toolkit.org/contribute/page-metadata
---

## W3C PROV

[W3C PROV](https://www.w3.org/TR/prov-overview/) is a general-purpose standard for provenance information. It expresses provenance in terms of entities, activities, agents, and the relations among them. The standard's data model is realized in different serializations, including the [PROV-O ontology](https://www.w3.org/TR/prov-o/), which have been extended for various domains.

In addition to the [PROV primer](https://www.w3.org/TR/prov-primer/), the [PROV Book](https://www.provbook.org/) gives a detailed introduction to using PROV.
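
As a rough illustration of the entity/activity/agent model, the sketch below records a single analysis step as plain Python data. The relation names (`used`, `wasGeneratedBy`, `wasAssociatedWith`, `wasDerivedFrom`) come from the PROV data model; the `ProvDocument` class, the `sources_of` helper, and all `ex:` identifiers are hypothetical and exist only for this example. It is not a PROV serialization.

```python
from dataclasses import dataclass, field


@dataclass
class ProvDocument:
    """Toy container for PROV-style statements (hypothetical, for illustration)."""
    entities: set = field(default_factory=set)
    activities: set = field(default_factory=set)
    agents: set = field(default_factory=set)
    # Each relation is stored as a (relation_name, subject, object) triple.
    relations: list = field(default_factory=list)

    def relate(self, relation, subject, obj):
        self.relations.append((relation, subject, obj))

    def objects(self, relation, subject):
        """Return all objects linked to `subject` by `relation`."""
        return [o for r, s, o in self.relations if r == relation and s == subject]


doc = ProvDocument()
doc.entities |= {"ex:raw_reads", "ex:variant_calls"}  # things (data)
doc.activities.add("ex:variant_calling")               # a process
doc.agents.add("ex:analyst")                           # who was responsible

doc.relate("used", "ex:variant_calling", "ex:raw_reads")
doc.relate("wasGeneratedBy", "ex:variant_calls", "ex:variant_calling")
doc.relate("wasAssociatedWith", "ex:variant_calling", "ex:analyst")
doc.relate("wasDerivedFrom", "ex:variant_calls", "ex:raw_reads")


def sources_of(doc, entity):
    """Entities used by the activities that generated `entity`."""
    found = []
    for activity in doc.objects("wasGeneratedBy", entity):
        found.extend(doc.objects("used", activity))
    return found
```

Calling `sources_of(doc, "ex:variant_calls")` walks one step back through the recorded history, which is the kind of question a provenance record is meant to answer.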

## HL7 FHIR Provenance

[HL7 FHIR](http://hl7.org/fhir/) is an interoperability standard for healthcare information exchange between systems. FHIR aims to define the key entities involved in healthcare information exchange as resources.

FHIR supports [expressing provenance information](https://www.hl7.org/fhir/provenance.html) about resources. Provenance of a resource is "a record that describes entities and processes involved in producing and delivering or otherwise influencing that resource", and "tracks information about the activity that created, revised, deleted, or signed a version of a resource, describing the entities and agents involved".

The provenance part of HL7 FHIR extends W3C PROV.
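
To give a feel for what such a record looks like, here is a minimal sketch of a Provenance resource built as a Python dictionary and printed as JSON. The element names (`target`, `recorded`, `agent.who`, `entity.role`, `entity.what`) follow our reading of the FHIR Provenance resource and should be checked against the FHIR version in use; every reference value and the `targets` helper are made up for illustration.

```python
import json

# Hypothetical Provenance resource: all reference values are placeholders.
provenance = {
    "resourceType": "Provenance",
    # The resource(s) whose history this record describes.
    "target": [{"reference": "Observation/example-obs"}],
    # When the provenance record itself was written.
    "recorded": "2024-10-02T12:00:00Z",
    # Who took part in the activity.
    "agent": [{"who": {"reference": "Practitioner/example-practitioner"}}],
    # What the activity drew on.
    "entity": [
        {"role": "source", "what": {"reference": "DocumentReference/example-report"}}
    ],
}


def targets(resource):
    """List the references of the resources a Provenance record points at."""
    return [t["reference"] for t in resource.get("target", [])]


print(json.dumps(provenance, indent=2))
```

The parallel with W3C PROV is visible in the shape of the record: `target` and `entity` play the role of PROV entities, and `agent` that of PROV agents.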

## The Common Provenance Model

The [Common Provenance Model](https://doi.org/10.1038/s41597-022-01537-6) (CPM) is an extension of W3C PROV that aims to provide support for the integration of provenance information from heterogeneous environments. In particular, it provides guidelines for the representation of domain-independent provenance information (provenance _backbone_), to which domain-specific provenance information can be attached in a prescribed way.

The CPM forms a conceptual foundation for the ISO standard series _ISO 23494 Provenance information model for biological specimen and data_. The ISO standard is still in an early phase of its development.

## RO-Crate

{% tool "research-object-crate" %} is a lightweight implementation of a _FAIR Digital Object_, which is able to pack data together with its metadata into a _Research Object_. It is based on Linked Data standards including {% tool "schema-org" %} and [JSON-LD](https://json-ld.org/), but can be written and consumed as regular JSON.

The [RO-Crate specifications](https://www.researchobject.org/ro-crate/specification.html) can be used to form different [RO-Crate profiles](https://www.researchobject.org/ro-crate/profiles.html), which are suitable for various domains and use cases. While the base specifications already contain some [guidelines on representing the provenance of data entities](https://www.researchobject.org/ro-crate/1.1/provenance.html#software-used-to-create-files) included in the crate, some contexts require a more detailed description to enhance traceability and reproducibility. To meet this demand, several provenance-oriented RO-Crate profiles are being developed:

* The [Workflow Run RO-Crate working group](https://www.researchobject.org/workflow-run-crate/) is developing a collection of [profiles to describe the execution of computational workflows](https://www.researchobject.org/workflow-run-crate/profiles/). The profiles define provenance descriptions at different granularity levels, from "black box" (only workflow-level inputs, outputs and parameters are considered) to a step-by-step rundown.

* The CPM team, with the help of the RO-Crate community, is developing an RO-Crate profile for representing CPM-compliant provenance and meta-provenance in an RO-Crate.

Support for RO-Crate provenance reporting is being added, or is planned, for several workflow engines, including {% tool "galaxy" %}, {% tool "common-workflow-language" %}, {% tool "snakemake" %}, {% tool "streamflow" %}, {% tool "sapporo-wes" %}, {% tool "compss" %}, and {% tool "wfexs" %}.
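
The base-specification approach to provenance mentioned above (a `CreateAction` that links a piece of software to the file it produced) can be sketched as follows. This is a trimmed illustration, not a complete or validated crate: it omits properties the specification requires on the root dataset, such as a license, and the file name, tool name, and `created_with` helper are hypothetical.

```python
import json

# Trimmed ro-crate-metadata.json content; entity names are placeholders.
crate = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {
            "@id": "./",
            "@type": "Dataset",
            "name": "Example analysis crate",
            "hasPart": [{"@id": "results.csv"}],
        },
        {"@id": "results.csv", "@type": "File", "name": "Analysis results"},
        {
            "@id": "#analysis-tool",
            "@type": "SoftwareApplication",
            "name": "Hypothetical analysis tool",
        },
        {
            # The provenance statement: this tool created that file.
            "@id": "#run-1",
            "@type": "CreateAction",
            "instrument": {"@id": "#analysis-tool"},
            "result": {"@id": "results.csv"},
        },
    ],
}


def created_with(crate, file_id):
    """Return the @id of each instrument whose CreateAction produced `file_id`."""
    return [
        entity["instrument"]["@id"]
        for entity in crate["@graph"]
        if entity.get("@type") == "CreateAction"
        and entity.get("result", {}).get("@id") == file_id
    ]


print(json.dumps(crate, indent=2))
```

Because the metadata is ordinary JSON, a consumer can answer "what produced this file?" with a simple lookup such as `created_with(crate, "results.csv")`, without any Linked Data tooling.
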
2 changes: 1 addition & 1 deletion provenance/index.md
@@ -3,7 +3,7 @@ title: Provenance
toc: false
---

Provenance is information documenting the history of an object, such as a dataset or a sample. For a general description and approaches applicable to all domains, see the general provenance page. Once familiar with the general description, you can proceed to the domain-specific pages. Reading the general page first is not required, but it is recommended.
Provenance refers to the practice of meticulously tracking data, as well as associated processing and analysis steps, to ensure the integrity and reproducibility of research findings. This involves documenting the origins of data, all transformations and manipulations it undergoes, and the analytical methods applied. By maintaining detailed records of these elements, researchers can provide a transparent path from raw data to final results, facilitating the validation and verification of the data by other researchers. This is critical in infectious disease research, where data accuracy and reliability are essential for effective disease monitoring, outbreak response, and public health decision-making.


{% include section-navigation-tiles.html type="provenance" except="index.md" %}
11 changes: 9 additions & 2 deletions provenance/pathogen-characterisation.md
@@ -1,7 +1,7 @@
---
title: Pathogen characterisation
description: Tracking data and analysis steps.
contributors: [Rudolf Wittner]
contributors: []
no_robots: true
search_exclude: true
sitemap: false
@@ -17,4 +17,11 @@ training:
# More information on how to fill in this metadata section can be found here https://www.infectious-diseases-toolkit.org/contribute/page-metadata
---

Provenance information for pathogen characterization is described in the context of quality control in the [Pathogen Characterization Quality Control page](/quality-control/pathogen-characterisation).
**We are still working on the content for this page.** If you are interested in adding to the page, then:

[Feel free to contribute](/contribute/){: class="btn btn-primary btn-lg rounded-pill"}

This is a community-driven website, so contributions are welcome! You will, of course, be listed as a contributor on the page.

New content is announced on the [home page](/) and [news page](/about/news), so please check for updates there. You can also watch for changes on this page by using a free service like [Visual Ping](https://visualping.io/) or [Distill Web Monitor](https://distill.io/), or by using a [browser add-on](https://chrome.google.com/webstore/detail/distill-web-monitor/inlikjemeeknofckkjolnjbpehgadgge?hl=en).
