docs: Adding Tutorial and Updating Site for AWS Open Data Requirements (#909)

1. Added tutorial from Utz on predicting sample boundaries
2. Updated the Data Organization Page with metadata to Portal API
mapping
3. Updated Quick Start with an overview of API functions and additional
examples
4. Added Tutorials homepage

Preview the site here:
https://dgmccart.github.io/cryoet-data-portal/index.html

---------

Co-authored-by: uermel <[email protected]>
dgmccart and uermel authored Jul 23, 2024
1 parent 9a06326 commit 68c8c7f
Showing 17 changed files with 837 additions and 15 deletions.
112 changes: 109 additions & 3 deletions docs/cryoet_data_portal_docsite_data.md

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/cryoet_data_portal_docsite_landing.md
@@ -13,7 +13,7 @@ We welcome feedback from the community on the data structure, design and functio

- [Installation](https://chanzuckerberg.github.io/cryoet-data-portal/cryoet_data_portal_docsite_quick_start.html)
- [Python Client API Reference](https://chanzuckerberg.github.io/cryoet-data-portal/python-api.html)
- [Example Code Snippets for Common Tasks](https://chanzuckerberg.github.io/cryoet-data-portal/cryoet_data_portal_docsite_examples.html)
- [Tutorials](./tutorials.md)
- [napari Plugin Documentation](https://chanzuckerberg.github.io/cryoet-data-portal/cryoet_data_portal_docsite_napari.html)

## Amazon Web Services S3 Bucket Info
78 changes: 69 additions & 9 deletions docs/cryoet_data_portal_docsite_quick_start.md
@@ -1,11 +1,12 @@
# Quick start

This page provides details to start using the CryoET Data Portal.
This page provides details to help you get started using the CryoET Data Portal Client API.

**Contents**

1. [Installation](#installation).
2. [Python quick start](#python-quick-start).
1. [Installation](#installation)
2. [API Methods Overview](#api-methods-overview)
3. [Example Code Snippets](#examples)

## Installation

@@ -18,7 +19,7 @@ The CryoET Data Portal Client requires a Linux or MacOS system with:
- Recommended: >5 Mbps internet connection.
- Recommended: for increased performance, use the API through an AWS-EC2 instance from the region `us-west-2`. The CryoET Portal data are hosted in an AWS-S3 bucket in that region.

### Python
### Install in a Virtual Environment

(Optional) In your working directory, make and activate a virtual environment or conda environment. For example:

@@ -33,13 +34,56 @@ Install the latest `cryoet_data_portal` package via pip:
pip install -U cryoet-data-portal
```

## Python quick start
## API Methods Overview

Below are 3 examples of common operations you can do with the client. Check out the [examples page](https://chanzuckerberg.github.io/cryoet-data-portal/cryoet_data_portal_docsite_examples.html) for more code snippets.
The Portal API has methods for searching and downloading data. **Every class** has `find` and `get_by_id` methods for selecting data, and most classes have `download...` methods for downloading the data. The table below lists the download methods of each API class.

### Browse data in the portal
| **Class** | **Download Methods** |
|-------------------------|--------------------------------------------------------------------------------------------------------|
| [Dataset](./python-api.rst#dataset)| `download_everything` |
| [DatasetAuthor](./python-api.rst#datasetauthor)| Not applicable as this class doesn't contain data files|
| [DatasetFunding](./python-api.rst#datasetfunding)| Not applicable as this class doesn't contain data files|
| [Run](./python-api.rst#run)| `download_everything` |
| [TomogramVoxelSpacing](./python-api.rst#tomogramvoxelspacing)| `download_everything` |
| [Tomogram](./python-api.rst#tomogram)| `download_all_annotations`, `download_mrcfile`, `download_omezarr` |
| [TomogramAuthor](./python-api.rst#tomogramauthor)| Not applicable as this class doesn't contain data files |
| [Annotation](./python-api.rst#annotation)| `download` |
| [AnnotationFile](./python-api.rst#annotationfile)| None, use the Annotation or Tomogram class to download annotations |
| [AnnotationAuthor](./python-api.rst#annotationauthor)| Not applicable as this class doesn't contain data files |
| [TiltSeries](./python-api.rst#tiltseries)| `download_alignment_file`, `download_angle_list`, `download_collection_metadata`, `download_mrcfile`, `download_omezarr` |

The following iterates over all datasets in the portal, then all runs per dataset, then all tomograms per run
The `find` method selects data based on user-chosen queries. A query can combine Python comparison operators (`==`, `!=`, `>`, `>=`, `<`, `<=`) or the method operators `like`, `ilike`, and `_in` with string or numeric values. The method operators are defined in the table below:

| **Method Operator** | **Definition** |
|---------------------|----------------------------------------------------------------------------------------------|
| like | partial match, with the `%` character being a wildcard |
| ilike | case-insensitive partial match, with the `%` character being a wildcard |
| _in | accepts a list of values that are acceptable matches |
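
To make the semantics of the three method operators in the table above concrete, here is a small pure-Python sketch that mimics them on plain strings. It does not call the Portal API. The helper names `like`, `ilike`, and `is_in` are hypothetical and exist only for this illustration, and the sketch assumes SQL-style matching, where `%` stands for any run of characters and a pattern without `%` must match the whole value.

```python
import re


def _pattern_to_regex(pattern: str, ignore_case: bool) -> re.Pattern:
    """Translate a pattern that uses `%` as a wildcard into a regex."""
    # Escape each literal chunk, then join the chunks with ".*" per "%".
    body = ".*".join(re.escape(chunk) for chunk in pattern.split("%"))
    flags = re.DOTALL | (re.IGNORECASE if ignore_case else 0)
    return re.compile(body, flags)


def like(value: str, pattern: str) -> bool:
    """Case-sensitive match (hypothetical stand-in for `like`)."""
    return _pattern_to_regex(pattern, ignore_case=False).fullmatch(value) is not None


def ilike(value: str, pattern: str) -> bool:
    """Case-insensitive match (hypothetical stand-in for `ilike`)."""
    return _pattern_to_regex(pattern, ignore_case=True).fullmatch(value) is not None


def is_in(value, accepted) -> bool:
    """True if value is one of the accepted values (stand-in for `_in`)."""
    return value in accepted


names = ["membrane", "Plasma Membrane", "ribosome"]

# Case-sensitive: "Plasma Membrane" is excluded because of the capital "M".
print([n for n in names if like(n, "%membrane%")])   # ['membrane']

# Case-insensitive: both membrane entries match.
print([n for n in names if ilike(n, "%membrane%")])  # ['membrane', 'Plasma Membrane']

# List membership.
print(is_in("ribosome", ["ribosome", "membrane"]))   # True
```

In the client itself these operators are applied to class fields inside the query list passed to `find`, rather than to plain strings as in this sketch.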

The general format of using the `find` method is as follows:

```
data_of_interest = SomeClass.find(client, queries)  # queries: a list of query expressions
```

The `get_by_id` method selects data using the ID shown on the Portal. For example, to select [Dataset 10005](https://cryoetdataportal.czscience.com/datasets/10005) and download it into your current directory, use this snippet:

```
from cryoet_data_portal import Client, Dataset
client = Client()
data_10005 = Dataset.get_by_id(client, 10005)
data_10005.download_everything()
```

## Examples

Below are 3 examples of common operations you can do with the API. Check out the [examples page](./cryoet_data_portal_docsite_examples.md) for more code snippets or the [tutorials page](./tutorials.md) for longer examples.

### Browse all data in the portal

To illustrate the relationships among the classes in the Portal, below is a loop that iterates over all datasets in the portal, then all runs per dataset, then all tomograms per run and outputs the name of each object.

:::{attention}
This loop is impractical! It iterates over all data in the Portal. It is simply for demonstrative purposes and should not be included in efficient code.
:::

```python
from cryoet_data_portal import Client, Dataset
@@ -59,7 +103,7 @@ for dataset in Dataset.find(client):

```

And outputs the name of each object:
The output lists the object names and looks something like:

```
Dataset: S. pombe cells with defocus
@@ -69,6 +113,22 @@ Dataset: S. pombe cells with defocus
...
```

### Find all datasets containing membrane annotations

The example below uses the `find` method with a chained query expression to select datasets that have membrane annotations, then prints the IDs of those datasets.

```
import cryoet_data_portal as portal
# Instantiate a client, using the data portal GraphQL API by default
client = portal.Client()
# Use the find method to select datasets that contain membrane annotations
datasets = portal.Dataset.find(client, [portal.Dataset.runs.tomogram_voxel_spacings.annotations.object_name.ilike("%membrane%")])
for d in datasets:
    print(d.id)
```

### Find all tomograms for a certain organism and download preview-sized MRC files:

The following iterates over all tomograms related to a specific organism and downloads each tomogram in MRC format.
Binary file added docs/figures/chimx_boundary.png
Binary file added docs/figures/final.png
Binary file added docs/figures/mesh_fit.png
Binary file added docs/figures/prediction_fit.png
Binary file added docs/figures/tomo_side_dark.png
Binary file added docs/figures/tomo_side_light.png
Binary file added docs/figures/tomo_top_both.png
Binary file added docs/figures/top_bottom_dark.png
Binary file added docs/figures/top_bottom_light.png
Binary file added docs/figures/valid_area_dark.png
Binary file added docs/figures/valid_area_light.png
6 changes: 4 additions & 2 deletions docs/index.rst
@@ -2,13 +2,15 @@
:parser: myst_parser.sphinx_

.. toctree::
:maxdepth: 1
:maxdepth: 2
:hidden:

cryoet_data_portal_docsite_quick_start.md
python-api
cryoet_data_portal_docsite_data.md
cryoet_data_portal_docsite_napari.md
cryoet_data_portal_docsite_examples.md
tutorials.md
tutorial_sample_boundaries.md
cryoet_data_portal_docsite_examples.md
cryoet_data_portal_docsite_aws.md
cryoet_data_portal_docsite_faq.md