diff --git a/.github/ISSUE_TEMPLATE/bug.md b/.github/ISSUE_TEMPLATE/bug.md index 2d9f3017b..ebecce6e2 100644 --- a/.github/ISSUE_TEMPLATE/bug.md +++ b/.github/ISSUE_TEMPLATE/bug.md @@ -24,9 +24,9 @@ Provide a description of your system and the software versions. - Machine - OS -* Software versions: - - Python user: `pip list` - - R user: `sessionInfo()` +- Software versions: + - Python user: `pip list` + - R user: `sessionInfo()` ## Additional context diff --git a/.github/ISSUE_TEMPLATE/feature-request.md b/.github/ISSUE_TEMPLATE/feature-request.md index 466265de7..bd8f89ed9 100644 --- a/.github/ISSUE_TEMPLATE/feature-request.md +++ b/.github/ISSUE_TEMPLATE/feature-request.md @@ -24,6 +24,6 @@ _e.g. I found and ad-hoc solution but incorporating it into the code base adds s Provide any alternate or temporary solutions you have considered or used. -## Ideal behavior +## Ideal behavior Provide a brief description of the behavior you'd expect. diff --git a/.markdownlint.yaml b/.markdownlint.yaml new file mode 100644 index 000000000..5007c3dad --- /dev/null +++ b/.markdownlint.yaml @@ -0,0 +1,23 @@ +# Configuration for markdownlint-cli, used in pre-commit + +# Default state for all rules +default: true + +# MD013/line-length - Line length increased to a large number +MD013: + line_length: 1000 + +# MD024/no-duplicate-heading/no-duplicate-header - Multiple headings with the same content +MD024: + allow_different_nesting: true + +# MD026/no-trailing-punctuation - Trailing punctuation in heading +MD026: + # Punctuation characters + punctuation: ".,;。,;:!" 
+ +# MD033/no-inline-html - disable, allowing HTML +MD033: false + +# MD041/first-line-heading/first-line-h1 - First line in a file should be a top-level heading +MD041: false diff --git a/.markdownlintignore b/.markdownlintignore new file mode 100644 index 000000000..0d9b59b68 --- /dev/null +++ b/.markdownlintignore @@ -0,0 +1 @@ +SECURITY.md diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index c3afd88a6..6bb8cbe85 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -72,3 +72,8 @@ repos: hooks: - id: nbqa-black files: ^api/python/notebooks + + - repo: https://github.com/igorshubovych/markdownlint-cli + rev: v0.35.0 + hooks: + - id: markdownlint diff --git a/README.md b/README.md index 31aeee76c..5013c1115 100644 --- a/README.md +++ b/README.md @@ -2,14 +2,13 @@ # CZ CELLxGENE Discover Census - The Census of [CZ CELLxGENE Discover](https://cellxgene.cziscience.com/) is a free-to-use service (API + Data) that allows for querying its single-cell data corpus at low-latency directly into Python or R. To learn more and start using the Census please go to the main [**Census site**](https://chanzuckerberg.github.io/cellxgene-census/). ## Issues -- Bugs: please submit a [github issue](https://github.com/chanzuckerberg/cellxgene-census/issues). +- Bugs: please submit a [github issue](https://github.com/chanzuckerberg/cellxgene-census/issues). - Security issues: if you believe you have found a security issue, in lieu of filing an issue please responsibly disclose it by contacting . ## Reuse @@ -19,4 +18,3 @@ The contents of this Github repository are freely available for reuse under the ## Code of Conduct This project adheres to the Contributor Covenant [code of conduct](https://github.com/chanzuckerberg/.github/blob/master/CODE_OF_CONDUCT.md). By participating, you are expected to uphold this code. Please report unacceptable behavior to . 
- diff --git a/api/python/cellxgene_census/README.md b/api/python/cellxgene_census/README.md index 7ddfa933d..ec30d84b1 100644 --- a/api/python/cellxgene_census/README.md +++ b/api/python/cellxgene_census/README.md @@ -1,9 +1,9 @@ # CZ CELLxGENE Discover Census -The `cellxgene_census` package provides an API to facilitate the use of the CZ CELLxGENE Discover Census. For more information about the API and the project visit the [chanzuckerberg/cellxgene-census GitHub repo](https://github.com/chanzuckerberg/cellxgene-census/). - +The `cellxgene_census` package provides an API to facilitate the use of the CZ CELLxGENE Discover Census. For more information about the API and the project visit the [chanzuckerberg/cellxgene-census GitHub repo](https://github.com/chanzuckerberg/cellxgene-census/). ## For More Help + For more help, please file a issue on the repo, or contact us at . If you believe you have found a security issue, we would appreciate notification. Please send email to . diff --git a/api/python/cellxgene_census/release_process.md b/api/python/cellxgene_census/release_process.md index 7ea8f0656..143e63c26 100644 --- a/api/python/cellxgene_census/release_process.md +++ b/api/python/cellxgene_census/release_process.md @@ -13,9 +13,9 @@ The following approach is used to manage releases of the Python cellxgene_census While not strictly required, this process assumes you have met the following prerequisites: - You have write access to the `chanzuckerberg/cellxgene-census` repo -- You have an account on pypi.org and test.pypi.org, both with access to the cellxgene_census project. You will need to have created an API token on each account so that you can authenticate to test.pypi.org and pypi.org accounts when using `twine`. Usually this means adding these tokens to your `~/.pypirc` file. See https://pypi.org/help/#apitoken for more information. +- You have an account on pypi.org and test.pypi.org, both with access to the cellxgene_census project. 
You will need to have created an API token on each account so that you can authenticate to test.pypi.org and pypi.org accounts when using `twine`. Usually this means adding these tokens to your `~/.pypirc` file. See <https://pypi.org/help/#apitoken> for more information. - You have the Github CLI tool (`gh`) installed. See [documentation](https://cli.github.com/). -- You have the `pipx` CLI tool installed. See [documentation](https://pypa.github.io/pipx/). +- You have the `pipx` CLI tool installed. See [documentation](https://pypa.github.io/pipx/). ## Step 1: Building the package assets @@ -28,14 +28,16 @@ Unless you are revising and testing the build process itself, there is no need t Any pre-built asset on Github can be installed and tested from the Github URL. For example: 1. Identify the GH workflow run ID that contains the asset you wish to test. A simple way to do this is: + ```shell - $ gh run list + gh run list ``` + Alternatively, you can use the "Actions" tab in the GitHub web UI. 2. Download the build artifact.zip from GitHub, using the GH Action run ID associated with the `build` action for your commit OR utilizing the web UI: ```shell - $ gh run download + gh run download ``` If you download using the browser, unzip into a temp directory, e.g., @@ -48,9 +50,10 @@ Any pre-built asset on Github can be installed and tested from the Github URL. F ``` 3. Install and test the downloaded build, e.g., + ```shell - $ pip uninstall cellxgene-census - $ pip install ./artifact/cellxgene_census-*-any.whl + pip uninstall cellxgene-census + pip install ./artifact/cellxgene_census-*-any.whl ``` To test a release candidate: @@ -71,17 +74,21 @@ To create a release, perform the following: 1. Identify both the (tested & validated) commit and semver for the release. 2. Tag the commit with the release version (_including_ a `v` prefix) and push the tag to origin. **Important**: use an annotated tag, e.g., `git tag -a v1.9.4 -m 'Release 1.9.4'`.
For example (please replace with your version, _including_ a `v`, e.g. `v1.9.4`): + ```shell - $ git tag -a -m 'Release ' - $ git push origin + git tag -a -m 'Release ' + git push origin ``` + 3. Trigger a build for this tag by manually triggering the `py-build.yml` workflow. For example: + ```shell - $ gh workflow run py-build.yml --ref + gh workflow run py-build.yml --ref ``` + 4. When the workflow completes, make note of the run ID (e.g., using `gh run list`). 5. Optional, _but recommended_: download the asset from the build workflow and validate it. -6. Create and publish a GitHub Release [here](https://github.com/chanzuckerberg/cellxgene-census/releases/new). Set the release title to the ``. Select `Set as the latest release`. Use the `Generate Release Notes` button to auto-populate the summary with a changelog. It is reasonable to remove any R-specific or builder-specific entries. Add a prelude to the summary, noting any major new features or API changes. +6. Create and publish a GitHub Release [here](https://github.com/chanzuckerberg/cellxgene-census/releases/new). Set the release title to the ``. Select `Set as the latest release`. Use the `Generate Release Notes` button to auto-populate the summary with a changelog. It is reasonable to remove any R-specific or builder-specific entries. Add a prelude to the summary, noting any major new features or API changes. ## Step 4: Publish assets to PyPi @@ -89,28 +96,36 @@ To publish built release assets to PyPi (_note_: this will require your pypi/tes 1. Delete any existing release builds you may have accumulated in the past: `rm ./artifact/*`. 2. Download the assets built for your release commit, using the same method as step 2 above, e.g., + ```shell - $ gh run download + gh run download ``` + 3. Optional: upload to TestPyPi (this assumes the downloaded assets are in ./artifact/). 
```shell pipx run twine upload --repository testpypi ./artifact/* ``` - Following the upload, confirm correct presentation on the project page and ability to download install from TestPyPi. + Following the upload, confirm correct presentation on the project page and ability to download and install from TestPyPi. 4. To test installation from TestPyPi: + ```shell pip install --no-cache-dir -i https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple cellxgene-census python -c "import cellxgene_census; print(cellxgene_census.__version__)" ``` + Note that the `--extra-index-url` option ensures that any transitive package dependencies that are _not_ available on `test.pypi.org` can be satisfied by installing them from the production `pypi.org`. You can find more information [here](https://packaging.python.org/en/latest/guides/using-testpypi/). 5. Use twine to upload to PyPi (this assumes the downloaded assets are in ./artifact/), e.g., + ```shell pipx run twine upload ./artifact/* + ``` + 6. Test the installation from PyPi, as a final sanity check. Note that it may take a minute for the new release to be visible on pypi.org: + ```shell pip install --no-cache-dir cellxgene-census python -c "import cellxgene_census; print(cellxgene_census.__version__)" diff --git a/api/python/cellxgene_census/tests/README.md b/api/python/cellxgene_census/tests/README.md index 7d3f640c7..99dce0501 100644 --- a/api/python/cellxgene_census/tests/README.md +++ b/api/python/cellxgene_census/tests/README.md @@ -36,7 +36,7 @@ You can also combine them, e.g., > pytest -m 'not live_corpus' --expensive --experimental -# Acceptance (expensive) tests +## Acceptance (expensive) tests These tests are periodically run, and are not part of CI due to their overhead. 
@@ -52,13 +52,13 @@ When run, please record the results in this file (below) and commit the change t - any run notes - full output of: `pytest -v --durations=0 --expensive ./api/python/cellxgene_census/tests/` -## 2023-07-26 +### 2023-07-26 - Host: EC2 instance type: `r6id.32xlarge`, all nvme mounted as swap. - Uname: Linux 5.19.0-1028-aws #29~22.04.1-Ubuntu SMP Tue Jun 20 19:12:11 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux - Python & census versions: -``` +```python >>> import cellxgene_census, tiledbsoma >>> tiledbsoma.show_package_versions() tiledbsoma.__version__ 1.2.7 @@ -71,7 +71,7 @@ OS version Linux 5.19.0-1028-aws **Pytest output:** -``` +```text ============================= test session starts ============================== platform linux -- Python 3.10.6, pytest-7.1.3, pluggy-1.0.0 -- /home/ubuntu/venv/bin/python cachedir: .pytest_cache @@ -237,12 +237,13 @@ api/python/cellxgene_census/tests/experimental/pp/test_stats.py::test_mean_varia =============== 72 passed, 202 deselected in 14584.03s (4:03:04) =============== ``` -## 2023-06-23 +### 2023-06-23 - Host: EC2 instance type: `r6id.32xlarge`, all nvme mounted as swap. 
- Uname: Linux 5.19.0-1025-aws #26~22.04.1-Ubuntu SMP Mon Apr 24 01:58:15 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux - Python & census versions: -``` + +```python >>> import cellxgene_census, tiledbsoma >>> tiledbsoma.show_package_versions() tiledbsoma.__version__ 1.2.5 @@ -250,7 +251,7 @@ TileDB-Py tiledb.version() (0, 21, 5) TileDB core version 2.15.4 libtiledbsoma version() libtiledb=2.15.2 python version 3.10.6.final.0 -OS version +OS version >>> cellxgene_census.__version__ '1.2.1' >>> cellxgene_census.get_census_version_description('latest') @@ -259,7 +260,7 @@ OS version **Pytest output:** -``` +```text ============================= test session starts ============================== platform linux -- Python 3.10.6, pytest-7.4.0, pluggy-1.2.0 -- /home/ubuntu/venv/bin/python3 cachedir: .pytest_cache @@ -420,12 +421,13 @@ test_util.py::test_uri_join PASSED [100%] =============== 69 passed, 111 deselected in 15040.59s (4:10:40) =============== ``` -## 2023-05-16 +### 2023-05-16 - Host: EC2 instance type: `r6id.32xlarge`, all nvme mounted as swap. - Uname: Linux 5.19.0-1022-aws #23~22.04.1-Ubuntu SMP Fri Mar 17 15:38:24 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux - Python & census versions: -``` + +```python >>> import cellxgene_census, tiledbsoma >>> tiledbsoma.show_package_versions() tiledbsoma.__version__ 1.2.3 @@ -442,7 +444,7 @@ OS version Linux 5.19.0-1022-aws **Pytest output:** -``` +```text ============================= test session starts ============================== platform linux -- Python 3.10.6, pytest-7.3.1, pluggy-1.0.0 -- /home/ubuntu/venv-cellxgene-census/bin/python3 cachedir: .pytest_cache @@ -565,15 +567,15 @@ tests/test_util.py::test_uri_join PASSED [100%] ======================= 51 passed in 10696.20s (2:58:16) ======================= ``` -## 2023-03-29 +### 2023-03-29 -**Config** +**Config:** - Host: EC2 instance type: `r6id.32xlarge`, all nvme mounted as swap. 
- Uname: Linux bruce.aegea 5.15.0-1033-aws #37~20.04.1-Ubuntu SMP Fri Mar 17 11:39:30 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux - Python & census versions: -``` +```python In [1]: import cell_census, tiledbsoma In [2]: tiledbsoma.show_package_versions() @@ -585,7 +587,7 @@ python version 3.9.16.final.0 OS version Linux 5.15.0-1033-aws In [3]: cell_census.get_census_version_description('latest') -Out[3]: +Out[3]: {'release_date': None, 'release_build': '2023-03-16', 'soma': {'uri': 's3://cellxgene-data-public/cell-census/2023-03-16/soma/', @@ -603,14 +605,14 @@ The test `test_acceptance.py::test_get_anndata[None-homo_sapiens]` manifest a la **Pytest output:** -``` +```text $ pytest -v --durations=0 --expensive ./api/python/cell_census/tests/ ==================================================== test session starts ===================================================== platform linux -- Python 3.9.16, pytest-7.2.2, pluggy-1.0.0 -- /home/bruce/cell-census/venv/bin/python cachedir: .pytest_cache rootdir: /home/bruce/cell-census/api/python/cell_census, configfile: pyproject.toml plugins: requests-mock-1.10.0, anyio-3.6.2 -collected 45 items +collected 45 items api/python/cell_census/tests/test_acceptance.py::test_load_axes[homo_sapiens] PASSED [ 2%] api/python/cell_census/tests/test_acceptance.py::test_load_axes[mus_musculus] PASSED [ 4%] diff --git a/api/python/notebooks/README.md b/api/python/notebooks/README.md index f97d73fdf..2b1c683ef 100644 --- a/api/python/notebooks/README.md +++ b/api/python/notebooks/README.md @@ -8,6 +8,7 @@ Demonstration notebooks for the CZ CELLxGENE Discover Census. There are two kind ## Dependencies You must be on a Linux or MacOS system, with the following installed: + * Python 3.8 to 3.11 * Jupyter or some other means of running notebooks (e.g., vscode) @@ -19,25 +20,32 @@ not been tested). I also recommend you use a `d` instance type, and mount all of the NVME drives as swap, as it will keep you from running out of RAM. 
- ## Set up Python environment + 1. (optional, but highly recommended) In your working directory, make and activate a virtual environment. For example: -```shell - $ python -m venv ./venv - $ source ./venv/bin/activate -``` + + ```shell + python -m venv ./venv + source ./venv/bin/activate + ``` + 2. Install the required dependencies: -```shell - $ pip install -U -r cellxgene-census/api/python/notebooks/requirements.txt -``` + + ```shell + pip install -U -r cellxgene-census/api/python/notebooks/requirements.txt + ``` + ## Verify your installation + Check that your installation works - this make take a few seconds, as it loads metadata from S3: + ```shell $ python -c 'import cellxgene_census; print(cellxgene_census.open_soma().soma_type)' SOMACollection ``` ## Run notebooks + Run notebooks, which you can find in the `cellxgene-census/api/python/notebooks` directory. ## For more help diff --git a/api/r/cellxgene.census/README.md b/api/r/cellxgene.census/README.md index 99dd70c05..ebf102e1d 100644 --- a/api/r/cellxgene.census/README.md +++ b/api/r/cellxgene.census/README.md @@ -4,9 +4,9 @@ -This is the documentation for the R package `cellxgene.census` which is part of CZ CELLxGENE Discover Census. For full details on Census data and capabilities please go to the [main Census site](https://chanzuckerberg.github.io/cellxgene-census/). +This is the documentation for the R package `cellxgene.census` which is part of CZ CELLxGENE Discover Census. For full details on Census data and capabilities please go to the [main Census site](https://chanzuckerberg.github.io/cellxgene-census/). -`cellxgene.census` provides an API to efficiently access the cloud-hosted Census single-cell data from R. In just a few seconds users can access any slice of Census data using cell or gene filters across hundreds of single-cell datasets. +`cellxgene.census` provides an API to efficiently access the cloud-hosted Census single-cell data from R. 
In just a few seconds users can access any slice of Census data using cell or gene filters across hundreds of single-cell datasets. Census data can be fetched in an iterative fashion for bigger-than-memory slices of data, or quickly exported to basic R structures, as well as `Seurat` or `SingleCellExperiment` objects for downstream analysis. @@ -23,7 +23,7 @@ Then in an R session install `cellxgene.census` from R-Universe. ```r install.packages( "cellxgene.census", - repos=c('https://chanzuckerberg.r-universe.dev', 'https://cloud.r-project.org') + repos=c('https://chanzuckerberg.r-universe.dev', 'https://cloud.r-project.org') ) ``` @@ -36,11 +36,10 @@ install.packages("Seurat") # SingleCellExperiment if (!require("BiocManager", quietly = TRUE)) install.packages("BiocManager") - + BiocManager::install("SingleCellExperiment") ``` - ## Usage Check out the vignettes in the "Articles" section of the navigation bar on this site. We highly recommend the following vignettes as a starting point: @@ -50,7 +49,6 @@ Check out the vignettes in the "Articles" section of the navigation bar on this You can also check out out the [quick start guide](https://chanzuckerberg.github.io/cellxgene-census/cellxgene_census_docsite_quick_start.html) in the main Census site. - ### Example `Seurat` and `SingleCellExperiment` query The following creates a `Seurat` object on-demand with all sympathetic neurons in Census and filtering only for the genes `ENSG00000161798`, `ENSG00000188229`. @@ -91,6 +89,6 @@ sce_obj <- get_single_cell_experiment( ## For More Help -For more help, please go visit the [main Census site](https://chanzuckerberg.github.io/cellxgene-census/). +For more help, please go visit the [main Census site](https://chanzuckerberg.github.io/cellxgene-census/). If you believe you have found a security issue, we would appreciate notification. Please send an email to . 
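An aside on the release-tagging step in `release_process.md` earlier in this patch: the `v`-prefixed semver convention it describes (e.g. `v1.9.4`) is easy to sanity-check before pushing a tag. A minimal stdlib-Python sketch; `is_release_tag` and the regex are illustrative only and are not part of the repo:

```python
import re

# Matches a release tag of the form v<MAJOR>.<MINOR>.<PATCH>, e.g. "v1.9.4".
# Hypothetical helper: the repo documents the convention in prose only.
RELEASE_TAG_RE = re.compile(r"v\d+\.\d+\.\d+")

def is_release_tag(tag: str) -> bool:
    """True if the tag follows the v-prefixed semver convention."""
    return RELEASE_TAG_RE.fullmatch(tag) is not None

print(is_release_tag("v1.9.4"))  # True
print(is_release_tag("1.9.4"))   # False: missing the required "v" prefix
```

A check like this could run before `git push origin`, catching tags that would later confuse the `py-build.yml` workflow trigger.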
diff --git a/api/r/cellxgene.census/tests/README.md b/api/r/cellxgene.census/tests/README.md index a433240f3..c6e638f80 100644 --- a/api/r/cellxgene.census/tests/README.md +++ b/api/r/cellxgene.census/tests/README.md @@ -2,7 +2,7 @@ This directory contains tests of the `cellxgene.census` R package API, _and_ the use of the API on the live "corpus", i.e., data in the public Census S3 bucket. The tests use the R package `tessthat`. -In addition, a set of acceptance (expensive) tests are available and `testthat` does not run them by default (see [section below](#acceptance-expensive-tests)). +In addition, a set of acceptance (expensive) tests are available and `testthat` does not run them by default (see [section below](#acceptance-expensive-tests)). Tests can be run in the usual manner. First, ensure you have `cellxgene-census` and `testthat` installed, e.g., from the top-level repo directory: @@ -14,20 +14,20 @@ library("cellxgene.census") test_dir("./api/r/cellxgene.census/tests/") ``` -# Acceptance (expensive) tests +## Acceptance (expensive) tests These tests are periodically run, and are not part of CI due to their overhead. These tests use a modified `Reporter` from `testthat` to record running times of each test in a `csv` file. To run the test execute the following command: -``` +```shell Rscript ./api/r/cellxgene.census/tests/testthat/acceptance-tests-run-script.R > stdout.txt ``` This command will result in two files: - `stdout.txt` with the test progress logs. -- `acceptance-tests-logs-[YYY]-[MM]-[DD].csv` with the running times and test outputs. +- `acceptance-tests-logs-[YYYY]-[MM]-[DD].csv` with the running times and test outputs. When run, please record the results in this file (below) and commit the change to git. 
Please include the following information: @@ -42,13 +42,13 @@ When run, please record the results in this file (below) and commit the change t - `stdout.txt` - `acceptance-tests-logs-[YYY]-[MM]-[DD].csv` -## 2023-07-15 +### 2023-07-15 - Host: EC2 instance type: `r6id.x32xlarge`, all nvme mounted as swap. - Uname: Linux ip-172-31-62-52 5.19.0-1028-aws #29~22.04.1-Ubuntu SMP Tue Jun 20 19:12:11 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux - Census version -``` +```r > cellxgene.census::get_census_version_description('latest') $release_date [1] "" @@ -75,186 +75,186 @@ $census_version [1] "latest" ``` -- R session info +- R session info -``` +```r > library("cellxgene.census"); sessionInfo() R version 4.3.0 (2023-04-21) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 22.04.2 LTS Matrix products: default -BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 +BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0 locale: - [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8 - [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8 - [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C -[10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C + [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8 + [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8 + [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C +[10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C time zone: America/Los_Angeles tzcode source: system (glibc) attached base packages: -[1] stats graphics grDevices utils datasets methods base +[1] stats graphics grDevices utils datasets methods base other attached packages: [1] cellxgene.census_0.0.0.9000 loaded via a namespace (and not attached): - [1] vctrs_0.6.3 httr_1.4.6 cli_3.6.1 - [4] tiledbsoma_0.0.0.9031 rlang_1.1.1 purrr_1.0.1 - [7] assertthat_0.2.1 data.table_1.14.8 jsonlite_1.8.5 -[10] glue_1.6.2 bit_4.0.5 triebeard_0.4.1 -[13] grid_4.3.0 RcppSpdlog_0.0.13 base64enc_0.1-3 
-[16] lifecycle_1.0.3 compiler_4.3.0 fs_1.6.2 -[19] Rcpp_1.0.10 aws.s3_0.3.21 lattice_0.21-8 -[22] digest_0.6.31 R6_2.5.1 tidyselect_1.2.0 -[25] curl_5.0.1 magrittr_2.0.3 urltools_1.7.3 -[28] Matrix_1.5-4.1 tools_4.3.0 bit64_4.0.5 -[31] aws.signature_0.6.0 spdl_0.0.5 arrow_12.0.1 -[34] xml2_1.3.4 + [1] vctrs_0.6.3 httr_1.4.6 cli_3.6.1 + [4] tiledbsoma_0.0.0.9031 rlang_1.1.1 purrr_1.0.1 + [7] assertthat_0.2.1 data.table_1.14.8 jsonlite_1.8.5 +[10] glue_1.6.2 bit_4.0.5 triebeard_0.4.1 +[13] grid_4.3.0 RcppSpdlog_0.0.13 base64enc_0.1-3 +[16] lifecycle_1.0.3 compiler_4.3.0 fs_1.6.2 +[19] Rcpp_1.0.10 aws.s3_0.3.21 lattice_0.21-8 +[22] digest_0.6.31 R6_2.5.1 tidyselect_1.2.0 +[25] curl_5.0.1 magrittr_2.0.3 urltools_1.7.3 +[28] Matrix_1.5-4.1 tools_4.3.0 bit64_4.0.5 +[31] aws.signature_0.6.0 spdl_0.0.5 arrow_12.0.1 +[34] xml2_1.3.4 ``` - `stdout.txt` -``` -START TEST ( 2023-07-14 11:53:28.178227 ): test_load_obs_human -END TEST ( 2023-07-14 11:53:35.399914 ): test_load_obs_human -START TEST ( 2023-07-14 11:53:35.40381 ): test_load_var_human -END TEST ( 2023-07-14 11:53:37.241811 ): test_load_var_human -START TEST ( 2023-07-14 11:53:37.243805 ): test_load_obs_mouse -END TEST ( 2023-07-14 11:53:39.325767 ): test_load_obs_mouse -START TEST ( 2023-07-14 11:53:39.327926 ): test_load_var_mouse -END TEST ( 2023-07-14 11:53:40.895862 ): test_load_var_mouse -START TEST ( 2023-07-14 11:53:40.899131 ): test_incremental_read_obs_human -END TEST ( 2023-07-14 11:53:46.524585 ): test_incremental_read_obs_human -START TEST ( 2023-07-14 11:53:46.526778 ): test_incremental_read_var_human -END TEST ( 2023-07-14 11:53:47.874486 ): test_incremental_read_var_human -START TEST ( 2023-07-14 11:53:47.877107 ): test_incremental_read_obs_mouse -END TEST ( 2023-07-14 11:53:50.236715 ): test_incremental_read_obs_mouse -START TEST ( 2023-07-14 11:53:50.239098 ): test_incremental_read_var_mouse -END TEST ( 2023-07-14 11:53:51.736903 ): test_incremental_read_var_mouse -START TEST ( 2023-07-14 
11:53:51.739117 ): test_incremental_read_X_human -END TEST ( 2023-07-14 12:14:34.869316 ): test_incremental_read_X_human -START TEST ( 2023-07-14 12:14:34.871589 ): test_incremental_read_X_human-large-buffer-size -END TEST ( 2023-07-14 12:49:30.484771 ): test_incremental_read_X_human-large-buffer-size -START TEST ( 2023-07-14 12:49:30.570718 ): test_incremental_read_X_mouse -END TEST ( 2023-07-14 12:54:44.455472 ): test_incremental_read_X_mouse -START TEST ( 2023-07-14 12:54:44.466457 ): test_incremental_read_X_mouse-large-buffer-size -END TEST ( 2023-07-14 12:56:15.48859 ): test_incremental_read_X_mouse-large-buffer-size -START TEST ( 2023-07-14 12:56:15.491021 ): test_incremental_query_human_brain -END TEST ( 2023-07-14 12:57:17.430526 ): test_incremental_query_human_brain -START TEST ( 2023-07-14 12:57:17.434666 ): test_incremental_query_human_aorta -END TEST ( 2023-07-14 12:57:29.836529 ): test_incremental_query_human_aorta -START TEST ( 2023-07-14 12:57:29.838435 ): test_incremental_query_mouse_brain -END TEST ( 2023-07-14 12:57:41.417177 ): test_incremental_query_mouse_brain -START TEST ( 2023-07-14 12:57:41.419112 ): test_incremental_query_mouse_aorta -END TEST ( 2023-07-14 12:57:48.16363 ): test_incremental_query_mouse_aorta -START TEST ( 2023-07-14 12:57:48.166098 ): test_seurat_small-query -END TEST ( 2023-07-14 12:58:09.744611 ): test_seurat_small-query -START TEST ( 2023-07-14 12:58:09.746436 ): test_seurat_10K-cells-human -END TEST ( 2023-07-14 12:58:19.528256 ): test_seurat_10K-cells-human -START TEST ( 2023-07-14 12:58:19.530055 ): test_seurat_100K-cells-human -END TEST ( 2023-07-14 12:58:52.22588 ): test_seurat_100K-cells-human -START TEST ( 2023-07-14 12:58:52.227741 ): test_seurat_250K-cells-human -END TEST ( 2023-07-14 13:00:12.885999 ): test_seurat_250K-cells-human -START TEST ( 2023-07-14 13:00:12.887802 ): test_seurat_500K-cells-human -END TEST ( 2023-07-14 13:03:13.64018 ): test_seurat_500K-cells-human -START TEST ( 2023-07-14 13:03:13.641989 
): test_seurat_750K-cells-human -END TEST ( 2023-07-14 13:08:09.243155 ): test_seurat_750K-cells-human -START TEST ( 2023-07-14 13:08:09.800513 ): test_seurat_1M-cells-human -END TEST ( 2023-07-14 13:14:21.120332 ): test_seurat_1M-cells-human -START TEST ( 2023-07-14 13:14:21.122229 ): test_seurat_common-tissue -END TEST ( 2023-07-14 13:18:23.092255 ): test_seurat_common-tissue -START TEST ( 2023-07-14 13:18:23.094857 ): test_seurat_common-tissue-large-buffer-size -END TEST ( 2023-07-14 13:22:31.049091 ): test_seurat_common-tissue-large-buffer-size -START TEST ( 2023-07-14 13:22:31.050861 ): test_seurat_common-cell-type -END TEST ( 2023-07-14 13:39:16.80712 ): test_seurat_common-cell-type -START TEST ( 2023-07-14 13:39:16.818948 ): test_seurat_common-cell-type-large-buffer-size -END TEST ( 2023-07-14 15:06:24.936444 ): test_seurat_common-cell-type-large-buffer-size -START TEST ( 2023-07-14 15:06:24.943206 ): test_seurat_whole-enchilada-large-buffer-size -END TEST ( 2023-07-14 15:06:24.955094 ): test_seurat_whole-enchilada-large-buffer-size -START TEST ( 2023-07-14 15:06:24.958625 ): test_sce_small-query -END TEST ( 2023-07-14 15:06:53.449669 ): test_sce_small-query -START TEST ( 2023-07-14 15:06:53.451782 ): test_sce_10K-cells-human -END TEST ( 2023-07-14 15:07:03.753756 ): test_sce_10K-cells-human -START TEST ( 2023-07-14 15:07:03.756331 ): test_sce_100K-cells-human -END TEST ( 2023-07-14 15:07:39.928058 ): test_sce_100K-cells-human -START TEST ( 2023-07-14 15:07:39.931532 ): test_sce_250K-cells-human -END TEST ( 2023-07-14 15:08:59.480538 ): test_sce_250K-cells-human -START TEST ( 2023-07-14 15:08:59.482945 ): test_sce_500K-cells-human -END TEST ( 2023-07-14 15:12:02.190109 ): test_sce_500K-cells-human -START TEST ( 2023-07-14 15:12:02.192345 ): test_sce_750K-cells-human -END TEST ( 2023-07-14 15:17:29.745159 ): test_sce_750K-cells-human -START TEST ( 2023-07-14 15:17:29.748402 ): test_sce_1M-cells-human -END TEST ( 2023-07-14 15:22:46.696071 ): 
test_sce_1M-cells-human -START TEST ( 2023-07-14 15:22:46.69859 ): test_sce_common-tissue -END TEST ( 2023-07-14 15:25:59.307055 ): test_sce_common-tissue -START TEST ( 2023-07-14 15:25:59.309585 ): test_sce_common-tissue-large-buffer-size -END TEST ( 2023-07-14 15:29:05.180097 ): test_sce_common-tissue-large-buffer-size -START TEST ( 2023-07-14 15:29:05.182871 ): test_sce_common-cell-type -END TEST ( 2023-07-14 17:16:40.55286 ): test_sce_common-cell-type -START TEST ( 2023-07-14 17:16:40.557382 ): test_sce_common-cell-type-large-buffer-size -END TEST ( 2023-07-14 19:37:35.293807 ): test_sce_common-cell-type-large-buffer-size -START TEST ( 2023-07-14 19:37:35.299182 ): test_sce_whole-enchilada-large-buffer-size -END TEST ( 2023-07-14 19:37:35.305957 ): test_sce_whole-enchilada-large-buffer-size +```text +START TEST ( 2023-07-14 11:53:28.178227 ): test_load_obs_human +END TEST ( 2023-07-14 11:53:35.399914 ): test_load_obs_human +START TEST ( 2023-07-14 11:53:35.40381 ): test_load_var_human +END TEST ( 2023-07-14 11:53:37.241811 ): test_load_var_human +START TEST ( 2023-07-14 11:53:37.243805 ): test_load_obs_mouse +END TEST ( 2023-07-14 11:53:39.325767 ): test_load_obs_mouse +START TEST ( 2023-07-14 11:53:39.327926 ): test_load_var_mouse +END TEST ( 2023-07-14 11:53:40.895862 ): test_load_var_mouse +START TEST ( 2023-07-14 11:53:40.899131 ): test_incremental_read_obs_human +END TEST ( 2023-07-14 11:53:46.524585 ): test_incremental_read_obs_human +START TEST ( 2023-07-14 11:53:46.526778 ): test_incremental_read_var_human +END TEST ( 2023-07-14 11:53:47.874486 ): test_incremental_read_var_human +START TEST ( 2023-07-14 11:53:47.877107 ): test_incremental_read_obs_mouse +END TEST ( 2023-07-14 11:53:50.236715 ): test_incremental_read_obs_mouse +START TEST ( 2023-07-14 11:53:50.239098 ): test_incremental_read_var_mouse +END TEST ( 2023-07-14 11:53:51.736903 ): test_incremental_read_var_mouse +START TEST ( 2023-07-14 11:53:51.739117 ): test_incremental_read_X_human +END 
TEST ( 2023-07-14 12:14:34.869316 ): test_incremental_read_X_human +START TEST ( 2023-07-14 12:14:34.871589 ): test_incremental_read_X_human-large-buffer-size +END TEST ( 2023-07-14 12:49:30.484771 ): test_incremental_read_X_human-large-buffer-size +START TEST ( 2023-07-14 12:49:30.570718 ): test_incremental_read_X_mouse +END TEST ( 2023-07-14 12:54:44.455472 ): test_incremental_read_X_mouse +START TEST ( 2023-07-14 12:54:44.466457 ): test_incremental_read_X_mouse-large-buffer-size +END TEST ( 2023-07-14 12:56:15.48859 ): test_incremental_read_X_mouse-large-buffer-size +START TEST ( 2023-07-14 12:56:15.491021 ): test_incremental_query_human_brain +END TEST ( 2023-07-14 12:57:17.430526 ): test_incremental_query_human_brain +START TEST ( 2023-07-14 12:57:17.434666 ): test_incremental_query_human_aorta +END TEST ( 2023-07-14 12:57:29.836529 ): test_incremental_query_human_aorta +START TEST ( 2023-07-14 12:57:29.838435 ): test_incremental_query_mouse_brain +END TEST ( 2023-07-14 12:57:41.417177 ): test_incremental_query_mouse_brain +START TEST ( 2023-07-14 12:57:41.419112 ): test_incremental_query_mouse_aorta +END TEST ( 2023-07-14 12:57:48.16363 ): test_incremental_query_mouse_aorta +START TEST ( 2023-07-14 12:57:48.166098 ): test_seurat_small-query +END TEST ( 2023-07-14 12:58:09.744611 ): test_seurat_small-query +START TEST ( 2023-07-14 12:58:09.746436 ): test_seurat_10K-cells-human +END TEST ( 2023-07-14 12:58:19.528256 ): test_seurat_10K-cells-human +START TEST ( 2023-07-14 12:58:19.530055 ): test_seurat_100K-cells-human +END TEST ( 2023-07-14 12:58:52.22588 ): test_seurat_100K-cells-human +START TEST ( 2023-07-14 12:58:52.227741 ): test_seurat_250K-cells-human +END TEST ( 2023-07-14 13:00:12.885999 ): test_seurat_250K-cells-human +START TEST ( 2023-07-14 13:00:12.887802 ): test_seurat_500K-cells-human +END TEST ( 2023-07-14 13:03:13.64018 ): test_seurat_500K-cells-human +START TEST ( 2023-07-14 13:03:13.641989 ): test_seurat_750K-cells-human +END TEST ( 
2023-07-14 13:08:09.243155 ): test_seurat_750K-cells-human +START TEST ( 2023-07-14 13:08:09.800513 ): test_seurat_1M-cells-human +END TEST ( 2023-07-14 13:14:21.120332 ): test_seurat_1M-cells-human +START TEST ( 2023-07-14 13:14:21.122229 ): test_seurat_common-tissue +END TEST ( 2023-07-14 13:18:23.092255 ): test_seurat_common-tissue +START TEST ( 2023-07-14 13:18:23.094857 ): test_seurat_common-tissue-large-buffer-size +END TEST ( 2023-07-14 13:22:31.049091 ): test_seurat_common-tissue-large-buffer-size +START TEST ( 2023-07-14 13:22:31.050861 ): test_seurat_common-cell-type +END TEST ( 2023-07-14 13:39:16.80712 ): test_seurat_common-cell-type +START TEST ( 2023-07-14 13:39:16.818948 ): test_seurat_common-cell-type-large-buffer-size +END TEST ( 2023-07-14 15:06:24.936444 ): test_seurat_common-cell-type-large-buffer-size +START TEST ( 2023-07-14 15:06:24.943206 ): test_seurat_whole-enchilada-large-buffer-size +END TEST ( 2023-07-14 15:06:24.955094 ): test_seurat_whole-enchilada-large-buffer-size +START TEST ( 2023-07-14 15:06:24.958625 ): test_sce_small-query +END TEST ( 2023-07-14 15:06:53.449669 ): test_sce_small-query +START TEST ( 2023-07-14 15:06:53.451782 ): test_sce_10K-cells-human +END TEST ( 2023-07-14 15:07:03.753756 ): test_sce_10K-cells-human +START TEST ( 2023-07-14 15:07:03.756331 ): test_sce_100K-cells-human +END TEST ( 2023-07-14 15:07:39.928058 ): test_sce_100K-cells-human +START TEST ( 2023-07-14 15:07:39.931532 ): test_sce_250K-cells-human +END TEST ( 2023-07-14 15:08:59.480538 ): test_sce_250K-cells-human +START TEST ( 2023-07-14 15:08:59.482945 ): test_sce_500K-cells-human +END TEST ( 2023-07-14 15:12:02.190109 ): test_sce_500K-cells-human +START TEST ( 2023-07-14 15:12:02.192345 ): test_sce_750K-cells-human +END TEST ( 2023-07-14 15:17:29.745159 ): test_sce_750K-cells-human +START TEST ( 2023-07-14 15:17:29.748402 ): test_sce_1M-cells-human +END TEST ( 2023-07-14 15:22:46.696071 ): test_sce_1M-cells-human +START TEST ( 2023-07-14 
15:22:46.69859 ): test_sce_common-tissue +END TEST ( 2023-07-14 15:25:59.307055 ): test_sce_common-tissue +START TEST ( 2023-07-14 15:25:59.309585 ): test_sce_common-tissue-large-buffer-size +END TEST ( 2023-07-14 15:29:05.180097 ): test_sce_common-tissue-large-buffer-size +START TEST ( 2023-07-14 15:29:05.182871 ): test_sce_common-cell-type +END TEST ( 2023-07-14 17:16:40.55286 ): test_sce_common-cell-type +START TEST ( 2023-07-14 17:16:40.557382 ): test_sce_common-cell-type-large-buffer-size +END TEST ( 2023-07-14 19:37:35.293807 ): test_sce_common-cell-type-large-buffer-size +START TEST ( 2023-07-14 19:37:35.299182 ): test_sce_whole-enchilada-large-buffer-size +END TEST ( 2023-07-14 19:37:35.305957 ): test_sce_whole-enchilada-large-buffer-size ``` - `acceptance-tests-logs-2023-07-14.csv` -``` +```text test,user,system,real,test_result -test_load_obs_human,15.906,89.247,7.221,expect_true(nrow(obs_df) > 0): expectation_success: nrow(obs_df) > 0 is not TRUE -test_load_var_human,0.494,0.52300000000001,1.831,expect_true(nrow(var_df) > 0): expectation_success: nrow(var_df) > 0 is not TRUE -test_load_obs_mouse,2.221,6.60799999999999,2.081,expect_true(nrow(obs_df) > 0): expectation_success: nrow(obs_df) > 0 is not TRUE -test_load_var_mouse,0.385999999999999,0.501999999999995,1.567,expect_true(nrow(var_df) > 0): expectation_success: nrow(var_df) > 0 is not TRUE -test_incremental_read_obs_human,15.758,102.911,5.624,expect_true(table_iter_is_ok(obs_iter)): expectation_success: table_iter_is_ok(obs_iter) is not TRUE -test_incremental_read_var_human,0.390999999999998,0.533999999999992,1.346,expect_true(table_iter_is_ok(var_iter)): expectation_success: table_iter_is_ok(var_iter) is not TRUE -test_incremental_read_obs_mouse,2.95200000000001,11.719,2.359,expect_true(table_iter_is_ok(obs_iter)): expectation_success: table_iter_is_ok(obs_iter) is not TRUE -test_incremental_read_var_mouse,0.412999999999997,0.378999999999991,1.497,expect_true(table_iter_is_ok(var_iter)): 
expectation_success: table_iter_is_ok(var_iter) is not TRUE -test_incremental_read_X_human,8126.569,12080.887,1243.129,expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE -test_incremental_read_X_human-large-buffer-size,8129.084,83266.503,2095.526,expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE -test_incremental_read_X_mouse,950.918,1498.723,313.867,expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE -test_incremental_read_X_mouse-large-buffer-size,941.432000000001,1447.306,91.02,expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE -test_incremental_query_human_brain,202.976999999999,245.77900000001,61.9389999999999,expect_true(table_iter_is_ok(query$obs())): expectation_success: table_iter_is_ok(query$obs()) is not TRUE ; expect_true(table_iter_is_ok(query$var())): expectation_success: table_iter_is_ok(query$var()) is not TRUE ; expect_true(table_iter_is_ok(query$X("raw")$tables())): expectation_success: table_iter_is_ok(query$X("raw")$tables()) is not TRUE -test_incremental_query_human_aorta,15.2920000000013,87.8139999999985,12.4009999999998,expect_true(table_iter_is_ok(query$obs())): expectation_success: table_iter_is_ok(query$obs()) is not TRUE ; expect_true(table_iter_is_ok(query$var())): expectation_success: table_iter_is_ok(query$var()) is not TRUE ; expect_true(table_iter_is_ok(query$X("raw")$tables())): expectation_success: table_iter_is_ok(query$X("raw")$tables()) is not TRUE -test_incremental_query_mouse_brain,46.5919999999969,59.3540000000066,11.578,expect_true(table_iter_is_ok(query$obs())): expectation_success: table_iter_is_ok(query$obs()) is not TRUE ; expect_true(table_iter_is_ok(query$var())): expectation_success: table_iter_is_ok(query$var()) is not TRUE ; expect_true(table_iter_is_ok(query$X("raw")$tables())): expectation_success: 
table_iter_is_ok(query$X("raw")$tables()) is not TRUE -test_incremental_query_mouse_aorta,14.3100000000013,10.6429999999964,6.74299999999994,expect_true(table_iter_is_ok(query$obs())): expectation_success: table_iter_is_ok(query$obs()) is not TRUE ; expect_true(table_iter_is_ok(query$var())): expectation_success: table_iter_is_ok(query$var()) is not TRUE ; expect_true(table_iter_is_ok(query$X("raw")$tables())): expectation_success: table_iter_is_ok(query$X("raw")$tables()) is not TRUE -test_seurat_small-query,23.9830000000002,82.7459999999992,21.578,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_10K-cells-human,5.35099999999875,4.23699999999371,9.78099999999995,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_100K-cells-human,32.75,29.4389999999985,32.6950000000002,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_250K-cells-human,88.8899999999994,73.1219999999885,80.6570000000002,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_500K-cells-human,203.519,153.688000000009,180.751,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_750K-cells-human,319.052,240.021999999997,295.6,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_1M-cells-human,399.764000000003,306.130999999994,371.32,test_seurat(test_args): 
expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_common-tissue,362.931999999997,274.218999999997,241.969,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_common-tissue-large-buffer-size,359.698,260.566000000006,247.953,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_common-cell-type,3382.376,16645.538,1005.753,test_seurat(test_args): Error: Error in `vec_to_Array(x, type)`: long vectors not supported yet: memory.c:3888 -test_seurat_common-cell-type-large-buffer-size,3376.691,49904.262,5228.115,test_seurat(test_args): Error: Error in `vec_to_Array(x, type)`: long vectors not supported yet: memory.c:3888 -test_seurat_whole-enchilada-large-buffer-size,0.00799999999799184,0.00099999998928979,0.0100000000002183,expect_true(TRUE): expectation_success: TRUE is not TRUE -test_sce_small-query,28.0150000000031,102.805999999982,28.4889999999996,test_sce(test_args): expectation_success: is(this_sce, "SingleCellExperiment") is not TRUE ; test_sce(test_args): expectation_success: ncol(this_sce) > 0 is not TRUE -test_sce_10K-cells-human,5.32399999999689,5.06299999999464,10.2999999999993,test_sce(test_args): expectation_success: is(this_sce, "SingleCellExperiment") is not TRUE ; test_sce(test_args): expectation_success: ncol(this_sce) > 0 is not TRUE -test_sce_100K-cells-human,30.0080000000016,540.113000000012,36.1700000000001,test_sce(test_args): expectation_success: is(this_sce, "SingleCellExperiment") is not TRUE ; test_sce(test_args): expectation_success: ncol(this_sce) > 0 is not TRUE -test_sce_250K-cells-human,84.3679999999986,77.1530000000203,79.5490000000009,test_sce(test_args): expectation_success: is(this_sce, 
"SingleCellExperiment") is not TRUE ; test_sce(test_args): expectation_success: ncol(this_sce) > 0 is not TRUE -test_sce_500K-cells-human,195.394,1590.212,182.706,test_sce(test_args): expectation_success: is(this_sce, "SingleCellExperiment") is not TRUE ; test_sce(test_args): expectation_success: ncol(this_sce) > 0 is not TRUE -test_sce_750K-cells-human,278.846000000001,312.385000000009,327.552,test_sce(test_args): expectation_success: is(this_sce, "SingleCellExperiment") is not TRUE ; test_sce(test_args): expectation_success: ncol(this_sce) > 0 is not TRUE -test_sce_1M-cells-human,369.940000000002,287.795000000013,316.947,test_sce(test_args): expectation_success: is(this_sce, "SingleCellExperiment") is not TRUE ; test_sce(test_args): expectation_success: ncol(this_sce) > 0 is not TRUE -test_sce_common-tissue,334.884999999998,266.495999999985,192.607,test_sce(test_args): expectation_success: is(this_sce, "SingleCellExperiment") is not TRUE ; test_sce(test_args): expectation_success: ncol(this_sce) > 0 is not TRUE -test_sce_common-tissue-large-buffer-size,333.330000000002,239.800999999978,185.868,test_sce(test_args): expectation_success: is(this_sce, "SingleCellExperiment") is not TRUE ; test_sce(test_args): expectation_success: ncol(this_sce) > 0 is not TRUE -test_sce_common-cell-type,5363.865,41800.937,6455.368,test_sce(test_args): Error: Error in `asMethod(object)`: unable to coerce from TsparseMatrix to [CR]sparseMatrixwhen length of 'i' slot exceeds 2^31-1 -test_sce_common-cell-type-large-buffer-size,5398.502,89696.129,8454.734,test_sce(test_args): Error: Error in `asMethod(object)`: unable to coerce from TsparseMatrix to [CR]sparseMatrixwhen length of 'i' slot exceeds 2^31-1 -test_sce_whole-enchilada-large-buffer-size,0.00600000000122236,0,0.00599999999758438,expect_true(TRUE): expectation_success: TRUE is not TRUE +test_load_obs_human,15.906,89.247,7.221,expect_true(nrow(obs_df) > 0): expectation_success: nrow(obs_df) > 0 is not TRUE 
+test_load_var_human,0.494,0.52300000000001,1.831,expect_true(nrow(var_df) > 0): expectation_success: nrow(var_df) > 0 is not TRUE +test_load_obs_mouse,2.221,6.60799999999999,2.081,expect_true(nrow(obs_df) > 0): expectation_success: nrow(obs_df) > 0 is not TRUE +test_load_var_mouse,0.385999999999999,0.501999999999995,1.567,expect_true(nrow(var_df) > 0): expectation_success: nrow(var_df) > 0 is not TRUE +test_incremental_read_obs_human,15.758,102.911,5.624,expect_true(table_iter_is_ok(obs_iter)): expectation_success: table_iter_is_ok(obs_iter) is not TRUE +test_incremental_read_var_human,0.390999999999998,0.533999999999992,1.346,expect_true(table_iter_is_ok(var_iter)): expectation_success: table_iter_is_ok(var_iter) is not TRUE +test_incremental_read_obs_mouse,2.95200000000001,11.719,2.359,expect_true(table_iter_is_ok(obs_iter)): expectation_success: table_iter_is_ok(obs_iter) is not TRUE +test_incremental_read_var_mouse,0.412999999999997,0.378999999999991,1.497,expect_true(table_iter_is_ok(var_iter)): expectation_success: table_iter_is_ok(var_iter) is not TRUE +test_incremental_read_X_human,8126.569,12080.887,1243.129,expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE +test_incremental_read_X_human-large-buffer-size,8129.084,83266.503,2095.526,expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE +test_incremental_read_X_mouse,950.918,1498.723,313.867,expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE +test_incremental_read_X_mouse-large-buffer-size,941.432000000001,1447.306,91.02,expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE +test_incremental_query_human_brain,202.976999999999,245.77900000001,61.9389999999999,expect_true(table_iter_is_ok(query$obs())): expectation_success: table_iter_is_ok(query$obs()) is not TRUE ; expect_true(table_iter_is_ok(query$var())): expectation_success: 
table_iter_is_ok(query$var()) is not TRUE ; expect_true(table_iter_is_ok(query$X("raw")$tables())): expectation_success: table_iter_is_ok(query$X("raw")$tables()) is not TRUE +test_incremental_query_human_aorta,15.2920000000013,87.8139999999985,12.4009999999998,expect_true(table_iter_is_ok(query$obs())): expectation_success: table_iter_is_ok(query$obs()) is not TRUE ; expect_true(table_iter_is_ok(query$var())): expectation_success: table_iter_is_ok(query$var()) is not TRUE ; expect_true(table_iter_is_ok(query$X("raw")$tables())): expectation_success: table_iter_is_ok(query$X("raw")$tables()) is not TRUE +test_incremental_query_mouse_brain,46.5919999999969,59.3540000000066,11.578,expect_true(table_iter_is_ok(query$obs())): expectation_success: table_iter_is_ok(query$obs()) is not TRUE ; expect_true(table_iter_is_ok(query$var())): expectation_success: table_iter_is_ok(query$var()) is not TRUE ; expect_true(table_iter_is_ok(query$X("raw")$tables())): expectation_success: table_iter_is_ok(query$X("raw")$tables()) is not TRUE +test_incremental_query_mouse_aorta,14.3100000000013,10.6429999999964,6.74299999999994,expect_true(table_iter_is_ok(query$obs())): expectation_success: table_iter_is_ok(query$obs()) is not TRUE ; expect_true(table_iter_is_ok(query$var())): expectation_success: table_iter_is_ok(query$var()) is not TRUE ; expect_true(table_iter_is_ok(query$X("raw")$tables())): expectation_success: table_iter_is_ok(query$X("raw")$tables()) is not TRUE +test_seurat_small-query,23.9830000000002,82.7459999999992,21.578,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE +test_seurat_10K-cells-human,5.35099999999875,4.23699999999371,9.78099999999995,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE 
+test_seurat_100K-cells-human,32.75,29.4389999999985,32.6950000000002,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE +test_seurat_250K-cells-human,88.8899999999994,73.1219999999885,80.6570000000002,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE +test_seurat_500K-cells-human,203.519,153.688000000009,180.751,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE +test_seurat_750K-cells-human,319.052,240.021999999997,295.6,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE +test_seurat_1M-cells-human,399.764000000003,306.130999999994,371.32,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE +test_seurat_common-tissue,362.931999999997,274.218999999997,241.969,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE +test_seurat_common-tissue-large-buffer-size,359.698,260.566000000006,247.953,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE +test_seurat_common-cell-type,3382.376,16645.538,1005.753,test_seurat(test_args): Error: Error in `vec_to_Array(x, type)`: long vectors not supported yet: memory.c:3888 +test_seurat_common-cell-type-large-buffer-size,3376.691,49904.262,5228.115,test_seurat(test_args): Error: Error in `vec_to_Array(x, type)`: long vectors not supported yet: memory.c:3888 
+test_seurat_whole-enchilada-large-buffer-size,0.00799999999799184,0.00099999998928979,0.0100000000002183,expect_true(TRUE): expectation_success: TRUE is not TRUE +test_sce_small-query,28.0150000000031,102.805999999982,28.4889999999996,test_sce(test_args): expectation_success: is(this_sce, "SingleCellExperiment") is not TRUE ; test_sce(test_args): expectation_success: ncol(this_sce) > 0 is not TRUE +test_sce_10K-cells-human,5.32399999999689,5.06299999999464,10.2999999999993,test_sce(test_args): expectation_success: is(this_sce, "SingleCellExperiment") is not TRUE ; test_sce(test_args): expectation_success: ncol(this_sce) > 0 is not TRUE +test_sce_100K-cells-human,30.0080000000016,540.113000000012,36.1700000000001,test_sce(test_args): expectation_success: is(this_sce, "SingleCellExperiment") is not TRUE ; test_sce(test_args): expectation_success: ncol(this_sce) > 0 is not TRUE +test_sce_250K-cells-human,84.3679999999986,77.1530000000203,79.5490000000009,test_sce(test_args): expectation_success: is(this_sce, "SingleCellExperiment") is not TRUE ; test_sce(test_args): expectation_success: ncol(this_sce) > 0 is not TRUE +test_sce_500K-cells-human,195.394,1590.212,182.706,test_sce(test_args): expectation_success: is(this_sce, "SingleCellExperiment") is not TRUE ; test_sce(test_args): expectation_success: ncol(this_sce) > 0 is not TRUE +test_sce_750K-cells-human,278.846000000001,312.385000000009,327.552,test_sce(test_args): expectation_success: is(this_sce, "SingleCellExperiment") is not TRUE ; test_sce(test_args): expectation_success: ncol(this_sce) > 0 is not TRUE +test_sce_1M-cells-human,369.940000000002,287.795000000013,316.947,test_sce(test_args): expectation_success: is(this_sce, "SingleCellExperiment") is not TRUE ; test_sce(test_args): expectation_success: ncol(this_sce) > 0 is not TRUE +test_sce_common-tissue,334.884999999998,266.495999999985,192.607,test_sce(test_args): expectation_success: is(this_sce, "SingleCellExperiment") is not TRUE ; test_sce(test_args): 
expectation_success: ncol(this_sce) > 0 is not TRUE +test_sce_common-tissue-large-buffer-size,333.330000000002,239.800999999978,185.868,test_sce(test_args): expectation_success: is(this_sce, "SingleCellExperiment") is not TRUE ; test_sce(test_args): expectation_success: ncol(this_sce) > 0 is not TRUE +test_sce_common-cell-type,5363.865,41800.937,6455.368,test_sce(test_args): Error: Error in `asMethod(object)`: unable to coerce from TsparseMatrix to [CR]sparseMatrixwhen length of 'i' slot exceeds 2^31-1 +test_sce_common-cell-type-large-buffer-size,5398.502,89696.129,8454.734,test_sce(test_args): Error: Error in `asMethod(object)`: unable to coerce from TsparseMatrix to [CR]sparseMatrixwhen length of 'i' slot exceeds 2^31-1 +test_sce_whole-enchilada-large-buffer-size,0.00600000000122236,0,0.00599999999758438,expect_true(TRUE): expectation_success: TRUE is not TRUE ``` -## 2023-07-02 +### 2023-07-02 - Host: EC2 instance type: `r6id.x32xlarge`, all nvme mounted as swap. - Uname: Linux ip-172-31-62-52 5.19.0-1028-aws #29~22.04.1-Ubuntu SMP Tue Jun 20 19:12:11 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux - Census version -``` +```r > cellxgene.census::get_census_version_description('latest') $release_date [1] "" @@ -281,151 +281,151 @@ $census_version [1] "latest" ``` -- R session info +- R session info -``` +```r > library("cellxgene.census"); sessionInfo() R version 4.3.0 (2023-04-21) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 22.04.2 LTS Matrix products: default -BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 +BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0 locale: - [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8 - [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8 - [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C -[10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C + [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8 + [4] LC_COLLATE=C.UTF-8 
LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8 + [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C +[10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C time zone: America/Los_Angeles tzcode source: system (glibc) attached base packages: -[1] stats graphics grDevices utils datasets methods base +[1] stats graphics grDevices utils datasets methods base other attached packages: [1] cellxgene.census_0.0.0.9000 loaded via a namespace (and not attached): - [1] vctrs_0.6.3 httr_1.4.6 cli_3.6.1 - [4] tiledbsoma_0.0.0.9030 rlang_1.1.1 purrr_1.0.1 - [7] assertthat_0.2.1 data.table_1.14.8 jsonlite_1.8.5 -[10] glue_1.6.2 bit_4.0.5 triebeard_0.4.1 -[13] grid_4.3.0 RcppSpdlog_0.0.13 base64enc_0.1-3 -[16] lifecycle_1.0.3 compiler_4.3.0 fs_1.6.2 -[19] Rcpp_1.0.10 aws.s3_0.3.21 lattice_0.21-8 -[22] digest_0.6.31 R6_2.5.1 tidyselect_1.2.0 -[25] curl_5.0.1 magrittr_2.0.3 urltools_1.7.3 -[28] Matrix_1.5-4.1 tools_4.3.0 bit64_4.0.5 -[31] aws.signature_0.6.0 spdl_0.0.5 arrow_12.0.1 -[34] xml2_1.3.4 + [1] vctrs_0.6.3 httr_1.4.6 cli_3.6.1 + [4] tiledbsoma_0.0.0.9030 rlang_1.1.1 purrr_1.0.1 + [7] assertthat_0.2.1 data.table_1.14.8 jsonlite_1.8.5 +[10] glue_1.6.2 bit_4.0.5 triebeard_0.4.1 +[13] grid_4.3.0 RcppSpdlog_0.0.13 base64enc_0.1-3 +[16] lifecycle_1.0.3 compiler_4.3.0 fs_1.6.2 +[19] Rcpp_1.0.10 aws.s3_0.3.21 lattice_0.21-8 +[22] digest_0.6.31 R6_2.5.1 tidyselect_1.2.0 +[25] curl_5.0.1 magrittr_2.0.3 urltools_1.7.3 +[28] Matrix_1.5-4.1 tools_4.3.0 bit64_4.0.5 +[31] aws.signature_0.6.0 spdl_0.0.5 arrow_12.0.1 +[34] xml2_1.3.4 ``` - `stdout.txt` -``` -START TEST ( 2023-07-02 14:46:28.791692 ): test_load_obs_human -END TEST ( 2023-07-02 14:46:35.845693 ): test_load_obs_human -START TEST ( 2023-07-02 14:46:35.847513 ): test_load_var_human -END TEST ( 2023-07-02 14:46:37.550566 ): test_load_var_human -START TEST ( 2023-07-02 14:46:37.552496 ): test_load_obs_mouse -END TEST ( 2023-07-02 14:46:39.540367 ): test_load_obs_mouse -START TEST ( 2023-07-02 14:46:39.542638 ): test_load_var_mouse -END 
TEST ( 2023-07-02 14:46:41.049362 ): test_load_var_mouse -START TEST ( 2023-07-02 14:46:41.051673 ): test_incremental_read_obs_human -END TEST ( 2023-07-02 14:46:46.651326 ): test_incremental_read_obs_human -START TEST ( 2023-07-02 14:46:46.653535 ): test_incremental_read_var_human -END TEST ( 2023-07-02 14:46:48.216871 ): test_incremental_read_var_human -START TEST ( 2023-07-02 14:46:48.219266 ): test_incremental_read_obs_mouse -END TEST ( 2023-07-02 14:46:50.634455 ): test_incremental_read_obs_mouse -START TEST ( 2023-07-02 14:46:50.636518 ): test_incremental_read_var_mouse -END TEST ( 2023-07-02 14:46:52.02957 ): test_incremental_read_var_mouse -START TEST ( 2023-07-02 14:46:52.031717 ): test_incremental_read_X_human -END TEST ( 2023-07-02 15:06:21.675927 ): test_incremental_read_X_human -START TEST ( 2023-07-02 15:06:21.678379 ): test_incremental_read_X_human-large-buffer-size -END TEST ( 2023-07-02 15:38:51.28431 ): test_incremental_read_X_human-large-buffer-size -START TEST ( 2023-07-02 15:38:51.361892 ): test_incremental_read_X_mouse -END TEST ( 2023-07-02 15:43:56.700087 ): test_incremental_read_X_mouse -START TEST ( 2023-07-02 15:43:56.720547 ): test_incremental_read_X_mouse-large-buffer-size -END TEST ( 2023-07-02 15:45:23.18604 ): test_incremental_read_X_mouse-large-buffer-size -START TEST ( 2023-07-02 15:45:23.188516 ): test_incremental_query_human_brain -END TEST ( 2023-07-02 15:46:27.33182 ): test_incremental_query_human_brain -START TEST ( 2023-07-02 15:46:27.333765 ): test_incremental_query_human_aorta -END TEST ( 2023-07-02 15:46:40.686538 ): test_incremental_query_human_aorta -START TEST ( 2023-07-02 15:46:40.688573 ): test_incremental_query_mouse_brain -END TEST ( 2023-07-02 15:46:51.875727 ): test_incremental_query_mouse_brain -START TEST ( 2023-07-02 15:46:51.877772 ): test_incremental_query_mouse_aorta -END TEST ( 2023-07-02 15:46:58.295933 ): test_incremental_query_mouse_aorta -START TEST ( 2023-07-02 15:46:58.29842 ): test_seurat_small-query 
-END TEST ( 2023-07-02 15:47:20.06609 ): test_seurat_small-query -START TEST ( 2023-07-02 15:47:20.067965 ): test_seurat_10K-cells-human -END TEST ( 2023-07-02 15:47:32.549183 ): test_seurat_10K-cells-human -START TEST ( 2023-07-02 15:47:32.550956 ): test_seurat_100K-cells-human -END TEST ( 2023-07-02 15:48:22.766206 ): test_seurat_100K-cells-human -START TEST ( 2023-07-02 15:48:22.768067 ): test_seurat_250K-cells-human -END TEST ( 2023-07-02 15:50:07.128338 ): test_seurat_250K-cells-human -START TEST ( 2023-07-02 15:50:07.130188 ): test_seurat_500K-cells-human -END TEST ( 2023-07-02 15:53:52.198963 ): test_seurat_500K-cells-human -START TEST ( 2023-07-02 15:53:52.200954 ): test_seurat_750K-cells-human -END TEST ( 2023-07-02 15:59:06.944844 ): test_seurat_750K-cells-human -START TEST ( 2023-07-02 15:59:06.946713 ): test_seurat_1M-cells-human -END TEST ( 2023-07-02 15:59:50.717664 ): test_seurat_1M-cells-human -START TEST ( 2023-07-02 15:59:50.720414 ): test_seurat_common-tissue -END TEST ( 2023-07-02 16:03:47.743072 ): test_seurat_common-tissue -START TEST ( 2023-07-02 16:03:47.745073 ): test_seurat_common-tissue-large-buffer-size -END TEST ( 2023-07-02 16:07:50.285648 ): test_seurat_common-tissue-large-buffer-size -START TEST ( 2023-07-02 16:07:50.287765 ): test_seurat_common-cell-type -END TEST ( 2023-07-02 16:22:28.235837 ): test_seurat_common-cell-type -START TEST ( 2023-07-02 16:22:28.241764 ): test_seurat_common-cell-type-large-buffer-size -END TEST ( 2023-07-02 17:38:56.592504 ): test_seurat_common-cell-type-large-buffer-size -START TEST ( 2023-07-02 17:38:56.5975 ): test_seurat_whole-enchilada-large-buffer-size -END TEST ( 2023-07-02 17:38:56.60277 ): test_seurat_whole-enchilada-large-buffer-size +```text +START TEST ( 2023-07-02 14:46:28.791692 ): test_load_obs_human +END TEST ( 2023-07-02 14:46:35.845693 ): test_load_obs_human +START TEST ( 2023-07-02 14:46:35.847513 ): test_load_var_human +END TEST ( 2023-07-02 14:46:37.550566 ): test_load_var_human 
+START TEST ( 2023-07-02 14:46:37.552496 ): test_load_obs_mouse +END TEST ( 2023-07-02 14:46:39.540367 ): test_load_obs_mouse +START TEST ( 2023-07-02 14:46:39.542638 ): test_load_var_mouse +END TEST ( 2023-07-02 14:46:41.049362 ): test_load_var_mouse +START TEST ( 2023-07-02 14:46:41.051673 ): test_incremental_read_obs_human +END TEST ( 2023-07-02 14:46:46.651326 ): test_incremental_read_obs_human +START TEST ( 2023-07-02 14:46:46.653535 ): test_incremental_read_var_human +END TEST ( 2023-07-02 14:46:48.216871 ): test_incremental_read_var_human +START TEST ( 2023-07-02 14:46:48.219266 ): test_incremental_read_obs_mouse +END TEST ( 2023-07-02 14:46:50.634455 ): test_incremental_read_obs_mouse +START TEST ( 2023-07-02 14:46:50.636518 ): test_incremental_read_var_mouse +END TEST ( 2023-07-02 14:46:52.02957 ): test_incremental_read_var_mouse +START TEST ( 2023-07-02 14:46:52.031717 ): test_incremental_read_X_human +END TEST ( 2023-07-02 15:06:21.675927 ): test_incremental_read_X_human +START TEST ( 2023-07-02 15:06:21.678379 ): test_incremental_read_X_human-large-buffer-size +END TEST ( 2023-07-02 15:38:51.28431 ): test_incremental_read_X_human-large-buffer-size +START TEST ( 2023-07-02 15:38:51.361892 ): test_incremental_read_X_mouse +END TEST ( 2023-07-02 15:43:56.700087 ): test_incremental_read_X_mouse +START TEST ( 2023-07-02 15:43:56.720547 ): test_incremental_read_X_mouse-large-buffer-size +END TEST ( 2023-07-02 15:45:23.18604 ): test_incremental_read_X_mouse-large-buffer-size +START TEST ( 2023-07-02 15:45:23.188516 ): test_incremental_query_human_brain +END TEST ( 2023-07-02 15:46:27.33182 ): test_incremental_query_human_brain +START TEST ( 2023-07-02 15:46:27.333765 ): test_incremental_query_human_aorta +END TEST ( 2023-07-02 15:46:40.686538 ): test_incremental_query_human_aorta +START TEST ( 2023-07-02 15:46:40.688573 ): test_incremental_query_mouse_brain +END TEST ( 2023-07-02 15:46:51.875727 ): test_incremental_query_mouse_brain +START TEST ( 2023-07-02 
15:46:51.877772 ): test_incremental_query_mouse_aorta +END TEST ( 2023-07-02 15:46:58.295933 ): test_incremental_query_mouse_aorta +START TEST ( 2023-07-02 15:46:58.29842 ): test_seurat_small-query +END TEST ( 2023-07-02 15:47:20.06609 ): test_seurat_small-query +START TEST ( 2023-07-02 15:47:20.067965 ): test_seurat_10K-cells-human +END TEST ( 2023-07-02 15:47:32.549183 ): test_seurat_10K-cells-human +START TEST ( 2023-07-02 15:47:32.550956 ): test_seurat_100K-cells-human +END TEST ( 2023-07-02 15:48:22.766206 ): test_seurat_100K-cells-human +START TEST ( 2023-07-02 15:48:22.768067 ): test_seurat_250K-cells-human +END TEST ( 2023-07-02 15:50:07.128338 ): test_seurat_250K-cells-human +START TEST ( 2023-07-02 15:50:07.130188 ): test_seurat_500K-cells-human +END TEST ( 2023-07-02 15:53:52.198963 ): test_seurat_500K-cells-human +START TEST ( 2023-07-02 15:53:52.200954 ): test_seurat_750K-cells-human +END TEST ( 2023-07-02 15:59:06.944844 ): test_seurat_750K-cells-human +START TEST ( 2023-07-02 15:59:06.946713 ): test_seurat_1M-cells-human +END TEST ( 2023-07-02 15:59:50.717664 ): test_seurat_1M-cells-human +START TEST ( 2023-07-02 15:59:50.720414 ): test_seurat_common-tissue +END TEST ( 2023-07-02 16:03:47.743072 ): test_seurat_common-tissue +START TEST ( 2023-07-02 16:03:47.745073 ): test_seurat_common-tissue-large-buffer-size +END TEST ( 2023-07-02 16:07:50.285648 ): test_seurat_common-tissue-large-buffer-size +START TEST ( 2023-07-02 16:07:50.287765 ): test_seurat_common-cell-type +END TEST ( 2023-07-02 16:22:28.235837 ): test_seurat_common-cell-type +START TEST ( 2023-07-02 16:22:28.241764 ): test_seurat_common-cell-type-large-buffer-size +END TEST ( 2023-07-02 17:38:56.592504 ): test_seurat_common-cell-type-large-buffer-size +START TEST ( 2023-07-02 17:38:56.5975 ): test_seurat_whole-enchilada-large-buffer-size +END TEST ( 2023-07-02 17:38:56.60277 ): test_seurat_whole-enchilada-large-buffer-size ``` - `acceptance-tests-logs-2023-07-02.csv` -``` +```text 
test,user,system,real,test_result -test_load_obs_human,16.559,95.552,7.053,expect_true(nrow(obs_df) > 0): expectation_success: nrow(obs_df) > 0 is not TRUE -test_load_var_human,0.413,0.480000000000004,1.697,expect_true(nrow(var_df) > 0): expectation_success: nrow(var_df) > 0 is not TRUE -test_load_obs_mouse,2.223,7.28399999999999,1.987,expect_true(nrow(obs_df) > 0): expectation_success: nrow(obs_df) > 0 is not TRUE -test_load_var_mouse,0.413,0.475000000000009,1.505,expect_true(nrow(var_df) > 0): expectation_success: nrow(var_df) > 0 is not TRUE -test_incremental_read_obs_human,18.338,103.328,5.598,expect_true(table_iter_is_ok(obs_iter)): expectation_success: table_iter_is_ok(obs_iter) is not TRUE -test_incremental_read_var_human,0.408000000000001,0.482000000000028,1.562,expect_true(table_iter_is_ok(var_iter)): expectation_success: table_iter_is_ok(var_iter) is not TRUE -test_incremental_read_obs_mouse,2.351,6.16499999999999,2.415,expect_true(table_iter_is_ok(obs_iter)): expectation_success: table_iter_is_ok(obs_iter) is not TRUE -test_incremental_read_var_mouse,0.395000000000003,0.401999999999987,1.392,expect_true(table_iter_is_ok(var_iter)): expectation_success: table_iter_is_ok(var_iter) is not TRUE -test_incremental_read_X_human,8479.745,13912.628,1169.643,expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE -test_incremental_read_X_human-large-buffer-size,8247.498,77383.086,1949.527,expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE -test_incremental_read_X_mouse,960.691000000003,1551.99800000001,305.317,expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE -test_incremental_read_X_mouse-large-buffer-size,961.543999999998,1518.712,86.4630000000002,expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE 
-test_incremental_query_human_brain,228.550999999999,269.589999999997,64.1419999999998,expect_true(table_iter_is_ok(query$obs())): expectation_success: table_iter_is_ok(query$obs()) is not TRUE ; expect_true(table_iter_is_ok(query$var())): expectation_success: table_iter_is_ok(query$var()) is not TRUE ; expect_true(table_iter_is_ok(query$X("raw")$tables())): expectation_success: table_iter_is_ok(query$X("raw")$tables()) is not TRUE -test_incremental_query_human_aorta,19.0260000000017,72.7629999999917,13.3519999999999,expect_true(table_iter_is_ok(query$obs())): expectation_success: table_iter_is_ok(query$obs()) is not TRUE ; expect_true(table_iter_is_ok(query$var())): expectation_success: table_iter_is_ok(query$var()) is not TRUE ; expect_true(table_iter_is_ok(query$X("raw")$tables())): expectation_success: table_iter_is_ok(query$X("raw")$tables()) is not TRUE -test_incremental_query_mouse_brain,46.9169999999976,56.8659999999945,11.1860000000001,expect_true(table_iter_is_ok(query$obs())): expectation_success: table_iter_is_ok(query$obs()) is not TRUE ; expect_true(table_iter_is_ok(query$var())): expectation_success: table_iter_is_ok(query$var()) is not TRUE ; expect_true(table_iter_is_ok(query$X("raw")$tables())): expectation_success: table_iter_is_ok(query$X("raw")$tables()) is not TRUE -test_incremental_query_mouse_aorta,12.260000000002,11.2419999999984,6.41700000000037,expect_true(table_iter_is_ok(query$obs())): expectation_success: table_iter_is_ok(query$obs()) is not TRUE ; expect_true(table_iter_is_ok(query$var())): expectation_success: table_iter_is_ok(query$var()) is not TRUE ; expect_true(table_iter_is_ok(query$X("raw")$tables())): expectation_success: table_iter_is_ok(query$X("raw")$tables()) is not TRUE -test_seurat_small-query,25.7360000000008,70.429999999993,21.7669999999998,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE 
-test_seurat_10K-cells-human,9.01800000000003,9.08999999999651,12.4810000000002,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_100K-cells-human,54.7450000000026,48.3930000000109,50.2150000000001,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_250K-cells-human,118.756999999998,94.1370000000024,104.36,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_500K-cells-human,253.110000000001,194.872000000003,225.068,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_750K-cells-human,344.812999999998,264.630999999994,314.742999999999,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_1M-cells-human,160.532999999999,234.426000000007,43.7690000000002,test_seurat(test_args): Error: Error in `vec_to_Array(x, type)`: long vectors not supported yet: memory.c:3888 -test_seurat_common-tissue,387.287,257.248000000007,237.022,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_common-tissue-large-buffer-size,382.359,260.178,242.54,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_common-cell-type,3359.342,11201.16,877.945000000001,test_seurat(test_args): Error: Error in `vec_to_Array(x, type)`: long vectors not 
supported yet: memory.c:3888 -test_seurat_common-cell-type-large-buffer-size,3579.832,33378.649,4588.348,test_seurat(test_args): Error: Error in `vec_to_Array(x, type)`: long vectors not supported yet: memory.c:3888 -test_seurat_whole-enchilada-large-buffer-size,0.00400000000081491,0,0.00400000000081491,expect_true(TRUE): expectation_success: TRUE is not TRUE +test_load_obs_human,16.559,95.552,7.053,expect_true(nrow(obs_df) > 0): expectation_success: nrow(obs_df) > 0 is not TRUE +test_load_var_human,0.413,0.480000000000004,1.697,expect_true(nrow(var_df) > 0): expectation_success: nrow(var_df) > 0 is not TRUE +test_load_obs_mouse,2.223,7.28399999999999,1.987,expect_true(nrow(obs_df) > 0): expectation_success: nrow(obs_df) > 0 is not TRUE +test_load_var_mouse,0.413,0.475000000000009,1.505,expect_true(nrow(var_df) > 0): expectation_success: nrow(var_df) > 0 is not TRUE +test_incremental_read_obs_human,18.338,103.328,5.598,expect_true(table_iter_is_ok(obs_iter)): expectation_success: table_iter_is_ok(obs_iter) is not TRUE +test_incremental_read_var_human,0.408000000000001,0.482000000000028,1.562,expect_true(table_iter_is_ok(var_iter)): expectation_success: table_iter_is_ok(var_iter) is not TRUE +test_incremental_read_obs_mouse,2.351,6.16499999999999,2.415,expect_true(table_iter_is_ok(obs_iter)): expectation_success: table_iter_is_ok(obs_iter) is not TRUE +test_incremental_read_var_mouse,0.395000000000003,0.401999999999987,1.392,expect_true(table_iter_is_ok(var_iter)): expectation_success: table_iter_is_ok(var_iter) is not TRUE +test_incremental_read_X_human,8479.745,13912.628,1169.643,expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE +test_incremental_read_X_human-large-buffer-size,8247.498,77383.086,1949.527,expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE +test_incremental_read_X_mouse,960.691000000003,1551.99800000001,305.317,expect_true(table_iter_is_ok(X_iter)): 
expectation_success: table_iter_is_ok(X_iter) is not TRUE +test_incremental_read_X_mouse-large-buffer-size,961.543999999998,1518.712,86.4630000000002,expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE +test_incremental_query_human_brain,228.550999999999,269.589999999997,64.1419999999998,expect_true(table_iter_is_ok(query$obs())): expectation_success: table_iter_is_ok(query$obs()) is not TRUE ; expect_true(table_iter_is_ok(query$var())): expectation_success: table_iter_is_ok(query$var()) is not TRUE ; expect_true(table_iter_is_ok(query$X("raw")$tables())): expectation_success: table_iter_is_ok(query$X("raw")$tables()) is not TRUE +test_incremental_query_human_aorta,19.0260000000017,72.7629999999917,13.3519999999999,expect_true(table_iter_is_ok(query$obs())): expectation_success: table_iter_is_ok(query$obs()) is not TRUE ; expect_true(table_iter_is_ok(query$var())): expectation_success: table_iter_is_ok(query$var()) is not TRUE ; expect_true(table_iter_is_ok(query$X("raw")$tables())): expectation_success: table_iter_is_ok(query$X("raw")$tables()) is not TRUE +test_incremental_query_mouse_brain,46.9169999999976,56.8659999999945,11.1860000000001,expect_true(table_iter_is_ok(query$obs())): expectation_success: table_iter_is_ok(query$obs()) is not TRUE ; expect_true(table_iter_is_ok(query$var())): expectation_success: table_iter_is_ok(query$var()) is not TRUE ; expect_true(table_iter_is_ok(query$X("raw")$tables())): expectation_success: table_iter_is_ok(query$X("raw")$tables()) is not TRUE +test_incremental_query_mouse_aorta,12.260000000002,11.2419999999984,6.41700000000037,expect_true(table_iter_is_ok(query$obs())): expectation_success: table_iter_is_ok(query$obs()) is not TRUE ; expect_true(table_iter_is_ok(query$var())): expectation_success: table_iter_is_ok(query$var()) is not TRUE ; expect_true(table_iter_is_ok(query$X("raw")$tables())): expectation_success: table_iter_is_ok(query$X("raw")$tables()) is not TRUE 
+test_seurat_small-query,25.7360000000008,70.429999999993,21.7669999999998,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE +test_seurat_10K-cells-human,9.01800000000003,9.08999999999651,12.4810000000002,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE +test_seurat_100K-cells-human,54.7450000000026,48.3930000000109,50.2150000000001,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE +test_seurat_250K-cells-human,118.756999999998,94.1370000000024,104.36,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE +test_seurat_500K-cells-human,253.110000000001,194.872000000003,225.068,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE +test_seurat_750K-cells-human,344.812999999998,264.630999999994,314.742999999999,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE +test_seurat_1M-cells-human,160.532999999999,234.426000000007,43.7690000000002,test_seurat(test_args): Error: Error in `vec_to_Array(x, type)`: long vectors not supported yet: memory.c:3888 +test_seurat_common-tissue,387.287,257.248000000007,237.022,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE +test_seurat_common-tissue-large-buffer-size,382.359,260.178,242.54,test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is 
not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE +test_seurat_common-cell-type,3359.342,11201.16,877.945000000001,test_seurat(test_args): Error: Error in `vec_to_Array(x, type)`: long vectors not supported yet: memory.c:3888 +test_seurat_common-cell-type-large-buffer-size,3579.832,33378.649,4588.348,test_seurat(test_args): Error: Error in `vec_to_Array(x, type)`: long vectors not supported yet: memory.c:3888 +test_seurat_whole-enchilada-large-buffer-size,0.00400000000081491,0,0.00400000000081491,expect_true(TRUE): expectation_success: TRUE is not TRUE ``` -## 2023-06-23 +### 2023-06-23 - Host: EC2 instance type: `r6id.x32xlarge`, all nvme mounted as swap. - Uname: Linux ip-172-31-62-52 5.19.0-1026-aws #27~22.04.1-Ubuntu SMP Mon May 22 15:57:16 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux - Census version -``` +```r > cellxgene.census::get_census_version_description('latest') $release_date [1] "" @@ -452,131 +452,131 @@ $census_version [1] "latest" ``` -- R session info +- R session info -``` +```r > library("cellxgene.census"); sessionInfo() R version 4.3.0 (2023-04-21) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 22.04.2 LTS Matrix products: default -BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 +BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0 locale: - [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8 - [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8 - [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C -[10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C + [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8 + [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8 + [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C +[10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C time zone: America/Los_Angeles tzcode source: system (glibc) attached base packages: -[1] stats graphics grDevices utils datasets methods base 
+[1] stats graphics grDevices utils datasets methods base other attached packages: [1] cellxgene.census_0.0.0.9000 loaded via a namespace (and not attached): - [1] vctrs_0.6.3 httr_1.4.6 cli_3.6.1 - [4] tiledbsoma_0.0.0.9028 rlang_1.1.1 purrr_1.0.1 - [7] assertthat_0.2.1 data.table_1.14.8 jsonlite_1.8.5 -[10] glue_1.6.2 bit_4.0.5 triebeard_0.4.1 -[13] grid_4.3.0 RcppSpdlog_0.0.13 base64enc_0.1-3 -[16] lifecycle_1.0.3 compiler_4.3.0 fs_1.6.2 -[19] Rcpp_1.0.10 aws.s3_0.3.21 lattice_0.21-8 -[22] digest_0.6.31 R6_2.5.1 tidyselect_1.2.0 -[25] curl_5.0.1 magrittr_2.0.3 urltools_1.7.3 -[28] Matrix_1.5-4.1 tools_4.3.0 bit64_4.0.5 -[31] aws.signature_0.6.0 spdl_0.0.5 arrow_12.0.1 -[34] xml2_1.3.4 + [1] vctrs_0.6.3 httr_1.4.6 cli_3.6.1 + [4] tiledbsoma_0.0.0.9028 rlang_1.1.1 purrr_1.0.1 + [7] assertthat_0.2.1 data.table_1.14.8 jsonlite_1.8.5 +[10] glue_1.6.2 bit_4.0.5 triebeard_0.4.1 +[13] grid_4.3.0 RcppSpdlog_0.0.13 base64enc_0.1-3 +[16] lifecycle_1.0.3 compiler_4.3.0 fs_1.6.2 +[19] Rcpp_1.0.10 aws.s3_0.3.21 lattice_0.21-8 +[22] digest_0.6.31 R6_2.5.1 tidyselect_1.2.0 +[25] curl_5.0.1 magrittr_2.0.3 urltools_1.7.3 +[28] Matrix_1.5-4.1 tools_4.3.0 bit64_4.0.5 +[31] aws.signature_0.6.0 spdl_0.0.5 arrow_12.0.1 +[34] xml2_1.3.4 ``` - `stdout.txt` -``` -START TEST ( 2023-06-23 13:55:52.324647 ): test_load_obs_human -END TEST ( 2023-06-23 13:55:59.362807 ): test_load_obs_human -START TEST ( 2023-06-23 13:55:59.364576 ): test_load_var_human -END TEST ( 2023-06-23 13:56:00.939187 ): test_load_var_human -START TEST ( 2023-06-23 13:56:00.941131 ): test_load_obs_mouse -END TEST ( 2023-06-23 13:56:03.004462 ): test_load_obs_mouse -START TEST ( 2023-06-23 13:56:03.006694 ): test_load_var_mouse -END TEST ( 2023-06-23 13:56:04.727943 ): test_load_var_mouse -START TEST ( 2023-06-23 13:56:04.730105 ): test_incremental_read_obs_human -END TEST ( 2023-06-23 13:56:10.508174 ): test_incremental_read_obs_human -START TEST ( 2023-06-23 13:56:10.51139 ): test_incremental_read_var_human -END TEST 
( 2023-06-23 13:56:12.073684 ): test_incremental_read_var_human -START TEST ( 2023-06-23 13:56:12.07612 ): test_incremental_read_X_human -END TEST ( 2023-06-23 14:17:46.233498 ): test_incremental_read_X_human -START TEST ( 2023-06-23 14:17:46.236098 ): test_incremental_read_X_human-large-buffer-size -END TEST ( 2023-06-23 14:42:27.219841 ): test_incremental_read_X_human-large-buffer-size -START TEST ( 2023-06-23 14:42:27.313921 ): test_incremental_read_obs_mouse -END TEST ( 2023-06-23 14:42:35.792303 ): test_incremental_read_obs_mouse -START TEST ( 2023-06-23 14:42:35.825136 ): test_incremental_read_var_mouse -END TEST ( 2023-06-23 14:44:48.181343 ): test_incremental_read_var_mouse -START TEST ( 2023-06-23 14:44:48.225299 ): test_incremental_read_X_mouse -END TEST ( 2023-06-23 14:46:40.709836 ): test_incremental_read_X_mouse -START TEST ( 2023-06-23 14:46:40.712315 ): test_incremental_read_X_mouse-large-buffer-size -END TEST ( 2023-06-23 14:48:02.087424 ): test_incremental_read_X_mouse-large-buffer-size -START TEST ( 2023-06-23 14:48:02.091451 ): test_incremental_query -END TEST ( 2023-06-23 14:48:02.100564 ): test_incremental_query -START TEST ( 2023-06-23 14:48:02.102893 ): test_seurat_small-query -END TEST ( 2023-06-23 14:48:48.744465 ): test_seurat_small-query -START TEST ( 2023-06-23 14:48:48.746438 ): test_seurat_10K-cells-human -END TEST ( 2023-06-23 14:49:01.11209 ): test_seurat_10K-cells-human -START TEST ( 2023-06-23 14:49:01.114074 ): test_seurat_100K-cells-human -END TEST ( 2023-06-23 14:49:51.088361 ): test_seurat_100K-cells-human -START TEST ( 2023-06-23 14:49:51.090358 ): test_seurat_250K-cells-human -END TEST ( 2023-06-23 14:51:32.084494 ): test_seurat_250K-cells-human -START TEST ( 2023-06-23 14:51:32.086453 ): test_seurat_500K-cells-human -END TEST ( 2023-06-23 14:55:04.211365 ): test_seurat_500K-cells-human -START TEST ( 2023-06-23 14:55:04.213284 ): test_seurat_750K-cells-human -END TEST ( 2023-06-23 15:00:02.888813 ): 
test_seurat_750K-cells-human -START TEST ( 2023-06-23 15:00:02.890819 ): test_seurat_1M-cells-human -END TEST ( 2023-06-23 15:00:54.993723 ): test_seurat_1M-cells-human -START TEST ( 2023-06-23 15:00:54.996504 ): test_seurat_common-tissue -END TEST ( 2023-06-23 15:05:05.376396 ): test_seurat_common-tissue -START TEST ( 2023-06-23 15:05:05.378534 ): test_seurat_common-tissue-large-buffer-size -END TEST ( 2023-06-23 15:09:13.780464 ): test_seurat_common-tissue-large-buffer-size -START TEST ( 2023-06-23 15:09:13.782572 ): test_seurat_common-cell-type -END TEST ( 2023-06-23 15:24:43.865822 ): test_seurat_common-cell-type -START TEST ( 2023-06-23 15:24:43.867832 ): test_seurat_common-cell-type-large-buffer-size -END TEST ( 2023-06-23 16:56:58.016858 ): test_seurat_common-cell-type-large-buffer-size -START TEST ( 2023-06-23 16:56:58.020263 ): test_seurat_whole-enchilada-large-buffer-size -END TEST ( 2023-06-23 16:56:58.025497 ): test_seurat_whole-enchilada-large-buffer-size +```text +START TEST ( 2023-06-23 13:55:52.324647 ): test_load_obs_human +END TEST ( 2023-06-23 13:55:59.362807 ): test_load_obs_human +START TEST ( 2023-06-23 13:55:59.364576 ): test_load_var_human +END TEST ( 2023-06-23 13:56:00.939187 ): test_load_var_human +START TEST ( 2023-06-23 13:56:00.941131 ): test_load_obs_mouse +END TEST ( 2023-06-23 13:56:03.004462 ): test_load_obs_mouse +START TEST ( 2023-06-23 13:56:03.006694 ): test_load_var_mouse +END TEST ( 2023-06-23 13:56:04.727943 ): test_load_var_mouse +START TEST ( 2023-06-23 13:56:04.730105 ): test_incremental_read_obs_human +END TEST ( 2023-06-23 13:56:10.508174 ): test_incremental_read_obs_human +START TEST ( 2023-06-23 13:56:10.51139 ): test_incremental_read_var_human +END TEST ( 2023-06-23 13:56:12.073684 ): test_incremental_read_var_human +START TEST ( 2023-06-23 13:56:12.07612 ): test_incremental_read_X_human +END TEST ( 2023-06-23 14:17:46.233498 ): test_incremental_read_X_human +START TEST ( 2023-06-23 14:17:46.236098 ): 
test_incremental_read_X_human-large-buffer-size +END TEST ( 2023-06-23 14:42:27.219841 ): test_incremental_read_X_human-large-buffer-size +START TEST ( 2023-06-23 14:42:27.313921 ): test_incremental_read_obs_mouse +END TEST ( 2023-06-23 14:42:35.792303 ): test_incremental_read_obs_mouse +START TEST ( 2023-06-23 14:42:35.825136 ): test_incremental_read_var_mouse +END TEST ( 2023-06-23 14:44:48.181343 ): test_incremental_read_var_mouse +START TEST ( 2023-06-23 14:44:48.225299 ): test_incremental_read_X_mouse +END TEST ( 2023-06-23 14:46:40.709836 ): test_incremental_read_X_mouse +START TEST ( 2023-06-23 14:46:40.712315 ): test_incremental_read_X_mouse-large-buffer-size +END TEST ( 2023-06-23 14:48:02.087424 ): test_incremental_read_X_mouse-large-buffer-size +START TEST ( 2023-06-23 14:48:02.091451 ): test_incremental_query +END TEST ( 2023-06-23 14:48:02.100564 ): test_incremental_query +START TEST ( 2023-06-23 14:48:02.102893 ): test_seurat_small-query +END TEST ( 2023-06-23 14:48:48.744465 ): test_seurat_small-query +START TEST ( 2023-06-23 14:48:48.746438 ): test_seurat_10K-cells-human +END TEST ( 2023-06-23 14:49:01.11209 ): test_seurat_10K-cells-human +START TEST ( 2023-06-23 14:49:01.114074 ): test_seurat_100K-cells-human +END TEST ( 2023-06-23 14:49:51.088361 ): test_seurat_100K-cells-human +START TEST ( 2023-06-23 14:49:51.090358 ): test_seurat_250K-cells-human +END TEST ( 2023-06-23 14:51:32.084494 ): test_seurat_250K-cells-human +START TEST ( 2023-06-23 14:51:32.086453 ): test_seurat_500K-cells-human +END TEST ( 2023-06-23 14:55:04.211365 ): test_seurat_500K-cells-human +START TEST ( 2023-06-23 14:55:04.213284 ): test_seurat_750K-cells-human +END TEST ( 2023-06-23 15:00:02.888813 ): test_seurat_750K-cells-human +START TEST ( 2023-06-23 15:00:02.890819 ): test_seurat_1M-cells-human +END TEST ( 2023-06-23 15:00:54.993723 ): test_seurat_1M-cells-human +START TEST ( 2023-06-23 15:00:54.996504 ): test_seurat_common-tissue +END TEST ( 2023-06-23 15:05:05.376396 
): test_seurat_common-tissue +START TEST ( 2023-06-23 15:05:05.378534 ): test_seurat_common-tissue-large-buffer-size +END TEST ( 2023-06-23 15:09:13.780464 ): test_seurat_common-tissue-large-buffer-size +START TEST ( 2023-06-23 15:09:13.782572 ): test_seurat_common-cell-type +END TEST ( 2023-06-23 15:24:43.865822 ): test_seurat_common-cell-type +START TEST ( 2023-06-23 15:24:43.867832 ): test_seurat_common-cell-type-large-buffer-size +END TEST ( 2023-06-23 16:56:58.016858 ): test_seurat_common-cell-type-large-buffer-size +START TEST ( 2023-06-23 16:56:58.020263 ): test_seurat_whole-enchilada-large-buffer-size +END TEST ( 2023-06-23 16:56:58.025497 ): test_seurat_whole-enchilada-large-buffer-size ``` -- `acceptance-tests-logs-2023-06-23.csv ` +- `acceptance-tests-logs-2023-06-23.csv` -``` +```text test,user,system,real,test_result -test_load_obs_human,18.366,101.34,7.037,expect_true(nrow(obs_df) > 0): expectation_success: nrow(obs_df) > 0 is not TRUE -test_load_var_human,0.369,0.530999999999992,1.569,expect_true(nrow(var_df) > 0): expectation_success: nrow(var_df) > 0 is not TRUE -test_load_obs_mouse,2.011,6.202,2.062,expect_true(nrow(obs_df) > 0): expectation_success: nrow(obs_df) > 0 is not TRUE -test_load_var_mouse,0.417999999999999,0.486000000000004,1.721,expect_true(nrow(var_df) > 0): expectation_success: nrow(var_df) > 0 is not TRUE -test_incremental_read_obs_human,18.411,94.142,5.777,expect_true(table_iter_is_ok(obs_iter)): expectation_success: table_iter_is_ok(obs_iter) is not TRUE -test_incremental_read_var_human,0.454999999999998,0.531000000000006,1.561,expect_true(table_iter_is_ok(var_iter)): expectation_success: table_iter_is_ok(var_iter) is not TRUE -test_incremental_read_X_human,8193.649,14613.005,1294.156,expect_warning(X_iter <- census$get("census_data")$get(organism)$ms$get("RNA")$X$get("raw")$read()$tables()): expectation_success: ; expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE 
-test_incremental_read_X_human-large-buffer-size,8091.321,43302.612,1480.899,expect_warning(X_iter <- census$get("census_data")$get(organism)$ms$get("RNA")$X$get("raw")$read()$tables()): expectation_success: ; expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE -test_incremental_read_obs_mouse,2.22500000000036,13.4630000000034,8.45900000000029,expect_true(table_iter_is_ok(obs_iter)): expectation_success: table_iter_is_ok(obs_iter) is not TRUE -test_incremental_read_var_mouse,0.7549999999992,128.547999999995,132.348,expect_true(table_iter_is_ok(var_iter)): expectation_success: table_iter_is_ok(var_iter) is not TRUE -test_incremental_read_X_mouse,928.090000000002,1636.642,112.482,expect_warning(X_iter <- census$get("census_data")$get(organism)$ms$get("RNA")$X$get("raw")$read()$tables()): expectation_success: ; expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE -test_incremental_read_X_mouse-large-buffer-size,899.965,1779.616,81.373,expect_warning(X_iter <- census$get("census_data")$get(organism)$ms$get("RNA")$X$get("raw")$read()$tables()): expectation_success: ; expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE -test_incremental_query,0.00500000000101863,0.000999999996565748,0.00800000000026557,expect_true(TRUE): expectation_success: TRUE is not TRUE -test_seurat_small-query,26.2459999999992,105.765999999996,46.6410000000001,test_seurat(test_args): expectation_warning: Iteration results cannot be concatenated on its entirety because array has non-zero elements greater than '.Machine$integer.max'. 
; test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_10K-cells-human,8.78700000000026,9.39099999999598,12.3649999999998,test_seurat(test_args): expectation_warning: Iteration results cannot be concatenated on its entirety because array has non-zero elements greater than '.Machine$integer.max'. ; test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_100K-cells-human,52.6959999999999,51.7980000000025,49.973,test_seurat(test_args): expectation_warning: Iteration results cannot be concatenated on its entirety because array has non-zero elements greater than '.Machine$integer.max'. ; test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_250K-cells-human,117.172999999999,100.030000000006,100.993,test_seurat(test_args): expectation_warning: Iteration results cannot be concatenated on its entirety because array has non-zero elements greater than '.Machine$integer.max'. ; test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_500K-cells-human,236.339,191.728999999999,212.124,test_seurat(test_args): expectation_warning: Iteration results cannot be concatenated on its entirety because array has non-zero elements greater than '.Machine$integer.max'. 
; test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_750K-cells-human,334.721999999998,264.296999999999,298.675,test_seurat(test_args): expectation_warning: Iteration results cannot be concatenated on its entirety because array has non-zero elements greater than '.Machine$integer.max'. ; test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_1M-cells-human,156.661,246.656000000003,52.1020000000003,test_seurat(test_args): expectation_warning: Iteration results cannot be concatenated on its entirety because array has non-zero elements greater than '.Machine$integer.max'. ; test_seurat(test_args): Error: Error in `vec_to_Array(x, type)`: long vectors not supported yet: memory.c:3888 -test_seurat_common-tissue,392.353999999999,277.388999999996,250.379,test_seurat(test_args): expectation_warning: Iteration results cannot be concatenated on its entirety because array has non-zero elements greater than '.Machine$integer.max'. ; test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_common-tissue-large-buffer-size,379.781999999999,300.778000000006,248.401000000001,test_seurat(test_args): expectation_warning: Iteration results cannot be concatenated on its entirety because array has non-zero elements greater than '.Machine$integer.max'. 
; test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE -test_seurat_common-cell-type,3346.826,11816.851,930.083,test_seurat(test_args): expectation_warning: Iteration results cannot be concatenated on its entirety because array has non-zero elements greater than '.Machine$integer.max'. ; test_seurat(test_args): Error: Error in `vec_to_Array(x, type)`: long vectors not supported yet: memory.c:3888 -test_seurat_common-cell-type-large-buffer-size,3437.291,28305.89,5534.148,test_seurat(test_args): expectation_warning: Iteration results cannot be concatenated on its entirety because array has non-zero elements greater than '.Machine$integer.max'. ; test_seurat(test_args): Error: Error in `vec_to_Array(x, type)`: long vectors not supported yet: memory.c:3888 -test_seurat_whole-enchilada-large-buffer-size,0.00399999999717693,0.00099999998928979,0.00500000000101863,expect_true(TRUE): expectation_success: TRUE is not TRUE -``` \ No newline at end of file +test_load_obs_human,18.366,101.34,7.037,expect_true(nrow(obs_df) > 0): expectation_success: nrow(obs_df) > 0 is not TRUE +test_load_var_human,0.369,0.530999999999992,1.569,expect_true(nrow(var_df) > 0): expectation_success: nrow(var_df) > 0 is not TRUE +test_load_obs_mouse,2.011,6.202,2.062,expect_true(nrow(obs_df) > 0): expectation_success: nrow(obs_df) > 0 is not TRUE +test_load_var_mouse,0.417999999999999,0.486000000000004,1.721,expect_true(nrow(var_df) > 0): expectation_success: nrow(var_df) > 0 is not TRUE +test_incremental_read_obs_human,18.411,94.142,5.777,expect_true(table_iter_is_ok(obs_iter)): expectation_success: table_iter_is_ok(obs_iter) is not TRUE +test_incremental_read_var_human,0.454999999999998,0.531000000000006,1.561,expect_true(table_iter_is_ok(var_iter)): expectation_success: table_iter_is_ok(var_iter) is not TRUE +test_incremental_read_X_human,8193.649,14613.005,1294.156,expect_warning(X_iter 
<- census$get("census_data")$get(organism)$ms$get("RNA")$X$get("raw")$read()$tables()): expectation_success: ; expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE
+test_incremental_read_X_human-large-buffer-size,8091.321,43302.612,1480.899,expect_warning(X_iter <- census$get("census_data")$get(organism)$ms$get("RNA")$X$get("raw")$read()$tables()): expectation_success: ; expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE
+test_incremental_read_obs_mouse,2.22500000000036,13.4630000000034,8.45900000000029,expect_true(table_iter_is_ok(obs_iter)): expectation_success: table_iter_is_ok(obs_iter) is not TRUE
+test_incremental_read_var_mouse,0.7549999999992,128.547999999995,132.348,expect_true(table_iter_is_ok(var_iter)): expectation_success: table_iter_is_ok(var_iter) is not TRUE
+test_incremental_read_X_mouse,928.090000000002,1636.642,112.482,expect_warning(X_iter <- census$get("census_data")$get(organism)$ms$get("RNA")$X$get("raw")$read()$tables()): expectation_success: ; expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE
+test_incremental_read_X_mouse-large-buffer-size,899.965,1779.616,81.373,expect_warning(X_iter <- census$get("census_data")$get(organism)$ms$get("RNA")$X$get("raw")$read()$tables()): expectation_success: ; expect_true(table_iter_is_ok(X_iter)): expectation_success: table_iter_is_ok(X_iter) is not TRUE
+test_incremental_query,0.00500000000101863,0.000999999996565748,0.00800000000026557,expect_true(TRUE): expectation_success: TRUE is not TRUE
+test_seurat_small-query,26.2459999999992,105.765999999996,46.6410000000001,test_seurat(test_args): expectation_warning: Iteration results cannot be concatenated on its entirety because array has non-zero elements greater than '.Machine$integer.max'. ; test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE
+test_seurat_10K-cells-human,8.78700000000026,9.39099999999598,12.3649999999998,test_seurat(test_args): expectation_warning: Iteration results cannot be concatenated on its entirety because array has non-zero elements greater than '.Machine$integer.max'. ; test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE
+test_seurat_100K-cells-human,52.6959999999999,51.7980000000025,49.973,test_seurat(test_args): expectation_warning: Iteration results cannot be concatenated on its entirety because array has non-zero elements greater than '.Machine$integer.max'. ; test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE
+test_seurat_250K-cells-human,117.172999999999,100.030000000006,100.993,test_seurat(test_args): expectation_warning: Iteration results cannot be concatenated on its entirety because array has non-zero elements greater than '.Machine$integer.max'. ; test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE
+test_seurat_500K-cells-human,236.339,191.728999999999,212.124,test_seurat(test_args): expectation_warning: Iteration results cannot be concatenated on its entirety because array has non-zero elements greater than '.Machine$integer.max'. ; test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE
+test_seurat_750K-cells-human,334.721999999998,264.296999999999,298.675,test_seurat(test_args): expectation_warning: Iteration results cannot be concatenated on its entirety because array has non-zero elements greater than '.Machine$integer.max'. ; test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE
+test_seurat_1M-cells-human,156.661,246.656000000003,52.1020000000003,test_seurat(test_args): expectation_warning: Iteration results cannot be concatenated on its entirety because array has non-zero elements greater than '.Machine$integer.max'. ; test_seurat(test_args): Error: Error in `vec_to_Array(x, type)`: long vectors not supported yet: memory.c:3888
+test_seurat_common-tissue,392.353999999999,277.388999999996,250.379,test_seurat(test_args): expectation_warning: Iteration results cannot be concatenated on its entirety because array has non-zero elements greater than '.Machine$integer.max'. ; test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE
+test_seurat_common-tissue-large-buffer-size,379.781999999999,300.778000000006,248.401000000001,test_seurat(test_args): expectation_warning: Iteration results cannot be concatenated on its entirety because array has non-zero elements greater than '.Machine$integer.max'. ; test_seurat(test_args): expectation_success: is(this_seurat, "Seurat") is not TRUE ; test_seurat(test_args): expectation_success: ncol(this_seurat) > 0 is not TRUE
+test_seurat_common-cell-type,3346.826,11816.851,930.083,test_seurat(test_args): expectation_warning: Iteration results cannot be concatenated on its entirety because array has non-zero elements greater than '.Machine$integer.max'. ; test_seurat(test_args): Error: Error in `vec_to_Array(x, type)`: long vectors not supported yet: memory.c:3888
+test_seurat_common-cell-type-large-buffer-size,3437.291,28305.89,5534.148,test_seurat(test_args): expectation_warning: Iteration results cannot be concatenated on its entirety because array has non-zero elements greater than '.Machine$integer.max'. ; test_seurat(test_args): Error: Error in `vec_to_Array(x, type)`: long vectors not supported yet: memory.c:3888
+test_seurat_whole-enchilada-large-buffer-size,0.00399999999717693,0.00099999998928979,0.00500000000101863,expect_true(TRUE): expectation_success: TRUE is not TRUE
+```
diff --git a/api/r/cellxgene.census/tests/installation/README.md b/api/r/cellxgene.census/tests/installation/README.md
index fce0cc870..87dba73e4 100644
--- a/api/r/cellxgene.census/tests/installation/README.md
+++ b/api/r/cellxgene.census/tests/installation/README.md
@@ -1,6 +1,8 @@
+# Installation
+
 This Dockerfile & script are used to test the installation instructions for our `cellxgene.census` R package in a clean environment (in contrast to developer machines which tend to have a lot of unrelated packages and other miscellaneous state). Simply:
-```
+```shell
 docker build -t cellxgene_census_r_install_test api/r/cellxgene.census/tests/installation
 ```
diff --git a/docs/README.md b/docs/README.md
index d0dea9442..c61956e6b 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -1,6 +1,6 @@
 # API Documentation
-The documentation website is currently hosted on https://chanzuckerberg.github.io/cellxgene-census/.
+The documentation website is currently hosted on <https://chanzuckerberg.github.io/cellxgene-census/>.
 The documentation site is rebuilt each time a tag is created on the repo, which happens on release, including regenerating the Sphinx Python API docs. The R `pkgdown` website is checked into git and simply copied in during the doc site rebuild; see [`api/r/cellxgene.census/vignettes_/`](https://github.com/chanzuckerberg/cellxgene-census/tree/main/api/r/cellxgene.census/vignettes_) for further explanation.
@@ -8,7 +8,7 @@ A docsite rebuild can be [triggered manually through `workflow_dispatch`](https:
 In order to test docsite changes locally, first install the necessary requirements:
-```
+```shell
 pip install -r docs/requirements.txt
 brew install pandoc # Mac OS
 ```
@@ -17,9 +17,9 @@ Then,
 And then run the following command:
-```
+```shell
 cd docs
 make html
 ```
-The generated docsite will then be found at `docs/_build/html/index.html`.
\ No newline at end of file
+The generated docsite will then be found at `docs/_build/html/index.html`.
diff --git a/docs/articles/2023/20230808-r_api_release.md b/docs/articles/2023/20230808-r_api_release.md
index 18fd2f51b..2b2b0259f 100644
--- a/docs/articles/2023/20230808-r_api_release.md
+++ b/docs/articles/2023/20230808-r_api_release.md
@@ -1,18 +1,17 @@
 # R package `cellxgene.census` V1 is out!
-*Published: August 7th, 2023*
+*Published:* *August 7th, 2023*
-*By: [Pablo Garcia-Nieto](pgarcia-nieto@chanzuckerberg.com)*
+*By:* *[Pablo Garcia-Nieto](pgarcia-nieto@chanzuckerberg.com)*
-The Census team is pleased to announce the release of the R package `cellxgene.census`. 🎉 🎉 
+The Census team is pleased to announce the release of the R package `cellxgene.census`. 🎉 🎉
 This has been long coming since our Python release back in May. Now, from R, computational biologists can access the Census data which is the largest standardized aggregation of single-cell data, composed of >33M cells and >60K genes.
-
+
 With `cellxgene.census` in a few seconds users can access any slice of Census data using cell or gene filters across hundreds of datasets. The data can be fetched in an iterative fashion for bigger-than-memory slices of data, or quickly exported to basic R structures, and [Seurat](https://satijalab.org/seurat/) or [SingleCellExperiment](https://bioconductor.org/packages/release/bioc/html/SingleCellExperiment.html) for downstream analysis.
 ![image](20230808-r_api_release.svg)
-
 ## Installation and usage
 Users can install `cellxgene.census` and its dependencies following the [installation instructions](../../cellxgene_census_docsite_installation.md).
@@ -25,7 +24,7 @@ To learn more about the package please make sure to check out the following reso
 ## Census R package is made possible by `tiledbsoma`
-The `cellxgene.census` package relies on [TileDB-SOMA](https://github.com/single-cell-data/TileDB-SOMA) R's package `tiledbsoma` for all of its data access capabilities as shown in the next section. 
+The `cellxgene.census` package relies on [TileDB-SOMA](https://github.com/single-cell-data/TileDB-SOMA) R's package `tiledbsoma` for all of its data access capabilities as shown in the next section.
 CZI and TileDB have worked closely on the development of `tiledbsoma` and recently upgraded it from beta to its first stable version. Release notes can be found [here](https://github.com/single-cell-data/TileDB-SOMA/releases/tag/1.4.0).
@@ -37,7 +36,6 @@ Census data are accompanied by cell and gene metadata that have been standardize
 With the `cellxgene.census` R package, researchers can have access to all of these data and metadata directly from an R session with the following capabilities:
-
 ### Easy-to-use handles to the cloud-hosted Census data
 From R users can get a handle to the data by opening the Census.
@@ -50,7 +48,7 @@ census <- open_soma()
 # Your work!
 census$close()
-``` 
+```
 ### Querying and reading single-cell metadata from Census
diff --git a/docs/cellxgene_census_docsite_FAQ.md b/docs/cellxgene_census_docsite_FAQ.md
index cf2c02877..25433d037 100644
--- a/docs/cellxgene_census_docsite_FAQ.md
+++ b/docs/cellxgene_census_docsite_FAQ.md
@@ -16,7 +16,7 @@ Last updated: May, 2023.
 - [How can I ask for new features?](#how-can-i-ask-for-new-features)
 - [How can I contribute my data to the Census?](#how-can-i-contribute-my-data-to-the-census)
 - [Why do I get an `ArraySchema` error when opening the Census?](#why-do-i-get-an-arrayschema-error-when-opening-the-census)
-- [Why do I get an error when running `import cellxgene_census` on Databricks?](#why-do-i-get-an-error-when-running-import-cellxgene-census-on-databricks)
+- [Why do I get an error when running `import cellxgene_census` on Databricks?](#why-do-i-get-an-error-when-running-import-cellxgene_census-on-databricks)
 ## Why should I use the Census?
@@ -27,28 +27,25 @@ The Census provides efficient low-latency access via Python and R APIs to most s
 - Easily load multi-dataset slices into Scanpy or Seurat.
 - Implement out-of-core (a.k.a online) operations for larger-than-memory processes.
-
-For example, a user can easily get “*all T-cells from Lung with COVID-19*” into [AnnData](https://anndata.readthedocs.io/en/latest/), [Seurat](https://satijalab.org/seurat/), or into memory-sufficient data chunks via [PyArrow](https://arrow.apache.org/docs/python/index.html) or [R Arrow](https://arrow.apache.org/docs/r/).
-
+For example, a user can easily get “*all T-cells from Lung with COVID-19*” into [AnnData](https://anndata.readthedocs.io/en/latest/), [Seurat](https://satijalab.org/seurat/), or into memory-sufficient data chunks via [PyArrow](https://arrow.apache.org/docs/python/index.html) or [R Arrow](https://arrow.apache.org/docs/r/).
 The Census is not suited for:
 - Access to non-standardized cell metadata and gene metadata available in the original [datasets](https://cellxgene.cziscience.com/datasets).
-- Access to the author-contributed normalized expression values or embeddings. 
+- Access to the author-contributed normalized expression values or embeddings.
 - Access to all data from just one dataset.
-- Access to non-RNA or spatial data present in CZ CELLxGENE Discover as it is not yet supported in the Census. 
+- Access to non-RNA or spatial data present in CZ CELLxGENE Discover as it is not yet supported in the Census.
-If you’d like to perform any of the above tasks, you can access web downloads directly from the [CZ CELLxGENE Discover Datasets](https://cellxgene.cziscience.com/datasets) feature. [Click here](https://cellxgene.cziscience.com/docs/03__Download%20Published%20Data) for more information about downloading published data on CELLxGENE Discover. 
+If you’d like to perform any of the above tasks, you can access web downloads directly from the [CZ CELLxGENE Discover Datasets](https://cellxgene.cziscience.com/datasets) feature. [Click here](https://cellxgene.cziscience.com/docs/03__Download%20Published%20Data) for more information about downloading published data on CELLxGENE Discover.
 ## What data is contained in the Census?
-Most RNA non-spatial data from [CZ CELLxGENE Discover](https://cellxgene.cziscience.com/) is included. You can see a general description of these data and their organization in the [schema description](cellxgene_census_docsite_schema.md) or you can use the APIs to explore the data as indicated in this [tutorial](notebooks/analysis_demo/comp_bio_census_info.ipynb). 
+Most RNA non-spatial data from [CZ CELLxGENE Discover](https://cellxgene.cziscience.com/) is included. You can see a general description of these data and their organization in the [schema description](cellxgene_census_docsite_schema.md) or you can use the APIs to explore the data as indicated in this [tutorial](notebooks/analysis_demo/comp_bio_census_info.ipynb).
 ## How do I cite the use of the Census for a publication?
 Please follow the [citation guidelines](https://cellxgene.cziscience.com/docs/08__Cite%20cellxgene%20in%20your%20publications) offered by CZ CELLxGENE Discover.
-
 ## Why does the Census not have a normalized layer or embeddings?
 The Census does not have normalized counts or embeddings because:
@@ -62,13 +59,12 @@ If you have any suggestions for methods that our team should explore, please sha
 The Census differentiates from existing single-cell tools by providing fast, efficient access to the largest corpus of standardized single-cell data from CZ CELLxGENE Discover via [TileDB-SOMA](https://github.com/single-cell-data/TileDB-SOMA/issues/new/choose). Thus, single-cell data from about 33M unique cells (50M total) across >60 K genes, with 11 standardized cell metadata variables and harmonized GENCODE annotations are ready for:
-* Opening and reading data at low latency from the cloud.
-* Querying and accessing data using metadata filters.
-* Loading and creating AnnData objects.
-* Loading and creating Seurat objects.
-* From Python, creating PyArrow objects, SciPy sparse matrices, NumPy arrays, and Pandas data frames.
-* From R, creating R Arrow objects, sparse matrices (via the Matrix package), and standard data frames and (dense) matrices.
-
+- Opening and reading data at low latency from the cloud.
+- Querying and accessing data using metadata filters.
+- Loading and creating AnnData objects.
+- Loading and creating Seurat objects.
+- From Python, creating PyArrow objects, SciPy sparse matrices, NumPy arrays, and Pandas data frames.
+- From R, creating R Arrow objects, sparse matrices (via the Matrix package), and standard data frames and (dense) matrices.
 ## Can I query human and mouse data in a single query?
@@ -125,7 +121,7 @@ You can submit a [feature request in the github repository](https://github.com/c
 ## How can I contribute my data to the Census?
-To inquire about submitting your data to CZ CELLxGENE Discover, [click here](https://cellxgene.cziscience.com/docs/032__Contribute%20and%20Publish%20Data). If your data request is accepted, the data will automatically be included in the Census if it meets the [biological criteria defined in the Census schema](https://github.com/chanzuckerberg/cellxgene-census/blob/main/docs/cellxgene_census_schema.md#data-included). 
+To inquire about submitting your data to CZ CELLxGENE Discover, [click here](https://cellxgene.cziscience.com/docs/032__Contribute%20and%20Publish%20Data). If your data request is accepted, the data will automatically be included in the Census if it meets the [biological criteria defined in the Census schema](https://github.com/chanzuckerberg/cellxgene-census/blob/main/docs/cellxgene_census_schema.md#data-included).
 ## Why do I get an `ArraySchema` error when opening the Census?
@@ -135,20 +131,25 @@ If the error persists please file a [github issue](https://github.com/chanzucker
 ## Why do I get an error when running `import cellxgene_census` on Databricks?
-This can occur if the `cellxgene_census` Python package is installed in a Databricks notebook using `%sh pip install cellxgene_census`. This command does _not_ restart the Python process after installing `cellxgene_census` and any pip package dependencies that were pre-installed by the Databricks Runtime environment but upgraded for `cellxgene_census` will not be reloaded with their new version. You may see `numba` or `pyarrow` related errors, for example.
+This can occur if the `cellxgene_census` Python package is installed in a Databricks notebook using `%sh pip install cellxgene_census`. This command does *not* restart the Python process after installing `cellxgene_census` and any pip package dependencies that were pre-installed by the Databricks Runtime environment but upgraded for `cellxgene_census` will not be reloaded with their new version. You may see `numba` or `pyarrow` related errors, for example.
 To fix, simply install using one of the following Databricks notebook "magic" commands:
-```
+
+```shell
 pip install -U cellxgene-census
 ```
+
 or
-```
+
+```shell
 %pip install -U cellxgene-census
 ```
-These commands restart the Python process after installing the `cellxgene-census` package, similar to using `dbutils.library.restartPython()`. Additionally, these magic commands also ensure that the package is installed on all nodes of a multi-node cluster.
+
+These commands restart the Python process after installing the `cellxgene-census` package, similar to using `dbutils.library.restartPython()`. Additionally, these magic commands also ensure that the package is installed on all nodes of a multi-node cluster.
 See also:
-* https://docs.databricks.com/libraries/notebooks-python-libraries.html#can-i-use-sh-pip-pip-or-pip-what-is-the-difference
-* https://community.databricks.com/s/question/0D53f00001GHVP3CAP/whats-the-difference-between-magic-commands-pip-and-sh-pip
+
+- <https://docs.databricks.com/libraries/notebooks-python-libraries.html#can-i-use-sh-pip-pip-or-pip-what-is-the-difference>
+- <https://community.databricks.com/s/question/0D53f00001GHVP3CAP/whats-the-difference-between-magic-commands-pip-and-sh-pip>
 Alternately, you can configure your cluster to install the `cellxgene-census` package each time it is started by adding this package to the "Libraries" tab on the cluster configuration page per these [instructions](https://docs.databricks.com/libraries/cluster-libraries.html).
diff --git a/docs/cellxgene_census_docsite_data_release_info.md b/docs/cellxgene_census_docsite_data_release_info.md
index 0443a3ad1..b78071c10 100644
--- a/docs/cellxgene_census_docsite_data_release_info.md
+++ b/docs/cellxgene_census_docsite_data_release_info.md
@@ -1,27 +1,26 @@
-# Census data releases 
+# Census data releases
 **Last edited**: July 7th, 2023.
-**Contents**
+**Contents:**
-1. [What is a Census data release?](#What-is-a-Census-data-release)
-2. [List of LTS Census data releases](#List-of-LTS-Census-data-releases)
+1. [What is a Census data release?](#what-is-a-census-data-release)
+2. [List of LTS Census data releases](#list-of-lts-census-data-releases)
 ## What is a Census data release?
-It is a Census build that is publicly hosted online. A Census build is 
-a [TileDB-SOMA](https://github.com/single-cell-data/TileDB-SOMA) collection with the Census data from [CZ CELLxGENE Discover](https://cellxgene.cziscience.com/) as specified in the [Census schema](cellxgene_census_docsite_schema.md). 
+It is a Census build that is publicly hosted online. A Census build is
+a [TileDB-SOMA](https://github.com/single-cell-data/TileDB-SOMA) collection with the Census data from [CZ CELLxGENE Discover](https://cellxgene.cziscience.com/) as specified in the [Census schema](cellxgene_census_docsite_schema.md).
 Any given Census build is named with a unique tag, normally the date of build, e.g., `"2023-05-15"`.
-
 ### Long-term supported (LTS) Census releases
 To enable data stability and scientific reproducibility, [CZ CELLxGENE Discover](https://cellxgene.cziscience.com/) plans to perform regular LTS Census data releases:
 * Published online every six months for public access, starting on May 15, 2023.
 * Available for public access for at least 5 years upon publication.
-
+
 The most recent LTS Census data release is the default opened by the APIs and recognized as `census_version = "stable"`. To open previous LTS Census data releases, you can directly specify the version via its build date `census_version = "[YYYY]-[MM]-[DD]"`.
 Python
@@ -68,7 +67,6 @@ Open this data release by specifying `census_version = "2023-07-25"` in future c
 #### Version information
-
 | Information | Value |
 |-----------------------------------|------------|
 | Census schema version | [1.0.0](https://github.com/chanzuckerberg/cellxgene-census/blob/f06bcebb6471735681fd84734d2d581c44e049e7/docs/cellxgene_census_schema.md) |
@@ -76,17 +74,14 @@ Open this data release by specifying `census_version = "2023-07-25"` in future c
 | Dataset schema version | [3.0.0](https://github.com/chanzuckerberg/single-cell-curation/blob/a64ac9eb70e3e777ee34098ae82120c2d21692b0/schema/3.0.0/schema.md) |
 | Number of datasets | 593 |
-
 #### Cell and donor counts
 | Type | _Homo sapiens_ | _Mus musculus_ |
 |-------------------|----------------|----------------|
-| Total cells | 56,400,873 | 5,255,245 | 
+| Total cells | 56,400,873 | 5,255,245 |
 | Unique cells | 33,364,242 | 4,083,531 |
 | Number of donors | 13,035 | 1,417 |
-
-
 #### Cell metadata
 | Category | _Homo sapiens_ | _Mus musculus_ |
@@ -99,13 +94,13 @@ Open this data release by specifying `census_version = "2023-07-25"` in future c
 | Sex | 3 | 3 |
 | Suspension type | 2 | 2 |
 | Tissue | 220 | 66 |
-| Tissue general | 54 | 27 | 
+| Tissue general | 54 | 27 |
 ### LTS 2023-05-15
 Open this data release by specifying `census_version = "2023-05-15"` in future calls to `open_soma()`.
-#### 🔴 Errata 🔴 
+#### 🔴 Errata 🔴
 ##### Duplicate observations with `is_primary_data = True`
@@ -113,12 +108,10 @@ In order to prevent duplicate data in analyses, each observation (cell) should b
 This issue will be corrected in the following LTS data release, by identifying and marking only one cell out of the duplicates as `is_primary_data = True`.
-If you wish to use this data release, you can consider filtering out all of these 243,569 cells by using the `soma_joinids` provided in this file [duplicate_cells_census_LTS_2023-05-15.csv.zip](https://github.com/chanzuckerberg/cellxgene-census/raw/773edab79bbdc78eccb26ec4f8211a9b4c98a71a/tools/cell_dup_check/duplicate_cells_census_LTS_2023-05-15.csv.zip). You can filter specific cells by using the `value_filter` or `obs_value_filter` of the querying API functions, for more information follow this [tutorial](https://chanzuckerberg.github.io/cellxgene-census/notebooks/api_demo/census_query_extract.html).
-
+If you wish to use this data release, you can consider filtering out all of these 243,569 cells by using the `soma_joinids` provided in this file [duplicate_cells_census_LTS_2023-05-15.csv.zip](https://github.com/chanzuckerberg/cellxgene-census/raw/773edab79bbdc78eccb26ec4f8211a9b4c98a71a/tools/cell_dup_check/duplicate_cells_census_LTS_2023-05-15.csv.zip). You can filter specific cells by using the `value_filter` or `obs_value_filter` of the querying API functions, for more information follow this [tutorial](https://chanzuckerberg.github.io/cellxgene-census/notebooks/api_demo/census_query_extract.html).
 #### Version information
-
 | Information | Value |
 |-----------------------------------|------------|
 | Census schema version | [1.0.0](https://github.com/chanzuckerberg/cellxgene-census/blob/f06bcebb6471735681fd84734d2d581c44e049e7/docs/cellxgene_census_schema.md) |
@@ -126,17 +119,14 @@ If you wish to use this data release, you can consider filtering out all of thes
 | Dataset schema version | [3.0.0](https://github.com/chanzuckerberg/single-cell-curation/blob/a64ac9eb70e3e777ee34098ae82120c2d21692b0/schema/3.0.0/schema.md) |
 | Number of datasets | 562 |
-
 #### Cell and donor counts
 | Type | _Homo sapiens_ | _Mus musculus_ |
 |-------------------|----------------|----------------|
-| Total cells | 53,794,728 | 4,086,032 | 
+| Total cells | 53,794,728 | 4,086,032 |
 | Unique cells | 33,758,887 | 2,914,318 |
 | Number of donors | 12,493 | 1,362 |
-
-
 #### Cell metadata
 | Category | _Homo sapiens_ | _Mus musculus_ |
@@ -149,4 +139,4 @@ If you wish to use this data release, you can consider filtering out all of thes
 | Sex | 3 | 3 |
 | Suspension type | 2 | 2 |
 | Tissue | 227 | 51 |
-| Tissue general | 61 | 27 | 
+| Tissue general | 61 | 27 |
diff --git a/docs/cellxgene_census_docsite_installation.md b/docs/cellxgene_census_docsite_installation.md
index 4abc2b9c6..7a337c264 100644
--- a/docs/cellxgene_census_docsite_installation.md
+++ b/docs/cellxgene_census_docsite_installation.md
@@ -1,4 +1,4 @@
-# Installation 
+# Installation
 ## Requirements
@@ -6,10 +6,9 @@ The Census API requires a Linux or MacOS system with:
 - Python 3.8 to Python 3.11. Or R, supported versions TBD.
 - Recommended: >16 GB of memory.
-- Recommended: >5 Mbps internet connection. 
+- Recommended: >5 Mbps internet connection.
 - Recommended: for increased performance use the API through a AWS-EC2 instance from the region `us-west-2`. The Census data builds are hosted in a AWS-S3 bucket in that region.
-
 ## Python (Optional)
 In your working directory, make and activate a virtual environment or conda environment. For example:
@@ -40,7 +39,7 @@ From an R session, first install `tiledb` from R-Universe, the latest release in
 ```r
 install.packages(
   "cellxgene.census",
-  repos=c('https://chanzuckerberg.r-universe.dev', 'https://cloud.r-project.org') 
+  repos=c('https://chanzuckerberg.r-universe.dev', 'https://cloud.r-project.org')
 )
 ```
@@ -53,6 +52,6 @@ install.packages("Seurat")
 # SingleCellExperiment
 if (!require("BiocManager", quietly = TRUE))
     install.packages("BiocManager")
- 
+
 BiocManager::install("SingleCellExperiment")
 ```
diff --git a/docs/cellxgene_census_docsite_landing.md b/docs/cellxgene_census_docsite_landing.md
index 7ed2f0023..dff23e948 100644
--- a/docs/cellxgene_census_docsite_landing.md
+++ b/docs/cellxgene_census_docsite_landing.md
@@ -12,7 +12,6 @@ Get started:
 - [R tutorials](https://chanzuckerberg.github.io/cellxgene-census/r/articles/)
 - [Github repository](https://github.com/chanzuckerberg/cellxgene-census)
-
 ![image](cellxgene_census_docsite_workflow.svg)
 ## Citing the Census
@@ -23,56 +22,54 @@ Please follow the [citation guidelines](https://cellxgene.cziscience.com/docs/08
 The Census is a data object publicly hosted online and an API to open it. The object is built using the [SOMA](https://github.com/single-cell-data/SOMA) API specification and data model, and it is implemented via [TileDB-SOMA](https://github.com/single-cell-data/TileDB-SOMA). As such, the Census has all the data capabilities offered by TileDB-SOMA including:
-**Data access at scale**
+**Data access at scale:**
 - Cloud-based data access.
 - Efficient access for larger-than-memory slices of data.
 - Query and access data based on cell or gene metadata at low latency.
-**Interoperability with existing single-cell toolkits**
+**Interoperability with existing single-cell toolkits:**
 - Load and create [AnnData](https://anndata.readthedocs.io/en/latest/) objects.
 - Load and create [Seurat](https://satijalab.org/seurat/) objects.
 - Load and create [SingleCellExperiment](https://bioconductor.org/packages/release/bioc/html/SingleCellExperiment.html) objects.
-**Interoperability with existing Python or R data structures**
+**Interoperability with existing Python or R data structures:**
 - From Python create [PyArrow](https://arrow.apache.org/docs/python/index.html) objects, SciPy sparse matrices, NumPy arrays, and pandas data frames.
 - From R create [R Arrow](https://arrow.apache.org/docs/r/index.html) objects, sparse matrices (via the [Matrix](https://cran.r-project.org/package=Matrix) package), and standard data frames and (dense) matrices.
 ## Census Data and Schema
-A description of the Census data and its schema is detailed [here](cellxgene_census_docsite_schema.md). 
+A description of the Census data and its schema is detailed [here](cellxgene_census_docsite_schema.md).
-⚠️ Note that the data includes: 
+⚠️ Note that the data includes:
-* **Full-gene sequencing read counts** (e.g. Smart-Seq2) and **molecule counts** (e.g. 10X).
-* **Duplicate cells** present across multiple datasets, these can be filtered in or out using the cell metadata variable `is_primary_data`.
+- **Full-gene sequencing read counts** (e.g. Smart-Seq2) and **molecule counts** (e.g. 10X).
+- **Duplicate cells** present across multiple datasets, these can be filtered in or out using the cell metadata variable `is_primary_data`.
 ## Census Data Releases
-The Census data release plans are detailed [here](cellxgene_census_docsite_data_release_info.md). 
+The Census data release plans are detailed [here](cellxgene_census_docsite_data_release_info.md).
 Starting May 15th, 2023, Census data releases with long-term support will be published every six months. These releases will be publicly accessible for at least five years. In addition, weekly releases may be published without any guarantee of permanence.
-
 ## Questions, Feedback and Issues
 - Users are encouraged to submit questions and feature requests about the Census via [github issues](https://github.com/chanzuckerberg/cellxgene-census/issues).
 - For quick support, you can join the CZI Science Community on Slack ([czi.co/science-slack](https://czi.co/science-slack)) and ask questions in the `#cellxgene-census-users` channel.
 - Users are encouraged to share their feedback by emailing .
-- Bugs can be submitted via [github issues](https://github.com/chanzuckerberg/cellxgene-census/issues). 
+- Bugs can be submitted via [github issues](https://github.com/chanzuckerberg/cellxgene-census/issues).
-- If you believe you have found a security issue, please disclose it by contacting . 
+- If you believe you have found a security issue, please disclose it by contacting .
 - Additional FAQs can be found [here](cellxgene_census_docsite_FAQ.md).
-
 ## Coming Soon!
 - We are currently working on creating the tooling necessary to perform data modeling at scale with seamless integration of the Census and [PyTorch](https://pytorch.org/).
 - To increase the usability of the Census for research, in 2023 and 2024 we are planning to explore the following areas:
-  - Include organism-wide normalized layers. 
-  - Include organism-wide embeddings. 
-  - On-demand information-rich subsampling. 
+  - Include organism-wide normalized layers.
+  - Include organism-wide embeddings.
+  - On-demand information-rich subsampling.
 ## Projects and Tools Using Census
diff --git a/docs/cellxgene_census_docsite_quick_start.md b/docs/cellxgene_census_docsite_quick_start.md
index a3684c50b..1b0d4f854 100644
--- a/docs/cellxgene_census_docsite_quick_start.md
+++ b/docs/cellxgene_census_docsite_quick_start.md
@@ -2,7 +2,7 @@
 This page provides details to start using the Census. Click [here](examples.rst) for more detailed Python tutorials (R vignettes coming soon).
-**Contents**
+**Contents:**
 1. [Installation](#installation).
 2. [Python quick start](python-quick-start).
@@ -10,7 +10,6 @@ This page provides details to start using the Census. Click [here](examples.rst)
 ## Installation
-
 Install the Census API by following [these instructions.](cellxgene_census_docsite_installation.md)
 ## Python quick start
@@ -25,7 +24,7 @@ help(cellxgene_census.get_anndata)
 # etc
 ```
-### Querying a slice of cell metadata.
+### Querying a slice of cell metadata
 The following reads the cell metadata and filters `female` cells of cell type `microglial cell` or `neuron`, and selects the columns `assay`, `cell_type`, `tissue`, `tissue_general`, `suspension_type`, and `disease`.
 ```python
 import cellxgene_census
 with cellxgene_census.open_soma() as census:
-    
+
     # Reads SOMADataFrame as a slice
     cell_metadata = census["census_data"]["homo_sapiens"].obs.read(
         value_filter = "sex == 'female' and cell_type in ['microglial cell', 'neuron']",
         column_names = ["assay", "cell_type", "tissue", "tissue_general", "suspension_type", "disease"]
     )
-    
+
     # Concatenates results to pyarrow.Table
     cell_metadata = cell_metadata.concat()
-    
+
     # Converts to pandas.DataFrame
     cell_metadata = cell_metadata.to_pandas()
-    
+
     print(cell_metadata)
 ```
@@ -69,7 +68,7 @@ The "stable" release is currently 2023-07-25. Specify 'census_version="2023-07-2
 [379224 rows x 7 columns]
 ```
-### Obtaining a slice as AnnData 
+### Obtaining a slice as AnnData
 The following creates an `anndata.AnnData` object on-demand with the same cell filtering criteria as above and filtering only the genes `ENSG00000161798`, `ENSG00000188229`.
@@ -84,7 +83,7 @@ with cellxgene_census.open_soma() as census:
         obs_value_filter = "sex == 'female' and cell_type in ['microglial cell', 'neuron']",
         column_names = {"obs": ["assay", "cell_type", "tissue", "tissue_general", "suspension_type", "disease"]},
     )
-    
+
     print(adata)
 ```
@@ -98,7 +97,7 @@ AnnData object with n_obs × n_vars = 379224 × 2
 ### Memory-efficient queries
-This example provides a demonstration to access the data for larger-than-memory operations using **TileDB-SOMA** operations. 
+This example provides a demonstration to access the data for larger-than-memory operations using **TileDB-SOMA** operations.
 First we initiate a lazy-evaluation query to access all brain and male cells from human. This query needs to be closed — `query.close()` — or called in a context manager — `with ...`.
 ```python
 import cellxgene_census
 import tiledbsoma
 with cellxgene_census.open_soma() as census:
-    
+
     human = census["census_data"]["homo_sapiens"]
     query = human.axis_query(
         measurement_name = "RNA",
@@ -115,7 +114,7 @@ with cellxgene_census.open_soma() as census:
         value_filter = "tissue == 'brain' and sex == 'male'"
         )
     )
-    
+
     # Continued below
 ```
@@ -123,12 +122,12 @@ with cellxgene_census.open_soma() as census:
 Now we can iterate over the matrix count, as well as the cell and gene metadata. For example, to iterate over the matrix count, we can get an iterator and perform operations for each iteration.
 ```python
-    # Continued from above 
-    
+    # Continued from above
+
     iterator = query.X("raw").tables()
-    
+
     # Get an iterative slice as pyarrow.Table
-    raw_slice = next (iterator) 
+    raw_slice = next (iterator)
     ...
 ```
 And you can now perform operations on each iteration slice. As with any any Pyth
 And you must close the query.
-```
+```python
 # Continued from above
 query.close()
 ```
@@ -151,11 +150,11 @@ library("cellxgene.census")
 ?cellxgene.census::get_seurat
 ```
-### Querying a slice of cell metadata.
+### Querying a slice of cell metadata The following reads the cell metadata and filters `female` cells of cell type `microglial cell` or `neuron`, and selects the columns `assay`, `cell_type`, `tissue`, `tissue_general`, `suspension_type`, and `disease`. -The `cellxgene.census` package uses [R6](https://r6.r-lib.org/articles/Introduction.html) classes and we recommend you to get familiar with their usage. +The `cellxgene.census` package uses [R6](https://r6.r-lib.org/articles/Introduction.html) classes and we recommend you to get familiar with their usage. ```r library("cellxgene.census") @@ -187,22 +186,22 @@ The output is a `tibble` with over 300K cells meeting our query criteria and the ```bash # A tibble: 379,224 × 7 assay cell_type sex tissue tissue_general suspension_type disease - - 1 10x 3' v3 microglial cell fema… eye eye cell normal - 2 10x 3' v3 microglial cell fema… eye eye cell normal - 3 10x 3' v3 microglial cell fema… eye eye cell normal - 4 10x 3' v3 microglial cell fema… eye eye cell normal - 5 10x 3' v3 microglial cell fema… eye eye cell normal - 6 10x 3' v3 microglial cell fema… eye eye cell normal - 7 10x 3' v3 microglial cell fema… eye eye cell normal - 8 10x 3' v3 microglial cell fema… eye eye cell normal - 9 10x 3' v3 microglial cell fema… eye eye cell normal -10 10x 3' v3 microglial cell fema… eye eye cell normal + + 1 10x 3' v3 microglial cell fema… eye eye cell normal + 2 10x 3' v3 microglial cell fema… eye eye cell normal + 3 10x 3' v3 microglial cell fema… eye eye cell normal + 4 10x 3' v3 microglial cell fema… eye eye cell normal + 5 10x 3' v3 microglial cell fema… eye eye cell normal + 6 10x 3' v3 microglial cell fema… eye eye cell normal + 7 10x 3' v3 microglial cell fema… eye eye cell normal + 8 10x 3' v3 microglial cell fema… eye eye cell normal + 9 10x 3' v3 microglial cell fema… eye eye cell normal +10 10x 3' v3 microglial cell fema… eye eye cell normal # ℹ 379,214 more rows # ℹ Use `print(n = ...)` to see more rows ``` -### 
Obtaining a slice as a `Seurat` or `SingleCellExperiment` object
+### Obtaining a slice as a `Seurat` or `SingleCellExperiment` object

 The following creates a Seurat object on-demand with a smaller set of cells and filtering only the genes `ENSG00000161798`, `ENSG00000188229`.

@@ -230,9 +229,9 @@ print(seurat_obj)

 The output with over 4K cells and 2 genes can now be used for downstream analysis using [Seurat](https://satijalab.org/seurat/).

-``` shell
-An object of class Seurat 
-2 features across 4744 samples within 1 assay 
+```shell
+An object of class Seurat
+2 features across 4744 samples within 1 assay
 Active assay: RNA (2 features, 0 variable features)
 ```

@@ -254,9 +253,9 @@ print(sce_obj)

 The output with over 4K cells and 2 genes can now be used for downstream analysis using the [Bioconductor ecosystem](https://bioconductor.org/packages/release/bioc/html/SingleCellExperiment.html).

-``` shell
-class: SingleCellExperiment 
-dim: 2 4744 
+```shell
+class: SingleCellExperiment
+dim: 2 4744
 metadata(0):
 assays(1): counts
 rownames(2): ENSG00000106034 ENSG00000107317
@@ -268,17 +267,16 @@ mainExpName: RNA
 altExpNames(0):
 ```

-
 ### Memory-efficient queries

-This example provides a demonstration to access the data for larger-than-memory operations using **TileDB-SOMA** operations. 
+This example demonstrates how to access the data for larger-than-memory operations using **TileDB-SOMA**.

 First we initiate a lazy-evaluation query to access all brain cells from male human donors. This query needs to be closed — `query$close()`.

 ```r
 library("cellxgene.census")
 library("tiledbsoma")
-  
+
 human <- census$get("census_data")$get("homo_sapiens")
 query <- human$axis_query(
   measurement_name = "RNA",
@@ -286,33 +284,31 @@ query <- human$axis_query(
     value_filter = "tissue == 'brain' & sex == 'male'"
   )
 )
-
-# Continued below
+# Continued below
 ```

 Now we can iterate over the matrix count, as well as the cell and gene metadata. 
For example, to iterate over the matrix count, we can get an iterator and perform operations for each iteration.

 ```r
-# Continued from above
+# Continued from above
 iterator <- query$X("raw")$tables()

 # For sparse matrices use query$X("raw")$sparse_matrix()

 # Get an iterative slice as an Arrow Table
-raw_slice <- iterator$read_next()
+raw_slice <- iterator$read_next()
 #...
 ```

-And you can now perform operations on each iteration slice. This logic can be wrapped around a `while()` loop and checking the iteration state by monitoring the logical output of `iterator$read_complete()`
+And you can now perform operations on each iteration slice. This logic can be wrapped in a `while()` loop, checking the iteration state by monitoring the logical output of `iterator$read_complete()`.

 And you must close the query and census.

-```
+```r
 # Continued from above
 query$close()
 census$close()
 ```
-
diff --git a/docs/cellxgene_census_docsite_schema.md b/docs/cellxgene_census_docsite_schema.md
index bd7d4477b..450862eec 100644
--- a/docs/cellxgene_census_docsite_schema.md
+++ b/docs/cellxgene_census_docsite_schema.md
@@ -1,12 +1,12 @@
 # Census data and schema

-This page provides a user-friendly overview of the Census contents and its schema, in case you are interested you can find the full schema specification [here](https://github.com/chanzuckerberg/cellxgene-census/blob/main/docs/cellxgene_census_schema.md). 
+This page provides a user-friendly overview of the Census contents and its schema. If you are interested, you can find the full schema specification [here](https://github.com/chanzuckerberg/cellxgene-census/blob/main/docs/cellxgene_census_schema.md).

-**Contents**
+**Contents:**

 1. [Schema](#schema)
 2. [Data included in the Census](#data-included-in-the-census)
-1. [SOMA objects](#soma-objects)
+3. [SOMA objects](#soma-objects)

 ## Schema

@@ -17,7 +17,7 @@ The Census is a collection of a variety of **[SOMA objects](#soma-objects)** org

 As you can see the Census data is a `SOMACollection` with two high-level items:

 1. `"census_info"` for the census summary info.
-2. `"census_data"` for the single-cell data and metadata. 
+2. `"census_data"` for the single-cell data and metadata.

 ### Census summary info `"census_info"`

@@ -25,7 +25,7 @@ A `SOMAcollection` with tables providing information of the census as a whole, i

 - `"summary"`: high-level information of this Census, e.g. build date, total cell count, etc.
 - `"datasets"`: A table with all datasets from CELLxGENE Discover used to create the Census.
--`"summary_cell_counts"`: Cell counts stratified by relevant cell metadata
+- `"summary_cell_counts"`: Cell counts stratified by relevant cell metadata

 ### Census single-cell data `"census_data"`

@@ -33,10 +33,10 @@ Data for each organism is stored in independent `SOMAExperiment` objects which a

 This is how the data is organized for one organism – *Homo sapiens*:

-* `["homo_sapiens"].obs`: Cell metadata
-* `["homo_sapiens"].ms["RNA"].X`: Data matrices, currently only raw counts exist `X["raw"]`
-* `["homo_sapiens"].ms["RNA"].var`: Gene Metadata
-* `["homo_sapiens"].ms["RNA"]["feature_dataset_presence_matrix"]`: a sparse boolean array indicating which genes were measured in each dataset.
+- `["homo_sapiens"].obs`: Cell metadata
+- `["homo_sapiens"].ms["RNA"].X`: Data matrices; currently only raw counts exist, `X["raw"]`
+- `["homo_sapiens"].ms["RNA"].var`: Gene metadata
+- `["homo_sapiens"].ms["RNA"]["feature_dataset_presence_matrix"]`: a sparse boolean array indicating which genes were measured in each dataset.

 ## Data included in the Census

@@ -47,22 +47,21 @@ All data from [CZ CELLxGENE Discover](https://cellxgene.cziscience.com/) that ad

 - Raw counts.
 - Only standardized cell and gene metadata as described in the CELLxGENE Discover dataset [schema](https://github.com/chanzuckerberg/single-cell-curation/blob/main/schema/3.0.0/schema.md).

-⚠️ Note that the data includes: 
+⚠️ Note that the data includes:

-* **Full-gene sequencing read counts** (e.g. Smart-Seq2) and **molecule counts** (e.g. 10X).
-* **Duplicate cells** present across multiple datasets, these can be filtered in or out using the cell metadata variable `is_primary_data`.
+- **Full-gene sequencing read counts** (e.g. Smart-Seq2) and **molecule counts** (e.g. 10X).
+- **Duplicate cells** present across multiple datasets; these can be filtered in or out using the cell metadata variable `is_primary_data`.

-
-## SOMA objects 
+## SOMA objects

 You can find the full SOMA specification [here](https://github.com/single-cell-data/SOMA/blob/main/abstract_specification.md#foundational-types).

 The following is a short description of the main SOMA objects used by the Census:

-* `DenseNDArray` is a dense, N-dimensional array, with offset (zero-based) integer indexing on each dimension.
-* `SparseNDArray` is the same as `DenseNDArray` but sparse, and supports point indexing (disjoint index access).
-* `DataFrame` is a multi-column table with a user-defined columns names and value types, with support for point indexing.
-* `Collection` is a persistent container of named SOMA objects.
-* `Experiment` is a class that represents a single-cell experiment. It always contains two objects:
-  * `obs`: a `DataFrame` with primary annotations on the observation axis.
-  * `ms`: a `Collection` of measurements, each composed of `X` matrices and axis annotation matrices or data frames (e.g. `var`, `varm`, `obsm`, etc).
\ No newline at end of file
+- `DenseNDArray` is a dense, N-dimensional array, with offset (zero-based) integer indexing on each dimension.
+- `SparseNDArray` is the same as `DenseNDArray` but sparse, and supports point indexing (disjoint index access).
+- `DataFrame` is a multi-column table with user-defined column names and value types, with support for point indexing.
+- `Collection` is a persistent container of named SOMA objects.
+- `Experiment` is a class that represents a single-cell experiment. It always contains two objects:
+  - `obs`: a `DataFrame` with primary annotations on the observation axis.
+  - `ms`: a `Collection` of measurements, each composed of `X` matrices and axis annotation matrices or data frames (e.g. `var`, `varm`, `obsm`, etc).
diff --git a/docs/cellxgene_census_schema.md b/docs/cellxgene_census_schema.md
index fa8d94a34..650d20b99 100644
--- a/docs/cellxgene_census_schema.md
+++ b/docs/cellxgene_census_schema.md
@@ -1,4 +1,4 @@
-# CZ CELLxGENE Discover Census Schema 
+# CZ CELLxGENE Discover Census Schema

 **Version**: 1.1.0

@@ -10,7 +10,6 @@ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "S

 The CZ CELLxGENE Discover Census, hereafter referred to as the Census, is a versioned data object and API for most of the single-cell data hosted at [CZ CELLxGENE Discover](https://cellxgene.cziscience.com/). To learn more about the Census visit the `chanzuckerberg/cellxgene-census` [github repository](https://github.com/chanzuckerberg/cellxgene-census).

-
 To better understand this document the reader should be familiar with the [CELLxGENE dataset schema](https://github.com/chanzuckerberg/single-cell-curation/blob/main/schema/3.0.0/schema.md) and [SOMA](https://github.com/single-cell-data/SOMA/blob/main/abstract_specification.md).

 ## Definitions

@@ -29,20 +28,18 @@ The following terms are used throughout this document:

 The Census Schema follows [Semver](https://semver.org/) for its versioning:

 * Major: any schema changes that make the Census incompatible with the Census API or SOMA API. 
Examples: - * Column deletion in Census `obs` - * Addition of new modality + * Column deletion in Census `obs` + * Addition of new modality * Minor: schema additions that are compatible with public API(s) and SOMA. Examples: - * New column to Census `obs` is added - * tissue/tissue_general mapping changes + * New column to Census `obs` is added + * tissue/tissue_general mapping changes * Patch: schema fixes. Examples: - * Editorial schema changes - + * Editorial schema changes Changes MUST be documented in the schema [Changelog](#changelog) at the end of this document. Census data releases are versioned separately from the schema. - ## Schema ### Data included @@ -57,7 +54,7 @@ The Census MUST only contain features (genes) with a [`feature_reference`](https #### Multi-species data constraints -Per the CELLxGENE dataset schema, [multi-species datasets MAY contain observations (cells) of a given organism and features (genes) of a different one](https://github.com/chanzuckerberg/single-cell-curation/blob/main/schema/3.0.0/schema.md#general-requirements), as defined in [`organism_ontology_term_id`](https://github.com/chanzuckerberg/single-cell-curation/blob/main/schema/3.0.0/schema.md#organism_ontology_term_id) and [`feature_reference`](https://github.com/chanzuckerberg/single-cell-curation/blob/main/schema/3.0.0/schema.md#feature_reference) respectively. +Per the CELLxGENE dataset schema, [multi-species datasets MAY contain observations (cells) of a given organism and features (genes) of a different one](https://github.com/chanzuckerberg/single-cell-curation/blob/main/schema/3.0.0/schema.md#general-requirements), as defined in [`organism_ontology_term_id`](https://github.com/chanzuckerberg/single-cell-curation/blob/main/schema/3.0.0/schema.md#organism_ontology_term_id) and [`feature_reference`](https://github.com/chanzuckerberg/single-cell-curation/blob/main/schema/3.0.0/schema.md#feature_reference) respectively. 
For a given multi-species dataset, the table below shows all possible combinations of organisms for both observations and features. For each combination, inclusion criteria for the Census is provided. @@ -222,7 +219,7 @@ an `assay_ontology_term_id` value from the list below: Additional Notes: -* EFO:0030026 "sci-Plex" is not included in spite of being RNA data. This assay has specific cell metadata that is not present in the CELLxGENE dataset schema, without these metadata the RNA data lacks proper context and thus may be misleading to include. +* EFO:0030026 "sci-Plex" is not included in spite of being RNA data. This assay has specific cell metadata that is not present in the CELLxGENE dataset schema, without these metadata the RNA data lacks proper context and thus may be misleading to include. * EFO:0009920 and EFO:0030062 "Slide-seq" and "Slide-seqV2", respectively, are not included as coverage is low compared to other assays and data lacks context without spatial metadata not included in CELLxGENE dataset schema. #### Data matrix types @@ -272,28 +269,28 @@ Census metadata MUST be stored as a `SOMADataFrame` with two columns: This `SOMADataFrame` MUST have the following rows: - + 1. Census schema version: - 1. label: `"census_schema_version"` - 1. value: Semver schema version. -1. Census build date: - 1. label: `"census_build_date"` - 1. value: The date this Census was built in ISO 8601 date format -1. Dataset schema version: - 1. label: `"dataset_schema_version"` - 1. value: The CELLxGENE Discover schema version of the source H5AD files. -1. Total number of cells included in this Census build: - 1. label: `"total_cell_count"` - 1. value: Cell count -1. Unique number of cells included in this Census build (is_primary_data == True) - 1. label: `"unique_cell_count"` - 1. value: Cell count -1. Number of human donors included in this Census build. Donors are guaranteed to be unique within datasets, not across all Census. - 1. 
label: `"number_donors_homo_sapiens"` - 1. value: Donor count -1. Number of mouse donors included in this Census build. Donors are guaranteed to be unique within datasets, not across all Census. - 1. label: `"number_donors_mus_musculus"` - 1. value: Donor count + 1. label: `"census_schema_version"` + 2. value: Semver schema version. +2. Census build date: + 1. label: `"census_build_date"` + 2. value: The date this Census was built in ISO 8601 date format +3. Dataset schema version: + 1. label: `"dataset_schema_version"` + 2. value: The CELLxGENE Discover schema version of the source H5AD files. +4. Total number of cells included in this Census build: + 1. label: `"total_cell_count"` + 2. value: Cell count +5. Unique number of cells included in this Census build (is_primary_data == True) + 1. label: `"unique_cell_count"` + 2. value: Cell count +6. Number of human donors included in this Census build. Donors are guaranteed to be unique within datasets, not across all Census. + 1. label: `"number_donors_homo_sapiens"` + 2. value: Donor count +7. Number of mouse donors included in this Census build. Donors are guaranteed to be unique within datasets, not across all Census. + 1. label: `"number_donors_mus_musculus"` + 2. value: Donor count An example of this `SOMADataFrame` is shown below: @@ -430,7 +427,7 @@ Summary cell counts grouped by organism and relevant cell metadata MUST be model ontology_term_id string - ID associated to instance of metadata (e.g. "UBERON:0002048" if category is "tissue"). "na" if none. + ID associated to instance of metadata (e.g. "UBERON:0002048" if category is "tissue"). "na" if none. total_cell_count @@ -672,11 +669,11 @@ For each organism the `SOMAExperiment` MUST contain the following: * Cell metadata – `census_obj["census_data"][organism].obs` – `SOMADataFrame` * Data – `census_obj["census_data"][organism].ms` – `SOMACollection`. 
This `SOMACollection` MUST only contain one `SOMAMeasurement` in `census_obj["census_data"][organism].ms["RNA"]` with the following:

-  * Matrix data – `census_obj["census_data"][organism].ms["RNA"].X` – `SOMACollection`. It MUST contain exactly one layer: 
-    * Count matrix – `census_obj["census_data"][organism].ms["RNA"].X["raw"]` – `SOMASparseNDArray` 
-    * Normalized count matrix – `census_obj["census_data"][organism].ms["RNA"].X["normalized"]` – `SOMASparseNDArray` 
-  * Feature metadata – `census_obj["census_data"][organism].ms["RNA"].var` – `SOMAIndexedDataFrame` 
-  * Feature dataset presence matrix – `census_obj["census_data"][organism].ms["RNA"]["feature_dataset_presence_matrix"]` – `SOMASparseNDArray` 
+  * Matrix data – `census_obj["census_data"][organism].ms["RNA"].X` – `SOMACollection`. It MUST contain the following layers:
+    * Count matrix – `census_obj["census_data"][organism].ms["RNA"].X["raw"]` – `SOMASparseNDArray`
+    * Normalized count matrix – `census_obj["census_data"][organism].ms["RNA"].X["normalized"]` – `SOMASparseNDArray`
+  * Feature metadata – `census_obj["census_data"][organism].ms["RNA"].var` – `SOMAIndexedDataFrame`
+  * Feature dataset presence matrix – `census_obj["census_data"][organism].ms["RNA"]["feature_dataset_presence_matrix"]` – `SOMASparseNDArray`

 #### Matrix Data, count (raw) matrix – `census_obj["census_data"][organism].ms["RNA"].X["raw"]` – `SOMASparseNDArray`

@@ -694,7 +691,7 @@ as `normalized[i,j] = X[i,j] / sum(X[i, ])`.

 The Census MUST only contain features with a [`feature_biotype`](https://github.com/chanzuckerberg/single-cell-curation/blob/main/schema/3.0.0/schema.md#feature_biotype) value of "gene".

-The [gene references are pinned](https://github.com/chanzuckerberg/single-cell-curation/blob/main/schema/3.0.0/schema.md#required-gene-annotations) as defined in the CELLxGENE dataset schema. 
+The [gene references are pinned](https://github.com/chanzuckerberg/single-cell-curation/blob/main/schema/3.0.0/schema.md#required-gene-annotations) as defined in the CELLxGENE dataset schema. The following columns MUST be included: @@ -884,31 +881,33 @@ Cell metadata MUST be encoded as a `SOMADataFrame` with the following columns: - ## Changelog ### Version 1.1.0 + * Adds `dataset_version_id` to "Census table of CELLxGENE Discover datasets – `census_obj["census_info"]["datasets"]`" * Add `X["normalized"]` layer * Add `nnz` and `n_measured_obs` columns to `ms["RNA"].var` dataframe * Add `nnz`, `n_measured_vars`, `raw_sum`, `raw_mean_nnz` and `raw_variance_nnz` columns to `obs` dataframe ### Version 1.0.0 + * Updates text to reflect official name: CZ CELLxGENE Discover Census. * Updates `census["census_info"]["summary"]` to reflect official name in the column `label`: - * From `"cell_census_build_date"` to `"census_build_date"`. - * From `"cell_census_schema_version"` to `"census_schema_version"`. + * From `"cell_census_build_date"` to `"census_build_date"`. + * From `"cell_census_schema_version"` to `"census_schema_version"`. * Adds the following row to `census["census_info"]["summary"]`: - * `"dataset_schema_version"` - + * `"dataset_schema_version"` ### Version 0.1.1 + * Adds clarifying text for "Feature Dataset Presence Matrix" ### Version 0.1.0 + * The "Dataset Presence Matrix" was renamed to "Feature Dataset Presence Matrix" and moved from `census_obj["census_data"][organism].ms["RNA"].varp["dataset_presence_matrix"]` to `census_obj["census_data"][organism].ms["RNA"]["feature_dataset_presence_matrix"]`. * Editorial: changes all double quotes in the schema to ASCII quotes 0x22. ### Version 0.0.1 -* Initial Census schema is published. +* Initial Census schema is published. 
diff --git a/docs/cellxgene_census_storage_and_release_policy.md b/docs/cellxgene_census_storage_and_release_policy.md index a7f252e6e..51c95b805 100644 --- a/docs/cellxgene_census_storage_and_release_policy.md +++ b/docs/cellxgene_census_storage_and_release_policy.md @@ -8,7 +8,7 @@ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "S ## Definitions -* **Census build**: a SOMA collection with the Census data [as specified in the Census schema](https://github.com/chanzuckerberg/cell-census/blob/main/docs/cell_census_schema.md#data-encoding-and-organization). +* **Census build**: a SOMA collection with the Census data [as specified in the Census schema](https://github.com/chanzuckerberg/cell-census/blob/main/docs/cell_census_schema.md#data-encoding-and-organization). * **Census source H5AD files**: the set of H5AD files used to create a Census build. * **Census release**: a Census build that is publicly hosted online. * **Census release `tag`**: a label for a Census release, it MUST be a string of printable ASCII characters. 
@@ -20,9 +20,9 @@ The following S3 bucket MUST be used as the root to store Census data:

 `s3://cellxgene-data-public/`

 Census data MUST be deposited under a folder named `cell-census` of the root S3 bucket:
- 
+
 `./cell-census`
- 
+
 All data related to a Census **release** MUST be deposited in a folder named with the **tag** of the release:

 `./cell-census/[tag]/`

@@ -31,26 +31,23 @@ The Census **release** MUST be deposited in a folder named `soma`:

 `./cell-census/[tag]/soma/`

-All Census **source h5ads** used to create a specific Census **release** MUST be copied into a folder named `h5ads`: 
+All Census **source h5ads** used to create a specific Census **release** MUST be copied into a folder named `h5ads`:

 `./cell-census/[tag]/h5ads/`

 ## Census release information `json`

-
 The publication date along with the full URI paths for the `soma` folder and the `h5ads` folder for all Census releases MUST be recorded in a `json` file with the following naming convention and structure, which will be used as a machine- and human-readable directory of available Census builds:

-
 `./cell-census/releases.json`

-* This file MUST be in `json` formats where the parent keys are release identifiers (alias or name). 
-* The alias `latest` MUST be present and MUST point to the **Weekly Census release**. 
+* This file MUST be in `json` format, where the parent keys are release identifiers (alias or name).
+* The alias `latest` MUST be present and MUST point to the **Weekly Census release**.
 * The prefix `V` MUST be used followed by an integer counter to label long-term supported Census releases, e.g. `V1`.
- -``` +```json { [release_alias]: [release_name|release_alias], - [release_name]: { #defines a given release + [release_name]: { #defines a given release “release_date”: [yyyy-mm-dd] #optional, ISO 8601 date, may be null “release_build”: [yyyy-mm-dd] #required, ISO 8601 date, date of Census build “soma”: { @@ -68,7 +65,7 @@ The publication date along with the full URI paths for the `soma` folder and the An example of this file is shown below: -``` +```json { "latest": "release-1" "release-1": { @@ -89,4 +86,3 @@ An example of this file is shown below: ... } ``` - diff --git a/docs/census_article_guidelines.md b/docs/census_article_guidelines.md index 7ac4380e4..73f76006f 100644 --- a/docs/census_article_guidelines.md +++ b/docs/census_article_guidelines.md @@ -9,7 +9,7 @@ The goals of these articles are to have: -* Master reference articles to link for other channels (e.g. slack, twitter). +* Master reference articles to link for other channels (e.g. slack, twitter). * One-stop place for users to have a historical view of Census developments and analysis. A great example of this approach is the [Apache Arrow Blog](https://arrow.apache.org/blog/). @@ -47,14 +47,12 @@ Immediately below the title, and date and author(s) should be added to the artic Example: -``` +```markdown *Published: 10 August 2023* *By: [John Smith](author1@chanzuckerberg.com), [Phil Scoot](author2@chanzuckerberg.com)* ``` - - ### Introduction Introductory text of 1-2 paragraphs must be included right underneath the date and authors. @@ -62,11 +60,10 @@ Introductory text of 1-2 paragraphs must be included right underneath the date a * It must provide a one paragraph summary of the article. * It must not contain an explanation of the Census. -Example: - +Example: > The Census team is pleased to announce the release of the R package `cellxgene.census`, this has been long coming since our Python release back in May. 
Now, from R, users can access the Census data, which is the largest harmonized aggregation of single-cell data, composed of >30M cells and >60K genes.
-> 
+>
> With `cellxgene.census` users can access the Census and slice the data using cell or gene filters across hundreds of datasets. Users can fetch the data in an iterative fashion for bigger-than-memory slices of data, or export to Seurat or SingleCellExperiment objects.

 ### Sections

@@ -77,22 +74,21 @@ The rest of the article content must be organized within sections:

 * The section title should be concise, self-explanatory.
 * The section's contents and presence or absence of sub-headers are left to the discretion of the writer.

+## Example article

-## Example article 

-```
+```markdown
 # R package cellxgene.census 1.0.0 is out

 *Published: 10 August 2023*

-*By: [Pablo Garcia-Nieto](pgarcia-nieto@chanzuckerberg.com)* 
+*By: [Pablo Garcia-Nieto](pgarcia-nieto@chanzuckerberg.com)*

 The Census team is pleased to announce the release of the R package `cellxgene.census`; this has been a long time coming since our Python release back in May.

 Now, from R, users can access the Census data, which is the largest harmonized aggregation of single-cell data, composed of >30M cells and >60K genes.
- 
+
 With `cellxgene.census` users can access the Census and slice the data using cell or gene filters across hundreds of datasets. 
Users can fetch the data in an iterative fashion for bigger-than-memory diff --git a/docs/census_notebook_guidelines.md b/docs/census_notebook_guidelines.md index f3b7661aa..115c8ccbe 100644 --- a/docs/census_notebook_guidelines.md +++ b/docs/census_notebook_guidelines.md @@ -1,6 +1,6 @@ # Census API notebook/vignette editorial guidelines -API demonstration code that is part of the documentation should be deposited here: +API demonstration code that is part of the documentation should be deposited here: - Python notebooks [`cellxgene-census/api/python/notebooks`](https://github.com/chanzuckerberg/cellxgene-census/tree/main/api/python/notebooks) - R vignettes [`cellxgene-census/api/r/CellCensus/vignettes`](https://github.com/chanzuckerberg/cellxgene-census/tree/main/api/r/cellxgene.census/vignettes) @@ -13,9 +13,9 @@ These assets are user-facing and are automatically rendered to the doc-sites and ### Title -* It must use the highest-level markdown header `#`. -* Unless needed, it should not contain any direct mentions of "Census". -* It should be concise, self-explanatory, and if possible indicate an action. +- It must use the highest-level markdown header `#`. +- Unless needed, it should not contain any direct mentions of "Census". +- It should be concise, self-explanatory, and if possible indicate an action. Examples: @@ -31,10 +31,10 @@ Examples: Introductory text must be included right underneath the title. -* It must provide a one paragraph summary of the notebook's goals. -* It must not contain an explanation of the Census. +- It must provide a one paragraph summary of the notebook's goals. +- It must not contain an explanation of the Census. -Examples: +Examples: :white_check_mark: @@ -46,20 +46,20 @@ Examples: > >This notebook shows you how to learn about the Census contents and how to query it. 
-### Table of Contents +### Table of Contents Immediately after the introduction a table of contents must be provided: -* It must be placed under the bolded word "**Contents**" . -* It must contain an ordered list of the second-level headers (`##`) used for [Sections](#sections). -* If necessary it may contain sub-lists corresponding to lower-level headers (`###`, etc) +- It must be placed under the bolded word "**Contents**" . +- It must contain an ordered list of the second-level headers (`##`) used for [Sections](#sections). +- If necessary it may contain sub-lists corresponding to lower-level headers (`###`, etc) Example: :white_check_mark: > **Contents** -> +> > 1. Learning about the lung data. > 2. Fetching all human lung data from the Census. > 3. Obtaining QC metrics for this data slice. @@ -68,20 +68,19 @@ Example: The rest of the notebook/vignette content must be organized within sections: -* The section title must use the second-level markdown header `##`. **This is important as the python doc-site renders these in the sidebar and in the full view of all notebooks.** -* The section title should be concise, self-explanatory, and if possible indicate an action. -* The section's contents and presence or absence of sub-headers are left to the discretion of the writer. -* The section's non-code content should be kept as succinct as possible. - +- The section title must use the second-level markdown header `##`. **This is important as the python doc-site renders these in the sidebar and in the full view of all notebooks.** +- The section title should be concise, self-explanatory, and if possible indicate an action. +- The section's contents and presence or absence of sub-headers are left to the discretion of the writer. +- The section's non-code content should be kept as succinct as possible. -## Example notebook/vignette +## Example notebook/vignette -``` +```markdown # Integrating data with SCVI. 
-This notebook provides a demonstration for integrating two -Census datasets using `scvi-tools`. The goal is not to -provide an exhaustive guide on proper integration, but to showcase +This notebook provides a demonstration for integrating two +Census datasets using `scvi-tools`. The goal is not to +provide an exhaustive guide on proper integration, but to showcase what information in the Census can inform data integration. **Contents** @@ -104,20 +103,20 @@ Let's load all modules needed for this notebook. import numpy as np import scvi from scipy.sparse import csr_matrix -\code +\code -Now we can open the Census +Now we can open the Census -\code +\code census = cellxgene_census.open_soma(census_version="latest") \code -In this notebook we will use Tabula Muris Senis data -from the liver as it contains cells from both 10X +In this notebook we will use Tabula Muris Senis data +from the liver as it contains cells from both 10X Genomics and Smart-Seq2 technologies. -Let's query the datasets table of the Census by -filtering on collection_name for "Tabula Muris Senis" +Let's query the datasets table of the Census by +filtering on collection_name for "Tabula Muris Senis" and dataset_title for "liver". [...] diff --git a/tools/cell_dup_check/README.md b/tools/cell_dup_check/README.md index 491d8729d..9dbe512b2 100644 --- a/tools/cell_dup_check/README.md +++ b/tools/cell_dup_check/README.md @@ -1,6 +1,5 @@ +# To install and run from source: -To install and run from source: * clone the repo and/or create a copy of this directory * create a python venv, and install the packages listed in requirements.txt. 
E.g., `pip install -r requirements.txt` * open the notebook, set the config variables, run all cells - diff --git a/tools/cellxgene_census_builder/README.md b/tools/cellxgene_census_builder/README.md index b641edd21..f523db570 100644 --- a/tools/cellxgene_census_builder/README.md +++ b/tools/cellxgene_census_builder/README.md @@ -31,7 +31,7 @@ This will perform four steps (more will be added the future): This will result in the following file tree: -``` +```text working_dir: | +-- config.yaml # build config (user provided, read-only) @@ -90,7 +90,7 @@ This is primarily for the use of package developers. The defaults are suitable f If you need to override a default, create `config.yaml` in the build working directory and specify the overrides. An example `config.yaml` might look like: -``` +```yaml verbose: 2 # debug level logging consolidate: false # disable TileDB consolidation ``` @@ -154,11 +154,13 @@ If you run out of memory, reduce `--max-workers`. You can also try a higher numb #### Mode (b) - creating a Census from a user-provided list of H5AD files: - Create a manifest file, in CSV format, containing two columns: dataset_id, h5ad_uri. Example: + ```csv 53d208b0-2cfd-4366-9866-c3c6114081bc, /files/53d208b0-2cfd-4366-9866-c3c6114081bc.h5ad 559ed814-a9c9-4b77-a0e6-7da7b907fe3a, /files/559ed814-a9c9-4b77-a0e6-7da7b907fe3a.h5ad 5b93b8fc-7c9a-45bd-ad3f-dc883137de30, /files/5b93b8fc-7c9a-45bd-ad3f-dc883137de30.h5ad ``` + You can specify a file system path or a URI in the second field - To create a Census at ``, execute: > $ python -m cellxgene_census_builder build --manifest