Scaden paper test data #125

Open
jth3galv opened this issue Oct 10, 2023 · 1 comment

Comments

@jth3galv

Hi Kevin, sorry to bother you here and misuse GitHub :) but I did not know how to reach you.

I am working with scaden and I need to do some tests.
I am struggling to obtain the test data to reproduce the results of your paper.

Would it be possible to share them?

Thanks!

Giulio

@WanderingHedgie

WanderingHedgie commented May 2, 2024

Hi jth3galv!

I don't know if you're still searching, but I found the datasets used for Scaden training on the 10x Genomics website. I made a file listing all of them (the names and cell counts match the datasets described in Supplementary Table 1 of the article). I hope it will be useful to you!

Here is the file:

Information about the datasets used in the scaden method

1. General information

The datasets gathered for training are:

  • 6k_pbmc
  • 8k_pbmc
  • donorA
  • donorC

The corresponding filtered matrices were downloaded from the 10x Genomics website. Using the filtered versions avoids barcodes that should not be in the dataset due to errors.

Notes:

  • Some quality control has already been applied to the datasets (number of genes per cell, number of reads assigned to a barcode, etc.), but more may be needed.
  • The downloaded matrix files are:
    • matrix.mtx: sparse matrix of counts (X for AnnData)
    • genes.tsv: list of genes (var for AnnData)
    • barcodes.tsv: list of cells with barcodes (obs for AnnData)
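
To make the role of the three files concrete, here is a minimal sketch in Python (standard library only) that reads the size line of matrix.mtx and checks it against the number of lines in genes.tsv and barcodes.tsv. The function names (`check_10x_dir` and its helpers) are illustrative and not part of scaden or any 10x tool. It assumes the uncompressed three-file layout listed above, with genes as matrix rows and barcodes as matrix columns.

```python
from pathlib import Path


def count_lines(path):
    """Count the non-empty lines of a text file."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())


def read_mtx_shape(path):
    """Return (n_rows, n_cols, n_entries) from a Matrix Market size line."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            # Header and comment lines start with '%'; the first other
            # non-empty line holds "rows cols nonzero_entries".
            if line.startswith("%") or not line.strip():
                continue
            n_rows, n_cols, n_entries = (int(x) for x in line.split())
            return n_rows, n_cols, n_entries
    raise ValueError(f"no size line found in {path}")


def check_10x_dir(matrix_dir):
    """Check matrix.mtx against genes.tsv (rows) and barcodes.tsv (columns)."""
    matrix_dir = Path(matrix_dir)
    n_genes, n_cells, n_entries = read_mtx_shape(matrix_dir / "matrix.mtx")
    if n_genes != count_lines(matrix_dir / "genes.tsv"):
        raise ValueError("genes.tsv does not match the number of matrix rows")
    if n_cells != count_lines(matrix_dir / "barcodes.tsv"):
        raise ValueError("barcodes.tsv does not match the number of matrix columns")
    return {"n_genes": n_genes, "n_cells": n_cells, "n_nonzero": n_entries}
```

Calling `check_10x_dir` on the directory that holds the three files returns the gene and cell counts, which can then be compared with Supplementary Table 1. For the actual analysis, a reader such as scanpy's `read_10x_mtx` is probably the more convenient route to an AnnData object.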

A ready-to-use dataset that combines all 4 datasets, with 32k simulated samples, is available on the scaden website: https://scaden.readthedocs.io/en/latest/datasets.html#human-pbmc

2. Download datasets for training

2.1 - 6k PBMCs from a healthy donor

Webpage

https://www.10xgenomics.com/datasets/6-k-pbm-cs-from-a-healthy-donor-1-standard-1-1-0

Input Files

wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc6k/pbmc6k_fastqs.tar

Output Files

wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc6k/pbmc6k_possorted_genome_bam.bam
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc6k/pbmc6k_possorted_genome_bam_index.bam.bai
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc6k/pbmc6k_molecule_info.h5
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc6k/pbmc6k_filtered_gene_bc_matrices.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc6k/pbmc6k_raw_gene_bc_matrices.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc6k/pbmc6k_analysis.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc6k/pbmc6k_metrics_summary.csv
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc6k/pbmc6k_web_summary.html

2.2 - 8k PBMCs from a healthy donor

Webpage

https://www.10xgenomics.com/datasets/8-k-pbm-cs-from-a-healthy-donor-2-standard-2-1-0

Input Files

wget https://s3-us-west-2.amazonaws.com/10x.files/samples/cell-exp/2.1.0/pbmc8k/pbmc8k_fastqs.tar

Output Files

wget https://s3-us-west-2.amazonaws.com/10x.files/samples/cell-exp/2.1.0/pbmc8k/pbmc8k_possorted_genome_bam.bam
wget https://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc8k/pbmc8k_possorted_genome_bam_index.bam.bai
wget https://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc8k/pbmc8k_molecule_info.h5
wget https://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc8k/pbmc8k_filtered_gene_bc_matrices.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc8k/pbmc8k_raw_gene_bc_matrices_h5.h5
wget https://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc8k/pbmc8k_raw_gene_bc_matrices.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc8k/pbmc8k_analysis.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc8k/pbmc8k_metrics_summary.csv
wget https://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc8k/pbmc8k_web_summary.html
wget https://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc8k/pbmc8k_cloupe.cloupe

2.3 - Frozen PBMCs (Donor A)

Webpage

https://www.10xgenomics.com/datasets/frozen-pbm-cs-donor-a-1-standard-1-1-0

Input Files

wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_a/frozen_pbmc_donor_a_fastqs.tar

Output Files

wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_a/frozen_pbmc_donor_a_possorted_genome_bam.bam
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_a/frozen_pbmc_donor_a_possorted_genome_bam_index.bam.bai
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_a/frozen_pbmc_donor_a_molecule_info.h5
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_a/frozen_pbmc_donor_a_filtered_gene_bc_matrices.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_a/frozen_pbmc_donor_a_raw_gene_bc_matrices.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_a/frozen_pbmc_donor_a_analysis.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_a/frozen_pbmc_donor_a_metrics_summary.csv
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_a/frozen_pbmc_donor_a_web_summary.html

2.4 - Frozen PBMCs (Donor C)

Webpage

https://www.10xgenomics.com/datasets/frozen-pbm-cs-donor-c-1-standard-1-1-0

Input Files

wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_c/frozen_pbmc_donor_c_fastqs.tar

Output Files

wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_c/frozen_pbmc_donor_c_possorted_genome_bam.bam
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_c/frozen_pbmc_donor_c_possorted_genome_bam_index.bam.bai
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_c/frozen_pbmc_donor_c_molecule_info.h5
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_c/frozen_pbmc_donor_c_filtered_gene_bc_matrices.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_c/frozen_pbmc_donor_c_raw_gene_bc_matrices.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_c/frozen_pbmc_donor_c_analysis.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_c/frozen_pbmc_donor_c_metrics_summary.csv
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_c/frozen_pbmc_donor_c_web_summary.html
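
Since only the filtered matrices are needed for training, the four `*_filtered_gene_bc_matrices.tar.gz` downloads above can be scripted. The following is a minimal sketch in Python using only the standard library. The URLs are built from the pattern visible in the wget commands above; the function names, the `DATASETS` table and the `data` output directory are my own choices for illustration. The folder layout inside each archive is not assumed: the code simply searches for matrix.mtx after extraction.

```python
import tarfile
import urllib.request
from pathlib import Path

BASE_URL = "https://cf.10xgenomics.com/samples/cell-exp"

# Dataset id -> version segment, exactly as they appear in the URLs above.
DATASETS = {
    "pbmc6k": "1.1.0",
    "pbmc8k": "2.1.0",
    "frozen_pbmc_donor_a": "1.1.0",
    "frozen_pbmc_donor_c": "1.1.0",
}


def filtered_matrix_url(dataset, version):
    """Build the URL of the filtered matrix archive for one dataset."""
    return f"{BASE_URL}/{version}/{dataset}/{dataset}_filtered_gene_bc_matrices.tar.gz"


def download_filtered(dataset, version, out_dir="data"):
    """Download and extract one archive; return the matrix.mtx paths found."""
    target = Path(out_dir) / dataset
    target.mkdir(parents=True, exist_ok=True)
    archive = target / f"{dataset}_filtered_gene_bc_matrices.tar.gz"
    if not archive.exists():  # skip the download if the archive is already there
        urllib.request.urlretrieve(filtered_matrix_url(dataset, version), archive)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(target)
    return sorted(target.rglob("matrix.mtx"))


def download_all(out_dir="data"):
    """Fetch the filtered matrices of all four datasets."""
    return {name: download_filtered(name, version, out_dir)
            for name, version in DATASETS.items()}
```

Running `download_all()` fetches the four archives into `data/<dataset>/` and returns, per dataset, where matrix.mtx ended up.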
