Add test files with 'chr' prefix added to seqnames (#177)
* add test files with 'chr' prefix added to seqnames

* note about chr_prefix files in tests/test_data/README.md
SamBryce-Smith authored Sep 28, 2021
1 parent ca33f6a commit 4a204b3
Showing 9 changed files with 1,114 additions and 1 deletion.
3 changes: 2 additions & 1 deletion tests/test_data/README.md
@@ -2,8 +2,9 @@
Use the files provided here for developing, debugging, testing your code.

* `.bam` files with index files (`.bam.bai`) as inputs for execution workflows ([Created by choosing 2 genes with MACE-seq PAS reads at 2 or more sites (GSE151724) and subsetting the bam files (generated for the pilot_benchmark: siControl_R1, SRR11918577 and siSrsf3_R1, SRR11918579) to reads falling within +/- 1kb of the gene boundaries in the gtf file]).
* `.fastq` files can be generated to test alignments using `samtools bam2fq input.bam > output.fastq`.
* corresponding `.gtf` ([Created from GENCODE release M18, subset to the 2 genes for the test data, with the leading "chr" removed to match the bam files])
* corresponding `.gff3` ([Created from GENCODE release M18, subset to the 2 genes for the test data, with the leading "chr" removed to match the bam files])
* `.MACEseq.mm10.bed` as ground truth example files ([BED6 files of cleavage and poly(A) sites for the 2 genes from the two samples (siControl_R1, SRR11918617 and siSrsf3_R2, SRR11918619) where the score column corresponds to the TPM for each PAS detected by MACE-seq in that sample])
* `.bam`, `.bam.bai`, `.gtf`, `.gff3` and `.bed` files containing `_Chr_prefix` in the filename. These were generated from their counterparts without the string by adding a `chr` prefix to the fields with chromosome names.
* [EXTEND THIS LIST WHEN ADDING MORE TEST FILES TO THE DIRECTORY]
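
For the text formats in the list above (`.bed`, `.gtf`, `.gff3`), adding a `chr` prefix amounts to rewriting the first tab-separated field of each record. The following is a minimal sketch of that idea, not the procedure actually used in this commit. The function name `add_chr_prefix` is hypothetical, and the sketch assumes tab-delimited records with the chromosome name in column 1. BAM files are not covered: their sequence names live in the header, so they would need a separate step (for example with `samtools reheader`).

```python
def add_chr_prefix(line: str, prefix: str = "chr") -> str:
    """Return `line` with `prefix` added to its first tab-separated field.

    Hypothetical helper for illustration only. Blank lines and lines
    starting with '#' (GTF/GFF3 comments and directives) are returned
    unchanged, so a GFF3 '##sequence-region' directive would still need
    to be updated separately.
    """
    if not line.strip() or line.startswith("#"):
        return line
    seqname, sep, rest = line.partition("\t")
    if seqname.startswith(prefix):
        # Already prefixed: leave the record as it is.
        return line
    return prefix + seqname + sep + rest


# Example on a BED6 record without the prefix (tab-separated).
record = "6\t119923969\t119923970\tdPAS_9821\t174.961985532\t-\n"
print(add_chr_prefix(record), end="")
```

Applying the function line by line to a file and writing the result to a new `_Chr_prefix` file keeps the original test data untouched.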
551 changes: 551 additions & 0 deletions tests/test_data/gencode_2genes_Chr_prefix.vM18.annotation.gff3

Large diffs are not rendered by default.

549 changes: 549 additions & 0 deletions tests/test_data/gencode_2genes_Chr_prefix.vM18.annotation.gtf

Large diffs are not rendered by default.

@@ -0,0 +1,6 @@
chr6 119923969 119923970 dPAS_9821 174.961985532 -
chr6 119925550 119925551 pPAS_9822 25.4490160774 -
chr16 78336750 78336751 pPAS_1596 12.7245080387 +
chr16 78340181 78340182 oPAS_1597 12.7245080387 +
chr16 78340750 78340751 oPAS_1598 162.237477494 +
chr16 78359782 78359783 dPAS_1599 6.36225401935 +
Binary file not shown.
Binary file not shown.
6 changes: 6 additions & 0 deletions tests/test_data/siSrsf3_R1_2genes_Chr_prefix.MACEseq.mm10.bed
@@ -0,0 +1,6 @@
chr6 119923969 119923970 dPAS_9821 180.804499064 -
chr6 119925550 119925551 pPAS_9822 72.861514548 -
chr16 78336750 78336751 pPAS_1596 32.3828953547 +
chr16 78340181 78340182 oPAS_1597 8.09572383867 +
chr16 78340750 78340751 oPAS_1598 83.6558129996 +
chr16 78359782 78359783 dPAS_1599 13.4928730644 +
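
As a quick sanity check on ground truth files like the one above, the six BED6 columns can be read into typed records. This is a sketch under assumptions: the `PasSite` type and `parse_bed6` function are hypothetical names, the score column is interpreted as TPM as the README states, and splitting on any whitespace is a convenience so that the rows parse whether they are tab- or space-separated.

```python
from typing import NamedTuple


class PasSite(NamedTuple):
    """One BED6 record describing a poly(A) site (hypothetical type)."""
    chrom: str
    start: int   # 0-based, inclusive (BED convention)
    end: int     # exclusive (BED convention)
    name: str
    tpm: float   # BED score column, TPM according to the README
    strand: str


def parse_bed6(line: str) -> PasSite:
    """Parse one whitespace-separated BED6 line into a PasSite."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    chrom, start, end, name, score, strand = fields
    return PasSite(chrom, int(start), int(end), name, float(score), strand)


# Rows copied from siSrsf3_R1_2genes_Chr_prefix.MACEseq.mm10.bed above.
rows = """\
chr6 119923969 119923970 dPAS_9821 180.804499064 -
chr6 119925550 119925551 pPAS_9822 72.861514548 -
chr16 78336750 78336751 pPAS_1596 32.3828953547 +
chr16 78340181 78340182 oPAS_1597 8.09572383867 +
chr16 78340750 78340751 oPAS_1598 83.6558129996 +
chr16 78359782 78359783 dPAS_1599 13.4928730644 +
"""

sites = [parse_bed6(line) for line in rows.splitlines() if line.strip()]

# Every site in this file is a single-nucleotide interval on a
# 'chr'-prefixed chromosome.
assert all(site.end - site.start == 1 for site in sites)
assert all(site.chrom.startswith("chr") for site in sites)
```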
Binary file added tests/test_data/siSrsf3_R1_2genes_Chr_prefix.bam
Binary file not shown.
Binary file not shown.
