Update CDS DRS mapping data (#728)
* update CDS DRS mapping data
* Use drs_uri as source of truth for sequencing data

Fixes #719

---------

Co-authored-by: 🔧 Ino de Bruijn 🧬 <[email protected]>
onursumer and inodb authored Dec 12, 2024
1 parent 0e75d08 commit 0d4e700
Showing 8 changed files with 35,539 additions and 29,319 deletions.
5 changes: 4 additions & 1 deletion README.md
@@ -22,9 +22,10 @@ that using these commands (requires access to the htan-dcc google project):
 cd data
 bq extract --destination_format CSV released.entities_v6_1 gs://htan-release-files/entities_v6_1.csv
 bq extract --destination_format CSV released.metadata_v6_1 gs://htan-release-files/metadata_v6_1.csv
+bq extract --destination_format NEWLINE_DELIMITED_JSON released.cds_drs_mapping_V2 gs://htan-release-files/cds_drs_mapping.json
 gsutil cp gs://htan-release-files/entities_v6_1.csv entities_v6_1.csv
 gsutil cp gs://htan-release-files/metadata_v6_1.csv metadata_v6_1.csv
-
+gsutil cp gs://htan-release-files/cds_drs_mapping.json cds_drs_mapping.json
 ```

 #### Pull files from Synapse and Process for ingestion
@@ -64,6 +65,7 @@ There are currently no automated tests, other than building the project, so be c
 ## Getting Started

 First, make sure you have the latest processed json file:
+
 ```bash
 yarn gunzip
 ```
@@ -81,6 +83,7 @@ Open [http://localhost:3000](http://localhost:3000) with your browser to see the
 You can start editing any page. The page auto-updates as you edit the file.

 ## Debugging processSynapseJSON
+
 Add `debugger;` somewhere in the code. Then run:

 ```bash
2 changes: 1 addition & 1 deletion components/HomePage.tsx
@@ -54,7 +54,7 @@ const HomePage: React.FunctionComponent<IHomePropsProps> = ({
     }}
 >
     <a style={{ color: 'white' }} href="/data-updates">
-        Data Release V6.1 (Last updated 2024-11-22)
+        Data Release V6.1 (Last updated 2024-12-11)
     </a>
 </div>
 <Row className="justify-content-md-center">
64,817 changes: 35,517 additions & 29,300 deletions data/cds_drs_mapping.json

Large diffs are not rendered by default.

8 changes: 4 additions & 4 deletions data/processSynapseJSON.log
@@ -1,10 +1,10 @@
 ncc: Version 0.28.6
 ncc: Compiling file index.js
 ncc: Using [email protected] (local user-provided)
-40kB sourcemap-register.js
-8479kB index.js
-1425kB index.js.map
-8519kB [29085ms] - ncc 0.28.6
+40kB sourcemap-register.js
+10961kB index.js
+1425kB index.js.map
+11001kB [9288ms] - ncc 0.28.6
 Missing ParentBiospecimenID: {
   Component: 'ScRNA-seqLevel3',
   Filename: 'single_cell_RNAseq_level_3_ped_glioma/HTAN_pHGG_161_New_Reg1_snRNA/barcodes.tsv.gz',
18 changes: 6 additions & 12 deletions data/processSynapseJSON.ts
@@ -374,7 +374,6 @@ function getReleaseSynapseIds(

 function addDownloadSourcesInfo(
     file: BaseSerializableEntity,
-    dbgapSynapseSet: Set<string>,
     dbgapImgSynapseSet: Set<string>
 ) {
     if (
@@ -389,9 +388,8 @@ function addDownloadSourcesInfo(
         // them from the portal listing entirely so assume they are there
         file.isRawSequencing = true;
         if (
-            file.synapseId &&
-            (dbgapSynapseSet.has(file.synapseId) ||
-                file.Filename.endsWith('bai'))
+            (file.synapseId && file.viewers?.cds?.drs_uri) ||
+            file.Filename.endsWith('bai')
         ) {
             file.downloadSource = DownloadSourceCategory.dbgap;
         } else {
@@ -402,13 +400,10 @@ function addDownloadSourcesInfo(
         if (file.synapseId && dbgapImgSynapseSet.has(file.synapseId)) {
             // Level 2 imaging data is open access
             // ImagingLevel2, SRRSImagingLevel2 as specified in released.entities table (CDS_Release) column
-            if (
-                file.level === 'Level 2' &&
-                file.Component.startsWith('Imaging')
-            ) {
+            if (file.viewers?.cds?.drs_uri) {
                 file.downloadSource = DownloadSourceCategory.cds;
             } else {
-                file.downloadSource = DownloadSourceCategory.dbgap;
+                file.downloadSource = DownloadSourceCategory.comingSoon;
             }
         } else if (
             file.Component === 'OtherAssay' &&
@@ -617,7 +612,6 @@ function processSynapseJSON(
         ancestryByParticipantID
     );

-    const dbgapSynapseSet = new Set<string>(getDbgapSynapseIds(entitiesById));
     const dbgapImgSynapseSet = new Set<string>(
         getDbgapImgSynapseIds(entitiesById)
     );
@@ -645,10 +639,10 @@ function processSynapseJSON(
                 parentData?.therapy || []
             ).map((d) => d.ParticipantID);

-            addDownloadSourcesInfo(file, dbgapSynapseSet, dbgapImgSynapseSet);
+            addViewers(file);
+            addDownloadSourcesInfo(file, dbgapImgSynapseSet);
             addReleaseInfo(file, entitiesById);
             addImageChannelMetadata(file, entitiesById);
-            addViewers(file);
             return file as SerializableEntity;
         });
     // .filter((f): f is SerializableEntity => !!f); // file should be defined (typescript doesnt understand (f=>f)
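
The processSynapseJSON.ts hunks change two decisions in `addDownloadSourcesInfo` and move the `addViewers` call ahead of it, presumably because the function now reads `file.viewers`. A minimal TypeScript sketch of that logic, under stated assumptions: `SketchFile`, `SketchSource`, `sequencingSource`, and `imagingSource` are hypothetical names that do not appear in the repository, the string union stands in for `DownloadSourceCategory`, and every branch the diff elides returns `undefined` rather than a guessed value.

```typescript
// Hypothetical stand-in for DownloadSourceCategory (only the members the diff shows).
type SketchSource = 'dbgap' | 'cds' | 'comingSoon';

// Hypothetical, trimmed-down file shape; the real entity type has many more fields.
interface SketchFile {
    synapseId?: string;
    Filename: string;
    viewers?: { cds?: { drs_uri?: string } };
}

// Raw-sequencing branch after this commit: a DRS URI (or a 'bai' filename suffix)
// marks the file as dbGaP-hosted. The else branch is not visible in the diff,
// so this sketch returns undefined there.
function sequencingSource(file: SketchFile): SketchSource | undefined {
    const hasDrsUri = Boolean(file.synapseId && file.viewers?.cds?.drs_uri);
    if (hasDrsUri || file.Filename.endsWith('bai')) {
        return 'dbgap';
    }
    return undefined;
}

// Imaging branch after this commit: files in the dbGaP imaging set are 'cds'
// only when a DRS URI exists; otherwise they are 'comingSoon'.
// Files outside the set fall into branches the diff does not show.
function imagingSource(
    file: SketchFile,
    dbgapImgSynapseSet: Set<string>
): SketchSource | undefined {
    if (file.synapseId && dbgapImgSynapseSet.has(file.synapseId)) {
        return file.viewers?.cds?.drs_uri ? 'cds' : 'comingSoon';
    }
    return undefined;
}

// Usage: the same file changes category once its viewers entry is populated.
const imagingIds = new Set<string>(['syn1']);
const pending: SketchFile = { synapseId: 'syn1', Filename: 'image.ome.tiff' };
console.log(imagingSource(pending, imagingIds)); // comingSoon

const mapped: SketchFile = {
    ...pending,
    viewers: { cds: { drs_uri: 'drs://example/abc' } }, // made-up URI
};
console.log(imagingSource(mapped, imagingIds)); // cds
```

Because both helpers read `viewers.cds.drs_uri`, whatever populates `viewers` has to run before the download source is assigned, which is consistent with the reordered calls in the last hunk.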
2 changes: 1 addition & 1 deletion lib/helpers.ts
@@ -44,7 +44,7 @@ export async function fetchData(): Promise<LoadDataResult> {
     const processedSynURL =
         process.env.NODE_ENV === 'development'
             ? '/processed_syn_data.json'
-            : `${getCloudBaseUrl()}/processed_syn_data_20241204_1543.json`;
+            : `${getCloudBaseUrl()}/processed_syn_data_20241211_2209.json`;
     return fetchSynData(processedSynURL);
 }

6 changes: 6 additions & 0 deletions pages/static/data-updates.html
@@ -5,6 +5,12 @@

 <h1>Data Updates</h1>

+<h2 id="2024-12-11">December 11th, 2024</h2>
+<p>
+    We fixed an issue with manifest generation for CDS. This previously resulted
+    in some files showing up in CDS while they weren't available yet.
+</p>
+
 <h2 id="2024-11-22">November 22nd, 2024</h2>
 <p>
     More than 30K Level 1-2 imaging and sequencing files are now accessible via
Binary file modified public/processed_syn_data.json.gz
Binary file not shown.
