Merge 2.0 to master #157

Merged
merged 87 commits into from
Jun 20, 2024
87 commits
e55b8be
Create next.deploy
devbyaccident Nov 21, 2022
3db074f
Merge pull request #59 from kids-first/devbyaccident-patch-2
devbyaccident Nov 21, 2022
87fb2a6
fixing name to match current resources
devbyaccident Nov 22, 2022
3780bb0
:bug: Avoid logging an empty object (#60)
evans-g-crsj Nov 25, 2022
8f42981
:bugs: Fix arranger-project (genes.cosmic.tumour_types_germline) (#63)
evans-g-crsj Dec 19, 2022
ca09ea1
:wrench: Add main_branch entry to next.deploy (#64)
evans-g-crsj Dec 20, 2022
4d4aa63
:wrench: Add members to arranger-project (#65)
evans-g-crsj Dec 20, 2022
c9f2978
:wrench: Add helper script to add aliases (#66)
evans-g-crsj Jan 4, 2023
3615b9b
:sparkles: Add variants to sets (#67)
evans-g-crsj Jan 9, 2023
40ee2e0
:wrench: Implement searchAfter for participants (phenotypes) (#68)
evans-g-crsj Feb 15, 2023
5abf703
:wrench: Update conf for specimens (#69)
evans-g-crsj Mar 17, 2023
4e175a6
Small fixes for UCSF (using public key and pass esUser and esPass to …
celinepelletier Mar 29, 2023
697858a
Review comments
celinepelletier Mar 29, 2023
ff0408d
fix: UCSF-000 fix try to match regex on something that is not a string
celinepelletier Apr 18, 2023
f79e5e7
:wrench: Create include-next-deploy
evans-g-crsj May 11, 2023
c8a23cf
:pencil: Rename include-next-deploy to include.next.deploy
evans-g-crsj May 11, 2023
5169e1a
:pencil: Update include.next.deploy
evans-g-crsj May 11, 2023
61e558a
:wrench: Prepare admin conf for INCLUDE
evans-g-crsj Jun 27, 2023
0abb6b7
:wrench: Change arranger project name
evans-g-crsj Jun 27, 2023
7e5f4aa
Merge pull request #86 from kids-first/feat-conf-include-qa
adipaul1981 Jun 27, 2023
93cd8c4
Add prd cidr to next.deploy
devbyaccident Jun 27, 2023
59bab06
Merge pull request #87 from kids-first/devbyaccident-patch-2
devbyaccident Jun 27, 2023
9369ca8
Update include.next.deploy
devbyaccident Jun 27, 2023
9905931
Update include.next.deploy
devbyaccident Sep 7, 2023
f14385d
Merge pull request #92 from kids-first/feature/cbl/add-security-headers
devbyaccident Sep 7, 2023
a9e34e5
:bug: Incorporate bug fix for resolveSet (#93)
evans-g-crsj Sep 19, 2023
9cdbd7f
:wrench: Add extended mapping to arranger project (source_text_tumor_…
evans-g-crsj Sep 21, 2023
5e988cc
🔒️ Add container vulnerability scan
devbyaccident Sep 27, 2023
6b000df
Merge pull request #95 from kids-first/feature/cbl/container-scan
devbyaccident Sep 27, 2023
983645d
:bug: Make sure that numbers checks are supported when testing regex …
evans-g-crsj Oct 6, 2023
735bbd5
feat: SJIP-569 add logs to debug no sets
celinepelletier Oct 10, 2023
c976837
feat: SJIP-569 add log to debug arranger next in include QA
celinepelletier Oct 10, 2023
99a1a51
feat: SJIP-569 add logs for debug
celinepelletier Oct 10, 2023
07d1d93
feat: SKIP-569 add logs
celinepelletier Oct 10, 2023
b1fd0f2
feat: SJIP-569 add logs
celinepelletier Oct 10, 2023
3ad432d
feat: SJIP-569 add new route for test
celinepelletier Oct 10, 2023
6c2a597
feat: SJIP-569 add route
celinepelletier Oct 10, 2023
affe48a
feat: SJIP-569 remove logs and new routes
celinepelletier Oct 10, 2023
98faeb8
feat: SJIP-569 trigger deployment
celinepelletier Oct 10, 2023
e31f469
:zap: Improve helpers scripts (#106)
evans-g-crsj Oct 26, 2023
91a441f
feat: SJIP-572 retrieve shared user set and fix tests
celinepelletier Oct 31, 2023
a5d6fb8
feat: SJIP-572 update dependancies
celinepelletier Oct 31, 2023
bf49f23
feat: SJIP-572 update base docker images
celinepelletier Oct 31, 2023
1643c3e
:hammer: Improve helper scripts (#109)
evans-g-crsj Nov 6, 2023
a446875
:bug: Fix admin script when extracting conf (#110)
evans-g-crsj Nov 17, 2023
33c194e
feat: SJIP-640 bump keycloak dep
celinepelletier Nov 19, 2023
c3964fa
:sparkles: Feat add new route for authorized studies (#112)
evans-g-crsj Nov 23, 2023
d0687d1
:wrench: Remove survival endpoint + update dockerfile + remove ts-nod…
evans-g-crsj Nov 30, 2023
e98134b
:bug: Fix helpers script + change payload in auth studies route (#114)
evans-g-crsj Dec 12, 2023
9b18b41
:pencil: Add script to get stats on a given release (#116)
evans-g-crsj Jan 16, 2024
e0158e6
:wrench: Add script to help manage project conf (#117)
evans-g-crsj Jan 22, 2024
d4d255c
:wrench: Fix minor stuff in admin scripts (#118)
evans-g-crsj Feb 7, 2024
37fe579
fix: SKFP-000 fix in admin scripts
celinepelletier Feb 7, 2024
72b4ff4
:sparkles: Add script so that htp study can have mocked new fields (#…
evans-g-crsj Feb 22, 2024
27154af
:wrench: Make sure new study fields are correctly mapped in Arranger …
evans-g-crsj Feb 23, 2024
2fcd2b5
:wrench: Adapt study script to allow multiple studies (#122)
evans-g-crsj Feb 27, 2024
f85b7a1
:bug: remove data_category duplicate from conf (#123)
evans-g-crsj Feb 27, 2024
c3a653b
:bug: Fix duplicate counts in auth studies and add open access data u…
evans-g-crsj Mar 1, 2024
45ee5bf
:bug: Fix mock studies script (#125)
evans-g-crsj Mar 6, 2024
9ad703e
:wrench: Add helper script to check aliases and releases (#127)
evans-g-crsj Mar 26, 2024
f4055cb
:bug: Make sure only clinical indices are deleted (#129)
evans-g-crsj Mar 27, 2024
b783740
:construction: Simplify checkReleaseStats admin script (#130)
evans-g-crsj Apr 3, 2024
cc8a17c
:wrench: Update study mocks (#132)
evans-g-crsj Apr 5, 2024
ec983fe
feat: SJIP-768 add studies participant count, variants count, genomes…
aperron-ferlab Apr 11, 2024
5c69ff5
fix: SJIP-768 remove unused env variable (#134)
aperron-ferlab Apr 11, 2024
2542ab8
:pencil: Update studies mock (#135)
evans-g-crsj Apr 11, 2024
e826c56
fix: SJIP-768 adjust genomes and transcriptomes values (#136)
aperron-ferlab Apr 11, 2024
acf6c67
fix: SJIP-768 increase size for studies population count (#137)
aperron-ferlab Apr 11, 2024
6ea71be
fix: SJIP-768 adjust stats queries to take participant count (#138)
aperron-ferlab Apr 12, 2024
27dc0bd
:pencil: Update domains and study_designs fields (#139)
evans-g-crsj Apr 12, 2024
abccd1d
feat: SJIP-812 add demographics to public routes (#140)
aperron-ferlab Apr 22, 2024
76703e7
feat: SJIP-768 add top mondo diagnosis to public route (#141)
aperron-ferlab Apr 24, 2024
8bfb201
fix: SJIP-812 replace ethnicity with race in statistic route (#142)
aperron-ferlab Apr 25, 2024
2f2575d
:goal_net: Add extra condition before deleting indices from a given r…
evans-g-crsj May 14, 2024
074c668
:loud_sound: Show if duplicates exist in checkReleaseStats (#144)
evans-g-crsj May 17, 2024
0d31f3e
feat: SKFP-1077 add members count stats (#145)
aperron-ferlab May 22, 2024
19a3eeb
fix: SKFP-1077 remove usage of env vars for members count (#146)
aperron-ferlab May 22, 2024
4df26d2
fix: SKFP-1077 check if indices exist for members (#147)
aperron-ferlab May 22, 2024
4688d7a
fix: SJIP-768 adjust top mondo hits to use display_term (#148)
aperron-ferlab May 24, 2024
5add1af
:wrench: Update projects conf (#149)
evans-g-crsj May 28, 2024
ee29634
fix: SKFP-1097 adjust auth studies allowed files count (#151)
aperron-ferlab Jun 5, 2024
1f32b1e
fix: SKFP-1097 remove controlled condition to match condition query (…
aperron-ferlab Jun 6, 2024
2f4ae8e
:wrench: Remove fields from mock studies (#153)
evans-g-crsj Jun 7, 2024
b9eb67b
:wrench: Add new script and improve others (admin) (#154)
evans-g-crsj Jun 14, 2024
c8f5ba9
:wrench: Remove data_types from studies mock and add new script to in…
evans-g-crsj Jun 17, 2024
c802656
:sparkle: Add script to help build stepfn etl payload (#156)
evans-g-crsj Jun 18, 2024
eb03194
fix: SKFP-1033 merge 2.0 to master
celinepelletier Jun 20, 2024
4 changes: 0 additions & 4 deletions .env.example
@@ -21,10 +21,6 @@ SEND_UPDATE_TO_SQS=
 SQS_QUEUE_URL=
 MAX_SET_CONTENT_SIZE=

-# Python configuration (used for survival endpoint)
-SURVIVAL_PY_FILE=
-PYTHON_PATH=
-
 # Riff
 RIFF_URL=

25 changes: 25 additions & 0 deletions .github/workflows/check_pull_request.yml
@@ -0,0 +1,25 @@
name: Check Pull Request Quality

on:
  pull_request:

jobs:
  tests:
    name: Run Tests
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Source Code
        uses: actions/checkout@v3
      - name: Setup node
        uses: actions/setup-node@v3
        with:
          node-version: 20
      - name: Use Dependencies Cache
        uses: actions/cache@v3
        with:
          path: '**/node_modules'
          key: ${{ runner.os }}-modules-${{ hashFiles('**/package-lock.json') }}
      - name: Install Dependencies
        run: npm ci
      - name: Run tests
        run: npm run test
21 changes: 21 additions & 0 deletions .github/workflows/scan.yaml
@@ -0,0 +1,21 @@
name: build
on:
  pull_request:

jobs:
  build:
    name: Build
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
      - name: Build an image from Dockerfile
        run: |
          docker build -t ${{ github.sha }} .
      - name: Run Trivy vulnerability scanner
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: '${{ github.sha }}'
          format: 'table'
          exit-code: '1'
          severity: 'CRITICAL,HIGH'
3 changes: 2 additions & 1 deletion .gitignore
@@ -7,4 +7,5 @@ node_modules
 /docker/esdata/
 /.idea/
 dev/es_data/*
-*.env-dev
+*.env-dev
+venv
17 changes: 5 additions & 12 deletions Dockerfile
@@ -1,18 +1,11 @@
-# First image to compile typescript to javascript
-FROM node:16.13-alpine AS build-image
+FROM node:20-alpine3.18 AS build
 WORKDIR /app
 COPY . .
-RUN npm ci
-RUN npm run clean
-RUN npm run build
+RUN npm ci && npm run cleanAndBuild

-# Second image, that creates an image for production
-FROM nikolaik/python-nodejs:python3.9-nodejs16-alpine AS prod-image
+FROM node:20-alpine3.18 AS prod-image
 WORKDIR /app
-COPY --from=build-image ./app/dist ./dist
+COPY --from=build ./app/dist ./dist
 COPY package* ./
-COPY ./resource ./resource
-RUN npm ci --production
-RUN pip3 install -r resource/py/requirements.txt
-
+RUN apk update && apk upgrade --no-cache libcrypto3 libssl3 && npm ci --production
 CMD [ "node", "./dist/src/index.js" ]
4 changes: 3 additions & 1 deletion README.md
@@ -12,7 +12,9 @@ Arranger server is an application that wraps Elasticsearch and provides a GraphQ

 ## Development

-* Execute: `npm run build` then `npm run start`
+* Execute: `npm run cbs`
+
+Note: You can execute this project in a docker container if you prefer: `docker run -u node -it --rm --network host -v ${PWD}:/app --workdir /app node:20-alpine3.18 sh`

 ### General

4 changes: 2 additions & 2 deletions admin/README.md
@@ -18,8 +18,8 @@ The Law of the Land is: 1 arranger project per environment (qa, staging, prod).
 npm run admin-project
 or
 # run the script with docker (PWD = root of the project)
-docker run -it --network host --rm -v ${PWD}:/code --workdir /code node:16.13-alpine sh -c "npm install && npm run build && npm run admin-project"
+docker run -it --network host --rm -v ${PWD}:/code --workdir /code node:20-alpine3.18 sh -c "npm install && npm run build && npm run admin-project"
 # run the script with docker (PWD = root of the project) and local elastic search (from /dev)
-docker run -it --rm --network es-net -v ${PWD}:/code --workdir /code node:16.13-alpine sh -c "npm install && npm run build && npm run admin-project"
+docker run -it --rm --network es-net -v ${PWD}:/code --workdir /code node:20-alpine3.18 sh -c "npm install && npm run build && npm run admin-project"
 ```

150 changes: 150 additions & 0 deletions admin/addFieldsToStudies.mjs
@@ -0,0 +1,150 @@
// You must be connected to the correct ES HOST
// You can do:
// docker run --rm -it -v ${PWD}:/app -u node --network=host --workdir /app node:20-alpine3.18 sh
// node admin/addFieldsToStudies.mjs
import assert from 'node:assert/strict';
import EsInstance from '../dist/src/ElasticSearchClientInstance.js';
import readline from 'readline';

import { mockStudies, validateStudies } from './mockStudies.mjs';
const userReadline = readline.createInterface({
    input: process.stdin,
    output: process.stdout,
});

const yesOrNo = await new Promise(resolve => {
    userReadline.question(`This script is intended for INCLUDE. Do you want to proceed y/n? > `, answer =>
        resolve(answer === 'y'),
    );
});
userReadline.close();
if (!yesOrNo) {
    console.info('Terminating Script');
    process.exit(0);
}

const { keys, values } = Object;

const ms = [...mockStudies];

const vr = validateStudies(ms);
const invalidStudies = vr.filter(v => !v[1]);
if (invalidStudies.length > 0) {
    invalidStudies.forEach(v => {
        const [code, , errors] = v;
        console.log(`study=${code} is invalid`);
        console.log(errors);
    });
    process.exit(0);
}

const sCodes = [...new Set(ms.map(x => x.study_code))];
const nOfStudiesToEnhance = ms.length;
assert(sCodes.length === nOfStudiesToEnhance, 'Duplicated study_codes in mocks');

const client = await EsInstance.default.getInstance();

// Quick validation
const rM = await client.indices.getMapping({ index: 'study_centric' });
assert(rM.statusCode === 200);
const m = values(rM.body)[0]?.mappings?.properties;
assert(!!m);
// Warning: mappings are multivalued (one for each study) and only the first one found is used, so validation may in some cases be incomplete.
const mappedKeys = ms.map(s => {
    const sTopLevelKeys = keys(s);
    const allKeysExistInMapping = sTopLevelKeys.every(sk => !!m[sk]);
    return [allKeysExistInMapping, sTopLevelKeys.filter(sk => !m[sk])];
});

const firstLevelNestedOK = ['dataset', 'data_types', 'contacts', 'experimental_strategies'].every(
    k => m[k]?.type === 'nested',
);
const mappingSeemsValid = firstLevelNestedOK && mappedKeys.every(x => !!x[0]);
if (!mappingSeemsValid) {
    console.error('It seems like not all values are mapped correctly.');
    if (m.dataset.type === 'nested') {
        console.error('Problematic keys: ', [
            ...new Set(
                mappedKeys
                    .filter(x => !x[0])
                    .map(x => x[1])
                    .flat(),
            ),
        ]);
    }
    process.exit(0);
}

// Processing
const r = await client.search({
    index: 'study_centric',
    size: sCodes.length,
    body: {
        query: {
            bool: {
                must: [
                    {
                        terms: {
                            study_code: sCodes,
                        },
                    },
                ],
            },
        },
    },
});
assert(r.statusCode === 200);
const hits = r.body.hits;
assert(
    hits.total.value <= nOfStudiesToEnhance &&
        hits?.hits &&
        hits.hits.every(h => sCodes.includes(h._source.study_code)),
);

const operations = ms
    .flatMap(doc => {
        const oDoc = hits.hits.find(h => h._source.study_code === doc.study_code);
        if (!oDoc) {
            return undefined;
        }
        return [
            { update: { _index: oDoc._index, _id: oDoc._id } },
            {
                doc: {
                    ...oDoc._source,
                    ...doc,
                },
            },
        ];
    })
    .filter(x => !!x);

assert(operations.length >= 1);
const br = await client.bulk({ refresh: true, body: operations });
assert(br.statusCode === 200 && !br.body?.errors, br);

// Post-validation
const uItems = br.body.items;
// Not a perfect check theoretically, but it should be largely sufficient.
// Besides, identity ( f(x)=x ) transform is considered as an update
const allUpdated = uItems.length === ms.length;
const updatedDocsIds = uItems.map(x => x.update._id);
const updatedCodes = updatedDocsIds.reduce((xs, x) => {
    const code = hits.hits.find(h => h._id === x)?._source.study_code;
    return code ? [...xs, code] : xs;
}, []);
console.log('Codes updated');
console.log(updatedCodes);
const notUpdatedCodes = sCodes.filter(c => !updatedCodes.includes(c));
if (notUpdatedCodes.length > 0) {
    console.log('Codes NOT updated (no studies found with these codes in study_centric)');
    console.log(notUpdatedCodes);
}
console.log(
    allUpdated
        ? 'All items were updated'
        : `Updated ${br.body.items.length} docs (ids=${uItems
              .map(item => item.update._id)
              .sort()
              .join(',')})`,
);

process.exit(0);
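Review note on `admin/addFieldsToStudies.mjs`: the step that turns mock studies into bulk operations can be read in isolation as a pure function. The sketch below is illustrative only; `buildBulkUpdates` and the sample data are hypothetical names that do not appear in the PR, and the hit shape (`_index`, `_id`, `_source.study_code`) is assumed from the script above. Returning `[]` from `flatMap` for unmatched mocks would make the trailing `.filter(x => !!x)` unnecessary.

```javascript
// Hypothetical helper (not part of the PR): pairs each mock study with its
// indexed document, matched on study_code, and emits the action/payload
// pairs that a bulk update request body expects.
function buildBulkUpdates(mocks, hits) {
    return mocks.flatMap(mock => {
        const hit = hits.find(h => h._source.study_code === mock.study_code);
        // An empty array is flattened away, so unmatched mocks are skipped.
        if (!hit) {
            return [];
        }
        return [
            { update: { _index: hit._index, _id: hit._id } },
            // Mock fields override the fields already stored in the document.
            { doc: { ...hit._source, ...mock } },
        ];
    });
}

// Made-up sample data for illustration.
const sampleHits = [{ _index: 'study_centric_x', _id: '1', _source: { study_code: 'A', name: 'old' } }];
const sampleMocks = [
    { study_code: 'A', name: 'new' },
    { study_code: 'B', name: 'unmatched' },
];
const sampleOps = buildBulkUpdates(sampleMocks, sampleHits);
console.log(JSON.stringify(sampleOps, null, 2));
```

Keeping this step free of Elasticsearch calls would also make it straightforward to unit test before running the script against a real cluster.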
15 changes: 8 additions & 7 deletions admin/arrangerApi.mjs
@@ -1,7 +1,7 @@
-import { updateFieldExtendedMapping } from '@arranger/admin/dist/schemas/ExtendedMapping/utils';
-import { createNewIndex, getProjectMetadataEsLocation } from '@arranger/admin/dist/schemas/IndexSchema/utils';
-import { addArrangerProject } from '@arranger/admin/dist/schemas/ProjectSchema/utils';
-import { constants } from '@arranger/admin/dist/services/constants';
+import { updateFieldExtendedMapping } from '@arranger/admin/dist/schemas/ExtendedMapping/utils.js';
+import { createNewIndex, getProjectMetadataEsLocation } from '@arranger/admin/dist/schemas/IndexSchema/utils.js';
+import { addArrangerProject } from '@arranger/admin/dist/schemas/ProjectSchema/utils.js';
+import { constants } from '@arranger/admin/dist/services/constants.js';

 const createNewIndices = async (esClient, confIndices) => {
     const createNewIndexWithClient = createNewIndex(esClient);
@@ -10,11 +10,12 @@ const createNewIndices = async (esClient, confIndices) => {
     }
 };

-const fixExtendedMapping = async (esClient, confExtendedMappingMutations) => {
+const fixExtendedMapping = async (esClient, mutations) => {
     const updateFieldExtendedMappingWithClient = updateFieldExtendedMapping(esClient);
-    for (const confExtendedMappingMutation of confExtendedMappingMutations) {
+    for (const [index, mutation] of mutations.entries()) {
+        console.debug('updating field = ', mutation?.field, ` ${index + 1} of ${mutations.length}`);
         await updateFieldExtendedMappingWithClient({
-            ...confExtendedMappingMutation,
+            ...mutation,
         });
     }
 };
42 changes: 42 additions & 0 deletions admin/checkAliasWithRelease.mjs
@@ -0,0 +1,42 @@
import { Client } from '@elastic/elasticsearch';
import { esHost } from '../dist/src/env.js';
import assert from 'node:assert/strict';

const cbKeepClinicalIndicesOnly = x =>
    ['file', 'biospecimen', 'participant', 'study'].some(stem => x.index.includes(stem));

const client = new Client({ node: esHost });

const rAllAliases = await client.cat.aliases({
    h: 'alias,index',
    format: 'json',
});

assert(rAllAliases.statusCode === 200);

const allAliases = rAllAliases.body;
const hasNext = allAliases.some(x => x.alias.includes('next_'));
const clinicalAliases = allAliases
    .filter(cbKeepClinicalIndicesOnly)
    .filter(x => (hasNext ? x.alias.includes('next_') : x));

const aliasToReleases = clinicalAliases.reduce((xs, x) => {
    const r = 're' + x.index.split('_re_')[1];
    const v = [...new Set(xs[x.alias] ? [...xs[x.alias], r] : [r])];
    return {
        ...xs,
        [x.alias]: v,
        all: v,
    };
}, {});

const { all, ...entities } = aliasToReleases;
console.log(`\n`);

// not the best test but it should suffice
const ok = hasNext ? all.length === 1 : all.length <= 2;
if (!ok) {
    console.warn('Check if the clinical aliases are ok - There might be a problem');
}
console.log(`Release(s) found: ${all}`);
console.log(entities);
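Review note on `admin/checkAliasWithRelease.mjs`: in the `reduce` above, `all` is reassigned on every iteration to the release list of the alias currently being processed, so at the end it holds only the releases of the last alias seen. If the intent is to report every release across all clinical aliases, one possible approach is to accumulate into a separate set. This is a sketch under that assumption; `groupReleasesByAlias` and the sample index names are hypothetical, and the `'re' + suffix` naming mirrors the script.

```javascript
// Hypothetical helper (not part of the PR): groups release identifiers by
// alias and also collects every release seen across all aliases.
function groupReleasesByAlias(rows) {
    const byAlias = {};
    const all = new Set();
    for (const { alias, index } of rows) {
        const suffix = index.split('_re_')[1];
        // Skip indices whose name carries no "_re_" release marker.
        if (suffix === undefined) {
            continue;
        }
        const release = 're' + suffix;
        if (!byAlias[alias]) {
            byAlias[alias] = new Set();
        }
        byAlias[alias].add(release);
        all.add(release);
    }
    const entities = {};
    for (const [alias, releases] of Object.entries(byAlias)) {
        entities[alias] = [...releases];
    }
    return { all: [...all], entities };
}

// Made-up rows in the shape returned by cat.aliases with h=alias,index.
const sampleRows = [
    { alias: 'participant_centric', index: 'participant_centric_sd_1_re_001' },
    { alias: 'participant_centric', index: 'participant_centric_sd_2_re_001' },
    { alias: 'study_centric', index: 'study_centric_sd_1_re_002' },
];
console.log(groupReleasesByAlias(sampleRows));
```

With this shape, the `all.length` check in the script would count distinct releases across every alias rather than within the last one.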
13 changes: 13 additions & 0 deletions admin/checkConf.mjs
@@ -0,0 +1,13 @@
import includeConf from './confInclude.json' assert { type: "json" };
import kfConf from './confKfNext.json' assert { type: "json" };

const kfs = kfConf.extendedMappingMutations.map(m => [m.field, m.graphqlField]);
const incs = includeConf.extendedMappingMutations.map(m => [m.field, m.graphqlField]);

console.info('mutation in Kf only');
const diffKfOnly = kfs.filter(kf => !incs.some(ins => kf[0] === ins[0] && kf[1] === ins[1])).map(x => ({ field: x[0], entity: x[1] }));
diffKfOnly.length === 0 ? console.log('No diff') : console.table(diffKfOnly);

console.info('mutation in Include only');
const diffIncOnly = incs.filter(ins => !kfs.some(kf => kf[0] === ins[0] && kf[1] === ins[1])).map(x => ({ field: x[0], entity: x[1] }));
diffIncOnly.length === 0 ? console.log('No diff') : console.table(diffIncOnly);
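Review note on `admin/checkConf.mjs`: the two comparisons are mirror images of each other, so they could share one helper. The sketch below is only a suggestion; `onlyInFirst` and the sample mutations are hypothetical, and it assumes each mutation exposes `field` and `graphqlField` as in the script.

```javascript
// Hypothetical helper (not part of the PR): returns the mutations present in
// `first` but absent from `second`, comparing on the (field, graphqlField) pair.
function onlyInFirst(first, second) {
    // JSON.stringify of the pair gives an unambiguous composite key.
    const keyOf = m => JSON.stringify([m.field, m.graphqlField]);
    const seen = new Set(second.map(keyOf));
    return first.filter(m => !seen.has(keyOf(m))).map(m => ({ field: m.field, entity: m.graphqlField }));
}

// Made-up mutations for illustration.
const sampleKf = [
    { field: 'study.study_code', graphqlField: 'participant' },
    { field: 'sex', graphqlField: 'participant' },
];
const sampleInclude = [
    { field: 'sex', graphqlField: 'participant' },
    { field: 'down_syndrome_status', graphqlField: 'participant' },
];
console.table(onlyInFirst(sampleKf, sampleInclude));
console.table(onlyInFirst(sampleInclude, sampleKf));
```

Using a `Set` of composite keys also avoids the nested `filter`/`some` scan, although for configuration files of this size the difference is unlikely to matter.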