The expsUsed table (detailed experiment metadata) could also be treated as a sample metadata table and run through the usual NLP process.
realmarcin changed the title from "future: ingest gene knockout data from LBL microbial fitness experiments" to "ingest gene knockout data from LBL microbial fitness experiments" on Dec 23, 2020
All of the data is here (84 GB total):
http://genomics.lbl.gov/supplemental/bigfit/
The numerical relative growth data would have to be converted to a binary call (growth vs. no growth), e.g. via thresholding.
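A minimal sketch of the thresholding step, assuming the fitness values are log ratios in which strongly negative values indicate a growth defect. The function name `call_growth` and the cutoff of -2.0 are placeholder assumptions, not values taken from the dataset:

```python
def call_growth(log_ratio: float, cutoff: float = -2.0) -> str:
    """Convert a relative fitness value into a binary growth call.

    The cutoff is a hypothetical default; it would need to be tuned
    against the actual distribution of fitness values.
    """
    return "no_growth" if log_ratio <= cutoff else "growth"


# Illustrative values only, not taken from the dataset.
calls = [call_growth(v) for v in (-3.1, -0.2, 0.4)]
```

A single fixed cutoff is the simplest option; combining it with the t-statistic would be a reasonable refinement.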
Just taking the first organism as an example:
http://genomics.lbl.gov/supplemental/bigfit/html/acidovorax_3H11/
On the organism page, under 'Genes', the 'Specific phenotypes' link gives a table of the most significant phenotype per gene for this knockout (KO) dataset:
http://genomics.lbl.gov/supplemental/bigfit/html/acidovorax_3H11/specific_phenotypes
and this file can serve as the primary data source.
These columns provide the following data:
sysName: gene name
desc: description
name: internal name
lrn: log ratio normalized
t: t-statistic
Group: condition group
Condition_1: condition name
Concentration_1: concentration
Units_1: unit
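The column list above could be turned into a small reader along these lines. This is a sketch only: it assumes the table is available as tab-delimited text with a header row using exactly these column names, and the function name `read_specific_phenotypes` and the sample row are invented for illustration:

```python
import csv
import io

# Source column name -> descriptive key, following the list above.
COLUMN_MEANINGS = {
    "sysName": "gene_name",
    "desc": "description",
    "name": "internal_name",
    "lrn": "log_ratio_normalized",
    "t": "t_statistic",
    "Group": "condition_group",
    "Condition_1": "condition_name",
    "Concentration_1": "concentration",
    "Units_1": "unit",
}


def read_specific_phenotypes(handle):
    """Yield one dict per row, keyed by the descriptive names."""
    for row in csv.DictReader(handle, delimiter="\t"):
        record = {new: row[old] for old, new in COLUMN_MEANINGS.items()}
        # The two statistics are numeric; everything else stays as text.
        record["log_ratio_normalized"] = float(record["log_ratio_normalized"])
        record["t_statistic"] = float(record["t_statistic"])
        yield record


# Placeholder row, invented for illustration only (not real data).
sample = (
    "sysName\tdesc\tname\tlrn\tt\tGroup\tCondition_1\tConcentration_1\tUnits_1\n"
    "GENE_X\texample description\texp1\t-3.0\t-8.5\tstress\tcondition Y\t1\tmM\n"
)
records = list(read_specific_phenotypes(io.StringIO(sample)))
```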
For reference, under 'Genes', the 'Gene fitness' link gives the full table of relative fitness values:
http://genomics.lbl.gov/supplemental/bigfit/html/acidovorax_3H11/fit_logratios_good.tab
The rows are keyed by 'locusId' (gene IDs) and the columns are condition (sample) IDs, each including a text description.
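For ingest, a wide gene-by-condition table like this is usually easier to handle as one row per gene-condition pair. The sketch below assumes a tab-delimited file with a `locusId` column; the `annotation_columns` parameter is a hypothetical way to skip any non-numeric columns the real file may contain, and the sample values are invented:

```python
import csv
import io


def wide_to_long(handle, id_column="locusId", annotation_columns=()):
    """Yield (gene_id, condition_label, fitness) from a wide table."""
    skip = {id_column, *annotation_columns}
    for row in csv.DictReader(handle, delimiter="\t"):
        gene_id = row[id_column]
        for column, value in row.items():
            if column in skip:
                continue
            yield gene_id, column, float(value)


# Placeholder table, invented for illustration only (not real data).
sample = (
    "locusId\tcond_A example\tcond_B example\n"
    "1001\t-2.5\t0.1\n"
)
triples = list(wide_to_long(io.StringIO(sample)))
```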
Additional data on each condition is available on the organism page under 'Tables', then 'Experiments', then 'Detailed metadata for experiments':
http://genomics.lbl.gov/supplemental/bigfit/html/acidovorax_3H11/expsUsed
A basic ingest would model this data as mutant alleles or as a gene-condition relation indicating that gene X is essential for growth in condition Y. As key supporting data, the gene annotations should also be ingested:
http://genomics.lbl.gov/supplemental/bigfit/html/acidovorax_3H11/fit_genes.tab
with the caveat that these are free-text annotations and so may require standardization.
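One possible shape for the gene-condition relation is sketched below. The class name, the predicate label `essential_for_growth_in`, and the sample values are all hypothetical; the annotation is carried along as unstandardized free text:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneConditionEdge:
    gene: str
    predicate: str
    condition: str
    annotation: str  # free-text gene annotation, not standardized


def make_edge(gene, condition, annotations):
    """Build one relation record, attaching the gene's annotation if known."""
    return GeneConditionEdge(
        gene=gene,
        predicate="essential_for_growth_in",  # hypothetical predicate label
        condition=condition,
        annotation=annotations.get(gene, "").strip(),
    )


# Placeholder values, invented for illustration only.
edge = make_edge("GENE_X", "condition Y", {"GENE_X": " example annotation "})
```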
Further ingests could include:
http://genomics.lbl.gov/supplemental/bigfit/html/acidovorax_3H11/fit_t.tab