Kim Rutherford edited this page Nov 30, 2024 · 12 revisions

New system: run InterProScan then parse the JSON output

Quick version:

Follow download instructions here: https://interproscan-docs.readthedocs.io/en/latest/HowToDownload.html

INTERPROSCAN_VERSION=interproscan-5.69-101.0
tar -pxvzf $INTERPROSCAN_VERSION-64-bit.tar.gz
cd $INTERPROSCAN_VERSION

nice -19 bash -x ~/git/pombase-domain-process/etc/run_and_process_interpro.sh

If successful, this produces pombe_domain_results.json and japonicus_domain_results.json in the $INTERPROSCAN_VERSION directory. Copy them into place:

cp pombe_domain_results.json /var/pomcur/sources/interpro/
sudo cp japonicus_domain_results.json /var/pomcur/japonicus_sources/japonicus_domain_results.json
sudo chown japonicus:japonicus /var/pomcur/japonicus_sources/japonicus_domain_results.json
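A quick sanity check on the two result files can catch a truncated or failed run before the files are copied. The following is a minimal sketch, assuming python3 is on the PATH; `check_json_file` is a hypothetical helper, not part of pombase-domain-process:

```shell
# Hypothetical helper: succeed only if the file is non-empty and parses as JSON.
check_json_file() {
    local file="$1"
    [ -s "$file" ] && python3 -m json.tool "$file" > /dev/null 2>&1
}

# Run from inside the $INTERPROSCAN_VERSION directory.
for f in pombe_domain_results.json japonicus_domain_results.json; do
    if check_json_file "$f"; then
        echo "OK: $f"
    else
        echo "FAILED: $f is missing, empty or not valid JSON" >&2
    fi
done
```

This only checks that each file is well-formed JSON; it says nothing about whether the domain data inside is complete.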

Alternative: run InterProScan manually

Follow download instructions here: https://interproscan-docs.readthedocs.io/en/latest/HowToDownload.html

INTERPROSCAN_VERSION=interproscan-5.69-101.0
tar -pxvzf $INTERPROSCAN_VERSION-64-bit.tar.gz
cd $INTERPROSCAN_VERSION
python3 setup.py -f interproscan.properties

Get protein data:

curl https://curation.pombase.org/dumps/latest_build/fasta/feature_sequences/peptide.fa.gz | gzip -d | perl -pne 's/\*$//' > pombe_peptide.fa
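The perl step above strips the trailing `*` (stop marker) from sequence lines. A rough way to confirm the download worked and the cleanup took effect is sketched below; both helper names are hypothetical:

```shell
# Hypothetical helper: print the number of FASTA records (header lines).
count_fasta_records() {
    grep -c '^>' "$1"
}

# Hypothetical helper: succeed if no line still ends with a "*" stop marker.
has_no_trailing_stops() {
    ! grep -q '\*$' "$1"
}

if [ -f pombe_peptide.fa ]; then
    echo "records: $(count_fasta_records pombe_peptide.fa)"
    has_no_trailing_stops pombe_peptide.fa \
        || echo "some sequences still end with '*'" >&2
fi
```

A record count of 0 usually means the download failed or returned something other than a gzipped FASTA file.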

Run InterProScan:

PATH=/usr/local/jdk-14.0.1/bin:$PATH nice -19 ./interproscan.sh -i pombe_peptide.fa -f json

Run domain processor:

PATH=/usr/local/tmhmm-2.0c/bin:$PATH /var/pomcur/bin/pombase-domain-process -v $INTERPROSCAN_VERSION -p pombe_peptide.fa -i pombe_peptide.fa.json -o pombe_domain_results.json

If successful, move the results into place:

mv pombe_domain_results.json /var/pomcur/sources/interpro/
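To make the "if successful" condition explicit, the move can be gated on the exit status of the previous command. This is a sketch only; `run_then_install` is a hypothetical wrapper and assumes the domain processor exits with a non-zero status on failure:

```shell
# Hypothetical helper: run a command and, only if it exits with status 0 and
# the named output file is non-empty, move that file into the destination
# directory. Otherwise leave the file where it is and return 1.
run_then_install() {
    local output="$1" dest_dir="$2"
    shift 2
    if "$@" && [ -s "$output" ]; then
        mv "$output" "$dest_dir"/
    else
        echo "not installing $output: command failed or output is empty" >&2
        return 1
    fi
}

# Example usage (env is used so the PATH prefix applies to the command):
# run_then_install pombe_domain_results.json /var/pomcur/sources/interpro \
#     env PATH=/usr/local/tmhmm-2.0c/bin:"$PATH" \
#     /var/pomcur/bin/pombase-domain-process -v "$INTERPROSCAN_VERSION" \
#     -p pombe_peptide.fa -i pombe_peptide.fa.json -o pombe_domain_results.json
```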

Old system: parse match_complete.xml.gz

Create a backup of /var/pomcur/sources/interpro/pombe_domain_results.json
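One possible way to make the backup is a dated copy next to the original. The helper name and the `.bak` naming scheme below are assumptions, not an established convention on this server:

```shell
# Hypothetical helper: copy a file to <file>.<YYYY-MM-DD>.bak alongside the
# original, preserving mode and timestamps, and print the backup's path.
backup_with_date() {
    local src="$1"
    local dest
    dest="${src}.$(date +%F).bak"
    cp -p "$src" "$dest" && echo "$dest"
}

RESULTS=/var/pomcur/sources/interpro/pombe_domain_results.json
if [ -f "$RESULTS" ]; then
    backup_with_date "$RESULTS"
fi
```

Running it twice on the same day overwrites the earlier backup from that day.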

Then, in a temporary directory:

  • download XML with:
    wget -N ftp://ftp.ebi.ac.uk/pub/databases/interpro/current/match_complete.xml.gz

  • then run:

    PATH=/usr/local/tmhmm-2.0c/bin:$PATH /var/pomcur/bin/pombase-domain-process -p "postgres://kmr44:kmr44@localhost/pombase-build-2018-03-20" -i <(gzip -d < match_complete.xml.gz) -o pombe_domain_results.json

    (replace pombase-build-2018-03-20 with the name of the latest database)

  • move pombe_domain_results.json to /var/pomcur/sources/interpro/ on success.