chunjy92/Aligners

Aligners

Nov 20, 2022

  • UPDATE: March 2024

1. Setup

A. JAMR

  • java version: 1.8.0_392
    • installed as Eclipse Temurin (via SDKMAN!)
  • sbt version: 1.0.2
    • should be reflected in project/build.properties
    • installed with SDKMAN!
  • scala version: 2.11.12
    • should be reflected in build.sbt and scripts/config.sh
    • installed with SDKMAN!
  • see modified project/plugins.sbt
    • see references below
  • make executables
    • chmod +x setup compile sbt scripts/config.sh
  • run sbt first! (the installed sbt, not the bundled ./sbt)
    • only then run setup and the other scripts
  • see jamr_post_setup.sh, which fixes an outdated Perl expression
  • references:
    1. jflanigan/jamr#44
    2. https://github.com/DreamerDeo/JAMR
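
The pinned versions above can be checked before building. The following is a minimal sketch, not part of the repository: check_jamr_versions is a hypothetical helper, and it assumes the usual sbt conventions (an sbt.version=... line in project/build.properties and a scalaVersion := "..." setting in build.sbt). It does not inspect scripts/config.sh, since the variable name used there is not stated here.

```shell
# Hypothetical helper (not part of this repo): verify that a JAMR checkout
# pins sbt 1.0.2 and Scala 2.11.12 in the standard sbt locations.
# Usage: check_jamr_versions /path/to/jamr
check_jamr_versions() {
    cjv_dir="${1:-.}"
    cjv_status=0

    # sbt reads its own version from project/build.properties.
    if ! grep -Eq '^sbt\.version[[:space:]]*=[[:space:]]*1\.0\.2[[:space:]]*$' \
        "$cjv_dir/project/build.properties" 2>/dev/null; then
        echo "sbt.version=1.0.2 not found in project/build.properties"
        cjv_status=1
    fi

    # The Scala version is assumed to be set via scalaVersion := "..." in build.sbt.
    if ! grep -Eq 'scalaVersion[[:space:]]*:=[[:space:]]*"2\.11\.12"' \
        "$cjv_dir/build.sbt" 2>/dev/null; then
        echo "scalaVersion 2.11.12 not found in build.sbt"
        cjv_status=1
    fi

    return "$cjv_status"
}
```

Calling check_jamr_versions on the JAMR directory prints one line per mismatch and returns non-zero if any pin is missing, so it can gate the sbt run.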

B. ISI

  • requires python2
    • pip2 install virtualenv, then python2 -m virtualenv python2-env
    • by default the python2-env virtualenv lives inside the ISI dir; alternatively, set up a new one separately
    • UPDATE: pyenv virtualenv 2.7.17 isi
  • mgizapp requires the Boost C++ libraries
    • sudo apt-get install libboost-all-dev
  • read instructions in INSTALL inside mgizapp
  • running scripts/jamr2isi.py on JAMR output prepares input data in required format

C. LEAMR

  • requires python3 (3.7.3 or 3.7.6 on ubuntu 22.04)
    • UPDATE: 3.7.17 seems required, along with spacy==2.3.7
  • uses a pyenv virtualenv by default; or set up a local venv separately and configure the script accordingly
  • torch 1.13.1+cu117
  • relies on Stanza and spaCy, whose versions may differ from what the model uses
    • set up Stanza with STANZA_RESOURCES_DIR=./leamr/leamr_stanza_resources python -c "import stanza; stanza.download('en')"
  • also need to set up neuralmonkey separately
    • pip install cython==0.29 --upgrade
  • don't forget python -m spacy download en
  • contains minor modifications:
    • doesn't drop .txt from the output filenames
    • uses pre-tokenized outputs
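
Because LEAMR appears sensitive to the exact interpreter version, it may help to compare the active interpreter against the expected one before installing packages. This is a minimal sketch: version_matches is a hypothetical helper, not something shipped with LEAMR or this repo.

```shell
# Hypothetical helper: succeed only when a `python --version` string
# reports exactly the expected version.
#   $1: output of `python --version`, e.g. "Python 3.7.17"
#   $2: expected version, e.g. "3.7.17"
version_matches() {
    # Strip the leading "Python " and compare the remainder verbatim.
    [ "${1#Python }" = "$2" ]
}

# Example call (commented out so that sourcing this file has no side effects):
# version_matches "$(python --version 2>&1)" "3.7.17" \
#     || echo "active python is not 3.7.17; check the pyenv virtualenv"
```

The 2>&1 in the example is there because Python 2 prints its version to stderr, so the same helper can also be pointed at the ISI environment.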

2. Run

You may have to set pyenv global so that both Python 2 and Python 3 are available, e.g.

pyenv global 2.7.15
  • since Python 3 is usually pre-installed, this exposes both python2 and python3.

To run the aligners in sequence (JAMR -> ISI -> LEAMR):

./scripts/run_aligners.sh [input_file]
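
Since the pipeline needs several toolchains at once, a quick preflight check can catch a missing interpreter before the first aligner starts. The sketch below is an assumption-laden illustration: missing_tools is a hypothetical helper, and the list of commands to check (for example java, sbt, python2, python3) is inferred from the setup notes rather than taken from run_aligners.sh itself.

```shell
# Hypothetical preflight helper: print each named command that is not on PATH
# and return non-zero if at least one is missing.
missing_tools() {
    mt_status=0
    for mt_cmd in "$@"; do
        if ! command -v "$mt_cmd" >/dev/null 2>&1; then
            echo "$mt_cmd"
            mt_status=1
        fi
    done
    return "$mt_status"
}

# Example call (commented out; input.txt is a placeholder file name):
# missing_tools java sbt python2 python3 && ./scripts/run_aligners.sh input.txt
```

If python2 shows up as missing, the pyenv global setting described above is the first thing to check.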
