Dockerfiles and updated Readme
lefterav committed May 12, 2021
1 parent 3c7a237 commit 9bfe5a9
Showing 6 changed files with 291 additions and 1 deletion.
64 changes: 63 additions & 1 deletion README.md
@@ -1,2 +1,64 @@
# marian-docker
Dockerfiles for providing a compiled version of the Marian neural machine translation toolkit
Dockerfiles and instructions for providing a compiled version of Marian. Marian is "an efficient, free Neural Machine Translation framework written in pure C++ with minimal dependencies." [1] The images are based on Ubuntu environments with CUDA support, for running on GPU servers.

# Introduction #

By building or simply pulling these Docker images, users get a fully functioning Marian installation without having to go through the complicated and relatively slow process of compiling the toolkit themselves. The latest image follows the instructions of the official website and provides the executables for training, decoding and starting a server, including support for SentencePiece, CPU processing and MPI. Additional tools such as moses-scripts and sacrebleu are also provided, to allow easy reproduction of the example experiments.

The images do not contain any trained models. Additionally, users are advised to mount a local directory into the container, so that they can save any produced output. This is an entirely unofficial release, with no relation to the original Marian developers, and is provided as is, without any warranty or support.

# Usage instructions #

## Pulling from dockerhub ##

Compiled images are uploaded to Docker Hub, so they can be pulled with

```
docker pull lefterav/marian-nmt:1.10.0_sentencepiece_cuda-11.3.0
```

## Starting the container ##

After pulling the image, the container can be started with the following command (replace <username> with your Linux account username).

```
docker run -v /home/<username>:/home/<username> -e HOME=/home/<username> -it lefterav/marian-nmt:1.10.0_sentencepiece_cuda-11.3.0
```

This opens a command line inside the container, where you can run `marian` commands. Additionally, the `-v` parameter makes your home folder accessible to the container, so that the results of the experiments can be saved (otherwise they would be lost when the container is removed).

In some server installations, the administrators suggest not using the home folder but other designated storage units; mount these in the same way with additional `-v` parameters.
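
As a rough illustration of combining several mounts, the sketch below assembles the `docker run` command line from a home directory plus any number of extra storage directories and prints it instead of running it. The helper name `marian_run_command` and the example paths are hypothetical; only the image name and the `-v`, `-e` and `-it` options are taken from the command above.

```shell
#!/bin/sh
# Sketch only: prints a docker run command, it does not start a container.
IMAGE="lefterav/marian-nmt:1.10.0_sentencepiece_cuda-11.3.0"

# marian_run_command HOME_DIR [EXTRA_DIR...]   (hypothetical helper)
# Every directory is mounted at the same path inside the container.
marian_run_command() {
    home_dir="$1"
    shift
    cmd="docker run -v ${home_dir}:${home_dir} -e HOME=${home_dir}"
    # Append one -v option per additional storage directory
    for dir in "$@"; do
        cmd="${cmd} -v ${dir}:${dir}"
    done
    printf '%s -it %s\n' "$cmd" "$IMAGE"
}
```

For example, `marian_run_command "/home/$USER" /data/nmt` prints a command that mounts both directories. Paths containing spaces are not handled by this sketch.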

## Running Marian commands ##

The Marian executable is located at `/marian/build/marian`. You can test whether Marian works by running

```
/marian/build/marian --help
```
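
A slightly broader check can be sketched as follows. It looks for the training, decoding and server executables in the build directory and reports which ones are present. The helper `check_marian_build` is hypothetical, and the names `marian-decoder` and `marian-server` are assumed from the usual Marian build output; they are not listed in this README.

```shell
#!/bin/sh
# Sketch only: reports which expected executables exist in a build directory.
# The tool names other than "marian" are assumptions.

# check_marian_build [BUILD_DIR]   (defaults to /marian/build)
check_marian_build() {
    build_dir="${1:-/marian/build}"
    missing=0
    for tool in marian marian-decoder marian-server; do
        if [ -x "$build_dir/$tool" ]; then
            echo "found:   $build_dir/$tool"
        else
            echo "missing: $build_dir/$tool"
            missing=1
        fi
    done
    # Exit status is 0 only if every tool was found
    return "$missing"
}
```

Inside the container, `check_marian_build` without arguments inspects `/marian/build`.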

When the Marian container starts, the command line is already in the `examples` directory of Marian. The bash scripts in its subdirectories run the example experiments. It is recommended to save the data and the models to a mounted folder:

```
cd transformer
mkdir -p ~/myexperiment/model ~/myexperiment/data
ln -s ~/myexperiment/model .
ln -s ~/myexperiment/data .
bash run-me.sh
```

The above example creates the directories for the data and the models in your home folder and then creates symbolic links to them in the example directory, so that the data and the models remain even if you exit the container. Then, the bash script downloads the required data and trains a transformer system. Read the respective Readme file for more information.
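
The same linking step can be wrapped in a small helper that also prints where each link points, which makes it easy to confirm that output lands in the mounted folder. This is a minimal sketch; `link_experiment_dirs` is a hypothetical name and not part of Marian or its examples.

```shell
#!/bin/sh
# Sketch only: create model/ and data/ under a persistent directory and
# link them into an example directory.

# link_experiment_dirs BASE_DIR EXAMPLE_DIR   (hypothetical helper)
link_experiment_dirs() {
    base_dir="$1"      # persistent (mounted) directory, e.g. ~/myexperiment
    example_dir="$2"   # example directory, e.g. transformer
    for name in model data; do
        mkdir -p "$base_dir/$name"
        # -f replaces an existing link; -n stops ln from following an
        # existing link into the directory it points to
        ln -sfn "$base_dir/$name" "$example_dir/$name"
        printf '%s -> %s\n' "$example_dir/$name" "$(readlink "$example_dir/$name")"
    done
}
```

From the directory that contains `transformer`, `link_experiment_dirs ~/myexperiment transformer` reproduces the `mkdir` and `ln` commands shown earlier; `bash run-me.sh` still has to be started separately from inside `transformer`.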

You can exit the container by pressing Ctrl+D or typing `exit`.

# Advanced: compiling the dockerfiles #

We provide different dockerfiles for different versions of Ubuntu and CUDA; these can be found in the `dockerfiles/` subdirectory. An image can be built with the following command:

```
docker build --tag <username>/marian-nmt:1.10.0_sentencepiece_cuda-11.3.0 - < dockerfiles/marian_1.10.0_sentencepiece_cuda_11.3.0.Dockerfile
```

This makes it possible to build your own containers with a modified setup.
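
When building several variants, the tag can be derived from the Dockerfile name. The sketch below prints the build command for a given file rather than running it. The mapping rule (drop the `marian_` prefix and the `.Dockerfile` extension, turn `cuda_` into `cuda-`) is an assumption inferred from the single example in this README, and both helper names are hypothetical.

```shell
#!/bin/sh
# Sketch only: derive an image tag from a Dockerfile name and print the
# matching docker build command. The naming rule is an assumption.

# dockerfile_to_tag PATH
dockerfile_to_tag() {
    name=$(basename "$1" .Dockerfile)   # drop directory and extension
    name=${name#marian_}                # drop the leading "marian_"
    printf '%s\n' "$name" | sed 's/cuda_/cuda-/'
}

# build_command USERNAME PATH
build_command() {
    printf 'docker build --tag %s/marian-nmt:%s - < %s\n' \
        "$1" "$(dockerfile_to_tag "$2")" "$2"
}
```

Calling `build_command <username> dockerfiles/marian_1.10.0_sentencepiece_cuda_11.3.0.Dockerfile` prints the same command as shown in this section.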


[1] https://marian-nmt.github.io/



12 changes: 12 additions & 0 deletions dockerfiles/.history
@@ -0,0 +1,12 @@
2021-05-04.21-46-22 vim marian_1.10.0_sentencepiece.Dockerfile
2021-05-04.21-47-07 ls Dockerfile
2021-05-04.21-49-42 less marian_1.10.0_sentencepiece.Dockerfile
2021-05-04.21-49-54 mv marian_1.10.0_sentencepiece.Dockerfile marian_1.10.0_no-sentencepiece.Dockerfile
2021-05-04.21-51-44 cp marian_1.10.0_no-sentencepiece.Dockerfile marian_1.10.0_sentencepiece.Dockerfile
2021-05-04.21-51-56 gedit marian_1.9.0_sentencepiece.Dockerfile &
2021-05-04.21-53-22 vim marian_1.10.0_sentencepiece.Dockerfile
2021-05-04.22-01-23 vim marian_1.10.0_sentencepiece.Dockerfile
2021-05-05.01-13-45 cp marian_1.10.0_sentencepiece.Dockerfile marian_1.10.0_sentencepiece_cuda_11.3.0.Dockerfile
2021-05-05.01-14-14 vim marian_1.10.0_sentencepiece_cuda_11.3.0.Dockerfile
2021-05-05.01-15-00 docker build --tag lefterav/marian-nmt:1.10.0_sentencepiece_cuda-11.3.0 - < dockerfiles/marian_1.10.0_sentencepiece_cuda_11.3.0.Dockerfile
2021-05-05.01-15-07 docker build --tag lefterav/marian-nmt:1.10.0_sentencepiece_cuda-11.3.0 - < dockerfiles/marian_1.10.0_sentencepiece_cuda_11.3.0.Dockerfile
46 changes: 46 additions & 0 deletions dockerfiles/marian_1.10.0_no-sentencepiece.Dockerfile
@@ -0,0 +1,46 @@
FROM nvidia/cuda:9.2-devel-ubuntu18.04

MAINTAINER Eleftherios Avramidis <[email protected]>
LABEL description="Basic Marian 1.10.0 nvidia-docker container for Ubuntu 18.04 "

ENV MARIANPATH /marian
ENV TOOLSDIR /tools

# Install necessary system packages
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get upgrade -yq && apt-get install -yq \
build-essential \
git-core \
pkg-config \
libtool \
zlib1g-dev \
libbz2-dev \
automake \
python-dev \
perl \
libsparsehash-dev \
libboost-all-dev \
openssl \
libssl-dev \
libgoogle-perftools-dev \
wget \
apt-transport-https ca-certificates gnupg software-properties-common \
cmake \
vim nano unzip gzip python-pip php \
&& rm -rf /var/lib/apt/lists/*

# Install Marian
RUN git clone --depth 1 --branch 1.10.0 https://github.com/marian-nmt/marian
WORKDIR $MARIANPATH
RUN mkdir -p build
WORKDIR $MARIANPATH/build
RUN cmake .. && make -j8

# Install tools
RUN pip install langid

WORKDIR $MARIANPATH/examples/tools
RUN make

# Direct the user to the examples directory
WORKDIR $MARIANPATH/examples/
49 changes: 49 additions & 0 deletions dockerfiles/marian_1.10.0_sentencepiece.Dockerfile
@@ -0,0 +1,49 @@
FROM nvidia/cuda:9.2-devel-ubuntu18.04

MAINTAINER Eleftherios Avramidis <[email protected]>
LABEL description="Basic Marian 1.10.0 nvidia-docker container for Ubuntu 18.04 with included Sentencepiece"

ENV MARIANPATH /marian
ENV TOOLSDIR /tools

# Install necessary system packages
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get upgrade -yq && apt-get install -yq \
build-essential \
git-core \
pkg-config \
libtool \
zlib1g-dev \
libbz2-dev \
automake \
python-dev \
perl \
libsparsehash-dev \
libboost-all-dev \
libprotobuf10 \
protobuf-compiler \
libprotobuf-dev \
openssl \
libssl-dev \
libgoogle-perftools-dev \
wget \
apt-transport-https ca-certificates gnupg software-properties-common \
cmake \
vim nano unzip gzip python-pip php \
&& rm -rf /var/lib/apt/lists/*

# Install Marian
RUN git clone --depth 1 --branch 1.10.0 https://github.com/marian-nmt/marian
WORKDIR $MARIANPATH
RUN mkdir -p build
WORKDIR $MARIANPATH/build
RUN cmake .. -DCMAKE_BUILD_TYPE=Release -DUSE_SENTENCEPIECE=ON && make -j8

# Install tools
RUN pip install langid

WORKDIR $MARIANPATH/examples/tools
RUN make

# Direct the user to the examples directory
WORKDIR $MARIANPATH/examples/
51 changes: 51 additions & 0 deletions dockerfiles/marian_1.10.0_sentencepiece_cuda_11.3.0.Dockerfile
@@ -0,0 +1,51 @@
FROM nvidia/cuda:11.3.0-devel-ubuntu20.04

MAINTAINER Eleftherios Avramidis <[email protected]>
LABEL description="Basic Marian 1.10.0 nvidia-docker container for Ubuntu 20.04 with CUDA 11.3 support"

ENV MARIANPATH /marian
ENV TOOLSDIR /tools

# Install necessary system packages
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get upgrade -yq && apt-get install -yq \
build-essential \
git-core \
pkg-config \
libtool \
zlib1g-dev \
libbz2-dev \
automake \
python-dev \
perl \
libsparsehash-dev \
libboost-all-dev \
libprotobuf17 \
protobuf-compiler \
libprotobuf-dev \
openssl \
libssl-dev \
libgoogle-perftools-dev \
wget \
apt-transport-https ca-certificates gnupg software-properties-common \
cmake \
vim nano unzip gzip python3-pip php \
&& rm -rf /var/lib/apt/lists/*

RUN wget -qO- 'https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB' | apt-key add - && sh -c 'echo deb https://apt.repos.intel.com/mkl all main > /etc/apt/sources.list.d/intel-mkl.list' && apt-get update && apt-get install -yq intel-mkl-64bit-2020.0-088

# Install Marian
RUN git clone --depth 1 --branch 1.10.0 https://github.com/marian-nmt/marian
WORKDIR $MARIANPATH
RUN mkdir -p build
WORKDIR $MARIANPATH/build
RUN cmake .. -DCMAKE_BUILD_TYPE=Release -DUSE_SENTENCEPIECE=ON -DUSE_MPI=ON -DCOMPILE_CPU=on -DCOMPILE_SERVER=on && make -j8

# Install tools
RUN pip3 install langid

WORKDIR $MARIANPATH/examples/tools
RUN make

# Direct the user to the examples directory
WORKDIR $MARIANPATH/examples/
70 changes: 70 additions & 0 deletions dockerfiles/marian_1.9.0_sentencepiece.Dockerfile
@@ -0,0 +1,70 @@
FROM nvidia/cuda:9.2-devel-ubuntu18.04

MAINTAINER Eleftherios Avramidis <[email protected]>
LABEL description="Basic Marian nvidia-docker container for Ubuntu 18.04 with included Sentencepiece"

# Install some necessary tools.


RUN apt-get update && apt-get install -y \
build-essential \
git-core \
pkg-config \
libtool \
zlib1g-dev \
libbz2-dev \
automake \
python-dev \
perl \
libsparsehash-dev \
libboost-all-dev \
libprotobuf10 \
protobuf-compiler \
libprotobuf-dev \
openssl \
libssl-dev \
libgoogle-perftools-dev \
#doxygen \
#graphviz \
wget \
apt-transport-https ca-certificates gnupg software-properties-common \
cmake \
&& rm -rf /var/lib/apt/lists/*

# Install cmake 3.15.2
#RUN wget https://github.com/Kitware/CMake/releases/download/v3.15.2/cmake-3.15.2.tar.gz; tar -zxvf cmake-3.15.2.tar.gz
#ENV CMAKEPATH /cmake-3.15.2
#WORKDIR $CMAKEPATH
#RUN ./bootstrap --parallel=8 && make -j 8 && make install

#RUN add-apt-repository ppa:ubuntu-toolchain-r/test
#RUN wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | tee /etc/apt/trusted.gpg.d/kitware.gpg >/dev/null
#RUN apt-add-repository 'deb https://apt.kitware.com/ubuntu/ bionic main'
#RUN apt-get update && apt-get install -y cmake

# Install intel mkl
#RUN wget -qO- 'https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB' | apt-key add -
#RUN sh -c 'echo deb https://apt.repos.intel.com/mkl all main > /etc/apt/sources.list.d/intel-mkl.list'
#RUN apt-get update && apt-get install -y gcc-8 g++-8 cmake
#RUN apt-get update && apt-get install -y cmake
#intel-mkl-64bit-2020.0-088
#RUN update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-8 60 --slave /usr/bin/g++ g++ /usr/bin/g++-8
#ENV CXXFLAGS "$CXXFLAGS -std=c++14"

RUN cmake --version

# Install Marian
RUN git clone https://github.com/marian-nmt/marian
ENV MARIANPATH /marian
WORKDIR $MARIANPATH
RUN mkdir -p build
WORKDIR $MARIANPATH/build
RUN cmake .. -DCMAKE_BUILD_TYPE=Release -DUSE_SENTENCEPIECE=ON && make -j8

# Install SacreBLEU
RUN git clone https://github.com/marian-nmt/sacreBLEU.git sacreBLEU

ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -yq vim nano unzip gzip python-pip php
RUN pip install langid
# Docker does not expand "~" in WORKDIR, so name root's home directory explicitly
WORKDIR /root
