add rule for ocrd-tool-all.json, reduce image size, fix+update modules, fix CUDA #362

Merged: 64 commits, Jun 14, 2023
f8cfe20
add rule for ocrd-tool-all.json
bertsky Mar 28, 2023
3bc8d6a
Update makedocker.yml
bertsky Mar 28, 2023
2835c6c
docker rmi: fix argument
bertsky Mar 28, 2023
39955c9
docker rmi: avoid assuming which Ubuntu is installed
bertsky Mar 28, 2023
3e4f209
reinstate -git variants
bertsky Mar 28, 2023
c50dfa3
generate ocrd-all-tool.json: fix image name
bertsky Mar 28, 2023
a0dbe73
fix docker run (needs -i)
bertsky Mar 28, 2023
672eac3
fix make ocrd-all-tool.json
bertsky Mar 28, 2023
bc364d7
make ocrd-all-tool.json: avoid git actions
bertsky Mar 28, 2023
8cd239c
add SSH session for debugging
bertsky Mar 28, 2023
a799797
make ocrd-all-tool.json: try outside of Docker
bertsky Mar 28, 2023
32e1538
make ocrd-all-tool.json: add venv dependency
bertsky Mar 28, 2023
e746a88
TF1: exclude nvidia-tensorflow==1.15.5+nv23.3
bertsky Mar 29, 2023
72b3738
downgrade protobuf
bertsky Mar 29, 2023
353fa44
hold Numpy for ocrd_cis
bertsky Mar 29, 2023
5b1cb13
update core
bertsky Apr 15, 2023
0e21c7f
also hold OpenCV for ocrd_cis
bertsky Apr 20, 2023
5a67884
post-update Shapely after (ocrd_)kraken
bertsky Apr 20, 2023
a24bf17
move ocrd_detectron2 to top venv but order before typegroups_classifier
bertsky Apr 20, 2023
5b1ce3e
hold TF1 via nvidia-tensorflow at nv22.12 still compatible with CUDA …
bertsky Apr 20, 2023
18a027b
docker*: prefer plain progress meter on buildkit
bertsky Apr 20, 2023
995eefb
venv/pip updates: rehash to ensure pip is dereferenced correctly
bertsky Apr 20, 2023
15e8d77
tidy: new variant of clean with extra recursive git clean and git gc
bertsky Apr 20, 2023
c06998d
CI: build on newer base image to get 'git clone --single-branch' for …
bertsky Apr 20, 2023
1f6ac8f
docker*: revert 7a5ff45 to have all intermediate deps in 1 layer again
bertsky Apr 21, 2023
2aa21a2
docker*: rm /.cache regardless of editable/-git or not
bertsky Apr 21, 2023
2d7b4c7
docker*: no more git update in Docker layers (only on build context v…
bertsky Apr 21, 2023
3f0e9e3
docker*: move make check to separate step
bertsky Apr 21, 2023
f7abce1
add make testcuda for diagnostics
bertsky Apr 21, 2023
bad270f
kraken workaround not needed anymore
bertsky Apr 21, 2023
46c8722
update modules
bertsky Apr 21, 2023
7852fcb
update core to v2.50
bertsky Apr 24, 2023
446b32e
opencv-python: adapt to pip wheel builds
bertsky Apr 26, 2023
d6d1bf0
remove explicit dependency for numpy
bertsky Apr 26, 2023
db16444
replace findstring with filter (no substring matches)
bertsky Apr 26, 2023
55849c0
honour PIP_OPTIONS=-e again
bertsky Apr 26, 2023
47d837f
get tesserocr from PyPI if not enabled
bertsky Apr 26, 2023
884aae5
get ocrd from PyPI if core not enabled
bertsky Apr 26, 2023
4a08f40
install ocrd_detectron2 before ocrd_kraken (better Pytorch installer)
bertsky Jun 1, 2023
2cb4420
update opencv-python (with fixes for py38)
bertsky Jun 1, 2023
f972fb1
update modules
bertsky Jun 2, 2023
fc35815
docker-*-cuda: workaround for conflicting cuDNN version (TF/Torch)
bertsky Jun 2, 2023
b2adc3d
apply suggestions from review
bertsky Jun 2, 2023
e5fb240
docker-*-cuda: workaround for conflicting cuDNN version (TF/Torch)
bertsky Jun 5, 2023
d6fd7fb
update ocrd_fileformat and ocrd_kraken
bertsky Jun 6, 2023
421baea
docker*: always editable, *-git only as alias, never rm /build
bertsky Jun 7, 2023
b5b3602
update submodules
bertsky Jun 9, 2023
d68fd0c
docker*cuda: move fix-cuda to makefile, add deps-cuda from core
bertsky Jun 9, 2023
c106b1a
Merge remote-tracking branch 'bertsky/fix-opencv-and-tesserocr' into …
bertsky Jun 9, 2023
08f39d8
update submodules
bertsky Jun 9, 2023
c37f993
downgrade eynollah
bertsky Jun 10, 2023
1757708
update core (deps-cuda)
bertsky Jun 11, 2023
8044756
add 'test-core' and 'test-workflow', improve 'help'
bertsky Jun 11, 2023
482f364
update ocrd_kraken (default to device=cuda:0), adapt test-workflow
bertsky Jun 11, 2023
c7c170b
update/improve readme
bertsky Jun 12, 2023
5f6a27b
improve readme markup
bertsky Jun 12, 2023
8c62507
improve/fix docker rules
bertsky Jun 12, 2023
93b445f
:memo: changelog
kba Jun 12, 2023
edb8f23
switch detectron2/kraken dependency
bertsky Jun 13, 2023
e4fe65d
update changelog again
bertsky Jun 13, 2023
626110a
GHA makedocker: add input switch for upterm console
bertsky Jun 13, 2023
6131c46
GHA makedocker: move upterm console step before build
bertsky Jun 13, 2023
1a5a49a
GHA makedocker: workaround for input boolean vs string mixup
bertsky Jun 14, 2023
4ecde60
docker*: avoid unconstrained parallelism (which leads to deadlock)
bertsky Jun 14, 2023
9 changes: 5 additions & 4 deletions .circleci/config.yml
@@ -2,16 +2,17 @@ version: 2.1
jobs:
build:
docker:
- image: cimg/base:stable-18.04
- image: cimg/base:stable-22.04
steps:
- checkout
- setup_remote_docker # https://circleci.com/docs/2.0/building-docker-images/
- run:
name: build image
command: make docker-maximum-cuda
command: make docker-maximum-cuda GIT_DEPTH=--single-branch
no_output_timeout: 30m
- when:
# takes too long for 1h1m CircleCI timeout overall
# also, storage is limited...
condition: false
steps:
- run:
@@ -26,7 +27,7 @@ jobs:
destination: artifacts
deploy:
docker:
- image: cimg/base:stable-18.04
- image: cimg/base:stable-22.04
environment:
GIT_DEPTH: "--depth 1"
parameters:
@@ -38,7 +39,7 @@ jobs:
- setup_remote_docker # https://circleci.com/docs/2.0/building-docker-images/
- run:
name: Build Docker image
command: make docker-<< parameters.variant >>-git
command: make docker-<< parameters.variant >>-git GIT_DEPTH=--single-branch
# fails due to pip races: DOCKER_PARALLEL=-j3
no_output_timeout: 30m
- run:
133 changes: 69 additions & 64 deletions .github/workflows/makedocker.yml
@@ -7,29 +7,10 @@ on:
# Trigger workflow in GitHub web frontend or from API.
workflow_dispatch:
inputs:
os:
description: 'Operating system'
required: true
default: 'ubuntu-18.04'
type: choice
options:
- 'ubuntu-18.04'
- 'ubuntu-20.04'
python-version:
description: 'Python version'
required: true
default: '3.6'
type: choice
options:
- '3.6'
- '3.7'
- '3.8'
- '3.9'
- '3.10'
docker-image:
description: 'Docker image'
required: true
default: 'docker-minimum'
default: 'minimum'
type: choice
options:
- 'minimum'
@@ -44,80 +25,104 @@ on:
- 'medium-cuda-git'
- 'maximum-git'
- 'maximum-cuda-git'
upload-docker-image:
description: 'Upload Docker image'
upload-dockerhub:
description: 'Upload Docker image to Dockerhub'
default: False
type: boolean
upload-github:
description: 'Upload Docker image to Github Container Registry'
default: False
type: boolean
upterm-session:
description: 'Run SSH login server for debugging'
default: False
type: boolean
# not yet:
#push:
# branches: [ "master" ]

jobs:
make:
runs-on: ${{ github.event.inputs.os }}

env:
PYTHON_VERSION: ${{ github.event.inputs.python-version }}
runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v3
- uses: actions/setup-python@v4
with:
python-version: ${{ env.PYTHON_VERSION }}
- name: Show Python3 version
run: python3 --version
- name: Show disk usage of Homebrew, Android and .NET
run: sudo du -mscx /home/linuxbrew /usr/local/lib/android /usr/share/dotnet 2>/dev/null || true
- name: Remove Docker images
run: |
df -h
docker images
docker rmi alpine:3.12 alpine:3.13 alpine:3.14
docker rmi buildpack-deps:stretch buildpack-deps:buster buildpack-deps:bullseye
docker rmi debian:9 debian:10 debian:11
docker rmi moby/buildkit:latest
docker rmi node:12-alpine node:14-alpine node:16-alpine
docker rmi node:12 node:14 node:16
if false; then # don't remove Ubuntu images
docker rmi ubuntu:16.04 ubuntu:18.04 ubuntu:20.04
fi
docker images
docker rmi $(docker images --filter=reference="alpine:*" -q)
docker rmi $(docker images --filter=reference="buildpack-deps:*" -q)
docker rmi $(docker images --filter=reference="debian:*" -q)
docker rmi $(docker images --filter=reference="node:*" -q)
df -h /
- name: Remove unneeded Debian packages
run: |
if false; then # skip time consuming package uninstall
sudo apt-get install -y deborphan
deborphan -a | sort
sudo apt-get purge -y $(deborphan -a|fgrep main/cli-mono|while read dummy package; do echo $package; done)
sudo apt-get purge -y $(deborphan -a|fgrep main/database|while read dummy package; do echo $package; done)
sudo apt-get purge -y $(deborphan -a|fgrep main/devel|while read dummy package; do echo $package; done)
sudo apt-get purge -y $(deborphan -a|fgrep main/httpd|while read dummy package; do echo $package; done)
sudo apt-get purge -y $(deborphan -a|fgrep main/php|while read dummy package; do echo $package; done)
sudo apt-get purge -y $(deborphan -a|fgrep main/vcs|while read dummy package; do echo $package; done)
sudo apt-get purge -y $(deborphan -a | fgrep -e main/cli-mono -e main/database -e main/devel -e main/httpd -e main/php -e main/vcs | while read _ pkg; do echo $pkg; done)
deborphan | sort
sudo du -mscx /* 2>/dev/null || true
sudo du -mscx /opt/* 2>/dev/null || true
sudo du -mscx /usr/* 2>/dev/null || true
df -h /
fi
- name: Remove Homebrew, Android and .NET
run: |
# https://github.com/actions/virtual-environments/issues/2606#issuecomment-772683150
# NONINTERACTIVE=1 /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/uninstall.sh)"
sudo rm -rf /home/linuxbrew # will release Homebrew
sudo rm -rf /usr/local/lib/android # will release about 10 GB if you don't need Android
sudo rm -rf /usr/share/dotnet # will release about 20GB if you don't need .NET
sudo rm -rf /opt/ghc
sudo rm -rf /usr/local/share/boost
sudo rm -rf "$AGENT_TOOLSDIRECTORY"
sudo du -mscx /* 2>/dev/null || true
df -h /
- name: Setup upterm session
# interactive SSH logins for debugging
if: github.event.inputs.upterm-session == 'true'
uses: lhotari/action-upterm@v1
- name: Make Docker image
run: make docker-${{ github.event.inputs.docker-image }}
- name: Show Docker images
run: docker images
- name: Login to Docker Hub and push new image(s) to Docker Hub
run: make docker-${{ github.event.inputs.docker-image }} GIT_DEPTH=--single-branch
- name: Generate ocrd-all-tool.json
# the Docker build will set OCRD_MODULES inside the image, which we can re-use
# regardless of whether we have /build, we can just use the Makefile from outside again
run: |
export OCRD_MODULES=$(docker run --rm ocrd/all:${{ github.event.inputs.docker-image }} bash -c 'echo $OCRD_MODULES')
make ocrd-all-tool.json
wc -l ocrd-all-tool.json
- name: Upload ocrd-all-tool.json
uses: actions/upload-artifact@v3
with:
name: ${{ github.event.inputs.docker-image }}_ocrd-all-tool.json
path: ./ocrd-all-tool.json
# if-no-files-found: error
- name: Login to Docker Hub
if: github.event.inputs.upload-dockerhub == 'true'
run: echo ${{ secrets.DOCKERHUB_PASSWORD }} | docker login --username ${{ secrets.DOCKERHUB_USERNAME }} --password-stdin
- name: Push to Docker Hub
if: github.event.inputs.upload-dockerhub == 'true'
run: |
if ${{ github.event.inputs.upload-docker-image }}; then
echo ${{ secrets.DOCKERHUB_PASSWORD }} | docker login --username ${{ secrets.DOCKERHUB_USERNAME }} --password-stdin
docker push ocrd/all:${{ github.event.inputs.docker-image }}
if test ${{ github.event.inputs.docker-image }} = maximum-git; then
# Alias Docker image.
docker tag ocrd/all:maximum-git ocrd/all:latest
docker push ocrd/all:latest
fi
docker push ocrd/all:${{ github.event.inputs.docker-image }}
if test ${{ github.event.inputs.docker-image }} = maximum-git; then
# Alias Docker image.
docker tag ocrd/all:maximum-git ocrd/all:latest
docker push ocrd/all:latest
fi
- name: Login to GitHub Container Registry
if: github.event.inputs.upload-github == 'true'
uses: docker/login-action@v2
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Push to Github Container Registry
if: github.event.inputs.upload-github == 'true'
run: |
docker tag ocrd/all:${{ github.event.inputs.docker-image }} ghcr.io/ocr-d/all:${{ github.event.inputs.docker-image }}
docker push ghcr.io/ocr-d/all:${{ github.event.inputs.docker-image }}
if test ${{ github.event.inputs.docker-image }} = maximum-git; then
# Alias Docker image.
docker tag ocrd/all:maximum-git ghcr.io/ocr-d/all:latest
docker push ghcr.io/ocr-d/all:latest
fi
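
Note on the two push steps above: both apply the same rule, namely push the selected variant and additionally alias `maximum-git` as `latest`. A minimal sketch of that rule as a shell function follows. `image_refs_for_variant` and its two arguments are hypothetical names for illustration only, not something this PR adds.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): list the image references that
# would be pushed for one variant. The prefix is an assumed parameter, e.g.
# "ocrd/all" for Docker Hub or "ghcr.io/ocr-d/all" for the GitHub registry.
image_refs_for_variant() {
    local prefix=$1 variant=$2
    echo "$prefix:$variant"
    # As in the workflow, only maximum-git is additionally aliased as "latest".
    if test "$variant" = maximum-git; then
        echo "$prefix:latest"
    fi
}

# Print the references for the GitHub Container Registry case.
image_refs_for_variant ghcr.io/ocr-d/all maximum-git
```

The function only prints references; tagging and pushing would still be done by `docker tag` and `docker push` as in the workflow.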

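Note on the rewritten `docker rmi $(docker images --filter=reference=... -q)` lines: if a filter matches no image, the command substitution is empty and `docker rmi` is called without arguments, which it rejects as far as I can tell, so the step could fail on a runner that no longer ships those images. A guarded variant might look like the sketch below. `remove_images` is a hypothetical helper, not part of this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): remove all local images whose
# reference matches a pattern, and do nothing when no image matches.
remove_images() {
    local pattern=$1 ids
    # `docker images -q` prints one image ID per line, or nothing at all.
    ids=$(docker images --filter=reference="$pattern" -q)
    if test -z "$ids"; then
        echo "no images match $pattern, skipping"
        return 0
    fi
    # Intentionally unquoted: each ID becomes a separate argument.
    docker rmi $ids
}
```

It would be called once per pattern, for example `remove_images "alpine:*"`.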
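Note on the `if: github.event.inputs.* == 'true'` conditions: as I understand it, values under `github.event.inputs` arrive as strings even when the input is declared `type: boolean`, which is why the comparison is against the string `'true'`. The old script instead pasted the value into the shell as `if ${{ ... }}; then`, which only works while the value expands to exactly `true` or `false`. If the check ever has to live inside a `run:` script again, an explicit string comparison is one option. In this sketch, `is_enabled` and the `UPLOAD_DOCKERHUB` variable are assumptions for illustration, not names from this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): treat a workflow input as enabled
# only if it is literally the string "true"; anything else, including an
# empty value, counts as disabled.
is_enabled() {
    test "$1" = true
}

# UPLOAD_DOCKERHUB is an assumed environment variable that a step could fill
# from the corresponding workflow input.
if is_enabled "${UPLOAD_DOCKERHUB:-}"; then
    echo "would push to Docker Hub"
else
    echo "upload disabled"
fi
```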