Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-41366: Add Docker image build for butler server #900

Merged
merged 5 commits into from
Nov 3, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
57 changes: 57 additions & 0 deletions .github/workflows/docker.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,57 @@
# This file is part of daf_butler.
#
# Developed for the LSST Data Management System.
# This product includes software developed by the LSST Project
# (http://www.lsst.org).
# See the COPYRIGHT file at the top-level directory of this distribution
# for details of code ownership.
#
# This software is dual licensed under the GNU General Public License and also
# under a 3-clause BSD license. Recipients may choose which of these licenses
# to use; please see the files gpl-3.0.txt and/or bsd_license.txt,
# respectively. If you choose the GPL option then the following text applies
# (but note that there is still no warranty even if you opt for BSD instead):
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

name: Docker
on:
pull_request: {}
push:
tags:
- "*"

jobs:
docker:
runs-on: ubuntu-latest
timeout-minutes: 20
permissions:
contents: read
packages: write

steps:
- uses: actions/checkout@v3
with:
# Needed to fetch tags, used by Python install process to
# figure out version number
fetch-depth: 0


- uses: lsst-sqre/build-and-push-to-ghcr@v1
id: build
with:
image: ${{ github.repository }}
github_token: ${{ secrets.GITHUB_TOKEN }}

- run: echo Pushed ghcr.io/${{ github.repository }}:${{ steps.build.outputs.tag }}
99 changes: 99 additions & 0 deletions Dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,99 @@
# This file is part of daf_butler.
#
# Developed for the LSST Data Management System.
# This product includes software developed by the LSST Project
# (http://www.lsst.org).
# See the COPYRIGHT file at the top-level directory of this distribution
# for details of code ownership.
#
# This software is dual licensed under the GNU General Public License and also
# under a 3-clause BSD license. Recipients may choose which of these licenses
# to use; please see the files gpl-3.0.txt and/or bsd_license.txt,
# respectively. If you choose the GPL option then the following text applies
# (but note that there is still no warranty even if you opt for BSD instead):
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

# This Dockerfile is based on the fastapi_safir_app template from
# lsst/templates
#
# This Dockerfile has four stages:
#
# base-image
# Updates the base Python image with security patches and common system
# packages. This image becomes the base of all other images.
# dependencies-image
# Installs third-party dependencies (requirements/main.txt) into a virtual
# environment. This virtual environment is ideal for copying across build
# stages.
# install-image
# Installs the app into the virtual environment.
# runtime-image
# - Copies the virtual environment into place.
# - Runs a non-root user.
# - Sets up the entrypoint and port.

FROM python:3.11.6-slim-bullseye as base-image

# Update system packages
COPY server/scripts/install-base-packages.sh .
RUN ./install-base-packages.sh && rm ./install-base-packages.sh

FROM base-image AS dependencies-image

# Install system packages only needed for building dependencies.
COPY server/scripts/install-dependency-packages.sh .
RUN ./install-dependency-packages.sh

# Create a Python virtual environment
ENV VIRTUAL_ENV=/opt/venv
RUN python -m venv $VIRTUAL_ENV
# Make sure we use the virtualenv
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
# Put the latest pip and setuptools in the virtualenv
RUN pip install --upgrade --no-cache-dir pip setuptools wheel

# Install the app's Python runtime dependencies
COPY requirements.txt .
COPY server/requirements.txt server_requirements.txt
RUN pip install --quiet --no-cache-dir -r requirements.txt -r server_requirements.txt

FROM dependencies-image AS install-image

# Use the virtualenv
ENV PATH="/opt/venv/bin:$PATH"

COPY . /workdir
WORKDIR /workdir
RUN pip install --no-cache-dir --no-deps .

FROM base-image AS runtime-image

# Create a non-root user
RUN useradd --create-home appuser

# Copy the virtualenv
COPY --from=install-image /opt/venv /opt/venv

# Make sure we use the virtualenv
ENV PATH="/opt/venv/bin:$PATH"

# Switch to the non-root user.
USER appuser

# Expose the port.
EXPOSE 8080

# Run the application.
CMD ["uvicorn", "lsst.daf.butler.remote_butler.server:app", "--host", "0.0.0.0", "--port", "8080"]
13 changes: 13 additions & 0 deletions compose.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
# Provides a configuration for running Butler Server in docker against
# test data from the ci_hsc_gen3 package. Assumes that ci_hsc_gen3
# has been built via lsstsw in a directory adjacent to the directory
# containing this file.
services:
butler:
build: .
ports:
- "8080:8080"
environment:
BUTLER_SERVER_CONFIG_URI: "/butler_root"
volumes:
- ../ci_hsc_gen3/DATA:/butler_root
46 changes: 46 additions & 0 deletions python/lsst/daf/butler/remote_butler/server/_config.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,46 @@
# This file is part of daf_butler.
#
# Developed for the LSST Data Management System.
# This product includes software developed by the LSST Project
# (http://www.lsst.org).
# See the COPYRIGHT file at the top-level directory of this distribution
# for details of code ownership.
#
# This software is dual licensed under the GNU General Public License and also
# under a 3-clause BSD license. Recipients may choose which of these licenses
# to use; please see the files gpl-3.0.txt and/or bsd_license.txt,
# respectively. If you choose the GPL option then the following text applies
# (but note that there is still no warranty even if you opt for BSD instead):
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

import os
from dataclasses import dataclass

__all__ = ("Config", "get_config_from_env")


@dataclass(frozen=True)
class Config:
config_uri: str


def get_config_from_env() -> Config:
config_uri = os.getenv("BUTLER_SERVER_CONFIG_URI")
if config_uri is None:
raise Exception(
"The environment variable BUTLER_SERVER_CONFIG_URI "
"must point to a Butler configuration to be used by the server"
)
return Config(config_uri=config_uri)
6 changes: 3 additions & 3 deletions python/lsst/daf/butler/remote_butler/server/_server.py
Original file line number Diff line number Diff line change
Expand Up @@ -46,11 +46,10 @@
SerializedDatasetType,
)

from ._config import get_config_from_env
from ._factory import Factory
from ._server_models import FindDatasetModel

BUTLER_ROOT = "ci_hsc_gen3/DATA"

log = logging.getLogger(__name__)

app = FastAPI()
Expand All @@ -70,7 +69,8 @@ def missing_dataset_type_exception_handler(request: Request, exc: MissingDataset

@cache
def _make_global_butler() -> Butler:
return Butler.from_config(BUTLER_ROOT)
config = get_config_from_env()
return Butler.from_config(config.config_uri)


def factory_dependency() -> Factory:
Expand Down
2 changes: 2 additions & 0 deletions server/requirements.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
uvicorn
psycopg2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory pip install .[postgres] should pick up this dependency (it's listed as an optional dep in the pyproject.toml), but not if I've just told you to use --no-deps for that part above...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dependencies actually want to be installed in a separate step from the pip install of the application itself -- if the requirements.txt files don't change, Docker can cache the intermediate dependency image and save a few minutes on the next build.

Also the requirements will need to be pinned in future so we won't want pyproject.toml involved.

64 changes: 64 additions & 0 deletions server/scripts/install-base-packages.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
#!/bin/bash

# This file is part of daf_butler.
#
# Developed for the LSST Data Management System.
# This product includes software developed by the LSST Project
# (http://www.lsst.org).
# See the COPYRIGHT file at the top-level directory of this distribution
# for details of code ownership.
#
# This software is dual licensed under the GNU General Public License and also
# under a 3-clause BSD license. Recipients may choose which of these licenses
# to use; please see the files gpl-3.0.txt and/or bsd_license.txt,
# respectively. If you choose the GPL option then the following text applies
# (but note that there is still no warranty even if you opt for BSD instead):
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

# This script is based on the fastapi_safir_app template from
# lsst/templates

# This script updates packages in the base Docker image that's used by both the
# build and runtime images, and gives us a place to install additional
# system-level packages with apt-get.
#
# Based on the blog post:
# https://pythonspeed.com/articles/system-packages-docker/

# Bash "strict mode", to help catch problems and bugs in the shell
# script. Every bash script you write should include this. See
# http://redsymbol.net/articles/unofficial-bash-strict-mode/ for
# details.
set -euo pipefail

# Display each command as it's run.
set -x

# Tell apt-get we're never going to be able to give manual
# feedback:
export DEBIAN_FRONTEND=noninteractive

# Update the package listing, so we know what packages exist:
apt-get update

# Install security updates:
apt-get -y upgrade

# libpq is required for psycopg2, the Python postgres bindings
apt-get -y install --no-install-recommends libpq5

# Delete cached files we don't need anymore:
apt-get clean
rm -rf /var/lib/apt/lists/*
67 changes: 67 additions & 0 deletions server/scripts/install-dependency-packages.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,67 @@
#!/bin/bash

# This file is part of daf_butler.
#
# Developed for the LSST Data Management System.
# This product includes software developed by the LSST Project
# (http://www.lsst.org).
# See the COPYRIGHT file at the top-level directory of this distribution
# for details of code ownership.
#
# This software is dual licensed under the GNU General Public License and also
# under a 3-clause BSD license. Recipients may choose which of these licenses
# to use; please see the files gpl-3.0.txt and/or bsd_license.txt,
# respectively. If you choose the GPL option then the following text applies
# (but note that there is still no warranty even if you opt for BSD instead):
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

# This script is based on the fastapi_safir_app template from
# lsst/templates

# This script installs additional packages used by the dependency image but
# not needed by the runtime image, such as additional packages required to
# build Python dependencies.
#
# Since the base image wipes all the apt caches to clean up the image that
# will be reused by the runtime image, we unfortunately have to do another
# apt-get update here, which wastes some time and network.

# Bash "strict mode", to help catch problems and bugs in the shell
# script. Every bash script you write should include this. See
# http://redsymbol.net/articles/unofficial-bash-strict-mode/ for
# details.
set -euo pipefail

# Display each command as it's run.
set -x

# Tell apt-get we're never going to be able to give manual
# feedback:
export DEBIAN_FRONTEND=noninteractive

# Update the package listing, so we know what packages exist:
apt-get update

# Install build-essential because sometimes Python dependencies need to build
# C modules, particularly when upgrading to newer Python versions. libffi-dev
# is sometimes needed to build cffi (a cryptography dependency).
# Git is needed because we still have some Python dependencies pointing
# directly at Git repos
# Libpq-dev is needed to build psycopg2, the postgres bindings for python
apt-get -y install --no-install-recommends build-essential libffi-dev git libpq-dev

# Delete cached files we don't need anymore:
apt-get clean
rm -rf /var/lib/apt/lists/*