ci: update caching to resolve concurrency #2869

Merged · 25 commits · Jan 6, 2025
13 changes: 12 additions & 1 deletion .github/actions/pull-caches/action.yml
@@ -18,6 +18,14 @@ inputs:
description: cache restore/dump key
required: false
default: "pypi-packages"
cache-torch-HF:
description: "cache torch and HF"
required: false
default: "true"
cache-references:
description: "cache metrics references"
required: false
default: "false"

runs:
using: "composite"
@@ -67,6 +75,7 @@ runs:
shell: bash

- name: Cache Torch & HF
if: inputs.cache-torch-HF == 'true' # since the input is a string
continue-on-error: true
uses: actions/cache/restore@v3
with:
@@ -75,6 +84,7 @@
key: ci-caches

- name: Restored Torch & HF
if: inputs.cache-torch-HF == 'true' # since the input is a string
run: |
mkdir -p $CACHES_DIR
pip install -q py-tree
@@ -83,14 +93,15 @@

- name: Cache References
# do not use this cache for dispatch and cron, to enable rebuilding caches when needed
if: github.event_name != 'workflow_dispatch' && github.event_name != 'schedule'
if: github.event_name != 'workflow_dispatch' && github.event_name != 'schedule' && inputs.cache-references == 'true'
continue-on-error: true
uses: actions/cache/restore@v3
with:
path: tests/_cache-references
key: cache-references

- name: Restored References
if: inputs.cache-references == 'true' # since the input is a string
continue-on-error: true
working-directory: tests/
run: |
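
For context on the repeated `# since the input is a string` comments: GitHub Actions passes composite-action inputs as strings, so even a YAML boolean supplied by the caller arrives as "true"/"false", and the step conditions have to compare against the string literal. A minimal illustrative sketch (not part of this diff; the step body is hypothetical):

# caller side: a YAML boolean is coerced to the string "true"
- uses: ./.github/actions/pull-caches
  with:
    cache-references: true

# action side: the condition must therefore compare against the string literal
- name: Restored References
  if: inputs.cache-references == 'true'
  run: echo "references cache enabled"
  shell: bash
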
83 changes: 41 additions & 42 deletions .github/actions/push-caches/action.yml
@@ -14,6 +14,18 @@ inputs:
description: location to pull PyTorch from
required: false
default: "https://download.pytorch.org/whl/cpu/torch_stable.html"
cache-artifact-appendix:
description: "unique name or running index"
required: false
default: ""
cache-torch-HF:
description: "cache torch and HF"
required: false
default: "true"
cache-references:
description: "cache metrics references"
required: false
default: "false"

runs:
using: "composite"
@@ -23,66 +35,50 @@
shell: bash

- name: Freeze local env.
if: inputs.cache-artifact-appendix != ''
run: |
pip freeze > requirements.dump
cat requirements.dump
shell: bash

#- name: Filter self pkg
# run: |
# import os
# fp = 'requirements.dump'
# with open(fp) as fopen:
# lines = [ln.strip() for ln in fopen.readlines()]
# lines = [ln.split('+')[0] for ln in lines if '-e ' not in ln]
# with open(fp, 'w') as fwrite:
# fwrite.writelines([ln + os.linesep for ln in lines])
# shell: python

- name: Dump wheels
if: inputs.cache-artifact-appendix != ''
run: |
pip wheel -r requirements/_devel.txt --prefer-binary \
--wheel-dir=.pip-wheels \
--wheel-dir=_pip-wheels \
-f ${{ inputs.torch-url }} -f ${{ inputs.pypi-dir }}
ls -lh .pip-wheels
ls -lh _pip-wheels
shell: bash

- name: Cache pull packages
uses: actions/cache/restore@v3
with:
enableCrossOsArchive: true
path: ${{ inputs.pypi-dir }}
key: ${{ inputs.pypi-key }}

- name: Find diff
id: wheels-diff
- name: Move new packages to staging
if: inputs.cache-artifact-appendix != ''
run: |
import os, glob
wheels = [os.path.basename(p) for p in glob.glob(".pip-wheels/*")]
pkgs = [os.path.basename(p) for p in glob.glob("${{ inputs.pypi-dir }}/*")]
diff = [w for w in wheels if w not in pkgs]
print(diff)
with open(os.environ['GITHUB_OUTPUT'], 'a') as fh:
print(f'count-new={len(diff)}', file=fh)
shell: python

- run: cp .pip-wheels/* ${{ inputs.pypi-dir }}
if: ${{ steps.wheels-diff.outputs.count-new != 0 }}
mkdir -p _pip-staging
python .github/assistant.py move_new_packages \
--dir-cache="${{ inputs.pypi-dir }}" \
--dir_local="_pip-wheels" \
--dir_staging="_pip-staging"
ls -lh _pip-staging/
# count files in the staging dir
file_count=$(ls -1 "_pip-staging/" | wc -l)
echo "NUM_PACKAGES=$file_count" >> $GITHUB_ENV
shell: bash

- name: Cache push packages
if: ${{ steps.wheels-diff.outputs.count-new != 0 }}
uses: actions/cache/save@v3
- name: Upload new packages
if: inputs.cache-artifact-appendix != '' && env.NUM_PACKAGES != 0
uses: actions/upload-artifact@v4
with:
enableCrossOsArchive: true
path: ${{ inputs.pypi-dir }}
key: ${{ inputs.pypi-key }}
name: ${{ inputs.pypi-key }}-run-${{ inputs.cache-artifact-appendix }}
path: _pip-staging
retention-days: 1

- name: Post Torch & HF
if: inputs.cache-torch-HF == 'true' # since the input is a string
run: py-tree $CACHES_DIR
shell: bash

- name: Cache Torch & HF
if: inputs.cache-torch-HF == 'true' # since the input is a string
continue-on-error: true
uses: actions/cache/save@v3
with:
@@ -91,13 +87,16 @@
key: ci-caches

- name: Cache references
if: inputs.cache-references == 'true' # since the input is a string
continue-on-error: true
uses: actions/cache/save@v3
with:
#enableCrossOsArchive: true
path: tests/_cache-references
key: cache-references

- name: Post References
run: py-tree tests/_cache-references/ --show_hidden
shell: bash
#- name: Post References
# # This print takes up too many lines, so it is commented out
# # if: inputs.cache-references == 'true' # since the input is a string
# run: py-tree tests/_cache-references/ --show_hidden
# shell: bash
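
For context, the new staging step hands the package count to later steps through `$GITHUB_ENV`, and the artifact upload is gated on it. A minimal sketch of that pattern, reusing the names from the action above (illustrative only, not part of the diff):

- name: Count staged packages
  run: |
    # anything written to $GITHUB_ENV becomes env.* for every later step in this job
    file_count=$(ls -1 "_pip-staging/" | wc -l)
    echo "NUM_PACKAGES=$file_count" >> $GITHUB_ENV
  shell: bash

- name: Upload new packages
  # env.NUM_PACKAGES is a string; the expression coerces it to a number for the comparison
  if: env.NUM_PACKAGES != 0
  run: echo "uploading $NUM_PACKAGES new wheels"
  shell: bash
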
17 changes: 17 additions & 0 deletions .github/assistant.py
@@ -179,6 +179,23 @@ def _crop_path(fname: str, paths: list[str]) -> str:
raise ValueError(f"Missing following paths: {not_exists}")
return " ".join(test_modules)

@staticmethod
def move_new_packages(dir_cache: str, dir_local: str, dir_staging: str) -> None:
"""Move unique packages from local folder to staging."""
assert os.path.isdir(dir_cache), f"Missing folder with saved packages: '{dir_cache}'" # noqa: S101
assert os.path.isdir(dir_local), f"Missing folder with local packages: '{dir_local}'" # noqa: S101
assert os.path.isdir(dir_staging), f"Missing folder for staging: '{dir_staging}'" # noqa: S101

import shutil

for pkg in os.listdir(dir_local):
    # consider only regular files (the name has to be joined with its folder)
    if not os.path.isfile(os.path.join(dir_local, pkg)):
        continue
    # skip packages that already exist in the shared cache
    if pkg in os.listdir(dir_cache):
        continue
    logging.info(f"Moving '{pkg}' to staging...")
    shutil.move(os.path.join(dir_local, pkg), os.path.join(dir_staging, pkg))


if __name__ == "__main__":
logging.basicConfig(level=logging.INFO)
57 changes: 57 additions & 0 deletions .github/workflows/_merge_cache.yml
@@ -0,0 +1,57 @@
name: Collect new packages and upload cache

on:
workflow_call:
inputs:
pypi-key:
description: cache restore/dump key
required: false
type: string
default: "pypi-packages"
pypi-dir:
description: location of local PyPI cache
required: false
type: string
default: "_ci-cache_PyPI"
cache-artifact-appendix:
description: "unique name for the job"
required: true
type: string

jobs:
merge-caches:
runs-on: ubuntu-latest
steps:
- name: Download 📥 artifacts
uses: actions/download-artifact@v4
with:
pattern: ${{ inputs.pypi-key }}-run-${{ inputs.cache-artifact-appendix }}*
merge-multiple: true
path: _local-packages
- name: Cache pull packages
uses: actions/cache/restore@v3
with:
enableCrossOsArchive: true
path: ${{ inputs.pypi-dir }}
key: ${{ inputs.pypi-key }}

- name: Show 📦
run: |
# create the directory if it doesn't exist, i.e. no artifacts were found
mkdir -p _local-packages
ls -lh _local-packages
ls -lh ${{ inputs.pypi-dir }}
# count files in the downloaded-packages dir
file_count=$(ls -1 "_local-packages/" | wc -l)
echo "NUM_PACKAGES=$file_count" >> $GITHUB_ENV
- name: Move collected 📦
if: env.NUM_PACKAGES != 0
run: mv _local-packages/* ${{ inputs.pypi-dir }}

- name: Cache push packages
if: env.NUM_PACKAGES != 0
uses: actions/cache/save@v3
with:
enableCrossOsArchive: true
path: ${{ inputs.pypi-dir }}
key: ${{ inputs.pypi-key }}
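
Taken together with the `push-caches` changes above, this replaces the previous pattern of every matrix job calling `actions/cache/save` on the same `pypi-packages` key (the concurrency conflict the PR title refers to) with per-job artifacts that a single follow-up job merges and saves once. A condensed, illustrative sketch of the flow, assuming the default key and cache directory; matrix values and step bodies are placeholders, not part of this diff:

jobs:
  pytester:
    strategy:
      matrix:
        os: [ubuntu-22.04, windows-2022]  # illustrative subset of the real matrix
    steps:
      # ... build, test, and stage any new wheels in _pip-staging ...
      # each matrix job uploads only its new wheels under a unique artifact name
      - uses: actions/upload-artifact@v4
        with:
          name: pypi-packages-run-${{ github.run_id }}-${{ strategy.job-index }}
          path: _pip-staging

  merge-pkg-artifacts:
    needs: pytester
    runs-on: ubuntu-latest
    steps:
      # one job collects every per-job artifact into a single folder ...
      - uses: actions/download-artifact@v4
        with:
          pattern: pypi-packages-run-${{ github.run_id }}-*
          merge-multiple: true
          path: _local-packages
      # ... restores the existing cache and moves the new wheels into it (omitted here),
      # then performs the only cache write of the run, so no two jobs race on the key
      - uses: actions/cache/save@v3
        with:
          path: _ci-cache_PyPI
          key: pypi-packages
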
24 changes: 19 additions & 5 deletions .github/workflows/ci-tests.yml
@@ -20,6 +20,9 @@ defaults:
run:
shell: bash

env:
PYPI_CACHE_DIR: "_ci-cache_PyPI"

jobs:
check-diff:
if: github.event.pull_request.draft == false
@@ -59,10 +62,10 @@ jobs:
- { os: "windows-2022", python-version: "3.11", pytorch-version: "2.6.0" }
env:
FREEZE_REQUIREMENTS: ${{ ! (github.ref == 'refs/heads/master' || startsWith(github.ref, 'refs/heads/release/')) }}
PYPI_CACHE_DIR: "_ci-cache_PyPI"
TOKENIZERS_PARALLELISM: false
TEST_DIRS: ${{ needs.check-diff.outputs.test-dirs }}
PIP_EXTRA_INDEX_URL: "--find-links https://download.pytorch.org/whl/cpu/torch_stable.html"
PIP_EXTRA_INDEX_URL: "--find-links=https://download.pytorch.org/whl/cpu/torch_stable.html"
UNITTEST_TIMEOUT: "" # by default, it is not set

# Timeout: https://stackoverflow.com/a/59076067/4521646
# seems that macOS jobs take much longer than the other OSes
@@ -98,6 +101,7 @@ jobs:
requires: ${{ matrix.requires }}
pytorch-version: ${{ matrix.pytorch-version }}
pypi-dir: ${{ env.PYPI_CACHE_DIR }}
cache-references: true

- name: Switch to PT test URL
if: ${{ matrix.pytorch-version == '2.6.0' }}
@@ -107,7 +111,7 @@
run: |
pip --version
pip install -e . -U "setuptools==69.5.1" -r requirements/_doctest.txt \
$PIP_EXTRA_INDEX_URL --find-links $PYPI_CACHE_DIR
$PIP_EXTRA_INDEX_URL --find-links="$PYPI_CACHE_DIR"
pip list

- name: DocTests
@@ -126,7 +130,7 @@
python adjust-torch-versions.py $fpath
done
pip install --requirement requirements/_devel.txt -U \
$PIP_EXTRA_INDEX_URL --find-links $PYPI_CACHE_DIR
$PIP_EXTRA_INDEX_URL --find-links="$PYPI_CACHE_DIR"
pip list

- name: set special vars for PR
@@ -201,7 +205,7 @@ jobs:
uses: codecov/codecov-action@v5
with:
token: ${{ secrets.CODECOV_TOKEN }}
file: tests/coverage.xml
files: "tests/coverage.xml"
flags: cpu,${{ runner.os }},python${{ matrix.python-version }},torch${{ steps.info.outputs.TORCH }}
env_vars: OS,PYTHON
name: codecov-umbrella
@@ -213,6 +217,8 @@
uses: ./.github/actions/push-caches
with:
pypi-dir: ${{ env.PYPI_CACHE_DIR }}
cache-artifact-appendix: ${{ github.run_id }}-${{ strategy.job-index }}
cache-references: true

testing-guardian:
runs-on: ubuntu-latest
@@ -227,3 +233,11 @@
if: contains(fromJSON('["cancelled", "skipped"]'), needs.pytester.result)
timeout-minutes: 1
run: sleep 90

merge-pkg-artifacts:
needs: pytester
if: success()
uses: ./.github/workflows/_merge_cache.yml
with:
pypi-dir: "_ci-cache_PyPI"
cache-artifact-appendix: ${{ github.run_id }}-${{ strategy.job-index }}
1 change: 0 additions & 1 deletion tests/unittests/_helpers/wrappers.py
@@ -10,7 +10,6 @@
"We couldn't connect to",
"Connection error",
"Can't load",
"`nltk` resource `punkt` is",
)


@@ -19,6 +19,7 @@
from sklearn.metrics import auc as _sk_auc
from torch import Tensor, tensor
from torchmetrics.utilities.compute import auc

from unittests import NUM_BATCHES
from unittests._helpers import seed_all
from unittests._helpers.testers import MetricTester