
2022.11

@pantaray released this 11 Nov 12:52

Major changes in managing auto-generated files

  • If write_worker_results is True, ACME now creates an aggregate results
    container composed of external links that point to the actual data in the
    HDF5 payload files generated by the parallel workers.
  • Optionally, results can be slotted into a single dataset/array (via the
    result_shape keyword).
  • If single_file is True, the results of parallel compute runs are not stored
    in dedicated payload files; instead, all workers write to a single aggregate
    results container.
  • By providing output_dir, the location of auto-generated HDF5/pickle files
    can be customized.
  • Entities in a distributed computing client that concurrently process tasks
    are now consistently called "workers" (in line with dask terminology).
    Accordingly, the keywords n_jobs, mem_per_job, n_jobs_startup and
    workers_per_job have been renamed to n_workers, mem_per_worker,
    n_workers_startup and processes_per_worker, respectively. To ensure
    compatibility with existing code, the former names are deprecated but have
    not been removed and remain functional (see the sketch after this list).
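
The sketch below illustrates the renamed worker keywords together with the default
on-disk aggregation. It is a minimal sketch assuming a toy user-defined function f
and an environment in which ACME can attach to (or set up) a parallel computing
client; the exact return value of compute() depends on your storage settings.

```python
from acme import ParallelMap

def f(x, y, z=3):
    # toy user-supplied function: one call per compute run
    return (x + y) * z

# Four compute runs distributed across four workers. With the default
# write_worker_results=True, each worker writes an HDF5 payload file and
# ACME links them together in one aggregate results container.
with ParallelMap(f, [2, 4, 6, 8], 4, n_workers=4, mem_per_worker="2GB") as pmap:
    results = pmap.compute()  # file paths of the auto-generated HDF5 results
```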

A full list of changes is provided below.

NEW

  • Included the keyword output_dir in ParallelMap to allow customizing the
    storage location of files auto-generated by ACME (HDF5 and pickle). Only
    effective if write_worker_results is True.
  • Added the keyword result_shape in ParallelMap to permit specifying the
    shape of an aggregate dataset/array into which the results of all
    computational runs are slotted. In conjunction with the shape specification,
    the new keyword result_dtype offers the option to control the numerical type
    (set to "float64" by default) of the resulting dataset (if
    write_worker_results = True) or array (write_worker_results = False).
    On-disk collection of results in a single dataset is only available for
    auto-generated HDF5 containers (i.e., write_pickle = False). A usage sketch
    follows this list.
  • Introduced the keyword single_file in ParallelMap to control whether parallel
    workers store results of computational runs in dedicated HDF5 files
    (single_file = False, default) or share a single results container for saving
    (single_file = True). This option is only available for auto-generated HDF5
    containers; pickle files are not supported (i.e., write_worker_results = True
    and write_pickle = False).
  • Included options to specify worker count and memory consumption in local_cluster_setup
  • Added a new section "Advanced Usage and Customization" in the online documentation
    that discusses settings and associated technical details
  • Added support for Python 3.10 and updated dask dependencies
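
As referenced above, the following sketch combines the new storage controls. The
directory path is a placeholder, and the convention that None marks the dimension
along which compute runs are stacked in result_shape is an assumption; consult the
ParallelMap docstring for the authoritative semantics.

```python
from acme import ParallelMap

def f(x, y, z=3):
    return (x + y) * z  # each compute run returns a scalar

with ParallelMap(
    f, [2, 4, 6, 8], 4,
    output_dir="/path/to/storage",  # placeholder: custom location of HDF5 files
    result_shape=(None,),           # assumption: None marks the stacking axis,
                                    # i.e., one slot per compute run -> shape (4,)
    result_dtype="float32",         # override the "float64" default
    single_file=True,               # all workers share one results container
) as pmap:
    pmap.compute()

# local_cluster_setup now also accepts worker count and memory settings; the
# keyword names below are assumed to mirror n_workers/mem_per_worker:
# from acme import local_cluster_setup
# client = local_cluster_setup(n_workers=4, mem_per_worker="2GB")
```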

CHANGED

  • Modified the terminology employed throughout the package: to clearly
    delineate the difference between compute runs and worker processes (and to
    minimize friction between the documentation of ACME and dask), the term
    "worker" is now consistently used throughout the code base. If ACME is
    running on a SLURM cluster, a dask "worker" corresponds to a SLURM "job".
  • In line with the above change, the following input arguments have been
    renamed:
    • in ParallelMap:
      • n_jobs -> n_workers
      • mem_per_job -> mem_per_worker
    • in esi_cluster_setup and slurm_cluster_setup:
      • n_jobs -> n_workers
      • mem_per_job -> mem_per_worker
      • n_jobs_startup -> n_workers_startup
    • in slurm_cluster_setup:
      • workers_per_job -> processes_per_worker
  • Made esi_cluster_setup respect already running clients so that new parallel
    computing clients are not launched on top of existing ones (thanks to @timnaher)
  • Introduced support for unit-length positional/keyword arguments in
    ParallelMap so that n_inputs can be used as a scaling parameter to launch
    n_inputs calls of a user-provided function (see the sketch after this list).
  • All docstrings and the online documentation have been re-written (and in
    parts clarified) to account for the newly introduced features.
  • Code coverage is no longer computed by a GitHub Actions workflow but is now
    calculated by the GitLab CI job that invokes SLURM to run the tests on the
    ESI HPC cluster.
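
Under the same assumptions as above, the sketch below shows the renamed
esi_cluster_setup keywords and the new unit-length/n_inputs scaling; the SLURM
partition name is a placeholder.

```python
from acme import ParallelMap, esi_cluster_setup

def simulate(seed, n_samples=1000):
    # toy user-supplied function; every call receives the same unit-length inputs
    ...

# Renamed keywords: n_workers, mem_per_worker, n_workers_startup
client = esi_cluster_setup(partition="8GBS", n_workers=10,
                           mem_per_worker="8GB", n_workers_startup=5)

# Unit-length arguments plus n_inputs as a scaling parameter: launch 50 calls
# of simulate, re-using the already running client set up above.
with ParallelMap(simulate, 42, n_samples=1000, n_inputs=50) as pmap:
    pmap.compute()
```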

DEPRECATED

The keywords n_jobs, mem_per_job, n_jobs_startup and workers_per_job
have been renamed. Using these keywords is still supported but raises a
DeprecationWarning.

  • The keywords n_jobs and mem_per_job in both ParallelMap and
    esi_cluster_setup are deprecated. To specify the number of parallel
    workers and their memory resources, please use n_workers and mem_per_worker,
    respectively (see the corresponding item in the CHANGED section above).
  • The keyword n_jobs_startup in esi_cluster_setup is deprecated. Please
    use n_workers_startup instead (a short before/after sketch follows).
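
A minimal before/after sketch of the renamed keywords (cluster setup details
omitted); the old spellings still work but emit a DeprecationWarning.

```python
from acme import ParallelMap

def f(x, y):
    return x + y

# Deprecated spelling: still functional, but emits a DeprecationWarning
with ParallelMap(f, [1, 2, 3], 4, n_jobs=3, mem_per_job="2GB") as pmap:
    pmap.compute()

# Preferred spelling going forward
with ParallelMap(f, [1, 2, 3], 4, n_workers=3, mem_per_worker="2GB") as pmap:
    pmap.compute()
```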

FIXED

  • Updated dependency versions (pinned click to version < 8.1) and fixed
    Syncopy compatibility (increased the recursion depth of the input size
    estimation to one million calls)
  • Streamlined the dry-run stopping logic invoked if the user chooses not to
    continue with the computation after performing a dry-run
  • Modified tests that are supposed to use an existing distributed computing
    client so that they no longer shut down that very client
  • Updated the memory estimation routine to deactivate auto-generation of
    results files so that pre-allocated containers are not accidentally
    corrupted before the actual concurrent computation is launched