pyodc

A Python interface to odc for encoding/decoding ODB-2 files.

The package contains two different implementations of the same library:

  • pyodc is a pure-Python encoder and decoder for ODB-2 data. It encodes data from, and decodes data into, pandas data frames.
  • codc is an implementation of the same API as pyodc that depends on the ECMWF odc library and offers much better performance.

Both libraries are installed by running pip install pyodc. Since version 1.6.0, a pre-built wheel of odc is installed automatically, so codc can be used without any additional steps.

Dependencies

Required

  • Python 3.x

Optional

For codc to work, the odc library must be compiled and installed on the system and made available to Python. Typically this happens automatically, as described above, through the dependency on odclib, which bundles a precompiled version of odc as a wheel. If for some reason this doesn't work, there are several other ways to make the library visible to pyodc:

  • It can be installed as a system library.
  • The installation prefix can be passed in the odc_DIR or ODC_DIR environment variables.
  • The library directory can be included in LD_LIBRARY_PATH.
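As a rough sketch of the environment-variable option, the helper below starts a fresh Python interpreter with ODC_DIR set and reports whether import codc succeeds there. The helper names and the /opt/odc prefix are hypothetical and only serve the illustration.

```python
import os
import subprocess
import sys


def env_with_odc_prefix(prefix):
    """Return a copy of the current environment with ODC_DIR set to prefix."""
    env = dict(os.environ)
    env["ODC_DIR"] = prefix
    return env


def codc_imports_with_prefix(prefix):
    """Try `import codc` in a fresh interpreter that sees ODC_DIR=prefix."""
    result = subprocess.run(
        [sys.executable, "-c", "import codc"],
        env=env_with_odc_prefix(prefix),
        capture_output=True,
    )
    return result.returncode == 0


# "/opt/odc" is a hypothetical installation prefix; substitute your own.
print(codc_imports_with_prefix("/opt/odc"))
```

A True result only shows that codc could be imported in that environment. It does not prove that the library was found through ODC_DIR rather than through the bundled wheel or a system installation.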

Installation

pip install pyodc

Check that both modules were installed correctly:

python
>>> import pyodc as odc # pure python
>>> import codc as odc # faster
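Because the two modules share an API, a script can prefer codc and fall back to pyodc when the odc library is unavailable. The sketch below assumes that a failure to load codc surfaces as an ImportError; load_backend is a hypothetical helper name, not part of either package.

```python
import importlib


def load_backend():
    """Return (name, module) for the first importable backend, else (None, None)."""
    for name in ("codc", "pyodc"):  # try the faster backend first
        try:
            return name, importlib.import_module(name)
        except ImportError:
            continue  # assumption: an unavailable backend raises ImportError
    return None, None


backend_name, odc = load_backend()
print("Using backend:", backend_name)
```

Code written against the odc name then runs unchanged with whichever implementation was found.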

Usage

An introductory Jupyter Notebook with helpful usage examples is provided in the root of this repository:

git clone git@github.com:ecmwf/pyodc.git
cd pyodc
jupyter notebook Introduction.ipynb

Note that codc is not thread-safe, so care should be taken when using it with dask. You can set dask to use processes rather than threads as follows:

import dask

with dask.config.set(scheduler='processes'):
    dask.compute(...)

Development

Run Unit Tests

To run the unit tests, make sure that the pytest module is installed first:

python -m pytest

Run Unit Tests across multiple Python versions with Tox

Tox is a useful tool for quickly running pytest across multiple Python versions by managing a set of Python environments for you. A tox.ini file is provided that targets Python 3.8 to 3.12. Note that this also installs older versions of libraries like numpy, which helps to catch incompatibilities with older versions of those libraries.

To run tox, install it and modify the ODC_HOME = ../build line in tox.ini to point to a build of odc. This build is reused for all the tests. Then run:

tox

The first run takes a while because tox has to install all the environments. Subsequent runs are much faster.

Build Documentation

To build the documentation locally, please install the Python dependencies first:

cd docs
pip install -r requirements.txt
make html

The built HTML documentation will be available at docs/_build/html/index.html.

License

This software is licensed under the terms of the Apache Licence Version 2.0 which can be obtained at http://www.apache.org/licenses/LICENSE-2.0.

In applying this licence, ECMWF does not waive the privileges and immunities granted to it by virtue of its status as an intergovernmental organisation nor does it submit to any jurisdiction.