Explore nix as conda alternative #95

Open
foolnotion opened this issue Apr 19, 2022 · 5 comments
Labels
enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed

Comments

@foolnotion
Contributor

Hi,

Since this issue does not require other changes in this repo except for flake.nix, I took the liberty of working with the dev branch.

This allows nix to be used as an alternative to conda, with a bunch of advantages:

  • conda lacks sane management of transitive deps leading to conflicts
  • nix is far more robust and allows complete reproducibility
  • faster
  • easy to build the environment (just type nix develop)
  • easy to generate docker images on the fly
    docker run -p 8888:8888 -ti --rm docker.nix-community.org/nixpkgs/nix-flakes nix develop github:cavalab/srbench/dev --no-write-lock-file
    
  • flake-enabled frameworks like pyoperon pull their own dependencies automatically (no need to keep adding things to an environment file)
  • flake.lock files can fix versions/revisions
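To make the last point concrete, here is a minimal sketch of how the revisions pinned in a flake.lock could be listed. It assumes the usual lock layout (a JSON object with a top-level "nodes" map whose entries carry a "locked" record); the helper names and the sample contents are hypothetical and are not taken from the srbench repository.

```python
import json


def pinned_inputs(lock):
    """Map each flake input to its pinned revision (or narHash if there is no rev)."""
    pins = {}
    for name, node in lock.get("nodes", {}).items():
        locked = node.get("locked")
        if locked is None:
            # The root node only lists its inputs and has no "locked" record.
            continue
        pins[name] = locked.get("rev") or locked.get("narHash")
    return pins


def load_pins(path="flake.lock"):
    """Read a flake.lock file from disk and return its pinned inputs."""
    with open(path) as fh:
        return pinned_inputs(json.load(fh))


# Hypothetical lock contents, for illustration only (the rev is a placeholder).
SAMPLE_LOCK = {
    "nodes": {
        "nixpkgs": {
            "locked": {
                "type": "github",
                "owner": "NixOS",
                "repo": "nixpkgs",
                "rev": "0" * 40,
            }
        },
        "root": {"inputs": {"nixpkgs": "nixpkgs"}},
    },
    "root": "root",
    "version": 7,
}

print(pinned_inputs(SAMPLE_LOCK))
```

Running `load_pins()` next to a real flake.lock would show which revision each input is fixed to, which is the information that makes the environment reproducible.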

This is obviously a low priority issue right now, but I've been using it to deploy srbench/operon without conda.
My frustration with conda began with not being able to add gcc/gxx-11.2.0 to the environment.

This issue is meant to track integration of other frameworks with nix. So far I have also integrated FEAT and Ellyn (wip). Other frameworks should be easily integrated as long as they use standard packaging.

There are some aspects that will need attention from other authors:

  • FEAT:

    • seems to be incompatible with latest numpy (this usually gets fixed upstream)
      >>> import feat
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "/nix/store/h3gfxkz8a31l60qmpzj15ryq12nqsspm-python3.9-feat_ml-0.5.2/lib/python3.9/site-packages/feat/__init__.py", line 1, in <module>
          from .feat import Feat, FeatRegressor, FeatClassifier
        File "/nix/store/h3gfxkz8a31l60qmpzj15ryq12nqsspm-python3.9-feat_ml-0.5.2/lib/python3.9/site-packages/feat/feat.py", line 12, in <module>
          from .pyfeat import PyFeat
        File "feat/pyfeat.pyx", line 1, in init feat.pyfeat
      ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
      
    • the install script setup.py seems to be tailored specifically for conda environments; it took some tricks with setting ENV vars to get it to work
  • Ellyn

    • setup.py hardcoded for Conda, does not work with nix (I will try to patch it)
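The FEAT traceback above is the typical sign that a compiled extension and the numpy available at runtime disagree about the array layout. A minimal sketch of a check that tells this case apart from a plain missing package is shown below; `check_extension` and its return values are hypothetical names made up for this example, not part of FEAT or srbench.

```python
import importlib

# Substring of the message numpy's C-level import check produces, as seen in the traceback.
ABI_HINT = "numpy.ndarray size changed"


def check_extension(module_name):
    """Import a module and classify the outcome.

    Returns "ok", "missing", or "abi-mismatch". Any other error is re-raised.
    """
    try:
        importlib.import_module(module_name)
    except ModuleNotFoundError:
        return "missing"
    except ValueError as err:
        if ABI_HINT in str(err):
            return "abi-mismatch"
        raise
    return "ok"


if __name__ == "__main__":
    # "feat" is the module from the traceback; it may not be installed here.
    print(check_extension("feat"))
```

When the result is "abi-mismatch", the usual remedy is to rebuild the extension against the numpy that is actually in the environment, or to pin numpy to the version the extension was built with.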

Best,
Bogdan

@foolnotion foolnotion added enhancement New feature or request help wanted Extra attention is needed good first issue Good for newcomers labels Apr 19, 2022
@foolnotion foolnotion self-assigned this Apr 19, 2022
@lacava
Member

lacava commented Apr 23, 2022

thanks for working on this! i could see nix being a good option for managing dependencies, just have these concerns

  1. everyone uses conda and/or has it if they have anaconda installed. i don't know too many people that use nix. maybe that's just my world but i imagine conda is generally much more popular/available.
  2. to add to that, i had a hell of a hard time getting nix to install on ubuntu 20.04. had to resolve lots of issues related to gcc 11.2, actually :)

@foolnotion
Contributor Author

thanks for the input. I also think conda is much more popular at the moment and will of course remain the main provider for srbench, especially since nix does not support windows. however, in terms of robustness and customization, there are some good arguments in favor of nix.

now that we use mamba things are not so bad anymore but some issues still remain (for example, due to the gcc issue, operon will have to switch to clang-13). I suspect that as the srbench conda env grows things will start to break.

> to add to that, i had a hell of a hard time getting nix to install on ubuntu 20.04. had to resolve lots of issues related to gcc 11.2, actually :)

That's strange, I have nix-2.7.0 installed on Ubuntu 20.04.4 LTS x86_64 and it works like a charm. It's important to install it directly from the source (not apt or conda versions since those are old/broken).

@folivetti
Contributor

Let me add my two cents from the experience I had with srbench recently:

  • Using conda is currently a test of patience, it seems to be taking ages to install the environment.
  • After switching to mamba and activating debug mode, I could see that the problem was solving for a set of package versions with no conflicts; mamba solved that much faster, but it also reveals that the environment may eventually become unsolvable.
  • The one time I tried to install with conda, there was at least one algorithm that didn't allow newer versions of numpy.
  • When testing my last PR, I also found two other issues: some packages can be installed with conda but not with mamba (don't know why, since they are all taken from conda-forge) and mamba requires sudo. So if you don't have sudo permission you'll have to suffer with conda.

Some solutions include:

  • Specifying the version of each and every one of the packages in environment.yml
  • Creating a separate env for each algorithm (but I think conda will keep copies of the same packages for each env)
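As a rough illustration of the first option, the sketch below flags dependency specs that are not pinned to a version. It assumes the specs have already been read from the `dependencies` list of environment.yml; the `unpinned` helper and the sample specs are hypothetical. Note that in conda a single `=` is a prefix match, so `numpy=1.21` would still allow any 1.21.x release.

```python
import re

# name=version or name==version, optionally followed by =build_string.
# Ranges (>=, <), wildcards (*) and bare names do not match.
_PINNED = re.compile(
    r"^[A-Za-z0-9_.\-]+={1,2}[0-9][A-Za-z0-9_.!+\-]*(=[A-Za-z0-9_.\-]+)?$"
)


def unpinned(dependencies):
    """Return the string specs that do not carry a concrete version."""
    return [
        dep
        for dep in dependencies
        # Nested entries such as {"pip": [...]} are skipped in this sketch.
        if isinstance(dep, str) and not _PINNED.match(dep.strip())
    ]


# Hypothetical specs, for illustration only.
specs = ["python>=3.9", "numpy=1.21.5", "scikit-learn", "pandas==1.4.2", "scipy=1.8.*"]
print(unpinned(specs))
```

A check like this could run in CI so that a new method cannot add a floating dependency to the shared environment unnoticed.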

Nix follows the same approach as the solutions above but without wasting additional resources. The cons are:

  • steeper learning curve
  • it requires sudo to install

Maybe we can go with an intermediate solution by asking competitors to optionally provide a flake.nix. It seems to me that most competitors fall into a pure python implementation or a mix of C++ and Python (and there's also me, the Haskeller :) ). I guess eventually we would have some nix templates that would make it easier for everyone to provide their own flake.
Still, we would have the downside of requiring sudo to install nix...

@lacava
Member

lacava commented Apr 24, 2022

All good points @folivetti. I think you should both take a look at the Competition2022 branch, where competitors DO live in separate envs. I was hoping to eventually move the main repo there.

Basically, there is a base srbench env that is then updated separately by each method. conda envs can be stacked by activating them sequentially with a flag (something like --stack), which helps minimize package copies.
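A minimal sketch of the sequential activation described here, assuming a bash shell and conda's `activate --stack` option. The environment names and the `stacked_command` helper are hypothetical, and the actual layout in the Competition2022 branch may differ.

```python
import shlex


def stacked_command(base_env, method_env, command):
    """Build a bash command line that activates base_env, stacks method_env, then runs command."""
    steps = [
        # Make `conda activate` available in a non-interactive shell.
        'source "$(conda info --base)/etc/profile.d/conda.sh"',
        f"conda activate {shlex.quote(base_env)}",
        # --stack keeps the previously activated env on PATH instead of replacing it.
        f"conda activate --stack {shlex.quote(method_env)}",
        command,
    ]
    return " && ".join(steps)


# Hypothetical environment names, for illustration only.
print(stacked_command("srbench", "srbench-operon", "python -c 'import numpy'"))
```

The resulting string could be passed to `subprocess.run(["bash", "-c", cmd], check=True)` on a machine where conda and both environments exist.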

The same goes for the docker file - we would have to have separate containers per method, maybe in a docker compose file. We really SHOULD have isolation between methods, even if it may create slight differences in versioning for the evaluation routine.

@foolnotion re nix installation, i just tried on my new laptop with ubuntu 20.04 and it installed easily. So, maybe it's just an issue with my old one :)

@folivetti
Contributor

Didn't know about the --stack flag! This would certainly be a solution!

I'm going to look at the Competition2022 branch (should have done that some time ago 😶 )

3 participants