Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Misc improvements #42

Merged
merged 5 commits into from
Dec 18, 2023
Merged

Misc improvements #42

merged 5 commits into from
Dec 18, 2023

Conversation

mariosasko
Copy link
Contributor

@mariosasko mariosasko commented Dec 15, 2023

Misc improvements to the CI (separate pytest from linting) and formatting logic (add Makefile and use ruff for formatting instead of black). Additionally, it adds a LICENSE file and organizes the extras dependencies in setup.py

@mariosasko mariosasko requested a review from guipenedo December 18, 2023 15:12
@mariosasko mariosasko marked this pull request as ready for review December 18, 2023 15:12
@guipenedo
Copy link
Collaborator

Regarding the LICENSE, is Apache 2.0 the standard for HF OSS projects?

Since you cleaned up the dependencies, something that would also be interesting to discuss imo is the best way to handle all the extra dependencies (and I don't mean dev deps/ruff) - all the dependencies that are used by a single block that are not included in setup.py to keep things lightweight.
Currently a few blocks check if their dependency libs are installed and give an error otherwise, but it can be a bit tedious having to relaunch a pipeline a few times until all dependencies have been collected. We could dynamically install any missing dependencies before launching a pipeline I suppose, though I'm not a huge fan of just running pip on subprocess for this as some people do. Wondering if you have any thoughts on this

@mariosasko
Copy link
Contributor Author

mariosasko commented Dec 18, 2023

Regarding the LICENSE, is Apache 2.0 the standard for HF OSS projects?

Yup!

Currently a few blocks check if their dependency libs are installed and give an error otherwise, but it can be a bit tedious having to relaunch a pipeline a few times until all dependencies have been collected. We could dynamically install any missing dependencies before launching a pipeline I suppose, though I'm not a huge fan of just running pip on subprocess for this as some people do. Wondering if you have any thoughts on this

Yes, I noticed this, but it's probably better to address this in a separate PR. Regarding the implementation, I think we can print a list of missing dependencies (as a pip command), as this is what transformers and datasets do in that case.

@guipenedo
Copy link
Collaborator

Sounds great. Merged!

@guipenedo guipenedo merged commit 4eb6fcd into main Dec 18, 2023
2 checks passed
@guipenedo guipenedo deleted the misc branch December 18, 2023 18:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants