fix(utils): Enhance the dependencies check to include pip distribution #317
With the 0.4.0 release, Datatrove transitioned to numpy 2.0, and the pyproject.toml was updated to ensure that the version of fasttext used is compatible with numpy 2.x.
However, if users do not start with a completely new virtual environment and keep using `fasttext-wheels`, which is compatible with numpy 1.x, the check only verifies the module name `fasttext`, which can cause an error to be raised during execution.
To solve this issue, I modified the following:
- `_is_distribution_available` in `_import_utils`, to strictly manage pip distributions.
- The `check_required_dependencies` function, to perform the above check (`fasttext-numpy2-wheel`).
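To illustrate why checking the module name alone is not enough, here is a minimal sketch using only the standard library. It is not the code from this PR: the helper names `module_importable`, `distribution_installed`, and `missing_dependencies` are hypothetical, and the `(module, distribution)` pair format is an assumption made for the example.

```python
from importlib.metadata import PackageNotFoundError, version
from importlib.util import find_spec


def module_importable(module_name: str) -> bool:
    """True if the import name resolves, whichever distribution provides it."""
    try:
        return find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised for dotted names with a missing parent, or for odd module specs.
        return False


def distribution_installed(distribution_name: str) -> bool:
    """True only if a pip distribution with this name is installed."""
    try:
        version(distribution_name)
    except PackageNotFoundError:
        return False
    return True


def missing_dependencies(requirements):
    """Return the distribution names whose module or distribution is absent.

    `requirements` is an iterable of (module_name, distribution_name) pairs;
    this pair format is an assumption for the sketch.
    """
    missing = []
    for module_name, distribution_name in requirements:
        if not (module_importable(module_name) and distribution_installed(distribution_name)):
            missing.append(distribution_name)
    return missing


if __name__ == "__main__":
    # With `fasttext-wheels` installed, `fasttext` is importable, but the
    # numpy 2.x compatible distribution is still reported as missing.
    print(missing_dependencies([("fasttext", "fasttext-numpy2-wheel")]))
```

Both `fasttext-wheels` and `fasttext-numpy2-wheel` provide the same import name, so only the distribution lookup can tell them apart.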
While running pytest, an error related to the Tibetan language occurred, but since it is unrelated to the part of the code I changed, I did not make any additional changes.