WECHSEL Tutorial Notebook #5

Merged on Nov 3, 2024 (37 commits)

@AnesBenmerzoug (Owner) commented Nov 3, 2024

This PR closes #1.

It adds a tutorial notebook for WECHSEL, fixes several bugs, and improves the package overall.

Changes

  • Track data artifacts from the original methods' code with Git LFS so they can be used in tests.
  • Add notebook dependencies to the dev dependency group.
  • Fix the bilingual dictionary alignment implementation.
  • Get the cache directory path from platformdirs instead of Path.home(), so that user configuration through the XDG_CACHE_HOME environment variable is respected.
  • Fix downloading of fasttext embeddings.
  • Compute embedding vectors in batches for weighted average initialization.
  • Compute cosine similarity using scikit-learn instead of fastdist.
  • Set values outside the top k in TopKWeights to -np.inf instead of 0.
  • Add a test for bilingual dictionary alignment.
  • Create a notebook demonstrating the use of WECHSEL.
  • Update the README.

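The cache directory change can be pictured with a standard-library sketch of the lookup order that the XDG Base Directory convention defines for caches on Linux: use `XDG_CACHE_HOME` when it is set and non-empty, otherwise fall back to `~/.cache`. This is an illustration only, not the package's code; `cache_dir` and the `"wechsel"` application name are hypothetical, and platformdirs itself also covers macOS and Windows, which this sketch ignores.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def cache_dir(app_name: str, env: Optional[Mapping[str, str]] = None) -> Path:
    """Return a per-application cache directory following the XDG convention."""
    # Allow a custom mapping so the lookup can be exercised without touching os.environ.
    env = os.environ if env is None else env
    base = env.get("XDG_CACHE_HOME", "").strip()
    # An unset or empty variable falls back to ~/.cache.
    root = Path(base) if base else Path.home() / ".cache"
    return root / app_name


# With the variable set, the user's choice wins over the home directory default.
custom = cache_dir("wechsel", {"XDG_CACHE_HOME": "/tmp/xdg"})
default = cache_dir("wechsel", {})
```

Hard-coding `Path.home()` corresponds to always taking the fallback branch, which is why a user-configured cache location would be ignored.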
@AnesBenmerzoug self-assigned this Nov 3, 2024
@AnesBenmerzoug merged commit 96b58dd into main Nov 3, 2024
3 checks passed
@AnesBenmerzoug deleted the 1-wechsel-notebook branch November 15, 2024 16:35
Linked issue: Add notebook demonstrating use of WECHSEL