Releases · EducationalTestingService/skll

27 Dec 15:07

damien2012eng

v5.1.0

b350eb0

SKLL 5.1.0 Latest

What's Changed

Replace setup.py with pyproject.toml by @tamarl08 in #773
Remove pre-commit hooks that are covered by ruff by @tamarl08 in #774
Allow None value in feature_scaling by @tamarl08 in #775
Update Numpy to use the latest version [2.0] by @damien2012eng in #772
Update scikit-learn to the latest version by @damien2012eng in #777

New Contributors

@damien2012eng made their first contribution in #772

Full Changelog: v5.0.1...v5.1.0

Contributors

damien2012eng and tamarl08

08 Mar 20:13

desilinguist

v5.0.1

91d58df

SKLL 5.0.1

🛠 Minor Changes 🛠

SKLL v5.0.1 is a minor release with no changes for users.

Updated pre-commit checks.
Updated dependencies.
- Removed all dev dependencies from requirements.txt.
- Updated versions in doc/requirements.txt.
Added new requirements.dev file. This file contains the runtime as well as dev dependencies.
Updated CONTRIBUTING.md to use this file instead of requirements.txt.
Excluded this file in MANIFEST.in so that it's not part of the PyPI package.
Updated CI pipelines to use requirements.dev instead of requirements.txt.
Updated release process checklist.

Full Changelog: v5.0.0...v5.0.1

22 Feb 18:18

tamarl08

v5.0.0

affb97a

SKLL 5.0.0

💥 Breaking changes 💥

scikit-learn has been updated to v1.4.0. This means that the SKLL experiments will likely yield different results compared to SKLL v4.0.1 (#766)
Python 3.8 and 3.9 are no longer supported since scikit-learn v1.4.0 doesn't support them.
Compared to previous versions, additional information is included in the results.json output files produced when running experiments (#761).

💡 New features 💡

SKLL results can now be automatically logged to Weights & Biases (#758, #761, #765)
Python 3.12 is now supported.

🛠 Bugfixes & Improvements 🛠

Fix ReadTheDocs config (#757)

Full Changelog: v4.0.1...v5.0.0

14 Nov 15:08

tamarl08

v4.0.1

e2cbb84

SKLL 4.0.1

What's Changed

Fix calls to yaml.load by @tamarl08 in #754

Full Changelog: v4.0.0...v4.0.1

Contributors

tamarl08

17 Jul 16:04

desilinguist

v4.0.0

b10ce39

SKLL 4.0.0

💥 Breaking changes 💥

scikit-learn has been updated to v1.3.0. This means that the same SKLL experiments could yield different results than they did with SKLL 3.2.0.

💡 New features 💡

Add BaggingClassifier and BaggingRegressor support by @desilinguist in #742
Add support for HistGradientBoostingClassifier and HistGradientBoostingRegressor by @desilinguist in #743
Include model fit times in learning curves by @desilinguist in #745
Add neg_root_mean_squared_error metric and objective for regressors by @desilinguist in #741
Add support for Python 3.11 by @desilinguist in #749
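
For reference, the name neg_root_mean_squared_error follows the scikit-learn convention of negating an error so that larger values are better during tuning. The following is a minimal pure-Python sketch of that quantity. It is not SKLL's or scikit-learn's implementation, and the function name `neg_rmse` is used only for illustration.

```python
import math


def neg_rmse(y_true, y_pred):
    """Return the negated root mean squared error of two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return -math.sqrt(sum(squared_errors) / len(squared_errors))


# A perfect prediction scores zero; worse predictions score further below zero,
# so "higher is better" holds when the value is used as a tuning objective.
perfect = neg_rmse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
worse = neg_rmse([3.0, -0.5, 2.0, 7.0], [2.5, 0.0, 2.0, 8.0])
```

Because the value is negated, a grid search that maximizes the objective ends up minimizing the root mean squared error.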

🛠 Bugfixes & Improvements 🛠

Apply code formatting and other minor changes. by @desilinguist in #724
Use pathlib.Path where possible. by @desilinguist in #725
Migrate to new codecov uploader. by @desilinguist in #728
Add type hints to skll.config module by @desilinguist in #729
Add type hints to skll.data module & improve types in skll.config by @desilinguist in #730
Bug fix in feature set split method by @tamarl08 in #731
Add type hints to skll.experiments module by @desilinguist in #732
Add type hints to skll.learner module + other refactoring by @desilinguist in #734
Add type hints in skll.utils module and in all other remaining files. by @desilinguist in #736
Improve docstrings and create linkable type hints (Part 1) by @desilinguist in #737
Improve docstrings and type hints (Part 2). by @desilinguist in #738
Improve docstrings and type hints (Part 3) by @desilinguist in #739
Improve docstrings & type hints (Part 4) by @desilinguist in #740
Migrate tests to nose2 instead of nose by @desilinguist in #747
Stop using sklearn's private _scorer API for custom metrics in SKLL. by @desilinguist in #751
Fix a few typos, etc. in the documentation by @mulhod in #712

🙏🏽 Code reviewers 🙏🏽

In no particular order: @dblandan, @mulhod, @Frost45, @tamarl08, @damien2012eng

New Contributors

@tamarl08 made their first contribution in #731

Full Changelog: v3.2.0...v4.0.0

Contributors

desilinguist, mulhod, and 4 other contributors

19 Jan 17:20

desilinguist

v3.2.0

c27198f

SKLL 3.2.0

What's Changed

Update RTD requirements to fix failing build. by @desilinguist in #719
Update dependencies, consolidate requirements, and tweak coverage by @desilinguist in #721
Release v3.2.0 by @desilinguist in #722

Full Changelog: v3.1.0...v3.2.0

Contributors

desilinguist

14 Sep 19:33

desilinguist

v3.1.0

4821da3

SKLL 3.1.0

This is a new release with dependency updates, bugfixes, and improvements.

💥 Dependency Updates 💥

scikit-learn has been updated to v1.1.2. This means that the same SKLL experiments could yield different results when run with SKLL 3.1.0 (Issue #713, PR #716).

🛠 Bugfixes & Improvements 🛠

SKLL Learners now support a new method get_feature_names_out() which returns the correct set of features actually used by the learner. Since some features might be removed by the feature selector, relying on the vectorizer vocabulary is not enough in those cases. This method allows easy access to the names of the actual features used, even if the selector has removed some features (Issue #714, PR #715).
Updated learning curve code to use the new API for seaborn v0.12.0 (PR #716)
Removed the Boston housing dataset from SKLL examples and tests. This dataset has ethical issues and is being removed from scikit-learn. (Issue #700, #717)
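
To illustrate why the vectorizer vocabulary alone is not enough for get_feature_names_out(), here is a small pure-Python sketch. The vocabulary, the selector mask, and the helper name `selected_feature_names` are all hypothetical and do not reflect SKLL's internals. The sketch only shows how a feature selector can make the two lists of names diverge.

```python
def selected_feature_names(vocabulary, support_mask):
    """Return the feature names kept by a selector, in column order.

    vocabulary: dict mapping feature name -> column index
    support_mask: list of booleans, True where the selector kept the column
    """
    if len(vocabulary) != len(support_mask):
        raise ValueError("mask length must match vocabulary size")
    names_by_column = sorted(vocabulary, key=vocabulary.get)
    return [name for name, kept in zip(names_by_column, support_mask) if kept]


# Hypothetical vocabulary and a selector mask that drops one column.
vocabulary = {"length": 0, "num_typos": 1, "constant_col": 2, "num_verbs": 3}
support_mask = [True, True, False, True]

all_names = sorted(vocabulary, key=vocabulary.get)
used_names = selected_feature_names(vocabulary, support_mask)
```

Here `all_names` still contains `constant_col`, while `used_names` lists only the features that survive selection.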

✔️ Tests ✔️

Added new tests for Learner.get_feature_names_out(). (Issue #714, PR #715)

👩‍🔬 Contributors 👨‍🔬

(Note: This list is sorted alphabetically by last name and not by the quality/quantity of contributions to this release.)

Sanjna Kashyap (@Frost45), Nitin Madnani (@desilinguist), Matt Mulholland (@mulhod), and Remo Nitschke (@remo-help).

21 Dec 20:12

desilinguist

v3.0

1502fe8

SKLL 3.0.0

This is a major new release with dependency updates and bugfixes!

⚡️ SKLL 3.0 is backwards incompatible with previous versions of SKLL and might yield different results compared to previous versions even with the same data and same settings. ⚡️

💥 Breaking Changes 💥

Python 3.7 is no longer officially supported while official support for Python 3.10 has been added (Issue #701, PR #711).
scikit-learn has been updated to v1.0.1 (Issue #699, PR #702).
The configuration field pos_label_str from the “Tuning” section has been renamed to pos_label. Older configuration files with pos_label_str will now raise an exception (Issue #569, PR #706).
The configuration field log from the “Output” section that was renamed to logs in SKLL v2.5 has now been completely deprecated. Older configuration files with log will now raise an exception (Issue #671, PR #705).

💡 New features 💡

SKLL now supports specifying custom seed values for cross-validation tasks. This option may be useful for running the same cross-validation experiment multiple times (with the same number of differently constituted folds) to get a sense of the variance across replicates (Issue #593, PR #707).
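
The idea of replicates with differently constituted folds can be sketched without SKLL. The snippet below uses only the standard library. `make_folds` is a hypothetical helper, not a SKLL function, and SKLL's own fold construction may differ.

```python
import random


def make_folds(example_ids, num_folds, seed):
    """Shuffle the IDs with the given seed and deal them into num_folds folds."""
    shuffled = list(example_ids)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::num_folds] for i in range(num_folds)]


ids = [f"EXAMPLE_{i}" for i in range(10)]

# The same seed reproduces the same folds. A different seed keeps the number
# of folds fixed but will, in general, assign examples to folds differently.
folds_a = make_folds(ids, num_folds=5, seed=1)
folds_a_again = make_folds(ids, num_folds=5, seed=1)
folds_b = make_folds(ids, num_folds=5, seed=2)
```

Running the same experiment once per seed and comparing the scores gives a rough sense of how much the results vary with fold membership.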

🛠 Bugfixes & Improvements 🛠

Using the --drop-blanks option with filter_features now raises a more useful error for the case when every single row in a tabular feature file has a blank column (Issue #693, PR #703).
SKLL conda packages are again generic Python packages instead of platform-specific ones (Issue #710, PR #711).

📖 Documentation Updates 📖

Add a new section to the hands-on tutorial explaining how to first install SKLL in a virtual environment (Issue #689, PR #709).
Add missing link to SKLL repository in the tutorial data section (Issue #688, PR #691).
Update CONTRIBUTING.md to include more detailed instructions for pushing to the SKLL repository (Issue #680, PR #704).
Link to the RSMTool implementation of quadratic_weighted_kappa which supports continuous values and can be used as a custom metric in SKLL for both hyper-parameter tuning as well as validation. See the quadratic_weighted_kappa bullet under the objectives section (Issue #512, PR #704).
Continued readability improvements to function and method docstrings.

✔️ Tests ✔️

All tests now specify local=True when making run_configuration() calls. This ensures that tests always run in local mode and prevents an unnecessary check for the gridmap library (Issue #616, PR #708).

👩‍🔬 Contributors 👨‍🔬

(Note: This list is sorted alphabetically by last name and not by the quality/quantity of contributions to this release.)

Binod Gyawali (@bndgyawali), Robbie Imbrie (@RobertImbrie), Sanjna Kashyap (@Frost45), Sözen Ozkan Grigoras (@sozkangrigoras), Nitin Madnani (@desilinguist), Matt Mulholland (@mulhod), and Damien Xie (@damien2012eng).

26 Feb 03:01

desilinguist

v2.5

d590ece

SKLL 2.5

This is a major new release with dozens of new features, bugfixes, and documentation updates!

⚡️ SKLL 2.5 is backwards incompatible with previous versions of SKLL and might yield different results compared to previous versions even with the same data and same settings. ⚡️

💥 Breaking Changes 💥

Python 3.6 is no longer officially supported since the latest versions of pandas and numpy have dropped support for it.
Older top-level imports have been removed and should now be rewritten as follows (Issue #661, PR #662):
- from skll import Learner ➡️ from skll.learner import Learner
- from skll import FeatureSet ➡️ from skll.data import FeatureSet
- from skll import run_configuration ➡️ from skll.experiments import run_configuration
The default value for the class_labels keyword argument for Learner.predict() is now True instead of False. Therefore, for probabilistic classifiers, this method will now return class labels by default instead of class probabilities. To obtain class probabilities, set class_labels to False when calling this method (Issue #621, PR #622).
The filter_features script now offers more intuitive command line options. Input files must be specified using -i/--input and output files using -o/--output. Additionally, --inverse must now be used to invert the filtering command since -i is used for input files (Issue #598, PR #660).
The MegaMReader and MegaMWriter classes have been removed from SKLL since .megam files are no longer supported by SKLL (Issue #532, PR #557).
The param_grids option in the configuration file is now a list of dictionaries, one for each learner specified in the learners option, instead of a list of lists of dictionaries. Correspondingly, the param_grid option in Learner.train() and Learner.cross_validate() is now a dictionary instead of a list of dictionaries, and the default parameter grids for each learner are also simply dictionaries (Issue #618, PR #619).
Running a learning_curve task via a configuration file now requires at least 500 examples. Fewer examples will raise a ValueError. This behavior can only be overridden when using Learner.learning_curve() directly via the API (Issue #624, PR #631).
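
The import changes listed in this section are mechanical, so existing scripts can be updated with a simple search-and-replace. The following is a minimal sketch using only the standard library. `migrate_imports` is a hypothetical helper written for this example, and it only handles the three exact single-name import lines listed in this section.

```python
# Old top-level import -> replacement, as listed in these release notes.
IMPORT_REWRITES = {
    "from skll import Learner": "from skll.learner import Learner",
    "from skll import FeatureSet": "from skll.data import FeatureSet",
    "from skll import run_configuration": "from skll.experiments import run_configuration",
}


def migrate_imports(source):
    """Rewrite exact matches of the old import lines; leave other lines untouched."""
    migrated = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped in IMPORT_REWRITES:
            # Preserve the original indentation (e.g. imports inside functions).
            indent = line[: len(line) - len(line.lstrip())]
            line = indent + IMPORT_REWRITES[stripped]
        migrated.append(line)
    return "\n".join(migrated)


old_script = "from skll import Learner\nfrom skll import FeatureSet\nprint('ok')"
new_script = migrate_imports(old_script)
```

Combined imports such as `from skll import Learner, FeatureSet` are deliberately left alone by this sketch and would need to be split by hand.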

💡 New features 💡

VotingClassifier and VotingRegressor from scikit-learn are now available for use in SKLL. This was done by adding a new VotingLearner class that uses Learner instances to represent underlying estimators (Issue #488, PR #665).
SKLL now supports custom, user-defined metrics for both hyperparameter tuning as well as evaluation (Issue #606, PR #612).
The following new built-in classification metrics are now available in SKLL: f05, f05_score_macro, f05_score_micro, f05_score_weighted, jaccard, jaccard_macro, jaccard_micro, jaccard_weighted, precision_macro, precision_micro, precision_weighted, recall_macro, recall_micro, and recall_weighted (Issues #609 and #610, PRs #607 and #612).
scikit-learn has been updated to 0.24.1 (Issue #653, PR #659).
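
As an illustration of a user-defined metric, the sketch below assumes that a custom metric is an ordinary Python function that takes the true values and the predictions as its first two arguments. The metric itself (`adjacent_agreement`) is hypothetical. How such a function is registered with SKLL is described in the SKLL documentation and is not shown here.

```python
def adjacent_agreement(y_true, y_pred):
    """Fraction of predictions that fall within one score point of the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= 1)
    return hits / len(y_true)


# Three of the four predictions are within one point of the true label.
score = adjacent_agreement([1, 2, 3, 4], [1, 3, 5, 4])
```

A metric of this shape is higher-is-better, which matters when it is used as a tuning objective rather than only for evaluation.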

🛠 Bugfixes & Improvements 🛠

Hyperparameter tuning now uses 5-fold cross-validation, instead of 3, to match the change in the default value of the cv parameter for GridSearchCV. This will marginally increase the time taken for experiments with grid search but should produce more reliable results (Issue #487, PR #667).
The SKLL codebase now uses sub-packages instead of very long modules which makes it easier to navigate and understand (Issue #600, PR #601).
The log configuration file option has been renamed to logs. Using log will still work but will raise a warning. The log option will be removed entirely in the next release (Issue #520, PR #670).
Learning curves are now correctly generated for probabilistic classifiers (Issue #648, PR #649).
Saving models in the current directory via Learner.save() no longer requires adding ./ to the path (Issue #572, PR #604).
The filter_features script no longer automatically assumes labels specified with -L or --label to be strings (Issue #598, PR #660).
Remove the create_label_dict keyword argument from Learner.train() since it did not need to be user-facing (Issue #565, PR #605).
Do not return 0 from correlation metrics when NaN is more appropriate. Doing this resulted in incorrect hyperparameter tuning results (Issue #585, PR #588).
The Learner._check_input_formatting() private method now works correctly for dense featuresets (Issue #656, PR #658).
SKLL conda packages are again platform-specific and the recipe now uses a conda_build_config.yaml to build the Python 3.7, 3.8, and 3.9 variants in one go (Issue #623, PR #XXX).
Several useful changes to the SKLL code style:
- Standardize string concatenation (Issue #636, PR #645)
- Use with context manager when opening files (Issue #641, PR #644)
- Use f-strings where possible (Issue #633, PR #634)
- Follow standard guidelines for sorting imports (Issue #638, PR #650)
- Use pre-commit hooks to enforce code formatting guidelines during development (Issue #646, PR #650)
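
As a rough illustration of the move from 3-fold to 5-fold tuning noted in this section: in an exhaustive grid search, the number of model fits is the number of parameter combinations times the number of folds. The grid below is hypothetical, `count_fits` is a helper written for this example, and the count ignores any final refit on the full training data. The actual effect on running time depends on the learner and the data.

```python
from itertools import product

# Hypothetical parameter grid: 4 values of C x 2 values of gamma = 8 candidates.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0], "gamma": ["scale", "auto"]}


def count_fits(grid, num_folds):
    """Number of fits for an exhaustive search over grid with num_folds folds."""
    num_candidates = len(list(product(*grid.values())))
    return num_candidates * num_folds


fits_with_3_folds = count_fits(param_grid, 3)
fits_with_5_folds = count_fits(param_grid, 5)
```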

📖 Documentation Updates 📖

Update CONTRIBUTING.md with the new sub-package structure of the SKLL codebase (Issue #611, PR #628).
Add a section to the README that explains how to cite SKLL (Issue #599, PR #672).
Add Azure Pipelines badge to the README (Issue #608, PR #672).
Add explicit .readthedocs.yml file to configure the auto-built documentation (Issue #668, PR #672).
Make it clear that not specifying the predictions configuration file option leads to prediction files being output in the current directory (Issue #664, PR #672).

✔️ Tests ✔️

Reduce code duplication in tests (Issue #635, PR #642).
The Linux and Windows CI builds now use Python 3.7 and 3.8 respectively, instead of Python 3.6 (Issue #524, PR #665)
Both the Linux and Windows CI builds now use consistent nosetests commands (Issue #584, PR #665).
nose-cov is now automatically installed via conda_requirements.txt when setting up a development environment instead of requiring a separate step (Issue #527, PR #672).
Add comprehensive new tests for voting learners, custom metrics, new built-in metrics, as well as for new bugfixes.
Current code coverage for SKLL tests is at 97%, the highest it has ever been!

👩‍🔬 Contributors 👨‍🔬

(Note: This list is sorted alphabetically by last name and not by the quality/quantity of contributions to this release.)

Aoife Cahill (@aoifecahill), Binod Gyawali (@bndgyawali), Nitin Madnani (@desilinguist), Matt Mulholland (@mulhod), Sree Harsha Ramesh (@srhrshr)

13 Mar 17:22

desilinguist

v2.1

1f7a6fa

SKLL 2.1

This is a minor release of SKLL with the only change being that it is now compatible with scikit-learn v0.22.2.

⚡️ There are several changes in scikit-learn v0.22 that might cause several estimators and functions to produce different results even when fit with the same data and parameters. Therefore, SKLL 2.1 can also yield different results compared to previous versions even with the same data and same settings. ⚡️

💡 New features 💡

scikit-learn updated to 0.22.2 (Issue #594, PR #595).

🔎 Other minor changes 🔎

Update imports to align with the new scikit-learn API.
A minor bugfix in logutils.py.
Update some test outputs due to changes in scikit-learn models and functions.
Update some tests to make pre-release testing for conda and PyPI packages possible.

👩‍🔬 Contributors 👨‍🔬

(Note: This list is sorted alphabetically by last name and not by the quality/quantity of contributions to this release.)

Aoife Cahill (@aoifecahill), Binod Gyawali (@bndgyawali), Matt Mulholland (@mulhod), Nitin Madnani (@desilinguist), and Mengxuan Zhao (@chaomenghsuan).
