[Epic] Preprocessing plugins #24

Open
bcebere opened this issue Feb 1, 2023 · 0 comments
Labels
enhancement (New feature or request), Epic

Comments

Contributor

bcebere commented Feb 1, 2023

Description

Preprocessing plugins, for scaling or dimensionality reduction.
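
As a rough illustration of what a scaling step of this kind could look like, here is a minimal sketch of a min-max scaler with a fit/transform split. The class name `MinMaxScalerSketch`, the `Matrix` alias, and the list-of-rows data layout are assumptions made for this example only; they are not the project's plugin interface.

```python
from typing import List, Optional

Matrix = List[List[float]]  # rows are samples, columns are features (assumed layout)


class MinMaxScalerSketch:
    """Hypothetical scaling step; illustrative only, not the project's API."""

    def __init__(self) -> None:
        self._mins: Optional[List[float]] = None
        self._ranges: Optional[List[float]] = None

    def fit(self, X: Matrix) -> "MinMaxScalerSketch":
        columns = list(zip(*X))  # transpose rows into columns
        self._mins = [min(col) for col in columns]
        # A constant column has range 0; fall back to 1.0 so division is safe.
        self._ranges = [(max(col) - min(col)) or 1.0 for col in columns]
        return self

    def transform(self, X: Matrix) -> Matrix:
        if self._mins is None or self._ranges is None:
            raise RuntimeError("fit() must be called before transform()")
        # Rescale each value using the statistics learned in fit().
        return [
            [(v - lo) / rng for v, lo, rng in zip(row, self._mins, self._ranges)]
            for row in X
        ]

    def fit_transform(self, X: Matrix) -> Matrix:
        return self.fit(X).transform(X)
```

Learning the statistics in `fit()` and applying them in `transform()` lets the same scaling be reused on held-out data.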

Why?

Epics require a lot of work and often require a change in the scope of development. Justify your epic - why can't it just be a simple issue?

Breakdown

Provide a bulleted or numbered list of how you might break this epic down into smaller issues.

  • drop constant features - TODO
  • handle multicollinearity - TODO
  • drop low variance features - TODO
  • encode data
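
The four items above could be sketched roughly as follows, using only the standard library. All function names, the column-dictionary layout, and the default thresholds are assumptions for illustration; the issue does not specify how the plugins should implement these steps.

```python
import math
import statistics
from typing import Dict, List

Columns = Dict[str, List[float]]  # feature name -> values (assumed layout)


def drop_low_variance(cols: Columns, threshold: float = 0.0) -> Columns:
    """Keep columns whose population variance exceeds `threshold`.

    With the default threshold of 0.0 this drops exactly the constant columns.
    """
    return {k: v for k, v in cols.items() if statistics.pvariance(v) > threshold}


def pearson(a: List[float], b: List[float]) -> float:
    """Pearson correlation; assumes neither column is constant."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(ss_a * ss_b)


def drop_collinear(cols: Columns, max_abs_corr: float = 0.95) -> Columns:
    """Greedy filter: keep a column only if it is not highly correlated
    with any column that was already kept."""
    kept: Columns = {}
    for name, values in cols.items():
        if all(abs(pearson(values, other)) <= max_abs_corr for other in kept.values()):
            kept[name] = values
    return kept


def one_hot(values: List[str]) -> Dict[str, List[int]]:
    """One-hot encode a categorical column, one indicator list per category."""
    return {cat: [int(v == cat) for v in values] for cat in sorted(set(values))}
```

In this sketch the variance filter runs before the collinearity filter, because the correlation is undefined for constant columns.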
bcebere added the enhancement label Feb 1, 2023
DrShushen transferred this issue from another repository Mar 3, 2023
DrShushen added a commit that referenced this issue Mar 3, 2023
DrShushen added a commit that referenced this issue Mar 3, 2023
* Update all configurations

* Set up data format and validation (#2)

* Implement library config

* Isort docs/conf.py

* Improve configuration

* Add logging

* Add .print to logger

* Set up way of updating things on config changes

* Simplify and use get_config()

* Tidy multi-line logger messages

* Add logger.raise_ convenience method

* Move logging into a package

* Add diagnose, backtrace as configurable options

* Make default config be defined in yaml file only

* Add validator design

* Add extra config test

* Define global data settings

* Clean up validator

* Use builtin types in DataSettings

* Minor refactor validator

* Set up DataRequirements

* Set up validation implementation

* Decouple tests, tidy imports

* Imports to Google style where possible

* Implement some data validators (root_validate)

* Use decorator in val. impl. method registry

* Improve exception logging

* Update some validation_implementation methods

* Reorganize dir structure

* Clean up tests

* Allow for inheritance in val. impl.

* Add validator > df tests

* Separate out SupportsContainer interface

* Shorten package names

* Factor out SupportsImplementations

* Factor out RegisterMethodDecorator

* Add dispatch_to_implementation()

* Separate out interface

* Set up framework for *Samples objects

* Add data utils

* Implement as_array for TimeSeriesSamples

* Introduce data container def.

* Update data container defs, separate out tests

* Set up default container flavor mechanism

* Add check_untyped_defs = True in mypy.ini

* Implement EventSamples

* Implement as_array for EventSamples

* Update setup requirements

* Set up docs (#4)

* Remove unneeded BAK file

* Small additions to README

* Update image display

* Major update README

* Table width

* Change image align

* Prepare README for docs

* Add logo

* Update docs

* Fix issue in README

* Dr shushen/model setup (#23)

* Remove unneeded BAK file

* Change abc import to be Google code format style

* Introduce TemporBaseModel

* Introduce fit()

* Bugfix, add typing overloads for fit()

* Add core requirements

* Introduce RequirementCategory

* Set up requirement validator concept

* Reorganize data/

* Introduce DataBundle

* Add DataBundle requirements

* Set up from_data_containers static method

* Update install_requires

* Introduce RequirementsConfig

* Reorganize packages to improve naming

* Develop RequirementsConfig further

* Improve RequirementsConfig repr

* Deal with _validate_method_config

* Minor bugfixes

* Add some int. tests for base model fit config

* Add _fit_called flag

* Introduce transform method

* Tidy imports

* Update LICENSE (#24)

* [Feat] Basic plugin interface and loader (#27)

* Rename model dir to plugins

* Set up plugins/core dir

* Simplify estimator, transformer

* Add predictor

* Fix circular import

* Add core plugin methods like name()

* Implement a Plugin interface

* Factorize out test utility "patch_module"

* Add test for plugin infrastructure

* Add test for loaded plugins

* add fit_{predict,transform} methods

* Add hyperparameter methods

* Add Base* to indicate base models

* Keep only estimator init with params as kwargs

* Rename test file

* Add parent constructor calls for clarity

* Set up Dataset (#36)

* Remove old data format

* Initial interface for Dataset

* Remove unnecessary decorator

* Add StaticSamples validation and tests

* Add TimeSeriesSamples validation

* Add EventSamples validation

* Add tests for samples basics

* Test EventSamples.split method

* Add time/sample_index helper methods

* Add @validate_arguments and some note comments

* Add repr's

* Implement from_numpy for {Static,Event}Samples

* Implement .numpy for {Static,Event}Samples

* Add utils for array/df manipulation

* Implement TimeSeriesSamples .numpy()

* Fix some typing definitions

* Write utils for array -> TS df conversion

* Check in register_plugin if re-imported (no exc.)

* Add docstrings in utils

* Add pydantic.validate_arguments in data utils

* Implement TimeSeriesSamples from_numpy()

* Update docstring in TimeSeriesSamples.__init__

* Add unit tests for Dataset

* Set default debug level to INFO

* Update tests

* Add data format tutorial notebook

* Add reprs

* Simplify _check_same_class

* Add docstring to Dataset classes
DrShushen added the Epic label May 25, 2023
DrShushen added and removed the enhancement label Sep 13, 2023