-
-
Notifications
You must be signed in to change notification settings - Fork 14
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Signed-off-by: Adam Li <[email protected]>
- Loading branch information
Showing
4 changed files
with
65 additions
and
37 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,7 +3,7 @@ | |
Thanks for considering contributing! Please read this document to learn the various ways you can contribute to this project and how to go about doing it. | ||
|
||
**Submodule dependency on a fork of scikit-learn** | ||
Due to the current state of scikit-learn's internal Cython code for trees, we have to instead leverage a maintained fork of scikit-learn at https://github.com/neurodata/scikit-learn, where specifically, the `submodulev2` branch is used to build and install this repo. We keep that fork well-maintained and up-to-date with respect to the main sklearn repo. The only difference is the refactoring of the `tree/` submodule. This fork is used internally under the namespace ``sktree._lib.sklearn``. It is necessary to use this fork for anything related to: | ||
Due to the current state of scikit-learn's internal Cython code for trees, we have to instead leverage a maintained fork of scikit-learn at <https://github.com/neurodata/scikit-learn>, where specifically, the `submodulev2` branch is used to build and install this repo. We keep that fork well-maintained and up-to-date with respect to the main sklearn repo. The only difference is the refactoring of the `tree/` submodule. This fork is used internally under the namespace ``sktree._lib.sklearn``. It is necessary to use this fork for anything related to: | ||
|
||
- `RandomForest*` | ||
- `ExtraTrees*` | ||
|
@@ -27,16 +27,16 @@ code sample or an executable test case demonstrating the expected behavior. | |
|
||
We use GitHub issues to track feature requests. Before you create an feature request: | ||
|
||
* Make sure you have a clear idea of the enhancement you would like. If you have a vague idea, consider discussing | ||
- Make sure you have a clear idea of the enhancement you would like. If you have a vague idea, consider discussing | ||
it first on a GitHub issue. | ||
* Check the documentation to make sure your feature does not already exist. | ||
* Do [a quick search](https://github.com/neurodata/scikit-tree/issues) to see whether your feature has already been suggested. | ||
- Check the documentation to make sure your feature does not already exist. | ||
- Do [a quick search](https://github.com/neurodata/scikit-tree/issues) to see whether your feature has already been suggested. | ||
|
||
When creating your request, please: | ||
|
||
* Provide a clear title and description. | ||
* Explain why the enhancement would be useful. It may be helpful to highlight the feature in other libraries. | ||
* Include code examples to demonstrate how the enhancement would be used. | ||
- Provide a clear title and description. | ||
- Explain why the enhancement would be useful. It may be helpful to highlight the feature in other libraries. | ||
- Include code examples to demonstrate how the enhancement would be used. | ||
|
||
## Making a pull request | ||
|
||
|
@@ -52,7 +52,7 @@ When you're ready to contribute code to address an open issue, please follow the | |
|
||
git clone https://github.com/USERNAME/scikit-tree.git | ||
|
||
or | ||
or | ||
|
||
git clone [email protected]:USERNAME/scikit-tree.git | ||
|
||
|
@@ -142,6 +142,7 @@ When you're ready to contribute code to address an open issue, please follow the | |
</details> | ||
|
||
### Installing locally with Meson | ||
|
||
Meson is a modern build system with a lot of nice features, which is why we use it for our build system to compile the Cython/C++ code. | ||
However, there are some intricacies that might be new to a pure Python developer. | ||
|
||
|
@@ -151,7 +152,7 @@ In general, the steps to build scikit-tree are: | |
- build and install scikit-tree locally using `spin` | ||
|
||
Example would be: | ||
|
||
pip uninstall scikit-learn | ||
|
||
# install the fork of scikit-learn | ||
|
@@ -172,13 +173,13 @@ The most common errors come from the following: | |
|
||
The CI files for github actions shows how to build and install for each OS. | ||
|
||
|
||
### Writing docstrings | ||
|
||
We use [Sphinx](https://www.sphinx-doc.org/en/master/index.html) to build our API docs, which automatically parses all docstrings | ||
of public classes and methods. All docstrings should adhere to the [Numpy styling convention](https://www.sphinx-doc.org/en/master/usage/extensions/example_numpy.html). | ||
|
||
### Testing Changes Locally With Poetry | ||
|
||
With poetry installed, we have included a few convenience functions to check your code. These checks must pass and will be checked by the PR's continuous integration services. You can install the various different developer dependencies with poetry: | ||
|
||
poetry install --with style, docs, test | ||
|
@@ -217,6 +218,22 @@ If you need to add new, or remove old dependencies, then you need to modify the | |
|
||
To update the lock file. | ||
|
||
## Developing a new Tree model | ||
|
||
Here, we define some high-level procedures for how to best approach implementing a new decision-tree model that is not supported yet in scikit-tree. | ||
|
||
1. First-pass on implementation: | ||
|
||
Implement a Cython splitter class and expose it in Python afterwards. Follow the framework for PatchObliqueSplitter and ObliqueSplitter and their respective decision-tree models: PatchObliqueDecisionTreeClassifier and ObliqueDecisionTreeClassifier. | ||
|
||
2. Second-pass on implementation: | ||
|
||
This involves extending relevant API beyond just the Splitter in Cython. This requires maintaining some degree of backwards-compatibility. Extend the existing API for Tree, TreeBuilder, Criterion, or ObliqueSplitter to enable whatever functionality you desire. | ||
|
||
3. Third-pass on implementation: | ||
|
||
This is the most complex implementation and should in theory be rarely used. This involves both designing a change in the scikit-learn fork submodule as well as relevant changes in scikit-tree itself. Extend the scikit-learn fork API. This requires maintaining some degree of backwards-compatability and testing the proposed changes wrt whatever changes you then make in scikit-tree. | ||
|
||
--- | ||
|
||
The Project abides by the Organization's [code of conduct](https://github.com/py-why/governance/blob/main/CODE-OF-CONDUCT.md) and [trademark policy](https://github.com/py-why/governance/blob/main/TRADEMARKS.md). | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -209,6 +209,7 @@ extend-exclude = ''' | |
| sktree/_lib | ||
| .asv | ||
| env | ||
| build-install | ||
) | ||
''' | ||
|
||
|