add example notebooks to the docs and fix scrolling inconvenience
paulbkoch committed Jan 4, 2024
1 parent b4aef18 commit 098fbbf
Showing 165 changed files with 16,835 additions and 5,710 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -0,0 +1,2 @@
# Visual Studio
.vs/
File renamed without changes
Binary file added _images/group-importances-all-other-groups.png
Binary file added _images/group-importances-education-group.png
Binary file added _images/group-importances-global-lstat.png
Binary file added _images/group-importances-local-exp.png
Binary file added _images/group-importances-social-group.png
File renamed without changes
6 changes: 3 additions & 3 deletions _sources/debugging-guide.ipynb
Expand Up @@ -5,15 +5,15 @@
"id": "757b49b4",
"metadata": {},
"source": [
"# Logging and Debugging\n"
"# Logging and Debugging"
]
},
{
"cell_type": "markdown",
"id": "b2d36e19",
"metadata": {},
"source": [
"## Enable logging in a Python script\n",
"<h2>Enable logging in a Python script</h2>\n",
"\n",
"1. Import the debug_mode script\n",
" ```sh\n",
Expand Down Expand Up @@ -51,7 +51,7 @@
"id": "08ad404c",
"metadata": {},
"source": [
"## Debugging Python and C++ in VS Code\n",
"<h2>Debugging Python and C++ in VS Code</h2>\n",
"\n",
"1. Set up debugging configurations for _Python_ and _C++ Attach_. As an example, the launch configuration file (`launch.json`) should contain\n",
"\n",
Expand Down
6 changes: 3 additions & 3 deletions _sources/deployment-guide.ipynb
Expand Up @@ -19,7 +19,7 @@
"id": "drawn-hamburg",
"metadata": {},
"source": [
"## Install with every dependency (default)\n",
"<h2>Install with every dependency (default)</h2>\n",
"\n",
"The package `interpret` installs every dependency needed to run any part of the package.\n",
"\n",
Expand All @@ -41,7 +41,7 @@
"id": "surprised-driver",
"metadata": {},
"source": [
"## Install with minimal dependencies\n",
"<h2>Install with minimal dependencies</h2>\n",
"\n",
"When you only want the required dependencies, or you wish to customize the dependencies, install the package `interpret-core` instead.\n",
"\n",
Expand All @@ -63,7 +63,7 @@
"id": "alone-equivalent",
"metadata": {},
"source": [
"## Install with some official dependencies (pip)\n",
"<h2>Install with some official dependencies (pip)</h2>\n",
"\n",
"This scenario is not covered in all package managers we support. If you are installing with `pip`, you can take advantage of extra tags that are exposed for `interpret-core`.\n",
"\n",
Expand Down
6 changes: 3 additions & 3 deletions _sources/dpebm.ipynb
Expand Up @@ -7,7 +7,7 @@
"source": [
"# Differentially Private EBMs\n",
"\n",
"Links to API References: [DPExplainableBoostingClassifier](./DPExplainableBoostingClassifier.ipynb), [DPExplainableBoostingRegressor](./DPExplainableBoostingRegressor.ipynb)\n",
"Links to API References: [DPExplainableBoostingClassifier](./python/api/DPExplainableBoostingClassifier.ipynb), [DPExplainableBoostingRegressor](./python/api/DPExplainableBoostingRegressor.ipynb)\n",
"\n",
"*See the reference paper for full details [[1](dp_ebms)].* [Link](https://proceedings.mlr.press/v139/nori21a/nori21a.pdf)\n"
]
Expand All @@ -17,7 +17,7 @@
"id": "announced-warning",
"metadata": {},
"source": [
"## Code Example\n",
"<h2>Code Example</h2>\n",
"\n",
"The following code will train a DPEBM classifier for the adult income dataset. The visualizations provided will be for both global and local explanations."
]
Expand Down Expand Up @@ -126,7 +126,7 @@
"id": "engaging-string",
"metadata": {},
"source": [
"## Bibliography\n",
"<h2>Bibliography</h2>\n",
"\n",
"(dp_ebms)=\n",
"[1] Harsha Nori, Rich Caruana, Zhiqi Bu, Judy Hanwen Shen, and Janardhan Kulkarni. Accuracy, Interpretability, and Differential Privacy via Explainable Boosting. In Proceedings of the 38th International Conference on Machine Learning, 8227-8237. 2021. [Paper Link](https://proceedings.mlr.press/v139/nori21a/nori21a.pdf)"
Expand Down
12 changes: 6 additions & 6 deletions _sources/dr.ipynb
Expand Up @@ -7,23 +7,23 @@
"source": [
"# Decision Rule\n",
"\n",
"Link to API Reference: [DecisionListClassifier](./DecisionListClassifier.ipynb)\n",
"Link to API Reference: [DecisionListClassifier](./python/api/DecisionListClassifier.ipynb)\n",
"\n",
"*See the backing repository for Skope Rules [here](https://github.com/scikit-learn-contrib/skope-rules).*\n",
"\n",
"## Summary\n",
"<h2>Summary</h2>\n",
"\n",
"Decision rules are logical expressions of the form `IF ... THEN ...`. Interpret's implementation uses a wrapped variant of `skope-rules`[[1](skrules_2017_dr)], which is a weighted combination of rules extracted from a tree ensemble using L1-regularized optimization over the weights. Rule systems, like single decision trees, can give interpretability at the cost of model performance. These discovered decision rules are often integrated into expert-driven rule-based systems.\n",
"\n",
"## How it Works\n",
"<h2>How it Works</h2>\n",
"\n",
"The creators of skope-rules have a lucid synopsis of what decision rules are [here](https://github.com/scikit-learn-contrib/skope-rules).\n",
"\n",
"Christoph Molnar's \"Interpretable Machine Learning\" e-book [[2](molnar2020interpretable_dr)] has an excellent overview on decision rules that can be found [here](https://christophm.github.io/interpretable-ml-book/rules.html).\n",
"\n",
"For implementation specific details, see the skope-rules GitHub repository [here](https://github.com/scikit-learn-contrib/skope-rules).\n",
"\n",
"## Code Example\n",
"<h2>Code Example</h2>\n",
"\n",
"The following code will train an skope-rules classifier for the breast cancer dataset. The visualizations provided will be for both global and local explanations."
]
Expand Down Expand Up @@ -92,7 +92,7 @@
"id": "varying-powell",
"metadata": {},
"source": [
"## Further Resources\n",
"<h2>Further Resources</h2>\n",
"\n",
"- [Skope Rules Documentation](https://skope-rules.readthedocs.io/en/latest/)"
]
Expand All @@ -102,7 +102,7 @@
"id": "mexican-philadelphia",
"metadata": {},
"source": [
"## Bibliography\n",
"<h2>Bibliography</h2>\n",
"\n",
"\n",
"(skrules_2017_dr)=\n",
Expand Down
12 changes: 6 additions & 6 deletions _sources/dt.ipynb
Expand Up @@ -7,21 +7,21 @@
"source": [
"# Decision Tree\n",
"\n",
"Links to API References: [ClassificationTree](./ClassificationTree.ipynb), [RegressionTree](./RegressionTree.ipynb)\n",
"Links to API References: [ClassificationTree](./python/api/ClassificationTree.ipynb), [RegressionTree](./python/api/RegressionTree.ipynb)\n",
"\n",
"*See the backing repository for Decision Tree [here](https://github.com/scikit-learn/scikit-learn).*\n",
"\n",
"## Summary\n",
"<h2>Summary</h2>\n",
"\n",
"A supervised decision tree. This is a recursive partitioning method where the feature space is continually split into further partitions based on a split criteria. A predicted value is learned for each partition in the \"leaf nodes\" of the learned tree. This is a light wrapper to the decision trees exposed in `scikit-learn`. Single decision trees often have weak model performance, but are fast to train and great at identifying associations. Low depth decision trees are easy to interpret, but quickly become complex and unintelligible as the depth of the tree increases. \n",
"\n",
"## How it Works\n",
"<h2>How it Works</h2>\n",
"\n",
"Christoph Molnar's \"Interpretable Machine Learning\" e-book [[1](molnar2020interpretable_dt)] has an excellent overview on decision trees that can be found [here](https://christophm.github.io/interpretable-ml-book/tree.html).\n",
"\n",
"For implementation specific details, scikit-learn's user guide [[2](pedregosa2011scikit_dt)] on decision trees is solid and can be found [here](https://scikit-learn.org/stable/modules/tree.html#tree).\n",
"\n",
"## Code Example\n",
"<h2>Code Example</h2>\n",
"\n",
"The following code will train an decision tree classifier for the breast cancer dataset. The visualizations provided will be for both global and local explanations."
]
Expand Down Expand Up @@ -90,7 +90,7 @@
"id": "metropolitan-idaho",
"metadata": {},
"source": [
"## Further Resources\n",
"<h2>Further Resources</h2>\n",
"\n",
"- [Wikipedia on Decision Trees](https://en.wikipedia.org/wiki/Decision_tree_learning)\n",
"- [scikit-learn on their Decision Tree module](https://scikit-learn.org/stable/modules/tree.html)"
Expand All @@ -101,7 +101,7 @@
"id": "supreme-prescription",
"metadata": {},
"source": [
"## Bibliography\n",
"<h2>Bibliography</h2>\n",
"\n",
"(molnar2020interpretable_dt)=\n",
"[1] Christoph Molnar. Interpretable machine learning. Lulu. com, 2020.\n",
Expand Down
9 changes: 4 additions & 5 deletions _sources/ebm-internals-classification.ipynb
Expand Up @@ -247,7 +247,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Sample code\n",
"<h2>Sample code</h2>\n",
"\n",
"Finally, here's some code which puts the above considerations together into a function that can make predictions for simplified scenarios. This code does not handle things like regression, multiclass, unknown values, or interactions beyond pairs.\n",
"\n",
Expand All @@ -273,10 +273,9 @@
" # main effects will have 1 feature, and pairs will have 2 features\n",
" for feature_idx in features:\n",
" feature_val = sample[feature_idx]\n",
" if feature_val is None or feature_val is np.nan:\n",
" # missing values are always in the 0th bin\n",
" bin_idx = 0\n",
" else:\n",
" bin_idx = 0 # if missing value, use bin index 0\n",
"\n",
" if feature_val is not None and feature_val is not np.nan:\n",
" # we bin differently for main effects and pairs,\n",
" # so determine which resolution is needed\n",
" if len(features) == 1 or len(ebm.bins_[feature_idx]) == 1:\n",
Expand Down
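Note on the hunk above (not part of the commit): both the old and the new version detect missing values with identity checks (`is None`, `is np.nan`). An identity check only matches the `np.nan` object itself; a NaN created some other way, such as `float("nan")`, is a different object and would be routed to the binning branch instead of bin 0. A minimal sketch of a value-based check follows. The helper name `is_missing` is hypothetical and does not appear in the notebook.

```python
import math

import numpy as np


def is_missing(value):
    """Hypothetical helper: True for None or any NaN, False otherwise."""
    if value is None:
        return True
    try:
        # math.isnan accepts Python floats and NumPy scalar floats
        return math.isnan(value)
    except (TypeError, OverflowError):
        # strings (categoricals) and very large ints are not missing
        return False


# The identity check misses a NaN that is not the np.nan object itself.
other_nan = float("nan")
print(other_nan is np.nan)    # identity comparison
print(is_missing(other_nan))  # value-based comparison
```

Whether this matters for the sample code depends on how `X` is stored; the sketch only illustrates the difference between the two kinds of check.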
17 changes: 6 additions & 11 deletions _sources/ebm-internals-multiclass.ipynb
Expand Up @@ -249,7 +249,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Sample code\n",
"<h2>Sample code</h2>\n",
"\n",
"This sample code incorporates everything discussed in all 3 sections. It could be used as a drop in replacement for the existing EBM predict function of the ExplainableBoostingRegressor or as the predict_proba function of the ExplainableBoostingClassifier."
]
Expand All @@ -265,13 +265,10 @@
"sample_scores = []\n",
"for sample in X:\n",
" # start from the intercept for each sample\n",
" score = ebm.intercept_\n",
" score = ebm.intercept_.copy()\n",
" if isinstance(score, float) or len(score) == 1:\n",
" # binary classification or regression\n",
" # regression or binary classification\n",
" score = float(score)\n",
" else:\n",
" # multiclass\n",
" score = score.copy()\n",
"\n",
" # we have 2 terms, so add their score contributions\n",
" for term_idx, features in enumerate(ebm.term_features_):\n",
Expand All @@ -281,11 +278,9 @@
" # main effects will have 1 feature, and pairs will have 2 features\n",
" for feature_idx in features:\n",
" feature_val = sample[feature_idx]\n",
" bin_idx = 0 # if missing value, use bin index 0\n",
"\n",
" if feature_val is None or feature_val is np.nan:\n",
" # missing values are always in the 0th bin\n",
" bin_idx = 0\n",
" else:\n",
" if feature_val is not None and feature_val is not np.nan:\n",
" # we bin differently for main effects and pairs, so first \n",
" # get the list containing the bins for different resolutions\n",
" bin_levels = ebm.bins_[feature_idx]\n",
Expand Down Expand Up @@ -319,7 +314,7 @@
"\n",
"if hasattr(ebm, 'classes_'):\n",
" # classification\n",
" if len(ebm.classes_) <= 2:\n",
" if len(ebm.classes_) == 2:\n",
" # binary classification\n",
"\n",
" # softmax expects two logits for binary classification\n",
Expand Down
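Note on the hunk above (not part of the commit): the comment "softmax expects two logits for binary classification" refers to turning one score per sample into a two-column probability output. A minimal sketch of that step, assuming the logit of the first class is fixed at 0; the function name `binary_scores_to_proba` is hypothetical, and the notebook's elided lines may do this differently.

```python
import numpy as np


def binary_scores_to_proba(scores):
    """Map one logit per sample to [P(class 0), P(class 1)] via softmax."""
    scores = np.asarray(scores, dtype=float)
    # assumption: the first class always has logit 0
    logits = np.column_stack([np.zeros_like(scores), scores])
    # subtract the row maximum for numerical stability
    logits = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


proba = binary_scores_to_proba([0.0, 2.0, -1.0])
print(proba)
```

With a zero logit for the first class, the second column equals the logistic sigmoid of the score, so this is consistent with the usual binary formulation.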
2 changes: 1 addition & 1 deletion _sources/ebm-internals-regression.ipynb
Expand Up @@ -174,7 +174,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Sample code\n",
"<h2>Sample code</h2>\n",
"\n",
"Finally, here's some code which puts the above considerations together into a function that can make predictions for simplified scenarios. This code does not handle things like interactions, missing values, unknown values, or classification.\n",
"\n",
Expand Down
28 changes: 7 additions & 21 deletions _sources/ebm.ipynb
Expand Up @@ -7,15 +7,15 @@
"source": [
"# Explainable Boosting Machine\n",
"\n",
"Links to API References: [ExplainableBoostingClassifier](./ExplainableBoostingClassifier.ipynb), [ExplainableBoostingRegressor](./ExplainableBoostingRegressor.ipynb)\n",
"Links to API References: [ExplainableBoostingClassifier](./python/api/ExplainableBoostingClassifier.ipynb), [ExplainableBoostingRegressor](./python/api/ExplainableBoostingRegressor.ipynb)\n",
"\n",
"*See the reference paper for full details [[1](lou2013accurate_ebm)].* [Link](https://www.cs.cornell.edu/~yinlou/papers/lou-kdd13.pdf)\n",
"\n",
"## Summary\n",
"<h2>Summary</h2>\n",
"\n",
"Explainable Boosting Machine (EBM) is a tree-based, cyclic gradient boosting Generalized Additive Model with automatic interaction detection. EBMs are often as accurate as state-of-the-art blackbox models while remaining completely interpretable. Although EBMs are often slower to train than other modern algorithms, EBMs are extremely compact and fast at prediction time.\n",
"Explainable Boosting Machine (EBM) is a tree-based, cyclic gradient boosting Generalized Additive Model with automatic interaction detection. EBMs are often as accurate as state-of-the-art blackbox models while remaining completely interpretable.\n",
"\n",
"## How it Works\n",
"<h2>How it Works</h2>\n",
"\n",
"As part of the framework, InterpretML also includes a new interpretability algorithm -- the Explainable Boosting Machine (EBM). EBM is a glassbox model, designed to have accuracy comparable to state-of-the-art machine learning methods like Random Forest and Boosted Trees, while being highly intelligibile and explainable. EBM is a generalized additive model (GAM) of the form:\n",
"\n",
Expand Down Expand Up @@ -48,25 +48,11 @@
"id": "announced-warning",
"metadata": {},
"source": [
"## Code Example\n",
"<h2>Code Example</h2>\n",
"\n",
"The following code will train an EBM classifier for the adult income dataset. The visualizations provided will be for both global and local explanations."
]
},
{
"cell_type": "markdown",
"id": "coated-palestinian",
"metadata": {},
"source": [
"````{margin}\n",
"```{note}\n",
"EBM is slow and we don't have loading bars. If it looks like it froze, it's probably still burning all your CPU cycles.\n",
"\n",
"All of them.\n",
"```\n",
"````"
]
},
{
"cell_type": "code",
"execution_count": null,
Expand Down Expand Up @@ -163,7 +149,7 @@
"id": "occupied-withdrawal",
"metadata": {},
"source": [
"## Further Resources\n",
"<h2>Further Resources</h2>\n",
"\n",
"- [Paper: GA2M](https://www.cs.cornell.edu/~yinlou/papers/lou-kdd12.pdf)\n",
"- [Paper: InterpretML Framework](https://arxiv.org/pdf/1909.09223.pdf)\n",
Expand All @@ -175,7 +161,7 @@
"id": "engaging-string",
"metadata": {},
"source": [
"## Bibliography\n",
"<h2>Bibliography</h2>\n",
"\n",
"(lou2013accurate_ebm)=\n",
"[1] Yin Lou, Rich Caruana, Johannes Gehrke, and Giles Hooker. Accurate intelligible models with pairwise interactions. In Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, 623–631. 2013. [Paper Link](https://www.cs.cornell.edu/~yinlou/papers/lou-kdd13.pdf)\n",
Expand Down
1 change: 1 addition & 0 deletions _sources/examples.md
@@ -0,0 +1 @@
# Examples