BUG: generate valid EBMModel when merging #578
base: develop
Conversation
Signed-off-by: DerWeh <[email protected]>
Codecov Report
Attention: Patch coverage is …

@@            Coverage Diff             @@
##           develop     #578      +/-  ##
===========================================
- Coverage    76.73%   73.99%    -2.74%
===========================================
  Files           72       72
  Lines         9622     9634      +12
===========================================
- Hits          7383     7129     -254
- Misses        2239     2505     +266
Yes, @DerWeh, this approach looks great. Let me know when you feel it's ready to be merged.
@paulbkoch sorry for the long break. I think most things work by now, but I would need some help with the test. Another point is that, on failures, a bare Exception is always raised.
Hi @DerWeh -- Thanks for these improvements! I suspect for the monotone_constraints test, the issue isn't the time spent in the fit function. I've noticed that merge_ebms is extremely slow in general. This isn't related to your PR, but there's something in merge_ebms that takes an inordinate amount of time, and merge_ebms often takes longer than fitting for smaller EBMs. I haven't had time to diagnose that issue yet.

If I'm wrong and it's the fitting time, then it might be that the given monotone constraints somehow violate what the model wants to do. We have a good example in our docs of the shape functions for the synthetic dataset here: https://interpret.ml/docs/python/examples/interpretable-regression-synthetic.html I might suggest using monotone increasing for feature#6 and feature#3. We don't have a good example of a decreasing function actually, but if you want a feature to test monotone decreasing, maybe use feature#7 (the unused one).

The other issue you pointed to is a bug somewhere in my code that is allowing a single feature to get mixed categorical/continuous bin definitions. The bins_ attribute should contain information that allows discretization of continuous features and bin assignment for categoricals. For EBMs we need to be able to discretize features differently depending on whether the feature is being used as a main, a pair, or a higher order of interaction. For continuous features, you might see something like this for a single feature:

[[1.5, 3.5, 5.5, 7.5, 9.5], [4.5, 6.5]]

which means that if the feature is being used as a main, it is discretized into the six bins delimited by the cuts 1.5, 3.5, 5.5, 7.5, and 9.5, and for pairs into the three bins delimited by the cuts 4.5 and 6.5.

For categoricals we use a dictionary, so you might see:

[{"Canada": 1, "France": 2, "Germany": 3}]

The bug seems to be that we somehow get into a state where we have mixed categorical/continuous bins, so something like this for a single feature:

[[1.5, 3.5, 5.5, 7.5, 9.5], {"4": 1, "5": 2, "BAD": 3}]

This probably arises, I think, when two models are initially merged where one model contains a categorical and the other model contains a corresponding continuous feature. It's easy to see how this might occur, given we automatically detect whether a feature is continuous or categorical by default. When this occurs, we are able to convert the categorical into a continuous for one of the models, which then allows the merge to proceed. I think the bug occurs during this conversion and we somehow get inconsistent categorical/continuous bin definitions. If that's true, then it's probably located in this function: interpret/python/interpret-core/interpret/glassbox/_ebm/_utils.py, lines 43 to 52 at commit 64158be.
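A small hypothetical sketch of this bins_ layout (not taken from the library internals; interpret also reserves special bins for missing and unseen values, which are ignored here):

```python
import numpy as np

# Hypothetical bins_ entry for one continuous feature: the outer list holds one
# cut list per resolution level (mains first, then pairs), as described above.
bins_entry = [[1.5, 3.5, 5.5, 7.5, 9.5], [4.5, 6.5]]

values = np.array([0.7, 2.0, 5.0, 8.0, 11.0])

# Used as a main: five cuts produce six bins, indexed 0..5 in this sketch.
main_bins = np.digitize(values, bins_entry[0])   # -> [0 1 2 4 5]

# Used in a pair: the coarser cut list produces only three bins.
pair_bins = np.digitize(values, bins_entry[1])   # -> [0 0 1 2 2]

# A categorical feature instead maps category strings to bin indices:
categorical_entry = [{"Canada": 1, "France": 2, "Germany": 3}]
```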
If your test_merge_exclude test is hitting the exception raised in that _utils.py check, it is probably a limited corner case of that bigger issue: the excluded feature probably contains an empty list [] inside the bins_ attribute. Since the feature is never used, it's probably not getting a bin definition. Since the bin_levels list contains no objects, len(set(map(type, bin_levels))) would be zero instead of 1. This check should be changed to <= 1 instead of != 1 to handle the excluded feature case. If the other model doesn't exclude the feature, then we can use whatever categorical/continuous definition is used in the other model. I agree the exception types could be improved. Probably best to do that in a new PR.
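A minimal sketch of that relaxed check, with hypothetical variable names rather than the actual code in _utils.py:

```python
# bin_levels stands for one entry of bins_; it is [] for a feature that is
# excluded in every model (hypothetical names, not the real _utils.py code).
bin_levels = []

level_types = set(map(type, bin_levels))

# The old condition `len(level_types) != 1` also trips on the empty case
# (len == 0), even though an excluded feature is not inconsistent. Accepting
# `<= 1` means raising only for genuine categorical/continuous mixtures:
if len(level_types) > 1:
    raise Exception("mixed categorical/continuous bin definitions")
```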
Thanks for the info, in this case the bottleneck is fitting the model with constraints. Switching from classification to regression and focusing only on the features you mentioned, I get somewhat reasonable test times.
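For illustration, a rough sketch of what such a regression-based constraint test could look like; the monotone_constraints parameter name is taken from this thread, its 0/1 encoding is an assumption, and the random data merely stands in for the docs' synthetic dataset:

```python
import numpy as np
from interpret.glassbox import ExplainableBoostingRegressor, merge_ebms

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(400, 8))
# The target increases with features 3 and 6, mirroring the docs example.
y = 1.5 * X[:, 6] + 0.5 * X[:, 3] + 0.1 * rng.normal(size=400)

constraints = [0] * 8                  # 0 = unconstrained (assumed encoding)
constraints[3] = constraints[6] = 1    # 1 = monotone increasing (assumed encoding)

ebms = [
    ExplainableBoostingRegressor(monotone_constraints=constraints).fit(X, y)
    for _ in range(2)
]
merged = merge_ebms(ebms)
```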
Regarding the mixed categorical/continuous bins: this is exactly the case. I replaced

```python
bin_types = {type(model.bins_[feature_idx][0]) for model in models}
```

with

```python
feature_bins = [model.bins_[feature_idx] for model in models if model.bins_[feature_idx]]
bin_types = {type(bin_[0]) for bin_ in feature_bins}
```

to ignore a model if it doesn't define the bins for a feature. If the bins are missing in all models, it becomes more intricate. I planned to catch that case using

```python
if not feature_bins:
    new_feature_types.append("continuous")  # default type
    new_bins.append([])
```

but this leads to crashes in _harmonize_tensor.
The common case is just a single element, so the overhead of NumPy is very large.
Hi @DerWeh -- I think we might not need to call _harmonize_tensor if a feature is excluded in all models. What that function does is match the tensors between two features, so if you had two mains with different cuts, e.g. model1: [2.5, 4.5, 6.5], the new model would have cuts combining the cuts of both models. The _harmonize_tensor function is then used to update the term_score_ arrays to match the increase in the number of cuts.

In the case where all of the models ignore a feature, there are no tensors that need to be updated, so I think we can just avoid calling _harmonize_tensor in that case. The resulting model shouldn't include this feature as a term, so the call to _harmonize_tensor is superfluous I think and will get ignored in the end.

@DerWeh, I'd be happy to jump in on your branch if this isn't enough explanation and you'd like some help getting this to the finish.
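As a toy illustration (not the PR's _harmonize_tensor, which also deals with weights and missing/unseen bins), re-expressing one model's per-bin scores on a merged, finer cut grid could look like this:

```python
import numpy as np

cuts_model1 = np.array([2.5, 4.5, 6.5])            # 3 cuts -> 4 bins in model1
cuts_merged = np.array([2.5, 3.5, 4.5, 6.5, 7.5])  # merged cuts -> 6 bins

scores_model1 = np.array([0.1, -0.2, 0.05, 0.3])   # one score per model1 bin

# Every merged bin lies inside exactly one model1 bin, so digitizing each
# merged bin's lower edge tells us which model1 score to copy into it.
lower_edges = np.concatenate(([-np.inf], cuts_merged))
source_bin = np.digitize(lower_edges, cuts_model1)   # -> [0 1 1 2 3 3]
scores_on_merged_grid = scores_model1[source_bin]    # -> [0.1 -0.2 -0.2 0.05 0.3 0.3]
```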
@paulbkoch if you feel like you can jump in easily, you are welcome to do so. I pushed all my changes. I have one test case which currently fails. Your explanation is fine, but it feels like I need plenty more time to work through the internals to properly fix it.
This PR is meant to fix #576. Upon merging, we average numerical parameters.

I would add a warning to the documentation that we do not guarantee that EBMs produced by `merge_ebms` can be fitted. You allow for so many different types of arguments that we can hardly catch every edge case meaningfully.

@paulbkoch please take a look at `_initialize_ebm`. Is this how you imagine handling the parameters? I would appreciate some early feedback. If this is in general what you have in mind, I'll try cleaning up the code some more. But it's quite messy, as your API is so flexible.

TODO

- `monotonize` and `exclude`
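As a rough sketch of the "average numerical parameters" idea: the PR's `_initialize_ebm` has to deal with many more argument types, so the helper below is only illustrative and its name and the example parameter names are made up.

```python
from statistics import mean

def average_numeric_params(models, names):
    """Average the named constructor parameters over models when all values are numeric."""
    merged_params = {}
    for name in names:
        values = [getattr(model, name) for model in models]
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
            merged_params[name] = mean(values)
        else:
            # Non-numeric or mixed settings need dedicated handling elsewhere.
            merged_params[name] = values[0]
    return merged_params

# Example usage (parameter names are illustrative):
# params = average_numeric_params(ebms, ["learning_rate", "validation_size"])
```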