Initial attempt at implementing hard majority voting #126
Conversation
Thanks @LukasGardberg, don't worry about the CI problem. I will leave a detailed comment as soon as possible.
torchensemble/_base.py
Outdated
```diff
@@ -349,8 +357,8 @@ def __init__(self, input_dim, output_dim, depth=5, lamda=1e-3, cuda=False):

         self._validate_parameters()

-        self.internal_node_num_ = 2 ** self.depth - 1
-        self.leaf_node_num_ = 2 ** self.depth
+        self.internal_node_num_ = 2**self.depth - 1
+        self.leaf_node_num_ = 2**self.depth
```
Several lines here do not meet the coding style requirements. Please consider executing the command black --skip-string-normalization --config pyproject.toml torchensemble/ in the root directory of the project, which will automatically format the code.
Weird, I ran the same black command and it automatically made the changes mentioned above. I have reverted them manually now. Maybe it has something to do with my version of black.
torchensemble/voting.py
Outdated
```python
def __init__(self, voting_strategy="soft", **kwargs):
    super(VotingClassifier, self).__init__(**kwargs)

    self._implemented_voting_strategies = ["soft", "hard"]
```
This attribute is redundant, maybe we could hard-code the supported strategies, for example, change the attribute in line 100 into a set ("soft", "hard"), and replace the latter {} in line 103 with a fixed string.
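For illustration, a minimal sketch of what this suggestion could look like inside VotingClassifier.__init__ (the exact line numbers and the error message wording are assumptions, not the code from this PR):

```python
def __init__(self, voting_strategy="soft", **kwargs):
    super(VotingClassifier, self).__init__(**kwargs)

    # Hard-code the supported strategies instead of keeping them in an attribute.
    if voting_strategy not in ("soft", "hard"):
        msg = (
            "Voting strategy {} is not supported, "
            "please choose from 'soft' and 'hard'."
        )
        raise ValueError(msg.format(voting_strategy))

    self.voting_strategy = voting_strategy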
Check if my latest changes are what you meant :)
The core implementation of the voting strategies looks good after a rough look. We need extra unit tests to ensure the correctness. Have you ever written unit tests?
Yes, a bit, but never with pytest. I'm thinking I'll just imitate the tests that already exist. Is there anything specific I should test for? Also, when running the tests locally I seem to be getting an error:
I tried adding an init method to the
Thanks for your great work @LukasGardberg! Kind of busy these days, and I will take a look in a few days.
As to unit tests, provided that we move the implementation of hard voting to the utils/operator sub-package, it should be enough to only test this function in tests/test_operator.py. You could manually specify several tensors, and see if the result from hard voting conforms to the expected result.
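As a rough sketch of such a test (the function name majority_vote, its import path, and the (n_estimators, n_samples, n_classes) shape convention are assumptions here, not necessarily what the PR ended up with):

```python
import torch

from torchensemble.utils import operator as op


def test_majority_vote():
    # Outputs from three estimators on two samples with three classes,
    # shaped (n_estimators, n_samples, n_classes).
    outputs = torch.tensor(
        [
            [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1]],
            [[0.7, 0.2, 0.1], [0.2, 0.7, 0.1]],
            [[0.1, 0.8, 0.1], [0.6, 0.3, 0.1]],
        ]
    )

    # Two of the three estimators vote for class 0 on the first sample
    # and for class 1 on the second, so hard voting should pick those.
    proba = op.majority_vote(outputs)
    assert proba.argmax(dim=1).tolist() == [0, 1]
```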
Meanwhile, please click the
@all-contributors please add @LukasGardberg for code
I've put up a pull request to add @LukasGardberg! 🎉
Okay, now hard majority voting has been moved to ops. I also added a simple test, and added soft voting to NeuralForestClassifier and SnapshotEnsembleClassifier, but they have not been tested. Are there any other models I should add hard voting for? I'm assuming it's only applicable to classifiers, right? Also, I'm unsure if I solved the init inheritance problem properly, but the tests are at least passing now @xuyxu :)
Please refer to the comments, and I will add some final docs once they are resolved.
```diff
@@ -203,7 +203,7 @@ def predict(self, *x):
 class BaseTreeEnsemble(BaseModule):
     def __init__(
         self,
         n_estimators,
```
Is there any reason for adding the default value for n_estimators?
Hey, no, I accidentally left it in there when trying some other stuff, hope it's ok!
LGTM.
Merged, thanks @LukasGardberg
Thanks a lot @xuyxu for all the help! Exciting to make my first contribution :)
This is a first start to implementing hard majority voting, as mentioned in #118.
For a brief description of soft and hard voting, see sklearn's definition.
For hard voting, the vector proba returned by the forward method of the VotingClassifier class now instead becomes a one-hot vector with a 1 on the class which received the majority of the base estimators' votes.

Worth considering is that torch.max is called in the evaluate method, which essentially just gets the index of the 1 in proba from the forward method; this works, so maybe it isn't a problem?

Any suggestions for improvements are very welcome!
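For illustration, a minimal sketch of the hard-voting behaviour described above (this is not the exact PR code; the function name, the way base estimator outputs are gathered, and their shape are assumptions):

```python
import torch
import torch.nn.functional as F


def hard_voting(estimator_outputs):
    """Turn per-estimator class probabilities into one-hot majority votes.

    estimator_outputs: tensor of shape (n_estimators, n_samples, n_classes).
    """
    # Each base estimator votes for its most probable class.
    votes = estimator_outputs.argmax(dim=2)  # (n_estimators, n_samples)

    # Count the votes per class and take the class with the most votes.
    n_classes = estimator_outputs.size(2)
    vote_counts = F.one_hot(votes, num_classes=n_classes).sum(dim=0)
    majority = vote_counts.argmax(dim=1)  # (n_samples,)

    # Return a one-hot "proba" with a 1 on the majority class.
    return F.one_hot(majority, num_classes=n_classes).float()


if __name__ == "__main__":
    # Example: 5 estimators, 4 samples, 3 classes.
    outputs = torch.softmax(torch.randn(5, 4, 3), dim=2)
    print(hard_voting(outputs))
```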