This repository has been archived by the owner on Mar 6, 2021. It is now read-only.

min_samples_split == 1 raises ValueError in Decision Tree Classifier #5

Open
rnmourao opened this issue Feb 18, 2018 · 0 comments


Hi:

I tested the simplest call of ensemble_train and got a ValueError for the parameter min_samples_split:

Traceback (most recent call last):
  File "pyensemble/ensemble_train.py", line 202, in
    ens.fit(X_train, y_train)
  File "/home/mourao/income_prediction/pyensemble/ensemble.py", line 290, in fit
    self.fit_models(X, y)
  File "/home/mourao/income_prediction/pyensemble/ensemble.py", line 325, in fit_models
    model.fit(X[train_inds], y[train_inds])
  File "/usr/local/lib/python2.7/dist-packages/sklearn/tree/tree.py", line 790, in fit
    X_idx_sorted=X_idx_sorted)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/tree/tree.py", line 194, in fit
    % self.min_samples_split)
ValueError: min_samples_split must be an integer greater than 1 or a float in (0.0, 1.0]; got the integer 1

I solved the problem by removing 1 from the min_samples_split list in model_library.py:


def build_decisionTreeClassifiers(random_state=None):
    rs = check_random_state(random_state)

    param_grid = {
        'criterion': ['gini', 'entropy'],
        'max_features': [None, 'auto', 'sqrt', 'log2'],
        'max_depth': [None, 1, 2, 5, 10],
        # 1 is omitted: scikit-learn requires an integer greater than 1 here
        'min_samples_split': [2, 5, 10],
        'random_state': [rs.random_integers(100000) for i in xrange(3)],
    }

    return build_models(DecisionTreeClassifier, param_grid)
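
For anyone who would rather guard the grid than edit it by hand, here is a minimal sketch of the rule quoted in the error message (an integer greater than 1, or a float in (0.0, 1.0]). The helper names is_valid_min_samples_split and filter_min_samples_split are hypothetical and not part of pyensemble or scikit-learn, and the candidate list containing 1 is only an assumption about what the grid looked like before the fix.

```python
import numbers


def is_valid_min_samples_split(value):
    """Return True if value satisfies the rule quoted in the error message:
    an integer greater than 1, or a float in (0.0, 1.0]."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool):
        return False
    if isinstance(value, numbers.Integral):
        return value > 1
    if isinstance(value, float):
        return 0.0 < value <= 1.0
    return False


def filter_min_samples_split(candidates):
    """Keep only the candidates that pass the check, preserving order."""
    return [v for v in candidates if is_valid_min_samples_split(v)]


# Hypothetical candidate list that still contains the offending 1.
candidates = [1, 2, 5, 10]
valid = filter_min_samples_split(candidates)
print(valid)  # prints [2, 5, 10]
```

The filtered list could then be used as the 'min_samples_split' entry of param_grid. The check only mirrors the message from the scikit-learn version in the traceback; other versions may validate this parameter differently.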