Support of NaN #142
For LOCO and PermutationImportance, I think that the handling of NaN values can be left to the sklearn pipeline (using a model that supports them natively or including an imputation step; see the sketch below). However, for CPI, I think we need further work inside the library: not only the main predictive model but also the estimator of the conditional distribution should support NaNs. Not sure, but this might also apply to knockoffs?
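A minimal sketch of what that pipeline-level handling could look like (purely illustrative, not hidimstat code; the imputation strategy and estimators are placeholders):

```python
# Sketch: let the user-supplied sklearn pipeline deal with NaNs, either via an
# imputation step or via an estimator with native missing-value support.
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline

X = np.array([[1.0, np.nan], [2.0, 0.5], [np.nan, 1.5], [4.0, 2.0]])
y = np.array([0.1, 0.3, 0.2, 0.8])

# Option 1: impute before an estimator that rejects NaNs (e.g. Lasso).
imputing_pipeline = make_pipeline(SimpleImputer(strategy="mean"), Lasso(alpha=0.01))
imputing_pipeline.fit(X, y)

# Option 2: use an estimator with native missing-value support.
native_nan_model = HistGradientBoostingRegressor().fit(X, y)
```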
Do you know if Lasso and RandomForest can natively handle NaN values?
RandomForest yes, Lasso no
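If in doubt, a quick empirical probe (just a sketch, not library code) of whether an estimator accepts NaNs at fit time; note that native missing-value support in random forests depends on the scikit-learn version:

```python
# Sketch: probe whether an estimator accepts NaNs by fitting on data containing one.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.random((20, 3))
X[0, 0] = np.nan
y = rng.random(20)

for estimator in (RandomForestRegressor(n_estimators=5), Lasso()):
    try:
        estimator.fit(X, y)
        print(type(estimator).__name__, "accepts NaNs")
    except ValueError:
        print(type(estimator).__name__, "rejects NaNs")
```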
But in most cases, the end estimator will be a scikit-learn one. If we can defer the handling of NaNs to it, that would be ideal.
The problem is not the estimator but the method itself.
My best suggestion is to take a look at https://skrub-data.org
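For context, a minimal sketch of what the skrub route could look like, assuming skrub's TableVectorizer combined with a NaN-tolerant downstream model (how missing values are treated in detail depends on the skrub version):

```python
# Sketch: vectorize a heterogeneous table with skrub, then fit a model that
# tolerates NaNs natively.
import pandas as pd
from skrub import TableVectorizer
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.pipeline import make_pipeline

df = pd.DataFrame({
    "age": [25.0, None, 40.0],
    "city": ["Paris", None, "Lyon"],
})
y = [0.0, 1.0, 0.5]

pipeline = make_pipeline(TableVectorizer(), HistGradientBoostingRegressor())
pipeline.fit(df, y)
```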
It was my suggestion to @jpaillard, but he told me that NaNs carry information by themselves for the estimation, so it is important to keep them.
Only when you work with categorical variables, AFAIK
I would argue for the two last options: for categorical data, treat NaNs as a special value, and for continuous data, impute the NaN values.
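A sketch of that split (column names and encoder choices are purely illustrative):

```python
# Sketch: NaN as an explicit category for categorical columns, mean imputation
# for continuous ones.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

categorical = make_pipeline(
    # Missing categories become their own value, then their own one-hot column.
    SimpleImputer(strategy="constant", fill_value="missing"),
    OneHotEncoder(handle_unknown="ignore"),
)
preprocess = ColumnTransformer([
    ("cat", categorical, ["city"]),
    # Continuous NaNs are replaced by the column mean.
    ("num", SimpleImputer(strategy="mean"), ["age"]),
])

df = pd.DataFrame({"age": [25.0, None, 40.0], "city": ["Paris", None, "Lyon"]})
X = preprocess.fit_transform(df)
```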
The user might want to use a main predictive model/pipeline that handles NaN values (i.e.,
But then, the problem should rather be addressed from the beginning, before using hidimstat, no?
I was specifically thinking about the case where the predictive model is
This may be too specific, and we should leave the management of NaNs to the user.
There will be a need to check how the methods handle NaNs and to add the associated tests.
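One possible shape for such a test (a sketch only: sklearn's permutation_importance with a NaN-tolerant model stands in here for the hidimstat method under test):

```python
# Sketch: inject NaNs into a toy dataset and check that an importance method
# still runs and returns finite scores.
import numpy as np
import pytest
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.inspection import permutation_importance


def _make_data_with_nans(n_samples=100, n_features=5, nan_fraction=0.1, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_features))
    y = X[:, 0] + rng.normal(scale=0.1, size=n_samples)
    X[rng.random(X.shape) < nan_fraction] = np.nan
    return X, y


@pytest.mark.parametrize("nan_fraction", [0.0, 0.1, 0.3])
def test_importance_with_nans(nan_fraction):
    X, y = _make_data_with_nans(nan_fraction=nan_fraction)
    model = HistGradientBoostingRegressor().fit(X, y)
    result = permutation_importance(model, X, y, n_repeats=5, random_state=0)
    assert np.all(np.isfinite(result.importances_mean))
```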