I am using the Weibull Cure model implementation proposed in the docs to predict the conversion of subjects (leads) for a specific business process.
A cure model is necessary here because most conversion events will never happen. In this context we are typically interested in the CDF, since it can be interpreted as the probability that a subject has converted by time $t$.
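For concreteness, in the standard mixture cure formulation (which, as far as I understand, is what the docs example implements) a fraction of subjects will never convert, so

$$S(t \mid x) = p(x) + \bigl(1 - p(x)\bigr)\, S_u(t \mid x), \qquad F(t \mid x) = \bigl(1 - p(x)\bigr)\bigl(1 - S_u(t \mid x)\bigr),$$

where $p(x)$ is the never-converting ("cured") fraction and $S_u$ is a Weibull survival function for the susceptible subjects. The CDF therefore plateaus at $1 - p(x)$, the probability of ever converting.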
The way I plan on using the model is by making predictions on subjects that were created up to `max_age` ago and have not converted yet.
For survival regression I routinely see people split the data on the subject IDs (sometimes with stratification on `event_col`), and I am trying to understand why this is the right thing to do, especially if I want to use the model to make predictions over a predictive horizon with `.predict_cumulative_density()` (and the `conditional_after` argument). The performance of the model over that predictive horizon will be important to understand.
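For reference, the prediction call I have in mind looks roughly like this (a sketch; `model`, `open_leads`, and `age_days` are placeholder names, assuming the fitted lifelines regression model exposes `predict_cumulative_density` with `conditional_after`):

```python
import numpy as np

# Leads created up to `max_age` ago that have not converted yet;
# `age_days` is how long each lead has already survived without converting.
horizon = np.arange(1, 91)  # score the next 90 days

# P(converted by t | still unconverted after age_days), one column per lead:
cdf = model.predict_cumulative_density(
    open_leads,
    times=horizon,
    conditional_after=open_leads["age_days"].values,
)
```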
In practice, one would train the model on a time window ranging from some point in the past up to the time of training, which becomes the censoring time, typically ’now’.
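Concretely, building the training set from raw timestamps would look something like this (a sketch; `leads`, `created_at`, and `converted_at` are hypothetical names):

```python
import pandas as pd

now = pd.Timestamp.now()  # censoring time = time of training

# One row per subject; `converted_at` is NaT for leads that have not converted.
observed = leads["converted_at"].notna() & (leads["converted_at"] <= now)
end_time = leads["converted_at"].where(observed, now)  # conversion or censoring

train = leads.assign(
    event_col=observed,
    duration=(end_time - leads["created_at"]).dt.days,
)
```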
Setting aside hyper-parameter tuning and CV for the sake of this example, my intuition is to evaluate the model using a data split similar to scikit-learn's TimeSeriesSplit, so I can understand how the model's performance changes over the predictive horizon.
As someone discussed here, I am tempted to follow this approach (sketched in code after the two lists below):
For training:
- set a date in the past as the censoring time, so that the duration between that censoring time and now is (at least roughly) equal to the predictive horizon we will use;
- train the model on a time window extending up to that censoring time. Events observed after that time would be ignored, i.e. marked as `False` in the `event_col` column, and durations would be calculated up to that censoring time in the past.
For evaluation:
- evaluate on all leads created up to the censoring time and going back up to `max_age` before it;
- don’t ignore the events observed after the (artificial) censoring time, effectively using the actual training time (’now’) as the censoring time and computing the durations up to that time.
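Putting the pieces together, the split would look roughly like this (again a sketch with hypothetical column names; fitting and scoring are left out):

```python
import pandas as pd

def censor_at(leads, censor_time):
    """Build durations/events as if data collection had stopped at censor_time."""
    df = leads[leads["created_at"] <= censor_time].copy()
    observed = df["converted_at"].notna() & (df["converted_at"] <= censor_time)
    end_time = df["converted_at"].where(observed, censor_time)
    df["event_col"] = observed
    # Durations in days; parametric models typically need strictly positive values.
    df["duration"] = (end_time - df["created_at"]).dt.days.clip(lower=1)
    return df

now = pd.Timestamp.now()
horizon = pd.Timedelta(days=90)    # predictive horizon we want to validate
max_age = pd.Timedelta(days=365)   # how far back we score open leads
censor_time = now - horizon        # artificial censoring date in the past

# Training: events after censor_time are hidden (event_col=False, truncated durations).
train = censor_at(leads, censor_time)

# Evaluation: leads created within max_age of the artificial censoring time,
# but with labels computed from everything observed up to 'now'.
test = censor_at(leads, now)
test = test[test["created_at"].between(censor_time - max_age, censor_time)]
```

The model fit on `train` can then be scored against the fuller labels in `test` at each step of the horizon.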
On one hand I think this approach introduces some bias, as the training and test sets will not be sampled from exactly the same distribution, but on the other hand it seems closer to a typical use case.
Curious to know if some implementations of this exist and what people are doing.