Add the c-index with IPCW #71
The CI for the doc fails because the previous boosting tree model is missing. This should be fixed when #53 is merged.
Update on performance
Our implementation is 100x slower than scikit-survival.
Code benchmark:
import numpy as np
import pandas as pd
from time import time
from lifelines import CoxPHFitter
from lifelines.datasets import load_kidney_transplant
from sklearn.model_selection import train_test_split
from hazardous.metrics._concordance_index import _concordance_index_incidence_report
df = load_kidney_transplant()
# make the dataset 100x longer for benchmarking purposes
df = pd.concat([df] * 100, axis=0)
df_train, df_test = train_test_split(df, stratify=df["death"])
cox = CoxPHFitter().fit(df_train, duration_col="time", event_col="death")
t_min, t_max = df["time"].min(), df["time"].max()
time_grid = np.linspace(t_min, t_max, 20)
y_pred = 1 - cox.predict_survival_function(df_test, times=time_grid).T.to_numpy()
y_train = df_train[["death", "time"]].rename(columns=dict(
    death="event", time="duration"
))
y_test = df_test[["death", "time"]].rename(columns=dict(
    death="event", time="duration"
))
tic = time()
result = _concordance_index_incidence_report(
    y_test=y_test,
    y_pred=y_pred,
    time_grid=time_grid,
    taus=None,
    y_train=y_train,
)
print(f"our implementation: {time() - tic:.2f}s")
# scikit-survival
from sksurv.metrics import concordance_index_ipcw


def make_recarray(y):
    # Convert the event/duration columns into the structured array
    # expected by scikit-survival.
    event, duration = y["event"].values, y["duration"].values
    return np.array(
        [(event[i], duration[i]) for i in range(len(event))],
        dtype=[("e", bool), ("t", float)],
    )


tic = time()
concordance_index_ipcw(
    make_recarray(y_train),
    make_recarray(y_test),
    y_pred[:, -1],
    tau=None,
)
print(f"scikit-survival: {time() - tic:.2f}s")

# lifelines
from lifelines.utils import concordance_index

# Reset the timer so the lifelines timing excludes the scikit-survival run.
tic = time()
concordance_index(
    event_times=y_test["duration"],
    predicted_scores=1 - y_pred[:, -1],
    event_observed=y_test["event"],
)
print(f"lifelines: {time() - tic:.2f}s")

On a dataset with 20k rows:
The flamegraph clearly points to the culprit: the list comprehension that computes the IPCW weight for each pair. When I remove the IPCWs, the performance becomes similar to lifelines. I tried to fix this performance issue using numba @jitclass on the BTree, but it is still very slow. I put the numba BTree on a separate draft branch for reference.
Conclusion
I only see two ways forward:
Pinged by @Vincent-Maladiere, but have no time for it. Random pile of pieces of advice:
No, don't use compiled languages, please. It will make release and distribution much harder.
On Jul 26, 2024, at 13:46, Julien Jerphanion ***@***.***> wrote:
Pinged by @Vincent-Maladiere, but have no time for it.
Random pile of pieces of advice:
- find if a better algorithm exists first
- profile to see what's the bottleneck
- see if tree-based structures can be used from another library (e.g. [`pydatastructures`](https://github.com/codezonediitj/pydatastructs/tree/main/pydatastructs/trees))
- use another language (like Cython or C++) to implement the costly algorithmic part
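After giving it some more thought, there is room for improvement with the current balanced tree design:
1. When we don't use an IPCW estimator (like lifelines):
$$W_{ij,1} = W_{ij,2} = 1$$
2. When we use a non-conditional IPCW estimator (Kaplan-Meier, like scikit-survival):
$$W_{ij,1} = W_{i,1} = \hat{G}(T_i)^2 \space \mathrm{and} \space W_{ij,2} = \hat{G}(T_i) \hat{G}(T_j)$$
However, when we use a conditional IPCW estimator (like Cox or SurvivalBoost), we have:
$$W_{ij,1} = \hat{G}(T_i | X_i) \hat{G}(T_i | X_j) \space \mathrm{and} \space W_{ij,2} = \hat{G}(T_i | X_i) \hat{G}(T_j | X_j)$$
In this case, the balanced tree is no longer suitable, and we should use the naive implementation.
So, to make things simpler, I suggest we only implement the naive version for now, and eventually return to the balanced tree later, for the non-conditional and unweighted cases. WDYT?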
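For reference, the naive version can be sketched in a few lines of pure Python. This is only an illustration of the pairwise weighting with a conditional censoring model, where type A pairs are weighted by G(T_i | X_i) G(T_i | X_j) and type B pairs by G(T_i | X_i) G(T_j | X_j). The function name `naive_cindex_ipcw` and the `censoring_survival(t, i)` callable are hypothetical, left limits of G are ignored, and this is not the code in this PR.

```python
def naive_cindex_ipcw(durations, events, risk, censoring_survival,
                      event_of_interest=1):
    """Naive O(n^2) IPCW concordance index (illustrative sketch).

    durations[i], events[i]: observed time and event code (0 = censored).
    risk[i]: predicted incidence of the event of interest for individual i.
    censoring_survival(t, i): hypothetical callable returning G(t | X_i).
    """
    n = len(durations)
    numerator = denominator = 0.0
    for i in range(n):
        if events[i] != event_of_interest:
            continue  # i must experience the event of interest
        g_i = censoring_survival(durations[i], i)
        for j in range(n):
            if j == i:
                continue
            later = durations[j] > durations[i]
            tied_censored = durations[j] == durations[i] and events[j] == 0
            competing = events[j] not in (0, event_of_interest)
            if later or tied_censored:
                # Type A pair: weight G(T_i | X_i) * G(T_i | X_j)
                weight = g_i * censoring_survival(durations[i], j)
            elif competing:
                # Type B pair: weight G(T_i | X_i) * G(T_j | X_j)
                weight = g_i * censoring_survival(durations[j], j)
            else:
                continue  # not a comparable pair
            if weight <= 0.0:
                continue  # choice made for this sketch: skip undefined weights
            numerator += (risk[i] > risk[j]) / weight
            denominator += 1.0 / weight
    return numerator / denominator if denominator > 0 else float("nan")
```

Passing `lambda t, i: 1.0` as `censoring_survival` recovers the unweighted case. The double loop makes the quadratic cost explicit, which is the price paid for weights that depend on both members of the pair.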
Sounds good to me. We can always iterate if needed
On Jul 26, 2024, at 18:38, Vincent M ***@***.***> wrote:
After giving it some more thought, there is room for improvement with
the current balanced tree design :
1. When we don't use an IPCW estimator (like lifelines):
$$W_{ij,1} = W_{ij,2} = 1$$
2. When we use a **non-conditional** IPCW estimator (Kaplan-Meier, like
scikit-survival):
$$W_{ij,1} = W_{i,1} = \hat{G}(T_i) ^ 2 \space \mathrm{and} \space
W_{ij,2} = \hat{G}(T_i) \hat{G}(T_j) $$
However, when we use a **conditional** IPCW estimator (like Cox or
SurvivalBoost), we have:
$$W_{ij,1} = \hat{G}(T_i | X_i) \hat{G}(T_i | X_j) \space \mathrm{and}
\space W_{ij,2} = \hat{G}(T_i | X_i) \hat{G}(T_j | X_j)$$
In this case, the balanced tree is not adapted anymore, and we should
use the naive implementation.
So, to make things simpler, **I suggest we only implement the naive
version for now**, and eventually return to the balanced tree later,
for the non-conditional and unweighted cases.
WDYT?
This PR is now ready to be reviewed :)
A pair :math:`(i, j)` is comparable, with :math:`i` experiencing the event of
interest at time :math:`T_i` if:

- :math:`j` experiences the event of interest at a strictly greater time
  :math:`T_j > T_i` (pair of type A)
- :math:`j` is censored at time :math:`T_j = T_i` or greater (pair of type A)
- :math:`j` experiences a competing event before or at time :math:`T_i`
  (pair of type B)
Does type A / type B refer to the A_ij and B_ij matrices in Eq (3.4) of ref [1]?
- type=A: j experienced any event (competing, censoring, event of interest) strictly after i $(T_j > T_i)$
- type=B: j experienced a competing event before i $(T_j \leq T_i, \Delta_j \neq 0, k)$
Are competing events after i missing from the list for pairs of type A? Should censoring events only be in type A if T_j > T_i?
Yes, A and B refer to Eq (3.4). You're right, this docstring is slightly inaccurate; let's rewrite it as you suggest.
Looking deeper, that part was actually correct. Both scikit-survival and the reference R implementation consider ties in time with censoring (D_i = 1, D_j = 0 and T_i = T_j) as comparable pairs, although this is not mentioned in the paper [1].
Also, looking a bit more at our implementation, we don't consider these.
Therefore:
- I'll revert the documentation
- I'll fix the implementation to consider ties in time for censored j
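The comparability rules discussed in this thread can be summarised with a small helper. This is a sketch under the conventions stated above (event code 0 means censored, ties in time with a censored `j` count as type A); the name `pair_type` is hypothetical and does not exist in the codebase.

```python
def pair_type(t_i, d_i, t_j, d_j, event_of_interest=1):
    """Classify the ordered pair (i, j) as "A", "B" or None (not comparable)."""
    if d_i != event_of_interest:
        return None  # i must experience the event of interest
    if t_j > t_i:
        return "A"  # j is still at risk strictly after T_i
    if t_j == t_i and d_j == 0:
        return "A"  # tie in time with a censored j
    if d_j not in (0, event_of_interest):
        return "B"  # j had a competing event before or at T_i
    return None
```

For example, `pair_type(1, 1, 1, 0)` returns `"A"` (tie with a censored `j`), while `pair_type(1, 1, 1, 1)` returns `None` because simultaneous events of interest are not counted.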
n_ties_times_a: list of int
    Number of tied pairs of type A with D_i = D_j = 1.
n_ties_times_a: list of int
    Number of tied pairs of type A with D_i = D_j = 1.
n_ties_times_a: list of int
    Number of tied pairs of type A with T_i = T_j = 1.
So here I believe the docstring is accurate: we count as ties of type A only pairs of individuals that experienced the event of interest simultaneously.
If you think it would bring clarity, we could write:
D_i = D_j = 1 and T_j = T_i
The pair :math:`(i, j)` is considered a tie for time if :math:`j` experiences
the event of interest at the same time (:math:`T_j=T_i`). This tied time pair will
be counted as :math:`1/2` for the count of comparable pairs.
Is this tie-in-time pair not considered in ref [1]?
I think the type A with T_i = T_j is not consistent with ref [1] (where type A is T_i < T_j, and any event for j). Maybe you have a mix of ref [1] and sksurv.metrics.concordance_index_ipcw conventions?
Are you maybe adding events (T_i = T_j and D_i = D_j = event_of_interest), discarded in ref [1], but here counted as "tied in time" with 1/2 weight?
Yes, you are right, we chose to include them to be closer to the results of sksurv in the survival analysis setting (i.e. no competing events). Maybe @judithabk6 can add a little bit more context for our understanding?
Actually, let's remove this section from the docstring because ties in time when D_j = 1 are not acceptable pairs, and we don't count them. We only use the ties in predictions.
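As an illustration of what handling ties in predictions can look like, the sketch below scores a comparable pair as 1 when `i` is ranked above `j`, 0 when it is ranked below, and 1/2 when the two predictions are within a tolerance. The 1/2 convention and the `tied_tol` name mirror what scikit-survival exposes; whether this PR adopts the same rule is an assumption, and `concordance_score` is a hypothetical helper.

```python
def concordance_score(m_i, m_j, tied_tol=1e-8):
    """Score a comparable pair from the predicted incidences of i and j."""
    if abs(m_i - m_j) <= tied_tol:
        return 0.5  # tie in predictions: count half a concordant pair
    return 1.0 if m_i > m_j else 0.0
```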
It could be nice to add the explicit math formula in the docstring for the API reference as was done for the Brier score.
Thank you very much for this thorough review @antoinebaker! I updated the docstring and returned a numpy array instead of a list. The two most important topics left to address are:
@Vincent-Maladiere The difference is that G(t-) = P(C >= t), whereas G(t) = P(C > t).
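To make the distinction between G(t) = P(C > t) and its left limit G(t-) = P(C >= t) concrete, here is a minimal Kaplan-Meier sketch of the censoring survival function in pure Python. The helper names are hypothetical, the handling of events and censorings tied at the same time is deliberately ignored, and this is not the estimator used in the PR.

```python
def km_censoring_steps(durations, events):
    """Kaplan-Meier steps [(time, G(time))] for the censoring distribution.

    Censoring (events[i] == 0) plays the role of the "event" here.
    """
    steps, surv = [], 1.0
    for t in sorted(set(durations)):
        at_risk = sum(1 for d in durations if d >= t)
        n_censored = sum(
            1 for d, e in zip(durations, events) if d == t and e == 0
        )
        surv *= 1.0 - n_censored / at_risk
        steps.append((t, surv))
    return steps


def g_at(steps, t):
    """G(t) = P(C > t): includes the jump located exactly at t."""
    value = 1.0
    for time, surv in steps:
        if time > t:
            break
        value = surv
    return value


def g_before(steps, t):
    """G(t-) = P(C >= t): excludes the jump located exactly at t."""
    value = 1.0
    for time, surv in steps:
        if time >= t:
            break
        value = surv
    return value
```

With `durations = [1, 2, 3, 4]` and `events = [1, 0, 1, 0]`, the two functions differ only at the censoring times 2 and 4, where `g_before` still returns the value preceding the drop.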
Yes, but the question is whether we should stay consistent with [1], where we have theoretical evidence that we're doing the right thing, or derive our own method.
So, after discussing this with @judithabk6, the paper doesn't mention the ties because they are implementation details. The ties are present in the R version of this metric: https://github.com/cran/pec/blob/fb32746f6119450e8435deb52aa8583a91b05ba5/src/ccr.c#L105 So, I'll make sure to update the documentation accordingly.
Let's also add the C-index in an example :)
I'm still a bit confused about the tied in time events, I think there are some inconsistencies in the docstrings. Pairs of type A are according to
I guess 2. is the correct one? If so, what is the use of
The example is nice! The C-index (mathematical formulation) should be replaced by the C-index in time. Should we also add a version with IPCW? [but maybe the example will be a bit confusing, with two unrelated KM estimators, one as a baseline to compare with SurvivalBoost, the other as IPCW to debias the C-index in time]
Hey @antoinebaker, thanks for this additional feedback.
I wonder if we should remove these math formulations entirely from the example, to only keep the ones in the function docstring. WDYT?
Ping @antoinebaker :)
I added a few comments to make the code easier to follow, but otherwise LGTM :)
\hat{W}_{ij,2} &= \hat{G}(\tilde{T}_i-|X_i) \hat{G}(\tilde{T}_j-|X_j) \\
Q_{ij}(t) &= I\{M(t, X_i) > M(t, X_j)\}
\end{align}
it's probably defined elsewhere, but on standalone reading, might be useful to redefine
  :math:`T_j > T_i` (pair of type A)
- :math:`j` is censored at the exact same time :math:`T_i = T_j` (pair of type A).
- :math:`j` experiences a competing event before or at time :math:`T_i`
  (pair of type B)
the docstring should specify that this is the event-specific concordance index, to make explicit why we distinguish the event of interest from the competing events (maybe at the beginning)
----------
.. [Wolbers2014] M. Wolbers, P. Blanche, M. T. Koller, J. C. Witteman, T. A. Gerds,
   "Concordance for prognostic models with competing risks", 2014
those links are clickable, but refer to the same page, maybe they should link to the article?
"ipcw_estimator is set, but y_train is None. " | ||
"Set y_train to fix this error." | ||
)
# TODO: add cox option
maybe make explicit why cox would be relevant here (just say that it would allow the censoring model to be conditional)?
Co-authored-by: antoinebaker <[email protected]>
Thank you very much @judithabk6 and @antoinebaker for this thorough round of reviews! I added your suggestions and both the doc and the codebase are much clearer now :)
What does this PR propose?
This PR proposes to add the c-index as defined in [1]. I think this is ready to be reviewed for merging, with some questions/suggestions in the TODO section below.
show maths
where:
and
and
where $M$ is the probability of incidence of the event of interest.
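The formulas of this description did not survive on this page. As a reading aid, the block below is a reconstruction based on the docstring excerpt quoted in the review (the last two lines) and on the usual statement of the estimator in [1]; it assumes the event of interest is coded 1 and the competing event 2, and it should be checked against Eq (3.4) of [1] rather than taken as the exact text of the PR.

```latex
\begin{align}
\hat{C}(t) &= \frac{\sum_{i=1}^n \sum_{j=1}^n
    (\tilde{A}_{ij} \hat{W}_{ij,1}^{-1} + \tilde{B}_{ij} \hat{W}_{ij,2}^{-1})
    \, Q_{ij}(t) \, \tilde{N}_i^1(t)}
  {\sum_{i=1}^n \sum_{j=1}^n
    (\tilde{A}_{ij} \hat{W}_{ij,1}^{-1} + \tilde{B}_{ij} \hat{W}_{ij,2}^{-1})
    \, \tilde{N}_i^1(t)} \\
\tilde{N}_i^1(t) &= I\{\tilde{T}_i \leq t, \tilde{\Delta}_i = 1\} \\
\tilde{A}_{ij} &= I\{\tilde{T}_i < \tilde{T}_j\} \\
\tilde{B}_{ij} &= I\{\tilde{T}_i \geq \tilde{T}_j, \tilde{\Delta}_j = 2\} \\
\hat{W}_{ij,1} &= \hat{G}(\tilde{T}_i-|X_i) \hat{G}(\tilde{T}_i|X_j) \\
\hat{W}_{ij,2} &= \hat{G}(\tilde{T}_i-|X_i) \hat{G}(\tilde{T}_j-|X_j) \\
Q_{ij}(t) &= I\{M(t, X_i) > M(t, X_j)\}
\end{align}
```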
concordance_index_incidence function is inspired by the concordance_index function in lifelines, with some significant differences:
concordance_index_ipcw.
_BTree class from lifelines.utils.btree.py by adding a weighting count mechanism. I referenced lifelines in hazardous.metrics._btree.py, but I can reference it also in the hazardous.metrics._concordance_index.py file if necessary.
TODO
tied_tol parameter for ties in predictions?
cc @ogrisel @GaelVaroquaux @juAlberge @glemaitre
[1] Wolbers, M., Blanche, P., Koller, M. T., Witteman, J. C., & Gerds, T. A. (2014). Concordance for prognostic models with competing risks.