The text at this link is not shown properly: https://github.com/BIDData/BIDMach/wiki/Causal-Inference #189
The formatting fails to render in both Chrome and IE.
Could you help me find links to read more about this, please?
It should be rendered like this:

IPTW

However, a regression model often won't be exact, and a different kind of estimate is needed. The next approach is to simulate randomization of Z. If we had randomly assigned each user to classes Z=0 and Z=1, we could simply use the difference in responses as the causal effect. This is the approach taken in randomized trials. But given a dataset, we can't change the assignments to Z that were made. The actual assignment could depend in an arbitrary fashion on the other features X, and then the difference in response will depend on the influence of those features as well as Z.

We could partition or cluster the data according to X and then look within each subset at the difference in outcome for samples with Z=0 and Z=1. But this is very difficult with high-dimensional features, and there turns out to be a much more efficient approach based on the propensity score. The propensity score is the probability of the treatment assignment, i.e. P(Z|X), and the notations g_0(X)=P(Z=0|X) and g_1(X)=P(Z=1|X) are often used. Stratification by the propensity score turns out to be sufficient for accurate estimation of the causal effect.

For a regression-based estimate, we can treat the propensity score as a sampling bias. By dividing by it, we create a pseudo-sample in which units have equal (and random) probability of assignment to Z=0 or Z=1. Then the IPTW estimator for the causal effect is defined as:

E(Y_1)-E(Y_0) = \frac{1}{n}\sum\limits_{i=1}^n \left( \frac{Z_i Y_i}{g_1(X_i)} - \frac{(1-Z_i)Y_i}{g_0(X_i)}\right)

where Y_0 and Y_1 are the responses Y when Z is forced respectively to 0 or 1.

There is a simple estimator for IPTW effects in the causal package. It concurrently computes P(Z|X) using logistic regression and the estimator above. It expects a targets option which encodes both the treatment(s) Z and the effects Y. It can analyze k effects at once, and targets should be a 2k x nfeats matrix. The first k rows encode the k treatments, and the next k rows encode the corresponding effects. The treatments and effects should therefore be among the input features, and the j^{th} row of targets will normally be zeros with a single 1 in the position of the feature that encodes the j^{th} treatment. The (j+k)^{th} row should have a single 1 in the position that encodes the feature for the j^{th} effect. The output is in the learner's modelmats(1) field. This will be an FMat with k entries corresponding to the k effects.

A-IPTW

E(Y_1)-E(Y_0) = \frac{1}{n}\sum\limits_{i=1}^n \left( \frac{Z_i Y_i}{g_1(X_i)} - \frac{(1-Z_i)Y_i}{g_0(X_i)} - (Z_i-g_1(X_i))\left( \frac{m_1(X_i)}{g_1(X_i)} + \frac{m_0(X_i)}{g_0(X_i)}\right)\right)

where m_0(X_i) and m_1(X_i) are derived from a regression model R(X,Z) for Y given X and Z: m_0(X_i)=R(X_i,0) and m_1(X_i)=R(X_i,1).

Creation of an A-IPTW model is the same as for an IPTW model. Both estimators are computed by the same IPTW class. IPTW estimates appear in modelmats(1), while the A-IPTW estimates are in modelmats(2).

Data Preparation
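For anyone following the formulas, here is a minimal plain-Scala sketch of the two estimators written out above. This is illustrative code only, not the BIDMach IPTW class; the object and function names are made up for the example, and it assumes the propensity scores and outcome-regression predictions have already been computed:

```scala
// Illustrative sketch of the IPTW and A-IPTW estimators defined above.
// z(i)         : treatment assignment Z_i (0 or 1)
// y(i)         : observed response Y_i
// g1(i)        : propensity score g_1(X_i) = P(Z=1|X_i); g_0(X_i) = 1 - g1(i)
// m0(i), m1(i) : outcome-regression predictions R(X_i, 0) and R(X_i, 1)
object CausalEstimators {

  // IPTW: (1/n) * sum_i [ Z_i Y_i / g_1(X_i) - (1 - Z_i) Y_i / g_0(X_i) ]
  def iptw(z: Array[Double], y: Array[Double], g1: Array[Double]): Double = {
    val n = z.length
    (0 until n).map { i =>
      z(i) * y(i) / g1(i) - (1 - z(i)) * y(i) / (1 - g1(i))
    }.sum / n
  }

  // A-IPTW: the IPTW term minus the augmentation term
  // (Z_i - g_1(X_i)) * (m_1/g_1 + m_0/g_0), exactly as written in the formula above.
  def aiptw(z: Array[Double], y: Array[Double], g1: Array[Double],
            m0: Array[Double], m1: Array[Double]): Double = {
    val n = z.length
    (0 until n).map { i =>
      val g0 = 1 - g1(i)
      z(i) * y(i) / g1(i) - (1 - z(i)) * y(i) / g0 -
        (z(i) - g1(i)) * (m1(i) / g1(i) + m0(i) / g0)
    }.sum / n
  }
}
```

In BIDMach itself you don't write these sums by hand: as the text says, the IPTW class fits P(Z|X) by logistic regression and fills modelmats(1) (IPTW) and modelmats(2) (A-IPTW); you only supply the 2k x nfeats targets matrix, where row j has a single 1 in the column of the j-th treatment feature and row j+k a single 1 in the column of the j-th effect feature.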
https://github.com/BIDData/BIDMach/wiki/Causal-Inference
This is how it is currently shown:
IPTW
BIDMach has several basic causal estimators. IPTW stands for Inverse Probability of Treatment Weighting and is a widely used technique for causal inference with binary treatments. We start with some data features X, a response Y, and a "treatment" Z. In causal inference we are interested in the effect on Y of directly changing Z. This is different from the conditional probability of Y given Z, which depends on the joint probability distribution of the system "as is".
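As a concrete illustration of that last point (hypothetical numbers, plain Scala rather than the BIDMach API), a tiny simulation shows how the naive difference E[Y|Z=1] - E[Y|Z=0] differs from the causal effect when the assignment Z depends on a feature X that also influences Y, and how reweighting by the propensity score recovers it:

```scala
import scala.util.Random

// Toy simulation: X is a confounder that raises both the chance of treatment
// and the response. The true causal effect of Z on Y is 1.0, but the naive
// conditional difference comes out near 2.2; IPTW with the known propensity
// score recovers roughly 1.0.
object ConfoundingDemo extends App {
  val rng = new Random(0)
  val n   = 200000

  val x  = Array.fill(n)(if (rng.nextDouble() < 0.5) 1.0 else 0.0)  // confounder
  val g1 = x.map(xi => if (xi == 1.0) 0.8 else 0.2)                 // P(Z=1|X)
  val z  = g1.map(p => if (rng.nextDouble() < p) 1.0 else 0.0)      // treatment
  val y  = (0 until n).map(i => z(i) + 2.0 * x(i) + rng.nextGaussian()).toArray

  def meanWhere(v: Array[Double], keep: Int => Boolean): Double = {
    val idx = (0 until n).filter(keep)
    idx.map(i => v(i)).sum / idx.size
  }

  val naive = meanWhere(y, i => z(i) == 1.0) - meanWhere(y, i => z(i) == 0.0)
  val iptw  = (0 until n).map { i =>
    z(i) * y(i) / g1(i) - (1 - z(i)) * y(i) / (1 - g1(i))
  }.sum / n

  println(f"naive difference = $naive%.3f   (biased, about 2.2)")
  println(f"IPTW estimate    = $iptw%.3f   (close to the true effect 1.0)")
}
```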