Data Collection and Analysis
We collect data on food and drug intake in addition to symptom severity ratings.
Adaptive Intervention and Predictive Control Models
This data is fed into a predictive control model system, a concept borrowed from behavioral medicine and control systems engineering. This system uses the data to continually refine its suggestions, helping you optimize your health and well-being.
Adaptive intervention is a strategy used in behavioral medicine to create individually tailored strategies for the prevention and treatment of chronic disorders. It involves intensive measurement and frequent decision-making over time, allowing the intervention to adapt to the individual's needs.
A predictive control model is a control-systems technique that uses data to predict future outcomes and adjust actions accordingly. In the context of Longevitron, this means using the collected data to predict your future health outcomes and adjusting its suggestions to optimize your health.
A control systems engineering approach for adaptive behavioral interventions: illustration with a fibromyalgia intervention - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4167895/
Real-Life Application and Benefits
Consider a hypothetical scenario where you're dealing with a chronic condition like fibromyalgia. We collect data on your symptoms, medication intake, stress levels, sleep quality, and other relevant factors. The system then feeds this data into its predictive control model, which uses it to predict your future symptoms and adjust your treatment plan accordingly.
This could involve suggesting changes to your medication dosage, recommending lifestyle changes, or even alerting your healthcare provider if it detects a potential issue. The goal is to optimize your health and well-being based on your needs and circumstances.
To determine the effects of various factors on health outcomes, we currently apply pharmacokinetic modeling over a range of onset delay and duration of action hyper-parameters, and combine that with additional parameters for each of Hill's criteria for causality.
The distributions in this kind of data are generally not normal, and the unknown onset delays and durations of action mean that plain Pearson correlations perform poorly. So we focus mainly on change from baseline. There is substantial room for improvement, for example by controlling for confounders with instrumental variables or by applying convolutional or recurrent neural networks.
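As a rough illustration, here is a minimal sketch of that kind of hyper-parameter scan. The file name, column names, and scoring by simple correlation are assumptions for illustration, not the production implementation:

```python
import pandas as pd

# Hypothetical date-indexed frame with one row per day.
daily = pd.read_csv("daily_measurements.csv", index_col=0, parse_dates=True)

def score(factor: pd.Series, outcome: pd.Series,
          onset_days: int, duration_days: int) -> float:
    # Shift the factor by the assumed onset delay, then average it over
    # the assumed duration of action preceding each outcome measurement.
    aggregated = (factor.shift(onset_days)
                        .rolling(duration_days, min_periods=1)
                        .mean())
    pair = pd.concat([aggregated, outcome], axis=1).dropna()
    return pair.corr().iloc[0, 1]

# Grid-search onset delays of 0-7 days and durations of 1-14 days,
# keeping the pair with the strongest absolute association.
candidates = [(onset, duration,
               score(daily["factor"], daily["severity"], onset, duration))
              for onset in range(0, 8)
              for duration in range(1, 15)]
best_onset, best_duration, best_corr = max(candidates, key=lambda c: abs(c[2]))
```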
Hybrid Predictive Control Black Box Models seem most appropriate.
Test and Training Data
It's a matrix of years of self-reported Arthritis Severity Rating measurements and hundreds of potential factors over time:
https://github.com/curedao/curedao-black-box-optimization-engine/raw/main/data/arthritis-factor-measurements-matrix-zeros-unixtime.csv
Format
The first row contains the variable names. The first column is the Unix timestamp (seconds since 1970-01-01 00:00:00 UTC).
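For example, the matrix can be loaded with pandas along these lines (a sketch assuming only the format described above):

```python
import pandas as pd

df = pd.read_csv("arthritis-factor-measurements-matrix-zeros-unixtime.csv")

# The first column holds Unix timestamps in seconds since 1970-01-01 UTC.
ts = df.columns[0]
df[ts] = pd.to_datetime(df[ts], unit="s", utc=True)
df = df.set_index(ts).sort_index()
```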
Pre-Processing
To make the data easier to analyze, some preprocessing has been done. This includes zero-filling where appropriate. In addition, the factor measurement values are aggregated over the period preceding each Arthritis measurement, based on the factor's onset delay and duration of action.
Hyper-Parameters
The aggregation method and other hyper-parameters for each factor can be found by substituting the Variable Name into
https://studies.fdai.earth/VARIABLE_NAME_HERE
Determining Treatment Effects from Sparse and Irregular Time Series Data
Introduction
Analyzing the effects of a treatment based on observational time series data is a common need in many domains like medicine, psychology, and economics. However, this analysis often faces several key challenges:
The data is sparse - there is only a limited number of observations.
The data is irregular - observations are not at regular time intervals.
There is missing data - many timepoints have no observation.
The onset delay of the treatment effect is unknown. It may take time to appear.
The duration of the treatment effect is unknown. It may persist after cessation.
Both acute (short-term) and cumulative (long-term) effects need to be analyzed.
Causality and statistical significance need to be established rigorously.
The optimal dosage needs to be determined to maximize benefits.
This article provides a comprehensive methodology to overcome these challenges and determine whether a treatment makes an outcome metric better, worse, or has no effect based on sparse, irregular time series data with missingness.
Data Preprocessing
Before statistical analysis can begin, the data must be preprocessed:
Resample the time series to a regular interval if needed while preserving original timestamps. This allows handling missing data. For example, resample to 1 measurement per day.
Do not interpolate or forward-fill to estimate missing values; this introduces fabricated data. Simply exclude those time periods from the analysis.
Filter out irrelevant variance such as daily or weekly cycles. For example, detrend the data.
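A minimal preprocessing sketch along those lines; the file and column names are hypothetical:

```python
import pandas as pd
from scipy.signal import detrend

raw = pd.read_csv("observations.csv", parse_dates=["timestamp"])

# Resample to one measurement per day; days without data become NaN.
daily = raw.set_index("timestamp").resample("1D").mean()

# Exclude missing periods rather than interpolating or forward-filling.
daily = daily.dropna()

# Remove the linear trend so slow drift is not mistaken for a treatment effect.
daily["severity_detrended"] = detrend(daily["severity"].to_numpy())
```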
Proper preprocessing sets up the data for robust analysis.
Statistical Analysis Methodology
With cleaned data, a rigorous methodology can determine treatment effects.
Segment Data
First, split the data into three segments: pre-treatment, during treatment, and post-treatment. This enables separate analysis of the acute and cumulative effects.
Acute Effects Analysis
To analyze acute effects, compare the 'during treatment' segment with the 'pre-treatment' segment.
Cumulative Effects Analysis
To analyze cumulative effects, build regression models between the outcome variable and the cumulative treatment dosage over time.
Overall Effect Determination
Combine the acute and cumulative insights to determine the overall effect direction and statistical significance.
For example, acute worsening but long-term cumulative improvement would imply an initial side effect but long-term benefits. Lack of statistical significance would imply no effect.
Optimization
To determine the optimal dosage, incrementally adjust the daily dosage amount in the models above, and identify the dosage that minimizes the outcome variable in both the acute and cumulative sense.
Analysis Pipeline
Given these constraints and requirements, here is a refined methodology:
Data Preprocessing:
Handling Missingness: Exclude rows or time periods with missing data. This ensures the analysis is grounded in actual observations.
Standardization: For treatments with larger scales, standardize values to have a mean of 0 and a standard deviation of 1. This will make regression coefficients more interpretable, representing changes in symptom severity per standard deviation change in treatment.
Lagged Regression Analysis:
Evaluate if treatment on previous days affects today's outcome, given the discrete nature of treatment.
Examine up to a certain number of lags (e.g., 30 days) to determine potential onset delay and duration.
Coefficients represent the change in symptom severity due to a one unit or one standard deviation change in treatment, depending on whether standardization was applied. P-values indicate significance.
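A sketch of such a lagged regression with statsmodels, reusing the hypothetical daily frame from the preprocessing step:

```python
import pandas as pd
import statsmodels.api as sm

MAX_LAG = 30  # examine up to 30 days of potential onset delay

# One column per lag: treatment intake k days before each observation.
lags = pd.DataFrame({f"treatment_lag_{k}": daily["treatment"].shift(k)
                     for k in range(1, MAX_LAG + 1)})

model = sm.OLS(daily["severity"], sm.add_constant(lags), missing="drop").fit()
print(model.params)   # change in severity per unit (or SD) change in treatment
print(model.pvalues)  # significance of each lag
```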
Reverse Causality Check:
Assess if symptom severity on previous days predicts treatment intake. This helps in understanding potential feedback mechanisms.
Cross-Correlation Analysis:
Analyze the correlation between treatment and symptom severity across various lags.
This aids in understanding potential onset delays and durations of effect.
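For example, using the cross-correlation function from statsmodels (both series must first be aligned, with missing rows dropped):

```python
import numpy as np
from statsmodels.tsa.stattools import ccf

pair = daily[["treatment", "severity"]].dropna()
cc = ccf(pair["treatment"], pair["severity"], adjusted=False)[:31]  # lags 0-30
likely_onset_delay = int(np.argmax(np.abs(cc)))  # lag with strongest correlation
```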
Granger Causality Tests:
Test whether past values of treatment provide information about future values of symptom severity, and vice versa.
This test can help in determining the direction of influence.
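A sketch with statsmodels' Granger test; note that it evaluates whether the second column helps predict the first:

```python
from statsmodels.tsa.stattools import grangercausalitytests

# Does treatment help predict severity?
forward = grangercausalitytests(pair[["severity", "treatment"]], maxlag=14)
# And the reverse direction, to check for feedback.
reverse = grangercausalitytests(pair[["treatment", "severity"]], maxlag=14)
```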
Moving Window Analysis (for cumulative effects):
Create aggregated variables representing the sum or average treatment intake over windows (e.g., 7 days, 14 days) leading up to each observation.
Use these in regression models to assess if cumulative intake over time affects symptom severity.
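For instance, trailing rolling sums can serve as the aggregated regressors (the window lengths here are illustrative):

```python
import statsmodels.api as sm

# Cumulative exposure over trailing windows.
for window in (7, 14, 30):
    daily[f"treatment_{window}d_sum"] = (daily["treatment"]
                                         .rolling(window, min_periods=1)
                                         .sum())

# Regress severity on one of the cumulative-exposure variables.
X = sm.add_constant(daily["treatment_7d_sum"])
cumulative_model = sm.OLS(daily["severity"], X, missing="drop").fit()
```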
Optimal Dosage Analysis:
Group data by discrete dosage levels.
Calculate the mean symptom severity for each group.
The dosage associated with the lowest mean symptom severity suggests the optimal intake level.
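A minimal sketch of that grouping, assuming dosages take discrete values in the data:

```python
# Mean severity at each observed dosage level.
dosage_means = daily.groupby("treatment")["severity"].mean()

# The dosage with the lowest mean severity suggests the optimal intake.
optimal_dosage = dosage_means.idxmin()
```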
Control for Confounders (if data is available):
If data on potential confounding variables is available, incorporate them in the regression models. This helps in isolating the unique effect of the treatment.
Model Diagnostics:
After regression, check residuals for normality, autocorrelation, and other potential issues to validate the model.
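For example, two common residual checks (reusing the fitted lagged-regression model from above):

```python
from statsmodels.stats.stattools import durbin_watson, jarque_bera

residuals = model.resid
dw = durbin_watson(residuals)                 # ~2.0 suggests no autocorrelation
jb, jb_pvalue, skew, kurtosis = jarque_bera(residuals)  # normality check
```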
Interpretation:
Consistency in findings across multiple analyses strengthens the case for a relationship.
While no single test confirms causality, evidence from multiple methods can offer a compelling case.
By adhering to this methodology, you will be working with actual observations, minimizing the introduction of potential errors from imputation. The combination of lagged regression, Granger causality tests, and moving window analysis will provide insights into both acute and cumulative effects, onset delays, and optimal treatment dosages.
Data Schema for Storing User Variable Relationship Analyses
| Property | Type | Nullable | Description |
|----------|------|----------|-------------|
| id | int auto_increment | No | Unique identifier for each correlation entry. |
| user_id | bigint unsigned | No | ID of the user to whom this correlation data belongs. |
| cause_variable_id | int unsigned | No | ID of the variable considered as the cause in the correlation. |
| effect_variable_id | int unsigned | No | ID of the variable considered as the effect in the correlation. |
| qm_score | double | Yes | Quantitative metric scoring the importance of the correlation based on strength, usefulness, and causal plausibility. |
| forward_pearson_correlation_coefficient | float(10, 4) | Yes | Statistical measure indicating the linear relationship strength between cause and effect. |
| value_predicting_high_outcome | double | Yes | Specific cause variable value that predicts a higher than average effect. |
| value_predicting_low_outcome | double | Yes | Specific cause variable value that predicts a lower than average effect. |
| predicts_high_effect_change | int(5) | Yes | Percentage change in the effect when the predictor is near the value predicting high outcome. |
| predicts_low_effect_change | int(5) | No | Percentage change in the effect when the predictor is near the value predicting low outcome. |
| average_effect | double | No | Average value of the effect variable across all measurements. |
| average_effect_following_high_cause | double | No | Average value of the effect variable following high cause variable measurements. |
| average_effect_following_low_cause | double | No | Average value of the effect variable following low cause variable measurements. |
| average_daily_low_cause | double | No | Daily average of cause variable values that are below average. |
| average_daily_high_cause | double | No | Daily average of cause variable values that are above average. |
| average_forward_pearson_correlation_over_onset_delays | | | Average of forward Pearson correlation coefficients over different onset delays. |
| average_reverse_pearson_correlation_over_onset_delays | | | Average of reverse Pearson correlation coefficients over different onset delays. |
| cause_changes | int | No | Count of changes in cause variable values across the dataset. |
| cause_filling_value | double | Yes | Default value used to fill gaps in cause variable data. |
| cause_number_of_processed_daily_measurements | int | No | Count of daily processed measurements for the cause variable. |
| cause_number_of_raw_measurements | int | No | Count of raw data measurements for the cause variable. |
| cause_unit_id | smallint unsigned | Yes | ID representing the unit of measurement for the cause variable. |
| confidence_interval | double | No | Statistical range indicating the reliability of the correlation effect size. |
| critical_t_value | double | No | Threshold value for statistical significance in correlation analysis. |
| created_at | timestamp | No | Timestamp of when the correlation record was created. |
| data_source_name | varchar(255) | Yes | Name of the data source for the correlation data. |
| deleted_at | timestamp | Yes | Timestamp of when the correlation record was marked as deleted. |
| duration_of_action | int | No | Duration in seconds for which the cause is expected to have an effect. |
| effect_changes | int | No | Count of changes in effect variable values across the dataset. |
| effect_filling_value | double | Yes | Default value used to fill gaps in effect variable data. |
| effect_number_of_processed_daily_measurements | int | No | Count of daily processed measurements for the effect variable. |
| effect_number_of_raw_measurements | int | No | Count of raw data measurements for the effect variable. |
| forward_spearman_correlation_coefficient | float | No | Spearman correlation assessing monotonic relationships between lagged cause and effect data. |
| number_of_days | int | No | Number of days over which the correlation data was collected. |
| number_of_pairs | int | No | Total number of cause-effect pairs used for calculating the correlation. |
| onset_delay | int | No | Estimated time in seconds between cause occurrence and effect observation. |
| onset_delay_with_strongest_pearson_correlation | int(10) | Yes | Onset delay duration yielding the strongest Pearson correlation. |
| optimal_pearson_product | double | Yes | Theoretical optimal value for the Pearson product in the correlation analysis. |
| p_value | double | Yes | Statistical significance indicator for the correlation; values below 0.05 are conventionally considered statistically significant. |
| pearson_correlation_with_no_onset_delay | float | Yes | Pearson correlation coefficient calculated without considering onset delay. |
| predictive_pearson_correlation_coefficient | double | Yes | Pearson coefficient quantifying the predictive strength of the cause variable on the effect. |
| reverse_pearson_correlation_coefficient | double | Yes | Correlation coefficient when cause and effect variables are reversed, used to assess causality. |
| statistical_significance | float(10, 4) | Yes | Value combining effect size and sample size in determining correlation significance. |
| strongest_pearson_correlation_coefficient | float | Yes | The highest Pearson correlation coefficient observed in the analysis. |
| t_value | double | Yes | Statistical value derived from correlation and sample size, used in assessing significance. |
| updated_at | timestamp | No | Timestamp of the most recent update made to the correlation record. |
| grouped_cause_value_closest_to_value_predicting_low_outcome | double | No | Realistic daily cause variable value associated with lower-than-average outcomes. |
| grouped_cause_value_closest_to_value_predicting_high_outcome | double | No | Realistic daily cause variable value associated with higher-than-average outcomes. |
Conclusion
This rigorous methodology uses interrupted time series analysis, regression modeling, statistical testing, onset/duration modeling, and optimization to determine treatment effects from sparse, irregular observational data with missingness. It establishes causality and significance in both an acute and cumulative sense. By finding the optimal dosage, it provides actionable insights for maximizing the benefits of the treatment.