Repository Owner: Justin Torok (Email: [email protected])
Project Lead: Chaitali Anand (Email: [email protected])
The following code was developed for running the Nexopathy in silico (NexIS) model described in Anand, et al., 2022 (preprint here), along with all of the auxilliary functions required for plotting the outputs as shown in the manuscript.
We note that the naming conventions of the models have changed, but these changes are not reflected in the code at this time. The NexIS:global and NexIS:microglia results described in the manuscript were run using the stdNDM_mouse.m
and eNDM_mouse.m
functions, respectively (these are described in more detail below).
All code is written in MATLAB and has been tested in versions 2020a and 2022a. However, the developers do not anticipate difficulties with using other MATLAB versions for any of the functions contained within this package. There are no toolboxes required to run the model, but the Statistics & Machine Learning Toolbox is required for the corr.m
function. All necessary data dependencies are within the raw_data_mouse folder, which contains the following data germane to the present study:
- Mouse tauopathy data, which were obtained from Kaufman, et al., 2017,
- Regional gene expression data, which were obtained from the Allen Gene Expression Atlas,
- Mouse mesoscale connectome, which was obtained from Oh, et al., 2014.
Below is a short description of each of the code files contained in the Nexis repository, grouped by general functionality. Scripts that are also functions have their inputs and outputs described.
-
stdNDM_mouse.m
: The core function used to generate NexIS:global results (refer to Anand, et al., 2022 for full model details). This function solves a parameter inference problem on [gamma, alpha, beta, s] and saves final model outputs for the optimized values of these parameters, with summary statistics, in a MATLAB struct object. All inputs are optional and specified as keyword arguments usinginputParser
.- Inputs:
study
: String indicating which dataset to use (default = 'IbaHippInj')costfun
: String indicating which cost function to use, which is passed toobjfun_eNDM_general_dir_costopts.m
(default = 'LinR')solvetype
: String indicating whether to use an analytical solution to the NexIS:global model or a numerical implentation (default =analytic
)volcorrect
: Binary flag indicating whether to add a volume correction to the model. In practice, this does not affect model results very much (default = 0)exclseed_costfun
: Binary flag indicating whether or not to exclude the seed regions from the cost function calculation (default = 0)excltpts_costfun
: List of time point indices to exclude from the cost function calculation (default = [])normtype
: String indicating which normalization, if any, to use on the data (default = 'sum' - we recommend changing this to 'none')w_dir
: Binary flag indicating whether or not to use the directional connectome. If 0, the connectome will be symmetrized prior to running the model and the parameter 's' will be fixed at 0.5 (default = 0)param_init
: Initial parameter values to pass tofmincon
. In order, these correspond to gamma, alpha, beta, and s in the model. Unless otherwise specified, the initial value of gamma is determined heuristically in the function itself (default = [NaN,0.5,1,0.5])ub
: Upper bounds to pass tofmincon
. In order, these correspond to gamma, alpha, beta, and s in the model (default = [Inf,Inf,Inf,1])lb
: Lower bounds to pass tofmincon
. In order, these correspond to gamma, alpha, beta, and s in the model (default = zeros(1,4))algo
: String specifying thefmincon
algorithm (default = 'sqp')opttol
: Value of the optimality tolerance infmincon
(default = 1e-8)fxntol
: Value of the function tolerance infmincon
(default = 1e-8)steptol
: Value of the optimality tolerance infmincon
(default = 1e-12)maxeval
: Maximum number of function evaluations infmincon
(default = 10000)bootstrapping
: Binary flag indicating whether the parameters should be fit using bootstrapping on regions (default = 0)resample_rate
: Ifbootstrapping
= 1, the fraction of regions to use to fit the model per iteration (default = 0.8)niters
: Ifbootstrapping
= 1, the number of bootstrapping iterations (default = 100)verbose
: Binary flag indicating whether to use verbose in-line outputs (default = 0)fmindisplay
: Binary flag indicating whether to use verbosefmincon
displays (default = 0)flowthresh
: Percentile at which to threshold the calculated flows usingFlowCalculator.m
(default = 99.93)
- Outputs:
outputs
: A MATLAB struct containing inputs, outputs, and summary statistics of the final model.
- Inputs:
-
eNDM_mouse.m
: The core function used to generate NexIS:Trem2 results, among others (refer to Anand, et al., 2022 for full model details). This function solves a parameter inference problem on [gamma, alpha, beta, s] and saves final model outputs for the optimized values of these parameters, with summary statistics, in a MATLAB struct object. All inputs are optional and specified as keyword arguments usinginputParser
. This function uses a hierarchical parameter fitting approach, where requires a run ofstdNDM_mouse.m
is used to first fit the global parameters [gamma, alpha, beta, s] and then uses those outputs to define the initial values and bounds of these parameters to pass tofmincon
when fitting for the new parameters [b, p]. For iterating over genes or cell types, we highly recommend runningstdNDM_mouse.m
first and then passing its output struct toeNDM_mouse.m
to save computational time, but the function will callstdNDM_mouse.m
internally if this struct is not provided.- Inputs:
study
: String indicating which dataset to use (default = 'IbaHippInj')costfun
: String indicating which cost function to use, which is passed toobjfun_eNDM_general_dir_costopts.m
(default = 'LinR')solvetype
: String indicating whether to use an analytical solution to the NexIS model or a numerical implentation (default =analytic
)volcorrect
: Binary flag indicating whether to add a volume correction to the model. In practice, this does not affect model results very much (default = 0)exclseed_costfun
: Binary flag indicating whether or not to exclude the seed regions from the cost function calculation (default = 0)excltpts_costfun
: List of time point indices to exclude from the cost function calculation (default = [])normtype
: String indicating which normalization, if any, to use on the data (default = 'sum' - we recommend changing this to 'none')w_dir
: Binary flag indicating whether or not to use the directional connectome. If 0, the connectome will be symmetrized prior to running the model and the parameter 's' will be fixed at 0.5 (default = 0)param_init
: Initial parameter values to pass tofmincon
. In order, these correspond to gamma, alpha, beta, and s in the model. Unless otherwise specified, the initial value of gamma is determined heuristically in the function itself (default = [NaN,0.5,1,0.5])ub
: Upper bounds to pass tofmincon
. In order, these correspond to gamma, alpha, beta, and s in the model (default = [Inf,Inf,Inf,1])lb
: Lower bounds to pass tofmincon
. In order, these correspond to gamma, alpha, beta, and s in the model (default = zeros(1,4))algo
: String specifying thefmincon
algorithm (default = 'sqp')opttol
: Value of the optimality tolerance infmincon
(default = 1e-8)fxntol
: Value of the function tolerance infmincon
(default = 1e-8)steptol
: Value of the optimality tolerance infmincon
(default = 1e-12)maxeval
: Maximum number of function evaluations infmincon
(default = 10000)bootstrapping
: Binary flag indicating whether the parameters should be fit using bootstrapping on regions (default = 0)resample_rate
: Ifbootstrapping
= 1, the fraction of regions to use to fit the model per iteration (default = 0.8)niters
: Ifbootstrapping
= 1, the number of bootstrapping iterations (default = 100)verbose
: Binary flag indicating whether to use verbose in-line outputs (default = 0)fmindisplay
: Binary flag indicating whether to use verbosefmincon
displays (default = 0)outputs_ndm
: Struct object fromstdNDM_mouse.m
. If not provided,stdNDM_mouse.m
will called to get estimates of the initial values of the global parameters (default = [])bounds_type_endm
: String indicating which type of upper and lower limits to pass tofmincon
based on the bootstrapped distributions of the global parameters fromstdNDM_mouse.m
, if bootstrapping is run. 'CI_X' uses the Xth confidence interval, while 'old' uses +/- 30% of the mean parameter values (default = 'old')bootstrapping_endm
: Binary flag indicating whether the parameters should be fit using bootstrapping on regions (default = 0)resample_rate_endm
: Ifbootstrapping
= 1, the fraction of regions to use to fit the model per iteration (default = 0.8)niters_endm
: Ifbootstrapping
= 1, the number of bootstrapping iterations (default = 100)verbose_endm
: Binary flag indicating whether to use verbose in-line outputs (default = 0)fmindisplay_endm
: Binary flag indicating whether to use verbosefmincon
displays (default = 0)datatype_endm
: String identifying whether to use cell types or genes to add to the NexIS. The regional gene expression data come from the AGEA and the regional cell-type density data from from Mezias, et al., 2022 (default = 'gene')datalist_endm
: Genes or cell types to add to the model. This can be specified as a list of indices, if these are known, or as a cell array of names. Note: if using names, a cell array must be used even if there is only a single name in the list (default = 3578 (AGEA index for Trem2))datapca_endm
: Binary flag indicating whether to use the first PC scores of a list of 2 or more genes or cell types (default = 0)flowthresh
: Percentile at which to threshold the calculated flows usingFlowCalculator.m
(default = 99.93)
- Outputs:
outputs
: A MATLAB struct containing inputs, outputs, and summary statistics of the final model.
- Inputs:
-
FlowCalculator.m
: Function that calculates the flows on the connectome graph over time, which use a first-order approximation for the time derivatives (refer to Anand, et al., 2022 for full details). This function is called internally withinstdNDM_mouse.m
andeNDM_mouse.m
and generally does not need to be called outside of these functions.- Inputs:
X_
: Inferred pathology values over time as a regions-by-time-points matrix.C_
: Mouse mesoscale connectome (regions-by-regions)beta_
: Rate parameter for spread on the connectome, which is obtained from NexISisndm
: Binary flag indicating whether to use NexIS:global or NexIS:extendedU_
: Regions-by-types matrix of gene expression or cell-type density, necessary for NexIS:extendedb_
: NexIS:extended rate parameter(s)
- Outputs:
flow_mat
: Regions-by-regions-by-time-points array of calculated flow values
- Inputs:
-
Output2Table.m
: Produces a MATLAB table of key information from the outputs struct object fromstdNDM_mouse.m
oreNDM_mouse.m
. Optionally writes to a .csv file.- Inputs:
outputs
: MATLAB struct object fromstdNDM_mouse.m
oreNDM_mouse.m
writeout
: Optional binary flag indicating whether or not to write to a .csv (default = 0)filename
: Custom file name string for the table written to a .csv (default contains the date, the study name, and whether it is NexIS:global or NexIS:extended)filepath
: Directory (string) to which the .csv should be written (default = cd)
- Outputs:
summarytable
: MATLAB table object that unpacks key information from outputs
- Inputs:
-
BootstrappingPlotter.m
: Creates boxplots of the parameter distributions from the outputs struct object fromstdNDM_mouse.m
oreNDM_mouse.m
.- Inputs:
outputs
: MATLAB struct object fromstdNDM_mouse.m
oreNDM_mouse.m
savenclose
: Binary flag indicating whether to write the figure to an uncompressed .tiff file and then close the figure handle (default = 0)
- Inputs:
-
CorrelationPlotter.m
: Creates correlation plots per timepoint of modeled pathology vs. expected pathology and, if bootstrapping was used, boxplots of the R2 distributions across iterations. Uses the outputs struct object fromstdNDM_mouse.m
oreNDM_mouse.m
.- Inputs:
outputs
: MATLAB struct object fromstdNDM_mouse.m
oreNDM_mouse.m
savenclose
: Binary flag indicating whether to write the figure to an uncompressed .tiff file and then close the figure handle (default = 0)
- Inputs:
-
CorrelationPlotter_single.m
: Creates correlation plots per timepoint of modeled pathology vs. expected pathology. This function has more inherent control over the way the scatterplots are rendered, but requires some unpacking of the outputs struct fromstdNDM_mouse.m
oreNDM_mouse.m
. (Will likely be subsumed intoCorrelationPlotter.m
at some point)- Inputs:
predicted
: Regions-by-time-points matrix of predicted pathology valuesdata
: Regions-by-time-points matrix of empirical pathology valuestpts
: 1-by-time-points vector of time pointsstudy
: String handle indicating the tauopathy study usedsv_types
: Cell array of genes or cell types used in fitting the NexIS model (default = {})geneorct
: String indicating whether genes or cell types were used in fitting the model (default =genes
)ispca
: Binary flag indicating whether PCA was used on the gene or cell-type inputs to the model (seeeNDM_mouse.m
; default = 0)wdir
: Binary flag indicating whether the directionality parameter, s, was fit in NexIS (seestdNDM_mouse.m
oreNDM_mouse.m
; default = 0)color
: 1-by-3 vector of RGB values for the data points in the scatterplot(default = [1 0 0])shape
: Valid string indicating the data point shape forscatter.m
to use (default = 'o')savenclose
: Binary flag indicating whether to write the figure to an uncompressed .tiff file and then close the figure handle (default = 0)
- Inputs: