Using phylogenetic and functional relationships to inform nonlinear trend estimates from long-term biodiversity data

nicholasjclark/phylo_func_trends

To what extent can phylogenetic or functional relationships among species be leveraged to inform estimates of nonlinear changes in abundance?

This study aims to use long-term multi-species monitoring data to tackle the above question.

Contributors (in no particular order)

Nicholas Clark

Adam Smith

Shubhi Sharma

Casey Youngflesh

Caleb Robbins

Hammed Akande

Guillermo Fandos

Thomas Johnson

Proposed methodology

  1. Gather multi-species abundance (or relative abundance) measures from long-term monitoring studies
  2. Construct phylogenetic and functional trees to represent relationships among species
  3. Gather any other information needed to account for spatial confounding (e.g. coordinates or polygon structures). The script BBS_trends_data.R in this repo has annotated code that walks through such a data gathering / cleaning scheme
  4. Build Generalized Additive Models (GAMs) in mgcv using tensor product decompositions (see the help page on tensor products for more information) that can be used to ask how species' relationships inform estimates of nonlinear trends. Make use of the highly flexible mrf basis in mgcv to incorporate phylogenetic and functional information (see this post on Cross Validated from Gavin Simpson and this blogpost from myself for more context on how these models work). The script BBS_trends_models.R in this repo has annotated example code showing how these models can be fit with bam() while also attempting to account for unmodelled temporal autocorrelation
  5. Design a model evaluation scheme that allows us to compare fits from phylogenetic, functional and "null" models (that use only the random effect grouping factors of "species", but not their relationships) in a variety of ways (cross-validation by leaving certain species or groups out, with appropriate proper scoring rules; calculating trait contributions to squared second derivatives of trends; comparisons against models that assume trends are linear)
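
One of the evaluation ideas in step 5 is to summarise how nonlinear a trend is through its squared second derivatives. The sketch below is a minimal illustration of that quantity only, written in plain Python rather than taken from the repo's R scripts. It assumes the trend has already been predicted on an evenly spaced time grid; the function name `squared_second_derivative` and the toy trends are hypothetical.

```python
def squared_second_derivative(trend, step=1.0):
    """Approximate the integral of f''(t)^2 over an evenly spaced grid.

    Uses central second differences at interior grid points, so at least
    three trend values are required. A perfectly linear trend scores zero.
    """
    if len(trend) < 3:
        raise ValueError("need at least three trend values")
    total = 0.0
    for i in range(1, len(trend) - 1):
        # Central finite-difference estimate of the second derivative
        d2 = (trend[i - 1] - 2.0 * trend[i] + trend[i + 1]) / step ** 2
        # Accumulate squared curvature, weighted by the grid spacing
        total += d2 ** 2 * step
    return total


# Toy trends on a yearly grid (assumed values, for illustration only)
years = list(range(10))
linear_trend = [0.5 * t for t in years]
curved_trend = [0.1 * t ** 2 for t in years]

linear_score = squared_second_derivative(linear_trend)
curved_score = squared_second_derivative(curved_trend)
```

A larger score indicates a wigglier trend, so comparing scores across species or traits gives one rough way to ask where nonlinearity is concentrated. In practice the derivatives would more likely be computed from the fitted smooths and carry posterior uncertainty.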

Tasks

Design and justification

  • Review literature to understand approaches that have been used to leverage phylogenetic or functional relationships to inform population estimates
  • Gather information on the types of models / analyses that are commonly used for large multispecies datasets to inform decisions or calculations of indices (for example, how do the US and Canadian governments use NA BBS data, and could the proposed models have any impact on these pipelines?)
  • Also gather information on Gaussian Markov Random Fields and their potential applications in complex nonlinear effect estimates (see for example this work by Rue and Held and this post)
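
As a starting point for thinking about Gaussian Markov Random Fields, the sketch below builds the simplest intrinsic GMRF precision matrix, the graph Laplacian Q = D - A, from a neighbour list. It is plain Python for illustration and is not the penalty construction used in this repo, which may instead derive penalties from phylogenetic or functional trees. The species labels and the helper names `laplacian_precision` and `quadratic_form` are hypothetical, and the neighbour lists are assumed to be symmetric with no repeated entries.

```python
def laplacian_precision(neighbours):
    """Build the graph Laplacian Q = D - A from {node: [neighbour, ...]}.

    Assumes an undirected graph: if b is listed under a, then a must be
    listed under b, and no neighbour is listed twice.
    """
    nodes = sorted(neighbours)
    index = {node: i for i, node in enumerate(nodes)}
    n = len(nodes)
    q = [[0.0] * n for _ in range(n)]
    for node, nbrs in neighbours.items():
        i = index[node]
        for nbr in nbrs:
            j = index[nbr]
            q[i][j] = -1.0   # off-diagonal: minus one per shared edge
            q[i][i] += 1.0   # diagonal: number of neighbours
    return nodes, q


def quadratic_form(q, x):
    """Compute the penalty x'Qx."""
    n = len(x)
    return sum(x[i] * q[i][j] * x[j] for i in range(n) for j in range(n))


# Hypothetical chain of related species: sp_a - sp_b - sp_c - sp_d
neighbours = {
    "sp_a": ["sp_b"],
    "sp_b": ["sp_a", "sp_c"],
    "sp_c": ["sp_b", "sp_d"],
    "sp_d": ["sp_c"],
}
nodes, Q = laplacian_precision(neighbours)

# Species-level effects (assumed values); the penalty equals the sum of
# squared differences between neighbouring species
effects = [1.0, 1.0, 3.0, 0.0]
penalty = quadratic_form(Q, effects)
flat_penalty = quadratic_form(Q, [2.0, 2.0, 2.0, 2.0])
```

The quadratic form grows when neighbouring species have very different effects and is zero when all effects are equal, which is the sense in which such a penalty shrinks related species towards one another.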

Methodology

  • Identify appropriate multi-species datasets. There is considerable information (with example code) provided by this preprint and the accompanying GitHub repo
  • For candidate datasets, determine appropriate steps for cleaning and preparing data. We don't want too many shortcuts here (e.g. blindly aggregating with no justification); it would be better to think through the data generating process for each dataset
  • Determine appropriate cross-validation schemes, considering blocking over space, time and phylogeny / functional dendrogram, to evaluate candidate models
  • Prepare scoring scripts and justify which scoring rules to prioritize; consider CRPS, energy and variogram scores (see this lecture on univariate forecast evaluation and this lecture on multivariate forecast evaluation for context)
  • Brainstorm the kinds of outputs that we will need, and make sure we have well-annotated functions that can be applied to any of the models for calculating important metrics (look through the in-development functions in the Functions/utilities.R script in this repo for examples; and see the BBS_trends_analysis.R script for examples of how these might be used)
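
To make the scoring task above concrete, here is a minimal sketch of the sample-based CRPS for a single held-out observation, using the standard estimator mean|X - y| - 0.5 * mean|X - X'| over predictive draws. It is plain Python for illustration and is not one of the functions in Functions/utilities.R; the name `crps_ensemble` and the example draws are hypothetical.

```python
def crps_ensemble(draws, observed):
    """Sample-based CRPS for one observation (lower is better).

    draws:    predictive draws for the held-out value
    observed: the held-out value itself
    """
    m = len(draws)
    if m == 0:
        raise ValueError("need at least one predictive draw")
    # Average absolute error of the draws against the observation
    accuracy = sum(abs(x - observed) for x in draws) / m
    # Average absolute difference between all pairs of draws (spread)
    spread = sum(abs(xi - xj) for xi in draws for xj in draws) / (m * m)
    return accuracy - 0.5 * spread


# Assumed toy draws: one forecast centred on the truth, one shifted away
centred = crps_ensemble([1.0, 2.0, 3.0], observed=2.0)
shifted = crps_ensemble([4.0, 5.0, 6.0], observed=2.0)
point = crps_ensemble([2.0], observed=5.0)
```

With a single draw the score reduces to the absolute error, which makes it easy to compare probabilistic and point predictions on the same scale. The pairwise term costs O(m^2), so a sorted-draws formulation would be preferable for large posterior samples.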
