From 3bde34f650d7c06e05bd5957ebca2a7857a4c0ed Mon Sep 17 00:00:00 2001 From: Michael McLaren Date: Thu, 18 Aug 2022 21:44:25 -0400 Subject: [PATCH] Make edits from AW and BC --- abundance-measurement.Rmd | 33 ++++++++++++++++----------------- case-studies.Rmd | 18 +++++++++--------- conclusion.Rmd | 6 ++---- differential-abundance.Rmd | 12 ++++++------ introduction.Rmd | 8 ++++---- main.bib | 11 +++++++++++ solutions.Rmd | 24 ++++++++++++------------ 7 files changed, 60 insertions(+), 52 deletions(-) diff --git a/abundance-measurement.Rmd b/abundance-measurement.Rmd index e7b662d..f8a3359 100644 --- a/abundance-measurement.Rmd +++ b/abundance-measurement.Rmd @@ -1,9 +1,9 @@ # How bias affects abundance measurements {#abundance-measurement} -This section extends the theoretical results of @mclaren2019cons to describe the effect that consistent taxonomic bias within an MGS experiment affects the relative and absolute abundances measured for various microbial species. +This section extends the theoretical results of @mclaren2019cons to describe how taxonomic bias in an MGS experiment affects the relative and absolute abundances measured for various microbial species. We show that some approaches to quantifying species abundance yield constant fold errors (FEs), while others yield FEs that depend on overall community composition and thus can vary across samples. -## Model of MGS measurement +## A model of MGS measurements Our primary tool for understanding the impact of taxonomic bias on MGS measurement is the theoretical model of MGS measurement developed and empirically validated by @mclaren2019cons. This model describes the mathematical relationship between the read counts obtained by MGS and the (actual) abundances of the various species in a sample. @@ -49,7 +49,7 @@ is the _sample mean efficiency_, defined as the mean efficiency of all species w ## Relative abundance {#relative-abundance} We distinguish between two types of species-level *relative abundances* within a sample. -The *proportion* $P_{i}^{(a)}$ of species $i$ in sample $a$ equals its abundance relative to the total abundance of all species in $S$, +The *proportion* $P_{i}^{(a)}$ of species $i$ in sample $a$ equals its abundance divided by the total abundance of all species in $S$, \begin{align} (\#eq:prop) P_{i}^{(a)} &\equiv \frac{A_i^{(a)}}{A_\tot^{(a)}}. @@ -71,8 +71,7 @@ From Equations \@ref(eq:mgs-model), \@ref(eq:total-reads), and \@ref(eq:prop-mea (\#eq:prop-error) \tilde P_{i}^{(a)} &= P_{i}^{(a)} \cdot \frac{B_i}{\bar B^{(a)}}. \end{align} - -Taxonomic bias thus creates a fold-error (FE) in the measured proportion of a species that is equal to its efficiency divided by the mean efficiency in the sample. +Taxonomic bias creates a fold-error (FE) in the measured proportion of a species that is equal to its efficiency divided by the mean efficiency in the sample. Since the mean efficiency varies across samples, so does the FE. This phenomenon can be seen for Species 3 in the two hypothetical communities in Figure \@ref(fig:error-proportions). Species 3, which has an efficiency of 6, is under-measured in Sample 1 (FE < 1) but over-measured (FE > 1) in Sample 2. @@ -89,8 +88,8 @@ From Equations \@ref(eq:mgs-model) and \@ref(eq:ratio-meas), it follows that the (\#eq:ratio-error) \tilde R_{i/j}^{(a)} = R_{i/j}^{(a)} \cdot \frac{B_i}{B_j}. \end{align} -Taxonomic bias thus creates a FE in the measured ratio that is equal to the ratio in the species' efficiencies; the FE is therefore constant across samples. -For instance, in Figure \@ref(fig:error-proportions), the ratio of Species 3 (with an efficiency of 6) to Species 1 (with an efficiency of 1) is over-estimated by a factor of 6 in both communities despite their varying compositions. +Taxonomic bias creates a FE in the measured ratio that is equal to the ratio in the species' efficiencies; the FE is therefore constant across samples. +For instance, in Figure \@ref(fig:error-proportions), the ratio of Species 3 (with an efficiency of 6) to Species 1 (with an efficiency of 1) is over-measured by a factor of 6 in both communities despite their varying compositions. A demonstration in bacterial mock communities is shown in [Figure 3D](https://doi.org/10.7554/eLife.46923.004) of @mclaren2019cons. @@ -113,10 +112,10 @@ We further define the efficiency of taxon $I$ as the abundance-weighted average (\#eq:efficiency-general) B_I^{(a)} \equiv \frac{\sum_{i\in I} A_{i}^{(a)} B_{i}}{\sum_{i\in I} A_{i}^{(a)}}. \end{align} -With these definitions, the read count for taxon $I$ can be expressed as +With these definitions, the read count for higher-order taxon $I$ can be expressed as $M_{I}^{(a)} = A_{I}^{(a)} B_I^{(a)} F^{(a)}$. -Thus $B_I^{(a)}$ plays a role analogous to the efficiency of an individual species, but differs in that it need not be constant across samples: -If the constituent species have different efficiencies, then the efficiency of the higher-order taxon $I$ depends on the relative abundances of its constituents and so will tend to vary across samples (@mclaren2019cons). +Thus $B_I^{(a)}$ plays a role analogous to the efficiency of an individual species, but differs in that it is not constant across samples: +If the constituent species have different efficiencies, then the efficiency of the higher-order taxon $I$ depends on the relative abundances of its constituents and so will vary across samples (@mclaren2019cons). As an example, suppose that Species 1 and Species 2 in Figure \@ref(fig:error-proportions) were in the same phylum. The efficiency of the phylum would then be $\tfrac{1}{2} \cdot 1 + \tfrac{1}{2} \cdot 18 = 9.5$ in Sample 1 and $\tfrac{15}{16} \cdot 1 + \tfrac{1}{16} \cdot 18 \approx 2.1$ in Sample 2. Equations \@ref(eq:prop-error) and \@ref(eq:ratio-error) continue to describe the measurement error in proportions and ratios involving higher-order taxa, so long as the sample-dependent, higher-order taxa efficiencies $B_I^{(a)}$ and $B_J^{(a)}$ are used. @@ -126,22 +125,22 @@ In this way, we see that both proportions and ratios among higher-order taxa may Several extensions of the standard MGS experiment make it possible to measure absolute species abundances. These extensions fall into two general approaches. -The first approach leverages information about the abundance of the total community; for example, @vandeputte2017quan measured total-community abundance using flow cytometry and multiplied this number by MGS genus proportions to obtain the absolute abundances of individual genera (@vandeputte2017quan). +The first approach leverages information about the abundance of the total community; for example, @vandeputte2017quan measured total-community abundance using flow cytometry and multiplied this number by genus proportions measured by MGS to quantify the absolute abundances of individual genera (@vandeputte2017quan). A second approach leverages information about the abundance of one or more individual species; for example, a researcher might 'spike in' a known, fixed amount of an extraneous species to all samples prior to MGS, and normalize the read counts of all species to the spike-in species (@harrison2021theq). We consider each approach in detail to determine how taxonomic bias affects the resulting absolute-abundance measurements. ### Leveraging information about total-community abundance Suppose that the total abundance of all species in the sample, $A_{\tot}^{(a)}$, has been measured by a non-MGS method, yielding a measurement $\tilde A_\tot^{(a)}$. -The absolute abundance of an individual species can be measured by multiplying the species' proportion from MGS by this total-abundance measurement, +The absolute abundance of an individual species can be quantified by multiplying the species' proportion from MGS by this total-abundance measurement, \begin{align} (\#eq:density-prop-meas) \tilde A_i^{(a)} &= \tilde P_i^{(a)} \tilde A_\tot^{(a)}. \end{align} Total-abundance measurements recently used for this purpose include counting cells with microscopy (@lloyd2020evid) or flow cytometry (@props2017abso, @vandeputte2017quan, @galazzo2020howt), measuring the concentration of a marker-gene with qPCR or ddPCR (@zhang2017soil, @barlow2020aqau, @galazzo2020howt, @tettamantiboshier2020comp), and measuring bulk DNA concentration with a florescence-based DNA quantification method (@contijoch2019gutm). -Importantly, these methods of measuring total abundance are themselves subject to taxonomic bias. -Flow cytometry may, for example, yield lower cell counts for species whose cells tend to clump together or are prone to lysis during steps involved in sample collection, storage, and preparation. +Importantly, these methods of measuring total abundance are themselves subject to taxonomic bias that is analogous to, but quantitatively different from, the MGS relative abundance measurements. +Flow cytometry may yield lower cell counts for species whose cells tend to clump together or are prone to lysis during steps involved in sample collection, storage, and preparation. Marker-gene concentrations measured by qPCR are affected by variation among species in extraction efficiency, marker-gene copy number, and PCR binding and amplification efficiency (@lloyd2013meta). We can easily understand the impact of taxonomic bias on total-abundance measurement under simplifying assumptions analogous to those in our MGS model. Suppose that each species $i$ has an _absolute efficiency_ $B_{i}^{\mtot}$ for the total-abundance measurement that is constant across samples. @@ -155,15 +154,15 @@ Neglecting other error sources, the total-abundance measurement equals \end{align} -Species abundance measurements derived by this method are affected by taxonomic bias in both the MGS and total-abundance measurement. -We can determine the resulting fold error (FE) by substituting Equations \@ref(eq:prop-error) and \@ref(eq:total-density-error) into Equation \@ref(eq:density-prop-meas), yielding +Species abundance measurements derived by this method (Equation \@ref(eq:density-prop-meas)) are affected by taxonomic bias in both the MGS and total-abundance measurement. +We can determine the resulting fold error (FE) in the estimate $\tilde A_i^{(a)}$ by substituting Equations \@ref(eq:prop-error) and \@ref(eq:total-density-error) into Equation \@ref(eq:density-prop-meas), yielding \begin{align} (\#eq:density-prop-error) \tilde A_\tot^{(a)} = A_\tot^{(a)} \cdot \frac{B_i \bar B^{\mtot (a)}}{\bar B^{(a)}}. \end{align} Equation \@ref(eq:density-prop-error) indicates that the FE in the measured absolute abundance of a species equals its MGS efficiency relative to the mean MGS efficiency in the sample, multiplied by the mean efficiency of the total measurement. -As in the case of proportions (Equation \@ref(eq:prop-error)), the FE depends on sample composition through the two mean efficiency terms and so will vary across samples unless the two perfectly covary. +As in the case of proportions (Equation \@ref(eq:prop-error)), the FE depends on sample composition through the two mean efficiency terms and so will, in general, vary across samples. ### Leveraging information about a reference species diff --git a/case-studies.Rmd b/case-studies.Rmd index 50f6a57..e304ba3 100644 --- a/case-studies.Rmd +++ b/case-studies.Rmd @@ -6,7 +6,7 @@ To better understand the potential impact of taxonomic bias on DA analysis in pr The taxonomic bias species present in 'mock communities' of known composition can be directly measured, and the measured bias can then be used to correct downstream DA analysis (@mclaren2019cons). In practice it is difficult to construct control communities that span the many species present in complex natural communities. -However, gnotobiotic community experiments are well suited to this form of _calibration via community controls_ since, unlike most natural ecosystems, it is possible to assemble mock communities containing all species in known relative abundances. +However, gnotobiotic community experiments are well suited to this form of _calibration via community controls_ since it is possible to assemble mock communities containing all species in known relative abundances. In a study of host-commensal-pathogen interactions, @leopold2020host inoculated plants with 8 commensal fungal species and subsequently exposed plants to a fungal pathogen. The authors used ITS amplicon sequencing to measure communities before and after pathogen infection. @@ -15,7 +15,7 @@ The authors found a 13-fold difference between the most and least efficiently me @leopold2020host performed two related DA analyses on the pre-infection communities: the first characterized the relative importance of host genetics and species arrival order on species relative abundances in the fully-established community, and the second quantified the strength of 'priority effects'---the advantage gained by a species from being allowed to colonize first. Both analyses were based on fold differences (FDs) in species proportions and so in principle were sensitive to taxonomic bias. -To improve accuracy, the authors incorporated the bias measured from the control samples with analysis-specific calibration procedures. +To improve accuracy, the authors incorporated the bias measured from the control samples into analysis-specific calibration procedures. We repeated the two DA analyses of @leopold2020host with and without calibration and found that the results did not meaningfully differ ([SI Analysis of host genetics and arrival order](https://mikemc.github.io/differential-abundance-theory/notebook/posts/2022-01-06-leopold2020host-original-regression-analysis/), @@ -24,7 +24,7 @@ To understand why, we examined the variation in species proportions and the mean ([SI Analysis](https://mikemc.github.io/differential-abundance-theory/notebook/posts/2022-01-08-leopold2020host-case-study/), SI Figure \@ref(fig:leopold2020host-variation)). Despite the 13-fold variation in the efficiencies among species, the mean efficiency hardly varied across samples (SI Figure \@ref(fig:leopold2020host-variation)C), having a geometric range of 1.62 and a geometric standard deviation of 1.05. -This consistency in the mean efficiency was despite the fact that each species each showed substantial multiplicative variation (SI Figure \@ref(fig:leopold2020host-variation)A). +This consistency in the mean efficiency was despite the fact that each species showed substantial multiplicative variation (SI Figure \@ref(fig:leopold2020host-variation)A). But the pre-infection samples were always dominated by the three species with the highest efficiencies, which varied by 3-fold and by just 1.5-fold between the two most dominant. The mean efficiency, equal to the proportion-weighted arithmetic average of species efficiencies, is insensitive to species present at low relative abundance and so remained relatively constant across samples. Because the multiplicative variation in the mean efficiency was much smaller than that in the proportions of individual species, it had a negligible impact on the inferred FDs and the DA analyses based on them. @@ -32,8 +32,8 @@ Because the multiplicative variation in the mean efficiency was much smaller tha We performed an additional DA analysis on the data from @leopold2020host to investigate whether any commensals increased in absolute concentration in response to infection ([SI Analysis](https://mikemc.github.io/differential-abundance-theory/notebook/posts/2022-01-08-leopold2020host-case-study/#change-in-commensals-due-to-infection)). The pathogen is absent in pre-infection samples but tends to dominate the community post-infection, resulting in a substantially higher mean efficiency in post-infection samples (SI Figure \@ref(fig:leopold2020host-infection-mean-efficiency-dist)). Across different host genotypes, the average post-infection increase in mean efficiency ranged from 2.5-fold to 5.2-fold. -Using gamma-Poisson (or negative-binomial) regression of the observed read counts, we estimated the average LFD in commensal species proportions following infection with and without calibration (SI Figure \@ref(fig:leopold2020host-infection-lfc)). -Commensals typically decreased post-infection, which is expected given the pathogen's growth to most abundant community member and the sum-to-one constraint faced by proportions. +Using gamma-Poisson regression of the observed read counts, we estimated the average LFD in commensal species proportions following infection with and without calibration (SI Figure \@ref(fig:leopold2020host-infection-lfc)). +The commensals' proportions typically decreased post-infection, which is expected given the pathogen's growth to most abundant community member and the sum-to-one constraint faced by proportions. However, failure to account for bias caused the decrease to be overestimated, by an amount corresponding to the inverse change in log mean efficiency. For commensal-host pairs with relatively small observed decreases, bias correction greatly reduced the magnitude of the negative LFDs or, in several cases, resulted in LFDs that were near 0 or slightly positive. @@ -126,10 +126,10 @@ hmp_species_diff <- hmp_species_stats %>% To investigate this question, we analyzed bacterial species profiles of vaginal and stool samples derived from shotgun sequencing in the Human Microbiome Project (@huttenhower2012stru, [SI Analysis](https://mikemc.github.io/differential-abundance-theory/notebook/posts/2022-01-30-hmp-stool-vagina-comparison/)). -On average, stool profiles had substantially greater order-2 alpha diversity than vaginal profiles (SI Figure \@ref(fig:gut)A; GM $\md$ GSD of `r hmp_div_stats['stool', 'gm'] %>% round(1)` $\md$ `r hmp_div_stats['stool', 'gsd'] %>% round(1)` in stool samples and `r hmp_div_stats['vagina', 'gm'] %>% round(1)` $\md$ `r hmp_div_stats['vagina', 'gsd'] %>% round(1)` in vaginal samples). +On average, stool profiles had substantially greater order-2 alpha diversity than vaginal profiles (SI Figure \@ref(fig:gut)A; geometric mean (GM) $\md$ geometric standard deviation (GSD) of `r hmp_div_stats['stool', 'gm'] %>% round(1)` $\md$ `r hmp_div_stats['stool', 'gsd'] %>% round(1)` in stool samples and `r hmp_div_stats['vagina', 'gm'] %>% round(1)` $\md$ `r hmp_div_stats['vagina', 'gsd'] %>% round(1)` in vaginal samples). To assess the potential importance of bias for proportion-based DA analyses in the two ecosystems, we quantified the multiplicative variation in the mean efficiency across samples for a large number of possible taxonomic biases under the assumption that the measured profiles reflected the truth. Across simulation replicates, the GSD in the mean efficiency was typically lower in stool than in vaginal samples (SI Figure \@ref(fig:gut)B; ratio of GSD in vaginal samples to GSD in stool samples of `r hmp_me_stats['iid', 'gm'] %>% round(1)` $\md$ `r hmp_me_stats['iid', 'gsd'] %>% round(1)` across 1000 simulations). -Notably, however, the multiplicative variation in species also tended to be lower (SI Figure \@ref(fig:gut)C), which the GSD in the proportion of a gut species across gut samples an average of `r hmp_species_diff %>% round(2)`-fold lower than that of a vaginal species across vaginal samples. +Notably, however, the multiplicative variation in species also tended to be lower (SI Figure \@ref(fig:gut)C); the GSD in the proportion of a gut species across gut samples was an average of `r hmp_species_diff %>% round(2)`-fold lower than that of a vaginal species across vaginal samples. Recall that the importance of bias estimation of the FD in a species' proportion depends on GSD in the mean efficiency relative the GSD in the species' proportion (Section \@ref(differential-abundance)). Thus these results suggest that although the mean efficiency likely varies less across stool than vaginal samples, bias may be just as problematic for proportion-based DA analyses. @@ -203,8 +203,8 @@ In the vaginal case study, we also observed that the mean efficiency was relativ In both cases, the stability of the mean efficiency can be understood by the fact that it is an additive average over species and so is primarily determined by the most abundant species in the sample. Yet we also observed cases where the mean efficiency varied substantially. -In several cases, the mean efficiency varied systematically with the condition of interest sufficient to cause substantial systematic error in LFD estimates (Scenario 3). -Examples include the comparison of foliar fungal microbiomes pre- and post-infection, for which the mean efficiency increased post-infection, and the comparison of vaginal microbiomes with low versus high diversity in the MOMS-PI study, for which the mean efficiency was typically lower in high-diversity samples. +In several cases, the (log) mean efficiency co-varied sufficiently strongly with the condition of interest to cause substantial systematic errors in LFD estimates (Scenario 3). +Examples include the comparison of foliar fungal microbiomes pre- and post-infection, for which the mean efficiency substantially increased post-infection, and the comparison of vaginal microbiomes with low versus high diversity in the MOMS-PI study, for which the mean efficiency was typically lower in high-diversity samples. Our analysis of marine sediment communities is consistent with a systematic decline of the mean efficiency with burial time that is sufficient to substantially inflate the estimated growth rates of slowly changing species. The mean efficiency cannot differ by more than the species proportions constituting it. Therefore, the error in the estimated LFDs tend to be practically significant only for species whose LFDs are substantially smaller in magnitude than the largest LFD. diff --git a/conclusion.Rmd b/conclusion.Rmd index 585adda..a37405c 100644 --- a/conclusion.Rmd +++ b/conclusion.Rmd @@ -9,12 +9,10 @@ In Section \@ref(solutions), we describe how ratio-based methods can thus provid in addition, calibration with community controls, bias-sensitivity analysis, and bias-aware meta-analysis provide additional ways to correct or mitigate the effect of bias. Proportion-based DA methods encompass many of the most popular methods for analyzing relative and absolute abundances. -Importantly, however, we found that the degree and significance of the error due to bias depends on experimental context and may often be negligible. -In particular, if the mean efficiency is approximately constant across samples, then bias has a negligible effect on multiplicative and rank-based cross-sample comparisons, and DA results will be unaffected. +Importantly, however, if the mean efficiency is approximately constant across samples, then bias has a negligible effect on multiplicative and rank-based DA results. If the mean efficiency varies but is unassociated with the condition of interest, then bias merely serves to increase noise and does not create systematic errors in DA results. It is only when the mean efficiency is associated with the condition being analyzed that large systematic errors can occur. -Our case studies suggest that this problematic scenario can occur but may well be the exception rather than the rule. -Thus our results are still consistent with taxonomic bias having had a limited impact on extant DA results. +Our case studies suggest that this problematic scenario does occur, but it may be the exception rather than the rule. Systematic investigation of how the mean efficiency affects DA results across a wide range of studies may increase our confidence in previous DA results and/or alert us to the conditions in which they are most suspect. Important open questions include 1) determining the extent to which bias is consistent across samples for different taxonomic levels, MGS methods, and sample types; 2) assessing the validity of post-extraction measurements as measures of pre-extraction abundance; 3) understanding how the multiplicative error from bias interacts with the non-multiplicative error from contamination and taxonomic misassignment; and 4) understanding how different underlying community dynamics, and in particular the source of zero counts in MGS measurements, affect bias sensitivity. diff --git a/differential-abundance.Rmd b/differential-abundance.Rmd index f603ad5..242666b 100644 --- a/differential-abundance.Rmd +++ b/differential-abundance.Rmd @@ -58,16 +58,16 @@ For example, the ratio of Species 2 to Species 3 shows the same 4-fold decrease -Taxonomic bias affects the measured FDs in a species' absolute abundance in a similar fashion. +Taxonomic bias affects measured FDs in a species' absolute abundance in a similar fashion. First, consider a species' abundance measurement $\tilde A_i^{(a)}$ derived from a non-MGS measurement of total-community abundance using Equation \@ref(eq:density-prop-meas). Equation \@ref(eq:density-prop-error) indicates that the FE in this measurement equals the (constant) species efficiency multiplied by ${\bar B^{\mtot (a)}}/{\bar B^{(a)}}$, the ratio of the mean efficiency of the total-abundance measurement to that of the MGS measurement. This ratio can vary, creating error in the measured FD between samples. -Notably, if the FDs in the mean efficiency of the total-abundance measurement mirrors that of the MGS measurement, then the two offset each other and lead to more stable FEs (and hence more accurate FDs) in $\tilde A_i^{(a)}$. +A notable special case is if the FDs in the mean efficiency of the total-abundance measurement mirrors that of the MGS measurement; then, the two will offset each other and lead to more stable FEs (and hence more accurate FDs) in $\tilde A_i^{(a)}$. We discuss how this possibility might be exploited in real experimental workflows in Section \@ref(solutions). Now consider a species' abundance derived from a reference species using Equation \@ref(eq:density-ratio-meas). Equation \@ref(eq:density-ratio-error) indicates that the FE equals a constant ratio in species' efficiencies multiplied by the FE in the measurement of the reference species' abundance. -If the abundance of the reference species can determined up to constant FE, then the FE in $\tilde A_i^{(a)}$ will also be constant, leading to accurate FDs. +If the abundance of the reference species can be determined up to constant FE across samples, then the FE in $\tilde A_i^{(a)}$ will also be constant, leading to accurate FDs. ## Regression analysis of many samples @@ -83,8 +83,8 @@ A DA regression analysis is primarily interested in the slope coefficient $\beta For a binary covariate, $\beta$ equals the average difference in $y$ between samples with $x=1$ versus those with $x=0$. Taking $y$ to be the logarithm of a species' relative or absolute abundance makes the DA analysis multiplicative, with the coefficient $\beta$ corresponding to the average log fold difference (LFD) in abundance between conditions or for a unit increase in $x$. -Although researchers typically use more sophisticated models, the simple linear regression model in Equation \@ref(eq:regression) provides an intuitive basis for understanding the effect of taxonomic bias on regression-based DA analyses. -Appendix \@ref(appendix-regression) describes a general framework for finding the error in the estimated slope coefficient $\beta$ caused by taxonomic bias; here, we summarize and illustrate these results for the various abundance measures decribed in Section \@ref(abundance-measurement). +Although researchers often use more sophisticated models, the simple linear regression model in Equation \@ref(eq:regression) provides an intuitive basis for understanding the effect of taxonomic bias on regression-based DA analyses. +Appendix \@ref(appendix-regression) describes a general framework for finding the error in the estimated slope coefficient $\beta$ caused by taxonomic bias; here, we summarize and illustrate these results for the various abundance measures described in Section \@ref(abundance-measurement). In each case, a multiplicative DA analysis can be conducted by setting $y$ equal to the logarithm of the untransformed abundance measurement ($\tilde P_i$, $\tilde R_{i/j}$, or $\tilde A_i$). Consider an abundance measure such as the ratio between species $i$ and $j$, $\tilde R_{i/j}$. Taxonomic bias creates a constant FE in $\tilde R_{i/j}$ that translates to a constant additive shift in $\log \tilde R_{i/j}$ and therefore does not affect the estimate of the slope $\hat \beta$. @@ -94,7 +94,7 @@ The measured abundance $\tilde A_i$ thus has a FE that varies inversely with the If $\bar B$ is stable across samples, then the additive error in $\log \tilde A_i$ will be approximately constant and $\hat \beta$ is again unaffected---bias has no effect on the DA result. If, on the other hand, $\log \bar B$ tends to linearly vary with the covariate $x$, then the additive error in $\log \tilde A_i$ will linearly vary in the opposite direction and cause a systematic error in $\hat \beta$ that is equal in magnitude but opposite in sign to the slope of $\log \bar B$ with $x$. A third possibility is that $\bar B$ varies in an essentially random manner that is independent of the covariate $x$. -In this scenario, the error from taxonomic bias in the measurement $\log \tilde A_{i}$ acts to increase the noise in $\hat beta$ (i.e., increase its standard error), but does not cause a systematic error. +In this scenario, the error from taxonomic bias in the measurement $\log \tilde A_{i}$ acts to increase the noise in $\hat \beta$ (i.e., increase its standard error), but does not cause a systematic error. Figure \@ref(fig:regression-example) shows a simulated example of a regression analysis of the absolute abundances of 10 species across two conditions in which the mean efficiency is systematically greater in the condition $x=1$ than the in the condition $x=0$. The increase in the log mean efficiency from $x=0$ to $x=1$ (Figure \@ref(fig:regression-example) A and B) causes an artificial decrease in the estimated LFD between conditions for each species (Figure \@ref(fig:regression-example) C and D). diff --git a/introduction.Rmd b/introduction.Rmd index 9d946b0..0621466 100644 --- a/introduction.Rmd +++ b/introduction.Rmd @@ -26,14 +26,14 @@ If bias caused a species' abundance to be consistently measured as 10-fold great However, @mclaren2019cons showed mathematically and with MGS measurements of artificially constructed ('mock') communities that consistent taxonomic bias can create fold errors (FEs) that vary across samples and, as a result, majorly distort cross-sample comparisons. In particular, they showed that the FE in a species' proportion---the most common measure of relative abundance---varies among samples, distorting the observed FDs between samples. In some cases, bias can even lead to incorrect inferences about the direction of change (for example, by causing a taxon that decreased to appear to increase). -Yet @mclaren2019cons also found that other abundance measures---those based on the ratios among species---have proportional errors and may lead to more robust DA analyses. +Yet @mclaren2019cons also found that other abundance measures---those based on the ratios among species---have constant FEs and may lead to more robust DA analyses. The implications of these findings for DA analysis of absolute abundances and for the joint analysis of variation of many species across many samples, as is typical in microbiome association testing, have yet to be investigated. Here we use a combination of theoretical analysis, simulation, and re-analysis of published experiments to consider when and why taxonomic bias in MGS measurements leads to spurious results in DA analysis of relative and absolute abundances. -We show that, in contrast to received wisdom, taxonomic bias can affect the results of DA methods (for relative and absolute abundance) that are based on species proportions. +We show that, in contrast to received wisdom, taxonomic bias can affect the results of DA methods that are based on species proportions, even if bias is the same across samples. We describe the theoretical conditions when this effect is negligible and when it may cause serious scientific errors, and explore this effect in case studies based on real microbiome experiments. -We further demonstrate that another set of DA methods, based on the ratios among species, are robust to bias. +We further demonstrate that another set of DA methods, based on the ratios among species, are robust to consistent bias. Finally, we present several methods for quantifying, correcting, or otherwise accounting for taxonomic bias in DA analyses which, in many cases, can be deployed with only modest changes to existing experimental and analytical workflows. -Application of these insights and methods can build the confidence needed to turn the findings of microbiome studies into readily-translatable scientific knowledge. +These insights and methods will aid microbiome data analysts in turning the findings of microbiome studies into readily-translatable scientific knowledge. diff --git a/main.bib b/main.bib index 99e1ab5..afb7a97 100644 --- a/main.bib +++ b/main.bib @@ -60,6 +60,17 @@ @article{chng2020meta volume = {4}, year = {2020} } +@article{clausen2022mode, + title = "Modeling complex measurement error in microbiome experiments", + author = "Clausen, David S and Willis, Amy D", + month = apr, + year = 2022, + url = "http://arxiv.org/abs/2204.12733", + archivePrefix = "arXiv", + eprint = "2204.12733", + primaryClass = "stat.ME", + arxivid = "2204.12733" +} @article{conover2012ther, title = "The rank transformation-an easy and intuitive way to connect many nonparametric methods to their parametric counterparts for diff --git a/solutions.Rmd b/solutions.Rmd index 24bb519..0ca904a 100644 --- a/solutions.Rmd +++ b/solutions.Rmd @@ -1,4 +1,4 @@ -# Potential solutions {#solutions} +# Solutions {#solutions} ## Ratio-based relative DA analysis @@ -16,16 +16,16 @@ A common approach to enable the computation of FDs in the presence of zeros is t Unfortunately, this procedure violates bias invariance. How bias interacts with more sophisticated approaches for handling zeros remains an important open question. Second, the bias invariance of ratios among species does not extend to additive aggregates of species into higher-order taxa such phyla unless bias is conserved within the species group (@mclaren2019cons). -Multiplicative aggregates of species, recently used for biomarker discovery (@riverapinto2018bala, @quinn2020inte), provide an alternative that remains bias-invariant but may be harder to interpret. +Multiplicative aggregates of species, recently used for biomarker discovery (@riverapinto2018bala, @quinn2020inte), provide an alternative that remains bias-invariant but are harder to interpret. If biological or technical considerations favor a DA method that is not bias invariant, it can be beneficial to also use a ratio-based method as a robustness check. For example, @hevroni2020seas examined how the within-sample ranks of viral species changed from summer to winter. -Since changes in within-sample ranks can be affected by bias, the authors also considered the (bias-invariant) changes in the centered log ratios (CLR) associated with each species. +Since changes in within-sample ranks can be affected by bias, the authors also considered the changes in the centered log ratios (CLR) associated with each species. They found that the species whose within-sample ranks increased also increased in their CLR value, providing evidence that their initial results were not driven by bias. ## Calibration using community controls {#calibrate-compositions} -_Community calibration controls_ are samples whose species identities and relative abundances are known, either by construction or by characterization with a chosen 'gold standard' or _reference protocol_ (@mclaren2019cons). +_Community calibration controls_ are samples whose species identities and relative abundances are known, either by construction or by characterization with a chosen 'gold standard' or _reference protocol_ (@mclaren2019cons, @clausen2022mode). Inclusion of these samples along with the primary samples in an MGS experiment can enable researchers to calibrate their MGS measurement by directly measuring and removing the effect of bias. Calibration can be extended to species not in the controls by imputing the efficiencies of the missing species using phylogenetic relatedness, genetic characteristics (such as 16S copy number), and/or phenotypic properties such as cell-wall structure. @@ -39,7 +39,7 @@ The fungal and vaginal case studies of Section \@ref(case-studies) provide sever In studies of synthetic communities (like the fungal case study) or of ecosystems dominated by a relatively small number of culturable taxa (like the vaginal microbiome), calibration controls can be artificially constructed ('mock') communities. In other cases, _natural community controls_ can be derived from aliquots of a homogenized microbiome sample. Measurement of these controls across multiple experiments enables characterization of the _differential bias_ between experiments (@mclaren2019cons). -Calibration using differential bias can make results directly comparable across studies that used different. +Calibration using differential bias can make results directly comparable across studies that used different protocols. @@ -55,9 +55,9 @@ fs::path( -## Choose absolute-abundance methods with more stable FEs +## Absolute-abundance methods with more stable FEs -There are many methods for obtaining species (absolute) abundances from the relative abundances measured by MGS, which had previously not been evaluated for their sensitivity to taxonomic bias (@williamson2021amul being a notable exception). +There are many methods for obtaining absolute abundances from the relative abundances measured by MGS, but these methods have not previously been evaluated for their sensitivity to taxonomic bias (@williamson2021amul being a notable exception). Our theoretical results show that these methods differ in the extent to which taxonomic bias causes the FEs in species abundances to vary across samples. The effect of bias on absolute DA analysis can be mitigated by choosing methods with more stable FEs, using either of two general approaches. @@ -70,17 +70,17 @@ Section \@ref(differential-abundance) shows that the effect of both forms of bia Consider the debate over whether flow cytometry or 16S qPCR are better methods for measuring total abundance for the purposes of normalizing 16S amplicon sequencing data (@galazzo2020howt, @jian2021comm). Flow cytometry directly counts cells, whereas 16S qPCR measures the concentration of the 16S gene following DNA extraction. Thus the 16S qPCR measurement is subject to bias from extraction, copy-number variation, primer binding, and amplification, just like the 16S sequencing measurement. -Although this shared bias likely makes 16S-qPCR less accurate as a measure of total cell concentration, our theory suggests that it can lead to more accurate FDs across samples. +Although this shared bias likely makes 16S-qPCR less accurate as a measure of total cell concentration, our theory suggests that it will lead to more accurate FDs---and thus more accurate DA results. -More generally, these observations suggest that for the purposes of performing an absolute DA analysis from amplicon sequencing measurements, the ideal total-abundance measurement may be qPCR the same marker gene, from the same DNA extraction. +More generally, these observations suggest that for the purposes of performing an absolute DA analysis from amplicon sequencing measurements, the ideal total-abundance measurement is qPCR of the same marker gene, from the same DNA extraction. Similar reasoning suggests that for shotgun sequencing, the ideal total-abundance measurement is bulk DNA quantification: Shotgun sequencing and bulk DNA quantification are both subject to bias from extraction and variation in genome size. Optimal use of these pairings requires thoughtful choices during bioinformatics. For example, performing copy-number correction on amplicon read counts prior to multiplication by qPCR measurements would be counter-productive, as it decouples bias in the two measurements. Additionally, the MGS proportions should be computed prior to discarding any unassigned reads, since species that are missing from the given taxonomy database still contribute to the total concentration of marker-gene and/or bulk DNA. Future experiments should evaluate the extent to which the taxonomic bias of DNA-based total-abundance measurements is stable across samples and shared with the complementary MGS measurement. ### Normalize to a reference species -A second approach to obtaining species abundances involves normalizing the MGS count for each species to that of a reference species with constant and/or known abundance (Equation \@ref(eq:density-ratio-meas)). -Our theory predicts that taxonomic bias induces constant FEs in the resulting species abundances which do not affect DA results. +A second approach to obtaining absolute species abundances involves normalizing the MGS count for each species to that of a reference species with constant and/or known abundance (Equation \@ref(eq:density-ratio-meas)). +Our theory predicts that taxonomic bias induces constant FEs in the abundances obtained by this approach, which will not affect DA results. Previous studies have used species with a constant abundance as references: spike-ins, computationally-identified 'housekeeping species', and the host. Researchers may naturally be concerned about bias between the reference species and the native species being normalized against it (see, for example, @harrison2021theq's recommendations for spike-in experiments). @@ -112,7 +112,7 @@ The development of tools and workflows to facilitate bias sensitivity analysis m ## Bias-aware meta-analysis Meta-analysis of microbiome samples measured across multiple studies must contend with the fact that different studies typically use different protocols and hence have different taxonomic biases. -These different biases can be explicitly accounted for in a _bias meta-analysis_ which has the potential to improve statistical power as well as interpretability of multi-study DA analyses. +These different biases can be explicitly accounted for in a _bias-aware meta-analysis_ which has the potential to improve statistical power as well as interpretability of multi-study DA analyses. Parametric meta-analysis models include study-specific latent parameters representing 'batch effects'—non-biological differences in the data from each study which can distort the observed biological patterns. By estimating these 'nuisance parameters' along with the biological parameters of interest (such as the difference in a species' log abundance between conditions), the meta-analysis aims to reduce statistical bias in the biological parameters created by the non-biological differences among studies (@leek2010tack,@wang2019mana). In a bias-aware meta-analysis, the meta-analysis model is configured so that (some of) the latent parameters correspond to study- and species-specific relative efficiencies.