diff --git a/docs/Metabolomics_files/figure-html/bem-1.png b/docs/Metabolomics_files/figure-html/bem-1.png
index a70df89..a357e16 100644
Binary files a/docs/Metabolomics_files/figure-html/bem-1.png and b/docs/Metabolomics_files/figure-html/bem-1.png differ
diff --git a/docs/introduction.html b/docs/introduction.html
index 38df2de..8c0aec3 100644
--- a/docs/introduction.html
+++ b/docs/introduction.html
@@ -535,8 +535,8 @@ <h2><span class="header-section-number">1.3</span> Trends in Metabolomics<a href
 </div>
 <div id="workflow-1" class="section level2 hasAnchor" number="1.4">
 <h2><span class="header-section-number">1.4</span> Workflow<a href="introduction.html#workflow-1" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<div id="htmlwidget-8516ffa0e56e3e209769" style="width:300px;height:480px;" class="grViz html-widget"></div>
-<script type="application/json" data-for="htmlwidget-8516ffa0e56e3e209769">{"x":{"diagram":"digraph workflow {\nnode [shape = box]\nA [label = \"raw data\"]\nB [label = \"open source format\"]\nC [label = \"DoE folder\"]\nD [label = \"peaks list\"]\nE [label = \"retention time correction\"]\nF [label = \"peaks grouping\"]\nG [label = \"peaks filling\"]\nH [label = \"raw peaks\"]\nI [label = \"data visulization\"]\nJ [label = \"batch effects correction\"]\nK [label = \"corrected peaks\"]\nL [label = \"annotation\"]\nM [label = \"metabolomics pathway analysis\"]\nN [label = \"omics analysis\"]\nO [label = \"biomarkers discovery/diagnoise\"]\n\nA -> B -> C -> D -> E -> F -> G -> H\nH -> I\nI -> J\nH -> J -> K -> L\nL -> M -> N\nL -> O\n                  }","config":{"engine":"dot","options":null}},"evals":[],"jsHooks":[]}</script>
+<div id="htmlwidget-08e8d605cbc9ba6711fa" style="width:300px;height:480px;" class="grViz html-widget"></div>
+<script type="application/json" data-for="htmlwidget-08e8d605cbc9ba6711fa">{"x":{"diagram":"digraph workflow {\nnode [shape = box]\nA [label = \"raw data\"]\nB [label = \"open source format\"]\nC [label = \"DoE folder\"]\nD [label = \"peaks list\"]\nE [label = \"retention time correction\"]\nF [label = \"peaks grouping\"]\nG [label = \"peaks filling\"]\nH [label = \"raw peaks\"]\nI [label = \"data visualization\"]\nJ [label = \"batch effects correction\"]\nK [label = \"corrected peaks\"]\nL [label = \"annotation\"]\nM [label = \"metabolomics pathway analysis\"]\nN [label = \"omics analysis\"]\nO [label = \"biomarkers discovery/diagnosis\"]\n\nA -> B -> C -> D -> E -> F -> G -> H\nH -> I\nI -> J\nH -> J -> K -> L\nL -> M -> N\nL -> O\n                  }","config":{"engine":"dot","options":null}},"evals":[],"jsHooks":[]}</script>
 
 </div>
 </div>
diff --git a/docs/raw-data-pretreatment.html b/docs/raw-data-pretreatment.html
index 1a30e32..db56add 100644
--- a/docs/raw-data-pretreatment.html
+++ b/docs/raw-data-pretreatment.html
@@ -537,17 +537,17 @@ <h3><span class="header-section-number">6.7.1</span> Non-detects<a href="raw-dat
 ## 
 ## Coefficients:
 ##             Estimate Std. Error z value Pr(&gt;|z|)    
-## (Intercept)   1.0000     0.4366    2.29    0.022 *  
-## x            10.0000     0.3162   31.62   &lt;2e-16 ***
-## Log(scale)    2.1627     0.0000     Inf   &lt;2e-16 ***
+## (Intercept)   1.0000     0.4325   2.312   0.0208 *  
+## x            10.0000     0.3162  31.623   &lt;2e-16 ***
+## Log(scale)    2.1541     0.0000     Inf   &lt;2e-16 ***
 ## ---
 ## Signif. codes:  0 &#39;***&#39; 0.001 &#39;**&#39; 0.01 &#39;*&#39; 0.05 &#39;.&#39; 0.1 &#39; &#39; 1
 ## 
-## Scale: 8.695 
+## Scale: 8.62 
 ## 
 ## Gaussian distribution
 ## Number of Newton-Raphson Iterations: 1 
-## Log-likelihood: -3082 on 3 Df
+## Log-likelihood: -3073 on 3 Df
 ## Wald-statistic:  1000 on 1 Df, p-value: &lt; 2.22e-16</code></pre>
 <p>According to Ronald Hites’s simulation <span class="citation">(<a href="#ref-hites2019">Hites 2019</a>)</span>, replacing measurements below the LOD (or even missing measurements) with LOD/2 or with <span class="math inline">\(LOD/\sqrt2\)</span> causes little bias, and “Any time you have a % non-detected &gt;20%, for whatever reason, it is unlikely that the data set can give useful results.”</p>
 <p>Another study found that random forest could be the best imputation method for missing at random (MAR) and missing completely at random (MCAR) data, while quantile regression imputation of left-censored data is the best imputation method for left-censored missing not at random (MNAR) data <span class="citation">(<a href="#ref-wei2018">Wei et al. 2018</a>)</span>.</p>
diff --git a/docs/search_index.json b/docs/search_index.json
index 59195d8..3c49b34 100644
--- a/docs/search_index.json
+++ b/docs/search_index.json
@@ -1 +1 @@
-[["index.html", "Meta-Workflow Preface", " Meta-Workflow Miao YU 2024-04-10 Preface This is an online handout for mass spectrometry based metabolomics data analysis. It would cover a full reproducible metabolomics workflow for data analysis and important topics related to metabolomics. Here is a list of topics: Sample collection Sample pretreatment Principles of metabolomics data analysis Software selection Batch correction Annotation Omics analysis Exposome This is a book written in Bookdown. You could contribute it by a pull request in Github. A workshop based on this book could be found here. Meanwhile, a docker image xcmsrocker is available for metabolomics reproducible research. R and Rstudio are the software needed in this workflow. "],["introduction.html", "Chapter 1 Introduction 1.1 History 1.2 Reviews and tutorials 1.3 Trends in Metabolomics 1.4 Workflow", " Chapter 1 Introduction Information in living organism communicates along the Central Dogma in different scales from individual, population, community to ecosystem. Metabolomics (i.e., the profiling and quantification of metabolites) is a relatively new field of “omics” studies. Different from other omics studies, metabolomics always focused on small molecular (molecular weight below 1500 Da) with much lower mass than polypeptide with single or doubled charged ions. Here is a demo of the position of metabolomics in “omics” studies[@b.dunn2011]. Figure 1.1: The complex interactions of functional levels in biological systems. Metabolomics studies always employ GC-MS(Theodoridis et al. 2012; Beale et al. 2018), GC*GC-MS(T.-F. Tian et al. 2016), LC-MS(Gika et al. 2014), LC-MS/MS(Begou et al. 2017), IM-MS(Levy et al. 2019), infrared ion spectroscopy(Martens et al. 2017) or NMR[@b.dunn2011] to measure metabolites. For analytical methods, this review could be checked(A. Zhang et al. 2012). The overall technique progress of metabolomics (2012-2018) could be found here(Miggiels et al. 2019). 
However, this workflow will only cover mass spectrometry based metabolomics or XC-MS based research. 1.1 History 1.1.1 History of Mass Spectrometry Here is a historical commentary for mass spectrometry(Yates Iii 2011). In details, here is a summary: 1913, Sir Joseph John Thomson “Rays of Positive Electricity and Their Application to Chemical Analyses.” Figure 1.2: Sir Joseph John Thomson “Rays of Positive Electricity and Their Application to Chemical Analyses.” Petroleum industry bring mass spectrometry from physics to chemistry The first commercial mass spectrometer is from Consolidated Engineering Corp to analysis simple gas mixtures from petroleum In World War II, U.S. use mass spectrometer to separate and enrich isotopes of uranium in Manhattan Project U.S. also use mass spectrometer for organic compounds during wartime and extend the application of mass spectrometer 1946, TOF, William E. Stephens 1970s, quadrupole mass analyzer 1970s, R. Graham Cooks developed mass-analyzed ion kinetic energy spectrometry, or MIKES to make MRM analysis for multi-stage mass sepctrometry 1980s, MALDI rescue TOF and mass spectrometry move into biological application 1990s, Orbitrap mass spectrometry 2010s, Aperture Coding mass spectrometry 1.1.2 History of Metabolomcis You could check this report(Baker 2011). 
According to this book section(Kusonmano, Vongsangnak, and Chumnanpuen 2016): Figure 1.3: Metabolomics timeline during pre- and post-metabolomics era 2000-1500 BC some traditional Chinese doctors who began to evaluate the glucose level in urine of diabetic patients using ants 300 BC ancient Egypt and Greece that traditionally determine the urine taste to diagnose human diseases 1913 Joseph John Thomson and Francis William Aston mass spectrometry 1946 Felix Bloch and Edward Purcell Nuclear magnetic resonance late 1960s chromatographic separation technique 1971 Pauling’s research team “Quantitative Analysis of Urine Vapor and Breath by Gas–Liquid Partition Chromatography” Willmitzer and his research team pioneer group in metabolomics which suggested the promotion of the metabolomics field and its potential applications from agriculture to medicine and other related areas in the biological sciences 2007 Human Metabolome Project consists of databases of approximately 2500 metabolites, 1200 drugs, and 3500 food components post-metabolomics era high-throughput analytical techniques 1.1.3 Defination Metabolomics is actually a comprehensive analysis with identification and quantification of both known and unknown compounds in an unbiased way. Metabolic fingerprinting is working on fast classification of samples based on metabolite data without quantifying or identification of the metabolites. Metabolite profiling always need a pre-defined metabolites list to be quantification(Madsen, Lundstedt, and Trygg 2010). Meanwhile, targeted and untargeted metabolomics are also used in publications. For targeted metabolomics, the majority of the molecules within a biological pathway or a defined group of related metabolites are determined. Sometimes broad collection of known metabolites could also be referred as targeted analysis. Untargeted analysis detect all of possible metabolites unbiased in the samples of interest. 
A similar concept called non-targeted analysis/screen is actually describe the similar studies or workflow. 1.2 Reviews and tutorials Some nice reviews and tutorials related to this workflow could be found in those papers or directly online: 1.2.1 Workflow Those papers are recommended(González-Riano et al. 2020; Pezzatti et al. 2020; X. Liu et al. 2019; Barnes et al. 2016a; Cajka and Fiehn 2016; Gika et al. 2014; Theodoridis et al. 2012; X. Lu and Xu 2008; Fiehn 2002) for general metabolomics related topics. For targeted metabolomics, you could check those reviews(Griffiths et al. 2010; W. Lu, Bennett, and Rabinowitz 2008; Weljie et al. 2006; Yuan et al. 2012; J. Zhou and Yin 2016; Begou et al. 2017). 1.2.2 Data analysis You could firstly read those papers(Barnes et al. 2016b; Kusonmano, Vongsangnak, and Chumnanpuen 2016; Madsen, Lundstedt, and Trygg 2010; Uppal et al. 2016; Alonso, Marsal, and Julià 2015) to get the concepts and issues for data analysis in metabolomics. Then this paper(Gromski et al. 2015) could be treated as a step-by-step tutorial. For GC-MS based metabolomics, check this paper(Rey-Stolle et al. 2022). A guide could be used choose a inofrmatics software and tools for lipidomics(Z. Ni et al. 2022). For annotation, this paper(Domingo-Almenara, Montenegro-Burke, Benton, et al. 2018) is a well organized review. For database used in metabolomics, you could check this review(Vinaixa et al. 2016). For metabolomics software, check this series of reviews for each year(Misra and van der Hooft 2016; Misra, Fahrmann, and Grapov 2017; Misra 2018). For open sourced software, those reviews(Chang et al. 2021; Spicer et al. 2017; Dryden et al. 2017) could be a good start. For DIA or DDA metabolomics, check those papers(Fenaille et al. 2017; Bilbao et al. 2015). Here is the slides for metabolomics data analysis workshop and I have made presentations twice in UWaterloo and UC Irvine. 
Introduction Statistical Analysis Batch Correction Annotation 1.2.3 Application For environmental research related metabolomics or exposome, check those papers(Matich et al. 2019; Tang et al. 2020; Warth et al. 2017; Bundy, Davey, and Viant 2009). For toxicology, check this paper(Mark R. Viant et al. 2019). Check this piece(Wishart 2016) for drug discovery and precision medicine. For food chemistry, check this paper(Castro-Puyana et al. 2017), this paper for livestock(Goldansaz et al. 2017) and those papers for nutrition(Allam-Ndoul et al. 2016; Jones, Park, and Ziegler 2012; Müller and Bosy-Westphal 2020). For disease related metabolomics such as oncology(Spratlin, Serkova, and Eckhardt 2009), Cardiovascular(Cheng et al. 2017) . This paper(Kennedy et al. 2018) cover the metabolomics realted clinic research. For plant science, check those paper(Lloyd W. Sumner, Mendes, and Dixon 2003; Jorge, Mata, and António 2016; Hansen and Lee 2018). For single cell metabolomics analysis, check here(Fessenden 2016; Zenobi 2013; Ali et al. 2019; Hansen and Lee 2018). For gut microbiota, check here(Smirnov et al. 2016). 1.2.4 Challenge General challenge for metabolomics studies could be found here (Schymanski and Williams 2017; Uppal et al. 2016; Schrimpe-Rutledge et al. 2016; Wolfender et al. 2015). For reproducible research, check those papers (Xinsong Du et al. 2022; Place et al. 2021; Verhoeven, Giera, and Mayboroda 2020; Mangul et al. 2019; Wallach, Boyack, and Ioannidis 2018; Hites and Jobst 2018; Considine et al. 2017; Sarpe and Schriemer 2017). To match data from different LC system, M2S could be used(Climaco Pinto et al. 2022). Quantitative Metabolomics related issues could be found here(Kapoore and Vaidyanathan 2016; Jorge, Mata, and António 2016; Lv et al. 2022; Vitale et al. 2022). For quality control issues, check here(Dudzik et al. 2018; Siskos et al. 2017; Lloyd W. Sumner et al. 2007; Place et al. 2021; Corey D. Broeckling et al. 2023; González-Domínguez et al. 
2024). You might also try postcolumn infusion as a quality control tool(González, Dubbelman, and Hankemeier 2022). 1.3 Trends in Metabolomics library(rentrez) papers_by_year &lt;- function(years, search_term){ return(sapply(years, function(y) entrez_search(db=&quot;pubmed&quot;,term=search_term, mindate=y, maxdate=y, retmax=0)$count)) } years &lt;- 2002:2022 total_papers &lt;- papers_by_year(years, &quot;&quot;) omics &lt;- c(&quot;genomics&quot;, &quot;epigenomics&quot;, &quot;metagenomic&quot;, &quot;proteomics&quot;, &quot;transcriptomics&quot;,&quot;metabolomics&quot;,&quot;exposomics&quot;) trend_data &lt;- sapply(omics, function(t) papers_by_year(years, t)) trend_props &lt;- trend_data/total_papers library(reshape) library(ggplot2) trend_df &lt;- melt(data.frame(years, trend_data), id.vars=&quot;years&quot;) p &lt;- ggplot(trend_df, aes(years, value, colour=variable)) p + geom_line(size=1) + scale_y_log10(&quot;number of papers&quot;) + theme_bw() 1.4 Workflow References Ali, Ahmed, Yasmine Abouleila, Yoshihiro Shimizu, Eiso Hiyama, Samy Emara, Alireza Mashaghi, and Thomas Hankemeier. 2019. “Single-Cell Metabolomics by Mass Spectrometry: Advances, Challenges, and Future Applications.” TrAC Trends in Analytical Chemistry 120 (November): 115436. https://doi.org/10.1016/j.trac.2019.02.033. Allam-Ndoul, Bénédicte, Frédéric Guénard, Véronique Garneau, Hubert Cormier, Olivier Barbier, Louis Pérusse, and Marie-Claude Vohl. 2016. “Association Between Metabolite Profiles, Metabolic Syndrome and Obesity Status.” Nutrients 8 (6): 324. https://doi.org/10.3390/nu8060324. Alonso, Arnald, Sara Marsal, and Antonio Julià. 2015. “Analytical Methods in Untargeted Metabolomics: State of the Art in 2015.” Frontiers in Bioengineering and Biotechnology 3 (March). https://doi.org/10.3389/fbioe.2015.00023. Baker, Monya. 2011. “Metabolomics: From Small Molecules to Big Ideas.” Nature Methods 8 (2): 117–21. https://doi.org/10.1038/nmeth0211-117. Barnes, Stephen, H. 
Paul Benton, Krista Casazza, Sara J. Cooper, Xiangqin Cui, Xiuxia Du, Jeffrey Engler, et al. 2016a. “Training in Metabolomics Research. I. Designing the Experiment, Collecting and Extracting Samples and Generating Metabolomics Data.” Journal of Mass Spectrometry 51 (7): 461–75. https://doi.org/10.1002/jms.3782. ———, et al. 2016b. “Training in Metabolomics Research. II. Processing and Statistical Analysis of Metabolomics Data, Metabolite Identification, Pathway Analysis, Applications of Metabolomics and Its Future.” Journal of Mass Spectrometry 51 (8): 535–48. https://doi.org/10.1002/jms.3780. Beale, David J., Farhana R. Pinu, Konstantinos A. Kouremenos, Mahesha M. Poojary, Vinod K. Narayana, Berin A. Boughton, Komal Kanojia, Saravanan Dayalan, Oliver A. H. Jones, and Daniel A. Dias. 2018. “Review of Recent Developments in GC–MS Approaches to Metabolomics-Based Research.” Metabolomics 14 (11): 152. https://doi.org/10.1007/s11306-018-1449-2. Begou, O., H. G. Gika, I. D. Wilson, and G. Theodoridis. 2017. “Hyphenated MS-based Targeted Approaches in Metabolomics.” Analyst 142 (17): 3079–3100. https://doi.org/10.1039/C7AN00812K. Bilbao, Aivett, Emmanuel Varesio, Jeremy Luban, Caterina Strambio-De-Castillia, Gérard Hopfgartner, Markus Müller, and Frédérique Lisacek. 2015. “Processing Strategies and Software Solutions for Data-Independent Acquisition in Mass Spectrometry.” PROTEOMICS 15 (5-6): 964–80. https://doi.org/10.1002/pmic.201400323. Broeckling, Corey D., Richard D. Beger, Leo L. Cheng, Raquel Cumeras, Daniel J. Cuthbertson, Surendra Dasari, W. Clay Davis, et al. 2023. “Current Practices in LC-MS Untargeted Metabolomics: A Scoping Review on the Use of Pooled Quality Control Samples.” Analytical Chemistry 95 (51): 18645–54. https://doi.org/10.1021/acs.analchem.3c02924. Bundy, Jacob G., Matthew P. Davey, and Mark R. Viant. 2009. “Environmental Metabolomics: A Critical Review and Future Perspectives.” Metabolomics 5 (1): 3. https://doi.org/10.1007/s11306-008-0152-0. 
Cajka, Tomas, and Oliver Fiehn. 2016. “Toward Merging Untargeted and Targeted Methods in Mass Spectrometry-Based Metabolomics and Lipidomics.” Analytical Chemistry 88 (1): 524–45. https://doi.org/10.1021/acs.analchem.5b04491. Castro-Puyana, María, Raquel Pérez-Míguez, Lidia Montero, and Miguel Herrero. 2017. “Application of Mass Spectrometry-Based Metabolomics Approaches for Food Safety, Quality and Traceability.” TrAC Trends in Analytical Chemistry 93 (August): 102–18. https://doi.org/10.1016/j.trac.2017.05.004. Chang, Hui-Yin, Sean M. Colby, Xiuxia Du, Javier D. Gomez, Maximilian J. Helf, Katerina Kechris, Christine R. Kirkpatrick, et al. 2021. “A Practical Guide to Metabolomics Software Development.” Analytical Chemistry 93 (4): 1912–23. https://doi.org/10.1021/acs.analchem.0c03581. Cheng, Susan, Svati H. Shah, Elizabeth J. Corwin, Oliver Fiehn, Robert L. Fitzgerald, Robert E. Gerszten, Thomas Illig, et al. 2017. “Potential Impact and Study Considerations of Metabolomics in Cardiovascular Health and Disease: A Scientific Statement From the American Heart Association.” Circulation: Cardiovascular Genetics 10 (2): e000032. https://doi.org/10.1161/HCG.0000000000000032. Climaco Pinto, Rui, Ibrahim Karaman, Matthew R. Lewis, Jenny Hällqvist, Manuja Kaluarachchi, Gonçalo Graça, Elena Chekmeneva, et al. 2022. “Finding Correspondence Between Metabolomic Features in Untargeted Liquid Chromatography–Mass Spectrometry Metabolomics Datasets.” Analytical Chemistry 94 (14): 5493–503. https://doi.org/10.1021/acs.analchem.1c03592. Considine, E. C., G. Thomas, A. L. Boulesteix, A. S. Khashan, and L. C. Kenny. 2017. “Critical Review of Reporting of the Data Analysis Step in Metabolomics.” Metabolomics 14 (1): 7. https://doi.org/10.1007/s11306-017-1299-3. Domingo-Almenara, Xavier, J. Rafael Montenegro-Burke, H. Paul Benton, and Gary Siuzdak. 2018. “Annotation: A Computational Solution for Streamlining Metabolomics Analysis.” Analytical Chemistry 90 (1): 480–89. 
https://doi.org/10.1021/acs.analchem.7b03929. Dryden, Michael D. M., Ryan Fobel, Christian Fobel, and Aaron R. Wheeler. 2017. “Upon the Shoulders of Giants: Open-Source Hardware and Software in Analytical Chemistry.” Analytical Chemistry 89 (8): 4330–38. https://doi.org/10.1021/acs.analchem.7b00485. Du, Xinsong, Juan J. Aristizabal-Henao, Timothy J. Garrett, Mathias Brochhausen, William R. Hogan, and Dominick J. Lemas. 2022. “A Checklist for Reproducible Computational Analysis in Clinical Metabolomics Research.” Metabolites 12 (1): 87. https://doi.org/10.3390/metabo12010087. Dudzik, Danuta, Cecilia Barbas-Bernardos, Antonia García, and Coral Barbas. 2018. “Quality Assurance Procedures for Mass Spectrometry Untargeted Metabolomics. A Review.” Journal of Pharmaceutical and Biomedical Analysis, Review issue 2017, 147 (January): 149–73. https://doi.org/10.1016/j.jpba.2017.07.044. Fenaille, François, Pierre Barbier Saint-Hilaire, Kathleen Rousseau, and Christophe Junot. 2017. “Data Acquisition Workflows in Liquid Chromatography Coupled to High Resolution Mass Spectrometry-Based Metabolomics: Where Do We Stand?” Journal of Chromatography A 1526 (Supplement C): 1–12. https://doi.org/10.1016/j.chroma.2017.10.043. Fessenden, Marissa. 2016. “Metabolomics: Small Molecules, Single Cells.” Nature 540 (7631): 153–55. https://doi.org/10.1038/540153a. Fiehn, Oliver. 2002. “Metabolomics – the Link Between Genotypes and Phenotypes.” Plant Molecular Biology 48 (1): 155–71. https://doi.org/10.1023/A:1013713905833. Gika, Helen G., Georgios A. Theodoridis, Robert S. Plumb, and Ian D. Wilson. 2014. “Current Practice of Liquid Chromatography–Mass Spectrometry in Metabolomics and Metabonomics.” Journal of Pharmaceutical and Biomedical Analysis, Review Papers on Pharmaceutical and Biomedical Analysis 2013, 87 (January): 12–25. https://doi.org/10.1016/j.jpba.2013.06.032. Goldansaz, Seyed Ali, An Chi Guo, Tanvir Sajed, Michael A. Steele, Graham S. Plastow, and David S. Wishart. 2017. 
“Livestock Metabolomics and the Livestock Metabolome: A Systematic Review.” PLOS ONE 12 (5): e0177675. https://doi.org/10.1371/journal.pone.0177675. González, Oskar, Anne-Charlotte Dubbelman, and Thomas Hankemeier. 2022. “Postcolumn Infusion as a Quality Control Tool for LC-MS-Based Analysis.” Postcolumn Infusion as a Quality Control Tool for LC-MS-Based Analysis, April. https://doi.org/10.1021/jasms.2c00022. González-Domínguez, Álvaro, Núria Estanyol-Torres, Carl Brunius, Rikard Landberg, and Raúl González-Domínguez. 2024. “QComics: Recommendations and Guidelines for Robust, Easily Implementable and Reportable Quality Control of Metabolomics Data.” Analytical Chemistry 96 (3): 1064–72. https://doi.org/10.1021/acs.analchem.3c03660. González-Riano, Carolina, Danuta Dudzik, Antonia Garcia, Alberto Gil-de-la-Fuente, Ana Gradillas, Joanna Godzien, Ángeles López-Gonzálvez, et al. 2020. “Recent Developments Along the Analytical Process for Metabolomics Workflows.” Analytical Chemistry 92 (1): 203–26. https://doi.org/10.1021/acs.analchem.9b04553. Griffiths, William J., Therese Koal, Yuqin Wang, Matthias Kohl, David P. Enot, and Hans-Peter Deigner. 2010. “Targeted Metabolomics for Biomarker Discovery.” Angewandte Chemie International Edition 49 (32): 5426–45. https://doi.org/10.1002/anie.200905579. Gromski, Piotr S., Howbeer Muhamadali, David I. Ellis, Yun Xu, Elon Correa, Michael L. Turner, and Royston Goodacre. 2015. “A Tutorial Review: Metabolomics and Partial Least Squares-Discriminant Analysis – a Marriage of Convenience or a Shotgun Wedding.” Analytica Chimica Acta 879 (June): 10–23. https://doi.org/10.1016/j.aca.2015.02.012. Hansen, Rebecca L., and Young Jin Lee. 2018. “High-Spatial Resolution Mass Spectrometry Imaging: Toward Single Cell Metabolomics in Plant Tissues.” The Chemical Record 18 (1): 65–77. https://doi.org/10.1002/tcr.201700027. Hites, Ronald A., and Karl J. Jobst. 2018. 
“Is Nontargeted Screening Reproducible?” Environmental Science &amp; Technology 52 (21): 11975–76. https://doi.org/10.1021/acs.est.8b05671. Jones, Dean P., Youngja Park, and Thomas R. Ziegler. 2012. “Nutritional Metabolomics: Progress in Addressing Complexity in Diet and Health.” Annual Review of Nutrition 32 (1): 183–202. https://doi.org/10.1146/annurev-nutr-072610-145159. Jorge, Tiago F., Ana T. Mata, and Carla António. 2016. “Mass Spectrometry as a Quantitative Tool in Plant Metabolomics.” Phil. Trans. R. Soc. A 374 (2079): 20150370. https://doi.org/10.1098/rsta.2015.0370. Kapoore, Rahul Vijay, and Seetharaman Vaidyanathan. 2016. “Towards Quantitative Mass Spectrometry-Based Metabolomics in Microbial and Mammalian Systems.” Phil. Trans. R. Soc. A 374 (2079): 20150363. https://doi.org/10.1098/rsta.2015.0363. Kennedy, Adam D., Bryan M. Wittmann, Anne M. Evans, Luke A. D. Miller, Douglas R. Toal, Shaun Lonergan, Sarah H. Elsea, and Kirk L. Pappan. 2018. “Metabolomics in the Clinic: A Review of the Shared and Unique Features of Untargeted Metabolomics for Clinical Research and Clinical Testing.” Journal of Mass Spectrometry 53 (11): 1143–54. https://doi.org/10.1002/jms.4292. Kusonmano, Kanthida, Wanwipa Vongsangnak, and Pramote Chumnanpuen. 2016. “Informatics for Metabolomics.” In Translational Biomedical Informatics, 91–115. Advances in Experimental Medicine and Biology. Springer, Singapore. https://doi.org/10.1007/978-981-10-1503-8_5. Levy, Allison J., Nicholas R. Oranzi, Atiye Ahmadireskety, Robin H. J. Kemperman, Michael S. Wei, and Richard A. Yost. 2019. “Recent Progress in Metabolomics Using Ion Mobility-Mass Spectrometry.” TrAC Trends in Analytical Chemistry 116 (July): 274–81. https://doi.org/10.1016/j.trac.2019.05.001. Liu, Xinyu, Lina Zhou, Xianzhe Shi, and Guowang Xu. 2019. “New Advances in Analytical Methods for Mass Spectrometry-Based Large-Scale Metabolomics Study.” TrAC Trends in Analytical Chemistry 121 (December): 115665. 
https://doi.org/10.1016/j.trac.2019.115665. Lu, Wenyun, Bryson D. Bennett, and Joshua D. Rabinowitz. 2008. “Analytical Strategies for LC–MS-based Targeted Metabolomics.” Journal of Chromatography B, Hyphenated Techniques for Global Metabolite Profiling, 871 (2): 236–42. https://doi.org/10.1016/j.jchromb.2008.04.031. Lu, Xin, and Guowang Xu. 2008. “LC-MS Metabonomics Methodology in Biomarker Discovery.” In Biomarker Methods in Drug Discovery and Development, edited by Feng Wang, 291–315. Methods in Pharmacology and Toxicology™. Humana Press. https://doi.org/10.1007/978-1-59745-463-6_14. Lv, Wangjie, Zhongda Zeng, Yuqing Zhang, Qingqing Wang, Lichao Wang, Zhaoxuan Zhang, Xianzhe Shi, Xinjie Zhao, and Guowang Xu. 2022. “Comprehensive Metabolite Quantitative Assay Based on Alternate Metabolomics and Lipidomics Analyses.” Analytica Chimica Acta 1215 (July): 339979. https://doi.org/10.1016/j.aca.2022.339979. Madsen, Rasmus, Torbjörn Lundstedt, and Johan Trygg. 2010. “Chemometrics in Metabolomics—A Review in Human Disease Diagnosis.” Analytica Chimica Acta 659 (1): 23–33. https://doi.org/10.1016/j.aca.2009.11.042. Mangul, Serghei, Thiago Mosqueiro, Richard J. Abdill, Dat Duong, Keith Mitchell, Varuni Sarwal, Brian Hill, et al. 2019. “Challenges and Recommendations to Improve the Installability and Archival Stability of Omics Computational Tools.” PLOS Biology 17 (6): e3000333. https://doi.org/10.1371/journal.pbio.3000333. Martens, Jonathan, Giel Berden, Rianne E. van Outersterp, Leo A. J. Kluijtmans, Udo F. Engelke, Clara D. M. van Karnebeek, Ron A. Wevers, and Jos Oomens. 2017. “Molecular Identification in Metabolomics Using Infrared Ion Spectroscopy.” Scientific Reports 7 (June). https://doi.org/10.1038/s41598-017-03387-4. Matich, Eryn K., Nita G. Chavez Soria, Diana S. Aga, and G. Ekin Atilla-Gokcumen. 2019. 
“Applications of Metabolomics in Assessing Ecological Effects of Emerging Contaminants and Pollutants on Plants.” Journal of Hazardous Materials 373 (July): 527–35. https://doi.org/10.1016/j.jhazmat.2019.02.084. Miggiels, Paul, Bert Wouters, Gerard J. P. van Westen, Anne-Charlotte Dubbelman, and Thomas Hankemeier. 2019. “Novel Technologies for Metabolomics: More for Less.” TrAC Trends in Analytical Chemistry 120 (November): 115323. https://doi.org/10.1016/j.trac.2018.11.021. Misra, Biswapriya B. 2018. “New Tools and Resources in Metabolomics: 2016–2017.” ELECTROPHORESIS 39 (7): 909–23. https://doi.org/10.1002/elps.201700441. Misra, Biswapriya B., Johannes F. Fahrmann, and Dmitry Grapov. 2017. “Review of Emerging Metabolomic Tools and Resources: 2015–2016.” ELECTROPHORESIS 38 (18): 2257–74. https://doi.org/10.1002/elps.201700110. Misra, Biswapriya B., and Justin J. J. van der Hooft. 2016. “Updates in Metabolomics Tools and Resources: 2014–2015.” ELECTROPHORESIS 37 (1): 86–110. https://doi.org/10.1002/elps.201500417. Müller, Manfred J., and Anja Bosy-Westphal. 2020. “From a ‘Metabolomics Fashion’ to a Sound Application of Metabolomics in Research on Human Nutrition.” European Journal of Clinical Nutrition 74 (12): 1619–29. https://doi.org/10.1038/s41430-020-00781-6. Ni, Zhixu, Michele Wölk, Geoff Jukes, Karla Mendivelso Espinosa, Robert Ahrends, Lucila Aimo, Jorge Alvarez-Jarreta, et al. 2022. “Guiding the Choice of Informatics Software and Tools for Lipidomics Research Applications.” Nature Methods, December, 1–12. https://doi.org/10.1038/s41592-022-01710-0. Pezzatti, Julian, Julien Boccard, Santiago Codesido, Yoric Gagnebin, Abhinav Joshi, Didier Picard, Víctor González-Ruiz, and Serge Rudaz. 2020. “Implementation of Liquid Chromatography–High Resolution Mass Spectrometry Methods for Untargeted Metabolomic Analyses of Biological Samples: A Tutorial.” Analytica Chimica Acta 1105 (April): 28–44. https://doi.org/10.1016/j.aca.2019.12.062. Place, Benjamin J., Elin M. 
Ulrich, Jonathan K. Challis, Alex Chao, Bowen Du, Kristin Favela, Yong-Lai Feng, et al. 2021. “An Introduction to the Benchmarking and Publications for Non-Targeted Analysis Working Group.” Analytical Chemistry 93 (49): 16289–96. https://doi.org/10.1021/acs.analchem.1c02660. Rey-Stolle, Fernanda, Danuta Dudzik, Carolina Gonzalez-Riano, Miguel Fernández-García, Vanesa Alonso-Herranz, David Rojo, Coral Barbas, and Antonia García. 2022. “Low and High Resolution Gas Chromatography-Mass Spectrometry for Untargeted Metabolomics: A Tutorial.” Analytica Chimica Acta 1210 (June): 339043. https://doi.org/10.1016/j.aca.2021.339043. Sarpe, Vladimir, and David C Schriemer. 2017. “Supporting Metabolomics with Adaptable Software: Design Architectures for the End-User.” Current Opinion in Biotechnology, Analytical biotechnology, 43 (February): 110–17. https://doi.org/10.1016/j.copbio.2016.11.001. Schrimpe-Rutledge, Alexandra C., Simona G. Codreanu, Stacy D. Sherrod, and John A. McLean. 2016. “Untargeted Metabolomics Strategies—Challenges and Emerging Directions.” Journal of The American Society for Mass Spectrometry 27 (12): 1897–1905. https://doi.org/10.1007/s13361-016-1469-y. Schymanski, Emma L., and Antony J. Williams. 2017. “Open Science for Identifying ‘Known Unknown’ Chemicals.” Environmental Science &amp; Technology 51 (10): 5357–59. https://doi.org/10.1021/acs.est.7b01908. Siskos, Alexandros P., Pooja Jain, Werner Römisch-Margl, Mark Bennett, David Achaintre, Yasmin Asad, Luke Marney, et al. 2017. “Interlaboratory Reproducibility of a Targeted Metabolomics Platform for Analysis of Human Serum and Plasma.” Analytical Chemistry 89 (1): 656–65. https://doi.org/10.1021/acs.analchem.6b02930. Smirnov, Kirill S., Tanja V. Maier, Alesia Walker, Silke S. Heinzmann, Sara Forcisi, Inés Martinez, Jens Walter, and Philippe Schmitt-Kopplin. 2016. 
“Challenges of Metabolomics in Human Gut Microbiota Research.” International Journal of Medical Microbiology, Intestinal microbiota - a microbial ecosystem at the edge between immune homeostasis and inflammation, 306 (5): 266–79. https://doi.org/10.1016/j.ijmm.2016.03.006. Spicer, Rachel, Reza M. Salek, Pablo Moreno, Daniel Cañueto, and Christoph Steinbeck. 2017. “Navigating Freely-Available Software Tools for Metabolomics Analysis.” Metabolomics 13 (9). https://doi.org/10.1007/s11306-017-1242-7. Spratlin, Jennifer L., Natalie J. Serkova, and S. Gail Eckhardt. 2009. “Clinical Applications of Metabolomics in Oncology: A Review.” Clinical Cancer Research 15 (2): 431–40. https://doi.org/10.1158/1078-0432.CCR-08-1059. Sumner, Lloyd W., Alexander Amberg, Dave Barrett, Michael H. Beale, Richard Beger, Clare A. Daykin, Teresa W.-M. Fan, et al. 2007. “Proposed Minimum Reporting Standards for Chemical Analysis Chemical Analysis Working Group (CAWG) Metabolomics Standards Initiative (MSI).” Metabolomics : Official Journal of the Metabolomic Society 3 (3): 211–21. https://doi.org/10.1007/s11306-007-0082-2. Sumner, Lloyd W, Pedro Mendes, and Richard A Dixon. 2003. “Plant Metabolomics: Large-Scale Phytochemistry in the Functional Genomics Era.” Phytochemistry, Plant Metabolomics, 62 (6): 817–36. https://doi.org/10.1016/S0031-9422(02)00708-2. Tang, Yanan, Caley B. Craven, Nicholas J. P. Wawryk, Junlang Qiu, Feng Li, and Xing-Fang Li. 2020. “Advances in Mass Spectrometry-Based Omics Analysis of Trace Organics in Water.” TrAC Trends in Analytical Chemistry 128 (July): 115918. https://doi.org/10.1016/j.trac.2020.115918. Theodoridis, Georgios A., Helen G. Gika, Elizabeth J. Want, and Ian D. Wilson. 2012. “Liquid Chromatography–Mass Spectrometry Based Global Metabolite Profiling: A Review.” Analytica Chimica Acta 711 (January): 7–16. https://doi.org/10.1016/j.aca.2011.09.042. 
Tian, Tze-Feng, San-Yuan Wang, Tien-Chueh Kuo, Cheng-En Tan, Guan-Yuan Chen, Ching-Hua Kuo, Chi-Hsin Sally Chen, Chang-Chuan Chan, Olivia A. Lin, and Y. Jane Tseng. 2016. “Web Server for Peak Detection, Baseline Correction, and Alignment in Two-Dimensional Gas Chromatography Mass Spectrometry-Based Metabolomics Data.” Analytical Chemistry 88 (21): 10395–403. https://doi.org/10.1021/acs.analchem.6b00755. Uppal, Karan, Douglas I. Walker, Ken Liu, Shuzhao Li, Young-Mi Go, and Dean P. Jones. 2016. “Computational Metabolomics: A Framework for the Million Metabolome.” Chemical Research in Toxicology 29 (12): 1956–75. https://doi.org/10.1021/acs.chemrestox.6b00179. Verhoeven, Aswin, Martin Giera, and Oleg A. Mayboroda. 2020. “Scientific Workflow Managers in Metabolomics: An Overview.” Analyst 145 (11): 3801–8. https://doi.org/10.1039/D0AN00272K. Viant, Mark R., Timothy M. D. Ebbels, Richard D. Beger, Drew R. Ekman, David J. T. Epps, Hennicke Kamp, Pim E. G. Leonards, et al. 2019. “Use Cases, Best Practice and Reporting Standards for Metabolomics in Regulatory Toxicology.” Nature Communications 10 (1): 3041. https://doi.org/10.1038/s41467-019-10900-y. Vinaixa, Maria, Emma L. Schymanski, Steffen Neumann, Miriam Navarro, Reza M. Salek, and Oscar Yanes. 2016. “Mass Spectral Databases for LC/MS- and GC/MS-based Metabolomics: State of the Field and Future Prospects.” TrAC Trends in Analytical Chemistry 78 (April): 23–35. https://doi.org/10.1016/j.trac.2015.09.005. Vitale, Chiara Maria, Arjen Lommen, Carolin Huber, Kevin Wagner, Borja Garlito Molina, Rosalie Nijssen, Elliott James Price, et al. 2022. “Harmonized Quality Assurance/Quality Control Provisions for Nontargeted Measurement of Urinary Pesticide Biomarkers in the HBM4EU Multisite SPECIMEn Study.” Analytical Chemistry 94 (22): 7833–43. https://doi.org/10.1021/acs.analchem.2c00061. Wallach, Joshua D., Kevin W. Boyack, and John P. A. Ioannidis. 2018. 
“Reproducible Research Practices, Transparency, and Open Access Data in the Biomedical Literature, 2015–2017.” PLOS Biology 16 (11): e2006930. https://doi.org/10.1371/journal.pbio.2006930. Warth, Benedikt, Scott Spangler, Mingliang Fang, Caroline H. Johnson, Erica M. Forsberg, Ana Granados, Richard L. Martin, et al. 2017. “Exposome-Scale Investigations Guided by Global Metabolomics, Pathway Analysis, and Cognitive Computing.” Analytical Chemistry 89 (21): 11505–13. https://doi.org/10.1021/acs.analchem.7b02759. Weljie, Aalim M., Jack Newton, Pascal Mercier, Erin Carlson, and Carolyn M. Slupsky. 2006. “Targeted Profiling:  Quantitative Analysis of 1H NMR Metabolomics Data.” Analytical Chemistry 78 (13): 4430–42. https://doi.org/10.1021/ac060209g. Wishart, David S. 2016. “Emerging Applications of Metabolomics in Drug Discovery and Precision Medicine.” Nature Reviews Drug Discovery 15 (7): 473–84. https://doi.org/10.1038/nrd.2016.32. Wolfender, Jean-Luc, Guillaume Marti, Aurélien Thomas, and Samuel Bertrand. 2015. “Current Approaches and Challenges for the Metabolite Profiling of Complex Natural Extracts.” Journal of Chromatography A, Editors’ Choice IX, 1382 (February): 136–64. https://doi.org/10.1016/j.chroma.2014.10.091. Yates Iii, John R. 2011. “A Century of Mass Spectrometry: From Atoms to Proteomes.” Nature Methods 8 (8): 633–37. https://doi.org/10.1038/nmeth.1659. Yuan, Min, Susanne B. Breitkopf, Xuemei Yang, and John M. Asara. 2012. “A Positive/Negative Ion–Switching, Targeted Mass Spectrometry–Based Metabolomics Platform for Bodily Fluids, Cells, and Fresh and Fixed Tissue.” Nature Protocols 7 (5): 872–81. https://doi.org/10.1038/nprot.2012.024. Zenobi, R. 2013. “Single-Cell Metabolomics: Analytical and Biological Perspectives.” Science 342 (6163): 1243259. https://doi.org/10.1126/science.1243259. Zhang, Aihua, Hui Sun, Ping Wang, Ying Han, and Xijun Wang. 2012. “Modern Analytical Techniques in Metabolomics Analysis.” The Analyst 137 (2): 293–300. 
https://doi.org/10.1039/C1AN15605E. Zhou, Juntuo, and Yuxin Yin. 2016. “Strategies for Large-Scale Targeted Metabolomics Quantification by Liquid Chromatography-Mass Spectrometry.” Analyst 141 (23): 6362–73. https://doi.org/10.1039/C6AN01753C. "],["experimental-designdoe.html", "Chapter 2 Experimental design(DoE) 2.1 Homogeneity study 2.2 Heterogeneity study 2.3 Power analysis 2.4 Optimization 2.5 Pooled QC", " Chapter 2 Experimental design(DoE) Before you perform any metabolomics experiment, a clean and meaningful experimental design is the best start. Depending on the research purpose, experimental designs can be classified into homogeneity and heterogeneity studies. Techniques such as isotope-labeled media are not discussed in this chapter; this paper (Jang, Chen, and Rabinowitz 2018) is a good starting point. 2.1 Homogeneity study In a homogeneity study, the research purpose is method validation in most cases. A pooled sample made from multiple samples, or technical replicates from the same population, is used. Variance within the samples should then be attributable to factors other than the samples themselves. For example, if we want to know whether sample injection order affects the intensities of unknown peaks, one pooled sample or technical replicates should be used. Another experimental design for homogeneity studies uses biological replicates to find the common features of a group of samples. Biological replicates are samples from the same population undergoing the same biological process. For example, if we want to know the metabolite profile of a certain species, we can collect many individual samples from the population. Then only the peaks/compounds that appear in all samples are used to describe the metabolite profile of this species. Technical replicates can also be used together with biological replicates. 2.2 Heterogeneity study In a heterogeneity study, the research purpose is to find the differences among samples. 
You need at least a baseline to perform the comparison. Such a baseline can be generated by a random process, control samples, or background knowledge. For example, outlier detection can be performed to find abnormal samples in an unsupervised manner. Distribution or spatial analysis can be used to find geographical relationships of known and unknown compounds. Temporal trends in metabolite profiles can be found by time series or cohort studies. Clinical trials or randomized controlled trials are also an important class of heterogeneity studies. In these cases, you need at least two groups: a treated group and a control group. You can also treat this group information as the primary variable, or as one of several primary variables, to be explored for a given research purpose. In the following discussion about experimental design, we will use the randomized controlled trial as a model to discuss important issues. 2.3 Power analysis Supposing we have control and treated groups, the number of samples in each group should be carefully calculated. For each metabolite, such a comparison can be treated as one t-test. You need to perform a power analysis to get the numbers. For example, suppose we have two groups with 10 samples in each group. We set the power at 0.9 (power is one minus the Type II error probability), the standard deviation at 1, and the significance level (Type I error probability) at 0.05. Under this experimental design, the difference (delta) between the two groups must be at least 1.53367 to be detected. Alternatively, we can fix the delta to get the minimum number of samples in each group. To get values such as the standard deviation or delta for the power analysis, you need to perform preliminary or pilot experiments. 
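As a rough illustration of the power analysis described above, the same calculation can be repeated over several group sizes to see how the detectable delta shrinks as samples are added. This is only a sketch: the group sizes are arbitrary choices for illustration, and sd = 1, a significance level of 0.05 and a power of 0.9 are taken from the example in the text.

```r
# Detectable difference (delta) between two groups for several group sizes,
# assuming sd = 1, a two-sided two-sample t-test, alpha = 0.05, power = 0.9.
# The group sizes below are arbitrary values chosen for illustration.
group_sizes <- c(5, 10, 20, 50)

detectable_delta <- sapply(group_sizes, function(n) {
  power.t.test(n = n, sd = 1, sig.level = 0.05, power = 0.9)$delta
})

# One row per design: samples per group and the smallest detectable delta.
power_table <- data.frame(n_per_group = group_sizes, delta = detectable_delta)
print(power_table)
```

In a real study the standard deviation would come from pilot data for each metabolite, so a table like this would differ from peak to peak.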
power.t.test(n=10,sd=1,sig.level = 0.05,power = 0.9) ## ## Two-sample t test power calculation ## ## n = 10 ## delta = 1.53367 ## sd = 1 ## sig.level = 0.05 ## power = 0.9 ## alternative = two.sided ## ## NOTE: n is number in *each* group power.t.test(delta = 5,sd=1,sig.level = 0.05,power = 0.9) ## ## Two-sample t test power calculation ## ## n = 2.328877 ## delta = 5 ## sd = 1 ## sig.level = 0.05 ## power = 0.9 ## alternative = two.sided ## ## NOTE: n is number in *each* group However, since a preliminary experiment is not always possible, the power can also be computed directly based on false discovery rate control. If the power for a peak is lower than a certain value, say 0.8, we exclude that peak from the significant features. In this review (Oberg and Vitek 2009), the authors suggest estimating an average \\(\\alpha\\) according to this equation (Benjamini and Hochberg 1995) and then calculating the sample numbers in the usual way: \\[ \\alpha_{ave} \\leq (1-\\beta_{ave})\\cdot q\\frac{1}{1+(1-q)\\cdot m_0/m_1} \\] Here \\(1-\\beta_{ave}\\) is the average power, \\(q\\) is the false discovery rate, and \\(m_0/m_1\\) is the ratio of unchanged to truly changed features. Another study (Blaise et al. 2016) presents a simulation-based method to estimate the sample size. They used the Benjamini-Yekutieli (BY) correction to limit the influence of correlations. Other investigations can be found here (Saccenti and Timmerman 2016; Blaise 2013). However, the nature of omics studies makes it hard for a power analysis to produce one number for all metabolites, and all of these methods try to find a balance that represents more peaks with the fewest samples. MetSizeR: GUI Tool for Estimating Sample Sizes for metabolomics Experiments (Nyamundanda et al. 2013). MSstats: Protein/Peptide significance analysis (Choi et al. 2014). enviGCMS: GC/LC-MS Data Analysis for Environmental Science (Z. Yu et al. 2017). 2.4 Optimization One experiment can involve many factors with different levels, and only one set of parameters for those factors will give the best sensitivity or reproducibility for a given study. 
To find this set of parameters, Plackett-Burman Design (PBD), Response Surface Methodology (RSM), Central Composite Design (CCD), and Taguchi methods can be used to optimize the parameters of a metabolomics study. The target can be the quality of peaks, the number of peaks, the stability of peak intensities, and/or a statistic combining those targets. See these papers for details (Jacyna, Kordalewska, and Markuszewski 2019; Box, Hunter, and Hunter 2005). 2.5 Pooled QC Pooled QC samples are unique and very important for metabolomics studies. Every 10 or 20 samples, a pooled sample made from all samples in the study and a blank sample should be injected as quality control samples. Pooled QC samples capture the changes during the instrumental analysis, and blank samples can tell where the variances come from. Meanwhile, the start of the sequence should condition the column with pooled QC samples. The injection sequence should be randomized. These papers (Phapale et al. 2020; Dudzik et al. 2018; Dunn et al. 2012; Broadhurst et al. 2018; Corey D. Broeckling et al. 2023; González-Domínguez et al. 2024) should be read for details. If there are other co-factors, a linear model or randomization can be applied to eliminate their influence. You need to record the values of those co-factors for further data analysis. Common co-factors in metabolomics studies are age, gender, location, etc. If you need data correction, some background or calibration samples are required. However, control samples can also be used for data correction in certain DoE. Another important factor is the instrument. High-resolution mass spectrometry is always preferred. As shown in the study by Najdekr et al. (Najdekr et al. 2016): the most effective mass resolving powers for profiling analyses of metabolite rich biofluids on the Orbitrap Elite were around 60000-120000 fwhm to retrieve the highest amount of information. The region between 400-800 m/z was influenced the most by resolution. 
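The pooled QC scheme described in this section (a randomized injection sequence with a pooled QC injected at regular intervals) can be sketched as follows. This is a minimal sketch under stated assumptions: the sample names, the seed, the number of samples and the interval of 10 are all hypothetical, and a single leading QC stands in for whatever column conditioning a real sequence would use.

```r
# Assumed seed, used only so that the sketch is reproducible.
set.seed(42)

# Hypothetical study samples; real names would come from the sample sheet.
sample_ids <- paste0('S', 1:30)

# Randomize the injection order of the study samples.
run_order <- sample(sample_ids)

# Insert one pooled QC after every 10 study samples (assumed interval).
qc_every <- 10
blocks <- split(run_order, ceiling(seq_along(run_order) / qc_every))
with_qc <- lapply(blocks, function(block) c(block, 'QC'))

# Start the sequence with a pooled QC as well.
injection_sequence <- c('QC', unlist(with_qc, use.names = FALSE))
print(injection_sequence)
```

Blank injections are left out here; they could be added to each block in the same way as the pooled QC.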
However, the elimination of peaks with a high within-group RSD% is omitted in most studies. Based on a pre-experiment, you can describe the RSD% distribution and set a cut-off so that only stable peaks are used for further data analysis. To my knowledge, 30% is suitable considering the batch effects. Adding certified reference material or standard reference material will help to evaluate the quality of large-scale data collection or of important metabolites (Wise 2022; Wright, Beach, and McCarron 2022). For long-term quality control, ScreenDB provides a data analysis strategy for HRMS data founded on structured query language database archiving (Mardal et al. 2023). AVIR provides a computational solution to automatically recognize metabolic features with computational variation in a metabolomics data set (Z. Zhang et al. 2024). References Benjamini, Yoav, and Yosef Hochberg. 1995. “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing.” Journal of the Royal Statistical Society. Series B (Methodological) 57 (1): 289–300. https://www.jstor.org/stable/2346101. Blaise, Benjamin J. 2013. “Data-Driven Sample Size Determination for Metabolic Phenotyping Studies.” Analytical Chemistry 85 (19): 8943–50. https://doi.org/10.1021/ac4022314. Blaise, Benjamin J., Gonçalo Correia, Adrienne Tin, J. Hunter Young, Anne-Claire Vergnaud, Matthew Lewis, Jake T. M. Pearce, et al. 2016. “Power Analysis and Sample Size Determination in Metabolic Phenotyping.” Analytical Chemistry 88 (10): 5179–88. https://doi.org/10.1021/acs.analchem.6b00188. Box, George E. P., J. Stuart Hunter, and William G. Hunter. 2005. Statistics for Experimenters. Wiley-Interscience. Broadhurst, David, Royston Goodacre, Stacey N. Reinke, Julia Kuligowski, Ian D. Wilson, Matthew R. Lewis, and Warwick B. Dunn. 2018. 
“Guidelines and Considerations for the Use of System Suitability and Quality Control Samples in Mass Spectrometry Assays Applied in Untargeted Clinical Metabolomic Studies.” Metabolomics 14 (6). https://doi.org/10.1007/s11306-018-1367-3. Broeckling, Corey D., Richard D. Beger, Leo L. Cheng, Raquel Cumeras, Daniel J. Cuthbertson, Surendra Dasari, W. Clay Davis, et al. 2023. “Current Practices in LC-MS Untargeted Metabolomics: A Scoping Review on the Use of Pooled Quality Control Samples.” Analytical Chemistry 95 (51): 18645–54. https://doi.org/10.1021/acs.analchem.3c02924. Choi, Meena, Ching-Yun Chang, Timothy Clough, Daniel Broudy, Trevor Killeen, Brendan MacLean, and Olga Vitek. 2014. “MSstats: An R Package for Statistical Analysis of Quantitative Mass Spectrometry-Based Proteomic Experiments.” Bioinformatics 30 (17): 2524–26. https://doi.org/10.1093/bioinformatics/btu305. Dudzik, Danuta, Cecilia Barbas-Bernardos, Antonia García, and Coral Barbas. 2018. “Quality Assurance Procedures for Mass Spectrometry Untargeted Metabolomics. A Review.” Journal of Pharmaceutical and Biomedical Analysis, Review issue 2017, 147 (January): 149–73. https://doi.org/10.1016/j.jpba.2017.07.044. Dunn, Warwick B, Ian D Wilson, Andrew W Nicholls, and David Broadhurst. 2012. “The Importance of Experimental Design and QC Samples in Large-Scale and MS-driven Untargeted Metabolomic Studies of Humans.” Bioanalysis 4 (18): 2249–64. https://doi.org/10.4155/bio.12.204. González-Domínguez, Álvaro, Núria Estanyol-Torres, Carl Brunius, Rikard Landberg, and Raúl González-Domínguez. 2024. “QComics: Recommendations and Guidelines for Robust, Easily Implementable and Reportable Quality Control of Metabolomics Data.” Analytical Chemistry 96 (3): 1064–72. https://doi.org/10.1021/acs.analchem.3c03660. Jacyna, Julia, Marta Kordalewska, and Michał J. Markuszewski. 2019. 
“Design of Experiments in Metabolomics-Related Studies: An Overview.” Journal of Pharmaceutical and Biomedical Analysis 164 (February): 598–606. https://doi.org/10.1016/j.jpba.2018.11.027. Jang, Cholsoon, Li Chen, and Joshua D. Rabinowitz. 2018. “Metabolomics and Isotope Tracing.” Cell 173 (4): 822–37. https://doi.org/10.1016/j.cell.2018.03.055. Mardal, Marie, Petur W. Dalsgaard, Brian S. Rasmussen, Kristian Linnet, and Christian B. Mollerup. 2023. “Scalable Analysis of Untargeted LC-HRMS Data by Means of SQL Database Archiving.” Analytical Chemistry, February. https://doi.org/10.1021/acs.analchem.2c03769. Najdekr, Lukáš, David Friedecký, Ralf Tautenhahn, Tomáš Pluskal, Junhua Wang, Yingying Huang, and Tomáš Adam. 2016. “Influence of Mass Resolving Power in Orbital Ion-Trap Mass Spectrometry-Based Metabolomics.” Analytical Chemistry 88 (23): 11429–35. https://doi.org/10.1021/acs.analchem.6b02319. Nyamundanda, Gift, Isobel Claire Gormley, Yue Fan, William M. Gallagher, and Lorraine Brennan. 2013. “MetSizeR: Selecting the Optimal Sample Size for Metabolomic Studies Using an Analysis Based Approach.” BMC Bioinformatics 14: 338. https://doi.org/10.1186/1471-2105-14-338. Oberg, Ann L., and Olga Vitek. 2009. “Statistical Design of Quantitative Mass Spectrometry-Based Proteomic Experiments.” Journal of Proteome Research 8 (5): 2144–56. https://doi.org/10.1021/pr8010099. Phapale, Prasad, Vineeta Rai, Ashok Kumar Mohanty, and Sanjeeva Srivastava. 2020. “Untargeted Metabolomics Workshop Report: Quality Control Considerations from Sample Preparation to Data Analysis.” Journal of the American Society for Mass Spectrometry 31 (9): 2006–10. https://doi.org/10.1021/jasms.0c00224. Saccenti, Edoardo, and Marieke E. Timmerman. 2016. “Approaches to Sample Size Determination for Multivariate Data: Applications to PCA and PLS-DA of Omics Data.” Journal of Proteome Research 15 (8): 2379–93. https://doi.org/10.1021/acs.jproteome.5b01029. Wise, Stephen A. 2022. 
“What If Using Certified Reference Materials (CRMs) Was a Requirement to Publish in Analytical/Bioanalytical Chemistry Journals?” Analytical and Bioanalytical Chemistry 414 (24): 7015–22. https://doi.org/10.1007/s00216-022-04163-8. Wright, Elliott J., Daniel G. Beach, and Pearse McCarron. 2022. “Non-Target Analysis and Stability Assessment of Reference Materials Using Liquid Chromatography-High-Resolution Mass Spectrometry.” Analytica Chimica Acta 1201 (April): 339622. https://doi.org/10.1016/j.aca.2022.339622. Yu, Zhihao, Haylea C. Miller, Geoffrey J. Puzon, and Brian H. Clowers. 2017. “Development of Untargeted Metabolomics Methods for the Rapid Detection of Pathogenic Naegleria Fowleri.” Environmental Science &amp; Technology 51 (8): 4210–19. https://doi.org/10.1021/acs.est.6b05969. Zhang, Zixuan, Huaxu Yu, Ethan Wong-Ma, Pouneh Dokouhaki, Ahmed Mostafa, Jay S. Shavadia, Fang Wu, and Tao Huan. 2024. “Reducing Quantitative Uncertainty Caused by Data Processing in Untargeted Metabolomics.” Analytical Chemistry 96 (9): 3727–32. https://doi.org/10.1021/acs.analchem.3c04046. "],["pretreatment.html", "Chapter 3 Pretreatment 3.1 Collection 3.2 Quenching 3.3 Extraction 3.4 Derivatization 3.5 Isotope label 3.6 Storage", " Chapter 3 Pretreatment Pretreatment will affect the results of metabolomics and covers the sample treatment from crude samples to injection vials for instrumental analysis. The purpose of sample pretreatment is to retain more of the interesting compounds while removing unrelated compounds. For metabolomics studies, we might not know the ‘interesting’ compounds in advance, and which compounds are unrelated depends heavily on the research purpose. For example, Gel Permeation Chromatography (GPC), Florisil, alumina, and silica gel can be used to remove lipids, while alcohols and strong acids/bases can denature proteins to release more compounds. However, if we are interested in small lipids or peptides, such pretreatment methods should be changed. 
In general, sample collection, quenching, extraction methods, derivatization, and storage should be optimized in pretreatment. 3.1 Collection These papers investigated different fecal collection methods (Loftfield et al. 2016; Deda et al. 2017). This paper discusses the influence of sample normalization (Wu and Li 2016). 3.2 Quenching A quenching solvent is always used to stop enzymatic activity. In this review (W. Lu et al. 2017), the authors said: A classical approach, which works well for many analytes, is boiling ethanol. Although the boiling solvent raises concerns about thermal degradation, it reliably denatures enzymes. In contrast, cold organic solvent may not fully denature enzymes or may do so too slowly such that some metabolic reactions continue, interconverting metabolites during the quenching process. This study (J. Kim et al. 2020) re-evaluated urease-dependent metabolome sample preparation and found: activities of urease and endogenous urinary enzymes and metabolite contaminants from the urease preparations introduce artefacts into metabolite profiles, thus leading to misinterpretation. 3.3 Extraction According to this research (Bennett et al. 2009): The total metabolome concentration is approximately 300 mM, whereas the protein concentration is approximately 7 mM, which implies that most cellular metabolites are in free form. Sitnikov et al. (Sitnikov, Monnin, and Vuckovic 2016) reported that the methods most orthogonal to methanol-based precipitation were ion-exchange solid-phase extraction and liquid-liquid extraction using methyl-tertbutyl ether. Another study used stable isotope-labeled samples and favored the use of a water-methanol-acetonitrile mixture for global metabolite extraction instead of aqueous methanol or aqueous acetonitrile alone (Doppler et al. 2016). Metabolic information is highly influenced by the extraction solvent (Ibáñez et al. 2017). Tissue samples first need to be pulverized into fine powders. 
Feces collected with 95% ethanol or FOBT would be more reproducible and stable. In this review (W. Lu et al. 2017), the authors said: In our experience, for both cell and tissue specimens, 40:40:20 acetonitrile:methanol:water with 0.1 M formic acid (and subsequent neutralization with ammonium bicarbonate) is generally an effective solvent system for both quenching and extraction, including for ATP and other high-energy phosphorylated compounds. We typically use approximately 1 mL of solvent mix to extract 25 mg of biological specimen. …Thus, although drying is acceptable for most metabolites, care must be taken with redox-active species. Nano LC-MS can be used to analyze small numbers of cells (Luo and Li 2017). For plants such as soybeans (Mahmud et al. 2017), ammonium acetate/methanol could be selected as the extraction strategy in preference to water/methanol and sodium phosphate/methanol. For general plant samples, check this comprehensive investigation (Bijttebier et al. 2016). For blood plasma and serum samples, a comprehensive evaluation of 12 sample preparation methods (SPM) using phospholipid and protein removal plates (PLR), solid phase extraction plates (SPE), supported liquid extraction cartridges (SLE), and conventionally used protein precipitation (PPT) was performed. The results suggest using PPT and PLR on the same samples within a simple analytical workflow, as their complementarity would broaden the visible chemical space (Chaker et al. 2022). 3.4 Derivatization Derivatization is always used in GC-based metabolomics studies. This paper (Miyagawa and Bamba 2019) compared sequential derivatization methods and found that different compounds show different fluctuations during the oximation or silylation process. This paper summarized derivatization methods for LC-MS (S. Zhao and Li 2020). 3.5 Isotope label You might try an oxygen isotope exchange reaction with isotopically labeled water to track certain metabolites (Osipenko et al. 2022), or MS-IDF (S. Wang et al. 2022). 
3.6 Storage Samples should be stored after sample collection or sample pretreatment. Storage at -80°C or -20°C is always preferred. Dry ice should be used during sample pretreatment. However, a comprehensive investigation of storage influences found that the metabolite profile changes after one day of storage at -80°C (M. Yu et al. 2020). Rapid analysis of samples should be considered to capture more accurate information from the samples. Storage conditions such as temperature and time can affect the metabolite composition of various samples. Laparre et al. (Laparre et al. 2017) noted that the metabolite profiles of urine samples were significantly changed after 5 days of storage at 4°C, while Wandro and colleagues (Wandro et al. 2017) observed that the metabolomic profiles of cystic fibrosis sputum samples underwent notable changes after only 1 day of storage at 4°C. Likewise, Roszkowska et al. demonstrated that various signaling molecules were lost from the lipidome profile of tissue after storing the samples for one year at -80°C (Roszkowska et al. 2018). To date, most metabolomics studies involving storage of samples prior to the analysis have used a storage temperature of -80°C, as previous investigations have shown that low temperatures or freeze-thaw cycles do not significantly change the metabolite profile of certain samples (Lin et al. 2007). For gut microbiota, this paper can be checked for storage issues (Zubeldia-Varela et al. 2020). For blood sample storage, you could check this paper (Hernandes, Barbas, and Dudzik 2017). For urine sample storage, check this (Laparre et al. 2017). This piece reviewed the stability of energy metabolites (Gil et al. 2015). References Bennett, Bryson D., Elizabeth H. Kimball, Melissa Gao, Robin Osterhout, Stephen J. Van Dien, and Joshua D. Rabinowitz. 2009. “Absolute Metabolite Concentrations and Implied Enzyme Active Site Occupancy in Escherichia Coli.” Nature Chemical Biology 5 (8): 593–99. 
https://doi.org/10.1038/nchembio.186. Bijttebier, Sebastiaan, Anastasia Van der Auwera, Kenn Foubert, Stefan Voorspoels, Luc Pieters, and Sandra Apers. 2016. “Bridging the Gap Between Comprehensive Extraction Protocols in Plant Metabolomics Studies and Method Validation.” Analytica Chimica Acta 935 (September): 136–50. https://doi.org/10.1016/j.aca.2016.06.047. Chaker, Jade, David Møbjerg Kristensen, Thorhallur Ingi Halldorsson, Sjurdur Frodi Olsen, Christine Monfort, Cécile Chevrier, Bernard Jégou, and Arthur David. 2022. “Comprehensive Evaluation of Blood Plasma and Serum Sample Preparations for HRMS-Based Chemical Exposomics: Overlaps and Specificities.” Analytical Chemistry 94 (2): 866–74. https://doi.org/10.1021/acs.analchem.1c03638. Deda, Olga, Anastasia Chrysovalantou Chatziioannou, Stella Fasoula, Dimitris Palachanis, Nicolaos Raikos, Georgios A. Theodoridis, and Helen G. Gika. 2017. “Sample Preparation Optimization in Fecal Metabolic Profiling.” Journal of Chromatography B, Advances in mass spectrometry-based applications, 1047 (March): 115–23. https://doi.org/10.1016/j.jchromb.2016.06.047. Doppler, Maria, Bernhard Kluger, Christoph Bueschl, Christina Schneider, Rudolf Krska, Sylvie Delcambre, Karsten Hiller, Marc Lemmens, and Rainer Schuhmacher. 2016. “Stable Isotope-Assisted Evaluation of Different Extraction Solvents for Untargeted Metabolomics of Plants.” International Journal of Molecular Sciences 17 (7). https://doi.org/10.3390/ijms17071017. Gil, Andres, David Siegel, Hjalmar Permentier, Dirk-Jan Reijngoud, Frank Dekker, and Rainer Bischoff. 2015. “Stability of Energy Metabolites—An Often Overlooked Issue in Metabolomics Studies: A Review.” ELECTROPHORESIS 36 (18): 2156–69. https://doi.org/10.1002/elps.201500031. Hernandes, Vinicius Veri, Coral Barbas, and Danuta Dudzik. 2017. “A Review of Blood Sample Handling and Pre-Processing for Metabolomics Studies.” ELECTROPHORESIS 38 (18): 2232–41. https://doi.org/10.1002/elps.201700086. 
Ibáñez, Clara, Lamia Mouhid, Guillermo Reglero, and Ana Ramírez de Molina. 2017. “Lipidomics Insights in Health and Nutritional Intervention Studies.” Journal of Agricultural and Food Chemistry 65 (36): 7827–42. https://doi.org/10.1021/acs.jafc.7b02643. Kim, Jungyeon, Joong Kyong Ahn, Yu Eun Cheong, Sung-Joon Lee, Hoon-Suk Cha, and Kyoung Heon Kim. 2020. “Systematic Re-Evaluation of the Long-Used Standard Protocol of Urease-Dependent Metabolome Sample Preparation.” PloS One 15 (3): e0230072. https://doi.org/10.1371/journal.pone.0230072. Laparre, Jérôme, Zied Kaabia, Mark Mooney, Tom Buckley, Mark Sherry, Bruno Le Bizec, and Gaud Dervilly-Pinel. 2017. “Impact of Storage Conditions on the Urinary Metabolomics Fingerprint.” Analytica Chimica Acta 951 (January): 99–107. https://doi.org/10.1016/j.aca.2016.11.055. Lin, Ching Yu, Huifeng Wu, Ronald S. Tjeerdema, and Mark R. Viant. 2007. “Evaluation of Metabolite Extraction Strategies from Tissue Samples Using NMR Metabolomics.” Metabolomics 3 (1): 55–67. https://doi.org/10.1007/s11306-006-0043-1. Loftfield, Erikka, Emily Vogtmann, Joshua N. Sampson, Steven C. Moore, Heidi Nelson, Rob Knight, Nicholas Chia, and Rashmi Sinha. 2016. “Comparison of Collection Methods for Fecal Samples for Discovery Metabolomics in Epidemiologic Studies.” Cancer Epidemiology and Prevention Biomarkers 25 (11): 1483–90. https://doi.org/10.1158/1055-9965.EPI-16-0409. Lu, Wenyun, Xiaoyang Su, Matthias S. Klein, Ian A. Lewis, Oliver Fiehn, and Joshua D. Rabinowitz. 2017. “Metabolite Measurement: Pitfalls to Avoid and Practices to Follow.” Annual Review of Biochemistry 86 (1): 277–304. https://doi.org/10.1146/annurev-biochem-061516-044952. Luo, Xian, and Liang Li. 2017. “Metabolomics of Small Numbers of Cells: Metabolomic Profiling of 100, 1000, and 10000 Human Breast Cancer Cells.” Analytical Chemistry 89 (21): 11664–71. https://doi.org/10.1021/acs.analchem.7b03100. Mahmud, Iqbal, Sandi Sternberg, Michael Williams, and Timothy J. Garrett. 2017. 
“Comparison of Global Metabolite Extraction Strategies for Soybeans Using UHPLC-HRMS.” Analytical and Bioanalytical Chemistry 409 (26): 6173–80. https://doi.org/10.1007/s00216-017-0557-6. Miyagawa, Hiromi, and Takeshi Bamba. 2019. “Comparison of Sequential Derivatization with Concurrent Methods for GC/MS-based Metabolomics.” Journal of Bioscience and Bioengineering 127 (2): 160–68. https://doi.org/10.1016/j.jbiosc.2018.07.015. Osipenko, Sergey, Alexander Zherebker, Lidiia Rumiantseva, Oxana Kovaleva, Evgeny N. Nikolaev, and Yury Kostyukevich. 2022. “Oxygen Isotope Exchange Reaction for Untargeted LC–MS Analysis.” Journal of the American Society for Mass Spectrometry 33 (2): 390–98. https://doi.org/10.1021/jasms.1c00383. Roszkowska, Anna, Miao Yu, Vincent Bessonneau, Leslie Bragg, Mark Servos, and Janusz Pawliszyn. 2018. “Tissue Storage Affects Lipidome Profiling in Comparison to in Vivo Microsampling Approach.” Scientific Reports 8 (1): 6980. https://doi.org/10.1038/s41598-018-25428-2. Sitnikov, Dmitri G., Cian S. Monnin, and Dajana Vuckovic. 2016. “Systematic Assessment of Seven Solvent and Solid-Phase Extraction Methods for Metabolomics Analysis of Human Plasma by LC-MS.” Scientific Reports 6 (December). https://doi.org/10.1038/srep38885. Wandro, Stephen, Lisa Carmody, Tara Gallagher, John J. LiPuma, and Katrine Whiteson. 2017. “Making It Last: Storage Time and Temperature Have Differential Impacts on Metabolite Profiles of Airway Samples from Cystic Fibrosis Patients.” mSystems 2 (6). https://doi.org/10.1128/mSystems.00100-17. Wang, Suping, Xiaojuan Jiang, Rong Ding, Binbin Chen, Haiyan Lyu, Junyang Liu, Chunyan Zhu, et al. 2022. “MS-IDF: A Software Tool for Nontargeted Identification of Endogenous Metabolites After Chemical Isotope Labeling Based on a Narrow Mass Defect Filter.” Analytical Chemistry 94 (7): 3194–3202. https://doi.org/10.1021/acs.analchem.1c04719. Wu, Yiman, and Liang Li. 2016. 
“Sample Normalization Methods in Quantitative Metabolomics.” Journal of Chromatography A, Editors’ Choice X, 1430 (January): 80–95. https://doi.org/10.1016/j.chroma.2015.12.007. Yu, Miao, Sofia Lendor, Anna Roszkowska, Mariola Olkowicz, Leslie Bragg, Mark Servos, and Janusz Pawliszyn. 2020. “Metabolic Profile of Fish Muscle Tissue Changes with Sampling Method, Storage Strategy and Time.” Analytica Chimica Acta 1136 (November): 42–50. https://doi.org/10.1016/j.aca.2020.08.050. Zhao, Shuang, and Liang Li. 2020. “Chemical Derivatization in LC-MS-based Metabolomics Study.” TrAC Trends in Analytical Chemistry 131 (October): 115988. https://doi.org/10.1016/j.trac.2020.115988. Zubeldia-Varela, Elisa, Domingo Barber, Coral Barbas, Marina Perez-Gordo, and David Rojo. 2020. “Sample Pre-Treatment Procedures for the Omics Analysis of Human Gut Microbiota: Turning Points, Tips and Tricks for Gene Sequencing and Metabolomics.” Journal of Pharmaceutical and Biomedical Analysis 191 (November): 113592. https://doi.org/10.1016/j.jpba.2020.113592. "],["instrumental-analysis.html", "Chapter 4 Instrumental analysis 4.1 Column and gradient selection 4.2 Mass resolution 4.3 Matrix effects", " Chapter 4 Instrumental analysis To get more information from the samples, full scan is preferred on GC/LC-MS. Each scan collects a mass spectrum covering the set mass range. If you narrow your mass range and keep the same scan time, each mass gains collection time and you get higher sensitivity. Conversely, if you expand your scan range, the sensitivity for each mass decreases. You could also extend the collection time for each scan. However, that would affect the separation process. Full scan is performed synchronously with the separation process. For a good chromatographic separation, each peak should have at least 10 points to get a nice peak shape. Suppose you want to separate two peaks with a retention time difference of 10 s. 
Assuming the half peak width is 5 s, you need to collect 10 mass spectra within 10 s, so the dwell time for each scan is 1 s. If you use a high-resolution column and the half peak width is 1 s, you need to finish a scan within 0.2 s. As discussed above, a shorter dwell time decreases the sensitivity. Thus there is a trade-off between separation and sensitivity. If you use UPLC, the separation can be finished within 20 min, but you need to check whether your mass spectrometer can still deliver good sensitivity. A recent study (J. Cai and Yan 2021) shows that, with an optimized workflow, 6 points can be enough to match peaks acquired with 20 points. 4.1 Column and gradient selection For GC, a higher temperature releases compounds with higher boiling points. For LC, the gradient and the functional groups of the stationary phase are more important than temperature. The polarity of the samples and the column should match. A more polar solvent releases polar compounds. A normal-phase column will not retain non-polar compounds, while a reversed-phase column will elute polar compounds at the very beginning. To cover compounds with a wide range of polarity or logP values, a normal-phase column should be paired with a non-polar to polar gradient to get a better separation of polar compounds, while a reversed-phase column should be paired with a polar to non-polar gradient to elute compounds. If you use an inappropriate gradient order, your compounds will not be separated well. If you have no idea about column and gradient selection, check the conditions reported in the literature. Meanwhile, the pretreatment methods should fit the column and gradient selection. You will get limited information by injecting non-polar extracts onto a normal-phase column, because nothing will be retained on the column. One study shows that improved chromatography conditions improve the annotation results(Anderson et al. 2021). 
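Returning to the scan-time arithmetic at the start of this chapter, the following R sketch reproduces the two worked numbers (1 s and 0.2 s per scan). It is only an illustration: the helper name max_scan_time and its arguments are hypothetical, and it assumes, as the text does, that the full peak width is twice the half peak width.

```r
# Hypothetical helper (not from any package): longest scan time, in seconds,
# that still puts the wanted number of points across one chromatographic peak.
# Assumption: full peak width = 2 * half peak width, as in the text above.
max_scan_time <- function(half_peak_width_s, points_per_peak = 10) {
  stopifnot(half_peak_width_s > 0, points_per_peak > 0)
  full_peak_width_s <- 2 * half_peak_width_s
  full_peak_width_s / points_per_peak
}

max_scan_time(5)                       # half width 5 s, 10 points: 1 s per scan
max_scan_time(1)                       # half width 1 s, 10 points: 0.2 s per scan
max_scan_time(1, points_per_peak = 6)  # fewer points per peak allow a longer scan
```

Lowering points_per_peak is the only lever in this sketch that buys back dwell time for a given peak width, which is why the minimal number of scans per peak matters for sensitivity.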
You can also install polar and non-polar columns and run the separation on one column while conditioning the other, which could extend the chemical coverage(Flasch et al. 2022). A meta-analysis of chromatographic methods in EBI MetaboLights and the NIH Metabolomics Workbench could be a guide for labs without experience in metabolomics chromatographic methods(Harrieder et al. 2022). One work introduces Sequential Quantification using Isotope Dilution (SQUID), a method that combines serial sample injections into a continuous isocratic mobile phase and enables rapid, accurate analysis of target molecules. It was demonstrated by detecting microbial polyamines in human urine samples with an LLOQ of 106 nM and analysis times as short as 57 s, and SQUID is proposed as a high-throughput LC–MS tool for quantifying target biomarkers in large cohorts(Groves et al. 2023). 4.2 Mass resolution For metabolomics, high resolution mass spectrometry should be used to make the identification of compounds easier. Mass resolving power is very important for annotation, and high resolution mass spectra should be calibrated in real time. The region between 400–800 m/z was influenced the most by resolution(Najdekr et al. 2016). The performance of the Orbitrap Fusion has been evaluated(Barbier Saint Hilaire et al. 2018), and it has been compared with Fourier transform ion cyclotron resonance (FT-ICR)(Ghaste, Mistrik, and Shulaev 2016; Huang et al. 2021). Mass Difference Maps could recalibrate HRMS data (Smirnov et al. 2019). 4.3 Matrix effects Matrix effects could decrease the sensitivity of untargeted analysis. Such matrix effects could be checked by low resolution mass spectrometry(Z. Yu et al. 2017) and have also been found for high resolution mass spectrometry(Calbiani et al. 2006). Ion suppression should also be considered a critical issue when comparing heterogeneous metabolic profiles(Ghosson et al. 2021). One work discussed the matrix effects after trimethylsilyl derivatization(Tarakhovskaya et al. 2023). Another study (Dagan et al. 
2023) investigated how the complexity of matrices affects nontargeted detection using LC-MS/MS analysis. Detection limits for trace compounds were significantly influenced by matrix complexity, with higher concentrations required for detection within the “top 1000” list than within the first 10,000 peaks, which suggests a negative power law relationship between peak location and concentration. The research also demonstrated a correlation between the power law coefficient and the dilution factor and showed the distribution of matrix peaks across various matrices, providing insights into the capabilities and limitations of LC-MS in analyzing nontargets in complex matrices. dist_loc &lt;- list.files( find.package(&quot;DiagrammeR&quot;), recursive = TRUE, pattern = &quot;mermaid.*js&quot;, full.names = TRUE ) js_cdn_url &lt;- &quot;https://cdnjs.cloudflare.com/ajax/libs/mermaid/9.0.1/mermaid.min.js&quot; download.file(js_cdn_url, dist_loc) References Anderson, Brady G., Alexander Raskind, Hani Habra, Robert T. Kennedy, and Charles R. Evans. 2021. “Modifying Chromatography Conditions for Improved Unknown Feature Identification in Untargeted Metabolomics.” Analytical Chemistry 93 (48): 15840–49. https://doi.org/10.1021/acs.analchem.1c02149. Barbier Saint Hilaire, Pierre, Ulli M. Hohenester, Benoit Colsch, Jean-Claude Tabet, Christophe Junot, and François Fenaille. 2018. “Evaluation of the High-Field Orbitrap Fusion for Compound Annotation in Metabolomics.” Analytical Chemistry 90 (5): 3030–35. https://doi.org/10.1021/acs.analchem.7b05372. Cai, Jingwei, and Zhengyin Yan. 2021. “Re-Examining the Impact of Minimal Scans in Liquid Chromatography–Mass Spectrometry Analysis.” Journal of the American Society for Mass Spectrometry, June. https://doi.org/10.1021/jasms.1c00073. Calbiani, F., M. Careri, L. Elviri, A. Mangia, and I. Zagnoni. 2006. 
“Matrix Effects on Accurate Mass Measurements of Low-Molecular Weight Compounds Using Liquid Chromatography-Electrospray-Quadrupole Time-of-Flight Mass Spectrometry.” Journal of Mass Spectrometry 41 (3): 289–94. https://doi.org/10.1002/jms.984. Dagan, Shai, Dana Marder, Nitzan Tzanani, Eyal Drug, Hagit Prihed, and Lilach Yishai-Aviram. 2023. “Evaluation of Matrix Complexity in Nontargeted Analysis of Small-Molecule Toxicants by Liquid Chromatography–High-Resolution Mass Spectrometry.” Analytical Chemistry 95 (20): 7924–32. https://doi.org/10.1021/acs.analchem.3c00413. Flasch, Mira, Veronika Fitz, Evelyn Rampler, Chibundu N. Ezekiel, Gunda Koellensperger, and Benedikt Warth. 2022. “Integrated Exposomics/Metabolomics for Rapid Exposure and Effect Analyses.” JACS Au 2 (11): 2548–60. https://doi.org/10.1021/jacsau.2c00433. Ghaste, Manoj, Robert Mistrik, and Vladimir Shulaev. 2016. “Applications of Fourier Transform Ion Cyclotron Resonance (FT-ICR) and Orbitrap Based High Resolution Mass Spectrometry in Metabolomics and Lipidomics.” International Journal of Molecular Sciences 17 (6). https://doi.org/10.3390/ijms17060816. Ghosson, Hikmat, Yann Guitton, Amani Ben Jrad, Chandrashekhar Patil, Delphine Raviglione, Marie-Virginie Salvia, and Cédric Bertrand. 2021. “Electrospray Ionization and Heterogeneous Matrix Effects in Liquid Chromatography/Mass Spectrometry Based Meta-Metabolomics: A Biomarker or a Suppressed Ion?” Rapid Communications in Mass Spectrometry 35 (2): e8977. https://doi.org/10.1002/rcm.8977. Groves, Ryan A., Carly C. Y. Chan, Spencer D. Wildman, Daniel B. Gregson, Thomas Rydzak, and Ian A. Lewis. 2023. “Rapid LC–MS Assay for Targeted Metabolite Quantification by Serial Injection into Isocratic Gradients.” Analytical and Bioanalytical Chemistry 415 (2): 269–76. https://doi.org/10.1007/s00216-022-04384-x. Harrieder, Eva-Maria, Fleming Kretschmer, Sebastian Böcker, and Michael Witting. 2022. 
“Current State-of-the-Art of Separation Methods Used in LC-MS Based Metabolomics and Lipidomics.” Journal of Chromatography B 1188 (January): 123069. https://doi.org/10.1016/j.jchromb.2021.123069. Huang, Danning, Marcos Bouza, David A. Gaul, Franklin E. Leach, I. Jonathan Amster, Frank C. Schroeder, Arthur S. Edison, and Facundo M. Fernández. 2021. “Comparison of High-Resolution Fourier Transform Mass Spectrometry Platforms for Putative Metabolite Annotation.” Analytical Chemistry, August. https://doi.org/10.1021/acs.analchem.1c02224. Najdekr, Lukáš, David Friedecký, Ralf Tautenhahn, Tomáš Pluskal, Junhua Wang, Yingying Huang, and Tomáš Adam. 2016. “Influence of Mass Resolving Power in Orbital Ion-Trap Mass Spectrometry-Based Metabolomics.” Analytical Chemistry 88 (23): 11429–35. https://doi.org/10.1021/acs.analchem.6b02319. Smirnov, Kirill S., Sara Forcisi, Franco Moritz, Marianna Lucio, and Philippe Schmitt-Kopplin. 2019. “Mass Difference Maps and Their Application for the Recalibration of Mass Spectrometric Data in Nontargeted Metabolomics.” Analytical Chemistry 91 (5): 3350–58. https://doi.org/10.1021/acs.analchem.8b04555. Tarakhovskaya, Elena, Andrea Marcillo, Caroline Davis, Sanja Milkovska-Stamenova, Antje Hutschenreuther, and Claudia Birkemeyer. 2023. “Matrix Effects in GC-MS Profiling of Common Metabolites After Trimethylsilyl Derivatization.” Molecules (Basel, Switzerland) 28 (6): 2653. https://doi.org/10.3390/molecules28062653. Yu, Zhihao, Haylea C. Miller, Geoffrey J. Puzon, and Brian H. Clowers. 2017. “Development of Untargeted Metabolomics Methods for the Rapid Detection of Pathogenic Naegleria Fowleri.” Environmental Science &amp; Technology 51 (8): 4210–19. https://doi.org/10.1021/acs.est.6b05969. 
"],["workflow-2.html", "Chapter 5 Workflow 5.1 Platform for metabolomics data analysis 5.2 Project Setup 5.3 Data sharing 5.4 Contest", " Chapter 5 Workflow You could check this book for metabolomics data analysis (S. Li 2020). DiagrammeR::mermaid(&quot; flowchart TB I(peak-picking) --&gt; C C(visulization) --&gt; D(normalization/batch correction) D --&gt; A(annotation/identification) A --&gt; H(statistical analysis) C --&gt; A --&gt; B(omics analysis) D --&gt; H B --&gt; H H --&gt; E(experimental validation) A --&gt; E H --&gt; A B --&gt; E C --&gt; H &quot;) 5.1 Platform for metabolomics data analysis Here is a list for related open source projects 5.1.1 XCMS &amp; XCMS online XCMS online is hosted by Scripps Institute. If your datasets are not large, XCMS online would be the best option for you. Recently they updated the online version to support more functions for systems biology. They use metlin and iso metlin to annotate the MS/MS data. Pathway analysis is also supported. Besides, to accelerate the process, xcms online employed stream (windows only). You could use stream to connect your instrument workstation to their server and process the data along with the data acquisition automate. They also developed apps for xcms online, but I think apps for slack would be even cooler to control the data processing. xcms is different from xcms online while they might share the same code. I used it almost every data to run local metabolomics data analysis. Recently, they will change their version to xcms 3 with major update for object class. Their data format would integrate into the MSnbase package and the parameters would be easy to set up for each step. Normally, I will use msconvert-IPO-xcms-xMSannotator-metaboanalyst as workflow to process the offline data. It could accelerate the process by parallel processing. However, if you are not familiar with R, you would better to choose some software below. 
For xcms, 1000 files will need around 5 hours to generate the peaks list on a regular workstation. IPO is a tool for automated optimization of XCMS parameters (Libiseller et al. 2015), and Warpgroup is used for chromatogram subregion detection, consensus integration bound determination and accurate missing value integration(Mahieu, Spalding, and Patti 2016). A case study comparing different xcms parameters with IPO is available for GC-MS (Dos Santos and Canuto 2023). Another option is AutoTuner, which is much faster than IPO(McLean and Kujawinski 2020). Recently, MetaboAnalystR 3.0 could also optimize the parameters for xcms, but you need to perform the subsequent analysis within this software(Pang et al. 2020). For IPO, ten files will need ~12 hours to generate the optimized results on a regular workstation. Paramounter is a direct measurement of universal parameters to process metabolomics data in a “White Box”(J. Guo, Shen, and Huan 2022). Another study used a machine learning method to compare different optimization methods, and all of them performed better than the default settings of xcms(Lassen et al. 2021). It could be extended to include ion mobility(Dodds et al. 2022). Check these papers for the XCMS based workflow(Forsberg et al. 2018; Huan et al. 2017; Mahieu et al. 2016; Montenegro-Burke et al. 2017; Domingo-Almenara and Siuzdak 2020; Stancliffe et al. 2022). For METLIN related annotation, check these papers(Guijas et al. 2018; Tautenhahn et al. 2012; Xue, Guijas, et al. 2020; Domingo-Almenara, Montenegro-Burke, Ivanisevic, et al. 2018). MAIT is based on xcms, and you could find the source code here(Fernández-Albert et al. 2014). iMet-Q is an automated tool with friendly user interfaces for quantifying metabolites in full-scan liquid chromatography-mass spectrometry (LC-MS) data (Chang et al. 2016). compMS2Miner is an automatable metabolite identification, visualization, and data-sharing R package for high-resolution LC–MS data sets. Here are related papers (Edmands et al. 
2017; Edmands, Hayes, and Rappaport 2018; Edmands, Barupal, and Scalbert 2015). mzMatch is a modular, open source and platform independent data processing pipeline for metabolomics LC/MS data written in the Java language, which could be coupled with xcms (Scheltema et al. 2011; Creek et al. 2012). It could also be used for annotation with MetAssign(Daly et al. 2014). 5.1.2 PRIMe PRIMe is from RIKEN and UC Davis. They update their database frequently(Tsugawa et al. 2016). It supports mzML and major MS vendor formats. They defined their own file format, ABF, and an ecosystem for omics studies. The software is updated almost every day. You could use MS-DIAL for untargeted analysis and MRMPROBS for targeted analysis. For annotation, they developed MS-FINDER and statistical tools based on Excel. This platform could replace expensive commercial software and is well prepared for MS/MS data analysis and lipidomics. The tools are open source, work on Windows and could also run within Mathematica. However, they don’t cover pathway analysis. Another feature is that they always show the most recent spectral records from public repositories. You could always get the updated MSP spectra files for your own data analysis. For PRIMe based workflow, check these papers(Lai et al. 2018; Matsuo et al. 2017; Treutler et al. 2016; Tsugawa et al. 2015; Tsugawa et al. 2016; Kind et al. 2018). There are also extensions for their workflow(Uchino et al. 2022) and a workflow for environmental science(Bonnefille et al. 2023). 5.1.3 GNPS GNPS is an open-access knowledge base for community-wide organization and sharing of raw, processed or identified tandem mass (MS/MS) spectrometry data. It is a straightforward annotation method for MS/MS data. Feature-based molecular networking (FBMN) within GNPS could be coupled with xcms, openMS, MS-DIAL, MZmine2, and other popular software. GNPS also has a dashboard for online mass spectrometry data analysis(Petras et al. 2021). 
Check these papers for GNPS and related projects(Aron et al. 2020; Nothias et al. 2020; Scheubert et al. 2017; Ricardo R. da Silva et al. 2018; M. Wang et al. 2016; Bittremieux et al. 2023). 5.1.4 OpenMS &amp; SIRIUS OpenMS is another good platform for mass spectrometry data analysis developed in C++. You could use its tools as plugins for KNIME. I suggest that anyone who wants to be a data scientist get familiar with platforms like KNIME, because they supply various APIs for different programming languages, are easy to use and show every step to others. Also, TOPPView in OpenMS could be the best software to visualize MS data. You could always use the metabolomics workflow to train beginners in the details of data processing. pyOpenMS and OpenSWATH are also used in this platform. If you want to move into industry, this platform fits you best because you might get a clear idea about solutions and workflows. Check these papers for OpenMS based workflow(Bertsch et al. 2011; Pfeuffer et al. 2017, 2024; Röst et al. 2014, 2016; Rurik et al. 2020; Alka et al. 2020). OpenMS could be coupled to SIRIUS 4 for annotation. SIRIUS is a Java-based software framework for de novo identification of metabolites using single and tandem mass spectrometry. The SIRIUS 4 project integrates a collection of tools, including CSI:FingerID, ZODIAC and CANOPUS. Check these papers for SIRIUS based workflow(Dührkop et al. 2019, 2020; Alka et al. 2020; Ludwig et al. 2020). 5.1.5 MZmine 2 MZmine 2 has three versions developed on the Java platform, and the latest version is included in MSDK. Functions similar to those of XCMS online can be found in MZmine 2. However, MZmine 2 does not have pathway analysis. You could use metaboanalyst for that purpose. Actually, you could go into MSDK to find similar functions supplied by ProteoSuite and Openchrom. If you are an experienced Java coder, you should start here. Check these papers for MZmine based workflow(Pluskal et al. 
2010; Pluskal et al. 2020). 5.1.6 Emory MaHPIC This platform is composed of several R packages from Emory University, including apLCMS to collect the data, xMSanalyzer to handle the automated pipeline for large-scale, non-targeted metabolomics data, xMSannotator for annotation of LC-MS data and Mummichog for pathway and network analysis for high-throughput metabolomics. This platform would be preferred by those in environmental science who study the exposome. You could check these papers for Emory workflow(Uppal et al. 2013; Uppal, Walker, and Jones 2017; T. Yu et al. 2009; S. Li et al. 2013; Q. Liu et al. 2020). 5.1.7 Others PMDDA is a reproducible workflow for exhaustive MS2 data acquisition of MS1 features(M. Yu, Dolios, and Petrick 2022), with data and scripts available online. tidymass is an object-oriented reproducible analysis framework for LC–MS data(Shen et al. 2022). R for mass spectrometry is an R software collection for the analysis and interpretation of high throughput mass spectrometry assays. MAVEN is from Princeton University (Melamud, Vastag, and Rabinowitz 2010; Clasquin, Melamud, and Rabinowitz 2012). metabolomics is a CRAN package for analysis of metabolomics data. autoGCMSDataAnal is a Matlab based comprehensive data analysis strategy for GC-MS-based untargeted metabolomics, and AntDAS2 provides an automatic data analysis strategy for UPLC-HRMS-based metabolomics(Y.-J. Yu et al. 2019; Y.-Y. Zhang et al. 2020). enviGCMS is for environmental non-targeted analysis and rmwf is for reproducible metabolomics workflow (M. Yu et al. 2020; M. Yu, Olkowicz, and Pawliszyn 2019). Pseudotargeted metabolomics method (Zheng et al. 2020; Y. Wang et al. 2016). pySM provides a reference implementation of a pipeline for False Discovery Rate-controlled metabolite annotation of high-resolution imaging mass spectrometry data (Palmer et al. 2017). TinyMS is a Python-based pipeline for preprocessing LC–MS data for untargeted metabolomics workflows (Riquelme et al. 
2020). MetaboliteDetector is a QT4 based software package for the analysis of GC/MS based metabolomics data (Hiller et al. 2009). W4M and metaX could analyze data online (Giacomoni et al. 2015; Wen et al. 2017; Jalili et al. 2020). FTMSVisualization is a suite of tools for visualizing complex mixture FT-MS data (Kew et al. 2017). magma could predict and match MS/MS files. metabCombiner: paired untargeted LC-HRMS metabolomics feature matching and concatenation of disparately acquired data sets(Habra et al. 2021). SLAW is a scalable and self-optimizing processing workflow for untargeted LC-MS with a docker image (Delabriere et al. 2021). patRoon: open source software platform for environmental mass spectrometry based non-target screening (Helmus et al. 2021). ‘shape-orientated’ algorithm: a new ‘shape-orientated’ continuous wavelet transform (CWT)-based algorithm employing an adapted Marr wavelet (AMW) with a shape matching index (SMI), defined as peak height normalized wavelet coefficient for feature filtering, was developed for chromatographic peak detection and quantification (Bai et al. 2022). automRm: an R package for fully automatic LC-QQQ-MS data preprocessing powered by machine learning (Eilertz, Mitterer, and Buescher 2022). IDSL.UFA Intrinsic Peak Analysis (IPA) for HRMS Data (Baygi et al. 2022). DEIMoS: an open-source tool for processing high-dimensional mass spectrometry data (Colby et al. 2022). Omics Untargeted Key Script is a tool that makes untargeted LC-MS metabolomic profiling with the latest computational features readily accessible to the research community in a ready-to-use, unified manner(Plyushchenko et al. 2022). MetEx is a targeted extraction strategy for improving the coverage and accuracy of metabolite annotation(Zheng et al. 2022). Asari: trackable and scalable LC-MS metabolomics data processing software in Python(S. Li et al. 
2023). NOMspectra: an open-source Python package for processing high resolution mass spectrometry data on natural organic matter(Volikov, Rukhovich, and Perminova 2023). MARS: a multipurpose software for untargeted LC–MS-based metabolomics and exposomics with a GUI in C++ (Goracci et al. 2024). MeRgeION: a multifunctional R pipeline for small molecule LC-MS/MS data processing, searching, and organizing (Y. Liu et al. 2023). 5.1.8 Workflow Comparison Here are some comparisons of different workflows, and you could make a selection based on these works(Myers et al. 2017; Weber et al. 2017; Z. Li et al. 2018; Liao et al. 2023). xcmsrocker is a docker image for metabolomics to compare R based software with a template(M. Yu, Dolios, and Petrick 2022). 5.2 Project Setup I suggest building your data analysis projects in RStudio (Click File - New project - New Directory - Empty project). Then assign a name for your project. I also recommend the following tips if you are familiar with it. Use git/github for version control of your code and to sync your project online. Don’t use your name for your project because other people might collaborate with you and someone might check your data when you publish your papers. Each project should be the work for one paper or one chapter in your thesis. Use a workflow document (txt or doc) in your project to record all of the steps and code you performed for this project. 
Treat this document as a digital version of your experiment notebook. Use a data folder in your project folder for the raw data and the results you get in data analysis. Use a figure folder in your project folder for the figures. Use a manuscript folder in your project folder for the manuscript (you could write the paper in RStudio with the help of a template in R Markdown). Just double click \\[yourprojectname\\].Rproj to start your project. 5.3 Data sharing See this paper(Haug, Salek, and Steinbeck 2017): MetaboLights (EU based), The Metabolomics Workbench (US based), MetaboBank (Japan based), MetabolomeXchange (search engine), and MetabolomeExpress, a public place to process, interpret and share GC/MS metabolomics datasets(Carroll, Badger, and Harvey Millar 2010). 5.4 Contest CASMI is a contest for small molecule identification(Blaženović et al. 2017). References Alka, Oliver, Timo Sachsenberg, Leon Bichmann, Julianus Pfeuffer, Hendrik Weisser, Samuel Wein, Eugen Netz, Marc Rurik, Oliver Kohlbacher, and Hannes Röst. 2020. “CHAPTER 6:OpenMS and KNIME for Mass Spectrometry Data Processing.” In Processing Metabolomics and Proteomics Data with Open Software, 201–31. https://doi.org/10.1039/9781788019880-00201. Aron, Allegra T., Emily C. Gentry, Kerry L. McPhail, Louis-Félix Nothias, Mélissa Nothias-Esposito, Amina Bouslimani, Daniel Petras, et al. 2020. “Reproducible Molecular Networking of Untargeted Mass Spectrometry Data Using GNPS.” Nature Protocols 15 (6): 1954–91. https://doi.org/10.1038/s41596-020-0317-5. Bai, Caihong, Suyun Xu, Jingyi Tang, Yuxi Zhang, Jiahui Yang, and Kaifeng Hu. 2022. “A ‘Shape-Orientated’ Algorithm Employing an Adapted Marr Wavelet and Shape Matching Index Improves the Performance of Continuous Wavelet Transform for Chromatographic Peak Detection and Quantification.” Journal of Chromatography A 1673 (June): 463086. https://doi.org/10.1016/j.chroma.2022.463086. Baygi, Sadjad Fakouri, Sanjay K. Banerjee, Praloy Chakraborty, Yashwant Kumar, and Dinesh Kumar Barupal. 2022. 
“IDSL.UFA Assigns High-Confidence Molecular Formula Annotations for Untargeted LC/HRMS Data Sets in Metabolomics and Exposomics.” Analytical Chemistry 94 (39): 13315–22. https://doi.org/10.1021/acs.analchem.2c00563. Bertsch, Andreas, Clemens Gröpl, Knut Reinert, and Oliver Kohlbacher. 2011. “OpenMS and TOPP: Open Source Software for LC-MS Data Analysis.” In Data Mining in Proteomics: From Standards to Applications, edited by Michael Hamacher, Martin Eisenacher, and Christian Stephan, 353–67. Methods in Molecular Biology. Totowa, NJ: Humana Press. https://doi.org/10.1007/978-1-60761-987-1_23. Bittremieux, Wout, Nicole E. Avalon, Sydney P. Thomas, Sarvar A. Kakhkhorov, Alexander A. Aksenov, Paulo Wender P. Gomes, Christine M. Aceves, et al. 2023. “Open Access Repository-Scale Propagated Nearest Neighbor Suspect Spectral Library for Untargeted Metabolomics.” Nature Communications 14 (1): 8488. https://doi.org/10.1038/s41467-023-44035-y. Blaženović, Ivana, Tobias Kind, Hrvoje Torbašinović, Slobodan Obrenović, Sajjan S. Mehta, Hiroshi Tsugawa, Tobias Wermuth, et al. 2017. “Comprehensive Comparison of in Silico MS/MS Fragmentation Tools of the CASMI Contest: Database Boosting Is Needed to Achieve 93% Accuracy.” Journal of Cheminformatics 9 (1): 32. https://doi.org/10.1186/s13321-017-0219-x. Bonnefille, Bénilde, Oskar Karlsson, May Britt Rian, Rubhana Raqib, Faruque Parvez, Stefano Papazian, M. Sirajul Islam, and Jonathan W. Martin. 2023. “Nontarget Analysis of Polluted Surface Waters in Bangladesh Using Open Science Workflows.” Environmental Science &amp; Technology, April. https://doi.org/10.1021/acs.est.2c08200. Carroll, Adam J., Murray R. Badger, and A. Harvey Millar. 2010. “The MetabolomeExpress Project: Enabling Web-Based Processing, Analysis and Transparent Dissemination of GC/MS Metabolomics Datasets.” BMC Bioinformatics 11 (1): 376. https://doi.org/10.1186/1471-2105-11-376. Chang, Hui-Yin, Ching-Tai Chen, T. 
Mamie Lih, Ke-Shiuan Lynn, Chiun-Gung Juo, Wen-Lian Hsu, and Ting-Yi Sung. 2016. “iMet-Q: A User-Friendly Tool for Label-Free Metabolomics Quantitation Using Dynamic Peak-Width Determination.” PLOS ONE 11 (1): e0146112. https://doi.org/10.1371/journal.pone.0146112. Clasquin, Michelle F., Eugene Melamud, and Joshua D. Rabinowitz. 2012. “LC-MS Data Processing with MAVEN: A Metabolomic Analysis and Visualization Engine.” Current Protocols in Bioinformatics 37 (1): 14.11.1–23. https://doi.org/10.1002/0471250953.bi1411s37. Colby, Sean M., Christine H. Chang, Jessica L. Bade, Jamie R. Nunez, Madison R. Blumer, Daniel J. Orton, Kent J. Bloodsworth, et al. 2022. “DEIMoS: An Open-Source Tool for Processing High-Dimensional Mass Spectrometry Data.” Analytical Chemistry 94 (16): 6130–38. https://doi.org/10.1021/acs.analchem.1c05017. Creek, Darren J., Andris Jankevics, Karl E. V. Burgess, Rainer Breitling, and Michael P. Barrett. 2012. “IDEOM: An Excel Interface for Analysis of LC–MS-based Metabolomics Data.” Bioinformatics 28 (7): 1048–49. https://doi.org/10.1093/bioinformatics/bts069. Daly, Rónán, Simon Rogers, Joe Wandy, Andris Jankevics, Karl E. V. Burgess, and Rainer Breitling. 2014. “MetAssign: Probabilistic Annotation of Metabolites from LC–MS Data Using a Bayesian Clustering Approach.” Bioinformatics 30 (19): 2764–71. https://doi.org/10.1093/bioinformatics/btu370. Delabriere, Alexis, Philipp Warmer, Vincenth Brennsteiner, and Nicola Zamboni. 2021. “SLAW: A Scalable and Self-Optimizing Processing Workflow for Untargeted LC-MS.” Analytical Chemistry 93 (45): 15024–32. https://doi.org/10.1021/acs.analchem.1c02687. Dodds, James N., Lingjue Wang, Gary J. Patti, and Erin S. Baker. 2022. “Combining Isotopologue Workflows and Simultaneous Multidimensional Separations to Detect, Identify, and Validate Metabolites in Untargeted Analyses.” Analytical Chemistry 94 (5): 2527–35. https://doi.org/10.1021/acs.analchem.1c04430. Domingo-Almenara, Xavier, J. 
Rafael Montenegro-Burke, Julijana Ivanisevic, Aurelien Thomas, Jonathan Sidibé, Tony Teav, Carlos Guijas, et al. 2018. “XCMS-MRM and METLIN-MRM: A Cloud Library and Public Resource for Targeted Analysis of Small Molecules.” Nature Methods 15 (9): 681–84. https://doi.org/10.1038/s41592-018-0110-3. Domingo-Almenara, Xavier, and Gary Siuzdak. 2020. “Metabolomics Data Processing Using XCMS.” In Computational Methods and Data Analysis for Metabolomics, edited by Shuzhao Li, 11–24. Methods in Molecular Biology. New York, NY: Springer US. https://doi.org/10.1007/978-1-0716-0239-3_2. Dos Santos, Emile Kelly Porto, and Gisele André Baptista Canuto. 2023. “Optimizing XCMS Parameters for GC-MS Metabolomics Data Processing: A Case Study.” Metabolomics: Official Journal of the Metabolomic Society 19 (4): 26. https://doi.org/10.1007/s11306-023-01992-1. Dührkop, Kai, Markus Fleischauer, Marcus Ludwig, Alexander A. Aksenov, Alexey V. Melnik, Marvin Meusel, Pieter C. Dorrestein, Juho Rousu, and Sebastian Böcker. 2019. “SIRIUS 4: A Rapid Tool for Turning Tandem Mass Spectra into Metabolite Structure Information.” Nature Methods 16 (4): 299–302. https://doi.org/10.1038/s41592-019-0344-8. Dührkop, Kai, Louis-Félix Nothias, Markus Fleischauer, Raphael Reher, Marcus Ludwig, Martin A. Hoffmann, Daniel Petras, et al. 2020. “Systematic Classification of Unknown Metabolites Using High-Resolution Fragmentation Mass Spectra.” Nature Biotechnology, November, 1–10. https://doi.org/10.1038/s41587-020-0740-8. Edmands, William M. B., Dinesh K. Barupal, and Augustin Scalbert. 2015. “MetMSLine: An Automated and Fully Integrated Pipeline for Rapid Processing of High-Resolution LC–MS Metabolomic Datasets.” Bioinformatics 31 (5): 788–90. https://doi.org/10.1093/bioinformatics/btu705. Edmands, William M. B., Josie Hayes, and Stephen M. Rappaport. 2018. “SimExTargId: A Comprehensive Package for Real-Time LC-MS Data Acquisition and Analysis.” Bioinformatics 34 (20): 3589–90. 
https://doi.org/10.1093/bioinformatics/bty218. Edmands, William M. B., Lauren Petrick, Dinesh K. Barupal, Augustin Scalbert, Mark J. Wilson, Jeffrey K. Wickliffe, and Stephen M. Rappaport. 2017. “compMS2Miner: An Automatable Metabolite Identification, Visualization, and Data-Sharing R Package for High-Resolution LC–MS Data Sets.” Analytical Chemistry 89 (7): 3919–28. https://doi.org/10.1021/acs.analchem.6b02394. Eilertz, Daniel, Michael Mitterer, and Joerg M. Buescher. 2022. “automRm: An R Package for Fully Automatic LC-QQQ-MS Data Preprocessing Powered by Machine Learning.” Analytical Chemistry 94 (16): 6163–71. https://doi.org/10.1021/acs.analchem.1c05224. Fernández-Albert, Francesc, Rafael Llorach, Cristina Andrés-Lacueva, and Alexandre Perera. 2014. “An R Package to Analyse LC/MS Metabolomic Data: MAIT (Metabolite Automatic Identification Toolkit).” Bioinformatics 30 (13): 1937–39. https://doi.org/10.1093/bioinformatics/btu136. Forsberg, Erica M., Tao Huan, Duane Rinehart, H. Paul Benton, Benedikt Warth, Brian Hilmers, and Gary Siuzdak. 2018. “Data Processing, Multi-Omic Pathway Mapping, and Metabolite Activity Analysis Using XCMS Online.” Nature Protocols 13 (4): 633–51. https://doi.org/10.1038/nprot.2017.151. Giacomoni, Franck, Gildas Le Corguillé, Misharl Monsoor, Marion Landi, Pierre Pericard, Mélanie Pétéra, Christophe Duperier, et al. 2015. “Workflow4Metabolomics: A Collaborative Research Infrastructure for Computational Metabolomics.” Bioinformatics 31 (9): 1493–95. https://doi.org/10.1093/bioinformatics/btu813. Goracci, Laura, Paolo Tiberi, Stefano Di Bona, Stefano Bonciarelli, Giovanna Ilaria Passeri, Marta Piroddi, Simone Moretti, Claudia Volpi, Ismael Zamora, and Gabriele Cruciani. 2024. “MARS: A Multipurpose Software for Untargeted LC–MS-Based Metabolomics and Exposomics.” Analytical Chemistry, January. https://doi.org/10.1021/acs.analchem.3c03620. Guijas, Carlos, J. 
Rafael Montenegro-Burke, Xavier Domingo-Almenara, Amelia Palermo, Benedikt Warth, Gerrit Hermann, Gunda Koellensperger, et al. 2018. “METLIN: A Technology Platform for Identifying Knowns and Unknowns.” Analytical Chemistry 90 (5): 3156–64. https://doi.org/10.1021/acs.analchem.7b04424. Guo, Jian, Sam Shen, and Tao Huan. 2022. “Paramounter: Direct Measurement of Universal Parameters To Process Metabolomics Data in a ‘White Box’.” Analytical Chemistry, March. https://doi.org/10.1021/acs.analchem.1c04758. Habra, Hani, Maureen Kachman, Kevin Bullock, Clary Clish, Charles R. Evans, and Alla Karnovsky. 2021. “metabCombiner: Paired Untargeted LC-HRMS Metabolomics Feature Matching and Concatenation of Disparately Acquired Data Sets.” Analytical Chemistry 93 (12): 5028–36. https://doi.org/10.1021/acs.analchem.0c03693. Haug, Kenneth, Reza M Salek, and Christoph Steinbeck. 2017. “Global Open Data Management in Metabolomics.” Current Opinion in Chemical Biology, Omics, 36 (February): 58–63. https://doi.org/10.1016/j.cbpa.2016.12.024. Helmus, Rick, Thomas L. ter Laak, Annemarie P. van Wezel, Pim de Voogt, and Emma L. Schymanski. 2021. “patRoon: Open Source Software Platform for Environmental Mass Spectrometry Based Non-Target Screening.” Journal of Cheminformatics 13 (1): 1. https://doi.org/10.1186/s13321-020-00477-w. Hiller, Karsten, Jasper Hangebrauk, Christian Jäger, Jana Spura, Kerstin Schreiber, and Dietmar Schomburg. 2009. “MetaboliteDetector: Comprehensive Analysis Tool for Targeted and Nontargeted GC/MS Based Metabolome Analysis.” Analytical Chemistry 81 (9): 3429–39. https://doi.org/10.1021/ac802689c. Huan, Tao, Erica M. Forsberg, Duane Rinehart, Caroline H. Johnson, Julijana Ivanisevic, H. Paul Benton, Mingliang Fang, et al. 2017. “Systems Biology Guided by XCMS Online Metabolomics.” Nature Methods 14 (5): 461–62. https://doi.org/10.1038/nmeth.4260. 
Jalili, Vahid, Enis Afgan, Qiang Gu, Dave Clements, Daniel Blankenberg, Jeremy Goecks, James Taylor, and Anton Nekrutenko. 2020. “The Galaxy Platform for Accessible, Reproducible and Collaborative Biomedical Analyses: 2020 Update.” Nucleic Acids Research 48 (W1): W395–402. https://doi.org/10.1093/nar/gkaa434. Kew, William, John W. T. Blackburn, David J. Clarke, and Dušan Uhrín. 2017. “Interactive van Krevelen Diagrams – Advanced Visualisation of Mass Spectrometry Data of Complex Mixtures.” Rapid Communications in Mass Spectrometry 31 (7): 658–62. https://doi.org/10.1002/rcm.7823. Kind, Tobias, Hiroshi Tsugawa, Tomas Cajka, Yan Ma, Zijuan Lai, Sajjan S. Mehta, Gert Wohlgemuth, et al. 2018. “Identification of Small Molecules Using Accurate Mass MS/MS Search.” Mass Spectrometry Reviews 37 (4): 513–32. https://doi.org/10.1002/mas.21535. Lai, Zijuan, Hiroshi Tsugawa, Gert Wohlgemuth, Sajjan Mehta, Matthew Mueller, Yuxuan Zheng, Atsushi Ogiwara, et al. 2018. “Identifying Metabolites by Integrating Metabolome Databases with Mass Spectrometry Cheminformatics.” Nature Methods 15 (1): 53–56. https://doi.org/10.1038/nmeth.4512. Lassen, Johan, Kirstine Lykke Nielsen, Mogens Johannsen, and Palle Villesen. 2021. “Assessment of XCMS Optimization Methods with Machine-Learning Performance.” Analytical Chemistry 93 (40): 13459–66. https://doi.org/10.1021/acs.analchem.1c02000. Li, Shuzhao. 2020. Computational Methods and Data Analysis for Metabolomics. Springer. Li, Shuzhao, Youngja Park, Sai Duraisingham, Frederick H. Strobel, Nooruddin Khan, Quinlyn A. Soltow, Dean P. Jones, and Bali Pulendran. 2013. “Predicting Network Activity from High Throughput Metabolomics.” PLOS Computational Biology 9 (7): e1003123. https://doi.org/10.1371/journal.pcbi.1003123. Li, Shuzhao, Amnah Siddiqa, Maheshwor Thapa, Yuanye Chi, and Shujian Zheng. 2023. “Trackable and Scalable LC-MS Metabolomics Data Processing Using Asari.” Nature Communications 14 (1): 4113. 
https://doi.org/10.1038/s41467-023-39889-1. Li, Zhucui, Yan Lu, Yufeng Guo, Haijie Cao, Qinhong Wang, and Wenqing Shui. 2018. “Comprehensive Evaluation of Untargeted Metabolomics Data Processing Software in Feature Detection, Quantification and Discriminating Marker Selection.” Analytica Chimica Acta 1029 (October): 50–57. https://doi.org/10.1016/j.aca.2018.05.001. Liao, Jingyu, Yuhao Zhang, Wendan Zhang, Yuanyuan Zeng, Jing Zhao, Jingfang Zhang, Tingting Yao, et al. 2023. “Different Software Processing Affects the Peak Picking and Metabolic Pathway Recognition of Metabolomics Data.” Journal of Chromatography A 1687 (January): 463700. https://doi.org/10.1016/j.chroma.2022.463700. Libiseller, Gunnar, Michaela Dvorzak, Ulrike Kleb, Edgar Gander, Tobias Eisenberg, Frank Madeo, Steffen Neumann, et al. 2015. “IPO: A Tool for Automated Optimization of XCMS Parameters.” BMC Bioinformatics 16 (April): 118. https://doi.org/10.1186/s12859-015-0562-8. Liu, Qin, Douglas Walker, Karan Uppal, Zihe Liu, Chunyu Ma, ViLinh Tran, Shuzhao Li, Dean P. Jones, and Tianwei Yu. 2020. “Addressing the Batch Effect Issue for LC/MS Metabolomics Data in Data Preprocessing.” Scientific Reports 10 (1): 13856. https://doi.org/10.1038/s41598-020-70850-0. Liu, Youzhong, Yingjie Zhang, Tom Vennekens, Jennifer L. Lippens, Luc Duijsens, Danh Bui-Thi, Kris Laukens, and Thomas de Vijlder. 2023. “MeRgeION: A Multifunctional R Pipeline for Small Molecule LC-MS/MS Data Processing, Searching, and Organizing.” Analytical Chemistry 95 (22): 8433–42. https://doi.org/10.1021/acs.analchem.2c04343. Ludwig, Marcus, Louis-Félix Nothias, Kai Dührkop, Irina Koester, Markus Fleischauer, Martin A. Hoffmann, Daniel Petras, et al. 2020. “Database-Independent Molecular Formula Annotation Using Gibbs Sampling Through ZODIAC.” Nature Machine Intelligence 2 (10): 629–41. https://doi.org/10.1038/s42256-020-00234-6. Mahieu, Nathaniel G., Jonathan L. Spalding, Susan J. Gelman, and Gary J. Patti. 2016. 
“Defining and Detecting Complex Peak Relationships in Mass Spectral Data: The Mz.unity Algorithm.” Analytical Chemistry 88 (18): 9037–46. https://doi.org/10.1021/acs.analchem.6b01702. Mahieu, Nathaniel G., Jonathan L. Spalding, and Gary J. Patti. 2016. “Warpgroup: Increased Precision of Metabolomic Data Processing by Consensus Integration Bound Analysis.” Bioinformatics 32 (2): 268–75. https://doi.org/10.1093/bioinformatics/btv564. Matsuo, Teruko, Hiroshi Tsugawa, Hiromi Miyagawa, and Eiichiro Fukusaki. 2017. “Integrated Strategy for Unknown EI–MS Identification Using Quality Control Calibration Curve, Multivariate Analysis, EI–MS Spectral Database, and Retention Index Prediction.” Analytical Chemistry 89 (12): 6766–73. https://doi.org/10.1021/acs.analchem.7b01010. McLean, Craig, and Elizabeth B. Kujawinski. 2020. “AutoTuner: High Fidelity and Robust Parameter Selection for Metabolomics Data Processing.” Analytical Chemistry 92 (8): 5724–32. https://doi.org/10.1021/acs.analchem.9b04804. Melamud, Eugene, Livia Vastag, and Joshua D. Rabinowitz. 2010. “Metabolomic Analysis and Visualization Engine for LC-MS Data.” Analytical Chemistry 82 (23): 9818–26. https://doi.org/10.1021/ac1021166. Montenegro-Burke, J. Rafael, Aries E. Aisporna, H. Paul Benton, Duane Rinehart, Mingliang Fang, Tao Huan, Benedikt Warth, et al. 2017. “Data Streaming for Metabolomics: Accelerating Data Processing and Analysis from Days to Minutes.” Analytical Chemistry 89 (2): 1254–59. https://doi.org/10.1021/acs.analchem.6b03890. Myers, Owen D., Susan J. Sumner, Shuzhao Li, Stephen Barnes, and Xiuxia Du. 2017. “Detailed Investigation and Comparison of the XCMS and MZmine 2 Chromatogram Construction and Chromatographic Peak Detection Methods for Preprocessing Mass Spectrometry Metabolomics Data.” Analytical Chemistry 89 (17): 8689–95. https://doi.org/10.1021/acs.analchem.7b01069. Nothias, Louis-Félix, Daniel Petras, Robin Schmid, Kai Dührkop, Johannes Rainer, Abinesh Sarvepalli, Ivan Protsyuk, et al. 
2020. “Feature-Based Molecular Networking in the GNPS Analysis Environment.” Nature Methods 17 (9): 905–8. https://doi.org/10.1038/s41592-020-0933-6. Palmer, Andrew, Prasad Phapale, Ilya Chernyavsky, Regis Lavigne, Dominik Fay, Artem Tarasov, Vitaly Kovalev, et al. 2017. “FDR-controlled Metabolite Annotation for High-Resolution Imaging Mass Spectrometry.” Nature Methods 14 (1): 57–60. https://doi.org/10.1038/nmeth.4072. Pang, Zhiqiang, Jasmine Chong, Shuzhao Li, and Jianguo Xia. 2020. “MetaboAnalystR 3.0: Toward an Optimized Workflow for Global Metabolomics.” Metabolites 10 (5): 186. https://doi.org/10.3390/metabo10050186. Petras, Daniel, Vanessa V. Phelan, Deepa Acharya, Andrew E. Allen, Allegra T. Aron, Nuno Bandeira, Benjamin P. Bowen, et al. 2021. “GNPS Dashboard: Collaborative Exploration of Mass Spectrometry Data in the Web Browser.” Nature Methods, December, 1–3. https://doi.org/10.1038/s41592-021-01339-5. Pfeuffer, Julianus, Chris Bielow, Samuel Wein, Kyowon Jeong, Eugen Netz, Axel Walter, Oliver Alka, et al. 2024. “OpenMS 3 Enables Reproducible Analysis of Large-Scale Mass Spectrometry Data.” Nature Methods 21 (3): 365–67. https://doi.org/10.1038/s41592-024-02197-7. Pfeuffer, Julianus, Timo Sachsenberg, Oliver Alka, Mathias Walzer, Alexander Fillbrunn, Lars Nilse, Oliver Schilling, Knut Reinert, and Oliver Kohlbacher. 2017. “OpenMS – A Platform for Reproducible Analysis of Mass Spectrometry Data.” Journal of Biotechnology, Bioinformatics Solutions for Big Data Analysis in Life Sciences presented by the German Network for Bioinformatics Infrastructure, 261 (November): 142–48. https://doi.org/10.1016/j.jbiotec.2017.05.016. Pluskal, Tomáš, Sandra Castillo, Alejandro Villar-Briones, and Matej Orešič. 2010. “MZmine 2: Modular Framework for Processing, Visualizing, and Analyzing Mass Spectrometry-Based Molecular Profile Data.” BMC Bioinformatics 11: 395. https://doi.org/10.1186/1471-2105-11-395. 
Pluskal, Tomáš, Ansgar Korf, Aleksandr Smirnov, Robin Schmid, Timothy R. Fallon, Xiuxia Du, and Jing-Ke Weng. 2020. “CHAPTER 7:Metabolomics Data Analysis Using MZmine.” In Processing Metabolomics and Proteomics Data with Open Software, 232–54. https://doi.org/10.1039/9781788019880-00232. Plyushchenko, Ivan V., Elizaveta S. Fedorova, Natalia V. Potoldykova, Konstantin A. Polyakovskiy, Alexander I. Glukhov, and Igor A. Rodin. 2022. “Omics Untargeted Key Script: R-Based Software Toolbox for Untargeted Metabolomics with Bladder Cancer Biomarkers Discovery Case Study.” Journal of Proteome Research 21 (3): 833–47. https://doi.org/10.1021/acs.jproteome.1c00392. Riquelme, Gabriel, Nicolás Zabalegui, Pablo Marchi, Christina M. Jones, and María Eugenia Monge. 2020. “A Python-Based Pipeline for Preprocessing LC–MS Data for Untargeted Metabolomics Workflows.” Metabolites 10 (10): 416. https://doi.org/10.3390/metabo10100416. Röst, Hannes L., Timo Sachsenberg, Stephan Aiche, Chris Bielow, Hendrik Weisser, Fabian Aicheler, Sandro Andreotti, et al. 2016. “OpenMS: A Flexible Open-Source Software Platform for Mass Spectrometry Data Analysis.” Nature Methods 13 (9): 741–48. https://doi.org/10.1038/nmeth.3959. Röst, Hannes L., Uwe Schmitt, Ruedi Aebersold, and Lars Malmström. 2014. “pyOpenMS: A Python-based Interface to the OpenMS Mass-Spectrometry Algorithm Library.” PROTEOMICS 14 (1): 74–77. https://doi.org/10.1002/pmic.201300246. Rurik, Marc, Oliver Alka, Fabian Aicheler, and Oliver Kohlbacher. 2020. “Metabolomics Data Processing Using OpenMS.” In Computational Methods and Data Analysis for Metabolomics, edited by Shuzhao Li, 49–60. Methods in Molecular Biology. New York, NY: Springer US. https://doi.org/10.1007/978-1-0716-0239-3_4. Scheltema, Richard A., Andris Jankevics, Ritsert C. Jansen, Morris A. Swertz, and Rainer Breitling. 2011. “PeakML/mzMatch: A File Format, Java Library, R Library, and Tool-Chain for Mass Spectrometry Data Analysis.” Analytical Chemistry 83 (7): 2786–93. 
https://doi.org/10.1021/ac2000994. Scheubert, Kerstin, Franziska Hufsky, Daniel Petras, Mingxun Wang, Louis-Félix Nothias, Kai Dührkop, Nuno Bandeira, Pieter C. Dorrestein, and Sebastian Böcker. 2017. “Significance Estimation for Large Scale Metabolomics Annotations by Spectral Matching.” Nature Communications 8 (1): 1494. https://doi.org/10.1038/s41467-017-01318-5. Shen, Xiaotao, Hong Yan, Chuchu Wang, Peng Gao, Caroline H. Johnson, and Michael P. Snyder. 2022. “TidyMass an Object-Oriented Reproducible Analysis Framework for LC–MS Data.” Nature Communications 13 (1): 4365. https://doi.org/10.1038/s41467-022-32155-w. Silva, Ricardo R. da, Mingxun Wang, Louis-Félix Nothias, Justin J. J. van der Hooft, Andrés Mauricio Caraballo-Rodríguez, Evan Fox, Marcy J. Balunas, Jonathan L. Klassen, Norberto Peporine Lopes, and Pieter C. Dorrestein. 2018. “Propagating Annotations of Molecular Networks Using in Silico Fragmentation.” PLOS Computational Biology 14 (4): e1006089. https://doi.org/10.1371/journal.pcbi.1006089. Stancliffe, Ethan, Michaela Schwaiger-Haber, Miriam Sindelar, Matthew J. Murphy, Mette Soerensen, and Gary J. Patti. 2022. “An Untargeted Metabolomics Workflow That Scales to Thousands of Samples for Population-Based Studies.” Analytical Chemistry, December. https://doi.org/10.1021/acs.analchem.2c01270. Tautenhahn, Ralf, Kevin Cho, Winnie Uritboonthai, Zhengjiang Zhu, Gary J. Patti, and Gary Siuzdak. 2012. “An Accelerated Workflow for Untargeted Metabolomics Using the METLIN Database.” Nature Biotechnology 30 (9): 826–28. https://doi.org/10.1038/nbt.2348. Treutler, Hendrik, Hiroshi Tsugawa, Andrea Porzel, Karin Gorzolka, Alain Tissier, Steffen Neumann, and Gerd Ulrich Balcke. 2016. “Discovering Regulated Metabolite Families in Untargeted Metabolomics Studies.” Analytical Chemistry 88 (16): 8082–90. https://doi.org/10.1021/acs.analchem.6b01569. 
Tsugawa, Hiroshi, Tomas Cajka, Tobias Kind, Yan Ma, Brendan Higgins, Kazutaka Ikeda, Mitsuhiro Kanazawa, Jean VanderGheynst, Oliver Fiehn, and Masanori Arita. 2015. “MS-DIAL: Data-Independent MS/MS Deconvolution for Comprehensive Metabolome Analysis.” Nature Methods 12 (6): 523–26. https://doi.org/10.1038/nmeth.3393. Tsugawa, Hiroshi, Tobias Kind, Ryo Nakabayashi, Daichi Yukihira, Wataru Tanaka, Tomas Cajka, Kazuki Saito, Oliver Fiehn, and Masanori Arita. 2016. “Hydrogen Rearrangement Rules: Computational MS/MS Fragmentation and Structure Elucidation Using MS-FINDER Software.” Analytical Chemistry 88 (16): 7946–58. https://doi.org/10.1021/acs.analchem.6b00770. Uchino, Haruki, Hiroshi Tsugawa, Hidenori Takahashi, and Makoto Arita. 2022. “Computational Mass Spectrometry Accelerates C = C Position-Resolved Untargeted Lipidomics Using Oxygen Attachment Dissociation.” Communications Chemistry 5 (1): 1–13. https://doi.org/10.1038/s42004-022-00778-1. Uppal, Karan, Quinlyn A. Soltow, Frederick H. Strobel, W. Stephen Pittard, Kim M. Gernert, Tianwei Yu, and Dean P. Jones. 2013. “xMSanalyzer: Automated Pipeline for Improved Feature Detection and Downstream Analysis of Large-Scale, Non-Targeted Metabolomics Data.” BMC Bioinformatics 14 (1): 15. https://doi.org/10.1186/1471-2105-14-15. Uppal, Karan, Douglas I. Walker, and Dean P. Jones. 2017. “xMSannotator: An R Package for Network-Based Annotation of High-Resolution Metabolomics Data.” Analytical Chemistry 89 (2): 1063–67. https://doi.org/10.1021/acs.analchem.6b01214. Volikov, Alexander, Gleb Rukhovich, and Irina V. Perminova. 2023. “NOMspectra: An Open-Source Python Package for Processing High Resolution Mass Spectrometry Data on Natural Organic Matter.” Journal of the American Society for Mass Spectrometry, June. https://doi.org/10.1021/jasms.3c00003. Wang, Mingxun, Jeremy J. Carver, Vanessa V. Phelan, Laura M. 
Sanchez, Neha Garg, Yao Peng, Don Duy Nguyen, et al. 2016. “Sharing and Community Curation of Mass Spectrometry Data with Global Natural Products Social Molecular Networking.” Nature Biotechnology 34 (8): 828–37. https://doi.org/10.1038/nbt.3597. Wang, Yang, Fang Liu, Peng Li, Chengwei He, Ruibing Wang, Huanxing Su, and Jian-Bo Wan. 2016. “An Improved Pseudotargeted Metabolomics Approach Using Multiple Ion Monitoring with Time-Staggered Ion Lists Based on Ultra-High Performance Liquid Chromatography/Quadrupole Time-of-Flight Mass Spectrometry.” Analytica Chimica Acta 927 (July): 82–88. https://doi.org/10.1016/j.aca.2016.05.008. Weber, Ralf J. M., Thomas N. Lawson, Reza M. Salek, Timothy M. D. Ebbels, Robert C. Glen, Royston Goodacre, Julian L. Griffin, et al. 2017. “Computational Tools and Workflows in Metabolomics: An International Survey Highlights the Opportunity for Harmonisation Through Galaxy.” Metabolomics 13 (2). https://doi.org/10.1007/s11306-016-1147-x. Wen, Bo, Zhanlong Mei, Chunwei Zeng, and Siqi Liu. 2017. “metaX: A Flexible and Comprehensive Software for Processing Metabolomics Data.” BMC Bioinformatics 18 (March): 183. https://doi.org/10.1186/s12859-017-1579-y. Xue, Jingchuan, Carlos Guijas, H. Paul Benton, Benedikt Warth, and Gary Siuzdak. 2020. “METLIN MS 2 Molecular Standards Database: A Broad Chemical and Biological Resource.” Nature Methods 17 (10): 953–54. https://doi.org/10.1038/s41592-020-0942-5. Yu, Miao, Georgia Dolios, and Lauren Petrick. 2022. “Reproducible Untargeted Metabolomics Workflow for Exhaustive MS2 Data Acquisition of MS1 Features.” Journal of Cheminformatics 14 (1): 6. https://doi.org/10.1186/s13321-022-00586-8. Yu, Miao, Sofia Lendor, Anna Roszkowska, Mariola Olkowicz, Leslie Bragg, Mark Servos, and Janusz Pawliszyn. 2020. “Metabolic Profile of Fish Muscle Tissue Changes with Sampling Method, Storage Strategy and Time.” Analytica Chimica Acta 1136 (November): 42–50. https://doi.org/10.1016/j.aca.2020.08.050. 
Yu, Miao, Mariola Olkowicz, and Janusz Pawliszyn. 2019. “Structure/Reaction Directed Analysis for LC-MS Based Untargeted Analysis.” Analytica Chimica Acta 1050 (March): 16–24. https://doi.org/10.1016/j.aca.2018.10.062. Yu, Tianwei, Youngja Park, Jennifer M. Johnson, and Dean P. Jones. 2009. “apLCMS—Adaptive Processing of High-Resolution LC/MS Data.” Bioinformatics 25 (15): 1930–36. https://doi.org/10.1093/bioinformatics/btp291. Yu, Yong-Jie, Qing-Xia Zheng, Yue-Ming Zhang, Qian Zhang, Yu-Ying Zhang, Ping-Ping Liu, Peng Lu, et al. 2019. “Automatic Data Analysis Workflow for Ultra-High Performance Liquid Chromatography-High Resolution Mass Spectrometry-Based Metabolomics.” Journal of Chromatography A 1585 (January): 172–81. https://doi.org/10.1016/j.chroma.2018.11.070. Zhang, Yu-Ying, Qian Zhang, Yue-Ming Zhang, Wei-Wei Wang, Li Zhang, Yong-Jie Yu, Chang-Cai Bai, Ji-Zhao Guo, Hai-Yan Fu, and Yuanbin She. 2020. “A Comprehensive Automatic Data Analysis Strategy for Gas Chromatography-Mass Spectrometry Based Untargeted Metabolomics.” Journal of Chromatography A 1616 (April): 460787. https://doi.org/10.1016/j.chroma.2019.460787. Zheng, Fujian, Lei You, Wangshu Qin, Runze Ouyang, Wangjie Lv, Lei Guo, Xin Lu, Enyou Li, Xinjie Zhao, and Guowang Xu. 2022. “MetEx: A Targeted Extraction Strategy for Improving the Coverage and Accuracy of Metabolite Annotation in Liquid Chromatography–High-Resolution Mass Spectrometry Data.” Analytical Chemistry 94 (24): 8561–69. https://doi.org/10.1021/acs.analchem.1c04783. Zheng, Fujian, Xinjie Zhao, Zhongda Zeng, Lichao Wang, Wangjie Lv, Qingqing Wang, and Guowang Xu. 2020. “Development of a Plasma Pseudotargeted Metabolomics Method Based on Ultra-High-Performance Liquid Chromatography–Mass Spectrometry.” Nature Protocols 15 (8): 2519–37. https://doi.org/10.1038/s41596-020-0341-5. 
"],["raw-data-pretreatment.html", "Chapter 6 Raw data pretreatment 6.1 Data visualization 6.2 Peak extraction 6.3 MS/MS 6.4 Retention Time Correction 6.5 Filling missing values 6.6 Spectral deconvolution 6.7 Dynamic Range 6.8 RSD/fold change Filter 6.9 Power Analysis Filter", " Chapter 6 Raw data pretreatment Raw data from instruments such as LC-MS or GC-MS are hard to analyze directly. The structure of such data can be summarized as follows: indexed scans with time stamps; each scan contains a full-scan mass spectrum. Common open formats for mass spectrometry data are mzXML, mzML and CDF. MassComp could further shrink the data size (R. Yang, Chen, and Ochoa 2019). The ProteoWizard Toolkit provides a set of open-source, cross-platform software libraries and tools (Chambers et al. 2012). msconvert is one tool in this toolkit. mzML2ISA &amp; nmrML2ISA can generate enriched ISA-Tab metadata files from metabolomics XML data (Larralde et al. 2017). 6.1 Data visualization You can use msxpertsuite for MS data visualization. It supports visualization and mining of biological mass spectrometry data with full JavaScript ability (Rusconi 2019). FTMSVisualization is a suite of tools for visualizing complex mixture FT-MS data (Kew et al. 2017). 6.2 Peak extraction GC/LC-MS data are usually shown as a matrix, with columns standing for retention times and rows standing for masses after binning them into small cells. Figure 6.1: Demo of GC/LC-MS data Conversion of the mass-retention time matrix into a vector of selected MS peaks at certain retention times is the basic idea of peak extraction. You could build an extracted ion chromatogram (EIC) for each mass-to-charge ratio and use the change of the trace slope to determine whether there is a peak or not. Then we could integrate this peak to get the peak area and retention time. 
intensity &lt;- c(10,10,10,10,10,14,19,25,30,33,26,21,16,12,11,10,9,10,11,10) time &lt;- c(1:20) plot(intensity~time, type = &#39;o&#39;, main = &#39;EIC&#39;) Figure 6.2: Demo of EIC with peak However, due to the limited mass accuracy of the instrument, the detected mass-to-charge ratio shifts slightly between scans, and the EIC would fail if different scans record the intensity at different mass-to-charge ratios. The matchedFilter algorithm (Smith et al. 2006) solves this issue by binning the data in the m/z dimension. Adjacent chromatographic slices can be combined to find a clean signal, which is fitted with a fixed second-derivative Gaussian with a full width at half-maximum (fwhm) of 30 s to find peaks with about 1.5-4 times the signal peak width. Then the integration is performed on the fitted area. Figure 6.3: Demo of matchedfilter The centWave algorithm (Tautenhahn, Böttcher, and Neumann 2008), based on detection of regions of interest (ROIs) followed by a Continuous Wavelet Transform (CWT), is preferred for high-resolution mass spectra. An ROI is a region with a stable mass for a certain time. Once the ROIs are found, the peak shape is evaluated and an ROI can be extended if needed. This algorithm uses a prefilter to accelerate processing: a prefilter of 3 and 100 means the ROI should contain at least 3 scans with intensity above 100. centWave uses a peak width range, which should be checked on pooled QC samples. Another important parameter is ppm. It is the maximum allowed m/z deviation between scans when locating ROIs. This is different from the mass accuracy claimed by the vendor, and you usually need to set it larger than the vendor&#39;s value. profparam is used for filling or aligning peaks rather than for peak picking. snthr is the cutoff of the signal-to-noise ratio. An open-source feature detection algorithm for non-target LC–MS analytics can help in understanding the peak picking process (Dietrich, Wick, and Ternes 2022). 
A pseudo F-ratio moving window can also be used to select untargeted regions of interest in gas chromatography-mass spectrometry data (Giebelhaus et al. 2022). mzRAPP enables the generation of benchmark peak lists by using an internal set of known molecules in the analyzed data set to compare workflows (El Abiead et al. 2022). G-Aligner is a graph-based feature alignment method for untargeted LC–MS-based metabolomics (Ruimin Wang et al. 2023), which takes the importance of feature matching into account. qBinning is an algorithm for constructing extracted ion chromatograms (EICs) based on statistical principles and without the need to set user parameters (Reuschenbach et al. 2023). Machine learning can also be used for feature extraction. A deep learning framework for LC-MS feature detection on 2D pseudo-color images could improve the peak picking process (F. Zhao, Huang, and Zhang 2021). Deep learning-assisted peak curation (NeatMS) can also be used for large-scale LC-MS metabolomics (Gloaguen, Kirwan, and Beule 2022). A feature selection pipeline based on a neural network and a genetic algorithm can be applied to metabolomics data analysis (Lisitsyna et al. 2022). 6.3 MS/MS Various data acquisition workflows are reviewed in (Fenaille et al. 2017). Before using MS/MS annotation, be aware that both DDA and DIA can miss precursors found in MS1 (J. Guo and Huan 2020; Stincone et al. 2023). 6.3.1 MRM decoMS2: an untargeted metabolomic workflow to improve structural characterization of metabolites (Nikolskiy et al. 2013). It requires two different collision energies, low (usually 0 V) and high, in each precursor range to solve the mathematical equations. A data-independent targeted metabolomics method could connect MS1 and MRM (Y. Chen et al. 2017). DecoID: Python-based database-assisted deconvolution of MS/MS spectra. 6.3.2 DDA The coverage of DDA could be enhanced by a feature classification strategy (Y. 
Hu, Cai, and Huan 2019) or an iterative process (Anderson et al. 2021). 6.3.3 DIA DIA methods, including MSE, stepwise windows and random windows, are summarized in (Bilbao et al. 2015), and a comparison can be found in (Zhu, Chen, and Subramanian 2014). msPurity: automated evaluation of precursor ion purity for mass spectrometry-based fragmentation in metabolomics (Lawson et al. 2017). ULSA: a deconvolution algorithm and a universal library search algorithm (ULSA), based on Matlab, for the analysis of complex spectra generated via data-independent acquisition (Samanipour et al. 2018). MS-DIAL was initially designed for DIA (Tsugawa et al. 2015; Treutler and Neumann 2016). DIA-Umpire is a comprehensive computational framework for data-independent acquisition proteomics (Tsou et al. 2015). MetDIA performs targeted metabolite extraction of multiplexed MS/MS spectra generated by data-independent acquisition (H. Li et al. 2016). The MetaboDIA workflow builds customized MS/MS spectral libraries from a user’s own data-dependent acquisition (DDA) data and performs MS/MS-based quantification with DIA data, thus complementing conventional MS1-based quantification (G. Chen et al. 2017). SWATHtoMRM: development of a high-coverage targeted metabolomics method using SWATH technology for biomarker discovery (Zha et al. 2018). Skyline is a freely available and open-source Windows client application for building Selected Reaction Monitoring (SRM) / Multiple Reaction Monitoring (MRM), Parallel Reaction Monitoring (PRM - Targeted MS/MS), Data Independent Acquisition (DIA/SWATH) and targeted DDA with MS1 quantitative methods and analyzing the resulting mass spectrometer data (Adams et al. 2020). MSstats is an R/Bioconductor package for statistical relative quantification of peptides and proteins in mass spectrometry-based proteomic experiments (Choi et al. 2014). 
It is applicable to multiple types of sample preparation, including label-free workflows, workflows that use stable isotope labeled reference proteins and peptides, and workflows that use fractionation. It is applicable to targeted Selected Reaction Monitoring (SRM), Data-Dependent Acquisition (DDA or shotgun), and Data-Independent Acquisition (DIA or SWATH-MS). The MSstats GitHub page is used for sharing source code and testing. Other related papers cover SWATH and other topics in DIA (Bonner and Hopfgartner 2018; Ruohong Wang, Yin, and Zhu 2019). MetaboAnnotatoR is designed to perform metabolite annotation of features from LC-MS All-ion fragmentation (AIF) datasets, using ion fragment databases (Graça et al. 2022). DIAMetAlyzer is a pipeline for assay library generation and targeted analysis with statistical validation (Alka et al. 2022). MetaboMSDIA: a tool for implementing data-independent acquisition in metabolomic-based mass spectrometry analysis (Ledesma-Escobar, Priego-Capote, and Calderón-Santiago 2023). CRISP: a cross-run ion selection and peak-picking tool that exploits the run-to-run consistency of DIA and simultaneously examines the DIA data from the whole set of runs to filter out interfering signals, instead of only looking at a single run at a time (Yan et al. 2023). 6.4 Retention Time Correction For a single file, we can get peaks. However, peaks must be aligned across samples to form features, so retention time correction should be performed. The basic idea behind retention time correction is to use high-quality grouped peaks to build a new retention time axis. You might choose the obiwarp method (for dramatic shifts) or loess regression (fast) to get the corrected retention times for all of the samples. Remember that the original retention times might be changed and you might need to cross-correct the data. After the correction, you could group the peaks again for a better cross-sample peak list. 
However, if you directly use obiwarp, you don’t have to group peaks before correction. A Matlab-based shift correction method is also available (H.-Y. Fu et al. 2017). Retention time correction is a parametric time warping process, and (Wehrens, Bloemberg, and Eilers 2015) is a good starting point. Meanwhile, you could use MS2 for retention time correction (Lili Li et al. 2017). A Python-based RI system and peak shift correction model has been reported to significantly enhance alignment accuracy (Hao et al. 2023). 6.5 Filling missing values Too many zeros or NAs in the peak list are problematic for statistics. We therefore usually need to integrate the area where a peak is expected. xcms 3 can use the profile matrix to fill in the missing peaks. It also has functions to impute NA values by replacing missing values with a proportion of the row minimum or with random numbers based on the row minimum. It is up to the user to select the imputation method and to control the minimum fraction of samples in a single group in which a feature must appear. Figure 6.4: Peak filling of GC/LC-MS data With many groups of samples, after the raw data pretreatment you will get another data matrix, with columns standing for peaks at certain retention times and rows standing for samples. Figure 6.5: Demo of many GC/LC-MS data 6.6 Spectral deconvolution Without structural information about a compound, peak extraction suffers from interference by other compounds. At the same retention time, co-eluting compounds might share similar masses. Hard ionization methods such as electron impact ionization (EI), as well as APPI, suffer from this issue. It is therefore hard to distinguish the origin of co-eluting peaks, and a deconvolution method can be used to separate them into groups according to similar chromatographic behavior. The computational tool eRah could be a better solution for the whole process (Domingo-Almenara et al. 2016). ADAP-GC 3.0 could also be helpful for this issue (Y. Ni et al. 2016). 
Other solutions for GC can be found in (Styczynski et al. 2007; T.-F. Tian et al. 2016; Xiuxia Du and Zeisel 2013). 6.7 Dynamic Range Another issue is the dynamic range. In metabolomics, peaks can fall below the detection limit or exceed the upper limit of detection. Such dynamic range issues can cause loss of information. 6.7.1 Non-detects Some of the data are limited by the limit of detection. Thus we need methods to impute the data if we don’t want to lose information by deleting the NA or 0 values. Two major approaches to imputation can be used. The first is a model-free method, such as replacing with half the minimum of the values across the data, 0, 1, or the mean/median across the data (the enviGCMS package can do this via the getimputation function). The second is a model-based method such as a linear model, random forest, KNN, or PCA. Try the simputation package for various imputation methods. As mentioned before, you could also use imputeRowMin or imputeRowMinRand within the xcms package to perform imputation. Tobit regression is preferred for censored data. You might also choose maximum likelihood estimation (estimate the mean and standard deviation by MLE, create 10 complete samples, and pool the results from the 10 individual analyses). x &lt;- rnorm(1000,1) x[x&lt;0] &lt;- 0 y &lt;- x*10+1 library(AER) tfit &lt;- tobit(y ~ x, left = 0) summary(tfit) ## ## Call: ## tobit(formula = y ~ x, left = 0) ## ## Observations: ## Total Left-censored Uncensored Right-censored ## 1000 0 1000 0 ## ## Coefficients: ## Estimate Std. Error z value Pr(&gt;|z|) ## (Intercept) 1.0000 0.4366 2.29 0.022 * ## x 10.0000 0.3162 31.62 &lt;2e-16 *** ## Log(scale) 2.1627 0.0000 Inf &lt;2e-16 *** ## --- ## Signif. 
codes: 0 &#39;***&#39; 0.001 &#39;**&#39; 0.01 &#39;*&#39; 0.05 &#39;.&#39; 0.1 &#39; &#39; 1 ## ## Scale: 8.695 ## ## Gaussian distribution ## Number of Newton-Raphson Iterations: 1 ## Log-likelihood: -3082 on 3 Df ## Wald-statistic: 1000 on 1 Df, p-value: &lt; 2.22e-16 According to Ronald Hites’s simulation (Hites 2019), replacing measurements below the LOD (even missing measurements) with LOD/2 or with \\(LOD/\\sqrt2\\) causes little bias, and “Any time you have a % non-detected &gt;20%, for whatever reason, it is unlikely that the data set can give useful results.” Another study found that random forest could be the best imputation method for missing at random (MAR) and missing completely at random (MCAR) data. Quantile regression imputation of left-censored data is the best imputation method for left-censored missing not at random data (Wei et al. 2018). 6.7.2 Over Detection Limit CorrectOverloadedPeaks could be used to correct peaks exceeding the detection limit (Lisec et al. 2016). 6.8 RSD/fold change Filter Some peaks need to be ruled out due to high RSD% and small fold changes compared with blank samples. A more general feature filtering approach for biomarker discovery can be found here (Gadara et al. 2021), and a detailed discussion of intensity thresholds can be found here (Houriet et al. 2022). 6.9 Power Analysis Filter As shown in \\[Exprimental design(DoE)\\], the power analysis in metabolomics is ad hoc since you don’t know much before you perform the experiment. However, we could perform power analysis after the experiment is done. That is, we just rule out the peaks with low power for the current experimental design. References Adams, Kendra J., Brian Pratt, Neelanjan Bose, Laura G. Dubois, Lisa St John-Williams, Kevin M. Perrott, Karina Ky, et al. 2020. “Skyline for Small Molecules: A Unifying Software Package for Quantitative Metabolomics.” Journal of Proteome Research 19 (4): 1447–58. https://doi.org/10.1021/acs.jproteome.9b00640. 
Alka, Oliver, Premy Shanthamoorthy, Michael Witting, Karin Kleigrewe, Oliver Kohlbacher, and Hannes L. Röst. 2022. “DIAMetAlyzer Allows Automated False-Discovery Rate-Controlled Analysis for Data-Independent Acquisition in Metabolomics.” Nature Communications 13 (1): 1347. https://doi.org/10.1038/s41467-022-29006-z. Anderson, Brady G., Alexander Raskind, Hani Habra, Robert T. Kennedy, and Charles R. Evans. 2021. “Modifying Chromatography Conditions for Improved Unknown Feature Identification in Untargeted Metabolomics.” Analytical Chemistry 93 (48): 15840–49. https://doi.org/10.1021/acs.analchem.1c02149. Bilbao, Aivett, Emmanuel Varesio, Jeremy Luban, Caterina Strambio-De-Castillia, Gérard Hopfgartner, Markus Müller, and Frédérique Lisacek. 2015. “Processing Strategies and Software Solutions for Data-Independent Acquisition in Mass Spectrometry.” PROTEOMICS 15 (5-6): 964–80. https://doi.org/10.1002/pmic.201400323. Bonner, Ron, and Gérard Hopfgartner. 2018. “SWATH Data Independent Acquisition Mass Spectrometry for Metabolomics.” TrAC Trends in Analytical Chemistry, October. https://doi.org/10.1016/j.trac.2018.10.014. Chambers, Matthew C., Brendan Maclean, Robert Burke, Dario Amodei, Daniel L. Ruderman, Steffen Neumann, Laurent Gatto, et al. 2012. “A Cross-Platform Toolkit for Mass Spectrometry and Proteomics.” Nature Biotechnology 30 (October): 918–20. https://doi.org/10.1038/nbt.2377. Chen, Gengbo, Scott Walmsley, Gemmy C. M. Cheung, Liyan Chen, Ching-Yu Cheng, Roger W. Beuerman, Tien Yin Wong, Lei Zhou, and Hyungwon Choi. 2017. “Customized Consensus Spectral Library Building for Untargeted Quantitative Metabolomics Analysis with Data Independent Acquisition Mass Spectrometry and MetaboDIA Workflow.” Analytical Chemistry 89 (9): 4897–4906. https://doi.org/10.1021/acs.analchem.6b05006. Chen, Yanhua, Zhi Zhou, Wei Yang, Nan Bi, Jing Xu, Jiuming He, Ruiping Zhang, Lvhua Wang, and Zeper Abliz. 2017. 
“Development of a Data-Independent Targeted Metabolomics Method for Relative Quantification Using Liquid Chromatography Coupled with Tandem Mass Spectrometry.” Analytical Chemistry 89 (13): 6954–62. https://doi.org/10.1021/acs.analchem.6b04727. Choi, Meena, Ching-Yun Chang, Timothy Clough, Daniel Broudy, Trevor Killeen, Brendan MacLean, and Olga Vitek. 2014. “MSstats: An R Package for Statistical Analysis of Quantitative Mass Spectrometry-Based Proteomic Experiments.” Bioinformatics 30 (17): 2524–26. https://doi.org/10.1093/bioinformatics/btu305. Dietrich, Christian, Arne Wick, and Thomas A. Ternes. 2022. “Open-Source Feature Detection for Non-Target LC–MS Analytics.” Rapid Communications in Mass Spectrometry 36 (2): e9206. https://doi.org/10.1002/rcm.9206. Domingo-Almenara, Xavier, Jesus Brezmes, Maria Vinaixa, Sara Samino, Noelia Ramirez, Marta Ramon-Krauel, Carles Lerin, et al. 2016. “eRah: A Computational Tool Integrating Spectral Deconvolution and Alignment with Quantification and Identification of Metabolites in GC/MS-Based Metabolomics.” Analytical Chemistry 88 (19): 9821–29. https://doi.org/10.1021/acs.analchem.6b02927. Du, Xiuxia, and Steven H Zeisel. 2013. “Spectral Deconvolution for Gas Chromatography Mass Spectrometry-Based Metabolomics: Current Status and Future Perspectives.” Computational and Structural Biotechnology Journal 4 (5): 1–10. https://doi.org/10.5936/csbj.201301013. El Abiead, Yasin, Maximilian Milford, Harald Schoeny, Mate Rusz, Reza M. Salek, and Gunda Koellensperger. 2022. “Power of mzRAPP-Based Performance Assessments in MS1-Based Nontargeted Feature Detection.” Analytical Chemistry 94 (24): 8588–95. https://doi.org/10.1021/acs.analchem.1c05270. Fenaille, François, Pierre Barbier Saint-Hilaire, Kathleen Rousseau, and Christophe Junot. 2017. “Data Acquisition Workflows in Liquid Chromatography Coupled to High Resolution Mass Spectrometry-Based Metabolomics: Where Do We Stand?” Journal of Chromatography A 1526 (Supplement C): 1–12. 
https://doi.org/10.1016/j.chroma.2017.10.043. Fu, Hai-Yan, Ou Hu, Yue-Ming Zhang, Li Zhang, Jing-Jing Song, Peang Lu, Qing-Xia Zheng, et al. 2017. “Mass-Spectra-Based Peak Alignment for Automatic Nontargeted Metabolic Profiling Analysis for Biomarker Screening in Plant Samples.” Journal of Chromatography A 1513 (Supplement C): 201–9. https://doi.org/10.1016/j.chroma.2017.07.044. Gadara, Darshak, Katerina Coufalikova, Juraj Bosak, David Smajs, and Zdenek Spacil. 2021. “Systematic Feature Filtering in Exploratory Metabolomics: Application Toward Biomarker Discovery.” Analytical Chemistry 93 (26): 9103–10. https://doi.org/10.1021/acs.analchem.1c00816. Giebelhaus, Ryland T., Michael D. Sorochan Armstrong, A. Paulina de la Mata, and James J. Harynuk. 2022. “Untargeted Region of Interest Selection for Gas Chromatography – Mass Spectrometry Data Using a Pseudo F-ratio Moving Window.” Journal of Chromatography A 1682 (October): 463499. https://doi.org/10.1016/j.chroma.2022.463499. Gloaguen, Yoann, Jennifer A. Kirwan, and Dieter Beule. 2022. “Deep Learning-Assisted Peak Curation for Large-Scale LC-MS Metabolomics.” Analytical Chemistry 94 (12): 4930–37. https://doi.org/10.1021/acs.analchem.1c02220. Graça, Gonçalo, Yuheng Cai, Chung-Ho E. Lau, Panagiotis A. Vorkas, Matthew R. Lewis, Elizabeth J. Want, David Herrington, and Timothy M. D. Ebbels. 2022. “Automated Annotation of Untargeted All-Ion Fragmentation LC–MS Metabolomics Data with MetaboAnnotatoR.” Analytical Chemistry 94 (8): 3446–55. https://doi.org/10.1021/acs.analchem.1c03032. Guo, Jian, and Tao Huan. 2020. “Comparison of Full-Scan, Data-Dependent, and Data-Independent Acquisition Modes in Liquid Chromatography–Mass Spectrometry Based Untargeted Metabolomics.” Analytical Chemistry 92 (12): 8072–80. https://doi.org/10.1021/acs.analchem.9b05135. Hao, Jun-Di, Yao-Yu Chen, Yan-Zhen Wang, Na An, Pei-Rong Bai, Quan-Fei Zhu, and Yu-Qi Feng. 2023. 
“Novel Peak Shift Correction Method Based on the Retention Index for Peak Alignment in Untargeted Metabolomics.” Analytical Chemistry 95 (35): 13330–37. https://doi.org/10.1021/acs.analchem.3c02583. Hites, Ronald A. 2019. “Correcting for Censored Environmental Measurements.” Environmental Science &amp; Technology, September. https://doi.org/10.1021/acs.est.9b05042. Houriet, Joelle, Warren S. Vidar, Preston K. Manwill, Daniel A. Todd, and Nadja B. Cech. 2022. “How Low Can You Go? Selecting Intensity Thresholds for Untargeted Metabolomics Data Preprocessing.” Analytical Chemistry 94 (51): 17964–71. https://doi.org/10.1021/acs.analchem.2c04088. Hu, Yaxi, Betty Cai, and Tao Huan. 2019. “Enhancing Metabolome Coverage in Data-Dependent LC–MS/MS Analysis Through an Integrated Feature Extraction Strategy.” Analytical Chemistry 91 (22): 14433–41. https://doi.org/10.1021/acs.analchem.9b02980. Kew, William, John W. T. Blackburn, David J. Clarke, and Dušan Uhrín. 2017. “Interactive van Krevelen Diagrams – Advanced Visualisation of Mass Spectrometry Data of Complex Mixtures.” Rapid Communications in Mass Spectrometry 31 (7): 658–62. https://doi.org/10.1002/rcm.7823. Larralde, Martin, Thomas N. Lawson, Ralf J. M. Weber, Pablo Moreno, Kenneth Haug, Philippe Rocca-Serra, Mark R. Viant, Christoph Steinbeck, and Reza M. Salek. 2017. “mzML2ISA &amp; nmrML2ISA: Generating Enriched ISA-Tab Metadata Files from Metabolomics XML Data.” Bioinformatics 33 (16): 2598–2600. https://doi.org/10.1093/bioinformatics/btx169. Lawson, Thomas N., Ralf J. M. Weber, Martin R. Jones, Andrew J. Chetwynd, Giovanny Rodríguez-Blanco, Riccardo Di Guida, Mark R. Viant, and Warwick B. Dunn. 2017. “msPurity: Automated Evaluation of Precursor Ion Purity for Mass Spectrometry-Based Fragmentation in Metabolomics.” Analytical Chemistry 89 (4): 2432–39. https://doi.org/10.1021/acs.analchem.6b04358. Ledesma-Escobar, Carlos Augusto, Feliciano Priego-Capote, and Mónica Calderón-Santiago. 2023. 
“MetaboMSDIA: A Tool for Implementing Data-Independent Acquisition in Metabolomic-Based Mass Spectrometry Analysis.” Analytica Chimica Acta 1266 (July): 341308. https://doi.org/10.1016/j.aca.2023.341308. Li, Hao, Yuping Cai, Yuan Guo, Fangfang Chen, and Zheng-Jiang Zhu. 2016. “MetDIA: Targeted Metabolite Extraction of Multiplexed MS/MS Spectra Generated by Data-Independent Acquisition.” Analytical Chemistry 88 (17): 8757–64. https://doi.org/10.1021/acs.analchem.6b02122. Li, Lili, Weijie Ren, Hongwei Kong, Chunxia Zhao, Xinjie Zhao, Xiaohui Lin, Xin Lu, and Guowang Xu. 2017. “An Alignment Algorithm for LC-MS-based Metabolomics Dataset Assisted by MS/MS Information.” Analytica Chimica Acta 990 (October): 96–102. https://doi.org/10.1016/j.aca.2017.07.058. Lisec, Jan, Friederike Hoffmann, Clemens Schmitt, and Carsten Jaeger. 2016. “Extending the Dynamic Range in Metabolomics Experiments by Automatic Correction of Peaks Exceeding the Detection Limit.” Analytical Chemistry 88 (15): 7487–92. https://doi.org/10.1021/acs.analchem.6b02515. Lisitsyna, Anna, Franco Moritz, Youzhong Liu, Loubna Al Sadat, Hans Hauner, Melina Claussnitzer, Philippe Schmitt-Kopplin, and Sara Forcisi. 2022. “Feature Selection Pipelines with Classification for Non-targeted Metabolomics Combining the Neural Network and Genetic Algorithm.” Analytical Chemistry 94 (14): 5474–82. https://doi.org/10.1021/acs.analchem.1c03237. Ni, Yan, Mingming Su, Yunping Qiu, Wei Jia, and Xiuxia Du. 2016. “ADAP-GC 3.0: Improved Peak Detection and Deconvolution of Co-eluting Metabolites from GC/TOF-MS Data for Metabolomics Studies.” Analytical Chemistry 88 (17): 8802–11. https://doi.org/10.1021/acs.analchem.6b02222. Nikolskiy, Igor, Nathaniel G. Mahieu, Ying-Jr Chen, Ralf Tautenhahn, and Gary J. Patti. 2013. “An Untargeted Metabolomic Workflow to Improve Structural Characterization of Metabolites.” Analytical Chemistry 85 (16): 7713–19. https://doi.org/10.1021/ac400751j. Reuschenbach, Max, Felix Drees, Torsten C. 
Schmidt, and Gerrit Renner. 2023. “qBinning: Data Quality-Based Algorithm for Automized Ion Chromatogram Extraction from High-Resolution Mass Spectrometry.” Analytical Chemistry, September. https://doi.org/10.1021/acs.analchem.3c01079. Rusconi, Filippo. 2019. “mineXpert: Biological Mass Spectrometry Data Visualization and Mining with Full JavaScript Ability.” Journal of Proteome Research 18 (5): 2254–59. https://doi.org/10.1021/acs.jproteome.9b00099. Samanipour, Saer, Malcolm J. Reid, Kine Bæk, and Kevin V. Thomas. 2018. “Combining a Deconvolution and a Universal Library Search Algorithm for the Nontarget Analysis of Data-Independent Acquisition Mode Liquid Chromatography-High-Resolution Mass Spectrometry Results.” Environmental Science &amp; Technology 52 (8): 4694–4701. https://doi.org/10.1021/acs.est.8b00259. Smith, Colin A., Elizabeth J. Want, Grace O’Maille, Ruben Abagyan, and Gary Siuzdak. 2006. “XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification.” Analytical Chemistry 78 (3): 779–87. https://doi.org/10.1021/ac051437y. Stincone, Paolo, Abzer K. Pakkir Shah, Robin Schmid, Lana G. Graves, Stilianos P. Lambidis, Ralph R. Torres, Shu-Ning Xia, et al. 2023. “Evaluation of Data-Dependent MS/MS Acquisition Parameters for Non-Targeted Metabolomics and Molecular Networking of Environmental Samples: Focus on the Q Exactive Platform.” Analytical Chemistry, August. https://doi.org/10.1021/acs.analchem.3c01202. Styczynski, Mark P., Joel F. Moxley, Lily V. Tong, Jason L. Walther, Kyle L. Jensen, and Gregory N. Stephanopoulos. 2007. “Systematic Identification of Conserved Metabolites in GC/MS Data for Metabolomics and Biomarker Discovery.” Analytical Chemistry 79 (3): 966–73. https://doi.org/10.1021/ac0614846. 
Tautenhahn, Ralf, Christoph Böttcher, and Steffen Neumann. 2008. “Highly Sensitive Feature Detection for High Resolution LC/MS.” BMC Bioinformatics 9: 504. https://doi.org/10.1186/1471-2105-9-504. Tian, Tze-Feng, San-Yuan Wang, Tien-Chueh Kuo, Cheng-En Tan, Guan-Yuan Chen, Ching-Hua Kuo, Chi-Hsin Sally Chen, Chang-Chuan Chan, Olivia A. Lin, and Y. Jane Tseng. 2016. “Web Server for Peak Detection, Baseline Correction, and Alignment in Two-Dimensional Gas Chromatography Mass Spectrometry-Based Metabolomics Data.” Analytical Chemistry 88 (21): 10395–403. https://doi.org/10.1021/acs.analchem.6b00755. Treutler, Hendrik, and Steffen Neumann. 2016. “Prediction, Detection, and Validation of Isotope Clusters in Mass Spectrometry Data.” Metabolites 6 (4): 37. https://doi.org/10.3390/metabo6040037. Tsou, Chih-Chiang, Dmitry Avtonomov, Brett Larsen, Monika Tucholska, Hyungwon Choi, Anne-Claude Gingras, and Alexey I. Nesvizhskii. 2015. “DIA-Umpire: Comprehensive Computational Framework for Data-Independent Acquisition Proteomics.” Nature Methods 12 (3): 258–64. https://doi.org/10.1038/nmeth.3255. Tsugawa, Hiroshi, Tomas Cajka, Tobias Kind, Yan Ma, Brendan Higgins, Kazutaka Ikeda, Mitsuhiro Kanazawa, Jean VanderGheynst, Oliver Fiehn, and Masanori Arita. 2015. “MS-DIAL: Data-Independent MS/MS Deconvolution for Comprehensive Metabolome Analysis.” Nature Methods 12 (6): 523–26. https://doi.org/10.1038/nmeth.3393. Wang, Ruimin, Miaoshan Lu, Shaowei An, Jinyin Wang, and Changbin Yu. 2023. “G-Aligner: A Graph-Based Feature Alignment Method for Untargeted LC–MS-based Metabolomics.” BMC Bioinformatics 24 (1): 431. https://doi.org/10.1186/s12859-023-05525-4. Wang, Ruohong, Yandong Yin, and Zheng-Jiang Zhu. 2019. “Advancing Untargeted Metabolomics Using Data-Independent Acquisition Mass Spectrometry Technology.” Analytical and Bioanalytical Chemistry 411 (19): 4349–57. https://doi.org/10.1007/s00216-019-01709-1. Wehrens, Ron, Tom G. Bloemberg, and Paul H. C. Eilers. 2015. 
“Fast Parametric Time Warping of Peak Lists.” Bioinformatics 31 (18): 3063–65. https://doi.org/10.1093/bioinformatics/btv299. Wei, Runmin, Jingye Wang, Mingming Su, Erik Jia, Shaoqiu Chen, Tianlu Chen, and Yan Ni. 2018. “Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data.” Scientific Reports 8 (1): 663. https://doi.org/10.1038/s41598-017-19120-0. Yan, Binjun, Mengtian Shi, Siyu Cai, Yuan Su, Renhui Chen, Chiyuan Huang, and David Da Yong Chen. 2023. “Data-Driven Tool for Cross-Run Ion Selection and Peak-Picking in Quantitative Proteomics with Data-Independent Acquisition LC–MS/MS.” Analytical Chemistry 95 (45): 16558–66. https://doi.org/10.1021/acs.analchem.3c02689. Yang, Ruochen, Xi Chen, and Idoia Ochoa. 2019. “MassComp, a Lossless Compressor for Mass Spectrometry Data.” BMC Bioinformatics 20 (1): 368. https://doi.org/10.1186/s12859-019-2962-7. Zha, Haihong, Yuping Cai, Yandong Yin, Zhuozhong Wang, Kang Li, and Zheng-Jiang Zhu. 2018. “SWATHtoMRM: Development of High-Coverage Targeted Metabolomics Method Using SWATH Technology for Biomarker Discovery.” Analytical Chemistry 90 (6): 4062–70. https://doi.org/10.1021/acs.analchem.7b05318. Zhao, Fan, Shuai Huang, and Xiaozhe Zhang. 2021. “High Sensitivity and Specificity Feature Detection in Liquid Chromatography–Mass Spectrometry Data: A Deep Learning Framework.” Talanta 222 (January): 121580. https://doi.org/10.1016/j.talanta.2020.121580. Zhu, Xiaochun, Yuping Chen, and Raju Subramanian. 2014. “Comparison of Information-Dependent Acquisition, SWATH, and MSAll Techniques in Metabolite Identification Study Employing Ultrahigh-Performance Liquid Chromatography–Quadrupole Time-of-Flight Mass Spectrometry.” Analytical Chemistry 86 (2): 1202–9. https://doi.org/10.1021/ac403385y. "],["annotation.html", "Chapter 7 Annotation 7.1 Issues in annotation 7.2 Peak misidentification 7.3 Annotation v.s. 
identification 7.4 Molecular Formula Assignment 7.5 Redundant peaks 7.6 MS1 MS2 connection 7.7 MS2 MSn connection 7.8 MS/MS annotation 7.9 Knowledge based annotation 7.10 MS Database for annotation 7.11 Compounds Database", " Chapter 7 Annotation Once you have the peaks table or features table, annotating the peaks is the next step. Check this review (Domingo-Almenara, Montenegro-Burke, Benton, et al. 2018) or other reviews (Chaleckis et al. 2019; Lai et al. 2018; Nash and Dunn 2019; Mark R. Viant et al. 2017; Allard, Genta-Jouve, and Wolfender 2017) for detailed notes on annotation. The first paper proposed five levels for current computational annotation strategies. Level 1: Peak Grouping: MS pseudospectra extraction based on peak shape similarity and peak abundance correlation Level 2: Peak Annotation: Adducts, neutral losses, isotopes, and other mass relationships based on mass distances Level 3: Biochemical knowledge based on putative identification, potential biochemical reaction and related statistical analysis Level 4: Use and integration of tandem MS data based on data dependent/independent acquisition mode or in silico prediction Level 5: Retention time prediction based on library-available retention index or quantitative structure-retention relationships (QSRR) models. Most software works at level 1 or 2. If we only have the compound structure, we could guess the ions formed under different ionization methods. If we have a mass spectrum, we could match it against a database by similarity analysis. In metabolomics, we only have mass spectra or mass-to-charge ratios. A single mass-to-charge ratio is not enough for identification. That is one bottleneck for annotation, so prediction is always performed on MS/MS data. 7.1 Issues in annotation The major issue in annotation is the redundant peaks from the same metabolite. 
Unlike genomes, peaks or features from peak selection are not independent of each other. Adducts, in-source fragments and isotopes would lead to wrong annotation. A common solution is to use known adducts, neutral losses, molecular multimers or multiply charged ions to compare mass distances. Another issue concerns the MS/MS database. Only 10% of known metabolites in databases have experimental spectral data. Thus in silico prediction is required. Some works try to bridge the gap between experimental data, theoretical values (from chemical databases like ChemSpider) and prediction. Here is a nice review about MS/MS prediction (Hufsky, Scheubert, and Böcker 2014). 7.2 Peak misidentification Isomers: use separation methods such as chromatography, ion mobility MS, or MS/MS. Reversed-phase ion-pairing chromatography and HILIC are useful. Chemical derivatization is another option. Interfering compounds: 20 ppm is the minimum exact mass accuracy for HRMS. In-source degradation products 7.3 Annotation v.s. identification The Chemical Analysis Working Group of the Metabolomics Standards Initiative defined four levels of confidence that could be assigned to identification (Lloyd W. Sumner et al. 2007; Mark R. Viant et al. 2017): Level 1 ‘identified metabolites’ Level 2 ‘Putatively annotated compounds’ Level 3 ‘Putatively characterised compound classes’ Level 4 ‘Unknown’ In practice, data analysis based annotation could reach level 2. For level 1, we need extra methods such as MS/MS, retention time, accurate mass, 2D NMR spectra, and so on to confirm the compounds. However, standards are always required for solid proof. For specific groups of compounds such as PFASs, the communication of confidence level could be slightly different (Charbonnet et al. 2022). Though MS/MS seems to be a required step for identification, recent studies found that ESI might also generate fragment ions for structure identification (Xue, Guijas, et al. 2020; Xue et al. 
2021, 2023; Bernardo-Bermejo et al. 2023). 7.4 Molecular Formula Assignment Cheminformatics helps with MS annotation. The first task is molecular formula assignment. For a given accurate mass, the formula should be constrained by predefined element types and atom numbers, a mass error window and rules of chemical bonding, such as double bond equivalent (DBE) and the nitrogen rule. The nitrogen rule states that an odd nominal molecular mass implies an odd number of nitrogen atoms. This rule should only be used with nominal (integer) masses. Degree of unsaturation or DBE uses rings-plus-double-bonds equivalent (RDBE) values, which should be integers. The elements oxygen and sulphur are not taken into account. Otherwise the molecular formula will not be valid. \\[RDBE = C+Si - 1/2(H+F+Cl+Br+I) + 1/2(N+P)+1 \\] To assign a molecular formula to a mass-to-charge ratio, the Seven Golden Rules (Kind and Fiehn 2007) for heuristic filtering of molecular formulas should be considered: Apply heuristic restrictions for number of elements during formula generation. This is the table for known compounds: ## Mass.Range.[Da] Library C.max H.max N.max O.max P.max S.max F.max Cl.max ## 1 &lt; 500 DNP 29 72 10 18 4 7 15 8 ## 2 &lt;NA&gt; Wiley 39 72 20 20 9 10 16 10 ## 3 &lt; 1000 DNP 66 126 25 27 6 8 16 11 ## 4 &lt;NA&gt; Wiley 78 126 20 27 9 14 34 12 ## 5 &lt; 2000 DNP 115 236 32 63 6 8 16 11 ## 6 &lt;NA&gt; Wiley 156 180 20 40 9 14 48 12 ## 7 &lt; 3000 DNP 162 208 48 78 6 9 16 11 ## Br.max Si.max ## 1 5 NA ## 2 4 8 ## 3 8 NA ## 4 8 14 ## 5 8 NA ## 6 10 15 ## 7 8 NA Perform LEWIS and SENIOR check. The LEWIS rule demands that molecules consisting of main group elements, especially carbon, nitrogen and oxygen, share electrons in a way that all atoms have completely filled s, p-valence shells (‘octet rule’). 
Senior’s theorem requires three essential conditions for the existence of molecular graphs: the sum of valences or the total number of atoms having odd valences is even; the sum of valences is greater than or equal to twice the maximum valence; the sum of valences is greater than or equal to twice the number of atoms minus 1. Perform isotopic pattern filter. Isotope ratio abundance was included in the algorithm as an additional orthogonal constraint, assuming high quality data acquisitions, specifically sufficient ion statistics and high signal/noise ratio for the detection of the M+1 and M+2 abundances. For monoisotopic elements (F, Na, P, I) this rule has no impact. The isotope pattern will be useful for brominated and chlorinated small molecules and sulphur-containing peptides. Perform H/C ratio check (hydrogen/carbon ratio). In most cases the hydrogen/carbon ratio does not exceed 3, with rare exceptions such as methylhydrazine (CH6N2). Conversely, the H/C ratio is usually smaller than 2, and should not be less than 0.125, as in the case of tetracyanopyrrole (C8HN5). Perform NOPS ratio check (N, O, P, S/C ratios). 
## Element.ratios Common.range.(covering.99.7%) Extended.range.(covering.99.99%) ## 1 H/C 0.2–3.1 0.1–6 ## 2 F/C 0–1.5 0–6 ## 3 Cl/C 0–0.8 0–2 ## 4 Br/C 0–0.8 0–2 ## 5 N/C 0–1.3 0–4 ## 6 O/C 0–1.2 0–3 ## 7 P/C 0–0.3 0–2 ## 8 S/C 0–0.8 0–3 ## 9 Si/C 0–0.5 0–1 ## Extreme.range.(beyond.99.99%) ## 1 &lt; 0.1 and 6–9 ## 2 &gt; 1.5 ## 3 &gt; 0.8 ## 4 &gt; 0.8 ## 5 &gt; 1.3 ## 6 &gt; 1.2 ## 7 &gt; 0.3 ## 8 &gt; 0.8 ## 9 &gt; 0.5 Perform heuristic HNOPS probability check (H, N, O, P, S/C high probability ratios) df &lt;- data.frame( stringsAsFactors = FALSE, Element.counts = c(&quot;NOPS all &gt; 1&quot;,&quot;NOP all &gt; 3&quot;,&quot;OPS all &gt; 1&quot;, &quot;PSN all &gt; 1&quot;,&quot;NOS all &gt; 6&quot;), Heuristic.Rule = c(&quot;N&lt; 10, O &lt; 20, P &lt; 4, S &lt; 3&quot;, &quot;N &lt; 11, O &lt; 22, P &lt; 6&quot;,&quot;O &lt; 14, P &lt; 3, S &lt; 3&quot;, &quot;P &lt; 3, S &lt; 3, N &lt; 4&quot;,&quot;N &lt; 19 O &lt; 14 S &lt; 8&quot;), DB.examples.for.maximum.values = c(&quot;C15H34N9O8PS, C22H44N4O14P2S2, C24H38N7O19P3S&quot;,&quot;C20H28N10O21P4, C10H18N5O20P5&quot;, &quot;C22H44N4O14P2S2, C16H36N4O4P2S2&quot;, &quot;C22H44N4O14P2S2, C16H36N4O4P2S2&quot;,&quot;C59H64N18O14S7&quot;) ) df ## Element.counts Heuristic.Rule ## 1 NOPS all &gt; 1 N&lt; 10, O &lt; 20, P &lt; 4, S &lt; 3 ## 2 NOP all &gt; 3 N &lt; 11, O &lt; 22, P &lt; 6 ## 3 OPS all &gt; 1 O &lt; 14, P &lt; 3, S &lt; 3 ## 4 PSN all &gt; 1 P &lt; 3, S &lt; 3, N &lt; 4 ## 5 NOS all &gt; 6 N &lt; 19 O &lt; 14 S &lt; 8 ## DB.examples.for.maximum.values ## 1 C15H34N9O8PS, C22H44N4O14P2S2, C24H38N7O19P3S ## 2 C20H28N10O21P4, C10H18N5O20P5 ## 3 C22H44N4O14P2S2, C16H36N4O4P2S2 ## 4 C22H44N4O14P2S2, C16H36N4O4P2S2 ## 5 C59H64N18O14S7 Perform TMS check (for GC-MS if a silylation step is involved). 
For TMS derivatized molecules detected in GC/MS analyses, the rules on element ratio checks and valence tests are hence best applied after TMS groups are subtracted, in a similar manner as adducts need to be first recognized and subtracted in LC/MS analyses. The Seven Golden Rules were built for GC-MS, while the Hydrogen Rearrangement Rules were mainly designed for LC-CID-MS/MS (Tsugawa et al. 2016). Based on extensively curated database records and enthalpy calculations, the “hydrogen rearrangement (HR) rules” extend the even-electron rule for carbon (C) and the heteroatoms oxygen (O), nitrogen (N), phosphorus (P), and sulfur (S). They used high abundance MS/MS peaks that exceeded 10% of their base peaks to identify common features in terms of 4 HR rules for positive mode and 5 HR rules for negative mode. The Seven Golden Rules and Hydrogen Rearrangement Rules might also be captured by statistical models. However, such heuristic rules could reduce the search space of possible formulas. molgen generates all structures (connectivity isomers, constitutions) that correspond to a given molecular formula, with optional further restrictions, e.g. presence or absence of particular substructures (Gugisch et al. 2015). mfFinder can predict formulas based on accurate mass (Patiny and Borel 2013). RAMSI is a robust automated mass spectra interpretation and chemical formula calculation method using mixed integer linear programming optimization (Baran and Northen 2013). Here are some other cheminformatics tools, which could be used to assign meaningful formulas or structures to mass spectra. RDKit Open-Source Cheminformatics Software cdk The Chemistry Development Kit (CDK) is a scientific, LGPL-ed library for bio- and cheminformatics and computational chemistry written in Java (Guha 2007). Open Babel Open Babel is a chemical toolbox designed to speak the many languages of chemical data (O’Boyle et al. 2011). 
ClassyFire is a tool for automated chemical classification with a comprehensive, computable taxonomy (Djoumbou Feunang et al. 2016). BUDDY can perform molecular formula discovery via bottom-up MS/MS interrogation (Xing et al. 2023). 7.5 Redundant peaks Full scan mass spectra always contain lots of redundant peaks such as adducts, isotopes, fragments, multiply charged ions and other oligomers. Such peaks dominate the features table (Xu, Lu, and Rabinowitz 2015; Sindelar and Patti 2020; Mahieu and Patti 2017). Annotation tools could label those peaks either by a known list or by frequency analysis of the paired mass distances (Ju et al. 2020; Kouřil et al. 2020). 7.5.1 Adducts list You could find an adducts list here from the commonMZ project. 7.5.2 Isotope Here is isotope pattern prediction. 7.5.3 CAMERA Common annotation for the xcms workflow (Kuhl et al. 2012). 7.5.4 RAMClustR The software could be found here (C. D. Broeckling et al. 2014; Corey D. Broeckling et al. 2016). The package includes a vignette to follow. 7.5.5 BioCAn BioCAn combines the results from database searches and in silico fragmentation analyses and places these results into a relevant biological context for the sample as captured by a metabolic model (Alden et al. 2017). 7.5.6 mzMatch mzMatch is a modular, open source and platform independent data processing pipeline for metabolomics LC/MS data written in the Java language (Chokkathukalam et al. 2013; Scheltema et al. 2011). MetAssign is a probabilistic annotation method using a Bayesian clustering approach, which is part of mzMatch (Daly et al. 2014). 7.5.7 xMSannotator The software could be found here (Uppal, Walker, and Jones 2017). 7.5.8 mWise mWise is an algorithm for context-based annotation of liquid chromatography–mass spectrometry features through diffusion in graphs (Barranco-Altirriba et al. 2021). 7.5.9 MAIT You could find the source code here (Fernández-Albert et al. 2014). 
7.5.10 pmd Paired Mass Distance (PMD) analysis for GC/LC-MS based nontarget analysis to remove redundant peaks (M. Yu, Olkowicz, and Pawliszyn 2019). 7.5.11 nontarget nontarget can perform isotope &amp; adduct peak grouping and homologue series detection (Loos and Singer 2017). 7.5.12 Binner Binner performs deep annotation of untargeted LC-MS metabolomics data (Kachman et al. 2020). 7.5.13 mz.unity You could find the source code here (Mahieu et al. 2016). It is for detecting and exploring complex relationships in accurate-mass mass spectrometry data. 7.5.14 MS-FLO MS-FLO is a tool to minimize false positive peak reports in untargeted liquid chromatography–mass spectroscopy (LC-MS) data processing (DeFelice et al. 2017). 7.5.15 CliqueMS CliqueMS is a computational tool for annotating in-source metabolite ions from LC-MS untargeted metabolomics data based on a coelution similarity network (Senan et al. 2019). 7.5.16 InterpretMSSpectrum This package annotates and interprets deconvoluted mass spectra (mass*intensity pairs) from high resolution mass spectrometry devices. You could use this package to find molecular ions for GC-MS (Jaeger et al. 2016). 7.5.17 NetID NetID is a global network optimization approach to annotate untargeted LC-MS metabolomics data (L. Chen et al. 2021). 7.5.18 ISfrag De novo recognition of in-source fragments for liquid chromatography–mass spectrometry data (J. Guo et al. 2021). 7.5.19 FastEI Ultra-fast and accurate electron ionization mass spectrum matching for compound identification with a million-scale in-silico library (Qiong Yang et al. 2023). 7.6 MS1 MS2 connection 7.6.1 PMDDA Three step workflow: MS1 full scan peak-picking, the GlobalStd algorithm to select precursor ions for MS2 from MS1 data, and MS2 data collection and annotation with GNPS (M. Yu, Dolios, and Petrick 2022). 7.6.2 HERMES A molecular-formula-oriented method to target the metabolome (Giné et al. 2021). 
7.6.3 dpDDA A similar approach uses an inclusion list of differential and preidentified ions (dpDDA) (Y. Zhang et al. 2023). 7.7 MS2 MSn connection A computational approach generates a database of high-resolution MSn spectra by converting existing low-resolution MSn spectra using complementary high-resolution MS2 spectra generated by beam-type CAD (Lieng et al. 2023). 7.8 MS/MS annotation MS/MS annotation is performed to generate a matching score with library spectra. The most popular matching algorithm is dot product similarity. A recent study found that the spectral entropy algorithm outperformed dot product similarity (Y. Li et al. 2021; Y. Li and Fiehn 2023). A comparison of cosine, modified cosine, and neutral loss based spectrum alignment showed that modified cosine similarity outperformed neutral loss matching and cosine similarity in all cases. The performance of MS/MS spectrum alignment depends on the location and type of the modification, as well as the chemical compound class of the fragmented molecules (Bittremieux et al. 2022). Another study proposed weighting low-intensity MS/MS ions and m/z frequency for spectral library annotation, which helps to annotate unknown spectra (Engler Hart et al. 2024). BLINK enables ultrafast tandem mass spectrometry cosine similarity scoring (Harwood et al. 2023). MS2Query enables reliable and scalable MS2 mass spectra-based analogue search by machine learning (de Jonge et al. 2023). However, a spectroscopic test suggests that fragment ion structure annotations in MS/MS libraries are frequently incorrect (van Tetering et al. 2024). Machine learning can also be applied for MS2 annotation (Codrean et al. 2023; H. Guo et al. 2023; Bilbao et al. 2023). See the Workflow section for popular platforms. Here are some stand-alone annotation software packages: 7.8.1 Matchms Matchms is an open-source Python package to import, process, clean, and compare mass spectrometry data (MS/MS).
It allows users to implement and run an easy-to-follow, easy-to-reproduce workflow from raw mass spectra to pre- and post-processed spectral data. Spectral data can be imported from common formats such as mzML, mzXML, msp, metabolomics-USI, MGF, or json (e.g. GNPS-style json files). Matchms then provides filters for metadata cleaning and checking, as well as for basic peak filtering. Finally, matchms was built to import and apply different similarity measures to compare large amounts of spectra. This includes common cosine scores, but can also easily be extended by custom measures. Examples of spectrum similarity measures that were designed to work in matchms are Spec2Vec and MS2DeepScore (Huber et al. 2020). 7.8.2 MetDNA MetDNA is a metabolic reaction network-based recursive metabolite annotation method for untargeted metabolomics (Shen et al. 2019). 7.8.3 MetFusion Java-based integration of compound identification strategies. The application is available online (Gerlich and Neumann 2013). 7.8.4 MS2Analyzer MS2Analyzer can annotate small molecule substructures from accurate tandem mass spectra (Ma et al. 2014). 7.8.5 MetFrag MetFrag can be used for in silico prediction and matching of MS/MS data (Ruttkies et al. 2016; Wolf et al. 2010). 7.8.6 CFM-ID CFM-ID uses Metlin’s data to make predictions (Allen et al. 2014); a 4.0 version is also available (Allen et al. 2014). 7.8.7 LC-MS2Struct A machine learning framework for structural annotation of small-molecule data arising from liquid chromatography–tandem mass spectrometry (LC-MS2) measurements (Bach, Schymanski, and Rousu 2022). 7.8.8 LipidFrag LipidFrag can be used for in silico prediction and matching of lipid-related MS/MS data (Witting et al. 2017). 7.8.9 Lipidmatch In silico lipid mass spectrum search (Koelmel et al. 2017). 7.8.10 BarCoding Bar coding selects mass-to-charge regions containing the most informative metabolite fragments and designates them as bins.
It then translates each metabolite fragmentation pattern into a binary code by assigning 1’s to bins containing fragments and 0’s to bins without fragments. Such coding annotation could be used for MRM data (Spalding et al. 2016). 7.8.11 iMet This online application is a network-based computational method for annotation (Aguilar-Mogas et al. 2017). 7.8.12 DNMS2Purifier An XGBoost-based MS/MS spectral cleaning tool using intensity ratio fluctuation, appearance rate, and relative intensity (T. Zhao et al. 2023). 7.8.13 IDSL.CSA Composite Spectra Analysis for chemical annotation of untargeted metabolomics datasets (Baygi, Kumar, and Barupal 2023). 7.9 Knowledge based annotation 7.9.1 Experimental design Physicochemical properties can be used for annotation with a specific experimental design (Abrahamsson et al. 2023). 7.9.2 Chromatographic retention-related criteria For targeted analysis, chromatographic retention time can serve as a qualitative criterion for certain compounds with a carefully designed pre-treatment. For untargeted analysis, such information can also be used for annotation. GC-MS usually uses a retention index for a given column, while LC-MS retention might not be as reproducible as GC. Such methods can be traced back to quantitative structure-retention relationship (QSRR) models or linear solvation energy relationships (LSER). However, such methods need as many molecular descriptors as possible. For untargeted analysis, retention time and mass-to-charge ratio alone cannot generate enough molecular descriptors to build QSPR models. In this case, such criteria might be useful for validation instead of annotation unless we can measure or extract more information such as ion mobility from unknown compounds. Retip provides retention time prediction for compound annotation in untargeted metabolomics (Bonini et al. 2020). The Java-based MolFind can annotate unknown chemical structures by prediction based on RI, ECOM50, drift time and CID spectra (Menikarachchi et al.
2012). For-ident can give a score for identification with the help of logD (relative retention time) and/or MS/MS. RT-Transformer is a retention time prediction model for metabolite annotation that couples a graph attention network with a 1D-Transformer and is designed to predict retention times under any chromatographic method. A random forest RT prediction model for unified-HILIC/AEX/HRMS/MS enables the comprehensive structural annotation of polar metabolites (Torigoe et al. 2024). 7.9.3 ProbMetab Provides probability ranking to candidate compounds assigned to masses, with the prior assumption of connected sample and additional previous and spectral information modeled by the user. The source code is available online (Ricardo R. Silva et al. 2014). 7.9.4 MI-Pack The Python software is available online (Weber and Viant 2010). 7.9.5 MetExpert MetExpert is an expert system to assist users with limited expertise in informatics to interpret GC-MS data for metabolite identification without querying spectral databases (Qiu, Lei, and Sumner 2018). 7.9.6 MycompoundID MycompoundID can be used to search known and unknown metabolites online (Liang Li et al. 2013). 7.9.7 MetFamily A Shiny app for MS and MS/MS data annotation (Treutler et al. 2016). 7.9.8 CoA-Blast For certain groups of compounds such as acyl-CoAs, you might build a class-level in silico database to annotate compounds with certain structures (Keshet et al. 2022). 7.9.9 KGMN Knowledge-guided multi-layer network (KGMN) integrates three-layer networks, including a knowledge-based metabolic reaction network, a knowledge-guided MS/MS similarity network, and a global peak correlation network, for annotation (Z. Zhou et al. 2022). 7.9.10 CCMN CCMNs are constructed from metabolic features that share classes, which facilitates structure or class annotation for completely unknown metabolic features (X. Zhang et al. 2024).
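The retention index used in GC-MS, mentioned in section 7.9.2, can be made concrete with a small Python sketch. It uses the linear (temperature-programmed) form, RI = 100 × [n + (t − t_n) / (t_{n+1} − t_n)], where t_n and t_{n+1} are the retention times of the n-alkanes that bracket the compound. The function name `linear_retention_index` and the alkane ladder values are hypothetical, and the sketch assumes that retention time increases with carbon number.

```python
from bisect import bisect_right


def linear_retention_index(rt, alkane_rts):
    """Linear retention index of a compound eluting at retention time rt.

    alkane_rts : dict mapping n-alkane carbon number to retention time,
                 measured on the same column and temperature program
                 (assumed to increase with carbon number).
    """
    carbons = sorted(alkane_rts)
    times = [alkane_rts[c] for c in carbons]
    # Index of the last alkane eluting at or before rt.
    i = bisect_right(times, rt) - 1
    if i < 0 or i >= len(times) - 1:
        raise ValueError("retention time is outside the alkane ladder")
    n, n_next = carbons[i], carbons[i + 1]
    t_n, t_next = times[i], times[i + 1]
    # Linear interpolation between the two bracketing alkanes.
    return 100.0 * (n + (n_next - n) * (rt - t_n) / (t_next - t_n))


# Hypothetical alkane ladder (carbon number -> retention time in minutes).
ladder = {10: 5.0, 11: 6.0, 12: 7.5}
ri = linear_retention_index(6.75, ladder)
print(f"retention index: {ri:.0f}")
```

Because the index is expressed relative to the alkane ladder rather than in absolute time, it can be compared with library values for the same type of column, which is why it is more transferable than raw retention time.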
7.10 MS Database for annotation 7.10.1 MS Fiehn Lab. NIST: not free. Spectral Database for Organic Compounds (SDBS). MINE is an open access database of computationally predicted enzyme promiscuity products for untargeted metabolomics. The annotation would be accurate for a general compound database. 7.10.2 MS/MS LibGen can generate high quality spectral libraries of natural products for EAD-, UVPD-, and HCD-high resolution mass spectrometers (Kong et al. 2023). MoNA: a platform to collect all other open source databases. MassBank. GNPS uses inner correlations in the data and performs network analysis at the peak level instead of on annotated compounds to annotate the data. ReSpect: phytochemicals. Metlin is another useful online application for annotation (Guijas et al. 2018). LipidBlast: in silico prediction. Lipid Maps. MZcloud. NIST: not free. GMDB is a multistage tandem mass spectral database using a variety of structurally defined glycans. HMDB is a freely available electronic database containing detailed information about small molecule metabolites found in the human body. KEGG is a collection of small molecules, biopolymers, and other chemical substances that are relevant to biological systems. 7.11 Compounds Database PubChem is an open chemistry database at the National Institutes of Health (NIH). Chemspider is a free chemical structure database providing fast text and structure search access to over 67 million structures from hundreds of data sources. ChEBI is a freely available dictionary of molecular entities focused on ‘small’ chemical compounds. RefMet: a reference list of metabolite names. CAS: the largest substance database. CompTox: a compounds, exposure and toxicity database; related data are also available online. T3DB is a unique bioinformatics resource that combines detailed toxin data with comprehensive toxin target information. FooDB is the world’s largest and most comprehensive resource on food constituents, chemistry and biology.
Phenol explorer is the first comprehensive database on polyphenol content in foods. Drugbank is a unique bioinformatics and cheminformatics resource that combines detailed drug data with comprehensive drug target information. LMDB is a freely available electronic database containing detailed information about small molecule metabolites found in different livestock species. HPV: High Production Volume Information System. There are also metabolite atlases for specific domains. PMhub 1.0: a comprehensive plant metabolome database (Z. Tian et al. 2023). Atlas of Circadian Metabolism (Dyar et al. 2018). Plantmat: Excel library based prediction for plant metabolites (Qiu et al. 2016). References Abrahamsson, Dimitri, Christopher L. Brueck, Carsten Prasse, Dimitra A. Lambropoulou, Lelouda-Athanasia Koronaiou, Miaomiao Wang, June-Soo Park, and Tracey J. Woodruff. 2023. “Extracting Structural Information from Physicochemical Property Measurements Using Machine Learning-A New Approach for Structure Elucidation in Non-targeted Analysis.” Environmental Science &amp; Technology, September. https://doi.org/10.1021/acs.est.3c03003. Aguilar-Mogas, Antoni, Marta Sales-Pardo, Miriam Navarro, Roger Guimerà, and Oscar Yanes. 2017. “iMet: A Network-Based Computational Tool To Assist in the Annotation of Metabolites from Tandem Mass Spectra.” Analytical Chemistry 89 (6): 3474–82. https://doi.org/10.1021/acs.analchem.6b04512. Alden, Nicholas, Smitha Krishnan, Vladimir Porokhin, Ravali Raju, Kyle McElearney, Alan Gilbert, and Kyongbum Lee. 2017. “Biologically Consistent Annotation of Metabolomics Data.” Analytical Chemistry, November. https://doi.org/10.1021/acs.analchem.7b02162. Allard, Pierre-Marie, Grégory Genta-Jouve, and Jean-Luc Wolfender. 2017. “Deep Metabolome Annotation in Natural Products Research: Towards a Virtuous Cycle in Metabolite Identification.” Current Opinion in Chemical Biology, Omics, 36 (February): 40–49. https://doi.org/10.1016/j.cbpa.2016.12.022.
Allen, Felicity, Allison Pon, Michael Wilson, Russ Greiner, and David Wishart. 2014. “CFM-ID: A Web Server for Annotation, Spectrum Prediction and Metabolite Identification from Tandem Mass Spectra.” Nucleic Acids Research 42 (W1): W94–99. https://doi.org/10.1093/nar/gku436. Bach, Eric, Emma L. Schymanski, and Juho Rousu. 2022. “Joint Structural Annotation of Small Molecules Using Liquid Chromatography Retention Order and Tandem Mass Spectrometry Data.” Nature Machine Intelligence 4 (12): 1224–37. https://doi.org/10.1038/s42256-022-00577-2. Baran, Richard, and Trent R. Northen. 2013. “Robust Automated Mass Spectra Interpretation and Chemical Formula Calculation Using Mixed Integer Linear Programming.” Analytical Chemistry 85 (20): 9777–84. https://doi.org/10.1021/ac402180c. Barranco-Altirriba, Maria, Pol Solà-Santos, Sergio Picart-Armada, Samir Kanaan-Izquierdo, Jordi Fonollosa, and Alexandre Perera-Lluna. 2021. “mWISE: An Algorithm for Context-Based Annotation of Liquid Chromatography–Mass Spectrometry Features Through Diffusion in Graphs.” Analytical Chemistry 93 (31): 10772–78. https://doi.org/10.1021/acs.analchem.1c00238. Baygi, Sadjad Fakouri, Yashwant Kumar, and Dinesh Kumar Barupal. 2023. “IDSL.CSA: Composite Spectra Analysis for Chemical Annotation of Untargeted Metabolomics Datasets.” Analytical Chemistry, June. https://doi.org/10.1021/acs.analchem.3c00376. Bernardo-Bermejo, Samuel, Jingchuan Xue, Linh Hoang, Elizabeth Billings, Bill Webb, M. Willy Honders, Sanne Venneker, et al. 2023. “Quantitative Multiple Fragment Monitoring with Enhanced in-Source Fragmentation/Annotation Mass Spectrometry.” Nature Protocols, February, 1–20. https://doi.org/10.1038/s41596-023-00803-0. Bilbao, Aivett, Nathalie Munoz, Joonhoon Kim, Daniel J. Orton, Yuqian Gao, Kunal Poorey, Kyle R. Pomraning, et al. 2023.
“PeakDecoder Enables Machine Learning-Based Metabolite Annotation and Accurate Profiling in Multidimensional Mass Spectrometry Measurements.” Nature Communications 14 (1): 2461. https://doi.org/10.1038/s41467-023-37031-9. Bittremieux, Wout, Robin Schmid, Florian Huber, Justin J. J. van der Hooft, Mingxun Wang, and Pieter C. Dorrestein. 2022. “Comparison of Cosine, Modified Cosine, and Neutral Loss Based Spectrum Alignment For Discovery of Structurally Related Molecules.” Journal of the American Society for Mass Spectrometry 33 (9): 1733–44. https://doi.org/10.1021/jasms.2c00153. Bonini, Paolo, Tobias Kind, Hiroshi Tsugawa, Dinesh Kumar Barupal, and Oliver Fiehn. 2020. “Retip: Retention Time Prediction for Compound Annotation in Untargeted Metabolomics.” Analytical Chemistry 92 (11): 7515–22. https://doi.org/10.1021/acs.analchem.9b05765. Broeckling, C. D., F. A. Afsar, S. Neumann, A. Ben-Hur, and J. E. Prenni. 2014. “RAMClust: A Novel Feature Clustering Method Enables Spectral-Matching-Based Annotation for Metabolomics Data.” Analytical Chemistry 86 (14): 6812–17. https://doi.org/10.1021/ac501530d. Broeckling, Corey D., Andrea Ganna, Mark Layer, Kevin Brown, Ben Sutton, Erik Ingelsson, Graham Peers, and Jessica E. Prenni. 2016. “Enabling Efficient and Confident Annotation of LC-MS Metabolomics Data Through MS1 Spectrum and Time Prediction.” Analytical Chemistry 88 (18): 9226–34. https://doi.org/10.1021/acs.analchem.6b02479. Chaleckis, Romanas, Isabel Meister, Pei Zhang, and Craig E Wheelock. 2019. “Challenges, Progress and Promises of Metabolite Annotation for LC–MS-based Metabolomics.” Current Opinion in Biotechnology, Analytical Biotechnology, 55 (February): 44–50. https://doi.org/10.1016/j.copbio.2018.07.010. Charbonnet, Joseph A., Carrie A. McDonough, Feng Xiao, Trever Schwichtenberg, Dunping Cao, Sarit Kaserzon, Kevin V. Thomas, et al. 2022. 
“Communicating Confidence of Per- and Polyfluoroalkyl Substance Identification via High-Resolution Mass Spectrometry.” Environmental Science &amp; Technology Letters, May. https://doi.org/10.1021/acs.estlett.2c00206. Chen, Li, Wenyun Lu, Lin Wang, Xi Xing, Ziyang Chen, Xin Teng, Xianfeng Zeng, et al. 2021. “Metabolite Discovery Through Global Annotation of Untargeted Metabolomics Data.” Nature Methods 18 (11): 1377–85. https://doi.org/10.1038/s41592-021-01303-3. Chokkathukalam, Achuthanunni, Andris Jankevics, Darren J. Creek, Fiona Achcar, Michael P. Barrett, and Rainer Breitling. 2013. “mzMatch–ISO: An R Tool for the Annotation and Relative Quantification of Isotope-Labelled Mass Spectrometry Data.” Bioinformatics 29 (2): 281–83. https://doi.org/10.1093/bioinformatics/bts674. Codrean, S., B. Kruit, N. Meekel, D. Vughs, and F. Béen. 2023. “Predicting the Diagnostic Information of Tandem Mass Spectra of Environmentally Relevant Compounds Using Machine Learning.” Analytical Chemistry, October. https://doi.org/10.1021/acs.analchem.3c03470. Daly, Rónán, Simon Rogers, Joe Wandy, Andris Jankevics, Karl E. V. Burgess, and Rainer Breitling. 2014. “MetAssign: Probabilistic Annotation of Metabolites from LC–MS Data Using a Bayesian Clustering Approach.” Bioinformatics 30 (19): 2764–71. https://doi.org/10.1093/bioinformatics/btu370. de Jonge, Niek F., Joris J. R. Louwen, Elena Chekmeneva, Stephane Camuzeaux, Femke J. Vermeir, Robert S. Jansen, Florian Huber, and Justin J. J. van der Hooft. 2023. “MS2Query: Reliable and Scalable MS2 Mass Spectra-Based Analogue Search.” Nature Communications 14 (1): 1752. https://doi.org/10.1038/s41467-023-37446-4. DeFelice, Brian C., Sajjan Singh Mehta, Stephanie Samra, Tomáš Čajka, Benjamin Wancewicz, Johannes F. Fahrmann, and Oliver Fiehn. 2017. 
“Mass Spectral Feature List Optimizer (MS-FLO): A Tool To Minimize False Positive Peak Reports in Untargeted Liquid Chromatography–Mass Spectroscopy (LC-MS) Data Processing.” Analytical Chemistry 89 (6): 3250–55. https://doi.org/10.1021/acs.analchem.6b04372. Djoumbou Feunang, Yannick, Roman Eisner, Craig Knox, Leonid Chepelev, Janna Hastings, Gareth Owen, Eoin Fahy, et al. 2016. “ClassyFire: Automated Chemical Classification with a Comprehensive, Computable Taxonomy.” Journal of Cheminformatics 8 (1): 61. https://doi.org/10.1186/s13321-016-0174-y. Domingo-Almenara, Xavier, J. Rafael Montenegro-Burke, H. Paul Benton, and Gary Siuzdak. 2018. “Annotation: A Computational Solution for Streamlining Metabolomics Analysis.” Analytical Chemistry 90 (1): 480–89. https://doi.org/10.1021/acs.analchem.7b03929. Dyar, Kenneth A., Dominik Lutter, Anna Artati, Nicholas J. Ceglia, Yu Liu, Danny Armenta, Martin Jastroch, et al. 2018. “Atlas of Circadian Metabolism Reveals System-wide Coordination and Communication Between Clocks.” Cell 174 (6): 1571–1585.e11. https://doi.org/10.1016/j.cell.2018.08.042. Engler Hart, Chloe, Tobias Kind, Pieter C. Dorrestein, David Healey, and Daniel Domingo-Fernández. 2024. “Weighting Low-Intensity MS/MS Ions and m/z Frequency for Spectral Library Annotation.” Journal of the American Society for Mass Spectrometry 35 (2): 266–74. https://doi.org/10.1021/jasms.3c00353. Fernández-Albert, Francesc, Rafael Llorach, Cristina Andrés-Lacueva, and Alexandre Perera. 2014. “An R Package to Analyse LC/MS Metabolomic Data: MAIT (Metabolite Automatic Identification Toolkit).” Bioinformatics 30 (13): 1937–39. https://doi.org/10.1093/bioinformatics/btu136. Gerlich, Michael, and Steffen Neumann. 2013. “MetFusion: Integration of Compound Identification Strategies.” Journal of Mass Spectrometry 48 (3): 291–98. https://doi.org/10.1002/jms.3123. Giné, Roger, Jordi Capellades, Josep M. 
Badia, Dennis Vughs, Michaela Schwaiger-Haber, Theodore Alexandrov, Maria Vinaixa, Andrea M. Brunner, Gary J. Patti, and Oscar Yanes. 2021. “HERMES: A Molecular-Formula-Oriented Method to Target the Metabolome.” Nature Methods 18 (11): 1370–76. https://doi.org/10.1038/s41592-021-01307-z. Gugisch, Ralf, Adalbert Kerber, Axel Kohnert, Reinhard Laue, Markus Meringer, Christoph Rücker, and Alfred Wassermann. 2015. “Chapter 6 - MOLGEN 5.0, A Molecular Structure Generator.” In Advances in Mathematical Chemistry and Applications, edited by Subhash C. Basak, Guillermo Restrepo, and José L. Villaveces, 113–38. Bentham Science Publishers. https://doi.org/10.1016/B978-1-68108-198-4.50006-0. Guha, Rajarshi. 2007. “Chemical Informatics Functionality in R.” Journal of Statistical Software 18 (1): 1–16. https://doi.org/10.18637/jss.v018.i05. Guijas, Carlos, J. Rafael Montenegro-Burke, Xavier Domingo-Almenara, Amelia Palermo, Benedikt Warth, Gerrit Hermann, Gunda Koellensperger, et al. 2018. “METLIN: A Technology Platform for Identifying Knowns and Unknowns.” Analytical Chemistry 90 (5): 3156–64. https://doi.org/10.1021/acs.analchem.7b04424. Guo, Hao, Kebing Xue, Haiming Sun, Weihao Jiang, and Shiliang Pu. 2023. “Contrastive Learning-Based Embedder for the Representation of Tandem Mass Spectra.” Analytical Chemistry, May. https://doi.org/10.1021/acs.analchem.3c00260. Guo, Jian, Sam Shen, Shipei Xing, Huaxu Yu, and Tao Huan. 2021. “ISFrag: De Novo Recognition of In-Source Fragments for Liquid Chromatography–Mass Spectrometry Data.” Analytical Chemistry, July. https://doi.org/10.1021/acs.analchem.1c01644. Harwood, Thomas V., Daniel G. C. Treen, Mingxun Wang, Wibe de Jong, Trent R. Northen, and Benjamin P. Bowen. 2023. “BLINK Enables Ultrafast Tandem Mass Spectrometry Cosine Similarity Scoring.” Scientific Reports 13 (1): 13462. https://doi.org/10.1038/s41598-023-40496-9. 
Huber, Florian, Stefan Verhoeven, Christiaan Meijer, Hanno Spreeuw, Efraín Manuel Villanueva Castilla, Cunliang Geng, Justin J. J. van der Hooft, et al. 2020. “Matchms - Processing and Similarity Evaluation of Mass Spectrometry Data.” Journal of Open Source Software 5 (52): 2411. https://doi.org/10.21105/joss.02411. Hufsky, Franziska, Kerstin Scheubert, and Sebastian Böcker. 2014. “Computational Mass Spectrometry for Small-Molecule Fragmentation.” TrAC Trends in Analytical Chemistry 53 (January): 41–48. https://doi.org/10.1016/j.trac.2013.09.008. Jaeger, Carsten, Friederike Hoffmann, Clemens A. Schmitt, and Jan Lisec. 2016. “Automated Annotation and Evaluation of In-Source Mass Spectra in GC/Atmospheric Pressure Chemical Ionization-MS-Based Metabolomics.” Analytical Chemistry 88 (19): 9386–90. https://doi.org/10.1021/acs.analchem.6b02743. Ju, Ran, Xinyu Liu, Fujian Zheng, Xinjie Zhao, Xin Lu, Xiaohui Lin, Zhongda Zeng, and Guowang Xu. 2020. “A Graph Density-Based Strategy for Features Fusion from Different Peak Extract Software to Achieve More Metabolites in Metabolic Profiling from High-Resolution Mass Spectrometry.” Analytica Chimica Acta 1139 (December): 8–14. https://doi.org/10.1016/j.aca.2020.09.029. Kachman, Maureen, Hani Habra, William Duren, Janis Wigginton, Peter Sajjakulnukit, George Michailidis, Charles Burant, and Alla Karnovsky. 2020. “Deep Annotation of Untargeted LC-MS Metabolomics Data with Binner.” Bioinformatics 36 (6): 1801–6. https://doi.org/10.1093/bioinformatics/btz798. Keshet, Uri, Tobias Kind, Xinchen Lu, Sarita Devi, and Oliver Fiehn. 2022. “Acyl-CoA Identification in Mouse Liver Samples Using the In Silico CoA-Blast Tandem Mass Spectral Library.” Analytical Chemistry 94 (6): 2732–39. https://doi.org/10.1021/acs.analchem.1c03272. Kind, Tobias, and Oliver Fiehn. 2007. “Seven Golden Rules for Heuristic Filtering of Molecular Formulas Obtained by Accurate Mass Spectrometry.” BMC Bioinformatics 8 (1): 105.
https://doi.org/10.1186/1471-2105-8-105. Koelmel, Jeremy P., Nicholas M. Kroeger, Candice Z. Ulmer, John A. Bowden, Rainey E. Patterson, Jason A. Cochran, Christopher W. W. Beecher, Timothy J. Garrett, and Richard A. Yost. 2017. “LipidMatch: An Automated Workflow for Rule-Based Lipid Identification Using Untargeted High-Resolution Tandem Mass Spectrometry Data.” BMC Bioinformatics 18 (July): 331. https://doi.org/10.1186/s12859-017-1744-3. Kong, Fanzhou, Uri Keshet, Tong Shen, Elys Rodriguez, and Oliver Fiehn. 2023. “LibGen: Generating High Quality Spectral Libraries of Natural Products for EAD-, UVPD-, and HCD-High Resolution Mass Spectrometers.” Analytical Chemistry, November. https://doi.org/10.1021/acs.analchem.3c02263. Kouřil, Štěpán, Julie de Sousa, Jan Václavík, David Friedecký, and Tomáš Adam. 2020. “CROP: Correlation-Based Reduction of Feature Multiplicities in Untargeted Metabolomic Data.” Bioinformatics 36 (9): 2941–42. https://doi.org/10.1093/bioinformatics/btaa012. Kuhl, Carsten, Ralf Tautenhahn, Christoph Böttcher, Tony R. Larson, and Steffen Neumann. 2012. “CAMERA: An Integrated Strategy for Compound Spectra Extraction and Annotation of Liquid Chromatography/Mass Spectrometry Data Sets.” Analytical Chemistry 84 (1): 283–89. https://doi.org/10.1021/ac202450g. Lai, Zijuan, Hiroshi Tsugawa, Gert Wohlgemuth, Sajjan Mehta, Matthew Mueller, Yuxuan Zheng, Atsushi Ogiwara, et al. 2018. “Identifying Metabolites by Integrating Metabolome Databases with Mass Spectrometry Cheminformatics.” Nature Methods 15 (1): 53–56. https://doi.org/10.1038/nmeth.4512. Li, Liang, Ronghong Li, Jianjun Zhou, Azeret Zuniga, Avalyn E. Stanislaus, Yiman Wu, Tao Huan, et al. 2013. “MyCompoundID: Using an Evidence-Based Metabolome Library for Metabolite Identification.” Analytical Chemistry 85 (6): 3401–8. https://doi.org/10.1021/ac400099b. Li, Yuanyue, and Oliver Fiehn. 2023. “Flash Entropy Search to Query All Mass Spectral Libraries in Real Time.” Nature Methods 20 (10): 1475–78. 
https://doi.org/10.1038/s41592-023-02012-9. Li, Yuanyue, Tobias Kind, Jacob Folz, Arpana Vaniya, Sajjan Singh Mehta, and Oliver Fiehn. 2021. “Spectral Entropy Outperforms MS/MS Dot Product Similarity for Small-Molecule Compound Identification.” Nature Methods 18 (12): 1524–31. https://doi.org/10.1038/s41592-021-01331-z. Lieng, Brandon Y., Andrew T. Quaile, Xavier Domingo-Almenara, Hannes L. Röst, and J. Rafael Montenegro-Burke. 2023. “Computational Expansion of High-Resolution-MSn Spectral Libraries.” Analytical Chemistry, November. https://doi.org/10.1021/acs.analchem.3c03343. Loos, Martin, and Heinz Singer. 2017. “Nontargeted Homologue Series Extraction from Hyphenated High Resolution Mass Spectrometry Data.” Journal of Cheminformatics 9 (February). https://doi.org/10.1186/s13321-017-0197-z. Ma, Yan, Tobias Kind, Dawei Yang, Carlos Leon, and Oliver Fiehn. 2014. “MS2Analyzer: A Software for Small Molecule Substructure Annotations from Accurate Tandem Mass Spectra.” Analytical Chemistry 86 (21): 10724–31. https://doi.org/10.1021/ac502818e. Mahieu, Nathaniel G., and Gary J. Patti. 2017. “Systems-Level Annotation of a Metabolomics Data Set Reduces 25 000 Features to Fewer Than 1000 Unique Metabolites.” Analytical Chemistry 89 (19): 10397–406. https://doi.org/10.1021/acs.analchem.7b02380. Mahieu, Nathaniel G., Jonathan L. Spalding, Susan J. Gelman, and Gary J. Patti. 2016. “Defining and Detecting Complex Peak Relationships in Mass Spectral Data: The Mz.unity Algorithm.” Analytical Chemistry 88 (18): 9037–46. https://doi.org/10.1021/acs.analchem.6b01702. Menikarachchi, Lochana C., Shannon Cawley, Dennis W. Hill, L. Mark Hall, Lowell Hall, Steven Lai, Janine Wilder, and David F. Grant. 2012. “MolFind: A Software Package Enabling HPLC/MS-Based Identification of Unknown Chemical Structures.” Analytical Chemistry 84 (21): 9388–94. https://doi.org/10.1021/ac302048x. Nash, William J., and Warwick B. Dunn. 2019. 
“From Mass to Metabolite in Human Untargeted Metabolomics: Recent Advances in Annotation of Metabolites Applying Liquid Chromatography-Mass Spectrometry Data.” TrAC Trends in Analytical Chemistry 120 (November): 115324. https://doi.org/10.1016/j.trac.2018.11.022. O’Boyle, Noel M., Michael Banck, Craig A. James, Chris Morley, Tim Vandermeersch, and Geoffrey R. Hutchison. 2011. “Open Babel: An Open Chemical Toolbox.” Journal of Cheminformatics 3 (1): 33. https://doi.org/10.1186/1758-2946-3-33. Patiny, Luc, and Alain Borel. 2013. “ChemCalc: A Building Block for Tomorrow’s Chemical Infrastructure.” Journal of Chemical Information and Modeling 53 (5): 1223–28. https://doi.org/10.1021/ci300563h. Qiu, Feng, Dennis D. Fine, Daniel J. Wherritt, Zhentian Lei, and Lloyd W. Sumner. 2016. “PlantMAT: A Metabolomics Tool for Predicting the Specialized Metabolic Potential of a System and for Large-Scale Metabolite Identifications.” Analytical Chemistry 88 (23): 11373–83. https://doi.org/10.1021/acs.analchem.6b00906. Qiu, Feng, Zhentian Lei, and Lloyd W. Sumner. 2018. “MetExpert: An Expert System to Enhance Gas Chromatography-Mass Spectrometry-Based Metabolite Identifications.” Analytica Chimica Acta, Analytical Metabolomics, 1037 (December): 316–26. https://doi.org/10.1016/j.aca.2018.03.052. Ruttkies, Christoph, Emma L. Schymanski, Sebastian Wolf, Juliane Hollender, and Steffen Neumann. 2016. “MetFrag Relaunched: Incorporating Strategies Beyond in Silico Fragmentation.” Journal of Cheminformatics 8 (January): 3. https://doi.org/10.1186/s13321-016-0115-9. Scheltema, Richard A., Andris Jankevics, Ritsert C. Jansen, Morris A. Swertz, and Rainer Breitling. 2011. “PeakML/mzMatch: A File Format, Java Library, R Library, and Tool-Chain for Mass Spectrometry Data Analysis.” Analytical Chemistry 83 (7): 2786–93. https://doi.org/10.1021/ac2000994. 
Senan, Oriol, Antoni Aguilar-Mogas, Miriam Navarro, Jordi Capellades, Luke Noon, Deborah Burks, Oscar Yanes, Roger Guimerà, and Marta Sales-Pardo. 2019. “CliqueMS: A Computational Tool for Annotating in-Source Metabolite Ions from LC-MS Untargeted Metabolomics Data Based on a Coelution Similarity Network.” Bioinformatics 35 (20): 4089–97. https://doi.org/10.1093/bioinformatics/btz207. Shen, Xiaotao, Ruohong Wang, Xin Xiong, Yandong Yin, Yuping Cai, Zaijun Ma, Nan Liu, and Zheng-Jiang Zhu. 2019. “Metabolic Reaction Network-Based Recursive Metabolite Annotation for Untargeted Metabolomics.” Nature Communications 10 (1): 1–14. https://doi.org/10.1038/s41467-019-09550-x. Silva, Ricardo R., Fabien Jourdan, Diego M. Salvanha, Fabien Letisse, Emilien L. Jamin, Simone Guidetti-Gonzalez, Carlos A. Labate, and Ricardo Z. N. Vêncio. 2014. “ProbMetab: An R Package for Bayesian Probabilistic Annotation of LC–MS-based Metabolomics.” Bioinformatics 30 (9): 1336–37. https://doi.org/10.1093/bioinformatics/btu019. Sindelar, Miriam, and Gary J. Patti. 2020. “Chemical Discovery in the Era of Metabolomics.” Journal of the American Chemical Society, April. https://doi.org/10.1021/jacs.9b13198. Spalding, Jonathan L., Kevin Cho, Nathaniel G. Mahieu, Igor Nikolskiy, Elizabeth M. Llufrio, Stephen L. Johnson, and Gary J. Patti. 2016. “Bar Coding MS2 Spectra for Metabolite Identification.” Analytical Chemistry 88 (5): 2538–42. https://doi.org/10.1021/acs.analchem.5b04925. Sumner, Lloyd W., Alexander Amberg, Dave Barrett, Michael H. Beale, Richard Beger, Clare A. Daykin, Teresa W.-M. Fan, et al. 2007. “Proposed Minimum Reporting Standards for Chemical Analysis Chemical Analysis Working Group (CAWG) Metabolomics Standards Initiative (MSI).” Metabolomics : Official Journal of the Metabolomic Society 3 (3): 211–21. https://doi.org/10.1007/s11306-007-0082-2. Tian, Zhitao, Xin Hu, Yingying Xu, Mengmeng Liu, Hongbo Liu, Dongqin Li, Lisong Hu, Guozhu Wei, and Wei Chen. 2023. 
“PMhub 1.0: A Comprehensive Plant Metabolome Database.” Nucleic Acids Research, October, gkad811. https://doi.org/10.1093/nar/gkad811. Torigoe, Taihei, Masatomo Takahashi, Omidreza Heravizadeh, Kazuki Ikeda, Kohta Nakatani, Takeshi Bamba, and Yoshihiro Izumi. 2024. “Predicting Retention Time in Unified-Hydrophilic-Interaction/Anion-Exchange Liquid Chromatography High-Resolution Tandem Mass Spectrometry (Unified-HILIC/AEX/HRMS/MS) for Comprehensive Structural Annotation of Polar Metabolome.” Analytical Chemistry 96 (3): 1275–83. https://doi.org/10.1021/acs.analchem.3c04618. Treutler, Hendrik, Hiroshi Tsugawa, Andrea Porzel, Karin Gorzolka, Alain Tissier, Steffen Neumann, and Gerd Ulrich Balcke. 2016. “Discovering Regulated Metabolite Families in Untargeted Metabolomics Studies.” Analytical Chemistry 88 (16): 8082–90. https://doi.org/10.1021/acs.analchem.6b01569. Tsugawa, Hiroshi, Tobias Kind, Ryo Nakabayashi, Daichi Yukihira, Wataru Tanaka, Tomas Cajka, Kazuki Saito, Oliver Fiehn, and Masanori Arita. 2016. “Hydrogen Rearrangement Rules: Computational MS/MS Fragmentation and Structure Elucidation Using MS-FINDER Software.” Analytical Chemistry 88 (16): 7946–58. https://doi.org/10.1021/acs.analchem.6b00770. Uppal, Karan, Douglas I. Walker, and Dean P. Jones. 2017. “xMSannotator: An R Package for Network-Based Annotation of High-Resolution Metabolomics Data.” Analytical Chemistry 89 (2): 1063–67. https://doi.org/10.1021/acs.analchem.6b01214. van Tetering, Lara, Sylvia Spies, Quirine D. K. Wildeman, Kas J. Houthuijs, Rianne E. van Outersterp, Jonathan Martens, Ron A. Wevers, David S. Wishart, Giel Berden, and Jos Oomens. 2024. “A Spectroscopic Test Suggests That Fragment Ion Structure Annotations in MS/MS Libraries Are Frequently Incorrect.” Communications Chemistry 7 (1): 1–11. https://doi.org/10.1038/s42004-024-01112-7. Viant, Mark R, Irwin J Kurland, Martin R Jones, and Warwick B Dunn. 2017. 
“How Close Are We to Complete Annotation of Metabolomes?” Current Opinion in Chemical Biology, Omics, 36 (February): 64–69. https://doi.org/10.1016/j.cbpa.2017.01.001. Weber, Ralf J. M., and Mark R. Viant. 2010. “MI-Pack: Increased Confidence of Metabolite Identification in Mass Spectra by Integrating Accurate Masses and Metabolic Pathways.” Chemometrics and Intelligent Laboratory Systems, OMICS, 104 (1): 75–82. https://doi.org/10.1016/j.chemolab.2010.04.010. Witting, Michael, Christoph Ruttkies, Steffen Neumann, and Philippe Schmitt-Kopplin. 2017. “LipidFrag: Improving Reliability of in Silico Fragmentation of Lipids and Application to the Caenorhabditis Elegans Lipidome.” PLOS ONE 12 (3): e0172311. https://doi.org/10.1371/journal.pone.0172311. Wolf, Sebastian, Stephan Schmidt, Matthias Müller-Hannemann, and Steffen Neumann. 2010. “In Silico Fragmentation for Computer Assisted Identification of Metabolite Mass Spectra.” BMC Bioinformatics 11 (March): 148. https://doi.org/10.1186/1471-2105-11-148. Xing, Shipei, Sam Shen, Banghua Xu, Xiaoxiao Li, and Tao Huan. 2023. “BUDDY: Molecular Formula Discovery via Bottom-up MS/MS Interrogation.” Nature Methods, April, 1–10. https://doi.org/10.1038/s41592-023-01850-x. Xu, Yi-Fan, Wenyun Lu, and Joshua D. Rabinowitz. 2015. “Avoiding Misannotation of In-Source Fragmentation Products as Cellular Metabolites in Liquid Chromatography–Mass Spectrometry-Based Metabolomics.” Analytical Chemistry 87 (4): 2273–81. https://doi.org/10.1021/ac504118y. Xue, Jingchuan, Rico J. E. Derks, Bill Webb, Elizabeth M. Billings, Aries Aisporna, Martin Giera, and Gary Siuzdak. 2021. “Single Quadrupole Multiple Fragment Ion Monitoring Quantitative Mass Spectrometry.” Analytical Chemistry 93 (31): 10879–89. https://doi.org/10.1021/acs.analchem.1c01246. Xue, Jingchuan, Carlos Guijas, H. Paul Benton, Benedikt Warth, and Gary Siuzdak. 2020. 
“METLIN MS 2 Molecular Standards Database: A Broad Chemical and Biological Resource.” Nature Methods 17 (10): 953–54. https://doi.org/10.1038/s41592-020-0942-5. Xue, Jingchuan, Jiamin Zhu, Lixin Hu, Junjie Yang, Yunbo Lv, Fanrong Zhao, Yuxian Liu, Tao Zhang, Yanpeng Cai, and Mingliang Fang. 2023. “EISA-EXPOSOME: One Highly Sensitive and Autonomous Exposomic Platform with Enhanced in-Source Fragmentation/Annotation.” Analytical Chemistry, November. https://doi.org/10.1021/acs.analchem.3c02697. Yang, Qiong, Hongchao Ji, Zhenbo Xu, Yiming Li, Pingshan Wang, Jinyu Sun, Xiaqiong Fan, Hailiang Zhang, Hongmei Lu, and Zhimin Zhang. 2023. “Ultra-Fast and Accurate Electron Ionization Mass Spectrum Matching for Compound Identification with Million-Scale in-Silico Library.” Nature Communications 14 (1): 3722. https://doi.org/10.1038/s41467-023-39279-7. Yu, Miao, Georgia Dolios, and Lauren Petrick. 2022. “Reproducible Untargeted Metabolomics Workflow for Exhaustive MS2 Data Acquisition of MS1 Features.” Journal of Cheminformatics 14 (1): 6. https://doi.org/10.1186/s13321-022-00586-8. Yu, Miao, Mariola Olkowicz, and Janusz Pawliszyn. 2019. “Structure/Reaction Directed Analysis for LC-MS Based Untargeted Analysis.” Analytica Chimica Acta 1050 (March): 16–24. https://doi.org/10.1016/j.aca.2018.10.062. Zhang, Xiuqiong, Zaifang Li, Chunxia Zhao, Tiantian Chen, Xinxin Wang, Xiaoshan Sun, Xinjie Zhao, Xin Lu, and Guowang Xu. 2024. “Leveraging Unidentified Metabolic Features for Key Pathway Discovery: Chemical Classification-driven Network Analysis in Untargeted Metabolomics.” Analytical Chemistry, February. https://doi.org/10.1021/acs.analchem.3c04591. Zhang, Yuhao, Jingyu Liao, Wanqi Le, Gaosong Wu, and Weidong Zhang. 2023. “Improving the Data Quality of Untargeted Metabolomics Through a Targeted Data-Dependent Acquisition Based on an Inclusion List of Differential and Preidentified Ions.” Analytical Chemistry 95 (34): 12964–73. https://doi.org/10.1021/acs.analchem.3c02888. 
Zhao, Tingting, Shipei Xing, Huaxu Yu, and Tao Huan. 2023. “De Novo Cleaning of Chimeric MS/MS Spectra for LC-MS/MS-Based Metabolomics.” Analytical Chemistry 95 (35): 13018–28. https://doi.org/10.1021/acs.analchem.3c00736. Zhou, Zhiwei, Mingdu Luo, Haosong Zhang, Yandong Yin, Yuping Cai, and Zheng-Jiang Zhu. 2022. “Metabolite Annotation from Knowns to Unknowns Through Knowledge-Guided Multi-Layer Metabolic Networking.” Nature Communications 13 (1): 6656. https://doi.org/10.1038/s41467-022-34537-6. "],["omics-analysis.html", "Chapter 8 Omics analysis 8.1 From Bottom-up to Top-down 8.2 Pathway analysis 8.3 Network analysis 8.4 Omics integration", " Chapter 8 Omics analysis When you get the filtered ions, the next step is making annotations for them. Such annotations are helpful for omics studies. Omics analysis tries to combine the information from other ‘omics’ to answer one specific question. Once the annotations are available, omics analysis can be performed. Upload the data obtained from xcms to other tools or databases. You will get an updated database list here. Right now, it is hard to connect different omics databases, such as those for genes, proteins and metabolites, into a complete picture of a certain biological process. However, you might select a few metabolites across those databases and find something interesting. 8.1 From Bottom-up to Top-down Bottom-up analysis means building a model for each metabolite. In this case, we could find out which metabolites are affected by our experimental design. However, take care of the multiple comparison issue. \\[ metabolite = f(control/treatment, co-variables) \\] Top-down analysis means building a model for the output. In this case, we could evaluate the contribution of each metabolite. You need variable selection to make a better model. \\[ control/treatment = f(metabolite 1,metabolite 2,...,metaboliteN,co-variables) \\] For omics studies, you might need to integrate datasets from different sources. 
\\[ control/treatment = f(metabolites, proteins, genes, miRNA,co-variables) \\] 8.2 Pathway analysis Pathway analysis maps annotated data onto known pathways and uses statistical analysis to find the influenced pathways or the compounds with high influence on a certain pathway. 8.2.1 Pathway Database SMPDB (The Small Molecule Pathway Database) is an interactive, visual database containing more than 618 small molecule pathways found in humans. More than 70% of these pathways (&gt;433) are not found in any other pathway database. The pathways include metabolic, drug, and disease pathways. KEGG (Kyoto Encyclopedia of Genes and Genomes) is one of the most complete and widely used databases containing metabolic pathways (495 reference pathways) from a wide variety of organisms (&gt;4,700). These pathways are hyperlinked to metabolite and protein/enzyme information. Currently KEGG has &gt;17,000 compounds (from animals, plants and bacteria), 10,000 drugs (including different salt forms and drug carriers) and nearly 11,000 glycan structures. BioCyc is a collection of 14558 Pathway/Genome Databases (PGDBs), plus software tools for exploring them. Reactome is an open-source, open access, manually curated and peer-reviewed pathway database. Its goal is to provide intuitive bioinformatics tools for the visualization, interpretation and analysis of pathway knowledge to support basic and clinical research, genome analysis, modeling, systems biology and education. WikiPathways is a database of biological pathways maintained by and for the scientific community. 8.2.2 Pathway software Pathway Commons provides online tools for pathway analysis. RaMP could make pathway analysis for batch search. metabox could make pathway analysis. impala is used for pathway enrichment analysis. Metscape is based on the Debiased Sparse Partial Correlation (DSPC) algorithm (Basu et al. 2017) to make annotation. 8.3 Network analysis Mummichog could make pathway and network analysis without annotation. 
MSS is a sequential feature screening procedure to select important sub-networks and identify the optimal matching for metabolomics data (Q. Cai et al. 2017). Metapone is a joint pathway testing package for untargeted metabolomics data (L. Tian et al. 2022). 8.4 Omics integration Blast finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance. The Omics Discovery Index (OmicsDI) provides a knowledge discovery framework across heterogeneous omics data (genomics, proteomics, transcriptomics and metabolomics). Omics Data Integration Project. For standardized multi-omics of Earth’s microbiomes, check this GNPS-based work (Shaffer et al. 2022). Windows Scanning Multiomics: Integrated Metabolomics and Proteomics (Shi et al. 2023). References Basu, Sumanta, William Duren, Charles R. Evans, Charles F. Burant, George Michailidis, and Alla Karnovsky. 2017. “Sparse Network Modeling and Metscape-Based Visualization Methods for the Analysis of Large-Scale Metabolomics Data.” Bioinformatics 33 (10): 1545–53. https://doi.org/10.1093/bioinformatics/btx012. Cai, Qingpo, Jessica A. Alvarez, Jian Kang, and Tianwei Yu. 2017. “Network Marker Selection for Untargeted LC–MS Metabolomics Data.” Journal of Proteome Research 16 (3): 1261–69. https://doi.org/10.1021/acs.jproteome.6b00861. Shaffer, Justin P., Louis-Félix Nothias, Luke R. Thompson, Jon G. Sanders, Rodolfo A. Salido, Sneha P. Couvillion, Asker D. Brejnrod, et al. 2022. “Standardized Multi-Omics of Earth’s Microbiomes Reveals Microbial and Metabolite Diversity.” Nature Microbiology 7 (12): 2128–50. https://doi.org/10.1038/s41564-022-01266-x. Shi, Jiachen, Jialiang Zhao, Yu Zhang, Yanan Wang, Chin Ping Tan, Yong-Jiang Xu, and Yuanfa Liu. 2023. “Windows Scanning Multiomics: Integrated Metabolomics and Proteomics.” Analytical Chemistry, December. https://doi.org/10.1021/acs.analchem.3c03785. 
Tian, Leqi, Zhenjiang Li, Guoxuan Ma, Xiaoyue Zhang, Ziyin Tang, Siheng Wang, Jian Kang, Donghai Liang, and Tianwei Yu. 2022. “Metapone: A Bioconductor Package for Joint Pathway Testing for Untargeted Metabolomics Data.” Bioinformatics 38 (14): 3662–64. https://doi.org/10.1093/bioinformatics/btac364. "],["peaks-normalization.html", "Chapter 9 Peaks normalization 9.1 Batch effects 9.2 Batch effects classification 9.3 Batch effects visualization 9.4 Source of batch effects 9.5 Avoid batch effects by DoE 9.6 post hoc data normalization 9.7 Method to validate the normalization 9.8 Software", " Chapter 9 Peaks normalization 9.1 Batch effects Batch effects are the variances caused by factors other than the experimental design. We could simply make a linear model for the intensity of one peak: \\[Intensity = Average + Condition + Batch + Error\\] Research is focused on the condition contribution, and the overall average or random error could be estimated. However, we know little about the batch contribution. Sometimes we could use known variables such as injection order or operators as the batch part. However, in most cases such variables are unknown. Almost all batch correction methods try to use some estimate to balance or remove the batch effect. For analytical chemistry, internal standards or pooled quality control samples actually stand for the batch contribution part in the model. However, it’s impractical to get all the internal standards when the data are collected in untargeted mode. For methods using internal standards or pooled quality control samples, the variations among those samples are usually removed using the median, quantile, mean or ratios. Other ways like quantile regression, centering and scaling based on the distribution within samples could be treated as using the stable distribution of peak intensities to remove batch effects. 
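As a minimal sketch of the linear model above, the following R simulation assumes a balanced design with one treatment effect and one block-type batch shift. The variable names and effect sizes are illustrative assumptions, not values from a real study. It compares a fit that ignores the batch term with one that includes it.

```r
set.seed(1)
n = 20
# alternating control (0) and treatment (1) samples
condition = rep(c(0, 1), times = n / 2)
# first half of the run in batch 0, second half in batch 1
batch = rep(c(0, 1), each = n / 2)
# Intensity = Average + Condition + Batch + Error
intensity = 10 + 2 * condition + 3 * batch + rnorm(n, sd = 0.5)
# model without and with the batch term
fit_naive = lm(intensity ~ condition)
fit_batch = lm(intensity ~ condition + batch)
# residual standard error of both fits
c(naive = sigma(fit_naive), with_batch = sigma(fit_batch))
# estimated average, condition and batch contributions
coef(fit_batch)
```

When the batch variable is known, as assumed here, adding it to the model shrinks the residual error. When it is unknown, it has to be estimated, for example from pooled QC samples.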
9.2 Batch effects classification Variances among the samples across all the extracted peaks might be affected by factors other than the experimental design. There are three types of such batch effects: Monotone, Block and Mixed. Monotone effects increase or decrease with the injection order or batches. Block effects are systematic shifts among different batches. Mixed effects are a combination of monotone and block batch effects. Meanwhile, different compounds could suffer from different types of batch effects. In this case, the normalization or batch correction should be done peak by peak. 9.3 Batch effects visualization Any correction might introduce bias. We need to make sure there are patterns which differ from our experimental design. Pooled QC samples should be clustered on the PCA score plot. 9.4 Source of batch effects Different Operators &amp; Dates &amp; Sequences. Different Instrumental Condition such as different instrumental parameters, poor quality control, sample contamination during the analysis, Column (Pooled QC) and sample matrix effects (ion suppression or/and enhancement). Unknown Unknowns. 9.5 Avoid batch effects by DoE You could avoid batch effects through experimental design. Cap the sequence with Pooled QC and randomize the sample sequence. Some internal standards/instrumental QC might help to find the source of batch effects, although this is not practical for every compound in non-targeted analysis. Batch effects might not change the conclusion when the effect size is relatively small. Here is a simulation: set.seed(30) # real peaks group &lt;- factor(c(rep(1,5),rep(2,5))) con &lt;- c(rnorm(5,5),rnorm(5,8)) re &lt;- t.test(con~group) # peaks with a monotone batch effect group &lt;- factor(c(rep(1,5),rep(2,5))) con &lt;- c(rnorm(5,5),rnorm(5,8)) batch &lt;- seq(0,5,length.out = 10) ins &lt;- batch+con re &lt;- t.test(ins~group) # randomized injection order index &lt;- sample(10) ins &lt;- batch+con[index] re &lt;- t.test(ins~group[index]) Randomization could not guarantee the results. Here is a simulation. 
# peaks with a batch effect opposite to the group difference group &lt;- factor(c(rep(1,5),rep(2,5))) con &lt;- c(rnorm(5,5),rnorm(5,8)) batch &lt;- seq(5,0,length.out = 10) ins &lt;- batch+con re &lt;- t.test(ins~group) 9.6 post hoc data normalization To make the samples comparable, normalization across the samples is always needed once the experimental part is done. A batch effect should show patterns other than the experimental design; otherwise it is just noise. Correction is possible by data analysis or randomized experimental design. There are numerous normalization methods as well as combinations of them. We could divide those methods into two categories: unsupervised and supervised. Unsupervised methods only consider the distribution of peak intensities across the samples. For example, quantile calibration tries to make the intensity distributions among the samples similar. Such methods are preferred to explore the inner structures of the samples. Internal standards or pooled QC samples also belong to this category. However, it’s hard to take a few peaks to stand for all extracted peaks. Supervised methods use the group information or batch information in the experimental design to normalize the data. A linear model is commonly used to model the unwanted variances and remove them for further analysis. Since the real batch effects are always unknown, it’s hard to validate different normalization methods. Li et al. developed NOREVA to compare 25 correction methods (B. Li et al. 2017), and a recent update raised this number to 168 (Qingxia Yang et al. 2020). MetaboDrift also contains some methods for batch correction in Excel (Thonusin et al. 2017). Another idea is to use spiked-in samples to validate the methods (Franceschi et al. 2012), which might be better suited to targeted analysis than to non-targeted analysis. Relative log abundance (RLA) plots (De Livera et al. 2012) and heatmaps are often used to show the variances among the samples. 
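The relative log abundance (RLA) plot mentioned above can be sketched in a few lines of R. This example assumes a hypothetical matrix of log intensities with peaks in rows and samples in columns, and a simulated shift added to the last three samples. It uses the across-sample form of RLA (each peak minus its median over all samples), which is an assumption about the variant intended here.

```r
set.seed(2)
# hypothetical log intensities: 100 peaks (rows) x 6 samples (columns)
logint = matrix(rnorm(600, mean = 10), nrow = 100)
# simulated batch shift on samples 4 to 6
logint[, 4:6] = logint[, 4:6] + 1
# RLA: subtract the median of each peak across all samples
rla = sweep(logint, 1, apply(logint, 1, median))
# one box per sample; boxes away from zero point to unwanted variation
boxplot(rla)
sample_median = apply(rla, 2, median)
round(sample_median, 2)
```

Samples without unwanted variation should give boxes centered near zero with similar spread, so the shifted samples stand out in this plot.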
9.6.1 Unsupervised methods 9.6.1.1 Distribution of intensity Intensities collected from LC/GC-MS usually show a right-skewed distribution. Log transformation is often necessary for further statistical analysis. 9.6.1.2 Centering For peak p of sample s in batch b, the corrected abundance \\(\\hat I\\) is: \\[\\hat I_{p,s,b} = I_{p,s,b} - mean(I_{p,b}) + median(I_{p,qc})\\] If no quality control samples are used, the corrected abundance \\(\\hat I\\) would be: \\[\\hat I_{p,s,b} = I_{p,s,b} - mean(I_{p,b})\\] 9.6.1.3 Scaling For peak p of sample s in a certain batch b, the corrected abundance \\(\\hat I\\) is: \\[\\hat I_{p,s,b} = \\frac{I_{p,s,b} - mean(I_{p,b})}{std_{p,b}} * std_{p,qc,b} + mean(I_{p,qc,b})\\] If no quality control samples are used, the corrected abundance \\(\\hat I\\) would be: \\[\\hat I_{p,s,b} = \\frac{I_{p,s,b} - mean(I_{p,b})}{std_{p,b}}\\] 9.6.1.4 Pareto Scaling For peak p of sample s in a certain batch b, the corrected abundance \\(\\hat I\\) is: \\[\\hat I_{p,s,b} = \\frac{I_{p,s,b} - mean(I_{p,b})}{Sqrt(std_{p,b})} * Sqrt(std_{p,qc,b}) + mean(I_{p,qc,b})\\] If no quality control samples are used, the corrected abundance \\(\\hat I\\) would be: \\[\\hat I_{p,s,b} = \\frac{I_{p,s,b} - mean(I_{p,b})}{Sqrt(std_{p,b})}\\] 9.6.1.5 Range Scaling For peak p of sample s in a certain batch b, the corrected abundance \\(\\hat I\\) is: \\[\\hat I_{p,s,b} = \\frac{I_{p,s,b} - mean(I_{p,b})}{max(I_{p,b}) - min(I_{p,b})} * (max(I_{p,qc,b}) - min(I_{p,qc,b})) + mean(I_{p,qc,b})\\] If no quality control samples are used, the corrected abundance \\(\\hat I\\) would be: \\[\\hat I_{p,s,b} = \\frac{I_{p,s,b} - mean(I_{p,b})}{max(I_{p,b}) - min(I_{p,b})} \\] 9.6.1.6 Level scaling For peak p of sample s in a certain batch b, the corrected abundance \\(\\hat I\\) is: \\[\\hat I_{p,s,b} = \\frac{I_{p,s,b} - mean(I_{p,b})}{mean(I_{p,b})} * mean(I_{p,qc,b}) + mean(I_{p,qc,b})\\] If no quality control samples are used, the corrected abundance \\(\\hat I\\) would be: \\[\\hat I_{p,s,b} = \\frac{I_{p,s,b} - mean(I_{p,b})}{mean(I_{p,b})} \\] 9.6.1.7 Quantile The idea of quantile calibration is that alignment of the 
intensities across samples is performed according to the quantiles within each sample. Here is the demo: set.seed(42) a &lt;- rnorm(1000) # b suffered a batch effect with a bias of 10 b &lt;- rnorm(1000,10) hist(a,xlim=c(-5,15),breaks = 50) hist(b,col = &#39;black&#39;, breaks = 50, add=TRUE) # quantile normalized cor &lt;- (a[order(a)]+b[order(b)])/2 # reorder cor &lt;- cor[order(order(a))] hist(cor,col = &#39;red&#39;, breaks = 50, add=TRUE) 9.6.1.8 Ratio based calibration This method calibrates samples by the ratio between QC samples in all samples and QC samples in a certain batch. For peak p of sample s in a certain batch b, the corrected abundance \\(\\hat I\\) is: \\[\\hat I_{p,s,b} = \\frac{I_{p,s,b} * median(I_{p,qc})}{mean(I_{p,qc,b})}\\] set.seed(42) # raw data I = c(rnorm(10,mean = 0, sd = 0.3),rnorm(10,mean = 1, sd = 0.5)) # batch B = c(rep(0,10),rep(1,10)) # qc, one per batch Iqc = c(rnorm(1,mean = 0, sd = 0.3),rnorm(1,mean = 1, sd = 0.5)) # corrected data, using the QC of the matching batch Icor = I * median(Iqc)/Iqc[B+1] # plot the result plot(I) plot(Icor) 9.6.1.9 Linear Normalizer This method initially scales each sample so that the sum of all peak abundances equals one. The scaled data are then multiplied by the median of the total peak abundances across all samples to get the corrected data. set.seed(42) # raw data, one sample per row peaksa &lt;- c(rnorm(10,mean = 10, sd = 0.3),rnorm(10,mean = 20, sd = 0.5)) peaksb &lt;- c(rnorm(10,mean = 10, sd = 0.3),rnorm(10,mean = 20, sd = 0.5)) df &lt;- rbind(peaksa,peaksb) # scale each sample to sum one, then multiply by the median total abundance dfcor &lt;- df/rowSums(df) * median(rowSums(df)) image(df) image(dfcor) 9.6.1.10 Internal standards \\[\\hat I_{p,s} = \\frac{I_{p,s} * median(I_{IS})}{I_{IS,s}}\\] Some methods also use pooled calibration samples and a multiple internal standard strategy to correct the data (van der Kloet et al. 2009; Sysi-Aho et al. 2007). Other methods only use QC samples to handle the data (Kuligowski et al. 2015). 
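The internal standard formula above can be checked on a small synthetic example. This sketch assumes a purely multiplicative drift shared by the internal standard and two hypothetical peaks. The object names (drift, is_signal, peaks) are made up for illustration.

```r
n_sample = 8
# simulated multiplicative drift along the injection order
drift = seq(1, 2, length.out = n_sample)
# internal standard intensity in each sample
is_signal = 100 * drift
# two hypothetical peaks (rows) measured in the same samples (columns)
peaks = rbind(p1 = 50 * drift, p2 = 500 * drift)
# corrected = I * median(I_IS) / I_IS of the same sample
corrected = t(t(peaks) / is_signal) * median(is_signal)
# after correction each peak is constant across samples
round(corrected, 2)
```

Real data rarely follow such an ideal shared drift, which is why a single internal standard seldom corrects every peak equally well.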
9.6.2 Supervised methods 9.6.2.1 Regression calibration Considering the batch effect of injection order, regress the data by a linear model to get the calibration. 9.6.2.2 Batch Normalizer Use the total abundance scale and then fit with the regression line (S.-Y. Wang, Kuo, and Tseng 2013). 9.6.2.3 Surrogate Variable Analysis(SVA) We have a data matrix (M*N), where M is the number of peaks and N is the number of samples. For peak \\(i\\), \\(X_i = (x_{i1},...,x_{in})^T\\) stands for its normalized intensities across the samples. We use \\(Y = (y_1,...,y_n)^T\\) to stand for the group information of the samples. Then we could build such a model: \\[x_{ij} = \\mu_i + f_i(y_j) + e_{ij}\\] \\(\\mu_i\\) stands for the baseline of the peak intensities in a normal state. Then we have: \\[f_i(y_j) = E(x_{ij}|y_j) - \\mu_i\\] which stands for the biological variations caused by our groups, for example, whether treated by exposure or not. However, considering the batch effects, the real model could be: \\[x_{ij} = \\mu_i + f_i(y_j) + \\sum_{l = 1}^L \\gamma_{li}p_{lj} + e_{ij}^*\\] \\(\\gamma_{li}\\) stands for the peak-specific coefficient for potential factor \\(l\\). \\(p_{lj}\\) stands for the potential factors across the samples. Actually, the error term \\(e_{ij}\\) in real samples could always be decomposed as \\(e_{ij} = \\sum_{l = 1}^L \\gamma_{li}p_{lj} + e_{ij}^*\\) with \\(e_{ij}^*\\) standing for the real random error in a certain sample for a certain peak. We could not get the potential factors directly. Since we don’t care about the details of the unknown factors, we could estimate orthogonal vectors \\(h_k\\) standing for such potential factors. 
Thus we have: \\[ x_{ij} = \\mu_i + f_i(y_j) + \\sum_{l = 1}^L \\gamma_{li}p_{lj} + e_{ij}^*\\\\ = \\mu_i + f_i(y_j) + \\sum_{k = 1}^K \\lambda_{ki}h_{kj} + e_{ij}^* \\] Here are the details of the algorithm. The algorithm is decomposed into two parts: detection of unmodeled factors and construction of surrogate variables. 9.6.2.3.1 Detection of unmodeled factors Estimate \\(\\hat\\mu_i\\) and \\(f_i\\) by fitting the model \\(x_{ij} = \\mu_i + f_i(y_j) + e_{ij}\\) and get the residual \\(r_{ij} = x_{ij}-\\hat\\mu_i - \\hat f_i(y_j)\\). Then we have the residual matrix R. Perform the singular value decomposition (SVD) of the residual matrix \\(R = UDV^T\\). Let \\(d_l\\) be the \\(l\\)th eigenvalue, that is the \\(l\\)th diagonal element of the diagonal matrix D, for \\(l = 1,...,n\\). Set \\(df\\) as the degrees of freedom of the model \\(\\hat\\mu_i + \\hat f_i(y_j)\\). We could build a statistic \\(T_k\\) as: \\[T_k = \\frac{d_k^2}{\\sum_{l=1}^{n-df}d_l^2}\\] to show the variance explained by the \\(k\\)th eigenvalue. Permute each row of R to remove the structure in the matrix and get \\(R^*\\). Fit the model \\(r_{ij}^* = \\mu_i^* + f_i^*(y_j) + e^*_{ij}\\) and get \\(r_{ij}^0 = r^*_{ij}-\\hat\\mu^*_i - \\hat f^*_i(y_j)\\) as a null matrix \\(R_0\\). Perform the singular value decomposition (SVD) of the null matrix \\(R_0 = U_0D_0V_0^T\\). Compute the null statistic: \\[ T_k^0 = \\frac{d_{0k}^2}{\\sum_{l=1}^{n-df}d_{0l}^2} \\] Repeat permuting the rows B times to get the null statistics \\(T_k^{0b}\\). Get the p-value for each eigengene: \\[p_k = \\frac{\\#\\{T_k^{0b}\\geq T_k;b=1,...,B \\}}{B}\\] For a significance level \\(\\alpha\\), treat eigengene k as a significant signature of residual R if \\(p_k\\leq\\alpha\\). 9.6.2.3.2 Construction of surrogate variables Estimate \\(\\hat\\mu_i\\) and \\(f_i\\) by fitting the model \\(x_{ij} = \\mu_i + f_i(y_j) + e_{ij}\\) and get the residual \\(r_{ij} = x_{ij}-\\hat\\mu_i - \\hat f_i(y_j)\\). Then we have the residual matrix R. 
Perform the singular value decomposition (SVD) of the residual matrix \\(R = UDV^T\\). Let \\(e_k = (e_{k1},...,e_{kn})^T\\) be the \\(k\\)th column of V. Set \\(\\hat K\\) as the number of significant eigenvalues found by the first step. Regress each \\(e_k\\) on \\(x_i\\) and get the p-value for the association. Set \\(\\pi_0\\) as the proportion of the peak intensities \\(x_i\\) not associated with \\(e_k\\), and find the number \\(\\hat m_1 =[(1-\\hat \\pi_0) \\times m]\\) and the indices of the peaks associated with the eigengene. Form the reduced matrix \\(X_r\\) of size \\(\\hat m_1 \\times N\\); this matrix stands for the potential variables. As was done for R, get the eigengenes of \\(X_r\\) and denote these by \\(e_j^r\\). Let \\(j^* = argmax_{1\\leq j \\leq n}cor(e_k,e_j^r)\\) and set \\(\\hat h_k=e_{j^*}^r\\). Set the estimate of the surrogate variable to be the eigengene of the reduced matrix most correlated with the corresponding residual eigengene. Since the reduced matrix is enriched for peaks associated with this residual eigengene, this is a principled choice for the estimated surrogate variable that allows for correlation with the primary variable. Employ the \\(\\mu_i + f_i(y_j) + \\sum_{k = 1}^K \\lambda_{ki}\\hat h_{kj} + e_{ij}^*\\) as the estimate of the ideal model \\(\\mu_i + f_i(y_j) + \\sum_{k = 1}^K \\lambda_{ki}h_{kj} + e_{ij}^*\\). This method could find the potential unwanted variables for the data. SVA was introduced by Jeff Leek (Leek and Storey 2008, 2007; Leek et al. 2012), and the EigenMS package implements SVA with modifications including analysis of data with missing values that are typical in LC-MS experiments (Karpievitch et al. 2014). 9.6.2.4 RUV (Remove Unwanted Variation) This method’s performance is similar to SVA. Instead of finding surrogate variables from the whole dataset, RUV uses control or pooled QC samples to find the unwanted variances and remove them to find the peaks related to the experimental design. However, we could also empirically estimate the control peaks by a linear mixed model. 
RUV-random (Livera et al. 2015; De Livera et al. 2012) further uses a linear mixed model to estimate the variances of random error. A hierarchical RUV approach was recently proposed for metabolomics data (T. Kim et al. 2021). This method could be used with suitable controls, which are common in metabolomics DoE. 9.6.2.5 RRmix RRmix also uses a latent factor model to correct the data (Jr et al. 2017). This method could be treated as a linear mixed model version of SVA. No control samples are required and the unwanted variances could be removed by factor analysis. This method might be a good choice to remove the unwanted variables with a common experimental design. 9.6.2.6 Norm ISWSVR It is a two-step approach that combines the best-performance internal standard correction with support vector regression normalization, comprehensively removing the systematic and random errors and matrix effects (Ding et al. 2022). 9.7 Method to validate the normalization Various methods have been used for batch correction and evaluation. Simulation provides the ground truth. Difference analysis would be a common method for evaluation. Then we could check whether a peak is a true positive or a false positive by the settings of the simulation. Other methods need statistics or lots of standards to describe the performance of batch correction or normalization results. 9.8 Software BatchCorrMetabolomics is for improved batch correction in untargeted MS-based metabolomics. MetNorm shows statistical methods for normalizing metabolomics data. BatchQC could be used to make batch effect simulation. Noreva could make online batch correction and comparison (J. Fu et al. 2021). References De Livera, Alysha M., Daniel A. Dias, David De Souza, Thusitha Rupasinghe, James Pyke, Dedreia Tull, Ute Roessner, Malcolm McConville, and Terence P. Speed. 2012. “Normalizing and Integrating Metabolomics Data.” Analytical Chemistry 84 (24): 10768–76. https://doi.org/10.1021/ac302748b. 
Ding, Xian, Fen Yang, Yanhua Chen, Jing Xu, Jiuming He, Ruiping Zhang, and Zeper Abliz. 2022. “Norm ISWSVR: A Data Integration and Normalization Approach for Large-Scale Metabolomics.” Analytical Chemistry 94 (21): 7500–7509. https://doi.org/10.1021/acs.analchem.1c05502. Franceschi, Pietro, Domenico Masuero, Urska Vrhovsek, Fulvio Mattivi, and Ron Wehrens. 2012. “A Benchmark Spike-in Data Set for Biomarker Identification in Metabolomics.” Journal of Chemometrics 26 (1-2): 16–24. https://doi.org/10.1002/cem.1420. Fu, Jianbo, Ying Zhang, Yunxia Wang, Hongning Zhang, Jin Liu, Jing Tang, Qingxia Yang, et al. 2021. “Optimization of Metabolomic Data Processing Using NOREVA.” Nature Protocols, December, 1–23. https://doi.org/10.1038/s41596-021-00636-9. Jr, Stephen Salerno, Mahya Mehrmohamadi, Maria V. Liberti, Muting Wan, Martin T. Wells, James G. Booth, and Jason W. Locasale. 2017. “RRmix: A Method for Simultaneous Batch Effect Correction and Analysis of Metabolomics Data in the Absence of Internal Standards.” PLOS ONE 12 (6): e0179530. https://doi.org/10.1371/journal.pone.0179530. Karpievitch, Yuliya V., Sonja B. Nikolic, Richard Wilson, James E. Sharman, and Lindsay M. Edwards. 2014. “Metabolomics Data Normalization with EigenMS.” PLOS ONE 9 (12): e116221. https://doi.org/10.1371/journal.pone.0116221. Kim, Taiyun, Owen Tang, Stephen T. Vernon, Katharine A. Kott, Yen Chin Koay, John Park, David E. James, et al. 2021. “A Hierarchical Approach to Removal of Unwanted Variation for Large-Scale Metabolomics Data.” Nature Communications 12 (1): 4992. https://doi.org/10.1038/s41467-021-25210-5. Kuligowski, Julia, Ángel Sánchez-Illana, Daniel Sanjuán-Herráez, Máximo Vento, and Guillermo Quintás. 2015. “Intra-Batch Effect Correction in Liquid Chromatography-Mass Spectrometry Using Quality Control Samples and Support Vector Regression (QC-SVRC).” Analyst 140 (22): 7810–17. https://doi.org/10.1039/C5AN01638J. Leek, Jeffrey T., W. Evan Johnson, Hilary S. Parker, Andrew E. 
Jaffe, and John D. Storey. 2012. “The Sva Package for Removing Batch Effects and Other Unwanted Variation in High-Throughput Experiments.” Bioinformatics 28 (6): 882–83. https://doi.org/10.1093/bioinformatics/bts034. Leek, Jeffrey T., and John D. Storey. 2007. “Capturing Heterogeneity in Gene Expression Studies by Surrogate Variable Analysis.” PLOS Genet 3 (9): e161. https://doi.org/10.1371/journal.pgen.0030161. ———. 2008. “A General Framework for Multiple Testing Dependence.” Proceedings of the National Academy of Sciences 105 (48): 18718–23. https://doi.org/10.1073/pnas.0808709105. Li, Bo, Jing Tang, Qingxia Yang, Shuang Li, Xuejiao Cui, Yinghong Li, Yuzong Chen, Weiwei Xue, Xiaofeng Li, and Feng Zhu. 2017. “NOREVA: Normalization and Evaluation of MS-based Metabolomics Data.” Nucleic Acids Research 45 (W1): W162–70. https://doi.org/10.1093/nar/gkx449. Livera, Alysha M. De, Marko Sysi-Aho, Laurent Jacob, Johann A. Gagnon-Bartsch, Sandra Castillo, Julie A. Simpson, and Terence P. Speed. 2015. “Statistical Methods for Handling Unwanted Variation in Metabolomics Data.” Analytical Chemistry 87 (7): 3606–15. https://doi.org/10.1021/ac502439y. Sysi-Aho, Marko, Mikko Katajamaa, Laxman Yetukuri, and Matej Orešič. 2007. “Normalization Method for Metabolomics Data Using Optimal Selection of Multiple Internal Standards.” BMC Bioinformatics 8 (March): 93. https://doi.org/10.1186/1471-2105-8-93. Thonusin, Chanisa, Heidi B. IglayReger, Tanu Soni, Amy E. Rothberg, Charles F. Burant, and Charles R. Evans. 2017. “Evaluation of Intensity Drift Correction Strategies Using MetaboDrift, a Normalization Tool for Multi-Batch Metabolomics Data.” Journal of Chromatography A, Pushing the Boundaries of Chromatography and Electrophoresis, 1523 (Supplement C): 265–74. https://doi.org/10.1016/j.chroma.2017.09.023. van der Kloet, Frans M., Ivana Bobeldijk, Elwin R. Verheij, and Renger H. Jellema. 2009. 
“Analytical Error Reduction Using Single Point Calibration for Accurate and Precise Metabolomic Phenotyping.” Journal of Proteome Research 8 (11): 5132–41. https://doi.org/10.1021/pr900499r. Wang, San-Yuan, Ching-Hua Kuo, and Yufeng J. Tseng. 2013. “Batch Normalizer: A Fast Total Abundance Regression Calibration Method to Simultaneously Adjust Batch and Injection Order Effects in Liquid Chromatography/Time-of-Flight Mass Spectrometry-Based Metabolomics Data and Comparison with Current Calibration Methods.” Analytical Chemistry 85 (2): 1037–46. https://doi.org/10.1021/ac302877x. Yang, Qingxia, Yunxia Wang, Ying Zhang, Fengcheng Li, Weiqi Xia, Ying Zhou, Yunqing Qiu, Honglin Li, and Feng Zhu. 2020. “NOREVA: Enhanced Normalization and Evaluation of Time-Course and Multi-Class Metabolomic Data.” Nucleic Acids Research 48 (W1): W436–48. https://doi.org/10.1093/nar/gkaa258. "],["statistical-analysis.html", "Chapter 10 Statistical analysis 10.1 Basic Statistical Analysis 10.2 Differences analysis 10.3 PCA 10.4 Cluster Analysis 10.5 PLSDA 10.6 Network analysis 10.7 Software", " Chapter 10 Statistical analysis The general purposes of a metabolomics study are strongly associated with the research goal. However, since metabolomics is usually performed in a non-targeted mode, statistical analysis usually starts with exploratory analysis. The basic targets of an exploratory analysis are: Find the relationship among variables. Find the relationship among samples/groups of samples. This is basically unsupervised analysis. However, sometimes we have group information which could be used to find biomarkers or correlations between variables and groups or continuous variables. This type of data needs supervised methods to process. A general discussion about statistical analysis in metabolic phenotyping can be found here (Blaise et al. 2021). Before we talk about the details of algorithms, let’s cover some basic statistical concepts. 
10.1 Basic Statistical Analysis A statistic is used to describe a certain property of the variables among the samples. It can be designed for a specific purpose, to extract signal and remove noise. Statistical models and inference are both based on statistics rather than on the raw data. \\[Statistic = f(sample_1,sample_2,...,sample_n)\\] Null Hypothesis Significance Testing (NHST) is often used to make statistical inference. The p value is the probability of observing a statistic at least as extreme as the one obtained, assuming the null hypothesis H0 (a pre-defined distribution) is true. For omics studies, you should be aware of the multiple comparison issue when you perform many (more than 20) comparisons or tests at the same time. False Discovery Rate (FDR) control is required for multiple tests to limit the proportion of false positives among the reported results. You could use the Benjamini-Hochberg method to adjust raw p values, or directly use Storey’s q value for FDR control. NHST is known for frequent misinterpretation of p values as well as for multiple comparison issues. Bayesian hypothesis testing could be an option to cover some drawbacks of NHST. It uses the Bayes factor to compare the evidence for the null hypothesis with that for an alternative hypothesis. \\[Bayes\\ factor = \\frac{p(D|Ha)}{p(D|H0)} = \\frac{posterior\\ odds}{prior\\ odds}\\] A statistical model uses statistics to make predictions or explanations. Most statistical models need their parameters tuned to perform well. A statistical model is built on real data and can be diagnosed with general statistics such as \\(R^2\\) or the ROC curve. When models are built or compared, model selection can be performed. \\[Target = g(Statistic) = g(f(sample_1,sample_2,...,sample_n))\\] The bias-variance tradeoff is an important concept for statistical models. A model can be overfitted (small bias, large variance) or underfitted (large bias, small variance) when its parameters are not well selected. 
\\[E[(y - \\hat f)^2] = \\sigma^2 + Var[\\hat f] + Bias[\\hat f]^2\\] Cross validation could be used to find the best model based on a training-testing strategy such as the jackknife, bootstrap resampling and n-fold cross validation. Regularization can also be used to find the model with the best prediction performance. Ridge regression, LASSO or other general regularization could be employed to build robust models. For supervised models, linear models and tree based models are two basic categories. A linear model is useful to show the independent or correlated relationships of variables and their influence on the predicted variable. A tree based model, on the other hand, builds a hierarchical structure on the variables, and can be combined into ensembles by bagging, random forest or boosting. Loosely speaking, a linear model could be treated as a special case of a tree based model with a single layer. Other models like Support Vector Machine (SVM), Artificial Neural Network (ANN) or Deep Learning also make various assumptions about the data. However, if your final target is prediction, you could try any of those models or even combine their predictions with weights to make a meta-prediction. 10.2 Differences analysis After we get corrected peaks across samples, the next step is to find the differences between two groups. You could also perform ANOVA or the Kruskal-Wallis test for comparisons among more than two groups. The basic idea is to find meaningful differences between groups and extract the corresponding ions or peak groups. So how do we find the differences? In most metabolomics software, this task is completed by a t-test that reports p values and fold changes. If you only compare two groups on one peak, that’s OK. However, if you compare two groups on thousands of peaks, any statistics textbook would tell you to watch out for false positives. For one comparison, the significance level is 0.05, which means a 5% chance of getting a false positive result when there is no real difference. For two comparisons, the chance of at least one false positive would be \\(1-0.95^2\\). 
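The arithmetic above generalizes to any number of tests. As a minimal Python sketch (the function names are illustrative assumptions, not part of any package mentioned in this book), the first function computes the chance of at least one false positive among n independent tests at a 0.05 level, and the second applies the Benjamini-Hochberg adjustment mentioned in Section 10.1:

```python
# Minimal sketch: family-wise error rate and Benjamini-Hochberg adjustment.
# Function names are hypothetical and chosen only for this illustration.


def family_wise_error_rate(n_tests, alpha=0.05):
    # Probability of at least one false positive among n independent tests.
    return 1 - (1 - alpha) ** n_tests


def benjamini_hochberg(p_values):
    # Return BH-adjusted p values in the same order as the input.
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == '__main__':
    for n in (1, 2, 10, 100):
        print(n, round(family_wise_error_rate(n), 7))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Peaks whose adjusted p value falls below the chosen threshold would then be kept; the independence of the tests is itself an assumption that rarely holds exactly for correlated peaks.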
For 10 comparisons, such chances would be \\(1-0.95^{10} = 0.4012631\\). For 100 comparisons, such chances would be \\(1-0.95^{100} = 0.9940795\\). You would almost certainly make mistakes in your results. In statistics, false discovery rate (FDR) control is always mentioned in omics studies for multiple tests. I suggest using q-values to control the FDR. If peaks are selected with q-values below 0.05, we should expect fewer than 5% of those selected peaks to be false positives. We could also use the local false discovery rate, which gives the FDR for an individual peak. However, such values are hard to estimate accurately. Ortmayr et al. argued that fold change might be better than p-values for finding differences (Ortmayr et al. 2016). 10.2.1 T-test or ANOVA If one peak shows significant differences between two groups or among multiple groups, a t-test or ANOVA could be used to find such peaks. However, when multiple hypothesis tests are performed, the probability of false positives increases. In this case, false discovery rate (FDR) control is required. The q value or adjusted p value could be used in this situation. At a chosen significance level, we could find peaks with significant differences after FDR control. 10.2.2 LIMMA The Linear Models for MicroArray Data (LIMMA) model could also be used for high-dimensional data like metabolomics. It uses a moderated t-statistic to estimate the effects, called Empirical Bayes Statistics for Differential Expression. It is a hierarchical model that shrinks the variance estimate of each peak toward a value pooled from all the peaks. Such estimation is more robust. In LIMMA, we could add a known batch effect variable as a covariate in the model. LIMMA is different from the t-test or ANOVA, while we could still use p values and FDR control on LIMMA results. 10.2.3 Bayesian mixture model Another way to perform difference analysis is based on a Bayesian mixture model without p values. 
Such a model does not use hypothesis testing and directly generates the posterior estimates of parameters. A posterior probability could be used to check whether certain peaks are related to different conditions. If we want to compare a classical model like LIMMA with a Bayesian mixture model, we need to use simulation to find the cutoff. 10.3 PCA In most cases, PCA is used as an exploratory data analysis (EDA) method, and in most of those cases it simply serves as a visualization method: when I need to visualize some high-dimensional data, I use PCA. The basic idea behind PCA is compression. When you have 100 samples with concentrations of a certain compound, you could plot the concentrations against the samples’ IDs. However, if you have 100 compounds to be analyzed, it would be hard to show the relationship between the samples. You would need to show a matrix of samples and compounds (100 * 100, filled with the concentrations) in an informative way. PCA would say: OK, guys, I could convert your data into only a 100 * 2 matrix with the loss of information minimized. Yeah, that is what the mathematicians or computer programmers do. You just run the PCA command. The two new “compounds” are correlated with the original 100 compounds and retain as much of the variance among them as possible. After such projection, you would see the compressed relationship between the 100 samples. If some samples’ data are similar, they would be projected close together in the plot of the two new “compounds”. That is why PCA could be used for clustering, and the new “compounds” are referred to as principal components (PCs). However, you might ask why only two new compounds could finish such a task. I have to say, two PCs are just good for visualization. In most cases, we need to collect PCs accounting for more than 80% of the variance in our data if we want to recover the data from the PCs. 
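To make the compression idea concrete, here is a minimal sketch of PCA in plain Python. It is an illustration under simplifying assumptions (power iteration with deflation on the covariance matrix, hypothetical function names, data with at least as many independent directions as requested components), not the algorithm used by any particular package; in practice a tested library routine would be preferable.

```python
# Minimal PCA sketch in plain Python (power iteration with deflation).
# All function names are hypothetical and exist only for this illustration.
import random


def center_columns(data):
    # Subtract the mean of each compound (column) from every sample (row).
    n = len(data)
    means = [sum(col) / n for col in zip(*data)]
    return [[x - m for x, m in zip(row, means)] for row in data]


def covariance(centered):
    # Sample covariance matrix of the compounds.
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_eigen(matrix, iterations=500, seed=0):
    # Power iteration: returns (largest eigenvalue, unit eigenvector).
    rng = random.Random(seed)
    p = len(matrix)
    v = [rng.random() for _ in range(p)]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
    return sum(a * b for a, b in zip(w, v)), v


def pca(data, n_components=2):
    centered = center_columns(data)
    cov = covariance(centered)
    p = len(cov)
    total_variance = sum(cov[i][i] for i in range(p))
    components, explained = [], []
    for k in range(n_components):
        value, vec = leading_eigen(cov, seed=k)
        components.append(vec)
        explained.append(value / total_variance)
        # Deflation: remove the variance already captured by this PC.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(p)]
               for i in range(p)]
    # Scores: each sample projected onto the retained PCs.
    scores = [[sum(x * c for x, c in zip(row, comp)) for comp in components]
              for row in centered]
    return scores, components, explained
```

For a 100 * 100 concentration matrix, `pca(data, n_components=2)` would return the 100 * 2 score matrix discussed above, the loadings of each compound on the two PCs, and the fraction of the total variance each PC explains.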
If the compounds have no relationship with each other, the PCs are still those 100 compounds. So you have found a property of the PCs: PCs are orthogonal to each other. Another issue is how to find the relationship between the compounds. We could use PCA to find the relationship between samples. However, we could also extract the influences of the compounds on certain PCs. You might find that many compounds show the same loading on the first PC. That means the concentration patterns of those compounds look similar. So PCA could also be used to explore the relationship between the compounds. OK, next time you might reach for PCA because you need it, not just because other papers showed it. Besides, there are some other usages of PCA. Loadings are actually correlation coefficients between peaks and their PC scores. Yamamoto et al. (Yamamoto et al. 2014) used a t-test on this correlation coefficient and considered the peaks with statistically significant correlation to the PC score to be biologically meaningful for further study such as annotation. However, such analysis works better when a few PCs explain most of the variance in the dataset. 10.4 Cluster Analysis After we have collected a lot of samples and analyzed the concentrations of many compounds in them, we may ask about the relationship between the samples. You might have sampling information such as the date and the position, and you could use a boxplot or violin plot to explore the relationships among those categorical variables. However, you could also use the data to find some potential relationship. But how? If two samples’ data were almost the same, we might think those samples were from the same potential group. On the other hand, how do we define the “same” in the data? Cluster analysis tells us to just define a “distance” to measure the similarity between samples. Mathematically, such a distance can be defined in many different ways, such as the sum of the absolute values of the differences between samples. 
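As a small sketch of the distance idea, the following Python functions (names chosen here for illustration) compute the sum of absolute differences and, as one alternative, the sum of squared differences between two samples, plus the full table of pairwise distances that a clustering procedure would start from. The first two rows reuse the values of the two-sample example in this section; the third row is made up.

```python
# Minimal sketch of sample-to-sample distances for cluster analysis.
# Function names and the third sample are hypothetical.


def manhattan_distance(a, b):
    # Sum of the absolute values of the differences between two samples.
    return sum(abs(x - y) for x, y in zip(a, b))


def squared_distance(a, b):
    # Sum of squared differences, one possible alternative measure.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pairwise_distances(samples, distance=manhattan_distance):
    # Square table holding the distance between every pair of samples.
    n = len(samples)
    return [[distance(samples[i], samples[j]) for j in range(n)]
            for i in range(n)]


# Amounts (ng) of compounds A, B and C in three samples.
samples = [[10, 13, 21], [54, 23, 16], [12, 15, 20]]
distances = pairwise_distances(samples)

if __name__ == '__main__':
    for row in distances:
        print(row)
```

The pair with the smallest off-diagonal entry in `distances` is the one a hierarchical clustering would merge first.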
For example, we analyzed the amounts of compounds A, B and C in two samples and got the following results (in ng): Sample 1: A = 10, B = 13, C = 21; Sample 2: A = 54, B = 23, C = 16. The distance could be: \\[ distance = |10-54|+|13-23|+|21-16| = 59 \\] You could also use the sum of squares or another measure to represent the similarity. After you have defined a “distance”, you could get the distances between all pairs of your samples. If two samples’ distance is the smallest, put them together as one group. Then calculate the distances again and combine the small groups into bigger groups until all of the samples are included in one group. Then draw a dendrogram of that process. The next issue is how to cluster the samples. You might set a cut-off and directly get the groups from the dendrogram. However, sometimes you are asked to cluster the samples into a certain number of groups, such as three. In such a situation, you need K-means cluster analysis. The basic idea behind K-means is to generate three virtual samples and calculate the distances between those three virtual samples and all of the other samples. There would be three values for each sample. Choose the smallest value and assign that sample to the corresponding group. Then your samples are classified into three groups. You need to calculate the center of each of those three groups and get three new virtual samples. Repeat this process until the group members no longer change, and you get your samples classified. OK, the basic idea behind cluster analysis could be summarized as: define the distance, set your cut-off and find the groups. In this way, you might reveal potential relationships among samples. 10.5 PLSDA PLS-DA, OPLS-DA and HPSO-OPLS-DA (Qin Yang et al. 2017) could be used. Partial least squares discriminant analysis (PLSDA) was first used in the 1990s. However, partial least squares (PLS) was proposed in the 1960s by Herman Wold. 
Principal components analysis produces a weight matrix reflecting the covariance structure between the variables, while partial least squares produces a weight matrix reflecting the covariance structure between the variables and the classes. After rotation by the weight matrix, the new variables contain the relationship with the classes. The classification performance of PLSDA is identical to linear discriminant analysis (LDA) if class sizes are balanced, or the columns are adjusted according to the mean of the class means. If the number of variables exceeds the number of samples, LDA can be performed on the principal components. Quadratic discriminant analysis (QDA) could model nonlinear relationships between variables, while PLSDA is better for collinear variables. However, as a classifier, PLSDA has little advantage. The advantage of PLSDA is that the model could show the relationship between variables, which is not the goal of a regular classifier. Different algorithms (Andersson 2009) for PLSDA give different scores, while PCA always gives the same scores with a fixed algorithm. For PCA, both the new variables and classes are orthogonal. However, for PLS (Wold), only the new classes are orthogonal. For PLS (Martens), only the new variables are orthogonal. Brereton and Lloyd describe the details of using such methods (Brereton and Lloyd 2018). Sparse PLS discriminant analysis (sPLS-DA) applies an L1 penalty for variable selection to remove the influence of unrelated variables, which makes sense for high-throughput omics data (Lê Cao, Boitard, and Besse 2011). For OPLS-DA, the S-plot could be used to find features (Wiklund et al. 2008). 10.6 Network analysis 10.6.1 Vertex and edge Each node is a vertex and the connection between nodes is an edge in the network. The connection can be directed or undirected depending on the relationship. 10.6.2 Build the network Adjacency matrices are commonly used to build the network. An adjacency matrix is a square matrix with n rows and n columns, where n is the number of vertices. 
The entry in row i and column j is equal to 1 if and only if vertices i and j are connected. In a directed network, such values could be 1 for i to j and -1 for j to i. 10.6.3 Network attributes Vertex/edge attributes could be the group information or metadata about the nodes/connections. The edges could be weighted as an attribute. A path is a route from one node to another node in the network, and you could find the shortest path between two nodes. The largest shortest-path distance in a graph is called its diameter. An undirected network is connected if there is a path from any vertex to any other. Connected networks can be further classified according to the strength of their connectedness. An undirected network with at least two disjoint paths between each pair of nodes is said to be biconnected. The transitivity of a network is a crude summary of its structure. A high value means that nodes are well connected locally, with dense subgraphs. Network data sets typically show high transitivity. Maximum flows and minimum cuts could be used to check the largest volume and the smallest path flow between two nodes. For example, if two hubs are connected by one node, the largest volume and smallest path flow between two nodes from each hub could be counted at that node. A sparse network has a number of edges similar to the number of nodes. A dense network has a number of edges that is a quadratic function of the number of nodes. 10.7 Software MetaboAnalystR (Chong, Wishart, and Xia 2019). caret could employ more than 200 statistical models in a general framework to build/select models. You could also show the variable importance for some of the models. caretEnsemble: functions for creating ensembles of caret models. pROC: tools for visualizing, smoothing and comparing receiver operating characteristic (ROC) curves. (Partial) area under the curve (AUC) can be compared with statistical tests based on U-statistics or bootstrap. Confidence intervals can be computed for (p)AUC or ROC curves. 
gWQS: fits Weighted Quantile Sum (WQS) regressions for continuous, binomial, multinomial and count outcomes. Community ecology tools could be used to analyze metabolomic data (Passos Mansoldo et al. 2022). References Andersson, Martin. 2009. “A Comparison of Nine PLS1 Algorithms.” Journal of Chemometrics 23 (10): 518–29. https://doi.org/10.1002/cem.1248. Blaise, Benjamin J., Gonçalo D. S. Correia, Gordon A. Haggart, Izabella Surowiec, Caroline Sands, Matthew R. Lewis, Jake T. M. Pearce, et al. 2021. “Statistical Analysis in Metabolic Phenotyping.” Nature Protocols, July, 1–28. https://doi.org/10.1038/s41596-021-00579-1. Brereton, Richard G., and Gavin R. Lloyd. 2018. “Partial Least Squares Discriminant Analysis for Chemometrics and Metabolomics: How Scores, Loadings, and Weights Differ According to Two Common Algorithms.” Journal of Chemometrics 32 (4): e3028. https://doi.org/10.1002/cem.3028. Chong, Jasmine, David S. Wishart, and Jianguo Xia. 2019. “Using MetaboAnalyst 4.0 for Comprehensive and Integrative Metabolomics Data Analysis.” Current Protocols in Bioinformatics 68 (1): e86. https://doi.org/10.1002/cpbi.86. Lê Cao, Kim-Anh, Simon Boitard, and Philippe Besse. 2011. “Sparse PLS Discriminant Analysis: Biologically Relevant Feature Selection and Graphical Displays for Multiclass Problems.” BMC Bioinformatics 12 (June): 253. https://doi.org/10.1186/1471-2105-12-253. Ortmayr, Karin, Verena Charwat, Cornelia Kasper, Stephan Hann, and Gunda Koellensperger. 2016. “Uncertainty Budgeting in Fold Change Determination and Implications for Non-Targeted Metabolomics Studies in Model Systems.” Analyst 142 (1): 80–90. https://doi.org/10.1039/C6AN01342B. Passos Mansoldo, Felipe Raposo, Rafael Garrett, Veronica da Silva Cardoso, Marina Amaral Alves, and Alane Beatriz Vermelho. 2022. “Metabology: Analysis of Metabolomics Data Using Community Ecology Tools.” Analytica Chimica Acta 1232 (November): 340469. https://doi.org/10.1016/j.aca.2022.340469. 
Wiklund, Susanne, Erik Johansson, Lina Sjöström, Ewa J. Mellerowicz, Ulf Edlund, John P. Shockcor, Johan Gottfries, Thomas Moritz, and Johan Trygg. 2008. “Visualization of GC/TOF-MS-Based Metabolomics Data for Identification of Biochemically Interesting Compounds Using OPLS Class Models.” Analytical Chemistry 80 (1): 115–22. https://doi.org/10.1021/ac0713510. Yamamoto, Hiroyuki, Tamaki Fujimori, Hajime Sato, Gen Ishikawa, Kenjiro Kami, and Yoshiaki Ohashi. 2014. “Statistical Hypothesis Testing of Factor Loading in Principal Component Analysis and Its Application to Metabolite Set Enrichment Analysis.” BMC Bioinformatics 15 (February): 51. https://doi.org/10.1186/1471-2105-15-51. Yang, Qin, Shan-Shan Lin, Jiang-Tao Yang, Li-Juan Tang, and Ru-Qin Yu. 2017. “Detection of Inborn Errors of Metabolism Utilizing GC-MS Urinary Metabolomics Coupled with a Modified Orthogonal Partial Least Squares Discriminant Analysis.” Talanta 165 (April): 545–52. https://doi.org/10.1016/j.talanta.2017.01.018. "],["exposome.html", "Chapter 11 Exposome 11.1 Internal exposure 11.2 External exposure", " Chapter 11 Exposome The nature versus nurture debate has a similar paradigm in environmental studies: are ecological systems and human health risks dominated by heredity or by environment? Twin and sibling studies (Lakhani et al. 2019; Polderman et al. 2015) show that both heritability and environmental factors could explain the phenotypic variance in a population. The contribution of environment in different disease functional domains, such as hematological and endocrine, could reach almost half of the total variance (Polderman et al. 2015). However, besides this epidemiological evidence, little is known about the influence of the overall environmental exposure process at the molecular level. Conventional exposure studies usually investigate one or several specific compounds and their environmental fate or toxicological endpoints. 
The exposome, on the other hand, tries to assess as many exposure factors as possible from biological or environmental samples without a predefined compound list. Those endogenous and exogenous molecules can reveal the exposure process in detail. Exposome studies could help investigate not only comprehensive molecular-level changes, but also the interactions among molecules in a non-targeted design. By subsequent annotation of captured compounds, exposome studies can discover exposure markers for certain types of pollution, as well as biomarkers for certain exposure processes, and discuss the related physiological processes. The workflow for exposome studies is quite similar to that of metabolomics (X. Hu et al. 2021). According to the CDC, the exposome can be defined as the measure of all the exposures of an individual in a lifetime and how those exposures relate to health. Exposomics is the study of the exposome and relies on the application of internal and external exposure assessment methods. Internal exposure relies on fields of study such as genomics, metabolomics, lipidomics, transcriptomics and proteomics. External exposure assessment relies on measuring environmental stresses. The Human Early Life Exposome (HELIX) project (Maitre et al. 2022), a multi-centre cohort of 1301 mother-child pairs, associated individual exposomes consisting of &gt;100 chemical, outdoor, social and lifestyle exposures assessed in pregnancy and childhood with multi-omics profiles (methylome, transcriptome, proteins and metabolites) in childhood. The data could be found online. “Molecular gatekeepers”, key metabolites that link single or multiple exposure biomarkers with correlated clusters of endogenous metabolites, could be used to find health-relevant biological metabolites (M. Yu et al. 2022). 11.1 Internal exposure Virtual Metabolic Human Database integrating human and gut microbiome metabolism with nutrition and disease. 
11.2 External exposure 11.2.1 Environmental fate of compounds 11.2.1.1 QSPR Chemicalize is a powerful online platform for chemical calculations, search, and text processing. QSPR molecular descriptor generation tools list. SPARC uses computational algorithms based on fundamental chemical structure theory to estimate a wide variety of reactivity parameters strictly from molecular structure. OPERA provides models for predicting physicochemical properties and environmental fate endpoints (Mansouri et al. 2018). LogP is important for analytical chemistry. Mannhold et al. (Mannhold et al. 2009) reported a comprehensive comparison of logP algorithms. Later, Rajarshi Guha compared logP algorithms in CDK based on the logPstar dataset. Commercial software such as SPARC, ACD/Labs and ChemAxon might claim better performance on in-house datasets compared with public software like KOWWIN within EPI Suite. However, we should carefully evaluate the influence of logP accuracy on metabolites or unknown compounds. 11.2.1.2 Fate The Wania Group developed software tools to address various aspects of organic contaminant fate and behaviour. Trent University releases models to predict the environmental fate of pollutants, such as Level 3. EAWAG-BBD could provide information on microbial enzyme-catalyzed reactions that are important for biotechnology. 11.2.2 Exposure study database The information system PANGAEA is operated as an Open Access library aimed at archiving, publishing and distributing georeferenced data from earth system research. Environmental Health Criteria (EHC) Monographs. CTD is a robust, publicly available database that aims to advance understanding about how environmental exposures affect human health. ODMOA facilitates and coordinates the collection, access to, and use of public health data in order to monitor and improve population health. These data are best suited to general public health research in Massachusetts. 
The Surveillance, Epidemiology, and End Results (SEER) Program provides information on cancer statistics in an effort to reduce the cancer burden among the U.S. population. References Hu, Xin, Douglas I. Walker, Yongliang Liang, Matthew Ryan Smith, Michael L. Orr, Brian D. Juran, Chunyu Ma, et al. 2021. “A Scalable Workflow to Characterize the Human Exposome.” Nature Communications 12 (1): 5575. https://doi.org/10.1038/s41467-021-25840-9. Lakhani, Chirag M., Braden T. Tierney, Arjun K. Manrai, Jian Yang, Peter M. Visscher, and Chirag J. Patel. 2019. “Repurposing Large Health Insurance Claims Data to Estimate Genetic and Environmental Contributions in 560 Phenotypes.” Nature Genetics 51 (2): 327–34. https://doi.org/10.1038/s41588-018-0313-7. Maitre, Léa, Mariona Bustamante, Carles Hernández-Ferrer, Denise Thiel, Chung-Ho E. Lau, Alexandros P. Siskos, Marta Vives-Usano, et al. 2022. “Multi-Omics Signatures of the Human Early Life Exposome.” Nature Communications 13 (1): 7024. https://doi.org/10.1038/s41467-022-34422-2. Mannhold, Raimund, Gennadiy I. Poda, Claude Ostermann, and Igor V. Tetko. 2009. “Calculation of Molecular Lipophilicity: State-of-the-Art and Comparison of LogP Methods on More Than 96,000 Compounds.” Journal of Pharmaceutical Sciences 98 (3): 861–93. https://doi.org/10.1002/jps.21494. Mansouri, Kamel, Chris M. Grulke, Richard S. Judson, and Antony J. Williams. 2018. “OPERA Models for Predicting Physicochemical Properties and Environmental Fate Endpoints.” Journal of Cheminformatics 10 (1): 10. https://doi.org/10.1186/s13321-018-0263-1. Polderman, Tinca J. C., Beben Benyamin, Christiaan A. de Leeuw, Patrick F. Sullivan, Arjen van Bochoven, Peter M. Visscher, and Danielle Posthuma. 2015. “Meta-Analysis of the Heritability of Human Traits Based on Fifty Years of Twin Studies.” Nature Genetics 47 (7): 702–9. https://doi.org/10.1038/ng.3285. Yu, Miao, Susan L. Teitelbaum, Georgia Dolios, Lam-Ha T. Dang, Peijun Tu, Mary S. Wolff, and Lauren M. Petrick. 
2022. “Molecular Gatekeeper Discovery: Workflow for Linking Multiple Exposure Biomarkers to Metabolomics.” Environmental Science &amp; Technology 56 (10): 6162–71. https://doi.org/10.1021/acs.est.1c04039. "],["references.html", "References", " References Abrahamsson, Dimitri, Christopher L. Brueck, Carsten Prasse, Dimitra A. Lambropoulou, Lelouda-Athanasia Koronaiou, Miaomiao Wang, June-Soo Park, and Tracey J. Woodruff. 2023. “Extracting Structural Information from Physicochemical Property Measurements Using Machine Learning-A New Approach for Structure Elucidation in Non-targeted Analysis.” Environmental Science &amp; Technology, September. https://doi.org/10.1021/acs.est.3c03003. Adams, Kendra J., Brian Pratt, Neelanjan Bose, Laura G. Dubois, Lisa St John-Williams, Kevin M. Perrott, Karina Ky, et al. 2020. “Skyline for Small Molecules: A Unifying Software Package for Quantitative Metabolomics.” Journal of Proteome Research 19 (4): 1447–58. https://doi.org/10.1021/acs.jproteome.9b00640. Aguilar-Mogas, Antoni, Marta Sales-Pardo, Miriam Navarro, Roger Guimerà, and Oscar Yanes. 2017. “iMet: A Network-Based Computational Tool To Assist in the Annotation of Metabolites from Tandem Mass Spectra.” Analytical Chemistry 89 (6): 3474–82. https://doi.org/10.1021/acs.analchem.6b04512. Alden, Nicholas, Smitha Krishnan, Vladimir Porokhin, Ravali Raju, Kyle McElearney, Alan Gilbert, and Kyongbum Lee. 2017. “Biologically Consistent Annotation of Metabolomics Data.” Analytical Chemistry, November. https://doi.org/10.1021/acs.analchem.7b02162. Ali, Ahmed, Yasmine Abouleila, Yoshihiro Shimizu, Eiso Hiyama, Samy Emara, Alireza Mashaghi, and Thomas Hankemeier. 2019. “Single-Cell Metabolomics by Mass Spectrometry: Advances, Challenges, and Future Applications.” TrAC Trends in Analytical Chemistry 120 (November): 115436. https://doi.org/10.1016/j.trac.2019.02.033. 
Alka, Oliver, Timo Sachsenberg, Leon Bichmann, Julianus Pfeuffer, Hendrik Weisser, Samuel Wein, Eugen Netz, Marc Rurik, Oliver Kohlbacher, and Hannes Röst. 2020. “CHAPTER 6:OpenMS and KNIME for Mass Spectrometry Data Processing.” In Processing Metabolomics and Proteomics Data with Open Software, 201–31. https://doi.org/10.1039/9781788019880-00201. Alka, Oliver, Premy Shanthamoorthy, Michael Witting, Karin Kleigrewe, Oliver Kohlbacher, and Hannes L. Röst. 2022. “DIAMetAlyzer Allows Automated False-Discovery Rate-Controlled Analysis for Data-Independent Acquisition in Metabolomics.” Nature Communications 13 (1): 1347. https://doi.org/10.1038/s41467-022-29006-z. Allam-Ndoul, Bénédicte, Frédéric Guénard, Véronique Garneau, Hubert Cormier, Olivier Barbier, Louis Pérusse, and Marie-Claude Vohl. 2016. “Association Between Metabolite Profiles, Metabolic Syndrome and Obesity Status.” Nutrients 8 (6): 324. https://doi.org/10.3390/nu8060324. Allard, Pierre-Marie, Grégory Genta-Jouve, and Jean-Luc Wolfender. 2017. “Deep Metabolome Annotation in Natural Products Research: Towards a Virtuous Cycle in Metabolite Identification.” Current Opinion in Chemical Biology, Omics, 36 (February): 40–49. https://doi.org/10.1016/j.cbpa.2016.12.022. Allen, Felicity, Allison Pon, Michael Wilson, Russ Greiner, and David Wishart. 2014. “CFM-ID: A Web Server for Annotation, Spectrum Prediction and Metabolite Identification from Tandem Mass Spectra.” Nucleic Acids Research 42 (W1): W94–99. https://doi.org/10.1093/nar/gku436. Alonso, Arnald, Sara Marsal, and Antonio Julià. 2015. “Analytical Methods in Untargeted Metabolomics: State of the Art in 2015.” Frontiers in Bioengineering and Biotechnology 3 (March). https://doi.org/10.3389/fbioe.2015.00023. Anderson, Brady G., Alexander Raskind, Hani Habra, Robert T. Kennedy, and Charles R. Evans. 2021. “Modifying Chromatography Conditions for Improved Unknown Feature Identification in Untargeted Metabolomics.” Analytical Chemistry 93 (48): 15840–49. 
https://doi.org/10.1021/acs.analchem.1c02149. Andersson, Martin. 2009. “A Comparison of Nine PLS1 Algorithms.” Journal of Chemometrics 23 (10): 518–29. https://doi.org/10.1002/cem.1248. Aron, Allegra T., Emily C. Gentry, Kerry L. McPhail, Louis-Félix Nothias, Mélissa Nothias-Esposito, Amina Bouslimani, Daniel Petras, et al. 2020. “Reproducible Molecular Networking of Untargeted Mass Spectrometry Data Using GNPS.” Nature Protocols 15 (6): 1954–91. https://doi.org/10.1038/s41596-020-0317-5. Bach, Eric, Emma L. Schymanski, and Juho Rousu. 2022. “Joint Structural Annotation of Small Molecules Using Liquid Chromatography Retention Order and Tandem Mass Spectrometry Data.” Nature Machine Intelligence 4 (12): 1224–37. https://doi.org/10.1038/s42256-022-00577-2. Bai, Caihong, Suyun Xu, Jingyi Tang, Yuxi Zhang, Jiahui Yang, and Kaifeng Hu. 2022. “A ‘Shape-Orientated’ Algorithm Employing an Adapted Marr Wavelet and Shape Matching Index Improves the Performance of Continuous Wavelet Transform for Chromatographic Peak Detection and Quantification.” Journal of Chromatography A 1673 (June): 463086. https://doi.org/10.1016/j.chroma.2022.463086. Baker, Monya. 2011. “Metabolomics: From Small Molecules to Big Ideas.” Nature Methods 8 (2): 117–21. https://doi.org/10.1038/nmeth0211-117. Baran, Richard, and Trent R. Northen. 2013. “Robust Automated Mass Spectra Interpretation and Chemical Formula Calculation Using Mixed Integer Linear Programming.” Analytical Chemistry 85 (20): 9777–84. https://doi.org/10.1021/ac402180c. Barbier Saint Hilaire, Pierre, Ulli M. Hohenester, Benoit Colsch, Jean-Claude Tabet, Christophe Junot, and François Fenaille. 2018. “Evaluation of the High-Field Orbitrap Fusion for Compound Annotation in Metabolomics.” Analytical Chemistry 90 (5): 3030–35. https://doi.org/10.1021/acs.analchem.7b05372. Barnes, Stephen, H. Paul Benton, Krista Casazza, Sara J. Cooper, Xiangqin Cui, Xiuxia Du, Jeffrey Engler, et al. 2016a. “Training in Metabolomics Research. I. 
Designing the Experiment, Collecting and Extracting Samples and Generating Metabolomics Data.” Journal of Mass Spectrometry 51 (7): 461–75. https://doi.org/10.1002/jms.3782. ———, et al. 2016b. “Training in Metabolomics Research. II. Processing and Statistical Analysis of Metabolomics Data, Metabolite Identification, Pathway Analysis, Applications of Metabolomics and Its Future.” Journal of Mass Spectrometry 51 (8): 535–48. https://doi.org/10.1002/jms.3780. Barranco-Altirriba, Maria, Pol Solà-Santos, Sergio Picart-Armada, Samir Kanaan-Izquierdo, Jordi Fonollosa, and Alexandre Perera-Lluna. 2021. “mWISE: An Algorithm for Context-Based Annotation of Liquid Chromatography–Mass Spectrometry Features Through Diffusion in Graphs.” Analytical Chemistry 93 (31): 10772–78. https://doi.org/10.1021/acs.analchem.1c00238. Basu, Sumanta, William Duren, Charles R. Evans, Charles F. Burant, George Michailidis, and Alla Karnovsky. 2017. “Sparse Network Modeling and Metscape-Based Visualization Methods for the Analysis of Large-Scale Metabolomics Data.” Bioinformatics 33 (10): 1545–53. https://doi.org/10.1093/bioinformatics/btx012. Baygi, Sadjad Fakouri, Sanjay K. Banerjee, Praloy Chakraborty, Yashwant Kumar, and Dinesh Kumar Barupal. 2022. “IDSL.UFA Assigns High-Confidence Molecular Formula Annotations for Untargeted LC/HRMS Data Sets in Metabolomics and Exposomics.” Analytical Chemistry 94 (39): 13315–22. https://doi.org/10.1021/acs.analchem.2c00563. Baygi, Sadjad Fakouri, Yashwant Kumar, and Dinesh Kumar Barupal. 2023. “IDSL.CSA: Composite Spectra Analysis for Chemical Annotation of Untargeted Metabolomics Datasets.” Analytical Chemistry, June. https://doi.org/10.1021/acs.analchem.3c00376. Beale, David J., Farhana R. Pinu, Konstantinos A. Kouremenos, Mahesha M. Poojary, Vinod K. Narayana, Berin A. Boughton, Komal Kanojia, Saravanan Dayalan, Oliver A. H. Jones, and Daniel A. Dias. 2018. 
“Review of Recent Developments in GC–MS Approaches to Metabolomics-Based Research.” Metabolomics 14 (11): 152. https://doi.org/10.1007/s11306-018-1449-2. Begou, O., H. G. Gika, I. D. Wilson, and G. Theodoridis. 2017. “Hyphenated MS-based Targeted Approaches in Metabolomics.” Analyst 142 (17): 3079–3100. https://doi.org/10.1039/C7AN00812K. Benjamini, Yoav, and Yosef Hochberg. 1995. “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing.” Journal of the Royal Statistical Society. Series B (Methodological) 57 (1): 289–300. https://www.jstor.org/stable/2346101. Bennett, Bryson D., Elizabeth H. Kimball, Melissa Gao, Robin Osterhout, Stephen J. Van Dien, and Joshua D. Rabinowitz. 2009. “Absolute Metabolite Concentrations and Implied Enzyme Active Site Occupancy in Escherichia Coli.” Nature Chemical Biology 5 (8): 593–99. https://doi.org/10.1038/nchembio.186. Bernardo-Bermejo, Samuel, Jingchuan Xue, Linh Hoang, Elizabeth Billings, Bill Webb, M. Willy Honders, Sanne Venneker, et al. 2023. “Quantitative Multiple Fragment Monitoring with Enhanced in-Source Fragmentation/Annotation Mass Spectrometry.” Nature Protocols, February, 1–20. https://doi.org/10.1038/s41596-023-00803-0. Bertsch, Andreas, Clemens Gröpl, Knut Reinert, and Oliver Kohlbacher. 2011. “OpenMS and TOPP: Open Source Software for LC-MS Data Analysis.” In Data Mining in Proteomics: From Standards to Applications, edited by Michael Hamacher, Martin Eisenacher, and Christian Stephan, 353–67. Methods in Molecular Biology. Totowa, NJ: Humana Press. https://doi.org/10.1007/978-1-60761-987-1_23. Bijttebier, Sebastiaan, Anastasia Van der Auwera, Kenn Foubert, Stefan Voorspoels, Luc Pieters, and Sandra Apers. 2016. “Bridging the Gap Between Comprehensive Extraction Protocols in Plant Metabolomics Studies and Method Validation.” Analytica Chimica Acta 935 (September): 136–50. https://doi.org/10.1016/j.aca.2016.06.047. Bilbao, Aivett, Nathalie Munoz, Joonhoon Kim, Daniel J. 
Orton, Yuqian Gao, Kunal Poorey, Kyle R. Pomraning, et al. 2023. “PeakDecoder Enables Machine Learning-Based Metabolite Annotation and Accurate Profiling in Multidimensional Mass Spectrometry Measurements.” Nature Communications 14 (1): 2461. https://doi.org/10.1038/s41467-023-37031-9. Bilbao, Aivett, Emmanuel Varesio, Jeremy Luban, Caterina Strambio-De-Castillia, Gérard Hopfgartner, Markus Müller, and Frédérique Lisacek. 2015. “Processing Strategies and Software Solutions for Data-Independent Acquisition in Mass Spectrometry.” PROTEOMICS 15 (5-6): 964–80. https://doi.org/10.1002/pmic.201400323. Bittremieux, Wout, Nicole E. Avalon, Sydney P. Thomas, Sarvar A. Kakhkhorov, Alexander A. Aksenov, Paulo Wender P. Gomes, Christine M. Aceves, et al. 2023. “Open Access Repository-Scale Propagated Nearest Neighbor Suspect Spectral Library for Untargeted Metabolomics.” Nature Communications 14 (1): 8488. https://doi.org/10.1038/s41467-023-44035-y. Bittremieux, Wout, Robin Schmid, Florian Huber, Justin J. J. van der Hooft, Mingxun Wang, and Pieter C. Dorrestein. 2022. “Comparison of Cosine, Modified Cosine, and Neutral Loss Based Spectrum Alignment For Discovery of Structurally Related Molecules.” Journal of the American Society for Mass Spectrometry 33 (9): 1733–44. https://doi.org/10.1021/jasms.2c00153. Blaise, Benjamin J. 2013. “Data-Driven Sample Size Determination for Metabolic Phenotyping Studies.” Analytical Chemistry 85 (19): 8943–50. https://doi.org/10.1021/ac4022314. Blaise, Benjamin J., Gonçalo D. S. Correia, Gordon A. Haggart, Izabella Surowiec, Caroline Sands, Matthew R. Lewis, Jake T. M. Pearce, et al. 2021. “Statistical Analysis in Metabolic Phenotyping.” Nature Protocols, July, 1–28. https://doi.org/10.1038/s41596-021-00579-1. Blaise, Benjamin J., Gonçalo Correia, Adrienne Tin, J. Hunter Young, Anne-Claire Vergnaud, Matthew Lewis, Jake T. M. Pearce, et al. 2016. 
“Power Analysis and Sample Size Determination in Metabolic Phenotyping.” Analytical Chemistry 88 (10): 5179–88. https://doi.org/10.1021/acs.analchem.6b00188. Blaženović, Ivana, Tobias Kind, Hrvoje Torbašinović, Slobodan Obrenović, Sajjan S. Mehta, Hiroshi Tsugawa, Tobias Wermuth, et al. 2017. “Comprehensive Comparison of in Silico MS/MS Fragmentation Tools of the CASMI Contest: Database Boosting Is Needed to Achieve 93% Accuracy.” Journal of Cheminformatics 9 (1): 32. https://doi.org/10.1186/s13321-017-0219-x. Bonini, Paolo, Tobias Kind, Hiroshi Tsugawa, Dinesh Kumar Barupal, and Oliver Fiehn. 2020. “Retip: Retention Time Prediction for Compound Annotation in Untargeted Metabolomics.” Analytical Chemistry 92 (11): 7515–22. https://doi.org/10.1021/acs.analchem.9b05765. Bonnefille, Bénilde, Oskar Karlsson, May Britt Rian, Rubhana Raqib, Faruque Parvez, Stefano Papazian, M. Sirajul Islam, and Jonathan W. Martin. 2023. “Nontarget Analysis of Polluted Surface Waters in Bangladesh Using Open Science Workflows.” Environmental Science &amp; Technology, April. https://doi.org/10.1021/acs.est.2c08200. Bonner, Ron, and Gérard Hopfgartner. 2018. “SWATH Data Independent Acquisition Mass Spectrometry for Metabolomics.” TrAC Trends in Analytical Chemistry, October. https://doi.org/10.1016/j.trac.2018.10.014. Box, George E. P., J. Stuart Hunter, and William G. Hunter. 2005. Statistics for Experimenters. Wiley-Interscience. Brereton, Richard G., and Gavin R. Lloyd. 2018. “Partial Least Squares Discriminant Analysis for Chemometrics and Metabolomics: How Scores, Loadings, and Weights Differ According to Two Common Algorithms.” Journal of Chemometrics 32 (4): e3028. https://doi.org/10.1002/cem.3028. Broadhurst, David, Royston Goodacre, Stacey N. Reinke, Julia Kuligowski, Ian D. Wilson, Matthew R. Lewis, and Warwick B. Dunn. 2018. 
“Guidelines and Considerations for the Use of System Suitability and Quality Control Samples in Mass Spectrometry Assays Applied in Untargeted Clinical Metabolomic Studies.” Metabolomics 14 (6). https://doi.org/10.1007/s11306-018-1367-3. Broeckling, C. D., F. A. Afsar, S. Neumann, A. Ben-Hur, and J. E. Prenni. 2014. “RAMClust: A Novel Feature Clustering Method Enables Spectral-Matching-Based Annotation for Metabolomics Data.” Analytical Chemistry 86 (14): 6812–17. https://doi.org/10.1021/ac501530d. Broeckling, Corey D., Richard D. Beger, Leo L. Cheng, Raquel Cumeras, Daniel J. Cuthbertson, Surendra Dasari, W. Clay Davis, et al. 2023. “Current Practices in LC-MS Untargeted Metabolomics: A Scoping Review on the Use of Pooled Quality Control Samples.” Analytical Chemistry 95 (51): 18645–54. https://doi.org/10.1021/acs.analchem.3c02924. Broeckling, Corey D., Andrea Ganna, Mark Layer, Kevin Brown, Ben Sutton, Erik Ingelsson, Graham Peers, and Jessica E. Prenni. 2016. “Enabling Efficient and Confident Annotation of LC-MS Metabolomics Data Through MS1 Spectrum and Time Prediction.” Analytical Chemistry 88 (18): 9226–34. https://doi.org/10.1021/acs.analchem.6b02479. Bundy, Jacob G., Matthew P. Davey, and Mark R. Viant. 2009. “Environmental Metabolomics: A Critical Review and Future Perspectives.” Metabolomics 5 (1): 3. https://doi.org/10.1007/s11306-008-0152-0. Cai, Jingwei, and Zhengyin Yan. 2021. “Re-Examining the Impact of Minimal Scans in Liquid Chromatography–Mass Spectrometry Analysis.” Journal of the American Society for Mass Spectrometry, June. https://doi.org/10.1021/jasms.1c00073. Cai, Qingpo, Jessica A. Alvarez, Jian Kang, and Tianwei Yu. 2017. “Network Marker Selection for Untargeted LC–MS Metabolomics Data.” Journal of Proteome Research 16 (3): 1261–69. https://doi.org/10.1021/acs.jproteome.6b00861. Cajka, Tomas, and Oliver Fiehn. 2016. 
“Toward Merging Untargeted and Targeted Methods in Mass Spectrometry-Based Metabolomics and Lipidomics.” Analytical Chemistry 88 (1): 524–45. https://doi.org/10.1021/acs.analchem.5b04491. Calbiani, F., M. Careri, L. Elviri, A. Mangia, and I. Zagnoni. 2006. “Matrix Effects on Accurate Mass Measurements of Low-Molecular Weight Compounds Using Liquid Chromatography-Electrospray-Quadrupole Time-of-Flight Mass Spectrometry.” Journal of Mass Spectrometry 41 (3): 289–94. https://doi.org/10.1002/jms.984. Carroll, Adam J., Murray R. Badger, and A. Harvey Millar. 2010. “The MetabolomeExpress Project: Enabling Web-Based Processing, Analysis and Transparent Dissemination of GC/MS Metabolomics Datasets.” BMC Bioinformatics 11 (1): 376. https://doi.org/10.1186/1471-2105-11-376. Castro-Puyana, María, Raquel Pérez-Míguez, Lidia Montero, and Miguel Herrero. 2017. “Application of Mass Spectrometry-Based Metabolomics Approaches for Food Safety, Quality and Traceability.” TrAC Trends in Analytical Chemistry 93 (August): 102–18. https://doi.org/10.1016/j.trac.2017.05.004. Chaker, Jade, David Møbjerg Kristensen, Thorhallur Ingi Halldorsson, Sjurdur Frodi Olsen, Christine Monfort, Cécile Chevrier, Bernard Jégou, and Arthur David. 2022. “Comprehensive Evaluation of Blood Plasma and Serum Sample Preparations for HRMS-Based Chemical Exposomics: Overlaps and Specificities.” Analytical Chemistry 94 (2): 866–74. https://doi.org/10.1021/acs.analchem.1c03638. Chaleckis, Romanas, Isabel Meister, Pei Zhang, and Craig E Wheelock. 2019. “Challenges, Progress and Promises of Metabolite Annotation for LC–MS-based Metabolomics.” Current Opinion in Biotechnology, Analytical Biotechnology, 55 (February): 44–50. https://doi.org/10.1016/j.copbio.2018.07.010. Chambers, Matthew C., Brendan Maclean, Robert Burke, Dario Amodei, Daniel L. Ruderman, Steffen Neumann, Laurent Gatto, et al. 2012. “A Cross-Platform Toolkit for Mass Spectrometry and Proteomics.” Nature Biotechnology 30 (October): 918–20. 
https://doi.org/10.1038/nbt.2377. Chang, Hui-Yin, Ching-Tai Chen, T. Mamie Lih, Ke-Shiuan Lynn, Chiun-Gung Juo, Wen-Lian Hsu, and Ting-Yi Sung. 2016. “iMet-Q: A User-Friendly Tool for Label-Free Metabolomics Quantitation Using Dynamic Peak-Width Determination.” PLOS ONE 11 (1): e0146112. https://doi.org/10.1371/journal.pone.0146112. Chang, Hui-Yin, Sean M. Colby, Xiuxia Du, Javier D. Gomez, Maximilian J. Helf, Katerina Kechris, Christine R. Kirkpatrick, et al. 2021. “A Practical Guide to Metabolomics Software Development.” Analytical Chemistry 93 (4): 1912–23. https://doi.org/10.1021/acs.analchem.0c03581. Charbonnet, Joseph A., Carrie A. McDonough, Feng Xiao, Trever Schwichtenberg, Dunping Cao, Sarit Kaserzon, Kevin V. Thomas, et al. 2022. “Communicating Confidence of Per- and Polyfluoroalkyl Substance Identification via High-Resolution Mass Spectrometry.” Environmental Science &amp; Technology Letters, May. https://doi.org/10.1021/acs.estlett.2c00206. Chen, Gengbo, Scott Walmsley, Gemmy C. M. Cheung, Liyan Chen, Ching-Yu Cheng, Roger W. Beuerman, Tien Yin Wong, Lei Zhou, and Hyungwon Choi. 2017. “Customized Consensus Spectral Library Building for Untargeted Quantitative Metabolomics Analysis with Data Independent Acquisition Mass Spectrometry and MetaboDIA Workflow.” Analytical Chemistry 89 (9): 4897–4906. https://doi.org/10.1021/acs.analchem.6b05006. Chen, Li, Wenyun Lu, Lin Wang, Xi Xing, Ziyang Chen, Xin Teng, Xianfeng Zeng, et al. 2021. “Metabolite Discovery Through Global Annotation of Untargeted Metabolomics Data.” Nature Methods 18 (11): 1377–85. https://doi.org/10.1038/s41592-021-01303-3. Chen, Yanhua, Zhi Zhou, Wei Yang, Nan Bi, Jing Xu, Jiuming He, Ruiping Zhang, Lvhua Wang, and Zeper Abliz. 2017. “Development of a Data-Independent Targeted Metabolomics Method for Relative Quantification Using Liquid Chromatography Coupled with Tandem Mass Spectrometry.” Analytical Chemistry 89 (13): 6954–62. https://doi.org/10.1021/acs.analchem.6b04727. 
Cheng, Susan, Svati H. Shah, Elizabeth J. Corwin, Oliver Fiehn, Robert L. Fitzgerald, Robert E. Gerszten, Thomas Illig, et al. 2017. “Potential Impact and Study Considerations of Metabolomics in Cardiovascular Health and Disease: A Scientific Statement From the American Heart Association.” Circulation: Cardiovascular Genetics 10 (2): e000032. https://doi.org/10.1161/HCG.0000000000000032. Choi, Meena, Ching-Yun Chang, Timothy Clough, Daniel Broudy, Trevor Killeen, Brendan MacLean, and Olga Vitek. 2014. “MSstats: An R Package for Statistical Analysis of Quantitative Mass Spectrometry-Based Proteomic Experiments.” Bioinformatics 30 (17): 2524–26. https://doi.org/10.1093/bioinformatics/btu305. Chokkathukalam, Achuthanunni, Andris Jankevics, Darren J. Creek, Fiona Achcar, Michael P. Barrett, and Rainer Breitling. 2013. “mzMatch–ISO: An R Tool for the Annotation and Relative Quantification of Isotope-Labelled Mass Spectrometry Data.” Bioinformatics 29 (2): 281–83. https://doi.org/10.1093/bioinformatics/bts674. Chong, Jasmine, David S. Wishart, and Jianguo Xia. 2019. “Using MetaboAnalyst 4.0 for Comprehensive and Integrative Metabolomics Data Analysis.” Current Protocols in Bioinformatics 68 (1): e86. https://doi.org/10.1002/cpbi.86. Clasquin, Michelle F., Eugene Melamud, and Joshua D. Rabinowitz. 2012. “LC-MS Data Processing with MAVEN: A Metabolomic Analysis and Visualization Engine.” Current Protocols in Bioinformatics 37 (1): 14.11.1–23. https://doi.org/10.1002/0471250953.bi1411s37. Climaco Pinto, Rui, Ibrahim Karaman, Matthew R. Lewis, Jenny Hällqvist, Manuja Kaluarachchi, Gonçalo Graça, Elena Chekmeneva, et al. 2022. “Finding Correspondence Between Metabolomic Features in Untargeted Liquid Chromatography–Mass Spectrometry Metabolomics Datasets.” Analytical Chemistry 94 (14): 5493–503. https://doi.org/10.1021/acs.analchem.1c03592. Codrean, S., B. Kruit, N. Meekel, D. Vughs, and F. Béen. 2023. 
“Predicting the Diagnostic Information of Tandem Mass Spectra of Environmentally Relevant Compounds Using Machine Learning.” Analytical Chemistry, October. https://doi.org/10.1021/acs.analchem.3c03470. Colby, Sean M., Christine H. Chang, Jessica L. Bade, Jamie R. Nunez, Madison R. Blumer, Daniel J. Orton, Kent J. Bloodsworth, et al. 2022. “DEIMoS: An Open-Source Tool for Processing High-Dimensional Mass Spectrometry Data.” Analytical Chemistry 94 (16): 6130–38. https://doi.org/10.1021/acs.analchem.1c05017. Considine, E. C., G. Thomas, A. L. Boulesteix, A. S. Khashan, and L. C. Kenny. 2017. “Critical Review of Reporting of the Data Analysis Step in Metabolomics.” Metabolomics 14 (1): 7. https://doi.org/10.1007/s11306-017-1299-3. Creek, Darren J., Andris Jankevics, Karl E. V. Burgess, Rainer Breitling, and Michael P. Barrett. 2012. “IDEOM: An Excel Interface for Analysis of LC–MS-based Metabolomics Data.” Bioinformatics 28 (7): 1048–49. https://doi.org/10.1093/bioinformatics/bts069. Dagan, Shai, Dana Marder, Nitzan Tzanani, Eyal Drug, Hagit Prihed, and Lilach Yishai-Aviram. 2023. “Evaluation of Matrix Complexity in Nontargeted Analysis of Small-Molecule Toxicants by Liquid Chromatography–High-Resolution Mass Spectrometry.” Analytical Chemistry 95 (20): 7924–32. https://doi.org/10.1021/acs.analchem.3c00413. Daly, Rónán, Simon Rogers, Joe Wandy, Andris Jankevics, Karl E. V. Burgess, and Rainer Breitling. 2014. “MetAssign: Probabilistic Annotation of Metabolites from LC–MS Data Using a Bayesian Clustering Approach.” Bioinformatics 30 (19): 2764–71. https://doi.org/10.1093/bioinformatics/btu370. de Jonge, Niek F., Joris J. R. Louwen, Elena Chekmeneva, Stephane Camuzeaux, Femke J. Vermeir, Robert S. Jansen, Florian Huber, and Justin J. J. van der Hooft. 2023. “MS2Query: Reliable and Scalable MS2 Mass Spectra-Based Analogue Search.” Nature Communications 14 (1): 1752. https://doi.org/10.1038/s41467-023-37446-4. De Livera, Alysha M., Daniel A. 
Dias, David De Souza, Thusitha Rupasinghe, James Pyke, Dedreia Tull, Ute Roessner, Malcolm McConville, and Terence P. Speed. 2012. “Normalizing and Integrating Metabolomics Data.” Analytical Chemistry 84 (24): 10768–76. https://doi.org/10.1021/ac302748b. Deda, Olga, Anastasia Chrysovalantou Chatziioannou, Stella Fasoula, Dimitris Palachanis, Nicolaos Raikos, Georgios A. Theodoridis, and Helen G. Gika. 2017. “Sample Preparation Optimization in Fecal Metabolic Profiling.” Journal of Chromatography B, Advances in mass spectrometry-based applications, 1047 (March): 115–23. https://doi.org/10.1016/j.jchromb.2016.06.047. DeFelice, Brian C., Sajjan Singh Mehta, Stephanie Samra, Tomáš Čajka, Benjamin Wancewicz, Johannes F. Fahrmann, and Oliver Fiehn. 2017. “Mass Spectral Feature List Optimizer (MS-FLO): A Tool To Minimize False Positive Peak Reports in Untargeted Liquid Chromatography–Mass Spectroscopy (LC-MS) Data Processing.” Analytical Chemistry 89 (6): 3250–55. https://doi.org/10.1021/acs.analchem.6b04372. Delabriere, Alexis, Philipp Warmer, Vincenth Brennsteiner, and Nicola Zamboni. 2021. “SLAW: A Scalable and Self-Optimizing Processing Workflow for Untargeted LC-MS.” Analytical Chemistry 93 (45): 15024–32. https://doi.org/10.1021/acs.analchem.1c02687. Dietrich, Christian, Arne Wick, and Thomas A. Ternes. 2022. “Open-Source Feature Detection for Non-Target LC–MS Analytics.” Rapid Communications in Mass Spectrometry 36 (2): e9206. https://doi.org/10.1002/rcm.9206. Ding, Xian, Fen Yang, Yanhua Chen, Jing Xu, Jiuming He, Ruiping Zhang, and Zeper Abliz. 2022. “Norm ISWSVR: A Data Integration and Normalization Approach for Large-Scale Metabolomics.” Analytical Chemistry 94 (21): 7500–7509. https://doi.org/10.1021/acs.analchem.1c05502. Djoumbou Feunang, Yannick, Roman Eisner, Craig Knox, Leonid Chepelev, Janna Hastings, Gareth Owen, Eoin Fahy, et al. 2016. 
“ClassyFire: Automated Chemical Classification with a Comprehensive, Computable Taxonomy.” Journal of Cheminformatics 8 (1): 61. https://doi.org/10.1186/s13321-016-0174-y. Dodds, James N., Lingjue Wang, Gary J. Patti, and Erin S. Baker. 2022. “Combining Isotopologue Workflows and Simultaneous Multidimensional Separations to Detect, Identify, and Validate Metabolites in Untargeted Analyses.” Analytical Chemistry 94 (5): 2527–35. https://doi.org/10.1021/acs.analchem.1c04430. Domingo-Almenara, Xavier, Jesus Brezmes, Maria Vinaixa, Sara Samino, Noelia Ramirez, Marta Ramon-Krauel, Carles Lerin, et al. 2016. “eRah: A Computational Tool Integrating Spectral Deconvolution and Alignment with Quantification and Identification of Metabolites in GC/MS-Based Metabolomics.” Analytical Chemistry 88 (19): 9821–29. https://doi.org/10.1021/acs.analchem.6b02927. Domingo-Almenara, Xavier, J. Rafael Montenegro-Burke, H. Paul Benton, and Gary Siuzdak. 2018. “Annotation: A Computational Solution for Streamlining Metabolomics Analysis.” Analytical Chemistry 90 (1): 480–89. https://doi.org/10.1021/acs.analchem.7b03929. Domingo-Almenara, Xavier, J. Rafael Montenegro-Burke, Julijana Ivanisevic, Aurelien Thomas, Jonathan Sidibé, Tony Teav, Carlos Guijas, et al. 2018. “XCMS-MRM and METLIN-MRM: A Cloud Library and Public Resource for Targeted Analysis of Small Molecules.” Nature Methods 15 (9): 681–84. https://doi.org/10.1038/s41592-018-0110-3. Domingo-Almenara, Xavier, and Gary Siuzdak. 2020. “Metabolomics Data Processing Using XCMS.” In Computational Methods and Data Analysis for Metabolomics, edited by Shuzhao Li, 11–24. Methods in Molecular Biology. New York, NY: Springer US. https://doi.org/10.1007/978-1-0716-0239-3_2. Doppler, Maria, Bernhard Kluger, Christoph Bueschl, Christina Schneider, Rudolf Krska, Sylvie Delcambre, Karsten Hiller, Marc Lemmens, and Rainer Schuhmacher. 2016. 
“Stable Isotope-Assisted Evaluation of Different Extraction Solvents for Untargeted Metabolomics of Plants.” International Journal of Molecular Sciences 17 (7). https://doi.org/10.3390/ijms17071017. Dos Santos, Emile Kelly Porto, and Gisele André Baptista Canuto. 2023. “Optimizing XCMS Parameters for GC-MS Metabolomics Data Processing: A Case Study.” Metabolomics: Official Journal of the Metabolomic Society 19 (4): 26. https://doi.org/10.1007/s11306-023-01992-1. Dryden, Michael D. M., Ryan Fobel, Christian Fobel, and Aaron R. Wheeler. 2017. “Upon the Shoulders of Giants: Open-Source Hardware and Software in Analytical Chemistry.” Analytical Chemistry 89 (8): 4330–38. https://doi.org/10.1021/acs.analchem.7b00485. Du, Xinsong, Juan J. Aristizabal-Henao, Timothy J. Garrett, Mathias Brochhausen, William R. Hogan, and Dominick J. Lemas. 2022. “A Checklist for Reproducible Computational Analysis in Clinical Metabolomics Research.” Metabolites 12 (1): 87. https://doi.org/10.3390/metabo12010087. Du, Xiuxia, and Steven H Zeisel. 2013. “Spectral Deconvolution for Gas Chromatography Mass Spectrometry-Based Metabolomics: Current Status and Future Perspectives.” Computational and Structural Biotechnology Journal 4 (5): 1–10. https://doi.org/10.5936/csbj.201301013. Dudzik, Danuta, Cecilia Barbas-Bernardos, Antonia García, and Coral Barbas. 2018. “Quality Assurance Procedures for Mass Spectrometry Untargeted Metabolomics. A Review.” Journal of Pharmaceutical and Biomedical Analysis, Review issue 2017, 147 (January): 149–73. https://doi.org/10.1016/j.jpba.2017.07.044. Dührkop, Kai, Markus Fleischauer, Marcus Ludwig, Alexander A. Aksenov, Alexey V. Melnik, Marvin Meusel, Pieter C. Dorrestein, Juho Rousu, and Sebastian Böcker. 2019. “SIRIUS 4: A Rapid Tool for Turning Tandem Mass Spectra into Metabolite Structure Information.” Nature Methods 16 (4): 299–302. https://doi.org/10.1038/s41592-019-0344-8. 
Dührkop, Kai, Louis-Félix Nothias, Markus Fleischauer, Raphael Reher, Marcus Ludwig, Martin A. Hoffmann, Daniel Petras, et al. 2020. “Systematic Classification of Unknown Metabolites Using High-Resolution Fragmentation Mass Spectra.” Nature Biotechnology, November, 1–10. https://doi.org/10.1038/s41587-020-0740-8. Dunn, Warwick B, Ian D Wilson, Andrew W Nicholls, and David Broadhurst. 2012. “The Importance of Experimental Design and QC Samples in Large-Scale and MS-driven Untargeted Metabolomic Studies of Humans.” Bioanalysis 4 (18): 2249–64. https://doi.org/10.4155/bio.12.204. Dyar, Kenneth A., Dominik Lutter, Anna Artati, Nicholas J. Ceglia, Yu Liu, Danny Armenta, Martin Jastroch, et al. 2018. “Atlas of Circadian Metabolism Reveals System-wide Coordination and Communication Between Clocks.” Cell 174 (6): 1571–1585.e11. https://doi.org/10.1016/j.cell.2018.08.042. Edmands, William M. B., Dinesh K. Barupal, and Augustin Scalbert. 2015. “MetMSLine: An Automated and Fully Integrated Pipeline for Rapid Processing of High-Resolution LC–MS Metabolomic Datasets.” Bioinformatics 31 (5): 788–90. https://doi.org/10.1093/bioinformatics/btu705. Edmands, William M. B., Josie Hayes, and Stephen M. Rappaport. 2018. “SimExTargId: A Comprehensive Package for Real-Time LC-MS Data Acquisition and Analysis.” Bioinformatics 34 (20): 3589–90. https://doi.org/10.1093/bioinformatics/bty218. Edmands, William M. B., Lauren Petrick, Dinesh K. Barupal, Augustin Scalbert, Mark J. Wilson, Jeffrey K. Wickliffe, and Stephen M. Rappaport. 2017. “compMS2Miner: An Automatable Metabolite Identification, Visualization, and Data-Sharing R Package for High-Resolution LC–MS Data Sets.” Analytical Chemistry 89 (7): 3919–28. https://doi.org/10.1021/acs.analchem.6b02394. Eilertz, Daniel, Michael Mitterer, and Joerg M. Buescher. 2022. “automRm: An R Package for Fully Automatic LC-QQQ-MS Data Preprocessing Powered by Machine Learning.” Analytical Chemistry 94 (16): 6163–71. 
https://doi.org/10.1021/acs.analchem.1c05224. El Abiead, Yasin, Maximilian Milford, Harald Schoeny, Mate Rusz, Reza M. Salek, and Gunda Koellensperger. 2022. “Power of mzRAPP-Based Performance Assessments in MS1-Based Nontargeted Feature Detection.” Analytical Chemistry 94 (24): 8588–95. https://doi.org/10.1021/acs.analchem.1c05270. Engler Hart, Chloe, Tobias Kind, Pieter C. Dorrestein, David Healey, and Daniel Domingo-Fernández. 2024. “Weighting Low-Intensity MS/MS Ions and m/z Frequency for Spectral Library Annotation.” Journal of the American Society for Mass Spectrometry 35 (2): 266–74. https://doi.org/10.1021/jasms.3c00353. Fenaille, François, Pierre Barbier Saint-Hilaire, Kathleen Rousseau, and Christophe Junot. 2017. “Data Acquisition Workflows in Liquid Chromatography Coupled to High Resolution Mass Spectrometry-Based Metabolomics: Where Do We Stand?” Journal of Chromatography A 1526 (Supplement C): 1–12. https://doi.org/10.1016/j.chroma.2017.10.043. Fernández-Albert, Francesc, Rafael Llorach, Cristina Andrés-Lacueva, and Alexandre Perera. 2014. “An R Package to Analyse LC/MS Metabolomic Data: MAIT (Metabolite Automatic Identification Toolkit).” Bioinformatics 30 (13): 1937–39. https://doi.org/10.1093/bioinformatics/btu136. Fessenden, Marissa. 2016. “Metabolomics: Small Molecules, Single Cells.” Nature 540 (7631): 153–55. https://doi.org/10.1038/540153a. Fiehn, Oliver. 2002. “Metabolomics – the Link Between Genotypes and Phenotypes.” Plant Molecular Biology 48 (1): 155–71. https://doi.org/10.1023/A:1013713905833. Flasch, Mira, Veronika Fitz, Evelyn Rampler, Chibundu N. Ezekiel, Gunda Koellensperger, and Benedikt Warth. 2022. “Integrated Exposomics/Metabolomics for Rapid Exposure and Effect Analyses.” JACS Au 2 (11): 2548–60. https://doi.org/10.1021/jacsau.2c00433. Forsberg, Erica M., Tao Huan, Duane Rinehart, H. Paul Benton, Benedikt Warth, Brian Hilmers, and Gary Siuzdak. 2018. 
“Data Processing, Multi-Omic Pathway Mapping, and Metabolite Activity Analysis Using XCMS Online.” Nature Protocols 13 (4): 633–51. https://doi.org/10.1038/nprot.2017.151. Franceschi, Pietro, Domenico Masuero, Urska Vrhovsek, Fulvio Mattivi, and Ron Wehrens. 2012. “A Benchmark Spike-in Data Set for Biomarker Identification in Metabolomics.” Journal of Chemometrics 26 (1-2): 16–24. https://doi.org/10.1002/cem.1420. Fu, Hai-Yan, Ou Hu, Yue-Ming Zhang, Li Zhang, Jing-Jing Song, Peang Lu, Qing-Xia Zheng, et al. 2017. “Mass-Spectra-Based Peak Alignment for Automatic Nontargeted Metabolic Profiling Analysis for Biomarker Screening in Plant Samples.” Journal of Chromatography A 1513 (Supplement C): 201–9. https://doi.org/10.1016/j.chroma.2017.07.044. Fu, Jianbo, Ying Zhang, Yunxia Wang, Hongning Zhang, Jin Liu, Jing Tang, Qingxia Yang, et al. 2021. “Optimization of Metabolomic Data Processing Using NOREVA.” Nature Protocols, December, 1–23. https://doi.org/10.1038/s41596-021-00636-9. Gadara, Darshak, Katerina Coufalikova, Juraj Bosak, David Smajs, and Zdenek Spacil. 2021. “Systematic Feature Filtering in Exploratory Metabolomics: Application Toward Biomarker Discovery.” Analytical Chemistry 93 (26): 9103–10. https://doi.org/10.1021/acs.analchem.1c00816. Gerlich, Michael, and Steffen Neumann. 2013. “MetFusion: Integration of Compound Identification Strategies.” Journal of Mass Spectrometry 48 (3): 291–98. https://doi.org/10.1002/jms.3123. Ghaste, Manoj, Robert Mistrik, and Vladimir Shulaev. 2016. “Applications of Fourier Transform Ion Cyclotron Resonance (FT-ICR) and Orbitrap Based High Resolution Mass Spectrometry in Metabolomics and Lipidomics.” International Journal of Molecular Sciences 17 (6). https://doi.org/10.3390/ijms17060816. Ghosson, Hikmat, Yann Guitton, Amani Ben Jrad, Chandrashekhar Patil, Delphine Raviglione, Marie-Virginie Salvia, and Cédric Bertrand. 2021. 
“Electrospray Ionization and Heterogeneous Matrix Effects in Liquid Chromatography/Mass Spectrometry Based Meta-Metabolomics: A Biomarker or a Suppressed Ion?” Rapid Communications in Mass Spectrometry 35 (2): e8977. https://doi.org/10.1002/rcm.8977. Giacomoni, Franck, Gildas Le Corguillé, Misharl Monsoor, Marion Landi, Pierre Pericard, Mélanie Pétéra, Christophe Duperier, et al. 2015. “Workflow4Metabolomics: A Collaborative Research Infrastructure for Computational Metabolomics.” Bioinformatics 31 (9): 1493–95. https://doi.org/10.1093/bioinformatics/btu813. Giebelhaus, Ryland T., Michael D. Sorochan Armstrong, A. Paulina de la Mata, and James J. Harynuk. 2022. “Untargeted Region of Interest Selection for Gas Chromatography – Mass Spectrometry Data Using a Pseudo F-ratio Moving Window.” Journal of Chromatography A 1682 (October): 463499. https://doi.org/10.1016/j.chroma.2022.463499. Gika, Helen G., Georgios A. Theodoridis, Robert S. Plumb, and Ian D. Wilson. 2014. “Current Practice of Liquid Chromatography–Mass Spectrometry in Metabolomics and Metabonomics.” Journal of Pharmaceutical and Biomedical Analysis, Review Papers on Pharmaceutical and Biomedical Analysis 2013, 87 (January): 12–25. https://doi.org/10.1016/j.jpba.2013.06.032. Gil, Andres, David Siegel, Hjalmar Permentier, Dirk-Jan Reijngoud, Frank Dekker, and Rainer Bischoff. 2015. “Stability of Energy Metabolites—An Often Overlooked Issue in Metabolomics Studies: A Review.” ELECTROPHORESIS 36 (18): 2156–69. https://doi.org/10.1002/elps.201500031. Giné, Roger, Jordi Capellades, Josep M. Badia, Dennis Vughs, Michaela Schwaiger-Haber, Theodore Alexandrov, Maria Vinaixa, Andrea M. Brunner, Gary J. Patti, and Oscar Yanes. 2021. “HERMES: A Molecular-Formula-Oriented Method to Target the Metabolome.” Nature Methods 18 (11): 1370–76. https://doi.org/10.1038/s41592-021-01307-z. Gloaguen, Yoann, Jennifer A. Kirwan, and Dieter Beule. 2022. 
“Deep Learning-Assisted Peak Curation for Large-Scale LC-MS Metabolomics.” Analytical Chemistry 94 (12): 4930–37. https://doi.org/10.1021/acs.analchem.1c02220. Goldansaz, Seyed Ali, An Chi Guo, Tanvir Sajed, Michael A. Steele, Graham S. Plastow, and David S. Wishart. 2017. “Livestock Metabolomics and the Livestock Metabolome: A Systematic Review.” PLOS ONE 12 (5): e0177675. https://doi.org/10.1371/journal.pone.0177675. González, Oskar, Anne-Charlotte Dubbelman, and Thomas Hankemeier. 2022. “Postcolumn Infusion as a Quality Control Tool for LC-MS-Based Analysis.” Journal of the American Society for Mass Spectrometry, April. https://doi.org/10.1021/jasms.2c00022. González-Domínguez, Álvaro, Núria Estanyol-Torres, Carl Brunius, Rikard Landberg, and Raúl González-Domínguez. 2024. “QComics: Recommendations and Guidelines for Robust, Easily Implementable and Reportable Quality Control of Metabolomics Data.” Analytical Chemistry 96 (3): 1064–72. https://doi.org/10.1021/acs.analchem.3c03660. González-Riano, Carolina, Danuta Dudzik, Antonia Garcia, Alberto Gil-de-la-Fuente, Ana Gradillas, Joanna Godzien, Ángeles López-Gonzálvez, et al. 2020. “Recent Developments Along the Analytical Process for Metabolomics Workflows.” Analytical Chemistry 92 (1): 203–26. https://doi.org/10.1021/acs.analchem.9b04553. Goracci, Laura, Paolo Tiberi, Stefano Di Bona, Stefano Bonciarelli, Giovanna Ilaria Passeri, Marta Piroddi, Simone Moretti, Claudia Volpi, Ismael Zamora, and Gabriele Cruciani. 2024. “MARS: A Multipurpose Software for Untargeted LC–MS-Based Metabolomics and Exposomics.” Analytical Chemistry, January. https://doi.org/10.1021/acs.analchem.3c03620. Graça, Gonçalo, Yuheng Cai, Chung-Ho E. Lau, Panagiotis A. Vorkas, Matthew R. Lewis, Elizabeth J. Want, David Herrington, and Timothy M. D. Ebbels. 2022. “Automated Annotation of Untargeted All-Ion Fragmentation LC–MS Metabolomics Data with MetaboAnnotatoR.” Analytical Chemistry 94 (8): 3446–55. 
https://doi.org/10.1021/acs.analchem.1c03032. Griffiths, William J., Therese Koal, Yuqin Wang, Matthias Kohl, David P. Enot, and Hans-Peter Deigner. 2010. “Targeted Metabolomics for Biomarker Discovery.” Angewandte Chemie International Edition 49 (32): 5426–45. https://doi.org/10.1002/anie.200905579. Gromski, Piotr S., Howbeer Muhamadali, David I. Ellis, Yun Xu, Elon Correa, Michael L. Turner, and Royston Goodacre. 2015. “A Tutorial Review: Metabolomics and Partial Least Squares-Discriminant Analysis – a Marriage of Convenience or a Shotgun Wedding.” Analytica Chimica Acta 879 (June): 10–23. https://doi.org/10.1016/j.aca.2015.02.012. Groves, Ryan A., Carly C. Y. Chan, Spencer D. Wildman, Daniel B. Gregson, Thomas Rydzak, and Ian A. Lewis. 2023. “Rapid LC–MS Assay for Targeted Metabolite Quantification by Serial Injection into Isocratic Gradients.” Analytical and Bioanalytical Chemistry 415 (2): 269–76. https://doi.org/10.1007/s00216-022-04384-x. Gugisch, Ralf, Adalbert Kerber, Axel Kohnert, Reinhard Laue, Markus Meringer, Christoph Rücker, and Alfred Wassermann. 2015. “Chapter 6 - MOLGEN 5.0, A Molecular Structure Generator.” In Advances in Mathematical Chemistry and Applications, edited by Subhash C. Basak, Guillermo Restrepo, and José L. Villaveces, 113–38. Bentham Science Publishers. https://doi.org/10.1016/B978-1-68108-198-4.50006-0. Guha, Rajarshi. 2007. “Chemical Informatics Functionality in R.” Journal of Statistical Software 18 (1): 1–16. https://doi.org/10.18637/jss.v018.i05. Guijas, Carlos, J. Rafael Montenegro-Burke, Xavier Domingo-Almenara, Amelia Palermo, Benedikt Warth, Gerrit Hermann, Gunda Koellensperger, et al. 2018. “METLIN: A Technology Platform for Identifying Knowns and Unknowns.” Analytical Chemistry 90 (5): 3156–64. https://doi.org/10.1021/acs.analchem.7b04424. Guo, Hao, Kebing Xue, Haiming Sun, Weihao Jiang, and Shiliang Pu. 2023. “Contrastive Learning-Based Embedder for the Representation of Tandem Mass Spectra.” Analytical Chemistry, May. 
https://doi.org/10.1021/acs.analchem.3c00260. Guo, Jian, and Tao Huan. 2020. “Comparison of Full-Scan, Data-Dependent, and Data-Independent Acquisition Modes in Liquid Chromatography–Mass Spectrometry Based Untargeted Metabolomics.” Analytical Chemistry 92 (12): 8072–80. https://doi.org/10.1021/acs.analchem.9b05135. Guo, Jian, Sam Shen, and Tao Huan. 2022. “Paramounter: Direct Measurement of Universal Parameters To Process Metabolomics Data in a ‘White Box’.” Analytical Chemistry, March. https://doi.org/10.1021/acs.analchem.1c04758. Guo, Jian, Sam Shen, Shipei Xing, Huaxu Yu, and Tao Huan. 2021. “ISFrag: De Novo Recognition of In-Source Fragments for Liquid Chromatography–Mass Spectrometry Data.” Analytical Chemistry, July. https://doi.org/10.1021/acs.analchem.1c01644. Habra, Hani, Maureen Kachman, Kevin Bullock, Clary Clish, Charles R. Evans, and Alla Karnovsky. 2021. “metabCombiner: Paired Untargeted LC-HRMS Metabolomics Feature Matching and Concatenation of Disparately Acquired Data Sets.” Analytical Chemistry 93 (12): 5028–36. https://doi.org/10.1021/acs.analchem.0c03693. Hansen, Rebecca L., and Young Jin Lee. 2018. “High-Spatial Resolution Mass Spectrometry Imaging: Toward Single Cell Metabolomics in Plant Tissues.” The Chemical Record 18 (1): 65–77. https://doi.org/10.1002/tcr.201700027. Hao, Jun-Di, Yao-Yu Chen, Yan-Zhen Wang, Na An, Pei-Rong Bai, Quan-Fei Zhu, and Yu-Qi Feng. 2023. “Novel Peak Shift Correction Method Based on the Retention Index for Peak Alignment in Untargeted Metabolomics.” Analytical Chemistry 95 (35): 13330–37. https://doi.org/10.1021/acs.analchem.3c02583. Harrieder, Eva-Maria, Fleming Kretschmer, Sebastian Böcker, and Michael Witting. 2022. “Current State-of-the-Art of Separation Methods Used in LC-MS Based Metabolomics and Lipidomics.” Journal of Chromatography B 1188 (January): 123069. https://doi.org/10.1016/j.jchromb.2021.123069. Harwood, Thomas V., Daniel G. C. Treen, Mingxun Wang, Wibe de Jong, Trent R. Northen, and Benjamin P. 
Bowen. 2023. “BLINK Enables Ultrafast Tandem Mass Spectrometry Cosine Similarity Scoring.” Scientific Reports 13 (1): 13462. https://doi.org/10.1038/s41598-023-40496-9. Haug, Kenneth, Reza M Salek, and Christoph Steinbeck. 2017. “Global Open Data Management in Metabolomics.” Current Opinion in Chemical Biology, Omics, 36 (February): 58–63. https://doi.org/10.1016/j.cbpa.2016.12.024. Helmus, Rick, Thomas L. ter Laak, Annemarie P. van Wezel, Pim de Voogt, and Emma L. Schymanski. 2021. “patRoon: Open Source Software Platform for Environmental Mass Spectrometry Based Non-Target Screening.” Journal of Cheminformatics 13 (1): 1. https://doi.org/10.1186/s13321-020-00477-w. Hernandes, Vinicius Veri, Coral Barbas, and Danuta Dudzik. 2017. “A Review of Blood Sample Handling and Pre-Processing for Metabolomics Studies.” ELECTROPHORESIS 38 (18): 2232–41. https://doi.org/10.1002/elps.201700086. Hiller, Karsten, Jasper Hangebrauk, Christian Jäger, Jana Spura, Kerstin Schreiber, and Dietmar Schomburg. 2009. “MetaboliteDetector: Comprehensive Analysis Tool for Targeted and Nontargeted GC/MS Based Metabolome Analysis.” Analytical Chemistry 81 (9): 3429–39. https://doi.org/10.1021/ac802689c. Hites, Ronald A. 2019. “Correcting for Censored Environmental Measurements.” Environmental Science &amp; Technology, September. https://doi.org/10.1021/acs.est.9b05042. Hites, Ronald A., and Karl J. Jobst. 2018. “Is Nontargeted Screening Reproducible?” Environmental Science &amp; Technology 52 (21): 11975–76. https://doi.org/10.1021/acs.est.8b05671. Houriet, Joelle, Warren S. Vidar, Preston K. Manwill, Daniel A. Todd, and Nadja B. Cech. 2022. “How Low Can You Go? Selecting Intensity Thresholds for Untargeted Metabolomics Data Preprocessing.” Analytical Chemistry 94 (51): 17964–71. https://doi.org/10.1021/acs.analchem.2c04088. Hu, Xin, Douglas I. Walker, Yongliang Liang, Matthew Ryan Smith, Michael L. Orr, Brian D. Juran, Chunyu Ma, et al. 2021. 
“A Scalable Workflow to Characterize the Human Exposome.” Nature Communications 12 (1): 5575. https://doi.org/10.1038/s41467-021-25840-9. Hu, Yaxi, Betty Cai, and Tao Huan. 2019. “Enhancing Metabolome Coverage in Data-Dependent LC–MS/MS Analysis Through an Integrated Feature Extraction Strategy.” Analytical Chemistry 91 (22): 14433–41. https://doi.org/10.1021/acs.analchem.9b02980. Huan, Tao, Erica M. Forsberg, Duane Rinehart, Caroline H. Johnson, Julijana Ivanisevic, H. Paul Benton, Mingliang Fang, et al. 2017. “Systems Biology Guided by XCMS Online Metabolomics.” Nature Methods 14 (5): 461–62. https://doi.org/10.1038/nmeth.4260. Huang, Danning, Marcos Bouza, David A. Gaul, Franklin E. Leach, I. Jonathan Amster, Frank C. Schroeder, Arthur S. Edison, and Facundo M. Fernández. 2021. “Comparison of High-Resolution Fourier Transform Mass Spectrometry Platforms for Putative Metabolite Annotation.” Analytical Chemistry, August. https://doi.org/10.1021/acs.analchem.1c02224. Huber, Florian, Stefan Verhoeven, Christiaan Meijer, Hanno Spreeuw, Efraín Manuel Villanueva Castilla, Cunliang Geng, Justin J. J. van der Hooft, et al. 2020. “Matchms - Processing and Similarity Evaluation of Mass Spectrometry Data.” Journal of Open Source Software 5 (52): 2411. https://doi.org/10.21105/joss.02411. Hufsky, Franziska, Kerstin Scheubert, and Sebastian Böcker. 2014. “Computational Mass Spectrometry for Small-Molecule Fragmentation.” TrAC Trends in Analytical Chemistry 53 (January): 41–48. https://doi.org/10.1016/j.trac.2013.09.008. Ibáñez, Clara, Lamia Mouhid, Guillermo Reglero, and Ana Ramírez de Molina. 2017. “Lipidomics Insights in Health and Nutritional Intervention Studies.” Journal of Agricultural and Food Chemistry 65 (36): 7827–42. https://doi.org/10.1021/acs.jafc.7b02643. Jacyna, Julia, Marta Kordalewska, and Michał J. Markuszewski. 2019. 
“Design of Experiments in Metabolomics-Related Studies: An Overview.” Journal of Pharmaceutical and Biomedical Analysis 164 (February): 598–606. https://doi.org/10.1016/j.jpba.2018.11.027. Jaeger, Carsten, Friederike Hoffmann, Clemens A. Schmitt, and Jan Lisec. 2016. “Automated Annotation and Evaluation of In-Source Mass Spectra in GC/Atmospheric Pressure Chemical Ionization-MS-Based Metabolomics.” Analytical Chemistry 88 (19): 9386–90. https://doi.org/10.1021/acs.analchem.6b02743. Jalili, Vahid, Enis Afgan, Qiang Gu, Dave Clements, Daniel Blankenberg, Jeremy Goecks, James Taylor, and Anton Nekrutenko. 2020. “The Galaxy Platform for Accessible, Reproducible and Collaborative Biomedical Analyses: 2020 Update.” Nucleic Acids Research 48 (W1): W395–402. https://doi.org/10.1093/nar/gkaa434. Jang, Cholsoon, Li Chen, and Joshua D. Rabinowitz. 2018. “Metabolomics and Isotope Tracing.” Cell 173 (4): 822–37. https://doi.org/10.1016/j.cell.2018.03.055. Jones, Dean P., Youngja Park, and Thomas R. Ziegler. 2012. “Nutritional Metabolomics: Progress in Addressing Complexity in Diet and Health.” Annual Review of Nutrition 32 (1): 183–202. https://doi.org/10.1146/annurev-nutr-072610-145159. Jorge, Tiago F., Ana T. Mata, and Carla António. 2016. “Mass Spectrometry as a Quantitative Tool in Plant Metabolomics.” Phil. Trans. R. Soc. A 374 (2079): 20150370. https://doi.org/10.1098/rsta.2015.0370. Jr, Stephen Salerno, Mahya Mehrmohamadi, Maria V. Liberti, Muting Wan, Martin T. Wells, James G. Booth, and Jason W. Locasale. 2017. “RRmix: A Method for Simultaneous Batch Effect Correction and Analysis of Metabolomics Data in the Absence of Internal Standards.” PLOS ONE 12 (6): e0179530. https://doi.org/10.1371/journal.pone.0179530. Ju, Ran, Xinyu Liu, Fujian Zheng, Xinjie Zhao, Xin Lu, Xiaohui Lin, Zhongda Zeng, and Guowang Xu. 2020. 
“A Graph Density-Based Strategy for Features Fusion from Different Peak Extract Software to Achieve More Metabolites in Metabolic Profiling from High-Resolution Mass Spectrometry.” Analytica Chimica Acta 1139 (December): 8–14. https://doi.org/10.1016/j.aca.2020.09.029. Kachman, Maureen, Hani Habra, William Duren, Janis Wigginton, Peter Sajjakulnukit, George Michailidis, Charles Burant, and Alla Karnovsky. 2020. “Deep Annotation of Untargeted LC-MS Metabolomics Data with Binner.” Bioinformatics 36 (6): 1801–6. https://doi.org/10.1093/bioinformatics/btz798. Kapoore, Rahul Vijay, and Seetharaman Vaidyanathan. 2016. “Towards Quantitative Mass Spectrometry-Based Metabolomics in Microbial and Mammalian Systems.” Phil. Trans. R. Soc. A 374 (2079): 20150363. https://doi.org/10.1098/rsta.2015.0363. Karpievitch, Yuliya V., Sonja B. Nikolic, Richard Wilson, James E. Sharman, and Lindsay M. Edwards. 2014. “Metabolomics Data Normalization with EigenMS.” PLOS ONE 9 (12): e116221. https://doi.org/10.1371/journal.pone.0116221. Kennedy, Adam D., Bryan M. Wittmann, Anne M. Evans, Luke A. D. Miller, Douglas R. Toal, Shaun Lonergan, Sarah H. Elsea, and Kirk L. Pappan. 2018. “Metabolomics in the Clinic: A Review of the Shared and Unique Features of Untargeted Metabolomics for Clinical Research and Clinical Testing.” Journal of Mass Spectrometry 53 (11): 1143–54. https://doi.org/10.1002/jms.4292. Keshet, Uri, Tobias Kind, Xinchen Lu, Sarita Devi, and Oliver Fiehn. 2022. “Acyl-CoA Identification in Mouse Liver Samples Using the In Silico CoA-Blast Tandem Mass Spectral Library.” Analytical Chemistry 94 (6): 2732–39. https://doi.org/10.1021/acs.analchem.1c03272. Kew, William, John W. T. Blackburn, David J. Clarke, and Dušan Uhrín. 2017. “Interactive van Krevelen Diagrams – Advanced Visualisation of Mass Spectrometry Data of Complex Mixtures.” Rapid Communications in Mass Spectrometry 31 (7): 658–62. https://doi.org/10.1002/rcm.7823. 
Kim, Jungyeon, Joong Kyong Ahn, Yu Eun Cheong, Sung-Joon Lee, Hoon-Suk Cha, and Kyoung Heon Kim. 2020. “Systematic Re-Evaluation of the Long-Used Standard Protocol of Urease-Dependent Metabolome Sample Preparation.” PLOS ONE 15 (3): e0230072. https://doi.org/10.1371/journal.pone.0230072. Kim, Taiyun, Owen Tang, Stephen T. Vernon, Katharine A. Kott, Yen Chin Koay, John Park, David E. James, et al. 2021. “A Hierarchical Approach to Removal of Unwanted Variation for Large-Scale Metabolomics Data.” Nature Communications 12 (1): 4992. https://doi.org/10.1038/s41467-021-25210-5. Kind, Tobias, and Oliver Fiehn. 2007. “Seven Golden Rules for Heuristic Filtering of Molecular Formulas Obtained by Accurate Mass Spectrometry.” BMC Bioinformatics 8 (1): 105. https://doi.org/10.1186/1471-2105-8-105. Kind, Tobias, Hiroshi Tsugawa, Tomas Cajka, Yan Ma, Zijuan Lai, Sajjan S. Mehta, Gert Wohlgemuth, et al. 2018. “Identification of Small Molecules Using Accurate Mass MS/MS Search.” Mass Spectrometry Reviews 37 (4): 513–32. https://doi.org/10.1002/mas.21535. Koelmel, Jeremy P., Nicholas M. Kroeger, Candice Z. Ulmer, John A. Bowden, Rainey E. Patterson, Jason A. Cochran, Christopher W. W. Beecher, Timothy J. Garrett, and Richard A. Yost. 2017. “LipidMatch: An Automated Workflow for Rule-Based Lipid Identification Using Untargeted High-Resolution Tandem Mass Spectrometry Data.” BMC Bioinformatics 18 (July): 331. https://doi.org/10.1186/s12859-017-1744-3. Kong, Fanzhou, Uri Keshet, Tong Shen, Elys Rodriguez, and Oliver Fiehn. 2023. “LibGen: Generating High Quality Spectral Libraries of Natural Products for EAD-, UVPD-, and HCD-High Resolution Mass Spectrometers.” Analytical Chemistry, November. https://doi.org/10.1021/acs.analchem.3c02263. Kouřil, Štěpán, Julie de Sousa, Jan Václavík, David Friedecký, and Tomáš Adam. 2020. “CROP: Correlation-Based Reduction of Feature Multiplicities in Untargeted Metabolomic Data.” Bioinformatics 36 (9): 2941–42. 
https://doi.org/10.1093/bioinformatics/btaa012. Kuhl, Carsten, Ralf Tautenhahn, Christoph Böttcher, Tony R. Larson, and Steffen Neumann. 2012. “CAMERA: An Integrated Strategy for Compound Spectra Extraction and Annotation of Liquid Chromatography/Mass Spectrometry Data Sets.” Analytical Chemistry 84 (1): 283–89. https://doi.org/10.1021/ac202450g. Kuligowski, Julia, Ángel Sánchez-Illana, Daniel Sanjuán-Herráez, Máximo Vento, and Guillermo Quintás. 2015. “Intra-Batch Effect Correction in Liquid Chromatography-Mass Spectrometry Using Quality Control Samples and Support Vector Regression (QC-SVRC).” Analyst 140 (22): 7810–17. https://doi.org/10.1039/C5AN01638J. Kusonmano, Kanthida, Wanwipa Vongsangnak, and Pramote Chumnanpuen. 2016. “Informatics for Metabolomics.” In Translational Biomedical Informatics, 91–115. Advances in Experimental Medicine and Biology. Springer, Singapore. https://doi.org/10.1007/978-981-10-1503-8_5. Lai, Zijuan, Hiroshi Tsugawa, Gert Wohlgemuth, Sajjan Mehta, Matthew Mueller, Yuxuan Zheng, Atsushi Ogiwara, et al. 2018. “Identifying Metabolites by Integrating Metabolome Databases with Mass Spectrometry Cheminformatics.” Nature Methods 15 (1): 53–56. https://doi.org/10.1038/nmeth.4512. Lakhani, Chirag M., Braden T. Tierney, Arjun K. Manrai, Jian Yang, Peter M. Visscher, and Chirag J. Patel. 2019. “Repurposing Large Health Insurance Claims Data to Estimate Genetic and Environmental Contributions in 560 Phenotypes.” Nature Genetics 51 (2): 327–34. https://doi.org/10.1038/s41588-018-0313-7. Laparre, Jérôme, Zied Kaabia, Mark Mooney, Tom Buckley, Mark Sherry, Bruno Le Bizec, and Gaud Dervilly-Pinel. 2017. “Impact of Storage Conditions on the Urinary Metabolomics Fingerprint.” Analytica Chimica Acta 951 (January): 99–107. https://doi.org/10.1016/j.aca.2016.11.055. Larralde, Martin, Thomas N. Lawson, Ralf J. M. Weber, Pablo Moreno, Kenneth Haug, Philippe Rocca-Serra, Mark R. Viant, Christoph Steinbeck, and Reza M. Salek. 2017. 
“mzML2ISA &amp; nmrML2ISA: Generating Enriched ISA-Tab Metadata Files from Metabolomics XML Data.” Bioinformatics 33 (16): 2598–2600. https://doi.org/10.1093/bioinformatics/btx169. Lassen, Johan, Kirstine Lykke Nielsen, Mogens Johannsen, and Palle Villesen. 2021. “Assessment of XCMS Optimization Methods with Machine-Learning Performance.” Analytical Chemistry 93 (40): 13459–66. https://doi.org/10.1021/acs.analchem.1c02000. Lawson, Thomas N., Ralf J. M. Weber, Martin R. Jones, Andrew J. Chetwynd, Giovanny Rodríguez-Blanco, Riccardo Di Guida, Mark R. Viant, and Warwick B. Dunn. 2017. “msPurity: Automated Evaluation of Precursor Ion Purity for Mass Spectrometry-Based Fragmentation in Metabolomics.” Analytical Chemistry 89 (4): 2432–39. https://doi.org/10.1021/acs.analchem.6b04358. Lê Cao, Kim-Anh, Simon Boitard, and Philippe Besse. 2011. “Sparse PLS Discriminant Analysis: Biologically Relevant Feature Selection and Graphical Displays for Multiclass Problems.” BMC Bioinformatics 12 (June): 253. https://doi.org/10.1186/1471-2105-12-253. Ledesma-Escobar, Carlos Augusto, Feliciano Priego-Capote, and Mónica Calderón-Santiago. 2023. “MetaboMSDIA: A Tool for Implementing Data-Independent Acquisition in Metabolomic-Based Mass Spectrometry Analysis.” Analytica Chimica Acta 1266 (July): 341308. https://doi.org/10.1016/j.aca.2023.341308. Leek, Jeffrey T., W. Evan Johnson, Hilary S. Parker, Andrew E. Jaffe, and John D. Storey. 2012. “The Sva Package for Removing Batch Effects and Other Unwanted Variation in High-Throughput Experiments.” Bioinformatics 28 (6): 882–83. https://doi.org/10.1093/bioinformatics/bts034. Leek, Jeffrey T., and John D. Storey. 2007. “Capturing Heterogeneity in Gene Expression Studies by Surrogate Variable Analysis.” PLOS Genetics 3 (9): e161. https://doi.org/10.1371/journal.pgen.0030161. ———. 2008. “A General Framework for Multiple Testing Dependence.” Proceedings of the National Academy of Sciences 105 (48): 18718–23. 
https://doi.org/10.1073/pnas.0808709105. Levy, Allison J., Nicholas R. Oranzi, Atiye Ahmadireskety, Robin H. J. Kemperman, Michael S. Wei, and Richard A. Yost. 2019. “Recent Progress in Metabolomics Using Ion Mobility-Mass Spectrometry.” TrAC Trends in Analytical Chemistry 116 (July): 274–81. https://doi.org/10.1016/j.trac.2019.05.001. Li, Bo, Jing Tang, Qingxia Yang, Shuang Li, Xuejiao Cui, Yinghong Li, Yuzong Chen, Weiwei Xue, Xiaofeng Li, and Feng Zhu. 2017. “NOREVA: Normalization and Evaluation of MS-based Metabolomics Data.” Nucleic Acids Research 45 (W1): W162–70. https://doi.org/10.1093/nar/gkx449. Li, Hao, Yuping Cai, Yuan Guo, Fangfang Chen, and Zheng-Jiang Zhu. 2016. “MetDIA: Targeted Metabolite Extraction of Multiplexed MS/MS Spectra Generated by Data-Independent Acquisition.” Analytical Chemistry 88 (17): 8757–64. https://doi.org/10.1021/acs.analchem.6b02122. Li, Liang, Ronghong Li, Jianjun Zhou, Azeret Zuniga, Avalyn E. Stanislaus, Yiman Wu, Tao Huan, et al. 2013. “MyCompoundID: Using an Evidence-Based Metabolome Library for Metabolite Identification.” Analytical Chemistry 85 (6): 3401–8. https://doi.org/10.1021/ac400099b. Li, Lili, Weijie Ren, Hongwei Kong, Chunxia Zhao, Xinjie Zhao, Xiaohui Lin, Xin Lu, and Guowang Xu. 2017. “An Alignment Algorithm for LC-MS-based Metabolomics Dataset Assisted by MS/MS Information.” Analytica Chimica Acta 990 (October): 96–102. https://doi.org/10.1016/j.aca.2017.07.058. Li, Shuzhao. 2020. Computational Methods and Data Analysis for Metabolomics. Springer. Li, Shuzhao, Youngja Park, Sai Duraisingham, Frederick H. Strobel, Nooruddin Khan, Quinlyn A. Soltow, Dean P. Jones, and Bali Pulendran. 2013. “Predicting Network Activity from High Throughput Metabolomics.” PLOS Computational Biology 9 (7): e1003123. https://doi.org/10.1371/journal.pcbi.1003123. Li, Shuzhao, Amnah Siddiqa, Maheshwor Thapa, Yuanye Chi, and Shujian Zheng. 2023. 
“Trackable and Scalable LC-MS Metabolomics Data Processing Using Asari.” Nature Communications 14 (1): 4113. https://doi.org/10.1038/s41467-023-39889-1. Li, Yuanyue, and Oliver Fiehn. 2023. “Flash Entropy Search to Query All Mass Spectral Libraries in Real Time.” Nature Methods 20 (10): 1475–78. https://doi.org/10.1038/s41592-023-02012-9. Li, Yuanyue, Tobias Kind, Jacob Folz, Arpana Vaniya, Sajjan Singh Mehta, and Oliver Fiehn. 2021. “Spectral Entropy Outperforms MS/MS Dot Product Similarity for Small-Molecule Compound Identification.” Nature Methods 18 (12): 1524–31. https://doi.org/10.1038/s41592-021-01331-z. Li, Zhucui, Yan Lu, Yufeng Guo, Haijie Cao, Qinhong Wang, and Wenqing Shui. 2018. “Comprehensive Evaluation of Untargeted Metabolomics Data Processing Software in Feature Detection, Quantification and Discriminating Marker Selection.” Analytica Chimica Acta 1029 (October): 50–57. https://doi.org/10.1016/j.aca.2018.05.001. Liao, Jingyu, Yuhao Zhang, Wendan Zhang, Yuanyuan Zeng, Jing Zhao, Jingfang Zhang, Tingting Yao, et al. 2023. “Different Software Processing Affects the Peak Picking and Metabolic Pathway Recognition of Metabolomics Data.” Journal of Chromatography A 1687 (January): 463700. https://doi.org/10.1016/j.chroma.2022.463700. Libiseller, Gunnar, Michaela Dvorzak, Ulrike Kleb, Edgar Gander, Tobias Eisenberg, Frank Madeo, Steffen Neumann, et al. 2015. “IPO: A Tool for Automated Optimization of XCMS Parameters.” BMC Bioinformatics 16 (April): 118. https://doi.org/10.1186/s12859-015-0562-8. Lieng, Brandon Y., Andrew T. Quaile, Xavier Domingo-Almenara, Hannes L. Röst, and J. Rafael Montenegro-Burke. 2023. “Computational Expansion of High-Resolution-MSn Spectral Libraries.” Analytical Chemistry, November. https://doi.org/10.1021/acs.analchem.3c03343. Lin, Ching Yu, Huifeng Wu, Ronald S. Tjeerdema, and Mark R. Viant. 2007. “Evaluation of Metabolite Extraction Strategies from Tissue Samples Using NMR Metabolomics.” Metabolomics 3 (1): 55–67. 
https://doi.org/10.1007/s11306-006-0043-1. Lisec, Jan, Friederike Hoffmann, Clemens Schmitt, and Carsten Jaeger. 2016. “Extending the Dynamic Range in Metabolomics Experiments by Automatic Correction of Peaks Exceeding the Detection Limit.” Analytical Chemistry 88 (15): 7487–92. https://doi.org/10.1021/acs.analchem.6b02515. Lisitsyna, Anna, Franco Moritz, Youzhong Liu, Loubna Al Sadat, Hans Hauner, Melina Claussnitzer, Philippe Schmitt-Kopplin, and Sara Forcisi. 2022. “Feature Selection Pipelines with Classification for Non-targeted Metabolomics Combining the Neural Network and Genetic Algorithm.” Analytical Chemistry 94 (14): 5474–82. https://doi.org/10.1021/acs.analchem.1c03237. Liu, Qin, Douglas Walker, Karan Uppal, Zihe Liu, Chunyu Ma, ViLinh Tran, Shuzhao Li, Dean P. Jones, and Tianwei Yu. 2020. “Addressing the Batch Effect Issue for LC/MS Metabolomics Data in Data Preprocessing.” Scientific Reports 10 (1): 13856. https://doi.org/10.1038/s41598-020-70850-0. Liu, Xinyu, Lina Zhou, Xianzhe Shi, and Guowang Xu. 2019. “New Advances in Analytical Methods for Mass Spectrometry-Based Large-Scale Metabolomics Study.” TrAC Trends in Analytical Chemistry 121 (December): 115665. https://doi.org/10.1016/j.trac.2019.115665. Liu, Youzhong, Yingjie Zhang, Tom Vennekens, Jennifer L. Lippens, Luc Duijsens, Danh Bui-Thi, Kris Laukens, and Thomas de Vijlder. 2023. “MeRgeION: A Multifunctional R Pipeline for Small Molecule LC-MS/MS Data Processing, Searching, and Organizing.” Analytical Chemistry 95 (22): 8433–42. https://doi.org/10.1021/acs.analchem.2c04343. Livera, Alysha M. De, Marko Sysi-Aho, Laurent Jacob, Johann A. Gagnon-Bartsch, Sandra Castillo, Julie A. Simpson, and Terence P. Speed. 2015. “Statistical Methods for Handling Unwanted Variation in Metabolomics Data.” Analytical Chemistry 87 (7): 3606–15. https://doi.org/10.1021/ac502439y. Loftfield, Erikka, Emily Vogtmann, Joshua N. Sampson, Steven C. Moore, Heidi Nelson, Rob Knight, Nicholas Chia, and Rashmi Sinha. 2016. 
“Comparison of Collection Methods for Fecal Samples for Discovery Metabolomics in Epidemiologic Studies.” Cancer Epidemiology and Prevention Biomarkers 25 (11): 1483–90. https://doi.org/10.1158/1055-9965.EPI-16-0409. Loos, Martin, and Heinz Singer. 2017. “Nontargeted Homologue Series Extraction from Hyphenated High Resolution Mass Spectrometry Data.” Journal of Cheminformatics 9 (February). https://doi.org/10.1186/s13321-017-0197-z. Lu, Wenyun, Bryson D. Bennett, and Joshua D. Rabinowitz. 2008. “Analytical Strategies for LC–MS-based Targeted Metabolomics.” Journal of Chromatography B, Hyphenated Techniques for Global Metabolite Profiling, 871 (2): 236–42. https://doi.org/10.1016/j.jchromb.2008.04.031. Lu, Wenyun, Xiaoyang Su, Matthias S. Klein, Ian A. Lewis, Oliver Fiehn, and Joshua D. Rabinowitz. 2017. “Metabolite Measurement: Pitfalls to Avoid and Practices to Follow.” Annual Review of Biochemistry 86 (1): 277–304. https://doi.org/10.1146/annurev-biochem-061516-044952. Lu, Xin, and Guowang Xu. 2008. “LC-MS Metabonomics Methodology in Biomarker Discovery.” In Biomarker Methods in Drug Discovery and Development, edited by Feng Wang, 291–315. Methods in Pharmacology and Toxicology™. Humana Press. https://doi.org/10.1007/978-1-59745-463-6_14. Ludwig, Marcus, Louis-Félix Nothias, Kai Dührkop, Irina Koester, Markus Fleischauer, Martin A. Hoffmann, Daniel Petras, et al. 2020. “Database-Independent Molecular Formula Annotation Using Gibbs Sampling Through ZODIAC.” Nature Machine Intelligence 2 (10): 629–41. https://doi.org/10.1038/s42256-020-00234-6. Luo, Xian, and Liang Li. 2017. “Metabolomics of Small Numbers of Cells: Metabolomic Profiling of 100, 1000, and 10000 Human Breast Cancer Cells.” Analytical Chemistry 89 (21): 11664–71. https://doi.org/10.1021/acs.analchem.7b03100. Lv, Wangjie, Zhongda Zeng, Yuqing Zhang, Qingqing Wang, Lichao Wang, Zhaoxuan Zhang, Xianzhe Shi, Xinjie Zhao, and Guowang Xu. 2022. 
“Comprehensive Metabolite Quantitative Assay Based on Alternate Metabolomics and Lipidomics Analyses.” Analytica Chimica Acta 1215 (July): 339979. https://doi.org/10.1016/j.aca.2022.339979. Ma, Yan, Tobias Kind, Dawei Yang, Carlos Leon, and Oliver Fiehn. 2014. “MS2Analyzer: A Software for Small Molecule Substructure Annotations from Accurate Tandem Mass Spectra.” Analytical Chemistry 86 (21): 10724–31. https://doi.org/10.1021/ac502818e. Madsen, Rasmus, Torbjörn Lundstedt, and Johan Trygg. 2010. “Chemometrics in Metabolomics—A Review in Human Disease Diagnosis.” Analytica Chimica Acta 659 (1): 23–33. https://doi.org/10.1016/j.aca.2009.11.042. Mahieu, Nathaniel G., and Gary J. Patti. 2017. “Systems-Level Annotation of a Metabolomics Data Set Reduces 25 000 Features to Fewer Than 1000 Unique Metabolites.” Analytical Chemistry 89 (19): 10397–406. https://doi.org/10.1021/acs.analchem.7b02380. Mahieu, Nathaniel G., Jonathan L. Spalding, Susan J. Gelman, and Gary J. Patti. 2016. “Defining and Detecting Complex Peak Relationships in Mass Spectral Data: The Mz.unity Algorithm.” Analytical Chemistry 88 (18): 9037–46. https://doi.org/10.1021/acs.analchem.6b01702. Mahieu, Nathaniel G., Jonathan L. Spalding, and Gary J. Patti. 2016. “Warpgroup: Increased Precision of Metabolomic Data Processing by Consensus Integration Bound Analysis.” Bioinformatics 32 (2): 268–75. https://doi.org/10.1093/bioinformatics/btv564. Mahmud, Iqbal, Sandi Sternberg, Michael Williams, and Timothy J. Garrett. 2017. “Comparison of Global Metabolite Extraction Strategies for Soybeans Using UHPLC-HRMS.” Analytical and Bioanalytical Chemistry 409 (26): 6173–80. https://doi.org/10.1007/s00216-017-0557-6. Maitre, Léa, Mariona Bustamante, Carles Hernández-Ferrer, Denise Thiel, Chung-Ho E. Lau, Alexandros P. Siskos, Marta Vives-Usano, et al. 2022. “Multi-Omics Signatures of the Human Early Life Exposome.” Nature Communications 13 (1): 7024. https://doi.org/10.1038/s41467-022-34422-2. 
Mangul, Serghei, Thiago Mosqueiro, Richard J. Abdill, Dat Duong, Keith Mitchell, Varuni Sarwal, Brian Hill, et al. 2019. “Challenges and Recommendations to Improve the Installability and Archival Stability of Omics Computational Tools.” PLOS Biology 17 (6): e3000333. https://doi.org/10.1371/journal.pbio.3000333. Mannhold, Raimund, Gennadiy I. Poda, Claude Ostermann, and Igor V. Tetko. 2009. “Calculation of Molecular Lipophilicity: State-of-the-Art and Comparison of LogP Methods on More Than 96,000 Compounds.” Journal of Pharmaceutical Sciences 98 (3): 861–93. https://doi.org/10.1002/jps.21494. Mansouri, Kamel, Chris M. Grulke, Richard S. Judson, and Antony J. Williams. 2018. “OPERA Models for Predicting Physicochemical Properties and Environmental Fate Endpoints.” Journal of Cheminformatics 10 (1): 10. https://doi.org/10.1186/s13321-018-0263-1. Mardal, Marie, Petur W. Dalsgaard, Brian S. Rasmussen, Kristian Linnet, and Christian B. Mollerup. 2023. “Scalable Analysis of Untargeted LC-HRMS Data by Means of SQL Database Archiving.” Analytical Chemistry, February. https://doi.org/10.1021/acs.analchem.2c03769. Martens, Jonathan, Giel Berden, Rianne E. van Outersterp, Leo A. J. Kluijtmans, Udo F. Engelke, Clara D. M. van Karnebeek, Ron A. Wevers, and Jos Oomens. 2017. “Molecular Identification in Metabolomics Using Infrared Ion Spectroscopy.” Scientific Reports 7 (June). https://doi.org/10.1038/s41598-017-03387-4. Matich, Eryn K., Nita G. Chavez Soria, Diana S. Aga, and G. Ekin Atilla-Gokcumen. 2019. “Applications of Metabolomics in Assessing Ecological Effects of Emerging Contaminants and Pollutants on Plants.” Journal of Hazardous Materials 373 (July): 527–35. https://doi.org/10.1016/j.jhazmat.2019.02.084. Matsuo, Teruko, Hiroshi Tsugawa, Hiromi Miyagawa, and Eiichiro Fukusaki. 2017. 
“Integrated Strategy for Unknown EI–MS Identification Using Quality Control Calibration Curve, Multivariate Analysis, EI–MS Spectral Database, and Retention Index Prediction.” Analytical Chemistry 89 (12): 6766–73. https://doi.org/10.1021/acs.analchem.7b01010. McLean, Craig, and Elizabeth B. Kujawinski. 2020. “AutoTuner: High Fidelity and Robust Parameter Selection for Metabolomics Data Processing.” Analytical Chemistry 92 (8): 5724–32. https://doi.org/10.1021/acs.analchem.9b04804. Melamud, Eugene, Livia Vastag, and Joshua D. Rabinowitz. 2010. “Metabolomic Analysis and Visualization Engine for LC-MS Data.” Analytical Chemistry 82 (23): 9818–26. https://doi.org/10.1021/ac1021166. Menikarachchi, Lochana C., Shannon Cawley, Dennis W. Hill, L. Mark Hall, Lowell Hall, Steven Lai, Janine Wilder, and David F. Grant. 2012. “MolFind: A Software Package Enabling HPLC/MS-Based Identification of Unknown Chemical Structures.” Analytical Chemistry 84 (21): 9388–94. https://doi.org/10.1021/ac302048x. Miggiels, Paul, Bert Wouters, Gerard J. P. van Westen, Anne-Charlotte Dubbelman, and Thomas Hankemeier. 2019. “Novel Technologies for Metabolomics: More for Less.” TrAC Trends in Analytical Chemistry 120 (November): 115323. https://doi.org/10.1016/j.trac.2018.11.021. Misra, Biswapriya B. 2018. “New Tools and Resources in Metabolomics: 2016–2017.” ELECTROPHORESIS 39 (7): 909–23. https://doi.org/10.1002/elps.201700441. Misra, Biswapriya B., Johannes F. Fahrmann, and Dmitry Grapov. 2017. “Review of Emerging Metabolomic Tools and Resources: 2015–2016.” ELECTROPHORESIS 38 (18): 2257–74. https://doi.org/10.1002/elps.201700110. Misra, Biswapriya B., and Justin J. J. van der Hooft. 2016. “Updates in Metabolomics Tools and Resources: 2014–2015.” ELECTROPHORESIS 37 (1): 86–110. https://doi.org/10.1002/elps.201500417. Miyagawa, Hiromi, and Takeshi Bamba. 2019. 
“Comparison of Sequential Derivatization with Concurrent Methods for GC/MS-based Metabolomics.” Journal of Bioscience and Bioengineering 127 (2): 160–68. https://doi.org/10.1016/j.jbiosc.2018.07.015. Montenegro-Burke, J. Rafael, Aries E. Aisporna, H. Paul Benton, Duane Rinehart, Mingliang Fang, Tao Huan, Benedikt Warth, et al. 2017. “Data Streaming for Metabolomics: Accelerating Data Processing and Analysis from Days to Minutes.” Analytical Chemistry 89 (2): 1254–59. https://doi.org/10.1021/acs.analchem.6b03890. Müller, Manfred J., and Anja Bosy-Westphal. 2020. “From a ‘Metabolomics Fashion’ to a Sound Application of Metabolomics in Research on Human Nutrition.” European Journal of Clinical Nutrition 74 (12): 1619–29. https://doi.org/10.1038/s41430-020-00781-6. Myers, Owen D., Susan J. Sumner, Shuzhao Li, Stephen Barnes, and Xiuxia Du. 2017. “Detailed Investigation and Comparison of the XCMS and MZmine 2 Chromatogram Construction and Chromatographic Peak Detection Methods for Preprocessing Mass Spectrometry Metabolomics Data.” Analytical Chemistry 89 (17): 8689–95. https://doi.org/10.1021/acs.analchem.7b01069. Najdekr, Lukáš, David Friedecký, Ralf Tautenhahn, Tomáš Pluskal, Junhua Wang, Yingying Huang, and Tomáš Adam. 2016. “Influence of Mass Resolving Power in Orbital Ion-Trap Mass Spectrometry-Based Metabolomics.” Analytical Chemistry 88 (23): 11429–35. https://doi.org/10.1021/acs.analchem.6b02319. Nash, William J., and Warwick B. Dunn. 2019. “From Mass to Metabolite in Human Untargeted Metabolomics: Recent Advances in Annotation of Metabolites Applying Liquid Chromatography-Mass Spectrometry Data.” TrAC Trends in Analytical Chemistry 120 (November): 115324. https://doi.org/10.1016/j.trac.2018.11.022. Ni, Yan, Mingming Su, Yunping Qiu, Wei Jia, and Xiuxia Du. 2016. “ADAP-GC 3.0: Improved Peak Detection and Deconvolution of Co-eluting Metabolites from GC/TOF-MS Data for Metabolomics Studies.” Analytical Chemistry 88 (17): 8802–11. 
https://doi.org/10.1021/acs.analchem.6b02222. Ni, Zhixu, Michele Wölk, Geoff Jukes, Karla Mendivelso Espinosa, Robert Ahrends, Lucila Aimo, Jorge Alvarez-Jarreta, et al. 2022. “Guiding the Choice of Informatics Software and Tools for Lipidomics Research Applications.” Nature Methods, December, 1–12. https://doi.org/10.1038/s41592-022-01710-0. Nikolskiy, Igor, Nathaniel G. Mahieu, Ying-Jr Chen, Ralf Tautenhahn, and Gary J. Patti. 2013. “An Untargeted Metabolomic Workflow to Improve Structural Characterization of Metabolites.” Analytical Chemistry 85 (16): 7713–19. https://doi.org/10.1021/ac400751j. Nothias, Louis-Félix, Daniel Petras, Robin Schmid, Kai Dührkop, Johannes Rainer, Abinesh Sarvepalli, Ivan Protsyuk, et al. 2020. “Feature-Based Molecular Networking in the GNPS Analysis Environment.” Nature Methods 17 (9): 905–8. https://doi.org/10.1038/s41592-020-0933-6. Nyamundanda, Gift, Isobel Claire Gormley, Yue Fan, William M. Gallagher, and Lorraine Brennan. 2013. “MetSizeR: Selecting the Optimal Sample Size for Metabolomic Studies Using an Analysis Based Approach.” BMC Bioinformatics 14: 338. https://doi.org/10.1186/1471-2105-14-338. O’Boyle, Noel M., Michael Banck, Craig A. James, Chris Morley, Tim Vandermeersch, and Geoffrey R. Hutchison. 2011. “Open Babel: An Open Chemical Toolbox.” Journal of Cheminformatics 3 (1): 33. https://doi.org/10.1186/1758-2946-3-33. Oberg, Ann L., and Olga Vitek. 2009. “Statistical Design of Quantitative Mass Spectrometry-Based Proteomic Experiments.” Journal of Proteome Research 8 (5): 2144–56. https://doi.org/10.1021/pr8010099. Ortmayr, Karin, Verena Charwat, Cornelia Kasper, Stephan Hann, and Gunda Koellensperger. 2016. “Uncertainty Budgeting in Fold Change Determination and Implications for Non-Targeted Metabolomics Studies in Model Systems.” Analyst 142 (1): 80–90. https://doi.org/10.1039/C6AN01342B. Osipenko, Sergey, Alexander Zherebker, Lidiia Rumiantseva, Oxana Kovaleva, Evgeny N. Nikolaev, and Yury Kostyukevich. 2022. 
“Oxygen Isotope Exchange Reaction for Untargeted LC–MS Analysis.” Journal of the American Society for Mass Spectrometry 33 (2): 390–98. https://doi.org/10.1021/jasms.1c00383. Palmer, Andrew, Prasad Phapale, Ilya Chernyavsky, Regis Lavigne, Dominik Fay, Artem Tarasov, Vitaly Kovalev, et al. 2017. “FDR-controlled Metabolite Annotation for High-Resolution Imaging Mass Spectrometry.” Nature Methods 14 (1): 57–60. https://doi.org/10.1038/nmeth.4072. Pang, Zhiqiang, Jasmine Chong, Shuzhao Li, and Jianguo Xia. 2020. “MetaboAnalystR 3.0: Toward an Optimized Workflow for Global Metabolomics.” Metabolites 10 (5): 186. https://doi.org/10.3390/metabo10050186. Passos Mansoldo, Felipe Raposo, Rafael Garrett, Veronica da Silva Cardoso, Marina Amaral Alves, and Alane Beatriz Vermelho. 2022. “Metabology: Analysis of Metabolomics Data Using Community Ecology Tools.” Analytica Chimica Acta 1232 (November): 340469. https://doi.org/10.1016/j.aca.2022.340469. Patiny, Luc, and Alain Borel. 2013. “ChemCalc: A Building Block for Tomorrow’s Chemical Infrastructure.” Journal of Chemical Information and Modeling 53 (5): 1223–28. https://doi.org/10.1021/ci300563h. Petras, Daniel, Vanessa V. Phelan, Deepa Acharya, Andrew E. Allen, Allegra T. Aron, Nuno Bandeira, Benjamin P. Bowen, et al. 2021. “GNPS Dashboard: Collaborative Exploration of Mass Spectrometry Data in the Web Browser.” Nature Methods, December, 1–3. https://doi.org/10.1038/s41592-021-01339-5. Pezzatti, Julian, Julien Boccard, Santiago Codesido, Yoric Gagnebin, Abhinav Joshi, Didier Picard, Víctor González-Ruiz, and Serge Rudaz. 2020. “Implementation of Liquid Chromatography–High Resolution Mass Spectrometry Methods for Untargeted Metabolomic Analyses of Biological Samples: A Tutorial.” Analytica Chimica Acta 1105 (April): 28–44. https://doi.org/10.1016/j.aca.2019.12.062. Pfeuffer, Julianus, Chris Bielow, Samuel Wein, Kyowon Jeong, Eugen Netz, Axel Walter, Oliver Alka, et al. 2024. 
“OpenMS 3 Enables Reproducible Analysis of Large-Scale Mass Spectrometry Data.” Nature Methods 21 (3): 365–67. https://doi.org/10.1038/s41592-024-02197-7. Pfeuffer, Julianus, Timo Sachsenberg, Oliver Alka, Mathias Walzer, Alexander Fillbrunn, Lars Nilse, Oliver Schilling, Knut Reinert, and Oliver Kohlbacher. 2017. “OpenMS – A Platform for Reproducible Analysis of Mass Spectrometry Data.” Journal of Biotechnology, Bioinformatics Solutions for Big Data Analysis in Life Sciences presented by the German Network for Bioinformatics Infrastructure, 261 (November): 142–48. https://doi.org/10.1016/j.jbiotec.2017.05.016. Phapale, Prasad, Vineeta Rai, Ashok Kumar Mohanty, and Sanjeeva Srivastava. 2020. “Untargeted Metabolomics Workshop Report: Quality Control Considerations from Sample Preparation to Data Analysis.” Journal of the American Society for Mass Spectrometry 31 (9): 2006–10. https://doi.org/10.1021/jasms.0c00224. Place, Benjamin J., Elin M. Ulrich, Jonathan K. Challis, Alex Chao, Bowen Du, Kristin Favela, Yong-Lai Feng, et al. 2021. “An Introduction to the Benchmarking and Publications for Non-Targeted Analysis Working Group.” Analytical Chemistry 93 (49): 16289–96. https://doi.org/10.1021/acs.analchem.1c02660. Pluskal, Tomáš, Sandra Castillo, Alejandro Villar-Briones, and Matej Orešič. 2010. “MZmine 2: Modular Framework for Processing, Visualizing, and Analyzing Mass Spectrometry-Based Molecular Profile Data.” BMC Bioinformatics 11: 395. https://doi.org/10.1186/1471-2105-11-395. Pluskal, Tomáš, Ansgar Korf, Aleksandr Smirnov, Robin Schmid, Timothy R. Fallon, Xiuxia Du, and Jing-Ke Weng. 2020. “Chapter 7: Metabolomics Data Analysis Using MZmine.” In Processing Metabolomics and Proteomics Data with Open Software, 232–54. https://doi.org/10.1039/9781788019880-00232. Plyushchenko, Ivan V., Elizaveta S. Fedorova, Natalia V. Potoldykova, Konstantin A. Polyakovskiy, Alexander I. Glukhov, and Igor A. Rodin. 2022. 
“Omics Untargeted Key Script: R-Based Software Toolbox for Untargeted Metabolomics with Bladder Cancer Biomarkers Discovery Case Study.” Journal of Proteome Research 21 (3): 833–47. https://doi.org/10.1021/acs.jproteome.1c00392. Polderman, Tinca J. C., Beben Benyamin, Christiaan A. de Leeuw, Patrick F. Sullivan, Arjen van Bochoven, Peter M. Visscher, and Danielle Posthuma. 2015. “Meta-Analysis of the Heritability of Human Traits Based on Fifty Years of Twin Studies.” Nature Genetics 47 (7): 702–9. https://doi.org/10.1038/ng.3285. Qiu, Feng, Dennis D. Fine, Daniel J. Wherritt, Zhentian Lei, and Lloyd W. Sumner. 2016. “PlantMAT: A Metabolomics Tool for Predicting the Specialized Metabolic Potential of a System and for Large-Scale Metabolite Identifications.” Analytical Chemistry 88 (23): 11373–83. https://doi.org/10.1021/acs.analchem.6b00906. Qiu, Feng, Zhentian Lei, and Lloyd W. Sumner. 2018. “MetExpert: An Expert System to Enhance Gas Chromatography-Mass Spectrometry-Based Metabolite Identifications.” Analytica Chimica Acta, Analytical Metabolomics, 1037 (December): 316–26. https://doi.org/10.1016/j.aca.2018.03.052. Reuschenbach, Max, Felix Drees, Torsten C. Schmidt, and Gerrit Renner. 2023. “qBinning: Data Quality-Based Algorithm for Automized Ion Chromatogram Extraction from High-Resolution Mass Spectrometry.” Analytical Chemistry, September. https://doi.org/10.1021/acs.analchem.3c01079. Rey-Stolle, Fernanda, Danuta Dudzik, Carolina Gonzalez-Riano, Miguel Fernández-García, Vanesa Alonso-Herranz, David Rojo, Coral Barbas, and Antonia García. 2022. “Low and High Resolution Gas Chromatography-Mass Spectrometry for Untargeted Metabolomics: A Tutorial.” Analytica Chimica Acta 1210 (June): 339043. https://doi.org/10.1016/j.aca.2021.339043. Riquelme, Gabriel, Nicolás Zabalegui, Pablo Marchi, Christina M. Jones, and María Eugenia Monge. 2020. “A Python-Based Pipeline for Preprocessing LC–MS Data for Untargeted Metabolomics Workflows.” Metabolites 10 (10): 416. 
https://doi.org/10.3390/metabo10100416. Röst, Hannes L., Timo Sachsenberg, Stephan Aiche, Chris Bielow, Hendrik Weisser, Fabian Aicheler, Sandro Andreotti, et al. 2016. “OpenMS: A Flexible Open-Source Software Platform for Mass Spectrometry Data Analysis.” Nature Methods 13 (9): 741–48. https://doi.org/10.1038/nmeth.3959. Röst, Hannes L., Uwe Schmitt, Ruedi Aebersold, and Lars Malmström. 2014. “pyOpenMS: A Python-based Interface to the OpenMS Mass-Spectrometry Algorithm Library.” PROTEOMICS 14 (1): 74–77. https://doi.org/10.1002/pmic.201300246. Roszkowska, Anna, Miao Yu, Vincent Bessonneau, Leslie Bragg, Mark Servos, and Janusz Pawliszyn. 2018. “Tissue Storage Affects Lipidome Profiling in Comparison to in Vivo Microsampling Approach.” Scientific Reports 8 (1): 6980. https://doi.org/10.1038/s41598-018-25428-2. Rurik, Marc, Oliver Alka, Fabian Aicheler, and Oliver Kohlbacher. 2020. “Metabolomics Data Processing Using OpenMS.” In Computational Methods and Data Analysis for Metabolomics, edited by Shuzhao Li, 49–60. Methods in Molecular Biology. New York, NY: Springer US. https://doi.org/10.1007/978-1-0716-0239-3_4. Rusconi, Filippo. 2019. “mineXpert: Biological Mass Spectrometry Data Visualization and Mining with Full JavaScript Ability.” Journal of Proteome Research 18 (5): 2254–59. https://doi.org/10.1021/acs.jproteome.9b00099. Ruttkies, Christoph, Emma L. Schymanski, Sebastian Wolf, Juliane Hollender, and Steffen Neumann. 2016. “MetFrag Relaunched: Incorporating Strategies Beyond in Silico Fragmentation.” Journal of Cheminformatics 8 (January): 3. https://doi.org/10.1186/s13321-016-0115-9. Saccenti, Edoardo, and Marieke E. Timmerman. 2016. “Approaches to Sample Size Determination for Multivariate Data: Applications to PCA and PLS-DA of Omics Data.” Journal of Proteome Research 15 (8): 2379–93. https://doi.org/10.1021/acs.jproteome.5b01029. Samanipour, Saer, Malcolm J. Reid, Kine Bæk, and Kevin V. Thomas. 2018. 
“Combining a Deconvolution and a Universal Library Search Algorithm for the Nontarget Analysis of Data-Independent Acquisition Mode Liquid Chromatography-High-Resolution Mass Spectrometry Results.” Environmental Science &amp; Technology 52 (8): 4694–4701. https://doi.org/10.1021/acs.est.8b00259. Sarpe, Vladimir, and David C Schriemer. 2017. “Supporting Metabolomics with Adaptable Software: Design Architectures for the End-User.” Current Opinion in Biotechnology, Analytical biotechnology, 43 (February): 110–17. https://doi.org/10.1016/j.copbio.2016.11.001. Scheltema, Richard A., Andris Jankevics, Ritsert C. Jansen, Morris A. Swertz, and Rainer Breitling. 2011. “PeakML/mzMatch: A File Format, Java Library, R Library, and Tool-Chain for Mass Spectrometry Data Analysis.” Analytical Chemistry 83 (7): 2786–93. https://doi.org/10.1021/ac2000994. Scheubert, Kerstin, Franziska Hufsky, Daniel Petras, Mingxun Wang, Louis-Félix Nothias, Kai Dührkop, Nuno Bandeira, Pieter C. Dorrestein, and Sebastian Böcker. 2017. “Significance Estimation for Large Scale Metabolomics Annotations by Spectral Matching.” Nature Communications 8 (1): 1494. https://doi.org/10.1038/s41467-017-01318-5. Schrimpe-Rutledge, Alexandra C., Simona G. Codreanu, Stacy D. Sherrod, and John A. McLean. 2016. “Untargeted Metabolomics Strategies—Challenges and Emerging Directions.” Journal of The American Society for Mass Spectrometry 27 (12): 1897–1905. https://doi.org/10.1007/s13361-016-1469-y. Schymanski, Emma L., and Antony J. Williams. 2017. “Open Science for Identifying ‘Known Unknown’ Chemicals.” Environmental Science &amp; Technology 51 (10): 5357–59. https://doi.org/10.1021/acs.est.7b01908. Senan, Oriol, Antoni Aguilar-Mogas, Miriam Navarro, Jordi Capellades, Luke Noon, Deborah Burks, Oscar Yanes, Roger Guimerà, and Marta Sales-Pardo. 2019. 
“CliqueMS: A Computational Tool for Annotating in-Source Metabolite Ions from LC-MS Untargeted Metabolomics Data Based on a Coelution Similarity Network.” Bioinformatics 35 (20): 4089–97. https://doi.org/10.1093/bioinformatics/btz207. Shaffer, Justin P., Louis-Félix Nothias, Luke R. Thompson, Jon G. Sanders, Rodolfo A. Salido, Sneha P. Couvillion, Asker D. Brejnrod, et al. 2022. “Standardized Multi-Omics of Earth’s Microbiomes Reveals Microbial and Metabolite Diversity.” Nature Microbiology 7 (12): 2128–50. https://doi.org/10.1038/s41564-022-01266-x. Shen, Xiaotao, Ruohong Wang, Xin Xiong, Yandong Yin, Yuping Cai, Zaijun Ma, Nan Liu, and Zheng-Jiang Zhu. 2019. “Metabolic Reaction Network-Based Recursive Metabolite Annotation for Untargeted Metabolomics.” Nature Communications 10 (1): 1–14. https://doi.org/10.1038/s41467-019-09550-x. Shen, Xiaotao, Hong Yan, Chuchu Wang, Peng Gao, Caroline H. Johnson, and Michael P. Snyder. 2022. “TidyMass an Object-Oriented Reproducible Analysis Framework for LC–MS Data.” Nature Communications 13 (1): 4365. https://doi.org/10.1038/s41467-022-32155-w. Shi, Jiachen, Jialiang Zhao, Yu Zhang, Yanan Wang, Chin Ping Tan, Yong-Jiang Xu, and Yuanfa Liu. 2023. “Windows Scanning Multiomics: Integrated Metabolomics and Proteomics.” Analytical Chemistry, December. https://doi.org/10.1021/acs.analchem.3c03785. Silva, Ricardo R. da, Mingxun Wang, Louis-Félix Nothias, Justin J. J. van der Hooft, Andrés Mauricio Caraballo-Rodríguez, Evan Fox, Marcy J. Balunas, Jonathan L. Klassen, Norberto Peporine Lopes, and Pieter C. Dorrestein. 2018. “Propagating Annotations of Molecular Networks Using in Silico Fragmentation.” PLOS Computational Biology 14 (4): e1006089. https://doi.org/10.1371/journal.pcbi.1006089. Silva, Ricardo R., Fabien Jourdan, Diego M. Salvanha, Fabien Letisse, Emilien L. Jamin, Simone Guidetti-Gonzalez, Carlos A. Labate, and Ricardo Z. N. Vêncio. 2014. 
“ProbMetab: An R Package for Bayesian Probabilistic Annotation of LC–MS-based Metabolomics.” Bioinformatics 30 (9): 1336–37. https://doi.org/10.1093/bioinformatics/btu019. Sindelar, Miriam, and Gary J. Patti. 2020. “Chemical Discovery in the Era of Metabolomics.” Journal of the American Chemical Society, April. https://doi.org/10.1021/jacs.9b13198. Siskos, Alexandros P., Pooja Jain, Werner Römisch-Margl, Mark Bennett, David Achaintre, Yasmin Asad, Luke Marney, et al. 2017. “Interlaboratory Reproducibility of a Targeted Metabolomics Platform for Analysis of Human Serum and Plasma.” Analytical Chemistry 89 (1): 656–65. https://doi.org/10.1021/acs.analchem.6b02930. Sitnikov, Dmitri G., Cian S. Monnin, and Dajana Vuckovic. 2016. “Systematic Assessment of Seven Solvent and Solid-Phase Extraction Methods for Metabolomics Analysis of Human Plasma by LC-MS.” Scientific Reports 6 (December). https://doi.org/10.1038/srep38885. Smirnov, Kirill S., Sara Forcisi, Franco Moritz, Marianna Lucio, and Philippe Schmitt-Kopplin. 2019. “Mass Difference Maps and Their Application for the Recalibration of Mass Spectrometric Data in Nontargeted Metabolomics.” Analytical Chemistry 91 (5): 3350–58. https://doi.org/10.1021/acs.analchem.8b04555. Smirnov, Kirill S., Tanja V. Maier, Alesia Walker, Silke S. Heinzmann, Sara Forcisi, Inés Martinez, Jens Walter, and Philippe Schmitt-Kopplin. 2016. “Challenges of Metabolomics in Human Gut Microbiota Research.” International Journal of Medical Microbiology, Intestinal microbiota - a microbial ecosystem at the edge between immune homeostasis and inflammation, 306 (5): 266–79. https://doi.org/10.1016/j.ijmm.2016.03.006. Smith, Colin A., Elizabeth J. Want, Grace O’Maille, Ruben Abagyan, and Gary Siuzdak. 2006. “XCMS:  Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification.” Analytical Chemistry 78 (3): 779–87. https://doi.org/10.1021/ac051437y. 
Spalding, Jonathan L., Kevin Cho, Nathaniel G. Mahieu, Igor Nikolskiy, Elizabeth M. Llufrio, Stephen L. Johnson, and Gary J. Patti. 2016. “Bar Coding MS2 Spectra for Metabolite Identification.” Analytical Chemistry 88 (5): 2538–42. https://doi.org/10.1021/acs.analchem.5b04925. Spicer, Rachel, Reza M. Salek, Pablo Moreno, Daniel Cañueto, and Christoph Steinbeck. 2017. “Navigating Freely-Available Software Tools for Metabolomics Analysis.” Metabolomics 13 (9). https://doi.org/10.1007/s11306-017-1242-7. Spratlin, Jennifer L., Natalie J. Serkova, and S. Gail Eckhardt. 2009. “Clinical Applications of Metabolomics in Oncology: A Review.” Clinical Cancer Research 15 (2): 431–40. https://doi.org/10.1158/1078-0432.CCR-08-1059. Stancliffe, Ethan, Michaela Schwaiger-Haber, Miriam Sindelar, Matthew J. Murphy, Mette Soerensen, and Gary J. Patti. 2022. “An Untargeted Metabolomics Workflow That Scales to Thousands of Samples for Population-Based Studies.” Analytical Chemistry, December. https://doi.org/10.1021/acs.analchem.2c01270. Stincone, Paolo, Abzer K. Pakkir Shah, Robin Schmid, Lana G. Graves, Stilianos P. Lambidis, Ralph R. Torres, Shu-Ning Xia, et al. 2023. “Evaluation of Data-Dependent MS/MS Acquisition Parameters for Non-Targeted Metabolomics and Molecular Networking of Environmental Samples: Focus on the Q Exactive Platform.” Analytical Chemistry, August. https://doi.org/10.1021/acs.analchem.3c01202. Styczynski, Mark P., Joel F. Moxley, Lily V. Tong, Jason L. Walther, Kyle L. Jensen, and Gregory N. Stephanopoulos. 2007. “Systematic Identification of Conserved Metabolites in GC/MS Data for Metabolomics and Biomarker Discovery.” Analytical Chemistry 79 (3): 966–73. https://doi.org/10.1021/ac0614846. Sumner, Lloyd W., Alexander Amberg, Dave Barrett, Michael H. Beale, Richard Beger, Clare A. Daykin, Teresa W.-M. Fan, et al. 
2007. “Proposed Minimum Reporting Standards for Chemical Analysis Chemical Analysis Working Group (CAWG) Metabolomics Standards Initiative (MSI).” Metabolomics: Official Journal of the Metabolomic Society 3 (3): 211–21. https://doi.org/10.1007/s11306-007-0082-2. Sumner, Lloyd W, Pedro Mendes, and Richard A Dixon. 2003. “Plant Metabolomics: Large-Scale Phytochemistry in the Functional Genomics Era.” Phytochemistry, Plant Metabolomics, 62 (6): 817–36. https://doi.org/10.1016/S0031-9422(02)00708-2. Sysi-Aho, Marko, Mikko Katajamaa, Laxman Yetukuri, and Matej Orešič. 2007. “Normalization Method for Metabolomics Data Using Optimal Selection of Multiple Internal Standards.” BMC Bioinformatics 8 (March): 93. https://doi.org/10.1186/1471-2105-8-93. Tang, Yanan, Caley B. Craven, Nicholas J. P. Wawryk, Junlang Qiu, Feng Li, and Xing-Fang Li. 2020. “Advances in Mass Spectrometry-Based Omics Analysis of Trace Organics in Water.” TrAC Trends in Analytical Chemistry 128 (July): 115918. https://doi.org/10.1016/j.trac.2020.115918. Tarakhovskaya, Elena, Andrea Marcillo, Caroline Davis, Sanja Milkovska-Stamenova, Antje Hutschenreuther, and Claudia Birkemeyer. 2023. “Matrix Effects in GC-MS Profiling of Common Metabolites After Trimethylsilyl Derivatization.” Molecules (Basel, Switzerland) 28 (6): 2653. https://doi.org/10.3390/molecules28062653. Tautenhahn, Ralf, Christoph Böttcher, and Steffen Neumann. 2008. “Highly Sensitive Feature Detection for High Resolution LC/MS.” BMC Bioinformatics 9: 504. https://doi.org/10.1186/1471-2105-9-504. Tautenhahn, Ralf, Kevin Cho, Winnie Uritboonthai, Zhengjiang Zhu, Gary J. Patti, and Gary Siuzdak. 2012. “An Accelerated Workflow for Untargeted Metabolomics Using the METLIN Database.” Nature Biotechnology 30 (9): 826–28. https://doi.org/10.1038/nbt.2348. Theodoridis, Georgios A., Helen G. Gika, Elizabeth J. Want, and Ian D. Wilson. 2012. 
“Liquid Chromatography–Mass Spectrometry Based Global Metabolite Profiling: A Review.” Analytica Chimica Acta 711 (January): 7–16. https://doi.org/10.1016/j.aca.2011.09.042. Thonusin, Chanisa, Heidi B. IglayReger, Tanu Soni, Amy E. Rothberg, Charles F. Burant, and Charles R. Evans. 2017. “Evaluation of Intensity Drift Correction Strategies Using MetaboDrift, a Normalization Tool for Multi-Batch Metabolomics Data.” Journal of Chromatography A, Pushing the Boundaries of Chromatography and Electrophoresis, 1523 (Supplement C): 265–74. https://doi.org/10.1016/j.chroma.2017.09.023. Tian, Leqi, Zhenjiang Li, Guoxuan Ma, Xiaoyue Zhang, Ziyin Tang, Siheng Wang, Jian Kang, Donghai Liang, and Tianwei Yu. 2022. “Metapone: A Bioconductor Package for Joint Pathway Testing for Untargeted Metabolomics Data.” Bioinformatics 38 (14): 3662–64. https://doi.org/10.1093/bioinformatics/btac364. Tian, Tze-Feng, San-Yuan Wang, Tien-Chueh Kuo, Cheng-En Tan, Guan-Yuan Chen, Ching-Hua Kuo, Chi-Hsin Sally Chen, Chang-Chuan Chan, Olivia A. Lin, and Y. Jane Tseng. 2016. “Web Server for Peak Detection, Baseline Correction, and Alignment in Two-Dimensional Gas Chromatography Mass Spectrometry-Based Metabolomics Data.” Analytical Chemistry 88 (21): 10395–403. https://doi.org/10.1021/acs.analchem.6b00755. Tian, Zhitao, Xin Hu, Yingying Xu, Mengmeng Liu, Hongbo Liu, Dongqin Li, Lisong Hu, Guozhu Wei, and Wei Chen. 2023. “PMhub 1.0: A Comprehensive Plant Metabolome Database.” Nucleic Acids Research, October, gkad811. https://doi.org/10.1093/nar/gkad811. Torigoe, Taihei, Masatomo Takahashi, Omidreza Heravizadeh, Kazuki Ikeda, Kohta Nakatani, Takeshi Bamba, and Yoshihiro Izumi. 2024. “Predicting Retention Time in Unified-Hydrophilic-Interaction/Anion-Exchange Liquid Chromatography High-Resolution Tandem Mass Spectrometry (Unified-HILIC/AEX/HRMS/MS) for Comprehensive Structural Annotation of Polar Metabolome.” Analytical Chemistry 96 (3): 1275–83. https://doi.org/10.1021/acs.analchem.3c04618. 
Treutler, Hendrik, and Steffen Neumann. 2016. “Prediction, Detection, and Validation of Isotope Clusters in Mass Spectrometry Data.” Metabolites 6 (4): 37. https://doi.org/10.3390/metabo6040037. Treutler, Hendrik, Hiroshi Tsugawa, Andrea Porzel, Karin Gorzolka, Alain Tissier, Steffen Neumann, and Gerd Ulrich Balcke. 2016. “Discovering Regulated Metabolite Families in Untargeted Metabolomics Studies.” Analytical Chemistry 88 (16): 8082–90. https://doi.org/10.1021/acs.analchem.6b01569. Tsou, Chih-Chiang, Dmitry Avtonomov, Brett Larsen, Monika Tucholska, Hyungwon Choi, Anne-Claude Gingras, and Alexey I. Nesvizhskii. 2015. “DIA-Umpire: Comprehensive Computational Framework for Data-Independent Acquisition Proteomics.” Nature Methods 12 (3): 258–64. https://doi.org/10.1038/nmeth.3255. Tsugawa, Hiroshi, Tomas Cajka, Tobias Kind, Yan Ma, Brendan Higgins, Kazutaka Ikeda, Mitsuhiro Kanazawa, Jean VanderGheynst, Oliver Fiehn, and Masanori Arita. 2015. “MS-DIAL: Data-Independent MS/MS Deconvolution for Comprehensive Metabolome Analysis.” Nature Methods 12 (6): 523–26. https://doi.org/10.1038/nmeth.3393. Tsugawa, Hiroshi, Tobias Kind, Ryo Nakabayashi, Daichi Yukihira, Wataru Tanaka, Tomas Cajka, Kazuki Saito, Oliver Fiehn, and Masanori Arita. 2016. “Hydrogen Rearrangement Rules: Computational MS/MS Fragmentation and Structure Elucidation Using MS-FINDER Software.” Analytical Chemistry 88 (16): 7946–58. https://doi.org/10.1021/acs.analchem.6b00770. Uchino, Haruki, Hiroshi Tsugawa, Hidenori Takahashi, and Makoto Arita. 2022. “Computational Mass Spectrometry Accelerates C=C Position-Resolved Untargeted Lipidomics Using Oxygen Attachment Dissociation.” Communications Chemistry 5 (1): 1–13. https://doi.org/10.1038/s42004-022-00778-1. Uppal, Karan, Quinlyn A. Soltow, Frederick H. Strobel, W. Stephen Pittard, Kim M. Gernert, Tianwei Yu, and Dean P. Jones. 2013. 
“xMSanalyzer: Automated Pipeline for Improved Feature Detection and Downstream Analysis of Large-Scale, Non-Targeted Metabolomics Data.” BMC Bioinformatics 14 (1): 15. https://doi.org/10.1186/1471-2105-14-15. Uppal, Karan, Douglas I. Walker, and Dean P. Jones. 2017. “xMSannotator: An R Package for Network-Based Annotation of High-Resolution Metabolomics Data.” Analytical Chemistry 89 (2): 1063–67. https://doi.org/10.1021/acs.analchem.6b01214. Uppal, Karan, Douglas I. Walker, Ken Liu, Shuzhao Li, Young-Mi Go, and Dean P. Jones. 2016. “Computational Metabolomics: A Framework for the Million Metabolome.” Chemical Research in Toxicology 29 (12): 1956–75. https://doi.org/10.1021/acs.chemrestox.6b00179. van der Kloet, Frans M., Ivana Bobeldijk, Elwin R. Verheij, and Renger H. Jellema. 2009. “Analytical Error Reduction Using Single Point Calibration for Accurate and Precise Metabolomic Phenotyping.” Journal of Proteome Research 8 (11): 5132–41. https://doi.org/10.1021/pr900499r. van Tetering, Lara, Sylvia Spies, Quirine D. K. Wildeman, Kas J. Houthuijs, Rianne E. van Outersterp, Jonathan Martens, Ron A. Wevers, David S. Wishart, Giel Berden, and Jos Oomens. 2024. “A Spectroscopic Test Suggests That Fragment Ion Structure Annotations in MS/MS Libraries Are Frequently Incorrect.” Communications Chemistry 7 (1): 1–11. https://doi.org/10.1038/s42004-024-01112-7. Verhoeven, Aswin, Martin Giera, and Oleg A. Mayboroda. 2020. “Scientific Workflow Managers in Metabolomics: An Overview.” Analyst 145 (11): 3801–8. https://doi.org/10.1039/D0AN00272K. Viant, Mark R., Timothy M. D. Ebbels, Richard D. Beger, Drew R. Ekman, David J. T. Epps, Hennicke Kamp, Pim E. G. Leonards, et al. 2019. “Use Cases, Best Practice and Reporting Standards for Metabolomics in Regulatory Toxicology.” Nature Communications 10 (1): 3041. https://doi.org/10.1038/s41467-019-10900-y. Viant, Mark R, Irwin J Kurland, Martin R Jones, and Warwick B Dunn. 2017. 
“How Close Are We to Complete Annotation of Metabolomes?” Current Opinion in Chemical Biology, Omics, 36 (February): 64–69. https://doi.org/10.1016/j.cbpa.2017.01.001. Vinaixa, Maria, Emma L. Schymanski, Steffen Neumann, Miriam Navarro, Reza M. Salek, and Oscar Yanes. 2016. “Mass Spectral Databases for LC/MS- and GC/MS-based Metabolomics: State of the Field and Future Prospects.” TrAC Trends in Analytical Chemistry 78 (April): 23–35. https://doi.org/10.1016/j.trac.2015.09.005. Vitale, Chiara Maria, Arjen Lommen, Carolin Huber, Kevin Wagner, Borja Garlito Molina, Rosalie Nijssen, Elliott James Price, et al. 2022. “Harmonized Quality Assurance/Quality Control Provisions for Nontargeted Measurement of Urinary Pesticide Biomarkers in the HBM4EU Multisite SPECIMEn Study.” Analytical Chemistry 94 (22): 7833–43. https://doi.org/10.1021/acs.analchem.2c00061. Volikov, Alexander, Gleb Rukhovich, and Irina V. Perminova. 2023. “NOMspectra: An Open-Source Python Package for Processing High Resolution Mass Spectrometry Data on Natural Organic Matter.” Journal of the American Society for Mass Spectrometry, June. https://doi.org/10.1021/jasms.3c00003. Wallach, Joshua D., Kevin W. Boyack, and John P. A. Ioannidis. 2018. “Reproducible Research Practices, Transparency, and Open Access Data in the Biomedical Literature, 2015–2017.” PLOS Biology 16 (11): e2006930. https://doi.org/10.1371/journal.pbio.2006930. Wandro, Stephen, Lisa Carmody, Tara Gallagher, John J. LiPuma, and Katrine Whiteson. 2017. “Making It Last: Storage Time and Temperature Have Differential Impacts on Metabolite Profiles of Airway Samples from Cystic Fibrosis Patients.” mSystems 2 (6). https://doi.org/10.1128/mSystems.00100-17. Wang, Mingxun, Jeremy J. Carver, Vanessa V. Phelan, Laura M. Sanchez, Neha Garg, Yao Peng, Don Duy Nguyen, et al. 2016. 
“Sharing and Community Curation of Mass Spectrometry Data with Global Natural Products Social Molecular Networking.” Nature Biotechnology 34 (8): 828–37. https://doi.org/10.1038/nbt.3597. Wang, Ruimin, Miaoshan Lu, Shaowei An, Jinyin Wang, and Changbin Yu. 2023. “G-Aligner: A Graph-Based Feature Alignment Method for Untargeted LC–MS-based Metabolomics.” BMC Bioinformatics 24 (1): 431. https://doi.org/10.1186/s12859-023-05525-4. Wang, Ruohong, Yandong Yin, and Zheng-Jiang Zhu. 2019. “Advancing Untargeted Metabolomics Using Data-Independent Acquisition Mass Spectrometry Technology.” Analytical and Bioanalytical Chemistry 411 (19): 4349–57. https://doi.org/10.1007/s00216-019-01709-1. Wang, San-Yuan, Ching-Hua Kuo, and Yufeng J. Tseng. 2013. “Batch Normalizer: A Fast Total Abundance Regression Calibration Method to Simultaneously Adjust Batch and Injection Order Effects in Liquid Chromatography/Time-of-Flight Mass Spectrometry-Based Metabolomics Data and Comparison with Current Calibration Methods.” Analytical Chemistry 85 (2): 1037–46. https://doi.org/10.1021/ac302877x. Wang, Suping, Xiaojuan Jiang, Rong Ding, Binbin Chen, Haiyan Lyu, Junyang Liu, Chunyan Zhu, et al. 2022. “MS-IDF: A Software Tool for Nontargeted Identification of Endogenous Metabolites After Chemical Isotope Labeling Based on a Narrow Mass Defect Filter.” Analytical Chemistry 94 (7): 3194–3202. https://doi.org/10.1021/acs.analchem.1c04719. Wang, Yang, Fang Liu, Peng Li, Chengwei He, Ruibing Wang, Huanxing Su, and Jian-Bo Wan. 2016. “An Improved Pseudotargeted Metabolomics Approach Using Multiple Ion Monitoring with Time-Staggered Ion Lists Based on Ultra-High Performance Liquid Chromatography/Quadrupole Time-of-Flight Mass Spectrometry.” Analytica Chimica Acta 927 (July): 82–88. https://doi.org/10.1016/j.aca.2016.05.008. Warth, Benedikt, Scott Spangler, Mingliang Fang, Caroline H. Johnson, Erica M. Forsberg, Ana Granados, Richard L. Martin, et al. 2017. 
“Exposome-Scale Investigations Guided by Global Metabolomics, Pathway Analysis, and Cognitive Computing.” Analytical Chemistry 89 (21): 11505–13. https://doi.org/10.1021/acs.analchem.7b02759. Weber, Ralf J. M., Thomas N. Lawson, Reza M. Salek, Timothy M. D. Ebbels, Robert C. Glen, Royston Goodacre, Julian L. Griffin, et al. 2017. “Computational Tools and Workflows in Metabolomics: An International Survey Highlights the Opportunity for Harmonisation Through Galaxy.” Metabolomics 13 (2). https://doi.org/10.1007/s11306-016-1147-x. Weber, Ralf J. M., and Mark R. Viant. 2010. “MI-Pack: Increased Confidence of Metabolite Identification in Mass Spectra by Integrating Accurate Masses and Metabolic Pathways.” Chemometrics and Intelligent Laboratory Systems, OMICS, 104 (1): 75–82. https://doi.org/10.1016/j.chemolab.2010.04.010. Wehrens, Ron, Tom G. Bloemberg, and Paul H. C. Eilers. 2015. “Fast Parametric Time Warping of Peak Lists.” Bioinformatics 31 (18): 3063–65. https://doi.org/10.1093/bioinformatics/btv299. Wei, Runmin, Jingye Wang, Mingming Su, Erik Jia, Shaoqiu Chen, Tianlu Chen, and Yan Ni. 2018. “Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data.” Scientific Reports 8 (1): 663. https://doi.org/10.1038/s41598-017-19120-0. Weljie, Aalim M., Jack Newton, Pascal Mercier, Erin Carlson, and Carolyn M. Slupsky. 2006. “Targeted Profiling:  Quantitative Analysis of 1H NMR Metabolomics Data.” Analytical Chemistry 78 (13): 4430–42. https://doi.org/10.1021/ac060209g. Wen, Bo, Zhanlong Mei, Chunwei Zeng, and Siqi Liu. 2017. “metaX: A Flexible and Comprehensive Software for Processing Metabolomics Data.” BMC Bioinformatics 18 (March): 183. https://doi.org/10.1186/s12859-017-1579-y. Wiklund, Susanne, Erik Johansson, Lina Sjöström, Ewa J. Mellerowicz, Ulf Edlund, John P. Shockcor, Johan Gottfries, Thomas Moritz, and Johan Trygg. 2008. 
“Visualization of GC/TOF-MS-Based Metabolomics Data for Identification of Biochemically Interesting Compounds Using OPLS Class Models.” Analytical Chemistry 80 (1): 115–22. https://doi.org/10.1021/ac0713510. Wise, Stephen A. 2022. “What If Using Certified Reference Materials (CRMs) Was a Requirement to Publish in Analytical/Bioanalytical Chemistry Journals?” Analytical and Bioanalytical Chemistry 414 (24): 7015–22. https://doi.org/10.1007/s00216-022-04163-8. Wishart, David S. 2016. “Emerging Applications of Metabolomics in Drug Discovery and Precision Medicine.” Nature Reviews Drug Discovery 15 (7): 473–84. https://doi.org/10.1038/nrd.2016.32. Witting, Michael, Christoph Ruttkies, Steffen Neumann, and Philippe Schmitt-Kopplin. 2017. “LipidFrag: Improving Reliability of in Silico Fragmentation of Lipids and Application to the Caenorhabditis Elegans Lipidome.” PLOS ONE 12 (3): e0172311. https://doi.org/10.1371/journal.pone.0172311. Wolf, Sebastian, Stephan Schmidt, Matthias Müller-Hannemann, and Steffen Neumann. 2010. “In Silico Fragmentation for Computer Assisted Identification of Metabolite Mass Spectra.” BMC Bioinformatics 11 (March): 148. https://doi.org/10.1186/1471-2105-11-148. Wolfender, Jean-Luc, Guillaume Marti, Aurélien Thomas, and Samuel Bertrand. 2015. “Current Approaches and Challenges for the Metabolite Profiling of Complex Natural Extracts.” Journal of Chromatography A, Editors’ Choice IX, 1382 (February): 136–64. https://doi.org/10.1016/j.chroma.2014.10.091. Wright, Elliott J., Daniel G. Beach, and Pearse McCarron. 2022. “Non-Target Analysis and Stability Assessment of Reference Materials Using Liquid Chromatography-High-Resolution Mass Spectrometry.” Analytica Chimica Acta 1201 (April): 339622. https://doi.org/10.1016/j.aca.2022.339622. Wu, Yiman, and Liang Li. 2016. “Sample Normalization Methods in Quantitative Metabolomics.” Journal of Chromatography A, Editors’ Choice X, 1430 (January): 80–95. https://doi.org/10.1016/j.chroma.2015.12.007. 
Xing, Shipei, Sam Shen, Banghua Xu, Xiaoxiao Li, and Tao Huan. 2023. “BUDDY: Molecular Formula Discovery via Bottom-up MS/MS Interrogation.” Nature Methods, April, 1–10. https://doi.org/10.1038/s41592-023-01850-x. Xu, Yi-Fan, Wenyun Lu, and Joshua D. Rabinowitz. 2015. “Avoiding Misannotation of In-Source Fragmentation Products as Cellular Metabolites in Liquid Chromatography–Mass Spectrometry-Based Metabolomics.” Analytical Chemistry 87 (4): 2273–81. https://doi.org/10.1021/ac504118y. Xue, Jingchuan, Rico J. E. Derks, Bill Webb, Elizabeth M. Billings, Aries Aisporna, Martin Giera, and Gary Siuzdak. 2021. “Single Quadrupole Multiple Fragment Ion Monitoring Quantitative Mass Spectrometry.” Analytical Chemistry 93 (31): 10879–89. https://doi.org/10.1021/acs.analchem.1c01246. Xue, Jingchuan, Xavier Domingo-Almenara, Carlos Guijas, Amelia Palermo, Markus M. Rinschen, John Isbell, H. Paul Benton, and Gary Siuzdak. 2020. “Enhanced in-Source Fragmentation Annotation Enables Novel Data Independent Acquisition and Autonomous METLIN Molecular Identification.” Analytical Chemistry 92 (8): 6051–59. https://doi.org/10.1021/acs.analchem.0c00409. Xue, Jingchuan, Carlos Guijas, H. Paul Benton, Benedikt Warth, and Gary Siuzdak. 2020. “METLIN MS2 Molecular Standards Database: A Broad Chemical and Biological Resource.” Nature Methods 17 (10): 953–54. https://doi.org/10.1038/s41592-020-0942-5. Xue, Jingchuan, Jiamin Zhu, Lixin Hu, Junjie Yang, Yunbo Lv, Fanrong Zhao, Yuxian Liu, Tao Zhang, Yanpeng Cai, and Mingliang Fang. 2023. “EISA-EXPOSOME: One Highly Sensitive and Autonomous Exposomic Platform with Enhanced in-Source Fragmentation/Annotation.” Analytical Chemistry, November. https://doi.org/10.1021/acs.analchem.3c02697. Yamamoto, Hiroyuki, Tamaki Fujimori, Hajime Sato, Gen Ishikawa, Kenjiro Kami, and Yoshiaki Ohashi. 2014. 
“Statistical Hypothesis Testing of Factor Loading in Principal Component Analysis and Its Application to Metabolite Set Enrichment Analysis.” BMC Bioinformatics 15 (February): 51. https://doi.org/10.1186/1471-2105-15-51. Yan, Binjun, Mengtian Shi, Siyu Cai, Yuan Su, Renhui Chen, Chiyuan Huang, and David Da Yong Chen. 2023. “Data-Driven Tool for Cross-Run Ion Selection and Peak-Picking in Quantitative Proteomics with Data-Independent Acquisition LC–MS/MS.” Analytical Chemistry 95 (45): 16558–66. https://doi.org/10.1021/acs.analchem.3c02689. Yang, Qingxia, Yunxia Wang, Ying Zhang, Fengcheng Li, Weiqi Xia, Ying Zhou, Yunqing Qiu, Honglin Li, and Feng Zhu. 2020. “NOREVA: Enhanced Normalization and Evaluation of Time-Course and Multi-Class Metabolomic Data.” Nucleic Acids Research 48 (W1): W436–48. https://doi.org/10.1093/nar/gkaa258. Yang, Qin, Shan-Shan Lin, Jiang-Tao Yang, Li-Juan Tang, and Ru-Qin Yu. 2017. “Detection of Inborn Errors of Metabolism Utilizing GC-MS Urinary Metabolomics Coupled with a Modified Orthogonal Partial Least Squares Discriminant Analysis.” Talanta 165 (April): 545–52. https://doi.org/10.1016/j.talanta.2017.01.018. Yang, Qiong, Hongchao Ji, Zhenbo Xu, Yiming Li, Pingshan Wang, Jinyu Sun, Xiaqiong Fan, Hailiang Zhang, Hongmei Lu, and Zhimin Zhang. 2023. “Ultra-Fast and Accurate Electron Ionization Mass Spectrum Matching for Compound Identification with Million-Scale in-Silico Library.” Nature Communications 14 (1): 3722. https://doi.org/10.1038/s41467-023-39279-7. Yang, Ruochen, Xi Chen, and Idoia Ochoa. 2019. “MassComp, a Lossless Compressor for Mass Spectrometry Data.” BMC Bioinformatics 20 (1): 368. https://doi.org/10.1186/s12859-019-2962-7. Yates Iii, John R. 2011. “A Century of Mass Spectrometry: From Atoms to Proteomes.” Nature Methods 8 (8): 633–37. https://doi.org/10.1038/nmeth.1659. Yu, Miao, Georgia Dolios, and Lauren Petrick. 2022. 
“Reproducible Untargeted Metabolomics Workflow for Exhaustive MS2 Data Acquisition of MS1 Features.” Journal of Cheminformatics 14 (1): 6. https://doi.org/10.1186/s13321-022-00586-8. Yu, Miao, Sofia Lendor, Anna Roszkowska, Mariola Olkowicz, Leslie Bragg, Mark Servos, and Janusz Pawliszyn. 2020. “Metabolic Profile of Fish Muscle Tissue Changes with Sampling Method, Storage Strategy and Time.” Analytica Chimica Acta 1136 (November): 42–50. https://doi.org/10.1016/j.aca.2020.08.050. Yu, Miao, Mariola Olkowicz, and Janusz Pawliszyn. 2019. “Structure/Reaction Directed Analysis for LC-MS Based Untargeted Analysis.” Analytica Chimica Acta 1050 (March): 16–24. https://doi.org/10.1016/j.aca.2018.10.062. Yu, Miao, Susan L. Teitelbaum, Georgia Dolios, Lam-Ha T. Dang, Peijun Tu, Mary S. Wolff, and Lauren M. Petrick. 2022. “Molecular Gatekeeper Discovery: Workflow for Linking Multiple Exposure Biomarkers to Metabolomics.” Environmental Science &amp; Technology 56 (10): 6162–71. https://doi.org/10.1021/acs.est.1c04039. Yu, Tianwei, Youngja Park, Jennifer M. Johnson, and Dean P. Jones. 2009. “apLCMS—Adaptive Processing of High-Resolution LC/MS Data.” Bioinformatics 25 (15): 1930–36. https://doi.org/10.1093/bioinformatics/btp291. Yu, Yong-Jie, Qing-Xia Zheng, Yue-Ming Zhang, Qian Zhang, Yu-Ying Zhang, Ping-Ping Liu, Peng Lu, et al. 2019. “Automatic Data Analysis Workflow for Ultra-High Performance Liquid Chromatography-High Resolution Mass Spectrometry-Based Metabolomics.” Journal of Chromatography A 1585 (January): 172–81. https://doi.org/10.1016/j.chroma.2018.11.070. Yu, Zhihao, Haylea C. Miller, Geoffrey J. Puzon, and Brian H. Clowers. 2017. “Development of Untargeted Metabolomics Methods for the Rapid Detection of Pathogenic Naegleria Fowleri.” Environmental Science &amp; Technology 51 (8): 4210–19. https://doi.org/10.1021/acs.est.6b05969. Yuan, Min, Susanne B. Breitkopf, Xuemei Yang, and John M. Asara. 2012. 
“A Positive/Negative Ion–Switching, Targeted Mass Spectrometry–Based Metabolomics Platform for Bodily Fluids, Cells, and Fresh and Fixed Tissue.” Nature Protocols 7 (5): 872–81. https://doi.org/10.1038/nprot.2012.024. Zenobi, R. 2013. “Single-Cell Metabolomics: Analytical and Biological Perspectives.” Science 342 (6163): 1243259. https://doi.org/10.1126/science.1243259. Zha, Haihong, Yuping Cai, Yandong Yin, Zhuozhong Wang, Kang Li, and Zheng-Jiang Zhu. 2018. “SWATHtoMRM: Development of High-Coverage Targeted Metabolomics Method Using SWATH Technology for Biomarker Discovery.” Analytical Chemistry 90 (6): 4062–70. https://doi.org/10.1021/acs.analchem.7b05318. Zhang, Aihua, Hui Sun, Ping Wang, Ying Han, and Xijun Wang. 2012. “Modern Analytical Techniques in Metabolomics Analysis.” The Analyst 137 (2): 293–300. https://doi.org/10.1039/C1AN15605E. Zhang, Xiuqiong, Zaifang Li, Chunxia Zhao, Tiantian Chen, Xinxin Wang, Xiaoshan Sun, Xinjie Zhao, Xin Lu, and Guowang Xu. 2024. “Leveraging Unidentified Metabolic Features for Key Pathway Discovery: Chemical Classification-driven Network Analysis in Untargeted Metabolomics.” Analytical Chemistry, February. https://doi.org/10.1021/acs.analchem.3c04591. Zhang, Yuhao, Jingyu Liao, Wanqi Le, Gaosong Wu, and Weidong Zhang. 2023. “Improving the Data Quality of Untargeted Metabolomics Through a Targeted Data-Dependent Acquisition Based on an Inclusion List of Differential and Preidentified Ions.” Analytical Chemistry 95 (34): 12964–73. https://doi.org/10.1021/acs.analchem.3c02888. Zhang, Yu-Ying, Qian Zhang, Yue-Ming Zhang, Wei-Wei Wang, Li Zhang, Yong-Jie Yu, Chang-Cai Bai, Ji-Zhao Guo, Hai-Yan Fu, and Yuanbin She. 2020. “A Comprehensive Automatic Data Analysis Strategy for Gas Chromatography-Mass Spectrometry Based Untargeted Metabolomics.” Journal of Chromatography A 1616 (April): 460787. https://doi.org/10.1016/j.chroma.2019.460787. Zhang, Zixuan, Huaxu Yu, Ethan Wong-Ma, Pouneh Dokouhaki, Ahmed Mostafa, Jay S. 
Shavadia, Fang Wu, and Tao Huan. 2024. “Reducing Quantitative Uncertainty Caused by Data Processing in Untargeted Metabolomics.” Analytical Chemistry 96 (9): 3727–32. https://doi.org/10.1021/acs.analchem.3c04046. Zhao, Fan, Shuai Huang, and Xiaozhe Zhang. 2021. “High Sensitivity and Specificity Feature Detection in Liquid Chromatography–Mass Spectrometry Data: A Deep Learning Framework.” Talanta 222 (January): 121580. https://doi.org/10.1016/j.talanta.2020.121580. Zhao, Shuang, and Liang Li. 2020. “Chemical Derivatization in LC-MS-based Metabolomics Study.” TrAC Trends in Analytical Chemistry 131 (October): 115988. https://doi.org/10.1016/j.trac.2020.115988. Zhao, Tingting, Shipei Xing, Huaxu Yu, and Tao Huan. 2023. “De Novo Cleaning of Chimeric MS/MS Spectra for LC-MS/MS-Based Metabolomics.” Analytical Chemistry 95 (35): 13018–28. https://doi.org/10.1021/acs.analchem.3c00736. Zheng, Fujian, Lei You, Wangshu Qin, Runze Ouyang, Wangjie Lv, Lei Guo, Xin Lu, Enyou Li, Xinjie Zhao, and Guowang Xu. 2022. “MetEx: A Targeted Extraction Strategy for Improving the Coverage and Accuracy of Metabolite Annotation in Liquid Chromatography–High-Resolution Mass Spectrometry Data.” Analytical Chemistry 94 (24): 8561–69. https://doi.org/10.1021/acs.analchem.1c04783. Zheng, Fujian, Xinjie Zhao, Zhongda Zeng, Lichao Wang, Wangjie Lv, Qingqing Wang, and Guowang Xu. 2020. “Development of a Plasma Pseudotargeted Metabolomics Method Based on Ultra-High-Performance Liquid Chromatography–Mass Spectrometry.” Nature Protocols 15 (8): 2519–37. https://doi.org/10.1038/s41596-020-0341-5. Zhou, Juntuo, and Yuxin Yin. 2016. “Strategies for Large-Scale Targeted Metabolomics Quantification by Liquid Chromatography-Mass Spectrometry.” Analyst 141 (23): 6362–73. https://doi.org/10.1039/C6AN01753C. Zhou, Zhiwei, Mingdu Luo, Haosong Zhang, Yandong Yin, Yuping Cai, and Zheng-Jiang Zhu. 2022. 
“Metabolite Annotation from Knowns to Unknowns Through Knowledge-Guided Multi-Layer Metabolic Networking.” Nature Communications 13 (1): 6656. https://doi.org/10.1038/s41467-022-34537-6. Zhu, Xiaochun, Yuping Chen, and Raju Subramanian. 2014. “Comparison of Information-Dependent Acquisition, SWATH, and MSAll Techniques in Metabolite Identification Study Employing Ultrahigh-Performance Liquid Chromatography–Quadrupole Time-of-Flight Mass Spectrometry.” Analytical Chemistry 86 (2): 1202–9. https://doi.org/10.1021/ac403385y. Zubeldia-Varela, Elisa, Domingo Barber, Coral Barbas, Marina Perez-Gordo, and David Rojo. 2020. “Sample Pre-Treatment Procedures for the Omics Analysis of Human Gut Microbiota: Turning Points, Tips and Tricks for Gene Sequencing and Metabolomics.” Journal of Pharmaceutical and Biomedical Analysis 191 (November): 113592. https://doi.org/10.1016/j.jpba.2020.113592. "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. "]]
+[["index.html", "Meta-Workflow Preface", " Meta-Workflow Miao YU 2024-04-10 Preface This is an online handout for mass spectrometry based metabolomics data analysis. It covers a full reproducible metabolomics workflow for data analysis and important topics related to metabolomics. Here is a list of topics: Sample collection Sample pretreatment Principles of metabolomics data analysis Software selection Batch correction Annotation Omics analysis Exposome This is a book written in Bookdown. You could contribute to it by a pull request on GitHub. A workshop based on this book could be found here. Meanwhile, a docker image xcmsrocker is available for reproducible metabolomics research. R and RStudio are the software needed in this workflow. "],["introduction.html", "Chapter 1 Introduction 1.1 History 1.2 Reviews and tutorials 1.3 Trends in Metabolomics 1.4 Workflow", " Chapter 1 Introduction Information in living organisms is communicated along the Central Dogma at different scales, from individual, population, and community to ecosystem. Metabolomics (i.e., the profiling and quantification of metabolites) is a relatively new field of “omics” studies. Unlike other omics studies, metabolomics focuses on small molecules (molecular weight below 1500 Da) with much lower mass than polypeptides and with singly or doubly charged ions. Here is a demo of the position of metabolomics in “omics” studies[@b.dunn2011]. Figure 1.1: The complex interactions of functional levels in biological systems. Metabolomics studies usually employ GC-MS(Theodoridis et al. 2012; Beale et al. 2018), GC×GC-MS(T.-F. Tian et al. 2016), LC-MS(Gika et al. 2014), LC-MS/MS(Begou et al. 2017), IM-MS(Levy et al. 2019), infrared ion spectroscopy(Martens et al. 2017) or NMR[@b.dunn2011] to measure metabolites. For analytical methods, this review could be checked(A. Zhang et al. 2012). The overall technical progress of metabolomics (2012-2018) could be found here(Miggiels et al. 2019). 
However, this workflow will only cover mass spectrometry based metabolomics or XC-MS based research. 1.1 History 1.1.1 History of Mass Spectrometry Here is a historical commentary on mass spectrometry(Yates Iii 2011). In detail, here is a summary: 1913, Sir Joseph John Thomson “Rays of Positive Electricity and Their Application to Chemical Analyses.” Figure 1.2: Sir Joseph John Thomson “Rays of Positive Electricity and Their Application to Chemical Analyses.” The petroleum industry brought mass spectrometry from physics to chemistry The first commercial mass spectrometer came from Consolidated Engineering Corp to analyze simple gas mixtures from petroleum In World War II, the U.S. used mass spectrometers to separate and enrich isotopes of uranium in the Manhattan Project The U.S. also used mass spectrometers for organic compounds during wartime and extended the application of mass spectrometers 1946, TOF, William E. Stephens 1970s, quadrupole mass analyzer 1970s, R. Graham Cooks developed mass-analyzed ion kinetic energy spectrometry, or MIKES, to make MRM analysis for multi-stage mass spectrometry 1980s, MALDI rescued TOF and mass spectrometry moved into biological applications 1990s, Orbitrap mass spectrometry 2010s, Aperture Coding mass spectrometry 1.1.2 History of Metabolomics You could check this report(Baker 2011). 
According to this book section(Kusonmano, Vongsangnak, and Chumnanpuen 2016): Figure 1.3: Metabolomics timeline during pre- and post-metabolomics era 2000-1500 BC some traditional Chinese doctors began to evaluate the glucose level in urine of diabetic patients using ants 300 BC ancient Egypt and Greece traditionally determined the urine taste to diagnose human diseases 1913 Joseph John Thomson and Francis William Aston mass spectrometry 1946 Felix Bloch and Edward Purcell Nuclear magnetic resonance late 1960s chromatographic separation technique 1971 Pauling’s research team “Quantitative Analysis of Urine Vapor and Breath by Gas–Liquid Partition Chromatography” Willmitzer and his research team were a pioneer group in metabolomics which suggested the promotion of the metabolomics field and its potential applications from agriculture to medicine and other related areas in the biological sciences 2007 Human Metabolome Project consists of databases of approximately 2500 metabolites, 1200 drugs, and 3500 food components post-metabolomics era high-throughput analytical techniques 1.1.3 Definition Metabolomics is a comprehensive analysis with identification and quantification of both known and unknown compounds in an unbiased way. Metabolic fingerprinting works on fast classification of samples based on metabolite data without quantifying or identifying the metabolites. Metabolite profiling always needs a pre-defined list of metabolites to be quantified(Madsen, Lundstedt, and Trygg 2010). Meanwhile, targeted and untargeted metabolomics are also used in publications. For targeted metabolomics, the majority of the molecules within a biological pathway or a defined group of related metabolites are determined. Sometimes a broad collection of known metabolites could also be referred to as targeted analysis. Untargeted analysis detects all possible metabolites in the samples of interest in an unbiased way. 
A similar concept called non-targeted analysis/screening describes similar studies or workflows. 1.2 Reviews and tutorials Some nice reviews and tutorials related to this workflow could be found in those papers or directly online: 1.2.1 Workflow Those papers are recommended(González-Riano et al. 2020; Pezzatti et al. 2020; X. Liu et al. 2019; Barnes et al. 2016a; Cajka and Fiehn 2016; Gika et al. 2014; Theodoridis et al. 2012; X. Lu and Xu 2008; Fiehn 2002) for general metabolomics related topics. For targeted metabolomics, you could check those reviews(Griffiths et al. 2010; W. Lu, Bennett, and Rabinowitz 2008; Weljie et al. 2006; Yuan et al. 2012; J. Zhou and Yin 2016; Begou et al. 2017). 1.2.2 Data analysis You could first read those papers(Barnes et al. 2016b; Kusonmano, Vongsangnak, and Chumnanpuen 2016; Madsen, Lundstedt, and Trygg 2010; Uppal et al. 2016; Alonso, Marsal, and Julià 2015) to get the concepts and issues for data analysis in metabolomics. Then this paper(Gromski et al. 2015) could be treated as a step-by-step tutorial. For GC-MS based metabolomics, check this paper(Rey-Stolle et al. 2022). This guide could be used to choose informatics software and tools for lipidomics(Z. Ni et al. 2022). For annotation, this paper(Domingo-Almenara, Montenegro-Burke, Benton, et al. 2018) is a well organized review. For databases used in metabolomics, you could check this review(Vinaixa et al. 2016). For metabolomics software, check this series of reviews for each year(Misra and van der Hooft 2016; Misra, Fahrmann, and Grapov 2017; Misra 2018). For open source software, those reviews(Chang et al. 2021; Spicer et al. 2017; Dryden et al. 2017) could be a good start. For DIA or DDA metabolomics, check those papers(Fenaille et al. 2017; Bilbao et al. 2015). Here are the slides for the metabolomics data analysis workshop, which I have presented twice, at UWaterloo and UC Irvine. 
Introduction Statistical Analysis Batch Correction Annotation 1.2.3 Application For environmental research related metabolomics or exposome, check those papers(Matich et al. 2019; Tang et al. 2020; Warth et al. 2017; Bundy, Davey, and Viant 2009). For toxicology, check this paper(Mark R. Viant et al. 2019). Check this piece(Wishart 2016) for drug discovery and precision medicine. For food chemistry, check this paper(Castro-Puyana et al. 2017), this paper for livestock(Goldansaz et al. 2017) and those papers for nutrition(Allam-Ndoul et al. 2016; Jones, Park, and Ziegler 2012; Müller and Bosy-Westphal 2020). For disease related metabolomics, check those papers for oncology(Spratlin, Serkova, and Eckhardt 2009) and cardiovascular disease(Cheng et al. 2017). This paper(Kennedy et al. 2018) covers metabolomics related clinical research. For plant science, check those papers(Lloyd W. Sumner, Mendes, and Dixon 2003; Jorge, Mata, and António 2016; Hansen and Lee 2018). For single cell metabolomics analysis, check here(Fessenden 2016; Zenobi 2013; Ali et al. 2019; Hansen and Lee 2018). For gut microbiota, check here(Smirnov et al. 2016). 1.2.4 Challenge General challenges for metabolomics studies could be found here (Schymanski and Williams 2017; Uppal et al. 2016; Schrimpe-Rutledge et al. 2016; Wolfender et al. 2015). For reproducible research, check those papers (Xinsong Du et al. 2022; Place et al. 2021; Verhoeven, Giera, and Mayboroda 2020; Mangul et al. 2019; Wallach, Boyack, and Ioannidis 2018; Hites and Jobst 2018; Considine et al. 2017; Sarpe and Schriemer 2017). To match data from different LC systems, M2S could be used(Climaco Pinto et al. 2022). Issues related to quantitative metabolomics could be found here(Kapoore and Vaidyanathan 2016; Jorge, Mata, and António 2016; Lv et al. 2022; Vitale et al. 2022). For quality control issues, check here(Dudzik et al. 2018; Siskos et al. 2017; Lloyd W. Sumner et al. 2007; Place et al. 2021; Corey D. Broeckling et al. 2023; González-Domínguez et al. 
2024). You might also try postcolumn infusion as a quality control tool(González, Dubbelman, and Hankemeier 2022). 1.3 Trends in Metabolomics library(rentrez) papers_by_year &lt;- function(years, search_term){ return(sapply(years, function(y) entrez_search(db=&quot;pubmed&quot;,term=search_term, mindate=y, maxdate=y, retmax=0)$count)) } years &lt;- 2002:2022 total_papers &lt;- papers_by_year(years, &quot;&quot;) omics &lt;- c(&quot;genomics&quot;, &quot;epigenomics&quot;, &quot;metagenomic&quot;, &quot;proteomics&quot;, &quot;transcriptomics&quot;,&quot;metabolomics&quot;,&quot;exposomics&quot;) trend_data &lt;- sapply(omics, function(t) papers_by_year(years, t)) trend_props &lt;- trend_data/total_papers library(reshape) library(ggplot2) trend_df &lt;- melt(data.frame(years, trend_data), id.vars=&quot;years&quot;) p &lt;- ggplot(trend_df, aes(years, value, colour=variable)) p + geom_line(size=1) + scale_y_log10(&quot;number of papers&quot;) + theme_bw() 1.4 Workflow References Ali, Ahmed, Yasmine Abouleila, Yoshihiro Shimizu, Eiso Hiyama, Samy Emara, Alireza Mashaghi, and Thomas Hankemeier. 2019. “Single-Cell Metabolomics by Mass Spectrometry: Advances, Challenges, and Future Applications.” TrAC Trends in Analytical Chemistry 120 (November): 115436. https://doi.org/10.1016/j.trac.2019.02.033. Allam-Ndoul, Bénédicte, Frédéric Guénard, Véronique Garneau, Hubert Cormier, Olivier Barbier, Louis Pérusse, and Marie-Claude Vohl. 2016. “Association Between Metabolite Profiles, Metabolic Syndrome and Obesity Status.” Nutrients 8 (6): 324. https://doi.org/10.3390/nu8060324. Alonso, Arnald, Sara Marsal, and Antonio Julià. 2015. “Analytical Methods in Untargeted Metabolomics: State of the Art in 2015.” Frontiers in Bioengineering and Biotechnology 3 (March). https://doi.org/10.3389/fbioe.2015.00023. Baker, Monya. 2011. “Metabolomics: From Small Molecules to Big Ideas.” Nature Methods 8 (2): 117–21. https://doi.org/10.1038/nmeth0211-117. Barnes, Stephen, H. 
Paul Benton, Krista Casazza, Sara J. Cooper, Xiangqin Cui, Xiuxia Du, Jeffrey Engler, et al. 2016a. “Training in Metabolomics Research. I. Designing the Experiment, Collecting and Extracting Samples and Generating Metabolomics Data.” Journal of Mass Spectrometry 51 (7): 461–75. https://doi.org/10.1002/jms.3782. ———, et al. 2016b. “Training in Metabolomics Research. II. Processing and Statistical Analysis of Metabolomics Data, Metabolite Identification, Pathway Analysis, Applications of Metabolomics and Its Future.” Journal of Mass Spectrometry 51 (8): 535–48. https://doi.org/10.1002/jms.3780. Beale, David J., Farhana R. Pinu, Konstantinos A. Kouremenos, Mahesha M. Poojary, Vinod K. Narayana, Berin A. Boughton, Komal Kanojia, Saravanan Dayalan, Oliver A. H. Jones, and Daniel A. Dias. 2018. “Review of Recent Developments in GC–MS Approaches to Metabolomics-Based Research.” Metabolomics 14 (11): 152. https://doi.org/10.1007/s11306-018-1449-2. Begou, O., H. G. Gika, I. D. Wilson, and G. Theodoridis. 2017. “Hyphenated MS-based Targeted Approaches in Metabolomics.” Analyst 142 (17): 3079–3100. https://doi.org/10.1039/C7AN00812K. Bilbao, Aivett, Emmanuel Varesio, Jeremy Luban, Caterina Strambio-De-Castillia, Gérard Hopfgartner, Markus Müller, and Frédérique Lisacek. 2015. “Processing Strategies and Software Solutions for Data-Independent Acquisition in Mass Spectrometry.” PROTEOMICS 15 (5-6): 964–80. https://doi.org/10.1002/pmic.201400323. Broeckling, Corey D., Richard D. Beger, Leo L. Cheng, Raquel Cumeras, Daniel J. Cuthbertson, Surendra Dasari, W. Clay Davis, et al. 2023. “Current Practices in LC-MS Untargeted Metabolomics: A Scoping Review on the Use of Pooled Quality Control Samples.” Analytical Chemistry 95 (51): 18645–54. https://doi.org/10.1021/acs.analchem.3c02924. Bundy, Jacob G., Matthew P. Davey, and Mark R. Viant. 2009. “Environmental Metabolomics: A Critical Review and Future Perspectives.” Metabolomics 5 (1): 3. https://doi.org/10.1007/s11306-008-0152-0. 
Cajka, Tomas, and Oliver Fiehn. 2016. “Toward Merging Untargeted and Targeted Methods in Mass Spectrometry-Based Metabolomics and Lipidomics.” Analytical Chemistry 88 (1): 524–45. https://doi.org/10.1021/acs.analchem.5b04491. Castro-Puyana, María, Raquel Pérez-Míguez, Lidia Montero, and Miguel Herrero. 2017. “Application of Mass Spectrometry-Based Metabolomics Approaches for Food Safety, Quality and Traceability.” TrAC Trends in Analytical Chemistry 93 (August): 102–18. https://doi.org/10.1016/j.trac.2017.05.004. Chang, Hui-Yin, Sean M. Colby, Xiuxia Du, Javier D. Gomez, Maximilian J. Helf, Katerina Kechris, Christine R. Kirkpatrick, et al. 2021. “A Practical Guide to Metabolomics Software Development.” Analytical Chemistry 93 (4): 1912–23. https://doi.org/10.1021/acs.analchem.0c03581. Cheng, Susan, Svati H. Shah, Elizabeth J. Corwin, Oliver Fiehn, Robert L. Fitzgerald, Robert E. Gerszten, Thomas Illig, et al. 2017. “Potential Impact and Study Considerations of Metabolomics in Cardiovascular Health and Disease: A Scientific Statement From the American Heart Association.” Circulation: Cardiovascular Genetics 10 (2): e000032. https://doi.org/10.1161/HCG.0000000000000032. Climaco Pinto, Rui, Ibrahim Karaman, Matthew R. Lewis, Jenny Hällqvist, Manuja Kaluarachchi, Gonçalo Graça, Elena Chekmeneva, et al. 2022. “Finding Correspondence Between Metabolomic Features in Untargeted Liquid Chromatography–Mass Spectrometry Metabolomics Datasets.” Analytical Chemistry 94 (14): 5493–503. https://doi.org/10.1021/acs.analchem.1c03592. Considine, E. C., G. Thomas, A. L. Boulesteix, A. S. Khashan, and L. C. Kenny. 2017. “Critical Review of Reporting of the Data Analysis Step in Metabolomics.” Metabolomics 14 (1): 7. https://doi.org/10.1007/s11306-017-1299-3. Domingo-Almenara, Xavier, J. Rafael Montenegro-Burke, H. Paul Benton, and Gary Siuzdak. 2018. “Annotation: A Computational Solution for Streamlining Metabolomics Analysis.” Analytical Chemistry 90 (1): 480–89. 
https://doi.org/10.1021/acs.analchem.7b03929. Dryden, Michael D. M., Ryan Fobel, Christian Fobel, and Aaron R. Wheeler. 2017. “Upon the Shoulders of Giants: Open-Source Hardware and Software in Analytical Chemistry.” Analytical Chemistry 89 (8): 4330–38. https://doi.org/10.1021/acs.analchem.7b00485. Du, Xinsong, Juan J. Aristizabal-Henao, Timothy J. Garrett, Mathias Brochhausen, William R. Hogan, and Dominick J. Lemas. 2022. “A Checklist for Reproducible Computational Analysis in Clinical Metabolomics Research.” Metabolites 12 (1): 87. https://doi.org/10.3390/metabo12010087. Dudzik, Danuta, Cecilia Barbas-Bernardos, Antonia García, and Coral Barbas. 2018. “Quality Assurance Procedures for Mass Spectrometry Untargeted Metabolomics. A Review.” Journal of Pharmaceutical and Biomedical Analysis, Review issue 2017, 147 (January): 149–73. https://doi.org/10.1016/j.jpba.2017.07.044. Fenaille, François, Pierre Barbier Saint-Hilaire, Kathleen Rousseau, and Christophe Junot. 2017. “Data Acquisition Workflows in Liquid Chromatography Coupled to High Resolution Mass Spectrometry-Based Metabolomics: Where Do We Stand?” Journal of Chromatography A 1526 (Supplement C): 1–12. https://doi.org/10.1016/j.chroma.2017.10.043. Fessenden, Marissa. 2016. “Metabolomics: Small Molecules, Single Cells.” Nature 540 (7631): 153–55. https://doi.org/10.1038/540153a. Fiehn, Oliver. 2002. “Metabolomics – the Link Between Genotypes and Phenotypes.” Plant Molecular Biology 48 (1): 155–71. https://doi.org/10.1023/A:1013713905833. Gika, Helen G., Georgios A. Theodoridis, Robert S. Plumb, and Ian D. Wilson. 2014. “Current Practice of Liquid Chromatography–Mass Spectrometry in Metabolomics and Metabonomics.” Journal of Pharmaceutical and Biomedical Analysis, Review Papers on Pharmaceutical and Biomedical Analysis 2013, 87 (January): 12–25. https://doi.org/10.1016/j.jpba.2013.06.032. Goldansaz, Seyed Ali, An Chi Guo, Tanvir Sajed, Michael A. Steele, Graham S. Plastow, and David S. Wishart. 2017. 
“Livestock Metabolomics and the Livestock Metabolome: A Systematic Review.” PLOS ONE 12 (5): e0177675. https://doi.org/10.1371/journal.pone.0177675. González, Oskar, Anne-Charlotte Dubbelman, and Thomas Hankemeier. 2022. “Postcolumn Infusion as a Quality Control Tool for LC-MS-Based Analysis.” Journal of the American Society for Mass Spectrometry, April. https://doi.org/10.1021/jasms.2c00022. González-Domínguez, Álvaro, Núria Estanyol-Torres, Carl Brunius, Rikard Landberg, and Raúl González-Domínguez. 2024. “QComics: Recommendations and Guidelines for Robust, Easily Implementable and Reportable Quality Control of Metabolomics Data.” Analytical Chemistry 96 (3): 1064–72. https://doi.org/10.1021/acs.analchem.3c03660. González-Riano, Carolina, Danuta Dudzik, Antonia Garcia, Alberto Gil-de-la-Fuente, Ana Gradillas, Joanna Godzien, Ángeles López-Gonzálvez, et al. 2020. “Recent Developments Along the Analytical Process for Metabolomics Workflows.” Analytical Chemistry 92 (1): 203–26. https://doi.org/10.1021/acs.analchem.9b04553. Griffiths, William J., Therese Koal, Yuqin Wang, Matthias Kohl, David P. Enot, and Hans-Peter Deigner. 2010. “Targeted Metabolomics for Biomarker Discovery.” Angewandte Chemie International Edition 49 (32): 5426–45. https://doi.org/10.1002/anie.200905579. Gromski, Piotr S., Howbeer Muhamadali, David I. Ellis, Yun Xu, Elon Correa, Michael L. Turner, and Royston Goodacre. 2015. “A Tutorial Review: Metabolomics and Partial Least Squares-Discriminant Analysis – a Marriage of Convenience or a Shotgun Wedding.” Analytica Chimica Acta 879 (June): 10–23. https://doi.org/10.1016/j.aca.2015.02.012. Hansen, Rebecca L., and Young Jin Lee. 2018. “High-Spatial Resolution Mass Spectrometry Imaging: Toward Single Cell Metabolomics in Plant Tissues.” The Chemical Record 18 (1): 65–77. https://doi.org/10.1002/tcr.201700027. Hites, Ronald A., and Karl J. Jobst. 2018. 
“Is Nontargeted Screening Reproducible?” Environmental Science &amp; Technology 52 (21): 11975–76. https://doi.org/10.1021/acs.est.8b05671. Jones, Dean P., Youngja Park, and Thomas R. Ziegler. 2012. “Nutritional Metabolomics: Progress in Addressing Complexity in Diet and Health.” Annual Review of Nutrition 32 (1): 183–202. https://doi.org/10.1146/annurev-nutr-072610-145159. Jorge, Tiago F., Ana T. Mata, and Carla António. 2016. “Mass Spectrometry as a Quantitative Tool in Plant Metabolomics.” Phil. Trans. R. Soc. A 374 (2079): 20150370. https://doi.org/10.1098/rsta.2015.0370. Kapoore, Rahul Vijay, and Seetharaman Vaidyanathan. 2016. “Towards Quantitative Mass Spectrometry-Based Metabolomics in Microbial and Mammalian Systems.” Phil. Trans. R. Soc. A 374 (2079): 20150363. https://doi.org/10.1098/rsta.2015.0363. Kennedy, Adam D., Bryan M. Wittmann, Anne M. Evans, Luke A. D. Miller, Douglas R. Toal, Shaun Lonergan, Sarah H. Elsea, and Kirk L. Pappan. 2018. “Metabolomics in the Clinic: A Review of the Shared and Unique Features of Untargeted Metabolomics for Clinical Research and Clinical Testing.” Journal of Mass Spectrometry 53 (11): 1143–54. https://doi.org/10.1002/jms.4292. Kusonmano, Kanthida, Wanwipa Vongsangnak, and Pramote Chumnanpuen. 2016. “Informatics for Metabolomics.” In Translational Biomedical Informatics, 91–115. Advances in Experimental Medicine and Biology. Springer, Singapore. https://doi.org/10.1007/978-981-10-1503-8_5. Levy, Allison J., Nicholas R. Oranzi, Atiye Ahmadireskety, Robin H. J. Kemperman, Michael S. Wei, and Richard A. Yost. 2019. “Recent Progress in Metabolomics Using Ion Mobility-Mass Spectrometry.” TrAC Trends in Analytical Chemistry 116 (July): 274–81. https://doi.org/10.1016/j.trac.2019.05.001. Liu, Xinyu, Lina Zhou, Xianzhe Shi, and Guowang Xu. 2019. “New Advances in Analytical Methods for Mass Spectrometry-Based Large-Scale Metabolomics Study.” TrAC Trends in Analytical Chemistry 121 (December): 115665. 
https://doi.org/10.1016/j.trac.2019.115665. Lu, Wenyun, Bryson D. Bennett, and Joshua D. Rabinowitz. 2008. “Analytical Strategies for LC–MS-based Targeted Metabolomics.” Journal of Chromatography B, Hyphenated Techniques for Global Metabolite Profiling, 871 (2): 236–42. https://doi.org/10.1016/j.jchromb.2008.04.031. Lu, Xin, and Guowang Xu. 2008. “LC-MS Metabonomics Methodology in Biomarker Discovery.” In Biomarker Methods in Drug Discovery and Development, edited by Feng Wang, 291–315. Methods in Pharmacology and Toxicology™. Humana Press. https://doi.org/10.1007/978-1-59745-463-6_14. Lv, Wangjie, Zhongda Zeng, Yuqing Zhang, Qingqing Wang, Lichao Wang, Zhaoxuan Zhang, Xianzhe Shi, Xinjie Zhao, and Guowang Xu. 2022. “Comprehensive Metabolite Quantitative Assay Based on Alternate Metabolomics and Lipidomics Analyses.” Analytica Chimica Acta 1215 (July): 339979. https://doi.org/10.1016/j.aca.2022.339979. Madsen, Rasmus, Torbjörn Lundstedt, and Johan Trygg. 2010. “Chemometrics in Metabolomics—A Review in Human Disease Diagnosis.” Analytica Chimica Acta 659 (1): 23–33. https://doi.org/10.1016/j.aca.2009.11.042. Mangul, Serghei, Thiago Mosqueiro, Richard J. Abdill, Dat Duong, Keith Mitchell, Varuni Sarwal, Brian Hill, et al. 2019. “Challenges and Recommendations to Improve the Installability and Archival Stability of Omics Computational Tools.” PLOS Biology 17 (6): e3000333. https://doi.org/10.1371/journal.pbio.3000333. Martens, Jonathan, Giel Berden, Rianne E. van Outersterp, Leo A. J. Kluijtmans, Udo F. Engelke, Clara D. M. van Karnebeek, Ron A. Wevers, and Jos Oomens. 2017. “Molecular Identification in Metabolomics Using Infrared Ion Spectroscopy.” Scientific Reports 7 (June). https://doi.org/10.1038/s41598-017-03387-4. Matich, Eryn K., Nita G. Chavez Soria, Diana S. Aga, and G. Ekin Atilla-Gokcumen. 2019. 
“Applications of Metabolomics in Assessing Ecological Effects of Emerging Contaminants and Pollutants on Plants.” Journal of Hazardous Materials 373 (July): 527–35. https://doi.org/10.1016/j.jhazmat.2019.02.084. Miggiels, Paul, Bert Wouters, Gerard J. P. van Westen, Anne-Charlotte Dubbelman, and Thomas Hankemeier. 2019. “Novel Technologies for Metabolomics: More for Less.” TrAC Trends in Analytical Chemistry 120 (November): 115323. https://doi.org/10.1016/j.trac.2018.11.021. Misra, Biswapriya B. 2018. “New Tools and Resources in Metabolomics: 2016–2017.” ELECTROPHORESIS 39 (7): 909–23. https://doi.org/10.1002/elps.201700441. Misra, Biswapriya B., Johannes F. Fahrmann, and Dmitry Grapov. 2017. “Review of Emerging Metabolomic Tools and Resources: 2015–2016.” ELECTROPHORESIS 38 (18): 2257–74. https://doi.org/10.1002/elps.201700110. Misra, Biswapriya B., and Justin J. J. van der Hooft. 2016. “Updates in Metabolomics Tools and Resources: 2014–2015.” ELECTROPHORESIS 37 (1): 86–110. https://doi.org/10.1002/elps.201500417. Müller, Manfred J., and Anja Bosy-Westphal. 2020. “From a ‘Metabolomics Fashion’ to a Sound Application of Metabolomics in Research on Human Nutrition.” European Journal of Clinical Nutrition 74 (12): 1619–29. https://doi.org/10.1038/s41430-020-00781-6. Ni, Zhixu, Michele Wölk, Geoff Jukes, Karla Mendivelso Espinosa, Robert Ahrends, Lucila Aimo, Jorge Alvarez-Jarreta, et al. 2022. “Guiding the Choice of Informatics Software and Tools for Lipidomics Research Applications.” Nature Methods, December, 1–12. https://doi.org/10.1038/s41592-022-01710-0. Pezzatti, Julian, Julien Boccard, Santiago Codesido, Yoric Gagnebin, Abhinav Joshi, Didier Picard, Víctor González-Ruiz, and Serge Rudaz. 2020. “Implementation of Liquid Chromatography–High Resolution Mass Spectrometry Methods for Untargeted Metabolomic Analyses of Biological Samples: A Tutorial.” Analytica Chimica Acta 1105 (April): 28–44. https://doi.org/10.1016/j.aca.2019.12.062. Place, Benjamin J., Elin M. 
Ulrich, Jonathan K. Challis, Alex Chao, Bowen Du, Kristin Favela, Yong-Lai Feng, et al. 2021. “An Introduction to the Benchmarking and Publications for Non-Targeted Analysis Working Group.” Analytical Chemistry 93 (49): 16289–96. https://doi.org/10.1021/acs.analchem.1c02660. Rey-Stolle, Fernanda, Danuta Dudzik, Carolina Gonzalez-Riano, Miguel Fernández-García, Vanesa Alonso-Herranz, David Rojo, Coral Barbas, and Antonia García. 2022. “Low and High Resolution Gas Chromatography-Mass Spectrometry for Untargeted Metabolomics: A Tutorial.” Analytica Chimica Acta 1210 (June): 339043. https://doi.org/10.1016/j.aca.2021.339043. Sarpe, Vladimir, and David C Schriemer. 2017. “Supporting Metabolomics with Adaptable Software: Design Architectures for the End-User.” Current Opinion in Biotechnology, Analytical biotechnology, 43 (February): 110–17. https://doi.org/10.1016/j.copbio.2016.11.001. Schrimpe-Rutledge, Alexandra C., Simona G. Codreanu, Stacy D. Sherrod, and John A. McLean. 2016. “Untargeted Metabolomics Strategies—Challenges and Emerging Directions.” Journal of The American Society for Mass Spectrometry 27 (12): 1897–1905. https://doi.org/10.1007/s13361-016-1469-y. Schymanski, Emma L., and Antony J. Williams. 2017. “Open Science for Identifying ‘Known Unknown’ Chemicals.” Environmental Science &amp; Technology 51 (10): 5357–59. https://doi.org/10.1021/acs.est.7b01908. Siskos, Alexandros P., Pooja Jain, Werner Römisch-Margl, Mark Bennett, David Achaintre, Yasmin Asad, Luke Marney, et al. 2017. “Interlaboratory Reproducibility of a Targeted Metabolomics Platform for Analysis of Human Serum and Plasma.” Analytical Chemistry 89 (1): 656–65. https://doi.org/10.1021/acs.analchem.6b02930. Smirnov, Kirill S., Tanja V. Maier, Alesia Walker, Silke S. Heinzmann, Sara Forcisi, Inés Martinez, Jens Walter, and Philippe Schmitt-Kopplin. 2016. 
“Challenges of Metabolomics in Human Gut Microbiota Research.” International Journal of Medical Microbiology, Intestinal microbiota - a microbial ecosystem at the edge between immune homeostasis and inflammation, 306 (5): 266–79. https://doi.org/10.1016/j.ijmm.2016.03.006. Spicer, Rachel, Reza M. Salek, Pablo Moreno, Daniel Cañueto, and Christoph Steinbeck. 2017. “Navigating Freely-Available Software Tools for Metabolomics Analysis.” Metabolomics 13 (9). https://doi.org/10.1007/s11306-017-1242-7. Spratlin, Jennifer L., Natalie J. Serkova, and S. Gail Eckhardt. 2009. “Clinical Applications of Metabolomics in Oncology: A Review.” Clinical Cancer Research 15 (2): 431–40. https://doi.org/10.1158/1078-0432.CCR-08-1059. Sumner, Lloyd W., Alexander Amberg, Dave Barrett, Michael H. Beale, Richard Beger, Clare A. Daykin, Teresa W.-M. Fan, et al. 2007. “Proposed Minimum Reporting Standards for Chemical Analysis Chemical Analysis Working Group (CAWG) Metabolomics Standards Initiative (MSI).” Metabolomics : Official Journal of the Metabolomic Society 3 (3): 211–21. https://doi.org/10.1007/s11306-007-0082-2. Sumner, Lloyd W, Pedro Mendes, and Richard A Dixon. 2003. “Plant Metabolomics: Large-Scale Phytochemistry in the Functional Genomics Era.” Phytochemistry, Plant Metabolomics, 62 (6): 817–36. https://doi.org/10.1016/S0031-9422(02)00708-2. Tang, Yanan, Caley B. Craven, Nicholas J. P. Wawryk, Junlang Qiu, Feng Li, and Xing-Fang Li. 2020. “Advances in Mass Spectrometry-Based Omics Analysis of Trace Organics in Water.” TrAC Trends in Analytical Chemistry 128 (July): 115918. https://doi.org/10.1016/j.trac.2020.115918. Theodoridis, Georgios A., Helen G. Gika, Elizabeth J. Want, and Ian D. Wilson. 2012. “Liquid Chromatography–Mass Spectrometry Based Global Metabolite Profiling: A Review.” Analytica Chimica Acta 711 (January): 7–16. https://doi.org/10.1016/j.aca.2011.09.042. 
Tian, Tze-Feng, San-Yuan Wang, Tien-Chueh Kuo, Cheng-En Tan, Guan-Yuan Chen, Ching-Hua Kuo, Chi-Hsin Sally Chen, Chang-Chuan Chan, Olivia A. Lin, and Y. Jane Tseng. 2016. “Web Server for Peak Detection, Baseline Correction, and Alignment in Two-Dimensional Gas Chromatography Mass Spectrometry-Based Metabolomics Data.” Analytical Chemistry 88 (21): 10395–403. https://doi.org/10.1021/acs.analchem.6b00755. Uppal, Karan, Douglas I. Walker, Ken Liu, Shuzhao Li, Young-Mi Go, and Dean P. Jones. 2016. “Computational Metabolomics: A Framework for the Million Metabolome.” Chemical Research in Toxicology 29 (12): 1956–75. https://doi.org/10.1021/acs.chemrestox.6b00179. Verhoeven, Aswin, Martin Giera, and Oleg A. Mayboroda. 2020. “Scientific Workflow Managers in Metabolomics: An Overview.” Analyst 145 (11): 3801–8. https://doi.org/10.1039/D0AN00272K. Viant, Mark R., Timothy M. D. Ebbels, Richard D. Beger, Drew R. Ekman, David J. T. Epps, Hennicke Kamp, Pim E. G. Leonards, et al. 2019. “Use Cases, Best Practice and Reporting Standards for Metabolomics in Regulatory Toxicology.” Nature Communications 10 (1): 3041. https://doi.org/10.1038/s41467-019-10900-y. Vinaixa, Maria, Emma L. Schymanski, Steffen Neumann, Miriam Navarro, Reza M. Salek, and Oscar Yanes. 2016. “Mass Spectral Databases for LC/MS- and GC/MS-based Metabolomics: State of the Field and Future Prospects.” TrAC Trends in Analytical Chemistry 78 (April): 23–35. https://doi.org/10.1016/j.trac.2015.09.005. Vitale, Chiara Maria, Arjen Lommen, Carolin Huber, Kevin Wagner, Borja Garlito Molina, Rosalie Nijssen, Elliott James Price, et al. 2022. “Harmonized Quality Assurance/Quality Control Provisions for Nontargeted Measurement of Urinary Pesticide Biomarkers in the HBM4EU Multisite SPECIMEn Study.” Analytical Chemistry 94 (22): 7833–43. https://doi.org/10.1021/acs.analchem.2c00061. Wallach, Joshua D., Kevin W. Boyack, and John P. A. Ioannidis. 2018. 
“Reproducible Research Practices, Transparency, and Open Access Data in the Biomedical Literature, 2015–2017.” PLOS Biology 16 (11): e2006930. https://doi.org/10.1371/journal.pbio.2006930. Warth, Benedikt, Scott Spangler, Mingliang Fang, Caroline H. Johnson, Erica M. Forsberg, Ana Granados, Richard L. Martin, et al. 2017. “Exposome-Scale Investigations Guided by Global Metabolomics, Pathway Analysis, and Cognitive Computing.” Analytical Chemistry 89 (21): 11505–13. https://doi.org/10.1021/acs.analchem.7b02759. Weljie, Aalim M., Jack Newton, Pascal Mercier, Erin Carlson, and Carolyn M. Slupsky. 2006. “Targeted Profiling:  Quantitative Analysis of 1H NMR Metabolomics Data.” Analytical Chemistry 78 (13): 4430–42. https://doi.org/10.1021/ac060209g. Wishart, David S. 2016. “Emerging Applications of Metabolomics in Drug Discovery and Precision Medicine.” Nature Reviews Drug Discovery 15 (7): 473–84. https://doi.org/10.1038/nrd.2016.32. Wolfender, Jean-Luc, Guillaume Marti, Aurélien Thomas, and Samuel Bertrand. 2015. “Current Approaches and Challenges for the Metabolite Profiling of Complex Natural Extracts.” Journal of Chromatography A, Editors’ Choice IX, 1382 (February): 136–64. https://doi.org/10.1016/j.chroma.2014.10.091. Yates Iii, John R. 2011. “A Century of Mass Spectrometry: From Atoms to Proteomes.” Nature Methods 8 (8): 633–37. https://doi.org/10.1038/nmeth.1659. Yuan, Min, Susanne B. Breitkopf, Xuemei Yang, and John M. Asara. 2012. “A Positive/Negative Ion–Switching, Targeted Mass Spectrometry–Based Metabolomics Platform for Bodily Fluids, Cells, and Fresh and Fixed Tissue.” Nature Protocols 7 (5): 872–81. https://doi.org/10.1038/nprot.2012.024. Zenobi, R. 2013. “Single-Cell Metabolomics: Analytical and Biological Perspectives.” Science 342 (6163): 1243259. https://doi.org/10.1126/science.1243259. Zhang, Aihua, Hui Sun, Ping Wang, Ying Han, and Xijun Wang. 2012. “Modern Analytical Techniques in Metabolomics Analysis.” The Analyst 137 (2): 293–300. 
https://doi.org/10.1039/C1AN15605E. Zhou, Juntuo, and Yuxin Yin. 2016. “Strategies for Large-Scale Targeted Metabolomics Quantification by Liquid Chromatography-Mass Spectrometry.” Analyst 141 (23): 6362–73. https://doi.org/10.1039/C6AN01753C. "],["experimental-designdoe.html", "Chapter 2 Experimental design(DoE) 2.1 Homogeneity study 2.2 Heterogeneity study 2.3 Power analysis 2.4 Optimization 2.5 Pooled QC", " Chapter 2 Experimental design(DoE) Before you perform any metabolomics experiment, a clean and meaningful experimental design is the best start. Depending on the research purpose, experimental designs can be classified into homogeneity and heterogeneity studies. Techniques such as isotope-labeled media will not be discussed in this chapter, but this paper(Jang, Chen, and Rabinowitz 2018) could be a good start. 2.1 Homogeneity study In a homogeneity study, the research purpose is method validation in most cases. A pooled sample made from multiple samples, or technical replicates from the same population, will be used. Variance within the samples should then be attributed to factors other than the samples themselves. For example, if we want to know whether the sample injection order affects the intensities of unknown peaks, one pooled sample or technical replicates should be used. Another experimental design for a homogeneity study uses biological replicates to find the common features of a group of samples. Biological replicates are samples from the same population undergoing the same biological process. For example, if we want to know the metabolite profile of a certain species, we could collect many individual samples from the population. Then only the peaks/compounds that appear in all samples will be used to describe the metabolite profile of this species. Technical replicates could also be used together with biological replicates. 2.2 Heterogeneity study In a heterogeneity study, the research purpose is to find the differences among samples. 
You need at least a baseline to perform the comparison. Such a baseline could be generated by a random process, control samples, or background knowledge. For example, outlier detection can be performed to find abnormal samples in an unsupervised manner. Distribution or spatial analysis could be used to find geological relationships of known and unknown compounds. Temporal trends in metabolite profiles could be found by time series or cohort studies. Clinical trials or randomized controlled trials are also an important class of heterogeneity studies. In these cases, you need at least two groups: a treated group and a control group. You could also treat this group information as the primary variable, or one of the primary variables, to be explored for certain research purposes. In the following discussion about experimental design, we will use the randomized controlled trial as a model to discuss important issues. 2.3 Power analysis Supposing we have control and treated groups, the number of samples in each group should be carefully calculated. For each metabolite, such a comparison could be treated as one t-test. You need to perform a power analysis to get the numbers. For example, we have two groups of samples with 10 samples in each group. Then we set the power (one minus the Type II error probability) at 0.9, the standard deviation at 1, and the significance level (Type I error probability) at 0.05. Under this experimental design, the detectable delta between the two groups is 1.53367, so only differences at least this large can be detected with the stated power. We could also set the delta to get the minimum number of samples in each group. To get values such as the standard deviation or delta for power analysis, you need to perform preliminary or pilot experiments. 
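As a minimal sketch of how pilot data could feed such a calculation, the R code below estimates a pooled standard deviation from two small pilot groups and passes it to power.t.test together with a chosen delta. The pilot intensities are simulated, and the names pilot_control, pilot_treated, and target_delta are hypothetical; only power.t.test and the other base R functions are real.

```r
# Simulated pilot intensities (log scale) for one metabolite; not real data.
set.seed(42)
pilot_control <- rnorm(6, mean = 10, sd = 1)
pilot_treated <- rnorm(6, mean = 11, sd = 1)

# Pooled standard deviation across the two pilot groups.
n1 <- length(pilot_control)
n2 <- length(pilot_treated)
sd_pooled <- sqrt(((n1 - 1) * var(pilot_control) +
                   (n2 - 1) * var(pilot_treated)) / (n1 + n2 - 2))

# Assumed smallest difference worth detecting; a design choice, not an estimate.
target_delta <- 1.5

# Solve for the per-group sample size at 90% power and a 0.05 significance level.
res <- power.t.test(delta = target_delta, sd = sd_pooled,
                    sig.level = 0.05, power = 0.9)
n_per_group <- ceiling(res$n)
print(n_per_group)
```

In an untargeted study this would be repeated per peak, so the resulting sample size is a distribution across peaks and not a single number.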
power.t.test(n=10,sd=1,sig.level = 0.05,power = 0.9) ## ## Two-sample t test power calculation ## ## n = 10 ## delta = 1.53367 ## sd = 1 ## sig.level = 0.05 ## power = 0.9 ## alternative = two.sided ## ## NOTE: n is number in *each* group power.t.test(delta = 5,sd=1,sig.level = 0.05,power = 0.9) ## ## Two-sample t test power calculation ## ## n = 2.328877 ## delta = 5 ## sd = 1 ## sig.level = 0.05 ## power = 0.9 ## alternative = two.sided ## ## NOTE: n is number in *each* group However, since sometimes we cannot perform a preliminary experiment, we could directly compute the power based on false discovery rate control. If the power is lower than a certain value, say 0.8, we exclude this peak from the significant features. In this review (Oberg and Vitek 2009), the authors suggest estimating an average \\(\\alpha\\) according to this equation (Benjamini and Hochberg 1995) and then calculating the sample numbers in the usual way: \\[ \\alpha_{ave} \\leq (1-\\beta_{ave})\\cdot q\\frac{1}{1+(1-q)\\cdot m_0/m_1} \\] Here \\(q\\) is the target false discovery rate, \\(1-\\beta_{ave}\\) is the average power, and \\(m_0/m_1\\) is the ratio of unchanged to changed features. Another study (Blaise et al. 2016) shows a simulation-based method to estimate the sample size. They used BY correction to limit the influence of correlations. Other investigations can be found here(Saccenti and Timmerman 2016; Blaise 2013). However, the nature of omics studies makes it hard to use one number for all metabolites in a power analysis, and all of these methods try to find a balance that represents more peaks with the fewest samples. MetSizeR GUI Tool for Estimating Sample Sizes for metabolomics Experiments(Nyamundanda et al. 2013). MSstats Protein/Peptide significance analysis (Choi et al. 2014). enviGCMS GC/LC-MS Data Analysis for Environmental Science(Z. Yu et al. 2017). 2.4 Optimization One experiment can contain many factors with different levels, and only one set of parameters for those factors will show the best sensitivity or reproducibility for a given study. 
To find this set of parameters, Plackett-Burman Design (PBD), Response Surface Methodology (RSM), Central Composite Design (CCD), and Taguchi methods could be used to optimize the parameters for a metabolomics study. The target could be the quality of peaks, the number of peaks, the stability of peak intensities, and/or a statistic combining those targets. You could check these papers for details(Jacyna, Kordalewska, and Markuszewski 2019; Box, Hunter, and Hunter 2005). 2.5 Pooled QC Pooled QC samples are unique and very important for metabolomics studies. Every 10 or 20 samples, a pooled sample made from all samples in the study and a blank sample should be injected as quality control samples. Pooled QC samples capture the changes during the instrumental analysis, and blank samples can tell where the variance comes from. Meanwhile, the start of the sequence should condition the column with pooled QC samples. The injection sequence should be randomized. These papers(Phapale et al. 2020; Dudzik et al. 2018; Dunn et al. 2012; Broadhurst et al. 2018; Corey D. Broeckling et al. 2023; González-Domínguez et al. 2024) should be read for details. If there are other co-factors, a linear model or randomization should be applied to eliminate their influence. You need to record the values of those co-factors for further data analysis. Common co-factors in metabolomics studies are age, gender, location, etc. If you need data correction, some background or calibration samples are required. However, control samples could also be used for data correction in certain DoE. Another important factor is the instrument. High-resolution mass spectrometry is generally preferred. As shown in Lukas’s study (Najdekr et al. 2016): the most effective mass resolving powers for profiling analyses of metabolite rich biofluids on the Orbitrap Elite were around 60000-120000 fwhm to retrieve the highest amount of information. The region between 400-800 m/z was influenced the most by resolution. 
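Returning to the injection sequence described at the start of this section, the R sketch below builds a randomized run order with a pooled QC and a blank after every 10 samples. It is only an illustration under stated assumptions: the sample IDs, the block size, and the five conditioning injections at the start (one reading of the remark about conditioning the column with pooled QC) are all hypothetical choices.

```r
# Hypothetical study: 25 samples, pooled QC and blank after every 10 injections.
set.seed(1)
samples <- sprintf('S%02d', 1:25)
qc_every <- 10
n_conditioning <- 5  # assumed number of column-conditioning QC injections

# Randomize the sample order, then cut it into blocks of qc_every samples.
run_order <- sample(samples)
blocks <- split(run_order, ceiling(seq_along(run_order) / qc_every))

# Append one pooled QC and one blank after each block.
blocked <- unlist(lapply(blocks, function(b) c(b, 'pooled_QC', 'blank')),
                  use.names = FALSE)

# Start the run with conditioning injections of the pooled QC.
injections <- c(rep('conditioning_QC', n_conditioning), blocked)
injection_table <- data.frame(order = seq_along(injections),
                              injection = injections)
print(head(injection_table, 8))
```

Keeping the table with the acquired data makes the injection order available later as a co-factor, for example when checking for drift.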
However, the elimination of peaks with high within-group RSD% is omitted by most studies. Based on a pre-experiment, you could get a description of the RSD% distribution and set a cut-off so that only stable peaks are used for further data analysis. To my knowledge, 30% is suitable considering the batch effects. Adding certified reference material or standard reference material will help to evaluate the quality of large-scale data collection or of important metabolites(Wise 2022; Wright, Beach, and McCarron 2022). For long-term quality control, ScreenDB provides a data analysis strategy for HRMS data founded on structured query language database archiving(Mardal et al. 2023). AVIR is a computational solution that automatically recognizes metabolic features with computational variation in a metabolomics data set(Z. Zhang et al. 2024). References Benjamini, Yoav, and Yosef Hochberg. 1995. “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing.” Journal of the Royal Statistical Society. Series B (Methodological) 57 (1): 289–300. https://www.jstor.org/stable/2346101. Blaise, Benjamin J. 2013. “Data-Driven Sample Size Determination for Metabolic Phenotyping Studies.” Analytical Chemistry 85 (19): 8943–50. https://doi.org/10.1021/ac4022314. Blaise, Benjamin J., Gonçalo Correia, Adrienne Tin, J. Hunter Young, Anne-Claire Vergnaud, Matthew Lewis, Jake T. M. Pearce, et al. 2016. “Power Analysis and Sample Size Determination in Metabolic Phenotyping.” Analytical Chemistry 88 (10): 5179–88. https://doi.org/10.1021/acs.analchem.6b00188. Box, George E. P., J. Stuart Hunter, and William G. Hunter. 2005. Statistics for Experimenters. Wiley-Interscience. Broadhurst, David, Royston Goodacre, Stacey N. Reinke, Julia Kuligowski, Ian D. Wilson, Matthew R. Lewis, and Warwick B. Dunn. 2018. 
“Guidelines and Considerations for the Use of System Suitability and Quality Control Samples in Mass Spectrometry Assays Applied in Untargeted Clinical Metabolomic Studies.” Metabolomics 14 (6). https://doi.org/10.1007/s11306-018-1367-3. Broeckling, Corey D., Richard D. Beger, Leo L. Cheng, Raquel Cumeras, Daniel J. Cuthbertson, Surendra Dasari, W. Clay Davis, et al. 2023. “Current Practices in LC-MS Untargeted Metabolomics: A Scoping Review on the Use of Pooled Quality Control Samples.” Analytical Chemistry 95 (51): 18645–54. https://doi.org/10.1021/acs.analchem.3c02924. Choi, Meena, Ching-Yun Chang, Timothy Clough, Daniel Broudy, Trevor Killeen, Brendan MacLean, and Olga Vitek. 2014. “MSstats: An R Package for Statistical Analysis of Quantitative Mass Spectrometry-Based Proteomic Experiments.” Bioinformatics 30 (17): 2524–26. https://doi.org/10.1093/bioinformatics/btu305. Dudzik, Danuta, Cecilia Barbas-Bernardos, Antonia García, and Coral Barbas. 2018. “Quality Assurance Procedures for Mass Spectrometry Untargeted Metabolomics. A Review.” Journal of Pharmaceutical and Biomedical Analysis, Review issue 2017, 147 (January): 149–73. https://doi.org/10.1016/j.jpba.2017.07.044. Dunn, Warwick B, Ian D Wilson, Andrew W Nicholls, and David Broadhurst. 2012. “The Importance of Experimental Design and QC Samples in Large-Scale and MS-driven Untargeted Metabolomic Studies of Humans.” Bioanalysis 4 (18): 2249–64. https://doi.org/10.4155/bio.12.204. González-Domínguez, Álvaro, Núria Estanyol-Torres, Carl Brunius, Rikard Landberg, and Raúl González-Domínguez. 2024. “QComics: Recommendations and Guidelines for Robust, Easily Implementable and Reportable Quality Control of Metabolomics Data.” Analytical Chemistry 96 (3): 1064–72. https://doi.org/10.1021/acs.analchem.3c03660. Jacyna, Julia, Marta Kordalewska, and Michał J. Markuszewski. 2019. 
“Design of Experiments in Metabolomics-Related Studies: An Overview.” Journal of Pharmaceutical and Biomedical Analysis 164 (February): 598–606. https://doi.org/10.1016/j.jpba.2018.11.027. Jang, Cholsoon, Li Chen, and Joshua D. Rabinowitz. 2018. “Metabolomics and Isotope Tracing.” Cell 173 (4): 822–37. https://doi.org/10.1016/j.cell.2018.03.055. Mardal, Marie, Petur W. Dalsgaard, Brian S. Rasmussen, Kristian Linnet, and Christian B. Mollerup. 2023. “Scalable Analysis of Untargeted LC-HRMS Data by Means of SQL Database Archiving.” Analytical Chemistry, February. https://doi.org/10.1021/acs.analchem.2c03769. Najdekr, Lukáš, David Friedecký, Ralf Tautenhahn, Tomáš Pluskal, Junhua Wang, Yingying Huang, and Tomáš Adam. 2016. “Influence of Mass Resolving Power in Orbital Ion-Trap Mass Spectrometry-Based Metabolomics.” Analytical Chemistry 88 (23): 11429–35. https://doi.org/10.1021/acs.analchem.6b02319. Nyamundanda, Gift, Isobel Claire Gormley, Yue Fan, William M. Gallagher, and Lorraine Brennan. 2013. “MetSizeR: Selecting the Optimal Sample Size for Metabolomic Studies Using an Analysis Based Approach.” BMC Bioinformatics 14: 338. https://doi.org/10.1186/1471-2105-14-338. Oberg, Ann L., and Olga Vitek. 2009. “Statistical Design of Quantitative Mass Spectrometry-Based Proteomic Experiments.” Journal of Proteome Research 8 (5): 2144–56. https://doi.org/10.1021/pr8010099. Phapale, Prasad, Vineeta Rai, Ashok Kumar Mohanty, and Sanjeeva Srivastava. 2020. “Untargeted Metabolomics Workshop Report: Quality Control Considerations from Sample Preparation to Data Analysis.” Journal of the American Society for Mass Spectrometry 31 (9): 2006–10. https://doi.org/10.1021/jasms.0c00224. Saccenti, Edoardo, and Marieke E. Timmerman. 2016. “Approaches to Sample Size Determination for Multivariate Data: Applications to PCA and PLS-DA of Omics Data.” Journal of Proteome Research 15 (8): 2379–93. https://doi.org/10.1021/acs.jproteome.5b01029. Wise, Stephen A. 2022. 
“What If Using Certified Reference Materials (CRMs) Was a Requirement to Publish in Analytical/Bioanalytical Chemistry Journals?” Analytical and Bioanalytical Chemistry 414 (24): 7015–22. https://doi.org/10.1007/s00216-022-04163-8. Wright, Elliott J., Daniel G. Beach, and Pearse McCarron. 2022. “Non-Target Analysis and Stability Assessment of Reference Materials Using Liquid Chromatography-High-Resolution Mass Spectrometry.” Analytica Chimica Acta 1201 (April): 339622. https://doi.org/10.1016/j.aca.2022.339622. Yu, Zhihao, Haylea C. Miller, Geoffrey J. Puzon, and Brian H. Clowers. 2017. “Development of Untargeted Metabolomics Methods for the Rapid Detection of Pathogenic Naegleria Fowleri.” Environmental Science &amp; Technology 51 (8): 4210–19. https://doi.org/10.1021/acs.est.6b05969. Zhang, Zixuan, Huaxu Yu, Ethan Wong-Ma, Pouneh Dokouhaki, Ahmed Mostafa, Jay S. Shavadia, Fang Wu, and Tao Huan. 2024. “Reducing Quantitative Uncertainty Caused by Data Processing in Untargeted Metabolomics.” Analytical Chemistry 96 (9): 3727–32. https://doi.org/10.1021/acs.analchem.3c04046. "],["pretreatment.html", "Chapter 3 Pretreatment 3.1 Collection 3.2 Quenching 3.3 Extraction 3.4 Derivatization 3.5 Isotope label 3.6 Storage", " Chapter 3 Pretreatment Pretreatment will affect the results of metabolomics and covers the sample treatment from crude samples to injection vials for instrumental analysis. The purpose of sample pretreatment is to retain more interesting compounds while removing unrelated compounds. For metabolomics studies, we might not know the ‘interesting’ compounds in advance, and which compounds are unrelated depends heavily on the research purpose. For example, Gel Permeation Chromatography (GPC), Florisil, alumina, or silica gel could be used to remove lipids, while alcohols and strong acids/bases could denature proteins to release more compounds. However, if we are interested in small lipids or peptides, such pretreatment methods should be changed. 
In general, sample collection, quenching, extraction methods, derivatization, and storage should be optimized in pretreatment. 3.1 Collection These papers investigated different fecal collection methods(Loftfield et al. 2016; Deda et al. 2017). This paper discusses the influence of sample normalization(Wu and Li 2016). 3.2 Quenching A quenching solvent is always used to stop enzymatic activity. In this review(W. Lu et al. 2017), the authors said: A classical approach, which works well for many analytes, is boiling ethanol. Although the boiling solvent raises concerns about thermal degradation, it reliably denatures enzymes. In contrast, cold organic solvent may not fully denature enzymes or may do so too slowly such that some metabolic reactions continue, interconverting metabolites during the quenching process. This paper(J. Kim et al. 2020) re-evaluated urease-dependent metabolome sample preparation and found: activities of urease and endogenous urinary enzymes and metabolite contaminants from the urease preparations introduce artefacts into metabolite profiles, thus leading to misinterpretation. 3.3 Extraction According to this research(Bennett et al. 2009): The total metabolome concentration is approximately 300 mM, whereas the protein concentration is approximately 7 mM. This implies that most cellular metabolites are in free form. Sitnikov et al.(Sitnikov, Monnin, and Vuckovic 2016) found that the most orthogonal methods to methanol-based precipitation were ion-exchange solid-phase extraction and liquid-liquid extraction using methyl-tertbutyl ether. Another study used stable isotope-labeled samples and supported the use of a water-methanol-acetonitrile mixture for global metabolite extraction instead of aqueous methanol or aqueous acetonitrile alone (Doppler et al. 2016). Metabolic information was highly influenced by the extraction solvent(Ibáñez et al. 2017). Tissue samples first need to be pulverized into fine powders. 
Feces collected with 95% ethanol or FOBT would be more reproducible and stable. In this review(W. Lu et al. 2017), the authors said: In our experience, for both cell and tissue specimens, 40:40:20 acetonitrile:methanol:water with 0.1 M formic acid (and subsequent neutralization with ammonium bicarbonate) is generally an effective solvent system for both quenching and extraction, including for ATP and other high-energy phosphorylated compounds. We typically use approximately 1 mL of solvent mix to extract 25 mg of biological specimen. …Thus, although drying is acceptable for most metabolites, care must be taken with redox-active species. Nano LC-MS could be used to analyze small numbers of cells(Luo and Li 2017). For plants like soybeans(Mahmud et al. 2017), ammonium acetate/methanol could be selected as the extraction strategy over water/methanol and sodium phosphate/methanol. For general plant samples, check this comprehensive investigation(Bijttebier et al. 2016). For blood plasma and serum samples, a comprehensive evaluation of 12 sample preparation methods (SPM) using phospholipid and protein removal plates (PLR), solid phase extraction plates (SPE), supported liquid extraction cartridges (SLE), and conventionally used protein precipitation (PPT) was performed. The results suggest using both PPT and PLR on the same samples through a simple analytical workflow, as their complementarity would broaden the visible chemical space (Chaker et al. 2022). 3.4 Derivatization Derivatization is always used in GC-based metabolomics studies. This paper(Miyagawa and Bamba 2019) compared sequential derivatization methods and found that different compounds show different fluctuations during the oximation or silylation process. This paper summarized derivatization methods for LC-MS (S. Zhao and Li 2020). 3.5 Isotope label You might try heavy water to exchange oxygen atoms with samples to track certain metabolites(Osipenko et al. 2022) or MS-IDF(S. Wang et al. 2022). 
3.6 Storage Samples should be stored after sample collection or sample pretreatment. Storage at -80°C or -20°C is generally preferred. Dry ice should be used during sample pretreatment. However, a comprehensive investigation of storage influences found that the metabolite profile changes after one day of storage at -80°C(M. Yu et al. 2020). Rapid analysis of samples should be considered to capture more accurate information in the samples. Storage conditions such as temperature and time can affect the metabolite composition of various samples. Laparre et al.(Laparre et al. 2017) noted that the metabolite profiles of urine samples were significantly changed after 5 days of storage at 4°C, while Wandro and colleagues(Wandro et al. 2017) observed that the metabolomic profiles of cystic fibrosis sputum samples underwent notable changes after only 1 day of storage at 4°C. Likewise, Roszkowska et al. demonstrated that various signaling molecules were lost from the lipidome profile of tissue after storing the samples for one year at -80°C (Roszkowska et al. 2018). To date, most metabolomics studies involving storage of samples prior to the analysis have used a storage temperature of -80°C, as previous investigations have shown that low temperatures or freeze-thaw cycles do not significantly change the metabolite profile of certain samples(Lin et al. 2007). For gut microbiota, this paper could be checked for storage issues(Zubeldia-Varela et al. 2020). For blood sample storage, you could check this paper(Hernandes, Barbas, and Dudzik 2017). For urine sample storage, check this(Laparre et al. 2017). This piece reviewed the stability of energy metabolites(Gil et al. 2015). References Bennett, Bryson D., Elizabeth H. Kimball, Melissa Gao, Robin Osterhout, Stephen J. Van Dien, and Joshua D. Rabinowitz. 2009. “Absolute Metabolite Concentrations and Implied Enzyme Active Site Occupancy in Escherichia Coli.” Nature Chemical Biology 5 (8): 593–99. 
https://doi.org/10.1038/nchembio.186. Bijttebier, Sebastiaan, Anastasia Van der Auwera, Kenn Foubert, Stefan Voorspoels, Luc Pieters, and Sandra Apers. 2016. “Bridging the Gap Between Comprehensive Extraction Protocols in Plant Metabolomics Studies and Method Validation.” Analytica Chimica Acta 935 (September): 136–50. https://doi.org/10.1016/j.aca.2016.06.047. Chaker, Jade, David Møbjerg Kristensen, Thorhallur Ingi Halldorsson, Sjurdur Frodi Olsen, Christine Monfort, Cécile Chevrier, Bernard Jégou, and Arthur David. 2022. “Comprehensive Evaluation of Blood Plasma and Serum Sample Preparations for HRMS-Based Chemical Exposomics: Overlaps and Specificities.” Analytical Chemistry 94 (2): 866–74. https://doi.org/10.1021/acs.analchem.1c03638. Deda, Olga, Anastasia Chrysovalantou Chatziioannou, Stella Fasoula, Dimitris Palachanis, Nicolaos Raikos, Georgios A. Theodoridis, and Helen G. Gika. 2017. “Sample Preparation Optimization in Fecal Metabolic Profiling.” Journal of Chromatography B, Advances in mass spectrometry-based applications, 1047 (March): 115–23. https://doi.org/10.1016/j.jchromb.2016.06.047. Doppler, Maria, Bernhard Kluger, Christoph Bueschl, Christina Schneider, Rudolf Krska, Sylvie Delcambre, Karsten Hiller, Marc Lemmens, and Rainer Schuhmacher. 2016. “Stable Isotope-Assisted Evaluation of Different Extraction Solvents for Untargeted Metabolomics of Plants.” International Journal of Molecular Sciences 17 (7). https://doi.org/10.3390/ijms17071017. Gil, Andres, David Siegel, Hjalmar Permentier, Dirk-Jan Reijngoud, Frank Dekker, and Rainer Bischoff. 2015. “Stability of Energy Metabolites—An Often Overlooked Issue in Metabolomics Studies: A Review.” ELECTROPHORESIS 36 (18): 2156–69. https://doi.org/10.1002/elps.201500031. Hernandes, Vinicius Veri, Coral Barbas, and Danuta Dudzik. 2017. “A Review of Blood Sample Handling and Pre-Processing for Metabolomics Studies.” ELECTROPHORESIS 38 (18): 2232–41. https://doi.org/10.1002/elps.201700086. 
Ibáñez, Clara, Lamia Mouhid, Guillermo Reglero, and Ana Ramírez de Molina. 2017. “Lipidomics Insights in Health and Nutritional Intervention Studies.” Journal of Agricultural and Food Chemistry 65 (36): 7827–42. https://doi.org/10.1021/acs.jafc.7b02643. Kim, Jungyeon, Joong Kyong Ahn, Yu Eun Cheong, Sung-Joon Lee, Hoon-Suk Cha, and Kyoung Heon Kim. 2020. “Systematic Re-Evaluation of the Long-Used Standard Protocol of Urease-Dependent Metabolome Sample Preparation.” PloS One 15 (3): e0230072. https://doi.org/10.1371/journal.pone.0230072. Laparre, Jérôme, Zied Kaabia, Mark Mooney, Tom Buckley, Mark Sherry, Bruno Le Bizec, and Gaud Dervilly-Pinel. 2017. “Impact of Storage Conditions on the Urinary Metabolomics Fingerprint.” Analytica Chimica Acta 951 (January): 99–107. https://doi.org/10.1016/j.aca.2016.11.055. Lin, Ching Yu, Huifeng Wu, Ronald S. Tjeerdema, and Mark R. Viant. 2007. “Evaluation of Metabolite Extraction Strategies from Tissue Samples Using NMR Metabolomics.” Metabolomics 3 (1): 55–67. https://doi.org/10.1007/s11306-006-0043-1. Loftfield, Erikka, Emily Vogtmann, Joshua N. Sampson, Steven C. Moore, Heidi Nelson, Rob Knight, Nicholas Chia, and Rashmi Sinha. 2016. “Comparison of Collection Methods for Fecal Samples for Discovery Metabolomics in Epidemiologic Studies.” Cancer Epidemiology and Prevention Biomarkers 25 (11): 1483–90. https://doi.org/10.1158/1055-9965.EPI-16-0409. Lu, Wenyun, Xiaoyang Su, Matthias S. Klein, Ian A. Lewis, Oliver Fiehn, and Joshua D. Rabinowitz. 2017. “Metabolite Measurement: Pitfalls to Avoid and Practices to Follow.” Annual Review of Biochemistry 86 (1): 277–304. https://doi.org/10.1146/annurev-biochem-061516-044952. Luo, Xian, and Liang Li. 2017. “Metabolomics of Small Numbers of Cells: Metabolomic Profiling of 100, 1000, and 10000 Human Breast Cancer Cells.” Analytical Chemistry 89 (21): 11664–71. https://doi.org/10.1021/acs.analchem.7b03100. Mahmud, Iqbal, Sandi Sternberg, Michael Williams, and Timothy J. Garrett. 2017. 
“Comparison of Global Metabolite Extraction Strategies for Soybeans Using UHPLC-HRMS.” Analytical and Bioanalytical Chemistry 409 (26): 6173–80. https://doi.org/10.1007/s00216-017-0557-6. Miyagawa, Hiromi, and Takeshi Bamba. 2019. “Comparison of Sequential Derivatization with Concurrent Methods for GC/MS-based Metabolomics.” Journal of Bioscience and Bioengineering 127 (2): 160–68. https://doi.org/10.1016/j.jbiosc.2018.07.015. Osipenko, Sergey, Alexander Zherebker, Lidiia Rumiantseva, Oxana Kovaleva, Evgeny N. Nikolaev, and Yury Kostyukevich. 2022. “Oxygen Isotope Exchange Reaction for Untargeted LC–MS Analysis.” Journal of the American Society for Mass Spectrometry 33 (2): 390–98. https://doi.org/10.1021/jasms.1c00383. Roszkowska, Anna, Miao Yu, Vincent Bessonneau, Leslie Bragg, Mark Servos, and Janusz Pawliszyn. 2018. “Tissue Storage Affects Lipidome Profiling in Comparison to in Vivo Microsampling Approach.” Scientific Reports 8 (1): 6980. https://doi.org/10.1038/s41598-018-25428-2. Sitnikov, Dmitri G., Cian S. Monnin, and Dajana Vuckovic. 2016. “Systematic Assessment of Seven Solvent and Solid-Phase Extraction Methods for Metabolomics Analysis of Human Plasma by LC-MS.” Scientific Reports 6 (December). https://doi.org/10.1038/srep38885. Wandro, Stephen, Lisa Carmody, Tara Gallagher, John J. LiPuma, and Katrine Whiteson. 2017. “Making It Last: Storage Time and Temperature Have Differential Impacts on Metabolite Profiles of Airway Samples from Cystic Fibrosis Patients.” mSystems 2 (6). https://doi.org/10.1128/mSystems.00100-17. Wang, Suping, Xiaojuan Jiang, Rong Ding, Binbin Chen, Haiyan Lyu, Junyang Liu, Chunyan Zhu, et al. 2022. “MS-IDF: A Software Tool for Nontargeted Identification of Endogenous Metabolites After Chemical Isotope Labeling Based on a Narrow Mass Defect Filter.” Analytical Chemistry 94 (7): 3194–3202. https://doi.org/10.1021/acs.analchem.1c04719. Wu, Yiman, and Liang Li. 2016. 
“Sample Normalization Methods in Quantitative Metabolomics.” Journal of Chromatography A, Editors’ Choice X, 1430 (January): 80–95. https://doi.org/10.1016/j.chroma.2015.12.007. Yu, Miao, Sofia Lendor, Anna Roszkowska, Mariola Olkowicz, Leslie Bragg, Mark Servos, and Janusz Pawliszyn. 2020. “Metabolic Profile of Fish Muscle Tissue Changes with Sampling Method, Storage Strategy and Time.” Analytica Chimica Acta 1136 (November): 42–50. https://doi.org/10.1016/j.aca.2020.08.050. Zhao, Shuang, and Liang Li. 2020. “Chemical Derivatization in LC-MS-based Metabolomics Study.” TrAC Trends in Analytical Chemistry 131 (October): 115988. https://doi.org/10.1016/j.trac.2020.115988. Zubeldia-Varela, Elisa, Domingo Barber, Coral Barbas, Marina Perez-Gordo, and David Rojo. 2020. “Sample Pre-Treatment Procedures for the Omics Analysis of Human Gut Microbiota: Turning Points, Tips and Tricks for Gene Sequencing and Metabolomics.” Journal of Pharmaceutical and Biomedical Analysis 191 (November): 113592. https://doi.org/10.1016/j.jpba.2020.113592. "],["instrumental-analysis.html", "Chapter 4 Instrumental analysis 4.1 Column and gradient selection 4.2 Mass resolution 4.3 Matrix effects", " Chapter 4 Instrumental analysis To capture as much information as possible from the samples, full scan mode is preferred on GC/LC-MS. Each scan collects a mass spectrum covering the configured mass range. If you narrow the mass range and keep the same scan time, each mass gets more collection time and the sensitivity increases. Conversely, if you expand the scan range, the sensitivity for each mass decreases. You could also extend the collection time for each scan, but that affects the separation, because full scan runs synchronously with the chromatographic separation. For a well-defined chromatographic peak, each peak should have at least 10 data points. Suppose you want to separate two peaks whose retention times differ by 10 s. 
Assuming the half peak width is 5 s, you need to collect 10 mass spectra within 10 s, so the dwell time for each scan is 1 s. If you use a high-resolution column and the half peak width is 1 s, you need to finish a scan within 0.2 s. As discussed above, a shorter dwell time decreases the sensitivity, so there is a trade-off between separation and sensitivity. With UPLC, the separation can be finished within 20 min, but you need to check whether your mass spectrometer still provides good sensitivity. A recent study (J. Cai and Yan 2021) shows that 6 points can be enough to generate peaks comparable to those with 20 points when the workflow is optimized. 4.1 Column and gradient selection For GC, a higher temperature releases compounds with higher boiling points. For LC, the gradient and the functional groups of the stationary phase are more important than temperature. The polarity of the samples and the column should match. A more polar solvent elutes polar compounds. A normal-phase column will not retain non-polar compounds, while a reversed-phase column will elute polar compounds at the very beginning. To cover compounds with a wide range of polarity or logP values, a normal-phase column should be paired with a non-polar to polar gradient to better separate polar compounds, while a reversed-phase column should be paired with a polar to non-polar gradient to elute compounds. If you use an inappropriate gradient order, your compounds will not be separated well. If you have no idea about column and gradient selection, check the conditions reported in the literature. Meanwhile, the pretreatment method should fit the column and gradient selection. You will get limited information by injecting non-polar extracts onto a normal-phase column, because nothing will be retained on the column. Improved chromatography conditions have been shown to improve annotation results (Anderson et al. 2021). 
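The scan-time arithmetic given at the start of this chapter can be written as a short worked calculation. This is only a sketch: the symbols w_{1/2} (half peak width), n (points per peak) and t_scan (time available per scan) are introduced here for illustration and are not notation from the text, and the peak is assumed to span roughly twice the half peak width.

```latex
t_{\mathrm{scan}} = \frac{2\,w_{1/2}}{n}
\qquad\Rightarrow\qquad
\frac{2 \times 5\ \mathrm{s}}{10} = 1\ \mathrm{s},
\qquad
\frac{2 \times 1\ \mathrm{s}}{10} = 0.2\ \mathrm{s}
```

Under this assumption, a five-fold narrower peak leaves one fifth of the time for each scan, which is the separation versus sensitivity trade-off described in the chapter introduction.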
You can also install polar and non-polar columns and run the separation on one column while conditioning the other, which could extend the chemical coverage (Flasch et al. 2022). A meta-analysis of chromatographic methods in EBI MetaboLights and NIH Metabolomics Workbench could guide labs without experience in metabolomics chromatographic methods (Harrieder et al. 2022). Sequential Quantification using Isotope Dilution (SQUID) combines serial sample injections into a continuous isocratic mobile phase, enabling rapid analysis of target molecules with high accuracy. It was demonstrated by detecting microbial polyamines in human urine samples with an LLOQ of 106 nM and analysis times as short as 57 s, and it is proposed as a high-throughput LC–MS tool for quantifying target biomarkers in large cohorts (Groves et al. 2023). 4.2 Mass resolution For metabolomics, high-resolution mass spectrometry should be used to make compound identification easier. The mass resolving power is very important for annotation, and high-resolution mass spectra should be calibrated in real time. The region between 400–800 m/z was influenced the most by resolution (Najdekr et al. 2016). The performance of the Orbitrap Fusion has been evaluated (Barbier Saint Hilaire et al. 2018), as has its comparison with Fourier transform ion cyclotron resonance (FT-ICR) (Ghaste, Mistrik, and Shulaev 2016; Huang et al. 2021). Mass Difference Maps can be used to recalibrate HRMS data (Smirnov et al. 2019). 4.3 Matrix effects Matrix effects can decrease the sensitivity of untargeted analysis. Such matrix effects could be checked by low resolution mass spectrometry (Z. Yu et al. 2017) and have been found for high resolution mass spectrometry (Calbiani et al. 2006). Ion suppression should also be considered a critical issue when comparing heterogeneous metabolic profiles (Ghosson et al. 2021). Matrix effects after trimethylsilyl derivatization have also been discussed (Tarakhovskaya et al. 2023). The study (Dagan et al. 
2023) investigated how matrix complexity affects nontargeted detection using LC-MS/MS analysis. Detection limits for trace compounds were significantly influenced by matrix complexity, with higher concentrations required for detection within the “top 1000” list than within the first 10,000 peaks, suggesting a negative power law relationship between peak location and concentration. The study also demonstrated a correlation between the power law coefficient and the dilution factor and showed the distribution of matrix peaks across various matrices, providing insights into the capabilities and limitations of LC-MS for analyzing nontargets in complex matrices. dist_loc &lt;- list.files( find.package(&quot;DiagrammeR&quot;), recursive = TRUE, pattern = &quot;mermaid.*js&quot;, full.names = TRUE ) js_cdn_url &lt;- &quot;https://cdnjs.cloudflare.com/ajax/libs/mermaid/9.0.1/mermaid.min.js&quot; download.file(js_cdn_url, dist_loc) References Anderson, Brady G., Alexander Raskind, Hani Habra, Robert T. Kennedy, and Charles R. Evans. 2021. “Modifying Chromatography Conditions for Improved Unknown Feature Identification in Untargeted Metabolomics.” Analytical Chemistry 93 (48): 15840–49. https://doi.org/10.1021/acs.analchem.1c02149. Barbier Saint Hilaire, Pierre, Ulli M. Hohenester, Benoit Colsch, Jean-Claude Tabet, Christophe Junot, and François Fenaille. 2018. “Evaluation of the High-Field Orbitrap Fusion for Compound Annotation in Metabolomics.” Analytical Chemistry 90 (5): 3030–35. https://doi.org/10.1021/acs.analchem.7b05372. Cai, Jingwei, and Zhengyin Yan. 2021. “Re-Examining the Impact of Minimal Scans in Liquid Chromatography–Mass Spectrometry Analysis.” Journal of the American Society for Mass Spectrometry, June. https://doi.org/10.1021/jasms.1c00073. Calbiani, F., M. Careri, L. Elviri, A. Mangia, and I. Zagnoni. 2006. 
“Matrix Effects on Accurate Mass Measurements of Low-Molecular Weight Compounds Using Liquid Chromatography-Electrospray-Quadrupole Time-of-Flight Mass Spectrometry.” Journal of Mass Spectrometry 41 (3): 289–94. https://doi.org/10.1002/jms.984. Dagan, Shai, Dana Marder, Nitzan Tzanani, Eyal Drug, Hagit Prihed, and Lilach Yishai-Aviram. 2023. “Evaluation of Matrix Complexity in Nontargeted Analysis of Small-Molecule Toxicants by Liquid Chromatography–High-Resolution Mass Spectrometry.” Analytical Chemistry 95 (20): 7924–32. https://doi.org/10.1021/acs.analchem.3c00413. Flasch, Mira, Veronika Fitz, Evelyn Rampler, Chibundu N. Ezekiel, Gunda Koellensperger, and Benedikt Warth. 2022. “Integrated Exposomics/Metabolomics for Rapid Exposure and Effect Analyses.” JACS Au 2 (11): 2548–60. https://doi.org/10.1021/jacsau.2c00433. Ghaste, Manoj, Robert Mistrik, and Vladimir Shulaev. 2016. “Applications of Fourier Transform Ion Cyclotron Resonance (FT-ICR) and Orbitrap Based High Resolution Mass Spectrometry in Metabolomics and Lipidomics.” International Journal of Molecular Sciences 17 (6). https://doi.org/10.3390/ijms17060816. Ghosson, Hikmat, Yann Guitton, Amani Ben Jrad, Chandrashekhar Patil, Delphine Raviglione, Marie-Virginie Salvia, and Cédric Bertrand. 2021. “Electrospray Ionization and Heterogeneous Matrix Effects in Liquid Chromatography/Mass Spectrometry Based Meta-Metabolomics: A Biomarker or a Suppressed Ion?” Rapid Communications in Mass Spectrometry 35 (2): e8977. https://doi.org/10.1002/rcm.8977. Groves, Ryan A., Carly C. Y. Chan, Spencer D. Wildman, Daniel B. Gregson, Thomas Rydzak, and Ian A. Lewis. 2023. “Rapid LC–MS Assay for Targeted Metabolite Quantification by Serial Injection into Isocratic Gradients.” Analytical and Bioanalytical Chemistry 415 (2): 269–76. https://doi.org/10.1007/s00216-022-04384-x. Harrieder, Eva-Maria, Fleming Kretschmer, Sebastian Böcker, and Michael Witting. 2022. 
“Current State-of-the-Art of Separation Methods Used in LC-MS Based Metabolomics and Lipidomics.” Journal of Chromatography B 1188 (January): 123069. https://doi.org/10.1016/j.jchromb.2021.123069. Huang, Danning, Marcos Bouza, David A. Gaul, Franklin E. Leach, I. Jonathan Amster, Frank C. Schroeder, Arthur S. Edison, and Facundo M. Fernández. 2021. “Comparison of High-Resolution Fourier Transform Mass Spectrometry Platforms for Putative Metabolite Annotation.” Analytical Chemistry, August. https://doi.org/10.1021/acs.analchem.1c02224. Najdekr, Lukáš, David Friedecký, Ralf Tautenhahn, Tomáš Pluskal, Junhua Wang, Yingying Huang, and Tomáš Adam. 2016. “Influence of Mass Resolving Power in Orbital Ion-Trap Mass Spectrometry-Based Metabolomics.” Analytical Chemistry 88 (23): 11429–35. https://doi.org/10.1021/acs.analchem.6b02319. Smirnov, Kirill S., Sara Forcisi, Franco Moritz, Marianna Lucio, and Philippe Schmitt-Kopplin. 2019. “Mass Difference Maps and Their Application for the Recalibration of Mass Spectrometric Data in Nontargeted Metabolomics.” Analytical Chemistry 91 (5): 3350–58. https://doi.org/10.1021/acs.analchem.8b04555. Tarakhovskaya, Elena, Andrea Marcillo, Caroline Davis, Sanja Milkovska-Stamenova, Antje Hutschenreuther, and Claudia Birkemeyer. 2023. “Matrix Effects in GC-MS Profiling of Common Metabolites After Trimethylsilyl Derivatization.” Molecules (Basel, Switzerland) 28 (6): 2653. https://doi.org/10.3390/molecules28062653. Yu, Zhihao, Haylea C. Miller, Geoffrey J. Puzon, and Brian H. Clowers. 2017. “Development of Untargeted Metabolomics Methods for the Rapid Detection of Pathogenic Naegleria Fowleri.” Environmental Science &amp; Technology 51 (8): 4210–19. https://doi.org/10.1021/acs.est.6b05969. 
"],["workflow-2.html", "Chapter 5 Workflow 5.1 Platform for metabolomics data analysis 5.2 Project Setup 5.3 Data sharing 5.4 Contest", " Chapter 5 Workflow You could check this book for metabolomics data analysis (S. Li 2020). DiagrammeR::mermaid(&quot; flowchart TB I(peak-picking) --&gt; C C(visualization) --&gt; D(normalization/batch correction) D --&gt; A(annotation/identification) A --&gt; H(statistical analysis) C --&gt; A --&gt; B(omics analysis) D --&gt; H B --&gt; H H --&gt; E(experimental validation) A --&gt; E H --&gt; A B --&gt; E C --&gt; H &quot;) 5.1 Platform for metabolomics data analysis Here is a list of related open source projects. 5.1.1 XCMS &amp; XCMS online XCMS online is hosted by Scripps Institute. If your datasets are not large, XCMS online would be the best option for you. Recently they updated the online version to support more functions for systems biology. They use METLIN and isoMETLIN to annotate the MS/MS data. Pathway analysis is also supported. In addition, to accelerate the process, XCMS online employs Stream (Windows only). You could use Stream to connect your instrument workstation to their server and process the data automatically during data acquisition. They also developed apps for XCMS online, but I think apps for Slack would be even cooler for controlling the data processing. xcms is different from XCMS online, although they might share the same code. I use it almost every day to run local metabolomics data analysis. Recently, they have been moving to xcms 3, with a major update to the object classes. Their data format is integrated with the MSnbase package, and the parameters are easy to set up for each step. Normally, I use msconvert-IPO-xcms-xMSannotator-metaboanalyst as the workflow to process data offline. It can be accelerated by parallel processing. However, if you are not familiar with R, you had better choose one of the software packages below. 
For xcms, 1000 files need around 5 hours to generate the peak list on a regular workstation. IPO is a tool for automated optimization of XCMS parameters (Libiseller et al. 2015), and Warpgroup is used for chromatogram subregion detection, consensus integration bound determination and accurate missing value integration (Mahieu, Spalding, and Patti 2016). A case study comparing different xcms parameters with IPO is available for GC-MS (Dos Santos and Canuto 2023). Another option is AutoTuner, which is much faster than IPO (McLean and Kujawinski 2020). MetaboAnalystR 3.0 can also optimize the parameters for xcms, although you need to perform the subsequent analysis within this software (Pang et al. 2020). For IPO, ten files need ~12 hours to generate the optimized results on a regular workstation. Paramounter directly measures universal parameters to process metabolomics data in a “White Box” (J. Guo, Shen, and Huan 2022). Another study used machine learning to compare different optimization methods and found that they all perform better than the default xcms settings (Lassen et al. 2021). The xcms workflow could also be extended to include ion mobility (Dodds et al. 2022). Check these papers for the XCMS-based workflow (Forsberg et al. 2018; Huan et al. 2017; Mahieu et al. 2016; Montenegro-Burke et al. 2017; Domingo-Almenara and Siuzdak 2020; Stancliffe et al. 2022). For METLIN-related annotation, check these papers (Guijas et al. 2018; Tautenhahn et al. 2012; Xue, Guijas, et al. 2020; Domingo-Almenara, Montenegro-Burke, Ivanisevic, et al. 2018). MAIT is based on xcms, and you could find the source code here (Fernández-Albert et al. 2014). iMet-Q is an automated tool with friendly user interfaces for quantifying metabolites in full-scan liquid chromatography-mass spectrometry (LC-MS) data (Chang et al. 2016). compMS2Miner is an automatable metabolite identification, visualization, and data-sharing R package for high-resolution LC–MS data sets. Here are the related papers (Edmands et al. 
2017; Edmands, Hayes, and Rappaport 2018; Edmands, Barupal, and Scalbert 2015). mzMatch is a modular, open source and platform independent data processing pipeline for metabolomics LC/MS data written in the Java language, which could be coupled with xcms (Scheltema et al. 2011; Creek et al. 2012). It can also be used for annotation with MetAssign (Daly et al. 2014). 5.1.2 PRIMe PRIMe is from RIKEN and UC Davis. They update their database frequently (Tsugawa et al. 2016). It supports mzML and major MS vendor formats. They defined their own file format, ABF, and an ecosystem for omics studies. The software is updated almost every day. You could use MS-DIAL for untargeted analysis and MRMPROBS for targeted analysis. For annotation, they developed MS-FINDER, and they provide statistical tools in Excel. This platform could replace expensive commercial software and is well prepared for MS/MS data analysis and lipidomics. The tools are open source, work on Windows, and can also run within Mathematica. However, they don’t cover pathway analysis. Another feature is that they always show the most recent spectral records from public repositories. You could always get the updated MSP spectra files for your own data analysis. For PRIMe-based workflows, check these papers (Lai et al. 2018; Matsuo et al. 2017; Treutler et al. 2016; Tsugawa et al. 2015; Tsugawa et al. 2016; Kind et al. 2018). There are also extensions for their workflow (Uchino et al. 2022) and a workflow for environmental science (Bonnefille et al. 2023). 5.1.3 GNPS GNPS is an open-access knowledge base for community-wide organization and sharing of raw, processed or identified tandem mass (MS/MS) spectrometry data. It is a straightforward annotation method for MS/MS data. Feature-based molecular networking (FBMN) within GNPS can be coupled with xcms, OpenMS, MS-DIAL, MZmine 2, and other popular software. GNPS also has a dashboard for online mass spectrometry data analysis (Petras et al. 2021). 
Check these papers for GNPS and related projects (Aron et al. 2020; Nothias et al. 2020; Scheubert et al. 2017; Ricardo R. da Silva et al. 2018; M. Wang et al. 2016; Bittremieux et al. 2023). 5.1.4 OpenMS &amp; SIRIUS OpenMS is another good platform for mass spectrometry data analysis, developed in C++. You could use its tools as plugins for KNIME. I suggest that anyone who wants to be a data scientist get familiar with platforms like KNIME, because they supply various APIs for different programming languages, which are easy to use and show every step to others. TOPPView in OpenMS could also be the best software to visualize MS data. You could always use the metabolomics workflow to train beginners on the details of data processing. pyOpenMS and OpenSWATH are also part of this platform. If you want to move into industry, this platform fits you best because you will get a clear idea about solutions and workflows. Check these papers for OpenMS-based workflows (Bertsch et al. 2011; Pfeuffer et al. 2017, 2024; Röst et al. 2014, 2016; Rurik et al. 2020; Alka et al. 2020). OpenMS can be coupled to SIRIUS 4 for annotation. SIRIUS is a Java-based software framework for de novo identification of metabolites using single and tandem mass spectrometry. The SIRIUS 4 project integrates a collection of tools, including CSI:FingerID, ZODIAC and CANOPUS. Check these papers for SIRIUS-based workflows (Dührkop et al. 2019, 2020; Alka et al. 2020; Ludwig et al. 2020). 5.1.5 MZmine 2 MZmine 2 has three versions developed on the Java platform, and the latest version is included in MSDK. Functions similar to those in XCMS online can be found in MZmine 2. However, MZmine 2 does not have pathway analysis. You could use metaboanalyst for that purpose. You could also look into MSDK to find similar functions supplied by ProteoSuite and OpenChrom. If you are an experienced Java coder, you should start here. Check these papers for MZmine-based workflows (Pluskal et al. 
2010; Pluskal et al. 2020). 5.1.6 Emory MaHPIC This platform is composed of several R packages from Emory University, including apLCMS to collect the data, xMSanalyzer to handle automated pipelines for large-scale, non-targeted metabolomics data, xMSannotator for annotation of LC-MS data and Mummichog for pathway and network analysis of high-throughput metabolomics data. This platform would be preferred by researchers in environmental science who study the exposome. You could check these papers for the Emory workflow (Uppal et al. 2013; Uppal, Walker, and Jones 2017; T. Yu et al. 2009; S. Li et al. 2013; Q. Liu et al. 2020). 5.1.7 Others PMDDA is a reproducible workflow for exhaustive MS2 data acquisition of MS1 features (M. Yu, Dolios, and Petrick 2022), with data and scripts available online. tidymass is an object-oriented reproducible analysis framework for LC–MS data (Shen et al. 2022). R for Mass Spectrometry is an R software collection for the analysis and interpretation of high throughput mass spectrometry assays. MAVEN is from Princeton University (Melamud, Vastag, and Rabinowitz 2010; Clasquin, Melamud, and Rabinowitz 2012). metabolomics is a CRAN package for analysis of metabolomics data. autoGCMSDataAnal is a Matlab-based comprehensive data analysis strategy for GC-MS-based untargeted metabolomics, and AntDAS2 provides an automatic data analysis strategy for UPLC-HRMS-based metabolomics (Y.-J. Yu et al. 2019; Y.-Y. Zhang et al. 2020). enviGCMS is for environmental non-targeted analysis and rmwf is for reproducible metabolomics workflows (M. Yu et al. 2020; M. Yu, Olkowicz, and Pawliszyn 2019). Pseudotargeted metabolomics methods are also available (Zheng et al. 2020; Y. Wang et al. 2016). pySM provides a reference implementation of a pipeline for False Discovery Rate-controlled metabolite annotation of high-resolution imaging mass spectrometry data (Palmer et al. 2017). TinyMS is a Python-based pipeline for preprocessing LC–MS data for untargeted metabolomics workflows (Riquelme et al. 
2020). MetaboliteDetector is a QT4-based software package for the analysis of GC/MS-based metabolomics data (Hiller et al. 2009). W4M and metaX can analyze data online (Giacomoni et al. 2015; Wen et al. 2017; Jalili et al. 2020). FTMSVisualization is a suite of tools for visualizing complex mixture FT-MS data (Kew et al. 2017). magma can predict and match MS/MS files. metabCombiner performs paired untargeted LC-HRMS metabolomics feature matching and concatenation of disparately acquired data sets (Habra et al. 2021). SLAW is a scalable and self-optimizing processing workflow for untargeted LC-MS with a Docker image (Delabriere et al. 2021). patRoon is an open source software platform for environmental mass spectrometry based non-target screening (Helmus et al. 2021). A ‘shape-orientated’ continuous wavelet transform (CWT)-based algorithm employing an adapted Marr wavelet (AMW) with a shape matching index (SMI), defined as the peak-height-normalized wavelet coefficient for feature filtering, was developed for chromatographic peak detection and quantification (Bai et al. 2022). automRm is an R package for fully automatic LC-QQQ-MS data preprocessing powered by machine learning (Eilertz, Mitterer, and Buescher 2022). IDSL.UFA: Intrinsic Peak Analysis (IPA) for HRMS data (Baygi et al. 2022). DEIMoS: an open-source tool for processing high-dimensional mass spectrometry data (Colby et al. 2022). Omics Untargeted Key Script is a set of tools that makes untargeted LC-MS metabolomic profiling with the latest computational features readily accessible to the research community in a ready-to-use, unified manner (Plyushchenko et al. 2022). MetEx is a targeted extraction strategy for improving the coverage and accuracy of metabolite annotation (Zheng et al. 2022). Asari: trackable and scalable LC-MS metabolomics data processing software in Python (S. Li et al. 
2023). NOMspectra: an open-source Python package for processing high resolution mass spectrometry data on natural organic matter (Volikov, Rukhovich, and Perminova 2023). MARS: a multipurpose software for untargeted LC–MS-based metabolomics and exposomics with a GUI in C++ (Goracci et al. 2024). MeRgeION: a multifunctional R pipeline for small molecule LC-MS/MS data processing, searching, and organizing (Y. Liu et al. 2023). 5.1.8 Workflow Comparison Here are some comparisons of different workflows, and you could make your selection based on them (Myers et al. 2017; Weber et al. 2017; Z. Li et al. 2018; Liao et al. 2023). xcmsrocker is a Docker image for metabolomics to compare R-based software with templates (M. Yu, Dolios, and Petrick 2022). 5.2 Project Setup I suggest building your data analysis projects in RStudio (Click File - New project - New directory - Empty project). Then assign a name to your project. I also recommend the following tips if you are familiar with them. Use git/GitHub for version control of your code and to sync your project online. Don’t use your own name for your project, because other people might collaborate with you and someone might check your data when you publish your papers. Each project should correspond to one paper or one chapter in your thesis. Use a workflow document (txt or doc) in your project to record all of the steps and code you performed for this project. 
Treat this document as the digital version of your experiment notebook. Use a data folder in your project folder for the raw data and the results you get in data analysis. Use a figure folder in your project folder for the figures. Use a manuscript folder in your project folder for the manuscript (you could write the paper in RStudio with the help of templates in R Markdown). Just double click \\[yourprojectname\\].Rproj to start your project. 5.3 Data sharing See this paper (Haug, Salek, and Steinbeck 2017): MetaboLights, EU based; The Metabolomics Workbench, US based; MetaboBank, Japan based; MetabolomeXchange, a search engine; MetabolomeExpress, a public place to process, interpret and share GC/MS metabolomics datasets (Carroll, Badger, and Harvey Millar 2010). 5.4 Contest CASMI is a contest on small molecule identification (Blaženović et al. 2017). References Alka, Oliver, Timo Sachsenberg, Leon Bichmann, Julianus Pfeuffer, Hendrik Weisser, Samuel Wein, Eugen Netz, Marc Rurik, Oliver Kohlbacher, and Hannes Röst. 2020. “CHAPTER 6:OpenMS and KNIME for Mass Spectrometry Data Processing.” In Processing Metabolomics and Proteomics Data with Open Software, 201–31. https://doi.org/10.1039/9781788019880-00201. Aron, Allegra T., Emily C. Gentry, Kerry L. McPhail, Louis-Félix Nothias, Mélissa Nothias-Esposito, Amina Bouslimani, Daniel Petras, et al. 2020. “Reproducible Molecular Networking of Untargeted Mass Spectrometry Data Using GNPS.” Nature Protocols 15 (6): 1954–91. https://doi.org/10.1038/s41596-020-0317-5. Bai, Caihong, Suyun Xu, Jingyi Tang, Yuxi Zhang, Jiahui Yang, and Kaifeng Hu. 2022. “A ‘Shape-Orientated’ Algorithm Employing an Adapted Marr Wavelet and Shape Matching Index Improves the Performance of Continuous Wavelet Transform for Chromatographic Peak Detection and Quantification.” Journal of Chromatography A 1673 (June): 463086. https://doi.org/10.1016/j.chroma.2022.463086. Baygi, Sadjad Fakouri, Sanjay K. Banerjee, Praloy Chakraborty, Yashwant Kumar, and Dinesh Kumar Barupal. 2022. 
“IDSL.UFA Assigns High-Confidence Molecular Formula Annotations for Untargeted LC/HRMS Data Sets in Metabolomics and Exposomics.” Analytical Chemistry 94 (39): 13315–22. https://doi.org/10.1021/acs.analchem.2c00563. Bertsch, Andreas, Clemens Gröpl, Knut Reinert, and Oliver Kohlbacher. 2011. “OpenMS and TOPP: Open Source Software for LC-MS Data Analysis.” In Data Mining in Proteomics: From Standards to Applications, edited by Michael Hamacher, Martin Eisenacher, and Christian Stephan, 353–67. Methods in Molecular Biology. Totowa, NJ: Humana Press. https://doi.org/10.1007/978-1-60761-987-1_23. Bittremieux, Wout, Nicole E. Avalon, Sydney P. Thomas, Sarvar A. Kakhkhorov, Alexander A. Aksenov, Paulo Wender P. Gomes, Christine M. Aceves, et al. 2023. “Open Access Repository-Scale Propagated Nearest Neighbor Suspect Spectral Library for Untargeted Metabolomics.” Nature Communications 14 (1): 8488. https://doi.org/10.1038/s41467-023-44035-y. Blaženović, Ivana, Tobias Kind, Hrvoje Torbašinović, Slobodan Obrenović, Sajjan S. Mehta, Hiroshi Tsugawa, Tobias Wermuth, et al. 2017. “Comprehensive Comparison of in Silico MS/MS Fragmentation Tools of the CASMI Contest: Database Boosting Is Needed to Achieve 93% Accuracy.” Journal of Cheminformatics 9 (1): 32. https://doi.org/10.1186/s13321-017-0219-x. Bonnefille, Bénilde, Oskar Karlsson, May Britt Rian, Rubhana Raqib, Faruque Parvez, Stefano Papazian, M. Sirajul Islam, and Jonathan W. Martin. 2023. “Nontarget Analysis of Polluted Surface Waters in Bangladesh Using Open Science Workflows.” Environmental Science &amp; Technology, April. https://doi.org/10.1021/acs.est.2c08200. Carroll, Adam J., Murray R. Badger, and A. Harvey Millar. 2010. “The MetabolomeExpress Project: Enabling Web-Based Processing, Analysis and Transparent Dissemination of GC/MS Metabolomics Datasets.” BMC Bioinformatics 11 (1): 376. https://doi.org/10.1186/1471-2105-11-376. Chang, Hui-Yin, Ching-Tai Chen, T. 
Mamie Lih, Ke-Shiuan Lynn, Chiun-Gung Juo, Wen-Lian Hsu, and Ting-Yi Sung. 2016. “iMet-Q: A User-Friendly Tool for Label-Free Metabolomics Quantitation Using Dynamic Peak-Width Determination.” PLOS ONE 11 (1): e0146112. https://doi.org/10.1371/journal.pone.0146112. Clasquin, Michelle F., Eugene Melamud, and Joshua D. Rabinowitz. 2012. “LC-MS Data Processing with MAVEN: A Metabolomic Analysis and Visualization Engine.” Current Protocols in Bioinformatics 37 (1): 14.11.1–23. https://doi.org/10.1002/0471250953.bi1411s37. Colby, Sean M., Christine H. Chang, Jessica L. Bade, Jamie R. Nunez, Madison R. Blumer, Daniel J. Orton, Kent J. Bloodsworth, et al. 2022. “DEIMoS: An Open-Source Tool for Processing High-Dimensional Mass Spectrometry Data.” Analytical Chemistry 94 (16): 6130–38. https://doi.org/10.1021/acs.analchem.1c05017. Creek, Darren J., Andris Jankevics, Karl E. V. Burgess, Rainer Breitling, and Michael P. Barrett. 2012. “IDEOM: An Excel Interface for Analysis of LC–MS-based Metabolomics Data.” Bioinformatics 28 (7): 1048–49. https://doi.org/10.1093/bioinformatics/bts069. Daly, Rónán, Simon Rogers, Joe Wandy, Andris Jankevics, Karl E. V. Burgess, and Rainer Breitling. 2014. “MetAssign: Probabilistic Annotation of Metabolites from LC–MS Data Using a Bayesian Clustering Approach.” Bioinformatics 30 (19): 2764–71. https://doi.org/10.1093/bioinformatics/btu370. Delabriere, Alexis, Philipp Warmer, Vincenth Brennsteiner, and Nicola Zamboni. 2021. “SLAW: A Scalable and Self-Optimizing Processing Workflow for Untargeted LC-MS.” Analytical Chemistry 93 (45): 15024–32. https://doi.org/10.1021/acs.analchem.1c02687. Dodds, James N., Lingjue Wang, Gary J. Patti, and Erin S. Baker. 2022. “Combining Isotopologue Workflows and Simultaneous Multidimensional Separations to Detect, Identify, and Validate Metabolites in Untargeted Analyses.” Analytical Chemistry 94 (5): 2527–35. https://doi.org/10.1021/acs.analchem.1c04430. Domingo-Almenara, Xavier, J. 
Rafael Montenegro-Burke, Julijana Ivanisevic, Aurelien Thomas, Jonathan Sidibé, Tony Teav, Carlos Guijas, et al. 2018. “XCMS-MRM and METLIN-MRM: A Cloud Library and Public Resource for Targeted Analysis of Small Molecules.” Nature Methods 15 (9): 681–84. https://doi.org/10.1038/s41592-018-0110-3. Domingo-Almenara, Xavier, and Gary Siuzdak. 2020. “Metabolomics Data Processing Using XCMS.” In Computational Methods and Data Analysis for Metabolomics, edited by Shuzhao Li, 11–24. Methods in Molecular Biology. New York, NY: Springer US. https://doi.org/10.1007/978-1-0716-0239-3_2. Dos Santos, Emile Kelly Porto, and Gisele André Baptista Canuto. 2023. “Optimizing XCMS Parameters for GC-MS Metabolomics Data Processing: A Case Study.” Metabolomics: Official Journal of the Metabolomic Society 19 (4): 26. https://doi.org/10.1007/s11306-023-01992-1. Dührkop, Kai, Markus Fleischauer, Marcus Ludwig, Alexander A. Aksenov, Alexey V. Melnik, Marvin Meusel, Pieter C. Dorrestein, Juho Rousu, and Sebastian Böcker. 2019. “SIRIUS 4: A Rapid Tool for Turning Tandem Mass Spectra into Metabolite Structure Information.” Nature Methods 16 (4): 299–302. https://doi.org/10.1038/s41592-019-0344-8. Dührkop, Kai, Louis-Félix Nothias, Markus Fleischauer, Raphael Reher, Marcus Ludwig, Martin A. Hoffmann, Daniel Petras, et al. 2020. “Systematic Classification of Unknown Metabolites Using High-Resolution Fragmentation Mass Spectra.” Nature Biotechnology, November, 1–10. https://doi.org/10.1038/s41587-020-0740-8. Edmands, William M. B., Dinesh K. Barupal, and Augustin Scalbert. 2015. “MetMSLine: An Automated and Fully Integrated Pipeline for Rapid Processing of High-Resolution LC–MS Metabolomic Datasets.” Bioinformatics 31 (5): 788–90. https://doi.org/10.1093/bioinformatics/btu705. Edmands, William M. B., Josie Hayes, and Stephen M. Rappaport. 2018. “SimExTargId: A Comprehensive Package for Real-Time LC-MS Data Acquisition and Analysis.” Bioinformatics 34 (20): 3589–90. 
https://doi.org/10.1093/bioinformatics/bty218. Edmands, William M. B., Lauren Petrick, Dinesh K. Barupal, Augustin Scalbert, Mark J. Wilson, Jeffrey K. Wickliffe, and Stephen M. Rappaport. 2017. “compMS2Miner: An Automatable Metabolite Identification, Visualization, and Data-Sharing R Package for High-Resolution LC–MS Data Sets.” Analytical Chemistry 89 (7): 3919–28. https://doi.org/10.1021/acs.analchem.6b02394. Eilertz, Daniel, Michael Mitterer, and Joerg M. Buescher. 2022. “automRm: An R Package for Fully Automatic LC-QQQ-MS Data Preprocessing Powered by Machine Learning.” Analytical Chemistry 94 (16): 6163–71. https://doi.org/10.1021/acs.analchem.1c05224. Fernández-Albert, Francesc, Rafael Llorach, Cristina Andrés-Lacueva, and Alexandre Perera. 2014. “An R Package to Analyse LC/MS Metabolomic Data: MAIT (Metabolite Automatic Identification Toolkit).” Bioinformatics 30 (13): 1937–39. https://doi.org/10.1093/bioinformatics/btu136. Forsberg, Erica M., Tao Huan, Duane Rinehart, H. Paul Benton, Benedikt Warth, Brian Hilmers, and Gary Siuzdak. 2018. “Data Processing, Multi-Omic Pathway Mapping, and Metabolite Activity Analysis Using XCMS Online.” Nature Protocols 13 (4): 633–51. https://doi.org/10.1038/nprot.2017.151. Giacomoni, Franck, Gildas Le Corguillé, Misharl Monsoor, Marion Landi, Pierre Pericard, Mélanie Pétéra, Christophe Duperier, et al. 2015. “Workflow4Metabolomics: A Collaborative Research Infrastructure for Computational Metabolomics.” Bioinformatics 31 (9): 1493–95. https://doi.org/10.1093/bioinformatics/btu813. Goracci, Laura, Paolo Tiberi, Stefano Di Bona, Stefano Bonciarelli, Giovanna Ilaria Passeri, Marta Piroddi, Simone Moretti, Claudia Volpi, Ismael Zamora, and Gabriele Cruciani. 2024. “MARS: A Multipurpose Software for Untargeted LC–MS-Based Metabolomics and Exposomics.” Analytical Chemistry, January. https://doi.org/10.1021/acs.analchem.3c03620. Guijas, Carlos, J. 
Rafael Montenegro-Burke, Xavier Domingo-Almenara, Amelia Palermo, Benedikt Warth, Gerrit Hermann, Gunda Koellensperger, et al. 2018. “METLIN: A Technology Platform for Identifying Knowns and Unknowns.” Analytical Chemistry 90 (5): 3156–64. https://doi.org/10.1021/acs.analchem.7b04424. Guo, Jian, Sam Shen, and Tao Huan. 2022. “Paramounter: Direct Measurement of Universal Parameters To Process Metabolomics Data in a ‘White Box’.” Analytical Chemistry, March. https://doi.org/10.1021/acs.analchem.1c04758. Habra, Hani, Maureen Kachman, Kevin Bullock, Clary Clish, Charles R. Evans, and Alla Karnovsky. 2021. “metabCombiner: Paired Untargeted LC-HRMS Metabolomics Feature Matching and Concatenation of Disparately Acquired Data Sets.” Analytical Chemistry 93 (12): 5028–36. https://doi.org/10.1021/acs.analchem.0c03693. Haug, Kenneth, Reza M Salek, and Christoph Steinbeck. 2017. “Global Open Data Management in Metabolomics.” Current Opinion in Chemical Biology, Omics, 36 (February): 58–63. https://doi.org/10.1016/j.cbpa.2016.12.024. Helmus, Rick, Thomas L. ter Laak, Annemarie P. van Wezel, Pim de Voogt, and Emma L. Schymanski. 2021. “patRoon: Open Source Software Platform for Environmental Mass Spectrometry Based Non-Target Screening.” Journal of Cheminformatics 13 (1): 1. https://doi.org/10.1186/s13321-020-00477-w. Hiller, Karsten, Jasper Hangebrauk, Christian Jäger, Jana Spura, Kerstin Schreiber, and Dietmar Schomburg. 2009. “MetaboliteDetector: Comprehensive Analysis Tool for Targeted and Nontargeted GC/MS Based Metabolome Analysis.” Analytical Chemistry 81 (9): 3429–39. https://doi.org/10.1021/ac802689c. Huan, Tao, Erica M. Forsberg, Duane Rinehart, Caroline H. Johnson, Julijana Ivanisevic, H. Paul Benton, Mingliang Fang, et al. 2017. “Systems Biology Guided by XCMS Online Metabolomics.” Nature Methods 14 (5): 461–62. https://doi.org/10.1038/nmeth.4260. 
Jalili, Vahid, Enis Afgan, Qiang Gu, Dave Clements, Daniel Blankenberg, Jeremy Goecks, James Taylor, and Anton Nekrutenko. 2020. “The Galaxy Platform for Accessible, Reproducible and Collaborative Biomedical Analyses: 2020 Update.” Nucleic Acids Research 48 (W1): W395–402. https://doi.org/10.1093/nar/gkaa434. Kew, William, John W. T. Blackburn, David J. Clarke, and Dušan Uhrín. 2017. “Interactive van Krevelen Diagrams – Advanced Visualisation of Mass Spectrometry Data of Complex Mixtures.” Rapid Communications in Mass Spectrometry 31 (7): 658–62. https://doi.org/10.1002/rcm.7823. Kind, Tobias, Hiroshi Tsugawa, Tomas Cajka, Yan Ma, Zijuan Lai, Sajjan S. Mehta, Gert Wohlgemuth, et al. 2018. “Identification of Small Molecules Using Accurate Mass MS/MS Search.” Mass Spectrometry Reviews 37 (4): 513–32. https://doi.org/10.1002/mas.21535. Lai, Zijuan, Hiroshi Tsugawa, Gert Wohlgemuth, Sajjan Mehta, Matthew Mueller, Yuxuan Zheng, Atsushi Ogiwara, et al. 2018. “Identifying Metabolites by Integrating Metabolome Databases with Mass Spectrometry Cheminformatics.” Nature Methods 15 (1): 53–56. https://doi.org/10.1038/nmeth.4512. Lassen, Johan, Kirstine Lykke Nielsen, Mogens Johannsen, and Palle Villesen. 2021. “Assessment of XCMS Optimization Methods with Machine-Learning Performance.” Analytical Chemistry 93 (40): 13459–66. https://doi.org/10.1021/acs.analchem.1c02000. Li, Shuzhao. 2020. Computational Methods and Data Analysis for Metabolomics. Springer. Li, Shuzhao, Youngja Park, Sai Duraisingham, Frederick H. Strobel, Nooruddin Khan, Quinlyn A. Soltow, Dean P. Jones, and Bali Pulendran. 2013. “Predicting Network Activity from High Throughput Metabolomics.” PLOS Computational Biology 9 (7): e1003123. https://doi.org/10.1371/journal.pcbi.1003123. Li, Shuzhao, Amnah Siddiqa, Maheshwor Thapa, Yuanye Chi, and Shujian Zheng. 2023. “Trackable and Scalable LC-MS Metabolomics Data Processing Using Asari.” Nature Communications 14 (1): 4113. 
https://doi.org/10.1038/s41467-023-39889-1. Li, Zhucui, Yan Lu, Yufeng Guo, Haijie Cao, Qinhong Wang, and Wenqing Shui. 2018. “Comprehensive Evaluation of Untargeted Metabolomics Data Processing Software in Feature Detection, Quantification and Discriminating Marker Selection.” Analytica Chimica Acta 1029 (October): 50–57. https://doi.org/10.1016/j.aca.2018.05.001. Liao, Jingyu, Yuhao Zhang, Wendan Zhang, Yuanyuan Zeng, Jing Zhao, Jingfang Zhang, Tingting Yao, et al. 2023. “Different Software Processing Affects the Peak Picking and Metabolic Pathway Recognition of Metabolomics Data.” Journal of Chromatography A 1687 (January): 463700. https://doi.org/10.1016/j.chroma.2022.463700. Libiseller, Gunnar, Michaela Dvorzak, Ulrike Kleb, Edgar Gander, Tobias Eisenberg, Frank Madeo, Steffen Neumann, et al. 2015. “IPO: A Tool for Automated Optimization of XCMS Parameters.” BMC Bioinformatics 16 (April): 118. https://doi.org/10.1186/s12859-015-0562-8. Liu, Qin, Douglas Walker, Karan Uppal, Zihe Liu, Chunyu Ma, ViLinh Tran, Shuzhao Li, Dean P. Jones, and Tianwei Yu. 2020. “Addressing the Batch Effect Issue for LC/MS Metabolomics Data in Data Preprocessing.” Scientific Reports 10 (1): 13856. https://doi.org/10.1038/s41598-020-70850-0. Liu, Youzhong, Yingjie Zhang, Tom Vennekens, Jennifer L. Lippens, Luc Duijsens, Danh Bui-Thi, Kris Laukens, and Thomas de Vijlder. 2023. “MeRgeION: A Multifunctional R Pipeline for Small Molecule LC-MS/MS Data Processing, Searching, and Organizing.” Analytical Chemistry 95 (22): 8433–42. https://doi.org/10.1021/acs.analchem.2c04343. Ludwig, Marcus, Louis-Félix Nothias, Kai Dührkop, Irina Koester, Markus Fleischauer, Martin A. Hoffmann, Daniel Petras, et al. 2020. “Database-Independent Molecular Formula Annotation Using Gibbs Sampling Through ZODIAC.” Nature Machine Intelligence 2 (10): 629–41. https://doi.org/10.1038/s42256-020-00234-6. Mahieu, Nathaniel G., Jonathan L. Spalding, Susan J. Gelman, and Gary J. Patti. 2016. 
“Defining and Detecting Complex Peak Relationships in Mass Spectral Data: The Mz.unity Algorithm.” Analytical Chemistry 88 (18): 9037–46. https://doi.org/10.1021/acs.analchem.6b01702. Mahieu, Nathaniel G., Jonathan L. Spalding, and Gary J. Patti. 2016. “Warpgroup: Increased Precision of Metabolomic Data Processing by Consensus Integration Bound Analysis.” Bioinformatics 32 (2): 268–75. https://doi.org/10.1093/bioinformatics/btv564. Matsuo, Teruko, Hiroshi Tsugawa, Hiromi Miyagawa, and Eiichiro Fukusaki. 2017. “Integrated Strategy for Unknown EI–MS Identification Using Quality Control Calibration Curve, Multivariate Analysis, EI–MS Spectral Database, and Retention Index Prediction.” Analytical Chemistry 89 (12): 6766–73. https://doi.org/10.1021/acs.analchem.7b01010. McLean, Craig, and Elizabeth B. Kujawinski. 2020. “AutoTuner: High Fidelity and Robust Parameter Selection for Metabolomics Data Processing.” Analytical Chemistry 92 (8): 5724–32. https://doi.org/10.1021/acs.analchem.9b04804. Melamud, Eugene, Livia Vastag, and Joshua D. Rabinowitz. 2010. “Metabolomic Analysis and Visualization Engine for LC-MS Data.” Analytical Chemistry 82 (23): 9818–26. https://doi.org/10.1021/ac1021166. Montenegro-Burke, J. Rafael, Aries E. Aisporna, H. Paul Benton, Duane Rinehart, Mingliang Fang, Tao Huan, Benedikt Warth, et al. 2017. “Data Streaming for Metabolomics: Accelerating Data Processing and Analysis from Days to Minutes.” Analytical Chemistry 89 (2): 1254–59. https://doi.org/10.1021/acs.analchem.6b03890. Myers, Owen D., Susan J. Sumner, Shuzhao Li, Stephen Barnes, and Xiuxia Du. 2017. “Detailed Investigation and Comparison of the XCMS and MZmine 2 Chromatogram Construction and Chromatographic Peak Detection Methods for Preprocessing Mass Spectrometry Metabolomics Data.” Analytical Chemistry 89 (17): 8689–95. https://doi.org/10.1021/acs.analchem.7b01069. Nothias, Louis-Félix, Daniel Petras, Robin Schmid, Kai Dührkop, Johannes Rainer, Abinesh Sarvepalli, Ivan Protsyuk, et al. 
2020. “Feature-Based Molecular Networking in the GNPS Analysis Environment.” Nature Methods 17 (9): 905–8. https://doi.org/10.1038/s41592-020-0933-6. Palmer, Andrew, Prasad Phapale, Ilya Chernyavsky, Regis Lavigne, Dominik Fay, Artem Tarasov, Vitaly Kovalev, et al. 2017. “FDR-controlled Metabolite Annotation for High-Resolution Imaging Mass Spectrometry.” Nature Methods 14 (1): 57–60. https://doi.org/10.1038/nmeth.4072. Pang, Zhiqiang, Jasmine Chong, Shuzhao Li, and Jianguo Xia. 2020. “MetaboAnalystR 3.0: Toward an Optimized Workflow for Global Metabolomics.” Metabolites 10 (5): 186. https://doi.org/10.3390/metabo10050186. Petras, Daniel, Vanessa V. Phelan, Deepa Acharya, Andrew E. Allen, Allegra T. Aron, Nuno Bandeira, Benjamin P. Bowen, et al. 2021. “GNPS Dashboard: Collaborative Exploration of Mass Spectrometry Data in the Web Browser.” Nature Methods, December, 1–3. https://doi.org/10.1038/s41592-021-01339-5. Pfeuffer, Julianus, Chris Bielow, Samuel Wein, Kyowon Jeong, Eugen Netz, Axel Walter, Oliver Alka, et al. 2024. “OpenMS 3 Enables Reproducible Analysis of Large-Scale Mass Spectrometry Data.” Nature Methods 21 (3): 365–67. https://doi.org/10.1038/s41592-024-02197-7. Pfeuffer, Julianus, Timo Sachsenberg, Oliver Alka, Mathias Walzer, Alexander Fillbrunn, Lars Nilse, Oliver Schilling, Knut Reinert, and Oliver Kohlbacher. 2017. “OpenMS – A Platform for Reproducible Analysis of Mass Spectrometry Data.” Journal of Biotechnology, Bioinformatics Solutions for Big Data Analysis in Life Sciences presented by the German Network for Bioinformatics Infrastructure, 261 (November): 142–48. https://doi.org/10.1016/j.jbiotec.2017.05.016. Pluskal, Tomáš, Sandra Castillo, Alejandro Villar-Briones, and Matej Orešič. 2010. “MZmine 2: Modular Framework for Processing, Visualizing, and Analyzing Mass Spectrometry-Based Molecular Profile Data.” BMC Bioinformatics 11: 395. https://doi.org/10.1186/1471-2105-11-395. 
Pluskal, Tomáš, Ansgar Korf, Aleksandr Smirnov, Robin Schmid, Timothy R. Fallon, Xiuxia Du, and Jing-Ke Weng. 2020. “CHAPTER 7:Metabolomics Data Analysis Using MZmine.” In Processing Metabolomics and Proteomics Data with Open Software, 232–54. https://doi.org/10.1039/9781788019880-00232. Plyushchenko, Ivan V., Elizaveta S. Fedorova, Natalia V. Potoldykova, Konstantin A. Polyakovskiy, Alexander I. Glukhov, and Igor A. Rodin. 2022. “Omics Untargeted Key Script: R-Based Software Toolbox for Untargeted Metabolomics with Bladder Cancer Biomarkers Discovery Case Study.” Journal of Proteome Research 21 (3): 833–47. https://doi.org/10.1021/acs.jproteome.1c00392. Riquelme, Gabriel, Nicolás Zabalegui, Pablo Marchi, Christina M. Jones, and María Eugenia Monge. 2020. “A Python-Based Pipeline for Preprocessing LC–MS Data for Untargeted Metabolomics Workflows.” Metabolites 10 (10): 416. https://doi.org/10.3390/metabo10100416. Röst, Hannes L., Timo Sachsenberg, Stephan Aiche, Chris Bielow, Hendrik Weisser, Fabian Aicheler, Sandro Andreotti, et al. 2016. “OpenMS: A Flexible Open-Source Software Platform for Mass Spectrometry Data Analysis.” Nature Methods 13 (9): 741–48. https://doi.org/10.1038/nmeth.3959. Röst, Hannes L., Uwe Schmitt, Ruedi Aebersold, and Lars Malmström. 2014. “pyOpenMS: A Python-based Interface to the OpenMS Mass-Spectrometry Algorithm Library.” PROTEOMICS 14 (1): 74–77. https://doi.org/10.1002/pmic.201300246. Rurik, Marc, Oliver Alka, Fabian Aicheler, and Oliver Kohlbacher. 2020. “Metabolomics Data Processing Using OpenMS.” In Computational Methods and Data Analysis for Metabolomics, edited by Shuzhao Li, 49–60. Methods in Molecular Biology. New York, NY: Springer US. https://doi.org/10.1007/978-1-0716-0239-3_4. Scheltema, Richard A., Andris Jankevics, Ritsert C. Jansen, Morris A. Swertz, and Rainer Breitling. 2011. “PeakML/mzMatch: A File Format, Java Library, R Library, and Tool-Chain for Mass Spectrometry Data Analysis.” Analytical Chemistry 83 (7): 2786–93. 
https://doi.org/10.1021/ac2000994. Scheubert, Kerstin, Franziska Hufsky, Daniel Petras, Mingxun Wang, Louis-Félix Nothias, Kai Dührkop, Nuno Bandeira, Pieter C. Dorrestein, and Sebastian Böcker. 2017. “Significance Estimation for Large Scale Metabolomics Annotations by Spectral Matching.” Nature Communications 8 (1): 1494. https://doi.org/10.1038/s41467-017-01318-5. Shen, Xiaotao, Hong Yan, Chuchu Wang, Peng Gao, Caroline H. Johnson, and Michael P. Snyder. 2022. “TidyMass an Object-Oriented Reproducible Analysis Framework for LC–MS Data.” Nature Communications 13 (1): 4365. https://doi.org/10.1038/s41467-022-32155-w. Silva, Ricardo R. da, Mingxun Wang, Louis-Félix Nothias, Justin J. J. van der Hooft, Andrés Mauricio Caraballo-Rodríguez, Evan Fox, Marcy J. Balunas, Jonathan L. Klassen, Norberto Peporine Lopes, and Pieter C. Dorrestein. 2018. “Propagating Annotations of Molecular Networks Using in Silico Fragmentation.” PLOS Computational Biology 14 (4): e1006089. https://doi.org/10.1371/journal.pcbi.1006089. Stancliffe, Ethan, Michaela Schwaiger-Haber, Miriam Sindelar, Matthew J. Murphy, Mette Soerensen, and Gary J. Patti. 2022. “An Untargeted Metabolomics Workflow That Scales to Thousands of Samples for Population-Based Studies.” Analytical Chemistry, December. https://doi.org/10.1021/acs.analchem.2c01270. Tautenhahn, Ralf, Kevin Cho, Winnie Uritboonthai, Zhengjiang Zhu, Gary J. Patti, and Gary Siuzdak. 2012. “An Accelerated Workflow for Untargeted Metabolomics Using the METLIN Database.” Nature Biotechnology 30 (9): 826–28. https://doi.org/10.1038/nbt.2348. Treutler, Hendrik, Hiroshi Tsugawa, Andrea Porzel, Karin Gorzolka, Alain Tissier, Steffen Neumann, and Gerd Ulrich Balcke. 2016. “Discovering Regulated Metabolite Families in Untargeted Metabolomics Studies.” Analytical Chemistry 88 (16): 8082–90. https://doi.org/10.1021/acs.analchem.6b01569. 
Tsugawa, Hiroshi, Tomas Cajka, Tobias Kind, Yan Ma, Brendan Higgins, Kazutaka Ikeda, Mitsuhiro Kanazawa, Jean VanderGheynst, Oliver Fiehn, and Masanori Arita. 2015. “MS-DIAL: Data-Independent MS/MS Deconvolution for Comprehensive Metabolome Analysis.” Nature Methods 12 (6): 523–26. https://doi.org/10.1038/nmeth.3393. Tsugawa, Hiroshi, Tobias Kind, Ryo Nakabayashi, Daichi Yukihira, Wataru Tanaka, Tomas Cajka, Kazuki Saito, Oliver Fiehn, and Masanori Arita. 2016. “Hydrogen Rearrangement Rules: Computational MS/MS Fragmentation and Structure Elucidation Using MS-FINDER Software.” Analytical Chemistry 88 (16): 7946–58. https://doi.org/10.1021/acs.analchem.6b00770. Uchino, Haruki, Hiroshi Tsugawa, Hidenori Takahashi, and Makoto Arita. 2022. “Computational Mass Spectrometry Accelerates C=C Position-Resolved Untargeted Lipidomics Using Oxygen Attachment Dissociation.” Communications Chemistry 5 (1): 1–13. https://doi.org/10.1038/s42004-022-00778-1. Uppal, Karan, Quinlyn A. Soltow, Frederick H. Strobel, W. Stephen Pittard, Kim M. Gernert, Tianwei Yu, and Dean P. Jones. 2013. “xMSanalyzer: Automated Pipeline for Improved Feature Detection and Downstream Analysis of Large-Scale, Non-Targeted Metabolomics Data.” BMC Bioinformatics 14 (1): 15. https://doi.org/10.1186/1471-2105-14-15. Uppal, Karan, Douglas I. Walker, and Dean P. Jones. 2017. “xMSannotator: An R Package for Network-Based Annotation of High-Resolution Metabolomics Data.” Analytical Chemistry 89 (2): 1063–67. https://doi.org/10.1021/acs.analchem.6b01214. Volikov, Alexander, Gleb Rukhovich, and Irina V. Perminova. 2023. “NOMspectra: An Open-Source Python Package for Processing High Resolution Mass Spectrometry Data on Natural Organic Matter.” Journal of the American Society for Mass Spectrometry, June. https://doi.org/10.1021/jasms.3c00003. Wang, Mingxun, Jeremy J. Carver, Vanessa V. Phelan, Laura M. 
Sanchez, Neha Garg, Yao Peng, Don Duy Nguyen, et al. 2016. “Sharing and Community Curation of Mass Spectrometry Data with Global Natural Products Social Molecular Networking.” Nature Biotechnology 34 (8): 828–37. https://doi.org/10.1038/nbt.3597. Wang, Yang, Fang Liu, Peng Li, Chengwei He, Ruibing Wang, Huanxing Su, and Jian-Bo Wan. 2016. “An Improved Pseudotargeted Metabolomics Approach Using Multiple Ion Monitoring with Time-Staggered Ion Lists Based on Ultra-High Performance Liquid Chromatography/Quadrupole Time-of-Flight Mass Spectrometry.” Analytica Chimica Acta 927 (July): 82–88. https://doi.org/10.1016/j.aca.2016.05.008. Weber, Ralf J. M., Thomas N. Lawson, Reza M. Salek, Timothy M. D. Ebbels, Robert C. Glen, Royston Goodacre, Julian L. Griffin, et al. 2017. “Computational Tools and Workflows in Metabolomics: An International Survey Highlights the Opportunity for Harmonisation Through Galaxy.” Metabolomics 13 (2). https://doi.org/10.1007/s11306-016-1147-x. Wen, Bo, Zhanlong Mei, Chunwei Zeng, and Siqi Liu. 2017. “metaX: A Flexible and Comprehensive Software for Processing Metabolomics Data.” BMC Bioinformatics 18 (March): 183. https://doi.org/10.1186/s12859-017-1579-y. Xue, Jingchuan, Carlos Guijas, H. Paul Benton, Benedikt Warth, and Gary Siuzdak. 2020. “METLIN MS 2 Molecular Standards Database: A Broad Chemical and Biological Resource.” Nature Methods 17 (10): 953–54. https://doi.org/10.1038/s41592-020-0942-5. Yu, Miao, Georgia Dolios, and Lauren Petrick. 2022. “Reproducible Untargeted Metabolomics Workflow for Exhaustive MS2 Data Acquisition of MS1 Features.” Journal of Cheminformatics 14 (1): 6. https://doi.org/10.1186/s13321-022-00586-8. Yu, Miao, Sofia Lendor, Anna Roszkowska, Mariola Olkowicz, Leslie Bragg, Mark Servos, and Janusz Pawliszyn. 2020. “Metabolic Profile of Fish Muscle Tissue Changes with Sampling Method, Storage Strategy and Time.” Analytica Chimica Acta 1136 (November): 42–50. https://doi.org/10.1016/j.aca.2020.08.050. 
Yu, Miao, Mariola Olkowicz, and Janusz Pawliszyn. 2019. “Structure/Reaction Directed Analysis for LC-MS Based Untargeted Analysis.” Analytica Chimica Acta 1050 (March): 16–24. https://doi.org/10.1016/j.aca.2018.10.062. Yu, Tianwei, Youngja Park, Jennifer M. Johnson, and Dean P. Jones. 2009. “apLCMS—Adaptive Processing of High-Resolution LC/MS Data.” Bioinformatics 25 (15): 1930–36. https://doi.org/10.1093/bioinformatics/btp291. Yu, Yong-Jie, Qing-Xia Zheng, Yue-Ming Zhang, Qian Zhang, Yu-Ying Zhang, Ping-Ping Liu, Peng Lu, et al. 2019. “Automatic Data Analysis Workflow for Ultra-High Performance Liquid Chromatography-High Resolution Mass Spectrometry-Based Metabolomics.” Journal of Chromatography A 1585 (January): 172–81. https://doi.org/10.1016/j.chroma.2018.11.070. Zhang, Yu-Ying, Qian Zhang, Yue-Ming Zhang, Wei-Wei Wang, Li Zhang, Yong-Jie Yu, Chang-Cai Bai, Ji-Zhao Guo, Hai-Yan Fu, and Yuanbin She. 2020. “A Comprehensive Automatic Data Analysis Strategy for Gas Chromatography-Mass Spectrometry Based Untargeted Metabolomics.” Journal of Chromatography A 1616 (April): 460787. https://doi.org/10.1016/j.chroma.2019.460787. Zheng, Fujian, Lei You, Wangshu Qin, Runze Ouyang, Wangjie Lv, Lei Guo, Xin Lu, Enyou Li, Xinjie Zhao, and Guowang Xu. 2022. “MetEx: A Targeted Extraction Strategy for Improving the Coverage and Accuracy of Metabolite Annotation in Liquid Chromatography–High-Resolution Mass Spectrometry Data.” Analytical Chemistry 94 (24): 8561–69. https://doi.org/10.1021/acs.analchem.1c04783. Zheng, Fujian, Xinjie Zhao, Zhongda Zeng, Lichao Wang, Wangjie Lv, Qingqing Wang, and Guowang Xu. 2020. “Development of a Plasma Pseudotargeted Metabolomics Method Based on Ultra-High-Performance Liquid Chromatography–Mass Spectrometry.” Nature Protocols 15 (8): 2519–37. https://doi.org/10.1038/s41596-020-0341-5. 
"],["raw-data-pretreatment.html", "Chapter 6 Raw data pretreatment 6.1 Data visualization 6.2 Peak extraction 6.3 MS/MS 6.4 Retention Time Correction 6.5 Filling missing values 6.6 Spectral deconvolution 6.7 Dynamic Range 6.8 RSD/fold change Filter 6.9 Power Analysis Filter", " Chapter 6 Raw data pretreatment Raw data from the instruments such as LC-MS or GC-MS were hard to be analyzed. To make it clear, the structure of those data could be summarized as: Indexed scan with time-stamp Each scan contains a full scan mass spectra Common formats for open source mass spectrum data are mzxml, mzml or CDF. However, MassComp might shrink the data size(R. Yang, Chen, and Ochoa 2019). ProteoWizard Toolkit provides a set of open-source, cross-platform software libraries and tools (Chambers et al. 2012). Msconvert is one tool in this toolkit. mzML2ISA &amp; nmrML2ISA could generate enriched ISA-Tab metadata files from metabolomics XML data (Larralde et al. 2017). 6.1 Data visualization You could use msxpertsuite for MS data visualization. It is biological mass spectrometry data visualization and mining with full JavaScript ability (Rusconi 2019). FTMSVisualization is a suite of tools for visualizing complex mixture FT-MS data (Kew et al. 2017). 6.2 Peak extraction GC/LC-MS data are usually be shown as a matrix with column standing for retention times and row standing for masses after bin them into small cell. Figure 6.1: Demo of GC/LC-MS data Conversation from the mass-retention time matrix into a vector with selected MS peaks at certain retention time is the basic idea of Peak extraction. You could EIC for each mass to charge ratio and use the change of trace slope to determine whether there is a peak or not. Then we could make integration for this peak and get peak area and retention time. 
intensity &lt;- c(10,10,10,10,10,14,19,25,30,33,26,21,16,12,11,10,9,10,11,10) time &lt;- c(1:20) plot(intensity~time, type = &#39;o&#39;, main = &#39;EIC&#39;) Figure 6.2: Demo of EIC with peak However, because of the limited mass accuracy of the instrument, the detected mass-to-charge ratio shifts slightly between scans, and the EIC would fail if different scans take the intensity from different mass-to-charge ratios. The matchedFilter algorithm (Smith et al. 2006) solves this issue by binning the data in the m/z dimension. Adjacent chromatographic slices can be combined to find a clean signal, which is fitted with a fixed second-derivative Gaussian with a full width at half-maximum (fwhm) of 30 s to find peaks with about 1.5-4 times the signal peak width. Then the integration is performed on the fitted area. Figure 6.3: Demo of matchedfilter The centWave algorithm (Tautenhahn, Böttcher, and Neumann 2008), which is based on the detection of regions of interest (ROI) followed by a Continuous Wavelet Transform (CWT), is preferred for high-resolution mass spectra. An ROI is a region with a stable mass for a certain time. Once the ROIs are found, the peak shape is evaluated and the ROI could be extended if needed. This algorithm uses prefilter to accelerate the processing speed. prefilter with 3 and 100 means the ROI should contain 3 scans with intensity above 100. centWave uses a peak width range, which should be checked on pooled QC samples. Another important parameter is ppm. It is the maximum allowed m/z deviation between scans when locating regions of interest (ROIs). This differs from the mass accuracy claimed by the vendor, and you usually need to set it larger than the vendor specification. profparam is used for filling or aligning peaks instead of peak picking. snthr is the cutoff of the signal-to-noise ratio. An open-source feature detection algorithm for non-target LC–MS analytics could be found here to understand the peak picking process (Dietrich, Wick, and Ternes 2022). 
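The roles of ppm and prefilter described above can be illustrated with a small sketch. This is a simplified illustration in Python and not the xcms implementation: the function names and the example m/z and intensity values are hypothetical, and the ppm check here compares each scan with the mean m/z of the trace, which is only one possible reading of the parameter.

```python
# Simplified illustration of two centWave-style parameters.
# NOT the xcms implementation; all names and values are hypothetical.

def ppm_deviation(mz, reference_mz):
    # Absolute m/z deviation expressed in parts per million.
    return abs(mz - reference_mz) / reference_mz * 1e6


def within_ppm(mz_values, ppm):
    # True when every scan stays within ppm of the mean m/z of the trace
    # (assumption: the mean is used as the reference mass).
    mean_mz = sum(mz_values) / len(mz_values)
    return all(ppm >= ppm_deviation(mz, mean_mz) for mz in mz_values)


def passes_prefilter(intensities, k, threshold):
    # prefilter with k and threshold: keep the ROI only if at least
    # k scans have an intensity at or above the threshold.
    return sum(1 for i in intensities if i >= threshold) >= k


# Hypothetical ROI: four consecutive scans of one mass trace.
roi_mz = [200.1001, 200.1003, 200.0999, 200.1002]
roi_intensity = [80, 150, 400, 120]

print(within_ppm(roi_mz, ppm=5))
print(passes_prefilter(roi_intensity, 3, 100))
```

Tightening ppm or raising the prefilter threshold in this toy example makes the same trace fail the check, which is why both values should be tuned on real QC data rather than copied from the instrument specification.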
A pseudo F-ratio moving window could also be used to select untargeted regions of interest for gas chromatography–mass spectrometry data (Giebelhaus et al. 2022). mzRAPP enables the generation of benchmark peak lists by using an internal set of known molecules in the analyzed data set to compare workflows (El Abiead et al. 2022). G-Aligner is a graph-based feature alignment method for untargeted LC–MS-based metabolomics (Ruimin Wang et al. 2023), which considers the importance of feature matching. qBinning is an algorithm for constructing extracted ion chromatograms (EICs) based on statistical principles, without the need to set user parameters (Reuschenbach et al. 2023). Machine learning can also be used for feature extraction. A deep learning framework for LC-MS feature detection on 2D pseudo-color images could improve the peak picking process (F. Zhao, Huang, and Zhang 2021). Deep learning-assisted peak curation (NeatMS) can also be used for large-scale LC-MS metabolomics (Gloaguen, Kirwan, and Beule 2022). A feature selection pipeline based on a neural network and a genetic algorithm could be applied to metabolomics data analysis (Lisitsyna et al. 2022). 6.3 MS/MS Various data acquisition workflows are reviewed here (Fenaille et al. 2017). Before using MS/MS annotation, it’s better to know that both DDA and DIA will miss some precursors found in MS1 (J. Guo and Huan 2020; Stincone et al. 2023). 6.3.1 MRM decoMS2: An Untargeted Metabolomic Workflow to Improve Structural Characterization of Metabolites (Nikolskiy et al. 2013). It requires two different collision energies, low (usually 0 V) and high, in each precursor range to solve the mathematical equations. A data-independent targeted metabolomics method could connect MS1 and MRM (Y. Chen et al. 2017). DecoID is a Python-based tool for database-assisted deconvolution of MS/MS spectra. 6.3.2 DDA The coverage of DDA could be enhanced by a feature classification strategy (Y. 
Hu, Cai, and Huan 2019) or an iterative process (Anderson et al. 2021). 6.3.3 DIA DIA methods, including MSE, stepwise windows and random windows, are summarized here (Bilbao et al. 2015), and here is a comparison (Zhu, Chen, and Subramanian 2014). msPurity: Automated Evaluation of Precursor Ion Purity for Mass Spectrometry-Based Fragmentation in Metabolomics (Lawson et al. 2017). ULSA: a deconvolution algorithm and a universal library search algorithm (ULSA), based on Matlab, for the analysis of complex spectra generated via data-independent acquisition (Samanipour et al. 2018). MS-DIAL was initially designed for DIA (Tsugawa et al. 2015; Treutler and Neumann 2016). DIA-Umpire is a comprehensive computational framework for data-independent acquisition proteomics (Tsou et al. 2015). MetDIA could perform Targeted Metabolite Extraction of Multiplexed MS/MS Spectra Generated by Data-Independent Acquisition (H. Li et al. 2016). The MetaboDIA workflow builds customized MS/MS spectral libraries from a user’s own data-dependent acquisition (DDA) data and performs MS/MS-based quantification with DIA data, thus complementing conventional MS1-based quantification (G. Chen et al. 2017). SWATHtoMRM: Development of High-Coverage Targeted Metabolomics Method Using SWATH Technology for Biomarker Discovery (Zha et al. 2018). Skyline is a freely available and open-source Windows client application for building Selected Reaction Monitoring (SRM) / Multiple Reaction Monitoring (MRM), Parallel Reaction Monitoring (PRM - Targeted MS/MS), Data Independent Acquisition (DIA/SWATH) and targeted DDA with MS1 quantitative methods and analyzing the resulting mass spectrometer data (Adams et al. 2020). MSstats is an R-based/Bioconductor package for statistical relative quantification of peptides and proteins in mass spectrometry-based proteomic experiments (Choi et al. 2014). 
It is applicable to multiple types of sample preparation, including label-free workflows, workflows that use stable isotope labeled reference proteins and peptides, and workflows that use fractionation. It is applicable to targeted Selected Reaction Monitoring (SRM), Data-Dependent Acquisition (DDA or shotgun), and Data-Independent Acquisition (DIA or SWATH-MS). This GitHub page is for sharing source and testing. Other related papers covering SWATH and other topics in DIA could be found here (Bonner and Hopfgartner 2018; Ruohong Wang, Yin, and Zhu 2019). MetaboAnnotatoR is designed to perform metabolite annotation of features from LC-MS All-ion fragmentation (AIF) datasets, using ion fragment databases (Graça et al. 2022). DIAMetAlyzer is a pipeline for assay library generation and targeted analysis with statistical validation (Alka et al. 2022). MetaboMSDIA: A tool for implementing data-independent acquisition in metabolomic-based mass spectrometry analysis (Ledesma-Escobar, Priego-Capote, and Calderón-Santiago 2023). CRISP: a cross-run ion selection and peak-picking (CRISP) tool that utilizes the important advantage of run-to-run consistency of DIA and simultaneously examines the DIA data from the whole set of runs to filter out the interfering signals, instead of only looking at a single run at a time (Yan et al. 2023). 6.4 Retention Time Correction For a single file, we could get peaks. However, the peaks should be aligned across samples to form features, so retention time correction should be performed. The basic idea behind retention time correction is to use the high-quality grouped peaks to build a new retention time. You might choose the obiwarp (for dramatic shifts) or the loess regression (fast) method to get the corrected retention time for all of the samples. Remember that the original retention times will be changed, and you might need to cross-correct the data. After the correction, you could group the peaks again for a better cross-sample peaks list. 
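The idea of building a new retention time from well-behaved grouped peaks can be sketched as follows. This is a toy example under stated assumptions, not the xcms code: linear interpolation between anchor peaks stands in for the loess fit, and the function name and all retention times are hypothetical.

```python
# Toy retention time correction based on anchor peaks.
# Assumption: linear interpolation replaces the loess regression;
# the function name and all numbers are hypothetical.

def correct_rt(rt, anchors):
    # anchors: list of (observed_rt, reference_rt) pairs taken from
    # high-quality grouped peaks, sorted by observed_rt.
    first_obs, first_ref = anchors[0]
    last_obs, last_ref = anchors[-1]
    if first_obs >= rt:
        # Before the first anchor: reuse the shift of the first anchor.
        return rt + (first_ref - first_obs)
    if rt >= last_obs:
        # After the last anchor: reuse the shift of the last anchor.
        return rt + (last_ref - last_obs)
    for (obs0, ref0), (obs1, ref1) in zip(anchors, anchors[1:]):
        if obs1 >= rt:
            # Interpolate the shift between the two surrounding anchors.
            fraction = (rt - obs0) / (obs1 - obs0)
            shift0 = ref0 - obs0
            shift1 = ref1 - obs1
            return rt + shift0 + fraction * (shift1 - shift0)


# Hypothetical anchors: this sample elutes 2 s late at 100 s,
# 1 s late at 200 s and on time at 300 s.
anchors = [(100.0, 98.0), (200.0, 199.0), (300.0, 300.0)]
print(correct_rt(150.0, anchors))
```

A peak observed at 150 s lies halfway between the first two anchors, so it receives the average of their shifts. A smoother such as loess plays the same role but is less sensitive to a single badly grouped anchor.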
However, if you directly use obiwarp, you don’t have to group peaks before correction. This paper shows a Matlab-based shift correction method (H.-Y. Fu et al. 2017). Retention time correction is a parametric time warping process and this paper is a good start (Wehrens, Bloemberg, and Eilers 2015). Meanwhile, you could use MS2 for retention time correction (Lili Li et al. 2017). This work presents a Python-based RI system and peak shift correction model that significantly enhances alignment accuracy (Hao et al. 2023). 6.5 Filling missing values Too many zeros or NA in the peaks list are problematic for statistics. We therefore usually need to integrate the area where a peak is expected. xcms 3 could use the profile matrix to fill the blanks. It also has functions to impute the NA data by replacing missing values with a proportion of the row minimum or with random numbers based on the row minimum. It is up to the user to select the imputation method and to control the minimum fraction of samples in a single group in which a feature must appear. Figure 6.4: Peak filling of GC/LC-MS data With many groups of samples, you will get another data matrix, with columns standing for peaks at a certain retention time and rows standing for samples, after the raw data pretreatment. Figure 6.5: Demo of many GC/LC-MS data 6.6 Spectral deconvolution Without structure information about a certain compound, peak extraction would suffer interference from other compounds. At the same retention time, co-eluting compounds might share similar masses. Hard ionization methods such as electron impact ionization (EI), as well as APPI, suffer from this issue. So it would be hard to distinguish the origin of co-eluting peaks, and a deconvolution method could be used to separate them into groups according to similar chromatographic behavior. Another computational tool, eRah, could be a better solution for the whole process (Domingo-Almenara et al. 2016). ADAP-GC 3.0 could also be helpful for this issue (Y. Ni et al. 2016). 
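As a rough illustration of the idea behind chromatographic deconvolution described in this section (grouping ions whose elution profiles behave alike), the following base R sketch clusters simulated extracted ion chromatograms by profile correlation. It is not the algorithm used by eRah or ADAP-GC; the ion names, peak shapes and the 0.9 correlation cutoff are assumptions made only for this example.

```r
# Simulated extracted ion chromatograms (EICs): rows are scans, columns are ions.
# All values are made up for illustration.
scans <- 1:60
gauss <- function(center, width) exp(-(scans - center)^2 / (2 * width^2))

# Two hypothetical co-eluting compounds whose apexes are 4 scans apart
profile_a <- gauss(28, 3)
profile_b <- gauss(32, 3)

eic <- cbind(
  mz73  = 1000 * profile_a,
  mz147 =  400 * profile_a,
  mz217 =  150 * profile_a,
  mz91  =  800 * profile_b,
  mz119 =  300 * profile_b
)

# Pearson correlation between ion profiles, turned into a distance.
# Rounding removes floating point noise around a correlation of exactly 1.
profile_dist <- as.dist(round(1 - cor(eic), 6))

# Average-linkage clustering; ions closer than 0.1 (correlation above 0.9,
# an assumed cutoff) are assigned to the same component.
tree <- hclust(profile_dist, method = "average")
component <- cutree(tree, h = 0.1)

# List the ions assigned to each component
split(names(component), component)
```

With real data the profiles would need smoothing and baseline removal first, and compounds with identical elution profiles cannot be separated by correlation alone.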
Other solutions for GC could be found here (Styczynski et al. 2007; T.-F. Tian et al. 2016; Xiuxia Du and Zeisel 2013). 6.7 Dynamic Range Another issue is the dynamic range. For metabolomics, peaks could be below the detection limit or over the detection limit. Such dynamic range issues might cause loss of information. 6.7.1 Non-detects Some of the data are limited by the limit of detection. Thus we need some methods to impute the data if we don’t want to lose information by deleting the NA or 0. Two major imputation approaches could be used. The first is a model-free method such as half the minimum of the values across the data, 0, 1, or the mean/median across the data (the enviGCMS package could do this via the getimputation function). The second is a model-based method such as a linear model, random forest, KNN, or PCA. Try the simputation package for various imputation methods. As mentioned before, you could also use imputeRowMin or imputeRowMinRand within the xcms package to perform imputation. Tobit regression is preferred for censored data. You might also choose maximum likelihood estimation (estimate the mean and standard deviation by MLE, create 10 complete samples, and pool the results from the 10 individual analyses). x &lt;- rnorm(1000,1) x[x&lt;0] &lt;- 0 y &lt;- x*10+1 library(AER) tfit &lt;- tobit(y ~ x, left = 0) summary(tfit) ## ## Call: ## tobit(formula = y ~ x, left = 0) ## ## Observations: ## Total Left-censored Uncensored Right-censored ## 1000 0 1000 0 ## ## Coefficients: ## Estimate Std. Error z value Pr(&gt;|z|) ## (Intercept) 1.0000 0.4325 2.312 0.0208 * ## x 10.0000 0.3162 31.623 &lt;2e-16 *** ## Log(scale) 2.1541 0.0000 Inf &lt;2e-16 *** ## --- ## Signif. 
codes: 0 &#39;***&#39; 0.001 &#39;**&#39; 0.01 &#39;*&#39; 0.05 &#39;.&#39; 0.1 &#39; &#39; 1 ## ## Scale: 8.62 ## ## Gaussian distribution ## Number of Newton-Raphson Iterations: 1 ## Log-likelihood: -3073 on 3 Df ## Wald-statistic: 1000 on 1 Df, p-value: &lt; 2.22e-16 According to Ronald Hites’s simulation (Hites 2019), replacing measurements below the LOD (even missing measurements) with LOD/2 or with \\(LOD/\\sqrt2\\) causes little bias, and “Any time you have a % non-detected &gt;20%, for whatever reason, it is unlikely that the data set can give useful results.” Another study found that random forest could be the best imputation method for missing at random (MAR) and missing completely at random (MCAR) data, while quantile regression imputation of left-censored data is the best imputation method for left-censored missing not at random data (Wei et al. 2018). 6.7.2 Over Detection Limit CorrectOverloadedPeaks could be used to correct peaks exceeding the detection limit (Lisec et al. 2016). 6.8 RSD/fold change Filter Some peaks need to be ruled out due to high RSD% and small fold changes compared with blank samples. A more general feature filtering for biomarker discovery can be found here (Gadara et al. 2021) and a detailed discussion on intensity thresholds could be found here (Houriet et al. 2022). 6.9 Power Analysis Filter As shown in \\[Exprimental design(DoE)\\], the power analysis in metabolomics is ad-hoc since you don’t know too much before you perform the experiment. However, we could perform power analysis after the experiment is done. That is, we just rule out the peaks with a low power for the current experimental design. References Adams, Kendra J., Brian Pratt, Neelanjan Bose, Laura G. Dubois, Lisa St John-Williams, Kevin M. Perrott, Karina Ky, et al. 2020. “Skyline for Small Molecules: A Unifying Software Package for Quantitative Metabolomics.” Journal of Proteome Research 19 (4): 1447–58. https://doi.org/10.1021/acs.jproteome.9b00640. 
Alka, Oliver, Premy Shanthamoorthy, Michael Witting, Karin Kleigrewe, Oliver Kohlbacher, and Hannes L. Röst. 2022. “DIAMetAlyzer Allows Automated False-Discovery Rate-Controlled Analysis for Data-Independent Acquisition in Metabolomics.” Nature Communications 13 (1): 1347. https://doi.org/10.1038/s41467-022-29006-z. Anderson, Brady G., Alexander Raskind, Hani Habra, Robert T. Kennedy, and Charles R. Evans. 2021. “Modifying Chromatography Conditions for Improved Unknown Feature Identification in Untargeted Metabolomics.” Analytical Chemistry 93 (48): 15840–49. https://doi.org/10.1021/acs.analchem.1c02149. Bilbao, Aivett, Emmanuel Varesio, Jeremy Luban, Caterina Strambio-De-Castillia, Gérard Hopfgartner, Markus Müller, and Frédérique Lisacek. 2015. “Processing Strategies and Software Solutions for Data-Independent Acquisition in Mass Spectrometry.” PROTEOMICS 15 (5-6): 964–80. https://doi.org/10.1002/pmic.201400323. Bonner, Ron, and Gérard Hopfgartner. 2018. “SWATH Data Independent Acquisition Mass Spectrometry for Metabolomics.” TrAC Trends in Analytical Chemistry, October. https://doi.org/10.1016/j.trac.2018.10.014. Chambers, Matthew C., Brendan Maclean, Robert Burke, Dario Amodei, Daniel L. Ruderman, Steffen Neumann, Laurent Gatto, et al. 2012. “A Cross-Platform Toolkit for Mass Spectrometry and Proteomics.” Nature Biotechnology 30 (October): 918–20. https://doi.org/10.1038/nbt.2377. Chen, Gengbo, Scott Walmsley, Gemmy C. M. Cheung, Liyan Chen, Ching-Yu Cheng, Roger W. Beuerman, Tien Yin Wong, Lei Zhou, and Hyungwon Choi. 2017. “Customized Consensus Spectral Library Building for Untargeted Quantitative Metabolomics Analysis with Data Independent Acquisition Mass Spectrometry and MetaboDIA Workflow.” Analytical Chemistry 89 (9): 4897–4906. https://doi.org/10.1021/acs.analchem.6b05006. Chen, Yanhua, Zhi Zhou, Wei Yang, Nan Bi, Jing Xu, Jiuming He, Ruiping Zhang, Lvhua Wang, and Zeper Abliz. 2017. 
“Development of a Data-Independent Targeted Metabolomics Method for Relative Quantification Using Liquid Chromatography Coupled with Tandem Mass Spectrometry.” Analytical Chemistry 89 (13): 6954–62. https://doi.org/10.1021/acs.analchem.6b04727. Choi, Meena, Ching-Yun Chang, Timothy Clough, Daniel Broudy, Trevor Killeen, Brendan MacLean, and Olga Vitek. 2014. “MSstats: An R Package for Statistical Analysis of Quantitative Mass Spectrometry-Based Proteomic Experiments.” Bioinformatics 30 (17): 2524–26. https://doi.org/10.1093/bioinformatics/btu305. Dietrich, Christian, Arne Wick, and Thomas A. Ternes. 2022. “Open-Source Feature Detection for Non-Target LC–MS Analytics.” Rapid Communications in Mass Spectrometry 36 (2): e9206. https://doi.org/10.1002/rcm.9206. Domingo-Almenara, Xavier, Jesus Brezmes, Maria Vinaixa, Sara Samino, Noelia Ramirez, Marta Ramon-Krauel, Carles Lerin, et al. 2016. “eRah: A Computational Tool Integrating Spectral Deconvolution and Alignment with Quantification and Identification of Metabolites in GC/MS-Based Metabolomics.” Analytical Chemistry 88 (19): 9821–29. https://doi.org/10.1021/acs.analchem.6b02927. Du, Xiuxia, and Steven H Zeisel. 2013. “SPECTRAL DECONVOLUTION FOR GAS CHROMATOGRAPHY MASS SPECTROMETRY-BASED METABOLOMICS: CURRENT STATUS AND FUTURE PERSPECTIVES.” Computational and Structural Biotechnology Journal 4 (5): 1–10. https://doi.org/10.5936/csbj.201301013. El Abiead, Yasin, Maximilian Milford, Harald Schoeny, Mate Rusz, Reza M. Salek, and Gunda Koellensperger. 2022. “Power of mzRAPP-Based Performance Assessments in MS1-Based Nontargeted Feature Detection.” Analytical Chemistry 94 (24): 8588–95. https://doi.org/10.1021/acs.analchem.1c05270. Fenaille, François, Pierre Barbier Saint-Hilaire, Kathleen Rousseau, and Christophe Junot. 2017. “Data Acquisition Workflows in Liquid Chromatography Coupled to High Resolution Mass Spectrometry-Based Metabolomics: Where Do We Stand?” Journal of Chromatography A 1526 (Supplement C): 1–12. 
https://doi.org/10.1016/j.chroma.2017.10.043. Fu, Hai-Yan, Ou Hu, Yue-Ming Zhang, Li Zhang, Jing-Jing Song, Peang Lu, Qing-Xia Zheng, et al. 2017. “Mass-Spectra-Based Peak Alignment for Automatic Nontargeted Metabolic Profiling Analysis for Biomarker Screening in Plant Samples.” Journal of Chromatography A 1513 (Supplement C): 201–9. https://doi.org/10.1016/j.chroma.2017.07.044. Gadara, Darshak, Katerina Coufalikova, Juraj Bosak, David Smajs, and Zdenek Spacil. 2021. “Systematic Feature Filtering in Exploratory Metabolomics: Application Toward Biomarker Discovery.” Analytical Chemistry 93 (26): 9103–10. https://doi.org/10.1021/acs.analchem.1c00816. Giebelhaus, Ryland T., Michael D. Sorochan Armstrong, A. Paulina de la Mata, and James J. Harynuk. 2022. “Untargeted Region of Interest Selection for Gas Chromatography – Mass Spectrometry Data Using a Pseudo F-ratio Moving Window.” Journal of Chromatography A 1682 (October): 463499. https://doi.org/10.1016/j.chroma.2022.463499. Gloaguen, Yoann, Jennifer A. Kirwan, and Dieter Beule. 2022. “Deep Learning-Assisted Peak Curation for Large-Scale LC-MS Metabolomics.” Analytical Chemistry 94 (12): 4930–37. https://doi.org/10.1021/acs.analchem.1c02220. Graça, Gonçalo, Yuheng Cai, Chung-Ho E. Lau, Panagiotis A. Vorkas, Matthew R. Lewis, Elizabeth J. Want, David Herrington, and Timothy M. D. Ebbels. 2022. “Automated Annotation of Untargeted All-Ion Fragmentation LC–MS Metabolomics Data with MetaboAnnotatoR.” Analytical Chemistry 94 (8): 3446–55. https://doi.org/10.1021/acs.analchem.1c03032. Guo, Jian, and Tao Huan. 2020. “Comparison of Full-Scan, Data-Dependent, and Data-Independent Acquisition Modes in Liquid Chromatography–Mass Spectrometry Based Untargeted Metabolomics.” Analytical Chemistry 92 (12): 8072–80. https://doi.org/10.1021/acs.analchem.9b05135. Hao, Jun-Di, Yao-Yu Chen, Yan-Zhen Wang, Na An, Pei-Rong Bai, Quan-Fei Zhu, and Yu-Qi Feng. 2023. 
“Novel Peak Shift Correction Method Based on the Retention Index for Peak Alignment in Untargeted Metabolomics.” Analytical Chemistry 95 (35): 13330–37. https://doi.org/10.1021/acs.analchem.3c02583. Hites, Ronald A. 2019. “Correcting for Censored Environmental Measurements.” Environmental Science &amp; Technology, September. https://doi.org/10.1021/acs.est.9b05042. Houriet, Joelle, Warren S. Vidar, Preston K. Manwill, Daniel A. Todd, and Nadja B. Cech. 2022. “How Low Can You Go? Selecting Intensity Thresholds for Untargeted Metabolomics Data Preprocessing.” Analytical Chemistry 94 (51): 17964–71. https://doi.org/10.1021/acs.analchem.2c04088. Hu, Yaxi, Betty Cai, and Tao Huan. 2019. “Enhancing Metabolome Coverage in Data-Dependent LC–MS/MS Analysis Through an Integrated Feature Extraction Strategy.” Analytical Chemistry 91 (22): 14433–41. https://doi.org/10.1021/acs.analchem.9b02980. Kew, William, John W. T. Blackburn, David J. Clarke, and Dušan Uhrín. 2017. “Interactive van Krevelen Diagrams – Advanced Visualisation of Mass Spectrometry Data of Complex Mixtures.” Rapid Communications in Mass Spectrometry 31 (7): 658–62. https://doi.org/10.1002/rcm.7823. Larralde, Martin, Thomas N. Lawson, Ralf J. M. Weber, Pablo Moreno, Kenneth Haug, Philippe Rocca-Serra, Mark R. Viant, Christoph Steinbeck, and Reza M. Salek. 2017. “mzML2ISA &amp; nmrML2ISA: Generating Enriched ISA-Tab Metadata Files from Metabolomics XML Data.” Bioinformatics 33 (16): 2598–2600. https://doi.org/10.1093/bioinformatics/btx169. Lawson, Thomas N., Ralf J. M. Weber, Martin R. Jones, Andrew J. Chetwynd, Giovanny Rodrı́guez-Blanco, Riccardo Di Guida, Mark R. Viant, and Warwick B. Dunn. 2017. “msPurity: Automated Evaluation of Precursor Ion Purity for Mass Spectrometry-Based Fragmentation in Metabolomics.” Analytical Chemistry 89 (4): 2432–39. https://doi.org/10.1021/acs.analchem.6b04358. Ledesma-Escobar, Carlos Augusto, Feliciano Priego-Capote, and Mónica Calderón-Santiago. 2023. 
“MetaboMSDIA: A Tool for Implementing Data-Independent Acquisition in Metabolomic-Based Mass Spectrometry Analysis.” Analytica Chimica Acta 1266 (July): 341308. https://doi.org/10.1016/j.aca.2023.341308. Li, Hao, Yuping Cai, Yuan Guo, Fangfang Chen, and Zheng-Jiang Zhu. 2016. “MetDIA: Targeted Metabolite Extraction of Multiplexed MS/MS Spectra Generated by Data-Independent Acquisition.” Analytical Chemistry 88 (17): 8757–64. https://doi.org/10.1021/acs.analchem.6b02122. Li, Lili, Weijie Ren, Hongwei Kong, Chunxia Zhao, Xinjie Zhao, Xiaohui Lin, Xin Lu, and Guowang Xu. 2017. “An Alignment Algorithm for LC-MS-based Metabolomics Dataset Assisted by MS/MS Information.” Analytica Chimica Acta 990 (October): 96–102. https://doi.org/10.1016/j.aca.2017.07.058. Lisec, Jan, Friederike Hoffmann, Clemens Schmitt, and Carsten Jaeger. 2016. “Extending the Dynamic Range in Metabolomics Experiments by Automatic Correction of Peaks Exceeding the Detection Limit.” Analytical Chemistry 88 (15): 7487–92. https://doi.org/10.1021/acs.analchem.6b02515. Lisitsyna, Anna, Franco Moritz, Youzhong Liu, Loubna Al Sadat, Hans Hauner, Melina Claussnitzer, Philippe Schmitt-Kopplin, and Sara Forcisi. 2022. “Feature Selection Pipelines with Classification for Non-targeted Metabolomics Combining the Neural Network and Genetic Algorithm.” Analytical Chemistry 94 (14): 5474–82. https://doi.org/10.1021/acs.analchem.1c03237. Ni, Yan, Mingming Su, Yunping Qiu, Wei Jia, and Xiuxia Du. 2016. “ADAP-GC 3.0: Improved Peak Detection and Deconvolution of Co-eluting Metabolites from GC/TOF-MS Data for Metabolomics Studies.” Analytical Chemistry 88 (17): 8802–11. https://doi.org/10.1021/acs.analchem.6b02222. Nikolskiy, Igor, Nathaniel G. Mahieu, Ying-Jr Chen, Ralf Tautenhahn, and Gary J. Patti. 2013. “An Untargeted Metabolomic Workflow to Improve Structural Characterization of Metabolites.” Analytical Chemistry 85 (16): 7713–19. https://doi.org/10.1021/ac400751j. Reuschenbach, Max, Felix Drees, Torsten C. 
Schmidt, and Gerrit Renner. 2023. “qBinning: Data Quality-Based Algorithm for Automized Ion Chromatogram Extraction from High-Resolution Mass Spectrometry.” Analytical Chemistry, September. https://doi.org/10.1021/acs.analchem.3c01079. Rusconi, Filippo. 2019. “mineXpert: Biological Mass Spectrometry Data Visualization and Mining with Full JavaScript Ability.” Journal of Proteome Research 18 (5): 2254–59. https://doi.org/10.1021/acs.jproteome.9b00099. Samanipour, Saer, Malcolm J. Reid, Kine Bæk, and Kevin V. Thomas. 2018. “Combining a Deconvolution and a Universal Library Search Algorithm for the Nontarget Analysis of Data-Independent Acquisition Mode Liquid Chromatography-High-Resolution Mass Spectrometry Results.” Environmental Science &amp; Technology 52 (8): 4694–4701. https://doi.org/10.1021/acs.est.8b00259. Smith, Colin A., Elizabeth J. Want, Grace O’Maille, Ruben Abagyan, and Gary Siuzdak. 2006. “XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification.” Analytical Chemistry 78 (3): 779–87. https://doi.org/10.1021/ac051437y. Stincone, Paolo, Abzer K. Pakkir Shah, Robin Schmid, Lana G. Graves, Stilianos P. Lambidis, Ralph R. Torres, Shu-Ning Xia, et al. 2023. “Evaluation of Data-Dependent MS/MS Acquisition Parameters for Non-Targeted Metabolomics and Molecular Networking of Environmental Samples: Focus on the Q Exactive Platform.” Analytical Chemistry, August. https://doi.org/10.1021/acs.analchem.3c01202. Styczynski, Mark P., Joel F. Moxley, Lily V. Tong, Jason L. Walther, Kyle L. Jensen, and Gregory N. Stephanopoulos. 2007. “Systematic Identification of Conserved Metabolites in GC/MS Data for Metabolomics and Biomarker Discovery.” Analytical Chemistry 79 (3): 966–73. https://doi.org/10.1021/ac0614846. 
Tautenhahn, Ralf, Christoph Böttcher, and Steffen Neumann. 2008. “Highly Sensitive Feature Detection for High Resolution LC/MS.” BMC Bioinformatics 9: 504. https://doi.org/10.1186/1471-2105-9-504. Tian, Tze-Feng, San-Yuan Wang, Tien-Chueh Kuo, Cheng-En Tan, Guan-Yuan Chen, Ching-Hua Kuo, Chi-Hsin Sally Chen, Chang-Chuan Chan, Olivia A. Lin, and Y. Jane Tseng. 2016. “Web Server for Peak Detection, Baseline Correction, and Alignment in Two-Dimensional Gas Chromatography Mass Spectrometry-Based Metabolomics Data.” Analytical Chemistry 88 (21): 10395–403. https://doi.org/10.1021/acs.analchem.6b00755. Treutler, Hendrik, and Steffen Neumann. 2016. “Prediction, Detection, and Validation of Isotope Clusters in Mass Spectrometry Data.” Metabolites 6 (4): 37. https://doi.org/10.3390/metabo6040037. Tsou, Chih-Chiang, Dmitry Avtonomov, Brett Larsen, Monika Tucholska, Hyungwon Choi, Anne-Claude Gingras, and Alexey I. Nesvizhskii. 2015. “DIA-Umpire: Comprehensive Computational Framework for Data-Independent Acquisition Proteomics.” Nature Methods 12 (3): 258–64. https://doi.org/10.1038/nmeth.3255. Tsugawa, Hiroshi, Tomas Cajka, Tobias Kind, Yan Ma, Brendan Higgins, Kazutaka Ikeda, Mitsuhiro Kanazawa, Jean VanderGheynst, Oliver Fiehn, and Masanori Arita. 2015. “MS-DIAL: Data-Independent MS/MS Deconvolution for Comprehensive Metabolome Analysis.” Nature Methods 12 (6): 523–26. https://doi.org/10.1038/nmeth.3393. Wang, Ruimin, Miaoshan Lu, Shaowei An, Jinyin Wang, and Changbin Yu. 2023. “G-Aligner: A Graph-Based Feature Alignment Method for Untargeted LC–MS-based Metabolomics.” BMC Bioinformatics 24 (1): 431. https://doi.org/10.1186/s12859-023-05525-4. Wang, Ruohong, Yandong Yin, and Zheng-Jiang Zhu. 2019. “Advancing Untargeted Metabolomics Using Data-Independent Acquisition Mass Spectrometry Technology.” Analytical and Bioanalytical Chemistry 411 (19): 4349–57. https://doi.org/10.1007/s00216-019-01709-1. Wehrens, Ron, Tom G. Bloemberg, and Paul H. C. Eilers. 2015. 
“Fast Parametric Time Warping of Peak Lists.” Bioinformatics 31 (18): 3063–65. https://doi.org/10.1093/bioinformatics/btv299. Wei, Runmin, Jingye Wang, Mingming Su, Erik Jia, Shaoqiu Chen, Tianlu Chen, and Yan Ni. 2018. “Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data.” Scientific Reports 8 (1): 663. https://doi.org/10.1038/s41598-017-19120-0. Yan, Binjun, Mengtian Shi, Siyu Cai, Yuan Su, Renhui Chen, Chiyuan Huang, and David Da Yong Chen. 2023. “Data-Driven Tool for Cross-Run Ion Selection and Peak-Picking in Quantitative Proteomics with Data-Independent Acquisition LC–MS/MS.” Analytical Chemistry 95 (45): 16558–66. https://doi.org/10.1021/acs.analchem.3c02689. Yang, Ruochen, Xi Chen, and Idoia Ochoa. 2019. “MassComp, a Lossless Compressor for Mass Spectrometry Data.” BMC Bioinformatics 20 (1): 368. https://doi.org/10.1186/s12859-019-2962-7. Zha, Haihong, Yuping Cai, Yandong Yin, Zhuozhong Wang, Kang Li, and Zheng-Jiang Zhu. 2018. “SWATHtoMRM: Development of High-Coverage Targeted Metabolomics Method Using SWATH Technology for Biomarker Discovery.” Analytical Chemistry 90 (6): 4062–70. https://doi.org/10.1021/acs.analchem.7b05318. Zhao, Fan, Shuai Huang, and Xiaozhe Zhang. 2021. “High Sensitivity and Specificity Feature Detection in Liquid Chromatography–Mass Spectrometry Data: A Deep Learning Framework.” Talanta 222 (January): 121580. https://doi.org/10.1016/j.talanta.2020.121580. Zhu, Xiaochun, Yuping Chen, and Raju Subramanian. 2014. “Comparison of Information-Dependent Acquisition, SWATH, and MSAll Techniques in Metabolite Identification Study Employing Ultrahigh-Performance Liquid Chromatography–Quadrupole Time-of-Flight Mass Spectrometry.” Analytical Chemistry 86 (2): 1202–9. https://doi.org/10.1021/ac403385y. "],["annotation.html", "Chapter 7 Annotation 7.1 Issues in annotation 7.2 Peak misidentification 7.3 Annotation v.s. 
identification 7.4 Molecular Formula Assignment 7.5 Redundant peaks 7.6 MS1 MS2 connection 7.7 MS2 MSn connection 7.8 MS/MS annotation 7.9 Knowledge based annotation 7.10 MS Database for annotation 7.11 Compounds Database", " Chapter 7 Annotation Once you have the peaks table or features table, annotation of the peaks would help you interpret them. Check this review (Domingo-Almenara, Montenegro-Burke, Benton, et al. 2018) or other reviews (Chaleckis et al. 2019; Lai et al. 2018; Nash and Dunn 2019; Mark R. Viant et al. 2017; Allard, Genta-Jouve, and Wolfender 2017; Domingo-Almenara, Montenegro-Burke, Benton, et al. 2018) for detailed notes on annotation. The first review proposed five levels of current computational annotation strategies. Level 1: Peak Grouping: MS pseudospectra extraction based on peak shape similarity and peak abundance correlation. Level 2: Peak Annotation: adducts, neutral losses, isotopes, and other mass relationships based on mass distances. Level 3: Biochemical knowledge based on putative identification, potential biochemical reactions and related statistical analysis. Level 4: Use and integration of tandem MS data based on data dependent/independent acquisition mode or in silico prediction. Level 5: Retention time prediction based on library-available retention index or quantitative structure-retention relationships (QSRR) models. Most of the software is at level 1 or 2. If we only have the structure of a compound, we could predict its ions under different ionization methods. If we have a mass spectrum, we could match it to a database by a similarity analysis. In metabolomics, we only have mass spectra or mass-to-charge ratios. A single mass-to-charge ratio is not enough for identification. That is one bottleneck for annotation. So prediction is always performed on MS/MS data. 7.1 Issues in annotation The major issue in annotation is the redundant peaks from the same metabolite. 
Unlike genomes, peaks or features from peak selection are not independent of each other. Adducts, in-source fragments and isotopes would lead to wrong annotation. A common solution is to use known adducts, neutral losses, molecular multimers or multiply charged ions to compare mass distances. Another issue is the MS/MS database. Only 10% of known metabolites in databases have experimental spectral data. Thus in silico prediction is required. Some works try to bridge the gap between experimental data, theoretical values (from chemical databases like chemspider) and prediction. Here is a nice review about MS/MS prediction (Hufsky, Scheubert, and Böcker 2014). 7.2 Peak misidentification Isomer: Use separation methods such as chromatography, ion mobility MS, or MS/MS. Reversed-phase ion-pairing chromatography and HILIC are useful. Chemical derivatization is another option. Interfering compounds: 20 ppm is the loosest mass accuracy acceptable for HRMS. In-source degradation products 7.3 Annotation v.s. identification According to the definition from the Chemical Analysis Working Group of the Metabolomics Standards Initiative (Lloyd W. Sumner et al. 2007; Mark R. Viant et al. 2017), four levels of confidence could be assigned to identification: Level 1 ‘identified metabolites’ Level 2 ‘Putatively annotated compounds’ Level 3 ‘Putatively characterised compound classes’ Level 4 ‘Unknown’ In practice, data analysis based annotation could reach level 2. For level 1, we need extra methods such as MS/MS, retention time, accurate mass, 2D NMR spectra, and so on to confirm the compounds. However, standards are always required for solid proof. For specific groups of compounds such as PFASs, the communication of confidence levels could be slightly different (Charbonnet et al. 2022). Though MS/MS seems to be a required step for identification, recent studies found that ESI might also generate fragment ions for structure identification (Xue, Guijas, et al. 2020; Xue et al. 
2021, 2023; Bernardo-Bermejo et al. 2023). 7.4 Molecular Formula Assignment Cheminformatics will help for MS annotation. The first task is molecular formula assignment. For a given accurate mass, the formula should be constrained by predefined element types and atom numbers, a mass error window, and rules of chemical bonding, such as the double bond equivalent (DBE) and the nitrogen rule. The nitrogen rule states that an odd nominal molecular mass implies an odd number of nitrogen atoms. This rule should only be used with nominal (integer) masses. The degree of unsaturation or DBE uses rings-plus-double-bonds equivalent (RDBE) values, which should be integers. The elements oxygen and sulphur are not taken into account. Otherwise the molecular formula will not be valid. \\[RDBE = C+Si - 1/2(H+F+Cl+Br+I) + 1/2(N+P)+1 \\] To assign a molecular formula to a mass-to-charge ratio, the Seven Golden Rules (Kind and Fiehn 2007) for heuristic filtering of molecular formulas should be considered: Apply heuristic restrictions for the number of elements during formula generation. This is the table for known compounds: ## Mass.Range.[Da] Library C.max H.max N.max O.max P.max S.max F.max Cl.max ## 1 &lt; 500 DNP 29 72 10 18 4 7 15 8 ## 2 &lt;NA&gt; Wiley 39 72 20 20 9 10 16 10 ## 3 &lt; 1000 DNP 66 126 25 27 6 8 16 11 ## 4 &lt;NA&gt; Wiley 78 126 20 27 9 14 34 12 ## 5 &lt; 2000 DNP 115 236 32 63 6 8 16 11 ## 6 &lt;NA&gt; Wiley 156 180 20 40 9 14 48 12 ## 7 &lt; 3000 DNP 162 208 48 78 6 9 16 11 ## Br.max Si.max ## 1 5 NA ## 2 4 8 ## 3 8 NA ## 4 8 14 ## 5 8 NA ## 6 10 15 ## 7 8 NA Perform LEWIS and SENIOR check. The LEWIS rule demands that molecules consisting of main group elements, especially carbon, nitrogen and oxygen, share electrons in a way that all atoms have completely filled s, p-valence shells (‘octet rule’). 
Senior’s theorem requires three essential conditions for the existence of molecular graphs: the sum of valences or the total number of atoms having odd valences is even; the sum of valences is greater than or equal to twice the maximum valence; the sum of valences is greater than or equal to twice the number of atoms minus 1. Perform isotopic pattern filter. Isotope ratio abundance was included in the algorithm as an additional orthogonal constraint, assuming high quality data acquisitions, specifically sufficient ion statistics and a high signal/noise ratio for the detection of the M+1 and M+2 abundances. For monoisotopic elements (F, Na, P, I) this rule has no impact. The isotope pattern will be useful for brominated or chlorinated small molecules and sulphur-containing peptides. Perform H/C ratio check (hydrogen/carbon ratio). In most cases the hydrogen/carbon ratio does not exceed 3, with rare exceptions such as methylhydrazine (CH6N2). Conversely, the H/C ratio is usually smaller than 2, and should not be less than 0.125 as in the case of tetracyanopyrrole (C8HN5). Perform NOPS ratio check (N, O, P, S/C ratios). 
## Element.ratios Common.range.(covering.99.7%) Extended.range.(covering.99.99%) ## 1 H/C 0.2–3.1 0.1–6 ## 2 F/C 0–1.5 0–6 ## 3 Cl/C 0–0.8 0–2 ## 4 Br/C 0–0.8 0–2 ## 5 N/C 0–1.3 0–4 ## 6 O/C 0–1.2 0–3 ## 7 P/C 0–0.3 0–2 ## 8 S/C 0–0.8 0–3 ## 9 Si/C 0–0.5 0–1 ## Extreme.range.(beyond.99.99%) ## 1 &lt; 0.1 and 6–9 ## 2 &gt; 1.5 ## 3 &gt; 0.8 ## 4 &gt; 0.8 ## 5 &gt; 1.3 ## 6 &gt; 1.2 ## 7 &gt; 0.3 ## 8 &gt; 0.8 ## 9 &gt; 0.5 Perform heuristic HNOPS probability check (H, N, O, P, S/C high probability ratios) df &lt;- data.frame( stringsAsFactors = FALSE, Element.counts = c(&quot;NOPS all &gt; 1&quot;,&quot;NOP all &gt; 3&quot;,&quot;OPS all &gt; 1&quot;, &quot;PSN all &gt; 1&quot;,&quot;NOS all &gt; 6&quot;), Heuristic.Rule = c(&quot;N&lt; 10, O &lt; 20, P &lt; 4, S &lt; 3&quot;, &quot;N &lt; 11, O &lt; 22, P &lt; 6&quot;,&quot;O &lt; 14, P &lt; 3, S &lt; 3&quot;, &quot;P &lt; 3, S &lt; 3, N &lt; 4&quot;,&quot;N &lt; 19 O &lt; 14 S &lt; 8&quot;), DB.examples.for.maximum.values = c(&quot;C15H34N9O8PS, C22H44N4O14P2S2, C24H38N7O19P3S&quot;,&quot;C20H28N10O21P4, C10H18N5O20P5&quot;, &quot;C22H44N4O14P2S2, C16H36N4O4P2S2&quot;, &quot;C22H44N4O14P2S2, C16H36N4O4P2S2&quot;,&quot;C59H64N18O14S7&quot;) ) df ## Element.counts Heuristic.Rule ## 1 NOPS all &gt; 1 N&lt; 10, O &lt; 20, P &lt; 4, S &lt; 3 ## 2 NOP all &gt; 3 N &lt; 11, O &lt; 22, P &lt; 6 ## 3 OPS all &gt; 1 O &lt; 14, P &lt; 3, S &lt; 3 ## 4 PSN all &gt; 1 P &lt; 3, S &lt; 3, N &lt; 4 ## 5 NOS all &gt; 6 N &lt; 19 O &lt; 14 S &lt; 8 ## DB.examples.for.maximum.values ## 1 C15H34N9O8PS, C22H44N4O14P2S2, C24H38N7O19P3S ## 2 C20H28N10O21P4, C10H18N5O20P5 ## 3 C22H44N4O14P2S2, C16H36N4O4P2S2 ## 4 C22H44N4O14P2S2, C16H36N4O4P2S2 ## 5 C59H64N18O14S7 Perform TMS check (for GC-MS if a silylation step is involved). 
For TMS derivatized molecules detected in GC/MS analyses, the rules on element ratio checks and valence tests are hence best applied after TMS groups are subtracted, in a similar manner as adducts need to be first recognized and subtracted in LC/MS analyses. The Seven Golden Rules were built for GC-MS, while the Hydrogen Rearrangement Rules were mainly designed for LC-CID-MS/MS (Tsugawa et al. 2016). Based on extensively curated database records and enthalpy calculations, the “hydrogen rearrangement (HR) rules” extend the even-electron rule for carbon (C) and the heteroatoms oxygen (O), nitrogen (N), phosphorus (P), and sulfur (S). They used high abundance MS/MS peaks that exceeded 10% of their base peaks to identify common features in terms of 4 HR rules for positive mode and 5 HR rules for negative mode. The Seven Golden Rules and Hydrogen Rearrangement Rules might also be captured by statistical models. Nevertheless, such heuristic rules could reduce the search space of possible formulas. molgen generates all structures (connectivity isomers, constitutions) that correspond to a given molecular formula, with optional further restrictions, e.g. presence or absence of particular substructures (Gugisch et al. 2015). mfFinder can predict formulas based on accurate mass (Patiny and Borel 2013). RAMSI is a robust automated mass spectra interpretation and chemical formula calculation method using mixed integer linear programming optimization (Baran and Northen 2013). Here are some other cheminformatics tools, which could be used to assign meaningful formulas or structures for mass spectra. RDKit is open-source cheminformatics software. cdk: The Chemistry Development Kit (CDK) is a scientific, LGPL-ed library for bio- and cheminformatics and computational chemistry written in Java (Guha 2007). Open Babel is a chemical toolbox designed to speak the many languages of chemical data (O’Boyle et al. 2011). 
ClassyFire is a tool for automated chemical classification with a comprehensive, computable taxonomy (Djoumbou Feunang et al. 2016). BUDDY can perform molecular formula discovery via bottom-up MS/MS interrogation (Xing et al. 2023). 7.5 Redundant peaks Full scan mass spectra always contain lots of redundant peaks such as adducts, isotopes, fragments, multiply charged ions and oligomers. Such peaks dominate the features table (Xu, Lu, and Rabinowitz 2015; Sindelar and Patti 2020; Mahieu and Patti 2017). Annotation tools could label those peaks either by a known list or by frequency analysis of the paired mass distances (Ju et al. 2020; Kouřil et al. 2020). 7.5.1 Adducts list You could find an adducts list here from the commonMZ project. 7.5.2 Isotope Here is isotope pattern prediction. 7.5.3 CAMERA CAMERA is a common annotation tool for the xcms workflow (Kuhl et al. 2012). 7.5.4 RAMClustR The software could be found here (C. D. Broeckling et al. 2014; Corey D. Broeckling et al. 2016). The package includes a vignette to follow. 7.5.5 BioCAn BioCAn combines the results from database searches and in silico fragmentation analyses and places these results into a relevant biological context for the sample as captured by a metabolic model (Alden et al. 2017). 7.5.6 mzMatch mzMatch is a modular, open source and platform independent data processing pipeline for metabolomics LC/MS data written in the Java language (Chokkathukalam et al. 2013; Scheltema et al. 2011), and MetAssign is a probabilistic annotation method using a Bayesian clustering approach, which is part of mzMatch (Daly et al. 2014). 7.5.7 xMSannotator The software could be found here (Uppal, Walker, and Jones 2017). 7.5.8 mWise mWise is an algorithm for context-based annotation of liquid chromatography–mass spectrometry features through diffusion in graphs (Barranco-Altirriba et al. 2021). 7.5.9 MAIT You could find source code here (Fernández-Albert et al. 2014). 
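The "known list" strategy for labelling redundant peaks in this section can be sketched in a few lines of base R: compare the pairwise m/z distances of co-eluting features against a short table of expected mass differences. This is only a minimal sketch, not the method of any package listed here. The feature table (the values correspond to protonated, sodiated and potassiated glucose plus one 13C isotope peak), the three listed differences and both tolerances are assumptions chosen for the example.

```r
# Hypothetical co-eluting features in positive mode
features <- data.frame(
  id = c("F1", "F2", "F3", "F4"),
  mz = c(181.0707, 182.0740, 203.0526, 219.0266),
  rt = c(65.2, 65.2, 65.3, 65.1)   # seconds
)
features <- features[order(features$mz), ]

# Expected mass differences in Da (a short list for illustration only)
known_diff <- c(
  "13C isotope"       = 1.003355,
  "[M+Na]+ vs [M+H]+" = 21.981944,
  "[M+K]+ vs [M+H]+"  = 37.955882
)

tol_mz <- 0.002  # Da, assumed mass tolerance
tol_rt <- 0.5    # s, assumed co-elution window

# All feature pairs, lighter ion first
idx <- t(combn(nrow(features), 2))
res <- data.frame(
  light    = features$id[idx[, 1]],
  heavy    = features$id[idx[, 2]],
  delta_mz = features$mz[idx[, 2]] - features$mz[idx[, 1]],
  delta_rt = abs(features$rt[idx[, 2]] - features$rt[idx[, 1]])
)

# Label pairs whose mass distance matches a known difference
res$relation <- NA_character_
for (k in seq_along(known_diff)) {
  hit <- abs(res$delta_mz - known_diff[[k]]) <= tol_mz & res$delta_rt <= tol_rt
  res$relation[hit] <- names(known_diff)[k]
}

annotated <- res[!is.na(res$relation), ]
annotated
```

In this toy table the three heavier features are each linked to F1, so only F1 would be kept as the representative ion. A frequency-based approach would instead derive the list of differences from the data.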
7.5.10 pmd
Paired Mass Distance (PMD) analysis for GC/LC-MS based nontarget analysis to remove redundant peaks (M. Yu, Olkowicz, and Pawliszyn 2019).
7.5.11 nontarget
nontarget performs isotope and adduct peak grouping as well as homologue series detection (Loos and Singer 2017).
7.5.12 Binner
Binner performs deep annotation of untargeted LC-MS metabolomics data (Kachman et al. 2020).
7.5.13 mz.unity
mz.unity detects and explores complex relationships in accurate-mass mass spectrometry data; the source code is available online (Mahieu et al. 2016).
7.5.14 MS-FLO
MS-FLO is a tool to minimize false positive peak reports in untargeted liquid chromatography–mass spectroscopy (LC-MS) data processing (DeFelice et al. 2017).
7.5.15 CliqueMS
CliqueMS is a computational tool for annotating in-source metabolite ions from LC-MS untargeted metabolomics data based on a coelution similarity network (Senan et al. 2019).
7.5.16 InterpretMSSpectrum
This package annotates and interprets deconvoluted mass spectra (mass*intensity pairs) from high-resolution mass spectrometry devices. It can be used to find molecular ions for GC-MS (Jaeger et al. 2016).
7.5.17 NetID
NetID is a global network optimization approach to annotate untargeted LC-MS metabolomics data (L. Chen et al. 2021).
7.5.18 ISfrag
ISFrag performs de novo recognition of in-source fragments for liquid chromatography–mass spectrometry data (J. Guo et al. 2021).
7.5.19 FastEI
FastEI provides ultra-fast and accurate electron ionization mass spectrum matching for compound identification with a million-scale in-silico library (Qiong Yang et al. 2023).
7.6 MS1 MS2 connection
7.6.1 PMDDA
PMDDA is a three-step workflow: MS1 full-scan peak picking, selection of precursor ions for MS2 from the MS1 data with the GlobalStd algorithm, and collection of the MS2 data followed by annotation with GNPS (M. Yu, Dolios, and Petrick 2022).
7.6.2 HERMES
HERMES is a molecular-formula-oriented method to target the metabolome (Giné et al. 2021).
7.6.3 dpDDA
Similar work uses an inclusion list of differential and preidentified ions (dpDDA) (Y. Zhang et al. 2023).
7.7 MS2 MSn connection
A computational approach can generate a database of high-resolution MSn spectra by converting existing low-resolution MSn spectra using complementary high-resolution MS2 spectra generated by beam-type CAD (Lieng et al. 2023).
7.8 MS/MS annotation
MS/MS annotation is performed to generate a matching score against library spectra. The most popular matching algorithm is dot product similarity. A recent study found that the spectral entropy algorithm outperformed dot product similarity (Y. Li et al. 2021; Y. Li and Fiehn 2023). A comparison of cosine, modified cosine, and neutral-loss-based spectrum alignment showed that modified cosine similarity outperformed neutral loss matching and cosine similarity in all cases. The performance of MS/MS spectrum alignment depends on the location and type of the modification, as well as the chemical compound class of the fragmented molecules (Bittremieux et al. 2022). Another study proposed weighting low-intensity MS/MS ions and m/z frequency for spectral library annotation, which helps to annotate unknown spectra (Engler Hart et al. 2024). BLINK enables ultrafast tandem mass spectrometry cosine similarity scoring (Harwood et al. 2023). MS2Query enables reliable and scalable MS2 mass spectra-based analogue search by machine learning (de Jonge et al. 2023). However, a spectroscopic test suggests that fragment ion structure annotations in MS/MS libraries are frequently incorrect (van Tetering et al. 2024). Machine learning can also be applied to MS2 annotation (Codrean et al. 2023; H. Guo et al. 2023; Bilbao et al. 2023). See the Workflow section for popular platforms. Some stand-alone annotation software is listed here:
7.8.1 Matchms
Matchms is an open-source Python package to import, process, clean, and compare mass spectrometry data (MS/MS).
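As a minimal sketch of the dot product (cosine) score described in 7.8, the plain-Python function below greedily pairs fragment peaks within an m/z tolerance and normalizes the summed intensity products. It is written from scratch for illustration and is not the matchms API. The function name, the tolerance, and the example spectra are assumptions, and greedy pairing is only an approximation to an optimal peak assignment.

```python
import math


def cosine_score(spec_a, spec_b, tol=0.01):
    """Greedy cosine similarity between two spectra given as lists of (mz, intensity)."""
    # Collect every candidate peak pair that falls within the m/z tolerance
    pairs = []
    for i, (mz_a, int_a) in enumerate(spec_a):
        for j, (mz_b, int_b) in enumerate(spec_b):
            if abs(mz_a - mz_b) <= tol:
                pairs.append((int_a * int_b, i, j))

    # Accept the largest intensity products first, using each peak at most once
    pairs.sort(reverse=True)
    used_a, used_b = set(), set()
    dot = 0.0
    for product, i, j in pairs:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            dot += product

    norm_a = math.sqrt(sum(intensity ** 2 for _, intensity in spec_a))
    norm_b = math.sqrt(sum(intensity ** 2 for _, intensity in spec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical library and query spectra
    library = [(85.03, 30.0), (127.04, 100.0), (145.05, 60.0)]
    query = [(85.03, 25.0), (127.04, 100.0), (163.06, 40.0)]
    print(round(cosine_score(library, query), 3))
```

Published scores usually add intensity and m/z weighting, and modified cosine additionally allows peaks to match after shifting by the precursor mass difference. Both are left out here to keep the core idea visible.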
Matchms supports an easy-to-follow, easy-to-reproduce workflow from raw mass spectra to pre- and post-processed spectral data. Spectral data can be imported from common formats such as mzML, mzXML, msp, metabolomics-USI, MGF, or json (e.g. GNPS-style json files). Matchms then provides filters for metadata cleaning and checking, as well as for basic peak filtering. Finally, matchms was built to import and apply different similarity measures to compare large numbers of spectra. This includes common cosine scores, but it can also easily be extended with custom measures. Examples of spectrum similarity measures that were designed to work in matchms are Spec2Vec and MS2DeepScore (Huber et al. 2020).
7.8.2 MetDNA
MetDNA is the metabolic reaction network-based recursive metabolite annotation for untargeted metabolomics (Shen et al. 2019).
7.8.3 MetFusion
MetFusion is a Java-based integration of compound identification strategies (Gerlich and Neumann 2013).
7.8.4 MS2Analyzer
MS2Analyzer can annotate small molecule substructures from accurate tandem mass spectra (Ma et al. 2014).
7.8.5 MetFrag
MetFrag can be used for in silico prediction and matching of MS/MS data (Ruttkies et al. 2016; Wolf et al. 2010).
7.8.6 CFM-ID
CFM-ID uses Metlin’s data to make predictions, and a 4.0 version is available (Allen et al. 2014).
7.8.7 LC-MS2Struct
A machine learning framework for structural annotation of small-molecule data arising from liquid chromatography–tandem mass spectrometry (LC-MS2) measurements (Bach, Schymanski, and Rousu 2022).
7.8.8 LipidFrag
LipidFrag can be used for in silico prediction and matching of lipid-related MS/MS data (Witting et al. 2017).
7.8.9 Lipidmatch
LipidMatch performs in silico lipid mass spectrum search (Koelmel et al. 2017).
7.8.10 BarCoding
Bar coding selects mass-to-charge regions containing the most informative metabolite fragments and designates them as bins.
Each metabolite fragmentation pattern is then translated into a binary code by assigning 1’s to bins containing fragments and 0’s to bins without fragments. Such coding annotation could be used for MRM data (Spalding et al. 2016).
7.8.11 iMet
This online application is a network-based computation method for annotation (Aguilar-Mogas et al. 2017).
7.8.12 DNMS2Purifier
DNMS2Purifier is an XGBoost-based MS/MS spectral cleaning tool using intensity ratio fluctuation, appearance rate, and relative intensity (T. Zhao et al. 2023).
7.8.13 IDSL.CSA
IDSL.CSA performs composite spectra analysis for chemical annotation of untargeted metabolomics datasets (Baygi, Kumar, and Barupal 2023).
7.9 Knowledge based annotation
7.9.1 Experimental design
Physicochemical properties can be used for annotation with a specific experimental design (Abrahamsson et al. 2023).
7.9.2 Chromatographic retention-related criteria
For targeted analysis, chromatographic retention time can serve as the qualitative method for certain compounds with a carefully designed pre-treatment. For untargeted analysis, such information can also be used for annotation. GC-MS usually uses a retention index for a given column, while LC-MS retention is often not as reproducible as in GC. Such methods can be traced back to quantitative structure-retention relationship (QSRR) models or linear solvation energy relationships (LSER). However, these methods need as many molecular descriptors as possible. For untargeted analysis, retention time and mass-to-charge ratio alone cannot generate enough molecular descriptors to build QSPR models. In this case, such criteria might be useful for validation rather than annotation, unless more information, such as ion mobility, can be measured or extracted for the unknown compounds.
Retip: retention time prediction for compound annotation in untargeted metabolomics (Bonini et al. 2020).
The Java-based MolFind can annotate unknown chemical structures by prediction based on RI, ECOM50, drift time, and CID spectra (Menikarachchi et al. 2012).
For-ident can give a score for identification with the help of logD (relative retention time) and/or MS/MS.
RT-Transformer: retention time prediction for metabolite annotation to assist in metabolite identification. It is a deep neural network model coupling a graph attention network with a 1D-Transformer, and it can predict retention times under any chromatographic method.
An RT prediction model (random forest) for unified-HILIC/AEX/HRMS/MS enables the comprehensive structural annotation of polar metabolites (Torigoe et al. 2024).
7.9.3 ProbMetab
ProbMetab provides probability ranking of candidate compounds assigned to masses, with the prior assumption of connected sample and additional previous and spectral information modeled by the user. The source code is available online (Ricardo R. Silva et al. 2014).
7.9.4 MI-Pack
MI-Pack is available as Python software (Weber and Viant 2010).
7.9.5 MetExpert
MetExpert is an expert system to assist users with limited expertise in informatics to interpret GC-MS data for metabolite identification without querying spectral databases (Qiu, Lei, and Sumner 2018).
7.9.6 MycompoundID
MyCompoundID can be used to search known and unknown metabolites online (Liang Li et al. 2013).
7.9.7 MetFamily
MetFamily is a Shiny app for MS and MS/MS data annotation (Treutler et al. 2016).
7.9.8 CoA-Blast
For certain groups of compounds such as acyl-CoAs, a class-level in silico database can be built to annotate compounds with a certain structure (Keshet et al. 2022).
7.9.9 KGMN
The knowledge-guided multi-layer network (KGMN) integrates three-layer networks, including a knowledge-based metabolic reaction network, a knowledge-guided MS/MS similarity network, and a global peak correlation network, for annotation (Z. Zhou et al. 2022).
7.9.10 CCMN
CCMNs are constructed from metabolic features that share classes, which facilitates structure or class annotation of completely unknown metabolic features (X. Zhang et al. 2024).
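For the retention index criterion in 7.9.2, a linear (temperature-programmed) retention index can be computed by interpolating an analyte's retention time between the two n-alkane standards that bracket it. The sketch below assumes the van den Dool and Kratz form of the index, RI = 100 * (n + (N - n) * (t - t_n) / (t_N - t_n)), where n and N are the carbon numbers of the bracketing alkanes. The function name and the example retention times are hypothetical.

```python
from bisect import bisect_right


def linear_retention_index(rt, alkanes):
    """Interpolate a linear retention index from an n-alkane ladder.

    rt: retention time of the analyte.
    alkanes: list of (carbon_number, retention_time) for the n-alkane standards,
             measured on the same column and temperature program.
    """
    if len(alkanes) < 2:
        raise ValueError("at least two alkane standards are required")
    ladder = sorted(alkanes, key=lambda item: item[1])
    times = [t for _, t in ladder]
    if not times[0] <= rt <= times[-1]:
        raise ValueError("retention time lies outside the alkane ladder")

    # Index of the first alkane eluting after the analyte (clamped to the last one)
    upper = min(bisect_right(times, rt), len(ladder) - 1)
    (n_low, t_low), (n_high, t_high) = ladder[upper - 1], ladder[upper]
    return 100.0 * (n_low + (n_high - n_low) * (rt - t_low) / (t_high - t_low))


if __name__ == "__main__":
    # Hypothetical ladder: C10, C11 and C12 retention times in seconds
    ladder = [(10, 300.0), (11, 360.0), (12, 430.0)]
    print(linear_retention_index(330.0, ladder))
```

Because the index is anchored to standards run on the same system, it transfers between instruments far better than raw retention time, which is why it works well as a GC-MS annotation criterion.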
7.10 MS Database for annotation
7.10.1 MS
Fiehn Lab
NIST: not free
Spectral Database for Organic Compounds, SDBS
MINE is an open access database of computationally predicted enzyme promiscuity products for untargeted metabolomics. The annotation would be accurate for general compound databases.
7.10.2 MS/MS
LibGen can generate high quality spectral libraries of natural products for EAD-, UVPD-, and HCD-high resolution mass spectrometers (Kong et al. 2023).
MoNA is a platform that collects other open source databases.
MassBank
GNPS uses the internal correlations in the data and performs network analysis at the peak level, instead of at the level of annotated compounds, to annotate the data.
ReSpect: phytochemicals
Metlin is another useful online application for annotation (Guijas et al. 2018).
LipidBlast: in silico prediction
Lipid Maps
MZcloud
NIST: not free
GMDB is a multistage tandem mass spectral database using a variety of structurally defined glycans.
HMDB is a freely available electronic database containing detailed information about small molecule metabolites found in the human body.
KEGG is a collection of small molecules, biopolymers, and other chemical substances that are relevant to biological systems.
7.11 Compounds Database
PubChem is an open chemistry database at the National Institutes of Health (NIH).
Chemspider is a free chemical structure database providing fast text and structure search access to over 67 million structures from hundreds of data sources.
ChEBI is a freely available dictionary of molecular entities focused on ‘small’ chemical compounds.
RefMet: a reference list of metabolite names.
CAS: the largest substance database.
CompTox: a compounds, exposure, and toxicity database. Related data are also available.
T3DB is a unique bioinformatics resource that combines detailed toxin data with comprehensive toxin target information.
FooDB is the world’s largest and most comprehensive resource on food constituents, chemistry and biology.
Phenol explorer is the first comprehensive database on polyphenol content in foods.
Drugbank is a unique bioinformatics and cheminformatics resource that combines detailed drug data with comprehensive drug target information.
LMDB is a freely available electronic database containing detailed information about small molecule metabolites found in different livestock species.
HPV: High Production Volume Information System
There are also metabolite atlases for specific domains.
PMhub 1.0: a comprehensive plant metabolome database (Z. Tian et al. 2023)
Atlas of Circadian Metabolism (Dyar et al. 2018)
PlantMAT: Excel-library-based prediction for plant metabolites (Qiu et al. 2016).
References
Abrahamsson, Dimitri, Christopher L. Brueck, Carsten Prasse, Dimitra A. Lambropoulou, Lelouda-Athanasia Koronaiou, Miaomiao Wang, June-Soo Park, and Tracey J. Woodruff. 2023. “Extracting Structural Information from Physicochemical Property Measurements Using Machine Learning-A New Approach for Structure Elucidation in Non-targeted Analysis.” Environmental Science &amp; Technology, September. https://doi.org/10.1021/acs.est.3c03003. Aguilar-Mogas, Antoni, Marta Sales-Pardo, Miriam Navarro, Roger Guimerà, and Oscar Yanes. 2017. “iMet: A Network-Based Computational Tool To Assist in the Annotation of Metabolites from Tandem Mass Spectra.” Analytical Chemistry 89 (6): 3474–82. https://doi.org/10.1021/acs.analchem.6b04512. Alden, Nicholas, Smitha Krishnan, Vladimir Porokhin, Ravali Raju, Kyle McElearney, Alan Gilbert, and Kyongbum Lee. 2017. “Biologically Consistent Annotation of Metabolomics Data.” Analytical Chemistry, November. https://doi.org/10.1021/acs.analchem.7b02162. Allard, Pierre-Marie, Grégory Genta-Jouve, and Jean-Luc Wolfender. 2017. “Deep Metabolome Annotation in Natural Products Research: Towards a Virtuous Cycle in Metabolite Identification.” Current Opinion in Chemical Biology, Omics, 36 (February): 40–49. https://doi.org/10.1016/j.cbpa.2016.12.022. 
Allen, Felicity, Allison Pon, Michael Wilson, Russ Greiner, and David Wishart. 2014. “CFM-ID: A Web Server for Annotation, Spectrum Prediction and Metabolite Identification from Tandem Mass Spectra.” Nucleic Acids Research 42 (W1): W94–99. https://doi.org/10.1093/nar/gku436. Bach, Eric, Emma L. Schymanski, and Juho Rousu. 2022. “Joint Structural Annotation of Small Molecules Using Liquid Chromatography Retention Order and Tandem Mass Spectrometry Data.” Nature Machine Intelligence 4 (12): 1224–37. https://doi.org/10.1038/s42256-022-00577-2. Baran, Richard, and Trent R. Northen. 2013. “Robust Automated Mass Spectra Interpretation and Chemical Formula Calculation Using Mixed Integer Linear Programming.” Analytical Chemistry 85 (20): 9777–84. https://doi.org/10.1021/ac402180c. Barranco-Altirriba, Maria, Pol Solà-Santos, Sergio Picart-Armada, Samir Kanaan-Izquierdo, Jordi Fonollosa, and Alexandre Perera-Lluna. 2021. “mWISE: An Algorithm for Context-Based Annotation of Liquid Chromatography–Mass Spectrometry Features Through Diffusion in Graphs.” Analytical Chemistry 93 (31): 10772–78. https://doi.org/10.1021/acs.analchem.1c00238. Baygi, Sadjad Fakouri, Yashwant Kumar, and Dinesh Kumar Barupal. 2023. “IDSL.CSA: Composite Spectra Analysis for Chemical Annotation of Untargeted Metabolomics Datasets.” Analytical Chemistry, June. https://doi.org/10.1021/acs.analchem.3c00376. Bernardo-Bermejo, Samuel, Jingchuan Xue, Linh Hoang, Elizabeth Billings, Bill Webb, M. Willy Honders, Sanne Venneker, et al. 2023. “Quantitative Multiple Fragment Monitoring with Enhanced in-Source Fragmentation/Annotation Mass Spectrometry.” Nature Protocols, February, 1–20. https://doi.org/10.1038/s41596-023-00803-0. Bilbao, Aivett, Nathalie Munoz, Joonhoon Kim, Daniel J. Orton, Yuqian Gao, Kunal Poorey, Kyle R. Pomraning, et al. 2023. 
“PeakDecoder Enables Machine Learning-Based Metabolite Annotation and Accurate Profiling in Multidimensional Mass Spectrometry Measurements.” Nature Communications 14 (1): 2461. https://doi.org/10.1038/s41467-023-37031-9. Bittremieux, Wout, Robin Schmid, Florian Huber, Justin J. J. van der Hooft, Mingxun Wang, and Pieter C. Dorrestein. 2022. “Comparison of Cosine, Modified Cosine, and Neutral Loss Based Spectrum Alignment For Discovery of Structurally Related Molecules.” Journal of the American Society for Mass Spectrometry 33 (9): 1733–44. https://doi.org/10.1021/jasms.2c00153. Bonini, Paolo, Tobias Kind, Hiroshi Tsugawa, Dinesh Kumar Barupal, and Oliver Fiehn. 2020. “Retip: Retention Time Prediction for Compound Annotation in Untargeted Metabolomics.” Analytical Chemistry 92 (11): 7515–22. https://doi.org/10.1021/acs.analchem.9b05765. Broeckling, C. D., F. A. Afsar, S. Neumann, A. Ben-Hur, and J. E. Prenni. 2014. “RAMClust: A Novel Feature Clustering Method Enables Spectral-Matching-Based Annotation for Metabolomics Data.” Analytical Chemistry 86 (14): 6812–17. https://doi.org/10.1021/ac501530d. Broeckling, Corey D., Andrea Ganna, Mark Layer, Kevin Brown, Ben Sutton, Erik Ingelsson, Graham Peers, and Jessica E. Prenni. 2016. “Enabling Efficient and Confident Annotation of LC-MS Metabolomics Data Through MS1 Spectrum and Time Prediction.” Analytical Chemistry 88 (18): 9226–34. https://doi.org/10.1021/acs.analchem.6b02479. Chaleckis, Romanas, Isabel Meister, Pei Zhang, and Craig E Wheelock. 2019. “Challenges, Progress and Promises of Metabolite Annotation for LC–MS-based Metabolomics.” Current Opinion in Biotechnology, Analytical Biotechnology, 55 (February): 44–50. https://doi.org/10.1016/j.copbio.2018.07.010. Charbonnet, Joseph A., Carrie A. McDonough, Feng Xiao, Trever Schwichtenberg, Dunping Cao, Sarit Kaserzon, Kevin V. Thomas, et al. 2022. 
“Communicating Confidence of Per- and Polyfluoroalkyl Substance Identification via High-Resolution Mass Spectrometry.” Environmental Science &amp; Technology Letters, May. https://doi.org/10.1021/acs.estlett.2c00206. Chen, Li, Wenyun Lu, Lin Wang, Xi Xing, Ziyang Chen, Xin Teng, Xianfeng Zeng, et al. 2021. “Metabolite Discovery Through Global Annotation of Untargeted Metabolomics Data.” Nature Methods 18 (11): 1377–85. https://doi.org/10.1038/s41592-021-01303-3. Chokkathukalam, Achuthanunni, Andris Jankevics, Darren J. Creek, Fiona Achcar, Michael P. Barrett, and Rainer Breitling. 2013. “mzMatch–ISO: An R Tool for the Annotation and Relative Quantification of Isotope-Labelled Mass Spectrometry Data.” Bioinformatics 29 (2): 281–83. https://doi.org/10.1093/bioinformatics/bts674. Codrean, S., B. Kruit, N. Meekel, D. Vughs, and F. Béen. 2023. “Predicting the Diagnostic Information of Tandem Mass Spectra of Environmentally Relevant Compounds Using Machine Learning.” Analytical Chemistry, October. https://doi.org/10.1021/acs.analchem.3c03470. Daly, Rónán, Simon Rogers, Joe Wandy, Andris Jankevics, Karl E. V. Burgess, and Rainer Breitling. 2014. “MetAssign: Probabilistic Annotation of Metabolites from LC–MS Data Using a Bayesian Clustering Approach.” Bioinformatics 30 (19): 2764–71. https://doi.org/10.1093/bioinformatics/btu370. de Jonge, Niek F., Joris J. R. Louwen, Elena Chekmeneva, Stephane Camuzeaux, Femke J. Vermeir, Robert S. Jansen, Florian Huber, and Justin J. J. van der Hooft. 2023. “MS2Query: Reliable and Scalable MS2 Mass Spectra-Based Analogue Search.” Nature Communications 14 (1): 1752. https://doi.org/10.1038/s41467-023-37446-4. DeFelice, Brian C., Sajjan Singh Mehta, Stephanie Samra, Tomáš Čajka, Benjamin Wancewicz, Johannes F. Fahrmann, and Oliver Fiehn. 2017. 
“Mass Spectral Feature List Optimizer (MS-FLO): A Tool To Minimize False Positive Peak Reports in Untargeted Liquid Chromatography–Mass Spectroscopy (LC-MS) Data Processing.” Analytical Chemistry 89 (6): 3250–55. https://doi.org/10.1021/acs.analchem.6b04372. Djoumbou Feunang, Yannick, Roman Eisner, Craig Knox, Leonid Chepelev, Janna Hastings, Gareth Owen, Eoin Fahy, et al. 2016. “ClassyFire: Automated Chemical Classification with a Comprehensive, Computable Taxonomy.” Journal of Cheminformatics 8 (1): 61. https://doi.org/10.1186/s13321-016-0174-y. Domingo-Almenara, Xavier, J. Rafael Montenegro-Burke, H. Paul Benton, and Gary Siuzdak. 2018. “Annotation: A Computational Solution for Streamlining Metabolomics Analysis.” Analytical Chemistry 90 (1): 480–89. https://doi.org/10.1021/acs.analchem.7b03929. Dyar, Kenneth A., Dominik Lutter, Anna Artati, Nicholas J. Ceglia, Yu Liu, Danny Armenta, Martin Jastroch, et al. 2018. “Atlas of Circadian Metabolism Reveals System-wide Coordination and Communication Between Clocks.” Cell 174 (6): 1571–1585.e11. https://doi.org/10.1016/j.cell.2018.08.042. Engler Hart, Chloe, Tobias Kind, Pieter C. Dorrestein, David Healey, and Daniel Domingo-Fernández. 2024. “Weighting Low-Intensity MS/MS Ions and m/z Frequency for Spectral Library Annotation.” Journal of the American Society for Mass Spectrometry 35 (2): 266–74. https://doi.org/10.1021/jasms.3c00353. Fernández-Albert, Francesc, Rafael Llorach, Cristina Andrés-Lacueva, and Alexandre Perera. 2014. “An R Package to Analyse LC/MS Metabolomic Data: MAIT (Metabolite Automatic Identification Toolkit).” Bioinformatics 30 (13): 1937–39. https://doi.org/10.1093/bioinformatics/btu136. Gerlich, Michael, and Steffen Neumann. 2013. “MetFusion: Integration of Compound Identification Strategies.” Journal of Mass Spectrometry 48 (3): 291–98. https://doi.org/10.1002/jms.3123. Giné, Roger, Jordi Capellades, Josep M. 
Badia, Dennis Vughs, Michaela Schwaiger-Haber, Theodore Alexandrov, Maria Vinaixa, Andrea M. Brunner, Gary J. Patti, and Oscar Yanes. 2021. “HERMES: A Molecular-Formula-Oriented Method to Target the Metabolome.” Nature Methods 18 (11): 1370–76. https://doi.org/10.1038/s41592-021-01307-z. Gugisch, Ralf, Adalbert Kerber, Axel Kohnert, Reinhard Laue, Markus Meringer, Christoph Rücker, and Alfred Wassermann. 2015. “Chapter 6 - MOLGEN 5.0, A Molecular Structure Generator.” In Advances in Mathematical Chemistry and Applications, edited by Subhash C. Basak, Guillermo Restrepo, and José L. Villaveces, 113–38. Bentham Science Publishers. https://doi.org/10.1016/B978-1-68108-198-4.50006-0. Guha, Rajarshi. 2007. “Chemical Informatics Functionality in R.” Journal of Statistical Software 18 (1): 1–16. https://doi.org/10.18637/jss.v018.i05. Guijas, Carlos, J. Rafael Montenegro-Burke, Xavier Domingo-Almenara, Amelia Palermo, Benedikt Warth, Gerrit Hermann, Gunda Koellensperger, et al. 2018. “METLIN: A Technology Platform for Identifying Knowns and Unknowns.” Analytical Chemistry 90 (5): 3156–64. https://doi.org/10.1021/acs.analchem.7b04424. Guo, Hao, Kebing Xue, Haiming Sun, Weihao Jiang, and Shiliang Pu. 2023. “Contrastive Learning-Based Embedder for the Representation of Tandem Mass Spectra.” Analytical Chemistry, May. https://doi.org/10.1021/acs.analchem.3c00260. Guo, Jian, Sam Shen, Shipei Xing, Huaxu Yu, and Tao Huan. 2021. “ISFrag: De Novo Recognition of In-Source Fragments for Liquid Chromatography–Mass Spectrometry Data.” Analytical Chemistry, July. https://doi.org/10.1021/acs.analchem.1c01644. Harwood, Thomas V., Daniel G. C. Treen, Mingxun Wang, Wibe de Jong, Trent R. Northen, and Benjamin P. Bowen. 2023. “BLINK Enables Ultrafast Tandem Mass Spectrometry Cosine Similarity Scoring.” Scientific Reports 13 (1): 13462. https://doi.org/10.1038/s41598-023-40496-9. 
Huber, Florian, Stefan Verhoeven, Christiaan Meijer, Hanno Spreeuw, Efraín Manuel Villanueva Castilla, Cunliang Geng, Justin J. j van der Hooft, et al. 2020. “Matchms - Processing and Similarity Evaluation of Mass Spectrometry Data.” Journal of Open Source Software 5 (52): 2411. https://doi.org/10.21105/joss.02411. Hufsky, Franziska, Kerstin Scheubert, and Sebastian Böcker. 2014. “Computational Mass Spectrometry for Small-Molecule Fragmentation.” TrAC Trends in Analytical Chemistry 53 (January): 41–48. https://doi.org/10.1016/j.trac.2013.09.008. Jaeger, Carsten, Friederike Hoffmann, Clemens A. Schmitt, and Jan Lisec. 2016. “Automated Annotation and Evaluation of In-Source Mass Spectra in GC/Atmospheric Pressure Chemical Ionization-MS-Based Metabolomics.” Analytical Chemistry 88 (19): 9386–90. https://doi.org/10.1021/acs.analchem.6b02743. Ju, Ran, Xinyu Liu, Fujian Zheng, Xinjie Zhao, Xin Lu, Xiaohui Lin, Zhongda Zeng, and Guowang Xu. 2020. “A Graph Density-Based Strategy for Features Fusion from Different Peak Extract Software to Achieve More Metabolites in Metabolic Profiling from High-Resolution Mass Spectrometry.” Analytica Chimica Acta 1139 (December): 8–14. https://doi.org/10.1016/j.aca.2020.09.029. Kachman, Maureen, Hani Habra, William Duren, Janis Wigginton, Peter Sajjakulnukit, George Michailidis, Charles Burant, and Alla Karnovsky. 2020. “Deep Annotation of Untargeted LC-MS Metabolomics Data with Binner.” Bioinformatics 36 (6): 1801–6. https://doi.org/10.1093/bioinformatics/btz798. Keshet, Uri, Tobias Kind, Xinchen Lu, Sarita Devi, and Oliver Fiehn. 2022. “Acyl-CoA Identification in Mouse Liver Samples Using the In Silico CoA-Blast Tandem Mass Spectral Library.” Analytical Chemistry 94 (6): 2732–39. https://doi.org/10.1021/acs.analchem.1c03272. Kind, Tobias, and Oliver Fiehn. 2007. “Seven Golden Rules for Heuristic Filtering of Molecular Formulas Obtained by Accurate Mass Spectrometry.” BMC Bioinformatics 8 (1): 105. 
https://doi.org/10.1186/1471-2105-8-105. Koelmel, Jeremy P., Nicholas M. Kroeger, Candice Z. Ulmer, John A. Bowden, Rainey E. Patterson, Jason A. Cochran, Christopher W. W. Beecher, Timothy J. Garrett, and Richard A. Yost. 2017. “LipidMatch: An Automated Workflow for Rule-Based Lipid Identification Using Untargeted High-Resolution Tandem Mass Spectrometry Data.” BMC Bioinformatics 18 (July): 331. https://doi.org/10.1186/s12859-017-1744-3. Kong, Fanzhou, Uri Keshet, Tong Shen, Elys Rodriguez, and Oliver Fiehn. 2023. “LibGen: Generating High Quality Spectral Libraries of Natural Products for EAD-, UVPD-, and HCD-High Resolution Mass Spectrometers.” Analytical Chemistry, November. https://doi.org/10.1021/acs.analchem.3c02263. Kouřil, Štěpán, Julie de Sousa, Jan Václavík, David Friedecký, and Tomáš Adam. 2020. “CROP: Correlation-Based Reduction of Feature Multiplicities in Untargeted Metabolomic Data.” Bioinformatics 36 (9): 2941–42. https://doi.org/10.1093/bioinformatics/btaa012. Kuhl, Carsten, Ralf Tautenhahn, Christoph Böttcher, Tony R. Larson, and Steffen Neumann. 2012. “CAMERA: An Integrated Strategy for Compound Spectra Extraction and Annotation of Liquid Chromatography/Mass Spectrometry Data Sets.” Analytical Chemistry 84 (1): 283–89. https://doi.org/10.1021/ac202450g. Lai, Zijuan, Hiroshi Tsugawa, Gert Wohlgemuth, Sajjan Mehta, Matthew Mueller, Yuxuan Zheng, Atsushi Ogiwara, et al. 2018. “Identifying Metabolites by Integrating Metabolome Databases with Mass Spectrometry Cheminformatics.” Nature Methods 15 (1): 53–56. https://doi.org/10.1038/nmeth.4512. Li, Liang, Ronghong Li, Jianjun Zhou, Azeret Zuniga, Avalyn E. Stanislaus, Yiman Wu, Tao Huan, et al. 2013. “MyCompoundID: Using an Evidence-Based Metabolome Library for Metabolite Identification.” Analytical Chemistry 85 (6): 3401–8. https://doi.org/10.1021/ac400099b. Li, Yuanyue, and Oliver Fiehn. 2023. “Flash Entropy Search to Query All Mass Spectral Libraries in Real Time.” Nature Methods 20 (10): 1475–78. 
https://doi.org/10.1038/s41592-023-02012-9. Li, Yuanyue, Tobias Kind, Jacob Folz, Arpana Vaniya, Sajjan Singh Mehta, and Oliver Fiehn. 2021. “Spectral Entropy Outperforms MS/MS Dot Product Similarity for Small-Molecule Compound Identification.” Nature Methods 18 (12): 1524–31. https://doi.org/10.1038/s41592-021-01331-z. Lieng, Brandon Y., Andrew T. Quaile, Xavier Domingo-Almenara, Hannes L. Röst, and J. Rafael Montenegro-Burke. 2023. “Computational Expansion of High-Resolution-MSn Spectral Libraries.” Analytical Chemistry, November. https://doi.org/10.1021/acs.analchem.3c03343. Loos, Martin, and Heinz Singer. 2017. “Nontargeted Homologue Series Extraction from Hyphenated High Resolution Mass Spectrometry Data.” Journal of Cheminformatics 9 (February). https://doi.org/10.1186/s13321-017-0197-z. Ma, Yan, Tobias Kind, Dawei Yang, Carlos Leon, and Oliver Fiehn. 2014. “MS2Analyzer: A Software for Small Molecule Substructure Annotations from Accurate Tandem Mass Spectra.” Analytical Chemistry 86 (21): 10724–31. https://doi.org/10.1021/ac502818e. Mahieu, Nathaniel G., and Gary J. Patti. 2017. “Systems-Level Annotation of a Metabolomics Data Set Reduces 25 000 Features to Fewer Than 1000 Unique Metabolites.” Analytical Chemistry 89 (19): 10397–406. https://doi.org/10.1021/acs.analchem.7b02380. Mahieu, Nathaniel G., Jonathan L. Spalding, Susan J. Gelman, and Gary J. Patti. 2016. “Defining and Detecting Complex Peak Relationships in Mass Spectral Data: The Mz.unity Algorithm.” Analytical Chemistry 88 (18): 9037–46. https://doi.org/10.1021/acs.analchem.6b01702. Menikarachchi, Lochana C., Shannon Cawley, Dennis W. Hill, L. Mark Hall, Lowell Hall, Steven Lai, Janine Wilder, and David F. Grant. 2012. “MolFind: A Software Package Enabling HPLC/MS-Based Identification of Unknown Chemical Structures.” Analytical Chemistry 84 (21): 9388–94. https://doi.org/10.1021/ac302048x. Nash, William J., and Warwick B. Dunn. 2019. 
“From Mass to Metabolite in Human Untargeted Metabolomics: Recent Advances in Annotation of Metabolites Applying Liquid Chromatography-Mass Spectrometry Data.” TrAC Trends in Analytical Chemistry 120 (November): 115324. https://doi.org/10.1016/j.trac.2018.11.022. O’Boyle, Noel M., Michael Banck, Craig A. James, Chris Morley, Tim Vandermeersch, and Geoffrey R. Hutchison. 2011. “Open Babel: An Open Chemical Toolbox.” Journal of Cheminformatics 3 (1): 33. https://doi.org/10.1186/1758-2946-3-33. Patiny, Luc, and Alain Borel. 2013. “ChemCalc: A Building Block for Tomorrow’s Chemical Infrastructure.” Journal of Chemical Information and Modeling 53 (5): 1223–28. https://doi.org/10.1021/ci300563h. Qiu, Feng, Dennis D. Fine, Daniel J. Wherritt, Zhentian Lei, and Lloyd W. Sumner. 2016. “PlantMAT: A Metabolomics Tool for Predicting the Specialized Metabolic Potential of a System and for Large-Scale Metabolite Identifications.” Analytical Chemistry 88 (23): 11373–83. https://doi.org/10.1021/acs.analchem.6b00906. Qiu, Feng, Zhentian Lei, and Lloyd W. Sumner. 2018. “MetExpert: An Expert System to Enhance Gas Chromatography-Mass Spectrometry-Based Metabolite Identifications.” Analytica Chimica Acta, Analytical Metabolomics, 1037 (December): 316–26. https://doi.org/10.1016/j.aca.2018.03.052. Ruttkies, Christoph, Emma L. Schymanski, Sebastian Wolf, Juliane Hollender, and Steffen Neumann. 2016. “MetFrag Relaunched: Incorporating Strategies Beyond in Silico Fragmentation.” Journal of Cheminformatics 8 (January): 3. https://doi.org/10.1186/s13321-016-0115-9. Scheltema, Richard A., Andris Jankevics, Ritsert C. Jansen, Morris A. Swertz, and Rainer Breitling. 2011. “PeakML/mzMatch: A File Format, Java Library, R Library, and Tool-Chain for Mass Spectrometry Data Analysis.” Analytical Chemistry 83 (7): 2786–93. https://doi.org/10.1021/ac2000994. 
Senan, Oriol, Antoni Aguilar-Mogas, Miriam Navarro, Jordi Capellades, Luke Noon, Deborah Burks, Oscar Yanes, Roger Guimerà, and Marta Sales-Pardo. 2019. “CliqueMS: A Computational Tool for Annotating in-Source Metabolite Ions from LC-MS Untargeted Metabolomics Data Based on a Coelution Similarity Network.” Bioinformatics 35 (20): 4089–97. https://doi.org/10.1093/bioinformatics/btz207. Shen, Xiaotao, Ruohong Wang, Xin Xiong, Yandong Yin, Yuping Cai, Zaijun Ma, Nan Liu, and Zheng-Jiang Zhu. 2019. “Metabolic Reaction Network-Based Recursive Metabolite Annotation for Untargeted Metabolomics.” Nature Communications 10 (1): 1–14. https://doi.org/10.1038/s41467-019-09550-x. Silva, Ricardo R., Fabien Jourdan, Diego M. Salvanha, Fabien Letisse, Emilien L. Jamin, Simone Guidetti-Gonzalez, Carlos A. Labate, and Ricardo Z. N. Vêncio. 2014. “ProbMetab: An R Package for Bayesian Probabilistic Annotation of LC–MS-based Metabolomics.” Bioinformatics 30 (9): 1336–37. https://doi.org/10.1093/bioinformatics/btu019. Sindelar, Miriam, and Gary J. Patti. 2020. “Chemical Discovery in the Era of Metabolomics.” Journal of the American Chemical Society, April. https://doi.org/10.1021/jacs.9b13198. Spalding, Jonathan L., Kevin Cho, Nathaniel G. Mahieu, Igor Nikolskiy, Elizabeth M. Llufrio, Stephen L. Johnson, and Gary J. Patti. 2016. “Bar Coding MS2 Spectra for Metabolite Identification.” Analytical Chemistry 88 (5): 2538–42. https://doi.org/10.1021/acs.analchem.5b04925. Sumner, Lloyd W., Alexander Amberg, Dave Barrett, Michael H. Beale, Richard Beger, Clare A. Daykin, Teresa W.-M. Fan, et al. 2007. “Proposed Minimum Reporting Standards for Chemical Analysis Chemical Analysis Working Group (CAWG) Metabolomics Standards Initiative (MSI).” Metabolomics : Official Journal of the Metabolomic Society 3 (3): 211–21. https://doi.org/10.1007/s11306-007-0082-2. Tian, Zhitao, Xin Hu, Yingying Xu, Mengmeng Liu, Hongbo Liu, Dongqin Li, Lisong Hu, Guozhu Wei, and Wei Chen. 2023. 
“PMhub 1.0: A Comprehensive Plant Metabolome Database.” Nucleic Acids Research, October, gkad811. https://doi.org/10.1093/nar/gkad811. Torigoe, Taihei, Masatomo Takahashi, Omidreza Heravizadeh, Kazuki Ikeda, Kohta Nakatani, Takeshi Bamba, and Yoshihiro Izumi. 2024. “Predicting Retention Time in Unified-Hydrophilic-Interaction/Anion-Exchange Liquid Chromatography High-Resolution Tandem Mass Spectrometry (Unified-HILIC/AEX/HRMS/MS) for Comprehensive Structural Annotation of Polar Metabolome.” Analytical Chemistry 96 (3): 1275–83. https://doi.org/10.1021/acs.analchem.3c04618. Treutler, Hendrik, Hiroshi Tsugawa, Andrea Porzel, Karin Gorzolka, Alain Tissier, Steffen Neumann, and Gerd Ulrich Balcke. 2016. “Discovering Regulated Metabolite Families in Untargeted Metabolomics Studies.” Analytical Chemistry 88 (16): 8082–90. https://doi.org/10.1021/acs.analchem.6b01569. Tsugawa, Hiroshi, Tobias Kind, Ryo Nakabayashi, Daichi Yukihira, Wataru Tanaka, Tomas Cajka, Kazuki Saito, Oliver Fiehn, and Masanori Arita. 2016. “Hydrogen Rearrangement Rules: Computational MS/MS Fragmentation and Structure Elucidation Using MS-FINDER Software.” Analytical Chemistry 88 (16): 7946–58. https://doi.org/10.1021/acs.analchem.6b00770. Uppal, Karan, Douglas I. Walker, and Dean P. Jones. 2017. “xMSannotator: An R Package for Network-Based Annotation of High-Resolution Metabolomics Data.” Analytical Chemistry 89 (2): 1063–67. https://doi.org/10.1021/acs.analchem.6b01214. van Tetering, Lara, Sylvia Spies, Quirine D. K. Wildeman, Kas J. Houthuijs, Rianne E. van Outersterp, Jonathan Martens, Ron A. Wevers, David S. Wishart, Giel Berden, and Jos Oomens. 2024. “A Spectroscopic Test Suggests That Fragment Ion Structure Annotations in MS/MS Libraries Are Frequently Incorrect.” Communications Chemistry 7 (1): 1–11. https://doi.org/10.1038/s42004-024-01112-7. Viant, Mark R, Irwin J Kurland, Martin R Jones, and Warwick B Dunn. 2017. 
“How Close Are We to Complete Annotation of Metabolomes?” Current Opinion in Chemical Biology, Omics, 36 (February): 64–69. https://doi.org/10.1016/j.cbpa.2017.01.001. Weber, Ralf J. M., and Mark R. Viant. 2010. “MI-Pack: Increased Confidence of Metabolite Identification in Mass Spectra by Integrating Accurate Masses and Metabolic Pathways.” Chemometrics and Intelligent Laboratory Systems, OMICS, 104 (1): 75–82. https://doi.org/10.1016/j.chemolab.2010.04.010. Witting, Michael, Christoph Ruttkies, Steffen Neumann, and Philippe Schmitt-Kopplin. 2017. “LipidFrag: Improving Reliability of in Silico Fragmentation of Lipids and Application to the Caenorhabditis Elegans Lipidome.” PLOS ONE 12 (3): e0172311. https://doi.org/10.1371/journal.pone.0172311. Wolf, Sebastian, Stephan Schmidt, Matthias Müller-Hannemann, and Steffen Neumann. 2010. “In Silico Fragmentation for Computer Assisted Identification of Metabolite Mass Spectra.” BMC Bioinformatics 11 (March): 148. https://doi.org/10.1186/1471-2105-11-148. Xing, Shipei, Sam Shen, Banghua Xu, Xiaoxiao Li, and Tao Huan. 2023. “BUDDY: Molecular Formula Discovery via Bottom-up MS/MS Interrogation.” Nature Methods, April, 1–10. https://doi.org/10.1038/s41592-023-01850-x. Xu, Yi-Fan, Wenyun Lu, and Joshua D. Rabinowitz. 2015. “Avoiding Misannotation of In-Source Fragmentation Products as Cellular Metabolites in Liquid Chromatography–Mass Spectrometry-Based Metabolomics.” Analytical Chemistry 87 (4): 2273–81. https://doi.org/10.1021/ac504118y. Xue, Jingchuan, Rico J. E. Derks, Bill Webb, Elizabeth M. Billings, Aries Aisporna, Martin Giera, and Gary Siuzdak. 2021. “Single Quadrupole Multiple Fragment Ion Monitoring Quantitative Mass Spectrometry.” Analytical Chemistry 93 (31): 10879–89. https://doi.org/10.1021/acs.analchem.1c01246. Xue, Jingchuan, Carlos Guijas, H. Paul Benton, Benedikt Warth, and Gary Siuzdak. 2020. 
“METLIN MS 2 Molecular Standards Database: A Broad Chemical and Biological Resource.” Nature Methods 17 (10): 953–54. https://doi.org/10.1038/s41592-020-0942-5. Xue, Jingchuan, Jiamin Zhu, Lixin Hu, Junjie Yang, Yunbo Lv, Fanrong Zhao, Yuxian Liu, Tao Zhang, Yanpeng Cai, and Mingliang Fang. 2023. “EISA-EXPOSOME: One Highly Sensitive and Autonomous Exposomic Platform with Enhanced in-Source Fragmentation/Annotation.” Analytical Chemistry, November. https://doi.org/10.1021/acs.analchem.3c02697. Yang, Qiong, Hongchao Ji, Zhenbo Xu, Yiming Li, Pingshan Wang, Jinyu Sun, Xiaqiong Fan, Hailiang Zhang, Hongmei Lu, and Zhimin Zhang. 2023. “Ultra-Fast and Accurate Electron Ionization Mass Spectrum Matching for Compound Identification with Million-Scale in-Silico Library.” Nature Communications 14 (1): 3722. https://doi.org/10.1038/s41467-023-39279-7. Yu, Miao, Georgia Dolios, and Lauren Petrick. 2022. “Reproducible Untargeted Metabolomics Workflow for Exhaustive MS2 Data Acquisition of MS1 Features.” Journal of Cheminformatics 14 (1): 6. https://doi.org/10.1186/s13321-022-00586-8. Yu, Miao, Mariola Olkowicz, and Janusz Pawliszyn. 2019. “Structure/Reaction Directed Analysis for LC-MS Based Untargeted Analysis.” Analytica Chimica Acta 1050 (March): 16–24. https://doi.org/10.1016/j.aca.2018.10.062. Zhang, Xiuqiong, Zaifang Li, Chunxia Zhao, Tiantian Chen, Xinxin Wang, Xiaoshan Sun, Xinjie Zhao, Xin Lu, and Guowang Xu. 2024. “Leveraging Unidentified Metabolic Features for Key Pathway Discovery: Chemical Classification-driven Network Analysis in Untargeted Metabolomics.” Analytical Chemistry, February. https://doi.org/10.1021/acs.analchem.3c04591. Zhang, Yuhao, Jingyu Liao, Wanqi Le, Gaosong Wu, and Weidong Zhang. 2023. “Improving the Data Quality of Untargeted Metabolomics Through a Targeted Data-Dependent Acquisition Based on an Inclusion List of Differential and Preidentified Ions.” Analytical Chemistry 95 (34): 12964–73. https://doi.org/10.1021/acs.analchem.3c02888. 
Zhao, Tingting, Shipei Xing, Huaxu Yu, and Tao Huan. 2023. “De Novo Cleaning of Chimeric MS/MS Spectra for LC-MS/MS-Based Metabolomics.” Analytical Chemistry 95 (35): 13018–28. https://doi.org/10.1021/acs.analchem.3c00736. Zhou, Zhiwei, Mingdu Luo, Haosong Zhang, Yandong Yin, Yuping Cai, and Zheng-Jiang Zhu. 2022. “Metabolite Annotation from Knowns to Unknowns Through Knowledge-Guided Multi-Layer Metabolic Networking.” Nature Communications 13 (1): 6656. https://doi.org/10.1038/s41467-022-34537-6. "],["omics-analysis.html", "Chapter 8 Omics analysis 8.1 From Bottom-up to Top-down 8.2 Pathway analysis 8.3 Network analysis 8.4 Omics integration", " Chapter 8 Omics analysis Once you have the filtered ions, the next step is to annotate them. Such annotations are helpful for omics studies. Omics analysis tries to combine information from other ‘omics’ to answer one specific question. Once the annotations are available, omics analysis can be performed: upload the data obtained from xcms to other tools or databases. You will get an updated database list here. Right now, it is hard to connect different omics databases, such as gene, protein and metabolite databases, to cover the whole scope of a certain biological process. However, you might select a few metabolites across those databases and find something interesting. 8.1 From Bottom-up to Top-down Bottom-up analysis means building a model for each metabolite. In this case, we could find out which metabolites are affected by our experimental design. However, take care of the multiple comparison issue. \\[ metabolite = f(control/treatment, co-variables) \\] Top-down analysis means building a model for the outcome. In this case, we could evaluate the contribution of each metabolite. You need variable selection to make a better model. \\[ control/treatment = f(metabolite 1,metabolite 2,...,metaboliteN,co-variables) \\] For omics studies, you might need to integrate datasets from different sources. 
\\[ control/treatment = f(metabolites, proteins, genes, miRNA,co-variables) \\] 8.2 Pathway analysis Pathway analysis maps annotated data onto known pathways and uses statistical analysis to find the affected pathways or the compounds with a high influence on a certain pathway. 8.2.1 Pathway Database SMPDB (The Small Molecule Pathway Database) is an interactive, visual database containing more than 618 small molecule pathways found in humans. More than 70% of these pathways (&gt;433) are not found in any other pathway database. The pathways include metabolic, drug, and disease pathways. KEGG (Kyoto Encyclopedia of Genes and Genomes) is one of the most complete and widely used databases containing metabolic pathways (495 reference pathways) from a wide variety of organisms (&gt;4,700). These pathways are hyperlinked to metabolite and protein/enzyme information. Currently KEGG has &gt;17,000 compounds (from animals, plants and bacteria), 10,000 drugs (including different salt forms and drug carriers) and nearly 11,000 glycan structures. BioCyc is a collection of 14558 Pathway/Genome Databases (PGDBs), plus software tools for exploring them. Reactome is an open-source, open access, manually curated and peer-reviewed pathway database. Its goal is to provide intuitive bioinformatics tools for the visualization, interpretation and analysis of pathway knowledge to support basic and clinical research, genome analysis, modeling, systems biology and education. WikiPathways is a database of biological pathways maintained by and for the scientific community. 8.2.2 Pathway software Pathway Commons provides online tools for pathway analysis. RaMP could perform pathway analysis for batch searches. metabox could perform pathway analysis. impala is used for pathway enrichment analysis. Metscape uses the Debiased Sparse Partial Correlation (DSPC) algorithm (Basu et al. 2017) to make annotation. 8.3 Network analysis Mummichog could perform pathway and network analysis without annotation. 
MSS is a sequential feature screening procedure to select important sub-networks and identify the optimal matching for metabolomics data (Q. Cai et al. 2017). Metapone is a joint pathway testing package for untargeted metabolomics data (L. Tian et al. 2022). 8.4 Omics integration BLAST finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance. The Omics Discovery Index (OmicsDI) provides a knowledge discovery framework across heterogeneous omics data (genomics, proteomics, transcriptomics and metabolomics). Omics Data Integration Project. For standardized multi-omics of Earth’s microbiomes, check this GNPS-based work (Shaffer et al. 2022). Windows Scanning Multiomics: Integrated Metabolomics and Proteomics (Shi et al. 2023). References Basu, Sumanta, William Duren, Charles R. Evans, Charles F. Burant, George Michailidis, and Alla Karnovsky. 2017. “Sparse Network Modeling and Metscape-Based Visualization Methods for the Analysis of Large-Scale Metabolomics Data.” Bioinformatics 33 (10): 1545–53. https://doi.org/10.1093/bioinformatics/btx012. Cai, Qingpo, Jessica A. Alvarez, Jian Kang, and Tianwei Yu. 2017. “Network Marker Selection for Untargeted LC–MS Metabolomics Data.” Journal of Proteome Research 16 (3): 1261–69. https://doi.org/10.1021/acs.jproteome.6b00861. Shaffer, Justin P., Louis-Félix Nothias, Luke R. Thompson, Jon G. Sanders, Rodolfo A. Salido, Sneha P. Couvillion, Asker D. Brejnrod, et al. 2022. “Standardized Multi-Omics of Earth’s Microbiomes Reveals Microbial and Metabolite Diversity.” Nature Microbiology 7 (12): 2128–50. https://doi.org/10.1038/s41564-022-01266-x. Shi, Jiachen, Jialiang Zhao, Yu Zhang, Yanan Wang, Chin Ping Tan, Yong-Jiang Xu, and Yuanfa Liu. 2023. “Windows Scanning Multiomics: Integrated Metabolomics and Proteomics.” Analytical Chemistry, December. https://doi.org/10.1021/acs.analchem.3c03785. 
Tian, Leqi, Zhenjiang Li, Guoxuan Ma, Xiaoyue Zhang, Ziyin Tang, Siheng Wang, Jian Kang, Donghai Liang, and Tianwei Yu. 2022. “Metapone: A Bioconductor Package for Joint Pathway Testing for Untargeted Metabolomics Data.” Bioinformatics 38 (14): 3662–64. https://doi.org/10.1093/bioinformatics/btac364. "],["peaks-normalization.html", "Chapter 9 Peaks normalization 9.1 Batch effects 9.2 Batch effects classification 9.3 Batch effects visualization 9.4 Source of batch effects 9.5 Avoid batch effects by DoE 9.6 post hoc data normalization 9.7 Method to validate the normalization 9.8 Software", " Chapter 9 Peaks normalization 9.1 Batch effects Batch effects are the variances caused by factors other than the experimental design. We could simply make a linear model for the intensity of one peak: \\[Intensity = Average + Condition + Batch + Error\\] Research focuses on the condition contribution, and the overall average and the random error can be estimated. However, we know little about the batch contribution. Sometimes we could use known variables such as injection order or operators as the batch part. However, in most cases such variables are unknown. Almost all batch correction methods try to use some estimate to balance or remove the batch effect. For analytical chemistry, internal standards or pooled quality control samples actually stand for the batch contribution part in the model. However, it’s impractical to get all the internal standards when the data are collected in untargeted mode. For methods using internal standards or pooled quality control samples, the variations among those samples are usually removed using the median, quantile, mean or ratios. Other approaches, like quantile regression or centering and scaling based on the distribution within samples, could be treated as using the stable distribution of peak intensities to remove batch effects. 
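The linear model for one peak given above can be sketched with a small simulation. This is only an illustration under stated assumptions: the effect sizes, the sample size and the variable names (`condition_effect`, `batch_effect`, `fit_naive`, `fit_batch`) are hypothetical, and the batch is assumed to be known and crossed with the condition, which is rarely the case in practice.

```r
set.seed(42)
n <- 40
# balanced design: condition and batch are crossed
condition <- factor(rep(c('control', 'treatment'), times = n / 2))
batch <- factor(rep(c('b1', 'b2'), each = n / 2))
average <- 10
condition_effect <- 2   # assumed effect of the treatment
batch_effect <- 3       # assumed shift of the second batch
# Intensity = Average + Condition + Batch + Error
intensity <- average +
  condition_effect * (condition == 'treatment') +
  batch_effect * (batch == 'b2') +
  rnorm(n, sd = 0.5)
# model without and with the batch term
fit_naive <- lm(intensity ~ condition)
fit_batch <- lm(intensity ~ condition + batch)
coef(fit_batch)
# residual standard deviation with and without the batch term
sigma(fit_naive)
sigma(fit_batch)
```

When the batch term is left out, the batch shift ends up in the residual error, so the residual standard deviation of `fit_naive` is inflated compared with `fit_batch`.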
9.2 Batch effects classification Variances among the samples across all the extracted peaks might be affected by factors other than the experimental design. There are three types of such batch effects: Monotone, Block and Mixed. Monotone effects increase or decrease with the injection order or batches. Block effects are systematic shifts among different batches. Mixed effects are a combination of monotone and block batch effects. Meanwhile, different compounds suffer from different types of batch effects. In this case, the normalization or batch correction should be done peak by peak. 9.3 Batch effects visualization Any correction might introduce bias. Before correcting, we need to make sure there are patterns that differ from our experimental design. Pooled QC samples should cluster together on a PCA score plot. 9.4 Source of batch effects Different Operators &amp; Dates &amp; Sequences Different Instrumental Condition such as different instrumental parameters, poor quality control, sample contamination during the analysis, Column (Pooled QC) and sample matrix effects (ion suppression and/or enhancement) Unknown Unknowns 9.5 Avoid batch effects by DoE You could avoid batch effects through experimental design. Cap the sequence with pooled QC samples and randomize the sample sequence. Some internal standards/instrumental QC might help to find the source of batch effects, although this is not practical for every compound in non-targeted analysis. Batch effects might not change the conclusion when the batch effect is relatively small. Here is a simulation: set.seed(30) # real peaks without batch effect group &lt;- factor(c(rep(1,5),rep(2,5))) con &lt;- c(rnorm(5,5),rnorm(5,8)) re &lt;- t.test(con~group) # peaks with a batch effect that increases along the injection order group &lt;- factor(c(rep(1,5),rep(2,5))) con &lt;- c(rnorm(5,5),rnorm(5,8)) batch &lt;- seq(0,5,length.out = 10) ins &lt;- batch+con re &lt;- t.test(ins~group) # randomized injection order index &lt;- sample(10) ins &lt;- batch+con[index] re &lt;- t.test(ins~group[index]) Randomization cannot guarantee the results. Here is a simulation. 
# peaks with a batch effect that decreases along the injection order group &lt;- factor(c(rep(1,5),rep(2,5))) con &lt;- c(rnorm(5,5),rnorm(5,8)) batch &lt;- seq(5,0,length.out = 10) ins &lt;- batch+con re &lt;- t.test(ins~group) 9.6 post hoc data normalization To make the samples comparable, normalization across the samples is always needed once the experimental part is done. A batch effect should show a pattern other than the experimental design; otherwise it is just noise. Correction is possible by data analysis or by a randomized experimental design. There are numerous normalization methods and combinations of them. We could divide those methods into two categories: unsupervised and supervised. Unsupervised methods only consider the distribution of peak intensities across the samples. For example, quantile calibration tries to make the intensity distributions of the samples similar. Such methods are preferred to explore the inner structures of the samples. Internal standards or pooled QC samples also belong to this category. However, it’s hard to take a few peaks as representative of all the extracted peaks. Supervised methods use the group information or batch information in the experimental design to normalize the data. A linear model is usually used to model the unwanted variances and remove them for further analysis. Since the real batch effects are always unknown, it’s hard to validate different normalization methods. Li et al. developed NOREVA to compare 25 correction methods (B. Li et al. 2017), and a recent update raised this number to 168 (Qingxia Yang et al. 2020). MetaboDrift also contains some methods for batch correction in Excel (Thonusin et al. 2017). Another idea is to use spiked-in samples to validate the methods (Franceschi et al. 2012), which might be good for targeted analysis instead of non-targeted analysis. Relative log abundance (RLA) plots (De Livera et al. 2012) and heatmaps are often used to show the variances among the samples. 
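As a rough sketch of the RLA plot mentioned above, the following example removes the median of each peak across all samples from log-scale intensities and draws one box per sample. The simulated matrix, the block batch shift and the names `logint` and `rla` are assumptions made for illustration only. This is the across-sample variant; a within-group variant would subtract the median of each group instead.

```r
set.seed(42)
n_peaks <- 100
n_samples <- 12
# simulated log-scale intensities, peaks in rows and samples in columns
logint <- matrix(rnorm(n_peaks * n_samples, mean = 15), nrow = n_peaks)
colnames(logint) <- paste0('s', seq_len(n_samples))
# assumed block batch effect: the last six samples are shifted upwards
logint[, 7:12] <- logint[, 7:12] + 1
# across-sample RLA: remove the median of each peak over all samples
rla <- sweep(logint, 1, apply(logint, 1, median), '-')
# samples without unwanted variation should be centered around zero
boxplot(rla, las = 2, ylab = 'relative log abundance')
abline(h = 0, lty = 2)
```

In this simulation the boxes of the shifted samples sit above zero and the others below it, which is the kind of pattern that suggests a block batch effect.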
9.6.1 Unsupervised methods 9.6.1.1 Distribution of intensity Intensities collected from LC/GC-MS usually show a right-skewed distribution. Log transformation is often necessary for further statistical analysis. 9.6.1.2 Centering For peak p of sample s in batch b, the corrected abundance is: \\[\\hat I_{p,s,b} = I_{p,s,b} - mean(I_{p,b}) + median(I_{p,qc})\\] If no quality control samples are used, the corrected abundance would be: \\[\\hat I_{p,s,b} = I_{p,s,b} - mean(I_{p,b})\\] 9.6.1.3 Scaling For peak p of sample s in a certain batch b, the corrected abundance is: \\[\\hat I_{p,s,b} = \\frac{I_{p,s,b} - mean(I_{p,b})}{std_{p,b}} * std_{p,qc,b} + mean(I_{p,qc,b})\\] If no quality control samples are used, the corrected abundance would be: \\[\\hat I_{p,s,b} = \\frac{I_{p,s,b} - mean(I_{p,b})}{std_{p,b}}\\] 9.6.1.4 Pareto Scaling For peak p of sample s in a certain batch b, the corrected abundance is: \\[\\hat I_{p,s,b} = \\frac{I_{p,s,b} - mean(I_{p,b})}{\\sqrt{std_{p,b}}} * \\sqrt{std_{p,qc,b}} + mean(I_{p,qc,b})\\] If no quality control samples are used, the corrected abundance would be: \\[\\hat I_{p,s,b} = \\frac{I_{p,s,b} - mean(I_{p,b})}{\\sqrt{std_{p,b}}}\\] 9.6.1.5 Range Scaling For peak p of sample s in a certain batch b, the corrected abundance is: \\[\\hat I_{p,s,b} = \\frac{I_{p,s,b} - mean(I_{p,b})}{max(I_{p,b}) - min(I_{p,b})} * (max(I_{p,qc,b}) - min(I_{p,qc,b})) + mean(I_{p,qc,b})\\] If no quality control samples are used, the corrected abundance would be: \\[\\hat I_{p,s,b} = \\frac{I_{p,s,b} - mean(I_{p,b})}{max(I_{p,b}) - min(I_{p,b})} \\] 9.6.1.6 Level scaling For peak p of sample s in a certain batch b, the corrected abundance is: \\[\\hat I_{p,s,b} = \\frac{I_{p,s,b} - mean(I_{p,b})}{mean(I_{p,b})} * mean(I_{p,qc,b}) + mean(I_{p,qc,b})\\] If no quality control samples are used, the corrected abundance would be: \\[\\hat I_{p,s,b} = \\frac{I_{p,s,b} - mean(I_{p,b})}{mean(I_{p,b})} \\] 9.6.1.7 Quantile The idea of quantile calibration is that alignment of the 
intensities in certain samples according to quantile in each sample. Here is the demo: set.seed(42) a &lt;- rnorm(1000) # b suffered batch effect with a bias of 10 b &lt;- rnorm(1000,10) hist(a,xlim=c(-5,15),breaks = 50) hist(b,col = &#39;black&#39;, breaks = 50, add=TRUE) # quantile normalized cor &lt;- (a[order(a)]+b[order(b)])/2 # reorder cor &lt;- cor[order(order(a))] hist(cor,col = &#39;red&#39;, breaks = 50, add=TRUE) 9.6.1.8 Ratio based calibration This method calibrates samples by the ratio between the QC samples across all batches and the QC samples in a certain batch. For peak p of sample s in a certain batch b, the corrected abundance is: \\[\\hat I_{p,s,b} = \\frac{I_{p,s,b} * median(I_{p,qc})}{mean(I_{p,qc,b})}\\] set.seed(42) # raw data I = c(rnorm(10,mean = 0, sd = 0.3),rnorm(10,mean = 1, sd = 0.5)) # batch B = c(rep(0,10),rep(1,10)) # qc, one QC sample per batch Iqc = c(rnorm(1,mean = 0, sd = 0.3),rnorm(1,mean = 1, sd = 0.5)) # corrected data, each sample is divided by the QC of its own batch Icor = I * median(Iqc)/Iqc[B+1] # plot the result plot(I) plot(Icor) 9.6.1.9 Linear Normalizer This method initially scales each sample so that the sum of all peak abundances equals one. The scaled data are then multiplied by the median of the summed peak abundances across all samples to get the corrected data. set.seed(42) # raw data, rows are peaks and columns are samples peaksa &lt;- c(rnorm(10,mean = 10, sd = 0.3),rnorm(10,mean = 20, sd = 0.5)) peaksb &lt;- c(rnorm(10,mean = 10, sd = 0.3),rnorm(10,mean = 20, sd = 0.5)) df &lt;- rbind(peaksa,peaksb) # divide each sample (column) by its sum, then rescale by the median sum dfcor &lt;- t(t(df)/apply(df,2,sum))* median(apply(df,2,sum)) image(df) image(dfcor) 9.6.1.10 Internal standards \\[\\hat I_{p,s} = \\frac{I_{p,s} * median(I_{IS})}{I_{IS,s}}\\] Some methods also use pooled calibration samples and a multiple internal standard strategy to correct the data (van der Kloet et al. 2009; Sysi-Aho et al. 2007). Other methods only use QC samples to handle the data (Kuligowski et al. 2015). 
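The internal standard formula above can be sketched as follows. The example assumes an idealized case: every injection has a purely multiplicative response factor and a single internal standard, spiked at the same amount into every sample, follows that factor exactly. The names `truth`, `response`, `is_observed` and `rsd` are hypothetical and only serve the illustration.

```r
set.seed(42)
n_peaks <- 50
n_samples <- 8
# assumed true abundances, peaks in rows and samples in columns
truth <- matrix(rlnorm(n_peaks * n_samples, meanlog = 10), nrow = n_peaks)
# assumed multiplicative response factor of each injection
response <- runif(n_samples, min = 0.5, max = 1.5)
observed <- sweep(truth, 2, response, '*')
# internal standard spiked at the same amount into every sample
is_observed <- 1000 * response
# corrected = observed * median(IS) / IS of the same sample
corrected <- sweep(observed, 2, median(is_observed) / is_observed, '*')
# relative standard deviation of the sample totals before and after
rsd <- function(x) sd(x) / mean(x)
rsd(colSums(observed))
rsd(colSums(corrected))
```

Under these assumptions the corrected matrix equals the true abundances up to one constant factor. With real data a single internal standard rarely tracks every peak this well, which is why multiple internal standards or QC-based strategies are used.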
9.6.2 Supervised methods 9.6.2.1 Regression calibration Considering the batch effect of injection order, regress the data on the injection order with a linear model to get the calibration. 9.6.2.2 Batch Normalizer Use the total abundance scale and then fit with the regression line (S.-Y. Wang, Kuo, and Tseng 2013). 9.6.2.3 Surrogate Variable Analysis (SVA) We have a data matrix (M*N), where M is the number of peaks and N is the number of samples (written \\(m\\) and \\(n\\) in the formulas). For peak \\(i\\), \\(x_i = (x_{i1},...,x_{in})^T\\) stands for the normalized intensities across the samples. We use \\(Y = (y_1,...,y_n)^T\\) for the group information of our data. Then we could build such a model: \\[x_{ij} = \\mu_i + f_i(y_j) + e_{ij}\\] \\(\\mu_i\\) stands for the baseline of the peak intensities in a normal state. Then we have: \\[f_i(y_j) = E(x_{ij}|y_j) - \\mu_i\\] which stands for the biological variation caused by our grouping, for example, whether the samples were treated by exposure or not. However, considering the batch effects, the real model could be: \\[x_{ij} = \\mu_i + f_i(y_j) + \\sum_{l = 1}^L \\gamma_{li}p_{lj} + e_{ij}^*\\] \\(\\gamma_{li}\\) stands for the peak-specific coefficient for potential factor \\(l\\). \\(p_{lj}\\) stands for the potential factors across the samples. Actually, the error term \\(e_{ij}\\) in real samples could always be decomposed as \\(e_{ij} = \\sum_{l = 1}^L \\gamma_{li}p_{lj} + e_{ij}^*\\) with \\(e_{ij}^*\\) standing for the real random error in a certain sample for a certain peak. We could not get the potential factors directly. Since we don’t care about the details of the unknown factors, we could estimate orthogonal vectors \\(h_k\\) standing for such potential factors. 
Thus we have: \\[ x_{ij} = \\mu_i + f_i(y_j) + \\sum_{l = 1}^L \\gamma_{li}p_{lj} + e_{ij}^*\\\\ = \\mu_i + f_i(y_j) + \\sum_{k = 1}^K \\lambda_{ki}h_{kj} + e_{ij}^* \\] Here are the details of the algorithm. The algorithm is decomposed into two parts: detection of unmodeled factors and construction of surrogate variables. 9.6.2.3.1 Detection of unmodeled factors Estimate \\(\\hat\\mu_i\\) and \\(\\hat f_i\\) by fitting the model \\(x_{ij} = \\mu_i + f_i(y_j) + e_{ij}\\) and get the residual \\(r_{ij} = x_{ij}-\\hat\\mu_i - \\hat f_i(y_j)\\). Then we have the residual matrix R. Perform the singular value decomposition (SVD) of the residual matrix \\(R = UDV^T\\). Let \\(d_l\\) be the \\(l\\)th singular value, i.e. the \\(l\\)th diagonal element of D, for \\(l = 1,...,n\\). Set \\(df\\) as the degrees of freedom of the model \\(\\hat\\mu_i + \\hat f_i(y_j)\\). We could build a statistic \\(T_k\\) as: \\[T_k = \\frac{d_k^2}{\\sum_{l=1}^{n-df}d_l^2}\\] to show the proportion of variance explained by the \\(k\\)th singular value. Permute each row of R to remove the structure in the matrix and get \\(R^*\\). Fit the model \\(r_{ij}^* = \\mu_i^* + f_i^*(y_j) + e^*_{ij}\\) and get \\(r_{ij}^0 = r^*_{ij}-\\hat\\mu^*_i - \\hat f^*_i(y_j)\\) as a null matrix \\(R_0\\). Perform the singular value decomposition (SVD) of the null matrix \\(R_0 = U_0D_0V_0^T\\). Compute the null statistic: \\[ T_k^0 = \\frac{d_{0k}^2}{\\sum_{l=1}^{n-df}d_{0l}^2} \\] Repeat permuting the rows B times to get the null statistics \\(T_k^{0b}\\). Get the p-value for eigengene \\(k\\): \\[p_k = \\frac{\\#\\{T_k^{0b}\\geq T_k;b=1,...,B \\}}{B}\\] For a significance level \\(\\alpha\\), treat eigengene \\(k\\) as a significant signature of the residual matrix R if \\(p_k\\leq\\alpha\\). 9.6.2.3.2 Construction of surrogate variables Estimate \\(\\hat\\mu_i\\) and \\(\\hat f_i\\) by fitting the model \\(x_{ij} = \\mu_i + f_i(y_j) + e_{ij}\\) and get the residual \\(r_{ij} = x_{ij}-\\hat\\mu_i - \\hat f_i(y_j)\\). Then we have the residual matrix R. 
Perform the singular value decomposition (SVD) of the residual matrix \\(R = UDV^T\\). Let \\(e_k = (e_{k1},...,e_{kn})^T\\) be the \\(k\\)th column of V. Set \\(\\hat K\\) as the number of significant eigengenes found by the first step. Regress each \\(e_k\\) on \\(x_i\\) and get the p-value for the association. Set \\(\\pi_0\\) as the proportion of the peak intensities \\(x_i\\) not associated with \\(e_k\\), and find the number \\(\\hat m_1 =[(1-\\hat \\pi_0) \\times m]\\) and the indices of the peaks associated with the eigengene. Form the \\(\\hat m_1 \\times N\\) matrix \\(X_r\\) from these peaks; this reduced matrix stands for the potential variables. As was done for R, get the eigengenes of \\(X_r\\) and denote these by \\(e_j^r\\). Let \\(j^* = argmax_{1\\leq j \\leq n}cor(e_k,e_j^r)\\) and set \\(\\hat h_k=e_{j^*}^r\\). That is, set the estimate of the surrogate variable to be the eigengene of the reduced matrix most correlated with the corresponding residual eigengene. Since the reduced matrix is enriched for peaks associated with this residual eigengene, this is a principled choice for the estimated surrogate variable that allows for correlation with the primary variable. Employ \\(\\mu_i + f_i(y_j) + \\sum_{k = 1}^K \\lambda_{ki}\\hat h_{kj} + e_{ij}^*\\) as the estimate of the ideal model \\(\\mu_i + f_i(y_j) + \\sum_{k = 1}^K \\lambda_{ki}h_{kj} + e_{ij}^*\\). This method could find the potential unwanted variables for the data. SVA was introduced by Jeff Leek (Leek and Storey 2008, 2007; Leek et al. 2012), and the EigenMS package implements SVA with modifications, including analysis of data with missing values that are typical in LC-MS experiments (Karpievitch et al. 2014). 9.6.2.4 RUV (Remove Unwanted Variation) This method’s performance is similar to SVA. Instead of finding surrogate variables from the whole dataset, RUV uses control or pooled QC samples to find the unwanted variances and removes them to find the peaks related to the experimental design. However, we could also empirically estimate the control peaks by a linear mixed model. 
RUV-random (Livera et al. 2015; De Livera et al. 2012) further uses a linear mixed model to estimate the variances of the random error. A hierarchical RUV approach was recently proposed for metabolomics data (T. Kim et al. 2021). This method could be used with suitable controls, which are common in metabolomics DoE. 9.6.2.5 RRmix RRmix also uses a latent factor model to correct the data (Jr et al. 2017). This method could be treated as a linear mixed model version of SVA. No control samples are required and the unwanted variances could be removed by factor analysis. This method might be a good choice to remove the unwanted variables with a common experimental design. 9.6.2.6 Norm ISWSVR It is a two-step approach that combines the best-performing internal standard correction with support vector regression normalization, comprehensively removing the systematic and random errors and matrix effects (Ding et al. 2022). 9.7 Method to validate the normalization Various methods have been used for batch correction and evaluation. Simulation provides a known ground truth. Difference analysis is a common method for evaluation: we could check whether a peak is a true positive or a false positive according to the settings of the simulation. Other methods need statistics or lots of standards to describe the performance of batch correction or normalization results. 9.8 Software BatchCorrMetabolomics is for improved batch correction in untargeted MS-based metabolomics. MetNorm provides statistical methods for normalizing metabolomics data. BatchQC could be used to make batch effect simulation. NOREVA could make online batch correction and comparison (J. Fu et al. 2021). References De Livera, Alysha M., Daniel A. Dias, David De Souza, Thusitha Rupasinghe, James Pyke, Dedreia Tull, Ute Roessner, Malcolm McConville, and Terence P. Speed. 2012. “Normalizing and Integrating Metabolomics Data.” Analytical Chemistry 84 (24): 10768–76. https://doi.org/10.1021/ac302748b. 
Ding, Xian, Fen Yang, Yanhua Chen, Jing Xu, Jiuming He, Ruiping Zhang, and Zeper Abliz. 2022. “Norm ISWSVR: A Data Integration and Normalization Approach for Large-Scale Metabolomics.” Analytical Chemistry 94 (21): 7500–7509. https://doi.org/10.1021/acs.analchem.1c05502. Franceschi, Pietro, Domenico Masuero, Urska Vrhovsek, Fulvio Mattivi, and Ron Wehrens. 2012. “A Benchmark Spike-in Data Set for Biomarker Identification in Metabolomics.” Journal of Chemometrics 26 (1-2): 16–24. https://doi.org/10.1002/cem.1420. Fu, Jianbo, Ying Zhang, Yunxia Wang, Hongning Zhang, Jin Liu, Jing Tang, Qingxia Yang, et al. 2021. “Optimization of Metabolomic Data Processing Using NOREVA.” Nature Protocols, December, 1–23. https://doi.org/10.1038/s41596-021-00636-9. Jr, Stephen Salerno, Mahya Mehrmohamadi, Maria V. Liberti, Muting Wan, Martin T. Wells, James G. Booth, and Jason W. Locasale. 2017. “RRmix: A Method for Simultaneous Batch Effect Correction and Analysis of Metabolomics Data in the Absence of Internal Standards.” PLOS ONE 12 (6): e0179530. https://doi.org/10.1371/journal.pone.0179530. Karpievitch, Yuliya V., Sonja B. Nikolic, Richard Wilson, James E. Sharman, and Lindsay M. Edwards. 2014. “Metabolomics Data Normalization with EigenMS.” PLOS ONE 9 (12): e116221. https://doi.org/10.1371/journal.pone.0116221. Kim, Taiyun, Owen Tang, Stephen T. Vernon, Katharine A. Kott, Yen Chin Koay, John Park, David E. James, et al. 2021. “A Hierarchical Approach to Removal of Unwanted Variation for Large-Scale Metabolomics Data.” Nature Communications 12 (1): 4992. https://doi.org/10.1038/s41467-021-25210-5. Kuligowski, Julia, Ángel Sánchez-Illana, Daniel Sanjuán-Herráez, Máximo Vento, and Guillermo Quintás. 2015. “Intra-Batch Effect Correction in Liquid Chromatography-Mass Spectrometry Using Quality Control Samples and Support Vector Regression (QC-SVRC).” Analyst 140 (22): 7810–17. https://doi.org/10.1039/C5AN01638J. Leek, Jeffrey T., W. Evan Johnson, Hilary S. Parker, Andrew E. 
Jaffe, and John D. Storey. 2012. “The Sva Package for Removing Batch Effects and Other Unwanted Variation in High-Throughput Experiments.” Bioinformatics 28 (6): 882–83. https://doi.org/10.1093/bioinformatics/bts034. Leek, Jeffrey T., and John D. Storey. 2007. “Capturing Heterogeneity in Gene Expression Studies by Surrogate Variable Analysis.” PLOS Genet 3 (9): e161. https://doi.org/10.1371/journal.pgen.0030161. ———. 2008. “A General Framework for Multiple Testing Dependence.” Proceedings of the National Academy of Sciences 105 (48): 18718–23. https://doi.org/10.1073/pnas.0808709105. Li, Bo, Jing Tang, Qingxia Yang, Shuang Li, Xuejiao Cui, Yinghong Li, Yuzong Chen, Weiwei Xue, Xiaofeng Li, and Feng Zhu. 2017. “NOREVA: Normalization and Evaluation of MS-based Metabolomics Data.” Nucleic Acids Research 45 (W1): W162–70. https://doi.org/10.1093/nar/gkx449. Livera, Alysha M. De, Marko Sysi-Aho, Laurent Jacob, Johann A. Gagnon-Bartsch, Sandra Castillo, Julie A. Simpson, and Terence P. Speed. 2015. “Statistical Methods for Handling Unwanted Variation in Metabolomics Data.” Analytical Chemistry 87 (7): 3606–15. https://doi.org/10.1021/ac502439y. Sysi-Aho, Marko, Mikko Katajamaa, Laxman Yetukuri, and Matej Orešič. 2007. “Normalization Method for Metabolomics Data Using Optimal Selection of Multiple Internal Standards.” BMC Bioinformatics 8 (March): 93. https://doi.org/10.1186/1471-2105-8-93. Thonusin, Chanisa, Heidi B. IglayReger, Tanu Soni, Amy E. Rothberg, Charles F. Burant, and Charles R. Evans. 2017. “Evaluation of Intensity Drift Correction Strategies Using MetaboDrift, a Normalization Tool for Multi-Batch Metabolomics Data.” Journal of Chromatography A, Pushing the Boundaries of Chromatography and Electrophoresis, 1523 (Supplement C): 265–74. https://doi.org/10.1016/j.chroma.2017.09.023. van der Kloet, Frans M., Ivana Bobeldijk, Elwin R. Verheij, and Renger H. Jellema. 2009. 
“Analytical Error Reduction Using Single Point Calibration for Accurate and Precise Metabolomic Phenotyping.” Journal of Proteome Research 8 (11): 5132–41. https://doi.org/10.1021/pr900499r. Wang, San-Yuan, Ching-Hua Kuo, and Yufeng J. Tseng. 2013. “Batch Normalizer: A Fast Total Abundance Regression Calibration Method to Simultaneously Adjust Batch and Injection Order Effects in Liquid Chromatography/Time-of-Flight Mass Spectrometry-Based Metabolomics Data and Comparison with Current Calibration Methods.” Analytical Chemistry 85 (2): 1037–46. https://doi.org/10.1021/ac302877x. Yang, Qingxia, Yunxia Wang, Ying Zhang, Fengcheng Li, Weiqi Xia, Ying Zhou, Yunqing Qiu, Honglin Li, and Feng Zhu. 2020. “NOREVA: Enhanced Normalization and Evaluation of Time-Course and Multi-Class Metabolomic Data.” Nucleic Acids Research 48 (W1): W436–48. https://doi.org/10.1093/nar/gkaa258. "],["statistical-analysis.html", "Chapter 10 Statistical analysis 10.1 Basic Statistical Analysis 10.2 Differences analysis 10.3 PCA 10.4 Cluster Analysis 10.5 PLSDA 10.6 Network analysis 10.7 Software", " Chapter 10 Statistical analysis The purpose of a metabolomics study depends strongly on the research goal. However, since metabolomics is usually performed in a non-targeted mode, statistical analysis usually starts with exploratory analysis. The basic targets of an exploratory analysis are to find the relationships among variables and to find the relationships among samples or groups of samples. This is basically unsupervised analysis. However, sometimes we have group information that could be used to find biomarkers, or correlations between variables and groups or continuous variables. This type of data needs supervised methods. A general discussion of statistical analysis in metabolic phenotyping can be found in Blaise et al. (Blaise et al. 2021). Before we discuss the details of the algorithms, let’s cover some basic statistical concepts. 
10.1 Basic Statistical Analysis A statistic describes a property of the variables measured across samples. It can be designed for a specific purpose, such as extracting signal and removing noise. Statistical models and inference are both based on statistics rather than on the raw data. \\[Statistic = f(sample_1,sample_2,...,sample_n)\\] Null Hypothesis Significance Testing (NHST) is often used for statistical inference. The p value is the probability of observing a statistic at least as extreme as the one obtained, assuming the null hypothesis H0 (a pre-defined distribution) is true. For omics studies, you should be aware of the multiple comparison issue whenever you perform many (more than 20) comparisons or tests at the same time. False Discovery Rate (FDR) control is required for multiple tests to limit the proportion of false positives among the reported results. You could use the Benjamini-Hochberg method to adjust raw p values or directly use Storey’s q value for FDR control. NHST is notorious for misinterpretation of p values as well as for multiple comparison issues. Bayesian hypothesis testing is an option that addresses some drawbacks of NHST. It uses the Bayes factor to compare the evidence for an alternative hypothesis against the null hypothesis. \\[Bayes\\ factor = \\frac{p(D|Ha)}{p(D|H0)} = \\frac{posterior\\ odds}{prior\\ odds}\\] A statistical model uses statistics to make predictions or explanations. Most statistical models need their parameters tuned to perform well. A statistical model is built on real data and can be diagnosed with general statistics such as \\(R^2\\) or the ROC curve. When models are built or compared, model selection can be performed. \\[Target = g(Statistic) = g(f(sample_1,sample_2,...,sample_n))\\] The bias-variance tradeoff is an important concept for statistical models. A model can be overfitted (small bias, large variance) or underfitted (large bias, small variance) when its parameters are not well selected. 
\\[E[(y - \\hat f)^2] = \\sigma^2 + Var[\\hat f] + Bias^2[\\hat f]\\] Cross validation can be used to find the best model with a training-testing strategy such as the jackknife, bootstrap resampling or n-fold cross validation. Regularization can also be used to find the model with the best prediction performance. Ridge regression, LASSO or other general regularization methods can be employed to build robust models. For supervised models, linear models and tree based models are two basic categories. Linear models are useful for describing the independent or correlated relationships among variables and their influence on the predicted variable. Tree based models, on the other hand, build a hierarchical structure over the variables and can be combined through bagging, random forest or boosting. A linear model could be treated as a special case of a tree based model with a single layer. Other models such as Support Vector Machine (SVM), Artificial Neural Network (ANN) or Deep Learning also make various assumptions about the data. However, if your final target is prediction, you could try any of those models or even combine their predictions with weights to make a meta-prediction. 10.2 Differences analysis After we get corrected peaks across samples, the next step is to find the differences between two groups. For comparisons among more than two groups, you could perform ANOVA or the Kruskal-Wallis test. The basic idea behind statistical analysis is to find meaningful differences between groups and extract the corresponding ions or peak groups. So how do we find the differences? In most metabolomics software, this task is completed by a t-test that reports p values and fold changes. If you only compare two groups on one peak, that’s OK. However, if you compare two groups on thousands of peaks, any statistics textbook would warn you about false positives. For one comparison, the significance level is 0.05, which means a 5% chance of getting a false positive result. For two comparisons, the chance of at least one false positive would be \\(1-0.95^2\\). 
For 10 comparisons, such chances would be \\(1-0.95^{10} = 0.4012631\\). For 100 comparisons, such chances would be \\(1-0.95^{100} = 0.9940795\\). In general, for \\(m\\) independent comparisons at significance level \\(\\alpha\\), the chance of at least one false positive is \\(1-(1-\\alpha)^m\\). You would almost certainly make mistakes in your results. In statistics, false discovery rate (FDR) control is routinely applied to multiple tests in omics studies. I suggest using q-values to control FDR. If we select all peaks with q-values below 0.05, we expect fewer than 5% of the selected peaks to be false positives. We could also use the local false discovery rate, which gives the FDR for an individual peak. However, such values are hard to estimate accurately. Karin Ortmayr suggested that fold change might be better than p-values for finding the differences (Ortmayr et al. 2016). 10.2.1 T-test or ANOVA If a peak shows significant differences between two groups or among multiple groups, a t-test or ANOVA could be used to find it. However, when multiple hypothesis tests are performed, the probability of false positives increases. In this case, false discovery rate (FDR) control is required. Q values or adjusted p values could be used in this situation. At a given significance level, we could find peaks with significant differences after FDR control. 10.2.2 LIMMA The Linear Models for MicroArray Data (LIMMA) model could also be used for high-dimensional data like metabolomics. It uses a moderated t-statistic to estimate the effects, known as Empirical Bayes Statistics for Differential Expression. It is a hierarchical model that borrows information across all peaks to moderate the t-statistic of each peak. Such estimation is more robust. In LIMMA, we could add a known batch effect variable as a covariate in the model. LIMMA is different from the t-test or ANOVA, but we could still use p values and FDR control on LIMMA results. 10.2.3 Bayesian mixture model Another way to perform difference analysis is based on a Bayesian mixture model without p values. 
Such a model does not use hypothesis testing and directly generates posterior estimates of the parameters. A posterior probability could be used to check whether certain peaks are related to different conditions. If we want to compare a classical model like LIMMA with a Bayesian mixture model, we need to use simulation to find the cutoff. 10.3 PCA In most cases, PCA is used as an exploratory data analysis (EDA) method. In most of those cases, PCA simply serves as a visualization method. I mean, when I need to visualize some high-dimensional data, I would use PCA. So, the basic idea behind PCA is compression. When you have 100 samples with concentrations of a certain compound, you could plot the concentrations against the samples’ IDs. However, if you have 100 compounds to be analyzed, it would be hard to show the relationship between the samples. Actually, you need to show a matrix of samples and compounds (100 * 100 with the concentrations filled into the matrix) in an informative way. PCA would say: OK, guys, I could convert your data into only a 100 * 2 matrix with the loss of information minimized. Yeah, that is what the mathematicians or computer programmers do. You just run the PCA command. The two new “compounds” are correlated with the original 100 compounds and retain as much of the variance among them as possible. After such a projection, you would see the compressed relationship between the 100 samples. If some samples’ data are similar, they would be projected close together in the plot of the two new “compounds”. That is why PCA could be used for clustering, and the new “compounds” are referred to as principal components (PCs). However, you might ask why only two new compounds could finish such a task. I have to say, two PCs are just good for visualization. In most cases, we need to collect PCs accounting for more than 80% of the variance in our data if we want to recover the data from the PCs. 
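The compression idea and the 80% rule of thumb can be sketched in a few lines. This is only a minimal illustration in Python with NumPy, assuming simulated data (100 samples by 100 compounds driven by two hidden factors); the helper names pca and n_components_for are hypothetical and not part of any metabolomics package.

```python
import numpy as np

def pca(X):
    # Center each compound (column); PCA works on deviations from the mean.
    Xc = X - X.mean(axis=0)
    # Singular value decomposition of the centered matrix.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U * s                     # coordinates of the samples on the PCs
    loadings = Vt.T                    # one column of compound weights per PC
    var_ratio = s**2 / np.sum(s**2)    # fraction of variance carried by each PC
    return scores, loadings, var_ratio

def n_components_for(var_ratio, threshold=0.8):
    # Smallest number of PCs whose cumulative variance reaches the threshold.
    k = int(np.searchsorted(np.cumsum(var_ratio), threshold)) + 1
    return min(k, len(var_ratio))

# Simulated data (an assumption for illustration): 100 samples x 100 compounds
# generated from two hidden factors plus a little noise.
rng = np.random.default_rng(0)
factors = rng.normal(size=(100, 2))
weights = rng.normal(size=(2, 100))
X = factors @ weights + 0.1 * rng.normal(size=(100, 100))

scores, loadings, var_ratio = pca(X)
k = n_components_for(var_ratio, threshold=0.8)
plot_data = scores[:, :2]   # the 100 * 2 matrix used for a score plot
print(k, var_ratio[:2].sum())
```

Because the simulated compounds depend on only two hidden factors, the first two PCs carry nearly all of the variance here; real metabolomics data usually need more PCs to reach the same threshold.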
If the compounds have no relationship with each other, the PCs are still those 100 compounds. So you have found a property of the PCs: PCs are orthogonal to each other. Another issue is how to find the relationship between the compounds. We could use PCA to find the relationship between samples. However, we could also extract the influence of the compounds on certain PCs. You might find that many compounds show the same loading on the first PC. That means the concentration patterns of those compounds look similar. So PCA could also be used to explore the relationship between the compounds. OK, next time you might reach for PCA because you need it, not just because other papers showed it. Besides, there are some other uses of PCA. Loadings are actually correlation coefficients between peaks and their PC scores. Yamamoto et al. (Yamamoto et al. 2014) applied a t-test to this correlation coefficient and considered the peaks with statistically significant correlation to the PC score to be biologically meaningful for further study such as annotation. However, such analysis works better when a few PCs can explain most of the variance in the dataset. 10.4 Cluster Analysis After we have collected many samples and analyzed the concentrations of many compounds in them, we may ask about the relationship between the samples. You might have sampling information such as the date and the position, and you could use a boxplot or violin plot to explore the relationships among those categorical variables. However, you could also use the data itself to find potential relationships. But how? If two samples’ data are almost the same, we might think those samples come from the same potential group. On the other hand, how do we define the “same” in the data? Cluster analysis tells us to define a “distance” to measure the similarity between samples. Mathematically, such a distance can be defined in many different ways, such as the sum of the absolute values of the differences between samples. 
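As a minimal sketch of the distance idea just described, the snippet below computes the sum of absolute differences and, for comparison, the Euclidean distance between samples. It is written in Python with NumPy; the three samples and the helper names manhattan, euclidean and pairwise are hypothetical, chosen only for illustration.

```python
import numpy as np

def manhattan(a, b):
    # Sum of the absolute differences between two samples.
    return float(np.sum(np.abs(np.asarray(a) - np.asarray(b))))

def euclidean(a, b):
    # Square root of the sum of squared differences.
    return float(np.sqrt(np.sum((np.asarray(a) - np.asarray(b)) ** 2)))

def pairwise(samples, dist):
    # Symmetric matrix holding the distance between every pair of samples.
    n = len(samples)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = dist(samples[i], samples[j])
    return D

# Hypothetical amounts of three compounds in three samples.
samples = np.array([[1.0, 2.0, 3.0],
                    [2.0, 4.0, 6.0],
                    [10.0, 0.0, 5.0]])

D_abs = pairwise(samples, manhattan)
D_euc = pairwise(samples, euclidean)
print(D_abs)
```

Different distance definitions can change which samples end up closest, so it is worth reporting which one was used.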
For example, we analyzed the amounts of compounds A, B and C (in ng) in two samples and got the following results: Sample 1: A = 10, B = 13, C = 21; Sample 2: A = 54, B = 23, C = 16. The distance could be: \\[ distance = |10-54|+|13-23|+|21-16| = 59 \\] You could also use the sum of squares or another measure to represent the similarity. After you have defined a “distance”, you could get the distances between all pairs of your samples. If two samples’ distance is the smallest, put them together as one group. Then calculate the distances again and keep combining small groups into bigger groups until all of the samples are included in one group. Then draw a dendrogram of this process. The next issue is how to cluster the samples. You might set a cut-off and directly get the groups from the dendrogram. However, sometimes you are asked to cluster the samples into a fixed number of groups, such as three. In such a situation, you need K-means cluster analysis. The basic idea behind K-means is to generate three virtual samples and calculate the distances between those three virtual samples and all of the other samples. There would be three values for each sample. Choose the smallest value and assign that sample to the corresponding group. Then your samples are classified into three groups. You need to calculate the center of each of those three groups to get three new virtual samples. Repeat this process until the group members no longer change, and you get your samples classified. OK, the basic idea behind cluster analysis could be summarized as: define the distance, set your cut-off and find the groups. In this way, you might reveal potential relationships among samples. 10.5 PLSDA PLS-DA, OPLS-DA and HPSO-OPLS-DA (Qin Yang et al. 2017) could be used. Partial least squares discriminant analysis (PLSDA) was first used in the 1990s. However, partial least squares (PLS) was proposed in the 1960s by Herman Wold. 
Principal components analysis produces the weight matrix reflecting the covariance structure between the variables, while partial least squares produces the weight matrix reflecting the covariance structure between the variables and classes. After rotation by the weight matrix, the new variables would contain the relationship with classes. The classification performance of PLSDA is identical to linear discriminant analysis (LDA) if class sizes are balanced, or the columns are adjusted according to the mean of the class mean. If the number of variables exceeds the number of samples, LDA can be performed on the principal components. Quadratic discriminant analysis (QDA) could model nonlinear relationships between variables, while PLSDA is better for collinear variables. However, as a classifier, PLSDA has little advantage. The advantage of PLSDA is that the model could show the relationship between variables, which is not the goal of a regular classifier. Different algorithms (Andersson 2009) for PLSDA would give different scores, while PCA always gives the same scores with a fixed algorithm. For PCA, both new variables and classes are orthogonal. However, for PLS (Wold), only new classes are orthogonal. For PLS (Martens), only new variables are orthogonal. This paper shows the details of using such methods (Brereton and Lloyd 2018). Sparse PLS discriminant analysis (sPLS-DA) applies an L1 penalty for variable selection to remove the influence of unrelated variables, which makes sense for high-throughput omics data (Lê Cao, Boitard, and Besse 2011). For OPLS-DA, the S-plot could be used to find features (Wiklund et al. 2008). 10.6 Network analysis 10.6.1 Vertex and edge Each node is a vertex and the connection between nodes is an edge in the network. The connection can be directed or undirected depending on the relationship. 10.6.2 Build the network Adjacency matrices are commonly used to build the network. An adjacency matrix is a square matrix with n rows and n columns, where n is the number of vertices. 
The entry in row i and column j is equal to 1 if and only if vertices i and j are connected. In a directed network, such values could be 1 for i to j and -1 for j to i. 10.6.3 Network attributes Vertex/edge attributes could be the group information or metadata about the nodes/connections. The edges could be weighted as an attribute. A path is a route from one node to another in the network, and you could find the shortest among all such paths. The largest shortest-path distance between any two nodes of a graph is called its diameter. An undirected network is connected if there is a path from any vertex to any other. Connected networks can be further classified according to the strength of their connectedness. An undirected network with at least two paths between each pair of nodes is said to be biconnected. The transitivity of a network is a crude summary of its structure. A high value means that nodes are well connected locally with dense subgraphs. Network data sets typically show high transitivity. Maximum flows and minimum cuts could be used to check the largest volumes and smallest path flow between two nodes. For example, if two hubs are connected by one node, the largest volume and smallest path flow between two nodes from each hub could be counted at that node. A sparse network has a number of edges similar to its number of nodes. A dense network has a number of edges that is a quadratic function of the number of nodes. 10.7 Software MetaboAnalystR (Chong, Wishart, and Xia 2019) caret could employ more than 200 statistical models in a general framework to build/select models. You could also show the variable importance for some of the models. caretEnsemble Functions for creating ensembles of caret models pROC Tools for visualizing, smoothing and comparing receiver operating characteristic (ROC curves). (Partial) area under the curve (AUC) can be compared with statistical tests based on U-statistics or bootstrap. Confidence intervals can be computed for (p)AUC or ROC curves. 
gWQS Fits Weighted Quantile Sum (WQS) regressions for continuous, binomial, multinomial and count outcomes. Community ecology tool could be used to analysis metabolomic data(Passos Mansoldo et al. 2022). References Andersson, Martin. 2009. “A Comparison of Nine PLS1 Algorithms.” Journal of Chemometrics 23 (10): 518–29. https://doi.org/10.1002/cem.1248. Blaise, Benjamin J., Gonçalo D. S. Correia, Gordon A. Haggart, Izabella Surowiec, Caroline Sands, Matthew R. Lewis, Jake T. M. Pearce, et al. 2021. “Statistical Analysis in Metabolic Phenotyping.” Nature Protocols, July, 1–28. https://doi.org/10.1038/s41596-021-00579-1. Brereton, Richard G., and Gavin R. Lloyd. 2018. “Partial Least Squares Discriminant Analysis for Chemometrics and Metabolomics: How Scores, Loadings, and Weights Differ According to Two Common Algorithms.” Journal of Chemometrics 32 (4): e3028. https://doi.org/10.1002/cem.3028. Chong, Jasmine, David S. Wishart, and Jianguo Xia. 2019. “Using MetaboAnalyst 4.0 for Comprehensive and Integrative Metabolomics Data Analysis.” Current Protocols in Bioinformatics 68 (1): e86. https://doi.org/10.1002/cpbi.86. Lê Cao, Kim-Anh, Simon Boitard, and Philippe Besse. 2011. “Sparse PLS Discriminant Analysis: Biologically Relevant Feature Selection and Graphical Displays for Multiclass Problems.” BMC Bioinformatics 12 (June): 253. https://doi.org/10.1186/1471-2105-12-253. Ortmayr, Karin, Verena Charwat, Cornelia Kasper, Stephan Hann, and Gunda Koellensperger. 2016. “Uncertainty Budgeting in Fold Change Determination and Implications for Non-Targeted Metabolomics Studies in Model Systems” 142 (1): 80–90. https://doi.org/10.1039/C6AN01342B. Passos Mansoldo, Felipe Raposo, Rafael Garrett, Veronica da Silva Cardoso, Marina Amaral Alves, and Alane Beatriz Vermelho. 2022. “Metabology: Analysis of Metabolomics Data Using Community Ecology Tools.” Analytica Chimica Acta 1232 (November): 340469. https://doi.org/10.1016/j.aca.2022.340469. 
Wiklund, Susanne, Erik Johansson, Lina Sjöström, Ewa J. Mellerowicz, Ulf Edlund, John P. Shockcor, Johan Gottfries, Thomas Moritz, and Johan Trygg. 2008. “Visualization of GC/TOF-MS-Based Metabolomics Data for Identification of Biochemically Interesting Compounds Using OPLS Class Models.” Analytical Chemistry 80 (1): 115–22. https://doi.org/10.1021/ac0713510. Yamamoto, Hiroyuki, Tamaki Fujimori, Hajime Sato, Gen Ishikawa, Kenjiro Kami, and Yoshiaki Ohashi. 2014. “Statistical Hypothesis Testing of Factor Loading in Principal Component Analysis and Its Application to Metabolite Set Enrichment Analysis.” BMC Bioinformatics 15 (February): 51. https://doi.org/10.1186/1471-2105-15-51. Yang, Qin, Shan-Shan Lin, Jiang-Tao Yang, Li-Juan Tang, and Ru-Qin Yu. 2017. “Detection of Inborn Errors of Metabolism Utilizing GC-MS Urinary Metabolomics Coupled with a Modified Orthogonal Partial Least Squares Discriminant Analysis.” Talanta 165 (April): 545–52. https://doi.org/10.1016/j.talanta.2017.01.018. "],["exposome.html", "Chapter 11 Exposome 11.1 Internal exposure 11.2 External exposure", " Chapter 11 Exposome Nature or nurture debate has a similar paradigm in environmental study: is the ecological system and human health risk dominated by heredity or environment? Twins and siblings study(Lakhani et al. 2019; Polderman et al. 2015) show that both heritability and environmental factors could explain the phenotypic variance among population. The contribution of environment among different disease functional domain such as hematological and endocrine could achieve almost half of the total variances (Polderman et al. 2015). However, besides those epidemiology proof, little is known about the influences of overall environmental exposure process at molecular level. Conventional exposure study always investigate one or several specific compounds and their environmental fate or toxicology endpoint. 
Exposome research, on the other hand, tries to assess as many exposure factors as possible from biological or environmental samples without a predefined compound list. Those endogenous and exogenous molecules can reveal the exposure process in detail. The exposome could help to investigate not only comprehensive molecular-level changes, but also the interactions among molecules in a non-targeted design. By annotating the captured compounds, exposome studies can discover exposure markers for certain types of pollution, as well as biomarkers for certain exposure processes, and discuss the related physiological processes. The workflow for exposome studies is quite similar to that of metabolomics (X. Hu et al. 2021). According to the CDC, the exposome can be defined as the measure of all the exposures of an individual in a lifetime and how those exposures relate to health. Exposomics is the study of the exposome and relies on the application of internal and external exposure assessment methods. Internal exposure relies on fields of study such as genomics, metabolomics, lipidomics, transcriptomics and proteomics. External exposure assessment relies on measuring environmental stresses. The Human Early Life Exposome (HELIX) project (Maitre et al. 2022), a multi-centre cohort of 1301 mother-child pairs, associated individual exposomes consisting of &gt;100 chemical, outdoor, social and lifestyle exposures assessed in pregnancy and childhood, with multi-omics profiles (methylome, transcriptome, proteins and metabolites) in childhood. The data can be found online. “Molecular gatekeepers”, key metabolites that link single or multiple exposure biomarkers with correlated clusters of endogenous metabolites, could be used to find health-relevant biological metabolites (M. Yu et al. 2022). 11.1 Internal exposure Virtual Metabolic Human Database integrating human and gut microbiome metabolism with nutrition and disease. 
11.2 External exposure 11.2.1 Environmental fate of compounds 11.2.1.1 QSPR Chemicalize is a powerful online platform for chemical calculations, search, and text processing. QSPR molecular descriptor generation tools list. SPARC uses computational algorithms based on fundamental chemical structure theory to estimate a wide variety of reactivity parameters strictly from molecular structure. OPERA provides models for predicting physicochemical properties and environmental fate endpoints (Mansouri et al. 2018). LogP is important for analytical chemistry. Mannhold et al. (Mannhold et al. 2009) reported a comprehensive comparison of logP algorithms. Later, Rajarshi Guha compared logP algorithms with CDK based on the logPstar dataset. Commercial software such as SPARC, ACD/Labs and ChemAxon often claims better performance on in-house datasets than public software such as KOWWIN within EPI Suite. However, we should carefully evaluate the influence of logP accuracy on metabolites or unknown compounds. 11.2.1.2 Fate The Wania Group developed software tools to address various aspects of organic contaminant fate and behaviour. Trent University released models, such as Level 3, to predict the environmental fate of pollutants. EAWAG-BBD provides information on microbial enzyme-catalyzed reactions that are important for biotechnology. 11.2.2 Exposure study database The information system PANGAEA is operated as an Open Access library aimed at archiving, publishing and distributing georeferenced data from earth system research. Environmental Health Criteria (EHC) Monographs. CTD is a robust, publicly available database that aims to advance understanding about how environmental exposures affect human health. ODMOA facilitates and coordinates the collection, access to, and use of public health data in order to monitor and improve population health. This data is better suited to general public health research for Massachusetts. 
The Surveillance, Epidemiology, and End Results (SEER) Program provides information on cancer statistics in an effort to reduce the cancer burden among the U.S. population. References Hu, Xin, Douglas I. Walker, Yongliang Liang, Matthew Ryan Smith, Michael L. Orr, Brian D. Juran, Chunyu Ma, et al. 2021. “A Scalable Workflow to Characterize the Human Exposome.” Nature Communications 12 (1): 5575. https://doi.org/10.1038/s41467-021-25840-9. Lakhani, Chirag M., Braden T. Tierney, Arjun K. Manrai, Jian Yang, Peter M. Visscher, and Chirag J. Patel. 2019. “Repurposing Large Health Insurance Claims Data to Estimate Genetic and Environmental Contributions in 560 Phenotypes.” Nature Genetics 51 (2): 327–34. https://doi.org/10.1038/s41588-018-0313-7. Maitre, Léa, Mariona Bustamante, Carles Hernández-Ferrer, Denise Thiel, Chung-Ho E. Lau, Alexandros P. Siskos, Marta Vives-Usano, et al. 2022. “Multi-Omics Signatures of the Human Early Life Exposome.” Nature Communications 13 (1): 7024. https://doi.org/10.1038/s41467-022-34422-2. Mannhold, Raimund, Gennadiy I. Poda, Claude Ostermann, and Igor V. Tetko. 2009. “Calculation of Molecular Lipophilicity: State-of-the-Art and Comparison of LogP Methods on More Than 96,000 Compounds.” Journal of Pharmaceutical Sciences 98 (3): 861–93. https://doi.org/10.1002/jps.21494. Mansouri, Kamel, Chris M. Grulke, Richard S. Judson, and Antony J. Williams. 2018. “OPERA Models for Predicting Physicochemical Properties and Environmental Fate Endpoints.” Journal of Cheminformatics 10 (1): 10. https://doi.org/10.1186/s13321-018-0263-1. Polderman, Tinca J. C., Beben Benyamin, Christiaan A. de Leeuw, Patrick F. Sullivan, Arjen van Bochoven, Peter M. Visscher, and Danielle Posthuma. 2015. “Meta-Analysis of the Heritability of Human Traits Based on Fifty Years of Twin Studies.” Nature Genetics 47 (7): 702–9. https://doi.org/10.1038/ng.3285. Yu, Miao, Susan L. Teitelbaum, Georgia Dolios, Lam-Ha T. Dang, Peijun Tu, Mary S. Wolff, and Lauren M. Petrick. 
2022. “Molecular Gatekeeper Discovery: Workflow for Linking Multiple Exposure Biomarkers to Metabolomics.” Environmental Science &amp; Technology 56 (10): 6162–71. https://doi.org/10.1021/acs.est.1c04039. "],["references.html", "References", " References Abrahamsson, Dimitri, Christopher L. Brueck, Carsten Prasse, Dimitra A. Lambropoulou, Lelouda-Athanasia Koronaiou, Miaomiao Wang, June-Soo Park, and Tracey J. Woodruff. 2023. “Extracting Structural Information from Physicochemical Property Measurements Using Machine Learning-A New Approach for Structure Elucidation in Non-targeted Analysis.” Environmental Science &amp; Technology, September. https://doi.org/10.1021/acs.est.3c03003. Adams, Kendra J., Brian Pratt, Neelanjan Bose, Laura G. Dubois, Lisa St John-Williams, Kevin M. Perrott, Karina Ky, et al. 2020. “Skyline for Small Molecules: A Unifying Software Package for Quantitative Metabolomics.” Journal of Proteome Research 19 (4): 1447–58. https://doi.org/10.1021/acs.jproteome.9b00640. Aguilar-Mogas, Antoni, Marta Sales-Pardo, Miriam Navarro, Roger Guimerà, and Oscar Yanes. 2017. “iMet: A Network-Based Computational Tool To Assist in the Annotation of Metabolites from Tandem Mass Spectra.” Analytical Chemistry 89 (6): 3474–82. https://doi.org/10.1021/acs.analchem.6b04512. Alden, Nicholas, Smitha Krishnan, Vladimir Porokhin, Ravali Raju, Kyle McElearney, Alan Gilbert, and Kyongbum Lee. 2017. “Biologically Consistent Annotation of Metabolomics Data.” Analytical Chemistry, November. https://doi.org/10.1021/acs.analchem.7b02162. Ali, Ahmed, Yasmine Abouleila, Yoshihiro Shimizu, Eiso Hiyama, Samy Emara, Alireza Mashaghi, and Thomas Hankemeier. 2019. “Single-Cell Metabolomics by Mass Spectrometry: Advances, Challenges, and Future Applications.” TrAC Trends in Analytical Chemistry 120 (November): 115436. https://doi.org/10.1016/j.trac.2019.02.033. 
Alka, Oliver, Timo Sachsenberg, Leon Bichmann, Julianus Pfeuffer, Hendrik Weisser, Samuel Wein, Eugen Netz, Marc Rurik, Oliver Kohlbacher, and Hannes Röst. 2020. “CHAPTER 6:OpenMS and KNIME for Mass Spectrometry Data Processing.” In Processing Metabolomics and Proteomics Data with Open Software, 201–31. https://doi.org/10.1039/9781788019880-00201. Alka, Oliver, Premy Shanthamoorthy, Michael Witting, Karin Kleigrewe, Oliver Kohlbacher, and Hannes L. Röst. 2022. “DIAMetAlyzer Allows Automated False-Discovery Rate-Controlled Analysis for Data-Independent Acquisition in Metabolomics.” Nature Communications 13 (1): 1347. https://doi.org/10.1038/s41467-022-29006-z. Allam-Ndoul, Bénédicte, Frédéric Guénard, Véronique Garneau, Hubert Cormier, Olivier Barbier, Louis Pérusse, and Marie-Claude Vohl. 2016. “Association Between Metabolite Profiles, Metabolic Syndrome and Obesity Status.” Nutrients 8 (6): 324. https://doi.org/10.3390/nu8060324. Allard, Pierre-Marie, Grégory Genta-Jouve, and Jean-Luc Wolfender. 2017. “Deep Metabolome Annotation in Natural Products Research: Towards a Virtuous Cycle in Metabolite Identification.” Current Opinion in Chemical Biology, Omics, 36 (February): 40–49. https://doi.org/10.1016/j.cbpa.2016.12.022. Allen, Felicity, Allison Pon, Michael Wilson, Russ Greiner, and David Wishart. 2014. “CFM-ID: A Web Server for Annotation, Spectrum Prediction and Metabolite Identification from Tandem Mass Spectra.” Nucleic Acids Research 42 (W1): W94–99. https://doi.org/10.1093/nar/gku436. Alonso, Arnald, Sara Marsal, and Antonio Julià. 2015. “Analytical Methods in Untargeted Metabolomics: State of the Art in 2015.” Frontiers in Bioengineering and Biotechnology 3 (March). https://doi.org/10.3389/fbioe.2015.00023. Anderson, Brady G., Alexander Raskind, Hani Habra, Robert T. Kennedy, and Charles R. Evans. 2021. “Modifying Chromatography Conditions for Improved Unknown Feature Identification in Untargeted Metabolomics.” Analytical Chemistry 93 (48): 15840–49. 
https://doi.org/10.1021/acs.analchem.1c02149. Andersson, Martin. 2009. “A Comparison of Nine PLS1 Algorithms.” Journal of Chemometrics 23 (10): 518–29. https://doi.org/10.1002/cem.1248. Aron, Allegra T., Emily C. Gentry, Kerry L. McPhail, Louis-Félix Nothias, Mélissa Nothias-Esposito, Amina Bouslimani, Daniel Petras, et al. 2020. “Reproducible Molecular Networking of Untargeted Mass Spectrometry Data Using GNPS.” Nature Protocols 15 (6): 1954–91. https://doi.org/10.1038/s41596-020-0317-5. Bach, Eric, Emma L. Schymanski, and Juho Rousu. 2022. “Joint Structural Annotation of Small Molecules Using Liquid Chromatography Retention Order and Tandem Mass Spectrometry Data.” Nature Machine Intelligence 4 (12): 1224–37. https://doi.org/10.1038/s42256-022-00577-2. Bai, Caihong, Suyun Xu, Jingyi Tang, Yuxi Zhang, Jiahui Yang, and Kaifeng Hu. 2022. “A ‘Shape-Orientated’ Algorithm Employing an Adapted Marr Wavelet and Shape Matching Index Improves the Performance of Continuous Wavelet Transform for Chromatographic Peak Detection and Quantification.” Journal of Chromatography A 1673 (June): 463086. https://doi.org/10.1016/j.chroma.2022.463086. Baker, Monya. 2011. “Metabolomics: From Small Molecules to Big Ideas.” Nature Methods 8 (2): 117–21. https://doi.org/10.1038/nmeth0211-117. Baran, Richard, and Trent R. Northen. 2013. “Robust Automated Mass Spectra Interpretation and Chemical Formula Calculation Using Mixed Integer Linear Programming.” Analytical Chemistry 85 (20): 9777–84. https://doi.org/10.1021/ac402180c. Barbier Saint Hilaire, Pierre, Ulli M. Hohenester, Benoit Colsch, Jean-Claude Tabet, Christophe Junot, and François Fenaille. 2018. “Evaluation of the High-Field Orbitrap Fusion for Compound Annotation in Metabolomics.” Analytical Chemistry 90 (5): 3030–35. https://doi.org/10.1021/acs.analchem.7b05372. Barnes, Stephen, H. Paul Benton, Krista Casazza, Sara J. Cooper, Xiangqin Cui, Xiuxia Du, Jeffrey Engler, et al. 2016a. “Training in Metabolomics Research. I. 
Designing the Experiment, Collecting and Extracting Samples and Generating Metabolomics Data.” Journal of Mass Spectrometry 51 (7): 461–75. https://doi.org/10.1002/jms.3782. ———, et al. 2016b. “Training in Metabolomics Research. II. Processing and Statistical Analysis of Metabolomics Data, Metabolite Identification, Pathway Analysis, Applications of Metabolomics and Its Future.” Journal of Mass Spectrometry 51 (8): 535–48. https://doi.org/10.1002/jms.3780. Barranco-Altirriba, Maria, Pol Solà-Santos, Sergio Picart-Armada, Samir Kanaan-Izquierdo, Jordi Fonollosa, and Alexandre Perera-Lluna. 2021. “mWISE: An Algorithm for Context-Based Annotation of Liquid Chromatography–Mass Spectrometry Features Through Diffusion in Graphs.” Analytical Chemistry 93 (31): 10772–78. https://doi.org/10.1021/acs.analchem.1c00238. Basu, Sumanta, William Duren, Charles R. Evans, Charles F. Burant, George Michailidis, and Alla Karnovsky. 2017. “Sparse Network Modeling and Metscape-Based Visualization Methods for the Analysis of Large-Scale Metabolomics Data.” Bioinformatics 33 (10): 1545–53. https://doi.org/10.1093/bioinformatics/btx012. Baygi, Sadjad Fakouri, Sanjay K. Banerjee, Praloy Chakraborty, Yashwant Kumar, and Dinesh Kumar Barupal. 2022. “IDSL.UFA Assigns High-Confidence Molecular Formula Annotations for Untargeted LC/HRMS Data Sets in Metabolomics and Exposomics.” Analytical Chemistry 94 (39): 13315–22. https://doi.org/10.1021/acs.analchem.2c00563. Baygi, Sadjad Fakouri, Yashwant Kumar, and Dinesh Kumar Barupal. 2023. “IDSL.CSA: Composite Spectra Analysis for Chemical Annotation of Untargeted Metabolomics Datasets.” Analytical Chemistry, June. https://doi.org/10.1021/acs.analchem.3c00376. Beale, David J., Farhana R. Pinu, Konstantinos A. Kouremenos, Mahesha M. Poojary, Vinod K. Narayana, Berin A. Boughton, Komal Kanojia, Saravanan Dayalan, Oliver A. H. Jones, and Daniel A. Dias. 2018. 
“Review of Recent Developments in GC–MS Approaches to Metabolomics-Based Research.” Metabolomics 14 (11): 152. https://doi.org/10.1007/s11306-018-1449-2. Begou, O., H. G. Gika, I. D. Wilson, and G. Theodoridis. 2017. “Hyphenated MS-based Targeted Approaches in Metabolomics.” Analyst 142 (17): 3079–3100. https://doi.org/10.1039/C7AN00812K. Benjamini, Yoav, and Yosef Hochberg. 1995. “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing.” Journal of the Royal Statistical Society. Series B (Methodological) 57 (1): 289–300. https://www.jstor.org/stable/2346101. Bennett, Bryson D., Elizabeth H. Kimball, Melissa Gao, Robin Osterhout, Stephen J. Van Dien, and Joshua D. Rabinowitz. 2009. “Absolute Metabolite Concentrations and Implied Enzyme Active Site Occupancy in Escherichia Coli.” Nature Chemical Biology 5 (8): 593–99. https://doi.org/10.1038/nchembio.186. Bernardo-Bermejo, Samuel, Jingchuan Xue, Linh Hoang, Elizabeth Billings, Bill Webb, M. Willy Honders, Sanne Venneker, et al. 2023. “Quantitative Multiple Fragment Monitoring with Enhanced in-Source Fragmentation/Annotation Mass Spectrometry.” Nature Protocols, February, 1–20. https://doi.org/10.1038/s41596-023-00803-0. Bertsch, Andreas, Clemens Gröpl, Knut Reinert, and Oliver Kohlbacher. 2011. “OpenMS and TOPP: Open Source Software for LC-MS Data Analysis.” In Data Mining in Proteomics: From Standards to Applications, edited by Michael Hamacher, Martin Eisenacher, and Christian Stephan, 353–67. Methods in Molecular Biology. Totowa, NJ: Humana Press. https://doi.org/10.1007/978-1-60761-987-1_23. Bijttebier, Sebastiaan, Anastasia Van der Auwera, Kenn Foubert, Stefan Voorspoels, Luc Pieters, and Sandra Apers. 2016. “Bridging the Gap Between Comprehensive Extraction Protocols in Plant Metabolomics Studies and Method Validation.” Analytica Chimica Acta 935 (September): 136–50. https://doi.org/10.1016/j.aca.2016.06.047. Bilbao, Aivett, Nathalie Munoz, Joonhoon Kim, Daniel J. 
Orton, Yuqian Gao, Kunal Poorey, Kyle R. Pomraning, et al. 2023. “PeakDecoder Enables Machine Learning-Based Metabolite Annotation and Accurate Profiling in Multidimensional Mass Spectrometry Measurements.” Nature Communications 14 (1): 2461. https://doi.org/10.1038/s41467-023-37031-9. Bilbao, Aivett, Emmanuel Varesio, Jeremy Luban, Caterina Strambio-De-Castillia, Gérard Hopfgartner, Markus Müller, and Frédérique Lisacek. 2015. “Processing Strategies and Software Solutions for Data-Independent Acquisition in Mass Spectrometry.” PROTEOMICS 15 (5-6): 964–80. https://doi.org/10.1002/pmic.201400323. Bittremieux, Wout, Nicole E. Avalon, Sydney P. Thomas, Sarvar A. Kakhkhorov, Alexander A. Aksenov, Paulo Wender P. Gomes, Christine M. Aceves, et al. 2023. “Open Access Repository-Scale Propagated Nearest Neighbor Suspect Spectral Library for Untargeted Metabolomics.” Nature Communications 14 (1): 8488. https://doi.org/10.1038/s41467-023-44035-y. Bittremieux, Wout, Robin Schmid, Florian Huber, Justin J. J. van der Hooft, Mingxun Wang, and Pieter C. Dorrestein. 2022. “Comparison of Cosine, Modified Cosine, and Neutral Loss Based Spectrum Alignment For Discovery of Structurally Related Molecules.” Journal of the American Society for Mass Spectrometry 33 (9): 1733–44. https://doi.org/10.1021/jasms.2c00153. Blaise, Benjamin J. 2013. “Data-Driven Sample Size Determination for Metabolic Phenotyping Studies.” Analytical Chemistry 85 (19): 8943–50. https://doi.org/10.1021/ac4022314. Blaise, Benjamin J., Gonçalo D. S. Correia, Gordon A. Haggart, Izabella Surowiec, Caroline Sands, Matthew R. Lewis, Jake T. M. Pearce, et al. 2021. “Statistical Analysis in Metabolic Phenotyping.” Nature Protocols, July, 1–28. https://doi.org/10.1038/s41596-021-00579-1. Blaise, Benjamin J., Gonçalo Correia, Adrienne Tin, J. Hunter Young, Anne-Claire Vergnaud, Matthew Lewis, Jake T. M. Pearce, et al. 2016. 
“Power Analysis and Sample Size Determination in Metabolic Phenotyping.” Analytical Chemistry 88 (10): 5179–88. https://doi.org/10.1021/acs.analchem.6b00188. Blaženović, Ivana, Tobias Kind, Hrvoje Torbašinović, Slobodan Obrenović, Sajjan S. Mehta, Hiroshi Tsugawa, Tobias Wermuth, et al. 2017. “Comprehensive Comparison of in Silico MS/MS Fragmentation Tools of the CASMI Contest: Database Boosting Is Needed to Achieve 93% Accuracy.” Journal of Cheminformatics 9 (1): 32. https://doi.org/10.1186/s13321-017-0219-x. Bonini, Paolo, Tobias Kind, Hiroshi Tsugawa, Dinesh Kumar Barupal, and Oliver Fiehn. 2020. “Retip: Retention Time Prediction for Compound Annotation in Untargeted Metabolomics.” Analytical Chemistry 92 (11): 7515–22. https://doi.org/10.1021/acs.analchem.9b05765. Bonnefille, Bénilde, Oskar Karlsson, May Britt Rian, Rubhana Raqib, Faruque Parvez, Stefano Papazian, M. Sirajul Islam, and Jonathan W. Martin. 2023. “Nontarget Analysis of Polluted Surface Waters in Bangladesh Using Open Science Workflows.” Environmental Science &amp; Technology, April. https://doi.org/10.1021/acs.est.2c08200. Bonner, Ron, and Gérard Hopfgartner. 2018. “SWATH Data Independent Acquisition Mass Spectrometry for Metabolomics.” TrAC Trends in Analytical Chemistry, October. https://doi.org/10.1016/j.trac.2018.10.014. Box, George E. P., J. Stuart Hunter, and William G. Hunter. 2005. Statistics for Experimenters. Wiley-Interscience. Brereton, Richard G., and Gavin R. Lloyd. 2018. “Partial Least Squares Discriminant Analysis for Chemometrics and Metabolomics: How Scores, Loadings, and Weights Differ According to Two Common Algorithms.” Journal of Chemometrics 32 (4): e3028. https://doi.org/10.1002/cem.3028. Broadhurst, David, Royston Goodacre, Stacey N. Reinke, Julia Kuligowski, Ian D. Wilson, Matthew R. Lewis, and Warwick B. Dunn. 2018. 
“Guidelines and Considerations for the Use of System Suitability and Quality Control Samples in Mass Spectrometry Assays Applied in Untargeted Clinical Metabolomic Studies.” Metabolomics 14 (6). https://doi.org/10.1007/s11306-018-1367-3. Broeckling, C. D., F. A. Afsar, S. Neumann, A. Ben-Hur, and J. E. Prenni. 2014. “RAMClust: A Novel Feature Clustering Method Enables Spectral-Matching-Based Annotation for Metabolomics Data.” Analytical Chemistry 86 (14): 6812–17. https://doi.org/10.1021/ac501530d. Broeckling, Corey D., Richard D. Beger, Leo L. Cheng, Raquel Cumeras, Daniel J. Cuthbertson, Surendra Dasari, W. Clay Davis, et al. 2023. “Current Practices in LC-MS Untargeted Metabolomics: A Scoping Review on the Use of Pooled Quality Control Samples.” Analytical Chemistry 95 (51): 18645–54. https://doi.org/10.1021/acs.analchem.3c02924. Broeckling, Corey D., Andrea Ganna, Mark Layer, Kevin Brown, Ben Sutton, Erik Ingelsson, Graham Peers, and Jessica E. Prenni. 2016. “Enabling Efficient and Confident Annotation of LC-MS Metabolomics Data Through MS1 Spectrum and Time Prediction.” Analytical Chemistry 88 (18): 9226–34. https://doi.org/10.1021/acs.analchem.6b02479. Bundy, Jacob G., Matthew P. Davey, and Mark R. Viant. 2009. “Environmental Metabolomics: A Critical Review and Future Perspectives.” Metabolomics 5 (1): 3. https://doi.org/10.1007/s11306-008-0152-0. Cai, Jingwei, and Zhengyin Yan. 2021. “Re-Examining the Impact of Minimal Scans in Liquid Chromatography–Mass Spectrometry Analysis.” Journal of the American Society for Mass Spectrometry, June. https://doi.org/10.1021/jasms.1c00073. Cai, Qingpo, Jessica A. Alvarez, Jian Kang, and Tianwei Yu. 2017. “Network Marker Selection for Untargeted LC–MS Metabolomics Data.” Journal of Proteome Research 16 (3): 1261–69. https://doi.org/10.1021/acs.jproteome.6b00861. Cajka, Tomas, and Oliver Fiehn. 2016. 
“Toward Merging Untargeted and Targeted Methods in Mass Spectrometry-Based Metabolomics and Lipidomics.” Analytical Chemistry 88 (1): 524–45. https://doi.org/10.1021/acs.analchem.5b04491. Calbiani, F., M. Careri, L. Elviri, A. Mangia, and I. Zagnoni. 2006. “Matrix Effects on Accurate Mass Measurements of Low-Molecular Weight Compounds Using Liquid Chromatography-Electrospray-Quadrupole Time-of-Flight Mass Spectrometry.” Journal of Mass Spectrometry 41 (3): 289–94. https://doi.org/10.1002/jms.984. Carroll, Adam J., Murray R. Badger, and A. Harvey Millar. 2010. “The MetabolomeExpress Project: Enabling Web-Based Processing, Analysis and Transparent Dissemination of GC/MS Metabolomics Datasets.” BMC Bioinformatics 11 (1): 376. https://doi.org/10.1186/1471-2105-11-376. Castro-Puyana, María, Raquel Pérez-Míguez, Lidia Montero, and Miguel Herrero. 2017. “Application of Mass Spectrometry-Based Metabolomics Approaches for Food Safety, Quality and Traceability.” TrAC Trends in Analytical Chemistry 93 (August): 102–18. https://doi.org/10.1016/j.trac.2017.05.004. Chaker, Jade, David Møbjerg Kristensen, Thorhallur Ingi Halldorsson, Sjurdur Frodi Olsen, Christine Monfort, Cécile Chevrier, Bernard Jégou, and Arthur David. 2022. “Comprehensive Evaluation of Blood Plasma and Serum Sample Preparations for HRMS-Based Chemical Exposomics: Overlaps and Specificities.” Analytical Chemistry 94 (2): 866–74. https://doi.org/10.1021/acs.analchem.1c03638. Chaleckis, Romanas, Isabel Meister, Pei Zhang, and Craig E Wheelock. 2019. “Challenges, Progress and Promises of Metabolite Annotation for LC–MS-based Metabolomics.” Current Opinion in Biotechnology, Analytical Biotechnology, 55 (February): 44–50. https://doi.org/10.1016/j.copbio.2018.07.010. Chambers, Matthew C., Brendan Maclean, Robert Burke, Dario Amodei, Daniel L. Ruderman, Steffen Neumann, Laurent Gatto, et al. 2012. “A Cross-Platform Toolkit for Mass Spectrometry and Proteomics.” Nature Biotechnology 30 (October): 918–20. 
https://doi.org/10.1038/nbt.2377. Chang, Hui-Yin, Ching-Tai Chen, T. Mamie Lih, Ke-Shiuan Lynn, Chiun-Gung Juo, Wen-Lian Hsu, and Ting-Yi Sung. 2016. “iMet-Q: A User-Friendly Tool for Label-Free Metabolomics Quantitation Using Dynamic Peak-Width Determination.” PLOS ONE 11 (1): e0146112. https://doi.org/10.1371/journal.pone.0146112. Chang, Hui-Yin, Sean M. Colby, Xiuxia Du, Javier D. Gomez, Maximilian J. Helf, Katerina Kechris, Christine R. Kirkpatrick, et al. 2021. “A Practical Guide to Metabolomics Software Development.” Analytical Chemistry 93 (4): 1912–23. https://doi.org/10.1021/acs.analchem.0c03581. Charbonnet, Joseph A., Carrie A. McDonough, Feng Xiao, Trever Schwichtenberg, Dunping Cao, Sarit Kaserzon, Kevin V. Thomas, et al. 2022. “Communicating Confidence of Per- and Polyfluoroalkyl Substance Identification via High-Resolution Mass Spectrometry.” Environmental Science &amp; Technology Letters, May. https://doi.org/10.1021/acs.estlett.2c00206. Chen, Gengbo, Scott Walmsley, Gemmy C. M. Cheung, Liyan Chen, Ching-Yu Cheng, Roger W. Beuerman, Tien Yin Wong, Lei Zhou, and Hyungwon Choi. 2017. “Customized Consensus Spectral Library Building for Untargeted Quantitative Metabolomics Analysis with Data Independent Acquisition Mass Spectrometry and MetaboDIA Workflow.” Analytical Chemistry 89 (9): 4897–4906. https://doi.org/10.1021/acs.analchem.6b05006. Chen, Li, Wenyun Lu, Lin Wang, Xi Xing, Ziyang Chen, Xin Teng, Xianfeng Zeng, et al. 2021. “Metabolite Discovery Through Global Annotation of Untargeted Metabolomics Data.” Nature Methods 18 (11): 1377–85. https://doi.org/10.1038/s41592-021-01303-3. Chen, Yanhua, Zhi Zhou, Wei Yang, Nan Bi, Jing Xu, Jiuming He, Ruiping Zhang, Lvhua Wang, and Zeper Abliz. 2017. “Development of a Data-Independent Targeted Metabolomics Method for Relative Quantification Using Liquid Chromatography Coupled with Tandem Mass Spectrometry.” Analytical Chemistry 89 (13): 6954–62. https://doi.org/10.1021/acs.analchem.6b04727. 
Cheng, Susan, Svati H. Shah, Elizabeth J. Corwin, Oliver Fiehn, Robert L. Fitzgerald, Robert E. Gerszten, Thomas Illig, et al. 2017. “Potential Impact and Study Considerations of Metabolomics in Cardiovascular Health and Disease: A Scientific Statement From the American Heart Association.” Circulation: Cardiovascular Genetics 10 (2): e000032. https://doi.org/10.1161/HCG.0000000000000032. Choi, Meena, Ching-Yun Chang, Timothy Clough, Daniel Broudy, Trevor Killeen, Brendan MacLean, and Olga Vitek. 2014. “MSstats: An R Package for Statistical Analysis of Quantitative Mass Spectrometry-Based Proteomic Experiments.” Bioinformatics 30 (17): 2524–26. https://doi.org/10.1093/bioinformatics/btu305. Chokkathukalam, Achuthanunni, Andris Jankevics, Darren J. Creek, Fiona Achcar, Michael P. Barrett, and Rainer Breitling. 2013. “mzMatch–ISO: An R Tool for the Annotation and Relative Quantification of Isotope-Labelled Mass Spectrometry Data.” Bioinformatics 29 (2): 281–83. https://doi.org/10.1093/bioinformatics/bts674. Chong, Jasmine, David S. Wishart, and Jianguo Xia. 2019. “Using MetaboAnalyst 4.0 for Comprehensive and Integrative Metabolomics Data Analysis.” Current Protocols in Bioinformatics 68 (1): e86. https://doi.org/10.1002/cpbi.86. Clasquin, Michelle F., Eugene Melamud, and Joshua D. Rabinowitz. 2012. “LC-MS Data Processing with MAVEN: A Metabolomic Analysis and Visualization Engine.” Current Protocols in Bioinformatics 37 (1): 14.11.1–23. https://doi.org/10.1002/0471250953.bi1411s37. Climaco Pinto, Rui, Ibrahim Karaman, Matthew R. Lewis, Jenny Hällqvist, Manuja Kaluarachchi, Gonçalo Graça, Elena Chekmeneva, et al. 2022. “Finding Correspondence Between Metabolomic Features in Untargeted Liquid Chromatography–Mass Spectrometry Metabolomics Datasets.” Analytical Chemistry 94 (14): 5493–503. https://doi.org/10.1021/acs.analchem.1c03592. Codrean, S., B. Kruit, N. Meekel, D. Vughs, and F. Béen. 2023. 
“Predicting the Diagnostic Information of Tandem Mass Spectra of Environmentally Relevant Compounds Using Machine Learning.” Analytical Chemistry, October. https://doi.org/10.1021/acs.analchem.3c03470. Colby, Sean M., Christine H. Chang, Jessica L. Bade, Jamie R. Nunez, Madison R. Blumer, Daniel J. Orton, Kent J. Bloodsworth, et al. 2022. “DEIMoS: An Open-Source Tool for Processing High-Dimensional Mass Spectrometry Data.” Analytical Chemistry 94 (16): 6130–38. https://doi.org/10.1021/acs.analchem.1c05017. Considine, E. C., G. Thomas, A. L. Boulesteix, A. S. Khashan, and L. C. Kenny. 2017. “Critical Review of Reporting of the Data Analysis Step in Metabolomics.” Metabolomics 14 (1): 7. https://doi.org/10.1007/s11306-017-1299-3. Creek, Darren J., Andris Jankevics, Karl E. V. Burgess, Rainer Breitling, and Michael P. Barrett. 2012. “IDEOM: An Excel Interface for Analysis of LC–MS-based Metabolomics Data.” Bioinformatics 28 (7): 1048–49. https://doi.org/10.1093/bioinformatics/bts069. Dagan, Shai, Dana Marder, Nitzan Tzanani, Eyal Drug, Hagit Prihed, and Lilach Yishai-Aviram. 2023. “Evaluation of Matrix Complexity in Nontargeted Analysis of Small-Molecule Toxicants by Liquid Chromatography–High-Resolution Mass Spectrometry.” Analytical Chemistry 95 (20): 7924–32. https://doi.org/10.1021/acs.analchem.3c00413. Daly, Rónán, Simon Rogers, Joe Wandy, Andris Jankevics, Karl E. V. Burgess, and Rainer Breitling. 2014. “MetAssign: Probabilistic Annotation of Metabolites from LC–MS Data Using a Bayesian Clustering Approach.” Bioinformatics 30 (19): 2764–71. https://doi.org/10.1093/bioinformatics/btu370. de Jonge, Niek F., Joris J. R. Louwen, Elena Chekmeneva, Stephane Camuzeaux, Femke J. Vermeir, Robert S. Jansen, Florian Huber, and Justin J. J. van der Hooft. 2023. “MS2Query: Reliable and Scalable MS2 Mass Spectra-Based Analogue Search.” Nature Communications 14 (1): 1752. https://doi.org/10.1038/s41467-023-37446-4. De Livera, Alysha M., Daniel A. 
Dias, David De Souza, Thusitha Rupasinghe, James Pyke, Dedreia Tull, Ute Roessner, Malcolm McConville, and Terence P. Speed. 2012. “Normalizing and Integrating Metabolomics Data.” Analytical Chemistry 84 (24): 10768–76. https://doi.org/10.1021/ac302748b. Deda, Olga, Anastasia Chrysovalantou Chatziioannou, Stella Fasoula, Dimitris Palachanis, Nicolaos Raikos, Georgios A. Theodoridis, and Helen G. Gika. 2017. “Sample Preparation Optimization in Fecal Metabolic Profiling.” Journal of Chromatography B, Advances in mass spectrometry-based applications, 1047 (March): 115–23. https://doi.org/10.1016/j.jchromb.2016.06.047. DeFelice, Brian C., Sajjan Singh Mehta, Stephanie Samra, Tomáš Čajka, Benjamin Wancewicz, Johannes F. Fahrmann, and Oliver Fiehn. 2017. “Mass Spectral Feature List Optimizer (MS-FLO): A Tool To Minimize False Positive Peak Reports in Untargeted Liquid Chromatography–Mass Spectroscopy (LC-MS) Data Processing.” Analytical Chemistry 89 (6): 3250–55. https://doi.org/10.1021/acs.analchem.6b04372. Delabriere, Alexis, Philipp Warmer, Vincenth Brennsteiner, and Nicola Zamboni. 2021. “SLAW: A Scalable and Self-Optimizing Processing Workflow for Untargeted LC-MS.” Analytical Chemistry 93 (45): 15024–32. https://doi.org/10.1021/acs.analchem.1c02687. Dietrich, Christian, Arne Wick, and Thomas A. Ternes. 2022. “Open-Source Feature Detection for Non-Target LC–MS Analytics.” Rapid Communications in Mass Spectrometry 36 (2): e9206. https://doi.org/10.1002/rcm.9206. Ding, Xian, Fen Yang, Yanhua Chen, Jing Xu, Jiuming He, Ruiping Zhang, and Zeper Abliz. 2022. “Norm ISWSVR: A Data Integration and Normalization Approach for Large-Scale Metabolomics.” Analytical Chemistry 94 (21): 7500–7509. https://doi.org/10.1021/acs.analchem.1c05502. Djoumbou Feunang, Yannick, Roman Eisner, Craig Knox, Leonid Chepelev, Janna Hastings, Gareth Owen, Eoin Fahy, et al. 2016. 
“ClassyFire: Automated Chemical Classification with a Comprehensive, Computable Taxonomy.” Journal of Cheminformatics 8 (1): 61. https://doi.org/10.1186/s13321-016-0174-y. Dodds, James N., Lingjue Wang, Gary J. Patti, and Erin S. Baker. 2022. “Combining Isotopologue Workflows and Simultaneous Multidimensional Separations to Detect, Identify, and Validate Metabolites in Untargeted Analyses.” Analytical Chemistry 94 (5): 2527–35. https://doi.org/10.1021/acs.analchem.1c04430. Domingo-Almenara, Xavier, Jesus Brezmes, Maria Vinaixa, Sara Samino, Noelia Ramirez, Marta Ramon-Krauel, Carles Lerin, et al. 2016. “eRah: A Computational Tool Integrating Spectral Deconvolution and Alignment with Quantification and Identification of Metabolites in GC/MS-Based Metabolomics.” Analytical Chemistry 88 (19): 9821–29. https://doi.org/10.1021/acs.analchem.6b02927. Domingo-Almenara, Xavier, J. Rafael Montenegro-Burke, H. Paul Benton, and Gary Siuzdak. 2018. “Annotation: A Computational Solution for Streamlining Metabolomics Analysis.” Analytical Chemistry 90 (1): 480–89. https://doi.org/10.1021/acs.analchem.7b03929. Domingo-Almenara, Xavier, J. Rafael Montenegro-Burke, Julijana Ivanisevic, Aurelien Thomas, Jonathan Sidibé, Tony Teav, Carlos Guijas, et al. 2018. “XCMS-MRM and METLIN-MRM: A Cloud Library and Public Resource for Targeted Analysis of Small Molecules.” Nature Methods 15 (9): 681–84. https://doi.org/10.1038/s41592-018-0110-3. Domingo-Almenara, Xavier, and Gary Siuzdak. 2020. “Metabolomics Data Processing Using XCMS.” In Computational Methods and Data Analysis for Metabolomics, edited by Shuzhao Li, 11–24. Methods in Molecular Biology. New York, NY: Springer US. https://doi.org/10.1007/978-1-0716-0239-3_2. Doppler, Maria, Bernhard Kluger, Christoph Bueschl, Christina Schneider, Rudolf Krska, Sylvie Delcambre, Karsten Hiller, Marc Lemmens, and Rainer Schuhmacher. 2016. 
“Stable Isotope-Assisted Evaluation of Different Extraction Solvents for Untargeted Metabolomics of Plants.” International Journal of Molecular Sciences 17 (7). https://doi.org/10.3390/ijms17071017. Dos Santos, Emile Kelly Porto, and Gisele André Baptista Canuto. 2023. “Optimizing XCMS Parameters for GC-MS Metabolomics Data Processing: A Case Study.” Metabolomics: Official Journal of the Metabolomic Society 19 (4): 26. https://doi.org/10.1007/s11306-023-01992-1. Dryden, Michael D. M., Ryan Fobel, Christian Fobel, and Aaron R. Wheeler. 2017. “Upon the Shoulders of Giants: Open-Source Hardware and Software in Analytical Chemistry.” Analytical Chemistry 89 (8): 4330–38. https://doi.org/10.1021/acs.analchem.7b00485. Du, Xinsong, Juan J. Aristizabal-Henao, Timothy J. Garrett, Mathias Brochhausen, William R. Hogan, and Dominick J. Lemas. 2022. “A Checklist for Reproducible Computational Analysis in Clinical Metabolomics Research.” Metabolites 12 (1): 87. https://doi.org/10.3390/metabo12010087. Du, Xiuxia, and Steven H Zeisel. 2013. “Spectral Deconvolution for Gas Chromatography Mass Spectrometry-Based Metabolomics: Current Status and Future Perspectives.” Computational and Structural Biotechnology Journal 4 (5): 1–10. https://doi.org/10.5936/csbj.201301013. Dudzik, Danuta, Cecilia Barbas-Bernardos, Antonia García, and Coral Barbas. 2018. “Quality Assurance Procedures for Mass Spectrometry Untargeted Metabolomics. A Review.” Journal of Pharmaceutical and Biomedical Analysis, Review issue 2017, 147 (January): 149–73. https://doi.org/10.1016/j.jpba.2017.07.044. Dührkop, Kai, Markus Fleischauer, Marcus Ludwig, Alexander A. Aksenov, Alexey V. Melnik, Marvin Meusel, Pieter C. Dorrestein, Juho Rousu, and Sebastian Böcker. 2019. “SIRIUS 4: A Rapid Tool for Turning Tandem Mass Spectra into Metabolite Structure Information.” Nature Methods 16 (4): 299–302. https://doi.org/10.1038/s41592-019-0344-8. 
Dührkop, Kai, Louis-Félix Nothias, Markus Fleischauer, Raphael Reher, Marcus Ludwig, Martin A. Hoffmann, Daniel Petras, et al. 2020. “Systematic Classification of Unknown Metabolites Using High-Resolution Fragmentation Mass Spectra.” Nature Biotechnology, November, 1–10. https://doi.org/10.1038/s41587-020-0740-8. Dunn, Warwick B, Ian D Wilson, Andrew W Nicholls, and David Broadhurst. 2012. “The Importance of Experimental Design and QC Samples in Large-Scale and MS-driven Untargeted Metabolomic Studies of Humans.” Bioanalysis 4 (18): 2249–64. https://doi.org/10.4155/bio.12.204. Dyar, Kenneth A., Dominik Lutter, Anna Artati, Nicholas J. Ceglia, Yu Liu, Danny Armenta, Martin Jastroch, et al. 2018. “Atlas of Circadian Metabolism Reveals System-wide Coordination and Communication Between Clocks.” Cell 174 (6): 1571–1585.e11. https://doi.org/10.1016/j.cell.2018.08.042. Edmands, William M. B., Dinesh K. Barupal, and Augustin Scalbert. 2015. “MetMSLine: An Automated and Fully Integrated Pipeline for Rapid Processing of High-Resolution LC–MS Metabolomic Datasets.” Bioinformatics 31 (5): 788–90. https://doi.org/10.1093/bioinformatics/btu705. Edmands, William M. B., Josie Hayes, and Stephen M. Rappaport. 2018. “SimExTargId: A Comprehensive Package for Real-Time LC-MS Data Acquisition and Analysis.” Bioinformatics 34 (20): 3589–90. https://doi.org/10.1093/bioinformatics/bty218. Edmands, William M. B., Lauren Petrick, Dinesh K. Barupal, Augustin Scalbert, Mark J. Wilson, Jeffrey K. Wickliffe, and Stephen M. Rappaport. 2017. “compMS2Miner: An Automatable Metabolite Identification, Visualization, and Data-Sharing R Package for High-Resolution LC–MS Data Sets.” Analytical Chemistry 89 (7): 3919–28. https://doi.org/10.1021/acs.analchem.6b02394. Eilertz, Daniel, Michael Mitterer, and Joerg M. Buescher. 2022. “automRm: An R Package for Fully Automatic LC-QQQ-MS Data Preprocessing Powered by Machine Learning.” Analytical Chemistry 94 (16): 6163–71. 
https://doi.org/10.1021/acs.analchem.1c05224. El Abiead, Yasin, Maximilian Milford, Harald Schoeny, Mate Rusz, Reza M. Salek, and Gunda Koellensperger. 2022. “Power of mzRAPP-Based Performance Assessments in MS1-Based Nontargeted Feature Detection.” Analytical Chemistry 94 (24): 8588–95. https://doi.org/10.1021/acs.analchem.1c05270. Engler Hart, Chloe, Tobias Kind, Pieter C. Dorrestein, David Healey, and Daniel Domingo-Fernández. 2024. “Weighting Low-Intensity MS/MS Ions and m/z Frequency for Spectral Library Annotation.” Journal of the American Society for Mass Spectrometry 35 (2): 266–74. https://doi.org/10.1021/jasms.3c00353. Fenaille, François, Pierre Barbier Saint-Hilaire, Kathleen Rousseau, and Christophe Junot. 2017. “Data Acquisition Workflows in Liquid Chromatography Coupled to High Resolution Mass Spectrometry-Based Metabolomics: Where Do We Stand?” Journal of Chromatography A 1526 (Supplement C): 1–12. https://doi.org/10.1016/j.chroma.2017.10.043. Fernández-Albert, Francesc, Rafael Llorach, Cristina Andrés-Lacueva, and Alexandre Perera. 2014. “An R Package to Analyse LC/MS Metabolomic Data: MAIT (Metabolite Automatic Identification Toolkit).” Bioinformatics 30 (13): 1937–39. https://doi.org/10.1093/bioinformatics/btu136. Fessenden, Marissa. 2016. “Metabolomics: Small Molecules, Single Cells.” Nature 540 (7631): 153–55. https://doi.org/10.1038/540153a. Fiehn, Oliver. 2002. “Metabolomics – the Link Between Genotypes and Phenotypes.” Plant Molecular Biology 48 (1): 155–71. https://doi.org/10.1023/A:1013713905833. Flasch, Mira, Veronika Fitz, Evelyn Rampler, Chibundu N. Ezekiel, Gunda Koellensperger, and Benedikt Warth. 2022. “Integrated Exposomics/Metabolomics for Rapid Exposure and Effect Analyses.” JACS Au 2 (11): 2548–60. https://doi.org/10.1021/jacsau.2c00433. Forsberg, Erica M., Tao Huan, Duane Rinehart, H. Paul Benton, Benedikt Warth, Brian Hilmers, and Gary Siuzdak. 2018. 
“Data Processing, Multi-Omic Pathway Mapping, and Metabolite Activity Analysis Using XCMS Online.” Nature Protocols 13 (4): 633–51. https://doi.org/10.1038/nprot.2017.151. Franceschi, Pietro, Domenico Masuero, Urska Vrhovsek, Fulvio Mattivi, and Ron Wehrens. 2012. “A Benchmark Spike-in Data Set for Biomarker Identification in Metabolomics.” Journal of Chemometrics 26 (1-2): 16–24. https://doi.org/10.1002/cem.1420. Fu, Hai-Yan, Ou Hu, Yue-Ming Zhang, Li Zhang, Jing-Jing Song, Peang Lu, Qing-Xia Zheng, et al. 2017. “Mass-Spectra-Based Peak Alignment for Automatic Nontargeted Metabolic Profiling Analysis for Biomarker Screening in Plant Samples.” Journal of Chromatography A 1513 (Supplement C): 201–9. https://doi.org/10.1016/j.chroma.2017.07.044. Fu, Jianbo, Ying Zhang, Yunxia Wang, Hongning Zhang, Jin Liu, Jing Tang, Qingxia Yang, et al. 2021. “Optimization of Metabolomic Data Processing Using NOREVA.” Nature Protocols, December, 1–23. https://doi.org/10.1038/s41596-021-00636-9. Gadara, Darshak, Katerina Coufalikova, Juraj Bosak, David Smajs, and Zdenek Spacil. 2021. “Systematic Feature Filtering in Exploratory Metabolomics: Application Toward Biomarker Discovery.” Analytical Chemistry 93 (26): 9103–10. https://doi.org/10.1021/acs.analchem.1c00816. Gerlich, Michael, and Steffen Neumann. 2013. “MetFusion: Integration of Compound Identification Strategies.” Journal of Mass Spectrometry 48 (3): 291–98. https://doi.org/10.1002/jms.3123. Ghaste, Manoj, Robert Mistrik, and Vladimir Shulaev. 2016. “Applications of Fourier Transform Ion Cyclotron Resonance (FT-ICR) and Orbitrap Based High Resolution Mass Spectrometry in Metabolomics and Lipidomics.” International Journal of Molecular Sciences 17 (6). https://doi.org/10.3390/ijms17060816. Ghosson, Hikmat, Yann Guitton, Amani Ben Jrad, Chandrashekhar Patil, Delphine Raviglione, Marie-Virginie Salvia, and Cédric Bertrand. 2021. 
“Electrospray Ionization and Heterogeneous Matrix Effects in Liquid Chromatography/Mass Spectrometry Based Meta-Metabolomics: A Biomarker or a Suppressed Ion?” Rapid Communications in Mass Spectrometry 35 (2): e8977. https://doi.org/10.1002/rcm.8977. Giacomoni, Franck, Gildas Le Corguillé, Misharl Monsoor, Marion Landi, Pierre Pericard, Mélanie Pétéra, Christophe Duperier, et al. 2015. “Workflow4Metabolomics: A Collaborative Research Infrastructure for Computational Metabolomics.” Bioinformatics 31 (9): 1493–95. https://doi.org/10.1093/bioinformatics/btu813. Giebelhaus, Ryland T., Michael D. Sorochan Armstrong, A. Paulina de la Mata, and James J. Harynuk. 2022. “Untargeted Region of Interest Selection for Gas Chromatography – Mass Spectrometry Data Using a Pseudo F-ratio Moving Window.” Journal of Chromatography A 1682 (October): 463499. https://doi.org/10.1016/j.chroma.2022.463499. Gika, Helen G., Georgios A. Theodoridis, Robert S. Plumb, and Ian D. Wilson. 2014. “Current Practice of Liquid Chromatography–Mass Spectrometry in Metabolomics and Metabonomics.” Journal of Pharmaceutical and Biomedical Analysis, Review Papers on Pharmaceutical and Biomedical Analysis 2013, 87 (January): 12–25. https://doi.org/10.1016/j.jpba.2013.06.032. Gil, Andres, David Siegel, Hjalmar Permentier, Dirk-Jan Reijngoud, Frank Dekker, and Rainer Bischoff. 2015. “Stability of Energy Metabolites—An Often Overlooked Issue in Metabolomics Studies: A Review.” ELECTROPHORESIS 36 (18): 2156–69. https://doi.org/10.1002/elps.201500031. Giné, Roger, Jordi Capellades, Josep M. Badia, Dennis Vughs, Michaela Schwaiger-Haber, Theodore Alexandrov, Maria Vinaixa, Andrea M. Brunner, Gary J. Patti, and Oscar Yanes. 2021. “HERMES: A Molecular-Formula-Oriented Method to Target the Metabolome.” Nature Methods 18 (11): 1370–76. https://doi.org/10.1038/s41592-021-01307-z. Gloaguen, Yoann, Jennifer A. Kirwan, and Dieter Beule. 2022. 
“Deep Learning-Assisted Peak Curation for Large-Scale LC-MS Metabolomics.” Analytical Chemistry 94 (12): 4930–37. https://doi.org/10.1021/acs.analchem.1c02220. Goldansaz, Seyed Ali, An Chi Guo, Tanvir Sajed, Michael A. Steele, Graham S. Plastow, and David S. Wishart. 2017. “Livestock Metabolomics and the Livestock Metabolome: A Systematic Review.” PLOS ONE 12 (5): e0177675. https://doi.org/10.1371/journal.pone.0177675. González, Oskar, Anne-Charlotte Dubbelman, and Thomas Hankemeier. 2022. “Postcolumn Infusion as a Quality Control Tool for LC-MS-Based Analysis.” Journal of the American Society for Mass Spectrometry, April. https://doi.org/10.1021/jasms.2c00022. González-Domínguez, Álvaro, Núria Estanyol-Torres, Carl Brunius, Rikard Landberg, and Raúl González-Domínguez. 2024. “QComics: Recommendations and Guidelines for Robust, Easily Implementable and Reportable Quality Control of Metabolomics Data.” Analytical Chemistry 96 (3): 1064–72. https://doi.org/10.1021/acs.analchem.3c03660. González-Riano, Carolina, Danuta Dudzik, Antonia Garcia, Alberto Gil-de-la-Fuente, Ana Gradillas, Joanna Godzien, Ángeles López-Gonzálvez, et al. 2020. “Recent Developments Along the Analytical Process for Metabolomics Workflows.” Analytical Chemistry 92 (1): 203–26. https://doi.org/10.1021/acs.analchem.9b04553. Goracci, Laura, Paolo Tiberi, Stefano Di Bona, Stefano Bonciarelli, Giovanna Ilaria Passeri, Marta Piroddi, Simone Moretti, Claudia Volpi, Ismael Zamora, and Gabriele Cruciani. 2024. “MARS: A Multipurpose Software for Untargeted LC–MS-Based Metabolomics and Exposomics.” Analytical Chemistry, January. https://doi.org/10.1021/acs.analchem.3c03620. Graça, Gonçalo, Yuheng Cai, Chung-Ho E. Lau, Panagiotis A. Vorkas, Matthew R. Lewis, Elizabeth J. Want, David Herrington, and Timothy M. D. Ebbels. 2022. “Automated Annotation of Untargeted All-Ion Fragmentation LC–MS Metabolomics Data with MetaboAnnotatoR.” Analytical Chemistry 94 (8): 3446–55. 
https://doi.org/10.1021/acs.analchem.1c03032. Griffiths, William J., Therese Koal, Yuqin Wang, Matthias Kohl, David P. Enot, and Hans-Peter Deigner. 2010. “Targeted Metabolomics for Biomarker Discovery.” Angewandte Chemie International Edition 49 (32): 5426–45. https://doi.org/10.1002/anie.200905579. Gromski, Piotr S., Howbeer Muhamadali, David I. Ellis, Yun Xu, Elon Correa, Michael L. Turner, and Royston Goodacre. 2015. “A Tutorial Review: Metabolomics and Partial Least Squares-Discriminant Analysis – a Marriage of Convenience or a Shotgun Wedding.” Analytica Chimica Acta 879 (June): 10–23. https://doi.org/10.1016/j.aca.2015.02.012. Groves, Ryan A., Carly C. Y. Chan, Spencer D. Wildman, Daniel B. Gregson, Thomas Rydzak, and Ian A. Lewis. 2023. “Rapid LC–MS Assay for Targeted Metabolite Quantification by Serial Injection into Isocratic Gradients.” Analytical and Bioanalytical Chemistry 415 (2): 269–76. https://doi.org/10.1007/s00216-022-04384-x. Gugisch, Ralf, Adalbert Kerber, Axel Kohnert, Reinhard Laue, Markus Meringer, Christoph Rücker, and Alfred Wassermann. 2015. “Chapter 6 - MOLGEN 5.0, A Molecular Structure Generator.” In Advances in Mathematical Chemistry and Applications, edited by Subhash C. Basak, Guillermo Restrepo, and José L. Villaveces, 113–38. Bentham Science Publishers. https://doi.org/10.1016/B978-1-68108-198-4.50006-0. Guha, Rajarshi. 2007. “Chemical Informatics Functionality in R.” Journal of Statistical Software 18 (1): 1–16. https://doi.org/10.18637/jss.v018.i05. Guijas, Carlos, J. Rafael Montenegro-Burke, Xavier Domingo-Almenara, Amelia Palermo, Benedikt Warth, Gerrit Hermann, Gunda Koellensperger, et al. 2018. “METLIN: A Technology Platform for Identifying Knowns and Unknowns.” Analytical Chemistry 90 (5): 3156–64. https://doi.org/10.1021/acs.analchem.7b04424. Guo, Hao, Kebing Xue, Haiming Sun, Weihao Jiang, and Shiliang Pu. 2023. “Contrastive Learning-Based Embedder for the Representation of Tandem Mass Spectra.” Analytical Chemistry, May. 
https://doi.org/10.1021/acs.analchem.3c00260. Guo, Jian, and Tao Huan. 2020. “Comparison of Full-Scan, Data-Dependent, and Data-Independent Acquisition Modes in Liquid Chromatography–Mass Spectrometry Based Untargeted Metabolomics.” Analytical Chemistry 92 (12): 8072–80. https://doi.org/10.1021/acs.analchem.9b05135. Guo, Jian, Sam Shen, and Tao Huan. 2022. “Paramounter: Direct Measurement of Universal Parameters To Process Metabolomics Data in a ‘White Box’.” Analytical Chemistry, March. https://doi.org/10.1021/acs.analchem.1c04758. Guo, Jian, Sam Shen, Shipei Xing, Huaxu Yu, and Tao Huan. 2021. “ISFrag: De Novo Recognition of In-Source Fragments for Liquid Chromatography–Mass Spectrometry Data.” Analytical Chemistry, July. https://doi.org/10.1021/acs.analchem.1c01644. Habra, Hani, Maureen Kachman, Kevin Bullock, Clary Clish, Charles R. Evans, and Alla Karnovsky. 2021. “metabCombiner: Paired Untargeted LC-HRMS Metabolomics Feature Matching and Concatenation of Disparately Acquired Data Sets.” Analytical Chemistry 93 (12): 5028–36. https://doi.org/10.1021/acs.analchem.0c03693. Hansen, Rebecca L., and Young Jin Lee. 2018. “High-Spatial Resolution Mass Spectrometry Imaging: Toward Single Cell Metabolomics in Plant Tissues.” The Chemical Record 18 (1): 65–77. https://doi.org/10.1002/tcr.201700027. Hao, Jun-Di, Yao-Yu Chen, Yan-Zhen Wang, Na An, Pei-Rong Bai, Quan-Fei Zhu, and Yu-Qi Feng. 2023. “Novel Peak Shift Correction Method Based on the Retention Index for Peak Alignment in Untargeted Metabolomics.” Analytical Chemistry 95 (35): 13330–37. https://doi.org/10.1021/acs.analchem.3c02583. Harrieder, Eva-Maria, Fleming Kretschmer, Sebastian Böcker, and Michael Witting. 2022. “Current State-of-the-Art of Separation Methods Used in LC-MS Based Metabolomics and Lipidomics.” Journal of Chromatography B 1188 (January): 123069. https://doi.org/10.1016/j.jchromb.2021.123069. Harwood, Thomas V., Daniel G. C. Treen, Mingxun Wang, Wibe de Jong, Trent R. Northen, and Benjamin P. 
Bowen. 2023. “BLINK Enables Ultrafast Tandem Mass Spectrometry Cosine Similarity Scoring.” Scientific Reports 13 (1): 13462. https://doi.org/10.1038/s41598-023-40496-9. Haug, Kenneth, Reza M Salek, and Christoph Steinbeck. 2017. “Global Open Data Management in Metabolomics.” Current Opinion in Chemical Biology, Omics, 36 (February): 58–63. https://doi.org/10.1016/j.cbpa.2016.12.024. Helmus, Rick, Thomas L. ter Laak, Annemarie P. van Wezel, Pim de Voogt, and Emma L. Schymanski. 2021. “patRoon: Open Source Software Platform for Environmental Mass Spectrometry Based Non-Target Screening.” Journal of Cheminformatics 13 (1): 1. https://doi.org/10.1186/s13321-020-00477-w. Hernandes, Vinicius Veri, Coral Barbas, and Danuta Dudzik. 2017. “A Review of Blood Sample Handling and Pre-Processing for Metabolomics Studies.” ELECTROPHORESIS 38 (18): 2232–41. https://doi.org/10.1002/elps.201700086. Hiller, Karsten, Jasper Hangebrauk, Christian Jäger, Jana Spura, Kerstin Schreiber, and Dietmar Schomburg. 2009. “MetaboliteDetector: Comprehensive Analysis Tool for Targeted and Nontargeted GC/MS Based Metabolome Analysis.” Analytical Chemistry 81 (9): 3429–39. https://doi.org/10.1021/ac802689c. Hites, Ronald A. 2019. “Correcting for Censored Environmental Measurements.” Environmental Science &amp; Technology, September. https://doi.org/10.1021/acs.est.9b05042. Hites, Ronald A., and Karl J. Jobst. 2018. “Is Nontargeted Screening Reproducible?” Environmental Science &amp; Technology 52 (21): 11975–76. https://doi.org/10.1021/acs.est.8b05671. Houriet, Joelle, Warren S. Vidar, Preston K. Manwill, Daniel A. Todd, and Nadja B. Cech. 2022. “How Low Can You Go? Selecting Intensity Thresholds for Untargeted Metabolomics Data Preprocessing.” Analytical Chemistry 94 (51): 17964–71. https://doi.org/10.1021/acs.analchem.2c04088. Hu, Xin, Douglas I. Walker, Yongliang Liang, Matthew Ryan Smith, Michael L. Orr, Brian D. Juran, Chunyu Ma, et al. 2021. 
“A Scalable Workflow to Characterize the Human Exposome.” Nature Communications 12 (1): 5575. https://doi.org/10.1038/s41467-021-25840-9. Hu, Yaxi, Betty Cai, and Tao Huan. 2019. “Enhancing Metabolome Coverage in Data-Dependent LC–MS/MS Analysis Through an Integrated Feature Extraction Strategy.” Analytical Chemistry 91 (22): 14433–41. https://doi.org/10.1021/acs.analchem.9b02980. Huan, Tao, Erica M. Forsberg, Duane Rinehart, Caroline H. Johnson, Julijana Ivanisevic, H. Paul Benton, Mingliang Fang, et al. 2017. “Systems Biology Guided by XCMS Online Metabolomics.” Nature Methods 14 (5): 461–62. https://doi.org/10.1038/nmeth.4260. Huang, Danning, Marcos Bouza, David A. Gaul, Franklin E. Leach, I. Jonathan Amster, Frank C. Schroeder, Arthur S. Edison, and Facundo M. Fernández. 2021. “Comparison of High-Resolution Fourier Transform Mass Spectrometry Platforms for Putative Metabolite Annotation.” Analytical Chemistry, August. https://doi.org/10.1021/acs.analchem.1c02224. Huber, Florian, Stefan Verhoeven, Christiaan Meijer, Hanno Spreeuw, Efraín Manuel Villanueva Castilla, Cunliang Geng, Justin J. J. van der Hooft, et al. 2020. “Matchms - Processing and Similarity Evaluation of Mass Spectrometry Data.” Journal of Open Source Software 5 (52): 2411. https://doi.org/10.21105/joss.02411. Hufsky, Franziska, Kerstin Scheubert, and Sebastian Böcker. 2014. “Computational Mass Spectrometry for Small-Molecule Fragmentation.” TrAC Trends in Analytical Chemistry 53 (January): 41–48. https://doi.org/10.1016/j.trac.2013.09.008. Ibáñez, Clara, Lamia Mouhid, Guillermo Reglero, and Ana Ramírez de Molina. 2017. “Lipidomics Insights in Health and Nutritional Intervention Studies.” Journal of Agricultural and Food Chemistry 65 (36): 7827–42. https://doi.org/10.1021/acs.jafc.7b02643. Jacyna, Julia, Marta Kordalewska, and Michał J. Markuszewski. 2019. 
“Design of Experiments in Metabolomics-Related Studies: An Overview.” Journal of Pharmaceutical and Biomedical Analysis 164 (February): 598–606. https://doi.org/10.1016/j.jpba.2018.11.027. Jaeger, Carsten, Friederike Hoffmann, Clemens A. Schmitt, and Jan Lisec. 2016. “Automated Annotation and Evaluation of In-Source Mass Spectra in GC/Atmospheric Pressure Chemical Ionization-MS-Based Metabolomics.” Analytical Chemistry 88 (19): 9386–90. https://doi.org/10.1021/acs.analchem.6b02743. Jalili, Vahid, Enis Afgan, Qiang Gu, Dave Clements, Daniel Blankenberg, Jeremy Goecks, James Taylor, and Anton Nekrutenko. 2020. “The Galaxy Platform for Accessible, Reproducible and Collaborative Biomedical Analyses: 2020 Update.” Nucleic Acids Research 48 (W1): W395–402. https://doi.org/10.1093/nar/gkaa434. Jang, Cholsoon, Li Chen, and Joshua D. Rabinowitz. 2018. “Metabolomics and Isotope Tracing.” Cell 173 (4): 822–37. https://doi.org/10.1016/j.cell.2018.03.055. Jones, Dean P., Youngja Park, and Thomas R. Ziegler. 2012. “Nutritional Metabolomics: Progress in Addressing Complexity in Diet and Health.” Annual Review of Nutrition 32 (1): 183–202. https://doi.org/10.1146/annurev-nutr-072610-145159. Jorge, Tiago F., Ana T. Mata, and Carla António. 2016. “Mass Spectrometry as a Quantitative Tool in Plant Metabolomics.” Phil. Trans. R. Soc. A 374 (2079): 20150370. https://doi.org/10.1098/rsta.2015.0370. Jr, Stephen Salerno, Mahya Mehrmohamadi, Maria V. Liberti, Muting Wan, Martin T. Wells, James G. Booth, and Jason W. Locasale. 2017. “RRmix: A Method for Simultaneous Batch Effect Correction and Analysis of Metabolomics Data in the Absence of Internal Standards.” PLOS ONE 12 (6): e0179530. https://doi.org/10.1371/journal.pone.0179530. Ju, Ran, Xinyu Liu, Fujian Zheng, Xinjie Zhao, Xin Lu, Xiaohui Lin, Zhongda Zeng, and Guowang Xu. 2020. 
“A Graph Density-Based Strategy for Features Fusion from Different Peak Extract Software to Achieve More Metabolites in Metabolic Profiling from High-Resolution Mass Spectrometry.” Analytica Chimica Acta 1139 (December): 8–14. https://doi.org/10.1016/j.aca.2020.09.029. Kachman, Maureen, Hani Habra, William Duren, Janis Wigginton, Peter Sajjakulnukit, George Michailidis, Charles Burant, and Alla Karnovsky. 2020. “Deep Annotation of Untargeted LC-MS Metabolomics Data with Binner.” Bioinformatics 36 (6): 1801–6. https://doi.org/10.1093/bioinformatics/btz798. Kapoore, Rahul Vijay, and Seetharaman Vaidyanathan. 2016. “Towards Quantitative Mass Spectrometry-Based Metabolomics in Microbial and Mammalian Systems.” Phil. Trans. R. Soc. A 374 (2079): 20150363. https://doi.org/10.1098/rsta.2015.0363. Karpievitch, Yuliya V., Sonja B. Nikolic, Richard Wilson, James E. Sharman, and Lindsay M. Edwards. 2014. “Metabolomics Data Normalization with EigenMS.” PLOS ONE 9 (12): e116221. https://doi.org/10.1371/journal.pone.0116221. Kennedy, Adam D., Bryan M. Wittmann, Anne M. Evans, Luke A. D. Miller, Douglas R. Toal, Shaun Lonergan, Sarah H. Elsea, and Kirk L. Pappan. 2018. “Metabolomics in the Clinic: A Review of the Shared and Unique Features of Untargeted Metabolomics for Clinical Research and Clinical Testing.” Journal of Mass Spectrometry 53 (11): 1143–54. https://doi.org/10.1002/jms.4292. Keshet, Uri, Tobias Kind, Xinchen Lu, Sarita Devi, and Oliver Fiehn. 2022. “Acyl-CoA Identification in Mouse Liver Samples Using the In Silico CoA-Blast Tandem Mass Spectral Library.” Analytical Chemistry 94 (6): 2732–39. https://doi.org/10.1021/acs.analchem.1c03272. Kew, William, John W. T. Blackburn, David J. Clarke, and Dušan Uhrín. 2017. “Interactive van Krevelen Diagrams – Advanced Visualisation of Mass Spectrometry Data of Complex Mixtures.” Rapid Communications in Mass Spectrometry 31 (7): 658–62. https://doi.org/10.1002/rcm.7823. 
Kim, Jungyeon, Joong Kyong Ahn, Yu Eun Cheong, Sung-Joon Lee, Hoon-Suk Cha, and Kyoung Heon Kim. 2020. “Systematic Re-Evaluation of the Long-Used Standard Protocol of Urease-Dependent Metabolome Sample Preparation.” PLOS ONE 15 (3): e0230072. https://doi.org/10.1371/journal.pone.0230072. Kim, Taiyun, Owen Tang, Stephen T. Vernon, Katharine A. Kott, Yen Chin Koay, John Park, David E. James, et al. 2021. “A Hierarchical Approach to Removal of Unwanted Variation for Large-Scale Metabolomics Data.” Nature Communications 12 (1): 4992. https://doi.org/10.1038/s41467-021-25210-5. Kind, Tobias, and Oliver Fiehn. 2007. “Seven Golden Rules for Heuristic Filtering of Molecular Formulas Obtained by Accurate Mass Spectrometry.” BMC Bioinformatics 8 (1): 105. https://doi.org/10.1186/1471-2105-8-105. Kind, Tobias, Hiroshi Tsugawa, Tomas Cajka, Yan Ma, Zijuan Lai, Sajjan S. Mehta, Gert Wohlgemuth, et al. 2018. “Identification of Small Molecules Using Accurate Mass MS/MS Search.” Mass Spectrometry Reviews 37 (4): 513–32. https://doi.org/10.1002/mas.21535. Koelmel, Jeremy P., Nicholas M. Kroeger, Candice Z. Ulmer, John A. Bowden, Rainey E. Patterson, Jason A. Cochran, Christopher W. W. Beecher, Timothy J. Garrett, and Richard A. Yost. 2017. “LipidMatch: An Automated Workflow for Rule-Based Lipid Identification Using Untargeted High-Resolution Tandem Mass Spectrometry Data.” BMC Bioinformatics 18 (July): 331. https://doi.org/10.1186/s12859-017-1744-3. Kong, Fanzhou, Uri Keshet, Tong Shen, Elys Rodriguez, and Oliver Fiehn. 2023. “LibGen: Generating High Quality Spectral Libraries of Natural Products for EAD-, UVPD-, and HCD-High Resolution Mass Spectrometers.” Analytical Chemistry, November. https://doi.org/10.1021/acs.analchem.3c02263. Kouřil, Štěpán, Julie de Sousa, Jan Václavík, David Friedecký, and Tomáš Adam. 2020. “CROP: Correlation-Based Reduction of Feature Multiplicities in Untargeted Metabolomic Data.” Bioinformatics 36 (9): 2941–42. 
https://doi.org/10.1093/bioinformatics/btaa012. Kuhl, Carsten, Ralf Tautenhahn, Christoph Böttcher, Tony R. Larson, and Steffen Neumann. 2012. “CAMERA: An Integrated Strategy for Compound Spectra Extraction and Annotation of Liquid Chromatography/Mass Spectrometry Data Sets.” Analytical Chemistry 84 (1): 283–89. https://doi.org/10.1021/ac202450g. Kuligowski, Julia, Ángel Sánchez-Illana, Daniel Sanjuán-Herráez, Máximo Vento, and Guillermo Quintás. 2015. “Intra-Batch Effect Correction in Liquid Chromatography-Mass Spectrometry Using Quality Control Samples and Support Vector Regression (QC-SVRC).” Analyst 140 (22): 7810–17. https://doi.org/10.1039/C5AN01638J. Kusonmano, Kanthida, Wanwipa Vongsangnak, and Pramote Chumnanpuen. 2016. “Informatics for Metabolomics.” In Translational Biomedical Informatics, 91–115. Advances in Experimental Medicine and Biology. Springer, Singapore. https://doi.org/10.1007/978-981-10-1503-8_5. Lai, Zijuan, Hiroshi Tsugawa, Gert Wohlgemuth, Sajjan Mehta, Matthew Mueller, Yuxuan Zheng, Atsushi Ogiwara, et al. 2018. “Identifying Metabolites by Integrating Metabolome Databases with Mass Spectrometry Cheminformatics.” Nature Methods 15 (1): 53–56. https://doi.org/10.1038/nmeth.4512. Lakhani, Chirag M., Braden T. Tierney, Arjun K. Manrai, Jian Yang, Peter M. Visscher, and Chirag J. Patel. 2019. “Repurposing Large Health Insurance Claims Data to Estimate Genetic and Environmental Contributions in 560 Phenotypes.” Nature Genetics 51 (2): 327–34. https://doi.org/10.1038/s41588-018-0313-7. Laparre, Jérôme, Zied Kaabia, Mark Mooney, Tom Buckley, Mark Sherry, Bruno Le Bizec, and Gaud Dervilly-Pinel. 2017. “Impact of Storage Conditions on the Urinary Metabolomics Fingerprint.” Analytica Chimica Acta 951 (January): 99–107. https://doi.org/10.1016/j.aca.2016.11.055. Larralde, Martin, Thomas N. Lawson, Ralf J. M. Weber, Pablo Moreno, Kenneth Haug, Philippe Rocca-Serra, Mark R. Viant, Christoph Steinbeck, and Reza M. Salek. 2017. 
“mzML2ISA &amp; nmrML2ISA: Generating Enriched ISA-Tab Metadata Files from Metabolomics XML Data.” Bioinformatics 33 (16): 2598–2600. https://doi.org/10.1093/bioinformatics/btx169. Lassen, Johan, Kirstine Lykke Nielsen, Mogens Johannsen, and Palle Villesen. 2021. “Assessment of XCMS Optimization Methods with Machine-Learning Performance.” Analytical Chemistry 93 (40): 13459–66. https://doi.org/10.1021/acs.analchem.1c02000. Lawson, Thomas N., Ralf J. M. Weber, Martin R. Jones, Andrew J. Chetwynd, Giovanny Rodríguez-Blanco, Riccardo Di Guida, Mark R. Viant, and Warwick B. Dunn. 2017. “msPurity: Automated Evaluation of Precursor Ion Purity for Mass Spectrometry-Based Fragmentation in Metabolomics.” Analytical Chemistry 89 (4): 2432–39. https://doi.org/10.1021/acs.analchem.6b04358. Lê Cao, Kim-Anh, Simon Boitard, and Philippe Besse. 2011. “Sparse PLS Discriminant Analysis: Biologically Relevant Feature Selection and Graphical Displays for Multiclass Problems.” BMC Bioinformatics 12 (June): 253. https://doi.org/10.1186/1471-2105-12-253. Ledesma-Escobar, Carlos Augusto, Feliciano Priego-Capote, and Mónica Calderón-Santiago. 2023. “MetaboMSDIA: A Tool for Implementing Data-Independent Acquisition in Metabolomic-Based Mass Spectrometry Analysis.” Analytica Chimica Acta 1266 (July): 341308. https://doi.org/10.1016/j.aca.2023.341308. Leek, Jeffrey T., W. Evan Johnson, Hilary S. Parker, Andrew E. Jaffe, and John D. Storey. 2012. “The sva Package for Removing Batch Effects and Other Unwanted Variation in High-Throughput Experiments.” Bioinformatics 28 (6): 882–83. https://doi.org/10.1093/bioinformatics/bts034. Leek, Jeffrey T., and John D. Storey. 2007. “Capturing Heterogeneity in Gene Expression Studies by Surrogate Variable Analysis.” PLOS Genetics 3 (9): e161. https://doi.org/10.1371/journal.pgen.0030161. ———. 2008. “A General Framework for Multiple Testing Dependence.” Proceedings of the National Academy of Sciences 105 (48): 18718–23. 
https://doi.org/10.1073/pnas.0808709105. Levy, Allison J., Nicholas R. Oranzi, Atiye Ahmadireskety, Robin H. J. Kemperman, Michael S. Wei, and Richard A. Yost. 2019. “Recent Progress in Metabolomics Using Ion Mobility-Mass Spectrometry.” TrAC Trends in Analytical Chemistry 116 (July): 274–81. https://doi.org/10.1016/j.trac.2019.05.001. Li, Bo, Jing Tang, Qingxia Yang, Shuang Li, Xuejiao Cui, Yinghong Li, Yuzong Chen, Weiwei Xue, Xiaofeng Li, and Feng Zhu. 2017. “NOREVA: Normalization and Evaluation of MS-based Metabolomics Data.” Nucleic Acids Research 45 (W1): W162–70. https://doi.org/10.1093/nar/gkx449. Li, Hao, Yuping Cai, Yuan Guo, Fangfang Chen, and Zheng-Jiang Zhu. 2016. “MetDIA: Targeted Metabolite Extraction of Multiplexed MS/MS Spectra Generated by Data-Independent Acquisition.” Analytical Chemistry 88 (17): 8757–64. https://doi.org/10.1021/acs.analchem.6b02122. Li, Liang, Ronghong Li, Jianjun Zhou, Azeret Zuniga, Avalyn E. Stanislaus, Yiman Wu, Tao Huan, et al. 2013. “MyCompoundID: Using an Evidence-Based Metabolome Library for Metabolite Identification.” Analytical Chemistry 85 (6): 3401–8. https://doi.org/10.1021/ac400099b. Li, Lili, Weijie Ren, Hongwei Kong, Chunxia Zhao, Xinjie Zhao, Xiaohui Lin, Xin Lu, and Guowang Xu. 2017. “An Alignment Algorithm for LC-MS-based Metabolomics Dataset Assisted by MS/MS Information.” Analytica Chimica Acta 990 (October): 96–102. https://doi.org/10.1016/j.aca.2017.07.058. Li, Shuzhao. 2020. Computational Methods and Data Analysis for Metabolomics. Springer. Li, Shuzhao, Youngja Park, Sai Duraisingham, Frederick H. Strobel, Nooruddin Khan, Quinlyn A. Soltow, Dean P. Jones, and Bali Pulendran. 2013. “Predicting Network Activity from High Throughput Metabolomics.” PLOS Computational Biology 9 (7): e1003123. https://doi.org/10.1371/journal.pcbi.1003123. Li, Shuzhao, Amnah Siddiqa, Maheshwor Thapa, Yuanye Chi, and Shujian Zheng. 2023. 
“Trackable and Scalable LC-MS Metabolomics Data Processing Using Asari.” Nature Communications 14 (1): 4113. https://doi.org/10.1038/s41467-023-39889-1. Li, Yuanyue, and Oliver Fiehn. 2023. “Flash Entropy Search to Query All Mass Spectral Libraries in Real Time.” Nature Methods 20 (10): 1475–78. https://doi.org/10.1038/s41592-023-02012-9. Li, Yuanyue, Tobias Kind, Jacob Folz, Arpana Vaniya, Sajjan Singh Mehta, and Oliver Fiehn. 2021. “Spectral Entropy Outperforms MS/MS Dot Product Similarity for Small-Molecule Compound Identification.” Nature Methods 18 (12): 1524–31. https://doi.org/10.1038/s41592-021-01331-z. Li, Zhucui, Yan Lu, Yufeng Guo, Haijie Cao, Qinhong Wang, and Wenqing Shui. 2018. “Comprehensive Evaluation of Untargeted Metabolomics Data Processing Software in Feature Detection, Quantification and Discriminating Marker Selection.” Analytica Chimica Acta 1029 (October): 50–57. https://doi.org/10.1016/j.aca.2018.05.001. Liao, Jingyu, Yuhao Zhang, Wendan Zhang, Yuanyuan Zeng, Jing Zhao, Jingfang Zhang, Tingting Yao, et al. 2023. “Different Software Processing Affects the Peak Picking and Metabolic Pathway Recognition of Metabolomics Data.” Journal of Chromatography A 1687 (January): 463700. https://doi.org/10.1016/j.chroma.2022.463700. Libiseller, Gunnar, Michaela Dvorzak, Ulrike Kleb, Edgar Gander, Tobias Eisenberg, Frank Madeo, Steffen Neumann, et al. 2015. “IPO: A Tool for Automated Optimization of XCMS Parameters.” BMC Bioinformatics 16 (April): 118. https://doi.org/10.1186/s12859-015-0562-8. Lieng, Brandon Y., Andrew T. Quaile, Xavier Domingo-Almenara, Hannes L. Röst, and J. Rafael Montenegro-Burke. 2023. “Computational Expansion of High-Resolution-MSn Spectral Libraries.” Analytical Chemistry, November. https://doi.org/10.1021/acs.analchem.3c03343. Lin, Ching Yu, Huifeng Wu, Ronald S. Tjeerdema, and Mark R. Viant. 2007. “Evaluation of Metabolite Extraction Strategies from Tissue Samples Using NMR Metabolomics.” Metabolomics 3 (1): 55–67. 
https://doi.org/10.1007/s11306-006-0043-1. Lisec, Jan, Friederike Hoffmann, Clemens Schmitt, and Carsten Jaeger. 2016. “Extending the Dynamic Range in Metabolomics Experiments by Automatic Correction of Peaks Exceeding the Detection Limit.” Analytical Chemistry 88 (15): 7487–92. https://doi.org/10.1021/acs.analchem.6b02515. Lisitsyna, Anna, Franco Moritz, Youzhong Liu, Loubna Al Sadat, Hans Hauner, Melina Claussnitzer, Philippe Schmitt-Kopplin, and Sara Forcisi. 2022. “Feature Selection Pipelines with Classification for Non-targeted Metabolomics Combining the Neural Network and Genetic Algorithm.” Analytical Chemistry 94 (14): 5474–82. https://doi.org/10.1021/acs.analchem.1c03237. Liu, Qin, Douglas Walker, Karan Uppal, Zihe Liu, Chunyu Ma, ViLinh Tran, Shuzhao Li, Dean P. Jones, and Tianwei Yu. 2020. “Addressing the Batch Effect Issue for LC/MS Metabolomics Data in Data Preprocessing.” Scientific Reports 10 (1): 13856. https://doi.org/10.1038/s41598-020-70850-0. Liu, Xinyu, Lina Zhou, Xianzhe Shi, and Guowang Xu. 2019. “New Advances in Analytical Methods for Mass Spectrometry-Based Large-Scale Metabolomics Study.” TrAC Trends in Analytical Chemistry 121 (December): 115665. https://doi.org/10.1016/j.trac.2019.115665. Liu, Youzhong, Yingjie Zhang, Tom Vennekens, Jennifer L. Lippens, Luc Duijsens, Danh Bui-Thi, Kris Laukens, and Thomas de Vijlder. 2023. “MeRgeION: A Multifunctional R Pipeline for Small Molecule LC-MS/MS Data Processing, Searching, and Organizing.” Analytical Chemistry 95 (22): 8433–42. https://doi.org/10.1021/acs.analchem.2c04343. Livera, Alysha M. De, Marko Sysi-Aho, Laurent Jacob, Johann A. Gagnon-Bartsch, Sandra Castillo, Julie A. Simpson, and Terence P. Speed. 2015. “Statistical Methods for Handling Unwanted Variation in Metabolomics Data.” Analytical Chemistry 87 (7): 3606–15. https://doi.org/10.1021/ac502439y. Loftfield, Erikka, Emily Vogtmann, Joshua N. Sampson, Steven C. Moore, Heidi Nelson, Rob Knight, Nicholas Chia, and Rashmi Sinha. 2016. 
“Comparison of Collection Methods for Fecal Samples for Discovery Metabolomics in Epidemiologic Studies.” Cancer Epidemiology and Prevention Biomarkers 25 (11): 1483–90. https://doi.org/10.1158/1055-9965.EPI-16-0409. Loos, Martin, and Heinz Singer. 2017. “Nontargeted Homologue Series Extraction from Hyphenated High Resolution Mass Spectrometry Data.” Journal of Cheminformatics 9 (February). https://doi.org/10.1186/s13321-017-0197-z. Lu, Wenyun, Bryson D. Bennett, and Joshua D. Rabinowitz. 2008. “Analytical Strategies for LC–MS-based Targeted Metabolomics.” Journal of Chromatography B, Hyphenated Techniques for Global Metabolite Profiling, 871 (2): 236–42. https://doi.org/10.1016/j.jchromb.2008.04.031. Lu, Wenyun, Xiaoyang Su, Matthias S. Klein, Ian A. Lewis, Oliver Fiehn, and Joshua D. Rabinowitz. 2017. “Metabolite Measurement: Pitfalls to Avoid and Practices to Follow.” Annual Review of Biochemistry 86 (1): 277–304. https://doi.org/10.1146/annurev-biochem-061516-044952. Lu, Xin, and Guowang Xu. 2008. “LC-MS Metabonomics Methodology in Biomarker Discovery.” In Biomarker Methods in Drug Discovery and Development, edited by Feng Wang, 291–315. Methods in Pharmacology and Toxicology™. Humana Press. https://doi.org/10.1007/978-1-59745-463-6_14. Ludwig, Marcus, Louis-Félix Nothias, Kai Dührkop, Irina Koester, Markus Fleischauer, Martin A. Hoffmann, Daniel Petras, et al. 2020. “Database-Independent Molecular Formula Annotation Using Gibbs Sampling Through ZODIAC.” Nature Machine Intelligence 2 (10): 629–41. https://doi.org/10.1038/s42256-020-00234-6. Luo, Xian, and Liang Li. 2017. “Metabolomics of Small Numbers of Cells: Metabolomic Profiling of 100, 1000, and 10000 Human Breast Cancer Cells.” Analytical Chemistry 89 (21): 11664–71. https://doi.org/10.1021/acs.analchem.7b03100. Lv, Wangjie, Zhongda Zeng, Yuqing Zhang, Qingqing Wang, Lichao Wang, Zhaoxuan Zhang, Xianzhe Shi, Xinjie Zhao, and Guowang Xu. 2022. 
“Comprehensive Metabolite Quantitative Assay Based on Alternate Metabolomics and Lipidomics Analyses.” Analytica Chimica Acta 1215 (July): 339979. https://doi.org/10.1016/j.aca.2022.339979. Ma, Yan, Tobias Kind, Dawei Yang, Carlos Leon, and Oliver Fiehn. 2014. “MS2Analyzer: A Software for Small Molecule Substructure Annotations from Accurate Tandem Mass Spectra.” Analytical Chemistry 86 (21): 10724–31. https://doi.org/10.1021/ac502818e. Madsen, Rasmus, Torbjörn Lundstedt, and Johan Trygg. 2010. “Chemometrics in Metabolomics—A Review in Human Disease Diagnosis.” Analytica Chimica Acta 659 (1): 23–33. https://doi.org/10.1016/j.aca.2009.11.042. Mahieu, Nathaniel G., and Gary J. Patti. 2017. “Systems-Level Annotation of a Metabolomics Data Set Reduces 25 000 Features to Fewer Than 1000 Unique Metabolites.” Analytical Chemistry 89 (19): 10397–406. https://doi.org/10.1021/acs.analchem.7b02380. Mahieu, Nathaniel G., Jonathan L. Spalding, Susan J. Gelman, and Gary J. Patti. 2016. “Defining and Detecting Complex Peak Relationships in Mass Spectral Data: The Mz.unity Algorithm.” Analytical Chemistry 88 (18): 9037–46. https://doi.org/10.1021/acs.analchem.6b01702. Mahieu, Nathaniel G., Jonathan L. Spalding, and Gary J. Patti. 2016. “Warpgroup: Increased Precision of Metabolomic Data Processing by Consensus Integration Bound Analysis.” Bioinformatics 32 (2): 268–75. https://doi.org/10.1093/bioinformatics/btv564. Mahmud, Iqbal, Sandi Sternberg, Michael Williams, and Timothy J. Garrett. 2017. “Comparison of Global Metabolite Extraction Strategies for Soybeans Using UHPLC-HRMS.” Analytical and Bioanalytical Chemistry 409 (26): 6173–80. https://doi.org/10.1007/s00216-017-0557-6. Maitre, Léa, Mariona Bustamante, Carles Hernández-Ferrer, Denise Thiel, Chung-Ho E. Lau, Alexandros P. Siskos, Marta Vives-Usano, et al. 2022. “Multi-Omics Signatures of the Human Early Life Exposome.” Nature Communications 13 (1): 7024. https://doi.org/10.1038/s41467-022-34422-2. 
Mangul, Serghei, Thiago Mosqueiro, Richard J. Abdill, Dat Duong, Keith Mitchell, Varuni Sarwal, Brian Hill, et al. 2019. “Challenges and Recommendations to Improve the Installability and Archival Stability of Omics Computational Tools.” PLOS Biology 17 (6): e3000333. https://doi.org/10.1371/journal.pbio.3000333. Mannhold, Raimund, Gennadiy I. Poda, Claude Ostermann, and Igor V. Tetko. 2009. “Calculation of Molecular Lipophilicity: State-of-the-Art and Comparison of LogP Methods on More Than 96,000 Compounds.” Journal of Pharmaceutical Sciences 98 (3): 861–93. https://doi.org/10.1002/jps.21494. Mansouri, Kamel, Chris M. Grulke, Richard S. Judson, and Antony J. Williams. 2018. “OPERA Models for Predicting Physicochemical Properties and Environmental Fate Endpoints.” Journal of Cheminformatics 10 (1): 10. https://doi.org/10.1186/s13321-018-0263-1. Mardal, Marie, Petur W. Dalsgaard, Brian S. Rasmussen, Kristian Linnet, and Christian B. Mollerup. 2023. “Scalable Analysis of Untargeted LC-HRMS Data by Means of SQL Database Archiving.” Analytical Chemistry, February. https://doi.org/10.1021/acs.analchem.2c03769. Martens, Jonathan, Giel Berden, Rianne E. van Outersterp, Leo A. J. Kluijtmans, Udo F. Engelke, Clara D. M. van Karnebeek, Ron A. Wevers, and Jos Oomens. 2017. “Molecular Identification in Metabolomics Using Infrared Ion Spectroscopy.” Scientific Reports 7 (June). https://doi.org/10.1038/s41598-017-03387-4. Matich, Eryn K., Nita G. Chavez Soria, Diana S. Aga, and G. Ekin Atilla-Gokcumen. 2019. “Applications of Metabolomics in Assessing Ecological Effects of Emerging Contaminants and Pollutants on Plants.” Journal of Hazardous Materials 373 (July): 527–35. https://doi.org/10.1016/j.jhazmat.2019.02.084. Matsuo, Teruko, Hiroshi Tsugawa, Hiromi Miyagawa, and Eiichiro Fukusaki. 2017. 
“Integrated Strategy for Unknown EI–MS Identification Using Quality Control Calibration Curve, Multivariate Analysis, EI–MS Spectral Database, and Retention Index Prediction.” Analytical Chemistry 89 (12): 6766–73. https://doi.org/10.1021/acs.analchem.7b01010. McLean, Craig, and Elizabeth B. Kujawinski. 2020. “AutoTuner: High Fidelity and Robust Parameter Selection for Metabolomics Data Processing.” Analytical Chemistry 92 (8): 5724–32. https://doi.org/10.1021/acs.analchem.9b04804. Melamud, Eugene, Livia Vastag, and Joshua D. Rabinowitz. 2010. “Metabolomic Analysis and Visualization Engine for LC-MS Data.” Analytical Chemistry 82 (23): 9818–26. https://doi.org/10.1021/ac1021166. Menikarachchi, Lochana C., Shannon Cawley, Dennis W. Hill, L. Mark Hall, Lowell Hall, Steven Lai, Janine Wilder, and David F. Grant. 2012. “MolFind: A Software Package Enabling HPLC/MS-Based Identification of Unknown Chemical Structures.” Analytical Chemistry 84 (21): 9388–94. https://doi.org/10.1021/ac302048x. Miggiels, Paul, Bert Wouters, Gerard J. P. van Westen, Anne-Charlotte Dubbelman, and Thomas Hankemeier. 2019. “Novel Technologies for Metabolomics: More for Less.” TrAC Trends in Analytical Chemistry 120 (November): 115323. https://doi.org/10.1016/j.trac.2018.11.021. Misra, Biswapriya B. 2018. “New Tools and Resources in Metabolomics: 2016–2017.” ELECTROPHORESIS 39 (7): 909–23. https://doi.org/10.1002/elps.201700441. Misra, Biswapriya B., Johannes F. Fahrmann, and Dmitry Grapov. 2017. “Review of Emerging Metabolomic Tools and Resources: 2015–2016.” ELECTROPHORESIS 38 (18): 2257–74. https://doi.org/10.1002/elps.201700110. Misra, Biswapriya B., and Justin J. J. van der Hooft. 2016. “Updates in Metabolomics Tools and Resources: 2014–2015.” ELECTROPHORESIS 37 (1): 86–110. https://doi.org/10.1002/elps.201500417. Miyagawa, Hiromi, and Takeshi Bamba. 2019. 
“Comparison of Sequential Derivatization with Concurrent Methods for GC/MS-based Metabolomics.” Journal of Bioscience and Bioengineering 127 (2): 160–68. https://doi.org/10.1016/j.jbiosc.2018.07.015. Montenegro-Burke, J. Rafael, Aries E. Aisporna, H. Paul Benton, Duane Rinehart, Mingliang Fang, Tao Huan, Benedikt Warth, et al. 2017. “Data Streaming for Metabolomics: Accelerating Data Processing and Analysis from Days to Minutes.” Analytical Chemistry 89 (2): 1254–59. https://doi.org/10.1021/acs.analchem.6b03890. Müller, Manfred J., and Anja Bosy-Westphal. 2020. “From a ‘Metabolomics Fashion’ to a Sound Application of Metabolomics in Research on Human Nutrition.” European Journal of Clinical Nutrition 74 (12): 1619–29. https://doi.org/10.1038/s41430-020-00781-6. Myers, Owen D., Susan J. Sumner, Shuzhao Li, Stephen Barnes, and Xiuxia Du. 2017. “Detailed Investigation and Comparison of the XCMS and MZmine 2 Chromatogram Construction and Chromatographic Peak Detection Methods for Preprocessing Mass Spectrometry Metabolomics Data.” Analytical Chemistry 89 (17): 8689–95. https://doi.org/10.1021/acs.analchem.7b01069. Najdekr, Lukáš, David Friedecký, Ralf Tautenhahn, Tomáš Pluskal, Junhua Wang, Yingying Huang, and Tomáš Adam. 2016. “Influence of Mass Resolving Power in Orbital Ion-Trap Mass Spectrometry-Based Metabolomics.” Analytical Chemistry 88 (23): 11429–35. https://doi.org/10.1021/acs.analchem.6b02319. Nash, William J., and Warwick B. Dunn. 2019. “From Mass to Metabolite in Human Untargeted Metabolomics: Recent Advances in Annotation of Metabolites Applying Liquid Chromatography-Mass Spectrometry Data.” TrAC Trends in Analytical Chemistry 120 (November): 115324. https://doi.org/10.1016/j.trac.2018.11.022. Ni, Yan, Mingming Su, Yunping Qiu, Wei Jia, and Xiuxia Du. 2016. “ADAP-GC 3.0: Improved Peak Detection and Deconvolution of Co-eluting Metabolites from GC/TOF-MS Data for Metabolomics Studies.” Analytical Chemistry 88 (17): 8802–11. 
https://doi.org/10.1021/acs.analchem.6b02222. Ni, Zhixu, Michele Wölk, Geoff Jukes, Karla Mendivelso Espinosa, Robert Ahrends, Lucila Aimo, Jorge Alvarez-Jarreta, et al. 2022. “Guiding the Choice of Informatics Software and Tools for Lipidomics Research Applications.” Nature Methods, December, 1–12. https://doi.org/10.1038/s41592-022-01710-0. Nikolskiy, Igor, Nathaniel G. Mahieu, Ying-Jr Chen, Ralf Tautenhahn, and Gary J. Patti. 2013. “An Untargeted Metabolomic Workflow to Improve Structural Characterization of Metabolites.” Analytical Chemistry 85 (16): 7713–19. https://doi.org/10.1021/ac400751j. Nothias, Louis-Félix, Daniel Petras, Robin Schmid, Kai Dührkop, Johannes Rainer, Abinesh Sarvepalli, Ivan Protsyuk, et al. 2020. “Feature-Based Molecular Networking in the GNPS Analysis Environment.” Nature Methods 17 (9): 905–8. https://doi.org/10.1038/s41592-020-0933-6. Nyamundanda, Gift, Isobel Claire Gormley, Yue Fan, William M. Gallagher, and Lorraine Brennan. 2013. “MetSizeR: Selecting the Optimal Sample Size for Metabolomic Studies Using an Analysis Based Approach.” BMC Bioinformatics 14: 338. https://doi.org/10.1186/1471-2105-14-338. O’Boyle, Noel M., Michael Banck, Craig A. James, Chris Morley, Tim Vandermeersch, and Geoffrey R. Hutchison. 2011. “Open Babel: An Open Chemical Toolbox.” Journal of Cheminformatics 3 (1): 33. https://doi.org/10.1186/1758-2946-3-33. Oberg, Ann L., and Olga Vitek. 2009. “Statistical Design of Quantitative Mass Spectrometry-Based Proteomic Experiments.” Journal of Proteome Research 8 (5): 2144–56. https://doi.org/10.1021/pr8010099. Ortmayr, Karin, Verena Charwat, Cornelia Kasper, Stephan Hann, and Gunda Koellensperger. 2016. “Uncertainty Budgeting in Fold Change Determination and Implications for Non-Targeted Metabolomics Studies in Model Systems.” Analyst 142 (1): 80–90. https://doi.org/10.1039/C6AN01342B. Osipenko, Sergey, Alexander Zherebker, Lidiia Rumiantseva, Oxana Kovaleva, Evgeny N. Nikolaev, and Yury Kostyukevich. 2022. 
“Oxygen Isotope Exchange Reaction for Untargeted LC–MS Analysis.” Journal of the American Society for Mass Spectrometry 33 (2): 390–98. https://doi.org/10.1021/jasms.1c00383. Palmer, Andrew, Prasad Phapale, Ilya Chernyavsky, Regis Lavigne, Dominik Fay, Artem Tarasov, Vitaly Kovalev, et al. 2017. “FDR-controlled Metabolite Annotation for High-Resolution Imaging Mass Spectrometry.” Nature Methods 14 (1): 57–60. https://doi.org/10.1038/nmeth.4072. Pang, Zhiqiang, Jasmine Chong, Shuzhao Li, and Jianguo Xia. 2020. “MetaboAnalystR 3.0: Toward an Optimized Workflow for Global Metabolomics.” Metabolites 10 (5): 186. https://doi.org/10.3390/metabo10050186. Passos Mansoldo, Felipe Raposo, Rafael Garrett, Veronica da Silva Cardoso, Marina Amaral Alves, and Alane Beatriz Vermelho. 2022. “Metabology: Analysis of Metabolomics Data Using Community Ecology Tools.” Analytica Chimica Acta 1232 (November): 340469. https://doi.org/10.1016/j.aca.2022.340469. Patiny, Luc, and Alain Borel. 2013. “ChemCalc: A Building Block for Tomorrow’s Chemical Infrastructure.” Journal of Chemical Information and Modeling 53 (5): 1223–28. https://doi.org/10.1021/ci300563h. Petras, Daniel, Vanessa V. Phelan, Deepa Acharya, Andrew E. Allen, Allegra T. Aron, Nuno Bandeira, Benjamin P. Bowen, et al. 2021. “GNPS Dashboard: Collaborative Exploration of Mass Spectrometry Data in the Web Browser.” Nature Methods, December, 1–3. https://doi.org/10.1038/s41592-021-01339-5. Pezzatti, Julian, Julien Boccard, Santiago Codesido, Yoric Gagnebin, Abhinav Joshi, Didier Picard, Víctor González-Ruiz, and Serge Rudaz. 2020. “Implementation of Liquid Chromatography–High Resolution Mass Spectrometry Methods for Untargeted Metabolomic Analyses of Biological Samples: A Tutorial.” Analytica Chimica Acta 1105 (April): 28–44. https://doi.org/10.1016/j.aca.2019.12.062. Pfeuffer, Julianus, Chris Bielow, Samuel Wein, Kyowon Jeong, Eugen Netz, Axel Walter, Oliver Alka, et al. 2024. 
“OpenMS 3 Enables Reproducible Analysis of Large-Scale Mass Spectrometry Data.” Nature Methods 21 (3): 365–67. https://doi.org/10.1038/s41592-024-02197-7. Pfeuffer, Julianus, Timo Sachsenberg, Oliver Alka, Mathias Walzer, Alexander Fillbrunn, Lars Nilse, Oliver Schilling, Knut Reinert, and Oliver Kohlbacher. 2017. “OpenMS – A Platform for Reproducible Analysis of Mass Spectrometry Data.” Journal of Biotechnology, Bioinformatics Solutions for Big Data Analysis in Life Sciences presented by the German Network for Bioinformatics Infrastructure, 261 (November): 142–48. https://doi.org/10.1016/j.jbiotec.2017.05.016. Phapale, Prasad, Vineeta Rai, Ashok Kumar Mohanty, and Sanjeeva Srivastava. 2020. “Untargeted Metabolomics Workshop Report: Quality Control Considerations from Sample Preparation to Data Analysis.” Journal of the American Society for Mass Spectrometry 31 (9): 2006–10. https://doi.org/10.1021/jasms.0c00224. Place, Benjamin J., Elin M. Ulrich, Jonathan K. Challis, Alex Chao, Bowen Du, Kristin Favela, Yong-Lai Feng, et al. 2021. “An Introduction to the Benchmarking and Publications for Non-Targeted Analysis Working Group.” Analytical Chemistry 93 (49): 16289–96. https://doi.org/10.1021/acs.analchem.1c02660. Pluskal, Tomáš, Sandra Castillo, Alejandro Villar-Briones, and Matej Orešič. 2010. “MZmine 2: Modular Framework for Processing, Visualizing, and Analyzing Mass Spectrometry-Based Molecular Profile Data.” BMC Bioinformatics 11: 395. https://doi.org/10.1186/1471-2105-11-395. Pluskal, Tomáš, Ansgar Korf, Aleksandr Smirnov, Robin Schmid, Timothy R. Fallon, Xiuxia Du, and Jing-Ke Weng. 2020. “Chapter 7: Metabolomics Data Analysis Using MZmine.” In Processing Metabolomics and Proteomics Data with Open Software, 232–54. https://doi.org/10.1039/9781788019880-00232. Plyushchenko, Ivan V., Elizaveta S. Fedorova, Natalia V. Potoldykova, Konstantin A. Polyakovskiy, Alexander I. Glukhov, and Igor A. Rodin. 2022. 
“Omics Untargeted Key Script: R-Based Software Toolbox for Untargeted Metabolomics with Bladder Cancer Biomarkers Discovery Case Study.” Journal of Proteome Research 21 (3): 833–47. https://doi.org/10.1021/acs.jproteome.1c00392. Polderman, Tinca J. C., Beben Benyamin, Christiaan A. de Leeuw, Patrick F. Sullivan, Arjen van Bochoven, Peter M. Visscher, and Danielle Posthuma. 2015. “Meta-Analysis of the Heritability of Human Traits Based on Fifty Years of Twin Studies.” Nature Genetics 47 (7): 702–9. https://doi.org/10.1038/ng.3285. Qiu, Feng, Dennis D. Fine, Daniel J. Wherritt, Zhentian Lei, and Lloyd W. Sumner. 2016. “PlantMAT: A Metabolomics Tool for Predicting the Specialized Metabolic Potential of a System and for Large-Scale Metabolite Identifications.” Analytical Chemistry 88 (23): 11373–83. https://doi.org/10.1021/acs.analchem.6b00906. Qiu, Feng, Zhentian Lei, and Lloyd W. Sumner. 2018. “MetExpert: An Expert System to Enhance Gas Chromatography-Mass Spectrometry-Based Metabolite Identifications.” Analytica Chimica Acta, Analytical Metabolomics, 1037 (December): 316–26. https://doi.org/10.1016/j.aca.2018.03.052. Reuschenbach, Max, Felix Drees, Torsten C. Schmidt, and Gerrit Renner. 2023. “qBinning: Data Quality-Based Algorithm for Automized Ion Chromatogram Extraction from High-Resolution Mass Spectrometry.” Analytical Chemistry, September. https://doi.org/10.1021/acs.analchem.3c01079. Rey-Stolle, Fernanda, Danuta Dudzik, Carolina Gonzalez-Riano, Miguel Fernández-García, Vanesa Alonso-Herranz, David Rojo, Coral Barbas, and Antonia García. 2022. “Low and High Resolution Gas Chromatography-Mass Spectrometry for Untargeted Metabolomics: A Tutorial.” Analytica Chimica Acta 1210 (June): 339043. https://doi.org/10.1016/j.aca.2021.339043. Riquelme, Gabriel, Nicolás Zabalegui, Pablo Marchi, Christina M. Jones, and María Eugenia Monge. 2020. “A Python-Based Pipeline for Preprocessing LC–MS Data for Untargeted Metabolomics Workflows.” Metabolites 10 (10): 416. 
https://doi.org/10.3390/metabo10100416. Röst, Hannes L., Timo Sachsenberg, Stephan Aiche, Chris Bielow, Hendrik Weisser, Fabian Aicheler, Sandro Andreotti, et al. 2016. “OpenMS: A Flexible Open-Source Software Platform for Mass Spectrometry Data Analysis.” Nature Methods 13 (9): 741–48. https://doi.org/10.1038/nmeth.3959. Röst, Hannes L., Uwe Schmitt, Ruedi Aebersold, and Lars Malmström. 2014. “pyOpenMS: A Python-based Interface to the OpenMS Mass-Spectrometry Algorithm Library.” PROTEOMICS 14 (1): 74–77. https://doi.org/10.1002/pmic.201300246. Roszkowska, Anna, Miao Yu, Vincent Bessonneau, Leslie Bragg, Mark Servos, and Janusz Pawliszyn. 2018. “Tissue Storage Affects Lipidome Profiling in Comparison to in Vivo Microsampling Approach.” Scientific Reports 8 (1): 6980. https://doi.org/10.1038/s41598-018-25428-2. Rurik, Marc, Oliver Alka, Fabian Aicheler, and Oliver Kohlbacher. 2020. “Metabolomics Data Processing Using OpenMS.” In Computational Methods and Data Analysis for Metabolomics, edited by Shuzhao Li, 49–60. Methods in Molecular Biology. New York, NY: Springer US. https://doi.org/10.1007/978-1-0716-0239-3_4. Rusconi, Filippo. 2019. “mineXpert: Biological Mass Spectrometry Data Visualization and Mining with Full JavaScript Ability.” Journal of Proteome Research 18 (5): 2254–59. https://doi.org/10.1021/acs.jproteome.9b00099. Ruttkies, Christoph, Emma L. Schymanski, Sebastian Wolf, Juliane Hollender, and Steffen Neumann. 2016. “MetFrag Relaunched: Incorporating Strategies Beyond in Silico Fragmentation.” Journal of Cheminformatics 8 (January): 3. https://doi.org/10.1186/s13321-016-0115-9. Saccenti, Edoardo, and Marieke E. Timmerman. 2016. “Approaches to Sample Size Determination for Multivariate Data: Applications to PCA and PLS-DA of Omics Data.” Journal of Proteome Research 15 (8): 2379–93. https://doi.org/10.1021/acs.jproteome.5b01029. Samanipour, Saer, Malcolm J. Reid, Kine Bæk, and Kevin V. Thomas. 2018. 
“Combining a Deconvolution and a Universal Library Search Algorithm for the Nontarget Analysis of Data-Independent Acquisition Mode Liquid Chromatography-High-Resolution Mass Spectrometry Results.” Environmental Science &amp; Technology 52 (8): 4694–4701. https://doi.org/10.1021/acs.est.8b00259. Sarpe, Vladimir, and David C Schriemer. 2017. “Supporting Metabolomics with Adaptable Software: Design Architectures for the End-User.” Current Opinion in Biotechnology, Analytical biotechnology, 43 (February): 110–17. https://doi.org/10.1016/j.copbio.2016.11.001. Scheltema, Richard A., Andris Jankevics, Ritsert C. Jansen, Morris A. Swertz, and Rainer Breitling. 2011. “PeakML/mzMatch: A File Format, Java Library, R Library, and Tool-Chain for Mass Spectrometry Data Analysis.” Analytical Chemistry 83 (7): 2786–93. https://doi.org/10.1021/ac2000994. Scheubert, Kerstin, Franziska Hufsky, Daniel Petras, Mingxun Wang, Louis-Félix Nothias, Kai Dührkop, Nuno Bandeira, Pieter C. Dorrestein, and Sebastian Böcker. 2017. “Significance Estimation for Large Scale Metabolomics Annotations by Spectral Matching.” Nature Communications 8 (1): 1494. https://doi.org/10.1038/s41467-017-01318-5. Schrimpe-Rutledge, Alexandra C., Simona G. Codreanu, Stacy D. Sherrod, and John A. McLean. 2016. “Untargeted Metabolomics Strategies—Challenges and Emerging Directions.” Journal of The American Society for Mass Spectrometry 27 (12): 1897–1905. https://doi.org/10.1007/s13361-016-1469-y. Schymanski, Emma L., and Antony J. Williams. 2017. “Open Science for Identifying ‘Known Unknown’ Chemicals.” Environmental Science &amp; Technology 51 (10): 5357–59. https://doi.org/10.1021/acs.est.7b01908. Senan, Oriol, Antoni Aguilar-Mogas, Miriam Navarro, Jordi Capellades, Luke Noon, Deborah Burks, Oscar Yanes, Roger Guimerà, and Marta Sales-Pardo. 2019. 
“CliqueMS: A Computational Tool for Annotating in-Source Metabolite Ions from LC-MS Untargeted Metabolomics Data Based on a Coelution Similarity Network.” Bioinformatics 35 (20): 4089–97. https://doi.org/10.1093/bioinformatics/btz207. Shaffer, Justin P., Louis-Félix Nothias, Luke R. Thompson, Jon G. Sanders, Rodolfo A. Salido, Sneha P. Couvillion, Asker D. Brejnrod, et al. 2022. “Standardized Multi-Omics of Earth’s Microbiomes Reveals Microbial and Metabolite Diversity.” Nature Microbiology 7 (12): 2128–50. https://doi.org/10.1038/s41564-022-01266-x. Shen, Xiaotao, Ruohong Wang, Xin Xiong, Yandong Yin, Yuping Cai, Zaijun Ma, Nan Liu, and Zheng-Jiang Zhu. 2019. “Metabolic Reaction Network-Based Recursive Metabolite Annotation for Untargeted Metabolomics.” Nature Communications 10 (1): 1–14. https://doi.org/10.1038/s41467-019-09550-x. Shen, Xiaotao, Hong Yan, Chuchu Wang, Peng Gao, Caroline H. Johnson, and Michael P. Snyder. 2022. “TidyMass an Object-Oriented Reproducible Analysis Framework for LC–MS Data.” Nature Communications 13 (1): 4365. https://doi.org/10.1038/s41467-022-32155-w. Shi, Jiachen, Jialiang Zhao, Yu Zhang, Yanan Wang, Chin Ping Tan, Yong-Jiang Xu, and Yuanfa Liu. 2023. “Windows Scanning Multiomics: Integrated Metabolomics and Proteomics.” Analytical Chemistry, December. https://doi.org/10.1021/acs.analchem.3c03785. Silva, Ricardo R. da, Mingxun Wang, Louis-Félix Nothias, Justin J. J. van der Hooft, Andrés Mauricio Caraballo-Rodríguez, Evan Fox, Marcy J. Balunas, Jonathan L. Klassen, Norberto Peporine Lopes, and Pieter C. Dorrestein. 2018. “Propagating Annotations of Molecular Networks Using in Silico Fragmentation.” PLOS Computational Biology 14 (4): e1006089. https://doi.org/10.1371/journal.pcbi.1006089. Silva, Ricardo R., Fabien Jourdan, Diego M. Salvanha, Fabien Letisse, Emilien L. Jamin, Simone Guidetti-Gonzalez, Carlos A. Labate, and Ricardo Z. N. Vêncio. 2014. 
“ProbMetab: An R Package for Bayesian Probabilistic Annotation of LC–MS-based Metabolomics.” Bioinformatics 30 (9): 1336–37. https://doi.org/10.1093/bioinformatics/btu019. Sindelar, Miriam, and Gary J. Patti. 2020. “Chemical Discovery in the Era of Metabolomics.” Journal of the American Chemical Society, April. https://doi.org/10.1021/jacs.9b13198. Siskos, Alexandros P., Pooja Jain, Werner Römisch-Margl, Mark Bennett, David Achaintre, Yasmin Asad, Luke Marney, et al. 2017. “Interlaboratory Reproducibility of a Targeted Metabolomics Platform for Analysis of Human Serum and Plasma.” Analytical Chemistry 89 (1): 656–65. https://doi.org/10.1021/acs.analchem.6b02930. Sitnikov, Dmitri G., Cian S. Monnin, and Dajana Vuckovic. 2016. “Systematic Assessment of Seven Solvent and Solid-Phase Extraction Methods for Metabolomics Analysis of Human Plasma by LC-MS.” Scientific Reports 6 (December). https://doi.org/10.1038/srep38885. Smirnov, Kirill S., Sara Forcisi, Franco Moritz, Marianna Lucio, and Philippe Schmitt-Kopplin. 2019. “Mass Difference Maps and Their Application for the Recalibration of Mass Spectrometric Data in Nontargeted Metabolomics.” Analytical Chemistry 91 (5): 3350–58. https://doi.org/10.1021/acs.analchem.8b04555. Smirnov, Kirill S., Tanja V. Maier, Alesia Walker, Silke S. Heinzmann, Sara Forcisi, Inés Martinez, Jens Walter, and Philippe Schmitt-Kopplin. 2016. “Challenges of Metabolomics in Human Gut Microbiota Research.” International Journal of Medical Microbiology, Intestinal microbiota - a microbial ecosystem at the edge between immune homeostasis and inflammation, 306 (5): 266–79. https://doi.org/10.1016/j.ijmm.2016.03.006. Smith, Colin A., Elizabeth J. Want, Grace O’Maille, Ruben Abagyan, and Gary Siuzdak. 2006. “XCMS:  Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification.” Analytical Chemistry 78 (3): 779–87. https://doi.org/10.1021/ac051437y. 
Spalding, Jonathan L., Kevin Cho, Nathaniel G. Mahieu, Igor Nikolskiy, Elizabeth M. Llufrio, Stephen L. Johnson, and Gary J. Patti. 2016. “Bar Coding MS2 Spectra for Metabolite Identification.” Analytical Chemistry 88 (5): 2538–42. https://doi.org/10.1021/acs.analchem.5b04925. Spicer, Rachel, Reza M. Salek, Pablo Moreno, Daniel Cañueto, and Christoph Steinbeck. 2017. “Navigating Freely-Available Software Tools for Metabolomics Analysis.” Metabolomics 13 (9). https://doi.org/10.1007/s11306-017-1242-7. Spratlin, Jennifer L., Natalie J. Serkova, and S. Gail Eckhardt. 2009. “Clinical Applications of Metabolomics in Oncology: A Review.” Clinical Cancer Research 15 (2): 431–40. https://doi.org/10.1158/1078-0432.CCR-08-1059. Stancliffe, Ethan, Michaela Schwaiger-Haber, Miriam Sindelar, Matthew J. Murphy, Mette Soerensen, and Gary J. Patti. 2022. “An Untargeted Metabolomics Workflow That Scales to Thousands of Samples for Population-Based Studies.” Analytical Chemistry, December. https://doi.org/10.1021/acs.analchem.2c01270. Stincone, Paolo, Abzer K. Pakkir Shah, Robin Schmid, Lana G. Graves, Stilianos P. Lambidis, Ralph R. Torres, Shu-Ning Xia, et al. 2023. “Evaluation of Data-Dependent MS/MS Acquisition Parameters for Non-Targeted Metabolomics and Molecular Networking of Environmental Samples: Focus on the Q Exactive Platform.” Analytical Chemistry, August. https://doi.org/10.1021/acs.analchem.3c01202. Styczynski, Mark P., Joel F. Moxley, Lily V. Tong, Jason L. Walther, Kyle L. Jensen, and Gregory N. Stephanopoulos. 2007. “Systematic Identification of Conserved Metabolites in GC/MS Data for Metabolomics and Biomarker Discovery.” Analytical Chemistry 79 (3): 966–73. https://doi.org/10.1021/ac0614846. Sumner, Lloyd W., Alexander Amberg, Dave Barrett, Michael H. Beale, Richard Beger, Clare A. Daykin, Teresa W.-M. Fan, et al. 
2007. “Proposed Minimum Reporting Standards for Chemical Analysis Chemical Analysis Working Group (CAWG) Metabolomics Standards Initiative (MSI).” Metabolomics: Official Journal of the Metabolomic Society 3 (3): 211–21. https://doi.org/10.1007/s11306-007-0082-2. Sumner, Lloyd W, Pedro Mendes, and Richard A Dixon. 2003. “Plant Metabolomics: Large-Scale Phytochemistry in the Functional Genomics Era.” Phytochemistry, Plant Metabolomics, 62 (6): 817–36. https://doi.org/10.1016/S0031-9422(02)00708-2. Sysi-Aho, Marko, Mikko Katajamaa, Laxman Yetukuri, and Matej Orešič. 2007. “Normalization Method for Metabolomics Data Using Optimal Selection of Multiple Internal Standards.” BMC Bioinformatics 8 (March): 93. https://doi.org/10.1186/1471-2105-8-93. Tang, Yanan, Caley B. Craven, Nicholas J. P. Wawryk, Junlang Qiu, Feng Li, and Xing-Fang Li. 2020. “Advances in Mass Spectrometry-Based Omics Analysis of Trace Organics in Water.” TrAC Trends in Analytical Chemistry 128 (July): 115918. https://doi.org/10.1016/j.trac.2020.115918. Tarakhovskaya, Elena, Andrea Marcillo, Caroline Davis, Sanja Milkovska-Stamenova, Antje Hutschenreuther, and Claudia Birkemeyer. 2023. “Matrix Effects in GC-MS Profiling of Common Metabolites After Trimethylsilyl Derivatization.” Molecules (Basel, Switzerland) 28 (6): 2653. https://doi.org/10.3390/molecules28062653. Tautenhahn, Ralf, Christoph Böttcher, and Steffen Neumann. 2008. “Highly Sensitive Feature Detection for High Resolution LC/MS.” BMC Bioinformatics 9: 504. https://doi.org/10.1186/1471-2105-9-504. Tautenhahn, Ralf, Kevin Cho, Winnie Uritboonthai, Zhengjiang Zhu, Gary J. Patti, and Gary Siuzdak. 2012. “An Accelerated Workflow for Untargeted Metabolomics Using the METLIN Database.” Nature Biotechnology 30 (9): 826–28. https://doi.org/10.1038/nbt.2348. Theodoridis, Georgios A., Helen G. Gika, Elizabeth J. Want, and Ian D. Wilson. 2012. 
“Liquid Chromatography–Mass Spectrometry Based Global Metabolite Profiling: A Review.” Analytica Chimica Acta 711 (January): 7–16. https://doi.org/10.1016/j.aca.2011.09.042. Thonusin, Chanisa, Heidi B. IglayReger, Tanu Soni, Amy E. Rothberg, Charles F. Burant, and Charles R. Evans. 2017. “Evaluation of Intensity Drift Correction Strategies Using MetaboDrift, a Normalization Tool for Multi-Batch Metabolomics Data.” Journal of Chromatography A, Pushing the Boundaries of Chromatography and Electrophoresis, 1523 (Supplement C): 265–74. https://doi.org/10.1016/j.chroma.2017.09.023. Tian, Leqi, Zhenjiang Li, Guoxuan Ma, Xiaoyue Zhang, Ziyin Tang, Siheng Wang, Jian Kang, Donghai Liang, and Tianwei Yu. 2022. “Metapone: A Bioconductor Package for Joint Pathway Testing for Untargeted Metabolomics Data.” Bioinformatics 38 (14): 3662–64. https://doi.org/10.1093/bioinformatics/btac364. Tian, Tze-Feng, San-Yuan Wang, Tien-Chueh Kuo, Cheng-En Tan, Guan-Yuan Chen, Ching-Hua Kuo, Chi-Hsin Sally Chen, Chang-Chuan Chan, Olivia A. Lin, and Y. Jane Tseng. 2016. “Web Server for Peak Detection, Baseline Correction, and Alignment in Two-Dimensional Gas Chromatography Mass Spectrometry-Based Metabolomics Data.” Analytical Chemistry 88 (21): 10395–403. https://doi.org/10.1021/acs.analchem.6b00755. Tian, Zhitao, Xin Hu, Yingying Xu, Mengmeng Liu, Hongbo Liu, Dongqin Li, Lisong Hu, Guozhu Wei, and Wei Chen. 2023. “PMhub 1.0: A Comprehensive Plant Metabolome Database.” Nucleic Acids Research, October, gkad811. https://doi.org/10.1093/nar/gkad811. Torigoe, Taihei, Masatomo Takahashi, Omidreza Heravizadeh, Kazuki Ikeda, Kohta Nakatani, Takeshi Bamba, and Yoshihiro Izumi. 2024. “Predicting Retention Time in Unified-Hydrophilic-Interaction/Anion-Exchange Liquid Chromatography High-Resolution Tandem Mass Spectrometry (Unified-HILIC/AEX/HRMS/MS) for Comprehensive Structural Annotation of Polar Metabolome.” Analytical Chemistry 96 (3): 1275–83. https://doi.org/10.1021/acs.analchem.3c04618. 
Treutler, Hendrik, and Steffen Neumann. 2016. “Prediction, Detection, and Validation of Isotope Clusters in Mass Spectrometry Data.” Metabolites 6 (4): 37. https://doi.org/10.3390/metabo6040037. Treutler, Hendrik, Hiroshi Tsugawa, Andrea Porzel, Karin Gorzolka, Alain Tissier, Steffen Neumann, and Gerd Ulrich Balcke. 2016. “Discovering Regulated Metabolite Families in Untargeted Metabolomics Studies.” Analytical Chemistry 88 (16): 8082–90. https://doi.org/10.1021/acs.analchem.6b01569. Tsou, Chih-Chiang, Dmitry Avtonomov, Brett Larsen, Monika Tucholska, Hyungwon Choi, Anne-Claude Gingras, and Alexey I. Nesvizhskii. 2015. “DIA-Umpire: Comprehensive Computational Framework for Data-Independent Acquisition Proteomics.” Nature Methods 12 (3): 258–64. https://doi.org/10.1038/nmeth.3255. Tsugawa, Hiroshi, Tomas Cajka, Tobias Kind, Yan Ma, Brendan Higgins, Kazutaka Ikeda, Mitsuhiro Kanazawa, Jean VanderGheynst, Oliver Fiehn, and Masanori Arita. 2015. “MS-DIAL: Data-Independent MS/MS Deconvolution for Comprehensive Metabolome Analysis.” Nature Methods 12 (6): 523–26. https://doi.org/10.1038/nmeth.3393. Tsugawa, Hiroshi, Tobias Kind, Ryo Nakabayashi, Daichi Yukihira, Wataru Tanaka, Tomas Cajka, Kazuki Saito, Oliver Fiehn, and Masanori Arita. 2016. “Hydrogen Rearrangement Rules: Computational MS/MS Fragmentation and Structure Elucidation Using MS-FINDER Software.” Analytical Chemistry 88 (16): 7946–58. https://doi.org/10.1021/acs.analchem.6b00770. Uchino, Haruki, Hiroshi Tsugawa, Hidenori Takahashi, and Makoto Arita. 2022. “Computational Mass Spectrometry Accelerates C = C Position-Resolved Untargeted Lipidomics Using Oxygen Attachment Dissociation.” Communications Chemistry 5 (1): 1–13. https://doi.org/10.1038/s42004-022-00778-1. Uppal, Karan, Quinlyn A. Soltow, Frederick H. Strobel, W. Stephen Pittard, Kim M. Gernert, Tianwei Yu, and Dean P. Jones. 2013. 
“xMSanalyzer: Automated Pipeline for Improved Feature Detection and Downstream Analysis of Large-Scale, Non-Targeted Metabolomics Data.” BMC Bioinformatics 14 (1): 15. https://doi.org/10.1186/1471-2105-14-15. Uppal, Karan, Douglas I. Walker, and Dean P. Jones. 2017. “xMSannotator: An R Package for Network-Based Annotation of High-Resolution Metabolomics Data.” Analytical Chemistry 89 (2): 1063–67. https://doi.org/10.1021/acs.analchem.6b01214. Uppal, Karan, Douglas I. Walker, Ken Liu, Shuzhao Li, Young-Mi Go, and Dean P. Jones. 2016. “Computational Metabolomics: A Framework for the Million Metabolome.” Chemical Research in Toxicology 29 (12): 1956–75. https://doi.org/10.1021/acs.chemrestox.6b00179. van der Kloet, Frans M., Ivana Bobeldijk, Elwin R. Verheij, and Renger H. Jellema. 2009. “Analytical Error Reduction Using Single Point Calibration for Accurate and Precise Metabolomic Phenotyping.” Journal of Proteome Research 8 (11): 5132–41. https://doi.org/10.1021/pr900499r. van Tetering, Lara, Sylvia Spies, Quirine D. K. Wildeman, Kas J. Houthuijs, Rianne E. van Outersterp, Jonathan Martens, Ron A. Wevers, David S. Wishart, Giel Berden, and Jos Oomens. 2024. “A Spectroscopic Test Suggests That Fragment Ion Structure Annotations in MS/MS Libraries Are Frequently Incorrect.” Communications Chemistry 7 (1): 1–11. https://doi.org/10.1038/s42004-024-01112-7. Verhoeven, Aswin, Martin Giera, and Oleg A. Mayboroda. 2020. “Scientific Workflow Managers in Metabolomics: An Overview.” Analyst 145 (11): 3801–8. https://doi.org/10.1039/D0AN00272K. Viant, Mark R., Timothy M. D. Ebbels, Richard D. Beger, Drew R. Ekman, David J. T. Epps, Hennicke Kamp, Pim E. G. Leonards, et al. 2019. “Use Cases, Best Practice and Reporting Standards for Metabolomics in Regulatory Toxicology.” Nature Communications 10 (1): 3041. https://doi.org/10.1038/s41467-019-10900-y. Viant, Mark R, Irwin J Kurland, Martin R Jones, and Warwick B Dunn. 2017. 
“How Close Are We to Complete Annotation of Metabolomes?” Current Opinion in Chemical Biology, Omics, 36 (February): 64–69. https://doi.org/10.1016/j.cbpa.2017.01.001. Vinaixa, Maria, Emma L. Schymanski, Steffen Neumann, Miriam Navarro, Reza M. Salek, and Oscar Yanes. 2016. “Mass Spectral Databases for LC/MS- and GC/MS-based Metabolomics: State of the Field and Future Prospects.” TrAC Trends in Analytical Chemistry 78 (April): 23–35. https://doi.org/10.1016/j.trac.2015.09.005. Vitale, Chiara Maria, Arjen Lommen, Carolin Huber, Kevin Wagner, Borja Garlito Molina, Rosalie Nijssen, Elliott James Price, et al. 2022. “Harmonized Quality Assurance/Quality Control Provisions for Nontargeted Measurement of Urinary Pesticide Biomarkers in the HBM4EU Multisite SPECIMEn Study.” Analytical Chemistry 94 (22): 7833–43. https://doi.org/10.1021/acs.analchem.2c00061. Volikov, Alexander, Gleb Rukhovich, and Irina V. Perminova. 2023. “NOMspectra: An Open-Source Python Package for Processing High Resolution Mass Spectrometry Data on Natural Organic Matter.” Journal of the American Society for Mass Spectrometry, June. https://doi.org/10.1021/jasms.3c00003. Wallach, Joshua D., Kevin W. Boyack, and John P. A. Ioannidis. 2018. “Reproducible Research Practices, Transparency, and Open Access Data in the Biomedical Literature, 2015–2017.” PLOS Biology 16 (11): e2006930. https://doi.org/10.1371/journal.pbio.2006930. Wandro, Stephen, Lisa Carmody, Tara Gallagher, John J. LiPuma, and Katrine Whiteson. 2017. “Making It Last: Storage Time and Temperature Have Differential Impacts on Metabolite Profiles of Airway Samples from Cystic Fibrosis Patients.” mSystems 2 (6). https://doi.org/10.1128/mSystems.00100-17. Wang, Mingxun, Jeremy J. Carver, Vanessa V. Phelan, Laura M. Sanchez, Neha Garg, Yao Peng, Don Duy Nguyen, et al. 2016. 
“Sharing and Community Curation of Mass Spectrometry Data with Global Natural Products Social Molecular Networking.” Nature Biotechnology 34 (8): 828–37. https://doi.org/10.1038/nbt.3597. Wang, Ruimin, Miaoshan Lu, Shaowei An, Jinyin Wang, and Changbin Yu. 2023. “G-Aligner: A Graph-Based Feature Alignment Method for Untargeted LC–MS-based Metabolomics.” BMC Bioinformatics 24 (1): 431. https://doi.org/10.1186/s12859-023-05525-4. Wang, Ruohong, Yandong Yin, and Zheng-Jiang Zhu. 2019. “Advancing Untargeted Metabolomics Using Data-Independent Acquisition Mass Spectrometry Technology.” Analytical and Bioanalytical Chemistry 411 (19): 4349–57. https://doi.org/10.1007/s00216-019-01709-1. Wang, San-Yuan, Ching-Hua Kuo, and Yufeng J. Tseng. 2013. “Batch Normalizer: A Fast Total Abundance Regression Calibration Method to Simultaneously Adjust Batch and Injection Order Effects in Liquid Chromatography/Time-of-Flight Mass Spectrometry-Based Metabolomics Data and Comparison with Current Calibration Methods.” Analytical Chemistry 85 (2): 1037–46. https://doi.org/10.1021/ac302877x. Wang, Suping, Xiaojuan Jiang, Rong Ding, Binbin Chen, Haiyan Lyu, Junyang Liu, Chunyan Zhu, et al. 2022. “MS-IDF: A Software Tool for Nontargeted Identification of Endogenous Metabolites After Chemical Isotope Labeling Based on a Narrow Mass Defect Filter.” Analytical Chemistry 94 (7): 3194–3202. https://doi.org/10.1021/acs.analchem.1c04719. Wang, Yang, Fang Liu, Peng Li, Chengwei He, Ruibing Wang, Huanxing Su, and Jian-Bo Wan. 2016. “An Improved Pseudotargeted Metabolomics Approach Using Multiple Ion Monitoring with Time-Staggered Ion Lists Based on Ultra-High Performance Liquid Chromatography/Quadrupole Time-of-Flight Mass Spectrometry.” Analytica Chimica Acta 927 (July): 82–88. https://doi.org/10.1016/j.aca.2016.05.008. Warth, Benedikt, Scott Spangler, Mingliang Fang, Caroline H. Johnson, Erica M. Forsberg, Ana Granados, Richard L. Martin, et al. 2017. 
“Exposome-Scale Investigations Guided by Global Metabolomics, Pathway Analysis, and Cognitive Computing.” Analytical Chemistry 89 (21): 11505–13. https://doi.org/10.1021/acs.analchem.7b02759. Weber, Ralf J. M., Thomas N. Lawson, Reza M. Salek, Timothy M. D. Ebbels, Robert C. Glen, Royston Goodacre, Julian L. Griffin, et al. 2017. “Computational Tools and Workflows in Metabolomics: An International Survey Highlights the Opportunity for Harmonisation Through Galaxy.” Metabolomics 13 (2). https://doi.org/10.1007/s11306-016-1147-x. Weber, Ralf J. M., and Mark R. Viant. 2010. “MI-Pack: Increased Confidence of Metabolite Identification in Mass Spectra by Integrating Accurate Masses and Metabolic Pathways.” Chemometrics and Intelligent Laboratory Systems, OMICS, 104 (1): 75–82. https://doi.org/10.1016/j.chemolab.2010.04.010. Wehrens, Ron, Tom G. Bloemberg, and Paul H. C. Eilers. 2015. “Fast Parametric Time Warping of Peak Lists.” Bioinformatics 31 (18): 3063–65. https://doi.org/10.1093/bioinformatics/btv299. Wei, Runmin, Jingye Wang, Mingming Su, Erik Jia, Shaoqiu Chen, Tianlu Chen, and Yan Ni. 2018. “Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data.” Scientific Reports 8 (1): 663. https://doi.org/10.1038/s41598-017-19120-0. Weljie, Aalim M., Jack Newton, Pascal Mercier, Erin Carlson, and Carolyn M. Slupsky. 2006. “Targeted Profiling:  Quantitative Analysis of 1H NMR Metabolomics Data.” Analytical Chemistry 78 (13): 4430–42. https://doi.org/10.1021/ac060209g. Wen, Bo, Zhanlong Mei, Chunwei Zeng, and Siqi Liu. 2017. “metaX: A Flexible and Comprehensive Software for Processing Metabolomics Data.” BMC Bioinformatics 18 (March): 183. https://doi.org/10.1186/s12859-017-1579-y. Wiklund, Susanne, Erik Johansson, Lina Sjöström, Ewa J. Mellerowicz, Ulf Edlund, John P. Shockcor, Johan Gottfries, Thomas Moritz, and Johan Trygg. 2008. 
“Visualization of GC/TOF-MS-Based Metabolomics Data for Identification of Biochemically Interesting Compounds Using OPLS Class Models.” Analytical Chemistry 80 (1): 115–22. https://doi.org/10.1021/ac0713510. Wise, Stephen A. 2022. “What If Using Certified Reference Materials (CRMs) Was a Requirement to Publish in Analytical/Bioanalytical Chemistry Journals?” Analytical and Bioanalytical Chemistry 414 (24): 7015–22. https://doi.org/10.1007/s00216-022-04163-8. Wishart, David S. 2016. “Emerging Applications of Metabolomics in Drug Discovery and Precision Medicine.” Nature Reviews Drug Discovery 15 (7): 473–84. https://doi.org/10.1038/nrd.2016.32. Witting, Michael, Christoph Ruttkies, Steffen Neumann, and Philippe Schmitt-Kopplin. 2017. “LipidFrag: Improving Reliability of in Silico Fragmentation of Lipids and Application to the Caenorhabditis Elegans Lipidome.” PLOS ONE 12 (3): e0172311. https://doi.org/10.1371/journal.pone.0172311. Wolf, Sebastian, Stephan Schmidt, Matthias Müller-Hannemann, and Steffen Neumann. 2010. “In Silico Fragmentation for Computer Assisted Identification of Metabolite Mass Spectra.” BMC Bioinformatics 11 (March): 148. https://doi.org/10.1186/1471-2105-11-148. Wolfender, Jean-Luc, Guillaume Marti, Aurélien Thomas, and Samuel Bertrand. 2015. “Current Approaches and Challenges for the Metabolite Profiling of Complex Natural Extracts.” Journal of Chromatography A, Editors’ Choice IX, 1382 (February): 136–64. https://doi.org/10.1016/j.chroma.2014.10.091. Wright, Elliott J., Daniel G. Beach, and Pearse McCarron. 2022. “Non-Target Analysis and Stability Assessment of Reference Materials Using Liquid Chromatography-High-Resolution Mass Spectrometry.” Analytica Chimica Acta 1201 (April): 339622. https://doi.org/10.1016/j.aca.2022.339622. Wu, Yiman, and Liang Li. 2016. “Sample Normalization Methods in Quantitative Metabolomics.” Journal of Chromatography A, Editors’ Choice X, 1430 (January): 80–95. https://doi.org/10.1016/j.chroma.2015.12.007. 
Xing, Shipei, Sam Shen, Banghua Xu, Xiaoxiao Li, and Tao Huan. 2023. “BUDDY: Molecular Formula Discovery via Bottom-up MS/MS Interrogation.” Nature Methods, April, 1–10. https://doi.org/10.1038/s41592-023-01850-x. Xu, Yi-Fan, Wenyun Lu, and Joshua D. Rabinowitz. 2015. “Avoiding Misannotation of In-Source Fragmentation Products as Cellular Metabolites in Liquid Chromatography–Mass Spectrometry-Based Metabolomics.” Analytical Chemistry 87 (4): 2273–81. https://doi.org/10.1021/ac504118y. Xue, Jingchuan, Rico J. E. Derks, Bill Webb, Elizabeth M. Billings, Aries Aisporna, Martin Giera, and Gary Siuzdak. 2021. “Single Quadrupole Multiple Fragment Ion Monitoring Quantitative Mass Spectrometry.” Analytical Chemistry 93 (31): 10879–89. https://doi.org/10.1021/acs.analchem.1c01246. Xue, Jingchuan, Xavier Domingo-Almenara, Carlos Guijas, Amelia Palermo, Markus M. Rinschen, John Isbell, H. Paul Benton, and Gary Siuzdak. 2020. “Enhanced in-Source Fragmentation Annotation Enables Novel Data Independent Acquisition and Autonomous METLIN Molecular Identification.” Analytical Chemistry 92 (8): 6051–59. https://doi.org/10.1021/acs.analchem.0c00409. Xue, Jingchuan, Carlos Guijas, H. Paul Benton, Benedikt Warth, and Gary Siuzdak. 2020. “METLIN MS 2 Molecular Standards Database: A Broad Chemical and Biological Resource.” Nature Methods 17 (10): 953–54. https://doi.org/10.1038/s41592-020-0942-5. Xue, Jingchuan, Jiamin Zhu, Lixin Hu, Junjie Yang, Yunbo Lv, Fanrong Zhao, Yuxian Liu, Tao Zhang, Yanpeng Cai, and Mingliang Fang. 2023. “EISA-EXPOSOME: One Highly Sensitive and Autonomous Exposomic Platform with Enhanced in-Source Fragmentation/Annotation.” Analytical Chemistry, November. https://doi.org/10.1021/acs.analchem.3c02697. Yamamoto, Hiroyuki, Tamaki Fujimori, Hajime Sato, Gen Ishikawa, Kenjiro Kami, and Yoshiaki Ohashi. 2014. 
“Statistical Hypothesis Testing of Factor Loading in Principal Component Analysis and Its Application to Metabolite Set Enrichment Analysis.” BMC Bioinformatics 15 (February): 51. https://doi.org/10.1186/1471-2105-15-51. Yan, Binjun, Mengtian Shi, Siyu Cai, Yuan Su, Renhui Chen, Chiyuan Huang, and David Da Yong Chen. 2023. “Data-Driven Tool for Cross-Run Ion Selection and Peak-Picking in Quantitative Proteomics with Data-Independent Acquisition LC–MS/MS.” Analytical Chemistry 95 (45): 16558–66. https://doi.org/10.1021/acs.analchem.3c02689. Yang, Qingxia, Yunxia Wang, Ying Zhang, Fengcheng Li, Weiqi Xia, Ying Zhou, Yunqing Qiu, Honglin Li, and Feng Zhu. 2020. “NOREVA: Enhanced Normalization and Evaluation of Time-Course and Multi-Class Metabolomic Data.” Nucleic Acids Research 48 (W1): W436–48. https://doi.org/10.1093/nar/gkaa258. Yang, Qin, Shan-Shan Lin, Jiang-Tao Yang, Li-Juan Tang, and Ru-Qin Yu. 2017. “Detection of Inborn Errors of Metabolism Utilizing GC-MS Urinary Metabolomics Coupled with a Modified Orthogonal Partial Least Squares Discriminant Analysis.” Talanta 165 (April): 545–52. https://doi.org/10.1016/j.talanta.2017.01.018. Yang, Qiong, Hongchao Ji, Zhenbo Xu, Yiming Li, Pingshan Wang, Jinyu Sun, Xiaqiong Fan, Hailiang Zhang, Hongmei Lu, and Zhimin Zhang. 2023. “Ultra-Fast and Accurate Electron Ionization Mass Spectrum Matching for Compound Identification with Million-Scale in-Silico Library.” Nature Communications 14 (1): 3722. https://doi.org/10.1038/s41467-023-39279-7. Yang, Ruochen, Xi Chen, and Idoia Ochoa. 2019. “MassComp, a Lossless Compressor for Mass Spectrometry Data.” BMC Bioinformatics 20 (1): 368. https://doi.org/10.1186/s12859-019-2962-7. Yates Iii, John R. 2011. “A Century of Mass Spectrometry: From Atoms to Proteomes.” Nature Methods 8 (8): 633–37. https://doi.org/10.1038/nmeth.1659. Yu, Miao, Georgia Dolios, and Lauren Petrick. 2022. 
“Reproducible Untargeted Metabolomics Workflow for Exhaustive MS2 Data Acquisition of MS1 Features.” Journal of Cheminformatics 14 (1): 6. https://doi.org/10.1186/s13321-022-00586-8. Yu, Miao, Sofia Lendor, Anna Roszkowska, Mariola Olkowicz, Leslie Bragg, Mark Servos, and Janusz Pawliszyn. 2020. “Metabolic Profile of Fish Muscle Tissue Changes with Sampling Method, Storage Strategy and Time.” Analytica Chimica Acta 1136 (November): 42–50. https://doi.org/10.1016/j.aca.2020.08.050. Yu, Miao, Mariola Olkowicz, and Janusz Pawliszyn. 2019. “Structure/Reaction Directed Analysis for LC-MS Based Untargeted Analysis.” Analytica Chimica Acta 1050 (March): 16–24. https://doi.org/10.1016/j.aca.2018.10.062. Yu, Miao, Susan L. Teitelbaum, Georgia Dolios, Lam-Ha T. Dang, Peijun Tu, Mary S. Wolff, and Lauren M. Petrick. 2022. “Molecular Gatekeeper Discovery: Workflow for Linking Multiple Exposure Biomarkers to Metabolomics.” Environmental Science &amp; Technology 56 (10): 6162–71. https://doi.org/10.1021/acs.est.1c04039. Yu, Tianwei, Youngja Park, Jennifer M. Johnson, and Dean P. Jones. 2009. “apLCMS—Adaptive Processing of High-Resolution LC/MS Data.” Bioinformatics 25 (15): 1930–36. https://doi.org/10.1093/bioinformatics/btp291. Yu, Yong-Jie, Qing-Xia Zheng, Yue-Ming Zhang, Qian Zhang, Yu-Ying Zhang, Ping-Ping Liu, Peng Lu, et al. 2019. “Automatic Data Analysis Workflow for Ultra-High Performance Liquid Chromatography-High Resolution Mass Spectrometry-Based Metabolomics.” Journal of Chromatography A 1585 (January): 172–81. https://doi.org/10.1016/j.chroma.2018.11.070. Yu, Zhihao, Haylea C. Miller, Geoffrey J. Puzon, and Brian H. Clowers. 2017. “Development of Untargeted Metabolomics Methods for the Rapid Detection of Pathogenic Naegleria Fowleri.” Environmental Science &amp; Technology 51 (8): 4210–19. https://doi.org/10.1021/acs.est.6b05969. Yuan, Min, Susanne B. Breitkopf, Xuemei Yang, and John M. Asara. 2012. 
“A Positive/Negative Ion–Switching, Targeted Mass Spectrometry–Based Metabolomics Platform for Bodily Fluids, Cells, and Fresh and Fixed Tissue.” Nature Protocols 7 (5): 872–81. https://doi.org/10.1038/nprot.2012.024. Zenobi, R. 2013. “Single-Cell Metabolomics: Analytical and Biological Perspectives.” Science 342 (6163): 1243259. https://doi.org/10.1126/science.1243259. Zha, Haihong, Yuping Cai, Yandong Yin, Zhuozhong Wang, Kang Li, and Zheng-Jiang Zhu. 2018. “SWATHtoMRM: Development of High-Coverage Targeted Metabolomics Method Using SWATH Technology for Biomarker Discovery.” Analytical Chemistry 90 (6): 4062–70. https://doi.org/10.1021/acs.analchem.7b05318. Zhang, Aihua, Hui Sun, Ping Wang, Ying Han, and Xijun Wang. 2012. “Modern Analytical Techniques in Metabolomics Analysis.” The Analyst 137 (2): 293–300. https://doi.org/10.1039/C1AN15605E. Zhang, Xiuqiong, Zaifang Li, Chunxia Zhao, Tiantian Chen, Xinxin Wang, Xiaoshan Sun, Xinjie Zhao, Xin Lu, and Guowang Xu. 2024. “Leveraging Unidentified Metabolic Features for Key Pathway Discovery: Chemical Classification-driven Network Analysis in Untargeted Metabolomics.” Analytical Chemistry, February. https://doi.org/10.1021/acs.analchem.3c04591. Zhang, Yuhao, Jingyu Liao, Wanqi Le, Gaosong Wu, and Weidong Zhang. 2023. “Improving the Data Quality of Untargeted Metabolomics Through a Targeted Data-Dependent Acquisition Based on an Inclusion List of Differential and Preidentified Ions.” Analytical Chemistry 95 (34): 12964–73. https://doi.org/10.1021/acs.analchem.3c02888. Zhang, Yu-Ying, Qian Zhang, Yue-Ming Zhang, Wei-Wei Wang, Li Zhang, Yong-Jie Yu, Chang-Cai Bai, Ji-Zhao Guo, Hai-Yan Fu, and Yuanbin She. 2020. “A Comprehensive Automatic Data Analysis Strategy for Gas Chromatography-Mass Spectrometry Based Untargeted Metabolomics.” Journal of Chromatography A 1616 (April): 460787. https://doi.org/10.1016/j.chroma.2019.460787. Zhang, Zixuan, Huaxu Yu, Ethan Wong-Ma, Pouneh Dokouhaki, Ahmed Mostafa, Jay S. 
Shavadia, Fang Wu, and Tao Huan. 2024. “Reducing Quantitative Uncertainty Caused by Data Processing in Untargeted Metabolomics.” Analytical Chemistry 96 (9): 3727–32. https://doi.org/10.1021/acs.analchem.3c04046. Zhao, Fan, Shuai Huang, and Xiaozhe Zhang. 2021. “High Sensitivity and Specificity Feature Detection in Liquid Chromatography–Mass Spectrometry Data: A Deep Learning Framework.” Talanta 222 (January): 121580. https://doi.org/10.1016/j.talanta.2020.121580. Zhao, Shuang, and Liang Li. 2020. “Chemical Derivatization in LC-MS-based Metabolomics Study.” TrAC Trends in Analytical Chemistry 131 (October): 115988. https://doi.org/10.1016/j.trac.2020.115988. Zhao, Tingting, Shipei Xing, Huaxu Yu, and Tao Huan. 2023. “De Novo Cleaning of Chimeric MS/MS Spectra for LC-MS/MS-Based Metabolomics.” Analytical Chemistry 95 (35): 13018–28. https://doi.org/10.1021/acs.analchem.3c00736. Zheng, Fujian, Lei You, Wangshu Qin, Runze Ouyang, Wangjie Lv, Lei Guo, Xin Lu, Enyou Li, Xinjie Zhao, and Guowang Xu. 2022. “MetEx: A Targeted Extraction Strategy for Improving the Coverage and Accuracy of Metabolite Annotation in Liquid Chromatography–High-Resolution Mass Spectrometry Data.” Analytical Chemistry 94 (24): 8561–69. https://doi.org/10.1021/acs.analchem.1c04783. Zheng, Fujian, Xinjie Zhao, Zhongda Zeng, Lichao Wang, Wangjie Lv, Qingqing Wang, and Guowang Xu. 2020. “Development of a Plasma Pseudotargeted Metabolomics Method Based on Ultra-High-Performance Liquid Chromatography–Mass Spectrometry.” Nature Protocols 15 (8): 2519–37. https://doi.org/10.1038/s41596-020-0341-5. Zhou, Juntuo, and Yuxin Yin. 2016. “Strategies for Large-Scale Targeted Metabolomics Quantification by Liquid Chromatography-Mass Spectrometry.” Analyst 141 (23): 6362–73. https://doi.org/10.1039/C6AN01753C. Zhou, Zhiwei, Mingdu Luo, Haosong Zhang, Yandong Yin, Yuping Cai, and Zheng-Jiang Zhu. 2022. 
“Metabolite Annotation from Knowns to Unknowns Through Knowledge-Guided Multi-Layer Metabolic Networking.” Nature Communications 13 (1): 6656. https://doi.org/10.1038/s41467-022-34537-6. Zhu, Xiaochun, Yuping Chen, and Raju Subramanian. 2014. “Comparison of Information-Dependent Acquisition, SWATH, and MSAll Techniques in Metabolite Identification Study Employing Ultrahigh-Performance Liquid Chromatography–Quadrupole Time-of-Flight Mass Spectrometry.” Analytical Chemistry 86 (2): 1202–9. https://doi.org/10.1021/ac403385y. Zubeldia-Varela, Elisa, Domingo Barber, Coral Barbas, Marina Perez-Gordo, and David Rojo. 2020. “Sample Pre-Treatment Procedures for the Omics Analysis of Human Gut Microbiota: Turning Points, Tips and Tricks for Gene Sequencing and Metabolomics.” Journal of Pharmaceutical and Biomedical Analysis 191 (November): 113592. https://doi.org/10.1016/j.jpba.2020.113592. "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. "]]
diff --git a/docs/workflow-2.html b/docs/workflow-2.html
index 17e1c14..74b24ff 100644
--- a/docs/workflow-2.html
+++ b/docs/workflow-2.html
@@ -418,8 +418,8 @@ <h1><span class="header-section-number">Chapter 5</span> Workflow<a href="workfl
 <span id="cb7-13"><a href="workflow-2.html#cb7-13" tabindex="-1"></a><span class="st">B --&gt; E</span></span>
 <span id="cb7-14"><a href="workflow-2.html#cb7-14" tabindex="-1"></a><span class="st">C --&gt; H</span></span>
 <span id="cb7-15"><a href="workflow-2.html#cb7-15" tabindex="-1"></a><span class="st">&quot;</span>)</span></code></pre></div>
-<div class="DiagrammeR html-widget html-fill-item" id="htmlwidget-1bc83ad4efd5c6e46f77" style="width:672px;height:480px;"></div>
-<script type="application/json" data-for="htmlwidget-1bc83ad4efd5c6e46f77">{"x":{"diagram":"\nflowchart TB\nI(peak-picking) --> C\nC(visulization) --> D(normalization/batch correction)\nD --> A(annotation/identification)\nA --> H(statistical analysis)\nC --> A --> B(omics analysis)\nD --> H\nB --> H\nH --> E(experimental validation)\nA --> E\nH --> A\nB --> E\nC --> H\n"},"evals":[],"jsHooks":[]}</script>
+<div class="DiagrammeR html-widget html-fill-item" id="htmlwidget-f4255361ed1fe92a147a" style="width:672px;height:480px;"></div>
+<script type="application/json" data-for="htmlwidget-f4255361ed1fe92a147a">{"x":{"diagram":"\nflowchart TB\nI(peak-picking) --> C\nC(visulization) --> D(normalization/batch correction)\nD --> A(annotation/identification)\nA --> H(statistical analysis)\nC --> A --> B(omics analysis)\nD --> H\nB --> H\nH --> E(experimental validation)\nA --> E\nH --> A\nB --> E\nC --> H\n"},"evals":[],"jsHooks":[]}</script>
 <div id="platform-for-metabolomics-data-analysis" class="section level2 hasAnchor" number="5.1">
 <h2><span class="header-section-number">5.1</span> Platform for metabolomics data analysis<a href="workflow-2.html#platform-for-metabolomics-data-analysis" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <p>Here is a list of related open-source <a href="http://strimmerlab.org/notes/mass-spectrometry.html">projects</a>.</p>