From e7046e86e3415a41f90d709ffde4497428399314 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ole=20Engstr=C3=B8m?=
Date: Tue, 23 Jul 2024 23:22:45 +0200
Subject: [PATCH] Fixed comma error in relation to
 https://github.com/openjournals/joss-reviews/issues/6533

---
 paper/paper.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/paper/paper.md b/paper/paper.md
index 3de1be7..d30c153 100644
--- a/paper/paper.md
+++ b/paper/paper.md
@@ -49,7 +49,7 @@ In conclusion, `ikpls` empowers researchers and practitioners in machine learnin
 
 # Statement of need
 
-PLS [@wold1966estimation] is a standard method in machine learning and chemometrics. PLS can be used as a regression model, PLS-R (PLS regression), [@wold1983food; @wold2001pls] or a classification model, PLS-DA (PLS discriminant analysis) [@barker2003partial]. PLS takes as input a matrix $\mathbf{X}$ with dimension $(N, K)$ of predictor variables and a matrix $\mathbf{Y}$ with dimension $(N, M)$ of response variables. PLS decomposes $\mathbf{X}$ and $\mathbf{Y}$ into $A$ latent variables (also called components), which are linear combinations of the original $\mathbf{X}$ and $\mathbf{Y}$. Choosing the optimal number of components, $A$, depends on the input data and varies from task to task. Additionally, selecting the optimal preprocessing method is challenging to assess before model validation [@rinnan2009review, @sorensen2021nir] but is required for achieving optimal performance [@du2022quantitative]. The optimal number of components and the optimal preprocessing method are typically chosen by cross-validation, which may be very computationally expensive. The implementations of the fast cross-validation algorithm [@engstrøm2024shortcutting] will significantly reduce the computational cost of cross-validation.
+PLS [@wold1966estimation] is a standard method in machine learning and chemometrics. PLS can be used as a regression model, PLS-R (PLS regression) [@wold1983food; @wold2001pls], or a classification model, PLS-DA (PLS discriminant analysis) [@barker2003partial]. PLS takes as input a matrix $\mathbf{X}$ with dimension $(N, K)$ of predictor variables and a matrix $\mathbf{Y}$ with dimension $(N, M)$ of response variables. PLS decomposes $\mathbf{X}$ and $\mathbf{Y}$ into $A$ latent variables (also called components), which are linear combinations of the original $\mathbf{X}$ and $\mathbf{Y}$. Choosing the optimal number of components, $A$, depends on the input data and varies from task to task. Additionally, selecting the optimal preprocessing method is challenging to assess before model validation [@rinnan2009review, @sorensen2021nir] but is required for achieving optimal performance [@du2022quantitative]. The optimal number of components and the optimal preprocessing method are typically chosen by cross-validation, which may be very computationally expensive. The implementations of the fast cross-validation algorithm [@engstrøm2024shortcutting] will significantly reduce the computational cost of cross-validation.
 
 This work introduces the Python software package, `ikpls`, with novel, fast implementations of IKPLS Algorithm #1 and Algorithm #2 by @dayal1997improved, which have previously been compared with other PLS algorithms and shown to be fast [@alin2009comparison] and numerically stable [@andersson2009comparison]. The implementations introduced in this work use NumPy [@harris2020array] and JAX [@jax2018github].
 The NumPy implementations can be executed on CPUs, and the JAX implementations can be executed on CPUs, GPUs, and TPUs. The JAX implementations are also end-to-end differentiable, allowing integration into deep learning methods. This work compares the execution time of the implementations on input data of varying dimensions. It reveals that choosing the implementation that best fits the data will yield orders of magnitude faster execution than the common NIPALS [@wold1966estimation] implementation of PLS, which is the one implemented by scikit-learn [@scikit-learn], an extensive machine learning library for Python. With the implementations introduced in this work, choosing the optimal number of components and the optimal preprocessing becomes much more feasible than previously. Indeed, derivatives of this work have previously been applied to do this precisely [@engstrom2023improving; @engstrom2023analyzing].
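For readers who want to connect the paragraph's notation to the package, here is a minimal usage sketch tying the $(N, K)$ predictor matrix $\mathbf{X}$, the $(N, M)$ response matrix $\mathbf{Y}$, and the component count $A$ to the `ikpls` NumPy backend. The import path `ikpls.numpy_ikpls`, the `algorithm` constructor argument, and the `fit`/`predict` signatures are assumptions based on the package description in this patch, not details the patch itself confirms.

```python
# Minimal ikpls usage sketch. Import path and signatures are assumptions
# about the package API, not confirmed by this patch.
import numpy as np

from ikpls.numpy_ikpls import PLS  # assumed import path for the NumPy backend

N, K, M, A = 100, 50, 10, 20  # samples, predictors, responses, components

rng = np.random.default_rng(42)
X = rng.uniform(size=(N, K))  # (N, K) matrix of predictor variables
Y = rng.uniform(size=(N, M))  # (N, M) matrix of response variables

pls = PLS(algorithm=1)  # assumed switch between IKPLS Algorithm #1 and #2
pls.fit(X, Y, A)        # fit with A latent variables (components)

# Assumed behavior: predict for every component count 1..A in one call,
# which is what makes scanning over A cheap when choosing the optimal A.
Y_pred_all = pls.predict(X)                # assumed shape (A, N, M)
Y_pred_5 = pls.predict(X, n_components=5)  # assumed shape (N, M)
```

A JAX backend, if used analogously, would run the same fit on GPUs or TPUs and expose end-to-end differentiability, as the paper paragraph describes; its exact module path is likewise an assumption and is not shown here.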