diff --git a/10-spatial-estimation.Rmd b/10-spatial-estimation.Rmd
index 598eec2..c210f07 100644
--- a/10-spatial-estimation.Rmd
+++ b/10-spatial-estimation.Rmd
@@ -40,9 +40,6 @@ knitr::knit_hooks$set(webgl = hook_webgl)
 
 This chapter introduces some basic ideas on different types of **sampling** strategies in spatial context. In addition, this chapter also introduces various type of spatial statistics that are being used in **predicting** the occurrence of events in the unsampled locations. 
 
-:::: {.box-content .learning-objectives-content}
-
-::: {.box-title .learning-objectives-top}
 ## Learning Objectives {-}
 ::: 
 
@@ -55,7 +52,7 @@ This chapter introduces some basic ideas on different types of **sampling** stra
 
 ## Key Terms {-}
 
-Sampling, Spatial autocorrelation, Moran's I, Kriging, Interpolation, Spatial regression
+Geostatistics, Sampling, Prediction, Spatial Autocorrelation, Estimation, Geostatistics, P-value, Alpha, Null Hypothesis, Semi-variance, Semi-variogram, Population, Sampling Design, Sampling Unit, Probability Sampling, Simple Random Sampling, Stratified Random Sampling, Homogeneous, Systematic Sampling, Cluster Sampling, Purposive/Adaptive Sampling, Representative Sampling, Unique Case Sampling, Sequential Sampling, Spatial Autocorrelation, Domain, Attributes, Moran's I, Geary's C, Polygon Data, Point Data, Rook's/Queen's Case, Lag, Lag Distance, Gaussian, Spherical, Exponential, Circular, Spatial Interpolation, Thiessian Polygon, Inverse Distance Weighting, Kriging, Auxiliary, Spatial Lag Model, Spatial Weight Matrix, Distance-based Approach, Nearest-neighbor, Response Variable, Predictor Variable, Spatial Error Model, Errorsarlm, Spatialreg, Basal Area,      
 
 ## Introduction 
 
@@ -67,10 +64,6 @@ Like any other data, spatial data will only make true sense if it can represent
 
 To overcome this limitation, one should come up with a proper **sampling** strategies to ensure that the given sample is representative of a population and lacks any redundant information. While the need to account for additional variables about a location may be intimidating, many spatial statistic analyses are out there to help with basic **prediction** and **estimation** of the information in an unobserved location. For example, interpolation and spatial regression can help us predict and estimate the value of a variable in an unsampled location. Similarly, **spatial autocorrelation** measures the degree of similarity between samples at different sampled locations.
 
-
-:::: {.box-content .call-out-content}
-
-::: {.box-title .call-out-top}
 ## Recall This {-}
 :::
 
@@ -121,10 +114,6 @@ Broadly, sampling can be categorized into two groups [@Teddlie2007]:
 
 Before getting into the details about different types of sampling. We will make ourself familiar with some sampling key terms and their definitions.
 
-
-:::: {.box-content .call-out-content}
-
-::: {.box-title .call-out-top}
 ## Recall This {-}
 :::
 
@@ -152,11 +141,11 @@ Probability sampling techniques are mostly used in studies that use extensive am
 
 ## Simple Random Sampling
 
-In simple random sampling, each sampling unit within a given population has equal probability of being selected in a sample [@thompson2012sampling].For example, suppose we would like to measure the tree heights of all the trees from a simple random sample of 60 plots with their spatial locations (given by plots center co-ordinates) from a forest divided into 625 spatially defined plots as given in **Figure 10.1**.  Notice, there is no distinctive pattern on how plots are being selected for the measurement of tree heights, this justify the **random part** of the simple random sampling. 
+In simple random sampling, each sampling unit within a given population has equal probability of being selected in a sample [@thompson2012sampling].For example, suppose we would like to measure the tree heights of all the trees from a simple random sample of 60 plots with their spatial locations (given by plots center co-ordinates) from a forest divided into 625 spatially defined plots as given in (Figure \@ref(fig:10-simple-random-sampling)). Notice, there is no distinctive pattern on how plots are being selected for the measurement of tree heights, this justify the **random part** of the simple random sampling. 
 
 As an investigator, when we make a sequence of selections from a population, at each step, new and distinct set of sampling units are being selected in the sample, each having equal probability of being selected at each step.
 
-For example, when we take another sample of 60 plots as in **Figure 10.2**, we can see that different **sampling units (plots)** are being selected from what we obtained in **Figure 10.1.**, this represent the **equal probability** of each sampling unit being selected.
+For example, when we take another sample of 60 plots as in (Figure \@ref(fig:10-simple-random-sampling)), we can see that different **sampling units (plots)** are being selected from what we obtained in (Figure \@ref(fig:10-simple-random-sampling)), this represent the **equal probability** of each sampling unit being selected.
 
 ```{r 10-simple-random-sampling, echo=FALSE, warning=FALSE, message=FALSE}
 m <- matrix(c(1:625), nrow=25, ncol=25)
@@ -176,7 +165,7 @@ p<-ggplot(n,aes(x=x, y=y, label=val)) +
 p
 ```
 
-**Figure 10.1.** \ref{fig:figs} Simple random sample of 60 units from a population of 625 units.
+(Figure \@ref(fig:10-simple-random-sampling-2)) Simple random sample of 60 units from a population of 625 units.
 
 ```{r 10-simple-random-sampling-2, echo=FALSE,warning=FALSE}
 m <- matrix(c(1:625), nrow=25, ncol=25)
@@ -197,14 +186,14 @@ p<-ggplot(n,aes(x=x, y=y, label=val)) +
 p
 ```
 
-**Figure 10.2.** \ref{fig:figs} Another simple random sample of 60 units.
+(Figure \@ref(fig:10-simple-random-sampling-2)) Another simple random sample of 60 units.
 
 
 ## Stratified Random Sampling
 
 When a population under study is not **homogeneous** (similar in biological characteristics) across the entire study area and consists of some sort of gradient, stratified random sampling method is used [@thompson2012sampling]. The principle of stratification is to partition the population in such a way that the units within a stratum are as similar as possible [@Teddlie2007]. Random samples from each strata are drawn to ensure adequate sampling of all groups [@Teddlie2007]. Even though one stratum may differ markedly from another, a stratified sample with the desired number of units from each stratum in the population will tend to be “representative” of the population as a whole [@Howell2020].
 
-For example, a forest under study is divided **(stratified)** into similar  regions  **Figure 10.3** defined by elevation, soil moisture, and soil nutrient gradient and random samples are taken within each strata. The stratification of a study region despite of its size can help to spread the sample over the entire study area.
+For example, a forest under study is divided **(stratified)** into similar  regions  (Figure \@ref(fig:10-stratified-random-sampling)) defined by elevation, soil moisture, and soil nutrient gradient and random samples are taken within each strata. The stratification of a study region despite of its size can help to spread the sample over the entire study area.
 
 
 ```{r 10-stratified-random-sampling, echo=FALSE, warning=FALSE, message=FALSE}
@@ -231,7 +220,7 @@ p<-ggplot(n,aes(x=x, y=y, label=val)) +
 p
 ```
 
-**Figure 10.3.** \ref{fig:figs} Stratified random sample within unequal strata within a study area.
+(Figure \@ref(fig:10-stratified-random-sampling-2)) Stratified random sample within unequal strata within a study area.
 
 ```{r 10-stratified-random-sampling-2, echo=FALSE,warning=FALSE}
 m <- matrix(c(1:400), nrow=20, ncol=20)
@@ -260,11 +249,11 @@ p<-ggplot(n,aes(x=x, y=y, label=val)) +
 p
 ```
 
-**Figure 10.4.** \ref{fig:figs} Stratified random sample from equal strata within a study area.
+(Figure \@ref(fig:10-stratified-random-sampling-2)) Stratified random sample from equal strata within a study area.
 
 ## Systematic Sampling
 
-A systematic sample uses a fixed grid or array to assign plots in a regular pattern **Figure 10.5** [@McRoberts2014]. The advantage of systematic sampling is that it maximizes the average distance between the plots and therefore minimizes spatial correlation among observations and increases statistical efficiency [@McRoberts2014]. In addition, a systematic sample, which is clearly seen to be representative in some sense, can be very convincing to decision-makers who lack experience with sampling [@McRoberts2014]. Raster grids such as digital elevation models (DEM) are some examples of systematic sample.
+A systematic sample uses a fixed grid or array to assign plots in a regular pattern (Figure \@ref(fig:10-systematic-sampling)) [@McRoberts2014]. The advantage of systematic sampling is that it maximizes the average distance between the plots and therefore minimizes spatial correlation among observations and increases statistical efficiency [@McRoberts2014]. In addition, a systematic sample, which is clearly seen to be representative in some sense, can be very convincing to decision-makers who lack experience with sampling [@McRoberts2014]. Raster grids such as digital elevation models (DEM) are some examples of systematic sample.
 
 ```{r 10-systematic-sampling, echo=FALSE,warning=FALSE,message=FALSE}
 m <- matrix(c(1:400), nrow=20, ncol=20)
@@ -290,7 +279,7 @@ p<-ggplot(data_col_odd,aes(x=x, y=y, label=val)) +
 p
 ```
 
-**Figure 10.5.** \ref{fig:figs} Sample every second observation in the row 
+(Figure \@ref(fig:10-systematic-sampling-2)) Sample every second observation in the row 
 
 ```{r 10-systematic-sampling-2, echo=FALSE,warning=FALSE}
 m <- matrix(c(1:400), nrow=20, ncol=20)
@@ -313,7 +302,7 @@ p<-ggplot(data_row_odd,aes(x=x, y=y, label=val)) +
         axis.title = element_blank())
 p
 ```
-**Figure 10.6.** \ref{fig:figs} Sample all the observation in every second column 
+(Figure \@ref(fig:10-systematic-sampling-2)) Sample all the observation in every second column 
 
 ## Cluster Sampling 
 
@@ -343,7 +332,7 @@ p<-ggplot(df,aes(x=x, y=y,label=val)) +
 p 
 ```
 
-**Figure 10.7.** \ref{fig:figs} Cluster of plots selected from entire study area
+(Figure \@ref(fig:10-cluster-sampling)) Cluster of plots selected from entire study area
 
 ## Non-probability Sampling
 
@@ -363,11 +352,8 @@ In this sampling method, we would pick up a single or group of cases in an inter
 
 ## Spatial Autocorrelation
 
-When an attribute or variable is mapped across a study area or domain, geologist ask a question on whether a variable is cluster, randomly distributed or dispersed [@Carr1993]. In some cases, the nature of a cluster is distinctive visually, while in others it is not apparent [@Carr1993]. Hence, to come up with a quantitative measure on variable is clustered or randomly distributed in the domain, **spatial autocorrelation** is used. The concept of autocorrelation comes from Tobler’s first law of geography which states, “things that are closer in distance are related” [@Tobler1970]. **Spatial autocorrelation** can be defined as the relationship between a variable of interest with itself when measured at different location [@Getis1995CliffAA]. In other words, if a variable is measured at different locations which are at proximity, the value of variable is almost same. There could be both positive and negative spatial autocorrelation. Consider the following example; **Figure 10.8** shows the clustering pattern in the given square boxes (can be variable of interest) representing the positive spatial autocorrelation (left) and a complete checkerboard (right) distribution of square boxes (variable of interest) indicating a negative spatial autocorrelation.
-
-:::: {.box-content .call-out-content}
+When an attribute or variable is mapped across a study area or domain, geologist ask a question on whether a variable is cluster, randomly distributed or dispersed [@Carr1993]. In some cases, the nature of a cluster is distinctive visually, while in others it is not apparent [@Carr1993]. Hence, to come up with a quantitative measure on variable is clustered or randomly distributed in the domain, **spatial autocorrelation** is used. The concept of autocorrelation comes from Tobler’s first law of geography which states, “things that are closer in distance are related” [@Tobler1970]. **Spatial autocorrelation** can be defined as the relationship between a variable of interest with itself when measured at different location [@Getis1995CliffAA]. In other words, if a variable is measured at different locations which are at proximity, the value of variable is almost same. There could be both positive and negative spatial autocorrelation. Consider the following example; (Figure \@ref(fig:10-cluster-sampling)) shows the clustering pattern in the given square boxes (can be variable of interest) representing the positive spatial autocorrelation (left) and a complete checkerboard (right) distribution of square boxes (variable of interest) indicating a negative spatial autocorrelation.
 
-::: {.box-title .call-out-top}
 ## Recall This {-}
 :::
 
@@ -431,7 +417,7 @@ grids_bs <- plot_grid(p,q,ncol = 2, labels = c('Positive autocorrelation', 'Nega
 grids_bs
 ```
 
-**Figure 10.8.** \ref{fig:figs} Example of a positive (left) and negative spatial autocorrelation (right) for a give domain.
+(Figure \@ref(fig:10-spatial-autocorrelation)) Example of a positive (left) and negative spatial autocorrelation (right) for a give domain.
 
 ## Moran's I 
 
@@ -447,14 +433,10 @@ $$y_j=\text{variable measure at location j}$$
 $$S^2=\text{the variance}$$
 $$w_{ij}=\text{the spatial weight matrix}$$ 
 
-
-:::: {.box-content .case-study-content}
-
-::: {.box-title .case-study-top}
 ## Case Study: Title of Case Study Here 
 You see textual case study content here
 
-For this case study, we will use ground plot data from Change Monitoring Inventory (CMI) program [for details: @ProvinceofBC2018] for Williams Lake and 100-miles House timber supply area (TSA) in the province of British Columbia, Canada. William Lake TSA and 100-miles House TSA are divided into 18 and 8 blocks respectively **Figure 10.9**. There is a total of 456 CMI plots used in this study **Figure 10.9**. The total basal area (m2/ha) is our variable of interest in this study. For each of the polygon in Williams lake and 100-miles house TSA, total basal area was calculated by taking the sum of the basal area of each CMI plots in each polygon. For this part of exercise, we want to understand if there is any spatial relationship (autocorrelation) between the total basal area measured in each polygon of TSA. We will quantitatively measure the presence or absence of **spatial autocorrelation** using **Moran’s I** and **Geary's C**. 
+For this case study, we will use ground plot data from Change Monitoring Inventory (CMI) program [for details: @ProvinceofBC2018] for Williams Lake and 100-miles House timber supply area (TSA) in the province of British Columbia, Canada. William Lake TSA and 100-miles House TSA are divided into 18 and 8 blocks respectively (Figure \@ref(fig:10-spatial-autocorrelation)). There is a total of 456 CMI plots used in this study (Figure \@ref(fig:10-spatial-autocorrelation)). The total basal area (m2/ha) is our variable of interest in this study. For each of the polygon in Williams lake and 100-miles house TSA, total basal area was calculated by taking the sum of the basal area of each CMI plots in each polygon. For this part of exercise, we want to understand if there is any spatial relationship (autocorrelation) between the total basal area measured in each polygon of TSA. We will quantitatively measure the presence or absence of **spatial autocorrelation** using **Moran’s I** and **Geary's C**. 
 
 <p id="box-text">
 </p>
@@ -545,7 +527,7 @@ t<-ggplot() +
 t
 ```
 
-**Figure 10.9.** \ref{fig:figs} CMI plots location for williams lake TSA and 100 miles house TSA. Basal area has been calculated using the sum of the basal area for each plot (dot) over the given polygon.
+(Figure \@ref(fig:10-morans-I)) CMI plots location for williams lake TSA and 100 miles house TSA. Basal area has been calculated using the sum of the basal area for each plot (dot) over the given polygon.
 :::
 
 ## Calculating Moran's I {-}
@@ -614,7 +596,7 @@ grids_bs <- plot_grid(p,q,ncol = 2, labels = c('Rooks contiguity', 'Queens conti
 grids_bs
 ```
 
-**Figure 10.10.** \ref{fig:figs} Rook's  (left) and Queen's (right) case for searching the neighborhood (grey unit) for the darker unit in the center.
+(Figure \@ref(fig:10-using-contiguity)) Rook's  (left) and Queen's (right) case for searching the neighborhood (grey unit) for the darker unit in the center.
 
 **Step 1: Build Neighborhood** 
 
@@ -736,13 +718,13 @@ $$N=\text{number of sampled differences or lag} $$
 
 Like the familiar variance of basic statistics, it is a sum of squares divided by the number N of sampled differences. Unlike simple variance about a mean, the semivariogram measures difference between two samples. The 'semi' in semivariogram comes from the fact that the variance is divided by 2.
 
-A **semivariogram** is a graph that consists of **semivariance** on the y-axis and a **lag distance** on the x-axis **Figure 10.11**. There are various components in a semivariogram that can be used to interpret the nature and structure of spatial autocorrelation. Various components of a semivariogram are:
+A **semivariogram** is a graph that consists of **semivariance** on the y-axis and a **lag distance** on the x-axis (Figure \@ref(fig:10-gearys-c-2)). There are various components in a semivariogram that can be used to interpret the nature and structure of spatial autocorrelation. Various components of a semivariogram are:
 
-Nugget (C): Nugget refers to an unaccounted autocorrelation due to a small lag distance than sampling distance or due to sampling errors **Figure 10.11**.
+Nugget (C): Nugget refers to an unaccounted autocorrelation due to a small lag distance than sampling distance or due to sampling errors (Figure \@ref(fig:10-gearys-c-2)).
 
-Range (R): The distance at which a variogram model first flattens out. This is the distance upto which y variables are spatially autocorrelated or aggregated **Figure 10.11**.
+Range (R): The distance at which a variogram model first flattens out. This is the distance upto which y variables are spatially autocorrelated or aggregated (Figure \@ref(fig:10-gearys-c-2)).
 
-Sill (S): The value of semivariance that a variogram model attains at a given range is called sill **Figure 10.11**.
+Sill (S): The value of semivariance that a variogram model attains at a given range is called sill (Figure \@ref(fig:10-gearys-c-2)).
 
 Partial sill: The sill minus nugget
 
@@ -790,17 +772,14 @@ g<-ggplot(evgm,aes(x=dist,y=gamma))+geom_point()+
 g
 ```
 
-**Figure 10.11.** \ref{fig:figs} An example semivariogram with all the components using the "Fulmar" [see details @Pebesma2005] data from "gstat" package in R.
-
-:::: {.box-content .case-study-content}
+(Figure \@ref(fig:10-semivariogram-modelling)) An example semivariogram with all the components using the "Fulmar" [see details @Pebesma2005] data from "gstat" package in R.
 
-::: {.box-title .case-study-top}
 ## Case Study: Title of Case Study Here
 You see textual case study content here 
 
-For this case study, we will use the ground plot data from Young stand monitoring (YSM) program data [@ProvinceofBC2018] for Fort Saint Johns timber supply area (TSA) in the province of British Columbia, Canada. Fort Saint Johns is divided into 6 blocks respectively **Figure 10.12**. There is a total of 108 YSM plots used in this study **Figure 10.12**. The total basal area (m2/ha) is our variable of interest in this study. For each of the YSM plots, we will calculate the total basal area by adding the basal area for all trees within the plot. We will explore different type of semivariogram model with the same dataset and check which one will best fit the data. 
+For this case study, we will use the ground plot data from Young stand monitoring (YSM) program data [@ProvinceofBC2018] for Fort Saint Johns timber supply area (TSA) in the province of British Columbia, Canada. Fort Saint Johns is divided into 6 blocks respectively (Figure \@ref(fig:10-semivariogram-modelling)). There is a total of 108 YSM plots used in this study (Figure \@ref(fig:10-semivariogram-modelling)). The total basal area (m2/ha) is our variable of interest in this study. For each of the YSM plots, we will calculate the total basal area by adding the basal area for all trees within the plot. We will explore different type of semivariogram model with the same dataset and check which one will best fit the data. 
 
-![**Figure 10.12**. Location of Fort Saint Johns TSA and the young stand change monitoring plots.](images/10-FSJ-plot.png){.center}
+![(Figure \@ref(fig:10-semivariogram-modelling)). Location of Fort Saint Johns TSA and the young stand change monitoring plots.](images/10-FSJ-plot.png){.center}
 
 <p id="box-text">
 </p>
@@ -843,7 +822,7 @@ g<-ggplot(TheVariogram,aes(x=dist,y=gamma))+geom_point()+
 g
 ```
 
-**Figure 10.13.** \ref{fig:figs} A semivariogram using the Gaussian model for the basal area (m2/ha) for young stand monitoring plots.
+(Figure \@ref(fig:10-gaussian)) A semivariogram using the Gaussian model for the basal area (m2/ha) for young stand monitoring plots.
 
 ## Spherical
 
@@ -872,7 +851,7 @@ g<-ggplot(TheVariogram,aes(x=dist,y=gamma))+geom_point()+
 g
 ```
 
-**Figure 10.14.** \ref{fig:figs} A semivariogram using the Spherical model for the basal area (m2/ha) for for young stand monitoring plots.
+(Figure \@ref(fig:10-spherical)) A semivariogram using the Spherical model for the basal area (m2/ha) for for young stand monitoring plots.
 
 ## Exponential
 
@@ -896,7 +875,7 @@ g<-ggplot(TheVariogram,aes(x=dist,y=gamma))+geom_point()+
 g
 ```
 
-**Figure 10.15.** \ref{fig:figs} A semivariogram using the exponential model for the basal area (m2/ha) for for young stand monitoring plots.
+(Figure \@ref(fig:10-exponential)) A semivariogram using the exponential model for the basal area (m2/ha) for for young stand monitoring plots.
 
 ## Circular
 
@@ -920,11 +899,11 @@ g<-ggplot(TheVariogram,aes(x=dist,y=gamma))+geom_point()+
 g
 ```
 
-**Figure 10.16.** \ref{fig:figs} A semivariogram using the circular model for the basal area (m2/ha) for for young stand monitoring plots.
+(Figure \@ref(fig:10-circular)) A semivariogram using the circular model for the basal area (m2/ha) for for young stand monitoring plots.
 
 Just looking at the variograms, it appears that all of the four models fit our data well and indicates there is a strong correlation in basal area per hectare of live trees between the plots. However, we will use all the components of semivariogram models to pick our best fitting variogram.
 
-**Table 10.1** Summary of various component of variogram for four different models.
+(Figure \@ref(fig:10-circular))Summary of various component of variogram for four different models.
 
 ```{r 10-circular-2, echo=FALSE, warning=FALSE, message=FALSE, tab.cap = FALSE}
 Model<-c("Circular", "Gaussian", "Spherical", "Exponential")
@@ -949,15 +928,12 @@ We can see that the partial sill to total sill ratio is highest for Gaussian and
 
 We will discuss both method with a case study in detail.
 
-:::: {.box-content .case-study-content}
-
-::: {.box-title .case-study-top}
 ## Case Study: Title of Case Study Here
 You see textual case study content here 
 
-For this case study, we will use ground plot data from Young stand monitoring (YSM) program data [@ProvinceofBC2018] for Fort Saint Johns timber supply area (TSA) in the province of British Columbia, Canada.Fort Saint Johns is divided into 6 blocks respectively **Figure 10.17**. There is a total of 108 YSM plots used in this study **Figure 10.17**. The total basal area (m2/ha) is our variable of interest in this study. For each of the YSM plot the total basal area was calculated by adding the basal area for all trees within the plot. We will use this dataset to explore different interpolation technique to find the variable of interest (basal area) within the unsampled locations.
+For this case study, we will use ground plot data from Young stand monitoring (YSM) program data [@ProvinceofBC2018] for Fort Saint Johns timber supply area (TSA) in the province of British Columbia, Canada.Fort Saint Johns is divided into 6 blocks respectively (Figure \@ref(fig:10-circular-2)). There is a total of 108 YSM plots used in this study (Figure \@ref(fig:10-circular-2)). The total basal area (m2/ha) is our variable of interest in this study. For each of the YSM plot the total basal area was calculated by adding the basal area for all trees within the plot. We will use this dataset to explore different interpolation technique to find the variable of interest (basal area) within the unsampled locations.
 
-![**Figure 10.17**. Location of Fort Saint Johns TSA and the young stand change monitoring plots.](images/10-FSJ-plot.png){.center}
+![(Figure \@ref(fig:10-circular-2)). Location of Fort Saint Johns TSA and the young stand change monitoring plots.](images/10-FSJ-plot.png){.center}
 
 ::::
 
@@ -1000,7 +976,7 @@ r <- raster(cata, res=100)
 vr <- rasterize(vca, r, 'baha_L')
 ```
 
-**Figure 10.18.** \ref{fig:figs} An intermediate step in creating polygon and rasterizing it over the entire Fort Saint Johns TSA.
+(Figure \@ref(fig:10-nearest-neighbor-2)) An intermediate step in creating polygon and rasterizing it over the entire Fort Saint Johns TSA.
 
 **Step 3: Nearest neighbor with five unsampled points to be interpolated at a time and plot the results**
 
@@ -1014,7 +990,7 @@ tm_shape(nnmsk) +
   tm_legend(legend.outside=FALSE)
 ```
 
-**Figure 10.19.** \ref{fig:figs} Predicted basal area over the entire Fort Saint Johns TSA using five nearest neighbor.
+(Figure \@ref(fig:10-nearest-neighbor-3)) Predicted basal area over the entire Fort Saint Johns TSA using five nearest neighbor.
 
 **Step 4: Leaflet map for some interactions**
 
@@ -1027,7 +1003,7 @@ leaflet() %>% addTiles() %>%
     title = "Predicted basal area")
 ```
 
-**Figure 10.20.** \ref{fig:figs} Predicted basal area over the entire Fort Saint Johns TSA using five nearest neighbor projected over the province of British Columbia. 
+(Figure \@ref(fig:10-nearest-neighbor-4)) Predicted basal area over the entire Fort Saint Johns TSA using five nearest neighbor projected over the province of British Columbia. 
 
 ## Thiessian Polygon
 
@@ -1079,7 +1055,7 @@ tm_shape(th.clp) +
   tm_polygons(col="baha_L", palette="RdBu", auto.palette.mapping = FALSE,title="Predicted basal area") +
   tm_legend(legend.outside=FALSE)
 ```
-**Figure 10.21.** \ref{fig:figs} Predicted basal area over the entire Fort Saint Johns TSA using thiessian polygon.
+(Figure \@ref(fig:10-nearest-neighbor-8)) Predicted basal area over the entire Fort Saint Johns TSA using thiessian polygon.
 
 #### Inverse Distance Weighting
 
@@ -1137,7 +1113,7 @@ tm_shape(r.m) +
   tm_shape(dta) + tm_dots(size=0.2) +
   tm_legend(legend.outside=FALSE)
 ```
-**Figure 10.22.** \ref{fig:figs} Predicted basal area over the entire Fort Saint Johns TSA using inverse distance weighting.
+(Figure \@ref(fig:10-idw-4)) Predicted basal area over the entire Fort Saint Johns TSA using inverse distance weighting.
 
 **Step 5: Leaflet map for some interaction**
 
@@ -1150,7 +1126,7 @@ leaflet() %>% addTiles() %>%
     title = "Predicted basal area")
 ```
 
-**Figure 10.23.** \ref{fig:figs} Predicted basal area over the entire Fort Saint Johns TSA using inverse distance weighting projected over the province of Brithish Columbia.
+(Figure \@ref(fig:10-idw-5)) Predicted basal area over the entire Fort Saint Johns TSA using inverse distance weighting projected over the province of Brithish Columbia.
 
 ## Methods Using Semi-variogram 
 
@@ -1174,13 +1150,10 @@ $$\epsilon_{s_o} = \text{spatially autocorrelated erros}$$
 
 Linear kriging are distribution free linear interpolation techniques that are in alignment with linear regression methods [@Asa2012]. There are three principle linear kriging techniques as discussed below:
 
-:::: {.box-content .case-study-content}
-
-::: {.box-title .case-study-top}
 ## Case Study: Title of Case Study here 
 You see textual case study content here 
 
-For this case study, we will use ground plot data from Young stand monitoring (YSM) program data [@ProvinceofBC2018] for Fort Saint Johns timber supply area (TSA) in the province of British Columbia, Canada.Fort Saint Johns is divided into 6 blocks respectively **Figure 10.17**. There is a total of 108 YSM plots used in this study **Figure 10.17**. The total basal area (m2/ha) is our variable of interest in this study. For each of the YSM plot the total basal area was calculated by adding the basal area for all trees within the plot. We will use this dataset to explore different interpolation technique to find the variable of interest (basal area) in the unsampled locations.
+For this case study, we will use ground plot data from Young stand monitoring (YSM) program data [@ProvinceofBC2018] for Fort Saint Johns timber supply area (TSA) in the province of British Columbia, Canada.Fort Saint Johns is divided into 6 blocks respectively (Figure \@ref(fig:10-idw-4)). There is a total of 108 YSM plots used in this study (Figure \@ref(fig:10-idw-4)). The total basal area (m2/ha) is our variable of interest in this study. For each of the YSM plot the total basal area was calculated by adding the basal area for all trees within the plot. We will use this dataset to explore different interpolation technique to find the variable of interest (basal area) in the unsampled locations.
 <p id="box-text">
 </p>
 
@@ -1310,11 +1283,11 @@ grids_bs <- plot_grid(g,h,i,j,ncol=2,align = "h")
 grids_bs
 ```
 
-**Figure 10.24.** \ref{fig:figs} Variogram models fitted for basal area using the YSM plot data. 
+(Figure \@ref(fig:10-simple-kriging-4)) Variogram models fitted for basal area using the YSM plot data. 
 
 **Step 5: Put all the variogram parameters in a table and see which fits best**
 
-**Table 10.2** Summary of various components of variogram fro four variogram models.
+(Figure \@ref(fig:10-simple-kriging-4)) Summary of various components of variogram fro four variogram models.
 
 ```{r 10-simple-kriging-5, echo=FALSE, warning=FALSE, message=FALSE, tab.cap = FALSE}
 Model<-c("Exponential", "Spherical", "Gaussian", "Circular")
@@ -1350,7 +1323,7 @@ tm_shape(raster_clip) +
   tm_legend(legend.outside=FALSE)
 
 ```
-**Figure 10.25.** \ref{fig:figs} Predicted basal area using simple kriging.
+(Figure \@ref(fig:10-simple-kriging-7)) Predicted basal area using simple kriging.
 
 **Step 7: Cross validation**
 
@@ -1399,7 +1372,7 @@ tm_shape(Ok_clip) +
   tm_legend(legend.outside=FALSE)
 
 ```
-**Figure 10.26.** \ref{fig:figs} Predicted basal area using ordinary kriging.
+(Figure \@ref(fig:10-ordinary-kriging-2)) Predicted basal area using ordinary kriging.
 
 **Step 7: Cross validation**
 
@@ -1463,7 +1436,7 @@ tm_shape(uk_clip) +
   tm_legend(legend.outside=FALSE)
 ```
 
-**Figure 10.27.** \ref{fig:figs} Predicted basal area using universal kriging.
+(Figure \@ref(fig:10-universal-kriging-2)) Predicted basal area using universal kriging.
 
 **Step 3: Cross validation**
 
@@ -1483,7 +1456,7 @@ mean(res^2/as.data.frame(cv_uK)$var1.var)
 
 We will put the cross validation results together and cross compare across three methods.
 
-**Table 10.3** Cross-validation results for different type of kriging.
+(Figure \@ref(fig:10-universal-kriging-3)) Cross-validation results for different type of kriging.
 
 ```{r 10-universal-kriging-4, echo=FALSE, warning=FALSE, message=FALSE, tab.cap = FALSE}
 Method<-c("Simple", "Ordinary", "Universal")
@@ -1494,7 +1467,7 @@ summary<-data.frame(Method,RMSE,ME,MSDR)
 knitr::kable(summary, caption =NULL)
 ```
 
-From the cross validation results **(Table 10.3)**, we want root mean squared error (RMSE) to be low for greater predictive accuracy [@Tziachris2017]. We also want mean error (ME) to be as close to 0 as possible [@Tziachris2017]. And we want mean squared deviation Ratio (MSDR) to be closer to 1 for the good kriging model [@Tziachris2017]. The RMSE is lowest for simple and ordinary kriging, the ME is negative and below zero, and the MSDR of the predictions vs. the sample is low and closer to 1 for both simple and ordinary kriging (Table 10.3). This means the variability in predictions from both kriging are somewhat closer to real values than universal kriging [@Tziachris2017]. Looks like universal kriging is predicting more negative basal area for some of the location within TSA. The choice between simple and ordinary kriging may vary with researchers discretion here.
+From the cross validation results (Figure \@ref(fig:10-universal-kriging-4)), we want root mean squared error (RMSE) to be low for greater predictive accuracy [@Tziachris2017]. We also want mean error (ME) to be as close to 0 as possible [@Tziachris2017]. And we want mean squared deviation Ratio (MSDR) to be closer to 1 for the good kriging model [@Tziachris2017]. The RMSE is lowest for simple and ordinary kriging, the ME is negative and below zero, and the MSDR of the predictions vs. the sample is low and closer to 1 for both simple and ordinary kriging (Table 10.3). This means the variability in predictions from both kriging are somewhat closer to real values than universal kriging [@Tziachris2017]. Looks like universal kriging is predicting more negative basal area for some of the location within TSA. The choice between simple and ordinary kriging may vary with researchers discretion here.
 
 ## Co-Kriging
 
@@ -1524,13 +1497,10 @@ Disjunctive kriging allows to estimate the value of a variable of interest at an
 
 For classical statistics tests,**spatial autocorrelation** is problematic as ordinary least square (regression) or analysis of variance (ANOVA) assumes that observations are independent in space and time [@Meng2009]. However, geostatistical data violates the assumptions of independence, and using regression and ANOVA might inflate the significance of t and F statistics, when, in fact, they may not be significant at all [@Meng2009]. In that case one should try to improve the regression model by adding important **auxiliary** (independent variables that are associated or important in predicting the variables of interest) and incorporating the spatial autocorrelation structure [@Meng2009] using spatial regression models [@Anselin1998]. **The whole objective of spatial regression is to understand the association of the variable of interest with the independent variables while accounting for the spatial structure present in the data.** We will show two examples of spatial regression model in this section using our familiar YSM data for Fort Saint Johns TSA. 
 
-:::: {.box-content .case-study-content}
-
-::: {.box-title .case-study-top}
 ## Case Study: Title of case study here 
 You see textual case study content here
 
-For this case study, we will use ground plot data from Young stand monitoring (YSM) program data [@ProvinceofBC2018] for Fort Saint Johns timber supply area (TSA) in the province of British Columbia, Canada.Fort Saint Johns is divided into 6 blocks respectively **Figure 10.17**. There is a total of 108 YSM plots used in this study **Figure 10.17**. The total basal area (m2/ha) is our variable of interest or response variable in this study. For each of the YSM plot the total basal area was calculated by adding the basal area for all trees within the plot. We will use the auxiliary variables such as trees per hectare (TPH), elevation (m), site index, top height(m), and tree volume (m3/ha)
+For this case study, we will use ground plot data from Young stand monitoring (YSM) program data [@ProvinceofBC2018] for Fort Saint Johns timber supply area (TSA) in the province of British Columbia, Canada.Fort Saint Johns is divided into 6 blocks respectively (Figure \@ref(fig:10-universal-kriging-4)). There is a total of 108 YSM plots used in this study (Figure \@ref(fig:10-universal-kriging-4)). The total basal area (m2/ha) is our variable of interest or response variable in this study. For each of the YSM plot the total basal area was calculated by adding the basal area for all trees within the plot. We will use the auxiliary variables such as trees per hectare (TPH), elevation (m), site index, top height(m), and tree volume (m3/ha)
 <p id="box-text">
 </p>
 
@@ -1656,9 +1626,6 @@ The lag error parameter Lambda for the model in **step 4** is positive and signi
 
 When it is not so clear theoretically that either of the spatial model works for our data, we can compare the model performance parameters: the AIC and Log likelihood. In our case, the spatial error model has lowest AIC and highest negative Log likelihood values. Hence, spatial lag model best fits our data.
 
-:::: {.box-content .call-out-content}
-
-::: {.box-title .call-out-top}
 ## Remember This? {-}
 :::