diff --git a/docs/abbreviations.md b/docs/abbreviations.md
index 88761fd..92ec0ec 100644
--- a/docs/abbreviations.md
+++ b/docs/abbreviations.md
@@ -36,6 +36,7 @@
 *[ISA]: Instruction Set Architecture
 *[ISAs]: Instruction Set Architectures
 *[JSON]: JavaScript Object Notation
+*[LLC]: Last Level Cache
 *[LUT]: Look Up Table
 *[LUTs]: Look Up Tables
 *[MKL]: Intel Math Kernel Library
@@ -45,6 +46,7 @@
 *[MSVC]: Microsoft Visual C++
 *[MT 19937]: Mersenne Twister 19937
 *[NEON]: ARM SIMD instructions
+*[NUMA]: Non Uniform Memory Access
 *[OS]: Operating System
 *[OSs]: Operating Systems
 *[PRNG]: Pseudo Random Number Generator
@@ -68,3 +70,4 @@
 *[SSE4.1]: Streaming SIMD Extensions 4.1
 *[SSE4.2]: Streaming SIMD Extensions 4.2
 *[STD]: Standard
+*[UMA]: Uniform Memory Access
diff --git a/docs/thread_pinning.md b/docs/thread_pinning.md
index e0e4830..a0ae961 100644
--- a/docs/thread_pinning.md
+++ b/docs/thread_pinning.md
@@ -1,6 +1,6 @@
 # Thread Pinning
 
-`AFF3CT-core` enables to select on which CPU process units (PUs) the threads are 
+`AFF3CT-core` allows selecting on which process units (PUs) the threads are 
 effectively run. This is called *thread pinning* and it can significantly 
 benefit to the performance, especially on modern heterogeneous architectures. 
 To do so, the runtime relies on the 
@@ -25,7 +25,7 @@ To do so, the runtime relies on the
 
 *Portable Hardware Locality* (`hwloc` in short) is a library which provides a 
 **portable abstraction** of the **hierarchical topology of modern 
-architectures** (see the illustration below).
+architectures** (see the figure below).
 
 <figure markdown>
   ![Orange Pi 5](./assets/hwloc_orangepi5.svg)
@@ -36,52 +36,53 @@ architectures** (see the illustration below).
   </figcaption>
 </figure>
 
-`hwloc` gives the ability to pin threads over any level of hierarchy with a tree 
-view, where the process units are the leaves and there are intern nodes which 
-represent a set of PUs that are physically close (share the same LLC or are in 
-the same NUMA node). 
+`hwloc` gives the ability to pin threads at various levels of the hierarchy, 
+which is represented by a tree structure. The deepest nodes (the leaves) are the 
+PUs, while higher nodes represent sets of PUs that are physically close. For 
+instance, a set of PUs can share the same UMA node (in the case of a NUMA 
+architecture), the same LLC or the same package. 
 
-For instance, we can choose to pin a thread over a *package* and it will be able
-to execute on all the PUs that are in this level. In the Orange Pi 5 SBC, if we 
-choose `Package L#0` the thread will run over the following set of PUs: 
-`PU L#0`, `PU L#1`, `PU L#2` and `PU L#3`. Consequently, **the pinned thread can 
-move in the selected `hwloc` object during the execution** and it is up to the 
-OS to schedule the thread on the available set of PUs.
+In the Orange Pi 5 SBC, if we pin a thread on `Package L#0`, it will run 
+over the following set of PUs: `PU L#0`, `PU L#1`, `PU L#2` and `PU L#3`. 
+Thus, **the pinned thread can move within the selected `hwloc` node during 
+execution** and it is up to the OS to schedule the thread on the selected set 
+of PUs.
 
 !!! warning
-	The indexes given by `hwloc` are different from those given by the OS: they 
-	are logical indexes that express the real locality. **Consequently, in 
+	The indexes given by `hwloc` can be different from those given by the OS: 
+	they are logical indexes that express the real locality. **Consequently, in 
 	`AFF3CT-core`, it is important to use `hwloc` logical indexes.** The 
 	`hwloc-ls` command gives an overview of the current topology with these 
 	logical indexes.
 
 ## Sequence & Pipeline
 
-In `AFF3CT-core`, the thread pinning can be set in `runtime::Sequence` and 
-`runtime::Pipeline` classes constructor. In both cases, there is a dedicated 
-argument of `std::string` type: `sequence_pinning_policy` for 
-`runtime::Sequence` and `pipeline_pinning_policy` for `runtime::Pipeline`.
+In `AFF3CT-core`, thread pinning can be set in `runtime::Sequence` and 
+`runtime::Pipeline` class constructors. In both cases, there is a dedicated 
+argument of `std::string` type named `sequence_pinning_policy` for 
+`runtime::Sequence` or `pipeline_pinning_policy` for `runtime::Pipeline`.
 
 !!! info
-    It is important to specify the thread pinning at the construction of the 
-    `runtime::Sequence`/`runtime::Pipeline` object to guarantee that the data 
-    will be allocated and initialized (first touch policy) on the right memory 
-    banks during the replication process.
+    For NUMA architectures, it is important to specify thread pinning at the 
+    construction of the `runtime::Sequence`/`runtime::Pipeline` object to 
+    guarantee that the data will be allocated and initialized on the right 
+    memory banks (according to the first touch policy) during the replication 
+    process.
 
-To specify the pinning policy, we defined a syntax to express `hwloc` with three 
-different separators:  
+To specify the pinning policy, we defined a syntax to express `hwloc` objects 
+with three different separators:  
 
 - Pipeline stage (does not concern `runtime::Sequence`): `|`
 - Replicated stage (= replicated sequence = one thread): `;`  
 - For one thread, the list of pinned `hwloc` objects (= logical or): `,`  
 
-Then, the pinning can contains all the available `hwloc` objects. Below is 
-the correspondence between the `std::string` and the `hwloc` objects type 
-enumerate:
+Then, the pinning policy can contain any of the available `hwloc` objects. Below 
+is the correspondence between the `std::string` and the `hwloc` object types:
 
 ```cpp
-static std::map<std::string, hwloc_obj_type_t> object_map =
-{ /* global containers */             /* data caches */              /* instruction caches */
+std::map<std::string, hwloc_obj_type_t> str_to_hwloc_obj =
+{ 
+  /* global containers */             /* data caches */              /* instruction caches */
   { "GROUP",   HWLOC_OBJ_GROUP    },  { "L5D", HWLOC_OBJ_L5CACHE },  { "L3I",  HWLOC_OBJ_L3ICACHE },
   { "NUMA",    HWLOC_OBJ_NUMANODE },  { "L4D", HWLOC_OBJ_L4CACHE },  { "L2I",  HWLOC_OBJ_L2ICACHE },
   { "PACKAGE", HWLOC_OBJ_PACKAGE  },  { "L3D", HWLOC_OBJ_L3CACHE },  { "L1I",  HWLOC_OBJ_L1ICACHE },
@@ -91,26 +92,24 @@ static std::map<std::string, hwloc_obj_type_t> object_map =
 };           
 ```
 
-The following syntax is used to specify the object index `X`: `OBJECT_X`. 
-
-`OBJECT` can be all the `std::string` defined in the previous listing 
-(ex: `PU_10` refers to the logical process unit n°10).
+To specify the index `X` of an `hwloc` object, the following syntax is used: 
+`OBJECT_X` (e.g., `PU_5` refers to the logical PU number 5).
 
 !!! info
-    `CORE` and `PU` objects can be confusing. If the CPU cores does not support
+    `CORE` and `PU` objects can be confusing. If the CPU cores do not support
     SMT, then `CORE` and `PU` are the same. However, if the CPU cores support
     SMT, then the `PU` is the hardware thread identifier inside a given `CORE`.
 
 ### Illustrative Examples
 
-The section proposes some examples to understand how the syntax works. Only the 
-simplest `hwloc` object is used: the `PU`. Let's suppose that we have a 
-octo-core CPU with 8 process units (`PU_0, PU_1, PU_2, PU_3, PU_4, PU_5, PU_6, 
-PU_7`), see the topology of the Orange Pi 5 Plus above).
+This section gives some examples to illustrate how the syntax works. We 
+suppose that we have a CPU with 8 PUs with the same topology as the Orange 
+Pi 5 Plus SBC presented before.
 
 #### Example 1
 
-We want to describe a 3 stages pipeline with:
+Let's suppose we want to set up a 3-stage pipeline with the following 
+characteristics:
 
 - **Stage 1** - No replication (= 1 thread): 
      - Pinned to `PU_0`
@@ -136,15 +135,18 @@ S2T4(Stage 2, thread 4 - pin: PU_6 or PU_7)-->SYNC2;
 SYNC2(Sync)-->S3T1(Stage 3, thread 1 - pin: PU_0, PU_1, PU_2 or PU_3);
 ```
 
-The input parameters will be:  
+In the previous configuration, 6 threads will execute simultaneously (even if 
+the given architecture supports up to 8 threads running in parallel).
+
+To instantiate this `runtime::Pipeline`, here are the corresponding constructor 
+parameters:  
 
 - Number of replications (= threads) per stage: `{ 1, 4, 1 }`
-- Enabling pinning: `{ true, true, true }`  
+- Enabling pinning per stage: `{ true, true, true }`  
 - Pinning policy: 
   `"PU_0 | PU_4, PU_5; PU_4, PU_5; PU_6, PU_7; PU_6, PU_7 | PU_0, PU_1, PU_2, PU_3"`
 
-The previous pinning policy syntax can be compressed a little bit. It is 
-possible to use the following equivalent `std::string`:
+The previous pinning policy syntax can be compressed a little bit as follows:
 
 - Pinning policy : 
   `"PU_0 | PACKAGE_1; PACKAGE_1; PACKAGE_2; PACKAGE_2 | PACKAGE_0"`
@@ -153,7 +155,7 @@ possible to use the following equivalent `std::string`:
 
 Let's now consider that we want to pin all the threads of the stage 2 on the 
 `PU_4`, `PU_5`, `PU_6` or `PU_7` (this is less restrictive than the previous 
-example). The pinning strategy for stage 1 and 3 is the same as before.
+example). The pinning strategy for stages 1 and 3 is unchanged.
 
 ```mermaid
 graph LR;
@@ -169,6 +171,10 @@ S2T4(Stage 2, thread 4 - pin: PU_4, PU_5, PU_6 or PU_7)-->SYNC2;
 SYNC2(Sync)-->S3T1(Stage 3, thread 1 - pin: PU_0, PU_1, PU_2 or PU_3);
 ```
 
+Here are the corresponding parameters: 
+
+- Number of replications (= threads) per stage: `{ 1, 4, 1 }`
+- Enabling pinning per stage: `{ true, true, true }`  
 - Pinning policy : `"PU_0 | PACKAGE_1, PACKAGE_2 | PACKAGE_0"`
 
 With the previous syntax, the 4 threads of the stage 2 will apply the 
@@ -176,12 +182,12 @@ With the previous syntax, the 4 threads of the stage 2 will apply the
 
 #### Example 3
 
-It is also possible to choose the stages we want to pin using a vector of 
-`boolean`. For instance, if we don't want to pin the first stage, we can do:  
+It is also possible to choose which stages to pin using a vector of `bool` 
+values. Let's suppose we do not want to specify any pinning for stage 1. 
 
 ```mermaid
 graph LR;
-S1T1(Stage 1, thread 1 - no pin)-->SYNC1;
+S1T1(Stage 1, thread 1 - no pinning)-->SYNC1;
 SYNC1(Sync)-->S2T1;
 SYNC1(Sync)-->S2T2;
 SYNC1(Sync)-->S2T3;
@@ -193,11 +199,13 @@ S2T4(Stage 2, thread 4 - pin: PU_4, PU_5, PU_6 or PU_7)-->SYNC2;
 SYNC2(Sync)-->S3T1(Stage 3, thread 1 - pin: PU_0, PU_1, PU_2 or PU_3);
 ```
 
-- Enabling pinning: `{false, true, true}`  
+Here are the corresponding parameters:
+
+- Number of replications (= threads) per stage: `{ 1, 4, 1 }`
+- Enabling pinning per stage: `{ false, true, true }`  
 - Pinning policy: `"| PACKAGE_1, PACKAGE_2 | PACKAGE_0"`
 
-Thus, the operating system will be in charge of pinning the thread of the first
-stage.
+In this case, the OS is free to schedule the first stage's thread on any PU.
   
 ### Unpin