corrections based on comments
greensh16 committed May 21, 2024
1 parent ab962ee commit fa6f809
Showing 1 changed file with 46 additions and 27 deletions.
73 changes: 46 additions & 27 deletions posts/2024-05-20-pytorch_intro.ipynb
@@ -219,17 +219,9 @@
},
{
"cell_type": "code",
"execution_count": 17,
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Device tensor is stored on: cpu\n"
]
}
],
"outputs": [],
"source": [
"if torch.cuda.is_available():\n",
" a = a.to('cuda')\n",
@@ -238,6 +230,13 @@
"print(f\"Device tensor is stored on: {a.device}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```Device tensor is stored on: cuda```"
]
},
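As an aside, a common device-agnostic pattern is to choose the device once up front and reuse it everywhere; a minimal sketch:

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

a = torch.rand(3, 4)   # created on the CPU by default
a = a.to(device)       # .to() returns a copy of the tensor on the target device
print(f"Device tensor is stored on: {a.device}")
```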
{
"cell_type": "markdown",
"metadata": {},
@@ -264,9 +263,9 @@
"metadata": {},
"source": [
"- Tensors: As discussed earlier, tensors are the fundamental data structures in PyTorch, similar to matrices but with the ability to store data in higher dimensions. They are used to store the inputs, outputs, and parameters of a model.\n",
"- Modules: In PyTorch, every neural network is derived from the nn.Module base class. A module can contain other modules, allowing to nest them in a tree structure. This modular design provides great flexibility when designing models.\n",
"- Parameters: Parameters are tensor subclasses that have a very special property — they are automatically added to the list of its module’s parameters, and will be considered by optimizers.\n",
"\n",
"- Modules: In PyTorch, every neural network is derived from the nn.Module base class. A module is a building block for neural networks; it encapsulates parameters, and provides a way for organizing computations. This can include layers, methods to set parameters, forward and backward propagations, and more. Essentially, it’s a self-contained component that defines how data should be processed.\n",
"- Parameters: Parameters are tensor subclasses that have a very special property — they are automatically added to the list of its module’s parameters, and will be considered by optimizers.\n",
"- Optimizers: PyTorch includes several optimization algorithms in torch.optim, like SGD, Adam, and RMSprop, which are used to update weights during training according to the gradients computed during backpropagation."
]
},
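To make the concepts above concrete, here is a minimal sketch (the ```TinyNet``` module and its shapes are made up purely for illustration) of how tensors, modules, parameters, and an optimizer fit together:

```python
import torch
import torch.nn as nn

# A tiny module: nn.Linear registers its weight and bias as Parameters
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

net = TinyNet()

# Parameters are collected automatically by the module
for name, p in net.named_parameters():
    print(name, p.shape)

# The optimizer is handed those parameters and updates them from gradients
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)

x = torch.rand(1, 4)   # input tensor
loss = net(x).sum()    # output tensor reduced to a scalar
loss.backward()        # gradients computed by autograd
optimizer.step()       # parameters updated
```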
@@ -281,7 +280,17 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"For this example, we can use a dataset like CIFAR-10, which contains images of different animal and vehicle classes. Here's how you can adjust the network, data loading, and training setup for animal image classification."
"The best way to intorduce you to Neural Netwroks is to go through an example, here we will build a simple animal image classification network called ```AnimalNet```. \n",
"\n",
"To build an animal image classification there are several steps that we are going to do:\n",
"\n",
"- First import all the necessary libraries.\n",
"- Define the network architecture by subclassing ```nn.Module```\n",
"- Load the data that we are going to use for training and testing. For this example, we can use a dataset like [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html), which contains images of different animal and vehicle classes.\n",
"- Train the network with an optimizer and loss function. \n",
"- Then we will evaulate our model and test in on some animal images.\n",
"\n",
"For this example I won't be going into the technical details of machine learning, this is more an introduction of how PyTorch works, however I will try to include useful links as we go where you may like to read further (i.e. loss functions, optimisers, etc)"
]
},
{
@@ -314,7 +323,9 @@
"source": [
"### Network Definition:\n",
"\n",
"Next, define the network architecture by subclassing ```nn.Module```, and initialize the neural network layers in ```__init__```. Implement the forward pass in the forward method."
"Next, define the network architecture by subclassing ```nn.Module```, and initialize the neural network layers in ```__init__```. Implement the forward pass in the forward method.\n",
"\n",
"These neural network layers are also known a [convolution layers](https://www.geeksforgeeks.org/introduction-convolution-neural-network/)."
]
},
{
@@ -350,7 +361,9 @@
" x = F.relu(self.fc3(x))\n",
"\n",
" # Output Layer and Softmax Activation:\n",
" return F.log_softmax(self.fc4(x), dim=1) # applies the final linear transformation to map the representations learned by the network to the number of classes in the task (10 in this case)."
" return F.log_softmax(self.fc4(x), dim=1) \n",
" # This applies the final linear transformation to map the representations learned by the network to the number \n",
" # of classes in the task (10 in this case)."
]
},
{
@@ -378,9 +391,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"- The forward() function is an important part of defining neural networks in PyTorch. It specifies how the input data is transformed as it passes through the network, essentially defining the computation that occurs within the network. This is where you apply layers, activation functions, and other computational steps to the input tensors.\n",
"- The ```forward()``` function is an important part of defining neural networks in PyTorch. It specifies how the input data is transformed as it passes through the network, essentially defining the computation that occurs within the network. This is where you apply layers, activation functions, and other computational steps to the input tensors.\n",
"\n",
"- ```F.relu()``` is the Rectified Linear Activation Function (ReLU) applied element-wise. It introduces non-linearity into the model, allowing it to learn more complex patterns. ReLU is defined as ```f(x) = max(0, x)``` , which sets all negative values in the input tensor to zero.\n",
"- ```F.relu()``` is the Rectified Linear Activation Function [(ReLU)](https://www.geeksforgeeks.org/activation-functions-in-pytorch/) applied element-wise. It introduces non-linearity into the model, allowing it to learn more complex patterns. ReLU is defined as ```f(x) = max(0, x)``` , which sets all negative values in the input tensor to zero.\n",
"\n",
"- ```F.log_softmax()``` is applied to the output of the final layer. This function computes the logarithm of the softmax of the input tensor. Softmax converts the logits (raw predictions) into probabilities by taking the exponentials of each output and then normalizing these values by dividing by the sum of all exponentials; this ensures that the output values are between 0 and 1 and sum to 1.\n",
"\n",
@@ -394,7 +407,8 @@
"### Data Loading and Preprocessing:\n",
"\n",
"One of the main tasks when building your machine learning model is preparing your training and testing data.\n",
"Here we need to adjust the data loading to use the CIFAR-10 dataset, which includes classes like birds, cats, dogs, and other animals."
"\n",
"In this example we are using the [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) dataset. This dataset contains images of animals but it also contains images of vehicles like cars/trains/planes/etc. We need to filter this dataset so that it only contains the classes we want, like birds, cats, dogs, and other animals."
]
},
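The notebook's loading cell follows below. As a rough sketch of the filtering idea (this ```Subset``` approach is one way to do it, and is not necessarily how the notebook's ```AnimalCIFAR10``` class is implemented):

```python
from torch.utils.data import Subset
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

full_train = datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)

# CIFAR-10 label order: 0 airplane, 1 automobile, 2 bird, 3 cat, 4 deer,
# 5 dog, 6 frog, 7 horse, 8 ship, 9 truck -- keep only the animal classes
animal_labels = {2, 3, 4, 5, 6, 7}
animal_indices = [i for i, label in enumerate(full_train.targets) if label in animal_labels]
train_animals = Subset(full_train, animal_indices)
```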
{
@@ -442,7 +456,7 @@
}
],
"source": [
"# Data loading and transformation\n",
"# Data loading ino tensors and transformation\n",
"transform = transforms.Compose([\n",
" transforms.ToTensor(),\n",
" transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)) # Normalize the RGB channels\n",
@@ -519,7 +533,12 @@
"source": [
"### Training the Network\n",
"\n",
"Set up the optimizer, define the loss function, and implement the training loop. "
"Set up the optimizer, define the loss function, and implement the training loop. \n",
"\n",
"Somne details: \n",
"\n",
"- We are using the [Adam optimize](https://www.geeksforgeeks.org/adam-optimizer/?ref=header_search) which is good for image classification.\n",
"- Our loss function is the [Cross Entropy Loss](https://www.geeksforgeeks.org/what-is-cross-entropy-loss-function/?ref=header_search) function."
]
},
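The notebook's training cell follows below; in outline, a training loop with Adam and cross-entropy looks like this sketch (```net``` and ```trainloader``` refer to the objects defined earlier, and the epoch count is illustrative). One wrinkle worth knowing: ```nn.CrossEntropyLoss``` applies ```log_softmax``` internally and expects raw logits, so with a ```forward()``` that already ends in ```F.log_softmax```, ```nn.NLLLoss``` is the exactly matching criterion.

```python
import torch.nn as nn
import torch.optim as optim

criterion = nn.CrossEntropyLoss()   # as in the post; pairs exactly with raw logits (see note above)
optimizer = optim.Adam(net.parameters(), lr=0.001)

for epoch in range(5):                      # epoch count is illustrative
    running_loss = 0.0
    for inputs, labels in trainloader:
        optimizer.zero_grad()               # clear gradients from the previous step
        outputs = net(inputs)               # forward pass
        loss = criterion(outputs, labels)   # compare predictions against the true labels
        loss.backward()                     # backpropagate the loss
        optimizer.step()                    # update the weights
        running_loss += loss.item()
    print(f"Epoch {epoch + 1}: loss {running_loss / len(trainloader):.3f}")
```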
{
@@ -663,7 +682,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Now lets test our model on 6 random images, first we'll plot the 'ground truth':"
"Now let's test our model on 6 random images, first we'll plot the 'ground truth':"
]
},
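The plotting code is elided in this diff; a minimal sketch of the idea, assuming ```testloader``` and a ```classes``` list of label names exist as above:

```python
import matplotlib.pyplot as plt
import numpy as np
import torchvision

images, labels = next(iter(testloader))

# Un-normalize (x * 0.5 + 0.5 inverts the Normalize above) and show a grid of 6 images
grid = torchvision.utils.make_grid(images[:6])
plt.imshow(np.transpose((grid * 0.5 + 0.5).numpy(), (1, 2, 0)))
plt.title(' '.join(classes[labels[j]] for j in range(6)))
plt.show()
```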
{
@@ -702,7 +721,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"And now plot what the model thinks those 6 images are:"
"And now we plot what the model thinks those 6 images are:"
]
},
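Again as a sketch, reusing the ```images``` batch and ```classes``` list from the previous cell:

```python
import torch

outputs = net(images)                  # forward pass on the same batch
_, predicted = torch.max(outputs, 1)   # index of the highest score = predicted class
print('Predicted:', ' '.join(classes[predicted[j]] for j in range(6)))
```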
{
@@ -731,7 +750,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Our model actually got 4 out of 6 correct which is slightly better than the ~50% accuracy from the model evaluation step. To see a potential reason for the better accuracy lets see how accurate the model is for each animal class it has been trained on:"
"Our model actually got 4 out of 6 correct which is slightly better than the ~50% accuracy from the model evaluation step. To see a potential reason for the better accuracy let's see how accurate the model is for each animal class it has been trained on:"
]
},
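The per-class code is elided in this diff; conceptually it tallies correct predictions per label, for example:

```python
import torch

correct = {cls: 0 for cls in classes}
total = {cls: 0 for cls in classes}

with torch.no_grad():                        # no gradients needed for evaluation
    for images, labels in testloader:
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        for label, pred in zip(labels, predicted):
            total[classes[label]] += 1
            if pred == label:
                correct[classes[label]] += 1

for cls in classes:
    print(f"Accuracy for {cls}: {100 * correct[cls] / total[cls]:.1f}%")
```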
{
@@ -787,7 +806,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"GPUs offer a great advantage over CPUs when it comes to machine leanring. Training and running your model on a GPU/s can be significantly faster. Lets redo the above but on a GPU instead and see what needs to be changed in the code. \n",
"GPUs offer a great advantage over CPUs when it comes to machine leanring. Training and running your model on a GPU/s can be significantly faster. Let's redo the above but on a GPU instead and see what needs to be changed in the code. \n",
"\n",
"Note that for this code to work you will need to be using a machine that has CUDA installed, see [pytorch + cuda](https://pytorch.org). Thankfully CUDA is installed in some environments on GADI like dk92: [dk92 NCI](https://opus.nci.org.au/display/DAE/RAPIDS)"
]
@@ -813,9 +832,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We don't need to change anything about our AnimalNet Class and AnimalCIFAR10 setup so I wont add that here again. \n",
"We don't need to change anything about our AnimalNet Class and AnimalCIFAR10 setup so I won't add that here again. \n",
"\n",
"Let's skip to the code where we initialize our model and do the training. There's 2 things we need to change here:\n",
"Let's skip to the code where we initialize our model and do the training. There are 2 things we need to change here:\n",
"- We need to move our model to the GPU\n",
"- We need to move out data/inputs to the GPU"
]
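In sketch form (assuming the same ```net```, ```trainloader```, ```criterion```, and ```optimizer``` as before), the two changes look like this:

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Change 1: move the model's parameters and buffers onto the GPU
net.to(device)

# Change 2: move each batch of data onto the same device inside the training loop
for inputs, labels in trainloader:
    inputs, labels = inputs.to(device), labels.to(device)
    optimizer.zero_grad()
    loss = criterion(net(inputs), labels)
    loss.backward()
    optimizer.step()
```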
@@ -891,7 +910,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"And finally, to test the model on some of the teasting images you just need to transfer those images to teh GPU to pass to the model:"
"And finally, to test the model on some of the testing images you just need to transfer those images to the GPU to pass to the model:"
]
},
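For example (again a sketch, reusing ```testloader```, ```classes```, and ```device``` from above):

```python
import torch

images, labels = next(iter(testloader))
images = images.to(device)                   # the batch must live on the same device as the model
outputs = net(images)
_, predicted = torch.max(outputs.cpu(), 1)   # bring the results back to the CPU for printing/plotting
print('Predicted:', ' '.join(classes[predicted[j]] for j in range(6)))
```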
{
