From ab962ee063f0696bff3e806c7b54877a67b997c1 Mon Sep 17 00:00:00 2001 From: Sam Green Date: Mon, 20 May 2024 12:01:27 +1000 Subject: [PATCH 1/2] Adding notebook for pytorch intro --- post_meta.yaml | 4 + posts/2024-05-20-pytorch_intro.ipynb | 1138 ++++++++++++++++++++++++++ 2 files changed, 1142 insertions(+) create mode 100644 posts/2024-05-20-pytorch_intro.ipynb diff --git a/post_meta.yaml b/post_meta.yaml index a30eec2..8da3558 100644 --- a/post_meta.yaml +++ b/post_meta.yaml @@ -286,3 +286,7 @@ title: Python Performance Options excerpt: Investigates how the performance of python code can be enhanced without parallelisation. tags: [ python ] + +2024-05-20-pytorch_intro: + title: Introduction to Pytorch + tags: [ python ML ] \ No newline at end of file diff --git a/posts/2024-05-20-pytorch_intro.ipynb b/posts/2024-05-20-pytorch_intro.ipynb new file mode 100644 index 0000000..1f9cf94 --- /dev/null +++ b/posts/2024-05-20-pytorch_intro.ipynb @@ -0,0 +1,1138 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Introduction to Pytorch" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "PyTorch is an open-source machine learning library developed by Facebook's AI Research lab (FAIR). It's widely used for applications such as natural language processing and computer vision, primarily due to its flexibility and ease of use. PyTorch is known for its powerful tensor operations, which are similar to the arrays and matrices found in other programming languages, but optimized for deep learning. One of its standout features is dynamic computational graphing, allowing for mutable graphs that update and change as operations are added. 
" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Tensors" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Tensors in PyTorch are similar to arrays in other programming languages, but with additional capabilities that make them suitable for machine learning. A tensor is a generalized matrix, or in other words, an n-dimensional array where n can be any non-negative integer. For instance, a 0-dimensional tensor is a scalar, a 1-dimensional tensor is a vector, and a 2-dimensional tensor is a matrix.\n", + "\n", + "Tensors are used in PyTorch for several reasons:\n", + "\n", + "1. Efficiency: Tensors allow efficient storage and manipulation of data, which is crucial when dealing with large datasets and complex models in machine learning.\n", + "2. GPU Acceleration: Tensors can be moved to a GPU to accelerate computing, which is much faster compared to CPU computations.\n", + "3. Automatic Differentiation: PyTorch uses tensors to perform automatic differentiation, which is essential for training machine learning models." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "import torch\n", + "import numpy as np" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Tensors can be initialized in various ways. 
\n", + "Take a look at the following examples:" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Vector: tensor([1, 2, 3, 4])\n", + "Matrix:\n", + " tensor([[1, 2, 3],\n", + " [4, 5, 6]])\n", + "Matrix from Numpy:\n", + " tensor([[1, 2, 3],\n", + " [4, 5, 6]])\n", + "Zero matrix:\n", + " tensor([[0., 0., 0.],\n", + " [0., 0., 0.],\n", + " [0., 0., 0.]])\n", + "Random matrix:\n", + " tensor([[0.4252, 0.7602, 0.8002],\n", + " [0.7485, 0.3796, 0.8222],\n", + " [0.6692, 0.5390, 0.4381]])\n" + ] + } + ], + "source": [ + "# Create a tensor from a list\n", + "l = [1, 2, 3, 4]\n", + "a = torch.tensor(l)\n", + "print(\"Vector:\", a)\n", + "\n", + "# Create a 2x3 matrix\n", + "m = [[1, 2, 3], [4, 5, 6]]\n", + "b = torch.tensor(m)\n", + "print(\"Matrix:\\n\", b)\n", + "\n", + "# Create a tensor from a 2x3 numpy array\n", + "np_array = np.array(m)\n", + "b_np = torch.from_numpy(np_array)\n", + "print(\"Matrix from Numpy:\\n\", b_np)\n", + "\n", + "# Create a tensor of zeros\n", + "c = torch.zeros((3, 3))\n", + "print(\"Zero matrix:\\n\", c)\n", + "\n", + "# Create a tensor with random values\n", + "d = torch.rand(3, 3)\n", + "print(\"Random matrix:\\n\", d)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Tensor Attributes" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Tensor: tensor([[1, 2, 3],\n", + " [4, 5, 6]])\n", + "Shape of tensor: torch.Size([2, 3])\n", + "Datatype of tensor: torch.int64\n", + "Device tensor is stored on: cpu\n" + ] + } + ], + "source": [ + "print(f\"Tensor: {b}\")\n", + "print(f\"Shape of tensor: {b.shape}\")\n", + "print(f\"Datatype of tensor: {b.dtype}\")\n", + "print(f\"Device tensor is stored on: {b.device}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Tensor 
operations" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Add 10: tensor([11, 12, 13, 14])\n", + "Element-wise multiplication: tensor([ 1, 4, 9, 16])\n", + "Matrix multiplication:\n", + " tensor([[ 4, 5],\n", + " [10, 11]])\n", + "Mean of random matrix: tensor(0.5434)\n" + ] + } + ], + "source": [ + "# Addition\n", + "result = torch.add(a, 10)\n", + "print(\"Add 10:\", result)\n", + "\n", + "# Element-wise multiplication\n", + "result = a * a\n", + "print(\"Element-wise multiplication:\", result)\n", + "\n", + "# Matrix multiplication\n", + "result = torch.matmul(b, torch.tensor([[1, 0], [0, 1], [1, 1]]))\n", + "print(\"Matrix multiplication:\\n\", result)\n", + "\n", + "# Mean of tensor\n", + "result = d.mean()\n", + "print(\"Mean of random matrix:\", result)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Standard numpy-like indexing and slicing:" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "tensor([[0., 0., 0.],\n", + " [0., 0., 0.],\n", + " [0., 0., 0.]])\n", + "tensor([[0., 1., 0.],\n", + " [0., 1., 0.],\n", + " [0., 1., 0.]])\n" + ] + } + ], + "source": [ + "print(c)\n", + "c[:,1] = 1\n", + "print(c)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Moving Tensors to GPU\n", + "To leverage GPU acceleration, you need to move your tensors to the GPU:" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Device tensor is stored on: cpu\n" + ] + } + ], + "source": [ + "if torch.cuda.is_available():\n", + " a = a.to('cuda')\n", + " print(\"Moved vector to GPU:\", a)\n", + "\n", + "print(f\"Device tensor is stored on: {a.device}\")" + ] + }, + { + "cell_type": "markdown", + 
"metadata": {}, + "source": [ + "# Neural Networks" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Neural networks in PyTorch are built using the torch.nn module, which provides all the building blocks needed to create your own neural network. These blocks include layers, activation functions, and loss functions, which can be combined to model complex patterns in data." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Key Components of PyTorch for Neural Networks" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "- Tensors: As discussed earlier, tensors are the fundamental data structures in PyTorch, similar to matrices but with the ability to store data in higher dimensions. They are used to store the inputs, outputs, and parameters of a model.\n", + "- Modules: In PyTorch, every neural network is derived from the nn.Module base class. A module can contain other modules, allowing to nest them in a tree structure. This modular design provides great flexibility when designing models.\n", + "- Parameters: Parameters are tensor subclasses that have a very special property — they are automatically added to the list of its module’s parameters, and will be considered by optimizers.\n", + "\n", + "- Optimizers: PyTorch includes several optimization algorithms in torch.optim, like SGD, Adam, and RMSprop, which are used to update weights during training according to the gradients computed during backpropagation." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Building a Simple Neural Network" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For this example, we can use a dataset like CIFAR-10, which contains images of different animal and vehicle classes. Here's how you can adjust the network, data loading, and training setup for animal image classification." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Imports:\n", + "\n", + "We'll start by importing necessary PyTorch modules:" + ] + }, + { + "cell_type": "code", + "execution_count": 64, + "metadata": {}, + "outputs": [], + "source": [ + "import torch\n", + "import torch.nn as nn\n", + "import torch.nn.functional as F\n", + "import torch.optim as optim\n", + "from torch.utils.data import Dataset, DataLoader\n", + "from torchvision import datasets, transforms\n", + "import torchvision" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Network Definition:\n", + "\n", + "Next, define the network architecture by subclassing ```nn.Module```, and initialize the neural network layers in ```__init__```. Implement the forward pass in the forward method." + ] + }, + { + "cell_type": "code", + "execution_count": 65, + "metadata": {}, + "outputs": [], + "source": [ + "class AnimalNet(nn.Module):\n", + " def __init__(self):\n", + " # The values chosen for the nn.Linear layers in a neural network design are significant as they define \n", + " # the architecture and capacity of the network to learn from the data. \n", + " super(AnimalNet, self).__init__()\n", + " self.fc1 = nn.Linear(3*32*32, 512) # Input layer to hidden layer (flattened 32x32x3 images)\n", + " self.fc2 = nn.Linear(512, 256) # Hidden layer to hidden layer\n", + " self.fc3 = nn.Linear(256, 128) # Hidden layer to hidden layer\n", + " self.fc4 = nn.Linear(128, 6) # Hidden layer to output layer (6 animal classes)\n", + "\n", + " # x is the input tensor that contains the batch of images.\n", + " # Each image is originally a 3D tensor (with dimensions for channels, height, and width).\n", + " def forward(self, x):\n", + " # Reshapes x into a 2D tensor where each image is flattened into a single vector of size 3*32*32 (3072):\n", + " x = x.view(-1, 3*32*32) \n", + " # The -1 is used to automatically calculate the appropriate number of rows based on the batch size. 
\n", + " # This is necessary because the input layer of the network (fc1) expects a 1D vector per image.\n", + "\n", + " # Applying the First Linear Layer and Activation Function:\n", + " x = F.relu(self.fc1(x)) # applies the first linear transformation using weights and biases of the fc1 layer, then ReLU.\n", + "\n", + " # These lines repeat the pattern: applying a linear transformation followed by a ReLU activation. \n", + " # Each layer takes the output from the previous layer as its input, progressively transforming the data.\n", + " x = F.relu(self.fc2(x))\n", + " x = F.relu(self.fc3(x))\n", + "\n", + " # Output Layer and Softmax Activation:\n", + " return F.log_softmax(self.fc4(x), dim=1) # applies the final linear transformation to map the representations learned by the network to the number of classes in the task (6 in this case)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
\n", + " Extra info on nn.Linear:\n", + "

nn.Linear(3*32*32, 512):
\n", + "Input Features (3*32*32) - This is the number of input features to the first layer of the neural network. The CIFAR-10 images are colored (RGB) and each image is 32x32 pixels. Therefore, each image consists of 3 color channels × 32 pixels (height) × 32 pixels (width) = 3,072 features. When you input an image to the network, it is typically flattened from a 3D tensor of shape (3, 32, 32) to a 1D tensor of shape (3072), which is a common practice for fully connected layers.
\n", + "Output Features (512) - This number defines how many neurons there are in the first hidden layer of the network. The choice of 512 neurons is somewhat arbitrary but is influenced by factors like the complexity of the task and the amount of available data. More neurons can potentially capture more complex patterns but also increase computational load and the risk of overfitting, especially with smaller datasets. The number 512 is a power of 2, which is often chosen due to how memory is allocated in many computing systems, potentially improving performance.

\n", + "

nn.Linear(512, 256):
\n", + "Input Features (512) - This layer takes the output from the previous layer (512 features) as its input.
\n", + "Output Features (256) - This continues the pattern of creating a \"funnel\" where each subsequent layer has fewer neurons than the previous one. This design can help to condense the information from the high-dimensional input into increasingly abstract and useful representations. Halving the number of neurons in each layer (a common heuristic) helps in gradually reducing the dimensionality of the problem, which can aid in learning more generalized features.

\n", + "

nn.Linear(256, 128):
\n", + "Input Features (256) - Inputs from the second layer are fed into the third.
\n", + "Output Features (128) - Further reduces the complexity and continues the pattern of halving. This reduction helps in focusing the network on the most important features to make decisions.

\n", + "

nn.Linear(128, 6):
\n", + "Input Features (128) - Takes outputs from the previous layer.
\n", + "Output Features (6) - This is determined by the number of animal classes kept from the CIFAR-10 dataset. Each of the 6 output units corresponds to one of the classes (bird, cat, deer, dog, frog, horse), which the network will learn to predict.

\n", + "
" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "- The forward() function is an important part of defining neural networks in PyTorch. It specifies how the input data is transformed as it passes through the network, essentially defining the computation that occurs within the network. This is where you apply layers, activation functions, and other computational steps to the input tensors.\n", + "\n", + "- ```F.relu()``` is the Rectified Linear Activation Function (ReLU) applied element-wise. It introduces non-linearity into the model, allowing it to learn more complex patterns. ReLU is defined as ```f(x) = max(0, x)``` , which sets all negative values in the input tensor to zero.\n", + "\n", + "- ```F.log_softmax()``` is applied to the output of the final layer. This function computes the logarithm of the softmax of the input tensor. Softmax converts the logits (raw predictions) into probabilities by taking the exponentials of each output and then normalizing these values by dividing by the sum of all exponentials; this ensures that the output values are between 0 and 1 and sum to 1.\n", + "\n", + "The ```forward()``` function is automatically invoked when you call the model on an input batch of data, e.g., ```output = model(data)```. The backward pass, used to compute gradients during training, is automatically defined by PyTorch using autograd based on the operations specified in the forward() function. This makes implementing complex neural networks more straightforward, as the user only needs to define the forward computation." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Data Loading and Preprocessing:\n", + "\n", + "One of the main tasks when building your machine learning model is preparing your training and testing data.\n", + "Here we need to adjust the data loading to use the CIFAR-10 dataset, which includes classes like birds, cats, dogs, and other animals." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 66, + "metadata": {}, + "outputs": [], + "source": [ + "# Custom dataset to filter only animal classes\n", + "class AnimalCIFAR10(Dataset):\n", + " def __init__(self, root, train=True, transform=None, download=False):\n", + " self.cifar10 = datasets.CIFAR10(root=root, train=train, transform=transform, download=download)\n", + " self.animal_classes = [2, 3, 4, 5, 6, 7] # Indices of animal classes in CIFAR-10\n", + " self.animal_class_map = {2: 0, 3: 1, 4: 2, 5: 3, 6: 4, 7: 5} # Map original class to new class index\n", + " self.data = []\n", + " self.targets = []\n", + "\n", + " for i in range(len(self.cifar10)):\n", + " if self.cifar10.targets[i] in self.animal_classes:\n", + " self.data.append(self.cifar10.data[i])\n", + " self.targets.append(self.animal_class_map[self.cifar10.targets[i]])\n", + "\n", + " def __len__(self):\n", + " return len(self.data)\n", + "\n", + " def __getitem__(self, idx):\n", + " img, target = self.data[idx], self.targets[idx]\n", + " if self.cifar10.transform is not None:\n", + " img = self.cifar10.transform(img)\n", + " return img, target" + ] + }, + { + "cell_type": "code", + "execution_count": 68, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Files already downloaded and verified\n", + "Files already downloaded and verified\n" + ] + } + ], + "source": [ + "# Data loading and transformation\n", + "transform = transforms.Compose([\n", + " transforms.ToTensor(),\n", + " transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)) # Normalize the RGB channels\n", + "])\n", + "\n", + "# Load in our training data from CIFAR10\n", + "train_dataset = AnimalCIFAR10(root='./data', train=True, download=True, transform=transform)\n", + "train_loader = DataLoader(train_dataset, batch_size=6, shuffle=True)\n", + "\n", + "# Load in our testing data from CIFAR10\n", + "test_dataset = AnimalCIFAR10(root='./data', train=False, download=True, 
transform=transform)\n", + "test_loader = DataLoader(test_dataset, batch_size=6, shuffle=False)\n", + "\n", + "animal_classes = ['bird', 'cat', 'deer', 'dog', 'frog', 'horse']" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's show what some of the images look like:" + ] + }, + { + "cell_type": "code", + "execution_count": 69, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "cat bird horse horse dog dog \n" + ] + } + ], + "source": [ + "import matplotlib.pyplot as plt\n", + "import numpy as np\n", + "\n", + "# functions to show an image\n", + "\n", + "\n", + "def imshow(img):\n", + " img = img / 2 + 0.5 # unnormalize\n", + " npimg = img.numpy()\n", + " plt.imshow(np.transpose(npimg, (1, 2, 0)))\n", + " plt.show()\n", + "\n", + "\n", + "# get some random training images\n", + "dataiter = iter(train_loader)\n", + "images, labels = next(dataiter)\n", + "\n", + "# show images\n", + "imshow(torchvision.utils.make_grid(images))\n", + "# print labels\n", + "print(' '.join(f'{animal_classes[labels[j]]:5s}' for j in range(6)))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Training the Network\n", + "\n", + "Set up the optimizer, define the loss function, and implement the training loop. " + ] + }, + { + "cell_type": "code", + "execution_count": 76, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[1, 2000] loss: 1.613\n", + "[1, 4000] loss: 1.477\n", + "[2, 2000] loss: 1.382\n", + "[2, 4000] loss: 1.380\n", + "[3, 2000] loss: 1.315\n", + "[3, 4000] loss: 1.308\n", + "[4, 2000] loss: 1.226\n", + "[4, 4000] loss: 1.269\n", + "[5, 2000] loss: 1.188\n", + "[5, 4000] loss: 1.197\n", + "[6, 2000] loss: 1.128\n", + "[6, 4000] loss: 1.160\n", + "[7, 2000] loss: 1.086\n", + "[7, 4000] loss: 1.104\n", + "[8, 2000] loss: 1.031\n", + "[8, 4000] loss: 1.064\n", + "[9, 2000] loss: 0.979\n", + "[9, 4000] loss: 1.026\n", + "[10, 2000] loss: 0.923\n", + "[10, 4000] loss: 0.970\n", + "[11, 2000] loss: 0.897\n", + "[11, 4000] loss: 0.929\n", + "[12, 2000] loss: 0.855\n", + "[12, 4000] loss: 0.897\n", + "[13, 2000] loss: 0.828\n", + "[13, 4000] loss: 0.848\n", + "[14, 2000] loss: 0.779\n", + "[14, 4000] loss: 0.822\n", + "[15, 2000] loss: 0.739\n", + 
"[15, 4000] loss: 0.794\n", + "[16, 2000] loss: 0.707\n", + "[16, 4000] loss: 0.753\n", + "[17, 2000] loss: 0.684\n", + "[17, 4000] loss: 0.734\n", + "[18, 2000] loss: 0.654\n", + "[18, 4000] loss: 0.705\n", + "[19, 2000] loss: 0.638\n", + "[19, 4000] loss: 0.684\n", + "[20, 2000] loss: 0.611\n", + "[20, 4000] loss: 0.656\n", + "Finished Training\n" + ] + } + ], + "source": [ + "# Model, Optimizer and Loss\n", + "model = AnimalNet() # initializes an instance of the AnimalNet model.\n", + "optimizer = optim.Adam(model.parameters(), lr=0.001) # This sets up the Adam optimizer to adjust the model’s parameters with a learning rate of 0.001.\n", + "# The Adam optimizer is a popular choice for training deep learning models due to its efficiency and effectiveness.\n", + "loss_fn = nn.CrossEntropyLoss() # specifies the loss function to be used, common for classification problems.\n", + "\n", + "for epoch in range(20): # loop over the dataset multiple times\n", + "\n", + " # Mini-batch Training:\n", + " running_loss = 0.0 # This variable keeps track of the cumulative loss within an epoch.\n", + " for i, data in enumerate(train_loader, 0):\n", + " # get the inputs; data is a list of [inputs, labels]\n", + " inputs, labels = data\n", + "\n", + " # zero the parameter gradients\n", + " optimizer.zero_grad() # Clears the gradients of all optimized parameters. 
\n", + " # This is important because by default, gradients are accumulated in PyTorch.\n", + "\n", + " # forward + backward + optimize\n", + " outputs = model(inputs) # Pass the data through the model to get the predicted outputs.\n", + " loss = loss_fn(outputs, labels) # Calculate the loss by comparing the model’s predictions with the actual labels.\n", + " loss.backward() # Calculate the gradients of the loss with respect to the model parameters.\n", + " optimizer.step() # Update the model parameters based on the computed gradients.\n", + "\n", + " # print statistics\n", + " running_loss += loss.item()\n", + " if i % 2000 == 1999: # print every 2000 mini-batches\n", + " print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / 2000:.3f}')\n", + " running_loss = 0.0\n", + "\n", + "print('Finished Training')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Evaluate the model\n", + "\n", + "Now that we have created our AnimalNet() model and trained it on data, we want to see how well it's performing:" + ] + }, + { + "cell_type": "code", + "execution_count": 77, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Accuracy: 50.666666666666664%\n" + ] + } + ], + "source": [ + "# Evaluating the Model\n", + "model.eval() # Set the model to evaluation mode\n", + "correct = 0\n", + "total = 0\n", + "with torch.no_grad(): # Disable gradient computation\n", + " for data, target in test_loader:\n", + " output = model(data)\n", + " _, predicted = torch.max(output.data, 1)\n", + " total += target.size(0)\n", + " correct += (predicted == target).sum().item()\n", + "\n", + "print(f'Accuracy: {100 * correct / total}%')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "~50% isn't terrible considering we only ran our model for a few minutes over 20 epochs. 
To improve this there are several things we can do:\n", + "- Run over many more epochs\n", + "- Try different optimizers, loss functions\n", + "- Try a bigger batch size, here we only had 6.\n", + "- Try a different layer set-up, for example adding convolution layers\n", + "\n", + "A lot of the time, creating the 'perfect' machine learning model for your task involves a bit of trial and error. There are always recommended options for the layers, loss function, and optimizer, but you will always need to tweak these. It's about testing and seeing which settings give you the optimal result." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now let's test our model on 6 images from the test set. First we'll plot the 'ground truth':" + ] + }, + { + "cell_type": "code", + "execution_count": 78, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "GroundTruth: cat frog frog frog cat dog \n" + ] + } + ], + "source": [ + "dataiter = iter(test_loader)\n", + "images, labels = next(dataiter)\n", + "\n", + "# print images\n", + "imshow(torchvision.utils.make_grid(images))\n", + "print('GroundTruth: ', ' '.join(f'{animal_classes[labels[j]]:5s}' for j in range(6)))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "And now plot what the model thinks those 6 images are:" + ] + }, + { + "cell_type": "code", + "execution_count": 79, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Predicted: cat deer frog frog cat horse\n" + ] + } + ], + "source": [ + "outputs = model(images)\n", + "\n", + "_, predicted = torch.max(outputs, 1)\n", + "\n", + "print('Predicted: ', ' '.join(f'{animal_classes[predicted[j]]:5s}'\n", + " for j in range(6)))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Our model actually got 4 out of 6 correct which is slightly better than the ~50% accuracy from the model evaluation step. 
To see a potential reason for the better accuracy, let's see how accurate the model is for each animal class it has been trained on:" + ] + }, + { + "cell_type": "code", + "execution_count": 80, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Accuracy for class: bird is 45.8 %\n", + "Accuracy for class: cat is 36.2 %\n", + "Accuracy for class: deer is 52.5 %\n", + "Accuracy for class: dog is 49.2 %\n", + "Accuracy for class: frog is 55.9 %\n", + "Accuracy for class: horse is 64.4 %\n" + ] + } + ], + "source": [ + "# prepare to count predictions for each class\n", + "correct_pred = {classname: 0 for classname in animal_classes}\n", + "total_pred = {classname: 0 for classname in animal_classes}\n", + "\n", + "# again no gradients needed\n", + "with torch.no_grad():\n", + " for data in test_loader:\n", + " images, labels = data\n", + " outputs = model(images)\n", + " _, predictions = torch.max(outputs, 1)\n", + " # collect the correct predictions for each class\n", + " for label, prediction in zip(labels, predictions):\n", + " if label == prediction:\n", + " correct_pred[animal_classes[label]] += 1\n", + " total_pred[animal_classes[label]] += 1\n", + "\n", + "\n", + "# print accuracy for each class\n", + "for classname, correct_count in correct_pred.items():\n", + " accuracy = 100 * float(correct_count) / total_pred[classname]\n", + " print(f'Accuracy for class: {classname:5s} is {accuracy:.1f} %')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Running this model on a GPU" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "GPUs offer a great advantage over CPUs when it comes to machine learning. Training and running your model on one or more GPUs can be significantly faster. Let's redo the above on a GPU instead and see what needs to be changed in the code. 
\n", + "\n", + "Note that for this code to work you will need to be using a machine that has CUDA installed, see [pytorch + cuda](https://pytorch.org). Thankfully CUDA is installed in some environments on Gadi like dk92: [dk92 NCI](https://opus.nci.org.au/display/DAE/RAPIDS)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Check that a GPU is available" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Check if a GPU is available and use it\n", + "device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We don't need to change anything about our AnimalNet class and AnimalCIFAR10 setup, so I won't repeat that here. \n", + "\n", + "Let's skip to the code where we initialize our model and do the training. There are two things we need to change here:\n", + "- We need to move our model to the GPU\n", + "- We need to move our data/inputs to the GPU" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "\n", + "##########!!!!!!!!###########\n", + "model = AnimalNet().to(device) # Move the model to the GPU\n", + "##########!!!!!!!!###########\n", + "\n", + "optimizer = optim.Adam(model.parameters(), lr=0.001)\n", + "loss_fn = nn.CrossEntropyLoss()\n", + "\n", + "for epoch in range(20):\n", + " running_loss = 0.0\n", + " for i, data in enumerate(train_loader, 0):\n", + " inputs, labels = data\n", + " \n", + " ##########!!!!!!!!###########\n", + " inputs, labels = inputs.to(device), labels.to(device) # Move the data to the GPU\n", + " ##########!!!!!!!!###########\n", + "\n", + " optimizer.zero_grad()\n", + "\n", + " outputs = model(inputs)\n", + " loss = loss_fn(outputs, labels)\n", + " loss.backward()\n", + " optimizer.step()\n", + "\n", + " running_loss += loss.item()\n", + " if i % 2000 == 1999:\n", + " print(f'[{epoch + 1}, 
{i + 1:5d}] loss: {running_loss / 2000:.3f}')\n", + " running_loss = 0.0\n", + "\n", + "print('Finished Training')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Then when we want to evaluate the model:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "model.eval()\n", + "correct = 0\n", + "total = 0\n", + "with torch.no_grad():\n", + " for data, target in test_loader:\n", + " ##########!!!!!!!!###########\n", + " data, target = data.to(device), target.to(device) # Move testing data to the GPU\n", + " ##########!!!!!!!!###########\n", + " output = model(data)\n", + " _, predicted = torch.max(output.data, 1)\n", + " total += target.size(0)\n", + " correct += (predicted == target).sum().item()\n", + "\n", + "print(f'Accuracy: {100 * correct / total}%')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "And finally, to test the model on some of the testing images you just need to transfer those images to the GPU to pass to the model:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Run the model on the test images\n", + "##########!!!!!!!!###########\n", + "outputs = model(images.to(device))\n", + "##########!!!!!!!!###########\n", + "\n", + "_, predicted = torch.max(outputs, 1)\n", + "\n", + "print('Predicted: ', ' '.join(f'{animal_classes[predicted[j]]:5s}'\n", + " for j in range(6)))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## More Info\n", + "\n", + "- PyTorch website: https://pytorch.org/tutorials/\n", + "\n", + "- Using PyTorch to downscale an Evapotranspiration dataset (built by Sanaa + Sam): https://github.com/coecms/Hybrid_downscaling" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + 
"source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": 39, + "metadata": {}, + "outputs": [], + "source": [ + "num_models = 10\n", + "models = [AnimalNet() for _ in range(num_models)]" + ] + }, + { + "cell_type": "code", + "execution_count": 40, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Model 1 trained.\n", + "Model 2 trained.\n", + "Model 3 trained.\n", + "Model 4 trained.\n", + "Model 5 trained.\n", + "Model 6 trained.\n", + "Model 7 trained.\n", + "Model 8 trained.\n", + "Model 9 trained.\n", + "Model 10 trained.\n" + ] + } + ], + "source": [ + "for i, model in enumerate(models):\n", + " optimizer = optim.Adam(model.parameters(), lr=0.001)\n", + " loss_fn = nn.CrossEntropyLoss()\n", + " model.train() # Set the model to training mode\n", + "\n", + " for epoch in range(15): # Number of epochs\n", + " for data, target in 
train_loader:\n", + " optimizer.zero_grad() # Clear the gradients\n", + " output = model(data) # Pass the data through the model\n", + " loss = loss_fn(output, target) # Calculate the loss\n", + " loss.backward() # Backpropagate the error\n", + " optimizer.step() # Update the weights\n", + "\n", + " print(f'Model {i+1} trained.')" + ] + }, + { + "cell_type": "code", + "execution_count": 41, + "metadata": {}, + "outputs": [], + "source": [ + "def average_predictions(models, data_loader):\n", + " total_preds = None\n", + " for m in models: m.eval() # Set every model in the ensemble to evaluation mode\n", + "\n", + " with torch.no_grad():\n", + " for data, _ in data_loader:\n", + " outputs = [model(data) for model in models]\n", + " # Stack outputs to create a new dimension and then take the mean across models\n", + " outputs = torch.stack(outputs, dim=0).mean(dim=0)\n", + " \n", + " if total_preds is None:\n", + " total_preds = outputs\n", + " else:\n", + " total_preds = torch.cat((total_preds, outputs), dim=0)\n", + "\n", + " return total_preds\n", + "\n", + "# Get average predictions on the test set\n", + "average_preds = average_predictions(models, test_loader)\n", + "\n", + "# If using logits, apply softmax to convert to probabilities\n", + "probabilities = F.softmax(average_preds, dim=1)\n", + "predicted_classes = probabilities.argmax(dim=1)" + ] + }, + { + "cell_type": "code", + "execution_count": 42, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Ensemble accuracy: 60.17%\n" + ] + } + ], + "source": [ + "correct = 0\n", + "total = 0\n", + "\n", + "with torch.no_grad():\n", + " for data, target in test_loader:\n", + " output = average_predictions(models, [(data, target)])\n", + " _, predicted = torch.max(output.data, 1)\n", + " total += target.size(0)\n", + " correct += (predicted == target).sum().item()\n", + "\n", + "print(f'Ensemble accuracy: {100 * correct / total}%')" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": 
"netsc", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.3" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} From fa6f809bd63722bd14d05dddf42703fbffb62057 Mon Sep 17 00:00:00 2001 From: Sam Green Date: Tue, 21 May 2024 15:53:59 +1000 Subject: [PATCH 2/2] corrections based on comments --- posts/2024-05-20-pytorch_intro.ipynb | 73 ++++++++++++++++++---------- 1 file changed, 46 insertions(+), 27 deletions(-) diff --git a/posts/2024-05-20-pytorch_intro.ipynb b/posts/2024-05-20-pytorch_intro.ipynb index 1f9cf94..b8a73f8 100644 --- a/posts/2024-05-20-pytorch_intro.ipynb +++ b/posts/2024-05-20-pytorch_intro.ipynb @@ -219,17 +219,9 @@ }, { "cell_type": "code", - "execution_count": 17, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Device tensor is stored on: cpu\n" - ] - } - ], + "outputs": [], "source": [ "if torch.cuda.is_available():\n", " a = a.to('cuda')\n", @@ -238,6 +230,13 @@ "print(f\"Device tensor is stored on: {a.device}\")" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "```Device tensor is stored on: cuda```" + ] + }, { "cell_type": "markdown", "metadata": {}, @@ -264,9 +263,9 @@ "metadata": {}, "source": [ "- Tensors: As discussed earlier, tensors are the fundamental data structures in PyTorch, similar to matrices but with the ability to store data in higher dimensions. They are used to store the inputs, outputs, and parameters of a model.\n", - "- Modules: In PyTorch, every neural network is derived from the nn.Module base class. A module can contain other modules, allowing to nest them in a tree structure. 
This modular design provides great flexibility when designing models.\n", - "- Parameters: Parameters are tensor subclasses that have a very special property — they are automatically added to the list of its module’s parameters, and will be considered by optimizers.\n", "\n", + "- Modules: In PyTorch, every neural network is derived from the nn.Module base class. A module is a building block for neural networks; it encapsulates parameters and provides a way of organizing computations. This can include layers, methods to set parameters, forward and backward propagations, and more. Essentially, it’s a self-contained component that defines how data should be processed.\n", + "- Parameters: Parameters are tensor subclasses with a special property: when assigned as attributes of a module, they are automatically added to that module’s list of parameters, and will be considered by optimizers.\n", "- Optimizers: PyTorch includes several optimization algorithms in torch.optim, like SGD, Adam, and RMSprop, which are used to update weights during training according to the gradients computed during backpropagation." ] }, @@ -281,7 +280,17 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "For this example, we can use a dataset like CIFAR-10, which contains images of different animal and vehicle classes. Here's how you can adjust the network, data loading, and training setup for animal image classification." + "The best way to introduce you to neural networks is to go through an example. Here we will build a simple animal image classification network called ```AnimalNet```. \n", + "\n", + "To build an animal image classifier there are several steps that we are going to follow:\n", + "\n", + "- First import all the necessary libraries.\n", + "- Define the network architecture by subclassing ```nn.Module```\n", + "- Load the data that we are going to use for training and testing. 
For this example, we can use a dataset like [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html), which contains images of different animal and vehicle classes.\n", + "- Train the network with an optimizer and loss function. \n", + "- Then we will evaluate our model and test it on some animal images.\n", + "\n", + "For this example I won't go into the technical details of machine learning; this is more an introduction to how PyTorch works. I will try to include useful links along the way for further reading (e.g. loss functions, optimisers, etc.)" ] }, { @@ -314,7 +323,9 @@ "source": [ "### Network Definition:\n", "\n", - "Next, define the network architecture by subclassing ```nn.Module```, and initialize the neural network layers in ```__init__```. Implement the forward pass in the forward method." + "Next, define the network architecture by subclassing ```nn.Module```, and initialize the neural network layers in ```__init__```. Implement the forward pass in the forward method.\n", + "\n", + "These neural network layers are also known as [convolution layers](https://www.geeksforgeeks.org/introduction-convolution-neural-network/)." ] }, { @@ -350,7 +361,9 @@ " x = F.relu(self.fc3(x))\n", "\n", " # Output Layer and Softmax Activation:\n", - " return F.log_softmax(self.fc4(x), dim=1) # applies the final linear transformation to map the representations learned by the network to the number of classes in the task (10 in this case)." + " return F.log_softmax(self.fc4(x), dim=1) \n", + " # This applies the final linear transformation to map the representations learned by the network to the number \n", + " # of classes in the task (10 in this case)." ] }, { @@ -378,9 +391,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "- The forward() function is an important part of defining neural networks in PyTorch. 
It specifies how the input data is transformed as it passes through the network, essentially defining the computation that occurs within the network. This is where you apply layers, activation functions, and other computational steps to the input tensors.\n", + "- The ```forward()``` function is an important part of defining neural networks in PyTorch. It specifies how the input data is transformed as it passes through the network, essentially defining the computation that occurs within the network. This is where you apply layers, activation functions, and other computational steps to the input tensors.\n", "\n", - "- ```F.relu()``` is the Rectified Linear Activation Function (ReLU) applied element-wise. It introduces non-linearity into the model, allowing it to learn more complex patterns. ReLU is defined as ```f(x) = max(0, x)``` , which sets all negative values in the input tensor to zero.\n", + "- ```F.relu()``` is the Rectified Linear Activation Function [(ReLU)](https://www.geeksforgeeks.org/activation-functions-in-pytorch/) applied element-wise. It introduces non-linearity into the model, allowing it to learn more complex patterns. ReLU is defined as ```f(x) = max(0, x)``` , which sets all negative values in the input tensor to zero.\n", "\n", "- ```F.log_softmax()``` is applied to the output of the final layer. This function computes the logarithm of the softmax of the input tensor. Softmax converts the logits (raw predictions) into probabilities by taking the exponentials of each output and then normalizing these values by dividing by the sum of all exponentials; this ensures that the output values are between 0 and 1 and sum to 1.\n", "\n", @@ -394,7 +407,8 @@ "### Data Loading and Preprocessing:\n", "\n", "One of the main tasks when building your machine learning model is preparing your training and testing data.\n", - "Here we need to adjust the data loading to use the CIFAR-10 dataset, which includes classes like birds, cats, dogs, and other animals." 
+ "\n", + "In this example we are using the [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) dataset. This dataset contains images of animals but it also contains images of vehicles like cars/trains/planes/etc. We need to filter this dataset so that it only contains the classes we want, like birds, cats, dogs, and other animals." ] }, { @@ -442,7 +456,7 @@ } ], "source": [ - "# Data loading and transformation\n", + "# Data loading into tensors and transformation\n", "transform = transforms.Compose([\n", " transforms.ToTensor(),\n", " transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)) # Normalize the RGB channels\n", @@ -519,7 +533,12 @@ "source": [ "### Training the Network\n", "\n", - "Set up the optimizer, define the loss function, and implement the training loop. " + "Set up the optimizer, define the loss function, and implement the training loop. \n", + "\n", + "Some details: \n", + "\n", + "- We are using the [Adam optimizer](https://www.geeksforgeeks.org/adam-optimizer/?ref=header_search), which is a common choice for image classification.\n", + "- Our loss function is the [Cross Entropy Loss](https://www.geeksforgeeks.org/what-is-cross-entropy-loss-function/?ref=header_search) function." ] }, { @@ -663,7 +682,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Now lets test our model on 6 random images, first we'll plot the 'ground truth':" + "Now let's test our model on 6 random images. First we'll plot the 'ground truth':" ] }, { @@ -702,7 +721,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "And now plot what the model thinks those 6 images are:" + "And now we plot what the model thinks those 6 images are:" ] }, { @@ -731,7 +750,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Our model actually got 4 out of 6 correct which is slightly better than the ~50% accuracy from the model evaluation step. 
To see a potential reason for the better accuracy lets see how accurate the model is for each animal class it has been trained on:" + "Our model actually got 4 out of 6 correct, which is slightly better than the ~50% accuracy from the model evaluation step. To see a potential reason for the better accuracy, let's see how accurate the model is for each animal class it has been trained on:" ] }, { @@ -787,7 +806,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "GPUs offer a great advantage over CPUs when it comes to machine leanring. Training and running your model on a GPU/s can be significantly faster. Lets redo the above but on a GPU instead and see what needs to be changed in the code. \n", + "GPUs offer a great advantage over CPUs when it comes to machine learning. Training and running your model on one or more GPUs can be significantly faster. Let's redo the above but on a GPU instead and see what needs to be changed in the code. \n", "\n", "Note that for this code to work you will need to be using a machine that has CUDA installed, see [pytorch + cuda](https://pytorch.org). Thankfully CUDA is installed in some environments on GADI like dk92: [dk92 NCI](https://opus.nci.org.au/display/DAE/RAPIDS)" ] @@ -813,9 +832,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "We don't need to change anything about our AnimalNet Class and AnimalCIFAR10 setup so I wont add that here again. \n", + "We don't need to change anything about our AnimalNet Class and AnimalCIFAR10 setup so I won't add that here again. \n", "\n", - "Let's skip to the code where we initialize our model and do the training. There's 2 things we need to change here:\n", + "Let's skip to the code where we initialize our model and do the training. 
There are 2 things we need to change here:\n", "- We need to move our model to the GPU\n", "- We need to move out data/inputs to the GPU" ] @@ -891,7 +910,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "And finally, to test the model on some of the teasting images you just need to transfer those images to teh GPU to pass to the model:" + "And finally, to test the model on some of the testing images you just need to transfer those images to the GPU to pass to the model:" ] }, {