diff --git a/README.md b/README.md index 4748a0c9..8fe63c53 100644 --- a/README.md +++ b/README.md @@ -38,7 +38,8 @@ Supported precision Supported chipsets * [Snapdragon 845](https://www.qualcomm.com/products/mobile/snapdragon/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-845-mobile-platform), [Snapdragon 855/855+](https://www.qualcomm.com/products/mobile/snapdragon/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-855-mobile-platform), [Snapdragon 865/865+](https://www.qualcomm.com/products/mobile/snapdragon/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-865-plus-5g-mobile-platform), [Snapdragon 888/888+](https://www.qualcomm.com/products/mobile/snapdragon/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-888-5g-mobile-platform) -* [Snapdragon 8 Gen 1](https://www.qualcomm.com/products/mobile/snapdragon/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-8-gen-1-mobile-platform), [Snapdragon 8 Gen 2](https://www.qualcomm.com/products/mobile/snapdragon/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-8-gen-2-mobile-platform), [Snapdragon 8 Gen 3](https://www.qualcomm.com/products/mobile/snapdragon/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-8-gen-3-mobile-platform), [Snapdragon X Elite](https://www.qualcomm.com/products/mobile/snapdragon/pcs-and-tablets/snapdragon-x-elite) +* [Snapdragon 8 Elite](https://www.qualcomm.com/products/mobile/snapdragon/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-8-elite-mobile-platform), [Snapdragon 8 Gen 3](https://www.qualcomm.com/products/mobile/snapdragon/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-8-gen-3-mobile-platform), [Snapdragon 8 Gen 2](https://www.qualcomm.com/products/mobile/snapdragon/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-8-gen-2-mobile-platform), [Snapdragon 8 Gen 1](https://www.qualcomm.com/products/mobile/snapdragon/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-8-gen-1-mobile-platform) +* [Snapdragon X Elite](https://www.qualcomm.com/products/mobile/snapdragon/pcs-and-tablets/snapdragon-x-elite) Select supported devices * Samsung Galaxy S21 Series, Galaxy S22 Series, Galaxy S23 Series, Galaxy S24 Series @@ -275,6 +276,7 @@ Qualcomm® AI Hub Models is licensed under BSD-3. 
See the [LICENSE file](../LICE | [ConvNext-Tiny-w8a16-Quantized](https://aihub.qualcomm.com/models/convnext_tiny_w8a16_quantized) | [qai_hub_models.models.convnext_tiny_w8a16_quantized](qai_hub_models/models/convnext_tiny_w8a16_quantized/README.md) | ✔️ | ✔️ | ✔️ | [ConvNext-Tiny-w8a8-Quantized](https://aihub.qualcomm.com/models/convnext_tiny_w8a8_quantized) | [qai_hub_models.models.convnext_tiny_w8a8_quantized](qai_hub_models/models/convnext_tiny_w8a8_quantized/README.md) | ✔️ | ✔️ | ✔️ | [DenseNet-121](https://aihub.qualcomm.com/models/densenet121) | [qai_hub_models.models.densenet121](qai_hub_models/models/densenet121/README.md) | ✔️ | ✔️ | ✔️ +| [DenseNet-121-Quantized](https://aihub.qualcomm.com/models/densenet121_quantized) | [qai_hub_models.models.densenet121_quantized](qai_hub_models/models/densenet121_quantized/README.md) | ✔️ | ✔️ | ✔️ | [EfficientNet-B0](https://aihub.qualcomm.com/models/efficientnet_b0) | [qai_hub_models.models.efficientnet_b0](qai_hub_models/models/efficientnet_b0/README.md) | ✔️ | ✔️ | ✔️ | [GoogLeNet](https://aihub.qualcomm.com/models/googlenet) | [qai_hub_models.models.googlenet](qai_hub_models/models/googlenet/README.md) | ✔️ | ✔️ | ✔️ | [GoogLeNetQuantized](https://aihub.qualcomm.com/models/googlenet_quantized) | [qai_hub_models.models.googlenet_quantized](qai_hub_models/models/googlenet_quantized/README.md) | ✔️ | ✔️ | ✔️ @@ -306,6 +308,7 @@ Qualcomm® AI Hub Models is licensed under BSD-3. See the [LICENSE file](../LICE | [Swin-Small](https://aihub.qualcomm.com/models/swin_small) | [qai_hub_models.models.swin_small](qai_hub_models/models/swin_small/README.md) | ✔️ | ✔️ | ✔️ | [Swin-Tiny](https://aihub.qualcomm.com/models/swin_tiny) | [qai_hub_models.models.swin_tiny](qai_hub_models/models/swin_tiny/README.md) | ✔️ | ✔️ | ✔️ | [VIT](https://aihub.qualcomm.com/models/vit) | [qai_hub_models.models.vit](qai_hub_models/models/vit/README.md) | ✔️ | ✔️ | ✔️ +| [VITQuantized](https://aihub.qualcomm.com/models/vit_quantized) | [qai_hub_models.models.vit_quantized](qai_hub_models/models/vit_quantized/README.md) | ✔️ | ✔️ | ✔️ | [WideResNet50](https://aihub.qualcomm.com/models/wideresnet50) | [qai_hub_models.models.wideresnet50](qai_hub_models/models/wideresnet50/README.md) | ✔️ | ✔️ | ✔️ | [WideResNet50-Quantized](https://aihub.qualcomm.com/models/wideresnet50_quantized) | [qai_hub_models.models.wideresnet50_quantized](qai_hub_models/models/wideresnet50_quantized/README.md) | ✔️ | ✔️ | ✔️ | | | | | @@ -359,7 +362,9 @@ Qualcomm® AI Hub Models is licensed under BSD-3. 
See the [LICENSE file](../LICE | [MediaPipe-Face-Detection](https://aihub.qualcomm.com/models/mediapipe_face) | [qai_hub_models.models.mediapipe_face](qai_hub_models/models/mediapipe_face/README.md) | ✔️ | ✔️ | ✔️ | [MediaPipe-Face-Detection-Quantized](https://aihub.qualcomm.com/models/mediapipe_face_quantized) | [qai_hub_models.models.mediapipe_face_quantized](qai_hub_models/models/mediapipe_face_quantized/README.md) | ✔️ | ✔️ | ✔️ | [MediaPipe-Hand-Detection](https://aihub.qualcomm.com/models/mediapipe_hand) | [qai_hub_models.models.mediapipe_hand](qai_hub_models/models/mediapipe_hand/README.md) | ✔️ | ✔️ | ✔️ -| [YOLOv11-Detection](qai_hub_models/models/yolov11_det/README.md) | [qai_hub_models.models.yolov11_det](qai_hub_models/models/yolov11_det/README.md) | ✔️ | ✔️ | ✔️ +| [PPE-Detection](https://aihub.qualcomm.com/models/gear_guard_net) | [qai_hub_models.models.gear_guard_net](qai_hub_models/models/gear_guard_net/README.md) | ✔️ | ✔️ | ✔️ +| [Person-Foot-Detection](https://aihub.qualcomm.com/models/foot_track_net) | [qai_hub_models.models.foot_track_net](qai_hub_models/models/foot_track_net/README.md) | ✔️ | ✔️ | ✔️ +| [YOLOv11-Detection](https://aihub.qualcomm.com/models/yolov11_det) | [qai_hub_models.models.yolov11_det](qai_hub_models/models/yolov11_det/README.md) | ✔️ | ✔️ | ✔️ | [YOLOv8-Detection](https://aihub.qualcomm.com/models/yolov8_det) | [qai_hub_models.models.yolov8_det](qai_hub_models/models/yolov8_det/README.md) | ✔️ | ✔️ | ✔️ | [YOLOv8-Detection-Quantized](https://aihub.qualcomm.com/models/yolov8_det_quantized) | [qai_hub_models.models.yolov8_det_quantized](qai_hub_models/models/yolov8_det_quantized/README.md) | ✔️ | ✔️ | ✔️ | [Yolo-NAS](https://aihub.qualcomm.com/models/yolonas) | [qai_hub_models.models.yolonas](qai_hub_models/models/yolonas/README.md) | ✔️ | ✔️ | ✔️ @@ -369,7 +374,7 @@ Qualcomm® AI Hub Models is licensed under BSD-3. See the [LICENSE file](../LICE | [Yolo-v7-Quantized](https://aihub.qualcomm.com/models/yolov7_quantized) | [qai_hub_models.models.yolov7_quantized](qai_hub_models/models/yolov7_quantized/README.md) | ✔️ | ✔️ | ✔️ | | | | | | **Pose Estimation** -| [FaceMap_3DMM](qai_hub_models/models/facemap_3dmm/README.md) | [qai_hub_models.models.facemap_3dmm](qai_hub_models/models/facemap_3dmm/README.md) | ✔️ | ✔️ | ✔️ +| [Facial-Landmark-Detection](https://aihub.qualcomm.com/models/facemap_3dmm) | [qai_hub_models.models.facemap_3dmm](qai_hub_models/models/facemap_3dmm/README.md) | ✔️ | ✔️ | ✔️ | [HRNetPose](https://aihub.qualcomm.com/models/hrnet_pose) | [qai_hub_models.models.hrnet_pose](qai_hub_models/models/hrnet_pose/README.md) | ✔️ | ✔️ | ✔️ | [HRNetPoseQuantized](https://aihub.qualcomm.com/models/hrnet_pose_quantized) | [qai_hub_models.models.hrnet_pose_quantized](qai_hub_models/models/hrnet_pose_quantized/README.md) | ✔️ | ✔️ | ✔️ | [LiteHRNet](https://aihub.qualcomm.com/models/litehrnet) | [qai_hub_models.models.litehrnet](qai_hub_models/models/litehrnet/README.md) | ✔️ | ✔️ | ✔️ @@ -413,6 +418,15 @@ Qualcomm® AI Hub Models is licensed under BSD-3. 
See the [LICENSE file](../LICE | [Stable-Diffusion-v2.1](https://aihub.qualcomm.com/models/stable_diffusion_v2_1_quantized) | [qai_hub_models.models.stable_diffusion_v2_1_quantized](qai_hub_models/models/stable_diffusion_v2_1_quantized/README.md) | ✔️ | ✔️ | ✔️ | | | | | | **Text Generation** -| [Baichuan-7B](https://aihub.qualcomm.com/models/baichuan_7b_quantized) | [qai_hub_models.models.baichuan_7b_quantized](qai_hub_models/models/baichuan_7b_quantized/README.md) | ✔️ | ✔️ | ✔️ +| [Baichuan2-7B](https://aihub.qualcomm.com/models/baichuan2_7b_quantized) | [qai_hub_models.models.baichuan2_7b_quantized](qai_hub_models/models/baichuan2_7b_quantized/README.md) | ✔️ | ✔️ | ✔️ +| [IBM-Granite-3B-Code-Instruct](https://aihub.qualcomm.com/models/ibm_granite_3b_code_instruct) | [qai_hub_models.models.ibm_granite_3b_code_instruct](qai_hub_models/models/ibm_granite_3b_code_instruct/README.md) | ✔️ | ✔️ | ✔️ +| [IndusQ-1.1B](https://aihub.qualcomm.com/models/indus_1b_quantized) | [qai_hub_models.models.indus_1b_quantized](qai_hub_models/models/indus_1b_quantized/README.md) | ✔️ | ✔️ | ✔️ +| [JAIS-6p7b-Chat](https://aihub.qualcomm.com/models/jais_6p7b_chat_quantized) | [qai_hub_models.models.jais_6p7b_chat_quantized](qai_hub_models/models/jais_6p7b_chat_quantized/README.md) | ✔️ | ✔️ | ✔️ | [Llama-v2-7B-Chat](https://aihub.qualcomm.com/models/llama_v2_7b_chat_quantized) | [qai_hub_models.models.llama_v2_7b_chat_quantized](qai_hub_models/models/llama_v2_7b_chat_quantized/README.md) | ✔️ | ✔️ | ✔️ | [Llama-v3-8B-Chat](https://aihub.qualcomm.com/models/llama_v3_8b_chat_quantized) | [qai_hub_models.models.llama_v3_8b_chat_quantized](qai_hub_models/models/llama_v3_8b_chat_quantized/README.md) | ✔️ | ✔️ | ✔️ +| [Llama-v3.1-8B-Chat](https://aihub.qualcomm.com/models/llama_v3_1_8b_chat_quantized) | [qai_hub_models.models.llama_v3_1_8b_chat_quantized](qai_hub_models/models/llama_v3_1_8b_chat_quantized/README.md) | ✔️ | ✔️ | ✔️ +| [Llama-v3.2-3B-Chat](https://aihub.qualcomm.com/models/llama_v3_2_3b_chat_quantized) | [qai_hub_models.models.llama_v3_2_3b_chat_quantized](qai_hub_models/models/llama_v3_2_3b_chat_quantized/README.md) | ✔️ | ✔️ | ✔️ +| [Mistral-3B](https://aihub.qualcomm.com/models/mistral_3b_quantized) | [qai_hub_models.models.mistral_3b_quantized](qai_hub_models/models/mistral_3b_quantized/README.md) | ✔️ | ✔️ | ✔️ +| [Mistral-7B-Instruct-v0.3](https://aihub.qualcomm.com/models/mistral_7b_instruct_v0_3_quantized) | [qai_hub_models.models.mistral_7b_instruct_v0_3_quantized](qai_hub_models/models/mistral_7b_instruct_v0_3_quantized/README.md) | ✔️ | ✔️ | ✔️ +| [PLaMo-1B](https://aihub.qualcomm.com/models/plamo_1b_quantized) | [qai_hub_models.models.plamo_1b_quantized](qai_hub_models/models/plamo_1b_quantized/README.md) | ✔️ | ✔️ | ✔️ +| [Qwen2-7B-Instruct](https://aihub.qualcomm.com/models/qwen2_7b_instruct_quantized) | [qai_hub_models.models.qwen2_7b_instruct_quantized](qai_hub_models/models/qwen2_7b_instruct_quantized/README.md) | ✔️ | ✔️ | ✔️ diff --git a/qai_hub_models/_version.py b/qai_hub_models/_version.py index 572c45a4..978ed91f 100644 --- a/qai_hub_models/_version.py +++ b/qai_hub_models/_version.py @@ -2,4 +2,4 @@ # Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. 
# SPDX-License-Identifier: BSD-3-Clause # --------------------------------------------------------------------- -__version__ = "0.15.0" +__version__ = "0.16.2" diff --git a/qai_hub_models/asset_bases.yaml b/qai_hub_models/asset_bases.yaml index 852e96cd..7fe290f8 100644 --- a/qai_hub_models/asset_bases.yaml +++ b/qai_hub_models/asset_bases.yaml @@ -12,3 +12,4 @@ huggingface_path: qualcomm/{model_name} models_website_url: https://aihub.qualcomm.com models_website_relative_path: models/{model_id} email_template: qai_hub_models/scripts/templates/email_template.txt +genie_url: https://github.com/quic/ai-hub-apps/tree/main/tutorials/llm_on_genie diff --git a/qai_hub_models/conftest.py b/qai_hub_models/conftest.py index 9dd11824..f63d7819 100644 --- a/qai_hub_models/conftest.py +++ b/qai_hub_models/conftest.py @@ -4,6 +4,7 @@ # --------------------------------------------------------------------- def pytest_configure(config): config.addinivalue_line("markers", "compile: Run compile tests.") + config.addinivalue_line("markers", "quantize: Run quantize tests.") config.addinivalue_line("markers", "profile: Run profile tests.") config.addinivalue_line("markers", "inference: Run inference tests.") config.addinivalue_line("markers", "trace: Run trace accuracy tests.") diff --git a/qai_hub_models/models/_shared/body_detection/app.py b/qai_hub_models/models/_shared/body_detection/app.py new file mode 100644 index 00000000..9a39326d --- /dev/null +++ b/qai_hub_models/models/_shared/body_detection/app.py @@ -0,0 +1,171 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. +# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +from typing import Callable, List + +import numpy as np +import torch + +from qai_hub_models.utils.asset_loaders import load_image +from qai_hub_models.utils.bounding_box_processing import batched_nms, box_xywh_to_xyxy +from qai_hub_models.utils.image_processing import resize_pad + + +def preprocess(img: np.ndarray, height: int, width: int): + """ + Preprocess model input. + + Inputs: + img: np.ndarray + Input image of shape [H, W, C] + height: int + Model input height. + width: int + Model input width + Outputs: + input: torch.Tensor + Preprocessed model input. Shape is (1, C, H, W) + scale: float + Scaling factor of input image and network input image. + pad: List[float] + Top and left padding size. + """ + img = torch.from_numpy(img).permute(2, 0, 1).unsqueeze_(0) / 255.0 + input, scale, pad = resize_pad(img, (height, width)) + return input, scale, pad + + +def decode(output: List[torch.Tensor], thr: float) -> np.ndarray: + """ + Decode model output to bounding boxes, class indices and scores. + + Inputs: + output: List[torch.Tensor] + Model output. + thr: float + Detection threshold. Predictions lower than the thresholds will be discarded. + Outputs: np.ndarray + Detection results. Shape is (N, 6). N is the number of detected objects. 
Each object is + represented by (class, x1, y1, x2, y2, score) + """ + anchors = [ + [[10, 13], [16, 30], [33, 23]], + [[30, 61], [62, 45], [59, 119]], + [[116, 90], [156, 198], [373, 326]], + ] + strides = (8, 16, 32) + result = [] + for s, out in enumerate(output): + b, h, w, c = out.shape + out = out.reshape(b, h, w, 3, -1) + _, ny, nx, na = out.shape[:-1] + for y in np.arange(ny): + for x in np.arange(nx): + for a in np.arange(na): + pred = out[0, y, x, a] + obj_score = pred[4].sigmoid() + cls_score = pred[5:].max().sigmoid() + score = obj_score * cls_score + if score < thr: + continue + c = np.argmax(pred[5:]) + bx = (pred[0].sigmoid() * 2 - 0.5 + x) * strides[s] + by = (pred[1].sigmoid() * 2 - 0.5 + y) * strides[s] + bw = 4 * pred[2].sigmoid() ** 2 * anchors[s][a][0] + bh = 4 * pred[3].sigmoid() ** 2 * anchors[s][a][1] + + boxes = box_xywh_to_xyxy( + torch.from_numpy(np.array([[[bx, by], [bw, bh]]])) + ) + x1 = boxes[0][0][0].round() + y1 = boxes[0][0][1].round() + x2 = boxes[0][1][0].round() + y2 = boxes[0][1][1].round() + result.append([c, x1, y1, x2, y2, score]) + return np.array(result, dtype=np.float32) + + +def postprocess( + output: List[torch.Tensor], + scale: float, + pad: List[int], + conf_thr: float, + iou_thr: float, +) -> np.ndarray: + """ + Post process model output. + Inputs: + output: List[torch.Tensor] + Multi-scale model output. + scale: float + Scaling factor from input image and model input. + pad: List[int] + Padding sizes from input image and model input. + conf_thr: float + Confidence threshold of detections. + iou_thr: float + IoU threshold for non maximum suppression. + Outputs: np.ndarray + Detected object. Shape is (N, 6). N is the number of detected objects. Each object is + represented by (class, x1, y1, x2, y2, score) + """ + result = decode(output, conf_thr) + + result_final = [] + for c in [0, 1]: + idx = result[:, 0] == c + boxes, scores = batched_nms( + iou_thr, + 0, + torch.from_numpy(result[idx, 1:5]).unsqueeze_(0), + torch.from_numpy(result[idx, -1]).unsqueeze_(0), + ) + scores[0].unsqueeze_(-1) + result_final.append( + torch.concat([torch.zeros_like(scores[0]) + c, boxes[0], scores[0]], 1) + ) + result_final = torch.concat(result_final).numpy() + result_final[:, 1:5] = ( + (result_final[:, 1:5] - np.array([pad[0], pad[1], pad[0], pad[1]])) / scale + ).round() + return result_final + + +class BodyDetectionApp: + """Body detection application""" + + def __init__(self, model: Callable[[torch.Tensor], torch.Tensor]) -> None: + """ + Initialize BodyDetectionApp. + + Inputs: + model: Callable[[torch.Tensor], torch.Tensor] + Detection model. + """ + self.model = model + + def detect(self, imgfile: str, height: int, width: int, conf: float) -> np.ndarray: + """ + Detect objects from input images. + + Inputs: + imgfile: str + Input image file + height: int + Model input height. + width: int + Model input width. + conf: float + Detection threshold. + Outputs: np.ndarray + Detection result. Shape is (N, 6). N is the number of detected objects. 
Each object is represented by + (cls_id, x1, y1, x2, y2, score) + """ + img = np.array(load_image(imgfile)) + input, scale, pad = preprocess(img, height, width) + output = self.model(input) + for t, o in enumerate(output): + output[t] = o.permute(0, 2, 3, 1).detach() + result = postprocess(output, scale, pad, conf, 0.5) + return result diff --git a/qai_hub_models/models/_shared/body_detection/demo.py b/qai_hub_models/models/_shared/body_detection/demo.py new file mode 100644 index 00000000..4b06fa9a --- /dev/null +++ b/qai_hub_models/models/_shared/body_detection/demo.py @@ -0,0 +1,93 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. +# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +from copy import deepcopy + +import numpy as np +import PIL.Image as Image +import torch.nn as nn + +from qai_hub_models.models._shared.body_detection.app import BodyDetectionApp +from qai_hub_models.utils.args import ( + demo_model_from_cli_args, + get_model_cli_parser, + get_on_device_demo_parser, + validate_on_device_demo_args, +) +from qai_hub_models.utils.asset_loaders import load_image +from qai_hub_models.utils.display import display_or_save_image +from qai_hub_models.utils.draw import draw_box_from_corners + + +def plot_result(img: np.ndarray, result: np.ndarray): + """ + Plot detection result. + + Inputs: + img: np.ndarray + Input image. + result: np.ndarray + Detection result. + """ + box_color = ((255, 0, 0), (0, 255, 0)) + for r in result: + corners = np.array( + [[r[1], r[2]], [r[1], r[4]], [r[3], r[2]], [r[3], r[4]]] + ).astype(int) + draw_box_from_corners(img, corners, box_color[int(r[0])]) + return img + + +def BodyDetectionDemo( + is_test: bool, + model_name: nn.Module, + model_id: str, + app_name: BodyDetectionApp, + imgfile: str, + height: int, + width: int, + conf: float, +) -> None: + """ + Object detection demo. + + Input: + is_test: bool. + Is test + model_name: nn.Module + Object detection model. + model_id: str. + Model ID + app_name: BodyDetectionApp + Object detection app. + imgfile: str: + Image file path. + height: int + Input image height. + width: int + Input image width. + conf: float + Detection confidence. + """ + parser = get_model_cli_parser(model_name) + parser = get_on_device_demo_parser(parser, add_output_dir=True) + parser.add_argument( + "--image", + type=str, + default=imgfile, + help="image file path or URL", + ) + args = parser.parse_args([] if is_test else None) + model = demo_model_from_cli_args(model_name, model_id, args) + validate_on_device_demo_args(args, model_id) + + app = app_name(model) + result = app.detect(args.image, height, width, conf) + + if not is_test: + img = np.array(load_image(args.image)) + image_annotated = plot_result(deepcopy(img), result) + display_or_save_image( + Image.fromarray(image_annotated), args.output_dir, "result.jpg" + ) diff --git a/qai_hub_models/models/_shared/body_detection/model.py b/qai_hub_models/models/_shared/body_detection/model.py new file mode 100644 index 00000000..4ef9583d --- /dev/null +++ b/qai_hub_models/models/_shared/body_detection/model.py @@ -0,0 +1,524 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. 
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+import math
+from copy import deepcopy
+from typing import List
+
+import torch
+import torch.nn as nn
+
+
+def make_divisible(x: int, divisor: int) -> int:
+    """
+    Compute the closest number that is larger than or equal to X and is divisible by DIVISOR.
+
+    Inputs:
+        x: int
+            Input integer.
+        divisor: int
+            Divisor for the input number.
+    Outputs: int
+        Closest number that is larger than or equal to X and is divisible by DIVISOR.
+    """
+    return math.ceil(x / divisor) * divisor
+
+
+class Concat(nn.Module):
+    """Tensor concatenation module"""
+
+    def __init__(self, dimension: int = 1) -> None:
+        """
+        Inputs:
+            dimension: int
+                Dimension along which to concatenate tensors.
+        """
+        super().__init__()
+        self.d = dimension
+
+    def forward(self, x: List[torch.Tensor]) -> torch.Tensor:
+        """
+        Inputs:
+            x: List[torch.Tensor]
+                List of tensors to be concatenated.
+        Output: torch.Tensor
+            Concatenated tensor.
+        """
+        return torch.cat(x, self.d)
+
+
+def autopad(kernel_size: int, p=None) -> int:
+    """
+    Compute padding size from kernel size.
+
+    Inputs:
+        kernel_size: int
+            Kernel size.
+        p: int | List[int] | None
+            Explicit padding size; computed from the kernel size when None.
+    Outputs: int
+        Padding size.
+    """
+    if p is None:
+        p = (
+            kernel_size // 2
+            if isinstance(kernel_size, int)
+            else [x // 2 for x in kernel_size]
+        )
+    return p
+
+
+class FusedConvBatchNorm(nn.Module):
+    """Module of convolution, batch normalization and activation."""
+
+    def __init__(
+        self,
+        in_channels: int,
+        out_channels: int,
+        kernel_size: int = 1,
+        stride: int = 1,
+        padding=None,
+        groups: int = 1,
+        act: bool = True,
+    ) -> None:
+        """
+        Initialize FusedConvBatchNorm.
+
+        Inputs:
+            in_channels: int
+                Input channels.
+            out_channels: int
+                Output channels.
+            kernel_size: int
+                Kernel size.
+            stride: int
+                Convolution stride.
+            padding: int | None
+                Padding size; computed from the kernel size when None.
+            groups: int
+                Groups of channels for convolution.
+            act: bool
+                Whether to enable ReLU activation.
+        """
+        super().__init__()
+        self.conv = nn.Conv2d(
+            in_channels,
+            out_channels,
+            kernel_size,
+            stride,
+            autopad(kernel_size, padding),
+            groups=groups,
+            bias=False,
+        )
+        self.bn = nn.BatchNorm2d(out_channels)
+        self.act = (
+            nn.ReLU(True)
+            if act is True
+            else (act if isinstance(act, nn.Module) else nn.Identity())
+        )
+
+    def forward(self, x: torch.Tensor) -> torch.Tensor:
+        """
+        Forward computation of FusedConvBatchNorm module.
+
+        Inputs:
+            x: torch.Tensor
+                Input tensor.
+        Output: torch.Tensor
+            Output tensor.
+        """
+        return self.act(self.bn(self.conv(x)))
+
+
+class Bottleneck(nn.Module):
+    """Bottleneck block"""
+
+    def __init__(
+        self,
+        in_channels: int,
+        out_channels: int,
+        shortcut: bool = True,
+        groups: int = 1,
+        expand_ratio: float = 0.5,
+    ) -> None:
+        """
+        Initialize Bottleneck module.
+
+        Inputs:
+            in_channels: int
+                Input channels.
+            out_channels: int
+                Output channels.
+            shortcut: bool
+                Whether to enable shortcut connection.
+            groups: int
+                Groups of channels for convolution.
+            expand_ratio: float
+                Expand ratio of input channels to hidden channels.
+        """
+        super().__init__()
+        hidden_channels = int(out_channels * expand_ratio)
+        self.cv1 = FusedConvBatchNorm(in_channels, hidden_channels, 1, 1)
+        self.cv2 = FusedConvBatchNorm(
+            hidden_channels, out_channels, 3, 1, groups=groups
+        )
+        self.add = shortcut and in_channels == out_channels
+
+    def forward(self, x: torch.Tensor) -> torch.Tensor:
+        """
+        Forward computation of Bottleneck module.
+
+        Inputs:
+            x: torch.Tensor
+                Input tensor.
+        Outputs: torch.Tensor
+            Output tensor.
+        """
+        y = self.cv2(self.cv1(x))
+        if self.add:
+            y += x
+        return y
+
+
+class C3(nn.Module):
+    """C3 block"""
+
+    def __init__(
+        self,
+        in_channels: int,
+        out_channels: int,
+        num_blocks: int = 1,
+        shortcut: bool = True,
+        group: int = 1,
+        expand_ratio: float = 0.5,
+    ) -> None:
+        """
+        Initialize C3 module.
+
+        Inputs:
+            in_channels: int
+                Input channels.
+            out_channels: int
+                Output channels.
+            num_blocks: int
+                Number of Bottleneck blocks.
+            group: int
+                Groups of channels for convolution.
+            shortcut: bool
+                Whether to enable shortcut connection.
+            expand_ratio: float
+                Expand ratio of input channels to hidden channels.
+        """
+        super().__init__()
+        hidden_channels = int(out_channels * expand_ratio)
+        self.cv1 = FusedConvBatchNorm(in_channels, hidden_channels, 1, 1)
+        self.cv2 = FusedConvBatchNorm(in_channels, hidden_channels, 1, 1)
+        self.cv3 = FusedConvBatchNorm(2 * hidden_channels, out_channels, 1)
+        self.m = nn.Sequential(
+            *[
+                Bottleneck(
+                    hidden_channels, hidden_channels, shortcut, group, expand_ratio=1.0
+                )
+                for _ in range(num_blocks)
+            ]
+        )
+
+    def forward(self, x: torch.Tensor) -> torch.Tensor:
+        """
+        Forward computation of C3 module.
+
+        Inputs:
+            x: torch.Tensor
+                Input tensor.
+        Outputs: torch.Tensor
+            Output tensor.
+        """
+        return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), dim=1))
+
+
+class SPPF(nn.Module):
+    """Spatial Pyramid Pooling - Fast (SPPF) layer"""
+
+    def __init__(
+        self, in_channels: int, out_channels: int, kernel_size: int = 5
+    ) -> None:
+        """
+        Initialize SPPF module.
+
+        Inputs:
+            in_channels: int
+                Input channels.
+            out_channels: int
+                Output channels.
+            kernel_size: int
+                Kernel size.
+        """
+        super().__init__()
+        hidden_channels = in_channels // 2
+        self.cv1 = FusedConvBatchNorm(in_channels, hidden_channels, 1, 1)
+        self.cv2 = FusedConvBatchNorm(hidden_channels * 4, out_channels, 1, 1)
+        self.m = nn.MaxPool2d(
+            kernel_size=kernel_size, stride=1, padding=kernel_size // 2
+        )
+
+    def forward(self, x: torch.Tensor) -> torch.Tensor:
+        """
+        Forward computation of SPPF module.
+
+        Inputs:
+            x: torch.Tensor
+                Input tensor.
+        Outputs: torch.Tensor
+            Output tensor.
+        """
+        x = self.cv1(x)
+        y1 = self.m(x)
+        y2 = self.m(y1)
+        return self.cv2(torch.cat([x, y1, y2, self.m(y2)], 1))
+
+
+class Detect(nn.Module):
+    """Detector head module"""
+
+    def __init__(
+        self, num_classes: int = 80, anchors: tuple = (), ch: tuple = ()
+    ) -> None:
+        """
+        Initialize Detector module.
+
+        Inputs:
+            num_classes: int
+                Number of object classes.
+            anchors: tuple
+                Tuple of anchor sizes.
+            ch: tuple
+                Input channels for each scale.
+        """
+        super().__init__()
+        self.num_classes = num_classes
+        self.num_output = num_classes + 5
+        self.num_layers = len(anchors)
+        self.num_anchors = len(anchors[0]) // 2
+        self.grid = [torch.zeros(1)] * self.num_layers
+        self.anchor_grid = [torch.zeros(1)] * self.num_layers
+        self.register_buffer(
+            "anchors", torch.tensor(anchors).float().view(self.num_layers, -1, 2)
+        )
+        self.m = nn.ModuleList(
+            nn.Conv2d(x, self.num_output * self.num_anchors, 1) for x in ch
+        )
+
+    def forward(self, x: List[torch.Tensor]) -> List[torch.Tensor]:
+        """
+        Forward computation of Detect module.
+
+        Inputs:
+            x: List[torch.Tensor]
+                Input list of tensors.
+        Outputs: List[torch.Tensor]
+            Output list of tensors.
+        """
+        for i in range(self.num_layers):
+            x[i] = self.m[i](x[i])
+        return x
+
+
+def parse_model(cfg: dict, ch: List[int]):
+    """
+    Generate model module from model configuration.
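+
+    The configuration follows a YOLOv5-style layout: each entry of
+    cfg["backbone"] and cfg["head"] is a (from, number, module, args) tuple,
+    with "nc", "depth_multiple", "width_multiple" and "anchors" as top-level
+    keys. A minimal illustrative config (hypothetical values, not a shipped
+    model definition) might look like:
+
+        {"nc": 2, "depth_multiple": 0.33, "width_multiple": 0.50,
+         "anchors": [[10, 13, 16, 30, 33, 23]],
+         "backbone": [[-1, 1, "FusedConvBatchNorm", [64, 6, 2, 2]],
+                      [-1, 3, "C3", [128]]],
+         "head": [[[-1], 1, "Detect", ["nc", "anchors"]]]}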
+ + Inputs: + cfg: dict + Model configurations. + ch: list + Input channels. + Output: + model: nn.Sequential + Model layers. + save: list + List of layer indices that needs to be saved. + """ + anchors, nc, gd, gw = ( + cfg["anchors"], + cfg["nc"], + cfg["depth_multiple"], + cfg["width_multiple"], + ) + num_anchors = (len(anchors[0]) // 2) if isinstance(anchors, list) else anchors + num_outputs = num_anchors * (nc + 5) + layers, save, c2 = [], [], ch[-1] + for i, (f, n, m, args) in enumerate(cfg["backbone"] + cfg["head"]): + m = eval(m) if isinstance(m, str) else m + for j, a in enumerate(args): + try: + args[j] = eval(a) if isinstance(a, str) else a + except NameError: + pass + + n = max(round(n * gd), 1) if n > 1 else n + if m in [FusedConvBatchNorm, Bottleneck, SPPF, C3, DoubleBlazeBlock]: + c1, c2 = ch[f], args[0] + if c2 != num_outputs: + c2 = make_divisible(c2 * gw, 8) + + args = [c1, c2, *args[1:]] + if m in [C3]: + args.insert(2, n) + n = 1 + elif m is nn.BatchNorm2d: + args = [ch[f]] + elif m is Concat: + c2 = sum([ch[x] for x in f]) + elif m is Detect: + args.append([ch[x] for x in f]) + if isinstance(args[1], int): + args[1] = [list(range(args[1] * 2))] * len(f) + else: + c2 = ch[f] + + m_ = nn.Sequential(*[m(*args) for _ in range(n)]) if n > 1 else m(*args) + t = str(m)[8:-2].replace("__main__.", "") + np = sum([x.numel() for x in m_.parameters()]) + m_.i, m_.f, m_.type, m_.np = (i, f, t, np) + save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1) + layers.append(m_) + if i == 0: + ch = [] + ch.append(c2) + return nn.Sequential(*layers), sorted(save) + + +class DoubleBlazeBlock(nn.Module): + """ + DoubleBlaze block + """ + + def __init__( + self, + in_channels: int, + out_channels: int, + stride: int = 1, + kernel_size: int = 5, + bias: bool = False, + ) -> None: + """ + Initialize DoubleBlaze block. + + Inputs: + in_channels: int + Number of input channels. + out_channels: int + Number of output channels. + stride: int + Convolution stride. + kernel_size: int + Kernel size. + bias: bool. + Enable bias in convolution. + """ + super(DoubleBlazeBlock, self).__init__() + self.stride = stride + assert stride in [1, 2] + self.use_pooling = self.stride != 1 + self.channel_pad = out_channels - in_channels + if self.channel_pad != 0: + self.pad = nn.Conv2d(in_channels, out_channels, kernel_size=1) + padding = (kernel_size - 1) // 2 + hidden_channels = max(out_channels, in_channels) // 2 + + self.conv1 = nn.Sequential( + # dw + nn.Conv2d( + in_channels, + in_channels, + kernel_size=kernel_size, + stride=stride, + padding=padding, + groups=in_channels, + bias=bias, + ), + nn.BatchNorm2d(in_channels), + # pw-linear + nn.Conv2d(in_channels, hidden_channels, 1, 1, 0, bias=bias), + nn.BatchNorm2d(hidden_channels), + ) + self.act = nn.ReLU(inplace=True) + + self.conv2 = nn.Sequential( + nn.ReLU(inplace=True), + # dw + nn.Conv2d( + hidden_channels, + hidden_channels, + kernel_size=kernel_size, + stride=1, + padding=padding, + groups=hidden_channels, + bias=bias, + ), + nn.BatchNorm2d(hidden_channels), + # pw-linear + nn.Conv2d(hidden_channels, out_channels, 1, 1, 0, bias=bias), + nn.BatchNorm2d(out_channels), + ) + + if self.use_pooling: + self.mp = nn.MaxPool2d(kernel_size=self.stride, stride=self.stride) + + def forward(self, x: torch.Tensor) -> torch.Tensor: + """ + Forward computation of DoubleBlaze block. + + Input: + x: torch.Tensor. + Input tensor + Output: torch.Tensor + Output tensor. 
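+
+        Note (an illustrative shape walk-through, not from the source): with
+        in_channels=24, out_channels=48, stride=2, a (1, 24, 64, 64) input
+        gives hidden_channels = max(48, 24) // 2 = 24; conv1/conv2 produce h
+        of shape (1, 48, 32, 32), while the shortcut is max-pooled to
+        (1, 24, 32, 32) and channel-padded by the 1x1 conv to (1, 48, 32, 32),
+        so the two can be summed before the final ReLU.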
+ """ + h = self.conv1(x) + h = self.conv2(h) + + if self.use_pooling: + x = self.mp(x) + if self.channel_pad != 0: + x = self.pad(x) + return self.act(h + x) + + +class Model(nn.Module): + """Person/face detection model""" + + def __init__(self, model_cfg: dict, ch: int = 3) -> None: + """ + Initialize person/face detection model. + + Inputs: + ch: int + Input channels. + model_cfg: dict + Model configuration + """ + super().__init__() + self.model, self.save = parse_model(deepcopy(model_cfg), ch=[ch]) + + def forward(self, x: torch.Tensor) -> List[torch.Tensor]: + """ + Forward computation of Model. + + Inputs: + x: torch.Tensor. + Input image. + Outputs: List[torch.Tensor] + Multi-scale object detection output. + """ + y = [] + for m in self.model: + if m.f != -1: + x = ( + y[m.f] + if isinstance(m.f, int) + else [x if j == -1 else y[j] for j in m.f] + ) + x = m(x) + y.append(x if m.i in self.save else None) + return x diff --git a/qai_hub_models/models/_shared/imagenet_classifier/model.py b/qai_hub_models/models/_shared/imagenet_classifier/model.py index 21e70f39..e8dd34be 100644 --- a/qai_hub_models/models/_shared/imagenet_classifier/model.py +++ b/qai_hub_models/models/_shared/imagenet_classifier/model.py @@ -11,17 +11,22 @@ from qai_hub_models.evaluators.base_evaluators import BaseEvaluator from qai_hub_models.evaluators.classification_evaluator import ClassificationEvaluator +from qai_hub_models.utils.asset_loaders import CachedWebModelAsset, load_image from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.image_processing import ( IMAGENET_DIM, + IMAGENET_TRANSFORM, normalize_image_torchvision, ) from qai_hub_models.utils.input_spec import InputSpec -from qai_hub_models.utils.quantization import get_image_quantization_samples MODEL_ASSET_VERSION = 1 MODEL_ID = __name__.split(".")[-2] +TEST_IMAGENET_IMAGE = CachedWebModelAsset.from_asset_store( + MODEL_ID, MODEL_ASSET_VERSION, "dog.jpg" +) + class ImagenetClassifier(BaseModel): """ @@ -118,8 +123,9 @@ def from_pretrained( def _sample_inputs_impl( self, input_spec: InputSpec | None = None ) -> Dict[str, List[np.ndarray]]: - samples = get_image_quantization_samples() - return dict(image_tensor=[samples[:1].numpy()]) + image = load_image(TEST_IMAGENET_IMAGE) + tensor = IMAGENET_TRANSFORM(image).unsqueeze(0) + return dict(image_tensor=[tensor.numpy()]) @staticmethod def get_channel_last_inputs() -> List[str]: diff --git a/qai_hub_models/models/_shared/llama/model.py b/qai_hub_models/models/_shared/llama/model.py index 25f44dfa..da44c135 100644 --- a/qai_hub_models/models/_shared/llama/model.py +++ b/qai_hub_models/models/_shared/llama/model.py @@ -270,12 +270,17 @@ def __init__(self, model, encoding_path, is_token_generator=False): self.split_part = 1 self.is_token_generator = is_token_generator + def get_qnn_graph_name(self) -> Optional[str]: + model_name = "token" if self.is_token_generator else "prompt" + return f"{model_name}_part{self.split_part}" + def get_hub_compile_options( self, target_runtime: TargetRuntime, other_compile_options: str = "", device: Optional[Device] = None, ) -> str: + graph_name = self.get_qnn_graph_name() if ( target_runtime != TargetRuntime.QNN and target_runtime != TargetRuntime.PRECOMPILED_QNN_ONNX @@ -284,12 +289,26 @@ def get_hub_compile_options( f"Unsupported target_runtime provided: {target_runtime}." " Only Precompile ONN ONNX or QNN runtime is supported for Llama for now." 
            )
-        target_runtime_options = (
+        options = (
             " --target_runtime qnn_context_binary"
             if target_runtime == TargetRuntime.QNN
             else " --target_runtime precompiled_qnn_onnx"
         )
-        return target_runtime_options + " --quantize_full_type w8a16 --quantize_io"
+        options += " --quantize_full_type w8a16 --quantize_io"
+        if graph_name is not None:
+            options += f" --qnn_graph_name {graph_name}"
+        return options
+
+    def get_hub_profile_options(
+        self,
+        target_runtime: TargetRuntime,
+        other_profile_options: str = "",
+    ) -> str:
+        options = "--max_profiler_iterations 50"
+        graph_name = self.get_qnn_graph_name()
+        if graph_name is not None:
+            options += f" --qnn_options context_enable_graphs={graph_name}"
+        return options
 
     @staticmethod
     def get_output_names(
diff --git a/qai_hub_models/models/_shared/llama3/__init__.py b/qai_hub_models/models/_shared/llama3/__init__.py
new file mode 100644
index 00000000..21a22b31
--- /dev/null
+++ b/qai_hub_models/models/_shared/llama3/__init__.py
@@ -0,0 +1,4 @@
+# ---------------------------------------------------------------------
+# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
diff --git a/qai_hub_models/models/_shared/llama3/app.py b/qai_hub_models/models/_shared/llama3/app.py
new file mode 100644
index 00000000..20872c74
--- /dev/null
+++ b/qai_hub_models/models/_shared/llama3/app.py
@@ -0,0 +1,209 @@
+# ---------------------------------------------------------------------
+# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+import gc
+import math
+from typing import Any, Callable, Set, Type
+
+import torch
+
+from qai_hub_models.models._shared.llama3.model import (
+    Llama3Base_Quantized,
+    get_past_keyval_with_shift,
+)
+from qai_hub_models.models._shared.llama.model import RopeEmbedding
+
+
+def _get_tokens_from_logits(output: torch.Tensor):
+    probs = torch.nn.functional.softmax(output[0][0], dim=-1)
+    return torch.multinomial(probs, num_samples=1).squeeze(1)
+
+
+class ChatApp:
+    """
+    This class demonstrates how to use a Llama model to build a basic chat app.
+    This app uses two model instantiations:
+        * Prompt Processor
+            - Instantiated with sequence length 128. Used to process the user
+              prompt.
+        * Token Generator
+            - Instantiated with sequence length 1. Used to predict the
+              auto-regressive response.
+    """
+
+    def __init__(
+        self,
+        model_cls: Type[Llama3Base_Quantized],
+        get_input_prompt_with_tags: Callable,
+        prepare_combined_attention_mask: Callable,
+        tokenizer: Any,
+        end_tokens: Set[str],
+    ):
+        """
+        Base ChatApp that generates one response for a given input prompt.
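+
+        A typical invocation looks like the following sketch (argument values
+        are illustrative; the concrete model class comes from the specific
+        Llama 3 model package):
+
+            app = ChatApp(
+                model_cls=SomeLlama3_Quantized,  # hypothetical Llama3Base_Quantized subclass
+                get_input_prompt_with_tags=get_input_prompt_with_tags,
+                prepare_combined_attention_mask=prepare_combined_attention_mask,
+                tokenizer=tokenizer,
+                end_tokens={"<|eot_id|>", "<|end_of_text|>"},
+            )
+            app.generate_output_prompt(
+                "What do llamas eat?",
+                prompt_sequence_length=128,
+                context_length=4096,
+                max_output_tokens=20,
+            )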
+ + model_cls: Llama Model class that will be used to instantiate model + get_input_prompt_with_tags: Function to wrap input prompt with appropriate tags + prepare_combined_attention_mask: Function to combine and build attention mask, + tokenizer: Tokenizer to use, + end_tokens: Set of end tokens to convey end of token generation, + """ + self.model_cls = model_cls + self.get_input_prompt_with_tags = get_input_prompt_with_tags + self.prepare_combined_attention_mask = prepare_combined_attention_mask + self.tokenizer = tokenizer + self.end_tokens = end_tokens + + def generate_output_prompt( + self, + input_prompt: str, + prompt_sequence_length: int, + context_length: int, + max_output_tokens: int, + bundled_kvcache: bool = True, + ): + input_prompt_processed = self.get_input_prompt_with_tags( + user_input_prompt=input_prompt + ) + + input_tokens = self.tokenizer( + input_prompt_processed, + return_tensors="pt", + padding="max_length", + max_length=context_length, + ) + orig_input_ids = input_tokens["input_ids"].type(torch.long) + + num_tokens = torch.sum(input_tokens["attention_mask"]).item() + num_prompt_iterations = math.ceil(num_tokens / prompt_sequence_length) + rope_embedding = RopeEmbedding(max_length=context_length) + + print( + f"Will run prompt processor {num_prompt_iterations} time(s) and then token generator." + ) + + # Collect output prompt to summarize later + output_token = None + hub_tokens = None + + model = self.model_cls.from_pretrained(sequence_length=128) + llm_config = model.llm_config + is_prompt = True + + # Process input prompt + input_specs = self.model_cls.get_input_spec( + input_seq_length=prompt_sequence_length, + num_hidden_layers=llm_config.num_hidden_layers, + context_length=model.context_length, + hidden_size=llm_config.hidden_size, + num_attention_heads=llm_config.num_attention_heads, + num_key_value_heads=llm_config.num_key_value_heads, + ) + + # Initialization of KV cache + past_key_values = [ + torch.zeros(shape) + for k, (shape, _) in input_specs.items() + if k.startswith("past_") + ] + + for i in range(num_prompt_iterations + max_output_tokens - 1): + if i < num_prompt_iterations: + seq_len = prompt_sequence_length + next_seq_len = seq_len if i + 1 < num_prompt_iterations else 1 + else: + if is_prompt: + # switch to token processor + model = self.model_cls.from_pretrained(sequence_length=1) + is_prompt = False + + seq_len = 1 + next_seq_len = 1 + + if is_prompt: + input_ids = orig_input_ids[ + :, + context_length + - (num_prompt_iterations - i) * seq_len : context_length + - (num_prompt_iterations - i - 1) * seq_len, + ] + + # non-padded tokens in first prompt + first_prompt = (num_tokens - 1) % seq_len + 1 + padding_size0 = seq_len - first_prompt + padding_size = padding_size0 if i == 0 else 0 + offset = 0 if i == 0 else first_prompt + (i - 1) * seq_len + position_ids = [0] * (padding_size) + list( + range(offset, offset + seq_len - padding_size) + ) + position_ids = ( + torch.Tensor(position_ids).type(torch.long).reshape(1, seq_len) + ) + position_ids = ( + torch.Tensor(position_ids).type(torch.long).reshape(1, seq_len) + ) + position_ids_cos, position_ids_sin = rope_embedding.get_embedding( + position_ids + ) + attention_mask = torch.zeros((1, context_length)) + attention_mask[:, context_length - (first_prompt + i * seq_len) :] = 1.0 + else: + input_ids = output_token.reshape(-1, 1).type(torch.int32) + + # Shift attention_mask and position_ids + attention_mask = torch.cat( + (attention_mask[:, seq_len:], torch.zeros((1, seq_len))), dim=-1 + ) + 
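+                # Sliding-window bookkeeping: the concatenation above drops the
+                # left-most seq_len (= 1) mask entries and appends zeros on the
+                # right, e.g. [0, 0, 0, 1, 1, 1] -> [0, 0, 1, 1, 1, 0] for a
+                # context length of 6. get_past_keyval_with_shift applies the
+                # matching shift to the KV cache after the forward pass, keeping
+                # the cache at context_length - seq_len entries.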
position_ids = (position_ids[:, -1] + 1).reshape(-1, 1) + + position_ids = torch.Tensor(position_ids).type(torch.long).reshape(1, 1) + position_ids_cos, position_ids_sin = rope_embedding.get_embedding( + position_ids + ) + + cm_attention_masks = self.prepare_combined_attention_mask( + attention_mask=attention_mask, + input_shape=(1, seq_len), + past_key_values_length=context_length - seq_len, + ) + + # Generate output token + output = model( + input_ids, + cm_attention_masks, + position_ids_cos, + position_ids_sin, + *past_key_values, + ) + + del cm_attention_masks + del input_ids + past_key_values = get_past_keyval_with_shift( + past_key_values, + output[1:], + length=context_length - next_seq_len, + ) + output_token = _get_tokens_from_logits(output) + output_token = output_token[-next_seq_len:] + output_prompt = self.tokenizer.decode(output_token) + is_prediction = next_seq_len == 1 + + # Assistant generating end of token + if is_prediction and output_prompt in self.end_tokens: + break + + if is_prompt: + hub_tokens = output_token + else: + hub_tokens = torch.cat((hub_tokens, output_token), dim=-1) + + if is_prediction: + print() + print(f"Text generated so far: {self.tokenizer.decode(hub_tokens)}") + print() + gc.collect() + + print("-------- Response Summary --------") + print(f"Prompt: {input_prompt}") + print(f"Response: {self.tokenizer.decode(hub_tokens)}") diff --git a/qai_hub_models/models/_shared/llama3/demo.py b/qai_hub_models/models/_shared/llama3/demo.py new file mode 100644 index 00000000..bd38f8b9 --- /dev/null +++ b/qai_hub_models/models/_shared/llama3/demo.py @@ -0,0 +1,121 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. +# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +from __future__ import annotations + +from typing import Any, Callable, List, Set, Type + +from qai_hub_models.models._shared.llama3.app import ChatApp as App +from qai_hub_models.utils.args import get_model_cli_parser +from qai_hub_models.utils.base_model import BaseModel, TargetRuntime +from qai_hub_models.utils.huggingface import has_model_access + +# Max output tokens to generate +# You can override this with cli argument. +# Keeping this short as on-device demo takes time to converge. 
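+# For example, passing `--max-output-tokens 50` on the command line yields a
+# longer response; each extra token adds one more token-generator iteration.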
+MAX_OUTPUT_TOKENS = 20
+DEFAULT_DEVICE = "Samsung Galaxy S24 (Family)"
+
+
+def llama_chat_demo(
+    model_cls: Type[BaseModel],
+    model_id: str,
+    get_input_prompt_with_tags: Callable,
+    prepare_combined_attention_mask: Callable,
+    tokenizer: Any,
+    end_tokens: Set[str],
+    hf_repo_name: str,
+    hf_repo_url: str,
+    default_prompt: str,
+    is_test: bool = False,
+    available_target_runtimes: List[TargetRuntime] = [TargetRuntime.QNN],
+    bundled_kvcache: bool = True,
+):
+    """
+    Shared chat demo app to generate output for a provided input prompt.
+        model_cls: Model base class (either Prompt Processor or Token Generator),
+        model_id: Model ID from hub,
+        get_input_prompt_with_tags: Function to wrap the input prompt with appropriate tags,
+        prepare_combined_attention_mask: Function to combine and build the attention mask,
+        tokenizer: Tokenizer to encode/decode the prompt,
+        end_tokens: Set of end tokens marking the end of output generation,
+        hf_repo_name: HF repo name,
+        hf_repo_url: HF repo URL,
+        default_prompt: Default prompt to set,
+        is_test: If test, no options required,
+        available_target_runtimes: Default available runtimes in options,
+    """
+    # Demo parameters
+    parser = get_model_cli_parser(model_cls)
+    parser.add_argument(
+        "--prompt",
+        type=str,
+        default=default_prompt,
+        help="input prompt.",
+    )
+    parser.add_argument(
+        "--prompt-processor-input-seq-len",
+        type=int,
+        default=128,
+        help="input sequence length for the prompt processor. This must be less than the `context_length` set for the model.",
+    )
+    parser.add_argument(
+        "--max-output-tokens",
+        type=int,
+        default=MAX_OUTPUT_TOKENS,
+        help="max output tokens to generate.",
+    )
+    args = parser.parse_args([] if is_test else None)
+
+    if not is_test:
+        print(f"\n{'-' * 85}")
+        print(f"** Generating response via {model_id} **")
+        print()
+        print("Prompt:", args.prompt)
+        print("Max number of output tokens to generate:", args.max_output_tokens)
+        print("Please pass `--max-output-tokens <N>` to generate longer responses.")
+        print()
+        print(
+            """NOTE: Each token generation takes around 15 mins on-device:
+    1. The model is divided into multiple parts to fit into device constraints
+    2. Each model part requires separate execution on-device via AI Hub
+    3. Due to the autoregressive nature, step 2 cannot run in parallel
+    4. Device procurement is subject to device availability and might take longer to run the demo on-device
+
+Alternative:
+    1. Run the demo on host (with PyTorch) to verify the e2e result for longer responses
+    2. Run the demo on-device for shorter responses (--max-output-tokens 10 or 20)
+    3. [Optional] Run the demo on-device to generate a longer response (takes longer)
+
+We are actively working to improve UX and reduce turn-around time for these models.
+""" + ) + print(f"{'-' * 85}\n") + + has_model_access(hf_repo_name, hf_repo_url) + + """ + llama_ar128 = model_cls.from_pretrained( + sequence_length=args.prompt_processor_input_seq_len + ) + llama_ar1 = model_cls.from_pretrained(sequence_length=1) + context_length = llama_ar128.context_length + """ + + app = App( + model_cls, + get_input_prompt_with_tags=get_input_prompt_with_tags, + prepare_combined_attention_mask=prepare_combined_attention_mask, + tokenizer=tokenizer, + end_tokens=end_tokens, + ) + context_length = 4096 + app.generate_output_prompt( + args.prompt, + prompt_sequence_length=args.prompt_processor_input_seq_len, + context_length=context_length, + max_output_tokens=args.max_output_tokens, + bundled_kvcache=bundled_kvcache, + ) diff --git a/qai_hub_models/models/_shared/llama3/export.py b/qai_hub_models/models/_shared/llama3/export.py new file mode 100644 index 00000000..0574d432 --- /dev/null +++ b/qai_hub_models/models/_shared/llama3/export.py @@ -0,0 +1,357 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. +# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- + +from __future__ import annotations + +import glob +import os +import tempfile +from pathlib import Path +from typing import Any, Dict, List, Mapping, Optional, Tuple, Type, cast + +import numpy as np +import qai_hub as hub + +from qai_hub_models.models._shared.llama3.model import Llama3Base_Quantized +from qai_hub_models.models._shared.llama3.split_onnx_utils import utils +from qai_hub_models.utils.args import get_input_spec_kwargs, get_model_kwargs +from qai_hub_models.utils.asset_loaders import zip_model +from qai_hub_models.utils.base_model import TargetRuntime +from qai_hub_models.utils.compare import torch_inference +from qai_hub_models.utils.printing import ( + print_inference_metrics, + print_profile_metrics_from_job, +) + + +def export_model( + model_cls: Type[Llama3Base_Quantized], + model_name: str, + components: List[str], + sub_components: Dict[str, List[str]], + num_layers_per_split: int, + device: str, + skip_profiling: bool = False, + skip_inferencing: bool = False, + skip_downloading: bool = False, + skip_summary: bool = False, + output_dir: Optional[str] = None, + target_runtime: TargetRuntime = TargetRuntime.QNN, + compile_options: str = "", + profile_options: str = "", + synchronous: bool = False, + **additional_model_kwargs, +) -> Mapping[ + str, Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] +] | List[str]: + """ + In this workflow, two instantiations of the Llama model are exported (AR-1, + AR-128). AR- refers to a model with input sequence length . + We produce two models: + AR-128: Used to process prompts. + AR-1: Used to process response. + Both instantiations have context length 4096 (with KV cache input of + 4096 minus ). + + This function accomplishes several tasks: + + 1. Performs the following steps for both AR-1 and AR-128: + a. Instantiates a PyTorch model and exports it to ONNX. + b. Converts source AIMET Pro encodings to be compatible with this ONNX model. + c. Splits the ONNX into multiple parts (due to runtime size limitation). + d. For each part: Compile the model to a QNN context binary. + 2. For each part (across both AR-1 and AR-128): + a. Link AR-1 part and AR-128 part together using link jobs. + 3. Profiles the model performance on real devices. + 4. 
Inferences the model on sample inputs (stringing together the parts). + 5. Downloads the model asset to the local directory. + 6. Summarizes the results from profiling and inference. + + Each of the last four steps can be optionally skipped using the input options. + + Parameters: + model_cls: Llama class. + model_name: Model name. + components: List of sub-components of the model that will be exported. + Each component is compiled and profiled separately. + Defaults to ALL_COMPONENTS if not specified. + sub_components: Dictionary of strings pointing to lists of strings, + where each sub-component will be grouped using weight sharing with + other sub-components to form a component. + num_layers_per_split: How many layers to include in each model part. + device: Device for which to export the model. + Full list of available devices can be found by running `hub.get_devices()`. + Defaults to DEFAULT_DEVICE if not specified. + skip_profiling: If set, skips profiling of compiled model on real devices. + skip_inferencing: If set, skips computing on-device outputs from sample data. + skip_downloading: If set, skips downloading of compiled model. + skip_summary: If set, skips waiting for and summarizing results + from profiling and inference. + output_dir: Directory to store generated assets (e.g. compiled model). + Defaults to `/build/`. + target_runtime: Which on-device runtime to target. Default is TFLite. + compile_options: Additional options to pass when submitting the compile job. + profile_options: Additional options to pass when submitting the profile job. + synchronous: Let each job finish before submitting the next. + **additional_model_kwargs: Additional optional kwargs used to customize + `model_cls.from_pretrained` + + Returns: + A Mapping from sub-component name to a 3-tuple of: + * A LinkJob object containing metadata about the link job submitted to hub. + * A ProfileJob containing metadata about the profile job (None if profiling skipped). + * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + """ + num_splits = len(components) + output_path = Path(output_dir or Path.cwd() / "build" / model_name) + hub_device = hub.Device(name=device) + + # Instantiation names and input sequence length + + # 1. 
Initialize PyTorch model + model_params = get_model_kwargs(model_cls, additional_model_kwargs) + + prompt_sequence_length = 128 + if "sequence_length" in model_params: + if isinstance(model_params["sequence_length"], int): + prompt_sequence_length = model_params["sequence_length"] + del model_params["sequence_length"] + + # If user specifies sequence length, it will define the prompt + # generator's sequence length only + instantiations = [ + ("prompt", prompt_sequence_length), + ("token", 1), + ] + + compile_jobs_to_link: Dict[str, List[hub.client.CompileJob]] = {} + compile_jobs: Dict[str, hub.client.CompileJob] = {} + link_jobs: Dict[str, hub.client.LinkJob] = {} + profile_options_per_instantiation: Dict[str, str] = {} + + sub_component_names = {} + component_from_sub_component_names = {} + + for instantiation_name, seq_len in instantiations: + full_name = f"{model_name}_{instantiation_name}" + model = model_cls.from_pretrained(sequence_length=seq_len, **model_params) + llm_config = model.llm_config + + sub_component_names[instantiation_name] = [] + + profile_options_per_instantiation[ + instantiation_name + ] = model.get_hub_profile_options(target_runtime, profile_options) + + input_spec = model.get_input_spec( + **{ + **get_input_spec_kwargs(model, additional_model_kwargs), + "input_seq_length": seq_len, + "num_hidden_layers": llm_config.num_hidden_layers, + "context_length": model.context_length, + "hidden_size": llm_config.hidden_size, + "num_attention_heads": llm_config.num_attention_heads, + "num_key_value_heads": llm_config.num_key_value_heads, + }, + ) + + # Export the full model to ONNX model + sub_output_path = output_path / instantiation_name + source_model = model.convert_to_hub_source_model( + target_runtime, + sub_output_path, + input_spec, + external_onnx_weights=True, + output_names=model.get_output_names(llm_config.num_hidden_layers), + ) + source_model_path = Path(source_model) + + input_onnx_path = glob.glob((source_model_path / "*.onnx").as_posix())[0] + input_encodings_path = glob.glob( + (source_model_path / "*.encodings").as_posix() + )[0] + + # Split encodings + model_artifact = Path(output_dir or Path.cwd()) / instantiation_name + os.makedirs(model_artifact, exist_ok=True) + + utils.split_onnx( + onnxfile=input_onnx_path, + modelname=full_name, + pickle_filedir=None, + num_splits=num_splits, + num_layers_per_split=num_layers_per_split, + output_dir=model_artifact, + split_embedding=True, + encoding_file=input_encodings_path, + using_qairt_workflow=True, + ) + + # Submit the parts for compilation + for i in range(num_splits): + sub_component_name = f"{instantiation_name}_{i + 1}_of_{num_splits}" + component_name = f"part_{i + 1}_of_{num_splits}" + sub_component_names[instantiation_name].append(sub_component_name) + full_name = f"{model_name}_{sub_component_name}" + aimet_path = Path(model_artifact) / (full_name + ".aimet") + + model_compile_options = ( + model.get_hub_compile_options(target_runtime, compile_options) + + f" --qnn_graph_name {sub_component_name}" + ) + + # TODO (#12708): Remove this zipping and let the client do it. 
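+            # Each part is an .aimet directory (the exported ONNX graph plus
+            # its quantization encodings); it is zipped inside a temporary
+            # directory so it can be uploaded to AI Hub as a single artifact.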
+ with tempfile.TemporaryDirectory() as tmpdir: + aimet_tmpdir = os.path.join(tmpdir, os.path.basename(aimet_path)) + os.makedirs(aimet_tmpdir) + zipped_model_path = zip_model(aimet_tmpdir, aimet_path) + submitted_compile_job = hub.submit_compile_job( + model=zipped_model_path, + device=hub_device, + name=full_name, + options=model_compile_options, + ) + if synchronous: + submitted_compile_job.wait() + if component_name not in compile_jobs_to_link: + compile_jobs_to_link[component_name] = [] + + compile_jobs_to_link[component_name].append( + cast(hub.client.CompileJob, submitted_compile_job) + ) + compile_jobs[sub_component_name] = cast( + hub.client.CompileJob, submitted_compile_job + ) + component_from_sub_component_names[sub_component_name] = component_name + + # 2. Link jobs + for component_name, cjobs in compile_jobs_to_link.items(): + models = [cjob.get_target_model() for cjob in cjobs] + + full_name = f"{model_name}_{component_name}" + link_job = hub.submit_link_job(models, name=full_name) + if synchronous: + link_job.wait() + link_jobs[component_name] = link_job + + # 3. Profile the model assets on real devices + profile_jobs: Dict[str, hub.client.ProfileJob] = {} + if not skip_profiling: + for instantiation_name, _ in instantiations: + for sub_component_name in sub_component_names[instantiation_name]: + component_name = component_from_sub_component_names[sub_component_name] + profile_options = ( + profile_options_per_instantiation[instantiation_name] + + f" --qnn_options context_enable_graphs={sub_component_name}" + ) + print( + f"Profiling model {instantiation_name} {sub_component_name} on a hosted device." + ) + full_name = f"{model_name}_{sub_component_name}" + submitted_profile_job = hub.submit_profile_job( + model=link_jobs[component_name].get_target_model(), + device=hub_device, + name=full_name, + options=profile_options, + ) + if synchronous: + submitted_profile_job.wait() + profile_jobs[sub_component_name] = cast( + hub.client.ProfileJob, submitted_profile_job + ) + + # 4. Run inference on-device with sample inputs + inference_jobs: Dict[str, hub.client.InferenceJob] = {} + final_device_output_data: Dict[str, Dict[str, np.ndarray]] = {} + final_ref_output_data: Dict[str, Dict[str, np.ndarray]] = {} + if not skip_inferencing: + for instantiation_name, seq_len in instantiations: + model = model_cls.from_pretrained(sequence_length=seq_len, **model_params) + full_model_sample_inputs = model.sample_inputs() + output_data = {} + for sub_component_name in sub_component_names[instantiation_name]: + component_name = component_from_sub_component_names[sub_component_name] + print( + f"Running inference for {sub_component_name} on a hosted device with example inputs." 
+ ) + + compile_job = compile_jobs[sub_component_name] + target_shapes = compile_job.target_shapes + + # Source inputs from full inputs and previous part's outputs + sample_inputs = {} + for key in target_shapes: + if key in output_data: + sample_inputs[key] = output_data[key] + elif key in full_model_sample_inputs: + sample_inputs[key] = full_model_sample_inputs[key] + + # Load model with no-AIMET mode + inference_options = ( + profile_options_per_instantiation[instantiation_name] + + f" --qnn_options context_enable_graphs={sub_component_name}" + ) + # Load individual model part + full_name = f"{model_name}_{sub_component_name}" + submitted_inference_job = hub.submit_inference_job( + model=link_jobs[component_name].get_target_model(), + inputs=sample_inputs, + device=hub_device, + name=full_name, + options=inference_options, + ) + if synchronous: + submitted_inference_job.wait() + output_data = submitted_inference_job.download_output_data() + inference_jobs[sub_component_name] = cast( + hub.client.InferenceJob, submitted_inference_job + ) + + # Store the final output data + final_device_output_data[instantiation_name] = output_data + + if not skip_summary: + # Compute reference (PyTorch) output data + ref_output_data_list = torch_inference(model, full_model_sample_inputs) + final_ref_output_data[instantiation_name] = ref_output_data_list + + # 5. Download the model assets to a local file + if not skip_downloading: + os.makedirs(output_path, exist_ok=True) + for component_name, link_job in link_jobs.items(): + target_model: hub.Model = link_job.get_target_model() # type: ignore + target_model.download( + str(output_path / f"{model_name}_{component_name}.bin") + ) + + # 6. Summarize the results from profiling and inference + if not skip_summary and not skip_profiling: + for instantiation_name, _ in instantiations: + for sub_component_name in sub_component_names[instantiation_name]: + profile_job = profile_jobs[sub_component_name] + assert profile_job is not None and profile_job.wait().success + profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore + print_profile_metrics_from_job(profile_job, profile_data) + + if not skip_summary and not skip_inferencing: + for instantiation_name, _ in instantiations: + # Get ordered model output names + torch_out = final_ref_output_data[instantiation_name] + inference_result = final_device_output_data[instantiation_name] + print_inference_metrics( + None, + inference_result, + torch_out, + ) + + return { + sub_component_name: ( + link_jobs[component_name], + profile_jobs.get(sub_component_name), + inference_jobs.get(sub_component_name), + ) + for component_name in components + for sub_component_name in sub_components[component_name] + } diff --git a/qai_hub_models/models/_shared/llama3/model.py b/qai_hub_models/models/_shared/llama3/model.py new file mode 100644 index 00000000..4ba271b0 --- /dev/null +++ b/qai_hub_models/models/_shared/llama3/model.py @@ -0,0 +1,1001 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. 
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+from __future__ import annotations
+
+import json
+import os
+from abc import ABC, abstractmethod
+from copy import deepcopy
+from typing import List, Optional
+
+import numpy as np
+import torch
+from qai_hub.public_rest_api import DatasetEntries
+from transformers.models.llama import modeling_llama
+
+from qai_hub_models.models._shared.llama.model import (
+    Llama_QuantizedMixin,
+    RopeEmbedding,
+)
+from qai_hub_models.models.common import (
+    SampleInputsType,
+    SourceModelFormat,
+    TargetRuntime,
+)
+from qai_hub_models.utils.aimet.encodings import map_encodings
+from qai_hub_models.utils.huggingface import (
+    ensure_has_required_transformer,
+    has_model_access,
+)
+from qai_hub_models.utils.input_spec import InputSpec
+from qai_hub_models.utils.system_info import has_recommended_memory
+
+from .model_adaptations import (
+    QcLlama_apply_rotary_pos_emb,
+    SHADynamicCacheNewValueOnly,
+    SHALlamaAttention,
+)
+
+MIN_TRANSFORMER_VERSION = "4.45.0"
+
+# isort: off
+
+# TODO: 10761 remove transformer version check once AIMET
+# transformer restriction is uplifted.
+ensure_has_required_transformer(MIN_TRANSFORMER_VERSION)
+from transformers import AutoConfig, AutoTokenizer  # noqa: E402
+
+MODEL_ID = __name__.split(".")[-2]
+MODEL_ASSET_VERSION = 1
+
+# Configs
+AIMET_ENCODINGS_PREFIX = "config"
+AIMET_CONFIG = "default_config_llama"
+
+DEFAULT_CONTEXT_LENGTH = 4096
+
+DATA_DIR = "data"
+USE_CACHED_DATA = True
+
+## Ref: https://llama.meta.com/docs/model-cards-and-prompt-formats/llama3_1
+BEGIN_TEXT = "<|begin_of_text|>"
+END_TEXT = "<|end_of_text|>"
+START_HEADER = "<|start_header_id|>"
+END_HEADER = "<|end_header_id|>"
+SYSTEM_ID = "system"
+ASSISTANT_ID = "assistant"
+USER_ID = "user"
+EOT_ID = "<|eot_id|>"
+END_TOKENS = {"<|eot_id|>", "<|end_of_text|>"}
+
+DEFAULT_PROMPT_CONTEXT = "You are a helpful AI assistant"
+DEFAULT_USER_PROMPT = "What do llamas eat? Keep the answer under ten words."
+
+
+def get_input_prompt_with_tags(
+    previous_history: str = "",
+    system_context_prompt: str = DEFAULT_PROMPT_CONTEXT,
+    user_input_prompt: str = DEFAULT_USER_PROMPT,
+):
+    """
+    Get the prompt that sets the context and initializes the prompt processor.
+    """
+    prompt = previous_history
+    prompt += f"""{BEGIN_TEXT}{START_HEADER}{SYSTEM_ID}{END_HEADER}
+
+{system_context_prompt}
+{START_HEADER}{USER_ID}{END_HEADER}
+
+{user_input_prompt}{EOT_ID}{START_HEADER}{ASSISTANT_ID}{END_HEADER}
+
+
+"""
+    return prompt
+
+
+def onnx_counting(i):
+    # Softmax, Softmax_1, Softmax_2, ...
+    if i == 0:
+        return ""
+    else:
+        return f"_{i}"
+
+
+def get_tokenizer(hf_repo_name):
+    """
+    Tokenizer to use for Llama3
+    """
+    tokenizer = AutoTokenizer.from_pretrained(hf_repo_name, use_fast=False)
+    tokenizer.padding_side = "left"
+    tokenizer.pad_token = tokenizer.eos_token
+    tokenizer.pad_token_id = tokenizer.eos_token_id
+    tokenizer.truncation_side = "left"
+    return tokenizer
+
+
+def prepare_decoder_attention_mask(
+    attention_mask, input_shape, inputs_embeds, past_key_values_length, mask_neg=-50.0
+):
+    # Copied from transformers.models.bart.modeling_bart._make_causal_mask
+    def _make_causal_mask(
+        input_ids_shape: torch.Size,
+        dtype: torch.dtype,
+        device: torch.device,
+        past_key_values_length: int = 0,
+        mask_neg: float = -50.0,
+    ):
+        """
+        Make causal mask used for bi-directional self-attention.
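+
+        For example, with tgt_len=3, past_key_values_length=2, and
+        mask_neg=-50, each query position attends to every past position
+        and to new positions up to and including itself:
+
+            [[0, 0,   0, -50, -50],
+             [0, 0,   0,   0, -50],
+             [0, 0,   0,   0,   0]]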
+        """
+        bsz, tgt_len = input_ids_shape[0], input_ids_shape[1]
+        # mask = torch.full((tgt_len, tgt_len), torch.tensor(torch.finfo(dtype).min, device=device), device=device)
+        mask = torch.full(
+            (tgt_len, tgt_len), torch.tensor(mask_neg, device=device), device=device
+        )
+        mask_cond = torch.arange(mask.size(-1), device=device)
+        mask.masked_fill_(mask_cond < (mask_cond + 1).view(mask.size(-1), 1), 0)
+        mask = mask.to(dtype)
+
+        if past_key_values_length > 0:
+            mask = torch.cat(
+                [
+                    torch.zeros(
+                        tgt_len, past_key_values_length, dtype=dtype, device=device
+                    ),
+                    mask,
+                ],
+                dim=-1,
+            )
+        return mask[None, None, :, :].expand(
+            bsz, 1, tgt_len, tgt_len + past_key_values_length
+        )
+
+    # Copied from transformers.models.bart.modeling_bart._expand_mask
+    def _expand_mask(
+        mask: torch.Tensor,
+        dtype: torch.dtype,
+        mask_neg: float = -50.0,
+        tgt_len: Optional[int] = None,
+    ):
+        """
+        Expands attention_mask from `[bsz, seq_len]` to `[bsz, 1, tgt_seq_len, src_seq_len]`.
+        """
+        bsz, src_len = mask.size()
+        tgt_len = tgt_len if tgt_len is not None else src_len
+
+        expanded_mask = (
+            mask[:, None, None, :].expand(bsz, 1, tgt_len, src_len).to(dtype)
+        )
+
+        inverted_mask = 1.0 - expanded_mask
+
+        # return inverted_mask.masked_fill(inverted_mask.to(torch.bool), torch.finfo(dtype).min)
+        return inverted_mask.masked_fill(inverted_mask.to(torch.bool), mask_neg)
+
+    # create causal mask
+    # [bsz, seq_len] -> [bsz, 1, tgt_seq_len, src_seq_len]
+    combined_attention_mask = None
+    if input_shape[-1] > 1:
+        combined_attention_mask = _make_causal_mask(
+            input_shape,
+            inputs_embeds.dtype,
+            device=inputs_embeds.device,
+            past_key_values_length=past_key_values_length,
+            mask_neg=mask_neg,
+        )
+
+    if attention_mask is not None:
+        # [bsz, seq_len] -> [bsz, 1, tgt_seq_len, src_seq_len]
+
+        expanded_attn_mask = _expand_mask(
+            attention_mask,
+            inputs_embeds.dtype,
+            tgt_len=input_shape[1],
+            mask_neg=mask_neg,
+        ).to(inputs_embeds.device)
+
+        combined_attention_mask = (
+            expanded_attn_mask
+            if combined_attention_mask is None
+            else expanded_attn_mask + combined_attention_mask
+        )
+
+    return combined_attention_mask
+
+
+def prepare_combined_attention_mask(
+    attention_mask,
+    input_shape,
+    past_key_values_length,
+    mask_neg=-50.0,
+    dtype=torch.float32,
+):
+    dummy_embedding = torch.tensor((1.0,)).to(torch.float32)
+    new_mask = prepare_decoder_attention_mask(
+        attention_mask, input_shape, dummy_embedding, past_key_values_length, mask_neg
+    )
+    return new_mask.clamp_min(mask_neg).to(dtype)
+
+
+def get_past_keyval_with_shift(
+    past_key_vals: List[torch.Tensor],
+    new_key_vals: List[torch.Tensor],
+    length: int,
+) -> List[torch.Tensor]:
+    """
+    Clip the past key values so that, together with the new entries, they fit
+    within `length` positions for the next iteration.
+    """
+    ret = []
+    # Keys and values alternate in the flat list; the heads of each tensor are
+    # stacked on the batch dimension. Keys are concatenated on dim 3 (they are
+    # stored transposed) and values on dim 2.
+    for i in range(0, len(past_key_vals), 2):
+        n = new_key_vals[i].shape[3]
+        m = past_key_vals[i].shape[3]
+        remove = n + m - length
+        key_cache = torch.cat(
+            [past_key_vals[i][:, :, :, remove:], new_key_vals[i]], dim=3
+        )
+        val_cache = torch.cat(
+            [past_key_vals[i + 1][:, :, remove:], new_key_vals[i + 1]], dim=2
+        )
+
+        ret.append(key_cache)
+        ret.append(val_cache)
+    return ret
+
+
+def monkey_patch_huggingface_llama_modeling():
+    modeling_llama.LLAMA_ATTENTION_CLASSES["eager"] = SHALlamaAttention
+
+    def bypass_RotaryEmbedding(self, x, position_ids, *args, **kwargs):
+        return position_ids
+
+    # Bypass the rotary_emb module; rotary embeddings are applied explicitly
+    # via QcLlama_apply_rotary_pos_emb instead.
+    modeling_llama.LlamaRotaryEmbedding.forward = bypass_RotaryEmbedding
+    modeling_llama.apply_rotary_pos_emb = QcLlama_apply_rotary_pos_emb
+
+    def LlamaRMSNorm_forward(self, hidden_states):
+        # Raise to rank 4
+        hidden_states = hidden_states.unsqueeze(0)
+        variance = hidden_states.pow(2).mean(-1, keepdim=True)
+        hidden_states = hidden_states * torch.rsqrt(variance + self.variance_epsilon)
+        return (hidden_states * self.weight).squeeze(0)
+
+    modeling_llama.LlamaRMSNorm.forward = LlamaRMSNorm_forward
+
+
+class Llama3Base_Quantized(Llama_QuantizedMixin, ABC):
+    def __init__(
+        self,
+        huggingface_model_name: str,
+        min_memory_recommended: int,
+        aimet_encodings: str,
+        sequence_length: int,
+        context_length: int,
+        load_pretrained: bool = True,
+        _make_small_for_debugging: bool = False,  # construct a small and incorrect network
+    ):
+        """
+        This is an abstract base class for all Llama 3 models.
+
+        Parameters
+        ----------
+
+        huggingface_model_name:
+            Name of the HuggingFace model. Subclasses should provide a default
+            for this.
+        min_memory_recommended:
+            Minimum recommended memory in GB for running export.
+        aimet_encodings:
+            AIMET encodings file.
+        sequence_length:
+            Input sequence length (in tokens).
+        context_length:
+            Total context length (in tokens).
+        load_pretrained:
+            Load a pre-trained model as opposed to a randomly initialized one.
+        """
+
+        # from transformers.models.llama import modeling_llama
+        self.huggingface_model_name = huggingface_model_name
+
+        # Ensure the user has access to the model; otherwise, point to
+        # instructions for getting access and error out.
+        has_model_access(self.huggingface_model_name)
+
+        # Ensure the user has the recommended amount of memory; otherwise,
+        # warn and recommend increasing swap space as a work-around.
+        has_recommended_memory(min_memory_recommended)
+
+        self.llm_config = self._llm_config(
+            _make_small_for_debugging=_make_small_for_debugging
+        )
+
+        # TODO: Make this into a context manager
+        monkey_patch_huggingface_llama_modeling()
+
+        if load_pretrained:
+            model = modeling_llama.LlamaForCausalLM.from_pretrained(
+                self.huggingface_model_name,
+                config=self.llm_config,
+                ignore_mismatched_sizes=_make_small_for_debugging,
+            )
+        else:
+            model = modeling_llama.LlamaForCausalLM(self.llm_config)
+        model.eval()
+
+        os.environ["TOKENIZERS_PARALLELISM"] = "0"
+
+        for name, module in model.named_modules():
+            if hasattr(module, "prepare_conv"):
+                module.prepare_conv()
+            if hasattr(module, "prepare_sha"):
+                module.prepare_sha()
+
+        super().__init__(model, aimet_encodings)
+
+        self.sequence_length = sequence_length
+        self.context_length = context_length
+        self.tokenizer = get_tokenizer(self.huggingface_model_name)
+
+    def _llm_config(self, _make_small_for_debugging: bool = False):
+        """
+        Construct and return a HuggingFace LLM config.
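+
+        When `_make_small_for_debugging` is set, the config is shrunk to a
+        few layers and heads so that the export pipeline can be exercised
+        quickly; the resulting network is deliberately small and incorrect.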
+ """ + llm_config = AutoConfig.from_pretrained( + self.huggingface_model_name, trust_remote_code=True + ) + if _make_small_for_debugging: + llm_config.num_hidden_layers = 8 + llm_config.num_attention_heads = 4 + llm_config.num_key_value_heads = 2 + llm_config.vocab_size = 13 + embed_dim = 8 + llm_config.head_dim = embed_dim * 2 + llm_config.hidden_size = llm_config.num_attention_heads * embed_dim * 2 + llm_config._attn_implementation = "eager" + llm_config._attn_implementation_internal = "eager" + + return llm_config + + @abstractmethod + def from_pretrained( + cls, + sequence_length: int, + context_length: int = DEFAULT_CONTEXT_LENGTH, + aimet_encodings: str | None = "DEFAULT", + ) -> "Llama3Base_Quantized": + pass + + @staticmethod + def get_output_names(num_hidden_layers: int): + output_names = ["logits"] + for layer in range(num_hidden_layers): + output_names.append(f"past_key_{layer}_out") + output_names.append(f"past_value_{layer}_out") + return output_names + + def forward( + self, + input_ids, + attention_mask, + position_ids_cos, + position_ids_sin, + *past_key_values, + ): + kv_cache = SHADynamicCacheNewValueOnly() + for layer_idx, (k, v) in enumerate( + zip(past_key_values[::2], past_key_values[1::2]) + ): + k_split = [k[i : i + 1] for i in range(self.llm_config.num_key_value_heads)] + v_split = [v[i : i + 1] for i in range(self.llm_config.num_key_value_heads)] + kv_cache.update(k_split, v_split, layer_idx, {}) + + out = self.model( + input_ids=input_ids, + attention_mask=attention_mask, + position_ids=[position_ids_cos, position_ids_sin], + past_key_values=kv_cache, + ) + + out_cache = out["past_key_values"] + flat_output_past_key_values = [] + for layer in range(len(out_cache)): + k = torch.cat(out_cache.key_cache[layer], dim=0) + v = torch.cat(out_cache.value_cache[layer], dim=0) + flat_output_past_key_values += [k, v] + + return [out["logits"]] + flat_output_past_key_values + + def get_qnn_graph_name(self) -> Optional[str]: + # Graph name of splits is determined by export script + return None + + @staticmethod + def get_input_spec( + num_hidden_layers: int, + input_seq_length: int, + context_length: int, + hidden_size: int, + num_key_value_heads: int, + num_attention_heads: int, + ) -> InputSpec: + embed_dim = hidden_size // num_attention_heads // 2 + input_spec = { + "input_ids": ((1, input_seq_length), "int32"), + "attention_mask": ( + (1, 1, input_seq_length, context_length), + "float32", + ), + # These are half the length of the hidden size per head because + # each cos/sin are applied to a half-sliced copy of the hidden size + # and then concatenated. + "position_ids_cos": ( + (1, 1, input_seq_length, embed_dim), + "float32", + ), + "position_ids_sin": ( + (1, 1, input_seq_length, embed_dim), + "float32", + ), + } + + # TODO: We could support input_seq_length == CONTEXT_LENGTH, but the + # KV cache input needs to be removed. + assert ( + input_seq_length < context_length + ), "It is currently not supported to set input sequence length to the same as or longer than context length. There should be no KV cache input at all in such case." 
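+        # Each KV-cache input holds context_length - input_seq_length past
+        # entries. Keys are stored transposed, (heads, 1, head_dim, positions),
+        # while values are (heads, 1, positions, head_dim), matching the
+        # concatenation axes used in get_past_keyval_with_shift.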
+ + for layer in range(num_hidden_layers): + past_k_name = f"past_key_{layer}_in" + input_spec[past_k_name] = ( + ( + num_key_value_heads, + 1, + embed_dim * 2, + context_length - input_seq_length, + ), + "float32", + ) + + past_v_name = f"past_value_{layer}_in" + input_spec[past_v_name] = ( + ( + num_key_value_heads, + 1, + context_length - input_seq_length, + embed_dim * 2, + ), + "float32", + ) + return input_spec + + def _use_zip_file(self) -> bool: + """ + Should the return of convert_to_hub_source_model be zipped. + """ + return False + + def preferred_hub_source_model_format( + self, target_runtime: TargetRuntime + ) -> SourceModelFormat: + """ + Source model format preferred for conversion on AI Hub. + """ + return SourceModelFormat.ONNX + + def get_calibration_data( + self, + target_runtime: TargetRuntime | None = None, + input_spec: InputSpec | None = None, + ) -> DatasetEntries | None: + # No calibration data needed + return None + + def _adapt_aimet_encodings( + self, src_encodings_path, dst_encodings_path, onnx_model_path + ): + """ + Adapt encodings from AIMET Pro to vanilla onnx export. + + Works for the new 3.0 and 3.1 encodings. + """ + import onnx + + with open(src_encodings_path) as f: + encodings = json.load(f) + + model = onnx.load(onnx_model_path) + + model_input_names = {} + for node in model.graph.node: + model_input_names[node.name] = node.input + + model_names = ( + set([o for x in model.graph.node for o in x.output]) + | set([x.name for x in model.graph.input]) + | set([x.name for x in model.graph.output]) + ) + model_param_names = set([x.name for x in model.graph.initializer]) + + uses_lists = isinstance(encodings["activation_encodings"], list) + if uses_lists: + # Convert encodings to dictionaries for faster look-ups + encodings["activation_encodings"] = { + v["name"]: v for v in encodings["activation_encodings"] + } + encodings["param_encodings"] = { + v["name"]: v for v in encodings["param_encodings"] + } + + enc_names = set(encodings["activation_encodings"].keys()) + enc_param_names = set(encodings["param_encodings"].keys()) + + new_encodings = { + "activation_encodings": {}, + "excluded_layers": [], + "param_encodings": {}, + "quantizer_args": encodings["quantizer_args"], + "version": encodings["version"], + } + + all_names = model_param_names | model_names + num_attention_heads = self.llm_config.num_attention_heads + num_key_value_heads = self.llm_config.num_key_value_heads + mapping, rev_mapping, known_unused = map_encodings( + [ + ( + r"/model_layers_(\d+)_input_layernorm_Mul_1/Mul_output_0", + "/model/model/layers.{0}/input_layernorm/Mul_1_output_0", + ), + ( + r"/model_layers_(\d+)_self_attn_q_proj_conv_Conv/Conv_output_0", + [ + f"/model/model/layers.{{0}}/self_attn/q_proj_sha.{i}/Conv_output_0" + for i in range(num_attention_heads) + ], + ), + ( + r"/model_layers_(\d+)_self_attn_Mul/Mul_output_0", + [ + f"/model/model/layers.{{0}}/self_attn/Mul{onnx_counting(i * 4)}_output_0" + for i in range(num_attention_heads) + ], + ), + ( + r"/model_layers_(\d+)_self_attn_Mul_2/Mul_output_0", + [ + f"/model/model/layers.{{0}}/self_attn/Mul{onnx_counting(2 + i * 4)}_output_0" + for i in range(num_attention_heads) + ], + ), + ( + r"/model_layers_(\d+)_self_attn_Mul_1/Mul_output_0", + [ + f"/model/model/layers.{{0}}/self_attn/Mul{onnx_counting(1 + i * 4)}_output_0" + for i in range(num_attention_heads) + ], + ), + ( + r"/model_layers_(\d+)_self_attn_Mul_3/Mul_output_0", + [ + f"/model/model/layers.{{0}}/self_attn/Mul{onnx_counting(3 + i * 4)}_output_0" + for i in 
range(num_attention_heads) + ], + ), + ( + r"/model_layers_(\d+)_self_attn_Sub/Sub_output_0", + [ + f"/model/model/layers.{{0}}/self_attn/Sub{onnx_counting(i)}_output_0" + for i in range(num_attention_heads) + ], + ), + ( + r"/model_layers_(\d+)_self_attn_Add/Add_output_0", + [ + f"/model/model/layers.{{0}}/self_attn/Add{onnx_counting(i)}_output_0" + for i in range(num_attention_heads) + ], + ), + ( + r"/model_layers_(\d+)_self_attn_k_proj_conv_Conv/Conv_output_0", + [ + f"/model/model/layers.{{0}}/self_attn/k_proj_sha.{i}/Conv_output_0" + for i in range(num_key_value_heads) + ], + ), + ( + r"/model_layers_(\d+)_self_attn_Mul_4/Mul_output_0", + [ + f"/model/model/layers.{{0}}/self_attn/Mul{onnx_counting(num_attention_heads * 4 + i * 4)}_output_0" + for i in range(num_attention_heads) + ], + ), + ( + r"/model_layers_(\d+)_self_attn_Mul_6/Mul_output_0", + [ + f"/model/model/layers.{{0}}/self_attn/Mul{onnx_counting(num_attention_heads * 4 + 2 + i * 4)}_output_0" + for i in range(num_attention_heads) + ], + ), + ( + r"/model_layers_(\d+)_self_attn_Mul_5/Mul_output_0", + [ + f"/model/model/layers.{{0}}/self_attn/Mul{onnx_counting(num_attention_heads * 4 + 1 + i * 4)}_output_0" + for i in range(num_attention_heads) + ], + ), + ( + r"/model_layers_(\d+)_self_attn_Mul_7/Mul_output_0", + [ + f"/model/model/layers.{{0}}/self_attn/Mul{onnx_counting(num_attention_heads * 4 + 3 + i * 4)}_output_0" + for i in range(num_attention_heads) + ], + ), + ( + r"/model_layers_(\d+)_self_attn_Sub_1/Sub_output_0", + [ + f"/model/model/layers.{{0}}/self_attn/Sub{onnx_counting(num_attention_heads + i)}_output_0" + for i in range(num_attention_heads) + ], + ), + ( + r"/model_layers_(\d+)_self_attn_Add_1/Add_output_0", + [ + f"/model/model/layers.{{0}}/self_attn/Add{onnx_counting(num_attention_heads + i)}_output_0" + for i in range(num_key_value_heads) + ], + ), + ( + r"/model_layers_(\d+)_self_attn_v_proj_conv_Conv/Conv_output_0", + [ + f"/model/model/layers.{{0}}/self_attn/v_proj_sha.{i}/Conv_output_0" + for i in range(num_key_value_heads) + ], + ), + ( + r"/model_layers_(\d+)_self_attn_MatMul/MatMul_output_0", + [ + f"/model/model/layers.{{0}}/self_attn/MatMul{onnx_counting(i)}_output_0" + for i in range(num_attention_heads) + ], + ), + ( + r"/model_layers_(\d+)_self_attn_Div/Div_output_0", + [ + f"/model/model/layers.{{0}}/self_attn/Div{onnx_counting(i)}_output_0" + for i in range(num_attention_heads) + ], + ), + ( + r"/model_layers_(\d+)_self_attn_Add_2/Add_output_0", + [ + f"/model/model/layers.{{0}}/self_attn/Add{onnx_counting(num_attention_heads + num_key_value_heads + i)}_output_0" + for i in range(num_attention_heads) + ], + ), + ( + r"/model_layers_(\d+)_self_attn_Softmax/Softmax_output_0", + [ + f"/model/model/layers.{{0}}/self_attn/Softmax{onnx_counting(i)}_output_0" + for i in range(num_attention_heads) + ], + ), + ( + r"/model_layers_(\d+)_self_attn_MatMul_1/MatMul_output_0", + [ + f"/model/model/layers.{{0}}/self_attn/MatMul{onnx_counting(num_attention_heads + i)}_output_0" + for i in range(num_attention_heads) + ], + ), + ( + r"/model_layers_(\d+)_self_attn_o_proj_conv_Conv/Conv_output_0", + "/model/model/layers.{0}/self_attn/o_proj_conv/Conv_output_0", + ), + ( + r"/model_layers_(\d+)_Add/Add_output_0", + "/model/model/layers.{0}/Add_output_0", + ), + ( + r"/model_layers_(\d+)_post_attention_layernorm_Mul_1/Mul_output_0", + "/model/model/layers.{0}/post_attention_layernorm/Mul_1_output_0", + ), + ( + r"/model_layers_(\d+)_mlp_gate_proj_conv_Conv/Conv_output_0", + 
"/model/model/layers.{0}/mlp/gate_proj/MatMul_output_0", + ), + ( + r"/model_layers_(\d+)_mlp_act_fn_Sigmoid/Sigmoid_output_0", + "/model/model/layers.{0}/mlp/act_fn/Sigmoid_output_0", + ), + ( + r"/model_layers_(\d+)_mlp_act_fn_Mul/Mul_output_0", + "/model/model/layers.{0}/mlp/act_fn/Mul_output_0", + ), + ( + r"/model_layers_(\d+)_mlp_up_proj_conv_Conv/Conv_output_0", + "/model/model/layers.{0}/mlp/up_proj/MatMul_output_0", + ), + ( + r"/model_layers_(\d+)_mlp_Mul/Mul_output_0", + "/model/model/layers.{0}/mlp/Mul_output_0", + ), + ( + r"/model_layers_(\d+)_mlp_down_proj_conv_Conv/Conv_output_0", + "/model/model/layers.{0}/mlp/down_proj/MatMul_output_0", + ), + ( + r"/model_layers_(\d+)_Add_1/Add_output_0", + "/model/model/layers.{0}/Add_1_output_0", + ), + ("/model_norm_Mul_1/Mul_output_0", "/model/model/norm/Mul_1_output_0"), + ("/lm_head_conv_Conv/Conv_output_0", "/model/lm_head/MatMul_output_0"), + (r"(.*)", "{0}"), + ], + enc_names, + all_names, + src_encodings=encodings["activation_encodings"], + dst_encodings=new_encodings["activation_encodings"], + ) + + def split_weights( + src_encodings, + dst_encodings, + src_name, + dst_name, + dst_pattern_index, + num_patterns, + groups, + ): + if src_name in src_encodings: + src_entry = src_encodings[src_name] + dst_entry = deepcopy(src_entry) + # Slice it! + if isinstance(dst_entry, dict): + dst_entry["name"] = dst_name + for key in ["scale", "offset", "per_block_int_scale"]: + n = len(dst_entry[key]) // num_patterns + dst_entry[key] = dst_entry[key][ + dst_pattern_index * n : (dst_pattern_index + 1) * n + ] + + # dst_encodings.append(dst_entry) + dst_encodings[dst_name] = dst_entry + else: + n = len(dst_entry) // num_patterns + dst_entry = dst_entry[ + dst_pattern_index * n : (dst_pattern_index + 1) * n + ] + dst_encodings[dst_name] = dst_entry + + # These parameters are stored as activations + param_mapping, rev_param_mapping, param_known_unused = map_encodings( + [ + ( + r"model_layers_(\d+)_(input|post_attention)_layernorm_weight", + "model.model.layers.{0}.{1}_layernorm.weight", + ), + (r"model_norm_weight", "model.model.norm.weight"), + ], + enc_names, + all_names, + src_encodings=encodings["activation_encodings"], + dst_encodings=new_encodings["param_encodings"], + ) + + # Process weight mappings + param_mapping, rev_param_mapping, param_known_unused = map_encodings( + [ + ("model_embed_tokens_Gather.weight", "model.model.embed_tokens.weight"), + ( + r"model_layers_(\d+)_self_attn_(k|v)_proj_conv_Conv.weight", + ( + ( + [ + f"model.model.layers.{{0}}.self_attn.{{1}}_proj_sha.{i}.weight" + for i in range(num_key_value_heads) + ] + ), + split_weights, + ), + ), + ( + r"model_layers_(\d+)_self_attn_q_proj_conv_Conv.weight", + ( + ( + [ + f"model.model.layers.{{0}}.self_attn.q_proj_sha.{i}.weight" + for i in range(num_attention_heads) + ] + ), + split_weights, + ), + ), + ( + r"model_layers_(\d+)_self_attn_o_proj_conv_Conv.weight", + "model.model.layers.{0}.self_attn.o_proj_conv.weight", + ), + ( + r"model_layers_(\d+)_mlp_(gate|up|down)_proj_conv_Conv.weight", + ("/model/model/layers.{0}/mlp/{1}_proj/MatMul", 1), + ), + (r"lm_head_conv_Conv.weight", ("/model/lm_head/MatMul", 1)), + ], + enc_param_names, + all_names, + model_input_names, + src_encodings=encodings["param_encodings"], + dst_encodings=new_encodings["param_encodings"], + ) + + # This is needed for subtle reasons. + # Gather ops require weights and output range to be the same, so that + # it can be implemented as a memory look-up. 
Therefore, AIMET does not
+        # store the output activation. However, since we may split the model
+        # right after this op, the input to the second part could end up
+        # without activation encodings.
+        embed_a_name = "/model/model/embed_tokens/Gather_output_0"
+        embed_w_name = "model.model.embed_tokens.weight"
+        new_encodings["activation_encodings"][embed_a_name] = new_encodings[
+            "param_encodings"
+        ][embed_w_name]
+        if uses_lists:
+            new_encodings["activation_encodings"][embed_a_name]["name"] = embed_a_name
+
+        # Fill in "zero" encodings for RMSNorm internals. If these are not
+        # collapsed before runtime, this will result in catastrophic numerical
+        # results (which is good, since it is better to catch this bug than to
+        # get a slightly worse model, which can be hard to detect).
+        zero_keys = []
+        for layer in range(self.llm_config.num_hidden_layers):
+            for sec in ["input", "post_attention"]:
+                zero_keys += [
+                    f"/model/model/layers.{layer}/{sec}_layernorm/Pow_output_0",
+                    f"/model/model/layers.{layer}/{sec}_layernorm/ReduceMean_output_0",
+                    f"/model/model/layers.{layer}/{sec}_layernorm/Add_output_0",
+                    f"/model/model/layers.{layer}/{sec}_layernorm/Sqrt_output_0",
+                    f"/model/model/layers.{layer}/{sec}_layernorm/Div_output_0",
+                    f"/model/model/layers.{layer}/{sec}_layernorm/Mul_output_0",
+                ]
+
+        zero_keys += [
+            "/model/model/norm/Pow_output_0",
+            "/model/model/norm/ReduceMean_output_0",
+            "/model/model/norm/Add_output_0",
+            "/model/model/norm/Sqrt_output_0",
+            "/model/model/norm/Div_output_0",
+            "/model/model/norm/Mul_output_0",
+        ]
+
+        for key in zero_keys:
+            if uses_lists:
+                # aimet format 1.0
+                zero_entry = {
+                    "bw": 16,
+                    "dtype": "INT",
+                    "enc_type": "PER_TENSOR",
+                    "is_sym": False,
+                    "name": key,
+                    "offset": [0],
+                    "scale": [1e-20],
+                }
+            else:
+                # aimet format 0.x
+                zero_entry = [
+                    {
+                        "bitwidth": 16,
+                        "dtype": "int",
+                        "is_symmetric": "False",
+                        "max": 0.0,
+                        "min": 0.0,
+                        "offset": 0,
+                        "scale": 1e-20,
+                    }
+                ]
+            new_encodings["activation_encodings"][key] = zero_entry
+
+        # Propagate encodings through shape-preserving ops that would
+        # otherwise be left without an encoding.
+        changes = True
+        while changes:
+            changes = False
+            for node in model.graph.node:
+                if node.output[0] in new_encodings["activation_encodings"]:
+                    continue
+
+                if node.op_type in {
+                    "Concat",
+                    "Split",
+                    "Transpose",
+                    "Cast",
+                    "Reshape",
+                    "Slice",
+                }:
+                    if node.input[0] in new_encodings["activation_encodings"]:
+                        for output_name in node.output:
+                            dst_entry = deepcopy(
+                                new_encodings["activation_encodings"][node.input[0]]
+                            )
+                            if isinstance(dst_entry, dict):
+                                dst_entry["name"] = output_name
+                            new_encodings["activation_encodings"][
+                                output_name
+                            ] = dst_entry
+                            enc_names.add(output_name)
+                            changes = True
+
+        if uses_lists:
+            # convert back
+            new_encodings["activation_encodings"] = list(
+                new_encodings["activation_encodings"].values()
+            )
+            new_encodings["param_encodings"] = list(
+                new_encodings["param_encodings"].values()
+            )
+
+        with open(dst_encodings_path, "w") as write_file:
+            json.dump(new_encodings, write_file, indent=4, sort_keys=True)
+
+    def _sample_inputs_impl(
+        self, input_spec: InputSpec | None = None
+    ) -> SampleInputsType:
+        if not input_spec:
+            input_spec = self.get_input_spec(
+                input_seq_length=self.sequence_length,
+                num_hidden_layers=self.llm_config.num_hidden_layers,
+                context_length=self.context_length,
+                hidden_size=self.llm_config.hidden_size,
+                num_attention_heads=self.llm_config.num_attention_heads,
+                num_key_value_heads=self.llm_config.num_key_value_heads,
+            )
+        input_prompt = DEFAULT_USER_PROMPT
+        input_prompt_processed = get_input_prompt_with_tags(
+            user_input_prompt=input_prompt
+        )
+        input_tokens = self.tokenizer(
+            input_prompt_processed,
+            return_tensors="pt",
+            padding="max_length",
+            max_length=self.context_length,
+        )
+        num_tokens = min(
+            torch.sum(input_tokens["attention_mask"]).item(), self.sequence_length
+        )
+        input_ids = input_tokens["input_ids"].type(torch.int32)[
+            :, -self.sequence_length :
+        ]
+
+        padding_size = self.sequence_length - num_tokens
+        position_ids = [0] * padding_size + list(
+            range(0, self.sequence_length - padding_size)
+        )
+        position_ids = (
+            torch.Tensor(position_ids).type(torch.long).reshape(1, self.sequence_length)
+        )
+        rope_embedding = RopeEmbedding(max_length=self.context_length)
+        position_ids_cos, position_ids_sin = rope_embedding.get_embedding(position_ids)
+        attention_mask = torch.zeros((1, self.context_length))
+        attention_mask[:, -num_tokens:] = 1.0
+        cm_attention_masks = prepare_combined_attention_mask(
+            attention_mask=attention_mask,
+            input_shape=(1, self.sequence_length),
+            past_key_values_length=self.context_length - self.sequence_length,
+        )
+
+        input_dict = {
+            "input_ids": [input_ids.detach().numpy()],
+            "attention_mask": [cm_attention_masks.detach().numpy()],
+            "position_ids_cos": [position_ids_cos.detach().numpy()],
+            "position_ids_sin": [position_ids_sin.detach().numpy()],
+        }
+
+        # Populate the rest with zeros (KV cache input)
+        for k, (shape, _) in input_spec.items():
+            if k.startswith("past_"):
+                input_dict[k] = [np.zeros(shape, dtype=np.float32)]
+
+        return input_dict
diff --git a/qai_hub_models/models/_shared/llama3/model_adaptations.py b/qai_hub_models/models/_shared/llama3/model_adaptations.py
new file mode 100644
index 00000000..113508f8
--- /dev/null
+++ b/qai_hub_models/models/_shared/llama3/model_adaptations.py
@@ -0,0 +1,289 @@
+# ---------------------------------------------------------------------
+# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+import math
+from typing import Any, Dict, List, Optional, Tuple
+
+import torch
+from torch import nn
+from transformers.cache_utils import DynamicCache
+from transformers.models.llama.modeling_llama import LlamaAttention
+
+
+# Copied from transformers.models.llama.modeling_llama.repeat_kv
+def repeat_kv(hidden_states: torch.Tensor, n_rep: int) -> torch.Tensor:
+    """
+    This is the equivalent of torch.repeat_interleave(x, dim=1, repeats=n_rep).
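+    When hidden_states is a list of per-head tensors (the split-head layout
+    used in this module), each head is simply repeated n_rep times.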
+    The hidden states go from (batch, num_key_value_heads, seqlen, head_dim)
+    to (batch, num_attention_heads, seqlen, head_dim)
+    """
+    if isinstance(hidden_states, list):
+        return [head for head in hidden_states for _ in range(n_rep)]
+
+    batch, num_key_value_heads, slen, head_dim = hidden_states.shape
+    if n_rep == 1:
+        return hidden_states
+    hidden_states = hidden_states[:, :, None, :, :].expand(
+        batch, num_key_value_heads, n_rep, slen, head_dim
+    )
+    return hidden_states.reshape(batch, num_key_value_heads * n_rep, slen, head_dim)
+
+
+def _apply_rope_single(x, rope_vals: Tuple[torch.Tensor, torch.Tensor]):
+    """
+    Based on FacebookResearch's llama, provided by Carl
+    """
+    rope_real = rope_vals[0]  # shape should be 1, 1, seqlen, head_dim/2
+    rope_im = rope_vals[1]  # shape should be 1, 1, seqlen, head_dim/2
+
+    # TODO: Why does HF use different coordinates from the paper?
+    x_real = x[:, :, :, : x.shape[-1] // 2]  # extract first half elements
+    x_im = x[:, :, :, x.shape[-1] // 2 :]  # extract second half elements
+
+    x_prod_real = x_real * rope_real - x_im * rope_im
+    x_prod_im = x_real * rope_im + x_im * rope_real
+
+    # TODO: HF needs to use different interleaving
+    x = torch.cat((x_prod_real, x_prod_im), dim=3).view(*x.shape)
+    return x
+
+
+def QcLlama_apply_rotary_pos_emb(q, k, cos, sin, position_ids=None, unsqueeze_dim=1):
+    query_states = _apply_rope_single(q, [cos, sin])
+    key_states = _apply_rope_single(k, [cos, sin])
+    return query_states, key_states
+
+
+class SHADynamicCacheNewValueOnly(DynamicCache):
+    """
+    Version of DynamicCache that stores the cache as lists of separate heads
+    (so as to avoid concats/splits for SHA) and returns only the new values
+    without accumulation.
+    """
+
+    def update(
+        self,
+        key_states: List[torch.Tensor],
+        value_states: List[torch.Tensor],
+        layer_idx: int,
+        cache_kwargs: Optional[Dict[str, Any]] = None,
+    ) -> Tuple[torch.Tensor, torch.Tensor]:
+        # Update the number of seen tokens
+        if layer_idx == 0:
+            # self._seen_tokens += key_states.shape[-2]
+            # This line is updated
+            self._seen_tokens += key_states[0].shape[-2]
+
+        # Update the cache
+        if len(self.key_cache) <= layer_idx:
+            self.key_cache.append(key_states)
+            self.value_cache.append(value_states)
+        else:
+            # Do not concatenate the cache, we only need the latest entry
+            self.key_cache[layer_idx] = key_states
+            self.value_cache[layer_idx] = value_states
+
+        return self.key_cache[layer_idx], self.value_cache[layer_idx]
+
+    def get_seq_length(self, layer_idx: Optional[int] = 0) -> int:
+        """Returns the sequence length of the cached states.
A layer index can be optionally passed.""" + if len(self.key_cache) <= layer_idx: + return 0 + # [0] added to get shape since the outermost is list + return self.key_cache[layer_idx][0].shape[-2] + + +class SHALlamaAttention(LlamaAttention): + """ + Split-Head Attention version of LlamaAttention (with Convs) + """ + + def prepare_conv(self): + if not hasattr(self, "forward_no_conv"): + self.q_proj_conv = nn.Conv2d( + self.hidden_size, self.num_heads * self.head_dim, 1, bias=False + ) + self.k_proj_conv = nn.Conv2d( + self.hidden_size, + self.num_key_value_heads * self.head_dim, + 1, + bias=False, + ) + self.v_proj_conv = nn.Conv2d( + self.hidden_size, + self.num_key_value_heads * self.head_dim, + 1, + bias=False, + ) + self.o_proj_conv = nn.Conv2d( + self.num_heads * self.head_dim, self.hidden_size, 1, bias=False + ) + + self.q_proj_conv.weight.data.copy_(self.q_proj.weight[:, :, None, None]) + self.k_proj_conv.weight.data.copy_(self.k_proj.weight[:, :, None, None]) + self.v_proj_conv.weight.data.copy_(self.v_proj.weight[:, :, None, None]) + self.o_proj_conv.weight.data.copy_(self.o_proj.weight[:, :, None, None]) + + del self.q_proj + del self.k_proj + del self.v_proj + del self.o_proj + + def prepare_sha(self): + if not hasattr(self, "forward_mha"): + self.q_proj_sha = nn.ModuleList( + [ + nn.Conv2d(self.hidden_size, self.head_dim, 1, bias=False) + for _ in range(self.num_heads) + ] + ) + self.k_proj_sha = nn.ModuleList( + [ + nn.Conv2d(self.hidden_size, self.head_dim, 1, bias=False) + for _ in range(self.num_key_value_heads) + ] + ) + self.v_proj_sha = nn.ModuleList( + [ + nn.Conv2d(self.hidden_size, self.head_dim, 1, bias=False) + for _ in range(self.num_key_value_heads) + ] + ) + if not hasattr(self, "o_proj_conv"): + self.o_proj_conv = nn.Conv2d( + self.num_heads * self.head_dim, self.hidden_size, 1, bias=False + ) + self.o_proj_conv.weight.data.copy_(self.o_proj.weight[:, :, None, None]) + del self.o_proj + + self.forward_mha = self.forward + self.forward = self.forward_sha + + for i in range(self.num_heads): + self.q_proj_sha[i].weight.data.copy_( + self.q_proj_conv.weight[i * self.head_dim : (i + 1) * self.head_dim, :] + ) + + for i in range(self.num_key_value_heads): + self.k_proj_sha[i].weight.data.copy_( + self.k_proj_conv.weight[i * self.head_dim : (i + 1) * self.head_dim, :] + ) + self.v_proj_sha[i].weight.data.copy_( + self.v_proj_conv.weight[i * self.head_dim : (i + 1) * self.head_dim, :] + ) + + del self.q_proj_conv + del self.k_proj_conv + del self.v_proj_conv + + def forward_sha( + self, + hidden_states: torch.Tensor, + attention_mask: Optional[torch.Tensor] = None, + position_ids: Optional[torch.LongTensor] = None, + past_key_value: Optional[Tuple[torch.Tensor]] = None, + output_attentions: bool = False, + use_cache: bool = False, + cache_position: Optional[torch.LongTensor] = None, + position_embeddings: Optional[ + Tuple[torch.Tensor, torch.Tensor] + ] = None, # will become mandatory in v4.45 + ) -> Tuple[torch.Tensor, Optional[torch.Tensor], Optional[Tuple[torch.Tensor]]]: + + bsz, q_len, _ = hidden_states.size() + + hidden_states = torch.reshape(hidden_states, (bsz, -1, 1, self.hidden_size)) + hidden_states = hidden_states.transpose(1, 3) + + query_states = [ + q_proj(hidden_states).permute(0, 2, 3, 1) for q_proj in self.q_proj_sha + ] + key_states = [ + k_proj(hidden_states).permute(0, 2, 3, 1) for k_proj in self.k_proj_sha + ] + value_states = [ + v_proj(hidden_states).permute(0, 2, 3, 1) for v_proj in self.v_proj_sha + ] + + kv_seq_len = value_states[0].shape[-2] 
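+        # query/key/value_states are per-head lists of (bsz, 1, seq, head_dim)
+        # tensors; kv_seq_len below additionally counts the cached positions.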
+
+        if past_key_value is not None:
+            kv_seq_len += past_key_value.value_cache[self.layer_idx][0].shape[-2]
+
+        assert position_embeddings is not None
+        query_states = [
+            _apply_rope_single(q, position_embeddings) for q in query_states
+        ]
+        key_states = [_apply_rope_single(k, position_embeddings) for k in key_states]
+
+        if position_embeddings is None:
+            cos, sin = self.rotary_emb(value_states, position_ids)
+        else:
+            cos, sin = position_embeddings
+
+        if past_key_value is not None:
+            # reuse k, v, self_attention
+            past_key = past_key_value.key_cache[self.layer_idx]
+            past_value = past_key_value.value_cache[self.layer_idx]
+
+            cache_kwargs = {"sin": sin, "cos": cos, "cache_position": cache_position}
+            transposed_key_states = [
+                key_state.transpose(2, 3) for key_state in key_states
+            ]
+            past_key_value.update(
+                transposed_key_states, value_states, self.layer_idx, cache_kwargs
+            )
+
+            # Now concatenate the key/value states
+            key_states = [
+                torch.cat([pk, k.transpose(2, 3)], dim=3)
+                for pk, k in zip(past_key, key_states)
+            ]
+            value_states = [
+                torch.cat([pv, v], dim=2) for pv, v in zip(past_value, value_states)
+            ]
+
+        key_states = repeat_kv(key_states, self.num_key_value_groups)
+        value_states = repeat_kv(value_states, self.num_key_value_groups)
+
+        attn_weights = [
+            torch.matmul(q, k) / math.sqrt(self.head_dim)
+            for q, k in zip(query_states, key_states)
+        ]
+        if attn_weights[0].size() != (bsz, 1, q_len, kv_seq_len):
+            raise ValueError(
+                f"Attention weights should be of size {(bsz, 1, q_len, kv_seq_len)}, but is"
+                f" {attn_weights[0].size()}"
+            )
+
+        if attention_mask is not None:
+            if attention_mask.size() != (bsz, 1, q_len, kv_seq_len):
+                raise ValueError(
+                    f"Attention mask should be of size {(bsz, 1, q_len, kv_seq_len)}, but is {attention_mask.size()}"
+                )
+            attn_weights = [aw + attention_mask for aw in attn_weights]
+
+        # upcast attention to fp32
+        attn_weights = [
+            nn.functional.softmax(aw, dim=-1, dtype=torch.float32).to(
+                query_states[0].dtype
+            )
+            for aw in attn_weights
+        ]
+        attn_output = [torch.matmul(aw, v) for aw, v in zip(attn_weights, value_states)]
+
+        if attn_output[0].size() != (bsz, 1, q_len, self.head_dim):
+            raise ValueError(
+                f"`attn_output` should be of size {(bsz, 1, q_len, self.head_dim)}, but is"
+                f" {attn_output[0].size()}"
+            )
+
+        attn_output = torch.cat(attn_output, dim=3)
+        attn_output = attn_output.permute(0, 3, 1, 2)
+        attn_output = self.o_proj_conv(attn_output)
+        attn_output = attn_output.transpose(1, 3)
+        attn_output = attn_output.reshape(bsz, q_len, self.hidden_size)
+
+        if not output_attentions:
+            attn_weights = None
+
+        return attn_output, attn_weights, past_key_value
diff --git a/qai_hub_models/models/_shared/llama3/split_onnx_utils/__init__.py b/qai_hub_models/models/_shared/llama3/split_onnx_utils/__init__.py
new file mode 100644
index 00000000..21a22b31
--- /dev/null
+++ b/qai_hub_models/models/_shared/llama3/split_onnx_utils/__init__.py
@@ -0,0 +1,4 @@
+# ---------------------------------------------------------------------
+# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
diff --git a/qai_hub_models/models/_shared/llama3/split_onnx_utils/split_onnx.py b/qai_hub_models/models/_shared/llama3/split_onnx_utils/split_onnx.py
new file mode 100644
index 00000000..45607848
--- /dev/null
+++ b/qai_hub_models/models/_shared/llama3/split_onnx_utils/split_onnx.py
@@ -0,0 +1,230 @@
+# ---------------------------------------------------------------------
+# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+# Implementation of a class that splits a larger onnx graph into smaller subgraphs
+
+import collections
+import os
+
+import onnx
+from onnx.external_data_helper import uses_external_data
+
+
+class OnnxSplitter:
+    def __init__(self, onnxmodel, verbose=False):
+        self.model = onnxmodel
+        self.verbose = verbose
+        self.graph_inputs = {i.name for i in self.model.graph.input}
+        self.graph_outputs = {i.name for i in self.model.graph.output}
+        # nodeid: Onnx Node
+        self.node = {id(node): node for node in self.model.graph.node}
+        # tensorname: nodeid
+        self.producer = {
+            output: id(node) for node in self.model.graph.node for output in node.output
+        }
+
+    def partition_subgraph(
+        self,
+        name,  # name of the ONNX graph
+        output_tensors,  # list of new output tensors to include
+        additional_input_tensors=None,
+    ):
+        """
+        Partition a graph with input and output tensors
+        - Captures all nodes that are required to compute the given output_tensors
+        """
+
+        def upstream(nodeid):
+            return [
+                self.producer[i]
+                for i in self.node[nodeid].input
+                if i not in leaf_tensors
+            ]
+
+        # Check prerequisite
+        value_info = {i.name: i for i in self.model.graph.value_info}
+        assert all(
+            [
+                (name in value_info) or (name in self.graph_outputs)
+                for name in output_tensors
+            ]
+        ), "ValueInfoProto of output_tensors should be given"
+
+        # prepare the 'leaf' tensors, which can be model input or parameter tensors
+        leaf_tensors = set(self.graph_inputs)
+        leaf_tensors.update({i.name for i in self.model.graph.initializer})
+        if additional_input_tensors is not None:
+            leaf_tensors.update(additional_input_tensors)
+            self.graph_inputs.update(additional_input_tensors)
+
+        visited_output_tensors, visited_input_tensors = set(output_tensors), set()
+
+        # Traverse from output_tensors to input or 'leaf' nodes
+        q = collections.deque([self.producer[i] for i in output_tensors])
+        visited = set()
+        while q:
+            nodeid = q.popleft()
+            if nodeid in visited:
+                continue
+            visited.add(nodeid)
+            visited_output_tensors.update(
+                [i for i in self.node[nodeid].output if i in self.graph_outputs]
+            )
+            visited_input_tensors.update(
+                [i for i in self.node[nodeid].input if i in self.graph_inputs]
+            )
+            for producerid in upstream(nodeid):
+                if producerid not in visited:
+                    q.append(producerid)
+
+        use = set()
+        for nodeid in visited:
+            use.update(self.node[nodeid].input)
+            use.update(self.node[nodeid].output)
+
+        # Include in-use items and preserve the original order
+        new_node = [i for i in self.model.graph.node if id(i) in visited]
+        new_initializer = [i for i in self.model.graph.initializer if i.name in use]
+        new_value_info = [i for i in self.model.graph.value_info if i.name in use]
+        new_sparse_initializer = [
+            i for i in self.model.graph.sparse_initializer if i.name in use
+        ]
+
+        value_info_dict = {i.name: i for i in new_value_info}
+        value_info_dict.update({i.name: i for i in self.model.graph.output})
+        if additional_input_tensors is not None:
+            new_inputs = [
+                value_info_dict[i]
+                for i in additional_input_tensors
+                if i in value_info_dict and i in use
+            ]
+        else:
+            new_inputs = []
+        new_inputs += [i for i in self.model.graph.input if i.name in use]
+
+        new_outputs = [value_info_dict[i] for i in output_tensors]
+        new_outputs += [
+            value_info_dict[i.name]
+            for i in self.model.graph.output
+            if i.name in visited_output_tensors and i.name not in output_tensors
+        ]
+
+        if self.verbose:
+            print("new_inputs", [i.name for i in new_inputs])
+        if self.verbose:
+            print("new_outputs", [i.name for i in new_outputs])
+        new_graph = onnx.helper.make_graph(
+            nodes=new_node,
+            name=name,
+            inputs=new_inputs,
+            outputs=new_outputs,
+            initializer=new_initializer,
+            value_info=new_value_info,
+            sparse_initializer=new_sparse_initializer,
+        )
+        return new_graph
+
+    def split(self, list_of_intermediate_output_tensors):
+        count = 0
+        additional_input_tensors, covered_output_tensors = [], set()
+        for i, output_tensors in enumerate(list_of_intermediate_output_tensors):
+            count += 1
+            graphname = f"{self.model.graph.name}_split{count}"
+            if self.verbose:
+                print(f"Partition new graph: {graphname} for outputs[{output_tensors}]")
+            subgraph = self.partition_subgraph(
+                graphname, output_tensors, additional_input_tensors
+            )
+            additional_input_tensors += [
+                i for i in output_tensors if i not in self.graph_outputs
+            ]
+            covered_output_tensors.update([i.name for i in subgraph.output])
+            yield subgraph
+
+        graphname = f"{self.model.graph.name}_split{count+1}"
+        last_output_tensors = [
+            i.name
+            for i in self.model.graph.output
+            if i.name not in covered_output_tensors
+        ]
+        lastgraph = self.partition_subgraph(
+            graphname, last_output_tensors, additional_input_tensors
+        )
+        yield lastgraph
+
+    @classmethod
+    def get_all_tensors(cls, graph):
+        yield from graph.initializer
+        for node in graph.node:
+            for attribute in node.attribute:
+                if attribute.type == onnx.AttributeProto.GRAPH:
+                    yield from cls.get_all_tensors(attribute.g)
+                if attribute.type == onnx.AttributeProto.GRAPHS:
+                    for graph in attribute.graphs:
+                        yield from cls.get_all_tensors(graph)
+                if attribute.HasField("t"):
+                    yield attribute.t
+                yield from attribute.tensors
+
+    @classmethod
+    def is_using_external_data(cls, onnxmodel):
+        for tensor in cls.get_all_tensors(onnxmodel.graph):
+            if uses_external_data(tensor):
+                return True
+        return False
+
+
+def save_model(model, newonnxfile, using_external_data=False):
+    kwargs = {}
+    if using_external_data or model.ByteSize() > onnx.checker.MAXIMUM_PROTOBUF:
+        dirname = os.path.dirname(newonnxfile)
+        location = os.path.basename(newonnxfile).replace(".onnx", ".data")
+        kwargs["save_as_external_data"] = True
+        kwargs["all_tensors_to_one_file"] = True
+        kwargs["location"] = location
+        if os.path.exists(os.path.join(dirname, kwargs["location"])):
+            os.unlink(os.path.join(dirname, kwargs["location"]))
+
+    onnx.save(model, newonnxfile, **kwargs)
+
+
+def split_onnx_by_names(
+    onnxfile, list_of_output_tensors, output_dir=".", verbose=False
+):
+    if verbose:
+        print(f"Loading {onnxfile}")
+    onnxmodel = onnx.load(onnxfile, load_external_data=False)
+    splitter = OnnxSplitter(onnxmodel, verbose=verbose)
+    using_external_data = OnnxSplitter.is_using_external_data(onnxmodel)
+
+    list_of_output_tensors = [i.split(",") for i in list_of_output_tensors]
+    num_splits = len(list_of_output_tensors) + 1
+
+    # 1.
split model + new_model_info = [] + for i, subgraph in enumerate(splitter.split(list_of_output_tensors)): + new_basename = f"{os.path.basename(onnxfile)}_{i+1}_of_{num_splits}" + input_tensors = [i.name for i in subgraph.input] + new_model_info.append([new_basename, input_tensors]) + + submodel = onnx.helper.make_model( + subgraph, opset_imports=onnxmodel.opset_import + ) + if ( + not using_external_data + and submodel.ByteSize() < onnx.checker.MAXIMUM_PROTOBUF + ): + onnx.checker.check_model(submodel) + + if using_external_data: + if verbose: + print(f"Loading external data from {os.path.dirname(onnxfile)}") + onnx.load_external_data_for_model( + submodel, base_dir=os.path.dirname(onnxfile) + ) + + newonnxfile = f"{output_dir}/{new_basename}.onnx" + if verbose: + print(f"Saving {newonnxfile}") + save_model(submodel, newonnxfile, using_external_data) diff --git a/qai_hub_models/models/_shared/llama3/split_onnx_utils/utils.py b/qai_hub_models/models/_shared/llama3/split_onnx_utils/utils.py new file mode 100644 index 00000000..13e357a9 --- /dev/null +++ b/qai_hub_models/models/_shared/llama3/split_onnx_utils/utils.py @@ -0,0 +1,420 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. +# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +import collections +import json +import os +import re +import shutil +from copy import deepcopy +from pathlib import Path +from typing import Optional + +import numpy as np +import onnx +from onnx.numpy_helper import from_array, to_array + +from .split_onnx import OnnxSplitter, save_model + + +def _target_name(name, deco_digit=True, using_qairt_workflow=False): + name = f"_{name}" if deco_digit and name.isdigit() else name + # name = name.replace('.', '_') + if not using_qairt_workflow: + name = name.replace("/", "-") + return name + + +def get_onnx_input_output_names( + onnxfile, onnxmodel=None, deco_digit=True, using_qairt_workflow=False +): + onnxmodel = _load_model(onnxfile) if onnxmodel is None else onnxmodel + input_names = [ + _target_name( + i.name, deco_digit=deco_digit, using_qairt_workflow=using_qairt_workflow + ) + for i in onnxmodel.graph.input + ] + output_names = [ + _target_name( + i.name, deco_digit=deco_digit, using_qairt_workflow=using_qairt_workflow + ) + for i in onnxmodel.graph.output + ] + return input_names, output_names + + +def get_split_tensors(onnxfile, onnxmodel=None, include_first_input=True): + """ + Model topology + │ ←───────── layers[0] ────────────→ │ │ ←───────── layers[-1] ─────────────→ │ + │ │ │ │ + embed ────┬──────────── add0 ─┬─────────── add1 ── ┄┄┄ ─┬─────────────── add ─┬───────────── add ─── lmhead + ↑ └─ norm ─ attn ─┘ └─ norm ─ ffn ─┘ ↑ ↑ └─ norm ─ attn ─┘ └─ norm ─ ffn ─┘ ↑ + │ │ │ │ + │ │ │ │ + valid splitting points + """ + + def get_nodes(): + model = _load_model(onnxfile) if onnxmodel is None else onnxmodel + nodes = {i.name: i for i in model.graph.node} + seq = {i.name: idx for idx, i in enumerate(model.graph.node)} + producers = collections.defaultdict(lambda: None) + producers.update({i.output[0]: i.name for i in model.graph.node}) + return nodes, seq, producers + + nodes, seq, producers = get_nodes() + + def maybe_skip_cast(a): + if nodes[a].op_type == "Cast": + return producers[nodes[a].input[0]] + else: + return a + + def can_visit(src, dst): + if seq[src] < seq[dst]: + return False + stack, visited = collections.deque([src]), set() + while 
stack: + cur = stack.pop() + if cur == dst: + return True + visited.add(cur) + next_nodes = [ + producers[tensor] + for tensor in nodes[cur].input + if producers[tensor] is not None + ] + for name in next_nodes: + if name not in visited and seq[name] >= seq[dst]: + stack.append(name) + return False + + def is_residual_add(nodename, strict): + if nodes[nodename].op_type != "Add": + return False + a, b = [producers[tensor] for tensor in nodes[nodename].input] + if a is None or b is None: + return False + a = maybe_skip_cast(a) + b = maybe_skip_cast(b) + begin, end = (a, b) if seq[a] < seq[b] else (b, a) + if strict and nodes[begin].op_type != "Add": + return False + return can_visit(end, begin) + + def get_add0(add1): + a, b = [producers[tensor] for tensor in nodes[add1].input] + a = maybe_skip_cast(a) + b = maybe_skip_cast(b) + add0 = a if seq[a] < seq[b] else b + assert is_residual_add(add0, strict=False) + return add0 + + def get_layer0_input(add0): + a, b = [producers[tensor] for tensor in nodes[add0].input] + return a if seq[a] < seq[b] else b + + residual_add_names = [ + name for name in nodes.keys() if is_residual_add(name, strict=True) + ] + if len(residual_add_names) % 2 == 1: + # 'add0' is missing in residual_adds + add0 = get_add0(residual_add_names[0]) + residual_add_names.insert(0, add0) + + output_tensors = [] + if include_first_input: + layer0_input = get_layer0_input(residual_add_names[0]) + output_tensors.append(nodes[layer0_input].output[0]) + output_tensors += [ + nodes[node].output[0] for i, node in enumerate(residual_add_names) if i % 2 == 1 + ] + + return output_tensors + + +def _load_model(onnxfile, load_external_data=False, model_cache={}): + if onnxfile not in model_cache: + model_cache[onnxfile] = onnx.load( + onnxfile, load_external_data=load_external_data + ) + return model_cache[onnxfile] + + +def _load_encoding(encodingfile, no_merge=False): + all = {} + if encodingfile is not None: + with open(encodingfile) as json_file: + quant_encoding_dict = json.load(json_file) + if no_merge: + return quant_encoding_dict + all.update(quant_encoding_dict["activation_encodings"]) + all.update(quant_encoding_dict["param_encodings"]) + return all + + +def _save_encoding(encodings, encodingfile): + with open(encodingfile, "wt") as json_file: + json.dump(encodings, json_file, indent=4, sort_keys=True) + + +def embed_forecast_token_embeddings(onnxmodel, forecast_token_embeddings, base_dir): + + (embedding_table_name,) = [ + node.input[0] for node in onnxmodel.graph.node if node.op_type == "Gather" + ] + (embedding_table_proto,) = [ + i for i in onnxmodel.graph.initializer if i.name == embedding_table_name + ] + embedding_table = to_array(embedding_table_proto, base_dir=base_dir) + + assert ( + embedding_table.shape[1] == forecast_token_embeddings.shape[1] + ), "Mismatching token embedding size" + new_embedding_table = np.concatenate( + (embedding_table, forecast_token_embeddings), axis=0 + ) + onnxmodel.graph.initializer.remove(embedding_table_proto) + onnxmodel.graph.initializer.append( + from_array(new_embedding_table, embedding_table_proto.name) + ) + + +def split_onnx_by_names( + onnxfile, + modelname, + pickle_filedir, + *list_of_output_tensors, + output_dir=".", + onnxmodel=None, + encoding_file=None, + using_qairt_workflow=False, +): + encodings = None + uses_lists = None + if encoding_file is not None: + with open(encoding_file) as f: + encodings = json.load(f) + uses_lists = isinstance(encodings["activation_encodings"], list) + if uses_lists: + # Convert encodings to 
dictionary + encodings["activation_encodings"] = { + v["name"]: v for v in encodings["activation_encodings"] + } + encodings["param_encodings"] = { + v["name"]: v for v in encodings["param_encodings"] + } + + onnx_to_artifacts_map = dict() + onnxmodel = ( + _load_model(onnxfile, load_external_data=False) + if onnxmodel is None + else onnxmodel + ) + splitter = OnnxSplitter(onnxmodel, verbose=False) + base_dir = os.path.dirname(onnxfile) + using_external_data = OnnxSplitter.is_using_external_data(onnxmodel) + + list_of_output_tensors = [i.split(",") for i in list_of_output_tensors] + num_splits = len(list_of_output_tensors) + 1 + + # 1. split model + new_model_info = [] + for i, subgraph in enumerate(splitter.split(list_of_output_tensors)): + new_basename = f"{modelname}_{i+1}_of_{num_splits}" + input_tensor_names = [i.name for i in subgraph.input] + output_tensor_names = [i.name for i in subgraph.output] + new_model_info.append([new_basename, input_tensor_names, output_tensor_names]) + + submodel = onnx.helper.make_model( + subgraph, opset_imports=onnxmodel.opset_import + ) + if ( + not using_external_data + and submodel.ByteSize() < onnx.checker.MAXIMUM_PROTOBUF + ): + onnx.checker.check_model(submodel) + + if using_external_data: + onnx.load_external_data_for_model(submodel, base_dir=base_dir) + + part_root_path = Path(output_dir) / (new_basename + ".aimet") + part_root_path.mkdir(parents=True, exist_ok=True) + + newonnxfile = part_root_path / (new_basename + ".onnx") + save_model(submodel, newonnxfile, using_external_data) + + # Save subset of encodings + if encodings is not None: + new_encodings = deepcopy(encodings) + + activation_names = ( + set(o for x in submodel.graph.node for o in x.output) + | set(x.name for x in submodel.graph.input) + | set(x.name for x in submodel.graph.output) + ) + param_names = set(x.name for x in submodel.graph.initializer) + + for k in encodings["activation_encodings"]: + if k not in activation_names: + del new_encodings["activation_encodings"][k] + + for k in encodings["param_encodings"]: + if k not in param_names: + del new_encodings["param_encodings"][k] + + if uses_lists: + # convert back + new_encodings["activation_encodings"] = list( + new_encodings["activation_encodings"].values() + ) + new_encodings["param_encodings"] = list( + new_encodings["param_encodings"].values() + ) + + new_encodings_path = part_root_path / (new_basename + ".encodings") + with open(new_encodings_path, "w") as write_file: + json.dump(new_encodings, write_file, indent=4, sort_keys=True) + + return onnx_to_artifacts_map + + +def _get_lm_head_sizes(onnxmodel): + "Get dimensions of the LM head : embedding_size, vocab_size" + lm_head_weight_name = next( + node.input[1] + for node in reversed(onnxmodel.graph.node) + if node.op_type in ("Conv", "MatMul", "Gemm") + ) + (lm_head_weight,) = [ + i for i in onnxmodel.graph.initializer if lm_head_weight_name == i.name + ] + if len(lm_head_weight.dims) == 2: + embedding_size, vocab_size = lm_head_weight.dims + else: + (lm_head,) = [i for i in onnxmodel.graph.node if lm_head_weight.name in i.input] + if lm_head.op_type == "Conv": + attr_group = [i.i for i in lm_head.attribute if i.name == "group"] + group = attr_group[0] if len(attr_group) == 1 else 1 + grouped_vocab, group_size, _, _ = lm_head_weight.dims + vocab_size, embedding_size = grouped_vocab // group, group * group_size + elif lm_head.op_type == "MatMul": + group, group_size, vocab_size = lm_head_weight.dims + embedding_size = group * group_size + else: + raise 
RuntimeError(f"Unexpected lm_head op_type:{lm_head}") + + return embedding_size, vocab_size + + +def fill_input_encodings_of_split(onnxmodel, encodingfile, output_tensor_list): + + changed = False + encodings = _load_encoding(encodingfile, no_merge=True) + enc_act, enc_param = encodings["activation_encodings"], encodings["param_encodings"] + producer = {tensor: node for node in onnxmodel.graph.node for tensor in node.output} + for split_tensor in output_tensor_list: + if split_tensor not in enc_act: + assert split_tensor in producer + input_tensor = producer[split_tensor].input[0] # use only 1st input + if input_tensor in producer: + while input_tensor not in enc_act and input_tensor not in enc_param: + input_tensor = producer[input_tensor].input[0] + input_encoding = ( + enc_act[input_tensor] + if input_tensor in enc_act + else enc_param[input_tensor] + ) + enc_act[split_tensor] = input_encoding + changed = True + + if changed: + backup = f"{encodingfile}.bak" + if not os.path.exists(backup): + shutil.move(encodingfile, backup) + _save_encoding(encodings, encodingfile) + + +def split_onnx( + onnxfile, + modelname, + pickle_filedir, + num_splits, + num_layers_per_split: Optional[int] = None, + output_dir="./", + split_embedding=False, + encoding_file=None, + using_qairt_workflow=False, +): + def _is_cache(layer, name): + return re.search(f"past_(key|value)_{layer}_", name) is not None + + num_splits = int(num_splits) + + onnxmodel = _load_model(onnxfile, load_external_data=False) + input_names, output_names = get_onnx_input_output_names( + onnxfile, + onnxmodel=onnxmodel, + deco_digit=False, + using_qairt_workflow=using_qairt_workflow, + ) + output_tensor_list = get_split_tensors( + onnxfile, onnxmodel=onnxmodel, include_first_input=split_embedding + ) + + # Infer the shape of per-layer tensors + (input_ids,) = [i for i in onnxmodel.graph.input if i.name == "input_ids"] + batch_size, seq_length = [i.dim_value for i in input_ids.type.tensor_type.shape.dim] + + embedding_size, vocab_size = _get_lm_head_sizes(onnxmodel) + + per_layer_output_value_info = [ + onnx.helper.make_tensor_value_info( + name, onnx.TensorProto.FLOAT, [batch_size, seq_length, embedding_size] + ) + for name in output_tensor_list + ] + onnxmodel.graph.value_info.extend(per_layer_output_value_info) + + names_to_split = [] + if split_embedding: + first_output_tensors = output_tensor_list[0].split(",") + fill_input_encodings_of_split(onnxmodel, encoding_file, first_output_tensors) + names_to_split.append(output_tensor_list[0]) + output_tensor_list.pop(0) + + num_layers = len(output_tensor_list) + if num_layers_per_split is None: + num_layers_per_split = ( + ((num_layers - 1) // num_splits) + if split_embedding + else (num_layers // num_splits) + ) + past_key_values = { + layer: [output for output in output_names if _is_cache(layer, output)] + for layer in range(num_layers) + } + + for layer_end in range(num_layers_per_split, num_layers, num_layers_per_split): + outputs = [output_tensor_list[layer_end - 1]] + for layer in range(layer_end - num_layers_per_split, layer_end): + outputs += past_key_values[layer] + names_to_split.append(",".join(outputs)) + + names_to_split = names_to_split[: num_splits - 1] + assert ( + num_splits == len(names_to_split) + 1 + ), f"Failed to split into {num_splits} pieces!" 
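+    # At this point names_to_split holds exactly (num_splits - 1) cut points.
+    # Each entry is a comma-joined group of tensor names: the per-layer output
+    # that ends a chunk plus the past key/value outputs of every layer in that
+    # chunk (and, when split_embedding is set, the embedding outputs as the
+    # very first cut point).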
+ return split_onnx_by_names( + onnxfile, + modelname, + pickle_filedir, + *names_to_split, + output_dir=output_dir, + onnxmodel=onnxmodel, + encoding_file=encoding_file, + using_qairt_workflow=using_qairt_workflow, + ) diff --git a/qai_hub_models/models/aotgan/README.md b/qai_hub_models/models/aotgan/README.md index 709849c3..c3b8a37f 100644 --- a/qai_hub_models/models/aotgan/README.md +++ b/qai_hub_models/models/aotgan/README.md @@ -6,7 +6,7 @@ AOT-GAN is a machine learning model that allows to erase and in-paint part of given input image. This is based on the implementation of AOT-GAN found -[here](https://github.com/researchmm/AOT-GAN-for-Inpainting). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/aotgan). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.aotgan.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of AOT-GAN can be found +* The license for the original implementation of AOT-GAN can be found [here](https://github.com/taki0112/AttnGAN-Tensorflow/blob/master/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Aggregated Contextual Transformations for High-Resolution Image Inpainting](https://arxiv.org/abs/2104.01431) * [Source Model Implementation](https://github.com/researchmm/AOT-GAN-for-Inpainting) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/aotgan/export.py b/qai_hub_models/models/aotgan/export.py index 35f236ac..b0e272e4 100644 --- a/qai_hub_models/models/aotgan/export.py +++ b/qai_hub_models/models/aotgan/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch from qai_hub_models.models.aotgan import Model +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. 
Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "aotgan" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/aotgan/perf.yaml b/qai_hub_models/models/aotgan/perf.yaml index 488934ec..95593bfe 100644 --- a/qai_hub_models/models/aotgan/perf.yaml +++ b/qai_hub_models/models/aotgan/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: AOT-GAN performance_metrics: - torchscript_onnx_tflite: - inference_time: 153234.0 - throughput: 6.525966821984677 + inference_time: 152996.0 + throughput: 6.536118591335721 estimated_peak_memory_range: - min: 3284992 - max: 5465888 + min: 4313088 + max: 6987208 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 235 - job_id: joprk1705 + job_id: jg9lno9qg job_status: Passed torchscript_onnx_qnn: - inference_time: 153843.0 - throughput: 6.500133252731681 + inference_time: 153279.0 + throughput: 6.524050913693331 estimated_peak_memory_range: - min: 4227072 - max: 23282152 + min: 4317184 + max: 24792560 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,7 +71,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 274 - job_id: j1glneqjp + job_id: jp2kyojxp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -82,13 +80,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T13:00:39Z' + timestamp: '2024-10-15T01:04:42Z' - torchscript_onnx_tflite: - inference_time: 120494.0 - throughput: 8.299168423323984 + inference_time: 120324.0 + throughput: 8.310893919750008 estimated_peak_memory_range: - min: 75563008 - max: 268910816 + min: 3362816 + max: 225851856 primary_compute_unit: NPU precision: fp16 layer_info: @@ -96,14 +94,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 235 - job_id: jep283zrp + job_id: jp14zoqkp job_status: Passed torchscript_onnx_qnn: - inference_time: 139206.0 - throughput: 7.183598408114593 + inference_time: 139029.0 + throughput: 7.19274395989326 estimated_peak_memory_range: - min: 4284416 - max: 51562816 + min: 4214784 + max: 63555488 primary_compute_unit: NPU precision: 
fp16 layer_info: @@ -111,7 +109,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 274 - job_id: jw566q065 + job_id: jpy138nrp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -120,13 +118,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T13:00:40Z' + timestamp: '2024-10-15T01:04:43Z' - torchscript_onnx_tflite: - inference_time: 153299.0 - throughput: 6.523199759946249 + inference_time: 152722.0 + throughput: 6.547845104176216 estimated_peak_memory_range: min: 3293184 - max: 5408344 + max: 5871152 primary_compute_unit: NPU precision: fp16 layer_info: @@ -134,14 +132,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 235 - job_id: jqpyevy8g + job_id: jgdx167kp job_status: Passed torchscript_onnx_qnn: - inference_time: 92901.0 - throughput: 10.764146779905492 + inference_time: 92370.0 + throughput: 10.826025765941322 estimated_peak_memory_range: - min: 4395008 - max: 5750104 + min: 4444160 + max: 5674816 primary_compute_unit: NPU precision: fp16 layer_info: @@ -149,7 +147,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 274 - job_id: jwgoye9q5 + job_id: jp8qyj8zp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -157,14 +155,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T13:00:43Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T01:04:45Z' - torchscript_onnx_tflite: - inference_time: 196094.0 - throughput: 5.099595092149683 + inference_time: 153035.0 + throughput: 6.534452902930702 estimated_peak_memory_range: - min: 3325952 - max: 172737088 + min: 3321856 + max: 5871472 primary_compute_unit: NPU precision: fp16 layer_info: @@ -172,14 +170,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 235 - job_id: j2p0yex9g + job_id: j5mnx9vyp job_status: Passed torchscript_onnx_qnn: - inference_time: 194222.0 - throughput: 5.14874730977953 + inference_time: 92574.0 + throughput: 10.80216907555037 estimated_peak_memory_range: - min: 4255744 - max: 45352048 + min: 4509696 + max: 5770936 primary_compute_unit: NPU precision: fp16 layer_info: @@ -187,22 +185,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 274 - job_id: jygzevqog + job_id: jglvmw7e5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T13:00:46Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T01:04:49Z' - torchscript_onnx_tflite: - inference_time: 150359.0 - throughput: 6.6507492068981575 + inference_time: 152757.0 + throughput: 6.546344848353922 estimated_peak_memory_range: - min: 3219456 - max: 5131872 + min: 3317760 + max: 5471768 primary_compute_unit: NPU precision: fp16 layer_info: @@ -210,14 +208,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 235 - job_id: j1p8owkkg + job_id: jpxko0ej5 job_status: Passed torchscript_onnx_qnn: - inference_time: 93120.0 - throughput: 10.738831615120274 + inference_time: 93610.0 + throughput: 10.682619378271552 estimated_peak_memory_range: - min: 4444160 - max: 5980048 + min: 4489216 + max: 5770440 primary_compute_unit: NPU precision: fp16 layer_info: @@ -225,22 +223,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 274 - job_id: j1pv3zyk5 + job_id: j5q6q4w7p job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: 
Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T13:00:43Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T01:04:47Z' - torchscript_onnx_tflite: - inference_time: 153344.0 - throughput: 6.521285475792988 + inference_time: 152642.0 + throughput: 6.551276843856868 estimated_peak_memory_range: - min: 3289088 - max: 5317984 + min: 3223552 + max: 5517544 primary_compute_unit: NPU precision: fp16 layer_info: @@ -248,14 +246,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 235 - job_id: jogkzrkwg + job_id: jp4lrejq5 job_status: Passed torchscript_onnx_qnn: - inference_time: 92988.0 - throughput: 10.754075794726202 + inference_time: 92421.0 + throughput: 10.820051719847221 estimated_peak_memory_range: - min: 4382720 - max: 6036920 + min: 4460544 + max: 5743688 primary_compute_unit: NPU precision: fp16 layer_info: @@ -263,22 +261,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 274 - job_id: j7gjxk6vp + job_id: jgkex6dyg job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T13:00:44Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T01:04:46Z' - torchscript_onnx_tflite: - inference_time: 153465.0 - throughput: 6.51614374613104 + inference_time: 193915.0 + throughput: 5.156898641157208 estimated_peak_memory_range: - min: 3321856 - max: 5293976 + min: 3379200 + max: 195570816 primary_compute_unit: NPU precision: fp16 layer_info: @@ -286,14 +284,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 235 - job_id: jn5q89dn5 + job_id: j57yrovq5 job_status: Passed torchscript_onnx_qnn: - inference_time: 93130.0 - throughput: 10.737678513905294 + inference_time: 195480.0 + throughput: 5.11561285041948 estimated_peak_memory_range: - min: 4509696 - max: 5646168 + min: 864256 + max: 48880976 primary_compute_unit: NPU precision: fp16 layer_info: @@ -301,19 +299,57 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 274 - job_id: jlpe940og + job_id: jp3j0o8xg job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T01:04:51Z' + - torchscript_onnx_tflite: + inference_time: 118959.0 + throughput: 8.406257618170967 + estimated_peak_memory_range: + min: 3121152 + max: 89988256 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 235 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 235 + job_id: jprv3x9vg + job_status: Passed + torchscript_onnx_qnn: + inference_time: 118560.0 + throughput: 8.434547908232119 + estimated_peak_memory_range: + min: 3158016 + max: 68309728 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 274 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 274 + job_id: jgo26dm4p + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T13:00:45Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T01:04:52Z' - torchscript_onnx_qnn: - inference_time: 96405.0 - throughput: 10.372905969607386 + inference_time: 96258.0 + throughput: 10.388746909347795 estimated_peak_memory_range: min: 4202496 max: 4202496 @@ -324,7 +360,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 274 - job_id: j1p3kqr35 + 
job_id: jp0z0ok25 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -333,4 +369,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T13:00:41Z' + timestamp: '2024-10-15T01:04:44Z' diff --git a/qai_hub_models/models/baichuan_7b_quantized/README.md b/qai_hub_models/models/baichuan2_7b_quantized/README.md similarity index 63% rename from qai_hub_models/models/baichuan_7b_quantized/README.md rename to qai_hub_models/models/baichuan2_7b_quantized/README.md index 6e9ee724..41a9b966 100644 --- a/qai_hub_models/models/baichuan_7b_quantized/README.md +++ b/qai_hub_models/models/baichuan2_7b_quantized/README.md @@ -1,30 +1,37 @@ [![Qualcomm® AI Hub Models](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/quic-logo.jpg)](../../README.md) -# [Baichuan-7B: Large language model achieving state-of-the-art performance on Chinese and English language benchmarks](https://aihub.qualcomm.com/models/baichuan_7b_quantized) +# [Baichuan2-7B: State-of-the-art large language model useful on a variety of language understanding and generation tasks](https://aihub.qualcomm.com/models/baichuan2_7b_quantized) -Baichuan-7B is a family of LLMs. It achieves the state-of-the-art performance of its size on standard Chinese and English authoritative benchmarks (C-EVAL/MMLU). 4-bit weights and 16-bit activations making it suitable for on-device The model is quantized to deployment. For Prompt and output length specified below, the time to first token is Llama-PromptProcessor-Quantized's latency and average time per addition token is Llama-TokenGenerator-KVCache-Quantized's latency. +Baichuan2-7B is a family of LLMs. It achieves the state-of-the-art performance of its size on standard Chinese and English authoritative benchmarks (C-EVAL/MMLU). The model is quantized to 4-bit weights and 16-bit activations, making it suitable for on-device deployment. For the prompt and output lengths specified below, the time to first token is Baichuan2-PromptProcessor-Quantized's latency and the average time per additional token is Baichuan2-TokenGenerator-Quantized's latency. -This is based on the implementation of Baichuan2-7B found -[here](https://github.com/baichuan-inc/Baichuan-7B/). This repository contains scripts for optimized on-device +This is based on the implementation of Baichuan2-7B found +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance -accross various devices, can be found [here](https://aihub.qualcomm.com/models/baichuan_7b_quantized). +across various devices can be found [here](https://aihub.qualcomm.com/models/baichuan2_7b_quantized). [Sign up](https://myaccount.qualcomm.com/signup) to start using Qualcomm AI Hub and run these models on a hosted Qualcomm® device. +## Deploying Baichuan2-7B on-device + +Please follow the [LLM on-device deployment](https://github.com/quic/ai-hub-apps/tree/main/tutorials/llm_on_genie) tutorial. + ## License -- The license for the original implementation of Baichuan2-7B can be found +* The license for the original implementation of Baichuan2-7B can be found [here](https://github.com/baichuan-inc/Baichuan-7B/blob/main/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://github.com/baichuan-inc/Baichuan-7B/blob/main/LICENSE) +* The license for the compiled assets for on-device deployment can be found [here](https://github.com/baichuan-inc/Baichuan-7B/blob/main/LICENSE) + ## References * [Baichuan 2: Open Large-scale Language Models](https://arxiv.org/abs/2309.10305) * [Source Model Implementation](https://github.com/baichuan-inc/Baichuan-7B/) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/baichuan2_7b_quantized/info.yaml b/qai_hub_models/models/baichuan2_7b_quantized/info.yaml new file mode 100644 index 00000000..b258510d --- /dev/null +++ b/qai_hub_models/models/baichuan2_7b_quantized/info.yaml @@ -0,0 +1,59 @@ +name: Baichuan2-7B +id: baichuan2_7b_quantized +status: public +headline: State-of-the-art large language model useful on a variety of language + understanding and generation tasks. +domain: Generative AI +description: Baichuan2-7B is a family of LLMs. It achieves the state-of-the-art performance of its size on standard Chinese and English authoritative benchmarks (C-EVAL/MMLU). 4-bit weights and 16-bit activations making it suitable for on-device deployment. For Prompt and output length specified below, the time to first token is Baichuan2-PromptProcessor-Quantized's latency and average time per addition token is Baichuan2-TokenGenerator-Quantized's latency. +use_case: Text Generation +tags: + - llm + - generative-ai + - quantized +research_paper: https://arxiv.org/abs/2309.10305 +research_paper_title: "Baichuan 2: Open Large-scale Language Models" +license: https://github.com/baichuan-inc/Baichuan-7B/blob/main/LICENSE +deploy_license: https://github.com/baichuan-inc/Baichuan-7B/blob/main/LICENSE +source_repo: https://github.com/baichuan-inc/Baichuan-7B/ +technical_details: + Input sequence length for Prompt Processor: 128 + Context length: 4096 + Number of parameters: 7.07B + Precision: w4a16 + w8a16 (few layers) + Num of key-value heads: 8 + Information about the model parts: Prompt Processor and Token Generator are split into 5 parts each. Each corresponding Prompt Processor and Token Generator part share weights. + Prompt processor model size: 5.06 GB + Prompt processor input (part1): 128 tokens + Prompt processor output (part1): Embeddings output + Prompt processor input (other parts): 128 tokens + KVCache initialized with pad token + Prompt processor output (other parts): 128 output tokens + KVCache for token generator + Token generator model size: 5.06 GB + Token generator input (part1): 128 tokens + Token generator output (part1): Embeddings output + Token generator input (other parts): 1 input token + past KVCache + Token generator output (other parts): 1 output token + KVCache for next iteration + Use: Initiate conversation with prompt-processor and then token generator for subsequent iterations. + Supported languages: Chinese and English. + Minimum QNN SDK version required: 2.27.7 + TTFT: Time To First Token is the time it takes to generate the first response token. This is expressed as a range because it varies based on the length of the prompt. 
The lower bound is for a short prompt (up to 128 tokens, i.e., one iteration of the prompt processor) and the upper bound is for a prompt using the full context length (4096 tokens). + Response Rate: Rate of response generation after the first response token. +applicable_scenarios: + - Dialogue + - Content Generation + - Customer Support +related_models: [] +form_factors: + - Phone + - Tablet +has_static_banner: true +has_animated_banner: true +license_type: apache-2.0 +deploy_license_type: apache-2.0 +dataset: [] +model_type_llm: true +llm_details: + call_to_action: 'download' + Snapdragon 8 Elite QRD: + torchscript_onnx_qnn: + model_download_url: v2/snapdragon_8_elite/models.zip + genie_compatible: true diff --git a/qai_hub_models/models/baichuan2_7b_quantized/perf.yaml b/qai_hub_models/models/baichuan2_7b_quantized/perf.yaml new file mode 100644 index 00000000..b16bc822 --- /dev/null +++ b/qai_hub_models/models/baichuan2_7b_quantized/perf.yaml @@ -0,0 +1,25 @@ +aggregated: + supported_oses: + - Android + supported_devices: + - Snapdragon 8 Elite QRD + supported_chipsets: + - Snapdragon® 8 Elite +models: + name: 'Baichuan2-7B' + performance_metrics: + - torchscript_onnx_qnn: + llm_metrics: + time_to_first_token_range: + min: 208048 + max: 6657536 + tokens_per_second: 7.72 + evaluation_metrics: null + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-16T00:32:42.210701Z' diff --git a/qai_hub_models/models/baichuan_7b_quantized/info.yaml b/qai_hub_models/models/baichuan_7b_quantized/info.yaml deleted file mode 100644 index cee1c0d1..00000000 --- a/qai_hub_models/models/baichuan_7b_quantized/info.yaml +++ /dev/null @@ -1,47 +0,0 @@ -name: Baichuan-7B -id: baichuan_7b_quantized -status: public -headline: Large language model achieving state-of-the-art performance on Chinese and English language benchmarks. -domain: Generative AI -description: Baichuan-7B is a family of LLMs. It achieves the state-of-the-art performance of - its size on standard Chinese and English authoritative benchmarks (C-EVAL/MMLU). - 4-bit weights and 16-bit activations making it suitable for on-device - The model is quantized to deployment. For Prompt and output length specified below, - the time to first token is Llama-PromptProcessor-Quantized's latency and average - time per addition token is Llama-TokenGenerator-KVCache-Quantized's latency. -use_case: Text Generation -tags: - - llm - - generative-ai - - quantized -research_paper: https://arxiv.org/abs/2309.10305 -research_paper_title: "Baichuan 2: Open Large-scale Language Models" -license: https://github.com/baichuan-inc/Baichuan-7B/blob/main/LICENSE -deploy_license: https://github.com/baichuan-inc/Baichuan-7B/blob/main/LICENSE -source_repo: https://github.com/baichuan-inc/Baichuan-7B/ -technical_details: - Number of parameters: 7B - Model size: 3.9GB - Model-1 (Prompt Processor): Baichuan-PromptProcessor-Quantized - Max context length: 1024 - Prompt processor input: 1024 tokens - Prompt processor output: 1024 output tokens + KVCache for token generator - Model-2 (Token Generator): Baichuan-TokenGenerator-KVCache-Quantized - Token generator input: 1 input token + past KVCache - Token generator output: 1 output token + KVCache for next iteration - Decoding length: 1024 (1 output token + 1023 from KVCache) - Use: Initiate conversation with prompt-processor and then token generator for subsequent iterations. 
-applicable_scenarios: - - Dialogue - - Content Generation - - Customer Support -related_models: [] -form_factors: - - Phone - - Tablet -has_static_banner: true -has_animated_banner: true -license_type: apache-2.0 -deploy_license_type: apache-2.0 -dataset: [] -restrict_model_sharing: true diff --git a/qai_hub_models/models/baichuan_7b_quantized/perf.yaml b/qai_hub_models/models/baichuan_7b_quantized/perf.yaml deleted file mode 100644 index 4f87d7e0..00000000 --- a/qai_hub_models/models/baichuan_7b_quantized/perf.yaml +++ /dev/null @@ -1,77 +0,0 @@ -models: -- name: Baichuan-TokenGenerator-KVCache-Quantized - performance_metrics: - - reference_device_info: - name: Samsung Galaxy S24 Ultra - os: '14' - form_factor: Phone - os_name: Android - manufacturer: Samsung - chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-02-16T22:23:17.643089Z' - torchscript_onnx_qnn: - inference_time: 108059 - throughput: 9.25 - estimated_peak_memory_range: - min: 561152 - max: 112366992 - layer_info: - layers_on_npu: 33820 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 33820 - precision: uint16 - primary_compute_unit: NPU - job_id: "null" - job_status: Passed -- name: Baichuan-PromptProcessor-Quantized - performance_metrics: - - reference_device_info: - name: Samsung Galaxy S24 Ultra - os: '14' - form_factor: Phone - os_name: Android - manufacturer: Samsung - chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-02-16T22:23:17.643089Z' - torchscript_onnx_qnn: - inference_time: 2599326 - throughput: 393.94 - estimated_peak_memory_range: - min: 53248 - max: 40255040 - layer_info: - layers_on_npu: 31772 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 31772 - precision: uint16 - primary_compute_unit: NPU - job_id: "null" - job_status: Passed -aggregated: - supported_devices: - - Samsung Galaxy S24 Ultra - supported_oses: - - Android - supported_chipsets: - - Snapdragon® 8 Gen 3 - performance_metrics: - - reference_device_info: - name: Samsung Galaxy S24 Ultra - os: '14' - form_factor: Phone - os_name: Android - manufacturer: Samsung - chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-02-16T22:23:17.643089Z' - torchscript_onnx_qnn: - inference_time: 108059 - throughput: 9.25 - estimated_peak_memory_range: - min: 561152 - max: 112366992 - precision: uint16 - primary_compute_unit: NPU - job_id: "" - job_status: Passed diff --git a/qai_hub_models/models/common.py b/qai_hub_models/models/common.py index 598158b5..3e076cf9 100644 --- a/qai_hub_models/models/common.py +++ b/qai_hub_models/models/common.py @@ -2,10 +2,12 @@ # Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. 
# SPDX-License-Identifier: BSD-3-Clause # --------------------------------------------------------------------- +from dataclasses import dataclass from enum import Enum, unique -from typing import Dict, List +from typing import Dict, List, Optional import numpy as np +import qai_hub as hub @unique @@ -34,3 +36,12 @@ class SourceModelFormat(Enum): SampleInputsType = Dict[str, List[np.ndarray]] + + +@dataclass +class ExportResult: + compile_job: Optional[hub.CompileJob] = None + quantize_job: Optional[hub.QuantizeJob] = None + profile_job: Optional[hub.ProfileJob] = None + inference_job: Optional[hub.InferenceJob] = None + link_job: Optional[hub.LinkJob] = None diff --git a/qai_hub_models/models/controlnet_quantized/README.md b/qai_hub_models/models/controlnet_quantized/README.md index d37819e2..08c419b4 100644 --- a/qai_hub_models/models/controlnet_quantized/README.md +++ b/qai_hub_models/models/controlnet_quantized/README.md @@ -6,7 +6,7 @@ On-device, high-resolution image synthesis from text and image prompts. ControlNet guides Stable-diffusion with provided input image to generate accurate images from given input prompt. This is based on the implementation of ControlNet found -[here](https://github.com/lllyasviel/ControlNet). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/controlnet_quantized). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.controlnet_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of ControlNet can be found +* The license for the original implementation of ControlNet can be found [here](https://github.com/lllyasviel/ControlNet/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://github.com/lllyasviel/ControlNet/blob/main/LICENSE) +* The license for the compiled assets for on-device deployment can be found [here](https://github.com/lllyasviel/ControlNet/blob/main/LICENSE) + ## References * [Adding Conditional Control to Text-to-Image Diffusion Models](https://arxiv.org/abs/2302.05543) * [Source Model Implementation](https://github.com/lllyasviel/ControlNet) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). 
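The `ExportResult` struct introduced in `qai_hub_models/models/common.py` above replaces the positional tuples that the per-model `export_model` functions used to return. A minimal consumption sketch follows; it is illustrative only (it assumes a configured AI Hub account, the device name is an example, and per the type annotation `export_model` may instead return a `List[str]`, e.g. when Hub access is unavailable):

```python
from qai_hub_models.models.aotgan.export import export_model

result = export_model(device="Samsung Galaxy S23")

# Guard against the List[str] return branch before touching job fields.
if not isinstance(result, list):
    print(f"Compile job: {result.compile_job}")
    if result.profile_job is not None:  # None when profiling is skipped
        print(f"Profile job: {result.profile_job}")
    if result.inference_job is not None:  # None when inferencing is skipped
        print(f"Inference job: {result.inference_job}")
```

Named fields also let component-based models differ without breaking callers; `controlnet_quantized` below, for instance, returns a mapping whose `ExportResult` values fill only `profile_job`.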
diff --git a/qai_hub_models/models/controlnet_quantized/export.py b/qai_hub_models/models/controlnet_quantized/export.py index 41b1cff8..b98f52d3 100644 --- a/qai_hub_models/models/controlnet_quantized/export.py +++ b/qai_hub_models/models/controlnet_quantized/export.py @@ -9,13 +9,14 @@ import warnings from pathlib import Path -from typing import Any, Dict, List, Mapping, Optional, Tuple, cast +from typing import Any, Dict, List, Mapping, Optional, cast import qai_hub as hub +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.controlnet_quantized import Model from qai_hub_models.utils.args import export_parser -from qai_hub_models.utils.base_model import BasePrecompiledModel, TargetRuntime +from qai_hub_models.utils.base_model import BasePrecompiledModel from qai_hub_models.utils.printing import print_profile_metrics_from_job from qai_hub_models.utils.qai_hub_helpers import ( can_access_qualcomm_ai_hub, @@ -46,19 +47,16 @@ def export_model( output_dir: Optional[str] = None, profile_options: str = "", **additional_model_kwargs, -) -> Mapping[str, Tuple[Optional[hub.ProfileJob], Optional[hub.InferenceJob]]] | List[ - str -]: +) -> Mapping[str, ExportResult] | List[str]: """ - This function accomplishes 5 main tasks: + This function executes the following recipe: - 1. Initialize model. - 2. Upload model assets to hub. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Summarizes the results from profiling. + 1. Initialize model + 2. Upload model assets to hub + 3. Profiles the model performance on a real device + 4. Summarizes the results from profiling - Each of the last three steps can be optionally skipped using the input options. + Each of the last 2 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -80,9 +78,8 @@ def export_model( `model_cls.from_precompiled` Returns: - A Mapping from component_name to a 2-tuple of: + A Mapping from component_name to a struct of: * A ProfileJob containing metadata about the profile job (None if profiling skipped). - * An InferenceJob containing metadata about the inference job (None if inferencing skipped). """ model_name = "controlnet_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,9 +108,7 @@ def export_model( component_arg, ) - target_runtime = TargetRuntime.TFLITE - # On-device perf improves with I/O in channel_last format except when using ONNX. - use_channel_last_format = target_runtime != TargetRuntime.ONNX + target_runtime = TargetRuntime.QNN # 1. Initialize model print("Initializing model class") @@ -135,8 +130,11 @@ def export_model( uploaded_models[component_name] = hub.upload_model( components_dict[component_name].get_target_model_path() ) + print( + f"The {component_name} model is saved here: {components_dict[component_name].get_target_model_path()}" + ) - # 3. Profile the model assets on real devices + # 3. Profiles the model performance on a real device profile_jobs: Dict[str, hub.client.ProfileJob] = {} if not skip_profiling: for component_name in components: @@ -154,31 +152,7 @@ def export_model( hub.client.ProfileJob, submitted_profile_job ) - # 4. Run inference on-device with sample inputs - inference_jobs: Dict[str, hub.client.InferenceJob] = {} - if not skip_inferencing: - for component_name in components: - print( - f"Running inference for {component_name} on a hosted device with example inputs." 
- ) - profile_options_all = components_dict[ - component_name - ].get_hub_profile_options(target_runtime, profile_options) - sample_inputs = components_dict[component_name].sample_inputs( - use_channel_last_format=use_channel_last_format - ) - submitted_inference_job = hub.submit_inference_job( - model=uploaded_models[component_name], - inputs=sample_inputs, - device=hub_device, - name=f"{model_name}_{component_name}", - options=profile_options_all, - ) - inference_jobs[component_name] = cast( - hub.client.InferenceJob, submitted_inference_job - ) - - # 5. Summarize the results from profiling + # 4. Summarizes the results from profiling if not skip_summary and not skip_profiling: for component_name in components: profile_job = profile_jobs[component_name] @@ -187,9 +161,8 @@ def export_model( print_profile_metrics_from_job(profile_job, profile_data) return { - component_name: ( - profile_jobs.get(component_name, None), - inference_jobs.get(component_name, None), + component_name: ExportResult( + profile_job=profile_jobs.get(component_name, None), ) for component_name in components } diff --git a/qai_hub_models/models/controlnet_quantized/perf.yaml b/qai_hub_models/models/controlnet_quantized/perf.yaml index fae155c5..dfdd181f 100644 --- a/qai_hub_models/models/controlnet_quantized/perf.yaml +++ b/qai_hub_models/models/controlnet_quantized/perf.yaml @@ -11,7 +11,7 @@ aggregated: supported_chipsets: - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 3 - - Qcs8550 Proxy + - QCS8550 Proxy models: - name: TextEncoder_Quantized performance_metrics: @@ -112,7 +112,7 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy + chipset: QCS8550 Proxy timestamp: '2024-10-03T16:19:26Z' - name: UNet_Quantized performance_metrics: @@ -213,7 +213,7 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy + chipset: QCS8550 Proxy timestamp: '2024-10-03T16:19:27Z' - name: VAEDecoder_Quantized performance_metrics: @@ -314,7 +314,7 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy + chipset: QCS8550 Proxy timestamp: '2024-10-03T16:19:26Z' - name: ControlNet_Quantized performance_metrics: @@ -415,5 +415,5 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy + chipset: QCS8550 Proxy timestamp: '2024-10-03T16:19:27Z' diff --git a/qai_hub_models/models/convnext_tiny/README.md b/qai_hub_models/models/convnext_tiny/README.md index 67aed5c6..efee0bb2 100644 --- a/qai_hub_models/models/convnext_tiny/README.md +++ b/qai_hub_models/models/convnext_tiny/README.md @@ -6,7 +6,7 @@ ConvNextTiny is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of ConvNext-Tiny found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/convnext.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/convnext_tiny). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.convnext_tiny.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. 
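+
+For reference, the snippet below mirrors the first step the export script
+performs internally (see `export.py` in this diff); it is a sketch for
+experimentation, not an additional supported entry point:
+
+```python
+import torch
+
+from qai_hub_models.models.convnext_tiny import Model
+from qai_hub_models.utils.input_spec import make_torch_inputs
+
+# Instantiate the pretrained model, then trace it on CPU with dummy inputs
+# generated from the model's declared input spec.
+model = Model.from_pretrained()
+input_spec = model.get_input_spec()
+source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec))
+```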
+ ## License -- The license for the original implementation of ConvNext-Tiny can be found +* The license for the original implementation of ConvNext-Tiny can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [A ConvNet for the 2020s](https://arxiv.org/abs/2201.03545) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/convnext.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/convnext_tiny/export.py b/qai_hub_models/models/convnext_tiny/export.py index db0e6fa9..00a16877 100644 --- a/qai_hub_models/models/convnext_tiny/export.py +++ b/qai_hub_models/models/convnext_tiny/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.convnext_tiny import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
* An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "convnext_tiny" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/convnext_tiny/perf.yaml b/qai_hub_models/models/convnext_tiny/perf.yaml index 132a383b..d832083e 100644 --- a/qai_hub_models/models/convnext_tiny/perf.yaml +++ b/qai_hub_models/models/convnext_tiny/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: ConvNext-Tiny performance_metrics: - torchscript_onnx_tflite: - inference_time: 3313.0 - throughput: 301.84123151222457 + inference_time: 3402.0 + throughput: 293.9447383891828 estimated_peak_memory_range: min: 16384 - max: 34047392 + max: 3571480 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,37 +56,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 328 - job_id: jqpyev28g + job_id: jg9lno3mg job_status: Passed torchscript_onnx_qnn: - inference_time: 3839.0 - throughput: 260.4845011721803 + inference_time: 3892.0 + throughput: 256.9373072970195 estimated_peak_memory_range: - min: 233472 - max: 136630264 + min: 626688 + max: 93060648 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 223 + layers_on_npu: 232 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 223 - job_id: j1p3kq735 + total_layers: 232 + job_id: jpxko07j5 job_status: Passed torchscript_onnx: - inference_time: 16401.0 - throughput: 60.97189195780745 + inference_time: 13414.0 + throughput: 74.5489786789921 estimated_peak_memory_range: - min: 12288 - max: 68924008 + min: 638976 + max: 3750992 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 189 + layers_on_npu: 198 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 189 - job_id: jnp10qm85 + total_layers: 198 + job_id: jglvmwee5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:59:44Z' + timestamp: '2024-10-15T01:03:45Z' - torchscript_onnx_tflite: - inference_time: 2771.0 - throughput: 360.8805485384338 + inference_time: 2577.0 + throughput: 388.04811796662784 estimated_peak_memory_range: - min: 20480 - max: 213364256 + 
min: 16384 + max: 217772624 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,37 +109,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 328 - job_id: j2p0ye99g + job_id: jp14zodnp job_status: Passed torchscript_onnx_qnn: - inference_time: 3194.0 - throughput: 313.08703819661866 + inference_time: 3299.0 + throughput: 303.12215822976657 estimated_peak_memory_range: min: 618496 - max: 31299392 + max: 36684272 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 223 + layers_on_npu: 232 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 223 - job_id: jwgoyewq5 + total_layers: 232 + job_id: j5mnx9wyp job_status: Passed torchscript_onnx: - inference_time: 14123.0 - throughput: 70.80648587410607 + inference_time: 9798.0 + throughput: 102.06164523372117 estimated_peak_memory_range: min: 0 - max: 378934128 + max: 390272624 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 189 + layers_on_npu: 198 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 189 - job_id: jvgdw7mr5 + total_layers: 198 + job_id: j56y4oqvp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:59:45Z' + timestamp: '2024-10-15T01:03:46Z' - torchscript_onnx_tflite: - inference_time: 3253.0 - throughput: 307.40854595757764 + inference_time: 3342.0 + throughput: 299.22202274087374 estimated_peak_memory_range: - min: 0 - max: 2261184 + min: 20480 + max: 2120064 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,22 +162,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 328 - job_id: j1p8owrkg + job_id: jgdx16r6p job_status: Passed torchscript_onnx_qnn: - inference_time: 3397.0 - throughput: 294.3773918163085 + inference_time: 3633.0 + throughput: 275.2546105147261 estimated_peak_memory_range: - min: 638976 - max: 2044160 + min: 634880 + max: 1771984 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 223 + layers_on_npu: 232 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 223 - job_id: j7gjxk8vp + total_layers: 232 + job_id: jprv3x1vg job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:59:39Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T01:03:38Z' - torchscript_onnx_tflite: - inference_time: 9137.0 - throughput: 109.44511327569224 + inference_time: 3385.0 + throughput: 295.4209748892171 estimated_peak_memory_range: - min: 24576 - max: 205835680 + min: 1273856 + max: 3086216 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,37 +200,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 328 - job_id: jogkzr0wg + job_id: jgdx16rkp job_status: Passed torchscript_onnx_qnn: - inference_time: 9739.0 - throughput: 102.67994660642776 + inference_time: 3670.0 + throughput: 272.47956403269757 estimated_peak_memory_range: - min: 634880 - max: 32317280 + min: 643072 + max: 1912072 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 223 + layers_on_npu: 232 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 223 - job_id: jmg9v9qw5 + total_layers: 232 + job_id: jp0z0oe25 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:59:43Z' + chipset: 
SA8255P Proxy + timestamp: '2024-10-15T01:03:41Z' - torchscript_onnx_tflite: - inference_time: 3270.0 - throughput: 305.8103975535168 + inference_time: 3400.0 + throughput: 294.11764705882354 estimated_peak_memory_range: - min: 16384 - max: 2740240 + min: 32768 + max: 2225928 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,37 +238,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 328 - job_id: jn5q891n5 + job_id: jp14zodkp job_status: Passed torchscript_onnx_qnn: - inference_time: 3404.0 - throughput: 293.7720329024677 + inference_time: 3670.0 + throughput: 272.47956403269757 estimated_peak_memory_range: - min: 643072 - max: 1995504 + min: 630784 + max: 1906976 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 223 + layers_on_npu: 232 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 223 - job_id: jlpe94nog + total_layers: 232 + job_id: jpy138vrp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:59:40Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T01:03:40Z' - torchscript_onnx_tflite: - inference_time: 3274.0 - throughput: 305.43677458766035 + inference_time: 3384.0 + throughput: 295.5082742316785 estimated_peak_memory_range: - min: 20480 - max: 2183128 + min: 16384 + max: 2285648 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,37 +276,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 328 - job_id: j1glne8jp + job_id: jg9lno3qg job_status: Passed torchscript_onnx_qnn: - inference_time: 3393.0 - throughput: 294.7244326554671 + inference_time: 3664.0 + throughput: 272.92576419213975 estimated_peak_memory_range: - min: 634880 - max: 1867360 + min: 626688 + max: 1792296 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 223 + layers_on_npu: 232 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 223 - job_id: jygzev0og + total_layers: 232 + job_id: jp2kyo3xp job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:59:41Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T01:03:39Z' - torchscript_onnx_tflite: - inference_time: 3265.0 - throughput: 306.2787136294028 + inference_time: 9206.0 + throughput: 108.62480990658267 estimated_peak_memory_range: - min: 28672 - max: 3297824 + min: 184320 + max: 210138112 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,60 +314,113 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 328 - job_id: jw566qm65 + job_id: j5we6ydz5 job_status: Passed torchscript_onnx_qnn: - inference_time: 3524.0 - throughput: 283.7684449489217 + inference_time: 9842.0 + throughput: 101.6053647632595 estimated_peak_memory_range: - min: 634880 - max: 1936752 + min: 0 + max: 32409280 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 223 + layers_on_npu: 232 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 223 - job_id: jz5womr3p + total_layers: 232 + job_id: jgkex6ryg job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T01:03:43Z' + - torchscript_onnx_tflite: + inference_time: 2159.0 + throughput: 463.1773969430292 + estimated_peak_memory_range: + min: 12288 + max: 
64427792 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 328 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 328 + job_id: jp4lrexq5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 2436.0 + throughput: 410.5090311986864 + estimated_peak_memory_range: + min: 0 + max: 37701840 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 232 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 232 + job_id: j5q6q497p + job_status: Passed + torchscript_onnx: + inference_time: 7452.0 + throughput: 134.19216317767044 + estimated_peak_memory_range: + min: 643072 + max: 132250864 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 198 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 198 + job_id: jpv6k2z75 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:59:42Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T01:03:49Z' - torchscript_onnx_qnn: - inference_time: 3635.0 - throughput: 275.1031636863824 + inference_time: 3891.0 + throughput: 257.0033410434336 estimated_peak_memory_range: min: 602112 max: 602112 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 223 + layers_on_npu: 232 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 223 - job_id: j1pv3znk5 + total_layers: 232 + job_id: jgn6v1rv5 job_status: Passed torchscript_onnx: - inference_time: 17094.0 - throughput: 58.5000585000585 + inference_time: 16264.0 + throughput: 61.48548942449582 estimated_peak_memory_range: - min: 61222912 - max: 61222912 + min: 60178432 + max: 60178432 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 189 + layers_on_npu: 198 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 189 - job_id: jz57zv8vp + total_layers: 198 + job_id: jp3j0oqxg job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:59:46Z' + timestamp: '2024-10-15T01:03:47Z' diff --git a/qai_hub_models/models/convnext_tiny_w8a16_quantized/README.md b/qai_hub_models/models/convnext_tiny_w8a16_quantized/README.md index 0f5910ed..613e7dde 100644 --- a/qai_hub_models/models/convnext_tiny_w8a16_quantized/README.md +++ b/qai_hub_models/models/convnext_tiny_w8a16_quantized/README.md @@ -6,7 +6,7 @@ ConvNextTiny is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of ConvNext-Tiny-w8a16-Quantized found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/convnext.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/convnext_tiny_w8a16_quantized). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.convnext_tiny_w8a16_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub.
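The same export options are available programmatically; a minimal sketch, assuming only the `export_model` signature shown in this model's `export.py` diff below (the device string is illustrative):

```python
# Sketch only: mirrors `python -m qai_hub_models.models.convnext_tiny_w8a16_quantized.export`
# using keyword parameters taken from the export_model signature in this diff.
from qai_hub_models.models.convnext_tiny_w8a16_quantized.export import export_model

export_model(device="Samsung Galaxy S24", skip_profiling=True)
```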
+ ## License -- The license for the original implementation of ConvNext-Tiny-w8a16-Quantized can be found +* The license for the original implementation of ConvNext-Tiny-w8a16-Quantized can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [A ConvNet for the 2020s](https://arxiv.org/abs/2201.03545) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/convnext.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/convnext_tiny_w8a16_quantized/evaluate.py b/qai_hub_models/models/convnext_tiny_w8a16_quantized/evaluate.py index 362002ba..55ba9bac 100644 --- a/qai_hub_models/models/convnext_tiny_w8a16_quantized/evaluate.py +++ b/qai_hub_models/models/convnext_tiny_w8a16_quantized/evaluate.py @@ -28,7 +28,7 @@ def main(): default_split_size=2500, supported_datasets=SUPPORTED_DATASETS, supports_tflite=False, - supports_ort=False, + supports_onnx=False, ) args = parser.parse_args() args.device = None diff --git a/qai_hub_models/models/convnext_tiny_w8a16_quantized/export.py b/qai_hub_models/models/convnext_tiny_w8a16_quantized/export.py index 6de43c32..5b97df43 100644 --- a/qai_hub_models/models/convnext_tiny_w8a16_quantized/export.py +++ b/qai_hub_models/models/convnext_tiny_w8a16_quantized/export.py @@ -10,17 +10,17 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.convnext_tiny_w8a16_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.printing import ( print_inference_metrics, @@ -45,20 +45,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. 
Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -80,10 +78,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "convnext_tiny_w8a16_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,7 +107,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( target_runtime, output_path, input_spec ) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -135,7 +133,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +148,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +169,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -202,7 +200,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/convnext_tiny_w8a16_quantized/perf.yaml b/qai_hub_models/models/convnext_tiny_w8a16_quantized/perf.yaml index 163b09a4..5c1b0e76 100644 --- a/qai_hub_models/models/convnext_tiny_w8a16_quantized/perf.yaml +++ b/qai_hub_models/models/convnext_tiny_w8a16_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: ConvNext-Tiny-w8a16-Quantized performance_metrics: - torchscript_onnx_qnn: - inference_time: 3447.0 - throughput: 290.1073397156948 + inference_time: 3622.0 + throughput: 276.09055770292656 estimated_peak_memory_range: - min: 323584 - max: 13088648 + min: 0 + max: 121490384 primary_compute_unit: NPU precision: int8 layer_info: @@ -58,7 +56,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: jz5womz6p + job_id: j5mnx9z7p job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -67,13 +65,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:58:56Z' + timestamp: '2024-10-15T01:02:47Z' - torchscript_onnx_qnn: - inference_time: 2447.0 - throughput: 408.6636697997548 + inference_time: 2610.0 + throughput: 383.1417624521073 estimated_peak_memory_range: - min: 0 - max: 29793200 + min: 315392 + max: 36563696 primary_compute_unit: NPU precision: int8 layer_info: @@ -81,7 +79,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: jmg9v92l5 + job_id: jgn6v1ej5 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -90,13 +88,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:58:57Z' + timestamp: '2024-10-15T01:02:48Z' - torchscript_onnx_qnn: - inference_time: 3060.0 - throughput: 326.797385620915 + inference_time: 13298.0 + throughput: 75.19927808693036 estimated_peak_memory_range: - min: 323584 - max: 1587888 + min: 315392 + max: 8825552 primary_compute_unit: NPU precision: int8 layer_info: @@ -104,22 +102,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: jvgdw74e5 + job_id: 
jglvmw025 job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:59:00Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-15T01:02:58Z' - torchscript_onnx_qnn: - inference_time: 4195.0 - throughput: 238.37902264600714 + inference_time: 3178.0 + throughput: 314.6633102580239 estimated_peak_memory_range: - min: 315392 - max: 33940192 + min: 335872 + max: 1492200 primary_compute_unit: NPU precision: int8 layer_info: @@ -127,22 +125,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: jvgdw74r5 + job_id: jp2kyom6p job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:59:04Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T01:02:50Z' - torchscript_onnx_qnn: - inference_time: 3090.0 - throughput: 323.62459546925567 + inference_time: 3204.0 + throughput: 312.10986267166044 estimated_peak_memory_range: - min: 327680 - max: 1678656 + min: 335872 + max: 1702992 primary_compute_unit: NPU precision: int8 layer_info: @@ -150,22 +148,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: jz5womz3p + job_id: jp8qyj3qp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:59:01Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T01:02:54Z' - torchscript_onnx_qnn: - inference_time: 3091.0 - throughput: 323.51989647363314 + inference_time: 3204.0 + throughput: 312.10986267166044 estimated_peak_memory_range: - min: 331776 - max: 2146800 + min: 335872 + max: 1630224 primary_compute_unit: NPU precision: int8 layer_info: @@ -173,7 +171,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: jmg9v92w5 + job_id: jp0z0o105 job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -181,14 +179,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:59:02Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T01:02:53Z' - torchscript_onnx_qnn: - inference_time: 3074.0 - throughput: 325.30904359141186 + inference_time: 3198.0 + throughput: 312.6954346466542 estimated_peak_memory_range: - min: 323584 - max: 1667072 + min: 335872 + max: 1549344 primary_compute_unit: NPU precision: int8 layer_info: @@ -196,22 +194,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: jnp10q185 + job_id: jpy138d0p job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:59:03Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T01:02:52Z' - torchscript_onnx_qnn: - inference_time: 13121.0 - throughput: 76.21370322383964 + inference_time: 4241.0 + throughput: 235.7934449422306 estimated_peak_memory_range: - min: 352256 - max: 8126144 + min: 315392 + max: 42740768 primary_compute_unit: NPU precision: int8 layer_info: @@ -219,19 +217,42 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: jz57zvnvp + job_id: j5q6q47ep job_status: Passed reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot + name: QCS8450 
(Proxy) + os: '13' + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T01:02:57Z' + - torchscript_onnx_qnn: + inference_time: 2406.0 + throughput: 415.6275976724855 + estimated_peak_memory_range: + min: 0 + max: 36477936 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 215 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 215 + job_id: j56y4o3np + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T12:59:05Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T01:02:59Z' - torchscript_onnx_qnn: - inference_time: 3353.0 - throughput: 298.2403817476886 + inference_time: 3505.0 + throughput: 285.30670470756064 estimated_peak_memory_range: min: 303104 max: 303104 primary_compute_unit: NPU precision: int8 layer_info: layers_on_npu: 215 layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: jnp10q125 + job_id: jprv3xykg job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -251,4 +272,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:58:58Z' + timestamp: '2024-10-15T01:02:49Z' diff --git a/qai_hub_models/models/convnext_tiny_w8a8_quantized/README.md b/qai_hub_models/models/convnext_tiny_w8a8_quantized/README.md index 2cc33cf1..7eac8a6d 100644 --- a/qai_hub_models/models/convnext_tiny_w8a8_quantized/README.md +++ b/qai_hub_models/models/convnext_tiny_w8a8_quantized/README.md @@ -6,7 +6,7 @@ ConvNextTiny is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of ConvNext-Tiny-w8a8-Quantized found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/convnext.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/convnext_tiny_w8a8_quantized). @@ -17,11 +17,6 @@ across various devices can be found [here](https://aihub.qualcomm.com/models/c ## Example & Usage -Install the package via pip: -```bash -pip install "qai_hub_models[convnext_tiny_w8a8_quantized]" -``` - Once installed, run the following simple CLI demo: @@ -44,15 +39,19 @@ python -m qai_hub_models.models.convnext_tiny_w8a8_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of ConvNext-Tiny-w8a8-Quantized can be found +* The license for the original implementation of ConvNext-Tiny-w8a8-Quantized can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [A ConvNet for the 2020s](https://arxiv.org/abs/2201.03545) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/convnext.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/convnext_tiny_w8a8_quantized/conftest.py b/qai_hub_models/models/convnext_tiny_w8a8_quantized/conftest.py index e737cdbc..1c81d4b0 100644 --- a/qai_hub_models/models/convnext_tiny_w8a8_quantized/conftest.py +++ b/qai_hub_models/models/convnext_tiny_w8a8_quantized/conftest.py @@ -9,7 +9,6 @@ import pytest from qai_hub_models.models.convnext_tiny_w8a8_quantized import Model -from qai_hub_models.utils.testing import skip_clone_repo_check # Instantiate the model only once for all tests. @@ -22,7 +21,6 @@ def cached_from_pretrained(): from_pretrained = Model.from_pretrained sig = inspect.signature(from_pretrained) - @skip_clone_repo_check def _cached_from_pretrained(*args, **kwargs): cache_key = str(args) + str(kwargs) model = pretrained_cache.get(cache_key, None) diff --git a/qai_hub_models/models/convnext_tiny_w8a8_quantized/evaluate.py b/qai_hub_models/models/convnext_tiny_w8a8_quantized/evaluate.py index 76c29397..87373ea9 100644 --- a/qai_hub_models/models/convnext_tiny_w8a8_quantized/evaluate.py +++ b/qai_hub_models/models/convnext_tiny_w8a8_quantized/evaluate.py @@ -13,10 +13,8 @@ from qai_hub_models.models.convnext_tiny_w8a8_quantized import MODEL_ID, Model from qai_hub_models.utils.args import evaluate_parser, get_hub_device, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.evaluate import evaluate_on_dataset from qai_hub_models.utils.inference import compile_model_from_args -from qai_hub_models.utils.quantization_aimet import AIMETQuantizableMixin SUPPORTED_DATASETS = ["imagenette", "imagenet"] @@ -28,7 +26,8 @@ def main(): default_split_size=2500, supported_datasets=SUPPORTED_DATASETS, supports_tflite=False, - supports_ort=False, + supports_onnx=False, + is_hub_quantized=True, ) args = parser.parse_args() args.device = None @@ -40,13 +39,7 @@ def main(): MODEL_ID, args, get_model_kwargs(Model, vars(args)) ) hub_device = get_hub_device(None, args.chipset) - - # Use Fp16 model for torch inference - for cls in Model.__mro__: - if issubclass(cls, BaseModel) and not issubclass(cls, AIMETQuantizableMixin): - torch_cls = cls - break - torch_model = torch_cls.from_pretrained(**get_model_kwargs(torch_cls, vars(args))) + torch_model = Model.from_pretrained(**get_model_kwargs(Model, vars(args))) evaluate_on_dataset( hub_model, torch_model, diff --git a/qai_hub_models/models/convnext_tiny_w8a8_quantized/export.py b/qai_hub_models/models/convnext_tiny_w8a8_quantized/export.py index 6714b0a4..43fc8aa9 100644 --- a/qai_hub_models/models/convnext_tiny_w8a8_quantized/export.py +++ b/qai_hub_models/models/convnext_tiny_w8a8_quantized/export.py @@ -10,18 +10,20 @@ import os import warnings from 
pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.convnext_tiny_w8a8_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference +from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( print_inference_metrics, print_on_target_demo_cmd, @@ -31,11 +33,14 @@ can_access_qualcomm_ai_hub, export_without_hub_access, ) +from qai_hub_models.utils.quantization import get_calibration_data def export_model( device: str = "Samsung Galaxy S23 (Family)", chipset: Optional[str] = None, + num_calibration_samples: int = 100, + skip_compiling: bool = False, skip_profiling: bool = False, skip_inferencing: bool = False, skip_downloading: bool = False, @@ -45,20 +50,19 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + 3. Compiles the model to an asset that can be run on device + 4. Profiles the model performance on a real device + 5. Inferences the model on sample inputs + 6. Downloads the model asset to the local directory + 7. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 5 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -66,6 +70,9 @@ def export_model( Defaults to DEFAULT_DEVICE if not specified. chipset: If set, will choose a random device with this chipset. Overrides the `device` argument. + num_calibration_samples: The number of calibration data samples + to use for quantization. + skip_compiling: If set, skips compiling model to format that can run on device. skip_profiling: If set, skips profiling of compiled model on real devices. skip_inferencing: If set, skips computing on-device outputs from sample data. skip_downloading: If set, skips downloading of compiled model. @@ -80,10 +87,11 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: - * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). + A struct of: + * A CompileJob object containing metadata about the compile job submitted to hub (None if compiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). 
+ * A ProfileJob containing metadata about the profile job (None if profiling skipped). + * A QuantizeJob object containing metadata about the quantize job submitted to hub """ model_name = "convnext_tiny_w8a8_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,33 +117,52 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) ) # Trace the model - source_model = model.convert_to_hub_source_model( - target_runtime, output_path, input_spec + source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) + + print(f"Quantizing model {model_name} with {num_calibration_samples} samples.") + # 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + onnx_compile_job = hub.submit_compile_job( + model=source_model, + input_specs=input_spec, + device=hub_device, + name=model_name, + options="--target_runtime onnx", + ) + quantize_job = hub.submit_quantize_job( + model=onnx_compile_job.get_target_model(), + calibration_data=get_calibration_data( + input_spec, "imagenette", num_calibration_samples + ), + weights_dtype=model.get_weights_dtype(), + activations_dtype=model.get_activations_dtype(), + name=model_name, + options=model.get_quantize_options(), ) + if skip_compiling: + return ExportResult(quantize_job=quantize_job) - # 2. Compile the model to an on-device asset + # 3. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) print(f"Optimizing model {model_name} to run on-device") submitted_compile_job = hub.submit_compile_job( - model=source_model, + model=quantize_job.get_target_model(), input_specs=input_spec, device=hub_device, name=model_name, - calibration_data=model.get_calibration_data(target_runtime), options=model_compile_options, ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 4. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +177,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 5. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +198,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 6. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 7. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -202,12 +229,22 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + quantize_job=quantize_job, + ) def main(): warnings.filterwarnings("ignore") - parser = export_parser(model_cls=Model, supports_tflite=False, supports_onnx=False) + parser = export_parser( + model_cls=Model, + supports_tflite=False, + supports_onnx=False, + is_hub_quantized=True, + ) args = parser.parse_args() export_model(**vars(args)) diff --git a/qai_hub_models/models/convnext_tiny_w8a8_quantized/model.py b/qai_hub_models/models/convnext_tiny_w8a8_quantized/model.py index 5e332910..a6e459c7 100644 --- a/qai_hub_models/models/convnext_tiny_w8a8_quantized/model.py +++ b/qai_hub_models/models/convnext_tiny_w8a8_quantized/model.py @@ -4,34 +4,11 @@ # --------------------------------------------------------------------- from __future__ import annotations -from pathlib import Path - -from aimet_torch.quantsim import QuantizationSimModel - -from qai_hub_models.models._shared.convnext_tiny_quantized.model import ( - ConvNextTinyQuantizableBase, -) -from qai_hub_models.utils.asset_loaders import CachedWebModelAsset +from qai_hub_models.models.convnext_tiny.model import ConvNextTiny +from qai_hub_models.utils.quantization import HubQuantizableMixin MODEL_ID = __name__.split(".")[-2] -MODEL_ASSET_VERSION = 1 - -DEFAULT_ENCODINGS = "convnext_tiny_w8a8_quantized_encodings.json" - - -class ConvNextTinyW8A8Quantizable(ConvNextTinyQuantizableBase): - def __init__( - self, - quant_sim_model: QuantizationSimModel, - ) -> None: - ConvNextTinyQuantizableBase.__init__(self, quant_sim_model) - @classmethod - def _default_aimet_encodings(cls) -> str | Path: - return CachedWebModelAsset.from_asset_store( - MODEL_ID, MODEL_ASSET_VERSION, DEFAULT_ENCODINGS - ).fetch() - @classmethod - def _output_bw(cls) -> int: - return 8 +class ConvNextTinyW8A8Quantizable(HubQuantizableMixin, ConvNextTiny): + pass diff --git a/qai_hub_models/models/convnext_tiny_w8a8_quantized/perf.yaml b/qai_hub_models/models/convnext_tiny_w8a8_quantized/perf.yaml index 2b25f8df..d41eb63c 100644 --- a/qai_hub_models/models/convnext_tiny_w8a8_quantized/perf.yaml +++ b/qai_hub_models/models/convnext_tiny_w8a8_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,41 +20,33 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - - QCS8250 (Proxy) - - RB5 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Sa8775p Proxy + - 
Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8775P Proxy models: - name: ConvNext-Tiny-w8a8-Quantized performance_metrics: - torchscript_onnx_qnn: - inference_time: 1721.0 - throughput: 581.0575246949448 + inference_time: 1745.0 + throughput: 573.0659025787966 estimated_peak_memory_range: min: 16384 - max: 126115312 + max: 296021224 primary_compute_unit: NPU precision: int8 layer_info: @@ -61,7 +54,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: jep283o4p + job_id: jpxk97115 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -70,13 +63,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:58:05Z' + timestamp: '2024-10-17T17:33:01Z' - torchscript_onnx_qnn: - inference_time: 1205.0 - throughput: 829.8755186721992 + inference_time: 1224.0 + throughput: 816.9934640522875 estimated_peak_memory_range: min: 163840 - max: 22532912 + max: 23951088 primary_compute_unit: NPU precision: int8 layer_info: @@ -84,7 +77,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: jqpyev87g + job_id: j5mnewzwp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -93,13 +86,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:58:06Z' + timestamp: '2024-10-17T17:33:03Z' - torchscript_onnx_qnn: - inference_time: 1675.0 - throughput: 597.0149253731344 + inference_time: 6437.0 + throughput: 155.35187199005748 estimated_peak_memory_range: - min: 180224 - max: 1408144 + min: 208896 + max: 8283008 primary_compute_unit: NPU precision: int8 layer_info: @@ -107,22 +100,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: j1p8owjxg + job_id: jgn609er5 job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:58:09Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-17T17:33:05Z' - torchscript_onnx_qnn: - inference_time: 2165.0 - throughput: 461.8937644341801 + inference_time: 1675.0 + throughput: 597.0149253731344 estimated_peak_memory_range: - min: 475136 - max: 23927968 + min: 212992 + max: 1423768 primary_compute_unit: NPU precision: int8 layer_info: @@ -130,22 +123,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: jw566qo05 + job_id: jprv64y9g job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:58:13Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-17T17:33:07Z' - torchscript_onnx_qnn: - inference_time: 1679.0 - throughput: 595.5926146515783 + inference_time: 1672.0 + throughput: 598.0861244019138 estimated_peak_memory_range: - min: 184320 - max: 1662120 + min: 221184 + max: 1819072 primary_compute_unit: NPU precision: int8 layer_info: @@ -153,22 +146,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: jogkzr62g + job_id: jpy1z4d7p job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:58:10Z' + chipset: SA8255P Proxy + timestamp: '2024-10-17T17:33:11Z' - torchscript_onnx_qnn: - inference_time: 1687.0 - throughput: 592.7682276229995 
+ inference_time: 1678.0 + throughput: 595.9475566150179 estimated_peak_memory_range: - min: 184320 - max: 1399112 + min: 188416 + max: 1723008 primary_compute_unit: NPU precision: int8 layer_info: @@ -176,7 +169,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: jn5q89445 + job_id: jp0z41r65 job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -184,14 +177,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:58:11Z' + chipset: SA8775P Proxy + timestamp: '2024-10-17T17:33:13Z' - torchscript_onnx_qnn: - inference_time: 1671.0 - throughput: 598.4440454817475 + inference_time: 2142.0 + throughput: 466.8534080298786 estimated_peak_memory_range: - min: 172032 - max: 1383456 + min: 163840 + max: 27302640 primary_compute_unit: NPU precision: int8 layer_info: @@ -199,22 +192,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: j1glnew8p + job_id: jp8q237xp job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:58:12Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-17T17:33:15Z' - torchscript_onnx_qnn: - inference_time: 6707.0 - throughput: 149.0979573579842 + inference_time: 1156.0 + throughput: 865.0519031141869 estimated_peak_memory_range: - min: 163840 - max: 8127056 + min: 0 + max: 28172864 primary_compute_unit: NPU precision: int8 layer_info: @@ -222,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: j1p3kqol5 + job_id: jgkevly2g job_status: Passed reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T12:58:14Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-17T17:33:17Z' - torchscript_onnx_qnn: - inference_time: 1816.0 - throughput: 550.6607929515418 + inference_time: 1828.0 + throughput: 547.0459518599563 estimated_peak_memory_range: - min: 466944 - max: 466944 + min: 520192 + max: 520192 primary_compute_unit: NPU precision: int8 layer_info: @@ -245,7 +238,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: j2p0yeo6g + job_id: jp2kx7m4p job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -254,4 +247,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:58:07Z' + timestamp: '2024-10-17T17:33:09Z' diff --git a/qai_hub_models/models/convnext_tiny_w8a8_quantized/requirements.txt b/qai_hub_models/models/convnext_tiny_w8a8_quantized/requirements.txt deleted file mode 100644 index de5b80e8..00000000 --- a/qai_hub_models/models/convnext_tiny_w8a8_quantized/requirements.txt +++ /dev/null @@ -1 +0,0 @@ -aimet-torch==1.32.1.post1; sys_platform == "linux" diff --git a/qai_hub_models/models/convnext_tiny_w8a8_quantized/test.py b/qai_hub_models/models/convnext_tiny_w8a8_quantized/test.py deleted file mode 100644 index b7fedd53..00000000 --- a/qai_hub_models/models/convnext_tiny_w8a8_quantized/test.py +++ /dev/null @@ -1,31 +0,0 @@ -# --------------------------------------------------------------------- -# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. 
-# SPDX-License-Identifier: BSD-3-Clause -# --------------------------------------------------------------------- -from qai_hub_models.models._shared.imagenet_classifier.test_utils import ( - run_imagenet_classifier_test, ) -from qai_hub_models.models.convnext_tiny_w8a8_quantized.demo import main as demo_main -from qai_hub_models.models.convnext_tiny_w8a8_quantized.model import ( - MODEL_ASSET_VERSION, - MODEL_ID, - ConvNextTinyW8A8Quantizable, -) -from qai_hub_models.utils.testing import skip_clone_repo_check - - -@skip_clone_repo_check -def test_task(): - run_imagenet_classifier_test( - ConvNextTinyW8A8Quantizable.from_pretrained(), - MODEL_ID, - asset_version=MODEL_ASSET_VERSION, - probability_threshold=0.56, - diff_tol=0.06, - ) - - -@skip_clone_repo_check -def test_demo(): - # Verify demo does not crash - demo_main(is_test=True) diff --git a/qai_hub_models/models/ddrnet23_slim/README.md b/qai_hub_models/models/ddrnet23_slim/README.md index b076dea2..f4c0a382 100644 --- a/qai_hub_models/models/ddrnet23_slim/README.md +++ b/qai_hub_models/models/ddrnet23_slim/README.md @@ -6,7 +6,7 @@ DDRNet23Slim is a machine learning model that segments an image into semantic classes, specifically designed for road-based scenes. It is intended for self-driving applications. This is based on the implementation of DDRNet23-Slim found -[here](https://github.com/chenjun2hao/DDRNet.pytorch). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/ddrnet23_slim). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.ddrnet23_slim.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of DDRNet23-Slim can be found +* The license for the original implementation of DDRNet23-Slim can be found [here](https://github.com/chenjun2hao/DDRNet.pytorch/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Deep Dual-resolution Networks for Real-time and Accurate Semantic Segmentation of Road Scenes](https://arxiv.org/abs/2101.06085) * [Source Model Implementation](https://github.com/chenjun2hao/DDRNet.pytorch) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
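The `export.py` change repeated across these models replaces the positional 3-tuple return with an `ExportResult` struct. A minimal migration sketch, assuming only the `compile_job`/`profile_job`/`inference_job` fields visible in the `ExportResult(...)` constructions in this diff and access to Qualcomm® AI Hub (without it, `export_model` returns a list of strings, per its `ExportResult | List[str]` signature):

```python
# Sketch only: how a caller of the ddrnet23_slim export entry point migrates
# from tuple unpacking to the named ExportResult fields introduced here.
from qai_hub_models.models.ddrnet23_slim.export import export_model

# Before this diff: compile_job, profile_job, inference_job = export_model(...)
result = export_model(device="Samsung Galaxy S23 (Family)")

compile_job = result.compile_job        # populated whenever compilation ran
if result.profile_job is not None:      # None when profiling was skipped
    assert result.profile_job.wait().success
```

A named struct also lets the hub-quantized variants attach a `quantize_job` field (as the w8a8 export above does) without breaking existing callers.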
diff --git a/qai_hub_models/models/ddrnet23_slim/export.py b/qai_hub_models/models/ddrnet23_slim/export.py index 3f162c6e..b5b34070 100644 --- a/qai_hub_models/models/ddrnet23_slim/export.py +++ b/qai_hub_models/models/ddrnet23_slim/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.ddrnet23_slim import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "ddrnet23_slim" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. 
Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/ddrnet23_slim/perf.yaml b/qai_hub_models/models/ddrnet23_slim/perf.yaml index 00fb7085..55751df0 100644 --- a/qai_hub_models/models/ddrnet23_slim/perf.yaml +++ b/qai_hub_models/models/ddrnet23_slim/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: DDRNet23-Slim performance_metrics: - torchscript_onnx_tflite: - inference_time: 5175.0 - throughput: 193.23671497584542 + inference_time: 5215.0 + throughput: 191.75455417066155 estimated_peak_memory_range: - min: 987136 - max: 3164272 + min: 1929216 + max: 3767840 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 131 - job_id: jmg9v9zl5 + job_id: jgz3dzj45 job_status: Passed 
torchscript_onnx: - inference_time: 9595.0 - throughput: 104.22094841063054 + inference_time: 7422.0 + throughput: 134.73457289140393 estimated_peak_memory_range: - min: 11673600 - max: 218833312 + min: 11857920 + max: 13806568 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,7 +71,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 155 - job_id: j1glney8p + job_id: jp3j0oemg job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -82,13 +80,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:57:10Z' + timestamp: '2024-10-15T01:00:45Z' - torchscript_onnx_tflite: - inference_time: 4469.0 - throughput: 223.76370552696352 + inference_time: 4002.0 + throughput: 249.8750624687656 estimated_peak_memory_range: min: 987136 - max: 74955808 + max: 79422992 primary_compute_unit: NPU precision: fp16 layer_info: @@ -96,14 +94,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 131 - job_id: jnp10qn25 + job_id: j5we6y345 job_status: Passed torchscript_onnx: - inference_time: 7899.0 - throughput: 126.59830358273199 + inference_time: 5648.0 + throughput: 177.05382436260624 estimated_peak_memory_range: min: 0 - max: 82281968 + max: 94016992 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,7 +109,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 155 - job_id: j1p3kqzl5 + job_id: jgo26d31p job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -120,13 +118,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:57:11Z' + timestamp: '2024-10-15T01:00:46Z' - torchscript_onnx_tflite: - inference_time: 5059.0 - throughput: 197.667523225934 + inference_time: 5072.0 + throughput: 197.1608832807571 estimated_peak_memory_range: - min: 999424 - max: 2488576 + min: 1036288 + max: 3600920 primary_compute_unit: NPU precision: fp16 layer_info: @@ -134,7 +132,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 131 - job_id: jvgdw7de5 + job_id: jg9lnoymg job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -142,14 +140,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:56:57Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T01:00:26Z' - torchscript_onnx_tflite: - inference_time: 7501.0 - throughput: 133.3155579256099 + inference_time: 5150.0 + throughput: 194.1747572815534 estimated_peak_memory_range: - min: 1028096 - max: 65317360 + min: 12288 + max: 1769952 primary_compute_unit: NPU precision: fp16 layer_info: @@ -157,22 +155,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 131 - job_id: jz57zvelp + job_id: jp4lred25 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:56:58Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T01:00:30Z' - torchscript_onnx_tflite: - inference_time: 5198.0 - throughput: 192.3816852635629 + inference_time: 5152.0 + throughput: 194.09937888198758 estimated_peak_memory_range: - min: 0 - max: 14606376 + min: 995328 + max: 2779824 primary_compute_unit: NPU precision: fp16 layer_info: @@ -180,22 +178,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 131 - job_id: jqp4qjyvg + job_id: j57yroln5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto 
os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:56:59Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T01:00:29Z' - torchscript_onnx_tflite: inference_time: 5027.0 throughput: 198.92580067634773 estimated_peak_memory_range: - min: 999424 - max: 8435688 + min: 528384 + max: 3069008 primary_compute_unit: NPU precision: fp16 layer_info: @@ -203,22 +201,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 131 - job_id: j0pxvel1g + job_id: jgdx16q6p job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:57:00Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T01:00:28Z' - torchscript_onnx_tflite: - inference_time: 5195.0 - throughput: 192.49278152069297 + inference_time: 7487.0 + throughput: 133.56484573260317 estimated_peak_memory_range: - min: 405504 - max: 7332312 + min: 1032192 + max: 67333008 primary_compute_unit: NPU precision: fp16 layer_info: @@ -226,19 +224,57 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 131 - job_id: jo5mrv0wg + job_id: jp14zownp job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T01:00:27Z' + - torchscript_onnx_tflite: + inference_time: 3419.0 + throughput: 292.48318221702255 + estimated_peak_memory_range: + min: 983040 + max: 41260000 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 131 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 131 + job_id: j5mnx967p + job_status: Passed + torchscript_onnx: + inference_time: 5009.0 + throughput: 199.64064683569575 + estimated_peak_memory_range: + min: 11886592 + max: 58267120 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 155 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 155 + job_id: jpedm6k85 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:57:01Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T01:00:49Z' - torchscript_onnx: - inference_time: 9452.0 - throughput: 105.79771476936098 + inference_time: 8333.0 + throughput: 120.00480019200768 estimated_peak_memory_range: min: 9859072 max: 9859072 @@ -249,7 +285,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 155 - job_id: jwgoyelx5 + job_id: jpv6k2vz5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -258,4 +294,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:57:12Z' + timestamp: '2024-10-15T01:00:47Z' diff --git a/qai_hub_models/models/deeplabv3_plus_mobilenet/README.md b/qai_hub_models/models/deeplabv3_plus_mobilenet/README.md index a8ce7dc5..ed83d022 100644 --- a/qai_hub_models/models/deeplabv3_plus_mobilenet/README.md +++ b/qai_hub_models/models/deeplabv3_plus_mobilenet/README.md @@ -6,7 +6,7 @@ DeepLabV3 is designed for semantic segmentation at multiple scales, trained on the various datasets. It uses MobileNet as a backbone. This is based on the implementation of DeepLabV3-Plus-MobileNet found -[here](https://github.com/jfzhang95/pytorch-deeplab-xception). This repository contains scripts for optimized on-device +[here]({source_repo}). 
This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/deeplabv3_plus_mobilenet). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.deeplabv3_plus_mobilenet.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of DeepLabV3-Plus-MobileNet can be found +* The license for the original implementation of DeepLabV3-Plus-MobileNet can be found [here](https://github.com/jfzhang95/pytorch-deeplab-xception/blob/master/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Rethinking Atrous Convolution for Semantic Image Segmentation](https://arxiv.org/abs/1706.05587) * [Source Model Implementation](https://github.com/jfzhang95/pytorch-deeplab-xception) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/deeplabv3_plus_mobilenet/export.py b/qai_hub_models/models/deeplabv3_plus_mobilenet/export.py index c5f586f2..d1ce03ae 100644 --- a/qai_hub_models/models/deeplabv3_plus_mobilenet/export.py +++ b/qai_hub_models/models/deeplabv3_plus_mobilenet/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.deeplabv3_plus_mobilenet import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5.
Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "deeplabv3_plus_mobilenet" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/deeplabv3_plus_mobilenet/perf.yaml b/qai_hub_models/models/deeplabv3_plus_mobilenet/perf.yaml index 08441fdb..6a9da31e 100644 --- a/qai_hub_models/models/deeplabv3_plus_mobilenet/perf.yaml +++ b/qai_hub_models/models/deeplabv3_plus_mobilenet/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: DeepLabV3-Plus-MobileNet performance_metrics: - torchscript_onnx_tflite: - inference_time: 13181.0 - throughput: 75.86677793794098 + inference_time: 13441.0 + throughput: 74.39922624804701 estimated_peak_memory_range: - min: 14233600 - max: 23263104 + min: 21590016 + max: 23177024 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 98 - job_id: jvgdw73e5 + job_id: jgjvn3zeg job_status: Passed torchscript_onnx_qnn: - inference_time: 13150.0 - throughput: 76.04562737642586 + inference_time: 13124.0 + throughput: 76.19628162145688 estimated_peak_memory_range: - min: 3190784 - max: 18670512 + min: 3178496 + max: 21314592 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 124 - job_id: jep283w4p + job_id: jp14zoynp job_status: Passed torchscript_onnx: - inference_time: 17923.0 - throughput: 55.794230876527365 + inference_time: 16946.0 + throughput: 59.01097604154373 estimated_peak_memory_range: - min: 52928512 - max: 54745920 + min: 47849472 + max: 346264704 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: j1p3kq9l5 + job_id: jp0z0o205 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:56:27Z' + timestamp: '2024-10-15T00:59:55Z' - torchscript_onnx_tflite: - inference_time: 10842.0 - throughput: 92.23390518354547 + inference_time: 10784.0 + throughput: 92.7299703264095 estimated_peak_memory_range: min: 22142976 - max: 
97061248 + max: 102268128 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 98 - job_id: jz57zv4lp + job_id: jpedm6ev5 job_status: Passed torchscript_onnx_qnn: - inference_time: 10777.0 - throughput: 92.79020135473694 + inference_time: 10749.0 + throughput: 93.03190994511117 estimated_peak_memory_range: min: 3174400 - max: 24408400 + max: 29549824 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 124 - job_id: jqpyevx7g + job_id: jgdx16e6p job_status: Passed torchscript_onnx: - inference_time: 16346.0 - throughput: 61.17704637220115 + inference_time: 15136.0 + throughput: 66.0676532769556 estimated_peak_memory_range: - min: 48263168 - max: 126144864 + min: 958464 + max: 85892848 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: jwgoyerx5 + job_id: jp8qyjmqp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:56:28Z' + timestamp: '2024-10-15T00:59:56Z' - torchscript_onnx_tflite: - inference_time: 13090.0 - throughput: 76.39419404125286 + inference_time: 13166.0 + throughput: 75.95321282090232 estimated_peak_memory_range: - min: 22343680 - max: 23718824 + min: 22347776 + max: 68519216 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 98 - job_id: jqp4qj1vg + job_id: jgz3dzox5 job_status: Passed torchscript_onnx_qnn: - inference_time: 12022.0 - throughput: 83.18083513558476 + inference_time: 12047.0 + throughput: 83.00821781356355 estimated_peak_memory_range: - min: 3223552 - max: 4869352 + min: 3239936 + max: 4477560 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 124 - job_id: j1p8owxxg + job_id: jp4lrek25 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:56:22Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:59:48Z' - torchscript_onnx_tflite: - inference_time: 18229.0 - throughput: 54.85764441274892 + inference_time: 13288.0 + throughput: 75.25586995785672 estimated_peak_memory_range: - min: 22167552 - max: 99317600 + min: 22155264 + max: 34721392 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 98 - job_id: j0pxve41g + job_id: jgdx16ezp job_status: Passed torchscript_onnx_qnn: - inference_time: 18600.0 - throughput: 53.763440860215056 + inference_time: 12206.0 + throughput: 81.92692118630183 estimated_peak_memory_range: - min: 3174400 - max: 28359712 + min: 3194880 + max: 4680256 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 124 - job_id: jw566q705 + job_id: jgn6v1lj5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:56:26Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:59:51Z' - torchscript_onnx_tflite: inference_time: 13223.0 
throughput: 75.62580352416245 estimated_peak_memory_range: - min: 22122496 - max: 30344656 + min: 14749696 + max: 20440648 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 98 - job_id: jo5mrvmwg + job_id: jp14zoy7p job_status: Passed torchscript_onnx_qnn: - inference_time: 12048.0 - throughput: 83.00132802124834 + inference_time: 12296.0 + throughput: 81.32726089785297 estimated_peak_memory_range: - min: 3235840 - max: 4624240 + min: 3260416 + max: 4642640 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 124 - job_id: jogkzr42g + job_id: j5mnx9q7p job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:56:23Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:59:50Z' - torchscript_onnx_tflite: - inference_time: 13101.0 - throughput: 76.33005114113426 + inference_time: 13234.0 + throughput: 75.5629439322956 estimated_peak_memory_range: - min: 22114304 - max: 25047336 + min: 28082176 + max: 30679032 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 98 - job_id: jegn2rnrg + job_id: jg9lnoj8g job_status: Passed torchscript_onnx_qnn: - inference_time: 12119.0 - throughput: 82.51505899826718 + inference_time: 12164.0 + throughput: 82.20979940808944 estimated_peak_memory_range: - min: 3227648 - max: 4558864 + min: 3207168 + max: 4483920 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 124 - job_id: jn5q89y45 + job_id: jpxko0n85 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:56:24Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:59:49Z' - torchscript_onnx_tflite: - inference_time: 13119.0 - throughput: 76.22532205198567 + inference_time: 18816.0 + throughput: 53.14625850340136 estimated_peak_memory_range: min: 22138880 - max: 38378768 + max: 101751920 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 98 - job_id: joprk1095 + job_id: j5we6y2m5 job_status: Passed torchscript_onnx_qnn: - inference_time: 12081.0 - throughput: 82.77460475126232 + inference_time: 18643.0 + throughput: 53.6394357131363 estimated_peak_memory_range: - min: 3252224 - max: 4620112 + min: 3174400 + max: 31557232 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 124 - job_id: j1glnex8p + job_id: jp2kyo06p job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:59:53Z' + - torchscript_onnx_tflite: + inference_time: 7831.0 + throughput: 127.69761205465458 + estimated_peak_memory_range: + min: 20316160 + max: 59031680 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 98 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 98 + job_id: jg9lnojmg + job_status: Passed + torchscript_onnx_qnn: + inference_time: 9188.0 + throughput: 
108.837614279495 + estimated_peak_memory_range: + min: 3170304 + max: 27743296 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 124 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 124 + job_id: jpy138r0p + job_status: Passed + torchscript_onnx: + inference_time: 11971.0 + throughput: 83.53521009105337 + estimated_peak_memory_range: + min: 53542912 + max: 94260304 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 126 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 126 + job_id: jglvmw225 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:56:25Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:59:59Z' - torchscript_onnx_qnn: - inference_time: 12511.0 - throughput: 79.92966189753017 + inference_time: 12380.0 + throughput: 80.77544426494346 estimated_peak_memory_range: min: 3170304 max: 3170304 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 124 - job_id: j2p0yej6g + job_id: j57yro0n5 job_status: Passed torchscript_onnx: - inference_time: 16612.0 - throughput: 60.197447628220566 + inference_time: 16661.0 + throughput: 60.020406938359045 estimated_peak_memory_range: - min: 69386240 - max: 69386240 + min: 69431296 + max: 69431296 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: j1pv3zdj5 + job_id: jgkex6qvg job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:56:29Z' + timestamp: '2024-10-15T00:59:57Z' diff --git a/qai_hub_models/models/deeplabv3_plus_mobilenet_quantized/README.md b/qai_hub_models/models/deeplabv3_plus_mobilenet_quantized/README.md index c371f6c3..f8503741 100644 --- a/qai_hub_models/models/deeplabv3_plus_mobilenet_quantized/README.md +++ b/qai_hub_models/models/deeplabv3_plus_mobilenet_quantized/README.md @@ -6,7 +6,7 @@ DeepLabV3 Quantized is designed for semantic segmentation at multiple scales, trained on various datasets. It uses MobileNet as a backbone. This is based on the implementation of DeepLabV3-Plus-MobileNet-Quantized found -[here](https://github.com/jfzhang95/pytorch-deeplab-xception). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/deeplabv3_plus_mobilenet_quantized). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.deeplabv3_plus_mobilenet_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of DeepLabV3-Plus-MobileNet-Quantized can be found +* The license for the original implementation of DeepLabV3-Plus-MobileNet-Quantized can be found [here](https://github.com/quic/aimet-model-zoo/blob/develop/LICENSE.pdf).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Rethinking Atrous Convolution for Semantic Image Segmentation](https://arxiv.org/abs/1706.05587) * [Source Model Implementation](https://github.com/jfzhang95/pytorch-deeplab-xception) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/deeplabv3_plus_mobilenet_quantized/export.py b/qai_hub_models/models/deeplabv3_plus_mobilenet_quantized/export.py index 7d5193ae..58212a92 100644 --- a/qai_hub_models/models/deeplabv3_plus_mobilenet_quantized/export.py +++ b/qai_hub_models/models/deeplabv3_plus_mobilenet_quantized/export.py @@ -10,17 +10,17 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.deeplabv3_plus_mobilenet_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.printing import ( print_inference_metrics, @@ -45,20 +45,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -80,10 +78,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). 
+ * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "deeplabv3_plus_mobilenet_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,7 +107,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( target_runtime, output_path, input_spec ) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -135,7 +133,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +148,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +169,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -198,7 +196,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/deeplabv3_plus_mobilenet_quantized/perf.yaml b/qai_hub_models/models/deeplabv3_plus_mobilenet_quantized/perf.yaml index 105efc74..dba4daa2 100644 --- a/qai_hub_models/models/deeplabv3_plus_mobilenet_quantized/perf.yaml +++ b/qai_hub_models/models/deeplabv3_plus_mobilenet_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,44 +20,41 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: DeepLabV3-Plus-MobileNet-Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 3353.0 - throughput: 298.2403817476886 + inference_time: 3304.0 + throughput: 302.6634382566586 estimated_peak_memory_range: min: 12288 - max: 8840440 + max: 153358872 primary_compute_unit: NPU precision: int8 layer_info: @@ -64,29 +62,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 104 - job_id: jnp10q625 + job_id: jgkex6eng job_status: Passed torchscript_onnx_qnn: - inference_time: 5163.0 - throughput: 193.6858415649816 + inference_time: 5214.0 + throughput: 191.79133103183736 estimated_peak_memory_range: min: 16384 - max: 14733928 + max: 12339320 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 99 + layers_on_npu: 142 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 99 - job_id: jqpyevm7g + total_layers: 142 + job_id: jg9lno08g job_status: Passed torchscript_onnx: - inference_time: 4668.0 - throughput: 214.22450728363324 + inference_time: 4221.0 + throughput: 236.9106846718787 estimated_peak_memory_range: - min: 15290368 - max: 18104776 + min: 11128832 + max: 19101592 primary_compute_unit: NPU precision: int8 layer_info: @@ -94,7 +92,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 103 - job_id: j1pv3z7j5 + job_id: jp0z0o4n5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -103,13 +101,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:55:44Z' + timestamp: '2024-10-15T00:59:05Z' - 
torchscript_onnx_tflite: - inference_time: 2847.0 - throughput: 351.24692658939233 + inference_time: 2825.0 + throughput: 353.98230088495575 estimated_peak_memory_range: min: 12288 - max: 65304512 + max: 68144736 primary_compute_unit: NPU precision: int8 layer_info: @@ -117,29 +115,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 104 - job_id: jvgdw72e5 + job_id: j5q6q46op job_status: Passed torchscript_onnx_qnn: - inference_time: 3858.0 - throughput: 259.2016588906169 + inference_time: 3844.0 + throughput: 260.1456815816857 estimated_peak_memory_range: min: 802816 - max: 29932288 + max: 26509280 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 99 + layers_on_npu: 142 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 99 - job_id: j2p0ye66g + total_layers: 142 + job_id: jp14zo27p job_status: Passed torchscript_onnx: - inference_time: 4064.0 - throughput: 246.06299212598427 + inference_time: 3141.0 + throughput: 318.3699458771092 estimated_peak_memory_range: min: 12288 - max: 72339920 + max: 75157872 primary_compute_unit: NPU precision: int8 layer_info: @@ -147,7 +145,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 103 - job_id: j7gjxkqxp + job_id: jp8qyj2op job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -156,13 +154,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:55:45Z' + timestamp: '2024-10-15T00:59:06Z' - torchscript_onnx_tflite: - inference_time: 3284.0 - throughput: 304.50669914738125 + inference_time: 14162.0 + throughput: 70.61149555147578 estimated_peak_memory_range: - min: 12288 - max: 14577616 + min: 5586944 + max: 50318288 primary_compute_unit: NPU precision: int8 layer_info: @@ -170,37 +168,60 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 104 - job_id: jz57zv9lp + job_id: jpedm6ov5 job_status: Passed torchscript_onnx_qnn: - inference_time: 3939.0 - throughput: 253.87154100025387 + inference_time: 18291.0 + throughput: 54.67169646274124 estimated_peak_memory_range: - min: 847872 - max: 2112960 + min: 827392 + max: 9088768 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 99 + layers_on_npu: 142 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 99 - job_id: jogkzr82g + total_layers: 142 + job_id: jp2kyoxqp job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS6490 Proxy + timestamp: '2024-10-15T00:59:03Z' + - torchscript_onnx_tflite: + inference_time: 127380.0 + throughput: 7.850525985241011 + estimated_peak_memory_range: + min: 11624448 + max: 66169216 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 101 + layers_on_gpu: 3 + layers_on_cpu: 0 + total_layers: 104 + job_id: jgz3dz2x5 + job_status: Passed + reference_device_info: + name: RB5 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:55:37Z' + chipset: QCS8250 Proxy + timestamp: '2024-10-15T00:58:51Z' - torchscript_onnx_tflite: - inference_time: 4263.0 - throughput: 234.57658925639223 + inference_time: 3315.0 + throughput: 301.65912518853696 estimated_peak_memory_range: - min: 20480 - max: 66912736 + min: 16384 + max: 8860536 primary_compute_unit: NPU precision: int8 layer_info: @@ -208,37 +229,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 104 - job_id: jqp4qj3vg + job_id: jglvmw4m5 job_status: Passed 
torchscript_onnx_qnn: - inference_time: 5664.0 - throughput: 176.5536723163842 + inference_time: 3963.0 + throughput: 252.33409033560434 estimated_peak_memory_range: - min: 802816 - max: 30079360 + min: 831488 + max: 2029904 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 99 + layers_on_npu: 142 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 99 - job_id: j1p3kq6l5 + total_layers: 142 + job_id: j57yro295 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:55:42Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:58:57Z' - torchscript_onnx_tflite: - inference_time: 3280.0 - throughput: 304.8780487804878 + inference_time: 3335.0 + throughput: 299.85007496251876 estimated_peak_memory_range: min: 12288 - max: 8695576 + max: 4632240 primary_compute_unit: NPU precision: int8 layer_info: @@ -246,37 +267,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 104 - job_id: j0pxvex1g + job_id: jpv6k2qr5 job_status: Passed torchscript_onnx_qnn: - inference_time: 3980.0 - throughput: 251.25628140703517 + inference_time: 3970.0 + throughput: 251.88916876574308 estimated_peak_memory_range: - min: 831488 - max: 2093912 + min: 827392 + max: 2031208 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 99 + layers_on_npu: 142 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 99 - job_id: jn5q89v45 + total_layers: 142 + job_id: j5mnx9e9p job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:55:39Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:59:00Z' - torchscript_onnx_tflite: - inference_time: 3323.0 - throughput: 300.9328919650918 + inference_time: 3294.0 + throughput: 303.58227079538557 estimated_peak_memory_range: min: 12288 - max: 3232360 + max: 8952312 primary_compute_unit: NPU precision: int8 layer_info: @@ -284,22 +305,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 104 - job_id: jo5mrv8wg + job_id: jgo26dzkp job_status: Passed torchscript_onnx_qnn: - inference_time: 3969.0 - throughput: 251.95263290501387 + inference_time: 3994.0 + throughput: 250.37556334501753 estimated_peak_memory_range: - min: 827392 - max: 2407720 + min: 819200 + max: 2086304 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 99 + layers_on_npu: 142 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 99 - job_id: j1glnel8p + total_layers: 142 + job_id: jpxko09l5 job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -307,14 +328,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:55:40Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:58:59Z' - torchscript_onnx_tflite: - inference_time: 3317.0 - throughput: 301.4772384684956 + inference_time: 3328.0 + throughput: 300.4807692307692 estimated_peak_memory_range: - min: 40960 - max: 148009896 + min: 12288 + max: 120682856 primary_compute_unit: NPU precision: int8 layer_info: @@ -322,37 +343,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 104 - job_id: jegn2rkrg + job_id: jp3j0onng job_status: Passed torchscript_onnx_qnn: - inference_time: 3975.0 - throughput: 251.57232704402514 + inference_time: 3963.0 + throughput: 252.33409033560434 
estimated_peak_memory_range: - min: 847872 - max: 2052464 + min: 843776 + max: 2147872 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 99 + layers_on_npu: 142 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 99 - job_id: jw566qw05 + total_layers: 142 + job_id: jp4lren15 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:55:41Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:58:58Z' - torchscript_onnx_tflite: - inference_time: 14594.0 - throughput: 68.52131012744964 + inference_time: 4166.0 + throughput: 240.03840614498318 estimated_peak_memory_range: - min: 5537792 - max: 49597760 + min: 5566464 + max: 74729504 primary_compute_unit: NPU precision: int8 layer_info: @@ -360,75 +381,105 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 104 - job_id: joprk1w95 + job_id: j56y4o2yp job_status: Passed torchscript_onnx_qnn: - inference_time: 18495.0 - throughput: 54.068667207353336 + inference_time: 5510.0 + throughput: 181.48820326678765 estimated_peak_memory_range: - min: 888832 - max: 8740416 + min: 802816 + max: 33657840 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 99 + layers_on_npu: 142 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 99 - job_id: jwgoye8x5 + total_layers: 142 + job_id: jprv3x67g job_status: Passed reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T12:55:43Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:59:02Z' - torchscript_onnx_tflite: - inference_time: 115874.0 - throughput: 8.630063689870031 + inference_time: 2441.0 + throughput: 409.6681687832855 estimated_peak_memory_range: - min: 10899456 - max: 53906400 + min: 8192 + max: 43724352 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 101 - layers_on_gpu: 3 + layers_on_npu: 104 + layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 104 - job_id: jep283e4p + job_id: j5we6ywm5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 3816.0 + throughput: 262.0545073375262 + estimated_peak_memory_range: + min: 815104 + max: 26711376 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 142 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 142 + job_id: jpy138zlp + job_status: Passed + torchscript_onnx: + inference_time: 2494.0 + throughput: 400.962309542903 + estimated_peak_memory_range: + min: 65536 + max: 49631824 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 103 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 103 + job_id: jglvmw6m5 job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T12:55:33Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:59:09Z' - torchscript_onnx_qnn: - inference_time: 4292.0 - throughput: 232.99161230195713 + inference_time: 4324.0 + throughput: 231.26734505087882 estimated_peak_memory_range: - min: 802816 - max: 802816 + min: 815104 + max: 815104 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 99 + layers_on_npu: 142 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 99 - job_id: j1p8ow1xg + 
total_layers: 142 + job_id: jgdx16nzp job_status: Passed torchscript_onnx: - inference_time: 4600.0 - throughput: 217.3913043478261 + inference_time: 4680.0 + throughput: 213.67521367521368 estimated_peak_memory_range: - min: 18182144 - max: 18182144 + min: 18178048 + max: 18178048 primary_compute_unit: NPU precision: int8 layer_info: @@ -436,7 +487,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 103 - job_id: jlpe94y1g + job_id: jgkex6vng job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -445,4 +496,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:55:45Z' + timestamp: '2024-10-15T00:59:07Z' diff --git a/qai_hub_models/models/deeplabv3_resnet50/README.md b/qai_hub_models/models/deeplabv3_resnet50/README.md index 2149f272..e57f8d8a 100644 --- a/qai_hub_models/models/deeplabv3_resnet50/README.md +++ b/qai_hub_models/models/deeplabv3_resnet50/README.md @@ -6,7 +6,7 @@ DeepLabV3 is designed for semantic segmentation at multiple scales, trained on the COCO dataset. It uses ResNet50 as a backbone. This is based on the implementation of DeepLabV3-ResNet50 found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/segmentation/deeplabv3.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/deeplabv3_resnet50). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.deeplabv3_resnet50.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of DeepLabV3-ResNet50 can be found +* The license for the original implementation of DeepLabV3-ResNet50 can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Rethinking Atrous Convolution for Semantic Image Segmentation](https://arxiv.org/abs/1706.05587) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/segmentation/deeplabv3.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
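Taken together, the `export.py` diffs above replace the positional 3-tuple return with the keyword-constructed `ExportResult`. Below is a minimal caller sketch of that migration, assuming only what the diffs show: the `ExportResult` fields `compile_job`, `inference_job`, and `profile_job`, and the documented `device` parameter. The model module, device name, and reliance on default arguments are illustrative.

```python
# Hypothetical call site before and after this change; assumes defaults for
# the remaining export_model arguments.
from qai_hub_models.models.deeplabv3_resnet50.export import export_model

result = export_model(device="Samsung Galaxy S23")

# Old call sites unpacked a 3-tuple positionally:
#   compile_job, profile_job, inference_job = export_model(...)
# ExportResult is accessed by field name, so callers no longer depend on the
# tuple ordering, and skipped steps surface as explicit Nones. (Per the new
# signature the function may instead return List[str]; this sketch assumes
# the ExportResult path.)
print(result.compile_job)
if result.profile_job is not None:
    print(result.profile_job)
if result.inference_job is not None:
    print(result.inference_job)
```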
diff --git a/qai_hub_models/models/deeplabv3_resnet50/export.py b/qai_hub_models/models/deeplabv3_resnet50/export.py index d693b76e..3913a5b5 100644 --- a/qai_hub_models/models/deeplabv3_resnet50/export.py +++ b/qai_hub_models/models/deeplabv3_resnet50/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.deeplabv3_resnet50 import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "deeplabv3_resnet50" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. 
Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/deeplabv3_resnet50/perf.yaml b/qai_hub_models/models/deeplabv3_resnet50/perf.yaml index 598764c9..49b7bdcf 100644 --- a/qai_hub_models/models/deeplabv3_resnet50/perf.yaml +++ b/qai_hub_models/models/deeplabv3_resnet50/perf.yaml @@ -18,37 +18,31 @@ aggregated: - Samsung Galaxy S21 - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - - QCS8550 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) - SA8775 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL supported_chipsets: - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: DeepLabV3-ResNet50 performance_metrics: - torchscript_onnx_tflite: - inference_time: 291699.0 - throughput: 3.428191389068869 + inference_time: 291789.0 + throughput: 3.427133990657633 estimated_peak_memory_range: - min: 12288 - max: 148775512 + min: 22183936 + max: 200672256 primary_compute_unit: GPU precision: fp16 layer_info: @@ -56,7 +50,7 @@ models: layers_on_gpu: 95 layers_on_cpu: 0 total_layers: 95 - job_id: jogkzr9og + job_id: j56y4oyyp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -65,13 +59,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:54:35Z' + timestamp: '2024-10-15T00:57:45Z' - torchscript_onnx_tflite: - inference_time: 225905.0 - throughput: 
4.426639516610964 + inference_time: 225775.0 + throughput: 4.429188351234636 estimated_peak_memory_range: - min: 22335488 - max: 44592032 + min: 22384640 + max: 44657360 primary_compute_unit: GPU precision: fp16 layer_info: @@ -79,7 +73,7 @@ models: layers_on_gpu: 95 layers_on_cpu: 0 total_layers: 95 - job_id: jn5q89mm5 + job_id: jp3j0ojng job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -88,13 +82,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:54:36Z' + timestamp: '2024-10-15T00:57:46Z' - torchscript_onnx_tflite: - inference_time: 290320.0 - throughput: 3.444475062000551 + inference_time: 289970.0 + throughput: 3.448632617167293 estimated_peak_memory_range: - min: 253952 - max: 148467448 + min: 32768 + max: 244270728 primary_compute_unit: GPU precision: fp16 layer_info: @@ -102,7 +96,7 @@ models: layers_on_gpu: 95 layers_on_cpu: 0 total_layers: 95 - job_id: j1glne1lp + job_id: jgo26d2kp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -110,14 +104,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:54:37Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:57:47Z' - torchscript_onnx_tflite: - inference_time: 776883.0 - throughput: 1.2871951117478437 + inference_time: 290802.0 + throughput: 3.438765895695353 estimated_peak_memory_range: - min: 73728 - max: 31080016 + min: 49152 + max: 148905792 primary_compute_unit: GPU precision: fp16 layer_info: @@ -125,22 +119,22 @@ models: layers_on_gpu: 95 layers_on_cpu: 0 total_layers: 95 - job_id: jw566qd75 + job_id: jgz3dz3x5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:54:38Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:57:51Z' - torchscript_onnx_tflite: - inference_time: 291442.0 - throughput: 3.431214444040324 + inference_time: 289879.0 + throughput: 3.449715226008093 estimated_peak_memory_range: - min: 110592 - max: 148877688 + min: 2187264 + max: 145774384 primary_compute_unit: GPU precision: fp16 layer_info: @@ -148,22 +142,22 @@ models: layers_on_gpu: 95 layers_on_cpu: 0 total_layers: 95 - job_id: j1p3kqwz5 + job_id: jpedm6dv5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:54:39Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:57:50Z' - torchscript_onnx_tflite: - inference_time: 290379.0 - throughput: 3.4437752041297753 + inference_time: 290181.0 + throughput: 3.446125004738422 estimated_peak_memory_range: - min: 32768 - max: 149249784 + min: 86016 + max: 148819304 primary_compute_unit: GPU precision: fp16 layer_info: @@ -171,22 +165,22 @@ models: layers_on_gpu: 95 layers_on_cpu: 0 total_layers: 95 - job_id: j1pv3z9m5 + job_id: jgjvn3veg job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:54:41Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:57:49Z' - torchscript_onnx_tflite: - inference_time: 290542.0 - throughput: 3.441843175857535 + inference_time: 757728.0 + throughput: 1.3197347860973858 estimated_peak_memory_range: - min: 16384 - max: 304936512 + 
min: 21626880 + max: 53838160 primary_compute_unit: GPU precision: fp16 layer_info: @@ -194,13 +188,21 @@ models: layers_on_gpu: 95 layers_on_cpu: 0 total_layers: 95 - job_id: j7gjxkw8p + job_id: jpv6k26r5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:57:48Z' + - reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:54:42Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:57:53Z' diff --git a/qai_hub_models/models/densenet121/README.md b/qai_hub_models/models/densenet121/README.md index 8da6afd9..a291d310 100644 --- a/qai_hub_models/models/densenet121/README.md +++ b/qai_hub_models/models/densenet121/README.md @@ -6,7 +6,7 @@ Densenet is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of DenseNet-121 found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/densenet.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/densenet121). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.densenet121.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of DenseNet-121 can be found +* The license for the original implementation of DenseNet-121 can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Densely Connected Convolutional Networks](https://arxiv.org/abs/1608.06993) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/densenet.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
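Across all of the `perf.yaml` updates in this change, `throughput` is derived as `1e6 / inference_time`, which implies `inference_time` is recorded in microseconds (for example, `1e6 / 1922.0 == 520.2913631633714` in the DenseNet-121 entry below). The following is a minimal sketch of a reader that cross-checks the two fields; the file path and tolerance are illustrative, and the schema is assumed from these diffs.

```python
# Cross-check inference_time (microseconds, by the convention observed in
# these diffs) against the recorded throughput (inferences per second).
import yaml

with open("qai_hub_models/models/densenet121/perf.yaml") as f:  # illustrative path
    perf = yaml.safe_load(f)

for model in perf["models"]:
    for entry in model["performance_metrics"]:
        device = entry.get("reference_device_info", {}).get("name", "?")
        for runtime, stats in entry.items():
            # Only the torchscript_onnx* keys hold metric dicts with timings;
            # reference_device_info and timestamp entries are skipped here.
            if not isinstance(stats, dict) or "inference_time" not in stats:
                continue
            us = stats["inference_time"]  # microseconds per inference
            qps = 1e6 / us                # implied inferences per second
            assert abs(qps - stats["throughput"]) < 1e-6 * qps
            print(f"{device} / {runtime}: {us:.0f} us -> {qps:.1f} inf/s")
```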
diff --git a/qai_hub_models/models/densenet121/export.py b/qai_hub_models/models/densenet121/export.py index 0c09c8d4..339c295a 100644 --- a/qai_hub_models/models/densenet121/export.py +++ b/qai_hub_models/models/densenet121/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.densenet121 import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "densenet121" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. 
Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/densenet121/perf.yaml b/qai_hub_models/models/densenet121/perf.yaml index 363e6c8e..510ac059 100644 --- a/qai_hub_models/models/densenet121/perf.yaml +++ b/qai_hub_models/models/densenet121/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: DenseNet-121 performance_metrics: - torchscript_onnx_tflite: - inference_time: 1930.0 - throughput: 518.1347150259068 + inference_time: 1922.0 + throughput: 520.2913631633714 estimated_peak_memory_range: - min: 28672 - max: 247350448 + min: 16384 + max: 6100120 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 312 - job_id: jogkzroog + job_id: jpy13o0lp job_status: Passed torchscript_onnx_qnn: 
- inference_time: 1994.0 - throughput: 501.5045135406219 + inference_time: 1990.0 + throughput: 502.51256281407035 estimated_peak_memory_range: - min: 12288 - max: 5539288 + min: 622592 + max: 5712216 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 372 - job_id: j7gjxko8p + job_id: jpv6k2kr5 job_status: Passed torchscript_onnx: - inference_time: 1948.0 - throughput: 513.347022587269 + inference_time: 1872.0 + throughput: 534.1880341880342 estimated_peak_memory_range: min: 12288 - max: 18023096 + max: 17858104 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 374 - job_id: jqp4qj9lg + job_id: j5mnx9x9p job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:50:57Z' + timestamp: '2024-10-15T00:57:11Z' - torchscript_onnx_tflite: - inference_time: 1634.0 - throughput: 611.9951040391677 + inference_time: 1425.0 + throughput: 701.7543859649123 estimated_peak_memory_range: - min: 12288 - max: 104142304 + min: 16384 + max: 104650240 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 312 - job_id: jn5q89zm5 + job_id: jp0z0m7n5 job_status: Passed torchscript_onnx_qnn: - inference_time: 1472.0 - throughput: 679.3478260869565 + inference_time: 1474.0 + throughput: 678.42605156038 estimated_peak_memory_range: - min: 626688 - max: 19773136 + min: 0 + max: 19659696 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 372 - job_id: jlpe9480g + job_id: jgjvn3neg job_status: Passed torchscript_onnx: - inference_time: 1466.0 - throughput: 682.1282401091405 + inference_time: 1458.0 + throughput: 685.8710562414266 estimated_peak_memory_range: - min: 0 - max: 108106720 + min: 466944 + max: 110751600 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 374 - job_id: j0pxved9g + job_id: jgn6v1vq5 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:50:58Z' + timestamp: '2024-10-15T00:57:12Z' - torchscript_onnx_tflite: - inference_time: 1912.0 - throughput: 523.0125523012553 + inference_time: 1920.0 + throughput: 520.8333333333334 estimated_peak_memory_range: - min: 12288 - max: 1692536 + min: 20480 + max: 27558888 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 312 - job_id: j1glneolp + job_id: jp8qyevop job_status: Passed torchscript_onnx_qnn: - inference_time: 1789.0 - throughput: 558.9714924538848 + inference_time: 1788.0 + throughput: 559.2841163310962 estimated_peak_memory_range: - min: 647168 - max: 1850192 + min: 626688 + max: 1912584 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 372 - job_id: jz5wom8jp + job_id: jgz3dzdx5 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:50:52Z' + chipset: QCS8550 Proxy + timestamp: 
'2024-10-15T00:57:03Z' - torchscript_onnx_tflite: - inference_time: 2606.0 - throughput: 383.7298541826554 + inference_time: 1928.0 + throughput: 518.6721991701245 estimated_peak_memory_range: - min: 16384 - max: 104512000 + min: 12288 + max: 1484832 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 312 - job_id: jw566qr75 + job_id: j56y4o4yp job_status: Passed torchscript_onnx_qnn: - inference_time: 2677.0 - throughput: 373.55248412401943 + inference_time: 1799.0 + throughput: 555.864369093941 estimated_peak_memory_range: - min: 618496 - max: 20196928 + min: 634880 + max: 2299552 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 372 - job_id: jz57zv7rp + job_id: jp14zoz7p job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:50:56Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:57:06Z' - torchscript_onnx_tflite: - inference_time: 1936.0 - throughput: 516.5289256198347 + inference_time: 1927.0 + throughput: 518.9413596263622 estimated_peak_memory_range: - min: 20480 - max: 6452032 + min: 24576 + max: 1738480 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 312 - job_id: j1p3kqxz5 + job_id: jglvmwmm5 job_status: Passed torchscript_onnx_qnn: - inference_time: 1794.0 - throughput: 557.4136008918617 + inference_time: 1791.0 + throughput: 558.3472920156337 estimated_peak_memory_range: min: 638976 - max: 1871608 + max: 1984776 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 372 - job_id: jmg9v9kv5 + job_id: jg9lnon8g job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:50:53Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:57:05Z' - torchscript_onnx_tflite: - inference_time: 1926.0 - throughput: 519.2107995846313 + inference_time: 1922.0 + throughput: 520.2913631633714 estimated_peak_memory_range: - min: 24576 - max: 1541736 + min: 49152 + max: 223232648 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 312 - job_id: jwgoyeod5 + job_id: j5q6qloop job_status: Passed torchscript_onnx_qnn: - inference_time: 1786.0 - throughput: 559.9104143337066 + inference_time: 1803.0 + throughput: 554.6311702717693 estimated_peak_memory_range: - min: 634880 - max: 1915920 + min: 643072 + max: 2285632 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 372 - job_id: jnp10q7l5 + job_id: j5we6y6m5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:50:54Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:57:04Z' - torchscript_onnx_tflite: - inference_time: 1934.0 - throughput: 517.063081695967 + inference_time: 2617.0 + throughput: 382.11692777990066 estimated_peak_memory_range: - min: 40960 - max: 255189400 + min: 16384 + max: 106824192 
primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 312 - job_id: j1pv3zem5 + job_id: jgkex2mng job_status: Passed torchscript_onnx_qnn: - inference_time: 1940.0 - throughput: 515.4639175257732 + inference_time: 2723.0 + throughput: 367.2420124862284 estimated_peak_memory_range: - min: 655360 - max: 1939584 + min: 0 + max: 24374832 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 372 - job_id: jvgdw78l5 + job_id: j57yror95 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:57:09Z' + - torchscript_onnx_tflite: + inference_time: 1004.0 + throughput: 996.01593625498 + estimated_peak_memory_range: + min: 12288 + max: 27684800 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 312 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 312 + job_id: jgo26d6kp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 1293.0 + throughput: 773.3952049497293 + estimated_peak_memory_range: + min: 0 + max: 19214256 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 372 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 372 + job_id: jpxko0ol5 + job_status: Passed + torchscript_onnx: + inference_time: 1297.0 + throughput: 771.0100231303007 + estimated_peak_memory_range: + min: 0 + max: 32728176 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 374 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 374 + job_id: jpy1383lp + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:50:55Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:57:15Z' - torchscript_onnx_qnn: - inference_time: 2012.0 - throughput: 497.0178926441352 + inference_time: 2019.0 + throughput: 495.2947003467063 estimated_peak_memory_range: min: 602112 max: 602112 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 372 - job_id: jygzev86g + job_id: jpedm6mv5 job_status: Passed torchscript_onnx: - inference_time: 2007.0 - throughput: 498.2561036372696 + inference_time: 2054.0 + throughput: 486.8549172346641 estimated_peak_memory_range: - min: 17166336 - max: 17166336 + min: 17170432 + max: 17170432 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 374 - job_id: jo5mrvdqg + job_id: jprv3x37g job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:50:59Z' + timestamp: '2024-10-15T00:57:13Z' diff --git a/qai_hub_models/models/densenet121_quantized/README.md b/qai_hub_models/models/densenet121_quantized/README.md new file mode 100644 index 00000000..d0ac37c1 --- /dev/null +++ b/qai_hub_models/models/densenet121_quantized/README.md @@ -0,0 +1,59 @@ +[![Qualcomm® AI Hub Models](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/quic-logo.jpg)](../../README.md) + + +# [DenseNet-121-Quantized: Imagenet classifier and general purpose backbone](https://aihub.qualcomm.com/models/densenet121_quantized) + 
+Densenet is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. + +This is based on the implementation of DenseNet-121-Quantized found +[here](https://github.com/pytorch/vision/blob/main/torchvision/models/densenet.py). This repository contains scripts for optimized on-device +export suitable to run on Qualcomm® devices. More details on model performance +across various devices can be found [here](https://aihub.qualcomm.com/models/densenet121_quantized). + +[Sign up](https://myaccount.qualcomm.com/signup) to start using Qualcomm AI Hub and run these models on a hosted Qualcomm® device. + + + + +## Example & Usage + + +Once installed, run the following simple CLI demo: + +```bash +python -m qai_hub_models.models.densenet121_quantized.demo +``` +More details on the CLI tool can be found with the `--help` option. See +[demo.py](demo.py) for sample usage of the model including pre/post processing +scripts. Please refer to our [general instructions on using +models](../../../#getting-started) for more usage instructions. + +## Export for on-device deployment + +This repository contains export scripts that produce a model optimized for +on-device deployment. This can be run as follows: + +```bash +python -m qai_hub_models.models.densenet121_quantized.export +``` +Additional options are documented with the `--help` option. Note that the above +script requires access to Deployment instructions for Qualcomm® AI Hub. + + +## License +* The license for the original implementation of DenseNet-121-Quantized can be found + [here](https://github.com/pytorch/vision/blob/main/LICENSE). +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + + +## References +* [Densely Connected Convolutional Networks](https://arxiv.org/abs/1608.06993) +* [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/densenet.py) + + + +## Community +* Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. +* For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). + + diff --git a/qai_hub_models/models/densenet121_quantized/__init__.py b/qai_hub_models/models/densenet121_quantized/__init__.py new file mode 100644 index 00000000..13778437 --- /dev/null +++ b/qai_hub_models/models/densenet121_quantized/__init__.py @@ -0,0 +1,10 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. +# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +from qai_hub_models.models._shared.imagenet_classifier.app import ( # noqa: F401 + ImagenetClassifierApp as App, +) + +from .model import MODEL_ID # noqa: F401 +from .model import DenseNet121Quantizable as Model # noqa: F401 diff --git a/qai_hub_models/models/densenet121_quantized/conftest.py b/qai_hub_models/models/densenet121_quantized/conftest.py new file mode 100644 index 00000000..2857e300 --- /dev/null +++ b/qai_hub_models/models/densenet121_quantized/conftest.py @@ -0,0 +1,37 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
+# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +# THIS FILE WAS AUTO-GENERATED. DO NOT EDIT MANUALLY. + +import inspect + +import pytest + +from qai_hub_models.models.densenet121_quantized import Model + + +# Instantiate the model only once for all tests. +# Mock from_pretrained to always return the initialized model. +# This speeds up tests and limits memory leaks. +@pytest.fixture(scope="module", autouse=True) +def cached_from_pretrained(): + with pytest.MonkeyPatch.context() as mp: + pretrained_cache = {} + from_pretrained = Model.from_pretrained + sig = inspect.signature(from_pretrained) + + def _cached_from_pretrained(*args, **kwargs): + cache_key = str(args) + str(kwargs) + model = pretrained_cache.get(cache_key, None) + if model: + return model + else: + model = from_pretrained(*args, **kwargs) + pretrained_cache[cache_key] = model + return model + + _cached_from_pretrained.__signature__ = sig + + mp.setattr(Model, "from_pretrained", _cached_from_pretrained) + yield mp diff --git a/qai_hub_models/models/densenet121_quantized/demo.py b/qai_hub_models/models/densenet121_quantized/demo.py new file mode 100644 index 00000000..adc48957 --- /dev/null +++ b/qai_hub_models/models/densenet121_quantized/demo.py @@ -0,0 +1,17 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. +# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +from qai_hub_models.models._shared.imagenet_classifier.demo import imagenet_demo +from qai_hub_models.models.densenet121_quantized.model import ( + MODEL_ID, + DenseNet121Quantizable, +) + + +def main(is_test: bool = False): + imagenet_demo(DenseNet121Quantizable, MODEL_ID, is_test) + + +if __name__ == "__main__": + main() diff --git a/qai_hub_models/models/densenet121_quantized/evaluate.py b/qai_hub_models/models/densenet121_quantized/evaluate.py new file mode 100644 index 00000000..b6133faa --- /dev/null +++ b/qai_hub_models/models/densenet121_quantized/evaluate.py @@ -0,0 +1,56 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. +# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +# THIS FILE WAS AUTO-GENERATED. DO NOT EDIT MANUALLY.
+ + +from __future__ import annotations + +import warnings + +import qai_hub as hub + +from qai_hub_models.models.densenet121_quantized import MODEL_ID, Model +from qai_hub_models.utils.args import evaluate_parser, get_hub_device, get_model_kwargs +from qai_hub_models.utils.evaluate import evaluate_on_dataset +from qai_hub_models.utils.inference import compile_model_from_args + +SUPPORTED_DATASETS = ["imagenette", "imagenet"] + + +def main(): + warnings.filterwarnings("ignore") + parser = evaluate_parser( + model_cls=Model, + default_split_size=2500, + supported_datasets=SUPPORTED_DATASETS, + supports_tflite=False, + is_hub_quantized=True, + ) + args = parser.parse_args() + args.device = None + + if args.hub_model_id is not None: + hub_model = hub.get_model(args.hub_model_id) + else: + hub_model = compile_model_from_args( + MODEL_ID, args, get_model_kwargs(Model, vars(args)) + ) + hub_device = get_hub_device(None, args.chipset) + torch_model = Model.from_pretrained(**get_model_kwargs(Model, vars(args))) + evaluate_on_dataset( + hub_model, + torch_model, + hub_device, + args.dataset_name, + args.split_size, + args.num_samples, + args.seed, + args.profile_options, + args.use_cache, + ) + + +if __name__ == "__main__": + main() diff --git a/qai_hub_models/models/densenet121_quantized/export.py b/qai_hub_models/models/densenet121_quantized/export.py new file mode 100644 index 00000000..423e8c87 --- /dev/null +++ b/qai_hub_models/models/densenet121_quantized/export.py @@ -0,0 +1,250 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. +# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +# THIS FILE WAS AUTO-GENERATED. DO NOT EDIT MANUALLY. + + +from __future__ import annotations + +import os +import warnings +from pathlib import Path +from typing import Any, Dict, List, Optional, cast + +import qai_hub as hub +import torch + +from qai_hub_models.models.common import ExportResult, TargetRuntime +from qai_hub_models.models.densenet121_quantized import Model +from qai_hub_models.utils.args import ( + export_parser, + get_input_spec_kwargs, + get_model_kwargs, +) +from qai_hub_models.utils.compare import torch_inference +from qai_hub_models.utils.input_spec import make_torch_inputs +from qai_hub_models.utils.printing import ( + print_inference_metrics, + print_on_target_demo_cmd, + print_profile_metrics_from_job, +) +from qai_hub_models.utils.qai_hub_helpers import ( + can_access_qualcomm_ai_hub, + export_without_hub_access, +) +from qai_hub_models.utils.quantization import get_calibration_data + + +def export_model( + device: str = "Samsung Galaxy S23 (Family)", + chipset: Optional[str] = None, + num_calibration_samples: int = 100, + skip_compiling: bool = False, + skip_profiling: bool = False, + skip_inferencing: bool = False, + skip_downloading: bool = False, + skip_summary: bool = False, + output_dir: Optional[str] = None, + target_runtime: TargetRuntime = TargetRuntime.QNN, + compile_options: str = "", + profile_options: str = "", + **additional_model_kwargs, +) -> ExportResult | List[str]: + """ + This function executes the following recipe: + + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + 3. Compiles the model to an asset that can be run on device + 4. Profiles the model performance on a real device + 5. 
Inferences the model on sample inputs + 6. Downloads the model asset to the local directory + 7. Summarizes the results from profiling and inference + + Each of the last 5 steps can be optionally skipped using the input options. + + Parameters: + device: Device for which to export the model. + Full list of available devices can be found by running `hub.get_devices()`. + Defaults to "Samsung Galaxy S23 (Family)" if not specified. + chipset: If set, will choose a random device with this chipset. + Overrides the `device` argument. + num_calibration_samples: The number of calibration data samples + to use for quantization. + skip_compiling: If set, skips compiling the model to a format that can run on device. + skip_profiling: If set, skips profiling of compiled model on real devices. + skip_inferencing: If set, skips computing on-device outputs from sample data. + skip_downloading: If set, skips downloading of compiled model. + skip_summary: If set, skips waiting for and summarizing results + from profiling and inference. + output_dir: Directory to store generated assets (e.g. compiled model). + Defaults to `<cwd>/build/<model name>`. + target_runtime: Which on-device runtime to target. Default is QNN. + compile_options: Additional options to pass when submitting the compile job. + profile_options: Additional options to pass when submitting the profile job. + **additional_model_kwargs: Additional optional kwargs used to customize + `model_cls.from_pretrained` and `model.get_input_spec` + + Returns: + A struct of: + * A CompileJob object containing metadata about the compile job submitted to hub (None if compiling skipped). + * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). + * A QuantizeJob object containing metadata about the quantize job submitted to hub. + """ + model_name = "densenet121_quantized" + output_path = Path(output_dir or Path.cwd() / "build" / model_name) + if chipset: + hub_device = hub.Device(attributes=f"chipset:{chipset}") + else: + hub_device = hub.Device(name=device) + if not can_access_qualcomm_ai_hub(): + return export_without_hub_access( + "densenet121_quantized", + "DenseNet-121-Quantized", + device, + skip_profiling, + skip_inferencing, + skip_downloading, + skip_summary, + output_path, + target_runtime, + compile_options, + profile_options, + ) + + # On-device perf improves with I/O in channel_last format except when using ONNX. + use_channel_last_format = target_runtime != TargetRuntime.ONNX + + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) + input_spec = model.get_input_spec( + **get_input_spec_kwargs(model, additional_model_kwargs) + ) + + # Trace the model + source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) + + print(f"Quantizing model {model_name} with {num_calibration_samples} samples.") + # 2. Converts the PyTorch model to ONNX and quantizes the ONNX model.
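+ # The quantization recipe is two Hub jobs: the traced TorchScript model is + # first compiled to an ONNX asset (--target_runtime onnx), and that asset is + # then quantized against num_calibration_samples imagenette images, using the + # weight/activation dtypes reported by the model's quantization mixin.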
+ onnx_compile_job = hub.submit_compile_job( + model=source_model, + input_specs=input_spec, + device=hub_device, + name=model_name, + options="--target_runtime onnx", + ) + quantize_job = hub.submit_quantize_job( + model=onnx_compile_job.get_target_model(), + calibration_data=get_calibration_data( + input_spec, "imagenette", num_calibration_samples + ), + weights_dtype=model.get_weights_dtype(), + activations_dtype=model.get_activations_dtype(), + name=model_name, + options=model.get_quantize_options(), + ) + if skip_compiling: + return ExportResult(quantize_job=quantize_job) + + # 3. Compiles the model to an asset that can be run on device + model_compile_options = model.get_hub_compile_options( + target_runtime, compile_options, hub_device + ) + print(f"Optimizing model {model_name} to run on-device") + submitted_compile_job = hub.submit_compile_job( + model=quantize_job.get_target_model(), + input_specs=input_spec, + device=hub_device, + name=model_name, + options=model_compile_options, + ) + compile_job = cast(hub.client.CompileJob, submitted_compile_job) + + # 4. Profiles the model performance on a real device + profile_job: Optional[hub.client.ProfileJob] = None + if not skip_profiling: + profile_options_all = model.get_hub_profile_options( + target_runtime, profile_options + ) + print(f"Profiling model {model_name} on a hosted device.") + submitted_profile_job = hub.submit_profile_job( + model=compile_job.get_target_model(), + device=hub_device, + name=model_name, + options=profile_options_all, + ) + profile_job = cast(hub.client.ProfileJob, submitted_profile_job) + + # 5. Inferences the model on sample inputs + inference_job: Optional[hub.client.InferenceJob] = None + if not skip_inferencing: + profile_options_all = model.get_hub_profile_options( + target_runtime, profile_options + ) + print( + f"Running inference for {model_name} on a hosted device with example inputs." + ) + sample_inputs = model.sample_inputs( + input_spec, use_channel_last_format=use_channel_last_format + ) + submitted_inference_job = hub.submit_inference_job( + model=compile_job.get_target_model(), + inputs=sample_inputs, + device=hub_device, + name=model_name, + options=profile_options_all, + ) + inference_job = cast(hub.client.InferenceJob, submitted_inference_job) + + # 6. Downloads the model asset to the local directory + if not skip_downloading: + os.makedirs(output_path, exist_ok=True) + target_model: hub.Model = compile_job.get_target_model() # type: ignore + target_model.download(str(output_path / model_name)) + + # 7. 
Summarizes the results from profiling and inference + if not skip_summary and not skip_profiling: + assert profile_job is not None and profile_job.wait().success + profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore + print_profile_metrics_from_job(profile_job, profile_data) + + if not skip_summary and not skip_inferencing: + sample_inputs = model.sample_inputs(use_channel_last_format=False) + torch_out = torch_inference( + model, sample_inputs, return_channel_last_output=use_channel_last_format + ) + assert inference_job is not None and inference_job.wait().success + inference_result: hub.client.DatasetEntries = inference_job.download_output_data() # type: ignore + + print_inference_metrics( + inference_job, + inference_result, + torch_out, + model.get_output_names(), + metrics="psnr,top1,top5", + ) + + if not skip_summary: + print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) + + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + quantize_job=quantize_job, + ) + + +def main(): + warnings.filterwarnings("ignore") + parser = export_parser( + model_cls=Model, supports_tflite=False, is_hub_quantized=True + ) + args = parser.parse_args() + export_model(**vars(args)) + + +if __name__ == "__main__": + main() diff --git a/qai_hub_models/models/densenet121_quantized/info.yaml b/qai_hub_models/models/densenet121_quantized/info.yaml new file mode 100644 index 00000000..37aadad5 --- /dev/null +++ b/qai_hub_models/models/densenet121_quantized/info.yaml @@ -0,0 +1,43 @@ +name: DenseNet-121-Quantized +# id must match with the model dir name in qai_hub_models +id: densenet121_quantized +status: public +headline: Imagenet classifier and general purpose backbone. +domain: Computer Vision +description: Densenet is a machine learning model that can classify images from the + Imagenet dataset. It can also be used as a backbone in building more complex models + for specific use cases. +use_case: Image Classification +tags: + - backbone + - quantized +research_paper: https://arxiv.org/abs/1608.06993 +research_paper_title: Densely Connected Convolutional Networks +license: https://github.com/pytorch/vision/blob/main/LICENSE +deploy_license: https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf +source_repo: https://github.com/pytorch/vision/blob/main/torchvision/models/densenet.py +technical_details: + Model checkpoint: Imagenet + Input resolution: 224x224 + Number of parameters: 7.97M + Model size: 9.4 MB +applicable_scenarios: + - Medical Imaging + - Anomaly Detection + - Inventory Management +related_models: + - mobilenet_v2 + - squeezenet1_1 + - googlenet +form_factors: + - Phone + - Tablet + - IoT +has_static_banner: true +has_animated_banner: true +license_type: bsd-3-clause +deploy_license_type: AI Model Hub License +dataset: + - imagenet-1k + - imagenet-22k +labels_file: imagenet_labels.txt diff --git a/qai_hub_models/models/densenet121_quantized/model.py b/qai_hub_models/models/densenet121_quantized/model.py new file mode 100644 index 00000000..a6e459c7 --- /dev/null +++ b/qai_hub_models/models/densenet121_quantized/model.py @@ -0,0 +1,14 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. 
+# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +from __future__ import annotations + +from qai_hub_models.models.densenet121.model import DenseNet +from qai_hub_models.utils.quantization import HubQuantizableMixin + +MODEL_ID = __name__.split(".")[-2] + + +class DenseNet121Quantizable(HubQuantizableMixin, DenseNet): + pass diff --git a/qai_hub_models/models/densenet121_quantized/perf.yaml b/qai_hub_models/models/densenet121_quantized/perf.yaml new file mode 100644 index 00000000..408a8eb0 --- /dev/null +++ b/qai_hub_models/models/densenet121_quantized/perf.yaml @@ -0,0 +1,298 @@ +aggregated: + supported_oses: + - Android + supported_devices: + - Snapdragon 8 Elite QRD + - Samsung Galaxy S24 + - Samsung Galaxy S24 Ultra + - Samsung Galaxy S24+ + - Snapdragon 8 Gen 3 QRD + - Samsung Galaxy S23 + - Samsung Galaxy S23 Ultra + - Samsung Galaxy S23+ + - Samsung Galaxy S22 5G + - Samsung Galaxy S22 Ultra 5G + - Samsung Galaxy S22+ 5G + - Samsung Galaxy Tab S8 + - Xiaomi 12 + - Xiaomi 12 Pro + - Samsung Galaxy S21 + - Samsung Galaxy S21 Ultra + - Samsung Galaxy S21+ + - Snapdragon X Elite CRD + - Snapdragon X Plus 8-Core CRD + - QCS6490 (Proxy) + - RB3 Gen 2 (Proxy) + - QCS8450 (Proxy) + - XR2 Gen 2 (Proxy) + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8775 (Proxy) + supported_chipsets: + - Snapdragon® 8 Elite + - Snapdragon® 8 Gen 3 + - Snapdragon® 8 Gen 2 + - Snapdragon® 8 Gen 1 + - Snapdragon® 888 + - Snapdragon® X Elite + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8775P Proxy +models: +- name: DenseNet-121-Quantized + performance_metrics: + - torchscript_onnx_qnn: + inference_time: 1745.0 + throughput: 573.0659025787966 + estimated_peak_memory_range: + min: 16384 + max: 285175608 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 215 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 215 + job_id: jp2kx7lmp + job_status: Passed + torchscript_onnx: + inference_time: 29847.0 + throughput: 33.5042047776996 + estimated_peak_memory_range: + min: 7876608 + max: 12774376 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 325 + layers_on_gpu: 0 + layers_on_cpu: 27 + total_layers: 352 + job_id: jgjvd0e8g + job_status: Passed + reference_device_info: + name: Samsung Galaxy S23 + os: '13' + form_factor: Phone + os_name: Android + manufacturer: Samsung + chipset: Snapdragon® 8 Gen 2 + timestamp: '2024-10-17T17:32:14Z' + - torchscript_onnx_qnn: + inference_time: 1218.0 + throughput: 821.0180623973728 + estimated_peak_memory_range: + min: 163840 + max: 23146192 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 215 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 215 + job_id: jpy1z464p + job_status: Passed + torchscript_onnx: + inference_time: 22391.0 + throughput: 44.66080121477379 + estimated_peak_memory_range: + min: 9449472 + max: 1067915744 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 325 + layers_on_gpu: 0 + layers_on_cpu: 27 + total_layers: 352 + job_id: jpedork05 + job_status: Passed + reference_device_info: + name: Samsung Galaxy S24 + os: '14' + form_factor: Phone + os_name: Android + manufacturer: Samsung + chipset: Snapdragon® 8 Gen 3 + timestamp: '2024-10-17T17:32:15Z' + - torchscript_onnx_qnn: + inference_time: 6521.0 + throughput: 153.35071308081584 + estimated_peak_memory_range: + min: 212992 + max: 8568672 + primary_compute_unit:
NPU + precision: int8 + layer_info: + layers_on_npu: 215 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 215 + job_id: jp0z41le5 + job_status: Passed + reference_device_info: + name: RB3 Gen 2 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS6490 Proxy + timestamp: '2024-10-17T17:31:57Z' + - torchscript_onnx_qnn: + inference_time: 1672.0 + throughput: 598.0861244019138 + estimated_peak_memory_range: + min: 180224 + max: 1556048 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 215 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 215 + job_id: jp8q23z8p + job_status: Passed + reference_device_info: + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8550 Proxy + timestamp: '2024-10-17T17:31:59Z' + - torchscript_onnx_qnn: + inference_time: 1670.0 + throughput: 598.8023952095808 + estimated_peak_memory_range: + min: 172032 + max: 1472536 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 215 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 215 + job_id: j5q6073mp + job_status: Passed + reference_device_info: + name: SA8255 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8255P Proxy + timestamp: '2024-10-17T17:32:03Z' + - torchscript_onnx_qnn: + inference_time: 1684.0 + throughput: 593.8242280285035 + estimated_peak_memory_range: + min: 196608 + max: 1433416 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 215 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 215 + job_id: jglv403l5 + job_status: Passed + reference_device_info: + name: SA8775 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8775P Proxy + timestamp: '2024-10-17T17:32:05Z' + - torchscript_onnx_qnn: + inference_time: 2130.0 + throughput: 469.4835680751174 + estimated_peak_memory_range: + min: 167936 + max: 28636608 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 215 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 215 + job_id: j56y23n7p + job_status: Passed + reference_device_info: + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-17T17:32:06Z' + - torchscript_onnx_qnn: + inference_time: 1160.0 + throughput: 862.0689655172414 + estimated_peak_memory_range: + min: 0 + max: 27657328 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 215 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 215 + job_id: jp3jn4ezg + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-17T17:32:19Z' + - torchscript_onnx_qnn: + inference_time: 1822.0 + throughput: 548.847420417124 + estimated_peak_memory_range: + min: 487424 + max: 487424 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 215 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 215 + job_id: jgkevl3og + job_status: Passed + torchscript_onnx: + inference_time: 32525.0 + throughput: 30.745580322828594 + estimated_peak_memory_range: + min: 48742400 + max: 48742400 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 325 + layers_on_gpu: 0 + layers_on_cpu: 27 + total_layers: 352 + job_id: jgz32xr65 + job_status: Passed + 
reference_device_info: + name: Snapdragon X Elite CRD + os: '11' + form_factor: Compute + os_name: Windows + manufacturer: Qualcomm + chipset: Snapdragon® X Elite + timestamp: '2024-10-17T17:32:17Z' diff --git a/qai_hub_models/models/detr_resnet101/README.md b/qai_hub_models/models/detr_resnet101/README.md index 7b1057a5..3ca26748 100644 --- a/qai_hub_models/models/detr_resnet101/README.md +++ b/qai_hub_models/models/detr_resnet101/README.md @@ -6,7 +6,7 @@ DETR is a machine learning model that can detect objects (trained on COCO dataset). This is based on the implementation of DETR-ResNet101 found [here](https://github.com/facebookresearch/detr). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/detr_resnet101). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.detr_resnet101.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of DETR-ResNet101 can be found +* The license for the original implementation of DETR-ResNet101 can be found [here](https://github.com/facebookresearch/detr/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [End-to-End Object Detection with Transformers](https://arxiv.org/abs/2005.12872) * [Source Model Implementation](https://github.com/facebookresearch/detr) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/detr_resnet101/export.py b/qai_hub_models/models/detr_resnet101/export.py index 2e00aede..bbb2ab54 100644 --- a/qai_hub_models/models/detr_resnet101/export.py +++ b/qai_hub_models/models/detr_resnet101/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.detr_resnet101 import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1.
Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "detr_resnet101" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/detr_resnet101/perf.yaml b/qai_hub_models/models/detr_resnet101/perf.yaml index 20087154..da277769 100644 --- a/qai_hub_models/models/detr_resnet101/perf.yaml +++ b/qai_hub_models/models/detr_resnet101/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: DETR-ResNet101 performance_metrics: - torchscript_onnx_tflite: - inference_time: 17324.0 - throughput: 57.723389517432466 + inference_time: 15179.0 + throughput: 65.88049278608604 estimated_peak_memory_range: - min: 73728 - max: 3077384 + min: 90112 + max: 2967160 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,22 +56,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 856 - job_id: jw566qv75 + job_id: jp0z0mq95 job_status: Passed torchscript_onnx: - inference_time: 20465.0 - throughput: 48.86391399951136 + inference_time: 16036.0 + throughput: 62.35969069593415 estimated_peak_memory_range: - min: 40960 - max: 133388992 + min: 28672 + max: 133747336 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 856 + layers_on_npu: 886 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 856 - job_id: jegn2romg + total_layers: 886 + job_id: jgdx1dlzp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -82,13 +80,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:50:16Z' + timestamp: '2024-10-15T00:56:23Z' - torchscript_onnx_tflite: - inference_time: 14027.0 - throughput: 71.29108148570614 + inference_time: 12443.0 + throughput: 80.36647110825363 estimated_peak_memory_range: min: 53248 - max: 304006048 + max: 316370960 primary_compute_unit: NPU precision: fp16 layer_info: @@ -96,22 +94,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 856 - job_id: j1p3kq8z5 + job_id: jp8qye9kp job_status: Passed torchscript_onnx: - inference_time: 16238.0 - throughput: 61.583938908732605 + inference_time: 14577.0 + throughput: 68.60122110173562 estimated_peak_memory_range: - min: 606208 - max: 257867888 + min: 0 + max: 285366128 
primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 856 + layers_on_npu: 886 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 856 - job_id: joprk1oe5 + total_layers: 886 + job_id: j57yre395 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -120,13 +118,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:50:17Z' + timestamp: '2024-10-15T00:56:24Z' - torchscript_onnx_tflite: - inference_time: 17375.0 - throughput: 57.55395683453237 + inference_time: 15100.0 + throughput: 66.2251655629139 estimated_peak_memory_range: - min: 77824 - max: 3496320 + min: 94208 + max: 2422152 primary_compute_unit: NPU precision: fp16 layer_info: @@ -134,7 +132,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 856 - job_id: jwgoyemd5 + job_id: jgkex2nwg job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -142,14 +140,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:50:03Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:56:04Z' - torchscript_onnx_tflite: - inference_time: 23451.0 - throughput: 42.64210481429363 + inference_time: 15078.0 + throughput: 66.32179334129195 estimated_peak_memory_range: - min: 73728 - max: 248929136 + min: 81920 + max: 2476136 primary_compute_unit: NPU precision: fp16 layer_info: @@ -157,22 +155,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 856 - job_id: j1pv3z4m5 + job_id: jp3j0z33g job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:50:04Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:56:09Z' - torchscript_onnx_tflite: - inference_time: 17341.0 - throughput: 57.666801222536186 + inference_time: 15122.0 + throughput: 66.12881893929375 estimated_peak_memory_range: - min: 106496 - max: 2650776 + min: 57344 + max: 2307120 primary_compute_unit: NPU precision: fp16 layer_info: @@ -180,22 +178,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 856 - job_id: j7gjxk18p + job_id: j56y48j6p job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:50:05Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:56:07Z' - torchscript_onnx_tflite: - inference_time: 17418.0 - throughput: 57.41187277528993 + inference_time: 15140.0 + throughput: 66.05019815059445 estimated_peak_memory_range: - min: 73728 - max: 2739360 + min: 86016 + max: 2745760 primary_compute_unit: NPU precision: fp16 layer_info: @@ -203,22 +201,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 856 - job_id: jlpe9420g + job_id: jglvmyzj5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:50:06Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:56:06Z' - torchscript_onnx_tflite: - inference_time: 17358.0 - throughput: 57.61032377001959 + inference_time: 21263.0 + throughput: 47.03005220335795 estimated_peak_memory_range: - min: 126976 - max: 3318264 + min: 86016 + max: 256037744 primary_compute_unit: NPU precision: fp16 layer_info: @@ -226,22 +224,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 
856 - job_id: jygzevw6g + job_id: j5q6qlknp job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:50:07Z' - - torchscript_onnx: - inference_time: 21172.0 - throughput: 47.23219346306443 + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:56:05Z' + - torchscript_onnx_tflite: + inference_time: 8835.0 + throughput: 113.18619128466327 estimated_peak_memory_range: - min: 121675776 - max: 121675776 + min: 77824 + max: 121667136 primary_compute_unit: NPU precision: fp16 layer_info: @@ -249,7 +247,45 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 856 - job_id: jep2834mp + job_id: jpv6klxk5 + job_status: Passed + torchscript_onnx: + inference_time: 11385.0 + throughput: 87.83487044356609 + estimated_peak_memory_range: + min: 2883584 + max: 123746080 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 886 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 886 + job_id: j5mnx0y9p + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:56:28Z' + - torchscript_onnx: + inference_time: 17968.0 + throughput: 55.65449688334817 + estimated_peak_memory_range: + min: 121643008 + max: 121643008 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 886 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 886 + job_id: jp4lry015 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -258,4 +294,4 @@ os: '11' form_factor: Compute os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:50:18Z' + timestamp: '2024-10-15T00:56:25Z' diff --git a/qai_hub_models/models/detr_resnet101_dc5/README.md b/qai_hub_models/models/detr_resnet101_dc5/README.md index 8a40c445..17aabd5f 100644 --- a/qai_hub_models/models/detr_resnet101_dc5/README.md +++ b/qai_hub_models/models/detr_resnet101_dc5/README.md @@ -6,7 +6,7 @@ DETR is a machine learning model that can detect objects (trained on COCO dataset). This is based on the implementation of DETR-ResNet101-DC5 found [here](https://github.com/facebookresearch/detr). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/detr_resnet101_dc5). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.detr_resnet101_dc5.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of DETR-ResNet101-DC5 can be found +* The license for the original implementation of DETR-ResNet101-DC5 can be found [here](https://github.com/facebookresearch/detr/blob/main/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [End-to-End Object Detection with Transformers](https://arxiv.org/abs/2005.12872) * [Source Model Implementation](https://github.com/facebookresearch/detr) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/detr_resnet101_dc5/export.py b/qai_hub_models/models/detr_resnet101_dc5/export.py index 6757d732..3b1b6587 100644 --- a/qai_hub_models/models/detr_resnet101_dc5/export.py +++ b/qai_hub_models/models/detr_resnet101_dc5/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.detr_resnet101_dc5 import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
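A minimal, hedged sketch of consuming the `ExportResult` described above (the device name is illustrative, the `List[str]` branch is handled only because the `ExportResult | List[str]` signature permits it, and `job_id` is assumed to be the identifier attribute exposed by `qai_hub` job objects):

```python
# Hypothetical caller of the updated export_model; not taken from this diff.
from qai_hub_models.models.detr_resnet101_dc5.export import export_model

result = export_model(device="Samsung Galaxy S24")  # illustrative device
if isinstance(result, list):
    print("export returned messages:", result)
else:
    print("compile job:", result.compile_job.job_id)
    if result.profile_job is not None:    # None when profiling is skipped
        print("profile job:", result.profile_job.job_id)
    if result.inference_job is not None:  # None when inferencing is skipped
        print("inference job:", result.inference_job.job_id)
```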
""" model_name = "detr_resnet101_dc5" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/detr_resnet101_dc5/perf.yaml b/qai_hub_models/models/detr_resnet101_dc5/perf.yaml index b9d1383d..23686df3 100644 --- a/qai_hub_models/models/detr_resnet101_dc5/perf.yaml +++ b/qai_hub_models/models/detr_resnet101_dc5/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: DETR-ResNet101-DC5 performance_metrics: - torchscript_onnx_tflite: - inference_time: 117649.0 - throughput: 8.499859752314087 + inference_time: 92491.0 + throughput: 10.81186277583765 estimated_peak_memory_range: - min: 270336 - max: 2720080 + min: 180224 + max: 2968360 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,22 +56,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 857 - job_id: j1pv3zzm5 + job_id: jgn6vz2k5 job_status: Passed torchscript_onnx: - inference_time: 130513.0 - throughput: 7.662071977504157 + inference_time: 91901.0 + throughput: 10.881274414859469 estimated_peak_memory_range: - min: 135168 - max: 134580736 + min: 147456 + max: 133899832 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 856 + layers_on_npu: 886 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 856 - job_id: jqpyevn4g + total_layers: 886 + job_id: jgdx1d9rp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -82,13 +80,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:49:24Z' + timestamp: '2024-10-15T00:55:19Z' - torchscript_onnx_tflite: - inference_time: 108892.0 - throughput: 9.183411086213864 + inference_time: 67087.0 + throughput: 14.906017559288685 estimated_peak_memory_range: - min: 237568 - max: 503637600 + min: 184320 + max: 574980688 primary_compute_unit: NPU precision: fp16 layer_info: @@ -96,7 +94,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 857 - job_id: j7gjxkk8p + job_id: jprv3lk0g + job_status: Passed + torchscript_onnx: + inference_time: 81298.0 + throughput: 12.300425594725578 + estimated_peak_memory_range: + min: 1597440 + max: 589120736 + primary_compute_unit: NPU + 
precision: fp16 + layer_info: + layers_on_npu: 886 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 886 + job_id: j57yrewv5 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -105,13 +118,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:49:10Z' + timestamp: '2024-10-15T00:55:20Z' - torchscript_onnx_tflite: - inference_time: 116654.0 - throughput: 8.57235928472234 + inference_time: 81085.0 + throughput: 12.332737251032867 estimated_peak_memory_range: - min: 106496 - max: 2419008 + min: 49152 + max: 2737600 primary_compute_unit: NPU precision: fp16 layer_info: @@ -119,7 +132,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 857 - job_id: jlpe9440g + job_id: jp2kyr8rp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -127,14 +140,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:49:11Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:54:59Z' - torchscript_onnx_tflite: - inference_time: 137101.0 - throughput: 7.293892823538851 + inference_time: 82049.0 + throughput: 12.187838974271472 estimated_peak_memory_range: - min: 110592 - max: 445147040 + min: 32768 + max: 3290264 primary_compute_unit: NPU precision: fp16 layer_info: @@ -142,22 +155,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 857 - job_id: jygzevv6g + job_id: jgkex2zwg job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:49:11Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:55:04Z' - torchscript_onnx_tflite: - inference_time: 126921.0 - throughput: 7.87891680651744 + inference_time: 90194.0 + throughput: 11.087212009668049 estimated_peak_memory_range: - min: 176128 - max: 3387008 + min: 24576 + max: 2621128 primary_compute_unit: NPU precision: fp16 layer_info: @@ -165,22 +178,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 857 - job_id: jz5wommjp + job_id: jp8qyeokp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:49:13Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:55:03Z' - torchscript_onnx_tflite: - inference_time: 127138.0 - throughput: 7.865469017917539 + inference_time: 81200.0 + throughput: 12.31527093596059 estimated_peak_memory_range: - min: 155648 - max: 3586816 + min: 57344 + max: 3438264 primary_compute_unit: NPU precision: fp16 layer_info: @@ -188,22 +201,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 857 - job_id: jmg9v99v5 + job_id: jp0z0my95 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:49:14Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:55:01Z' - torchscript_onnx_tflite: - inference_time: 117019.0 - throughput: 8.54562079662277 + inference_time: 96012.0 + throughput: 10.415364746073408 estimated_peak_memory_range: - min: 86016 - max: 2450616 + min: 655360 + max: 511797968 primary_compute_unit: NPU precision: fp16 layer_info: @@ -211,30 +224,68 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 857 - job_id: jnp10qql5 + job_id: jpy13oe8p job_status: Passed 
reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:55:00Z' + - torchscript_onnx_tflite: + inference_time: 61375.0 + throughput: 16.293279022403258 + estimated_peak_memory_range: + min: 0 + max: 290416816 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 857 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 857 + job_id: jglvmynj5 + job_status: Passed + torchscript_onnx: + inference_time: 66684.0 + throughput: 14.996101013736428 + estimated_peak_memory_range: + min: 622592 + max: 337442992 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 886 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 886 + job_id: jp144vxnp + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:49:15Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-16T09:31:10Z' - torchscript_onnx: - inference_time: 122356.0 - throughput: 8.172872601261892 + inference_time: 69984.0 + throughput: 14.288980338363054 estimated_peak_memory_range: - min: 125001728 - max: 125001728 + min: 125333504 + max: 125333504 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 856 + layers_on_npu: 886 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 856 - job_id: j1p8ow88g + total_layers: 886 + job_id: jp4lryo85 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -243,4 +294,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:49:26Z' + timestamp: '2024-10-15T00:55:21Z' diff --git a/qai_hub_models/models/detr_resnet50/README.md b/qai_hub_models/models/detr_resnet50/README.md index db07b7cf..5086b09b 100644 --- a/qai_hub_models/models/detr_resnet50/README.md +++ b/qai_hub_models/models/detr_resnet50/README.md @@ -6,7 +6,7 @@ DETR is a machine learning model that can detect objects (trained on COCO dataset). This is based on the implementation of DETR-ResNet50 found -[here](https://github.com/facebookresearch/detr). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/detr_resnet50). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.detr_resnet50.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of DETR-ResNet50 can be found +* The license for the original implementation of DETR-ResNet50 can be found [here](https://github.com/facebookresearch/detr/blob/main/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [End-to-End Object Detection with Transformers](https://arxiv.org/abs/2005.12872) * [Source Model Implementation](https://github.com/facebookresearch/detr) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/detr_resnet50/export.py b/qai_hub_models/models/detr_resnet50/export.py index cf88ac19..9a364c3c 100644 --- a/qai_hub_models/models/detr_resnet50/export.py +++ b/qai_hub_models/models/detr_resnet50/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.detr_resnet50 import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
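Since the function body below gates each stage on `skip_profiling`, `skip_inferencing`, and `skip_summary`, a compile-and-download-only run is a matter of setting those flags. A sketch, assuming they are plain keyword parameters of `export_model` (their declarations sit in the elided part of the signature, but the body uses them directly):

```python
# Compile-and-download-only sketch; flag names match their use in the body.
from qai_hub_models.models.detr_resnet50.export import export_model

result = export_model(
    device="Samsung Galaxy S23",      # illustrative device
    output_dir="build/detr_resnet50",
    skip_profiling=True,
    skip_inferencing=True,
    skip_summary=True,
)
# With these flags set, profile_job and inference_job on the result are None.
```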
""" model_name = "detr_resnet50" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/detr_resnet50/perf.yaml b/qai_hub_models/models/detr_resnet50/perf.yaml index bfcdcd33..15246f6a 100644 --- a/qai_hub_models/models/detr_resnet50/perf.yaml +++ b/qai_hub_models/models/detr_resnet50/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: DETR-ResNet50 performance_metrics: - torchscript_onnx_tflite: - inference_time: 13273.0 - throughput: 75.340917652377 + inference_time: 10837.0 + throughput: 92.27646027498385 estimated_peak_memory_range: min: 53248 - max: 2855424 + max: 6342384 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,22 +56,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 788 - job_id: jvgdw7rk5 + job_id: jp4lryz85 job_status: Passed torchscript_onnx: - inference_time: 16264.0 - throughput: 61.48548942449582 + inference_time: 12140.0 + throughput: 82.37232289950576 estimated_peak_memory_range: - min: 49152 - max: 99946208 + min: 20480 + max: 100370800 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 737 + layers_on_npu: 767 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 737 - job_id: jogkzrrog + total_layers: 767 + job_id: j5we6lo35 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -82,13 +80,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:48:27Z' + timestamp: '2024-10-15T00:54:18Z' - torchscript_onnx_tflite: - inference_time: 9929.0 - throughput: 100.71507704703394 + inference_time: 8412.0 + throughput: 118.87779362815026 estimated_peak_memory_range: - min: 73728 - max: 246665024 + min: 53248 + max: 258123680 primary_compute_unit: NPU precision: fp16 layer_info: @@ -96,22 +94,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 788 - job_id: jz5womdjp + job_id: jpxkolw35 job_status: Passed torchscript_onnx: - inference_time: 12148.0 - throughput: 82.31807704972012 + inference_time: 9784.0 + throughput: 102.20768601798855 estimated_peak_memory_range: - min: 770048 - max: 205117088 + min: 1183744 + max: 222319216 
primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 737 + layers_on_npu: 767 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 737 - job_id: jn5q899m5 + total_layers: 767 + job_id: jg9lnzvwg job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -120,13 +118,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:48:28Z' + timestamp: '2024-10-15T00:54:19Z' - torchscript_onnx_tflite: - inference_time: 13107.0 - throughput: 76.29510948348211 + inference_time: 10824.0 + throughput: 92.38728750923873 estimated_peak_memory_range: - min: 90112 - max: 2641592 + min: 57344 + max: 2919216 primary_compute_unit: NPU precision: fp16 layer_info: @@ -134,7 +132,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 788 - job_id: jmg9v93v5 + job_id: j5mnx0jdp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -142,14 +140,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:48:15Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:53:58Z' - torchscript_onnx_tflite: - inference_time: 16617.0 - throughput: 60.17933441656135 + inference_time: 10814.0 + throughput: 92.4727205474385 estimated_peak_memory_range: - min: 53248 - max: 213364224 + min: 61440 + max: 2394160 primary_compute_unit: NPU precision: fp16 layer_info: @@ -157,22 +155,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 788 - job_id: jnp10qdl5 + job_id: jpy13o98p job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:48:16Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:54:03Z' - torchscript_onnx_tflite: - inference_time: 13086.0 - throughput: 76.41754546843956 + inference_time: 10889.0 + throughput: 91.8357975939021 estimated_peak_memory_range: - min: 57344 - max: 2252832 + min: 65536 + max: 2615152 primary_compute_unit: NPU precision: fp16 layer_info: @@ -180,22 +178,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 788 - job_id: jvgdw7rl5 + job_id: jp2kyr2rp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:48:16Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:54:02Z' - torchscript_onnx_tflite: - inference_time: 13093.0 - throughput: 76.37668983426258 + inference_time: 10901.0 + throughput: 91.73470323823503 estimated_peak_memory_range: - min: 847872 - max: 2912120 + min: 73728 + max: 2930952 primary_compute_unit: NPU precision: fp16 layer_info: @@ -203,22 +201,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 788 - job_id: jz57zvvrp + job_id: jprv3lz0g job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:48:17Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:54:01Z' - torchscript_onnx_tflite: - inference_time: 13125.0 - throughput: 76.19047619047619 + inference_time: 14348.0 + throughput: 69.69612489545581 estimated_peak_memory_range: - min: 65536 - max: 2501112 + min: 53248 + max: 223624288 primary_compute_unit: NPU precision: fp16 layer_info: @@ -226,30 +224,68 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 788 - 
job_id: jqp4qjjlg + job_id: jgn6vzjk5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:53:59Z' + - torchscript_onnx_tflite: + inference_time: 7199.0 + throughput: 138.90818169190166 + estimated_peak_memory_range: + min: 53248 + max: 94489632 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 788 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 788 + job_id: jp8qyelkp + job_status: Passed + torchscript_onnx: + inference_time: 8554.0 + throughput: 116.90437222352116 + estimated_peak_memory_range: + min: 0 + max: 91226368 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 767 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 767 + job_id: j57yrezv5 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:48:18Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:54:22Z' - torchscript_onnx: - inference_time: 16799.0 - throughput: 59.52735281862016 + inference_time: 13315.0 + throughput: 75.10326699211416 estimated_peak_memory_range: - min: 83038208 - max: 83038208 + min: 83030016 + max: 83030016 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 737 + layers_on_npu: 767 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 737 - job_id: j1glneelp + total_layers: 767 + job_id: jp14zn08p job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -258,4 +294,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:48:29Z' + timestamp: '2024-10-15T00:54:20Z' diff --git a/qai_hub_models/models/detr_resnet50_dc5/README.md b/qai_hub_models/models/detr_resnet50_dc5/README.md index e0be8280..7c4b569f 100644 --- a/qai_hub_models/models/detr_resnet50_dc5/README.md +++ b/qai_hub_models/models/detr_resnet50_dc5/README.md @@ -6,7 +6,7 @@ DETR is a machine learning model that can detect objects (trained on COCO dataset). This is based on the implementation of DETR-ResNet50-DC5 found -[here](https://github.com/facebookresearch/detr). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/detr_resnet50_dc5). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.detr_resnet50_dc5.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of DETR-ResNet50-DC5 can be found +* The license for the original implementation of DETR-ResNet50-DC5 can be found [here](https://github.com/facebookresearch/detr/blob/main/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [End-to-End Object Detection with Transformers](https://arxiv.org/abs/2005.12872) * [Source Model Implementation](https://github.com/facebookresearch/detr) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/detr_resnet50_dc5/export.py b/qai_hub_models/models/detr_resnet50_dc5/export.py index 2f3caed7..bac47113 100644 --- a/qai_hub_models/models/detr_resnet50_dc5/export.py +++ b/qai_hub_models/models/detr_resnet50_dc5/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.detr_resnet50_dc5 import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
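One behavioral detail worth noting: later in this file's diff, `export_model` keeps I/O in channel-last format for every runtime except ONNX (`use_channel_last_format = target_runtime != TargetRuntime.ONNX`), with `TargetRuntime` now imported from `qai_hub_models.models.common`. A small sketch of that decision; the enum member names below are assumed from common AI Hub usage, not shown in the hunk:

```python
# Channel-last I/O is used everywhere except the ONNX runtime.
from qai_hub_models.models.common import TargetRuntime  # per the new import

for runtime in (TargetRuntime.TFLITE, TargetRuntime.QNN, TargetRuntime.ONNX):
    use_channel_last_format = runtime != TargetRuntime.ONNX
    print(runtime.name, use_channel_last_format)
```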
""" model_name = "detr_resnet50_dc5" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/detr_resnet50_dc5/perf.yaml b/qai_hub_models/models/detr_resnet50_dc5/perf.yaml index b0fba8fb..e56ef18a 100644 --- a/qai_hub_models/models/detr_resnet50_dc5/perf.yaml +++ b/qai_hub_models/models/detr_resnet50_dc5/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: DETR-ResNet50-DC5 performance_metrics: - torchscript_onnx_tflite: - inference_time: 111119.0 - throughput: 8.999361045365779 + inference_time: 75052.0 + throughput: 13.324095293929542 estimated_peak_memory_range: - min: 28672 - max: 2167856 + min: 81920 + max: 2203896 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,22 +56,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 789 - job_id: jqp4qjwqg + job_id: j5we6lk35 job_status: Passed torchscript_onnx: - inference_time: 114154.0 - throughput: 8.760096010652276 + inference_time: 92231.0 + throughput: 10.842341512072947 estimated_peak_memory_range: - min: 40960 - max: 100199608 + min: 131072 + max: 101161016 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 737 + layers_on_npu: 767 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 737 - job_id: j1pv3z175 + total_layers: 767 + job_id: jgo26lxqp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -82,13 +80,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:47:41Z' + timestamp: '2024-10-15T00:53:24Z' - torchscript_onnx_tflite: - inference_time: 101554.0 - throughput: 9.84697796246332 + inference_time: 68196.0 + throughput: 14.66361663440671 estimated_peak_memory_range: - min: 262144 - max: 448896640 + min: 167936 + max: 517087552 primary_compute_unit: NPU precision: fp16 layer_info: @@ -96,22 +94,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 789 - job_id: j0pxve1jg + job_id: jg9lnzrwg job_status: Passed torchscript_onnx: - inference_time: 89688.0 - throughput: 11.14976362501115 + inference_time: 81134.0 + throughput: 12.325289028027708 estimated_peak_memory_range: - min: 1613824 - max: 
413587120 + min: 2596864 + max: 529878240 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 737 + layers_on_npu: 767 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 737 - job_id: j7gjxk07p + total_layers: 767 + job_id: jpv6kljk5 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -120,13 +118,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:47:41Z' + timestamp: '2024-10-15T00:53:25Z' - torchscript_onnx_tflite: - inference_time: 109750.0 - throughput: 9.111617312072893 + inference_time: 74434.0 + throughput: 13.43472069215681 estimated_peak_memory_range: - min: 2879488 - max: 5519040 + min: 86016 + max: 2524712 primary_compute_unit: NPU precision: fp16 layer_info: @@ -134,7 +132,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 789 - job_id: jo5mrvzyg + job_id: jp14zn98p job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -142,14 +140,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:47:27Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:53:04Z' - torchscript_onnx_tflite: - inference_time: 131028.0 - throughput: 7.631956528375614 + inference_time: 85586.0 + throughput: 11.684153950412451 estimated_peak_memory_range: - min: 524288 - max: 416029872 + min: 184320 + max: 3234288 primary_compute_unit: NPU precision: fp16 layer_info: @@ -157,22 +155,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 789 - job_id: jegn2r9vg + job_id: jpxkolq35 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:47:28Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:53:09Z' - torchscript_onnx_tflite: - inference_time: 120615.0 - throughput: 8.290842764166978 + inference_time: 74502.0 + throughput: 13.422458457491073 estimated_peak_memory_range: - min: 192512 - max: 3228888 + min: 147456 + max: 2657944 primary_compute_unit: NPU precision: fp16 layer_info: @@ -180,22 +178,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 789 - job_id: joprk14v5 + job_id: jp4lry785 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:47:29Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:53:07Z' - torchscript_onnx_tflite: - inference_time: 114460.0 - throughput: 8.736676568233444 + inference_time: 80512.0 + throughput: 12.420508744038155 estimated_peak_memory_range: - min: 12288 - max: 1852912 + min: 147456 + max: 3016376 primary_compute_unit: NPU precision: fp16 layer_info: @@ -203,22 +201,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 789 - job_id: jep2837xp + job_id: j57yremv5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:47:30Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:53:06Z' - torchscript_onnx_tflite: - inference_time: 117557.0 - throughput: 8.506511734732937 + inference_time: 91938.0 + throughput: 10.876895299005852 estimated_peak_memory_range: - min: 192512 - max: 2849040 + min: 16384 + max: 477056736 primary_compute_unit: NPU precision: fp16 layer_info: @@ -226,30 +224,68 @@ 
models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 789 - job_id: jqpyev4rg + job_id: jgdx1dkrp job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:53:05Z' + - torchscript_onnx_tflite: + inference_time: 49532.0 + throughput: 20.18896874747638 + estimated_peak_memory_range: + min: 81920 + max: 265154736 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 789 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 789 + job_id: jgn6vz4k5 + job_status: Passed + torchscript_onnx: + inference_time: 66336.0 + throughput: 15.074770863482875 + estimated_peak_memory_range: + min: 2220032 + max: 309790256 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 767 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 767 + job_id: jprvvnjkg + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:47:31Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-16T09:31:45Z' - torchscript_onnx: - inference_time: 115048.0 - throughput: 8.69202419859537 + inference_time: 65232.0 + throughput: 15.329899435859701 estimated_peak_memory_range: - min: 86614016 - max: 86614016 + min: 86802432 + max: 86802432 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 737 + layers_on_npu: 767 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 737 - job_id: jlpe94r7g + total_layers: 767 + job_id: jgjvnrjvg job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -258,4 +294,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:47:42Z' + timestamp: '2024-10-15T00:53:26Z' diff --git a/qai_hub_models/models/efficientnet_b0/README.md b/qai_hub_models/models/efficientnet_b0/README.md index b2ffa91b..6a3dac7b 100644 --- a/qai_hub_models/models/efficientnet_b0/README.md +++ b/qai_hub_models/models/efficientnet_b0/README.md @@ -6,7 +6,7 @@ EfficientNetB0 is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of EfficientNet-B0 found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/efficientnet.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/efficientnet_b0). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.efficientnet_b0.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of EfficientNet-B0 can be found +* The license for the original implementation of EfficientNet-B0 can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks](https://arxiv.org/abs/1905.11946) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/efficientnet.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/efficientnet_b0/export.py b/qai_hub_models/models/efficientnet_b0/export.py index 80d94f65..b7d112a1 100644 --- a/qai_hub_models/models/efficientnet_b0/export.py +++ b/qai_hub_models/models/efficientnet_b0/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.efficientnet_b0 import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
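A note on reading the `perf.yaml` hunks further down: each `inference_time` is paired with a `throughput`, and the two are consistent when `inference_time` is taken as microseconds and `throughput` as inferences per second. A quick check against the EfficientNet-B0 Galaxy S23 figure recorded below:

```python
# Reciprocal relationship between the two perf.yaml metrics.
inference_time_us = 1603.0            # EfficientNet-B0 TFLite, Galaxy S23
throughput = 1e6 / inference_time_us  # inferences per second
print(throughput)                     # 623.8303181534623, matching the YAML
```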
""" model_name = "efficientnet_b0" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/efficientnet_b0/perf.yaml b/qai_hub_models/models/efficientnet_b0/perf.yaml index 61118a09..0921e8f4 100644 --- a/qai_hub_models/models/efficientnet_b0/perf.yaml +++ b/qai_hub_models/models/efficientnet_b0/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: EfficientNet-B0 performance_metrics: - torchscript_onnx_tflite: - inference_time: 1604.0 - throughput: 623.4413965087282 + inference_time: 1603.0 + throughput: 623.8303181534623 estimated_peak_memory_range: - min: 12288 - max: 1636936 + min: 49152 + max: 1629576 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: jo5mrv3yg + job_id: j56y4jryp job_status: Passed torchscript_onnx_qnn: - inference_time: 1681.0 - throughput: 594.883997620464 + inference_time: 1673.0 + throughput: 597.7286312014345 estimated_peak_memory_range: - min: 618496 - max: 87307416 + min: 12288 + max: 85201864 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 243 - job_id: jogkzryyg + job_id: jgdx198zp job_status: Passed torchscript_onnx: - inference_time: 1612.0 - throughput: 620.3473945409429 + inference_time: 1591.0 + throughput: 628.5355122564425 estimated_peak_memory_range: - min: 626688 - max: 17485456 + min: 12288 + max: 15613000 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: jlpe94v7g + job_id: j5mnx2d7p job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:46:53Z' + timestamp: '2024-10-15T17:27:55Z' - torchscript_onnx_tflite: - inference_time: 1537.0 - throughput: 650.6180871828237 + inference_time: 1159.0 + throughput: 862.8127696289905 estimated_peak_memory_range: - min: 20480 - max: 78350944 + min: 16384 + max: 79371328 primary_compute_unit: NPU 
precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: jegn2revg + job_id: jp3j03xng job_status: Passed torchscript_onnx_qnn: - inference_time: 1590.0 - throughput: 628.930817610063 + inference_time: 1392.0 + throughput: 718.3908045977012 estimated_peak_memory_range: min: 618496 - max: 19453120 + max: 19653984 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 243 - job_id: jn5q89275 + job_id: jp14zl7np job_status: Passed torchscript_onnx: - inference_time: 1202.0 - throughput: 831.9467554076539 + inference_time: 1200.0 + throughput: 833.3333333333334 estimated_peak_memory_range: min: 0 - max: 84180960 + max: 84684048 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: jygzev7zg + job_id: jpy13w70p job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:46:54Z' + timestamp: '2024-10-15T17:27:56Z' - torchscript_onnx_tflite: - inference_time: 1599.0 - throughput: 625.3908692933084 + inference_time: 1596.0 + throughput: 626.5664160401003 estimated_peak_memory_range: - min: 20480 - max: 1383992 + min: 28672 + max: 1894072 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: joprk1yv5 + job_id: jgo260okp job_status: Passed torchscript_onnx_qnn: - inference_time: 1590.0 - throughput: 628.930817610063 + inference_time: 1560.0 + throughput: 641.025641025641 estimated_peak_memory_range: - min: 626688 - max: 2414336 + min: 663552 + max: 1744304 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 243 - job_id: jw566q1v5 + job_id: j5mnx2o7p job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:46:48Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T17:27:48Z' - torchscript_onnx_tflite: - inference_time: 3063.0 - throughput: 326.47730982696703 + inference_time: 1606.0 + throughput: 622.66500622665 estimated_peak_memory_range: - min: 16384 - max: 87640400 + min: 28672 + max: 1469416 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: jep283mxp + job_id: j5we6v8m5 job_status: Passed torchscript_onnx_qnn: - inference_time: 3154.0 - throughput: 317.0577045022194 + inference_time: 1572.0 + throughput: 636.1323155216285 estimated_peak_memory_range: - min: 618496 - max: 22421040 + min: 630784 + max: 2023896 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 243 - job_id: j7gjxkl7p + job_id: j5q6qkmep job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:46:52Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T17:27:51Z' - torchscript_onnx_tflite: - inference_time: 1610.0 - throughput: 621.1180124223603 + inference_time: 1606.0 + throughput: 622.66500622665 
estimated_peak_memory_range: - min: 24576 - max: 1478496 + min: 45056 + max: 305898320 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: jqpyevdrg + job_id: jgz3d98x5 job_status: Passed torchscript_onnx_qnn: - inference_time: 1574.0 - throughput: 635.3240152477764 + inference_time: 1572.0 + throughput: 636.1323155216285 estimated_peak_memory_range: - min: 626688 - max: 2065992 + min: 659456 + max: 2003088 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 243 - job_id: j1p3kqmx5 + job_id: jp0z0qv05 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:46:49Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T17:27:50Z' - torchscript_onnx_tflite: - inference_time: 1604.0 - throughput: 623.4413965087282 + inference_time: 1606.0 + throughput: 622.66500622665 estimated_peak_memory_range: - min: 12288 - max: 1437320 + min: 16384 + max: 24360160 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: j2p0yer2g + job_id: jgjvnmoeg job_status: Passed torchscript_onnx_qnn: - inference_time: 1574.0 - throughput: 635.3240152477764 + inference_time: 1580.0 + throughput: 632.9113924050633 estimated_peak_memory_range: min: 643072 - max: 2196424 + max: 2374184 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 243 - job_id: jwgoyev45 + job_id: jp2ky646p job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:46:50Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T17:27:49Z' - torchscript_onnx_tflite: - inference_time: 1608.0 - throughput: 621.8905472636816 + inference_time: 3074.0 + throughput: 325.30904359141186 estimated_peak_memory_range: - min: 20480 - max: 7748824 + min: 16384 + max: 88034752 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: j1p8ow7zg + job_id: jpv6koer5 job_status: Passed torchscript_onnx_qnn: - inference_time: 1568.0 - throughput: 637.7551020408164 + inference_time: 3166.0 + throughput: 315.8559696778269 estimated_peak_memory_range: - min: 634880 - max: 1954144 + min: 618496 + max: 23178912 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 243 - job_id: j1pv3zw75 + job_id: jgz3d9445 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T17:27:53Z' + - torchscript_onnx_tflite: + inference_time: 1118.0 + throughput: 894.4543828264758 + estimated_peak_memory_range: + min: 8192 + max: 32749360 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 245 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 245 + job_id: jp14zl77p + job_status: Passed + torchscript_onnx_qnn: + inference_time: 1171.0 + throughput: 853.9709649871904 + estimated_peak_memory_range: + min: 614400 + 
max: 15861920 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 243 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 243 + job_id: jp14zlvnp + job_status: Passed + torchscript_onnx: + inference_time: 962.0 + throughput: 1039.5010395010395 + estimated_peak_memory_range: + min: 0 + max: 36186352 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 245 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 245 + job_id: jp3j036mg + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:46:51Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T17:27:59Z' - torchscript_onnx_qnn: - inference_time: 1743.0 - throughput: 573.7234652897304 + inference_time: 1776.0 + throughput: 563.063063063063 estimated_peak_memory_range: min: 602112 max: 602112 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 243 - job_id: j1glnekep + job_id: j57yrwkn5 job_status: Passed torchscript_onnx: - inference_time: 1707.0 - throughput: 585.8230814294083 + inference_time: 1683.0 + throughput: 594.1770647653001 estimated_peak_memory_range: - min: 14692352 - max: 14692352 + min: 14618624 + max: 14618624 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: jz5wom9zp + job_id: jgkexn8vg job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:46:55Z' + timestamp: '2024-10-15T17:27:57Z' diff --git a/qai_hub_models/models/esrgan/README.md b/qai_hub_models/models/esrgan/README.md index 524a2fa3..80590084 100644 --- a/qai_hub_models/models/esrgan/README.md +++ b/qai_hub_models/models/esrgan/README.md @@ -6,7 +6,7 @@ ESRGAN is a machine learning model that upscales an image with minimal loss in quality. This is based on the implementation of ESRGAN found -[here](https://github.com/xinntao/ESRGAN/). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/esrgan). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.esrgan.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of ESRGAN can be found +* The license for the original implementation of ESRGAN can be found [here](https://github.com/xinntao/ESRGAN/blob/master/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks](https://arxiv.org/abs/1809.00219) * [Source Model Implementation](https://github.com/xinntao/ESRGAN/) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/esrgan/export.py b/qai_hub_models/models/esrgan/export.py index ff1189fa..6a55cf77 100644 --- a/qai_hub_models/models/esrgan/export.py +++ b/qai_hub_models/models/esrgan/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.esrgan import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
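The `ExportResult` change above is what every per-model `export.py` in this diff migrates to. A minimal sketch of consuming it, using only calls that appear in this diff; the device string and the no-hub-access check are illustrative assumptions:

```python
# Sketch: driving one generated exporter and unpacking the ExportResult
# fields named in this diff (compile_job, inference_job, profile_job).
from qai_hub_models.models.esrgan.export import export_model

result = export_model(device="Samsung Galaxy S23")  # device name is illustrative
if not isinstance(result, list):  # a List[str] comes back without AI Hub access
    target_model = result.compile_job.get_target_model()  # compiled asset handle
    if result.profile_job is not None:  # None when profiling was skipped
        assert result.profile_job.wait().success
```

Field access by name is the point of the change: callers no longer need to remember the positional order of the old 3-tuple.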
""" model_name = "esrgan" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/esrgan/perf.yaml b/qai_hub_models/models/esrgan/perf.yaml index f8b4c070..ec2ed366 100644 --- a/qai_hub_models/models/esrgan/perf.yaml +++ b/qai_hub_models/models/esrgan/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: ESRGAN performance_metrics: - torchscript_onnx_tflite: - inference_time: 70500.0 - throughput: 14.184397163120567 + inference_time: 67448.0 + throughput: 14.826236508124778 estimated_peak_memory_range: - min: 3276800 - max: 5950536 + min: 3194880 + max: 6302928 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1024 - job_id: j0pxve6jg + job_id: jp14zlq7p job_status: Passed torchscript_onnx_qnn: - inference_time: 70092.0 - throughput: 14.266963419505792 + inference_time: 70723.0 + throughput: 14.139671676823664 estimated_peak_memory_range: - min: 118784 - max: 108858496 + min: 122880 + max: 115569424 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1026 - job_id: j1p8owzzg + job_id: jpy13wvlp job_status: Passed torchscript_onnx: - inference_time: 66911.0 - throughput: 14.945225747634918 + inference_time: 70475.0 + throughput: 14.189428875487762 estimated_peak_memory_range: - min: 110592 - max: 43964008 + min: 159744 + max: 44247240 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1028 - job_id: j7gjxke7p + job_id: jgjvnm1eg job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:46:13Z' + timestamp: '2024-10-15T17:27:12Z' - torchscript_onnx_tflite: - inference_time: 54928.0 - throughput: 18.20565103408098 + inference_time: 55287.0 + throughput: 18.087434659142293 estimated_peak_memory_range: - min: 3260416 - max: 627269248 + min: 3272704 + max: 690795968 primary_compute_unit: NPU precision: fp16 
layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1024 - job_id: jo5mrv6yg + job_id: jgdx197zp job_status: Passed torchscript_onnx_qnn: - inference_time: 56060.0 - throughput: 17.83803068141277 + inference_time: 55722.0 + throughput: 17.946233085675317 estimated_peak_memory_range: - min: 69632 - max: 100600272 + min: 90112 + max: 114988816 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1026 - job_id: jogkzr3yg + job_id: jp0z0qen5 job_status: Passed torchscript_onnx: - inference_time: 55529.0 - throughput: 18.008608114678818 + inference_time: 59118.0 + throughput: 16.91532189857573 estimated_peak_memory_range: - min: 6418432 - max: 655419728 + min: 6443008 + max: 728828640 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1028 - job_id: jlpe94k7g + job_id: jpedm12v5 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:46:14Z' + timestamp: '2024-10-15T17:27:13Z' - torchscript_onnx_tflite: - inference_time: 64316.0 - throughput: 15.548230611356427 + inference_time: 60535.0 + throughput: 16.519368960105723 estimated_peak_memory_range: - min: 3203072 - max: 5437424 + min: 3186688 + max: 846716424 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1024 - job_id: jegn2r3vg + job_id: j57yrwv95 job_status: Passed torchscript_onnx_qnn: - inference_time: 62756.0 - throughput: 15.9347313404296 + inference_time: 62046.0 + throughput: 16.117074428649712 estimated_peak_memory_range: - min: 348160 - max: 1620992 + min: 434176 + max: 1666728 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1026 - job_id: j1glne3ep + job_id: jgkexnrng job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:46:07Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T17:27:05Z' - torchscript_onnx_tflite: - inference_time: 162662.0 - throughput: 6.147717352547 + inference_time: 64451.0 + throughput: 15.515663061860948 estimated_peak_memory_range: - min: 3223552 - max: 593337776 + min: 3473408 + max: 7950592 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1024 - job_id: joprk1ev5 + job_id: jgn6vyrq5 job_status: Passed torchscript_onnx_qnn: - inference_time: 135659.0 - throughput: 7.3714239379620965 + inference_time: 62024.0 + throughput: 16.12279117760867 estimated_peak_memory_range: - min: 237568 - max: 78143568 + min: 368640 + max: 2043176 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1026 - job_id: j1pv3zv75 + job_id: j56y4jvyp job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:46:12Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T17:27:08Z' - torchscript_onnx_tflite: - inference_time: 63613.0 - throughput: 15.720057221008284 + 
inference_time: 68788.0 + throughput: 14.537419317322788 estimated_peak_memory_range: - min: 3293184 - max: 5430712 + min: 3166208 + max: 6302720 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1024 - job_id: jep283lxp + job_id: j5mnx2v9p job_status: Passed torchscript_onnx_qnn: - inference_time: 62831.0 - throughput: 15.915710397733603 + inference_time: 63011.0 + throughput: 15.870244877878466 estimated_peak_memory_range: - min: 425984 - max: 1684216 + min: 348160 + max: 1969064 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1026 - job_id: jw566qnv5 + job_id: jglvmz7m5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:46:08Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T17:27:07Z' - torchscript_onnx_tflite: - inference_time: 67352.0 - throughput: 14.847369046205012 + inference_time: 64819.0 + throughput: 15.427575247998272 estimated_peak_memory_range: - min: 3252224 - max: 6000000 + min: 3256320 + max: 6156008 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1024 - job_id: jqpyev6rg + job_id: jpxkojel5 job_status: Passed torchscript_onnx_qnn: - inference_time: 64446.0 - throughput: 15.516866834248829 + inference_time: 63190.0 + throughput: 15.82528881152081 estimated_peak_memory_range: - min: 425984 - max: 5189448 + min: 409600 + max: 1637688 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1026 - job_id: j1p3kqex5 + job_id: j5q6qk9op job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:46:10Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T17:27:06Z' - torchscript_onnx_tflite: - inference_time: 65640.0 - throughput: 15.234613040828762 + inference_time: 136101.0 + throughput: 7.347484588651075 estimated_peak_memory_range: - min: 3268608 - max: 6235928 + min: 3162112 + max: 648556912 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1024 - job_id: j2p0yel2g + job_id: jp4lroj15 job_status: Passed torchscript_onnx_qnn: - inference_time: 63738.0 - throughput: 15.689227776208854 + inference_time: 134526.0 + throughput: 7.433507277403624 estimated_peak_memory_range: - min: 405504 - max: 1688576 + min: 331776 + max: 92125040 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,22 +329,75 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1026 - job_id: jwgoye345 + job_id: jgo260mkp job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T17:27:10Z' + - torchscript_onnx_tflite: + inference_time: 42308.0 + throughput: 23.636191736787367 + estimated_peak_memory_range: + min: 12288 + max: 188887088 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 1024 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 1024 + job_id: jp2ky63qp + job_status: Passed + 
torchscript_onnx_qnn: + inference_time: 37764.0 + throughput: 26.480245736680438 + estimated_peak_memory_range: + min: 831488 + max: 137073696 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 1026 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 1026 + job_id: jpv6ko4r5 + job_status: Passed + torchscript_onnx: + inference_time: 38328.0 + throughput: 26.09058651638489 + estimated_peak_memory_range: + min: 8298496 + max: 194769792 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 1028 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 1028 + job_id: jg9ln188g + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:46:11Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T17:27:15Z' - torchscript_onnx_qnn: - inference_time: 65269.0 - throughput: 15.321209149826105 + inference_time: 64824.0 + throughput: 15.426385289398988 estimated_peak_memory_range: - min: 208896 - max: 208896 + min: 204800 + max: 204800 primary_compute_unit: NPU precision: fp16 layer_info: @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1026 - job_id: jn5q89375 + job_id: jp8qy9wop job_status: Passed torchscript_onnx: - inference_time: 65506.0 - throughput: 15.26577718071627 + inference_time: 65670.0 + throughput: 15.227653418608192 estimated_peak_memory_range: - min: 40529920 - max: 40529920 + min: 39833600 + max: 39833600 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1028 - job_id: jygzevrzg + job_id: jgz3d9wx5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:46:15Z' + timestamp: '2024-10-15T17:27:14Z' diff --git a/qai_hub_models/models/facemap_3dmm/README.md b/qai_hub_models/models/facemap_3dmm/README.md index fadc852d..d320f639 100644 --- a/qai_hub_models/models/facemap_3dmm/README.md +++ b/qai_hub_models/models/facemap_3dmm/README.md @@ -1,14 +1,14 @@ [![Qualcomm® AI Hub Models](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/quic-logo.jpg)](../../README.md) -# [FaceMap_3DMM: Facial landmark predictor with 3DMM](#) +# [Facial-Landmark-Detection: Facial landmark predictor with 3DMM](https://aihub.qualcomm.com/models/facemap_3dmm) Facial landmark is a deep learning model that can predict 68 landmarks from a single image. It can also be used as a backbone in building more complex models for specific use cases. -This is based on the implementation of FaceMap_3DMM found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py). This repository contains scripts for optimized on-device +This is based on the implementation of Facial-Landmark-Detection found +[here](https://github.com/quic/ai-hub-models/blob/main/qai_hub_models/models/facemap_3dmm/model.py). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance -accross various devices, can be found [here](#). +across various devices can be found [here](https://aihub.qualcomm.com/models/facemap_3dmm). [Sign up](https://myaccount.qualcomm.com/signup) to start using Qualcomm AI Hub and run these models on a hosted Qualcomm® device. @@ -39,14 +39,18 @@ python -m qai_hub_models.models.facemap_3dmm.export Additional options are documented with the `--help` option.
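For readers who want the Python-side equivalent of the CLI invocation above, a minimal sketch distilled from the generated export scripts in this diff (the argument-free `from_pretrained()` call assumes default pretrained weights):

```python
# Sketch: step 1 of the export recipe, exactly as the generated scripts do it.
import torch

from qai_hub_models.models.facemap_3dmm import Model
from qai_hub_models.utils.input_spec import make_torch_inputs

model = Model.from_pretrained()      # assumes default pretrained weights
input_spec = model.get_input_spec()  # 128x128 input per the info.yaml in this diff
source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec))
```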
Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of FaceMap_3DMM can be found - [here](https://github.com/pytorch/vision/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the original implementation of Facial-Landmark-Detection can be found + [here](https://github.com/quic/ai-hub-models/blob/main/LICENSE). +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References -* [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385.) -* [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py) +* [Source Model Implementation](https://github.com/quic/ai-hub-models/blob/main/qai_hub_models/models/facemap_3dmm/model.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. diff --git a/qai_hub_models/models/facemap_3dmm/export.py b/qai_hub_models/models/facemap_3dmm/export.py index 6ab29d08..c21ddb3c 100644 --- a/qai_hub_models/models/facemap_3dmm/export.py +++ b/qai_hub_models/models/facemap_3dmm/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.facemap_3dmm import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model.
@@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "facemap_3dmm" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -96,7 +94,7 @@ def export_model( if not can_access_qualcomm_ai_hub(): return export_without_hub_access( "facemap_3dmm", - "FaceMap_3DMM", + "Facial-Landmark-Detection", device, skip_profiling, skip_inferencing, @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/facemap_3dmm/info.yaml b/qai_hub_models/models/facemap_3dmm/info.yaml index 68142447..1d9f2a4c 100644 --- a/qai_hub_models/models/facemap_3dmm/info.yaml +++ b/qai_hub_models/models/facemap_3dmm/info.yaml @@ -1,20 +1,18 @@ -name: FaceMap_3DMM +name: Facial-Landmark-Detection # id must match with the model dir name in qai_hub_models id: facemap_3dmm status: public headline: Facial landmark predictor with 3DMM. domain: Computer Vision -use_case: POSE_ESTIMATION +use_case: Pose Estimation description: Facial landmark is a deep learning model that can predict 68 landmarks from a single image. It can also be used as a backbone in building more complex models for specific use cases. tags: - backbone -research_paper: https://arxiv.org/abs/1512.03385. -research_paper_title: Deep Residual Learning for Image Recognition -license: https://github.com/pytorch/vision/blob/main/LICENSE +source_repo: https://github.com/quic/ai-hub-models/blob/main/qai_hub_models/models/facemap_3dmm/model.py +license: https://github.com/quic/ai-hub-models/blob/main/LICENSE deploy_license: https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf -source_repo: https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py technical_details: Input resolution: 128x128 Number of parameters: 5.424M @@ -29,10 +27,8 @@ form_factors: - Phone - Tablet - IoT - XR -has_static_banner: false -has_animated_banner: false +has_static_banner: true +has_animated_banner: true license_type: bsd-3-clause deploy_license_type: AI Model Hub License -dataset: - - QDMS images - - R channel images from Datatang captured images, Getty images, Multipie Images +dataset: [] diff --git a/qai_hub_models/models/facemap_3dmm/model.py b/qai_hub_models/models/facemap_3dmm/model.py index cad3dee9..5c9264af 100644 --- a/qai_hub_models/models/facemap_3dmm/model.py +++ b/qai_hub_models/models/facemap_3dmm/model.py @@ -63,4 +63,4 @@ def get_input_spec() -> InputSpec: @staticmethod def get_output_names() -> List[str]: - return ["3dmm_parameters"] + return ["parameters_3dmm"] diff --git a/qai_hub_models/models/facemap_3dmm/perf.yaml b/qai_hub_models/models/facemap_3dmm/perf.yaml new file mode 100644 index 00000000..4be519d1 --- /dev/null +++ b/qai_hub_models/models/facemap_3dmm/perf.yaml @@ -0,0 +1,432 @@ +aggregated: + supported_oses: + - Android + supported_devices: + - Snapdragon 8 Elite QRD + - Samsung Galaxy S24 + - Samsung Galaxy S24 Ultra + - Samsung Galaxy S24+ + - Snapdragon 8 Gen 3 QRD + - Samsung Galaxy S23 + - Samsung Galaxy S23 Ultra + - Samsung Galaxy S23+ + - Samsung Galaxy S22 5G + - Samsung Galaxy S22 Ultra 5G + - Samsung Galaxy S22+ 5G + - Samsung Galaxy Tab S8 + - Xiaomi 12 + - Xiaomi 12 Pro + - Samsung Galaxy S21 + - Samsung Galaxy S21 Ultra + - Samsung Galaxy S21+ + - Snapdragon X Elite CRD + - Snapdragon X Plus 8-Core CRD + - QCS8450 (Proxy) + - XR2 Gen 2 (Proxy) + - QCS8550 (Proxy) + - SA8255 (Proxy) + - 
SA8650 (Proxy) + - SA8775 (Proxy) + supported_chipsets: + - Snapdragon® 8 Elite + - Snapdragon® 8 Gen 3 + - Snapdragon® 8 Gen 2 + - Snapdragon® 8 Gen 1 + - Snapdragon® 888 + - Snapdragon® X Elite + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy +models: +- name: Facial-Landmark-Detection + performance_metrics: + - torchscript_onnx_tflite: + inference_time: 347.0 + throughput: 2881.844380403458 + estimated_peak_memory_range: + min: 24576 + max: 3974056 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 47 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 47 + job_id: j5q6qllmp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 364.0 + throughput: 2747.252747252747 + estimated_peak_memory_range: + min: 233472 + max: 26049896 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 60 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 60 + job_id: j5we6llj5 + job_status: Passed + torchscript_onnx: + inference_time: 474.0 + throughput: 2109.7046413502107 + estimated_peak_memory_range: + min: 40960 + max: 12298000 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 59 + layers_on_gpu: 0 + layers_on_cpu: 1 + total_layers: 60 + job_id: jpxkol015 + job_status: Passed + reference_device_info: + name: Samsung Galaxy S23 + os: '13' + form_factor: Phone + os_name: Android + manufacturer: Samsung + chipset: Snapdragon® 8 Gen 2 + timestamp: '2024-10-15T00:49:51Z' + - torchscript_onnx_tflite: + inference_time: 274.0 + throughput: 3649.6350364963505 + estimated_peak_memory_range: + min: 16384 + max: 26486288 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 47 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 47 + job_id: jglvmyyl5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 306.0 + throughput: 3267.97385620915 + estimated_peak_memory_range: + min: 208896 + max: 11125488 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 60 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 60 + job_id: jg9lnzzvg + job_status: Passed + torchscript_onnx: + inference_time: 378.0 + throughput: 2645.5026455026455 + estimated_peak_memory_range: + min: 0 + max: 27464624 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 59 + layers_on_gpu: 0 + layers_on_cpu: 1 + total_layers: 60 + job_id: j5mnx09wp + job_status: Passed + reference_device_info: + name: Samsung Galaxy S24 + os: '14' + form_factor: Phone + os_name: Android + manufacturer: Samsung + chipset: Snapdragon® 8 Gen 3 + timestamp: '2024-10-15T00:49:52Z' + - torchscript_onnx_tflite: + inference_time: 352.0 + throughput: 2840.909090909091 + estimated_peak_memory_range: + min: 28672 + max: 1533216 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 47 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 47 + job_id: j56y4887p + job_status: Passed + torchscript_onnx_qnn: + inference_time: 351.0 + throughput: 2849.002849002849 + estimated_peak_memory_range: + min: 229376 + max: 1547376 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 60 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 60 + job_id: jgdx1ddlp + job_status: Passed + reference_device_info: + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:49:43Z' + - torchscript_onnx_tflite: + inference_time: 
348.0 + throughput: 2873.5632183908046 + estimated_peak_memory_range: + min: 28672 + max: 1568024 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 47 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 47 + job_id: jgjvnrr8g + job_status: Passed + torchscript_onnx_qnn: + inference_time: 359.0 + throughput: 2785.515320334262 + estimated_peak_memory_range: + min: 221184 + max: 1919480 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 60 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 60 + job_id: jp14zno2p + job_status: Passed + reference_device_info: + name: SA8255 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:49:47Z' + - torchscript_onnx_tflite: + inference_time: 344.0 + throughput: 2906.9767441860463 + estimated_peak_memory_range: + min: 28672 + max: 166357928 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 47 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 47 + job_id: jpv6kllm5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 351.0 + throughput: 2849.002849002849 + estimated_peak_memory_range: + min: 225280 + max: 1447848 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 60 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 60 + job_id: jg9lnzolg + job_status: Passed + reference_device_info: + name: SA8775 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:49:45Z' + - torchscript_onnx_tflite: + inference_time: 342.0 + throughput: 2923.9766081871344 + estimated_peak_memory_range: + min: 28672 + max: 1508248 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 47 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 47 + job_id: jgo26lldp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 351.0 + throughput: 2849.002849002849 + estimated_peak_memory_range: + min: 221184 + max: 1473168 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 60 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 60 + job_id: j5we6ly65 + job_status: Passed + reference_device_info: + name: SA8650 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:49:44Z' + - torchscript_onnx_tflite: + inference_time: 450.0 + throughput: 2222.222222222222 + estimated_peak_memory_range: + min: 24576 + max: 26895376 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 47 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 47 + job_id: jp3j0zzzg + job_status: Passed + torchscript_onnx_qnn: + inference_time: 476.0 + throughput: 2100.840336134454 + estimated_peak_memory_range: + min: 0 + max: 13865344 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 60 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 60 + job_id: j57yreol5 + job_status: Passed + reference_device_info: + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:49:49Z' + - torchscript_onnx_tflite: + inference_time: 266.0 + throughput: 3759.3984962406016 + estimated_peak_memory_range: + min: 12288 + max: 15695424 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 47 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 47 + job_id: jgz3dll65 + 
job_status: Passed + torchscript_onnx_qnn: + inference_time: 299.0 + throughput: 3344.4816053511704 + estimated_peak_memory_range: + min: 0 + max: 9206240 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 60 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 60 + job_id: jp4lryev5 + job_status: Passed + torchscript_onnx: + inference_time: 369.0 + throughput: 2710.027100271003 + estimated_peak_memory_range: + min: 0 + max: 16134224 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 59 + layers_on_gpu: 0 + layers_on_cpu: 1 + total_layers: 60 + job_id: jp2kyro4p + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:49:55Z' + - torchscript_onnx_qnn: + inference_time: 450.0 + throughput: 2222.222222222222 + estimated_peak_memory_range: + min: 585728 + max: 585728 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 60 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 60 + job_id: jp14znnlp + job_status: Passed + torchscript_onnx: + inference_time: 527.0 + throughput: 1897.5332068311195 + estimated_peak_memory_range: + min: 12500992 + max: 12500992 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 59 + layers_on_gpu: 0 + layers_on_cpu: 1 + total_layers: 60 + job_id: jgn6vz1r5 + job_status: Passed + reference_device_info: + name: Snapdragon X Elite CRD + os: '11' + form_factor: Compute + os_name: Windows + manufacturer: Qualcomm + chipset: Snapdragon® X Elite + timestamp: '2024-10-15T00:49:53Z' diff --git a/qai_hub_models/models/fastsam_s/README.md b/qai_hub_models/models/fastsam_s/README.md index a2e3760d..882d25fb 100644 --- a/qai_hub_models/models/fastsam_s/README.md +++ b/qai_hub_models/models/fastsam_s/README.md @@ -6,7 +6,7 @@ The Fast Segment Anything Model (FastSAM) is a novel, real-time CNN-based solution for the Segment Anything task. This task is designed to segment any object within an image based on various possible user interaction prompts. The model performs competitively despite significantly reduced computation, making it a practical choice for a variety of vision tasks. This is based on the implementation of FastSam-S found -[here](https://github.com/CASIA-IVA-Lab/FastSAM). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/fastsam_s). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.fastsam_s.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of FastSam-S can be found +* The license for the original implementation of FastSam-S can be found [here](https://github.com/CASIA-IVA-Lab/FastSAM/blob/main/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://github.com/CASIA-IVA-Lab/FastSAM/blob/main/LICENSE) +* The license for the compiled assets for on-device deployment can be found [here](https://github.com/CASIA-IVA-Lab/FastSAM/blob/main/LICENSE) + ## References * [Fast Segment Anything](https://arxiv.org/abs/2306.12156) * [Source Model Implementation](https://github.com/CASIA-IVA-Lab/FastSAM) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/fastsam_s/export.py b/qai_hub_models/models/fastsam_s/export.py index 11a05633..d38cdfd2 100644 --- a/qai_hub_models/models/fastsam_s/export.py +++ b/qai_hub_models/models/fastsam_s/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.fastsam_s import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "fastsam_s" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. 
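As a quick aside on the comment above (it precedes the `use_channel_last_format` assignment in each exporter): channel_last means NHWC rather than PyTorch's default NCHW. A self-contained illustration follows; the shape is an arbitrary example, not FastSam's actual input spec:

```python
# Illustration of channel_last (NHWC) versus PyTorch's default NCHW layout.
import torch

nchw = torch.randn(1, 3, 640, 640)            # batch, channels, height, width
nhwc = nchw.permute(0, 2, 3, 1).contiguous()  # batch, height, width, channels
assert nhwc.shape == (1, 640, 640, 3)
```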
use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -122,7 +120,7 @@ def export_model( model.to("cpu"), make_torch_inputs(input_spec), check_trace=False ) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -136,7 +134,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -151,7 +149,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -172,13 +170,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -199,7 +197,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/fastsam_s/perf.yaml b/qai_hub_models/models/fastsam_s/perf.yaml index 839aa124..10c4cd5b 100644 --- a/qai_hub_models/models/fastsam_s/perf.yaml +++ b/qai_hub_models/models/fastsam_s/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: FastSam-S performance_metrics: - torchscript_onnx_qnn: - inference_time: 8062.0 - throughput: 124.03870007442322 + inference_time: 8064.0 + throughput: 124.0079365079365 estimated_peak_memory_range: - min: 6352896 - max: 23213592 + min: 4218880 + max: 19390024 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 286 - job_id: jogkzrqyg + job_id: jgjvnr78g job_status: Passed torchscript_onnx: - inference_time: 10484.0 - throughput: 95.38344143456696 + inference_time: 9580.0 + throughput: 104.38413361169103 estimated_peak_memory_range: - min: 327680 - max: 28540592 + min: 4132864 + max: 25901272 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,7 +71,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 289 - job_id: jygzevjzg + job_id: j5mnx00qp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -82,13 +80,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:45:21Z' + timestamp: '2024-10-15T00:49:05Z' - torchscript_onnx_qnn: - inference_time: 6560.0 - throughput: 152.4390243902439 + inference_time: 6960.0 + throughput: 143.67816091954023 estimated_peak_memory_range: min: 4931584 - max: 36809936 + max: 40159504 primary_compute_unit: NPU precision: fp16 layer_info: @@ -96,14 +94,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 286 - job_id: jn5q89r75 + job_id: jpedm7z05 job_status: Passed torchscript_onnx: - inference_time: 8041.0 - throughput: 124.36264146250467 + inference_time: 7273.0 + throughput: 137.49484394335212 estimated_peak_memory_range: - min: 14278656 - max: 89357632 + min: 1331200 + max: 84592272 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,7 
+109,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 289 - job_id: jz5wom3zp + job_id: jgn6vzzm5 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -120,13 +118,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:45:22Z' + timestamp: '2024-10-15T00:49:06Z' - torchscript_onnx_qnn: - inference_time: 7924.0 - throughput: 126.19888944977284 + inference_time: 7380.0 + throughput: 135.50135501355012 estimated_peak_memory_range: - min: 4952064 - max: 10352832 + min: 4947968 + max: 10193600 primary_compute_unit: NPU precision: fp16 layer_info: @@ -134,7 +132,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 286 - job_id: jw566qzv5 + job_id: j5we6l7j5 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -142,14 +140,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:45:16Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:48:58Z' - torchscript_onnx_qnn: - inference_time: 13435.0 - throughput: 74.4324525493115 + inference_time: 7689.0 + throughput: 130.05592404734037 estimated_peak_memory_range: - min: 4952064 - max: 35733824 + min: 4968448 + max: 10193296 primary_compute_unit: NPU precision: fp16 layer_info: @@ -157,22 +155,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 286 - job_id: jlpe94w7g + job_id: jgdx1d3lp job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:45:20Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:49:01Z' - torchscript_onnx_qnn: - inference_time: 8030.0 - throughput: 124.53300124533001 + inference_time: 7719.0 + throughput: 129.55045990413265 estimated_peak_memory_range: - min: 4988928 - max: 8364352 + min: 4997120 + max: 9779592 primary_compute_unit: NPU precision: fp16 layer_info: @@ -180,22 +178,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 286 - job_id: j1p3kq1x5 + job_id: jp14znjlp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:45:17Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:49:00Z' - torchscript_onnx_qnn: - inference_time: 8004.0 - throughput: 124.9375312343828 + inference_time: 7618.0 + throughput: 131.26804935678655 estimated_peak_memory_range: - min: 4997120 - max: 8482232 + min: 4972544 + max: 8709760 primary_compute_unit: NPU precision: fp16 layer_info: @@ -203,22 +201,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 286 - job_id: jwgoyen45 + job_id: jg9lnzmvg job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:45:18Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:48:59Z' - torchscript_onnx_qnn: - inference_time: 7833.0 - throughput: 127.66500702157539 + inference_time: 13749.0 + throughput: 72.73256236817222 estimated_peak_memory_range: - min: 4947968 - max: 8111160 + min: 4960256 + max: 44328528 primary_compute_unit: NPU precision: fp16 layer_info: @@ -226,19 +224,57 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 286 - job_id: j1pv3zr75 + job_id: jp4lryyl5 job_status: Passed reference_device_info: - name: SA8255 
(Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:49:03Z' + - torchscript_onnx_qnn: + inference_time: 5490.0 + throughput: 182.14936247723134 + estimated_peak_memory_range: + min: 4927488 + max: 37555888 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 286 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 286 + job_id: jpxkoll95 + job_status: Passed + torchscript_onnx: + inference_time: 5354.0 + throughput: 186.77624206200971 + estimated_peak_memory_range: + min: 16953344 + max: 65201088 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 289 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 289 + job_id: jpy13oo4p + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:45:19Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:49:09Z' - torchscript_onnx_qnn: - inference_time: 8378.0 - throughput: 119.36022917164001 + inference_time: 8317.0 + throughput: 120.23566189731875 estimated_peak_memory_range: min: 4923392 max: 4923392 @@ -249,14 +285,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 286 - job_id: j1glne2ep + job_id: jgz3dlm65 job_status: Passed torchscript_onnx: - inference_time: 10647.0 - throughput: 93.92317084624777 + inference_time: 9903.0 + throughput: 100.97950116126427 estimated_peak_memory_range: - min: 21442560 - max: 21442560 + min: 22536192 + max: 22536192 primary_compute_unit: NPU precision: fp16 layer_info: @@ -264,7 +300,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 289 - job_id: jmg9v9yq5 + job_id: jprv3lleg job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -273,4 +309,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:45:23Z' + timestamp: '2024-10-15T00:49:07Z' diff --git a/qai_hub_models/models/fastsam_x/README.md b/qai_hub_models/models/fastsam_x/README.md index b6890348..0ebcaa2c 100644 --- a/qai_hub_models/models/fastsam_x/README.md +++ b/qai_hub_models/models/fastsam_x/README.md @@ -6,7 +6,7 @@ The Fast Segment Anything Model (FastSAM) is a novel, real-time CNN-based solution for the Segment Anything task. This task is designed to segment any object within an image based on various possible user interaction prompts. The model performs competitively despite significantly reduced computation, making it a practical choice for a variety of vision tasks. This is based on the implementation of FastSam-X found -[here](https://github.com/CASIA-IVA-Lab/FastSAM). This repository contains scripts for optimized on-device +[here](https://github.com/CASIA-IVA-Lab/FastSAM). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/fastsam_x). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.fastsam_x.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of FastSam-X can be found +* The license for the original implementation of FastSam-X can be found [here](https://github.com/CASIA-IVA-Lab/FastSAM/blob/main/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://github.com/CASIA-IVA-Lab/FastSAM/blob/main/LICENSE) +* The license for the compiled assets for on-device deployment can be found [here](https://github.com/CASIA-IVA-Lab/FastSAM/blob/main/LICENSE) + ## References * [Fast Segment Anything](https://arxiv.org/abs/2306.12156) * [Source Model Implementation](https://github.com/CASIA-IVA-Lab/FastSAM) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/fastsam_x/export.py b/qai_hub_models/models/fastsam_x/export.py index 5a8cba67..46438c56 100644 --- a/qai_hub_models/models/fastsam_x/export.py +++ b/qai_hub_models/models/fastsam_x/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.fastsam_x import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "fastsam_x" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. 
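# Rationale for the assignment below: TFLite and QNN backends typically prefer NHWC
# (channel-last) activations on the NPU, so requesting channel-last I/O avoids a
# per-inference layout transpose; the ONNX path keeps the framework-native NCHW layout.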
use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -122,7 +120,7 @@ def export_model( model.to("cpu"), make_torch_inputs(input_spec), check_trace=False ) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -136,7 +134,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -151,7 +149,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -172,13 +170,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -199,7 +197,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/fastsam_x/perf.yaml b/qai_hub_models/models/fastsam_x/perf.yaml index 90071a5c..7b95d17f 100644 --- a/qai_hub_models/models/fastsam_x/perf.yaml +++ b/qai_hub_models/models/fastsam_x/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: FastSam-X performance_metrics: - torchscript_onnx_qnn: - inference_time: 45786.0 - throughput: 21.84073734329271 + inference_time: 45671.0 + throughput: 21.895732521731514 estimated_peak_memory_range: - min: 4968448 - max: 21532680 + min: 4980736 + max: 20363240 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 418 - job_id: jogkzd8og + job_id: jp3j0z6zg job_status: Passed torchscript_onnx: - inference_time: 49871.0 - throughput: 20.051733472358684 + inference_time: 48826.0 + throughput: 20.480891328390612 estimated_peak_memory_range: - min: 106496 - max: 165346544 + min: 28672 + max: 164570040 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,7 +71,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 421 - job_id: jlpe92y0g + job_id: j57yre4r5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -82,13 +80,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T23:10:24Z' + timestamp: '2024-10-15T00:48:17Z' - torchscript_onnx_qnn: - inference_time: 38576.0 - throughput: 25.922853587722937 + inference_time: 38249.0 + throughput: 26.144474365342884 estimated_peak_memory_range: - min: 4931584 - max: 54882560 + min: 4952064 + max: 66109264 primary_compute_unit: NPU precision: fp16 layer_info: @@ -96,14 +94,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 418 - job_id: jn5q8wvm5 + job_id: jgo26l8dp job_status: Passed torchscript_onnx: - inference_time: 40684.0 - throughput: 24.579687346376954 + inference_time: 39188.0 + throughput: 25.518015719097683 estimated_peak_memory_range: - min: 479232 - max: 145914784 + min: 585728 + max: 164283632 primary_compute_unit: NPU precision: fp16 
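# layer_info describes how the compiler partitioned the graph across compute units;
# all layers landing on the NPU (zero on GPU/CPU) is the fully accelerated case.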
layer_info: @@ -111,7 +109,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 421 - job_id: jygzewn6g + job_id: jp4lry1l5 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -120,13 +118,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T23:10:25Z' + timestamp: '2024-10-15T00:48:18Z' - torchscript_onnx_qnn: - inference_time: 40305.0 - throughput: 24.810817516437165 + inference_time: 43275.0 + throughput: 23.108030040439054 estimated_peak_memory_range: - min: 5017600 - max: 6240864 + min: 5120000 + max: 6442816 primary_compute_unit: NPU precision: fp16 layer_info: @@ -134,7 +132,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 418 - job_id: jw566vw75 + job_id: jgjvnrq8g job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -142,14 +140,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T23:10:20Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:48:09Z' - torchscript_onnx_qnn: - inference_time: 86347.0 - throughput: 11.581178269077096 + inference_time: 43253.0 + throughput: 23.119783598825514 estimated_peak_memory_range: - min: 4931584 - max: 53543440 + min: 5042176 + max: 11567880 primary_compute_unit: NPU precision: fp16 layer_info: @@ -157,22 +155,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 418 - job_id: j7gjx1q8p + job_id: j5we6l4j5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T23:10:23Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:48:13Z' - torchscript_onnx_qnn: - inference_time: 42954.0 - throughput: 23.280718908599898 + inference_time: 42992.0 + throughput: 23.260141421659842 estimated_peak_memory_range: - min: 5115904 - max: 6705216 + min: 5033984 + max: 12084568 primary_compute_unit: NPU precision: fp16 layer_info: @@ -180,22 +178,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 418 - job_id: j1p3k86z5 + job_id: jgz3dln65 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T23:10:21Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:48:12Z' - torchscript_onnx_qnn: - inference_time: 43626.0 - throughput: 22.922110667950307 + inference_time: 43064.0 + throughput: 23.22125208991269 estimated_peak_memory_range: - min: 5058560 - max: 6374936 + min: 5038080 + max: 13404360 primary_compute_unit: NPU precision: fp16 layer_info: @@ -203,22 +201,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 418 - job_id: jwgoym8d5 + job_id: jpedm7y05 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T23:10:21Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:48:10Z' - torchscript_onnx_qnn: - inference_time: 43428.0 - throughput: 23.02661877129962 + inference_time: 90623.0 + throughput: 11.034726283614535 estimated_peak_memory_range: - min: 5099520 - max: 6388424 + min: 4931584 + max: 70253824 primary_compute_unit: NPU precision: fp16 layer_info: @@ -226,19 +224,57 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 418 - job_id: j1pv347m5 + job_id: jp14zn6lp job_status: Passed 
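# Throughput values in this file are 1e6 / inference_time: times are reported in
# microseconds, so throughput is inferences per second.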
reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:48:15Z' + - torchscript_onnx_qnn: + inference_time: 30823.0 + throughput: 32.443305323946404 + estimated_peak_memory_range: + min: 4927488 + max: 64419072 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 418 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 418 + job_id: jgdx1d2lp + job_status: Passed + torchscript_onnx: + inference_time: 31678.0 + throughput: 31.567649472820253 + estimated_peak_memory_range: + min: 753664 + max: 80827200 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 421 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 421 + job_id: jgn6vznm5 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T23:10:22Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:48:21Z' - torchscript_onnx_qnn: - inference_time: 44461.0 - throughput: 22.491621870853105 + inference_time: 44484.0 + throughput: 22.4799928064023 estimated_peak_memory_range: min: 4923392 max: 4923392 @@ -249,14 +285,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 418 - job_id: j1gln7llp + job_id: jpv6kl7m5 job_status: Passed torchscript_onnx: - inference_time: 49409.0 - throughput: 20.239227671072072 + inference_time: 49500.0 + throughput: 20.2020202020202 estimated_peak_memory_range: - min: 146132992 - max: 146132992 + min: 146264064 + max: 146264064 primary_compute_unit: NPU precision: fp16 layer_info: @@ -264,7 +300,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 421 - job_id: jz5wox4jp + job_id: jpxkol495 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -273,4 +309,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T23:10:26Z' + timestamp: '2024-10-15T00:48:19Z' diff --git a/qai_hub_models/models/fcn_resnet50/README.md b/qai_hub_models/models/fcn_resnet50/README.md index 5f781abd..862ed945 100644 --- a/qai_hub_models/models/fcn_resnet50/README.md +++ b/qai_hub_models/models/fcn_resnet50/README.md @@ -6,7 +6,7 @@ FCN_ResNet50 is a machine learning model that can segment images from the COCO dataset. It uses ResNet50 as a backbone. This is based on the implementation of FCN-ResNet50 found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/segmentation/fcn.py). This repository contains scripts for optimized on-device +[here](https://github.com/pytorch/vision/blob/main/torchvision/models/segmentation/fcn.py). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/fcn_resnet50). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.fcn_resnet50.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of FCN-ResNet50 can be found +* The license for the original implementation of FCN-ResNet50 can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Fully Convolutional Networks for Semantic Segmentation](https://arxiv.org/abs/1411.4038) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/segmentation/fcn.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/fcn_resnet50/export.py b/qai_hub_models/models/fcn_resnet50/export.py index 6d9a42c3..ecb04253 100644 --- a/qai_hub_models/models/fcn_resnet50/export.py +++ b/qai_hub_models/models/fcn_resnet50/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.fcn_resnet50 import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
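        For reference, ExportResult is the small container imported above from
        qai_hub_models.models.common. An illustrative sketch of its shape, assuming
        a plain dataclass (see models.common for the real definition):

            @dataclass
            class ExportResult:
                compile_job: hub.CompileJob
                inference_job: Optional[hub.InferenceJob] = None
                profile_job: Optional[hub.ProfileJob] = None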
""" model_name = "fcn_resnet50" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/fcn_resnet50/perf.yaml b/qai_hub_models/models/fcn_resnet50/perf.yaml index 8161c75a..7f89e516 100644 --- a/qai_hub_models/models/fcn_resnet50/perf.yaml +++ b/qai_hub_models/models/fcn_resnet50/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: FCN-ResNet50 performance_metrics: - torchscript_onnx_tflite: - inference_time: 41802.0 - throughput: 23.922300368403427 + inference_time: 41294.0 + throughput: 24.216593209667263 estimated_peak_memory_range: - min: 98304 - max: 2318400 + min: 22110208 + max: 24248904 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 86 - job_id: j1glne42p + job_id: jp4lry9l5 job_status: Passed torchscript_onnx_qnn: - inference_time: 42427.0 - throughput: 23.569896528154242 + inference_time: 42202.0 + throughput: 23.695559452158665 estimated_peak_memory_range: - min: 3215360 - max: 18722408 + min: 3166208 + max: 18804520 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 127 - job_id: jygzev24g + job_id: jgkex29og job_status: Passed torchscript_onnx: - inference_time: 42829.0 - throughput: 23.3486656237596 + inference_time: 43019.0 + throughput: 23.245542667193565 estimated_peak_memory_range: - min: 16384 - max: 124852056 + min: 47304704 + max: 49692968 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 129 - job_id: jo5mrve7g + job_id: j5we6l1j5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:43:52Z' + timestamp: '2024-10-15T00:47:19Z' - torchscript_onnx_tflite: - inference_time: 36249.0 - throughput: 27.586967916356315 + inference_time: 36401.0 + throughput: 27.471772753495785 estimated_peak_memory_range: - min: 7905280 - max: 152998672 + min: 22093824 + max: 191262144 
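# estimated_peak_memory_range min/max values are byte counts.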
primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 86 - job_id: jw566q2n5 + job_id: jpxkold95 job_status: Passed torchscript_onnx_qnn: - inference_time: 36793.0 - throughput: 27.179082977740332 + inference_time: 38829.0 + throughput: 25.75394679234593 estimated_peak_memory_range: - min: 2564096 - max: 57565264 + min: 3162112 + max: 77455616 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 127 - job_id: jz5womw4p + job_id: j5q6qlmmp job_status: Passed torchscript_onnx: - inference_time: 39140.0 - throughput: 25.549310168625446 + inference_time: 39035.0 + throughput: 25.618035096708084 estimated_peak_memory_range: - min: 49319936 - max: 195369312 + min: 1544192 + max: 174381568 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 129 - job_id: jegn2r0jg + job_id: jg9lnzxvg job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:43:53Z' + timestamp: '2024-10-15T00:47:20Z' - torchscript_onnx_tflite: - inference_time: 41582.0 - throughput: 24.04886729835025 + inference_time: 41355.0 + throughput: 24.180872929512756 estimated_peak_memory_range: - min: 22102016 - max: 24466560 + min: 22097920 + max: 24380520 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 86 - job_id: j1p3kqnm5 + job_id: j5mnx0dqp job_status: Passed torchscript_onnx_qnn: - inference_time: 38881.0 - throughput: 25.719503099200125 + inference_time: 39263.0 + throughput: 25.469271324147417 estimated_peak_memory_range: - min: 3219456 - max: 4360256 + min: 3276800 + max: 4505048 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 127 - job_id: jnp10q2n5 + job_id: j56y48d7p job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:43:47Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:47:11Z' - torchscript_onnx_tflite: - inference_time: 65469.0 - throughput: 15.274404680077595 + inference_time: 41195.0 + throughput: 24.274790629930816 estimated_peak_memory_range: - min: 22151168 - max: 109518784 + min: 22126592 + max: 24212112 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 86 - job_id: jwgoyez15 + job_id: jpy13o74p job_status: Passed torchscript_onnx_qnn: - inference_time: 68055.0 - throughput: 14.693997502020425 + inference_time: 39366.0 + throughput: 25.40263171264543 estimated_peak_memory_range: - min: 1740800 - max: 36045680 + min: 3289088 + max: 4504280 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 127 - job_id: j0pxve98g + job_id: jpv6kl9m5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:43:51Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:47:14Z' - torchscript_onnx_tflite: - inference_time: 
41353.0 - throughput: 24.182042415302398 + inference_time: 41439.0 + throughput: 24.131856463717753 estimated_peak_memory_range: - min: 0 - max: 1795512 + min: 22040576 + max: 23827488 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 86 - job_id: j1pv3zqz5 + job_id: jp2kyrvmp job_status: Passed torchscript_onnx_qnn: - inference_time: 39684.0 - throughput: 25.199072674125592 + inference_time: 38937.0 + throughput: 25.682512777050107 estimated_peak_memory_range: - min: 3334144 - max: 4604568 + min: 3313664 + max: 4747336 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 127 - job_id: jvgdw7n65 + job_id: jgo26l4dp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:43:48Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:47:13Z' - torchscript_onnx_tflite: - inference_time: 41532.0 - throughput: 24.077819512664934 + inference_time: 41541.0 + throughput: 24.072602970559206 estimated_peak_memory_range: - min: 22126592 - max: 24172080 + min: 22110208 + max: 23911848 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 86 - job_id: j7gjxkd1p + job_id: jprv3lneg job_status: Passed torchscript_onnx_qnn: - inference_time: 39565.0 - throughput: 25.274864147605207 + inference_time: 39506.0 + throughput: 25.312610742672 estimated_peak_memory_range: - min: 3289088 - max: 4588256 + min: 3309568 + max: 4568112 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 127 - job_id: jz57zv2np + job_id: jp3j0zwzg job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:43:49Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:47:12Z' - torchscript_onnx_tflite: - inference_time: 41181.0 - throughput: 24.28304315096768 + inference_time: 65773.0 + throughput: 15.203807033281134 estimated_peak_memory_range: - min: 22126592 - max: 24436800 + min: 22204416 + max: 119273184 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 86 - job_id: jlpe94o8g + job_id: jgn6vz7m5 job_status: Passed torchscript_onnx_qnn: - inference_time: 38834.0 - throughput: 25.750630890456815 + inference_time: 65372.0 + throughput: 15.297069081563972 estimated_peak_memory_range: - min: 3317760 - max: 4564432 + min: 3260416 + max: 41841584 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 127 - job_id: jqp4qjn2g + job_id: jpedm7l05 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:47:17Z' + - torchscript_onnx_tflite: + inference_time: 29946.0 + throughput: 33.39344152808388 + estimated_peak_memory_range: + min: 15872000 + max: 118544416 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 86 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 86 + job_id: 
jp8qye48p + job_status: Passed + torchscript_onnx_qnn: + inference_time: 29775.0 + throughput: 33.58522250209908 + estimated_peak_memory_range: + min: 3194880 + max: 74920576 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 127 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 127 + job_id: jgz3dl465 + job_status: Passed + torchscript_onnx: + inference_time: 26793.0 + throughput: 37.32318142798492 + estimated_peak_memory_range: + min: 35823616 + max: 137945296 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 129 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 129 + job_id: j57yre9r5 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:43:50Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:47:23Z' - torchscript_onnx_qnn: - inference_time: 39297.0 - throughput: 25.447235157900096 + inference_time: 39296.0 + throughput: 25.44788273615635 estimated_peak_memory_range: min: 3153920 max: 3153920 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 127 - job_id: jmg9v90m5 + job_id: jglvmy1l5 job_status: Passed torchscript_onnx: - inference_time: 42125.0 - throughput: 23.73887240356083 + inference_time: 42238.0 + throughput: 23.67536341682845 estimated_peak_memory_range: - min: 69451776 - max: 69451776 + min: 69459968 + max: 69459968 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 129 - job_id: joprk16k5 + job_id: jp14znvlp job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:43:54Z' + timestamp: '2024-10-15T00:47:21Z' diff --git a/qai_hub_models/models/fcn_resnet50_quantized/README.md b/qai_hub_models/models/fcn_resnet50_quantized/README.md index f2a318f8..94047c54 100644 --- a/qai_hub_models/models/fcn_resnet50_quantized/README.md +++ b/qai_hub_models/models/fcn_resnet50_quantized/README.md @@ -6,7 +6,7 @@ FCN_ResNet50 is a quantized machine learning model that can segment images from the COCO dataset. It uses ResNet50 as a backbone. This is based on the implementation of FCN-ResNet50-Quantized found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/segmentation/fcn.py). This repository contains scripts for optimized on-device +[here](https://github.com/pytorch/vision/blob/main/torchvision/models/segmentation/fcn.py). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/fcn_resnet50_quantized). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.fcn_resnet50_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of FCN-ResNet50-Quantized can be found +* The license for the original implementation of FCN-ResNet50-Quantized can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Fully Convolutional Networks for Semantic Segmentation](https://arxiv.org/abs/1411.4038) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/segmentation/fcn.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/fcn_resnet50_quantized/export.py b/qai_hub_models/models/fcn_resnet50_quantized/export.py index d81e827e..3a559df3 100644 --- a/qai_hub_models/models/fcn_resnet50_quantized/export.py +++ b/qai_hub_models/models/fcn_resnet50_quantized/export.py @@ -10,17 +10,17 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.fcn_resnet50_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.printing import ( print_inference_metrics, @@ -45,20 +45,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -80,10 +78,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
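        Example (illustrative sketch; the device name is one of the supported
        devices listed in perf.yaml, not a default):

            result = export_model(device="Samsung Galaxy S24", skip_inferencing=True)
            if isinstance(result, ExportResult):
                print(result.compile_job.job_id)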
""" model_name = "fcn_resnet50_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,7 +107,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( target_runtime, output_path, input_spec ) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -135,7 +133,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +148,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +169,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -198,7 +196,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/fcn_resnet50_quantized/perf.yaml b/qai_hub_models/models/fcn_resnet50_quantized/perf.yaml index fcb23e38..4df69318 100644 --- a/qai_hub_models/models/fcn_resnet50_quantized/perf.yaml +++ b/qai_hub_models/models/fcn_resnet50_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,41 +20,38 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: FCN-ResNet50-Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 12976.0 - throughput: 77.06535141800246 + inference_time: 12962.0 + throughput: 77.14858818083628 estimated_peak_memory_range: - min: 6324224 - max: 7662360 + min: 5533696 + max: 7846280 primary_compute_unit: NPU precision: int8 layer_info: @@ -61,29 +59,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 89 - job_id: jn5q896e5 + job_id: jg9lnz8qg job_status: Passed torchscript_onnx_qnn: - inference_time: 14822.0 - throughput: 67.46727836999055 + inference_time: 14804.0 + throughput: 67.54931099702783 estimated_peak_memory_range: - min: 36864 - max: 16490184 + min: 12288 + max: 138402352 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 128 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jz5wome4p + total_layers: 128 + job_id: jp0z0md25 job_status: Passed torchscript_onnx: - inference_time: 25046.0 - throughput: 39.926535175277486 + inference_time: 22000.0 + throughput: 45.45454545454545 estimated_peak_memory_range: - min: 61440 - max: 43208608 + min: 0 + max: 43668568 primary_compute_unit: NPU precision: int8 layer_info: @@ -91,7 +89,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 144 - job_id: joprk1vk5 + job_id: jgz3dl8z5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -100,13 +98,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:43:06Z' + timestamp: '2024-10-15T00:46:25Z' - torchscript_onnx_tflite: - inference_time: 12479.0 - throughput: 80.1346261719689 + inference_time: 10618.0 
+ throughput: 94.17969485778866 estimated_peak_memory_range: - min: 32768 - max: 89827680 + min: 49152 + max: 94352608 primary_compute_unit: NPU precision: int8 layer_info: @@ -114,29 +112,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 89 - job_id: j1glnev2p + job_id: jp14zn3kp job_status: Passed torchscript_onnx_qnn: - inference_time: 10866.0 - throughput: 92.03018590097552 + inference_time: 10827.0 + throughput: 92.36168837166343 estimated_peak_memory_range: - min: 802816 - max: 34130528 + min: 819200 + max: 38062816 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 128 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jmg9v9lm5 + total_layers: 128 + job_id: jp8qye6zp job_status: Passed torchscript_onnx: - inference_time: 18811.0 - throughput: 53.16038488118654 + inference_time: 16828.0 + throughput: 59.424768243403854 estimated_peak_memory_range: min: 12288 - max: 165767904 + max: 183845840 primary_compute_unit: NPU precision: int8 layer_info: @@ -144,7 +142,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 144 - job_id: jep283k6p + job_id: j5we6l8z5 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -153,13 +151,44 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:43:07Z' + timestamp: '2024-10-15T00:46:26Z' + - torchscript_onnx_qnn: + inference_time: 113680.0 + throughput: 8.796622097114708 + estimated_peak_memory_range: + min: 1269760 + max: 9004752 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 128 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 128 + job_id: jgjvnro7g + job_status: Passed + reference_device_info: + name: RB3 Gen 2 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS6490 Proxy + timestamp: '2024-10-15T00:46:23Z' + - reference_device_info: + name: RB5 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8250 Proxy + timestamp: '2024-10-15T00:46:11Z' - torchscript_onnx_tflite: - inference_time: 13041.0 - throughput: 76.68123610152595 + inference_time: 12985.0 + throughput: 77.01193685021178 estimated_peak_memory_range: - min: 5550080 - max: 320087936 + min: 5566464 + max: 9482200 primary_compute_unit: NPU precision: int8 layer_info: @@ -167,22 +196,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 89 - job_id: jw566qyn5 + job_id: jgdx1d0kp job_status: Passed torchscript_onnx_qnn: - inference_time: 12659.0 - throughput: 78.99518129394107 + inference_time: 13182.0 + throughput: 75.86102260658474 estimated_peak_memory_range: - min: 819200 - max: 2540176 + min: 847872 + max: 2106504 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 128 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jvgdw7x65 + total_layers: 128 + job_id: j5q6qlz7p job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -190,14 +219,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:43:00Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:46:16Z' - torchscript_onnx_tflite: - inference_time: 15359.0 - throughput: 65.10840549514943 + inference_time: 13045.0 + throughput: 76.65772326561901 estimated_peak_memory_range: - min: 5558272 - max: 98656656 + min: 5525504 + max: 11347872 primary_compute_unit: NPU precision: int8 layer_info: @@ -205,37 +234,37 @@ models: layers_on_gpu: 0 
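# Zero GPU/CPU counts mean the whole network runs on the NPU with no fallback layers.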
layers_on_cpu: 0 total_layers: 89 - job_id: j1p3kqjm5 + job_id: j5mnx04yp job_status: Passed torchscript_onnx_qnn: - inference_time: 17161.0 - throughput: 58.27166249053086 + inference_time: 13235.0 + throughput: 75.55723460521345 estimated_peak_memory_range: - min: 811008 - max: 36156400 + min: 819200 + max: 2086088 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 128 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jo5mrvn7g + total_layers: 128 + job_id: jp3j0zxxg job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:43:04Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:46:20Z' - torchscript_onnx_tflite: - inference_time: 12926.0 - throughput: 77.36345350456445 + inference_time: 12924.0 + throughput: 77.37542556484061 estimated_peak_memory_range: - min: 5545984 - max: 18638280 + min: 5550080 + max: 6911416 primary_compute_unit: NPU precision: int8 layer_info: @@ -243,37 +272,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 89 - job_id: jwgoye215 + job_id: jpxkolmj5 job_status: Passed torchscript_onnx_qnn: - inference_time: 12796.0 - throughput: 78.14942169427947 + inference_time: 13286.0 + throughput: 75.2671985548698 estimated_peak_memory_range: - min: 823296 - max: 2186032 + min: 819200 + max: 2199288 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 128 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jz57zvynp + total_layers: 128 + job_id: j56y48rvp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:43:01Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:46:19Z' - torchscript_onnx_tflite: - inference_time: 12914.0 - throughput: 77.43534148985597 + inference_time: 13019.0 + throughput: 76.81081496274676 estimated_peak_memory_range: - min: 5545984 - max: 93051008 + min: 5521408 + max: 19730848 primary_compute_unit: NPU precision: int8 layer_info: @@ -281,37 +310,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 89 - job_id: j1pv3z6z5 + job_id: jp4lry8q5 job_status: Passed torchscript_onnx_qnn: - inference_time: 12828.0 - throughput: 77.95447458684129 + inference_time: 13214.0 + throughput: 75.67731194187982 estimated_peak_memory_range: - min: 847872 - max: 2227720 + min: 819200 + max: 2320624 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 128 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jqp4qjl2g + total_layers: 128 + job_id: jglvmyoe5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:43:02Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:46:18Z' - torchscript_onnx_tflite: - inference_time: 12957.0 - throughput: 77.17835918808366 + inference_time: 15187.0 + throughput: 65.8457891617831 estimated_peak_memory_range: - min: 2605056 - max: 4449936 + min: 5627904 + max: 101214848 primary_compute_unit: NPU precision: int8 layer_info: @@ -319,75 +348,105 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 89 - job_id: j7gjxkv1p + job_id: j57yre6q5 job_status: Passed torchscript_onnx_qnn: - 
inference_time: 12273.0 - throughput: 81.47967082212988 + inference_time: 17021.0 + throughput: 58.750954703013925 estimated_peak_memory_range: - min: 880640 - max: 2430368 + min: 716800 + max: 37925024 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 128 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: j0pxvek8g + total_layers: 128 + job_id: jpv6kle75 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:43:03Z' - - torchscript_onnx_qnn: - inference_time: 117049.0 - throughput: 8.543430529094653 + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:46:22Z' + - torchscript_onnx_tflite: + inference_time: 9021.0 + throughput: 110.85245538188671 estimated_peak_memory_range: - min: 1277952 - max: 9028416 + min: 5517312 + max: 51723280 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 89 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jegn2r6jg + total_layers: 89 + job_id: jpy13oqrp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 10772.0 + throughput: 92.8332714444857 + estimated_peak_memory_range: + min: 835584 + max: 34962752 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 128 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 128 + job_id: jpedm7875 + job_status: Passed + torchscript_onnx: + inference_time: 14952.0 + throughput: 66.88068485821294 + estimated_peak_memory_range: + min: 1753088 + max: 103152720 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 144 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 144 + job_id: j5we6l8j5 job_status: Passed reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T12:43:05Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:46:29Z' - torchscript_onnx_qnn: - inference_time: 12992.0 - throughput: 76.9704433497537 + inference_time: 13359.0 + throughput: 74.85590238790328 estimated_peak_memory_range: min: 794624 max: 794624 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 128 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jnp10q4n5 + total_layers: 128 + job_id: jgkex2oyg job_status: Passed torchscript_onnx: - inference_time: 25865.0 - throughput: 38.662284941040014 + inference_time: 21562.0 + throughput: 46.37788702346721 estimated_peak_memory_range: - min: 35233792 - max: 35233792 + min: 36794368 + max: 36794368 primary_compute_unit: NPU precision: int8 layer_info: @@ -395,7 +454,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 144 - job_id: jqpyev10g + job_id: jp14zn7kp job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -404,4 +463,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:43:08Z' + timestamp: '2024-10-15T00:46:27Z' diff --git a/qai_hub_models/models/ffnet_122ns_lowres/README.md b/qai_hub_models/models/ffnet_122ns_lowres/README.md index 4bf440e2..2b69b08a 100644 --- a/qai_hub_models/models/ffnet_122ns_lowres/README.md +++ b/qai_hub_models/models/ffnet_122ns_lowres/README.md @@ -6,7 +6,7 @@ FFNet-122NS-LowRes is a "fuss-free network" that segments street 
scene images with per-pixel classes like road, sidewalk, and pedestrian. Trained on the Cityscapes dataset. This is based on the implementation of FFNet-122NS-LowRes found -[here](https://github.com/Qualcomm-AI-research/FFNet). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/ffnet_122ns_lowres). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.ffnet_122ns_lowres.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of FFNet-122NS-LowRes can be found +* The license for the original implementation of FFNet-122NS-LowRes can be found [here](https://github.com/Qualcomm-AI-research/FFNet/blob/master/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Simple and Efficient Architectures for Semantic Segmentation](https://arxiv.org/abs/2206.08236) * [Source Model Implementation](https://github.com/Qualcomm-AI-research/FFNet) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/ffnet_122ns_lowres/export.py b/qai_hub_models/models/ffnet_122ns_lowres/export.py index 4d467cd3..2a304b3f 100644 --- a/qai_hub_models/models/ffnet_122ns_lowres/export.py +++ b/qai_hub_models/models/ffnet_122ns_lowres/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.ffnet_122ns_lowres import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1.
Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "ffnet_122ns_lowres" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/ffnet_122ns_lowres/perf.yaml b/qai_hub_models/models/ffnet_122ns_lowres/perf.yaml index ec86f95c..29d342b0 100644 --- a/qai_hub_models/models/ffnet_122ns_lowres/perf.yaml +++ b/qai_hub_models/models/ffnet_122ns_lowres/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: FFNet-122NS-LowRes performance_metrics: - torchscript_onnx_tflite: - inference_time: 7331.0 - throughput: 136.40703860319192 + inference_time: 7435.0 + throughput: 134.49899125756556 estimated_peak_memory_range: - min: 667648 - max: 2865608 + min: 647168 + max: 2559144 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 216 - job_id: jw566q4n5 + job_id: jgo26le4p job_status: Passed torchscript_onnx_qnn: - inference_time: 7187.0 - throughput: 139.14011409489356 + inference_time: 7220.0 + throughput: 138.50415512465375 estimated_peak_memory_range: min: 6307840 - max: 33284944 + max: 37875312 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 348 - job_id: jz5wom64p + job_id: j57yrevq5 job_status: Passed torchscript_onnx: - inference_time: 7783.0 - throughput: 128.48515996402415 + inference_time: 7563.0 + throughput: 132.22266296443212 estimated_peak_memory_range: - min: 6324224 - max: 9185048 + min: 6316032 + max: 68449600 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 350 - job_id: jegn2rvjg + job_id: jgkex2dyg job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:41:22Z' + timestamp: '2024-10-15T00:44:36Z' - torchscript_onnx_tflite: - inference_time: 5667.0 - throughput: 176.4602082230457 + inference_time: 6103.0 + throughput: 163.85384237260365 estimated_peak_memory_range: - min: 659456 - max: 67127136 + min: 667648 + max: 71345056 
primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 216 - job_id: j1p3kq0m5 + job_id: jpv6klz75 job_status: Passed torchscript_onnx_qnn: - inference_time: 5656.0 - throughput: 176.8033946251768 + inference_time: 5973.0 + throughput: 167.42005692281936 estimated_peak_memory_range: min: 6307840 - max: 29405056 + max: 31171600 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 348 - job_id: jmg9v9nm5 + job_id: jp4lryjq5 job_status: Passed torchscript_onnx: - inference_time: 8050.0 - throughput: 124.22360248447205 + inference_time: 6501.0 + throughput: 153.82248884786955 estimated_peak_memory_range: - min: 7589888 - max: 92935840 + min: 999424 + max: 95885696 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 350 - job_id: joprk13k5 + job_id: j5q6qlw7p job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:41:23Z' + timestamp: '2024-10-15T00:44:37Z' - torchscript_onnx_tflite: - inference_time: 7260.0 - throughput: 137.7410468319559 + inference_time: 7253.0 + throughput: 137.87398317937405 estimated_peak_memory_range: - min: 651264 - max: 2269632 + min: 647168 + max: 2878104 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 216 - job_id: jwgoye615 + job_id: jgjvnrk7g job_status: Passed torchscript_onnx_qnn: - inference_time: 6788.0 - throughput: 147.3187978786093 + inference_time: 6724.0 + throughput: 148.720999405116 estimated_peak_memory_range: - min: 6328320 - max: 7716224 + min: 6365184 + max: 7488816 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 348 - job_id: jvgdw7165 + job_id: j5mnx0vyp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:41:17Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:44:28Z' - torchscript_onnx_tflite: - inference_time: 10766.0 - throughput: 92.88500835965075 + inference_time: 7262.0 + throughput: 137.70311209033324 estimated_peak_memory_range: - min: 638976 - max: 61768912 + min: 671744 + max: 2912336 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 216 - job_id: j1pv3zkz5 + job_id: jg9lnz9qg job_status: Passed torchscript_onnx_qnn: - inference_time: 11175.0 - throughput: 89.48545861297539 + inference_time: 6725.0 + throughput: 148.6988847583643 estimated_peak_memory_range: - min: 6422528 - max: 27181728 + min: 6340608 + max: 8135168 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 348 - job_id: jo5mrvx7g + job_id: jp2kyrjxp job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:41:21Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:44:31Z' - torchscript_onnx_tflite: - inference_time: 7252.0 - throughput: 137.89299503585218 + 
inference_time: 7274.0 + throughput: 137.47594171020071 estimated_peak_memory_range: - min: 0 - max: 5974544 + min: 647168 + max: 2696992 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 216 - job_id: j7gjxkn1p + job_id: j5we6lmz5 job_status: Passed torchscript_onnx_qnn: - inference_time: 6911.0 - throughput: 144.6968600781363 + inference_time: 6701.0 + throughput: 149.23145799134457 estimated_peak_memory_range: - min: 6336512 - max: 7479032 + min: 6340608 + max: 7749680 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 348 - job_id: jz57zvrnp + job_id: jprv3l9vg job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:41:18Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:44:30Z' - torchscript_onnx_tflite: - inference_time: 7373.0 - throughput: 135.63000135630003 + inference_time: 7383.0 + throughput: 135.4462955438169 estimated_peak_memory_range: - min: 647168 - max: 9586848 + min: 638976 + max: 2937272 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 216 - job_id: jlpe94m8g + job_id: jgz3dlvz5 job_status: Passed torchscript_onnx_qnn: - inference_time: 6887.0 - throughput: 145.20110352838682 + inference_time: 6722.0 + throughput: 148.76524843796489 estimated_peak_memory_range: - min: 6365184 - max: 7608704 + min: 6373376 + max: 7866616 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 348 - job_id: jqp4qjr2g + job_id: jgn6vzxv5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:41:19Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:44:29Z' - torchscript_onnx_tflite: - inference_time: 7255.0 - throughput: 137.83597518952448 + inference_time: 10742.0 + throughput: 93.0925339787749 estimated_peak_memory_range: - min: 675840 - max: 2804416 + min: 638976 + max: 63506224 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 216 - job_id: jygzevd4g + job_id: jpedm7475 job_status: Passed torchscript_onnx_qnn: - inference_time: 6817.0 - throughput: 146.69209329617135 + inference_time: 11012.0 + throughput: 90.81002542680712 estimated_peak_memory_range: - min: 6332416 - max: 7559904 + min: 1155072 + max: 23134992 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 348 - job_id: j0pxveo8g + job_id: jp0z0mk25 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:44:34Z' + - torchscript_onnx_tflite: + inference_time: 4901.0 + throughput: 204.03999183840034 + estimated_peak_memory_range: + min: 622592 + max: 30095232 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 216 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 216 + job_id: jgdx1d7kp + job_status: Passed + torchscript_onnx_qnn: + 
inference_time: 5054.0 + throughput: 197.86307874950535 + estimated_peak_memory_range: + min: 6291456 + max: 28683552 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 348 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 348 + job_id: jp8qye8zp + job_status: Passed + torchscript_onnx: + inference_time: 5389.0 + throughput: 185.56318426424198 + estimated_peak_memory_range: + min: 7593984 + max: 53147632 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 350 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 350 + job_id: jp3j0z8xg + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:41:20Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:44:40Z' - torchscript_onnx_qnn: - inference_time: 7181.0 - throughput: 139.25637097897229 + inference_time: 7133.0 + throughput: 140.19346698443852 estimated_peak_memory_range: min: 6303744 max: 6303744 primary_compute_unit: NPU precision: fp16 layer_info: @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 348 - job_id: jnp10qzn5 + job_id: jpxkolej5 job_status: Passed torchscript_onnx: - inference_time: 7591.0 - throughput: 131.73494928204454 + inference_time: 7697.0 + throughput: 129.92074834351047 estimated_peak_memory_range: - min: 60002304 - max: 60002304 + min: 61521920 + max: 61521920 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 350 - job_id: jegn2rvjg + job_id: jglvmy7e5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:41:24Z' + timestamp: '2024-10-15T00:44:38Z' diff --git a/qai_hub_models/models/ffnet_40s/README.md b/qai_hub_models/models/ffnet_40s/README.md index f9ee034a..d2b952c7 100644 --- a/qai_hub_models/models/ffnet_40s/README.md +++ b/qai_hub_models/models/ffnet_40s/README.md @@ -6,7 +6,7 @@ FFNet-40S is a "fuss-free network" that segments street scene images with per-pixel classes like road, sidewalk, and pedestrian. Trained on the Cityscapes dataset. This is based on the implementation of FFNet-40S found -[here](https://github.com/Qualcomm-AI-research/FFNet). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/ffnet_40s). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.ffnet_40s.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of FFNet-40S can be found +* The license for the original implementation of FFNet-40S can be found [here](https://github.com/Qualcomm-AI-research/FFNet/blob/master/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Simple and Efficient Architectures for Semantic Segmentation](https://arxiv.org/abs/2206.08236) * [Source Model Implementation](https://github.com/Qualcomm-AI-research/FFNet) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/ffnet_40s/export.py b/qai_hub_models/models/ffnet_40s/export.py index a6e519b6..530eb87c 100644 --- a/qai_hub_models/models/ffnet_40s/export.py +++ b/qai_hub_models/models/ffnet_40s/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.ffnet_40s import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
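+
+    Example:
+        A minimal sketch of consuming the returned struct; the device name is
+        an illustrative assumption, and any Qualcomm AI Hub device name may be
+        passed:
+
+            result = export_model(device="Samsung Galaxy S23")
+            if isinstance(result, ExportResult):
+                assert result.compile_job is not None  # compilation always runs
+                if result.profile_job is not None and result.profile_job.wait().success:
+                    profile_data = result.profile_job.download_profile()
+
+        As a side note, `throughput` in the accompanying perf.yaml files is
+        derived as 1e6 / `inference_time`, with `inference_time` measured in
+        microseconds (e.g. 17007.0 us gives 58.799... inferences per second).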
""" model_name = "ffnet_40s" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/ffnet_40s/perf.yaml b/qai_hub_models/models/ffnet_40s/perf.yaml index f3c5f185..c186f2c9 100644 --- a/qai_hub_models/models/ffnet_40s/perf.yaml +++ b/qai_hub_models/models/ffnet_40s/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: FFNet-40S performance_metrics: - torchscript_onnx_tflite: - inference_time: 17077.0 - throughput: 58.55829478245594 + inference_time: 17007.0 + throughput: 58.799317927912035 estimated_peak_memory_range: - min: 2621440 - max: 4673240 + min: 2519040 + max: 4726640 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 92 - job_id: jygzexkxg + job_id: j5q6ql77p job_status: Passed torchscript_onnx_qnn: - inference_time: 17566.0 - throughput: 56.928156666287144 + inference_time: 17621.0 + throughput: 56.75046819136258 estimated_peak_memory_range: - min: 27770880 - max: 42248864 + min: 26689536 + max: 51292232 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 140 - job_id: jnp10dxn5 + job_id: j5we6ldz5 job_status: Passed torchscript_onnx: - inference_time: 27096.0 - throughput: 36.90581635665781 + inference_time: 24964.0 + throughput: 40.0576830636116 estimated_peak_memory_range: - min: 0 - max: 31581472 + min: 27320320 + max: 29771632 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 142 - job_id: jep287n6p + job_id: jp2kyr3xp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:40:32Z' + timestamp: '2024-10-15T00:43:39Z' - torchscript_onnx_tflite: - inference_time: 14933.0 - throughput: 66.96578048617157 + inference_time: 14889.0 + throughput: 67.16367788300087 estimated_peak_memory_range: - min: 2527232 - max: 97092528 + min: 1675264 + max: 107263440 primary_compute_unit: NPU precision: fp16 
layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 92 - job_id: jz5wodnmp + job_id: jglvmy0e5 job_status: Passed torchscript_onnx_qnn: - inference_time: 15207.0 - throughput: 65.75918984678108 + inference_time: 15114.0 + throughput: 66.16382162233691 estimated_peak_memory_range: - min: 25206784 - max: 55393184 + min: 25198592 + max: 60290624 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 140 - job_id: jvgdwrl65 + job_id: jg9lnz3qg job_status: Passed torchscript_onnx: - inference_time: 22259.0 - throughput: 44.92564805247316 + inference_time: 22020.0 + throughput: 45.41326067211626 estimated_peak_memory_range: - min: 28938240 - max: 143201440 + min: 28663808 + max: 158544848 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 142 - job_id: jqpye400g + job_id: jpy13ovrp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:40:33Z' + timestamp: '2024-10-15T00:43:40Z' - torchscript_onnx_tflite: - inference_time: 16980.0 - throughput: 58.89281507656066 + inference_time: 16785.0 + throughput: 59.577003276735184 estimated_peak_memory_range: - min: 2527232 - max: 4745208 + min: 2539520 + max: 4771248 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 92 - job_id: jmg9v3e85 + job_id: j56y483vp job_status: Passed torchscript_onnx_qnn: - inference_time: 17375.0 - throughput: 57.55395683453237 + inference_time: 16327.0 + throughput: 61.2482391131255 estimated_peak_memory_range: - min: 25239552 - max: 26556672 + min: 25235456 + max: 26408256 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 140 - job_id: jqp4qx02g + job_id: jgdx1drkp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:40:27Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:43:32Z' - torchscript_onnx_tflite: - inference_time: 27754.0 - throughput: 36.030842401095335 + inference_time: 16799.0 + throughput: 59.52735281862016 estimated_peak_memory_range: - min: 2547712 - max: 88737440 + min: 2531328 + max: 4632704 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 92 - job_id: jnp10dx75 + job_id: jgjvnr07g job_status: Passed torchscript_onnx_qnn: - inference_time: 28663.0 - throughput: 34.888183372291806 + inference_time: 16511.0 + throughput: 60.56568348373811 estimated_peak_memory_range: - min: 25202688 - max: 53468224 + min: 25264128 + max: 26509960 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 140 - job_id: joprk4jk5 + job_id: jpxkol7j5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:40:31Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:43:35Z' - torchscript_onnx_tflite: - inference_time: 16954.0 - throughput: 58.983130824584165 + 
inference_time: 16774.0 + throughput: 59.61607249314415 estimated_peak_memory_range: - min: 2535424 - max: 4722736 + min: 2531328 + max: 4739104 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 92 - job_id: jvgdwrlz5 + job_id: jpv6kl175 job_status: Passed torchscript_onnx_qnn: - inference_time: 17597.0 - throughput: 56.82786838665682 + inference_time: 16850.0 + throughput: 59.347181008902076 estimated_peak_memory_range: - min: 25264128 - max: 26983544 + min: 25284608 + max: 26491568 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 140 - job_id: j0pxv728g + job_id: jp4lryxq5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:40:28Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:43:34Z' - torchscript_onnx_tflite: - inference_time: 16989.0 - throughput: 58.86161633998469 + inference_time: 16854.0 + throughput: 59.33309600094933 estimated_peak_memory_range: - min: 2519040 - max: 4551392 + min: 2527232 + max: 4802984 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 92 - job_id: jz5wodn4p + job_id: jgo26l14p job_status: Passed torchscript_onnx_qnn: - inference_time: 17515.0 - throughput: 57.09391949757351 + inference_time: 16816.0 + throughput: 59.467174119885826 estimated_peak_memory_range: - min: 25268224 - max: 26538128 + min: 25210880 + max: 26464240 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 140 - job_id: jo5mrwy7g + job_id: j57yrejq5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:40:29Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:43:33Z' - torchscript_onnx_tflite: - inference_time: 16916.0 - throughput: 59.11563017261764 + inference_time: 27915.0 + throughput: 35.82303421099767 estimated_peak_memory_range: - min: 2539520 - max: 4768152 + min: 2555904 + max: 97240432 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 92 - job_id: jmg9v3em5 + job_id: jp3j0z4xg job_status: Passed torchscript_onnx_qnn: - inference_time: 17492.0 - throughput: 57.16899153898925 + inference_time: 28352.0 + throughput: 35.270880361173816 estimated_peak_memory_range: - min: 25264128 - max: 26549120 + min: 23126016 + max: 57746960 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,22 +329,75 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 140 - job_id: jegn298jg + job_id: jgn6vzrv5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:43:37Z' + - torchscript_onnx_tflite: + inference_time: 11794.0 + throughput: 84.78887569950822 + estimated_peak_memory_range: + min: 872448 + max: 45721472 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 92 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 92 + job_id: jgz3dlxz5 + job_status: Passed + 
torchscript_onnx_qnn: + inference_time: 12082.0 + throughput: 82.76775368316504 + estimated_peak_memory_range: + min: 25178112 + max: 58235520 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 140 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 140 + job_id: jprv3l1vg + job_status: Passed + torchscript_onnx: + inference_time: 15466.0 + throughput: 64.6579593948015 + estimated_peak_memory_range: + min: 33693696 + max: 87935584 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 142 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 142 + job_id: jgkex2ryg + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:40:30Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:43:43Z' - torchscript_onnx_qnn: - inference_time: 17860.0 - throughput: 55.99104143337066 + inference_time: 16542.0 + throughput: 60.45218232378189 estimated_peak_memory_range: - min: 25223168 - max: 25223168 + min: 25219072 + max: 25219072 primary_compute_unit: NPU precision: fp16 layer_info: @@ -354,11 +405,11 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 140 - job_id: jz57zj3np + job_id: jp14zndkp job_status: Passed torchscript_onnx: - inference_time: 26179.0 - throughput: 38.19855609457962 + inference_time: 30278.0 + throughput: 33.027280533720855 estimated_peak_memory_range: min: 25223168 max: 25223168 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 142 - job_id: j2p0ye00g + job_id: jp0z0me25 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:40:34Z' + timestamp: '2024-10-15T00:43:41Z' diff --git a/qai_hub_models/models/ffnet_40s_quantized/README.md b/qai_hub_models/models/ffnet_40s_quantized/README.md index 5237d76f..1772b22d 100644 --- a/qai_hub_models/models/ffnet_40s_quantized/README.md +++ b/qai_hub_models/models/ffnet_40s_quantized/README.md @@ -6,7 +6,7 @@ FFNet-40S-Quantized is a "fuss-free network" that segments street scene images with per-pixel classes like road, sidewalk, and pedestrian. Trained on the Cityscapes dataset. This is based on the implementation of FFNet-40S-Quantized found -[here](https://github.com/Qualcomm-AI-research/FFNet). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/ffnet_40s_quantized). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.ffnet_40s_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of FFNet-40S-Quantized can be found +* The license for the original implementation of FFNet-40S-Quantized can be found [here](https://github.com/Qualcomm-AI-research/FFNet/blob/master/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Simple and Efficient Architectures for Semantic Segmentation](https://arxiv.org/abs/2206.08236) * [Source Model Implementation](https://github.com/Qualcomm-AI-research/FFNet) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/ffnet_40s_quantized/export.py b/qai_hub_models/models/ffnet_40s_quantized/export.py index 2f335230..0f4c39fd 100644 --- a/qai_hub_models/models/ffnet_40s_quantized/export.py +++ b/qai_hub_models/models/ffnet_40s_quantized/export.py @@ -10,17 +10,17 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.ffnet_40s_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.printing import ( print_inference_metrics, @@ -45,20 +45,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -80,10 +78,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
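+
+    Example:
+        A hedged sketch of skipping the on-device steps; the flags mirror the
+        `skip_profiling` / `skip_inferencing` checks in the function body, and
+        the device name is illustrative:
+
+            result = export_model(
+                device="Samsung Galaxy S24",
+                skip_profiling=True,
+                skip_inferencing=True,
+            )
+            if isinstance(result, ExportResult):
+                # Only the compile step ran, so the optional jobs remain None.
+                assert result.profile_job is None
+                assert result.inference_job is None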
""" model_name = "ffnet_40s_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,7 +107,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( target_runtime, output_path, input_spec ) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -135,7 +133,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +148,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +169,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -198,7 +196,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/ffnet_40s_quantized/perf.yaml b/qai_hub_models/models/ffnet_40s_quantized/perf.yaml index b7fc36e3..779f9d20 100644 --- a/qai_hub_models/models/ffnet_40s_quantized/perf.yaml +++ b/qai_hub_models/models/ffnet_40s_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,44 +20,41 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: FFNet-40S-Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 4110.0 - throughput: 243.30900243309003 + inference_time: 4177.0 + throughput: 239.40627244433804 estimated_peak_memory_range: - min: 655360 - max: 17844600 + min: 675840 + max: 3087992 primary_compute_unit: NPU precision: int8 layer_info: @@ -64,14 +62,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 99 - job_id: jygzex9xg + job_id: jp8qye7qp job_status: Passed torchscript_onnx: - inference_time: 9631.0 - throughput: 103.83137784238397 + inference_time: 8966.0 + throughput: 111.5324559446799 estimated_peak_memory_range: - min: 7917568 - max: 17182824 + min: 110592 + max: 11835736 primary_compute_unit: NPU precision: int8 layer_info: @@ -79,7 +77,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 168 - job_id: jw5663ly5 + job_id: j5mnx0zyp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -88,13 +86,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:39:47Z' + timestamp: '2024-10-15T00:42:47Z' - torchscript_onnx_tflite: - inference_time: 2910.0 - throughput: 343.64261168384877 + inference_time: 2927.0 + throughput: 341.646737273659 estimated_peak_memory_range: - min: 659456 - max: 65961408 + min: 655360 + max: 66216384 primary_compute_unit: NPU precision: int8 layer_info: @@ -102,7 +100,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 99 - job_id: jz5wodvmp + job_id: jgkex2yvg + job_status: Passed + torchscript_onnx: + inference_time: 6409.0 + throughput: 156.03058199407084 + 
estimated_peak_memory_range: + min: 4567040 + max: 112039248 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 168 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 168 + job_id: jgn6vz9v5 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -111,13 +124,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:39:30Z' + timestamp: '2024-10-15T00:42:49Z' - torchscript_onnx_tflite: - inference_time: 4064.0 - throughput: 246.06299212598427 + inference_time: 27414.0 + throughput: 36.47771211789597 estimated_peak_memory_range: - min: 638976 - max: 12263072 + min: 3170304 + max: 43533152 primary_compute_unit: NPU precision: int8 layer_info: @@ -125,22 +138,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 99 - job_id: jmg9v3185 + job_id: jgjvnrl1g job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:39:31Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-15T00:42:32Z' - torchscript_onnx_tflite: - inference_time: 5114.0 - throughput: 195.54165037152913 + inference_time: 189840.0 + throughput: 5.267593763168985 estimated_peak_memory_range: - min: 704512 - max: 66771808 + min: 929792 + max: 14978752 primary_compute_unit: NPU precision: int8 layer_info: @@ -148,22 +161,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 99 - job_id: jnp10dl75 + job_id: jpedm7v85 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: RB5 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:39:32Z' + chipset: QCS8250 Proxy + timestamp: '2024-10-15T00:42:33Z' - torchscript_onnx_tflite: - inference_time: 4097.0 - throughput: 244.081034903588 + inference_time: 4061.0 + throughput: 246.2447672986949 estimated_peak_memory_range: - min: 647168 - max: 2774536 + min: 684032 + max: 1949272 primary_compute_unit: NPU precision: int8 layer_info: @@ -171,22 +184,45 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 99 - job_id: jvgdwr9z5 + job_id: j5q6ql2ep job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:42:26Z' + - torchscript_onnx_tflite: + inference_time: 4154.0 + throughput: 240.73182474723157 + estimated_peak_memory_range: + min: 638976 + max: 8262920 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 99 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 99 + job_id: jgo26lv1p + job_status: Passed + reference_device_info: + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:39:33Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:42:30Z' - torchscript_onnx_tflite: - inference_time: 4159.0 - throughput: 240.44241404183697 + inference_time: 4147.0 + throughput: 241.13817217265492 estimated_peak_memory_range: - min: 663552 - max: 5066760 + min: 659456 + max: 2704744 primary_compute_unit: NPU precision: int8 layer_info: @@ -194,7 +230,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 99 - job_id: jz57zjw9p + job_id: jp3j0zmmg job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -202,14 +238,14 @@ models: 
form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:39:34Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:42:29Z' - torchscript_onnx_tflite: - inference_time: 4080.0 - throughput: 245.09803921568627 + inference_time: 4123.0 + throughput: 242.5418384671356 estimated_peak_memory_range: - min: 651264 - max: 6841376 + min: 16384 + max: 194057624 primary_compute_unit: NPU precision: int8 layer_info: @@ -217,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 99 - job_id: jqp4qxo1g + job_id: j56y481np job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:39:35Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:42:28Z' - torchscript_onnx_tflite: - inference_time: 26480.0 - throughput: 37.764350453172206 + inference_time: 5144.0 + throughput: 194.4012441679627 estimated_peak_memory_range: - min: 1085440 - max: 43239040 + min: 0 + max: 70203216 primary_compute_unit: NPU precision: int8 layer_info: @@ -240,22 +276,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 99 - job_id: j0pxv7jlg + job_id: jglvmyk25 job_status: Passed reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T12:39:35Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:42:27Z' - torchscript_onnx_tflite: - inference_time: 187748.0 - throughput: 5.326288429171017 + inference_time: 2516.0 + throughput: 397.456279809221 estimated_peak_memory_range: - min: 716800 - max: 8401296 + min: 651264 + max: 32985312 primary_compute_unit: NPU precision: int8 layer_info: @@ -263,22 +299,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 99 - job_id: jo5mrw29g + job_id: jgz3dl745 + job_status: Passed + torchscript_onnx: + inference_time: 5339.0 + throughput: 187.30099269526127 + estimated_peak_memory_range: + min: 0 + max: 53719008 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 168 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 168 + job_id: jpy13o4rp job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T12:39:37Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:42:52Z' - torchscript_onnx: - inference_time: 9749.0 - throughput: 102.57462303826034 + inference_time: 9156.0 + throughput: 109.217999126256 estimated_peak_memory_range: - min: 9482240 - max: 9482240 + min: 10833920 + max: 10833920 primary_compute_unit: NPU precision: int8 layer_info: @@ -286,7 +337,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 168 - job_id: jwgoy1qk5 + job_id: jprv3l4vg job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -295,4 +346,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:39:48Z' + timestamp: '2024-10-15T00:42:50Z' diff --git a/qai_hub_models/models/ffnet_54s/README.md b/qai_hub_models/models/ffnet_54s/README.md index 90096232..a652acc3 100644 --- a/qai_hub_models/models/ffnet_54s/README.md +++ b/qai_hub_models/models/ffnet_54s/README.md @@ -6,7 +6,7 @@ FFNet-54S is a "fuss-free network" that segments 
street scene images with per-pixel classes like road, sidewalk, and pedestrian. Trained on the Cityscapes dataset. This is based on the implementation of FFNet-54S found -[here](https://github.com/Qualcomm-AI-research/FFNet). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/ffnet_54s). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.ffnet_54s.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of FFNet-54S can be found +* The license for the original implementation of FFNet-54S can be found [here](https://github.com/Qualcomm-AI-research/FFNet/blob/master/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Simple and Efficient Architectures for Semantic Segmentation](https://arxiv.org/abs/2206.08236) * [Source Model Implementation](https://github.com/Qualcomm-AI-research/FFNet) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/ffnet_54s/export.py b/qai_hub_models/models/ffnet_54s/export.py index d199dbaa..0389e2ba 100644 --- a/qai_hub_models/models/ffnet_54s/export.py +++ b/qai_hub_models/models/ffnet_54s/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.ffnet_54s import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2.
Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "ffnet_54s" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/ffnet_54s/perf.yaml b/qai_hub_models/models/ffnet_54s/perf.yaml index 257544e0..f4d12321 100644 --- a/qai_hub_models/models/ffnet_54s/perf.yaml +++ b/qai_hub_models/models/ffnet_54s/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: FFNet-54S performance_metrics: - torchscript_onnx_tflite: - inference_time: 19620.0 - throughput: 50.9683995922528 + inference_time: 19975.0 + throughput: 50.06257822277847 estimated_peak_memory_range: - min: 2158592 - max: 4310984 + min: 2146304 + max: 4449216 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 113 - job_id: jmg9v3v85 + job_id: jprv3l2kg job_status: Passed torchscript_onnx_qnn: - inference_time: 20290.0 - throughput: 49.28536224741252 + inference_time: 20164.0 + throughput: 49.59333465582226 estimated_peak_memory_range: - min: 25223168 - max: 50812504 + min: 25219072 + max: 47290376 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 175 - job_id: jegn292qg + job_id: jp3j0zemg job_status: Passed torchscript_onnx: - inference_time: 29790.0 - throughput: 33.56831151393085 + inference_time: 28053.0 + throughput: 35.64681139272092 estimated_peak_memory_range: - min: 25997312 - max: 41080456 + min: 25911296 + max: 28680032 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 177 - job_id: j1gln0zmp + job_id: j57yrexn5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:38:42Z' + timestamp: '2024-10-15T00:41:35Z' - torchscript_onnx_tflite: - inference_time: 17535.0 - throughput: 57.0287995437696 + inference_time: 17729.0 + throughput: 56.40476056179141 estimated_peak_memory_range: - min: 2523136 - max: 107213376 + min: 2535424 + max: 120832592 primary_compute_unit: NPU 
precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 113 - job_id: jnp10d075 + job_id: jp2kyr96p job_status: Passed torchscript_onnx_qnn: - inference_time: 17684.0 - throughput: 56.5482922415743 + inference_time: 17811.0 + throughput: 56.14507888383583 estimated_peak_memory_range: min: 21004288 - max: 52449872 + max: 56553616 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 175 - job_id: joprk4k75 + job_id: jgo26l31p job_status: Passed torchscript_onnx: - inference_time: 25465.0 - throughput: 39.2695857058708 + inference_time: 24675.0 + throughput: 40.52684903748734 estimated_peak_memory_range: - min: 405504 - max: 122913152 + min: 589824 + max: 142903376 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 177 - job_id: jw5663jy5 + job_id: jp4lryv25 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:38:43Z' + timestamp: '2024-10-15T00:41:36Z' - torchscript_onnx_tflite: - inference_time: 19638.0 - throughput: 50.921682452388225 + inference_time: 19858.0 + throughput: 50.35753852351697 estimated_peak_memory_range: - min: 262144 - max: 14650120 + min: 2547712 + max: 7829712 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 113 - job_id: jvgdwrwz5 + job_id: jpy13oj0p job_status: Passed torchscript_onnx_qnn: - inference_time: 19843.0 - throughput: 50.39560550320012 + inference_time: 18980.0 + throughput: 52.68703898840885 estimated_peak_memory_range: - min: 25231360 - max: 26398024 + min: 25247744 + max: 26457440 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 175 - job_id: jqpye4elg + job_id: jgjvnre1g job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:38:37Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:41:27Z' - torchscript_onnx_tflite: - inference_time: 31911.0 - throughput: 31.337156466422236 + inference_time: 19960.0 + throughput: 50.100200400801604 estimated_peak_memory_range: - min: 2580480 - max: 95283264 + min: 2543616 + max: 4964736 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 113 - job_id: jz57zjz9p + job_id: j5q6ql3ep job_status: Passed torchscript_onnx_qnn: - inference_time: 36419.0 - throughput: 27.458194898267386 + inference_time: 19252.0 + throughput: 51.94265530853937 estimated_peak_memory_range: - min: 25182208 - max: 53629712 + min: 25264128 + max: 26769104 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 175 - job_id: jn5q878o5 + job_id: j5we6lq45 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:38:41Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:41:31Z' - torchscript_onnx_tflite: - inference_time: 19639.0 - throughput: 50.91908956667855 + 
inference_time: 19841.0 + throughput: 50.40068544932211 estimated_peak_memory_range: - min: 2461696 - max: 4436168 + min: 2539520 + max: 4601856 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 113 - job_id: jqp4qxq1g + job_id: jgkex23vg job_status: Passed torchscript_onnx_qnn: - inference_time: 20167.0 - throughput: 49.58595725690484 + inference_time: 19187.0 + throughput: 52.11862198363475 estimated_peak_memory_range: - min: 25223168 - max: 26881960 + min: 25264128 + max: 26467192 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 175 - job_id: j2p0y1yng + job_id: jgz3dlr45 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:38:38Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:41:30Z' - torchscript_onnx_tflite: - inference_time: 19617.0 - throughput: 50.9761941173472 + inference_time: 20016.0 + throughput: 49.96003197442047 estimated_peak_memory_range: - min: 2461696 - max: 4670472 + min: 2551808 + max: 4598328 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 113 - job_id: j0pxv7vlg + job_id: jp8qyezqp job_status: Passed torchscript_onnx_qnn: - inference_time: 19890.0 - throughput: 50.27652086475616 + inference_time: 19257.0 + throughput: 51.92916861401049 estimated_peak_memory_range: - min: 25251840 - max: 26492392 + min: 25276416 + max: 26660264 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 175 - job_id: j1p8o3oog + job_id: jpedm7k85 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:38:39Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:41:28Z' - torchscript_onnx_tflite: - inference_time: 19580.0 - throughput: 51.07252298263534 + inference_time: 32162.0 + throughput: 31.092593744170138 estimated_peak_memory_range: - min: 2535424 - max: 4331176 + min: 2560000 + max: 104041296 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 113 - job_id: jo5mrwr9g + job_id: jp0z0ml05 job_status: Passed torchscript_onnx_qnn: - inference_time: 20061.0 - throughput: 49.84796371068242 + inference_time: 32442.0 + throughput: 30.824240182479503 estimated_peak_memory_range: - min: 25247744 - max: 26485640 + min: 25153536 + max: 58919248 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 175 - job_id: jogkzlzng + job_id: jp14znenp job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:41:33Z' + - torchscript_onnx_tflite: + inference_time: 14186.0 + throughput: 70.49203440011279 + estimated_peak_memory_range: + min: 454656 + max: 48936256 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 113 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 113 + job_id: j56y48nnp + job_status: Passed + 
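Worth noting when reading these metrics: in every `layer_info` block in this file, the per-unit counts sum to `total_layers` (`layers_on_npu + layers_on_gpu + layers_on_cpu == total_layers`), and `primary_compute_unit` appears to name the unit carrying those layers. A minimal sanity-check sketch (the helper name is ours, not part of the repo):

```python
# Check the layer_info invariant seen throughout perf.yaml:
# the per-unit layer counts should sum to total_layers.
def check_layer_info(layer_info: dict) -> bool:
    placed = (
        layer_info["layers_on_npu"]
        + layer_info["layers_on_gpu"]
        + layer_info["layers_on_cpu"]
    )
    return placed == layer_info["total_layers"]


# e.g. the Snapdragon 8 Elite TFLite entry above: 113 + 0 + 0 == 113
assert check_layer_info(
    {"layers_on_npu": 113, "layers_on_gpu": 0, "layers_on_cpu": 0, "total_layers": 113}
)
```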
torchscript_onnx_qnn: + inference_time: 11828.0 + throughput: 84.54514710855597 + estimated_peak_memory_range: + min: 25202688 + max: 63362416 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 175 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 175 + job_id: jgdx1do6p + job_status: Passed + torchscript_onnx: + inference_time: 22041.0 + throughput: 45.36999228710131 + estimated_peak_memory_range: + min: 31764480 + max: 84906384 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 177 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 177 + job_id: jgn6vz3j5 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:38:40Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:41:39Z' - torchscript_onnx_qnn: - inference_time: 20259.0 - throughput: 49.36077792586011 + inference_time: 19271.0 + throughput: 51.89144310103264 estimated_peak_memory_range: min: 25223168 max: 25223168 @@ -354,11 +405,11 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 175 - job_id: jep2878qp + job_id: jpv6klvz5 job_status: Passed torchscript_onnx: - inference_time: 29226.0 - throughput: 34.21610894409088 + inference_time: 32787.0 + throughput: 30.499893250373624 estimated_peak_memory_range: min: 25223168 max: 25223168 @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 177 - job_id: j1p3k43n5 + job_id: jpxkoly85 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:38:44Z' + timestamp: '2024-10-15T00:41:37Z' diff --git a/qai_hub_models/models/ffnet_54s_quantized/README.md b/qai_hub_models/models/ffnet_54s_quantized/README.md index 1f6912e8..9306773a 100644 --- a/qai_hub_models/models/ffnet_54s_quantized/README.md +++ b/qai_hub_models/models/ffnet_54s_quantized/README.md @@ -6,7 +6,7 @@ FFNet-54S-Quantized is a "fuss-free network" that segments street scene images with per-pixel classes like road, sidewalk, and pedestrian. Trained on the Cityscapes dataset. This is based on the implementation of FFNet-54S-Quantized found -[here](https://github.com/Qualcomm-AI-research/FFNet). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/ffnet_54s_quantized). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.ffnet_54s_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of FFNet-54S-Quantized can be found +* The license for the original implementation of FFNet-54S-Quantized can be found [here](https://github.com/Qualcomm-AI-research/FFNet/blob/master/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Simple and Efficient Architectures for Semantic Segmentation](https://arxiv.org/abs/2206.08236) * [Source Model Implementation](https://github.com/Qualcomm-AI-research/FFNet) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/ffnet_54s_quantized/export.py b/qai_hub_models/models/ffnet_54s_quantized/export.py index 83412cea..333e5f15 100644 --- a/qai_hub_models/models/ffnet_54s_quantized/export.py +++ b/qai_hub_models/models/ffnet_54s_quantized/export.py @@ -10,17 +10,17 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.ffnet_54s_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.printing import ( print_inference_metrics, @@ -45,20 +45,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -80,10 +78,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
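For callers, the practical effect of this signature change is that positional tuple unpacking gives way to attribute access on the returned struct. A minimal sketch of the new call pattern (the device name is illustrative, and the condition under which a `List[str]` comes back is not shown in this hunk):

```python
from qai_hub_models.models.ffnet_54s_quantized.export import export_model

# Old: compile_job, profile_job, inference_job = export_model(...)
# New: the same jobs come back as named attributes on an ExportResult.
result = export_model(device="Samsung Galaxy S23")  # illustrative device choice
if not isinstance(result, list):   # the List[str] branch is handled elsewhere
    print(result.compile_job)      # always populated
    print(result.profile_job)      # None when skip_profiling=True
    print(result.inference_job)    # None when skip_inferencing=True
```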
""" model_name = "ffnet_54s_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,7 +107,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( target_runtime, output_path, input_spec ) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -135,7 +133,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +148,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +169,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -198,7 +196,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/ffnet_54s_quantized/perf.yaml b/qai_hub_models/models/ffnet_54s_quantized/perf.yaml index cd23609b..6c399fe9 100644 --- a/qai_hub_models/models/ffnet_54s_quantized/perf.yaml +++ b/qai_hub_models/models/ffnet_54s_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,44 +20,41 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: FFNet-54S-Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 4787.0 - throughput: 208.89910173386255 + inference_time: 4790.0 + throughput: 208.76826722338205 estimated_peak_memory_range: - min: 671744 - max: 2785320 + min: 638976 + max: 3180224 primary_compute_unit: NPU precision: int8 layer_info: @@ -64,14 +62,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 120 - job_id: jmg9v3685 + job_id: jgdx1de6p job_status: Passed torchscript_onnx: - inference_time: 11208.0 - throughput: 89.22198429693077 + inference_time: 11064.0 + throughput: 90.38322487346349 estimated_peak_memory_range: - min: 131072 - max: 15625080 + min: 32768 + max: 16535720 primary_compute_unit: NPU precision: int8 layer_info: @@ -79,7 +77,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 217 - job_id: jwgoy1yk5 + job_id: jg9lnzymg job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -88,13 +86,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:37:57Z' + timestamp: '2024-10-15T00:40:42Z' - torchscript_onnx_tflite: inference_time: 3364.0 throughput: 297.2651605231867 estimated_peak_memory_range: - min: 659456 - max: 73994144 + min: 434176 + max: 75581088 primary_compute_unit: NPU precision: int8 layer_info: @@ -102,14 +100,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 120 - job_id: jnp10dr75 + job_id: j57yre0n5 job_status: Passed torchscript_onnx: - inference_time: 8324.0 - throughput: 120.1345506967804 + inference_time: 7967.0 + throughput: 125.51776076314799 
estimated_peak_memory_range: - min: 4820992 - max: 119541024 + min: 4702208 + max: 130219984 primary_compute_unit: NPU precision: int8 layer_info: @@ -117,7 +115,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 217 - job_id: j1pv313r5 + job_id: jp14znwnp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -126,13 +124,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:37:58Z' + timestamp: '2024-10-15T00:40:43Z' - torchscript_onnx_tflite: - inference_time: 4704.0 - throughput: 212.58503401360545 + inference_time: 31917.0 + throughput: 31.331265469812326 estimated_peak_memory_range: - min: 659456 - max: 1998880 + min: 696320 + max: 46927552 primary_compute_unit: NPU precision: int8 layer_info: @@ -140,22 +138,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 120 - job_id: jvgdwrjz5 + job_id: jpy13or0p job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:37:41Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-15T00:40:27Z' - torchscript_onnx_tflite: - inference_time: 5923.0 - throughput: 168.83336147222693 + inference_time: 201523.0 + throughput: 4.9622127499094395 estimated_peak_memory_range: - min: 675840 - max: 78116048 + min: 1114112 + max: 2992344 primary_compute_unit: NPU precision: int8 layer_info: @@ -163,22 +161,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 120 - job_id: jz57zjq9p + job_id: jp0z0m205 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: RB5 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:37:42Z' + chipset: QCS8250 Proxy + timestamp: '2024-10-15T00:40:29Z' - torchscript_onnx_tflite: - inference_time: 4761.0 - throughput: 210.03990758244066 + inference_time: 4692.0 + throughput: 213.12872975277068 estimated_peak_memory_range: - min: 638976 - max: 26838784 + min: 643072 + max: 13561432 primary_compute_unit: NPU precision: int8 layer_info: @@ -186,22 +184,45 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 120 - job_id: jqp4qxz1g + job_id: jp4lryk25 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:40:21Z' + - torchscript_onnx_tflite: + inference_time: 4711.0 + throughput: 212.26915729144557 + estimated_peak_memory_range: + min: 643072 + max: 2704944 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 120 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 120 + job_id: jprv3l8kg + job_status: Passed + reference_device_info: + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:37:43Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:40:25Z' - torchscript_onnx_tflite: - inference_time: 4859.0 - throughput: 205.80366330520684 + inference_time: 4773.0 + throughput: 209.51183741881417 estimated_peak_memory_range: - min: 692224 - max: 2767248 + min: 12288 + max: 18584840 primary_compute_unit: NPU precision: int8 layer_info: @@ -209,7 +230,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 120 - job_id: j0pxv7wlg + job_id: jgn6vzlj5 job_status: Passed reference_device_info: 
name: SA8775 (Proxy) @@ -217,14 +238,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:37:44Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:40:24Z' - torchscript_onnx_tflite: - inference_time: 4696.0 - throughput: 212.94718909710392 + inference_time: 4698.0 + throughput: 212.85653469561515 estimated_peak_memory_range: - min: 647168 - max: 2673432 + min: 651264 + max: 12307840 primary_compute_unit: NPU precision: int8 layer_info: @@ -232,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 120 - job_id: jo5mrwj9g + job_id: j5mnx0q7p job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:37:45Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:40:23Z' - torchscript_onnx_tflite: - inference_time: 29982.0 - throughput: 33.35334534053766 + inference_time: 5950.0 + throughput: 168.0672268907563 estimated_peak_memory_range: - min: 1241088 - max: 46676736 + min: 671744 + max: 79706752 primary_compute_unit: NPU precision: int8 layer_info: @@ -255,22 +276,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 120 - job_id: jegn29jqg + job_id: jpxkoln85 job_status: Passed reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T12:37:46Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:40:22Z' - torchscript_onnx_tflite: - inference_time: 202726.0 - throughput: 4.932766394049111 + inference_time: 2871.0 + throughput: 348.31069313827936 estimated_peak_memory_range: - min: 970752 - max: 2884360 + min: 634880 + max: 35734560 primary_compute_unit: NPU precision: int8 layer_info: @@ -278,22 +299,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 120 - job_id: joprk4z75 + job_id: jp8qyemqp + job_status: Passed + torchscript_onnx: + inference_time: 7316.0 + throughput: 136.6867140513942 + estimated_peak_memory_range: + min: 7585792 + max: 68578848 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 217 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 217 + job_id: jp4lryd25 job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T12:37:47Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:40:46Z' - torchscript_onnx: - inference_time: 11477.0 - throughput: 87.13078330574191 + inference_time: 11112.0 + throughput: 89.99280057595392 estimated_peak_memory_range: - min: 14020608 - max: 14020608 + min: 13795328 + max: 13795328 primary_compute_unit: NPU precision: int8 layer_info: @@ -301,7 +337,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 217 - job_id: j7gjx0xep + job_id: jgdx1dq6p job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -310,4 +346,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:37:59Z' + timestamp: '2024-10-15T00:40:44Z' diff --git a/qai_hub_models/models/ffnet_78s/README.md b/qai_hub_models/models/ffnet_78s/README.md index 0f2d79dc..15383f97 100644 --- a/qai_hub_models/models/ffnet_78s/README.md +++ 
b/qai_hub_models/models/ffnet_78s/README.md @@ -6,7 +6,7 @@ FFNet-78S is a "fuss-free network" that segments street scene images with per-pixel classes like road, sidewalk, and pedestrian. Trained on the Cityscapes dataset. This is based on the implementation of FFNet-78S found -[here](https://github.com/Qualcomm-AI-research/FFNet). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/ffnet_78s). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.ffnet_78s.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of FFNet-78S can be found +* The license for the original implementation of FFNet-78S can be found [here](https://github.com/Qualcomm-AI-research/FFNet/blob/master/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Simple and Efficient Architectures for Semantic Segmentation](https://arxiv.org/abs/2206.08236) * [Source Model Implementation](https://github.com/Qualcomm-AI-research/FFNet) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/ffnet_78s/export.py b/qai_hub_models/models/ffnet_78s/export.py index 6478f4eb..ba9b2de6 100644 --- a/qai_hub_models/models/ffnet_78s/export.py +++ b/qai_hub_models/models/ffnet_78s/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.ffnet_78s import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. 
Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "ffnet_78s" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/ffnet_78s/perf.yaml b/qai_hub_models/models/ffnet_78s/perf.yaml index db839eb8..4df0ad09 100644 --- a/qai_hub_models/models/ffnet_78s/perf.yaml +++ b/qai_hub_models/models/ffnet_78s/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: FFNet-78S performance_metrics: - torchscript_onnx_tflite: - inference_time: 23714.0 - throughput: 42.16918276123809 + inference_time: 23254.0 + throughput: 43.00335426163241 estimated_peak_memory_range: - min: 2568192 - max: 4545248 + min: 2560000 + max: 4265008 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 149 - job_id: jvgdwrkz5 + job_id: jp14zn27p job_status: Passed torchscript_onnx_qnn: - inference_time: 23669.0 - throughput: 42.24935569732562 + inference_time: 23635.0 + throughput: 42.31013327691982 estimated_peak_memory_range: - min: 2555904 - max: 24837272 + min: 25231360 + max: 49994480 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 235 - job_id: jep2871qp + job_id: j5mnx0e7p job_status: Passed torchscript_onnx: - inference_time: 33511.0 - throughput: 29.840947748500493 + inference_time: 32700.0 + throughput: 30.581039755351682 estimated_peak_memory_range: - min: 25206784 - max: 27484456 + min: 25268224 + max: 57157440 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 237 - job_id: j1p3k4yn5 + job_id: j56y48enp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:36:47Z' + timestamp: '2024-10-15T00:39:25Z' - torchscript_onnx_tflite: - inference_time: 21171.0 - throughput: 47.23442444853809 + inference_time: 21162.0 + throughput: 47.25451280597297 estimated_peak_memory_range: - min: 2560000 - max: 121156992 + min: 2543616 + max: 136007200 primary_compute_unit: NPU 
precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 149 - job_id: jz57zjm9p + job_id: jgdx1dnzp job_status: Passed torchscript_onnx_qnn: - inference_time: 21243.0 - throughput: 47.07433036765052 + inference_time: 21440.0 + throughput: 46.64179104477612 estimated_peak_memory_range: - min: 21016576 - max: 58204224 + min: 21008384 + max: 63699088 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 235 - job_id: jqpye4llg + job_id: jgn6vz0j5 job_status: Passed torchscript_onnx: - inference_time: 28905.0 - throughput: 34.596090641757485 + inference_time: 29103.0 + throughput: 34.36071882623784 estimated_peak_memory_range: - min: 1335296 - max: 139201984 + min: 2121728 + max: 159953264 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 237 - job_id: jwgoy1jk5 + job_id: jp3j0zvmg job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:36:48Z' + timestamp: '2024-10-15T00:39:26Z' - torchscript_onnx_tflite: - inference_time: 23807.0 - throughput: 42.004452471962026 + inference_time: 23104.0 + throughput: 43.282548476454295 estimated_peak_memory_range: - min: 2539520 - max: 8656216 + min: 2560000 + max: 4869240 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 149 - job_id: jqp4qx71g + job_id: j5we6lw45 job_status: Passed torchscript_onnx_qnn: - inference_time: 23508.0 - throughput: 42.53871022630594 + inference_time: 23037.0 + throughput: 43.4084299170899 estimated_peak_memory_range: - min: 25264128 - max: 26553344 + min: 25272320 + max: 26476200 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 235 - job_id: j1p8o3nog + job_id: jp2kyrx6p job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:36:42Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:39:17Z' - torchscript_onnx_tflite: - inference_time: 39381.0 - throughput: 25.392955994007263 + inference_time: 23077.0 + throughput: 43.33318888937037 estimated_peak_memory_range: - min: 2699264 - max: 106880720 + min: 2572288 + max: 4742120 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 149 - job_id: j0pxv7qlg + job_id: j57yre2n5 job_status: Passed torchscript_onnx_qnn: - inference_time: 39522.0 - throughput: 25.302363240726685 + inference_time: 23073.0 + throughput: 43.34070125254627 estimated_peak_memory_range: - min: 25300992 - max: 55599376 + min: 25268224 + max: 26886872 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 235 - job_id: jw5663ky5 + job_id: jp8qye0qp job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:36:46Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:39:20Z' - torchscript_onnx_tflite: - inference_time: 23157.0 - throughput: 
43.18348663471089 + inference_time: 23169.0 + throughput: 43.16112046268721 estimated_peak_memory_range: - min: 2547712 - max: 4615128 + min: 2543616 + max: 4705344 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 149 - job_id: jo5mrw79g + job_id: jgdx1dn6p job_status: Passed torchscript_onnx_qnn: - inference_time: 23725.0 - throughput: 42.14963119072708 + inference_time: 23407.0 + throughput: 42.72226257102576 estimated_peak_memory_range: - min: 25309184 - max: 26654208 + min: 25280512 + max: 26525808 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 235 - job_id: jogkzl1ng + job_id: jp0z0m305 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:36:43Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:39:19Z' - torchscript_onnx_tflite: - inference_time: 23288.0 - throughput: 42.94057025077293 + inference_time: 23240.0 + throughput: 43.029259896729776 estimated_peak_memory_range: - min: 2547712 - max: 4718832 + min: 2555904 + max: 4660656 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 149 - job_id: jegn294qg + job_id: jp14zn2np job_status: Passed torchscript_onnx_qnn: - inference_time: 23516.0 - throughput: 42.52423881612519 + inference_time: 23501.0 + throughput: 42.55138079230671 estimated_peak_memory_range: min: 25284608 - max: 26604224 + max: 29540720 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 235 - job_id: jn5q87no5 + job_id: jpy13oz0p job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:36:44Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:39:18Z' - torchscript_onnx_tflite: - inference_time: 23346.0 - throughput: 42.833890173905594 + inference_time: 39261.0 + throughput: 25.47056875780036 estimated_peak_memory_range: - min: 2560000 - max: 4387696 + min: 1220608 + max: 113735920 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 149 - job_id: joprk4r75 + job_id: jg9lnz0mg job_status: Passed torchscript_onnx_qnn: - inference_time: 23922.0 - throughput: 41.802524872502296 + inference_time: 39234.0 + throughput: 25.4880970586736 estimated_peak_memory_range: - min: 25272320 - max: 26547768 + min: 25079808 + max: 61194848 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 235 - job_id: j1gln0jmp + job_id: j5q6qleep job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:39:23Z' + - torchscript_onnx_tflite: + inference_time: 16614.0 + throughput: 60.19020103527146 + estimated_peak_memory_range: + min: 2174976 + max: 57745728 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 149 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 149 + job_id: jpxkol985 + job_status: Passed + 
torchscript_onnx_qnn: + inference_time: 16705.0 + throughput: 59.86231667165519 + estimated_peak_memory_range: + min: 25178112 + max: 67295488 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 235 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 235 + job_id: jglvmy625 + job_status: Passed + torchscript_onnx: + inference_time: 20857.0 + throughput: 47.94553387351968 + estimated_peak_memory_range: + min: 27004928 + max: 90335184 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 237 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 237 + job_id: jgjvnrz1g + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:36:45Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:39:29Z' - torchscript_onnx_qnn: - inference_time: 24051.0 - throughput: 41.57831275206852 + inference_time: 23035.0 + throughput: 43.41219882787063 estimated_peak_memory_range: min: 25219072 max: 25219072 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 235 - job_id: j2p0y1wng + job_id: jprv3l6kg job_status: Passed torchscript_onnx: - inference_time: 33007.0 - throughput: 30.296603750719544 + inference_time: 36440.0 + throughput: 27.442371020856204 estimated_peak_memory_range: - min: 33652736 - max: 33652736 + min: 32493568 + max: 32493568 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 237 - job_id: j1pv31jr5 + job_id: jgo26lk1p job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:36:49Z' + timestamp: '2024-10-15T00:39:27Z' diff --git a/qai_hub_models/models/ffnet_78s_lowres/README.md b/qai_hub_models/models/ffnet_78s_lowres/README.md index d39d8172..cb86eb48 100644 --- a/qai_hub_models/models/ffnet_78s_lowres/README.md +++ b/qai_hub_models/models/ffnet_78s_lowres/README.md @@ -6,7 +6,7 @@ FFNet-78S-LowRes is a "fuss-free network" that segments street scene images with per-pixel classes like road, sidewalk, and pedestrian. Trained on the Cityscapes dataset. This is based on the implementation of FFNet-78S-LowRes found -[here](https://github.com/Qualcomm-AI-research/FFNet). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/ffnet_78s_lowres). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.ffnet_78s_lowres.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of FFNet-78S-LowRes can be found +* The license for the original implementation of FFNet-78S-LowRes can be found [here](https://github.com/Qualcomm-AI-research/FFNet/blob/master/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Simple and Efficient Architectures for Semantic Segmentation](https://arxiv.org/abs/2206.08236) * [Source Model Implementation](https://github.com/Qualcomm-AI-research/FFNet) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/ffnet_78s_lowres/export.py b/qai_hub_models/models/ffnet_78s_lowres/export.py index 1d8469f1..42c7e301 100644 --- a/qai_hub_models/models/ffnet_78s_lowres/export.py +++ b/qai_hub_models/models/ffnet_78s_lowres/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.ffnet_78s_lowres import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
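`ExportResult` itself comes from `qai_hub_models.models.common`, which these diffs import but never show. Judging only by how it is constructed in these export scripts, it is a plain holder for the three job handles; a speculative sketch (the dataclass form and the defaults are our assumptions, not confirmed by this diff):

```python
from dataclasses import dataclass
from typing import Optional

import qai_hub as hub


@dataclass
class ExportResult:
    # Assumed shape, inferred from ExportResult(compile_job=...,
    # inference_job=..., profile_job=...) in these export scripts.
    # The real definition lives in qai_hub_models.models.common.
    compile_job: Optional[hub.CompileJob] = None
    profile_job: Optional[hub.ProfileJob] = None
    inference_job: Optional[hub.InferenceJob] = None
```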
""" model_name = "ffnet_78s_lowres" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/ffnet_78s_lowres/perf.yaml b/qai_hub_models/models/ffnet_78s_lowres/perf.yaml index 014f110d..3c41bf2e 100644 --- a/qai_hub_models/models/ffnet_78s_lowres/perf.yaml +++ b/qai_hub_models/models/ffnet_78s_lowres/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: FFNet-78S-LowRes performance_metrics: - torchscript_onnx_tflite: - inference_time: 8311.0 - throughput: 120.3224642040669 + inference_time: 8330.0 + throughput: 120.04801920768307 estimated_peak_memory_range: - min: 651264 - max: 2762192 + min: 638976 + max: 2660728 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 149 - job_id: joprk4m05 + job_id: jpedm7dv5 job_status: Passed torchscript_onnx_qnn: - inference_time: 8366.0 - throughput: 119.53143676786995 + inference_time: 8398.0 + throughput: 119.07597046915933 estimated_peak_memory_range: min: 6311936 - max: 35352632 + max: 29553584 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 236 - job_id: j1gln09jp + job_id: j5mnx0n9p job_status: Passed torchscript_onnx: - inference_time: 8924.0 - throughput: 112.05737337516808 + inference_time: 8025.0 + throughput: 124.61059190031153 estimated_peak_memory_range: - min: 6307840 - max: 8953200 + min: 6320128 + max: 9269528 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 238 - job_id: jz5wodk3p + job_id: j56y482yp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:36:00Z' + timestamp: '2024-10-15T00:38:28Z' - torchscript_onnx_tflite: - inference_time: 6668.0 - throughput: 149.97000599880025 + inference_time: 6595.0 + throughput: 151.6300227445034 estimated_peak_memory_range: min: 655360 - max: 59796448 + max: 63719536 primary_compute_unit: NPU precision: fp16 
layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 149 - job_id: jep287qrp + job_id: jgz3dl3x5 job_status: Passed torchscript_onnx_qnn: - inference_time: 6652.0 - throughput: 150.3307276007216 + inference_time: 7132.0 + throughput: 140.21312394840157 estimated_peak_memory_range: min: 6307840 - max: 32223456 + max: 33817856 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 236 - job_id: jw5663965 + job_id: jgn6vz6q5 job_status: Passed torchscript_onnx: - inference_time: 7365.0 - throughput: 135.77732518669382 + inference_time: 6923.0 + throughput: 144.4460494005489 estimated_peak_memory_range: - min: 7581696 - max: 80764752 + min: 2400256 + max: 85574912 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 238 - job_id: jmg9v3rw5 + job_id: jp3j0znng job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:36:01Z' + timestamp: '2024-10-15T00:38:29Z' - torchscript_onnx_tflite: - inference_time: 8162.0 - throughput: 122.51899044351875 + inference_time: 8303.0 + throughput: 120.43839576056847 estimated_peak_memory_range: - min: 651264 - max: 2121592 + min: 188416 + max: 21398488 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 149 - job_id: jqpye4k8g + job_id: j5we6lem5 job_status: Passed torchscript_onnx_qnn: - inference_time: 7694.0 - throughput: 129.97140629061607 + inference_time: 7635.0 + throughput: 130.97576948264572 estimated_peak_memory_range: - min: 6361088 - max: 7466096 + min: 6365184 + max: 7587240 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 236 - job_id: jwgoy17q5 + job_id: jp2kyrkqp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:35:55Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:38:21Z' - torchscript_onnx_tflite: - inference_time: 12057.0 - throughput: 82.9393713195654 + inference_time: 8344.0 + throughput: 119.84659635666347 estimated_peak_memory_range: - min: 12288 - max: 52558496 + min: 0 + max: 1809984 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 149 - job_id: j2p0y189g + job_id: j57yrey95 job_status: Passed torchscript_onnx_qnn: - inference_time: 12604.0 - throughput: 79.33989209774674 + inference_time: 7638.0 + throughput: 130.92432573972243 estimated_peak_memory_range: - min: 6307840 - max: 24217200 + min: 6356992 + max: 8020696 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 236 - job_id: jygzex6og + job_id: jp8qyeqop job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:35:59Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:38:24Z' - torchscript_onnx_tflite: - inference_time: 8301.0 - throughput: 120.46741356463076 + inference_time: 8181.0 + throughput: 
122.2344456667889 estimated_peak_memory_range: - min: 49152 - max: 2351024 + min: 24576 + max: 4455456 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 149 - job_id: j1p8o3dkg + job_id: jgdx1dxzp job_status: Passed torchscript_onnx_qnn: - inference_time: 7841.0 - throughput: 127.53475322025253 + inference_time: 7735.0 + throughput: 129.2824822236587 estimated_peak_memory_range: - min: 6365184 - max: 7682824 + min: 6393856 + max: 7584728 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 236 - job_id: j1pv318k5 + job_id: jp0z0mzn5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:35:56Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:38:23Z' - torchscript_onnx_tflite: - inference_time: 8278.0 - throughput: 120.80212611741966 + inference_time: 8180.0 + throughput: 122.24938875305624 estimated_peak_memory_range: - min: 671744 - max: 2711488 + min: 172032 + max: 9611728 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 149 - job_id: jogkzlwwg + job_id: jp14zn47p job_status: Passed torchscript_onnx_qnn: - inference_time: 7693.0 - throughput: 129.98830105290523 + inference_time: 7747.0 + throughput: 129.08222537756552 estimated_peak_memory_range: - min: 6393856 - max: 7764184 + min: 6414336 + max: 7559240 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 236 - job_id: j7gjx09vp + job_id: jpy13o1lp job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:35:57Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:38:22Z' - torchscript_onnx_tflite: - inference_time: 8301.0 - throughput: 120.46741356463076 + inference_time: 11977.0 + throughput: 83.49336227769892 estimated_peak_memory_range: - min: 16384 - max: 1772056 + min: 663552 + max: 56111328 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 149 - job_id: jn5q87xn5 + job_id: jg9lnzl8g job_status: Passed torchscript_onnx_qnn: - inference_time: 7825.0 - throughput: 127.79552715654953 + inference_time: 12509.0 + throughput: 79.94244144216164 estimated_peak_memory_range: - min: 6369280 - max: 7692072 + min: 6320128 + max: 26534272 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 236 - job_id: jlpe9rqog + job_id: j5q6ql6op job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:38:26Z' + - torchscript_onnx_tflite: + inference_time: 5656.0 + throughput: 176.8033946251768 + estimated_peak_memory_range: + min: 57344 + max: 30200688 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 149 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 149 + job_id: jpxkolkl5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 5912.0 + throughput: 
169.14749661705008 + estimated_peak_memory_range: + min: 6303744 + max: 27599472 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 236 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 236 + job_id: jglvmy4m5 + job_status: Passed + torchscript_onnx: + inference_time: 4997.0 + throughput: 200.12007204322595 + estimated_peak_memory_range: + min: 7557120 + max: 52304048 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 238 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 238 + job_id: jgjvnrdeg + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:35:58Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:38:32Z' - torchscript_onnx_qnn: - inference_time: 8193.0 - throughput: 122.05541315757354 + inference_time: 8198.0 + throughput: 121.98097096852891 estimated_peak_memory_range: min: 6303744 max: 6303744 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 236 - job_id: j1p3k4l35 + job_id: jprv3lv7g job_status: Passed torchscript_onnx: - inference_time: 8769.0 - throughput: 114.03808872163303 + inference_time: 8793.0 + throughput: 113.72682815876266 estimated_peak_memory_range: - min: 50311168 - max: 50311168 + min: 50634752 + max: 50634752 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 238 - job_id: jnp10d985 + job_id: jgo26lzkp job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:36:02Z' + timestamp: '2024-10-15T00:38:30Z' diff --git a/qai_hub_models/models/ffnet_78s_quantized/README.md b/qai_hub_models/models/ffnet_78s_quantized/README.md index f2c25927..f370cdde 100644 --- a/qai_hub_models/models/ffnet_78s_quantized/README.md +++ b/qai_hub_models/models/ffnet_78s_quantized/README.md @@ -6,7 +6,7 @@ FFNet-78S-Quantized is a "fuss-free network" that segments street scene images with per-pixel classes like road, sidewalk, and pedestrian. Trained on the Cityscapes dataset. This is based on the implementation of FFNet-78S-Quantized found -[here](https://github.com/Qualcomm-AI-research/FFNet). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/ffnet_78s_quantized). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.ffnet_78s_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of FFNet-78S-Quantized can be found +* The license for the original implementation of FFNet-78S-Quantized can be found [here](https://github.com/Qualcomm-AI-research/FFNet/blob/master/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Simple and Efficient Architectures for Semantic Segmentation](https://arxiv.org/abs/2206.08236) * [Source Model Implementation](https://github.com/Qualcomm-AI-research/FFNet) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/ffnet_78s_quantized/export.py b/qai_hub_models/models/ffnet_78s_quantized/export.py index 8b6958f4..cc26a492 100644 --- a/qai_hub_models/models/ffnet_78s_quantized/export.py +++ b/qai_hub_models/models/ffnet_78s_quantized/export.py @@ -10,17 +10,17 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.ffnet_78s_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.printing import ( print_inference_metrics, @@ -44,20 +44,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -79,10 +77,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
""" model_name = "ffnet_78s_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -108,7 +106,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -119,7 +117,7 @@ def export_model( target_runtime, output_path, input_spec ) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -194,7 +192,11 @@ def export_model( inference_job, inference_result, torch_out, model.get_output_names() ) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/ffnet_78s_quantized/perf.yaml b/qai_hub_models/models/ffnet_78s_quantized/perf.yaml index 925abbd0..b9089de1 100644 --- a/qai_hub_models/models/ffnet_78s_quantized/perf.yaml +++ b/qai_hub_models/models/ffnet_78s_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,44 +20,41 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: FFNet-78S-Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 5845.0 - throughput: 171.0863986313088 + inference_time: 5745.0 + throughput: 174.06440382941688 estimated_peak_memory_range: - min: 643072 - max: 2934216 + min: 12288 + max: 2606296 primary_compute_unit: NPU precision: int8 layer_info: @@ -64,14 +62,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 156 - job_id: jegn29wkg + job_id: jglvmymm5 job_status: Passed torchscript_onnx: - inference_time: 13450.0 - throughput: 74.34944237918215 + inference_time: 11963.0 + throughput: 83.59107247345983 estimated_peak_memory_range: - min: 94208 - max: 24616864 + min: 126976 + max: 24740320 primary_compute_unit: NPU precision: int8 layer_info: @@ -79,7 +77,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 301 - job_id: jnp10dk85 + job_id: jgkex2xng job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -88,13 +86,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:35:15Z' + timestamp: '2024-10-15T00:37:32Z' - torchscript_onnx_tflite: - inference_time: 4063.0 - throughput: 246.1235540241201 + inference_time: 4089.0 + throughput: 244.5585717779408 estimated_peak_memory_range: min: 638976 - max: 88888720 + max: 90400272 primary_compute_unit: NPU precision: int8 layer_info: @@ -102,14 +100,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 156 - job_id: joprk4705 + job_id: j56y484yp job_status: Passed torchscript_onnx: - inference_time: 10176.0 - throughput: 98.27044025157232 + inference_time: 8580.0 + throughput: 
116.55011655011656 estimated_peak_memory_range: - min: 4907008 - max: 143410896 + min: 5042176 + max: 158745840 primary_compute_unit: NPU precision: int8 layer_info: @@ -117,7 +115,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 301 - job_id: jvgdwryr5 + job_id: j5q6qlqop job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -126,13 +124,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:35:16Z' + timestamp: '2024-10-15T00:37:33Z' - torchscript_onnx_tflite: - inference_time: 5706.0 - throughput: 175.2541184717841 + inference_time: 35597.0 + throughput: 28.092254965306065 estimated_peak_memory_range: - min: 626688 - max: 14584768 + min: 700416 + max: 50412352 primary_compute_unit: NPU precision: int8 layer_info: @@ -140,22 +138,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 156 - job_id: jep287zrp + job_id: j5we6l6m5 job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:34:59Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-15T00:37:17Z' - torchscript_onnx_tflite: - inference_time: 7065.0 - throughput: 141.54281670205236 + inference_time: 218427.0 + throughput: 4.578188593900937 estimated_peak_memory_range: - min: 995328 - max: 91954720 + min: 905216 + max: 3281984 primary_compute_unit: NPU precision: int8 layer_info: @@ -163,22 +161,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 156 - job_id: jqpye4y8g + job_id: jg9lnzn8g job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: RB5 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:35:00Z' + chipset: QCS8250 Proxy + timestamp: '2024-10-15T00:37:18Z' - torchscript_onnx_tflite: - inference_time: 5779.0 - throughput: 173.04031839418585 + inference_time: 5683.0 + throughput: 175.96339961288052 estimated_peak_memory_range: - min: 655360 - max: 2276784 + min: 638976 + max: 2532944 primary_compute_unit: NPU precision: int8 layer_info: @@ -186,22 +184,45 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 156 - job_id: j2p0y1x9g + job_id: jp3j0z0ng job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:37:10Z' + - torchscript_onnx_tflite: + inference_time: 5752.0 + throughput: 173.85257301808068 + estimated_peak_memory_range: + min: 651264 + max: 2770552 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 156 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 156 + job_id: jpedm7mv5 + job_status: Passed + reference_device_info: + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:35:01Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:37:15Z' - torchscript_onnx_tflite: - inference_time: 5823.0 - throughput: 171.73278378842522 + inference_time: 5787.0 + throughput: 172.80110592707794 estimated_peak_memory_range: - min: 28672 - max: 2202960 + min: 20480 + max: 2510840 primary_compute_unit: NPU precision: int8 layer_info: @@ -209,7 +230,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 156 - job_id: j1p8o3kkg + job_id: jgjvnrneg job_status: Passed 
reference_device_info: name: SA8775 (Proxy) @@ -217,14 +238,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:35:02Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:37:13Z' - torchscript_onnx_tflite: - inference_time: 5701.0 - throughput: 175.40782318891422 + inference_time: 5717.0 + throughput: 174.91691446562882 estimated_peak_memory_range: - min: 663552 - max: 2493992 + min: 655360 + max: 2904768 primary_compute_unit: NPU precision: int8 layer_info: @@ -232,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 156 - job_id: jogkzlkwg + job_id: jpv6klkr5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:35:03Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:37:12Z' - torchscript_onnx_tflite: - inference_time: 35341.0 - throughput: 28.295747149203475 + inference_time: 7035.0 + throughput: 142.14641080312722 estimated_peak_memory_range: - min: 12288 - max: 48758736 + min: 868352 + max: 94568160 primary_compute_unit: NPU precision: int8 layer_info: @@ -255,22 +276,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 156 - job_id: jn5q87dn5 + job_id: jgo26l6kp job_status: Passed reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T12:35:04Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:37:11Z' - torchscript_onnx_tflite: - inference_time: 222511.0 - throughput: 4.494159839288844 + inference_time: 3501.0 + throughput: 285.6326763781777 estimated_peak_memory_range: - min: 888832 - max: 12590936 + min: 659456 + max: 40289216 primary_compute_unit: NPU precision: int8 layer_info: @@ -278,22 +299,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 156 - job_id: j1gln0qjp + job_id: jp14znz7p + job_status: Passed + torchscript_onnx: + inference_time: 8055.0 + throughput: 124.14649286157666 + estimated_peak_memory_range: + min: 7507968 + max: 80165664 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 301 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 301 + job_id: jp3j0zjng job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T12:35:05Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:37:36Z' - torchscript_onnx: - inference_time: 14011.0 - throughput: 71.37249304118193 + inference_time: 12351.0 + throughput: 80.96510404015869 estimated_peak_memory_range: - min: 23351296 - max: 23351296 + min: 23576576 + max: 23576576 primary_compute_unit: NPU precision: int8 layer_info: @@ -301,7 +337,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 301 - job_id: jz57zj1vp + job_id: jglvmyvm5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -310,4 +346,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:35:17Z' + timestamp: '2024-10-15T00:37:35Z' diff --git a/qai_hub_models/models/foot_track_net/README.md b/qai_hub_models/models/foot_track_net/README.md new file mode 100644 index 00000000..f073400f --- /dev/null +++ 
b/qai_hub_models/models/foot_track_net/README.md
@@ -0,0 +1,59 @@
+[![Qualcomm® AI Hub Models](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/quic-logo.jpg)](../../README.md)
+
+
+# [Person-Foot-Detection: Multi-task Human Detector](https://aihub.qualcomm.com/models/foot_track_net)
+
+FootTrackNet can detect person and face bounding boxes, head and feet landmark locations, and feet visibility.
+
+This is based on the implementation of Person-Foot-Detection found
+[here]({source_repo}). This repository contains scripts for optimized on-device
+export suitable to run on Qualcomm® devices. More details on model performance
+across various devices can be found [here](https://aihub.qualcomm.com/models/foot_track_net).
+
+[Sign up](https://myaccount.qualcomm.com/signup) to start using Qualcomm AI Hub and run these models on a hosted Qualcomm® device.
+
+
+
+
+## Example & Usage
+
+
+Once installed, run the following simple CLI demo:
+
+```bash
+python -m qai_hub_models.models.foot_track_net.demo
+```
+More details on the CLI tool can be found with the `--help` option. See
+[demo.py](demo.py) for sample usage of the model including pre/post processing
+scripts. Please refer to our [general instructions on using
+models](../../../#getting-started) for more usage instructions.
+
+## Export for on-device deployment
+
+This repository contains export scripts that produce a model optimized for
+on-device deployment. This can be run as follows:
+
+```bash
+python -m qai_hub_models.models.foot_track_net.export
+```
+Additional options are documented with the `--help` option. Note that the above
+script requires access to Deployment instructions for Qualcomm® AI Hub.
+
+
+## License
+* The license for the original implementation of Person-Foot-Detection can be found
+  [here](https://github.com/qcom-ai-hub/ai-hub-models-internal/blob/main/LICENSE).
+* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf)
+
+
+## References
+* [Source Model Implementation](https://github.com/quic/ai-hub-models/blob/main/qai_hub_models/models/foot_track_net/model.py)
+
+
+
+## Community
+* Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI.
+* For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
+
+
diff --git a/qai_hub_models/models/foot_track_net/__init__.py b/qai_hub_models/models/foot_track_net/__init__.py
new file mode 100644
index 00000000..a300223f
--- /dev/null
+++ b/qai_hub_models/models/foot_track_net/__init__.py
@@ -0,0 +1,7 @@
+# ---------------------------------------------------------------------
+# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+from .app import FootTrackNet_App as App  # noqa: F401
+from .model import MODEL_ID  # noqa: F401
+from .model import FootTrackNet_model as Model  # noqa: F401
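The `__init__.py` above re-exports the app and model classes under the package-level names `App` and `Model` used throughout this repository. A minimal import sketch, assuming the default pretrained weights resolve via `from_pretrained()`:

```python
# Package-level aliases give a uniform entry point across models.
from qai_hub_models.models.foot_track_net import App, Model

model = Model.from_pretrained()  # assumes default weights are downloadable
app = App(model)                 # wraps pre/post-processing around the network
```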
diff --git a/qai_hub_models/models/foot_track_net/app.py b/qai_hub_models/models/foot_track_net/app.py
new file mode 100644
index 00000000..fbc15d48
--- /dev/null
+++ b/qai_hub_models/models/foot_track_net/app.py
@@ -0,0 +1,366 @@
+# ---------------------------------------------------------------------
+# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+from __future__ import annotations
+
+from typing import Callable, List, Tuple
+
+import cv2
+import numpy as np
+import torch
+from PIL import Image
+
+from qai_hub_models.utils.bounding_box_processing import get_iou
+from qai_hub_models.utils.image_processing import (
+    app_to_net_image_inputs,
+    normalize_image_transform,
+)
+
+CLASSNAME_TO_ID_MAP = {"face": 0, "person": 1}
+
+
+def id_to_classname(id: int) -> str:
+    """Traverse CLASSNAME_TO_ID_MAP and return the class name for the given ID."""
+    for k, v in CLASSNAME_TO_ID_MAP.items():
+        if v == id:
+            return k
+
+
+def restructure_topk(scores: torch.Tensor, K: int = 20) -> tuple:
+    """
+    Customized top-k for FootTrackNet: run top-k over the flattened heatmap,
+    then decode the class id and spatial coordinates back from the flat indices.
+    parameters:
+        scores: the heatmap scores, shape (batch, classes, height, width).
+        K: how many top detections to keep.
+    return:
+        topk_scores: the scores of the top k detections.
+        topk_inds: the flat spatial index of each top k detection.
+        topk_clses: the class id of each top k detection.
+        topk_ys: the y coordinate of each top k detection.
+        topk_xs: the x coordinate of each top k detection.
+    """
+    batch, cat, height, width = scores.size()
+
+    topk_scores, topk_inds = torch.topk(
+        scores.reshape(batch, -1), min(K, batch * cat * height * width)
+    )
+    topk_clses = (topk_inds // (height * width)).int()
+
+    topk_inds = topk_inds % (height * width)
+    topk_ys = (topk_inds // width).int().float()
+    topk_xs = (topk_inds % width).int().float()
+    return topk_scores, topk_inds, topk_clses, topk_ys, topk_xs
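The flat-index decode above is plain integer arithmetic: `class = i // (H * W)`, `y = (i % (H * W)) // W`, `x = i % W`. A toy check with an illustrative 2-class 4x5 heatmap:

```python
# Toy check of the flat-index decode: class = i // (H*W), y = rem // W, x = rem % W.
import torch

scores = torch.zeros(1, 2, 4, 5)  # (batch, classes, H, W)
scores[0, 1, 2, 3] = 0.9          # peak at class 1, y=2, x=3
_, inds = torch.topk(scores.reshape(1, -1), 1)
i = int(inds[0, 0])               # 1*(4*5) + 2*5 + 3 = 33
cls, rem = divmod(i, 4 * 5)
y, x = divmod(rem, 5)
assert (cls, y, x) == (1, 2, 3)
```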
+
+
+class BBox_landmarks:
+    def __init__(
+        self,
+        label: str,
+        xyrb: list | np.ndarray,
+        score: float | int = 0,
+        landmark: list | np.ndarray | None = None,
+        vis: list | np.ndarray | None = None,
+    ):
+        """
+        A bounding box plus landmarks structure to hold the hierarchical result.
+        parameters:
+            label: str, the class label
+            xyrb: length-4 array or list of bbox left, top, right, bottom coordinates
+            score: the score of the detection
+            landmark: 17x2 landmarks of the joints, [[x1, y1], [x2, y2], ...]
+            vis: length-17 visibility of the joints.
+        """
+        self.label = label
+        self.score = score
+        self.landmark = landmark
+        self.vis = vis
+        self.x, self.y, self.r, self.b = xyrb
+        minx = min(self.x, self.r)
+        maxx = max(self.x, self.r)
+        miny = min(self.y, self.b)
+        maxy = max(self.y, self.b)
+        self.x, self.y, self.r, self.b = minx, miny, maxx, maxy
+
+    @property
+    def label_prop(self):
+        return self.label
+
+    @property
+    def haslandmark(self):
+        return self.landmark is not None
+
+    @property
+    def box(self):
+        return [self.x, self.y, self.r, self.b]
+
+    @box.setter
+    def box(self, newvalue):
+        self.x, self.y, self.r, self.b = newvalue
+
+    @label_prop.setter
+    def label_prop(self, newvalue):
+        self.label = newvalue
+
+
+def nms_bbox_landmark(
+    objs: list[BBox_landmarks], iou: float = 0.5
+) -> list[BBox_landmarks]:
+    """
+    NMS customized to work on a list of BBox_landmarks objects.
+    parameters:
+        objs: the list of BBox_landmarks objects.
+        iou: IoU threshold above which a lower-scoring box is suppressed.
+    return:
+        the BBox_landmarks objects kept after NMS.
+    """
+    if objs is None or len(objs) <= 1:
+        return objs
+
+    objs = sorted(objs, key=lambda obj: obj.score, reverse=True)
+    keep = []
+    flags = [0] * len(objs)
+    for index, obj in enumerate(objs):
+        if flags[index] != 0:
+            continue
+
+        keep.append(obj)
+        for j in range(index + 1, len(objs)):
+            if (
+                flags[j] == 0
+                and get_iou(np.array(obj.box), np.array(objs[j].box)) > iou
+            ):
+                flags[j] = 1
+    return keep
+
+
+def drawbbox(
+    image: np.ndarray,
+    bbox: BBox_landmarks,
+    color: list | tuple | None = None,
+    thickness: int = 2,
+    landmarkcolor: tuple | list = (0, 0, 255),
+    visibility: list | np.ndarray | None = None,
+    joint_to_visualize: list = [0, 15, 16],
+    visibility_thresh: float = 0.05,
+) -> np.ndarray:
+    """
+    Draw a bounding box and landmarks on the input image based on a
+    BBox_landmarks detection result.
+    parameters:
+        image: the input image in cv2 format.
+        bbox: the detection result as a BBox_landmarks.
+        color: the color for the box.
+        thickness: the thickness of the box border.
+        landmarkcolor: the color for the landmarks.
+        visibility: the visibility of the landmarks.
+        joint_to_visualize: which joints to visualize.
+        visibility_thresh: threshold above which a landmark is drawn as visible.
+    return:
+        the image with the result drawn on it.
+    """
+
+    x, y, r, b = [int(bb + 0.5) for bb in np.array(bbox.box).astype(int)]
+    # 3DMM adjustment, reuse the bbox structure
+    if bbox.label_prop == 0:
+        cx, cy = (r + x) // 2, (b + y) // 2
+        offset = max(r - x, b - y) // 2
+        x2 = cx - offset
+        y2 = cy - offset
+        r2 = cx + offset
+        b2 = cy + offset
+        cv2.rectangle(image, (x2, y2, r2 - x2 + 1, b2 - y2 + 1), color, thickness, 16)
+
+    else:
+        cv2.rectangle(image, (x, y, r - x + 1, b - y + 1), color, thickness, 16)
+
+    if bbox.haslandmark:
+        for i in range(len(bbox.landmark)):
+            x, y = bbox.landmark[i][:2]
+
+            if not joint_to_visualize or i not in joint_to_visualize:
+                continue
+            if visibility is not None and visibility[i] > visibility_thresh:
+                cv2.circle(image, (int(x), int(y)), 4, landmarkcolor, -1, 16)
+            else:
+                cv2.circle(image, (int(x), int(y)), 4, (0, 0, 255), -1, 16)
+    return image
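A short usage sketch for `nms_bbox_landmark` with two hypothetical overlapping person boxes (coordinates are illustrative; `get_iou` is the same repository utility the function uses internally):

```python
# Two heavily overlapping person boxes; NMS keeps only the higher score.
import numpy as np

from qai_hub_models.models.foot_track_net.app import BBox_landmarks, nms_bbox_landmark

a = BBox_landmarks(label="1", xyrb=np.array([10, 10, 110, 210]), score=0.9)
b = BBox_landmarks(label="1", xyrb=np.array([12, 14, 112, 208]), score=0.6)
kept = nms_bbox_landmark([a, b], iou=0.5)  # IoU here is roughly 0.93 > 0.5
assert len(kept) == 1 and kept[0].score == 0.9
```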
+
+
+def detect_images_multiclass_fb(
+    output_hm: torch.Tensor,
+    output_tlrb: torch.Tensor,
+    output_landmark: torch.Tensor | None = None,
+    vis: torch.Tensor | None = None,
+    threshold: list | np.ndarray = [0.7, 0.7, 0.7],
+    stride: int = 4,
+    n_lmk: int = 17,
+) -> list:
+    """
+    Get the detection result from the model's raw output tensors.
+    parameters:
+        output_hm: N,C,H,W the model heatmap output.
+        output_tlrb: N,12,H,W the model bbox output.
+        output_landmark: N,34,H,W the model landmark output.
+        vis: N,17,H,W the model visibility output.
+        threshold: per-class score thresholds (length 3).
+        stride: the stride of the output map relative to the input.
+        n_lmk: the number of landmarks.
+    return:
+        detection result: list[BBox_landmarks]
+    """
+    _, num_classes, hm_height, hm_width = output_hm.shape
+    hm = output_hm[0].reshape(1, num_classes, hm_height, hm_width)
+    hm = hm[:, :2]
+
+    tlrb = (
+        output_tlrb[0]
+        .cpu()
+        .data.numpy()
+        .reshape(1, num_classes * 4, hm_height, hm_width)
+    )
+
+    landmark = output_landmark[0].cpu().data.numpy().reshape(1, -1, hm_height, hm_width)
+    vis = vis[0].cpu().data.numpy().reshape(1, -1, hm_height, hm_width)
+    nmskey = hm
+
+    kscore, kinds, kcls, kys, kxs = restructure_topk(nmskey, 1000)
+
+    kys = kys.cpu().data.numpy().astype(int)
+    kxs = kxs.cpu().data.numpy().astype(int)
+    kcls = kcls.cpu().data.numpy().astype(int)
+    kscore = kscore.cpu().data.numpy().astype(np.float32)
+    kinds = kinds.cpu().data.numpy().astype(int)
+
+    key = [[], [], [], [], []]  # [[kys..], [kxs..], [score..], [class..], [inds..]]
+
+    score_fc = []
+    for ind in range(kscore.shape[1]):
+        score = kscore[0, ind]
+        thr = threshold[kcls[0, ind]]
+        if kcls[0, ind] == 0:
+            score_fc.append(kscore[0, ind])
+        if score > thr:
+            key[0].append(kys[0, ind])
+            key[1].append(kxs[0, ind])
+            key[2].append(score)
+            key[3].append(kcls[0, ind])
+            key[4].append(kinds[0, ind])
+
+    imboxs = []
+    if key[0] is not None and len(key[0]) > 0:
+        ky, kx = key[0], key[1]
+        classes = key[3]
+        scores = key[2]
+
+        for i in range(len(kx)):
+            class_ = classes[i]
+            cx, cy = kx[i], ky[i]
+            x1, y1, x2, y2 = tlrb[0, class_ * 4 : (class_ + 1) * 4, cy, cx]
+            x1, y1, x2, y2 = (
+                np.array([cx, cy, cx, cy]) + np.array([-x1, -y1, x2, y2])
+            ) * stride  # back to world
+
+            if class_ == 1:  # only the person class has landmarks; face has none
+                x5y5 = landmark[0, : n_lmk * 2, cy, cx]
+                x5y5 = (x5y5 + np.array([cx] * n_lmk + [cy] * n_lmk)) * stride
+                boxlandmark = np.array(list(zip(x5y5[:n_lmk], x5y5[n_lmk:])))
+                box_vis = vis[0, :, cy, cx].tolist()
+            else:
+                boxlandmark = None
+                box_vis = None
+            imboxs.append(
+                BBox_landmarks(
+                    label=str(class_),
+                    xyrb=np.array([x1, y1, x2, y2]),
+                    score=scores[i].item(),
+                    landmark=boxlandmark,
+                    vis=box_vis,
+                )
+            )
+    return imboxs
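The box decode in the loop above is also simple arithmetic: the head predicts left/top/right/bottom offsets in feature-map cells around each peak, and multiplying by the stride maps cells back to input pixels. A worked example with assumed values:

```python
# Decoding one detection at heatmap peak (cx, cy) with predicted tlrb offsets.
# Values are illustrative; offsets are in feature-map cells, stride is 4.
cx, cy = 40, 30
left, top, right, bottom = 8.0, 6.0, 10.0, 12.0
stride = 4
x1, y1 = (cx - left) * stride, (cy - top) * stride      # (128.0, 96.0)
x2, y2 = (cx + right) * stride, (cy + bottom) * stride  # (200.0, 168.0)
```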
+
+
+class FootTrackNet_App:
+    """
+    This class consists of light-weight "app code" that is required to perform
+    end-to-end inference with FootTrackNet.
+
+    The app uses 1 model:
+        * FootTrackNet
+
+    For a given image input, the app will:
+        * pre-process the image (convert to range [0, 1])
+        * run FootTrackNet inference
+        * convert the output to two lists of BBox_landmarks objects, for face and body.
+    """
+
+    def __init__(self, model: Callable[[torch.Tensor], torch.Tensor]):
+        self.model = model
+
+    def predict(self, *args, **kwargs):
+        return self.det_image(*args, **kwargs)
+
+    def det_image(
+        self,
+        pixel_values_or_image: torch.Tensor
+        | np.ndarray
+        | Image.Image
+        | List[Image.Image],
+    ) -> Tuple[List[BBox_landmarks], List[BBox_landmarks]]:
+        """
+        Return two lists, objs_face and objs_person. Each list contains
+        BBox_landmarks objects holding the bbox and landmark info; see the
+        BBox_landmarks definition.
+
+        Parameters:
+            pixel_values_or_image
+                PIL image(s)
+                or
+                numpy array (N H W C x uint8) or (H W C x uint8) -- both RGB channel layout
+                or
+                PyTorch tensor (N C H W x fp32, value range is [0, 1]), RGB channel layout
+
+        Returns:
+            objs_face: a list of BBox_landmarks for faces, list[BBox_landmarks]
+            objs_person: a list of BBox_landmarks for persons, list[BBox_landmarks]
+        """
+        NHWC_int_numpy_frames, NCHW_fp32_torch_frames = app_to_net_image_inputs(
+            pixel_values_or_image
+        )
+        input_transform = normalize_image_transform()
+        NCHW_fp32_torch_frames = input_transform(NCHW_fp32_torch_frames)  # normalize
+        threshold = [0.6, 0.7, 0.7]  # score threshold for each detector
+        iou_thr = [0.2, 0.5, 0.5]  # iou threshold
+        output = self.model(NCHW_fp32_torch_frames)
+
+        heatmap = output[0]
+        bbox = output[1]
+        landmark = output[2]
+        landmark_visibility = output[3]
+
+        stride = 4
+        num_landmarks = 17
+        objs = detect_images_multiclass_fb(
+            heatmap,
+            bbox,
+            landmark,
+            threshold=threshold,
+            stride=stride,
+            n_lmk=num_landmarks,
+            vis=landmark_visibility,
+        )
+
+        objs_face = []
+        objs_person = []
+
+        for obj in objs:
+            label = id_to_classname(int(obj.label_prop))
+            if label == "face":
+                objs_face.append(obj)
+            elif label == "person":
+                objs_person.append(obj)
+
+        objs_face = nms_bbox_landmark(objs_face, iou=iou_thr[0])
+        objs_person = nms_bbox_landmark(objs_person, iou=iou_thr[1])
+
+        return objs_face, objs_person
diff --git a/qai_hub_models/models/foot_track_net/conftest.py b/qai_hub_models/models/foot_track_net/conftest.py
new file mode 100644
index 00000000..811b52bd
--- /dev/null
+++ b/qai_hub_models/models/foot_track_net/conftest.py
@@ -0,0 +1,39 @@
+# ---------------------------------------------------------------------
+# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+# THIS FILE WAS AUTO-GENERATED. DO NOT EDIT MANUALLY.
+
+import inspect
+
+import pytest
+
+from qai_hub_models.models.foot_track_net import Model
+from qai_hub_models.utils.testing import skip_clone_repo_check
+
+
+# Instantiate the model only once for all tests.
+# Mock from_pretrained to always return the initialized model.
+# This speeds up tests and limits memory leaks.
+@pytest.fixture(scope="module", autouse=True)
+def cached_from_pretrained():
+    with pytest.MonkeyPatch.context() as mp:
+        pretrained_cache = {}
+        from_pretrained = Model.from_pretrained
+        sig = inspect.signature(from_pretrained)
+
+        @skip_clone_repo_check
+        def _cached_from_pretrained(*args, **kwargs):
+            cache_key = str(args) + str(kwargs)
+            model = pretrained_cache.get(cache_key, None)
+            if model:
+                return model
+            else:
+                model = from_pretrained(*args, **kwargs)
+                pretrained_cache[cache_key] = model
+                return model
+
+        _cached_from_pretrained.__signature__ = sig
+
+        mp.setattr(Model, "from_pretrained", _cached_from_pretrained)
+        yield mp
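Before the full CLI demo below, a minimal programmatic sketch of driving the app (input layout per the `det_image` docstring; a blank frame should yield empty result lists):

```python
# Smoke test: run the app on a blank frame and inspect the result structure.
import torch

from qai_hub_models.models.foot_track_net import App, Model

app = App(Model.from_pretrained())   # assumes weights are downloadable
frame = torch.zeros(1, 3, 480, 640)  # N,C,H,W fp32 in [0, 1], RGB layout
faces, persons = app.predict(frame)  # two lists of BBox_landmarks
print(len(faces), len(persons))      # most likely "0 0" on a blank frame
```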
diff --git a/qai_hub_models/models/foot_track_net/demo.py b/qai_hub_models/models/foot_track_net/demo.py
new file mode 100644
index 00000000..26ee52aa
--- /dev/null
+++ b/qai_hub_models/models/foot_track_net/demo.py
@@ -0,0 +1,107 @@
+# ---------------------------------------------------------------------
+# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+import numpy as np
+from PIL import Image
+
+from qai_hub_models.models.foot_track_net.app import (
+    BBox_landmarks,
+    FootTrackNet_App,
+    drawbbox,
+)
+from qai_hub_models.models.foot_track_net.model import (
+    MODEL_ASSET_VERSION,
+    MODEL_ID,
+    FootTrackNet_model,
+)
+from qai_hub_models.utils.args import (
+    demo_model_from_cli_args,
+    get_model_cli_parser,
+    get_on_device_demo_parser,
+    validate_on_device_demo_args,
+)
+from qai_hub_models.utils.asset_loaders import CachedWebModelAsset, load_image
+from qai_hub_models.utils.display import display_or_save_image
+from qai_hub_models.utils.draw import create_color_map
+from qai_hub_models.utils.image_processing import pil_resize_pad
+
+INPUT_IMAGE_ADDRESS = CachedWebModelAsset.from_asset_store(
+    MODEL_ID, MODEL_ASSET_VERSION, "test1.jpg"
+)
+
+
+def undo_resize_pad_BBox(bbox: BBox_landmarks, scale: float, padding: list):
+    """
+    Undo the resize and padding of a BBox_landmarks object, in place: the inner
+    coordinates are mapped back to the original image frame.
+    Parameters:
+        bbox: the BBox_landmarks object to modify.
+        scale: single scale factor from the original to the target image.
+        padding: (left, top) padding sizes.
+    Return:
+        None.
+    """
+    if bbox.haslandmark:
+        for lmk in bbox.landmark:
+            lmk[0] = (lmk[0] + padding[0]) / scale
+            lmk[1] = (lmk[1] + padding[1]) / scale
+    bbox.x = (bbox.x + padding[0]) / scale
+    bbox.y = (bbox.y + padding[1]) / scale
+    bbox.r = (bbox.r + padding[0]) / scale
+    bbox.b = (bbox.b + padding[1]) / scale
+
+    return
+
+
+def main(is_test: bool = False):
+    parser = get_model_cli_parser(FootTrackNet_model)
+    parser = get_on_device_demo_parser(parser, add_output_dir=True)
+    parser.add_argument(
+        "--image",
+        type=str,
+        default=INPUT_IMAGE_ADDRESS,
+        help="image file path or URL",
+    )
+    args = parser.parse_args([] if is_test else None)
+    model = demo_model_from_cli_args(FootTrackNet_model, MODEL_ID, args)
+    validate_on_device_demo_args(args, MODEL_ID)
+
+    # Load image
+    (_, _, height, width) = FootTrackNet_model.get_input_spec()["image"][0]
+    orig_image = load_image(args.image)
+    image, scale, padding = pil_resize_pad(orig_image, (height, width))
+    print("Model Loaded")
+
+    app = FootTrackNet_App(model)
+    objs_face, objs_person = app.det_image(image)
+    objs = objs_face + objs_person
+
+    img_out = np.array(orig_image)[:, :, ::-1].copy()  # to BGR
+    jt_vis = [0, 15, 16]
+    vis_thr = 0.5
+    color_maps = create_color_map(2)
+
+    for obj in objs:
+        undo_resize_pad_BBox(obj, scale, padding)
+        color = color_maps[int(obj.label)]
+        color = [int(e) for e in color]
+        vis = obj.vis
+        img_out = drawbbox(
+            img_out,
+            obj,
+            color=color,
+            landmarkcolor=color,
+            visibility=vis,
+            joint_to_visualize=jt_vis,
+            visibility_thresh=vis_thr,
+        )
+    img_out_PIL = Image.fromarray(img_out[:, :, ::-1])
+
+    if not is_test:
+        display_or_save_image(
+            img_out_PIL, args.output_dir, "FootTrackNet_demo_output.png"
+        )
+
+
+if __name__ == "__main__":
+    main()
diff --git a/qai_hub_models/models/foot_track_net/export.py b/qai_hub_models/models/foot_track_net/export.py
new file mode 100644
index 00000000..82ca1a0f
--- /dev/null
+++ b/qai_hub_models/models/foot_track_net/export.py
@@ -0,0 +1,209 @@
+# ---------------------------------------------------------------------
+# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+# THIS FILE WAS AUTO-GENERATED. DO NOT EDIT MANUALLY.
+ + +from __future__ import annotations + +import os +import warnings +from pathlib import Path +from typing import Any, Dict, List, Optional, cast + +import qai_hub as hub +import torch + +from qai_hub_models.models.common import ExportResult, TargetRuntime +from qai_hub_models.models.foot_track_net import Model +from qai_hub_models.utils.args import ( + export_parser, + get_input_spec_kwargs, + get_model_kwargs, +) +from qai_hub_models.utils.compare import torch_inference +from qai_hub_models.utils.input_spec import make_torch_inputs +from qai_hub_models.utils.printing import ( + print_inference_metrics, + print_profile_metrics_from_job, +) +from qai_hub_models.utils.qai_hub_helpers import ( + can_access_qualcomm_ai_hub, + export_without_hub_access, +) + + +def export_model( + device: str = "Samsung Galaxy S23 (Family)", + chipset: Optional[str] = None, + skip_profiling: bool = False, + skip_inferencing: bool = False, + skip_downloading: bool = False, + skip_summary: bool = False, + output_dir: Optional[str] = None, + target_runtime: TargetRuntime = TargetRuntime.TFLITE, + compile_options: str = "", + profile_options: str = "", + **additional_model_kwargs, +) -> ExportResult | List[str]: + """ + This function executes the following recipe: + + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference + + Each of the last 4 steps can be optionally skipped using the input options. + + Parameters: + device: Device for which to export the model. + Full list of available devices can be found by running `hub.get_devices()`. + Defaults to DEFAULT_DEVICE if not specified. + chipset: If set, will choose a random device with this chipset. + Overrides the `device` argument. + skip_profiling: If set, skips profiling of compiled model on real devices. + skip_inferencing: If set, skips computing on-device outputs from sample data. + skip_downloading: If set, skips downloading of compiled model. + skip_summary: If set, skips waiting for and summarizing results + from profiling and inference. + output_dir: Directory to store generated assets (e.g. compiled model). + Defaults to `/build/`. + target_runtime: Which on-device runtime to target. Default is TFLite. + compile_options: Additional options to pass when submitting the compile job. + profile_options: Additional options to pass when submitting the profile job. + **additional_model_kwargs: Additional optional kwargs used to customize + `model_cls.from_pretrained` and `model.get_input_spec` + + Returns: + A struct of: + * A CompileJob object containing metadata about the compile job submitted to hub. + * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
+ """ + model_name = "foot_track_net" + output_path = Path(output_dir or Path.cwd() / "build" / model_name) + if chipset: + hub_device = hub.Device(attributes=f"chipset:{chipset}") + else: + hub_device = hub.Device(name=device) + if not can_access_qualcomm_ai_hub(): + return export_without_hub_access( + "foot_track_net", + "Person-Foot-Detection", + device, + skip_profiling, + skip_inferencing, + skip_downloading, + skip_summary, + output_path, + target_runtime, + compile_options, + profile_options, + ) + + # On-device perf improves with I/O in channel_last format except when using ONNX. + use_channel_last_format = target_runtime != TargetRuntime.ONNX + + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) + input_spec = model.get_input_spec( + **get_input_spec_kwargs(model, additional_model_kwargs) + ) + + # Trace the model + source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) + + # 2. Compiles the model to an asset that can be run on device + model_compile_options = model.get_hub_compile_options( + target_runtime, compile_options, hub_device + ) + print(f"Optimizing model {model_name} to run on-device") + submitted_compile_job = hub.submit_compile_job( + model=source_model, + input_specs=input_spec, + device=hub_device, + name=model_name, + options=model_compile_options, + ) + compile_job = cast(hub.client.CompileJob, submitted_compile_job) + + # 3. Profiles the model performance on a real device + profile_job: Optional[hub.client.ProfileJob] = None + if not skip_profiling: + profile_options_all = model.get_hub_profile_options( + target_runtime, profile_options + ) + print(f"Profiling model {model_name} on a hosted device.") + submitted_profile_job = hub.submit_profile_job( + model=compile_job.get_target_model(), + device=hub_device, + name=model_name, + options=profile_options_all, + ) + profile_job = cast(hub.client.ProfileJob, submitted_profile_job) + + # 4. Inferences the model on sample inputs + inference_job: Optional[hub.client.InferenceJob] = None + if not skip_inferencing: + profile_options_all = model.get_hub_profile_options( + target_runtime, profile_options + ) + print( + f"Running inference for {model_name} on a hosted device with example inputs." + ) + sample_inputs = model.sample_inputs( + input_spec, use_channel_last_format=use_channel_last_format + ) + submitted_inference_job = hub.submit_inference_job( + model=compile_job.get_target_model(), + inputs=sample_inputs, + device=hub_device, + name=model_name, + options=profile_options_all, + ) + inference_job = cast(hub.client.InferenceJob, submitted_inference_job) + + # 5. Downloads the model asset to the local directory + if not skip_downloading: + os.makedirs(output_path, exist_ok=True) + target_model: hub.Model = compile_job.get_target_model() # type: ignore + target_model.download(str(output_path / model_name)) + + # 6. 
Summarizes the results from profiling and inference
+    if not skip_summary and not skip_profiling:
+        assert profile_job is not None and profile_job.wait().success
+        profile_data: Dict[str, Any] = profile_job.download_profile()  # type: ignore
+        print_profile_metrics_from_job(profile_job, profile_data)
+
+    if not skip_summary and not skip_inferencing:
+        sample_inputs = model.sample_inputs(use_channel_last_format=False)
+        torch_out = torch_inference(
+            model, sample_inputs, return_channel_last_output=use_channel_last_format
+        )
+        assert inference_job is not None and inference_job.wait().success
+        inference_result: hub.client.DatasetEntries = inference_job.download_output_data()  # type: ignore
+
+        print_inference_metrics(
+            inference_job, inference_result, torch_out, model.get_output_names()
+        )
+
+    return ExportResult(
+        compile_job=compile_job,
+        inference_job=inference_job,
+        profile_job=profile_job,
+    )
+
+
+def main():
+    warnings.filterwarnings("ignore")
+    parser = export_parser(model_cls=Model)
+    args = parser.parse_args()
+    export_model(**vars(args))
+
+
+if __name__ == "__main__":
+    main()
diff --git a/qai_hub_models/models/foot_track_net/foot_track_net.py b/qai_hub_models/models/foot_track_net/foot_track_net.py
new file mode 100644
index 00000000..d899cc7b
--- /dev/null
+++ b/qai_hub_models/models/foot_track_net/foot_track_net.py
@@ -0,0 +1,162 @@
+# ---------------------------------------------------------------------
+# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+import os
+
+import numpy as np
+import torch
+import torch.nn as nn
+
+from .layers import CBAModule, DetectModule, HeadModule, Mbv3SmallFast, UpModule
+
+
+class FootTrackNet(nn.Module):
+    def __init__(
+        self,
+        wide: int = 64,
+        has_ext: bool = True,
+        upmode: str = "UCBA",
+        act: str = "relu",
+        RGB: bool = True,
+        strict: bool = False,
+        n_lmk: int = 17,
+    ):
+        super(FootTrackNet, self).__init__()
+        """
+        FootTrackNet multi-task human detector model for person and face detection,
+        plus head and feet landmark detection.
+
+        Parameters:
+            wide: the channel width of the intermediate layers.
+            has_ext: whether to add an extension layer in the head module.
+            upmode: upsampling mode.
+            act: activation function.
+            RGB: whether the input is 3-channel RGB.
+            strict: whether to load the model weights strictly.
+            n_lmk: the number of landmarks for detection.
+
+        Returns:
+            FootTrackNet model instance.
+ """ + self.use_rgb = RGB + self.strict = strict + + self.mean = nn.Parameter( + torch.tensor( + np.array([0.408, 0.447, 0.47]).reshape(1, 3, 1, 1).astype(np.float32) + ), + requires_grad=False, + ) + self.std = nn.Parameter( + torch.tensor( + np.array([0.289, 0.274, 0.278]).reshape(1, 3, 1, 1).astype(np.float32) + ), + requires_grad=False, + ) + + # define backbone + self.bb = Mbv3SmallFast(act, RGB) + + # Get the number of branch node channels stride 4, 8, 16 + c0, c1, c2 = self.bb.uplayer_shape + act = "relu" if act == "hswish" else act + self.conv3 = CBAModule( + self.bb.output_channels, + wide, + kernel_size=1, + stride=1, + padding=0, + bias=False, + act=act, + ) # s32 + self.connect0 = CBAModule(c0, wide, kernel_size=1, act=act) # s4 + self.connect1 = CBAModule(c1, wide, kernel_size=1, act=act) # s8 + self.connect2 = CBAModule( + c2, wide, kernel_size=1, act=act + ) # s16, conv, batchnorm activation. + + self.up0 = UpModule( + wide, wide, kernel_size=2, stride=2, mode=upmode, act=act + ) # s16 nearest + self.up1 = UpModule( + wide, wide, kernel_size=2, stride=2, mode=upmode, act=act + ) # s8 + self.up2 = UpModule( + wide, wide, kernel_size=2, stride=2, mode=upmode, act=act + ) # s4 + self.detect = DetectModule(wide, act=act) + + self.heatmap1 = HeadModule(wide, 1, has_ext=has_ext) + self.box1 = HeadModule(wide, 4, has_ext=has_ext) + self.heatmap2 = HeadModule(wide, 1, has_ext=has_ext) + self.box2 = HeadModule(wide, 4, has_ext=has_ext) + self.heatmap3 = HeadModule(wide, 1, has_ext=has_ext) + self.box3 = HeadModule(wide, 4, has_ext=has_ext) + + self.landmark = HeadModule(wide, 2 * n_lmk, has_ext=has_ext) + self.landmark_vis = HeadModule(wide, n_lmk, has_ext=has_ext) + + def forward(self, x: torch.Tensor) -> list: + """ + x: N,C,H,W (1,3,480,640) tensor of input image + return: 4 tensors including + heatmap: N,C,H,W (1,3,120,160) + bbox: N,C,H,W (1,12,120,160) + landmark: N,C,H,W (1,34,120,160) + landmark: N,C,H,W (1,17,120,160) + """ + + s4, s8, s16, s32 = self.bb(x) + s32 = self.conv3(s32) + s16 = self.up0(s32) + self.connect2(s16) + s8 = self.up1(s16) + self.connect1(s8) + s4 = self.up2(s8) + self.connect0(s4) + x = self.detect(s4) + + # simplify with sigmoid + center1 = self.heatmap1(x).sigmoid() + center2 = self.heatmap2(x).sigmoid() + center3 = self.heatmap3(x).sigmoid() + + box1 = self.box1(x) + box2 = self.box2(x) + box3 = self.box3(x) # when demo, no hand + landmark = self.landmark(x) # 2 * 17 + landmark_vis = self.landmark_vis(x).sigmoid() + return ( + torch.cat((center1, center2, center3), dim=1), + torch.cat((box1, box2, box3), dim=1), + landmark, + landmark_vis, + ) # simple one landmark + + def load_weights(self, base_file): + """load pretrined weights""" + other, ext = os.path.splitext(base_file) + if ext == ".pkl" or ".pth": + print("Loading pretrained weights into state dict...") + + pretrained_dict = torch.load( + base_file, map_location=lambda storage, loc: storage + ) + model_dict = self.state_dict() + + if not self.strict: + pretrained_dict = { + k: v for k, v in pretrained_dict.items() if k in model_dict + } + if ( + self.use_rgb and pretrained_dict["bb.conv1.weight"].shape[1] == 1 + ): # single channel. + pretrained_dict["bb.conv1.weight"] = torch.tile( + pretrained_dict["bb.conv1.weight"], [1, 3, 1, 1] + ) # the input channel to 3. 
+            model_dict.update(pretrained_dict)
+
+            self.load_state_dict(model_dict, strict=self.strict)
+            print("Finished!")
+        else:
+            raise ValueError("Only .pth and .pkl files are supported.")
diff --git a/qai_hub_models/models/foot_track_net/info.yaml b/qai_hub_models/models/foot_track_net/info.yaml
new file mode 100644
index 00000000..ecacb51d
--- /dev/null
+++ b/qai_hub_models/models/foot_track_net/info.yaml
@@ -0,0 +1,34 @@
+name: Person-Foot-Detection
+# id must match with the model dir name in qai_hub_models
+id: foot_track_net
+status: public
+headline: Multi-task human detector.
+domain: Computer Vision
+description: FootTrackNet can detect person and face bounding boxes, head and feet landmark
+  locations, and feet visibility.
+use_case: Object Detection
+tags:
+  - real-time
+license: https://github.com/qcom-ai-hub/ai-hub-models-internal/blob/main/LICENSE
+deploy_license: https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf
+source_repo: https://github.com/quic/ai-hub-models/blob/main/qai_hub_models/models/foot_track_net/model.py
+technical_details:
+  Inference latency: RealTime
+  Input resolution: 640x480
+  Number of output classes: 2
+  Number of parameters: 2.53M
+  Model size: 9.69 MB
+applicable_scenarios:
+  - Restricted zone
+  - Safety zone
+related_models: []
+form_factors:
+  - Phone
+  - Tablet
+  - IoT
+has_static_banner: true
+has_animated_banner: true
+license_type: bsd-3-clause
+deploy_license_type: AI Model Hub License
+dataset:
+  - coco
diff --git a/qai_hub_models/models/foot_track_net/layers.py b/qai_hub_models/models/foot_track_net/layers.py
new file mode 100644
index 00000000..1259ffb0
--- /dev/null
+++ b/qai_hub_models/models/foot_track_net/layers.py
@@ -0,0 +1,399 @@
+# ---------------------------------------------------------------------
+# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+
+
+import math
+
+import torch
+import torch.nn as nn
+import torch.nn.functional as F
+import torch.nn.init as init
+
+
+class SeModule(nn.Module):
+    """Customized squeeze-and-excitation module."""
+
+    def __init__(self, in_size: int, reduction: int = 4):
+        super(SeModule, self).__init__()
+        self.pool = nn.AdaptiveAvgPool2d(1)
+        self.se = nn.Sequential(
+            nn.Conv2d(
+                in_size,
+                in_size // reduction,
+                kernel_size=1,
+                stride=1,
+                padding=0,
+                bias=False,
+            ),
+            nn.BatchNorm2d(in_size // reduction),
+            nn.ReLU(inplace=True),
+            nn.Conv2d(
+                in_size // reduction,
+                in_size,
+                kernel_size=1,
+                stride=1,
+                padding=0,
+                bias=False,
+            ),
+            nn.BatchNorm2d(in_size),
+            nn.Sigmoid(),
+        )
+
+    def forward(self, x: torch.Tensor):
+        """
+        x: N,C,H,W tensor
+        return: N,C,H,W tensor
+        """
+        return x * self.se(self.pool(x))
+
+
+class Block3x3(nn.Module):
+    """Modified and simplified MobileNetV3 block."""
+
+    def __init__(
+        self,
+        kernel_size: int,
+        in_size: int,
+        expand_size: int,
+        out_size: int,
+        nolinear: nn.Module,
+        semodule: nn.Module,
+        stride: int,
+    ):
+        super(Block3x3, self).__init__()
+        self.kernel_size = kernel_size
+        self.stride = stride
+        self.se = semodule
+        self.conv1 = nn.Conv2d(
+            in_size, expand_size, kernel_size=1, stride=1, padding=0, bias=False
+        )
+        self.bn1 = nn.BatchNorm2d(expand_size)
+        self.nolinear1 = nolinear
+        if kernel_size == 3:
+            self.conv2 = nn.Conv2d(
+                expand_size,
+                out_size,
+                kernel_size=3,
+                stride=stride,
+                padding=1,
+                bias=False,
+            )
+        else:
+            self.conv2 = nn.Conv2d(
+                expand_size, out_size, kernel_size=3, stride=1, padding=1, bias=False
+            )
+            self.conv3 = nn.Conv2d(
+                out_size, out_size, kernel_size=3, stride=stride, padding=1, bias=False
+            )
+            self.bn3 = nn.BatchNorm2d(out_size)
+            self.nolinear3 = nolinear
+        self.bn2 = nn.BatchNorm2d(out_size)
+        self.nolinear2 = nolinear
+        self.shortcut = nn.Sequential()
+        if stride == 1 and in_size != out_size:
+            self.shortcut = nn.Sequential(
+                nn.Conv2d(
+                    in_size, out_size, kernel_size=1, stride=1, padding=0, bias=False
+                ),
+                nn.BatchNorm2d(out_size),
+            )
+
+    def forward(self, x: torch.Tensor):
+        """
+        x: N,C,H,W input feature
+        return: N,C,H,W tensor
+        """
+        out = self.nolinear1(self.bn1(self.conv1(x)))
+        out = self.nolinear2(self.bn2(self.conv2(out)))
+        if self.kernel_size == 5:
+            out = self.nolinear3(self.bn3(self.conv3(out)))
+
+        if self.se is not None:
+            out = self.se(out)
+        out = out + self.shortcut(x) if self.stride == 1 else out
+        return out
+
+
+class Mbv3SmallFast(nn.Module):
+    """
+    Certain layers are borrowed and modified from MobileNetV3.
+    For details of each layer's functionality, see: https://arxiv.org/abs/1905.02244
+    """
+
+    def __init__(self, act: str = "relu", RGB: bool = True):
+        super(Mbv3SmallFast, self).__init__()
+        self.keep = [2, 5, 12]
+        self.uplayer_shape = [16, 32, 64]
+        self.output_channels = 96
+        if RGB:
+            self.conv1 = nn.Conv2d(
+                3, 16, kernel_size=3, stride=2, padding=1, bias=False
+            )
+        else:
+            self.conv1 = nn.Conv2d(
+                1, 16, kernel_size=3, stride=2, padding=1, bias=False
+            )
+
+        self.bn1 = nn.BatchNorm2d(16)
+        if act == "relu":
+            self.hs1 = nn.ReLU(inplace=True)
+        elif act == "prelu":
+            self.hs1 = nn.PReLU()
+        elif act == "hswish":
+            self.hs1 = nn.Hardswish()
+
+        self.bneck = nn.Sequential(
+            Block3x3(3, 16, 16, 16, self.hs1, None, 1),
+            Block3x3(3, 16, 64, 16, self.hs1, None, 2),  # 1
+            Block3x3(3, 16, 64, 16, self.hs1, None, 1),  # 2*
+            Block3x3(5, 16, 96, 32, self.hs1, SeModule(32), 2),  # 3
+            Block3x3(5, 32, 96, 32, self.hs1, SeModule(32), 1),  # 4
+            Block3x3(5, 32, 128, 32, self.hs1, SeModule(32), 1),  # 5*
+            Block3x3(3, 32, 128, 64, self.hs1, None, 2),  # 6
+            Block3x3(3, 64, 128, 64, self.hs1, None, 1),  # 7
+            Block3x3(3, 64, 160, 64, self.hs1, None, 1),  # 8
+            Block3x3(3, 64, 160, 64, self.hs1, None, 1),  # 9
+            Block3x3(3, 64, 256, 64, self.hs1, SeModule(64), 1),  # 10
+            Block3x3(3, 64, 320, 64, self.hs1, SeModule(64), 1),  # 11
+            Block3x3(5, 64, 320, 64, self.hs1, SeModule(64), 1),  # 12*
+            Block3x3(5, 64, 320, 96, self.hs1, SeModule(96), 2),  # 13
+            Block3x3(5, 96, 480, 96, self.hs1, SeModule(96), 1),  # 14
+        )
+
+    def initialize_weights(self):
+        print("random init...")
+        for m in self.modules():
+            if isinstance(m, nn.Conv2d):
+                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
+                m.weight.data.normal_(0, math.sqrt(2.0 / n))
+                if m.bias is not None:
+                    m.bias.data.zero_()
+            elif isinstance(m, nn.BatchNorm2d):
+                m.weight.data.fill_(1)
+                m.bias.data.zero_()
+            elif isinstance(m, nn.Linear):
+                n = m.weight.size(1)
+                m.weight.data.normal_(0, 0.01)
+                m.bias.data.zero_()
+
+    def forward(self, x: torch.Tensor):
+        """
+        x: N,C,480,640 image tensor
+        return: list of N,C,H,W tensors, one per stage; specifically:
+            0: N,16,120,160
+            1: N,32,60,80
+            2: N,64,30,40
+            3: N,96,15,20
+        """
+        x = self.hs1(self.bn1(self.conv1(x)))
+        outs = []
+        for index, item in enumerate(self.bneck):
+            x = item(x)
+
+            if index in self.keep:
+                outs.append(x)
+        outs.append(x)
+        return outs
+
+
+class CBAModule(nn.Module):
+    """Conv-BatchNorm-Activation block."""
+
+    def __init__(
+        self,
+        in_channels: int,
+        out_channels: int = 24,
+        kernel_size: int = 3,
+        stride: int = 1,
+        padding: int = 0,
+        bias: bool = False,
+        act: str = "relu",
+    ):
+        super(CBAModule, self).__init__()
+        self.conv = nn.Conv2d(
+            in_channels, out_channels, kernel_size, stride, padding=padding, bias=bias
+        )
+        self.bn = nn.BatchNorm2d(out_channels)
+        if act == "relu":
+            self.act = nn.ReLU(inplace=True)
+        elif act == "identity":
+            self.act = nn.Identity()
+        else:
+            self.act = nn.PReLU()
+
+        init.xavier_uniform_(self.conv.weight.data)
+        if self.conv.bias is not None:
+            self.conv.bias.data.zero_()
+
+    def forward(self, x: torch.Tensor):
+        """
+        x: N,C,H,W tensor
+        return: N,C,H,W tensor
+        """
+        x = self.conv(x)
+        x = self.bn(x)
+        x = self.act(x)
+        return x
+
+
+class UpModule(nn.Module):
+    """Upsampling module."""
+
+    def __init__(
+        self,
+        in_channels: int,
+        out_channels: int,
+        kernel_size: int = 2,
+        stride: int = 2,
+        bias: bool = False,
+        mode: str = "UCBA",
+        act: str = "relu",
+    ):
+        super(UpModule, self).__init__()
+        self.mode = mode
+
+        if self.mode == "UCBA":
+            self.up = nn.Upsample(size=None, scale_factor=2, mode="nearest")
+            self.conv = CBAModule(
+                in_channels, out_channels, 3, padding=1, bias=bias, act=act
+            )
+        elif self.mode == "DeconvBN":
+            self.dconv = nn.ConvTranspose2d(
+                in_channels, out_channels, kernel_size, stride, bias=bias
+            )
+            self.bn = nn.BatchNorm2d(out_channels)
+        elif self.mode == "DeCBA":
+            self.dconv = nn.ConvTranspose2d(
+                in_channels, out_channels, kernel_size, stride, bias=bias
+            )
+            self.conv = CBAModule(out_channels, out_channels, 3, padding=1, bias=bias)
+        else:
+            raise RuntimeError(f"Unsupported mode: {mode}")
+
+    def forward(self, x: torch.Tensor):
+        """
+        x: N,C,H,W tensor
+        return: N,C,H,W tensor.
+ """ + if self.mode == "UCBA": + return self.conv(self.up(x)) + elif self.mode == "DeconvBN": + return F.relu(self.bn(self.dconv(x))) + elif self.mode == "DeCBA": + return self.conv(self.dconv(x)) + + +class ContextModule(nn.Module): + """single stage headless face detector context module""" + + def __init__(self, in_channels: int, act: str = "relu"): + super(ContextModule, self).__init__() + + block_wide = in_channels // 4 + self.inconv = CBAModule(in_channels, block_wide, 3, 1, padding=1, act=act) + self.upconv = CBAModule(block_wide, block_wide, 3, 1, padding=1, act=act) + self.downconv = CBAModule(block_wide, block_wide, 3, 1, padding=1, act="relu") + self.downconv2 = CBAModule(block_wide, block_wide, 3, 1, padding=1, act=act) + + def forward(self, x: torch.Tensor): + """ + x: N,C,H,W tensor + return: N,C,H,W tensor + """ + x = self.inconv(x) + up = self.upconv(x) + down = self.downconv(x) + down = self.downconv2(down) + return torch.cat([up, down], dim=1) + + +class DetectModule(nn.Module): + def __init__(self, in_channels: int, act: str = "relu"): + super(DetectModule, self).__init__() + + self.upconv = CBAModule(in_channels, in_channels // 2, 3, 1, padding=1, act=act) + self.context = ContextModule(in_channels, act=act) + + def forward(self, x: torch.Tensor): + """ + x: N,C,H,W tensor + return: N,C,H,W tensor + """ + up = self.upconv(x) + down = self.context(x) + return torch.cat([up, down], dim=1) + + +class CropLayer(nn.Module): + """ + crop layer. crop the input tensor based onthe specified number of rows and columns + E.g., (-1, 0) means this layer should crop the first and last rows of the feature map. And (0, -1) crops the first and last columns + """ + + def __init__(self, crop_set: list): + super(CropLayer, self).__init__() + self.rows_to_crop = -crop_set[0] + self.cols_to_crop = -crop_set[1] + assert self.rows_to_crop >= 0 + assert self.cols_to_crop >= 0 + + def forward(self, input): + """ + x: N,C,H,W tensor + return: N,C,H,W tensor + """ + if self.rows_to_crop == 0 and self.cols_to_crop == 0: + return input + elif self.rows_to_crop > 0 and self.cols_to_crop == 0: + return input[:, :, self.rows_to_crop : -self.rows_to_crop, :] + elif self.rows_to_crop == 0 and self.cols_to_crop > 0: + return input[:, :, :, self.cols_to_crop : -self.cols_to_crop] + else: + return input[ + :, + :, + self.rows_to_crop : -self.rows_to_crop, + self.cols_to_crop : -self.cols_to_crop, + ] + + +class HeadModule(nn.Module): + """head module for specific task assignment""" + + def __init__( + self, + in_channels: int, + out_channels: int, + has_ext: bool = False, + act: str = "relu", + ): + super(HeadModule, self).__init__() + self.head = nn.Conv2d(in_channels, out_channels, kernel_size=1) + self.has_ext = has_ext + + if has_ext: + self.ext = CBAModule( + in_channels, + in_channels, + kernel_size=3, + padding=1, + bias=False, + act=act, + ) + + def init_normal(self, std: float, bias: float): + nn.init.normal_(self.head.weight, std=std) + nn.init.constant_(self.head.bias, bias) + + def forward(self, x: torch.Tensor): + """ + x: N,C,H,W tensor + return: N,C,H,W tensor + """ + if self.has_ext: + x = self.ext(x) + return self.head(x) diff --git a/qai_hub_models/models/foot_track_net/model.py b/qai_hub_models/models/foot_track_net/model.py new file mode 100644 index 00000000..b6de34aa --- /dev/null +++ b/qai_hub_models/models/foot_track_net/model.py @@ -0,0 +1,90 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. 
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+
+from __future__ import annotations
+
+from typing import List
+
+import torch
+import torch.nn as nn
+
+from qai_hub_models.models.foot_track_net.foot_track_net import FootTrackNet
+from qai_hub_models.utils.asset_loaders import CachedWebModelAsset
+from qai_hub_models.utils.base_model import BaseModel
+from qai_hub_models.utils.input_spec import InputSpec
+
+MODEL_ID = __name__.split(".")[-2]
+
+DEFAULT_WEIGHTS = "SA-e30_finetune50.pth"
+MODEL_ASSET_VERSION = 1
+
+
+class FootTrackNet_model(BaseModel):
+    """
+    Qualcomm multi-task human detector model.
+    Detects bounding boxes for person and face,
+    and landmarks (head, feet) together with their visibility.
+    The output is four maps, which are decoded into the final result by FootTrackNet_App.
+    """
+
+    def __init__(self, model: nn.Module) -> None:
+        super().__init__()
+        self.model = model
+
+    @classmethod
+    def from_pretrained(cls, checkpoint_path: str | None = None):
+        """Load FootTrackNet from a weight file created by the source FootTrackNet repository."""
+
+        if not checkpoint_path:
+            checkpoint_path = CachedWebModelAsset.from_asset_store(
+                MODEL_ID, MODEL_ASSET_VERSION, DEFAULT_WEIGHTS
+            ).fetch()
+        foot_track_net_model = FootTrackNet()  # original definition
+        foot_track_net_model.load_weights(checkpoint_path)
+        foot_track_net_model.to(torch.device("cpu"))
+
+        return cls(foot_track_net_model)
+
+    def forward(self, image: torch.Tensor):
+        """
+        Run FootTrackNet on `image` and produce bounding boxes for face and body.
+
+        Parameters:
+            image: Pixel values pre-processed for encoder consumption.
+                   Range: float[0, 1]
+                   3-channel Color Space: RGB
+
+        Returns:
+            heatmap: N,C,H,W the heatmap for the person/face detection.
+            bbox: N,C*4,H,W the bounding box coordinates as a map.
+            landmark: N,C*34,H,W the coordinates of landmarks as a map.
+            landmark_visibility: N,C*17,H,W the visibility of the landmarks as a map.
+        """
+        return self.model(image)
+
+    @staticmethod
+    def get_input_spec(
+        batch_size: int = 1,
+        height: int = 480,
+        width: int = 640,
+    ) -> InputSpec:
+        """
+        Returns the input specification (name -> (shape, type)). This can be
+        used to submit a profiling job on Qualcomm AI Hub. The default resolution
+        is 640x480 (width x height).
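+        With the defaults, the spec is {"image": ((1, 3, 480, 640), "float32")}.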
+ """ + return {"image": ((batch_size, 3, height, width), "float32")} + + @staticmethod + def get_output_names() -> List[str]: + return ["heatmap", "bbox", "landmark", "landmark_visibility"] + + @staticmethod + def get_channel_last_inputs() -> List[str]: + return ["image"] + + @staticmethod + def get_channel_last_outputs() -> List[str]: + return ["heatmap", "bbox", "landmark", "landmark_visibility"] diff --git a/qai_hub_models/models/foot_track_net/perf.yaml b/qai_hub_models/models/foot_track_net/perf.yaml new file mode 100644 index 00000000..68296b36 --- /dev/null +++ b/qai_hub_models/models/foot_track_net/perf.yaml @@ -0,0 +1,432 @@ +aggregated: + supported_oses: + - Android + supported_devices: + - Snapdragon 8 Elite QRD + - Samsung Galaxy S24 + - Samsung Galaxy S24 Ultra + - Samsung Galaxy S24+ + - Snapdragon 8 Gen 3 QRD + - Samsung Galaxy S23 + - Samsung Galaxy S23 Ultra + - Samsung Galaxy S23+ + - Samsung Galaxy S22 5G + - Samsung Galaxy S22 Ultra 5G + - Samsung Galaxy S22+ 5G + - Samsung Galaxy Tab S8 + - Xiaomi 12 + - Xiaomi 12 Pro + - Samsung Galaxy S21 + - Samsung Galaxy S21 Ultra + - Samsung Galaxy S21+ + - Snapdragon X Elite CRD + - Snapdragon X Plus 8-Core CRD + - QCS8450 (Proxy) + - XR2 Gen 2 (Proxy) + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) + supported_chipsets: + - Snapdragon® 8 Elite + - Snapdragon® 8 Gen 3 + - Snapdragon® 8 Gen 2 + - Snapdragon® 8 Gen 1 + - Snapdragon® 888 + - Snapdragon® X Elite + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy +models: +- name: Person-Foot-Detection + performance_metrics: + - torchscript_onnx_tflite: + inference_time: 3484.0 + throughput: 287.0264064293915 + estimated_peak_memory_range: + min: 16384 + max: 25709488 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 134 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 134 + job_id: jp2kyw6qp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 3592.0 + throughput: 278.39643652561244 + estimated_peak_memory_range: + min: 4210688 + max: 12355864 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 196 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 196 + job_id: jgo26rqkp + job_status: Passed + torchscript_onnx: + inference_time: 5293.0 + throughput: 188.9287738522577 + estimated_peak_memory_range: + min: 15429632 + max: 19234088 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 201 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 201 + job_id: jp4lr1015 + job_status: Passed + reference_device_info: + name: Samsung Galaxy S23 + os: '13' + form_factor: Phone + os_name: Android + manufacturer: Samsung + chipset: Snapdragon® 8 Gen 2 + timestamp: '2024-10-15T00:35:54Z' + - torchscript_onnx_tflite: + inference_time: 2884.0 + throughput: 346.74063800277395 + estimated_peak_memory_range: + min: 12288 + max: 58082624 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 134 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 134 + job_id: jpy13xwlp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 3046.0 + throughput: 328.29940906106367 + estimated_peak_memory_range: + min: 3702784 + max: 21906704 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 196 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 196 + job_id: jpv6kdxr5 + job_status: Passed + torchscript_onnx: + inference_time: 4623.0 + throughput: 216.3097555699762 + 
estimated_peak_memory_range: + min: 0 + max: 68916192 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 201 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 201 + job_id: jpxko42l5 + job_status: Passed + reference_device_info: + name: Samsung Galaxy S24 + os: '14' + form_factor: Phone + os_name: Android + manufacturer: Samsung + chipset: Snapdragon® 8 Gen 3 + timestamp: '2024-10-15T00:35:56Z' + - torchscript_onnx_tflite: + inference_time: 3339.0 + throughput: 299.4908655286014 + estimated_peak_memory_range: + min: 12288 + max: 1968528 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 134 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 134 + job_id: jp0z0jqn5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 3293.0 + throughput: 303.67446097783176 + estimated_peak_memory_range: + min: 2080768 + max: 3247880 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 196 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 196 + job_id: jpedmz3v5 + job_status: Passed + reference_device_info: + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:35:47Z' + - torchscript_onnx_tflite: + inference_time: 3382.0 + throughput: 295.68302779420463 + estimated_peak_memory_range: + min: 36864 + max: 114761312 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 134 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 134 + job_id: jglvmxrm5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 3365.0 + throughput: 297.1768202080238 + estimated_peak_memory_range: + min: 2957312 + max: 4465480 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 196 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 196 + job_id: jg9lnme8g + job_status: Passed + reference_device_info: + name: SA8255 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:35:50Z' + - torchscript_onnx_tflite: + inference_time: 3443.0 + throughput: 290.4443799012489 + estimated_peak_memory_range: + min: 28672 + max: 43066584 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 134 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 134 + job_id: j5q6qykop + job_status: Passed + torchscript_onnx_qnn: + inference_time: 3387.0 + throughput: 295.24653085326247 + estimated_peak_memory_range: + min: 2310144 + max: 3595752 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 196 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 196 + job_id: j5we67nm5 + job_status: Passed + reference_device_info: + name: SA8775 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:35:49Z' + - torchscript_onnx_tflite: + inference_time: 3504.0 + throughput: 285.38812785388126 + estimated_peak_memory_range: + min: 12288 + max: 4233992 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 134 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 134 + job_id: jgkex4nng + job_status: Passed + torchscript_onnx_qnn: + inference_time: 3384.0 + throughput: 295.5082742316785 + estimated_peak_memory_range: + min: 3698688 + max: 5094096 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 196 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 196 + job_id: jgz3dmkx5 
+ job_status: Passed + reference_device_info: + name: SA8650 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:35:48Z' + - torchscript_onnx_tflite: + inference_time: 5660.0 + throughput: 176.67844522968198 + estimated_peak_memory_range: + min: 5107712 + max: 60626976 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 134 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 134 + job_id: jp8qyx9op + job_status: Passed + torchscript_onnx_qnn: + inference_time: 5772.0 + throughput: 173.25017325017325 + estimated_peak_memory_range: + min: 3702784 + max: 25944608 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 196 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 196 + job_id: jgdx13lzp + job_status: Passed + reference_device_info: + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:35:52Z' + - torchscript_onnx_tflite: + inference_time: 2376.0 + throughput: 420.8754208754209 + estimated_peak_memory_range: + min: 8192 + max: 29916112 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 134 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 134 + job_id: jp3j092ng + job_status: Passed + torchscript_onnx_qnn: + inference_time: 2491.0 + throughput: 401.4452027298274 + estimated_peak_memory_range: + min: 0 + max: 17896576 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 196 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 196 + job_id: j57yr4395 + job_status: Passed + torchscript_onnx: + inference_time: 3680.0 + throughput: 271.7391304347826 + estimated_peak_memory_range: + min: 18051072 + max: 53326816 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 201 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 201 + job_id: jprv30j7g + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:35:58Z' + - torchscript_onnx_qnn: + inference_time: 3669.0 + throughput: 272.5538293813028 + estimated_peak_memory_range: + min: 3690496 + max: 3690496 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 196 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 196 + job_id: jgjvn74eg + job_status: Passed + torchscript_onnx: + inference_time: 5762.0 + throughput: 173.55085039916696 + estimated_peak_memory_range: + min: 17518592 + max: 17518592 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 201 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 201 + job_id: j5mnxmy9p + job_status: Passed + reference_device_info: + name: Snapdragon X Elite CRD + os: '11' + form_factor: Compute + os_name: Windows + manufacturer: Qualcomm + chipset: Snapdragon® X Elite + timestamp: '2024-10-15T00:35:57Z' diff --git a/qai_hub_models/models/foot_track_net/test.py b/qai_hub_models/models/foot_track_net/test.py new file mode 100644 index 00000000..d2fdc8e9 --- /dev/null +++ b/qai_hub_models/models/foot_track_net/test.py @@ -0,0 +1,267 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. 
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+import pickle as pkl
+
+import numpy as np
+import pytest
+
+from qai_hub_models.models.foot_track_net.app import FootTrackNet_App
+from qai_hub_models.models.foot_track_net.demo import INPUT_IMAGE_ADDRESS
+from qai_hub_models.models.foot_track_net.demo import main as demo_main
+from qai_hub_models.models.foot_track_net.model import (
+    MODEL_ASSET_VERSION,
+    MODEL_ID,
+    FootTrackNet_model,
+)
+from qai_hub_models.utils.asset_loaders import (
+    CachedWebModelAsset,
+    load_image,
+    load_path,
+)
+from qai_hub_models.utils.testing import assert_most_close, skip_clone_repo_check
+
+OUTPUT_RST_ADDRESS = CachedWebModelAsset.from_asset_store(
+    MODEL_ID, MODEL_ASSET_VERSION, "oracle_rst1.pkl"
+)
+
+
+# Verify that the output from Torch is as expected: bbox, landmark, visibility.
+@skip_clone_repo_check
+def test_task():
+    app = FootTrackNet_App(FootTrackNet_model.from_pretrained())
+    original_image = load_image(INPUT_IMAGE_ADDRESS)
+    objs_face, objs_person = app.det_image(original_image)
+
+    pth = load_path(OUTPUT_RST_ADDRESS, "tmp")
+    with open(pth, "rb") as handle:
+        objs_face_oracle, objs_person_oracle = pkl.load(handle)
+
+    # extract the oracle result
+    faces_bbox_ora = np.array(
+        [
+            objs_face_oracle[0].box,
+            objs_face_oracle[1].box,
+        ]
+    )
+
+    persons_bbox_ora = np.array(
+        [
+            objs_person_oracle[0].box,
+            objs_person_oracle[1].box,
+        ]
+    )
+
+    persons_landmark_ora = np.array(
+        [
+            [
+                objs_person_oracle[0].landmark[0],
+                objs_person_oracle[0].landmark[15],
+                objs_person_oracle[0].landmark[16],
+            ],
+            [
+                objs_person_oracle[1].landmark[0],
+                objs_person_oracle[1].landmark[15],
+                objs_person_oracle[1].landmark[16],
+            ],
+        ]
+    )
+
+    persons_visibility_ora = np.array(
+        [
+            [objs_person_oracle[0].vis[15], objs_person_oracle[0].vis[16]],
+            [objs_person_oracle[1].vis[15], objs_person_oracle[1].vis[16]],
+        ]
+    )
+
+    # extract the key detection result
+    faces_bbox = np.array(
+        [
+            objs_face[0].box,
+            objs_face[1].box,
+        ]
+    )
+
+    persons_bbox = np.array(
+        [
+            objs_person[0].box,
+            objs_person[1].box,
+        ]
+    )
+
+    persons_landmark = np.array(
+        [
+            [
+                objs_person[0].landmark[0],
+                objs_person[0].landmark[15],
+                objs_person[0].landmark[16],
+            ],
+            [
+                objs_person[1].landmark[0],
+                objs_person[1].landmark[15],
+                objs_person[1].landmark[16],
+            ],
+        ]
+    )
+
+    persons_visibility = np.array(
+        [
+            [objs_person[0].vis[15], objs_person[0].vis[16]],
+            [objs_person[1].vis[15], objs_person[1].vis[16]],
+        ]
+    )
+
+    # assert face_bbox, person_bbox, person_landmark and person_landmark_visibility
+    assert_most_close(
+        np.asarray(faces_bbox_ora),
+        np.asarray(faces_bbox),
+        diff_tol=0.01,
+        atol=0.001,
+        rtol=0.001,
+    )
+    assert_most_close(
+        np.asarray(persons_bbox_ora),
+        np.asarray(persons_bbox),
+        diff_tol=0.01,
+        atol=0.001,
+        rtol=0.001,
+    )
+    assert_most_close(
+        np.asarray(persons_landmark_ora),
+        np.asarray(persons_landmark),
+        diff_tol=0.01,
+        atol=0.001,
+        rtol=0.001,
+    )
+    assert_most_close(
+        np.asarray(persons_visibility_ora),
+        np.asarray(persons_visibility),
+        diff_tol=0.01,
+        atol=0.001,
+        rtol=0.001,
+    )
+
+
+@pytest.mark.trace
+@skip_clone_repo_check
+def test_trace():
+    app = FootTrackNet_App(
+        FootTrackNet_model.from_pretrained().convert_to_torchscript()
+    )
+    original_image = load_image(INPUT_IMAGE_ADDRESS)
+    objs_face, objs_person = app.det_image(original_image)
+
+    pth = load_path(OUTPUT_RST_ADDRESS, "tmp")
+    with open(pth, "rb") as handle:
+        objs_face_oracle, objs_person_oracle = pkl.load(handle)
+
+    # extract the oracle result
+    faces_bbox_ora = np.array(
+        [
+            objs_face_oracle[0].box,
+            objs_face_oracle[1].box,
+        ]
+    )
+
+    persons_bbox_ora = np.array(
+        [
+            objs_person_oracle[0].box,
+            objs_person_oracle[1].box,
+        ]
+    )
+
+    persons_landmark_ora = np.array(
+        [
+            [
+                objs_person_oracle[0].landmark[0],
+                objs_person_oracle[0].landmark[15],
+                objs_person_oracle[0].landmark[16],
+            ],
+            [
+                objs_person_oracle[1].landmark[0],
+                objs_person_oracle[1].landmark[15],
+                objs_person_oracle[1].landmark[16],
+            ],
+        ]
+    )
+
+    persons_visibility_ora = np.array(
+        [
+            [objs_person_oracle[0].vis[15], objs_person_oracle[0].vis[16]],
+            [objs_person_oracle[1].vis[15], objs_person_oracle[1].vis[16]],
+        ]
+    )
+
+    # extract the key detection result
+    faces_bbox = np.array(
+        [
+            objs_face[0].box,
+            objs_face[1].box,
+        ]
+    )
+
+    persons_bbox = np.array(
+        [
+            objs_person[0].box,
+            objs_person[1].box,
+        ]
+    )
+
+    persons_landmark = np.array(
+        [
+            [
+                objs_person[0].landmark[0],
+                objs_person[0].landmark[15],
+                objs_person[0].landmark[16],
+            ],
+            [
+                objs_person[1].landmark[0],
+                objs_person[1].landmark[15],
+                objs_person[1].landmark[16],
+            ],
+        ]
+    )
+
+    persons_visibility = np.array(
+        [
+            [objs_person[0].vis[15], objs_person[0].vis[16]],
+            [objs_person[1].vis[15], objs_person[1].vis[16]],
+        ]
+    )
+
+    # assert face_bbox, person_bbox, person_landmark and person_landmark_visibility
+    assert_most_close(
+        np.asarray(faces_bbox_ora),
+        np.asarray(faces_bbox),
+        diff_tol=0.01,
+        atol=0.001,
+        rtol=0.001,
+    )
+    assert_most_close(
+        np.asarray(persons_bbox_ora),
+        np.asarray(persons_bbox),
+        diff_tol=0.01,
+        atol=0.001,
+        rtol=0.001,
+    )
+    assert_most_close(
+        np.asarray(persons_landmark_ora),
+        np.asarray(persons_landmark),
+        diff_tol=0.01,
+        atol=0.001,
+        rtol=0.001,
+    )
+    assert_most_close(
+        np.asarray(persons_visibility_ora),
+        np.asarray(persons_visibility),
+        diff_tol=0.01,
+        atol=0.001,
+        rtol=0.001,
+    )
+
+
+@skip_clone_repo_check
+def test_demo():
+    demo_main(is_test=True)
diff --git a/qai_hub_models/models/gear_guard_net/README.md b/qai_hub_models/models/gear_guard_net/README.md
new file mode 100644
index 00000000..cd7ecdda
--- /dev/null
+++ b/qai_hub_models/models/gear_guard_net/README.md
@@ -0,0 +1,59 @@
+[![Qualcomm® AI Hub Models](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/quic-logo.jpg)](../../README.md)
+
+
+# [PPE-Detection: Object detection for personal protective equipment (PPE)](https://aihub.qualcomm.com/models/gear_guard_net)
+
+Detect if a person is wearing personal protective equipment (PPE) in real-time.
+
+This is based on the implementation of PPE-Detection found
+[here](https://github.com/quic/ai-hub-models/blob/main/qai_hub_models/models/gear_guard_net/model.py). This repository contains scripts for optimized on-device
+export suitable to run on Qualcomm® devices. More details on model performance
+across various devices can be found [here](https://aihub.qualcomm.com/models/gear_guard_net).
+
+[Sign up](https://myaccount.qualcomm.com/signup) to start using Qualcomm AI Hub and run these models on a hosted Qualcomm® device.
+
+
+
+
+## Example & Usage
+
+
+Once installed, run the following simple CLI demo:
+
+```bash
+python -m qai_hub_models.models.gear_guard_net.demo
+```
+More details on the CLI tool can be found with the `--help` option. See
+[demo.py](demo.py) for sample usage of the model including pre/post processing
+scripts. Please refer to our [general instructions on using
+models](../../../#getting-started) for more usage instructions.
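+
+For quick experimentation, the model can also be used directly from Python. A
+minimal sketch (illustrative only; it feeds random dummy data rather than a
+real image):
+
+```python
+import torch
+
+from qai_hub_models.models.gear_guard_net import Model
+
+# Downloads the pretrained checkpoint on first use.
+model = Model.from_pretrained()
+
+# Build a dummy input matching the model's input spec (N, C, H, W).
+shape, _dtype = Model.get_input_spec()["image"]
+image = torch.rand(shape)
+
+# The forward pass returns the multi-scale detection tensors
+# (bbox_8x, bbox_16x, bbox_32x).
+outputs = model(image)
+print([o.shape for o in outputs])
+```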
+
+## Export for on-device deployment
+
+This repository contains export scripts that produce a model optimized for
+on-device deployment. This can be run as follows:
+
+```bash
+python -m qai_hub_models.models.gear_guard_net.export
+```
+Additional options are documented with the `--help` option. Note that the above
+script requires access to Qualcomm® AI Hub.
+
+
+## License
+* The license for the original implementation of PPE-Detection can be found
+  [here](https://github.com/qcom-ai-hub/ai-hub-models-internal/blob/main/LICENSE).
+* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf)
+
+
+## References
+* [Source Model Implementation](https://github.com/quic/ai-hub-models/blob/main/qai_hub_models/models/gear_guard_net/model.py)
+
+
+
+## Community
+* Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI.
+* For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
+
+
diff --git a/qai_hub_models/models/gear_guard_net/__init__.py b/qai_hub_models/models/gear_guard_net/__init__.py
new file mode 100644
index 00000000..b4b59da2
--- /dev/null
+++ b/qai_hub_models/models/gear_guard_net/__init__.py
@@ -0,0 +1,10 @@
+# ---------------------------------------------------------------------
+# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+from qai_hub_models.models._shared.body_detection.app import (  # noqa: F401
+    BodyDetectionApp as App,
+)
+
+from .model import MODEL_ID  # noqa: F401
+from .model import GearGuardNet as Model  # noqa: F401
diff --git a/qai_hub_models/models/gear_guard_net/conftest.py b/qai_hub_models/models/gear_guard_net/conftest.py
new file mode 100644
index 00000000..62e6b22e
--- /dev/null
+++ b/qai_hub_models/models/gear_guard_net/conftest.py
@@ -0,0 +1,39 @@
+# ---------------------------------------------------------------------
+# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+# THIS FILE WAS AUTO-GENERATED. DO NOT EDIT MANUALLY.
+
+import inspect
+
+import pytest
+
+from qai_hub_models.models.gear_guard_net import Model
+from qai_hub_models.utils.testing import skip_clone_repo_check
+
+
+# Instantiate the model only once for all tests.
+# Mock from_pretrained to always return the initialized model.
+# This speeds up tests and limits memory leaks.
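+# Note: the cache key below is the stringified (args, kwargs), so each distinct
+# from_pretrained configuration gets its own cached instance.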
+@pytest.fixture(scope="module", autouse=True) +def cached_from_pretrained(): + with pytest.MonkeyPatch.context() as mp: + pretrained_cache = {} + from_pretrained = Model.from_pretrained + sig = inspect.signature(from_pretrained) + + @skip_clone_repo_check + def _cached_from_pretrained(*args, **kwargs): + cache_key = str(args) + str(kwargs) + model = pretrained_cache.get(cache_key, None) + if model: + return model + else: + model = from_pretrained(*args, **kwargs) + pretrained_cache[cache_key] = model + return model + + _cached_from_pretrained.__signature__ = sig + + mp.setattr(Model, "from_pretrained", _cached_from_pretrained) + yield mp diff --git a/qai_hub_models/models/gear_guard_net/demo.py b/qai_hub_models/models/gear_guard_net/demo.py new file mode 100644 index 00000000..7dc1f4a6 --- /dev/null +++ b/qai_hub_models/models/gear_guard_net/demo.py @@ -0,0 +1,33 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. +# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +from qai_hub_models.models._shared.body_detection.app import BodyDetectionApp +from qai_hub_models.models._shared.body_detection.demo import BodyDetectionDemo +from qai_hub_models.models.gear_guard_net.model import ( + MODEL_ASSET_VERSION, + MODEL_ID, + GearGuardNet, +) +from qai_hub_models.utils.asset_loaders import CachedWebModelAsset + +INPUT_IMAGE_ADDRESS = CachedWebModelAsset.from_asset_store( + MODEL_ID, MODEL_ASSET_VERSION, "test_image.jpg" +) + + +def main(is_test: bool = False): + BodyDetectionDemo( + is_test, + GearGuardNet, + MODEL_ID, + BodyDetectionApp, + INPUT_IMAGE_ADDRESS, + 320, + 192, + 0.9, + ) + + +if __name__ == "__main__": + main() diff --git a/qai_hub_models/models/gear_guard_net/export.py b/qai_hub_models/models/gear_guard_net/export.py new file mode 100644 index 00000000..78308196 --- /dev/null +++ b/qai_hub_models/models/gear_guard_net/export.py @@ -0,0 +1,213 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. +# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +# THIS FILE WAS AUTO-GENERATED. DO NOT EDIT MANUALLY. 
+ + +from __future__ import annotations + +import os +import warnings +from pathlib import Path +from typing import Any, Dict, List, Optional, cast + +import qai_hub as hub +import torch + +from qai_hub_models.models.common import ExportResult, TargetRuntime +from qai_hub_models.models.gear_guard_net import Model +from qai_hub_models.utils.args import ( + export_parser, + get_input_spec_kwargs, + get_model_kwargs, +) +from qai_hub_models.utils.compare import torch_inference +from qai_hub_models.utils.input_spec import make_torch_inputs +from qai_hub_models.utils.printing import ( + print_inference_metrics, + print_on_target_demo_cmd, + print_profile_metrics_from_job, +) +from qai_hub_models.utils.qai_hub_helpers import ( + can_access_qualcomm_ai_hub, + export_without_hub_access, +) + + +def export_model( + device: str = "Samsung Galaxy S23 (Family)", + chipset: Optional[str] = None, + skip_profiling: bool = False, + skip_inferencing: bool = False, + skip_downloading: bool = False, + skip_summary: bool = False, + output_dir: Optional[str] = None, + target_runtime: TargetRuntime = TargetRuntime.TFLITE, + compile_options: str = "", + profile_options: str = "", + **additional_model_kwargs, +) -> ExportResult | List[str]: + """ + This function executes the following recipe: + + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference + + Each of the last 4 steps can be optionally skipped using the input options. + + Parameters: + device: Device for which to export the model. + Full list of available devices can be found by running `hub.get_devices()`. + Defaults to DEFAULT_DEVICE if not specified. + chipset: If set, will choose a random device with this chipset. + Overrides the `device` argument. + skip_profiling: If set, skips profiling of compiled model on real devices. + skip_inferencing: If set, skips computing on-device outputs from sample data. + skip_downloading: If set, skips downloading of compiled model. + skip_summary: If set, skips waiting for and summarizing results + from profiling and inference. + output_dir: Directory to store generated assets (e.g. compiled model). + Defaults to `/build/`. + target_runtime: Which on-device runtime to target. Default is TFLite. + compile_options: Additional options to pass when submitting the compile job. + profile_options: Additional options to pass when submitting the profile job. + **additional_model_kwargs: Additional optional kwargs used to customize + `model_cls.from_pretrained` and `model.get_input_spec` + + Returns: + A struct of: + * A CompileJob object containing metadata about the compile job submitted to hub. + * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
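+
+    Example (illustrative; assumes Qualcomm AI Hub access is configured):
+        export_model(chipset="qualcomm-snapdragon-8gen2", skip_profiling=True)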
+ """ + model_name = "gear_guard_net" + output_path = Path(output_dir or Path.cwd() / "build" / model_name) + if chipset: + hub_device = hub.Device(attributes=f"chipset:{chipset}") + else: + hub_device = hub.Device(name=device) + if not can_access_qualcomm_ai_hub(): + return export_without_hub_access( + "gear_guard_net", + "PPE-Detection", + device, + skip_profiling, + skip_inferencing, + skip_downloading, + skip_summary, + output_path, + target_runtime, + compile_options, + profile_options, + ) + + # On-device perf improves with I/O in channel_last format except when using ONNX. + use_channel_last_format = target_runtime != TargetRuntime.ONNX + + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) + input_spec = model.get_input_spec( + **get_input_spec_kwargs(model, additional_model_kwargs) + ) + + # Trace the model + source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) + + # 2. Compiles the model to an asset that can be run on device + model_compile_options = model.get_hub_compile_options( + target_runtime, compile_options, hub_device + ) + print(f"Optimizing model {model_name} to run on-device") + submitted_compile_job = hub.submit_compile_job( + model=source_model, + input_specs=input_spec, + device=hub_device, + name=model_name, + options=model_compile_options, + ) + compile_job = cast(hub.client.CompileJob, submitted_compile_job) + + # 3. Profiles the model performance on a real device + profile_job: Optional[hub.client.ProfileJob] = None + if not skip_profiling: + profile_options_all = model.get_hub_profile_options( + target_runtime, profile_options + ) + print(f"Profiling model {model_name} on a hosted device.") + submitted_profile_job = hub.submit_profile_job( + model=compile_job.get_target_model(), + device=hub_device, + name=model_name, + options=profile_options_all, + ) + profile_job = cast(hub.client.ProfileJob, submitted_profile_job) + + # 4. Inferences the model on sample inputs + inference_job: Optional[hub.client.InferenceJob] = None + if not skip_inferencing: + profile_options_all = model.get_hub_profile_options( + target_runtime, profile_options + ) + print( + f"Running inference for {model_name} on a hosted device with example inputs." + ) + sample_inputs = model.sample_inputs( + input_spec, use_channel_last_format=use_channel_last_format + ) + submitted_inference_job = hub.submit_inference_job( + model=compile_job.get_target_model(), + inputs=sample_inputs, + device=hub_device, + name=model_name, + options=profile_options_all, + ) + inference_job = cast(hub.client.InferenceJob, submitted_inference_job) + + # 5. Downloads the model asset to the local directory + if not skip_downloading: + os.makedirs(output_path, exist_ok=True) + target_model: hub.Model = compile_job.get_target_model() # type: ignore + target_model.download(str(output_path / model_name)) + + # 6. 
Summarizes the results from profiling and inference
+    if not skip_summary and not skip_profiling:
+        assert profile_job is not None and profile_job.wait().success
+        profile_data: Dict[str, Any] = profile_job.download_profile()  # type: ignore
+        print_profile_metrics_from_job(profile_job, profile_data)
+
+    if not skip_summary and not skip_inferencing:
+        sample_inputs = model.sample_inputs(use_channel_last_format=False)
+        torch_out = torch_inference(
+            model, sample_inputs, return_channel_last_output=use_channel_last_format
+        )
+        assert inference_job is not None and inference_job.wait().success
+        inference_result: hub.client.DatasetEntries = inference_job.download_output_data()  # type: ignore
+
+        print_inference_metrics(
+            inference_job, inference_result, torch_out, model.get_output_names()
+        )
+
+    if not skip_summary:
+        print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device)
+
+    return ExportResult(
+        compile_job=compile_job,
+        inference_job=inference_job,
+        profile_job=profile_job,
+    )
+
+
+def main():
+    warnings.filterwarnings("ignore")
+    parser = export_parser(model_cls=Model)
+    args = parser.parse_args()
+    export_model(**vars(args))
+
+
+if __name__ == "__main__":
+    main()
diff --git a/qai_hub_models/models/gear_guard_net/info.yaml b/qai_hub_models/models/gear_guard_net/info.yaml
new file mode 100644
index 00000000..ee874a07
--- /dev/null
+++ b/qai_hub_models/models/gear_guard_net/info.yaml
@@ -0,0 +1,32 @@
+name: PPE-Detection
+# id must match with the model dir name in qai_hub_models
+id: gear_guard_net
+status: public
+headline: Object detection for personal protective equipment (PPE).
+domain: Computer Vision
+description: Detect if a person is wearing personal protective equipment (PPE) in real-time.
+use_case: Object Detection
+tags:
+  - real-time
+license: https://github.com/qcom-ai-hub/ai-hub-models-internal/blob/main/LICENSE
+deploy_license: https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf
+source_repo: https://github.com/quic/ai-hub-models/blob/main/qai_hub_models/models/gear_guard_net/model.py
+technical_details:
+  Inference latency: RealTime
+  Input resolution: 320x192
+  Number of parameters: 7.02M
+  Model size: 13.5 MB
+  Number of output classes: 2
+applicable_scenarios:
+  - IoT
+related_models:
+  - face_body_net
+form_factors:
+  - Phone
+  - Tablet
+  - IoT
+has_static_banner: true
+has_animated_banner: true
+license_type: bsd-3-clause
+deploy_license_type: AI Model Hub License
+dataset: []
diff --git a/qai_hub_models/models/gear_guard_net/model.py b/qai_hub_models/models/gear_guard_net/model.py
new file mode 100644
index 00000000..12cd9617
--- /dev/null
+++ b/qai_hub_models/models/gear_guard_net/model.py
@@ -0,0 +1,124 @@
+# ---------------------------------------------------------------------
+# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+from typing import List, Optional
+
+import torch
+import torch.nn as nn
+
+from qai_hub_models.models._shared.body_detection.model import Model
+from qai_hub_models.utils.asset_loaders import CachedWebModelAsset, load_torch
+from qai_hub_models.utils.base_model import BaseModel
+from qai_hub_models.utils.input_spec import InputSpec
+
+MODEL_ID = __name__.split(".")[-2]
+MODEL_ASSET_VERSION = 1
+DEFAULT_WEIGHTS = CachedWebModelAsset.from_asset_store(
+    MODEL_ID, MODEL_ASSET_VERSION, "weights_v1.1.pt"
+)
+
+
+class GearGuardNet(BaseModel):
+    """GearGuardNet model"""
+
+    def __init__(self, model: nn.Module) -> None:
+        """
+        Initialize GearGuardNet.
+
+        Inputs:
+            model: nn.Module
+                GearGuardNet model.
+        """
+        super().__init__()
+        self.model = model
+
+    @classmethod
+    def from_pretrained(cls, checkpoint_path: Optional[str] = None) -> "GearGuardNet":
+        """
+        Load model from pretrained weights.
+
+        Inputs:
+            checkpoint_path: str
+                Checkpoint path of pretrained weights.
+        Output: GearGuardNet
+            Detection model.
+        """
+        cfg = {
+            "nc": 2,
+            "depth_multiple": 0.33,
+            "width_multiple": 0.5,
+            "anchors": [
+                [10, 13, 16, 30, 33, 23],
+                [30, 61, 62, 45, 59, 119],
+                [116, 90, 156, 198, 373, 326],
+            ],
+            "backbone": [
+                [-1, 1, "FusedConvBatchNorm", [64, 6, 2, 2]],
+                [-1, 1, "FusedConvBatchNorm", [128, 3, 2]],
+                [-1, 3, "DoubleBlazeBlock", [128]],
+                [-1, 1, "FusedConvBatchNorm", [256, 3, 2]],
+                [-1, 3, "DoubleBlazeBlock", [256]],
+                [-1, 1, "FusedConvBatchNorm", [512, 3, 2]],
+                [-1, 9, "DoubleBlazeBlock", [512]],
+                [-1, 1, "FusedConvBatchNorm", [1024, 3, 2]],
+                [-1, 3, "DoubleBlazeBlock", [1024]],
+                [-1, 1, "FusedConvBatchNorm", [1024, 3, 1]],
+            ],
+            "head": [
+                [-1, 1, "FusedConvBatchNorm", [512, 1, 1]],
+                [-1, 1, "nn.Upsample", [None, 2, "nearest"]],
+                [[-1, 6], 1, "Concat", [1]],
+                [-1, 3, "DoubleBlazeBlock", [512]],
+                [-1, 1, "FusedConvBatchNorm", [256, 1, 1]],
+                [-1, 1, "nn.Upsample", [None, 2, "nearest"]],
+                [[-1, 4], 1, "Concat", [1]],
+                [-1, 3, "DoubleBlazeBlock", [256]],
+                [-1, 1, "FusedConvBatchNorm", [256, 3, 2]],
+                [[-1, 14], 1, "Concat", [1]],
+                [-1, 3, "DoubleBlazeBlock", [512]],
+                [-1, 1, "FusedConvBatchNorm", [512, 3, 2]],
+                [[-1, 10], 1, "Concat", [1]],
+                [-1, 3, "DoubleBlazeBlock", [1024]],
+                [[17, 20, 23], 1, "Detect", ["nc", "anchors"]],
+            ],
+        }
+        model = Model(cfg)
+        if checkpoint_path is None:
+            checkpoint_path = DEFAULT_WEIGHTS
+        ckpt = load_torch(checkpoint_path)
+        model.load_state_dict(ckpt)
+        model.eval()
+        return cls(model)
+
+    def forward(self, image: torch.Tensor) -> List[torch.Tensor]:
+        """
+        Forward computation of GearGuardNet.
+
+        Inputs:
+            image: torch.Tensor
+                Input image.
+        Outputs: List[torch.Tensor]
+            Multi-scale detection result.
+        """
+        return self.model(image)
+
+    @staticmethod
+    def get_input_spec(
+        batch_size: int = 1,
+        height: int = 320,
+        width: int = 192,
+    ) -> InputSpec:
+        """
+        Returns the input specification (name -> (shape, type)). This can be
+        used to submit a profiling job on Qualcomm AI Hub.
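+        With the defaults, the spec is {"image": ((1, 3, 320, 192), "float32")}.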
+ """ + return {"image": ((batch_size, 3, height, width), "float32")} + + @staticmethod + def get_output_names() -> List[str]: + return ["bbox_8x", "bbox_16x", "bbox_32x"] + + @staticmethod + def get_channel_last_inputs() -> List[str]: + return ["image"] diff --git a/qai_hub_models/models/gear_guard_net/perf.yaml b/qai_hub_models/models/gear_guard_net/perf.yaml new file mode 100644 index 00000000..e01b7b5b --- /dev/null +++ b/qai_hub_models/models/gear_guard_net/perf.yaml @@ -0,0 +1,432 @@ +aggregated: + supported_oses: + - Android + supported_devices: + - Snapdragon 8 Elite QRD + - Samsung Galaxy S24 + - Samsung Galaxy S24 Ultra + - Samsung Galaxy S24+ + - Snapdragon 8 Gen 3 QRD + - Samsung Galaxy S23 + - Samsung Galaxy S23 Ultra + - Samsung Galaxy S23+ + - Samsung Galaxy S22 5G + - Samsung Galaxy S22 Ultra 5G + - Samsung Galaxy S22+ 5G + - Samsung Galaxy Tab S8 + - Xiaomi 12 + - Xiaomi 12 Pro + - Samsung Galaxy S21 + - Samsung Galaxy S21 Ultra + - Samsung Galaxy S21+ + - Snapdragon X Elite CRD + - Snapdragon X Plus 8-Core CRD + - QCS8450 (Proxy) + - XR2 Gen 2 (Proxy) + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) + supported_chipsets: + - Snapdragon® 8 Elite + - Snapdragon® 8 Gen 3 + - Snapdragon® 8 Gen 2 + - Snapdragon® 8 Gen 1 + - Snapdragon® 888 + - Snapdragon® X Elite + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy +models: +- name: PPE-Detection + performance_metrics: + - torchscript_onnx_tflite: + inference_time: 670.0 + throughput: 1492.5373134328358 + estimated_peak_memory_range: + min: 28672 + max: 243162120 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 80 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 80 + job_id: jp2kyw8rp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 725.0 + throughput: 1379.3103448275863 + estimated_peak_memory_range: + min: 12288 + max: 51797976 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 106 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 106 + job_id: jgo26ryqp + job_status: Passed + torchscript_onnx: + inference_time: 1105.0 + throughput: 904.9773755656108 + estimated_peak_memory_range: + min: 12288 + max: 15443864 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 107 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 107 + job_id: jg9lnm18g + job_status: Passed + reference_device_info: + name: Samsung Galaxy S23 + os: '13' + form_factor: Phone + os_name: Android + manufacturer: Samsung + chipset: Snapdragon® 8 Gen 2 + timestamp: '2024-10-15T00:33:20Z' + - torchscript_onnx_tflite: + inference_time: 561.0 + throughput: 1782.5311942959001 + estimated_peak_memory_range: + min: 12288 + max: 43883088 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 80 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 80 + job_id: jpy13xe8p + job_status: Passed + torchscript_onnx_qnn: + inference_time: 613.0 + throughput: 1631.3213703099511 + estimated_peak_memory_range: + min: 757760 + max: 17179328 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 106 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 106 + job_id: jpv6kdok5 + job_status: Passed + torchscript_onnx: + inference_time: 943.0 + throughput: 1060.4453870625662 + estimated_peak_memory_range: + min: 0 + max: 48287296 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 107 + layers_on_gpu: 0 + layers_on_cpu: 0 
+ total_layers: 107 + job_id: jp14zjl7p + job_status: Passed + reference_device_info: + name: Samsung Galaxy S24 + os: '14' + form_factor: Phone + os_name: Android + manufacturer: Samsung + chipset: Snapdragon® 8 Gen 3 + timestamp: '2024-10-15T00:33:21Z' + - torchscript_onnx_tflite: + inference_time: 666.0 + throughput: 1501.5015015015015 + estimated_peak_memory_range: + min: 12288 + max: 4694624 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 80 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 80 + job_id: jp0z0jy95 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 722.0 + throughput: 1385.0415512465374 + estimated_peak_memory_range: + min: 765952 + max: 2413360 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 106 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 106 + job_id: jpedmz1o5 + job_status: Passed + reference_device_info: + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:33:12Z' + - torchscript_onnx_tflite: + inference_time: 671.0 + throughput: 1490.312965722802 + estimated_peak_memory_range: + min: 135168 + max: 253906056 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 80 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 80 + job_id: jglvmxnj5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 728.0 + throughput: 1373.6263736263736 + estimated_peak_memory_range: + min: 782336 + max: 2112144 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 106 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 106 + job_id: jg9lnm1wg + job_status: Passed + reference_device_info: + name: SA8255 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:33:15Z' + - torchscript_onnx_tflite: + inference_time: 669.0 + throughput: 1494.7683109118086 + estimated_peak_memory_range: + min: 28672 + max: 181989176 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 80 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 80 + job_id: j5q6qy8np + job_status: Passed + torchscript_onnx_qnn: + inference_time: 731.0 + throughput: 1367.9890560875513 + estimated_peak_memory_range: + min: 770048 + max: 1966424 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 106 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 106 + job_id: j5we67v35 + job_status: Passed + reference_device_info: + name: SA8775 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:33:14Z' + - torchscript_onnx_tflite: + inference_time: 666.0 + throughput: 1501.5015015015015 + estimated_peak_memory_range: + min: 16384 + max: 7082824 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 80 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 80 + job_id: jgkex4zwg + job_status: Passed + torchscript_onnx_qnn: + inference_time: 730.0 + throughput: 1369.86301369863 + estimated_peak_memory_range: + min: 770048 + max: 2102016 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 106 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 106 + job_id: jgz3dm9o5 + job_status: Passed + reference_device_info: + name: SA8650 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8650P Proxy + timestamp: 
'2024-10-15T00:33:13Z' + - torchscript_onnx_tflite: + inference_time: 1428.0 + throughput: 700.2801120448179 + estimated_peak_memory_range: + min: 16384 + max: 40736464 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 80 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 80 + job_id: jp8qyxokp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 1492.0 + throughput: 670.2412868632708 + estimated_peak_memory_range: + min: 753664 + max: 17038528 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 106 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 106 + job_id: jgdx139rp + job_status: Passed + reference_device_info: + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:33:18Z' + - torchscript_onnx_tflite: + inference_time: 480.0 + throughput: 2083.3333333333335 + estimated_peak_memory_range: + min: 8192 + max: 23210944 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 80 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 80 + job_id: jp3j09k3g + job_status: Passed + torchscript_onnx_qnn: + inference_time: 440.0 + throughput: 2272.7272727272725 + estimated_peak_memory_range: + min: 0 + max: 14730448 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 106 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 106 + job_id: j5we67vm5 + job_status: Passed + torchscript_onnx: + inference_time: 713.0 + throughput: 1402.5245441795232 + estimated_peak_memory_range: + min: 0 + max: 24912112 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 107 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 107 + job_id: jp4lr1o15 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:33:24Z' + - torchscript_onnx_qnn: + inference_time: 853.0 + throughput: 1172.3329425556858 + estimated_peak_memory_range: + min: 737280 + max: 737280 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 106 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 106 + job_id: jgjvn7mvg + job_status: Passed + torchscript_onnx: + inference_time: 1173.0 + throughput: 852.5149190110827 + estimated_peak_memory_range: + min: 13352960 + max: 13352960 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 107 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 107 + job_id: jgdx139zp + job_status: Passed + reference_device_info: + name: Snapdragon X Elite CRD + os: '11' + form_factor: Compute + os_name: Windows + manufacturer: Qualcomm + chipset: Snapdragon® X Elite + timestamp: '2024-10-15T00:33:22Z' diff --git a/qai_hub_models/models/gear_guard_net/test.py b/qai_hub_models/models/gear_guard_net/test.py new file mode 100644 index 00000000..534809d0 --- /dev/null +++ b/qai_hub_models/models/gear_guard_net/test.py @@ -0,0 +1,48 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. 
+# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +import numpy as np +import pytest + +from qai_hub_models.models._shared.body_detection.app import BodyDetectionApp +from qai_hub_models.models.gear_guard_net.demo import main as demo_main +from qai_hub_models.models.gear_guard_net.model import ( + MODEL_ASSET_VERSION, + MODEL_ID, + GearGuardNet, +) +from qai_hub_models.utils.asset_loaders import CachedWebModelAsset, load_raw_file +from qai_hub_models.utils.bounding_box_processing import get_iou +from qai_hub_models.utils.testing import skip_clone_repo_check + +INPUT_IMAGE_ADDRESS = CachedWebModelAsset.from_asset_store( + MODEL_ID, MODEL_ASSET_VERSION, "test_image.jpg" +) +GROUND_TRUTH_RESULT = CachedWebModelAsset.from_asset_store( + MODEL_ID, MODEL_ASSET_VERSION, "ground_truth.txt" +) + + +@skip_clone_repo_check +def test_task(): + app = BodyDetectionApp(GearGuardNet.from_pretrained()) + result = app.detect(INPUT_IMAGE_ADDRESS, 320, 192, 0.9) + assert len(result) == 2 + + +@pytest.mark.trace +@skip_clone_repo_check +def test_trace(): + app = BodyDetectionApp(GearGuardNet.from_pretrained().convert_to_torchscript()) + result = app.detect(INPUT_IMAGE_ADDRESS, 320, 192, 0.9) + gt = load_raw_file(GROUND_TRUTH_RESULT) + gt = np.array(gt.split(), dtype=int) + result = result.astype(int) + assert result[0][0] == gt[0] + assert get_iou(result[0][1:5], gt[1:5]) > 0.8 + + +@skip_clone_repo_check +def test_demo(): + demo_main(is_test=True) diff --git a/qai_hub_models/models/googlenet/README.md b/qai_hub_models/models/googlenet/README.md index bf12b13f..fc59c1fd 100644 --- a/qai_hub_models/models/googlenet/README.md +++ b/qai_hub_models/models/googlenet/README.md @@ -6,7 +6,7 @@ GoogLeNet is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of GoogLeNet found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/googlenet.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/googlenet). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.googlenet.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of GoogLeNet can be found +* The license for the original implementation of GoogLeNet can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Going Deeper with Convolutions](https://arxiv.org/abs/1409.4842) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/googlenet.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/googlenet/export.py b/qai_hub_models/models/googlenet/export.py index b53c56a8..7e96979f 100644 --- a/qai_hub_models/models/googlenet/export.py +++ b/qai_hub_models/models/googlenet/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.googlenet import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
""" model_name = "googlenet" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/googlenet/perf.yaml b/qai_hub_models/models/googlenet/perf.yaml index da2445b1..bec69247 100644 --- a/qai_hub_models/models/googlenet/perf.yaml +++ b/qai_hub_models/models/googlenet/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: GoogLeNet performance_metrics: - torchscript_onnx_tflite: - inference_time: 1013.0 - throughput: 987.1668311944719 + inference_time: 1015.0 + throughput: 985.2216748768473 estimated_peak_memory_range: - min: 32768 - max: 1365224 + min: 36864 + max: 25018896 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 84 - job_id: jep287drp + job_id: jpxko4w35 job_status: Passed torchscript_onnx_qnn: - inference_time: 1079.0 - throughput: 926.7840593141798 + inference_time: 1077.0 + throughput: 928.5051067780872 estimated_peak_memory_range: - min: 28672 - max: 36451744 + min: 20480 + max: 34586888 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 143 - job_id: jw5663m65 + job_id: j5q6qyjnp job_status: Passed torchscript_onnx: - inference_time: 1274.0 - throughput: 784.9293563579278 + inference_time: 1143.0 + throughput: 874.8906386701663 estimated_peak_memory_range: - min: 12288 - max: 15713920 + min: 294912 + max: 37222592 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 145 - job_id: jmg9v3qw5 + job_id: jg9lnmvwg job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:33:46Z' + timestamp: '2024-10-15T00:32:33Z' - torchscript_onnx_tflite: - inference_time: 811.0 - throughput: 1233.0456226880394 + inference_time: 743.0 + throughput: 1345.8950201884254 estimated_peak_memory_range: min: 16384 - max: 51495824 + max: 52283360 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ 
models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 84 - job_id: jqpye428g + job_id: j5mnxmjdp job_status: Passed torchscript_onnx_qnn: - inference_time: 790.0 - throughput: 1265.8227848101267 + inference_time: 787.0 + throughput: 1270.6480304955528 estimated_peak_memory_range: - min: 0 - max: 16446528 + min: 299008 + max: 15968768 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 143 - job_id: j1p3k4735 + job_id: jglvmxjj5 job_status: Passed torchscript_onnx: - inference_time: 937.0 - throughput: 1067.2358591248667 + inference_time: 901.0 + throughput: 1109.8779134295228 estimated_peak_memory_range: - min: 0 - max: 53819328 + min: 258048 + max: 55363168 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 145 - job_id: jnp10dm85 + job_id: jp14zj08p job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:33:47Z' + timestamp: '2024-10-15T00:32:34Z' - torchscript_onnx_tflite: - inference_time: 1010.0 - throughput: 990.0990099009902 + inference_time: 1013.0 + throughput: 987.1668311944719 estimated_peak_memory_range: - min: 12288 - max: 13186544 + min: 20480 + max: 1369672 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 84 - job_id: j2p0y199g + job_id: jgn6vnjk5 job_status: Passed torchscript_onnx_qnn: - inference_time: 898.0 - throughput: 1113.5857461024498 + inference_time: 899.0 + throughput: 1112.3470522803113 estimated_peak_memory_range: min: 634880 - max: 2319688 + max: 1861080 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 143 - job_id: j1pv31nk5 + job_id: jp3j09y3g job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:33:41Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:32:25Z' - torchscript_onnx_tflite: - inference_time: 1491.0 - throughput: 670.690811535882 + inference_time: 1015.0 + throughput: 985.2216748768473 estimated_peak_memory_range: - min: 20480 - max: 52825136 + min: 28672 + max: 84791208 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 84 - job_id: j1p8o3rkg + job_id: jp0z0jn95 job_status: Passed torchscript_onnx_qnn: - inference_time: 1564.0 - throughput: 639.386189258312 + inference_time: 904.0 + throughput: 1106.1946902654868 estimated_peak_memory_range: - min: 618496 - max: 18373472 + min: 626688 + max: 1913544 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 143 - job_id: jz5wodr3p + job_id: jgjvn7xvg job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:33:45Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:32:29Z' - torchscript_onnx_tflite: - inference_time: 1013.0 - throughput: 987.1668311944719 + inference_time: 1012.0 + throughput: 988.1422924901186 estimated_peak_memory_range: - min: 12288 - max: 4872152 + 
min: 24576 + max: 1484152 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 84 - job_id: jogkzl0wg + job_id: jpy13x98p job_status: Passed torchscript_onnx_qnn: - inference_time: 904.0 - throughput: 1106.1946902654868 + inference_time: 896.0 + throughput: 1116.0714285714287 estimated_peak_memory_range: min: 626688 - max: 1938840 + max: 2228968 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 143 - job_id: j7gjx08vp + job_id: jpv6kd3k5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:33:42Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:32:28Z' - torchscript_onnx_tflite: - inference_time: 1014.0 - throughput: 986.1932938856016 + inference_time: 1011.0 + throughput: 989.1196834817013 estimated_peak_memory_range: - min: 24576 - max: 1423208 + min: 40960 + max: 3671544 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 84 - job_id: jn5q871n5 + job_id: jp2kyw2rp job_status: Passed torchscript_onnx_qnn: - inference_time: 905.0 - throughput: 1104.9723756906078 + inference_time: 897.0 + throughput: 1114.8272017837235 estimated_peak_memory_range: - min: 626688 - max: 2273688 + min: 630784 + max: 1932272 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 143 - job_id: jlpe9rnog + job_id: jgo26rjqp job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:33:43Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:32:27Z' - torchscript_onnx_tflite: - inference_time: 1012.0 - throughput: 988.1422924901186 + inference_time: 1501.0 + throughput: 666.2225183211193 estimated_peak_memory_range: - min: 12288 - max: 4382048 + min: 16384 + max: 53582944 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 84 - job_id: j1gln08jp + job_id: jprv30z0g job_status: Passed torchscript_onnx_qnn: - inference_time: 937.0 - throughput: 1067.2358591248667 + inference_time: 1577.0 + throughput: 634.1154090044388 estimated_peak_memory_range: - min: 643072 - max: 2019512 + min: 618496 + max: 20789424 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 143 - job_id: jygzex0og + job_id: jgz3dmeo5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:32:31Z' + - torchscript_onnx_tflite: + inference_time: 674.0 + throughput: 1483.679525222552 + estimated_peak_memory_range: + min: 8192 + max: 19830480 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 84 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 84 + job_id: jgkex4jwg + job_status: Passed + torchscript_onnx_qnn: + inference_time: 719.0 + throughput: 1390.8205841446454 + estimated_peak_memory_range: + min: 0 + max: 11830000 + primary_compute_unit: NPU + precision: fp16 + 
layer_info: + layers_on_npu: 143 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 143 + job_id: j5we67o35 + job_status: Passed + torchscript_onnx: + inference_time: 846.0 + throughput: 1182.033096926714 + estimated_peak_memory_range: + min: 0 + max: 20493024 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 145 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 145 + job_id: jp4lr1q85 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:33:44Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:32:37Z' - torchscript_onnx_qnn: - inference_time: 1043.0 - throughput: 958.7727708533077 + inference_time: 1060.0 + throughput: 943.3962264150944 estimated_peak_memory_range: min: 602112 max: 602112 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 143 - job_id: jwgoy1wq5 + job_id: j56y47k6p job_status: Passed torchscript_onnx: - inference_time: 1323.0 - throughput: 755.8578987150415 + inference_time: 1334.0 + throughput: 749.6251874062968 estimated_peak_memory_range: - min: 15245312 - max: 15245312 + min: 14516224 + max: 14516224 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 145 - job_id: jvgdwrmr5 + job_id: jgdx13wrp job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:33:48Z' + timestamp: '2024-10-15T00:32:35Z' diff --git a/qai_hub_models/models/googlenet_quantized/README.md b/qai_hub_models/models/googlenet_quantized/README.md index 49c02f9f..bef9fc8f 100644 --- a/qai_hub_models/models/googlenet_quantized/README.md +++ b/qai_hub_models/models/googlenet_quantized/README.md @@ -6,7 +6,7 @@ GoogLeNet is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of GoogLeNetQuantized found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/googlenet.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/googlenet_quantized). @@ -17,11 +17,6 @@ across various devices can be found [here](https://aihub.qualcomm.com/models/g ## Example & Usage -Install the package via pip: -```bash -pip install "qai_hub_models[googlenet_quantized]" -``` - Once installed, run the following simple CLI demo: @@ -44,15 +39,19 @@ python -m qai_hub_models.models.googlenet_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of GoogLeNetQuantized can be found +* The license for the original implementation of GoogLeNetQuantized can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Going Deeper with Convolutions](https://arxiv.org/abs/1409.4842) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/googlenet.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/googlenet_quantized/evaluate.py b/qai_hub_models/models/googlenet_quantized/evaluate.py index 0e8be6d5..57156cc6 100644 --- a/qai_hub_models/models/googlenet_quantized/evaluate.py +++ b/qai_hub_models/models/googlenet_quantized/evaluate.py @@ -13,10 +13,8 @@ from qai_hub_models.models.googlenet_quantized import MODEL_ID, Model from qai_hub_models.utils.args import evaluate_parser, get_hub_device, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.evaluate import evaluate_on_dataset from qai_hub_models.utils.inference import compile_model_from_args -from qai_hub_models.utils.quantization_aimet import AIMETQuantizableMixin SUPPORTED_DATASETS = ["imagenette", "imagenet"] @@ -27,6 +25,7 @@ def main(): model_cls=Model, default_split_size=2500, supported_datasets=SUPPORTED_DATASETS, + is_hub_quantized=True, ) args = parser.parse_args() args.device = None @@ -38,13 +37,7 @@ def main(): MODEL_ID, args, get_model_kwargs(Model, vars(args)) ) hub_device = get_hub_device(None, args.chipset) - - # Use Fp16 model for torch inference - for cls in Model.__mro__: - if issubclass(cls, BaseModel) and not issubclass(cls, AIMETQuantizableMixin): - torch_cls = cls - break - torch_model = torch_cls.from_pretrained(**get_model_kwargs(torch_cls, vars(args))) + torch_model = Model.from_pretrained(**get_model_kwargs(Model, vars(args))) evaluate_on_dataset( hub_model, torch_model, diff --git a/qai_hub_models/models/googlenet_quantized/export.py b/qai_hub_models/models/googlenet_quantized/export.py index 8932c1fe..1a83d465 100644 --- a/qai_hub_models/models/googlenet_quantized/export.py +++ b/qai_hub_models/models/googlenet_quantized/export.py @@ -10,18 +10,20 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.googlenet_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference +from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( print_inference_metrics, print_on_target_demo_cmd, @@ -31,11 +33,14 @@ can_access_qualcomm_ai_hub, export_without_hub_access, ) +from qai_hub_models.utils.quantization import get_calibration_data def export_model( device: str = "Samsung Galaxy S23 (Family)", chipset: Optional[str] = None, + num_calibration_samples: 
int = 100, + skip_compiling: bool = False, skip_profiling: bool = False, skip_inferencing: bool = False, skip_downloading: bool = False, @@ -45,20 +50,19 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + 3. Compiles the model to an asset that can be run on device + 4. Profiles the model performance on a real device + 5. Inferences the model on sample inputs + 6. Downloads the model asset to the local directory + 7. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 5 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -66,6 +70,9 @@ def export_model( Defaults to DEFAULT_DEVICE if not specified. chipset: If set, will choose a random device with this chipset. Overrides the `device` argument. + num_calibration_samples: The number of calibration data samples + to use for quantization. + skip_compiling: If set, skips compiling model to format that can run on device. skip_profiling: If set, skips profiling of compiled model on real devices. skip_inferencing: If set, skips computing on-device outputs from sample data. skip_downloading: If set, skips downloading of compiled model. @@ -80,10 +87,11 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: - * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). + A struct of: + * A CompileJob object containing metadata about the compile job submitted to hub (None if compiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). + * A QuantizeJob object containing metadata about the quantize job submitted to hub """ model_name = "googlenet_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,33 +117,52 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. 
Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) ) # Trace the model - source_model = model.convert_to_hub_source_model( - target_runtime, output_path, input_spec + source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) + + print(f"Quantizing model {model_name} with {num_calibration_samples} samples.") + # 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + onnx_compile_job = hub.submit_compile_job( + model=source_model, + input_specs=input_spec, + device=hub_device, + name=model_name, + options="--target_runtime onnx", + ) + quantize_job = hub.submit_quantize_job( + model=onnx_compile_job.get_target_model(), + calibration_data=get_calibration_data( + input_spec, "imagenette", num_calibration_samples + ), + weights_dtype=model.get_weights_dtype(), + activations_dtype=model.get_activations_dtype(), + name=model_name, + options=model.get_quantize_options(), ) + if skip_compiling: + return ExportResult(quantize_job=quantize_job) - # 2. Compile the model to an on-device asset + # 3. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) print(f"Optimizing model {model_name} to run on-device") submitted_compile_job = hub.submit_compile_job( - model=source_model, + model=quantize_job.get_target_model(), input_specs=input_spec, device=hub_device, name=model_name, - calibration_data=model.get_calibration_data(target_runtime), options=model_compile_options, ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 4. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +177,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 5. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +198,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 6. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 7. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -202,12 +229,17 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + quantize_job=quantize_job, + ) def main(): warnings.filterwarnings("ignore") - parser = export_parser(model_cls=Model) + parser = export_parser(model_cls=Model, is_hub_quantized=True) args = parser.parse_args() export_model(**vars(args)) diff --git a/qai_hub_models/models/googlenet_quantized/model.py b/qai_hub_models/models/googlenet_quantized/model.py index fc0e583d..f2622c3f 100644 --- a/qai_hub_models/models/googlenet_quantized/model.py +++ b/qai_hub_models/models/googlenet_quantized/model.py @@ -4,104 +4,11 @@ # --------------------------------------------------------------------- from __future__ import annotations -# isort: off -# This verifies aimet is installed, and this must be included first. -from qai_hub_models.utils.quantization_aimet import ( - AIMETQuantizableMixin, -) - -# isort: on - -from typing import Optional - -import torch -from aimet_torch.cross_layer_equalization import equalize_model -from aimet_torch.model_preparer import prepare_model -from aimet_torch.quantsim import QuantizationSimModel, load_encodings_to_sim -from qai_hub import Device - -from qai_hub_models.models.common import TargetRuntime from qai_hub_models.models.googlenet.model import GoogLeNet -from qai_hub_models.utils.aimet.config_loader import get_default_aimet_config -from qai_hub_models.utils.asset_loaders import CachedWebModelAsset -from qai_hub_models.utils.quantization_aimet import ( - constrain_quantized_inputs_to_image_range, - tie_observers, -) +from qai_hub_models.utils.quantization import HubQuantizableMixin MODEL_ID = __name__.split(".")[-2] -MODEL_ASSET_VERSION = 4 -DEFAULT_ENCODINGS = "googlenet_quantized_encodings.json" - - -class GoogLeNetQuantizable(AIMETQuantizableMixin, GoogLeNet): - """GoogleNet with post train quantization support. - - Supports only 8 bit weights and activations, and only loads pre-quantized checkpoints. - Support for quantizing using your own weights & data will come at a later date.""" - - def __init__( - self, - sim_model: QuantizationSimModel, - ) -> None: - # Input is already normalized by sim_model. Disable it in the wrapper model. - GoogLeNet.__init__(self, sim_model.model, normalize_input=False) - AIMETQuantizableMixin.__init__( - self, - sim_model, - needs_onnx_direct_aimet_export=True, - ) - - @classmethod - def from_pretrained( - cls, - aimet_encodings: str | None = "DEFAULT", - ) -> "GoogLeNetQuantizable": - """ - Parameters: - aimet_encodings: - if "DEFAULT": Loads the model with aimet encodings calibrated on imagenette. - elif None: Doesn't load any encodings. Used when computing encodings. - else: Interprets as a filepath and loads the encodings stored there. 
- """ - model = GoogLeNet.from_pretrained() - input_shape = cls.get_input_spec()["image_tensor"][0] - - model = prepare_model(model) - equalize_model(model, input_shape) - sim = QuantizationSimModel( - model, - quant_scheme="tf_enhanced", - default_param_bw=8, - default_output_bw=8, - config_file=get_default_aimet_config(), - dummy_input=torch.rand(input_shape), - ) - tie_observers(sim) - constrain_quantized_inputs_to_image_range(sim) - - if aimet_encodings: - if aimet_encodings == "DEFAULT": - aimet_encodings = CachedWebModelAsset.from_asset_store( - MODEL_ID, MODEL_ASSET_VERSION, DEFAULT_ENCODINGS - ).fetch() - load_encodings_to_sim(sim, aimet_encodings) - return cls(sim) - # TODO(12424) remove this once encodings export correctly - def get_hub_compile_options( - self, - target_runtime: TargetRuntime, - other_compile_options: str = "", - device: Optional[Device] = None, - ) -> str: - compile_options = super().get_hub_compile_options( - target_runtime, other_compile_options, device - ) - if target_runtime not in [ - TargetRuntime.ONNX, - TargetRuntime.PRECOMPILED_QNN_ONNX, - ]: - compile_options += " --quantize_full_type int8" - return compile_options +class GoogLeNetQuantizable(HubQuantizableMixin, GoogLeNet): + pass diff --git a/qai_hub_models/models/googlenet_quantized/perf.yaml b/qai_hub_models/models/googlenet_quantized/perf.yaml index 044337e8..01959c73 100644 --- a/qai_hub_models/models/googlenet_quantized/perf.yaml +++ b/qai_hub_models/models/googlenet_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,44 +20,39 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8775P Proxy models: - name: GoogLeNetQuantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 275.0 - throughput: 3636.3636363636365 + inference_time: 284.0 + throughput: 3521.1267605633802 estimated_peak_memory_range: - min: 36864 - max: 1543984 + min: 12288 + max: 2756880 primary_compute_unit: NPU precision: int8 layer_info: @@ -64,37 +60,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 86 - job_id: joprk4x05 + job_id: jgjvd0z8g job_status: Passed torchscript_onnx_qnn: - inference_time: 340.0 - throughput: 2941.176470588235 + inference_time: 342.0 + throughput: 2923.9766081871344 estimated_peak_memory_range: - min: 16384 - max: 12814800 + min: 28672 + max: 10027984 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 86 + layers_on_npu: 143 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 86 - job_id: j1p3k4o35 + total_layers: 143 + job_id: jgn609mm5 job_status: Passed torchscript_onnx: - inference_time: 
1229.0 - throughput: 813.6696501220505 + inference_time: 499.0 + throughput: 2004.0080160320642 estimated_peak_memory_range: - min: 151552 - max: 1582768 + min: 45056 + max: 10383960 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 91 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jvgdwr4r5 + total_layers: 91 + job_id: jgo2z1ndp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -103,13 +99,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:33:09Z' + timestamp: '2024-10-17T17:31:01Z' - torchscript_onnx_tflite: - inference_time: 200.0 - throughput: 5000.0 + inference_time: 209.0 + throughput: 4784.688995215311 estimated_peak_memory_range: min: 12288 - max: 39364304 + max: 39378000 primary_compute_unit: NPU precision: int8 layer_info: @@ -117,37 +113,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 86 - job_id: jep287orp + job_id: jpedore05 job_status: Passed torchscript_onnx_qnn: - inference_time: 251.0 - throughput: 3984.06374501992 + inference_time: 253.0 + throughput: 3952.5691699604745 estimated_peak_memory_range: - min: 0 - max: 13284944 + min: 159744 + max: 15566272 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 86 + layers_on_npu: 143 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 86 - job_id: jwgoy1dq5 + total_layers: 143 + job_id: jprv642eg job_status: Passed torchscript_onnx: - inference_time: 884.0 - throughput: 1131.2217194570135 + inference_time: 461.0 + throughput: 2169.1973969631235 estimated_peak_memory_range: min: 0 - max: 54190736 + max: 59955712 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 91 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jz57zjnvp + total_layers: 91 + job_id: jpv6q1rm5 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -156,13 +152,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:33:10Z' + timestamp: '2024-10-17T17:31:03Z' - torchscript_onnx_tflite: - inference_time: 280.0 - throughput: 3571.4285714285716 + inference_time: 920.0 + throughput: 1086.9565217391305 estimated_peak_memory_range: min: 12288 - max: 112193256 + max: 22360992 primary_compute_unit: NPU precision: int8 layer_info: @@ -170,37 +166,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 86 - job_id: jqpye488g + job_id: jgz32xo65 job_status: Passed torchscript_onnx_qnn: - inference_time: 302.0 - throughput: 3311.2582781456954 + inference_time: 1155.0 + throughput: 865.8008658008658 estimated_peak_memory_range: - min: 180224 - max: 1293824 + min: 12288 + max: 7596560 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 86 + layers_on_npu: 143 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 86 - job_id: j7gjx0yvp + total_layers: 143 + job_id: jp2kx79mp job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:33:02Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-17T17:30:44Z' - torchscript_onnx_tflite: - inference_time: 346.0 - throughput: 2890.173410404624 + inference_time: 5708.0 + throughput: 175.1927119831815 estimated_peak_memory_range: - min: 16384 - max: 39709408 + min: 49152 + max: 2077784 primary_compute_unit: NPU precision: int8 layer_info: @@ -208,14 +204,22 @@ models: layers_on_gpu: 
0 layers_on_cpu: 0 total_layers: 86 - job_id: j2p0y1o9g + job_id: j5wewd2j5 job_status: Passed - torchscript_onnx_qnn: - inference_time: 405.0 - throughput: 2469.135802469136 + reference_device_info: + name: RB5 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8250 Proxy + timestamp: '2024-10-17T17:30:25Z' + - torchscript_onnx_tflite: + inference_time: 289.0 + throughput: 3460.2076124567475 estimated_peak_memory_range: min: 12288 - max: 18102896 + max: 1551016 primary_compute_unit: NPU precision: int8 layer_info: @@ -223,22 +227,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 86 - job_id: jmg9v32w5 + job_id: jg9l03jvg + job_status: Passed + torchscript_onnx_qnn: + inference_time: 299.0 + throughput: 3344.4816053511704 + estimated_peak_memory_range: + min: 200704 + max: 1480456 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 143 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 143 + job_id: jpy1z4j4p job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:33:07Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-17T17:30:46Z' - torchscript_onnx_tflite: - inference_time: 279.0 - throughput: 3584.2293906810037 + inference_time: 278.0 + throughput: 3597.122302158273 estimated_peak_memory_range: - min: 32768 - max: 1440040 + min: 12288 + max: 1378104 primary_compute_unit: NPU precision: int8 layer_info: @@ -246,37 +265,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 86 - job_id: j1p8o3jkg + job_id: jp142dylp job_status: Passed torchscript_onnx_qnn: inference_time: 301.0 throughput: 3322.2591362126245 estimated_peak_memory_range: min: 184320 - max: 1705792 + max: 1514328 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 86 + layers_on_npu: 143 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 86 - job_id: jlpe9rxog + total_layers: 143 + job_id: jp8q23m8p job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:33:03Z' + chipset: SA8255P Proxy + timestamp: '2024-10-17T17:30:50Z' - torchscript_onnx_tflite: - inference_time: 280.0 - throughput: 3571.4285714285716 + inference_time: 290.0 + throughput: 3448.2758620689656 estimated_peak_memory_range: - min: 12288 - max: 1459240 + min: 28672 + max: 32153096 primary_compute_unit: NPU precision: int8 layer_info: @@ -284,22 +303,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 86 - job_id: jogkzl6wg + job_id: jgdxnrelp job_status: Passed torchscript_onnx_qnn: - inference_time: 294.0 - throughput: 3401.360544217687 + inference_time: 306.0 + throughput: 3267.97385620915 estimated_peak_memory_range: - min: 184320 - max: 1430800 + min: 180224 + max: 1566248 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 86 + layers_on_npu: 143 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 86 - job_id: jygzexyog + total_layers: 143 + job_id: jgkevlqog job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -307,14 +326,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:33:05Z' + chipset: SA8775P Proxy + timestamp: '2024-10-17T17:30:51Z' - torchscript_onnx_tflite: - inference_time: 282.0 - 
throughput: 3546.099290780142 + inference_time: 352.0 + throughput: 2840.909090909091 estimated_peak_memory_range: - min: 12288 - max: 112695712 + min: 16384 + max: 39549744 primary_compute_unit: NPU precision: int8 layer_info: @@ -322,37 +341,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 86 - job_id: jn5q874n5 + job_id: j57y2jlr5 job_status: Passed torchscript_onnx_qnn: - inference_time: 301.0 - throughput: 3322.2591362126245 + inference_time: 410.0 + throughput: 2439.0243902439024 estimated_peak_memory_range: - min: 180224 - max: 1450768 + min: 163840 + max: 17548080 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 86 + layers_on_npu: 143 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 86 - job_id: jz5wodz3p + total_layers: 143 + job_id: j5q607rmp job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:33:06Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-17T17:30:53Z' - torchscript_onnx_tflite: - inference_time: 906.0 - throughput: 1103.7527593818984 + inference_time: 180.0 + throughput: 5555.555555555556 estimated_peak_memory_range: - min: 12288 - max: 22392944 + min: 8192 + max: 20749520 primary_compute_unit: NPU precision: int8 layer_info: @@ -360,83 +379,75 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 86 - job_id: j1gln0wjp + job_id: jp4lnxdl5 job_status: Passed torchscript_onnx_qnn: - inference_time: 1153.0 - throughput: 867.3026886383348 + inference_time: 255.0 + throughput: 3921.5686274509803 estimated_peak_memory_range: - min: 12288 - max: 8146544 + min: 163840 + max: 12187936 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 86 + layers_on_npu: 143 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 86 - job_id: jnp10d185 + total_layers: 143 + job_id: jglv402l5 job_status: Passed - reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot - os_name: Android - manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T12:33:08Z' - - torchscript_onnx_tflite: - inference_time: 5814.0 - throughput: 171.9986240110079 + torchscript_onnx: + inference_time: 403.0 + throughput: 2481.3895781637716 estimated_peak_memory_range: - min: 28672 - max: 6101632 + min: 0 + max: 27298256 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 86 + layers_on_npu: 91 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 86 - job_id: jw5663o65 + total_layers: 91 + job_id: jpedorw05 job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T12:32:58Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-17T17:31:06Z' - torchscript_onnx_qnn: - inference_time: 420.0 - throughput: 2380.9523809523807 + inference_time: 409.0 + throughput: 2444.987775061125 estimated_peak_memory_range: - min: 499712 - max: 499712 + min: 512000 + max: 512000 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 86 + layers_on_npu: 143 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 86 - job_id: j1pv31mk5 + total_layers: 143 + job_id: jp0z412e5 job_status: Passed torchscript_onnx: - inference_time: 1282.0 - throughput: 780.0312012480499 + inference_time: 549.0 + throughput: 1821.4936247723133 estimated_peak_memory_range: - min: 15044608 
- max: 15044608 + min: 8499200 + max: 8499200 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 91 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jqp4qx48g + total_layers: 91 + job_id: jgjvd028g job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -445,4 +456,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:33:11Z' + timestamp: '2024-10-17T17:31:05Z' diff --git a/qai_hub_models/models/googlenet_quantized/requirements.txt b/qai_hub_models/models/googlenet_quantized/requirements.txt deleted file mode 100644 index de5b80e8..00000000 --- a/qai_hub_models/models/googlenet_quantized/requirements.txt +++ /dev/null @@ -1 +0,0 @@ -aimet-torch==1.32.1.post1; sys_platform == "linux" diff --git a/qai_hub_models/models/googlenet_quantized/test.py b/qai_hub_models/models/googlenet_quantized/test.py deleted file mode 100644 index c116898d..00000000 --- a/qai_hub_models/models/googlenet_quantized/test.py +++ /dev/null @@ -1,29 +0,0 @@ -# --------------------------------------------------------------------- -# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. -# SPDX-License-Identifier: BSD-3-Clause -# --------------------------------------------------------------------- -from qai_hub_models.models._shared.imagenet_classifier.test_utils import ( - run_imagenet_classifier_test, -) -from qai_hub_models.models.googlenet_quantized.demo import main as demo_main -from qai_hub_models.models.googlenet_quantized.model import ( - MODEL_ASSET_VERSION, - MODEL_ID, - GoogLeNetQuantizable, -) - - -def test_task(): - run_imagenet_classifier_test( - GoogLeNetQuantizable.from_pretrained(), - MODEL_ID, - asset_version=MODEL_ASSET_VERSION, - diff_tol=0.005, - rtol=0.02, - atol=0.2, - ) - - -def test_demo(): - # Verify demo does not crash - demo_main(is_test=True) diff --git a/qai_hub_models/models/hrnet_pose/README.md b/qai_hub_models/models/hrnet_pose/README.md index 2fa47408..6c41c41d 100644 --- a/qai_hub_models/models/hrnet_pose/README.md +++ b/qai_hub_models/models/hrnet_pose/README.md @@ -6,7 +6,7 @@ HRNet performs pose estimation in high-resolution representations. This is based on the implementation of HRNetPose found -[here](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/hrnet_posenet). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/hrnet_pose). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.hrnet_pose.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of HRNetPose can be found +* The license for the original implementation of HRNetPose can be found [here](https://github.com/quic/aimet-model-zoo/blob/develop/LICENSE.pdf).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Deep High-Resolution Representation Learning for Human Pose Estimation](https://arxiv.org/abs/1902.09212) * [Source Model Implementation](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/hrnet_posenet) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/hrnet_pose/export.py b/qai_hub_models/models/hrnet_pose/export.py index d77fbba8..5af9642c 100644 --- a/qai_hub_models/models/hrnet_pose/export.py +++ b/qai_hub_models/models/hrnet_pose/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.hrnet_pose import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
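+
+    Example (an illustrative sketch, not emitted by this script; the
+    device name below is an assumption):
+
+        from qai_hub_models.models.hrnet_pose.export import export_model
+
+        result = export_model(device="Samsung Galaxy S23")
+        if not isinstance(result, list):  # an ExportResult on success
+            print(result.compile_job, result.profile_job, result.inference_job)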
""" model_name = "hrnet_pose" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/hrnet_pose/perf.yaml b/qai_hub_models/models/hrnet_pose/perf.yaml index 6146f70d..3b35594d 100644 --- a/qai_hub_models/models/hrnet_pose/perf.yaml +++ b/qai_hub_models/models/hrnet_pose/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: HRNetPose performance_metrics: - torchscript_onnx_tflite: - inference_time: 2886.0 - throughput: 346.5003465003465 + inference_time: 2847.0 + throughput: 351.24692658939233 estimated_peak_memory_range: - min: 237568 - max: 373364816 + min: 16384 + max: 2387888 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 516 - job_id: jn5q87l45 + job_id: jgjvn78xg job_status: Passed torchscript_onnx_qnn: - inference_time: 2964.0 - throughput: 337.38191632928476 + inference_time: 2906.0 + throughput: 344.1156228492774 estimated_peak_memory_range: - min: 606208 - max: 16821840 + min: 16384 + max: 14831168 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 747 - job_id: jlpe9r71g + job_id: jp14zjk8p job_status: Passed torchscript_onnx: - inference_time: 3062.0 - throughput: 326.5839320705421 + inference_time: 2910.0 + throughput: 343.64261168384877 estimated_peak_memory_range: - min: 12288 - max: 59999824 + min: 20480 + max: 710085448 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 749 - job_id: jnp10do85 + job_id: jp0z0j895 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:32:14Z' + timestamp: '2024-10-15T00:30:45Z' - torchscript_onnx_tflite: - inference_time: 2308.0 - throughput: 433.27556325823224 + inference_time: 2292.0 + throughput: 436.3001745200698 estimated_peak_memory_range: - min: 20480 - max: 122914928 + min: 16384 + max: 126576592 primary_compute_unit: NPU precision: fp16 
layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 516 - job_id: j1gln0y8p + job_id: jpedmzn15 job_status: Passed torchscript_onnx_qnn: - inference_time: 2368.0 - throughput: 422.2972972972973 + inference_time: 2517.0 + throughput: 397.29837107667856 estimated_peak_memory_range: min: 606208 - max: 34133200 + max: 37904320 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 747 - job_id: jygzexlkg + job_id: jgdx13yrp job_status: Passed torchscript_onnx: - inference_time: 2463.0 - throughput: 406.00893219650834 + inference_time: 2455.0 + throughput: 407.33197556008145 estimated_peak_memory_range: min: 0 - max: 150875360 + max: 155262800 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 749 - job_id: jvgdwr6r5 + job_id: jgkex4wwg job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:32:15Z' + timestamp: '2024-10-15T00:30:46Z' - torchscript_onnx_tflite: - inference_time: 2816.0 - throughput: 355.1136363636364 + inference_time: 2838.0 + throughput: 352.36081747709653 estimated_peak_memory_range: - min: 20480 - max: 2897008 + min: 0 + max: 2369552 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 516 - job_id: jw5663805 + job_id: jgz3dm0k5 job_status: Passed torchscript_onnx_qnn: - inference_time: 2715.0 - throughput: 368.3241252302026 + inference_time: 2709.0 + throughput: 369.139904023625 estimated_peak_memory_range: - min: 614400 - max: 1971304 + min: 622592 + max: 1763760 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 747 - job_id: jmg9v3ol5 + job_id: jp4lr1685 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:32:08Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:30:37Z' - torchscript_onnx_tflite: - inference_time: 3760.0 - throughput: 265.9574468085106 + inference_time: 2835.0 + throughput: 352.7336860670194 estimated_peak_memory_range: min: 16384 - max: 113222160 + max: 2326424 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 516 - job_id: j1p3k4zl5 + job_id: jgdx13yep job_status: Passed torchscript_onnx_qnn: - inference_time: 3866.0 - throughput: 258.6652871184687 + inference_time: 2717.0 + throughput: 368.052999631947 estimated_peak_memory_range: - min: 356352 - max: 25341744 + min: 663552 + max: 2003840 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 747 - job_id: jmg9v3ow5 + job_id: jgn6vndk5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:32:13Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:30:40Z' - torchscript_onnx_tflite: - inference_time: 2839.0 - throughput: 352.23670306445933 + inference_time: 2815.0 + throughput: 355.23978685612786 estimated_peak_memory_range: min: 16384 - max: 
3214984 + max: 2092832 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 516 - job_id: jwgoy1lx5 + job_id: jp14zjk2p job_status: Passed torchscript_onnx_qnn: - inference_time: 2782.0 - throughput: 359.45363048166786 + inference_time: 2758.0 + throughput: 362.58158085569255 estimated_peak_memory_range: - min: 618496 - max: 1824488 + min: 622592 + max: 1809960 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 747 - job_id: jnp10do25 + job_id: j5mnxm1dp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:32:10Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:30:39Z' - torchscript_onnx_tflite: - inference_time: 2821.0 - throughput: 354.4842254519674 + inference_time: 2843.0 + throughput: 351.74111853675697 estimated_peak_memory_range: - min: 32768 - max: 2156360 + min: 24576 + max: 2200184 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 516 - job_id: j1pv31lj5 + job_id: jg9lnm7lg job_status: Passed torchscript_onnx_qnn: - inference_time: 2733.0 - throughput: 365.89828027808267 + inference_time: 2748.0 + throughput: 363.901018922853 estimated_peak_memory_range: min: 622592 - max: 2259984 + max: 1897176 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 747 - job_id: jvgdwr6e5 + job_id: jpxko4835 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:32:11Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:30:38Z' - torchscript_onnx_tflite: - inference_time: 2834.0 - throughput: 352.85815102328866 + inference_time: 3758.0 + throughput: 266.0989888238425 estimated_peak_memory_range: - min: 28672 - max: 2331352 + min: 20480 + max: 112712896 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 516 - job_id: j7gjx0rxp + job_id: j5we67065 job_status: Passed torchscript_onnx_qnn: - inference_time: 2805.0 - throughput: 356.50623885918003 + inference_time: 3885.0 + throughput: 257.4002574002574 estimated_peak_memory_range: - min: 618496 - max: 2063520 + min: 606208 + max: 28636192 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 747 - job_id: jz5wody3p + job_id: jp2kywqrp job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:30:43Z' + - torchscript_onnx_tflite: + inference_time: 1970.0 + throughput: 507.61421319796955 + estimated_peak_memory_range: + min: 12288 + max: 61964880 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 516 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 516 + job_id: jg9lnm7wg + job_status: Passed + torchscript_onnx_qnn: + inference_time: 2035.0 + throughput: 491.4004914004914 + estimated_peak_memory_range: + min: 602112 + max: 35413504 + primary_compute_unit: NPU + precision: 
fp16 + layer_info: + layers_on_npu: 747 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 747 + job_id: jpy13xk8p + job_status: Passed + torchscript_onnx: + inference_time: 1866.0 + throughput: 535.9056806002144 + estimated_peak_memory_range: + min: 0 + max: 75051936 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 749 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 749 + job_id: j56y4796p + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:32:12Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:30:49Z' - torchscript_onnx_qnn: - inference_time: 2959.0 - throughput: 337.95201081446436 + inference_time: 2978.0 + throughput: 335.795836131632 estimated_peak_memory_range: min: 589824 max: 589824 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 747 - job_id: jz5wody6p + job_id: j57yr41v5 job_status: Passed torchscript_onnx: - inference_time: 2986.0 - throughput: 334.8961821835231 + inference_time: 2972.0 + throughput: 336.47375504710635 estimated_peak_memory_range: - min: 58335232 - max: 58335232 + min: 59396096 + max: 59396096 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 749 - job_id: jz57zjovp + job_id: j5q6qyxnp job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:32:16Z' + timestamp: '2024-10-15T00:30:47Z' diff --git a/qai_hub_models/models/hrnet_pose_quantized/README.md b/qai_hub_models/models/hrnet_pose_quantized/README.md index 0e936653..03707c36 100644 --- a/qai_hub_models/models/hrnet_pose_quantized/README.md +++ b/qai_hub_models/models/hrnet_pose_quantized/README.md @@ -6,7 +6,7 @@ HRNet performs pose estimation in high-resolution representations. This is based on the implementation of HRNetPoseQuantized found -[here](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/hrnet_posenet). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/hrnet_pose_quantized). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.hrnet_pose_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of HRNetPoseQuantized can be found +* The license for the original implementation of HRNetPoseQuantized can be found [here](https://github.com/quic/aimet-model-zoo/blob/develop/LICENSE.pdf). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Deep High-Resolution Representation Learning for Human Pose Estimation](https://arxiv.org/abs/1902.09212) * [Source Model Implementation](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/hrnet_posenet) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/hrnet_pose_quantized/export.py b/qai_hub_models/models/hrnet_pose_quantized/export.py index fec68968..6872d705 100644 --- a/qai_hub_models/models/hrnet_pose_quantized/export.py +++ b/qai_hub_models/models/hrnet_pose_quantized/export.py @@ -10,17 +10,17 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.hrnet_pose_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.printing import ( print_inference_metrics, @@ -45,20 +45,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -80,10 +78,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
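+
+    Example (illustrative; the skip flags are the "input options" referenced
+    above, and the device name is an assumption):
+
+        result = export_model(
+            device="Samsung Galaxy S23",
+            skip_profiling=True,
+            skip_inferencing=True,
+        )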
""" model_name = "hrnet_pose_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,7 +107,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( target_runtime, output_path, input_spec ) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -135,7 +133,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +148,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +169,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -198,7 +196,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/hrnet_pose_quantized/perf.yaml b/qai_hub_models/models/hrnet_pose_quantized/perf.yaml index 5d09c38d..6c6e5cf2 100644 --- a/qai_hub_models/models/hrnet_pose_quantized/perf.yaml +++ b/qai_hub_models/models/hrnet_pose_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,41 +20,38 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: HRNetPoseQuantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 966.0 - throughput: 1035.1966873706003 + inference_time: 956.0 + throughput: 1046.0251046025105 estimated_peak_memory_range: - min: 12288 - max: 2325608 + min: 16384 + max: 2210536 primary_compute_unit: NPU precision: int8 layer_info: @@ -61,22 +59,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 518 - job_id: jn5q87y45 + job_id: j5q6qy14p job_status: Passed torchscript_onnx_qnn: - inference_time: 1250.0 - throughput: 800.0 + inference_time: 1251.0 + throughput: 799.3605115907275 estimated_peak_memory_range: - min: 16384 - max: 22198416 + min: 167936 + max: 8571056 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 487 + layers_on_npu: 748 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 487 - job_id: jz5wodl6p + total_layers: 748 + job_id: jp14zjm2p job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -85,13 +83,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:31:16Z' + timestamp: '2024-10-15T00:29:37Z' - torchscript_onnx_tflite: - inference_time: 696.0 - throughput: 1436.7816091954023 + inference_time: 792.0 + throughput: 1262.6262626262626 estimated_peak_memory_range: - min: 65536 - max: 109951616 + min: 16384 + max: 113943488 primary_compute_unit: NPU precision: int8 layer_info: @@ -99,22 +97,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 518 - job_id: j1gln0x8p + job_id: jglvmx885 job_status: Passed torchscript_onnx_qnn: - inference_time: 920.0 - throughput: 1086.9565217391305 + inference_time: 1051.0 + throughput: 
951.4747859181732 estimated_peak_memory_range: - min: 172032 - max: 30865488 + min: 0 + max: 34824448 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 487 + layers_on_npu: 748 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 487 - job_id: jmg9v3zl5 + total_layers: 748 + job_id: jgdx13mep job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -123,13 +121,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:31:17Z' + timestamp: '2024-10-15T00:29:38Z' - torchscript_onnx_tflite: - inference_time: 943.0 - throughput: 1060.4453870625662 + inference_time: 3825.0 + throughput: 261.437908496732 estimated_peak_memory_range: - min: 16384 - max: 2080912 + min: 12288 + max: 72031904 primary_compute_unit: NPU precision: int8 layer_info: @@ -137,37 +135,60 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 518 - job_id: jw5663705 + job_id: jgz3dmyk5 job_status: Passed torchscript_onnx_qnn: - inference_time: 1190.0 - throughput: 840.3361344537815 + inference_time: 5523.0 + throughput: 181.0610175629187 estimated_peak_memory_range: - min: 172032 - max: 1393232 + min: 225280 + max: 8107920 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 487 + layers_on_npu: 748 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 487 - job_id: jvgdwrde5 + total_layers: 748 + job_id: jpy13xy7p job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS6490 Proxy + timestamp: '2024-10-15T00:29:47Z' + - torchscript_onnx_tflite: + inference_time: 17117.0 + throughput: 58.421452357305604 + estimated_peak_memory_range: + min: 98304 + max: 4442464 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 518 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 518 + job_id: j5we67r65 + job_status: Passed + reference_device_info: + name: RB5 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:31:19Z' + chipset: QCS8250 Proxy + timestamp: '2024-10-15T00:29:34Z' - torchscript_onnx_tflite: - inference_time: 1179.0 - throughput: 848.1764206955047 + inference_time: 953.0 + throughput: 1049.3179433368311 estimated_peak_memory_range: - min: 45056 - max: 113234096 + min: 12288 + max: 2785544 primary_compute_unit: NPU precision: int8 layer_info: @@ -175,37 +196,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 518 - job_id: j1p3k49l5 + job_id: j56y47m0p job_status: Passed torchscript_onnx_qnn: - inference_time: 1468.0 - throughput: 681.1989100817439 + inference_time: 1199.0 + throughput: 834.0283569641368 estimated_peak_memory_range: - min: 163840 - max: 37739744 + min: 176128 + max: 1413960 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 487 + layers_on_npu: 748 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 487 - job_id: jo5mrw0wg + total_layers: 748 + job_id: jp4lr12v5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:31:24Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:29:40Z' - torchscript_onnx_tflite: - inference_time: 964.0 - throughput: 1037.344398340249 + inference_time: 952.0 + throughput: 1050.420168067227 estimated_peak_memory_range: - min: 36864 - max: 2120904 
+ min: 12288 + max: 2260088 primary_compute_unit: NPU precision: int8 layer_info: @@ -213,37 +234,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 518 - job_id: jwgoy1rx5 + job_id: jgjvn7yxg job_status: Passed torchscript_onnx_qnn: - inference_time: 1205.0 - throughput: 829.8755186721992 + inference_time: 1211.0 + throughput: 825.7638315441784 estimated_peak_memory_range: - min: 172032 - max: 1515664 + min: 180224 + max: 1924760 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 487 + layers_on_npu: 748 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 487 - job_id: jz57zjelp + total_layers: 748 + job_id: jgn6vnwr5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:31:20Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:29:43Z' - torchscript_onnx_tflite: - inference_time: 968.0 - throughput: 1033.0578512396694 + inference_time: 962.0 + throughput: 1039.5010395010395 estimated_peak_memory_range: - min: 24576 - max: 1932768 + min: 12288 + max: 1925872 primary_compute_unit: NPU precision: int8 layer_info: @@ -251,22 +272,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 518 - job_id: j1pv31dj5 + job_id: jpv6kdmj5 job_status: Passed torchscript_onnx_qnn: - inference_time: 1206.0 - throughput: 829.1873963515754 + inference_time: 1221.0 + throughput: 819.000819000819 estimated_peak_memory_range: - min: 184320 - max: 1395928 + min: 180224 + max: 1373992 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 487 + layers_on_npu: 748 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 487 - job_id: jqp4qxyvg + total_layers: 748 + job_id: j5mnxmlwp job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -274,14 +295,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:31:22Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:29:42Z' - torchscript_onnx_tflite: - inference_time: 962.0 - throughput: 1039.5010395010395 + inference_time: 960.0 + throughput: 1041.6666666666667 estimated_peak_memory_range: - min: 16384 - max: 2151952 + min: 12288 + max: 18731344 primary_compute_unit: NPU precision: int8 layer_info: @@ -289,37 +310,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 518 - job_id: j7gjx07xp + job_id: jgo26rwxp job_status: Passed torchscript_onnx_qnn: - inference_time: 1195.0 - throughput: 836.8200836820083 + inference_time: 1212.0 + throughput: 825.0825082508251 estimated_peak_memory_range: - min: 180224 - max: 1688952 + min: 172032 + max: 1445440 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 487 + layers_on_npu: 748 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 487 - job_id: j0pxv7l1g + total_layers: 748 + job_id: jpxko4z15 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:31:23Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:29:41Z' - torchscript_onnx_tflite: - inference_time: 3860.0 - throughput: 259.0673575129534 + inference_time: 1172.0 + throughput: 853.2423208191126 estimated_peak_memory_range: - min: 12288 - max: 71371696 + min: 61440 + max: 114534080 primary_compute_unit: NPU precision: int8 layer_info: @@ -327,37 +348,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 
518 - job_id: jlpe9rz1g + job_id: jp3j097lg job_status: Passed torchscript_onnx_qnn: - inference_time: 5564.0 - throughput: 179.72681524083393 + inference_time: 1524.0 + throughput: 656.1679790026246 estimated_peak_memory_range: - min: 172032 - max: 8591888 + min: 163840 + max: 40149008 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 487 + layers_on_npu: 748 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 487 - job_id: jegn29zrg + total_layers: 748 + job_id: jp2kywz4p job_status: Passed reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T12:31:25Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:29:46Z' - torchscript_onnx_tflite: - inference_time: 17160.0 - throughput: 58.27505827505828 + inference_time: 663.0 + throughput: 1508.2956259426849 estimated_peak_memory_range: - min: 561152 - max: 3637416 + min: 8192 + max: 66830176 primary_compute_unit: NPU precision: int8 layer_info: @@ -365,30 +386,45 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 518 - job_id: jygzexmkg + job_id: jg9lnmqlg + job_status: Passed + torchscript_onnx_qnn: + inference_time: 746.0 + throughput: 1340.4825737265417 + estimated_peak_memory_range: + min: 159744 + max: 33374624 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 748 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 748 + job_id: jp0z0jx65 job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T12:31:15Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:29:48Z' - torchscript_onnx_qnn: - inference_time: 1326.0 - throughput: 754.1478129713424 + inference_time: 1343.0 + throughput: 744.6016381236038 estimated_peak_memory_range: - min: 327680 - max: 327680 + min: 446464 + max: 446464 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 487 + layers_on_npu: 748 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 487 - job_id: jnp10dn25 + total_layers: 748 + job_id: j57yr48l5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -397,4 +433,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:31:18Z' + timestamp: '2024-10-15T00:29:39Z' diff --git a/qai_hub_models/models/huggingface_wavlm_base_plus/README.md b/qai_hub_models/models/huggingface_wavlm_base_plus/README.md index 04bf447b..6670fbde 100644 --- a/qai_hub_models/models/huggingface_wavlm_base_plus/README.md +++ b/qai_hub_models/models/huggingface_wavlm_base_plus/README.md @@ -6,7 +6,7 @@ HuggingFaceWavLMBasePlus is a real time speech processing backbone based on Microsoft's WavLM model. This is based on the implementation of HuggingFace-WavLM-Base-Plus found -[here](https://huggingface.co/patrickvonplaten/wavlm-libri-clean-100h-base-plus/tree/main). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/huggingface_wavlm_base_plus). 
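The `inference_time` (microseconds) and `throughput` (inferences per second) fields that change together throughout these perf.yaml diffs are two views of the same measurement. A minimal sketch of that relationship, spot-checked against values from the hunks above (the helper name is ours, not part of the repo):

```python
def throughput_from_latency(inference_time_us: float) -> float:
    """Convert an on-device latency in microseconds to inferences per second."""
    return 1_000_000.0 / inference_time_us


# Values taken from the HRNetPose and HRNetPoseQuantized entries above.
assert abs(throughput_from_latency(2847.0) - 351.24692658939233) < 1e-9
assert abs(throughput_from_latency(956.0) - 1046.0251046025105) < 1e-9
```

So when a hunk lowers `inference_time`, the matching `throughput` line must rise by the corresponding factor; an update that touches one field without the other would be internally inconsistent.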
@@ -44,15 +44,19 @@ python -m qai_hub_models.models.huggingface_wavlm_base_plus.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of HuggingFace-WavLM-Base-Plus can be found +* The license for the original implementation of HuggingFace-WavLM-Base-Plus can be found [here](https://github.com/microsoft/unilm/blob/master/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing](https://arxiv.org/abs/2110.13900) * [Source Model Implementation](https://huggingface.co/patrickvonplaten/wavlm-libri-clean-100h-base-plus/tree/main) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/huggingface_wavlm_base_plus/export.py b/qai_hub_models/models/huggingface_wavlm_base_plus/export.py index 6975b7b5..68b630c1 100644 --- a/qai_hub_models/models/huggingface_wavlm_base_plus/export.py +++ b/qai_hub_models/models/huggingface_wavlm_base_plus/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.huggingface_wavlm_base_plus import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -46,20 +46,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. 
+ Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -81,10 +79,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "huggingface_wavlm_base_plus" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -110,7 +108,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -119,7 +117,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -133,7 +131,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -148,7 +146,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -169,13 +167,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -193,7 +191,11 @@ def export_model( inference_job, inference_result, torch_out, model.get_output_names() ) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/huggingface_wavlm_base_plus/perf.yaml b/qai_hub_models/models/huggingface_wavlm_base_plus/perf.yaml index 1f9e0723..5e2877c2 100644 --- a/qai_hub_models/models/huggingface_wavlm_base_plus/perf.yaml +++ b/qai_hub_models/models/huggingface_wavlm_base_plus/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: HuggingFace-WavLM-Base-Plus performance_metrics: - torchscript_onnx_tflite: - inference_time: 957880.0 - throughput: 1.0439721050653528 + inference_time: 817718.0 + throughput: 1.2229154794195554 estimated_peak_memory_range: - min: 66510848 - max: 69562208 + min: 65708032 + max: 68249568 primary_compute_unit: CPU precision: fp32 layer_info: @@ -58,7 +56,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 871 total_layers: 871 - job_id: jlpe9ry1g + job_id: jp3j09olg job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -67,13 +65,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:28:39Z' + timestamp: '2024-10-15T00:26:50Z' - torchscript_onnx_tflite: - inference_time: 633653.0 - throughput: 1.578150817561031 + inference_time: 631561.0 + throughput: 1.583378327667478 estimated_peak_memory_range: - min: 65622016 - max: 87489936 + min: 66519040 + max: 88097792 primary_compute_unit: CPU precision: fp32 layer_info: @@ -81,7 +79,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 871 total_layers: 871 - job_id: jygzexnkg + job_id: jgo26rdxp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -90,13 +88,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:28:40Z' + timestamp: '2024-10-15T00:26:52Z' - torchscript_onnx_tflite: - inference_time: 886695.0 - throughput: 1.1277835106772904 + inference_time: 849395.0 + throughput: 1.1773085549126143 estimated_peak_memory_range: - min: 65736704 - max: 67874536 + min: 60493824 + max: 631125408 primary_compute_unit: CPU precision: fp32 layer_info: @@ -104,7 +102,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 871 total_layers: 871 - job_id: 
jz5wod76p + job_id: jpv6kd2j5 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -112,14 +110,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:28:41Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:26:53Z' - torchscript_onnx_tflite: - inference_time: 1118324.0 - throughput: 0.8941952421659555 + inference_time: 850762.0 + throughput: 1.175416861589963 estimated_peak_memory_range: - min: 65699840 - max: 92825376 + min: 65540096 + max: 68435096 primary_compute_unit: CPU precision: fp32 layer_info: @@ -127,22 +125,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 871 total_layers: 871 - job_id: jmg9v3ml5 + job_id: j5we67z65 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:28:42Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:26:57Z' - torchscript_onnx_tflite: - inference_time: 749518.0 - throughput: 1.3341907732702885 + inference_time: 846763.0 + throughput: 1.1809679922245067 estimated_peak_memory_range: - min: 65617920 - max: 68774152 + min: 65814528 + max: 68615184 primary_compute_unit: CPU precision: fp32 layer_info: @@ -150,22 +148,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 871 total_layers: 871 - job_id: jnp10dj25 + job_id: jgz3dmzk5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:28:43Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:26:56Z' - torchscript_onnx_tflite: - inference_time: 802634.0 - throughput: 1.2458978812260633 + inference_time: 889799.0 + throughput: 1.1238493187787355 estimated_peak_memory_range: - min: 65568768 - max: 104111960 + min: 65630208 + max: 68682344 primary_compute_unit: CPU precision: fp32 layer_info: @@ -173,22 +171,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 871 total_layers: 871 - job_id: jvgdwr3e5 + job_id: jpedmz615 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:28:44Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:26:55Z' - torchscript_onnx_tflite: - inference_time: 897006.0 - throughput: 1.114819744795464 + inference_time: 1305119.0 + throughput: 0.766213655613013 estimated_peak_memory_range: - min: 65593344 - max: 104803536 + min: 65798144 + max: 93414896 primary_compute_unit: CPU precision: fp32 layer_info: @@ -196,13 +194,36 @@ models: layers_on_gpu: 0 layers_on_cpu: 871 total_layers: 871 - job_id: jz57zj4lp + job_id: jgjvn73xg job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:26:54Z' + - torchscript_onnx_tflite: + inference_time: 568662.0 + throughput: 1.7585138447795 + estimated_peak_memory_range: + min: 65544192 + max: 81468816 + primary_compute_unit: CPU + precision: fp32 + layer_info: + layers_on_npu: 0 + layers_on_gpu: 0 + layers_on_cpu: 871 + total_layers: 871 + job_id: jp14zj12p + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - 
timestamp: '2024-09-25T12:28:45Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:26:59Z' diff --git a/qai_hub_models/models/ibm_granite_3b_code_instruct/README.md b/qai_hub_models/models/ibm_granite_3b_code_instruct/README.md new file mode 100644 index 00000000..4bc8ef0d --- /dev/null +++ b/qai_hub_models/models/ibm_granite_3b_code_instruct/README.md @@ -0,0 +1,58 @@ +[![Qualcomm® AI Hub Models](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/quic-logo.jpg)](../../README.md) + + +# [IBM-Granite-3B-Code-Instruct: State-of-the-art large language model useful on a variety of code understanding and generation tasks](https://aihub.qualcomm.com/models/ibm_granite_3b_code_instruct) + +Granite-3B-Code-Instruct-2K is a 3B parameter model fine tuned from Granite-3B-Code-Base-2K on a combination of permissively licensed instruction data to enhance instruction following capabilities including logical reasoning and problem-solving skills. + +This is based on the implementation of IBM-Granite-3B-Code-Instruct found +[here]({source_repo}). This repository contains scripts for optimized on-device +export suitable to run on Qualcomm® devices. More details on model performance +accross various devices, can be found [here](https://aihub.qualcomm.com/models/ibm_granite_3b_code_instruct). + +[Sign up](https://myaccount.qualcomm.com/signup) to start using Qualcomm AI Hub and run these models on a hosted Qualcomm® device. + + + + + + +## License +* The license for the original implementation of IBM-Granite-3B-Code-Instruct can be found + [here](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md). +* The license for the compiled assets for on-device deployment can be found [here](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md) + + +## References +* [Granite Code Models: A Family of Open Foundation Models for Code Intelligence](https://arxiv.org/abs/2405.04324) +* [Source Model Implementation](https://huggingface.co/ibm-granite/granite-3b-code-instruct-2k) + + + +## Community +* Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. +* For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). 
+ + +## Usage and Limitations + +This model may not be used for or in connection with any of the following applications: + +- Accessing essential private and public services and benefits; +- Administration of justice and democratic processes; +- Assessing or recognizing the emotional state of a person; +- Biometric and biometrics-based systems, including categorization of persons based on sensitive characteristics; +- Education and vocational training; +- Employment and workers management; +- Exploitation of the vulnerabilities of persons resulting in harmful behavior; +- General purpose social scoring; +- Law enforcement; +- Management and operation of critical infrastructure; +- Migration, asylum and border control management; +- Predictive policing; +- Real-time remote biometric identification in public spaces; +- Recommender systems of social media platforms; +- Scraping of facial images (from the internet or otherwise); and/or +- Subliminal manipulation + + diff --git a/qai_hub_models/models/ibm_granite_3b_code_instruct/info.yaml b/qai_hub_models/models/ibm_granite_3b_code_instruct/info.yaml new file mode 100644 index 00000000..bb428616 --- /dev/null +++ b/qai_hub_models/models/ibm_granite_3b_code_instruct/info.yaml @@ -0,0 +1,58 @@ +name: IBM-Granite-3B-Code-Instruct +id: ibm_granite_3b_code_instruct +status: public +headline: State-of-the-art large language model useful on a variety of code + understanding and generation tasks. +domain: Generative AI +description: Granite-3B-Code-Instruct-2K is a 3B parameter model fine tuned from Granite-3B-Code-Base-2K + on a combination of permissively licensed instruction data to enhance instruction following + capabilities including logical reasoning and problem-solving skills. +use_case: Text Generation +tags: + - llm + - generative-ai +research_paper: https://arxiv.org/abs/2405.04324 +research_paper_title: "Granite Code Models: A Family of Open Foundation Models for Code Intelligence" +license: https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md +deploy_license: https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md +source_repo: https://huggingface.co/ibm-granite/granite-3b-code-instruct-2k +model_maker_id: ibm-watsonx +technical_details: + Input sequence length for Prompt Processor: 128 + Context length: 2048 + Number of parameters: 3.48B + Precision: fp16 + Num of key-value heads: 32 + Information about the model parts: Prompt Processor and Token Generator are split into 4 parts each. Each corresponding Prompt Processor and Token Generator part share weights. + Prompt processor model size: 7 GB + Prompt processor input (part1): 128 tokens + Prompt processor output (part1): Embeddings output + Prompt processor input (other parts): 128 tokens + KVCache initialized with pad token + Prompt processor output (other parts): 128 output tokens + KVCache for token generator + Token generator model size: 7 GB + Token generator input (part1): 1 token + Token generator output (part1): Embeddings output + Token generator input (other parts): 1 input token + past KVCache + Token generator output (other parts): 1 output token + KVCache for next iteration + Use: Initiate conversation with prompt-processor and then token generator for subsequent iterations. + Supported natural languages: English + Supported programming languages: The Granite code foundation models support 116 programming languages including Python, Javascript, Java, C++, Go, and Rust. 
+ Minimum QNN SDK version required: 2.27.7 + TTFT: Time To First Token is the time it takes to generate the first response token. This is expressed as a range because it varies based on the length of the prompt. The lower bound is for a short prompt (up to 128 tokens, i.e., one iteration of the prompt processor) and the upper bound is for a prompt using the full context length (2048 tokens). + Response Rate: Rate of response generation after the first response token. +applicable_scenarios: + - Coding + - Coding assist +related_models: [] +form_factors: + - Phone + - Tablet +has_static_banner: true +has_animated_banner: false +dataset: [] +model_type_llm: true +restrict_model_sharing: true +license_type: apache-2.0 +deploy_license_type: apache-2.0 +llm_details: + call_to_action: 'contact_for_download' diff --git a/qai_hub_models/models/ibm_granite_3b_code_instruct/perf.yaml b/qai_hub_models/models/ibm_granite_3b_code_instruct/perf.yaml new file mode 100644 index 00000000..338d5d66 --- /dev/null +++ b/qai_hub_models/models/ibm_granite_3b_code_instruct/perf.yaml @@ -0,0 +1,29 @@ +aggregated: + supported_oses: + - Android + supported_devices: + - Samsung Galaxy S24 + - Samsung Galaxy S24 Ultra + - Samsung Galaxy S24+ + - Snapdragon 8 Gen 3 QRD + - Snapdragon 8 Elite QRD + supported_chipsets: + - Snapdragon® 8 Gen 3 + - Snapdragon® 8 Elite +models: + name: 'IBM-Granite-3B-Code' + performance_metrics: + - torchscript_onnx_qnn: + llm_metrics: + time_to_first_token_range: + min: 326200 + max: 5219200 + tokens_per_second: 5.47 + reference_device_info: + name: Samsung Galaxy S24 + os: '14' + form_factor: Phone + os_name: Android + manufacturer: Samsung + chipset: Snapdragon® 8 Gen 3 + timestamp: '2024-10-17T23:51:08Z' diff --git a/qai_hub_models/models/inception_v3/README.md b/qai_hub_models/models/inception_v3/README.md index 81e30641..deeeab5e 100644 --- a/qai_hub_models/models/inception_v3/README.md +++ b/qai_hub_models/models/inception_v3/README.md @@ -6,7 +6,7 @@ InceptionNetV3 is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of Inception-v3 found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/inception.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/inception_v3). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.inception_v3.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Inception-v3 can be found +* The license for the original implementation of Inception-v3 can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Rethinking the Inception Architecture for Computer Vision](http://arxiv.org/abs/1512.00567) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/inception.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/inception_v3/export.py b/qai_hub_models/models/inception_v3/export.py index 53449848..d16a859c 100644 --- a/qai_hub_models/models/inception_v3/export.py +++ b/qai_hub_models/models/inception_v3/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.inception_v3 import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
""" model_name = "inception_v3" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/inception_v3/perf.yaml b/qai_hub_models/models/inception_v3/perf.yaml index e01582b0..424426b4 100644 --- a/qai_hub_models/models/inception_v3/perf.yaml +++ b/qai_hub_models/models/inception_v3/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: Inception-v3 performance_metrics: - torchscript_onnx_tflite: - inference_time: 1329.0 - throughput: 752.4454477050414 + inference_time: 1332.0 + throughput: 750.7507507507507 estimated_peak_memory_range: - min: 16384 - max: 2231320 + min: 20480 + max: 2400960 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 129 - job_id: jz5wod46p + job_id: jgkex422g job_status: Passed torchscript_onnx_qnn: - inference_time: 1416.0 - throughput: 706.2146892655368 + inference_time: 1400.0 + throughput: 714.2857142857143 estimated_peak_memory_range: min: 16384 - max: 149319888 + max: 148127296 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 219 - job_id: jo5mrw8wg + job_id: jgz3dmlk5 job_status: Passed torchscript_onnx: - inference_time: 1737.0 - throughput: 575.7052389176741 + inference_time: 1751.0 + throughput: 571.1022272986864 estimated_peak_memory_range: - min: 49152 - max: 51589192 + min: 16384 + max: 51841872 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 221 - job_id: jn5q87v45 + job_id: jprv30x9g job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:28:05Z' + timestamp: '2024-10-15T00:26:16Z' - torchscript_onnx_tflite: inference_time: 1148.0 throughput: 871.0801393728223 estimated_peak_memory_range: min: 16384 - max: 59265056 + max: 60836976 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 129 - 
job_id: jmg9v3dl5 + job_id: j5q6qyl4p job_status: Passed torchscript_onnx_qnn: - inference_time: 1203.0 - throughput: 831.255195344971 + inference_time: 1197.0 + throughput: 835.421888053467 estimated_peak_memory_range: - min: 634880 - max: 20864032 + min: 618496 + max: 17772976 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 219 - job_id: jegn29krg + job_id: j5we67y65 job_status: Passed torchscript_onnx: - inference_time: 1448.0 - throughput: 690.6077348066299 + inference_time: 1419.0 + throughput: 704.7216349541931 estimated_peak_memory_range: min: 0 - max: 58046256 + max: 60074848 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 221 - job_id: j1gln0l8p + job_id: jp2kywo4p job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:28:06Z' + timestamp: '2024-10-15T00:26:17Z' - torchscript_onnx_tflite: - inference_time: 1325.0 - throughput: 754.7169811320755 + inference_time: 1327.0 + throughput: 753.5795026375282 estimated_peak_memory_range: - min: 32768 - max: 6272856 + min: 163840 + max: 53492312 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 129 - job_id: jnp10d625 + job_id: jglvmxy85 job_status: Passed torchscript_onnx_qnn: - inference_time: 1423.0 - throughput: 702.7406886858749 + inference_time: 1468.0 + throughput: 681.1989100817439 estimated_peak_memory_range: min: 634880 - max: 1851200 + max: 1868008 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 219 - job_id: jep287e4p + job_id: jp14zjo2p job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:28:00Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:26:08Z' - torchscript_onnx_tflite: - inference_time: 2093.0 - throughput: 477.78308647873865 + inference_time: 1329.0 + throughput: 752.4454477050414 estimated_peak_memory_range: - min: 12288 - max: 60240480 + min: 53248 + max: 2359064 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 129 - job_id: jvgdwr2e5 + job_id: jpv6kdlj5 job_status: Passed torchscript_onnx_qnn: - inference_time: 2176.0 - throughput: 459.55882352941177 + inference_time: 1462.0 + throughput: 683.9945280437756 estimated_peak_memory_range: - min: 618496 - max: 22391728 + min: 634880 + max: 1924592 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 219 - job_id: jogkzl82g + job_id: jp4lr1ev5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:28:04Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:26:12Z' - torchscript_onnx_tflite: inference_time: 1328.0 throughput: 753.0120481927711 estimated_peak_memory_range: - min: 0 - max: 55130360 + min: 28672 + max: 1681584 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 
layers_on_cpu: 0 total_layers: 129 - job_id: jz57zj9lp + job_id: jgo26rlxp job_status: Passed torchscript_onnx_qnn: - inference_time: 1427.0 - throughput: 700.770847932726 + inference_time: 1454.0 + throughput: 687.757909215956 estimated_peak_memory_range: - min: 634880 - max: 1925000 + min: 626688 + max: 2346336 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 219 - job_id: jqpye4m7g + job_id: j57yr4ol5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:28:01Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:26:11Z' - torchscript_onnx_tflite: inference_time: 1331.0 throughput: 751.3148009015778 estimated_peak_memory_range: - min: 24576 - max: 1709592 + min: 16384 + max: 5672656 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 129 - job_id: jqp4qx3vg + job_id: jp3j09zlg job_status: Passed torchscript_onnx_qnn: - inference_time: 1427.0 - throughput: 700.770847932726 + inference_time: 1476.0 + throughput: 677.5067750677507 estimated_peak_memory_range: - min: 634880 - max: 1871288 + min: 651264 + max: 1957064 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 219 - job_id: j2p0y166g + job_id: jgdx136ep job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:28:02Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:26:09Z' - torchscript_onnx_tflite: - inference_time: 1328.0 - throughput: 753.0120481927711 + inference_time: 2100.0 + throughput: 476.1904761904762 estimated_peak_memory_range: - min: 28672 - max: 1909136 + min: 180224 + max: 60784512 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 129 - job_id: j0pxv7x1g + job_id: j56y4780p job_status: Passed torchscript_onnx_qnn: - inference_time: 1420.0 - throughput: 704.2253521126761 + inference_time: 2187.0 + throughput: 457.2473708276177 estimated_peak_memory_range: - min: 626688 - max: 1963880 + min: 0 + max: 22346048 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 219 - job_id: j1p8o31xg + job_id: j5mnxm9wp job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:28:03Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:26:14Z' + - torchscript_onnx_tflite: + inference_time: 948.0 + throughput: 1054.8523206751054 + estimated_peak_memory_range: + min: 12288 + max: 23427968 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 129 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 129 + job_id: jpedmz715 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 998.0 + throughput: 1002.0040080160321 + estimated_peak_memory_range: + min: 618496 + max: 16156208 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 219 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 219 + job_id: jgn6vn1r5 + 
job_status: Passed + torchscript_onnx: + inference_time: 1251.0 + throughput: 799.3605115907275 + estimated_peak_memory_range: + min: 0 + max: 25469632 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 221 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 221 + job_id: jp8qyxjxp + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:26:20Z' - torchscript_onnx_qnn: - inference_time: 1476.0 - throughput: 677.5067750677507 + inference_time: 1482.0 + throughput: 674.7638326585695 estimated_peak_memory_range: min: 602112 max: 602112 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 219 - job_id: joprk4w95 + job_id: jg9lnmolg job_status: Passed torchscript_onnx: - inference_time: 1682.0 - throughput: 594.5303210463734 + inference_time: 1687.0 + throughput: 592.7682276229995 estimated_peak_memory_range: - min: 49823744 - max: 49823744 + min: 48934912 + max: 48934912 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 221 - job_id: jw5663w05 + job_id: jpy13x87p job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:28:07Z' + timestamp: '2024-10-15T00:26:18Z' diff --git a/qai_hub_models/models/inception_v3_quantized/README.md b/qai_hub_models/models/inception_v3_quantized/README.md index 87f38606..84df8a1b 100644 --- a/qai_hub_models/models/inception_v3_quantized/README.md +++ b/qai_hub_models/models/inception_v3_quantized/README.md @@ -6,7 +6,7 @@ InceptionNetV3 is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This model is post-training quantized to int8 using samples from Google's open images dataset. This is based on the implementation of Inception-v3-Quantized found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/inception.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/inception_v3_quantized). @@ -17,11 +17,6 @@ accross various devices, can be found [here](https://aihub.qualcomm.com/models/i ## Example & Usage -Install the package via pip: -```bash -pip install "qai_hub_models[inception_v3_quantized]" -``` - Once installed, run the following simple CLI demo: @@ -44,15 +39,19 @@ python -m qai_hub_models.models.inception_v3_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Inception-v3-Quantized can be found +* The license for the original implementation of Inception-v3-Quantized can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Rethinking the Inception Architecture for Computer Vision](http://arxiv.org/abs/1512.00567) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/inception.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/inception_v3_quantized/evaluate.py b/qai_hub_models/models/inception_v3_quantized/evaluate.py index 47341fcd..6547e871 100644 --- a/qai_hub_models/models/inception_v3_quantized/evaluate.py +++ b/qai_hub_models/models/inception_v3_quantized/evaluate.py @@ -13,10 +13,8 @@ from qai_hub_models.models.inception_v3_quantized import MODEL_ID, Model from qai_hub_models.utils.args import evaluate_parser, get_hub_device, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.evaluate import evaluate_on_dataset from qai_hub_models.utils.inference import compile_model_from_args -from qai_hub_models.utils.quantization_aimet import AIMETQuantizableMixin SUPPORTED_DATASETS = ["imagenette", "imagenet"] @@ -27,6 +25,7 @@ def main(): model_cls=Model, default_split_size=2500, supported_datasets=SUPPORTED_DATASETS, + is_hub_quantized=True, ) args = parser.parse_args() args.device = None @@ -38,13 +37,7 @@ def main(): MODEL_ID, args, get_model_kwargs(Model, vars(args)) ) hub_device = get_hub_device(None, args.chipset) - - # Use Fp16 model for torch inference - for cls in Model.__mro__: - if issubclass(cls, BaseModel) and not issubclass(cls, AIMETQuantizableMixin): - torch_cls = cls - break - torch_model = torch_cls.from_pretrained(**get_model_kwargs(torch_cls, vars(args))) + torch_model = Model.from_pretrained(**get_model_kwargs(Model, vars(args))) evaluate_on_dataset( hub_model, torch_model, diff --git a/qai_hub_models/models/inception_v3_quantized/export.py b/qai_hub_models/models/inception_v3_quantized/export.py index f68f8a52..47650ff0 100644 --- a/qai_hub_models/models/inception_v3_quantized/export.py +++ b/qai_hub_models/models/inception_v3_quantized/export.py @@ -10,18 +10,20 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.inception_v3_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference +from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( print_inference_metrics, print_on_target_demo_cmd, @@ -31,11 +33,14 @@ can_access_qualcomm_ai_hub, export_without_hub_access, ) +from qai_hub_models.utils.quantization import get_calibration_data def export_model( device: str = "Samsung Galaxy S23 (Family)", 
chipset: Optional[str] = None, + num_calibration_samples: int = 100, + skip_compiling: bool = False, skip_profiling: bool = False, skip_inferencing: bool = False, skip_downloading: bool = False, @@ -45,20 +50,19 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + 3. Compiles the model to an asset that can be run on device + 4. Profiles the model performance on a real device + 5. Inferences the model on sample inputs + 6. Downloads the model asset to the local directory + 7. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 5 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -66,6 +70,9 @@ def export_model( Defaults to DEFAULT_DEVICE if not specified. chipset: If set, will choose a random device with this chipset. Overrides the `device` argument. + num_calibration_samples: The number of calibration data samples + to use for quantization. + skip_compiling: If set, skips compiling model to format that can run on device. skip_profiling: If set, skips profiling of compiled model on real devices. skip_inferencing: If set, skips computing on-device outputs from sample data. skip_downloading: If set, skips downloading of compiled model. @@ -80,10 +87,11 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: - * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). + A struct of: + * A CompileJob object containing metadata about the compile job submitted to hub (None if compiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). + * A QuantizeJob object containing metadata about the quantize job submitted to hub """ model_name = "inception_v3_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,33 +117,52 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. 
Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) ) # Trace the model - source_model = model.convert_to_hub_source_model( - target_runtime, output_path, input_spec + source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) + + print(f"Quantizing model {model_name} with {num_calibration_samples} samples.") + # 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + onnx_compile_job = hub.submit_compile_job( + model=source_model, + input_specs=input_spec, + device=hub_device, + name=model_name, + options="--target_runtime onnx", + ) + quantize_job = hub.submit_quantize_job( + model=onnx_compile_job.get_target_model(), + calibration_data=get_calibration_data( + input_spec, "imagenette", num_calibration_samples + ), + weights_dtype=model.get_weights_dtype(), + activations_dtype=model.get_activations_dtype(), + name=model_name, + options=model.get_quantize_options(), ) + if skip_compiling: + return ExportResult(quantize_job=quantize_job) - # 2. Compile the model to an on-device asset + # 3. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) print(f"Optimizing model {model_name} to run on-device") submitted_compile_job = hub.submit_compile_job( - model=source_model, + model=quantize_job.get_target_model(), input_specs=input_spec, device=hub_device, name=model_name, - calibration_data=model.get_calibration_data(target_runtime), options=model_compile_options, ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 4. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +177,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 5. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +198,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 6. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 7. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -202,12 +229,17 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + quantize_job=quantize_job, + ) def main(): warnings.filterwarnings("ignore") - parser = export_parser(model_cls=Model) + parser = export_parser(model_cls=Model, is_hub_quantized=True) args = parser.parse_args() export_model(**vars(args)) diff --git a/qai_hub_models/models/inception_v3_quantized/model.py b/qai_hub_models/models/inception_v3_quantized/model.py index c5eaac55..3d95c1aa 100644 --- a/qai_hub_models/models/inception_v3_quantized/model.py +++ b/qai_hub_models/models/inception_v3_quantized/model.py @@ -4,85 +4,11 @@ # --------------------------------------------------------------------- from __future__ import annotations -# isort: off -# This verifies aimet is installed, and this must be included first. -from qai_hub_models.utils.quantization_aimet import ( - AIMETQuantizableMixin, -) - -# isort: on - -import torch -from aimet_torch.cross_layer_equalization import equalize_model -from aimet_torch.model_preparer import prepare_model -from aimet_torch.quantsim import QuantizationSimModel, load_encodings_to_sim - from qai_hub_models.models.inception_v3.model import InceptionNetV3 -from qai_hub_models.utils.aimet.config_loader import get_default_aimet_config -from qai_hub_models.utils.asset_loaders import CachedWebModelAsset -from qai_hub_models.utils.quantization_aimet import ( - constrain_quantized_inputs_to_image_range, - tie_observers, -) +from qai_hub_models.utils.quantization import HubQuantizableMixin MODEL_ID = __name__.split(".")[-2] -MODEL_ASSET_VERSION = 6 -DEFAULT_ENCODINGS = "inception_v3_quantized_encodings.json" - - -class InceptionNetV3Quantizable( - AIMETQuantizableMixin, - InceptionNetV3, -): - """InceptionNetV3 with post train quantization support. - - Supports only 8 bit weights and activations, and only loads pre-quantized checkpoints. - Support for quantizing using your own weights & data will come at a later date.""" - - def __init__( - self, - sim_model: QuantizationSimModel, - ) -> None: - # Input is already normalized by sim_model. Disable it in the wrapper model. - InceptionNetV3.__init__(self, sim_model.model, normalize_input=False) - AIMETQuantizableMixin.__init__( - self, - sim_model, - ) - - @classmethod - def from_pretrained( - cls, - aimet_encodings: str | None = "DEFAULT", - ) -> "InceptionNetV3Quantizable": - """ - Parameters: - aimet_encodings: - if "DEFAULT": Loads the model with aimet encodings calibrated on imagenette. - elif None: Doesn't load any encodings. Used when computing encodings. - else: Interprets as a filepath and loads the encodings stored there. 
- """ - model = InceptionNetV3.from_pretrained() - input_shape = cls.get_input_spec()["image_tensor"][0] - - model = prepare_model(model) - equalize_model(model, input_shape) - sim = QuantizationSimModel( - model, - quant_scheme="tf_enhanced", - default_param_bw=8, - default_output_bw=8, - config_file=get_default_aimet_config(), - dummy_input=torch.rand(input_shape), - ) - tie_observers(sim) - constrain_quantized_inputs_to_image_range(sim) - if aimet_encodings: - if aimet_encodings == "DEFAULT": - aimet_encodings = CachedWebModelAsset.from_asset_store( - MODEL_ID, MODEL_ASSET_VERSION, DEFAULT_ENCODINGS - ).fetch() - load_encodings_to_sim(sim, aimet_encodings) - return cls(sim) +class InceptionNetV3Quantizable(HubQuantizableMixin, InceptionNetV3): + pass diff --git a/qai_hub_models/models/inception_v3_quantized/perf.yaml b/qai_hub_models/models/inception_v3_quantized/perf.yaml index 944f9dc9..6237e43b 100644 --- a/qai_hub_models/models/inception_v3_quantized/perf.yaml +++ b/qai_hub_models/models/inception_v3_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,82 +20,77 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8775P Proxy models: - name: Inception-v3-Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 595.0 - throughput: 1680.672268907563 + inference_time: 590.0 + throughput: 1694.915254237288 estimated_peak_memory_range: min: 12288 - max: 15492704 + max: 1448856 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 142 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jvgdwr8l5 + total_layers: 142 + job_id: jgn6090m5 job_status: Passed torchscript_onnx_qnn: - inference_time: 645.0 - throughput: 1550.3875968992247 + inference_time: 650.0 + throughput: 1538.4615384615386 estimated_peak_memory_range: - min: 16384 - max: 29053320 + min: 20480 + max: 229064024 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 125 + layers_on_npu: 219 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 125 - job_id: jegn297rg + total_layers: 219 + job_id: jgo2z1zdp job_status: Passed torchscript_onnx: - inference_time: 861.0 - throughput: 1161.4401858304298 + inference_time: 873.0 + throughput: 1145.475372279496 estimated_peak_memory_range: - min: 16384 - max: 31125256 + min: 12288 + max: 31093288 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 134 + layers_on_npu: 130 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 134 - job_id: jw5663d05 + total_layers: 130 + job_id: jpxk97n95 job_status: Passed reference_device_info: 
name: Samsung Galaxy S23 @@ -103,51 +99,51 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:27:25Z' + timestamp: '2024-10-17T17:29:35Z' - torchscript_onnx_tflite: - inference_time: 442.0 - throughput: 2262.443438914027 + inference_time: 445.0 + throughput: 2247.191011235955 estimated_peak_memory_range: - min: 20480 - max: 73483536 + min: 12288 + max: 73743552 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 142 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jz5wod16p + total_layers: 142 + job_id: jprv646eg job_status: Passed torchscript_onnx_qnn: - inference_time: 495.0 - throughput: 2020.20202020202 + inference_time: 490.0 + throughput: 2040.8163265306123 estimated_peak_memory_range: - min: 167936 - max: 18911600 + min: 172032 + max: 21424032 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 125 + layers_on_npu: 219 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 125 - job_id: joprk4n95 + total_layers: 219 + job_id: jpv6q1qm5 job_status: Passed torchscript_onnx: - inference_time: 627.0 - throughput: 1594.896331738437 + inference_time: 695.0 + throughput: 1438.8489208633093 estimated_peak_memory_range: min: 0 - max: 99466880 + max: 101539664 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 134 + layers_on_npu: 130 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 134 - job_id: j1p3k4wl5 + total_layers: 130 + job_id: j5mnewqqp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -156,150 +152,173 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:27:26Z' + timestamp: '2024-10-17T17:29:36Z' - torchscript_onnx_tflite: - inference_time: 588.0 - throughput: 1700.6802721088436 + inference_time: 2322.0 + throughput: 430.66322136089576 estimated_peak_memory_range: - min: 36864 - max: 9896376 + min: 16384 + max: 27591120 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 142 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jmg9v3xl5 + total_layers: 142 + job_id: jp2kx7xmp job_status: Passed torchscript_onnx_qnn: - inference_time: 655.0 - throughput: 1526.7175572519084 + inference_time: 2854.0 + throughput: 350.385423966363 estimated_peak_memory_range: - min: 184320 - max: 1426904 + min: 167936 + max: 7752784 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 125 + layers_on_npu: 219 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 125 - job_id: jqpye477g + total_layers: 219 + job_id: jgjvd0d8g job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS6490 Proxy + timestamp: '2024-10-17T17:29:18Z' + - torchscript_onnx_tflite: + inference_time: 7622.0 + throughput: 131.19916032537392 + estimated_peak_memory_range: + min: 16384 + max: 2592552 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 142 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 142 + job_id: jpy1z4z4p + job_status: Passed + reference_device_info: + name: RB5 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:27:19Z' + chipset: QCS8250 Proxy + timestamp: '2024-10-17T17:29:00Z' - torchscript_onnx_tflite: - inference_time: 704.0 - throughput: 1420.4545454545455 + inference_time: 587.0 + throughput: 
1703.5775127768313 estimated_peak_memory_range: min: 16384 - max: 74810784 + max: 1363992 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 142 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jnp10dv25 + total_layers: 142 + job_id: jp0z414e5 job_status: Passed torchscript_onnx_qnn: - inference_time: 761.0 - throughput: 1314.060446780552 + inference_time: 654.0 + throughput: 1529.051987767584 estimated_peak_memory_range: - min: 172032 - max: 20979040 + min: 176128 + max: 1503016 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 125 + layers_on_npu: 219 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 125 - job_id: jn5q87m45 + total_layers: 219 + job_id: jpedoro05 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:27:23Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-17T17:29:20Z' - torchscript_onnx_tflite: - inference_time: 588.0 - throughput: 1700.6802721088436 + inference_time: 592.0 + throughput: 1689.1891891891892 estimated_peak_memory_range: - min: 12288 - max: 19061664 + min: 28672 + max: 2042328 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 142 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jvgdwrze5 + total_layers: 142 + job_id: jp8q2328p job_status: Passed torchscript_onnx_qnn: - inference_time: 657.0 - throughput: 1522.0700152207 + inference_time: 649.0 + throughput: 1540.8320493066255 estimated_peak_memory_range: - min: 176128 - max: 1497000 + min: 196608 + max: 1392784 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 125 + layers_on_npu: 219 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 125 - job_id: j2p0y1v6g + total_layers: 219 + job_id: j5wewdwj5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:27:20Z' + chipset: SA8255P Proxy + timestamp: '2024-10-17T17:29:23Z' - torchscript_onnx_tflite: - inference_time: 591.0 - throughput: 1692.047377326565 + inference_time: 595.0 + throughput: 1680.672268907563 estimated_peak_memory_range: - min: 24576 - max: 28002680 + min: 40960 + max: 211398824 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 142 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jz57zj7lp + total_layers: 142 + job_id: jgkevlvog job_status: Passed torchscript_onnx_qnn: - inference_time: 659.0 - throughput: 1517.4506828528072 + inference_time: 647.0 + throughput: 1545.595054095827 estimated_peak_memory_range: - min: 180224 - max: 1259464 + min: 184320 + max: 1395136 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 125 + layers_on_npu: 219 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 125 - job_id: j1p8o34xg + total_layers: 219 + job_id: jg9l030vg job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -307,136 +326,128 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:27:21Z' + chipset: SA8775P Proxy + timestamp: '2024-10-17T17:29:26Z' - torchscript_onnx_tflite: - inference_time: 589.0 - throughput: 1697.792869269949 + inference_time: 700.0 + throughput: 1428.5714285714287 
estimated_peak_memory_range: min: 12288 - max: 208360352 + max: 74882336 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 142 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jqp4qx9vg + total_layers: 142 + job_id: j5q6070mp job_status: Passed torchscript_onnx_qnn: - inference_time: 655.0 - throughput: 1526.7175572519084 + inference_time: 770.0 + throughput: 1298.7012987012988 estimated_peak_memory_range: - min: 180224 - max: 1692520 + min: 167936 + max: 23876128 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 125 + layers_on_npu: 219 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 125 - job_id: jogkzl92g + total_layers: 219 + job_id: jp142d2lp job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:27:22Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-17T17:29:28Z' - torchscript_onnx_tflite: - inference_time: 2312.0 - throughput: 432.52595155709344 + inference_time: 421.0 + throughput: 2375.296912114014 estimated_peak_memory_range: min: 12288 - max: 28043472 + max: 24823056 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 142 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: j0pxv7d1g + total_layers: 142 + job_id: jglv404l5 job_status: Passed torchscript_onnx_qnn: - inference_time: 2944.0 - throughput: 339.67391304347825 + inference_time: 432.0 + throughput: 2314.814814814815 estimated_peak_memory_range: - min: 12288 - max: 8019344 + min: 0 + max: 16446992 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 125 + layers_on_npu: 219 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 125 - job_id: j1gln018p + total_layers: 219 + job_id: jgdxnrnlp job_status: Passed - reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot - os_name: Android - manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T12:27:24Z' - - torchscript_onnx_tflite: - inference_time: 7834.0 - throughput: 127.64871074802144 + torchscript_onnx: + inference_time: 647.0 + throughput: 1545.595054095827 estimated_peak_memory_range: - min: 12288 - max: 3393688 + min: 8192 + max: 34873856 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 130 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jo5mrwdwg + total_layers: 130 + job_id: jprv648eg job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T12:27:15Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-17T17:29:40Z' - torchscript_onnx_qnn: - inference_time: 697.0 - throughput: 1434.7202295552368 + inference_time: 721.0 + throughput: 1386.9625520110958 estimated_peak_memory_range: - min: 475136 - max: 475136 + min: 577536 + max: 577536 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 125 + layers_on_npu: 219 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 125 - job_id: jep287v4p + total_layers: 219 + job_id: jgz32x265 job_status: Passed torchscript_onnx: - inference_time: 793.0 - throughput: 1261.034047919294 + inference_time: 784.0 + throughput: 1275.5102040816328 estimated_peak_memory_range: - min: 28610560 - max: 28610560 + min: 28753920 + 
max: 28753920 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 134 + layers_on_npu: 130 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 134 - job_id: jwgoy14x5 + total_layers: 130 + job_id: jgn609lm5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -445,4 +456,4 @@ os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:27:27Z' + timestamp: '2024-10-17T17:29:38Z' diff --git a/qai_hub_models/models/inception_v3_quantized/requirements.txt b/qai_hub_models/models/inception_v3_quantized/requirements.txt deleted file mode 100644 index de5b80e8..00000000 --- a/qai_hub_models/models/inception_v3_quantized/requirements.txt +++ /dev/null @@ -1 +0,0 @@ -aimet-torch==1.32.1.post1; sys_platform == "linux" diff --git a/qai_hub_models/models/inception_v3_quantized/test.py b/qai_hub_models/models/inception_v3_quantized/test.py deleted file mode 100644 index 486a8cee..00000000 --- a/qai_hub_models/models/inception_v3_quantized/test.py +++ /dev/null @@ -1,29 +0,0 @@ -# --------------------------------------------------------------------- -# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. -# SPDX-License-Identifier: BSD-3-Clause -# --------------------------------------------------------------------- -from qai_hub_models.models._shared.imagenet_classifier.test_utils import ( - run_imagenet_classifier_test, -) -from qai_hub_models.models.inception_v3_quantized.demo import main as demo_main -from qai_hub_models.models.inception_v3_quantized.model import ( - MODEL_ASSET_VERSION, - MODEL_ID, - InceptionNetV3Quantizable, -) - - -def test_task(): - run_imagenet_classifier_test( - InceptionNetV3Quantizable.from_pretrained(), - MODEL_ID, - diff_tol=0.005, - rtol=0.02, - atol=0.2, - asset_version=MODEL_ASSET_VERSION, - ) - - -def test_demo(): - # Verify demo does not crash - demo_main(is_test=True) diff --git a/qai_hub_models/models/indus_1b_quantized/README.md b/qai_hub_models/models/indus_1b_quantized/README.md new file mode 100644 index 00000000..4950ce2e --- /dev/null +++ b/qai_hub_models/models/indus_1b_quantized/README.md @@ -0,0 +1,55 @@ +[![Qualcomm® AI Hub Models](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/quic-logo.jpg)](../../README.md) + + +# [IndusQ-1.1B: State-of-the-art large language model useful on a variety of language understanding and generation tasks](https://aihub.qualcomm.com/models/indus_1b_quantized) + +Indus is today a 1.2 billion parameter model that has been supervised fine-tuned for Hindi and its dialects. + +This is based on the implementation of IndusQ-1.1B found +[here]({source_repo}). This repository contains scripts for optimized on-device +export suitable to run on Qualcomm® devices. More details on model performance +across various devices can be found [here](https://aihub.qualcomm.com/models/indus_1b_quantized). + +[Sign up](https://myaccount.qualcomm.com/signup) to start using Qualcomm AI Hub and run these models on a hosted Qualcomm® device. + +## Deploying IndusQ-1.1B on-device + +Please follow the [LLM on-device deployment](https://github.com/quic/ai-hub-apps/tree/main/tutorials/llm_on_genie) tutorial.
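As background for the prompt-processor / token-generator split that the LLM model cards in this change describe, here is a minimal conceptual sketch of the two-stage on-device inference loop. It is illustrative only: the `prompt_processor` and `token_generator` callables and the pad-token value are hypothetical stand-ins, not APIs from this repository, and the 128-token chunk size comes from the "Input sequence length for Prompt Processor" entries in these model cards.

```python
from typing import Callable, List, Optional, Tuple

PP_SEQ_LEN = 128  # prompt-processor input length, per the model cards
PAD_TOKEN = 0     # hypothetical pad id; the real value is tokenizer-specific

# Each stage maps (tokens, kv_cache) -> (logits, kv_cache).
Stage = Callable[[List[int], Optional[object]], Tuple[List[float], object]]


def generate(prompt: List[int], prompt_processor: Stage,
             token_generator: Stage, max_new_tokens: int) -> List[int]:
    # Stage 1: prefill. The prompt processor consumes the prompt in
    # 128-token chunks, padding the final chunk and building the KV cache.
    kv_cache: Optional[object] = None
    logits: List[float] = []
    for start in range(0, len(prompt), PP_SEQ_LEN):
        chunk = prompt[start:start + PP_SEQ_LEN]
        chunk = chunk + [PAD_TOKEN] * (PP_SEQ_LEN - len(chunk))
        logits, kv_cache = prompt_processor(chunk, kv_cache)

    # Stage 2: decode. The token generator emits one token per call,
    # feeding the updated KV cache back in each iteration.
    out: List[int] = []
    next_tok = max(range(len(logits)), key=logits.__getitem__)
    for _ in range(max_new_tokens):
        out.append(next_tok)
        logits, kv_cache = token_generator([next_tok], kv_cache)
        next_tok = max(range(len(logits)), key=logits.__getitem__)
    return out
```

The TTFT ranges quoted in these cards follow directly from this structure: the lower bound corresponds to a single prompt-processor iteration, the upper bound to prefilling the full context length.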
+ + + + + +## References +* [Project Indus: A Foundational Model for Indian Languages](https://www.techmahindra.com/makers-lab/indus-project/) +* [Source Model Implementation](https://huggingface.co/nickmalhotra/ProjectIndus) + + + +## Community +* Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. +* For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). + + +## Usage and Limitations + +This model may not be used for or in connection with any of the following applications: + +- Accessing essential private and public services and benefits; +- Administration of justice and democratic processes; +- Assessing or recognizing the emotional state of a person; +- Biometric and biometrics-based systems, including categorization of persons based on sensitive characteristics; +- Education and vocational training; +- Employment and workers management; +- Exploitation of the vulnerabilities of persons resulting in harmful behavior; +- General purpose social scoring; +- Law enforcement; +- Management and operation of critical infrastructure; +- Migration, asylum and border control management; +- Predictive policing; +- Real-time remote biometric identification in public spaces; +- Recommender systems of social media platforms; +- Scraping of facial images (from the internet or otherwise); and/or +- Subliminal manipulation + + diff --git a/qai_hub_models/models/indus_1b_quantized/info.yaml b/qai_hub_models/models/indus_1b_quantized/info.yaml new file mode 100644 index 00000000..f4badf4c --- /dev/null +++ b/qai_hub_models/models/indus_1b_quantized/info.yaml @@ -0,0 +1,42 @@ +name: IndusQ-1.1B +id: indus_1b_quantized +status: public +headline: State-of-the-art large language model useful on a variety of language + understanding and generation tasks. +domain: Generative AI +description: Indus is today a 1.2 billion parameter model that has been supervised fine-tuned for Hindi and its dialects. +use_case: Text Generation +tags: + - llm + - generative-ai + - quantized +research_paper: https://www.techmahindra.com/makers-lab/indus-project/ +research_paper_title: "Project Indus: A Foundational Model for Indian Languages" +source_repo: https://huggingface.co/nickmalhotra/ProjectIndus +model_maker_id: tech-mahindra +technical_details: + Input sequence length for Prompt Processor: 128 + Max context length: 1024 + Number of parameters: 1B + Precision: w4a16 + w8a16 (a few layers) + Use: Initiate conversation with prompt-processor and then token generator for subsequent iterations. + Minimum QNN SDK version required: 2.27.7 + Supported languages: Hindi and English. + TTFT: Time To First Token is the time it takes to generate the first response token. This is expressed as a range because it varies based on the length of the prompt. The lower bound is for a short prompt (up to 128 tokens, i.e., one iteration of the prompt processor) and the upper bound is for a prompt using the full context length (1024 tokens). + Response Rate: Rate of response generation after the first response token.
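One note on reading the LLM figures in the accompanying perf.yaml files: the `time_to_first_token_range` values appear to be in microseconds. That unit is an assumption inferred from the magnitudes, not something these files document. Under that assumption, the IndusQ-1.1B numbers convert as follows:

```python
# Assumes time_to_first_token_range is reported in microseconds; that
# unit is inferred from the magnitudes, not documented in perf.yaml.
ttft_us = {"min": 28561, "max": 228489}  # IndusQ-1.1B, from perf.yaml below
tokens_per_second = 74.60

print(f"TTFT: {ttft_us['min'] / 1e6:.3f} s (short prompt) "
      f"to {ttft_us['max'] / 1e6:.3f} s (full 1024-token context)")
print(f"Steady-state decode: {1000 / tokens_per_second:.1f} ms per token")
```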
+applicable_scenarios: + - Dialogue + - Content Generation + - Customer Support +related_models: [] +form_factors: + - Phone + - Tablet +has_static_banner: true +has_animated_banner: false +dataset: [] +model_type_llm: true +license_type: 'other' +restrict_model_sharing: true +llm_details: + call_to_action: 'contact_for_purchase' diff --git a/qai_hub_models/models/jais_6p7b_chat_quantized/README.md b/qai_hub_models/models/jais_6p7b_chat_quantized/README.md new file mode 100644 index 00000000..fb207748 --- /dev/null +++ b/qai_hub_models/models/jais_6p7b_chat_quantized/README.md @@ -0,0 +1,55 @@ +[![Qualcomm® AI Hub Models](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/quic-logo.jpg)](../../README.md) + + +# [JAIS-6p7b-Chat: State-of-the-art large language model useful on a variety of language understanding and generation tasks](https://aihub.qualcomm.com/models/jais_6p7b_chat_quantized) + +JAIS 6.7B is a bilingual large language model (LLM) for both Arabic and English, developed by Inception, a G42 company, in partnership with MBZUAI and Cerebras. This is a 6.7 billion parameter LLM, trained on a dataset containing 141 billion Arabic tokens and 339 billion English/code tokens. The model is based on a transformer-based decoder-only (GPT-3) architecture and uses the SwiGLU non-linearity. It implements ALiBi position embeddings, enabling the model to extrapolate to long sequence lengths, providing improved context handling and model precision. The JAIS family of models is a comprehensive series of bilingual English-Arabic LLMs. These models are optimized to excel in Arabic while having strong English capabilities. + +This is based on the implementation of JAIS-6p7b-Chat found +[here]({source_repo}). This repository contains scripts for optimized on-device +export suitable to run on Qualcomm® devices. More details on model performance +across various devices can be found [here](https://aihub.qualcomm.com/models/jais_6p7b_chat_quantized). + +[Sign up](https://myaccount.qualcomm.com/signup) to start using Qualcomm AI Hub and run these models on a hosted Qualcomm® device. + +## Deploying JAIS-6p7b-Chat on-device + +Please follow the [LLM on-device deployment](https://github.com/quic/ai-hub-apps/tree/main/tutorials/llm_on_genie) tutorial. + + + + + +## References +* [Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models](https://arxiv.org/abs/2308.16149) +* [Source Model Implementation](https://huggingface.co/inceptionai/jais-family-6p7b) + + + +## Community +* Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI.
+* For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). + + +## Usage and Limitations + +This model may not be used for or in connection with any of the following applications: + +- Accessing essential private and public services and benefits; +- Administration of justice and democratic processes; +- Assessing or recognizing the emotional state of a person; +- Biometric and biometrics-based systems, including categorization of persons based on sensitive characteristics; +- Education and vocational training; +- Employment and workers management; +- Exploitation of the vulnerabilities of persons resulting in harmful behavior; +- General purpose social scoring; +- Law enforcement; +- Management and operation of critical infrastructure; +- Migration, asylum and border control management; +- Predictive policing; +- Real-time remote biometric identification in public spaces; +- Recommender systems of social media platforms; +- Scraping of facial images (from the internet or otherwise); and/or +- Subliminal manipulation + + diff --git a/qai_hub_models/models/jais_6p7b_chat_quantized/info.yaml b/qai_hub_models/models/jais_6p7b_chat_quantized/info.yaml new file mode 100644 index 00000000..b971e078 --- /dev/null +++ b/qai_hub_models/models/jais_6p7b_chat_quantized/info.yaml @@ -0,0 +1,42 @@ +name: JAIS-6p7b-Chat +id: jais_6p7b_chat_quantized +status: public +headline: State-of-the-art large language model useful on a variety of language + understanding and generation tasks. +domain: Generative AI +description: JAIS 6.7B is a bilingual large language model (LLM) for both Arabic and English developed by Inception, a G42 company in partnership with MBZUAI and Cerebras. This is a 6.7 billion parameter LLM, trained on a dataset containing 141 billion Arabic tokens and 339 billion English/code tokens. The model is based on transformer-based decoder-only (GPT-3) architecture and uses SwiGLU non-linearity. It implements ALiBi position embeddings, enabling the model to extrapolate to long sequence lengths, providing improved context handling and model precision. The JAIS family of models is a comprehensive series of bilingual English-Arabic LLMs. These models are optimized to excel in Arabic while having strong English capabilities. +use_case: Text Generation +tags: + - llm + - generative-ai + - quantized +research_paper: https://arxiv.org/abs/2308.16149 +research_paper_title: "Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models" +source_repo: https://huggingface.co/inceptionai/jais-family-6p7b +model_maker_id: g42 +technical_details: + Input sequence length for Prompt Processor: 128 + Max context length: 2048 + Number of parameters: 6.7B + Precision: w4a16 + w8a16 (a few layers) + Use: Initiate conversation with prompt-processor and then token generator for subsequent iterations. + Supported languages: Arabic (MSA) and English. + Minimum QNN SDK version required: 2.27.7 + TTFT: Time To First Token is the time it takes to generate the first response token. This is expressed as a range because it varies based on the length of the prompt. The lower bound is for a short prompt (up to 128 tokens, i.e., one iteration of the prompt processor) and the upper bound is for a prompt using the full context length (2048 tokens). + Response Rate: Rate of response generation after the first response token. 
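The "Use" field above compresses the runtime flow into one line; as a hedged sketch of what "initiate conversation with prompt-processor and then token generator for subsequent iterations" means in practice (all names below are illustrative, not APIs from this repository):

```python
# Illustrative two-stage decode loop; `prompt_processor` and
# `token_generator` stand in for the two compiled model parts.
def generate(prompt_tokens, prompt_processor, token_generator, max_new_tokens=32):
    # One prompt-processor pass consumes the prompt (up to the input
    # sequence length, 128 tokens here) and produces the first output
    # token plus the KV cache.
    token, kv_cache = prompt_processor(prompt_tokens)
    output = [token]
    # The token generator then runs once per additional token, feeding
    # the KV cache back in on each iteration.
    for _ in range(max_new_tokens - 1):
        token, kv_cache = token_generator(token, kv_cache)
        output.append(token)
    return output
```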
+applicable_scenarios: + - Dialogue + - Content Generation + - Customer Support +related_models: [] +form_factors: + - Phone + - Tablet +has_static_banner: true +has_animated_banner: false +dataset: [] +model_type_llm: true +license_type: 'other' +restrict_model_sharing: true +llm_details: + call_to_action: 'contact_for_purchase' diff --git a/qai_hub_models/models/jais_6p7b_chat_quantized/perf.yaml b/qai_hub_models/models/jais_6p7b_chat_quantized/perf.yaml new file mode 100644 index 00000000..7f0a7e35 --- /dev/null +++ b/qai_hub_models/models/jais_6p7b_chat_quantized/perf.yaml @@ -0,0 +1,25 @@ +aggregated: + supported_oses: + - Android + supported_devices: + - Snapdragon 8 Elite QRD + supported_chipsets: + - Snapdragon® 8 Elite +models: + name: 'Jais-6p7b-Chat' + performance_metrics: + - torchscript_onnx_qnn: + llm_metrics: + time_to_first_token_range: + min: 238231 + max: 3811696 + tokens_per_second: 13.33 + evaluation_metrics: null + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-16T00:32:42.210701Z' diff --git a/qai_hub_models/models/lama_dilated/README.md b/qai_hub_models/models/lama_dilated/README.md index 685dab86..b84123a9 100644 --- a/qai_hub_models/models/lama_dilated/README.md +++ b/qai_hub_models/models/lama_dilated/README.md @@ -6,7 +6,7 @@ LaMa-Dilated is a machine learning model that allows to erase and in-paint part of given input image. This is based on the implementation of LaMa-Dilated found -[here](https://github.com/advimman/lama). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/lama_dilated). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.lama_dilated.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of LaMa-Dilated can be found +* The license for the original implementation of LaMa-Dilated can be found [here](https://github.com/advimman/lama/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Resolution-robust Large Mask Inpainting with Fourier Convolutions](https://arxiv.org/abs/2109.07161) * [Source Model Implementation](https://github.com/advimman/lama) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). 
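The export.py diffs that follow replace the old positional 3-tuple return value with an `ExportResult` struct. A minimal usage sketch, assuming only the field names visible in the diff (the device name is illustrative):

```python
# Sketch: consuming the new ExportResult return value of export_model.
from qai_hub_models.models.lama_dilated.export import export_model

result = export_model(device="Samsung Galaxy S23")  # device name illustrative
if not isinstance(result, list):  # export_model may also return List[str]
    print(result.compile_job)                # always populated
    if result.profile_job is not None:       # None when profiling is skipped
        print(result.profile_job)
    if result.inference_job is not None:     # None when inferencing is skipped
        print(result.inference_job)
```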
diff --git a/qai_hub_models/models/lama_dilated/export.py b/qai_hub_models/models/lama_dilated/export.py index 1ded990c..db6c385b 100644 --- a/qai_hub_models/models/lama_dilated/export.py +++ b/qai_hub_models/models/lama_dilated/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.lama_dilated import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "lama_dilated" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. 
Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/lama_dilated/perf.yaml b/qai_hub_models/models/lama_dilated/perf.yaml index 54fe8b30..9bd81190 100644 --- a/qai_hub_models/models/lama_dilated/perf.yaml +++ b/qai_hub_models/models/lama_dilated/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: LaMa-Dilated performance_metrics: - torchscript_onnx_tflite: - inference_time: 74926.0 - throughput: 13.346501881856765 + inference_time: 74881.0 + throughput: 13.354522509047689 estimated_peak_memory_range: - min: 3211264 - max: 137684112 + min: 3252224 + max: 138080320 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 343 - job_id: j0pxv739g + job_id: j5mnxm8qp job_status: Passed 
torchscript_onnx_qnn: - inference_time: 70590.0 - throughput: 14.166312508853945 + inference_time: 70655.0 + throughput: 14.153280022645248 estimated_peak_memory_range: - min: 12288 - max: 42684872 + min: 1724416 + max: 45349520 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,7 +71,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 332 - job_id: jw566vd75 + job_id: jglvmxll5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -82,13 +80,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T23:09:34Z' + timestamp: '2024-10-15T00:23:46Z' - torchscript_onnx_tflite: - inference_time: 56057.0 - throughput: 17.838985318515082 + inference_time: 55681.0 + throughput: 17.95944756739283 estimated_peak_memory_range: - min: 3219456 - max: 250625712 + min: 2813952 + max: 273462720 primary_compute_unit: NPU precision: fp16 layer_info: @@ -96,14 +94,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 343 - job_id: jo5mrwoqg + job_id: jgn6vnkm5 job_status: Passed torchscript_onnx_qnn: - inference_time: 52897.0 - throughput: 18.904663780554664 + inference_time: 52685.0 + throughput: 18.980734554427258 estimated_peak_memory_range: - min: 4268032 - max: 85104720 + min: 4272128 + max: 95572000 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,7 +109,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 332 - job_id: j1p3k8wz5 + job_id: j56y47w7p job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -120,13 +118,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T23:09:35Z' + timestamp: '2024-10-15T00:23:47Z' - torchscript_onnx_tflite: - inference_time: 74797.0 - throughput: 13.369520167921173 + inference_time: 74832.0 + throughput: 13.363267051528759 estimated_peak_memory_range: min: 3264512 - max: 138471064 + max: 137993800 primary_compute_unit: NPU precision: fp16 layer_info: @@ -134,14 +132,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 343 - job_id: jegn29omg + job_id: jprv30weg job_status: Passed torchscript_onnx_qnn: - inference_time: 66777.0 - throughput: 14.975216017491052 + inference_time: 70284.0 + throughput: 14.227989300552046 estimated_peak_memory_range: - min: 4349952 - max: 5644752 + min: 4382720 + max: 5517256 primary_compute_unit: NPU precision: fp16 layer_info: @@ -149,7 +147,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 332 - job_id: j1pv349m5 + job_id: jgo26r8dp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -157,14 +155,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T23:09:37Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:23:49Z' - torchscript_onnx_tflite: - inference_time: 105500.0 - throughput: 9.47867298578199 + inference_time: 74735.0 + throughput: 13.380611493945274 estimated_peak_memory_range: - min: 3391488 - max: 158055104 + min: 3284992 + max: 137824744 primary_compute_unit: NPU precision: fp16 layer_info: @@ -172,14 +170,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 343 - job_id: joprk4oe5 + job_id: jp8qyx18p job_status: Passed torchscript_onnx_qnn: - inference_time: 100523.0 - throughput: 9.947972105886215 + inference_time: 70519.0 + throughput: 14.18057544775167 estimated_peak_memory_range: - min: 4235264 - max: 45340112 + min: 4395008 + max: 5606856 primary_compute_unit: NPU precision: fp16 layer_info: @@ -187,22 +185,22 @@ models: 
layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 332 - job_id: jz5wox1jp + job_id: jpedmzy05 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T23:09:40Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:23:53Z' - torchscript_onnx_tflite: - inference_time: 74797.0 - throughput: 13.369520167921173 + inference_time: 74677.0 + throughput: 13.39100392356415 estimated_peak_memory_range: - min: 3166208 - max: 221163520 + min: 6860800 + max: 311096848 primary_compute_unit: NPU precision: fp16 layer_info: @@ -210,14 +208,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 343 - job_id: jep2874mp + job_id: jp0z0j6e5 job_status: Passed torchscript_onnx_qnn: - inference_time: 66962.0 - throughput: 14.933843075176966 + inference_time: 70451.0 + throughput: 14.194262679025138 estimated_peak_memory_range: - min: 4362240 - max: 5961936 + min: 3338240 + max: 6555048 primary_compute_unit: NPU precision: fp16 layer_info: @@ -225,22 +223,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 332 - job_id: j7gjx1w8p + job_id: jgjvn7q8g job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T23:09:37Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:23:52Z' - torchscript_onnx_tflite: - inference_time: 74435.0 - throughput: 13.434540202861557 + inference_time: 74665.0 + throughput: 13.393156097234312 estimated_peak_memory_range: - min: 3268608 - max: 138088632 + min: 3284992 + max: 137949104 primary_compute_unit: NPU precision: fp16 layer_info: @@ -248,14 +246,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 343 - job_id: jqpye4q4g + job_id: jpy13xm4p job_status: Passed torchscript_onnx_qnn: - inference_time: 67059.0 - throughput: 14.912241459013705 + inference_time: 70620.0 + throughput: 14.16029453412631 estimated_peak_memory_range: - min: 4390912 - max: 5581424 + min: 4435968 + max: 5702872 primary_compute_unit: NPU precision: fp16 layer_info: @@ -263,22 +261,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 332 - job_id: jlpe92l0g + job_id: jpv6kd7m5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T23:09:38Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:23:50Z' - torchscript_onnx_tflite: - inference_time: 75016.0 - throughput: 13.330489495574277 + inference_time: 105083.0 + throughput: 9.516287125415149 estimated_peak_memory_range: - min: 3289088 - max: 138110264 + min: 3403776 + max: 168433776 primary_compute_unit: NPU precision: fp16 layer_info: @@ -286,14 +284,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 343 - job_id: j2p0y1deg + job_id: jp2kywemp job_status: Passed torchscript_onnx_qnn: - inference_time: 66302.0 - throughput: 15.08250128201261 + inference_time: 100500.0 + throughput: 9.950248756218905 estimated_peak_memory_range: - min: 4395008 - max: 5975696 + min: 4235264 + max: 46386912 primary_compute_unit: NPU precision: fp16 layer_info: @@ -301,19 +299,57 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 332 - job_id: jygzew46g + job_id: j5we674j5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' 
- form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:23:55Z' + - torchscript_onnx_tflite: + inference_time: 49175.0 + throughput: 20.335536349771225 + estimated_peak_memory_range: + min: 2408448 + max: 169963632 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 343 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 343 + job_id: j5q6qyvmp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 45997.0 + throughput: 21.74054829662804 + estimated_peak_memory_range: + min: 1814528 + max: 92266624 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 332 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 332 + job_id: jg9lnmdvg + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T23:09:39Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:23:56Z' - torchscript_onnx_qnn: - inference_time: 69435.0 - throughput: 14.401958666378627 + inference_time: 71913.0 + throughput: 13.905691599571705 estimated_peak_memory_range: min: 4202496 max: 4202496 @@ -324,7 +360,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 332 - job_id: jwgoym4d5 + job_id: jp3j096zg job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -333,4 +369,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T23:09:36Z' + timestamp: '2024-10-15T00:23:48Z' diff --git a/qai_hub_models/models/litehrnet/README.md b/qai_hub_models/models/litehrnet/README.md index c6f890e7..8084a6c2 100644 --- a/qai_hub_models/models/litehrnet/README.md +++ b/qai_hub_models/models/litehrnet/README.md @@ -6,7 +6,7 @@ LiteHRNet is a machine learning model that detects human pose and returns a location and confidence for each of 17 joints. This is based on the implementation of LiteHRNet found -[here](https://github.com/HRNet/Lite-HRNet). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/litehrnet). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.litehrnet.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of LiteHRNet can be found +* The license for the original implementation of LiteHRNet can be found [here](https://github.com/HRNet/Lite-HRNet/blob/hrnet/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Lite-HRNet: A Lightweight High-Resolution Network](https://arxiv.org/abs/2104.06403) * [Source Model Implementation](https://github.com/HRNet/Lite-HRNet) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/litehrnet/export.py b/qai_hub_models/models/litehrnet/export.py index 2a9e800e..74d3e6a5 100644 --- a/qai_hub_models/models/litehrnet/export.py +++ b/qai_hub_models/models/litehrnet/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.litehrnet import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
""" model_name = "litehrnet" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,16 +195,17 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): warnings.filterwarnings("ignore") parser = export_parser( - model_cls=Model, - supports_qnn=False, - supports_onnx=False, - supports_precompiled_qnn_onnx=False, + model_cls=Model, supports_qnn=False, supports_precompiled_qnn_onnx=False ) args = parser.parse_args() export_model(**vars(args)) diff --git a/qai_hub_models/models/litehrnet/perf.yaml b/qai_hub_models/models/litehrnet/perf.yaml index e6cada4c..9d95b1f9 100644 --- a/qai_hub_models/models/litehrnet/perf.yaml +++ b/qai_hub_models/models/litehrnet/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: LiteHRNet performance_metrics: - torchscript_onnx_tflite: - inference_time: 7904.0 - throughput: 126.51821862348179 + inference_time: 7959.0 + throughput: 125.64392511622063 estimated_peak_memory_range: - min: 249856 - max: 4537216 + min: 253952 + max: 2920944 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,7 +56,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 2 total_layers: 1235 - job_id: jogkzldog + job_id: jpy13x74p + job_status: Passed + torchscript_onnx: + inference_time: 7130.0 + throughput: 140.25245441795232 + estimated_peak_memory_range: + min: 425984 + max: 6890912 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 1254 + layers_on_gpu: 0 + layers_on_cpu: 4 + total_layers: 1258 + job_id: j5we671j5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -67,13 +80,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:25:03Z' + timestamp: '2024-10-15T00:23:00Z' - torchscript_onnx_tflite: - inference_time: 5976.0 - throughput: 167.33601070950468 + inference_time: 4910.0 + throughput: 203.66598778004072 estimated_peak_memory_range: min: 249856 - max: 94933776 + max: 99864736 primary_compute_unit: NPU precision: fp16 layer_info: @@ -81,7 +94,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 2 total_layers: 1235 - job_id: jn5q87wm5 + job_id: jp0z0jve5 + job_status: Passed + torchscript_onnx: + inference_time: 4533.0 + throughput: 220.60445621001546 + 
estimated_peak_memory_range: + min: 606208 + max: 112216896 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 1254 + layers_on_gpu: 0 + layers_on_cpu: 4 + total_layers: 1258 + job_id: jg9lnmxvg job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -90,13 +118,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:25:04Z' + timestamp: '2024-10-15T00:23:01Z' - torchscript_onnx_tflite: - inference_time: 7904.0 - throughput: 126.51821862348179 + inference_time: 7938.0 + throughput: 125.97631645250694 estimated_peak_memory_range: - min: 266240 - max: 2070696 + min: 253952 + max: 2423880 primary_compute_unit: NPU precision: fp16 layer_info: @@ -104,7 +132,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 2 total_layers: 1235 - job_id: j1gln07lp + job_id: jp8qyx48p job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -112,14 +140,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:25:05Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:22:46Z' - torchscript_onnx_tflite: - inference_time: 8607.0 - throughput: 116.18450098756826 + inference_time: 7965.0 + throughput: 125.54927809165098 estimated_peak_memory_range: - min: 249856 - max: 85742576 + min: 245760 + max: 2855640 primary_compute_unit: NPU precision: fp16 layer_info: @@ -127,22 +155,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 2 total_layers: 1235 - job_id: jw5663v75 + job_id: j56y47d7p job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:25:06Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:22:50Z' - torchscript_onnx_tflite: - inference_time: 7908.0 - throughput: 126.45422357106727 + inference_time: 7929.0 + throughput: 126.11930886618741 estimated_peak_memory_range: - min: 270336 - max: 2944712 + min: 225280 + max: 2036536 primary_compute_unit: NPU precision: fp16 layer_info: @@ -150,22 +178,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 2 total_layers: 1235 - job_id: j1p3k48z5 + job_id: jglvmx1l5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:25:07Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:22:49Z' - torchscript_onnx_tflite: - inference_time: 7887.0 - throughput: 126.79092177000126 + inference_time: 7934.0 + throughput: 126.03982858583312 estimated_peak_memory_range: - min: 274432 - max: 2653064 + min: 245760 + max: 3055480 primary_compute_unit: NPU precision: fp16 layer_info: @@ -173,22 +201,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 2 total_layers: 1235 - job_id: jwgoy1md5 + job_id: j5q6qymmp job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:25:08Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:22:48Z' - torchscript_onnx_tflite: - inference_time: 7901.0 - throughput: 126.56625743576762 + inference_time: 8522.0 + throughput: 117.34334663224595 estimated_peak_memory_range: - min: 262144 - max: 2491936 + min: 245760 + max: 88355744 primary_compute_unit: NPU precision: fp16 layer_info: @@ -196,13 +224,74 @@ models: layers_on_gpu: 0 
layers_on_cpu: 2 total_layers: 1235 - job_id: j1pv314m5 + job_id: jgkex49og job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:25:08Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:22:47Z' + - torchscript_onnx_tflite: + inference_time: 5295.0 + throughput: 188.85741265344666 + estimated_peak_memory_range: + min: 221184 + max: 71293792 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 1233 + layers_on_gpu: 0 + layers_on_cpu: 2 + total_layers: 1235 + job_id: jgo26r4dp + job_status: Passed + torchscript_onnx: + inference_time: 4830.0 + throughput: 207.0393374741201 + estimated_peak_memory_range: + min: 1024000 + max: 83432272 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 1254 + layers_on_gpu: 0 + layers_on_cpu: 4 + total_layers: 1258 + job_id: j57yr49r5 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:23:04Z' + - torchscript_onnx: + inference_time: 8063.0 + throughput: 124.0233163834801 + estimated_peak_memory_range: + min: 4661248 + max: 4661248 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 1254 + layers_on_gpu: 0 + layers_on_cpu: 4 + total_layers: 1258 + job_id: jp14zjvlp + job_status: Passed + reference_device_info: + name: Snapdragon X Elite CRD + os: '11' + form_factor: Compute + os_name: Windows + manufacturer: Qualcomm + chipset: Snapdragon® X Elite + timestamp: '2024-10-15T00:23:02Z' diff --git a/qai_hub_models/models/llama_v2_7b_chat_quantized/README.md b/qai_hub_models/models/llama_v2_7b_chat_quantized/README.md index 52871142..0463700d 100644 --- a/qai_hub_models/models/llama_v2_7b_chat_quantized/README.md +++ b/qai_hub_models/models/llama_v2_7b_chat_quantized/README.md @@ -6,7 +6,7 @@ Llama 2 is a family of LLMs. The "Chat" at the end indicates that the model is optimized for chatbot-like dialogue. The model is quantized to w4a16(4-bit weights and 16-bit activations) and part of the model is quantized to w8a16(8-bit weights and 16-bit activations) making it suitable for on-device deployment. For Prompt and output length specified below, the time to first token is Llama-PromptProcessor-Quantized's latency and average time per addition token is Llama-TokenGenerator-KVCache-Quantized's latency. This is based on the implementation of Llama-v2-7B-Chat found -[here](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/llama_v2_7b_chat_quantized). @@ -14,26 +14,7 @@ accross various devices, can be found [here](https://aihub.qualcomm.com/models/l ## Deploying Llama 2 on-device -Large Language Model (LLM) such as [Llama 2](https://llama.meta.com/llama2/) has the following complexities to deploy on-device: -1. Model size is too large to fit in device memory for inference -2. Multi-Head Attention (MHA) has large activations leading to fallback from accelerators -3. High model load and inference time - -We can tackle the above constraints with the following steps: -1. 
Quantize weights to reduce on-disk model size, e.g., int8 or int4 weights -2. Quantize activations to reduce inference time memory pressure -3. Graph transformations to reduce inference time memory pressure, e.g., Multi-Head to Split-Head Attention (MHA -> SHA) -4. Graph transformations to convert or decompose operations into more accelerator friendly operations e.g. Linear to Conv -5. For LLM with 7B or more parameters, above steps are still not good enough on mobile, - hence we go one step further and split model into sub-parts. - -Here, we divide the model into 4 parts in order to -1. Make model exportable with low memory usage -2. Avoid inference time out-of-memory errors - -In order to export Llama 2, please ensure -1. Host machine has >40GB memory (RAM+swap-space) -2. If you don't have enough memory, export.py will dump instructions to increase swap space accordingly. +Please follow the [LLM on-device deployment](https://github.com/quic/ai-hub-apps/tree/main/tutorials/llm_on_genie) tutorial. ## Sample output prompts generated on-device 1. --prompt "what is gravity?" --max-output-tokens 30 @@ -69,44 +50,20 @@ print(fibonacci(5)) -## Example & Usage - -Install the package via pip: -```bash -pip install "qai_hub_models[llama_v2_7b_chat_quantized]" -``` - -Once installed, run the following simple CLI demo: - -```bash -python -m qai_hub_models.models.llama_v2_7b_chat_quantized.demo -``` -More details on the CLI tool can be found with the `--help` option. See -[demo.py](demo.py) for sample usage of the model including pre/post processing -scripts. Please refer to our [general instructions on using -models](../../../#getting-started) for more usage instructions. - -## Export for on-device deployment - -This repository contains export scripts that produce a model optimized for -on-device deployment. This can be run as follows: - -```bash -python -m qai_hub_models.models.llama_v2_7b_chat_quantized.export -``` -Additional options are documented with the `--help` option. Note that the above -script requires access to Deployment instructions for Qualcomm® AI Hub. ## License -- The license for the original implementation of Llama-v2-7B-Chat can be found +* The license for the original implementation of Llama-v2-7B-Chat can be found [here](https://github.com/facebookresearch/llama/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://github.com/facebookresearch/llama/blob/main/LICENSE) +* The license for the compiled assets for on-device deployment can be found [here](https://github.com/facebookresearch/llama/blob/main/LICENSE) + ## References * [LLaMA: Open and Efficient Foundation Language Models](https://arxiv.org/abs/2302.13971) * [Source Model Implementation](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). 
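The export.py diff that follows restructures Llama 2 export so that each prompt-processor/token-generator pair is compiled separately and then joined with a link job, producing one shared-weights asset per model part. A condensed sketch of that flow, using only hub calls that appear in the diff (`load_sub_component` is a hypothetical stand-in for the script's per-part loading and tracing):

```python
import qai_hub as hub

device = hub.Device("Samsung Galaxy S24 (Family)")  # DEFAULT_EXPORT_DEVICE in the diff
compile_jobs = []
for sub_name in ["PromptProcessor_1_Quantized", "TokenGenerator_1_Quantized"]:
    source_model, input_spec = load_sub_component(sub_name)  # hypothetical helper
    compile_jobs.append(
        hub.submit_compile_job(
            model=source_model,
            input_specs=input_spec,
            device=device,
            name=f"llama_v2_7b_chat_quantized_{sub_name}",
        )
    )

# Link the prompt processor and token generator so they share weights on device.
parts = [job.get_target_model() for job in compile_jobs]
link_job = hub.submit_link_job(parts, name="llama_v2_7b_chat_quantized_Llama2_Part1_Quantized")
link_job.get_target_model().download("build/Llama2_Part1_Quantized.bin")
```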
diff --git a/qai_hub_models/models/llama_v2_7b_chat_quantized/export.py b/qai_hub_models/models/llama_v2_7b_chat_quantized/export.py index 061ab003..1fadd4cf 100644 --- a/qai_hub_models/models/llama_v2_7b_chat_quantized/export.py +++ b/qai_hub_models/models/llama_v2_7b_chat_quantized/export.py @@ -31,26 +31,34 @@ ) ALL_COMPONENTS = [ - "PromptProcessor_1_Quantized", - "PromptProcessor_2_Quantized", - "PromptProcessor_3_Quantized", - "PromptProcessor_4_Quantized", - "TokenGenerator_1_Quantized", - "TokenGenerator_2_Quantized", - "TokenGenerator_3_Quantized", - "TokenGenerator_4_Quantized", -] -DEFAULT_COMPONENTS = [ - "PromptProcessor_1_Quantized", - "PromptProcessor_2_Quantized", - "PromptProcessor_3_Quantized", - "PromptProcessor_4_Quantized", - "TokenGenerator_1_Quantized", - "TokenGenerator_2_Quantized", - "TokenGenerator_3_Quantized", - "TokenGenerator_4_Quantized", + "Llama2_Part1_Quantized", + "Llama2_Part2_Quantized", + "Llama2_Part3_Quantized", + "Llama2_Part4_Quantized", ] +DEFAULT_COMPONENTS = ALL_COMPONENTS + +# Each components is two sub-components linked together with shared weights +ALL_SUB_COMPONENTS = { + "Llama2_Part1_Quantized": [ + "PromptProcessor_1_Quantized", + "TokenGenerator_1_Quantized", + ], + "Llama2_Part2_Quantized": [ + "PromptProcessor_2_Quantized", + "TokenGenerator_2_Quantized", + ], + "Llama2_Part3_Quantized": [ + "PromptProcessor_3_Quantized", + "TokenGenerator_3_Quantized", + ], + "Llama2_Part4_Quantized": [ + "PromptProcessor_4_Quantized", + "TokenGenerator_4_Quantized", + ], +} + DEFAULT_EXPORT_DEVICE = "Samsung Galaxy S24 (Family)" @@ -133,142 +141,168 @@ def export_model( # 1. Initialize PyTorch model model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) - compile_jobs: Dict[str, hub.client.CompileJob] = {} - profile_options_per_component: Dict[str, str] = {} + hub_device = hub.Device(device) + compile_jobs: Dict[str, List[hub.client.CompileJob]] = {} + profile_options_per_sub_component: Dict[str, str] = {} + link_jobs: Dict[str, hub.client.LinkJob] = {} hub_device = hub.Device(device) for component_name in components: - # Load model part + compile_jobs[component_name] = [] + for sub_component_name in ALL_SUB_COMPONENTS[component_name]: - component = model.load_model_part(component_name) + # Load model part + component = model.load_model_part(sub_component_name) - input_spec = component.get_input_spec( - **get_input_spec_kwargs(component, additional_model_kwargs) - ) - - source_model = component.convert_to_hub_source_model( - target_runtime, - output_path, - input_spec, - external_onnx_weights=True, - output_names=component.get_output_names(), - ) + input_spec = component.get_input_spec( + **get_input_spec_kwargs(component, additional_model_kwargs) + ) - if target_runtime == TargetRuntime.TFLITE: - quant_calibration_data = None - else: - quant_calibration_data = component.get_calibration_data( - target_runtime, input_spec=input_spec + source_model = component.convert_to_hub_source_model( + target_runtime, + output_path, + input_spec, + external_onnx_weights=True, + output_names=component.get_output_names(), ) - # 2. 
Compile the models to an on-device asset - model_compile_options = component.get_hub_compile_options( - target_runtime, compile_options - ) - print(f"Optimizing model {component_name} to run on-device") - submitted_compile_job = hub.submit_compile_job( - model=source_model, - input_specs=input_spec, - device=hub_device, - name=f"{model_name}_{component_name}", - calibration_data=quant_calibration_data, - options=model_compile_options, - ) + if target_runtime == TargetRuntime.TFLITE: + quant_calibration_data = None + else: + quant_calibration_data = component.get_calibration_data( + target_runtime, input_spec=input_spec + ) - compile_jobs[component_name] = cast( - hub.client.CompileJob, submitted_compile_job - ) - profile_options_per_component[ - component_name - ] = component.get_hub_profile_options(target_runtime, profile_options) + # 2. Compile the models to an on-device asset + model_compile_options = component.get_hub_compile_options( + target_runtime, compile_options + ) + print(f"Optimizing model {sub_component_name} to run on-device") + submitted_compile_job = hub.submit_compile_job( + model=source_model, + input_specs=input_spec, + device=hub_device, + name=f"{model_name}_{sub_component_name}", + calibration_data=quant_calibration_data, + options=model_compile_options, + ) - # Free model part to reduce memory-pressure - del component + profile_options_per_sub_component[ + sub_component_name + ] = component.get_hub_profile_options(target_runtime, profile_options) + + compile_jobs[component_name].append(submitted_compile_job) + # Free model part to reduce memory-pressure + del component + + for component_name, compile_jobs_list in compile_jobs.items(): + models = [] + for compile_job in compile_jobs_list: + if compile_job.get_status().code == "FAILED": + raise RuntimeError( + f"Compile job failed for {component_name}. Please re-run export script for failed component." + ) + models.append(compile_job.get_target_model()) + + # Link Prompt processor and Token generator + link_jobs[component_name] = hub.submit_link_job( + models, name=f"{model_name}_{component_name}" + ) - # 3. Profile the model assets on real devices + # 4. Profile the model assets on real devices profile_jobs: Dict[str, hub.client.ProfileJob] = {} if not skip_profiling: for component_name in components: - profile_options_all = profile_options_per_component[component_name] - print(f"Profiling model {component_name} on a hosted device.") - submitted_profile_job = hub.submit_profile_job( - model=compile_jobs[component_name].get_target_model(), - device=hub_device, - name=f"{model_name}_{component_name}", - options=profile_options_all, - ) - profile_jobs[component_name] = cast( - hub.client.ProfileJob, submitted_profile_job - ) - - # 4. Run inference on-device with sample inputs + hub_model = link_jobs[component_name].get_target_model() + for sub_component_name in ALL_SUB_COMPONENTS[component_name]: + profile_options_all = profile_options_per_sub_component[ + sub_component_name + ] + print(f"Profiling model {component_name} on a hosted device.") + submitted_profile_job = hub.submit_profile_job( + model=hub_model, + device=hub_device, + name=f"{model_name}_{sub_component_name}", + options=profile_options_all, + ) + profile_jobs[sub_component_name] = cast( + hub.client.ProfileJob, submitted_profile_job + ) + + # 5. 
Run inference on-device with sample inputs inference_jobs: Dict[str, hub.client.InferenceJob] = {} - if not skip_inferencing: for component_name in components: - print( - f"Running inference for {component_name} on a hosted device with example inputs." - ) - # Load model with no-AIMET mode - component = model.load_model_part(component_name) - profile_options_all = profile_options_per_component[component_name] - # Load individual model part - sample_inputs = component.sample_inputs() - submitted_inference_job = hub.submit_inference_job( - model=compile_jobs[component_name].get_target_model(), - inputs=sample_inputs, - device=hub_device, - name=f"{model_name}_{component_name}", - options=profile_options_all, - ) - inference_jobs[component_name] = cast( - hub.client.InferenceJob, submitted_inference_job - ) - - # 5. Download the model assets to a local file + for sub_component_name in ALL_SUB_COMPONENTS[component_name]: + print( + f"Running inference for {sub_component_name} on a hosted device with example inputs." + ) + # Load model with no-AIMET mode + component = model.load_model_part(sub_component_name) + profile_options_all = profile_options_per_sub_component[ + sub_component_name + ] + # Load individual model part + sample_inputs = component.sample_inputs() + submitted_inference_job = hub.submit_inference_job( + model=link_jobs[component_name].get_target_model(), + inputs=sample_inputs, + device=hub_device, + name=f"{model_name}_{sub_component_name}", + options=profile_options_all, + ) + inference_jobs[sub_component_name] = cast( + hub.client.InferenceJob, submitted_inference_job + ) + + # 6. Download the model assets to a local file if not skip_downloading: os.makedirs(output_path, exist_ok=True) - for component_name, compile_job in compile_jobs.items(): + for component_name, compile_job in link_jobs.items(): target_model: hub.Model = compile_job.get_target_model() # type: ignore - target_model.download( - str(output_path / f"{model_name}_{component_name}.bin") - ) + target_model.download(str(output_path / f"{component_name}.bin")) - # 6. Summarize the results from profiling and inference + # 7. 
Summarize the results from profiling and inference if not skip_summary and not skip_profiling: for component_name in components: - profile_job = profile_jobs[component_name] - assert profile_job is not None and profile_job.wait().success - profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore - print_profile_metrics_from_job(profile_job, profile_data) + for sub_component_name in ALL_SUB_COMPONENTS[component_name]: + profile_job = profile_jobs[sub_component_name] + assert profile_job is not None and profile_job.wait().success + profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore + print_profile_metrics_from_job(profile_job, profile_data) if not skip_summary and not skip_inferencing: for component_name in components: - inference_job = inference_jobs[component_name] - # Load individual model part - component = model.load_model_part(component_name) - # Get ordered model output names - output_names = component.get_output_names() - sample_inputs = component.sample_inputs() - torch_out = torch_inference(component, sample_inputs) - assert inference_job is not None and inference_job.wait().success - inference_result: hub.client.DatasetEntries = inference_job.download_output_data() # type: ignore - print_inference_metrics( - inference_job, inference_result, torch_out, output_names=output_names - ) + for sub_component_name in ALL_SUB_COMPONENTS[component_name]: + inference_job = inference_jobs[sub_component_name] + # Load individual model part + component = model.load_model_part(sub_component_name) + # Get ordered model output names + output_names = component.get_output_names() + sample_inputs = component.sample_inputs() + torch_out = torch_inference(component, sample_inputs) + assert inference_job is not None and inference_job.wait().success + inference_result: hub.client.DatasetEntries = inference_job.download_output_data() # type: ignore + print_inference_metrics( + inference_job, + inference_result, + torch_out, + output_names=output_names, + ) if not skip_summary: print_on_target_demo_cmd( - compile_jobs.values(), Path(__file__).parent.resolve(), hub_device + link_jobs.values(), Path(__file__).parent.resolve(), hub_device ) return { component_name: ( - compile_jobs[component_name], - profile_jobs.get(component_name, None), - inference_jobs.get(component_name, None), + link_jobs[component_name], + profile_jobs.get(sub_component_name, None), + inference_jobs.get(sub_component_name, None), ) for component_name in components + for sub_component_name in ALL_SUB_COMPONENTS[component_name] } diff --git a/qai_hub_models/models/llama_v2_7b_chat_quantized/info.yaml b/qai_hub_models/models/llama_v2_7b_chat_quantized/info.yaml index 767912c3..0e6d4d30 100644 --- a/qai_hub_models/models/llama_v2_7b_chat_quantized/info.yaml +++ b/qai_hub_models/models/llama_v2_7b_chat_quantized/info.yaml @@ -21,10 +21,11 @@ research_paper_title: "LLaMA: Open and Efficient Foundation Language Models" license: https://github.com/facebookresearch/llama/blob/main/LICENSE source_repo: https://huggingface.co/meta-llama/Llama-2-7b-chat-hf technical_details: + Input sequence length for Prompt Processor: 1024 + Context length: 1024 Number of parameters: 7B Precision: w4a16 + w8a16 (few layers) Model-1 (Prompt Processor): Llama-PromptProcessor-Quantized - Max context length: 1024 Prompt processor model size: 3.6 GB Prompt processor input: 1024 tokens Prompt processor output: 1024 output tokens + KVCache for token generator @@ -32,8 +33,11 @@ technical_details: Token generator model size: 
3.6 GB Token generator input: 1 input token + past KVCache Token generator output: 1 output token + KVCache for next iteration - Decoding length: 1024 (1 output token + 1023 from KVCache) Use: Initiate conversation with prompt-processor and then token generator for subsequent iterations. + Minimum QNN SDK version required: 2.27.0 + Supported languages: English. + TTFT: Time To First Token is the time it takes to generate the first response token. This is expressed as a range because it varies based on the length of the prompt. For Llama-v2-7B-Chat, both values in the range are the same since prompt length is the full context length (1024 tokens). + Response Rate: Rate of response generation after the first response token. applicable_scenarios: - Dialogue - Content Generation @@ -49,3 +53,7 @@ deploy_license: https://github.com/facebookresearch/llama/blob/main/LICENSE deploy_license_type: llama2 dataset: [] restrict_model_sharing: true +model_type_llm: true +llm_details: + call_to_action: 'view_readme' + genie_compatible: true diff --git a/qai_hub_models/models/llama_v2_7b_chat_quantized/perf.yaml b/qai_hub_models/models/llama_v2_7b_chat_quantized/perf.yaml index 9bb996f7..7627a2f4 100644 --- a/qai_hub_models/models/llama_v2_7b_chat_quantized/perf.yaml +++ b/qai_hub_models/models/llama_v2_7b_chat_quantized/perf.yaml @@ -1,173 +1,72 @@ +aggregated: + supported_devices: + - QCS8550 (Proxy) + - Samsung Galaxy S24 + - Snapdragon X Elite CRD + - Snapdragon 8 Elite QRD + supported_oses: + - Android + supported_chipsets: + - Snapdragon® 8 Gen 3 + - Snapdragon® X Elite + - QCS8550 Proxy + - Snapdragon® 8 Elite models: -- name: Llama2-TokenGenerator-KVCache-Quantized + name: Llama-v2-7B-Chat performance_metrics: - - reference_device_info: - name: QCS8550 (Proxy) - os: '12' - form_factor: Iot - os_name: Android - manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-10-04T23:59:26.162600Z' - torchscript_onnx_qnn: - inference_time: 97732 - throughput: 10.23 - estimated_peak_memory_range: - min: 74272768 - max: 75651480 - layer_info: - layers_on_npu: 35926 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 35926 - precision: uint16 - primary_compute_unit: NPU - job_id: "null" - job_status: Passed - - reference_device_info: + - torchscript_onnx_qnn: + llm_metrics: + time_to_first_token_range: + min: 1495830 + max: 1495830 + tokens_per_second: 12.85 + reference_device_info: name: Samsung Galaxy S24 os: '14' form_factor: Phone os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-07-01T19:11:33.087816Z' - torchscript_onnx_qnn: - inference_time: 88438 - throughput: 11.307 - estimated_peak_memory_range: - min: 95744000 - max: 4468197056 - layer_info: - layers_on_npu: 33818 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 33818 - precision: uint16 - primary_compute_unit: NPU - job_id: "null" - job_status: Passed - - reference_device_info: + timestamp: '2024-10-16T00:32:42.210701Z' + - torchscript_onnx_qnn: + llm_metrics: + time_to_first_token_range: + min: 1919000 + max: 1919000 + tokens_per_second: 11.20 + reference_device_info: name: Snapdragon X Elite CRD os: '11' form_factor: Compute os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-07-01T19:09:26.083951Z' - torchscript_onnx_qnn: - inference_time: 95960 - throughput: 10.421 - estimated_peak_memory_range: - min: 68235264 - max: 68235264 - layer_info: - layers_on_npu: 33818 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 33818 - precision: uint16 - 
primary_compute_unit: NPU - job_id: "null" - job_status: Passed -- name: Llama2-PromptProcessor-Quantized - performance_metrics: - - reference_device_info: + timestamp: '2024-10-16T00:32:42.210701Z' + - torchscript_onnx_qnn: + llm_metrics: + time_to_first_token_range: + min: 1919000 + max: 1919000 + tokens_per_second: 11.20 + reference_device_info: name: QCS8550 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-10-04T23:59:26.162600Z' - torchscript_onnx_qnn: - inference_time: 2020745 - throughput: 506.93 - estimated_peak_memory_range: - min: 11554816 - max: 13002000 - layer_info: - layers_on_npu: 31830 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 31830 - precision: uint16 - primary_compute_unit: NPU - job_id: "null" - job_status: Passed - - reference_device_info: - name: Samsung Galaxy S24 - os: '14' + chipset: QCS8550 Proxy + timestamp: '2024-10-16T00:32:42.210701Z' + - torchscript_onnx_qnn: + llm_metrics: + time_to_first_token_range: + min: 1440000 + max: 1440000 + tokens_per_second: 17.94 + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' form_factor: Phone os_name: Android - manufacturer: Samsung - chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-07-01T20:53:21.204302Z' - torchscript_onnx_qnn: - inference_time: 1484949 - throughput: 689.5859 - estimated_peak_memory_range: - min: 8421376 - max: 1809446256 - layer_info: - layers_on_npu: 31766 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 31766 - precision: uint16 - primary_compute_unit: NPU - job_id: "null" - job_status: Passed - - reference_device_info: - name: Snapdragon X Elite CRD - os: '11' - form_factor: Compute - os_name: Windows manufacturer: Qualcomm - chipset: Snapdragon® X Elite - timestamp: '2024-07-02T00:17:42.777637Z' - torchscript_onnx_qnn: - inference_time: 1889092 - throughput: 542.059 - estimated_peak_memory_range: - min: 10784768 - max: 10784768 - layer_info: - layers_on_npu: 31766 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 31766 - precision: uint16 - primary_compute_unit: NPU - job_id: "null" - job_status: Passed -aggregated: - supported_devices: - - Samsung Galaxy S23 Ultra - - Samsung Galaxy S24 - - Snapdragon X Elite CRD - supported_oses: - - Android - supported_chipsets: - - Snapdragon® 8 Gen 2 - - Snapdragon® 8 Gen 3 - - Snapdragon® X Elite - performance_metrics: - - reference_device_info: - name: Samsung Galaxy S23 Ultra - os: '13' - form_factor: Phone - os_name: Android - manufacturer: Samsung - chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-01-26T00:34:02.549319Z' - torchscript_onnx_qnn: - inference_time: 117423.0 - throughput: 8.5 - estimated_peak_memory_range: - min: 68579328 - max: 73044264 - precision: uint16 - primary_compute_unit: NPU - job_id: "" - job_status: Passed + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-16T00:32:42.210701Z' diff --git a/qai_hub_models/models/llama_v3_1_8b_chat_quantized/README.md b/qai_hub_models/models/llama_v3_1_8b_chat_quantized/README.md new file mode 100644 index 00000000..b49510f6 --- /dev/null +++ b/qai_hub_models/models/llama_v3_1_8b_chat_quantized/README.md @@ -0,0 +1,61 @@ +[![Qualcomm® AI Hub Models](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/quic-logo.jpg)](../../README.md) + + +# [Llama-v3.1-8B-Chat: State-of-the-art large language model useful on a variety of language understanding and generation tasks](https://aihub.qualcomm.com/models/llama_v3_1_8b_chat_quantized) + +Llama 3 is a family of LLMs. 
The "Chat" at the end indicates that the model is optimized for chatbot-like dialogue. The model is quantized to w4a16 (4-bit weights and 16-bit activations) and part of the model is quantized to w8a16 (8-bit weights and 16-bit activations) making it suitable for on-device deployment. For Prompt and output length specified below, the time to first token is Llama-PromptProcessor-Quantized's latency and average time per addition token is Llama-TokenGenerator-Quantized's latency. + +This is based on the implementation of Llama-v3.1-8B-Chat found +[here]({source_repo}). This repository contains scripts for optimized on-device +export suitable to run on Qualcomm® devices. More details on model performance +accross various devices, can be found [here](https://aihub.qualcomm.com/models/llama_v3_1_8b_chat_quantized). + +[Sign up](https://myaccount.qualcomm.com/signup) to start using Qualcomm AI Hub and run these models on a hosted Qualcomm® device. + +## Deploying Llama 3.1 on-device + +Please follow the [LLM on-device deployment](https://github.com/quic/ai-hub-apps/tree/main/tutorials/llm_on_genie) tutorial. + + + + + +## License +* The license for the original implementation of Llama-v3.1-8B-Chat can be found + [here](https://github.com/facebookresearch/llama/blob/main/LICENSE). +* The license for the compiled assets for on-device deployment can be found [here](https://github.com/facebookresearch/llama/blob/main/LICENSE) + + +## References +* [LLaMA: Open and Efficient Foundation Language Models](https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_1/) +* [Source Model Implementation](https://github.com/meta-llama/llama3/tree/main) + + + +## Community +* Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. +* For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). + + +## Usage and Limitations + +This model may not be used for or in connection with any of the following applications: + +- Accessing essential private and public services and benefits; +- Administration of justice and democratic processes; +- Assessing or recognizing the emotional state of a person; +- Biometric and biometrics-based systems, including categorization of persons based on sensitive characteristics; +- Education and vocational training; +- Employment and workers management; +- Exploitation of the vulnerabilities of persons resulting in harmful behavior; +- General purpose social scoring; +- Law enforcement; +- Management and operation of critical infrastructure; +- Migration, asylum and border control management; +- Predictive policing; +- Real-time remote biometric identification in public spaces; +- Recommender systems of social media platforms; +- Scraping of facial images (from the internet or otherwise); and/or +- Subliminal manipulation + + diff --git a/qai_hub_models/models/llama_v3_1_8b_chat_quantized/__init__.py b/qai_hub_models/models/llama_v3_1_8b_chat_quantized/__init__.py new file mode 100644 index 00000000..522353c1 --- /dev/null +++ b/qai_hub_models/models/llama_v3_1_8b_chat_quantized/__init__.py @@ -0,0 +1,8 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. 
+# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +from qai_hub_models.models._shared.llama.app import ChatApp as App # noqa: F401 + +from .model import MODEL_ID # noqa: F401 +from .model import Llama3_1_Quantized as Model # noqa: F401 diff --git a/qai_hub_models/models/llama_v3_1_8b_chat_quantized/demo.py b/qai_hub_models/models/llama_v3_1_8b_chat_quantized/demo.py new file mode 100644 index 00000000..09e48f63 --- /dev/null +++ b/qai_hub_models/models/llama_v3_1_8b_chat_quantized/demo.py @@ -0,0 +1,52 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. +# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +from __future__ import annotations + +from typing import List, Type + +from qai_hub_models.models._shared.llama3.demo import llama_chat_demo +from qai_hub_models.models._shared.llama3.model import ( + DEFAULT_USER_PROMPT, + END_TOKENS, + get_input_prompt_with_tags, + get_tokenizer, + prepare_combined_attention_mask, +) +from qai_hub_models.models.llama_v3_1_8b_chat_quantized import MODEL_ID, Model +from qai_hub_models.models.llama_v3_1_8b_chat_quantized.model import ( + HF_REPO_NAME, + HF_REPO_URL, +) +from qai_hub_models.utils.base_model import BaseModel, TargetRuntime + + +def llama_3_1_chat_demo( + model_cls: Type[BaseModel] = Model, + model_id: str = MODEL_ID, + end_tokens: set = END_TOKENS, + hf_repo_name: str = HF_REPO_NAME, + hf_repo_url: str = HF_REPO_URL, + default_prompt: str = DEFAULT_USER_PROMPT, + is_test: bool = False, + available_target_runtimes: List[TargetRuntime] = [TargetRuntime.QNN], +): + llama_chat_demo( + model_cls=model_cls, + model_id=model_id, + get_input_prompt_with_tags=get_input_prompt_with_tags, + prepare_combined_attention_mask=prepare_combined_attention_mask, + tokenizer=get_tokenizer(hf_repo_name), + end_tokens=end_tokens, + hf_repo_name=hf_repo_name, + hf_repo_url=hf_repo_url, + default_prompt=default_prompt, + is_test=is_test, + available_target_runtimes=available_target_runtimes, + bundled_kvcache=False, + ) + + +if __name__ == "__main__": + llama_3_1_chat_demo(model_cls=Model) diff --git a/qai_hub_models/models/llama_v3_1_8b_chat_quantized/export.py b/qai_hub_models/models/llama_v3_1_8b_chat_quantized/export.py new file mode 100644 index 00000000..9b27f1b7 --- /dev/null +++ b/qai_hub_models/models/llama_v3_1_8b_chat_quantized/export.py @@ -0,0 +1,57 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. 
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+
+from __future__ import annotations
+
+import warnings
+
+from qai_hub_models.models._shared.llama3.export import export_model
+from qai_hub_models.models.llama_v3_1_8b_chat_quantized import MODEL_ID, Model
+from qai_hub_models.models.llama_v3_1_8b_chat_quantized.model import (
+    NUM_LAYERS_PER_SPLIT,
+    NUM_SPLITS,
+)
+from qai_hub_models.utils.args import export_parser
+
+DEFAULT_EXPORT_DEVICE = "Snapdragon 8 Elite QRD"
+
+ALL_COMPONENTS = [f"part_{i + 1}_of_{NUM_SPLITS}" for i in range(NUM_SPLITS)]
+
+# Each component is two sub-components linked together with shared weights
+ALL_SUB_COMPONENTS = {
+    f"part_{i + 1}_of_{NUM_SPLITS}": [
+        f"prompt_{i + 1}_of_{NUM_SPLITS}",
+        f"token_{i + 1}_of_{NUM_SPLITS}",
+    ]
+    for i in range(NUM_SPLITS)
+}
+
+
+def main():
+    warnings.filterwarnings("ignore")
+    parser = export_parser(
+        model_cls=Model,
+        supports_tflite=False,
+        supports_precompiled_qnn_onnx=False,
+        default_export_device=DEFAULT_EXPORT_DEVICE,
+    )
+    parser.add_argument(
+        "--synchronous",
+        action="store_true",
+        help="Wait for each command to finish before submitting the next.",
+    )
+    args = parser.parse_args()
+    export_model(
+        model_cls=Model,
+        model_name=MODEL_ID,
+        components=ALL_COMPONENTS,
+        sub_components=ALL_SUB_COMPONENTS,
+        num_layers_per_split=NUM_LAYERS_PER_SPLIT,
+        **vars(args),
+    )
+
+
+if __name__ == "__main__":
+    main()
diff --git a/qai_hub_models/models/llama_v3_1_8b_chat_quantized/info.yaml b/qai_hub_models/models/llama_v3_1_8b_chat_quantized/info.yaml
new file mode 100644
index 00000000..c36e4441
--- /dev/null
+++ b/qai_hub_models/models/llama_v3_1_8b_chat_quantized/info.yaml
@@ -0,0 +1,61 @@
+name: Llama-v3.1-8B-Chat
+id: llama_v3_1_8b_chat_quantized
+status: public
+headline: State-of-the-art large language model useful on a variety of language
+  understanding and generation tasks.
+domain: Generative AI
+description: Llama 3 is a family of LLMs. The "Chat" at the end indicates that
+  the model is optimized for chatbot-like dialogue. The model is quantized to
+  w4a16 (4-bit weights and 16-bit activations) and part of the model is quantized to
+  w8a16 (8-bit weights and 16-bit activations), making it suitable for on-device
+  deployment. For the prompt and output lengths specified below, the time to first token is
+  Llama-PromptProcessor-Quantized's latency and the average time per additional token is
+  Llama-TokenGenerator-Quantized's latency.
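To make the component naming in the export script above concrete, here is what `ALL_COMPONENTS` and `ALL_SUB_COMPONENTS` expand to for the five-way Llama 3.1 split. This is a standalone illustration of the naming scheme, not an additional API:

```python
# Illustration of the naming scheme from export.py above (NUM_SPLITS = 5).
NUM_SPLITS = 5

ALL_COMPONENTS = [f"part_{i + 1}_of_{NUM_SPLITS}" for i in range(NUM_SPLITS)]
ALL_SUB_COMPONENTS = {
    f"part_{i + 1}_of_{NUM_SPLITS}": [
        f"prompt_{i + 1}_of_{NUM_SPLITS}",  # prompt-processor half
        f"token_{i + 1}_of_{NUM_SPLITS}",   # token-generator half
    ]
    for i in range(NUM_SPLITS)
}

print(ALL_COMPONENTS[0])                  # part_1_of_5
print(ALL_SUB_COMPONENTS["part_1_of_5"])  # ['prompt_1_of_5', 'token_1_of_5']
```

Each exported part therefore carries both a prompt-processor graph and a token-generator graph that share weights, which is what the link jobs in the shared export flow tie together.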
+use_case: Text Generation +tags: + - llm + - generative-ai + - quantized +research_paper: https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_1/ +research_paper_title: "LLaMA: Open and Efficient Foundation Language Models" +license: https://github.com/facebookresearch/llama/blob/main/LICENSE +source_repo: https://github.com/meta-llama/llama3/tree/main +technical_details: + Input sequence length for Prompt Processor: 128 + Context length: 4096 + Number of parameters: 8B + Model size: 4.8GB + Precision: w4a16 + w8a16 (few layers) + Num of key-value heads: 8 + Model-1 (Prompt Processor): Llama-PromptProcessor-Quantized + Prompt processor input: 128 tokens + position embeddings + attention mask + KV cache inputs + Prompt processor output: 128 output tokens + KV cache outputs + Model-2 (Token Generator): Llama-TokenGenerator-Quantized + Token generator input: 1 input token + position embeddings + attention mask + KV cache inputs + Token generator output: 1 output token + KV cache outputs + Use: Initiate conversation with prompt-processor and then token generator for subsequent iterations. + Minimum QNN SDK version required: 2.27.7 + Language(s) supported: English. + TTFT: Time To First Token is the time it takes to generate the first response token. This is expressed as a range because it varies based on the length of the prompt. The lower bound is for a short prompt (up to 128 tokens, i.e., one iteration of the prompt processor) and the upper bound is for a prompt using the full context length (4096 tokens). + Response Rate: Rate of response generation after the first response token. +applicable_scenarios: + - Dialogue + - Content Generation + - Customer Support +related_models: + - llama_v3_8b_chat_quantized + - llama_v3_2_3b_chat_quantized +form_factors: + - Phone + - Tablet +has_static_banner: true +has_animated_banner: true +license_type: llama3 +deploy_license: https://github.com/facebookresearch/llama/blob/main/LICENSE +deploy_license_type: llama3 +dataset: [] +restrict_model_sharing: true +model_type_llm: true +llm_details: + call_to_action: 'view_readme' + genie_compatible: true diff --git a/qai_hub_models/models/llama_v3_1_8b_chat_quantized/model.py b/qai_hub_models/models/llama_v3_1_8b_chat_quantized/model.py new file mode 100644 index 00000000..e30695f1 --- /dev/null +++ b/qai_hub_models/models/llama_v3_1_8b_chat_quantized/model.py @@ -0,0 +1,110 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. +# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +from __future__ import annotations + +import os + +from qai_hub_models.models._shared.llama3.model import ( + DEFAULT_CONTEXT_LENGTH, + Llama3Base_Quantized, +) +from qai_hub_models.utils.asset_loaders import CachedWebModelAsset +from qai_hub_models.utils.input_spec import InputSpec + +MODEL_ID = __name__.split(".")[-2] +MODEL_ASSET_VERSION = 2 +DEFAULT_ENCODINGS = "llama31.encodings" +DEFAULT_ENCODINGS_ZIP = DEFAULT_ENCODINGS + ".zip" + +NUM_LAYERS = 32 +NUM_SPLITS = 5 +NUM_LAYERS_PER_SPLIT = 9 + +# Hugging face repo name and url +HF_REPO_NAME = "meta-llama/Meta-Llama-3.1-8B-Instruct" +HF_REPO_URL = f"https://huggingface.co/meta-llama/{HF_REPO_NAME}" + +# Minimum memory (RAM+swap) recommended for export. +# TODO: #10762 should reduce once AIMET export consumes less memory during export. TODO!!! 
Not quite correct, since we are not using AIMET +MIN_MEMORY_RECOMMENDED = 40 # TODO: Does this work for Llama 3? + + +class Llama3_1_Quantized(Llama3Base_Quantized): + def __init__(self, huggingface_model_name: str = HF_REPO_NAME, *args, **kwargs): + super().__init__( + huggingface_model_name=huggingface_model_name, + min_memory_recommended=MIN_MEMORY_RECOMMENDED, + *args, + **kwargs, + ) + + @classmethod + def from_pretrained( + cls, + sequence_length: int, + context_length: int = DEFAULT_CONTEXT_LENGTH, + aimet_encodings: str | None = "DEFAULT", + huggingface_model_name: str = HF_REPO_NAME, + ) -> "Llama3_1_Quantized": + """ + Load a pre-trained Llama 3.1 (8B) model from Meta via HuggingFace. + + sequence_length: + Instantiate with this token sequence length input. A longer + sequence length means the model is capable of processing more + tokens at once. This can only be set to greater than one to process + prompts, since responses are auto-regressive in nature and require + this to be 1. + context_length: + Total context length of model. Longer context length means the + model is more capable of making longer connections in the input + prompt. However, it also hurts runtime performance (both time-to- + first-token and tokens-per-second), so this is a tradeoff that may + depend on the use case. + aimet_encodings: + Path to AIMET quantization encodings file. + huggingface_model_name: + Name or URL of the HuggingFace model. Change this if you want to + change the weights. + """ + if aimet_encodings: + if aimet_encodings == "DEFAULT": + aimet_encodings = os.path.join( + CachedWebModelAsset.from_asset_store( + MODEL_ID, MODEL_ASSET_VERSION, DEFAULT_ENCODINGS_ZIP + ).fetch(extract=True), + DEFAULT_ENCODINGS, + ) + + return cls( + aimet_encodings=aimet_encodings, + sequence_length=sequence_length, + context_length=context_length, + huggingface_model_name=huggingface_model_name, + ) + + @staticmethod + def get_output_names(num_hidden_layers: int = NUM_LAYERS): + return Llama3Base_Quantized.get_output_names( + num_hidden_layers=num_hidden_layers + ) + + @staticmethod + def get_input_spec( + num_hidden_layers: int = NUM_LAYERS, + input_seq_length: int = 128, + context_length: int = DEFAULT_CONTEXT_LENGTH, + hidden_size: int = 4096, + num_key_value_heads: int = 8, + num_attention_heads: int = 32, + ) -> InputSpec: + return Llama3Base_Quantized.get_input_spec( + num_hidden_layers=NUM_LAYERS, + input_seq_length=input_seq_length, + context_length=context_length, + hidden_size=hidden_size, + num_key_value_heads=num_key_value_heads, + num_attention_heads=num_attention_heads, + ) diff --git a/qai_hub_models/models/llama_v3_1_8b_chat_quantized/perf.yaml b/qai_hub_models/models/llama_v3_1_8b_chat_quantized/perf.yaml new file mode 100644 index 00000000..a4eb058c --- /dev/null +++ b/qai_hub_models/models/llama_v3_1_8b_chat_quantized/perf.yaml @@ -0,0 +1,25 @@ +aggregated: + supported_devices: + - Snapdragon 8 Elite QRD + supported_oses: + - Android + supported_chipsets: + - Snapdragon® 8 Elite +models: + name: Llama-v3.1-8B-Chat + performance_metrics: + - torchscript_onnx_qnn: + llm_metrics: + time_to_first_token_range: + min: 154517 + max: 4944544 + tokens_per_second: 13.0546 + evaluation_metrics: null + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-16T00:32:42.210701Z' diff --git a/qai_hub_models/models/llama_v3_1_8b_chat_quantized/requirements.txt 
b/qai_hub_models/models/llama_v3_1_8b_chat_quantized/requirements.txt
new file mode 100644
index 00000000..c5deadcc
--- /dev/null
+++ b/qai_hub_models/models/llama_v3_1_8b_chat_quantized/requirements.txt
@@ -0,0 +1,5 @@
+onnx==1.16.2
+transformers==4.45.0
+huggingface_hub==0.23.2
+sentencepiece==0.2.0
+psutil
diff --git a/qai_hub_models/models/llama_v3_1_8b_chat_quantized/test.py b/qai_hub_models/models/llama_v3_1_8b_chat_quantized/test.py
new file mode 100644
index 00000000..e4b557ee
--- /dev/null
+++ b/qai_hub_models/models/llama_v3_1_8b_chat_quantized/test.py
@@ -0,0 +1,14 @@
+# ---------------------------------------------------------------------
+# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+import pytest
+
+from qai_hub_models.models.llama_v3_1_8b_chat_quantized.demo import llama_3_1_chat_demo
+
+
+@pytest.mark.skip("#105 move slow_cloud and slow tests to nightly.")
+@pytest.mark.slow_cloud
+def test_demo():
+    # Run demo and verify it does not crash
+    llama_3_1_chat_demo(is_test=True)
diff --git a/qai_hub_models/models/llama_v3_2_3b_chat_quantized/README.md b/qai_hub_models/models/llama_v3_2_3b_chat_quantized/README.md
new file mode 100644
index 00000000..fcd3721b
--- /dev/null
+++ b/qai_hub_models/models/llama_v3_2_3b_chat_quantized/README.md
@@ -0,0 +1,61 @@
+[![Qualcomm® AI Hub Models](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/quic-logo.jpg)](../../README.md)
+
+
+# [Llama-v3.2-3B-Chat: State-of-the-art large language model useful on a variety of language understanding and generation tasks](https://aihub.qualcomm.com/models/llama_v3_2_3b_chat_quantized)
+
+Llama 3 is a family of LLMs. The "Chat" at the end indicates that the model is optimized for chatbot-like dialogue. The model is quantized to w4a16 (4-bit weights and 16-bit activations) and part of the model is quantized to w8a16 (8-bit weights and 16-bit activations), making it suitable for on-device deployment. For the prompt and output lengths specified below, the time to first token is Llama-PromptProcessor-Quantized's latency and the average time per additional token is Llama-TokenGenerator-Quantized's latency.
+
+This is based on the implementation of Llama-v3.2-3B-Chat found
+[here](https://github.com/meta-llama/llama3/tree/main). This repository contains scripts for optimized on-device
+export suitable to run on Qualcomm® devices. More details on model performance
+across various devices can be found [here](https://aihub.qualcomm.com/models/llama_v3_2_3b_chat_quantized).
+
+[Sign up](https://myaccount.qualcomm.com/signup) to start using Qualcomm AI Hub and run these models on a hosted Qualcomm® device.
+
+## Deploying Llama 3.2 on-device
+
+Please follow the [LLM on-device deployment](https://github.com/quic/ai-hub-apps/tree/main/tutorials/llm_on_genie) tutorial.
+
+
+
+
+## License
+* The license for the original implementation of Llama-v3.2-3B-Chat can be found
+  [here](https://github.com/facebookresearch/llama/blob/main/LICENSE).
+* The license for the compiled assets for on-device deployment can be found [here](https://github.com/facebookresearch/llama/blob/main/LICENSE) + + +## References +* [LLaMA: Open and Efficient Foundation Language Models](https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_2/) +* [Source Model Implementation](https://github.com/meta-llama/llama3/tree/main) + + + +## Community +* Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. +* For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). + + +## Usage and Limitations + +This model may not be used for or in connection with any of the following applications: + +- Accessing essential private and public services and benefits; +- Administration of justice and democratic processes; +- Assessing or recognizing the emotional state of a person; +- Biometric and biometrics-based systems, including categorization of persons based on sensitive characteristics; +- Education and vocational training; +- Employment and workers management; +- Exploitation of the vulnerabilities of persons resulting in harmful behavior; +- General purpose social scoring; +- Law enforcement; +- Management and operation of critical infrastructure; +- Migration, asylum and border control management; +- Predictive policing; +- Real-time remote biometric identification in public spaces; +- Recommender systems of social media platforms; +- Scraping of facial images (from the internet or otherwise); and/or +- Subliminal manipulation + + diff --git a/qai_hub_models/models/llama_v3_2_3b_chat_quantized/__init__.py b/qai_hub_models/models/llama_v3_2_3b_chat_quantized/__init__.py new file mode 100644 index 00000000..142d3feb --- /dev/null +++ b/qai_hub_models/models/llama_v3_2_3b_chat_quantized/__init__.py @@ -0,0 +1,8 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. +# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +from qai_hub_models.models._shared.llama.app import ChatApp as App # noqa: F401 + +from .model import MODEL_ID # noqa: F401 +from .model import Llama3_2_Quantized as Model # noqa: F401 diff --git a/qai_hub_models/models/llama_v3_2_3b_chat_quantized/demo.py b/qai_hub_models/models/llama_v3_2_3b_chat_quantized/demo.py new file mode 100644 index 00000000..a76d37f9 --- /dev/null +++ b/qai_hub_models/models/llama_v3_2_3b_chat_quantized/demo.py @@ -0,0 +1,52 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. 
+# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +from __future__ import annotations + +from typing import List, Type + +from qai_hub_models.models._shared.llama3.demo import llama_chat_demo +from qai_hub_models.models._shared.llama3.model import ( + DEFAULT_USER_PROMPT, + END_TOKENS, + get_input_prompt_with_tags, + get_tokenizer, + prepare_combined_attention_mask, +) +from qai_hub_models.models.llama_v3_2_3b_chat_quantized import MODEL_ID, Model +from qai_hub_models.models.llama_v3_2_3b_chat_quantized.model import ( + HF_REPO_NAME, + HF_REPO_URL, +) +from qai_hub_models.utils.base_model import BaseModel, TargetRuntime + + +def llama_3_2_chat_demo( + model_cls: Type[BaseModel] = Model, + model_id: str = MODEL_ID, + end_tokens: set = END_TOKENS, + hf_repo_name: str = HF_REPO_NAME, + hf_repo_url: str = HF_REPO_URL, + default_prompt: str = DEFAULT_USER_PROMPT, + is_test: bool = False, + available_target_runtimes: List[TargetRuntime] = [TargetRuntime.QNN], +): + llama_chat_demo( + model_cls=model_cls, + model_id=model_id, + get_input_prompt_with_tags=get_input_prompt_with_tags, + prepare_combined_attention_mask=prepare_combined_attention_mask, + tokenizer=get_tokenizer(hf_repo_name), + end_tokens=end_tokens, + hf_repo_name=hf_repo_name, + hf_repo_url=hf_repo_url, + default_prompt=default_prompt, + is_test=is_test, + available_target_runtimes=available_target_runtimes, + bundled_kvcache=False, + ) + + +if __name__ == "__main__": + llama_3_2_chat_demo(model_cls=Model) diff --git a/qai_hub_models/models/llama_v3_2_3b_chat_quantized/export.py b/qai_hub_models/models/llama_v3_2_3b_chat_quantized/export.py new file mode 100644 index 00000000..4784b5d9 --- /dev/null +++ b/qai_hub_models/models/llama_v3_2_3b_chat_quantized/export.py @@ -0,0 +1,57 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. 
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+
+from __future__ import annotations
+
+import warnings
+
+from qai_hub_models.models._shared.llama3.export import export_model
+from qai_hub_models.models.llama_v3_2_3b_chat_quantized import MODEL_ID, Model
+from qai_hub_models.models.llama_v3_2_3b_chat_quantized.model import (
+    NUM_LAYERS_PER_SPLIT,
+    NUM_SPLITS,
+)
+from qai_hub_models.utils.args import export_parser
+
+DEFAULT_EXPORT_DEVICE = "Snapdragon 8 Elite QRD"
+
+ALL_COMPONENTS = [f"part_{i + 1}_of_{NUM_SPLITS}" for i in range(NUM_SPLITS)]
+
+# Each component is two sub-components linked together with shared weights
+ALL_SUB_COMPONENTS = {
+    f"part_{i + 1}_of_{NUM_SPLITS}": [
+        f"prompt_{i + 1}_of_{NUM_SPLITS}",
+        f"token_{i + 1}_of_{NUM_SPLITS}",
+    ]
+    for i in range(NUM_SPLITS)
+}
+
+
+def main():
+    warnings.filterwarnings("ignore")
+    parser = export_parser(
+        model_cls=Model,
+        supports_tflite=False,
+        supports_precompiled_qnn_onnx=False,
+        default_export_device=DEFAULT_EXPORT_DEVICE,
+    )
+    parser.add_argument(
+        "--synchronous",
+        action="store_true",
+        help="Wait for each command to finish before submitting the next.",
+    )
+    args = parser.parse_args()
+    export_model(
+        model_cls=Model,
+        model_name=MODEL_ID,
+        components=ALL_COMPONENTS,
+        sub_components=ALL_SUB_COMPONENTS,
+        num_layers_per_split=NUM_LAYERS_PER_SPLIT,
+        **vars(args),
+    )
+
+
+if __name__ == "__main__":
+    main()
diff --git a/qai_hub_models/models/llama_v3_2_3b_chat_quantized/info.yaml b/qai_hub_models/models/llama_v3_2_3b_chat_quantized/info.yaml
new file mode 100644
index 00000000..37416791
--- /dev/null
+++ b/qai_hub_models/models/llama_v3_2_3b_chat_quantized/info.yaml
@@ -0,0 +1,61 @@
+name: Llama-v3.2-3B-Chat
+id: llama_v3_2_3b_chat_quantized
+status: public
+headline: State-of-the-art large language model useful on a variety of language
+  understanding and generation tasks.
+domain: Generative AI
+description: Llama 3 is a family of LLMs. The "Chat" at the end indicates that
+  the model is optimized for chatbot-like dialogue. The model is quantized to
+  w4a16 (4-bit weights and 16-bit activations) and part of the model is quantized to
+  w8a16 (8-bit weights and 16-bit activations), making it suitable for on-device
+  deployment. For the prompt and output lengths specified below, the time to first token is
+  Llama-PromptProcessor-Quantized's latency and the average time per additional token is
+  Llama-TokenGenerator-Quantized's latency.
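The prompt-processor and token-generator roles described above map onto the `from_pretrained` API defined in model.py below. A minimal sketch, assuming the default context length and encodings and access to the gated meta-llama weights on Hugging Face:

```python
from qai_hub_models.models.llama_v3_2_3b_chat_quantized import Model

# Prompt processor: consumes up to 128 input tokens per pass.
prompt_processor = Model.from_pretrained(sequence_length=128)

# Token generator: auto-regressive decoding, so the sequence length must be 1.
token_generator = Model.from_pretrained(sequence_length=1)
```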
+use_case: Text Generation +tags: + - llm + - generative-ai + - quantized +research_paper: https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_2/ +research_paper_title: "LLaMA: Open and Efficient Foundation Language Models" +license: https://github.com/facebookresearch/llama/blob/main/LICENSE +source_repo: https://github.com/meta-llama/llama3/tree/main +technical_details: + Input sequence length for Prompt Processor: 128 + Context length: 4096 + Number of parameters: 3B + Model size: 2.4G + Precision: w4a16 + w8a16 (few layers) + Num of key-value heads: 8 + Model-1 (Prompt Processor): Llama-PromptProcessor-Quantized + Prompt processor input: 128 tokens + position embeddings + attention mask + KV cache inputs + Prompt processor output: 128 output tokens + KV cache outputs + Model-2 (Token Generator): Llama-TokenGenerator-Quantized + Token generator input: 1 input token + position embeddings + attention mask + KV cache inputs + Token generator output: 1 output token + KV cache outputs + Use: Initiate conversation with prompt-processor and then token generator for subsequent iterations. + Minimum QNN SDK version required: 2.27.7 + Supported languages: English. + TTFT: Time To First Token is the time it takes to generate the first response token. This is expressed as a range because it varies based on the length of the prompt. The lower bound is for a short prompt (up to 128 tokens, i.e., one iteration of the prompt processor) and the upper bound is for a prompt using the full context length (4096 tokens). + Response Rate: Rate of response generation after the first response token. +applicable_scenarios: + - Dialogue + - Content Generation + - Customer Support +related_models: + - llama_v3_8b_chat_quantized + - llama_v3_1_8b_chat_quantized +form_factors: + - Phone + - Tablet +has_static_banner: true +has_animated_banner: true +license_type: llama3 +deploy_license: https://github.com/facebookresearch/llama/blob/main/LICENSE +deploy_license_type: llama3 +dataset: [] +restrict_model_sharing: true +model_type_llm: true +llm_details: + call_to_action: 'view_readme' + genie_compatible: true diff --git a/qai_hub_models/models/llama_v3_2_3b_chat_quantized/model.py b/qai_hub_models/models/llama_v3_2_3b_chat_quantized/model.py new file mode 100644 index 00000000..3fd4b15c --- /dev/null +++ b/qai_hub_models/models/llama_v3_2_3b_chat_quantized/model.py @@ -0,0 +1,110 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. +# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +from __future__ import annotations + +import os + +from qai_hub_models.models._shared.llama3.model import ( + DEFAULT_CONTEXT_LENGTH, + Llama3Base_Quantized, +) +from qai_hub_models.utils.asset_loaders import CachedWebModelAsset +from qai_hub_models.utils.input_spec import InputSpec + +MODEL_ID = __name__.split(".")[-2] +MODEL_ASSET_VERSION = 1 +DEFAULT_ENCODINGS = "llama32.encodings" +DEFAULT_ENCODINGS_ZIP = DEFAULT_ENCODINGS + ".zip" + +NUM_LAYERS = 28 +NUM_SPLITS = 3 +NUM_LAYERS_PER_SPLIT = 14 + +# Hugging face repo name and url +HF_REPO_NAME = "meta-llama/Llama-3.2-3B-Instruct" +HF_REPO_URL = f"https://huggingface.co/meta-llama/{HF_REPO_NAME}" + +# Minimum memory (RAM+swap) recommended for export. +# TODO: #10762 should reduce once AIMET export consumes less memory during export. TODO!!! 
Not quite correct, since we are not using AIMET +MIN_MEMORY_RECOMMENDED = 40 # TODO: Does this work for Llama 3? + + +class Llama3_2_Quantized(Llama3Base_Quantized): + def __init__(self, huggingface_model_name: str = HF_REPO_NAME, *args, **kwargs): + super().__init__( + huggingface_model_name=huggingface_model_name, + min_memory_recommended=MIN_MEMORY_RECOMMENDED, + *args, + **kwargs, + ) + + @classmethod + def from_pretrained( + cls, + sequence_length: int, + context_length: int = DEFAULT_CONTEXT_LENGTH, + aimet_encodings: str | None = "DEFAULT", + huggingface_model_name: str = HF_REPO_NAME, + ) -> "Llama3_2_Quantized": + """ + Load a pre-trained Llama 3.2 (3B) model from Meta via HuggingFace. + + sequence_length: + Instantiate with this token sequence length input. A longer + sequence length means the model is capable of processing more + tokens at once. This can only be set to greater than one to process + prompts, since responses are auto-regressive in nature and require + this to be 1. + context_length: + Total context length of model. Longer context length means the + model is more capable of making longer connections in the input + prompt. However, it also hurts runtime performance (both time-to- + first-token and tokens-per-second), so this is a tradeoff that may + depend on the use case. + aimet_encodings: + Path to AIMET quantization encodings file. + huggingface_model_name: + Name or URL of the HuggingFace model. Change this if you want to + change the weights. + """ + if aimet_encodings: + if aimet_encodings == "DEFAULT": + aimet_encodings = os.path.join( + CachedWebModelAsset.from_asset_store( + MODEL_ID, MODEL_ASSET_VERSION, DEFAULT_ENCODINGS_ZIP + ).fetch(extract=True), + DEFAULT_ENCODINGS, + ) + + return cls( + aimet_encodings=aimet_encodings, + sequence_length=sequence_length, + context_length=context_length, + huggingface_model_name=huggingface_model_name, + ) + + @staticmethod + def get_output_names(num_hidden_layers: int = NUM_LAYERS): + return Llama3Base_Quantized.get_output_names( + num_hidden_layers=num_hidden_layers + ) + + @staticmethod + def get_input_spec( + num_hidden_layers: int = NUM_LAYERS, + input_seq_length: int = 128, + context_length: int = DEFAULT_CONTEXT_LENGTH, + hidden_size: int = 3072, + num_key_value_heads: int = 8, + num_attention_heads: int = 24, + ) -> InputSpec: + return Llama3Base_Quantized.get_input_spec( + num_hidden_layers=NUM_LAYERS, + input_seq_length=input_seq_length, + context_length=context_length, + hidden_size=hidden_size, + num_key_value_heads=num_key_value_heads, + num_attention_heads=num_attention_heads, + ) diff --git a/qai_hub_models/models/llama_v3_2_3b_chat_quantized/perf.yaml b/qai_hub_models/models/llama_v3_2_3b_chat_quantized/perf.yaml new file mode 100644 index 00000000..a8e23cb8 --- /dev/null +++ b/qai_hub_models/models/llama_v3_2_3b_chat_quantized/perf.yaml @@ -0,0 +1,27 @@ +aggregated: + supported_devices: + - Snapdragon 8 Elite QRD + - Snapdragon 8 Gen 3 QRD + supported_oses: + - Android + supported_chipsets: + - Snapdragon® 8 Elite + - Snapdragon® 8 Gen 3 +models: + name: Llama-v3.2-3B-Chat + performance_metrics: + - torchscript_onnx_qnn: + llm_metrics: + time_to_first_token_range: + min: 88195 + max: 2822250 + tokens_per_second: 23.4718 + evaluation_metrics: null + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-16T00:32:42.210701Z' diff --git 
a/qai_hub_models/models/llama_v3_2_3b_chat_quantized/requirements.txt b/qai_hub_models/models/llama_v3_2_3b_chat_quantized/requirements.txt
new file mode 100644
index 00000000..c5deadcc
--- /dev/null
+++ b/qai_hub_models/models/llama_v3_2_3b_chat_quantized/requirements.txt
@@ -0,0 +1,5 @@
+onnx==1.16.2
+transformers==4.45.0
+huggingface_hub==0.23.2
+sentencepiece==0.2.0
+psutil
diff --git a/qai_hub_models/models/llama_v3_2_3b_chat_quantized/test.py b/qai_hub_models/models/llama_v3_2_3b_chat_quantized/test.py
new file mode 100644
index 00000000..44bed787
--- /dev/null
+++ b/qai_hub_models/models/llama_v3_2_3b_chat_quantized/test.py
@@ -0,0 +1,14 @@
+# ---------------------------------------------------------------------
+# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+import pytest
+
+from qai_hub_models.models.llama_v3_2_3b_chat_quantized.demo import llama_3_2_chat_demo
+
+
+@pytest.mark.skip("#105 move slow_cloud and slow tests to nightly.")
+@pytest.mark.slow_cloud
+def test_demo():
+    # Run demo and verify it does not crash
+    llama_3_2_chat_demo(is_test=True)
diff --git a/qai_hub_models/models/llama_v3_8b_chat_quantized/README.md b/qai_hub_models/models/llama_v3_8b_chat_quantized/README.md
index 27678cb2..c2ca1a4b 100644
--- a/qai_hub_models/models/llama_v3_8b_chat_quantized/README.md
+++ b/qai_hub_models/models/llama_v3_8b_chat_quantized/README.md
@@ -3,10 +3,10 @@

 # [Llama-v3-8B-Chat: State-of-the-art large language model useful on a variety of language understanding and generation tasks](https://aihub.qualcomm.com/models/llama_v3_8b_chat_quantized)

-Llama 3 is a family of LLMs. The "Chat" at the end indicates that the model is optimized for chatbot-like dialogue. The model is quantized to w4a16(4-bit weights and 16-bit activations) and part of the model is quantized to w8a16(8-bit weights and 16-bit activations) making it suitable for on-device deployment. For Prompt and output length specified below, the time to first token is Llama-PromptProcessor-Quantized's latency and average time per addition token is Llama-TokenGenerator-KVCache-Quantized's latency.
+Llama 3 is a family of LLMs. The "Chat" at the end indicates that the model is optimized for chatbot-like dialogue. The model is quantized to w4a16 (4-bit weights and 16-bit activations) and part of the model is quantized to w8a16 (8-bit weights and 16-bit activations), making it suitable for on-device deployment. For the prompt and output lengths specified below, the time to first token is Llama-PromptProcessor-Quantized's latency and the average time per additional token is Llama-TokenGenerator-Quantized's latency.

 This is based on the implementation of Llama-v3-8B-Chat found
 [here](https://github.com/meta-llama/llama3/tree/main). This repository contains scripts for optimized on-device
 export suitable to run on Qualcomm® devices. More details on model performance
 accross various devices, can be found [here](https://aihub.qualcomm.com/models/llama_v3_8b_chat_quantized).

@@ -14,88 +14,24 @@ accross various devices, can be found [here](https://aihub.qualcomm.com/models/l

 ## Deploying Llama 3 on-device

-Large Language Model (LLM) such as [Llama 2](https://llama.meta.com/llama3/) has the following complexities to deploy on-device:
-1. Model size is too large to fit in device memory for inference
-2.
Multi-Head Attention (MHA) has large activations leading to fallback from accelerators -3. High model load and inference time +Please follow the [LLM on-device deployment](https://github.com/quic/ai-hub-apps/tree/main/tutorials/llm_on_genie) tutorial. -We can tackle the above constraints with the following steps: -1. Quantize weights to reduce on-disk model size, e.g., int8 or int4 weights -2. Quantize activations to reduce inference time memory pressure -3. Graph transformations to reduce inference time memory pressure, e.g., Multi-Head to Split-Head Attention (MHA -> SHA) -4. Graph transformations to convert or decompose operations into more accelerator friendly operations e.g. Linear to Conv -5. For LLM with 7B or more parameters, above steps are still not good enough on mobile, - hence we go one step further and split model into sub-parts. -Here, we divide the model into 4 parts in order to -1. Make model exportable with low memory usage -2. Avoid inference time out-of-memory errors -In order to export Llama 3, please ensure -1. Host machine has >40GB memory (RAM+swap-space) -2. If you don't have enough memory, export.py will dump instructions to increase swap space accordingly -## Sample output prompts generated on-device -1. --prompt "where is California?" -``` -------- Response Summary -------- -Prompt: where is California? -Response: California is a state located on the West Coast of -``` - -2. --prompt "what is 2+3?" --max-output-tokens 30 -``` --------- Response Summary -------- -Prompt: what is 2+3? -Response: 2 + 3 = 5 -``` - -3. --prompt "what is superposition in Quantum Physics?" --max-output-tokens 30 -``` -Prompt: what is superposition in Quantum Physics? -Response: Superposition is a fundamental concept in quantum mechanics, which is a branch of physics that studies the behavior of matter and energy at a very -``` - - - -## Example & Usage - -Install the package via pip: -```bash -pip install "qai_hub_models[llama_v3_8b_chat_quantized]" -``` - - -Once installed, run the following simple CLI demo: - -```bash -python -m qai_hub_models.models.llama_v3_8b_chat_quantized.demo -``` -More details on the CLI tool can be found with the `--help` option. See -[demo.py](demo.py) for sample usage of the model including pre/post processing -scripts. Please refer to our [general instructions on using -models](../../../#getting-started) for more usage instructions. - -## Export for on-device deployment - -This repository contains export scripts that produce a model optimized for -on-device deployment. This can be run as follows: - -```bash -python -m qai_hub_models.models.llama_v3_8b_chat_quantized.export -``` -Additional options are documented with the `--help` option. Note that the above -script requires access to Deployment instructions for Qualcomm® AI Hub. ## License -- The license for the original implementation of Llama-v3-8B-Chat can be found +* The license for the original implementation of Llama-v3-8B-Chat can be found [here](https://github.com/facebookresearch/llama/blob/main/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://github.com/facebookresearch/llama/blob/main/LICENSE) +* The license for the compiled assets for on-device deployment can be found [here](https://github.com/facebookresearch/llama/blob/main/LICENSE) + ## References * [LLaMA: Open and Efficient Foundation Language Models](https://ai.meta.com/blog/meta-llama-3/) * [Source Model Implementation](https://github.com/meta-llama/llama3/tree/main) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/llama_v3_8b_chat_quantized/demo.py b/qai_hub_models/models/llama_v3_8b_chat_quantized/demo.py index 762a20e0..246c67cd 100644 --- a/qai_hub_models/models/llama_v3_8b_chat_quantized/demo.py +++ b/qai_hub_models/models/llama_v3_8b_chat_quantized/demo.py @@ -6,63 +6,25 @@ from typing import List, Type -from qai_hub_models.models._shared.llama.demo import llama_chat_demo -from qai_hub_models.models.llama_v3_8b_chat_quantized import MODEL_ID, Model -from qai_hub_models.models.llama_v3_8b_chat_quantized.model import ( +from qai_hub_models.models._shared.llama3.demo import llama_chat_demo +from qai_hub_models.models._shared.llama3.model import ( DEFAULT_USER_PROMPT, END_TOKENS, - HF_REPO_NAME, - HF_REPO_URL, - MODEL_SPLIT_MAP, - NUM_KEY_VAL_HEADS, - NUM_SPLITS, - Llama3_PromptProcessor_1_Quantized, - Llama3_PromptProcessor_2_Quantized, - Llama3_PromptProcessor_3_Quantized, - Llama3_PromptProcessor_4_Quantized, - Llama3_PromptProcessor_5_Quantized, - Llama3_TokenGenerator_1_Quantized, - Llama3_TokenGenerator_2_Quantized, - Llama3_TokenGenerator_3_Quantized, - Llama3_TokenGenerator_4_Quantized, - Llama3_TokenGenerator_5_Quantized, get_input_prompt_with_tags, get_tokenizer, prepare_combined_attention_mask, ) +from qai_hub_models.models.llama_v3_8b_chat_quantized import MODEL_ID, Model +from qai_hub_models.models.llama_v3_8b_chat_quantized.model import ( + HF_REPO_NAME, + HF_REPO_URL, +) from qai_hub_models.utils.base_model import BaseModel, TargetRuntime -def _get_model_class(split_part: int, is_token_generator: bool = False): - if split_part < 1 or split_part > 5: - raise RuntimeError( - "Incorrect index provided to request Model split class." - f" Must be within (1-5), provided ({split_part})." 
- ) - - if is_token_generator: - return [ - Llama3_TokenGenerator_1_Quantized, - Llama3_TokenGenerator_2_Quantized, - Llama3_TokenGenerator_3_Quantized, - Llama3_TokenGenerator_4_Quantized, - Llama3_TokenGenerator_5_Quantized, - ][split_part - 1] - return [ - Llama3_PromptProcessor_1_Quantized, - Llama3_PromptProcessor_2_Quantized, - Llama3_PromptProcessor_3_Quantized, - Llama3_PromptProcessor_4_Quantized, - Llama3_PromptProcessor_5_Quantized, - ][split_part - 1] - - def llama_3_chat_demo( model_cls: Type[BaseModel] = Model, model_id: str = MODEL_ID, - num_splits: int = NUM_SPLITS, - num_key_val_heads: int = NUM_KEY_VAL_HEADS, - model_split_map: dict = MODEL_SPLIT_MAP, end_tokens: set = END_TOKENS, hf_repo_name: str = HF_REPO_NAME, hf_repo_url: str = HF_REPO_URL, @@ -73,13 +35,9 @@ def llama_3_chat_demo( llama_chat_demo( model_cls=model_cls, model_id=model_id, - get_model_class=_get_model_class, get_input_prompt_with_tags=get_input_prompt_with_tags, prepare_combined_attention_mask=prepare_combined_attention_mask, - tokenizer=get_tokenizer(), - num_splits=num_splits, - num_key_val_heads=num_key_val_heads, - model_split_map=model_split_map, + tokenizer=get_tokenizer(hf_repo_name), end_tokens=end_tokens, hf_repo_name=hf_repo_name, hf_repo_url=hf_repo_url, @@ -91,4 +49,4 @@ def llama_3_chat_demo( if __name__ == "__main__": - llama_3_chat_demo() + llama_3_chat_demo(model_cls=Model) diff --git a/qai_hub_models/models/llama_v3_8b_chat_quantized/export.py b/qai_hub_models/models/llama_v3_8b_chat_quantized/export.py index fc3e5c20..3ed9600d 100644 --- a/qai_hub_models/models/llama_v3_8b_chat_quantized/export.py +++ b/qai_hub_models/models/llama_v3_8b_chat_quantized/export.py @@ -5,288 +5,52 @@ from __future__ import annotations -import os import warnings -from pathlib import Path -from typing import Any, Dict, List, Mapping, Optional, Tuple, cast -import qai_hub as hub - -from qai_hub_models.models.llama_v3_8b_chat_quantized import Model -from qai_hub_models.utils.args import ( - export_parser, - get_input_spec_kwargs, - get_model_kwargs, -) -from qai_hub_models.utils.base_model import TargetRuntime -from qai_hub_models.utils.compare import torch_inference -from qai_hub_models.utils.printing import ( - print_inference_metrics, - print_on_target_demo_cmd, - print_profile_metrics_from_job, +from qai_hub_models.models._shared.llama3.export import export_model +from qai_hub_models.models.llama_v3_8b_chat_quantized import MODEL_ID, Model +from qai_hub_models.models.llama_v3_8b_chat_quantized.model import ( + NUM_LAYERS_PER_SPLIT, + NUM_SPLITS, ) -from qai_hub_models.utils.qai_hub_helpers import ( - can_access_qualcomm_ai_hub, - export_without_hub_access, -) - -ALL_COMPONENTS = [ - "PromptProcessor_1_Quantized", - "PromptProcessor_2_Quantized", - "PromptProcessor_3_Quantized", - "PromptProcessor_4_Quantized", - "PromptProcessor_5_Quantized", - "TokenGenerator_1_Quantized", - "TokenGenerator_2_Quantized", - "TokenGenerator_3_Quantized", - "TokenGenerator_4_Quantized", - "TokenGenerator_5_Quantized", -] -DEFAULT_COMPONENTS = [ - "PromptProcessor_1_Quantized", - "PromptProcessor_2_Quantized", - "PromptProcessor_3_Quantized", - "PromptProcessor_4_Quantized", - "PromptProcessor_5_Quantized", - "TokenGenerator_1_Quantized", - "TokenGenerator_2_Quantized", - "TokenGenerator_3_Quantized", - "TokenGenerator_4_Quantized", - "TokenGenerator_5_Quantized", -] - -DEFAULT_EXPORT_DEVICE = "Samsung Galaxy S24 (Family)" - - -def export_model( - device: str = DEFAULT_EXPORT_DEVICE, - components: Optional[List[str]] = None, 
- skip_profiling: bool = False, - skip_inferencing: bool = False, - skip_downloading: bool = False, - skip_summary: bool = False, - output_dir: Optional[str] = None, - target_runtime: TargetRuntime = TargetRuntime.QNN, - compile_options: str = "", - profile_options: str = "", - **additional_model_kwargs, -) -> Mapping[ - str, Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] -] | List[str]: - """ - This function accomplishes 6 main tasks: - - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. - - Each of the last four steps can be optionally skipped using the input options. - - Parameters: - device: Device for which to export the model. - Full list of available devices can be found by running `hub.get_devices()`. - Defaults to DEFAULT_DEVICE if not specified. - components: List of sub-components of the model that will be exported. - Each component is compiled and profiled separately. - Defaults to ALL_COMPONENTS if not specified. - skip_profiling: If set, skips profiling of compiled model on real devices. - skip_inferencing: If set, skips computing on-device outputs from sample data. - skip_downloading: If set, skips downloading of compiled model. - skip_summary: If set, skips waiting for and summarizing results - from profiling and inference. - output_dir: Directory to store generated assets (e.g. compiled model). - Defaults to `/build/`. - target_runtime: Which on-device runtime to target. Default is TFLite. - compile_options: Additional options to pass when submitting the compile job. - profile_options: Additional options to pass when submitting the profile job. - **additional_model_kwargs: Additional optional kwargs used to customize - `model_cls.from_pretrained` - - Returns: - A Mapping from component_name to a 3-tuple of: - * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). - * An InferenceJob containing metadata about the inference job (None if inferencing skipped). - """ - model_name = "llama_v3_8b_chat_quantized" - output_path = Path(output_dir or Path.cwd() / "build" / model_name) - component_arg = components - components = components or DEFAULT_COMPONENTS - for component_name in components: - if component_name not in ALL_COMPONENTS: - raise ValueError(f"Invalid component {component_name}.") - if not can_access_qualcomm_ai_hub(): - return export_without_hub_access( - "llama_v3_8b_chat_quantized", - "Llama-v3-7B-Chat", - device, - skip_profiling, - skip_inferencing, - skip_downloading, - skip_summary, - output_path, - target_runtime, - compile_options, - profile_options, - component_arg, - ) - - # 1. 
Initialize PyTorch model - model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) - - compile_jobs: Dict[str, hub.client.CompileJob] = {} - profile_options_per_component: Dict[str, str] = {} - - hub_device = hub.Device(device) - for component_name in components: - # Load model part - component = model.load_model_part(component_name) - - input_spec = component.get_input_spec( - **get_input_spec_kwargs(component, additional_model_kwargs) - ) +from qai_hub_models.utils.args import export_parser - # Trace the model - source_model = component.convert_to_hub_source_model( - target_runtime, - output_path, - input_spec, - external_onnx_weights=True, - output_names=component.get_output_names(), - ) +DEFAULT_EXPORT_DEVICE = "Snapdragon 8 Elite QRD" - if target_runtime == TargetRuntime.TFLITE: - quant_calibration_data = None - else: - quant_calibration_data = component.get_calibration_data( - target_runtime, input_spec=input_spec - ) +ALL_COMPONENTS = [f"part_{i + 1}_of_{NUM_SPLITS}" for i in range(NUM_SPLITS)] - # 2. Compile the models to an on-device asset - model_compile_options = component.get_hub_compile_options( - target_runtime, compile_options - ) - print(f"Optimizing model {component_name} to run on-device") - submitted_compile_job = hub.submit_compile_job( - model=source_model, - input_specs=input_spec, - device=hub_device, - name=f"{model_name}_{component_name}", - calibration_data=quant_calibration_data, - options=model_compile_options, - ) - - compile_jobs[component_name] = cast( - hub.client.CompileJob, submitted_compile_job - ) - profile_options_per_component[ - component_name - ] = component.get_hub_profile_options(target_runtime, profile_options) - - # Free model part to reduce memory-pressure - del component - - # 3. Profile the model assets on real devices - profile_jobs: Dict[str, hub.client.ProfileJob] = {} - if not skip_profiling: - for component_name in components: - profile_options_all = profile_options_per_component[component_name] - print(f"Profiling model {component_name} on a hosted device.") - submitted_profile_job = hub.submit_profile_job( - model=compile_jobs[component_name].get_target_model(), - device=hub_device, - name=f"{model_name}_{component_name}", - options=profile_options_all, - ) - profile_jobs[component_name] = cast( - hub.client.ProfileJob, submitted_profile_job - ) - - # 4. Run inference on-device with sample inputs - inference_jobs: Dict[str, hub.client.InferenceJob] = {} - - if not skip_inferencing: - for component_name in components: - print( - f"Running inference for {component_name} on a hosted device with example inputs." - ) - # Load model with no-AIMET mode - component = model.load_model_part(component_name) - profile_options_all = profile_options_per_component[component_name] - # Load individual model part - sample_inputs = component.sample_inputs() - submitted_inference_job = hub.submit_inference_job( - model=compile_jobs[component_name].get_target_model(), - inputs=sample_inputs, - device=hub_device, - name=f"{model_name}_{component_name}", - options=profile_options_all, - ) - inference_jobs[component_name] = cast( - hub.client.InferenceJob, submitted_inference_job - ) - - # 5. Download the model assets to a local file - if not skip_downloading: - os.makedirs(output_path, exist_ok=True) - for component_name, compile_job in compile_jobs.items(): - target_model: hub.Model = compile_job.get_target_model() # type: ignore - target_model.download( - str(output_path / f"{model_name}_{component_name}.bin") - ) - - # 6. 
Summarize the results from profiling and inference
-    if not skip_summary and not skip_profiling:
-        for component_name in components:
-            profile_job = profile_jobs[component_name]
-            assert profile_job is not None and profile_job.wait().success
-            profile_data: Dict[str, Any] = profile_job.download_profile()  # type: ignore
-            print_profile_metrics_from_job(profile_job, profile_data)
-
-    if not skip_summary and not skip_inferencing:
-        for component_name in components:
-            inference_job = inference_jobs[component_name]
-            # Load individual model part
-            component = model.load_model_part(component_name)
-            # Get ordered model output names
-            output_names = component.get_output_names()
-            sample_inputs = component.sample_inputs()
-            torch_out = torch_inference(component, sample_inputs)
-            assert inference_job is not None and inference_job.wait().success
-            inference_result: hub.client.DatasetEntries = inference_job.download_output_data()  # type: ignore
-            print_inference_metrics(
-                inference_job, inference_result, torch_out, output_names=output_names
-            )
-
-    if not skip_summary:
-        print_on_target_demo_cmd(
-            compile_jobs.values(), Path(__file__).parent.resolve(), hub_device
-        )
-
-    return {
-        component_name: (
-            compile_jobs[component_name],
-            profile_jobs.get(component_name, None),
-            inference_jobs.get(component_name, None),
-        )
-        for component_name in components
-    }
+# Each component is two sub-components linked together with shared weights
+ALL_SUB_COMPONENTS = {
+    f"part_{i + 1}_of_{NUM_SPLITS}": [
+        f"prompt_{i + 1}_of_{NUM_SPLITS}",
+        f"token_{i + 1}_of_{NUM_SPLITS}",
+    ]
+    for i in range(NUM_SPLITS)
+}


 def main():
     warnings.filterwarnings("ignore")
     parser = export_parser(
         model_cls=Model,
-        components=ALL_COMPONENTS,
         supports_tflite=False,
         supports_precompiled_qnn_onnx=False,
         default_export_device=DEFAULT_EXPORT_DEVICE,
     )
+    parser.add_argument(
+        "--synchronous",
+        action="store_true",
+        help="Wait for each command to finish before submitting the next.",
+    )
     args = parser.parse_args()
-    export_model(**vars(args))
+    export_model(
+        model_cls=Model,
+        model_name=MODEL_ID,
+        components=ALL_COMPONENTS,
+        sub_components=ALL_SUB_COMPONENTS,
+        num_layers_per_split=NUM_LAYERS_PER_SPLIT,
+        **vars(args),
+    )


 if __name__ == "__main__":
diff --git a/qai_hub_models/models/llama_v3_8b_chat_quantized/info.yaml b/qai_hub_models/models/llama_v3_8b_chat_quantized/info.yaml
index 3d38e57b..6ef32977 100644
--- a/qai_hub_models/models/llama_v3_8b_chat_quantized/info.yaml
+++ b/qai_hub_models/models/llama_v3_8b_chat_quantized/info.yaml
@@ -6,11 +6,11 @@ headline: State-of-the-art large language model useful on a variety of language
 domain: Generative AI
 description: Llama 3 is a family of LLMs. The "Chat" at the end indicates that
   the model is optimized for chatbot-like dialogue. The model is quantized to
-  w4a16(4-bit weights and 16-bit activations) and part of the model is quantized to
-  w8a16(8-bit weights and 16-bit activations) making it suitable for on-device
+  w4a16 (4-bit weights and 16-bit activations) and part of the model is quantized to
+  w8a16 (8-bit weights and 16-bit activations), making it suitable for on-device
   deployment. For Prompt and output length specified below, the time to first token is
   Llama-PromptProcessor-Quantized's latency and average time per addition token is
-  Llama-TokenGenerator-KVCache-Quantized's latency.
+  Llama-TokenGenerator-Quantized's latency.
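With this refactor, each per-model export.py reduces to a thin wrapper over the shared llama3 `export_model` helper. A hypothetical programmatic equivalent of the Llama-v3-8B export CLI follows; every name is taken from the diff above, but treat the exact keyword set (in particular `device`) as an assumption about the shared helper's signature rather than documented API:

```python
from qai_hub_models.models._shared.llama3.export import export_model
from qai_hub_models.models.llama_v3_8b_chat_quantized import MODEL_ID, Model
from qai_hub_models.models.llama_v3_8b_chat_quantized.export import (
    ALL_COMPONENTS,
    ALL_SUB_COMPONENTS,
)
from qai_hub_models.models.llama_v3_8b_chat_quantized.model import NUM_LAYERS_PER_SPLIT

# Roughly what `python -m qai_hub_models.models.llama_v3_8b_chat_quantized.export` runs.
jobs = export_model(
    model_cls=Model,
    model_name=MODEL_ID,
    components=ALL_COMPONENTS,
    sub_components=ALL_SUB_COMPONENTS,
    num_layers_per_split=NUM_LAYERS_PER_SPLIT,
    device="Snapdragon 8 Elite QRD",  # assumed kwarg; the new default export device
)
```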
diff --git a/qai_hub_models/models/llama_v3_8b_chat_quantized/info.yaml b/qai_hub_models/models/llama_v3_8b_chat_quantized/info.yaml
index 3d38e57b..6ef32977 100644
--- a/qai_hub_models/models/llama_v3_8b_chat_quantized/info.yaml
+++ b/qai_hub_models/models/llama_v3_8b_chat_quantized/info.yaml
@@ -6,11 +6,11 @@ headline: State-of-the-art large language model useful on a variety of language
 domain: Generative AI
 description: Llama 3 is a family of LLMs. The "Chat" at the end indicates that
   the model is optimized for chatbot-like dialogue. The model is quantized to
-  w4a16(4-bit weights and 16-bit activations) and part of the model is quantized to
-  w8a16(8-bit weights and 16-bit activations) making it suitable for on-device
+  w4a16 (4-bit weights and 16-bit activations) and part of the model is quantized to
+  w8a16 (8-bit weights and 16-bit activations) making it suitable for on-device
   deployment. For Prompt and output length specified below, the time to first token is
   Llama-PromptProcessor-Quantized's latency and average time per additional token is
-  Llama-TokenGenerator-KVCache-Quantized's latency.
+  Llama-TokenGenerator-Quantized's latency.
 use_case: Text Generation
 tags:
   - llm
@@ -21,25 +21,30 @@ research_paper_title: "LLaMA: Open and Efficient Foundation Language Models"
 license: https://github.com/facebookresearch/llama/blob/main/LICENSE
 source_repo: https://github.com/meta-llama/llama3/tree/main
 technical_details:
+  Input sequence length for Prompt Processor: 128
+  Context length: 4096
   Number of parameters: 8B
+  Model size: 4.8GB
   Precision: w4a16 + w8a16 (few layers)
   Num of key-value heads: 8
   Model-1 (Prompt Processor): Llama-PromptProcessor-Quantized
-  Max context length: 1024
-  Prompt processor model size: 4.8GB
-  Prompt processor input: 1024 tokens
-  Prompt processor output: 1024 output tokens + KVCache for token generator
-  Model-2 (Token Generator): Llama-TokenGenerator-KVCache-Quantized
-  Token generator model size: 4.8GB
-  Token generator input: 1 input token + past KVCache
-  Token generator output: 1 output token + KVCache for next iteration
-  Decoding length: 1024 (1 output token + 1023 from KVCache)
+  Prompt processor input: 128 tokens + position embeddings + attention mask + KV cache inputs
+  Prompt processor output: 128 output tokens + KV cache outputs
+  Model-2 (Token Generator): Llama-TokenGenerator-Quantized
+  Token generator input: 1 input token + position embeddings + attention mask + KV cache inputs
+  Token generator output: 1 output token + KV cache outputs
   Use: Initiate conversation with prompt-processor and then token generator for
     subsequent iterations.
+  Minimum QNN SDK version required: 2.27.7
+  Supported languages: English.
+  TTFT: Time To First Token is the time it takes to generate the first response token.
+    This is expressed as a range because it varies based on the length of the prompt.
+    The lower bound is for a short prompt (up to 128 tokens, i.e., one iteration of
+    the prompt processor) and the upper bound is for a prompt using the full context
+    length (4096 tokens).
+  Response Rate: Rate of response generation after the first response token.
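To make the TTFT / Response Rate entries above concrete, here is a minimal sketch of the two-stage call pattern: the prompt processor consumes the prompt in 128-token chunks to build the KV cache (the TTFT cost), then the token generator produces one token per pass (the Response Rate cost). The `prompt_step` / `token_step` callables are illustrative stand-ins, not APIs from this repo:

    from typing import Callable

    PROMPT_CHUNK = 128      # prompt processor consumes 128 tokens per pass
    CONTEXT_LENGTH = 4096   # KV-cache capacity shared by both stages

    def generate(prompt_tokens: list,
                 prompt_step: Callable,  # one 128-token prompt-processor pass
                 token_step: Callable,   # one single-token token-generator pass
                 max_new_tokens: int = 32) -> list:
        assert len(prompt_tokens) + max_new_tokens <= CONTEXT_LENGTH
        # TTFT window: consume the prompt in 128-token chunks,
        # accumulating the KV cache as we go.
        kv_cache: dict = {}
        logits = None
        for i in range(0, len(prompt_tokens), PROMPT_CHUNK):
            logits, kv_cache = prompt_step(prompt_tokens[i : i + PROMPT_CHUNK], kv_cache)
        # Response-rate window: generate auto-regressively, one token per pass.
        token = int(logits.argmax())  # first response token ends the TTFT window
        out = [token]
        for _ in range(max_new_tokens - 1):
            logits, kv_cache = token_step(token, kv_cache)
            token = int(logits.argmax())
            out.append(token)
        return out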
 applicable_scenarios:
   - Dialogue
   - Content Generation
   - Customer Support
-related_models: []
+related_models:
+  - llama_v3_1_8b_chat_quantized
+  - llama_v3_2_3b_chat_quantized
 form_factors:
   - Phone
   - Tablet
@@ -50,3 +55,7 @@ deploy_license: https://github.com/facebookresearch/llama/blob/main/LICENSE
 deploy_license_type: llama3
 dataset: []
 restrict_model_sharing: true
+model_type_llm: true
+llm_details:
+  call_to_action: 'view_readme'
+  genie_compatible: true
diff --git a/qai_hub_models/models/llama_v3_8b_chat_quantized/model.py b/qai_hub_models/models/llama_v3_8b_chat_quantized/model.py
index bb025332..ffe75e12 100644
--- a/qai_hub_models/models/llama_v3_8b_chat_quantized/model.py
+++ b/qai_hub_models/models/llama_v3_8b_chat_quantized/model.py
@@ -5,1726 +5,106 @@
 from __future__ import annotations

 import os
-from typing import Optional, Tuple

-import torch
-from qai_hub.client import DatasetEntries
-
-from qai_hub_models.models._shared.llama.model import (
-    DEFAULT_INPUT_SEQ_LEN,
-    Llama_QuantizedMixin,
-    RopeEmbedding,
-    get_hidden_layer_range_from_split,
-    get_past_key_names,
-    get_past_keyval_with_shift,
-    load_input_cached_data,
-    make_torch_compatible_past_key_values,
-    save_input_cached_data,
-)
-from qai_hub_models.models.llama_v3_8b_chat_quantized.modeling_llama import (  # RopeEmbedding,
-    LlamaForCausalLM,
-    LlamaModel,
+from qai_hub_models.models._shared.llama3.model import (
+    DEFAULT_CONTEXT_LENGTH,
+    Llama3Base_Quantized,
 )
 from qai_hub_models.utils.asset_loaders import CachedWebModelAsset
-from qai_hub_models.utils.base_model import CollectionModel, TargetRuntime
-from qai_hub_models.utils.huggingface import (
-    ensure_has_required_transformer,
-    has_model_access,
-)
 from qai_hub_models.utils.input_spec import InputSpec
-from qai_hub_models.utils.model_adapters import flatten, suppress_warnings
-from qai_hub_models.utils.system_info import has_recommended_memory
-
-MIN_TRANFORMER_VERSION = "4.40.0"
-
-
-# isort: off
-
-# TODO: 10761 remove transformer version check once AIMET
-# transformer restriction is uplifted.
-ensure_has_required_transformer(MIN_TRANFORMER_VERSION)
-from transformers import AutoConfig, AutoTokenizer  # noqa: E402
-
 MODEL_ID = __name__.split(".")[-2]
-MODEL_ASSET_VERSION = 2
-
-# Configs
-AIMET_ENCODINGS_PREFIX = "config"
-AIMET_CONFIG = "default_config_llama"
+MODEL_ASSET_VERSION = 4
+DEFAULT_ENCODINGS = "llama3.encodings"
+DEFAULT_ENCODINGS_ZIP = DEFAULT_ENCODINGS + ".zip"

-# Model parameters
-MAX_HIDDEN_LAYERS = 32
-MAX_POS_EMBEDDINGS = 1024
-ATTENTION_HIDDEN_DIM = 4096
-POS_EMBED_DIM = 64
-DATA_DIR = "data"
-USE_CACHED_DATA = True
+NUM_LAYERS = 32
 NUM_SPLITS = 5
-NUM_KEY_VAL_HEADS = 8
-
-# Model split map to track DecodeLayer split for each part
-# key (model split number) ->
-#     value Tuple of (start index of decoder Layer, end index of decode layer)
-MODEL_SPLIT_MAP = {
-    1: (0, 4),
-    2: (4, 12),
-    3: (12, 20),
-    4: (20, 28),
-    5: (28, 32),
-}
+NUM_LAYERS_PER_SPLIT = 9

 # Hugging face repo name and url
 HF_REPO_NAME = "meta-llama/Meta-Llama-3-8B-Instruct"
-HF_REPO_URL = "https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct"
+# HF_REPO_NAME already carries the "meta-llama/" org prefix, so it is not
+# repeated here.
+HF_REPO_URL = f"https://huggingface.co/{HF_REPO_NAME}"
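# Illustrative note (an assumption mirroring the from_pretrained docstring
# further below): the prompt processor and token generator are now two
# instantiations of the same class, distinguished only by sequence length, e.g.
#
#     prompt_proc = Llama3_Quantized.from_pretrained(sequence_length=128)
#     token_gen = Llama3_Quantized.from_pretrained(sequence_length=1)
#
# A sequence length greater than 1 is only meaningful for prompt processing;
# token generation is auto-regressive and therefore runs with sequence_length == 1.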
 # Minimum memory (RAM+swap) recommended for export.
-# TODO: #10762 should reduce once AIMET export consumes less memory during export.
-MIN_MEMORY_RECOMMENDED = 40
-
-## Ref: https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/
-BEGIN_TEXT = "<|begin_of_text|>"
-END_TEXT = "<|begin_of_text|>"
-START_HEADER = "<|start_header_id|>"
-END_HEADER = "<|end_header_id|>"
-SYSTEM_ID = "system"
-ASSISTANT_ID = "assistant"
-USER_ID = "user"
-EOT_ID = "<|eot_id|>"
-END_TOKENS = {"<|eot_id|>", "<|end_of_text|>"}
-
-DEFAULT_PROMPT_CONTEXT = "You are a helpful AI assistant"
-DEFAULT_USER_PROMPT = "Hi! What is 2+3?"
-
-
-def get_input_prompt_with_tags(
-    previous_history: str = "",
-    system_context_prompt: str = DEFAULT_PROMPT_CONTEXT,
-    user_input_prompt: str = DEFAULT_USER_PROMPT,
-):
-    """
-    Get prompt to set context and initialize prompt-processor
-    """
-    prompt = previous_history
-    prompt += "" if len(previous_history) == 0 else ""
-
-    prompt = f"""{BEGIN_TEXT}{START_HEADER}{SYSTEM_ID}{END_HEADER}
-
-{system_context_prompt}
-{START_HEADER}{USER_ID}{END_HEADER}
-
-{user_input_prompt}{EOT_ID}{START_HEADER}{ASSISTANT_ID}{END_HEADER}
-
-
-"""
-    return prompt
-
-
-def get_tokenizer():
-    """
-    Tokenizer to use for Llama3
-    """
-    tokenizer = AutoTokenizer.from_pretrained(HF_REPO_NAME, is_fast=False)
-    tokenizer.padding_side = "left"
-    tokenizer.pad_token = tokenizer.eos_token
-    tokenizer.pad_token_id = tokenizer.eos_token_id
-    tokenizer.truncation_side = "left"
-    return tokenizer
-
-
-def prepare_combined_attention_mask(
-    attention_mask: torch.Tensor,
-    input_shape: Optional[Tuple] = None,
-    past_key_values_length: int = 0,
-    dtype: torch.dtype = torch.float32,
-):
-    """
-    Creates combined attention_mask from given input attention_mask
-    Input attention_mask: 2d (1, input_seq_len)
-    Output attention_mask: 4d (1, 1, input_seq_length, input_seq_length)
-    """
-    if input_shape is None:
-        input_shape = attention_mask.shape
-    dummy_enbedding = torch.tensor((1.0,)).to(dtype)
-    new_mask = LlamaModel._prepare_decoder_attention_mask(
-        attention_mask, input_shape, dummy_enbedding, past_key_values_length
-    )
-    return new_mask
-
+# TODO: #10762 revisit this figure: it was measured for the old AIMET-based
+# export path, which this port no longer uses, and it has not been
+# re-validated for Llama 3.
+MIN_MEMORY_RECOMMENDED = 40

-class Llama3Wrapper(torch.nn.Module):
-    def __init__(
-        self,
-        max_position_embeddings: int = MAX_POS_EMBEDDINGS,
-        split_part: int = 1,
-        is_token_generator: bool = False,
-    ):
-        super().__init__()
-        model_type = "TokenGenerator" if is_token_generator else "PromptProcessor"
-        self.is_token_generator = is_token_generator
-        print(f"Loading Llama3 {model_type} {split_part}/{NUM_SPLITS}")
-
-        config = AutoConfig.from_pretrained(HF_REPO_NAME, torchscript=True)
-        hidden_layers = 32
-        config.num_hidden_layers = hidden_layers
-        config.max_position_embeddings = max_position_embeddings
-        config.num_attention_heads = 32
-        config.block_size = 4096
-        config.num_key_value_heads = NUM_KEY_VAL_HEADS
-        config.num_logits_to_return = 1
-        config.shift_cache = False
-        config.transposed_key_cache = True
-        config.return_new_key_value_only = True
-        config.return_top_k = 0
-        config.logit_temperature = 1.0
-        config.use_combined_mask_input = True
-        config.use_sha = True
-        config.use_conv = True
-        config.mask_neg = -100
-        config.split_model = split_part
-        if split_part < 1 or split_part > 5:
-            raise RuntimeError(
-                f"Llama3 split_part must be within 1-5 (Provided {split_part})."
- ) - - hidden_layers_start, hidden_layers_end = get_hidden_layer_range_from_split( - split_part, MODEL_SPLIT_MAP - ) - config.hidden_layers_start = hidden_layers_start - config.hidden_layers_end = hidden_layers_end - self.total_hidden_layers = hidden_layers_end - hidden_layers_start - - print("Loading model") - self.model = LlamaForCausalLM.from_pretrained(HF_REPO_NAME, config=config) - self.model.eval() - - if ( - hidden_layers_start < 0 - or hidden_layers_start > MAX_HIDDEN_LAYERS - or hidden_layers_end < 0 - or hidden_layers_end > MAX_HIDDEN_LAYERS - or hidden_layers_start >= hidden_layers_end - ): - raise RuntimeError( - f"Incorrect hidden_layers range provided. Must be within 0-32 (provided {hidden_layers_start}-{hidden_layers_end})." - ) - - # Reduce # of hidden layers as per split - self.model.model.layers = self.model.model.layers[ - hidden_layers_start:hidden_layers_end - ] - - # Apply model conversion - # Convert MHA to SHA - use_sha = config.use_sha - use_conv = config.use_conv - # Convert Linear to 1x1 Conv2D - if use_conv: - for _, module in self.model.named_modules(): - if type(module).__name__ in { - "LlamaMLP", - "LlamaForCausalLM", - "LlamaAttention", - }: - module.prepare_conv() - - if use_sha: - for _, module in self.model.named_modules(): - if type(module).__name__ == "LlamaAttention": - module.prepare_sha() - - def forward( - self, - input_ids, - attention_mask, - position_ids_cos, - position_ids_sin, - *past_key_values, - ): - if self.is_token_generator: - out = self.forward_token_generator( - input_ids, - attention_mask, - position_ids_cos, - position_ids_sin, - *past_key_values, - ) - else: - out = self.forward_prompt_processor( - input_ids, attention_mask, position_ids_cos, position_ids_sin - ) - # Flatten past_key_values - return tuple( - out[:1], - ) + tuple(flatten(out[1])) - - def forward_prompt_processor( - self, input_ids, attention_mask, position_ids_cos, position_ids_sin - ): - return self.model( - input_ids, attention_mask, position_ids=(position_ids_cos, position_ids_sin) - ) - - def forward_token_generator( - self, - input_ids, - attention_mask, - position_ids_cos, - position_ids_sin, - *past_key_values, - ): - past_key_values_tuple = make_torch_compatible_past_key_values( - self.total_hidden_layers, 8, False, *past_key_values +class Llama3_Quantized(Llama3Base_Quantized): + def __init__(self, huggingface_model_name: str = HF_REPO_NAME, *args, **kwargs): + super().__init__( + huggingface_model_name=huggingface_model_name, + min_memory_recommended=MIN_MEMORY_RECOMMENDED, + *args, + **kwargs, ) - return self.model( - input_ids, - attention_mask, - position_ids=(position_ids_cos, position_ids_sin), - past_key_values=past_key_values_tuple, - ) - - -def _get_llama_model_with_split( - max_position_embeddings: int = MAX_POS_EMBEDDINGS, - split_part: int = 1, - is_token_generator: bool = False, -) -> Tuple[torch.nn.Module, str]: - - # Ensure User has access to model, - # otherwise point to instructions to get access and error out. - has_model_access(HF_REPO_NAME, HF_REPO_URL) - - # Ensure User has recommended memory, - # otherwise, provide warning to user and recommend to increase swap-space as a work-around. 
- has_recommended_memory(MIN_MEMORY_RECOMMENDED) - - with suppress_warnings(): - model = Llama3Wrapper( - max_position_embeddings=max_position_embeddings, - split_part=split_part, - is_token_generator=is_token_generator, - ) - model.eval() - - # Download quantization config and pre-computed encodings - model_encoding_tag = "tg" if is_token_generator else "pp" - aimet_encodings = str( - os.path.join( - AIMET_ENCODINGS_PREFIX, - model_encoding_tag, - f"llama3_{model_encoding_tag}_sha_{split_part}.encodings", - ) - ) - aimet_encodings = str( - CachedWebModelAsset.from_asset_store( - MODEL_ID, MODEL_ASSET_VERSION, aimet_encodings - ).fetch() - ) - return model, aimet_encodings - - -class Llama3_Quantized(CollectionModel): - def __init__(self, max_position_embeddings: int) -> None: - super().__init__() - self.max_position_embeddings = max_position_embeddings @classmethod def from_pretrained( - cls, max_position_embeddings: int = MAX_POS_EMBEDDINGS + cls, + sequence_length: int, + context_length: int = DEFAULT_CONTEXT_LENGTH, + aimet_encodings: str | None = "DEFAULT", + huggingface_model_name: str = HF_REPO_NAME, ) -> "Llama3_Quantized": - return Llama3_Quantized(max_position_embeddings=max_position_embeddings) - - def load_model_part(self, split_part): - if split_part == "PromptProcessor_1_Quantized": - return Llama3_PromptProcessor_1_Quantized.from_pretrained( - max_position_embeddings=self.max_position_embeddings - ) - if split_part == "PromptProcessor_2_Quantized": - return Llama3_PromptProcessor_2_Quantized.from_pretrained( - max_position_embeddings=self.max_position_embeddings - ) - if split_part == "PromptProcessor_3_Quantized": - return Llama3_PromptProcessor_3_Quantized.from_pretrained( - max_position_embeddings=self.max_position_embeddings - ) - if split_part == "PromptProcessor_4_Quantized": - return Llama3_PromptProcessor_4_Quantized.from_pretrained( - max_position_embeddings=self.max_position_embeddings - ) - if split_part == "PromptProcessor_5_Quantized": - return Llama3_PromptProcessor_5_Quantized.from_pretrained( - max_position_embeddings=self.max_position_embeddings - ) - if split_part == "TokenGenerator_1_Quantized": - return Llama3_TokenGenerator_1_Quantized.from_pretrained( - max_position_embeddings=self.max_position_embeddings, - ) - if split_part == "TokenGenerator_2_Quantized": - return Llama3_TokenGenerator_2_Quantized.from_pretrained( - max_position_embeddings=self.max_position_embeddings - ) - if split_part == "TokenGenerator_3_Quantized": - return Llama3_TokenGenerator_3_Quantized.from_pretrained( - max_position_embeddings=self.max_position_embeddings - ) - if split_part == "TokenGenerator_4_Quantized": - return Llama3_TokenGenerator_4_Quantized.from_pretrained( - max_position_embeddings=self.max_position_embeddings - ) - if split_part == "TokenGenerator_5_Quantized": - return Llama3_TokenGenerator_5_Quantized.from_pretrained( - max_position_embeddings=self.max_position_embeddings - ) - raise RuntimeError(f"Unsupported split_part {split_part}.") - - -class Llama3_PromptProcessor_1_Quantized(Llama_QuantizedMixin): - def __init__(self, model, encoding_path): - super().__init__(model, encoding_path) - self.model = model - self.split_part = 1 - - def forward( - self, - input_ids: torch.Tensor, - attention_mask: torch.Tensor, - position_ids_cos: torch.Tensor, - position_ids_sin: torch.Tensor, - ): - return self.model(input_ids, attention_mask, position_ids_cos, position_ids_sin) - - @classmethod - def from_pretrained( - cls, max_position_embeddings: int = MAX_POS_EMBEDDINGS 
- ) -> Llama3_PromptProcessor_1_Quantized: - model, encoding_path = _get_llama_model_with_split( - max_position_embeddings, split_part=1 - ) - return cls(model, encoding_path) - - @staticmethod - def get_input_spec( - input_seq_length: int = DEFAULT_INPUT_SEQ_LEN, - ) -> InputSpec: - # Get the input specification ordered (name -> (shape, type)) pairs for this model. - # - # This can be used with the qai_hub python API to declare - # the model input specification upon submitting a compile job. - return { - "input_ids": ((1, input_seq_length), "int32"), - "attention_mask": ((1, 1, input_seq_length, input_seq_length), "float32"), - "position_ids_cos": ((1, 1, input_seq_length, POS_EMBED_DIM), "float32"), - "position_ids_sin": ((1, 1, input_seq_length, POS_EMBED_DIM), "float32"), - } - - @staticmethod - def get_output_names(): - layers_start, layers_end = get_hidden_layer_range_from_split( - split_part=1, model_split_map=MODEL_SPLIT_MAP - ) - return Llama_QuantizedMixin.get_output_names( - start=layers_start, - end=layers_end, - past_key_val_heads=NUM_KEY_VAL_HEADS, - bundled_kvcache=False, - ) - - @staticmethod - def get_model_data(input_seq_len: int = DEFAULT_INPUT_SEQ_LEN): - data = load_input_cached_data( - split_part=1, - data_dir=DATA_DIR, - model_name="llama_v3", - model_id=MODEL_ID, - model_asset_version=MODEL_ASSET_VERSION, - model_type="pp", - input_seq_len=input_seq_len, - ) - if data is not None: - return data - - _, input_seq_len = Llama3_PromptProcessor_1_Quantized.get_input_spec()[ - "input_ids" - ][0] - - tokenizer = get_tokenizer() - prompt = get_input_prompt_with_tags(DEFAULT_USER_PROMPT) - input_tokens = tokenizer( - prompt, return_tensors="pt", padding="max_length", max_length=input_seq_len - ) - - inputs = {} - inputs["input_ids"] = input_tokens["input_ids"].type(torch.int32) - inputs["attention_mask"] = prepare_combined_attention_mask( - input_tokens["attention_mask"], input_tokens["attention_mask"].shape - ).type(torch.float32) - tokens = torch.sum(input_tokens["attention_mask"]).item() - position_ids = [0] * (input_seq_len - tokens) + list(range(0, tokens)) - position_ids = ( - torch.Tensor(position_ids).type(torch.long).reshape(1, input_seq_len) - ) - position_ids_cos, position_ids_sin = RopeEmbedding( - max_length=input_seq_len - ).get_embedding(position_ids) - inputs["position_ids_cos"] = position_ids_cos - inputs["position_ids_sin"] = position_ids_sin - save_input_cached_data( - inputs, - split_part=1, - data_dir=DATA_DIR, - model_name="llama_v3", - model_id=MODEL_ID, - model_asset_version=MODEL_ASSET_VERSION, - input_seq_len=input_seq_len, - ) - return inputs - - def get_calibration_data( - self, - target_runtime: TargetRuntime | None = None, - input_spec: InputSpec | None = None, - ) -> DatasetEntries | None: - """ - Calibration dataset for this model and input spec. 
""" - if input_spec is None: - input_spec = Llama3_PromptProcessor_1_Quantized.get_input_spec() - - _, input_seq_len = input_spec["input_ids"][0] - return Llama3_PromptProcessor_1_Quantized.get_model_data( - input_seq_len=input_seq_len - ) - - -class Llama3_PromptProcessor_2_Quantized(Llama_QuantizedMixin): - def __init__(self, model: torch.nn.Module, encoding_path: str): - super().__init__(model, encoding_path) - self.split_part = 2 - - def forward( - self, - input_ids: torch.Tensor, - attention_mask: torch.Tensor, - position_ids_cos: torch.Tensor, - position_ids_sin: torch.Tensor, - ): - return self.model(input_ids, attention_mask, position_ids_cos, position_ids_sin) - - @classmethod - def from_pretrained( - cls, max_position_embeddings: int = MAX_POS_EMBEDDINGS - ) -> Llama3_PromptProcessor_2_Quantized: - model, encoding_path = _get_llama_model_with_split( - max_position_embeddings, split_part=2 - ) - return cls(model, encoding_path) - - @staticmethod - def get_input_spec( - input_seq_length: int = DEFAULT_INPUT_SEQ_LEN, - ) -> InputSpec: - # Get the input specification ordered (name -> (shape, type)) pairs for this model. - # - # This can be used with the qai_hub python API to declare - # the model input specification upon submitting a compile job. - return { - "input_ids": ((1, input_seq_length, ATTENTION_HIDDEN_DIM), "float32"), - "attention_mask": ((1, 1, input_seq_length, input_seq_length), "float32"), - "position_ids_cos": ((1, 1, input_seq_length, POS_EMBED_DIM), "float32"), - "position_ids_sin": ((1, 1, input_seq_length, POS_EMBED_DIM), "float32"), - } - - @staticmethod - def get_output_names(): - layers_start, layers_end = get_hidden_layer_range_from_split( - split_part=2, model_split_map=MODEL_SPLIT_MAP - ) - return Llama_QuantizedMixin.get_output_names( - start=layers_start, - end=layers_end, - past_key_val_heads=NUM_KEY_VAL_HEADS, - bundled_kvcache=False, - ) - - @staticmethod - def get_model_data(input_seq_len: int = DEFAULT_INPUT_SEQ_LEN): - data = load_input_cached_data( - split_part=2, - data_dir=DATA_DIR, - model_name="llama_v3", - model_id=MODEL_ID, - model_asset_version=MODEL_ASSET_VERSION, - model_type="pp", - input_seq_len=input_seq_len, - ) - if data is not None: - return data - - model = Llama3_PromptProcessor_1_Quantized.from_pretrained() - inputs = Llama3_PromptProcessor_1_Quantized.get_model_data( - input_seq_len=input_seq_len - ) - output = model(*inputs.values()) - del model - - new_inputs = {} - new_inputs["input_ids"] = output[0].detach() - new_inputs["attention_mask"] = inputs["attention_mask"] - new_inputs["position_ids_cos"] = inputs["position_ids_cos"] - new_inputs["position_ids_sin"] = inputs["position_ids_sin"] - save_input_cached_data( - new_inputs, - split_part=2, - data_dir=DATA_DIR, - model_name="llama_v3", - model_id=MODEL_ID, - model_asset_version=MODEL_ASSET_VERSION, - input_seq_len=input_seq_len, - ) - return new_inputs - - def get_calibration_data( - self, - target_runtime: TargetRuntime | None = None, - input_spec: InputSpec | None = None, - ) -> DatasetEntries | None: - """ - Calibration dataset for this model. 
- """ - if input_spec is None: - input_spec = Llama3_PromptProcessor_2_Quantized.get_input_spec() - - input_seq_len = input_spec["attention_mask"][0][-1] - return Llama3_PromptProcessor_2_Quantized.get_model_data( - input_seq_len=input_seq_len - ) - - -class Llama3_PromptProcessor_3_Quantized(Llama_QuantizedMixin): - def __init__(self, model: torch.nn.Module, encoding_path: str): - super().__init__(model, encoding_path) - self.split_part = 3 - - def forward( - self, - input_ids: torch.Tensor, - attention_mask: torch.Tensor, - position_ids_cos: torch.Tensor, - position_ids_sin: torch.Tensor, - ): - return self.model(input_ids, attention_mask, position_ids_cos, position_ids_sin) - - @classmethod - def from_pretrained( - cls, max_position_embeddings: int = MAX_POS_EMBEDDINGS - ) -> Llama3_PromptProcessor_3_Quantized: - model, encoding_path = _get_llama_model_with_split( - max_position_embeddings, split_part=3 - ) - return cls(model, encoding_path) - - @staticmethod - def get_input_spec( - input_seq_length: int = DEFAULT_INPUT_SEQ_LEN, - ) -> InputSpec: - # Get the input specification ordered (name -> (shape, type)) pairs for this model. - # - # This can be used with the qai_hub python API to declare - # the model input specification upon submitting a compile job. - return { - "input_ids": ((1, input_seq_length, ATTENTION_HIDDEN_DIM), "float32"), - "attention_mask": ((1, 1, input_seq_length, input_seq_length), "float32"), - "position_ids_cos": ((1, 1, input_seq_length, POS_EMBED_DIM), "float32"), - "position_ids_sin": ((1, 1, input_seq_length, POS_EMBED_DIM), "float32"), - } - - @staticmethod - def get_output_names(): - layers_start, layers_end = get_hidden_layer_range_from_split( - split_part=3, model_split_map=MODEL_SPLIT_MAP - ) - return Llama_QuantizedMixin.get_output_names( - start=layers_start, - end=layers_end, - past_key_val_heads=NUM_KEY_VAL_HEADS, - bundled_kvcache=False, - ) - - @staticmethod - def get_model_data(input_seq_len: int = DEFAULT_INPUT_SEQ_LEN): - data = load_input_cached_data( - split_part=3, - data_dir=DATA_DIR, - model_name="llama_v3", - model_id=MODEL_ID, - model_asset_version=MODEL_ASSET_VERSION, - model_type="pp", - input_seq_len=input_seq_len, - ) - if data is not None: - return data - - model = Llama3_PromptProcessor_2_Quantized.from_pretrained() - inputs = Llama3_PromptProcessor_2_Quantized.get_model_data( - input_seq_len=input_seq_len - ) - output = model(*inputs.values()) - del model - - new_inputs = {} - new_inputs["input_ids"] = output[0].detach() - new_inputs["attention_mask"] = inputs["attention_mask"] - new_inputs["position_ids_cos"] = inputs["position_ids_cos"] - new_inputs["position_ids_sin"] = inputs["position_ids_sin"] - save_input_cached_data( - new_inputs, - split_part=3, - data_dir=DATA_DIR, - model_name="llama_v3", - model_id=MODEL_ID, - model_asset_version=MODEL_ASSET_VERSION, - input_seq_len=input_seq_len, - ) - return new_inputs - - def get_calibration_data( - self, - target_runtime: TargetRuntime | None = None, - input_spec: InputSpec | None = None, - ) -> DatasetEntries | None: - """ - Calibration dataset for this model. 
- """ - if input_spec is None: - input_spec = Llama3_PromptProcessor_3_Quantized.get_input_spec() - - input_seq_len = input_spec["attention_mask"][0][-1] - return Llama3_PromptProcessor_3_Quantized.get_model_data( - input_seq_len=input_seq_len - ) - - -class Llama3_PromptProcessor_4_Quantized(Llama_QuantizedMixin): - def __init__(self, model: torch.nn.Module, encoding_path: str): - super().__init__(model, encoding_path) - self.split_part = 4 - - def forward( - self, - input_ids: torch.Tensor, - attention_mask: torch.Tensor, - position_ids_cos: torch.Tensor, - position_ids_sin: torch.Tensor, - ): - return self.model(input_ids, attention_mask, position_ids_cos, position_ids_sin) - - @classmethod - def from_pretrained( - cls, max_position_embeddings: int = MAX_POS_EMBEDDINGS - ) -> Llama3_PromptProcessor_4_Quantized: - model, encoding_path = _get_llama_model_with_split( - max_position_embeddings, split_part=4 - ) - return cls(model, encoding_path) - - @staticmethod - def get_input_spec( - input_seq_length: int = DEFAULT_INPUT_SEQ_LEN, - ) -> InputSpec: - # Get the input specification ordered (name -> (shape, type)) pairs for this model. - # - # This can be used with the qai_hub python API to declare - # the model input specification upon submitting a compile job. - return { - "input_ids": ((1, input_seq_length, ATTENTION_HIDDEN_DIM), "float32"), - "attention_mask": ((1, 1, input_seq_length, input_seq_length), "float32"), - "position_ids_cos": ((1, 1, input_seq_length, POS_EMBED_DIM), "float32"), - "position_ids_sin": ((1, 1, input_seq_length, POS_EMBED_DIM), "float32"), - } - - @staticmethod - def get_output_names(): - layers_start, layers_end = get_hidden_layer_range_from_split( - split_part=4, model_split_map=MODEL_SPLIT_MAP - ) - return Llama_QuantizedMixin.get_output_names( - start=layers_start, - end=layers_end, - past_key_val_heads=NUM_KEY_VAL_HEADS, - bundled_kvcache=False, - ) - - @staticmethod - def get_model_data(input_seq_len: int = DEFAULT_INPUT_SEQ_LEN): - data = load_input_cached_data( - split_part=4, - data_dir=DATA_DIR, - model_name="llama_v3", - model_id=MODEL_ID, - model_asset_version=MODEL_ASSET_VERSION, - model_type="pp", - input_seq_len=input_seq_len, - ) - if data is not None: - return data - - model = Llama3_PromptProcessor_3_Quantized.from_pretrained() - inputs = Llama3_PromptProcessor_3_Quantized.get_model_data( - input_seq_len=input_seq_len - ) - output = model(*inputs.values()) - - new_inputs = {} - new_inputs["input_ids"] = output[0].detach() - new_inputs["attention_mask"] = inputs["attention_mask"] - new_inputs["position_ids_cos"] = inputs["position_ids_cos"] - new_inputs["position_ids_sin"] = inputs["position_ids_sin"] - save_input_cached_data( - new_inputs, - split_part=4, - data_dir=DATA_DIR, - model_name="llama_v3", - model_id=MODEL_ID, - model_asset_version=MODEL_ASSET_VERSION, - input_seq_len=input_seq_len, - ) - return new_inputs - - def get_calibration_data( - self, - target_runtime: TargetRuntime | None = None, - input_spec: InputSpec | None = None, - ) -> DatasetEntries | None: - """ - Calibration dataset for this model. 
- """ - if input_spec is None: - input_spec = Llama3_PromptProcessor_4_Quantized.get_input_spec() - - input_seq_len = input_spec["attention_mask"][0][-1] - return Llama3_PromptProcessor_4_Quantized.get_model_data( - input_seq_len=input_seq_len - ) - - -class Llama3_PromptProcessor_5_Quantized(Llama_QuantizedMixin): - def __init__(self, model: torch.nn.Module, encoding_path: str): - super().__init__(model, encoding_path) - self.split_part = 5 - - def forward( - self, - input_ids: torch.Tensor, - attention_mask: torch.Tensor, - position_ids_cos: torch.Tensor, - position_ids_sin: torch.Tensor, - ): - return self.model(input_ids, attention_mask, position_ids_cos, position_ids_sin) - - @classmethod - def from_pretrained( - cls, max_position_embeddings: int = MAX_POS_EMBEDDINGS - ) -> Llama3_PromptProcessor_5_Quantized: - model, encoding_path = _get_llama_model_with_split( - max_position_embeddings, split_part=5 - ) - return cls(model, encoding_path) - - @staticmethod - def get_input_spec( - input_seq_length: int = DEFAULT_INPUT_SEQ_LEN, - ) -> InputSpec: - # Get the input specification ordered (name -> (shape, type)) pairs for this model. - # - # This can be used with the qai_hub python API to declare - # the model input specification upon submitting a compile job. - return { - "input_ids": ((1, input_seq_length, ATTENTION_HIDDEN_DIM), "float32"), - "attention_mask": ((1, 1, input_seq_length, input_seq_length), "float32"), - "position_ids_cos": ((1, 1, input_seq_length, POS_EMBED_DIM), "float32"), - "position_ids_sin": ((1, 1, input_seq_length, POS_EMBED_DIM), "float32"), - } - - @staticmethod - def get_output_names(): - layers_start, layers_end = get_hidden_layer_range_from_split( - split_part=5, model_split_map=MODEL_SPLIT_MAP - ) - return Llama_QuantizedMixin.get_output_names( - start=layers_start, - end=layers_end, - past_key_val_heads=NUM_KEY_VAL_HEADS, - bundled_kvcache=False, - output_name="logits", - ) - - @staticmethod - def get_model_data(input_seq_len: int = DEFAULT_INPUT_SEQ_LEN): - data = load_input_cached_data( - split_part=5, - data_dir=DATA_DIR, - model_name="llama_v3", - model_id=MODEL_ID, - model_asset_version=MODEL_ASSET_VERSION, - model_type="pp", - input_seq_len=input_seq_len, - ) - if data is not None: - return data - - model = Llama3_PromptProcessor_4_Quantized.from_pretrained() - inputs = Llama3_PromptProcessor_4_Quantized.get_model_data( - input_seq_len=input_seq_len - ) - output = model(*inputs.values()) - - new_inputs = {} - new_inputs["input_ids"] = output[0].detach() - new_inputs["attention_mask"] = inputs["attention_mask"] - new_inputs["position_ids_cos"] = inputs["position_ids_cos"] - new_inputs["position_ids_sin"] = inputs["position_ids_sin"] - save_input_cached_data( - new_inputs, - split_part=4, - data_dir=DATA_DIR, - model_name="llama_v3", - model_id=MODEL_ID, - model_asset_version=MODEL_ASSET_VERSION, - input_seq_len=input_seq_len, - ) - return new_inputs - - def get_calibration_data( - self, - target_runtime: TargetRuntime | None = None, - input_spec: InputSpec | None = None, - ) -> DatasetEntries | None: - """ - Calibration dataset for this model. 
- """ - if input_spec is None: - input_spec = Llama3_PromptProcessor_5_Quantized.get_input_spec() - - input_seq_len = input_spec["attention_mask"][0][-1] - return Llama3_PromptProcessor_5_Quantized.get_model_data( - input_seq_len=input_seq_len - ) - - -# -# Token Generators -# - - -class Llama3_TokenGenerator_1_Quantized(Llama_QuantizedMixin): - def __init__(self, model: torch.nn.Module, encoding_path: str): - super().__init__(model, encoding_path, is_token_generator=True) - self.split_part = 1 - - def forward( - self, - input_ids: torch.Tensor, - attention_mask: torch.Tensor, - position_ids_cos: torch.Tensor, - position_ids_sin: torch.Tensor, - *past_key_values, - ): - return self.model( - input_ids, - attention_mask, - position_ids_cos, - position_ids_sin, - *past_key_values, - ) - - @classmethod - def from_pretrained( - cls, max_position_embeddings: int = MAX_POS_EMBEDDINGS - ) -> Llama3_TokenGenerator_1_Quantized: - model, encoding_path = _get_llama_model_with_split( - max_position_embeddings, - split_part=1, - is_token_generator=True, - ) - return cls(model, encoding_path) - - @staticmethod - def get_input_spec( - input_seq_length: int = DEFAULT_INPUT_SEQ_LEN, - ) -> InputSpec: - # Get the input specification ordered (name -> (shape, type)) pairs for this model. - # - # This can be used with the qai_hub python API to declare - # the model input specification upon submitting a compile job. - - input_spec = { - "input_ids": ((1, 1), "int32"), - "attention_mask": ((1, 1, 1, input_seq_length), "float32"), - "position_ids_cos": ((1, 1, 1, POS_EMBED_DIM), "float32"), - "position_ids_sin": ((1, 1, 1, POS_EMBED_DIM), "float32"), - } - - # Collect past_key_values and drop output names - layers_start, layers_end = get_hidden_layer_range_from_split( - split_part=1, model_split_map=MODEL_SPLIT_MAP - ) - past_key_val_names = get_past_key_names( - start=layers_start, - end=layers_end, - num_of_past_key_heads=NUM_KEY_VAL_HEADS, - bundled_kvcache=False, - ) - for past_key_val in past_key_val_names: - if "key" in past_key_val: - input_spec[past_key_val] = ( - (1, 1, 128, input_seq_length - 1), - "float32", - ) - else: - input_spec[past_key_val] = ( - (1, 1, input_seq_length - 1, 128), - "float32", - ) - return input_spec - - @staticmethod - def get_output_names(): - layers_start, layers_end = get_hidden_layer_range_from_split( - split_part=1, model_split_map=MODEL_SPLIT_MAP - ) - return Llama_QuantizedMixin.get_output_names( - start=layers_start, - end=layers_end, - past_key_val_heads=NUM_KEY_VAL_HEADS, - bundled_kvcache=False, - ) - - @staticmethod - def get_model_data(input_seq_len: int = DEFAULT_INPUT_SEQ_LEN): - data = load_input_cached_data( - split_part=1, - data_dir=DATA_DIR, - model_name="llama_v3", - model_id=MODEL_ID, - model_asset_version=MODEL_ASSET_VERSION, - model_type="tg", - input_seq_len=input_seq_len, - ) - if data is not None: - return data - - inputs = Llama3_PromptProcessor_1_Quantized.get_model_data( - input_seq_len=input_seq_len - ) - model = Llama3_PromptProcessor_1_Quantized.from_pretrained() - output = model(*inputs.values()) - del model - - tokenizer = get_tokenizer() - prompt = get_input_prompt_with_tags(DEFAULT_USER_PROMPT) - input_tokens = tokenizer( - prompt, return_tensors="pt", padding="max_length", max_length=input_seq_len - ) - num_tokens = torch.sum(input_tokens["attention_mask"]).item() - - # Get last input id - input_ids = inputs["input_ids"][:, -1].reshape(-1, 1).type(torch.int32) - # Create attention mask with - # [B, 1, Target Seq Len, Source Seq Len] - # where 
Target Seq Len = 1 - padding_size = input_seq_len - num_tokens - - attention_mask = ( - torch.Tensor([0] * padding_size + [1] * (input_seq_len - padding_size)) - .reshape(1, -1) - .type(torch.float32) - ) - - # Get last input id - input_ids = inputs["input_ids"][:, -1].reshape(-1, 1).type(torch.int32) - - # Create attention mask with - # [B, 1, Target Seq Len, Source Seq Len] - # where Target Seq Len = 1 - cm_attention_mask = prepare_combined_attention_mask( - attention_mask=attention_mask, - input_shape=input_ids.shape, - past_key_values_length=input_seq_len - 1, - ) - position_ids = torch.Tensor([padding_size + 1]).reshape(1, -1).type(torch.long) - position_ids_cos, position_ids_sin = RopeEmbedding( - max_length=input_seq_len - ).get_embedding(position_ids) - inputs["position_ids_cos"] = position_ids_cos - inputs["position_ids_sin"] = position_ids_sin - - data = { - "input_ids": input_ids, - "attention_mask": cm_attention_mask, - "position_ids_cos": position_ids_cos, - "position_ids_sin": position_ids_sin, - } - - layers_start, _ = get_hidden_layer_range_from_split( - split_part=1, model_split_map=MODEL_SPLIT_MAP - ) - key_val = get_past_keyval_with_shift( - output[1:], layers_start, NUM_KEY_VAL_HEADS, bundled_kvcache=False - ) - for key, val in key_val.items(): - data[key] = val - - save_input_cached_data( - data, - split_part=1, - data_dir=DATA_DIR, - model_name="llama_v3", - model_id=MODEL_ID, - model_asset_version=MODEL_ASSET_VERSION, - model_type="tg", - input_seq_len=input_seq_len, - ) - return data - - def get_calibration_data( - self, - target_runtime: TargetRuntime | None = None, - input_spec: InputSpec | None = None, - ) -> DatasetEntries | None: - """ - Calibration dataset for this model. - """ - if input_spec is None: - input_spec = Llama3_TokenGenerator_1_Quantized.get_input_spec() - - # Attention mask is of shape [B, 1, TargetSeqLen, SourceSeqLen] - input_seq_len = input_spec["attention_mask"][0][-1] - return Llama3_TokenGenerator_1_Quantized.get_model_data( - input_seq_len=input_seq_len, - ) - - -class Llama3_TokenGenerator_2_Quantized(Llama_QuantizedMixin): - def __init__(self, model: torch.nn.Module, encoding_path: str): - super().__init__(model, encoding_path, is_token_generator=True) - self.split_part = 2 - - def forward( - self, - input_ids: torch.Tensor, - attention_mask: torch.Tensor, - position_ids_cos: torch.Tensor, - position_ids_sin: torch.Tensor, - *past_key_values, - ): - return self.model( - input_ids, - attention_mask, - position_ids_cos, - position_ids_sin, - *past_key_values, - ) - - @classmethod - def from_pretrained( - cls, max_position_embeddings: int = MAX_POS_EMBEDDINGS - ) -> Llama3_TokenGenerator_2_Quantized: - model, encoding_path = _get_llama_model_with_split( - max_position_embeddings, - split_part=2, - is_token_generator=True, - ) - return cls(model, encoding_path) - - @staticmethod - def get_input_spec( - input_seq_length: int = DEFAULT_INPUT_SEQ_LEN, - ) -> InputSpec: - # Get the input specification ordered (name -> (shape, type)) pairs for this model. - # - # This can be used with the qai_hub python API to declare - # the model input specification upon submitting a compile job. 
- - input_spec = { - "input_ids": ((1, 1, ATTENTION_HIDDEN_DIM), "float32"), - "attention_mask": ((1, 1, 1, input_seq_length), "float32"), - "position_ids_cos": ((1, 1, 1, POS_EMBED_DIM), "float32"), - "position_ids_sin": ((1, 1, 1, POS_EMBED_DIM), "float32"), - } - - # Collect past_key_values and drop output names - layers_start, layers_end = get_hidden_layer_range_from_split( - split_part=2, model_split_map=MODEL_SPLIT_MAP - ) - past_key_val_names = get_past_key_names( - start=layers_start, - end=layers_end, - num_of_past_key_heads=8, - bundled_kvcache=False, - ) - for past_key_val in past_key_val_names: - if "key" in past_key_val: - input_spec[past_key_val] = ( - (1, 1, 128, input_seq_length - 1), - "float32", - ) - else: - input_spec[past_key_val] = ( - (1, 1, input_seq_length - 1, 128), - "float32", - ) - return input_spec - - @staticmethod - def get_output_names(): - layers_start, layers_end = get_hidden_layer_range_from_split( - split_part=2, model_split_map=MODEL_SPLIT_MAP - ) - return Llama_QuantizedMixin.get_output_names( - start=layers_start, - end=layers_end, - past_key_val_heads=NUM_KEY_VAL_HEADS, - bundled_kvcache=False, - ) - - @staticmethod - def get_model_data(input_seq_len: int = DEFAULT_INPUT_SEQ_LEN): - data = load_input_cached_data( - split_part=2, - data_dir=DATA_DIR, - model_name="llama_v3", - model_id=MODEL_ID, - model_asset_version=MODEL_ASSET_VERSION, - model_type="tg", - input_seq_len=input_seq_len, - ) - if data is not None: - return data - - inputs = Llama3_PromptProcessor_2_Quantized.get_model_data( - input_seq_len=input_seq_len - ) - model = Llama3_PromptProcessor_2_Quantized.from_pretrained() - output = model(*inputs.values()) - del model - - inputs = Llama3_TokenGenerator_1_Quantized.get_model_data( - input_seq_len=input_seq_len - ) - model = Llama3_TokenGenerator_1_Quantized.from_pretrained() - output_tg = model(*inputs.values()) - del model - - data = { - "input_ids": output_tg[0].detach(), - "attention_mask": inputs["attention_mask"], - "position_ids_cos": inputs["position_ids_cos"], - "position_ids_sin": inputs["position_ids_sin"], - } - - layers_start, _ = get_hidden_layer_range_from_split( - split_part=2, model_split_map=MODEL_SPLIT_MAP - ) - key_val = get_past_keyval_with_shift( - output[1:], layers_start, NUM_KEY_VAL_HEADS, bundled_kvcache=False - ) - for key, val in key_val.items(): - data[key] = val - - save_input_cached_data( - data, - split_part=2, - data_dir=DATA_DIR, - model_name="llama_v3", - model_id=MODEL_ID, - model_asset_version=MODEL_ASSET_VERSION, - model_type="tg", - input_seq_len=input_seq_len, - ) - return data - - def get_calibration_data( - self, - target_runtime: TargetRuntime | None = None, - input_spec: InputSpec | None = None, - ) -> DatasetEntries | None: - """ - Calibration dataset for this model. 
- """ - if input_spec is None: - input_spec = Llama3_TokenGenerator_2_Quantized.get_input_spec() - - input_seq_len = input_spec["attention_mask"][0][-1] - return Llama3_TokenGenerator_2_Quantized.get_model_data( - input_seq_len=input_seq_len, - ) - - -class Llama3_TokenGenerator_3_Quantized(Llama_QuantizedMixin): - def __init__(self, model: torch.nn.Module, encoding_path: str): - super().__init__(model, encoding_path, is_token_generator=True) - self.split_part = 3 - - def forward( - self, - input_ids: torch.Tensor, - attention_mask: torch.Tensor, - position_ids_cos: torch.Tensor, - position_ids_sin: torch.Tensor, - *past_key_values, - ): - return self.model( - input_ids, - attention_mask, - position_ids_cos, - position_ids_sin, - *past_key_values, - ) - - @classmethod - def from_pretrained( - cls, max_position_embeddings: int = MAX_POS_EMBEDDINGS - ) -> Llama3_TokenGenerator_3_Quantized: - model, encoding_path = _get_llama_model_with_split( - max_position_embeddings, - split_part=3, - is_token_generator=True, - ) - return cls(model, encoding_path) - - @staticmethod - def get_input_spec( - input_seq_length: int = DEFAULT_INPUT_SEQ_LEN, - ) -> InputSpec: - # Get the input specification ordered (name -> (shape, type)) pairs for this model. - # - # This can be used with the qai_hub python API to declare - # the model input specification upon submitting a compile job. - - input_spec = { - "input_ids": ((1, 1, ATTENTION_HIDDEN_DIM), "float32"), - "attention_mask": ((1, 1, 1, input_seq_length), "float32"), - "position_ids_cos": ((1, 1, 1, POS_EMBED_DIM), "float32"), - "position_ids_sin": ((1, 1, 1, POS_EMBED_DIM), "float32"), - } - - # Collect past_key_values and drop output names - layers_start, layers_end = get_hidden_layer_range_from_split( - split_part=3, model_split_map=MODEL_SPLIT_MAP - ) - past_key_val_names = get_past_key_names( - start=layers_start, - end=layers_end, - num_of_past_key_heads=8, - bundled_kvcache=False, - ) - for past_key_val in past_key_val_names: - if "key" in past_key_val: - input_spec[past_key_val] = ( - (1, 1, 128, input_seq_length - 1), - "float32", - ) - else: - input_spec[past_key_val] = ( - (1, 1, input_seq_length - 1, 128), - "float32", + Load a pre-trained Llama 3 (8B) model from Meta via HuggingFace. + + sequence_length: + Instantiate with this token sequence length input. A longer + sequence length means the model is capable of processing more + tokens at once. This can only be set to greater than one to process + prompts, since responses are auto-regressive in nature and require + this to be 1. + context_length: + Total context length of model. Longer context length means the + model is more capable of making longer connections in the input + prompt. However, it also hurts runtime performance (both time-to- + first-token and tokens-per-second), so this is a tradeoff that may + depend on the use case. + aimet_encodings: + Path to AIMET quantization encodings file. + huggingface_model_name: + Name or URL of the HuggingFace model. Change this if you want to + change the weights. 
+ """ + if aimet_encodings: + if aimet_encodings == "DEFAULT": + aimet_encodings = os.path.join( + CachedWebModelAsset.from_asset_store( + MODEL_ID, MODEL_ASSET_VERSION, DEFAULT_ENCODINGS_ZIP + ).fetch(extract=True), + DEFAULT_ENCODINGS, ) - return input_spec - @staticmethod - def get_output_names(): - layers_start, layers_end = get_hidden_layer_range_from_split( - split_part=3, model_split_map=MODEL_SPLIT_MAP - ) - return Llama_QuantizedMixin.get_output_names( - start=layers_start, - end=layers_end, - past_key_val_heads=NUM_KEY_VAL_HEADS, - bundled_kvcache=False, + return cls( + aimet_encodings=aimet_encodings, + sequence_length=sequence_length, + context_length=context_length, + huggingface_model_name=huggingface_model_name, ) @staticmethod - def get_model_data(input_seq_len: int = DEFAULT_INPUT_SEQ_LEN): - data = load_input_cached_data( - split_part=3, - data_dir=DATA_DIR, - model_name="llama_v3", - model_id=MODEL_ID, - model_asset_version=MODEL_ASSET_VERSION, - model_type="tg", - input_seq_len=input_seq_len, - ) - if data is not None: - return data - - inputs = Llama3_PromptProcessor_3_Quantized.get_model_data( - input_seq_len=input_seq_len - ) - model = Llama3_PromptProcessor_3_Quantized.from_pretrained() - output = model(*inputs.values()) - del model - - inputs = Llama3_TokenGenerator_2_Quantized.get_model_data( - input_seq_len=input_seq_len - ) - model = Llama3_TokenGenerator_2_Quantized.from_pretrained() - output_tg = model(*inputs.values()) - del model - - data = { - "input_ids": output_tg[0].detach(), - "attention_mask": inputs["attention_mask"], - "position_ids_cos": inputs["position_ids_cos"], - "position_ids_sin": inputs["position_ids_sin"], - } - - layers_start, _ = get_hidden_layer_range_from_split( - split_part=3, model_split_map=MODEL_SPLIT_MAP - ) - key_val = get_past_keyval_with_shift( - output[1:], layers_start, NUM_KEY_VAL_HEADS, bundled_kvcache=False - ) - for key, val in key_val.items(): - data[key] = val - - save_input_cached_data( - data, - split_part=3, - data_dir=DATA_DIR, - model_name="llama_v3", - model_id=MODEL_ID, - model_asset_version=MODEL_ASSET_VERSION, - model_type="tg", - input_seq_len=input_seq_len, - ) - return data - - def get_calibration_data( - self, - target_runtime: TargetRuntime | None = None, - input_spec: InputSpec | None = None, - ) -> DatasetEntries | None: - """ - Calibration dataset for this model. 
- """ - if input_spec is None: - input_spec = Llama3_TokenGenerator_3_Quantized.get_input_spec() - - input_seq_len = input_spec["attention_mask"][0][-1] - return Llama3_TokenGenerator_3_Quantized.get_model_data( - input_seq_len=input_seq_len, - ) - - -class Llama3_TokenGenerator_4_Quantized(Llama_QuantizedMixin): - def __init__(self, model: torch.nn.Module, encoding_path: str): - super().__init__(model, encoding_path, is_token_generator=True) - self.split_part = 4 - - def forward( - self, - input_ids: torch.Tensor, - attention_mask: torch.Tensor, - position_ids_cos: torch.Tensor, - position_ids_sin: torch.Tensor, - *past_key_values, - ): - return self.model( - input_ids, - attention_mask, - position_ids_cos, - position_ids_sin, - *past_key_values, + def get_output_names(num_hidden_layers: int = NUM_LAYERS): + return Llama3Base_Quantized.get_output_names( + num_hidden_layers=num_hidden_layers ) - @classmethod - def from_pretrained( - cls, max_position_embeddings: int = MAX_POS_EMBEDDINGS - ) -> Llama3_TokenGenerator_4_Quantized: - model, encoding_path = _get_llama_model_with_split( - max_position_embeddings, - split_part=4, - is_token_generator=True, - ) - return cls(model, encoding_path) - @staticmethod def get_input_spec( - input_seq_length: int = DEFAULT_INPUT_SEQ_LEN, + num_hidden_layers: int = NUM_LAYERS, + input_seq_length: int = 128, + context_length: int = DEFAULT_CONTEXT_LENGTH, + hidden_size: int = 4096, + num_key_value_heads: int = 8, + num_attention_heads: int = 32, ) -> InputSpec: - # Get the input specification ordered (name -> (shape, type)) pairs for this model. - # - # This can be used with the qai_hub python API to declare - # the model input specification upon submitting a compile job. - - input_spec = { - "input_ids": ((1, 1, ATTENTION_HIDDEN_DIM), "float32"), - "attention_mask": ((1, 1, 1, input_seq_length), "float32"), - "position_ids_cos": ((1, 1, 1, POS_EMBED_DIM), "float32"), - "position_ids_sin": ((1, 1, 1, POS_EMBED_DIM), "float32"), - } - - # Collect past_key_values and drop output names - layers_start, layers_end = get_hidden_layer_range_from_split( - split_part=4, model_split_map=MODEL_SPLIT_MAP - ) - past_key_val_names = get_past_key_names( - start=layers_start, - end=layers_end, - num_of_past_key_heads=8, - bundled_kvcache=False, - ) - for past_key_val in past_key_val_names: - if "key" in past_key_val: - input_spec[past_key_val] = ( - (1, 1, 128, input_seq_length - 1), - "float32", - ) - else: - input_spec[past_key_val] = ( - (1, 1, input_seq_length - 1, 128), - "float32", - ) - return input_spec - - @staticmethod - def get_output_names(): - layers_start, layers_end = get_hidden_layer_range_from_split( - split_part=4, model_split_map=MODEL_SPLIT_MAP - ) - return Llama_QuantizedMixin.get_output_names( - start=layers_start, - end=layers_end, - past_key_val_heads=NUM_KEY_VAL_HEADS, - bundled_kvcache=False, - ) - - @staticmethod - def get_model_data(input_seq_len: int = DEFAULT_INPUT_SEQ_LEN): - data = load_input_cached_data( - split_part=4, - data_dir=DATA_DIR, - model_name="llama_v3", - model_id=MODEL_ID, - model_asset_version=MODEL_ASSET_VERSION, - model_type="tg", - input_seq_len=input_seq_len, - ) - if data is not None: - return data - - inputs = Llama3_PromptProcessor_4_Quantized.get_model_data( - input_seq_len=input_seq_len - ) - model = Llama3_PromptProcessor_4_Quantized.from_pretrained() - output = model(*inputs.values()) - del model - - inputs = Llama3_TokenGenerator_3_Quantized.get_model_data( - input_seq_len=input_seq_len - ) - model = 
Llama3_TokenGenerator_3_Quantized.from_pretrained() - output_tg = model(*inputs.values()) - del model - - data = { - "input_ids": output_tg[0].detach(), - "attention_mask": inputs["attention_mask"], - "position_ids_cos": inputs["position_ids_cos"], - "position_ids_sin": inputs["position_ids_sin"], - } - - layers_start, _ = get_hidden_layer_range_from_split( - split_part=4, model_split_map=MODEL_SPLIT_MAP - ) - key_val = get_past_keyval_with_shift( - output[1:], layers_start, NUM_KEY_VAL_HEADS, bundled_kvcache=False - ) - for key, val in key_val.items(): - data[key] = val - - save_input_cached_data( - data, - split_part=4, - data_dir=DATA_DIR, - model_name="llama_v3", - model_id=MODEL_ID, - model_asset_version=MODEL_ASSET_VERSION, - model_type="tg", - input_seq_len=input_seq_len, - ) - return data - - def get_calibration_data( - self, - target_runtime: TargetRuntime | None = None, - input_spec: InputSpec | None = None, - ) -> DatasetEntries | None: - """ - Calibration dataset for this model. - """ - if input_spec is None: - input_spec = Llama3_TokenGenerator_4_Quantized.get_input_spec() - - input_seq_len = input_spec["attention_mask"][0][-1] - return Llama3_TokenGenerator_4_Quantized.get_model_data( - input_seq_len=input_seq_len - ) - - -class Llama3_TokenGenerator_5_Quantized(Llama_QuantizedMixin): - def __init__(self, model: torch.nn.Module, encoding_path: str): - super().__init__(model, encoding_path, is_token_generator=True) - self.split_part = 5 - - def forward( - self, - input_ids: torch.Tensor, - attention_mask: torch.Tensor, - position_ids_cos: torch.Tensor, - position_ids_sin: torch.Tensor, - *past_key_values, - ): - return self.model( - input_ids, - attention_mask, - position_ids_cos, - position_ids_sin, - *past_key_values, - ) - - @classmethod - def from_pretrained( - cls, max_position_embeddings: int = MAX_POS_EMBEDDINGS - ) -> Llama3_TokenGenerator_5_Quantized: - model, encoding_path = _get_llama_model_with_split( - max_position_embeddings, - split_part=5, - is_token_generator=True, - ) - return cls(model, encoding_path) - - @staticmethod - def get_input_spec( - input_seq_length: int = DEFAULT_INPUT_SEQ_LEN, - ) -> InputSpec: - # Get the input specification ordered (name -> (shape, type)) pairs for this model. - # - # This can be used with the qai_hub python API to declare - # the model input specification upon submitting a compile job. 
- - input_spec = { - "input_ids": ((1, 1, ATTENTION_HIDDEN_DIM), "float32"), - "attention_mask": ((1, 1, 1, input_seq_length), "float32"), - "position_ids_cos": ((1, 1, 1, POS_EMBED_DIM), "float32"), - "position_ids_sin": ((1, 1, 1, POS_EMBED_DIM), "float32"), - } - - # Collect past_key_values and drop output names - layers_start, layers_end = get_hidden_layer_range_from_split( - split_part=5, model_split_map=MODEL_SPLIT_MAP - ) - past_key_val_names = get_past_key_names( - start=layers_start, - end=layers_end, - num_of_past_key_heads=8, - bundled_kvcache=False, - ) - for past_key_val in past_key_val_names: - if "key" in past_key_val: - input_spec[past_key_val] = ( - (1, 1, 128, input_seq_length - 1), - "float32", - ) - else: - input_spec[past_key_val] = ( - (1, 1, input_seq_length - 1, 128), - "float32", - ) - return input_spec - - @staticmethod - def get_output_names(): - layers_start, layers_end = get_hidden_layer_range_from_split( - split_part=5, model_split_map=MODEL_SPLIT_MAP - ) - return Llama_QuantizedMixin.get_output_names( - start=layers_start, - end=layers_end, - past_key_val_heads=NUM_KEY_VAL_HEADS, - bundled_kvcache=False, - output_name="logits", - ) - - @staticmethod - def get_model_data(input_seq_len: int = DEFAULT_INPUT_SEQ_LEN): - data = load_input_cached_data( - split_part=5, - data_dir=DATA_DIR, - model_name="llama_v3", - model_id=MODEL_ID, - model_asset_version=MODEL_ASSET_VERSION, - model_type="tg", - input_seq_len=input_seq_len, - ) - if data is not None: - return data - - inputs = Llama3_PromptProcessor_5_Quantized.get_model_data( - input_seq_len=input_seq_len - ) - model = Llama3_PromptProcessor_5_Quantized.from_pretrained() - output = model(*inputs.values()) - del model - - inputs = Llama3_TokenGenerator_4_Quantized.get_model_data( - input_seq_len=input_seq_len - ) - model = Llama3_TokenGenerator_4_Quantized.from_pretrained() - output_tg = model(*inputs.values()) - del model - - data = { - "input_ids": output_tg[0].detach(), - "attention_mask": inputs["attention_mask"], - "position_ids_cos": inputs["position_ids_cos"], - "position_ids_sin": inputs["position_ids_sin"], - } - - layers_start, _ = get_hidden_layer_range_from_split( - split_part=5, model_split_map=MODEL_SPLIT_MAP - ) - key_val = get_past_keyval_with_shift( - output[1:], layers_start, NUM_KEY_VAL_HEADS, bundled_kvcache=False - ) - for key, val in key_val.items(): - data[key] = val - - save_input_cached_data( - data, - split_part=5, - data_dir=DATA_DIR, - model_name="llama_v3", - model_id=MODEL_ID, - model_asset_version=MODEL_ASSET_VERSION, - model_type="tg", - input_seq_len=input_seq_len, - ) - return data - - def get_calibration_data( - self, - target_runtime: TargetRuntime | None = None, - input_spec: InputSpec | None = None, - ) -> DatasetEntries | None: - """ - Calibration dataset for this model. 
- """ - if input_spec is None: - input_spec = Llama3_TokenGenerator_5_Quantized.get_input_spec() - - input_seq_len = input_spec["attention_mask"][0][-1] - return Llama3_TokenGenerator_5_Quantized.get_model_data( - input_seq_len=input_seq_len + return Llama3Base_Quantized.get_input_spec( + num_hidden_layers=NUM_LAYERS, + input_seq_length=input_seq_length, + context_length=context_length, + hidden_size=hidden_size, + num_key_value_heads=num_key_value_heads, + num_attention_heads=num_attention_heads, ) diff --git a/qai_hub_models/models/llama_v3_8b_chat_quantized/modeling_llama.py b/qai_hub_models/models/llama_v3_8b_chat_quantized/modeling_llama.py deleted file mode 100644 index 676a232e..00000000 --- a/qai_hub_models/models/llama_v3_8b_chat_quantized/modeling_llama.py +++ /dev/null @@ -1,1436 +0,0 @@ -# --------------------------------------------------------------------- -# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. -# SPDX-License-Identifier: BSD-3-Clause -# --------------------------------------------------------------------- -# coding=utf-8 -# Copyright 2022 EleutherAI and the HuggingFace Inc. team. All rights reserved. -# -# This code is based on EleutherAI's GPT-NeoX library and the GPT-NeoX -# and OPT implementations in this library. It has been modified from its -# original forms to accommodate minor architectural differences compared -# to GPT-NeoX and OPT used by the Meta AI team that trained the model. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -""" PyTorch LLaMA model.""" -from __future__ import annotations - -import math -from typing import List, Optional, Tuple, Union - -import torch -import torch.utils.checkpoint -from torch import nn -from torch.nn import CrossEntropyLoss -from transformers.activations import ACT2FN -from transformers.modeling_outputs import ( - BaseModelOutputWithPast, - CausalLMOutputWithPast, -) -from transformers.modeling_utils import PreTrainedModel -from transformers.models.llama.configuration_llama import LlamaConfig -from transformers.utils import ( - add_start_docstrings, - add_start_docstrings_to_model_forward, - logging, - replace_return_docstrings, -) - -logger = logging.get_logger(__name__) - -_CONFIG_FOR_DOC = "LlamaConfig" - - -# Copied from transformers.models.bart.modeling_bart._make_causal_mask -def _make_causal_mask( - input_ids_shape: torch.Size, - dtype: torch.dtype, - device: torch.device, - past_key_values_length: int = 0, - mask_neg: float = -100.0, -): - """ - Make causal mask used for bi-directional self-attention. 
- """ - bsz, tgt_len = input_ids_shape - # mask = torch.full((tgt_len, tgt_len), torch.tensor(torch.finfo(dtype).min, device=device), device=device) - mask = torch.full( - (tgt_len, tgt_len), torch.tensor(mask_neg, device=device), device=device - ) - mask_cond = torch.arange(mask.size(-1), device=device) - mask.masked_fill_(mask_cond < (mask_cond + 1).view(mask.size(-1), 1), 0) - mask = mask.to(dtype) - - if past_key_values_length > 0: - mask = torch.cat( - [ - torch.zeros( - tgt_len, past_key_values_length, dtype=dtype, device=device - ), - mask, - ], - dim=-1, - ) - return mask[None, None, :, :].expand( - bsz, 1, tgt_len, tgt_len + past_key_values_length - ) - - -# Copied from transformers.models.bart.modeling_bart._expand_mask -def _expand_mask( - mask: torch.Tensor, - dtype: torch.dtype, - mask_neg: float = -100.0, - tgt_len: Optional[int] = None, -): - """ - Expands attention_mask from `[bsz, seq_len]` to `[bsz, 1, tgt_seq_len, src_seq_len]`. - """ - bsz, src_len = mask.size() - tgt_len = tgt_len if tgt_len is not None else src_len - - expanded_mask = mask[:, None, None, :].expand(bsz, 1, tgt_len, src_len).to(dtype) - - inverted_mask = 1.0 - expanded_mask - - # return inverted_mask.masked_fill(inverted_mask.to(torch.bool), torch.finfo(dtype).min) - return inverted_mask.masked_fill(inverted_mask.to(torch.bool), mask_neg) - - -class LlamaRMSNorm(nn.Module): - def __init__(self, hidden_size, eps=1e-6): - """ - LlamaRMSNorm is equivalent to T5LayerNorm - """ - super().__init__() - self.weight = nn.Parameter(torch.ones(hidden_size)) - self.variance_epsilon = eps - - def forward(self, hidden_states): - input_dtype = hidden_states.dtype - variance = hidden_states.to(torch.float32).pow(2).mean(-1, keepdim=True) - hidden_states = hidden_states * torch.rsqrt(variance + self.variance_epsilon) - - return (self.weight * hidden_states).to(input_dtype) - - -class LlamaRotaryEmbedding(torch.nn.Module): - def __init__(self, dim, max_position_embeddings=2048, base=10000, device=None): - super().__init__() - inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float().to(device) / dim)) - self.register_buffer("inv_freq", inv_freq) - - # Build here to make `torch.jit.trace` work. - self.max_seq_len_cached = max_position_embeddings - t = torch.arange( - self.max_seq_len_cached, - device=self.inv_freq.device, - dtype=self.inv_freq.dtype, - ) - freqs = torch.einsum("i,j->ij", t, self.inv_freq) - # Different from paper, but it uses a different permutation in order to obtain the same calculation - emb = torch.cat((freqs, freqs), dim=-1) - self.register_buffer( - "cos_cached", emb.cos()[None, None, :, :], persistent=False - ) - self.register_buffer( - "sin_cached", emb.sin()[None, None, :, :], persistent=False - ) - - def forward(self, x, seq_len=None): - # x: [bs, num_attention_heads, seq_len, head_size] - # This `if` block is unlikely to be run after we build sin/cos in `__init__`. Keep the logic here just in case. 
- if seq_len > self.max_seq_len_cached: - self.max_seq_len_cached = seq_len - t = torch.arange( - self.max_seq_len_cached, device=x.device, dtype=self.inv_freq.dtype - ) - freqs = torch.einsum("i,j->ij", t, self.inv_freq) - # Different from paper, but it uses a different permutation in order to obtain the same calculation - emb = torch.cat((freqs, freqs), dim=-1).to(x.device) - self.register_buffer( - "cos_cached", emb.cos()[None, None, :, :], persistent=False - ) - self.register_buffer( - "sin_cached", emb.sin()[None, None, :, :], persistent=False - ) - return ( - self.cos_cached[:, :, :seq_len, ...].to(dtype=x.dtype), - self.sin_cached[:, :, :seq_len, ...].to(dtype=x.dtype), - ) - - -def rotate_half(x): - """Rotates half the hidden dims of the input.""" - x1 = x[..., : x.shape[-1] // 2] - x2 = x[..., x.shape[-1] // 2 :] - return torch.cat((-x2, x1), dim=-1) - - -def apply_rotary_pos_emb(q, k, cos, sin, position_ids): - # The first two dimensions of cos and sin are always 1, so we can `squeeze` them. - cos = cos[0, 0, :, :] # [seq_len, dim] - sin = sin[0, 0, :, :] # [seq_len, dim] - cos = cos[position_ids].unsqueeze(1) # [bs, 1, seq_len, dim] - sin = sin[position_ids].unsqueeze(1) # [bs, 1, seq_len, dim] - q_embed = (q * cos) + (rotate_half(q) * sin) - k_embed = (k * cos) + (rotate_half(k) * sin) - return q_embed, k_embed - - -def apply_rotary_pos_emb_single(x, cos, sin, position_ids): - # The first two dimensions of cos and sin are always 1, so we can `squeeze` them. - cos = cos[0, 0, :, :] # [seq_len, dim] - sin = sin[0, 0, :, :] # [seq_len, dim] - cos = cos[position_ids].unsqueeze(1) # [bs, 1, seq_len, dim] - sin = sin[position_ids].unsqueeze(1) # [bs, 1, seq_len, dim] - x_embed = (x * cos) + (rotate_half(x) * sin) - return x_embed - - -def apply_rope_single(x, rope_vals: Tuple[torch.Tensor, torch.Tensor]): - """ - Based on FacebookResearch's llama, provided by Carl - """ - rope_real = rope_vals[0] # shape should be 1, 1, seqlen, head_dim/2 - rope_im = rope_vals[1] # shape should be 1, 1, seqlen, head_dim/2 - - # TODO: Why HF uses different coordinates from the paper - x_real = x[:, :, :, : x.shape[-1] // 2] # extract first half elements - x_im = x[:, :, :, x.shape[-1] // 2 :] # extract second half elements - - x_prod_real = x_real * rope_real - x_im * rope_im - x_prod_im = x_real * rope_im + x_im * rope_real - - # TODO: HF needs to use different interleaving - x = torch.cat((x_prod_real, x_prod_im), dim=3).view(*x.shape) - return x - - -class LlamaMLP(nn.Module): - def __init__( - self, - hidden_size: int, - intermediate_size: int, - hidden_act: str, - ): - super().__init__() - self.gate_proj = nn.Linear(hidden_size, intermediate_size, bias=False) - self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=False) - self.up_proj = nn.Linear(hidden_size, intermediate_size, bias=False) - self.act_fn = ACT2FN[hidden_act] - self.hidden_size = hidden_size - self.intermediate_size = intermediate_size - - def prepare_conv(self): - if not hasattr(self, "forward_linear"): - self.gate_proj_conv = nn.Conv2d( - self.hidden_size, self.intermediate_size, 1, bias=False - ) - self.down_proj_conv = nn.Conv2d( - self.intermediate_size, self.hidden_size, 1, bias=False - ) - self.up_proj_conv = nn.Conv2d( - self.hidden_size, self.intermediate_size, 1, bias=False - ) - self.forward_linear = self.forward - self.forward = self.forward_conv - - self.gate_proj_conv.weight.data.copy_( - self.gate_proj.weight[:, :, None, None] - ) - self.down_proj_conv.weight.data.copy_( - self.down_proj.weight[:, :, None,
None] - ) - self.up_proj_conv.weight.data.copy_(self.up_proj.weight[:, :, None, None]) - - del self.gate_proj - del self.down_proj - del self.up_proj - - def forward_conv(self, x): - bsz, _, _ = x.size() - - x = torch.reshape(x, (bsz, -1, 1, self.hidden_size)) - x = x.transpose(1, 3) # Transpose right before and after Conv - x = self.down_proj_conv( - self.act_fn(self.gate_proj_conv(x)) * self.up_proj_conv(x) - ) - x = x.transpose(1, 3) - x = torch.reshape(x, (bsz, -1, self.hidden_size)) - - return x - - def forward(self, x): - return self.down_proj(self.act_fn(self.gate_proj(x)) * self.up_proj(x)) - - -# Copied from transformers.models.llama.modeling_llama.repeat_kv -def repeat_kv(hidden_states: torch.Tensor, n_rep: int) -> torch.Tensor: - """ - This is the equivalent of torch.repeat_interleave(x, dim=1, repeats=n_rep). The hidden states go from (batch, - num_key_value_heads, seqlen, head_dim) to (batch, num_attention_heads, seqlen, head_dim) - """ - if isinstance(hidden_states, list): - return [head for head in hidden_states for _ in range(n_rep)] - - batch, num_key_value_heads, slen, head_dim = hidden_states.shape - if n_rep == 1: - return hidden_states - hidden_states = hidden_states[:, :, None, :, :].expand( - batch, num_key_value_heads, n_rep, slen, head_dim - ) - return hidden_states.reshape(batch, num_key_value_heads * n_rep, slen, head_dim) - - -class LlamaAttention(nn.Module): - """Multi-headed attention from 'Attention Is All You Need' paper""" - - def __init__(self, config: LlamaConfig): - super().__init__() - self.config = config - self.hidden_size = config.hidden_size - self.num_heads = config.num_attention_heads - self.num_key_value_heads = ( - config.num_key_value_heads - if hasattr(config, "num_key_value_heads") - else self.num_heads - ) - self.num_key_value_groups = self.num_heads // self.num_key_value_heads - self.head_dim = self.hidden_size // self.num_heads - self.max_position_embeddings = config.max_position_embeddings - - if (self.head_dim * self.num_heads) != self.hidden_size: - raise ValueError( - f"hidden_size must be divisible by num_heads (got `hidden_size`: {self.hidden_size}" - f" and `num_heads`: {self.num_heads})." 
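In this model, eight key/value heads serve the 32 query heads (grouped-query attention), so `repeat_kv` expands each cached K/V head 4x before the attention matmul; the list branch does the same for the split-head path by duplicating list entries. A sketch of the expand-and-reshape on the tensor path, with sizes assumed from the 8B configuration:

```python
import torch

batch, kv_heads, seqlen, head_dim = 1, 8, 16, 128
n_rep = 32 // kv_heads  # num_attention_heads // num_key_value_heads

k = torch.randn(batch, kv_heads, seqlen, head_dim)
k_rep = k[:, :, None, :, :].expand(batch, kv_heads, n_rep, seqlen, head_dim)
k_rep = k_rep.reshape(batch, kv_heads * n_rep, seqlen, head_dim)

assert k_rep.shape == (1, 32, 16, 128)
# Matches the torch.repeat_interleave equivalence noted in the docstring.
assert torch.equal(k_rep, torch.repeat_interleave(k, n_rep, dim=1))
```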
- ) - self.q_proj = nn.Linear( - self.hidden_size, self.num_heads * self.head_dim, bias=False - ) - self.k_proj = nn.Linear( - self.hidden_size, self.num_key_value_heads * self.head_dim, bias=False - ) - self.v_proj = nn.Linear( - self.hidden_size, self.num_key_value_heads * self.head_dim, bias=False - ) - self.o_proj = nn.Linear( - self.num_heads * self.head_dim, self.hidden_size, bias=False - ) - self.rotary_emb = LlamaRotaryEmbedding( - self.head_dim, - max_position_embeddings=self.max_position_embeddings, - base=getattr(config, "rope_theta", 10000.0), - ) - self.mask_neg = config.mask_neg - self.return_new_key_value_only = ( - config.return_new_key_value_only - if hasattr(config, "return_new_key_value_only") - else False - ) - - def _shape(self, tensor: torch.Tensor, seq_len: int, bsz: int): - return ( - tensor.view(bsz, seq_len, self.num_heads, self.head_dim) - .transpose(1, 2) - .contiguous() - ) - - def prepare_conv(self): - if not hasattr(self, "forward_no_conv"): - self.q_proj_conv = nn.Conv2d( - self.hidden_size, self.num_heads * self.head_dim, 1, bias=False - ) - self.k_proj_conv = nn.Conv2d( - self.hidden_size, - self.num_key_value_heads * self.head_dim, - 1, - bias=False, - ) - self.v_proj_conv = nn.Conv2d( - self.hidden_size, - self.num_key_value_heads * self.head_dim, - 1, - bias=False, - ) - self.o_proj_conv = nn.Conv2d( - self.num_heads * self.head_dim, self.hidden_size, 1, bias=False - ) - - self.forward_no_conv = self.forward - self.forward = self.forward_conv - - self.q_proj_conv.weight.data.copy_(self.q_proj.weight[:, :, None, None]) - self.k_proj_conv.weight.data.copy_(self.k_proj.weight[:, :, None, None]) - self.v_proj_conv.weight.data.copy_(self.v_proj.weight[:, :, None, None]) - self.o_proj_conv.weight.data.copy_(self.o_proj.weight[:, :, None, None]) - - del self.q_proj - del self.k_proj - del self.v_proj - del self.o_proj - - def prepare_sha(self): - if not hasattr(self, "forward_mha"): - self.q_proj_sha = nn.ModuleList( - [ - nn.Conv2d(self.hidden_size, self.head_dim, 1, bias=False) - for _ in range(self.num_heads) - ] - ) - self.k_proj_sha = nn.ModuleList( - [ - nn.Conv2d(self.hidden_size, self.head_dim, 1, bias=False) - for _ in range(self.num_key_value_heads) - ] - ) - self.v_proj_sha = nn.ModuleList( - [ - nn.Conv2d(self.hidden_size, self.head_dim, 1, bias=False) - for _ in range(self.num_key_value_heads) - ] - ) - if not hasattr(self, "o_proj_conv"): - self.o_proj_conv = nn.Conv2d( - self.num_heads * self.head_dim, self.hidden_size, 1, bias=False - ) - self.o_proj_conv.weight.data.copy_(self.o_proj.weight[:, :, None, None]) - del self.o_proj - - self.forward_mha = self.forward - self.forward = self.forward_sha - - for i in range(self.num_heads): - self.q_proj_sha[i].weight.data.copy_( - self.q_proj_conv.weight[i * self.head_dim : (i + 1) * self.head_dim, :] - ) - - for i in range(self.num_key_value_heads): - self.k_proj_sha[i].weight.data.copy_( - self.k_proj_conv.weight[i * self.head_dim : (i + 1) * self.head_dim, :] - ) - self.v_proj_sha[i].weight.data.copy_( - self.v_proj_conv.weight[i * self.head_dim : (i + 1) * self.head_dim, :] - ) - - del self.q_proj_conv - del self.k_proj_conv - del self.v_proj_conv - - def forward_sha( - self, - hidden_states: torch.Tensor, - attention_mask: Optional[torch.Tensor] = None, - position_ids: Optional[torch.LongTensor] = None, - past_key_value: Optional[Tuple[torch.Tensor]] = None, - output_attentions: bool = False, - use_cache: bool = False, - ) -> Tuple[torch.Tensor, Optional[torch.Tensor], 
Optional[Tuple[torch.Tensor]]]: - - bsz, q_len, _ = hidden_states.size() - - hidden_states = torch.reshape(hidden_states, (bsz, -1, 1, self.hidden_size)) - hidden_states = hidden_states.transpose(1, 3) - - query_states = [ - q_proj(hidden_states).permute(0, 2, 3, 1) for q_proj in self.q_proj_sha - ] - key_states = [ - k_proj(hidden_states).permute(0, 2, 3, 1) for k_proj in self.k_proj_sha - ] - value_states = [ - v_proj(hidden_states).permute(0, 2, 3, 1) for v_proj in self.v_proj_sha - ] - - kv_seq_len = value_states[0].shape[-2] - if past_key_value is not None: - kv_seq_len += past_key_value[1][0].shape[-2] - - if isinstance(position_ids, (tuple, list)): - rope_embedding = position_ids - query_states = [apply_rope_single(q, rope_embedding) for q in query_states] - key_states = [apply_rope_single(k, rope_embedding) for k in key_states] - else: - cos, sin = self.rotary_emb(value_states[0], kv_seq_len) - - query_states = [ - apply_rotary_pos_emb_single(q, cos, sin, position_ids) - for q in query_states - ] - key_states = [ - apply_rotary_pos_emb_single(k, cos, sin, position_ids) - for k in key_states - ] - - key_states = [k.transpose(2, 3) for k in key_states] - if self.return_new_key_value_only: - present_key_value = ( - (tuple(key_states), tuple(value_states)) if use_cache else None - ) - - if past_key_value is not None: - # reuse k, v, self_attention - past_key, past_value = past_key_value - key_states = [ - torch.cat([pk, k], dim=3) for pk, k in zip(past_key, key_states) - ] - value_states = [ - torch.cat([pv, v], dim=2) for pv, v in zip(past_value, value_states) - ] - - if not self.return_new_key_value_only: - present_key_value = ( - (tuple(key_states), tuple(value_states)) if use_cache else None - ) - - key_states = repeat_kv(key_states, self.num_key_value_groups) - value_states = repeat_kv(value_states, self.num_key_value_groups) - - attn_weights = [ - torch.matmul(q, k) / math.sqrt(self.head_dim) - for q, k in zip(query_states, key_states) - ] - if attn_weights[0].size() != (bsz, 1, q_len, kv_seq_len): - raise ValueError( - f"Attention weights should be of size {(bsz, 1, q_len, kv_seq_len)}, but is" - f" {attn_weights[0].size()}" - ) - - if attention_mask is not None: - if attention_mask.size() != (bsz, 1, q_len, kv_seq_len): - raise ValueError( - f"Attention mask should be of size {(bsz, 1, q_len, kv_seq_len)}, but is {attention_mask.size()}" - ) - attn_weights = [aw + attention_mask for aw in attn_weights] - - # upcast attention to fp32 - attn_weights = [ - nn.functional.softmax(aw, dim=-1, dtype=torch.float32).to( - query_states[0].dtype - ) - for aw in attn_weights - ] - attn_output = [torch.matmul(aw, v) for aw, v in zip(attn_weights, value_states)] - - if attn_output[0].size() != (bsz, 1, q_len, self.head_dim): - raise ValueError( - f"`attn_output` should be of size {(bsz, 1, q_len, self.head_dim)}, but is" - f" {attn_output[0].size()}" - ) - - attn_output = torch.cat(attn_output, dim=3) - attn_output = attn_output.permute(0, 3, 1, 2) - attn_output = self.o_proj_conv(attn_output) - attn_output = attn_output.transpose(1, 3) - attn_output = attn_output.reshape(bsz, q_len, self.hidden_size) - - if not output_attentions: - attn_weights = None - - return attn_output, attn_weights, present_key_value - - def forward_conv( - self, - hidden_states: torch.Tensor, - attention_mask: Optional[torch.Tensor] = None, - position_ids: Optional[torch.LongTensor] = None, - past_key_value: Optional[Tuple[torch.Tensor]] = None, - output_attentions: bool = False, - use_cache: bool = False, - ) -> 
Tuple[torch.Tensor, Optional[torch.Tensor], Optional[Tuple[torch.Tensor]]]: - bsz, q_len, _ = hidden_states.size() - - hidden_states = torch.reshape( - hidden_states, (bsz, q_len, 1, self.hidden_size) - ).transpose(1, 3) - - query_states = self.q_proj_conv(hidden_states) - key_states = self.k_proj_conv(hidden_states) - value_states = self.v_proj_conv(hidden_states) - - query_states = query_states.reshape( - bsz, self.num_heads, self.head_dim, q_len - ).transpose(2, 3) - key_states = key_states.reshape( - bsz, self.num_key_value_heads, self.head_dim, q_len - ).transpose(2, 3) - value_states = value_states.reshape( - bsz, self.num_key_value_heads, self.head_dim, q_len - ).transpose(2, 3) - - kv_seq_len = key_states.shape[-2] - if past_key_value is not None: - dim = 3 if self.config.transposed_key_cache else 2 - kv_seq_len += past_key_value[0].shape[dim] - - if isinstance(position_ids, (tuple, list)): - rope_embedding = position_ids - query_states = apply_rope_single(query_states, rope_embedding) - key_states = apply_rope_single(key_states, rope_embedding) - else: - cos, sin = self.rotary_emb(value_states, kv_seq_len) - query_states, key_states = apply_rotary_pos_emb( - query_states, key_states, cos, sin, position_ids - ) - # [bsz, nh, t, hd] - - if self.config.transposed_key_cache: - key_states = key_states.transpose(2, 3) - - if self.return_new_key_value_only: - present_key_value = (key_states, value_states) if use_cache else None - - if past_key_value is not None: - # reuse k, v, self_attention - dim = 3 if self.config.transposed_key_cache else 2 - key_states = torch.cat([past_key_value[0], key_states], dim=dim) - value_states = torch.cat([past_key_value[1], value_states], dim=2) - - if not self.return_new_key_value_only: - present_key_value = (key_states, value_states) if use_cache else None - - key_states = repeat_kv(key_states, self.num_key_value_groups) - value_states = repeat_kv(value_states, self.num_key_value_groups) - - if self.config.transposed_key_cache: - attn_weights = torch.matmul(query_states, key_states) / math.sqrt( - self.head_dim - ) - else: - attn_weights = torch.matmul( - query_states, key_states.transpose(2, 3) - ) / math.sqrt(self.head_dim) - - if attn_weights.size() != (bsz, self.num_heads, q_len, kv_seq_len): - raise ValueError( - f"Attention weights should be of size {(bsz, self.num_heads, q_len, kv_seq_len)}, but is" - f" {attn_weights.size()}" - ) - - if attention_mask is not None: - if attention_mask.size() != (bsz, 1, q_len, kv_seq_len): - raise ValueError( - f"Attention mask should be of size {(bsz, 1, q_len, kv_seq_len)}, but is {attention_mask.size()}" - ) - attn_weights = attn_weights + attention_mask - - # upcast attention to fp32 - attn_weights = nn.functional.softmax( - attn_weights, dim=-1, dtype=torch.float32 - ).to(query_states.dtype) - attn_output = torch.matmul(attn_weights, value_states) - - if attn_output.size() != (bsz, self.num_heads, q_len, self.head_dim): - raise ValueError( - f"`attn_output` should be of size {(bsz, self.num_heads, q_len, self.head_dim)}, but is" - f" {attn_output.size()}" - ) - - attn_output = attn_output.transpose(1, 2) - attn_output = attn_output.reshape(bsz, q_len, 1, self.hidden_size) - attn_output = attn_output.transpose(1, 3) - attn_output = self.o_proj_conv(attn_output) - attn_output = attn_output.transpose(1, 3) - attn_output = attn_output.reshape(bsz, q_len, self.hidden_size) - - if not output_attentions: - attn_weights = None - - return attn_output, attn_weights, present_key_value - - def forward( - self, - 
hidden_states: torch.Tensor, - attention_mask: Optional[torch.Tensor] = None, - position_ids: Optional[torch.LongTensor] = None, - past_key_value: Optional[Tuple[torch.Tensor]] = None, - output_attentions: bool = False, - use_cache: bool = False, - ) -> Tuple[torch.Tensor, Optional[torch.Tensor], Optional[Tuple[torch.Tensor]]]: - bsz, q_len, _ = hidden_states.size() - - query_states = self.q_proj(hidden_states) - key_states = self.k_proj(hidden_states) - value_states = self.v_proj(hidden_states) - - query_states = query_states.view( - bsz, q_len, self.num_heads, self.head_dim - ).transpose(1, 2) - key_states = key_states.view( - bsz, q_len, self.num_key_value_heads, self.head_dim - ).transpose(1, 2) - value_states = value_states.view( - bsz, q_len, self.num_key_value_heads, self.head_dim - ).transpose(1, 2) - - kv_seq_len = key_states.shape[-2] - if past_key_value is not None: - kv_seq_len += past_key_value[1].shape[-2] - - if isinstance(position_ids, (tuple, list)): - rope_embedding = position_ids - query_states = apply_rope_single(query_states, rope_embedding) - key_states = apply_rope_single(key_states, rope_embedding) - else: - cos, sin = self.rotary_emb(value_states, kv_seq_len) - query_states, key_states = apply_rotary_pos_emb( - query_states, key_states, cos, sin, position_ids - ) - # [bsz, nh, t, hd] - - if self.config.transposed_key_cache: - key_states = key_states.transpose(2, 3) - - if self.return_new_key_value_only: - present_key_value = (key_states, value_states) if use_cache else None - - if past_key_value is not None: - # reuse k, v, self_attention - dim = 3 if self.config.transposed_key_cache else 2 - key_states = torch.cat([past_key_value[0], key_states], dim=dim) - value_states = torch.cat([past_key_value[1], value_states], dim=2) - - if not self.return_new_key_value_only: - present_key_value = (key_states, value_states) if use_cache else None - - key_states = repeat_kv(key_states, self.num_key_value_groups) - value_states = repeat_kv(value_states, self.num_key_value_groups) - - if self.config.transposed_key_cache: - attn_weights = torch.matmul(query_states, key_states) / math.sqrt( - self.head_dim - ) - else: - attn_weights = torch.matmul( - query_states, key_states.transpose(2, 3) - ) / math.sqrt(self.head_dim) - - if attn_weights.size() != (bsz, self.num_heads, q_len, kv_seq_len): - raise ValueError( - f"Attention weights should be of size {(bsz, self.num_heads, q_len, kv_seq_len)}, but is" - f" {attn_weights.size()}" - ) - - if attention_mask is not None: - if attention_mask.size() != (bsz, 1, q_len, kv_seq_len): - raise ValueError( - f"Attention mask should be of size {(bsz, 1, q_len, kv_seq_len)}, but is {attention_mask.size()}" - ) - attn_weights = attn_weights + attention_mask - - # upcast attention to fp32 - attn_weights = nn.functional.softmax( - attn_weights, dim=-1, dtype=torch.float32 - ).to(query_states.dtype) - attn_output = torch.matmul(attn_weights, value_states) - - if attn_output.size() != (bsz, self.num_heads, q_len, self.head_dim): - raise ValueError( - f"`attn_output` should be of size {(bsz, self.num_heads, q_len, self.head_dim)}, but is" - f" {attn_output.size()}" - ) - - attn_output = attn_output.transpose(1, 2) - - attn_output = attn_output.reshape(bsz, q_len, self.hidden_size) - attn_output = self.o_proj(attn_output) - - if not output_attentions: - attn_weights = None - - return attn_output, attn_weights, present_key_value - - -class LlamaDecoderLayer(nn.Module): - def __init__(self, config: LlamaConfig): - super().__init__() - self.hidden_size 
= config.hidden_size - self.self_attn = LlamaAttention(config=config) - self.mlp = LlamaMLP( - hidden_size=self.hidden_size, - intermediate_size=config.intermediate_size, - hidden_act=config.hidden_act, - ) - self.input_layernorm = LlamaRMSNorm(config.hidden_size, eps=config.rms_norm_eps) - self.post_attention_layernorm = LlamaRMSNorm( - config.hidden_size, eps=config.rms_norm_eps - ) - - def forward( - self, - hidden_states: torch.Tensor, - attention_mask: Optional[torch.Tensor] = None, - position_ids: Optional[torch.LongTensor] = None, - past_key_value: Optional[Tuple[torch.Tensor]] = None, - output_attentions: Optional[bool] = False, - use_cache: Optional[bool] = False, - ) -> Tuple[ - torch.FloatTensor, Optional[Tuple[torch.FloatTensor, torch.FloatTensor]] - ]: - """ - Args: - hidden_states (`torch.FloatTensor`): input to the layer of shape `(batch, seq_len, embed_dim)` - attention_mask (`torch.FloatTensor`, *optional*): attention mask of size - `(batch, 1, tgt_len, src_len)` where padding elements are indicated by very large negative values. - output_attentions (`bool`, *optional*): - Whether or not to return the attentions tensors of all attention layers. See `attentions` under - returned tensors for more detail. - use_cache (`bool`, *optional*): - If set to `True`, `past_key_values` key value states are returned and can be used to speed up decoding - (see `past_key_values`). - past_key_value (`Tuple(torch.FloatTensor)`, *optional*): cached past key and value projection states - """ - - residual = hidden_states - - hidden_states = self.input_layernorm(hidden_states) - - # Self Attention - hidden_states, self_attn_weights, present_key_value = self.self_attn( - hidden_states=hidden_states, - attention_mask=attention_mask, - position_ids=position_ids, - past_key_value=past_key_value, - output_attentions=output_attentions, - use_cache=use_cache, - ) - hidden_states = residual + hidden_states - - # Fully Connected - residual = hidden_states - hidden_states = self.post_attention_layernorm(hidden_states) - hidden_states = self.mlp(hidden_states) - hidden_states = residual + hidden_states - - outputs = (hidden_states,) - - if output_attentions: - outputs += (self_attn_weights,) - - if use_cache: - outputs += (present_key_value,) - - return outputs - - -LLAMA_START_DOCSTRING = r""" - This model inherits from [`PreTrainedModel`]. Check the superclass documentation for the generic methods the - library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, - etc.) - This model is also a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass. - Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage - and behavior. - Parameters: - config ([`LlamaConfig`]): - Model configuration class with all the parameters of the model. Initializing with a config file does not - load the weights associated with the model, only the configuration. Check out the - [`~PreTrainedModel.from_pretrained`] method to load the model weights.
-""" - - -@add_start_docstrings( - "The bare LLaMA Model outputting raw hidden-states without any specific head on top.", - LLAMA_START_DOCSTRING, -) -class LlamaPreTrainedModel(PreTrainedModel): - config_class = LlamaConfig - base_model_prefix = "model" - supports_gradient_checkpointing = True - _no_split_modules = ["LlamaDecoderLayer"] - _skip_keys_device_placement = "past_key_values" - _keys_to_ignore_on_load_unexpected = [r"decoder\.version"] - - def _init_weights(self, module): - std = self.config.initializer_range - if isinstance(module, nn.Linear): - module.weight.data.normal_(mean=0.0, std=std) - if module.bias is not None: - module.bias.data.zero_() - elif isinstance(module, nn.Embedding): - module.weight.data.normal_(mean=0.0, std=std) - if module.padding_idx is not None: - module.weight.data[module.padding_idx].zero_() - - def _set_gradient_checkpointing(self, module, value=False): - if isinstance(module, LlamaModel): - module.gradient_checkpointing = value - - -LLAMA_INPUTS_DOCSTRING = r""" - Args: - input_ids (`torch.LongTensor` of shape `(batch_size, sequence_length)`): - Indices of input sequence tokens in the vocabulary. Padding will be ignored by default should you provide - it. - Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.encode`] and - [`PreTrainedTokenizer.__call__`] for details. - [What are input IDs?](../glossary#input-ids) - attention_mask (`torch.Tensor` of shape `(batch_size, sequence_length)`, *optional*): - Mask to avoid performing attention on padding token indices. Mask values selected in `[0, 1]`: - - 1 for tokens that are **not masked**, - - 0 for tokens that are **masked**. - [What are attention masks?](../glossary#attention-mask) - Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.encode`] and - [`PreTrainedTokenizer.__call__`] for details. - If `past_key_values` is used, optionally only the last `decoder_input_ids` have to be input (see - `past_key_values`). - If you want to change padding behavior, you should read [`modeling_opt._prepare_decoder_attention_mask`] - and modify to your needs. See diagram 1 in [the paper](https://arxiv.org/abs/1910.13461) for more - information on the default strategy. - - 1 indicates the head is **not masked**, - - 0 indicates the head is **masked**. - position_ids (`torch.LongTensor` of shape `(batch_size, sequence_length)`, *optional*): - Indices of positions of each input sequence tokens in the position embeddings. Selected in the range `[0, - config.n_positions - 1]`. - [What are position IDs?](../glossary#position-ids) - past_key_values (`tuple(tuple(torch.FloatTensor))`, *optional*, returned when `use_cache=True` is passed or when `config.use_cache=True`): - Tuple of `tuple(torch.FloatTensor)` of length `config.n_layers`, with each tuple having 2 tensors of shape - `(batch_size, num_heads, sequence_length, embed_size_per_head)`) and 2 additional tensors of shape - `(batch_size, num_heads, encoder_sequence_length, embed_size_per_head)`. - Contains pre-computed hidden-states (key and values in the self-attention blocks and in the cross-attention - blocks) that can be used (see `past_key_values` input) to speed up sequential decoding. - If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those that - don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of all - `decoder_input_ids` of shape `(batch_size, sequence_length)`. 
- inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*): - Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation. This - is useful if you want more control over how to convert `input_ids` indices into associated vectors than the - model's internal embedding lookup matrix. - use_cache (`bool`, *optional*): - If set to `True`, `past_key_values` key value states are returned and can be used to speed up decoding (see - `past_key_values`). - output_attentions (`bool`, *optional*): - Whether or not to return the attentions tensors of all attention layers. See `attentions` under returned - tensors for more detail. - output_hidden_states (`bool`, *optional*): - Whether or not to return the hidden states of all layers. See `hidden_states` under returned tensors for - more detail. - return_dict (`bool`, *optional*): - Whether or not to return a [`~utils.ModelOutput`] instead of a plain tuple. -""" - - -@add_start_docstrings( - "The bare LLaMA Model outputting raw hidden-states without any specific head on top.", - LLAMA_START_DOCSTRING, -) -class LlamaModel(LlamaPreTrainedModel): - """ - Transformer decoder consisting of *config.num_hidden_layers* layers. Each layer is a [`LlamaDecoderLayer`] - Args: - config: LlamaConfig - """ - - def __init__(self, config: LlamaConfig): - super().__init__(config) - self.padding_idx = config.pad_token_id - self.vocab_size = config.vocab_size - - self.embed_tokens = nn.Embedding( - config.vocab_size, config.hidden_size, self.padding_idx - ) - # self.layers = nn.ModuleList([LlamaDecoderLayer(config) for _ in range(config.num_hidden_layers)]) - ### ------- QCOM EDITS STARTS ------- ### - self.layers = nn.ModuleList( - [ - LlamaDecoderLayer(config) - if config.hidden_layers_start <= i < config.hidden_layers_end - else nn.Identity() - for i in range(config.num_hidden_layers) - ] - ) - ### ------- QCOM EDITS ENDS ------- ### - self.norm = LlamaRMSNorm(config.hidden_size, eps=config.rms_norm_eps) - - self.gradient_checkpointing = False - # Initialize weights and apply final processing - self.post_init() - self.mask_neg = config.mask_neg - - def get_input_embeddings(self): - return self.embed_tokens - - def set_input_embeddings(self, value): - self.embed_tokens = value - - # Copied from transformers.models.bart.modeling_bart.BartDecoder._prepare_decoder_attention_mask - @staticmethod - def _prepare_decoder_attention_mask( - attention_mask, - input_shape, - inputs_embeds, - past_key_values_length, - mask_neg=-100.0, - ): - # create causal mask - # [bsz, seq_len] -> [bsz, 1, tgt_seq_len, src_seq_len] - combined_attention_mask = None - if input_shape[-1] > 1: - combined_attention_mask = _make_causal_mask( - input_shape, - inputs_embeds.dtype, - device=inputs_embeds.device, - past_key_values_length=past_key_values_length, - mask_neg=mask_neg, - ) - - if attention_mask is not None: - # [bsz, seq_len] -> [bsz, 1, tgt_seq_len, src_seq_len] - expanded_attn_mask = _expand_mask( - attention_mask, - inputs_embeds.dtype, - tgt_len=input_shape[-1], - mask_neg=mask_neg, - ).to(inputs_embeds.device) - combined_attention_mask = ( - expanded_attn_mask - if combined_attention_mask is None - else expanded_attn_mask + combined_attention_mask - ) - - return combined_attention_mask - - @add_start_docstrings_to_model_forward(LLAMA_INPUTS_DOCSTRING) - def forward( - self, - input_ids: torch.LongTensor = None, - attention_mask: Optional[torch.Tensor] = None, - position_ids: Optional[torch.LongTensor] = 
None, - past_key_values: Optional[List[torch.FloatTensor]] = None, - inputs_embeds: Optional[torch.FloatTensor] = None, - use_cache: Optional[bool] = None, - output_attentions: Optional[bool] = None, - output_hidden_states: Optional[bool] = None, - return_dict: Optional[bool] = None, - ) -> Union[Tuple, BaseModelOutputWithPast]: - output_attentions = ( - output_attentions - if output_attentions is not None - else self.config.output_attentions - ) - output_hidden_states = ( - output_hidden_states - if output_hidden_states is not None - else self.config.output_hidden_states - ) - use_cache = use_cache if use_cache is not None else self.config.use_cache - use_combined_mask_input = self.config.use_combined_mask_input - - return_dict = ( - return_dict if return_dict is not None else self.config.use_return_dict - ) - - # retrieve input_ids and inputs_embeds - # if input_ids is not None and inputs_embeds is not None: - # raise ValueError("You cannot specify both decoder_input_ids and decoder_inputs_embeds at the same time") - # elif input_ids is not None: - # batch_size, seq_length = input_ids.shape - # elif inputs_embeds is not None: - # batch_size, seq_length, _ = inputs_embeds.shape - # else: - # raise ValueError("You have to specify either decoder_input_ids or decoder_inputs_embeds") - # retrieve input_ids and inputs_embeds - if input_ids is not None and inputs_embeds is not None: - raise ValueError( - "You cannot specify both decoder_input_ids and decoder_inputs_embeds at the same time" - ) - - ### ------- QCOM EDITS STARTS ------- ### - # Combined attention mask expand attention mask to rank-4 - # [ bsz, 1, tgt_seq_len, src_seq_len ] - # check attention mask shape and fetch sequence length correctly. - elif attention_mask is not None: - attention_shape = attention_mask.shape - batch_size = attention_shape[0] - seq_length = ( - attention_shape[-2] - if len(attention_shape) == 4 - else attention_shape[-1] - ) - - ### ------- QCOM EDITS ENDS ------- ### - elif inputs_embeds is not None: - batch_size, seq_length, _ = inputs_embeds.shape - else: - raise ValueError( - "You have to specify either decoder_input_ids or decoder_inputs_embeds" - ) - - seq_length_with_past = seq_length - past_key_values_length = 0 - - if past_key_values is not None: - past_key_values_length = past_key_values[0][1][0].shape[-2] - seq_length_with_past = seq_length_with_past + past_key_values_length - - if position_ids is None: - device = input_ids.device if input_ids is not None else inputs_embeds.device - position_ids = torch.arange( - past_key_values_length, - seq_length + past_key_values_length, - dtype=torch.long, - device=device, - ) - position_ids = position_ids.unsqueeze(0).view(-1, seq_length) - elif isinstance(position_ids, (tuple, list)): - # don't make position_ids - pass - else: - position_ids = position_ids.view(-1, seq_length).long() - - ### ------- QCOM EDITS STARTS ------- ### - if self.config.split_model is None or self.config.split_model == 1: - if inputs_embeds is None: - inputs_embeds = self.embed_tokens(input_ids) - # embed positions - ### ------- QCOM EDITS ENDS ------- ### - - # if use_combined_mask_input, then attention mask is prepared outside the model - if not use_combined_mask_input: - if attention_mask is None: - attention_mask = torch.ones( - (batch_size, seq_length_with_past), - dtype=torch.bool, - device=inputs_embeds.device, - ) - attention_mask = self._prepare_decoder_attention_mask( - attention_mask, - (batch_size, seq_length), - inputs_embeds, - past_key_values_length, - self.mask_neg, 
- ) - - ### ------- QCOM EDITS STARTS ------- ### - if self.config.split_model is None or self.config.split_model == 1: - hidden_states = inputs_embeds - else: - hidden_states = input_ids - ### ------- QCOM EDITS ENDS ------- ### - - if self.gradient_checkpointing and self.training: - if use_cache: - logger.warning_once( - "`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`..." - ) - use_cache = False - - # decoder layers - all_hidden_states = () if output_hidden_states else None - all_self_attns = () if output_attentions else None - next_decoder_cache = () if use_cache else None - - for idx, decoder_layer in enumerate(self.layers): - if output_hidden_states: - all_hidden_states += (hidden_states,) - - past_key_value = ( - past_key_values[idx] if past_key_values is not None else None - ) - - if self.gradient_checkpointing and self.training: - - def create_custom_forward(module): - def custom_forward(*inputs): - # None for past_key_value - return module(*inputs, output_attentions, None) - - return custom_forward - - layer_outputs = torch.utils.checkpoint.checkpoint( - create_custom_forward(decoder_layer), - hidden_states, - attention_mask, - position_ids, - None, - ) - else: - layer_outputs = decoder_layer( - hidden_states, - attention_mask=attention_mask, - position_ids=position_ids, - past_key_value=past_key_value, - output_attentions=output_attentions, - use_cache=use_cache, - ) - - hidden_states = layer_outputs[0] - - if use_cache: - next_decoder_cache += (layer_outputs[2 if output_attentions else 1],) - - if output_attentions: - all_self_attns += (layer_outputs[1],) - - ### ------- QCOM EDITS STARTS ------- ### - if self.config.split_model is None or self.config.split_model == 5: - hidden_states = self.norm(hidden_states) - ### ------- QCOM EDITS ENDS ------- ### - - # add hidden states from the last decoder layer - if output_hidden_states: - all_hidden_states += (hidden_states,) - - next_cache = next_decoder_cache if use_cache else None - if not return_dict: - return tuple( - v - for v in [hidden_states, next_cache, all_hidden_states, all_self_attns] - if v is not None - ) - return BaseModelOutputWithPast( - last_hidden_state=hidden_states, - past_key_values=next_cache, - hidden_states=all_hidden_states, - attentions=all_self_attns, - ) - - -class CustomLogitWarper(nn.Module): - """ - Customized transformers.TopKLogitsWarper: Temperature + Topk + Softmax - """ - - def __init__(self, top_k, temperature, filter_value=-float("inf")): - super().__init__() - self.top_k = top_k - self.temperature = temperature - self.filter_value = filter_value - - def forward(self, logits): - top_logits, indices = torch.topk(logits, self.top_k) - indices_to_remove = logits < top_logits[..., -1, None] - logits = logits / self.temperature - logits = logits.masked_fill(indices_to_remove, self.filter_value) - probs = nn.functional.softmax(logits, dim=-1) - return probs, indices - - -class LlamaForCausalLM(LlamaPreTrainedModel): - def __init__(self, config): - super().__init__(config) - self.model = LlamaModel(config) - - self.lm_head = nn.Linear(config.hidden_size, config.vocab_size, bias=False) - - # Initialize weights and apply final processing - self.post_init() - - self.num_logits_to_return = config.num_logits_to_return - self.return_top_k = config.return_top_k - if self.return_top_k > 0: - self.logit_warper = CustomLogitWarper( - self.return_top_k, - config.logit_temperature, - filter_value=config.mask_neg, - ) - - def prepare_conv(self): - if not hasattr(self, 
"lm_head_conv"): - self.lm_head_conv = nn.Conv2d( - self.config.hidden_size, self.config.vocab_size, 1, bias=False - ) - self.lm_head_conv.weight.data.copy_(self.lm_head.weight[:, :, None, None]) - - del self.lm_head - - def lm_head_conv_forward(self, x): - bsz, _, _ = x.size() - x = torch.reshape(x, (bsz, -1, 1, self.config.hidden_size)) - x = x.transpose(1, 3) # Transpose right before and after Conv - x = self.lm_head_conv(x) - x = x.transpose(1, 3) - x = torch.reshape(x, (bsz, -1, self.config.vocab_size)) - return x - - def get_input_embeddings(self): - return self.model.embed_tokens - - def set_input_embeddings(self, value): - self.model.embed_tokens = value - - def get_output_embeddings(self): - return self.lm_head - - def set_output_embeddings(self, new_embeddings): - self.lm_head = new_embeddings - - def set_decoder(self, decoder): - self.model = decoder - - def get_decoder(self): - return self.model - - @add_start_docstrings_to_model_forward(LLAMA_INPUTS_DOCSTRING) - @replace_return_docstrings( - output_type=CausalLMOutputWithPast, config_class=_CONFIG_FOR_DOC - ) - def forward( - self, - input_ids: torch.LongTensor = None, - attention_mask: Optional[torch.Tensor] = None, - position_ids: Optional[torch.LongTensor] = None, - past_key_values: Optional[List[torch.FloatTensor]] = None, - inputs_embeds: Optional[torch.FloatTensor] = None, - labels: Optional[torch.LongTensor] = None, - use_cache: Optional[bool] = None, - output_attentions: Optional[bool] = None, - output_hidden_states: Optional[bool] = None, - return_dict: Optional[bool] = None, - ) -> Union[Tuple, CausalLMOutputWithPast]: - r""" - Args: - labels (`torch.LongTensor` of shape `(batch_size, sequence_length)`, *optional*): - Labels for computing the masked language modeling loss. Indices should either be in `[0, ..., - config.vocab_size]` or -100 (see `input_ids` docstring). Tokens with indices set to `-100` are ignored - (masked), the loss is only computed for the tokens with labels in `[0, ..., config.vocab_size]`. - Returns: - Example: - ```python - >>> from transformers import AutoTokenizer, LlamaForCausalLM - >>> model = LlamaForCausalLM.from_pretrained(PATH_TO_CONVERTED_WEIGHTS) - >>> tokenizer = AutoTokenizer.from_pretrained(PATH_TO_CONVERTED_TOKENIZER) - >>> prompt = "Hey, are you consciours? Can you talk to me?" - >>> inputs = tokenizer(prompt, return_tensors="pt") - >>> # Generate - >>> generate_ids = model.generate(inputs.input_ids, max_length=30) - >>> tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0] - "Hey, are you consciours? Can you talk to me?\nI'm not consciours, but I can talk to you." 
- ```""" - - output_attentions = ( - output_attentions - if output_attentions is not None - else self.config.output_attentions - ) - output_hidden_states = ( - output_hidden_states - if output_hidden_states is not None - else self.config.output_hidden_states - ) - return_dict = ( - return_dict if return_dict is not None else self.config.use_return_dict - ) - - # decoder outputs consists of (dec_features, layer_state, dec_hidden, dec_attn) - outputs = self.model( - input_ids=input_ids, - attention_mask=attention_mask, - position_ids=position_ids, - past_key_values=past_key_values, - inputs_embeds=inputs_embeds, - use_cache=use_cache, - output_attentions=output_attentions, - output_hidden_states=output_hidden_states, - return_dict=return_dict, - ) - - hidden_states = outputs[0] - - ### ------- QCOM EDITS STARTS ------- ### - loss = None - if self.config.split_model is None or self.config.split_model == 5: - if self.num_logits_to_return == 0: - # return all logits by default - logits = ( - self.lm_head_conv_forward(hidden_states) - if self.config.use_conv - else self.lm_head(hidden_states) - ) - else: - # only return num_logits_to_return logits for memory efficiency - last_hidden_states = hidden_states[ - :, -self.num_logits_to_return :, : - ].contiguous() - logits = ( - self.lm_head_conv_forward(last_hidden_states) - if self.config.use_conv - else self.lm_head(last_hidden_states) - ) - - if labels is not None: - # Shift so that tokens < n predict n - all_logits = self.lm_head(hidden_states) - shift_logits = all_logits[..., :-1, :].contiguous() - shift_labels = labels[..., 1:].contiguous() - # Flatten the tokens - loss_fct = CrossEntropyLoss() - shift_logits = shift_logits.view(-1, self.config.vocab_size) - shift_labels = shift_labels.view(-1) - # Enable model parallelism - shift_labels = shift_labels.to(shift_logits.device) - loss = loss_fct(shift_logits, shift_labels) - - if self.return_top_k > 0: - probs, indices = self.logit_warper(logits) - output = (probs, indices) + outputs[1:] - return ((loss,) + output) if loss is not None else output - else: - logits = hidden_states - ### ------- QCOM EDITS ENDS ------- ### - - if not return_dict: - output = (logits,) + outputs[1:] - return (loss,) + output if loss is not None else output - - return CausalLMOutputWithPast( - loss=loss, - logits=logits, - past_key_values=outputs.past_key_values, - hidden_states=outputs.hidden_states, - attentions=outputs.attentions, - ) - - def prepare_inputs_for_generation( - self, - input_ids, - past_key_values=None, - attention_mask=None, - inputs_embeds=None, - **kwargs, - ): - if past_key_values: - input_ids = input_ids[:, -1:] - - position_ids = kwargs.get("position_ids", None) - if attention_mask is not None and position_ids is None: - # create position_ids on the fly for batch generation - position_ids = attention_mask.long().cumsum(-1) - 1 - position_ids.masked_fill_(attention_mask == 0, 1) - if past_key_values: - position_ids = position_ids[:, -1].unsqueeze(-1) - - # if `inputs_embeds` are passed, we only want to use them in the 1st generation step - if inputs_embeds is not None and past_key_values is None: - model_inputs = {"inputs_embeds": inputs_embeds} - else: - model_inputs = {"input_ids": input_ids} - - model_inputs.update( - { - "position_ids": position_ids, - "past_key_values": past_key_values, - "use_cache": kwargs.get("use_cache"), - "attention_mask": attention_mask, - } - ) - return model_inputs - - @staticmethod - def _reorder_cache(past_key_values, beam_idx): - reordered_past = () - for 
layer_past in past_key_values: - reordered_past += ( - tuple( - past_state.index_select(0, beam_idx) for past_state in layer_past - ), - ) - return reordered_past diff --git a/qai_hub_models/models/llama_v3_8b_chat_quantized/perf.yaml b/qai_hub_models/models/llama_v3_8b_chat_quantized/perf.yaml index 24c2c6f5..6a13d6da 100644 --- a/qai_hub_models/models/llama_v3_8b_chat_quantized/perf.yaml +++ b/qai_hub_models/models/llama_v3_8b_chat_quantized/perf.yaml @@ -1,173 +1,42 @@ +aggregated: + supported_devices: + - Snapdragon 8 Elite QRD + - Snapdragon X Elite CRD + supported_oses: + - Android + supported_chipsets: + - Snapdragon® 8 Elite + - Snapdragon® X Elite models: -- name: Llama3-TokenGenerator-KVCache-Quantized + name: Llama-v3-8B-Chat performance_metrics: - - reference_device_info: - name: QCS8550 (Proxy) - os: '12' - form_factor: Iot - os_name: Android - manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-10-05T00:20:11.634769Z' - torchscript_onnx_qnn: - inference_time: 99315 - throughput: 10.07 - estimated_peak_memory_range: - min: 34553856 - max: 36402280 - layer_info: - layers_on_npu: 21272 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 21272 - precision: uint16 - primary_compute_unit: NPU - job_id: 'null' - job_status: Passed - - reference_device_info: - name: Samsung Galaxy S24 - os: '14' + - torchscript_onnx_qnn: + llm_metrics: + time_to_first_token_range: + min: 159383 + max: 5100256 + tokens_per_second: 12.9262 + evaluation_metrics: null + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' form_factor: Phone os_name: Android - manufacturer: Samsung - chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-06-11T00:34:02.549319Z' - torchscript_onnx_qnn: - inference_time: 72856 - throughput: 13.72 - estimated_peak_memory_range: - min: 950272 - max: 1322707920 - layer_info: - layers_on_npu: 20765 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 20765 - precision: uint16 - primary_compute_unit: NPU - job_id: "null" - job_status: Passed - - reference_device_info: - name: Snapdragon X Elite CRD - os: '11' - form_factor: Compute - os_name: Windows manufacturer: Qualcomm - chipset: Snapdragon® X Elite - timestamp: '2024-06-12T00:34:02.549319Z' - torchscript_onnx_qnn: - inference_time: 79170 - throughput: 12.63 - estimated_peak_memory_range: - min: 17051648 - max: 17051648 - layer_info: - layers_on_npu: 20765 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 20765 - precision: uint16 - primary_compute_unit: NPU - job_id: "null" - job_status: Passed -- name: Llama3-PromptProcessor-Quantized - performance_metrics: - - reference_device_info: - name: QCS8550 (Proxy) - os: '12' - form_factor: Iot - os_name: Android - manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-10-05T00:17:58.707236Z' - torchscript_onnx_qnn: - inference_time: 1807176 - throughput: 566.63 - estimated_peak_memory_range: - min: 11788288 - max: 13357640 - layer_info: - layers_on_npu: 20248 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 20248 - precision: uint16 - primary_compute_unit: NPU - job_id: 'null' - job_status: Passed - - reference_device_info: - name: Samsung Galaxy S24 - os: '14' - form_factor: Phone - os_name: Android - manufacturer: Samsung - chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-06-11T00:34:02.549319Z' - torchscript_onnx_qnn: - inference_time: 1316502 - throughput: 781.67 - estimated_peak_memory_range: - min: 12288 - max: 1026895408 - layer_info: - layers_on_npu: 20248 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 20248 - precision: 
uint16 - primary_compute_unit: NPU - job_id: "null" - job_status: Passed - - reference_device_info: name: Snapdragon X Elite CRD - os: '11' - form_factor: Compute - os_name: Windows - manufacturer: Qualcomm - chipset: Snapdragon® X Elite - timestamp: '2024-06-12T00:34:02.549319Z' - torchscript_onnx_qnn: - inference_time: 1668294 - throughput: 613.83 - estimated_peak_memory_range: - min: 10801152 - max: 10801152 - layer_info: - layers_on_npu: 20248 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 20248 - precision: uint16 - primary_compute_unit: NPU - job_id: "null" - job_status: Passed -aggregated: - supported_devices: - - Samsung Galaxy S23 Ultra - - Samsung Galaxy S24 - - Snapdragon X Elite CRD - supported_oses: - - Android - supported_chipsets: - - Snapdragon® 8 Gen 2 - - Snapdragon® 8 Gen 3 - - Snapdragon® X Elite - performance_metrics: - - reference_device_info: - name: Samsung Galaxy S23 Ultra - os: '13' - form_factor: Phone - os_name: Android - manufacturer: Samsung - chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-01-26T00:34:02.549319Z' - torchscript_onnx_qnn: - inference_time: 117423.0 - throughput: 8.5 - estimated_peak_memory_range: - min: 68579328 - max: 73044264 - precision: uint16 - primary_compute_unit: NPU - job_id: "" - job_status: Passed + timestamp: '2024-10-16T00:32:42.210701Z' diff --git a/qai_hub_models/models/llama_v3_8b_chat_quantized/requirements.txt b/qai_hub_models/models/llama_v3_8b_chat_quantized/requirements.txt index 10e857b7..c5deadcc 100644 --- a/qai_hub_models/models/llama_v3_8b_chat_quantized/requirements.txt +++ b/qai_hub_models/models/llama_v3_8b_chat_quantized/requirements.txt @@ -1,3 +1,5 @@ -transformers==4.40.0 +onnx==1.16.2 +transformers==4.45.0 +huggingface_hub==0.23.2 sentencepiece==0.2.0 psutil diff --git a/qai_hub_models/models/mediapipe_face/README.md b/qai_hub_models/models/mediapipe_face/README.md index 1aeb2a40..9d82ada2 100644 --- a/qai_hub_models/models/mediapipe_face/README.md +++ b/qai_hub_models/models/mediapipe_face/README.md @@ -6,7 +6,7 @@ Designed for sub-millisecond processing, this model predicts bounding boxes and pose skeletons (left eye, right eye, nose tip, mouth, left eye tragion, and right eye tragion) of faces in an image. This is based on the implementation of MediaPipe-Face-Detection found -[here](https://github.com/zmurez/MediaPipePyTorch/). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/mediapipe_face). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.mediapipe_face.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of MediaPipe-Face-Detection can be found
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [BlazeFace: Sub-millisecond Neural Face Detection on Mobile GPUs](https://arxiv.org/abs/1907.05047) * [Source Model Implementation](https://github.com/zmurez/MediaPipePyTorch/) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/mediapipe_face/export.py b/qai_hub_models/models/mediapipe_face/export.py index 3ed353ab..e9be9d54 100644 --- a/qai_hub_models/models/mediapipe_face/export.py +++ b/qai_hub_models/models/mediapipe_face/export.py @@ -10,14 +10,15 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Mapping, Optional, Tuple, cast +from typing import Any, Dict, List, Mapping, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.mediapipe_face import Model from qai_hub_models.utils.args import export_parser, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel, TargetRuntime +from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -45,20 +46,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Mapping[ - str, Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] -] | List[str]: +) -> Mapping[str, ExportResult] | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -83,10 +82,10 @@ def export_model( `model_cls.from_pretrained` Returns: - A Mapping from component_name to a 3-tuple of: + A Mapping from component_name to a struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). 
+ * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "mediapipe_face" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -118,7 +117,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) components_dict: Dict[str, BaseModel] = {} if "MediaPipeFaceDetector" in components: @@ -135,7 +134,7 @@ def export_model( component.to("cpu"), make_torch_inputs(input_spec) ) - # 2. Compile the models to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = component.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -151,7 +150,7 @@ def export_model( hub.client.CompileJob, submitted_compile_job ) - # 3. Profile the model assets on real devices + # 3. Profiles the model performance on a real device profile_jobs: Dict[str, hub.client.ProfileJob] = {} if not skip_profiling: for component_name in components: @@ -169,7 +168,7 @@ def export_model( hub.client.ProfileJob, submitted_profile_job ) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_jobs: Dict[str, hub.client.InferenceJob] = {} if not skip_inferencing: for component_name in components: @@ -193,14 +192,14 @@ def export_model( hub.client.InferenceJob, submitted_inference_job ) - # 5. Download the model assets to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) for component_name, compile_job in compile_jobs.items(): target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / component_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: for component_name in components: profile_job = profile_jobs[component_name] @@ -225,10 +224,10 @@ def export_model( ) return { - component_name: ( - compile_jobs[component_name], - profile_jobs.get(component_name, None), - inference_jobs.get(component_name, None), + component_name: ExportResult( + compile_job=compile_jobs[component_name], + inference_job=inference_jobs.get(component_name, None), + profile_job=profile_jobs.get(component_name, None), ) for component_name in components } diff --git a/qai_hub_models/models/mediapipe_face/perf.yaml b/qai_hub_models/models/mediapipe_face/perf.yaml index ec65f38f..626156cb 100644 --- a/qai_hub_models/models/mediapipe_face/perf.yaml +++ b/qai_hub_models/models/mediapipe_face/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: MediaPipeFaceDetector performance_metrics: - torchscript_onnx_tflite: - inference_time: 552.0 - throughput: 1811.5942028985507 + inference_time: 549.0 + throughput: 1821.4936247723133 estimated_peak_memory_range: - min: 12288 - max: 1453472 + min: 24576 + max: 1451224 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 111 - job_id: jogkzllog + job_id: jg9lnm9qg job_status: Passed torchscript_onnx_qnn: - inference_time: 622.0 - throughput: 1607.717041800643 + inference_time: 626.0 + throughput: 1597.444089456869 estimated_peak_memory_range: - min: 28672 - max: 5194512 + min: 806912 + max: 5865936 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 146 - job_id: jz57zjvrp + job_id: jgo26rm4p job_status: Passed torchscript_onnx: - inference_time: 1042.0 - throughput: 959.6928982725528 + inference_time: 1003.0 + throughput: 997.0089730807578 estimated_peak_memory_range: - min: 393216 - max: 3493720 + min: 12288 + max: 77735904 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: j1pv31zm5 + job_id: jp0z0jde5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:19:38Z' + timestamp: '2024-10-15T00:17:42Z' - torchscript_onnx_tflite: - inference_time: 440.0 - throughput: 2272.7272727272725 + inference_time: 450.0 + throughput: 2222.222222222222 estimated_peak_memory_range: min: 12288 - max: 34769872 + max: 35502640 
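The export.py change above replaces the old 3-tuple return value with a mapping from component name to `ExportResult`. A minimal usage sketch, assuming `qai_hub_models` is installed and AI Hub credentials are configured; the `device` and `skip_*` keyword arguments follow the parameter names in the docstring and function body above:

```python
# Minimal sketch of driving the updated export flow. The device name is
# only an example; skip_downloading mirrors the flag used in export.py.
from qai_hub_models.models.mediapipe_face.export import export_model

results = export_model(device="Samsung Galaxy S23", skip_downloading=True)
if not isinstance(results, list):  # per the annotation, a List[str] may be returned instead
    for component_name, result in results.items():
        print(component_name, "compile job:", result.compile_job.job_id)
        if result.profile_job is not None:
            print(component_name, "profile job:", result.profile_job.job_id)
```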
primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 111 - job_id: j1gln00lp + job_id: jgdx137kp job_status: Passed torchscript_onnx_qnn: - inference_time: 549.0 - throughput: 1821.4936247723133 + inference_time: 502.0 + throughput: 1992.03187250996 estimated_peak_memory_range: min: 802816 - max: 17389312 + max: 15520064 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 146 - job_id: j0pxv7e9g + job_id: jgjvn717g job_status: Passed torchscript_onnx: - inference_time: 836.0 - throughput: 1196.1722488038276 + inference_time: 810.0 + throughput: 1234.567901234568 estimated_peak_memory_range: min: 0 - max: 39429984 + max: 40477728 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: jlpe9r40g + job_id: jgkex4oog job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:19:39Z' + timestamp: '2024-10-15T00:17:44Z' - torchscript_onnx_tflite: - inference_time: 542.0 - throughput: 1845.018450184502 + inference_time: 546.0 + throughput: 1831.5018315018315 estimated_peak_memory_range: min: 12288 - max: 1257848 + max: 75495088 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 111 - job_id: j1p3k44z5 + job_id: jp4lr1jq5 job_status: Passed torchscript_onnx_qnn: - inference_time: 593.0 - throughput: 1686.3406408094436 + inference_time: 599.0 + throughput: 1669.449081803005 estimated_peak_memory_range: - min: 811008 - max: 2063984 + min: 819200 + max: 2153864 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 146 - job_id: jep2873mp + job_id: jg9lnm8qg job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:19:29Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:17:29Z' - torchscript_onnx_tflite: - inference_time: 749.0 - throughput: 1335.1134846461948 + inference_time: 549.0 + throughput: 1821.4936247723133 estimated_peak_memory_range: min: 12288 - max: 31840544 + max: 75878936 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 111 - job_id: j1pv311m5 + job_id: jp8qyx8zp job_status: Passed torchscript_onnx_qnn: - inference_time: 825.0 - throughput: 1212.121212121212 + inference_time: 602.0 + throughput: 1661.1295681063123 estimated_peak_memory_range: - min: 802816 - max: 17311184 + min: 827392 + max: 2353288 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 146 - job_id: j1p3k4qz5 + job_id: jgdx130lp job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:19:36Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:17:34Z' - torchscript_onnx_tflite: - inference_time: 550.0 - throughput: 1818.1818181818182 + inference_time: 549.0 + throughput: 1821.4936247723133 
estimated_peak_memory_range: - min: 24576 - max: 1340576 + min: 20480 + max: 1445560 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 111 - job_id: jlpe9rr0g + job_id: jpy13xnrp job_status: Passed torchscript_onnx_qnn: - inference_time: 602.0 - throughput: 1661.1295681063123 + inference_time: 612.0 + throughput: 1633.986928104575 estimated_peak_memory_range: - min: 831488 - max: 2071336 + min: 823296 + max: 2118152 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 146 - job_id: j2p0y1eeg + job_id: jg9lnm8vg job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:19:30Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:17:33Z' - torchscript_onnx_tflite: inference_time: 547.0 throughput: 1828.1535648994516 estimated_peak_memory_range: - min: 12288 - max: 4834208 + min: 65536 + max: 1466768 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 111 - job_id: jz5woddjp + job_id: jprv309vg job_status: Passed torchscript_onnx_qnn: - inference_time: 604.0 - throughput: 1655.6291390728477 + inference_time: 613.0 + throughput: 1631.3213703099511 estimated_peak_memory_range: min: 819200 - max: 2065576 + max: 2122200 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 146 - job_id: jogkzlrog + job_id: jgdx130kp job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:19:32Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:17:31Z' - torchscript_onnx_tflite: - inference_time: 552.0 - throughput: 1811.5942028985507 + inference_time: 763.0 + throughput: 1310.615989515072 estimated_peak_memory_range: - min: 20480 - max: 1332120 + min: 77824 + max: 32675344 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 111 - job_id: jnp10ddl5 + job_id: j5mnxmvyp job_status: Passed torchscript_onnx_qnn: - inference_time: 608.0 - throughput: 1644.7368421052631 + inference_time: 830.0 + throughput: 1204.8192771084337 estimated_peak_memory_range: - min: 827392 - max: 2271272 + min: 802816 + max: 17538896 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 146 - job_id: j1gln0elp + job_id: jgn6vnom5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:17:38Z' + - torchscript_onnx_tflite: + inference_time: 410.0 + throughput: 2439.0243902439024 + estimated_peak_memory_range: + min: 8192 + max: 25089424 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 111 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 111 + job_id: j56y47vvp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 380.0 + throughput: 2631.5789473684213 + estimated_peak_memory_range: + min: 798720 + max: 13748800 + primary_compute_unit: NPU + precision: fp16 + 
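A note for readers scanning the perf.yaml numbers in this diff: `inference_time` is reported in microseconds, and `throughput` is its reciprocal expressed in inferences per second. The value pairs above confirm the relation:

```python
# Relation between the inference_time (microseconds) and throughput
# (inferences/second) fields; the values are copied from the entries above.
def throughput(inference_time_us: float) -> float:
    return 1_000_000 / inference_time_us

assert abs(throughput(549.0) - 1821.4936247723133) < 1e-6
assert abs(throughput(450.0) - 2222.222222222222) < 1e-6
```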
layer_info: + layers_on_npu: 146 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 146 + job_id: jp2kyw4mp + job_status: Passed + torchscript_onnx: + inference_time: 752.0 + throughput: 1329.787234042553 + estimated_peak_memory_range: + min: 0 + max: 29157136 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 147 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 147 + job_id: jpv6kdem5 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:19:34Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:17:49Z' - torchscript_onnx_qnn: - inference_time: 766.0 - throughput: 1305.4830287206266 + inference_time: 804.0 + throughput: 1243.7810945273632 estimated_peak_memory_range: min: 786432 max: 786432 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 146 - job_id: jegn29rmg + job_id: jgz3dmwz5 job_status: Passed torchscript_onnx: - inference_time: 1076.0 - throughput: 929.368029739777 + inference_time: 1040.0 + throughput: 961.5384615384615 estimated_peak_memory_range: - min: 1908736 - max: 1908736 + min: 2031616 + max: 2031616 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: jmg9v39v5 + job_id: jglvmxol5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,15 +429,15 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:19:41Z' + timestamp: '2024-10-15T00:17:46Z' - name: MediaPipeFaceLandmarkDetector performance_metrics: - torchscript_onnx_tflite: - inference_time: 193.0 - throughput: 5181.347150259067 + inference_time: 190.0 + throughput: 5263.1578947368425 estimated_peak_memory_range: - min: 36864 - max: 4354488 + min: 20480 + max: 1544456 primary_compute_unit: NPU precision: fp16 layer_info: @@ -394,14 +445,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 100 - job_id: jn5q877m5 + job_id: jp14zjqkp job_status: Passed torchscript_onnx_qnn: - inference_time: 277.0 - throughput: 3610.1083032490974 + inference_time: 279.0 + throughput: 3584.2293906810037 estimated_peak_memory_range: - min: 475136 - max: 8373224 + min: 458752 + max: 7823360 primary_compute_unit: NPU precision: fp16 layer_info: @@ -409,14 +460,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 105 - job_id: jqp4qxjlg + job_id: jpv6kd475 job_status: Passed torchscript_onnx: - inference_time: 503.0 - throughput: 1988.0715705765408 + inference_time: 506.0 + throughput: 1976.2845849802372 estimated_peak_memory_range: - min: 24576 - max: 1592672 + min: 12288 + max: 1609632 primary_compute_unit: NPU precision: fp16 layer_info: @@ -424,7 +475,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 106 - job_id: j7gjx0k8p + job_id: jp8qyx68p job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -433,13 +484,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:19:38Z' + timestamp: '2024-10-15T00:17:42Z' - torchscript_onnx_tflite: - inference_time: 153.0 - throughput: 6535.9477124183 + inference_time: 144.0 + throughput: 6944.444444444444 estimated_peak_memory_range: - min: 12288 - max: 30067152 + min: 16384 + max: 30559200 primary_compute_unit: NPU precision: fp16 layer_info: @@ -447,14 +498,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 100 - 
job_id: jw5663375 + job_id: j57yr4vq5 job_status: Passed torchscript_onnx_qnn: - inference_time: 216.0 - throughput: 4629.62962962963 + inference_time: 213.0 + throughput: 4694.835680751174 estimated_peak_memory_range: - min: 0 - max: 10297152 + min: 458752 + max: 12063568 primary_compute_unit: NPU precision: fp16 layer_info: @@ -462,14 +513,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 105 - job_id: jo5mrwvqg + job_id: jpedmz275 job_status: Passed torchscript_onnx: - inference_time: 401.0 - throughput: 2493.7655860349128 + inference_time: 402.0 + throughput: 2487.5621890547263 estimated_peak_memory_range: min: 0 - max: 33171120 + max: 33128640 primary_compute_unit: NPU precision: fp16 layer_info: @@ -477,7 +528,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 106 - job_id: jygzexv6g + job_id: j5q6qyzmp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -486,13 +537,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:19:40Z' + timestamp: '2024-10-15T00:17:44Z' - torchscript_onnx_tflite: - inference_time: 192.0 - throughput: 5208.333333333333 + inference_time: 190.0 + throughput: 5263.1578947368425 estimated_peak_memory_range: - min: 32768 - max: 8564216 + min: 12288 + max: 1368704 primary_compute_unit: NPU precision: fp16 layer_info: @@ -500,14 +551,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 100 - job_id: jwgoy11d5 + job_id: jpxko4ej5 job_status: Passed torchscript_onnx_qnn: - inference_time: 275.0 - throughput: 3636.3636363636365 + inference_time: 274.0 + throughput: 3649.6350364963505 estimated_peak_memory_range: - min: 516096 - max: 1751168 + min: 0 + max: 1780184 primary_compute_unit: NPU precision: fp16 layer_info: @@ -515,7 +566,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 105 - job_id: jqpye4v4g + job_id: jp14zj3kp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -523,14 +574,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:19:29Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:17:29Z' - torchscript_onnx_tflite: - inference_time: 279.0 - throughput: 3584.2293906810037 + inference_time: 193.0 + throughput: 5181.347150259067 estimated_peak_memory_range: min: 20480 - max: 30064256 + max: 17894512 primary_compute_unit: NPU precision: fp16 layer_info: @@ -538,14 +589,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 100 - job_id: j7gjx008p + job_id: jgkex4dyg job_status: Passed torchscript_onnx_qnn: - inference_time: 377.0 - throughput: 2652.5198938992044 + inference_time: 275.0 + throughput: 3636.3636363636365 estimated_peak_memory_range: - min: 458752 - max: 14833504 + min: 475136 + max: 1729048 primary_compute_unit: NPU precision: fp16 layer_info: @@ -553,22 +604,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 105 - job_id: jwgoy1ed5 + job_id: j57yr4kr5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:19:36Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:17:35Z' - torchscript_onnx_tflite: - inference_time: 189.0 - throughput: 5291.005291005291 + inference_time: 194.0 + throughput: 5154.639175257732 estimated_peak_memory_range: - min: 32768 - max: 1780696 + min: 16384 + max: 1366016 primary_compute_unit: NPU precision: fp16 layer_info: @@ 
-576,14 +627,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 100 - job_id: jygzexx6g + job_id: jp0z0jk25 job_status: Passed torchscript_onnx_qnn: - inference_time: 276.0 - throughput: 3623.1884057971015 + inference_time: 277.0 + throughput: 3610.1083032490974 estimated_peak_memory_range: - min: 471040 - max: 1620408 + min: 466944 + max: 1599728 primary_compute_unit: NPU precision: fp16 layer_info: @@ -591,22 +642,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 105 - job_id: j1p8o3w8g + job_id: jp14zj3lp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:19:31Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:17:33Z' - torchscript_onnx_tflite: - inference_time: 192.0 - throughput: 5208.333333333333 + inference_time: 194.0 + throughput: 5154.639175257732 estimated_peak_memory_range: - min: 32768 - max: 12650104 + min: 28672 + max: 77986736 primary_compute_unit: NPU precision: fp16 layer_info: @@ -614,14 +665,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 100 - job_id: jmg9v33v5 + job_id: jp2kywjxp job_status: Passed torchscript_onnx_qnn: - inference_time: 276.0 - throughput: 3623.1884057971015 + inference_time: 273.0 + throughput: 3663.003663003663 estimated_peak_memory_range: - min: 483328 - max: 2200040 + min: 466944 + max: 1825032 primary_compute_unit: NPU precision: fp16 layer_info: @@ -629,22 +680,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 105 - job_id: jn5q879m5 + job_id: j5we67xj5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:19:33Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:17:31Z' - torchscript_onnx_tflite: - inference_time: 191.0 - throughput: 5235.602094240838 + inference_time: 283.0 + throughput: 3533.5689045936397 estimated_peak_memory_range: - min: 28672 - max: 9401336 + min: 20480 + max: 30559248 primary_compute_unit: NPU precision: fp16 layer_info: @@ -652,14 +703,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 100 - job_id: jvgdwrrl5 + job_id: jgn6vnxv5 job_status: Passed torchscript_onnx_qnn: - inference_time: 281.0 - throughput: 3558.7188612099644 + inference_time: 382.0 + throughput: 2617.801047120419 estimated_peak_memory_range: - min: 471040 - max: 2095608 + min: 0 + max: 14615040 primary_compute_unit: NPU precision: fp16 layer_info: @@ -667,19 +718,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 105 - job_id: jw5663q75 + job_id: jprv30oeg job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:17:39Z' + - torchscript_onnx_tflite: + inference_time: 122.0 + throughput: 8196.72131147541 + estimated_peak_memory_range: + min: 0 + max: 19055568 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 100 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 100 + job_id: jp3j098xg + job_status: Passed + torchscript_onnx_qnn: + inference_time: 207.0 + throughput: 4830.917874396136 + estimated_peak_memory_range: + min: 458752 + max: 10560912 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 105 + layers_on_gpu: 0 + layers_on_cpu: 0 + 
total_layers: 105 + job_id: jpy13xq4p + job_status: Passed + torchscript_onnx: + inference_time: 409.0 + throughput: 2444.987775061125 + estimated_peak_memory_range: + min: 0 + max: 19005648 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 106 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 106 + job_id: jgjvn7o8g + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:19:35Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:17:49Z' - torchscript_onnx_qnn: - inference_time: 376.0 - throughput: 2659.574468085106 + inference_time: 383.0 + throughput: 2610.9660574412533 estimated_peak_memory_range: min: 786432 max: 786432 @@ -690,14 +794,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 105 - job_id: joprk41e5 + job_id: j5we67xz5 job_status: Passed torchscript_onnx: - inference_time: 509.0 - throughput: 1964.6365422396857 + inference_time: 512.0 + throughput: 1953.125 estimated_peak_memory_range: - min: 1847296 - max: 1847296 + min: 1884160 + max: 1884160 primary_compute_unit: NPU precision: fp16 layer_info: @@ -705,7 +809,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 106 - job_id: jnp10dql5 + job_id: j56y47r7p job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -714,4 +818,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:19:41Z' + timestamp: '2024-10-15T00:17:46Z' diff --git a/qai_hub_models/models/mediapipe_face_quantized/README.md b/qai_hub_models/models/mediapipe_face_quantized/README.md index 72472470..46972b73 100644 --- a/qai_hub_models/models/mediapipe_face_quantized/README.md +++ b/qai_hub_models/models/mediapipe_face_quantized/README.md @@ -6,7 +6,7 @@ Designed for sub-millisecond processing, this model predicts bounding boxes and pose skeletons (left eye, right eye, nose tip, mouth, left eye tragion, and right eye tragion) of faces in an image. This is based on the implementation of MediaPipe-Face-Detection-Quantized found -[here](https://github.com/zmurez/MediaPipePyTorch/). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/mediapipe_face_quantized). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.mediapipe_face_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of MediaPipe-Face-Detection-Quantized can be found +* The license for the original implementation of MediaPipe-Face-Detection-Quantized can be found [here](https://github.com/zmurez/MediaPipePyTorch/blob/master/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [BlazeFace: Sub-millisecond Neural Face Detection on Mobile GPUs](https://arxiv.org/abs/1907.05047) * [Source Model Implementation](https://github.com/zmurez/MediaPipePyTorch/) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/mediapipe_face_quantized/export.py b/qai_hub_models/models/mediapipe_face_quantized/export.py index 58c80292..a60c31fc 100644 --- a/qai_hub_models/models/mediapipe_face_quantized/export.py +++ b/qai_hub_models/models/mediapipe_face_quantized/export.py @@ -10,13 +10,14 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Mapping, Optional, Tuple, cast +from typing import Any, Dict, List, Mapping, Optional, cast import qai_hub as hub +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.mediapipe_face_quantized import Model from qai_hub_models.utils.args import export_parser, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel, TargetRuntime +from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.printing import ( print_inference_metrics, @@ -43,20 +44,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Mapping[ - str, Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] -] | List[str]: +) -> Mapping[str, ExportResult] | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -81,10 +80,10 @@ def export_model( `model_cls.from_pretrained` Returns: - A Mapping from component_name to a 3-tuple of: + A Mapping from component_name to a struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). 
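`ExportResult` itself is imported from `qai_hub_models.models.common` and its definition is not part of this diff. Judging only from the keyword arguments passed to it below (and the ProfileJob bullet that completes this list), it plausibly looks like the following dataclass sketch; the real class may differ:

```python
# Hedged sketch of ExportResult, inferred from the fields this diff uses;
# not the actual definition in qai_hub_models.models.common.
from dataclasses import dataclass
from typing import Optional

import qai_hub as hub


@dataclass
class ExportResult:
    compile_job: hub.CompileJob
    inference_job: Optional[hub.InferenceJob] = None
    profile_job: Optional[hub.ProfileJob] = None
```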
+ * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "mediapipe_face_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -116,7 +115,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) components_dict: Dict[str, BaseModel] = {} if "MediaPipeFaceDetector" in components: @@ -133,7 +132,7 @@ def export_model( target_runtime, output_path, input_spec ) - # 2. Compile the models to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = component.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -150,7 +149,7 @@ def export_model( hub.client.CompileJob, submitted_compile_job ) - # 3. Profile the model assets on real devices + # 3. Profiles the model performance on a real device profile_jobs: Dict[str, hub.client.ProfileJob] = {} if not skip_profiling: for component_name in components: @@ -168,7 +167,7 @@ def export_model( hub.client.ProfileJob, submitted_profile_job ) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_jobs: Dict[str, hub.client.InferenceJob] = {} if not skip_inferencing: for component_name in components: @@ -192,14 +191,14 @@ def export_model( hub.client.InferenceJob, submitted_inference_job ) - # 5. Download the model assets to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) for component_name, compile_job in compile_jobs.items(): target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / component_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: for component_name in components: profile_job = profile_jobs[component_name] @@ -224,10 +223,10 @@ def export_model( ) return { - component_name: ( - compile_jobs[component_name], - profile_jobs.get(component_name, None), - inference_jobs.get(component_name, None), + component_name: ExportResult( + compile_job=compile_jobs[component_name], + inference_job=inference_jobs.get(component_name, None), + profile_job=profile_jobs.get(component_name, None), ) for component_name in components } diff --git a/qai_hub_models/models/mediapipe_face_quantized/perf.yaml b/qai_hub_models/models/mediapipe_face_quantized/perf.yaml index 7f813609..236a7a48 100644 --- a/qai_hub_models/models/mediapipe_face_quantized/perf.yaml +++ b/qai_hub_models/models/mediapipe_face_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,44 +20,41 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: MediaPipeFaceDetector performance_metrics: - torchscript_onnx_tflite: - inference_time: 274.0 - throughput: 3649.6350364963505 + inference_time: 275.0 + throughput: 3636.3636363636365 estimated_peak_memory_range: - min: 12288 - max: 1420480 + min: 36864 + max: 1335848 primary_compute_unit: NPU precision: fp16 layer_info: @@ -64,22 +62,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 121 - job_id: j1p3k4ex5 + job_id: j5mnxm3yp job_status: Passed torchscript_onnx_qnn: - inference_time: 300.0 - throughput: 3333.3333333333335 + inference_time: 297.0 + throughput: 3367.003367003367 estimated_peak_memory_range: - min: 28672 - max: 5606824 + min: 16384 + max: 76721656 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 118 + layers_on_npu: 151 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 118 - job_id: j2p0y1r2g + total_layers: 151 + job_id: j5mnxmzyp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -88,13 +86,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:18:26Z' + timestamp: '2024-10-15T00:16:10Z' - torchscript_onnx_tflite: - inference_time: 196.0 - throughput: 5102.040816326531 + inference_time: 182.0 + throughput: 5494.505494505494 estimated_peak_memory_range: min: 12288 - max: 33583568 + max: 33570912 primary_compute_unit: NPU precision: fp16 layer_info: @@ -102,22 +100,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 121 - job_id: j1pv31v75 + job_id: jprv30yvg job_status: Passed 
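For consuming these perf.yaml files programmatically rather than in diff form, a short sketch (assuming PyYAML is installed) that follows the key structure visible throughout this file:

```python
# Sketch: flatten a perf.yaml like the one above into one line per
# device/runtime pair. Key names mirror the structure shown in this diff.
import yaml  # PyYAML, an assumed dependency

RUNTIMES = ("torchscript_onnx_tflite", "torchscript_onnx_qnn", "torchscript_onnx")

with open("qai_hub_models/models/mediapipe_face_quantized/perf.yaml") as f:
    perf = yaml.safe_load(f)

for model in perf["models"]:
    for entry in model["performance_metrics"]:
        device = entry["reference_device_info"]["name"]
        for runtime in RUNTIMES:
            metrics = entry.get(runtime)
            if metrics:  # not every runtime is measured on every device
                print(model["name"], device, runtime, metrics["inference_time"], "us")
```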
torchscript_onnx_qnn: - inference_time: 211.0 - throughput: 4739.336492890995 + inference_time: 236.0 + throughput: 4237.28813559322 estimated_peak_memory_range: - min: 204800 - max: 18012192 + min: 208896 + max: 20242528 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 118 + layers_on_npu: 151 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 118 - job_id: jogkzlyyg + total_layers: 151 + job_id: jp2kyw7xp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -126,13 +124,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:18:28Z' + timestamp: '2024-10-15T00:16:12Z' - torchscript_onnx_tflite: - inference_time: 273.0 - throughput: 3663.003663003663 + inference_time: 681.0 + throughput: 1468.4287812041116 estimated_peak_memory_range: min: 12288 - max: 1265400 + max: 26145872 primary_compute_unit: NPU precision: fp16 layer_info: @@ -140,37 +138,60 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 121 - job_id: jlpe9rk7g + job_id: jg9lnm4qg job_status: Passed torchscript_onnx_qnn: - inference_time: 303.0 - throughput: 3300.3300330033003 + inference_time: 762.0 + throughput: 1312.3359580052493 estimated_peak_memory_range: - min: 278528 - max: 1444976 + min: 12288 + max: 8048944 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 118 + layers_on_npu: 151 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 118 - job_id: j1p3k4mx5 + total_layers: 151 + job_id: jp14zjdkp job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:18:32Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-15T00:16:28Z' - torchscript_onnx_tflite: - inference_time: 322.0 - throughput: 3105.590062111801 + inference_time: 5031.0 + throughput: 198.76764062810574 + estimated_peak_memory_range: + min: 28672 + max: 5636744 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 121 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 121 + job_id: jgdx13vkp + job_status: Passed + reference_device_info: + name: RB5 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8250 Proxy + timestamp: '2024-10-15T00:16:05Z' + - torchscript_onnx_tflite: + inference_time: 273.0 + throughput: 3663.003663003663 estimated_peak_memory_range: min: 12288 - max: 33420464 + max: 1499376 primary_compute_unit: NPU precision: fp16 layer_info: @@ -178,37 +199,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 121 - job_id: jz5wodqzp + job_id: jp0z0jr25 job_status: Passed torchscript_onnx_qnn: - inference_time: 352.0 - throughput: 2840.909090909091 + inference_time: 301.0 + throughput: 3322.2591362126245 estimated_peak_memory_range: - min: 208896 - max: 20119952 + min: 229376 + max: 1570912 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 118 + layers_on_npu: 151 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 118 - job_id: jnp10d8k5 + total_layers: 151 + job_id: jgkex4lyg job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:18:39Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:16:17Z' - torchscript_onnx_tflite: - inference_time: 276.0 - throughput: 3623.1884057971015 + inference_time: 279.0 + 
throughput: 3584.2293906810037 estimated_peak_memory_range: min: 12288 - max: 1522664 + max: 73232952 primary_compute_unit: NPU precision: fp16 layer_info: @@ -216,37 +237,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 121 - job_id: jnp10dek5 + job_id: jpv6kdw75 job_status: Passed torchscript_onnx_qnn: - inference_time: 305.0 - throughput: 3278.688524590164 + inference_time: 303.0 + throughput: 3300.3300330033003 estimated_peak_memory_range: - min: 221184 - max: 1508064 + min: 237568 + max: 1460392 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 118 + layers_on_npu: 151 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 118 - job_id: j1pv31w75 + total_layers: 151 + job_id: jpv6kd175 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:18:34Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:16:23Z' - torchscript_onnx_tflite: - inference_time: 273.0 - throughput: 3663.003663003663 + inference_time: 274.0 + throughput: 3649.6350364963505 estimated_peak_memory_range: min: 12288 - max: 1533688 + max: 1308920 primary_compute_unit: NPU precision: fp16 layer_info: @@ -254,22 +275,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 121 - job_id: jz57zjxqp + job_id: jp3j09mxg job_status: Passed torchscript_onnx_qnn: - inference_time: 302.0 - throughput: 3311.2582781456954 + inference_time: 303.0 + throughput: 3300.3300330033003 estimated_peak_memory_range: - min: 225280 - max: 1933896 + min: 229376 + max: 1455680 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 118 + layers_on_npu: 151 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 118 - job_id: jlpe9rv7g + total_layers: 151 + job_id: jp3j094xg job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -277,14 +298,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:18:36Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:16:21Z' - torchscript_onnx_tflite: - inference_time: 278.0 - throughput: 3597.122302158273 + inference_time: 272.0 + throughput: 3676.470588235294 estimated_peak_memory_range: min: 12288 - max: 1478072 + max: 2358944 primary_compute_unit: NPU precision: fp16 layer_info: @@ -292,37 +313,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 121 - job_id: j0pxv7yjg + job_id: jglvmxke5 job_status: Passed torchscript_onnx_qnn: - inference_time: 303.0 - throughput: 3300.3300330033003 + inference_time: 301.0 + throughput: 3322.2591362126245 estimated_peak_memory_range: - min: 225280 - max: 1466272 + min: 229376 + max: 1461496 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 118 + layers_on_npu: 151 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 118 - job_id: jz5wod9zp + total_layers: 151 + job_id: jglvmx0e5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:18:37Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:16:19Z' - torchscript_onnx_tflite: - inference_time: 789.0 - throughput: 1267.427122940431 + inference_time: 324.0 + throughput: 3086.41975308642 estimated_peak_memory_range: min: 24576 - max: 26053904 + max: 34618272 primary_compute_unit: NPU precision: fp16 layer_info: @@ -330,37 +351,37 @@ models: layers_on_gpu: 0 
layers_on_cpu: 0 total_layers: 121 - job_id: jegn29evg + job_id: jgkex4yyg job_status: Passed torchscript_onnx_qnn: - inference_time: 747.0 - throughput: 1338.6880856760374 + inference_time: 350.0 + throughput: 2857.1428571428573 estimated_peak_memory_range: - min: 12288 - max: 8071440 + min: 208896 + max: 19975968 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 118 + layers_on_npu: 151 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 118 - job_id: jz5wod9jp + total_layers: 151 + job_id: j5we67dz5 job_status: Passed reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T12:18:41Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:16:26Z' - torchscript_onnx_tflite: - inference_time: 4996.0 - throughput: 200.160128102482 + inference_time: 202.0 + throughput: 4950.495049504951 estimated_peak_memory_range: - min: 40960 - max: 6796896 + min: 8192 + max: 24408816 primary_compute_unit: NPU precision: fp16 layer_info: @@ -368,30 +389,45 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 121 - job_id: jep287mxp + job_id: jp4lr1wq5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 207.0 + throughput: 4830.917874396136 + estimated_peak_memory_range: + min: 208896 + max: 15580480 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 151 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 151 + job_id: j57yr4jq5 job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T12:18:24Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:16:30Z' - torchscript_onnx_qnn: - inference_time: 430.0 - throughput: 2325.5813953488373 + inference_time: 427.0 + throughput: 2341.92037470726 estimated_peak_memory_range: - min: 475136 - max: 475136 + min: 552960 + max: 552960 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 118 + layers_on_npu: 151 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 118 - job_id: j1gln0kep + total_layers: 151 + job_id: jp0z0j125 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -400,15 +436,15 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:18:30Z' + timestamp: '2024-10-15T00:16:15Z' - name: MediaPipeFaceLandmarkDetector performance_metrics: - torchscript_onnx_tflite: - inference_time: 184.0 - throughput: 5434.782608695652 + inference_time: 180.0 + throughput: 5555.555555555556 estimated_peak_memory_range: - min: 12288 - max: 16939576 + min: 20480 + max: 71074632 primary_compute_unit: NPU precision: fp16 layer_info: @@ -416,14 +452,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 117 - job_id: jwgoy1345 + job_id: jgn6vnev5 job_status: Passed torchscript_onnx_qnn: - inference_time: 219.0 - throughput: 4566.2100456621 + inference_time: 226.0 + throughput: 4424.778761061947 estimated_peak_memory_range: - min: 139264 - max: 3531368 + min: 24576 + max: 3226816 primary_compute_unit: NPU precision: fp16 layer_info: @@ -431,7 +467,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 112 - job_id: j1p8o37zg + job_id: jgn6vn9v5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -440,13 +476,13 @@ models: os_name: Android 
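One invariant holds across every `layer_info` block in this diff: all layers run on the NPU (`layers_on_gpu` and `layers_on_cpu` are 0, and `layers_on_npu` equals `total_layers`). A small helper to flag any fallback, assuming the same field names:

```python
# Sketch: detect CPU/GPU fallback in a perf.yaml layer_info block.
def fully_on_npu(layer_info: dict) -> bool:
    return (
        layer_info["layers_on_gpu"] == 0
        and layer_info["layers_on_cpu"] == 0
        and layer_info["layers_on_npu"] == layer_info["total_layers"]
    )

# Values from a MediaPipeFaceLandmarkDetector entry above.
assert fully_on_npu(
    {"layers_on_npu": 112, "layers_on_gpu": 0, "layers_on_cpu": 0, "total_layers": 112}
)
```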
manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:18:26Z' + timestamp: '2024-10-15T00:16:10Z' - torchscript_onnx_tflite: - inference_time: 127.0 - throughput: 7874.0157480314965 + inference_time: 142.0 + throughput: 7042.2535211267605 estimated_peak_memory_range: min: 12288 - max: 28082720 + max: 27584640 primary_compute_unit: NPU precision: fp16 layer_info: @@ -454,14 +490,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 117 - job_id: j7gjx0e7p + job_id: jp2kywmxp job_status: Passed torchscript_onnx_qnn: - inference_time: 163.0 - throughput: 6134.9693251533745 + inference_time: 166.0 + throughput: 6024.096385542169 estimated_peak_memory_range: - min: 0 - max: 14810512 + min: 126976 + max: 12796320 primary_compute_unit: NPU precision: fp16 layer_info: @@ -469,7 +505,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 112 - job_id: jn5q87275 + job_id: jpy13x4rp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -478,13 +514,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:18:28Z' + timestamp: '2024-10-15T00:16:13Z' - torchscript_onnx_tflite: - inference_time: 185.0 - throughput: 5405.405405405405 + inference_time: 395.0 + throughput: 2531.6455696202534 estimated_peak_memory_range: - min: 24576 - max: 1349144 + min: 12288 + max: 19759872 primary_compute_unit: NPU precision: fp16 layer_info: @@ -492,14 +528,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 117 - job_id: jygzexrzg + job_id: jp14zj8kp job_status: Passed torchscript_onnx_qnn: - inference_time: 215.0 - throughput: 4651.162790697675 + inference_time: 490.0 + throughput: 2040.8163265306123 estimated_peak_memory_range: - min: 143360 - max: 1461008 + min: 16384 + max: 8039360 primary_compute_unit: NPU precision: fp16 layer_info: @@ -507,22 +543,45 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 112 - job_id: jwgoy1v45 + job_id: jgdx13rkp job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:18:32Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-15T00:16:29Z' - torchscript_onnx_tflite: - inference_time: 226.0 - throughput: 4424.778761061947 + inference_time: 2921.0 + throughput: 342.3485107839781 + estimated_peak_memory_range: + min: 12288 + max: 6971816 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 117 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 117 + job_id: j57yr4dq5 + job_status: Passed + reference_device_info: + name: RB5 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8250 Proxy + timestamp: '2024-10-15T00:16:06Z' + - torchscript_onnx_tflite: + inference_time: 182.0 + throughput: 5494.505494505494 estimated_peak_memory_range: min: 12288 - max: 28702592 + max: 3054504 primary_compute_unit: NPU precision: fp16 layer_info: @@ -530,14 +589,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 117 - job_id: jmg9v3wq5 + job_id: jp8qyx7zp job_status: Passed torchscript_onnx_qnn: - inference_time: 260.0 - throughput: 3846.153846153846 + inference_time: 212.0 + throughput: 4716.981132075472 estimated_peak_memory_range: - min: 126976 - max: 14818800 + min: 0 + max: 1741232 primary_compute_unit: NPU precision: fp16 layer_info: @@ -545,22 +604,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 112 - job_id: jvgdwrvk5 + job_id: 
j5q6qy77p job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:18:40Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:16:17Z' - torchscript_onnx_tflite: - inference_time: 190.0 - throughput: 5263.1578947368425 + inference_time: 185.0 + throughput: 5405.405405405405 estimated_peak_memory_range: - min: 45056 - max: 58132840 + min: 28672 + max: 1437576 primary_compute_unit: NPU precision: fp16 layer_info: @@ -568,14 +627,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 117 - job_id: jvgdwrok5 + job_id: jgjvn7l7g job_status: Passed torchscript_onnx_qnn: - inference_time: 221.0 - throughput: 4524.886877828054 + inference_time: 216.0 + throughput: 4629.62962962963 estimated_peak_memory_range: - min: 135168 - max: 1401824 + min: 188416 + max: 1447600 primary_compute_unit: NPU precision: fp16 layer_info: @@ -583,22 +642,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 112 - job_id: j7gjx0l7p + job_id: jgjvn707g job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:18:34Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:16:23Z' - torchscript_onnx_tflite: - inference_time: 181.0 - throughput: 5524.861878453039 + inference_time: 185.0 + throughput: 5405.405405405405 estimated_peak_memory_range: - min: 24576 - max: 1432552 + min: 12288 + max: 3185936 primary_compute_unit: NPU precision: fp16 layer_info: @@ -606,14 +665,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 117 - job_id: jqp4qxvqg + job_id: jgo26rv4p job_status: Passed torchscript_onnx_qnn: - inference_time: 218.0 - throughput: 4587.155963302752 + inference_time: 214.0 + throughput: 4672.897196261682 estimated_peak_memory_range: - min: 16384 - max: 1567608 + min: 143360 + max: 1484912 primary_compute_unit: NPU precision: fp16 layer_info: @@ -621,7 +680,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 112 - job_id: jygzex7zg + job_id: jgo26r14p job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -629,14 +688,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:18:36Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:16:21Z' - torchscript_onnx_tflite: - inference_time: 183.0 - throughput: 5464.48087431694 + inference_time: 182.0 + throughput: 5494.505494505494 estimated_peak_memory_range: - min: 49152 - max: 1431880 + min: 16384 + max: 1846152 primary_compute_unit: NPU precision: fp16 layer_info: @@ -644,14 +703,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 117 - job_id: jo5mrw3yg + job_id: j56y471vp job_status: Passed torchscript_onnx_qnn: - inference_time: 221.0 - throughput: 4524.886877828054 + inference_time: 219.0 + throughput: 4566.2100456621 estimated_peak_memory_range: - min: 135168 - max: 1379384 + min: 0 + max: 1474192 primary_compute_unit: NPU precision: fp16 layer_info: @@ -659,22 +718,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 112 - job_id: jmg9v34q5 + job_id: j56y473vp job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:18:38Z' + chipset: SA8650P Proxy + 
timestamp: '2024-10-15T00:16:19Z' - torchscript_onnx_tflite: - inference_time: 404.0 - throughput: 2475.2475247524753 + inference_time: 214.0 + throughput: 4672.897196261682 estimated_peak_memory_range: min: 12288 - max: 19660752 + max: 29404864 primary_compute_unit: NPU precision: fp16 layer_info: @@ -682,14 +741,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 117 - job_id: joprk4yv5 + job_id: j5q6qy27p job_status: Passed torchscript_onnx_qnn: - inference_time: 490.0 - throughput: 2040.8163265306123 + inference_time: 269.0 + throughput: 3717.472118959108 estimated_peak_memory_range: - min: 131072 - max: 7896480 + min: 126976 + max: 15028640 primary_compute_unit: NPU precision: fp16 layer_info: @@ -697,22 +756,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 112 - job_id: jmg9v34v5 + job_id: jg9lnm3qg job_status: Passed reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T12:18:41Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:16:27Z' - torchscript_onnx_tflite: - inference_time: 2886.0 - throughput: 346.5003465003465 + inference_time: 120.0 + throughput: 8333.333333333334 estimated_peak_memory_range: - min: 16384 - max: 3187016 + min: 8192 + max: 18867616 primary_compute_unit: NPU precision: fp16 layer_info: @@ -720,22 +779,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 117 - job_id: jqpye4drg + job_id: jpxko41j5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 177.0 + throughput: 5649.717514124294 + estimated_peak_memory_range: + min: 0 + max: 10204336 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 112 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 112 + job_id: jp4lr1xq5 job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T12:18:24Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:16:30Z' - torchscript_onnx_qnn: - inference_time: 333.0 - throughput: 3003.003003003003 + inference_time: 343.0 + throughput: 2915.451895043732 estimated_peak_memory_range: - min: 585728 - max: 585728 + min: 667648 + max: 667648 primary_compute_unit: NPU precision: fp16 layer_info: @@ -743,7 +817,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 112 - job_id: jw56631v5 + job_id: jp8qyx3zp job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -752,4 +826,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:18:30Z' + timestamp: '2024-10-15T00:16:15Z' diff --git a/qai_hub_models/models/mediapipe_hand/README.md b/qai_hub_models/models/mediapipe_hand/README.md index 7c170f6a..b5e9c832 100644 --- a/qai_hub_models/models/mediapipe_hand/README.md +++ b/qai_hub_models/models/mediapipe_hand/README.md @@ -6,7 +6,7 @@ The MediaPipe Hand Landmark Detector is a machine learning pipeline that predicts bounding boxes and pose skeletons of hands in an image. This is based on the implementation of MediaPipe-Hand-Detection found -[here](https://github.com/zmurez/MediaPipePyTorch/). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. 
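Like the face model, the hand model exports as a multi-component pipeline; the export.py diff below checks for a "MediaPipeHandDetector" component. A sketch of exporting only that component, assuming `export_model` accepts a `components` list and the `skip_*` flags its function body references:

```python
# Sketch: export a single component of the MediaPipe Hand pipeline. The
# component name matches the check in export.py below; the components and
# skip_* keyword arguments are assumptions based on how they are used there.
from qai_hub_models.models.mediapipe_hand.export import export_model

results = export_model(
    device="Samsung Galaxy S24",
    components=["MediaPipeHandDetector"],
    skip_inferencing=True,
    skip_downloading=True,
)
```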
More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/mediapipe_hand). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.mediapipe_hand.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of MediaPipe-Hand-Detection can be found +* The license for the original implementation of MediaPipe-Hand-Detection can be found [here](https://github.com/zmurez/MediaPipePyTorch/blob/master/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [MediaPipe Hands: On-device Real-time Hand Tracking](https://arxiv.org/abs/2006.10214) * [Source Model Implementation](https://github.com/zmurez/MediaPipePyTorch/) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/mediapipe_hand/export.py b/qai_hub_models/models/mediapipe_hand/export.py index 111f81bd..3ab65b6f 100644 --- a/qai_hub_models/models/mediapipe_hand/export.py +++ b/qai_hub_models/models/mediapipe_hand/export.py @@ -10,14 +10,15 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Mapping, Optional, Tuple, cast +from typing import Any, Dict, List, Mapping, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.mediapipe_hand import Model from qai_hub_models.utils.args import export_parser, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel, TargetRuntime +from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -45,20 +46,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Mapping[ - str, Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] -] | List[str]: +) -> Mapping[str, ExportResult] | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6.
Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -83,10 +82,10 @@ def export_model( `model_cls.from_pretrained` Returns: - A Mapping from component_name to a 3-tuple of: + A Mapping from component_name to a struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "mediapipe_hand" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -118,7 +117,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) components_dict: Dict[str, BaseModel] = {} if "MediaPipeHandDetector" in components: @@ -135,7 +134,7 @@ def export_model( component.to("cpu"), make_torch_inputs(input_spec) ) - # 2. Compile the models to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = component.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -151,7 +150,7 @@ def export_model( hub.client.CompileJob, submitted_compile_job ) - # 3. Profile the model assets on real devices + # 3. Profiles the model performance on a real device profile_jobs: Dict[str, hub.client.ProfileJob] = {} if not skip_profiling: for component_name in components: @@ -169,7 +168,7 @@ def export_model( hub.client.ProfileJob, submitted_profile_job ) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_jobs: Dict[str, hub.client.InferenceJob] = {} if not skip_inferencing: for component_name in components: @@ -193,14 +192,14 @@ def export_model( hub.client.InferenceJob, submitted_inference_job ) - # 5. Download the model assets to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) for component_name, compile_job in compile_jobs.items(): target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / component_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: for component_name in components: profile_job = profile_jobs[component_name] @@ -225,10 +224,10 @@ def export_model( ) return { - component_name: ( - compile_jobs[component_name], - profile_jobs.get(component_name, None), - inference_jobs.get(component_name, None), + component_name: ExportResult( + compile_job=compile_jobs[component_name], + inference_job=inference_jobs.get(component_name, None), + profile_job=profile_jobs.get(component_name, None), ) for component_name in components } diff --git a/qai_hub_models/models/mediapipe_hand/perf.yaml b/qai_hub_models/models/mediapipe_hand/perf.yaml index 10d59228..2cc0d1fb 100644 --- a/qai_hub_models/models/mediapipe_hand/perf.yaml +++ b/qai_hub_models/models/mediapipe_hand/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: MediaPipeHandDetector performance_metrics: - torchscript_onnx_tflite: - inference_time: 714.0 - throughput: 1400.5602240896358 + inference_time: 704.0 + throughput: 1420.4545454545455 estimated_peak_memory_range: - min: 12288 - max: 5003216 + min: 20480 + max: 3734688 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,29 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 149 - job_id: jlpe9re7g - job_status: Passed - torchscript_onnx_qnn: - inference_time: 791.0 - throughput: 1264.2225031605562 - estimated_peak_memory_range: - min: 716800 - max: 20735568 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 195 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 195 - job_id: j2p0y122g + job_id: jp14zjynp job_status: Passed torchscript_onnx: - inference_time: 1150.0 - throughput: 869.5652173913044 + inference_time: 1160.0 + throughput: 862.0689655172414 estimated_peak_memory_range: - min: 32768 - max: 6079328 + min: 20480 + max: 18222304 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +71,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 196 - job_id: jz57zjlqp + job_id: jglvmx325 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +80,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:17:29Z' + timestamp: '2024-10-15T00:15:06Z' - torchscript_onnx_tflite: - inference_time: 565.0 - throughput: 1769.9115044247787 + inference_time: 612.0 + throughput: 1633.986928104575 estimated_peak_memory_range: min: 12288 - max: 59678496 + max: 61765328 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,29 +94,14 @@ models: layers_on_gpu: 
0 layers_on_cpu: 0 total_layers: 149 - job_id: jz5wod2zp - job_status: Passed - torchscript_onnx_qnn: - inference_time: 622.0 - throughput: 1607.717041800643 - estimated_peak_memory_range: - min: 806912 - max: 18903824 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 195 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 195 - job_id: jogkzlqyg + job_id: j57yr40n5 job_status: Passed torchscript_onnx: - inference_time: 949.0 - throughput: 1053.740779768177 + inference_time: 903.0 + throughput: 1107.4197120708748 estimated_peak_memory_range: - min: 307200 - max: 68754544 + min: 0 + max: 70548400 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +109,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 196 - job_id: j0pxv76jg + job_id: jp3j09emg job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +118,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:17:31Z' + timestamp: '2024-10-15T00:15:08Z' - torchscript_onnx_tflite: - inference_time: 720.0 - throughput: 1388.888888888889 + inference_time: 706.0 + throughput: 1416.4305949008499 estimated_peak_memory_range: - min: 28672 - max: 4251000 + min: 12288 + max: 118955440 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,22 +132,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 149 - job_id: jnp10dyk5 - job_status: Passed - torchscript_onnx_qnn: - inference_time: 769.0 - throughput: 1300.3901170351105 - estimated_peak_memory_range: - min: 0 - max: 1765648 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 195 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 195 - job_id: j1p3k41x5 + job_id: jpxko4n85 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +140,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:17:20Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:14:35Z' - torchscript_onnx_tflite: - inference_time: 1290.0 - throughput: 775.1937984496124 + inference_time: 711.0 + throughput: 1406.4697609001407 estimated_peak_memory_range: - min: 16384 - max: 54753632 + min: 28672 + max: 63810304 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,37 +155,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 149 - job_id: jz57zj0qp - job_status: Passed - torchscript_onnx_qnn: - inference_time: 1401.0 - throughput: 713.7758743754462 - estimated_peak_memory_range: - min: 802816 - max: 17189520 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 195 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 195 - job_id: jnp10dwk5 + job_id: jglvmx225 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:17:27Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:14:43Z' - torchscript_onnx_tflite: - inference_time: 710.0 - throughput: 1408.4507042253522 + inference_time: 706.0 + throughput: 1416.4305949008499 estimated_peak_memory_range: - min: 28672 - max: 5703480 + min: 12288 + max: 3533936 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,22 +178,30 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 149 - job_id: j0pxv7njg + job_id: jp0z0j205 job_status: Passed - torchscript_onnx_qnn: - inference_time: 789.0 - throughput: 
1267.427122940431 + reference_device_info: + name: SA8775 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:14:41Z' + - torchscript_onnx_tflite: + inference_time: 708.0 + throughput: 1412.4293785310736 estimated_peak_memory_range: - min: 864256 - max: 2525072 + min: 24576 + max: 3593208 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 195 + layers_on_npu: 149 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 195 - job_id: j1pv31r75 + total_layers: 149 + job_id: jp2kyw06p job_status: Passed reference_device_info: name: SA8650 (Proxy) @@ -263,14 +209,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:17:22Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:14:39Z' - torchscript_onnx_tflite: - inference_time: 713.0 - throughput: 1402.5245441795232 + inference_time: 1321.0 + throughput: 757.002271006813 estimated_peak_memory_range: - min: 28672 - max: 4408696 + min: 12288 + max: 55013504 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,37 +224,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 149 - job_id: jegn29mvg - job_status: Passed - torchscript_onnx_qnn: - inference_time: 790.0 - throughput: 1265.8227848101267 - estimated_peak_memory_range: - min: 819200 - max: 2107504 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 195 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 195 - job_id: jlpe9rw7g + job_id: jgn6vnlj5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:17:24Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:14:37Z' - torchscript_onnx_tflite: - inference_time: 710.0 - throughput: 1408.4507042253522 + inference_time: 529.0 + throughput: 1890.359168241966 estimated_peak_memory_range: - min: 28672 - max: 3690112 + min: 8192 + max: 29453520 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,52 +247,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 149 - job_id: jep2879xp + job_id: jpv6kdrz5 job_status: Passed - torchscript_onnx_qnn: - inference_time: 798.0 - throughput: 1253.1328320802006 + torchscript_onnx: + inference_time: 878.0 + throughput: 1138.9521640091116 estimated_peak_memory_range: - min: 823296 - max: 2057064 + min: 0 + max: 34077424 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 195 + layers_on_npu: 196 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 195 - job_id: jz5wod3zp + total_layers: 196 + job_id: j5we67q45 job_status: Passed reference_device_info: - name: SA8255 (Proxy) - os: '13' - form_factor: Auto + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:17:25Z' - - torchscript_onnx_qnn: - inference_time: 925.0 - throughput: 1081.081081081081 - estimated_peak_memory_range: - min: 786432 - max: 786432 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 195 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 195 - job_id: j1gln02ep - job_status: Passed - torchscript_onnx: - inference_time: 1178.0 - throughput: 848.8964346349745 + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:15:13Z' + - torchscript_onnx: + inference_time: 1204.0 + throughput: 830.5647840531561 
estimated_peak_memory_range: - min: 4263936 - max: 4263936 + min: 5832704 + max: 5832704 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +285,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 196 - job_id: jegn293vg + job_id: jpv6kdvz5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,15 +294,15 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:17:32Z' + timestamp: '2024-10-15T00:15:10Z' - name: MediaPipeHandLandmarkDetector performance_metrics: - torchscript_onnx_tflite: - inference_time: 1048.0 - throughput: 954.1984732824427 + inference_time: 1030.0 + throughput: 970.8737864077669 estimated_peak_memory_range: min: 12288 - max: 57929352 + max: 1495504 primary_compute_unit: NPU precision: fp16 layer_info: @@ -394,29 +310,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 158 - job_id: jygzexozg - job_status: Passed - torchscript_onnx_qnn: - inference_time: 1109.0 - throughput: 901.7132551848512 - estimated_peak_memory_range: - min: 1626112 - max: 41405288 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 208 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 208 - job_id: j1p8o3mzg + job_id: jgdx13e6p job_status: Passed torchscript_onnx: - inference_time: 1575.0 - throughput: 634.9206349206349 + inference_time: 1552.0 + throughput: 644.3298969072165 estimated_peak_memory_range: min: 12288 - max: 7777872 + max: 8154736 primary_compute_unit: NPU precision: fp16 layer_info: @@ -424,7 +325,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 209 - job_id: jqp4qxdqg + job_id: j56y47nnp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -433,13 +334,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:17:29Z' + timestamp: '2024-10-15T00:15:06Z' - torchscript_onnx_tflite: - inference_time: 907.0 - throughput: 1102.5358324145534 + inference_time: 848.0 + throughput: 1179.245283018868 estimated_peak_memory_range: min: 12288 - max: 64177952 + max: 64923696 primary_compute_unit: NPU precision: fp16 layer_info: @@ -447,29 +348,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 158 - job_id: jmg9v3jq5 - job_status: Passed - torchscript_onnx_qnn: - inference_time: 943.0 - throughput: 1060.4453870625662 - estimated_peak_memory_range: - min: 802816 - max: 20160416 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 208 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 208 - job_id: jn5q87r75 + job_id: jp4lr1k25 job_status: Passed torchscript_onnx: - inference_time: 1255.0 - throughput: 796.8127490039841 + inference_time: 1213.0 + throughput: 824.4023083264633 estimated_peak_memory_range: - min: 0 - max: 66155712 + min: 327680 + max: 68004160 primary_compute_unit: NPU precision: fp16 layer_info: @@ -477,7 +363,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 209 - job_id: jo5mrw6yg + job_id: jgo26r31p job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -486,13 +372,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:17:31Z' + timestamp: '2024-10-15T00:15:08Z' - torchscript_onnx_tflite: - inference_time: 1005.0 - throughput: 995.0248756218906 + inference_time: 1003.0 + throughput: 997.0089730807578 estimated_peak_memory_range: - min: 32768 - max: 14970560 + min: 53248 + max: 179286624 primary_compute_unit: NPU precision: fp16 layer_info: @@ 
-500,22 +386,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 158 - job_id: jvgdwrek5 - job_status: Passed - torchscript_onnx_qnn: - inference_time: 1090.0 - throughput: 917.4311926605504 - estimated_peak_memory_range: - min: 819200 - max: 1921360 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 208 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 208 - job_id: jwgoy1n45 + job_id: j5mnxmq7p job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -523,14 +394,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:17:21Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:14:35Z' - torchscript_onnx_tflite: - inference_time: 2570.0 - throughput: 389.10505836575874 + inference_time: 1008.0 + throughput: 992.063492063492 estimated_peak_memory_range: - min: 12288 - max: 57590704 + min: 49152 + max: 1565352 primary_compute_unit: NPU precision: fp16 layer_info: @@ -538,22 +409,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 158 - job_id: jqp4qxkqg + job_id: j56y47znp job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:17:09Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:14:43Z' - torchscript_onnx_tflite: - inference_time: 1015.0 - throughput: 985.2216748768473 + inference_time: 1004.0 + throughput: 996.01593625498 estimated_peak_memory_range: - min: 24576 - max: 1455056 + min: 12288 + max: 1431104 primary_compute_unit: NPU precision: fp16 layer_info: @@ -561,22 +432,30 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 158 - job_id: jo5mrwqyg + job_id: jp8qyxmqp job_status: Passed - torchscript_onnx_qnn: - inference_time: 1091.0 - throughput: 916.5902841429881 + reference_device_info: + name: SA8775 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:14:41Z' + - torchscript_onnx_tflite: + inference_time: 1035.0 + throughput: 966.1835748792271 estimated_peak_memory_range: - min: 868352 - max: 2128472 + min: 20480 + max: 1344472 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 208 + layers_on_npu: 158 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 208 - job_id: j7gjx027p + total_layers: 158 + job_id: jpy13xr0p job_status: Passed reference_device_info: name: SA8650 (Proxy) @@ -584,14 +463,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:17:22Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:14:39Z' - torchscript_onnx_tflite: - inference_time: 999.0 - throughput: 1001.001001001001 + inference_time: 2590.0 + throughput: 386.1003861003861 estimated_peak_memory_range: - min: 28672 - max: 1513784 + min: 12288 + max: 58059440 primary_compute_unit: NPU precision: fp16 layer_info: @@ -599,37 +478,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 158 - job_id: joprk42v5 - job_status: Passed - torchscript_onnx_qnn: - inference_time: 1148.0 - throughput: 871.0801393728223 - estimated_peak_memory_range: - min: 823296 - max: 2426232 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 208 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 208 - job_id: jygzexjzg + job_id: jprv308kg job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: 
QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:17:24Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:14:38Z' - torchscript_onnx_tflite: - inference_time: 1053.0 - throughput: 949.667616334283 + inference_time: 585.0 + throughput: 1709.4017094017095 estimated_peak_memory_range: - min: 20480 - max: 1470912 + min: 8192 + max: 33374032 primary_compute_unit: NPU precision: fp16 layer_info: @@ -637,52 +501,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 158 - job_id: jqpye4jrg + job_id: jgjvn721g job_status: Passed - torchscript_onnx_qnn: - inference_time: 1110.0 - throughput: 900.9009009009009 + torchscript_onnx: + inference_time: 1068.0 + throughput: 936.3295880149813 estimated_peak_memory_range: - min: 831488 - max: 2170544 + min: 0 + max: 39068848 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 208 + layers_on_npu: 209 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 208 - job_id: jmg9v3yq5 + total_layers: 209 + job_id: jg9lnmwmg job_status: Passed reference_device_info: - name: SA8255 (Proxy) - os: '13' - form_factor: Auto + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:17:26Z' - - torchscript_onnx_qnn: - inference_time: 1339.0 - throughput: 746.8259895444362 - estimated_peak_memory_range: - min: 786432 - max: 786432 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 208 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 208 - job_id: jw5663zv5 - job_status: Passed - torchscript_onnx: - inference_time: 1619.0 - throughput: 617.6652254478073 + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:15:14Z' + - torchscript_onnx: + inference_time: 1641.0 + throughput: 609.3845216331505 estimated_peak_memory_range: - min: 6717440 - max: 6717440 + min: 8015872 + max: 8015872 primary_compute_unit: NPU precision: fp16 layer_info: @@ -690,7 +539,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 209 - job_id: joprk4ev5 + job_id: jgjvn7e1g job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -699,4 +548,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:17:33Z' + timestamp: '2024-10-15T00:15:10Z' diff --git a/qai_hub_models/models/mediapipe_pose/README.md b/qai_hub_models/models/mediapipe_pose/README.md index 4df97c19..02ceb20e 100644 --- a/qai_hub_models/models/mediapipe_pose/README.md +++ b/qai_hub_models/models/mediapipe_pose/README.md @@ -6,7 +6,7 @@ The MediaPipe Pose Landmark Detector is a machine learning pipeline that predicts bounding boxes and pose skeletons of poses in an image. This is based on the implementation of MediaPipe-Pose-Estimation found -[here](https://github.com/zmurez/MediaPipePyTorch/). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/mediapipe_pose). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.mediapipe_pose.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. 
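Beyond the CLI invocation above, the reworked export script (diffed below for this model) can also be driven from Python. A minimal sketch of consuming the new `Mapping[str, ExportResult]` return type — the device name and the `isinstance` check on the skip-mode return are assumptions, and running this requires configured AI Hub access:

```python
# Hypothetical driver, not verbatim repo code: submit the export pipeline
# and read the per-component ExportResult mapping it now returns.
from qai_hub_models.models.mediapipe_pose.export import export_model

results = export_model(device="Samsung Galaxy S24")
if not isinstance(results, list):  # a List[str] comes back in some skip modes
    for component, result in results.items():
        print(component, result.compile_job)    # compile job always submitted
        print(component, result.profile_job)    # None when skip_profiling=True
        print(component, result.inference_job)  # None when skip_inferencing=True
```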
+ ## License -- The license for the original implementation of MediaPipe-Pose-Estimation can be found +* The license for the original implementation of MediaPipe-Pose-Estimation can be found [here](https://github.com/zmurez/MediaPipePyTorch/blob/master/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [BlazePose: On-device Real-time Body Pose tracking](https://arxiv.org/abs/2006.10204) * [Source Model Implementation](https://github.com/zmurez/MediaPipePyTorch/) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/mediapipe_pose/export.py b/qai_hub_models/models/mediapipe_pose/export.py index dd844ad8..fae2a609 100644 --- a/qai_hub_models/models/mediapipe_pose/export.py +++ b/qai_hub_models/models/mediapipe_pose/export.py @@ -10,14 +10,15 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Mapping, Optional, Tuple, cast +from typing import Any, Dict, List, Mapping, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.mediapipe_pose import Model from qai_hub_models.utils.args import export_parser, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel, TargetRuntime +from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -45,20 +46,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Mapping[ - str, Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] -] | List[str]: +) -> Mapping[str, ExportResult] | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. 
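`ExportResult` is imported from `qai_hub_models.models.common` but never defined in this diff; here is a plausible reconstruction from the keyword arguments used to build it at the bottom of this file (the dataclass decorator and the `None` defaults are assumptions, not the repo's actual definition):

```python
# Assumed shape of qai_hub_models.models.common.ExportResult; field names
# come from the construction site in export.py, everything else is a guess.
from dataclasses import dataclass
from typing import Optional

import qai_hub as hub


@dataclass
class ExportResult:
    compile_job: hub.CompileJob
    inference_job: Optional[hub.InferenceJob] = None
    profile_job: Optional[hub.ProfileJob] = None
```

This replaces the old positional 3-tuple, so callers access jobs by field name rather than by index, and the reordered Returns bullets no longer imply a positional contract.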
@@ -83,10 +82,10 @@ def export_model( `model_cls.from_pretrained` Returns: - A Mapping from component_name to a 3-tuple of: + A Mapping from component_name to a struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "mediapipe_pose" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -118,7 +117,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) components_dict: Dict[str, BaseModel] = {} if "MediaPipePoseDetector" in components: @@ -135,7 +134,7 @@ def export_model( component.to("cpu"), make_torch_inputs(input_spec) ) - # 2. Compile the models to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = component.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -151,7 +150,7 @@ def export_model( hub.client.CompileJob, submitted_compile_job ) - # 3. Profile the model assets on real devices + # 3. Profiles the model performance on a real device profile_jobs: Dict[str, hub.client.ProfileJob] = {} if not skip_profiling: for component_name in components: @@ -169,7 +168,7 @@ def export_model( hub.client.ProfileJob, submitted_profile_job ) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_jobs: Dict[str, hub.client.InferenceJob] = {} if not skip_inferencing: for component_name in components: @@ -193,14 +192,14 @@ def export_model( hub.client.InferenceJob, submitted_inference_job ) - # 5. Download the model assets to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) for component_name, compile_job in compile_jobs.items(): target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / component_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: for component_name in components: profile_job = profile_jobs[component_name] @@ -225,10 +224,10 @@ def export_model( ) return { - component_name: ( - compile_jobs[component_name], - profile_jobs.get(component_name, None), - inference_jobs.get(component_name, None), + component_name: ExportResult( + compile_job=compile_jobs[component_name], + inference_job=inference_jobs.get(component_name, None), + profile_job=profile_jobs.get(component_name, None), ) for component_name in components } diff --git a/qai_hub_models/models/mediapipe_pose/perf.yaml b/qai_hub_models/models/mediapipe_pose/perf.yaml index a7530740..1099c765 100644 --- a/qai_hub_models/models/mediapipe_pose/perf.yaml +++ b/qai_hub_models/models/mediapipe_pose/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,29 +20,26 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: MediaPipePoseDetector performance_metrics: @@ -49,8 +47,8 @@ models: inference_time: 774.0 throughput: 1291.9896640826873 estimated_peak_memory_range: - min: 69632 - max: 1600768 + min: 28672 + max: 5196752 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,29 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 106 - job_id: j0pxv7k8g - job_status: Passed - torchscript_onnx_qnn: - inference_time: 838.0 - throughput: 1193.3174224343675 - estimated_peak_memory_range: - min: 12288 - max: 6020592 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 138 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 138 - job_id: j1pv31qz5 + job_id: jg9lnml8g job_status: Passed torchscript_onnx: - inference_time: 1013.0 - throughput: 987.1668311944719 + inference_time: 1009.0 + throughput: 991.0802775024777 estimated_peak_memory_range: - min: 221184 - max: 1711312 + min: 16384 + max: 4350520 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +71,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 139 - job_id: jegn29lvg + job_id: jp2kywx6p job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +80,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:16:30Z' + timestamp: '2024-10-15T00:13:57Z' - torchscript_onnx_tflite: inference_time: 669.0 throughput: 1494.7683109118086 estimated_peak_memory_range: - min: 61440 - max: 47617888 + min: 16384 + max: 49169056 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,29 +94,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 106 - job_id: jegn296jg - job_status: Passed - torchscript_onnx_qnn: - 
inference_time: 720.0 - throughput: 1388.888888888889 - estimated_peak_memory_range: - min: 208896 - max: 16873328 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 138 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 138 - job_id: jlpe9ro8g + job_id: jgdx13xzp job_status: Passed torchscript_onnx: - inference_time: 869.0 - throughput: 1150.7479861910242 + inference_time: 898.0 + throughput: 1113.5857461024498 estimated_peak_memory_range: min: 0 - max: 50983296 + max: 52554240 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +109,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 139 - job_id: jep2870xp + job_id: jp0z0j305 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +118,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:16:31Z' + timestamp: '2024-10-15T00:13:58Z' - torchscript_onnx_tflite: - inference_time: 770.0 - throughput: 1298.7012987012988 + inference_time: 774.0 + throughput: 1291.9896640826873 estimated_peak_memory_range: - min: 28672 - max: 15094632 + min: 53248 + max: 1412544 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,22 +132,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 106 - job_id: jep287k6p - job_status: Passed - torchscript_onnx_qnn: - inference_time: 816.0 - throughput: 1225.4901960784314 - estimated_peak_memory_range: - min: 237568 - max: 1474664 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 138 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 138 - job_id: jnp10d2n5 + job_id: jg9lnmlmg job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +140,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:16:21Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:13:27Z' - torchscript_onnx_tflite: - inference_time: 1898.0 - throughput: 526.8703898840885 + inference_time: 777.0 + throughput: 1287.001287001287 estimated_peak_memory_range: - min: 61440 - max: 42858912 + min: 86016 + max: 1434184 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,37 +155,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 106 - job_id: j2p0y140g - job_status: Passed - torchscript_onnx_qnn: - inference_time: 1990.0 - throughput: 502.51256281407035 - estimated_peak_memory_range: - min: 208896 - max: 14890608 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 138 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 138 - job_id: j0pxv79jg + job_id: jp2kywk6p job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:16:28Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:13:34Z' - torchscript_onnx_tflite: - inference_time: 772.0 - throughput: 1295.3367875647668 + inference_time: 779.0 + throughput: 1283.6970474967907 estimated_peak_memory_range: min: 28672 - max: 1446720 + max: 1387472 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,22 +178,30 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 106 - job_id: jogkzlvvg + job_id: jgn6vn6j5 job_status: Passed - torchscript_onnx_qnn: - inference_time: 819.0 - throughput: 1221.001221001221 + reference_device_info: + name: SA8775 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + 
chipset: SA8775P Proxy + timestamp: '2024-10-15T00:13:32Z' + - torchscript_onnx_tflite: + inference_time: 774.0 + throughput: 1291.9896640826873 estimated_peak_memory_range: - min: 233472 - max: 1953384 + min: 65536 + max: 1705552 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 138 + layers_on_npu: 106 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 138 - job_id: jz5wodwzp + total_layers: 106 + job_id: jp4lr1l25 job_status: Passed reference_device_info: name: SA8650 (Proxy) @@ -263,14 +209,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:16:22Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:13:30Z' - torchscript_onnx_tflite: - inference_time: 775.0 - throughput: 1290.3225806451612 + inference_time: 1892.0 + throughput: 528.5412262156448 estimated_peak_memory_range: - min: 16384 - max: 1545736 + min: 12288 + max: 43618656 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,37 +224,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 106 - job_id: j1gln042p - job_status: Passed - torchscript_onnx_qnn: - inference_time: 818.0 - throughput: 1222.4938875305625 - estimated_peak_memory_range: - min: 258048 - max: 1546952 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 138 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 138 - job_id: jnp10d2k5 + job_id: jgdx13x6p job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:16:24Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:13:29Z' - torchscript_onnx_tflite: - inference_time: 778.0 - throughput: 1285.3470437017995 + inference_time: 457.0 + throughput: 2188.183807439825 estimated_peak_memory_range: - min: 28672 - max: 1582800 + min: 12288 + max: 24848112 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,52 +247,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 106 - job_id: j1p3k4nm5 + job_id: jgkex4vvg job_status: Passed - torchscript_onnx_qnn: - inference_time: 820.0 - throughput: 1219.5121951219512 + torchscript_onnx: + inference_time: 755.0 + throughput: 1324.5033112582782 estimated_peak_memory_range: - min: 212992 - max: 1943736 + min: 0 + max: 27046240 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 138 + layers_on_npu: 139 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 138 - job_id: jz57zj2qp + total_layers: 139 + job_id: jp3j09vmg job_status: Passed reference_device_info: - name: SA8255 (Proxy) - os: '13' - form_factor: Auto + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:16:26Z' - - torchscript_onnx_qnn: - inference_time: 977.0 - throughput: 1023.5414534288639 - estimated_peak_memory_range: - min: 466944 - max: 466944 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 138 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 138 - job_id: jz5wodw4p - job_status: Passed - torchscript_onnx: - inference_time: 1053.0 - throughput: 949.667616334283 + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:14:04Z' + - torchscript_onnx: + inference_time: 1057.0 + throughput: 946.073793755913 estimated_peak_memory_range: - min: 2973696 - max: 2973696 + min: 3051520 + max: 3051520 primary_compute_unit: NPU precision: fp16 layer_info: 
@@ -369,7 +285,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 139 - job_id: j2p0y132g + job_id: jgkex47vg job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,15 +294,15 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:16:33Z' + timestamp: '2024-10-15T00:14:00Z' - name: MediaPipePoseLandmarkDetector performance_metrics: - torchscript_onnx_tflite: - inference_time: 832.0 - throughput: 1201.923076923077 + inference_time: 831.0 + throughput: 1203.3694344163657 estimated_peak_memory_range: - min: 36864 - max: 2373544 + min: 12288 + max: 6408528 primary_compute_unit: NPU precision: fp16 layer_info: @@ -394,29 +310,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 219 - job_id: jo5mrwn7g - job_status: Passed - torchscript_onnx_qnn: - inference_time: 915.0 - throughput: 1092.896174863388 - estimated_peak_memory_range: - min: 12288 - max: 39718592 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 290 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 290 - job_id: j7gjx0d1p + job_id: jp14zj47p job_status: Passed torchscript_onnx: - inference_time: 1333.0 - throughput: 750.1875468867216 + inference_time: 1315.0 + throughput: 760.4562737642585 estimated_peak_memory_range: - min: 12288 - max: 9216760 + min: 28672 + max: 9434736 primary_compute_unit: NPU precision: fp16 layer_info: @@ -424,7 +325,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 291 - job_id: joprk48v5 + job_id: jpy13xz0p job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -433,13 +334,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:16:30Z' + timestamp: '2024-10-15T00:13:57Z' - torchscript_onnx_tflite: - inference_time: 665.0 - throughput: 1503.7593984962407 + inference_time: 705.0 + throughput: 1418.4397163120568 estimated_peak_memory_range: - min: 16384 - max: 93101840 + min: 12288 + max: 94875760 primary_compute_unit: NPU precision: fp16 layer_info: @@ -447,29 +348,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 219 - job_id: joprk4vk5 - job_status: Passed - torchscript_onnx_qnn: - inference_time: 724.0 - throughput: 1381.2154696132598 - estimated_peak_memory_range: - min: 0 - max: 20107360 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 290 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 290 - job_id: jygzex24g + job_id: j5we67e45 job_status: Passed torchscript_onnx: - inference_time: 1052.0 - throughput: 950.5703422053232 + inference_time: 1012.0 + throughput: 988.1422924901186 estimated_peak_memory_range: min: 0 - max: 96993728 + max: 100694720 primary_compute_unit: NPU precision: fp16 layer_info: @@ -477,7 +363,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 291 - job_id: jqpye4rrg + job_id: jp8qyx0qp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -486,13 +372,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:16:32Z' + timestamp: '2024-10-15T00:13:59Z' - torchscript_onnx_tflite: - inference_time: 819.0 - throughput: 1221.001221001221 + inference_time: 818.0 + throughput: 1222.4938875305625 estimated_peak_memory_range: min: 12288 - max: 1485816 + max: 1410992 primary_compute_unit: NPU precision: fp16 layer_info: @@ -500,22 +386,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 219 - job_id: jqpye410g - job_status: Passed - torchscript_onnx_qnn: - 
inference_time: 919.0 - throughput: 1088.139281828074 - estimated_peak_memory_range: - min: 811008 - max: 2001472 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 290 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 290 - job_id: jvgdwrn65 + job_id: jp14zj4np job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -523,14 +394,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:16:21Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:13:27Z' - torchscript_onnx_tflite: - inference_time: 1820.0 - throughput: 549.4505494505495 + inference_time: 841.0 + throughput: 1189.0606420927468 estimated_peak_memory_range: - min: 12288 - max: 82623056 + min: 24576 + max: 8275072 primary_compute_unit: NPU precision: fp16 layer_info: @@ -538,22 +409,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 219 - job_id: j1p8o32qg + job_id: jpy13x10p job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:16:09Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:13:34Z' - torchscript_onnx_tflite: - inference_time: 831.0 - throughput: 1203.3694344163657 + inference_time: 826.0 + throughput: 1210.6537530266344 estimated_peak_memory_range: - min: 12288 - max: 1682648 + min: 16384 + max: 2589320 primary_compute_unit: NPU precision: fp16 layer_info: @@ -561,22 +432,30 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 219 - job_id: jn5q870e5 + job_id: jprv30vkg job_status: Passed - torchscript_onnx_qnn: - inference_time: 904.0 - throughput: 1106.1946902654868 + reference_device_info: + name: SA8775 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:13:32Z' + - torchscript_onnx_tflite: + inference_time: 846.0 + throughput: 1182.033096926714 estimated_peak_memory_range: - min: 827392 - max: 2088096 + min: 20480 + max: 5743568 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 290 + layers_on_npu: 219 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 290 - job_id: jmg9v30q5 + total_layers: 219 + job_id: jpxko4k85 job_status: Passed reference_device_info: name: SA8650 (Proxy) @@ -584,14 +463,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:16:23Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:13:31Z' - torchscript_onnx_tflite: - inference_time: 821.0 - throughput: 1218.026796589525 + inference_time: 1814.0 + throughput: 551.2679162072767 estimated_peak_memory_range: min: 12288 - max: 43989096 + max: 82768992 primary_compute_unit: NPU precision: fp16 layer_info: @@ -599,37 +478,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 219 - job_id: jw56632n5 - job_status: Passed - torchscript_onnx_qnn: - inference_time: 901.0 - throughput: 1109.8779134295228 - estimated_peak_memory_range: - min: 811008 - max: 2141312 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 290 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 290 - job_id: jvgdwrnk5 + job_id: j57yr4yn5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:16:25Z' + 
chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:13:29Z' - torchscript_onnx_tflite: - inference_time: 826.0 - throughput: 1210.6537530266344 + inference_time: 549.0 + throughput: 1821.4936247723133 estimated_peak_memory_range: - min: 12288 - max: 1473008 + min: 8192 + max: 36712976 primary_compute_unit: NPU precision: fp16 layer_info: @@ -637,52 +501,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 219 - job_id: jwgoy1z15 + job_id: j5q6qy0ep job_status: Passed - torchscript_onnx_qnn: - inference_time: 893.0 - throughput: 1119.8208286674133 + torchscript_onnx: + inference_time: 916.0 + throughput: 1091.703056768559 estimated_peak_memory_range: - min: 819200 - max: 2073072 + min: 245760 + max: 44747168 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 290 + layers_on_npu: 291 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 290 - job_id: jqp4qxnqg + total_layers: 291 + job_id: jgo26rk1p job_status: Passed reference_device_info: - name: SA8255 (Proxy) - os: '13' - form_factor: Auto + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:16:27Z' - - torchscript_onnx_qnn: - inference_time: 1121.0 - throughput: 892.0606601248885 - estimated_peak_memory_range: - min: 786432 - max: 786432 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 290 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 290 - job_id: jmg9v30m5 - job_status: Passed - torchscript_onnx: - inference_time: 1404.0 - throughput: 712.2507122507122 + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:14:04Z' + - torchscript_onnx: + inference_time: 1382.0 + throughput: 723.589001447178 estimated_peak_memory_range: - min: 8105984 - max: 8105984 + min: 8028160 + max: 8028160 primary_compute_unit: NPU precision: fp16 layer_info: @@ -690,7 +539,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 291 - job_id: j1p8o30zg + job_id: j5q6qyeep job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -699,4 +548,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:16:34Z' + timestamp: '2024-10-15T00:14:01Z' diff --git a/qai_hub_models/models/mediapipe_selfie/README.md b/qai_hub_models/models/mediapipe_selfie/README.md index 49115f4e..e24f4220 100644 --- a/qai_hub_models/models/mediapipe_selfie/README.md +++ b/qai_hub_models/models/mediapipe_selfie/README.md @@ -6,7 +6,7 @@ Light-weight model that segments a person from the background in square or landscape selfie and video conference imagery. This is based on the implementation of MediaPipe-Selfie-Segmentation found -[here](https://github.com/google/mediapipe/tree/master/mediapipe/modules/selfie_segmentation). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/mediapipe_selfie). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.mediapipe_selfie.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. 
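For orientation, step 1 of the recipe in the selfie export script (diffed below) condenses to a few lines; this sketch drops the kwargs plumbing (`get_model_kwargs`/`get_input_spec_kwargs`) that the real script threads through:

```python
# Condensed sketch of step 1: load the pretrained selfie-segmentation model
# and trace it to TorchScript on CPU, as export.py below does.
import torch

from qai_hub_models.models.mediapipe_selfie import Model
from qai_hub_models.utils.input_spec import make_torch_inputs

model = Model.from_pretrained()
input_spec = model.get_input_spec()
traced = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec))
```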
+ ## License -- The license for the original implementation of MediaPipe-Selfie-Segmentation can be found +* The license for the original implementation of MediaPipe-Selfie-Segmentation can be found [here](https://github.com/google/mediapipe/blob/master/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Image segmentation guide](https://developers.google.com/mediapipe/solutions/vision/image_segmenter/) * [Source Model Implementation](https://github.com/google/mediapipe/tree/master/mediapipe/modules/selfie_segmentation) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/mediapipe_selfie/export.py b/qai_hub_models/models/mediapipe_selfie/export.py index fb867c80..21e3ae07 100644 --- a/qai_hub_models/models/mediapipe_selfie/export.py +++ b/qai_hub_models/models/mediapipe_selfie/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.mediapipe_selfie import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. 
- * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "mediapipe_selfie" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/mediapipe_selfie/perf.yaml b/qai_hub_models/models/mediapipe_selfie/perf.yaml index 6e1606bd..0ffd5a08 100644 --- a/qai_hub_models/models/mediapipe_selfie/perf.yaml +++ b/qai_hub_models/models/mediapipe_selfie/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: MediaPipe-Selfie-Segmentation performance_metrics: - torchscript_onnx_tflite: - inference_time: 699.0 - throughput: 1430.615164520744 + inference_time: 698.0 + throughput: 1432.6647564469913 estimated_peak_memory_range: - min: 12288 - max: 1574920 + min: 290816 + max: 2000184 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 118 - job_id: jegn29vjg + job_id: jgjvn7neg job_status: Passed torchscript_onnx_qnn: - inference_time: 775.0 - throughput: 1290.3225806451612 + inference_time: 774.0 + throughput: 1291.9896640826873 estimated_peak_memory_range: - min: 811008 - max: 4052192 + min: 806912 + max: 26386216 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 138 - job_id: jn5q876e5 + job_id: jpxko4ol5 job_status: Passed torchscript_onnx: - inference_time: 1336.0 - throughput: 748.502994011976 + inference_time: 1320.0 + throughput: 757.5757575757576 estimated_peak_memory_range: - min: 589824 - max: 15605624 + min: 32768 + max: 3657536 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 140 - job_id: jygzex34g + job_id: jglvmxvm5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:15:36Z' + timestamp: '2024-10-15T00:12:53Z' - torchscript_onnx_tflite: - inference_time: 471.0 - throughput: 2123.1422505307855 + inference_time: 472.0 + throughput: 2118.64406779661 estimated_peak_memory_range: min: 12288 - max: 29493584 + max: 30256864 primary_compute_unit: NPU 
precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 118 - job_id: joprk43k5 + job_id: jpedmzmv5 job_status: Passed torchscript_onnx_qnn: - inference_time: 527.0 - throughput: 1897.5332068311195 + inference_time: 525.0 + throughput: 1904.7619047619048 estimated_peak_memory_range: - min: 806912 - max: 15650160 + min: 0 + max: 13296544 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 138 - job_id: j1gln0v2p + job_id: j5mnxmx9p job_status: Passed torchscript_onnx: - inference_time: 905.0 - throughput: 1104.9723756906078 + inference_time: 899.0 + throughput: 1112.3470522803113 estimated_peak_memory_range: min: 0 - max: 32653568 + max: 33673888 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 140 - job_id: jz5wode4p + job_id: j56y47yyp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:15:37Z' + timestamp: '2024-10-15T00:12:55Z' - torchscript_onnx_tflite: - inference_time: 702.0 - throughput: 1424.5014245014245 + inference_time: 696.0 + throughput: 1436.7816091954023 estimated_peak_memory_range: - min: 16384 - max: 4487320 + min: 12288 + max: 1477832 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 118 - job_id: jep287y6p + job_id: jgz3dmdx5 job_status: Passed torchscript_onnx_qnn: - inference_time: 761.0 - throughput: 1314.060446780552 + inference_time: 756.0 + throughput: 1322.7513227513227 estimated_peak_memory_range: - min: 819200 - max: 2749112 + min: 815104 + max: 2186448 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 138 - job_id: j1p3k4jm5 + job_id: jprv3037g job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:15:31Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:12:46Z' - torchscript_onnx_tflite: - inference_time: 931.0 - throughput: 1074.1138560687432 + inference_time: 698.0 + throughput: 1432.6647564469913 estimated_peak_memory_range: - min: 12288 - max: 28976208 + min: 28672 + max: 1636016 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 118 - job_id: jqpye430g + job_id: jgdx131zp job_status: Passed torchscript_onnx_qnn: - inference_time: 997.0 - throughput: 1003.0090270812437 + inference_time: 755.0 + throughput: 1324.5033112582782 estimated_peak_memory_range: - min: 802816 - max: 16742848 + min: 811008 + max: 2166472 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 138 - job_id: jlpe9rd8g + job_id: jp0z0j0n5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:15:35Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:12:49Z' - torchscript_onnx_tflite: - inference_time: 702.0 - throughput: 1424.5014245014245 + inference_time: 697.0 + throughput: 1434.7202295552368 
estimated_peak_memory_range: min: 12288 - max: 1762976 + max: 71148648 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 118 - job_id: j2p0y1z0g + job_id: jp14zjz7p job_status: Passed torchscript_onnx_qnn: - inference_time: 766.0 - throughput: 1305.4830287206266 + inference_time: 760.0 + throughput: 1315.7894736842106 estimated_peak_memory_range: - min: 827392 - max: 2140080 + min: 823296 + max: 2206160 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 138 - job_id: jwgoy1215 + job_id: jpy13x3lp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:15:32Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:12:48Z' - torchscript_onnx_tflite: - inference_time: 702.0 - throughput: 1424.5014245014245 + inference_time: 704.0 + throughput: 1420.4545454545455 estimated_peak_memory_range: - min: 20480 - max: 1562888 + min: 28672 + max: 1979288 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 118 - job_id: j1p8o3qqg + job_id: jg9lnmn8g job_status: Passed torchscript_onnx_qnn: - inference_time: 761.0 - throughput: 1314.060446780552 + inference_time: 763.0 + throughput: 1310.615989515072 estimated_peak_memory_range: min: 819200 - max: 2150504 + max: 2181288 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 138 - job_id: j1pv316z5 + job_id: jp2kywyqp job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:15:33Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:12:47Z' - torchscript_onnx_tflite: - inference_time: 708.0 - throughput: 1412.4293785310736 + inference_time: 934.0 + throughput: 1070.6638115631692 estimated_peak_memory_range: - min: 24576 - max: 2987208 + min: 16384 + max: 29442032 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 118 - job_id: jogkzlevg + job_id: j5we676m5 job_status: Passed torchscript_onnx_qnn: - inference_time: 775.0 - throughput: 1290.3225806451612 + inference_time: 995.0 + throughput: 1005.0251256281407 estimated_peak_memory_range: - min: 819200 - max: 2135888 + min: 802816 + max: 16557232 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 138 - job_id: j7gjx0v1p + job_id: jgkex4xng job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:12:51Z' + - torchscript_onnx_tflite: + inference_time: 367.0 + throughput: 2724.7956403269754 + estimated_peak_memory_range: + min: 8192 + max: 19071632 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 118 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 118 + job_id: jp4lr1r15 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 518.0 + throughput: 1930.5019305019305 + estimated_peak_memory_range: + min: 0 + max: 10835744 + 
primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 138 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 138 + job_id: j5q6qyqop + job_status: Passed + torchscript_onnx: + inference_time: 871.0 + throughput: 1148.105625717566 + estimated_peak_memory_range: + min: 0 + max: 23945696 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 140 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 140 + job_id: jpv6kd6r5 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:15:34Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:12:57Z' - torchscript_onnx_qnn: - inference_time: 915.0 - throughput: 1092.896174863388 + inference_time: 908.0 + throughput: 1101.3215859030836 estimated_peak_memory_range: min: 786432 max: 786432 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 138 - job_id: jw5663yn5 + job_id: jgn6vnvq5 job_status: Passed torchscript_onnx: - inference_time: 1374.0 - throughput: 727.802037845706 + inference_time: 1367.0 + throughput: 731.528895391368 estimated_peak_memory_range: - min: 1953792 - max: 1953792 + min: 1912832 + max: 1912832 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 140 - job_id: jmg9v3lm5 + job_id: jp3j09jng job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:15:38Z' + timestamp: '2024-10-15T00:12:56Z' diff --git a/qai_hub_models/models/midas/README.md b/qai_hub_models/models/midas/README.md index d8e6479e..2295b940 100644 --- a/qai_hub_models/models/midas/README.md +++ b/qai_hub_models/models/midas/README.md @@ -6,7 +6,7 @@ Midas is designed for estimating depth at each point in an image. This is based on the implementation of Midas-V2 found -[here](https://github.com/isl-org/MiDaS). This repository contains scripts for optimized on-device +[here](https://github.com/isl-org/MiDaS). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/midas). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.midas.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Midas-V2 can be found +* The license for the original implementation of Midas-V2 can be found [here](https://github.com/isl-org/MiDaS/blob/master/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer](https://arxiv.org/abs/1907.01341v3) * [Source Model Implementation](https://github.com/isl-org/MiDaS) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/midas/export.py b/qai_hub_models/models/midas/export.py index b5091219..fe5f4cbf 100644 --- a/qai_hub_models/models/midas/export.py +++ b/qai_hub_models/models/midas/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.midas import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
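The docstring change above replaces `export_model`'s positional 3-tuple with a named `ExportResult`, so callers reach jobs by field instead of by index. Below is a minimal sketch of the new calling pattern, not taken from the diff itself: the device name is illustrative, and the only job methods used are the ones exercised elsewhere in this patch (`get_target_model`, `wait`, `download_profile`).

```python
from qai_hub_models.models.midas.export import export_model

result = export_model(device="Samsung Galaxy S23", skip_inferencing=True)
if not isinstance(result, list):  # the signature also allows a List[str] return
    # Jobs are now accessed by name rather than by tuple position.
    target_model = result.compile_job.get_target_model()
    if result.profile_job is not None and result.profile_job.wait().success:
        profile_data = result.profile_job.download_profile()
```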
""" model_name = "midas" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -122,7 +120,7 @@ def export_model( model.to("cpu"), make_torch_inputs(input_spec), check_trace=False ) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -136,7 +134,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -151,7 +149,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -172,13 +170,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -199,7 +197,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/midas/perf.yaml b/qai_hub_models/models/midas/perf.yaml index d9dd811e..dee34bc5 100644 --- a/qai_hub_models/models/midas/perf.yaml +++ b/qai_hub_models/models/midas/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: Midas-V2 performance_metrics: - torchscript_onnx_tflite: - inference_time: 3254.0 - throughput: 307.3140749846343 + inference_time: 3240.0 + throughput: 308.641975308642 estimated_peak_memory_range: - min: 16384 - max: 2301536 + min: 24576 + max: 1960272 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 138 - job_id: jegn2e8jg + job_id: jglvmlrm5 job_status: Passed torchscript_onnx_qnn: - inference_time: 3305.0 - throughput: 302.571860816944 + inference_time: 3278.0 + throughput: 305.0640634533252 estimated_peak_memory_range: - min: 245760 - max: 109842840 + min: 286720 + max: 105600504 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 197 - job_id: jn5q87qe5 + job_id: jg9lnde8g job_status: Passed torchscript_onnx: - inference_time: 3394.0 - throughput: 294.6375957572186 + inference_time: 3303.0 + throughput: 302.7550711474417 estimated_peak_memory_range: - min: 806912 - max: 2627096 + min: 16384 + max: 43420992 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 199 - job_id: jygzexd4g + job_id: jp0z067n5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:14:58Z' + timestamp: '2024-10-15T00:12:08Z' - torchscript_onnx_tflite: - inference_time: 2419.0 - throughput: 413.39396444811905 + inference_time: 2841.0 + throughput: 351.98873636043646 estimated_peak_memory_range: min: 12288 - max: 89353488 + max: 91484384 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: 
layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 138 - job_id: joprkyjk5 + job_id: j56y4wlyp job_status: Passed torchscript_onnx_qnn: - inference_time: 2865.0 - throughput: 349.04013961605585 + inference_time: 2462.0 + throughput: 406.17384240454913 estimated_peak_memory_range: - min: 0 - max: 23754928 + min: 802816 + max: 28420720 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 197 - job_id: j1gln0m2p + job_id: jp14z6x7p job_status: Passed torchscript_onnx: - inference_time: 2949.0 - throughput: 339.097999321804 + inference_time: 2550.0 + throughput: 392.15686274509807 estimated_peak_memory_range: min: 0 - max: 90842160 + max: 94982112 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 199 - job_id: jz5wod64p + job_id: jp8qy1vop job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:14:59Z' + timestamp: '2024-10-15T00:12:09Z' - torchscript_onnx_tflite: - inference_time: 3196.0 - throughput: 312.89111389236547 + inference_time: 3213.0 + throughput: 311.2356053532524 estimated_peak_memory_range: - min: 24576 - max: 44829872 + min: 12288 + max: 4925056 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 138 - job_id: jep28mn6p + job_id: jp3j062ng job_status: Passed torchscript_onnx_qnn: inference_time: 3087.0 throughput: 323.9390994493035 estimated_peak_memory_range: - min: 823296 - max: 2617712 + min: 819200 + max: 2066160 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 197 - job_id: j1p3k40m5 + job_id: jp4lr3015 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:14:53Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:12:00Z' - torchscript_onnx_tflite: - inference_time: 4755.0 - throughput: 210.3049421661409 + inference_time: 3222.0 + throughput: 310.36623215394167 estimated_peak_memory_range: - min: 278528 - max: 94961424 + min: 24576 + max: 2070128 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 138 - job_id: jqpyed00g + job_id: jpedmy3v5 job_status: Passed torchscript_onnx_qnn: - inference_time: 4923.0 - throughput: 203.12817387771685 + inference_time: 3045.0 + throughput: 328.4072249589491 estimated_peak_memory_range: - min: 802816 - max: 26186432 + min: 815104 + max: 2149376 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 197 - job_id: jlpe9rm8g + job_id: jgn6vk8q5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:14:57Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:12:04Z' - torchscript_onnx_tflite: - inference_time: 3234.0 - throughput: 309.2145949288806 + inference_time: 3228.0 + throughput: 309.7893432465923 estimated_peak_memory_range: - min: 94208 - max: 2470544 + min: 16384 + max: 2079536 primary_compute_unit: NPU 
precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 138 - job_id: j2p0y100g + job_id: jgjvnq4eg job_status: Passed torchscript_onnx_qnn: - inference_time: 3101.0 - throughput: 322.4766204450177 + inference_time: 3049.0 + throughput: 327.97638570022957 estimated_peak_memory_range: min: 819200 - max: 2107176 + max: 2135880 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 197 - job_id: jwgoy1615 + job_id: j5mnx8y9p job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:14:54Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:12:03Z' - torchscript_onnx_tflite: - inference_time: 3268.0 - throughput: 305.99755201958385 + inference_time: 3228.0 + throughput: 309.7893432465923 estimated_peak_memory_range: - min: 20480 - max: 2166808 + min: 28672 + max: 2099536 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 138 - job_id: j1p8o3yqg + job_id: jpv6k7xr5 job_status: Passed torchscript_onnx_qnn: - inference_time: 3103.0 - throughput: 322.26877215597807 + inference_time: 3049.0 + throughput: 327.97638570022957 estimated_peak_memory_range: - min: 843776 - max: 2385512 + min: 827392 + max: 2248040 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 197 - job_id: j1pv31kz5 + job_id: jpxkox2l5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:14:55Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:12:01Z' - torchscript_onnx_tflite: - inference_time: 3234.0 - throughput: 309.2145949288806 + inference_time: 4752.0 + throughput: 210.43771043771045 estimated_peak_memory_range: - min: 20480 - max: 1971784 + min: 16384 + max: 95709024 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 138 - job_id: jogkzlxvg + job_id: jgo268qkp job_status: Passed torchscript_onnx_qnn: - inference_time: 3121.0 - throughput: 320.41012495994875 + inference_time: 4887.0 + throughput: 204.62451401677922 estimated_peak_memory_range: - min: 847872 - max: 2181760 + min: 802816 + max: 28784448 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 197 - job_id: j7gjx0n1p + job_id: jp2kyenqp job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:12:06Z' + - torchscript_onnx_tflite: + inference_time: 2133.0 + throughput: 468.8232536333802 + estimated_peak_memory_range: + min: 20480 + max: 40145520 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 138 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 138 + job_id: j5we64nm5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 2164.0 + throughput: 462.1072088724584 + estimated_peak_memory_range: + min: 0 + max: 23509904 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 197 + 
layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 197 + job_id: jpy13m0lp + job_status: Passed + torchscript_onnx: + inference_time: 2218.0 + throughput: 450.8566275924256 + estimated_peak_memory_range: + min: 0 + max: 44280544 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 199 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 199 + job_id: jglvmxmm5 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:14:56Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:12:12Z' - torchscript_onnx_qnn: - inference_time: 3281.0 - throughput: 304.7851264858275 + inference_time: 3256.0 + throughput: 307.12530712530713 estimated_peak_memory_range: min: 786432 max: 786432 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 197 - job_id: jw56634n5 + job_id: j57yr9395 job_status: Passed torchscript_onnx: - inference_time: 3348.0 - throughput: 298.6857825567503 + inference_time: 3378.0 + throughput: 296.0331557134399 estimated_peak_memory_range: - min: 37814272 - max: 37814272 + min: 37810176 + max: 37810176 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 199 - job_id: jnp10dzn5 + job_id: jgkex8mng job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:15:00Z' + timestamp: '2024-10-15T00:12:10Z' diff --git a/qai_hub_models/models/midas_quantized/README.md b/qai_hub_models/models/midas_quantized/README.md index c2a4db5a..c61ad7c9 100644 --- a/qai_hub_models/models/midas_quantized/README.md +++ b/qai_hub_models/models/midas_quantized/README.md @@ -6,7 +6,7 @@ Midas is designed for estimating depth at each point in an image. This is based on the implementation of Midas-V2-Quantized found -[here](https://github.com/isl-org/MiDaS). This repository contains scripts for optimized on-device +[here](https://github.com/isl-org/MiDaS). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/midas_quantized). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.midas_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Midas-V2-Quantized can be found +* The license for the original implementation of Midas-V2-Quantized can be found [here](https://github.com/isl-org/MiDaS/blob/master/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer](https://arxiv.org/abs/1907.01341v3) * [Source Model Implementation](https://github.com/isl-org/MiDaS) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/midas_quantized/export.py b/qai_hub_models/models/midas_quantized/export.py index f80a1efa..09180c7f 100644 --- a/qai_hub_models/models/midas_quantized/export.py +++ b/qai_hub_models/models/midas_quantized/export.py @@ -10,17 +10,17 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.midas_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.printing import ( print_inference_metrics, @@ -45,20 +45,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -80,10 +78,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
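`ExportResult` itself is imported from `qai_hub_models.models.common` but never defined in this patch. Judging from the keyword construction `ExportResult(compile_job=..., inference_job=..., profile_job=...)` and the docstring above, it is presumably a small dataclass along these lines (a hypothetical sketch, not the actual source):

```python
from dataclasses import dataclass
from typing import Optional

import qai_hub as hub


@dataclass
class ExportResult:
    # Field names and optionality are inferred from the export scripts in
    # this diff; the real definition in qai_hub_models.models.common may differ.
    compile_job: hub.CompileJob
    inference_job: Optional[hub.InferenceJob] = None
    profile_job: Optional[hub.ProfileJob] = None
```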
""" model_name = "midas_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,7 +107,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( target_runtime, output_path, input_spec, check_trace=False ) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -135,7 +133,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +148,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +169,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -198,7 +196,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/midas_quantized/perf.yaml b/qai_hub_models/models/midas_quantized/perf.yaml index af1fa03f..d8ebc38d 100644 --- a/qai_hub_models/models/midas_quantized/perf.yaml +++ b/qai_hub_models/models/midas_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,41 +20,38 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: Midas-V2-Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 1094.0 - throughput: 914.0767824497258 + inference_time: 1112.0 + throughput: 899.2805755395683 estimated_peak_memory_range: min: 12288 - max: 3108640 + max: 8668496 primary_compute_unit: NPU precision: int8 layer_info: @@ -61,22 +59,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 145 - job_id: jegn2eyjg + job_id: jp2kye8qp job_status: Passed torchscript_onnx_qnn: - inference_time: 1435.0 - throughput: 696.8641114982578 + inference_time: 1434.0 + throughput: 697.350069735007 estimated_peak_memory_range: - min: 16384 - max: 315875608 + min: 24576 + max: 54561184 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 203 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jw5661ln5 + total_layers: 203 + job_id: jgjvnqmeg job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -85,13 +83,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:14:11Z' + timestamp: '2024-10-15T00:11:09Z' - torchscript_onnx_tflite: - inference_time: 764.0 - throughput: 1308.9005235602094 + inference_time: 774.0 + throughput: 1291.9896640826873 estimated_peak_memory_range: min: 12288 - max: 91793680 + max: 93655872 primary_compute_unit: NPU precision: int8 layer_info: @@ -99,22 +97,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 145 - job_id: joprkyqk5 + job_id: jpy13melp job_status: Passed torchscript_onnx_qnn: - inference_time: 1018.0 - throughput: 982.3182711198428 + inference_time: 1013.0 + throughput: 987.1668311944719 estimated_peak_memory_range: min: 
208896 - max: 25454496 + max: 22425392 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 203 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: j1p3km2m5 + total_layers: 203 + job_id: jpedmy1v5 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -123,13 +121,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:14:12Z' + timestamp: '2024-10-15T00:11:10Z' - torchscript_onnx_tflite: - inference_time: 1080.0 - throughput: 925.925925925926 + inference_time: 3827.0 + throughput: 261.30128037627384 estimated_peak_memory_range: - min: 12288 - max: 1462496 + min: 81920 + max: 52640544 primary_compute_unit: NPU precision: int8 layer_info: @@ -137,37 +135,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 145 - job_id: jep28m66p + job_id: jp3j063ng job_status: Passed torchscript_onnx_qnn: - inference_time: 1310.0 - throughput: 763.3587786259542 + inference_time: 6190.0 + throughput: 161.55088852988692 estimated_peak_memory_range: - min: 229376 - max: 1931200 + min: 241664 + max: 8642432 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 203 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: j1pv3wxz5 + total_layers: 203 + job_id: jpxkoxjl5 job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:14:14Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-15T00:11:19Z' - torchscript_onnx_tflite: - inference_time: 1451.0 - throughput: 689.1798759476223 + inference_time: 15542.0 + throughput: 64.34178355424012 estimated_peak_memory_range: - min: 81920 - max: 90190768 + min: 102400 + max: 6053224 primary_compute_unit: NPU precision: int8 layer_info: @@ -175,37 +173,60 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 145 - job_id: jqpyedw0g + job_id: jgo2680kp + job_status: Passed + reference_device_info: + name: RB5 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8250 Proxy + timestamp: '2024-10-15T00:11:07Z' + - torchscript_onnx_tflite: + inference_time: 1083.0 + throughput: 923.3610341643582 + estimated_peak_memory_range: + min: 16384 + max: 232524664 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 145 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 145 + job_id: jp0z06yn5 job_status: Passed torchscript_onnx_qnn: - inference_time: 1767.0 - throughput: 565.9309564233164 + inference_time: 1306.0 + throughput: 765.6967840735069 estimated_peak_memory_range: - min: 217088 - max: 27273760 + min: 233472 + max: 1537024 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 203 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jz5wo9n4p + total_layers: 203 + job_id: j5we64vm5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:14:18Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:11:12Z' - torchscript_onnx_tflite: - inference_time: 1090.0 - throughput: 917.4311926605504 + inference_time: 1092.0 + throughput: 915.7509157509157 estimated_peak_memory_range: - min: 16384 - max: 140063064 + min: 36864 + max: 5718496 primary_compute_unit: NPU 
precision: int8 layer_info: @@ -213,37 +234,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 145 - job_id: j2p0yr70g + job_id: jglvmlzm5 job_status: Passed torchscript_onnx_qnn: - inference_time: 1329.0 - throughput: 752.4454477050414 + inference_time: 1320.0 + throughput: 757.5757575757576 estimated_peak_memory_range: - min: 229376 - max: 1546896 + min: 225280 + max: 1551736 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 203 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: j7gjxl41p + total_layers: 203 + job_id: jgdx129zp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:14:15Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:11:16Z' - torchscript_onnx_tflite: - inference_time: 1074.0 - throughput: 931.0986964618249 + inference_time: 1089.0 + throughput: 918.2736455463728 estimated_peak_memory_range: - min: 16384 - max: 11851488 + min: 61440 + max: 1662920 primary_compute_unit: NPU precision: int8 layer_info: @@ -251,22 +272,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 145 - job_id: j1p8o7vqg + job_id: j5q6qv8op job_status: Passed torchscript_onnx_qnn: - inference_time: 1307.0 - throughput: 765.1109410864575 + inference_time: 1315.0 + throughput: 760.4562737642585 estimated_peak_memory_range: - min: 233472 - max: 1941720 + min: 229376 + max: 1785760 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 203 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jlpe9v38g + total_layers: 203 + job_id: jp14z6l7p job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -274,14 +295,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:14:16Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:11:15Z' - torchscript_onnx_tflite: - inference_time: 1090.0 - throughput: 917.4311926605504 + inference_time: 1092.0 + throughput: 915.7509157509157 estimated_peak_memory_range: - min: 24576 - max: 247403960 + min: 16384 + max: 1594440 primary_compute_unit: NPU precision: int8 layer_info: @@ -289,37 +310,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 145 - job_id: jogkzymvg + job_id: jgkex8zng job_status: Passed torchscript_onnx_qnn: - inference_time: 1344.0 - throughput: 744.047619047619 + inference_time: 1319.0 + throughput: 758.1501137225171 estimated_peak_memory_range: - min: 225280 - max: 1583016 + min: 229376 + max: 1625368 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 203 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jygze7k4g + total_layers: 203 + job_id: jg9lnd18g job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:14:17Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:11:13Z' - torchscript_onnx_tflite: - inference_time: 3787.0 - throughput: 264.06126221283336 + inference_time: 1431.0 + throughput: 698.8120195667366 estimated_peak_memory_range: - min: 40960 - max: 51857952 + min: 81920 + max: 91993600 primary_compute_unit: NPU precision: int8 layer_info: @@ -327,37 +348,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 145 - job_id: jn5q82oe5 + job_id: jp8qy1oop 
job_status: Passed torchscript_onnx_qnn: - inference_time: 5935.0 - throughput: 168.49199663016006 + inference_time: 1772.0 + throughput: 564.3340857787811 estimated_peak_memory_range: - min: 212992 - max: 7695632 + min: 208896 + max: 25817216 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 203 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jmg9v4em5 + total_layers: 203 + job_id: jp4lr3o15 job_status: Passed reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T12:14:19Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:11:18Z' - torchscript_onnx_tflite: - inference_time: 15696.0 - throughput: 63.710499490316 + inference_time: 731.0 + throughput: 1367.9890560875513 estimated_peak_memory_range: - min: 114688 - max: 2008800 + min: 8192 + max: 49238192 primary_compute_unit: NPU precision: int8 layer_info: @@ -365,30 +386,45 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 145 - job_id: j1glnkr2p + job_id: jpv6k7or5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 1001.0 + throughput: 999.000999000999 + estimated_peak_memory_range: + min: 0 + max: 22022448 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 203 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 203 + job_id: j5mnx829p job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T12:14:10Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:11:20Z' - torchscript_onnx_qnn: - inference_time: 1480.0 - throughput: 675.6756756756756 + inference_time: 1461.0 + throughput: 684.4626967830253 estimated_peak_memory_range: - min: 344064 - max: 344064 + min: 442368 + max: 442368 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 203 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jwgoyvq15 + total_layers: 203 + job_id: jgz3dn9x5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -397,4 +433,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:14:13Z' + timestamp: '2024-10-15T00:11:11Z' diff --git a/qai_hub_models/models/mistral_3b_quantized/README.md b/qai_hub_models/models/mistral_3b_quantized/README.md new file mode 100644 index 00000000..a8500d9f --- /dev/null +++ b/qai_hub_models/models/mistral_3b_quantized/README.md @@ -0,0 +1,55 @@ +[![Qualcomm® AI Hub Models](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/quic-logo.jpg)](../../README.md) + + +# [Mistral-3B: State-of-the-art large language model useful on a variety of language understanding and generation tasks](https://aihub.qualcomm.com/models/mistral_3b_quantized) + +Mistral 3B is Mistral AI's first-generation edge model, optimized for high performance on Snapdragon platforms. + +This is based on the implementation of Mistral-3B found +[here](https://github.com/mistralai/mistral-inference). This repository contains scripts for optimized on-device +export suitable to run on Qualcomm® devices. More details on model performance +across various devices can be found [here](https://aihub.qualcomm.com/models/mistral_3b_quantized).
+ +[Sign up](https://myaccount.qualcomm.com/signup) to start using Qualcomm AI Hub and run these models on a hosted Qualcomm® device. + +## Deploying Mistral 3B on-device + +Please follow the [LLM on-device deployment](https://github.com/quic/ai-hub-apps/tree/main/tutorials/llm_on_genie) tutorial. + + + + + +## References +* [Source Model Implementation](https://github.com/mistralai/mistral-inference) + + + +## Community +* Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. +* For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). + + +## Usage and Limitations + +This model may not be used for or in connection with any of the following applications: + +- Accessing essential private and public services and benefits; +- Administration of justice and democratic processes; +- Assessing or recognizing the emotional state of a person; +- Biometric and biometrics-based systems, including categorization of persons based on sensitive characteristics; +- Education and vocational training; +- Employment and workers management; +- Exploitation of the vulnerabilities of persons resulting in harmful behavior; +- General purpose social scoring; +- Law enforcement; +- Management and operation of critical infrastructure; +- Migration, asylum and border control management; +- Predictive policing; +- Real-time remote biometric identification in public spaces; +- Recommender systems of social media platforms; +- Scraping of facial images (from the internet or otherwise); and/or +- Subliminal manipulation + + diff --git a/qai_hub_models/models/mistral_3b_quantized/info.yaml b/qai_hub_models/models/mistral_3b_quantized/info.yaml new file mode 100644 index 00000000..fd7ec018 --- /dev/null +++ b/qai_hub_models/models/mistral_3b_quantized/info.yaml @@ -0,0 +1,41 @@ +name: Mistral-3B +id: mistral_3b_quantized +status: public +headline: State-of-the-art large language model useful on a variety of language + understanding and generation tasks. +domain: Generative AI +description: Mistral 3B is Mistral AI's first-generation edge model, optimized for high performance on Snapdragon platforms. +use_case: Text Generation +tags: + - llm + - generative-ai + - quantized +source_repo: https://github.com/mistralai/mistral-inference +model_maker_id: mistral-ai +technical_details: + Input sequence length for Prompt Processor: 128 + Max context length: 4096 + Num of key-value heads: 8 + Number of parameters: 3B + Precision: w4a16 + w8a16 (few layers) + Use: Initiate conversation with prompt-processor and then token generator for subsequent iterations. + Minimum QNN SDK version required: 2.27.7 + Supported languages: English. + TTFT: Time To First Token is the time it takes to generate the first response token. This is expressed as a range because it varies based on the length of the prompt. The lower bound is for a short prompt (up to 128 tokens, i.e., one iteration of the prompt processor) and the upper bound is for a prompt using the full context length (4096 tokens). + Response Rate: Rate of response generation after the first response token.
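To make the TTFT range concrete: the `mistral_3b_quantized` perf.yaml below reports a `time_to_first_token_range` of 92289 to 2953273.6. Assuming these values are in microseconds, like the `inference_time` fields elsewhere in these perf.yaml files, the conversion is a one-liner:

```python
# TTFT endpoints from the perf.yaml below; units assumed to be microseconds.
ttft_min_us, ttft_max_us = 92_289, 2_953_273.6

print(f"{ttft_min_us / 1e6:.3f} s")  # ~0.092 s for a short (128-token) prompt
print(f"{ttft_max_us / 1e6:.2f} s")  # ~2.95 s for the full 4096-token context

# The max/min ratio is ~32, i.e. 4096 / 128, consistent with TTFT scaling
# with the number of 128-token prompt-processor iterations.
ratio = ttft_max_us / ttft_min_us
```

After the first token, generation proceeds at the reported `tokens_per_second` (about 21 tok/s for Mistral-3B on the Snapdragon 8 Elite QRD).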
+applicable_scenarios: + - Dialogue + - Content Generation + - Customer Support +related_models: [] +form_factors: + - Phone + - Tablet +has_static_banner: true +has_animated_banner: false +license_type: "other" +dataset: [] +model_type_llm: true +restrict_model_sharing: true +llm_details: + call_to_action: 'contact_for_purchase' diff --git a/qai_hub_models/models/mistral_3b_quantized/perf.yaml b/qai_hub_models/models/mistral_3b_quantized/perf.yaml new file mode 100644 index 00000000..2a0c06be --- /dev/null +++ b/qai_hub_models/models/mistral_3b_quantized/perf.yaml @@ -0,0 +1,25 @@ +aggregated: + supported_oses: + - Android + supported_devices: + - Snapdragon 8 Elite QRD + supported_chipsets: + - Snapdragon® 8 Elite +models: + name: 'Mistral-3B' + performance_metrics: + - torchscript_onnx_qnn: + llm_metrics: + time_to_first_token_range: + min: 92289 + max: 2953273.6 + tokens_per_second: 21.05 + evaluation_metrics: null + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-16T00:32:42.210701Z' diff --git a/qai_hub_models/models/mistral_7b_instruct_v0_3_quantized/README.md b/qai_hub_models/models/mistral_7b_instruct_v0_3_quantized/README.md new file mode 100644 index 00000000..c93301ee --- /dev/null +++ b/qai_hub_models/models/mistral_7b_instruct_v0_3_quantized/README.md @@ -0,0 +1,61 @@ +[![Qualcomm® AI Hub Models](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/quic-logo.jpg)](../../README.md) + + +# [Mistral-7B-Instruct-v0.3: State-of-the-art large language model useful on a variety of language understanding and generation tasks](https://aihub.qualcomm.com/models/mistral_7b_instruct_v0_3_quantized) + +Mistral AI's first open-source dense model, released in September 2023. Mistral-7B-Instruct-v0.3 Large Language Model (LLM) is an instruction fine-tuned version of Mistral-7B-v0.3. It has an extended vocabulary and supports the v3 Tokenizer, enhancing language understanding and generation. Additionally, function calling is enabled. + +This is based on the implementation of Mistral-7B-Instruct-v0.3 found +[here](https://github.com/mistralai/mistral-inference). This repository contains scripts for optimized on-device +export suitable to run on Qualcomm® devices. More details on model performance +across various devices can be found [here](https://aihub.qualcomm.com/models/mistral_7b_instruct_v0_3_quantized). + +[Sign up](https://myaccount.qualcomm.com/signup) to start using Qualcomm AI Hub and run these models on a hosted Qualcomm® device. + +## Deploying Mistral 7B Instruct v0.3 on-device + +Please follow the [LLM on-device deployment](https://github.com/quic/ai-hub-apps/tree/main/tutorials/llm_on_genie) tutorial. + + + + + +## License +* The license for the original implementation of Mistral-7B-Instruct-v0.3 can be found + [here](https://github.com/mistralai/mistral-inference/blob/main/LICENSE). +* The license for the compiled assets for on-device deployment can be found [here](https://github.com/mistralai/mistral-inference/blob/main/LICENSE) + + +## References +* [Mistral 7B](https://arxiv.org/abs/2310.06825) +* [Source Model Implementation](https://github.com/mistralai/mistral-inference) + + + +## Community +* Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. +* For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
+ + +## Usage and Limitations + +This model may not be used for or in connection with any of the following applications: + +- Accessing essential private and public services and benefits; +- Administration of justice and democratic processes; +- Assessing or recognizing the emotional state of a person; +- Biometric and biometrics-based systems, including categorization of persons based on sensitive characteristics; +- Education and vocational training; +- Employment and workers management; +- Exploitation of the vulnerabilities of persons resulting in harmful behavior; +- General purpose social scoring; +- Law enforcement; +- Management and operation of critical infrastructure; +- Migration, asylum and border control management; +- Predictive policing; +- Real-time remote biometric identification in public spaces; +- Recommender systems of social media platforms; +- Scraping of facial images (from the internet or otherwise); and/or +- Subliminal manipulation + + diff --git a/qai_hub_models/models/mistral_7b_instruct_v0_3_quantized/info.yaml b/qai_hub_models/models/mistral_7b_instruct_v0_3_quantized/info.yaml new file mode 100644 index 00000000..c3f6d769 --- /dev/null +++ b/qai_hub_models/models/mistral_7b_instruct_v0_3_quantized/info.yaml @@ -0,0 +1,56 @@ +name: Mistral-7B-Instruct-v0.3 +id: mistral_7b_instruct_v0_3_quantized +status: public +headline: State-of-the-art large language model useful on a variety of language + understanding and generation tasks. +domain: Generative AI +description: Mistral AI's first open-source dense model, released in September 2023. Mistral-7B-Instruct-v0.3 Large Language Model (LLM) is an instruction fine-tuned version of Mistral-7B-v0.3. It has an extended vocabulary and supports the v3 Tokenizer, enhancing language understanding and generation. Additionally, function calling is enabled. +use_case: Text Generation +tags: + - llm + - generative-ai + - quantized +research_paper: https://arxiv.org/abs/2310.06825 +research_paper_title: "Mistral 7B" +model_maker_id: mistral-ai +license: https://github.com/mistralai/mistral-inference/blob/main/LICENSE +deploy_license: https://github.com/mistralai/mistral-inference/blob/main/LICENSE +source_repo: https://github.com/mistralai/mistral-inference +technical_details: + Input sequence length for Prompt Processor: 128 + Context length: 4096 + Number of parameters: 7.3B + Precision: w4a16 + w8a16 (few layers) + Num of key-value heads: 8 + Information about the model parts: Prompt Processor and Token Generator are split into 4 parts each. Corresponding Prompt Processor and Token Generator parts share weights. + Prompt processor model size: 4.17 GB + Prompt processor input: 128 tokens + KVCache initialized with pad token + Prompt processor output: 128 output tokens + KVCache for token generator + Token generator model size: 4.17 GB + Token generator input: 1 input token + past KVCache + Token generator output: 1 output token + KVCache for next iteration + Use: Initiate conversation with prompt-processor and then token generator for subsequent iterations. + Minimum QNN SDK version required: 2.27.7 + Supported languages: English. + TTFT: Time To First Token is the time it takes to generate the first response token. This is expressed as a range because it varies based on the length of the prompt. The lower bound is for a short prompt (up to 128 tokens, i.e., one iteration of the prompt processor) and the upper bound is for a prompt using the full context length (4096 tokens).
+  Response Rate: Rate of response generation after the first response token.
+applicable_scenarios:
+  - Dialogue
+  - Content Generation
+  - Customer Support
+related_models: []
+form_factors:
+  - Phone
+  - Tablet
+has_static_banner: true
+has_animated_banner: false
+license_type: apache-2.0
+deploy_license_type: apache-2.0
+dataset: []
+model_type_llm: true
+llm_details:
+  call_to_action: 'download'
+  genie_compatible: true
+  Snapdragon 8 Elite QRD:
+    torchscript_onnx_qnn:
+      model_download_url: v2/snapdragon_8_elite/models.zip
diff --git a/qai_hub_models/models/mistral_7b_instruct_v0_3_quantized/perf.yaml b/qai_hub_models/models/mistral_7b_instruct_v0_3_quantized/perf.yaml
new file mode 100644
index 00000000..517fabe5
--- /dev/null
+++ b/qai_hub_models/models/mistral_7b_instruct_v0_3_quantized/perf.yaml
@@ -0,0 +1,25 @@
+aggregated:
+  supported_oses:
+  - Android
+  supported_devices:
+  - Snapdragon 8 Elite QRD
+  supported_chipsets:
+  - Snapdragon® 8 Elite
+models:
+  name: 'Mistral-7B-Instruct-v0.3'
+  performance_metrics:
+  - torchscript_onnx_qnn:
+      llm_metrics:
+        time_to_first_token_range:
+          min: 165650
+          max: 5300800
+        tokens_per_second: 12.56
+      evaluation_metrics: null
+    reference_device_info:
+      name: Snapdragon 8 Elite QRD
+      os: '15'
+      form_factor: Phone
+      os_name: Android
+      manufacturer: Qualcomm
+      chipset: Snapdragon® 8 Elite
+    timestamp: '2024-10-16T00:32:42.210701Z'
diff --git a/qai_hub_models/models/mnasnet05/README.md b/qai_hub_models/models/mnasnet05/README.md
index 6f322636..adc79e3c 100644
--- a/qai_hub_models/models/mnasnet05/README.md
+++ b/qai_hub_models/models/mnasnet05/README.md
@@ -6,7 +6,7 @@
 MNASNet05 is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases.
 
 This is based on the implementation of MNASNet05 found
-[here](https://github.com/pytorch/vision/blob/main/torchvision/models/mnasnet.py). This repository contains scripts for optimized on-device
+[here]({source_repo}). This repository contains scripts for optimized on-device
 export suitable to run on Qualcomm® devices. More details on model performance
 accross various devices, can be found [here](https://aihub.qualcomm.com/models/mnasnet05).
@@ -39,15 +39,19 @@ python -m qai_hub_models.models.mnasnet05.export
 
 Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub.
 
+
 ## License
-- The license for the original implementation of MNASNet05 can be found
+* The license for the original implementation of MNASNet05 can be found
   [here](https://github.com/pytorch/vision/blob/main/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf)
+* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf)
+
 ## References
 * [MnasNet: Platform-Aware Neural Architecture Search for Mobile](https://arxiv.org/abs/1807.11626)
 * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/mnasnet.py)
+
+
 ## Community
 * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI.
 * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
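One practical consequence of the TTFT definition in the `technical_details` added above: the lower bound corresponds to a single 128-token prompt-processor iteration and the upper bound to the full 4096-token context, so an end-to-end latency estimate can be interpolated between the two bounds and combined with the decode rate from the perf.yaml. A rough back-of-envelope sketch, not part of this diff, under stated assumptions (that the TTFT values are microseconds and scale roughly linearly with the number of prompt-processor iterations; neither is claimed by the YAML itself):

```python
import math

# Values reported above for Mistral-7B-Instruct-v0.3 on Snapdragon 8 Elite QRD.
TTFT_MIN = 165650    # assumed microseconds: one prompt-processor iteration (<= 128 tokens)
TTFT_MAX = 5300800   # assumed microseconds: full 4096-token context
TOKENS_PER_SECOND = 12.56
CONTEXT, CHUNK = 4096, 128

def estimate_latency_s(prompt_tokens: int, output_tokens: int) -> float:
    """Back-of-envelope end-to-end latency: interpolated TTFT plus decode time."""
    iters = max(1, math.ceil(min(prompt_tokens, CONTEXT) / CHUNK))
    frac = (iters - 1) / (CONTEXT // CHUNK - 1)   # 0.0 at 1 iteration, 1.0 at 32
    ttft_s = (TTFT_MIN + frac * (TTFT_MAX - TTFT_MIN)) / 1e6
    # After the first token, remaining tokens arrive at the reported decode rate.
    return ttft_s + max(output_tokens - 1, 0) / TOKENS_PER_SECOND

print(f"{estimate_latency_s(512, 256):.1f} s")    # roughly 21 s for 512 in / 256 out
```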
diff --git a/qai_hub_models/models/mnasnet05/export.py b/qai_hub_models/models/mnasnet05/export.py index 939e4572..d2a47929 100644 --- a/qai_hub_models/models/mnasnet05/export.py +++ b/qai_hub_models/models/mnasnet05/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.mnasnet05 import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "mnasnet05" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. 
Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/mnasnet05/perf.yaml b/qai_hub_models/models/mnasnet05/perf.yaml index f15bfade..66f1a7cd 100644 --- a/qai_hub_models/models/mnasnet05/perf.yaml +++ b/qai_hub_models/models/mnasnet05/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: MNASNet05 performance_metrics: - torchscript_onnx_tflite: - inference_time: 755.0 - throughput: 1324.5033112582782 + inference_time: 759.0 + throughput: 1317.5230566534915 estimated_peak_memory_range: - min: 24576 - max: 13756000 + min: 28672 + max: 1984968 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 71 - job_id: j1p8o7oog + job_id: jgn6vkjk5 job_status: Passed torchscript_onnx_qnn: - 
inference_time: 821.0 - throughput: 1218.026796589525 + inference_time: 825.0 + throughput: 1212.121212121212 estimated_peak_memory_range: - min: 16384 - max: 172103328 + min: 12288 + max: 22109120 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 103 - job_id: j1pv3wor5 + job_id: j56y4wk6p job_status: Passed torchscript_onnx: - inference_time: 780.0 - throughput: 1282.051282051282 + inference_time: 751.0 + throughput: 1331.5579227696405 estimated_peak_memory_range: - min: 49152 - max: 7152088 + min: 245760 + max: 5644488 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 104 - job_id: jz5wo9v4p + job_id: j5we64om5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:13:08Z' + timestamp: '2024-10-15T00:10:00Z' - torchscript_onnx_tflite: - inference_time: 531.0 - throughput: 1883.2391713747645 + inference_time: 533.0 + throughput: 1876.172607879925 estimated_peak_memory_range: min: 16384 - max: 52090288 + max: 52102112 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 71 - job_id: jogkzyzng + job_id: jprv3wz0g job_status: Passed torchscript_onnx_qnn: - inference_time: 583.0 - throughput: 1715.2658662092624 + inference_time: 581.0 + throughput: 1721.170395869191 estimated_peak_memory_range: min: 618496 - max: 14605200 + max: 18174944 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 103 - job_id: j7gjxlmep + job_id: jp3j06y3g job_status: Passed torchscript_onnx: - inference_time: 671.0 - throughput: 1490.312965722802 + inference_time: 572.0 + throughput: 1748.2517482517483 estimated_peak_memory_range: min: 0 - max: 55460784 + max: 55921184 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 104 - job_id: jmg9v41m5 + job_id: jg9lndv8g job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:13:09Z' + timestamp: '2024-10-15T00:10:01Z' - torchscript_onnx_tflite: - inference_time: 758.0 - throughput: 1319.2612137203166 + inference_time: 755.0 + throughput: 1324.5033112582782 estimated_peak_memory_range: - min: 24576 - max: 3049560 + min: 12288 + max: 3595384 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 71 - job_id: jn5q828o5 + job_id: jp2kye2rp job_status: Passed torchscript_onnx_qnn: - inference_time: 762.0 - throughput: 1312.3359580052493 + inference_time: 757.0 + throughput: 1321.003963011889 estimated_peak_memory_range: - min: 651264 - max: 1968008 + min: 667648 + max: 1782736 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 103 - job_id: jygze79xg + job_id: jpv6k73k5 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,52 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:13:03Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:09:53Z' - torchscript_onnx_tflite: - 
inference_time: 1027.0 - throughput: 973.7098344693281 + inference_time: 756.0 + throughput: 1322.7513227513227 + estimated_peak_memory_range: + min: 28672 + max: 75716312 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 71 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 71 + job_id: jgkex8jwg + job_status: Passed + torchscript_onnx_qnn: + inference_time: 763.0 + throughput: 1310.615989515072 + estimated_peak_memory_range: + min: 634880 + max: 2103512 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 103 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 103 + job_id: jgz3dneo5 + job_status: Passed + reference_device_info: + name: SA8255 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:09:56Z' + - torchscript_onnx_tflite: + inference_time: 752.0 + throughput: 1329.787234042553 estimated_peak_memory_range: min: 16384 - max: 52887776 + max: 25221072 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 71 - job_id: j1glnkzmp + job_id: jp8qy1lkp job_status: Passed torchscript_onnx_qnn: - inference_time: 1120.0 - throughput: 892.8571428571429 + inference_time: 764.0 + throughput: 1308.9005235602094 estimated_peak_memory_range: - min: 622592 - max: 16481008 + min: 638976 + max: 1832664 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 103 - job_id: jvgdwv9z5 + job_id: jpedmy9o5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8775 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:13:07Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:09:55Z' - torchscript_onnx_tflite: inference_time: 759.0 throughput: 1317.5230566534915 estimated_peak_memory_range: - min: 45056 - max: 143981296 + min: 28672 + max: 1379768 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 71 - job_id: jw5661jy5 + job_id: jp0z06n95 job_status: Passed torchscript_onnx_qnn: - inference_time: 765.0 - throughput: 1307.18954248366 + inference_time: 766.0 + throughput: 1305.4830287206266 estimated_peak_memory_range: - min: 634880 - max: 1953144 + min: 630784 + max: 2174552 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,7 +291,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 103 - job_id: jz5wo9vmp + job_id: jgjvnqxvg job_status: Passed reference_device_info: name: SA8650 (Proxy) @@ -263,14 +299,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:13:04Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:09:54Z' - torchscript_onnx_tflite: - inference_time: 753.0 - throughput: 1328.0212483399735 + inference_time: 1026.0 + throughput: 974.6588693957115 estimated_peak_memory_range: min: 16384 - max: 29055920 + max: 54140992 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 71 - job_id: j1p3km3n5 + job_id: jpy13m98p job_status: Passed torchscript_onnx_qnn: - inference_time: 776.0 - throughput: 1288.659793814433 + inference_time: 1120.0 + throughput: 892.8571428571429 estimated_peak_memory_range: - min: 679936 - max: 2252104 + min: 0 + max: 
16380944 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +329,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 103 - job_id: jmg9v4185 + job_id: jp14z608p job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:13:05Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:09:58Z' - torchscript_onnx_tflite: - inference_time: 758.0 - throughput: 1319.2612137203166 + inference_time: 507.0 + throughput: 1972.3865877712033 estimated_peak_memory_range: - min: 16384 - max: 10073128 + min: 12288 + max: 22764432 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +352,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 71 - job_id: jwgoyv0k5 + job_id: jglvmljj5 job_status: Passed torchscript_onnx_qnn: - inference_time: 766.0 - throughput: 1305.4830287206266 + inference_time: 560.0 + throughput: 1785.7142857142858 estimated_peak_memory_range: - min: 634880 - max: 2038648 + min: 614400 + max: 12920832 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +367,34 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 103 - job_id: jnp108l75 + job_id: jgdx12wrp + job_status: Passed + torchscript_onnx: + inference_time: 573.0 + throughput: 1745.2006980802792 + estimated_peak_memory_range: + min: 0 + max: 24190256 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 104 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 104 + job_id: j57yr9z95 job_status: Passed reference_device_info: - name: SA8255 (Proxy) - os: '13' - form_factor: Auto + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:13:06Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:10:04Z' - torchscript_onnx_qnn: - inference_time: 906.0 - throughput: 1103.7527593818984 + inference_time: 932.0 + throughput: 1072.961373390558 estimated_peak_memory_range: min: 602112 max: 602112 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 103 - job_id: jlpe9v1vg + job_id: jgo268jqp job_status: Passed torchscript_onnx: - inference_time: 799.0 - throughput: 1251.5644555694619 + inference_time: 824.0 + throughput: 1213.5922330097087 estimated_peak_memory_range: - min: 5668864 - max: 5668864 + min: 7053312 + max: 7053312 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 104 - job_id: jnp108ln5 + job_id: jp14z607p job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:13:10Z' + timestamp: '2024-10-15T00:10:02Z' diff --git a/qai_hub_models/models/mobilenet_v2/README.md b/qai_hub_models/models/mobilenet_v2/README.md index bf6d9dca..62926fb4 100644 --- a/qai_hub_models/models/mobilenet_v2/README.md +++ b/qai_hub_models/models/mobilenet_v2/README.md @@ -6,7 +6,7 @@ MobileNetV2 is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of MobileNet-v2 found -[here](https://github.com/tonylins/pytorch-mobilenet-v2/tree/master). 
This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/mobilenet_v2). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.mobilenet_v2.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of MobileNet-v2 can be found +* The license for the original implementation of MobileNet-v2 can be found [here](https://github.com/tonylins/pytorch-mobilenet-v2/blob/master/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [MobileNetV2: Inverted Residuals and Linear Bottlenecks](https://arxiv.org/abs/1801.04381) * [Source Model Implementation](https://github.com/tonylins/pytorch-mobilenet-v2/tree/master) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/mobilenet_v2/export.py b/qai_hub_models/models/mobilenet_v2/export.py index 06e19094..dc4abe1a 100644 --- a/qai_hub_models/models/mobilenet_v2/export.py +++ b/qai_hub_models/models/mobilenet_v2/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.mobilenet_v2 import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. 
Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "mobilenet_v2" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/mobilenet_v2/perf.yaml b/qai_hub_models/models/mobilenet_v2/perf.yaml index e7bb6554..326017fd 100644 --- a/qai_hub_models/models/mobilenet_v2/perf.yaml +++ b/qai_hub_models/models/mobilenet_v2/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: MobileNet-v2 performance_metrics: - torchscript_onnx_tflite: - inference_time: 906.0 - throughput: 1103.7527593818984 + inference_time: 905.0 + throughput: 1104.9723756906078 estimated_peak_memory_range: - min: 12288 - max: 185155568 + min: 20480 + max: 184020992 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 72 - job_id: jn5q82jo5 + job_id: j57yr9mv5 job_status: Passed torchscript_onnx_qnn: inference_time: 1253.0 throughput: 798.0845969672786 estimated_peak_memory_range: - min: 266240 - max: 51335624 + min: 28672 + max: 39276912 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 105 - job_id: jlpe9v9vg + job_id: jp8qy1nkp job_status: Passed torchscript_onnx: - inference_time: 954.0 - throughput: 1048.2180293501049 + inference_time: 919.0 + throughput: 1088.139281828074 estimated_peak_memory_range: min: 12288 - max: 14648184 + max: 1628616 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 105 - job_id: j0pxv1vlg + job_id: jgz3dn1o5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:12:30Z' + timestamp: '2024-10-15T00:09:15Z' - torchscript_onnx_tflite: inference_time: 623.0 throughput: 1605.1364365971108 estimated_peak_memory_range: min: 16384 - max: 63367312 + max: 64676784 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 72 - job_id: j1glnknmp + job_id: jp4lr3785 job_status: Passed 
torchscript_onnx_qnn: - inference_time: 860.0 - throughput: 1162.7906976744187 + inference_time: 862.0 + throughput: 1160.092807424594 estimated_peak_memory_range: min: 618496 - max: 14968944 + max: 16158048 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 105 - job_id: jygze7exg + job_id: jgkex81wg job_status: Passed torchscript_onnx: - inference_time: 703.0 - throughput: 1422.475106685633 + inference_time: 678.0 + throughput: 1474.9262536873157 estimated_peak_memory_range: - min: 0 - max: 65651824 + min: 512000 + max: 68785456 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 105 - job_id: jo5mrzr9g + job_id: j5we64j35 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:12:31Z' + timestamp: '2024-10-15T00:09:16Z' - torchscript_onnx_tflite: - inference_time: 905.0 - throughput: 1104.9723756906078 + inference_time: 902.0 + throughput: 1108.6474501108648 estimated_peak_memory_range: - min: 28672 - max: 1431080 + min: 12288 + max: 1340448 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 72 - job_id: jw56616y5 + job_id: jpxkoxq35 job_status: Passed torchscript_onnx_qnn: inference_time: 1188.0 throughput: 841.7508417508418 estimated_peak_memory_range: - min: 634880 - max: 2261792 + min: 626688 + max: 1791448 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 105 - job_id: jmg9v4v85 + job_id: jglvmldj5 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:12:25Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:09:07Z' - torchscript_onnx_tflite: - inference_time: 1093.0 - throughput: 914.9130832570905 + inference_time: 906.0 + throughput: 1103.7527593818984 estimated_peak_memory_range: - min: 16384 - max: 65363808 + min: 28672 + max: 8275216 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 72 - job_id: j1p3kmkn5 + job_id: jp2kye1rp job_status: Passed torchscript_onnx_qnn: - inference_time: 1439.0 - throughput: 694.9270326615705 + inference_time: 1188.0 + throughput: 841.7508417508418 estimated_peak_memory_range: - min: 618496 - max: 17563664 + min: 638976 + max: 1769016 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 105 - job_id: jqp4qwq1g + job_id: jgo268xqp job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:12:29Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:09:10Z' - torchscript_onnx_tflite: - inference_time: 907.0 - throughput: 1102.5358324145534 + inference_time: 901.0 + throughput: 1109.8779134295228 estimated_peak_memory_range: - min: 16384 - max: 1391776 + min: 24576 + max: 1600056 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 72 - job_id: 
jwgoyvyk5 + job_id: jprv3wr0g job_status: Passed torchscript_onnx_qnn: - inference_time: 1183.0 - throughput: 845.30853761623 + inference_time: 1189.0 + throughput: 841.0428931875525 estimated_peak_memory_range: - min: 634880 - max: 1990832 + min: 630784 + max: 1956072 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 105 - job_id: jnp108075 + job_id: jp3j06d3g job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:12:26Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:09:09Z' - torchscript_onnx_tflite: - inference_time: 907.0 - throughput: 1102.5358324145534 + inference_time: 902.0 + throughput: 1108.6474501108648 estimated_peak_memory_range: - min: 20480 - max: 1935496 + min: 12288 + max: 24504288 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 72 - job_id: j1pv3w3r5 + job_id: jgn6vk4k5 job_status: Passed torchscript_onnx_qnn: inference_time: 1191.0 throughput: 839.6305625524769 estimated_peak_memory_range: - min: 638976 - max: 2290136 + min: 630784 + max: 1901720 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 105 - job_id: jvgdwvwz5 + job_id: j56y4wx6p job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:12:27Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:09:08Z' - torchscript_onnx_tflite: - inference_time: 906.0 - throughput: 1103.7527593818984 + inference_time: 1083.0 + throughput: 923.3610341643582 estimated_peak_memory_range: - min: 12288 - max: 6087368 + min: 16384 + max: 66463520 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 72 - job_id: j7gjxlxep + job_id: j5mnx87dp job_status: Passed torchscript_onnx_qnn: - inference_time: 1201.0 - throughput: 832.6394671107411 + inference_time: 1430.0 + throughput: 699.3006993006993 estimated_peak_memory_range: - min: 638976 - max: 1895296 + min: 618496 + max: 19699392 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 105 - job_id: jz57zdz9p + job_id: jgjvnqjvg job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:09:13Z' + - torchscript_onnx_tflite: + inference_time: 503.0 + throughput: 1988.0715705765408 + estimated_peak_memory_range: + min: 8192 + max: 25112736 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 72 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 72 + job_id: jp0z06w95 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 848.0 + throughput: 1179.245283018868 + estimated_peak_memory_range: + min: 0 + max: 14676272 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 105 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 105 + job_id: jpedmyjo5 + job_status: Passed + torchscript_onnx: + inference_time: 681.0 + throughput: 1468.4287812041116 + 
estimated_peak_memory_range: + min: 0 + max: 25528400 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 105 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 105 + job_id: jgdx12jrp + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:12:28Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:09:19Z' - torchscript_onnx_qnn: - inference_time: 1375.0 - throughput: 727.2727272727273 + inference_time: 1348.0 + throughput: 741.839762611276 estimated_peak_memory_range: min: 602112 max: 602112 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 105 - job_id: jz5wo9omp + job_id: j5q6qvnnp job_status: Passed torchscript_onnx: - inference_time: 961.0 - throughput: 1040.5827263267429 + inference_time: 971.0 + throughput: 1029.8661174047375 estimated_peak_memory_range: - min: 8085504 - max: 8085504 + min: 9220096 + max: 9220096 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 105 - job_id: jegn2e2qg + job_id: jg9lnd6wg job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:12:32Z' + timestamp: '2024-10-15T00:09:17Z' diff --git a/qai_hub_models/models/mobilenet_v2_quantized/README.md b/qai_hub_models/models/mobilenet_v2_quantized/README.md index 9a8a7c06..378f950e 100644 --- a/qai_hub_models/models/mobilenet_v2_quantized/README.md +++ b/qai_hub_models/models/mobilenet_v2_quantized/README.md @@ -6,7 +6,7 @@ MobileNetV2 is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of MobileNet-v2-Quantized found -[here](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/mobilenetv2). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/mobilenet_v2_quantized). @@ -17,11 +17,6 @@ accross various devices, can be found [here](https://aihub.qualcomm.com/models/m ## Example & Usage -Install the package via pip: -```bash -pip install "qai_hub_models[mobilenet_v2_quantized]" -``` - Once installed, run the following simple CLI demo: @@ -44,15 +39,19 @@ python -m qai_hub_models.models.mobilenet_v2_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of MobileNet-v2-Quantized can be found +* The license for the original implementation of MobileNet-v2-Quantized can be found [here](https://github.com/quic/aimet-model-zoo/blob/develop/LICENSE.pdf). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [MobileNetV2: Inverted Residuals and Linear Bottlenecks](https://arxiv.org/abs/1801.04381) * [Source Model Implementation](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/mobilenetv2) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/mobilenet_v2_quantized/conftest.py b/qai_hub_models/models/mobilenet_v2_quantized/conftest.py index 56084dea..220651da 100644 --- a/qai_hub_models/models/mobilenet_v2_quantized/conftest.py +++ b/qai_hub_models/models/mobilenet_v2_quantized/conftest.py @@ -9,7 +9,6 @@ import pytest from qai_hub_models.models.mobilenet_v2_quantized import Model -from qai_hub_models.utils.testing import skip_clone_repo_check # Instantiate the model only once for all tests. @@ -22,7 +21,6 @@ def cached_from_pretrained(): from_pretrained = Model.from_pretrained sig = inspect.signature(from_pretrained) - @skip_clone_repo_check def _cached_from_pretrained(*args, **kwargs): cache_key = str(args) + str(kwargs) model = pretrained_cache.get(cache_key, None) diff --git a/qai_hub_models/models/mobilenet_v2_quantized/evaluate.py b/qai_hub_models/models/mobilenet_v2_quantized/evaluate.py index 76dd0581..55e4c66d 100644 --- a/qai_hub_models/models/mobilenet_v2_quantized/evaluate.py +++ b/qai_hub_models/models/mobilenet_v2_quantized/evaluate.py @@ -13,10 +13,8 @@ from qai_hub_models.models.mobilenet_v2_quantized import MODEL_ID, Model from qai_hub_models.utils.args import evaluate_parser, get_hub_device, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.evaluate import evaluate_on_dataset from qai_hub_models.utils.inference import compile_model_from_args -from qai_hub_models.utils.quantization_aimet import AIMETQuantizableMixin SUPPORTED_DATASETS = ["imagenette", "imagenet"] @@ -27,6 +25,7 @@ def main(): model_cls=Model, default_split_size=2500, supported_datasets=SUPPORTED_DATASETS, + is_hub_quantized=True, ) args = parser.parse_args() args.device = None @@ -38,13 +37,7 @@ def main(): MODEL_ID, args, get_model_kwargs(Model, vars(args)) ) hub_device = get_hub_device(None, args.chipset) - - # Use Fp16 model for torch inference - for cls in Model.__mro__: - if issubclass(cls, BaseModel) and not issubclass(cls, AIMETQuantizableMixin): - torch_cls = cls - break - torch_model = torch_cls.from_pretrained(**get_model_kwargs(torch_cls, vars(args))) + torch_model = Model.from_pretrained(**get_model_kwargs(Model, vars(args))) evaluate_on_dataset( hub_model, torch_model, diff --git a/qai_hub_models/models/mobilenet_v2_quantized/export.py b/qai_hub_models/models/mobilenet_v2_quantized/export.py index 47149272..f5b541e9 100644 --- a/qai_hub_models/models/mobilenet_v2_quantized/export.py +++ b/qai_hub_models/models/mobilenet_v2_quantized/export.py @@ -10,18 +10,20 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, 
Dict, List, Optional, cast import qai_hub as hub +import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.mobilenet_v2_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference +from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( print_inference_metrics, print_on_target_demo_cmd, @@ -31,11 +33,14 @@ can_access_qualcomm_ai_hub, export_without_hub_access, ) +from qai_hub_models.utils.quantization import get_calibration_data def export_model( device: str = "Samsung Galaxy S23 (Family)", chipset: Optional[str] = None, + num_calibration_samples: int = 100, + skip_compiling: bool = False, skip_profiling: bool = False, skip_inferencing: bool = False, skip_downloading: bool = False, @@ -45,20 +50,19 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + 3. Compiles the model to an asset that can be run on device + 4. Profiles the model performance on a real device + 5. Inferences the model on sample inputs + 6. Downloads the model asset to the local directory + 7. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 5 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -66,6 +70,9 @@ def export_model( Defaults to DEFAULT_DEVICE if not specified. chipset: If set, will choose a random device with this chipset. Overrides the `device` argument. + num_calibration_samples: The number of calibration data samples + to use for quantization. + skip_compiling: If set, skips compiling model to format that can run on device. skip_profiling: If set, skips profiling of compiled model on real devices. skip_inferencing: If set, skips computing on-device outputs from sample data. skip_downloading: If set, skips downloading of compiled model. @@ -80,10 +87,11 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: - * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). + A struct of: + * A CompileJob object containing metadata about the compile job submitted to hub (None if compiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
+ * A QuantizeJob object containing metadata about the quantize job submitted to hub """ model_name = "mobilenet_v2_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,33 +117,52 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) ) # Trace the model - source_model = model.convert_to_hub_source_model( - target_runtime, output_path, input_spec + source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) + + print(f"Quantizing model {model_name} with {num_calibration_samples} samples.") + # 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + onnx_compile_job = hub.submit_compile_job( + model=source_model, + input_specs=input_spec, + device=hub_device, + name=model_name, + options="--target_runtime onnx", + ) + quantize_job = hub.submit_quantize_job( + model=onnx_compile_job.get_target_model(), + calibration_data=get_calibration_data( + input_spec, "imagenette", num_calibration_samples + ), + weights_dtype=model.get_weights_dtype(), + activations_dtype=model.get_activations_dtype(), + name=model_name, + options=model.get_quantize_options(), ) + if skip_compiling: + return ExportResult(quantize_job=quantize_job) - # 2. Compile the model to an on-device asset + # 3. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) print(f"Optimizing model {model_name} to run on-device") submitted_compile_job = hub.submit_compile_job( - model=source_model, + model=quantize_job.get_target_model(), input_specs=input_spec, device=hub_device, name=model_name, - calibration_data=model.get_calibration_data(target_runtime), options=model_compile_options, ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 4. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +177,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 5. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +198,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 6. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 7. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -202,12 +229,17 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + quantize_job=quantize_job, + ) def main(): warnings.filterwarnings("ignore") - parser = export_parser(model_cls=Model) + parser = export_parser(model_cls=Model, is_hub_quantized=True) args = parser.parse_args() export_model(**vars(args)) diff --git a/qai_hub_models/models/mobilenet_v2_quantized/model.py b/qai_hub_models/models/mobilenet_v2_quantized/model.py index d884a6c7..70e84a3c 100644 --- a/qai_hub_models/models/mobilenet_v2_quantized/model.py +++ b/qai_hub_models/models/mobilenet_v2_quantized/model.py @@ -4,86 +4,11 @@ # --------------------------------------------------------------------- from __future__ import annotations -# isort: off -# This verifies aimet is installed, and this must be included first. -from qai_hub_models.utils.quantization_aimet import ( - AIMETQuantizableMixin, -) - -# isort: on - -import torch -from aimet_torch.cross_layer_equalization import equalize_model -from aimet_torch.model_preparer import prepare_model -from aimet_torch.quantsim import QuantizationSimModel, load_encodings_to_sim - from qai_hub_models.models.mobilenet_v2.model import MobileNetV2 -from qai_hub_models.utils.aimet.config_loader import get_default_aimet_config -from qai_hub_models.utils.asset_loaders import CachedWebModelAsset -from qai_hub_models.utils.quantization_aimet import ( - constrain_quantized_inputs_to_image_range, - convert_all_depthwise_to_per_tensor, -) +from qai_hub_models.utils.quantization import HubQuantizableMixin MODEL_ID = __name__.split(".")[-2] -MODEL_ASSET_VERSION = 5 - -# Weights downloaded from https://github.com/quic/aimet-model-zoo/releases/download/phase_2_january_artifacts/torch_mobilenetv2_w8a8_state_dict.pth -QUANTIZED_WEIGHTS = "torch_mobilenetv2_w8a8_state_dict.pth" -DEFAULT_ENCODINGS = "mobilenet_v2_quantized_encodings.json" - - -class MobileNetV2Quantizable(AIMETQuantizableMixin, MobileNetV2): - """MobileNetV2 with post train quantization support.""" - - def __init__( - self, - quant_sim_model: QuantizationSimModel, - ) -> None: - # Input is already normalized by sim_model. Disable it in the wrapper model. - MobileNetV2.__init__(self, quant_sim_model.model, normalize_input=False) - AIMETQuantizableMixin.__init__( - self, - quant_sim_model, - ) - - @classmethod - def from_pretrained( - cls, - aimet_encodings: str | None = "DEFAULT", - ) -> "MobileNetV2Quantizable": - """ - Parameters: - aimet_encodings: - if "DEFAULT": Loads the model with aimet encodings calibrated on imagenette. - elif None: Doesn't load any encodings. Used when computing encodings. - else: Interprets as a filepath and loads the encodings stored there. 
- """ - # Load Model - model = MobileNetV2.from_pretrained() - input_shape = cls.get_input_spec()["image_tensor"][0] - # Following - # https://github.com/quic/aimet-model-zoo/blob/develop/aimet_zoo_torch/mobilenetv2/model/model_definition.py#L64 - model = prepare_model(model) - equalize_model(model, input_shape) - - aimet_config = get_default_aimet_config() - sim = QuantizationSimModel( - model, - quant_scheme="tf_enhanced", - default_param_bw=8, - default_output_bw=8, - config_file=aimet_config, - dummy_input=torch.rand(input_shape), - ) - convert_all_depthwise_to_per_tensor(sim.model) - constrain_quantized_inputs_to_image_range(sim) - if aimet_encodings: - if aimet_encodings == "DEFAULT": - aimet_encodings = CachedWebModelAsset.from_asset_store( - MODEL_ID, MODEL_ASSET_VERSION, DEFAULT_ENCODINGS - ).fetch() - load_encodings_to_sim(sim, aimet_encodings) - return cls(sim) +class MobileNetV2Quantizable(HubQuantizableMixin, MobileNetV2): + pass diff --git a/qai_hub_models/models/mobilenet_v2_quantized/perf.yaml b/qai_hub_models/models/mobilenet_v2_quantized/perf.yaml index f76dbb7f..c1386822 100644 --- a/qai_hub_models/models/mobilenet_v2_quantized/perf.yaml +++ b/qai_hub_models/models/mobilenet_v2_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,82 +20,62 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8775P Proxy models: - name: MobileNet-v2-Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 279.0 - throughput: 3584.2293906810037 + inference_time: 434.0 + throughput: 2304.147465437788 estimated_peak_memory_range: min: 12288 - max: 1617296 + max: 9924072 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 74 + layers_on_npu: 109 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 74 - job_id: jogkzy1ng + total_layers: 109 + job_id: jgz32xdz5 job_status: Passed torchscript_onnx_qnn: - inference_time: 657.0 - throughput: 1522.0700152207 + inference_time: 665.0 + throughput: 1503.7593984962407 estimated_peak_memory_range: min: 16384 - max: 9679232 - primary_compute_unit: NPU - precision: int8 - layer_info: - layers_on_npu: 71 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 71 - job_id: jygze71xg - job_status: Passed - torchscript_onnx: - inference_time: 567.0 - throughput: 1763.668430335097 - estimated_peak_memory_range: - min: 12288 - max: 6765824 + max: 10005680 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 74 + layers_on_npu: 106 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 74 - job_id: jegn2ejqg + total_layers: 106 + job_id: jp2kx7kxp 
job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -103,51 +84,36 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:11:52Z' + timestamp: '2024-10-17T17:28:07Z' - torchscript_onnx_tflite: - inference_time: 209.0 - throughput: 4784.688995215311 + inference_time: 306.0 + throughput: 3267.97385620915 estimated_peak_memory_range: min: 12288 - max: 41895024 + max: 45111632 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 74 + layers_on_npu: 109 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 74 - job_id: jn5q82no5 + total_layers: 109 + job_id: j5wewd6z5 job_status: Passed torchscript_onnx_qnn: - inference_time: 482.0 - throughput: 2074.688796680498 + inference_time: 487.0 + throughput: 2053.388090349076 estimated_peak_memory_range: min: 159744 - max: 17433984 - primary_compute_unit: NPU - precision: int8 - layer_info: - layers_on_npu: 71 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 71 - job_id: jz5wo9jmp - job_status: Passed - torchscript_onnx: - inference_time: 417.0 - throughput: 2398.0815347721823 - estimated_peak_memory_range: - min: 0 - max: 62311824 + max: 16437872 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 74 + layers_on_npu: 106 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 74 - job_id: joprkyz75 + total_layers: 106 + job_id: jpy1z41rp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -156,287 +122,272 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:11:53Z' + timestamp: '2024-10-17T17:28:09Z' - torchscript_onnx_tflite: - inference_time: 286.0 - throughput: 3496.5034965034965 + inference_time: 1067.0 + throughput: 937.207122774133 estimated_peak_memory_range: - min: 12288 - max: 2184456 + min: 16384 + max: 28807200 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 74 + layers_on_npu: 109 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 74 - job_id: j1glnkjmp + total_layers: 109 + job_id: jg9l03nqg job_status: Passed torchscript_onnx_qnn: - inference_time: 602.0 - throughput: 1661.1295681063123 + inference_time: 1490.0 + throughput: 671.1409395973154 estimated_peak_memory_range: - min: 184320 - max: 1872464 + min: 12288 + max: 7521664 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 71 + layers_on_npu: 106 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 71 - job_id: jnp108r75 + total_layers: 106 + job_id: jp0z41z25 job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:11:46Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-17T17:27:50Z' - torchscript_onnx_tflite: - inference_time: 329.0 - throughput: 3039.51367781155 - estimated_peak_memory_range: - min: 12288 - max: 42901040 - primary_compute_unit: NPU - precision: int8 - layer_info: - layers_on_npu: 74 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 74 - job_id: jw5661ky5 - job_status: Passed - torchscript_onnx_qnn: - inference_time: 724.0 - throughput: 1381.2154696132598 + inference_time: 12534.0 + throughput: 79.78299026647518 estimated_peak_memory_range: - min: 0 - max: 17851680 + min: 28672 + max: 6489896 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 71 - layers_on_gpu: 0 + layers_on_npu: 107 + layers_on_gpu: 2 layers_on_cpu: 0 - total_layers: 71 - job_id: j0pxv1wlg + total_layers: 109 + 
job_id: jp142dzkp job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: RB5 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:11:50Z' + chipset: QCS8250 Proxy + timestamp: '2024-10-17T17:27:32Z' - torchscript_onnx_tflite: - inference_time: 284.0 - throughput: 3521.1267605633802 + inference_time: 433.0 + throughput: 2309.4688221709007 estimated_peak_memory_range: min: 12288 - max: 1401424 + max: 4784296 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 74 + layers_on_npu: 109 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 74 - job_id: j1p3kmyn5 + total_layers: 109 + job_id: jgdxnr1kp job_status: Passed torchscript_onnx_qnn: - inference_time: 610.0 - throughput: 1639.344262295082 + inference_time: 609.0 + throughput: 1642.0361247947455 estimated_peak_memory_range: min: 184320 - max: 1483200 + max: 1352704 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 71 + layers_on_npu: 106 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 71 - job_id: jvgdwvjz5 + total_layers: 106 + job_id: jp8q23qzp job_status: Passed reference_device_info: - name: SA8650 (Proxy) - os: '13' - form_factor: Auto + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:11:47Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-17T17:27:52Z' - torchscript_onnx_tflite: - inference_time: 287.0 - throughput: 3484.320557491289 + inference_time: 437.0 + throughput: 2288.329519450801 estimated_peak_memory_range: - min: 32768 - max: 4324136 + min: 12288 + max: 1342408 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 74 + layers_on_npu: 109 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 74 - job_id: jwgoyvjk5 + total_layers: 109 + job_id: j57y2jrq5 job_status: Passed torchscript_onnx_qnn: - inference_time: 626.0 - throughput: 1597.444089456869 + inference_time: 617.0 + throughput: 1620.7455429497568 estimated_peak_memory_range: - min: 221184 - max: 1462848 + min: 184320 + max: 1510768 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 71 + layers_on_npu: 106 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 71 - job_id: jz57zdq9p + total_layers: 106 + job_id: j5q60767p job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:11:48Z' + chipset: SA8255P Proxy + timestamp: '2024-10-17T17:27:56Z' - torchscript_onnx_tflite: - inference_time: 283.0 - throughput: 3533.5689045936397 + inference_time: 433.0 + throughput: 2309.4688221709007 estimated_peak_memory_range: min: 12288 - max: 1426808 + max: 1327560 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 74 + layers_on_npu: 109 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 74 - job_id: j1pv3wjr5 + total_layers: 109 + job_id: jp4lnxrq5 job_status: Passed torchscript_onnx_qnn: - inference_time: 602.0 - throughput: 1661.1295681063123 + inference_time: 625.0 + throughput: 1600.0 estimated_peak_memory_range: - min: 180224 - max: 1801976 + min: 184320 + max: 1393752 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 71 + layers_on_npu: 106 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 71 - job_id: jqp4qwz1g + total_layers: 106 + job_id: jglv40ve5 job_status: Passed reference_device_info: - name: 
SA8255 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:11:49Z' + chipset: SA8775P Proxy + timestamp: '2024-10-17T17:27:58Z' - torchscript_onnx_tflite: - inference_time: 830.0 - throughput: 1204.8192771084337 + inference_time: 486.0 + throughput: 2057.61316872428 estimated_peak_memory_range: min: 12288 - max: 27623808 + max: 44831232 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 74 + layers_on_npu: 109 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 74 - job_id: j7gjxljep + total_layers: 109 + job_id: jpxk97oj5 job_status: Passed torchscript_onnx_qnn: - inference_time: 1429.0 - throughput: 699.7900629811056 + inference_time: 725.0 + throughput: 1379.3103448275863 estimated_peak_memory_range: - min: 12288 - max: 8095952 + min: 159744 + max: 20340720 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 71 + layers_on_npu: 106 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 71 - job_id: jo5mrzj9g + total_layers: 106 + job_id: j56y23yvp job_status: Passed reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T12:11:51Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-17T17:28:00Z' - torchscript_onnx_tflite: - inference_time: 7669.0 - throughput: 130.39509714434737 + inference_time: 297.0 + throughput: 3367.003367003367 estimated_peak_memory_range: - min: 12288 - max: 10982704 + min: 8192 + max: 28543344 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 72 - layers_on_gpu: 2 + layers_on_npu: 109 + layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 74 - job_id: jlpe9vjvg + total_layers: 109 + job_id: j5mnewxyp job_status: Passed - reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot - os_name: Android - manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T12:11:42Z' - - torchscript_onnx_qnn: - inference_time: 724.0 - throughput: 1381.2154696132598 + torchscript_onnx_qnn: + inference_time: 406.0 + throughput: 2463.054187192118 estimated_peak_memory_range: - min: 577536 - max: 577536 + min: 8192 + max: 14825008 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 71 + layers_on_npu: 106 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 71 - job_id: jmg9v4685 + total_layers: 106 + job_id: jp3jn4jxg job_status: Passed - torchscript_onnx: - inference_time: 577.0 - throughput: 1733.102253032929 + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-17T17:28:12Z' + - torchscript_onnx_qnn: + inference_time: 757.0 + throughput: 1321.003963011889 estimated_peak_memory_range: - min: 5672960 - max: 5672960 + min: 630784 + max: 630784 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 74 + layers_on_npu: 106 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 74 - job_id: jep28m2qp + total_layers: 106 + job_id: jgkevleyg job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -445,4 +396,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:11:54Z' + timestamp: '2024-10-17T17:28:11Z' diff --git a/qai_hub_models/models/mobilenet_v2_quantized/requirements.txt 
b/qai_hub_models/models/mobilenet_v2_quantized/requirements.txt deleted file mode 100644 index de5b80e8..00000000 --- a/qai_hub_models/models/mobilenet_v2_quantized/requirements.txt +++ /dev/null @@ -1 +0,0 @@ -aimet-torch==1.32.1.post1; sys_platform == "linux" diff --git a/qai_hub_models/models/mobilenet_v2_quantized/test.py b/qai_hub_models/models/mobilenet_v2_quantized/test.py deleted file mode 100644 index 9837761a..00000000 --- a/qai_hub_models/models/mobilenet_v2_quantized/test.py +++ /dev/null @@ -1,31 +0,0 @@ -# --------------------------------------------------------------------- -# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. -# SPDX-License-Identifier: BSD-3-Clause -# --------------------------------------------------------------------- -from qai_hub_models.models._shared.imagenet_classifier.test_utils import ( - run_imagenet_classifier_test, ) -from qai_hub_models.models.mobilenet_v2_quantized.demo import main as demo_main -from qai_hub_models.models.mobilenet_v2_quantized.model import ( - MODEL_ASSET_VERSION, - MODEL_ID, - MobileNetV2Quantizable, ) -from qai_hub_models.utils.testing import skip_clone_repo_check - - -@skip_clone_repo_check -def test_task(): - run_imagenet_classifier_test( - MobileNetV2Quantizable.from_pretrained(), - MODEL_ID, - asset_version=MODEL_ASSET_VERSION, - probability_threshold=0.56, - diff_tol=0.06, - ) - - -@skip_clone_repo_check -def test_demo(): - # Verify demo does not crash - demo_main(is_test=True) diff --git a/qai_hub_models/models/mobilenet_v3_large/README.md b/qai_hub_models/models/mobilenet_v3_large/README.md index cbc69327..f2e5c92e 100644 --- a/qai_hub_models/models/mobilenet_v3_large/README.md +++ b/qai_hub_models/models/mobilenet_v3_large/README.md @@ -6,7 +6,7 @@ MobileNet-v3-Large is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of MobileNet-v3-Large found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/mobilenetv3.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/mobilenet_v3_large). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.mobilenet_v3_large.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of MobileNet-v3-Large can be found +* The license for the original implementation of MobileNet-v3-Large can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Searching for MobileNetV3](https://arxiv.org/abs/1905.02244) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/mobilenetv3.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/mobilenet_v3_large/export.py b/qai_hub_models/models/mobilenet_v3_large/export.py index 5ff2c2ad..0ab299ba 100644 --- a/qai_hub_models/models/mobilenet_v3_large/export.py +++ b/qai_hub_models/models/mobilenet_v3_large/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.mobilenet_v3_large import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
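As a hypothetical usage illustration (not from this patch), a caller with Qualcomm® AI Hub access configured might drive this entry point and consume the returned struct; the profile-handling calls mirror the ones `export_model` uses internally:

```python
# Illustrative driver. Without AI Hub access, export_model returns a
# List[str] of instructions instead of an ExportResult.
from qai_hub_models.models.mobilenet_v3_large.export import export_model

result = export_model(device="Samsung Galaxy S23 (Family)")
if not isinstance(result, list):
    # Same pattern the summary step uses: wait for the job, then fetch data.
    assert result.profile_job is not None and result.profile_job.wait().success
    profile_data = result.profile_job.download_profile()
    print(sorted(profile_data.keys()))
```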
""" model_name = "mobilenet_v3_large" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/mobilenet_v3_large/perf.yaml b/qai_hub_models/models/mobilenet_v3_large/perf.yaml index 62227cc1..ae27a257 100644 --- a/qai_hub_models/models/mobilenet_v3_large/perf.yaml +++ b/qai_hub_models/models/mobilenet_v3_large/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: MobileNet-v3-Large performance_metrics: - torchscript_onnx_tflite: - inference_time: 996.0 - throughput: 1004.0160642570281 + inference_time: 987.0 + throughput: 1013.1712259371834 estimated_peak_memory_range: min: 12288 - max: 2218408 + max: 38384416 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 128 - job_id: j1glnkdmp + job_id: jgjvnqyxg job_status: Passed torchscript_onnx_qnn: - inference_time: 1044.0 - throughput: 957.8544061302682 + inference_time: 1051.0 + throughput: 951.4747859181732 estimated_peak_memory_range: - min: 626688 - max: 286895088 + min: 622592 + max: 5759816 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 144 - job_id: jygze76xg + job_id: jp14z6m8p job_status: Passed torchscript_onnx: - inference_time: 1036.0 - throughput: 965.2509652509652 + inference_time: 993.0 + throughput: 1007.0493454179255 estimated_peak_memory_range: - min: 0 - max: 276525096 + min: 618496 + max: 2189336 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 146 - job_id: jo5mrz79g + job_id: jp0z06x95 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:11:02Z' + timestamp: '2024-10-15T00:07:28Z' - torchscript_onnx_tflite: - inference_time: 692.0 - throughput: 1445.086705202312 + inference_time: 698.0 + throughput: 1432.6647564469913 estimated_peak_memory_range: min: 16384 - max: 67093360 + max: 68672320 primary_compute_unit: NPU precision: fp16 
layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 128 - job_id: jw5661xy5 + job_id: jpedmyx15 job_status: Passed torchscript_onnx_qnn: - inference_time: 731.0 - throughput: 1367.9890560875513 + inference_time: 732.0 + throughput: 1366.120218579235 estimated_peak_memory_range: min: 618496 - max: 18611664 + max: 19697296 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 144 - job_id: jz5wo9kmp + job_id: jgdx12mrp job_status: Passed torchscript_onnx: - inference_time: 756.0 - throughput: 1322.7513227513227 + inference_time: 722.0 + throughput: 1385.0415512465374 estimated_peak_memory_range: min: 0 - max: 67472384 + max: 68762576 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 146 - job_id: jegn2e4qg + job_id: jp8qy1kkp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:11:03Z' + timestamp: '2024-10-15T00:07:29Z' - torchscript_onnx_tflite: inference_time: 987.0 throughput: 1013.1712259371834 estimated_peak_memory_range: - min: 24576 - max: 1867088 + min: 20480 + max: 1464160 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 128 - job_id: j1p3kmdn5 + job_id: jgz3dnyk5 job_status: Passed torchscript_onnx_qnn: - inference_time: 996.0 - throughput: 1004.0160642570281 + inference_time: 1001.0 + throughput: 999.000999000999 estimated_peak_memory_range: - min: 630784 - max: 1775008 + min: 638976 + max: 1908496 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 144 - job_id: jnp108975 + job_id: jp4lr3285 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:10:58Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:07:20Z' - torchscript_onnx_tflite: - inference_time: 1392.0 - throughput: 718.3908045977012 + inference_time: 986.0 + throughput: 1014.1987829614604 estimated_peak_memory_range: - min: 12288 - max: 68139712 + min: 24576 + max: 253750768 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,37 +200,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 128 - job_id: jwgoyvxk5 + job_id: jgdx12mep job_status: Passed torchscript_onnx_qnn: - inference_time: 1470.0 - throughput: 680.2721088435375 + inference_time: 1003.0 + throughput: 997.0089730807578 estimated_peak_memory_range: - min: 622592 - max: 21538816 + min: 634880 + max: 1943496 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 146 + layers_on_npu: 144 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: j0pxv1qlg + total_layers: 144 + job_id: jgn6vkwk5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:11:02Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:07:24Z' - torchscript_onnx_tflite: - inference_time: 997.0 - throughput: 1003.0090270812437 + inference_time: 991.0 + throughput: 1009.0817356205853 estimated_peak_memory_range: - min: 57344 - max: 1584984 + 
min: 20480 + max: 1590376 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 128 - job_id: j1pv3w8r5 + job_id: jp14z6m2p job_status: Passed torchscript_onnx_qnn: - inference_time: 998.0 - throughput: 1002.0040080160321 + inference_time: 994.0 + throughput: 1006.0362173038229 estimated_peak_memory_range: - min: 634880 - max: 1977464 + min: 638976 + max: 1794568 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 144 - job_id: jvgdwvkz5 + job_id: j5mnx8ldp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:10:59Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:07:23Z' - torchscript_onnx_tflite: - inference_time: 997.0 - throughput: 1003.0090270812437 + inference_time: 989.0 + throughput: 1011.1223458038422 estimated_peak_memory_range: - min: 45056 - max: 2527128 + min: 28672 + max: 257923856 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 128 - job_id: j7gjxl9ep + job_id: jg9lndqlg job_status: Passed torchscript_onnx_qnn: - inference_time: 999.0 - throughput: 1001.001001001001 + inference_time: 1000.0 + throughput: 1000.0 estimated_peak_memory_range: - min: 663552 - max: 2271392 + min: 622592 + max: 1873896 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,60 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 144 - job_id: jz57zdm9p + job_id: jpxkoxz35 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:11:00Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:07:22Z' - torchscript_onnx_tflite: - inference_time: 991.0 - throughput: 1009.0817356205853 + inference_time: 1385.0 + throughput: 722.0216606498195 estimated_peak_memory_range: - min: 24576 - max: 1530320 + min: 20480 + max: 69762448 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 128 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 128 + job_id: j5we64r65 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 1473.0 + throughput: 678.8866259334691 + estimated_peak_memory_range: + min: 618496 + max: 23621792 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 146 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 146 + job_id: jp2kyezrp + job_status: Passed + reference_device_info: + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:07:26Z' + - torchscript_onnx_tflite: + inference_time: 679.0 + throughput: 1472.7540500736377 + estimated_peak_memory_range: + min: 12288 + max: 26004720 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +352,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 128 - job_id: jlpe9vqvg + job_id: jg9lndqwg job_status: Passed torchscript_onnx_qnn: - inference_time: 1002.0 - throughput: 998.003992015968 + inference_time: 710.0 + throughput: 1408.4507042253522 estimated_peak_memory_range: - min: 630784 - max: 1848168 + min: 614400 + max: 15213248 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +367,34 @@ models: 
layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 144 - job_id: jqp4qw71g + job_id: jpy13my8p + job_status: Passed + torchscript_onnx: + inference_time: 734.0 + throughput: 1362.3978201634877 + estimated_peak_memory_range: + min: 0 + max: 28075408 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 146 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 146 + job_id: jglvmlqj5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) - os: '13' - form_factor: Auto + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:11:01Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:07:32Z' - torchscript_onnx_qnn: - inference_time: 1173.0 - throughput: 852.5149190110827 + inference_time: 1166.0 + throughput: 857.6329331046312 estimated_peak_memory_range: min: 602112 max: 602112 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 144 - job_id: jmg9v4r85 + job_id: j57yr98v5 job_status: Passed torchscript_onnx: - inference_time: 1040.0 - throughput: 961.5384615384615 + inference_time: 1072.0 + throughput: 932.8358208955224 estimated_peak_memory_range: - min: 14659584 - max: 14659584 + min: 13778944 + max: 13778944 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 146 - job_id: joprkyr75 + job_id: jgkex8kwg job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:11:04Z' + timestamp: '2024-10-15T00:07:30Z' diff --git a/qai_hub_models/models/mobilenet_v3_large_quantized/README.md b/qai_hub_models/models/mobilenet_v3_large_quantized/README.md index f1d80ca8..e799b0e9 100644 --- a/qai_hub_models/models/mobilenet_v3_large_quantized/README.md +++ b/qai_hub_models/models/mobilenet_v3_large_quantized/README.md @@ -6,7 +6,7 @@ MobileNet-v3-Large is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of MobileNet-v3-Large-Quantized found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/mobilenetv3.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/mobilenet_v3_large_quantized). @@ -17,11 +17,6 @@ across various devices can be found [here](https://aihub.qualcomm.com/models/m ## Example & Usage -Install the package via pip: -```bash -pip install "qai_hub_models[mobilenet_v3_large_quantized]" -``` - Once installed, run the following simple CLI demo: @@ -44,15 +39,19 @@ python -m qai_hub_models.models.mobilenet_v3_large_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of MobileNet-v3-Large-Quantized can be found +* The license for the original implementation of MobileNet-v3-Large-Quantized can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Searching for MobileNetV3](https://arxiv.org/abs/1905.02244) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/mobilenetv3.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/mobilenet_v3_large_quantized/evaluate.py b/qai_hub_models/models/mobilenet_v3_large_quantized/evaluate.py index 39314070..6716f23c 100644 --- a/qai_hub_models/models/mobilenet_v3_large_quantized/evaluate.py +++ b/qai_hub_models/models/mobilenet_v3_large_quantized/evaluate.py @@ -13,10 +13,8 @@ from qai_hub_models.models.mobilenet_v3_large_quantized import MODEL_ID, Model from qai_hub_models.utils.args import evaluate_parser, get_hub_device, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.evaluate import evaluate_on_dataset from qai_hub_models.utils.inference import compile_model_from_args -from qai_hub_models.utils.quantization_aimet import AIMETQuantizableMixin SUPPORTED_DATASETS = ["imagenette", "imagenet"] @@ -27,6 +25,7 @@ def main(): model_cls=Model, default_split_size=2500, supported_datasets=SUPPORTED_DATASETS, + is_hub_quantized=True, ) args = parser.parse_args() args.device = None @@ -38,13 +37,7 @@ def main(): MODEL_ID, args, get_model_kwargs(Model, vars(args)) ) hub_device = get_hub_device(None, args.chipset) - - # Use Fp16 model for torch inference - for cls in Model.__mro__: - if issubclass(cls, BaseModel) and not issubclass(cls, AIMETQuantizableMixin): - torch_cls = cls - break - torch_model = torch_cls.from_pretrained(**get_model_kwargs(torch_cls, vars(args))) + torch_model = Model.from_pretrained(**get_model_kwargs(Model, vars(args))) evaluate_on_dataset( hub_model, torch_model, diff --git a/qai_hub_models/models/mobilenet_v3_large_quantized/export.py b/qai_hub_models/models/mobilenet_v3_large_quantized/export.py index 0b04571e..81d02d7a 100644 --- a/qai_hub_models/models/mobilenet_v3_large_quantized/export.py +++ b/qai_hub_models/models/mobilenet_v3_large_quantized/export.py @@ -10,18 +10,20 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.mobilenet_v3_large_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference +from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( print_inference_metrics, print_on_target_demo_cmd, @@ -31,11 +33,14 @@ can_access_qualcomm_ai_hub, export_without_hub_access, ) +from qai_hub_models.utils.quantization import get_calibration_data def export_model( device: str = 
"Samsung Galaxy S23 (Family)", chipset: Optional[str] = None, + num_calibration_samples: int = 100, + skip_compiling: bool = False, skip_profiling: bool = False, skip_inferencing: bool = False, skip_downloading: bool = False, @@ -45,20 +50,19 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + 3. Compiles the model to an asset that can be run on device + 4. Profiles the model performance on a real device + 5. Inferences the model on sample inputs + 6. Downloads the model asset to the local directory + 7. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 5 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -66,6 +70,9 @@ def export_model( Defaults to DEFAULT_DEVICE if not specified. chipset: If set, will choose a random device with this chipset. Overrides the `device` argument. + num_calibration_samples: The number of calibration data samples + to use for quantization. + skip_compiling: If set, skips compiling model to format that can run on device. skip_profiling: If set, skips profiling of compiled model on real devices. skip_inferencing: If set, skips computing on-device outputs from sample data. skip_downloading: If set, skips downloading of compiled model. @@ -80,10 +87,11 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: - * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). + A struct of: + * A CompileJob object containing metadata about the compile job submitted to hub (None if compiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). + * A QuantizeJob object containing metadata about the quantize job submitted to hub """ model_name = "mobilenet_v3_large_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,33 +117,52 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. 
Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) ) # Trace the model - source_model = model.convert_to_hub_source_model( - target_runtime, output_path, input_spec + source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) + + print(f"Quantizing model {model_name} with {num_calibration_samples} samples.") + # 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + onnx_compile_job = hub.submit_compile_job( + model=source_model, + input_specs=input_spec, + device=hub_device, + name=model_name, + options="--target_runtime onnx", + ) + quantize_job = hub.submit_quantize_job( + model=onnx_compile_job.get_target_model(), + calibration_data=get_calibration_data( + input_spec, "imagenette", num_calibration_samples + ), + weights_dtype=model.get_weights_dtype(), + activations_dtype=model.get_activations_dtype(), + name=model_name, + options=model.get_quantize_options(), ) + if skip_compiling: + return ExportResult(quantize_job=quantize_job) - # 2. Compile the model to an on-device asset + # 3. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) print(f"Optimizing model {model_name} to run on-device") submitted_compile_job = hub.submit_compile_job( - model=source_model, + model=quantize_job.get_target_model(), input_specs=input_spec, device=hub_device, name=model_name, - calibration_data=model.get_calibration_data(target_runtime), options=model_compile_options, ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 4. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +177,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 5. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +198,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 6. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 7. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -198,12 +225,17 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + quantize_job=quantize_job, + ) def main(): warnings.filterwarnings("ignore") - parser = export_parser(model_cls=Model) + parser = export_parser(model_cls=Model, is_hub_quantized=True) args = parser.parse_args() export_model(**vars(args)) diff --git a/qai_hub_models/models/mobilenet_v3_large_quantized/model.py b/qai_hub_models/models/mobilenet_v3_large_quantized/model.py index b13a9d4c..ee3c1b02 100644 --- a/qai_hub_models/models/mobilenet_v3_large_quantized/model.py +++ b/qai_hub_models/models/mobilenet_v3_large_quantized/model.py @@ -4,78 +4,11 @@ # --------------------------------------------------------------------- from __future__ import annotations -# isort: off -# This verifies aimet is installed, and this must be included first. -from qai_hub_models.utils.quantization_aimet import ( - AIMETQuantizableMixin, - constrain_quantized_inputs_to_image_range, -) - -# isort: on - -import torch -from aimet_torch.cross_layer_equalization import equalize_model -from aimet_torch.model_preparer import prepare_model -from aimet_torch.quantsim import QuantizationSimModel, load_encodings_to_sim - from qai_hub_models.models.mobilenet_v3_large.model import MobileNetV3Large -from qai_hub_models.utils.aimet.config_loader import get_default_aimet_config -from qai_hub_models.utils.asset_loaders import CachedWebModelAsset +from qai_hub_models.utils.quantization import HubQuantizableMixin MODEL_ID = __name__.split(".")[-2] -MODEL_ASSET_VERSION = 2 -DEFAULT_ENCODINGS = "mobilenet_v3_large_quantized_encodings.json" - - -class MobileNetV3LargeQuantizable(AIMETQuantizableMixin, MobileNetV3Large): - """MobileNetV3Large with post train quantization support. - - Supports only 8 bit weights and activations, and only loads pre-quantized checkpoints. - Support for quantizing using your own weights & data will come at a later date.""" - - def __init__( - self, - sim_model: QuantizationSimModel, - ) -> None: - # Input is already normalized by sim_model. Disable it in the wrapper model. - MobileNetV3Large.__init__(self, sim_model.model, normalize_input=False) - AIMETQuantizableMixin.__init__( - self, - sim_model, - ) - - @classmethod - def from_pretrained( - cls, - aimet_encodings: str | None = "DEFAULT", - ) -> "MobileNetV3LargeQuantizable": - """ - Parameters: - aimet_encodings: - if "DEFAULT": Loads the model with aimet encodings calibrated on imagenette. - elif None: Doesn't load any encodings. Used when computing encodings. - else: Interprets as a filepath and loads the encodings stored there. 
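In place of the AIMET simulation setup being deleted here, the Hub-side quantization recipe in the new export.py amounts to two extra job submissions. Condensed into a standalone helper for readability — a sketch using only calls that appear in this patch, with error handling and naming details omitted:

```python
# Condensed from the export script above; parameters mirror the diff.
import qai_hub as hub


def quantize_on_hub(model, source_model, input_spec, hub_device, calibration_data, name):
    # First compile the traced TorchScript model to ONNX...
    onnx_compile_job = hub.submit_compile_job(
        model=source_model,
        input_specs=input_spec,
        device=hub_device,
        name=name,
        options="--target_runtime onnx",
    )
    # ...then quantize the ONNX asset, calibrated on sample data
    # (imagenette in this patch).
    quantize_job = hub.submit_quantize_job(
        model=onnx_compile_job.get_target_model(),
        calibration_data=calibration_data,
        weights_dtype=model.get_weights_dtype(),
        activations_dtype=model.get_activations_dtype(),
        name=name,
        options=model.get_quantize_options(),
    )
    return quantize_job
```

The resulting `quantize_job.get_target_model()` is what the regular `submit_compile_job` call consumes next, which is why `skip_compiling` can return early with only the quantize job populated.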
- """ - model = MobileNetV3Large.from_pretrained() - input_shape = cls.get_input_spec()["image_tensor"][0] - - model = prepare_model(model) - equalize_model(model, input_shape) - sim = QuantizationSimModel( - model, - quant_scheme="tf_enhanced", - default_param_bw=8, - default_output_bw=8, - config_file=get_default_aimet_config(), - dummy_input=torch.rand(input_shape), - ) - constrain_quantized_inputs_to_image_range(sim) - if aimet_encodings: - if aimet_encodings == "DEFAULT": - aimet_encodings = CachedWebModelAsset.from_asset_store( - MODEL_ID, MODEL_ASSET_VERSION, DEFAULT_ENCODINGS - ).fetch() - load_encodings_to_sim(sim, aimet_encodings) - return cls(sim) +class MobileNetV3LargeQuantizable(HubQuantizableMixin, MobileNetV3Large): + pass diff --git a/qai_hub_models/models/mobilenet_v3_large_quantized/perf.yaml b/qai_hub_models/models/mobilenet_v3_large_quantized/perf.yaml index 0a7c2e6c..659488a9 100644 --- a/qai_hub_models/models/mobilenet_v3_large_quantized/perf.yaml +++ b/qai_hub_models/models/mobilenet_v3_large_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,44 +20,39 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8775P Proxy models: - name: MobileNet-v3-Large-Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 336.0 - throughput: 2976.190476190476 + inference_time: 346.0 + throughput: 2890.173410404624 estimated_peak_memory_range: - min: 24576 - max: 1495384 + min: 12288 + max: 10174368 primary_compute_unit: NPU precision: int8 layer_info: @@ -64,29 +60,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 137 - job_id: j1p3kmr35 + job_id: jp2kxmnxp job_status: Passed torchscript_onnx_qnn: - inference_time: 627.0 - throughput: 1594.896331738437 + inference_time: 630.0 + throughput: 1587.3015873015872 estimated_peak_memory_range: - min: 12288 - max: 25425304 + min: 28672 + max: 14777032 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 126 + layers_on_npu: 145 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 126 - job_id: jvgdwvyr5 + total_layers: 145 + job_id: jgjvdl47g job_status: Passed torchscript_onnx: - inference_time: 681.0 - throughput: 1468.4287812041116 + inference_time: 656.0 + throughput: 1524.3902439024391 estimated_peak_memory_range: min: 12288 - max: 17557696 + max: 12047376 primary_compute_unit: NPU precision: int8 layer_info: @@ -94,7 +90,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 134 - job_id: joprkym75 + job_id: jgn609vv5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -103,13 +99,13 @@ models: 
os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:10:25Z' + timestamp: '2024-10-17T17:26:46Z' - torchscript_onnx_tflite: - inference_time: 318.0 - throughput: 3144.6540880503144 + inference_time: 239.0 + throughput: 4184.100418410042 estimated_peak_memory_range: min: 12288 - max: 55456912 + max: 54438352 primary_compute_unit: NPU precision: int8 layer_info: @@ -117,29 +113,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 137 - job_id: jwgoyv9q5 + job_id: jpy1zd0rp job_status: Passed torchscript_onnx_qnn: - inference_time: 457.0 - throughput: 2188.183807439825 + inference_time: 602.0 + throughput: 1661.1295681063123 estimated_peak_memory_range: min: 163840 - max: 18072000 + max: 19043536 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 126 + layers_on_npu: 145 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 126 - job_id: jz5wo90mp + total_layers: 145 + job_id: jpedov375 job_status: Passed torchscript_onnx: - inference_time: 521.0 - throughput: 1919.3857965451057 + inference_time: 495.0 + throughput: 2020.20202020202 estimated_peak_memory_range: - min: 0 - max: 84828160 + min: 12288 + max: 85886736 primary_compute_unit: NPU precision: int8 layer_info: @@ -147,7 +143,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 134 - job_id: jep28mqqp + job_id: jprv643vg job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -156,13 +152,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:10:25Z' + timestamp: '2024-10-17T17:26:48Z' - torchscript_onnx_tflite: - inference_time: 341.0 - throughput: 2932.551319648094 + inference_time: 1133.0 + throughput: 882.61253309797 estimated_peak_memory_range: min: 12288 - max: 1313456 + max: 33516144 primary_compute_unit: NPU precision: int8 layer_info: @@ -170,37 +166,60 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 137 - job_id: j1pv3wyk5 + job_id: jp0z4r725 job_status: Passed torchscript_onnx_qnn: - inference_time: 579.0 - throughput: 1727.1157167530225 + inference_time: 1699.0 + throughput: 588.5815185403178 estimated_peak_memory_range: - min: 184320 - max: 1431192 + min: 12288 + max: 7871536 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 126 + layers_on_npu: 145 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 126 - job_id: jnp108k75 + total_layers: 145 + job_id: jgz327kz5 job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS6490 Proxy + timestamp: '2024-10-17T17:26:32Z' + - torchscript_onnx_tflite: + inference_time: 6815.0 + throughput: 146.7351430667645 + estimated_peak_memory_range: + min: 40960 + max: 2494136 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 137 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 137 + job_id: jp8q27vzp + job_status: Passed + reference_device_info: + name: RB5 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:10:19Z' + chipset: QCS8250 Proxy + timestamp: '2024-10-17T17:26:17Z' - torchscript_onnx_tflite: - inference_time: 440.0 - throughput: 2272.7272727272725 + inference_time: 349.0 + throughput: 2865.3295128939826 estimated_peak_memory_range: - min: 20480 - max: 54918832 + min: 28672 + max: 1402784 primary_compute_unit: NPU precision: int8 layer_info: @@ -208,37 +227,37 @@ models: layers_on_gpu: 
0 layers_on_cpu: 0 total_layers: 137 - job_id: j7gjxl6vp + job_id: jgkevymyg job_status: Passed torchscript_onnx_qnn: - inference_time: 756.0 - throughput: 1322.7513227513227 + inference_time: 574.0 + throughput: 1742.1602787456445 estimated_peak_memory_range: - min: 163840 - max: 20758448 + min: 184320 + max: 1715536 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 126 + layers_on_npu: 145 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 126 - job_id: jo5mrz19g + total_layers: 145 + job_id: j5wew9nz5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:10:23Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-17T17:26:34Z' - torchscript_onnx_tflite: - inference_time: 344.0 - throughput: 2906.9767441860463 + inference_time: 336.0 + throughput: 2976.190476190476 estimated_peak_memory_range: min: 12288 - max: 2200328 + max: 1349808 primary_compute_unit: NPU precision: int8 layer_info: @@ -246,37 +265,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 137 - job_id: jlpe9v0og + job_id: j5q602o7p job_status: Passed torchscript_onnx_qnn: - inference_time: 579.0 - throughput: 1727.1157167530225 + inference_time: 574.0 + throughput: 1742.1602787456445 estimated_peak_memory_range: - min: 184320 - max: 1425392 + min: 188416 + max: 1410288 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 126 + layers_on_npu: 145 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 126 - job_id: jvgdwvyz5 + total_layers: 145 + job_id: jp1428xkp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:10:20Z' + chipset: SA8255P Proxy + timestamp: '2024-10-17T17:26:37Z' - torchscript_onnx_tflite: - inference_time: 339.0 - throughput: 2949.8525073746314 + inference_time: 348.0 + throughput: 2873.5632183908046 estimated_peak_memory_range: - min: 24576 - max: 129688360 + min: 16384 + max: 1496872 primary_compute_unit: NPU precision: int8 layer_info: @@ -284,22 +303,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 137 - job_id: jygze7qog + job_id: jglv4kre5 job_status: Passed torchscript_onnx_qnn: - inference_time: 575.0 - throughput: 1739.1304347826087 + inference_time: 577.0 + throughput: 1733.102253032929 estimated_peak_memory_range: min: 176128 - max: 1611400 + max: 1424912 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 126 + layers_on_npu: 145 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 126 - job_id: jz57zd19p + total_layers: 145 + job_id: jgdxnvlkp job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -307,14 +326,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:10:21Z' + chipset: SA8775P Proxy + timestamp: '2024-10-17T17:26:39Z' - torchscript_onnx_tflite: - inference_time: 348.0 - throughput: 2873.5632183908046 + inference_time: 439.0 + throughput: 2277.904328018223 estimated_peak_memory_range: min: 16384 - max: 3181552 + max: 55946528 primary_compute_unit: NPU precision: int8 layer_info: @@ -322,37 +341,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 137 - job_id: jz5wo903p + job_id: j56y21lvp job_status: Passed torchscript_onnx_qnn: - inference_time: 576.0 - throughput: 1736.111111111111 + 
inference_time: 767.0 + throughput: 1303.7809647979138 estimated_peak_memory_range: - min: 184320 - max: 1569304 + min: 163840 + max: 21186096 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 126 + layers_on_npu: 145 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 126 - job_id: j0pxv18lg + total_layers: 145 + job_id: j57y2d3q5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:10:22Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-17T17:26:40Z' - torchscript_onnx_tflite: - inference_time: 1151.0 - throughput: 868.8097306689835 + inference_time: 249.0 + throughput: 4016.0642570281125 estimated_peak_memory_range: - min: 12288 - max: 33354320 + min: 8192 + max: 31590176 primary_compute_unit: NPU precision: int8 layer_info: @@ -360,75 +379,67 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 137 - job_id: jmg9v47w5 + job_id: jp3jnm2xg job_status: Passed torchscript_onnx_qnn: - inference_time: 1734.0 - throughput: 576.7012687427913 + inference_time: 473.0 + throughput: 2114.164904862579 estimated_peak_memory_range: - min: 12288 - max: 7523600 + min: 159744 + max: 14395792 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 126 + layers_on_npu: 145 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 126 - job_id: jegn2edqg + total_layers: 145 + job_id: jp4lnw0q5 job_status: Passed - reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot - os_name: Android - manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T12:10:24Z' - - torchscript_onnx_tflite: - inference_time: 6871.0 - throughput: 145.53922282055015 + torchscript_onnx: + inference_time: 528.0 + throughput: 1893.939393939394 estimated_peak_memory_range: min: 12288 - max: 2058328 + max: 38851536 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 137 + layers_on_npu: 134 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 137 - job_id: jnp108k85 + total_layers: 134 + job_id: jpy1z43rp job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T12:10:15Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-17T17:26:51Z' - torchscript_onnx_qnn: - inference_time: 699.0 - throughput: 1430.615164520744 + inference_time: 716.0 + throughput: 1396.6480446927374 estimated_peak_memory_range: - min: 540672 - max: 540672 + min: 544768 + max: 544768 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 126 + layers_on_npu: 145 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 126 - job_id: jmg9v4785 + total_layers: 145 + job_id: jg9l04eqg job_status: Passed torchscript_onnx: - inference_time: 703.0 - throughput: 1422.475106685633 + inference_time: 720.0 + throughput: 1388.888888888889 estimated_peak_memory_range: - min: 11837440 - max: 11837440 + min: 10633216 + max: 10633216 primary_compute_unit: NPU precision: int8 layer_info: @@ -436,7 +447,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 134 - job_id: jqpyedklg + job_id: jp2kx7yxp job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -445,4 +456,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:10:26Z' + timestamp: 
'2024-10-17T17:26:49Z' diff --git a/qai_hub_models/models/mobilenet_v3_large_quantized/requirements.txt b/qai_hub_models/models/mobilenet_v3_large_quantized/requirements.txt deleted file mode 100644 index de5b80e8..00000000 --- a/qai_hub_models/models/mobilenet_v3_large_quantized/requirements.txt +++ /dev/null @@ -1 +0,0 @@ -aimet-torch==1.32.1.post1; sys_platform == "linux" diff --git a/qai_hub_models/models/mobilenet_v3_large_quantized/test.py b/qai_hub_models/models/mobilenet_v3_large_quantized/test.py deleted file mode 100644 index 6767deef..00000000 --- a/qai_hub_models/models/mobilenet_v3_large_quantized/test.py +++ /dev/null @@ -1,29 +0,0 @@ -# --------------------------------------------------------------------- -# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. -# SPDX-License-Identifier: BSD-3-Clause -# --------------------------------------------------------------------- -from qai_hub_models.models._shared.imagenet_classifier.test_utils import ( - run_imagenet_classifier_test, -) -from qai_hub_models.models.mobilenet_v3_large_quantized.demo import main as demo_main -from qai_hub_models.models.mobilenet_v3_large_quantized.model import ( - MODEL_ASSET_VERSION, - MODEL_ID, - MobileNetV3LargeQuantizable, -) - - -def test_task(): - run_imagenet_classifier_test( - MobileNetV3LargeQuantizable.from_pretrained(), - MODEL_ID, - asset_version=MODEL_ASSET_VERSION, - diff_tol=0.005, - rtol=0.02, - atol=0.2, - ) - - -def test_demo(): - # Verify demo does not crash - demo_main(is_test=True) diff --git a/qai_hub_models/models/mobilenet_v3_small/README.md b/qai_hub_models/models/mobilenet_v3_small/README.md index acbc1178..fc4a6a5c 100644 --- a/qai_hub_models/models/mobilenet_v3_small/README.md +++ b/qai_hub_models/models/mobilenet_v3_small/README.md @@ -6,7 +6,7 @@ MobileNetV3Small is a machine learning model that can classify images from the ImageNet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of MobileNet-v3-Small found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/mobilenetv3.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/mobilenet_v3_small). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.mobilenet_v3_small.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of MobileNet-v3-Small can be found +* The license for the original implementation of MobileNet-v3-Small can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Searching for MobileNetV3](https://arxiv.org/abs/1905.02244) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/mobilenetv3.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/mobilenet_v3_small/export.py b/qai_hub_models/models/mobilenet_v3_small/export.py index 1eac925c..28542f6f 100644 --- a/qai_hub_models/models/mobilenet_v3_small/export.py +++ b/qai_hub_models/models/mobilenet_v3_small/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.mobilenet_v3_small import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
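To make the new return type concrete, here is a minimal sketch of consuming `ExportResult` in place of the old 3-tuple; the device name is illustrative, and the `List[str]` branch of the signature is simply skipped:

```python
from qai_hub_models.models.common import ExportResult
from qai_hub_models.models.mobilenet_v3_small.export import export_model

# Illustrative device; any AI Hub-supported device string works here.
result = export_model(device="Samsung Galaxy S23")
if isinstance(result, ExportResult):
    # Named attributes replace positional unpacking of the old 3-tuple.
    target_model = result.compile_job.get_target_model()
    if result.profile_job is not None:
        profile_data = result.profile_job.download_profile()
```

Keyword access also makes the reordering documented above (the inference job now listed before the profile job) a non-issue for callers.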
""" model_name = "mobilenet_v3_small" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/mobilenet_v3_small/perf.yaml b/qai_hub_models/models/mobilenet_v3_small/perf.yaml index c8114238..6daac35e 100644 --- a/qai_hub_models/models/mobilenet_v3_small/perf.yaml +++ b/qai_hub_models/models/mobilenet_v3_small/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: MobileNet-v3-Small performance_metrics: - torchscript_onnx_tflite: - inference_time: 817.0 - throughput: 1223.9902080783354 + inference_time: 812.0 + throughput: 1231.527093596059 estimated_peak_memory_range: - min: 28672 - max: 147618024 + min: 12288 + max: 1330904 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 115 - job_id: joprk9oe5 + job_id: j5mnx80wp job_status: Passed torchscript_onnx_qnn: - inference_time: 869.0 - throughput: 1150.7479861910242 + inference_time: 864.0 + throughput: 1157.4074074074074 estimated_peak_memory_range: min: 16384 - max: 34205016 + max: 145318176 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: jvgdwvmr5 + job_id: j56y4w80p job_status: Passed torchscript_onnx: - inference_time: 819.0 - throughput: 1221.001221001221 + inference_time: 813.0 + throughput: 1230.0123001230013 estimated_peak_memory_range: - min: 12288 - max: 7244784 + min: 282624 + max: 13119392 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 128 - job_id: jqpyedy8g + job_id: j57yr9ol5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T23:08:41Z' + timestamp: '2024-10-15T00:05:40Z' - torchscript_onnx_tflite: - inference_time: 550.0 - throughput: 1818.1818181818182 + inference_time: 549.0 + throughput: 1821.4936247723133 estimated_peak_memory_range: - min: 12288 - max: 45436416 + min: 16384 + max: 46787648 primary_compute_unit: NPU 
precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 115 - job_id: jep28j4mp + job_id: jprv3wl9g job_status: Passed torchscript_onnx_qnn: - inference_time: 714.0 - throughput: 1400.5602240896358 + inference_time: 590.0 + throughput: 1694.915254237288 estimated_peak_memory_range: min: 618496 - max: 16021200 + max: 16897440 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: jz57zd8vp + job_id: jp3j06zlg job_status: Passed torchscript_onnx: - inference_time: 594.0 - throughput: 1683.5016835016836 + inference_time: 586.0 + throughput: 1706.4846416382252 estimated_peak_memory_range: min: 0 - max: 48345728 + max: 49373776 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 128 - job_id: j2p0yrx9g + job_id: jp4lr3ev5 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T23:08:42Z' + timestamp: '2024-10-15T00:05:41Z' - torchscript_onnx_tflite: - inference_time: 817.0 - throughput: 1223.9902080783354 + inference_time: 814.0 + throughput: 1228.5012285012285 estimated_peak_memory_range: min: 12288 - max: 1831712 + max: 17196376 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 115 - job_id: jqpyenq4g + job_id: jp2kyer4p job_status: Passed torchscript_onnx_qnn: - inference_time: 840.0 - throughput: 1190.4761904761904 + inference_time: 830.0 + throughput: 1204.8192771084337 estimated_peak_memory_range: min: 638976 - max: 2285872 + max: 1870192 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: j0pxv1z3g + job_id: jpv6k7lj5 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T23:08:43Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:05:32Z' - torchscript_onnx_tflite: - inference_time: 1094.0 - throughput: 914.0767824497258 + inference_time: 814.0 + throughput: 1228.5012285012285 estimated_peak_memory_range: - min: 16384 - max: 47850208 + min: 24576 + max: 18042136 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,37 +200,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 115 - job_id: j2p0ykdeg + job_id: jgkex822g job_status: Passed torchscript_onnx_qnn: - inference_time: 1160.0 - throughput: 862.0689655172414 + inference_time: 838.0 + throughput: 1193.3174224343675 estimated_peak_memory_range: - min: 618496 - max: 17611888 + min: 626688 + max: 1985320 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 128 + layers_on_npu: 126 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 128 - job_id: jep28mzrp + total_layers: 126 + job_id: jgz3dnlk5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T23:08:43Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:05:36Z' - torchscript_onnx_tflite: - inference_time: 819.0 - throughput: 1221.001221001221 + inference_time: 824.0 + throughput: 1213.5922330097087 
estimated_peak_memory_range: - min: 12288 - max: 9970960 + min: 20480 + max: 1643256 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 115 - job_id: j1p8o868g + job_id: jp8qy1exp job_status: Passed torchscript_onnx_qnn: - inference_time: 842.0 - throughput: 1187.648456057007 + inference_time: 839.0 + throughput: 1191.8951132300358 estimated_peak_memory_range: - min: 638976 - max: 1910272 + min: 651264 + max: 1766504 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: jo5mrzldg + job_id: jpedmy715 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T23:08:44Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:05:34Z' - torchscript_onnx_tflite: - inference_time: 815.0 - throughput: 1226.993865030675 + inference_time: 811.0 + throughput: 1233.0456226880394 estimated_peak_memory_range: - min: 12288 - max: 1585672 + min: 32768 + max: 1493512 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 115 - job_id: jogkzdoog + job_id: jp0z06m65 job_status: Passed torchscript_onnx_qnn: - inference_time: 847.0 - throughput: 1180.637544273908 + inference_time: 832.0 + throughput: 1201.923076923077 estimated_peak_memory_range: - min: 634880 - max: 1973856 + min: 667648 + max: 1965000 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: jegn2ewkg + job_id: jgjvnqrxg job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T23:08:45Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:05:33Z' - torchscript_onnx_tflite: - inference_time: 819.0 - throughput: 1221.001221001221 + inference_time: 1098.0 + throughput: 910.7468123861566 estimated_peak_memory_range: min: 12288 - max: 1528968 + max: 48386080 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,52 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 115 - job_id: jn5q8wzm5 + job_id: jpy13mo7p job_status: Passed torchscript_onnx_qnn: - inference_time: 838.0 - throughput: 1193.3174224343675 + inference_time: 1157.0 + throughput: 864.304235090752 estimated_peak_memory_range: - min: 634880 - max: 1874272 + min: 618496 + max: 18132000 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 128 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 128 + job_id: jp14z6o2p + job_status: Passed + reference_device_info: + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:05:38Z' + - torchscript_onnx_tflite: + inference_time: 457.0 + throughput: 2188.183807439825 + estimated_peak_memory_range: + min: 12288 + max: 22173664 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 115 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 115 + job_id: jglvmly85 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 476.0 + throughput: 2100.840336134454 + estimated_peak_memory_range: + min: 0 + max: 11806272 primary_compute_unit: NPU precision: fp16 
layer_info: @@ -331,19 +367,34 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: joprky705 + job_id: jgdx126ep + job_status: Passed + torchscript_onnx: + inference_time: 596.0 + throughput: 1677.8523489932886 + estimated_peak_memory_range: + min: 0 + max: 23416352 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 128 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 128 + job_id: jgn6vk1r5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) - os: '13' - form_factor: Auto + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T23:08:46Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:05:44Z' - torchscript_onnx_qnn: - inference_time: 968.0 - throughput: 1033.0578512396694 + inference_time: 1007.0 + throughput: 993.0486593843099 estimated_peak_memory_range: min: 602112 max: 602112 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: jqp4qw28g + job_id: jgo268lxp job_status: Passed torchscript_onnx: - inference_time: 841.0 - throughput: 1189.0606420927468 + inference_time: 970.0 + throughput: 1030.9278350515465 estimated_peak_memory_range: - min: 6303744 - max: 6303744 + min: 6238208 + max: 6238208 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 128 - job_id: j1p8o7kkg + job_id: jpxkox015 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:09:34Z' + timestamp: '2024-10-15T00:05:42Z' diff --git a/qai_hub_models/models/openai_clip/README.md b/qai_hub_models/models/openai_clip/README.md index 6bab600a..07421d68 100644 --- a/qai_hub_models/models/openai_clip/README.md +++ b/qai_hub_models/models/openai_clip/README.md @@ -6,7 +6,7 @@ Contrastive Language-Image Pre-Training (CLIP) uses a ViT-like transformer to get visual features and a causal language model to get the text features. Both the text and visual features can then be used for a variety of zero-shot learning tasks. This is based on the implementation of OpenAI-Clip found -[here](https://github.com/openai/CLIP/). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/openai_clip). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.openai_clip.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of OpenAI-Clip can be found +* The license for the original implementation of OpenAI-Clip can be found [here](https://github.com/openai/CLIP/blob/main/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Learning Transferable Visual Models From Natural Language Supervision](https://arxiv.org/abs/2103.00020) * [Source Model Implementation](https://github.com/openai/CLIP/) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/openai_clip/export.py b/qai_hub_models/models/openai_clip/export.py index 27dc1f2f..8f52b43e 100644 --- a/qai_hub_models/models/openai_clip/export.py +++ b/qai_hub_models/models/openai_clip/export.py @@ -10,14 +10,15 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Mapping, Optional, Tuple, cast +from typing import Any, Dict, List, Mapping, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.openai_clip import Model from qai_hub_models.utils.args import export_parser, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel, TargetRuntime +from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -45,20 +46,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Mapping[ - str, Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] -] | List[str]: +) -> Mapping[str, ExportResult] | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -83,10 +82,10 @@ def export_model( `model_cls.from_pretrained` Returns: - A Mapping from component_name to a 3-tuple of: + A Mapping from component_name to a struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). 
+ * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "openai_clip" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -118,7 +117,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) components_dict: Dict[str, BaseModel] = {} if "CLIPTextEncoder" in components: @@ -135,7 +134,7 @@ def export_model( component.to("cpu"), make_torch_inputs(input_spec) ) - # 2. Compile the models to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = component.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -151,7 +150,7 @@ def export_model( hub.client.CompileJob, submitted_compile_job ) - # 3. Profile the model assets on real devices + # 3. Profiles the model performance on a real device profile_jobs: Dict[str, hub.client.ProfileJob] = {} if not skip_profiling: for component_name in components: @@ -169,7 +168,7 @@ def export_model( hub.client.ProfileJob, submitted_profile_job ) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_jobs: Dict[str, hub.client.InferenceJob] = {} if not skip_inferencing: for component_name in components: @@ -193,14 +192,14 @@ def export_model( hub.client.InferenceJob, submitted_inference_job ) - # 5. Download the model assets to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) for component_name, compile_job in compile_jobs.items(): target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / component_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: for component_name in components: profile_job = profile_jobs[component_name] @@ -225,10 +224,10 @@ def export_model( ) return { - component_name: ( - compile_jobs[component_name], - profile_jobs.get(component_name, None), - inference_jobs.get(component_name, None), + component_name: ExportResult( + compile_job=compile_jobs[component_name], + inference_job=inference_jobs.get(component_name, None), + profile_job=profile_jobs.get(component_name, None), ) for component_name in components } diff --git a/qai_hub_models/models/openai_clip/perf.yaml b/qai_hub_models/models/openai_clip/perf.yaml index 6f2fde1f..4d2bdcd6 100644 --- a/qai_hub_models/models/openai_clip/perf.yaml +++ b/qai_hub_models/models/openai_clip/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: CLIPTextEncoder performance_metrics: - torchscript_onnx_tflite: - inference_time: 6808.0 - throughput: 146.88601645123384 + inference_time: 5779.0 + throughput: 173.04031839418585 estimated_peak_memory_range: min: 16384 - max: 2516288 + max: 2798120 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 2 total_layers: 660 - job_id: jz5wo9y3p + job_id: jgjvnmlvg job_status: Passed torchscript_onnx_qnn: - inference_time: 5858.0 - throughput: 170.7067258449983 + inference_time: 4774.0 + throughput: 209.46795140343528 estimated_peak_memory_range: - min: 20480 - max: 20442960 + min: 16384 + max: 16300928 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 445 - job_id: jogkzy6wg + job_id: jpy13wdlp job_status: Passed torchscript_onnx: - inference_time: 38965.0 - throughput: 25.664057487488773 + inference_time: 35403.0 + throughput: 28.24619382538203 estimated_peak_memory_range: - min: 98304 - max: 137364600 + min: 81920 + max: 136793360 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 1 total_layers: 508 - job_id: j0pxv1r3g + job_id: jgn6vy9q5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:08:48Z' + timestamp: '2024-10-15T17:26:05Z' - torchscript_onnx_tflite: - inference_time: 4873.0 - throughput: 205.21239482864766 + inference_time: 4079.0 + throughput: 245.15812699190977 estimated_peak_memory_range: - min: 16384 - max: 186171488 + min: 32768 + max: 202933264 
primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 2 total_layers: 660 - job_id: jnp108o85 + job_id: jgo2241dp job_status: Passed torchscript_onnx_qnn: - inference_time: 4160.0 - throughput: 240.3846153846154 + inference_time: 3405.0 + throughput: 293.68575624082234 estimated_peak_memory_range: min: 12288 - max: 56794672 + max: 69456128 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 445 - job_id: j1glnkwjp + job_id: jp8qy97op job_status: Passed torchscript_onnx: - inference_time: 29339.0 - throughput: 34.084324619107676 + inference_time: 26223.0 + throughput: 38.13446211341189 estimated_peak_memory_range: min: 61440 - max: 471243408 + max: 560002208 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 1 total_layers: 508 - job_id: jegn2eqkg + job_id: jp4ll9ml5 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:08:49Z' + timestamp: '2024-10-16T08:34:45Z' - torchscript_onnx_tflite: - inference_time: 6725.0 - throughput: 148.6988847583643 + inference_time: 5717.0 + throughput: 174.91691446562882 estimated_peak_memory_range: - min: 24576 - max: 2526952 + min: 16384 + max: 1691600 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 2 total_layers: 660 - job_id: jz57zdovp + job_id: jg9ln14wg job_status: Passed torchscript_onnx_qnn: - inference_time: 5791.0 - throughput: 172.68174753928508 + inference_time: 4856.0 + throughput: 205.9308072487644 estimated_peak_memory_range: - min: 28672 - max: 1623792 + min: 24576 + max: 1295896 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 445 - job_id: j1pv3wmk5 + job_id: j56y4j3yp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:08:39Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T17:25:53Z' - torchscript_onnx_tflite: - inference_time: 7421.0 - throughput: 134.75272874275703 + inference_time: 5711.0 + throughput: 175.1006828926633 estimated_peak_memory_range: min: 16384 - max: 163313920 + max: 2156408 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 2 total_layers: 660 - job_id: j0pxv103g + job_id: jp4lrow15 job_status: Passed torchscript_onnx_qnn: - inference_time: 6287.0 - throughput: 159.0583744234134 + inference_time: 4794.0 + throughput: 208.59407592824363 estimated_peak_memory_range: - min: 12288 - max: 62928032 + min: 49152 + max: 1298000 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 445 - job_id: jz57zdnvp + job_id: j5we6vdm5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:08:46Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T17:25:58Z' - torchscript_onnx_tflite: - inference_time: 6765.0 - throughput: 147.81966001478196 + inference_time: 5652.0 + throughput: 
176.92852087756546 estimated_peak_memory_range: - min: 28672 - max: 2280024 + min: 16384 + max: 2561760 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 2 total_layers: 660 - job_id: jegn2e1kg + job_id: jgdx19vzp job_status: Passed torchscript_onnx_qnn: - inference_time: 5773.0 - throughput: 173.22016282695307 + inference_time: 4897.0 + throughput: 204.20665713702266 estimated_peak_memory_range: - min: 61440 - max: 1238176 + min: 94208 + max: 1440560 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 445 - job_id: jlpe9vxog + job_id: jgjvnm0eg job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:08:41Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T17:25:56Z' - torchscript_onnx_tflite: - inference_time: 6740.0 - throughput: 148.3679525222552 + inference_time: 5683.0 + throughput: 175.96339961288052 estimated_peak_memory_range: - min: 49152 - max: 2948808 + min: 20480 + max: 301324312 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 2 total_layers: 660 - job_id: jep28morp + job_id: jg9ln148g job_status: Passed torchscript_onnx_qnn: - inference_time: 5787.0 - throughput: 172.80110592707794 + inference_time: 4903.0 + throughput: 203.95676116663267 estimated_peak_memory_range: - min: 81920 - max: 1325240 + min: 24576 + max: 1123192 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 445 - job_id: jz5wo9z3p + job_id: jgo2601kp job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:08:43Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T17:25:54Z' - torchscript_onnx_tflite: - inference_time: 6704.0 - throughput: 149.16467780429593 + inference_time: 6593.0 + throughput: 151.67602002123465 estimated_peak_memory_range: - min: 24576 - max: 2242120 + min: 16384 + max: 176541344 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 2 total_layers: 660 - job_id: j2p0yro9g + job_id: jgdx19vrp job_status: Passed torchscript_onnx_qnn: - inference_time: 5821.0 - throughput: 171.79178835251676 + inference_time: 5491.0 + throughput: 182.11619012930248 estimated_peak_memory_range: - min: 73728 - max: 1312600 + min: 12288 + max: 68984736 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,22 +329,75 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 445 - job_id: jnp108185 + job_id: j57yrwj95 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T17:26:02Z' + - torchscript_onnx_tflite: + inference_time: 3963.0 + throughput: 252.33409033560434 + estimated_peak_memory_range: + min: 12288 + max: 114708448 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 658 + layers_on_gpu: 0 + layers_on_cpu: 2 + total_layers: 660 + job_id: jprv3qy7g + job_status: Passed + torchscript_onnx_qnn: + inference_time: 3266.0 + throughput: 306.1849357011635 + 
estimated_peak_memory_range: + min: 8192 + max: 68595760 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 445 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 445 + job_id: jgo224odp + job_status: Passed + torchscript_onnx: + inference_time: 23780.0 + throughput: 42.05214465937763 + estimated_peak_memory_range: + min: 102400 + max: 335012416 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 507 + layers_on_gpu: 0 + layers_on_cpu: 1 + total_layers: 508 + job_id: jglvmzem5 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:08:45Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-16T08:37:16Z' - torchscript_onnx_qnn: - inference_time: 6204.0 - throughput: 161.18633139909736 + inference_time: 5196.0 + throughput: 192.4557351809084 estimated_peak_memory_range: - min: 126976 - max: 126976 + min: 266240 + max: 266240 primary_compute_unit: NPU precision: fp16 layer_info: @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 445 - job_id: j1p3kmo35 + job_id: j5q6qk2op job_status: Passed torchscript_onnx: - inference_time: 39622.0 - throughput: 25.23850386149109 + inference_time: 38329.0 + throughput: 26.089905815440005 estimated_peak_memory_range: - min: 132591616 - max: 132591616 + min: 132571136 + max: 132571136 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 1 total_layers: 508 - job_id: jep28mdrp + job_id: jp0z0q1n5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,15 +429,15 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:08:51Z' + timestamp: '2024-10-15T17:26:09Z' - name: CLIPImageEncoder performance_metrics: - torchscript_onnx_tflite: - inference_time: 41610.0 - throughput: 24.03268445085316 + inference_time: 38384.0 + throughput: 26.052521884118384 estimated_peak_memory_range: min: 69632 - max: 4087128 + max: 2451280 primary_compute_unit: NPU precision: fp16 layer_info: @@ -394,14 +445,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 659 - job_id: jmg9v4ow5 + job_id: jpedm1vo5 job_status: Passed torchscript_onnx_qnn: - inference_time: 32966.0 - throughput: 30.334283807559302 + inference_time: 27206.0 + throughput: 36.75659780930677 estimated_peak_memory_range: - min: 65536 - max: 61336256 + min: 61440 + max: 58920288 primary_compute_unit: NPU precision: fp16 layer_info: @@ -409,7 +460,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 438 - job_id: jn5q824n5 + job_id: jp0z0qrn5 + job_status: Passed + torchscript_onnx: + inference_time: 174036.0 + throughput: 5.745937622101175 + estimated_peak_memory_range: + min: 126976 + max: 203668048 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 501 + layers_on_gpu: 0 + layers_on_cpu: 1 + total_layers: 502 + job_id: jprv3q47g job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -418,13 +484,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:08:34Z' + timestamp: '2024-10-15T17:26:06Z' - torchscript_onnx_tflite: - inference_time: 33464.0 - throughput: 29.882859191967487 + inference_time: 33247.0 + throughput: 30.077901765572832 estimated_peak_memory_range: - min: 53248 - max: 601470704 + min: 32768 + max: 698029056 primary_compute_unit: NPU 
precision: fp16 layer_info: @@ -432,14 +498,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 659 - job_id: jvgdwv6r5 + job_id: jpv6691m5 job_status: Passed torchscript_onnx_qnn: - inference_time: 25442.0 - throughput: 39.30508607813851 + inference_time: 24164.0 + throughput: 41.38387684158252 estimated_peak_memory_range: - min: 638976 - max: 121004016 + min: 634880 + max: 178605712 primary_compute_unit: NPU precision: fp16 layer_info: @@ -447,7 +513,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 438 - job_id: jw5661o65 + job_id: jgkexnyng + job_status: Passed + torchscript_onnx: + inference_time: 118868.0 + throughput: 8.412693071305986 + estimated_peak_memory_range: + min: 843776 + max: 3744565520 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 501 + layers_on_gpu: 0 + layers_on_cpu: 1 + total_layers: 502 + job_id: jpxkkd395 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -456,13 +537,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:08:36Z' + timestamp: '2024-10-16T08:35:39Z' - torchscript_onnx_tflite: - inference_time: 41781.0 - throughput: 23.934324214355808 + inference_time: 37343.0 + throughput: 26.77878049433629 estimated_peak_memory_range: - min: 294912 - max: 2413024 + min: 16384 + max: 2185680 primary_compute_unit: NPU precision: fp16 layer_info: @@ -470,14 +551,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 659 - job_id: jqp4qwe8g + job_id: jp14zl88p job_status: Passed torchscript_onnx_qnn: - inference_time: 29215.0 - throughput: 34.22899195618689 + inference_time: 22015.0 + throughput: 45.423574835339544 estimated_peak_memory_range: - min: 696320 - max: 1809600 + min: 663552 + max: 1787712 primary_compute_unit: NPU precision: fp16 layer_info: @@ -485,7 +566,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 438 - job_id: j7gjxlyvp + job_id: jp3j034ng job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -493,14 +574,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:08:40Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T17:25:53Z' - torchscript_onnx_tflite: - inference_time: 41460.0 - throughput: 24.1196333815726 + inference_time: 37324.0 + throughput: 26.79241238881149 estimated_peak_memory_range: - min: 69632 - max: 506845568 + min: 90112 + max: 2489264 primary_compute_unit: NPU precision: fp16 layer_info: @@ -508,14 +589,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 659 - job_id: jo5mrz9dg + job_id: jpxkoj1l5 job_status: Passed torchscript_onnx_qnn: - inference_time: 35970.0 - throughput: 27.80094523213789 + inference_time: 22477.0 + throughput: 44.489923032433154 estimated_peak_memory_range: - min: 0 - max: 125174752 + min: 704512 + max: 1957024 primary_compute_unit: NPU precision: fp16 layer_info: @@ -523,22 +604,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 438 - job_id: jqp4qw48g + job_id: jg9ln138g job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:08:47Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T17:25:58Z' - torchscript_onnx_tflite: - inference_time: 42134.0 - throughput: 23.73380168035316 + inference_time: 36580.0 + throughput: 27.33734281027884 estimated_peak_memory_range: - min: 86016 - max: 3439456 + 
min: 90112 + max: 2280272 primary_compute_unit: NPU precision: fp16 layer_info: @@ -546,14 +627,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 659 - job_id: joprkyx05 + job_id: j57yrwd95 job_status: Passed torchscript_onnx_qnn: - inference_time: 29412.0 - throughput: 33.999728002175985 + inference_time: 22644.0 + throughput: 44.16180886769122 estimated_peak_memory_range: - min: 716800 - max: 1974088 + min: 307200 + max: 1567448 primary_compute_unit: NPU precision: fp16 layer_info: @@ -561,22 +642,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 438 - job_id: jygze7yog + job_id: jpedm1rv5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:08:41Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T17:25:57Z' - torchscript_onnx_tflite: - inference_time: 41566.0 - throughput: 24.058124428619546 + inference_time: 36958.0 + throughput: 27.057741219762974 estimated_peak_memory_range: - min: 94208 - max: 3002560 + min: 57344 + max: 3455976 primary_compute_unit: NPU precision: fp16 layer_info: @@ -584,14 +665,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 659 - job_id: jqpyed88g + job_id: jp14zl87p job_status: Passed torchscript_onnx_qnn: - inference_time: 29440.0 - throughput: 33.96739130434783 + inference_time: 22477.0 + throughput: 44.489923032433154 estimated_peak_memory_range: - min: 696320 - max: 2408968 + min: 753664 + max: 2113712 primary_compute_unit: NPU precision: fp16 layer_info: @@ -599,22 +680,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 438 - job_id: jmg9v42w5 + job_id: jpv6ko1r5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:08:43Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T17:25:55Z' - torchscript_onnx_tflite: - inference_time: 41836.0 - throughput: 23.902858781910318 + inference_time: 37123.0 + throughput: 26.937478113299033 estimated_peak_memory_range: - min: 57344 - max: 2921344 + min: 98304 + max: 575585456 primary_compute_unit: NPU precision: fp16 layer_info: @@ -622,14 +703,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 659 - job_id: j1p8o7jkg + job_id: j5we6v9m5 job_status: Passed torchscript_onnx_qnn: - inference_time: 29776.0 - throughput: 33.584094572810315 + inference_time: 30382.0 + throughput: 32.91422552827332 estimated_peak_memory_range: - min: 679936 - max: 1939688 + min: 0 + max: 178714672 primary_compute_unit: NPU precision: fp16 layer_info: @@ -637,19 +718,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 438 - job_id: jvgdwv4r5 + job_id: jp4lrox15 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T17:26:02Z' + - torchscript_onnx_tflite: + inference_time: 25495.0 + throughput: 39.223377132771134 + estimated_peak_memory_range: + min: 0 + max: 482536432 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 659 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 659 + job_id: jp2ky6mqp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 17137.0 + throughput: 58.35327070082278 + estimated_peak_memory_range: + min: 614400 + max: 180468880 + 
primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 438 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 438 + job_id: j5wee18j5 + job_status: Passed + torchscript_onnx: + inference_time: 17137.0 + throughput: 58.35327070082278 + estimated_peak_memory_range: + min: 614400 + max: 180468880 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 438 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 438 + job_id: j5wee18j5 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:08:45Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-16T08:38:40Z' - torchscript_onnx_qnn: - inference_time: 28828.0 - throughput: 34.688497294297214 + inference_time: 22135.0 + throughput: 45.177320984865595 estimated_peak_memory_range: min: 602112 max: 602112 @@ -660,11 +794,11 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 438 - job_id: jwgoyvdq5 + job_id: jglvmz0m5 job_status: Passed torchscript_onnx: - inference_time: 189660.0 - throughput: 5.272593061267531 + inference_time: 162155.0 + throughput: 6.166939039807591 estimated_peak_memory_range: min: 196714496 max: 196714496 primary_compute_unit: NPU precision: fp16 layer_info: @@ -675,7 +809,7 @@ layers_on_gpu: 0 layers_on_cpu: 1 total_layers: 502 - job_id: jqpyed28g + job_id: jgjvvw08g job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -684,4 +818,4 @@ os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:08:51Z' + timestamp: '2024-10-16T08:30:35Z' diff --git a/qai_hub_models/models/openpose/README.md b/qai_hub_models/models/openpose/README.md index 1585e5e7..601af4de 100644 --- a/qai_hub_models/models/openpose/README.md +++ b/qai_hub_models/models/openpose/README.md @@ -6,7 +6,7 @@ OpenPose is a machine learning model that estimates body and hand pose in an image and returns location and confidence for each of 19 joints. This is based on the implementation of OpenPose found -[here](https://github.com/CMU-Perceptual-Computing-Lab/openpose). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/openpose). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.openpose.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of OpenPose can be found +* The license for the original implementation of OpenPose can be found [here](https://cmu.flintbox.com/technologies/b820c21d-8443-4aa2-a49f-8919d93a8740).
-- The license for the compiled assets for on-device deployment can be found [here](https://cmu.flintbox.com/technologies/b820c21d-8443-4aa2-a49f-8919d93a8740) +* The license for the compiled assets for on-device deployment can be found [here](https://cmu.flintbox.com/technologies/b820c21d-8443-4aa2-a49f-8919d93a8740) + ## References * [OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields](https://arxiv.org/abs/1812.08008) * [Source Model Implementation](https://github.com/CMU-Perceptual-Computing-Lab/openpose) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/openpose/export.py b/qai_hub_models/models/openpose/export.py index 5a00f4eb..8d423a7e 100644 --- a/qai_hub_models/models/openpose/export.py +++ b/qai_hub_models/models/openpose/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.openpose import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
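The `ExportResult` imported from `qai_hub_models.models.common` in these export scripts is, per the `return ExportResult(...)` statements, a small container of Hub jobs. Its exact definition is not part of this diff, so the following is only a hedged sketch of an equivalent shape; modelling it as a dataclass with `None` defaults is an assumption:

```python
from dataclasses import dataclass
from typing import Optional

import qai_hub as hub


@dataclass
class ExportResult:
    # Field names come from the export_model() return statements in this diff.
    compile_job: hub.client.CompileJob
    inference_job: Optional[hub.client.InferenceJob] = None
    profile_job: Optional[hub.client.ProfileJob] = None
```

For component-based models such as openai_clip above, `export_model` returns a `Mapping[str, ExportResult]` keyed by component name rather than a single struct.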
""" model_name = "openpose" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/openpose/perf.yaml b/qai_hub_models/models/openpose/perf.yaml index 2dc6b971..5895d32b 100644 --- a/qai_hub_models/models/openpose/perf.yaml +++ b/qai_hub_models/models/openpose/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: OpenPose performance_metrics: - torchscript_onnx_tflite: - inference_time: 11699.0 - throughput: 85.47739123001966 + inference_time: 11959.0 + throughput: 83.61903169161302 estimated_peak_memory_range: - min: 208896 - max: 2089080 + min: 225280 + max: 1795280 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 103 - job_id: jnp108n85 + job_id: jgo268odp job_status: Passed torchscript_onnx_qnn: - inference_time: 11933.0 - throughput: 83.80122349786306 + inference_time: 11864.0 + throughput: 84.28860418071477 estimated_peak_memory_range: - min: 626688 - max: 229315488 + min: 655360 + max: 215276248 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 186 - job_id: joprkyl05 + job_id: j57yr97r5 job_status: Passed torchscript_onnx: - inference_time: 12254.0 - throughput: 81.60600620205648 + inference_time: 12080.0 + throughput: 82.78145695364239 estimated_peak_memory_range: - min: 1105920 - max: 3296864 + min: 49152 + max: 119390272 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 189 - job_id: jw5661865 + job_id: jgkex89og job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:07:31Z' + timestamp: '2024-10-15T00:03:15Z' - torchscript_onnx_tflite: - inference_time: 11402.0 - throughput: 87.70391159445711 + inference_time: 11393.0 + throughput: 87.77319406653208 estimated_peak_memory_range: min: 212992 - max: 40327632 + max: 42955552 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 
+109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 103 - job_id: jvgdwvdr5 + job_id: jpv6k7em5 job_status: Passed torchscript_onnx_qnn: - inference_time: 11476.0 - throughput: 87.1383757406762 + inference_time: 11479.0 + throughput: 87.11560240439063 estimated_peak_memory_range: - min: 634880 - max: 18893744 + min: 618496 + max: 18821424 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 186 - job_id: jep28mrrp + job_id: jp4lr39l5 job_status: Passed torchscript_onnx: - inference_time: 11671.0 - throughput: 85.68246080027419 + inference_time: 11482.0 + throughput: 87.09284096847239 estimated_peak_memory_range: - min: 1126400 - max: 45462048 + min: 0 + max: 47661712 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 189 - job_id: j1p3kmz35 + job_id: j5q6qvmmp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:07:32Z' + timestamp: '2024-10-15T00:03:16Z' - torchscript_onnx_tflite: - inference_time: 11652.0 - throughput: 85.82217645039478 + inference_time: 11740.0 + throughput: 85.17887563884156 estimated_peak_memory_range: - min: 204800 - max: 2085920 + min: 208896 + max: 2874496 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 103 - job_id: jz57zdevp + job_id: jgjvnqo8g job_status: Passed torchscript_onnx_qnn: - inference_time: 11606.0 - throughput: 86.16232982939859 + inference_time: 12078.0 + throughput: 82.79516476237788 estimated_peak_memory_range: - min: 638976 - max: 2658336 + min: 643072 + max: 1744056 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 186 - job_id: j2p0yrm9g + job_id: j5mnx8dqp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:07:26Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T00:03:07Z' - torchscript_onnx_tflite: - inference_time: 23631.0 - throughput: 42.31729507849858 + inference_time: 11737.0 + throughput: 85.2006475249212 estimated_peak_memory_range: - min: 192512 - max: 41372880 + min: 225280 + max: 2119696 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 103 - job_id: jqp4qwy8g + job_id: jg9lndkvg job_status: Passed torchscript_onnx_qnn: - inference_time: 23709.0 - throughput: 42.178075836180355 + inference_time: 12134.0 + throughput: 82.41305422778969 estimated_peak_memory_range: - min: 0 - max: 17770416 + min: 700416 + max: 2015240 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 186 - job_id: j1glnkyjp + job_id: jp2kyevmp job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:07:30Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T00:03:10Z' - torchscript_onnx_tflite: - inference_time: 11728.0 - throughput: 85.26603001364256 + inference_time: 11702.0 + throughput: 85.45547769612033 
estimated_peak_memory_range: - min: 208896 - max: 583859432 + min: 184320 + max: 1958856 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 103 - job_id: j0pxv1l3g + job_id: j5we648j5 job_status: Passed torchscript_onnx_qnn: - inference_time: 11676.0 - throughput: 85.64576909900651 + inference_time: 12105.0 + throughput: 82.61049153242462 estimated_peak_memory_range: - min: 638976 - max: 1857784 + min: 688128 + max: 1905592 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 186 - job_id: j1p8o7ekg + job_id: jprv3wneg job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:07:27Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T00:03:09Z' - torchscript_onnx_tflite: - inference_time: 11793.0 - throughput: 84.79606546256254 + inference_time: 11688.0 + throughput: 85.55783709787816 estimated_peak_memory_range: - min: 12288 - max: 4062312 + min: 233472 + max: 2213904 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 103 - job_id: jo5mrz0dg + job_id: jgz3dn865 job_status: Passed torchscript_onnx_qnn: - inference_time: 11597.0 - throughput: 86.229197206174 + inference_time: 12118.0 + throughput: 82.5218682950982 estimated_peak_memory_range: - min: 667648 - max: 1990376 + min: 675840 + max: 2011832 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 186 - job_id: jogkzy2wg + job_id: jgn6vk7m5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:07:28Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T00:03:08Z' - torchscript_onnx_tflite: - inference_time: 11676.0 - throughput: 85.64576909900651 + inference_time: 23527.0 + throughput: 42.504356696561395 estimated_peak_memory_range: - min: 212992 - max: 2224528 + min: 249856 + max: 43490928 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 103 - job_id: jegn2ezkg + job_id: jpedmy805 job_status: Passed torchscript_onnx_qnn: - inference_time: 11727.0 - throughput: 85.27330092947898 + inference_time: 23749.0 + throughput: 42.10703608572992 estimated_peak_memory_range: - min: 679936 - max: 2011376 + min: 634880 + max: 18569184 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 186 - job_id: jn5q82ln5 + job_id: jp0z06ve5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T00:03:13Z' + - torchscript_onnx_tflite: + inference_time: 8656.0 + throughput: 115.5268022181146 + estimated_peak_memory_range: + min: 204800 + max: 24459280 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 103 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 103 + job_id: jgdx128lp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 8742.0 + throughput: 114.39029970258522 + 
estimated_peak_memory_range: + min: 614400 + max: 16413312 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 186 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 186 + job_id: jp8qy148p + job_status: Passed + torchscript_onnx: + inference_time: 7177.0 + throughput: 139.33398355858995 + estimated_peak_memory_range: + min: 1134592 + max: 28735968 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 189 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 189 + job_id: jp3j06wzg + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:07:29Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T00:03:19Z' - torchscript_onnx_qnn: - inference_time: 12322.0 - throughput: 81.15565654926148 + inference_time: 12659.0 + throughput: 78.99518129394107 estimated_peak_memory_range: min: 602112 max: 602112 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 186 - job_id: jqpyedo8g + job_id: jpxkoxd95 job_status: Passed torchscript_onnx: - inference_time: 12346.0 - throughput: 80.99789405475458 + inference_time: 12628.0 + throughput: 79.18910357934749 estimated_peak_memory_range: - min: 106577920 - max: 106577920 + min: 106684416 + max: 106684416 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 189 - job_id: jwgoyvlq5 + job_id: jglvml1l5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:07:33Z' + timestamp: '2024-10-15T00:03:17Z' diff --git a/qai_hub_models/models/plamo_1b_quantized/README.md b/qai_hub_models/models/plamo_1b_quantized/README.md new file mode 100644 index 00000000..fe3322ea --- /dev/null +++ b/qai_hub_models/models/plamo_1b_quantized/README.md @@ -0,0 +1,50 @@ +[![Qualcomm® AI Hub Models](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/quic-logo.jpg)](../../README.md) + + +# [PLaMo-1B: State-of-the-art large language model useful on a variety of language understanding and generation tasks](https://aihub.qualcomm.com/models/plamo_1b_quantized) + +PLaMo-1B is the first small language model (SLM) in the PLaMo™ Lite series from Preferred Networks (PFN), designed to power AI applications for edge devices including mobile, automotive, and robots across various industrial sectors. This model builds on the advancements of PLaMo-100B, a 100-billion parameter large language model (LLM) developed from the ground up by PFN’s subsidiary Preferred Elements (PFE). Leveraging high-quality Japanese and English text data generated by PLaMo-100B, PLaMo-1B has been pre-trained on a total of 4 trillion tokens. As a result, it delivers exceptional performance in Japanese benchmarks, outperforming other SLMs with similar parameter sizes. In evaluations such as Jaster 0-shot and 4-shot, PLaMo-1B has demonstrated performance on par with larger LLMs, making it a highly efficient solution for edge-based AI tasks. + +This is based on the implementation of PLaMo-1B found +[here]({source_repo}). This repository contains scripts for optimized on-device +export suitable to run on Qualcomm® devices. More details on model performance +across various devices can be found [here](https://aihub.qualcomm.com/models/plamo_1b_quantized).
+ +[Sign up](https://myaccount.qualcomm.com/signup) to start using Qualcomm AI Hub and run these models on a hosted Qualcomm® device. + +## Deploying PLaMo-1B on-device + +Please follow the [LLM on-device deployment](https://github.com/quic/ai-hub-apps/tree/main/tutorials/llm_on_genie) tutorial. + + + + + + +## Community +* Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. +* For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). + + +## Usage and Limitations + +This model may not be used for or in connection with any of the following applications: + +- Accessing essential private and public services and benefits; +- Administration of justice and democratic processes; +- Assessing or recognizing the emotional state of a person; +- Biometric and biometrics-based systems, including categorization of persons based on sensitive characteristics; +- Education and vocational training; +- Employment and workers management; +- Exploitation of the vulnerabilities of persons resulting in harmful behavior; +- General purpose social scoring; +- Law enforcement; +- Management and operation of critical infrastructure; +- Migration, asylum and border control management; +- Predictive policing; +- Real-time remote biometric identification in public spaces; +- Recommender systems of social media platforms; +- Scraping of facial images (from the internet or otherwise); and/or +- Subliminal manipulation + + diff --git a/qai_hub_models/models/plamo_1b_quantized/info.yaml b/qai_hub_models/models/plamo_1b_quantized/info.yaml new file mode 100644 index 00000000..669632cc --- /dev/null +++ b/qai_hub_models/models/plamo_1b_quantized/info.yaml @@ -0,0 +1,39 @@ +name: PLaMo-1B +id: plamo_1b_quantized +status: public +headline: State-of-the-art large language model useful on a variety of language + understanding and generation tasks. +domain: Generative AI +description: PLaMo-1B is the first small language model (SLM) in the PLaMo™ Lite series from Preferred Networks (PFN), designed to power AI applications for edge devices including mobile, automotive, and robots across various industrial sectors. This model builds on the advancements of PLaMo-100B, a 100-billion parameter large language model (LLM) developed from the ground up by PFN’s subsidiary Preferred Elements (PFE). Leveraging high-quality Japanese and English text data generated by PLaMo-100B, PLaMo-1B has been pre-trained on a total of 4 trillion tokens. As a result, it delivers exceptional performance in Japanese benchmarks, outperforming other SLMs with similar parameter sizes. In evaluations such as Jaster 0-shot and 4-shot, PLaMo-1B has demonstrated performance on par with larger LLMs, making it a highly efficient solution for edge-based AI tasks. +model_maker_id: preferred-networks +use_case: Text Generation +tags: + - llm + - generative-ai + - quantized +technical_details: + Input sequence length for Prompt Processor: 128 + Context length: 4096 + Number of parameters: 1B + Precision: w4a16 + w8a16 (few layers) + Use: Initiate conversation with prompt-processor and then token generator for subsequent iterations. + Minimum QNN SDK version required: 2.27.7 + Supported languages: Japanese and English. + TTFT: Time To First Token is the time it takes to generate the first response token. This is expressed as a range because it varies based on the length of the prompt. 
The lower bound is for a short prompt (up to 128 tokens, i.e., one iteration of the prompt processor) and the upper bound is for a prompt using the full context length (4096 tokens). + Response Rate: Rate of response generation after the first response token. +applicable_scenarios: + - Dialogue + - Content Generation + - Customer Support +related_models: [] +form_factors: + - Phone + - Tablet +has_static_banner: true +has_animated_banner: false +license_type: 'other' +dataset: [] +model_type_llm: true +restrict_model_sharing: true +llm_details: + call_to_action: 'contact_for_purchase' diff --git a/qai_hub_models/models/plamo_1b_quantized/perf.yaml b/qai_hub_models/models/plamo_1b_quantized/perf.yaml new file mode 100644 index 00000000..10f9dd57 --- /dev/null +++ b/qai_hub_models/models/plamo_1b_quantized/perf.yaml @@ -0,0 +1,25 @@ +aggregated: + supported_oses: + - Android + supported_devices: + - Snapdragon 8 Elite QRD + supported_chipsets: + - Snapdragon® 8 Elite +models: + name: 'PLaMo-1B' + performance_metrics: + - torchscript_onnx_qnn: + llm_metrics: + time_to_first_token_range: + min: 31448 + max: 1006336 + tokens_per_second: 68.21 + evaluation_metrics: null + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-16T00:32:42.210701Z' diff --git a/qai_hub_models/models/posenet_mobilenet/README.md b/qai_hub_models/models/posenet_mobilenet/README.md index 90a83c3b..7d2a086d 100644 --- a/qai_hub_models/models/posenet_mobilenet/README.md +++ b/qai_hub_models/models/posenet_mobilenet/README.md @@ -6,7 +6,7 @@ Posenet performs pose estimation on human images. This is based on the implementation of Posenet-Mobilenet found -[here](https://github.com/rwightman/posenet-pytorch). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/posenet_mobilenet). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.posenet_mobilenet.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Posenet-Mobilenet can be found +* The license for the original implementation of Posenet-Mobilenet can be found [here](https://github.com/rwightman/posenet-pytorch/blob/master/LICENSE.txt). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [PersonLab: Person Pose Estimation and Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model](https://arxiv.org/abs/1803.08225) * [Source Model Implementation](https://github.com/rwightman/posenet-pytorch) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
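Every regenerated `export.py` in this patch (openpose above, posenet_mobilenet below, and the rest) makes the same breaking change: `export_model` now returns an `ExportResult` struct instead of a positional 3-tuple. A minimal caller sketch of the new convention follows; the device name, the chosen kwargs, and the `job_id` attribute on Hub job objects are illustrative assumptions, not part of the patch.

```python
# Sketch only (assumed usage): consuming the new ExportResult return type.
from qai_hub_models.models.posenet_mobilenet.export import export_model

result = export_model(
    device="Samsung Galaxy S23",  # illustrative device name
    skip_inferencing=True,        # step 4 of the recipe skipped
    skip_downloading=True,        # step 5 of the recipe skipped
)

if isinstance(result, list):
    # The List[str] branch of the new signature.
    print("\n".join(result))
else:
    # Old callers unpacked positionally:
    #   compile_job, profile_job, inference_job = export_model(...)
    # That no longer works; fields are now accessed by name, which is also
    # why the reordered Returns docstring (inference before profile) is harmless.
    print("compile job:", result.compile_job.job_id)
    if result.profile_job is not None:
        print("profile job:", result.profile_job.job_id)
```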
diff --git a/qai_hub_models/models/posenet_mobilenet/export.py b/qai_hub_models/models/posenet_mobilenet/export.py index ab0cd994..f6340c5e 100644 --- a/qai_hub_models/models/posenet_mobilenet/export.py +++ b/qai_hub_models/models/posenet_mobilenet/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.posenet_mobilenet import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "posenet_mobilenet" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. 
Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/posenet_mobilenet/perf.yaml b/qai_hub_models/models/posenet_mobilenet/perf.yaml index 44c6b725..8280a33b 100644 --- a/qai_hub_models/models/posenet_mobilenet/perf.yaml +++ b/qai_hub_models/models/posenet_mobilenet/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: Posenet-Mobilenet performance_metrics: - torchscript_onnx_tflite: - inference_time: 1367.0 - throughput: 731.528895391368 + inference_time: 1375.0 + throughput: 727.2727272727273 estimated_peak_memory_range: min: 12288 - max: 7543960 + max: 33771368 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 41 - job_id: jo5mrzmwg + job_id: j5mnx84qp job_status: Passed 
torchscript_onnx_qnn: - inference_time: 1444.0 - throughput: 692.5207756232687 + inference_time: 1442.0 + throughput: 693.4812760055479 estimated_peak_memory_range: - min: 12288 - max: 13179184 + min: 36864 + max: 12659440 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 69 - job_id: jogkzy42g + job_id: jglvml7l5 job_status: Passed torchscript_onnx: - inference_time: 2098.0 - throughput: 476.64442326024783 + inference_time: 1894.0 + throughput: 527.9831045406547 estimated_peak_memory_range: min: 12288 - max: 8496224 + max: 7822264 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 70 - job_id: jlpe9vz1g + job_id: jp14z63lp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:05:00Z' + timestamp: '2024-10-14T23:59:32Z' - torchscript_onnx_tflite: - inference_time: 1095.0 - throughput: 913.2420091324201 + inference_time: 1102.0 + throughput: 907.4410163339383 estimated_peak_memory_range: - min: 12288 - max: 41395376 + min: 16384 + max: 42395216 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 41 - job_id: jegn2enrg + job_id: jgn6vkxm5 job_status: Passed torchscript_onnx_qnn: - inference_time: 1154.0 - throughput: 866.5511265164645 + inference_time: 1157.0 + throughput: 864.304235090752 estimated_peak_memory_range: - min: 32137216 - max: 47341776 + min: 1597440 + max: 19228640 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 69 - job_id: jn5q82y45 + job_id: j56y4wv7p job_status: Passed torchscript_onnx: - inference_time: 1741.0 - throughput: 574.3825387708214 + inference_time: 1493.0 + throughput: 669.7923643670462 estimated_peak_memory_range: - min: 794624 - max: 45730752 + min: 49152 + max: 47532832 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 70 - job_id: jygze7mkg + job_id: jgdx120lp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:05:01Z' + timestamp: '2024-10-14T23:59:33Z' - torchscript_onnx_tflite: - inference_time: 1358.0 - throughput: 736.3770250368188 + inference_time: 1396.0 + throughput: 716.3323782234957 estimated_peak_memory_range: min: 12288 - max: 9840912 + max: 1456816 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 41 - job_id: joprky095 + job_id: jprv3w9eg job_status: Passed torchscript_onnx_qnn: - inference_time: 1383.0 - throughput: 723.0657989877079 + inference_time: 1387.0 + throughput: 720.9805335255949 estimated_peak_memory_range: - min: 1617920 - max: 3022376 + min: 1622016 + max: 2785808 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 69 - job_id: jw5661705 + job_id: jgo268mdp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:04:55Z' + chipset: QCS8550 Proxy + timestamp: 
'2024-10-14T23:59:25Z' - torchscript_onnx_tflite: - inference_time: 2186.0 - throughput: 457.45654162854527 + inference_time: 1369.0 + throughput: 730.4601899196493 estimated_peak_memory_range: min: 12288 - max: 42154592 + max: 2618912 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 41 - job_id: jep28mw4p + job_id: jp8qy188p job_status: Passed torchscript_onnx_qnn: - inference_time: 2266.0 - throughput: 441.306266548985 + inference_time: 1396.0 + throughput: 716.3323782234957 estimated_peak_memory_range: - min: 1597440 - max: 19997008 + min: 1613824 + max: 3051224 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 69 - job_id: j7gjxl7xp + job_id: jpedmy205 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:04:59Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:59:28Z' - torchscript_onnx_tflite: - inference_time: 1367.0 - throughput: 731.528895391368 + inference_time: 1364.0 + throughput: 733.1378299120234 estimated_peak_memory_range: - min: 12288 - max: 1384288 + min: 36864 + max: 9949568 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 41 - job_id: jqpyedx7g + job_id: jp0z06ke5 job_status: Passed torchscript_onnx_qnn: - inference_time: 1388.0 - throughput: 720.4610951008646 + inference_time: 1390.0 + throughput: 719.4244604316547 estimated_peak_memory_range: - min: 1605632 - max: 2986128 + min: 1617920 + max: 3364464 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 69 - job_id: j1p3km9l5 + job_id: jgjvnq18g job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:04:56Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:59:27Z' - torchscript_onnx_tflite: - inference_time: 1366.0 - throughput: 732.0644216691069 + inference_time: 1372.0 + throughput: 728.862973760933 estimated_peak_memory_range: - min: 12288 - max: 8282504 + min: 315392 + max: 1901168 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 41 - job_id: j2p0yrj6g + job_id: jpy13mn4p job_status: Passed torchscript_onnx_qnn: - inference_time: 1374.0 - throughput: 727.802037845706 + inference_time: 1399.0 + throughput: 714.7962830593281 estimated_peak_memory_range: - min: 1654784 - max: 3036896 + min: 1581056 + max: 3173216 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 69 - job_id: jwgoyvrx5 + job_id: jpv6k74m5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:04:57Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:59:26Z' - torchscript_onnx_tflite: - inference_time: 1365.0 - throughput: 732.6007326007326 + inference_time: 2195.0 + throughput: 455.58086560364467 estimated_peak_memory_range: - min: 12288 - max: 6272568 + min: 16384 + max: 42900576 
primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 41 - job_id: j1p8o7xxg + job_id: jp2kyejmp job_status: Passed torchscript_onnx_qnn: - inference_time: 1389.0 - throughput: 719.9424046076314 + inference_time: 2293.0 + throughput: 436.1098996947231 estimated_peak_memory_range: - min: 1634304 - max: 2932016 + min: 1597440 + max: 22069984 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 69 - job_id: j1pv3wdj5 + job_id: j5we64xj5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:59:30Z' + - torchscript_onnx_tflite: + inference_time: 963.0 + throughput: 1038.4215991692627 + estimated_peak_memory_range: + min: 12288 + max: 22594784 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 41 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 41 + job_id: j5q6qvwmp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 1077.0 + throughput: 928.5051067780872 + estimated_peak_memory_range: + min: 1593344 + max: 15697760 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 69 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 69 + job_id: jg9lnd8vg + job_status: Passed + torchscript_onnx: + inference_time: 1076.0 + throughput: 929.368029739777 + estimated_peak_memory_range: + min: 0 + max: 25205696 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 70 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 70 + job_id: jpxkox395 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:04:58Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:59:36Z' - torchscript_onnx_qnn: - inference_time: 1569.0 - throughput: 637.3486297004462 + inference_time: 1556.0 + throughput: 642.6735218508998 estimated_peak_memory_range: min: 1589248 max: 1589248 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 69 - job_id: j1glnkx8p + job_id: jp3j068zg job_status: Passed torchscript_onnx: - inference_time: 2163.0 - throughput: 462.32085067036525 + inference_time: 2147.0 + throughput: 465.76618537494176 estimated_peak_memory_range: - min: 8146944 - max: 8146944 + min: 7008256 + max: 7008256 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 70 - job_id: jz5wo9l6p + job_id: j57yr9kr5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:05:02Z' + timestamp: '2024-10-14T23:59:34Z' diff --git a/qai_hub_models/models/posenet_mobilenet_quantized/README.md b/qai_hub_models/models/posenet_mobilenet_quantized/README.md index f039d3c5..a7622691 100644 --- a/qai_hub_models/models/posenet_mobilenet_quantized/README.md +++ b/qai_hub_models/models/posenet_mobilenet_quantized/README.md @@ -6,7 +6,7 @@ Posenet performs pose estimation on human images. This is based on the implementation of Posenet-Mobilenet-Quantized found -[here](https://github.com/rwightman/posenet-pytorch). 
This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/posenet_mobilenet_quantized). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.posenet_mobilenet_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Posenet-Mobilenet-Quantized can be found +* The license for the original implementation of Posenet-Mobilenet-Quantized can be found [here](https://github.com/rwightman/posenet-pytorch/blob/master/LICENSE.txt). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [PersonLab: Person Pose Estimation and Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model](https://arxiv.org/abs/1803.08225) * [Source Model Implementation](https://github.com/rwightman/posenet-pytorch) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/posenet_mobilenet_quantized/export.py b/qai_hub_models/models/posenet_mobilenet_quantized/export.py index c0dd02b6..b3413484 100644 --- a/qai_hub_models/models/posenet_mobilenet_quantized/export.py +++ b/qai_hub_models/models/posenet_mobilenet_quantized/export.py @@ -10,17 +10,17 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.posenet_mobilenet_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.printing import ( print_inference_metrics, @@ -45,20 +45,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3.
Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -80,10 +78,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "posenet_mobilenet_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,7 +107,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( target_runtime, output_path, input_spec ) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -135,7 +133,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +148,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +169,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -198,7 +196,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/posenet_mobilenet_quantized/perf.yaml b/qai_hub_models/models/posenet_mobilenet_quantized/perf.yaml index 1b31238e..1d253fd1 100644 --- a/qai_hub_models/models/posenet_mobilenet_quantized/perf.yaml +++ b/qai_hub_models/models/posenet_mobilenet_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,41 +20,38 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: Posenet-Mobilenet-Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 560.0 - throughput: 1785.7142857142858 + inference_time: 558.0 + throughput: 1792.1146953405018 estimated_peak_memory_range: min: 12288 - max: 64550200 + max: 1725168 primary_compute_unit: NPU precision: int8 layer_info: @@ -61,22 +59,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 48 - job_id: j0pxv1x1g + job_id: j57yr9jq5 job_status: Passed torchscript_onnx_qnn: - inference_time: 644.0 - throughput: 1552.7950310559006 + inference_time: 640.0 + throughput: 1562.5 estimated_peak_memory_range: - min: 28672 - max: 12530808 + min: 12288 + max: 11923560 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 42 + layers_on_npu: 69 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 42 - job_id: jn5q82v45 + total_layers: 69 + job_id: j5q6qv97p job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -85,13 +83,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:04:15Z' + timestamp: '2024-10-14T23:58:37Z' - torchscript_onnx_tflite: - inference_time: 393.0 - throughput: 2544.529262086514 + inference_time: 480.0 + throughput: 2083.3333333333335 estimated_peak_memory_range: min: 12288 - max: 48935440 + max: 49692032 primary_compute_unit: NPU precision: int8 layer_info: @@ -99,22 +97,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 48 - job_id: jo5mrz8wg + job_id: jp4lr3xq5 job_status: Passed torchscript_onnx_qnn: - inference_time: 541.0 - throughput: 1848.4288354898335 + inference_time: 445.0 + throughput: 2247.191011235955 
estimated_peak_memory_range: min: 409600 - max: 18725552 + max: 19133904 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 42 + layers_on_npu: 69 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 42 - job_id: j1glnkl8p + total_layers: 69 + job_id: jglvmlee5 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -123,13 +121,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:04:16Z' + timestamp: '2024-10-14T23:58:38Z' - torchscript_onnx_tflite: - inference_time: 563.0 - throughput: 1776.1989342806394 + inference_time: 2182.0 + throughput: 458.29514207149407 estimated_peak_memory_range: - min: 12288 - max: 109879568 + min: 40960 + max: 28749600 primary_compute_unit: NPU precision: int8 layer_info: @@ -137,37 +135,60 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 48 - job_id: jegn2ekrg + job_id: jp0z06e25 job_status: Passed torchscript_onnx_qnn: - inference_time: 563.0 - throughput: 1776.1989342806394 + inference_time: 2902.0 + throughput: 344.5899379738112 estimated_peak_memory_range: - min: 425984 - max: 1708840 + min: 12288 + max: 8312528 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 42 + layers_on_npu: 69 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 42 - job_id: j1p3km6l5 + total_layers: 69 + job_id: jg9lnd9qg job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS6490 Proxy + timestamp: '2024-10-14T23:58:46Z' + - torchscript_onnx_tflite: + inference_time: 12597.0 + throughput: 79.38398031277288 + estimated_peak_memory_range: + min: 450560 + max: 12687728 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 45 + layers_on_gpu: 3 + layers_on_cpu: 0 + total_layers: 48 + job_id: jp8qy1wzp + job_status: Passed + reference_device_info: + name: RB5 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:04:18Z' + chipset: QCS8250 Proxy + timestamp: '2024-10-14T23:58:34Z' - torchscript_onnx_tflite: - inference_time: 726.0 - throughput: 1377.4104683195592 + inference_time: 551.0 + throughput: 1814.8820326678765 estimated_peak_memory_range: min: 12288 - max: 50338816 + max: 1304952 primary_compute_unit: NPU precision: int8 layer_info: @@ -175,37 +196,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 48 - job_id: joprkyw95 + job_id: jpxkox7j5 job_status: Passed torchscript_onnx_qnn: - inference_time: 787.0 - throughput: 1270.6480304955528 + inference_time: 555.0 + throughput: 1801.8018018018017 estimated_peak_memory_range: - min: 409600 - max: 21638960 + min: 421888 + max: 1641552 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 42 + layers_on_npu: 69 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 42 - job_id: jlpe9vy1g + total_layers: 69 + job_id: jp3j06qxg job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:04:23Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:58:40Z' - torchscript_onnx_tflite: - inference_time: 556.0 - throughput: 1798.5611510791366 + inference_time: 560.0 + throughput: 1785.7142857142858 estimated_peak_memory_range: min: 12288 - max: 108564488 + max: 111488632 primary_compute_unit: NPU precision: 
int8 layer_info: @@ -213,37 +234,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 48 - job_id: jep28me4p + job_id: jp2kye3xp job_status: Passed torchscript_onnx_qnn: - inference_time: 549.0 - throughput: 1821.4936247723133 + inference_time: 561.0 + throughput: 1782.5311942959001 estimated_peak_memory_range: - min: 24576 - max: 1709704 + min: 434176 + max: 2253600 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 42 + layers_on_npu: 69 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 42 - job_id: jwgoyv8x5 + total_layers: 69 + job_id: jpedmy475 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:04:20Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:58:43Z' - torchscript_onnx_tflite: - inference_time: 559.0 - throughput: 1788.9087656529516 + inference_time: 557.0 + throughput: 1795.3321364452424 estimated_peak_memory_range: - min: 16384 - max: 111715192 + min: 12288 + max: 17778904 primary_compute_unit: NPU precision: int8 layer_info: @@ -251,22 +272,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 48 - job_id: jqpyedm7g + job_id: jprv3w1vg job_status: Passed torchscript_onnx_qnn: - inference_time: 566.0 - throughput: 1766.7844522968198 + inference_time: 560.0 + throughput: 1785.7142857142858 estimated_peak_memory_range: - min: 425984 - max: 1707728 + min: 417792 + max: 2038960 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 42 + layers_on_npu: 69 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 42 - job_id: j1pv3w7j5 + total_layers: 69 + job_id: jpv6k7z75 job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -274,14 +295,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:04:21Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:58:42Z' - torchscript_onnx_tflite: - inference_time: 561.0 - throughput: 1782.5311942959001 + inference_time: 559.0 + throughput: 1788.9087656529516 estimated_peak_memory_range: min: 12288 - max: 111987240 + max: 3012608 primary_compute_unit: NPU precision: int8 layer_info: @@ -289,37 +310,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 48 - job_id: j2p0yr66g + job_id: jgn6vkrv5 job_status: Passed torchscript_onnx_qnn: - inference_time: 555.0 - throughput: 1801.8018018018017 + inference_time: 561.0 + throughput: 1782.5311942959001 estimated_peak_memory_range: - min: 446464 - max: 1660944 + min: 442368 + max: 1759800 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 42 + layers_on_npu: 69 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 42 - job_id: j7gjxlqxp + total_layers: 69 + job_id: jgo268e4p job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:04:22Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:58:41Z' - torchscript_onnx_tflite: - inference_time: 2174.0 - throughput: 459.9816007359706 + inference_time: 714.0 + throughput: 1400.5602240896358 estimated_peak_memory_range: min: 12288 - max: 28761120 + max: 52877952 primary_compute_unit: NPU precision: int8 layer_info: @@ -327,68 +348,83 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 48 - job_id: j1p8o71xg + job_id: j5mnx8wyp job_status: Passed torchscript_onnx_qnn: - inference_time: 2940.0 
- throughput: 340.13605442176873 + inference_time: 794.0 + throughput: 1259.4458438287154 estimated_peak_memory_range: - min: 413696 - max: 8607040 + min: 430080 + max: 22998128 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 42 + layers_on_npu: 69 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 42 - job_id: jygze7nkg + total_layers: 69 + job_id: j5we64mz5 job_status: Passed reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T12:04:24Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:58:45Z' - torchscript_onnx_tflite: - inference_time: 13079.0 - throughput: 76.45844483523206 + inference_time: 412.0 + throughput: 2427.1844660194174 estimated_peak_memory_range: - min: 454656 - max: 11918216 + min: 8192 + max: 27482688 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 45 - layers_on_gpu: 3 + layers_on_npu: 48 + layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 48 - job_id: jogkzy82g + job_id: jgkex8ryg + job_status: Passed + torchscript_onnx_qnn: + inference_time: 484.0 + throughput: 2066.115702479339 + estimated_peak_memory_range: + min: 409600 + max: 17955344 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 69 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 69 + job_id: jp14z6qkp job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T12:04:14Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:58:47Z' - torchscript_onnx_qnn: - inference_time: 673.0 - throughput: 1485.8841010401188 + inference_time: 679.0 + throughput: 1472.7540500736377 estimated_peak_memory_range: min: 397312 max: 397312 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 42 + layers_on_npu: 69 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 42 - job_id: jw5661w05 + total_layers: 69 + job_id: j56y4wqvp job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -397,4 +433,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:04:17Z' + timestamp: '2024-10-14T23:58:39Z' diff --git a/qai_hub_models/models/quicksrnetlarge/README.md b/qai_hub_models/models/quicksrnetlarge/README.md index 58607804..2ce082f7 100644 --- a/qai_hub_models/models/quicksrnetlarge/README.md +++ b/qai_hub_models/models/quicksrnetlarge/README.md @@ -6,7 +6,7 @@ QuickSRNet Large is designed for upscaling images on mobile platforms to sharpen in real-time. This is based on the implementation of QuickSRNetLarge found -[here](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/quicksrnet). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/quicksrnetlarge). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.quicksrnetlarge.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub.
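A note on units for the perf.yaml files regenerated throughout this patch: the timing fields carry no unit labels, but every entry satisfies throughput = 1e6 / inference_time, so the inference times are evidently in microseconds. On that reading, PLaMo-1B's `time_to_first_token_range` of 31448 to 1006336 spans roughly 31 ms (one 128-token prompt-processor iteration) to about 1.0 s (a full 4096-token context). Below is a quick sanity check against three entries quoted above; the entry labels are mine, not from the YAML.

```python
# Recompute throughput from inference_time for entries in this patch.
# Reproducing the YAML's throughput values confirms the microsecond unit.
entries_us = {
    "Posenet-Mobilenet-Quantized / Galaxy S23 / TFLite": 558.0,
    "Posenet-Mobilenet / Galaxy S23 / TFLite": 1375.0,
    "OpenPose / Galaxy S23 / TFLite": 11959.0,
}
for name, inference_time_us in entries_us.items():
    throughput_per_sec = 1_000_000 / inference_time_us
    print(f"{name}: {throughput_per_sec:.2f} inferences/sec")
# Prints 1792.11, 727.27 and 83.62, matching the YAML's
# 1792.1146953405018, 727.2727272727273 and 83.61903169161302.
```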
+ ## License -- The license for the original implementation of QuickSRNetLarge can be found +* The license for the original implementation of QuickSRNetLarge can be found [here](https://github.com/quic/aimet-model-zoo/blob/develop/LICENSE.pdf). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [QuickSRNet: Plain Single-Image Super-Resolution Architecture for Faster Inference on Mobile Platforms](https://arxiv.org/abs/2303.04336) * [Source Model Implementation](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/quicksrnet) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/quicksrnetlarge/export.py b/qai_hub_models/models/quicksrnetlarge/export.py index a683df51..d3b9a087 100644 --- a/qai_hub_models/models/quicksrnetlarge/export.py +++ b/qai_hub_models/models/quicksrnetlarge/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.quicksrnetlarge import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. 
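Because each of the last four steps is gated by its own flag, a compile-only export that still downloads the asset is just a matter of switching the rest off. A sketch using the flag names that appear later in this file; the device name and output directory are only examples:

```python
# Run steps 1, 2 and 5 only: trace and compile the model, download the
# compiled asset, and skip profiling, inference, and the printed summary.
from qai_hub_models.models.quicksrnetlarge.export import export_model

export_model(
    device="Samsung Galaxy S24",
    skip_profiling=True,
    skip_inferencing=True,
    skip_summary=True,
    output_dir="build/quicksrnetlarge",
)
```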
@@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "quicksrnetlarge" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/quicksrnetlarge/perf.yaml b/qai_hub_models/models/quicksrnetlarge/perf.yaml index e49d3c13..80548d14 100644 --- a/qai_hub_models/models/quicksrnetlarge/perf.yaml +++ b/qai_hub_models/models/quicksrnetlarge/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: QuickSRNetLarge performance_metrics: - torchscript_onnx_tflite: - inference_time: 2439.0 - throughput: 410.0041000410004 + inference_time: 2476.0 + throughput: 403.8772213247173 estimated_peak_memory_range: - min: 6365184 - max: 7849648 + min: 16384 + max: 1507448 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 31 - job_id: jegn2e7rg + job_id: jgz3dn7z5 job_status: Passed torchscript_onnx_qnn: - inference_time: 2106.0 - throughput: 474.8338081671415 + inference_time: 2107.0 + throughput: 474.6084480303749 estimated_peak_memory_range: - min: 2117632 - max: 6739536 + min: 28672 + max: 3253456 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 31 - job_id: jn5q82m45 + job_id: jgn6vk9v5 job_status: Passed torchscript_onnx: - inference_time: 2662.0 - throughput: 375.6574004507889 + inference_time: 2750.0 + throughput: 363.6363636363636 estimated_peak_memory_range: - min: 12288 - max: 66866688 + min: 4096 + max: 2253520 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 33 - job_id: jygze74kg + job_id: jp3j064xg job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:03:40Z' + timestamp: '2024-10-14T23:57:56Z' - torchscript_onnx_tflite: - inference_time: 1957.0 - throughput: 510.98620337250895 + inference_time: 1933.0 + throughput: 517.3305742369374 estimated_peak_memory_range: min: 16384 - max: 32660624 + max: 33715280 primary_compute_unit: NPU precision: fp16 
layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 31 - job_id: joprkyn95 + job_id: j5we649z5 job_status: Passed torchscript_onnx_qnn: - inference_time: 1781.0 - throughput: 561.4823133071309 + inference_time: 1901.0 + throughput: 526.0389268805892 estimated_peak_memory_range: - min: 212992 - max: 11543632 + min: 208896 + max: 11501120 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 31 - job_id: j1glnk18p + job_id: jprv3w4vg job_status: Passed torchscript_onnx: - inference_time: 2115.0 - throughput: 472.8132387706856 + inference_time: 2674.0 + throughput: 373.97157816005983 estimated_peak_memory_range: min: 0 - max: 35034944 + max: 35547056 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 33 - job_id: jz5wo946p + job_id: jgo26814p job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:03:41Z' + timestamp: '2024-10-14T23:57:57Z' - torchscript_onnx_tflite: - inference_time: 2421.0 - throughput: 413.0524576621231 + inference_time: 2400.0 + throughput: 416.6666666666667 estimated_peak_memory_range: - min: 49152 - max: 23850336 + min: 16384 + max: 1680016 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 31 - job_id: jep28mv4p + job_id: jg9lnd4qg job_status: Passed torchscript_onnx_qnn: - inference_time: 2179.0 - throughput: 458.9261128958238 + inference_time: 2183.0 + throughput: 458.0852038479157 estimated_peak_memory_range: - min: 221184 - max: 1405688 + min: 225280 + max: 1614408 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 31 - job_id: j1p3kmwl5 + job_id: jpy13m4rp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:03:35Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:57:48Z' - torchscript_onnx_tflite: - inference_time: 4401.0 - throughput: 227.22108611679164 + inference_time: 2443.0 + throughput: 409.3327875562833 estimated_peak_memory_range: - min: 6332416 - max: 38759760 + min: 20480 + max: 14372720 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 31 - job_id: jqpyed77g + job_id: jp4lr3wq5 job_status: Passed torchscript_onnx_qnn: - inference_time: 3471.0 - throughput: 288.1014116969173 + inference_time: 2184.0 + throughput: 457.87545787545787 estimated_peak_memory_range: - min: 212992 - max: 15745712 + min: 225280 + max: 1604072 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 31 - job_id: jlpe9vl1g + job_id: jgkex8lyg job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:03:39Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:57:51Z' - torchscript_onnx_tflite: - inference_time: 2382.0 - throughput: 419.81528127623847 + inference_time: 2482.0 + throughput: 402.90088638195004 
estimated_peak_memory_range: - min: 20480 - max: 7572552 + min: 16384 + max: 8206304 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 31 - job_id: j2p0yrv6g + job_id: j57yr9dq5 job_status: Passed torchscript_onnx_qnn: - inference_time: 2187.0 - throughput: 457.2473708276177 + inference_time: 2209.0 + throughput: 452.6935264825713 estimated_peak_memory_range: - min: 217088 - max: 1582792 + min: 221184 + max: 1602896 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 31 - job_id: jwgoyv4x5 + job_id: jp8qy13zp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:03:36Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:57:50Z' - torchscript_onnx_tflite: - inference_time: 2404.0 - throughput: 415.97337770382694 + inference_time: 2448.0 + throughput: 408.4967320261438 estimated_peak_memory_range: min: 16384 - max: 91562184 + max: 6329616 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 31 - job_id: j1p8o74xg + job_id: jgdx12vkp job_status: Passed torchscript_onnx_qnn: - inference_time: 2184.0 - throughput: 457.87545787545787 + inference_time: 2238.0 + throughput: 446.82752457551385 estimated_peak_memory_range: min: 233472 - max: 4994536 + max: 1502728 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 31 - job_id: j1pv3w9j5 + job_id: jp0z06125 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:03:37Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:57:49Z' - torchscript_onnx_tflite: - inference_time: 2399.0 - throughput: 416.84035014589415 + inference_time: 4174.0 + throughput: 239.57834211787255 estimated_peak_memory_range: - min: 16384 - max: 5143680 + min: 6336512 + max: 39484208 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 31 - job_id: jogkzy92g + job_id: jp14z68kp job_status: Passed torchscript_onnx_qnn: - inference_time: 2525.0 - throughput: 396.03960396039605 + inference_time: 3471.0 + throughput: 288.1014116969173 estimated_peak_memory_range: - min: 221184 - max: 4898896 + min: 208896 + max: 15573856 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,22 +329,75 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 31 - job_id: j7gjxlwxp + job_id: jglvml0e5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:03:38Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:57:54Z' + - torchscript_onnx_tflite: + inference_time: 1859.0 + throughput: 537.9236148466917 + estimated_peak_memory_range: + min: 12288 + max: 17013024 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 28 + layers_on_gpu: 0 + layers_on_cpu: 3 + total_layers: 31 + job_id: j5mnx8zyp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 1594.0 + throughput: 627.3525721455458 + 
estimated_peak_memory_range: + min: 0 + max: 10260544 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 31 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 31 + job_id: j56y4w3vp + job_status: Passed + torchscript_onnx: + inference_time: 1871.0 + throughput: 534.4735435595938 + estimated_peak_memory_range: + min: 0 + max: 15896080 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 33 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 33 + job_id: jpedmyr75 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:58:00Z' - torchscript_onnx_qnn: - inference_time: 2387.0 - throughput: 418.93590280687056 + inference_time: 2388.0 + throughput: 418.7604690117253 estimated_peak_memory_range: - min: 212992 - max: 212992 + min: 221184 + max: 221184 primary_compute_unit: NPU precision: fp16 layer_info: @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 31 - job_id: jw5661d05 + job_id: jp2kye7xp job_status: Passed torchscript_onnx: - inference_time: 2690.0 - throughput: 371.74721189591077 + inference_time: 2684.0 + throughput: 372.5782414307005 estimated_peak_memory_range: - min: 8937472 - max: 8937472 + min: 8847360 + max: 8847360 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 33 - job_id: jmg9v4dl5 + job_id: jpv6k7175 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:03:42Z' + timestamp: '2024-10-14T23:57:58Z' diff --git a/qai_hub_models/models/quicksrnetlarge_quantized/README.md b/qai_hub_models/models/quicksrnetlarge_quantized/README.md index 35690b40..1c550ce8 100644 --- a/qai_hub_models/models/quicksrnetlarge_quantized/README.md +++ b/qai_hub_models/models/quicksrnetlarge_quantized/README.md @@ -6,7 +6,7 @@ QuickSRNet Large is designed for upscaling images on mobile platforms to sharpen in real-time. This is based on the implementation of QuickSRNetLarge-Quantized found -[here](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/quicksrnet). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/quicksrnetlarge_quantized). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.quicksrnetlarge_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of QuickSRNetLarge-Quantized can be found +* The license for the original implementation of QuickSRNetLarge-Quantized can be found [here](https://github.com/quic/aimet-model-zoo/blob/develop/LICENSE.pdf). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [QuickSRNet: Plain Single-Image Super-Resolution Architecture for Faster Inference on Mobile Platforms](https://arxiv.org/abs/2303.04336) * [Source Model Implementation](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/quicksrnet) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/quicksrnetlarge_quantized/export.py b/qai_hub_models/models/quicksrnetlarge_quantized/export.py index aea0ea89..44ee3e43 100644 --- a/qai_hub_models/models/quicksrnetlarge_quantized/export.py +++ b/qai_hub_models/models/quicksrnetlarge_quantized/export.py @@ -10,17 +10,17 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.quicksrnetlarge_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.printing import ( print_inference_metrics, @@ -45,20 +45,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -80,10 +78,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). 
+ * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "quicksrnetlarge_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,7 +107,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( target_runtime, output_path, input_spec ) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -135,7 +133,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +148,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +169,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -198,7 +196,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/quicksrnetlarge_quantized/perf.yaml b/qai_hub_models/models/quicksrnetlarge_quantized/perf.yaml index 48fce07e..a1610cbc 100644 --- a/qai_hub_models/models/quicksrnetlarge_quantized/perf.yaml +++ b/qai_hub_models/models/quicksrnetlarge_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,44 +20,41 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: QuickSRNetLarge-Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 1501.0 - throughput: 666.2225183211193 + inference_time: 1434.0 + throughput: 697.350069735007 estimated_peak_memory_range: - min: 28672 - max: 6381552 + min: 12288 + max: 3802792 primary_compute_unit: NPU precision: int8 layer_info: @@ -64,29 +62,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 33 - job_id: jo5mrzowg + job_id: j5q6qv37p job_status: Passed torchscript_onnx_qnn: - inference_time: 907.0 - throughput: 1102.5358324145534 + inference_time: 905.0 + throughput: 1104.9723756906078 estimated_peak_memory_range: - min: 65536 - max: 8346048 + min: 24576 + max: 8377864 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 18 + layers_on_npu: 31 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 18 - job_id: j1glnko8p + total_layers: 31 + job_id: jp14z6ekp job_status: Passed torchscript_onnx: - inference_time: 1058.0 - throughput: 945.179584120983 + inference_time: 902.0 + throughput: 1108.6474501108648 estimated_peak_memory_range: - min: 65536 - max: 16427112 + min: 57344 + max: 1672000 primary_compute_unit: NPU precision: int8 layer_info: @@ -94,7 +92,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 34 - job_id: jmg9v4xl5 + job_id: jgkex8yyg job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -103,13 +101,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:03:03Z' + timestamp: '2024-10-14T23:57:11Z' - torchscript_onnx_tflite: - 
inference_time: 1113.0 - throughput: 898.4725965858041 + inference_time: 1223.0 + throughput: 817.6614881439084 estimated_peak_memory_range: min: 12288 - max: 28921856 + max: 29971408 primary_compute_unit: NPU precision: int8 layer_info: @@ -117,29 +115,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 33 - job_id: jegn2eorg + job_id: jglvml3e5 job_status: Passed torchscript_onnx_qnn: - inference_time: 643.0 - throughput: 1555.2099533437015 + inference_time: 647.0 + throughput: 1545.595054095827 estimated_peak_memory_range: - min: 16384 - max: 13702768 + min: 12288 + max: 12152080 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 18 + layers_on_npu: 31 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 18 - job_id: jw5661r05 + total_layers: 31 + job_id: jgdx12okp job_status: Passed torchscript_onnx: - inference_time: 770.0 - throughput: 1298.7012987012988 + inference_time: 682.0 + throughput: 1466.275659824047 estimated_peak_memory_range: min: 0 - max: 30754016 + max: 30995088 primary_compute_unit: NPU precision: int8 layer_info: @@ -147,7 +145,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 34 - job_id: jnp108v25 + job_id: j5q6qv27p job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -156,13 +154,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:03:04Z' + timestamp: '2024-10-14T23:57:12Z' - torchscript_onnx_tflite: - inference_time: 1445.0 - throughput: 692.0415224913495 + inference_time: 4239.0 + throughput: 235.90469450342061 estimated_peak_memory_range: - min: 28672 - max: 1592008 + min: 1810432 + max: 23251024 primary_compute_unit: NPU precision: int8 layer_info: @@ -170,37 +168,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 33 - job_id: joprkyo95 + job_id: jgz3dnrz5 job_status: Passed torchscript_onnx_qnn: - inference_time: 681.0 - throughput: 1468.4287812041116 + inference_time: 3128.0 + throughput: 319.693094629156 estimated_peak_memory_range: - min: 77824 - max: 1397472 + min: 65536 + max: 8110256 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 18 + layers_on_npu: 31 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 18 - job_id: jwgoyvox5 + total_layers: 31 + job_id: jp0z06r25 job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:02:56Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-14T23:57:09Z' - torchscript_onnx_tflite: - inference_time: 2256.0 - throughput: 443.26241134751774 + inference_time: 38641.0 + throughput: 25.879247431484693 estimated_peak_memory_range: - min: 1589248 - max: 31850544 + min: 1839104 + max: 8801856 primary_compute_unit: NPU precision: int8 layer_info: @@ -208,37 +206,60 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 33 - job_id: jep28m44p + job_id: j5we64qz5 job_status: Passed - torchscript_onnx_qnn: - inference_time: 1051.0 - throughput: 951.4747859181732 + reference_device_info: + name: RB5 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8250 Proxy + timestamp: '2024-10-14T23:56:56Z' + - torchscript_onnx_tflite: + inference_time: 1458.0 + throughput: 685.8710562414266 estimated_peak_memory_range: min: 12288 - max: 15042256 + max: 3084656 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 30 + layers_on_gpu: 0 + layers_on_cpu: 3 + total_layers: 33 + 
job_id: j56y4wnvp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 682.0 + throughput: 1466.275659824047 + estimated_peak_memory_range: + min: 0 + max: 1165008 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 18 + layers_on_npu: 31 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 18 - job_id: jygze78kg + total_layers: 31 + job_id: jp4lr3vq5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:03:01Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:57:02Z' - torchscript_onnx_tflite: - inference_time: 1447.0 - throughput: 691.0850034554251 + inference_time: 1439.0 + throughput: 694.9270326615705 estimated_peak_memory_range: - min: 20480 - max: 1439032 + min: 28672 + max: 1494800 primary_compute_unit: NPU precision: int8 layer_info: @@ -246,37 +267,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 33 - job_id: jqpyedq7g + job_id: jgjvnqe7g job_status: Passed torchscript_onnx_qnn: - inference_time: 685.0 - throughput: 1459.85401459854 + inference_time: 684.0 + throughput: 1461.9883040935672 estimated_peak_memory_range: - min: 81920 - max: 1380480 + min: 73728 + max: 1344312 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 18 + layers_on_npu: 31 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 18 - job_id: j1pv3wej5 + total_layers: 31 + job_id: jprv3wyvg job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:02:57Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:57:05Z' - torchscript_onnx_tflite: - inference_time: 1434.0 - throughput: 697.350069735007 + inference_time: 1453.0 + throughput: 688.2312456985547 estimated_peak_memory_range: - min: 12288 - max: 1390504 + min: 806912 + max: 5299552 primary_compute_unit: NPU precision: int8 layer_info: @@ -284,22 +305,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 33 - job_id: j2p0yrd6g + job_id: jpv6k7v75 job_status: Passed torchscript_onnx_qnn: - inference_time: 720.0 - throughput: 1388.888888888889 + inference_time: 679.0 + throughput: 1472.7540500736377 estimated_peak_memory_range: min: 81920 - max: 1342168 + max: 1741272 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 18 + layers_on_npu: 31 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 18 - job_id: j7gjxloxp + total_layers: 31 + job_id: jgn6vkev5 job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -307,14 +328,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:02:58Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:57:04Z' - torchscript_onnx_tflite: - inference_time: 1449.0 - throughput: 690.1311249137336 + inference_time: 1429.0 + throughput: 699.7900629811056 estimated_peak_memory_range: - min: 20480 - max: 1442000 + min: 24576 + max: 78505088 primary_compute_unit: NPU precision: int8 layer_info: @@ -322,37 +343,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 33 - job_id: j1p8o76xg + job_id: jgo26834p job_status: Passed torchscript_onnx_qnn: - inference_time: 681.0 - throughput: 1468.4287812041116 + inference_time: 723.0 + throughput: 1383.1258644536654 estimated_peak_memory_range: - min: 0 - max: 1413664 + min: 81920 + max: 1427008 
primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 18 + layers_on_npu: 31 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 18 - job_id: jlpe9v81g + total_layers: 31 + job_id: jpxkoxyj5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:02:59Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:57:03Z' - torchscript_onnx_tflite: - inference_time: 3920.0 - throughput: 255.10204081632654 + inference_time: 1923.0 + throughput: 520.0208008320333 estimated_peak_memory_range: - min: 1609728 - max: 22801536 + min: 16384 + max: 30993168 primary_compute_unit: NPU precision: int8 layer_info: @@ -360,37 +381,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 33 - job_id: jogkzyo2g + job_id: jp3j06exg job_status: Passed torchscript_onnx_qnn: - inference_time: 3175.0 - throughput: 314.96062992125985 + inference_time: 1059.0 + throughput: 944.2870632672333 estimated_peak_memory_range: - min: 65536 - max: 8335184 + min: 12288 + max: 15219760 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 18 + layers_on_npu: 31 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 18 - job_id: jz5wo916p + total_layers: 31 + job_id: jpy13mdrp job_status: Passed reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T12:03:02Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:57:08Z' - torchscript_onnx_tflite: - inference_time: 38110.0 - throughput: 26.239832065074783 + inference_time: 1594.0 + throughput: 627.3525721455458 estimated_peak_memory_range: - min: 1503232 - max: 4870208 + min: 0 + max: 20576960 primary_compute_unit: NPU precision: int8 layer_info: @@ -398,37 +419,67 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 33 - job_id: jn5q82z45 + job_id: jg9lndwqg + job_status: Passed + torchscript_onnx_qnn: + inference_time: 655.0 + throughput: 1526.7175572519084 + estimated_peak_memory_range: + min: 8192 + max: 11344048 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 31 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 31 + job_id: jp8qy17zp + job_status: Passed + torchscript_onnx: + inference_time: 519.0 + throughput: 1926.7822736030828 + estimated_peak_memory_range: + min: 0 + max: 21261808 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 34 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 34 + job_id: jp3j06mxg job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T12:02:52Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:57:15Z' - torchscript_onnx_qnn: - inference_time: 791.0 - throughput: 1264.2225031605562 + inference_time: 814.0 + throughput: 1228.5012285012285 estimated_peak_memory_range: - min: 61440 - max: 61440 + min: 233472 + max: 233472 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 18 + layers_on_npu: 31 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 18 - job_id: j1p3kmxl5 + total_layers: 31 + job_id: j57yr9xq5 job_status: Passed torchscript_onnx: - inference_time: 1104.0 - throughput: 905.7971014492754 + inference_time: 1105.0 + 
throughput: 904.9773755656108 estimated_peak_memory_range: - min: 3301376 - max: 3301376 + min: 3379200 + max: 3379200 primary_compute_unit: NPU precision: int8 layer_info: @@ -436,7 +487,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 34 - job_id: jvgdwvze5 + job_id: jglvmlke5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -445,4 +496,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:03:05Z' + timestamp: '2024-10-14T23:57:13Z' diff --git a/qai_hub_models/models/quicksrnetmedium/README.md b/qai_hub_models/models/quicksrnetmedium/README.md index a3adabe4..35ffd138 100644 --- a/qai_hub_models/models/quicksrnetmedium/README.md +++ b/qai_hub_models/models/quicksrnetmedium/README.md @@ -6,7 +6,7 @@ QuickSRNet Medium is designed for upscaling images on mobile platforms to sharpen in real-time. This is based on the implementation of QuickSRNetMedium found -[here](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/quicksrnet). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/quicksrnetmedium). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.quicksrnetmedium.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of QuickSRNetMedium can be found +* The license for the original implementation of QuickSRNetMedium can be found [here](https://github.com/quic/aimet-model-zoo/blob/develop/LICENSE.pdf). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [QuickSRNet: Plain Single-Image Super-Resolution Architecture for Faster Inference on Mobile Platforms](https://arxiv.org/abs/2303.04336) * [Source Model Implementation](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/quicksrnet) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). 
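One migration note that applies to every `export.py` touched in this diff: callers that unpacked the old 3-tuple must switch to attribute access on `ExportResult` (and note that the Returns docstring now lists the inference job before the profile job). A hypothetical call site against the quicksrnetmedium entry point changed below:

```python
# Before this change, callers unpacked a 3-tuple:
#     compile_job, profile_job, inference_job = export_model(device=...)
# Afterwards they read attributes off the returned ExportResult.
from qai_hub_models.models.quicksrnetmedium.export import export_model

result = export_model(device="Samsung Galaxy S23")
if not isinstance(result, list):  # the signature also allows List[str]
    print("compile job:", result.compile_job.job_id)
    if result.profile_job is not None:
        print("profile job:", result.profile_job.job_id)
```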
diff --git a/qai_hub_models/models/quicksrnetmedium/export.py b/qai_hub_models/models/quicksrnetmedium/export.py index 42b487ec..dae61e56 100644 --- a/qai_hub_models/models/quicksrnetmedium/export.py +++ b/qai_hub_models/models/quicksrnetmedium/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.quicksrnetmedium import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "quicksrnetmedium" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. 
Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/quicksrnetmedium/perf.yaml b/qai_hub_models/models/quicksrnetmedium/perf.yaml index 7e0119c1..6ee1eb06 100644 --- a/qai_hub_models/models/quicksrnetmedium/perf.yaml +++ b/qai_hub_models/models/quicksrnetmedium/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: QuickSRNetMedium performance_metrics: - torchscript_onnx_tflite: - inference_time: 1334.0 - throughput: 749.6251874062968 + inference_time: 1359.0 + throughput: 735.8351729212657 estimated_peak_memory_range: - min: 28672 - max: 1477824 + min: 16384 + max: 1656408 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 17 - job_id: j1p8o788g + job_id: jgkex8qvg job_status: Passed 
torchscript_onnx_qnn: - inference_time: 994.0 - throughput: 1006.0362173038229 + inference_time: 1017.0 + throughput: 983.284169124877 estimated_peak_memory_range: - min: 233472 - max: 7231440 + min: 217088 + max: 2695224 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 17 - job_id: j1pv3w4m5 + job_id: jgz3dnj45 job_status: Passed torchscript_onnx: - inference_time: 1530.0 - throughput: 653.59477124183 + inference_time: 1512.0 + throughput: 661.3756613756614 estimated_peak_memory_range: - min: 28672 - max: 7838800 + min: 40960 + max: 6775640 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 19 - job_id: jz5wo986p + job_id: jpxkox6j5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:02:20Z' + timestamp: '2024-10-14T23:56:22Z' - torchscript_onnx_tflite: - inference_time: 904.0 - throughput: 1106.1946902654868 + inference_time: 981.0 + throughput: 1019.367991845056 estimated_peak_memory_range: - min: 20480 - max: 22514608 + min: 16384 + max: 22545904 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 17 - job_id: jogkzydog + job_id: j5q6qvrep job_status: Passed torchscript_onnx_qnn: - inference_time: 682.0 - throughput: 1466.275659824047 + inference_time: 674.0 + throughput: 1483.679525222552 estimated_peak_memory_range: - min: 208896 - max: 12075760 + min: 204800 + max: 11154848 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 17 - job_id: j7gjxl18p + job_id: j5we64345 job_status: Passed torchscript_onnx: - inference_time: 1367.0 - throughput: 731.528895391368 + inference_time: 1084.0 + throughput: 922.509225092251 estimated_peak_memory_range: min: 0 - max: 24822464 + max: 24816912 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 19 - job_id: jmg9v4kl5 + job_id: j5mnx86yp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:02:21Z' + timestamp: '2024-10-14T23:56:23Z' - torchscript_onnx_tflite: - inference_time: 1405.0 - throughput: 711.7437722419929 + inference_time: 1333.0 + throughput: 750.1875468867216 estimated_peak_memory_range: - min: 28672 - max: 1392592 + min: 36864 + max: 1384704 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 17 - job_id: jn5q82wm5 + job_id: jglvml225 job_status: Passed torchscript_onnx_qnn: - inference_time: 934.0 - throughput: 1070.6638115631692 + inference_time: 910.0 + throughput: 1098.901098901099 estimated_peak_memory_range: - min: 0 - max: 1310736 + min: 221184 + max: 1362640 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 17 - job_id: jygze7w6g + job_id: jgdx12q6p job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:02:14Z' + chipset: QCS8550 Proxy + timestamp: 
'2024-10-14T23:56:14Z' - torchscript_onnx_tflite: - inference_time: 2043.0 - throughput: 489.47626040137055 + inference_time: 1366.0 + throughput: 732.0644216691069 estimated_peak_memory_range: - min: 6307840 - max: 29877280 + min: 24576 + max: 1448016 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 17 - job_id: j1glnk7lp + job_id: jpv6k7rz5 job_status: Passed torchscript_onnx_qnn: - inference_time: 1250.0 - throughput: 800.0 + inference_time: 932.0 + throughput: 1072.961373390558 estimated_peak_memory_range: - min: 208896 - max: 13835424 + min: 221184 + max: 1817048 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 17 - job_id: jvgdwv0l5 + job_id: jp14z6wkp job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:02:19Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:56:18Z' - torchscript_onnx_tflite: - inference_time: 1373.0 - throughput: 728.3321194464676 + inference_time: 1411.0 + throughput: 708.7172218284904 estimated_peak_memory_range: - min: 28672 - max: 1643312 + min: 36864 + max: 1406160 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 17 - job_id: jw5661v75 + job_id: jgo268n1p job_status: Passed torchscript_onnx_qnn: - inference_time: 919.0 - throughput: 1088.139281828074 + inference_time: 1003.0 + throughput: 997.0089730807578 estimated_peak_memory_range: - min: 217088 - max: 1431720 + min: 221184 + max: 1640680 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 17 - job_id: jz5wo9xjp + job_id: jg9lndyqg job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:02:15Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:56:16Z' - torchscript_onnx_tflite: - inference_time: 1333.0 - throughput: 750.1875468867216 + inference_time: 1324.0 + throughput: 755.2870090634441 estimated_peak_memory_range: min: 24576 - max: 1289536 + max: 1500104 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 17 - job_id: j1p3km8z5 + job_id: jp3j061mg job_status: Passed torchscript_onnx_qnn: - inference_time: 938.0 - throughput: 1066.0980810234541 + inference_time: 925.0 + throughput: 1081.081081081081 estimated_peak_memory_range: - min: 266240 - max: 4960960 + min: 229376 + max: 1460104 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 17 - job_id: jmg9v48v5 + job_id: j5we643z5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:02:17Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:56:15Z' - torchscript_onnx_tflite: - inference_time: 1343.0 - throughput: 744.6016381236038 + inference_time: 2746.0 + throughput: 364.1660597232338 estimated_peak_memory_range: - min: 1007616 - max: 2372504 + min: 6316032 + max: 29582256 
primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 17 - job_id: jwgoyvmd5 + job_id: j56y4wznp job_status: Passed torchscript_onnx_qnn: - inference_time: 1004.0 - throughput: 996.01593625498 + inference_time: 1234.0 + throughput: 810.3727714748784 estimated_peak_memory_range: - min: 229376 - max: 4888840 + min: 204800 + max: 15147392 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,22 +329,75 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 17 - job_id: jnp1083l5 + job_id: j57yr9lq5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:02:18Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:56:20Z' + - torchscript_onnx_tflite: + inference_time: 971.0 + throughput: 1029.8661174047375 + estimated_peak_memory_range: + min: 16384 + max: 15718240 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 14 + layers_on_gpu: 0 + layers_on_cpu: 3 + total_layers: 17 + job_id: jpedmyw85 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 684.0 + throughput: 1461.9883040935672 + estimated_peak_memory_range: + min: 0 + max: 8908608 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 17 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 17 + job_id: jp4lr3dq5 + job_status: Passed + torchscript_onnx: + inference_time: 925.0 + throughput: 1081.081081081081 + estimated_peak_memory_range: + min: 0 + max: 16116864 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 19 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 19 + job_id: jp2kyelxp + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:56:26Z' - torchscript_onnx_qnn: - inference_time: 1039.0 - throughput: 962.4639076034649 + inference_time: 1035.0 + throughput: 966.1835748792271 estimated_peak_memory_range: - min: 212992 - max: 212992 + min: 208896 + max: 208896 primary_compute_unit: NPU precision: fp16 layer_info: @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 17 - job_id: jlpe9v20g + job_id: jp14z6wnp job_status: Passed torchscript_onnx: - inference_time: 1515.0 - throughput: 660.0660066006601 + inference_time: 1552.0 + throughput: 644.3298969072165 estimated_peak_memory_range: - min: 8982528 - max: 8982528 + min: 8929280 + max: 8929280 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 19 - job_id: jnp108725 + job_id: jgn6vk3v5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:02:22Z' + timestamp: '2024-10-14T23:56:24Z' diff --git a/qai_hub_models/models/quicksrnetmedium_quantized/README.md b/qai_hub_models/models/quicksrnetmedium_quantized/README.md index ed5b04f5..64ae7804 100644 --- a/qai_hub_models/models/quicksrnetmedium_quantized/README.md +++ b/qai_hub_models/models/quicksrnetmedium_quantized/README.md @@ -6,7 +6,7 @@ QuickSRNet Medium is designed for upscaling images on mobile platforms to sharpen in real-time. 
This is based on the implementation of QuickSRNetMedium-Quantized found -[here](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/quicksrnet). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/quicksrnetmedium_quantized). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.quicksrnetmedium_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of QuickSRNetMedium-Quantized can be found +* The license for the original implementation of QuickSRNetMedium-Quantized can be found [here](https://github.com/quic/aimet-model-zoo/blob/develop/LICENSE.pdf). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [QuickSRNet: Plain Single-Image Super-Resolution Architecture for Faster Inference on Mobile Platforms](https://arxiv.org/abs/2303.04336) * [Source Model Implementation](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/quicksrnet) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/quicksrnetmedium_quantized/export.py b/qai_hub_models/models/quicksrnetmedium_quantized/export.py index 767f0ab4..83e43b06 100644 --- a/qai_hub_models/models/quicksrnetmedium_quantized/export.py +++ b/qai_hub_models/models/quicksrnetmedium_quantized/export.py @@ -10,17 +10,17 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.quicksrnetmedium_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.printing import ( print_inference_metrics, @@ -45,20 +45,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1.
Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -80,10 +78,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "quicksrnetmedium_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,7 +107,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( target_runtime, output_path, input_spec ) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -135,7 +133,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +148,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +169,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -198,7 +196,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/quicksrnetmedium_quantized/perf.yaml b/qai_hub_models/models/quicksrnetmedium_quantized/perf.yaml index 08cf7fef..e69a2b3c 100644 --- a/qai_hub_models/models/quicksrnetmedium_quantized/perf.yaml +++ b/qai_hub_models/models/quicksrnetmedium_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,44 +20,41 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: QuickSRNetMedium-Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 1111.0 - throughput: 900.0900090009001 + inference_time: 1127.0 + throughput: 887.3114463176574 estimated_peak_memory_range: - min: 28672 - max: 1464480 + min: 831488 + max: 66800768 primary_compute_unit: NPU precision: int8 layer_info: @@ -64,29 +62,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 19 - job_id: j2p0yreeg + job_id: jgkexn7wg job_status: Passed torchscript_onnx_qnn: - inference_time: 512.0 - throughput: 1953.125 + inference_time: 519.0 + throughput: 1926.7822736030828 estimated_peak_memory_range: - min: 69632 - max: 66788144 + min: 12288 + max: 3826384 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 10 + layers_on_npu: 17 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 10 - job_id: j7gjxlk8p + total_layers: 17 + job_id: jp0z0q295 job_status: Passed torchscript_onnx: - inference_time: 771.0 - throughput: 1297.0168612191958 + inference_time: 676.0 + throughput: 1479.2899408284025 estimated_peak_memory_range: - min: 69632 - max: 1463824 + min: 65536 + max: 1402392 primary_compute_unit: NPU precision: int8 layer_info: @@ -94,7 +92,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 19 - job_id: j0pxv1m9g + job_id: jgdx19orp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -103,13 +101,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:01:43Z' + timestamp: '2024-10-15T17:24:40Z' - torchscript_onnx_tflite: - 
inference_time: 910.0 - throughput: 1098.901098901099 + inference_time: 899.0 + throughput: 1112.3470522803113 estimated_peak_memory_range: min: 16384 - max: 23857536 + max: 23513088 primary_compute_unit: NPU precision: int8 layer_info: @@ -117,29 +115,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 19 - job_id: j1p8o7w8g + job_id: jglvmz6j5 job_status: Passed torchscript_onnx_qnn: - inference_time: 470.0 - throughput: 2127.659574468085 + inference_time: 359.0 + throughput: 2785.515320334262 estimated_peak_memory_range: - min: 12288 - max: 12497168 + min: 65536 + max: 11297680 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 10 + layers_on_npu: 17 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 10 - job_id: jlpe9v40g + total_layers: 17 + job_id: jp8qy9mkp job_status: Passed torchscript_onnx: - inference_time: 550.0 - throughput: 1818.1818181818182 + inference_time: 503.0 + throughput: 1988.0715705765408 estimated_peak_memory_range: min: 0 - max: 24443968 + max: 24442864 primary_compute_unit: NPU precision: int8 layer_info: @@ -147,7 +145,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 19 - job_id: jo5mrz4qg + job_id: jpxkojy35 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -156,13 +154,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:01:43Z' + timestamp: '2024-10-15T17:24:41Z' - torchscript_onnx_tflite: - inference_time: 1116.0 - throughput: 896.0573476702509 + inference_time: 3558.0 + throughput: 281.0567734682406 estimated_peak_memory_range: - min: 16384 - max: 5672664 + min: 1622016 + max: 17850784 primary_compute_unit: NPU precision: int8 layer_info: @@ -170,37 +168,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 19 - job_id: jogkzyrog + job_id: jprv3q20g job_status: Passed torchscript_onnx_qnn: - inference_time: 413.0 - throughput: 2421.3075060532688 + inference_time: 1050.0 + throughput: 952.3809523809524 estimated_peak_memory_range: - min: 81920 - max: 1274688 + min: 61440 + max: 8135440 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 10 + layers_on_npu: 17 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 10 - job_id: jz5wo9mjp + total_layers: 17 + job_id: jgz3d9ro5 job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:01:37Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-15T17:24:38Z' - torchscript_onnx_tflite: - inference_time: 1842.0 - throughput: 542.8881650380022 + inference_time: 12711.0 + throughput: 78.6720163637794 estimated_peak_memory_range: - min: 1605632 - max: 25678304 + min: 1748992 + max: 7692808 primary_compute_unit: NPU precision: int8 layer_info: @@ -208,37 +206,60 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 19 - job_id: jn5q829m5 + job_id: jp2ky69rp + job_status: Passed + reference_device_info: + name: RB5 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8250 Proxy + timestamp: '2024-10-15T17:24:27Z' + - torchscript_onnx_tflite: + inference_time: 1115.0 + throughput: 896.8609865470852 + estimated_peak_memory_range: + min: 20480 + max: 1408456 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 16 + layers_on_gpu: 0 + layers_on_cpu: 3 + total_layers: 19 + job_id: jp3j03v3g job_status: Passed torchscript_onnx_qnn: - inference_time: 573.0 - throughput: 
1745.2006980802792 + inference_time: 412.0 + throughput: 2427.1844660194174 estimated_peak_memory_range: - min: 65536 - max: 13607824 + min: 77824 + max: 2562544 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 10 + layers_on_npu: 17 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 10 - job_id: jz57zd6rp + total_layers: 17 + job_id: j5q6qkrnp job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:01:41Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T17:24:33Z' - torchscript_onnx_tflite: - inference_time: 1119.0 - throughput: 893.6550491510277 + inference_time: 1106.0 + throughput: 904.1591320072333 estimated_peak_memory_range: min: 28672 - max: 1358744 + max: 5821904 primary_compute_unit: NPU precision: int8 layer_info: @@ -246,37 +267,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 19 - job_id: j1glnkelp + job_id: j57yrwlv5 job_status: Passed torchscript_onnx_qnn: inference_time: 413.0 throughput: 2421.3075060532688 estimated_peak_memory_range: - min: 81920 - max: 1262552 + min: 28672 + max: 1874272 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 10 + layers_on_npu: 17 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 10 - job_id: jmg9v49v5 + total_layers: 17 + job_id: jp3j0313g job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:01:38Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T17:24:36Z' - torchscript_onnx_tflite: - inference_time: 1124.0 - throughput: 889.6797153024911 + inference_time: 1134.0 + throughput: 881.8342151675485 estimated_peak_memory_range: - min: 20480 - max: 5770984 + min: 1601536 + max: 71370032 primary_compute_unit: NPU precision: int8 layer_info: @@ -284,22 +305,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 19 - job_id: jw5661q75 + job_id: jp14zlw8p job_status: Passed torchscript_onnx_qnn: - inference_time: 413.0 - throughput: 2421.3075060532688 + inference_time: 416.0 + throughput: 2403.846153846154 estimated_peak_memory_range: - min: 81920 - max: 2355968 + min: 16384 + max: 1941952 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 10 + layers_on_npu: 17 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 10 - job_id: jnp108ql5 + total_layers: 17 + job_id: j56y4jz6p job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -307,14 +328,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:01:39Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T17:24:35Z' - torchscript_onnx_tflite: - inference_time: 1134.0 - throughput: 881.8342151675485 + inference_time: 1108.0 + throughput: 902.5270758122743 estimated_peak_memory_range: - min: 40960 - max: 1397032 + min: 20480 + max: 3105112 primary_compute_unit: NPU precision: int8 layer_info: @@ -322,37 +343,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 19 - job_id: j1p3kmqz5 + job_id: j5we6v335 job_status: Passed torchscript_onnx_qnn: inference_time: 415.0 throughput: 2409.6385542168673 estimated_peak_memory_range: - min: 90112 - max: 1416856 + min: 81920 + max: 1384840 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 10 + layers_on_npu: 17 layers_on_gpu: 0 layers_on_cpu: 
0 - total_layers: 10 - job_id: jvgdwv7l5 + total_layers: 17 + job_id: jglvmz2j5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:01:40Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T17:24:34Z' - torchscript_onnx_tflite: - inference_time: 2491.0 - throughput: 401.4452027298274 + inference_time: 1368.0 + throughput: 730.9941520467836 estimated_peak_memory_range: - min: 1617920 - max: 17432640 + min: 16384 + max: 24876320 primary_compute_unit: NPU precision: int8 layer_info: @@ -360,37 +381,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 19 - job_id: jwgoyved5 + job_id: jgjvnm2vg job_status: Passed torchscript_onnx_qnn: - inference_time: 1089.0 - throughput: 918.2736455463728 + inference_time: 581.0 + throughput: 1721.170395869191 estimated_peak_memory_range: - min: 28672 - max: 7778400 + min: 65536 + max: 13853008 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 10 + layers_on_npu: 17 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 10 - job_id: jqp4qw8lg + total_layers: 17 + job_id: jpv6kovk5 job_status: Passed reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T12:01:42Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-15T17:24:38Z' - torchscript_onnx_tflite: - inference_time: 11336.0 - throughput: 88.21453775582216 + inference_time: 845.0 + throughput: 1183.4319526627219 estimated_peak_memory_range: - min: 1798144 - max: 4980664 + min: 16384 + max: 15947360 primary_compute_unit: NPU precision: int8 layer_info: @@ -398,37 +419,67 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 19 - job_id: j1pv3wzm5 + job_id: jpy13wj8p + job_status: Passed + torchscript_onnx_qnn: + inference_time: 303.0 + throughput: 3300.3300330033003 + estimated_peak_memory_range: + min: 57344 + max: 9609648 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 17 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 17 + job_id: jg9ln1wwg + job_status: Passed + torchscript_onnx: + inference_time: 373.0 + throughput: 2680.9651474530833 + estimated_peak_memory_range: + min: 0 + max: 15758176 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 19 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 19 + job_id: jgkexn3wg job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T12:01:33Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T17:24:44Z' - torchscript_onnx_qnn: - inference_time: 516.0 - throughput: 1937.984496124031 + inference_time: 521.0 + throughput: 1919.3857965451057 estimated_peak_memory_range: - min: 69632 - max: 69632 + min: 229376 + max: 229376 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 10 + layers_on_npu: 17 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 10 - job_id: jygze7v6g + total_layers: 17 + job_id: jgkexnqwg job_status: Passed torchscript_onnx: - inference_time: 759.0 - throughput: 1317.5230566534915 + inference_time: 777.0 + throughput: 1287.001287001287 estimated_peak_memory_range: - min: 3301376 - max: 3301376 + min: 3325952 + max: 3325952 
primary_compute_unit: NPU precision: int8 layer_info: @@ -436,7 +487,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 19 - job_id: jegn2exmg + job_id: jprv3qe0g job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -445,4 +496,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:01:44Z' + timestamp: '2024-10-15T17:24:42Z' diff --git a/qai_hub_models/models/quicksrnetsmall/README.md b/qai_hub_models/models/quicksrnetsmall/README.md index 8b3c02c1..17dc918e 100644 --- a/qai_hub_models/models/quicksrnetsmall/README.md +++ b/qai_hub_models/models/quicksrnetsmall/README.md @@ -6,7 +6,7 @@ QuickSRNet Small is designed for upscaling images on mobile platforms to sharpen in real-time. This is based on the implementation of QuickSRNetSmall found -[here](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/quicksrnet). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/quicksrnetsmall). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.quicksrnetsmall.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of QuickSRNetSmall can be found +* The license for the original implementation of QuickSRNetSmall can be found [here](https://github.com/quic/aimet-model-zoo/blob/develop/LICENSE.pdf). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [QuickSRNet: Plain Single-Image Super-Resolution Architecture for Faster Inference on Mobile Platforms](https://arxiv.org/abs/2303.04336) * [Source Model Implementation](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/quicksrnet) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
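The `export.py` diffs in this change replace the positional `(compile_job, profile_job, inference_job)` return tuple with an `ExportResult` struct, so any caller that unpacked the tuple positionally needs updating. Below is a minimal, hedged sketch of the new calling convention; the device name is illustrative, and the handling of the `List[str]` branch of the return type is a defensive assumption rather than something this diff specifies:

```python
# Sketch (not part of this diff): consuming ExportResult instead of
# unpacking the old (compile_job, profile_job, inference_job) tuple.
from qai_hub_models.models.quicksrnetsmall.export import export_model

result = export_model(device="Samsung Galaxy S23")  # illustrative device
if not isinstance(result, list):  # the List[str] branch of the return type
    print(result.compile_job)  # hub.CompileJob metadata
    if result.profile_job is not None:  # None when profiling is skipped
        profile_data = result.profile_job.download_profile()
    if result.inference_job is not None:  # None when inferencing is skipped
        print(result.inference_job)
```

Because `ExportResult` is constructed with keyword arguments (`compile_job=`, `inference_job=`, `profile_job=`), field access by name is unaffected by the profile/inference ordering swap visible in the updated docstrings.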
diff --git a/qai_hub_models/models/quicksrnetsmall/export.py b/qai_hub_models/models/quicksrnetsmall/export.py index 7adf31b5..ce3dfe17 100644 --- a/qai_hub_models/models/quicksrnetsmall/export.py +++ b/qai_hub_models/models/quicksrnetsmall/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.quicksrnetsmall import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "quicksrnetsmall" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. 
Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/quicksrnetsmall/perf.yaml b/qai_hub_models/models/quicksrnetsmall/perf.yaml index 4ecdcd4c..6b7af2d2 100644 --- a/qai_hub_models/models/quicksrnetsmall/perf.yaml +++ b/qai_hub_models/models/quicksrnetsmall/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: QuickSRNetSmall performance_metrics: - torchscript_onnx_tflite: - inference_time: 1298.0 - throughput: 770.4160246533128 + inference_time: 1340.0 + throughput: 746.2686567164179 estimated_peak_memory_range: - min: 7168000 - max: 79020744 + min: 16384 + max: 8878600 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 11 - job_id: jogkzylog + job_id: jgdx12x6p job_status: Passed 
torchscript_onnx_qnn: - inference_time: 1010.0 - throughput: 990.0990099009902 + inference_time: 1061.0 + throughput: 942.5070688030161 estimated_peak_memory_range: - min: 12288 - max: 65498344 + min: 217088 + max: 7570656 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 11 - job_id: j7gjxl08p + job_id: jp0z06405 job_status: Passed torchscript_onnx: - inference_time: 1455.0 - throughput: 687.2852233676975 + inference_time: 1440.0 + throughput: 694.4444444444445 estimated_peak_memory_range: - min: 212992 - max: 11710960 + min: 217088 + max: 1760856 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 13 - job_id: jqp4qwjlg + job_id: jpedmyo85 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:01:03Z' + timestamp: '2024-10-14T23:54:49Z' - torchscript_onnx_tflite: - inference_time: 937.0 - throughput: 1067.2358591248667 + inference_time: 830.0 + throughput: 1204.8192771084337 estimated_peak_memory_range: min: 16384 - max: 21546912 + max: 21265472 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 11 - job_id: jn5q827m5 + job_id: j57yr9yn5 job_status: Passed torchscript_onnx_qnn: - inference_time: 636.0 - throughput: 1572.3270440251572 + inference_time: 638.0 + throughput: 1567.398119122257 estimated_peak_memory_range: - min: 208896 - max: 10675392 + min: 204800 + max: 12254000 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 11 - job_id: jlpe9vr0g + job_id: jp8qy12qp job_status: Passed torchscript_onnx: - inference_time: 987.0 - throughput: 1013.1712259371834 + inference_time: 997.0 + throughput: 1003.0090270812437 estimated_peak_memory_range: min: 0 - max: 22202016 + max: 21946496 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 13 - job_id: j0pxv1e9g + job_id: jgz3dn245 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:01:04Z' + timestamp: '2024-10-14T23:54:50Z' - torchscript_onnx_tflite: - inference_time: 1324.0 - throughput: 755.2870090634441 + inference_time: 1360.0 + throughput: 735.2941176470588 estimated_peak_memory_range: - min: 16384 - max: 3401384 + min: 24576 + max: 1367264 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 11 - job_id: j1glnk0lp + job_id: jp4lr3l25 job_status: Passed torchscript_onnx_qnn: - inference_time: 846.0 - throughput: 1182.033096926714 + inference_time: 863.0 + throughput: 1158.7485515643104 estimated_peak_memory_range: - min: 0 - max: 1412928 + min: 221184 + max: 1497272 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 11 - job_id: jz5wo9djp + job_id: j5q6qv0ep job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:00:58Z' + chipset: QCS8550 Proxy + timestamp: 
'2024-10-14T23:54:42Z' - torchscript_onnx_tflite: - inference_time: 1782.0 - throughput: 561.1672278338945 + inference_time: 1413.0 + throughput: 707.7140835102618 estimated_peak_memory_range: min: 16384 - max: 22383920 + max: 23004864 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 11 - job_id: jw5661375 + job_id: jprv3wvkg job_status: Passed torchscript_onnx_qnn: - inference_time: 1120.0 - throughput: 892.8571428571429 + inference_time: 877.0 + throughput: 1140.2508551881415 estimated_peak_memory_range: - min: 45056 - max: 11723152 + min: 28672 + max: 3997856 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 11 - job_id: jz57zdvrp + job_id: jp3j06nmg job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:01:02Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:54:45Z' - torchscript_onnx_tflite: - inference_time: 1297.0 - throughput: 771.0100231303007 + inference_time: 1361.0 + throughput: 734.7538574577517 estimated_peak_memory_range: - min: 16384 - max: 9217824 + min: 32768 + max: 1463672 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 11 - job_id: j1p3km4z5 + job_id: jgn6vk6j5 job_status: Passed torchscript_onnx_qnn: - inference_time: 864.0 - throughput: 1157.4074074074074 + inference_time: 863.0 + throughput: 1158.7485515643104 estimated_peak_memory_range: - min: 28672 - max: 4037384 + min: 233472 + max: 1538088 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 11 - job_id: jmg9v43v5 + job_id: j56y4w2np job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:00:59Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:54:44Z' - torchscript_onnx_tflite: - inference_time: 1429.0 - throughput: 699.7900629811056 + inference_time: 1316.0 + throughput: 759.8784194528876 estimated_peak_memory_range: - min: 16384 - max: 7894552 + min: 28672 + max: 1429920 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 11 - job_id: jwgoyv1d5 + job_id: j5mnx8n7p job_status: Passed torchscript_onnx_qnn: - inference_time: 858.0 - throughput: 1165.5011655011656 + inference_time: 872.0 + throughput: 1146.788990825688 estimated_peak_memory_range: min: 229376 - max: 1506264 + max: 1540536 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 11 - job_id: jnp108dl5 + job_id: jglvml425 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:01:00Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:54:43Z' - torchscript_onnx_tflite: - inference_time: 1356.0 - throughput: 737.4631268436578 + inference_time: 2807.0 + throughput: 356.2522265764161 estimated_peak_memory_range: - min: 1384448 - max: 2780032 + min: 16384 + max: 21002512 primary_compute_unit: NPU 
precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 11 - job_id: j1pv3w1m5 + job_id: jpxkoxk85 job_status: Passed torchscript_onnx_qnn: - inference_time: 879.0 - throughput: 1137.6564277588168 + inference_time: 1118.0 + throughput: 894.4543828264758 estimated_peak_memory_range: - min: 258048 - max: 1769616 + min: 208896 + max: 13023728 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,22 +329,75 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 11 - job_id: jvgdwvrl5 + job_id: jpv6k7qz5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:54:47Z' + - torchscript_onnx_tflite: + inference_time: 746.0 + throughput: 1340.4825737265417 + estimated_peak_memory_range: + min: 0 + max: 14930928 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 8 + layers_on_gpu: 0 + layers_on_cpu: 3 + total_layers: 11 + job_id: jpy13m10p + job_status: Passed + torchscript_onnx_qnn: + inference_time: 663.0 + throughput: 1508.2956259426849 + estimated_peak_memory_range: + min: 208896 + max: 9152912 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 11 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 11 + job_id: jgjvnqd1g + job_status: Passed + torchscript_onnx: + inference_time: 968.0 + throughput: 1033.0578512396694 + estimated_peak_memory_range: + min: 0 + max: 15075680 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 13 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 13 + job_id: jp14z62np + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:01:01Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:54:53Z' - torchscript_onnx_qnn: - inference_time: 935.0 - throughput: 1069.51871657754 + inference_time: 942.0 + throughput: 1061.5711252653928 estimated_peak_memory_range: - min: 212992 - max: 212992 + min: 204800 + max: 204800 primary_compute_unit: NPU precision: fp16 layer_info: @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 11 - job_id: jygze7x6g + job_id: jgkex8vvg job_status: Passed torchscript_onnx: - inference_time: 1453.0 - throughput: 688.2312456985547 + inference_time: 1468.0 + throughput: 681.1989100817439 estimated_peak_memory_range: - min: 8908800 - max: 8908800 + min: 8978432 + max: 8978432 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 13 - job_id: jo5mrzvqg + job_id: j5we64w45 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:01:05Z' + timestamp: '2024-10-14T23:54:51Z' diff --git a/qai_hub_models/models/quicksrnetsmall_quantized/README.md b/qai_hub_models/models/quicksrnetsmall_quantized/README.md index 9eb783fb..415b8d00 100644 --- a/qai_hub_models/models/quicksrnetsmall_quantized/README.md +++ b/qai_hub_models/models/quicksrnetsmall_quantized/README.md @@ -6,7 +6,7 @@ QuickSRNet Small is designed for upscaling images on mobile platforms to sharpen in real-time. 
This is based on the implementation of QuickSRNetSmall-Quantized found -[here](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/quicksrnet). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/quicksrnetsmall_quantized). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.quicksrnetsmall_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of QuickSRNetSmall-Quantized can be found +* The license for the original implementation of QuickSRNetSmall-Quantized can be found [here](https://github.com/quic/aimet-model-zoo/blob/develop/LICENSE.pdf). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [QuickSRNet: Plain Single-Image Super-Resolution Architecture for Faster Inference on Mobile Platforms](https://arxiv.org/abs/2303.04336) * [Source Model Implementation](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/quicksrnet) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/quicksrnetsmall_quantized/export.py b/qai_hub_models/models/quicksrnetsmall_quantized/export.py index 2fa4da5b..4b071aaf 100644 --- a/qai_hub_models/models/quicksrnetsmall_quantized/export.py +++ b/qai_hub_models/models/quicksrnetsmall_quantized/export.py @@ -10,17 +10,17 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.quicksrnetsmall_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.printing import ( print_inference_metrics, @@ -45,20 +45,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1.
Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -80,10 +78,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "quicksrnetsmall_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,7 +107,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( target_runtime, output_path, input_spec ) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -135,7 +133,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +148,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +169,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -198,7 +196,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/quicksrnetsmall_quantized/perf.yaml b/qai_hub_models/models/quicksrnetsmall_quantized/perf.yaml index 115392b3..9f7b027d 100644 --- a/qai_hub_models/models/quicksrnetsmall_quantized/perf.yaml +++ b/qai_hub_models/models/quicksrnetsmall_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,44 +20,41 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: QuickSRNetSmall-Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 1072.0 - throughput: 932.8358208955224 + inference_time: 1083.0 + throughput: 923.3610341643582 estimated_peak_memory_range: - min: 16384 - max: 2049624 + min: 196608 + max: 1592560 primary_compute_unit: NPU precision: int8 layer_info: @@ -64,29 +62,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 13 - job_id: j1p8o778g + job_id: j5we646m5 job_status: Passed torchscript_onnx_qnn: inference_time: 466.0 throughput: 2145.922746781116 estimated_peak_memory_range: - min: 53248 - max: 65720216 + min: 69632 + max: 2296112 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 7 + layers_on_npu: 11 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 7 - job_id: jlpe9vv0g + total_layers: 11 + job_id: j5mnx8x7p job_status: Passed torchscript_onnx: - inference_time: 706.0 - throughput: 1416.4305949008499 + inference_time: 649.0 + throughput: 1540.8320493066255 estimated_peak_memory_range: min: 65536 - max: 1305568 + max: 1385928 primary_compute_unit: NPU precision: int8 layer_info: @@ -94,7 +92,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 13 - job_id: jo5mrzwqg + job_id: jp3j06jmg job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -103,13 +101,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T12:00:26Z' + timestamp: '2024-10-14T23:54:05Z' - torchscript_onnx_tflite: - inference_time: 924.0 - throughput: 1082.2510822510822 + inference_time: 891.0 + 
throughput: 1122.334455667789 estimated_peak_memory_range: min: 12288 - max: 20972496 + max: 21241088 primary_compute_unit: NPU precision: int8 layer_info: @@ -117,29 +115,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 13 - job_id: jogkzyyog + job_id: jg9lndn8g job_status: Passed torchscript_onnx_qnn: - inference_time: 320.0 - throughput: 3125.0 + inference_time: 424.0 + throughput: 2358.490566037736 estimated_peak_memory_range: min: 65536 - max: 10499312 + max: 10591200 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 7 + layers_on_npu: 11 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 7 - job_id: jygze776g + total_layers: 11 + job_id: jgn6vkvj5 job_status: Passed torchscript_onnx: - inference_time: 526.0 - throughput: 1901.1406844106464 + inference_time: 489.0 + throughput: 2044.9897750511248 estimated_peak_memory_range: min: 0 - max: 21326464 + max: 21853616 primary_compute_unit: NPU precision: int8 layer_info: @@ -147,7 +145,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 13 - job_id: jegn2e9mg + job_id: jgo26821p job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -156,13 +154,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T12:00:27Z' + timestamp: '2024-10-14T23:54:06Z' - torchscript_onnx_tflite: - inference_time: 1096.0 - throughput: 912.4087591240876 + inference_time: 2310.0 + throughput: 432.9004329004329 estimated_peak_memory_range: - min: 24576 - max: 1363104 + min: 1601536 + max: 16737120 primary_compute_unit: NPU precision: int8 layer_info: @@ -170,37 +168,60 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 13 - job_id: jn5q822m5 + job_id: j57yr9rn5 job_status: Passed torchscript_onnx_qnn: - inference_time: 390.0 - throughput: 2564.102564102564 + inference_time: 957.0 + throughput: 1044.932079414838 estimated_peak_memory_range: - min: 77824 - max: 1338264 + min: 16384 + max: 7769488 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 7 + layers_on_npu: 11 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 7 - job_id: jmg9v44v5 + total_layers: 11 + job_id: jglvmlv25 job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T12:00:20Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-14T23:54:03Z' - torchscript_onnx_tflite: - inference_time: 1597.0 - throughput: 626.1740763932373 + inference_time: 10718.0 + throughput: 93.3009889904833 estimated_peak_memory_range: - min: 16384 - max: 22560000 + min: 1224704 + max: 4587208 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 10 + layers_on_gpu: 0 + layers_on_cpu: 3 + total_layers: 13 + job_id: jp4lr3r25 + job_status: Passed + reference_device_info: + name: RB5 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8250 Proxy + timestamp: '2024-10-14T23:53:51Z' + - torchscript_onnx_tflite: + inference_time: 1076.0 + throughput: 929.368029739777 + estimated_peak_memory_range: + min: 32768 + max: 1386368 primary_compute_unit: NPU precision: int8 layer_info: @@ -208,37 +229,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 13 - job_id: j1glnkklp + job_id: jp14z6z7p job_status: Passed torchscript_onnx_qnn: - inference_time: 545.0 - throughput: 1834.8623853211009 + inference_time: 385.0 + throughput: 2597.4025974025976 estimated_peak_memory_range: - min: 65536 - 
max: 12065680 + min: 0 + max: 1169280 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 7 + layers_on_npu: 11 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 7 - job_id: jqp4qwxlg + total_layers: 11 + job_id: jp2kyey6p job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T12:00:24Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:53:56Z' - torchscript_onnx_tflite: - inference_time: 1069.0 - throughput: 935.4536950420954 + inference_time: 1070.0 + throughput: 934.5794392523364 estimated_peak_memory_range: - min: 12288 - max: 5594104 + min: 24576 + max: 1480888 primary_compute_unit: NPU precision: int8 layer_info: @@ -246,37 +267,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 13 - job_id: jw5661175 + job_id: jp14z6znp job_status: Passed torchscript_onnx_qnn: inference_time: 390.0 throughput: 2564.102564102564 estimated_peak_memory_range: - min: 122880 - max: 1772512 + min: 73728 + max: 1311440 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 7 + layers_on_npu: 11 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 7 - job_id: jnp1088l5 + total_layers: 11 + job_id: jp8qy1qqp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T12:00:21Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:53:59Z' - torchscript_onnx_tflite: - inference_time: 1088.0 - throughput: 919.1176470588235 + inference_time: 1133.0 + throughput: 882.61253309797 estimated_peak_memory_range: - min: 40960 - max: 1450760 + min: 28672 + max: 1411208 primary_compute_unit: NPU precision: int8 layer_info: @@ -284,22 +305,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 13 - job_id: j1p3kmmz5 + job_id: jg9lndnmg job_status: Passed torchscript_onnx_qnn: - inference_time: 392.0 - throughput: 2551.0204081632655 + inference_time: 393.0 + throughput: 2544.529262086514 estimated_peak_memory_range: - min: 86016 - max: 1307992 + min: 81920 + max: 1500024 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 7 + layers_on_npu: 11 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 7 - job_id: jvgdwvvl5 + total_layers: 11 + job_id: jp0z06z05 job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -307,14 +328,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T12:00:22Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:53:58Z' - torchscript_onnx_tflite: - inference_time: 1067.0 - throughput: 937.207122774133 + inference_time: 1087.0 + throughput: 919.9632014719411 estimated_peak_memory_range: - min: 32768 - max: 1628848 + min: 16384 + max: 5843504 primary_compute_unit: NPU precision: int8 layer_info: @@ -322,37 +343,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 13 - job_id: jwgoyvvd5 + job_id: j5we64645 job_status: Passed torchscript_onnx_qnn: - inference_time: 396.0 - throughput: 2525.252525252525 + inference_time: 393.0 + throughput: 2544.529262086514 estimated_peak_memory_range: - min: 65536 - max: 1322144 + min: 81920 + max: 1487880 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 7 + layers_on_npu: 11 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 7 - job_id: jz57zdjrp + total_layers: 11 + 
job_id: jpy13m30p job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T12:00:23Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:53:57Z' - torchscript_onnx_tflite: - inference_time: 3216.0 - throughput: 310.9452736318408 + inference_time: 1451.0 + throughput: 689.1798759476223 estimated_peak_memory_range: - min: 16384 - max: 15786608 + min: 1634304 + max: 24164512 primary_compute_unit: NPU precision: int8 layer_info: @@ -360,37 +381,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 13 - job_id: j1pv3wwm5 + job_id: jgdx121zp job_status: Passed torchscript_onnx_qnn: - inference_time: 979.0 - throughput: 1021.4504596527069 + inference_time: 539.0 + throughput: 1855.287569573284 estimated_peak_memory_range: - min: 61440 - max: 7831568 + min: 0 + max: 11135840 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 7 + layers_on_npu: 11 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 7 - job_id: j0pxv179g + total_layers: 11 + job_id: j5q6qv6ep job_status: Passed reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T12:00:25Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:54:02Z' - torchscript_onnx_tflite: - inference_time: 11413.0 - throughput: 87.61938140716727 + inference_time: 1268.0 + throughput: 788.6435331230284 estimated_peak_memory_range: - min: 1712128 - max: 7502224 + min: 16384 + max: 14624368 primary_compute_unit: NPU precision: int8 layer_info: @@ -398,37 +419,67 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 13 - job_id: j7gjxll8p + job_id: jpxkoxo85 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 336.0 + throughput: 2976.190476190476 + estimated_peak_memory_range: + min: 86016 + max: 9037360 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 11 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 11 + job_id: j56y4wynp + job_status: Passed + torchscript_onnx: + inference_time: 444.0 + throughput: 2252.252252252252 + estimated_peak_memory_range: + min: 57344 + max: 14252928 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 13 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 13 + job_id: jpedmyd85 job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T12:00:16Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:54:09Z' - torchscript_onnx_qnn: - inference_time: 494.0 - throughput: 2024.2914979757086 + inference_time: 482.0 + throughput: 2074.688796680498 estimated_peak_memory_range: - min: 61440 - max: 61440 + min: 139264 + max: 139264 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 7 + layers_on_npu: 11 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 7 - job_id: jz5wo99jp + total_layers: 11 + job_id: jprv3w3kg job_status: Passed torchscript_onnx: - inference_time: 705.0 - throughput: 1418.4397163120568 + inference_time: 736.0 + throughput: 1358.695652173913 estimated_peak_memory_range: - min: 3301376 - max: 3301376 + min: 3375104 + max: 3375104 primary_compute_unit: NPU precision: int8 layer_info: @@ -436,7 +487,7 @@ 
models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 13 - job_id: joprky4e5 + job_id: jpv6k76z5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -445,4 +496,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T12:00:28Z' + timestamp: '2024-10-14T23:54:07Z'
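A note on reading the perf.yaml hunks in this PR: inference_time is reported in microseconds, throughput is simply its reciprocal in inferences per second, and the estimated_peak_memory_range values are bytes. A minimal reader-side sketch of that relationship (the helper name is ours, not code from this repo):

```python
# Reader-side sketch (not part of this repo): perf.yaml reports
# inference_time in microseconds; throughput is inferences per second.
def throughput_from_inference_time_us(inference_time_us: float) -> float:
    """Convert a perf.yaml inference_time (microseconds) to inferences/sec."""
    return 1e6 / inference_time_us

# Matches the pairs in the hunks above, e.g. 736.0 us -> 1358.695652173913/s.
assert abs(throughput_from_inference_time_us(736.0) - 1358.695652173913) < 1e-9
```

This is why every throughput change in these hunks moves inversely to its inference_time change.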
diff --git a/qai_hub_models/models/qwen2_7b_instruct_quantized/README.md b/qai_hub_models/models/qwen2_7b_instruct_quantized/README.md new file mode 100644 index 00000000..83dc0e06 --- /dev/null +++ b/qai_hub_models/models/qwen2_7b_instruct_quantized/README.md @@ -0,0 +1,61 @@ +[![Qualcomm® AI Hub Models](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/quic-logo.jpg)](../../README.md) + + +# [Qwen2-7B-Instruct: State-of-the-art large language model useful on a variety of language understanding and generation tasks](https://aihub.qualcomm.com/models/qwen2_7b_instruct_quantized) + +Qwen2-7B-Instruct is a state-of-the-art multilingual language model with 7.07 billion parameters, excelling in language understanding, generation, coding, and mathematics. AI Hub provides four QNN context binaries (shared weights) that can be deployed on Snapdragon 8 Elite with the Genie SDK. + +This is based on the implementation of Qwen2-7B-Instruct found +[here](https://github.com/QwenLM/Qwen2.5). This repository contains scripts for optimized on-device +export suitable to run on Qualcomm® devices. More details on model performance +across various devices can be found [here](https://aihub.qualcomm.com/models/qwen2_7b_instruct_quantized). + +[Sign up](https://myaccount.qualcomm.com/signup) to start using Qualcomm AI Hub and run these models on a hosted Qualcomm® device. + +## Deploying Qwen2-7B-Instruct on-device + +Please follow the [LLM on-device deployment](https://github.com/quic/ai-hub-apps/tree/main/tutorials/llm_on_genie) tutorial. + + + + + +## License +* The license for the original implementation of Qwen2-7B-Instruct can be found + [here](https://huggingface.co/Qwen/Qwen2-7B-Instruct/blob/main/LICENSE). +* The license for the compiled assets for on-device deployment can be found [here](https://huggingface.co/Qwen/Qwen2-7B-Instruct/blob/main/LICENSE) + + +## References +* [Qwen2 Technical Report](https://arxiv.org/abs/2407.10671v1) +* [Source Model Implementation](https://github.com/QwenLM/Qwen2.5) + + + +## Community +* Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. +* For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). + + +## Usage and Limitations + +This model may not be used for or in connection with any of the following applications: + +- Accessing essential private and public services and benefits; +- Administration of justice and democratic processes; +- Assessing or recognizing the emotional state of a person; +- Biometric and biometrics-based systems, including categorization of persons based on sensitive characteristics; +- Education and vocational training; +- Employment and workers management; +- Exploitation of the vulnerabilities of persons resulting in harmful behavior; +- General purpose social scoring; +- Law enforcement; +- Management and operation of critical infrastructure; +- Migration, asylum and border control management; +- Predictive policing; +- Real-time remote biometric identification in public spaces; +- Recommender systems of social media platforms; +- Scraping of facial images (from the internet or otherwise); and/or +- Subliminal manipulation + + diff --git a/qai_hub_models/models/qwen2_7b_instruct_quantized/info.yaml b/qai_hub_models/models/qwen2_7b_instruct_quantized/info.yaml new file mode 100644 index 00000000..b20e93e9 --- /dev/null +++ b/qai_hub_models/models/qwen2_7b_instruct_quantized/info.yaml @@ -0,0 +1,59 @@ +name: Qwen2-7B-Instruct +id: qwen2_7b_instruct_quantized +status: public +headline: State-of-the-art large language model useful on a variety of language + understanding and generation tasks. +domain: Generative AI +description: Qwen2-7B-Instruct is a state-of-the-art multilingual language model with 7.07 billion parameters, excelling in language understanding, generation, coding, and mathematics. AI Hub provides four QNN context binaries (shared weights) that can be deployed on Snapdragon 8 Elite with the Genie SDK. +use_case: Text Generation +tags: + - llm + - generative-ai + - quantized +research_paper: https://arxiv.org/abs/2407.10671v1 +research_paper_title: "Qwen2 Technical Report" +license: https://huggingface.co/Qwen/Qwen2-7B-Instruct/blob/main/LICENSE +deploy_license: https://huggingface.co/Qwen/Qwen2-7B-Instruct/blob/main/LICENSE +source_repo: https://github.com/QwenLM/Qwen2.5 +technical_details: + Input sequence length for Prompt Processor: 128 + Context length: 4096 + Number of parameters: 7.07B + Precision: w4a16 + w8a16 (few layers) + Num of key-value heads: 8 + Information about the model parts: Prompt Processor and Token Generator are split into 5 parts each. Each corresponding Prompt Processor and Token Generator part share weights. + Prompt processor model size: 5.16 GB + Prompt processor input (part1): 128 tokens + Prompt processor output (part1): Embeddings output + Prompt processor input (other parts): 128 tokens + KVCache initialized with pad token + Prompt processor output (other parts): 128 output tokens + KVCache for token generator + Token generator model size: 5.16 GB + Token generator input (part1): 128 tokens + Token generator output (part1): Embeddings output + Token generator input (other parts): 1 input token + past KVCache + Token generator output (other parts): 1 output token + KVCache for next iteration + Use: Initiate conversation with prompt-processor and then token generator for subsequent iterations. + Minimum QNN SDK version required: 2.27.7 + Supported languages: English, Chinese, German, French, Spanish, Portuguese, Italian, Dutch, Russian, Czech, Polish, Arabic, Persian, Hebrew, Turkish, Japanese, Korean, Vietnamese, Thai, Indonesian, Malay, Lao, Burmese, Cebuano, Khmer, Tagalog, Hindi, Bengali, Urdu.
+ TTFT: Time To First Token is the time it takes to generate the first response token. This is expressed as a range because it varies based on the length of the prompt. The lower bound is for a short prompt (up to 128 tokens, i.e., one iteration of the prompt processor) and the upper bound is for a prompt using the full context length (4096 tokens). + Response Rate: Rate of response generation after the first response token. +applicable_scenarios: + - Dialogue + - Content Generation + - Customer Support +related_models: [] +form_factors: + - Phone + - Tablet +has_static_banner: true +has_animated_banner: true +license_type: apache-2.0 +deploy_license_type: apache-2.0 +dataset: [] +model_type_llm: true +llm_details: + call_to_action: 'download' + genie_compatible: true + Snapdragon 8 Elite QRD: + torchscript_onnx_qnn: + model_download_url: v2/snapdragon_8_elite/models.zip diff --git a/qai_hub_models/models/qwen2_7b_instruct_quantized/perf.yaml b/qai_hub_models/models/qwen2_7b_instruct_quantized/perf.yaml new file mode 100644 index 00000000..f3cfce2c --- /dev/null +++ b/qai_hub_models/models/qwen2_7b_instruct_quantized/perf.yaml @@ -0,0 +1,24 @@ +aggregated: + supported_oses: + - Android + supported_devices: + - Snapdragon 8 Elite QRD + supported_chipsets: + - Snapdragon® 8 Elite +models: + name: '' + performance_metrics: + - torchscript_onnx_qnn: + llm_metrics: + time_to_first_token_range: + min: 170593 + max: 5458976 + tokens_per_second: 13.65 + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-16T00:32:42.210701Z'
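To make the llm_metrics block above concrete: time_to_first_token_range is in microseconds, and per the technical_details in info.yaml the lower bound corresponds to a short prompt (one 128-token prompt-processor iteration) while the upper bound corresponds to a prompt filling the full 4096-token context. A rough reader-side latency estimate under those assumptions (the function and example values are ours, not repo code):

```python
# Reader-side sketch (not part of this repo), using the llm_metrics above.
TTFT_MIN_US = 170_593      # time to first token, short prompt (<= 128 tokens)
TTFT_MAX_US = 5_458_976    # time to first token, full 4096-token context
TOKENS_PER_SECOND = 13.65  # generation rate after the first token

def estimated_response_seconds(ttft_us: float, generated_tokens: int) -> float:
    """Time to first token plus steady-state generation time, in seconds."""
    return ttft_us / 1e6 + generated_tokens / TOKENS_PER_SECOND

# A 100-token reply to a short prompt: ~0.17 s TTFT + ~7.3 s of generation.
print(f"{estimated_response_seconds(TTFT_MIN_US, 100):.1f} s")  # ~7.5 s
```

In other words, TTFT spans roughly 0.17 s to 5.46 s depending on prompt length, and steady-state decoding adds about 73 ms per generated token.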
diff --git a/qai_hub_models/models/real_esrgan_general_x4v3/README.md b/qai_hub_models/models/real_esrgan_general_x4v3/README.md index 87d2214e..34138a03 100644 --- a/qai_hub_models/models/real_esrgan_general_x4v3/README.md +++ b/qai_hub_models/models/real_esrgan_general_x4v3/README.md @@ -6,7 +6,7 @@ Real-ESRGAN is a machine learning model that upscales an image with minimal loss in quality. This is based on the implementation of Real-ESRGAN-General-x4v3 found -[here](https://github.com/xinntao/Real-ESRGAN/tree/master). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/real_esrgan_general_x4v3). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.real_esrgan_general_x4v3.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Real-ESRGAN-General-x4v3 can be found +* The license for the original implementation of Real-ESRGAN-General-x4v3 can be found [here](https://github.com/xinntao/Real-ESRGAN/blob/master/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data](https://arxiv.org/abs/2107.10833) * [Source Model Implementation](https://github.com/xinntao/Real-ESRGAN/tree/master) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/real_esrgan_general_x4v3/export.py b/qai_hub_models/models/real_esrgan_general_x4v3/export.py index ea7905fe..e21e758b 100644 --- a/qai_hub_models/models/real_esrgan_general_x4v3/export.py +++ b/qai_hub_models/models/real_esrgan_general_x4v3/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.real_esrgan_general_x4v3 import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped).
+ * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "real_esrgan_general_x4v3" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/real_esrgan_general_x4v3/perf.yaml b/qai_hub_models/models/real_esrgan_general_x4v3/perf.yaml index 36f1b74d..7fbb3e8c 100644 --- a/qai_hub_models/models/real_esrgan_general_x4v3/perf.yaml +++ b/qai_hub_models/models/real_esrgan_general_x4v3/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: Real-ESRGAN-General-x4v3 performance_metrics: - torchscript_onnx_tflite: - inference_time: 7154.0 - throughput: 139.78194017332962 + inference_time: 7337.0 + throughput: 136.2954886193267 estimated_peak_memory_range: - min: 9478144 - max: 112291720 + min: 8445952 + max: 9903736 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 72 - job_id: jwgoyv345 + job_id: jpv6k9xr5 job_status: Passed torchscript_onnx_qnn: - inference_time: 6287.0 - throughput: 159.0583744234134 + inference_time: 6333.0 + throughput: 157.9030475288173 estimated_peak_memory_range: - min: 28672 - max: 8153240 + min: 36864 + max: 12837416 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 72 - job_id: jnp108ek5 + job_id: jp4lr9015 job_status: Passed torchscript_onnx: - inference_time: 6955.0 - throughput: 143.78145219266713 + inference_time: 6784.0 + throughput: 147.4056603773585 estimated_peak_memory_range: - min: 12288 - max: 21765248 + min: 9166848 + max: 10588528 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 74 - job_id: j0pxv119g + job_id: j5q6qmoop job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:59:46Z' + timestamp: '2024-10-14T23:53:17Z' - torchscript_onnx_tflite: - inference_time: 5974.0 - throughput: 167.39203213927016 + inference_time: 5996.0 + throughput: 166.77785190126752 estimated_peak_memory_range: - min: 9457664 - max: 71029904 + min: 
9461760 + max: 73897776 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 72 - job_id: j1pv3wv75 + job_id: jgjvnw4eg job_status: Passed torchscript_onnx_qnn: - inference_time: 5196.0 - throughput: 192.4557351809084 + inference_time: 5195.0 + throughput: 192.49278152069297 estimated_peak_memory_range: - min: 212992 - max: 17542816 + min: 208896 + max: 19567936 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 72 - job_id: jvgdwvok5 + job_id: jpxkod2l5 job_status: Passed torchscript_onnx: - inference_time: 5828.0 - throughput: 171.58544955387782 + inference_time: 5664.0 + throughput: 176.5536723163842 estimated_peak_memory_range: - min: 4751360 - max: 72541696 + min: 4780032 + max: 77508288 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 74 - job_id: jo5mrzzqg + job_id: jglvmlmm5 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:59:47Z' + timestamp: '2024-10-14T23:53:18Z' - torchscript_onnx_tflite: - inference_time: 7253.0 - throughput: 137.87398317937405 + inference_time: 7333.0 + throughput: 136.36983499249965 estimated_peak_memory_range: - min: 9461760 - max: 10875304 + min: 9490432 + max: 10871472 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 72 - job_id: j7gjxle7p + job_id: jpedml3v5 job_status: Passed torchscript_onnx_qnn: - inference_time: 5754.0 - throughput: 173.79214459506431 + inference_time: 5743.0 + throughput: 174.1250217656277 estimated_peak_memory_range: min: 270336 - max: 1588352 + max: 5138304 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 72 - job_id: jmg9v4wv5 + job_id: jgn6v78q5 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:59:41Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:53:09Z' - torchscript_onnx_tflite: - inference_time: 12097.0 - throughput: 82.66512358435976 + inference_time: 7257.0 + throughput: 137.79798814937303 estimated_peak_memory_range: - min: 9474048 - max: 75480208 + min: 9478144 + max: 113089928 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 72 - job_id: jlpe9vk7g + job_id: jp14zvx7p job_status: Passed torchscript_onnx_qnn: - inference_time: 9659.0 - throughput: 103.53038616834041 + inference_time: 5789.0 + throughput: 172.74140611504578 estimated_peak_memory_range: - min: 548864 - max: 24289712 + min: 266240 + max: 1612080 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 72 - job_id: jqp4qwwlg + job_id: jpy1370lp job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:59:45Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:53:13Z' - torchscript_onnx_tflite: - inference_time: 7232.0 - throughput: 
138.27433628318585 + inference_time: 7390.0 + throughput: 135.31799729364005 estimated_peak_memory_range: - min: 9453568 - max: 18787984 + min: 9482240 + max: 16018264 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,52 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 72 - job_id: jygze7rzg + job_id: jg9lnxe8g + job_status: Passed + torchscript_onnx_qnn: + inference_time: 5761.0 + throughput: 173.58097552508247 + estimated_peak_memory_range: + min: 229376 + max: 1589192 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 72 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 72 + job_id: jp2kyvnqp + job_status: Passed + reference_device_info: + name: SA8775 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:53:12Z' + - torchscript_onnx_tflite: + inference_time: 7297.0 + throughput: 137.04262025489928 + estimated_peak_memory_range: + min: 9490432 + max: 13783512 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 69 + layers_on_gpu: 0 + layers_on_cpu: 3 + total_layers: 72 + job_id: j5we61nm5 job_status: Passed torchscript_onnx_qnn: inference_time: 5771.0 throughput: 173.28019407381737 estimated_peak_memory_range: - min: 233472 - max: 2006128 + min: 225280 + max: 1545880 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,7 +291,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 72 - job_id: jnp108el5 + job_id: jprv3nj7g job_status: Passed reference_device_info: name: SA8650 (Proxy) @@ -263,14 +299,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:59:42Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:53:11Z' - torchscript_onnx_tflite: - inference_time: 7216.0 - throughput: 138.5809312638581 + inference_time: 11015.0 + throughput: 90.78529278256923 estimated_peak_memory_range: - min: 9502720 - max: 11002432 + min: 9469952 + max: 80649056 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 72 - job_id: jz5wo9qzp + job_id: jgz3d4kx5 job_status: Passed torchscript_onnx_qnn: - inference_time: 5792.0 - throughput: 172.65193370165747 + inference_time: 9657.0 + throughput: 103.55182768975872 estimated_peak_memory_range: - min: 270336 - max: 1533720 + min: 208896 + max: 28375632 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +329,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 72 - job_id: jvgdwvol5 + job_id: jp8qy4vop job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:59:43Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:53:15Z' - torchscript_onnx_tflite: - inference_time: 7145.0 - throughput: 139.95801259622112 + inference_time: 4187.0 + throughput: 238.83448770002389 estimated_peak_memory_range: - min: 9465856 - max: 22162832 + min: 12288 + max: 30248944 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +352,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 72 - job_id: jmg9v4wq5 + job_id: j57yr7395 job_status: Passed torchscript_onnx_qnn: - inference_time: 5766.0 - throughput: 173.4304543877905 + inference_time: 3583.0 + throughput: 279.09572983533354 estimated_peak_memory_range: - min: 286720 - max: 1568128 + 
min: 0 + max: 19120256 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,22 +367,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 72 - job_id: jz57zddrp + job_id: jgkex9mng + job_status: Passed + torchscript_onnx: + inference_time: 4635.0 + throughput: 215.7497303128371 + estimated_peak_memory_range: + min: 7495680 + max: 36413632 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 74 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 74 + job_id: jgo2686kp job_status: Passed reference_device_info: - name: SA8255 (Proxy) - os: '13' - form_factor: Auto + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:59:44Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:53:21Z' - torchscript_onnx_qnn: - inference_time: 6137.0 - throughput: 162.94606485253382 + inference_time: 6160.0 + throughput: 162.33766233766235 estimated_peak_memory_range: - min: 212992 - max: 212992 + min: 204800 + max: 204800 primary_compute_unit: NPU precision: fp16 layer_info: @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 72 - job_id: jz5wo9qjp + job_id: j5mnxdy9p job_status: Passed torchscript_onnx: - inference_time: 7041.0 - throughput: 142.02528049992898 + inference_time: 7052.0 + throughput: 141.80374361883153 estimated_peak_memory_range: - min: 8908800 - max: 8908800 + min: 8912896 + max: 8912896 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 74 - job_id: jegn2eemg + job_id: j56y4w4yp job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:59:48Z' + timestamp: '2024-10-14T23:53:19Z' diff --git a/qai_hub_models/models/real_esrgan_x4plus/README.md b/qai_hub_models/models/real_esrgan_x4plus/README.md index 8dd6beec..f3544722 100644 --- a/qai_hub_models/models/real_esrgan_x4plus/README.md +++ b/qai_hub_models/models/real_esrgan_x4plus/README.md @@ -6,7 +6,7 @@ Real-ESRGAN is a machine learning model that upscales an image with minimal loss in quality. The implementation is a derivative of the Real-ESRGAN-x4plus architecture, a larger and more powerful version compared to the Real-ESRGAN-general-x4v3 architecture. This is based on the implementation of Real-ESRGAN-x4plus found -[here](https://github.com/xinntao/Real-ESRGAN). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/real_esrgan_x4plus). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.real_esrgan_x4plus.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Real-ESRGAN-x4plus can be found +* The license for the original implementation of Real-ESRGAN-x4plus can be found [here](https://github.com/xinntao/Real-ESRGAN/blob/master/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data](https://arxiv.org/abs/2107.10833) * [Source Model Implementation](https://github.com/xinntao/Real-ESRGAN) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/real_esrgan_x4plus/export.py b/qai_hub_models/models/real_esrgan_x4plus/export.py index 97f68065..ae0c31c0 100644 --- a/qai_hub_models/models/real_esrgan_x4plus/export.py +++ b/qai_hub_models/models/real_esrgan_x4plus/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.real_esrgan_x4plus import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
""" model_name = "real_esrgan_x4plus" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/real_esrgan_x4plus/perf.yaml b/qai_hub_models/models/real_esrgan_x4plus/perf.yaml index 3b6b172a..f15b2ca1 100644 --- a/qai_hub_models/models/real_esrgan_x4plus/perf.yaml +++ b/qai_hub_models/models/real_esrgan_x4plus/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: Real-ESRGAN-x4plus performance_metrics: - torchscript_onnx_tflite: - inference_time: 67256.0 - throughput: 14.868561912691804 + inference_time: 68798.0 + throughput: 14.535306258902875 estimated_peak_memory_range: - min: 3162112 - max: 11224464 + min: 3244032 + max: 6182328 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1028 - job_id: j7gjxl27p + job_id: jgkex9zng job_status: Passed torchscript_onnx_qnn: - inference_time: 70685.0 - throughput: 14.147273113107449 + inference_time: 67503.0 + throughput: 14.814156407863354 estimated_peak_memory_range: - min: 40960 - max: 36487680 + min: 233472 + max: 114789384 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1029 - job_id: jz57zdlqp + job_id: jgz3d49x5 job_status: Passed torchscript_onnx: - inference_time: 68808.0 - throughput: 14.533193814672712 + inference_time: 70730.0 + throughput: 14.138272303124559 estimated_peak_memory_range: min: 118784 - max: 44608840 + max: 44736312 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1030 - job_id: j2p0yrl2g + job_id: jpy137wlp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:59:08Z' + timestamp: '2024-10-14T23:52:31Z' - torchscript_onnx_tflite: - inference_time: 55698.0 - throughput: 17.953966031096268 + inference_time: 55834.0 + throughput: 17.910233907654835 estimated_peak_memory_range: - min: 3260416 - max: 629103440 + min: 3289088 + max: 693883200 
primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1028 - job_id: jlpe9vw7g + job_id: j5q6qm8op job_status: Passed torchscript_onnx_qnn: - inference_time: 56796.0 - throughput: 17.606873723501653 + inference_time: 55888.0 + throughput: 17.892928714572 estimated_peak_memory_range: - min: 86016 - max: 98274560 + min: 69632 + max: 113449408 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1029 - job_id: jqp4qwdqg + job_id: jg9lnx18g job_status: Passed torchscript_onnx: - inference_time: 56604.0 - throughput: 17.666596000282667 + inference_time: 55527.0 + throughput: 18.0092567579736 estimated_peak_memory_range: - min: 6447104 - max: 656213632 + min: 8118272 + max: 731345728 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1030 - job_id: j1p8o7zzg + job_id: jp0z0vqn5 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:59:09Z' + timestamp: '2024-10-14T23:52:32Z' - torchscript_onnx_tflite: - inference_time: 63513.0 - throughput: 15.744808149512698 + inference_time: 61376.0 + throughput: 16.29301355578728 estimated_peak_memory_range: - min: 3203072 - max: 7122240 + min: 1331200 + max: 4638360 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1028 - job_id: jygze7jzg + job_id: jglvm1zm5 job_status: Passed torchscript_onnx_qnn: - inference_time: 63644.0 - throughput: 15.712400226258563 + inference_time: 62924.0 + throughput: 15.892187400673828 estimated_peak_memory_range: - min: 380928 - max: 1577664 + min: 397312 + max: 1716056 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1029 - job_id: jo5mrz6yg + job_id: jgdx1z9zp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:59:03Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:52:23Z' - torchscript_onnx_tflite: - inference_time: 161796.0 - throughput: 6.180622512299439 + inference_time: 66879.0 + throughput: 14.95237668027333 estimated_peak_memory_range: - min: 3280896 - max: 591519728 + min: 3289088 + max: 5565552 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1028 - job_id: jz5wo93zp + job_id: jpv6k9or5 job_status: Passed torchscript_onnx_qnn: - inference_time: 126507.0 - throughput: 7.904700925640478 + inference_time: 63674.0 + throughput: 15.704997330150453 estimated_peak_memory_range: - min: 348160 - max: 80814512 + min: 372736 + max: 2007560 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1029 - job_id: jqpyed6rg + job_id: jpxkodjl5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:59:07Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:52:26Z' - torchscript_onnx_tflite: - inference_time: 67146.0 - 
throughput: 14.892919905876747 + inference_time: 63276.0 + throughput: 15.803780264239206 estimated_peak_memory_range: - min: 3174400 - max: 54688664 + min: 3256320 + max: 5876512 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1028 - job_id: jmg9v4yq5 + job_id: jgo2640kp job_status: Passed torchscript_onnx_qnn: - inference_time: 64154.0 - throughput: 15.587492595941017 + inference_time: 63755.0 + throughput: 15.685044310250177 estimated_peak_memory_range: - min: 397312 - max: 1547784 + min: 434176 + max: 1675216 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1029 - job_id: jegn2e3vg + job_id: jp4lr9o15 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:59:04Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:52:25Z' - torchscript_onnx_tflite: - inference_time: 65483.0 - throughput: 15.27113907426355 + inference_time: 66934.0 + throughput: 14.940090238145038 estimated_peak_memory_range: - min: 3223552 - max: 5782528 + min: 3264512 + max: 6511304 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1028 - job_id: jnp108wk5 + job_id: jp3j0w3ng job_status: Passed torchscript_onnx_qnn: - inference_time: 63668.0 - throughput: 15.70647735125966 + inference_time: 63721.0 + throughput: 15.693413474364808 estimated_peak_memory_range: - min: 413696 - max: 1686896 + min: 409600 + max: 1722368 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1029 - job_id: joprkyev5 + job_id: j57yr7w95 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:59:05Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:52:24Z' - torchscript_onnx_tflite: - inference_time: 65448.0 - throughput: 15.279305708348613 + inference_time: 143595.0 + throughput: 6.964030781016052 estimated_peak_memory_range: - min: 3182592 - max: 10550792 + min: 0 + max: 647411888 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1028 - job_id: jvgdwvqk5 + job_id: j56y4djyp job_status: Passed torchscript_onnx_qnn: - inference_time: 63341.0 - throughput: 15.787562558216637 + inference_time: 138583.0 + throughput: 7.215892281160027 estimated_peak_memory_range: - min: 434176 - max: 1651824 + min: 299008 + max: 89498112 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,22 +329,75 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1029 - job_id: jep28mlxp + job_id: jprv3nq7g job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:52:29Z' + - torchscript_onnx_tflite: + inference_time: 42951.0 + throughput: 23.282344997788176 + estimated_peak_memory_range: + min: 3158016 + max: 192916848 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 1028 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 1028 + job_id: jpedml1v5 + 
job_status: Passed + torchscript_onnx_qnn: + inference_time: 43454.0 + throughput: 23.012841165370276 + estimated_peak_memory_range: + min: 12288 + max: 135567936 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 1029 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 1029 + job_id: jp2kyv6qp + job_status: Passed + torchscript_onnx: + inference_time: 43103.0 + throughput: 23.20024128250934 + estimated_peak_memory_range: + min: 0 + max: 185570864 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 1030 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 1030 + job_id: j5q6qmkop + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:59:06Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:52:35Z' - torchscript_onnx_qnn: - inference_time: 65087.0 - throughput: 15.364051193018575 + inference_time: 65203.0 + throughput: 15.33671763569161 estimated_peak_memory_range: - min: 212992 - max: 212992 + min: 237568 + max: 237568 primary_compute_unit: NPU precision: fp16 layer_info: @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1029 - job_id: j0pxv16jg + job_id: jp14zvl7p job_status: Passed torchscript_onnx: - inference_time: 65518.0 - throughput: 15.262981165481241 + inference_time: 65666.0 + throughput: 15.228581000822343 estimated_peak_memory_range: - min: 39645184 - max: 39645184 + min: 40755200 + max: 40755200 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1030 - job_id: jogkzy3yg + job_id: jp8qy49op job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:59:10Z' + timestamp: '2024-10-14T23:52:33Z' diff --git a/qai_hub_models/models/regnet/README.md b/qai_hub_models/models/regnet/README.md index 9186826f..e9368f1b 100644 --- a/qai_hub_models/models/regnet/README.md +++ b/qai_hub_models/models/regnet/README.md @@ -6,7 +6,7 @@ RegNet is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of RegNet found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/regnet.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/regnet). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.regnet.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of RegNet can be found +* The license for the original implementation of RegNet can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Designing Network Design Spaces](https://arxiv.org/abs/2003.13678) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/regnet.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/regnet/export.py b/qai_hub_models/models/regnet/export.py index 21045695..fea11d5c 100644 --- a/qai_hub_models/models/regnet/export.py +++ b/qai_hub_models/models/regnet/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.regnet import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
""" model_name = "regnet" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/regnet/perf.yaml b/qai_hub_models/models/regnet/perf.yaml index dad1feeb..8140a459 100644 --- a/qai_hub_models/models/regnet/perf.yaml +++ b/qai_hub_models/models/regnet/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: RegNet performance_metrics: - torchscript_onnx_tflite: - inference_time: 2067.0 - throughput: 483.7929366231253 + inference_time: 2075.0 + throughput: 481.9277108433735 estimated_peak_memory_range: - min: 24576 - max: 2068992 + min: 12288 + max: 7107872 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 114 - job_id: jygze7ozg + job_id: jp2kyv2qp job_status: Passed torchscript_onnx_qnn: - inference_time: 2125.0 - throughput: 470.5882352941176 + inference_time: 2149.0 + throughput: 465.33271288971616 estimated_peak_memory_range: - min: 16384 - max: 72977072 + min: 618496 + max: 62817144 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 188 - job_id: j0pxv1njg + job_id: jgo264ykp job_status: Passed torchscript_onnx: - inference_time: 2285.0 - throughput: 437.636761487965 + inference_time: 2197.0 + throughput: 455.1661356395084 estimated_peak_memory_range: - min: 28672 - max: 2310240 + min: 499712 + max: 44367848 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 190 - job_id: jogkzyqyg + job_id: jp4lr9q15 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:58:14Z' + timestamp: '2024-10-14T23:51:30Z' - torchscript_onnx_tflite: - inference_time: 1600.0 - throughput: 625.0 + inference_time: 1601.0 + throughput: 624.6096189881324 estimated_peak_memory_range: min: 16384 - max: 145930384 + max: 149556896 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 
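One convention worth spelling out for the perf.yaml hunks that follow: `inference_time` is reported in microseconds and `throughput` in inferences per second, so each pair is redundant, related by a factor of 10^6. A quick check against the Galaxy S23 TFLite entry above:

```python
# perf.yaml convention: inference_time in microseconds, throughput in inferences/s.
inference_time_us = 2075.0  # torchscript_onnx_tflite on Samsung Galaxy S23
throughput = 1_000_000 / inference_time_us
print(throughput)  # 481.9277108433735, matching the YAML value
```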
layers_on_cpu: 0 total_layers: 114 - job_id: jz5wo92zp + job_id: jpy1379lp job_status: Passed torchscript_onnx_qnn: - inference_time: 1666.0 - throughput: 600.2400960384153 + inference_time: 1652.0 + throughput: 605.3268765133172 estimated_peak_memory_range: - min: 638976 - max: 29361840 + min: 618496 + max: 28703200 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 188 - job_id: jo5mrzqyg + job_id: jpv6k93r5 job_status: Passed torchscript_onnx: - inference_time: 1780.0 - throughput: 561.7977528089888 + inference_time: 1866.0 + throughput: 535.9056806002144 estimated_peak_memory_range: - min: 589824 - max: 149330512 + min: 368640 + max: 153487456 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 190 - job_id: jn5q82r75 + job_id: jpxkodvl5 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:58:14Z' + timestamp: '2024-10-14T23:51:31Z' - torchscript_onnx_tflite: - inference_time: 2035.0 - throughput: 491.4004914004914 + inference_time: 2041.0 + throughput: 489.9559039686428 estimated_peak_memory_range: - min: 192512 - max: 1638264 + min: 28672 + max: 1897064 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 114 - job_id: jmg9v4jq5 + job_id: jp0z0vnn5 job_status: Passed torchscript_onnx_qnn: - inference_time: 2027.0 - throughput: 493.33991119881597 + inference_time: 2036.0 + throughput: 491.1591355599214 estimated_peak_memory_range: - min: 634880 - max: 1938752 + min: 663552 + max: 2116376 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 188 - job_id: joprky2v5 + job_id: jpedml9v5 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:58:08Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:51:23Z' - torchscript_onnx_tflite: - inference_time: 2794.0 - throughput: 357.9098067287044 + inference_time: 2039.0 + throughput: 490.43648847474253 estimated_peak_memory_range: - min: 12288 - max: 128889488 + min: 139264 + max: 2210168 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 114 - job_id: jnp108yk5 + job_id: jglvm1nm5 job_status: Passed torchscript_onnx_qnn: - inference_time: 2946.0 - throughput: 339.44331296673454 + inference_time: 2039.0 + throughput: 490.43648847474253 estimated_peak_memory_range: - min: 618496 - max: 25499136 + min: 36864 + max: 1334920 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 188 - job_id: j1p8o7mzg + job_id: jg9lnxv8g job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:58:13Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:51:26Z' - torchscript_onnx_tflite: - inference_time: 2068.0 - throughput: 483.55899419729207 + inference_time: 2037.0 + throughput: 490.9180166912126 estimated_peak_memory_range: - min: 28672 - 
max: 1579576 + min: 16384 + max: 15438464 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 114 - job_id: jvgdwvek5 + job_id: j5q6qmjop job_status: Passed torchscript_onnx_qnn: - inference_time: 2047.0 - throughput: 488.5197850512946 + inference_time: 2028.0 + throughput: 493.0966469428008 estimated_peak_memory_range: - min: 638976 - max: 1993208 + min: 671744 + max: 1904200 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 188 - job_id: jep28m9xp + job_id: j5we61om5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:58:09Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:51:25Z' - torchscript_onnx_tflite: - inference_time: 2043.0 - throughput: 489.47626040137055 + inference_time: 2028.0 + throughput: 493.0966469428008 estimated_peak_memory_range: - min: 16384 - max: 2353856 + min: 28672 + max: 1793128 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 114 - job_id: jz57zd0qp + job_id: jgkex9jng job_status: Passed torchscript_onnx_qnn: - inference_time: 2034.0 - throughput: 491.6420845624385 + inference_time: 2042.0 + throughput: 489.71596474045054 estimated_peak_memory_range: - min: 630784 - max: 2004232 + min: 626688 + max: 2222304 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 188 - job_id: jqpyedjrg + job_id: jgz3d4ex5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:58:11Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:51:24Z' - torchscript_onnx_tflite: - inference_time: 2044.0 - throughput: 489.23679060665364 + inference_time: 2808.0 + throughput: 356.1253561253561 estimated_peak_memory_range: min: 16384 - max: 1996704 + max: 131330864 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 114 - job_id: jqp4qwkqg + job_id: jp8qy4lop job_status: Passed torchscript_onnx_qnn: - inference_time: 2027.0 - throughput: 493.33991119881597 + inference_time: 2905.0 + throughput: 344.2340791738382 estimated_peak_memory_range: - min: 643072 - max: 1875728 + min: 741376 + max: 23528016 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 188 - job_id: j2p0yr22g + job_id: jgdx1zwzp job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:51:28Z' + - torchscript_onnx_tflite: + inference_time: 1363.0 + throughput: 733.6757153338225 + estimated_peak_memory_range: + min: 12288 + max: 74559008 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 114 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 114 + job_id: jp3j0wkng + job_status: Passed + torchscript_onnx_qnn: + inference_time: 1448.0 + throughput: 690.6077348066299 + estimated_peak_memory_range: + min: 0 + max: 28855920 + primary_compute_unit: NPU 
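To consume these perf.yaml files in scripts rather than by eye, something like the sketch below works. The key paths (`models[*].performance_metrics[*]`, keyed per runtime with a `reference_device_info` block) follow the layout visible in this diff, but the file format is repo-internal rather than a supported API, so treat them as assumptions:

```python
import yaml  # requires pyyaml

with open("qai_hub_models/models/regnet/perf.yaml") as f:
    perf = yaml.safe_load(f)

# One entry per reference device; not every entry carries every runtime
# (the Snapdragon X Elite CRD entry below reports only QNN and ONNX).
for entry in perf["models"][0]["performance_metrics"]:
    device = entry["reference_device_info"]["name"]
    tflite = entry.get("torchscript_onnx_tflite")
    if tflite is not None:
        print(f"{device}: {tflite['inference_time']} us on TFLite")
```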
+ precision: fp16 + layer_info: + layers_on_npu: 188 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 188 + job_id: j57yr7z95 + job_status: Passed + torchscript_onnx: + inference_time: 1561.0 + throughput: 640.6149903907751 + estimated_peak_memory_range: + min: 0 + max: 77292496 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 190 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 190 + job_id: jprv3nk7g + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:58:12Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:51:34Z' - torchscript_onnx_qnn: - inference_time: 2218.0 - throughput: 450.8566275924256 + inference_time: 2232.0 + throughput: 448.02867383512546 estimated_peak_memory_range: min: 602112 max: 602112 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 188 - job_id: jegn2emvg + job_id: jgjvnwxeg job_status: Passed torchscript_onnx: - inference_time: 2235.0 - throughput: 447.42729306487695 + inference_time: 2208.0 + throughput: 452.8985507246377 estimated_peak_memory_range: - min: 43081728 - max: 43081728 + min: 41877504 + max: 41877504 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 190 - job_id: j1glnk2ep + job_id: j5wee4e45 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:58:15Z' + timestamp: '2024-10-16T09:33:52Z' diff --git a/qai_hub_models/models/regnet_quantized/README.md b/qai_hub_models/models/regnet_quantized/README.md index 85fb43e9..6eb10e69 100644 --- a/qai_hub_models/models/regnet_quantized/README.md +++ b/qai_hub_models/models/regnet_quantized/README.md @@ -6,7 +6,7 @@ RegNet is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of RegNetQuantized found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/regnet.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/regnet_quantized). @@ -17,11 +17,6 @@ across various devices can be found [here](https://aihub.qualcomm.com/models/r ## Example & Usage -Install the package via pip: -```bash -pip install "qai_hub_models[regnet_quantized]" -``` - Once installed, run the following simple CLI demo: @@ -44,15 +39,19 @@ python -m qai_hub_models.models.regnet_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of RegNetQuantized can be found +* The license for the original implementation of RegNetQuantized can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Designing Network Design Spaces](https://arxiv.org/abs/2003.13678) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/regnet.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/regnet_quantized/evaluate.py b/qai_hub_models/models/regnet_quantized/evaluate.py index 4eb83eec..24b2be32 100644 --- a/qai_hub_models/models/regnet_quantized/evaluate.py +++ b/qai_hub_models/models/regnet_quantized/evaluate.py @@ -13,10 +13,8 @@ from qai_hub_models.models.regnet_quantized import MODEL_ID, Model from qai_hub_models.utils.args import evaluate_parser, get_hub_device, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.evaluate import evaluate_on_dataset from qai_hub_models.utils.inference import compile_model_from_args -from qai_hub_models.utils.quantization_aimet import AIMETQuantizableMixin SUPPORTED_DATASETS = ["imagenette", "imagenet"] @@ -27,6 +25,7 @@ def main(): model_cls=Model, default_split_size=2500, supported_datasets=SUPPORTED_DATASETS, + is_hub_quantized=True, ) args = parser.parse_args() args.device = None @@ -38,13 +37,7 @@ def main(): MODEL_ID, args, get_model_kwargs(Model, vars(args)) ) hub_device = get_hub_device(None, args.chipset) - - # Use Fp16 model for torch inference - for cls in Model.__mro__: - if issubclass(cls, BaseModel) and not issubclass(cls, AIMETQuantizableMixin): - torch_cls = cls - break - torch_model = torch_cls.from_pretrained(**get_model_kwargs(torch_cls, vars(args))) + torch_model = Model.from_pretrained(**get_model_kwargs(Model, vars(args))) evaluate_on_dataset( hub_model, torch_model, diff --git a/qai_hub_models/models/regnet_quantized/export.py b/qai_hub_models/models/regnet_quantized/export.py index bfd734a2..28212253 100644 --- a/qai_hub_models/models/regnet_quantized/export.py +++ b/qai_hub_models/models/regnet_quantized/export.py @@ -10,18 +10,20 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.regnet_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference +from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( print_inference_metrics, print_on_target_demo_cmd, @@ -31,11 +33,14 @@ can_access_qualcomm_ai_hub, export_without_hub_access, ) +from qai_hub_models.utils.quantization import get_calibration_data def export_model( device: str = "Samsung Galaxy S23 (Family)", chipset: Optional[str] = None, + num_calibration_samples: int = 100, + skip_compiling: 
bool = False, skip_profiling: bool = False, skip_inferencing: bool = False, skip_downloading: bool = False, @@ -45,20 +50,19 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + 3. Compiles the model to an asset that can be run on device + 4. Profiles the model performance on a real device + 5. Inferences the model on sample inputs + 6. Downloads the model asset to the local directory + 7. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 5 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -66,6 +70,9 @@ def export_model( Defaults to DEFAULT_DEVICE if not specified. chipset: If set, will choose a random device with this chipset. Overrides the `device` argument. + num_calibration_samples: The number of calibration data samples + to use for quantization. + skip_compiling: If set, skips compiling model to format that can run on device. skip_profiling: If set, skips profiling of compiled model on real devices. skip_inferencing: If set, skips computing on-device outputs from sample data. skip_downloading: If set, skips downloading of compiled model. @@ -80,10 +87,11 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: - * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). + A struct of: + * A CompileJob object containing metadata about the compile job submitted to hub (None if compiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). + * A QuantizeJob object containing metadata about the quantize job submitted to hub """ model_name = "regnet_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,33 +117,52 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. 
Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) ) # Trace the model - source_model = model.convert_to_hub_source_model( - target_runtime, output_path, input_spec + source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) + + print(f"Quantizing model {model_name} with {num_calibration_samples} samples.") + # 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + onnx_compile_job = hub.submit_compile_job( + model=source_model, + input_specs=input_spec, + device=hub_device, + name=model_name, + options="--target_runtime onnx", + ) + quantize_job = hub.submit_quantize_job( + model=onnx_compile_job.get_target_model(), + calibration_data=get_calibration_data( + input_spec, "imagenette", num_calibration_samples + ), + weights_dtype=model.get_weights_dtype(), + activations_dtype=model.get_activations_dtype(), + name=model_name, + options=model.get_quantize_options(), ) + if skip_compiling: + return ExportResult(quantize_job=quantize_job) - # 2. Compile the model to an on-device asset + # 3. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) print(f"Optimizing model {model_name} to run on-device") submitted_compile_job = hub.submit_compile_job( - model=source_model, + model=quantize_job.get_target_model(), input_specs=input_spec, device=hub_device, name=model_name, - calibration_data=model.get_calibration_data(target_runtime), options=model_compile_options, ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 4. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +177,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 5. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +198,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 6. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 7. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -202,12 +229,17 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + quantize_job=quantize_job, + ) def main(): warnings.filterwarnings("ignore") - parser = export_parser(model_cls=Model) + parser = export_parser(model_cls=Model, is_hub_quantized=True) args = parser.parse_args() export_model(**vars(args)) diff --git a/qai_hub_models/models/regnet_quantized/model.py b/qai_hub_models/models/regnet_quantized/model.py index 47e79fed..c348ae26 100644 --- a/qai_hub_models/models/regnet_quantized/model.py +++ b/qai_hub_models/models/regnet_quantized/model.py @@ -4,83 +4,11 @@ # --------------------------------------------------------------------- from __future__ import annotations -# isort: off -# This verifies aimet is installed, and this must be included first. -from qai_hub_models.utils.quantization_aimet import ( - AIMETQuantizableMixin, - constrain_quantized_inputs_to_image_range, -) - -# isort: on - -import torch -from aimet_torch.cross_layer_equalization import ( - equalize_bn_folded_model, - fold_all_batch_norms, -) -from aimet_torch.model_preparer import prepare_model -from aimet_torch.quantsim import QuantizationSimModel, load_encodings_to_sim - from qai_hub_models.models.regnet.model import RegNet -from qai_hub_models.utils.aimet.config_loader import get_default_aimet_config -from qai_hub_models.utils.asset_loaders import CachedWebModelAsset +from qai_hub_models.utils.quantization import HubQuantizableMixin MODEL_ID = __name__.split(".")[-2] -MODEL_ASSET_VERSION = 4 -DEFAULT_ENCODINGS = "regnet_quantized_encodings.json" - - -class RegNetQuantizable(AIMETQuantizableMixin, RegNet): - """RegNet with post train quantization support. - - Supports only 8 bit weights and activations, and only loads pre-quantized checkpoints. - Support for quantizing using your own weights & data will come at a later date.""" - - def __init__( - self, - sim_model: QuantizationSimModel, - ) -> None: - # Input is already normalized by sim_model. Disable it in the wrapper model. - RegNet.__init__(self, sim_model.model, normalize_input=False) - AIMETQuantizableMixin.__init__( - self, - sim_model, - ) - - @classmethod - def from_pretrained( - cls, - aimet_encodings: str | None = "DEFAULT", - ) -> "RegNetQuantizable": - """ - Parameters: - aimet_encodings: - if "DEFAULT": Loads the model with aimet encodings calibrated on imagenette. - elif None: Doesn't load any encodings. Used when computing encodings. - else: Interprets as a filepath and loads the encodings stored there. 
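The AIMET machinery being deleted here is what the new `hub.submit_quantize_job` flow in export.py above replaces: trace the float model, compile it to ONNX, quantize the ONNX asset on AI Hub against calibration data, then feed the quantized artifact into the usual compile step. Condensed from the code in this diff, with the device name an illustrative choice and the dataset/sample count as shown there:

```python
import qai_hub as hub
import torch

from qai_hub_models.models.regnet_quantized import Model
from qai_hub_models.utils.input_spec import make_torch_inputs
from qai_hub_models.utils.quantization import get_calibration_data

model = Model.from_pretrained()
input_spec = model.get_input_spec()
traced = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec))

device = hub.Device("Samsung Galaxy S23 (Family)")  # illustrative device
onnx_job = hub.submit_compile_job(
    model=traced,
    input_specs=input_spec,
    device=device,
    name="regnet_quantized",
    options="--target_runtime onnx",
)
quantize_job = hub.submit_quantize_job(
    model=onnx_job.get_target_model(),
    calibration_data=get_calibration_data(input_spec, "imagenette", 100),
    weights_dtype=model.get_weights_dtype(),
    activations_dtype=model.get_activations_dtype(),
    name="regnet_quantized",
    options=model.get_quantize_options(),
)
# The quantized ONNX then goes through the normal on-device compile:
compile_job = hub.submit_compile_job(
    model=quantize_job.get_target_model(),
    input_specs=input_spec,
    device=device,
    name="regnet_quantized",
)
```

With quantization done hub-side, the model class itself collapses to the two-line `HubQuantizableMixin` definition shown next.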
- """ - model = RegNet.from_pretrained() - input_shape = cls.get_input_spec()["image_tensor"][0] - model = prepare_model(model) - dummy_input = torch.rand(input_shape) - - pairs = fold_all_batch_norms(model, input_shape, dummy_input) - equalize_bn_folded_model(model, input_shape, pairs, dummy_input) - sim = QuantizationSimModel( - model, - quant_scheme="tf_enhanced", - default_param_bw=8, - default_output_bw=8, - config_file=get_default_aimet_config(), - dummy_input=dummy_input, - ) - constrain_quantized_inputs_to_image_range(sim) - if aimet_encodings: - if aimet_encodings == "DEFAULT": - aimet_encodings = CachedWebModelAsset.from_asset_store( - MODEL_ID, MODEL_ASSET_VERSION, DEFAULT_ENCODINGS - ).fetch() - load_encodings_to_sim(sim, aimet_encodings) - return cls(sim) +class RegNetQuantizable(HubQuantizableMixin, RegNet): + pass diff --git a/qai_hub_models/models/regnet_quantized/perf.yaml b/qai_hub_models/models/regnet_quantized/perf.yaml index 45b95dac..43cc6cdf 100644 --- a/qai_hub_models/models/regnet_quantized/perf.yaml +++ b/qai_hub_models/models/regnet_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,41 +20,36 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8775P Proxy models: - name: RegNetQuantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 904.0 - throughput: 1106.1946902654868 + inference_time: 896.0 + throughput: 1116.0714285714287 estimated_peak_memory_range: - min: 20480 - max: 1550536 + min: 16384 + max: 2233688 primary_compute_unit: NPU precision: int8 layer_info: @@ -61,29 +57,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 116 - job_id: jlpe9vo7g + job_id: jg9l04vmg job_status: Passed torchscript_onnx_qnn: - inference_time: 1034.0 - throughput: 967.1179883945841 + inference_time: 1042.0 + throughput: 959.6928982725528 estimated_peak_memory_range: - min: 12288 - max: 14327824 + min: 0 + max: 13613808 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 113 + layers_on_npu: 189 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 113 - job_id: jo5mrzeyg + total_layers: 189 + job_id: jgn60eyv5 job_status: Passed torchscript_onnx: - inference_time: 1547.0 - throughput: 646.4124111182934 + inference_time: 1524.0 + throughput: 656.1679790026246 estimated_peak_memory_range: - min: 94208 - max: 2600640 + min: 12288 + max: 27258352 primary_compute_unit: NPU precision: int8 layer_info: @@ -91,7 +87,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 218 - job_id: j1glnk6ep + job_id: jgo2zv04p job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -100,13 +96,13 @@ models: os_name: Android manufacturer: 
Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:57:33Z' + timestamp: '2024-10-17T17:25:31Z' - torchscript_onnx_tflite: - inference_time: 610.0 - throughput: 1639.344262295082 + inference_time: 750.0 + throughput: 1333.3333333333333 estimated_peak_memory_range: min: 12288 - max: 137910400 + max: 140573344 primary_compute_unit: NPU precision: int8 layer_info: @@ -114,29 +110,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 116 - job_id: jygze72zg + job_id: jp14280np job_status: Passed torchscript_onnx_qnn: - inference_time: 751.0 - throughput: 1331.5579227696405 + inference_time: 744.0 + throughput: 1344.0860215053763 estimated_peak_memory_range: - min: 167936 - max: 31013392 + min: 163840 + max: 31485744 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 113 + layers_on_npu: 189 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 113 - job_id: jegn2elvg + total_layers: 189 + job_id: jprv6yqvg job_status: Passed torchscript_onnx: - inference_time: 1316.0 - throughput: 759.8784194528876 + inference_time: 1107.0 + throughput: 903.342366757001 estimated_peak_memory_range: - min: 0 - max: 171753424 + min: 28672 + max: 177195600 primary_compute_unit: NPU precision: int8 layer_info: @@ -144,7 +140,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 218 - job_id: jw5661ev5 + job_id: jpv6qwo75 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -153,51 +149,74 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:57:34Z' + timestamp: '2024-10-17T17:25:33Z' - torchscript_onnx_tflite: - inference_time: 885.0 - throughput: 1129.9435028248588 + inference_time: 30336.0 + throughput: 32.96413502109704 estimated_peak_memory_range: - min: 24576 - max: 5661376 - primary_compute_unit: NPU + min: 86016 + max: 75736048 + primary_compute_unit: GPU precision: int8 layer_info: - layers_on_npu: 116 - layers_on_gpu: 0 + layers_on_npu: 0 + layers_on_gpu: 116 layers_on_cpu: 0 total_layers: 116 - job_id: jz5wo9wzp + job_id: jgdxnvw6p job_status: Passed torchscript_onnx_qnn: - inference_time: 953.0 - throughput: 1049.3179433368311 + inference_time: 4094.0 + throughput: 244.2598925256473 estimated_peak_memory_range: - min: 188416 - max: 1334288 + min: 217088 + max: 9048560 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 113 + layers_on_npu: 189 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 113 - job_id: jep28m0xp + total_layers: 189 + job_id: jp2kxm6xp job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:57:28Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-17T17:25:17Z' - torchscript_onnx_tflite: - inference_time: 1037.0 - throughput: 964.3201542912246 + inference_time: 39674.0 + throughput: 25.205424207289408 estimated_peak_memory_range: - min: 16384 - max: 141484048 + min: 2801664 + max: 91186368 + primary_compute_unit: GPU + precision: int8 + layer_info: + layers_on_npu: 12 + layers_on_gpu: 91 + layers_on_cpu: 13 + total_layers: 116 + job_id: j5wew9oz5 + job_status: Passed + reference_device_info: + name: RB5 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8250 Proxy + timestamp: '2024-10-17T17:25:02Z' + - torchscript_onnx_tflite: + inference_time: 905.0 + throughput: 1104.9723756906078 + estimated_peak_memory_range: + min: 20480 + max: 1443656 
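Two observations on the quantized numbers in this hunk. First, int8 cuts latency by more than half against the fp16 RegNet on the same flagship: 2075 us fp16 vs 896 us int8 on the Galaxy S23 TFLite path, roughly a 2.3x speedup. Second, `primary_compute_unit` matters more than the headline latency: on RB3 Gen 2 all 116 TFLite layers fall back to GPU, and on RB5 the graph splits across NPU/GPU/CPU, which is why those boards land at 30-40 ms where the NPU-backed QCS8550 runs the same int8 model in about 0.9 ms. A tiny helper for flagging such fallbacks, reusing the key layout assumed in the earlier sketch:

```python
def has_fallback(runtime_entry: dict) -> bool:
    """True if any layers left the NPU for this runtime's profile."""
    info = runtime_entry["layer_info"]
    return info["layers_on_gpu"] > 0 or info["layers_on_cpu"] > 0

# The RB3 Gen 2 TFLite entry above: 0 NPU / 116 GPU / 0 CPU layers.
rb3_tflite = {"layer_info": {"layers_on_npu": 0, "layers_on_gpu": 116, "layers_on_cpu": 0}}
print(has_fallback(rb3_tflite))  # True
```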
primary_compute_unit: NPU precision: int8 layer_info: @@ -205,37 +224,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 116 - job_id: jmg9v40q5 + job_id: jg9l04vqg job_status: Passed torchscript_onnx_qnn: - inference_time: 1174.0 - throughput: 851.7887563884157 + inference_time: 964.0 + throughput: 1037.344398340249 estimated_peak_memory_range: - min: 163840 - max: 31496784 + min: 176128 + max: 1571688 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 113 + layers_on_npu: 189 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 113 - job_id: jogkzy7yg + total_layers: 189 + job_id: jpy1zdwrp job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:57:32Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-17T17:25:19Z' - torchscript_onnx_tflite: - inference_time: 905.0 - throughput: 1104.9723756906078 + inference_time: 898.0 + throughput: 1113.5857461024498 estimated_peak_memory_range: min: 12288 - max: 5407592 + max: 9672640 primary_compute_unit: NPU precision: int8 layer_info: @@ -243,37 +262,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 116 - job_id: jnp1082k5 + job_id: jp14280kp job_status: Passed torchscript_onnx_qnn: - inference_time: 961.0 - throughput: 1040.5827263267429 + inference_time: 966.0 + throughput: 1035.1966873706003 estimated_peak_memory_range: - min: 184320 - max: 1426312 + min: 217088 + max: 1482600 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 113 + layers_on_npu: 189 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 113 - job_id: jqpyedrrg + total_layers: 189 + job_id: jp8q279zp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:57:28Z' + chipset: SA8255P Proxy + timestamp: '2024-10-17T17:25:22Z' - torchscript_onnx_tflite: - inference_time: 902.0 - throughput: 1108.6474501108648 + inference_time: 900.0 + throughput: 1111.111111111111 estimated_peak_memory_range: min: 12288 - max: 12826976 + max: 8539544 primary_compute_unit: NPU precision: int8 layer_info: @@ -281,22 +300,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 116 - job_id: jvgdwvnk5 + job_id: jgdxnvwkp job_status: Passed torchscript_onnx_qnn: inference_time: 962.0 throughput: 1039.5010395010395 estimated_peak_memory_range: - min: 188416 - max: 1472072 + min: 176128 + max: 1467512 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 113 + layers_on_npu: 189 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 113 - job_id: j2p0yr32g + total_layers: 189 + job_id: jgkevynyg job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -304,14 +323,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:57:30Z' + chipset: SA8775P Proxy + timestamp: '2024-10-17T17:25:24Z' - torchscript_onnx_tflite: - inference_time: 902.0 - throughput: 1108.6474501108648 + inference_time: 1023.0 + throughput: 977.5171065493646 estimated_peak_memory_range: - min: 36864 - max: 1556416 + min: 0 + max: 143886928 primary_compute_unit: NPU precision: int8 layer_info: @@ -319,113 +338,105 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 116 - job_id: jz57zd2qp + job_id: j57y2dzq5 job_status: Passed torchscript_onnx_qnn: - 
inference_time: 950.0 - throughput: 1052.6315789473683 + inference_time: 1190.0 + throughput: 840.3361344537815 estimated_peak_memory_range: - min: 217088 - max: 1406984 + min: 163840 + max: 35308880 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 113 + layers_on_npu: 189 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 113 - job_id: j1p8o70zg + total_layers: 189 + job_id: j5q602k7p job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:57:31Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-17T17:25:25Z' - torchscript_onnx_tflite: - inference_time: 30147.0 - throughput: 33.17079643082231 + inference_time: 626.0 + throughput: 1597.444089456869 estimated_peak_memory_range: - min: 102400 - max: 75838848 - primary_compute_unit: GPU + min: 8192 + max: 69178288 + primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 0 - layers_on_gpu: 116 + layers_on_npu: 116 + layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 116 - job_id: jqp4qwnqg + job_id: jp4lnwqq5 job_status: Passed torchscript_onnx_qnn: - inference_time: 4055.0 - throughput: 246.6091245376079 + inference_time: 740.0 + throughput: 1351.3513513513512 estimated_peak_memory_range: - min: 167936 - max: 7980064 + min: 29278208 + max: 59289440 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 113 + layers_on_npu: 189 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 113 - job_id: jn5q82e75 + total_layers: 189 + job_id: jglv4kze5 job_status: Passed - reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot - os_name: Android - manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T11:57:32Z' - - torchscript_onnx_tflite: - inference_time: 41510.0 - throughput: 24.09058058299205 + torchscript_onnx: + inference_time: 1087.0 + throughput: 919.9632014719411 estimated_peak_memory_range: - min: 12288 - max: 64667064 - primary_compute_unit: GPU + min: 0 + max: 89205056 + primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 12 - layers_on_gpu: 91 - layers_on_cpu: 13 - total_layers: 116 - job_id: j0pxv19jg + layers_on_npu: 218 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 218 + job_id: jpedov175 job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T11:57:23Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-17T17:25:36Z' - torchscript_onnx_qnn: - inference_time: 1116.0 - throughput: 896.0573476702509 + inference_time: 1129.0 + throughput: 885.7395925597874 estimated_peak_memory_range: min: 442368 max: 442368 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 113 + layers_on_npu: 189 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 113 - job_id: joprky8v5 + total_layers: 189 + job_id: jp0z4rq25 job_status: Passed torchscript_onnx: - inference_time: 1540.0 - throughput: 649.3506493506494 + inference_time: 1532.0 + throughput: 652.7415143603133 estimated_peak_memory_range: - min: 23322624 - max: 23322624 + min: 23302144 + max: 23302144 primary_compute_unit: NPU precision: int8 layer_info: @@ -433,7 +444,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 218 - job_id: j1p3kmvx5 + job_id: jgjvdlm7g job_status: Passed reference_device_info: name: 
Snapdragon X Elite CRD @@ -442,4 +453,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:57:35Z' + timestamp: '2024-10-17T17:25:34Z' diff --git a/qai_hub_models/models/regnet_quantized/requirements.txt b/qai_hub_models/models/regnet_quantized/requirements.txt deleted file mode 100644 index de5b80e8..00000000 --- a/qai_hub_models/models/regnet_quantized/requirements.txt +++ /dev/null @@ -1 +0,0 @@ -aimet-torch==1.32.1.post1; sys_platform == "linux" diff --git a/qai_hub_models/models/regnet_quantized/test.py b/qai_hub_models/models/regnet_quantized/test.py deleted file mode 100644 index 6018cb2a..00000000 --- a/qai_hub_models/models/regnet_quantized/test.py +++ /dev/null @@ -1,30 +0,0 @@ -# --------------------------------------------------------------------- -# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. -# SPDX-License-Identifier: BSD-3-Clause -# --------------------------------------------------------------------- -from qai_hub_models.models._shared.imagenet_classifier.test_utils import ( - run_imagenet_classifier_test, -) -from qai_hub_models.models.regnet_quantized.demo import main as demo_main -from qai_hub_models.models.regnet_quantized.model import ( - MODEL_ASSET_VERSION, - MODEL_ID, - RegNetQuantizable, -) - - -def test_task(): - run_imagenet_classifier_test( - RegNetQuantizable.from_pretrained(), - MODEL_ID, - probability_threshold=0.45, - diff_tol=0.005, - atol=0.2, - rtol=0.02, - asset_version=MODEL_ASSET_VERSION, - ) - - -def test_demo(): - # Verify demo does not crash - demo_main(is_test=True) diff --git a/qai_hub_models/models/resnet101/README.md b/qai_hub_models/models/resnet101/README.md index 858e0ecd..c9f9189a 100644 --- a/qai_hub_models/models/resnet101/README.md +++ b/qai_hub_models/models/resnet101/README.md @@ -6,7 +6,7 @@ ResNet101 is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of ResNet101 found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/resnet101). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.resnet101.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of ResNet101 can be found +* The license for the original implementation of ResNet101 can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE).
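The requirements.txt deletion above is the other half of the AIMET removal: the Linux-only `aimet-torch` pin was the reason these quantized models needed a dedicated pip extra (and an AIMET-specific test.py, deleted alongside it). With quantization running on AI Hub, the base package should now suffice; a hedged install line, since the extras actually published for a given release may differ:

```bash
# No aimet-torch pin anymore, so the plain package should be enough:
pip install qai_hub_models
```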
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/resnet101/export.py b/qai_hub_models/models/resnet101/export.py index 08f9c3c7..6d9679af 100644 --- a/qai_hub_models/models/resnet101/export.py +++ b/qai_hub_models/models/resnet101/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.resnet101 import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
""" model_name = "resnet101" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/resnet101/perf.yaml b/qai_hub_models/models/resnet101/perf.yaml index fc7067c1..e3d72737 100644 --- a/qai_hub_models/models/resnet101/perf.yaml +++ b/qai_hub_models/models/resnet101/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: ResNet101 performance_metrics: - torchscript_onnx_tflite: - inference_time: 3458.0 - throughput: 289.1844997108155 + inference_time: 3460.0 + throughput: 289.01734104046244 estimated_peak_memory_range: - min: 24576 - max: 2168072 + min: 16384 + max: 2472304 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: jz5wo9ezp + job_id: jgjvnx4xg job_status: Passed torchscript_onnx_qnn: - inference_time: 3505.0 - throughput: 285.30670470756064 + inference_time: 3504.0 + throughput: 285.38812785388126 estimated_peak_memory_range: - min: 622592 - max: 160835080 + min: 16384 + max: 156687344 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: jo5mrznyg + job_id: jp2ky6y4p job_status: Passed torchscript_onnx: - inference_time: 3620.0 - throughput: 276.24309392265195 + inference_time: 3631.0 + throughput: 275.40622418066647 estimated_peak_memory_range: min: 618496 - max: 3142688 + max: 2503400 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 247 - job_id: jn5q82075 + job_id: jg9ln1llg job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:56:26Z' + timestamp: '2024-10-15T17:22:41Z' - torchscript_onnx_tflite: - inference_time: 3654.0 - throughput: 273.6726874657909 + inference_time: 2926.0 + throughput: 341.7634996582365 estimated_peak_memory_range: min: 16384 - max: 115147440 + max: 119118768 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: 
layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: jmg9v4lq5 + job_id: j5we6v665 job_status: Passed torchscript_onnx_qnn: - inference_time: 2860.0 - throughput: 349.65034965034965 + inference_time: 3734.0 + throughput: 267.8093197643278 estimated_peak_memory_range: min: 618496 - max: 34785840 + max: 35940912 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: jegn2e0vg + job_id: jpy13w37p job_status: Passed torchscript_onnx: - inference_time: 2991.0 - throughput: 334.33634236041456 + inference_time: 3791.0 + throughput: 263.7826431020839 estimated_peak_memory_range: min: 0 - max: 122388784 + max: 123516480 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 247 - job_id: j1glnk4ep + job_id: jgdx19xep job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:56:27Z' + timestamp: '2024-10-15T17:22:42Z' - torchscript_onnx_tflite: - inference_time: 3394.0 - throughput: 294.6375957572186 + inference_time: 3371.0 + throughput: 296.6478789676654 estimated_peak_memory_range: min: 20480 - max: 2054584 + max: 2168592 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: jnp1084k5 + job_id: jp14zlz2p job_status: Passed torchscript_onnx_qnn: - inference_time: 3301.0 - throughput: 302.9385034837928 + inference_time: 3275.0 + throughput: 305.3435114503817 estimated_peak_memory_range: - min: 630784 - max: 2486328 + min: 634880 + max: 1855344 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: jep28mxxp + job_id: jp8qy9yxp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,52 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:56:21Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T17:22:34Z' + - torchscript_onnx_tflite: + inference_time: 3407.0 + throughput: 293.51335485764605 + estimated_peak_memory_range: + min: 20480 + max: 2514272 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 147 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 147 + job_id: j5mnx2xwp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 3283.0 + throughput: 304.5994517209869 + estimated_peak_memory_range: + min: 630784 + max: 1840808 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 245 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 245 + job_id: jglvmzm85 + job_status: Passed + reference_device_info: + name: SA8255 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8255P Proxy + timestamp: '2024-10-15T17:22:37Z' - torchscript_onnx_tflite: - inference_time: 4764.0 - throughput: 209.90764063811923 + inference_time: 3391.0 + throughput: 294.8982601002654 estimated_peak_memory_range: min: 16384 - max: 95821536 + max: 6130056 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: jvgdwvxk5 + job_id: jpxkojo15 job_status: Passed torchscript_onnx_qnn: - inference_time: 4806.0 - throughput: 208.07324178110696 + inference_time: 3321.0 + 
throughput: 301.11412225233363 estimated_peak_memory_range: - min: 618496 - max: 22898176 + min: 634880 + max: 1922832 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: jogkzyvyg + job_id: j5q6qkq4p job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8775 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:56:25Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T17:22:36Z' - torchscript_onnx_tflite: inference_time: 3422.0 throughput: 292.22676797194623 estimated_peak_memory_range: - min: 12288 - max: 1968544 + min: 16384 + max: 5496088 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: jz57zdyqp + job_id: jp4lrorv5 job_status: Passed torchscript_onnx_qnn: - inference_time: 3299.0 - throughput: 303.12215822976657 + inference_time: 3286.0 + throughput: 304.32136335970785 estimated_peak_memory_range: - min: 663552 - max: 1901344 + min: 634880 + max: 1919448 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,7 +291,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: jqpyedzrg + job_id: jgkexnx2g job_status: Passed reference_device_info: name: SA8650 (Proxy) @@ -263,14 +299,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:56:22Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T17:22:35Z' - torchscript_onnx_tflite: - inference_time: 3398.0 - throughput: 294.2907592701589 + inference_time: 4810.0 + throughput: 207.9002079002079 estimated_peak_memory_range: - min: 16384 - max: 1817200 + min: 20480 + max: 96531168 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: jqp4qwlqg + job_id: j57yrwrl5 job_status: Passed torchscript_onnx_qnn: - inference_time: 3307.0 - throughput: 302.3888720895071 + inference_time: 4825.0 + throughput: 207.2538860103627 estimated_peak_memory_range: - min: 647168 - max: 1947368 + min: 638976 + max: 23148960 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +329,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: j2p0yr42g + job_id: jpv6kokj5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:56:23Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-15T17:22:39Z' - torchscript_onnx_tflite: - inference_time: 3396.0 - throughput: 294.4640753828033 + inference_time: 2377.0 + throughput: 420.69835927639883 estimated_peak_memory_range: - min: 24576 - max: 2347792 + min: 12288 + max: 44661680 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +352,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: j0pxv1kjg + job_id: jprv3q39g job_status: Passed torchscript_onnx_qnn: - inference_time: 3339.0 - throughput: 299.4908655286014 + inference_time: 2401.0 + throughput: 416.49312786339027 estimated_peak_memory_range: - min: 634880 - max: 1866304 + min: 614400 + max: 34292064 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +367,34 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: j1p8o72zg + 
job_id: jpedm1m15 + job_status: Passed + torchscript_onnx: + inference_time: 2510.0 + throughput: 398.40637450199205 + estimated_peak_memory_range: + min: 0 + max: 47890400 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 247 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 247 + job_id: jp2ky6k4p + job_status: Passed + reference_device_info: - name: SA8255 (Proxy) - os: '13' - form_factor: Auto + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:56:24Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T17:22:44Z' - torchscript_onnx_qnn: - inference_time: 3475.0 - throughput: 287.76978417266184 + inference_time: 3453.0 + throughput: 289.6032435563278 estimated_peak_memory_range: min: 602112 max: 602112 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: joprky6v5 + job_id: jp0z0q065 job_status: Passed torchscript_onnx: - inference_time: 3549.0 - throughput: 281.7695125387433 + inference_time: 3580.0 + throughput: 279.3296089385475 estimated_peak_memory_range: - min: 90714112 - max: 90714112 + min: 90652672 + max: 90652672 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 247 - job_id: jw56612v5 + job_id: jgn667rm5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:56:28Z' + timestamp: '2024-10-16T08:31:22Z' diff --git a/qai_hub_models/models/resnet101_quantized/README.md b/qai_hub_models/models/resnet101_quantized/README.md index acc1f71d..8c3c49d1 100644 --- a/qai_hub_models/models/resnet101_quantized/README.md +++ b/qai_hub_models/models/resnet101_quantized/README.md @@ -6,7 +6,7 @@ ResNet101 is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of ResNet101Quantized found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/resnet101_quantized). @@ -17,11 +17,6 @@ across various devices can be found [here](https://aihub.qualcomm.com/models/r ## Example & Usage -Install the package via pip: -```bash -pip install "qai_hub_models[resnet101_quantized]" -``` - Once installed, run the following simple CLI demo: @@ -44,15 +39,19 @@ python -m qai_hub_models.models.resnet101_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of ResNet101Quantized can be found +* The license for the original implementation of ResNet101Quantized can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/resnet101_quantized/evaluate.py b/qai_hub_models/models/resnet101_quantized/evaluate.py index fde921e3..9aba239a 100644 --- a/qai_hub_models/models/resnet101_quantized/evaluate.py +++ b/qai_hub_models/models/resnet101_quantized/evaluate.py @@ -13,10 +13,8 @@ from qai_hub_models.models.resnet101_quantized import MODEL_ID, Model from qai_hub_models.utils.args import evaluate_parser, get_hub_device, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.evaluate import evaluate_on_dataset from qai_hub_models.utils.inference import compile_model_from_args -from qai_hub_models.utils.quantization_aimet import AIMETQuantizableMixin SUPPORTED_DATASETS = ["imagenette", "imagenet"] @@ -27,6 +25,7 @@ def main(): model_cls=Model, default_split_size=2500, supported_datasets=SUPPORTED_DATASETS, + is_hub_quantized=True, ) args = parser.parse_args() args.device = None @@ -38,13 +37,7 @@ def main(): MODEL_ID, args, get_model_kwargs(Model, vars(args)) ) hub_device = get_hub_device(None, args.chipset) - - # Use Fp16 model for torch inference - for cls in Model.__mro__: - if issubclass(cls, BaseModel) and not issubclass(cls, AIMETQuantizableMixin): - torch_cls = cls - break - torch_model = torch_cls.from_pretrained(**get_model_kwargs(torch_cls, vars(args))) + torch_model = Model.from_pretrained(**get_model_kwargs(Model, vars(args))) evaluate_on_dataset( hub_model, torch_model, diff --git a/qai_hub_models/models/resnet101_quantized/export.py b/qai_hub_models/models/resnet101_quantized/export.py index bf8ced05..c406c3f4 100644 --- a/qai_hub_models/models/resnet101_quantized/export.py +++ b/qai_hub_models/models/resnet101_quantized/export.py @@ -10,18 +10,20 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.resnet101_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference +from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( print_inference_metrics, print_on_target_demo_cmd, @@ -31,11 +33,14 @@ can_access_qualcomm_ai_hub, export_without_hub_access, ) +from qai_hub_models.utils.quantization import get_calibration_data def export_model( device: str = "Samsung Galaxy S23 (Family)", chipset: Optional[str] = None, + 
num_calibration_samples: int = 100, + skip_compiling: bool = False, skip_profiling: bool = False, skip_inferencing: bool = False, skip_downloading: bool = False, @@ -45,20 +50,19 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + 3. Compiles the model to an asset that can be run on device + 4. Profiles the model performance on a real device + 5. Inferences the model on sample inputs + 6. Downloads the model asset to the local directory + 7. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 5 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -66,6 +70,9 @@ def export_model( Defaults to DEFAULT_DEVICE if not specified. chipset: If set, will choose a random device with this chipset. Overrides the `device` argument. + num_calibration_samples: The number of calibration data samples + to use for quantization. + skip_compiling: If set, skips compiling model to format that can run on device. skip_profiling: If set, skips profiling of compiled model on real devices. skip_inferencing: If set, skips computing on-device outputs from sample data. skip_downloading: If set, skips downloading of compiled model. @@ -80,10 +87,11 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: - * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). + A struct of: + * A CompileJob object containing metadata about the compile job submitted to hub (None if compiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). + * A QuantizeJob object containing metadata about the quantize job submitted to hub """ model_name = "resnet101_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,33 +117,52 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. 
Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) ) # Trace the model - source_model = model.convert_to_hub_source_model( - target_runtime, output_path, input_spec + source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) + + print(f"Quantizing model {model_name} with {num_calibration_samples} samples.") + # 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + onnx_compile_job = hub.submit_compile_job( + model=source_model, + input_specs=input_spec, + device=hub_device, + name=model_name, + options="--target_runtime onnx", + ) + quantize_job = hub.submit_quantize_job( + model=onnx_compile_job.get_target_model(), + calibration_data=get_calibration_data( + input_spec, "imagenette", num_calibration_samples + ), + weights_dtype=model.get_weights_dtype(), + activations_dtype=model.get_activations_dtype(), + name=model_name, + options=model.get_quantize_options(), ) + if skip_compiling: + return ExportResult(quantize_job=quantize_job) - # 2. Compile the model to an on-device asset + # 3. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) print(f"Optimizing model {model_name} to run on-device") submitted_compile_job = hub.submit_compile_job( - model=source_model, + model=quantize_job.get_target_model(), input_specs=input_spec, device=hub_device, name=model_name, - calibration_data=model.get_calibration_data(target_runtime), options=model_compile_options, ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 4. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +177,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 5. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +198,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 6. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 7. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -202,12 +229,17 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + quantize_job=quantize_job, + ) def main(): warnings.filterwarnings("ignore") - parser = export_parser(model_cls=Model) + parser = export_parser(model_cls=Model, is_hub_quantized=True) args = parser.parse_args() export_model(**vars(args)) diff --git a/qai_hub_models/models/resnet101_quantized/model.py b/qai_hub_models/models/resnet101_quantized/model.py index c4cfa229..12c3d4d6 100644 --- a/qai_hub_models/models/resnet101_quantized/model.py +++ b/qai_hub_models/models/resnet101_quantized/model.py @@ -4,86 +4,11 @@ # --------------------------------------------------------------------- from __future__ import annotations -# isort: off -# This verifies aimet is installed, and this must be included first. -from qai_hub_models.utils.quantization_aimet import ( - AIMETQuantizableMixin, - constrain_quantized_inputs_to_image_range, -) - -# isort: on - -import torch -from aimet_torch.cross_layer_equalization import ( - equalize_bn_folded_model, - fold_all_batch_norms, -) -from aimet_torch.model_preparer import prepare_model -from aimet_torch.quantsim import QuantizationSimModel, load_encodings_to_sim - from qai_hub_models.models.resnet101.model import ResNet101 -from qai_hub_models.utils.aimet.config_loader import get_default_aimet_config -from qai_hub_models.utils.asset_loaders import CachedWebModelAsset +from qai_hub_models.utils.quantization import HubQuantizableMixin MODEL_ID = __name__.split(".")[-2] -MODEL_ASSET_VERSION = 5 -DEFAULT_ENCODINGS = "resnet101_quantized_encodings.json" - - -class ResNet101Quantizable( - AIMETQuantizableMixin, - ResNet101, -): - """ResNet101 with post train quantization support. - - Supports only 8 bit weights and activations, and only loads pre-quantized checkpoints. - Support for quantizing using your own weights & data will come at a later date.""" - - def __init__( - self, - sim_model: QuantizationSimModel, - ) -> None: - # Input is already normalized by sim_model. Disable it in the wrapper model. - ResNet101.__init__(self, sim_model.model, normalize_input=False) - AIMETQuantizableMixin.__init__( - self, - sim_model, - ) - - @classmethod - def from_pretrained( - cls, - aimet_encodings: str | None = "DEFAULT", - ) -> "ResNet101Quantizable": - """ - Parameters: - aimet_encodings: - if "DEFAULT": Loads the model with aimet encodings calibrated on imagenette. - elif None: Doesn't load any encodings. Used when computing encodings. - else: Interprets as a filepath and loads the encodings stored there. 
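The AIMET-specific `from_pretrained` recipe being deleted here is superseded by quantization on AI Hub itself. A condensed sketch of the new flow implemented by the rewritten export.py above (a sketch only: it assumes a configured `qai_hub` client, uses the defaults from this change — "Samsung Galaxy S23 (Family)" and 100 imagenette samples — and omits the profiling, inference, download, and summary steps):

```python
import qai_hub as hub
import torch

from qai_hub_models.models.resnet101_quantized import Model
from qai_hub_models.utils.input_spec import make_torch_inputs
from qai_hub_models.utils.quantization import get_calibration_data

device = hub.Device("Samsung Galaxy S23 (Family)")
model = Model.from_pretrained()
input_spec = model.get_input_spec()

# Step 1: trace the fp32 PyTorch model to TorchScript.
source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec))

# Step 2: convert to ONNX on AI Hub, then quantize that ONNX model with
# calibration data -- this replaces the deleted AIMET encodings path.
onnx_job = hub.submit_compile_job(
    model=source_model,
    input_specs=input_spec,
    device=device,
    name="resnet101_quantized",
    options="--target_runtime onnx",
)
quantize_job = hub.submit_quantize_job(
    model=onnx_job.get_target_model(),
    calibration_data=get_calibration_data(input_spec, "imagenette", 100),
    weights_dtype=model.get_weights_dtype(),
    activations_dtype=model.get_activations_dtype(),
    name="resnet101_quantized",
    options=model.get_quantize_options(),
)

# Step 3: compile the quantized model into an on-device asset
# (per-runtime compile options from get_hub_compile_options() omitted here).
compile_job = hub.submit_compile_job(
    model=quantize_job.get_target_model(),
    input_specs=input_spec,
    device=device,
    name="resnet101_quantized",
)
```

Note that `export_model` now returns an `ExportResult` struct rather than the old 3-tuple, and that `skip_compiling=True` short-circuits after the quantize job with only `result.quantize_job` populated.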
- """ - model = ResNet101.from_pretrained() - input_shape = cls.get_input_spec()["image_tensor"][0] - model = prepare_model(model) - dummy_input = torch.rand(input_shape) - pairs = fold_all_batch_norms(model, input_shape, dummy_input) - equalize_bn_folded_model(model, input_shape, pairs, dummy_input) - - sim = QuantizationSimModel( - model, - quant_scheme="tf_enhanced", - default_param_bw=8, - default_output_bw=8, - config_file=get_default_aimet_config(), - dummy_input=torch.rand(input_shape), - ) - constrain_quantized_inputs_to_image_range(sim) - if aimet_encodings: - if aimet_encodings == "DEFAULT": - aimet_encodings = CachedWebModelAsset.from_asset_store( - MODEL_ID, MODEL_ASSET_VERSION, DEFAULT_ENCODINGS - ).fetch() - load_encodings_to_sim(sim, aimet_encodings) - return cls(sim) +class ResNet101Quantizable(HubQuantizableMixin, ResNet101): + pass diff --git a/qai_hub_models/models/resnet101_quantized/perf.yaml b/qai_hub_models/models/resnet101_quantized/perf.yaml index f65c1de5..21242612 100644 --- a/qai_hub_models/models/resnet101_quantized/perf.yaml +++ b/qai_hub_models/models/resnet101_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,44 +20,39 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8775P Proxy models: - name: ResNet101Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 1153.0 - throughput: 867.3026886383348 + inference_time: 1159.0 + throughput: 862.8127696289905 estimated_peak_memory_range: - min: 32768 - max: 2011888 + min: 12288 + max: 54352760 primary_compute_unit: NPU precision: int8 layer_info: @@ -64,29 +60,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 150 - job_id: jvgdwv165 + job_id: jp0z4rn05 job_status: Passed torchscript_onnx_qnn: - inference_time: 1373.0 - throughput: 728.3321194464676 + inference_time: 1382.0 + throughput: 723.589001447178 estimated_peak_memory_range: - min: 16384 - max: 47362416 + min: 32768 + max: 10864768 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 246 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: j2p0yrz0g + total_layers: 246 + job_id: jgz327145 job_status: Passed torchscript_onnx: - inference_time: 2397.0 - throughput: 417.18815185648725 + inference_time: 2239.0 + throughput: 446.6279589102278 estimated_peak_memory_range: - min: 217088 - max: 52387320 + min: 12288 + max: 52546072 primary_compute_unit: NPU precision: int8 layer_info: @@ -94,7 +90,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 283 - job_id: j7gjxlv1p + job_id: jp2kxm26p job_status: 
Passed reference_device_info: name: Samsung Galaxy S23 @@ -103,13 +99,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:55:45Z' + timestamp: '2024-10-17T17:24:15Z' - torchscript_onnx_tflite: - inference_time: 869.0 - throughput: 1150.7479861910242 + inference_time: 867.0 + throughput: 1153.4025374855826 estimated_peak_memory_range: min: 12288 - max: 101112112 + max: 101898880 primary_compute_unit: NPU precision: int8 layer_info: @@ -117,29 +113,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 150 - job_id: jz57zdrnp + job_id: jp8q27lqp job_status: Passed torchscript_onnx_qnn: - inference_time: 1210.0 - throughput: 826.4462809917355 + inference_time: 1043.0 + throughput: 958.7727708533077 estimated_peak_memory_range: - min: 167936 - max: 22519552 + min: 163840 + max: 22249632 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 246 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: j1p8o7qqg + total_layers: 246 + job_id: j5wew9j45 job_status: Passed torchscript_onnx: - inference_time: 2218.0 - throughput: 450.8566275924256 + inference_time: 1597.0 + throughput: 626.1740763932373 estimated_peak_memory_range: min: 12288 - max: 149319232 + max: 152934336 primary_compute_unit: NPU precision: int8 layer_info: @@ -147,7 +143,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 283 - job_id: jlpe9vd8g + job_id: jpy1zd90p job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -156,13 +152,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:55:46Z' + timestamp: '2024-10-17T17:24:17Z' - torchscript_onnx_tflite: - inference_time: 1144.0 - throughput: 874.1258741258741 + inference_time: 4486.0 + throughput: 222.91573785109227 estimated_peak_memory_range: - min: 12288 - max: 54250312 + min: 36864 + max: 37005520 primary_compute_unit: NPU precision: int8 layer_info: @@ -170,37 +166,60 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 150 - job_id: jqp4qwr2g + job_id: jgkevyjvg job_status: Passed torchscript_onnx_qnn: - inference_time: 1323.0 - throughput: 755.8578987150415 + inference_time: 6377.0 + throughput: 156.81354869060686 estimated_peak_memory_range: - min: 188416 - max: 1486584 + min: 208896 + max: 8274624 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 246 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jn5q826e5 + total_layers: 246 + job_id: jg9l046mg job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:55:39Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-17T17:24:01Z' - torchscript_onnx_tflite: - inference_time: 1358.0 - throughput: 736.3770250368188 + inference_time: 17354.0 + throughput: 57.62360262763628 estimated_peak_memory_range: - min: 28672 - max: 103102416 + min: 208896 + max: 2368376 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 150 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 150 + job_id: j5q602jep + job_status: Passed + reference_device_info: + name: RB5 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8250 Proxy + timestamp: '2024-10-17T17:23:45Z' + - torchscript_onnx_tflite: + inference_time: 1159.0 + throughput: 862.8127696289905 + estimated_peak_memory_range: + 
min: 20480 + max: 1400088 primary_compute_unit: NPU precision: int8 layer_info: @@ -208,37 +227,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 150 - job_id: j0pxv1o8g + job_id: jglv4kj25 job_status: Passed torchscript_onnx_qnn: - inference_time: 1591.0 - throughput: 628.5355122564425 + inference_time: 1324.0 + throughput: 755.2870090634441 estimated_peak_memory_range: - min: 507904 - max: 24608272 + min: 176128 + max: 1511296 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 246 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jwgoyv215 + total_layers: 246 + job_id: jp1428rnp job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:55:43Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-17T17:24:03Z' - torchscript_onnx_tflite: - inference_time: 1154.0 - throughput: 866.5511265164645 + inference_time: 1157.0 + throughput: 864.304235090752 estimated_peak_memory_range: min: 12288 - max: 18030704 + max: 28724152 primary_compute_unit: NPU precision: int8 layer_info: @@ -246,37 +265,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 150 - job_id: jo5mrzx7g + job_id: j56y21knp job_status: Passed torchscript_onnx_qnn: - inference_time: 1320.0 - throughput: 757.5757575757576 + inference_time: 1324.0 + throughput: 755.2870090634441 estimated_peak_memory_range: min: 184320 - max: 1429024 + max: 1543680 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 246 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: j1glnkv2p + total_layers: 246 + job_id: j57y2dqn5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:55:40Z' + chipset: SA8255P Proxy + timestamp: '2024-10-17T17:24:06Z' - torchscript_onnx_tflite: - inference_time: 1161.0 - throughput: 861.3264427217915 + inference_time: 1162.0 + throughput: 860.5851979345955 estimated_peak_memory_range: min: 28672 - max: 17746288 + max: 388559256 primary_compute_unit: NPU precision: int8 layer_info: @@ -284,22 +303,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 150 - job_id: jegn2evjg + job_id: jp3jnmymg job_status: Passed torchscript_onnx_qnn: - inference_time: 1330.0 - throughput: 751.8796992481203 + inference_time: 1325.0 + throughput: 754.7169811320755 estimated_peak_memory_range: - min: 176128 - max: 1414360 + min: 184320 + max: 1467840 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 246 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jw5661yn5 + total_layers: 246 + job_id: jp4lnwz25 job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -307,14 +326,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:55:41Z' + chipset: SA8775P Proxy + timestamp: '2024-10-17T17:24:08Z' - torchscript_onnx_tflite: - inference_time: 1165.0 - throughput: 858.3690987124463 + inference_time: 1367.0 + throughput: 731.528895391368 estimated_peak_memory_range: min: 16384 - max: 14728928 + max: 104200096 primary_compute_unit: NPU precision: int8 layer_info: @@ -322,37 +341,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 150 - job_id: joprky3k5 + 
job_id: jgo2zvj1p job_status: Passed torchscript_onnx_qnn: - inference_time: 1333.0 - throughput: 750.1875468867216 + inference_time: 1592.0 + throughput: 628.1407035175879 estimated_peak_memory_range: - min: 184320 - max: 1544360 + min: 172032 + max: 26655536 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 246 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: j1p3kmjm5 + total_layers: 246 + job_id: jpxk91w85 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:55:42Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-17T17:24:09Z' - torchscript_onnx_tflite: - inference_time: 4481.0 - throughput: 223.1644722160232 + inference_time: 832.0 + throughput: 1201.923076923077 estimated_peak_memory_range: - min: 12288 - max: 36912016 + min: 8192 + max: 31175136 primary_compute_unit: NPU precision: int8 layer_info: @@ -360,75 +379,67 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 150 - job_id: jep28my6p + job_id: jpv6qwjz5 job_status: Passed torchscript_onnx_qnn: - inference_time: 6360.0 - throughput: 157.23270440251574 + inference_time: 1004.0 + throughput: 996.01593625498 estimated_peak_memory_range: - min: 192512 - max: 8045248 + min: 163840 + max: 23243040 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 246 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: j1pv3w6z5 + total_layers: 246 + job_id: j5mnezj7p job_status: Passed - reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot - os_name: Android - manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T11:55:44Z' - - torchscript_onnx_tflite: - inference_time: 17279.0 - throughput: 57.87371954395509 + torchscript_onnx: + inference_time: 1585.0 + throughput: 630.9148264984227 estimated_peak_memory_range: - min: 53248 - max: 10200904 + min: 0 + max: 63100464 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 150 + layers_on_npu: 283 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 150 - job_id: jqpyed30g + total_layers: 283 + job_id: jp8q27oqp job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T11:55:34Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-17T17:24:20Z' - torchscript_onnx_qnn: - inference_time: 1309.0 - throughput: 763.9419404125287 + inference_time: 1327.0 + throughput: 753.5795026375282 estimated_peak_memory_range: - min: 348160 - max: 348160 + min: 442368 + max: 442368 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 246 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jogkzyevg + total_layers: 246 + job_id: jgdxnvj6p job_status: Passed torchscript_onnx: inference_time: 2350.0 throughput: 425.531914893617 estimated_peak_memory_range: - min: 48607232 - max: 48607232 + min: 48603136 + max: 48603136 primary_compute_unit: NPU precision: int8 layer_info: @@ -436,7 +447,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 283 - job_id: jygze734g + job_id: jp0z4ry05 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -445,4 +456,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: 
Snapdragon® X Elite - timestamp: '2024-09-25T11:55:47Z' + timestamp: '2024-10-17T17:24:18Z' diff --git a/qai_hub_models/models/resnet101_quantized/requirements.txt b/qai_hub_models/models/resnet101_quantized/requirements.txt deleted file mode 100644 index de5b80e8..00000000 --- a/qai_hub_models/models/resnet101_quantized/requirements.txt +++ /dev/null @@ -1 +0,0 @@ -aimet-torch==1.32.1.post1; sys_platform == "linux" diff --git a/qai_hub_models/models/resnet101_quantized/test.py b/qai_hub_models/models/resnet101_quantized/test.py deleted file mode 100644 index 876ebffe..00000000 --- a/qai_hub_models/models/resnet101_quantized/test.py +++ /dev/null @@ -1,30 +0,0 @@ -# --------------------------------------------------------------------- -# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. -# SPDX-License-Identifier: BSD-3-Clause -# --------------------------------------------------------------------- -from qai_hub_models.models._shared.imagenet_classifier.test_utils import ( - run_imagenet_classifier_test, -) -from qai_hub_models.models.resnet101_quantized.demo import main as demo_main -from qai_hub_models.models.resnet101_quantized.model import ( - MODEL_ASSET_VERSION, - MODEL_ID, - ResNet101Quantizable, -) - - -def test_task(): - run_imagenet_classifier_test( - ResNet101Quantizable.from_pretrained(), - MODEL_ID, - probability_threshold=0.45, - diff_tol=0.005, - rtol=0.02, - atol=0.2, - asset_version=MODEL_ASSET_VERSION, - ) - - -def test_demo(): - # Verify demo does not crash - demo_main(is_test=True) diff --git a/qai_hub_models/models/resnet18/README.md b/qai_hub_models/models/resnet18/README.md index c78fd6a9..299ae472 100644 --- a/qai_hub_models/models/resnet18/README.md +++ b/qai_hub_models/models/resnet18/README.md @@ -6,7 +6,7 @@ ResNet18 is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of ResNet18 found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/resnet18). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.resnet18.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of ResNet18 can be found +* The license for the original implementation of ResNet18 can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/resnet18/export.py b/qai_hub_models/models/resnet18/export.py index 1baff7ce..cca1addd 100644 --- a/qai_hub_models/models/resnet18/export.py +++ b/qai_hub_models/models/resnet18/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.resnet18 import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
""" model_name = "resnet18" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/resnet18/perf.yaml b/qai_hub_models/models/resnet18/perf.yaml index 6404bb77..631f9ed3 100644 --- a/qai_hub_models/models/resnet18/perf.yaml +++ b/qai_hub_models/models/resnet18/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: ResNet18 performance_metrics: - torchscript_onnx_tflite: - inference_time: 1383.0 - throughput: 723.0657989877079 + inference_time: 1384.0 + throughput: 722.543352601156 estimated_peak_memory_range: - min: 16384 - max: 2213008 + min: 32768 + max: 2412488 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 38 - job_id: jz57zx3np + job_id: jgo264dxp job_status: Passed torchscript_onnx_qnn: - inference_time: 1460.0 - throughput: 684.931506849315 + inference_time: 1459.0 + throughput: 685.4009595613434 estimated_peak_memory_range: - min: 16384 - max: 4586272 + min: 167936 + max: 82743112 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 53 - job_id: jqpye600g + job_id: j5we61z35 job_status: Passed torchscript_onnx: - inference_time: 1354.0 - throughput: 738.5524372230428 + inference_time: 1337.0 + throughput: 747.9431563201197 estimated_peak_memory_range: - min: 36864 - max: 25839080 + min: 16384 + max: 25939760 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 55 - job_id: jwgoyv615 + job_id: jp2kyvdrp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:53:41Z' + timestamp: '2024-10-14T23:46:27Z' - torchscript_onnx_tflite: - inference_time: 1075.0 - throughput: 930.2325581395348 + inference_time: 1074.0 + throughput: 931.0986964618249 estimated_peak_memory_range: min: 16384 - max: 28491264 + max: 29274576 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: 
layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 38 - job_id: jqp4qv02g + job_id: jpv6k92j5 job_status: Passed torchscript_onnx_qnn: - inference_time: 1423.0 - throughput: 702.7406886858749 + inference_time: 1114.0 + throughput: 897.6660682226212 estimated_peak_memory_range: - min: 618496 - max: 15481040 + min: 634880 + max: 13621584 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 53 - job_id: j2p0yr00g + job_id: jg9lnx2wg job_status: Passed torchscript_onnx: - inference_time: 1384.0 - throughput: 722.543352601156 + inference_time: 1057.0 + throughput: 946.073793755913 estimated_peak_memory_range: - min: 638976 - max: 30037392 + min: 303104 + max: 29953216 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 55 - job_id: j1pv3wkz5 + job_id: jpy13728p job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:53:42Z' + timestamp: '2024-10-14T23:46:28Z' - torchscript_onnx_tflite: inference_time: 1383.0 throughput: 723.0657989877079 estimated_peak_memory_range: - min: 24576 - max: 314789328 + min: 32768 + max: 27591104 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 38 - job_id: j0pxvy28g + job_id: jgjvnw3xg job_status: Passed torchscript_onnx_qnn: - inference_time: 1318.0 - throughput: 758.7253414264036 + inference_time: 1322.0 + throughput: 756.4296520423601 estimated_peak_memory_range: - min: 643072 - max: 2015960 + min: 671744 + max: 2247928 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 53 - job_id: jogkzyxvg + job_id: jgdx1z4rp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:53:36Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:46:20Z' - torchscript_onnx_tflite: - inference_time: 1942.0 - throughput: 514.9330587023687 + inference_time: 1387.0 + throughput: 720.9805335255949 estimated_peak_memory_range: - min: 32768 - max: 26344192 + min: 16384 + max: 5378256 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 38 - job_id: jo5mr3y7g + job_id: jg9lnx2lg job_status: Passed torchscript_onnx_qnn: - inference_time: 2001.0 - throughput: 499.7501249375312 + inference_time: 1322.0 + throughput: 756.4296520423601 estimated_peak_memory_range: - min: 618496 - max: 18347168 + min: 634880 + max: 2372912 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 53 - job_id: j1p3km0m5 + job_id: jpxkodr35 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:53:40Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:46:23Z' - torchscript_onnx_tflite: - inference_time: 1387.0 - throughput: 720.9805335255949 + inference_time: 1385.0 + throughput: 722.0216606498195 estimated_peak_memory_range: min: 16384 - max: 2574440 + max: 20826416 primary_compute_unit: NPU 
precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 38 - job_id: jegn238jg + job_id: j5we61z65 job_status: Passed torchscript_onnx_qnn: - inference_time: 1334.0 - throughput: 749.6251874062968 + inference_time: 1327.0 + throughput: 753.5795026375282 estimated_peak_memory_range: - min: 630784 - max: 2122696 + min: 634880 + max: 2016360 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 53 - job_id: jn5q82qe5 + job_id: jp4lr9485 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:53:37Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:46:22Z' - torchscript_onnx_tflite: - inference_time: 1384.0 - throughput: 722.543352601156 + inference_time: 1386.0 + throughput: 721.5007215007215 estimated_peak_memory_range: min: 28672 - max: 1556416 + max: 1431392 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 38 - job_id: joprkejk5 + job_id: jgz3d4zk5 job_status: Passed torchscript_onnx_qnn: - inference_time: 1329.0 - throughput: 752.4454477050414 + inference_time: 1326.0 + throughput: 754.1478129713424 estimated_peak_memory_range: - min: 626688 - max: 2136400 + min: 634880 + max: 1931784 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 53 - job_id: j1glnkm2p + job_id: j57yr7nv5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:53:38Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:46:21Z' - torchscript_onnx_tflite: - inference_time: 1385.0 - throughput: 722.0216606498195 + inference_time: 1943.0 + throughput: 514.668039114771 estimated_peak_memory_range: - min: 40960 - max: 1291888 + min: 24576 + max: 27766528 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 38 - job_id: jep28ln6p + job_id: jpedml615 job_status: Passed torchscript_onnx_qnn: - inference_time: 1353.0 - throughput: 739.0983000739099 + inference_time: 1994.0 + throughput: 501.5045135406219 estimated_peak_memory_range: - min: 634880 - max: 1879608 + min: 618496 + max: 17848032 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,57 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 53 - job_id: jw56614n5 + job_id: jgn6v7qk5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:46:25Z' + - torchscript_onnx_tflite: + inference_time: 799.0 + throughput: 1251.5644555694619 + estimated_peak_memory_range: + min: 12288 + max: 17085392 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 38 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 38 + job_id: jgdx1z4ep + job_status: Passed + torchscript_onnx: + inference_time: 971.0 + throughput: 1029.8661174047375 + estimated_peak_memory_range: + min: 0 + max: 16409728 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 55 + layers_on_gpu: 0 + layers_on_cpu: 
0 + total_layers: 55 + job_id: jgkex90wg + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:53:39Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:46:32Z' - torchscript_onnx_qnn: - inference_time: 1447.0 - throughput: 691.0850034554251 + inference_time: 1432.0 + throughput: 698.3240223463687 estimated_peak_memory_range: min: 602112 max: 602112 @@ -354,14 +390,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 53 - job_id: j1p8o7yqg + job_id: jp14zv18p job_status: Passed torchscript_onnx: - inference_time: 1309.0 - throughput: 763.9419404125287 + inference_time: 1316.0 + throughput: 759.8784194528876 estimated_peak_memory_range: - min: 24379392 - max: 24379392 + min: 24326144 + max: 24326144 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +405,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 55 - job_id: j7gjxln1p + job_id: jp0z0v995 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +414,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:53:43Z' + timestamp: '2024-10-14T23:46:29Z' diff --git a/qai_hub_models/models/resnet18_quantized/README.md b/qai_hub_models/models/resnet18_quantized/README.md index 705dd764..907fffb5 100644 --- a/qai_hub_models/models/resnet18_quantized/README.md +++ b/qai_hub_models/models/resnet18_quantized/README.md @@ -6,7 +6,7 @@ ResNet18 is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of ResNet18Quantized found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/resnet18_quantized). @@ -17,11 +17,6 @@ across various devices can be found [here](https://aihub.qualcomm.com/models/r ## Example & Usage -Install the package via pip: -```bash -pip install "qai_hub_models[resnet18_quantized]" -``` - Once installed, run the following simple CLI demo: @@ -44,15 +39,19 @@ python -m qai_hub_models.models.resnet18_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of ResNet18Quantized can be found +* The license for the original implementation of ResNet18Quantized can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/resnet18_quantized/evaluate.py b/qai_hub_models/models/resnet18_quantized/evaluate.py index d98aec44..11452ab1 100644 --- a/qai_hub_models/models/resnet18_quantized/evaluate.py +++ b/qai_hub_models/models/resnet18_quantized/evaluate.py @@ -13,10 +13,8 @@ from qai_hub_models.models.resnet18_quantized import MODEL_ID, Model from qai_hub_models.utils.args import evaluate_parser, get_hub_device, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.evaluate import evaluate_on_dataset from qai_hub_models.utils.inference import compile_model_from_args -from qai_hub_models.utils.quantization_aimet import AIMETQuantizableMixin SUPPORTED_DATASETS = ["imagenette", "imagenet"] @@ -27,6 +25,7 @@ def main(): model_cls=Model, default_split_size=2500, supported_datasets=SUPPORTED_DATASETS, + is_hub_quantized=True, ) args = parser.parse_args() args.device = None @@ -38,13 +37,7 @@ def main(): MODEL_ID, args, get_model_kwargs(Model, vars(args)) ) hub_device = get_hub_device(None, args.chipset) - - # Use Fp16 model for torch inference - for cls in Model.__mro__: - if issubclass(cls, BaseModel) and not issubclass(cls, AIMETQuantizableMixin): - torch_cls = cls - break - torch_model = torch_cls.from_pretrained(**get_model_kwargs(torch_cls, vars(args))) + torch_model = Model.from_pretrained(**get_model_kwargs(Model, vars(args))) evaluate_on_dataset( hub_model, torch_model, diff --git a/qai_hub_models/models/resnet18_quantized/export.py b/qai_hub_models/models/resnet18_quantized/export.py index ebf77975..b803f909 100644 --- a/qai_hub_models/models/resnet18_quantized/export.py +++ b/qai_hub_models/models/resnet18_quantized/export.py @@ -10,18 +10,20 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.resnet18_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference +from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( print_inference_metrics, print_on_target_demo_cmd, @@ -31,11 +33,14 @@ can_access_qualcomm_ai_hub, export_without_hub_access, ) +from qai_hub_models.utils.quantization import get_calibration_data def export_model( device: str = "Samsung Galaxy S23 (Family)", chipset: Optional[str] = None, + num_calibration_samples: 
int = 100, + skip_compiling: bool = False, skip_profiling: bool = False, skip_inferencing: bool = False, skip_downloading: bool = False, @@ -45,20 +50,19 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + 3. Compiles the model to an asset that can be run on device + 4. Profiles the model performance on a real device + 5. Inferences the model on sample inputs + 6. Downloads the model asset to the local directory + 7. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 5 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -66,6 +70,9 @@ def export_model( Defaults to DEFAULT_DEVICE if not specified. chipset: If set, will choose a random device with this chipset. Overrides the `device` argument. + num_calibration_samples: The number of calibration data samples + to use for quantization. + skip_compiling: If set, skips compiling model to format that can run on device. skip_profiling: If set, skips profiling of compiled model on real devices. skip_inferencing: If set, skips computing on-device outputs from sample data. skip_downloading: If set, skips downloading of compiled model. @@ -80,10 +87,11 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: - * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). + A struct of: + * A CompileJob object containing metadata about the compile job submitted to hub (None if compiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). + * A QuantizeJob object containing metadata about the quantize job submitted to hub """ model_name = "resnet18_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,33 +117,52 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. 
Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) ) # Trace the model - source_model = model.convert_to_hub_source_model( - target_runtime, output_path, input_spec + source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) + + print(f"Quantizing model {model_name} with {num_calibration_samples} samples.") + # 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + onnx_compile_job = hub.submit_compile_job( + model=source_model, + input_specs=input_spec, + device=hub_device, + name=model_name, + options="--target_runtime onnx", + ) + quantize_job = hub.submit_quantize_job( + model=onnx_compile_job.get_target_model(), + calibration_data=get_calibration_data( + input_spec, "imagenette", num_calibration_samples + ), + weights_dtype=model.get_weights_dtype(), + activations_dtype=model.get_activations_dtype(), + name=model_name, + options=model.get_quantize_options(), ) + if skip_compiling: + return ExportResult(quantize_job=quantize_job) - # 2. Compile the model to an on-device asset + # 3. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) print(f"Optimizing model {model_name} to run on-device") submitted_compile_job = hub.submit_compile_job( - model=source_model, + model=quantize_job.get_target_model(), input_specs=input_spec, device=hub_device, name=model_name, - calibration_data=model.get_calibration_data(target_runtime), options=model_compile_options, ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 4. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +177,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 5. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +198,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 6. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 7. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -202,12 +229,17 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + quantize_job=quantize_job, + ) def main(): warnings.filterwarnings("ignore") - parser = export_parser(model_cls=Model) + parser = export_parser(model_cls=Model, is_hub_quantized=True) args = parser.parse_args() export_model(**vars(args)) diff --git a/qai_hub_models/models/resnet18_quantized/model.py b/qai_hub_models/models/resnet18_quantized/model.py index c0c56598..a9139836 100644 --- a/qai_hub_models/models/resnet18_quantized/model.py +++ b/qai_hub_models/models/resnet18_quantized/model.py @@ -4,78 +4,11 @@ # --------------------------------------------------------------------- from __future__ import annotations -# isort: off -# This verifies aimet is installed, and this must be included first. -from qai_hub_models.utils.quantization_aimet import ( - AIMETQuantizableMixin, - constrain_quantized_inputs_to_image_range, -) - -# isort: on - -import torch -from aimet_torch.cross_layer_equalization import equalize_model -from aimet_torch.model_preparer import prepare_model -from aimet_torch.quantsim import QuantizationSimModel, load_encodings_to_sim - from qai_hub_models.models.resnet18.model import ResNet18 -from qai_hub_models.utils.aimet.config_loader import get_default_aimet_config -from qai_hub_models.utils.asset_loaders import CachedWebModelAsset +from qai_hub_models.utils.quantization import HubQuantizableMixin MODEL_ID = __name__.split(".")[-2] -MODEL_ASSET_VERSION = 8 -DEFAULT_ENCODINGS = "resnet18_quantized_encodings.json" - - -class ResNet18Quantizable(AIMETQuantizableMixin, ResNet18): - """ResNet with post train quantization support. - - Supports only 8 bit weights and activations, and only loads pre-quantized checkpoints. - Support for quantizing using your own weights & data will come at a later date.""" - - def __init__( - self, - resnet18_model: QuantizationSimModel, - ) -> None: - # Input is already normalized by sim_model. Disable it in the wrapper model. - ResNet18.__init__(self, resnet18_model.model, normalize_input=False) - AIMETQuantizableMixin.__init__( - self, - resnet18_model, - ) - - @classmethod - def from_pretrained( - cls, - aimet_encodings: str | None = "DEFAULT", - ) -> "ResNet18Quantizable": - """ - Parameters: - aimet_encodings: - if "DEFAULT": Loads the model with aimet encodings calibrated on imagenette. - elif None: Doesn't load any encodings. Used when computing encodings. - else: Interprets as a filepath and loads the encodings stored there. 
- """ - model = ResNet18.from_pretrained() - input_shape = cls.get_input_spec()["image_tensor"][0] - - model = prepare_model(model) - equalize_model(model, input_shape) - sim = QuantizationSimModel( - model, - quant_scheme="tf_enhanced", - default_param_bw=8, - default_output_bw=8, - config_file=get_default_aimet_config(), - dummy_input=torch.rand(input_shape), - ) - constrain_quantized_inputs_to_image_range(sim) - if aimet_encodings: - if aimet_encodings == "DEFAULT": - aimet_encodings = CachedWebModelAsset.from_asset_store( - MODEL_ID, MODEL_ASSET_VERSION, DEFAULT_ENCODINGS - ).fetch() - load_encodings_to_sim(sim, aimet_encodings) - return cls(sim) +class ResNet18Quantizable(HubQuantizableMixin, ResNet18): + pass diff --git a/qai_hub_models/models/resnet18_quantized/perf.yaml b/qai_hub_models/models/resnet18_quantized/perf.yaml index cc3fa5fe..fd587c06 100644 --- a/qai_hub_models/models/resnet18_quantized/perf.yaml +++ b/qai_hub_models/models/resnet18_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,44 +20,39 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8775P Proxy models: - name: ResNet18Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 409.0 - throughput: 2444.987775061125 + inference_time: 406.0 + throughput: 2463.054187192118 estimated_peak_memory_range: min: 12288 - max: 14808408 + max: 1536376 primary_compute_unit: NPU precision: int8 layer_info: @@ -64,29 +60,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 41 - job_id: jnp10eln5 + job_id: jgz327q45 job_status: Passed torchscript_onnx_qnn: - inference_time: 633.0 - throughput: 1579.778830963665 + inference_time: 624.0 + throughput: 1602.5641025641025 estimated_peak_memory_range: min: 16384 - max: 8307960 + max: 8236344 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 37 + layers_on_npu: 54 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 37 - job_id: jqpye6w0g + total_layers: 54 + job_id: jp2kxmq6p job_status: Passed torchscript_onnx: - inference_time: 723.0 - throughput: 1383.1258644536654 + inference_time: 708.0 + throughput: 1412.4293785310736 estimated_peak_memory_range: - min: 77824 - max: 1547736 + min: 16384 + max: 13985416 primary_compute_unit: NPU precision: int8 layer_info: @@ -94,7 +90,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 74 - job_id: j7gjxe41p + job_id: jgjvdl91g job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -103,13 +99,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:53:02Z' 
+ timestamp: '2024-10-17T17:22:56Z' - torchscript_onnx_tflite: - inference_time: 354.0 - throughput: 2824.858757062147 + inference_time: 311.0 + throughput: 3215.434083601286 estimated_peak_memory_range: - min: 20480 - max: 27231504 + min: 12288 + max: 28261504 primary_compute_unit: NPU precision: int8 layer_info: @@ -117,29 +113,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 41 - job_id: jvgdwo965 + job_id: j5wew9045 job_status: Passed torchscript_onnx_qnn: - inference_time: 478.0 - throughput: 2092.050209205021 + inference_time: 477.0 + throughput: 2096.4360587002097 estimated_peak_memory_range: - min: 0 - max: 13139152 + min: 180224 + max: 12752880 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 37 + layers_on_npu: 54 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 37 - job_id: j1p8ozvqg + total_layers: 54 + job_id: jpy1zdk0p job_status: Passed torchscript_onnx: - inference_time: 545.0 - throughput: 1834.8623853211009 + inference_time: 505.0 + throughput: 1980.1980198019803 estimated_peak_memory_range: min: 12288 - max: 34316352 + max: 34808496 primary_compute_unit: NPU precision: int8 layer_info: @@ -147,7 +143,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 74 - job_id: jlpe9k38g + job_id: jpedovq85 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -156,13 +152,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:53:03Z' + timestamp: '2024-10-17T17:22:58Z' - torchscript_onnx_tflite: - inference_time: 406.0 - throughput: 2463.054187192118 + inference_time: 1418.0 + throughput: 705.2186177715091 estimated_peak_memory_range: min: 12288 - max: 1377008 + max: 18548880 primary_compute_unit: NPU precision: int8 layer_info: @@ -170,37 +166,60 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 41 - job_id: jz57zxwnp + job_id: jg9l047mg job_status: Passed torchscript_onnx_qnn: - inference_time: 599.0 - throughput: 1669.449081803005 + inference_time: 2033.0 + throughput: 491.88391539596654 estimated_peak_memory_range: - min: 184320 - max: 1443088 + min: 163840 + max: 7996624 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 37 + layers_on_npu: 54 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 37 - job_id: jn5q83oe5 + total_layers: 54 + job_id: jp0z4rw05 job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:52:56Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-17T17:22:42Z' - torchscript_onnx_tflite: - inference_time: 470.0 - throughput: 2127.659574468085 + inference_time: 7144.0 + throughput: 139.97760358342666 + estimated_peak_memory_range: + min: 12288 + max: 6250336 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 41 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 41 + job_id: jp1428knp + job_status: Passed + reference_device_info: + name: RB5 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8250 Proxy + timestamp: '2024-10-17T17:22:27Z' + - torchscript_onnx_tflite: + inference_time: 406.0 + throughput: 2463.054187192118 estimated_peak_memory_range: - min: 20480 - max: 28156960 + min: 12288 + max: 1447784 primary_compute_unit: NPU precision: int8 layer_info: @@ -208,37 +227,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 41 - job_id: jqp4qvo2g + job_id: jgdxnvy6p job_status: Passed 
torchscript_onnx_qnn: - inference_time: 702.0 - throughput: 1424.5014245014245 + inference_time: 600.0 + throughput: 1666.6666666666667 estimated_peak_memory_range: - min: 163840 - max: 14723536 + min: 176128 + max: 1446944 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 37 + layers_on_npu: 54 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 37 - job_id: jwgoy3q15 + total_layers: 54 + job_id: jp8q27nqp job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:53:00Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-17T17:22:44Z' - torchscript_onnx_tflite: - inference_time: 406.0 - throughput: 2463.054187192118 + inference_time: 401.0 + throughput: 2493.7655860349128 estimated_peak_memory_range: min: 16384 - max: 8250584 + max: 1446832 primary_compute_unit: NPU precision: int8 layer_info: @@ -246,37 +265,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 41 - job_id: j0pxvyj8g + job_id: j57y2d1n5 job_status: Passed torchscript_onnx_qnn: inference_time: 603.0 throughput: 1658.374792703151 estimated_peak_memory_range: min: 180224 - max: 1981520 + max: 1882592 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 37 + layers_on_npu: 54 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 37 - job_id: j1gln3r2p + total_layers: 54 + job_id: j5q602nep job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:52:57Z' + chipset: SA8255P Proxy + timestamp: '2024-10-17T17:22:47Z' - torchscript_onnx_tflite: - inference_time: 412.0 - throughput: 2427.1844660194174 + inference_time: 405.0 + throughput: 2469.135802469136 estimated_peak_memory_range: min: 12288 - max: 1487056 + max: 1501904 primary_compute_unit: NPU precision: int8 layer_info: @@ -284,22 +303,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 41 - job_id: jo5mr327g + job_id: jp4lnw625 job_status: Passed torchscript_onnx_qnn: - inference_time: 601.0 - throughput: 1663.8935108153078 + inference_time: 600.0 + throughput: 1666.6666666666667 estimated_peak_memory_range: - min: 188416 - max: 1522872 + min: 184320 + max: 1963040 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 37 + layers_on_npu: 54 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 37 - job_id: jw566nln5 + total_layers: 54 + job_id: jglv4kd25 job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -307,14 +326,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:52:58Z' + chipset: SA8775P Proxy + timestamp: '2024-10-17T17:22:48Z' - torchscript_onnx_tflite: - inference_time: 407.0 - throughput: 2457.002457002457 + inference_time: 470.0 + throughput: 2127.659574468085 estimated_peak_memory_range: min: 12288 - max: 14964664 + max: 29889024 primary_compute_unit: NPU precision: int8 layer_info: @@ -322,37 +341,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 41 - job_id: jegn23yjg + job_id: jpxk91885 job_status: Passed torchscript_onnx_qnn: - inference_time: 602.0 - throughput: 1661.1295681063123 + inference_time: 703.0 + throughput: 1422.475106685633 estimated_peak_memory_range: - min: 180224 - max: 1363016 + min: 163840 + max: 15623552 primary_compute_unit: NPU precision: int8 
layer_info: - layers_on_npu: 37 + layers_on_npu: 54 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 37 - job_id: j1p3ke2m5 + total_layers: 54 + job_id: j56y21xnp job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:52:59Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-17T17:22:50Z' - torchscript_onnx_tflite: - inference_time: 1347.0 - throughput: 742.3904974016333 + inference_time: 299.0 + throughput: 3344.4816053511704 estimated_peak_memory_range: min: 12288 - max: 18336272 + max: 16167008 primary_compute_unit: NPU precision: int8 layer_info: @@ -360,75 +379,67 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 41 - job_id: joprkeqk5 + job_id: j5mnez17p job_status: Passed torchscript_onnx_qnn: - inference_time: 2244.0 - throughput: 445.63279857397504 + inference_time: 437.0 + throughput: 2288.329519450801 estimated_peak_memory_range: - min: 16384 - max: 7546080 + min: 0 + max: 9301760 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 37 + layers_on_npu: 54 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 37 - job_id: j1pv3vxz5 + total_layers: 54 + job_id: jp3jnmdmg job_status: Passed - reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot - os_name: Android - manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T11:53:01Z' - - torchscript_onnx_tflite: - inference_time: 7096.0 - throughput: 140.92446448703495 + torchscript_onnx: + inference_time: 525.0 + throughput: 1904.7619047619048 estimated_peak_memory_range: - min: 90112 - max: 7545872 + min: 0 + max: 20260384 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 41 + layers_on_npu: 74 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 41 - job_id: jep28l66p + total_layers: 74 + job_id: j5wew9k45 job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T11:52:52Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-17T17:23:01Z' - torchscript_onnx_qnn: - inference_time: 687.0 - throughput: 1455.604075691412 + inference_time: 691.0 + throughput: 1447.178002894356 estimated_peak_memory_range: - min: 520192 - max: 520192 + min: 487424 + max: 487424 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 37 + layers_on_npu: 54 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 37 - job_id: jogkz3mvg + total_layers: 54 + job_id: jgkevy1vg job_status: Passed torchscript_onnx: - inference_time: 712.0 - throughput: 1404.4943820224719 + inference_time: 717.0 + throughput: 1394.700139470014 estimated_peak_memory_range: - min: 13742080 - max: 13742080 + min: 14733312 + max: 14733312 primary_compute_unit: NPU precision: int8 layer_info: @@ -436,7 +447,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 74 - job_id: jygzerk4g + job_id: jgz327645 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -445,4 +456,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:53:04Z' + timestamp: '2024-10-17T17:23:00Z' diff --git a/qai_hub_models/models/resnet18_quantized/requirements.txt b/qai_hub_models/models/resnet18_quantized/requirements.txt deleted file mode 100644 index de5b80e8..00000000 --- 
a/qai_hub_models/models/resnet18_quantized/requirements.txt +++ /dev/null @@ -1 +0,0 @@ -aimet-torch==1.32.1.post1; sys_platform == "linux" diff --git a/qai_hub_models/models/resnet18_quantized/test.py b/qai_hub_models/models/resnet18_quantized/test.py deleted file mode 100644 index 4405e8d2..00000000 --- a/qai_hub_models/models/resnet18_quantized/test.py +++ /dev/null @@ -1,30 +0,0 @@ -# --------------------------------------------------------------------- -# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. -# SPDX-License-Identifier: BSD-3-Clause -# --------------------------------------------------------------------- -from qai_hub_models.models._shared.imagenet_classifier.test_utils import ( - run_imagenet_classifier_test, -) -from qai_hub_models.models.resnet18_quantized.demo import main as demo_main -from qai_hub_models.models.resnet18_quantized.model import ( - MODEL_ASSET_VERSION, - MODEL_ID, - ResNet18Quantizable, -) - - -def test_task(): - run_imagenet_classifier_test( - ResNet18Quantizable.from_pretrained(), - MODEL_ID, - probability_threshold=0.45, - diff_tol=0.005, - rtol=0.02, - atol=0.2, - asset_version=MODEL_ASSET_VERSION, - ) - - -def test_demo(): - # Verify demo does not crash - demo_main(is_test=True) diff --git a/qai_hub_models/models/resnet50/README.md b/qai_hub_models/models/resnet50/README.md index 96ba5cac..38950979 100644 --- a/qai_hub_models/models/resnet50/README.md +++ b/qai_hub_models/models/resnet50/README.md @@ -6,7 +6,7 @@ ResNet50 is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of ResNet50 found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/resnet50). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.resnet50.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of ResNet50 can be found +* The license for the original implementation of ResNet50 can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
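The rewritten `export.py` above moves quantization from local AIMET simulation onto AI Hub itself: trace the FP32 model, compile it to ONNX, submit a quantize job with imagenette calibration data, then compile the quantized ONNX to an on-device asset. A minimal standalone sketch of that recipe, assuming AI Hub access — the device name and the sample count of 100 are the illustrative defaults taken from the script:

```python
import qai_hub as hub
import torch

from qai_hub_models.models.resnet18_quantized import Model
from qai_hub_models.utils.input_spec import make_torch_inputs
from qai_hub_models.utils.quantization import get_calibration_data

# Step 1: instantiate the pretrained FP32 model and trace it on CPU.
model = Model.from_pretrained()
input_spec = model.get_input_spec()
source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec))

device = hub.Device("Samsung Galaxy S23 (Family)")  # illustrative default

# Step 2: convert the traced model to ONNX on Hub, then quantize the ONNX
# model using imagenette calibration samples.
onnx_compile_job = hub.submit_compile_job(
    model=source_model,
    input_specs=input_spec,
    device=device,
    name="resnet18_quantized",
    options="--target_runtime onnx",
)
quantize_job = hub.submit_quantize_job(
    model=onnx_compile_job.get_target_model(),
    calibration_data=get_calibration_data(input_spec, "imagenette", 100),
    weights_dtype=model.get_weights_dtype(),
    activations_dtype=model.get_activations_dtype(),
    name="resnet18_quantized",
    options=model.get_quantize_options(),
)

# Step 3: compile the quantized ONNX model to an asset that runs on device.
compile_job = hub.submit_compile_job(
    model=quantize_job.get_target_model(),
    input_specs=input_spec,
    device=device,
    name="resnet18_quantized",
)
```

With `skip_compiling=True` the generated script stops after step 2 and returns `ExportResult(quantize_job=quantize_job)`; every later stage builds on `quantize_job.get_target_model()`.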
diff --git a/qai_hub_models/models/resnet50/export.py b/qai_hub_models/models/resnet50/export.py index c61bb0a7..cf8e0502 100644 --- a/qai_hub_models/models/resnet50/export.py +++ b/qai_hub_models/models/resnet50/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.resnet50 import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "resnet50" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. 
Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/resnet50/perf.yaml b/qai_hub_models/models/resnet50/perf.yaml index 5d94dd8c..71f36a73 100644 --- a/qai_hub_models/models/resnet50/perf.yaml +++ b/qai_hub_models/models/resnet50/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: ResNet50 performance_metrics: - torchscript_onnx_tflite: - inference_time: 2268.0 - throughput: 440.9171075837742 + inference_time: 2278.0 + throughput: 438.98156277436345 estimated_peak_memory_range: - min: 45056 - max: 1952880 + min: 36864 + max: 1999784 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 79 - job_id: jmg9vwvm5 + job_id: jgo26yxdp job_status: Passed torchscript_onnx_qnn: - inference_time: 
2399.0 - throughput: 416.84035014589415 + inference_time: 2384.0 + throughput: 419.46308724832215 estimated_peak_memory_range: - min: 622592 - max: 183955168 + min: 618496 + max: 182094504 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: jegn232jg + job_id: jprv3kzeg job_status: Passed torchscript_onnx: - inference_time: 2363.0 - throughput: 423.1908590774439 + inference_time: 2347.0 + throughput: 426.075841499787 estimated_peak_memory_range: - min: 16384 - max: 754917776 + min: 618496 + max: 2422224 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 128 - job_id: jw566njn5 + job_id: jgo26yjdp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:52:05Z' + timestamp: '2024-10-15T17:20:50Z' - torchscript_onnx_tflite: - inference_time: 1789.0 - throughput: 558.9714924538848 + inference_time: 1785.0 + throughput: 560.2240896358544 estimated_peak_memory_range: - min: 16384 - max: 78037504 + min: 12288 + max: 79880464 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 79 - job_id: jnp10e0n5 + job_id: jpedm9q05 job_status: Passed torchscript_onnx_qnn: - inference_time: 1870.0 - throughput: 534.75935828877 + inference_time: 2356.0 + throughput: 424.44821731748726 estimated_peak_memory_range: min: 618496 - max: 24347936 + max: 27992256 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: joprkekk5 + job_id: jp2ky82mp job_status: Passed torchscript_onnx: - inference_time: 1975.0 - throughput: 506.32911392405066 + inference_time: 1889.0 + throughput: 529.3806246691371 estimated_peak_memory_range: - min: 0 - max: 80285824 + min: 618496 + max: 82447184 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 128 - job_id: j1p3ke3m5 + job_id: jgjvnxj8g job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:52:06Z' + timestamp: '2024-10-15T17:20:51Z' - torchscript_onnx_tflite: inference_time: 2253.0 throughput: 443.85264092321347 estimated_peak_memory_range: - min: 0 - max: 700208024 + min: 28672 + max: 725029512 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 79 - job_id: jvgdwow65 + job_id: jg9lnvrvg job_status: Passed torchscript_onnx_qnn: - inference_time: 2177.0 - throughput: 459.34772622875516 + inference_time: 2157.0 + throughput: 463.60686138154847 estimated_peak_memory_range: - min: 634880 - max: 1858536 + min: 659456 + max: 1866720 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: j2p0ylq0g + job_id: jp0z0yne5 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:52:00Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T17:20:44Z' - torchscript_onnx_tflite: - inference_time: 3080.0 - 
throughput: 324.6753246753247 + inference_time: 2273.0 + throughput: 439.9472063352398 estimated_peak_memory_range: - min: 16384 - max: 67111280 + min: 32768 + max: 2046360 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 79 - job_id: jz57zxznp + job_id: jpxkovw95 job_status: Passed torchscript_onnx_qnn: - inference_time: 3222.0 - throughput: 310.36623215394167 + inference_time: 2185.0 + throughput: 457.66590389016017 estimated_peak_memory_range: - min: 618496 - max: 18811248 + min: 634880 + max: 1925704 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: j1gln3z2p + job_id: j5q6q8jmp job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:52:05Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T17:20:47Z' - torchscript_onnx_tflite: - inference_time: 2276.0 - throughput: 439.3673110720562 + inference_time: 2277.0 + throughput: 439.17435221783046 estimated_peak_memory_range: - min: 0 - max: 2148096 + min: 36864 + max: 2635800 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 79 - job_id: jqp4qvq2g + job_id: jp4lrqzl5 job_status: Passed torchscript_onnx_qnn: - inference_time: 2192.0 - throughput: 456.2043795620438 + inference_time: 2185.0 + throughput: 457.66590389016017 estimated_peak_memory_range: - min: 638976 - max: 2148800 + min: 659456 + max: 2019208 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: j1p8oz9qg + job_id: jgkexzjog job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:52:01Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T17:20:46Z' - torchscript_onnx_tflite: - inference_time: 2265.0 - throughput: 441.5011037527594 + inference_time: 2268.0 + throughput: 440.9171075837742 estimated_peak_memory_range: - min: 16384 - max: 2511264 + min: 20480 + max: 2788896 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 79 - job_id: j0pxvyv8g + job_id: j57yrzqr5 job_status: Passed torchscript_onnx_qnn: - inference_time: 2198.0 - throughput: 454.9590536851683 + inference_time: 2179.0 + throughput: 458.9261128958238 estimated_peak_memory_range: - min: 630784 - max: 2162136 + min: 688128 + max: 1940992 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: jogkz3nvg + job_id: jp8qyol8p job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:52:02Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T17:20:45Z' - torchscript_onnx_tflite: - inference_time: 2271.0 - throughput: 440.33465433729634 + inference_time: 3097.0 + throughput: 322.8931223764934 estimated_peak_memory_range: - min: 28672 - max: 1915544 + min: 16384 + max: 67716480 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 
+314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 79 - job_id: jo5mr3r7g + job_id: jgdx1wklp job_status: Passed torchscript_onnx_qnn: - inference_time: 2224.0 - throughput: 449.64028776978415 + inference_time: 3127.0 + throughput: 319.79533098816756 estimated_peak_memory_range: - min: 626688 - max: 2224152 + min: 618496 + max: 21181120 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: jn5q83ke5 + job_id: j56y46k7p job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:52:03Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-15T17:20:48Z' + - torchscript_onnx_tflite: + inference_time: 1550.0 + throughput: 645.1612903225806 + estimated_peak_memory_range: + min: 12288 + max: 31548912 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 79 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 79 + job_id: jgn6v2jm5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 1669.0 + throughput: 599.1611743559017 + estimated_peak_memory_range: + min: 0 + max: 23772240 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 126 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 126 + job_id: jp3j0kyzg + job_status: Passed + torchscript_onnx: + inference_time: 1668.0 + throughput: 599.5203836930456 + estimated_peak_memory_range: + min: 0 + max: 31393232 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 128 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 128 + job_id: j57yrzzr5 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T17:20:54Z' - torchscript_onnx_qnn: - inference_time: 2317.0 - throughput: 431.59257660768236 + inference_time: 2312.0 + throughput: 432.52595155709344 estimated_peak_memory_range: min: 602112 max: 602112 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: jqpye6e0g + job_id: jpy13e94p job_status: Passed torchscript_onnx: - inference_time: 2312.0 - throughput: 432.52595155709344 + inference_time: 2328.0 + throughput: 429.553264604811 estimated_peak_memory_range: - min: 52379648 - max: 52379648 + min: 52461568 + max: 52461568 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 128 - job_id: jwgoy3015 + job_id: j5we6ojj5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:52:07Z' + timestamp: '2024-10-15T17:20:52Z' diff --git a/qai_hub_models/models/resnet50_quantized/README.md b/qai_hub_models/models/resnet50_quantized/README.md index e6e0a463..726fb4b4 100644 --- a/qai_hub_models/models/resnet50_quantized/README.md +++ b/qai_hub_models/models/resnet50_quantized/README.md @@ -6,7 +6,7 @@ ResNet50 is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. 
This is based on the implementation of ResNet50Quantized found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/resnet50_quantized). @@ -17,11 +17,6 @@ ## Example & Usage -Install the package via pip: -```bash -pip install "qai_hub_models[resnet50_quantized]" -``` - Once installed, run the following simple CLI demo: @@ -44,15 +39,19 @@ python -m qai_hub_models.models.resnet50_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of ResNet50Quantized can be found +* The license for the original implementation of ResNet50Quantized can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
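The `resnet50_quantized` scripts below follow the same on-Hub quantization recipe, and `export_model` now returns an `ExportResult` struct instead of the old 3-tuple. A hypothetical caller-side sketch, assuming AI Hub access; the device string and sample count mirror the script defaults:

```python
from qai_hub_models.models.resnet50_quantized.export import export_model

# Quantize and compile on AI Hub, skipping on-device profiling and
# inference; 100 calibration samples is the script default.
result = export_model(
    device="Samsung Galaxy S23 (Family)",
    num_calibration_samples=100,
    skip_profiling=True,
    skip_inferencing=True,
)

# ExportResult replaces the old (compile_job, profile_job, inference_job)
# tuple; stages that were skipped come back as None.
print(result.quantize_job)  # quantize job handle, always submitted
print(result.compile_job)   # compile job handle (None if skip_compiling=True)
assert result.profile_job is None    # skipped above
assert result.inference_job is None  # skipped above
```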
diff --git a/qai_hub_models/models/resnet50_quantized/evaluate.py b/qai_hub_models/models/resnet50_quantized/evaluate.py index 8fdc840e..42a16a6e 100644 --- a/qai_hub_models/models/resnet50_quantized/evaluate.py +++ b/qai_hub_models/models/resnet50_quantized/evaluate.py @@ -13,10 +13,8 @@ from qai_hub_models.models.resnet50_quantized import MODEL_ID, Model from qai_hub_models.utils.args import evaluate_parser, get_hub_device, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.evaluate import evaluate_on_dataset from qai_hub_models.utils.inference import compile_model_from_args -from qai_hub_models.utils.quantization_aimet import AIMETQuantizableMixin SUPPORTED_DATASETS = ["imagenette", "imagenet"] @@ -27,6 +25,7 @@ def main(): model_cls=Model, default_split_size=2500, supported_datasets=SUPPORTED_DATASETS, + is_hub_quantized=True, ) args = parser.parse_args() args.device = None @@ -38,13 +37,7 @@ def main(): MODEL_ID, args, get_model_kwargs(Model, vars(args)) ) hub_device = get_hub_device(None, args.chipset) - - # Use Fp16 model for torch inference - for cls in Model.__mro__: - if issubclass(cls, BaseModel) and not issubclass(cls, AIMETQuantizableMixin): - torch_cls = cls - break - torch_model = torch_cls.from_pretrained(**get_model_kwargs(torch_cls, vars(args))) + torch_model = Model.from_pretrained(**get_model_kwargs(Model, vars(args))) evaluate_on_dataset( hub_model, torch_model, diff --git a/qai_hub_models/models/resnet50_quantized/export.py b/qai_hub_models/models/resnet50_quantized/export.py index bc8004f9..dcc61718 100644 --- a/qai_hub_models/models/resnet50_quantized/export.py +++ b/qai_hub_models/models/resnet50_quantized/export.py @@ -10,18 +10,20 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.resnet50_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference +from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( print_inference_metrics, print_on_target_demo_cmd, @@ -31,11 +33,14 @@ can_access_qualcomm_ai_hub, export_without_hub_access, ) +from qai_hub_models.utils.quantization import get_calibration_data def export_model( device: str = "Samsung Galaxy S23 (Family)", chipset: Optional[str] = None, + num_calibration_samples: int = 100, + skip_compiling: bool = False, skip_profiling: bool = False, skip_inferencing: bool = False, skip_downloading: bool = False, @@ -45,20 +50,19 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. 
Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + 3. Compiles the model to an asset that can be run on device + 4. Profiles the model performance on a real device + 5. Inferences the model on sample inputs + 6. Downloads the model asset to the local directory + 7. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 5 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -66,6 +70,9 @@ def export_model( Defaults to DEFAULT_DEVICE if not specified. chipset: If set, will choose a random device with this chipset. Overrides the `device` argument. + num_calibration_samples: The number of calibration data samples + to use for quantization. + skip_compiling: If set, skips compiling model to format that can run on device. skip_profiling: If set, skips profiling of compiled model on real devices. skip_inferencing: If set, skips computing on-device outputs from sample data. skip_downloading: If set, skips downloading of compiled model. @@ -80,10 +87,11 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: - * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). + A struct of: + * A CompileJob object containing metadata about the compile job submitted to hub (None if compiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). + * A QuantizeJob object containing metadata about the quantize job submitted to hub """ model_name = "resnet50_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,33 +117,52 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) ) # Trace the model - source_model = model.convert_to_hub_source_model( - target_runtime, output_path, input_spec + source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) + + print(f"Quantizing model {model_name} with {num_calibration_samples} samples.") + # 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + onnx_compile_job = hub.submit_compile_job( + model=source_model, + input_specs=input_spec, + device=hub_device, + name=model_name, + options="--target_runtime onnx", + ) + quantize_job = hub.submit_quantize_job( + model=onnx_compile_job.get_target_model(), + calibration_data=get_calibration_data( + input_spec, "imagenette", num_calibration_samples + ), + weights_dtype=model.get_weights_dtype(), + activations_dtype=model.get_activations_dtype(), + name=model_name, + options=model.get_quantize_options(), ) + if skip_compiling: + return ExportResult(quantize_job=quantize_job) - # 2. Compile the model to an on-device asset + # 3. 
Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) print(f"Optimizing model {model_name} to run on-device") submitted_compile_job = hub.submit_compile_job( - model=source_model, + model=quantize_job.get_target_model(), input_specs=input_spec, device=hub_device, name=model_name, - calibration_data=model.get_calibration_data(target_runtime), options=model_compile_options, ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 4. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +177,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 5. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +198,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 6. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 7. Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -202,12 +229,17 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + quantize_job=quantize_job, + ) def main(): warnings.filterwarnings("ignore") - parser = export_parser(model_cls=Model) + parser = export_parser(model_cls=Model, is_hub_quantized=True) args = parser.parse_args() export_model(**vars(args)) diff --git a/qai_hub_models/models/resnet50_quantized/model.py b/qai_hub_models/models/resnet50_quantized/model.py index 54f44eb1..35c5399f 100644 --- a/qai_hub_models/models/resnet50_quantized/model.py +++ b/qai_hub_models/models/resnet50_quantized/model.py @@ -4,78 +4,11 @@ # --------------------------------------------------------------------- from __future__ import annotations -# isort: off -# This verifies aimet is installed, and this must be included first. 
-from qai_hub_models.utils.quantization_aimet import ( - AIMETQuantizableMixin, - constrain_quantized_inputs_to_image_range, -) - -# isort: on - -import torch -from aimet_torch.cross_layer_equalization import equalize_model -from aimet_torch.model_preparer import prepare_model -from aimet_torch.quantsim import QuantizationSimModel, load_encodings_to_sim - from qai_hub_models.models.resnet50.model import ResNet50 -from qai_hub_models.utils.aimet.config_loader import get_default_aimet_config -from qai_hub_models.utils.asset_loaders import CachedWebModelAsset +from qai_hub_models.utils.quantization import HubQuantizableMixin MODEL_ID = __name__.split(".")[-2] -MODEL_ASSET_VERSION = 6 -DEFAULT_ENCODINGS = "resnet50_quantized_encodings.json" - - -class ResNet50Quantizable(AIMETQuantizableMixin, ResNet50): - """ResNet with post train quantization support. - - Supports only 8 bit weights and activations, and only loads pre-quantized checkpoints. - Support for quantizing using your own weights & data will come at a later date.""" - - def __init__( - self, - resnet50_model: QuantizationSimModel, - ) -> None: - # Input is already normalized by sim_model. Disable it in the wrapper model. - ResNet50.__init__(self, resnet50_model.model, normalize_input=False) - AIMETQuantizableMixin.__init__( - self, - resnet50_model, - ) - - @classmethod - def from_pretrained( - cls, - aimet_encodings: str | None = "DEFAULT", - ) -> "ResNet50Quantizable": - """ - Parameters: - aimet_encodings: - if "DEFAULT": Loads the model with aimet encodings calibrated on imagenette. - elif None: Doesn't load any encodings. Used when computing encodings. - else: Interprets as a filepath and loads the encodings stored there. - """ - model = ResNet50.from_pretrained() - input_shape = cls.get_input_spec()["image_tensor"][0] - - model = prepare_model(model) - equalize_model(model, input_shape) - sim = QuantizationSimModel( - model, - quant_scheme="tf_enhanced", - default_param_bw=8, - default_output_bw=8, - config_file=get_default_aimet_config(), - dummy_input=torch.rand(input_shape), - ) - constrain_quantized_inputs_to_image_range(sim) - if aimet_encodings: - if aimet_encodings == "DEFAULT": - aimet_encodings = CachedWebModelAsset.from_asset_store( - MODEL_ID, MODEL_ASSET_VERSION, DEFAULT_ENCODINGS - ).fetch() - load_encodings_to_sim(sim, aimet_encodings) - return cls(sim) +class ResNet50Quantizable(HubQuantizableMixin, ResNet50): + pass diff --git a/qai_hub_models/models/resnet50_quantized/perf.yaml b/qai_hub_models/models/resnet50_quantized/perf.yaml index 89800b77..4ccd953f 100644 --- a/qai_hub_models/models/resnet50_quantized/perf.yaml +++ b/qai_hub_models/models/resnet50_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,41 +20,36 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p 
Proxy - - Qcs8250 Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8775P Proxy models: - name: ResNet50Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 783.0 - throughput: 1277.139208173691 + inference_time: 788.0 + throughput: 1269.0355329949239 estimated_peak_memory_range: - min: 24576 - max: 2120120 + min: 12288 + max: 11412928 primary_compute_unit: NPU precision: int8 layer_info: @@ -61,29 +57,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: jz57zxq9p + job_id: jgkevy0ng job_status: Passed torchscript_onnx_qnn: - inference_time: 1003.0 - throughput: 997.0089730807578 + inference_time: 1001.0 + throughput: 999.000999000999 estimated_peak_memory_range: - min: 12288 - max: 255020432 + min: 16384 + max: 33294216 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: j1p8ozlog + total_layers: 127 + job_id: jg9l04q8g job_status: Passed torchscript_onnx: - inference_time: 1591.0 - throughput: 628.5355122564425 + inference_time: 1526.0 + throughput: 655.307994757536 estimated_peak_memory_range: min: 16384 - max: 30887536 + max: 31718720 primary_compute_unit: NPU precision: int8 layer_info: @@ -91,7 +87,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: jlpe9k9vg + job_id: jgn60ewj5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -100,13 +96,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:51:26Z' + timestamp: '2024-10-17T17:21:42Z' - torchscript_onnx_tflite: - inference_time: 643.0 - throughput: 1555.2099533437015 + inference_time: 584.0 + throughput: 1712.3287671232877 estimated_peak_memory_range: - min: 16384 - max: 64793440 + min: 12288 + max: 66134624 primary_compute_unit: NPU precision: int8 layer_info: @@ -114,29 +110,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: jqp4qvz1g + job_id: j5q6021op job_status: Passed torchscript_onnx_qnn: - inference_time: 754.0 - throughput: 1326.2599469496022 + inference_time: 749.0 + throughput: 1335.1134846461948 estimated_peak_memory_range: min: 167936 - max: 19516848 + max: 15555376 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jogkz3jng + total_layers: 127 + job_id: jp1428m7p job_status: Passed torchscript_onnx: - inference_time: 1179.0 - throughput: 848.1764206955047 + inference_time: 1128.0 + throughput: 886.5248226950355 estimated_peak_memory_range: - min: 0 - max: 95508896 + min: 151552 + max: 97130560 primary_compute_unit: NPU precision: int8 layer_info: @@ -144,7 +140,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: jygzerexg + job_id: jprv6y7kg job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -153,13 +149,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:51:27Z' + timestamp: '2024-10-17T17:21:44Z' - torchscript_onnx_tflite: - inference_time: 781.0 - throughput: 1280.4097311139565 + inference_time: 2827.0 + throughput: 353.73187124159887 estimated_peak_memory_range: - min: 40960 - max: 1465440 + min: 0 + max: 27793920 primary_compute_unit: NPU precision: int8 layer_info: @@ -167,37 +163,60 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: 
j0pxvywlg + job_id: jglv4kqm5 job_status: Passed torchscript_onnx_qnn: - inference_time: 943.0 - throughput: 1060.4453870625662 + inference_time: 4072.0 + throughput: 245.5795677799607 estimated_peak_memory_range: - min: 200704 - max: 1743616 + min: 208896 + max: 8039184 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: j1gln3nmp + total_layers: 127 + job_id: jgdxnvmzp job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:51:20Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-17T17:21:28Z' - torchscript_onnx_tflite: - inference_time: 912.0 - throughput: 1096.4912280701753 + inference_time: 11444.0 + throughput: 87.38203425375742 estimated_peak_memory_range: - min: 16384 - max: 65629312 + min: 32768 + max: 7067008 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 82 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 82 + job_id: j56y210yp + job_status: Passed + reference_device_info: + name: RB5 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8250 Proxy + timestamp: '2024-10-17T17:21:13Z' + - torchscript_onnx_tflite: + inference_time: 785.0 + throughput: 1273.8853503184714 + estimated_peak_memory_range: + min: 12288 + max: 3533448 primary_compute_unit: NPU precision: int8 layer_info: @@ -205,37 +224,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: jo5mr3j9g + job_id: jp3jnmrng job_status: Passed torchscript_onnx_qnn: - inference_time: 1148.0 - throughput: 871.0801393728223 + inference_time: 943.0 + throughput: 1060.4453870625662 estimated_peak_memory_range: - min: 167936 - max: 19544448 + min: 184320 + max: 1386016 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: j1pv3v3r5 + total_layers: 127 + job_id: j5wew9r45 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:51:24Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-17T17:21:30Z' - torchscript_onnx_tflite: - inference_time: 779.0 - throughput: 1283.6970474967907 + inference_time: 782.0 + throughput: 1278.772378516624 estimated_peak_memory_range: - min: 16384 - max: 32231048 + min: 12288 + max: 15579336 primary_compute_unit: NPU precision: int8 layer_info: @@ -243,37 +262,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: jegn23jqg + job_id: jgo2zv9kp job_status: Passed torchscript_onnx_qnn: - inference_time: 948.0 - throughput: 1054.8523206751054 + inference_time: 947.0 + throughput: 1055.9662090813094 estimated_peak_memory_range: - min: 208896 - max: 1931696 + min: 172032 + max: 1898736 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jw566n6y5 + total_layers: 127 + job_id: jp1428mnp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:51:21Z' + chipset: SA8255P Proxy + timestamp: 
'2024-10-17T17:21:33Z' - torchscript_onnx_tflite: - inference_time: 810.0 - throughput: 1234.567901234568 + inference_time: 787.0 + throughput: 1270.6480304955528 estimated_peak_memory_range: - min: 12288 - max: 3808264 + min: 16384 + max: 16498400 primary_compute_unit: NPU precision: int8 layer_info: @@ -281,22 +300,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: joprkez75 + job_id: jpv6qwnr5 job_status: Passed torchscript_onnx_qnn: - inference_time: 943.0 - throughput: 1060.4453870625662 + inference_time: 948.0 + throughput: 1054.8523206751054 estimated_peak_memory_range: - min: 16384 - max: 1642128 + min: 172032 + max: 1935696 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: j1p3kekn5 + total_layers: 127 + job_id: jgdxnvm6p job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -304,14 +323,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:51:22Z' + chipset: SA8775P Proxy + timestamp: '2024-10-17T17:21:34Z' - torchscript_onnx_tflite: - inference_time: 776.0 - throughput: 1288.659793814433 + inference_time: 909.0 + throughput: 1100.1100110011 estimated_peak_memory_range: - min: 12288 - max: 13155864 + min: 16384 + max: 66889440 primary_compute_unit: NPU precision: int8 layer_info: @@ -319,37 +338,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: jep28l2qp + job_id: jgjvdl8eg job_status: Passed torchscript_onnx_qnn: - inference_time: 959.0 - throughput: 1042.752867570386 + inference_time: 1135.0 + throughput: 881.0572687224669 estimated_peak_memory_range: - min: 184320 - max: 1657520 + min: 167936 + max: 18672800 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jwgoy3yk5 + total_layers: 127 + job_id: j57y2d8n5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:51:23Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-17T17:21:36Z' - torchscript_onnx_tflite: - inference_time: 2776.0 - throughput: 360.2305475504323 + inference_time: 518.0 + throughput: 1930.5019305019305 estimated_peak_memory_range: - min: 184320 - max: 28048624 + min: 12288 + max: 24413712 primary_compute_unit: NPU precision: int8 layer_info: @@ -357,75 +376,67 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: jqpye69lg + job_id: jpedovnv5 job_status: Passed torchscript_onnx_qnn: - inference_time: 4015.0 - throughput: 249.06600249066003 + inference_time: 693.0 + throughput: 1443.001443001443 estimated_peak_memory_range: - min: 204800 - max: 8527328 + min: 0 + max: 17452784 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: j7gjxexep + total_layers: 127 + job_id: jp4lnw225 job_status: Passed - reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot - os_name: Android - manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T11:51:25Z' - - torchscript_onnx_tflite: - inference_time: 11484.0 - throughput: 87.07767328456984 + torchscript_onnx: + inference_time: 918.0 + throughput: 1089.3246187363834 estimated_peak_memory_range: - min: 53248 
- max: 6890120 + min: 0 + max: 40616048 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 82 + layers_on_npu: 147 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 82 - job_id: j2p0ylnng + total_layers: 147 + job_id: jpy1zdy0p job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T11:51:16Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-17T17:21:46Z' - torchscript_onnx_qnn: - inference_time: 1008.0 - throughput: 992.063492063492 + inference_time: 1009.0 + throughput: 991.0802775024777 estimated_peak_memory_range: - min: 434176 - max: 434176 + min: 397312 + max: 397312 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jn5q83jo5 + total_layers: 127 + job_id: jg9l04qmg job_status: Passed torchscript_onnx: - inference_time: 1602.0 - throughput: 624.2197253433209 + inference_time: 1569.0 + throughput: 637.3486297004462 estimated_peak_memory_range: - min: 29212672 - max: 29212672 + min: 29220864 + max: 29220864 primary_compute_unit: NPU precision: int8 layer_info: @@ -433,7 +444,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: jz5woqomp + job_id: jp2kxmz6p job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -442,4 +453,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:51:28Z' + timestamp: '2024-10-17T17:21:45Z' diff --git a/qai_hub_models/models/resnet50_quantized/requirements.txt b/qai_hub_models/models/resnet50_quantized/requirements.txt deleted file mode 100644 index de5b80e8..00000000 --- a/qai_hub_models/models/resnet50_quantized/requirements.txt +++ /dev/null @@ -1 +0,0 @@ -aimet-torch==1.32.1.post1; sys_platform == "linux" diff --git a/qai_hub_models/models/resnet50_quantized/test.py b/qai_hub_models/models/resnet50_quantized/test.py deleted file mode 100644 index 55efb858..00000000 --- a/qai_hub_models/models/resnet50_quantized/test.py +++ /dev/null @@ -1,30 +0,0 @@ -# --------------------------------------------------------------------- -# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. -# SPDX-License-Identifier: BSD-3-Clause -# --------------------------------------------------------------------- -from qai_hub_models.models._shared.imagenet_classifier.test_utils import ( - run_imagenet_classifier_test, -) -from qai_hub_models.models.resnet50_quantized.demo import main as demo_main -from qai_hub_models.models.resnet50_quantized.model import ( - MODEL_ASSET_VERSION, - MODEL_ID, - ResNet50Quantizable, -) - - -def test_task(): - run_imagenet_classifier_test( - ResNet50Quantizable.from_pretrained(), - MODEL_ID, - probability_threshold=0.45, - diff_tol=0.005, - rtol=0.02, - atol=0.2, - asset_version=MODEL_ASSET_VERSION, - ) - - -def test_demo(): - # Verify demo does not crash - demo_main(is_test=True) diff --git a/qai_hub_models/models/resnext101/README.md b/qai_hub_models/models/resnext101/README.md index 1776cdf7..a23499fd 100644 --- a/qai_hub_models/models/resnext101/README.md +++ b/qai_hub_models/models/resnext101/README.md @@ -6,7 +6,7 @@ ResNeXt101 is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. 
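The resnet50_quantized changes earlier in this diff (model.py, requirements.txt, test.py) all serve one migration: quantization moves from a local AIMET `QuantizationSimModel` plus downloaded encodings to AI Hub quantize jobs, so the model class shrinks to `HubQuantizableMixin` over the FP32 `ResNet50` and the `aimet-torch` pin disappears. A minimal, hedged sketch of what loading the slimmed-down class might look like (signatures inferred from helpers referenced elsewhere in this diff, not verified against a released package):

```python
# Sketch: the Hub-quantizable ResNet50 is now just the FP32 model plus a mixin;
# no AIMET encodings file or aimet-torch install is needed locally.
from qai_hub_models.models.resnet50_quantized.model import ResNet50Quantizable

model = ResNet50Quantizable.from_pretrained()  # plain pretrained FP32 weights
input_spec = model.get_input_spec()            # {"image_tensor": ...}, per the removed code

# export.py passes these to the Hub quantize job; their existence is implied
# by the resnext101_quantized/export.py hunk later in this diff.
weights_dtype = model.get_weights_dtype()
activations_dtype = model.get_activations_dtype()
```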
This is based on the implementation of ResNeXt101 found
-[here](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py). This repository contains scripts for optimized on-device
+[here]({source_repo}). This repository contains scripts for optimized on-device
 export suitable to run on Qualcomm® devices. More details on model performance
 across various devices can be found [here](https://aihub.qualcomm.com/models/resnext101).

@@ -39,15 +39,19 @@ python -m qai_hub_models.models.resnext101.export
 Additional options are documented with the `--help` option. Note that the above
 script requires access to Deployment instructions for Qualcomm® AI Hub.

+
 ## License
-- The license for the original implementation of ResNeXt101 can be found
+* The license for the original implementation of ResNeXt101 can be found
   [here](https://github.com/pytorch/vision/blob/main/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf)
+* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf)
+
 ## References
 * [Aggregated Residual Transformations for Deep Neural Networks](https://arxiv.org/abs/1611.05431)
 * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py)
+
+
 ## Community
 * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI.
 * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
diff --git a/qai_hub_models/models/resnext101/export.py b/qai_hub_models/models/resnext101/export.py
index 4dfc4ad8..b9c1bdce 100644
--- a/qai_hub_models/models/resnext101/export.py
+++ b/qai_hub_models/models/resnext101/export.py
@@ -10,18 +10,18 @@
 import os
 import warnings
 from pathlib import Path
-from typing import Any, Dict, List, Optional, Tuple, cast
+from typing import Any, Dict, List, Optional, cast

 import qai_hub as hub
 import torch

+from qai_hub_models.models.common import ExportResult, TargetRuntime
 from qai_hub_models.models.resnext101 import Model
 from qai_hub_models.utils.args import (
     export_parser,
     get_input_spec_kwargs,
     get_model_kwargs,
 )
-from qai_hub_models.utils.base_model import TargetRuntime
 from qai_hub_models.utils.compare import torch_inference
 from qai_hub_models.utils.input_spec import make_torch_inputs
 from qai_hub_models.utils.printing import (
@@ -47,20 +47,18 @@ def export_model(
     compile_options: str = "",
     profile_options: str = "",
     **additional_model_kwargs,
-) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[
-    str
-]:
+) -> ExportResult | List[str]:
     """
-    This function accomplishes 6 main tasks:
+    This function executes the following recipe:

-    1. Instantiates a PyTorch model and converts it to a traced TorchScript format.
-    2. Compiles the model to an asset that can be run on device.
-    3. Profiles the model performance on real devices.
-    4. Inferences the model on sample inputs.
-    5. Downloads the model asset to the local directory.
-    6. Summarizes the results from profiling and inference.
+    1. Instantiates a PyTorch model and converts it to a traced TorchScript format
+    2. Compiles the model to an asset that can be run on device
+    3. Profiles the model performance on a real device
+    4. Inferences the model on sample inputs
+    5. Downloads the model asset to the local directory
+    6. Summarizes the results from profiling and inference

-    Each of the last four steps can be optionally skipped using the input options.
+    Each of the last 4 steps can be optionally skipped using the input options.

     Parameters:
         device: Device for which to export the model.
@@ -82,10 +80,10 @@ def export_model(
         `model_cls.from_pretrained` and `model.get_input_spec`

     Returns:
-       A 3-tuple of:
+       A struct of:
            * A CompileJob object containing metadata about the compile job submitted to hub.
-           * A ProfileJob containing metadata about the profile job (None if profiling skipped).
            * An InferenceJob containing metadata about the inference job (None if inferencing skipped).
+           * A ProfileJob containing metadata about the profile job (None if profiling skipped).
     """
     model_name = "resnext101"
     output_path = Path(output_dir or Path.cwd() / "build" / model_name)
@@ -111,7 +109,7 @@ def export_model(
     # On-device perf improves with I/O in channel_last format except when using ONNX.
     use_channel_last_format = target_runtime != TargetRuntime.ONNX

-    # 1. Initialize PyTorch model
+    # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format
     model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs))
     input_spec = model.get_input_spec(
         **get_input_spec_kwargs(model, additional_model_kwargs)
     )
@@ -120,7 +118,7 @@
     # Trace the model
     source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec))

-    # 2. Compile the model to an on-device asset
+    # 2. Compiles the model to an asset that can be run on device
     model_compile_options = model.get_hub_compile_options(
         target_runtime, compile_options, hub_device
     )
@@ -134,7 +132,7 @@
     )
     compile_job = cast(hub.client.CompileJob, submitted_compile_job)

-    # 3. Profile the model asset on real devices
+    # 3. Profiles the model performance on a real device
     profile_job: Optional[hub.client.ProfileJob] = None
     if not skip_profiling:
         profile_options_all = model.get_hub_profile_options(
@@ -149,7 +147,7 @@
         )
         profile_job = cast(hub.client.ProfileJob, submitted_profile_job)

-    # 4. Run inference on-device with sample inputs
+    # 4. Inferences the model on sample inputs
     inference_job: Optional[hub.client.InferenceJob] = None
     if not skip_inferencing:
         profile_options_all = model.get_hub_profile_options(
@@ -170,13 +168,13 @@
         )
         inference_job = cast(hub.client.InferenceJob, submitted_inference_job)

-    # 5. Download the model asset to a local file
+    # 5. Downloads the model asset to the local directory
     if not skip_downloading:
         os.makedirs(output_path, exist_ok=True)
         target_model: hub.Model = compile_job.get_target_model()  # type: ignore
         target_model.download(str(output_path / model_name))

-    # 6. Summarize the results from profiling and inference
+    # 6.
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/resnext101/perf.yaml b/qai_hub_models/models/resnext101/perf.yaml index 24d2ec21..4e370e28 100644 --- a/qai_hub_models/models/resnext101/perf.yaml +++ b/qai_hub_models/models/resnext101/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: ResNeXt101 performance_metrics: - torchscript_onnx_tflite: - inference_time: 6525.0 - throughput: 153.25670498084293 + inference_time: 6555.0 + throughput: 152.55530129672007 estimated_peak_memory_range: - min: 53248 - max: 2463624 + min: 20480 + max: 2412432 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: j0pxvyqlg + job_id: jgjvnxy7g job_status: Passed torchscript_onnx_qnn: - inference_time: 6685.0 - throughput: 149.58863126402395 + inference_time: 6671.0 + throughput: 149.902563333833 estimated_peak_memory_range: min: 16384 - max: 36743712 + max: 34405528 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: j1p8oznog + job_id: jpxkovrj5 job_status: Passed torchscript_onnx: - inference_time: 7135.0 - throughput: 140.1541695865452 + inference_time: 7073.0 + throughput: 141.38272303124558 estimated_peak_memory_range: - min: 630784 - max: 2624480 + min: 16384 + max: 203636728 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 247 - job_id: j7gjxejep + job_id: jglvmnqe5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:50:02Z' + timestamp: '2024-10-15T17:18:36Z' - torchscript_onnx_tflite: - inference_time: 5381.0 - throughput: 185.8390633711206 + inference_time: 5363.0 + throughput: 186.46280067126608 estimated_peak_memory_range: - min: 24576 - max: 375771296 + min: 20480 + max: 386795184 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 
+109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: jo5mr379g + job_id: jpedm9x75 job_status: Passed torchscript_onnx_qnn: - inference_time: 5477.0 - throughput: 182.58170531312763 + inference_time: 5439.0 + throughput: 183.85732671446956 estimated_peak_memory_range: min: 618496 - max: 80674720 + max: 96030544 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: jogkz31ng + job_id: j5mnnd4qp job_status: Passed torchscript_onnx: - inference_time: 5913.0 - throughput: 169.1188905800778 + inference_time: 5887.0 + throughput: 169.86580601324954 estimated_peak_memory_range: - min: 675840 - max: 379219632 + min: 0 + max: 393211568 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 247 - job_id: jlpe9kjvg + job_id: j56y460vp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:50:03Z' + timestamp: '2024-10-16T08:32:55Z' - torchscript_onnx_tflite: - inference_time: 6501.0 - throughput: 153.82248884786955 + inference_time: 6521.0 + throughput: 153.35071308081584 estimated_peak_memory_range: - min: 28672 - max: 2035600 + min: 24576 + max: 2086928 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: jegn234qg + job_id: jgz3deyz5 job_status: Passed torchscript_onnx_qnn: - inference_time: 6799.0 - throughput: 147.08045300779526 + inference_time: 6844.0 + throughput: 146.11338398597312 estimated_peak_memory_range: - min: 659456 - max: 1862880 + min: 647168 + max: 1972464 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: j1gln3jmp + job_id: jprv3k7vg job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:49:56Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T17:18:30Z' - torchscript_onnx_tflite: - inference_time: 9202.0 - throughput: 108.67202782003912 + inference_time: 6558.0 + throughput: 152.48551387618176 estimated_peak_memory_range: - min: 40960 - max: 165382464 + min: 81920 + max: 2447424 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: joprker75 + job_id: jgdx1w4kp job_status: Passed torchscript_onnx_qnn: - inference_time: 9267.0 - throughput: 107.90978741771879 + inference_time: 6784.0 + throughput: 147.4056603773585 estimated_peak_memory_range: - min: 0 - max: 49579808 + min: 647168 + max: 2373096 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: j1pv3vjr5 + job_id: jp0z0yx25 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:50:01Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T17:18:33Z' - torchscript_onnx_tflite: - inference_time: 6527.0 - throughput: 153.2097441397273 + inference_time: 6484.0 + throughput: 154.22578655151142 estimated_peak_memory_range: - 
min: 57344 - max: 1659520 + min: 32768 + max: 2401624 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: jep28l1qp + job_id: jp14z01kp job_status: Passed torchscript_onnx_qnn: - inference_time: 6864.0 - throughput: 145.6876456876457 + inference_time: 6802.0 + throughput: 147.0155836518671 estimated_peak_memory_range: - min: 643072 - max: 1899104 + min: 675840 + max: 1885160 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: jw566nky5 + job_id: jpy13eyrp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:49:58Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T17:18:32Z' - torchscript_onnx_tflite: - inference_time: 6538.0 - throughput: 152.95197308045275 + inference_time: 6518.0 + throughput: 153.42129487572876 estimated_peak_memory_range: min: 32768 - max: 2195640 + max: 2345440 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: jqpye6llg + job_id: jg9lnv2qg job_status: Passed torchscript_onnx_qnn: - inference_time: 6846.0 - throughput: 146.0706982179375 + inference_time: 6778.0 + throughput: 147.5361463558572 estimated_peak_memory_range: - min: 643072 - max: 2281544 + min: 634880 + max: 2482112 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: j1p3keyn5 + job_id: jp2ky8zxp job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:49:59Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T17:18:31Z' - torchscript_onnx_tflite: - inference_time: 6485.0 - throughput: 154.20200462606013 + inference_time: 9211.0 + throughput: 108.56584518510476 estimated_peak_memory_range: - min: 32768 - max: 2219208 + min: 20480 + max: 172903680 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: j2p0ylwng + job_id: j5we6ozz5 job_status: Passed torchscript_onnx_qnn: - inference_time: 6856.0 - throughput: 145.85764294049008 + inference_time: 9353.0 + throughput: 106.91756655618518 estimated_peak_memory_range: - min: 638976 - max: 1877880 + min: 0 + max: 55099440 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: jwgoy3jk5 + job_id: jgkexzkyg job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T17:18:34Z' + - torchscript_onnx_tflite: + inference_time: 4612.0 + throughput: 216.8256721595837 + estimated_peak_memory_range: + min: 12288 + max: 165095984 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 147 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 147 + job_id: jp4lrq4q5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 4754.0 + throughput: 210.34917963819942 + estimated_peak_memory_range: + min: 0 + max: 100248816 + 
primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 245 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 245 + job_id: j5q6q8d7p + job_status: Passed + torchscript_onnx: + inference_time: 5023.0 + throughput: 199.0842126219391 + estimated_peak_memory_range: + min: 626688 + max: 169371552 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 247 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 247 + job_id: jpv6k3n75 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:50:00Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T17:18:40Z' - torchscript_onnx_qnn: - inference_time: 6911.0 - throughput: 144.6968600781363 + inference_time: 6899.0 + throughput: 144.94854326714017 estimated_peak_memory_range: min: 602112 max: 602112 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 245 - job_id: jn5q83no5 + job_id: jgn6v2wv5 job_status: Passed torchscript_onnx: - inference_time: 6790.0 - throughput: 147.27540500736376 + inference_time: 6813.0 + throughput: 146.7782181124321 estimated_peak_memory_range: - min: 181276672 - max: 181276672 + min: 181223424 + max: 181223424 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 247 - job_id: jygzer1xg + job_id: jp3j0krxg job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:50:03Z' + timestamp: '2024-10-15T17:18:38Z' diff --git a/qai_hub_models/models/resnext101_quantized/README.md b/qai_hub_models/models/resnext101_quantized/README.md index 3ddf62df..6ed9ae1b 100644 --- a/qai_hub_models/models/resnext101_quantized/README.md +++ b/qai_hub_models/models/resnext101_quantized/README.md @@ -6,7 +6,7 @@ ResNeXt101 is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of ResNeXt101Quantized found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/resnext101_quantized). @@ -17,11 +17,6 @@ accross various devices, can be found [here](https://aihub.qualcomm.com/models/r ## Example & Usage -Install the package via pip: -```bash -pip install "qai_hub_models[resnext101_quantized]" -``` - Once installed, run the following simple CLI demo: @@ -44,15 +39,19 @@ python -m qai_hub_models.models.resnext101_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of ResNeXt101Quantized can be found +* The license for the original implementation of ResNeXt101Quantized can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Aggregated Residual Transformations for Deep Neural Networks](https://arxiv.org/abs/1611.05431) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/resnext101_quantized/evaluate.py b/qai_hub_models/models/resnext101_quantized/evaluate.py index 9652d8f6..26fe838f 100644 --- a/qai_hub_models/models/resnext101_quantized/evaluate.py +++ b/qai_hub_models/models/resnext101_quantized/evaluate.py @@ -13,10 +13,8 @@ from qai_hub_models.models.resnext101_quantized import MODEL_ID, Model from qai_hub_models.utils.args import evaluate_parser, get_hub_device, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.evaluate import evaluate_on_dataset from qai_hub_models.utils.inference import compile_model_from_args -from qai_hub_models.utils.quantization_aimet import AIMETQuantizableMixin SUPPORTED_DATASETS = ["imagenette", "imagenet"] @@ -27,6 +25,7 @@ def main(): model_cls=Model, default_split_size=2500, supported_datasets=SUPPORTED_DATASETS, + is_hub_quantized=True, ) args = parser.parse_args() args.device = None @@ -38,13 +37,7 @@ def main(): MODEL_ID, args, get_model_kwargs(Model, vars(args)) ) hub_device = get_hub_device(None, args.chipset) - - # Use Fp16 model for torch inference - for cls in Model.__mro__: - if issubclass(cls, BaseModel) and not issubclass(cls, AIMETQuantizableMixin): - torch_cls = cls - break - torch_model = torch_cls.from_pretrained(**get_model_kwargs(torch_cls, vars(args))) + torch_model = Model.from_pretrained(**get_model_kwargs(Model, vars(args))) evaluate_on_dataset( hub_model, torch_model, diff --git a/qai_hub_models/models/resnext101_quantized/export.py b/qai_hub_models/models/resnext101_quantized/export.py index da30ea7b..2b35d4a0 100644 --- a/qai_hub_models/models/resnext101_quantized/export.py +++ b/qai_hub_models/models/resnext101_quantized/export.py @@ -10,18 +10,20 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.resnext101_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference +from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( print_inference_metrics, print_on_target_demo_cmd, @@ -31,11 +33,14 @@ can_access_qualcomm_ai_hub, export_without_hub_access, ) +from qai_hub_models.utils.quantization import get_calibration_data def export_model( device: str = "Samsung Galaxy S23 (Family)", chipset: 
Optional[str] = None, + num_calibration_samples: int = 100, + skip_compiling: bool = False, skip_profiling: bool = False, skip_inferencing: bool = False, skip_downloading: bool = False, @@ -45,20 +50,19 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + 3. Compiles the model to an asset that can be run on device + 4. Profiles the model performance on a real device + 5. Inferences the model on sample inputs + 6. Downloads the model asset to the local directory + 7. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 5 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -66,6 +70,9 @@ def export_model( Defaults to DEFAULT_DEVICE if not specified. chipset: If set, will choose a random device with this chipset. Overrides the `device` argument. + num_calibration_samples: The number of calibration data samples + to use for quantization. + skip_compiling: If set, skips compiling model to format that can run on device. skip_profiling: If set, skips profiling of compiled model on real devices. skip_inferencing: If set, skips computing on-device outputs from sample data. skip_downloading: If set, skips downloading of compiled model. @@ -80,10 +87,11 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: - * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). + A struct of: + * A CompileJob object containing metadata about the compile job submitted to hub (None if compiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). + * A QuantizeJob object containing metadata about the quantize job submitted to hub """ model_name = "resnext101_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,33 +117,52 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. 
Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) ) # Trace the model - source_model = model.convert_to_hub_source_model( - target_runtime, output_path, input_spec + source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) + + print(f"Quantizing model {model_name} with {num_calibration_samples} samples.") + # 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + onnx_compile_job = hub.submit_compile_job( + model=source_model, + input_specs=input_spec, + device=hub_device, + name=model_name, + options="--target_runtime onnx", + ) + quantize_job = hub.submit_quantize_job( + model=onnx_compile_job.get_target_model(), + calibration_data=get_calibration_data( + input_spec, "imagenette", num_calibration_samples + ), + weights_dtype=model.get_weights_dtype(), + activations_dtype=model.get_activations_dtype(), + name=model_name, + options=model.get_quantize_options(), ) + if skip_compiling: + return ExportResult(quantize_job=quantize_job) - # 2. Compile the model to an on-device asset + # 3. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) print(f"Optimizing model {model_name} to run on-device") submitted_compile_job = hub.submit_compile_job( - model=source_model, + model=quantize_job.get_target_model(), input_specs=input_spec, device=hub_device, name=model_name, - calibration_data=model.get_calibration_data(target_runtime), options=model_compile_options, ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 4. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +177,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 5. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +198,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 6. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 7. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -202,12 +229,17 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + quantize_job=quantize_job, + ) def main(): warnings.filterwarnings("ignore") - parser = export_parser(model_cls=Model) + parser = export_parser(model_cls=Model, is_hub_quantized=True) args = parser.parse_args() export_model(**vars(args)) diff --git a/qai_hub_models/models/resnext101_quantized/model.py b/qai_hub_models/models/resnext101_quantized/model.py index 82c48e43..be95544c 100644 --- a/qai_hub_models/models/resnext101_quantized/model.py +++ b/qai_hub_models/models/resnext101_quantized/model.py @@ -4,78 +4,11 @@ # --------------------------------------------------------------------- from __future__ import annotations -# isort: off -# This verifies aimet is installed, and this must be included first. -from qai_hub_models.utils.quantization_aimet import ( - AIMETQuantizableMixin, - constrain_quantized_inputs_to_image_range, -) - -# isort: on - -import torch -from aimet_torch.cross_layer_equalization import equalize_model -from aimet_torch.model_preparer import prepare_model -from aimet_torch.quantsim import QuantizationSimModel, load_encodings_to_sim - from qai_hub_models.models.resnext101.model import ResNeXt101 -from qai_hub_models.utils.aimet.config_loader import get_default_aimet_config -from qai_hub_models.utils.asset_loaders import CachedWebModelAsset +from qai_hub_models.utils.quantization import HubQuantizableMixin MODEL_ID = __name__.split(".")[-2] -MODEL_ASSET_VERSION = 5 -DEFAULT_ENCODINGS = "resnext101_quantized_encodings.json" - - -class ResNeXt101Quantizable(AIMETQuantizableMixin, ResNeXt101): - """ResNeXt101 with post train quantization support. - - Supports only 8 bit weights and activations, and only loads pre-quantized checkpoints. - Support for quantizing using your own weights & data will come at a later date.""" - - def __init__( - self, - sim_model: QuantizationSimModel, - ) -> None: - # Input is already normalized by sim_model. Disable it in the wrapper model. - ResNeXt101.__init__(self, sim_model.model, normalize_input=False) - AIMETQuantizableMixin.__init__( - self, - sim_model, - ) - - @classmethod - def from_pretrained( - cls, - aimet_encodings: str | None = "DEFAULT", - ) -> "ResNeXt101Quantizable": - """ - Parameters: - aimet_encodings: - if "DEFAULT": Loads the model with aimet encodings calibrated on imagenette. - elif None: Doesn't load any encodings. Used when computing encodings. - else: Interprets as a filepath and loads the encodings stored there. 
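To make the renumbered recipe above concrete: the quantized export now submits two extra Hub jobs before the on-device compile, an ONNX conversion compile job and a quantize job fed with imagenette calibration samples. A condensed, hedged sketch of steps 1 through 3 follows (the device name and the 100-sample default are taken from the diff; this is an illustration, not the full `export_model` implementation), after which the model.py hunk below removes the old AIMET `from_pretrained` body that this recipe replaces:

```python
# Hedged sketch of the new Hub-side quantization flow in export.py.
import qai_hub as hub
import torch

from qai_hub_models.models.resnext101_quantized import Model
from qai_hub_models.utils.input_spec import make_torch_inputs
from qai_hub_models.utils.quantization import get_calibration_data

model = Model.from_pretrained()
input_spec = model.get_input_spec()
traced = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec))
device = hub.Device("Samsung Galaxy S23 (Family)")

# Step 2: convert the traced model to ONNX on Hub.
onnx_job = hub.submit_compile_job(
    model=traced,
    input_specs=input_spec,
    device=device,
    name="resnext101_quantized",
    options="--target_runtime onnx",
)

# Quantize the ONNX model with calibration data; dtypes come from the model class.
quantize_job = hub.submit_quantize_job(
    model=onnx_job.get_target_model(),
    calibration_data=get_calibration_data(input_spec, "imagenette", 100),
    weights_dtype=model.get_weights_dtype(),
    activations_dtype=model.get_activations_dtype(),
    name="resnext101_quantized",
    options=model.get_quantize_options(),
)

# Step 3: compile the quantized model for the target device.
compile_job = hub.submit_compile_job(
    model=quantize_job.get_target_model(),
    input_specs=input_spec,
    device=device,
    name="resnext101_quantized",
)
```

Passing `skip_compiling=True` to `export_model` stops after the quantize job and returns `ExportResult(quantize_job=quantize_job)`, which appears intended for cases where only the quantized ONNX asset is wanted.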
- """ - model = ResNeXt101.from_pretrained() - input_shape = cls.get_input_spec()["image_tensor"][0] - - model = prepare_model(model) - equalize_model(model, input_shape) - sim = QuantizationSimModel( - model, - quant_scheme="tf_enhanced", - default_param_bw=8, - default_output_bw=8, - config_file=get_default_aimet_config(), - dummy_input=torch.rand(input_shape), - ) - constrain_quantized_inputs_to_image_range(sim) - if aimet_encodings: - if aimet_encodings == "DEFAULT": - aimet_encodings = CachedWebModelAsset.from_asset_store( - MODEL_ID, MODEL_ASSET_VERSION, DEFAULT_ENCODINGS - ).fetch() - load_encodings_to_sim(sim, aimet_encodings) - return cls(sim) +class ResNeXt101Quantizable(HubQuantizableMixin, ResNeXt101): + pass diff --git a/qai_hub_models/models/resnext101_quantized/perf.yaml b/qai_hub_models/models/resnext101_quantized/perf.yaml index 45176f27..77645754 100644 --- a/qai_hub_models/models/resnext101_quantized/perf.yaml +++ b/qai_hub_models/models/resnext101_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,44 +20,39 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8775P Proxy models: - name: ResNeXt101Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 2842.0 - throughput: 351.8648838845883 + inference_time: 2869.0 + throughput: 348.5535029627048 estimated_peak_memory_range: - min: 28672 - max: 2602440 + min: 12288 + max: 1741792 primary_compute_unit: NPU precision: int8 layer_info: @@ -64,29 +60,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 150 - job_id: jqp4qv61g + job_id: jg9l04o8g job_status: Passed torchscript_onnx_qnn: - inference_time: 3096.0 - throughput: 322.99741602067184 + inference_time: 3112.0 + throughput: 321.3367609254499 estimated_peak_memory_range: - min: 16384 - max: 35690376 + min: 0 + max: 31007424 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 246 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jogkz3wng + total_layers: 246 + job_id: jp0z4ron5 job_status: Passed torchscript_onnx: - inference_time: 4357.0 - throughput: 229.51572182694514 + inference_time: 4022.0 + throughput: 248.6325211337643 estimated_peak_memory_range: min: 12288 - max: 102884408 + max: 102958696 primary_compute_unit: NPU precision: int8 layer_info: @@ -94,7 +90,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 283 - job_id: jygzer6xg + job_id: jgz327yx5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -103,13 +99,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 
Gen 2 - timestamp: '2024-09-25T11:49:15Z' + timestamp: '2024-10-17T17:20:25Z' - torchscript_onnx_tflite: - inference_time: 2259.0 - throughput: 442.67374944665784 + inference_time: 2051.0 + throughput: 487.56704046806436 estimated_peak_memory_range: - min: 32768 - max: 277751568 + min: 12288 + max: 287469904 primary_compute_unit: NPU precision: int8 layer_info: @@ -117,29 +113,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 150 - job_id: j0pxvy8lg + job_id: jp1428o7p job_status: Passed torchscript_onnx_qnn: - inference_time: 2550.0 - throughput: 392.15686274509807 + inference_time: 2536.0 + throughput: 394.3217665615142 estimated_peak_memory_range: min: 12288 - max: 87827760 + max: 96556512 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 246 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jn5q83xo5 + total_layers: 246 + job_id: jp8q27jop job_status: Passed torchscript_onnx: - inference_time: 3094.0 - throughput: 323.2062055591467 + inference_time: 2786.0 + throughput: 358.9375448671931 estimated_peak_memory_range: min: 0 - max: 336788912 + max: 352709936 primary_compute_unit: NPU precision: int8 layer_info: @@ -147,7 +143,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 283 - job_id: jz5woqkmp + job_id: j5wew9zm5 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -156,13 +152,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:49:16Z' + timestamp: '2024-10-17T17:20:27Z' - torchscript_onnx_tflite: - inference_time: 2790.0 - throughput: 358.42293906810033 + inference_time: 9831.0 + throughput: 101.71905197843556 estimated_peak_memory_range: - min: 24576 - max: 1597984 + min: 73728 + max: 209272896 primary_compute_unit: NPU precision: int8 layer_info: @@ -170,37 +166,60 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 150 - job_id: jo5mr319g + job_id: jgdxnv6zp job_status: Passed torchscript_onnx_qnn: - inference_time: 2964.0 - throughput: 337.38191632928476 + inference_time: 14602.0 + throughput: 68.48376934666484 estimated_peak_memory_range: - min: 180224 - max: 1462256 + min: 200704 + max: 8236288 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 246 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jw566nxy5 + total_layers: 246 + job_id: jgkevy6ng job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:49:09Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-17T17:20:11Z' - torchscript_onnx_tflite: - inference_time: 3423.0 - throughput: 292.141396435875 + inference_time: 134358.0 + throughput: 7.442802066121853 estimated_peak_memory_range: - min: 16384 - max: 282532096 + min: 28672 + max: 546404624 + primary_compute_unit: GPU + precision: int8 + layer_info: + layers_on_npu: 14 + layers_on_gpu: 125 + layers_on_cpu: 11 + total_layers: 150 + job_id: j57y2do95 + job_status: Passed + reference_device_info: + name: RB5 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8250 Proxy + timestamp: '2024-10-17T17:19:56Z' + - torchscript_onnx_tflite: + inference_time: 2886.0 + throughput: 346.5003465003465 + estimated_peak_memory_range: + min: 24576 + max: 2271104 primary_compute_unit: NPU precision: int8 layer_info: @@ -208,37 +227,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 
total_layers: 150 - job_id: jegn23dqg + job_id: jp4lnwe15 job_status: Passed torchscript_onnx_qnn: - inference_time: 3499.0 - throughput: 285.7959416976279 + inference_time: 2933.0 + throughput: 340.94783498124787 estimated_peak_memory_range: - min: 12288 - max: 89765216 + min: 176128 + max: 1327792 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 246 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: j7gjxe9ep + total_layers: 246 + job_id: j5q6024op job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:49:14Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-17T17:20:13Z' - torchscript_onnx_tflite: - inference_time: 2826.0 - throughput: 353.8570417551309 + inference_time: 2845.0 + throughput: 351.493848857645 estimated_peak_memory_range: - min: 16384 - max: 2008088 + min: 20480 + max: 2130952 primary_compute_unit: NPU precision: int8 layer_info: @@ -246,37 +265,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 150 - job_id: joprkem75 + job_id: jpxk910l5 job_status: Passed torchscript_onnx_qnn: - inference_time: 2937.0 - throughput: 340.4834865509023 + inference_time: 3010.0 + throughput: 332.22591362126246 estimated_peak_memory_range: - min: 176128 - max: 1458216 + min: 180224 + max: 1426344 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 246 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: j1p3kedn5 + total_layers: 246 + job_id: j56y21myp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:49:10Z' + chipset: SA8255P Proxy + timestamp: '2024-10-17T17:20:16Z' - torchscript_onnx_tflite: - inference_time: 2844.0 - throughput: 351.6174402250352 + inference_time: 2788.0 + throughput: 358.6800573888092 estimated_peak_memory_range: - min: 20480 - max: 2332144 + min: 32768 + max: 2689768 primary_compute_unit: NPU precision: int8 layer_info: @@ -284,22 +303,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 150 - job_id: jep28lqqp + job_id: j5mnez99p job_status: Passed torchscript_onnx_qnn: inference_time: 2931.0 throughput: 341.180484476288 estimated_peak_memory_range: - min: 180224 - max: 1498328 + min: 184320 + max: 1383432 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 246 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jwgoy3xk5 + total_layers: 246 + job_id: jp3jnm7ng job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -307,52 +326,37 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:49:11Z' - - torchscript_onnx_tflite: - inference_time: 2791.0 - throughput: 358.29451809387314 - estimated_peak_memory_range: - min: 28672 - max: 2554160 - primary_compute_unit: NPU - precision: int8 - layer_info: - layers_on_npu: 150 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 150 - job_id: jqpye6klg - job_status: Passed - torchscript_onnx_qnn: - inference_time: 2938.0 - throughput: 340.3675970047652 + chipset: SA8775P Proxy + timestamp: '2024-10-17T17:20:17Z' + - torchscript_onnx_qnn: + inference_time: 3456.0 + throughput: 289.35185185185185 estimated_peak_memory_range: - min: 200704 - 
max: 1391336 + min: 12288 + max: 102493584 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 246 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: j1pv3v8r5 + total_layers: 246 + job_id: jgo2zvwkp job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:49:12Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-17T17:20:19Z' - torchscript_onnx_tflite: - inference_time: 9894.0 - throughput: 101.07135637760258 + inference_time: 2085.0 + throughput: 479.6163069544364 estimated_peak_memory_range: - min: 12288 - max: 208908880 + min: 8192 + max: 199280432 primary_compute_unit: NPU precision: int8 layer_info: @@ -360,75 +364,67 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 150 - job_id: j2p0yl8ng + job_id: jprv6yx7g job_status: Passed torchscript_onnx_qnn: - inference_time: 14969.0 - throughput: 66.80472977486806 + inference_time: 2040.0 + throughput: 490.19607843137254 estimated_peak_memory_range: - min: 217088 - max: 8505648 + min: 0 + max: 95817216 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 246 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: jlpe9kqvg + total_layers: 246 + job_id: jpv6qwmr5 job_status: Passed - reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot - os_name: Android - manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T11:49:15Z' - - torchscript_onnx_tflite: - inference_time: 131178.0 - throughput: 7.623229504947476 + torchscript_onnx: + inference_time: 2941.0 + throughput: 340.02040122407345 estimated_peak_memory_range: - min: 49152 - max: 350266512 - primary_compute_unit: GPU + min: 0 + max: 237657248 + primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 14 - layers_on_gpu: 125 - layers_on_cpu: 11 - total_layers: 150 - job_id: j1p8ozdog + layers_on_npu: 283 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 283 + job_id: jp142817p job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T11:49:05Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-17T17:20:30Z' - torchscript_onnx_qnn: - inference_time: 3077.0 - throughput: 324.99187520311995 + inference_time: 3076.0 + throughput: 325.0975292587776 estimated_peak_memory_range: - min: 204800 - max: 204800 + min: 262144 + max: 262144 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 146 + layers_on_npu: 246 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 146 - job_id: j1gln3dmp + total_layers: 246 + job_id: jglv4k8m5 job_status: Passed torchscript_onnx: - inference_time: 4190.0 - throughput: 238.6634844868735 + inference_time: 4219.0 + throughput: 237.02299123014933 estimated_peak_memory_range: - min: 94441472 - max: 94441472 + min: 94502912 + max: 94502912 primary_compute_unit: NPU precision: int8 layer_info: @@ -436,7 +432,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 283 - job_id: jmg9vwr85 + job_id: jg9l0428g job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -445,4 +441,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:49:17Z' + timestamp: 
'2024-10-17T17:20:28Z' diff --git a/qai_hub_models/models/resnext101_quantized/requirements.txt b/qai_hub_models/models/resnext101_quantized/requirements.txt deleted file mode 100644 index de5b80e8..00000000 --- a/qai_hub_models/models/resnext101_quantized/requirements.txt +++ /dev/null @@ -1 +0,0 @@ -aimet-torch==1.32.1.post1; sys_platform == "linux" diff --git a/qai_hub_models/models/resnext101_quantized/test.py b/qai_hub_models/models/resnext101_quantized/test.py deleted file mode 100644 index 1df1173a..00000000 --- a/qai_hub_models/models/resnext101_quantized/test.py +++ /dev/null @@ -1,30 +0,0 @@ -# --------------------------------------------------------------------- -# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. -# SPDX-License-Identifier: BSD-3-Clause -# --------------------------------------------------------------------- -from qai_hub_models.models._shared.imagenet_classifier.test_utils import ( - run_imagenet_classifier_test, -) -from qai_hub_models.models.resnext101_quantized.demo import main as demo_main -from qai_hub_models.models.resnext101_quantized.model import ( - MODEL_ASSET_VERSION, - MODEL_ID, - ResNeXt101Quantizable, -) - - -def test_task(): - run_imagenet_classifier_test( - ResNeXt101Quantizable.from_pretrained(), - MODEL_ID, - probability_threshold=0.46, - diff_tol=0.005, - rtol=0.02, - atol=0.2, - asset_version=MODEL_ASSET_VERSION, - ) - - -def test_demo(): - # Verify demo does not crash - demo_main(is_test=True) diff --git a/qai_hub_models/models/resnext50/README.md b/qai_hub_models/models/resnext50/README.md index bb1c8865..2c206d02 100644 --- a/qai_hub_models/models/resnext50/README.md +++ b/qai_hub_models/models/resnext50/README.md @@ -6,7 +6,7 @@ ResNeXt50 is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of ResNeXt50 found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/resnext50). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.resnext50.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of ResNeXt50 can be found +* The license for the original implementation of ResNeXt50 can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Aggregated Residual Transformations for Deep Neural Networks](https://arxiv.org/abs/1611.05431) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/resnext50/export.py b/qai_hub_models/models/resnext50/export.py index a2abb9fe..81cd7927 100644 --- a/qai_hub_models/models/resnext50/export.py +++ b/qai_hub_models/models/resnext50/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.resnext50 import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
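The `ExportResult` struct replaces the positional 3-tuple these `export_model` functions used to return. A minimal sketch of how a caller might consume the new return value; the device name is just the default from the signature above, and the sketch assumes AI Hub access is configured (without it, `export_model` returns a list of strings instead):

```python
# Illustrative caller of the new ExportResult-based API; not part of this
# diff. Assumes qai_hub_models is installed and AI Hub access is set up.
from qai_hub_models.models.resnext50.export import export_model

result = export_model(device="Samsung Galaxy S23 (Family)")

# Fields are accessed by name, so callers no longer depend on tuple order.
print("compile job:", result.compile_job.job_id)
if result.profile_job is not None:    # None when skip_profiling=True
    print("profile job:", result.profile_job.job_id)
if result.inference_job is not None:  # None when skip_inferencing=True
    print("inference job:", result.inference_job.job_id)
```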
""" model_name = "resnext50" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/resnext50/perf.yaml b/qai_hub_models/models/resnext50/perf.yaml index fb43c3f1..e08359c0 100644 --- a/qai_hub_models/models/resnext50/perf.yaml +++ b/qai_hub_models/models/resnext50/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: ResNeXt50 performance_metrics: - torchscript_onnx_tflite: - inference_time: 2547.0 - throughput: 392.61876717707105 + inference_time: 2525.0 + throughput: 396.03960396039605 estimated_peak_memory_range: - min: 12288 - max: 2299032 + min: 16384 + max: 2647544 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 79 - job_id: jnp10em75 + job_id: jp3j0k9mg job_status: Passed torchscript_onnx_qnn: - inference_time: 2603.0 - throughput: 384.172109104879 + inference_time: 2601.0 + throughput: 384.46751249519417 estimated_peak_memory_range: min: 618496 - max: 63811024 + max: 84120456 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: joprke775 + job_id: jgdx1w36p job_status: Passed torchscript_onnx: - inference_time: 2750.0 - throughput: 363.6363636363636 + inference_time: 2794.0 + throughput: 357.9098067287044 estimated_peak_memory_range: min: 12288 - max: 60543136 + max: 2177184 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 128 - job_id: jw566n9y5 + job_id: jp2ky8rxp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:47:03Z' + timestamp: '2024-10-15T17:15:29Z' - torchscript_onnx_tflite: - inference_time: 1970.0 - throughput: 507.61421319796955 + inference_time: 1967.0 + throughput: 508.38840874428064 estimated_peak_memory_range: min: 12288 - max: 179658848 + max: 183090352 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 
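A note on reading these perf.yaml hunks: the `throughput` values are consistent with `inference_time` being microseconds per inference (throughput = 10^6 / inference_time), and the `estimated_peak_memory_range` bounds look like bytes (min: 16384 is 16 KiB). A quick check against the Galaxy S23 entry above; this is an observed relationship in the data, not a documented schema:

```python
# Verify the apparent inference_time (µs) <-> throughput (inferences/s)
# relationship, using values copied from the hunk above.
inference_time_us = 2525.0
throughput = 396.03960396039605

assert abs(throughput - 1e6 / inference_time_us) < 1e-6
print(f"{1e6 / inference_time_us:.2f} inferences/s")  # ~396.04
```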
layers_on_cpu: 0 total_layers: 79 - job_id: jvgdwomz5 + job_id: jglvv1ke5 job_status: Passed torchscript_onnx_qnn: - inference_time: 2028.0 - throughput: 493.0966469428008 + inference_time: 2173.0 + throughput: 460.1932811780948 estimated_peak_memory_range: min: 618496 - max: 35498192 + max: 37904480 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: jep28lzqp + job_id: j5we6o7z5 job_status: Passed torchscript_onnx: - inference_time: 2281.0 - throughput: 438.4042086804033 + inference_time: 2342.0 + throughput: 426.9854824935952 estimated_peak_memory_range: - min: 442368 - max: 181333472 + min: 0 + max: 185964448 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 128 - job_id: j1p3keln5 + job_id: jp0z0ym25 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:47:04Z' + timestamp: '2024-10-16T08:20:59Z' - torchscript_onnx_tflite: - inference_time: 2501.0 - throughput: 399.8400639744102 + inference_time: 2503.0 + throughput: 399.52057530962844 estimated_peak_memory_range: - min: 12288 - max: 1642008 + min: 32768 + max: 2131744 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 79 - job_id: jz57zx89p + job_id: jpv6k3dz5 job_status: Passed torchscript_onnx_qnn: - inference_time: 2491.0 - throughput: 401.4452027298274 + inference_time: 2515.0 + throughput: 397.61431411530816 estimated_peak_memory_range: - min: 626688 - max: 1918440 + min: 634880 + max: 1887112 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: j2p0ylxng + job_id: jp14z0jkp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:46:58Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T17:15:23Z' - torchscript_onnx_tflite: - inference_time: 3274.0 - throughput: 305.43677458766035 + inference_time: 2480.0 + throughput: 403.2258064516129 estimated_peak_memory_range: - min: 16384 - max: 116556784 + min: 40960 + max: 1990192 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 79 - job_id: jqp4qv21g + job_id: j5we6o745 job_status: Passed torchscript_onnx_qnn: - inference_time: 3360.0 - throughput: 297.6190476190476 + inference_time: 2480.0 + throughput: 403.2258064516129 estimated_peak_memory_range: - min: 618496 - max: 24560816 + min: 622592 + max: 2243496 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: j1gln39mp + job_id: jp4lrq1q5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:47:02Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T17:15:26Z' - torchscript_onnx_tflite: - inference_time: 2478.0 - throughput: 403.5512510088781 + inference_time: 2501.0 + throughput: 399.8400639744102 estimated_peak_memory_range: - min: 20480 - max: 2038184 + min: 28672 + 
max: 2388824 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 79 - job_id: j0pxvyzlg + job_id: jgz3dem45 job_status: Passed torchscript_onnx_qnn: - inference_time: 2507.0 - throughput: 398.8831272437176 + inference_time: 2540.0 + throughput: 393.7007874015748 estimated_peak_memory_range: - min: 655360 - max: 1980280 + min: 643072 + max: 1967920 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: j1p8ozkog + job_id: j57yrz4q5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:46:59Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T17:15:25Z' - torchscript_onnx_tflite: - inference_time: 2505.0 - throughput: 399.2015968063872 + inference_time: 2486.0 + throughput: 402.2526146419952 estimated_peak_memory_range: min: 28672 - max: 1902384 + max: 2146288 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 79 - job_id: jo5mr3l9g + job_id: jpedm9z85 job_status: Passed torchscript_onnx_qnn: - inference_time: 2493.0 - throughput: 401.1231448054553 + inference_time: 2488.0 + throughput: 401.92926045016077 estimated_peak_memory_range: - min: 630784 - max: 1816408 + min: 659456 + max: 1823200 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: jogkz3kng + job_id: jgdx1w3kp job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:47:00Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T17:15:24Z' - torchscript_onnx_tflite: - inference_time: 2514.0 - throughput: 397.77247414478916 + inference_time: 3247.0 + throughput: 307.9765937788728 estimated_peak_memory_range: - min: 40960 - max: 2616136 + min: 0 + max: 118191280 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 79 - job_id: jegn23wqg + job_id: jgjvnx71g job_status: Passed torchscript_onnx_qnn: - inference_time: 2547.0 - throughput: 392.61876717707105 + inference_time: 3368.0 + throughput: 296.91211401425176 estimated_peak_memory_range: - min: 634880 - max: 2307472 + min: 618496 + max: 28677200 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: jn5q83do5 + job_id: j5mnxrmyp job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T17:15:28Z' + - torchscript_onnx_tflite: + inference_time: 1705.0 + throughput: 586.5102639296188 + estimated_peak_memory_range: + min: 12288 + max: 63791216 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 79 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 79 + job_id: jp14z0jnp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 1799.0 + throughput: 555.864369093941 + estimated_peak_memory_range: + min: 614400 + max: 40800384 + primary_compute_unit: NPU + precision: fp16 + layer_info: + 
layers_on_npu: 126 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 126 + job_id: jgn6v2zv5 + job_status: Passed + torchscript_onnx: + inference_time: 1686.0 + throughput: 593.1198102016607 + estimated_peak_memory_range: + min: 0 + max: 65267168 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 128 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 128 + job_id: jgo26yl4p + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:47:01Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T17:15:33Z' - torchscript_onnx_qnn: - inference_time: 2663.0 - throughput: 375.51633496057076 + inference_time: 2669.0 + throughput: 374.6721618583739 estimated_peak_memory_range: min: 602112 max: 602112 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: jqpye6ylg + job_id: jg9lnvmqg job_status: Passed torchscript_onnx: - inference_time: 2634.0 - throughput: 379.65072133637057 + inference_time: 2676.0 + throughput: 373.69207772795215 estimated_peak_memory_range: - min: 53100544 - max: 53100544 + min: 53096448 + max: 53096448 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 128 - job_id: jwgoy37k5 + job_id: jgkexz2yg job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:47:05Z' + timestamp: '2024-10-15T17:15:31Z' diff --git a/qai_hub_models/models/resnext50_quantized/README.md b/qai_hub_models/models/resnext50_quantized/README.md index 6c69089a..d03bd9ef 100644 --- a/qai_hub_models/models/resnext50_quantized/README.md +++ b/qai_hub_models/models/resnext50_quantized/README.md @@ -6,7 +6,7 @@ ResNeXt50 is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of ResNeXt50Quantized found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/resnext50_quantized). @@ -17,11 +17,6 @@ accross various devices, can be found [here](https://aihub.qualcomm.com/models/r ## Example & Usage -Install the package via pip: -```bash -pip install "qai_hub_models[resnext50_quantized]" -``` - Once installed, run the following simple CLI demo: @@ -44,15 +39,19 @@ python -m qai_hub_models.models.resnext50_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of ResNeXt50Quantized can be found +* The license for the original implementation of ResNeXt50Quantized can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Aggregated Residual Transformations for Deep Neural Networks](https://arxiv.org/abs/1611.05431) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/resnext50_quantized/evaluate.py b/qai_hub_models/models/resnext50_quantized/evaluate.py index 1eb23114..87f75d68 100644 --- a/qai_hub_models/models/resnext50_quantized/evaluate.py +++ b/qai_hub_models/models/resnext50_quantized/evaluate.py @@ -13,10 +13,8 @@ from qai_hub_models.models.resnext50_quantized import MODEL_ID, Model from qai_hub_models.utils.args import evaluate_parser, get_hub_device, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.evaluate import evaluate_on_dataset from qai_hub_models.utils.inference import compile_model_from_args -from qai_hub_models.utils.quantization_aimet import AIMETQuantizableMixin SUPPORTED_DATASETS = ["imagenette", "imagenet"] @@ -27,6 +25,7 @@ def main(): model_cls=Model, default_split_size=2500, supported_datasets=SUPPORTED_DATASETS, + is_hub_quantized=True, ) args = parser.parse_args() args.device = None @@ -38,13 +37,7 @@ def main(): MODEL_ID, args, get_model_kwargs(Model, vars(args)) ) hub_device = get_hub_device(None, args.chipset) - - # Use Fp16 model for torch inference - for cls in Model.__mro__: - if issubclass(cls, BaseModel) and not issubclass(cls, AIMETQuantizableMixin): - torch_cls = cls - break - torch_model = torch_cls.from_pretrained(**get_model_kwargs(torch_cls, vars(args))) + torch_model = Model.from_pretrained(**get_model_kwargs(Model, vars(args))) evaluate_on_dataset( hub_model, torch_model, diff --git a/qai_hub_models/models/resnext50_quantized/export.py b/qai_hub_models/models/resnext50_quantized/export.py index 285dd9f4..83bca661 100644 --- a/qai_hub_models/models/resnext50_quantized/export.py +++ b/qai_hub_models/models/resnext50_quantized/export.py @@ -10,18 +10,20 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.resnext50_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference +from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( print_inference_metrics, print_on_target_demo_cmd, @@ -31,11 +33,14 @@ can_access_qualcomm_ai_hub, export_without_hub_access, ) +from qai_hub_models.utils.quantization import get_calibration_data def export_model( device: str = "Samsung Galaxy S23 (Family)", chipset: Optional[str] = None, 
+ num_calibration_samples: int = 100, + skip_compiling: bool = False, skip_profiling: bool = False, skip_inferencing: bool = False, skip_downloading: bool = False, @@ -45,20 +50,19 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + 3. Compiles the model to an asset that can be run on device + 4. Profiles the model performance on a real device + 5. Inferences the model on sample inputs + 6. Downloads the model asset to the local directory + 7. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 5 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -66,6 +70,9 @@ def export_model( Defaults to DEFAULT_DEVICE if not specified. chipset: If set, will choose a random device with this chipset. Overrides the `device` argument. + num_calibration_samples: The number of calibration data samples + to use for quantization. + skip_compiling: If set, skips compiling model to format that can run on device. skip_profiling: If set, skips profiling of compiled model on real devices. skip_inferencing: If set, skips computing on-device outputs from sample data. skip_downloading: If set, skips downloading of compiled model. @@ -80,10 +87,11 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: - * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). + A struct of: + * A CompileJob object containing metadata about the compile job submitted to hub (None if compiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). + * A QuantizeJob object containing metadata about the quantize job submitted to hub """ model_name = "resnext50_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,33 +117,52 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. 
Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) ) # Trace the model - source_model = model.convert_to_hub_source_model( - target_runtime, output_path, input_spec + source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) + + print(f"Quantizing model {model_name} with {num_calibration_samples} samples.") + # 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + onnx_compile_job = hub.submit_compile_job( + model=source_model, + input_specs=input_spec, + device=hub_device, + name=model_name, + options="--target_runtime onnx", + ) + quantize_job = hub.submit_quantize_job( + model=onnx_compile_job.get_target_model(), + calibration_data=get_calibration_data( + input_spec, "imagenette", num_calibration_samples + ), + weights_dtype=model.get_weights_dtype(), + activations_dtype=model.get_activations_dtype(), + name=model_name, + options=model.get_quantize_options(), ) + if skip_compiling: + return ExportResult(quantize_job=quantize_job) - # 2. Compile the model to an on-device asset + # 3. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) print(f"Optimizing model {model_name} to run on-device") submitted_compile_job = hub.submit_compile_job( - model=source_model, + model=quantize_job.get_target_model(), input_specs=input_spec, device=hub_device, name=model_name, - calibration_data=model.get_calibration_data(target_runtime), options=model_compile_options, ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 4. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +177,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 5. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +198,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 6. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 7. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -202,12 +229,17 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + quantize_job=quantize_job, + ) def main(): warnings.filterwarnings("ignore") - parser = export_parser(model_cls=Model) + parser = export_parser(model_cls=Model, is_hub_quantized=True) args = parser.parse_args() export_model(**vars(args)) diff --git a/qai_hub_models/models/resnext50_quantized/model.py b/qai_hub_models/models/resnext50_quantized/model.py index 101378c3..7708245a 100644 --- a/qai_hub_models/models/resnext50_quantized/model.py +++ b/qai_hub_models/models/resnext50_quantized/model.py @@ -4,78 +4,11 @@ # --------------------------------------------------------------------- from __future__ import annotations -# isort: off -# This verifies aimet is installed, and this must be included first. -from qai_hub_models.utils.quantization_aimet import ( - AIMETQuantizableMixin, - constrain_quantized_inputs_to_image_range, -) - -# isort: on - -import torch -from aimet_torch.cross_layer_equalization import equalize_model -from aimet_torch.model_preparer import prepare_model -from aimet_torch.quantsim import QuantizationSimModel, load_encodings_to_sim - from qai_hub_models.models.resnext50.model import ResNeXt50 -from qai_hub_models.utils.aimet.config_loader import get_default_aimet_config -from qai_hub_models.utils.asset_loaders import CachedWebModelAsset +from qai_hub_models.utils.quantization import HubQuantizableMixin MODEL_ID = __name__.split(".")[-2] -MODEL_ASSET_VERSION = 2 -DEFAULT_ENCODINGS = "resnext50_quantized_encodings.json" - - -class ResNeXt50Quantizable(AIMETQuantizableMixin, ResNeXt50): - """ResNeXt50 with post train quantization support. - - Supports only 8 bit weights and activations, and only loads pre-quantized checkpoints. - Support for quantizing using your own weights & data will come at a later date.""" - - def __init__( - self, - sim_model: QuantizationSimModel, - ) -> None: - # Input is already normalized by sim_model. Disable it in the wrapper model. - ResNeXt50.__init__(self, sim_model.model, normalize_input=False) - AIMETQuantizableMixin.__init__( - self, - sim_model, - ) - - @classmethod - def from_pretrained( - cls, - aimet_encodings: str | None = "DEFAULT", - ) -> "ResNeXt50Quantizable": - """ - Parameters: - aimet_encodings: - if "DEFAULT": Loads the model with aimet encodings calibrated on imagenette. - elif None: Doesn't load any encodings. Used when computing encodings. - else: Interprets as a filepath and loads the encodings stored there. 
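Taken together, the hunks above replace the AIMET pre-quantized-checkpoint path (whose model wrapper is being deleted in `model.py`) with quantization on AI Hub itself: trace, compile to ONNX, quantize against calibration data, then compile the quantized artifact. A condensed, standalone sketch of that recipe, using only calls that appear in this diff; the device and sample count are illustrative, and the optional `name=` arguments are dropped for brevity:

```python
# Condensed sketch of the two-stage Hub quantization flow introduced above.
import qai_hub as hub
import torch

from qai_hub_models.models.resnext50_quantized import Model
from qai_hub_models.utils.input_spec import make_torch_inputs
from qai_hub_models.utils.quantization import get_calibration_data

model = Model.from_pretrained()
input_spec = model.get_input_spec()
traced = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec))
device = hub.Device("Samsung Galaxy S23")

# Stage 1: convert the traced model to ONNX on Hub.
onnx_job = hub.submit_compile_job(
    model=traced,
    input_specs=input_spec,
    device=device,
    options="--target_runtime onnx",
)

# Stage 2: quantize the ONNX model with imagenette calibration data.
quantize_job = hub.submit_quantize_job(
    model=onnx_job.get_target_model(),
    calibration_data=get_calibration_data(input_spec, "imagenette", 100),
    weights_dtype=model.get_weights_dtype(),
    activations_dtype=model.get_activations_dtype(),
    options=model.get_quantize_options(),
)

# The quantized model then feeds the normal on-device compile step.
compile_job = hub.submit_compile_job(
    model=quantize_job.get_target_model(),
    input_specs=input_spec,
    device=device,
)
```

This is also why the export function gains a `skip_compiling` flag: it short-circuits after stage 2 and returns an `ExportResult` carrying only the quantize job.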
- """ - model = ResNeXt50.from_pretrained() - input_shape = cls.get_input_spec()["image_tensor"][0] - - model = prepare_model(model) - equalize_model(model, input_shape) - sim = QuantizationSimModel( - model, - quant_scheme="tf_enhanced", - default_param_bw=8, - default_output_bw=8, - config_file=get_default_aimet_config(), - dummy_input=torch.rand(input_shape), - ) - constrain_quantized_inputs_to_image_range(sim) - if aimet_encodings: - if aimet_encodings == "DEFAULT": - aimet_encodings = CachedWebModelAsset.from_asset_store( - MODEL_ID, MODEL_ASSET_VERSION, DEFAULT_ENCODINGS - ).fetch() - load_encodings_to_sim(sim, aimet_encodings) - return cls(sim) +class ResNeXt50Quantizable(HubQuantizableMixin, ResNeXt50): + pass diff --git a/qai_hub_models/models/resnext50_quantized/perf.yaml b/qai_hub_models/models/resnext50_quantized/perf.yaml index d4ae1c1f..749bcdc2 100644 --- a/qai_hub_models/models/resnext50_quantized/perf.yaml +++ b/qai_hub_models/models/resnext50_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,44 +20,39 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8775P Proxy models: - name: ResNeXt50Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 933.0 - throughput: 1071.8113612004288 + inference_time: 929.0 + throughput: 1076.4262648008612 estimated_peak_memory_range: - min: 12288 - max: 6057144 + min: 16384 + max: 2198064 primary_compute_unit: NPU precision: int8 layer_info: @@ -64,29 +60,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: jlpe9kxog + job_id: jpy1zdxlp job_status: Passed torchscript_onnx_qnn: - inference_time: 1195.0 - throughput: 836.8200836820083 + inference_time: 1180.0 + throughput: 847.457627118644 estimated_peak_memory_range: - min: 28672 - max: 65421024 + min: 12288 + max: 67891904 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jo5mr3kdg + total_layers: 127 + job_id: jpedov7v5 job_status: Passed torchscript_onnx: - inference_time: 1977.0 - throughput: 505.8168942842691 + inference_time: 1870.0 + throughput: 534.75935828877 estimated_peak_memory_range: - min: 36864 - max: 30920440 + min: 12288 + max: 31404304 primary_compute_unit: NPU precision: int8 layer_info: @@ -94,7 +90,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: jwgoy3wq5 + job_id: jp2kxmrqp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -103,13 +99,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 
Gen 2 - timestamp: '2024-09-25T11:46:23Z' + timestamp: '2024-10-17T17:19:03Z' - torchscript_onnx_tflite: - inference_time: 679.0 - throughput: 1472.7540500736377 + inference_time: 687.0 + throughput: 1455.604075691412 estimated_peak_memory_range: min: 12288 - max: 107105616 + max: 111011792 primary_compute_unit: NPU precision: int8 layer_info: @@ -117,29 +113,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: jygzeryog + job_id: jp0z4rjn5 job_status: Passed torchscript_onnx_qnn: - inference_time: 879.0 - throughput: 1137.6564277588168 + inference_time: 882.0 + throughput: 1133.7868480725624 estimated_peak_memory_range: min: 167936 - max: 32276064 + max: 34667296 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: joprked05 + total_layers: 127 + job_id: jgz327lx5 job_status: Passed torchscript_onnx: - inference_time: 1431.0 - throughput: 698.8120195667366 + inference_time: 1279.0 + throughput: 781.8608287724785 estimated_peak_memory_range: min: 28672 - max: 142368480 + max: 146151648 primary_compute_unit: NPU precision: int8 layer_info: @@ -147,7 +143,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: j1pv3vnk5 + job_id: jpy1zdolp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -156,13 +152,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:46:24Z' + timestamp: '2024-10-17T17:19:04Z' - torchscript_onnx_tflite: - inference_time: 910.0 - throughput: 1098.901098901099 + inference_time: 3202.0 + throughput: 312.3048094940662 estimated_peak_memory_range: min: 12288 - max: 1396096 + max: 60662176 primary_compute_unit: NPU precision: int8 layer_info: @@ -170,37 +166,60 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: jz5woqz3p + job_id: jp8q27xop job_status: Passed torchscript_onnx_qnn: - inference_time: 1126.0 - throughput: 888.0994671403197 + inference_time: 4530.0 + throughput: 220.7505518763797 estimated_peak_memory_range: - min: 184320 - max: 1340512 + min: 200704 + max: 8100560 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jqpye628g + total_layers: 127 + job_id: j5wew9lm5 job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:46:17Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-17T17:18:48Z' - torchscript_onnx_tflite: - inference_time: 1091.0 - throughput: 916.5902841429881 + inference_time: 64073.0 + throughput: 15.607198039735927 estimated_peak_memory_range: - min: 12288 - max: 111592032 + min: 24576 + max: 89783464 + primary_compute_unit: GPU + precision: int8 + layer_info: + layers_on_npu: 14 + layers_on_gpu: 57 + layers_on_cpu: 11 + total_layers: 82 + job_id: jgkevy4ng + job_status: Passed + reference_device_info: + name: RB5 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8250 Proxy + timestamp: '2024-10-17T17:18:33Z' + - torchscript_onnx_tflite: + inference_time: 920.0 + throughput: 1086.9565217391305 + estimated_peak_memory_range: + min: 32768 + max: 6176376 primary_compute_unit: NPU precision: int8 layer_info: @@ -208,37 +227,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: jmg9vw2w5 + 
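A note on the refreshed numbers: comparing the like-for-like Samsung Galaxy S23 TFLite entries across the two perf.yaml files, int8 quantization takes ResNeXt50 from about 2525 µs (fp16) to 929 µs, a speedup of roughly 2525 / 929 ≈ 2.7×, with every layer resident on the NPU in both cases (79 layers fp16, 82 layers int8).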
job_id: j5q602yop job_status: Passed torchscript_onnx_qnn: - inference_time: 1391.0 - throughput: 718.9072609633357 + inference_time: 1124.0 + throughput: 889.6797153024911 estimated_peak_memory_range: - min: 167936 - max: 33372320 + min: 172032 + max: 1466640 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: j1gln38jp + total_layers: 127 + job_id: jg9l04z8g job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:46:21Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-17T17:18:50Z' - torchscript_onnx_tflite: - inference_time: 935.0 - throughput: 1069.51871657754 + inference_time: 943.0 + throughput: 1060.4453870625662 estimated_peak_memory_range: min: 12288 - max: 2570320 + max: 1425152 primary_compute_unit: NPU precision: int8 layer_info: @@ -246,37 +265,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: jnp10e185 + job_id: jglv4kym5 job_status: Passed torchscript_onnx_qnn: - inference_time: 1146.0 - throughput: 872.6003490401396 + inference_time: 1142.0 + throughput: 875.6567425569177 estimated_peak_memory_range: - min: 212992 - max: 1395456 + min: 192512 + max: 1796784 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: j2p0yl99g + total_layers: 127 + job_id: jgdxnvdzp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:46:18Z' + chipset: SA8255P Proxy + timestamp: '2024-10-17T17:18:53Z' - torchscript_onnx_tflite: - inference_time: 928.0 - throughput: 1077.5862068965516 + inference_time: 946.0 + throughput: 1057.0824524312895 estimated_peak_memory_range: - min: 20480 - max: 9269656 + min: 12288 + max: 2211160 primary_compute_unit: NPU precision: int8 layer_info: @@ -284,22 +303,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: jvgdwo4r5 + job_id: j56y218yp job_status: Passed torchscript_onnx_qnn: - inference_time: 1138.0 - throughput: 878.7346221441124 + inference_time: 1140.0 + throughput: 877.1929824561404 estimated_peak_memory_range: - min: 184320 - max: 1316032 + min: 192512 + max: 1365624 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jogkz30wg + total_layers: 127 + job_id: j57y2de95 job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -307,14 +326,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:46:19Z' + chipset: SA8775P Proxy + timestamp: '2024-10-17T17:18:54Z' - torchscript_onnx_tflite: - inference_time: 921.0 - throughput: 1085.7763300760043 + inference_time: 1077.0 + throughput: 928.5051067780872 estimated_peak_memory_range: - min: 12288 - max: 28288840 + min: 4096 + max: 110828336 primary_compute_unit: NPU precision: int8 layer_info: @@ -322,37 +341,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: jz57zxnvp + job_id: jp3jnmzng job_status: Passed torchscript_onnx_qnn: - inference_time: 1144.0 - throughput: 874.1258741258741 + inference_time: 1432.0 + throughput: 
698.3240223463687 estimated_peak_memory_range: - min: 221184 - max: 1511136 + min: 167936 + max: 37381104 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jn5q831n5 + total_layers: 127 + job_id: jp4lnwy15 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:46:20Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-17T17:18:56Z' - torchscript_onnx_tflite: - inference_time: 3068.0 - throughput: 325.94524119947846 + inference_time: 650.0 + throughput: 1538.4615384615386 estimated_peak_memory_range: - min: 12288 - max: 60359568 + min: 8192 + max: 55905408 primary_compute_unit: NPU precision: int8 layer_info: @@ -360,75 +379,67 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: jqp4qv48g + job_id: jgo2zvlkp job_status: Passed torchscript_onnx_qnn: - inference_time: 4643.0 - throughput: 215.37798836958862 + inference_time: 737.0 + throughput: 1356.85210312076 estimated_peak_memory_range: - min: 225280 - max: 8241344 + min: 159744 + max: 34219488 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: j1p3ke735 + total_layers: 127 + job_id: jpxk91ll5 job_status: Passed - reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot - os_name: Android - manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T11:46:22Z' - - torchscript_onnx_tflite: - inference_time: 59523.0 - throughput: 16.80022848310737 + torchscript_onnx: + inference_time: 1252.0 + throughput: 798.7220447284345 estimated_peak_memory_range: - min: 24576 - max: 135568256 - primary_compute_unit: GPU + min: 0 + max: 79889536 + primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 14 - layers_on_gpu: 57 - layers_on_cpu: 11 - total_layers: 82 - job_id: j0pxvyr3g + layers_on_npu: 147 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 147 + job_id: jp8q27eop job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T11:46:13Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-17T17:19:07Z' - torchscript_onnx_qnn: - inference_time: 1240.0 - throughput: 806.4516129032259 + inference_time: 1324.0 + throughput: 755.2870090634441 estimated_peak_memory_range: - min: 442368 - max: 442368 + min: 462848 + max: 462848 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jep28ldrp + total_layers: 127 + job_id: jp1428n7p job_status: Passed torchscript_onnx: - inference_time: 1934.0 - throughput: 517.063081695967 + inference_time: 1948.0 + throughput: 513.347022587269 estimated_peak_memory_range: - min: 29798400 - max: 29798400 + min: 30650368 + max: 30650368 primary_compute_unit: NPU precision: int8 layer_info: @@ -436,7 +447,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: jlpe9knog + job_id: jp0z4rmn5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -445,4 +456,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: 
'2024-09-25T11:46:25Z' + timestamp: '2024-10-17T17:19:06Z' diff --git a/qai_hub_models/models/resnext50_quantized/requirements.txt b/qai_hub_models/models/resnext50_quantized/requirements.txt deleted file mode 100644 index de5b80e8..00000000 --- a/qai_hub_models/models/resnext50_quantized/requirements.txt +++ /dev/null @@ -1 +0,0 @@ -aimet-torch==1.32.1.post1; sys_platform == "linux" diff --git a/qai_hub_models/models/resnext50_quantized/test.py b/qai_hub_models/models/resnext50_quantized/test.py deleted file mode 100644 index 4cd1dbbd..00000000 --- a/qai_hub_models/models/resnext50_quantized/test.py +++ /dev/null @@ -1,30 +0,0 @@ -# --------------------------------------------------------------------- -# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. -# SPDX-License-Identifier: BSD-3-Clause -# --------------------------------------------------------------------- -from qai_hub_models.models._shared.imagenet_classifier.test_utils import ( - run_imagenet_classifier_test, -) -from qai_hub_models.models.resnext50_quantized.demo import main as demo_main -from qai_hub_models.models.resnext50_quantized.model import ( - MODEL_ASSET_VERSION, - MODEL_ID, - ResNeXt50Quantizable, -) - - -def test_task(): - run_imagenet_classifier_test( - ResNeXt50Quantizable.from_pretrained(), - MODEL_ID, - probability_threshold=0.46, - diff_tol=0.005, - rtol=0.02, - atol=0.2, - asset_version=MODEL_ASSET_VERSION, - ) - - -def test_demo(): - # Verify demo does not crash - demo_main(is_test=True) diff --git a/qai_hub_models/models/riffusion_quantized/README.md b/qai_hub_models/models/riffusion_quantized/README.md index fedfc292..d889deb9 100644 --- a/qai_hub_models/models/riffusion_quantized/README.md +++ b/qai_hub_models/models/riffusion_quantized/README.md @@ -6,7 +6,7 @@ Generates high resolution spectrograms images from text prompts using a latent diffusion model. This model uses CLIP ViT-L/14 as text encoder, U-Net based latent denoising, and VAE based decoder to generate the final image. This is based on the implementation of Riffusion found -[here](https://github.com/CompVis/stable-diffusion/tree/main). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/riffusion_quantized). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.riffusion_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Riffusion can be found +* The license for the original implementation of Riffusion can be found [here](https://github.com/CompVis/stable-diffusion/blob/main/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://github.com/CompVis/stable-diffusion/blob/main/LICENSE) +* The license for the compiled assets for on-device deployment can be found [here](https://github.com/CompVis/stable-diffusion/blob/main/LICENSE) + ## References * [High-Resolution Image Synthesis with Latent Diffusion Models](https://arxiv.org/abs/2112.10752) * [Source Model Implementation](https://github.com/CompVis/stable-diffusion/tree/main) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/riffusion_quantized/export.py b/qai_hub_models/models/riffusion_quantized/export.py index 1cc4be82..496229aa 100644 --- a/qai_hub_models/models/riffusion_quantized/export.py +++ b/qai_hub_models/models/riffusion_quantized/export.py @@ -9,13 +9,14 @@ import warnings from pathlib import Path -from typing import Any, Dict, List, Mapping, Optional, Tuple, cast +from typing import Any, Dict, List, Mapping, Optional, cast import qai_hub as hub +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.riffusion_quantized import Model from qai_hub_models.utils.args import export_parser -from qai_hub_models.utils.base_model import BasePrecompiledModel, TargetRuntime +from qai_hub_models.utils.base_model import BasePrecompiledModel from qai_hub_models.utils.printing import print_profile_metrics_from_job from qai_hub_models.utils.qai_hub_helpers import ( can_access_qualcomm_ai_hub, @@ -36,19 +37,16 @@ def export_model( output_dir: Optional[str] = None, profile_options: str = "", **additional_model_kwargs, -) -> Mapping[str, Tuple[Optional[hub.ProfileJob], Optional[hub.InferenceJob]]] | List[ - str -]: +) -> Mapping[str, ExportResult] | List[str]: """ - This function accomplishes 5 main tasks: + This function executes the following recipe: - 1. Initialize model. - 2. Upload model assets to hub. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Summarizes the results from profiling. + 1. Initialize model + 2. Upload model assets to hub + 3. Profiles the model performance on a real device + 4. Summarizes the results from profiling - Each of the last three steps can be optionally skipped using the input options. + Each of the last 2 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -70,9 +68,8 @@ def export_model( `model_cls.from_precompiled` Returns: - A Mapping from component_name to a 2-tuple of: + A Mapping from component_name to a struct of: * A ProfileJob containing metadata about the profile job (None if profiling skipped). - * An InferenceJob containing metadata about the inference job (None if inferencing skipped). """ model_name = "riffusion_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -101,9 +98,7 @@ def export_model( component_arg, ) - target_runtime = TargetRuntime.TFLITE - # On-device perf improves with I/O in channel_last format except when using ONNX. - use_channel_last_format = target_runtime != TargetRuntime.ONNX + target_runtime = TargetRuntime.QNN # 1. 
Initialize model print("Initializing model class") @@ -123,8 +118,11 @@ def export_model( uploaded_models[component_name] = hub.upload_model( components_dict[component_name].get_target_model_path() ) + print( + f"The {component_name} model is saved here: {components_dict[component_name].get_target_model_path()}" + ) - # 3. Profile the model assets on real devices + # 3. Profiles the model performance on a real device profile_jobs: Dict[str, hub.client.ProfileJob] = {} if not skip_profiling: for component_name in components: @@ -142,31 +140,7 @@ def export_model( hub.client.ProfileJob, submitted_profile_job ) - # 4. Run inference on-device with sample inputs - inference_jobs: Dict[str, hub.client.InferenceJob] = {} - if not skip_inferencing: - for component_name in components: - print( - f"Running inference for {component_name} on a hosted device with example inputs." - ) - profile_options_all = components_dict[ - component_name - ].get_hub_profile_options(target_runtime, profile_options) - sample_inputs = components_dict[component_name].sample_inputs( - use_channel_last_format=use_channel_last_format - ) - submitted_inference_job = hub.submit_inference_job( - model=uploaded_models[component_name], - inputs=sample_inputs, - device=hub_device, - name=f"{model_name}_{component_name}", - options=profile_options_all, - ) - inference_jobs[component_name] = cast( - hub.client.InferenceJob, submitted_inference_job - ) - - # 5. Summarize the results from profiling + # 4. Summarizes the results from profiling if not skip_summary and not skip_profiling: for component_name in components: profile_job = profile_jobs[component_name] @@ -175,9 +149,8 @@ def export_model( print_profile_metrics_from_job(profile_job, profile_data) return { - component_name: ( - profile_jobs.get(component_name, None), - inference_jobs.get(component_name, None), + component_name: ExportResult( + profile_job=profile_jobs.get(component_name, None), ) for component_name in components } diff --git a/qai_hub_models/models/riffusion_quantized/perf.yaml b/qai_hub_models/models/riffusion_quantized/perf.yaml index 0706f066..9f7b5b11 100644 --- a/qai_hub_models/models/riffusion_quantized/perf.yaml +++ b/qai_hub_models/models/riffusion_quantized/perf.yaml @@ -26,7 +26,7 @@ aggregated: - Xiaomi 12 - Xiaomi 12 Pro supported_chipsets: - - Qcs8550 Proxy + - QCS8550 Proxy - Snapdragon® 8 Gen 1 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 3 @@ -102,7 +102,7 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy + chipset: QCS8550 Proxy timestamp: '2024-10-03T16:20:19Z' - torchscript_onnx_qnn: inference_time: 7594.0 @@ -196,7 +196,7 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy + chipset: QCS8550 Proxy timestamp: '2024-10-03T16:20:19Z' - torchscript_onnx_qnn: inference_time: 227581.0 @@ -290,7 +290,7 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy + chipset: QCS8550 Proxy timestamp: '2024-10-03T16:20:19Z' - torchscript_onnx_qnn: inference_time: 129856.0 diff --git a/qai_hub_models/models/sam/README.md b/qai_hub_models/models/sam/README.md index 5ad08396..ac8adc3b 100644 --- a/qai_hub_models/models/sam/README.md +++ b/qai_hub_models/models/sam/README.md @@ -6,7 +6,7 @@ Transformer based encoder-decoder where prompts specify what to segment in an image thereby allowing segmentation without the need for additional training. 
The image encoder generates embeddings and the lightweight decoder operates on the embeddings for point and mask based image segmentation. This is based on the implementation of Segment-Anything-Model found -[here](https://github.com/facebookresearch/segment-anything). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/sam). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.sam.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Segment-Anything-Model can be found +* The license for the original implementation of Segment-Anything-Model can be found [here](https://github.com/facebookresearch/segment-anything/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Segment Anything](https://arxiv.org/abs/2304.02643) * [Source Model Implementation](https://github.com/facebookresearch/segment-anything) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/sam/export.py b/qai_hub_models/models/sam/export.py index 0188a6e7..c527843c 100644 --- a/qai_hub_models/models/sam/export.py +++ b/qai_hub_models/models/sam/export.py @@ -10,15 +10,16 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Mapping, Optional, Tuple, cast +from typing import Any, Dict, List, Mapping, Optional, cast import qai_hub as hub import torch from torch.utils import mobile_optimizer +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.sam import Model from qai_hub_models.utils.args import export_parser, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel, TargetRuntime +from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -46,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Mapping[ - str, Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] -] | List[str]: +) -> Mapping[str, ExportResult] | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. 
Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -84,10 +83,10 @@ def export_model( `model_cls.from_pretrained` Returns: - A Mapping from component_name to a 3-tuple of: + A Mapping from component_name to a struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "sam" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -119,7 +118,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) components_dict: Dict[str, BaseModel] = {} if "SAMDecoder" in components: @@ -145,7 +144,7 @@ def export_model( }, ) - # 2. Compile the models to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = component.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -161,7 +160,7 @@ def export_model( hub.client.CompileJob, submitted_compile_job ) - # 3. Profile the model assets on real devices + # 3. Profiles the model performance on a real device profile_jobs: Dict[str, hub.client.ProfileJob] = {} if not skip_profiling: for component_name in components: @@ -179,7 +178,7 @@ def export_model( hub.client.ProfileJob, submitted_profile_job ) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_jobs: Dict[str, hub.client.InferenceJob] = {} if not skip_inferencing: for component_name in components: @@ -203,14 +202,14 @@ def export_model( hub.client.InferenceJob, submitted_inference_job ) - # 5. Download the model assets to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) for component_name, compile_job in compile_jobs.items(): target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / component_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: for component_name in components: profile_job = profile_jobs[component_name] @@ -235,10 +234,10 @@ def export_model( ) return { - component_name: ( - compile_jobs[component_name], - profile_jobs.get(component_name, None), - inference_jobs.get(component_name, None), + component_name: ExportResult( + compile_job=compile_jobs[component_name], + inference_job=inference_jobs.get(component_name, None), + profile_job=profile_jobs.get(component_name, None), ) for component_name in components } diff --git a/qai_hub_models/models/sam/perf.yaml b/qai_hub_models/models/sam/perf.yaml index a2f365f6..f3a86019 100644 --- a/qai_hub_models/models/sam/perf.yaml +++ b/qai_hub_models/models/sam/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: SAMDecoder performance_metrics: - torchscript_onnx_tflite: - inference_time: 29972.0 - throughput: 33.364473508608036 + inference_time: 29098.0 + throughput: 34.366623135610695 estimated_peak_memory_range: - min: 4259840 - max: 12540360 + min: 2162688 + max: 21300704 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,7 +56,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 337 - job_id: j1gln3yjp + job_id: jgz3dewx5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -67,13 +65,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:44:43Z' + timestamp: '2024-10-15T17:13:02Z' - torchscript_onnx_tflite: - inference_time: 20690.0 - throughput: 48.33252779120348 + inference_time: 20232.0 + throughput: 49.426650850138394 estimated_peak_memory_range: - min: 3805184 - max: 227103408 + min: 2363392 + max: 238361616 primary_compute_unit: NPU precision: fp16 layer_info: @@ -81,7 +79,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 337 - job_id: j1p3kez35 + job_id: jg9lnv88g job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -90,13 +88,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:44:44Z' + timestamp: '2024-10-15T17:13:04Z' - torchscript_onnx_tflite: - inference_time: 29720.0 - throughput: 33.64737550471063 + inference_time: 28959.0 + throughput: 34.531579129113574 estimated_peak_memory_range: - min: 3985408 - max: 7065168 + min: 3997696 + max: 12470208 primary_compute_unit: NPU precision: fp16 layer_info: @@ -104,7 +102,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 337 - job_id: j1pv3v2k5 + job_id: jgdx1w0zp job_status: Passed 
reference_device_info: name: QCS8550 (Proxy) @@ -112,14 +110,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:44:46Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T17:13:06Z' - torchscript_onnx_tflite: - inference_time: 33136.0 - throughput: 30.17865765330758 + inference_time: 29061.0 + throughput: 34.41037817005609 estimated_peak_memory_range: - min: 4046848 - max: 214762848 + min: 4005888 + max: 26106816 primary_compute_unit: NPU precision: fp16 layer_info: @@ -127,22 +125,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 337 - job_id: jlpe9k6og + job_id: jp2ky8j6p job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:44:48Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T17:13:13Z' - torchscript_onnx_tflite: - inference_time: 30005.0 - throughput: 33.327778703549406 + inference_time: 28990.0 + throughput: 34.494653328734046 estimated_peak_memory_range: min: 4030464 - max: 17367192 + max: 49143704 primary_compute_unit: NPU precision: fp16 layer_info: @@ -150,22 +148,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 337 - job_id: jz5woqy3p + job_id: jgn6v2xj5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:44:49Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T17:13:11Z' - torchscript_onnx_tflite: - inference_time: 29956.0 - throughput: 33.38229403124583 + inference_time: 29004.0 + throughput: 34.478003034064265 estimated_peak_memory_range: - min: 4022272 - max: 22289056 + min: 4042752 + max: 6870472 primary_compute_unit: NPU precision: fp16 layer_info: @@ -173,22 +171,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 337 - job_id: jnp10eo85 + job_id: jpxkovm85 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:44:51Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T17:13:09Z' - torchscript_onnx_tflite: - inference_time: 29912.0 - throughput: 33.431398769724524 + inference_time: 32396.0 + throughput: 30.868008396098283 estimated_peak_memory_range: - min: 4001792 - max: 11746344 + min: 4046848 + max: 233043552 primary_compute_unit: NPU precision: fp16 layer_info: @@ -196,24 +194,47 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 337 - job_id: jz57zxovp + job_id: j57yrz6n5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T17:13:08Z' + - torchscript_onnx_tflite: + inference_time: 20466.0 + throughput: 48.8615264340858 + estimated_peak_memory_range: + min: 2555904 + max: 164731968 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 337 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 337 + job_id: jgkexzovg + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:44:53Z' + chipset: Snapdragon® 8 Elite + timestamp: 
'2024-10-15T17:13:16Z' - name: SAMEncoder performance_metrics: - torchscript_onnx_tflite: - inference_time: 11293293.0 - throughput: 0.0885481320638719 + inference_time: 11323510.0 + throughput: 0.08831183970341351 estimated_peak_memory_range: - min: 39505920 - max: 225192072 + min: 12288 + max: 285674960 primary_compute_unit: CPU precision: fp32 layer_info: @@ -221,7 +242,7 @@ models: layers_on_gpu: 36 layers_on_cpu: 782 total_layers: 818 - job_id: jw566n865 + job_id: j5we6oxm5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -230,13 +251,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:44:43Z' + timestamp: '2024-10-15T17:13:03Z' - torchscript_onnx_tflite: - inference_time: 8339280.0 - throughput: 0.11991442906342034 + inference_time: 8300484.0 + throughput: 0.12047490242737652 estimated_peak_memory_range: - min: 43937792 - max: 1631367904 + min: 129224704 + max: 1718444144 primary_compute_unit: CPU precision: fp32 layer_info: @@ -244,7 +265,7 @@ models: layers_on_gpu: 36 layers_on_cpu: 782 total_layers: 818 - job_id: jwgoy3lq5 + job_id: jp14z037p job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -253,13 +274,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:44:45Z' + timestamp: '2024-10-15T17:13:05Z' - torchscript_onnx_tflite: - inference_time: 10940893.0 - throughput: 0.09140021751423764 + inference_time: 10870158.0 + throughput: 0.09199498296160921 estimated_peak_memory_range: - min: 129261568 - max: 132892712 + min: 129540096 + max: 300233944 primary_compute_unit: CPU precision: fp32 layer_info: @@ -267,7 +288,7 @@ models: layers_on_gpu: 36 layers_on_cpu: 782 total_layers: 818 - job_id: j7gjxe3vp + job_id: j5we6ox45 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -275,14 +296,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:44:46Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T17:13:06Z' - torchscript_onnx_tflite: - inference_time: 17506449.0 - throughput: 0.05712180694097358 + inference_time: 10178345.0 + throughput: 0.09824779961771782 estimated_peak_memory_range: - min: 87793664 - max: 1726140736 + min: 126943232 + max: 130050864 primary_compute_unit: CPU precision: fp32 layer_info: @@ -290,22 +311,22 @@ models: layers_on_gpu: 36 layers_on_cpu: 782 total_layers: 818 - job_id: jygzerzog + job_id: jpy13en0p job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:44:48Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T17:13:13Z' - torchscript_onnx_tflite: - inference_time: 10431216.0 - throughput: 0.09586610036643858 + inference_time: 11283428.0 + throughput: 0.08862554890233712 estimated_peak_memory_range: - min: 129617920 - max: 133152664 + min: 126205952 + max: 130851936 primary_compute_unit: CPU precision: fp32 layer_info: @@ -313,22 +334,22 @@ models: layers_on_gpu: 36 layers_on_cpu: 782 total_layers: 818 - job_id: jmg9vwow5 + job_id: jprv3k9kg job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:44:50Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T17:13:11Z' - torchscript_onnx_tflite: - 
inference_time: 10423682.0 - throughput: 0.09593539020089062 + inference_time: 10102843.0 + throughput: 0.09898203901614624 estimated_peak_memory_range: - min: 128888832 - max: 132565768 + min: 127098880 + max: 131236328 primary_compute_unit: CPU precision: fp32 layer_info: @@ -336,22 +357,22 @@ models: layers_on_gpu: 36 layers_on_cpu: 782 total_layers: 818 - job_id: jvgdwo6r5 + job_id: j5mnxr47p job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:44:52Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T17:13:10Z' - torchscript_onnx_tflite: - inference_time: 11339804.0 - throughput: 0.08818494570100154 + inference_time: 13526091.0 + throughput: 0.07393118972805965 estimated_peak_memory_range: - min: 129880064 - max: 133421840 + min: 137879552 + max: 1774634432 primary_compute_unit: CPU precision: fp32 layer_info: @@ -359,13 +380,36 @@ models: layers_on_gpu: 36 layers_on_cpu: 782 total_layers: 818 - job_id: jqp4qve8g + job_id: jp4lrq825 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T17:13:08Z' + - torchscript_onnx_tflite: + inference_time: 6334196.0 + throughput: 0.15787323284596813 + estimated_peak_memory_range: + min: 102768640 + max: 1649431984 + primary_compute_unit: CPU + precision: fp32 + layer_info: + layers_on_npu: 0 + layers_on_gpu: 36 + layers_on_cpu: 782 + total_layers: 818 + job_id: jg9llx4qg + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:44:53Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-16T08:25:26Z' diff --git a/qai_hub_models/models/sesr_m5/README.md b/qai_hub_models/models/sesr_m5/README.md index 35457e96..2c7bfd68 100644 --- a/qai_hub_models/models/sesr_m5/README.md +++ b/qai_hub_models/models/sesr_m5/README.md @@ -6,7 +6,7 @@ SESR M5 performs efficient on-device upscaling of images. This is based on the implementation of SESR-M5 found -[here](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/sesr). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/sesr_m5). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.sesr_m5.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of SESR-M5 can be found +* The license for the original implementation of SESR-M5 can be found [here](https://github.com/quic/aimet-model-zoo/blob/develop/LICENSE.pdf). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Collapsible Linear Blocks for Super-Efficient Super Resolution](https://arxiv.org/abs/2103.09404) * [Source Model Implementation](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/sesr) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/sesr_m5/export.py b/qai_hub_models/models/sesr_m5/export.py index 0ec23534..86f043b3 100644 --- a/qai_hub_models/models/sesr_m5/export.py +++ b/qai_hub_models/models/sesr_m5/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.sesr_m5 import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
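+
+    Example (a minimal sketch; assumes Qualcomm® AI Hub access is configured,
+    and the device name is illustrative):
+        result = export_model(device="Samsung Galaxy S23")
+        if isinstance(result, ExportResult):
+            print(result.compile_job, result.profile_job, result.inference_job)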
""" model_name = "sesr_m5" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/sesr_m5/perf.yaml b/qai_hub_models/models/sesr_m5/perf.yaml index d74eee34..3f6dc320 100644 --- a/qai_hub_models/models/sesr_m5/perf.yaml +++ b/qai_hub_models/models/sesr_m5/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: SESR-M5 performance_metrics: - torchscript_onnx_tflite: - inference_time: 2184.0 - throughput: 457.87545787545787 + inference_time: 2175.0 + throughput: 459.7701149425287 estimated_peak_memory_range: - min: 20480 - max: 11545544 + min: 16384 + max: 21924928 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 25 - job_id: j1p3ke935 + job_id: jgdx1ze6p job_status: Passed torchscript_onnx_qnn: - inference_time: 2154.0 - throughput: 464.2525533890436 + inference_time: 2129.0 + throughput: 469.7040864255519 estimated_peak_memory_range: - min: 20480 - max: 3826424 + min: 16384 + max: 60533184 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 31 - job_id: jmg9vwzw5 + job_id: jprv3n2vg job_status: Passed torchscript_onnx: - inference_time: 2884.0 - throughput: 346.74063800277395 + inference_time: 2687.0 + throughput: 372.1622627465575 estimated_peak_memory_range: - min: 20480 - max: 74815880 + min: 212992 + max: 1417096 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 33 - job_id: joprkel05 + job_id: jgo264n4p job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:43:40Z' + timestamp: '2024-10-14T23:35:04Z' - torchscript_onnx_tflite: - inference_time: 1799.0 - throughput: 555.864369093941 + inference_time: 1778.0 + throughput: 562.429696287964 estimated_peak_memory_range: min: 16384 - max: 28480528 + max: 28718528 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: 
layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 25 - job_id: jwgoy3rq5 + job_id: jg9lnxjqg job_status: Passed torchscript_onnx_qnn: - inference_time: 1811.0 - throughput: 552.1811154058531 + inference_time: 1652.0 + throughput: 605.3268765133172 estimated_peak_memory_range: - min: 12288 - max: 12853232 + min: 208896 + max: 13220800 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 31 - job_id: jnp10en85 + job_id: jp2kyv9xp job_status: Passed torchscript_onnx: - inference_time: 2388.0 - throughput: 418.7604690117253 + inference_time: 2121.0 + throughput: 471.4757190004715 estimated_peak_memory_range: min: 0 - max: 30576416 + max: 31376288 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 33 - job_id: jep28lrrp + job_id: jpv6k9r75 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:43:41Z' + timestamp: '2024-10-14T23:35:05Z' - torchscript_onnx_tflite: - inference_time: 2187.0 - throughput: 457.2473708276177 + inference_time: 2241.0 + throughput: 446.2293618920125 estimated_peak_memory_range: min: 24576 - max: 43874120 + max: 1384976 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 25 - job_id: j1pv3vlk5 + job_id: jp14zvykp job_status: Passed torchscript_onnx_qnn: - inference_time: 2163.0 - throughput: 462.32085067036525 + inference_time: 2144.0 + throughput: 466.4179104477612 estimated_peak_memory_range: - min: 24576 - max: 4132456 + min: 221184 + max: 4746840 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 31 - job_id: jz57zxevp + job_id: jp0z0v225 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:43:35Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:34:56Z' - torchscript_onnx_tflite: - inference_time: 3450.0 - throughput: 289.8550724637681 + inference_time: 2191.0 + throughput: 456.41259698767686 estimated_peak_memory_range: - min: 6336512 - max: 35321392 + min: 36864 + max: 8154584 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 25 - job_id: j7gjxervp + job_id: jpxkodnj5 job_status: Passed torchscript_onnx_qnn: - inference_time: 3163.0 - throughput: 316.1555485298767 + inference_time: 2146.0 + throughput: 465.98322460391427 estimated_peak_memory_range: - min: 208896 - max: 16094048 + min: 225280 + max: 1479912 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 31 - job_id: jegn23zkg + job_id: j5q6qmr7p job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:43:39Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:34:59Z' - torchscript_onnx_tflite: - inference_time: 2202.0 - throughput: 454.1326067211626 + inference_time: 2311.0 + throughput: 432.7131112072696 estimated_peak_memory_range: - min: 28672 - max: 8680096 + min: 20480 + max: 
91047872 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 25 - job_id: jlpe9k7og + job_id: jp4lr9kq5 job_status: Passed torchscript_onnx_qnn: - inference_time: 2160.0 - throughput: 462.962962962963 + inference_time: 2169.0 + throughput: 461.04195481788844 estimated_peak_memory_range: - min: 225280 - max: 1670544 + min: 229376 + max: 1645592 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 31 - job_id: jqp4qvy8g + job_id: jgkex9qyg job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:43:36Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:34:58Z' - torchscript_onnx_tflite: - inference_time: 2185.0 - throughput: 457.66590389016017 + inference_time: 2198.0 + throughput: 454.9590536851683 estimated_peak_memory_range: min: 24576 - max: 2066464 + max: 7414256 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 25 - job_id: jygzerlog + job_id: j57yr70q5 job_status: Passed torchscript_onnx_qnn: - inference_time: 2162.0 - throughput: 462.53469010175763 + inference_time: 2146.0 + throughput: 465.98322460391427 estimated_peak_memory_range: - min: 225280 - max: 4812536 + min: 221184 + max: 1487632 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 31 - job_id: j0pxvyl3g + job_id: jp8qy4mzp job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:43:37Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:34:57Z' - torchscript_onnx_tflite: - inference_time: 2209.0 - throughput: 452.6935264825713 + inference_time: 3978.0 + throughput: 251.38260432378078 estimated_peak_memory_range: - min: 36864 - max: 7783424 + min: 16384 + max: 27377792 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 25 - job_id: jz5woql3p + job_id: jgdx1zekp job_status: Passed torchscript_onnx_qnn: - inference_time: 2522.0 - throughput: 396.5107057890563 + inference_time: 3202.0 + throughput: 312.3048094940662 estimated_peak_memory_range: - min: 221184 - max: 1397968 + min: 208896 + max: 16850496 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,22 +329,75 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 31 - job_id: jo5mr30dg + job_id: j56y4dzvp job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:43:38Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:35:02Z' + - torchscript_onnx_tflite: + inference_time: 1694.0 + throughput: 590.318772136954 + estimated_peak_memory_range: + min: 12288 + max: 17250624 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 22 + layers_on_gpu: 0 + layers_on_cpu: 3 + total_layers: 25 + job_id: jgn6v7mv5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 1520.0 + throughput: 657.8947368421053 + estimated_peak_memory_range: + min: 208896 + max: 10952832 + 
primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 31 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 31 + job_id: jp3j0w1xg + job_status: Passed + torchscript_onnx: + inference_time: 1994.0 + throughput: 501.5045135406219 + estimated_peak_memory_range: + min: 0 + max: 16932288 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 33 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 33 + job_id: j5we613z5 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:35:08Z' - torchscript_onnx_qnn: - inference_time: 2340.0 - throughput: 427.35042735042737 + inference_time: 2358.0 + throughput: 424.08821034775235 estimated_peak_memory_range: - min: 212992 - max: 212992 + min: 237568 + max: 237568 primary_compute_unit: NPU precision: fp16 layer_info: @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 31 - job_id: jvgdwodr5 + job_id: jpy137jrp job_status: Passed torchscript_onnx: - inference_time: 2934.0 - throughput: 340.83162917518746 + inference_time: 2968.0 + throughput: 336.92722371967653 estimated_peak_memory_range: - min: 8941568 - max: 8941568 + min: 8953856 + max: 8953856 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 33 - job_id: jqpye6o8g + job_id: jpedmlw75 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:43:42Z' + timestamp: '2024-10-14T23:35:06Z' diff --git a/qai_hub_models/models/sesr_m5_quantized/README.md b/qai_hub_models/models/sesr_m5_quantized/README.md index 18ee0ea3..4af66e25 100644 --- a/qai_hub_models/models/sesr_m5_quantized/README.md +++ b/qai_hub_models/models/sesr_m5_quantized/README.md @@ -6,7 +6,7 @@ SESR M5 performs efficient on-device upscaling of images. This is based on the implementation of SESR-M5-Quantized found -[here](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/sesr). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/sesr_m5_quantized). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.sesr_m5_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of SESR-M5-Quantized can be found +* The license for the original implementation of SESR-M5-Quantized can be found [here](https://github.com/quic/aimet-model-zoo/blob/develop/LICENSE.pdf). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Collapsible Linear Blocks for Super-Efficient Super Resolution](https://arxiv.org/abs/2103.09404) * [Source Model Implementation](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/sesr) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/sesr_m5_quantized/export.py b/qai_hub_models/models/sesr_m5_quantized/export.py index 2475648d..94bf1c30 100644 --- a/qai_hub_models/models/sesr_m5_quantized/export.py +++ b/qai_hub_models/models/sesr_m5_quantized/export.py @@ -10,17 +10,17 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.sesr_m5_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.printing import ( print_inference_metrics, @@ -45,20 +45,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -80,10 +78,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
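+
+    Example (a minimal sketch; assumes Qualcomm® AI Hub access is configured,
+    and the device name is illustrative):
+        result = export_model(device="Samsung Galaxy S24", skip_inferencing=True)
+        if isinstance(result, ExportResult):
+            assert result.inference_job is None  # inferencing was skipped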
""" model_name = "sesr_m5_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,7 +107,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( target_runtime, output_path, input_spec ) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -135,7 +133,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +148,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +169,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -198,7 +196,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/sesr_m5_quantized/perf.yaml b/qai_hub_models/models/sesr_m5_quantized/perf.yaml index 7281550b..e151066c 100644 --- a/qai_hub_models/models/sesr_m5_quantized/perf.yaml +++ b/qai_hub_models/models/sesr_m5_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,41 +20,38 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: SESR-M5-Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 1332.0 - throughput: 750.7507507507507 + inference_time: 1339.0 + throughput: 746.8259895444362 estimated_peak_memory_range: - min: 24576 - max: 1307512 + min: 28672 + max: 1352888 primary_compute_unit: NPU precision: int8 layer_info: @@ -61,29 +59,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 27 - job_id: j7gjxeqxp + job_id: jpv6k9qz5 job_status: Passed torchscript_onnx_qnn: inference_time: 973.0 throughput: 1027.749229188078 estimated_peak_memory_range: - min: 20480 - max: 4007080 + min: 16384 + max: 77034472 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 25 + layers_on_npu: 31 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 25 - job_id: jnp10ej85 + total_layers: 31 + job_id: j5mnxde7p job_status: Passed torchscript_onnx: - inference_time: 1190.0 - throughput: 840.3361344537815 + inference_time: 1083.0 + throughput: 923.3610341643582 estimated_peak_memory_range: - min: 65536 - max: 1516480 + min: 12288 + max: 1986272 primary_compute_unit: NPU precision: int8 layer_info: @@ -91,7 +89,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 48 - job_id: jqpye6x8g + job_id: jp3j0wvmg job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -100,13 +98,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:43:03Z' + timestamp: '2024-10-14T23:34:19Z' - torchscript_onnx_tflite: - inference_time: 1112.0 - throughput: 899.2805755395683 + inference_time: 1109.0 + throughput: 901.7132551848512 estimated_peak_memory_range: min: 16384 - max: 26595040 + max: 27098080 
primary_compute_unit: NPU precision: int8 layer_info: @@ -114,29 +112,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 27 - job_id: jlpe9ky1g + job_id: jgjvnwd1g job_status: Passed torchscript_onnx_qnn: - inference_time: 856.0 - throughput: 1168.2242990654206 + inference_time: 714.0 + throughput: 1400.5602240896358 estimated_peak_memory_range: - min: 61440 - max: 14073056 + min: 77824 + max: 14626672 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 25 + layers_on_npu: 31 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 25 - job_id: jvgdwo3r5 + total_layers: 31 + job_id: jgn6v70j5 job_status: Passed torchscript_onnx: - inference_time: 859.0 - throughput: 1164.1443538998835 + inference_time: 821.0 + throughput: 1218.026796589525 estimated_peak_memory_range: min: 0 - max: 30469168 + max: 30963296 primary_compute_unit: NPU precision: int8 layer_info: @@ -144,7 +142,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 48 - job_id: j2p0ylj9g + job_id: jgo264k1p job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -153,13 +151,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:43:04Z' + timestamp: '2024-10-14T23:34:20Z' - torchscript_onnx_tflite: - inference_time: 1348.0 - throughput: 741.839762611276 + inference_time: 3597.0 + throughput: 278.00945232137894 estimated_peak_memory_range: - min: 815104 - max: 3639248 + min: 12288 + max: 19733760 primary_compute_unit: NPU precision: int8 layer_info: @@ -167,37 +165,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 27 - job_id: jygzernkg + job_id: j57yr72n5 job_status: Passed torchscript_onnx_qnn: - inference_time: 689.0 - throughput: 1451.3788098693758 + inference_time: 2908.0 + throughput: 343.878954607978 estimated_peak_memory_range: - min: 73728 - max: 1239280 + min: 65536 + max: 7152192 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 25 + layers_on_npu: 31 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 25 - job_id: jqp4qv18g + total_layers: 31 + job_id: jglvm1625 job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:42:57Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-14T23:34:17Z' - torchscript_onnx_tflite: - inference_time: 1766.0 - throughput: 566.2514156285391 + inference_time: 19669.0 + throughput: 50.841425593573646 estimated_peak_memory_range: - min: 12288 - max: 25904848 + min: 1699840 + max: 4295000 primary_compute_unit: NPU precision: int8 layer_info: @@ -205,37 +203,60 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 27 - job_id: jz5woq76p + job_id: jp4lr9n25 + job_status: Passed + reference_device_info: + name: RB5 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8250 Proxy + timestamp: '2024-10-14T23:34:05Z' + - torchscript_onnx_tflite: + inference_time: 1338.0 + throughput: 747.3841554559043 + estimated_peak_memory_range: + min: 1597440 + max: 50979000 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 24 + layers_on_gpu: 0 + layers_on_cpu: 3 + total_layers: 27 + job_id: jpedmlo85 job_status: Passed torchscript_onnx_qnn: - inference_time: 1117.0 - throughput: 895.2551477170994 + inference_time: 684.0 + throughput: 1461.9883040935672 estimated_peak_memory_range: - min: 65536 - max: 15966752 + min: 77824 + max: 1362768 primary_compute_unit: 
NPU precision: int8 layer_info: - layers_on_npu: 25 + layers_on_npu: 31 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 25 - job_id: joprke005 + total_layers: 31 + job_id: jp2kyvx6p job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:43:01Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:34:11Z' - torchscript_onnx_tflite: - inference_time: 1365.0 - throughput: 732.6007326007326 + inference_time: 1344.0 + throughput: 744.047619047619 estimated_peak_memory_range: - min: 28672 - max: 1409808 + min: 20480 + max: 8527456 primary_compute_unit: NPU precision: int8 layer_info: @@ -243,37 +264,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 27 - job_id: jmg9vwml5 + job_id: jp14zv2np job_status: Passed torchscript_onnx_qnn: - inference_time: 691.0 - throughput: 1447.178002894356 + inference_time: 684.0 + throughput: 1461.9883040935672 estimated_peak_memory_range: - min: 73728 - max: 2423632 + min: 81920 + max: 1439032 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 25 + layers_on_npu: 31 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 25 - job_id: j0pxvy43g + total_layers: 31 + job_id: jp8qy40qp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:42:58Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:34:14Z' - torchscript_onnx_tflite: - inference_time: 1348.0 - throughput: 741.839762611276 + inference_time: 1342.0 + throughput: 745.156482861401 estimated_peak_memory_range: min: 16384 - max: 1510760 + max: 2323080 primary_compute_unit: NPU precision: int8 layer_info: @@ -281,22 +302,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 27 - job_id: jnp10ej25 + job_id: jg9lnx0mg job_status: Passed torchscript_onnx_qnn: - inference_time: 690.0 - throughput: 1449.2753623188405 + inference_time: 686.0 + throughput: 1457.725947521866 estimated_peak_memory_range: - min: 16384 - max: 1747224 + min: 81920 + max: 1792632 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 25 + layers_on_npu: 31 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 25 - job_id: jo5mr3mdg + total_layers: 31 + job_id: jp0z0v305 job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -304,14 +325,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:42:59Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:34:13Z' - torchscript_onnx_tflite: - inference_time: 1357.0 - throughput: 736.9196757553427 + inference_time: 1351.0 + throughput: 740.1924500370096 estimated_peak_memory_range: - min: 12288 - max: 7607032 + min: 20480 + max: 1757768 primary_compute_unit: NPU precision: int8 layer_info: @@ -319,37 +340,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 27 - job_id: jvgdwo3e5 + job_id: j5we61w45 job_status: Passed torchscript_onnx_qnn: - inference_time: 692.0 - throughput: 1445.086705202312 + inference_time: 687.0 + throughput: 1455.604075691412 estimated_peak_memory_range: - min: 73728 - max: 2495152 + min: 77824 + max: 1297840 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 25 + layers_on_npu: 31 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 25 - job_id: jegn23nkg + total_layers: 31 + job_id: 
jpy137z0p job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:43:00Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:34:12Z' - torchscript_onnx_tflite: - inference_time: 3574.0 - throughput: 279.79854504756577 + inference_time: 1985.0 + throughput: 503.77833753148616 estimated_peak_memory_range: min: 1609728 - max: 21305424 + max: 29215200 primary_compute_unit: NPU precision: int8 layer_info: @@ -357,37 +378,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 27 - job_id: jz5woq73p + job_id: jgz3d4245 job_status: Passed torchscript_onnx_qnn: - inference_time: 3010.0 - throughput: 332.22591362126246 + inference_time: 1106.0 + throughput: 904.1591320072333 estimated_peak_memory_range: - min: 12288 - max: 7239120 + min: 61440 + max: 17044800 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 25 + layers_on_npu: 31 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 25 - job_id: jep28lwrp + total_layers: 31 + job_id: j5q6qmeep job_status: Passed reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T11:43:02Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:34:16Z' - torchscript_onnx_tflite: - inference_time: 19818.0 - throughput: 50.45917852457362 + inference_time: 1351.0 + throughput: 740.1924500370096 estimated_peak_memory_range: - min: 1675264 - max: 5049056 + min: 12288 + max: 18017408 primary_compute_unit: NPU precision: int8 layer_info: @@ -395,37 +416,67 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 27 - job_id: jmg9vwmw5 + job_id: jpxkod985 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 599.0 + throughput: 1669.449081803005 + estimated_peak_memory_range: + min: 61440 + max: 12144352 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 31 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 31 + job_id: j56y4denp + job_status: Passed + torchscript_onnx: + inference_time: 592.0 + throughput: 1689.1891891891892 + estimated_peak_memory_range: + min: 0 + max: 21257104 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 48 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 48 + job_id: jpedmle85 job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T11:42:53Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:34:23Z' - torchscript_onnx_qnn: - inference_time: 821.0 - throughput: 1218.026796589525 + inference_time: 798.0 + throughput: 1253.1328320802006 estimated_peak_memory_range: - min: 61440 - max: 61440 + min: 139264 + max: 139264 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 25 + layers_on_npu: 31 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 25 - job_id: jz57zx4vp + total_layers: 31 + job_id: jprv3n6kg job_status: Passed torchscript_onnx: - inference_time: 1207.0 - throughput: 828.5004142502071 + inference_time: 1201.0 + throughput: 832.6394671107411 estimated_peak_memory_range: - min: 3301376 - max: 3301376 + min: 3321856 + max: 3321856 primary_compute_unit: NPU precision: int8 layer_info: @@ -433,7 +484,7 @@ models: 
layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 48 - job_id: j1p8ozxkg + job_id: jpv6k90z5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -442,4 +493,4 @@ os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:43:05Z' + timestamp: '2024-10-14T23:34:21Z' diff --git a/qai_hub_models/models/shufflenet_v2/README.md b/qai_hub_models/models/shufflenet_v2/README.md index 14b27b6d..87bbee99 100644 --- a/qai_hub_models/models/shufflenet_v2/README.md +++ b/qai_hub_models/models/shufflenet_v2/README.md @@ -6,7 +6,7 @@ ShufflenetV2 is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of Shufflenet-v2 found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/shufflenetv2.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/shufflenet_v2). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.shufflenet_v2.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Shufflenet-v2 can be found +* The license for the original implementation of Shufflenet-v2 can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design](https://arxiv.org/abs/1807.11164) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/shufflenetv2.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
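The `export.py` diffs that follow replace the old `(compile_job, profile_job, inference_job)` 3-tuple return of `export_model` with an `ExportResult` struct. A minimal sketch of what that change means for callers (a sketch only: it assumes `qai_hub_models` is installed and Qualcomm® AI Hub access is configured, and the device string is simply the script's default):

```python
# Hedged sketch: consuming the new ExportResult return value of export_model.
# Assumes qai_hub_models is installed and AI Hub credentials are configured.
from qai_hub_models.models.shufflenet_v2.export import export_model

result = export_model(
    device="Samsung Galaxy S23 (Family)",  # the script's default device
    skip_inferencing=True,                 # late steps can be skipped individually
    output_dir="build/shufflenet_v2",
)
if isinstance(result, list):
    # Without AI Hub access, export_model returns a list of messages
    # (the List[str] branch of its return annotation) instead of jobs.
    print("\n".join(result))
else:
    # Jobs are read by field name rather than tuple position; skipped
    # steps surface as None fields.
    print(result.compile_job, result.profile_job, result.inference_job)
```

Accessing jobs by name is also what lets the quantized variants further below add a `quantize_job` field without breaking existing call sites.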
diff --git a/qai_hub_models/models/shufflenet_v2/export.py b/qai_hub_models/models/shufflenet_v2/export.py index 064a14ea..3d0f97d7 100644 --- a/qai_hub_models/models/shufflenet_v2/export.py +++ b/qai_hub_models/models/shufflenet_v2/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.shufflenet_v2 import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "shufflenet_v2" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. 
Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/shufflenet_v2/perf.yaml b/qai_hub_models/models/shufflenet_v2/perf.yaml index f58a25a3..040f539a 100644 --- a/qai_hub_models/models/shufflenet_v2/perf.yaml +++ b/qai_hub_models/models/shufflenet_v2/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: Shufflenet-v2 performance_metrics: - torchscript_onnx_tflite: - inference_time: 1210.0 - throughput: 826.4462809917355 + inference_time: 1201.0 + throughput: 832.6394671107411 estimated_peak_memory_range: - min: 16384 - max: 4278128 + min: 12288 + max: 1323640 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 204 - job_id: jygzer4kg + job_id: j57yrzd95 job_status: Passed 
torchscript_onnx_qnn: - inference_time: 775.0 - throughput: 1290.3225806451612 + inference_time: 774.0 + throughput: 1291.9896640826873 estimated_peak_memory_range: - min: 618496 - max: 6042968 + min: 16384 + max: 15972656 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 158 - job_id: j0pxvyx1g + job_id: j56y463yp job_status: Passed torchscript_onnx: - inference_time: 1088.0 - throughput: 919.1176470588235 + inference_time: 1128.0 + throughput: 886.5248226950355 estimated_peak_memory_range: - min: 651264 - max: 2178824 + min: 368640 + max: 1853424 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 223 - job_id: jogkz382g + job_id: jglvmnem5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:42:23Z' + timestamp: '2024-10-15T17:11:55Z' - torchscript_onnx_tflite: - inference_time: 971.0 - throughput: 1029.8661174047375 + inference_time: 975.0 + throughput: 1025.6410256410256 estimated_peak_memory_range: min: 12288 - max: 39435904 + max: 39986048 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 204 - job_id: jz5woq46p + job_id: jp4lrqw15 job_status: Passed torchscript_onnx_qnn: - inference_time: 516.0 - throughput: 1937.984496124031 + inference_time: 518.0 + throughput: 1930.5019305019305 estimated_peak_memory_range: - min: 0 - max: 11313536 + min: 618496 + max: 13868592 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 158 - job_id: jo5mr38wg + job_id: jgo26y1kp job_status: Passed torchscript_onnx: - inference_time: 737.0 - throughput: 1356.85210312076 + inference_time: 728.0 + throughput: 1373.6263736263736 estimated_peak_memory_range: min: 0 - max: 41810496 + max: 42560016 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 223 - job_id: jn5q83v45 + job_id: jgo26yekp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:42:24Z' + timestamp: '2024-10-15T17:11:56Z' - torchscript_onnx_tflite: - inference_time: 1193.0 - throughput: 838.2229673093043 + inference_time: 1197.0 + throughput: 835.421888053467 estimated_peak_memory_range: - min: 12288 - max: 5973288 + min: 24576 + max: 1453192 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 204 - job_id: jmg9vwdl5 + job_id: jpxkov1l5 job_status: Passed torchscript_onnx_qnn: - inference_time: 738.0 - throughput: 1355.0135501355014 + inference_time: 732.0 + throughput: 1366.120218579235 estimated_peak_memory_range: - min: 663552 - max: 1916040 + min: 634880 + max: 1988536 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 158 - job_id: joprkew95 + job_id: j5we6odm5 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:42:18Z' + chipset: QCS8550 Proxy + timestamp: 
'2024-10-15T17:11:48Z' - torchscript_onnx_tflite: - inference_time: 1319.0 - throughput: 758.1501137225171 + inference_time: 1200.0 + throughput: 833.3333333333334 estimated_peak_memory_range: - min: 12288 - max: 39963808 + min: 16384 + max: 1965656 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 204 - job_id: jnp10e625 + job_id: jp2ky8mqp job_status: Passed torchscript_onnx_qnn: - inference_time: 894.0 - throughput: 1118.5682326621925 + inference_time: 741.0 + throughput: 1349.527665317139 estimated_peak_memory_range: - min: 618496 - max: 14610576 + min: 630784 + max: 1889664 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 158 - job_id: j1p8oz1xg + job_id: j5mnxrw9p job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:42:22Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T17:11:51Z' - torchscript_onnx_tflite: - inference_time: 1208.0 - throughput: 827.8145695364238 + inference_time: 1196.0 + throughput: 836.1204013377926 estimated_peak_memory_range: - min: 20480 - max: 111905904 + min: 24576 + max: 1539008 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 204 - job_id: jvgdwo2e5 + job_id: jprv3ky7g job_status: Passed torchscript_onnx_qnn: - inference_time: 752.0 - throughput: 1329.787234042553 + inference_time: 733.0 + throughput: 1364.256480218281 estimated_peak_memory_range: - min: 647168 - max: 2427880 + min: 634880 + max: 1879296 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 158 - job_id: jep28le4p + job_id: j57yrzj95 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:42:19Z' + chipset: SA8775P Proxy + timestamp: '2024-10-15T17:11:50Z' - torchscript_onnx_tflite: - inference_time: 1206.0 - throughput: 829.1873963515754 + inference_time: 1204.0 + throughput: 830.5647840531561 estimated_peak_memory_range: - min: 20480 - max: 1571024 + min: 28672 + max: 1371800 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 204 - job_id: jz57zx9lp + job_id: jgn6v2eq5 job_status: Passed torchscript_onnx_qnn: inference_time: 741.0 throughput: 1349.527665317139 estimated_peak_memory_range: - min: 647168 - max: 2308712 + min: 638976 + max: 1991128 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 158 - job_id: jqpye6m7g + job_id: jp14z0d7p job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:42:20Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T17:11:49Z' - torchscript_onnx_tflite: - inference_time: 1208.0 - throughput: 827.8145695364238 + inference_time: 1315.0 + throughput: 760.4562737642585 estimated_peak_memory_range: - min: 28672 - max: 82256768 + min: 12288 + max: 40734320 primary_compute_unit: NPU precision: fp16 
layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 204 - job_id: jqp4qv3vg + job_id: j5mnxrz9p job_status: Passed torchscript_onnx_qnn: - inference_time: 736.0 - throughput: 1358.695652173913 + inference_time: 885.0 + throughput: 1129.9435028248588 estimated_peak_memory_range: - min: 634880 - max: 1859048 + min: 618496 + max: 14760976 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 158 - job_id: j2p0yl66g + job_id: jpy13e4lp job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-15T17:11:53Z' + - torchscript_onnx_tflite: + inference_time: 803.0 + throughput: 1245.3300124533 + estimated_peak_memory_range: + min: 12288 + max: 22415936 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 204 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 204 + job_id: j5q6q82op + job_status: Passed + torchscript_onnx_qnn: + inference_time: 407.0 + throughput: 2457.002457002457 + estimated_peak_memory_range: + min: 0 + max: 10565600 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 158 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 158 + job_id: jgkexzlng + job_status: Passed + torchscript_onnx: + inference_time: 786.0 + throughput: 1272.264631043257 + estimated_peak_memory_range: + min: 0 + max: 23696464 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 223 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 223 + job_id: jpedm94v5 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:42:21Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T17:11:59Z' - torchscript_onnx_qnn: - inference_time: 881.0 - throughput: 1135.0737797956867 + inference_time: 894.0 + throughput: 1118.5682326621925 estimated_peak_memory_range: min: 602112 max: 602112 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 158 - job_id: jegn23krg + job_id: jpedm9rv5 job_status: Passed torchscript_onnx: - inference_time: 1126.0 - throughput: 888.0994671403197 + inference_time: 1124.0 + throughput: 889.6797153024911 estimated_peak_memory_range: - min: 3928064 - max: 3928064 + min: 3284992 + max: 3284992 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 223 - job_id: j1gln3l8p + job_id: jpv6k3zr5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:42:25Z' + timestamp: '2024-10-15T17:11:57Z' diff --git a/qai_hub_models/models/shufflenet_v2_quantized/README.md b/qai_hub_models/models/shufflenet_v2_quantized/README.md index f97d918f..76982825 100644 --- a/qai_hub_models/models/shufflenet_v2_quantized/README.md +++ b/qai_hub_models/models/shufflenet_v2_quantized/README.md @@ -6,7 +6,7 @@ ShufflenetV2 is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. 
This is based on the implementation of Shufflenet-v2Quantized found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/shufflenetv2.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/shufflenet_v2_quantized). @@ -17,11 +17,6 @@ across various devices can be found [here](https://aihub.qualcomm.com/models/s ## Example & Usage -Install the package via pip: -```bash -pip install "qai_hub_models[shufflenet_v2_quantized]" -``` - Once installed, run the following simple CLI demo: @@ -44,15 +39,19 @@ python -m qai_hub_models.models.shufflenet_v2_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Shufflenet-v2Quantized can be found +* The license for the original implementation of Shufflenet-v2Quantized can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design](https://arxiv.org/abs/1807.11164) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/shufflenetv2.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
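The quantized model's `export.py` below swaps the AIMET-based flow for Hub-side quantization: trace the FP32 model, compile the trace to ONNX, quantize that asset against imagenette calibration data, then compile the quantized target model for the device. Condensed into a standalone sketch (assumptions: AI Hub access is configured, `hub.Device(...)` stands in for the script's device-resolution helpers, and all other names mirror the diff):

```python
# Hedged sketch of the Hub-side quantization recipe (steps 2-3 of the
# refactored export script's docstring); names follow the diff below.
import qai_hub as hub
import torch

from qai_hub_models.models.shufflenet_v2_quantized import Model
from qai_hub_models.utils.input_spec import make_torch_inputs
from qai_hub_models.utils.quantization import get_calibration_data

model = Model.from_pretrained()
input_spec = model.get_input_spec()
device = hub.Device("Samsung Galaxy S23")  # illustrative device choice

# Trace the FP32 model, then compile the trace to an ONNX asset on Hub.
traced = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec))
onnx_job = hub.submit_compile_job(
    model=traced,
    input_specs=input_spec,
    device=device,
    name="shufflenet_v2_quantized",
    options="--target_runtime onnx",
)

# Quantize the ONNX asset; "imagenette" and 100 samples are the defaults
# used by the export script.
quantize_job = hub.submit_quantize_job(
    model=onnx_job.get_target_model(),
    calibration_data=get_calibration_data(input_spec, "imagenette", 100),
    weights_dtype=model.get_weights_dtype(),
    activations_dtype=model.get_activations_dtype(),
    name="shufflenet_v2_quantized",
    options=model.get_quantize_options(),
)

# The quantized target model feeds the final on-device compile (the real
# script also passes runtime-specific compile options at this point).
compile_job = hub.submit_compile_job(
    model=quantize_job.get_target_model(),
    input_specs=input_spec,
    device=device,
    name="shufflenet_v2_quantized",
)
```

The `skip_compiling` early return in the diff exists because `quantize_job.get_target_model()` is already a useful artifact on its own.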
diff --git a/qai_hub_models/models/shufflenet_v2_quantized/evaluate.py b/qai_hub_models/models/shufflenet_v2_quantized/evaluate.py index 2fb7d9af..9989c217 100644 --- a/qai_hub_models/models/shufflenet_v2_quantized/evaluate.py +++ b/qai_hub_models/models/shufflenet_v2_quantized/evaluate.py @@ -13,10 +13,8 @@ from qai_hub_models.models.shufflenet_v2_quantized import MODEL_ID, Model from qai_hub_models.utils.args import evaluate_parser, get_hub_device, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.evaluate import evaluate_on_dataset from qai_hub_models.utils.inference import compile_model_from_args -from qai_hub_models.utils.quantization_aimet import AIMETQuantizableMixin SUPPORTED_DATASETS = ["imagenette", "imagenet"] @@ -27,7 +25,7 @@ def main(): model_cls=Model, default_split_size=2500, supported_datasets=SUPPORTED_DATASETS, - supports_ort=False, + is_hub_quantized=True, ) args = parser.parse_args() args.device = None @@ -39,13 +37,7 @@ def main(): MODEL_ID, args, get_model_kwargs(Model, vars(args)) ) hub_device = get_hub_device(None, args.chipset) - - # Use Fp16 model for torch inference - for cls in Model.__mro__: - if issubclass(cls, BaseModel) and not issubclass(cls, AIMETQuantizableMixin): - torch_cls = cls - break - torch_model = torch_cls.from_pretrained(**get_model_kwargs(torch_cls, vars(args))) + torch_model = Model.from_pretrained(**get_model_kwargs(Model, vars(args))) evaluate_on_dataset( hub_model, torch_model, diff --git a/qai_hub_models/models/shufflenet_v2_quantized/export.py b/qai_hub_models/models/shufflenet_v2_quantized/export.py index 1d90130f..7afa442b 100644 --- a/qai_hub_models/models/shufflenet_v2_quantized/export.py +++ b/qai_hub_models/models/shufflenet_v2_quantized/export.py @@ -10,18 +10,20 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.shufflenet_v2_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference +from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( print_inference_metrics, print_on_target_demo_cmd, @@ -31,11 +33,14 @@ can_access_qualcomm_ai_hub, export_without_hub_access, ) +from qai_hub_models.utils.quantization import get_calibration_data def export_model( device: str = "Samsung Galaxy S23 (Family)", chipset: Optional[str] = None, + num_calibration_samples: int = 100, + skip_compiling: bool = False, skip_profiling: bool = False, skip_inferencing: bool = False, skip_downloading: bool = False, @@ -45,20 +50,19 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. 
- 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + 3. Compiles the model to an asset that can be run on device + 4. Profiles the model performance on a real device + 5. Inferences the model on sample inputs + 6. Downloads the model asset to the local directory + 7. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 5 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -66,6 +70,9 @@ def export_model( Defaults to DEFAULT_DEVICE if not specified. chipset: If set, will choose a random device with this chipset. Overrides the `device` argument. + num_calibration_samples: The number of calibration data samples + to use for quantization. + skip_compiling: If set, skips compiling model to format that can run on device. skip_profiling: If set, skips profiling of compiled model on real devices. skip_inferencing: If set, skips computing on-device outputs from sample data. skip_downloading: If set, skips downloading of compiled model. @@ -80,10 +87,11 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: - * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). + A struct of: + * A CompileJob object containing metadata about the compile job submitted to hub (None if compiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). + * A QuantizeJob object containing metadata about the quantize job submitted to hub """ model_name = "shufflenet_v2_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,33 +117,52 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) ) # Trace the model - source_model = model.convert_to_hub_source_model( - target_runtime, output_path, input_spec + source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) + + print(f"Quantizing model {model_name} with {num_calibration_samples} samples.") + # 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + onnx_compile_job = hub.submit_compile_job( + model=source_model, + input_specs=input_spec, + device=hub_device, + name=model_name, + options="--target_runtime onnx", + ) + quantize_job = hub.submit_quantize_job( + model=onnx_compile_job.get_target_model(), + calibration_data=get_calibration_data( + input_spec, "imagenette", num_calibration_samples + ), + weights_dtype=model.get_weights_dtype(), + activations_dtype=model.get_activations_dtype(), + name=model_name, + options=model.get_quantize_options(), ) + if skip_compiling: + return ExportResult(quantize_job=quantize_job) - # 2. Compile the model to an on-device asset + # 3. 
Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) print(f"Optimizing model {model_name} to run on-device") submitted_compile_job = hub.submit_compile_job( - model=source_model, + model=quantize_job.get_target_model(), input_specs=input_spec, device=hub_device, name=model_name, - calibration_data=model.get_calibration_data(target_runtime), options=model_compile_options, ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 4. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +177,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 5. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +198,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 6. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 7. Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -202,12 +229,17 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + quantize_job=quantize_job, + ) def main(): warnings.filterwarnings("ignore") - parser = export_parser(model_cls=Model, supports_onnx=False) + parser = export_parser(model_cls=Model, is_hub_quantized=True) args = parser.parse_args() export_model(**vars(args)) diff --git a/qai_hub_models/models/shufflenet_v2_quantized/model.py b/qai_hub_models/models/shufflenet_v2_quantized/model.py index e77c1af3..4ad13c24 100644 --- a/qai_hub_models/models/shufflenet_v2_quantized/model.py +++ b/qai_hub_models/models/shufflenet_v2_quantized/model.py @@ -4,92 +4,11 @@ # --------------------------------------------------------------------- from __future__ import annotations -# isort: off -# This verifies aimet is installed, and this must be included first. 
-from qai_hub_models.utils.quantization_aimet import ( - AIMETQuantizableMixin, - constrain_quantized_inputs_to_image_range, -) - -# isort: on - -import torch -from aimet_torch.cross_layer_equalization import ( - equalize_bn_folded_model, - fold_all_batch_norms, -) -from aimet_torch.model_preparer import prepare_model -from aimet_torch.quantsim import QuantizationSimModel, load_encodings_to_sim - from qai_hub_models.models.shufflenet_v2.model import ShufflenetV2 -from qai_hub_models.utils.aimet.config_loader import get_default_aimet_config -from qai_hub_models.utils.asset_loaders import CachedWebModelAsset -from qai_hub_models.utils.quantization_aimet import ( - convert_all_depthwise_to_per_tensor, - tie_observers, -) +from qai_hub_models.utils.quantization import HubQuantizableMixin MODEL_ID = __name__.split(".")[-2] -MODEL_ASSET_VERSION = 3 -DEFAULT_ENCODINGS = "shufflenet_v2_quantized_encodings.json" - - -class ShufflenetV2Quantizable( - AIMETQuantizableMixin, - ShufflenetV2, -): - """ShufflenetV2 with post train quantization support. - - Supports only 8 bit weights and activations, and only loads pre-quantized checkpoints. - Support for quantizing using your own weights & data will come at a later date.""" - - def __init__( - self, - sim_model: QuantizationSimModel, - ) -> None: - # Input is already normalized by sim_model. Disable it in the wrapper model. - ShufflenetV2.__init__(self, sim_model.model, normalize_input=False) - AIMETQuantizableMixin.__init__( - self, - sim_model, - ) - - @classmethod - def from_pretrained( - cls, - aimet_encodings: str | None = "DEFAULT", - ) -> "ShufflenetV2Quantizable": - """ - Parameters: - aimet_encodings: - if "DEFAULT": Loads the model with aimet encodings calibrated on imagenette. - elif None: Doesn't load any encodings. Used when computing encodings. - else: Interprets as a filepath and loads the encodings stored there. 
- """ - model = ShufflenetV2.from_pretrained() - input_shape = cls.get_input_spec()["image_tensor"][0] - model = prepare_model(model) - dummy_input = torch.rand(input_shape) - - pairs = fold_all_batch_norms(model, input_shape, dummy_input) - equalize_bn_folded_model(model, input_shape, pairs, dummy_input) - sim = QuantizationSimModel( - model, - quant_scheme="tf_enhanced", - default_param_bw=8, - default_output_bw=8, - config_file=get_default_aimet_config(), - dummy_input=dummy_input, - ) - convert_all_depthwise_to_per_tensor(sim.model) - tie_observers(sim) - constrain_quantized_inputs_to_image_range(sim) - if aimet_encodings: - if aimet_encodings == "DEFAULT": - aimet_encodings = CachedWebModelAsset.from_asset_store( - MODEL_ID, MODEL_ASSET_VERSION, DEFAULT_ENCODINGS - ).fetch() - load_encodings_to_sim(sim, aimet_encodings) - return cls(sim) +class ShufflenetV2Quantizable(HubQuantizableMixin, ShufflenetV2): + pass diff --git a/qai_hub_models/models/shufflenet_v2_quantized/perf.yaml b/qai_hub_models/models/shufflenet_v2_quantized/perf.yaml index c6edf3a5..321614e5 100644 --- a/qai_hub_models/models/shufflenet_v2_quantized/perf.yaml +++ b/qai_hub_models/models/shufflenet_v2_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,67 +20,77 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8775P Proxy models: - name: Shufflenet-v2Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 610.0 - throughput: 1639.344262295082 + inference_time: 615.0 + throughput: 1626.0162601626016 estimated_peak_memory_range: - min: 16384 - max: 2376456 + min: 12288 + max: 1543976 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 207 + layers_on_npu: 233 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 207 - job_id: jygzer8kg + total_layers: 233 + job_id: jp142868p job_status: Passed torchscript_onnx_qnn: - inference_time: 589.0 - throughput: 1697.792869269949 + inference_time: 601.0 + throughput: 1663.8935108153078 estimated_peak_memory_range: - min: 53248 - max: 75919760 + min: 12288 + max: 33104664 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 122 + layers_on_npu: 160 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 122 - job_id: jegn237rg + total_layers: 160 + job_id: jp8q271kp + job_status: Passed + torchscript_onnx: + inference_time: 8856.0 + throughput: 112.91779584462512 + estimated_peak_memory_range: + min: 2326528 + max: 6265288 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 210 + layers_on_gpu: 0 + layers_on_cpu: 5 + total_layers: 215 + 
job_id: j5wew9735 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -88,36 +99,51 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:41:38Z' + timestamp: '2024-10-17T17:17:47Z' - torchscript_onnx_tflite: - inference_time: 431.0 - throughput: 2320.185614849188 + inference_time: 423.0 + throughput: 2364.066193853428 estimated_peak_memory_range: min: 12288 - max: 28407328 + max: 29672160 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 207 + layers_on_npu: 233 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 207 - job_id: jz5woq16p + total_layers: 233 + job_id: jgdxnv2rp job_status: Passed torchscript_onnx_qnn: - inference_time: 430.0 - throughput: 2325.5813953488373 + inference_time: 438.0 + throughput: 2283.10502283105 estimated_peak_memory_range: min: 159744 - max: 16215744 + max: 13332544 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 122 + layers_on_npu: 160 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 122 - job_id: joprken95 + total_layers: 160 + job_id: jgkevy8wg + job_status: Passed + torchscript_onnx: + inference_time: 7974.0 + throughput: 125.4075746175069 + estimated_peak_memory_range: + min: 921600 + max: 356507072 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 210 + layers_on_gpu: 0 + layers_on_cpu: 5 + total_layers: 215 + job_id: jg9l04mwg job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -126,150 +152,173 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:41:39Z' + timestamp: '2024-10-17T17:17:49Z' - torchscript_onnx_tflite: - inference_time: 618.0 - throughput: 1618.1229773462783 + inference_time: 892.0 + throughput: 1121.0762331838564 estimated_peak_memory_range: min: 12288 - max: 1440816 + max: 22451392 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 207 + layers_on_npu: 233 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 207 - job_id: jmg9vwxl5 + total_layers: 233 + job_id: j57y2d9v5 job_status: Passed torchscript_onnx_qnn: - inference_time: 530.0 - throughput: 1886.7924528301887 + inference_time: 1210.0 + throughput: 826.4462809917355 estimated_peak_memory_range: - min: 212992 - max: 1547008 + min: 192512 + max: 7952096 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 122 + layers_on_npu: 160 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 122 - job_id: jqpye677g + total_layers: 160 + job_id: j5q602vnp job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS6490 Proxy + timestamp: '2024-10-17T17:17:33Z' + - torchscript_onnx_tflite: + inference_time: 10608.0 + throughput: 94.2684766214178 + estimated_peak_memory_range: + min: 192512 + max: 13361760 + primary_compute_unit: CPU + precision: fp32 + layer_info: + layers_on_npu: 44 + layers_on_gpu: 11 + layers_on_cpu: 178 + total_layers: 233 + job_id: jp4lnw385 + job_status: Passed + reference_device_info: + name: RB5 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:41:41Z' + chipset: QCS8250 Proxy + timestamp: '2024-10-17T17:17:18Z' - torchscript_onnx_tflite: - inference_time: 647.0 - throughput: 1545.595054095827 + inference_time: 614.0 + throughput: 1628.6644951140065 estimated_peak_memory_range: - min: 16384 - max: 28141808 + min: 12288 + max: 
7154088 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 207 + layers_on_npu: 233 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 207 - job_id: jnp10ev25 + total_layers: 233 + job_id: jpxk91x35 job_status: Passed torchscript_onnx_qnn: - inference_time: 644.0 - throughput: 1552.7950310559006 + inference_time: 546.0 + throughput: 1831.5018315018315 estimated_peak_memory_range: - min: 159744 - max: 16311600 + min: 176128 + max: 1520080 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 122 + layers_on_npu: 160 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 122 - job_id: jn5q83m45 + total_layers: 160 + job_id: jglv4klj5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:41:46Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-17T17:17:35Z' - torchscript_onnx_tflite: inference_time: 611.0 throughput: 1636.6612111292961 estimated_peak_memory_range: min: 12288 - max: 33214624 + max: 1590352 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 207 + layers_on_npu: 233 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 207 - job_id: jvgdwoze5 + total_layers: 233 + job_id: j5mnez8dp job_status: Passed torchscript_onnx_qnn: - inference_time: 529.0 - throughput: 1890.359168241966 + inference_time: 544.0 + throughput: 1838.235294117647 estimated_peak_memory_range: - min: 180224 - max: 1723400 + min: 204800 + max: 1535944 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 122 + layers_on_npu: 160 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 122 - job_id: j2p0ylv6g + total_layers: 160 + job_id: jp3jnm63g job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:41:43Z' + chipset: SA8255P Proxy + timestamp: '2024-10-17T17:17:38Z' - torchscript_onnx_tflite: - inference_time: 618.0 - throughput: 1618.1229773462783 + inference_time: 616.0 + throughput: 1623.3766233766235 estimated_peak_memory_range: - min: 16384 - max: 15518960 + min: 20480 + max: 1659664 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 207 + layers_on_npu: 233 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 207 - job_id: jz57zx7lp + total_layers: 233 + job_id: jgn60ekk5 job_status: Passed torchscript_onnx_qnn: - inference_time: 530.0 - throughput: 1886.7924528301887 + inference_time: 566.0 + throughput: 1766.7844522968198 estimated_peak_memory_range: min: 184320 - max: 1483136 + max: 1791344 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 122 + layers_on_npu: 160 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 122 - job_id: j1p8oz4xg + total_layers: 160 + job_id: jgo2zv8qp job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -277,121 +326,128 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:41:43Z' + chipset: SA8775P Proxy + timestamp: '2024-10-17T17:17:40Z' - torchscript_onnx_tflite: - inference_time: 615.0 - throughput: 1626.0162601626016 + inference_time: 645.0 + throughput: 1550.3875968992247 estimated_peak_memory_range: min: 12288 - max: 1521400 + max: 30330480 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 207 + layers_on_npu: 233 layers_on_gpu: 0 
layers_on_cpu: 0 - total_layers: 207 - job_id: jqp4qv9vg + total_layers: 233 + job_id: jprv6yw0g job_status: Passed torchscript_onnx_qnn: - inference_time: 526.0 - throughput: 1901.1406844106464 + inference_time: 649.0 + throughput: 1540.8320493066255 estimated_peak_memory_range: - min: 212992 - max: 1577352 + min: 159744 + max: 14718512 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 122 + layers_on_npu: 160 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 122 - job_id: jogkz392g + total_layers: 160 + job_id: jpv6qwdk5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:41:45Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-17T17:17:41Z' - torchscript_onnx_tflite: - inference_time: 925.0 - throughput: 1081.081081081081 + inference_time: 466.0 + throughput: 2145.922746781116 estimated_peak_memory_range: - min: 12288 - max: 21349328 + min: 8192 + max: 21476592 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 207 + layers_on_npu: 233 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 207 - job_id: j0pxvyd1g + total_layers: 233 + job_id: jp2kxmerp job_status: Passed torchscript_onnx_qnn: - inference_time: 1119.0 - throughput: 893.6550491510277 + inference_time: 382.0 + throughput: 2617.801047120419 estimated_peak_memory_range: - min: 12288 - max: 7590464 + min: 159744 + max: 10601808 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 122 + layers_on_npu: 160 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 122 - job_id: j1gln318p + total_layers: 160 + job_id: jgjvdl7vg job_status: Passed - reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot - os_name: Android - manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T11:41:47Z' - - torchscript_onnx_tflite: - inference_time: 10938.0 - throughput: 91.42439202779302 + torchscript_onnx: + inference_time: 6292.0 + throughput: 158.93197711379528 estimated_peak_memory_range: - min: 12288 - max: 9429816 - primary_compute_unit: CPU - precision: fp32 + min: 0 + max: 287486368 + primary_compute_unit: NPU + precision: int8 layer_info: - layers_on_npu: 44 - layers_on_gpu: 9 - layers_on_cpu: 154 - total_layers: 207 - job_id: jo5mr3dwg + layers_on_npu: 210 + layers_on_gpu: 0 + layers_on_cpu: 5 + total_layers: 215 + job_id: jgdxnv3rp job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T11:41:37Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-17T17:17:52Z' - torchscript_onnx_qnn: - inference_time: 661.0 - throughput: 1512.8593040847202 + inference_time: 681.0 + throughput: 1468.4287812041116 estimated_peak_memory_range: - min: 532480 - max: 532480 + min: 630784 + max: 630784 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 122 + layers_on_npu: 160 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 122 - job_id: jep28lv4p + total_layers: 160 + job_id: j56y21w6p + job_status: Passed + torchscript_onnx: + inference_time: 10337.0 + throughput: 96.73986649898423 + estimated_peak_memory_range: + min: 6778880 + max: 6778880 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 210 + layers_on_gpu: 0 + layers_on_cpu: 5 + total_layers: 215 + job_id: 
jp1428j8p job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -400,4 +456,4 @@ os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:41:40Z' + timestamp: '2024-10-17T17:17:51Z' diff --git a/qai_hub_models/models/shufflenet_v2_quantized/requirements.txt b/qai_hub_models/models/shufflenet_v2_quantized/requirements.txt deleted file mode 100644 index de5b80e8..00000000 --- a/qai_hub_models/models/shufflenet_v2_quantized/requirements.txt +++ /dev/null @@ -1 +0,0 @@ -aimet-torch==1.32.1.post1; sys_platform == "linux" diff --git a/qai_hub_models/models/shufflenet_v2_quantized/test.py b/qai_hub_models/models/shufflenet_v2_quantized/test.py deleted file mode 100644 index 995731eb..00000000 --- a/qai_hub_models/models/shufflenet_v2_quantized/test.py +++ /dev/null @@ -1,29 +0,0 @@ -# --------------------------------------------------------------------- -# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. -# SPDX-License-Identifier: BSD-3-Clause -# --------------------------------------------------------------------- -from qai_hub_models.models._shared.imagenet_classifier.test_utils import ( - run_imagenet_classifier_test, -) -from qai_hub_models.models.shufflenet_v2_quantized.demo import main as demo_main -from qai_hub_models.models.shufflenet_v2_quantized.model import ( - MODEL_ASSET_VERSION, - MODEL_ID, - ShufflenetV2Quantizable, -) - - -def test_task(): - run_imagenet_classifier_test( - ShufflenetV2Quantizable.from_pretrained(), - MODEL_ID, - asset_version=MODEL_ASSET_VERSION, - diff_tol=0.005, - rtol=0.02, - atol=0.2, - ) - - -def test_demo(): - # Verify demo does not crash - demo_main(is_test=True) diff --git a/qai_hub_models/models/sinet/README.md b/qai_hub_models/models/sinet/README.md index 388eec6c..eb12f3f7 100644 --- a/qai_hub_models/models/sinet/README.md +++ b/qai_hub_models/models/sinet/README.md @@ -6,7 +6,7 @@ SINet is a machine learning model that is designed to segment people from close-up portrait images in real time. This is based on the implementation of SINet found -[here](https://github.com/clovaai/ext_portrait_segmentation). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/sinet). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.sinet.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of SINet can be found +* The license for the original implementation of SINet can be found [here](https://github.com/clovaai/ext_portrait_segmentation/blob/master/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [SINet: Extreme Lightweight Portrait Segmentation Networks with Spatial Squeeze Modules and Information Blocking Decoder](https://arxiv.org/abs/1911.09099) * [Source Model Implementation](https://github.com/clovaai/ext_portrait_segmentation) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/sinet/export.py b/qai_hub_models/models/sinet/export.py index 56e13737..8d0f710f 100644 --- a/qai_hub_models/models/sinet/export.py +++ b/qai_hub_models/models/sinet/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.sinet import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
""" model_name = "sinet" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/sinet/perf.yaml b/qai_hub_models/models/sinet/perf.yaml index 1b53cb2a..e6956bfb 100644 --- a/qai_hub_models/models/sinet/perf.yaml +++ b/qai_hub_models/models/sinet/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: SINet performance_metrics: - torchscript_onnx_tflite: - inference_time: 1743.0 - throughput: 573.7234652897304 + inference_time: 1753.0 + throughput: 570.4506560182544 estimated_peak_memory_range: - min: 12288 - max: 4302648 + min: 28672 + max: 7061232 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 240 - job_id: jz5woq86p + job_id: jgn6voyq5 job_status: Passed torchscript_onnx_qnn: - inference_time: 1175.0 - throughput: 851.063829787234 + inference_time: 1189.0 + throughput: 841.0428931875525 estimated_peak_memory_range: - min: 626688 - max: 5231632 + min: 622592 + max: 5973080 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 186 - job_id: jo5mr3owg + job_id: j56y4rlyp job_status: Passed torchscript_onnx: - inference_time: 2305.0 - throughput: 433.83947939262475 + inference_time: 2281.0 + throughput: 438.4042086804033 estimated_peak_memory_range: - min: 290816 - max: 2259456 + min: 335872 + max: 2129016 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 229 - job_id: jn5q83z45 + job_id: jgdx18lzp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:40:54Z' + timestamp: '2024-10-14T23:31:46Z' - torchscript_onnx_tflite: - inference_time: 1142.0 - throughput: 875.6567425569177 + inference_time: 1164.0 + throughput: 859.106529209622 estimated_peak_memory_range: min: 12288 - max: 31739376 + max: 33021696 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 
0 layers_on_cpu: 0 total_layers: 240 - job_id: jmg9vwkl5 + job_id: jprv3oq7g job_status: Passed torchscript_onnx_qnn: - inference_time: 808.0 - throughput: 1237.6237623762377 + inference_time: 809.0 + throughput: 1236.0939431396787 estimated_peak_memory_range: min: 618496 - max: 14828816 + max: 16535200 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 186 - job_id: jegn23org + job_id: jp3j0x2ng job_status: Passed torchscript_onnx: - inference_time: 1984.0 - throughput: 504.03225806451616 + inference_time: 1539.0 + throughput: 649.772579597141 estimated_peak_memory_range: min: 0 - max: 35618128 + max: 37479840 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 229 - job_id: j1gln3o8p + job_id: j5we68n45 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:40:55Z' + timestamp: '2024-10-14T23:31:47Z' - torchscript_onnx_tflite: - inference_time: 1721.0 - throughput: 581.0575246949448 + inference_time: 1732.0 + throughput: 577.3672055427252 estimated_peak_memory_range: min: 12288 - max: 5637536 + max: 2591184 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 240 - job_id: jnp10e725 + job_id: jp2ky46qp job_status: Passed torchscript_onnx_qnn: - inference_time: 1160.0 - throughput: 862.0689655172414 + inference_time: 1157.0 + throughput: 864.304235090752 estimated_peak_memory_range: - min: 647168 - max: 1964256 + min: 634880 + max: 2772960 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 186 - job_id: jep28l44p + job_id: jpv6kexr5 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:40:49Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:31:39Z' - torchscript_onnx_tflite: - inference_time: 1893.0 - throughput: 528.2620179609086 + inference_time: 1749.0 + throughput: 571.7552887364208 estimated_peak_memory_range: - min: 12288 - max: 31468624 + min: 32768 + max: 1593152 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 240 - job_id: jvgdwo8e5 + job_id: jgkexonng job_status: Passed torchscript_onnx_qnn: - inference_time: 1335.0 - throughput: 749.0636704119851 + inference_time: 1178.0 + throughput: 848.8964346349745 estimated_peak_memory_range: - min: 622592 - max: 16541776 + min: 630784 + max: 1806512 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 186 - job_id: jogkz3o2g + job_id: jgz3d8kx5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:40:53Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:31:42Z' - torchscript_onnx_tflite: - inference_time: 1768.0 - throughput: 565.6108597285067 + inference_time: 1754.0 + throughput: 570.1254275940707 estimated_peak_memory_range: - min: 40960 - max: 23496256 + min: 28672 + max: 1526024 primary_compute_unit: 
NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 240 - job_id: jz57zxklp + job_id: jp8qy69op job_status: Passed torchscript_onnx_qnn: - inference_time: 1155.0 - throughput: 865.8008658008658 + inference_time: 1183.0 + throughput: 845.30853761623 estimated_peak_memory_range: min: 634880 - max: 2041248 + max: 2031744 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 186 - job_id: jqpye6q7g + job_id: jpedm83v5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:40:50Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:31:41Z' - torchscript_onnx_tflite: - inference_time: 1735.0 - throughput: 576.3688760806916 + inference_time: 1746.0 + throughput: 572.737686139748 estimated_peak_memory_range: min: 12288 - max: 7947984 + max: 2599136 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 240 - job_id: jqp4qvmvg + job_id: jp0z0dqn5 job_status: Passed torchscript_onnx_qnn: - inference_time: 1162.0 - throughput: 860.5851979345955 + inference_time: 1180.0 + throughput: 847.457627118644 estimated_peak_memory_range: - min: 638976 - max: 2347704 + min: 634880 + max: 1988320 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 186 - job_id: j2p0yld6g + job_id: jgjvno4eg job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:40:51Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:31:40Z' - torchscript_onnx_tflite: - inference_time: 1742.0 - throughput: 574.052812858783 + inference_time: 1884.0 + throughput: 530.7855626326964 estimated_peak_memory_range: - min: 28672 - max: 8158144 + min: 12288 + max: 33257216 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 240 - job_id: j0pxvy31g + job_id: jpy13qwlp job_status: Passed torchscript_onnx_qnn: - inference_time: 1164.0 - throughput: 859.106529209622 + inference_time: 1316.0 + throughput: 759.8784194528876 estimated_peak_memory_range: - min: 638976 - max: 2058072 + min: 618496 + max: 18202544 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 186 - job_id: j1p8oz6xg + job_id: jg9lnke8g job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:31:44Z' + - torchscript_onnx_tflite: + inference_time: 1138.0 + throughput: 878.7346221441124 + estimated_peak_memory_range: + min: 12288 + max: 23124416 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 240 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 240 + job_id: jglvmorm5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 770.0 + throughput: 1298.7012987012988 + estimated_peak_memory_range: + min: 0 + max: 12719056 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 186 + layers_on_gpu: 0 + layers_on_cpu: 0 
+ total_layers: 186 + job_id: jp14z7x7p + job_status: Passed + torchscript_onnx: + inference_time: 1526.0 + throughput: 655.307994757536 + estimated_peak_memory_range: + min: 0 + max: 25661488 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 229 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 229 + job_id: jgdx18l6p + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:40:52Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:31:50Z' - torchscript_onnx_qnn: - inference_time: 1344.0 - throughput: 744.047619047619 + inference_time: 1350.0 + throughput: 740.7407407407408 estimated_peak_memory_range: min: 602112 max: 602112 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 186 - job_id: joprkeo95 + job_id: jgo26oqkp job_status: Passed torchscript_onnx: - inference_time: 2371.0 - throughput: 421.76296921130324 + inference_time: 2428.0 + throughput: 411.8616144975288 estimated_peak_memory_range: - min: 1720320 - max: 1720320 + min: 1765376 + max: 1765376 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 229 - job_id: jw566nr05 + job_id: jg9lnkemg job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:40:56Z' + timestamp: '2024-10-14T23:31:48Z' diff --git a/qai_hub_models/models/squeezenet1_1/README.md b/qai_hub_models/models/squeezenet1_1/README.md index 99b00954..71ce0fa4 100644 --- a/qai_hub_models/models/squeezenet1_1/README.md +++ b/qai_hub_models/models/squeezenet1_1/README.md @@ -6,7 +6,7 @@ SqueezeNet is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of SqueezeNet-1_1 found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/squeezenet.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/squeezenet1_1). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.squeezenet1_1.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of SqueezeNet-1_1 can be found +* The license for the original implementation of SqueezeNet-1_1 can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size](https://arxiv.org/abs/1602.07360) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/squeezenet.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/squeezenet1_1/export.py b/qai_hub_models/models/squeezenet1_1/export.py index cd34e8b3..bd810a79 100644 --- a/qai_hub_models/models/squeezenet1_1/export.py +++ b/qai_hub_models/models/squeezenet1_1/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.squeezenet1_1 import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
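The export entry points in these hunks replace the positional 3-tuple with a keyword-constructed `ExportResult`, so skipped steps surface as `None` fields rather than tuple slots. A minimal sketch of a migrated call site, assuming `ExportResult` exposes `compile_job`, `profile_job`, and `inference_job` attributes exactly as the keyword arguments above suggest, and that the `List[str]` early-return path (used when Qualcomm AI Hub is not accessible) is unchanged:

```python
from qai_hub_models.models.squeezenet1_1.export import export_model

result = export_model(device="Samsung Galaxy S23 (Family)")

# export_model still returns a List[str] when Hub access is unavailable,
# so guard before touching job attributes.
if not isinstance(result, list):
    # Old call sites unpacked a 3-tuple:
    #   compile_job, profile_job, inference_job = export_model(...)
    # New call sites read named fields; steps skipped via the
    # skip_* options come back as None.
    print("Compile job:", result.compile_job.job_id)
    if result.profile_job is not None:
        print("Profile job:", result.profile_job.job_id)
    if result.inference_job is not None:
        print("Inference job:", result.inference_job.job_id)
```

Because the struct is built with keyword arguments, the field order in the docstring (inference before profile) is documentation only and does not affect callers.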
""" model_name = "squeezenet1_1" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/squeezenet1_1/perf.yaml b/qai_hub_models/models/squeezenet1_1/perf.yaml index 6e1bb0c3..eafa7adc 100644 --- a/qai_hub_models/models/squeezenet1_1/perf.yaml +++ b/qai_hub_models/models/squeezenet1_1/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: SqueezeNet-1_1 performance_metrics: - torchscript_onnx_tflite: - inference_time: 643.0 - throughput: 1555.2099533437015 + inference_time: 641.0 + throughput: 1560.0624024960998 estimated_peak_memory_range: min: 16384 - max: 1992976 + max: 2365544 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 41 - job_id: jnp10e325 + job_id: jgdx1worp job_status: Passed torchscript_onnx_qnn: - inference_time: 713.0 - throughput: 1402.5245441795232 + inference_time: 710.0 + throughput: 1408.4507042253522 estimated_peak_memory_range: - min: 618496 - max: 3591720 + min: 634880 + max: 6722776 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 70 - job_id: joprke995 + job_id: jgn6v23q5 job_status: Passed torchscript_onnx: - inference_time: 658.0 - throughput: 1519.756838905775 + inference_time: 653.0 + throughput: 1531.3935681470139 estimated_peak_memory_range: min: 12288 - max: 3840744 + max: 41556344 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 71 - job_id: jw566nv05 + job_id: jp3j0kmng job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:40:16Z' + timestamp: '2024-10-15T17:11:13Z' - torchscript_onnx_tflite: - inference_time: 556.0 - throughput: 1798.5611510791366 + inference_time: 461.0 + throughput: 2169.1973969631235 estimated_peak_memory_range: - min: 12288 - max: 25694912 + min: 16384 + max: 27001168 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ 
models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 41 - job_id: jvgdwo0e5 + job_id: j5we6oqm5 job_status: Passed torchscript_onnx_qnn: - inference_time: 510.0 - throughput: 1960.7843137254902 + inference_time: 512.0 + throughput: 1953.125 estimated_peak_memory_range: min: 618496 - max: 14093824 + max: 12746688 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 70 - job_id: jep28lj4p + job_id: jprv3ke7g job_status: Passed torchscript_onnx: - inference_time: 491.0 - throughput: 2036.6598778004072 + inference_time: 578.0 + throughput: 1730.1038062283737 estimated_peak_memory_range: min: 0 - max: 28459728 + max: 28721200 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 71 - job_id: j1p3ke8l5 + job_id: jgo26yvkp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:40:17Z' + timestamp: '2024-10-15T17:11:14Z' - torchscript_onnx_tflite: - inference_time: 642.0 - throughput: 1557.632398753894 + inference_time: 640.0 + throughput: 1562.5 estimated_peak_memory_range: - min: 20480 - max: 1352552 + min: 12288 + max: 1821560 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 41 - job_id: jz57zx6lp + job_id: jg9lnvw8g job_status: Passed torchscript_onnx_qnn: - inference_time: 647.0 - throughput: 1545.595054095827 + inference_time: 645.0 + throughput: 1550.3875968992247 estimated_peak_memory_range: - min: 630784 - max: 1875288 + min: 667648 + max: 1987184 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 70 - job_id: j2p0ylk6g + job_id: jpy13e6lp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:40:12Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-15T17:11:07Z' - torchscript_onnx_tflite: - inference_time: 809.0 - throughput: 1236.0939431396787 + inference_time: 641.0 + throughput: 1560.0624024960998 estimated_peak_memory_range: min: 16384 - max: 27268672 + max: 6654104 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 41 - job_id: jqp4qv8vg + job_id: jp4lrqv15 job_status: Passed torchscript_onnx_qnn: - inference_time: 893.0 - throughput: 1119.8208286674133 + inference_time: 654.0 + throughput: 1529.051987767584 estimated_peak_memory_range: - min: 618496 - max: 15241280 + min: 622592 + max: 1958368 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 70 - job_id: j1gln378p + job_id: jgkexz3ng job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:40:16Z' + chipset: SA8255P Proxy + timestamp: '2024-10-15T17:11:10Z' - torchscript_onnx_tflite: inference_time: 640.0 throughput: 1562.5 estimated_peak_memory_range: - min: 20480 - max: 72121952 + min: 16384 + max: 2603568 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,52 @@ models: layers_on_gpu: 0 
layers_on_cpu: 0 total_layers: 41 - job_id: j0pxvym1g + job_id: j57yrzx95 job_status: Passed torchscript_onnx_qnn: - inference_time: 660.0 - throughput: 1515.1515151515152 + inference_time: 644.0 + throughput: 1552.7950310559006 + estimated_peak_memory_range: + min: 323584 + max: 1672120 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 70 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 70 + job_id: jp8qyozop + job_status: Passed + reference_device_info: + name: SA8775 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8775P Proxy + timestamp: '2024-10-15T17:11:09Z' + - torchscript_onnx_tflite: + inference_time: 639.0 + throughput: 1564.9452269170579 + estimated_peak_memory_range: + min: 12288 + max: 1913520 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 41 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 41 + job_id: jgdx1wozp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 642.0 + throughput: 1557.632398753894 estimated_peak_memory_range: min: 634880 - max: 1896816 + max: 2400840 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,7 +291,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 70 - job_id: j1p8oz8xg + job_id: jp0z0yln5 job_status: Passed reference_device_info: name: SA8650 (Proxy) @@ -263,14 +299,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:40:13Z' + chipset: SA8650P Proxy + timestamp: '2024-10-15T17:11:08Z' - torchscript_onnx_tflite: - inference_time: 644.0 - throughput: 1552.7950310559006 + inference_time: 813.0 + throughput: 1230.0123001230013 estimated_peak_memory_range: min: 16384 - max: 7319112 + max: 28305552 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 41 - job_id: jo5mr34wg + job_id: jp14z0e7p job_status: Passed torchscript_onnx_qnn: - inference_time: 654.0 - throughput: 1529.051987767584 + inference_time: 891.0 + throughput: 1122.334455667789 estimated_peak_memory_range: - min: 643072 - max: 1873984 + min: 618496 + max: 15094096 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +329,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 70 - job_id: jogkz3d2g + job_id: jglvmnkm5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:40:14Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-15T17:11:11Z' - torchscript_onnx_tflite: - inference_time: 647.0 - throughput: 1545.595054095827 + inference_time: 431.0 + throughput: 2320.185614849188 estimated_peak_memory_range: - min: 12288 - max: 1333008 + min: 8192 + max: 16302400 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +352,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 41 - job_id: jegn23xrg + job_id: j5mnxr39p job_status: Passed torchscript_onnx_qnn: - inference_time: 658.0 - throughput: 1519.756838905775 + inference_time: 391.0 + throughput: 2557.544757033248 estimated_peak_memory_range: - min: 626688 - max: 2149952 + min: 614400 + max: 9250464 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +367,34 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 70 - job_id: jn5q83w45 + job_id: j56y461yp + job_status: Passed + torchscript_onnx: + inference_time: 420.0 + 
throughput: 2380.9523809523807 + estimated_peak_memory_range: + min: 0 + max: 17425504 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 71 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 71 + job_id: jpedm9vv5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) - os: '13' - form_factor: Auto + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:40:15Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-15T17:11:16Z' - torchscript_onnx_qnn: - inference_time: 780.0 - throughput: 1282.051282051282 + inference_time: 784.0 + throughput: 1275.5102040816328 estimated_peak_memory_range: min: 602112 max: 602112 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 70 - job_id: jqpye6n7g + job_id: jp2ky8lqp job_status: Passed torchscript_onnx: - inference_time: 695.0 - throughput: 1438.8489208633093 + inference_time: 697.0 + throughput: 1434.7202295552368 estimated_peak_memory_range: - min: 2756608 - max: 2756608 + min: 2879488 + max: 2879488 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 71 - job_id: jwgoy3mx5 + job_id: jpv6k3wr5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:40:18Z' + timestamp: '2024-10-15T17:11:15Z' diff --git a/qai_hub_models/models/squeezenet1_1_quantized/README.md b/qai_hub_models/models/squeezenet1_1_quantized/README.md index daed51c8..063e6f42 100644 --- a/qai_hub_models/models/squeezenet1_1_quantized/README.md +++ b/qai_hub_models/models/squeezenet1_1_quantized/README.md @@ -6,7 +6,7 @@ SqueezeNet is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of SqueezeNet-1_1Quantized found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/squeezenet.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/squeezenet1_1_quantized). @@ -17,11 +17,6 @@ accross various devices, can be found [here](https://aihub.qualcomm.com/models/s ## Example & Usage -Install the package via pip: -```bash -pip install "qai_hub_models[squeezenet1_1_quantized]" -``` - Once installed, run the following simple CLI demo: @@ -44,15 +39,19 @@ python -m qai_hub_models.models.squeezenet1_1_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of SqueezeNet-1_1Quantized can be found +* The license for the original implementation of SqueezeNet-1_1Quantized can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). 
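In the perf.yaml files above, every updated `inference_time`/`throughput` pair moves together because `inference_time` is in microseconds and `throughput` is its reciprocal in inferences per second (for example, 641.0 µs pairs with 1e6 / 641 = 1560.06). A small sketch that recomputes throughput from a perf.yaml to sanity-check edited entries; the file path is illustrative and PyYAML is assumed to be available:

```python
import yaml  # PyYAML, assumed available in the dev environment

with open("qai_hub_models/models/squeezenet1_1/perf.yaml") as f:
    perf = yaml.safe_load(f)

for model in perf["models"]:
    for entry in model["performance_metrics"]:
        for key, stats in entry.items():
            # Skip the metadata keys that sit alongside the runtime blocks.
            if key in ("reference_device_info", "timestamp"):
                continue
            inference_time_us = stats["inference_time"]  # microseconds
            recomputed = 1e6 / inference_time_us         # inferences/second
            assert abs(recomputed - stats["throughput"]) < 1e-6 * recomputed
```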
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size](https://arxiv.org/abs/1602.07360) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/squeezenet.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/squeezenet1_1_quantized/evaluate.py b/qai_hub_models/models/squeezenet1_1_quantized/evaluate.py index bdaf6536..6d914b65 100644 --- a/qai_hub_models/models/squeezenet1_1_quantized/evaluate.py +++ b/qai_hub_models/models/squeezenet1_1_quantized/evaluate.py @@ -13,10 +13,8 @@ from qai_hub_models.models.squeezenet1_1_quantized import MODEL_ID, Model from qai_hub_models.utils.args import evaluate_parser, get_hub_device, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.evaluate import evaluate_on_dataset from qai_hub_models.utils.inference import compile_model_from_args -from qai_hub_models.utils.quantization_aimet import AIMETQuantizableMixin SUPPORTED_DATASETS = ["imagenette", "imagenet"] @@ -27,6 +25,7 @@ def main(): model_cls=Model, default_split_size=2500, supported_datasets=SUPPORTED_DATASETS, + is_hub_quantized=True, ) args = parser.parse_args() args.device = None @@ -38,13 +37,7 @@ def main(): MODEL_ID, args, get_model_kwargs(Model, vars(args)) ) hub_device = get_hub_device(None, args.chipset) - - # Use Fp16 model for torch inference - for cls in Model.__mro__: - if issubclass(cls, BaseModel) and not issubclass(cls, AIMETQuantizableMixin): - torch_cls = cls - break - torch_model = torch_cls.from_pretrained(**get_model_kwargs(torch_cls, vars(args))) + torch_model = Model.from_pretrained(**get_model_kwargs(Model, vars(args))) evaluate_on_dataset( hub_model, torch_model, diff --git a/qai_hub_models/models/squeezenet1_1_quantized/export.py b/qai_hub_models/models/squeezenet1_1_quantized/export.py index f17f90f8..8b5f0275 100644 --- a/qai_hub_models/models/squeezenet1_1_quantized/export.py +++ b/qai_hub_models/models/squeezenet1_1_quantized/export.py @@ -10,18 +10,20 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.squeezenet1_1_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference +from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( print_inference_metrics, print_on_target_demo_cmd, @@ -31,11 +33,14 @@ can_access_qualcomm_ai_hub, export_without_hub_access, ) +from qai_hub_models.utils.quantization import get_calibration_data def export_model( device: 
str = "Samsung Galaxy S23 (Family)", chipset: Optional[str] = None, + num_calibration_samples: int = 100, + skip_compiling: bool = False, skip_profiling: bool = False, skip_inferencing: bool = False, skip_downloading: bool = False, @@ -45,20 +50,19 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + 3. Compiles the model to an asset that can be run on device + 4. Profiles the model performance on a real device + 5. Inferences the model on sample inputs + 6. Downloads the model asset to the local directory + 7. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 5 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -66,6 +70,9 @@ def export_model( Defaults to DEFAULT_DEVICE if not specified. chipset: If set, will choose a random device with this chipset. Overrides the `device` argument. + num_calibration_samples: The number of calibration data samples + to use for quantization. + skip_compiling: If set, skips compiling model to format that can run on device. skip_profiling: If set, skips profiling of compiled model on real devices. skip_inferencing: If set, skips computing on-device outputs from sample data. skip_downloading: If set, skips downloading of compiled model. @@ -80,10 +87,11 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: - * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). + A struct of: + * A CompileJob object containing metadata about the compile job submitted to hub (None if compiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). + * A QuantizeJob object containing metadata about the quantize job submitted to hub """ model_name = "squeezenet1_1_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,33 +117,52 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. 
Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) ) # Trace the model - source_model = model.convert_to_hub_source_model( - target_runtime, output_path, input_spec + source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) + + print(f"Quantizing model {model_name} with {num_calibration_samples} samples.") + # 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + onnx_compile_job = hub.submit_compile_job( + model=source_model, + input_specs=input_spec, + device=hub_device, + name=model_name, + options="--target_runtime onnx", + ) + quantize_job = hub.submit_quantize_job( + model=onnx_compile_job.get_target_model(), + calibration_data=get_calibration_data( + input_spec, "imagenette", num_calibration_samples + ), + weights_dtype=model.get_weights_dtype(), + activations_dtype=model.get_activations_dtype(), + name=model_name, + options=model.get_quantize_options(), ) + if skip_compiling: + return ExportResult(quantize_job=quantize_job) - # 2. Compile the model to an on-device asset + # 3. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) print(f"Optimizing model {model_name} to run on-device") submitted_compile_job = hub.submit_compile_job( - model=source_model, + model=quantize_job.get_target_model(), input_specs=input_spec, device=hub_device, name=model_name, - calibration_data=model.get_calibration_data(target_runtime), options=model_compile_options, ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 4. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +177,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 5. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +198,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 6. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 7. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -198,12 +225,17 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + quantize_job=quantize_job, + ) def main(): warnings.filterwarnings("ignore") - parser = export_parser(model_cls=Model) + parser = export_parser(model_cls=Model, is_hub_quantized=True) args = parser.parse_args() export_model(**vars(args)) diff --git a/qai_hub_models/models/squeezenet1_1_quantized/model.py b/qai_hub_models/models/squeezenet1_1_quantized/model.py index 197d101b..457ee0c5 100644 --- a/qai_hub_models/models/squeezenet1_1_quantized/model.py +++ b/qai_hub_models/models/squeezenet1_1_quantized/model.py @@ -4,100 +4,12 @@ # --------------------------------------------------------------------- from __future__ import annotations -# isort: off -# This verifies aimet is installed, and this must be included first. -from qai_hub_models.utils.quantization_aimet import ( - AIMETQuantizableMixin, - constrain_quantized_inputs_to_image_range, -) - -# isort: on - -from typing import Optional - -import torch -from aimet_torch.cross_layer_equalization import equalize_model -from aimet_torch.model_preparer import prepare_model -from aimet_torch.quantsim import QuantizationSimModel, load_encodings_to_sim -from qai_hub import Device - -from qai_hub_models.models.common import TargetRuntime from qai_hub_models.models.squeezenet1_1.model import SqueezeNet -from qai_hub_models.utils.aimet.config_loader import get_default_aimet_config -from qai_hub_models.utils.asset_loaders import CachedWebModelAsset +from qai_hub_models.utils.quantization import HubQuantizableMixin MODEL_ID = __name__.split(".")[-2] -MODEL_ASSET_VERSION = 3 -DEFAULT_ENCODINGS = "squeezenet1_1_quantized_encodings.json" - - -class SqueezeNetQuantizable(AIMETQuantizableMixin, SqueezeNet): - """SqueezeNet with post train quantization support. - - Supports only 8 bit weights and activations, and only loads pre-quantized checkpoints. - Support for quantizing using your own weights & data will come at a later date.""" - - def __init__( - self, - sim_model: QuantizationSimModel, - ) -> None: - # Input is already normalized by sim_model. Disable it in the wrapper model. - SqueezeNet.__init__(self, sim_model.model, normalize_input=False) - AIMETQuantizableMixin.__init__( - self, - sim_model, - needs_onnx_direct_aimet_export=True, - ) - - @classmethod - def from_pretrained( - cls, - aimet_encodings: str | None = "DEFAULT", - ) -> "SqueezeNetQuantizable": - """ - Parameters: - aimet_encodings: - if "DEFAULT": Loads the model with aimet encodings calibrated on imagenette. - elif None: Doesn't load any encodings. Used when computing encodings. - else: Interprets as a filepath and loads the encodings stored there. 
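The quantized export above replaces the AIMET simulation path with Qualcomm AI Hub's own quantize job: trace the model, compile it to ONNX, quantize that ONNX with calibration data, then compile the quantized artifact for the target runtime. A condensed sketch of that recipe, using only the calls shown in the diff (`get_calibration_data`, `get_weights_dtype`, `get_activations_dtype`, `get_quantize_options`); the device name and sample count are illustrative:

```python
import qai_hub as hub
import torch

from qai_hub_models.models.squeezenet1_1_quantized import Model
from qai_hub_models.utils.input_spec import make_torch_inputs
from qai_hub_models.utils.quantization import get_calibration_data

model = Model.from_pretrained()
input_spec = model.get_input_spec()
device = hub.Device("Samsung Galaxy S23 (Family)")

# Trace to TorchScript, then compile to ONNX as the quantization source.
traced = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec))
onnx_job = hub.submit_compile_job(
    model=traced,
    input_specs=input_spec,
    device=device,
    options="--target_runtime onnx",
)

# Quantize on Hub with imagenette calibration samples.
quantize_job = hub.submit_quantize_job(
    model=onnx_job.get_target_model(),
    calibration_data=get_calibration_data(input_spec, "imagenette", 100),
    weights_dtype=model.get_weights_dtype(),
    activations_dtype=model.get_activations_dtype(),
    options=model.get_quantize_options(),  # "--range_scheme min_max" here
)

# Compile the quantized ONNX for the target runtime as usual.
compile_job = hub.submit_compile_job(
    model=quantize_job.get_target_model(),
    input_specs=input_spec,
    device=device,
)
```

With `skip_compiling=True`, the new export stops after the quantize job and returns `ExportResult(quantize_job=quantize_job)`, which is why the docstring now notes that `compile_job` can be `None`.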
- """ - model = SqueezeNet.from_pretrained() - input_shape = cls.get_input_spec()["image_tensor"][0] - - model = prepare_model(model) - equalize_model(model, input_shape) - sim = QuantizationSimModel( - model, - quant_scheme="tf_enhanced", - default_param_bw=8, - default_output_bw=8, - config_file=get_default_aimet_config(), - dummy_input=torch.rand(input_shape), - ) - constrain_quantized_inputs_to_image_range(sim) - - if aimet_encodings: - if aimet_encodings == "DEFAULT": - aimet_encodings = CachedWebModelAsset.from_asset_store( - MODEL_ID, MODEL_ASSET_VERSION, DEFAULT_ENCODINGS - ).fetch() - load_encodings_to_sim(sim, aimet_encodings) - return cls(sim) - # TODO(12424) remove this once encodings export correctly - def get_hub_compile_options( - self, - target_runtime: TargetRuntime, - other_compile_options: str = "", - device: Optional[Device] = None, - ) -> str: - compile_options = super().get_hub_compile_options( - target_runtime, other_compile_options, device - ) - if target_runtime not in [ - TargetRuntime.ONNX, - TargetRuntime.PRECOMPILED_QNN_ONNX, - ]: - compile_options += " --quantize_full_type int8" - return compile_options +class SqueezeNetQuantizable(HubQuantizableMixin, SqueezeNet): + def get_quantize_options(self) -> str: + return "--range_scheme min_max" diff --git a/qai_hub_models/models/squeezenet1_1_quantized/perf.yaml b/qai_hub_models/models/squeezenet1_1_quantized/perf.yaml index 2240e932..30462525 100644 --- a/qai_hub_models/models/squeezenet1_1_quantized/perf.yaml +++ b/qai_hub_models/models/squeezenet1_1_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,44 +20,39 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8775P Proxy models: - name: SqueezeNet-1_1Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 197.0 - throughput: 5076.1421319796955 + inference_time: 205.0 + throughput: 4878.048780487805 estimated_peak_memory_range: - min: 28672 - max: 1522728 + min: 20480 + max: 71348624 primary_compute_unit: NPU precision: int8 layer_info: @@ -64,37 +60,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 43 - job_id: jz57zxvrp + job_id: jp8q276kp job_status: Passed torchscript_onnx_qnn: - inference_time: 462.0 - throughput: 2164.5021645021643 + inference_time: 466.0 + throughput: 2145.922746781116 estimated_peak_memory_range: - min: 12288 - max: 2858688 + min: 172032 + max: 3320704 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 45 + layers_on_npu: 71 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 45 - job_id: jogkz3rog + total_layers: 71 + 
job_id: j5wew9135 job_status: Passed torchscript_onnx: - inference_time: 620.0 - throughput: 1612.9032258064517 + inference_time: 467.0 + throughput: 2141.3276231263385 estimated_peak_memory_range: - min: 77824 - max: 17339224 + min: 167936 + max: 1721136 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 73 + layers_on_npu: 47 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 73 - job_id: jygzerv6g + total_layers: 47 + job_id: jpy1zd78p job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -103,13 +99,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:39:40Z' + timestamp: '2024-10-17T17:16:33Z' - torchscript_onnx_tflite: - inference_time: 146.0 - throughput: 6849.315068493151 + inference_time: 200.0 + throughput: 5000.0 estimated_peak_memory_range: - min: 0 - max: 26175056 + min: 12288 + max: 27490512 primary_compute_unit: NPU precision: int8 layer_info: @@ -117,37 +113,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 43 - job_id: jqp4qvjlg + job_id: jgkevyowg job_status: Passed torchscript_onnx_qnn: - inference_time: 348.0 - throughput: 2873.5632183908046 + inference_time: 344.0 + throughput: 2906.9767441860463 estimated_peak_memory_range: min: 163840 - max: 14540416 + max: 12878832 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 45 + layers_on_npu: 71 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 45 - job_id: jn5q839m5 + total_layers: 71 + job_id: jg9l04xwg job_status: Passed torchscript_onnx: - inference_time: 568.0 - throughput: 1760.5633802816901 + inference_time: 437.0 + throughput: 2288.329519450801 estimated_peak_memory_range: - min: 12288 - max: 27792848 + min: 28672 + max: 31185840 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 73 + layers_on_npu: 47 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 73 - job_id: jz5woqmjp + total_layers: 47 + job_id: jp0z4rv95 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -156,13 +152,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:39:41Z' + timestamp: '2024-10-17T17:16:35Z' - torchscript_onnx_tflite: - inference_time: 203.0 - throughput: 4926.108374384236 + inference_time: 493.0 + throughput: 2028.3975659229209 estimated_peak_memory_range: min: 12288 - max: 71271896 + max: 17850320 primary_compute_unit: NPU precision: int8 layer_info: @@ -170,37 +166,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 43 - job_id: j0pxvye9g + job_id: j5q602znp job_status: Passed torchscript_onnx_qnn: - inference_time: 434.0 - throughput: 2304.147465437788 + inference_time: 997.0 + throughput: 1003.0090270812437 estimated_peak_memory_range: - min: 184320 - max: 1459560 + min: 12288 + max: 8075200 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 45 + layers_on_npu: 71 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 45 - job_id: jw566nq75 + total_layers: 71 + job_id: jp1428v8p job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:39:34Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-17T17:16:19Z' - torchscript_onnx_tflite: - inference_time: 238.0 - throughput: 4201.680672268908 + inference_time: 4154.0 + throughput: 240.73182474723157 estimated_peak_memory_range: - min: 16384 - max: 26893664 + min: 122880 + max: 7024328 
primary_compute_unit: NPU precision: int8 layer_info: @@ -208,37 +204,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 43 - job_id: jo5mr3vqg - job_status: Passed - torchscript_onnx_qnn: - inference_time: 529.0 - throughput: 1890.359168241966 - estimated_peak_memory_range: - min: 163840 - max: 14734320 - primary_compute_unit: NPU - precision: int8 - layer_info: - layers_on_npu: 45 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 45 - job_id: j7gjxek8p + job_id: jglv4koj5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: RB5 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:39:38Z' + chipset: QCS8250 Proxy + timestamp: '2024-10-17T17:16:04Z' - torchscript_onnx_tflite: - inference_time: 204.0 - throughput: 4901.9607843137255 + inference_time: 205.0 + throughput: 4878.048780487805 estimated_peak_memory_range: min: 12288 - max: 3714432 + max: 3154344 primary_compute_unit: NPU precision: int8 layer_info: @@ -246,37 +227,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 43 - job_id: jegn23rmg + job_id: j56y21r6p job_status: Passed torchscript_onnx_qnn: inference_time: 430.0 throughput: 2325.5813953488373 estimated_peak_memory_range: - min: 188416 - max: 1448928 + min: 184320 + max: 1474128 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 45 + layers_on_npu: 71 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 45 - job_id: j1p3keqz5 + total_layers: 71 + job_id: jgdxnvzrp job_status: Passed reference_device_info: - name: SA8650 (Proxy) - os: '13' - form_factor: Auto + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:39:35Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-17T17:16:21Z' - torchscript_onnx_tflite: inference_time: 205.0 throughput: 4878.048780487805 estimated_peak_memory_range: min: 12288 - max: 12581216 + max: 72500552 primary_compute_unit: NPU precision: int8 layer_info: @@ -284,37 +265,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 43 - job_id: joprke1e5 + job_id: jp3jnmx3g job_status: Passed torchscript_onnx_qnn: - inference_time: 430.0 - throughput: 2325.5813953488373 + inference_time: 429.0 + throughput: 2331.002331002331 estimated_peak_memory_range: - min: 184320 - max: 1606720 + min: 196608 + max: 1544376 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 45 + layers_on_npu: 71 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 45 - job_id: jwgoy3ed5 + total_layers: 71 + job_id: jp4lnw985 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:39:36Z' + chipset: SA8255P Proxy + timestamp: '2024-10-17T17:16:24Z' - torchscript_onnx_tflite: - inference_time: 206.0 - throughput: 4854.368932038835 + inference_time: 201.0 + throughput: 4975.124378109453 estimated_peak_memory_range: - min: 12288 - max: 3364288 + min: 20480 + max: 24320576 primary_compute_unit: NPU precision: int8 layer_info: @@ -322,37 +303,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 43 - job_id: jep28l3mp + job_id: jgo2zvoqp job_status: Passed torchscript_onnx_qnn: - inference_time: 428.0 - throughput: 2336.448598130841 + inference_time: 429.0 + throughput: 2331.002331002331 estimated_peak_memory_range: min: 184320 - max: 1461264 + max: 
1419928 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 45 + layers_on_npu: 71 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 45 - job_id: j1pv3vzm5 + total_layers: 71 + job_id: jpxk91d35 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:39:37Z' + chipset: SA8775P Proxy + timestamp: '2024-10-17T17:16:26Z' - torchscript_onnx_tflite: - inference_time: 510.0 - throughput: 1960.7843137254902 + inference_time: 235.0 + throughput: 4255.31914893617 estimated_peak_memory_range: - min: 12288 - max: 17397776 + min: 20480 + max: 27791328 primary_compute_unit: NPU precision: int8 layer_info: @@ -360,37 +341,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 43 - job_id: jqpye6v4g + job_id: jpv6qw9k5 job_status: Passed torchscript_onnx_qnn: - inference_time: 991.0 - throughput: 1009.0817356205853 + inference_time: 533.0 + throughput: 1876.172607879925 estimated_peak_memory_range: - min: 12288 - max: 7853840 + min: 163840 + max: 13579040 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 45 + layers_on_npu: 71 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 45 - job_id: jlpe9k40g + total_layers: 71 + job_id: j5mnezddp job_status: Passed reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T11:39:39Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-17T17:16:27Z' - torchscript_onnx_tflite: - inference_time: 4128.0 - throughput: 242.24806201550388 + inference_time: 147.0 + throughput: 6802.721088435374 estimated_peak_memory_range: - min: 65536 - max: 1918952 + min: 8192 + max: 16414800 primary_compute_unit: NPU precision: int8 layer_info: @@ -398,45 +379,75 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 43 - job_id: j2p0yleeg + job_id: jgjvdlwvg + job_status: Passed + torchscript_onnx_qnn: + inference_time: 345.0 + throughput: 2898.550724637681 + estimated_peak_memory_range: + min: 159744 + max: 9814384 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 71 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 71 + job_id: jgn60e7k5 + job_status: Passed + torchscript_onnx: + inference_time: 390.0 + throughput: 2564.102564102564 + estimated_peak_memory_range: + min: 0 + max: 18980048 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 47 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 47 + job_id: jgkevy9wg job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T11:39:30Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-17T17:16:37Z' - torchscript_onnx_qnn: - inference_time: 543.0 - throughput: 1841.6206261510129 + inference_time: 553.0 + throughput: 1808.3182640144666 estimated_peak_memory_range: - min: 552960 - max: 552960 + min: 692224 + max: 692224 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 45 + layers_on_npu: 71 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 45 - job_id: j1gln3elp + total_layers: 71 + job_id: j57y2d7v5 job_status: Passed torchscript_onnx: - inference_time: 666.0 - throughput: 1501.5015015015015 + inference_time: 524.0 + 
throughput: 1908.3969465648854 estimated_peak_memory_range: - min: 3252224 - max: 3252224 + min: 1941504 + max: 1941504 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 73 + layers_on_npu: 47 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 73 - job_id: jmg9vw9v5 + total_layers: 47 + job_id: jp8q274kp job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -445,4 +456,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:39:42Z' + timestamp: '2024-10-17T17:16:36Z' diff --git a/qai_hub_models/models/squeezenet1_1_quantized/requirements.txt b/qai_hub_models/models/squeezenet1_1_quantized/requirements.txt deleted file mode 100644 index de5b80e8..00000000 --- a/qai_hub_models/models/squeezenet1_1_quantized/requirements.txt +++ /dev/null @@ -1 +0,0 @@ -aimet-torch==1.32.1.post1; sys_platform == "linux" diff --git a/qai_hub_models/models/squeezenet1_1_quantized/test.py b/qai_hub_models/models/squeezenet1_1_quantized/test.py deleted file mode 100644 index 9c927cf5..00000000 --- a/qai_hub_models/models/squeezenet1_1_quantized/test.py +++ /dev/null @@ -1,29 +0,0 @@ -# --------------------------------------------------------------------- -# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. -# SPDX-License-Identifier: BSD-3-Clause -# --------------------------------------------------------------------- -from qai_hub_models.models._shared.imagenet_classifier.test_utils import ( - run_imagenet_classifier_test, -) -from qai_hub_models.models.squeezenet1_1_quantized.demo import main as demo_main -from qai_hub_models.models.squeezenet1_1_quantized.model import ( - MODEL_ASSET_VERSION, - MODEL_ID, - SqueezeNetQuantizable, -) - - -def test_task(): - run_imagenet_classifier_test( - SqueezeNetQuantizable.from_pretrained(), - MODEL_ID, - asset_version=MODEL_ASSET_VERSION, - diff_tol=0.005, - rtol=0.02, - atol=0.2, - ) - - -def test_demo(): - # Verify demo does not crash - demo_main(is_test=True) diff --git a/qai_hub_models/models/stable_diffusion_v1_5_quantized/README.md b/qai_hub_models/models/stable_diffusion_v1_5_quantized/README.md index c4690f12..64ea5be3 100644 --- a/qai_hub_models/models/stable_diffusion_v1_5_quantized/README.md +++ b/qai_hub_models/models/stable_diffusion_v1_5_quantized/README.md @@ -6,7 +6,7 @@ Generates high resolution images from text prompts using a latent diffusion model. This model uses CLIP ViT-L/14 as text encoder, U-Net based latent denoising, and VAE based decoder to generate the final image. This is based on the implementation of Stable-Diffusion-v1.5 found -[here](https://github.com/CompVis/stable-diffusion/tree/main). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/stable_diffusion_v1_5_quantized). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.stable_diffusion_v1_5_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Stable-Diffusion-v1.5 can be found +* The license for the original implementation of Stable-Diffusion-v1.5 can be found [here](https://github.com/CompVis/stable-diffusion/blob/main/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://github.com/CompVis/stable-diffusion/blob/main/LICENSE) +* The license for the compiled assets for on-device deployment can be found [here](https://github.com/CompVis/stable-diffusion/blob/main/LICENSE) + ## References * [High-Resolution Image Synthesis with Latent Diffusion Models](https://arxiv.org/abs/2112.10752) * [Source Model Implementation](https://github.com/CompVis/stable-diffusion/tree/main) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/stable_diffusion_v1_5_quantized/export.py b/qai_hub_models/models/stable_diffusion_v1_5_quantized/export.py index 33fe0503..e80883a3 100644 --- a/qai_hub_models/models/stable_diffusion_v1_5_quantized/export.py +++ b/qai_hub_models/models/stable_diffusion_v1_5_quantized/export.py @@ -9,13 +9,14 @@ import warnings from pathlib import Path -from typing import Any, Dict, List, Mapping, Optional, Tuple, cast +from typing import Any, Dict, List, Mapping, Optional, cast import qai_hub as hub +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.stable_diffusion_v1_5_quantized import Model from qai_hub_models.utils.args import export_parser -from qai_hub_models.utils.base_model import BasePrecompiledModel, TargetRuntime +from qai_hub_models.utils.base_model import BasePrecompiledModel from qai_hub_models.utils.printing import print_profile_metrics_from_job from qai_hub_models.utils.qai_hub_helpers import ( can_access_qualcomm_ai_hub, @@ -36,19 +37,16 @@ def export_model( output_dir: Optional[str] = None, profile_options: str = "", **additional_model_kwargs, -) -> Mapping[str, Tuple[Optional[hub.ProfileJob], Optional[hub.InferenceJob]]] | List[ - str -]: +) -> Mapping[str, ExportResult] | List[str]: """ - This function accomplishes 5 main tasks: + This function executes the following recipe: - 1. Initialize model. - 2. Upload model assets to hub. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Summarizes the results from profiling. + 1. Initialize model + 2. Upload model assets to hub + 3. Profiles the model performance on a real device + 4. Summarizes the results from profiling - Each of the last three steps can be optionally skipped using the input options. + Each of the last 2 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -70,9 +68,8 @@ def export_model( `model_cls.from_precompiled` Returns: - A Mapping from component_name to a 2-tuple of: + A Mapping from component_name to a struct of: * A ProfileJob containing metadata about the profile job (None if profiling skipped). - * An InferenceJob containing metadata about the inference job (None if inferencing skipped). """ model_name = "stable_diffusion_v1_5_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -101,9 +98,7 @@ def export_model( component_arg, ) - target_runtime = TargetRuntime.TFLITE - # On-device perf improves with I/O in channel_last format except when using ONNX. - use_channel_last_format = target_runtime != TargetRuntime.ONNX + target_runtime = TargetRuntime.QNN # 1. 
Initialize model print("Initializing model class") @@ -123,8 +118,11 @@ def export_model( uploaded_models[component_name] = hub.upload_model( components_dict[component_name].get_target_model_path() ) + print( + f"The {component_name} model is saved here: {components_dict[component_name].get_target_model_path()}" + ) - # 3. Profile the model assets on real devices + # 3. Profiles the model performance on a real device profile_jobs: Dict[str, hub.client.ProfileJob] = {} if not skip_profiling: for component_name in components: @@ -142,31 +140,7 @@ def export_model( hub.client.ProfileJob, submitted_profile_job ) - # 4. Run inference on-device with sample inputs - inference_jobs: Dict[str, hub.client.InferenceJob] = {} - if not skip_inferencing: - for component_name in components: - print( - f"Running inference for {component_name} on a hosted device with example inputs." - ) - profile_options_all = components_dict[ - component_name - ].get_hub_profile_options(target_runtime, profile_options) - sample_inputs = components_dict[component_name].sample_inputs( - use_channel_last_format=use_channel_last_format - ) - submitted_inference_job = hub.submit_inference_job( - model=uploaded_models[component_name], - inputs=sample_inputs, - device=hub_device, - name=f"{model_name}_{component_name}", - options=profile_options_all, - ) - inference_jobs[component_name] = cast( - hub.client.InferenceJob, submitted_inference_job - ) - - # 5. Summarize the results from profiling + # 4. Summarizes the results from profiling if not skip_summary and not skip_profiling: for component_name in components: profile_job = profile_jobs[component_name] @@ -175,9 +149,8 @@ def export_model( print_profile_metrics_from_job(profile_job, profile_data) return { - component_name: ( - profile_jobs.get(component_name, None), - inference_jobs.get(component_name, None), + component_name: ExportResult( + profile_job=profile_jobs.get(component_name, None), ) for component_name in components } diff --git a/qai_hub_models/models/stable_diffusion_v1_5_quantized/perf.yaml b/qai_hub_models/models/stable_diffusion_v1_5_quantized/perf.yaml index 1e2ae7b0..12218593 100644 --- a/qai_hub_models/models/stable_diffusion_v1_5_quantized/perf.yaml +++ b/qai_hub_models/models/stable_diffusion_v1_5_quantized/perf.yaml @@ -31,7 +31,7 @@ aggregated: - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy + - QCS8550 Proxy models: - name: TextEncoder_Quantized performance_metrics: @@ -125,7 +125,7 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy + chipset: QCS8550 Proxy timestamp: '2024-10-03T16:18:34Z' - name: VAEDecoder_Quantized performance_metrics: @@ -219,7 +219,7 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy + chipset: QCS8550 Proxy timestamp: '2024-10-03T16:18:34Z' - name: UNet_Quantized performance_metrics: @@ -313,5 +313,5 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy + chipset: QCS8550 Proxy timestamp: '2024-10-03T16:18:35Z' diff --git a/qai_hub_models/models/stable_diffusion_v2_1_quantized/README.md b/qai_hub_models/models/stable_diffusion_v2_1_quantized/README.md index ca77cd00..8bb23e47 100644 --- a/qai_hub_models/models/stable_diffusion_v2_1_quantized/README.md +++ b/qai_hub_models/models/stable_diffusion_v2_1_quantized/README.md @@ -6,7 +6,7 @@ Generates high resolution images from text prompts using a latent diffusion model. 
This model uses CLIP ViT-L/14 as text encoder, U-Net based latent denoising, and VAE based decoder to generate the final image. This is based on the implementation of Stable-Diffusion-v2.1 found -[here](https://github.com/CompVis/stable-diffusion/tree/main). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/stable_diffusion_v2_1_quantized). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.stable_diffusion_v2_1_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Stable-Diffusion-v2.1 can be found +* The license for the original implementation of Stable-Diffusion-v2.1 can be found [here](https://github.com/CompVis/stable-diffusion/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://github.com/CompVis/stable-diffusion/blob/main/LICENSE) +* The license for the compiled assets for on-device deployment can be found [here](https://github.com/CompVis/stable-diffusion/blob/main/LICENSE) + ## References * [High-Resolution Image Synthesis with Latent Diffusion Models](https://arxiv.org/abs/2112.10752) * [Source Model Implementation](https://github.com/CompVis/stable-diffusion/tree/main) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/stable_diffusion_v2_1_quantized/export.py b/qai_hub_models/models/stable_diffusion_v2_1_quantized/export.py index 6945840c..78e6b923 100644 --- a/qai_hub_models/models/stable_diffusion_v2_1_quantized/export.py +++ b/qai_hub_models/models/stable_diffusion_v2_1_quantized/export.py @@ -9,13 +9,14 @@ import warnings from pathlib import Path -from typing import Any, Dict, List, Mapping, Optional, Tuple, cast +from typing import Any, Dict, List, Mapping, Optional, cast import qai_hub as hub +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.stable_diffusion_v2_1_quantized import Model from qai_hub_models.utils.args import export_parser -from qai_hub_models.utils.base_model import BasePrecompiledModel, TargetRuntime +from qai_hub_models.utils.base_model import BasePrecompiledModel from qai_hub_models.utils.printing import print_profile_metrics_from_job from qai_hub_models.utils.qai_hub_helpers import ( can_access_qualcomm_ai_hub, @@ -36,19 +37,16 @@ def export_model( output_dir: Optional[str] = None, profile_options: str = "", **additional_model_kwargs, -) -> Mapping[str, Tuple[Optional[hub.ProfileJob], Optional[hub.InferenceJob]]] | List[ - str -]: +) -> Mapping[str, ExportResult] | List[str]: """ - This function accomplishes 5 main tasks: + This function executes the following recipe: - 1. Initialize model. - 2. Upload model assets to hub. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Summarizes the results from profiling. + 1. Initialize model + 2. Upload model assets to hub + 3. Profiles the model performance on a real device + 4. 
Summarizes the results from profiling - Each of the last three steps can be optionally skipped using the input options. + Each of the last 2 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -70,9 +68,8 @@ def export_model( `model_cls.from_precompiled` Returns: - A Mapping from component_name to a 2-tuple of: + A Mapping from component_name to a struct of: * A ProfileJob containing metadata about the profile job (None if profiling skipped). - * An InferenceJob containing metadata about the inference job (None if inferencing skipped). """ model_name = "stable_diffusion_v2_1_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -101,9 +98,7 @@ def export_model( component_arg, ) - target_runtime = TargetRuntime.TFLITE - # On-device perf improves with I/O in channel_last format except when using ONNX. - use_channel_last_format = target_runtime != TargetRuntime.ONNX + target_runtime = TargetRuntime.QNN # 1. Initialize model print("Initializing model class") @@ -123,8 +118,11 @@ def export_model( uploaded_models[component_name] = hub.upload_model( components_dict[component_name].get_target_model_path() ) + print( + f"The {component_name} model is saved here: {components_dict[component_name].get_target_model_path()}" + ) - # 3. Profile the model assets on real devices + # 3. Profiles the model performance on a real device profile_jobs: Dict[str, hub.client.ProfileJob] = {} if not skip_profiling: for component_name in components: @@ -142,31 +140,7 @@ def export_model( hub.client.ProfileJob, submitted_profile_job ) - # 4. Run inference on-device with sample inputs - inference_jobs: Dict[str, hub.client.InferenceJob] = {} - if not skip_inferencing: - for component_name in components: - print( - f"Running inference for {component_name} on a hosted device with example inputs." - ) - profile_options_all = components_dict[ - component_name - ].get_hub_profile_options(target_runtime, profile_options) - sample_inputs = components_dict[component_name].sample_inputs( - use_channel_last_format=use_channel_last_format - ) - submitted_inference_job = hub.submit_inference_job( - model=uploaded_models[component_name], - inputs=sample_inputs, - device=hub_device, - name=f"{model_name}_{component_name}", - options=profile_options_all, - ) - inference_jobs[component_name] = cast( - hub.client.InferenceJob, submitted_inference_job - ) - - # 5. Summarize the results from profiling + # 4. 
Summarizes the results from profiling if not skip_summary and not skip_profiling: for component_name in components: profile_job = profile_jobs[component_name] @@ -175,9 +149,8 @@ def export_model( print_profile_metrics_from_job(profile_job, profile_data) return { - component_name: ( - profile_jobs.get(component_name, None), - inference_jobs.get(component_name, None), + component_name: ExportResult( + profile_job=profile_jobs.get(component_name, None), ) for component_name in components } diff --git a/qai_hub_models/models/stable_diffusion_v2_1_quantized/perf.yaml b/qai_hub_models/models/stable_diffusion_v2_1_quantized/perf.yaml index c3f4cabf..d9c91a78 100644 --- a/qai_hub_models/models/stable_diffusion_v2_1_quantized/perf.yaml +++ b/qai_hub_models/models/stable_diffusion_v2_1_quantized/perf.yaml @@ -31,7 +31,7 @@ aggregated: - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy + - QCS8550 Proxy models: - name: TextEncoder_Quantized performance_metrics: @@ -125,7 +125,7 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy + chipset: QCS8550 Proxy timestamp: '2024-10-03T16:17:41Z' - name: VAEDecoder_Quantized performance_metrics: @@ -219,7 +219,7 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy + chipset: QCS8550 Proxy timestamp: '2024-10-03T16:17:42Z' - name: UNet_Quantized performance_metrics: @@ -313,5 +313,5 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy + chipset: QCS8550 Proxy timestamp: '2024-10-04T14:15:28Z' diff --git a/qai_hub_models/models/swin_base/README.md b/qai_hub_models/models/swin_base/README.md index 6ffedb98..bd54515a 100644 --- a/qai_hub_models/models/swin_base/README.md +++ b/qai_hub_models/models/swin_base/README.md @@ -6,7 +6,7 @@ SwinBase is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of Swin-Base found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/swin_transformer.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/swin_base). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.swin_base.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Swin-Base can be found +* The license for the original implementation of Swin-Base can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Swin Transformer: Hierarchical Vision Transformer using Shifted Windows](https://arxiv.org/abs/2103.14030) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/swin_transformer.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/swin_base/export.py b/qai_hub_models/models/swin_base/export.py index a42c833c..498bdf4f 100644 --- a/qai_hub_models/models/swin_base/export.py +++ b/qai_hub_models/models/swin_base/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.swin_base import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
""" model_name = "swin_base" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/swin_base/perf.yaml b/qai_hub_models/models/swin_base/perf.yaml index b8e8a2ed..e97091c4 100644 --- a/qai_hub_models/models/swin_base/perf.yaml +++ b/qai_hub_models/models/swin_base/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: Swin-Base performance_metrics: - torchscript_onnx_tflite: - inference_time: 28070.0 - throughput: 35.62522265764161 + inference_time: 25234.0 + throughput: 39.629071887136405 estimated_peak_memory_range: - min: 188416 - max: 4159936 + min: 0 + max: 3319560 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,37 +56,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1568 - job_id: jqp4qvxlg + job_id: jgkexowwg job_status: Passed torchscript_onnx_qnn: - inference_time: 31138.0 - throughput: 32.115100520264626 + inference_time: 28507.0 + throughput: 35.07910337811766 estimated_peak_memory_range: - min: 36864 - max: 52033760 + min: 57344 + max: 51000304 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 1255 + layers_on_npu: 1264 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 1255 - job_id: j2p0yl1eg + total_layers: 1264 + job_id: jgz3d86o5 job_status: Passed torchscript_onnx: - inference_time: 63662.0 - throughput: 15.707957651346172 + inference_time: 46693.0 + throughput: 21.416486411239372 estimated_peak_memory_range: - min: 81920 - max: 236996344 + min: 98304 + max: 237346048 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 1141 + layers_on_npu: 1150 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 1141 - job_id: j1pv3v1m5 + total_layers: 1150 + job_id: jpxko3ql5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:35:33Z' + timestamp: '2024-10-14T23:26:30Z' - torchscript_onnx_tflite: - inference_time: 19898.0 - throughput: 50.256307166549405 + inference_time: 18148.0 + throughput: 55.10249063257659 estimated_peak_memory_range: min: 49152 - max: 
559514784 + max: 597627248 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,37 +109,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1568 - job_id: j0pxvy79g + job_id: j5q6qzxnp job_status: Passed torchscript_onnx_qnn: - inference_time: 22060.0 - throughput: 45.33091568449683 + inference_time: 23321.0 + throughput: 42.87980789846061 estimated_peak_memory_range: - min: 118784 - max: 166813088 + min: 0 + max: 202385200 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 1255 + layers_on_npu: 1264 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 1255 - job_id: j1p8oz38g + total_layers: 1264 + job_id: j5we68k35 job_status: Passed torchscript_onnx: - inference_time: 45400.0 - throughput: 22.026431718061673 + inference_time: 38588.0 + throughput: 25.91479216336685 estimated_peak_memory_range: - min: 720896 - max: 809851216 + min: 688128 + max: 873983520 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 1141 + layers_on_npu: 1150 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 1141 - job_id: j7gjxe08p + total_layers: 1150 + job_id: j5mnxo79p job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:35:34Z' + timestamp: '2024-10-14T23:26:31Z' - torchscript_onnx_tflite: - inference_time: 28072.0 - throughput: 35.62268452550584 + inference_time: 25144.0 + throughput: 39.770919503658924 estimated_peak_memory_range: - min: 241664 - max: 3370624 + min: 86016 + max: 2766208 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,22 +162,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1568 - job_id: jo5mr3wqg + job_id: jglvmo9j5 job_status: Passed torchscript_onnx_qnn: - inference_time: 29176.0 - throughput: 34.27474636687688 + inference_time: 26838.0 + throughput: 37.26060064088233 estimated_peak_memory_range: - min: 708608 - max: 2011296 + min: 749568 + max: 1914960 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 1255 + layers_on_npu: 1264 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 1255 - job_id: jn5q837m5 + total_layers: 1264 + job_id: jp14z798p job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:35:28Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:26:22Z' - torchscript_onnx_tflite: - inference_time: 35085.0 - throughput: 28.502208921191393 + inference_time: 25529.0 + throughput: 39.17113870500216 estimated_peak_memory_range: - min: 262144 - max: 529877072 + min: 24576 + max: 3714200 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,37 +200,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1568 - job_id: jegn239mg + job_id: jpv6ke8k5 job_status: Passed torchscript_onnx_qnn: - inference_time: 38873.0 - throughput: 25.72479613099066 + inference_time: 27083.0 + throughput: 36.9235313665399 estimated_peak_memory_range: - min: 663552 - max: 161477040 + min: 684032 + max: 2412728 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 1255 + layers_on_npu: 1264 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 1255 - job_id: jwgoy31d5 + total_layers: 1264 + job_id: jg9lnkr8g job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: 
Qcs8450 Proxy - timestamp: '2024-09-25T11:35:32Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:26:26Z' - torchscript_onnx_tflite: - inference_time: 28202.0 - throughput: 35.458478122119 + inference_time: 25456.0 + throughput: 39.28346951602766 estimated_peak_memory_range: - min: 81920 - max: 2815344 + min: 61440 + max: 2515168 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,37 +238,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1568 - job_id: joprke4e5 + job_id: jgo26o7qp job_status: Passed torchscript_onnx_qnn: - inference_time: 29234.0 - throughput: 34.206745570226445 + inference_time: 27124.0 + throughput: 36.86771862557145 estimated_peak_memory_range: - min: 708608 - max: 1891152 + min: 684032 + max: 2341568 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 1255 + layers_on_npu: 1264 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 1255 - job_id: j1gln30lp + total_layers: 1264 + job_id: j5we68km5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:35:29Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:26:24Z' - torchscript_onnx_tflite: - inference_time: 28179.0 - throughput: 35.48741970971291 + inference_time: 25633.0 + throughput: 39.01221082198728 estimated_peak_memory_range: - min: 61440 - max: 2713416 + min: 73728 + max: 2992616 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,37 +276,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1568 - job_id: jep28l7mp + job_id: jp3j0xl3g job_status: Passed torchscript_onnx_qnn: - inference_time: 29089.0 - throughput: 34.37725600742549 + inference_time: 27232.0 + throughput: 36.72150411280846 estimated_peak_memory_range: - min: 720896 - max: 1994368 + min: 704512 + max: 2553296 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 1255 + layers_on_npu: 1264 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 1255 - job_id: jw566n375 + total_layers: 1264 + job_id: jgdx18krp job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:35:30Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:26:23Z' - torchscript_onnx_tflite: - inference_time: 28273.0 - throughput: 35.369433735365895 + inference_time: 32539.0 + throughput: 30.732351946894497 estimated_peak_memory_range: - min: 122880 - max: 3287760 + min: 110592 + max: 565050224 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,60 +314,113 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1568 - job_id: jqpye644g + job_id: j56y4r96p job_status: Passed torchscript_onnx_qnn: - inference_time: 28811.0 - throughput: 34.70896532574364 + inference_time: 35739.0 + throughput: 27.980637398919946 estimated_peak_memory_range: - min: 741376 - max: 2410512 + min: 655360 + max: 199566896 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 1255 + layers_on_npu: 1264 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 1255 - job_id: j1p3ke4z5 + total_layers: 1264 + job_id: j57yrkm95 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:26:28Z' + - torchscript_onnx_tflite: + inference_time: 
16379.0 + throughput: 61.05378838756945 + estimated_peak_memory_range: + min: 24576 + max: 282381088 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 1568 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 1568 + job_id: jpedm8qo5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 15042.0 + throughput: 66.48052120728626 + estimated_peak_memory_range: + min: 614400 + max: 216465408 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 1264 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 1264 + job_id: jp4lrm715 + job_status: Passed + torchscript_onnx: + inference_time: 29175.0 + throughput: 34.27592116538132 + estimated_peak_memory_range: + min: 663552 + max: 346042352 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 1150 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 1150 + job_id: jp2ky41qp + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:35:31Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:26:34Z' - torchscript_onnx_qnn: - inference_time: 29617.0 - throughput: 33.76439207212074 + inference_time: 27571.0 + throughput: 36.26999383410105 estimated_peak_memory_range: min: 602112 max: 602112 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 1255 + layers_on_npu: 1264 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 1255 - job_id: jogkz3log + total_layers: 1264 + job_id: jg9lnkrwg job_status: Passed torchscript_onnx: - inference_time: 65960.0 - throughput: 15.160703456640388 + inference_time: 51987.0 + throughput: 19.235578125300556 estimated_peak_memory_range: - min: 207126528 - max: 207126528 + min: 207286272 + max: 207286272 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 1141 + layers_on_npu: 1150 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 1141 - job_id: jlpe9kr0g + total_layers: 1150 + job_id: jgn6vo4q5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:35:35Z' + timestamp: '2024-10-14T23:26:32Z' diff --git a/qai_hub_models/models/swin_small/README.md b/qai_hub_models/models/swin_small/README.md index 8594f83c..e5ba3f12 100644 --- a/qai_hub_models/models/swin_small/README.md +++ b/qai_hub_models/models/swin_small/README.md @@ -6,7 +6,7 @@ SwinSmall is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of Swin-Small found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/swin_transformer.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/swin_small). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.swin_small.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. 
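Reviewer note on the `export.py` refactor above: the function can also be driven from Python rather than the CLI. The sketch below is a minimal illustration, not the repository's own example. It assumes AI Hub credentials are configured, that `device` accepts a device-name string as it does on the CLI, and that the `skip_*` flags referenced in the function body are keyword parameters with `False` defaults; the device and option choices are purely illustrative.

```python
# Minimal sketch: exercise the refactored export_model(), which this patch
# changes to return an ExportResult struct instead of a positional 3-tuple.
from qai_hub_models.models.swin_small.export import export_model

result = export_model(
    device="Samsung Galaxy S24",  # illustrative; any AI Hub-supported device
    skip_inferencing=True,        # the last four recipe steps are individually skippable
    skip_downloading=True,
)

# export_model can also return List[str], so guard before field access.
if not isinstance(result, list) and result.profile_job is not None:
    result.profile_job.wait()                          # block until profiling finishes
    profile_data = result.profile_job.download_profile()
    print(profile_data)                                # same raw data the summary step
                                                       # feeds to print_profile_metrics_from_job
```

The named fields (`compile_job`, `inference_job`, `profile_job`) remove the positional coupling of the old 3-tuple return, which is what the repeated `return ExportResult(...)` hunks in this patch migrate every per-model `export.py` toward.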
+ ## License -- The license for the original implementation of Swin-Small can be found +* The license for the original implementation of Swin-Small can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Swin Transformer: Hierarchical Vision Transformer using Shifted Windows](https://arxiv.org/abs/2103.14030) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/swin_transformer.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/swin_small/export.py b/qai_hub_models/models/swin_small/export.py index d9bcea8f..3f0ff5bd 100644 --- a/qai_hub_models/models/swin_small/export.py +++ b/qai_hub_models/models/swin_small/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.swin_small import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. 
- * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "swin_small" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/swin_small/perf.yaml b/qai_hub_models/models/swin_small/perf.yaml index 55d355eb..c577f36b 100644 --- a/qai_hub_models/models/swin_small/perf.yaml +++ b/qai_hub_models/models/swin_small/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: Swin-Small performance_metrics: - torchscript_onnx_tflite: - inference_time: 21002.0 - throughput: 47.614512903533 + inference_time: 18699.0 + throughput: 53.47879565752179 estimated_peak_memory_range: - min: 20480 - max: 4495792 + min: 106496 + max: 4718936 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,37 +56,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1563 - job_id: jo5mr3zqg + job_id: jgn6vowk5 job_status: Passed torchscript_onnx_qnn: - inference_time: 23699.0 - throughput: 42.19587324359678 + inference_time: 21583.0 + throughput: 46.33276189593661 estimated_peak_memory_range: - min: 36864 - max: 38709952 + min: 16384 + max: 40130776 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 1246 + layers_on_npu: 1255 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 1246 - job_id: jogkz3yog + total_layers: 1255 + job_id: j56y4r06p job_status: Passed torchscript_onnx: - inference_time: 54120.0 - throughput: 18.477457501847745 + inference_time: 34575.0 + throughput: 28.922631959508315 estimated_peak_memory_range: - min: 94208 - max: 136459248 + min: 69632 + max: 136549584 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 1136 + layers_on_npu: 1145 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 1136 - job_id: jlpe9kv0g + total_layers: 1145 + job_id: j57yrk1v5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:34:44Z' + timestamp: '2024-10-14T23:25:30Z' - torchscript_onnx_tflite: - inference_time: 14445.0 - throughput: 69.22810661128418 + inference_time: 12959.0 + throughput: 77.16644802839726 estimated_peak_memory_range: - min: 24576 - max: 
524959744 + min: 16384 + max: 551676848 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,37 +109,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1563 - job_id: jegn23emg + job_id: jprv3o70g job_status: Passed torchscript_onnx_qnn: - inference_time: 16066.0 - throughput: 62.24324660774306 + inference_time: 14588.0 + throughput: 68.54949273375377 estimated_peak_memory_range: min: 0 - max: 138258560 + max: 164090432 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 1246 + layers_on_npu: 1255 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 1246 - job_id: jn5q832m5 + total_layers: 1255 + job_id: jgo26o9qp job_status: Passed torchscript_onnx: - inference_time: 38307.0 - throughput: 26.104889445793198 + inference_time: 23854.0 + throughput: 41.92169028255219 estimated_peak_memory_range: min: 0 - max: 761041680 + max: 820872432 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 1136 + layers_on_npu: 1145 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 1136 - job_id: jygzer76g + total_layers: 1145 + job_id: jp4lrm685 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:34:45Z' + timestamp: '2024-10-14T23:25:31Z' - torchscript_onnx_tflite: - inference_time: 20989.0 - throughput: 47.64400400209634 + inference_time: 18642.0 + throughput: 53.642313056538995 estimated_peak_memory_range: - min: 65536 - max: 2823888 + min: 278528 + max: 3571760 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,22 +162,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1563 - job_id: joprkeye5 + job_id: jp2ky4zrp job_status: Passed torchscript_onnx_qnn: - inference_time: 21626.0 - throughput: 46.24063627115509 + inference_time: 20236.0 + throughput: 49.4168808064835 estimated_peak_memory_range: - min: 692224 - max: 2500920 + min: 679936 + max: 1922600 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 1246 + layers_on_npu: 1255 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 1246 - job_id: jw566n175 + total_layers: 1255 + job_id: jgjvno6vg job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:34:39Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:25:21Z' - torchscript_onnx_tflite: - inference_time: 26590.0 - throughput: 37.608123354644604 + inference_time: 18785.0 + throughput: 53.23396326856535 estimated_peak_memory_range: - min: 69632 - max: 509872816 + min: 45056 + max: 3508088 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,37 +200,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1563 - job_id: jep28lmmp + job_id: jgkexokwg job_status: Passed torchscript_onnx_qnn: - inference_time: 28655.0 - throughput: 34.897923573547374 + inference_time: 20621.0 + throughput: 48.49425343096843 estimated_peak_memory_range: - min: 0 - max: 133381584 + min: 671744 + max: 2429504 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 1246 + layers_on_npu: 1255 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 1246 - job_id: j7gjxel8p + total_layers: 1255 + job_id: j5we68035 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: 
'2024-09-25T11:34:43Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:25:26Z' - torchscript_onnx_tflite: - inference_time: 20977.0 - throughput: 47.67125899795013 + inference_time: 18670.0 + throughput: 53.56186395286556 estimated_peak_memory_range: - min: 20480 - max: 3102680 + min: 24576 + max: 5234240 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,37 +238,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1563 - job_id: jqpye6d4g + job_id: jp8qy6kkp job_status: Passed torchscript_onnx_qnn: - inference_time: 22011.0 - throughput: 45.43182953977556 + inference_time: 20685.0 + throughput: 48.344210780759006 estimated_peak_memory_range: - min: 651264 - max: 2418872 + min: 675840 + max: 2083696 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 1246 + layers_on_npu: 1255 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 1246 - job_id: j1p3kemz5 + total_layers: 1255 + job_id: jgz3d8qo5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:34:40Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:25:25Z' - torchscript_onnx_tflite: - inference_time: 21013.0 - throughput: 47.58958739827726 + inference_time: 18664.0 + throughput: 53.57908272610373 estimated_peak_memory_range: - min: 32768 - max: 3470144 + min: 61440 + max: 3426808 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,37 +276,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1563 - job_id: j2p0ylreg + job_id: jp0z0dx95 job_status: Passed torchscript_onnx_qnn: - inference_time: 21986.0 - throughput: 45.483489493313925 + inference_time: 20596.0 + throughput: 48.55311711011847 estimated_peak_memory_range: min: 671744 - max: 1938984 + max: 1923944 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 1246 + layers_on_npu: 1255 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 1246 - job_id: jwgoy3vd5 + total_layers: 1255 + job_id: jpedm80o5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:34:41Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:25:22Z' - torchscript_onnx_tflite: - inference_time: 21092.0 - throughput: 47.41134079271762 + inference_time: 24220.0 + throughput: 41.28819157720892 estimated_peak_memory_range: - min: 57344 - max: 2855080 + min: 69632 + max: 534668128 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,60 +314,113 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1563 - job_id: j1p8oz78g + job_id: jpy13qy8p job_status: Passed torchscript_onnx_qnn: - inference_time: 21966.0 - throughput: 45.524902121460435 + inference_time: 26468.0 + throughput: 37.7814719661478 estimated_peak_memory_range: - min: 696320 - max: 1905664 + min: 643072 + max: 162107152 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 1246 + layers_on_npu: 1255 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 1246 - job_id: j1pv3vwm5 + total_layers: 1255 + job_id: jp14z7k8p job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:25:28Z' + - torchscript_onnx_tflite: + inference_time: 11752.0 + throughput: 85.0918992511913 + 
estimated_peak_memory_range: + min: 2113536 + max: 243191392 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 1563 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 1563 + job_id: jglvmoqj5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 12483.0 + throughput: 80.10894816951054 + estimated_peak_memory_range: + min: 618496 + max: 170383712 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 1255 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 1255 + job_id: jgdx18yrp + job_status: Passed + torchscript_onnx: + inference_time: 20333.0 + throughput: 49.181134116952734 + estimated_peak_memory_range: + min: 0 + max: 327920288 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 1145 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 1145 + job_id: jgn6vodk5 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:34:42Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:25:34Z' - torchscript_onnx_qnn: - inference_time: 22652.0 - throughput: 44.14621225498852 + inference_time: 21141.0 + throughput: 47.30145215458115 estimated_peak_memory_range: min: 602112 max: 602112 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 1246 + layers_on_npu: 1255 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 1246 - job_id: j1gln3klp + total_layers: 1255 + job_id: jpv6keyk5 job_status: Passed torchscript_onnx: - inference_time: 57017.0 - throughput: 17.53862882999807 + inference_time: 37889.0 + throughput: 26.392884478344637 estimated_peak_memory_range: - min: 123432960 - max: 123432960 + min: 123691008 + max: 123691008 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 1136 + layers_on_npu: 1145 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 1136 - job_id: jz5woq9jp + total_layers: 1145 + job_id: jpxko3835 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:34:46Z' + timestamp: '2024-10-14T23:25:32Z' diff --git a/qai_hub_models/models/swin_tiny/README.md b/qai_hub_models/models/swin_tiny/README.md index 08b4cf3a..33acf7ff 100644 --- a/qai_hub_models/models/swin_tiny/README.md +++ b/qai_hub_models/models/swin_tiny/README.md @@ -6,7 +6,7 @@ SwinTiny is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of Swin-Tiny found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/swin_transformer.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/swin_tiny). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.swin_tiny.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. 
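A note on the large `perf.yaml` updates in this patch: the regenerated numbers stay internally consistent. `inference_time` is recorded in microseconds and `throughput` in inferences per second, so `throughput == 1e6 / inference_time` for every metric block; for example, the Swin-Tiny hunk further down records 11939.0 µs against 83.75910880308234 inf/s, and 1e6 / 11939.0 is exactly that value. The sketch below checks this invariant; it assumes the file layout visible in these hunks and that PyYAML is installed.

```python
# Sketch: verify throughput == 1e6 / inference_time for every metric block
# in a perf.yaml, assuming the structure visible in this patch.
import yaml

with open("qai_hub_models/models/swin_tiny/perf.yaml") as f:
    perf = yaml.safe_load(f)

for model in perf["models"]:
    for entry in model["performance_metrics"]:
        for runtime, stats in entry.items():
            # Skip reference_device_info and any block without a usable time.
            if not isinstance(stats, dict):
                continue
            t_us = stats.get("inference_time")
            if not isinstance(t_us, (int, float)):
                continue
            expected = 1e6 / t_us  # microseconds -> inferences per second
            assert abs(stats["throughput"] - expected) < 1e-6 * expected, runtime
```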
+ ## License -- The license for the original implementation of Swin-Tiny can be found +* The license for the original implementation of Swin-Tiny can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Swin Transformer: Hierarchical Vision Transformer using Shifted Windows](https://arxiv.org/abs/2103.14030) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/swin_transformer.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/swin_tiny/export.py b/qai_hub_models/models/swin_tiny/export.py index 89d35fad..623d8b83 100644 --- a/qai_hub_models/models/swin_tiny/export.py +++ b/qai_hub_models/models/swin_tiny/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.swin_tiny import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. 
- * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "swin_tiny" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/swin_tiny/perf.yaml b/qai_hub_models/models/swin_tiny/perf.yaml index 1fee5940..4e15f1cc 100644 --- a/qai_hub_models/models/swin_tiny/perf.yaml +++ b/qai_hub_models/models/swin_tiny/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: Swin-Tiny performance_metrics: - torchscript_onnx_tflite: - inference_time: 13488.0 - throughput: 74.13997627520759 + inference_time: 11939.0 + throughput: 83.75910880308234 estimated_peak_memory_range: - min: 20480 - max: 3269384 + min: 40960 + max: 2902584 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,37 +56,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 837 - job_id: joprkeee5 + job_id: j57yrknv5 job_status: Passed torchscript_onnx_qnn: - inference_time: 14968.0 - throughput: 66.80919294494923 + inference_time: 13291.0 + throughput: 75.23888345496952 estimated_peak_memory_range: - min: 40960 - max: 24819936 + min: 12288 + max: 24774456 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 700 + layers_on_npu: 709 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 700 - job_id: j1gln33lp + total_layers: 709 + job_id: jp8qy6rkp job_status: Passed torchscript_onnx: - inference_time: 32582.0 - throughput: 30.69179301454791 + inference_time: 19804.0 + throughput: 50.494849525348414 estimated_peak_memory_range: - min: 36864 - max: 69358280 + min: 53248 + max: 69055400 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 614 + layers_on_npu: 623 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 614 - job_id: jz5woqqjp + total_layers: 623 + job_id: jgz3d80o5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:33:58Z' + timestamp: '2024-10-14T23:24:35Z' - torchscript_onnx_tflite: - inference_time: 11114.0 - throughput: 89.97660608241857 + inference_time: 8121.0 + throughput: 123.13754463735994 estimated_peak_memory_range: - min: 49152 - max: 323779760 + min: 
20480 + max: 342709040 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,37 +109,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 837 - job_id: jep28llmp + job_id: jp4lrm485 job_status: Passed torchscript_onnx_qnn: - inference_time: 10043.0 - throughput: 99.57184108334162 + inference_time: 10599.0 + throughput: 94.34852344560808 estimated_peak_memory_range: - min: 618496 - max: 93102960 + min: 638976 + max: 107634384 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 700 + layers_on_npu: 709 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 700 - job_id: jw566nn75 + total_layers: 709 + job_id: jgkexo0wg job_status: Passed torchscript_onnx: - inference_time: 22617.0 - throughput: 44.21452889419463 + inference_time: 16514.0 + throughput: 60.55468087683178 estimated_peak_memory_range: - min: 45056 - max: 436578464 + min: 0 + max: 492780176 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 614 + layers_on_npu: 623 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 614 - job_id: jmg9vwwv5 + total_layers: 623 + job_id: j5we68r35 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:33:59Z' + timestamp: '2024-10-14T23:24:36Z' - torchscript_onnx_tflite: - inference_time: 13351.0 - throughput: 74.90075649764063 + inference_time: 11848.0 + throughput: 84.40243079000675 estimated_peak_memory_range: - min: 24576 - max: 3192536 + min: 20480 + max: 3173096 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,22 +162,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 837 - job_id: jqpye664g + job_id: jpxko3r35 job_status: Passed torchscript_onnx_qnn: - inference_time: 13321.0 - throughput: 75.06943923128894 + inference_time: 12224.0 + throughput: 81.80628272251309 estimated_peak_memory_range: - min: 634880 - max: 1876344 + min: 647168 + max: 2486752 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 700 + layers_on_npu: 709 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 700 - job_id: jwgoy33d5 + total_layers: 709 + job_id: jglvmo8j5 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:33:53Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:24:28Z' - torchscript_onnx_tflite: - inference_time: 16739.0 - throughput: 59.740725252404566 + inference_time: 11874.0 + throughput: 84.21761832575375 estimated_peak_memory_range: - min: 45056 - max: 315094992 + min: 36864 + max: 2407856 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,37 +200,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 837 - job_id: j2p0ylleg + job_id: jp2ky4drp job_status: Passed torchscript_onnx_qnn: - inference_time: 18022.0 - throughput: 55.487737210076574 + inference_time: 12370.0 + throughput: 80.84074373484236 estimated_peak_memory_range: - min: 626688 - max: 92351424 + min: 655360 + max: 2007808 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 700 + layers_on_npu: 709 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 700 - job_id: jygzerr6g + total_layers: 709 + job_id: jgo26owqp job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: 
'2024-09-25T11:33:57Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:24:31Z' - torchscript_onnx_tflite: - inference_time: 13413.0 - throughput: 74.55453664355475 + inference_time: 11920.0 + throughput: 83.89261744966443 estimated_peak_memory_range: - min: 20480 - max: 8742792 + min: 45056 + max: 3093240 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,37 +238,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 837 - job_id: j1p8ozz8g + job_id: jprv3od0g job_status: Passed torchscript_onnx_qnn: - inference_time: 13566.0 - throughput: 73.71369600471768 + inference_time: 12533.0 + throughput: 79.78935609989627 estimated_peak_memory_range: - min: 626688 - max: 1883504 + min: 45056 + max: 1710104 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 700 + layers_on_npu: 709 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 700 - job_id: j1pv3vvm5 + total_layers: 709 + job_id: jp3j0x73g job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:33:54Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:24:30Z' - torchscript_onnx_tflite: - inference_time: 13446.0 - throughput: 74.37156031533542 + inference_time: 11837.0 + throughput: 84.48086508405846 estimated_peak_memory_range: - min: 32768 - max: 2457664 + min: 36864 + max: 3066768 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,37 +276,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 837 - job_id: jogkz33og + job_id: jgn6voqk5 job_status: Passed torchscript_onnx_qnn: - inference_time: 13533.0 - throughput: 73.89344565137073 + inference_time: 12459.0 + throughput: 80.26326350429409 estimated_peak_memory_range: - min: 40960 - max: 1667992 + min: 684032 + max: 1817448 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 700 + layers_on_npu: 709 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 700 - job_id: j7gjxee8p + total_layers: 709 + job_id: j56y4rm6p job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:33:55Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:24:29Z' - torchscript_onnx_tflite: - inference_time: 13404.0 - throughput: 74.60459564309161 + inference_time: 15225.0 + throughput: 65.68144499178982 estimated_peak_memory_range: - min: 40960 - max: 3005544 + min: 24576 + max: 333978560 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,60 +314,113 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 837 - job_id: jn5q833m5 + job_id: j5mnxokdp job_status: Passed torchscript_onnx_qnn: - inference_time: 13441.0 - throughput: 74.39922624804701 + inference_time: 16315.0 + throughput: 61.29328838492185 estimated_peak_memory_range: - min: 651264 - max: 1889152 + min: 0 + max: 109370064 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 700 + layers_on_npu: 709 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 700 - job_id: jlpe9kk0g + total_layers: 709 + job_id: jgjvno8vg job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:24:33Z' + - torchscript_onnx_tflite: + inference_time: 7384.0 + throughput: 135.42795232936078 + 
estimated_peak_memory_range: + min: 16384 + max: 163683760 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 837 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 837 + job_id: jp0z0d995 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 7646.0 + throughput: 130.78733978550875 + estimated_peak_memory_range: + min: 614400 + max: 110912592 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 709 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 709 + job_id: jpedm8no5 + job_status: Passed + torchscript_onnx: + inference_time: 12008.0 + throughput: 83.27781479013991 + estimated_peak_memory_range: + min: 53248 + max: 222424144 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 623 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 623 + job_id: jgdx18mrp + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:33:56Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:24:39Z' - torchscript_onnx_qnn: - inference_time: 13969.0 - throughput: 71.58708568974157 + inference_time: 12872.0 + throughput: 77.68800497203232 estimated_peak_memory_range: min: 602112 max: 602112 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 700 + layers_on_npu: 709 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 700 - job_id: j1p3keez5 + total_layers: 709 + job_id: j5q6qz1np job_status: Passed torchscript_onnx: - inference_time: 33942.0 - throughput: 29.46202345177067 + inference_time: 22104.0 + throughput: 45.24068041983352 estimated_peak_memory_range: - min: 67137536 - max: 67137536 + min: 67313664 + max: 67313664 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 614 + layers_on_npu: 623 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 614 - job_id: jnp10eel5 + total_layers: 623 + job_id: jg9lnkqwg job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:34:00Z' + timestamp: '2024-10-14T23:24:37Z' diff --git a/qai_hub_models/models/trocr/README.md b/qai_hub_models/models/trocr/README.md index 6051cbb9..49968d83 100644 --- a/qai_hub_models/models/trocr/README.md +++ b/qai_hub_models/models/trocr/README.md @@ -6,7 +6,7 @@ End-to-end text recognition approach with pre-trained image transformer and text transformer models for both image understanding and wordpiece-level text generation. This is based on the implementation of TrOCR found -[here](https://huggingface.co/microsoft/trocr-small-stage1). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/trocr). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.trocr.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of TrOCR can be found +* The license for the original implementation of TrOCR can be found [here](https://github.com/microsoft/unilm/blob/master/LICENSE).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models](https://arxiv.org/abs/2109.10282) * [Source Model Implementation](https://huggingface.co/microsoft/trocr-small-stage1) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/trocr/export.py b/qai_hub_models/models/trocr/export.py index 8b99469e..897086eb 100644 --- a/qai_hub_models/models/trocr/export.py +++ b/qai_hub_models/models/trocr/export.py @@ -10,14 +10,15 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Mapping, Optional, Tuple, cast +from typing import Any, Dict, List, Mapping, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.trocr import Model from qai_hub_models.utils.args import export_parser, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel, TargetRuntime +from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -45,20 +46,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Mapping[ - str, Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] -] | List[str]: +) -> Mapping[str, ExportResult] | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -83,10 +82,10 @@ def export_model( `model_cls.from_pretrained` Returns: - A Mapping from component_name to a 3-tuple of: + A Mapping from component_name to a struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). 
+ * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "trocr" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -118,7 +117,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) components_dict: Dict[str, BaseModel] = {} if "TrOCREncoder" in components: @@ -135,7 +134,7 @@ def export_model( component.to("cpu"), make_torch_inputs(input_spec) ) - # 2. Compile the models to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = component.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -151,7 +150,7 @@ def export_model( hub.client.CompileJob, submitted_compile_job ) - # 3. Profile the model assets on real devices + # 3. Profiles the model performance on a real device profile_jobs: Dict[str, hub.client.ProfileJob] = {} if not skip_profiling: for component_name in components: @@ -169,7 +168,7 @@ def export_model( hub.client.ProfileJob, submitted_profile_job ) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_jobs: Dict[str, hub.client.InferenceJob] = {} if not skip_inferencing: for component_name in components: @@ -193,14 +192,14 @@ def export_model( hub.client.InferenceJob, submitted_inference_job ) - # 5. Download the model assets to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) for component_name, compile_job in compile_jobs.items(): target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / component_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: for component_name in components: profile_job = profile_jobs[component_name] @@ -225,10 +224,10 @@ def export_model( ) return { - component_name: ( - compile_jobs[component_name], - profile_jobs.get(component_name, None), - inference_jobs.get(component_name, None), + component_name: ExportResult( + compile_job=compile_jobs[component_name], + inference_job=inference_jobs.get(component_name, None), + profile_job=profile_jobs.get(component_name, None), ) for component_name in components } diff --git a/qai_hub_models/models/trocr/perf.yaml b/qai_hub_models/models/trocr/perf.yaml index f8af380e..31c92d4b 100644 --- a/qai_hub_models/models/trocr/perf.yaml +++ b/qai_hub_models/models/trocr/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: TrOCREncoder performance_metrics: - torchscript_onnx_tflite: - inference_time: 66632.0 - throughput: 15.007804058110217 + inference_time: 50652.0 + throughput: 19.74255705598989 estimated_peak_memory_range: - min: 7770112 - max: 9877904 + min: 7196672 + max: 9363616 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 591 - job_id: j1gln36ep + job_id: jgz3d8mk5 job_status: Passed torchscript_onnx_qnn: - inference_time: 68321.0 - throughput: 14.63678810321863 + inference_time: 52890.0 + throughput: 18.907165815844206 estimated_peak_memory_range: - min: 98304 - max: 22307152 + min: 258048 + max: 23201496 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 443 - job_id: j0pxvynjg + job_id: j56y4r80p job_status: Passed torchscript_onnx: - inference_time: 56119.0 - throughput: 17.819276893743652 + inference_time: 39309.0 + throughput: 25.43946678877611 estimated_peak_memory_range: - min: 1912832 - max: 120345624 + min: 73728 + max: 186476328 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 396 - job_id: jlpe9kw7g + job_id: jprv3ox0g job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:33:09Z' + timestamp: '2024-10-14T23:23:36Z' - torchscript_onnx_tflite: - inference_time: 51027.0 - throughput: 19.59746800713348 + inference_time: 40349.0 + throughput: 24.78376167934769 estimated_peak_memory_range: - min: 7266304 - max: 307619920 + min: 5361664 + max: 321212896 
primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 591 - job_id: j1p3kevx5 + job_id: jg9lnkzlg job_status: Passed torchscript_onnx_qnn: - inference_time: 54054.0 - throughput: 18.5000185000185 + inference_time: 42073.0 + throughput: 23.76821239274594 estimated_peak_memory_range: min: 1802240 - max: 58552016 + max: 67186704 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,7 +124,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 443 - job_id: jegn23mvg + job_id: jgo26olxp + job_status: Passed + torchscript_onnx: + inference_time: 31228.0 + throughput: 32.0225438708851 + estimated_peak_memory_range: + min: 0 + max: 364484256 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 396 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 396 + job_id: jpy13q88p job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -135,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:32:57Z' + timestamp: '2024-10-14T23:23:38Z' - torchscript_onnx_tflite: - inference_time: 65665.0 - throughput: 15.228812914033352 + inference_time: 50061.0 + throughput: 19.97562973172729 estimated_peak_memory_range: min: 7188480 - max: 9768312 + max: 8843600 primary_compute_unit: NPU precision: fp16 layer_info: @@ -149,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 591 - job_id: j1pv3v075 + job_id: jgdx18dep job_status: Passed torchscript_onnx_qnn: - inference_time: 48577.0 - throughput: 20.585873973279536 + inference_time: 36086.0 + throughput: 27.71157789724547 estimated_peak_memory_range: - min: 1933312 - max: 3796736 + min: 1843200 + max: 3281984 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 443 - job_id: j2p0yl22g + job_id: jgz3d8lk5 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -172,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:33:00Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:23:23Z' - torchscript_onnx_tflite: - inference_time: 75863.0 - throughput: 13.181656406944096 + inference_time: 50179.0 + throughput: 19.928655413619243 estimated_peak_memory_range: - min: 7282688 - max: 299682560 + min: 7118848 + max: 9955776 primary_compute_unit: NPU precision: fp16 layer_info: @@ -187,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 591 - job_id: jlpe9ke7g + job_id: jpy13qo7p job_status: Passed torchscript_onnx_qnn: - inference_time: 77027.0 - throughput: 12.982460695600244 + inference_time: 36899.0 + throughput: 27.101005447302096 estimated_peak_memory_range: - min: 57344 - max: 53780128 + min: 1884160 + max: 3777416 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 443 - job_id: j1pv3vr75 + job_id: jg9lnkowg job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:33:07Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:23:29Z' - torchscript_onnx_tflite: - inference_time: 66452.0 - throughput: 15.048456028411485 + inference_time: 51951.0 + throughput: 19.24890762449231 estimated_peak_memory_range: - min: 7294976 - max: 
735597352 + min: 7122944 + max: 9553192 primary_compute_unit: NPU precision: fp16 layer_info: @@ -225,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 591 - job_id: jz5woq2zp + job_id: jprv3ol9g job_status: Passed torchscript_onnx_qnn: - inference_time: 48980.0 - throughput: 20.41649652919559 + inference_time: 37124.0 + throughput: 26.936752505117983 estimated_peak_memory_range: - min: 1810432 - max: 4960864 + min: 1916928 + max: 3660680 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 443 - job_id: jogkz3qyg + job_id: jgdx186ep job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:33:02Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:23:27Z' - torchscript_onnx_tflite: - inference_time: 65975.0 - throughput: 15.157256536566882 + inference_time: 51056.0 + throughput: 19.586336571607646 estimated_peak_memory_range: - min: 7229440 - max: 9665952 + min: 7188480 + max: 9399120 primary_compute_unit: NPU precision: fp16 layer_info: @@ -263,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 591 - job_id: jnp10eyk5 + job_id: j5mnxo0wp job_status: Passed torchscript_onnx_qnn: - inference_time: 49173.0 - throughput: 20.336363451487603 + inference_time: 37072.0 + throughput: 26.974536037980148 estimated_peak_memory_range: - min: 1871872 - max: 3306144 + min: 1912832 + max: 6938936 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 443 - job_id: j1gln32ep + job_id: jg9lnkolg job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:33:04Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:23:25Z' - torchscript_onnx_tflite: - inference_time: 65985.0 - throughput: 15.154959460483443 + inference_time: 60938.0 + throughput: 16.410121763103483 estimated_peak_memory_range: - min: 7180288 - max: 9713720 + min: 7118848 + max: 310706544 primary_compute_unit: NPU precision: fp16 layer_info: @@ -301,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 591 - job_id: jz57zx0qp + job_id: jp4lrmyv5 job_status: Passed torchscript_onnx_qnn: - inference_time: 49624.0 - throughput: 20.15153957762373 + inference_time: 60192.0 + throughput: 16.61350345560872 estimated_peak_memory_range: - min: 1941504 - max: 3772984 + min: 1785856 + max: 66791168 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,22 +329,75 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 443 - job_id: j1p3ke1x5 + job_id: jp4lrme85 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:33:06Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:23:32Z' + - torchscript_onnx_tflite: + inference_time: 36174.0 + throughput: 27.6441643169127 + estimated_peak_memory_range: + min: 2641920 + max: 125164800 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 591 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 591 + job_id: j5q6qzl4p + job_status: Passed + torchscript_onnx_qnn: + inference_time: 33016.0 + throughput: 
30.28834504482675 + estimated_peak_memory_range: + min: 1810432 + max: 69495072 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 443 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 443 + job_id: j5mnxo9dp + job_status: Passed + torchscript_onnx: + inference_time: 23693.0 + throughput: 42.20655889925295 + estimated_peak_memory_range: + min: 5943296 + max: 217240080 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 396 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 396 + job_id: j56y4ro6p + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:23:43Z' - torchscript_onnx_qnn: - inference_time: 47461.0 - throughput: 21.0699311013253 + inference_time: 33885.0 + throughput: 29.511583296443852 estimated_peak_memory_range: - min: 1777664 - max: 1777664 + min: 1773568 + max: 1773568 primary_compute_unit: NPU precision: fp16 layer_info: @@ -339,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 443 - job_id: jep28l9xp + job_id: jgjvnorxg job_status: Passed torchscript_onnx: - inference_time: 55221.0 - throughput: 18.109052715452453 + inference_time: 35659.0 + throughput: 28.043411200538433 estimated_peak_memory_range: - min: 114429952 - max: 114429952 + min: 114479104 + max: 114479104 primary_compute_unit: NPU precision: fp16 layer_info: @@ -354,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 396 - job_id: jnp10ewk5 + job_id: jp8qy6jkp job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -363,15 +429,15 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:33:12Z' + timestamp: '2024-10-14T23:23:39Z' - name: TrOCRDecoder performance_metrics: - torchscript_onnx_tflite: - inference_time: 2710.0 - throughput: 369.00369003690037 + inference_time: 2600.0 + throughput: 384.61538461538464 estimated_peak_memory_range: min: 12288 - max: 2321944 + max: 2023104 primary_compute_unit: NPU precision: fp16 layer_info: @@ -379,37 +445,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 399 - job_id: jw566nev5 + job_id: j5we68l65 job_status: Passed torchscript_onnx_qnn: - inference_time: 3068.0 - throughput: 325.94524119947846 + inference_time: 3012.0 + throughput: 332.00531208499336 estimated_peak_memory_range: - min: 24576 - max: 283198712 + min: 3383296 + max: 275704168 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 356 + layers_on_npu: 375 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 356 - job_id: jo5mr3qyg + total_layers: 375 + job_id: jp3j0xzlg job_status: Passed torchscript_onnx: - inference_time: 3024.0 - throughput: 330.6878306878307 + inference_time: 2843.0 + throughput: 351.74111853675697 estimated_peak_memory_range: - min: 720896 - max: 3353256 + min: 704512 + max: 3211728 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 376 + layers_on_npu: 395 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 376 - job_id: jygzerjzg + total_layers: 395 + job_id: jp2ky4orp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -418,13 +484,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:33:09Z' + timestamp: '2024-10-14T23:23:36Z' - torchscript_onnx_tflite: - inference_time: 1917.0 - throughput: 521.6484089723526 + inference_time: 1851.0 + throughput: 
540.2485143165857 estimated_peak_memory_range: min: 12288 - max: 196663088 + max: 198757456 primary_compute_unit: NPU precision: fp16 layer_info: @@ -432,37 +498,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 399 - job_id: jwgoy3k45 + job_id: jp14z7n2p job_status: Passed torchscript_onnx_qnn: - inference_time: 2190.0 - throughput: 456.62100456621005 + inference_time: 2471.0 + throughput: 404.6944556859571 estimated_peak_memory_range: min: 0 - max: 51016704 + max: 53666960 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 356 + layers_on_npu: 375 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 356 - job_id: joprke2v5 + total_layers: 375 + job_id: jpv6kelj5 job_status: Passed torchscript_onnx: - inference_time: 2259.0 - throughput: 442.67374944665784 + inference_time: 2148.0 + throughput: 465.54934823091247 estimated_peak_memory_range: min: 0 - max: 151491264 + max: 155889552 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 376 + layers_on_npu: 395 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 376 - job_id: jmg9vwyq5 + total_layers: 395 + job_id: jp0z0do95 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -471,13 +537,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:33:11Z' + timestamp: '2024-10-14T23:23:38Z' - torchscript_onnx_tflite: - inference_time: 2676.0 - throughput: 373.69207772795215 + inference_time: 2562.0 + throughput: 390.32006245121 estimated_peak_memory_range: min: 12288 - max: 1689544 + max: 2224376 primary_compute_unit: NPU precision: fp16 layer_info: @@ -485,22 +551,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 399 - job_id: j7gjxez7p + job_id: j57yrkel5 job_status: Passed torchscript_onnx_qnn: - inference_time: 2709.0 - throughput: 369.139904023625 + inference_time: 2631.0 + throughput: 380.08361839604714 estimated_peak_memory_range: - min: 1744896 - max: 3108000 + min: 188416 + max: 1513696 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 356 + layers_on_npu: 375 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 356 - job_id: j1p8ozmzg + total_layers: 375 + job_id: j5we68y65 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -508,14 +574,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:33:01Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:23:24Z' - torchscript_onnx_tflite: - inference_time: 2819.0 - throughput: 354.735721887194 + inference_time: 2608.0 + throughput: 383.4355828220859 estimated_peak_memory_range: - min: 16384 - max: 193892048 + min: 12288 + max: 2024384 primary_compute_unit: NPU precision: fp16 layer_info: @@ -523,37 +589,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 399 - job_id: jygzerozg + job_id: jp0z0dm65 job_status: Passed torchscript_onnx_qnn: - inference_time: 3418.0 - throughput: 292.5687536571094 + inference_time: 2607.0 + throughput: 383.5826620636747 estimated_peak_memory_range: - min: 0 - max: 44338144 + min: 1265664 + max: 3372392 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 356 + layers_on_npu: 375 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 356 - job_id: j7gjxe27p + total_layers: 375 + job_id: jp14z7o8p job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: 
'2024-09-25T11:33:08Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:23:29Z' - torchscript_onnx_tflite: - inference_time: 2708.0 - throughput: 369.2762186115214 + inference_time: 2604.0 + throughput: 384.0245775729647 estimated_peak_memory_range: min: 12288 - max: 1956704 + max: 1654136 primary_compute_unit: NPU precision: fp16 layer_info: @@ -561,37 +627,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 399 - job_id: jmg9vwjq5 + job_id: jp2ky4r4p job_status: Passed torchscript_onnx_qnn: - inference_time: 2768.0 - throughput: 361.271676300578 + inference_time: 2613.0 + throughput: 382.70187523918867 estimated_peak_memory_range: - min: 1347584 - max: 3406960 + min: 1286144 + max: 3420960 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 356 + layers_on_npu: 375 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 356 - job_id: jn5q83r75 + total_layers: 375 + job_id: j5we68y35 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:33:03Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:23:27Z' - torchscript_onnx_tflite: - inference_time: 2656.0 - throughput: 376.50602409638554 + inference_time: 2573.0 + throughput: 388.65137971239795 estimated_peak_memory_range: - min: 12288 - max: 2222080 + min: 16384 + max: 2053744 primary_compute_unit: NPU precision: fp16 layer_info: @@ -599,37 +665,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 399 - job_id: jvgdwoek5 + job_id: jgn6vozr5 job_status: Passed torchscript_onnx_qnn: - inference_time: 2832.0 - throughput: 353.1073446327684 + inference_time: 2658.0 + throughput: 376.2227238525207 estimated_peak_memory_range: - min: 1912832 - max: 3317016 + min: 1323008 + max: 2779128 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 356 + layers_on_npu: 375 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 356 - job_id: jw566nzv5 + total_layers: 375 + job_id: jp14z7o2p job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:33:04Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:23:26Z' - torchscript_onnx_tflite: - inference_time: 2707.0 - throughput: 369.4126339120798 + inference_time: 2814.0 + throughput: 355.36602700781805 estimated_peak_memory_range: - min: 16384 - max: 2148800 + min: 12288 + max: 198674720 primary_compute_unit: NPU precision: fp16 layer_info: @@ -637,60 +703,113 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 399 - job_id: jqp4qvkqg + job_id: jpxko3l15 job_status: Passed torchscript_onnx_qnn: - inference_time: 2796.0 - throughput: 357.653791130186 + inference_time: 3375.0 + throughput: 296.2962962962963 estimated_peak_memory_range: - min: 1351680 - max: 2756632 + min: 4358144 + max: 54921552 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 356 + layers_on_npu: 375 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 356 - job_id: jwgoy3n45 + total_layers: 375 + job_id: jpxko3035 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:23:33Z' + - torchscript_onnx_tflite: + inference_time: 2104.0 + throughput: 475.2851711026616 + estimated_peak_memory_range: 
+ min: 12288 + max: 28373760 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 399 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 399 + job_id: jglvmoy85 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 2016.0 + throughput: 496.031746031746 + estimated_peak_memory_range: + min: 0 + max: 46974240 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 375 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 375 + job_id: jgn6vo1k5 + job_status: Passed + torchscript_onnx: + inference_time: 2078.0 + throughput: 481.23195380173246 + estimated_peak_memory_range: + min: 0 + max: 36669456 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 395 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 395 + job_id: jp3j0xo3g + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:33:06Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:23:44Z' - torchscript_onnx_qnn: - inference_time: 3022.0 - throughput: 330.90668431502314 + inference_time: 2793.0 + throughput: 358.03795202291445 estimated_peak_memory_range: - min: 7397376 - max: 7397376 + min: 7385088 + max: 7385088 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 356 + layers_on_npu: 375 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 356 - job_id: jqpye6jrg + total_layers: 375 + job_id: jpedm8715 job_status: Passed torchscript_onnx: - inference_time: 2984.0 - throughput: 335.1206434316354 + inference_time: 2881.0 + throughput: 347.1017007983339 estimated_peak_memory_range: - min: 72138752 - max: 72138752 + min: 71094272 + max: 71094272 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 376 + layers_on_npu: 395 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 376 - job_id: jvgdwoqk5 + total_layers: 395 + job_id: jgkexo6wg job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -699,4 +818,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:33:13Z' + timestamp: '2024-10-14T23:23:40Z' diff --git a/qai_hub_models/models/unet_segmentation/README.md b/qai_hub_models/models/unet_segmentation/README.md index f8474142..f29f695b 100644 --- a/qai_hub_models/models/unet_segmentation/README.md +++ b/qai_hub_models/models/unet_segmentation/README.md @@ -6,7 +6,7 @@ UNet is a machine learning model that produces a segmentation mask for an image. The most basic use case will label each pixel in the image as being in the foreground or the background. More advanced usage will assign a class label to each pixel. This version of the model was trained on the data from Kaggle's Carvana Image Masking Challenge (see https://www.kaggle.com/c/carvana-image-masking-challenge) and is used for vehicle segmentation. This is based on the implementation of Unet-Segmentation found -[here](https://github.com/milesial/Pytorch-UNet). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/unet_segmentation). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.unet_segmentation.export Additional options are documented with the `--help` option.
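For multi-component models such as TrOCR earlier in this diff, the same refactor yields a `Mapping[str, ExportResult]` keyed by component name rather than a single struct. A minimal editorial sketch of consuming it, assuming the component names `"TrOCREncoder"` and `"TrOCRDecoder"` that appear in the TrOCR `export.py` hunk above and an illustrative device name:

```python
# Sketch (assumed usage, not taken from this diff) of iterating the
# per-component mapping returned by the TrOCR export.
from qai_hub_models.models.trocr.export import export_model

results = export_model(device="Samsung Galaxy S24")  # illustrative device name
for component_name, export_result in results.items():
    # Each component ("TrOCREncoder", "TrOCRDecoder") gets its own jobs.
    print(component_name, export_result.compile_job)
    if export_result.profile_job is not None:
        assert export_result.profile_job.wait().success
```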
Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Unet-Segmentation can be found +* The license for the original implementation of Unet-Segmentation can be found [here](https://github.com/milesial/Pytorch-UNet/blob/master/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://github.com/milesial/Pytorch-UNet/blob/master/LICENSE) +* The license for the compiled assets for on-device deployment can be found [here](https://github.com/milesial/Pytorch-UNet/blob/master/LICENSE) + ## References * [U-Net: Convolutional Networks for Biomedical Image Segmentation](https://arxiv.org/abs/1505.04597) * [Source Model Implementation](https://github.com/milesial/Pytorch-UNet) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/unet_segmentation/export.py b/qai_hub_models/models/unet_segmentation/export.py index 5eb8aa5d..c3722b9a 100644 --- a/qai_hub_models/models/unet_segmentation/export.py +++ b/qai_hub_models/models/unet_segmentation/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.unet_segmentation import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. 
- * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "unet_segmentation" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/unet_segmentation/perf.yaml b/qai_hub_models/models/unet_segmentation/perf.yaml index e4b18965..fc3c3c51 100644 --- a/qai_hub_models/models/unet_segmentation/perf.yaml +++ b/qai_hub_models/models/unet_segmentation/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: Unet-Segmentation performance_metrics: - torchscript_onnx_tflite: - inference_time: 156677.0 - throughput: 6.382557746191209 + inference_time: 153929.0 + throughput: 6.496501633870161 estimated_peak_memory_range: - min: 6684672 - max: 9129000 + min: 6717440 + max: 463282184 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 32 - job_id: j1p3kenx5 + job_id: jgo26o8xp job_status: Passed torchscript_onnx_qnn: - inference_time: 157042.0 - throughput: 6.367723284216961 + inference_time: 151064.0 + throughput: 6.619710851030027 estimated_peak_memory_range: - min: 9863168 - max: 29829824 + min: 9973760 + max: 31622512 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 52 - job_id: jmg9vw0q5 + job_id: j57yrk4l5 job_status: Passed torchscript_onnx: - inference_time: 160699.0 - throughput: 6.222814080983702 + inference_time: 155224.0 + throughput: 6.442302736690203 estimated_peak_memory_range: - min: 36864 - max: 58894448 + min: 17252352 + max: 18880624 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 53 - job_id: joprke8v5 + job_id: jgkexo42g job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:32:07Z' + timestamp: '2024-10-14T23:22:22Z' - torchscript_onnx_tflite: - inference_time: 133225.0 - throughput: 7.5060987051979735 + inference_time: 132249.0 + throughput: 7.561493848724754 estimated_peak_memory_range: - min: 6701056 - max: 345142176 + min: 6791168 + max: 
410391120 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 32 - job_id: jwgoy3z45 + job_id: jpv6ke7j5 job_status: Passed torchscript_onnx_qnn: - inference_time: 131838.0 - throughput: 7.585066521033389 + inference_time: 132978.0 + throughput: 7.520040909022545 estimated_peak_memory_range: - min: 9969664 - max: 80217024 + min: 9846784 + max: 101051712 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 52 - job_id: jnp10e2k5 + job_id: jp4lrm1v5 job_status: Passed torchscript_onnx: - inference_time: 139885.0 - throughput: 7.14872931336455 + inference_time: 134367.0 + throughput: 7.442303541792255 estimated_peak_memory_range: - min: 892928 - max: 350482928 + min: 380928 + max: 421141680 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 53 - job_id: jep28l0xp + job_id: j5q6qzy4p job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:32:08Z' + timestamp: '2024-10-14T23:22:23Z' - torchscript_onnx_tflite: - inference_time: 157105.0 - throughput: 6.365169790904172 + inference_time: 142642.0 + throughput: 7.010557900197698 estimated_peak_memory_range: - min: 24576 - max: 472223368 + min: 6688768 + max: 463253000 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 32 - job_id: j1pv3vq75 + job_id: jgjvnoqxg job_status: Passed torchscript_onnx_qnn: - inference_time: 138754.0 - throughput: 7.206999437854044 + inference_time: 136843.0 + throughput: 7.307644526939631 estimated_peak_memory_range: - min: 10022912 - max: 11385152 + min: 10121216 + max: 11336896 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 52 - job_id: jz57zx2qp + job_id: j5mnxomwp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:32:02Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:22:14Z' - torchscript_onnx_tflite: - inference_time: 310193.0 - throughput: 3.2237993765172006 + inference_time: 147599.0 + throughput: 6.775113652531521 estimated_peak_memory_range: - min: 7114752 - max: 349415680 + min: 6701056 + max: 462994672 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 32 - job_id: j7gjxed7p + job_id: jg9lnkmlg job_status: Passed torchscript_onnx_qnn: - inference_time: 279701.0 - throughput: 3.5752464238597645 + inference_time: 136006.0 + throughput: 7.352616796317809 estimated_peak_memory_range: - min: 7778304 - max: 79320304 + min: 10055680 + max: 11787144 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 52 - job_id: jegn23lvg + job_id: jp2ky4w4p job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:32:06Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:22:18Z' - torchscript_onnx_tflite: - 
inference_time: 151563.0 - throughput: 6.597916378007825 + inference_time: 145119.0 + throughput: 6.890896436717453 estimated_peak_memory_range: - min: 6709248 - max: 463143104 + min: 6684672 + max: 463339832 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 32 - job_id: jlpe9ko7g + job_id: j5we68765 job_status: Passed torchscript_onnx_qnn: - inference_time: 139417.0 - throughput: 7.172726425041422 + inference_time: 143044.0 + throughput: 6.990855960403792 estimated_peak_memory_range: - min: 10117120 - max: 18549048 + min: 10100736 + max: 11361408 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 52 - job_id: jqp4qvnqg + job_id: jprv3o09g job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:32:03Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:22:16Z' - torchscript_onnx_tflite: - inference_time: 154551.0 - throughput: 6.470356063694185 + inference_time: 157280.0 + throughput: 6.358087487283825 estimated_peak_memory_range: - min: 6684672 - max: 463394632 + min: 6627328 + max: 478999472 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 32 - job_id: jygzer2zg + job_id: jgz3d8nk5 job_status: Passed torchscript_onnx_qnn: - inference_time: 140581.0 - throughput: 7.11333679515724 + inference_time: 139062.0 + throughput: 7.191037091369317 estimated_peak_memory_range: - min: 11464704 - max: 12719960 + min: 10096640 + max: 11424328 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 52 - job_id: j0pxvy9jg + job_id: jgn6vonr5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:32:04Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:22:15Z' - torchscript_onnx_tflite: - inference_time: 157023.0 - throughput: 6.368493787534311 + inference_time: 380675.0 + throughput: 2.6269127208248504 estimated_peak_memory_range: - min: 6709248 - max: 463057768 + min: 167936 + max: 406578064 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 32 - job_id: jz5woqwzp + job_id: jpedm8y15 job_status: Passed torchscript_onnx_qnn: - inference_time: 143814.0 - throughput: 6.953425952967027 + inference_time: 269680.0 + throughput: 3.708098487095817 estimated_peak_memory_range: - min: 10104832 - max: 18668648 + min: 4374528 + max: 99862544 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 52 - job_id: jo5mr3eyg + job_id: jp0z0dj65 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:22:20Z' + - torchscript_onnx_tflite: + inference_time: 102802.0 + throughput: 9.727437209392813 + estimated_peak_memory_range: + min: 5791744 + max: 124288656 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 32 + layers_on_gpu: 0 + layers_on_cpu: 0 + 
total_layers: 32 + job_id: jgdx183ep + job_status: Passed + torchscript_onnx_qnn: + inference_time: 102598.0 + throughput: 9.746778689643072 + estimated_peak_memory_range: + min: 9932800 + max: 115505984 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 52 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 52 + job_id: jp8qy6xxp + job_status: Passed + torchscript_onnx: + inference_time: 104486.0 + throughput: 9.570660184139502 + estimated_peak_memory_range: + min: 25743360 + max: 149037744 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 53 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 53 + job_id: jp3j0x9lg + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:32:05Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:22:25Z' - torchscript_onnx_qnn: - inference_time: 135619.0 - throughput: 7.373598094662253 + inference_time: 135807.0 + throughput: 7.36339069414684 estimated_peak_memory_range: min: 9850880 max: 9850880 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 52 - job_id: jvgdwonk5 + job_id: jpxko3415 job_status: Passed torchscript_onnx: - inference_time: 147147.0 - throughput: 6.795925163272102 + inference_time: 147497.0 + throughput: 6.779798911164295 estimated_peak_memory_range: - min: 56770560 - max: 56770560 + min: 56721408 + max: 56721408 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 53 - job_id: jqpye6rrg + job_id: jglvmox85 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:32:09Z' + timestamp: '2024-10-14T23:22:23Z' diff --git a/qai_hub_models/models/vit/README.md b/qai_hub_models/models/vit/README.md index 06e0a6df..cb79f499 100644 --- a/qai_hub_models/models/vit/README.md +++ b/qai_hub_models/models/vit/README.md @@ -6,7 +6,7 @@ VIT is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of VIT found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/vision_transformer.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/vit). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.vit.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of VIT can be found +* The license for the original implementation of VIT can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). 
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale](https://arxiv.org/abs/2010.11929) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/vision_transformer.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/vit/export.py b/qai_hub_models/models/vit/export.py index 60de5f21..a7ddef2e 100644 --- a/qai_hub_models/models/vit/export.py +++ b/qai_hub_models/models/vit/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.vit import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
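In practice this means callers select jobs by field name rather than by tuple position. A minimal downstream sketch (assuming AI Hub access is configured, that `ExportResult` from `qai_hub_models.models.common` exposes the three optional job fields named above, and that each hub job exposes a `job_id`):

```python
from qai_hub_models.models.common import ExportResult
from qai_hub_models.models.vit.export import export_model

# With defaults, this submits compile, profile, and inference jobs.
result = export_model()
# Without Hub access the function returns a List[str] instead.
assert isinstance(result, ExportResult)

print(f"compile job: {result.compile_job.job_id}")
if result.profile_job is not None:  # None when profiling was skipped
    print(f"profile job: {result.profile_job.job_id}")
if result.inference_job is not None:  # None when inferencing was skipped
    print(f"inference job: {result.inference_job.job_id}")
```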
""" model_name = "vit" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -122,7 +120,7 @@ def export_model( model.to("cpu"), make_torch_inputs(input_spec), check_trace=False ) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -136,7 +134,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -151,7 +149,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -172,13 +170,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -199,7 +197,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/vit/perf.yaml b/qai_hub_models/models/vit/perf.yaml index 9346d460..5e73a690 100644 --- a/qai_hub_models/models/vit/perf.yaml +++ b/qai_hub_models/models/vit/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: VIT performance_metrics: - torchscript_onnx_tflite: - inference_time: 19822.0 - throughput: 50.44899606497831 + inference_time: 19821.0 + throughput: 50.45154129458655 estimated_peak_memory_range: - min: 49152 - max: 3157824 + min: 90112 + max: 3029272 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1579 - job_id: j1pv3v675 + job_id: jglvmo185 job_status: Passed torchscript_onnx: - inference_time: 20353.0 - throughput: 49.13280597454921 + inference_time: 15505.0 + throughput: 64.49532408900355 estimated_peak_memory_range: - min: 53248 - max: 202758152 + min: 61440 + max: 202556824 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,7 +71,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 976 - job_id: jqpye6zrg + job_id: jpy13qm7p job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -82,13 +80,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:31:07Z' + timestamp: '2024-10-14T23:21:14Z' - torchscript_onnx_tflite: - inference_time: 14467.0 - throughput: 69.12283127116886 + inference_time: 16903.0 + throughput: 59.161095663491686 estimated_peak_memory_range: - min: 36864 - max: 385785792 + min: 0 + max: 400784160 primary_compute_unit: NPU precision: fp16 layer_info: @@ -96,14 +94,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1579 - job_id: j7gjxev7p + job_id: j56y4rd0p job_status: Passed torchscript_onnx: - inference_time: 14334.0 - throughput: 69.76419701409237 + inference_time: 11372.0 + throughput: 87.93527963418924 estimated_peak_memory_range: - min: 651264 - max: 144984496 + min: 0 + max: 156306496 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,7 +109,7 @@ models: 
layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 976 - job_id: j2p0yl42g + job_id: jp0z0d665 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -120,13 +118,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:31:08Z' + timestamp: '2024-10-14T23:21:15Z' - torchscript_onnx_tflite: - inference_time: 19190.0 - throughput: 52.11047420531527 + inference_time: 19788.0 + throughput: 50.53567818880129 estimated_peak_memory_range: min: 53248 - max: 2811560 + max: 2828712 primary_compute_unit: NPU precision: fp16 layer_info: @@ -134,7 +132,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1579 - job_id: jlpe9kd7g + job_id: jp3j0xwlg job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -142,14 +140,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:30:53Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:20:56Z' - torchscript_onnx_tflite: - inference_time: 23008.0 - throughput: 43.46314325452017 + inference_time: 19830.0 + throughput: 50.42864346949067 estimated_peak_memory_range: - min: 98304 - max: 370568720 + min: 49152 + max: 3033144 primary_compute_unit: NPU precision: fp16 layer_info: @@ -157,22 +155,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1579 - job_id: jygzer3zg + job_id: jpedm8l15 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:30:54Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:21:00Z' - torchscript_onnx_tflite: - inference_time: 19280.0 - throughput: 51.86721991701245 + inference_time: 20031.0 + throughput: 49.9226199390944 estimated_peak_memory_range: - min: 32768 - max: 1765157752 + min: 65536 + max: 2686480 primary_compute_unit: NPU precision: fp16 layer_info: @@ -180,22 +178,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1579 - job_id: jz5woqezp + job_id: jgjvnowxg job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:30:55Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:20:59Z' - torchscript_onnx_tflite: - inference_time: 19169.0 - throughput: 52.16756220981794 + inference_time: 20358.0 + throughput: 49.12073877591119 estimated_peak_memory_range: - min: 53248 - max: 2801568 + min: 57344 + max: 3355672 primary_compute_unit: NPU precision: fp16 layer_info: @@ -203,22 +201,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1579 - job_id: jmg9vwlq5 + job_id: jpv6ke9j5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:30:56Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:20:58Z' - torchscript_onnx_tflite: - inference_time: 19709.0 - throughput: 50.73824141255264 + inference_time: 24972.0 + throughput: 40.04485023226013 estimated_peak_memory_range: - min: 45056 - max: 2867072 + min: 53248 + max: 385918016 primary_compute_unit: NPU precision: fp16 layer_info: @@ -226,22 +224,60 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1579 - job_id: jnp10e4k5 + job_id: jgo26o4xp job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 
(Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:20:57Z' + - torchscript_onnx_tflite: + inference_time: 11489.0 + throughput: 87.03977717817043 + estimated_peak_memory_range: + min: 40960 + max: 216772384 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 1579 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 1579 + job_id: j5we68465 + job_status: Passed + torchscript_onnx: + inference_time: 9010.0 + throughput: 110.98779134295228 + estimated_peak_memory_range: + min: 647168 + max: 117624416 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 976 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 976 + job_id: jgdxx2x6p + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:30:57Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-16T09:35:49Z' - torchscript_onnx: - inference_time: 21231.0 - throughput: 47.10093730865244 + inference_time: 21624.0 + throughput: 46.24491305956345 estimated_peak_memory_range: - min: 179093504 - max: 179093504 + min: 179056640 + max: 179056640 primary_compute_unit: NPU precision: fp16 layer_info: @@ -249,7 +285,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 976 - job_id: j1p8oz2zg + job_id: jp8qy61xp job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -258,4 +294,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:31:09Z' + timestamp: '2024-10-14T23:21:16Z' diff --git a/qai_hub_models/models/vit_quantized/README.md b/qai_hub_models/models/vit_quantized/README.md new file mode 100644 index 00000000..a4560b1a --- /dev/null +++ b/qai_hub_models/models/vit_quantized/README.md @@ -0,0 +1,59 @@ +[![Qualcomm® AI Hub Models](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/quic-logo.jpg)](../../README.md) + + +# [VITQuantized: Imagenet classifier and general purpose backbone](https://aihub.qualcomm.com/models/vit_quantized) + +VIT is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. + +This is based on the implementation of VITQuantized found +[here]({source_repo}). This repository contains scripts for optimized on-device +export suitable to run on Qualcomm® devices. More details on model performance +across various devices can be found [here](https://aihub.qualcomm.com/models/vit_quantized). + +[Sign up](https://myaccount.qualcomm.com/signup) to start using Qualcomm AI Hub and run these models on a hosted Qualcomm® device. + + + + +## Example & Usage + + +Once installed, run the following simple CLI demo: + +```bash +python -m qai_hub_models.models.vit_quantized.demo +``` +More details on the CLI tool can be found with the `--help` option. See +[demo.py](demo.py) for sample usage of the model including pre/post processing +scripts. Please refer to our [general instructions on using +models](../../../#getting-started) for more usage instructions. + +## Export for on-device deployment + +This repository contains export scripts that produce a model optimized for +on-device deployment. 
This can be run as follows: + +```bash +python -m qai_hub_models.models.vit_quantized.export +``` +Additional options are documented with the `--help` option. Note that the above +script requires access to Deployment instructions for Qualcomm® AI Hub. + + +## License +* The license for the original implementation of VITQuantized can be found + [here](https://github.com/pytorch/vision/blob/main/LICENSE). +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + + +## References +* [An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale](https://arxiv.org/abs/2010.11929) +* [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/vision_transformer.py) + + + +## Community +* Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. +* For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). + + diff --git a/qai_hub_models/models/vit_quantized/__init__.py b/qai_hub_models/models/vit_quantized/__init__.py new file mode 100644 index 00000000..e86d7aee --- /dev/null +++ b/qai_hub_models/models/vit_quantized/__init__.py @@ -0,0 +1,10 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. +# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +from qai_hub_models.models._shared.imagenet_classifier.app import ( # noqa: F401 + ImagenetClassifierApp as App, +) + +from .model import MODEL_ID # noqa: F401 +from .model import VITQuantizable as Model # noqa: F401 diff --git a/qai_hub_models/models/vit_quantized/conftest.py b/qai_hub_models/models/vit_quantized/conftest.py new file mode 100644 index 00000000..28e56480 --- /dev/null +++ b/qai_hub_models/models/vit_quantized/conftest.py @@ -0,0 +1,37 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. +# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +# THIS FILE WAS AUTO-GENERATED. DO NOT EDIT MANUALLY. + +import inspect + +import pytest + +from qai_hub_models.models.vit_quantized import Model + + +# Instantiate the model only once for all tests. +# Mock from_pretrained to always return the initialized model. +# This speeds up tests and limits memory leaks. 
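With the autouse, module-scoped fixture defined just below, any test in this module can call `Model.from_pretrained()` repeatedly and hit the cache. A hypothetical test (not part of the repo) illustrating the effect:

```python
from qai_hub_models.models.vit_quantized import Model


def test_from_pretrained_is_cached():
    # Both calls share one instance because the autouse fixture below
    # monkeypatches Model.from_pretrained with a memoizing wrapper
    # keyed on the stringified call arguments.
    a = Model.from_pretrained()
    b = Model.from_pretrained()
    assert a is b
```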
+@pytest.fixture(scope="module", autouse=True) +def cached_from_pretrained(): + with pytest.MonkeyPatch.context() as mp: + pretrained_cache = {} + from_pretrained = Model.from_pretrained + sig = inspect.signature(from_pretrained) + + def _cached_from_pretrained(*args, **kwargs): + cache_key = str(args) + str(kwargs) + model = pretrained_cache.get(cache_key, None) + if model: + return model + else: + model = from_pretrained(*args, **kwargs) + pretrained_cache[cache_key] = model + return model + + _cached_from_pretrained.__signature__ = sig + + mp.setattr(Model, "from_pretrained", _cached_from_pretrained) + yield mp diff --git a/qai_hub_models/models/vit_quantized/demo.py b/qai_hub_models/models/vit_quantized/demo.py new file mode 100644 index 00000000..71c37648 --- /dev/null +++ b/qai_hub_models/models/vit_quantized/demo.py @@ -0,0 +1,14 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. +# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +from qai_hub_models.models._shared.imagenet_classifier.demo import imagenet_demo +from qai_hub_models.models.vit_quantized.model import MODEL_ID, VITQuantizable + + +def main(is_test: bool = False): + imagenet_demo(VITQuantizable, MODEL_ID, is_test) + + +if __name__ == "__main__": + main() diff --git a/qai_hub_models/models/vit_quantized/evaluate.py b/qai_hub_models/models/vit_quantized/evaluate.py new file mode 100644 index 00000000..e27bad7a --- /dev/null +++ b/qai_hub_models/models/vit_quantized/evaluate.py @@ -0,0 +1,56 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. +# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +# THIS FILE WAS AUTO-GENERATED. DO NOT EDIT MANUALLY. 
+ + +from __future__ import annotations + +import warnings + +import qai_hub as hub + +from qai_hub_models.models.vit_quantized import MODEL_ID, Model +from qai_hub_models.utils.args import evaluate_parser, get_hub_device, get_model_kwargs +from qai_hub_models.utils.evaluate import evaluate_on_dataset +from qai_hub_models.utils.inference import compile_model_from_args + +SUPPORTED_DATASETS = ["imagenette", "imagenet"] + + +def main(): + warnings.filterwarnings("ignore") + parser = evaluate_parser( + model_cls=Model, + default_split_size=2500, + supported_datasets=SUPPORTED_DATASETS, + supports_tflite=False, + is_hub_quantized=True, + ) + args = parser.parse_args() + args.device = None + + if args.hub_model_id is not None: + hub_model = hub.get_model(args.hub_model_id) + else: + hub_model = compile_model_from_args( + MODEL_ID, args, get_model_kwargs(Model, vars(args)) + ) + hub_device = get_hub_device(None, args.chipset) + torch_model = Model.from_pretrained(**get_model_kwargs(Model, vars(args))) + evaluate_on_dataset( + hub_model, + torch_model, + hub_device, + args.dataset_name, + args.split_size, + args.num_samples, + args.seed, + args.profile_options, + args.use_cache, + ) + + +if __name__ == "__main__": + main() diff --git a/qai_hub_models/models/vit_quantized/export.py b/qai_hub_models/models/vit_quantized/export.py new file mode 100644 index 00000000..9d8eb31a --- /dev/null +++ b/qai_hub_models/models/vit_quantized/export.py @@ -0,0 +1,250 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. +# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +# THIS FILE WAS AUTO-GENERATED. DO NOT EDIT MANUALLY. + + +from __future__ import annotations + +import os +import warnings +from pathlib import Path +from typing import Any, Dict, List, Optional, cast + +import qai_hub as hub +import torch + +from qai_hub_models.models.common import ExportResult, TargetRuntime +from qai_hub_models.models.vit_quantized import Model +from qai_hub_models.utils.args import ( + export_parser, + get_input_spec_kwargs, + get_model_kwargs, +) +from qai_hub_models.utils.compare import torch_inference +from qai_hub_models.utils.input_spec import make_torch_inputs +from qai_hub_models.utils.printing import ( + print_inference_metrics, + print_on_target_demo_cmd, + print_profile_metrics_from_job, +) +from qai_hub_models.utils.qai_hub_helpers import ( + can_access_qualcomm_ai_hub, + export_without_hub_access, +) +from qai_hub_models.utils.quantization import get_calibration_data + + +def export_model( + device: str = "Samsung Galaxy S23 (Family)", + chipset: Optional[str] = None, + num_calibration_samples: int = 100, + skip_compiling: bool = False, + skip_profiling: bool = False, + skip_inferencing: bool = False, + skip_downloading: bool = False, + skip_summary: bool = False, + output_dir: Optional[str] = None, + target_runtime: TargetRuntime = TargetRuntime.QNN, + compile_options: str = "", + profile_options: str = "", + **additional_model_kwargs, +) -> ExportResult | List[str]: + """ + This function executes the following recipe: + + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + 3. Compiles the model to an asset that can be run on device + 4. Profiles the model performance on a real device + 5. Inferences the model on sample inputs + 6. 
Downloads the model asset to the local directory + 7. Summarizes the results from profiling and inference + + Each of the last 5 steps can be optionally skipped using the input options. + + Parameters: + device: Device for which to export the model. + Full list of available devices can be found by running `hub.get_devices()`. + Defaults to DEFAULT_DEVICE if not specified. + chipset: If set, will choose a random device with this chipset. + Overrides the `device` argument. + num_calibration_samples: The number of calibration data samples + to use for quantization. + skip_compiling: If set, skips compiling model to format that can run on device. + skip_profiling: If set, skips profiling of compiled model on real devices. + skip_inferencing: If set, skips computing on-device outputs from sample data. + skip_downloading: If set, skips downloading of compiled model. + skip_summary: If set, skips waiting for and summarizing results + from profiling and inference. + output_dir: Directory to store generated assets (e.g. compiled model). + Defaults to `<cwd>/build/<model_name>`. + target_runtime: Which on-device runtime to target. Default is QNN. + compile_options: Additional options to pass when submitting the compile job. + profile_options: Additional options to pass when submitting the profile job. + **additional_model_kwargs: Additional optional kwargs used to customize + `model_cls.from_pretrained` and `model.get_input_spec` + + Returns: + A struct of: + * A CompileJob object containing metadata about the compile job submitted to hub (None if compiling skipped). + * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). + * A QuantizeJob object containing metadata about the quantize job submitted to hub. + """ + model_name = "vit_quantized" + output_path = Path(output_dir or Path.cwd() / "build" / model_name) + if chipset: + hub_device = hub.Device(attributes=f"chipset:{chipset}") + else: + hub_device = hub.Device(name=device) + if not can_access_qualcomm_ai_hub(): + return export_without_hub_access( + "vit_quantized", + "VITQuantized", + device, + skip_profiling, + skip_inferencing, + skip_downloading, + skip_summary, + output_path, + target_runtime, + compile_options, + profile_options, + ) + + # On-device perf improves with I/O in channel_last format except when using ONNX. + use_channel_last_format = target_runtime != TargetRuntime.ONNX + + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) + input_spec = model.get_input_spec( + **get_input_spec_kwargs(model, additional_model_kwargs) + ) + + # Trace the model + source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) + + print(f"Quantizing model {model_name} with {num_calibration_samples} samples.") + # 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. 
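Before the full implementation below: step 2 is really two Hub jobs chained together, a compile job whose only role is converting the traced TorchScript to ONNX, and a quantize job that calibrates that ONNX asset. Reduced to its essentials (a sketch, not the repo's code: the int8 dtypes and the `QuantizeDtype` enum are assumptions here, whereas the real code asks the model via `get_weights_dtype()`/`get_activations_dtype()`):

```python
import qai_hub as hub


def quantize_via_hub(traced_model, input_spec, device, calibration_data):
    # Stage 1: convert the traced TorchScript model to ONNX on AI Hub.
    to_onnx = hub.submit_compile_job(
        model=traced_model,
        input_specs=input_spec,
        device=device,
        options="--target_runtime onnx",
    )
    # Stage 2: calibrate and quantize the ONNX asset.
    return hub.submit_quantize_job(
        model=to_onnx.get_target_model(),
        calibration_data=calibration_data,
        weights_dtype=hub.QuantizeDtype.INT8,   # assumed dtype
        activations_dtype=hub.QuantizeDtype.INT8,
    )
```

The device-ready binary then comes from a second compile job that starts from `quantize_job.get_target_model()`, as the code below shows.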
+ onnx_compile_job = hub.submit_compile_job( + model=source_model, + input_specs=input_spec, + device=hub_device, + name=model_name, + options="--target_runtime onnx", + ) + quantize_job = hub.submit_quantize_job( + model=onnx_compile_job.get_target_model(), + calibration_data=get_calibration_data( + input_spec, "imagenette", num_calibration_samples + ), + weights_dtype=model.get_weights_dtype(), + activations_dtype=model.get_activations_dtype(), + name=model_name, + options=model.get_quantize_options(), + ) + if skip_compiling: + return ExportResult(quantize_job=quantize_job) + + # 3. Compiles the model to an asset that can be run on device + model_compile_options = model.get_hub_compile_options( + target_runtime, compile_options, hub_device + ) + print(f"Optimizing model {model_name} to run on-device") + submitted_compile_job = hub.submit_compile_job( + model=quantize_job.get_target_model(), + input_specs=input_spec, + device=hub_device, + name=model_name, + options=model_compile_options, + ) + compile_job = cast(hub.client.CompileJob, submitted_compile_job) + + # 4. Profiles the model performance on a real device + profile_job: Optional[hub.client.ProfileJob] = None + if not skip_profiling: + profile_options_all = model.get_hub_profile_options( + target_runtime, profile_options + ) + print(f"Profiling model {model_name} on a hosted device.") + submitted_profile_job = hub.submit_profile_job( + model=compile_job.get_target_model(), + device=hub_device, + name=model_name, + options=profile_options_all, + ) + profile_job = cast(hub.client.ProfileJob, submitted_profile_job) + + # 5. Inferences the model on sample inputs + inference_job: Optional[hub.client.InferenceJob] = None + if not skip_inferencing: + profile_options_all = model.get_hub_profile_options( + target_runtime, profile_options + ) + print( + f"Running inference for {model_name} on a hosted device with example inputs." + ) + sample_inputs = model.sample_inputs( + input_spec, use_channel_last_format=use_channel_last_format + ) + submitted_inference_job = hub.submit_inference_job( + model=compile_job.get_target_model(), + inputs=sample_inputs, + device=hub_device, + name=model_name, + options=profile_options_all, + ) + inference_job = cast(hub.client.InferenceJob, submitted_inference_job) + + # 6. Downloads the model asset to the local directory + if not skip_downloading: + os.makedirs(output_path, exist_ok=True) + target_model: hub.Model = compile_job.get_target_model() # type: ignore + target_model.download(str(output_path / model_name)) + + # 7. 
Summarizes the results from profiling and inference + if not skip_summary and not skip_profiling: + assert profile_job is not None and profile_job.wait().success + profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore + print_profile_metrics_from_job(profile_job, profile_data) + + if not skip_summary and not skip_inferencing: + sample_inputs = model.sample_inputs(use_channel_last_format=False) + torch_out = torch_inference( + model, sample_inputs, return_channel_last_output=use_channel_last_format + ) + assert inference_job is not None and inference_job.wait().success + inference_result: hub.client.DatasetEntries = inference_job.download_output_data() # type: ignore + + print_inference_metrics( + inference_job, + inference_result, + torch_out, + model.get_output_names(), + metrics="psnr,top1,top5", + ) + + if not skip_summary: + print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) + + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + quantize_job=quantize_job, + ) + + +def main(): + warnings.filterwarnings("ignore") + parser = export_parser( + model_cls=Model, supports_tflite=False, is_hub_quantized=True + ) + args = parser.parse_args() + export_model(**vars(args)) + + +if __name__ == "__main__": + main() diff --git a/qai_hub_models/models/vit_quantized/info.yaml b/qai_hub_models/models/vit_quantized/info.yaml new file mode 100644 index 00000000..48b07d44 --- /dev/null +++ b/qai_hub_models/models/vit_quantized/info.yaml @@ -0,0 +1,46 @@ +name: VITQuantized +# id must match with the model dir name in qai_hub_models +id: vit_quantized +status: public +headline: Imagenet classifier and general purpose backbone. +domain: Computer Vision +description: VIT is a machine learning model that can classify images from the Imagenet + dataset. It can also be used as a backbone in building more complex models for specific + use cases. +use_case: Image Classification +tags: + - backbone + - quantized +research_paper: https://arxiv.org/abs/2010.11929 +research_paper_title: 'An Image is Worth 16x16 Words: Transformers for Image Recognition + at Scale' +license: https://github.com/pytorch/vision/blob/main/LICENSE +deploy_license: https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf +source_repo: + https://github.com/pytorch/vision/blob/main/torchvision/models/vision_transformer.py +technical_details: + Model checkpoint: Imagenet + Input resolution: 224x224 + Number of parameters: 86.6M + Model size: 85.9 MB +applicable_scenarios: + - Medical Imaging + - Anomaly Detection + - Inventory Management +related_models: + - mobilenet_v2 + - densenet121 + - googlenet +form_factors: + - Phone + - Tablet + - IoT + - XR +has_static_banner: true +has_animated_banner: true +license_type: bsd-3-clause +deploy_license_type: AI Model Hub License +dataset: + - imagenet-1k + - imagenet-22k +labels_file: imagenet_labels.txt diff --git a/qai_hub_models/models/vit_quantized/model.py b/qai_hub_models/models/vit_quantized/model.py new file mode 100644 index 00000000..0212fbb8 --- /dev/null +++ b/qai_hub_models/models/vit_quantized/model.py @@ -0,0 +1,14 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. 
+# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +from __future__ import annotations + +from qai_hub_models.models.vit.model import VIT +from qai_hub_models.utils.quantization import HubQuantizableMixin + +MODEL_ID = __name__.split(".")[-2] + + +class VITQuantizable(HubQuantizableMixin, VIT): + pass diff --git a/qai_hub_models/models/vit_quantized/perf.yaml b/qai_hub_models/models/vit_quantized/perf.yaml new file mode 100644 index 00000000..ec5e76d7 --- /dev/null +++ b/qai_hub_models/models/vit_quantized/perf.yaml @@ -0,0 +1,313 @@ +aggregated: + supported_oses: + - Android + supported_devices: + - Snapdragon 8 Elite QRD + - Samsung Galaxy S24 + - Samsung Galaxy S24 Ultra + - Samsung Galaxy S24+ + - Snapdragon 8 Gen 3 QRD + - Samsung Galaxy S23 + - Samsung Galaxy S23 Ultra + - Samsung Galaxy S23+ + - Samsung Galaxy S22 5G + - Samsung Galaxy S22 Ultra 5G + - Samsung Galaxy S22+ 5G + - Samsung Galaxy Tab S8 + - Xiaomi 12 + - Xiaomi 12 Pro + - Samsung Galaxy S21 + - Samsung Galaxy S21 Ultra + - Samsung Galaxy S21+ + - Snapdragon X Elite CRD + - Snapdragon X Plus 8-Core CRD + - QCS6490 (Proxy) + - RB3 Gen 2 (Proxy) + - QCS8450 (Proxy) + - XR2 Gen 2 (Proxy) + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8775 (Proxy) + supported_chipsets: + - Snapdragon® 8 Elite + - Snapdragon® 8 Gen 3 + - Snapdragon® 8 Gen 2 + - Snapdragon® 8 Gen 1 + - Snapdragon® 888 + - Snapdragon® X Elite + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8775P Proxy +models: +- name: VITQuantized + performance_metrics: + - torchscript_onnx_qnn: + inference_time: 5499.0 + throughput: 181.8512456810329 + estimated_peak_memory_range: + min: 12288 + max: 31586800 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 903 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 903 + job_id: jp8q278kp + job_status: Passed + torchscript_onnx: + inference_time: 43244.0 + throughput: 23.124595319581907 + estimated_peak_memory_range: + min: 360448 + max: 5032392 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 1654 + layers_on_gpu: 0 + layers_on_cpu: 25 + total_layers: 1679 + job_id: j5wew9835 + job_status: Passed + reference_device_info: + name: Samsung Galaxy S23 + os: '13' + form_factor: Phone + os_name: Android + manufacturer: Samsung + chipset: Snapdragon® 8 Gen 2 + timestamp: '2024-10-17T17:15:24Z' + - torchscript_onnx_qnn: + inference_time: 3591.0 + throughput: 278.473962684489 + estimated_peak_memory_range: + min: 163840 + max: 59568672 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 903 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 903 + job_id: jgkevydwg + job_status: Passed + torchscript_onnx: + inference_time: 32761.0 + throughput: 30.52409877598364 + estimated_peak_memory_range: + min: 221184 + max: 799841776 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 1654 + layers_on_gpu: 0 + layers_on_cpu: 25 + total_layers: 1679 + job_id: jg9l04kwg + job_status: Passed + reference_device_info: + name: Samsung Galaxy S24 + os: '14' + form_factor: Phone + os_name: Android + manufacturer: Samsung + chipset: Snapdragon® 8 Gen 3 + timestamp: '2024-10-17T17:15:26Z' + - torchscript_onnx_qnn: + inference_time: 22428.0 + throughput: 44.58712323880863 + estimated_peak_memory_range: + min: 253952 + max: 8408304 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 902 + layers_on_gpu: 0 + 
layers_on_cpu: 0 + total_layers: 902 + job_id: j5q602wnp + job_status: Passed + reference_device_info: + name: RB3 Gen 2 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS6490 Proxy + timestamp: '2024-10-17T17:15:10Z' + - torchscript_onnx_qnn: + inference_time: 4939.0 + throughput: 202.47013565499088 + estimated_peak_memory_range: + min: 184320 + max: 1447040 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 903 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 903 + job_id: jglv4k7j5 + job_status: Passed + reference_device_info: + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8550 Proxy + timestamp: '2024-10-17T17:15:12Z' + - torchscript_onnx_qnn: + inference_time: 4931.0 + throughput: 202.7986209693774 + estimated_peak_memory_range: + min: 229376 + max: 1524040 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 903 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 903 + job_id: jp3jnm83g + job_status: Passed + reference_device_info: + name: SA8255 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8255P Proxy + timestamp: '2024-10-17T17:15:15Z' + - torchscript_onnx_qnn: + inference_time: 4937.0 + throughput: 202.55215718047398 + estimated_peak_memory_range: + min: 217088 + max: 1425256 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 903 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 903 + job_id: jgo2zvmqp + job_status: Passed + reference_device_info: + name: SA8775 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8775P Proxy + timestamp: '2024-10-17T17:15:17Z' + - torchscript_onnx_qnn: + inference_time: 6251.0 + throughput: 159.97440409534474 + estimated_peak_memory_range: + min: 163840 + max: 60155424 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 903 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 903 + job_id: jpv6qwek5 + job_status: Passed + reference_device_info: + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-17T17:15:18Z' + - torchscript_onnx_qnn: + inference_time: 3394.0 + throughput: 294.6375957572186 + estimated_peak_memory_range: + min: 159744 + max: 75826496 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 903 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 903 + job_id: jgjvdlovg + job_status: Passed + torchscript_onnx: + inference_time: 27705.0 + throughput: 36.094567767550984 + estimated_peak_memory_range: + min: 0 + max: 339511920 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 1654 + layers_on_gpu: 0 + layers_on_cpu: 25 + total_layers: 1679 + job_id: jgdxnv8rp + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-17T17:15:29Z' + - torchscript_onnx_qnn: + inference_time: 5356.0 + throughput: 186.70649738610905 + estimated_peak_memory_range: + min: 180224 + max: 180224 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 903 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 903 + job_id: j56y21v6p + job_status: Passed + torchscript_onnx: + inference_time: 58341.0 + throughput: 17.140604377710357 + 
estimated_peak_memory_range: + min: 239259648 + max: 239259648 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 1654 + layers_on_gpu: 0 + layers_on_cpu: 25 + total_layers: 1679 + job_id: jp142878p + job_status: Passed + reference_device_info: + name: Snapdragon X Elite CRD + os: '11' + form_factor: Compute + os_name: Windows + manufacturer: Qualcomm + chipset: Snapdragon® X Elite + timestamp: '2024-10-17T17:15:28Z' diff --git a/qai_hub_models/models/whisper_base_en/README.md b/qai_hub_models/models/whisper_base_en/README.md index 9c784e92..e92925a2 100644 --- a/qai_hub_models/models/whisper_base_en/README.md +++ b/qai_hub_models/models/whisper_base_en/README.md @@ -6,7 +6,7 @@ OpenAI’s Whisper ASR (Automatic Speech Recognition) model is a state-of-the-art system designed for transcribing spoken language into written text. It exhibits robust performance in realistic, noisy environments, making it highly reliable for real-world applications. Specifically, it excels in long-form transcription, capable of accurately transcribing audio clips up to 30 seconds long. Time to the first token is the encoder's latency, while time to each additional token is decoder's latency, where we assume a mean decoded length specified below. This is based on the implementation of Whisper-Base-En found -[here](https://github.com/openai/whisper/tree/main). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance accross various devices, can be found [here](https://aihub.qualcomm.com/models/whisper_base_en). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.whisper_base_en.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Whisper-Base-En can be found +* The license for the original implementation of Whisper-Base-En can be found [here](https://github.com/openai/whisper/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Robust Speech Recognition via Large-Scale Weak Supervision](https://cdn.openai.com/papers/whisper.pdf) * [Source Model Implementation](https://github.com/openai/whisper/tree/main) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). 
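The Whisper export that follows differs from the single-model exports above: the model is split into components (e.g. `"WhisperEncoder"`), and `export_model` returns a mapping from component name to `ExportResult` rather than a single struct. A consumption sketch under the same assumptions as earlier (Hub access configured, no steps skipped):

```python
from qai_hub_models.models.whisper_base_en.export import export_model

# One ExportResult per exported component.
results = export_model()

for component_name, result in results.items():
    # Each component is compiled, profiled, and inferenced independently;
    # skipped steps leave the corresponding field as None.
    print(component_name, result.compile_job.job_id)
```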
diff --git a/qai_hub_models/models/whisper_base_en/export.py b/qai_hub_models/models/whisper_base_en/export.py index 978aef9f..7f5cb094 100644 --- a/qai_hub_models/models/whisper_base_en/export.py +++ b/qai_hub_models/models/whisper_base_en/export.py @@ -10,14 +10,15 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Mapping, Optional, Tuple, cast +from typing import Any, Dict, List, Mapping, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.whisper_base_en import Model from qai_hub_models.utils.args import export_parser, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel, TargetRuntime +from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -45,20 +46,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Mapping[ - str, Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] -] | List[str]: +) -> Mapping[str, ExportResult] | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -83,10 +82,10 @@ def export_model( `model_cls.from_pretrained` Returns: - A Mapping from component_name to a 3-tuple of: + A Mapping from component_name to a struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "whisper_base_en" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -118,7 +117,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) components_dict: Dict[str, BaseModel] = {} if "WhisperEncoder" in components: @@ -135,7 +134,7 @@ def export_model( component.to("cpu"), make_torch_inputs(input_spec) ) - # 2. Compile the models to an on-device asset + # 2. 
Compiles the model to an asset that can be run on device model_compile_options = component.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -151,7 +150,7 @@ def export_model( hub.client.CompileJob, submitted_compile_job ) - # 3. Profile the model assets on real devices + # 3. Profiles the model performance on a real device profile_jobs: Dict[str, hub.client.ProfileJob] = {} if not skip_profiling: for component_name in components: @@ -169,7 +168,7 @@ def export_model( hub.client.ProfileJob, submitted_profile_job ) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_jobs: Dict[str, hub.client.InferenceJob] = {} if not skip_inferencing: for component_name in components: @@ -193,14 +192,14 @@ def export_model( hub.client.InferenceJob, submitted_inference_job ) - # 5. Download the model assets to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) for component_name, compile_job in compile_jobs.items(): target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / component_name)) - # 6. Summarize the results from profiling and inference + # 6. Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: for component_name in components: profile_job = profile_jobs[component_name] @@ -225,10 +224,10 @@ def export_model( ) return { - component_name: ( - compile_jobs[component_name], - profile_jobs.get(component_name, None), - inference_jobs.get(component_name, None), + component_name: ExportResult( + compile_job=compile_jobs[component_name], + inference_job=inference_jobs.get(component_name, None), + profile_job=profile_jobs.get(component_name, None), ) for component_name in components } diff --git a/qai_hub_models/models/whisper_base_en/perf.yaml b/qai_hub_models/models/whisper_base_en/perf.yaml index ec42e6f5..b3ab9930 100644 --- a/qai_hub_models/models/whisper_base_en/perf.yaml +++ b/qai_hub_models/models/whisper_base_en/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: WhisperEncoder performance_metrics: - torchscript_onnx_tflite: - inference_time: 205173.0 - throughput: 4.873935654301492 + inference_time: 204008.0 + throughput: 4.901768558095761 estimated_peak_memory_range: - min: 36311040 - max: 118811624 + min: 22716416 + max: 98255680 primary_compute_unit: GPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 408 layers_on_cpu: 11 total_layers: 419 - job_id: jvgdwql65 + job_id: 
jp0z0dke5 job_status: Passed torchscript_onnx_qnn: - inference_time: 376271.0 - throughput: 2.657658974515708 + inference_time: 300732.0 + throughput: 3.3252197970285837 estimated_peak_memory_range: - min: 126976 - max: 83512096 + min: 57344 + max: 88122224 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 531 - job_id: jw566n4n5 + job_id: jpxko3395 job_status: Passed torchscript_onnx: - inference_time: 412243.0 - throughput: 2.425753742331586 + inference_time: 282733.0 + throughput: 3.5369058440295262 estimated_peak_memory_range: - min: 37031936 - max: 172609304 + min: 12832768 + max: 149000496 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 380 - job_id: j0pxvyojg + job_id: jp14z77lp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:30:08Z' + timestamp: '2024-10-14T23:20:07Z' - torchscript_onnx_tflite: - inference_time: 166323.0 - throughput: 6.012397563776507 + inference_time: 166502.0 + throughput: 6.005933862656304 estimated_peak_memory_range: min: 40566784 - max: 78581776 + max: 79279712 primary_compute_unit: GPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 408 layers_on_cpu: 11 total_layers: 419 - job_id: jqp4qd02g + job_id: jgkexodog job_status: Passed torchscript_onnx_qnn: - inference_time: 271385.0 - throughput: 3.6848020340107226 + inference_time: 222410.0 + throughput: 4.496200710399712 estimated_peak_memory_range: - min: 606208 - max: 170092720 + min: 0 + max: 304439280 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 531 - job_id: jwgoy3615 + job_id: jgn6voom5 job_status: Passed torchscript_onnx: - inference_time: 324335.0 - throughput: 3.083231843618481 + inference_time: 226140.0 + throughput: 4.422039444591846 estimated_peak_memory_range: min: 0 - max: 831425248 + max: 1028259808 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 380 - job_id: jegn236vg + job_id: j5we68165 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:30:10Z' + timestamp: '2024-10-14T23:20:09Z' - torchscript_onnx_tflite: - inference_time: 197960.0 - throughput: 5.0515255607193374 + inference_time: 198111.0 + throughput: 5.047675293143743 estimated_peak_memory_range: - min: 4059136 - max: 68986424 + min: 29540352 + max: 106596056 primary_compute_unit: GPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 408 layers_on_cpu: 11 total_layers: 419 - job_id: jo5mr6y7g + job_id: jglvmo7l5 job_status: Passed torchscript_onnx_qnn: - inference_time: 235151.0 - throughput: 4.252586635821238 + inference_time: 226448.0 + throughput: 4.416024871052074 estimated_peak_memory_range: - min: 299008 - max: 11327888 + min: 663552 + max: 2063088 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 531 - job_id: jygzerd4g + job_id: jp0z0dde5 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: 
Qcs8550 Proxy - timestamp: '2024-09-25T11:29:59Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:19:53Z' - torchscript_onnx_tflite: - inference_time: 291293.0 - throughput: 3.4329695529930344 + inference_time: 195152.0 + throughput: 5.1242108715257855 estimated_peak_memory_range: - min: 40312832 - max: 86790544 + min: 35758080 + max: 117749792 primary_compute_unit: GPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 408 layers_on_cpu: 11 total_layers: 419 - job_id: joprk2jk5 + job_id: j5we68xj5 job_status: Passed torchscript_onnx_qnn: - inference_time: 439323.0 - throughput: 2.2762295623038176 + inference_time: 238198.0 + throughput: 4.198188062032427 estimated_peak_memory_range: - min: 495616 - max: 180370032 + min: 163840 + max: 10740728 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 531 - job_id: jvgdwo1k5 + job_id: jp3j0xxzg job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:30:07Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:19:59Z' - torchscript_onnx_tflite: - inference_time: 204428.0 - throughput: 4.89169781047606 + inference_time: 198204.0 + throughput: 5.045306855562956 estimated_peak_memory_range: - min: 16384 - max: 82544840 + min: 39964672 + max: 109288784 primary_compute_unit: GPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 408 layers_on_cpu: 11 total_layers: 419 - job_id: jqpyej00g + job_id: jpedm8205 job_status: Passed torchscript_onnx_qnn: - inference_time: 239045.0 - throughput: 4.183312765378903 + inference_time: 235704.0 + throughput: 4.242609374469674 estimated_peak_memory_range: - min: 745472 - max: 1974112 + min: 262144 + max: 11261928 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 531 - job_id: jmg9vwnm5 + job_id: jglvmool5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:30:01Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:19:57Z' - torchscript_onnx_tflite: - inference_time: 201134.0 - throughput: 4.9718098382173075 + inference_time: 211852.0 + throughput: 4.720276419387119 estimated_peak_memory_range: - min: 15388672 - max: 202327768 + min: 16916480 + max: 87093520 primary_compute_unit: GPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 408 layers_on_cpu: 11 total_layers: 419 - job_id: j1p8ozyqg + job_id: jpv6ke4m5 job_status: Passed torchscript_onnx_qnn: - inference_time: 237663.0 - throughput: 4.207638547018257 + inference_time: 228823.0 + throughput: 4.37019005956569 estimated_peak_memory_range: - min: 659456 - max: 32487016 + min: 708608 + max: 1994360 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 531 - job_id: jvgdwo165 + job_id: jgkexooog job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:30:03Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:19:55Z' - torchscript_onnx_tflite: - inference_time: 196903.0 - throughput: 
5.078642783502537 + inference_time: 286641.0 + throughput: 3.488684451980003 estimated_peak_memory_range: - min: 25333760 - max: 69670832 + min: 41242624 + max: 87531568 primary_compute_unit: GPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 408 layers_on_cpu: 11 total_layers: 419 - job_id: jn5q83qe5 + job_id: jp3j0x8zg job_status: Passed torchscript_onnx_qnn: - inference_time: 241292.0 - throughput: 4.144356215705452 + inference_time: 327803.0 + throughput: 3.050612715563921 estimated_peak_memory_range: - min: 303104 - max: 11376744 + min: 598016 + max: 315411264 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 531 - job_id: jmg9vwnq5 + job_id: jpedm8805 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:20:03Z' + - torchscript_onnx_tflite: + inference_time: 167912.0 + throughput: 5.955500500262042 + estimated_peak_memory_range: + min: 40484864 + max: 61780944 + primary_compute_unit: GPU + precision: fp16 + layer_info: + layers_on_npu: 0 + layers_on_gpu: 408 + layers_on_cpu: 11 + total_layers: 419 + job_id: j57yrkkr5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 196114.0 + throughput: 5.099075027789959 + estimated_peak_memory_range: + min: 0 + max: 321859152 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 531 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 531 + job_id: j5we688j5 + job_status: Passed + torchscript_onnx: + inference_time: 197760.0 + throughput: 5.05663430420712 + estimated_peak_memory_range: + min: 82677760 + max: 746041552 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 380 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 380 + job_id: jpxko3d15 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:30:05Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:20:15Z' - torchscript_onnx_qnn: - inference_time: 196372.0 - throughput: 5.092375695109283 + inference_time: 179530.0 + throughput: 5.570099704784716 estimated_peak_memory_range: min: 483328 max: 483328 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 531 - job_id: j7gjxen1p + job_id: jp2ky44mp job_status: Passed torchscript_onnx: - inference_time: 413624.0 - throughput: 2.417654681546525 + inference_time: 308550.0 + throughput: 3.2409658078107277 estimated_peak_memory_range: - min: 139689984 - max: 139689984 + min: 139694080 + max: 139694080 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 380 - job_id: jep28lkxp + job_id: jp14z7v2p job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,15 +429,15 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:30:12Z' + timestamp: '2024-10-14T23:20:10Z' - name: WhisperDecoder performance_metrics: - torchscript_onnx_tflite: - inference_time: 14243.0 - throughput: 70.20992768377448 + inference_time: 14594.0 + throughput: 68.52131012744964 estimated_peak_memory_range: - min: 5775360 - max: 8650080 + min: 3473408 + max: 5768536 primary_compute_unit: NPU precision: fp16 
layer_info: @@ -394,14 +445,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 983 - job_id: jz57zl3np + job_id: jp8qy688p job_status: Passed torchscript_onnx_qnn: - inference_time: 4268.0 - throughput: 234.30178069353326 + inference_time: 4038.0 + throughput: 247.64735017335315 estimated_peak_memory_range: - min: 3080192 - max: 148081824 + min: 9445376 + max: 218468648 primary_compute_unit: NPU precision: fp16 layer_info: @@ -409,14 +460,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 821 - job_id: j1p3ke0m5 + job_id: j5mnxooqp job_status: Passed torchscript_onnx: - inference_time: 17358.0 - throughput: 57.61032377001959 + inference_time: 32412.0 + throughput: 30.852770578797976 estimated_peak_memory_range: - min: 32768 - max: 1813739296 + min: 98304 + max: 122110128 primary_compute_unit: NPU precision: fp16 layer_info: @@ -424,7 +475,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 844 - job_id: jo5mr3xyg + job_id: jgdx188lp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -433,13 +484,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:30:09Z' + timestamp: '2024-10-14T23:20:07Z' - torchscript_onnx_tflite: - inference_time: 11488.0 - throughput: 87.04735376044569 + inference_time: 12726.0 + throughput: 78.57928650007858 estimated_peak_memory_range: min: 5758976 - max: 98507808 + max: 104316656 primary_compute_unit: NPU precision: fp16 layer_info: @@ -447,14 +498,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 983 - job_id: j0pxv628g + job_id: j5q6qzwmp job_status: Passed torchscript_onnx_qnn: - inference_time: 3153.0 - throughput: 317.1582619727244 + inference_time: 3126.0 + throughput: 319.8976327575176 estimated_peak_memory_range: - min: 21233664 - max: 57602656 + min: 21217280 + max: 59772608 primary_compute_unit: NPU precision: fp16 layer_info: @@ -462,14 +513,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 821 - job_id: j1pv3vkz5 + job_id: jprv3ooeg job_status: Passed torchscript_onnx: - inference_time: 16181.0 - throughput: 61.80087757246153 + inference_time: 14349.0 + throughput: 69.69126768415917 estimated_peak_memory_range: - min: 41398272 - max: 439256992 + min: 39747584 + max: 460106720 primary_compute_unit: NPU precision: fp16 layer_info: @@ -477,7 +528,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 844 - job_id: joprkevv5 + job_id: jg9lnkxlg job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -486,13 +537,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:30:10Z' + timestamp: '2024-10-14T23:20:09Z' - torchscript_onnx_tflite: - inference_time: 13974.0 - throughput: 71.56147130385001 + inference_time: 14054.0 + throughput: 71.15411982353778 estimated_peak_memory_range: - min: 5771264 - max: 8346400 + min: 5779456 + max: 8021800 primary_compute_unit: NPU precision: fp16 layer_info: @@ -500,14 +551,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 983 - job_id: jegn2m8jg + job_id: j56y4rv7p job_status: Passed torchscript_onnx_qnn: - inference_time: 4059.0 - throughput: 246.3661000246366 + inference_time: 4057.0 + throughput: 246.48755237860487 estimated_peak_memory_range: - min: 19922944 - max: 21078744 + min: 21327872 + max: 22539000 primary_compute_unit: NPU precision: fp16 layer_info: @@ -515,7 +566,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 821 - job_id: jz5woq64p + job_id: jp8qy668p job_status: Passed 
reference_device_info: name: QCS8550 (Proxy) @@ -523,14 +574,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:30:00Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:19:54Z' - torchscript_onnx_tflite: - inference_time: 16543.0 - throughput: 60.44852807834129 + inference_time: 14752.0 + throughput: 67.78741865509761 estimated_peak_memory_range: - min: 16384 - max: 85309280 + min: 5828608 + max: 7813864 primary_compute_unit: NPU precision: fp16 layer_info: @@ -538,14 +589,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 983 - job_id: jep289n6p + job_id: jg9lnk8vg job_status: Passed torchscript_onnx_qnn: - inference_time: 4873.0 - throughput: 205.21239482864766 + inference_time: 4109.0 + throughput: 243.3682161109759 estimated_peak_memory_range: - min: 18452480 - max: 55006240 + min: 21270528 + max: 22576728 primary_compute_unit: NPU precision: fp16 layer_info: @@ -553,22 +604,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 821 - job_id: jz57zxrqp + job_id: jgo26oodp job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:30:07Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:19:59Z' - torchscript_onnx_tflite: - inference_time: 14162.0 - throughput: 70.61149555147578 + inference_time: 14219.0 + throughput: 70.3284337857796 estimated_peak_memory_range: - min: 5783552 - max: 8083872 + min: 5795840 + max: 8087320 primary_compute_unit: NPU precision: fp16 layer_info: @@ -576,14 +627,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 983 - job_id: j2p0yl00g + job_id: jgz3d8w65 job_status: Passed torchscript_onnx_qnn: - inference_time: 4142.0 - throughput: 241.42926122646065 + inference_time: 4185.0 + throughput: 238.94862604540023 estimated_peak_memory_range: - min: 21258240 - max: 24414552 + min: 21303296 + max: 22545288 primary_compute_unit: NPU precision: fp16 layer_info: @@ -591,22 +642,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 821 - job_id: jnp10ezn5 + job_id: j56y4rr7p job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:30:02Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:19:57Z' - torchscript_onnx_tflite: - inference_time: 14307.0 - throughput: 69.89585517578807 + inference_time: 14446.0 + throughput: 69.22331441229406 estimated_peak_memory_range: - min: 5758976 - max: 7751376 + min: 5754880 + max: 8122424 primary_compute_unit: NPU precision: fp16 layer_info: @@ -614,14 +665,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 983 - job_id: jogkz3xvg + job_id: jgjvno18g job_status: Passed torchscript_onnx_qnn: - inference_time: 4065.0 - throughput: 246.00246002460025 + inference_time: 4201.0 + throughput: 238.03856224708403 estimated_peak_memory_range: - min: 21262336 - max: 22551160 + min: 21295104 + max: 24611560 primary_compute_unit: NPU precision: fp16 layer_info: @@ -629,22 +680,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 821 - job_id: jz5woq6zp + job_id: j5q6qzzmp job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:30:04Z' + chipset: 
SA8650P Proxy + timestamp: '2024-10-14T23:19:55Z' - torchscript_onnx_tflite: - inference_time: 14282.0 - throughput: 70.01820473323065 + inference_time: 16093.0 + throughput: 62.138818119679364 estimated_peak_memory_range: - min: 5767168 - max: 8360160 + min: 5758976 + max: 96105728 primary_compute_unit: NPU precision: fp16 layer_info: @@ -652,14 +703,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 983 - job_id: j1gln3m2p + job_id: jgo26omdp job_status: Passed torchscript_onnx_qnn: - inference_time: 4024.0 - throughput: 248.5089463220676 + inference_time: 4775.0 + throughput: 209.4240837696335 estimated_peak_memory_range: - min: 21315584 - max: 22656496 + min: 21213184 + max: 61196176 primary_compute_unit: NPU precision: fp16 layer_info: @@ -667,19 +718,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 821 - job_id: jnp10ezk5 + job_id: jgz3d8865 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:20:03Z' + - torchscript_onnx_tflite: + inference_time: 9185.0 + throughput: 108.87316276537834 + estimated_peak_memory_range: + min: 4014080 + max: 56241248 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 983 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 983 + job_id: jp4lrmml5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 2586.0 + throughput: 386.69760247486465 + estimated_peak_memory_range: + min: 21209088 + max: 57488704 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 821 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 821 + job_id: jg9lnkkvg + job_status: Passed + torchscript_onnx: + inference_time: 12009.0 + throughput: 83.27088017320344 + estimated_peak_memory_range: + min: 30158848 + max: 290867904 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 844 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 844 + job_id: j5mnxodwp + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:30:05Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:20:15Z' - torchscript_onnx_qnn: - inference_time: 3663.0 - throughput: 273.000273000273 + inference_time: 3678.0 + throughput: 271.8868950516585 estimated_peak_memory_range: min: 21229568 max: 21229568 @@ -690,14 +794,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 821 - job_id: jlpe9km8g + job_id: jpy13qq4p job_status: Passed torchscript_onnx: - inference_time: 14264.0 - throughput: 70.10656197420079 + inference_time: 14577.0 + throughput: 68.60122110173562 estimated_peak_memory_range: - min: 112201728 - max: 112201728 + min: 112259072 + max: 112259072 primary_compute_unit: NPU precision: fp16 layer_info: @@ -705,7 +809,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 844 - job_id: jqpye61rg + job_id: jgdx18zep job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -714,4 +818,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:30:12Z' + timestamp: '2024-10-14T23:20:11Z' diff --git a/qai_hub_models/models/whisper_small_en/README.md b/qai_hub_models/models/whisper_small_en/README.md index a7227ed0..cb644d65 100644 --- a/qai_hub_models/models/whisper_small_en/README.md +++ 
b/qai_hub_models/models/whisper_small_en/README.md @@ -6,7 +6,7 @@ OpenAI’s Whisper ASR (Automatic Speech Recognition) model is a state-of-the-art system designed for transcribing spoken language into written text. It exhibits robust performance in realistic, noisy environments, making it highly reliable for real-world applications. Specifically, it excels in long-form transcription, capable of accurately transcribing audio clips up to 30 seconds long. Time to the first token is the encoder's latency, while time to each additional token is decoder's latency, where we assume a mean decoded length specified below. This is based on the implementation of Whisper-Small-En found -[here](https://github.com/openai/whisper/tree/main). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/whisper_small_en). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.whisper_small_en.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Whisper-Small-En can be found +* The license for the original implementation of Whisper-Small-En can be found [here](https://github.com/openai/whisper/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Robust Speech Recognition via Large-Scale Weak Supervision](https://cdn.openai.com/papers/whisper.pdf) * [Source Model Implementation](https://github.com/openai/whisper/tree/main) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback, please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
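The common thread across the three export.py diffs in this change (whisper_base_en above, whisper_small_en and whisper_tiny_en below) is the return type: `export_model` now yields a `Mapping` from component name to an `ExportResult` instead of a positional `(CompileJob, ProfileJob, InferenceJob)` tuple. A minimal sketch of what this means at a call site, assuming `ExportResult` exposes its constructor keywords (`compile_job`, `inference_job`, `profile_job`) as attributes; the device name is purely illustrative:

```python
# Minimal sketch of a call site under the new return type. Assumption:
# ExportResult exposes compile_job / profile_job / inference_job as
# attributes, inferred from the keyword arguments used to construct it
# in these diffs.
from qai_hub_models.models.whisper_small_en.export import export_model

# Returns Mapping[str, ExportResult] here; the List[str] branch of the
# signature is not shown in this sketch.
results = export_model(device="Samsung Galaxy S24")

# Old call sites unpacked each component positionally:
#   compile_job, profile_job, inference_job = results["WhisperEncoder"]
encoder = results["WhisperEncoder"]
print(encoder.compile_job)     # always present
print(encoder.profile_job)     # None if profiling was skipped
print(encoder.inference_job)   # None if inferencing was skipped
```

Named fields make the skipped-job `None`s self-describing, and they explain why the `ExportResult` constructor can list `inference_job` before `profile_job` without breaking callers; positional unpacking would have silently swapped the two.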
diff --git a/qai_hub_models/models/whisper_small_en/export.py b/qai_hub_models/models/whisper_small_en/export.py index b36a7bf0..bc0f7e93 100644 --- a/qai_hub_models/models/whisper_small_en/export.py +++ b/qai_hub_models/models/whisper_small_en/export.py @@ -10,14 +10,15 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Mapping, Optional, Tuple, cast +from typing import Any, Dict, List, Mapping, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.whisper_small_en import Model from qai_hub_models.utils.args import export_parser, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel, TargetRuntime +from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -45,20 +46,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Mapping[ - str, Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] -] | List[str]: +) -> Mapping[str, ExportResult] | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -83,10 +82,10 @@ def export_model( `model_cls.from_pretrained` Returns: - A Mapping from component_name to a 3-tuple of: + A Mapping from component_name to a struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "whisper_small_en" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -118,7 +117,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) components_dict: Dict[str, BaseModel] = {} if "WhisperEncoder" in components: @@ -135,7 +134,7 @@ def export_model( component.to("cpu"), make_torch_inputs(input_spec) ) - # 2. Compile the models to an on-device asset + # 2. 
Compiles the model to an asset that can be run on device model_compile_options = component.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -151,7 +150,7 @@ def export_model( hub.client.CompileJob, submitted_compile_job ) - # 3. Profile the model assets on real devices + # 3. Profiles the model performance on a real device profile_jobs: Dict[str, hub.client.ProfileJob] = {} if not skip_profiling: for component_name in components: @@ -169,7 +168,7 @@ def export_model( hub.client.ProfileJob, submitted_profile_job ) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_jobs: Dict[str, hub.client.InferenceJob] = {} if not skip_inferencing: for component_name in components: @@ -193,14 +192,14 @@ def export_model( hub.client.InferenceJob, submitted_inference_job ) - # 5. Download the model assets to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) for component_name, compile_job in compile_jobs.items(): target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / component_name)) - # 6. Summarize the results from profiling and inference + # 6. Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: for component_name in components: profile_job = profile_jobs[component_name] @@ -225,10 +224,10 @@ def export_model( ) return { - component_name: ( - compile_jobs[component_name], - profile_jobs.get(component_name, None), - inference_jobs.get(component_name, None), + component_name: ExportResult( + compile_job=compile_jobs[component_name], + inference_job=inference_jobs.get(component_name, None), + profile_job=profile_jobs.get(component_name, None), ) for component_name in components } diff --git a/qai_hub_models/models/whisper_small_en/perf.yaml b/qai_hub_models/models/whisper_small_en/perf.yaml index b7267838..fc159e91 100644 --- a/qai_hub_models/models/whisper_small_en/perf.yaml +++ b/qai_hub_models/models/whisper_small_en/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: WhisperEncoder performance_metrics: - torchscript_onnx_tflite: - inference_time: 707067.0 - throughput: 1.414293129222549 + inference_time: 696399.0 + throughput: 1.4359584089006445 estimated_peak_memory_range: - min: 18771968 - max: 438129848 + min: 84676608 + max: 495897016 primary_compute_unit: GPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 900 layers_on_cpu: 11 total_layers: 911 - job_id: j0pxv6v8g + job_id: 
jg9lnk4vg job_status: Passed torchscript_onnx_qnn: - inference_time: 1196020.0 - throughput: 0.8361064196250899 + inference_time: 854116.0 + throughput: 1.170801155814901 estimated_peak_memory_range: - min: 53248 - max: 212393776 + min: 49152 + max: 216681440 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,7 +71,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1329 - job_id: j1pv3roz5 + job_id: jgo26o1dp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -82,13 +80,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:28:19Z' + timestamp: '2024-10-14T23:18:17Z' - torchscript_onnx_tflite: - inference_time: 550315.0 - throughput: 1.817141091920082 + inference_time: 549997.0 + throughput: 1.818191735591285 estimated_peak_memory_range: - min: 116359168 - max: 206759088 + min: 116482048 + max: 206660112 primary_compute_unit: GPU precision: fp16 layer_info: @@ -96,14 +94,14 @@ models: layers_on_gpu: 900 layers_on_cpu: 11 total_layers: 911 - job_id: jegn2m2jg + job_id: jgdx18vlp job_status: Passed torchscript_onnx_qnn: - inference_time: 915241.0 - throughput: 1.0926083949473417 + inference_time: 693407.0 + throughput: 1.4421544633959564 estimated_peak_memory_range: min: 0 - max: 467783184 + max: 879347504 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,7 +109,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1329 - job_id: jlpe9w18g + job_id: jgjvno08g + job_status: Passed + torchscript_onnx: + inference_time: 861217.0 + throughput: 1.1611475388897339 + estimated_peak_memory_range: + min: 120385536 + max: 4516540736 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 884 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 884 + job_id: j56y4rq7p job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -120,13 +133,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:28:21Z' + timestamp: '2024-10-14T23:18:19Z' - torchscript_onnx_tflite: - inference_time: 709967.0 - throughput: 1.4085161704698952 + inference_time: 689836.0 + throughput: 1.4496199096596871 estimated_peak_memory_range: - min: 31993856 - max: 434173424 + min: 41590784 + max: 441984352 primary_compute_unit: GPU precision: fp16 layer_info: @@ -134,14 +147,14 @@ models: layers_on_gpu: 900 layers_on_cpu: 11 total_layers: 911 - job_id: jep28986p + job_id: jp4lrmxl5 job_status: Passed torchscript_onnx_qnn: - inference_time: 769338.0 - throughput: 1.2998188052585469 + inference_time: 694520.0 + throughput: 1.4398433450440593 estimated_peak_memory_range: - min: 884736 - max: 2663424 + min: 995328 + max: 2386016 primary_compute_unit: NPU precision: fp16 layer_info: @@ -149,7 +162,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1329 - job_id: jvgdwq965 + job_id: jg9lnk3vg job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -157,14 +170,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:28:25Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:18:02Z' - torchscript_onnx_tflite: - inference_time: 981977.0 - throughput: 1.0183537903637254 + inference_time: 683626.0 + throughput: 1.462788132692437 estimated_peak_memory_range: - min: 93597696 - max: 197478064 + min: 87117824 + max: 489389752 primary_compute_unit: GPU precision: fp16 layer_info: @@ -172,14 +185,14 @@ models: layers_on_gpu: 900 layers_on_cpu: 11 total_layers: 911 
- job_id: j2p0y2q0g + job_id: jp8qy638p job_status: Passed torchscript_onnx_qnn: - inference_time: 1274997.0 - throughput: 0.7843155709385983 + inference_time: 712687.0 + throughput: 1.4031405090874396 estimated_peak_memory_range: - min: 0 - max: 580725648 + min: 1200128 + max: 32224872 primary_compute_unit: NPU precision: fp16 layer_info: @@ -187,22 +200,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1329 - job_id: jqpyejw0g + job_id: j5mnxovqp job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:28:33Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:18:09Z' - torchscript_onnx_tflite: - inference_time: 696264.0 - throughput: 1.4362368297082715 + inference_time: 673361.0 + throughput: 1.485087493929705 estimated_peak_memory_range: - min: 93286400 - max: 506950840 + min: 9539584 + max: 234029624 primary_compute_unit: GPU precision: fp16 layer_info: @@ -210,14 +223,14 @@ models: layers_on_gpu: 900 layers_on_cpu: 11 total_layers: 911 - job_id: jogkzqnvg + job_id: jpy13q44p job_status: Passed torchscript_onnx_qnn: - inference_time: 783882.0 - throughput: 1.2757022102816495 + inference_time: 695416.0 + throughput: 1.4379881969928792 estimated_peak_memory_range: - min: 1531904 - max: 2961680 + min: 872448 + max: 2680560 primary_compute_unit: NPU precision: fp16 layer_info: @@ -225,22 +238,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1329 - job_id: jqp4qdo2g + job_id: jp4lrmjl5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:28:27Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:18:07Z' - torchscript_onnx_tflite: - inference_time: 689203.0 - throughput: 1.4509513162304866 + inference_time: 674483.0 + throughput: 1.4826170563231393 estimated_peak_memory_range: - min: 97169408 - max: 507461688 + min: 115093504 + max: 498595456 primary_compute_unit: GPU precision: fp16 layer_info: @@ -248,14 +261,14 @@ models: layers_on_gpu: 900 layers_on_cpu: 11 total_layers: 911 - job_id: j1gln2z2p + job_id: jprv3o4eg job_status: Passed torchscript_onnx_qnn: - inference_time: 788680.0 - throughput: 1.2679413703910332 + inference_time: 691155.0 + throughput: 1.446853455447765 estimated_peak_memory_range: - min: 1351680 - max: 2577264 + min: 6815744 + max: 8259624 primary_compute_unit: NPU precision: fp16 layer_info: @@ -263,22 +276,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1329 - job_id: jo5mr627g + job_id: jgdx18rlp job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:28:28Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:18:04Z' - torchscript_onnx_tflite: - inference_time: 697737.0 - throughput: 1.4332047748650278 + inference_time: 986018.0 + throughput: 1.014180268514368 estimated_peak_memory_range: - min: 114737152 - max: 499696096 + min: 114905088 + max: 216859760 primary_compute_unit: GPU precision: fp16 layer_info: @@ -286,14 +299,37 @@ models: layers_on_gpu: 900 layers_on_cpu: 11 total_layers: 911 - job_id: j1p3k13m5 + job_id: j5mnxowqp + job_status: Passed + reference_device_info: + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr + os_name: Android + 
manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:18:13Z' + - torchscript_onnx_tflite: + inference_time: 536818.0 + throughput: 1.8628287427023684 + estimated_peak_memory_range: + min: 114487296 + max: 143590224 + primary_compute_unit: GPU + precision: fp16 + layer_info: + layers_on_npu: 0 + layers_on_gpu: 900 + layers_on_cpu: 11 + total_layers: 911 + job_id: j56y4r37p job_status: Passed torchscript_onnx_qnn: - inference_time: 804200.0 - throughput: 1.2434717731907485 + inference_time: 551075.0 + throughput: 1.8146350315292836 estimated_peak_memory_range: - min: 589824 - max: 1792216 + min: 0 + max: 953714048 primary_compute_unit: NPU precision: fp16 layer_info: @@ -301,19 +337,34 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1329 - job_id: joprk2qk5 + job_id: jp8qy6w8p + job_status: Passed + torchscript_onnx: + inference_time: 635179.0 + throughput: 1.5743593538199467 + estimated_peak_memory_range: + min: 123006976 + max: 2907786160 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 884 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 884 + job_id: j5we68mj5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) - os: '13' - form_factor: Auto + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:28:31Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:18:24Z' - torchscript_onnx_qnn: - inference_time: 641243.0 - throughput: 1.5594712145005871 + inference_time: 526589.0 + throughput: 1.8990142217175063 estimated_peak_memory_range: min: 483328 max: 483328 @@ -324,14 +375,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 1329 - job_id: jmg9vy1m5 + job_id: jgz3d8x65 job_status: Passed torchscript_onnx: - inference_time: 1647041.0 - throughput: 0.6071494273670176 + inference_time: 1357587.0 + throughput: 0.7366010428797565 estimated_peak_memory_range: - min: 470417408 - max: 470417408 + min: 470306816 + max: 470306816 primary_compute_unit: NPU precision: fp16 layer_info: @@ -339,7 +390,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 884 - job_id: jw566zln5 + job_id: jgo26oedp job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -348,15 +399,15 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:28:39Z' + timestamp: '2024-10-14T23:18:21Z' - name: WhisperDecoder performance_metrics: - torchscript_onnx_tflite: - inference_time: 25831.0 - throughput: 38.7131740931439 + inference_time: 26563.0 + throughput: 37.64635018634943 estimated_peak_memory_range: - min: 16760832 - max: 20516296 + min: 16764928 + max: 19759256 primary_compute_unit: NPU precision: fp16 layer_info: @@ -364,14 +415,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 2573 - job_id: jo5mr6r7g + job_id: jp14z78lp job_status: Passed torchscript_onnx_qnn: - inference_time: 12110.0 - throughput: 82.57638315441784 + inference_time: 11991.0 + throughput: 83.39588024351598 estimated_peak_memory_range: - min: 58085376 - max: 126345112 + min: 63504384 + max: 132995256 primary_compute_unit: NPU precision: fp16 layer_info: @@ -379,7 +430,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 2255 - job_id: j7gjx2m1p + job_id: jpv6ke1m5 + job_status: Passed + torchscript_onnx: + inference_time: 56190.0 + throughput: 17.796760989499912 + estimated_peak_memory_range: + min: 127217664 + max: 129732616 + primary_compute_unit: NPU + precision: 
fp16 + layer_info: + layers_on_npu: 2302 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 2302 + job_id: jglvmoel5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -388,13 +454,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:28:19Z' + timestamp: '2024-10-14T23:18:17Z' - torchscript_onnx_tflite: - inference_time: 19573.0 - throughput: 51.090788330863944 + inference_time: 21257.0 + throughput: 47.043326904078654 estimated_peak_memory_range: - min: 13574144 - max: 1162869504 + min: 16773120 + max: 1182597392 primary_compute_unit: NPU precision: fp16 layer_info: @@ -402,14 +468,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 2573 - job_id: joprk2kk5 + job_id: j57yrkjr5 job_status: Passed torchscript_onnx_qnn: - inference_time: 9345.0 - throughput: 107.00909577314071 + inference_time: 9770.0 + throughput: 102.35414534288638 estimated_peak_memory_range: - min: 57933824 - max: 144391216 + min: 41037824 + max: 140272512 primary_compute_unit: NPU precision: fp16 layer_info: @@ -417,14 +483,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 2255 - job_id: jygzej94g + job_id: jpedm8r05 job_status: Passed torchscript_onnx: - inference_time: 49160.0 - throughput: 20.34174125305126 + inference_time: 47118.0 + throughput: 21.223311685555416 estimated_peak_memory_range: - min: 90808320 - max: 1527605184 + min: 120029184 + max: 1668877984 primary_compute_unit: NPU precision: fp16 layer_info: @@ -432,7 +498,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 2302 - job_id: j1gln2r2p + job_id: jp3j0xqzg job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -441,13 +507,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:28:37Z' + timestamp: '2024-10-14T23:18:19Z' - torchscript_onnx_tflite: - inference_time: 25314.0 - throughput: 39.50383187169155 + inference_time: 24890.0 + throughput: 40.17677782241864 estimated_peak_memory_range: - min: 16809984 - max: 19070928 + min: 16441344 + max: 19909576 primary_compute_unit: NPU precision: fp16 layer_info: @@ -455,14 +521,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 2573 - job_id: jqpyeje0g + job_id: jpxko3795 job_status: Passed torchscript_onnx_qnn: - inference_time: 11967.0 - throughput: 83.56313194618534 + inference_time: 12378.0 + throughput: 80.78849571820973 estimated_peak_memory_range: - min: 65486848 - max: 66707888 + min: 63721472 + max: 65005912 primary_compute_unit: NPU precision: fp16 layer_info: @@ -470,7 +536,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 2255 - job_id: jz57zlwnp + job_id: jp14z7dlp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -478,14 +544,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:28:25Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:18:02Z' - torchscript_onnx_tflite: - inference_time: 28257.0 - throughput: 35.38946101850869 + inference_time: 25507.0 + throughput: 39.20492413847179 estimated_peak_memory_range: - min: 16764928 - max: 1146454064 + min: 14761984 + max: 17803144 primary_compute_unit: NPU precision: fp16 layer_info: @@ -493,14 +559,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 2573 - job_id: j1p8om9qg + job_id: jgkexolog job_status: Passed torchscript_onnx_qnn: - inference_time: 15726.0 - throughput: 63.588960956377974 + inference_time: 12827.0 + throughput: 77.96055196070787 
estimated_peak_memory_range: - min: 56090624 - max: 153402384 + min: 63680512 + max: 69340872 primary_compute_unit: NPU precision: fp16 layer_info: @@ -508,22 +574,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 2255 - job_id: j2p0y270g + job_id: jgn6vorm5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:28:33Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:18:09Z' - torchscript_onnx_tflite: - inference_time: 25687.0 - throughput: 38.930198154708606 + inference_time: 24833.0 + throughput: 40.26899689928724 estimated_peak_memory_range: - min: 16797696 - max: 19702104 + min: 16093184 + max: 19219080 primary_compute_unit: NPU precision: fp16 layer_info: @@ -531,14 +597,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 2573 - job_id: jn5q8rke5 + job_id: jp0z0d1e5 job_status: Passed torchscript_onnx_qnn: - inference_time: 12374.0 - throughput: 80.81461128171973 + inference_time: 12718.0 + throughput: 78.62871520679352 estimated_peak_memory_range: - min: 63692800 - max: 64958336 + min: 63713280 + max: 69556520 primary_compute_unit: NPU precision: fp16 layer_info: @@ -546,22 +612,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 2255 - job_id: j0pxv6j8g + job_id: jpxko3e95 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:28:27Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:18:07Z' - torchscript_onnx_tflite: - inference_time: 25128.0 - throughput: 39.79624323463865 + inference_time: 25027.0 + throughput: 39.95684660566588 estimated_peak_memory_range: - min: 16789504 - max: 19879008 + min: 16818176 + max: 19555432 primary_compute_unit: NPU precision: fp16 layer_info: @@ -569,14 +635,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 2573 - job_id: jw566zjn5 + job_id: jp2ky47mp job_status: Passed torchscript_onnx_qnn: - inference_time: 12549.0 - throughput: 79.6876245119133 + inference_time: 12546.0 + throughput: 79.70667941973538 estimated_peak_memory_range: - min: 63705088 - max: 65038120 + min: 63721472 + max: 65071592 primary_compute_unit: NPU precision: fp16 layer_info: @@ -584,22 +650,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 2255 - job_id: jegn2myjg + job_id: j57yrkvr5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:28:29Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:18:05Z' - torchscript_onnx_tflite: - inference_time: 25007.0 - throughput: 39.98880313512217 + inference_time: 27823.0 + throughput: 35.94148725874277 estimated_peak_memory_range: - min: 16793600 - max: 19730264 + min: 16879616 + max: 1157599856 primary_compute_unit: NPU precision: fp16 layer_info: @@ -607,14 +673,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 2573 - job_id: jwgoyn015 + job_id: jgn6vo9m5 job_status: Passed torchscript_onnx_qnn: - inference_time: 12702.0 - throughput: 78.72775940796726 + inference_time: 14222.0 + throughput: 70.3135986499789 estimated_peak_memory_range: - min: 63696896 - max: 65109016 + min: 59482112 + max: 167489648 primary_compute_unit: NPU precision: fp16 layer_info: @@ -622,22 +688,60 @@ models: 
layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 2255 - job_id: jep28966p + job_id: jp0z0dee5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:28:31Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:18:13Z' + - torchscript_onnx_tflite: + inference_time: 15389.0 + throughput: 64.98148027812074 + estimated_peak_memory_range: + min: 15761408 + max: 275112272 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 2573 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 2573 + job_id: jp3j0x4zg + job_status: Passed + torchscript_onnx_qnn: + inference_time: 7560.0 + throughput: 132.27513227513228 + estimated_peak_memory_range: + min: 63893504 + max: 204320480 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 2255 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 2255 + job_id: jgkexorog + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:18:25Z' - torchscript_onnx_qnn: - inference_time: 10421.0 - throughput: 95.96008060646771 + inference_time: 10849.0 + throughput: 92.17439395335975 estimated_peak_memory_range: - min: 63696896 - max: 63696896 + min: 63692800 + max: 63692800 primary_compute_unit: NPU precision: fp16 layer_info: @@ -645,14 +749,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 2255 - job_id: jnp10wln5 + job_id: j5we68dj5 job_status: Passed torchscript_onnx: - inference_time: 52241.0 - throughput: 19.142053176623726 + inference_time: 49274.0 + throughput: 20.29467873523562 estimated_peak_memory_range: - min: 242700288 - max: 242700288 + min: 243027968 + max: 243027968 primary_compute_unit: NPU precision: fp16 layer_info: @@ -660,7 +764,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 2302 - job_id: j1p3k12m5 + job_id: jpv6kezm5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -669,4 +773,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:28:39Z' + timestamp: '2024-10-14T23:18:21Z' diff --git a/qai_hub_models/models/whisper_tiny_en/README.md b/qai_hub_models/models/whisper_tiny_en/README.md index e13d6b04..7de45b00 100644 --- a/qai_hub_models/models/whisper_tiny_en/README.md +++ b/qai_hub_models/models/whisper_tiny_en/README.md @@ -6,7 +6,7 @@ OpenAI’s Whisper ASR (Automatic Speech Recognition) model is a state-of-the-art system designed for transcribing spoken language into written text. It exhibits robust performance in realistic, noisy environments, making it highly reliable for real-world applications. Specifically, it excels in long-form transcription, capable of accurately transcribing audio clips up to 30 seconds long. Time to the first token is the encoder's latency, while time to each additional token is decoder's latency, where we assume a mean decoded length specified below. This is based on the implementation of Whisper-Tiny-En found -[here](https://github.com/openai/whisper/tree/main). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. 
More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/whisper_tiny_en). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.whisper_tiny_en.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Whisper-Tiny-En can be found +* The license for the original implementation of Whisper-Tiny-En can be found [here](https://github.com/openai/whisper/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Robust Speech Recognition via Large-Scale Weak Supervision](https://cdn.openai.com/papers/whisper.pdf) * [Source Model Implementation](https://github.com/openai/whisper/tree/main) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback, please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/whisper_tiny_en/export.py b/qai_hub_models/models/whisper_tiny_en/export.py index fa76ff82..b6accf36 100644 --- a/qai_hub_models/models/whisper_tiny_en/export.py +++ b/qai_hub_models/models/whisper_tiny_en/export.py @@ -10,14 +10,15 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Mapping, Optional, Tuple, cast +from typing import Any, Dict, List, Mapping, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.whisper_tiny_en import Model from qai_hub_models.utils.args import export_parser, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel, TargetRuntime +from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -45,20 +46,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Mapping[ - str, Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] -] | List[str]: +) -> Mapping[str, ExportResult] | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6.
Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -83,10 +82,10 @@ def export_model( `model_cls.from_pretrained` Returns: - A Mapping from component_name to a 3-tuple of: + A Mapping from component_name to a struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "whisper_tiny_en" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -118,7 +117,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) components_dict: Dict[str, BaseModel] = {} if "WhisperEncoder" in components: @@ -135,7 +134,7 @@ def export_model( component.to("cpu"), make_torch_inputs(input_spec) ) - # 2. Compile the models to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = component.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -151,7 +150,7 @@ def export_model( hub.client.CompileJob, submitted_compile_job ) - # 3. Profile the model assets on real devices + # 3. Profiles the model performance on a real device profile_jobs: Dict[str, hub.client.ProfileJob] = {} if not skip_profiling: for component_name in components: @@ -169,7 +168,7 @@ def export_model( hub.client.ProfileJob, submitted_profile_job ) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_jobs: Dict[str, hub.client.InferenceJob] = {} if not skip_inferencing: for component_name in components: @@ -193,14 +192,14 @@ def export_model( hub.client.InferenceJob, submitted_inference_job ) - # 5. Download the model assets to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) for component_name, compile_job in compile_jobs.items(): target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / component_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: for component_name in components: profile_job = profile_jobs[component_name] @@ -225,10 +224,10 @@ def export_model( ) return { - component_name: ( - compile_jobs[component_name], - profile_jobs.get(component_name, None), - inference_jobs.get(component_name, None), + component_name: ExportResult( + compile_job=compile_jobs[component_name], + inference_job=inference_jobs.get(component_name, None), + profile_job=profile_jobs.get(component_name, None), ) for component_name in components } @@ -236,7 +235,9 @@ def export_model( def main(): warnings.filterwarnings("ignore") - parser = export_parser(model_cls=Model, components=ALL_COMPONENTS) + parser = export_parser( + model_cls=Model, components=ALL_COMPONENTS, supports_onnx=False + ) args = parser.parse_args() export_model(**vars(args)) diff --git a/qai_hub_models/models/whisper_tiny_en/perf.yaml b/qai_hub_models/models/whisper_tiny_en/perf.yaml index 43570e96..b3f4556c 100644 --- a/qai_hub_models/models/whisper_tiny_en/perf.yaml +++ b/qai_hub_models/models/whisper_tiny_en/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: WhisperEncoder performance_metrics: - torchscript_onnx_tflite: - inference_time: 98932.0 - throughput: 10.107952937371124 + inference_time: 103909.0 + throughput: 9.62380544514912 estimated_peak_memory_range: - min: 14483456 - max: 56238536 + min: 20807680 + max: 91655600 primary_compute_unit: GPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 260 layers_on_cpu: 11 total_layers: 271 - job_id: jep2891qp + job_id: jgjvno27g job_status: Passed torchscript_onnx_qnn: - inference_time: 185163.0 - throughput: 5.400646997510302 + inference_time: 135518.0 + throughput: 7.3790935521480545 estimated_peak_memory_range: - min: 20480 - max: 52464640 + min: 16384 + max: 56853496 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,7 +71,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 313 - job_id: jz5wo3jmp + job_id: j5q6qz37p job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -82,13 +80,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:26:38Z' + timestamp: '2024-10-14T23:16:00Z' - torchscript_onnx_tflite: - inference_time: 83641.0 - throughput: 11.955858968687606 + inference_time: 84725.0 + throughput: 11.802891708468575 estimated_peak_memory_range: - min: 22597632 - max: 49880416 + min: 22761472 + max: 52150160 primary_compute_unit: GPU precision: fp16 layer_info: @@ -96,14 +94,14 
@@ models: layers_on_gpu: 260 layers_on_cpu: 11 total_layers: 271 - job_id: j2p0y2wng + job_id: jgz3d8jz5 job_status: Passed torchscript_onnx_qnn: - inference_time: 144419.0 - throughput: 6.924296664566297 + inference_time: 113054.0 + throughput: 8.845330550002654 estimated_peak_memory_range: min: 12288 - max: 114966512 + max: 195753504 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,7 +109,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 313 - job_id: jnp10wr75 + job_id: j56y4rnvp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -120,13 +118,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:26:40Z' + timestamp: '2024-10-14T23:16:02Z' - torchscript_onnx_tflite: - inference_time: 99583.0 - throughput: 10.04187461715353 + inference_time: 98695.0 + throughput: 10.132225543340594 estimated_peak_memory_range: - min: 20795392 - max: 53352520 + min: 20750336 + max: 104191960 primary_compute_unit: GPU precision: fp16 layer_info: @@ -134,14 +132,14 @@ models: layers_on_gpu: 260 layers_on_cpu: 11 total_layers: 271 - job_id: jogkzq1ng + job_id: jg9lnkyqg job_status: Passed torchscript_onnx_qnn: - inference_time: 152977.0 - throughput: 6.536930388228296 + inference_time: 102692.0 + throughput: 9.737856892455108 estimated_peak_memory_range: - min: 737280 - max: 1978448 + min: 274432 + max: 6031896 primary_compute_unit: NPU precision: fp16 layer_info: @@ -149,7 +147,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 313 - job_id: jnp10wrn5 + job_id: jgjvnoe7g job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -157,14 +155,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:26:43Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:16:06Z' - torchscript_onnx_tflite: - inference_time: 137234.0 - throughput: 7.286823964906656 + inference_time: 100209.0 + throughput: 9.979143589897115 estimated_peak_memory_range: - min: 102400 - max: 35425632 + min: 18350080 + max: 65172280 primary_compute_unit: GPU precision: fp16 layer_info: @@ -172,14 +170,14 @@ models: layers_on_gpu: 260 layers_on_cpu: 11 total_layers: 271 - job_id: j1gln2jmp + job_id: jprv3oevg job_status: Passed torchscript_onnx_qnn: - inference_time: 208922.0 - throughput: 4.7864753352925975 + inference_time: 105009.0 + throughput: 9.522993267243761 estimated_peak_memory_range: - min: 114688 - max: 124039776 + min: 241664 + max: 6122504 primary_compute_unit: NPU precision: fp16 layer_info: @@ -187,22 +185,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 313 - job_id: jep28926p + job_id: jgdx18okp job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:26:51Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:16:13Z' - torchscript_onnx_tflite: - inference_time: 97343.0 - throughput: 10.272952343774078 + inference_time: 105883.0 + throughput: 9.44438672874777 estimated_peak_memory_range: - min: 16031744 - max: 164484336 + min: 6832128 + max: 53498920 primary_compute_unit: GPU precision: fp16 layer_info: @@ -210,14 +208,14 @@ models: layers_on_gpu: 260 layers_on_cpu: 11 total_layers: 271 - job_id: j1p3k1yn5 + job_id: j5mnxo6yp job_status: Passed torchscript_onnx_qnn: - inference_time: 155393.0 - throughput: 6.435296313218742 + inference_time: 103640.0 + throughput: 
9.648784253184099 estimated_peak_memory_range: - min: 94208 - max: 4624184 + min: 704512 + max: 2003848 primary_compute_unit: NPU precision: fp16 layer_info: @@ -225,22 +223,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 313 - job_id: jz57zlqnp + job_id: jg9lnkwqg job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:26:45Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:16:11Z' - torchscript_onnx_tflite: - inference_time: 96250.0 - throughput: 10.38961038961039 + inference_time: 101593.0 + throughput: 9.843197858120146 estimated_peak_memory_range: - min: 14434304 - max: 60693616 + min: 20385792 + max: 119246136 primary_compute_unit: GPU precision: fp16 layer_info: @@ -248,14 +246,14 @@ models: layers_on_gpu: 260 layers_on_cpu: 11 total_layers: 271 - job_id: j1pv3rjr5 + job_id: jp4lrmdq5 job_status: Passed torchscript_onnx_qnn: - inference_time: 157869.0 - throughput: 6.334365834964433 + inference_time: 104738.0 + throughput: 9.547633141744162 estimated_peak_memory_range: - min: 163840 - max: 5613952 + min: 176128 + max: 10930688 primary_compute_unit: NPU precision: fp16 layer_info: @@ -263,22 +261,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 313 - job_id: j0pxv6w8g + job_id: jgz3d8rz5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:26:47Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:16:09Z' - torchscript_onnx_tflite: - inference_time: 112458.0 - throughput: 8.892208646783688 + inference_time: 150469.0 + throughput: 6.6458871927107905 estimated_peak_memory_range: - min: 6311936 - max: 59157856 + min: 20463616 + max: 55366064 primary_compute_unit: GPU precision: fp16 layer_info: @@ -286,14 +284,14 @@ models: layers_on_gpu: 260 layers_on_cpu: 11 total_layers: 271 - job_id: jlpe9wjvg + job_id: jgdx18qkp job_status: Passed torchscript_onnx_qnn: - inference_time: 154185.0 - throughput: 6.48571521224503 + inference_time: 180709.0 + throughput: 5.533758694918349 estimated_peak_memory_range: - min: 229376 - max: 6149440 + min: 106496 + max: 204613008 primary_compute_unit: NPU precision: fp16 layer_info: @@ -301,22 +299,60 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 313 - job_id: jegn2mjjg + job_id: j5mnxo3yp job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:26:49Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:16:17Z' + - torchscript_onnx_tflite: + inference_time: 77729.0 + throughput: 12.86521118244156 + estimated_peak_memory_range: + min: 21049344 + max: 41626208 + primary_compute_unit: GPU + precision: fp16 + layer_info: + layers_on_npu: 0 + layers_on_gpu: 260 + layers_on_cpu: 11 + total_layers: 271 + job_id: jp8qy6zzp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 100989.0 + throughput: 9.902068542118448 + estimated_peak_memory_range: + min: 0 + max: 204384016 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 313 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 313 + job_id: jprv3oyvg + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + 
os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:16:18Z' - torchscript_onnx_qnn: - inference_time: 148682.0 - throughput: 6.725763710469324 + inference_time: 95580.0 + throughput: 10.462439840970914 estimated_peak_memory_range: - min: 491520 - max: 491520 + min: 520192 + max: 520192 primary_compute_unit: NPU precision: fp16 layer_info: @@ -324,7 +360,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 313 - job_id: jz5wo3j4p + job_id: jgo26o34p job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -333,15 +369,15 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:26:41Z' + timestamp: '2024-10-14T23:16:05Z' - name: WhisperDecoder performance_metrics: - torchscript_onnx_tflite: - inference_time: 3793.0 - throughput: 263.6435539151068 + inference_time: 3760.0 + throughput: 265.9574468085106 estimated_peak_memory_range: - min: 7077888 - max: 9559432 + min: 2981888 + max: 5642752 primary_compute_unit: NPU precision: fp16 layer_info: @@ -349,14 +385,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 557 - job_id: jqpyejllg + job_id: jpedm8w75 job_status: Passed torchscript_onnx_qnn: - inference_time: 2387.0 - throughput: 418.93590280687056 + inference_time: 2356.0 + throughput: 424.44821731748726 estimated_peak_memory_range: - min: 16384 - max: 148387760 + min: 2781184 + max: 139470032 primary_compute_unit: NPU precision: fp16 layer_info: @@ -364,22 +400,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 447 - job_id: jmg9vy685 - job_status: Passed - torchscript_onnx: - inference_time: 5347.0 - throughput: 187.02075930428276 - estimated_peak_memory_range: - min: 36864 - max: 79008392 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 462 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 462 - job_id: j1p8omoqg + job_id: jglvmo3e5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -388,13 +409,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:26:53Z' + timestamp: '2024-10-14T23:16:01Z' - torchscript_onnx_tflite: inference_time: 2891.0 throughput: 345.9010722933241 estimated_peak_memory_range: - min: 184320 - max: 227930880 + min: 2994176 + max: 231538704 primary_compute_unit: NPU precision: fp16 layer_info: @@ -402,14 +423,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 557 - job_id: j1p8omnog + job_id: j5we683z5 job_status: Passed torchscript_onnx_qnn: - inference_time: 1717.0 - throughput: 582.4111822947001 + inference_time: 1618.0 + throughput: 618.0469715698393 estimated_peak_memory_range: - min: 4624384 - max: 26259952 + min: 4628480 + max: 28286880 primary_compute_unit: NPU precision: fp16 layer_info: @@ -417,22 +438,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 447 - job_id: jvgdwqjz5 - job_status: Passed - torchscript_onnx: - inference_time: 4145.0 - throughput: 241.25452352231605 - estimated_peak_memory_range: - min: 995328 - max: 401520432 - primary_compute_unit: NPU - precision: fp16 - layer_info: - layers_on_npu: 462 - layers_on_gpu: 0 - layers_on_cpu: 0 - total_layers: 462 - job_id: jn5q8r8e5 + job_id: jp3j0xexg job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -441,13 +447,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:26:54Z' + timestamp: '2024-10-14T23:16:03Z' - torchscript_onnx_tflite: - inference_time: 
4198.0 - throughput: 238.20867079561697 + inference_time: 3718.0 + throughput: 268.9618074233459 estimated_peak_memory_range: min: 2985984 - max: 5534288 + max: 5145512 primary_compute_unit: NPU precision: fp16 layer_info: @@ -455,14 +461,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 557 - job_id: jn5q8rno5 + job_id: jp14z7wkp job_status: Passed torchscript_onnx_qnn: - inference_time: 2228.0 - throughput: 448.8330341113106 + inference_time: 2284.0 + throughput: 437.82837127845886 estimated_peak_memory_range: - min: 10661888 - max: 12492272 + min: 10674176 + max: 11965008 primary_compute_unit: NPU precision: fp16 layer_info: @@ -470,7 +476,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 447 - job_id: jvgdwqj65 + job_id: jpedm8k75 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -478,14 +484,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:26:44Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:16:07Z' - torchscript_onnx_tflite: - inference_time: 4213.0 - throughput: 237.36055067647757 + inference_time: 3651.0 + throughput: 273.8975623116954 estimated_peak_memory_range: - min: 2973696 - max: 226398544 + min: 2977792 + max: 5085032 primary_compute_unit: NPU precision: fp16 layer_info: @@ -493,14 +499,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 557 - job_id: jw566zky5 + job_id: jp2ky4lxp job_status: Passed torchscript_onnx_qnn: - inference_time: 2639.0 - throughput: 378.931413414172 + inference_time: 2233.0 + throughput: 447.82803403493057 estimated_peak_memory_range: - min: 7266304 - max: 29957504 + min: 10657792 + max: 12672312 primary_compute_unit: NPU precision: fp16 layer_info: @@ -508,22 +514,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 447 - job_id: jqpyej90g + job_id: j57yrkxq5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:26:51Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:16:13Z' - torchscript_onnx_tflite: - inference_time: 3777.0 - throughput: 264.76039184537996 + inference_time: 3786.0 + throughput: 264.1310089804543 estimated_peak_memory_range: - min: 2977792 - max: 4900008 + min: 2981888 + max: 4973504 primary_compute_unit: NPU precision: fp16 layer_info: @@ -531,14 +537,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 557 - job_id: jwgoynjk5 + job_id: jgn6vo3v5 job_status: Passed torchscript_onnx_qnn: - inference_time: 2162.0 - throughput: 462.53469010175763 + inference_time: 2226.0 + throughput: 449.23629829290206 estimated_peak_memory_range: - min: 2265088 - max: 3560360 + min: 10694656 + max: 11899816 primary_compute_unit: NPU precision: fp16 layer_info: @@ -546,22 +552,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 447 - job_id: jqp4qdz2g + job_id: jp14z7ekp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:26:45Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:16:11Z' - torchscript_onnx_tflite: - inference_time: 3795.0 - throughput: 263.5046113306983 + inference_time: 3644.0 + throughput: 274.423710208562 estimated_peak_memory_range: - min: 2998272 - max: 8673408 + min: 2981888 + max: 5035200 primary_compute_unit: NPU precision: fp16 
layer_info: @@ -569,14 +575,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 557 - job_id: j7gjx2jep + job_id: jpxko36j5 job_status: Passed torchscript_onnx_qnn: - inference_time: 2170.0 - throughput: 460.8294930875576 + inference_time: 2297.0 + throughput: 435.35045711798 estimated_peak_memory_range: - min: 10661888 - max: 12549080 + min: 10674176 + max: 12067440 primary_compute_unit: NPU precision: fp16 layer_info: @@ -584,22 +590,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 447 - job_id: jo5mr6j7g + job_id: j5we68qz5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:26:47Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:16:09Z' - torchscript_onnx_tflite: - inference_time: 3796.0 - throughput: 263.43519494204423 + inference_time: 4266.0 + throughput: 234.4116268166901 estimated_peak_memory_range: - min: 2994176 - max: 5048120 + min: 2973696 + max: 228556224 primary_compute_unit: NPU precision: fp16 layer_info: @@ -607,14 +613,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 557 - job_id: jygzej1xg + job_id: j57yrklq5 job_status: Passed torchscript_onnx_qnn: - inference_time: 2193.0 - throughput: 455.99635202918375 + inference_time: 2741.0 + throughput: 364.8303538854433 estimated_peak_memory_range: - min: 4648960 - max: 5961392 + min: 10637312 + max: 37286256 primary_compute_unit: NPU precision: fp16 layer_info: @@ -622,22 +628,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 447 - job_id: joprk2zk5 + job_id: jgn6voev5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:26:49Z' - - torchscript_onnx_qnn: - inference_time: 2061.0 - throughput: 485.201358563804 + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:16:17Z' + - torchscript_onnx_tflite: + inference_time: 2429.0 + throughput: 411.6920543433512 estimated_peak_memory_range: - min: 10629120 - max: 10629120 + min: 1028096 + max: 32313024 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 557 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 557 + job_id: jgkexo3yg + job_status: Passed + torchscript_onnx_qnn: + inference_time: 1394.0 + throughput: 717.3601147776184 + estimated_peak_memory_range: + min: 10620928 + max: 35570128 primary_compute_unit: NPU precision: fp16 layer_info: @@ -645,22 +666,30 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 447 - job_id: jmg9vy6m5 + job_id: jp2ky4mxp job_status: Passed - torchscript_onnx: - inference_time: 4503.0 - throughput: 222.0741727737064 + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:16:19Z' + - torchscript_onnx_qnn: + inference_time: 2056.0 + throughput: 486.38132295719845 estimated_peak_memory_range: - min: 77918208 - max: 77918208 + min: 10629120 + max: 10629120 primary_compute_unit: NPU precision: fp16 layer_info: - layers_on_npu: 462 + layers_on_npu: 447 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 462 - job_id: jw566z6n5 + total_layers: 447 + job_id: jpv6kev75 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -669,4 +698,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: 
Snapdragon® X Elite - timestamp: '2024-09-25T11:26:56Z' + timestamp: '2024-10-14T23:16:05Z' diff --git a/qai_hub_models/models/wideresnet50/README.md index 4e152e5b..30dcddb5 100644 --- a/qai_hub_models/models/wideresnet50/README.md +++ b/qai_hub_models/models/wideresnet50/README.md @@ -6,7 +6,7 @@ WideResNet50 is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of WideResNet50 found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/wideresnet50). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.wideresnet50.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of WideResNet50 can be found +* The license for the original implementation of WideResNet50 can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Wide Residual Networks](https://arxiv.org/abs/1605.07146) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
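For contrast with the per-component mapping returned by the Whisper export above, the single-model `export_model` in the `export.py` diff that follows now returns one `ExportResult`. A hedged sketch of using the skip flags from the signature shown (per the return annotation, a plain list of messages comes back when AI Hub access is unavailable):

```python
# Sketch: compile-only export of WideResNet50, skipping the optional steps.
from qai_hub_models.models.wideresnet50.export import export_model

result = export_model(
    skip_profiling=True,    # step 3 of the recipe
    skip_inferencing=True,  # step 4
    skip_downloading=True,  # step 5
    skip_summary=True,      # step 6
)

if not isinstance(result, list):
    print(result.compile_job)  # jobs for skipped steps come back as None
    assert result.profile_job is None and result.inference_job is None
```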
diff --git a/qai_hub_models/models/wideresnet50/export.py index 16cb1e89..368a28e6 100644 --- a/qai_hub_models/models/wideresnet50/export.py +++ b/qai_hub_models/models/wideresnet50/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.wideresnet50 import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "wideresnet50" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) ) @@ -120,7 +118,7 @@ # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6.
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/wideresnet50/perf.yaml b/qai_hub_models/models/wideresnet50/perf.yaml index 706e9b87..46060da9 100644 --- a/qai_hub_models/models/wideresnet50/perf.yaml +++ b/qai_hub_models/models/wideresnet50/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: WideResNet50 performance_metrics: - torchscript_onnx_tflite: - inference_time: 4868.0 - throughput: 205.42317173377157 + inference_time: 4887.0 + throughput: 204.62451401677922 estimated_peak_memory_range: - min: 28672 - max: 2161232 + min: 24576 + max: 2241528 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 79 - job_id: j2p0y28ng + job_id: j56y4revp job_status: Passed torchscript_onnx_qnn: - inference_time: 5710.0 - throughput: 175.13134851138355 + inference_time: 5677.0 + throughput: 176.14937466971992 estimated_peak_memory_range: - min: 618496 - max: 216999120 + min: 622592 + max: 362802736 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: jwgoynxk5 + job_id: jp14z7ykp job_status: Passed torchscript_onnx: - inference_time: 5515.0 - throughput: 181.32366273798732 + inference_time: 5217.0 + throughput: 191.68104274487254 estimated_peak_memory_range: - min: 16384 - max: 169289768 + min: 634880 + max: 2575704 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 128 - job_id: jvgdwqkz5 + job_id: jp0z0d225 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:25:55Z' + timestamp: '2024-10-14T23:15:13Z' - torchscript_onnx_tflite: - inference_time: 4001.0 - throughput: 249.93751562109472 + inference_time: 3989.0 + throughput: 250.68939583855604 estimated_peak_memory_range: - min: 16384 - max: 104623104 + min: 12288 + max: 105887760 primary_compute_unit: NPU 
precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 79 - job_id: j1p8omdog + job_id: jp3j0xvxg job_status: Passed torchscript_onnx_qnn: - inference_time: 4546.0 - throughput: 219.9736031676199 + inference_time: 4603.0 + throughput: 217.24961981316534 estimated_peak_memory_range: - min: 647168 - max: 27068912 + min: 618496 + max: 28515632 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: j1pv3r8r5 + job_id: jgdx18ekp job_status: Passed torchscript_onnx: - inference_time: 4538.0 - throughput: 220.36139268400177 + inference_time: 4293.0 + throughput: 232.93733985557884 estimated_peak_memory_range: - min: 638976 - max: 106582320 + min: 0 + max: 111389648 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 128 - job_id: jz57zlm9p + job_id: jp8qy6mzp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:25:56Z' + timestamp: '2024-10-14T23:15:14Z' - torchscript_onnx_tflite: - inference_time: 4855.0 - throughput: 205.97322348094747 + inference_time: 4847.0 + throughput: 206.31318341242005 estimated_peak_memory_range: - min: 24576 - max: 19454792 + min: 16384 + max: 2006712 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 79 - job_id: jogkzqwng + job_id: jgo26ok4p job_status: Passed torchscript_onnx_qnn: - inference_time: 4904.0 - throughput: 203.9151712887439 + inference_time: 5026.0 + throughput: 198.96538002387584 estimated_peak_memory_range: min: 634880 - max: 2287808 + max: 1895920 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: jlpe9wqvg + job_id: jp4lrmkq5 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:25:50Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:15:06Z' - torchscript_onnx_tflite: - inference_time: 7110.0 - throughput: 140.64697609001408 + inference_time: 4872.0 + throughput: 205.2545155993432 estimated_peak_memory_range: - min: 32768 - max: 94466032 + min: 28672 + max: 2231992 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 79 - job_id: jn5q8rxo5 + job_id: jgz3d8oz5 job_status: Passed torchscript_onnx_qnn: - inference_time: 7278.0 - throughput: 137.40038472107722 + inference_time: 5033.0 + throughput: 198.68865487780647 estimated_peak_memory_range: - min: 618496 - max: 22598112 + min: 659456 + max: 1918008 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: jnp10w975 + job_id: jgn6vomv5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:25:55Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:15:09Z' - torchscript_onnx_tflite: - inference_time: 4870.0 - throughput: 205.3388090349076 + inference_time: 4860.0 + throughput: 
205.76131687242798 estimated_peak_memory_range: - min: 36864 - max: 2265576 + min: 16384 + max: 2013600 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 79 - job_id: j1gln2dmp + job_id: jpedm8e75 job_status: Passed torchscript_onnx_qnn: - inference_time: 4903.0 - throughput: 203.95676116663267 + inference_time: 5031.0 + throughput: 198.76764062810574 estimated_peak_memory_range: - min: 626688 - max: 1802184 + min: 655360 + max: 2032392 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: jygzej6xg + job_id: j5mnxoqyp job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:25:51Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:15:08Z' - torchscript_onnx_tflite: - inference_time: 4870.0 - throughput: 205.3388090349076 + inference_time: 4878.0 + throughput: 205.0020500205002 estimated_peak_memory_range: - min: 16384 - max: 24783952 + min: 28672 + max: 1515456 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 79 - job_id: jw566zxy5 + job_id: jgjvnoz7g job_status: Passed torchscript_onnx_qnn: - inference_time: 4915.0 - throughput: 203.4587995930824 + inference_time: 5029.0 + throughput: 198.8466892026248 estimated_peak_memory_range: - min: 327680 - max: 1578048 + min: 643072 + max: 2388064 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: jz5wo3kmp + job_id: jpxko3nj5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:25:52Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:15:07Z' - torchscript_onnx_tflite: - inference_time: 4855.0 - throughput: 205.97322348094747 + inference_time: 7138.0 + throughput: 140.09526478005043 estimated_peak_memory_range: - min: 16384 - max: 2315176 + min: 24576 + max: 94402656 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 79 - job_id: j1p3k1dn5 + job_id: jpv6ke075 job_status: Passed torchscript_onnx_qnn: - inference_time: 4847.0 - throughput: 206.31318341242005 + inference_time: 7222.0 + throughput: 138.46579894765992 estimated_peak_memory_range: - min: 634880 - max: 1843304 + min: 638976 + max: 25920800 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: jmg9vyr85 + job_id: jp2ky49xp job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:15:11Z' + - torchscript_onnx_tflite: + inference_time: 3063.0 + throughput: 326.47730982696703 + estimated_peak_memory_range: + min: 12288 + max: 33038080 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 79 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 79 + job_id: jg9lnkjqg + job_status: Passed + torchscript_onnx_qnn: + inference_time: 4069.0 + throughput: 245.7606291472106 + 
estimated_peak_memory_range: + min: 0 + max: 26973264 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 126 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 126 + job_id: jpy13qjrp + job_status: Passed + torchscript_onnx: + inference_time: 3520.0 + throughput: 284.09090909090907 + estimated_peak_memory_range: + min: 0 + max: 38708432 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 128 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 128 + job_id: jglvmo2e5 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:25:53Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:15:17Z' - torchscript_onnx_qnn: - inference_time: 4696.0 - throughput: 212.94718909710392 + inference_time: 4938.0 + throughput: 202.5111381125962 estimated_peak_memory_range: min: 602112 max: 602112 @@ -354,14 +405,14 @@ layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 126 - job_id: j7gjx29ep + job_id: j57yrk0q5 job_status: Passed torchscript_onnx: - inference_time: 5118.0 - throughput: 195.38882375928097 + inference_time: 4711.0 + throughput: 212.26915729144557 estimated_peak_memory_range: - min: 139382784 - max: 139382784 + min: 139501568 + max: 139501568 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 128 - job_id: jqp4qd71g + job_id: jgkexoqyg job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:25:57Z' + timestamp: '2024-10-14T23:15:15Z'
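The metrics in the perf.yaml entries above are related in a simple way: `throughput` is inferences per second, i.e. 1,000,000 divided by `inference_time`, which is consistent with `inference_time` being recorded in microseconds. A quick check against the Samsung Galaxy S23 TFLite entry for WideResNet50 above:

```python
# Sanity check: throughput = 1e6 / inference_time (microseconds).
inference_time_us = 4887.0  # WideResNet50, torchscript_onnx_tflite, Galaxy S23
throughput = 1_000_000 / inference_time_us
print(throughput)  # 204.62451401677922, matching the recorded value
```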
diff --git a/qai_hub_models/models/wideresnet50_quantized/README.md index 1f6f16ad..a64ee173 100644 --- a/qai_hub_models/models/wideresnet50_quantized/README.md +++ b/qai_hub_models/models/wideresnet50_quantized/README.md @@ -6,7 +6,7 @@ WideResNet50 is a machine learning model that can classify images from the Imagenet dataset. It can also be used as a backbone in building more complex models for specific use cases. This is based on the implementation of WideResNet50-Quantized found -[here](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/wideresnet50_quantized). @@ -17,11 +17,6 @@ across various devices can be found [here](https://aihub.qualcomm.com/models/w ## Example & Usage -Install the package via pip: -```bash -pip install "qai_hub_models[wideresnet50_quantized]" -``` - Once installed, run the following simple CLI demo: @@ -44,15 +39,19 @@ python -m qai_hub_models.models.wideresnet50_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of WideResNet50-Quantized can be found +* The license for the original implementation of WideResNet50-Quantized can be found [here](https://github.com/pytorch/vision/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Wide Residual Networks](https://arxiv.org/abs/1605.07146) * [Source Model Implementation](https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/wideresnet50_quantized/evaluate.py index 232037a3..865f4679 100644 --- a/qai_hub_models/models/wideresnet50_quantized/evaluate.py +++ b/qai_hub_models/models/wideresnet50_quantized/evaluate.py @@ -13,10 +13,8 @@ from qai_hub_models.models.wideresnet50_quantized import MODEL_ID, Model from qai_hub_models.utils.args import evaluate_parser, get_hub_device, get_model_kwargs -from qai_hub_models.utils.base_model import BaseModel from qai_hub_models.utils.evaluate import evaluate_on_dataset from qai_hub_models.utils.inference import compile_model_from_args -from qai_hub_models.utils.quantization_aimet import AIMETQuantizableMixin SUPPORTED_DATASETS = ["imagenette", "imagenet"] @@ -27,6 +25,7 @@ def main(): model_cls=Model, default_split_size=2500, supported_datasets=SUPPORTED_DATASETS, + is_hub_quantized=True, ) args = parser.parse_args() args.device = None @@ -38,13 +37,7 @@ def main(): MODEL_ID, args, get_model_kwargs(Model, vars(args)) ) hub_device = get_hub_device(None, args.chipset) - - # Use Fp16 model for torch inference - for cls in Model.__mro__: - if issubclass(cls, BaseModel) and not issubclass(cls, AIMETQuantizableMixin): - torch_cls = cls - break - torch_model = torch_cls.from_pretrained(**get_model_kwargs(torch_cls, vars(args))) + torch_model = Model.from_pretrained(**get_model_kwargs(Model, vars(args))) evaluate_on_dataset( hub_model, torch_model, diff --git a/qai_hub_models/models/wideresnet50_quantized/export.py index 7588c26d..47865485 100644 --- a/qai_hub_models/models/wideresnet50_quantized/export.py +++ b/qai_hub_models/models/wideresnet50_quantized/export.py @@ -10,18 +10,20 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.wideresnet50_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference +from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( print_inference_metrics, print_on_target_demo_cmd, @@ -31,11 +33,14 @@ can_access_qualcomm_ai_hub, export_without_hub_access, ) +from qai_hub_models.utils.quantization import get_calibration_data def export_model( device: str = "Samsung Galaxy S23 (Family)", chipset: Optional[str] = None, +
num_calibration_samples: int = 100, + skip_compiling: bool = False, skip_profiling: bool = False, skip_inferencing: bool = False, skip_downloading: bool = False, @@ -45,20 +50,19 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + 3. Compiles the model to an asset that can be run on device + 4. Profiles the model performance on a real device + 5. Inferences the model on sample inputs + 6. Downloads the model asset to the local directory + 7. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 5 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -66,6 +70,9 @@ def export_model( Defaults to DEFAULT_DEVICE if not specified. chipset: If set, will choose a random device with this chipset. Overrides the `device` argument. + num_calibration_samples: The number of calibration data samples + to use for quantization. + skip_compiling: If set, skips compiling model to format that can run on device. skip_profiling: If set, skips profiling of compiled model on real devices. skip_inferencing: If set, skips computing on-device outputs from sample data. skip_downloading: If set, skips downloading of compiled model. @@ -80,10 +87,11 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: - * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). + A struct of: + * A CompileJob object containing metadata about the compile job submitted to hub (None if compiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). + * A QuantizeJob object containing metadata about the quantize job submitted to hub """ model_name = "wideresnet50_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,33 +117,52 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. 
Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) ) # Trace the model - source_model = model.convert_to_hub_source_model( - target_runtime, output_path, input_spec + source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) + + print(f"Quantizing model {model_name} with {num_calibration_samples} samples.") + # 2. Converts the PyTorch model to ONNX and quantizes the ONNX model. + onnx_compile_job = hub.submit_compile_job( + model=source_model, + input_specs=input_spec, + device=hub_device, + name=model_name, + options="--target_runtime onnx", + ) + quantize_job = hub.submit_quantize_job( + model=onnx_compile_job.get_target_model(), + calibration_data=get_calibration_data( + input_spec, "imagenette", num_calibration_samples + ), + weights_dtype=model.get_weights_dtype(), + activations_dtype=model.get_activations_dtype(), + name=model_name, + options=model.get_quantize_options(), ) + if skip_compiling: + return ExportResult(quantize_job=quantize_job) - # 2. Compile the model to an on-device asset + # 3. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) print(f"Optimizing model {model_name} to run on-device") submitted_compile_job = hub.submit_compile_job( - model=source_model, + model=quantize_job.get_target_model(), input_specs=input_spec, device=hub_device, name=model_name, - calibration_data=model.get_calibration_data(target_runtime), options=model_compile_options, ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 4. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +177,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 5. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +198,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 6. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 7. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -198,12 +225,17 @@ if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + quantize_job=quantize_job, + ) def main(): warnings.filterwarnings("ignore") - parser = export_parser(model_cls=Model) + parser = export_parser(model_cls=Model, is_hub_quantized=True) args = parser.parse_args() export_model(**vars(args))
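The export path above now quantizes on AI Hub rather than loading AIMET pre-quantized checkpoints (the `model.py` diff that follows shrinks the quantizable class to a `HubQuantizableMixin` subclass). A condensed sketch of the job chain, using only calls that appear in the diff; the device name is illustrative and error handling is omitted:

```python
# Sketch: trace -> compile to ONNX on Hub -> quantize -> compile for device.
import qai_hub as hub
import torch

from qai_hub_models.models.wideresnet50_quantized import Model
from qai_hub_models.utils.input_spec import make_torch_inputs
from qai_hub_models.utils.quantization import get_calibration_data

model = Model.from_pretrained()
input_spec = model.get_input_spec()
traced = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec))

device = hub.Device("Samsung Galaxy S23")
onnx_job = hub.submit_compile_job(
    model=traced,
    input_specs=input_spec,
    device=device,
    options="--target_runtime onnx",
)
quantize_job = hub.submit_quantize_job(
    model=onnx_job.get_target_model(),
    calibration_data=get_calibration_data(input_spec, "imagenette", 100),
    weights_dtype=model.get_weights_dtype(),
    activations_dtype=model.get_activations_dtype(),
    options=model.get_quantize_options(),
)
# With skip_compiling=True, export_model stops here and returns
# ExportResult(quantize_job=quantize_job); otherwise the quantized
# model feeds the regular device compile step below.
compile_job = hub.submit_compile_job(
    model=quantize_job.get_target_model(),
    input_specs=input_spec,
    device=device,
)
```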
- """ - model = WideResNet50.from_pretrained() - input_shape = cls.get_input_spec()["image_tensor"][0] - model = prepare_model(model) - dummy_input = torch.rand(input_shape) - - pairs = fold_all_batch_norms(model, input_shape, dummy_input) - equalize_bn_folded_model(model, input_shape, pairs, dummy_input) - sim = QuantizationSimModel( - model, - quant_scheme="tf_enhanced", - default_param_bw=8, - default_output_bw=8, - config_file=get_default_aimet_config(), - dummy_input=dummy_input, - ) - constrain_quantized_inputs_to_image_range(sim) - if aimet_encodings: - if aimet_encodings == "DEFAULT": - aimet_encodings = CachedWebModelAsset.from_asset_store( - MODEL_ID, MODEL_ASSET_VERSION, DEFAULT_ENCODINGS - ).fetch() - load_encodings_to_sim(sim, aimet_encodings) - return cls(sim) +class WideResNet50Quantizable(HubQuantizableMixin, WideResNet50): + pass diff --git a/qai_hub_models/models/wideresnet50_quantized/perf.yaml b/qai_hub_models/models/wideresnet50_quantized/perf.yaml index b5dc4abc..93a078e8 100644 --- a/qai_hub_models/models/wideresnet50_quantized/perf.yaml +++ b/qai_hub_models/models/wideresnet50_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,44 +20,39 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8775P Proxy models: - name: WideResNet50-Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 1778.0 - throughput: 562.429696287964 + inference_time: 1779.0 + throughput: 562.1135469364812 estimated_peak_memory_range: - min: 12288 - max: 2215384 + min: 36864 + max: 611655808 primary_compute_unit: NPU precision: int8 layer_info: @@ -64,29 +60,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: jqpyejylg + job_id: j57y2dvl5 job_status: Passed torchscript_onnx_qnn: - inference_time: 2038.0 - throughput: 490.6771344455348 + inference_time: 2025.0 + throughput: 493.82716049382714 estimated_peak_memory_range: - min: 16384 - max: 671168640 + min: 12288 + max: 145082312 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: j1pv3ryr5 + total_layers: 127 + job_id: j5q60294p job_status: Passed torchscript_onnx: - inference_time: 2795.0 - throughput: 357.78175313059035 + inference_time: 2468.0 + throughput: 405.1863857374392 estimated_peak_memory_range: min: 12288 - max: 85652800 + max: 87343304 primary_compute_unit: NPU precision: int8 layer_info: @@ -94,7 +90,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: jqp4qd61g + job_id: jp142832p job_status: 
Passed reference_device_info: name: Samsung Galaxy S23 @@ -103,13 +99,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:25:12Z' + timestamp: '2024-10-17T17:14:16Z' - torchscript_onnx_tflite: - inference_time: 1345.0 - throughput: 743.4944237918215 + inference_time: 1403.0 + throughput: 712.7583749109052 estimated_peak_memory_range: - min: 16384 - max: 58881312 + min: 12288 + max: 61732864 primary_compute_unit: NPU precision: int8 layer_info: @@ -117,29 +113,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: j2p0y2xng + job_id: jp4lnwjv5 job_status: Passed torchscript_onnx_qnn: - inference_time: 1674.0 - throughput: 597.3715651135007 + inference_time: 1682.0 + throughput: 594.5303210463734 estimated_peak_memory_range: min: 167936 - max: 19094240 + max: 21864528 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: j7gjx26ep + total_layers: 127 + job_id: jglv4ke85 job_status: Passed torchscript_onnx: - inference_time: 2047.0 - throughput: 488.5197850512946 + inference_time: 1830.0 + throughput: 546.448087431694 estimated_peak_memory_range: - min: 24576 - max: 89421392 + min: 32768 + max: 92240000 primary_compute_unit: NPU precision: int8 layer_info: @@ -147,7 +143,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: j0pxv68lg + job_id: jgdxnv0ep job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -156,13 +152,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:25:13Z' + timestamp: '2024-10-17T17:14:17Z' - torchscript_onnx_tflite: - inference_time: 1772.0 - throughput: 564.3340857787811 + inference_time: 7792.0 + throughput: 128.33675564681724 estimated_peak_memory_range: - min: 28672 - max: 622465936 + min: 12288 + max: 30620560 primary_compute_unit: NPU precision: int8 layer_info: @@ -170,37 +166,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: j1p8omkog + job_id: jpxk91e15 job_status: Passed torchscript_onnx_qnn: - inference_time: 1882.0 - throughput: 531.3496280552604 + inference_time: 9311.0 + throughput: 107.3998496402105 estimated_peak_memory_range: - min: 180224 - max: 1470720 + min: 200704 + max: 8266832 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jygzejqxg + total_layers: 127 + job_id: j56y21q0p job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:25:06Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-17T17:14:02Z' - torchscript_onnx_tflite: - inference_time: 2175.0 - throughput: 459.7701149425287 + inference_time: 23600.0 + throughput: 42.3728813559322 estimated_peak_memory_range: - min: 20480 - max: 61780080 + min: 196608 + max: 2337856 primary_compute_unit: NPU precision: int8 layer_info: @@ -208,37 +204,60 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: jogkzqkng + job_id: j5mnezvwp + job_status: Passed + reference_device_info: + name: RB5 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8250 Proxy + timestamp: '2024-10-17T17:13:47Z' + - torchscript_onnx_tflite: + inference_time: 1768.0 + throughput: 565.6108597285067 + 
estimated_peak_memory_range: + min: 12288 + max: 2217072 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 82 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 82 + job_id: jgn60err5 job_status: Passed torchscript_onnx_qnn: - inference_time: 2463.0 - throughput: 406.00893219650834 + inference_time: 1920.0 + throughput: 520.8333333333334 estimated_peak_memory_range: - min: 167936 - max: 22571312 + min: 204800 + max: 1626464 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jvgdwqyz5 + total_layers: 127 + job_id: jp3jnmqlg job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:25:11Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-17T17:14:03Z' - torchscript_onnx_tflite: - inference_time: 1772.0 - throughput: 564.3340857787811 + inference_time: 1779.0 + throughput: 562.1135469364812 estimated_peak_memory_range: - min: 24576 - max: 1592920 + min: 12288 + max: 28815656 primary_compute_unit: NPU precision: int8 layer_info: @@ -246,37 +265,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: jn5q8rdo5 + job_id: jprv6y19g job_status: Passed torchscript_onnx_qnn: - inference_time: 1887.0 - throughput: 529.9417064122946 + inference_time: 1916.0 + throughput: 521.9206680584551 estimated_peak_memory_range: - min: 180224 - max: 1435328 + min: 176128 + max: 1507952 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jz5wo30mp + total_layers: 127 + job_id: jpv6qwzj5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:25:07Z' + chipset: SA8255P Proxy + timestamp: '2024-10-17T17:14:06Z' - torchscript_onnx_tflite: - inference_time: 1773.0 - throughput: 564.0157924421884 + inference_time: 1775.0 + throughput: 563.3802816901408 estimated_peak_memory_range: - min: 16384 - max: 16234344 + min: 32768 + max: 1598432 primary_compute_unit: NPU precision: int8 layer_info: @@ -284,22 +303,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: j1gln29mp + job_id: jp2kxm34p job_status: Passed torchscript_onnx_qnn: - inference_time: 1890.0 - throughput: 529.1005291005291 + inference_time: 1924.0 + throughput: 519.7505197505197 estimated_peak_memory_range: - min: 192512 - max: 1639712 + min: 184320 + max: 1685552 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jmg9vy785 + total_layers: 127 + job_id: jgjvdlkxg job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -307,14 +326,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:25:08Z' + chipset: SA8775P Proxy + timestamp: '2024-10-17T17:14:08Z' - torchscript_onnx_tflite: - inference_time: 1774.0 - throughput: 563.6978579481398 + inference_time: 2167.0 + throughput: 461.4674665436087 estimated_peak_memory_range: min: 12288 - max: 87019352 + max: 64212992 primary_compute_unit: NPU precision: int8 layer_info: @@ -322,37 +341,37 @@ models: layers_on_gpu: 0 
layers_on_cpu: 0 total_layers: 82 - job_id: jw566z9y5 + job_id: jpy1zdv7p job_status: Passed torchscript_onnx_qnn: - inference_time: 1928.0 - throughput: 518.6721991701245 + inference_time: 2463.0 + throughput: 406.00893219650834 estimated_peak_memory_range: - min: 16384 - max: 1728576 + min: 167936 + max: 23012416 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jnp10wk75 + total_layers: 127 + job_id: jpedov415 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:25:10Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-17T17:14:10Z' - torchscript_onnx_tflite: - inference_time: 7772.0 - throughput: 128.66700977869274 + inference_time: 1248.0 + throughput: 801.2820512820513 estimated_peak_memory_range: - min: 12288 - max: 31462320 + min: 8192 + max: 24951424 primary_compute_unit: NPU precision: int8 layer_info: @@ -360,75 +379,67 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 82 - job_id: j1p3k1ln5 + job_id: jp0z4re65 job_status: Passed torchscript_onnx_qnn: - inference_time: 10252.0 - throughput: 97.54194303550527 + inference_time: 1443.0 + throughput: 693.000693000693 estimated_peak_memory_range: - min: 163840 - max: 7742096 + min: 159744 + max: 19173072 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jz57zl19p + total_layers: 127 + job_id: jgz327vk5 job_status: Passed - reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot - os_name: Android - manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T11:25:12Z' - - torchscript_onnx_tflite: - inference_time: 24100.0 - throughput: 41.49377593360996 + torchscript_onnx: + inference_time: 1674.0 + throughput: 597.3715651135007 estimated_peak_memory_range: - min: 176128 - max: 2286552 + min: 0 + max: 42505296 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 82 + layers_on_npu: 147 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 82 - job_id: jwgoyn7k5 + total_layers: 147 + job_id: jg9l048wg job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T11:25:02Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-17T17:14:20Z' - torchscript_onnx_qnn: - inference_time: 1822.0 - throughput: 548.847420417124 + inference_time: 1840.0 + throughput: 543.4782608695652 estimated_peak_memory_range: - min: 253952 - max: 253952 + min: 323584 + max: 323584 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 78 + layers_on_npu: 127 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 78 - job_id: jlpe9w0vg + total_layers: 127 + job_id: jgo2zvexp job_status: Passed torchscript_onnx: - inference_time: 2640.0 - throughput: 378.7878787878788 + inference_time: 2614.0 + throughput: 382.55547054322875 estimated_peak_memory_range: - min: 72720384 - max: 72720384 + min: 73981952 + max: 73981952 primary_compute_unit: NPU precision: int8 layer_info: @@ -436,7 +447,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 147 - job_id: jo5mr619g + job_id: j5wew9x35 job_status: Passed reference_device_info: 
name: Snapdragon X Elite CRD @@ -445,4 +456,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:25:14Z' + timestamp: '2024-10-17T17:14:19Z' diff --git a/qai_hub_models/models/wideresnet50_quantized/requirements.txt b/qai_hub_models/models/wideresnet50_quantized/requirements.txt deleted file mode 100644 index de5b80e8..00000000 --- a/qai_hub_models/models/wideresnet50_quantized/requirements.txt +++ /dev/null @@ -1 +0,0 @@ -aimet-torch==1.32.1.post1; sys_platform == "linux" diff --git a/qai_hub_models/models/wideresnet50_quantized/test.py b/qai_hub_models/models/wideresnet50_quantized/test.py deleted file mode 100644 index fbe14f34..00000000 --- a/qai_hub_models/models/wideresnet50_quantized/test.py +++ /dev/null @@ -1,30 +0,0 @@ -# --------------------------------------------------------------------- -# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. -# SPDX-License-Identifier: BSD-3-Clause -# --------------------------------------------------------------------- -from qai_hub_models.models._shared.imagenet_classifier.test_utils import ( - run_imagenet_classifier_test, ) -from qai_hub_models.models.wideresnet50_quantized.demo import main as demo_main -from qai_hub_models.models.wideresnet50_quantized.model import ( - MODEL_ASSET_VERSION, - MODEL_ID, - WideResNet50Quantizable, ) - - -def test_task(): - run_imagenet_classifier_test( - WideResNet50Quantizable.from_pretrained(), - MODEL_ID, - probability_threshold=0.4, - asset_version=MODEL_ASSET_VERSION, - diff_tol=0.005, - rtol=0.02, - atol=0.2, ) - - -def test_demo(): - # Verify demo does not crash - demo_main(is_test=True) diff --git a/qai_hub_models/models/xlsr/README.md b/qai_hub_models/models/xlsr/README.md index 1b42556e..53aa6944 100644 --- a/qai_hub_models/models/xlsr/README.md +++ b/qai_hub_models/models/xlsr/README.md @@ -6,7 +6,7 @@ XLSR is designed for lightweight real-time upscaling of images. This is based on the implementation of XLSR found -[here](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/xlsr). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/xlsr). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.xlsr.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of XLSR can be found +* The license for the original implementation of XLSR can be found [here](https://github.com/quic/aimet-model-zoo/blob/develop/LICENSE.pdf).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Extremely Lightweight Quantization Robust Real-Time Single-Image Super Resolution for Mobile Devices](https://arxiv.org/abs/2105.10288) * [Source Model Implementation](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/xlsr) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/xlsr/export.py b/qai_hub_models/models/xlsr/export.py index 63330580..5b37b6bc 100644 --- a/qai_hub_models/models/xlsr/export.py +++ b/qai_hub_models/models/xlsr/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.xlsr import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
""" model_name = "xlsr" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -197,7 +195,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/xlsr/perf.yaml b/qai_hub_models/models/xlsr/perf.yaml index e58340ef..93e2f4cc 100644 --- a/qai_hub_models/models/xlsr/perf.yaml +++ b/qai_hub_models/models/xlsr/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: XLSR performance_metrics: - torchscript_onnx_tflite: - inference_time: 2483.0 - throughput: 402.7386226339106 + inference_time: 2580.0 + throughput: 387.5968992248062 estimated_peak_memory_range: - min: 24576 - max: 72773184 + min: 225280 + max: 9489008 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 16 - job_id: joprk2d75 + job_id: jp4lrmr25 job_status: Passed torchscript_onnx_qnn: - inference_time: 1448.0 - throughput: 690.6077348066299 + inference_time: 1375.0 + throughput: 727.2727272727273 estimated_peak_memory_range: - min: 217088 - max: 3042344 + min: 28672 + max: 3234336 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 21 - job_id: j1gln2qmp + job_id: jgkexoevg job_status: Passed torchscript_onnx: - inference_time: 1547.0 - throughput: 646.4124111182934 + inference_time: 1509.0 + throughput: 662.6905235255136 estimated_peak_memory_range: - min: 12288 - max: 30840088 + min: 229376 + max: 1899672 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 23 - job_id: jz5wo3rmp + job_id: j5we68e45 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:22:32Z' + timestamp: '2024-10-14T23:11:36Z' - torchscript_onnx_tflite: - inference_time: 1754.0 - throughput: 570.1254275940707 + inference_time: 1793.0 + throughput: 557.7244841048522 estimated_peak_memory_range: - min: 16384 - max: 22748944 + min: 20480 + max: 25105472 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: 
layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 16 - job_id: jep289dqp + job_id: jpxko3o85 job_status: Passed torchscript_onnx_qnn: - inference_time: 1088.0 - throughput: 919.1176470588235 + inference_time: 1080.0 + throughput: 925.925925925926 estimated_peak_memory_range: - min: 212992 - max: 11680624 + min: 208896 + max: 14304032 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 21 - job_id: jw566z0y5 + job_id: j5q6qz6ep job_status: Passed torchscript_onnx: - inference_time: 1048.0 - throughput: 954.1984732824427 + inference_time: 1084.0 + throughput: 922.509225092251 estimated_peak_memory_range: min: 0 - max: 23772800 + max: 24023888 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 23 - job_id: jmg9vyq85 + job_id: jg9lnklmg job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:22:33Z' + timestamp: '2024-10-14T23:11:37Z' - torchscript_onnx_tflite: - inference_time: 2474.0 - throughput: 404.2037186742118 + inference_time: 2467.0 + throughput: 405.35062829347385 estimated_peak_memory_range: min: 28672 - max: 1382728 + max: 1363664 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 16 - job_id: jqpyej2lg + job_id: j5mnxox7p job_status: Passed torchscript_onnx_qnn: - inference_time: 1330.0 - throughput: 751.8796992481203 + inference_time: 1359.0 + throughput: 735.8351729212657 estimated_peak_memory_range: min: 229376 - max: 1476848 + max: 1472576 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 21 - job_id: jwgoyn9k5 + job_id: j56y4rynp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:22:28Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:11:29Z' - torchscript_onnx_tflite: - inference_time: 4543.0 - throughput: 220.1188641866608 + inference_time: 2551.0 + throughput: 392.0031360250882 estimated_peak_memory_range: - min: 6336512 - max: 30139312 + min: 16384 + max: 92452376 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 16 - job_id: j2p0y29ng + job_id: jpy13q30p job_status: Passed torchscript_onnx_qnn: - inference_time: 1564.0 - throughput: 639.386189258312 + inference_time: 1344.0 + throughput: 744.047619047619 estimated_peak_memory_range: - min: 208896 - max: 15461264 + min: 221184 + max: 1455936 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 21 - job_id: jygzej0xg + job_id: jpv6ke6z5 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:22:31Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:11:32Z' - torchscript_onnx_tflite: - inference_time: 2578.0 - throughput: 387.8975950349108 + inference_time: 2600.0 + throughput: 384.61538461538464 estimated_peak_memory_range: - min: 28672 - max: 1410784 + min: 1933312 + max: 3268800 
primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 16 - job_id: j1p8omrog + job_id: jp2ky4y6p job_status: Passed torchscript_onnx_qnn: - inference_time: 1349.0 - throughput: 741.2898443291327 + inference_time: 1364.0 + throughput: 733.1378299120234 estimated_peak_memory_range: - min: 225280 - max: 4978872 + min: 229376 + max: 1519016 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 21 - job_id: j1pv3rnr5 + job_id: jgo26o21p job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:22:29Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:11:31Z' - torchscript_onnx_tflite: - inference_time: 2424.0 - throughput: 412.54125412541254 + inference_time: 2451.0 + throughput: 407.9967360261118 estimated_peak_memory_range: - min: 16384 - max: 1506344 + min: 24576 + max: 32218040 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 16 - job_id: jogkzq0ng + job_id: jprv3o3kg job_status: Passed torchscript_onnx_qnn: - inference_time: 1352.0 - throughput: 739.6449704142012 + inference_time: 1341.0 + throughput: 745.7121551081283 estimated_peak_memory_range: - min: 20480 - max: 4007744 + min: 233472 + max: 1715304 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 21 - job_id: j7gjx28ep + job_id: jp3j0xjmg job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:22:30Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:11:30Z' - torchscript_onnx_tflite: - inference_time: 2574.0 - throughput: 388.5003885003885 + inference_time: 3255.0 + throughput: 307.21966205837174 estimated_peak_memory_range: - min: 6316032 - max: 23092696 + min: 6328320 + max: 31083824 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 16 - job_id: jn5q8r1o5 + job_id: jgn6vovj5 job_status: Passed torchscript_onnx_qnn: - inference_time: 1462.0 - throughput: 683.9945280437756 + inference_time: 1541.0 + throughput: 648.9292667099286 estimated_peak_memory_range: - min: 221184 - max: 1485168 + min: 204800 + max: 14941584 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,22 +329,75 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 21 - job_id: jlpe9wnvg + job_id: jpedm8d85 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:11:34Z' + - torchscript_onnx_tflite: + inference_time: 1913.0 + throughput: 522.7391531625718 + estimated_peak_memory_range: + min: 20480 + max: 17227600 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 13 + layers_on_gpu: 0 + layers_on_cpu: 3 + total_layers: 16 + job_id: jp8qy6qqp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 686.0 + throughput: 1457.725947521866 + estimated_peak_memory_range: + min: 0 + max: 9899072 + primary_compute_unit: NPU + precision: fp16 + layer_info: + 
layers_on_npu: 21 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 21 + job_id: jgz3d8345 + job_status: Passed + torchscript_onnx: + inference_time: 1057.0 + throughput: 946.073793755913 + estimated_peak_memory_range: + min: 0 + max: 15333728 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 23 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 23 + job_id: j57yrkyn5 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:22:30Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:11:40Z' - torchscript_onnx_qnn: - inference_time: 1459.0 - throughput: 685.4009595613434 + inference_time: 1500.0 + throughput: 666.6666666666666 estimated_peak_memory_range: - min: 212992 - max: 212992 + min: 237568 + max: 237568 primary_compute_unit: NPU precision: fp16 layer_info: @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 21 - job_id: j1p3k1rn5 + job_id: jglvmov25 job_status: Passed torchscript_onnx: - inference_time: 1501.0 - throughput: 666.2225183211193 + inference_time: 1516.0 + throughput: 659.6306068601583 estimated_peak_memory_range: - min: 8970240 - max: 8970240 + min: 8962048 + max: 8962048 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 23 - job_id: jnp10wm75 + job_id: jp14z74np job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:22:34Z' + timestamp: '2024-10-14T23:11:38Z' diff --git a/qai_hub_models/models/xlsr_quantized/README.md b/qai_hub_models/models/xlsr_quantized/README.md index d1f27eab..dbc4c468 100644 --- a/qai_hub_models/models/xlsr_quantized/README.md +++ b/qai_hub_models/models/xlsr_quantized/README.md @@ -6,7 +6,7 @@ XLSR is designed for lightweight real-time upscaling of images. This is based on the implementation of XLSR-Quantized found -[here](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/xlsr). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/xlsr_quantized). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.xlsr_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of XLSR-Quantized can be found +* The license for the original implementation of XLSR-Quantized can be found [here](https://github.com/quic/aimet-model-zoo/blob/develop/LICENSE.pdf).
-- The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) +* The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf) + ## References * [Extremely Lightweight Quantization Robust Real-Time Single-Image Super Resolution for Mobile Devices](https://arxiv.org/abs/2105.10288) * [Source Model Implementation](https://github.com/quic/aimet-model-zoo/tree/develop/aimet_zoo_torch/xlsr) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/xlsr_quantized/export.py b/qai_hub_models/models/xlsr_quantized/export.py index 5411d38f..7c924ecd 100644 --- a/qai_hub_models/models/xlsr_quantized/export.py +++ b/qai_hub_models/models/xlsr_quantized/export.py @@ -10,17 +10,17 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.xlsr_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.printing import ( print_inference_metrics, @@ -45,20 +45,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -80,10 +78,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
""" model_name = "xlsr_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,7 +107,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( target_runtime, output_path, input_spec ) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -135,7 +133,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +148,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +169,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -198,7 +196,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/xlsr_quantized/perf.yaml b/qai_hub_models/models/xlsr_quantized/perf.yaml index cc192a0d..7f345b62 100644 --- a/qai_hub_models/models/xlsr_quantized/perf.yaml +++ b/qai_hub_models/models/xlsr_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,44 +20,41 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: XLSR-Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 1060.0 - throughput: 943.3962264150944 + inference_time: 1076.0 + throughput: 929.368029739777 estimated_peak_memory_range: min: 12288 - max: 3525304 + max: 1327904 primary_compute_unit: NPU precision: int8 layer_info: @@ -64,29 +62,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 19 - job_id: jqpyej88g + job_id: jpedm2385 job_status: Passed torchscript_onnx_qnn: - inference_time: 654.0 - throughput: 1529.051987767584 + inference_time: 652.0 + throughput: 1533.7423312883436 estimated_peak_memory_range: - min: 20480 - max: 3091504 + min: 0 + max: 3226576 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 16 + layers_on_npu: 21 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 16 - job_id: j7gjx2yvp + total_layers: 21 + job_id: jprv39jkg job_status: Passed torchscript_onnx: - inference_time: 765.0 - throughput: 1307.18954248366 + inference_time: 678.0 + throughput: 1474.9262536873157 estimated_peak_memory_range: - min: 69632 - max: 1491648 + min: 65536 + max: 1331032 primary_compute_unit: NPU precision: int8 layer_info: @@ -94,7 +92,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 19 - job_id: jnp10w175 + job_id: jpv6kekz5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -103,13 +101,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:21:57Z' + timestamp: '2024-10-14T23:10:54Z' - torchscript_onnx_tflite: - inference_time: 916.0 - throughput: 1091.703056768559 + inference_time: 878.0 + 
throughput: 1138.9521640091116 estimated_peak_memory_range: min: 20480 - max: 23306096 + max: 23583056 primary_compute_unit: NPU precision: int8 layer_info: @@ -117,29 +115,29 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 19 - job_id: j2p0y2o9g + job_id: jgz3dwk45 job_status: Passed torchscript_onnx_qnn: - inference_time: 447.0 - throughput: 2237.136465324385 + inference_time: 454.0 + throughput: 2202.643171806167 estimated_peak_memory_range: min: 12288 - max: 14237008 + max: 16088080 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 16 + layers_on_npu: 21 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 16 - job_id: jlpe9wxog + total_layers: 21 + job_id: jp2kyjn6p job_status: Passed torchscript_onnx: - inference_time: 721.0 - throughput: 1386.9625520110958 + inference_time: 499.0 + throughput: 2004.0080160320642 estimated_peak_memory_range: min: 0 - max: 25230096 + max: 25219312 primary_compute_unit: NPU precision: int8 layer_info: @@ -147,7 +145,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 19 - job_id: jvgdwq4z5 + job_id: jgjvnon1g job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -156,13 +154,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:21:58Z' + timestamp: '2024-10-14T23:10:55Z' - torchscript_onnx_tflite: - inference_time: 1980.0 - throughput: 505.050505050505 + inference_time: 2437.0 + throughput: 410.3405826836274 estimated_peak_memory_range: - min: 28672 - max: 1428296 + min: 12288 + max: 17008928 primary_compute_unit: NPU precision: int8 layer_info: @@ -170,37 +168,60 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 19 - job_id: j1p8omjkg + job_id: jpxkom285 job_status: Passed torchscript_onnx_qnn: - inference_time: 432.0 - throughput: 2314.814814814815 + inference_time: 1076.0 + throughput: 929.368029739777 estimated_peak_memory_range: - min: 77824 - max: 1286280 + min: 12288 + max: 7595984 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 16 + layers_on_npu: 21 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 16 - job_id: jz5wo3z3p + total_layers: 21 + job_id: jp3j0x0mg job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS6490 Proxy + timestamp: '2024-10-14T23:10:52Z' + - torchscript_onnx_tflite: + inference_time: 16048.0 + throughput: 62.31306081754736 + estimated_peak_memory_range: + min: 4354048 + max: 29172840 + primary_compute_unit: GPU + precision: int8 + layer_info: + layers_on_npu: 5 + layers_on_gpu: 9 + layers_on_cpu: 5 + total_layers: 19 + job_id: j5mnx4y7p + job_status: Passed + reference_device_info: + name: RB5 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:21:52Z' + chipset: QCS8250 Proxy + timestamp: '2024-10-14T23:10:40Z' - torchscript_onnx_tflite: - inference_time: 1492.0 - throughput: 670.2412868632708 + inference_time: 1060.0 + throughput: 943.3962264150944 estimated_peak_memory_range: - min: 806912 - max: 24163360 + min: 24576 + max: 12948776 primary_compute_unit: NPU precision: int8 layer_info: @@ -208,37 +229,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 19 - job_id: jogkzq6wg + job_id: j5we6xn45 job_status: Passed torchscript_onnx_qnn: - inference_time: 710.0 - throughput: 1408.4507042253522 + inference_time: 426.0 + throughput: 2347.417840375587 
estimated_peak_memory_range: - min: 61440 - max: 15521888 + min: 81920 + max: 1717840 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 16 + layers_on_npu: 21 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 16 - job_id: jz5wo3zmp + total_layers: 21 + job_id: jp0z0d005 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:21:55Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:10:46Z' - torchscript_onnx_tflite: - inference_time: 1067.0 - throughput: 937.207122774133 + inference_time: 1054.0 + throughput: 948.7666034155598 estimated_peak_memory_range: - min: 28672 - max: 3048136 + min: 24576 + max: 3075568 primary_compute_unit: NPU precision: int8 layer_info: @@ -246,37 +267,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 19 - job_id: jn5q8r4n5 + job_id: j57yr63n5 job_status: Passed torchscript_onnx_qnn: - inference_time: 432.0 - throughput: 2314.814814814815 + inference_time: 429.0 + throughput: 2331.002331002331 estimated_peak_memory_range: - min: 86016 - max: 1312984 + min: 32768 + max: 1894288 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 16 + layers_on_npu: 21 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 16 - job_id: jmg9vy2w5 + total_layers: 21 + job_id: j5q6qzqep job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:21:53Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:10:49Z' - torchscript_onnx_tflite: - inference_time: 1071.0 - throughput: 933.7068160597572 + inference_time: 1065.0 + throughput: 938.9671361502348 estimated_peak_memory_range: - min: 24576 - max: 5816184 + min: 49152 + max: 1291680 primary_compute_unit: NPU precision: int8 layer_info: @@ -284,22 +305,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 19 - job_id: j1gln2wjp + job_id: jgdx10l6p job_status: Passed torchscript_onnx_qnn: - inference_time: 432.0 - throughput: 2314.814814814815 + inference_time: 433.0 + throughput: 2309.4688221709007 estimated_peak_memory_range: - min: 73728 - max: 1408024 + min: 81920 + max: 1307672 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 16 + layers_on_npu: 21 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 16 - job_id: jnp10w185 + total_layers: 21 + job_id: jgkexoxvg job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -307,14 +328,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:21:54Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:10:48Z' - torchscript_onnx_tflite: - inference_time: 1068.0 - throughput: 936.3295880149813 + inference_time: 1077.0 + throughput: 928.5051067780872 estimated_peak_memory_range: - min: 24576 - max: 1419600 + min: 1605632 + max: 15182384 primary_compute_unit: NPU precision: int8 layer_info: @@ -322,37 +343,37 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 19 - job_id: jw566zo65 + job_id: jp14z3xnp job_status: Passed torchscript_onnx_qnn: - inference_time: 430.0 - throughput: 2325.5813953488373 + inference_time: 424.0 + throughput: 2358.490566037736 estimated_peak_memory_range: min: 69632 - max: 1456560 + max: 1396200 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 16 + 
layers_on_npu: 21 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 16 - job_id: jvgdwq4r5 + total_layers: 21 + job_id: jp8qy6yqp job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:21:54Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:10:47Z' - torchscript_onnx_tflite: - inference_time: 2344.0 - throughput: 426.6211604095563 + inference_time: 1399.0 + throughput: 714.7962830593281 estimated_peak_memory_range: - min: 1642496 - max: 18693376 + min: 16384 + max: 24102912 primary_compute_unit: NPU precision: int8 layer_info: @@ -360,75 +381,105 @@ models: layers_on_gpu: 0 layers_on_cpu: 3 total_layers: 19 - job_id: j1p3k1o35 + job_id: jg9ln8emg job_status: Passed torchscript_onnx_qnn: - inference_time: 1118.0 - throughput: 894.4543828264758 + inference_time: 716.0 + throughput: 1396.6480446927374 estimated_peak_memory_range: - min: 61440 - max: 8045296 + min: 12288 + max: 13691328 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 16 + layers_on_npu: 21 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 16 - job_id: jmg9vy285 + total_layers: 21 + job_id: j56y4r4np job_status: Passed reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T11:21:57Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:10:51Z' - torchscript_onnx_tflite: - inference_time: 14013.0 - throughput: 71.36230642974381 + inference_time: 854.0 + throughput: 1170.96018735363 estimated_peak_memory_range: - min: 4333568 - max: 10909736 - primary_compute_unit: GPU + min: 12288 + max: 16457856 + primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 5 - layers_on_gpu: 9 - layers_on_cpu: 5 + layers_on_npu: 16 + layers_on_gpu: 0 + layers_on_cpu: 3 total_layers: 19 - job_id: jwgoyndq5 + job_id: jgn6vx8j5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 404.0 + throughput: 2475.2475247524753 + estimated_peak_memory_range: + min: 57344 + max: 10604272 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 21 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 21 + job_id: jgo26o61p + job_status: Passed + torchscript_onnx: + inference_time: 381.0 + throughput: 2624.6719160104985 + estimated_peak_memory_range: + min: 20480 + max: 16581872 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 19 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 19 + job_id: j5we68645 job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T11:21:47Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:10:58Z' - torchscript_onnx_qnn: - inference_time: 556.0 - throughput: 1798.5611510791366 + inference_time: 536.0 + throughput: 1865.6716417910447 estimated_peak_memory_range: - min: 49152 - max: 49152 + min: 131072 + max: 131072 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 16 + layers_on_npu: 21 layers_on_gpu: 0 layers_on_cpu: 0 - total_layers: 16 - job_id: jygzejyog + total_layers: 21 + job_id: jpy13n00p job_status: Passed torchscript_onnx: - inference_time: 782.0 - throughput: 1278.772378516624 + inference_time: 
794.0 + throughput: 1259.4458438287154 estimated_peak_memory_range: - min: 3330048 - max: 3330048 + min: 3387392 + max: 3387392 primary_compute_unit: NPU precision: int8 layer_info: @@ -436,7 +487,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 19 - job_id: jz57zln9p + job_id: jpedm8m85 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -445,4 +496,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:21:59Z' + timestamp: '2024-10-14T23:10:56Z' diff --git a/qai_hub_models/models/yolonas/README.md b/qai_hub_models/models/yolonas/README.md index dd81f9b2..f1467c01 100644 --- a/qai_hub_models/models/yolonas/README.md +++ b/qai_hub_models/models/yolonas/README.md @@ -6,7 +6,7 @@ YoloNAS is a machine learning model that predicts bounding boxes and classes of objects in an image. This is based on the implementation of Yolo-NAS found -[here](https://github.com/Deci-AI/super-gradients). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/yolonas). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.yolonas.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Yolo-NAS can be found +* The license for the original implementation of Yolo-NAS can be found [here](https://github.com/Deci-AI/super-gradients/blob/master/YOLONAS.md#license). -- The license for the compiled assets for on-device deployment can be found [here](https://github.com/Deci-AI/super-gradients/blob/master/LICENSE.YOLONAS.md) +* The license for the compiled assets for on-device deployment can be found [here](https://github.com/Deci-AI/super-gradients/blob/master/LICENSE.YOLONAS.md) + ## References * [A Next-Generation, Object Detection Foundational Model generated by Deci’s Neural Architecture Search Technology](https://github.com/Deci-AI/super-gradients/blob/master/YOLONAS.md) * [Source Model Implementation](https://github.com/Deci-AI/super-gradients) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
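Every `export.py` in this patch gets the same migration: the positional `(compile_job, profile_job, inference_job)` return value is replaced by the `ExportResult` struct imported from `qai_hub_models.models.common`. A minimal caller-side sketch of the new contract, assuming only names visible in this patch — the `.wait().success` check and the job fields mirror the scripts' own summary code, while the device string, the `skip_inferencing` flag, and the `job_id` attribute are illustrative:

    from qai_hub_models.models.yolonas.export import export_model

    # Runs compile/profile/download/summary; on-device inference is skipped here.
    result = export_model(device="Samsung Galaxy S23", skip_inferencing=True)

    # Old callers unpacked a tuple: compile_job, profile_job, inference_job = ...
    # New callers read named fields; skipped steps come back as None.
    print(result.compile_job.job_id)
    if result.profile_job is not None:
        assert result.profile_job.wait().success
    # Hub-quantized models (e.g. wideresnet50_quantized earlier in this patch)
    # also populate result.quantize_job.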
diff --git a/qai_hub_models/models/yolonas/export.py b/qai_hub_models/models/yolonas/export.py index 61906a98..eabeed46 100644 --- a/qai_hub_models/models/yolonas/export.py +++ b/qai_hub_models/models/yolonas/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.yolonas import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "yolonas" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. 
Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -201,7 +199,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/yolonas/perf.yaml b/qai_hub_models/models/yolonas/perf.yaml index 7f549a6b..7f57acd5 100644 --- a/qai_hub_models/models/yolonas/perf.yaml +++ b/qai_hub_models/models/yolonas/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: Yolo-NAS performance_metrics: - torchscript_onnx_tflite: - inference_time: 10909.0 - throughput: 91.66743056192135 + inference_time: 10860.0 + throughput: 92.08103130755065 estimated_peak_memory_range: - min: 217088 - max: 4691760 + min: 32768 + max: 7163776 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 201 - job_id: j1p8omekg + job_id: jpedm21v5 job_status: Passed torchscript_onnx_qnn: - inference_time: 
15011.0 - throughput: 66.61781360335753 + inference_time: 15235.0 + throughput: 65.63833278634722 estimated_peak_memory_range: - min: 6328320 - max: 24035528 + min: 4931584 + max: 24696952 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 289 - job_id: j1pv3r2k5 + job_id: jgdx1096p job_status: Passed torchscript_onnx: - inference_time: 9947.0 - throughput: 100.53282396702524 + inference_time: 7751.0 + throughput: 129.01561088891756 estimated_peak_memory_range: - min: 16384 - max: 26342544 + min: 28672 + max: 26587000 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 290 - job_id: jz57zlovp + job_id: jp8qy8vqp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:21:17Z' + timestamp: '2024-10-14T23:10:06Z' - torchscript_onnx_tflite: - inference_time: 9064.0 - throughput: 110.32656663724624 + inference_time: 8986.0 + throughput: 111.28421989761851 estimated_peak_memory_range: - min: 237568 - max: 103485776 + min: 163840 + max: 112014672 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 201 - job_id: jogkzq2wg + job_id: jgz3dw9x5 job_status: Passed torchscript_onnx_qnn: - inference_time: 10848.0 - throughput: 92.18289085545723 + inference_time: 10889.0 + throughput: 91.8357975939021 estimated_peak_memory_range: min: 4952064 - max: 33809264 + max: 38833264 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 289 - job_id: j7gjx23vp + job_id: j57yr6wn5 job_status: Passed torchscript_onnx: - inference_time: 7273.0 - throughput: 137.49484394335212 + inference_time: 6026.0 + throughput: 165.94756057085962 estimated_peak_memory_range: - min: 1060864 - max: 106214624 + min: 4116480 + max: 119044512 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 290 - job_id: jqp4qde8g + job_id: jgkexdmvg job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:21:18Z' + timestamp: '2024-10-14T23:10:07Z' - torchscript_onnx_tflite: - inference_time: 10800.0 - throughput: 92.5925925925926 + inference_time: 10765.0 + throughput: 92.89363678588016 estimated_peak_memory_range: - min: 0 - max: 25968968 + min: 241664 + max: 311233456 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 201 - job_id: jn5q8rln5 + job_id: j5we6xvm5 job_status: Passed torchscript_onnx_qnn: - inference_time: 10116.0 - throughput: 98.85330170027679 + inference_time: 9611.0 + throughput: 104.04744563520966 estimated_peak_memory_range: - min: 5001216 - max: 6202472 + min: 4988928 + max: 6329160 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 289 - job_id: jygzejzog + job_id: jpxkomj85 job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:21:12Z' + chipset: QCS8550 Proxy + 
timestamp: '2024-10-14T23:09:58Z' - torchscript_onnx_tflite: - inference_time: 13791.0 - throughput: 72.51105793633529 + inference_time: 10662.0 + throughput: 93.79103357719002 estimated_peak_memory_range: - min: 217088 - max: 100659936 + min: 245760 + max: 4395224 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 201 - job_id: j1gln2yjp + job_id: j5we6xv45 job_status: Passed torchscript_onnx_qnn: - inference_time: 18123.0 - throughput: 55.17850245544336 + inference_time: 9606.0 + throughput: 104.10160316468874 estimated_peak_memory_range: - min: 4952064 - max: 32193984 + min: 4976640 + max: 6826496 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 289 - job_id: jvgdwq6r5 + job_id: jprv39qkg job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:21:16Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:10:01Z' - torchscript_onnx_tflite: - inference_time: 10934.0 - throughput: 91.45783793671117 + inference_time: 10844.0 + throughput: 92.21689413500553 estimated_peak_memory_range: - min: 237568 - max: 6140328 + min: 12288 + max: 80955216 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 201 - job_id: jw566z865 + job_id: jgdx109zp job_status: Passed torchscript_onnx_qnn: - inference_time: 10106.0 - throughput: 98.95111814763507 + inference_time: 9491.0 + throughput: 105.36297545042672 estimated_peak_memory_range: - min: 4960256 - max: 6588608 + min: 4993024 + max: 6543456 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 289 - job_id: jz5wo3y3p + job_id: jgn6vxyj5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:21:13Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:10:00Z' - torchscript_onnx_tflite: - inference_time: 10964.0 - throughput: 91.20758847136082 + inference_time: 10664.0 + throughput: 93.7734433608402 estimated_peak_memory_range: - min: 40960 - max: 334395264 + min: 16384 + max: 5471048 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 201 - job_id: j1p3k1z35 + job_id: jp14z3l7p job_status: Passed torchscript_onnx_qnn: - inference_time: 10226.0 - throughput: 97.78994719342852 + inference_time: 9508.0 + throughput: 105.17458981909971 estimated_peak_memory_range: - min: 4980736 - max: 6388816 + min: 4960256 + max: 6244192 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 289 - job_id: jmg9vyow5 + job_id: j5mnx427p job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:21:14Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:09:59Z' - torchscript_onnx_tflite: - inference_time: 10838.0 - throughput: 92.26794611551946 + inference_time: 13889.0 + throughput: 71.99942400460796 estimated_peak_memory_range: - min: 
253952 - max: 7244464 + min: 233472 + max: 107888992 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 201 - job_id: jwgoynlq5 + job_id: jg9ln818g job_status: Passed torchscript_onnx_qnn: - inference_time: 10013.0 - throughput: 99.87016878058525 + inference_time: 18405.0 + throughput: 54.333061668024996 estimated_peak_memory_range: - min: 4993024 - max: 6298952 + min: 4956160 + max: 39240272 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 289 - job_id: jnp10wo85 + job_id: jpy13nw0p job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:10:03Z' + - torchscript_onnx_tflite: + inference_time: 7633.0 + throughput: 131.0100877767588 + estimated_peak_memory_range: + min: 212992 + max: 56645456 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 201 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 201 + job_id: jp14z3lnp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 10025.0 + throughput: 99.75062344139651 + estimated_peak_memory_range: + min: 4931584 + max: 33649952 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 289 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 289 + job_id: jp0z0k705 + job_status: Passed + torchscript_onnx: + inference_time: 4362.0 + throughput: 229.25263640531867 + estimated_peak_memory_range: + min: 5369856 + max: 61783152 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 290 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 290 + job_id: j56y4vlnp + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:21:15Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:10:09Z' - torchscript_onnx_qnn: - inference_time: 10718.0 - throughput: 93.3009889904833 + inference_time: 10223.0 + throughput: 97.81864423359092 estimated_peak_memory_range: min: 4923392 max: 4923392 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 289 - job_id: jlpe9w6og + job_id: jp4lr8o25 job_status: Passed torchscript_onnx: - inference_time: 10102.0 - throughput: 98.99029895070284 + inference_time: 8286.0 + throughput: 120.68549360366885 estimated_peak_memory_range: - min: 22188032 - max: 22188032 + min: 22249472 + max: 22249472 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 290 - job_id: j0pxv603g + job_id: j5q6qwoep job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:21:19Z' + timestamp: '2024-10-14T23:10:08Z' diff --git a/qai_hub_models/models/yolonas_quantized/README.md b/qai_hub_models/models/yolonas_quantized/README.md index 10c5859a..0894542c 100644 --- a/qai_hub_models/models/yolonas_quantized/README.md +++ b/qai_hub_models/models/yolonas_quantized/README.md @@ -6,7 +6,7 @@ YoloNAS is a machine learning model that predicts bounding boxes and classes of objects in an image. 
This model is post-training quantized to int8 using samples from the COCO dataset. This is based on the implementation of Yolo-NAS-Quantized found -[here](https://github.com/Deci-AI/super-gradients). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/yolonas_quantized). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.yolonas_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to the deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Yolo-NAS-Quantized can be found +* The license for the original implementation of Yolo-NAS-Quantized can be found [here](https://github.com/Deci-AI/super-gradients/blob/master/YOLONAS.md#license). -- The license for the compiled assets for on-device deployment can be found [here](https://github.com/Deci-AI/super-gradients/blob/master/LICENSE.YOLONAS.md) +* The license for the compiled assets for on-device deployment can be found [here](https://github.com/Deci-AI/super-gradients/blob/master/LICENSE.YOLONAS.md) + ## References * [YOLO-NAS by Deci Achieves SOTA Performance on Object Detection Using Neural Architecture Search](https://deci.ai/blog/yolo-nas-object-detection-foundation-model/) * [Source Model Implementation](https://github.com/Deci-AI/super-gradients) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/yolonas_quantized/export.py b/qai_hub_models/models/yolonas_quantized/export.py index fb81e239..ba9b2ab8 100644 --- a/qai_hub_models/models/yolonas_quantized/export.py +++ b/qai_hub_models/models/yolonas_quantized/export.py @@ -10,17 +10,17 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.yolonas_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.printing import ( print_inference_metrics, @@ -45,20 +45,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2.
Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -80,10 +78,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "yolonas_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,7 +107,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( target_runtime, output_path, input_spec ) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -135,7 +133,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +148,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +169,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -202,7 +200,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/yolonas_quantized/perf.yaml b/qai_hub_models/models/yolonas_quantized/perf.yaml index 6c49be97..5ef63664 100644 --- a/qai_hub_models/models/yolonas_quantized/perf.yaml +++ b/qai_hub_models/models/yolonas_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,41 +20,38 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: Yolo-NAS-Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 4789.0 - throughput: 208.81186051367717 + inference_time: 4715.0 + throughput: 212.08907741251326 estimated_peak_memory_range: - min: 81920 - max: 14038736 + min: 32768 + max: 2619144 primary_compute_unit: NPU precision: int8 layer_info: @@ -61,7 +59,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 204 - job_id: j1p8omxkg + job_id: jglvm7nm5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -70,13 +68,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:20:18Z' + timestamp: '2024-10-14T23:08:53Z' - torchscript_onnx_tflite: - inference_time: 3765.0 - throughput: 265.6042496679947 + inference_time: 3058.0 + throughput: 327.01111837802483 estimated_peak_memory_range: - min: 86016 - max: 80907840 + min: 12288 + max: 83450752 primary_compute_unit: NPU precision: int8 layer_info: @@ -84,7 +82,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 204 - job_id: jogkzq4wg + job_id: j56y4v6yp job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -93,13 +91,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:20:19Z' + timestamp: '2024-10-14T23:08:54Z' - torchscript_onnx_tflite: - inference_time: 4701.0 - throughput: 212.72069772388852 + inference_time: 13608.0 + throughput: 73.4861845972957 estimated_peak_memory_range: - min: 81920 - max: 4319104 + min: 69632 + max: 69619264 primary_compute_unit: NPU precision: int8 layer_info: @@ -107,22 +105,30 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 204 - 
job_id: jn5q8ryn5 + job_id: j5we6xom5 job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:20:20Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-14T23:09:01Z' + - reference_device_info: + name: RB5 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8250 Proxy + timestamp: '2024-10-14T23:09:02Z' - torchscript_onnx_tflite: - inference_time: 5255.0 - throughput: 190.29495718363464 + inference_time: 4692.0 + throughput: 213.12872975277068 estimated_peak_memory_range: - min: 135168 - max: 83142816 + min: 86016 + max: 1489008 primary_compute_unit: NPU precision: int8 layer_info: @@ -130,22 +136,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 204 - job_id: j1gln2xjp + job_id: jp3j08kng job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:20:21Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:08:55Z' - torchscript_onnx_tflite: - inference_time: 4728.0 - throughput: 211.50592216582064 + inference_time: 4704.0 + throughput: 212.58503401360545 estimated_peak_memory_range: - min: 61440 - max: 12883904 + min: 98304 + max: 4233816 primary_compute_unit: NPU precision: int8 layer_info: @@ -153,22 +159,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 204 - job_id: jw566z765 + job_id: jpedm29v5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:20:22Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:08:59Z' - torchscript_onnx_tflite: - inference_time: 4728.0 - throughput: 211.50592216582064 + inference_time: 4696.0 + throughput: 212.94718909710392 estimated_peak_memory_range: - min: 110592 - max: 6947216 + min: 98304 + max: 7726400 primary_compute_unit: NPU precision: int8 layer_info: @@ -176,7 +182,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 204 - job_id: j1p3k1935 + job_id: jgjvn1xeg job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -184,14 +190,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:20:23Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:08:58Z' - torchscript_onnx_tflite: - inference_time: 4772.0 - throughput: 209.55574182732607 + inference_time: 4690.0 + throughput: 213.21961620469082 estimated_peak_memory_range: - min: 32768 - max: 191541224 + min: 65536 + max: 18267320 primary_compute_unit: NPU precision: int8 layer_info: @@ -199,22 +205,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 204 - job_id: jwgoynrq5 + job_id: jpv6k43r5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:20:24Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:08:57Z' - torchscript_onnx_tflite: - inference_time: 13953.0 - throughput: 71.66917508779474 + inference_time: 5202.0 + throughput: 192.23375624759709 estimated_peak_memory_range: - min: 69632 - max: 69253424 + min: 110592 + max: 85722432 primary_compute_unit: NPU precision: int8 layer_info: @@ -222,13 
+228,36 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 204 - job_id: j1pv3rlk5 + job_id: jgo26mykp job_status: Passed reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:08:56Z' + - torchscript_onnx_tflite: + inference_time: 3157.0 + throughput: 316.75641431738995 + estimated_peak_memory_range: + min: 61440 + max: 57573888 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 204 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 204 + job_id: jp14z307p + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T11:20:24Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:09:03Z' diff --git a/qai_hub_models/models/yolov11_det/README.md b/qai_hub_models/models/yolov11_det/README.md index e0bf22f0..99e7da65 100644 --- a/qai_hub_models/models/yolov11_det/README.md +++ b/qai_hub_models/models/yolov11_det/README.md @@ -1,14 +1,14 @@ [![Qualcomm® AI Hub Models](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/quic-logo.jpg)](../../README.md) -# [YOLOv11-Detection: Real-time object detection optimized for mobile and edge by Ultralytics](#) +# [YOLOv11-Detection: Real-time object detection optimized for mobile and edge by Ultralytics](https://aihub.qualcomm.com/models/yolov11_det) Ultralytics YOLOv11 is a machine learning model that predicts bounding boxes and classes of objects in an image. This is based on the implementation of YOLOv11-Detection found -[here](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/models/yolo/detect). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance -accross various devices, can be found [here](#). +across various devices can be found [here](https://aihub.qualcomm.com/models/yolov11_det). [Sign up](https://myaccount.qualcomm.com/signup) to start using Qualcomm AI Hub and run these models on a hosted Qualcomm® device. @@ -44,15 +44,19 @@ python -m qai_hub_models.models.yolov11_det.export Additional options are documented with the `--help` option. Note that the above script requires access to the deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of YOLOv11-Detection can be found +* The license for the original implementation of YOLOv11-Detection can be found [here](https://github.com/ultralytics/ultralytics/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://github.com/ultralytics/ultralytics/blob/main/LICENSE) +* The license for the compiled assets for on-device deployment can be found [here](https://github.com/ultralytics/ultralytics/blob/main/LICENSE) + ## References * [Ultralytics YOLOv11 Docs: Object Detection](https://docs.ultralytics.com/tasks/detect/) * [Source Model Implementation](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/models/yolo/detect) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI.
* For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/yolov11_det/export.py b/qai_hub_models/models/yolov11_det/export.py index c878152c..7a559ed8 100644 --- a/qai_hub_models/models/yolov11_det/export.py +++ b/qai_hub_models/models/yolov11_det/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.yolov11_det import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "yolov11_det" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -122,7 +120,7 @@ def export_model( model.to("cpu"), make_torch_inputs(input_spec), check_trace=False ) - # 2. Compile the model to an on-device asset + # 2. 
Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -136,7 +134,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -151,7 +149,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -172,13 +170,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -203,7 +201,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/yolov11_det/perf.yaml b/qai_hub_models/models/yolov11_det/perf.yaml new file mode 100644 index 00000000..c37c9416 --- /dev/null +++ b/qai_hub_models/models/yolov11_det/perf.yaml @@ -0,0 +1,432 @@ +aggregated: + supported_oses: + - Android + supported_devices: + - Snapdragon 8 Elite QRD + - Samsung Galaxy S24 + - Samsung Galaxy S24 Ultra + - Samsung Galaxy S24+ + - Snapdragon 8 Gen 3 QRD + - Samsung Galaxy S23 + - Samsung Galaxy S23 Ultra + - Samsung Galaxy S23+ + - Samsung Galaxy S22 5G + - Samsung Galaxy S22 Ultra 5G + - Samsung Galaxy S22+ 5G + - Samsung Galaxy Tab S8 + - Xiaomi 12 + - Xiaomi 12 Pro + - Samsung Galaxy S21 + - Samsung Galaxy S21 Ultra + - Samsung Galaxy S21+ + - Snapdragon X Elite CRD + - Snapdragon X Plus 8-Core CRD + - QCS8450 (Proxy) + - XR2 Gen 2 (Proxy) + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) + supported_chipsets: + - Snapdragon® 8 Elite + - Snapdragon® 8 Gen 3 + - Snapdragon® 8 Gen 2 + - Snapdragon® 8 Gen 1 + - Snapdragon® 888 + - Snapdragon® X Elite + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy +models: +- name: YOLOv11-Detection + performance_metrics: + - torchscript_onnx_tflite: + inference_time: 5441.0 + throughput: 183.7897445322551 + estimated_peak_memory_range: + min: 32768 + max: 95551352 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 382 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 382 + job_id: jp2kyj1qp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 5576.0 + throughput: 179.3400286944046 + estimated_peak_memory_range: + min: 6307840 + max: 17406200 + 
primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 374 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 374 + job_id: jgo26mjkp + job_status: Passed + torchscript_onnx: + inference_time: 6016.0 + throughput: 166.22340425531914 + estimated_peak_memory_range: + min: 651264 + max: 5849392 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 376 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 376 + job_id: jp4lr8z15 + job_status: Passed + reference_device_info: + name: Samsung Galaxy S23 + os: '13' + form_factor: Phone + os_name: Android + manufacturer: Samsung + chipset: Snapdragon® 8 Gen 2 + timestamp: '2024-10-14T23:07:45Z' + - torchscript_onnx_tflite: + inference_time: 3961.0 + throughput: 252.46149962130775 + estimated_peak_memory_range: + min: 12288 + max: 101388816 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 382 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 382 + job_id: jpy13nllp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 3960.0 + throughput: 252.5252525252525 + estimated_peak_memory_range: + min: 4931584 + max: 52566064 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 374 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 374 + job_id: jpv6k4jr5 + job_status: Passed + torchscript_onnx: + inference_time: 4285.0 + throughput: 233.37222870478413 + estimated_peak_memory_range: + min: 5361664 + max: 126557536 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 376 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 376 + job_id: jpxkomwl5 + job_status: Passed + reference_device_info: + name: Samsung Galaxy S24 + os: '14' + form_factor: Phone + os_name: Android + manufacturer: Samsung + chipset: Snapdragon® 8 Gen 3 + timestamp: '2024-10-14T23:07:46Z' + - torchscript_onnx_tflite: + inference_time: 5440.0 + throughput: 183.8235294117647 + estimated_peak_memory_range: + min: 65536 + max: 3419960 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 382 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 382 + job_id: jp0z0kwn5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 5374.0 + throughput: 186.08113137327874 + estimated_peak_memory_range: + min: 4960256 + max: 6210912 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 374 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 374 + job_id: jpedm2jv5 + job_status: Passed + reference_device_info: + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:07:38Z' + - torchscript_onnx_tflite: + inference_time: 5435.0 + throughput: 183.99264029438822 + estimated_peak_memory_range: + min: 253952 + max: 10299528 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 382 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 382 + job_id: jglvm7jm5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 5407.0 + throughput: 184.945441094877 + estimated_peak_memory_range: + min: 4964352 + max: 6375432 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 374 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 374 + job_id: jg9ln868g + job_status: Passed + reference_device_info: + name: SA8255 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:07:41Z' + - torchscript_onnx_tflite: + 
inference_time: 5518.0 + throughput: 181.2250815512867 + estimated_peak_memory_range: + min: 217088 + max: 5084256 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 382 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 382 + job_id: j5q6qwnop + job_status: Passed + torchscript_onnx_qnn: + inference_time: 5402.0 + throughput: 185.11662347278786 + estimated_peak_memory_range: + min: 4956160 + max: 6286088 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 374 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 374 + job_id: j5we6xjm5 + job_status: Passed + reference_device_info: + name: SA8775 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:07:40Z' + - torchscript_onnx_tflite: + inference_time: 5531.0 + throughput: 180.7991321641656 + estimated_peak_memory_range: + min: 229376 + max: 2242640 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 382 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 382 + job_id: jgkexd1ng + job_status: Passed + torchscript_onnx_qnn: + inference_time: 5350.0 + throughput: 186.9158878504673 + estimated_peak_memory_range: + min: 4972544 + max: 6175480 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 374 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 374 + job_id: jgz3dw1x5 + job_status: Passed + reference_device_info: + name: SA8650 (Proxy) + os: '13' + form_factor: Auto + os_name: Android + manufacturer: Qualcomm + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:07:39Z' + - torchscript_onnx_tflite: + inference_time: 9143.0 + throughput: 109.37329104232747 + estimated_peak_memory_range: + min: 262144 + max: 95935840 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 382 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 382 + job_id: jp8qy8nop + job_status: Passed + torchscript_onnx_qnn: + inference_time: 8619.0 + throughput: 116.0227404571296 + estimated_peak_memory_range: + min: 4931584 + max: 39126160 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 374 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 374 + job_id: jgdx10jzp + job_status: Passed + reference_device_info: + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:07:43Z' + - torchscript_onnx_tflite: + inference_time: 3848.0 + throughput: 259.87525987525987 + estimated_peak_memory_range: + min: 8192 + max: 66587568 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 382 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 382 + job_id: jp3j08yng + job_status: Passed + torchscript_onnx_qnn: + inference_time: 4086.0 + throughput: 244.73813020068528 + estimated_peak_memory_range: + min: 4927488 + max: 51834944 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 374 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 374 + job_id: j57yr6q95 + job_status: Passed + torchscript_onnx: + inference_time: 3299.0 + throughput: 303.12215822976657 + estimated_peak_memory_range: + min: 0 + max: 77494720 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 376 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 376 + job_id: jprv39z7g + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone + os_name: Android + manufacturer: 
Qualcomm + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:07:49Z' + - torchscript_onnx_qnn: + inference_time: 5700.0 + throughput: 175.43859649122808 + estimated_peak_memory_range: + min: 4923392 + max: 4923392 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 374 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 374 + job_id: jgjvn1jeg + job_status: Passed + torchscript_onnx: + inference_time: 6775.0 + throughput: 147.60147601476015 + estimated_peak_memory_range: + min: 4931584 + max: 4931584 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 376 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 376 + job_id: j5mnx4j9p + job_status: Passed + reference_device_info: + name: Snapdragon X Elite CRD + os: '11' + form_factor: Compute + os_name: Windows + manufacturer: Qualcomm + chipset: Snapdragon® X Elite + timestamp: '2024-10-14T23:07:47Z' diff --git a/qai_hub_models/models/yolov6/README.md b/qai_hub_models/models/yolov6/README.md index 82404fcf..a167d0bc 100644 --- a/qai_hub_models/models/yolov6/README.md +++ b/qai_hub_models/models/yolov6/README.md @@ -6,7 +6,7 @@ YoloV6 is a machine learning model that predicts bounding boxes and classes of objects in an image. This is based on the implementation of Yolo-v6 found -[here](https://github.com/meituan/YOLOv6/). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/yolov6). @@ -39,15 +39,19 @@ python -m qai_hub_models.models.yolov6.export Additional options are documented with the `--help` option. Note that the above script requires access to the deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Yolo-v6 can be found +* The license for the original implementation of Yolo-v6 can be found [here](https://github.com/meituan/YOLOv6/blob/47625514e7480706a46ff3c0cd0252907ac12f22/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://github.com/meituan/YOLOv6/blob/47625514e7480706a46ff3c0cd0252907ac12f22/LICENSE) +* The license for the compiled assets for on-device deployment can be found [here](https://github.com/meituan/YOLOv6/blob/47625514e7480706a46ff3c0cd0252907ac12f22/LICENSE) + ## References * [YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications](https://arxiv.org/abs/2209.02976) * [Source Model Implementation](https://github.com/meituan/YOLOv6/) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
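An aside before the yolov6 export.py and perf.yaml diffs: the perf.yaml files updated throughout this patch share one schema, so they are easy to consume programmatically. The sketch below is not part of the patch; it assumes PyYAML is installed and a repo checkout at the shown path. The microsecond unit for `inference_time` is inferred from the data itself, since throughput ≈ 1e6 / inference_time in every entry (e.g. 1e6 / 6222 ≈ 160.72 for Yolo-v6 below):

```python
# Sketch (not from the diff): summarize per-device TFLite latency from a
# perf.yaml file. inference_time is treated as microseconds because
# throughput == 1e6 / inference_time throughout these files.
import yaml  # requires PyYAML

with open("qai_hub_models/models/yolov6/perf.yaml") as f:
    perf = yaml.safe_load(f)

for model in perf["models"]:
    for entry in model["performance_metrics"]:
        device = entry["reference_device_info"]["name"]
        tflite = entry.get("torchscript_onnx_tflite")
        if not tflite:
            continue  # some entries carry only QNN metrics, or none at all
        latency_ms = tflite["inference_time"] / 1000.0
        print(f"{model['name']} on {device}: {latency_ms:.2f} ms "
              f"({tflite['throughput']:.1f} inferences/s)")
```

The guard around missing `torchscript_onnx_tflite` matters: entries such as the Snapdragon X Elite CRD report only QNN/ONNX numbers, and the RB5 proxy entry in the quantized perf.yaml carries device info with no metrics at all.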
diff --git a/qai_hub_models/models/yolov6/export.py b/qai_hub_models/models/yolov6/export.py index e7cd9712..ebb4d8a5 100644 --- a/qai_hub_models/models/yolov6/export.py +++ b/qai_hub_models/models/yolov6/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.yolov6 import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "yolov6" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. 
Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -201,7 +199,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/yolov6/perf.yaml b/qai_hub_models/models/yolov6/perf.yaml index 8e835feb..5a00fece 100644 --- a/qai_hub_models/models/yolov6/perf.yaml +++ b/qai_hub_models/models/yolov6/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: Yolo-v6 performance_metrics: - torchscript_onnx_tflite: - inference_time: 6324.0 - throughput: 158.12776723592663 + inference_time: 6222.0 + throughput: 160.72002571520412 estimated_peak_memory_range: - min: 245760 - max: 4293104 + min: 32768 + max: 2385000 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 182 - job_id: jn5q8rvn5 + job_id: jpxkom8l5 job_status: Passed torchscript_onnx_qnn: - inference_time: 
5225.0 - throughput: 191.38755980861245 + inference_time: 5257.0 + throughput: 190.22256039566292 estimated_peak_memory_range: - min: 4239360 - max: 14478112 + min: 6316032 + max: 19646608 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 228 - job_id: jlpe9wzog + job_id: j5q6qwxop job_status: Passed torchscript_onnx: - inference_time: 6501.0 - throughput: 153.82248884786955 + inference_time: 6076.0 + throughput: 164.58196181698486 estimated_peak_memory_range: - min: 45056 - max: 9677944 + min: 12288 + max: 10443168 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 228 - job_id: j0pxv643g + job_id: jg9ln8r8g job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:19:12Z' + timestamp: '2024-10-14T23:06:58Z' - torchscript_onnx_tflite: - inference_time: 4643.0 - throughput: 215.37798836958862 + inference_time: 5104.0 + throughput: 195.92476489028212 estimated_peak_memory_range: - min: 221184 - max: 86093616 + min: 12288 + max: 95823856 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 182 - job_id: j1gln2ljp + job_id: j5mnx419p job_status: Passed torchscript_onnx_qnn: - inference_time: 4085.0 - throughput: 244.79804161566707 + inference_time: 4097.0 + throughput: 244.081034903588 estimated_peak_memory_range: min: 4931584 - max: 48104320 + max: 53770832 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 228 - job_id: jygzejmog + job_id: jglvm7dm5 job_status: Passed torchscript_onnx: - inference_time: 4857.0 - throughput: 205.88840848260244 + inference_time: 4477.0 + throughput: 223.36385972749608 estimated_peak_memory_range: min: 0 - max: 101204496 + max: 110342112 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 228 - job_id: jo5mr6mdg + job_id: jp14z397p job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:19:12Z' + timestamp: '2024-10-14T23:06:59Z' - torchscript_onnx_tflite: - inference_time: 6190.0 - throughput: 161.55088852988692 + inference_time: 6174.0 + throughput: 161.96954972465176 estimated_peak_memory_range: - min: 245760 - max: 4091080 + min: 217088 + max: 3534712 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 182 - job_id: jw566zw65 + job_id: jgn6vxdq5 job_status: Passed torchscript_onnx_qnn: - inference_time: 4911.0 - throughput: 203.62451639177357 + inference_time: 5340.0 + throughput: 187.26591760299627 estimated_peak_memory_range: min: 5001216 - max: 6447752 + max: 6300376 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 228 - job_id: jmg9vymw5 + job_id: jp3j08dng job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:19:06Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:06:51Z' - 
torchscript_onnx_tflite: - inference_time: 7039.0 - throughput: 142.06563432305725 + inference_time: 6358.0 + throughput: 157.28216420257942 estimated_peak_memory_range: - min: 12288 - max: 71987488 + min: 229376 + max: 4300968 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 182 - job_id: j1p3k1635 + job_id: jp0z0k8n5 job_status: Passed torchscript_onnx_qnn: - inference_time: 6957.0 - throughput: 143.74011786689664 + inference_time: 5372.0 + throughput: 186.15040953090096 estimated_peak_memory_range: - min: 4931584 - max: 42068976 + min: 5066752 + max: 6361888 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 228 - job_id: jqp4qd18g + job_id: jgjvn19eg job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:19:10Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:06:54Z' - torchscript_onnx_tflite: - inference_time: 6362.0 - throughput: 157.18327569946558 + inference_time: 6324.0 + throughput: 158.12776723592663 estimated_peak_memory_range: - min: 274432 - max: 4106232 + min: 233472 + max: 4026880 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 182 - job_id: jwgoyn8q5 + job_id: jpy13nklp job_status: Passed torchscript_onnx_qnn: - inference_time: 4900.0 - throughput: 204.08163265306123 + inference_time: 5321.0 + throughput: 187.93459875963165 estimated_peak_memory_range: - min: 4997120 - max: 6228768 + min: 5001216 + max: 6267464 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 228 - job_id: jnp10wj85 + job_id: jpv6k48r5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:19:07Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:06:53Z' - torchscript_onnx_tflite: - inference_time: 6281.0 - throughput: 159.2103168285305 + inference_time: 6286.0 + throughput: 159.0836780146357 estimated_peak_memory_range: - min: 253952 - max: 4016480 + min: 0 + max: 2409872 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 182 - job_id: j1pv3rdk5 + job_id: jp2kyjqqp job_status: Passed torchscript_onnx_qnn: - inference_time: 4883.0 - throughput: 204.7921359819783 + inference_time: 5354.0 + throughput: 186.77624206200971 estimated_peak_memory_range: - min: 4964352 - max: 6584096 + min: 5009408 + max: 6207208 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 228 - job_id: jvgdwq3r5 + job_id: jgo26mxkp job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:19:08Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:06:52Z' - torchscript_onnx_tflite: - inference_time: 6361.0 - throughput: 157.2079861656972 + inference_time: 7785.0 + throughput: 128.45215157353886 estimated_peak_memory_range: - min: 233472 - max: 218946696 + min: 217088 + max: 
78221968 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 182 - job_id: j7gjx27vp + job_id: jprv39m7g job_status: Passed torchscript_onnx_qnn: - inference_time: 4924.0 - throughput: 203.08692120227457 + inference_time: 6945.0 + throughput: 143.98848092152627 estimated_peak_memory_range: - min: 5005312 - max: 6275136 + min: 4931584 + max: 49288960 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 228 - job_id: jz57zl4vp + job_id: jgz3dw6x5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:06:56Z' + - torchscript_onnx_tflite: + inference_time: 4370.0 + throughput: 228.83295194508008 + estimated_peak_memory_range: + min: 212992 + max: 62186016 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 182 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 182 + job_id: jgkexdwng + job_status: Passed + torchscript_onnx_qnn: + inference_time: 3412.0 + throughput: 293.08323563892145 + estimated_peak_memory_range: + min: 4927488 + max: 50714384 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 228 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 228 + job_id: j5we6xkm5 + job_status: Passed + torchscript_onnx: + inference_time: 4075.0 + throughput: 245.39877300613497 + estimated_peak_memory_range: + min: 5337088 + max: 74489872 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 228 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 228 + job_id: jp4lr8715 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:19:09Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:07:02Z' - torchscript_onnx_qnn: - inference_time: 5218.0 - throughput: 191.64430816404752 + inference_time: 5728.0 + throughput: 174.58100558659217 estimated_peak_memory_range: min: 4923392 max: 4923392 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 228 - job_id: jz5wo373p + job_id: j56y4vxyp job_status: Passed torchscript_onnx: - inference_time: 6544.0 - throughput: 152.8117359413203 + inference_time: 6407.0 + throughput: 156.07928827844546 estimated_peak_memory_range: - min: 6971392 - max: 6971392 + min: 8302592 + max: 8302592 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 228 - job_id: jegn2mnkg + job_id: jgdx10kzp job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:19:13Z' + timestamp: '2024-10-14T23:07:00Z' diff --git a/qai_hub_models/models/yolov7/README.md b/qai_hub_models/models/yolov7/README.md index fe6030f0..861ce55b 100644 --- a/qai_hub_models/models/yolov7/README.md +++ b/qai_hub_models/models/yolov7/README.md @@ -6,7 +6,7 @@ YoloV7 is a machine learning model that predicts bounding boxes and classes of objects in an image. This is based on the implementation of Yolo-v7 found -[here](https://github.com/WongKinYiu/yolov7/). 
This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/yolov7). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.yolov7.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Yolo-v7 can be found +* The license for the original implementation of Yolo-v7 can be found [here](https://github.com/WongKinYiu/yolov7/blob/main/LICENSE.md). -- The license for the compiled assets for on-device deployment can be found [here](https://github.com/WongKinYiu/yolov7/blob/main/LICENSE.md) +* The license for the compiled assets for on-device deployment can be found [here](https://github.com/WongKinYiu/yolov7/blob/main/LICENSE.md) + ## References * [YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors](https://arxiv.org/abs/2207.02696) * [Source Model Implementation](https://github.com/WongKinYiu/yolov7/) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/yolov7/export.py b/qai_hub_models/models/yolov7/export.py index 4a10d175..5b4b87a8 100644 --- a/qai_hub_models/models/yolov7/export.py +++ b/qai_hub_models/models/yolov7/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.yolov7 import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options.
+ Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "yolov7" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( # Trace the model source_model = torch.jit.trace(model.to("cpu"), make_torch_inputs(input_spec)) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -134,7 +132,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -149,7 +147,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -170,13 +168,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -201,7 +199,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/yolov7/perf.yaml b/qai_hub_models/models/yolov7/perf.yaml index e24a7b92..46404a95 100644 --- a/qai_hub_models/models/yolov7/perf.yaml +++ b/qai_hub_models/models/yolov7/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: Yolo-v7 performance_metrics: - torchscript_onnx_tflite: - inference_time: 17219.0 - throughput: 58.07538184563563 + inference_time: 17188.0 + throughput: 58.180125669071444 estimated_peak_memory_range: - min: 45056 - max: 2664960 + min: 663552 + max: 3387440 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: jlpe9wl1g + job_id: jp4lr8285 job_status: Passed torchscript_onnx_qnn: - inference_time: 10503.0 - throughput: 95.21089212605922 + inference_time: 10527.0 + throughput: 94.99382540134891 estimated_peak_memory_range: - min: 5009408 - max: 19070104 + min: 4984832 + max: 22130440 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 221 - job_id: jmg9vydw5 + job_id: j5q6qwdnp job_status: Passed torchscript_onnx: - inference_time: 13692.0 - throughput: 73.03534910896875 + inference_time: 12235.0 + throughput: 81.73273395995096 estimated_peak_memory_range: - min: 57344 - max: 11066416 + min: 53248 + max: 13030160 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 222 - job_id: joprk2w05 + job_id: jg9ln87wg job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:18:28Z' + timestamp: '2024-10-14T23:06:11Z' - torchscript_onnx_tflite: - inference_time: 11637.0 - throughput: 85.93280054996993 + inference_time: 11658.0 + throughput: 85.7780065191285 estimated_peak_memory_range: - min: 307200 - max: 91891552 + min: 638976 + max: 105670416 primary_compute_unit: NPU precision: fp16 layer_info: @@ 
-111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: jygzej4kg + job_id: jpxkomz35 job_status: Passed torchscript_onnx_qnn: - inference_time: 8517.0 - throughput: 117.41223435481977 + inference_time: 7221.0 + throughput: 138.48497438027974 estimated_peak_memory_range: - min: 4931584 - max: 71172528 + min: 4956160 + max: 79005712 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 221 - job_id: jnp10w685 + job_id: jglvm7qj5 job_status: Passed torchscript_onnx: - inference_time: 9230.0 - throughput: 108.34236186348862 + inference_time: 8179.0 + throughput: 122.26433549333659 estimated_peak_memory_range: - min: 5140480 - max: 112479088 + min: 479232 + max: 124426000 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 222 - job_id: jep289erp + job_id: jp14z3k8p job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:18:30Z' + timestamp: '2024-10-14T23:06:12Z' - torchscript_onnx_tflite: - inference_time: 17137.0 - throughput: 58.35327070082278 + inference_time: 17172.0 + throughput: 58.23433496389471 estimated_peak_memory_range: - min: 655360 - max: 2642992 + min: 618496 + max: 9138064 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: jz5wo346p + job_id: jgn6vxwk5 job_status: Passed torchscript_onnx_qnn: - inference_time: 10323.0 - throughput: 96.8710646130001 + inference_time: 10322.0 + throughput: 96.8804495252858 estimated_peak_memory_range: - min: 4964352 - max: 6128368 + min: 5005312 + max: 6319184 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 221 - job_id: jz57zl9vp + job_id: jp3j08r3g job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:18:23Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:06:03Z' - torchscript_onnx_tflite: - inference_time: 19520.0 - throughput: 51.22950819672131 + inference_time: 17145.0 + throughput: 58.326042578011084 estimated_peak_memory_range: - min: 647168 - max: 97617328 + min: 77824 + max: 2396616 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: jmg9vydl5 + job_id: jp0z0kx95 job_status: Passed torchscript_onnx_qnn: - inference_time: 12670.0 - throughput: 78.92659826361484 + inference_time: 10334.0 + throughput: 96.76795045480937 estimated_peak_memory_range: - min: 4952064 - max: 56412288 + min: 4993024 + max: 6236288 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 221 - job_id: jegn2mkkg + job_id: jgjvn16vg job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:18:27Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:06:07Z' - torchscript_onnx_tflite: - inference_time: 17221.0 - throughput: 58.06863712908658 + inference_time: 17142.0 + throughput: 
58.33625014584062 estimated_peak_memory_range: - min: 626688 - max: 3193392 + min: 12288 + max: 1619720 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: jnp10w625 + job_id: jpy13ny8p job_status: Passed torchscript_onnx_qnn: - inference_time: 10449.0 - throughput: 95.70293808019906 + inference_time: 10477.0 + throughput: 95.44716999140975 estimated_peak_memory_range: - min: 4993024 - max: 6624200 + min: 5009408 + max: 6255424 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 221 - job_id: jqp4qd38g + job_id: jpv6k4yk5 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:18:24Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:06:05Z' - torchscript_onnx_tflite: - inference_time: 17170.0 - throughput: 58.241118229470004 + inference_time: 17156.0 + throughput: 58.28864537188156 estimated_peak_memory_range: - min: 16384 - max: 249429656 + min: 49152 + max: 211980424 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: jvgdwq2e5 + job_id: jp2kyjzrp job_status: Passed torchscript_onnx_qnn: - inference_time: 10449.0 - throughput: 95.70293808019906 + inference_time: 10469.0 + throughput: 95.52010698251982 estimated_peak_memory_range: - min: 4997120 - max: 6179960 + min: 4968448 + max: 6197136 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 221 - job_id: j0pxv6x3g + job_id: jgo26m9qp job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:18:25Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:06:04Z' - torchscript_onnx_tflite: - inference_time: 17204.0 - throughput: 58.126017205301096 + inference_time: 19533.0 + throughput: 51.19541289100496 estimated_peak_memory_range: - min: 827392 - max: 2628400 + min: 634880 + max: 107445984 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 215 - job_id: jz5wo343p + job_id: jprv3970g job_status: Passed torchscript_onnx_qnn: - inference_time: 10542.0 - throughput: 94.8586605957124 + inference_time: 12605.0 + throughput: 79.33359777865927 estimated_peak_memory_range: - min: 5001216 - max: 6655832 + min: 4931584 + max: 63855088 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 221 - job_id: jo5mr68dg + job_id: jgz3dwqo5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:06:09Z' + - torchscript_onnx_tflite: + inference_time: 12241.0 + throughput: 81.6926721673066 + estimated_peak_memory_range: + min: 614400 + max: 72754016 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 215 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 215 + job_id: jgkexdkwg + job_status: Passed + torchscript_onnx_qnn: + inference_time: 5825.0 + throughput: 
171.67381974248926 + estimated_peak_memory_range: + min: 4927488 + max: 73396624 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 221 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 221 + job_id: j5we6x035 + job_status: Passed + torchscript_onnx: + inference_time: 8115.0 + throughput: 123.22858903265558 + estimated_peak_memory_range: + min: 6352896 + max: 90064304 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 222 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 222 + job_id: jg9ln878g + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:18:26Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:06:15Z' - torchscript_onnx_qnn: - inference_time: 10922.0 - throughput: 91.55832265152902 + inference_time: 10949.0 + throughput: 91.33254178463787 estimated_peak_memory_range: min: 4923392 max: 4923392 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 221 - job_id: jvgdwq2r5 + job_id: j56y4v06p job_status: Passed torchscript_onnx: - inference_time: 14009.0 - throughput: 71.38268256121064 + inference_time: 14157.0 + throughput: 70.63643427279791 estimated_peak_memory_range: - min: 9863168 - max: 9863168 + min: 9900032 + max: 9900032 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 222 - job_id: jqpyejm8g + job_id: jgdx10yrp job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:18:31Z' + timestamp: '2024-10-14T23:06:13Z' diff --git a/qai_hub_models/models/yolov7_quantized/README.md b/qai_hub_models/models/yolov7_quantized/README.md index bb4f1089..a271f87c 100644 --- a/qai_hub_models/models/yolov7_quantized/README.md +++ b/qai_hub_models/models/yolov7_quantized/README.md @@ -6,7 +6,7 @@ YoloV7 is a machine learning model that predicts bounding boxes and classes of objects in an image. This model is post-training quantized to int8 using samples from the COCO dataset. This is based on the implementation of Yolo-v7-Quantized found -[here](https://github.com/WongKinYiu/yolov7/). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/yolov7_quantized). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.yolov7_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of Yolo-v7-Quantized can be found +* The license for the original implementation of Yolo-v7-Quantized can be found [here](https://github.com/WongKinYiu/yolov7/blob/main/LICENSE.md).
-- The license for the compiled assets for on-device deployment can be found [here](https://github.com/WongKinYiu/yolov7/blob/main/LICENSE.md) +* The license for the compiled assets for on-device deployment can be found [here](https://github.com/WongKinYiu/yolov7/blob/main/LICENSE.md) + ## References * [YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors](https://arxiv.org/abs/2207.02696) * [Source Model Implementation](https://github.com/WongKinYiu/yolov7/) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/yolov7_quantized/export.py b/qai_hub_models/models/yolov7_quantized/export.py index 9784d91a..25748e89 100644 --- a/qai_hub_models/models/yolov7_quantized/export.py +++ b/qai_hub_models/models/yolov7_quantized/export.py @@ -10,17 +10,17 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.yolov7_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.printing import ( print_inference_metrics, @@ -45,20 +45,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -80,10 +78,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). 
""" model_name = "yolov7_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,7 +107,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( target_runtime, output_path, input_spec ) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -135,7 +133,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +148,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +169,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -202,7 +200,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/yolov7_quantized/perf.yaml b/qai_hub_models/models/yolov7_quantized/perf.yaml index 73911737..eabd9a87 100644 --- a/qai_hub_models/models/yolov7_quantized/perf.yaml +++ b/qai_hub_models/models/yolov7_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,44 +20,41 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: Yolo-v7-Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 4430.0 - throughput: 225.73363431151242 + inference_time: 4395.0 + throughput: 227.53128555176337 estimated_peak_memory_range: - min: 200704 - max: 150294072 + min: 368640 + max: 2121736 primary_compute_unit: NPU precision: int8 layer_info: @@ -64,14 +62,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 229 - job_id: jygzej8kg + job_id: jgo26mdqp job_status: Passed torchscript_onnx_qnn: - inference_time: 4818.0 - throughput: 207.55500207555002 + inference_time: 4817.0 + throughput: 207.59809009757112 estimated_peak_memory_range: - min: 16384 - max: 10251656 + min: 20480 + max: 10430648 primary_compute_unit: NPU precision: int8 layer_info: @@ -79,14 +77,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 221 - job_id: jegn2m7rg + job_id: jgn6vxqk5 job_status: Passed torchscript_onnx: - inference_time: 8991.0 - throughput: 111.22233344455567 + inference_time: 7451.0 + throughput: 134.21017313112333 estimated_peak_memory_range: - min: 12288 - max: 9416680 + min: 49152 + max: 12716312 primary_compute_unit: NPU precision: int8 layer_info: @@ -94,7 +92,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 253 - job_id: jw566zd05 + job_id: jpv6k4nk5 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -103,13 +101,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:17:52Z' + timestamp: '2024-10-14T23:05:23Z' - torchscript_onnx_tflite: - inference_time: 2831.0 - throughput: 353.2320734722713 + inference_time: 
2825.0 + throughput: 353.98230088495575 estimated_peak_memory_range: min: 12288 - max: 68718304 + max: 76290288 primary_compute_unit: NPU precision: int8 layer_info: @@ -117,14 +115,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 229 - job_id: jz5wo316p + job_id: jpv6k4mk5 job_status: Passed torchscript_onnx_qnn: - inference_time: 3205.0 - throughput: 312.01248049922 + inference_time: 3164.0 + throughput: 316.05562579013906 estimated_peak_memory_range: min: 1245184 - max: 49689920 + max: 59792256 primary_compute_unit: NPU precision: int8 layer_info: @@ -132,14 +130,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 221 - job_id: joprk2n95 + job_id: jprv39d0g job_status: Passed torchscript_onnx: - inference_time: 6372.0 - throughput: 156.9365976145637 + inference_time: 5362.0 + throughput: 186.4975755315181 estimated_peak_memory_range: - min: 90112 - max: 111436464 + min: 311296 + max: 128002480 primary_compute_unit: NPU precision: int8 layer_info: @@ -147,7 +145,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 253 - job_id: j1p3k1wl5 + job_id: jgjvn18vg job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -156,13 +154,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:17:53Z' + timestamp: '2024-10-14T23:05:24Z' - torchscript_onnx_tflite: - inference_time: 4381.0 - throughput: 228.2583884957772 + inference_time: 9943.0 + throughput: 100.57326762546515 estimated_peak_memory_range: - min: 180224 - max: 150514552 + min: 159744 + max: 73956800 primary_compute_unit: NPU precision: int8 layer_info: @@ -170,14 +168,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 229 - job_id: jmg9vyxl5 + job_id: jp4lr8485 job_status: Passed torchscript_onnx_qnn: - inference_time: 3511.0 - throughput: 284.8191398461977 + inference_time: 13531.0 + throughput: 73.90436774813392 estimated_peak_memory_range: - min: 1269760 - max: 2524184 + min: 1732608 + max: 9872768 primary_compute_unit: NPU precision: int8 layer_info: @@ -185,22 +183,45 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 221 - job_id: jqpyej77g + job_id: jp3j0873g job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS6490 Proxy + timestamp: '2024-10-14T23:05:21Z' + - torchscript_onnx_tflite: + inference_time: 96943.0 + throughput: 10.315339942027789 + estimated_peak_memory_range: + min: 3944448 + max: 35907808 + primary_compute_unit: GPU + precision: int8 + layer_info: + layers_on_npu: 33 + layers_on_gpu: 127 + layers_on_cpu: 69 + total_layers: 229 + job_id: jpxkomr35 + job_status: Passed + reference_device_info: + name: RB5 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:17:46Z' + chipset: QCS8250 Proxy + timestamp: '2024-10-14T23:05:10Z' - torchscript_onnx_tflite: - inference_time: 5021.0 - throughput: 199.16351324437363 + inference_time: 4372.0 + throughput: 228.72827081427263 estimated_peak_memory_range: - min: 180224 - max: 72680320 + min: 176128 + max: 4489448 primary_compute_unit: NPU precision: int8 layer_info: @@ -208,14 +229,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 229 - job_id: jnp10wv25 + job_id: jgjvn1yvg job_status: Passed torchscript_onnx_qnn: - inference_time: 4649.0 - throughput: 215.10002151000216 + inference_time: 3741.0 + throughput: 267.30820636193533 
estimated_peak_memory_range: - min: 1273856 - max: 52987264 + min: 1265664 + max: 3092536 primary_compute_unit: NPU precision: int8 layer_info: @@ -223,22 +244,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 221 - job_id: jn5q8rm45 + job_id: jpy13n28p job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:17:50Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:05:15Z' - torchscript_onnx_tflite: - inference_time: 4392.0 - throughput: 227.68670309653916 + inference_time: 4375.0 + throughput: 228.57142857142858 estimated_peak_memory_range: - min: 180224 - max: 2400480 + min: 0 + max: 1559680 primary_compute_unit: NPU precision: int8 layer_info: @@ -246,14 +267,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 229 - job_id: jvgdwqze5 + job_id: jp14z318p job_status: Passed torchscript_onnx_qnn: - inference_time: 3526.0 - throughput: 283.60748723766307 + inference_time: 3786.0 + throughput: 264.1310089804543 estimated_peak_memory_range: - min: 1294336 - max: 2490600 + min: 1282048 + max: 2529512 primary_compute_unit: NPU precision: int8 layer_info: @@ -261,22 +282,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 221 - job_id: j2p0y2v6g + job_id: j5q6qw1np job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:17:47Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:05:18Z' - torchscript_onnx_tflite: - inference_time: 4372.0 - throughput: 228.72827081427263 + inference_time: 4360.0 + throughput: 229.3577981651376 estimated_peak_memory_range: - min: 192512 - max: 1697224 + min: 176128 + max: 1892640 primary_compute_unit: NPU precision: int8 layer_info: @@ -284,14 +305,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 229 - job_id: jz57zl7lp + job_id: jg9ln82wg job_status: Passed torchscript_onnx_qnn: - inference_time: 3550.0 - throughput: 281.6901408450704 + inference_time: 3741.0 + throughput: 267.30820636193533 estimated_peak_memory_range: - min: 1269760 - max: 2601032 + min: 1286144 + max: 2711568 primary_compute_unit: NPU precision: int8 layer_info: @@ -299,7 +320,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 221 - job_id: j1p8om4xg + job_id: jp8qy8rkp job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -307,14 +328,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:17:48Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:05:17Z' - torchscript_onnx_tflite: - inference_time: 4416.0 - throughput: 226.44927536231884 + inference_time: 4402.0 + throughput: 227.1694684234439 estimated_peak_memory_range: - min: 167936 - max: 11549496 + min: 823296 + max: 2827872 primary_compute_unit: NPU precision: int8 layer_info: @@ -322,14 +343,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 229 - job_id: jqp4qd9vg + job_id: j5we6xz35 job_status: Passed torchscript_onnx_qnn: - inference_time: 3529.0 - throughput: 283.36639274582035 + inference_time: 3749.0 + throughput: 266.7377967457989 estimated_peak_memory_range: - min: 1273856 - max: 2607304 + min: 1306624 + max: 2952528 primary_compute_unit: NPU precision: int8 layer_info: @@ -337,22 +358,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 
total_layers: 221 - job_id: jogkzq92g + job_id: jp0z0k995 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:17:49Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:05:16Z' - torchscript_onnx_tflite: - inference_time: 9988.0 - throughput: 100.1201441730076 + inference_time: 5008.0 + throughput: 199.68051118210863 estimated_peak_memory_range: - min: 159744 - max: 74328688 + min: 163840 + max: 81691760 primary_compute_unit: NPU precision: int8 layer_info: @@ -360,14 +381,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 229 - job_id: j0pxv6d1g + job_id: jpedm2xo5 job_status: Passed torchscript_onnx_qnn: - inference_time: 13418.0 - throughput: 74.52675510508273 + inference_time: 4645.0 + throughput: 215.28525296017222 estimated_peak_memory_range: - min: 1282048 - max: 9008416 + min: 1245184 + max: 62789088 primary_compute_unit: NPU precision: int8 layer_info: @@ -375,42 +396,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 221 - job_id: j1gln218p + job_id: j56y4vm6p job_status: Passed reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T11:17:51Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:05:20Z' - torchscript_onnx_tflite: - inference_time: 95801.0 - throughput: 10.438304401832966 + inference_time: 2893.0 + throughput: 345.66194262011754 estimated_peak_memory_range: - min: 3633152 - max: 54027720 - primary_compute_unit: GPU + min: 8192 + max: 54127424 + primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 33 - layers_on_gpu: 127 - layers_on_cpu: 69 + layers_on_npu: 229 + layers_on_gpu: 0 + layers_on_cpu: 0 total_layers: 229 - job_id: jo5mr6dwg + job_id: j5mnx4kdp + job_status: Passed + torchscript_onnx_qnn: + inference_time: 3348.0 + throughput: 298.6857825567503 + estimated_peak_memory_range: + min: 1241088 + max: 51929040 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 221 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 221 + job_id: jgo26mwqp + job_status: Passed + torchscript_onnx: + inference_time: 5191.0 + throughput: 192.64110961279138 + estimated_peak_memory_range: + min: 1585152 + max: 94095936 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 253 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 253 + job_id: j5we6xr35 job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T11:17:42Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:05:27Z' - torchscript_onnx_qnn: - inference_time: 3883.0 - throughput: 257.53283543651816 + inference_time: 4189.0 + throughput: 238.72045834328003 estimated_peak_memory_range: min: 1232896 max: 1232896 @@ -421,14 +472,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 221 - job_id: jep289v4p + job_id: jp2kyjdrp job_status: Passed torchscript_onnx: - inference_time: 9351.0 - throughput: 106.94043417816276 + inference_time: 9178.0 + throughput: 108.95619960775768 estimated_peak_memory_range: - min: 6844416 - max: 6844416 + min: 8142848 + max: 8142848 primary_compute_unit: NPU precision: int8 layer_info: @@ 
-436,7 +487,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 253 - job_id: jwgoyn4x5 + job_id: jpedm2no5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -445,4 +496,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:17:54Z' + timestamp: '2024-10-14T23:05:25Z' diff --git a/qai_hub_models/models/yolov8_det/README.md b/qai_hub_models/models/yolov8_det/README.md index fc6ab59b..141d16c8 100644 --- a/qai_hub_models/models/yolov8_det/README.md +++ b/qai_hub_models/models/yolov8_det/README.md @@ -6,7 +6,7 @@ Ultralytics YOLOv8 is a machine learning model that predicts bounding boxes and classes of objects in an image. This is based on the implementation of YOLOv8-Detection found -[here](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/models/yolo/detect). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/yolov8_det). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.yolov8_det.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of YOLOv8-Detection can be found +* The license for the original implementation of YOLOv8-Detection can be found [here](https://github.com/ultralytics/ultralytics/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://github.com/ultralytics/ultralytics/blob/main/LICENSE) +* The license for the compiled assets for on-device deployment can be found [here](https://github.com/ultralytics/ultralytics/blob/main/LICENSE) + ## References * [Ultralytics YOLOv8 Docs: Object Detection](https://docs.ultralytics.com/tasks/detect/) * [Source Model Implementation](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/models/yolo/detect) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
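Every export.py file touched by this change makes the same API move: `export_model` now returns a named `ExportResult` instead of a positional 3-tuple. Below is a minimal caller-side sketch of the migration, assuming the keyword arguments shown in the docstrings above (the device string is an illustrative placeholder, and the `List[str]` branch of the return annotation is ignored here).

```python
# Hedged sketch of consuming the new ExportResult API; not example code from
# the repository itself. Keyword names follow the export.py docstrings above;
# the device string is a hypothetical placeholder.
from qai_hub_models.models.yolov8_det.export import export_model

result = export_model(
    device="Samsung Galaxy S23",  # any device listed in perf.yaml
    skip_inferencing=True,        # each of the last 4 recipe steps is optional
    skip_summary=True,
)

# Old callers unpacked positionally:
#     compile_job, profile_job, inference_job = export_model(...)
# New callers read named fields, so field order can no longer silently swap:
print(result.compile_job)           # hub.CompileJob metadata
if result.profile_job is not None:  # None when skip_profiling=True
    assert result.profile_job.wait().success  # same check the summary step runs
```

Note that the old tuple was ordered (compile, profile, inference) while the `ExportResult` constructor in these hunks passes (compile, inference, profile); with named fields that reordering is harmless, which is exactly the fragility the struct removes.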
diff --git a/qai_hub_models/models/yolov8_det/export.py b/qai_hub_models/models/yolov8_det/export.py index 985b6c19..9a73f192 100644 --- a/qai_hub_models/models/yolov8_det/export.py +++ b/qai_hub_models/models/yolov8_det/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.yolov8_det import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "yolov8_det" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -122,7 +120,7 @@ def export_model( model.to("cpu"), make_torch_inputs(input_spec), check_trace=False ) - # 2. Compile the model to an on-device asset + # 2. 
Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -136,7 +134,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -151,7 +149,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -172,13 +170,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -203,7 +201,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/yolov8_det/perf.yaml b/qai_hub_models/models/yolov8_det/perf.yaml index 2261ac50..cb83042b 100644 --- a/qai_hub_models/models/yolov8_det/perf.yaml +++ b/qai_hub_models/models/yolov8_det/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: YOLOv8-Detection performance_metrics: - torchscript_onnx_tflite: - inference_time: 5248.0 - throughput: 190.5487804878049 + inference_time: 5198.0 + throughput: 192.3816852635629 estimated_peak_memory_range: - min: 40960 - max: 5407336 + min: 16384 + max: 171349048 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 290 - job_id: jmg9vykl5 + job_id: j5q6qwlnp job_status: Passed torchscript_onnx_qnn: 
- inference_time: 5281.0 - throughput: 189.3580761219466 + inference_time: 5304.0 + throughput: 188.5369532428356 estimated_peak_memory_range: - min: 4227072 - max: 16782792 + min: 4968448 + max: 20730640 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 285 - job_id: jegn2morg + job_id: j5we6xy35 job_status: Passed torchscript_onnx: - inference_time: 6367.0 - throughput: 157.0598397989634 + inference_time: 6063.0 + throughput: 164.93485073396008 estimated_peak_memory_range: - min: 5300224 - max: 10908312 + min: 4960256 + max: 10882952 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 286 - job_id: j1gln2o8p + job_id: jp2kyjorp job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:17:00Z' + timestamp: '2024-10-14T23:04:23Z' - torchscript_onnx_tflite: - inference_time: 3835.0 - throughput: 260.7561929595828 + inference_time: 3840.0 + throughput: 260.4166666666667 estimated_peak_memory_range: min: 12288 - max: 87381344 + max: 96494496 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 290 - job_id: jnp10w725 + job_id: jglvm7yj5 job_status: Passed torchscript_onnx_qnn: - inference_time: 3815.0 - throughput: 262.12319790301444 + inference_time: 3826.0 + throughput: 261.3695765812859 estimated_peak_memory_range: - min: 0 - max: 43800208 + min: 4931584 + max: 55705856 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 285 - job_id: joprk2o95 + job_id: jg9ln8owg job_status: Passed torchscript_onnx: - inference_time: 4524.0 - throughput: 221.04332449160034 + inference_time: 5039.0 + throughput: 198.45207382417146 estimated_peak_memory_range: - min: 4128768 - max: 109267504 + min: 5382144 + max: 119552640 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 286 - job_id: jw566zr05 + job_id: jpy13n88p job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:17:01Z' + timestamp: '2024-10-14T23:04:24Z' - torchscript_onnx_tflite: - inference_time: 5172.0 - throughput: 193.34880123743233 + inference_time: 5145.0 + throughput: 194.3634596695821 estimated_peak_memory_range: - min: 225280 - max: 167035944 + min: 229376 + max: 2187816 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 290 - job_id: jvgdwq8e5 + job_id: j56y4v86p job_status: Passed torchscript_onnx_qnn: - inference_time: 5039.0 - throughput: 198.45207382417146 + inference_time: 4996.0 + throughput: 200.160128102482 estimated_peak_memory_range: min: 4993024 - max: 6751880 + max: 6358000 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 285 - job_id: jqpyejq7g + job_id: jgdx106rp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:16:55Z' + chipset: QCS8550 Proxy + timestamp: 
'2024-10-14T23:04:16Z' - torchscript_onnx_tflite: - inference_time: 8713.0 - throughput: 114.77103179157581 + inference_time: 5198.0 + throughput: 192.3816852635629 estimated_peak_memory_range: - min: 217088 - max: 81988800 + min: 36864 + max: 4522160 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 290 - job_id: jz57zlklp + job_id: jgjvn13vg job_status: Passed torchscript_onnx_qnn: - inference_time: 7883.0 - throughput: 126.85525815045034 + inference_time: 5109.0 + throughput: 195.73302016050107 estimated_peak_memory_range: - min: 4145152 - max: 33283632 + min: 4956160 + max: 6524256 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 285 - job_id: jn5q8rz45 + job_id: jpxkom035 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:16:59Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:04:19Z' - torchscript_onnx_tflite: - inference_time: 5251.0 - throughput: 190.43991620643686 + inference_time: 5234.0 + throughput: 191.05846388995033 estimated_peak_memory_range: - min: 45056 - max: 167626888 + min: 229376 + max: 1799032 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 290 - job_id: jqp4qdmvg + job_id: jpv6k42k5 job_status: Passed torchscript_onnx_qnn: - inference_time: 5024.0 - throughput: 199.04458598726114 + inference_time: 5085.0 + throughput: 196.65683382497542 estimated_peak_memory_range: - min: 5013504 - max: 6329568 + min: 4988928 + max: 6640168 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 285 - job_id: j2p0y2d6g + job_id: jp4lr8e85 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:16:56Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:04:18Z' - torchscript_onnx_tflite: - inference_time: 5191.0 - throughput: 192.64110961279138 + inference_time: 5156.0 + throughput: 193.9487975174554 estimated_peak_memory_range: - min: 225280 - max: 2186664 + min: 258048 + max: 16794984 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 290 - job_id: j0pxv631g + job_id: jgo26mlqp job_status: Passed torchscript_onnx_qnn: - inference_time: 5095.0 - throughput: 196.27085377821393 + inference_time: 5069.0 + throughput: 197.27756954034325 estimated_peak_memory_range: - min: 4964352 - max: 6242344 + min: 4993024 + max: 6330696 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 285 - job_id: j1p8om6xg + job_id: j57yr6ov5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:16:57Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:04:17Z' - torchscript_onnx_tflite: - inference_time: 5180.0 - throughput: 193.05019305019306 + inference_time: 8670.0 + throughput: 115.34025374855824 estimated_peak_memory_range: - min: 225280 - max: 
16609440 + min: 245760 + max: 86445504 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 290 - job_id: jo5mr6owg + job_id: jp3j08z3g job_status: Passed torchscript_onnx_qnn: - inference_time: 5125.0 - throughput: 195.1219512195122 + inference_time: 7878.0 + throughput: 126.93577050012694 estimated_peak_memory_range: - min: 4972544 - max: 6195568 + min: 4931584 + max: 41834944 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 285 - job_id: jogkzqo2g + job_id: jgn6vx1k5 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:04:21Z' + - torchscript_onnx_tflite: + inference_time: 3025.0 + throughput: 330.57851239669424 + estimated_peak_memory_range: + min: 8192 + max: 60656016 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 290 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 290 + job_id: jgz3dwzo5 + job_status: Passed + torchscript_onnx_qnn: + inference_time: 3595.0 + throughput: 278.1641168289291 + estimated_peak_memory_range: + min: 4927488 + max: 51255872 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 285 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 285 + job_id: jprv39x0g + job_status: Passed + torchscript_onnx: + inference_time: 4011.0 + throughput: 249.3143854400399 + estimated_peak_memory_range: + min: 5365760 + max: 76348800 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 286 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 286 + job_id: jgkexd6wg + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:16:58Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:04:27Z' - torchscript_onnx_qnn: - inference_time: 5442.0 - throughput: 183.75597206909225 + inference_time: 5524.0 + throughput: 181.02824040550325 estimated_peak_memory_range: min: 4923392 max: 4923392 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 285 - job_id: jep28944p + job_id: jp14z3o8p job_status: Passed torchscript_onnx: - inference_time: 6486.0 - throughput: 154.17823003391922 + inference_time: 6702.0 + throughput: 149.20919128618323 estimated_peak_memory_range: - min: 4931584 - max: 4931584 + min: 5443584 + max: 5443584 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 286 - job_id: j1p3k1xl5 + job_id: jp0z0ko95 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:17:02Z' + timestamp: '2024-10-14T23:04:25Z' diff --git a/qai_hub_models/models/yolov8_det_quantized/README.md b/qai_hub_models/models/yolov8_det_quantized/README.md index 025830a7..81e68ba0 100644 --- a/qai_hub_models/models/yolov8_det_quantized/README.md +++ b/qai_hub_models/models/yolov8_det_quantized/README.md @@ -6,7 +6,7 @@ Ultralytics YOLOv8 is a machine learning model that predicts bounding boxes and classes of objects in an image. 
This model is post-training quantized to int8 using samples from the COCO dataset. This is based on the implementation of YOLOv8-Detection-Quantized found -[here](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/models/yolo/detect). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/yolov8_det_quantized). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.yolov8_det_quantized.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of YOLOv8-Detection-Quantized can be found +* The license for the original implementation of YOLOv8-Detection-Quantized can be found [here](https://github.com/ultralytics/ultralytics/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://github.com/ultralytics/ultralytics/blob/main/LICENSE) +* The license for the compiled assets for on-device deployment can be found [here](https://github.com/ultralytics/ultralytics/blob/main/LICENSE) + ## References * [Ultralytics YOLOv8 Docs: Object Detection](https://docs.ultralytics.com/tasks/detect/) * [Source Model Implementation](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/models/yolo/detect) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com). diff --git a/qai_hub_models/models/yolov8_det_quantized/export.py b/qai_hub_models/models/yolov8_det_quantized/export.py index 69f20f6a..3682d38a 100644 --- a/qai_hub_models/models/yolov8_det_quantized/export.py +++ b/qai_hub_models/models/yolov8_det_quantized/export.py @@ -10,17 +10,17 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.yolov8_det_quantized import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.printing import ( print_inference_metrics, @@ -45,20 +45,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2.
Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -80,10 +78,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "yolov8_det_quantized" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -109,7 +107,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -120,7 +118,7 @@ def export_model( target_runtime, output_path, input_spec ) - # 2. Compile the model to an on-device asset + # 2. Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -135,7 +133,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -150,7 +148,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -171,13 +169,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. 
Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -202,7 +200,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/yolov8_det_quantized/perf.yaml b/qai_hub_models/models/yolov8_det_quantized/perf.yaml index 82f67d77..6a15ee80 100644 --- a/qai_hub_models/models/yolov8_det_quantized/perf.yaml +++ b/qai_hub_models/models/yolov8_det_quantized/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,44 +20,41 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS6490 (Proxy) - RB3 Gen 2 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) - QCS8250 (Proxy) - RB5 (Proxy) - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Sa8775p Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Qcs8250 Proxy - - Qcs6490 Proxy + - Snapdragon® X Plus 8-Core + - QCS6490 Proxy + - QCS8250 Proxy + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: YOLOv8-Detection-Quantized performance_metrics: - torchscript_onnx_tflite: - inference_time: 1913.0 - throughput: 522.7391531625718 + inference_time: 1915.0 + throughput: 522.1932114882507 estimated_peak_memory_range: - min: 12288 - max: 108295224 + min: 16384 + max: 1454752 primary_compute_unit: NPU precision: int8 layer_info: @@ -64,14 +62,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 278 - job_id: jnp10w325 + job_id: jpy13nx7p job_status: Passed torchscript_onnx_qnn: - inference_time: 2242.0 - throughput: 446.03033006244425 + inference_time: 2265.0 + throughput: 441.5011037527594 estimated_peak_memory_range: - min: 2113536 - max: 12544800 + min: 12288 + max: 9036560 primary_compute_unit: NPU precision: int8 layer_info: @@ -79,14 +77,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 273 - job_id: jqpyejn7g + job_id: jpedm2z15 job_status: Passed torchscript_onnx: - inference_time: 6310.0 - throughput: 158.47860538827257 + inference_time: 5638.0 + throughput: 177.367860943597 estimated_peak_memory_range: - min: 6213632 - max: 12150408 + min: 7311360 + max: 11149928 primary_compute_unit: NPU precision: int8 layer_info: @@ -94,7 +92,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 8 total_layers: 331 - job_id: j1pv3r4j5 + job_id: jp4lr8y85 job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -103,13 +101,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:16:21Z' + timestamp: '2024-10-14T23:03:33Z' - torchscript_onnx_tflite: - inference_time: 1646.0 - throughput: 
607.5334143377886 + inference_time: 1273.0 + throughput: 785.5459544383347 estimated_peak_memory_range: min: 12288 - max: 55371904 + max: 59792512 primary_compute_unit: NPU precision: int8 layer_info: @@ -117,14 +115,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 278 - job_id: jvgdwq0e5 + job_id: jp0z0kj65 job_status: Passed torchscript_onnx_qnn: - inference_time: 1494.0 - throughput: 669.3440428380187 + inference_time: 1489.0 + throughput: 671.591672263264 estimated_peak_memory_range: min: 1245184 - max: 28324336 + max: 32420816 primary_compute_unit: NPU precision: int8 layer_info: @@ -132,14 +130,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 273 - job_id: j2p0y2k6g + job_id: jgz3dwmk5 job_status: Passed torchscript_onnx: - inference_time: 5579.0 - throughput: 179.24359204158452 + inference_time: 3976.0 + throughput: 251.50905432595573 estimated_peak_memory_range: - min: 0 - max: 123878256 + min: 3457024 + max: 158278112 primary_compute_unit: NPU precision: int8 layer_info: @@ -147,7 +145,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 8 total_layers: 331 - job_id: j7gjx21xp + job_id: jpxkoml35 job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -156,13 +154,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:16:22Z' + timestamp: '2024-10-14T23:03:34Z' - torchscript_onnx_tflite: - inference_time: 1922.0 - throughput: 520.2913631633714 + inference_time: 4734.0 + throughput: 211.23785382340515 estimated_peak_memory_range: - min: 16384 - max: 1718064 + min: 12288 + max: 44329568 primary_compute_unit: NPU precision: int8 layer_info: @@ -170,14 +168,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 278 - job_id: jz57zl6lp + job_id: jgo26mrxp job_status: Passed torchscript_onnx_qnn: - inference_time: 1950.0 - throughput: 512.8205128205128 + inference_time: 5781.0 + throughput: 172.9804532087874 estimated_peak_memory_range: - min: 1282048 - max: 3053760 + min: 1245184 + max: 8681616 primary_compute_unit: NPU precision: int8 layer_info: @@ -185,22 +183,45 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 273 - job_id: jogkzqd2g + job_id: jgdx10drp job_status: Passed reference_device_info: - name: QCS8550 (Proxy) + name: RB3 Gen 2 (Proxy) os: '12' form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: '2024-09-25T11:16:15Z' + chipset: QCS6490 Proxy + timestamp: '2024-10-14T23:03:32Z' - torchscript_onnx_tflite: - inference_time: 2100.0 - throughput: 476.1904761904762 + inference_time: 46023.0 + throughput: 21.72826630163179 + estimated_peak_memory_range: + min: 2912256 + max: 16478472 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 277 + layers_on_gpu: 1 + layers_on_cpu: 0 + total_layers: 278 + job_id: jpv6k4dj5 + job_status: Passed + reference_device_info: + name: RB5 (Proxy) + os: '12' + form_factor: Iot + os_name: Android + manufacturer: Qualcomm + chipset: QCS8250 Proxy + timestamp: '2024-10-14T23:03:20Z' + - torchscript_onnx_tflite: + inference_time: 1891.0 + throughput: 528.8207297726071 estimated_peak_memory_range: min: 12288 - max: 55966704 + max: 6466008 primary_compute_unit: NPU precision: int8 layer_info: @@ -208,14 +229,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 278 - job_id: jqp4qd8vg + job_id: jp8qy8xxp job_status: Passed torchscript_onnx_qnn: - inference_time: 2486.0 - throughput: 402.2526146419952 + inference_time: 1941.0 + throughput: 515.1983513652756 
estimated_peak_memory_range: - min: 1245184 - max: 29299664 + min: 1261568 + max: 2423696 primary_compute_unit: NPU precision: int8 layer_info: @@ -223,22 +244,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 273 - job_id: j1p3k18l5 + job_id: jg9ln8zlg job_status: Passed reference_device_info: - name: QCS8450 (Proxy) - os: '13' - form_factor: Xr + name: QCS8550 (Proxy) + os: '12' + form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:16:19Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:03:25Z' - torchscript_onnx_tflite: - inference_time: 1915.0 - throughput: 522.1932114882507 + inference_time: 1897.0 + throughput: 527.1481286241434 estimated_peak_memory_range: - min: 12288 - max: 4475880 + min: 16384 + max: 9354536 primary_compute_unit: NPU precision: int8 layer_info: @@ -246,14 +267,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 278 - job_id: j0pxv6m1g + job_id: j56y4v70p job_status: Passed torchscript_onnx_qnn: - inference_time: 1956.0 - throughput: 511.2474437627812 + inference_time: 1970.0 + throughput: 507.61421319796955 estimated_peak_memory_range: - min: 1273856 - max: 2472528 + min: 1261568 + max: 2413648 primary_compute_unit: NPU precision: int8 layer_info: @@ -261,22 +282,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 273 - job_id: jn5q8rw45 + job_id: j5we6xl35 job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8255 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:16:16Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:03:28Z' - torchscript_onnx_tflite: - inference_time: 1909.0 - throughput: 523.8344683080147 + inference_time: 1919.0 + throughput: 521.1047420531527 estimated_peak_memory_range: - min: 16384 - max: 2940456 + min: 12288 + max: 2250360 primary_compute_unit: NPU precision: int8 layer_info: @@ -284,14 +305,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 278 - job_id: jo5mr64wg + job_id: jglvm7x85 job_status: Passed torchscript_onnx_qnn: - inference_time: 1961.0 - throughput: 509.94390617032127 + inference_time: 1953.0 + throughput: 512.0327700972862 estimated_peak_memory_range: - min: 1273856 - max: 2520880 + min: 4788224 + max: 5988272 primary_compute_unit: NPU precision: int8 layer_info: @@ -299,7 +320,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 273 - job_id: j1gln278p + job_id: jgdx10dep job_status: Passed reference_device_info: name: SA8775 (Proxy) @@ -307,14 +328,14 @@ models: form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:16:17Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:03:27Z' - torchscript_onnx_tflite: - inference_time: 1915.0 - throughput: 522.1932114882507 + inference_time: 1919.0 + throughput: 521.1047420531527 estimated_peak_memory_range: min: 12288 - max: 1786760 + max: 3498696 primary_compute_unit: NPU precision: int8 layer_info: @@ -322,14 +343,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 278 - job_id: jegn2mxrg + job_id: j5q6qwy4p job_status: Passed torchscript_onnx_qnn: - inference_time: 1951.0 - throughput: 512.557662737058 + inference_time: 1979.0 + throughput: 505.3057099545225 estimated_peak_memory_range: - min: 1245184 - max: 3175872 + min: 1265664 + max: 2441912 primary_compute_unit: NPU precision: int8 layer_info: @@ -337,22 +358,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 273 - job_id: 
jw566zv05 + job_id: jp14z3n2p job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:16:18Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:03:26Z' - torchscript_onnx_tflite: - inference_time: 4477.0 - throughput: 223.36385972749608 + inference_time: 2091.0 + throughput: 478.24007651841225 estimated_peak_memory_range: - min: 94208 - max: 44556032 + min: 12288 + max: 60267504 primary_compute_unit: NPU precision: int8 layer_info: @@ -360,14 +381,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 278 - job_id: joprk2995 + job_id: jgkexd42g job_status: Passed torchscript_onnx_qnn: - inference_time: 6094.0 - throughput: 164.09583196586806 + inference_time: 2512.0 + throughput: 398.0891719745223 estimated_peak_memory_range: - min: 1540096 - max: 9478256 + min: 1245184 + max: 34234320 primary_compute_unit: NPU precision: int8 layer_info: @@ -375,42 +396,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 273 - job_id: jwgoynmx5 + job_id: jp14z3n8p job_status: Passed reference_device_info: - name: RB3 Gen 2 (Proxy) - os: '12' - form_factor: Iot + name: QCS8450 (Proxy) + os: '13' + form_factor: Xr os_name: Android manufacturer: Qualcomm - chipset: Qcs6490 Proxy - timestamp: '2024-09-25T11:16:20Z' + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:03:30Z' - torchscript_onnx_tflite: - inference_time: 46612.0 - throughput: 21.453702909122114 + inference_time: 1206.0 + throughput: 829.1873963515754 estimated_peak_memory_range: - min: 2920448 - max: 25415328 + min: 8192 + max: 39328128 primary_compute_unit: NPU precision: int8 layer_info: - layers_on_npu: 277 - layers_on_gpu: 1 + layers_on_npu: 278 + layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 278 - job_id: jep289j4p + job_id: jgjvn17xg + job_status: Passed + torchscript_onnx_qnn: + inference_time: 1487.0 + throughput: 672.4949562878278 + estimated_peak_memory_range: + min: 1241088 + max: 28062400 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 273 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 273 + job_id: j57yr6ev5 + job_status: Passed + torchscript_onnx: + inference_time: 3993.0 + throughput: 250.4382669671926 + estimated_peak_memory_range: + min: 6139904 + max: 127875488 + primary_compute_unit: NPU + precision: int8 + layer_info: + layers_on_npu: 323 + layers_on_gpu: 0 + layers_on_cpu: 8 + total_layers: 331 + job_id: jprv39l0g job_status: Passed reference_device_info: - name: RB5 (Proxy) - os: '12' - form_factor: Iot + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Qcs8250 Proxy - timestamp: '2024-09-25T11:16:11Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:03:37Z' - torchscript_onnx_qnn: - inference_time: 2223.0 - throughput: 449.842555105713 + inference_time: 2263.0 + throughput: 441.8912947414936 estimated_peak_memory_range: min: 1232896 max: 1232896 @@ -421,14 +472,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 273 - job_id: j1p8om8xg + job_id: j5we6xl65 job_status: Passed torchscript_onnx: - inference_time: 6233.0 - throughput: 160.43638697256537 + inference_time: 6306.0 + throughput: 158.5791309863622 estimated_peak_memory_range: - min: 7827456 - max: 7827456 + min: 7704576 + max: 7704576 primary_compute_unit: NPU precision: int8 layer_info: @@ -436,7 +487,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 8 total_layers: 331 - job_id: 
jlpe9w21g + job_id: j5mnx40dp job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -445,4 +496,4 @@ os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:16:23Z' + timestamp: '2024-10-14T23:03:35Z' diff --git a/qai_hub_models/models/yolov8_seg/README.md b/qai_hub_models/models/yolov8_seg/README.md index 7f827999..678d1662 100644 --- a/qai_hub_models/models/yolov8_seg/README.md +++ b/qai_hub_models/models/yolov8_seg/README.md @@ -6,7 +6,7 @@ Ultralytics YOLOv8 is a machine learning model that predicts bounding boxes, segmentation masks and classes of objects in an image. This is based on the implementation of YOLOv8-Segmentation found -[here](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/models/yolo/segment). This repository contains scripts for optimized on-device +[here]({source_repo}). This repository contains scripts for optimized on-device export suitable to run on Qualcomm® devices. More details on model performance across various devices can be found [here](https://aihub.qualcomm.com/models/yolov8_seg). @@ -44,15 +44,19 @@ python -m qai_hub_models.models.yolov8_seg.export Additional options are documented with the `--help` option. Note that the above script requires access to Deployment instructions for Qualcomm® AI Hub. + ## License -- The license for the original implementation of YOLOv8-Segmentation can be found +* The license for the original implementation of YOLOv8-Segmentation can be found [here](https://github.com/ultralytics/ultralytics/blob/main/LICENSE). -- The license for the compiled assets for on-device deployment can be found [here](https://github.com/ultralytics/ultralytics/blob/main/LICENSE) +* The license for the compiled assets for on-device deployment can be found [here](https://github.com/ultralytics/ultralytics/blob/main/LICENSE) + ## References * [Ultralytics YOLOv8 Docs: Instance Segmentation](https://docs.ultralytics.com/tasks/segment/) * [Source Model Implementation](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/models/yolo/segment) + + ## Community * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI. * For questions or feedback please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
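The export scripts throughout this diff (including the yolov8_seg/export.py diff that follows) replace the positional 3-tuple return value of `export_model` with an `ExportResult` struct imported from `qai_hub_models.models.common`. As a minimal sketch of the idea, such a struct could be a plain dataclass with optional job fields; the real definition lives in `qai_hub_models/models/common.py` and may differ in detail:

```python
# Hypothetical sketch of the ExportResult struct returned by export_model().
# The actual class lives in qai_hub_models.models.common; the field names
# below match the keyword arguments used in the export scripts in this diff.
from dataclasses import dataclass
from typing import Optional

import qai_hub as hub


@dataclass
class ExportResult:
    compile_job: Optional[hub.CompileJob] = None
    profile_job: Optional[hub.ProfileJob] = None
    inference_job: Optional[hub.InferenceJob] = None


# Callers that previously unpacked the tuple positionally:
#     compile_job, profile_job, inference_job = export_model(...)
# now read named fields, which stays correct even if fields are reordered
# (note the Returns docstrings list inference_job before profile_job):
#     result = export_model(...)
#     if result.profile_job is not None:
#         print(result.profile_job.download_profile())
```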
diff --git a/qai_hub_models/models/yolov8_seg/export.py b/qai_hub_models/models/yolov8_seg/export.py index dd920257..464045bf 100644 --- a/qai_hub_models/models/yolov8_seg/export.py +++ b/qai_hub_models/models/yolov8_seg/export.py @@ -10,18 +10,18 @@ import os import warnings from pathlib import Path -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast import qai_hub as hub import torch +from qai_hub_models.models.common import ExportResult, TargetRuntime from qai_hub_models.models.yolov8_seg import Model from qai_hub_models.utils.args import ( export_parser, get_input_spec_kwargs, get_model_kwargs, ) -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import torch_inference from qai_hub_models.utils.input_spec import make_torch_inputs from qai_hub_models.utils.printing import ( @@ -47,20 +47,18 @@ def export_model( compile_options: str = "", profile_options: str = "", **additional_model_kwargs, -) -> Tuple[hub.CompileJob, Optional[hub.ProfileJob], Optional[hub.InferenceJob]] | List[ - str -]: +) -> ExportResult | List[str]: """ - This function accomplishes 6 main tasks: + This function executes the following recipe: - 1. Instantiates a PyTorch model and converts it to a traced TorchScript format. - 2. Compiles the model to an asset that can be run on device. - 3. Profiles the model performance on real devices. - 4. Inferences the model on sample inputs. - 5. Downloads the model asset to the local directory. - 6. Summarizes the results from profiling and inference. + 1. Instantiates a PyTorch model and converts it to a traced TorchScript format + 2. Compiles the model to an asset that can be run on device + 3. Profiles the model performance on a real device + 4. Inferences the model on sample inputs + 5. Downloads the model asset to the local directory + 6. Summarizes the results from profiling and inference - Each of the last four steps can be optionally skipped using the input options. + Each of the last 4 steps can be optionally skipped using the input options. Parameters: device: Device for which to export the model. @@ -82,10 +80,10 @@ def export_model( `model_cls.from_pretrained` and `model.get_input_spec` Returns: - A 3-tuple of: + A struct of: * A CompileJob object containing metadata about the compile job submitted to hub. - * A ProfileJob containing metadata about the profile job (None if profiling skipped). * An InferenceJob containing metadata about the inference job (None if inferencing skipped). + * A ProfileJob containing metadata about the profile job (None if profiling skipped). """ model_name = "yolov8_seg" output_path = Path(output_dir or Path.cwd() / "build" / model_name) @@ -111,7 +109,7 @@ def export_model( # On-device perf improves with I/O in channel_last format except when using ONNX. use_channel_last_format = target_runtime != TargetRuntime.ONNX - # 1. Initialize PyTorch model + # 1. Instantiates a PyTorch model and converts it to a traced TorchScript format model = Model.from_pretrained(**get_model_kwargs(Model, additional_model_kwargs)) input_spec = model.get_input_spec( **get_input_spec_kwargs(model, additional_model_kwargs) @@ -122,7 +120,7 @@ def export_model( model.to("cpu"), make_torch_inputs(input_spec), check_trace=False ) - # 2. Compile the model to an on-device asset + # 2. 
Compiles the model to an asset that can be run on device model_compile_options = model.get_hub_compile_options( target_runtime, compile_options, hub_device ) @@ -136,7 +134,7 @@ def export_model( ) compile_job = cast(hub.client.CompileJob, submitted_compile_job) - # 3. Profile the model asset on real devices + # 3. Profiles the model performance on a real device profile_job: Optional[hub.client.ProfileJob] = None if not skip_profiling: profile_options_all = model.get_hub_profile_options( @@ -151,7 +149,7 @@ def export_model( ) profile_job = cast(hub.client.ProfileJob, submitted_profile_job) - # 4. Run inference on-device with sample inputs + # 4. Inferences the model on sample inputs inference_job: Optional[hub.client.InferenceJob] = None if not skip_inferencing: profile_options_all = model.get_hub_profile_options( @@ -172,13 +170,13 @@ def export_model( ) inference_job = cast(hub.client.InferenceJob, submitted_inference_job) - # 5. Download the model asset to a local file + # 5. Downloads the model asset to the local directory if not skip_downloading: os.makedirs(output_path, exist_ok=True) target_model: hub.Model = compile_job.get_target_model() # type: ignore target_model.download(str(output_path / model_name)) - # 6. Summarize the results from profiling and inference + # 6. Summarizes the results from profiling and inference if not skip_summary and not skip_profiling: assert profile_job is not None and profile_job.wait().success profile_data: Dict[str, Any] = profile_job.download_profile() # type: ignore @@ -203,7 +201,11 @@ def export_model( if not skip_summary: print_on_target_demo_cmd(compile_job, Path(__file__).parent, hub_device) - return (compile_job, profile_job, inference_job) + return ExportResult( + compile_job=compile_job, + inference_job=inference_job, + profile_job=profile_job, + ) def main(): diff --git a/qai_hub_models/models/yolov8_seg/perf.yaml b/qai_hub_models/models/yolov8_seg/perf.yaml index 810ec6ac..aad3a905 100644 --- a/qai_hub_models/models/yolov8_seg/perf.yaml +++ b/qai_hub_models/models/yolov8_seg/perf.yaml @@ -2,6 +2,7 @@ aggregated: supported_oses: - Android supported_devices: + - Snapdragon 8 Elite QRD - Samsung Galaxy S24 - Samsung Galaxy S24 Ultra - Samsung Galaxy S24+ @@ -19,38 +20,35 @@ aggregated: - Samsung Galaxy S21 Ultra - Samsung Galaxy S21+ - Snapdragon X Elite CRD - - QCS8550 (Proxy) - - SA8775 (Proxy) - - SA8650 (Proxy) - - SA8255 (Proxy) + - Snapdragon X Plus 8-Core CRD - QCS8450 (Proxy) - XR2 Gen 2 (Proxy) - - Google Pixel 5a 5G - - Google Pixel 4 - - Google Pixel 4a - - Google Pixel 3 - - Google Pixel 3a - - Google Pixel 3a XL + - QCS8550 (Proxy) + - SA8255 (Proxy) + - SA8650 (Proxy) + - SA8775 (Proxy) supported_chipsets: + - Snapdragon® 8 Elite - Snapdragon® 8 Gen 3 - Snapdragon® 8 Gen 2 - Snapdragon® 8 Gen 1 - Snapdragon® 888 - Snapdragon® X Elite - - Qcs8550 Proxy - - Qcs8450 Proxy - - Sa8650p Proxy - - Sa8255p Proxy - - Sa8775p Proxy + - Snapdragon® X Plus 8-Core + - QCS8450 Proxy + - QCS8550 Proxy + - SA8255P Proxy + - SA8650P Proxy + - SA8775P Proxy models: - name: YOLOv8-Segmentation performance_metrics: - torchscript_onnx_tflite: - inference_time: 6418.0 - throughput: 155.8117793705204 + inference_time: 6541.0 + throughput: 152.88182235132243 estimated_peak_memory_range: - min: 4235264 - max: 6944008 + min: 4571136 + max: 6416656 primary_compute_unit: NPU precision: fp16 layer_info: @@ -58,14 +56,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 338 - job_id: jz57zlvlp + job_id: j5mnx48wp job_status: Passed 
torchscript_onnx_qnn: - inference_time: 6398.0 - throughput: 156.29884338855894 + inference_time: 6409.0 + throughput: 156.03058199407084 estimated_peak_memory_range: - min: 7303168 - max: 17949456 + min: 4210688 + max: 15170608 primary_compute_unit: NPU precision: fp16 layer_info: @@ -73,14 +71,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 333 - job_id: jqpyejv7g + job_id: jglvm7l85 job_status: Passed torchscript_onnx: - inference_time: 7635.0 - throughput: 130.97576948264572 + inference_time: 7616.0 + throughput: 131.30252100840337 estimated_peak_memory_range: - min: 14888960 - max: 22845744 + min: 13791232 + max: 22564368 primary_compute_unit: NPU precision: fp16 layer_info: @@ -88,7 +86,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 336 - job_id: jwgoynex5 + job_id: jp14z3j2p job_status: Passed reference_device_info: name: Samsung Galaxy S23 @@ -97,13 +95,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 2 - timestamp: '2024-09-25T11:15:20Z' + timestamp: '2024-10-14T23:02:24Z' - torchscript_onnx_tflite: - inference_time: 4846.0 - throughput: 206.35575732562938 + inference_time: 4861.0 + throughput: 205.71898786257972 estimated_peak_memory_range: - min: 2994176 - max: 107446800 + min: 3215360 + max: 117297872 primary_compute_unit: NPU precision: fp16 layer_info: @@ -111,14 +109,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 338 - job_id: jqp4qdjvg + job_id: jgn6vxkr5 job_status: Passed torchscript_onnx_qnn: - inference_time: 4782.0 - throughput: 209.11752404851526 + inference_time: 4775.0 + throughput: 209.4240837696335 estimated_peak_memory_range: min: 4931584 - max: 55423776 + max: 65463024 primary_compute_unit: NPU precision: fp16 layer_info: @@ -126,14 +124,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 333 - job_id: j2p0y2e6g + job_id: j56y4vw0p job_status: Passed torchscript_onnx: - inference_time: 5593.0 - throughput: 178.79492222420882 + inference_time: 5228.0 + throughput: 191.27773527161438 estimated_peak_memory_range: - min: 434176 - max: 113566400 + min: 18432000 + max: 139158352 primary_compute_unit: NPU precision: fp16 layer_info: @@ -141,7 +139,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 336 - job_id: j1pv3rzj5 + job_id: jgdx103ep job_status: Passed reference_device_info: name: Samsung Galaxy S24 @@ -150,13 +148,13 @@ models: os_name: Android manufacturer: Samsung chipset: Snapdragon® 8 Gen 3 - timestamp: '2024-09-25T11:15:21Z' + timestamp: '2024-10-14T23:02:25Z' - torchscript_onnx_tflite: - inference_time: 6501.0 - throughput: 153.82248884786955 + inference_time: 6414.0 + throughput: 155.90894917368257 estimated_peak_memory_range: - min: 4575232 - max: 6463320 + min: 12288 + max: 20095544 primary_compute_unit: NPU precision: fp16 layer_info: @@ -164,14 +162,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 338 - job_id: j0pxv6e1g + job_id: jprv39w9g job_status: Passed torchscript_onnx_qnn: - inference_time: 6096.0 - throughput: 164.04199475065616 + inference_time: 6283.0 + throughput: 159.15963711602737 estimated_peak_memory_range: - min: 4980736 - max: 11781136 + min: 4956160 + max: 6289704 primary_compute_unit: NPU precision: fp16 layer_info: @@ -179,7 +177,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 333 - job_id: jogkzqr2g + job_id: jgo26m8xp job_status: Passed reference_device_info: name: QCS8550 (Proxy) @@ -187,14 +185,14 @@ models: form_factor: Iot os_name: Android manufacturer: Qualcomm - chipset: Qcs8550 Proxy - timestamp: 
'2024-09-25T11:15:15Z' + chipset: QCS8550 Proxy + timestamp: '2024-10-14T23:02:17Z' - torchscript_onnx_tflite: - inference_time: 9680.0 - throughput: 103.30578512396694 + inference_time: 6446.0 + throughput: 155.13496742165685 estimated_peak_memory_range: - min: 4567040 - max: 101403952 + min: 0 + max: 209652176 primary_compute_unit: NPU precision: fp16 layer_info: @@ -202,14 +200,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 338 - job_id: jo5mr6vwg + job_id: jp8qy81xp job_status: Passed torchscript_onnx_qnn: - inference_time: 9137.0 - throughput: 109.44511327569224 + inference_time: 6278.0 + throughput: 159.28639694170118 estimated_peak_memory_range: - min: 4939776 - max: 42898608 + min: 4947968 + max: 10993640 primary_compute_unit: NPU precision: fp16 layer_info: @@ -217,22 +215,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 333 - job_id: j1p3k1ql5 + job_id: jpedm2y15 job_status: Passed reference_device_info: - name: QCS8450 (Proxy) + name: SA8255 (Proxy) os: '13' - form_factor: Xr + form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Qcs8450 Proxy - timestamp: '2024-09-25T11:15:19Z' + chipset: SA8255P Proxy + timestamp: '2024-10-14T23:02:20Z' - torchscript_onnx_tflite: - inference_time: 6566.0 - throughput: 152.29972586049345 + inference_time: 6490.0 + throughput: 154.08320493066256 estimated_peak_memory_range: - min: 4567040 - max: 7074160 + min: 4579328 + max: 7531776 primary_compute_unit: NPU precision: fp16 layer_info: @@ -240,14 +238,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 338 - job_id: jegn2mrrg + job_id: jp0z0k665 job_status: Passed torchscript_onnx_qnn: - inference_time: 6089.0 - throughput: 164.23057973394646 + inference_time: 6277.0 + throughput: 159.31177314003506 estimated_peak_memory_range: - min: 5005312 - max: 6444544 + min: 4943872 + max: 12461256 primary_compute_unit: NPU precision: fp16 layer_info: @@ -255,22 +253,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 333 - job_id: jn5q8r945 + job_id: jgjvn1qxg job_status: Passed reference_device_info: - name: SA8650 (Proxy) + name: SA8775 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8650p Proxy - timestamp: '2024-09-25T11:15:16Z' + chipset: SA8775P Proxy + timestamp: '2024-10-14T23:02:19Z' - torchscript_onnx_tflite: - inference_time: 6616.0 - throughput: 151.14873035066506 + inference_time: 6533.0 + throughput: 153.0690341343946 estimated_peak_memory_range: - min: 4587520 - max: 7057528 + min: 4595712 + max: 14287264 primary_compute_unit: NPU precision: fp16 layer_info: @@ -278,14 +276,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 338 - job_id: joprk2195 + job_id: jpy13nm7p job_status: Passed torchscript_onnx_qnn: - inference_time: 6022.0 - throughput: 166.05778811026238 + inference_time: 6380.0 + throughput: 156.73981191222572 estimated_peak_memory_range: - min: 5009408 - max: 6336528 + min: 4960256 + max: 6184600 primary_compute_unit: NPU precision: fp16 layer_info: @@ -293,22 +291,22 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 333 - job_id: j1gln2e8p + job_id: jpv6k47j5 job_status: Passed reference_device_info: - name: SA8775 (Proxy) + name: SA8650 (Proxy) os: '13' form_factor: Auto os_name: Android manufacturer: Qualcomm - chipset: Sa8775p Proxy - timestamp: '2024-09-25T11:15:17Z' + chipset: SA8650P Proxy + timestamp: '2024-10-14T23:02:18Z' - torchscript_onnx_tflite: - inference_time: 6544.0 - throughput: 152.8117359413203 + inference_time: 9610.0 + throughput: 
104.0582726326743 estimated_peak_memory_range: - min: 4227072 - max: 13454240 + min: 4763648 + max: 107743616 primary_compute_unit: NPU precision: fp16 layer_info: @@ -316,14 +314,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 338 - job_id: jep28934p + job_id: jp2kyje4p job_status: Passed torchscript_onnx_qnn: - inference_time: 6234.0 - throughput: 160.41065126724413 + inference_time: 9155.0 + throughput: 109.22992900054615 estimated_peak_memory_range: - min: 4993024 - max: 6883504 + min: 4931584 + max: 46516128 primary_compute_unit: NPU precision: fp16 layer_info: @@ -331,19 +329,72 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 333 - job_id: jw566zq05 + job_id: j5we6x765 job_status: Passed reference_device_info: - name: SA8255 (Proxy) + name: QCS8450 (Proxy) os: '13' - form_factor: Auto + form_factor: Xr + os_name: Android + manufacturer: Qualcomm + chipset: QCS8450 Proxy + timestamp: '2024-10-14T23:02:22Z' + - torchscript_onnx_tflite: + inference_time: 4508.0 + throughput: 221.82786157941436 + estimated_peak_memory_range: + min: 4075520 + max: 78297040 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 338 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 338 + job_id: j5q6qwv4p + job_status: Passed + torchscript_onnx_qnn: + inference_time: 3685.0 + throughput: 271.37042062415196 + estimated_peak_memory_range: + min: 4927488 + max: 59329488 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 333 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 333 + job_id: jg9ln8mlg + job_status: Passed + torchscript_onnx: + inference_time: 4821.0 + throughput: 207.42584526031945 + estimated_peak_memory_range: + min: 0 + max: 74454688 + primary_compute_unit: NPU + precision: fp16 + layer_info: + layers_on_npu: 336 + layers_on_gpu: 0 + layers_on_cpu: 0 + total_layers: 336 + job_id: jpxkom415 + job_status: Passed + reference_device_info: + name: Snapdragon 8 Elite QRD + os: '15' + form_factor: Phone os_name: Android manufacturer: Qualcomm - chipset: Sa8255p Proxy - timestamp: '2024-09-25T11:15:18Z' + chipset: Snapdragon® 8 Elite + timestamp: '2024-10-14T23:02:28Z' - torchscript_onnx_qnn: - inference_time: 6424.0 - throughput: 155.6662515566625 + inference_time: 7215.0 + throughput: 138.6001386001386 estimated_peak_memory_range: min: 4923392 max: 4923392 @@ -354,14 +405,14 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 333 - job_id: j1p8omwxg + job_id: jp3j086lg job_status: Passed torchscript_onnx: - inference_time: 7730.0 - throughput: 129.36610608020698 + inference_time: 7647.0 + throughput: 130.7702366941284 estimated_peak_memory_range: - min: 17395712 - max: 17395712 + min: 17469440 + max: 17469440 primary_compute_unit: NPU precision: fp16 layer_info: @@ -369,7 +420,7 @@ models: layers_on_gpu: 0 layers_on_cpu: 0 total_layers: 336 - job_id: j7gjx2kxp + job_id: j57yr64l5 job_status: Passed reference_device_info: name: Snapdragon X Elite CRD @@ -378,4 +429,4 @@ models: os_name: Windows manufacturer: Qualcomm chipset: Snapdragon® X Elite - timestamp: '2024-09-25T11:15:22Z' + timestamp: '2024-10-14T23:02:26Z' diff --git a/qai_hub_models/requirements.txt b/qai_hub_models/requirements.txt index 2eee49ed..4da7dd45 100644 --- a/qai_hub_models/requirements.txt +++ b/qai_hub_models/requirements.txt @@ -3,11 +3,11 @@ deprecation==2.1.0 fsspec==2023.6.0 gdown==4.7.1 gitpython==3.1.42 -huggingface_hub==0.23.1 +huggingface_hub>=0.23.1,<0.24 ipython==8.12.3 matplotlib==3.7.5 numpy==1.23.1 -onnx==1.14.1 +onnx # We don't 
use a specific version of ONNX so we can defer to AIMET. AIMET-torch and AIMET-ONNX use different ONNX versions. opencv-python==4.8.1.78 packaging==23.2 pandas==1.5.3 @@ -24,4 +24,4 @@ torchvision==0.16.2 typing-extensions>=4.12.2 tqdm==4.66.2 urllib3==1.26.18 -qai_hub>=0.15.0 +qai_hub>=0.18.1 diff --git a/qai_hub_models/test/test_utils/test_perf_summary.py b/qai_hub_models/test/test_utils/test_perf_summary.py index 7d28c515..209db75f 100644 --- a/qai_hub_models/test/test_utils/test_perf_summary.py +++ b/qai_hub_models/test/test_utils/test_perf_summary.py @@ -99,7 +99,7 @@ def test_model_inference_run_toggle(): perf_summary.update_summary(MODEL_ID, prev_perf_metrics, new_perf_metrics) assert perf_summary.progressions["inf"] == [ - (MODEL_ID, "torchscript_onnx_tflite", "inf", 10.0, "null", CHIPSET, OS) + (MODEL_ID, "torchscript_onnx_tflite", "inf", 10.0, "null", "null", CHIPSET, OS) ] @@ -118,7 +118,7 @@ def test_perf_progression_basic(): perf_summary.update_summary(MODEL_ID, prev_perf_metrics, new_perf_metrics) expected_inf_bucket = [ - (MODEL_ID, "torchscript_onnx_tflite", 20.0, 0.5, 10.0, CHIPSET, OS), + (MODEL_ID, "torchscript_onnx_tflite", 20.0, 0.5, 10.0, "null", CHIPSET, OS), ] assert perf_summary.progressions[10] == expected_inf_bucket @@ -140,7 +140,7 @@ def test_perf_regression_basic(): perf_summary.update_summary(MODEL_ID, prev_perf_metrics, new_perf_metrics) expected_inf_bucket = [ - (MODEL_ID, "torchscript_onnx_tflite", 2, 20.0, 10.0, CHIPSET, OS), + (MODEL_ID, "torchscript_onnx_tflite", 2, 20.0, 10.0, "null", CHIPSET, OS), ] assert perf_summary.regressions[2] == expected_inf_bucket diff --git a/qai_hub_models/utils/aimet/aimet_dummy_model.py b/qai_hub_models/utils/aimet/aimet_dummy_model.py index 59d90738..3a530603 100644 --- a/qai_hub_models/utils/aimet/aimet_dummy_model.py +++ b/qai_hub_models/utils/aimet/aimet_dummy_model.py @@ -6,6 +6,7 @@ import os import shutil +from contextlib import ExitStack from pathlib import Path from typing import List, Optional from zipfile import ZIP_DEFLATED, ZipFile @@ -13,6 +14,7 @@ import torch from onnx import load_model as load_onnx_model from onnx import save_model as save_onnx_model +from packaging.version import Version from qai_hub_models.evaluators.base_evaluators import _DataLoader from qai_hub_models.models.protocols import ( @@ -68,10 +70,10 @@ class AimetEncodingLoaderMixin(PretrainedHubModelProtocol, QuantizableModelProto - Export Torch model to ONNX and load pre-computed encodings """ - def __init__(self, model, aimet_encoding_path: str): + def __init__(self, model, aimet_encodings: str): super().__init__() self.model = model - self.encodings_path = aimet_encoding_path + self.aimet_encodings = aimet_encodings def quantize( self, @@ -91,6 +93,7 @@ def convert_to_onnx_and_aimet_encodings( input_spec: InputSpec | None = None, model_name: str | None = None, external_weights: bool = False, + bundle_external_weights: bool = False, output_names: Optional[List[str]] = None, ) -> str: """ @@ -103,29 +106,50 @@ def convert_to_onnx_and_aimet_encodings( input_spec = self.get_input_spec() os.makedirs(output_dir, exist_ok=True) - zip_path = os.path.join(output_dir, f"{model_name}.aimet.zip") zip_base_dir = Path(f"{model_name}.aimet") + zip = self._use_zip_file() + + with ExitStack() as stack: + if zip: + # Use temporary directory for preparation + tmpdir = stack.enter_context(qaihm_temp_dir()) + else: + tmpdir = output_dir - with qaihm_temp_dir() as tmpdir: base_path = Path(tmpdir) / zip_base_dir - if base_path.exists(): - 
shutil.rmtree(base_path) - os.makedirs(base_path) + os.makedirs(base_path, exist_ok=True) onnx_file_path = str(base_path / f"{model_name}.onnx") encoding_file_path = str(base_path / f"{model_name}.encodings") + torch_inputs = tuple(make_torch_inputs(input_spec)) + + if Version(torch.__version__) < Version("2.4.0"): + print() + print( + f"WARNING: You are using PyTorch {torch.__version__}, which pre-dates significant ONNX export optimizations" + ) + print( + " introduced in 2.4.0. We recommend upgrading PyTorch version to speed up this step:" + ) + print() + print(" pip install torch==2.4.0") + print() + torch.onnx.export( - self.model, - tuple(make_torch_inputs(input_spec)), + self, + torch_inputs, onnx_file_path, input_names=[name for name in input_spec], output_names=output_names, + opset_version=17, ) - shutil.copyfile(self.encodings_path, encoding_file_path) - external_weights_file_path = "" + self._adapt_aimet_encodings( + self.aimet_encodings, encoding_file_path, onnx_file_path + ) - if external_weights: + external_weights_file_path = "" + if external_weights and zip: external_weights_file_name = f"{model_name}.data" external_weights_file_path = str(base_path / external_weights_file_name) # Torch exports to onnx with external weights scattered in a directory. @@ -139,11 +163,30 @@ def convert_to_onnx_and_aimet_encodings( location=external_weights_file_name, ) - zip_aimet_model( - zip_path, - zip_base_dir, - onnx_file_path, - encoding_file_path, - external_weights_file_path, - ) - return zip_path + if zip: + zip_path = os.path.join(output_dir, f"{model_name}.aimet.zip") + zip_aimet_model( + zip_path, + zip_base_dir, + onnx_file_path, + encoding_file_path, + external_weights_file_path, + ) + return zip_path + else: + # This path is persistent + return base_path.as_posix() + + return "" # mypy requires this for some reason + + def _use_zip_file(self) -> bool: + """ + Whether the return of convert_to_hub_source_model should be zipped. + """ + return True + + def _adapt_aimet_encodings(self, src_encodings, dst_encodings, onnx_model_path): + """ + Overridable method that adapts the AIMET encodings. + """ + shutil.copyfile(src=src_encodings, dst=dst_encodings) diff --git a/qai_hub_models/utils/aimet/encodings.py b/qai_hub_models/utils/aimet/encodings.py new file mode 100644 index 00000000..a28708a9 --- /dev/null +++ b/qai_hub_models/utils/aimet/encodings.py @@ -0,0 +1,142 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
+# SPDX-License-Identifier: BSD-3-Clause +# --------------------------------------------------------------------- +import re +from copy import deepcopy + + +def find_name_mapping(pattern_pairs, src_names, dst_names, dst_input_names=None): + patterns = [re.compile(x) for x, y in pattern_pairs] + mapping = {} + rev_mapping = {} + known_unused = set() + for src_name in src_names: + for i in range(len(pattern_pairs)): + m = patterns[i].match(src_name) + if m: + dst_patterns = pattern_pairs[i][1] + if not isinstance(dst_patterns, list): + dst_patterns = [dst_patterns] + + used = False + for dst_pattern in dst_patterns: + if isinstance(dst_pattern, tuple): + assert dst_input_names is not None + # This contains a (node, index) pair, where the index + # refers to the input index of that node + dst_pattern, index = dst_pattern + dst_name = dst_pattern.format(*m.groups()) + if dst_name in dst_input_names: + real_dst_name = dst_input_names[dst_name][index] + mapping[src_name] = real_dst_name + rev_mapping[real_dst_name] = src_name + used = True + + elif not dst_pattern: + known_unused.add(src_name) + used = True + else: + # This dst_name refers to the edge name + dst_name = dst_pattern.format(*m.groups()) + if dst_name in dst_names: + mapping[src_name] = dst_name + rev_mapping[dst_name] = src_name + used = True + if used: + break + + return mapping, rev_mapping, known_unused + + +def map_encodings( + pattern_pairs, + src_names, + dst_names, + dst_input_names=None, + src_encodings=[], + dst_encodings=[], +): + patterns = [re.compile(x) for x, y in pattern_pairs] + mapping = {} + rev_mapping = {} + known_unused = set() + + def default_callback( + src_encodings, + dst_encodings, + src_name, + dst_name, + pattern_index, + num_patterns, + groups, + ): + if src_name in src_encodings: + src_entry = src_encodings[src_name] + dst_entry = deepcopy(src_entry) + if isinstance(dst_entry, dict): + dst_entry["name"] = dst_name + dst_encodings[dst_name] = dst_entry + + for src_name in src_names: + for i in range(len(pattern_pairs)): + m = patterns[i].match(src_name) + if m: + dst_patterns = pattern_pairs[i][1] + callback = default_callback + + if isinstance(dst_patterns, tuple) and callable(dst_patterns[1]): + dst_patterns, callback = dst_patterns + + if not isinstance(dst_patterns, list): + dst_patterns = [dst_patterns] + + used = False + + for dst_pattern_index, dst_pattern in enumerate(dst_patterns): + if isinstance(dst_pattern, tuple): + assert dst_input_names is not None + # This contains a (node, index) pair, where the index + # refers to the input index of that node + dst_pattern, index = dst_pattern + dst_name = dst_pattern.format(*m.groups()) + if dst_name in dst_input_names: + real_dst_name = dst_input_names[dst_name][index] + mapping[src_name] = real_dst_name + rev_mapping[real_dst_name] = src_name + used = True + + callback( + src_encodings, + dst_encodings, + src_name, + real_dst_name, + dst_pattern_index, + len(dst_patterns), + m.groups(), + ) + + elif not dst_pattern: + known_unused.add(src_name) + used = True + else: + # This dst_name refers to the edge name + dst_name = dst_pattern.format(*m.groups()) + if dst_name in dst_names: + mapping[src_name] = dst_name + rev_mapping[dst_name] = src_name + used = True + + callback( + src_encodings, + dst_encodings, + src_name, + dst_name, + dst_pattern_index, + len(dst_patterns), + m.groups(), + ) + if used: + break + + return mapping, rev_mapping, known_unused diff --git a/qai_hub_models/utils/args.py b/qai_hub_models/utils/args.py index c2259889..e82796c1 
100644 --- a/qai_hub_models/utils/args.py +++ b/qai_hub_models/utils/args.py @@ -439,12 +439,13 @@ def get_qcom_chipsets() -> Set[str]: def _evaluate_export_common_parser( model_cls: Type[FromPretrainedTypeVar] | Type[FromPrecompiledTypeVar], - supports_tflite=True, - supports_qnn=True, - supports_onnx=True, - supports_precompiled_qnn_onnx=True, - default_runtime=TargetRuntime.TFLITE, - exporting_compiled_model=False, + supports_tflite: bool = True, + supports_qnn: bool = True, + supports_onnx: bool = True, + supports_precompiled_qnn_onnx: bool = True, + default_runtime: TargetRuntime = TargetRuntime.TFLITE, + exporting_compiled_model: bool = False, + is_hub_quantized: bool = False, ) -> argparse.ArgumentParser: """ Common arguments between export and evaluate scripts. @@ -452,7 +453,13 @@ def _evaluate_export_common_parser( # Set handler to resolve, to allow from_pretrained and get_input_spec # to have the same argument names. parser = get_parser(allow_dupe_args=True) - + if is_hub_quantized: + parser.add_argument( + "--num-calibration-samples", + type=int, + default=100, + help="The number of calibration data samples to use for quantization.", + ) if not exporting_compiled_model: # Default runtime for compiled model is fixed for given model available_runtimes = [] @@ -508,6 +515,7 @@ def export_parser( default_runtime: TargetRuntime = TargetRuntime.TFLITE, exporting_compiled_model: bool = False, default_export_device: str = DEFAULT_EXPORT_DEVICE, + is_hub_quantized: bool = False, ) -> argparse.ArgumentParser: """ Arg parser to be used in export scripts. @@ -532,11 +540,11 @@ def export_parser( True when exporting compiled model. If set, removing skip_profiling flag from export arguments. Default = False. - default_export_device: - Default device to set for export. + default_export_device: Default device to set for export. + is_hub_quantized: Whether the model is quantized via the hub quantize job. Returns: - Arg parser object. + argparse ArgumentParser object. """ parser = _evaluate_export_common_parser( model_cls=model_cls, @@ -546,6 +554,7 @@ def export_parser( supports_precompiled_qnn_onnx=supports_precompiled_qnn_onnx, default_runtime=default_runtime, exporting_compiled_model=exporting_compiled_model, + is_hub_quantized=is_hub_quantized, ) parser.add_argument( "--device", @@ -561,6 +570,12 @@ def export_parser( help="If set, will choose a random device with this chipset. " "Overrides whatever is set in --device.", ) + if is_hub_quantized: + parser.add_argument( + "--skip-compiling", + action="store_true", + help="If set, skips compiling to asset that can run on device.", + ) parser.add_argument( "--skip-profiling", action="store_true", @@ -609,6 +624,7 @@ def evaluate_parser( supports_qnn=True, supports_onnx=True, default_runtime=TargetRuntime.TFLITE, + is_hub_quantized: bool = False, ) -> argparse.ArgumentParser: """ Arg parser to be used in evaluate scripts. @@ -630,6 +646,7 @@ def evaluate_parser( If set, removing skip_profiling flag from export arguments. Default = False. default_runtime: Which runtime to use as default if not specified in cli args. + is_hub_quantized: Whether the model is quantized via the hub quantize job. Returns: Arg parser object. 
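# A minimal usage sketch (not part of this patch) of the new is_hub_quantized
# parser flags; `MyQuantizedModel` is a hypothetical FromPretrained model class.
from qai_hub_models.utils.args import export_parser

parser = export_parser(model_cls=MyQuantizedModel, is_hub_quantized=True)
args = parser.parse_args()
# argparse maps the dashed flags to underscored attributes:
num_samples = args.num_calibration_samples  # --num-calibration-samples, default 100
skip_compiling = args.skip_compiling  # --skip-compiling stops after the quantize job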
@@ -640,6 +657,7 @@ def evaluate_parser( supports_qnn=supports_qnn, supports_onnx=supports_onnx, default_runtime=default_runtime, + is_hub_quantized=is_hub_quantized, ) parser.add_argument( "--chipset", diff --git a/qai_hub_models/utils/asset_loaders.py b/qai_hub_models/utils/asset_loaders.py index 9cf6e429..cb79f853 100644 --- a/qai_hub_models/utils/asset_loaders.py +++ b/qai_hub_models/utils/asset_loaders.py @@ -15,12 +15,13 @@ import tempfile import threading import time +import zipfile from contextlib import contextmanager from enum import Enum from functools import partial from pathlib import Path from types import ModuleType -from typing import Any, Callable, Dict, List, Optional, Union +from typing import Any, Callable, Dict, Iterable, List, Optional, Tuple, Union from zipfile import ZipFile import gdown @@ -542,6 +543,7 @@ def from_cfg( "models_website_url": str, "models_website_relative_path": str, "email_template": str, + "genie_url": str, } ) ) @@ -743,7 +745,7 @@ def extract(self, force=True) -> Path: _, ext = os.path.splitext(self.local_cache_path) if ext == ".zip": - # Update local cache path to pont to the extracted zip folder. + # Update local cache path to point to the extracted zip folder. extract_zip_file(str(self.path())) os.remove(self.path()) # Deletes zip file self.is_extracted = True # Updates path() to return extracted path @@ -1045,6 +1047,37 @@ def extract_zip_file(filepath_str: str) -> Path: return out_path +# TODO (#12708): Remove this and rely on client +def zip_model(output_dir_path: str, model_path: str) -> str: + model_path = os.path.realpath(model_path) + package_name = os.path.basename(model_path) + compresslevel = 1 + + output_path = os.path.join(output_dir_path, package_name + ".zip") + os.makedirs(os.path.dirname(output_path), exist_ok=True) + with zipfile.ZipFile( + output_path, "w", compression=zipfile.ZIP_DEFLATED, compresslevel=compresslevel + ) as f: + walk: Iterable[Tuple[str, List[str], List[str]]] + if os.path.isfile(model_path): + root_path = os.path.dirname(model_path) + walk = [(root_path, [], [model_path])] + else: + root_path = os.path.join(model_path, "..") + walk = os.walk(model_path) + for root, _, files in walk: + # Create directory entry (can use f.mkdir from Python 3.11) + rel_root = os.path.relpath(root, root_path) + if rel_root != ".": + f.writestr(rel_root + "/", "") + for file in files: + f.write( + os.path.join(root, file), + os.path.relpath(os.path.join(root, file), root_path), + ) + return output_path + + def callback_with_retry( num_retries: int, callback: Callable, diff --git a/qai_hub_models/utils/config_loaders.py b/qai_hub_models/utils/config_loaders.py index da646f6b..42f382b4 100644 --- a/qai_hub_models/utils/config_loaders.py +++ b/qai_hub_models/utils/config_loaders.py @@ -30,14 +30,16 @@ from schema import Schema, SchemaError from qai_hub_models.utils.asset_loaders import ASSET_CONFIG, QAIHM_WEB_ASSET, load_yaml -from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.path_helpers import ( MODELS_PACKAGE_NAME, QAIHM_PACKAGE_NAME, get_qaihm_models_root, get_qaihm_package_root, ) -from qai_hub_models.utils.scorecard.common import get_supported_devices +from qai_hub_models.utils.scorecard.common import ( + ScorecardProfilePath, + get_supported_devices, +) QAIHM_PACKAGE_ROOT = get_qaihm_package_root() QAIHM_MODELS_ROOT = get_qaihm_models_root() @@ -122,10 +124,88 @@ def get_all_supported_devices(): return get_supported_devices( - ["qualcomm-snapdragon-x-elite", "qualcomm-snapdragon-8gen3"] + [ 
+ "qualcomm-snapdragon-8-elite", + "qualcomm-snapdragon-x-elite", + "qualcomm-snapdragon-8gen3", + ] ) +def _get_origin(input_type: Type) -> Type: + """ + For nested types like List[str] or Union[str, int], this function will + return the "parent" type like List or Union. + + If the input type is not a nested type, the function returns the input_type. + """ + return getattr(input_type, "__origin__", input_type) + + +def _extract_optional_type(input_type: Type) -> Type: + """ + Given an optional type as input, returns the inner type that is wrapped. + + For example, if input type is Optional[int], the function returns int. + """ + assert ( + _get_origin(input_type) == Union + ), "Input type must be an instance of `Optional`." + union_args = get_args(input_type) + assert len(union_args) == 2 and issubclass( + union_args[1], type(None) + ), "Input type must be an instance of `Optional`." + return union_args[0] + + +def _constructor_from_type(input_type: Type) -> Union[Type, Callable]: + """ + Given a type, return the appropriate constructor for that type. + + For primitive types like str and int, the type and constructor are the same object. + + For types like List, the constructor is list. + """ + input_type = _get_origin(input_type) + if input_type == List: + return list + if input_type == Dict: + return dict + return input_type + + +@dataclass +class BaseDataClass: + @classmethod + def get_schema(cls) -> Schema: + """Derive the Schema from the fields set on the dataclass.""" + schema_dict = {} + field_datatypes = get_type_hints(cls) + for field in fields(cls): + field_type = field_datatypes[field.name] + if _get_origin(field_type) == Union: + field_type = _extract_optional_type(field_type) + assert ( + field.default != dataclasses.MISSING + ), "Optional fields must have a default set." 
+ if field.default != dataclasses.MISSING: + field_key = OptionalSchema(field.name, default=field.default) + else: + field_key = field.name + schema_dict[field_key] = _constructor_from_type(field_type) + return Schema(And(schema_dict)) + + @classmethod + def from_dict( + cls: Type[BaseDataClassTypeVar], val_dict: Dict[str, Any] + ) -> BaseDataClassTypeVar: + kwargs = {field.name: val_dict[field.name] for field in fields(cls)} + return cls(**kwargs) + + +BaseDataClassTypeVar = TypeVar("BaseDataClassTypeVar", bound="BaseDataClass") + + @unique class FORM_FACTOR(Enum): PHONE = 0 @@ -257,323 +337,178 @@ def bytes_to_mb(num_bytes: int) -> int: return round(num_bytes / (1 << 20)) -@dataclass -class ModelRuntimePerformanceDetails: - model_name: str - device_name: str - device_os: str - runtime: TargetRuntime - inference_time_ms: int - peak_memory_bytes: Tuple[int, int] # min, max - compute_unit_counts: Dict[str, int] - - class QAIHMModelPerf: """Class to read the perf.yaml and parse it for displaying it on HuggingFace.""" - def __init__(self, perf_yaml_path, model_name): - self.model_name = model_name - self.perf_yaml_path = perf_yaml_path - self.skip_overall = False - self.skip_tflite = False - self.skip_qnn = False - self.tflite_row = ( - "| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 |" - ) - self.qnn_row = "| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 |" - - if os.path.exists(self.perf_yaml_path): - self.perf_details = load_yaml(self.perf_yaml_path) - num_models = len(self.perf_details["models"]) - - # Get TFLite summary from perf.yaml - try: - self.tflite_summary = [] - for model in self.perf_details["models"]: - self.tflite_summary.append( - model["performance_metrics"][0][TFLITE_PATH] - ) - except Exception: - self.skip_tflite = True - - if not self.skip_overall and not self.skip_tflite: - for num in range(num_models): - if isinstance(self.tflite_summary[num]["inference_time"], str): - self.skip_tflite = True - - # Get QNN summary from perf.yaml - try: - self.qnn_summary = [] - for model in self.perf_details["models"]: - self.qnn_summary.append(model["performance_metrics"][0][QNN_PATH]) - except Exception: - self.skip_qnn = True - if not self.skip_overall and not self.skip_qnn: - for num in range(num_models): - if isinstance(self.qnn_summary[num]["inference_time"], str): - self.skip_qnn = True - else: - self.skip_overall = True - - def _get_runtime_type(self, model_type): - if model_type == "tflite": - return "TFLite" - if model_type == "so": - return "QNN Model Library" - if model_type == "bin": - return "QNN Binary" - raise RuntimeError(f"Unsupported model_type specified {model_type}.") - - def get_row(self, skip, summary_list, initial_row, model_type, has_assets=True): - # Creating a row for performance table. 
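# Usage sketch for the relocated BaseDataClass (the example class is hypothetical,
# assuming BaseDataClass is importable from qai_hub_models.utils.config_loaders):
# get_schema() derives a Schema from the dataclass fields, Optional fields become
# OptionalSchema keys backed by their defaults, and from_dict() rebuilds the
# dataclass from a validated dict.
from dataclasses import dataclass
from typing import Optional

from qai_hub_models.utils.config_loaders import BaseDataClass

@dataclass
class ExampleEntry(BaseDataClass):
    name: str
    count: int
    note: Optional[str] = None  # Optional fields must carry a default

validated = ExampleEntry.get_schema().validate({"name": "demo", "count": 3})
entry = ExampleEntry.from_dict(validated)  # "note" is filled with its default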
- row = "" - if not skip: - names = self.get_submodel_names() - for summary, name in zip(summary_list, names): - inf_time = summary["inference_time"] - inference_time = f"{inf_time / 1000} ms" - mem_min = bytes_to_mb(summary["estimated_peak_memory_range"]["min"]) - mem_max = bytes_to_mb(summary["estimated_peak_memory_range"]["max"]) - peak_memory_range = f"{mem_min} - {mem_max} MB" - if model_type == "tflite": - self.tflite_inference_time = inference_time - self.tflite_peak_memory_range = peak_memory_range - elif model_type == "so" or model_type == "bin": - self.qnn_inference_time = inference_time - self.qnn_peak_memory_range = peak_memory_range - primary_compute_unit = summary["primary_compute_unit"] - precision = summary["precision"].upper() - base_url = ASSET_CONFIG.get_hugging_face_url(self.model_name) - # For model cards with no assets, only show model name with no link; - # as there is not target model to download - if has_assets: - target_model = f" [{name}.{model_type}]({base_url}/blob/main/{name}.{model_type})" - else: - target_model = name - - runtime_type = self._get_runtime_type(model_type) - row += ( - initial_row - + f" {runtime_type} | {inference_time} | {peak_memory_range} | {precision} | {primary_compute_unit} | {target_model} \n" - ) - return row - return "" - - def get_tflite_row(self): - # Get TFLite row for a submodel on a device. - return self.get_row( - self.skip_tflite, self.tflite_summary, self.tflite_row, "tflite" - ) + ### + # Helper Struct Classes + ### + + @dataclass + class PerformanceDetails: + job_id: str + inference_time_microsecs: float + peak_memory_bytes: Tuple[int, int] # min, max + compute_unit_counts: Dict[str, int] + primary_compute_unit: str + precision: str + + @staticmethod + def from_dict(device_perf_details: Dict) -> QAIHMModelPerf.PerformanceDetails: + peak_memory = device_perf_details["estimated_peak_memory_range"] + layer_info = device_perf_details["layer_info"] + compute_unit_counts = {} + for layer_name, count in layer_info.items(): + if "layers_on" in layer_name: + if count > 0: + compute_unit_counts[layer_name[-3:].upper()] = count + + return QAIHMModelPerf.PerformanceDetails( + job_id=device_perf_details["job_id"], + inference_time_microsecs=float(device_perf_details["inference_time"]), + peak_memory_bytes=(peak_memory["min"], peak_memory["max"]), + compute_unit_counts=compute_unit_counts, + primary_compute_unit=device_perf_details["primary_compute_unit"], + precision=device_perf_details["precision"], + ) - def get_qnn_row(self, is_precompiled: bool = False, has_assets=True): - # Get QNN row for a submodel on a device. - return self.get_row( - self.skip_qnn, - self.qnn_summary, - self.qnn_row, - "bin" if is_precompiled else "so", - has_assets, - ) + @dataclass + class LLMPerformanceDetails: + time_to_first_token_range_secs: Tuple[str, str] # min, max + tokens_per_second: float + + @staticmethod + def from_dict( + device_perf_details: Dict, + ) -> QAIHMModelPerf.LLMPerformanceDetails: + ttftr = device_perf_details["time_to_first_token_range"] + return QAIHMModelPerf.LLMPerformanceDetails( + time_to_first_token_range_secs=( + # Original data is in microseconds + str(float(ttftr["min"]) * 1e-6), + str(float(ttftr["max"]) * 1e-6), + ), + tokens_per_second=device_perf_details["tokens_per_second"], + ) - def body_perf(self, is_precompiled: bool = False, has_assets: bool = True): - # Combine all the rows to make the body of performance table. 
- if self.skip_tflite: - return self.get_qnn_row(is_precompiled, has_assets) - elif self.skip_qnn: - return self.get_tflite_row() - else: - return self.get_tflite_row() + self.get_qnn_row(is_precompiled, has_assets) - - def compute_unit_summary(self, runtime_path=TFLITE_PATH): - # Get compute unit summary for export script's output. - npu, gpu, cpu = 0, 0, 0 - cu_summary = "" - for model in self.perf_details["models"]: - layer_info = model["performance_metrics"][0][runtime_path]["layer_info"] - npu += layer_info["layers_on_npu"] - gpu += layer_info["layers_on_gpu"] - cpu += layer_info["layers_on_cpu"] - if npu > 0: - cu_summary += f"NPU ({npu})" - if gpu > 0: - cu_summary += f"GPU ({gpu})" - if cpu > 0: - cu_summary += f"CPU ({cpu})" - return cu_summary - - def get_submodel_names_and_ids(self): - # Get the names, TFLite job ids and QNN job ids. - names = self.get_submodel_names() - tflite_job_ids, qnn_job_ids = [], [] - for model in self.perf_details["models"]: - if TFLITE_PATH in model["performance_metrics"][0]: - tflite_job_ids.append( - model["performance_metrics"][0][TFLITE_PATH]["job_id"] + @dataclass + class EvaluationDetails(BaseDataClass): + name: str + value: float + unit: str + + @dataclass + class DeviceDetails(BaseDataClass): + name: str + os: str + form_factor: str + os_name: str + manufacturer: str + chipset: str + + @dataclass + class ProfilePerfDetails: + path: ScorecardProfilePath + perf_details: QAIHMModelPerf.PerformanceDetails | QAIHMModelPerf.LLMPerformanceDetails + eval_details: Optional[QAIHMModelPerf.EvaluationDetails] = None + + @staticmethod + def from_dict( + path: ScorecardProfilePath, perf_details_dict: Dict + ) -> QAIHMModelPerf.ProfilePerfDetails: + perf_details: QAIHMModelPerf.LLMPerformanceDetails | QAIHMModelPerf.PerformanceDetails + if llm_metrics := perf_details_dict.get("llm_metrics", None): + perf_details = QAIHMModelPerf.LLMPerformanceDetails.from_dict( + llm_metrics ) - if QNN_PATH in model["performance_metrics"][0]: - qnn_job_ids.append(model["performance_metrics"][0][QNN_PATH]["job_id"]) - return names, tflite_job_ids, qnn_job_ids - - def get_submodel_names(self): - # Get names of all the submodels. - names = [] - for model in self.perf_details["models"]: - names.append(model["name"]) - return names - - def get_perf_details( - self, - runtime: TargetRuntime, - device: str | None = None, - device_os: str | None = None, - ) -> Dict[str, ModelRuntimePerformanceDetails | None]: - """ - Get model performance details for the selected device and runtime. - - If device is None, picks the first device specified in the perf results. - - Returns a dictionary of - { model_component_name : performance details object } - - If there is only one component, model_component_name == model_name. - - The performance details object will be null if the requested - perf details do not exist, or if the perf job failed. - """ - if runtime == TargetRuntime.TFLITE: - rt_name = "torchscript_onnx_tflite" - elif runtime == TargetRuntime.QNN: - rt_name = "torchscript_onnx_qnn" - else: - raise NotImplementedError() - - # Model -> Performance Details - # None == Test did not run. 
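# Illustrative per-device perf.yaml fragment (values invented) in the shape that
# the new QAIHMModelPerf.PerformanceDetails.from_dict consumes:
from qai_hub_models.utils.config_loaders import QAIHMModelPerf

device_perf_details = {
    "job_id": "j1234abcd",
    "inference_time": 1234.0,  # microseconds
    "estimated_peak_memory_range": {"min": 16384, "max": 33554432},
    "layer_info": {"layers_on_npu": 42, "layers_on_gpu": 0, "layers_on_cpu": 1},
    "primary_compute_unit": "NPU",
    "precision": "fp16",
}
details = QAIHMModelPerf.PerformanceDetails.from_dict(device_perf_details)
# Only "layers_on_*" entries with a positive count survive, keyed by the last
# three characters uppercased: compute_unit_counts == {"NPU": 42, "CPU": 1}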
- perf_details: Dict[str, ModelRuntimePerformanceDetails | None] = {} - - for model in self.perf_details["models"]: - name = model["name"] - metrics = model["performance_metrics"] - for device_metrics in metrics: - device_name = device_metrics["reference_device_info"]["name"] - metric_device_os = device_metrics["reference_device_info"]["os"] - - # Verify Device Matches Requested Device - if device and device_name != device: - continue - if device_os and metric_device_os != device_os: - continue - - perf_rt = device_metrics.get(rt_name, None) - - # Inference Time - inf_time = perf_rt["inference_time"] if perf_rt else "null" - if inf_time == "null": - # Compilation or inference failed. - perf_details[name] = None - continue - inf_time /= 1000 - - # Memory - peak_mem = perf_rt["estimated_peak_memory_range"] - peak_mem_bytes: Tuple[int, int] = tuple([peak_mem["min"], peak_mem["max"]]) # type: ignore - - # Layer Info - layer_info = perf_rt["layer_info"] - compute_unit_counts = {} - for layer_name, count in layer_info.items(): - if "layers_on" in layer_name: - if count > 0: - compute_unit_counts[layer_name[-3:].upper()] = count - - perf_details[name] = ModelRuntimePerformanceDetails( - model_name=model, - device_name=device_name, - device_os=metric_device_os, - runtime=runtime, - inference_time_ms=inf_time, - peak_memory_bytes=peak_mem_bytes, - compute_unit_counts=compute_unit_counts, + else: + perf_details = QAIHMModelPerf.PerformanceDetails.from_dict( + perf_details_dict ) - if name not in perf_details.keys(): - perf_details[name] = None - - return perf_details - - -def _get_origin(input_type: Type) -> Type: - """ - For nested types like List[str] or Union[str, int], this function will - return the "parent" type like List or Union. - - If the input type is not a nested type, the function returns the input_type. - """ - return getattr(input_type, "__origin__", input_type) - - -def _extract_optional_type(input_type: Type) -> Type: - """ - Given an optional type as input, returns the inner type that is wrapped. - - For example, if input type is Optional[int], the function returns int. - """ - assert ( - _get_origin(input_type) == Union - ), "Input type must be an instance of `Optional`." - union_args = get_args(input_type) - assert len(union_args) == 2 and issubclass( - union_args[1], type(None) - ), "Input type must be an instance of `Optional`." - return union_args[0] - - -def _constructor_from_type(input_type: Type) -> Union[Type, Callable]: - """ - Given a type, return the appropriate constructor for that type. - - For primitive types like str and int, the type and constructor are the same object. + if eval_metrics := perf_details_dict.get("evaluation_metrics", None): + eval_details_data = ( + QAIHMModelPerf.EvaluationDetails.get_schema().validate(eval_metrics) + ) + eval_details = QAIHMModelPerf.EvaluationDetails.from_dict( + eval_details_data + ) + else: + eval_details = None - For types like List, the constructor is list. 
- """ - input_type = _get_origin(input_type) - if input_type == List: - return list - if input_type == Dict: - return dict - return input_type + return QAIHMModelPerf.ProfilePerfDetails( + path=path, perf_details=perf_details, eval_details=eval_details + ) + @dataclass + class DevicePerfDetails: + device: QAIHMModelPerf.DeviceDetails + details_per_path: Dict[ScorecardProfilePath, QAIHMModelPerf.ProfilePerfDetails] + + @staticmethod + def from_dict( + device: QAIHMModelPerf.DeviceDetails, device_runtime_details: Dict + ) -> QAIHMModelPerf.DevicePerfDetails: + details_per_path = {} + for profile_path in ScorecardProfilePath: + if profile_path.long_name in device_runtime_details: + perf_details_dict = device_runtime_details[profile_path.long_name] + details_per_path[ + profile_path + ] = QAIHMModelPerf.ProfilePerfDetails.from_dict( + profile_path, perf_details_dict + ) + return QAIHMModelPerf.DevicePerfDetails( + device=device, details_per_path=details_per_path + ) -@dataclass -class BaseDataClass: - @classmethod - def get_schema(cls) -> Schema: - """Derive the Schema from the fields set on the dataclass.""" - schema_dict = {} - field_datatypes = get_type_hints(cls) - for field in fields(cls): - field_type = field_datatypes[field.name] - if _get_origin(field_type) == Union: - field_type = _extract_optional_type(field_type) - assert ( - field.default != dataclasses.MISSING - ), "Optional fields must have a default set." - if field.default != dataclasses.MISSING: - field_key = OptionalSchema(field.name, default=field.default) - else: - field_key = field.name - schema_dict[field_key] = _constructor_from_type(field_type) - return Schema(And(schema_dict)) + @dataclass + class ModelPerfDetails: + model: str + details_per_device: Dict[str, QAIHMModelPerf.DevicePerfDetails] + + @staticmethod + def from_dict( + model: str, model_performance_metrics: List[Dict] + ) -> QAIHMModelPerf.ModelPerfDetails: + details_per_device = {} + for device_perf_details in model_performance_metrics: + device_details_data = ( + QAIHMModelPerf.DeviceDetails.get_schema().validate( + device_perf_details["reference_device_info"] + ) + ) + device_details = QAIHMModelPerf.DeviceDetails.from_dict( + device_details_data + ) + details_per_device[ + device_details.name + ] = QAIHMModelPerf.DevicePerfDetails.from_dict( + device_details, device_perf_details + ) - @classmethod - def from_dict( - cls: Type[BaseDataClassTypeVar], val_dict: Dict[str, Any] - ) -> BaseDataClassTypeVar: - kwargs = {field.name: val_dict[field.name] for field in fields(cls)} - return cls(**kwargs) + return QAIHMModelPerf.ModelPerfDetails( + model=model, details_per_device=details_per_device + ) + def __init__(self, perf_yaml_path, model_name): + self.model_name = model_name + self.perf_yaml_path = perf_yaml_path + self.per_model_details: Dict[str, QAIHMModelPerf.ModelPerfDetails] = {} -BaseDataClassTypeVar = TypeVar("BaseDataClassTypeVar", bound="BaseDataClass") + if os.path.exists(self.perf_yaml_path): + self.perf_details = load_yaml(self.perf_yaml_path) + all_models_and_perf = self.perf_details["models"] + if not isinstance(all_models_and_perf, list): + all_models_and_perf = [all_models_and_perf] + + for model_perf in all_models_and_perf: + model_name = model_perf["name"] + self.per_model_details[ + model_name + ] = QAIHMModelPerf.ModelPerfDetails.from_dict( + model_name, model_perf["performance_metrics"] + ) @dataclass @@ -659,6 +594,11 @@ class QAIHMModelCodeGen(BaseDataClass): # on a full dataset. 
Datasets specified here must be chosen from `qai_hub_models/datasets`. eval_datasets: Optional[List[str]] = None + # If set, quantizes the model using an AI Hub quantize job. This also requires setting + # the `eval_datasets` field. Calibration data will be pulled from the first item + # in `eval_datasets`. + use_hub_quantization: bool = False + # By default inference tests are done using 8gen1 chipset to avoid overloading # newer devices. Some models don't work on 8gen1, so use 8gen3 for those. inference_on_8gen3: bool = False @@ -688,6 +628,12 @@ def load_code_gen_yaml(path: str | Path | None = None) -> Dict[str, Any]: data = QAIHMModelCodeGen.get_schema().validate(data) except SchemaError as e: assert 0, f"{e.code} in {path}" + if data["is_aimet"] and data["use_hub_quantization"]: + raise ValueError( + "Flags is_aimet and use_hub_quantization cannot both be set." + ) + if data["use_hub_quantization"] and len(data["eval_datasets"]) == 0: + raise ValueError("Must set eval_datasets if use_hub_quantization is set.") return data @@ -720,22 +666,6 @@ class QAIHMModelInfo(BaseDataClass): # A list of applicable tags to add to the model tags: List[MODEL_TAG] - # Link to the research paper where the model was first published. Usually an arxiv link. - research_paper: str - - # The title of the research paper. - research_paper_title: str - - # A link to the model's license. Most commonly found in the github repo it was cloned from. - license: str - - # A link to the AIHub license, unless the license is more restrictive like GPL. - # In that case, this should point to the same as the model license. - deploy_license: str - - # A link to the original github repo with the model's code. - source_repo: str - # A list of real-world applications for which this model could be used. # This is free-form and almost anything reasonable here is fine. applicable_scenarios: List[str] @@ -758,13 +688,6 @@ class QAIHMModelInfo(BaseDataClass): # CodeGen options from code-gen.yaml in the model's folder. code_gen_config: QAIHMModelCodeGen - # The license type of the original model repo. - license_type: str - - # Should be set to `AI Model Hub License`, unless the license is more restrictive like GPL. - # In that case, this should be the same as the model license. - deploy_license_type: str - # A list of datasets for which the model has pre-trained checkpoints # available as options in `model.py`. Typically only has one entry. dataset: List[str] @@ -778,6 +701,32 @@ class QAIHMModelInfo(BaseDataClass): # Number of output classes: The number of classes the model can classify or annotate. technical_details: Dict[str, str] + # The license type of the original model repo. + license_type: str + + # Some models are made by a company other than Qualcomm. If set, this identifies the maker. + model_maker_id: Optional[str] = None + + # Link to the research paper where the model was first published. Usually an arxiv link. + research_paper: Optional[str] = None + + # The title of the research paper. + research_paper_title: Optional[str] = None + + # A link to the original github repo with the model's code. + source_repo: Optional[str] = None + + # A link to the model's license. Most commonly found in the github repo it was cloned from. + license: Optional[str] = None + + # A link to the AIHub license, unless the license is more restrictive like GPL. + # In that case, this should point to the same as the model license. + deploy_license: Optional[str] = None + + # Should be set to `AI Model Hub License`, unless the license is more restrictive like GPL.
+ # In that case, this should be the same as the model license. + deploy_license_type: Optional[str] = None + # If set, model assets shouldn't be distributed. restrict_model_sharing: bool = False @@ -829,7 +778,9 @@ def validate(self) -> Tuple[bool, Optional[str]]: return False, f"Model {r_model} cannot be related to itself." # If paper is arxiv, it should be an abs link - if self.research_paper.startswith("https://arxiv.org/"): + if self.research_paper is not None and self.research_paper.startswith( + "https://arxiv.org/" + ): if "/abs/" not in self.research_paper: return ( False, @@ -840,9 +791,15 @@ def validate(self) -> Tuple[bool, Optional[str]]: if self.license_type not in HF_AVAILABLE_LICENSES: return False, f"license can be one of these: {HF_AVAILABLE_LICENSES}" - if not self.deploy_license: + purchase_required = False + if self.model_type_llm and self.llm_details is not None: + purchase_required = ( + self.llm_details.get("call_to_action", "") == "contact_for_purchase" + ) + + if not self.deploy_license and not purchase_required: return False, "deploy_license cannot be empty" - if not self.deploy_license_type: + if not self.deploy_license_type and not purchase_required: return False, "deploy_license_type cannot be empty" # Status Reason @@ -896,31 +853,64 @@ def validate(self) -> Tuple[bool, Optional[str]]: if expected_example_use != ASSET_CONFIG.get_example_use(self.id): return False, "Example-usage field not pointing to expected relative path" + # Check that the model_type_llm and llm_details fields are consistent if self.model_type_llm: - assert self.llm_details is not None + assert ( + self.llm_details is not None + ), "All LLMs must have an 'llm_details' section." + assert ( + "call_to_action" in self.llm_details + ), "All LLMs must have 'call_to_action' in 'llm_details'." + assert self.llm_details["call_to_action"] in { + "contact_for_purchase", + "download", + "view_readme", + "contact_for_download", + }, "The LLM 'call_to_action' field only allows these values: download, view_readme, contact_for_purchase or contact_for_download." for dev in self.llm_details: - # Check the device is one of the supported devices. - assert dev in get_all_supported_devices() - - if "purchase_required" in self.llm_details[dev]: - assert self.llm_details[dev]["purchase_required"] - if "model_download_url" in self.llm_details[dev]: - assert self.llm_details[dev]["model_download_url"] is not None - model_download_url = ASSET_CONFIG.get_web_asset_url( - self.id, self.llm_details[dev]["model_download_url"] + if dev not in {"call_to_action", "genie_compatible"}: + assert ( + list(self.llm_details[dev].keys())[0] == "torchscript_onnx_qnn" ) - # Check if the url exists + # Check the device is one of the supported devices.
+ assert dev in get_all_supported_devices() + if ( - session.head(model_download_url).status_code - != requests.codes.ok + "model_download_url" + in self.llm_details[dev]["torchscript_onnx_qnn"] ): - return False, f"Download URL is missing at {model_download_url}" - if "genie_url" in self.llm_details[dev]: - assert self.llm_details[dev]["genie_url"] is not None - genie_url = self.llm_details[dev]["genie_url"] - # Check if the url exists - if session.head(genie_url).status_code != requests.codes.ok: - return False, f"Genie App URL is missing at {genie_url}" + assert ( + self.llm_details[dev]["torchscript_onnx_qnn"][ + "model_download_url" + ] + is not None + ) + version, relative_path = int( + self.llm_details[dev]["torchscript_onnx_qnn"][ + "model_download_url" + ].split("/")[0][1:] + ), "/".join( + self.llm_details[dev]["torchscript_onnx_qnn"][ + "model_download_url" + ].split("/")[1:] + ) + model_download_url = ASSET_CONFIG.get_model_asset_url( + self.id, version, relative_path + ) + # Check if the url exists + if ( + session.head(model_download_url).status_code + != requests.codes.ok + ): + return ( + False, + f"Download URL is missing at {model_download_url}", + ) + + if self.llm_details["call_to_action"] == "contact_for_purchase": + assert not self.llm_details.get("genie_compatible", False) + else: + assert self.llm_details is None return True, None diff --git a/qai_hub_models/utils/evaluate.py b/qai_hub_models/utils/evaluate.py index dfad39b1..42e9cac6 100644 --- a/qai_hub_models/utils/evaluate.py +++ b/qai_hub_models/utils/evaluate.py @@ -120,11 +120,13 @@ def _populate_data_cache_impl( else: output_names = ["output_0"] input_entries = make_hub_dataset_entries( + (model_inputs.split(1, dim=0),), input_names, channel_last_input, - model_inputs.split(1, dim=0), ) - gt_entries = make_hub_dataset_entries(output_names, None, ground_truth_values) + gt_entries = make_hub_dataset_entries( + (ground_truth_values,), output_names, None + ) # print(input_entries) input_dataset = hub.upload_dataset(input_entries) gt_dataset = hub.upload_dataset(gt_entries) @@ -209,7 +211,7 @@ def _populate_data_cache( shutil.move(str(tmp_cache_path), str(cache_path)) -def sample_dataset(dataset: Dataset, num_samples: int, seed: int) -> Dataset: +def sample_dataset(dataset: Dataset, num_samples: int, seed: int = 42) -> Dataset: """ Create a dataset that is a subsample of `dataset` with `num_samples`. diff --git a/qai_hub_models/utils/huggingface.py b/qai_hub_models/utils/huggingface.py index 4ddd9bef..34361c7e 100644 --- a/qai_hub_models/utils/huggingface.py +++ b/qai_hub_models/utils/huggingface.py @@ -29,12 +29,14 @@ def fetch_huggingface_target_model( file_types = ["tflite"] elif runtime_path == TargetRuntime.QNN: file_types = ["so", "bin"] + elif runtime_path == TargetRuntime.ONNX: + file_types = ["onnx"] else: raise NotImplementedError() files = [] for file_type in file_types: - files += fs.glob(os.path.join(hf_path, f"**/*.{file_type}")) + files += fs.glob(os.path.join(hf_path, f"*.{file_type}")) if not files: raise FileNotFoundError( f"No compiled assets are available on Huggingface for {model_name} with runtime {runtime_path.name}." @@ -49,10 +51,13 @@ def fetch_huggingface_target_model( return paths -def has_model_access(repo_name: str, repo_url: str): +def has_model_access(repo_name: str, repo_url: str | None = None): # Huggingface returns GatedRepoError if model is not accessible to current User. 
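# Recap of the updated Hugging Face asset lookup (annotation, not part of the
# patch): TFLITE -> "*.tflite", QNN -> "*.so"/"*.bin", and (new) ONNX -> "*.onnx";
# the glob is now non-recursive, so only top-level repo files match.
from qai_hub_models.utils.huggingface import has_model_access

# repo_url is now optional and defaults to https://huggingface.co/<repo_name>;
# the repo name below is illustrative.
has_model_access("qualcomm/Example-Model")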
# ref: https://github.com/huggingface/huggingface_hub/blob/5ff2d150d121d04799b78bc08f2343c21b8f07a9/src/huggingface_hub/utils/_errors.py#L135 + if not repo_url: + repo_url = "https://huggingface.co/" + repo_name + try: hf_api = HfApi() hf_api.model_info(repo_name) diff --git a/qai_hub_models/utils/inference.py b/qai_hub_models/utils/inference.py index 8a33a407..c2181f0c 100644 --- a/qai_hub_models/utils/inference.py +++ b/qai_hub_models/utils/inference.py @@ -253,7 +253,6 @@ def compile_model_from_args( """ export_file = f"qai_hub_models.models.{model_id}.export" export_module = import_module(export_file) - compile_job: hub.CompileJob if cli_args.chipset: device_cli = f"--chipset {cli_args.chipset}" else: @@ -279,10 +278,7 @@ def compile_model_from_args( **component_kwargs, ) - if component is None: - no_hub_access = len(export_output) == 0 or isinstance(export_output[0], str) - else: - no_hub_access = export_output[component][0] is None + no_hub_access = isinstance(export_output, list) if no_hub_access: # The export returned local file paths, which mean Hub credentials were not found. @@ -291,29 +287,36 @@ def compile_model_from_args( ) export_output = export_output if component is None else export_output[component] - compile_job, _, _ = export_output - target_model = compile_job.get_target_model() + target_model = export_output.compile_job.get_target_model() assert target_model is not None return target_model def make_hub_dataset_entries( + tensors_tuple: Tuple[ + torch.Tensor + | np.ndarray + | List[torch.Tensor | np.ndarray] + | Tuple[torch.Tensor | np.ndarray], + ..., + ], input_names: List[str], - channel_last_input: Optional[List[str]], - *args: torch.Tensor | np.ndarray | List[torch.Tensor | np.ndarray], + channel_last_input: Optional[List[str]] = None, ) -> DatasetEntries: """ Given input tensor(s) in either numpy or torch format, convert to hub DatasetEntries format. Parameters: + tensors: Tensor data in numpy or torch.Tensor format. input_names: List of input names. channel_last_input: Comma-separated list of input names to transpose channel. - target_runtime: Runtime of model being used to inference this dataset. - args: Tensor data in numpy or torch.Tensor format. """ dataset = {} - for name, inputs in zip(input_names, args): + assert len(tensors_tuple) == len( + input_names + ), "Number of elements in tensors_tuple must match number of inputs" + for name, inputs in zip(input_names, tensors_tuple): if not isinstance(inputs, (list, tuple)): inputs = [inputs] # type: ignore @@ -442,7 +445,8 @@ def __call__( assert len(args) == 1, "Only 1 dataset can be provided for inference." dataset = args[0] else: - dataset_entries = make_hub_dataset_entries(self.input_names, self.channel_last_input, *args) # type: ignore + tensors = tuple(args) + dataset_entries = make_hub_dataset_entries(tensors, self.input_names, self.channel_last_input) # type: ignore dataset = hub.upload_dataset(dataset_entries) inference_job = hub.submit_inference_job( diff --git a/qai_hub_models/utils/path_helpers.py b/qai_hub_models/utils/path_helpers.py index 2dc4a50f..dfb6a82d 100644 --- a/qai_hub_models/utils/path_helpers.py +++ b/qai_hub_models/utils/path_helpers.py @@ -18,7 +18,7 @@ def get_all_models(public_only: bool = False): if not subdir.is_dir(): continue # Heuristic to see if this is a model we should generate export.py for. 
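# Sketch of the reordered make_hub_dataset_entries call (shapes illustrative):
# tensors now come first, as a tuple with one entry per input name.
import numpy as np

from qai_hub_models.utils.inference import make_hub_dataset_entries

image = np.zeros((1, 3, 224, 224), dtype=np.float32)
entries = make_hub_dataset_entries((image,), ["image"], channel_last_input=None)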
- if (subdir / "model.py").exists() and (subdir / "test.py").exists(): + if (subdir / "model.py").exists(): if public_only: if not (subdir / "info.yaml").exists(): continue diff --git a/qai_hub_models/utils/printing.py b/qai_hub_models/utils/printing.py index 0884f1d0..5ea8ae7a 100644 --- a/qai_hub_models/utils/printing.py +++ b/qai_hub_models/utils/printing.py @@ -2,6 +2,8 @@ # Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. # SPDX-License-Identifier: BSD-3-Clause # --------------------------------------------------------------------- +from __future__ import annotations + from collections import Counter from pathlib import Path from typing import Any, Dict, List, Optional, Union @@ -14,10 +16,7 @@ from qai_hub_models.utils.base_model import TargetRuntime from qai_hub_models.utils.compare import METRICS_FUNCTIONS, generate_comparison_metrics -from qai_hub_models.utils.config_loaders import ( - ModelRuntimePerformanceDetails, - bytes_to_mb, -) +from qai_hub_models.utils.config_loaders import QAIHMModelPerf, bytes_to_mb from qai_hub_models.utils.qnn_helpers import is_qnn_hub_model _INFO_DASH = "-" * 60 @@ -41,7 +40,7 @@ def print_with_box(data: List[str]) -> None: def print_inference_metrics( - inference_job: hub.InferenceJob, + inference_job: Optional[hub.InferenceJob], inference_result: DatasetEntries, torch_out: List[np.ndarray], output_names: Optional[List[str]] = None, @@ -68,7 +67,8 @@ def custom_float_format(x): formatted_df = df_eval.applymap(custom_float_format) print( - f"\nComparing on-device vs. local-cpu inference for {inference_job.name.title()}." + "\nComparing on-device vs. local-cpu inference" + + (f" for {inference_job.name.title()}." if inference_job is not None else "") ) print(tabulate(formatted_df, headers="keys", tablefmt="grid")) # type: ignore print() @@ -77,9 +77,10 @@ def custom_float_format(x): for m in df_eval.columns.drop("shape"): # type: ignore print(f"- {m}:", METRICS_FUNCTIONS[m][1]) - last_line = f"More details: {inference_job.url}" - print() - print(last_line) + if inference_job is not None: + last_line = f"More details: {inference_job.url}" + print() + print(last_line) def print_profile_metrics_from_job( @@ -90,7 +91,6 @@ def print_profile_metrics_from_job( [op.get("compute_unit", "UNK") for op in profile_data["execution_detail"]] ) execution_summary = profile_data["execution_summary"] - inference_time_ms = execution_summary["estimated_inference_time"] / 1000 peak_memory_bytes = execution_summary["inference_memory_peak_range"] print(f"\n{_INFO_DASH}") print(f"Performance results on-device for {profile_job.name.title()}.") @@ -105,48 +105,87 @@ def print_profile_metrics_from_job( else: raise NotImplementedError() - print_profile_metrics( - ModelRuntimePerformanceDetails( - profile_job.model.name, - profile_job.device.name, - profile_job.device.os, - runtime, - inference_time_ms, - peak_memory_bytes, - compute_unit_counts, - ) + perf_details = QAIHMModelPerf.PerformanceDetails( + job_id=profile_job.job_id, + inference_time_microsecs=execution_summary["estimated_inference_time"], + peak_memory_bytes=peak_memory_bytes, + compute_unit_counts=compute_unit_counts, + # Unused + primary_compute_unit="", + precision="", + ) + + device_details = QAIHMModelPerf.DeviceDetails( + name=profile_job.device.name, + os=profile_job.device.os, + # unused + form_factor="", + os_name="", + manufacturer="", + chipset="", ) + + print_profile_metrics(device_details, runtime, perf_details) print(_INFO_DASH) last_line = f"More details: {profile_job.url}\n" 
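# Hypothetical call (values invented) showing the new print_profile_metrics
# signature, which takes the device, runtime, and perf details separately; the
# empty strings mirror the "unused" placeholders above.
from qai_hub_models.utils.base_model import TargetRuntime
from qai_hub_models.utils.config_loaders import QAIHMModelPerf
from qai_hub_models.utils.printing import print_profile_metrics

device = QAIHMModelPerf.DeviceDetails(
    name="Samsung Galaxy S23", os="13",
    form_factor="", os_name="", manufacturer="", chipset="",
)
perf = QAIHMModelPerf.PerformanceDetails(
    job_id="j1234abcd", inference_time_microsecs=950.0,
    peak_memory_bytes=(0, 2097152), compute_unit_counts={"NPU": 80},
    primary_compute_unit="", precision="",
)
print_profile_metrics(device, TargetRuntime.TFLITE, perf)
# Prints a left-aligned table of device, runtime, latency, memory, and compute units.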
print(last_line) -def print_profile_metrics( - details: ModelRuntimePerformanceDetails, -): - inf_time = details.inference_time_ms - peak_memory_mb = f"[{bytes_to_mb(details.peak_memory_bytes[0])}, {bytes_to_mb(details.peak_memory_bytes[1])}]" - num_ops = sum(details.compute_unit_counts.values()) - compute_units = [ - f"{unit} ({num_ops} ops)" - for unit, num_ops in details.compute_unit_counts.items() - ] - +def get_profile_metrics( + device: QAIHMModelPerf.DeviceDetails, + runtime: TargetRuntime, + perf_details: QAIHMModelPerf.PerformanceDetails + | QAIHMModelPerf.LLMPerformanceDetails, +) -> str: rows = [ - ["Device", f"{details.device_name} ({details.device_os})"], - ["Runtime", f"{details.runtime.name}"], - [ - "Estimated inference time (ms)", - "<0.1" if inf_time < 0.1 else f"{inf_time:.1f}", - ], - ["Estimated peak memory usage (MB)", f"{peak_memory_mb}"], - ["Total # Ops", f"{num_ops}"], - ["Compute Unit(s)", " ".join(compute_units)], + ["Device", f"{device.name} ({device.os})"], + ["Runtime", runtime.name], ] + + if isinstance(perf_details, QAIHMModelPerf.LLMPerformanceDetails): + rows.extend( + [ + ["Response Rate (Tokens/Second)", str(perf_details.tokens_per_second)], + [ + "Time to First Token (Seconds)", + str(perf_details.time_to_first_token_range_secs), + ], + ] + ) + else: + inf_time_ms = perf_details.inference_time_microsecs / 1000 + mem_min = bytes_to_mb(perf_details.peak_memory_bytes[0]) + mem_max = bytes_to_mb(perf_details.peak_memory_bytes[1]) + compute_units = [ + f"{unit} ({num_ops} ops)" + for unit, num_ops in perf_details.compute_unit_counts.items() + ] + + rows.extend( + [ + [ + "Estimated inference time (ms)", + "<0.1" if inf_time_ms < 0.1 else f"{inf_time_ms:.1f}", + ], + ["Estimated peak memory usage (MB)", f"[{mem_min}, {mem_max}]"], + ["Total # Ops", str(sum(perf_details.compute_unit_counts.values()))], + ["Compute Unit(s)", " ".join(compute_units)], + ] + ) + table = PrettyTable(align="l", header=False, border=False, padding_width=0) for row in rows: table.add_row([row[0], f": {row[1]}"]) - print(table.get_string()) + return table.get_string() + + +def print_profile_metrics( + device: QAIHMModelPerf.DeviceDetails, + runtime: TargetRuntime, + perf_details: QAIHMModelPerf.PerformanceDetails + | QAIHMModelPerf.LLMPerformanceDetails, +): + print(get_profile_metrics(device, runtime, perf_details)) def print_on_target_demo_cmd( diff --git a/qai_hub_models/utils/qai_hub_helpers.py b/qai_hub_models/utils/qai_hub_helpers.py index 335e65d3..74dcbe4b 100644 --- a/qai_hub_models/utils/qai_hub_helpers.py +++ b/qai_hub_models/utils/qai_hub_helpers.py @@ -70,25 +70,44 @@ def export_without_hub_access( print("") missing_perf = True - # Components in perf.yaml don't yet have the same name as their code generated names. - if not components: - perf_yaml_path = os.path.join( - os.path.dirname(os.path.dirname(__file__)), - "models", - model_id, - "perf.yaml", - ) - if os.path.exists(perf_yaml_path): - parsed_perf = QAIHMModelPerf(perf_yaml_path, model_id).get_perf_details( - target_runtime, device_name + perf_yaml_path = os.path.join( + os.path.dirname(os.path.dirname(__file__)), + "models", + model_id, + "perf.yaml", + ) + if os.path.exists(perf_yaml_path): + parsed_perf = QAIHMModelPerf(perf_yaml_path, model_id) + + if not components: + components = [model_display_name] + + print(f"Profiling Results\n{_INFO_DASH}") + for component in components: + print(f"{component}") + model_perf = parsed_perf.per_model_details[component] + + # Device families aren't stored in perf yamls. 
Replace with the original device name. + device_search_name = device_name.replace(" (Family)", "") + device_perf = model_perf.details_per_device.get( + device_search_name, None ) - missing_perf = None in parsed_perf.values() + if not device_perf: + break + + runtime_perf = None + for path, path_runtime_perf in device_perf.details_per_path.items(): + if path.get_runtime() == target_runtime: + runtime_perf = path_runtime_perf + break - if not missing_perf: - print(f"Profiling Results for {model_display_name}\n{_INFO_DASH}") - for model_name, perf in parsed_perf.items(): - assert perf is not None # for mypy - print_profile_metrics(perf) + if not runtime_perf: + break + + missing_perf = False + print_profile_metrics( + device_perf.device, target_runtime, runtime_perf.perf_details + ) if missing_perf: print( diff --git a/qai_hub_models/utils/quantization.py b/qai_hub_models/utils/quantization.py index 0220849d..ce339f9a 100644 --- a/qai_hub_models/utils/quantization.py +++ b/qai_hub_models/utils/quantization.py @@ -7,9 +7,16 @@ from typing import Optional import torch +from qai_hub.client import DatasetEntries, Device, QuantizeDtype from torch.utils.data import DataLoader +from qai_hub_models.datasets import get_dataset_from_name +from qai_hub_models.models.common import TargetRuntime +from qai_hub_models.models.protocols import HubModelProtocol from qai_hub_models.utils.asset_loaders import CachedWebDatasetAsset, load_torch +from qai_hub_models.utils.evaluate import sample_dataset +from qai_hub_models.utils.inference import make_hub_dataset_entries +from qai_hub_models.utils.input_spec import InputSpec DATA_ID = "image_quantziation_samples" DATA_VERSION = 1 @@ -66,3 +73,65 @@ def get_image_quantization_samples( """ path = IMAGE_QUANTIZATION_SAMPLES.fetch(extract=False) return load_torch(quantization_samples_path or path) + + +def get_calibration_data( + input_spec: InputSpec, dataset_name: str, num_samples: int +) -> DatasetEntries: + """ + Produces a numpy dataset to be used for calibration data of a quantize job. + + Parameters: + input_spec: The input spec of the model. Used to ensure the returned dataset's names + match the input names of the model. + dataset_name: Name of the dataset to sample from. + num_samples: Number of data samples to use. + + Returns: + Dataset compatible with the format expected by AI Hub. + """ + torch_dataset = sample_dataset(get_dataset_from_name(dataset_name), num_samples) + torch_samples = tuple( + [torch_dataset[i][j].unsqueeze(0).numpy() for i in range(len(torch_dataset))] + for j in range(len(input_spec)) + ) + return make_hub_dataset_entries(torch_samples, list(input_spec.keys())) + + +class HubQuantizableMixin(HubModelProtocol): + """ + Mixin to attach to model classes that will be quantized using AI Hub quantize job. + """ + + def get_hub_compile_options( + self, + target_runtime: TargetRuntime, + other_compile_options: str = "", + device: Optional[Device] = None, + ) -> str: + quantization_flags = " --quantize_io" + if target_runtime == TargetRuntime.TFLITE: + # uint8 is the easiest I/O type for integration purposes, + # especially for image applications. Images are always + # uint8 RGB when coming from disk or a camera. + # + # Uint8 has not been thoroughly tested with other paths, + # so it is enabled only for TF Lite today. 
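# Illustrative use of the new get_calibration_data helper; the dataset name and
# InputSpec-shaped dict are assumptions, and the dataset must exist in
# qai_hub_models/datasets.
from qai_hub_models.utils.quantization import get_calibration_data

input_spec = {"image": ((1, 3, 224, 224), "float32")}
calibration_data = get_calibration_data(input_spec, "imagenette", 100)
# Keys match the model's input names, so the result can be passed directly to
# an AI Hub quantize job as its calibration dataset.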
+ quantization_flags += " --quantize_io_type uint8" + return ( + super().get_hub_compile_options( # type: ignore + target_runtime, other_compile_options, device + ) + + quantization_flags + ) + + def get_quantize_options(self) -> str: + return "" + + @staticmethod + def get_weights_dtype() -> QuantizeDtype: + return QuantizeDtype.INT8 + + @staticmethod + def get_activations_dtype() -> QuantizeDtype: + return QuantizeDtype.INT8 diff --git a/qai_hub_models/utils/scorecard/common.py b/qai_hub_models/utils/scorecard/common.py index 13028351..06eb02de 100644 --- a/qai_hub_models/utils/scorecard/common.py +++ b/qai_hub_models/utils/scorecard/common.py @@ -3,6 +3,7 @@ # SPDX-License-Identifier: BSD-3-Clause # --------------------------------------------------------------------- import os +import re from enum import Enum from functools import cached_property from typing import Dict, List, Optional, Tuple @@ -25,9 +26,7 @@ def _get_cached_device(device_name: str) -> hub.Device: def scorecard_unit_test_idfn(val): """Name of unit test parameters used in tests created in test_generated.py""" - if val == ScorecardDevices.any: - return "device_agnostic" - elif isinstance(val, ScorecardDevice): + if isinstance(val, ScorecardDevice): return val.name @@ -35,6 +34,12 @@ class ScorecardDevice: # -- DEVICE REGISTRY -- _registry: Dict[str, "ScorecardDevice"] = {} + @classmethod + def all_devices(cls, only_enabled: bool = False) -> List["ScorecardDevice"]: + if only_enabled: + return cls.all_enabled() + return list(cls._registry.values()) + @classmethod def all_enabled(cls) -> List["ScorecardDevice"]: return [x for x in cls._registry.values() if x.enabled] @@ -46,12 +51,21 @@ def register( reference_device_name: Optional[str], execution_device_name: Optional[str] = None, disabled_models: List[str] = [], + duplicate_of: Optional["ScorecardDevice"] = None, + compile_paths: Optional[List["ScorecardCompilePath"]] = None, + profile_paths: Optional[List["ScorecardProfilePath"]] = None, ) -> "ScorecardDevice": if name in cls._registry: raise ValueError("Device " + name + "already registered.") device = ScorecardDevice( - name, reference_device_name, execution_device_name, disabled_models + name, + reference_device_name, + execution_device_name, + disabled_models, + duplicate_of, + compile_paths, + profile_paths, ) cls._registry[name] = device return device @@ -67,6 +81,9 @@ def __init__( reference_device_name: Optional[str], execution_device_name: Optional[str] = None, disabled_models: List[str] = [], + duplicate_of: Optional["ScorecardDevice"] = None, + compile_paths: Optional[List["ScorecardCompilePath"]] = None, + profile_paths: Optional[List["ScorecardProfilePath"]] = None, ): """ Parameters @@ -80,11 +97,27 @@ def __init__( disabled_models: AI Hub Model IDs that are not supported by this device. These models will be ignored by the scorecard in combination with this device. + + duplicate_of: If set, this device will act as a duplicate of the given scorecard device. In effect this means: + * Jobs will not be submitted targeting this chipset. + * Jobs for the "given" scorecard device will be used to create performance metrics for this device. + + NOTE: Just because this chip is marked as having duplicate AI/ML performance compared to another chip, + does not mean this chip is indistinguishable from that other chip. The chips will + differ by other important features, but these are not relevant for this AI/ML scorecard. + + compile_paths: The set of compile paths valid for this device. 
If unset, will use the default set of paths for this device type. + + profile_paths: The set of profile paths valid for this device. If unset, will use the default set of paths for this device type. + """ self.name = name self.disabled_models = disabled_models self.reference_device_name = reference_device_name self.execution_device_name = execution_device_name + self.duplicate_of = duplicate_of + self._compile_paths = compile_paths + self._profile_paths = profile_paths def __str__(self): return self.name.lower() @@ -117,7 +150,7 @@ def enabled(self) -> bool: valid_test_devices = os.environ.get("WHITELISTED_PROFILE_TEST_DEVICES", "ALL") return ( valid_test_devices == "ALL" - or self.name == "all" + or self.name == "any" or self.name in valid_test_devices.split(",") ) @@ -168,48 +201,94 @@ def os(self) -> str: for attr in self.reference_device.attributes: if attr.startswith("os:"): return attr[3:] - raise ValueError(f"OS Not found for device: {self.name}") + raise ValueError(f"OS not found for device: {self.name}") + @cached_property + def form_factor(self) -> str: + """ + The chipset form_factor (eg. Auto, IoT, Mobile, ...) + """ + for attr in self.reference_device.attributes: + if attr.startswith("format:"): + return attr[7:] + raise ValueError(f"Format not found for device: {self.name}") -class ScorecardDevices: - any = ScorecardDevice.register( - "any", "Samsung Galaxy S23" - ) # no specific device (usable only during compilation) - cs_8_gen_2 = ScorecardDevice.register("cs_8_gen_2", "Samsung Galaxy S23") - cs_8_gen_3 = ScorecardDevice.register( - "cs_8_gen_3", "Samsung Galaxy S24", "Samsung Galaxy S24 (Family)" - ) - cs_6490 = ScorecardDevice.register( - "cs_6490", - "RB3 Gen 2 (Proxy)", - None, - [ - "ConvNext-Tiny-w8a8-Quantized", - "ConvNext-Tiny-w8a16-Quantized", - "ResNet50Quantized", - "RegNetQuantized", - "HRNetPoseQuantized", - "SESR-M5-Quantized", - "Midas-V2-Quantized", - "Posenet-Mobilenet-Quantized", - ], - ) - cs_8250 = ScorecardDevice.register("cs_8250", "RB5 (Proxy)") - cs_8550 = ScorecardDevice.register("cs_8550", "QCS8550 (Proxy)") - cs_x_elite = ScorecardDevice.register("cs_x_elite", "Snapdragon X Elite CRD") - cs_auto_lemans_8255 = ScorecardDevice.register( - "cs_auto_lemans_8255", "SA8255 (Proxy)" - ) - cs_auto_lemans_8775 = ScorecardDevice.register( - "cs_auto_lemans_8775", "SA8775 (Proxy)" - ) - cs_auto_lemans_8650 = ScorecardDevice.register( - "cs_auto_lemans_8650", "SA8650 (Proxy)" - ) - cs_xr_8450 = ScorecardDevice.register("cs_xr_8450", "QCS8450 (Proxy)") - cs_auto_makena_8295 = ScorecardDevice.register( - "cs_auto_makena_8295", "Snapdragon Cockpit Gen 4 QAM" - ) + @cached_property + def hexagon_version(self) -> int: + """ + The chipset hexagon version number + """ + for attr in self.reference_device.attributes: + if attr.startswith("hexagon:v"): + return int(attr[9:]) + raise ValueError(f"Hexagon version not found for device: {self.name}") + + @property + def supports_fp16_inference(self) -> bool: + return self.hexagon_version >= 69 + + @cached_property + def supported_runtimes(self) -> List[TargetRuntime]: + runtimes = [] + for attr in self.reference_device.attributes: + if attr.startswith("framework:"): + rt_name = attr[10:].upper() + try: + runtimes.append(TargetRuntime[rt_name.upper()]) + except KeyError: + print( + f"WARNING: Unable to determine supported runtime associated with framework {rt_name}" + ) + return runtimes + + @cached_property + def profile_paths(self) -> List["ScorecardProfilePath"]: + if self._profile_paths: + return self._profile_paths + 
if self.duplicate_of: + return self.duplicate_of.profile_paths + + if self.form_factor == "phone": + paths = [ + ScorecardProfilePath.ONNX, + ScorecardProfilePath.QNN, + ScorecardProfilePath.TFLITE, + ] + elif self.form_factor == "auto": + paths = [ + ScorecardProfilePath.QNN, + ScorecardProfilePath.TFLITE, + ] + elif self.form_factor == "xr": + paths = [ScorecardProfilePath.QNN, ScorecardProfilePath.TFLITE] + elif self.form_factor == "compute": + paths = [ + ScorecardProfilePath.ONNX, + ScorecardProfilePath.ONNX_DML_GPU, + ScorecardProfilePath.QNN, + ] + elif self.form_factor == "iot": + paths = [ScorecardProfilePath.TFLITE, ScorecardProfilePath.QNN] + else: + raise NotImplementedError( + f"Unsupported device form_factor: {self.form_factor}" + ) + + return [path for path in paths if path.get_runtime() in self.supported_runtimes] + + @cached_property + def compile_paths(self) -> List["ScorecardCompilePath"]: + if self._compile_paths: + return self._compile_paths + if self.duplicate_of: + return self.duplicate_of.compile_paths + + if ScorecardProfilePath.QNN in self.profile_paths: + paths = [ScorecardCompilePath.QNN] + else: + paths = [] + + return [path for path in paths if path.get_runtime() in self.supported_runtimes] def get_job_cache_name( @@ -253,13 +332,15 @@ def all_enabled() -> List["ScorecardCompilePath"]: @staticmethod def get_parameterized_test_config( - aimet_model=False, + is_quantized=False, only_enabled_paths=True, only_enabled_devices=True, ) -> List[Tuple["ScorecardCompilePath", ScorecardDevice]]: path_list: List[ScorecardCompilePath] = ScorecardCompilePath.all_enabled() if only_enabled_paths else ScorecardCompilePath # type: ignore path_devices_dict = { - sc_path: sc_path.get_test_devices(aimet_model, only_enabled_devices) + sc_path: sc_path.get_test_devices( + is_quantized, only_enabled_devices, include_any=True + ) for sc_path in path_list } return [ @@ -276,51 +357,40 @@ def get_runtime(self) -> TargetRuntime: raise NotImplementedError() def get_test_devices( - self, aimet_model: bool = False, only_enabled: bool = True + self, + is_quantized: bool = False, + only_enabled: bool = True, + include_duplicate_devices: bool = False, + include_any: bool = False, ) -> List[ScorecardDevice]: - if self == ScorecardCompilePath.QNN: - devices = [ - ScorecardDevices.any, - ScorecardDevices.cs_x_elite, - ScorecardDevices.cs_8550, - ScorecardDevices.cs_auto_lemans_8255, - ScorecardDevices.cs_auto_lemans_8775, - ScorecardDevices.cs_auto_makena_8295, - ] - if aimet_model: - devices.append(ScorecardDevices.cs_6490) - else: - devices = [ScorecardDevices.any] - - try: - from qai_hub_models.utils.scorecard._common_private import ( - get_private_compile_path_test_devices, + return [ + device + for device in ScorecardDevice.all_devices(only_enabled) + if ( + (is_quantized or device.supports_fp16_inference) + and (include_duplicate_devices or not device.duplicate_of) + and (include_any or device != ScorecardDevices.any) + and self in device.compile_paths ) + ] - devices.extend(get_private_compile_path_test_devices(self, aimet_model)) # type: ignore - except ImportError: - pass - - return [x for x in devices if x.enabled] if only_enabled else devices - - def get_compile_options(self, aimet_model=False) -> str: - if self == ScorecardCompilePath.ONNX_FP16 and not aimet_model: + def get_compile_options(self, is_quantized=False) -> str: + if self == ScorecardCompilePath.ONNX_FP16 and not is_quantized: return "--quantize_full_type float16 --quantize_io" return "" def get_job_cache_name( self, 
model: str, - device: ScorecardDevice = ScorecardDevices.any, - aimet_model: bool = False, + device: Optional[ScorecardDevice] = None, + is_quantized: bool = False, component: Optional[str] = None, ): - # These two auto chips are the same, re-use the same compiled asset. - if device == ScorecardDevices.cs_auto_lemans_8650: - device = ScorecardDevices.cs_auto_lemans_8775 - if device not in self.get_test_devices(aimet_model=aimet_model): + if not device or self not in device.compile_paths: device = ScorecardDevices.any # default to the "generic" compilation path - return get_job_cache_name(self.name, model, device, component) + return get_job_cache_name( + self.name, model, device.duplicate_of or device, component + ) class ScorecardProfilePath(Enum): @@ -358,13 +428,15 @@ def include_in_perf_yaml(self) -> bool: @staticmethod def get_parameterized_test_config( - aimet_model=False, + is_quantized=False, only_enabled_paths=True, only_enabled_devices=True, ) -> List[Tuple["ScorecardProfilePath", ScorecardDevice]]: path_list: List[ScorecardProfilePath] = ScorecardProfilePath.all_enabled() if only_enabled_paths else ScorecardProfilePath # type: ignore path_devices_dict = { - sc_path: sc_path.get_test_devices(aimet_model, only_enabled_devices) + sc_path: sc_path.get_test_devices( + is_quantized, only_enabled_devices, include_any=True + ) for sc_path in path_list } return [ @@ -397,56 +469,22 @@ def get_profile_options(self) -> str: return "" def get_test_devices( - self, aimet_model: bool = False, only_enabled: bool = True + self, + is_quantized: bool = False, + only_enabled: bool = True, + include_duplicate_devices: bool = False, + include_any: bool = False, ) -> List[ScorecardDevice]: - if self == ScorecardProfilePath.TFLITE: - devices = [ - ScorecardDevices.cs_8_gen_2, - ScorecardDevices.cs_8_gen_3, - ScorecardDevices.cs_8550, - ScorecardDevices.cs_xr_8450, - ScorecardDevices.cs_auto_lemans_8650, - ScorecardDevices.cs_auto_lemans_8775, - ScorecardDevices.cs_auto_lemans_8255, - ScorecardDevices.cs_auto_makena_8295, - ] + ( - [ScorecardDevices.cs_6490, ScorecardDevices.cs_8250] - if aimet_model - else [] - ) - elif self == ScorecardProfilePath.ONNX: - devices = [ - ScorecardDevices.cs_8_gen_2, - ScorecardDevices.cs_8_gen_3, - ScorecardDevices.cs_x_elite, - ] - elif self == ScorecardProfilePath.QNN: - devices = [ - ScorecardDevices.cs_8_gen_2, - ScorecardDevices.cs_8_gen_3, - ScorecardDevices.cs_x_elite, - ScorecardDevices.cs_8550, - ScorecardDevices.cs_auto_lemans_8650, - ScorecardDevices.cs_auto_lemans_8775, - ScorecardDevices.cs_auto_lemans_8255, - ScorecardDevices.cs_auto_makena_8295, - ScorecardDevices.cs_xr_8450, - ] + ([ScorecardDevices.cs_6490] if aimet_model else []) - elif self == ScorecardProfilePath.ONNX_DML_GPU: - devices = [ScorecardDevices.cs_x_elite] - else: - raise NotImplementedError() - - try: - from qai_hub_models.utils.scorecard._common_private import ( - get_private_profile_path_test_devices, + return [ + device + for device in ScorecardDevice.all_devices(only_enabled) + if ( + (is_quantized or device.supports_fp16_inference) + and (include_duplicate_devices or not device.duplicate_of) + and self in device.profile_paths + and (include_any or device != ScorecardDevices.any) ) - - devices.extend(get_private_profile_path_test_devices(self, aimet_model)) # type: ignore - except ImportError: - pass - - return [x for x in devices if x.enabled] if only_enabled else devices + ] def get_job_cache_name( self, @@ -454,7 +492,9 @@ def get_job_cache_name( device: ScorecardDevice, 
component: Optional[str] = None, ): - return get_job_cache_name(self.name, model, device, component) + return get_job_cache_name( + self.name, model, device.duplicate_of or device, component + ) def supported_chipsets(chips: List[str]) -> List[str]: @@ -466,7 +506,18 @@ def supported_chipsets(chips: List[str]) -> List[str]: """ chipset_set = set(chips) chipset_list = [] - if "qualcomm-snapdragon-8gen3" in chipset_set: + + if "qualcomm-snapdragon-8-elite" in chipset_set: + chipset_list.extend( + [ + "qualcomm-snapdragon-8-elite", + "qualcomm-snapdragon-8gen3", + "qualcomm-snapdragon-8gen2", + "qualcomm-snapdragon-8gen1", + "qualcomm-snapdragon-888", + ] + ) + elif "qualcomm-snapdragon-8gen3" in chipset_set: chipset_list.extend( [ "qualcomm-snapdragon-8gen3", @@ -485,10 +536,10 @@ def supported_chipsets(chips: List[str]) -> List[str]: ) if "qualcomm-snapdragon-x-elite" in chipset_set: + chipset_list.extend(["qualcomm-snapdragon-x-elite"]) chipset_list.extend(["qualcomm-snapdragon-x-plus-8-core"]) chipset_order = [ - "qualcomm-snapdragon-x-elite", "qualcomm-qcs6490", "qualcomm-qcs8250", "qualcomm-qcs8550", @@ -502,32 +553,34 @@ def supported_chipsets(chips: List[str]) -> List[str]: chipset_list.append(chipset) # Add any remaining chipsets not covered - for chipset in chipset_set: + for chipset in sorted(chipset_set): if chipset not in chipset_list: chipset_list.append(chipset) return chipset_list def chipset_marketing_name(chipset) -> str: - """Sanitize chip name to match marketting.""" - chip = [word.capitalize() for word in chipset.split("-")] - details_to_remove = [] - for i in range(len(chip)): - if chip[i] == "8gen3": - chip[i] = "8 Gen 3" - if chip[i] == "8gen2": - chip[i] = "8 Gen 2" - elif chip[i] == "8gen1": - chip[i] = "8 Gen 1" - elif chip[i] == "Snapdragon": - # Marketing name for Qualcomm Snapdragon is Snapdragon® - chip[i] = "Snapdragon®" - elif chip[i] == "Qualcomm": - details_to_remove.append(chip[i]) - - for detail in details_to_remove: - chip.remove(detail) - return " ".join(chip) + """Sanitize chip name to match marketing.""" + chip = " ".join([word.capitalize() for word in chipset.split("-")]) + chip = chip.replace("Qualcomm ", "") + chip = chip.replace( + "Snapdragon", "Snapdragon®" + ) # Marketing name for Qualcomm Snapdragon is Snapdragon® + + # 8cxgen2 -> 8cx Gen 2 + # 8gen2 -> 8 Gen 2 + chip = re.sub(r"(\w+)gen(\d+)", r"\g<1> Gen \g<2>", chip) + + # 8 Core -> 8-Core + chip = re.sub(r"(\d+) Core", r"\g<1>-Core", chip) + + # qcs6490 -> QCS6490 + # sa8775p -> SA8775P + chip = re.sub( + r"(Qcs|Sa)(\w+)", lambda m: f"{m.group(1).upper()}{m.group(2).upper()}", chip + ) + + return chip def supported_chipsets_santized(chips) -> List[str]: @@ -556,16 +609,6 @@ def get_supported_devices(chips) -> List[str]: supported_devices_for_chip = sorted(set(supported_devices_for_chip)) __CHIP_SUPPORTED_DEVICES_CACHE[chip] = supported_devices_for_chip supported_devices.extend(supported_devices_for_chip) - supported_devices.extend( - [ - "Google Pixel 5a 5G", - "Google Pixel 4", - "Google Pixel 4a", - "Google Pixel 3", - "Google Pixel 3a", - "Google Pixel 3a XL", - ] - ) return supported_devices @@ -574,6 +617,82 @@ def supported_oses() -> List[str]: return ["Android"] +class ScorecardDevices: + any = ScorecardDevice.register( + name="any", + reference_device_name="Samsung Galaxy S23", + compile_paths=[path for path in ScorecardCompilePath], + profile_paths=[], + ) # no specific device (usable only during compilation) + + ### + # cs == chipset + ### + cs_8_gen_2 = ScorecardDevice.register( 
+ name="cs_8_gen_2", + reference_device_name="Samsung Galaxy S23", + compile_paths=[], # Uses "any" in all cases + ) + + cs_8_gen_3 = ScorecardDevice.register( + name="cs_8_gen_3", + reference_device_name="Samsung Galaxy S24", + execution_device_name="Samsung Galaxy S24 (Family)", + compile_paths=[], # Uses "any" in all cases + ) + + cs_6490 = ScorecardDevice.register( + name="cs_6490", + reference_device_name="RB3 Gen 2 (Proxy)", + disabled_models=[ + "ConvNext-Tiny-w8a8-Quantized", + "ConvNext-Tiny-w8a16-Quantized", + "ResNet50Quantized", + "RegNetQuantized", + "HRNetPoseQuantized", + "SESR-M5-Quantized", + "Midas-V2-Quantized", + "Posenet-Mobilenet-Quantized", + ], + ) + + cs_8250 = ScorecardDevice.register( + name="cs_8250", + reference_device_name="RB5 (Proxy)", + ) + + cs_8550 = ScorecardDevice.register( + name="cs_8550", reference_device_name="QCS8550 (Proxy)" + ) + + cs_x_elite = ScorecardDevice.register( + name="cs_x_elite", reference_device_name="Snapdragon X Elite CRD" + ) + + cs_auto_lemans_8255 = ScorecardDevice.register( + name="cs_auto_lemans_8255", + reference_device_name="SA8255 (Proxy)", + ) + + cs_auto_lemans_8775 = ScorecardDevice.register( + name="cs_auto_lemans_8775", + reference_device_name="SA8775 (Proxy)", + ) + + cs_auto_lemans_8650 = ScorecardDevice.register( + name="cs_auto_lemans_8650", + reference_device_name="SA8650 (Proxy)", + ) + + cs_xr_8450 = ScorecardDevice.register( + name="cs_xr_8450", reference_device_name="QCS8450 (Proxy)" + ) + + cs_8_elite = ScorecardDevice.register( + name="cs_8_elite", reference_device_name="Snapdragon 8 Elite QRD" + ) + + try: # Register private devices # This must live at the end of this file to avoid circular import problems. diff --git a/qai_hub_models/utils/scorecard/job_summary.py b/qai_hub_models/utils/scorecard/job_summary.py index b4a2f305..5e275c91 100644 --- a/qai_hub_models/utils/scorecard/job_summary.py +++ b/qai_hub_models/utils/scorecard/job_summary.py @@ -149,12 +149,12 @@ def from_model_id( path: ScorecardCompilePath for path in ScorecardCompilePath.all_enabled(): for component in components: - path_devices_enabled = [ - x - for x in path.get_test_devices(model_code_gen.is_aimet) - if x.enabled - ] - for device in path_devices_enabled: + for device in path.get_test_devices( + model_code_gen.is_aimet or model_code_gen.use_hub_quantization, + only_enabled=True, + include_duplicate_devices=True, + include_any=True, + ): model_runs.append( cls( model_id=component or model_info.name, @@ -222,12 +222,11 @@ def from_model_id( path: ScorecardProfilePath for path in ScorecardProfilePath.all_enabled(): for component in components: - path_devices_enabled = [ - x - for x in path.get_test_devices(model_code_gen.is_aimet) - if x.enabled - ] - for device in path_devices_enabled: + for device in path.get_test_devices( + model_code_gen.is_aimet or model_code_gen.use_hub_quantization, + only_enabled=True, + include_duplicate_devices=True, + ): model_runs.append( cls( model_id=component or model_info.name, @@ -251,7 +250,7 @@ def __post_init__(self): super().__post_init__() if not self.skipped: assert isinstance(self.job, hub.ProfileJob) - if self._job_status.success: + if self._job_status.success: # type: ignore assert self.profile_results @cached_property @@ -386,7 +385,7 @@ def evaluation_metrics(self) -> Union[Dict[str, Any], str]: @cached_property def performance_metrics(self) -> Dict[str, Any]: - return dict( + metrics = dict( inference_time=self.inference_time, throughput=self.throughput, 
estimated_peak_memory_range=self.peak_memory_range, @@ -398,8 +397,11 @@ def performance_metrics(self) -> Dict[str, Any]: layers_on_cpu=self.cpu, total_layers=self.total, ), - llm_metrics=self.llm_metrics, - evaluation_metrics=self.evaluation_metrics, job_id=self.job_id, job_status=self.job_status, ) + if self.llm_metrics != "null": + metrics["llm_metrics"] = self.llm_metrics + if self.evaluation_metrics != "null": + metrics["evaluation_metrics"] = self.evaluation_metrics + return metrics diff --git a/qai_hub_models/utils/scorecard/model_card.py b/qai_hub_models/utils/scorecard/model_card.py index b89b7cfc..00209a75 100644 --- a/qai_hub_models/utils/scorecard/model_card.py +++ b/qai_hub_models/utils/scorecard/model_card.py @@ -220,7 +220,7 @@ def from_runs(model_runs: List[ProfileJobSummary]): } ) - def get_chipsets(self) -> Set[str]: + def get_chipsets(self, include_internal_devices: bool = False) -> Set[str]: chips: Set[str] = set() for model_id, model_summary in self.runs_per_model.items(): for device, device_summary in model_summary.runs_per_device.items(): @@ -237,6 +237,10 @@ def get_chipsets(self) -> Set[str]: if model_id in device.disabled_models: continue + # Don't include private devices + if not include_internal_devices and not device.public: + continue + chips.add(device.chipset) return chips @@ -248,7 +252,7 @@ def get_perf_card( ) -> Dict[str, str | List[Any] | Dict[str, Any]]: perf_card: Dict[str, str | List[Any] | Dict[str, Any]] = {} - chips = self.get_chipsets() + chips = self.get_chipsets(include_internal_devices) perf_card["aggregated"] = dict( supported_oses=supported_oses(), supported_devices=get_supported_devices(chips), diff --git a/qai_hub_models/utils/scorecard/perf_summary.py b/qai_hub_models/utils/scorecard/perf_summary.py index abcf4348..975536f7 100644 --- a/qai_hub_models/utils/scorecard/perf_summary.py +++ b/qai_hub_models/utils/scorecard/perf_summary.py @@ -116,10 +116,9 @@ def update_summary(self, model_id: str, previous_report, new_report): prev_inference_time = prev_inference_time.get( "inference_time", "null" ) - new_inference_time = new_perf_metrics[device].get(runtime_type, {}) - new_inference_time = new_inference_time.get( - "inference_time", "null" - ) + run_stats = new_perf_metrics[device].get(runtime_type, {}) + job_id = run_stats.get("job_id", "null") + new_inference_time = run_stats.get("inference_time", "null") if new_inference_time == prev_inference_time: continue @@ -131,6 +130,7 @@ def update_summary(self, model_id: str, previous_report, new_report): "inf", self._format_speedup(new_inference_time), self._format_speedup(prev_inference_time), + job_id, device_info["chipset"], device_info["os"], ) @@ -161,6 +161,7 @@ def update_summary(self, model_id: str, previous_report, new_report): self._format_speedup(speedup), self._format_speedup(new_inference_time), self._format_speedup(prev_inference_time), + job_id, device_info["chipset"], device_info["os"], ) @@ -183,6 +184,7 @@ def _get_summary_table(self, bucket_id, get_progressions=True): "Kx faster" if get_progressions else "Kx slower", "New Inference time", "Prev Inference time", + "Job ID", "Chipset", "OS", ] diff --git a/qai_hub_models/utils/testing.py b/qai_hub_models/utils/testing.py index 998d9594..cd72a77c 100644 --- a/qai_hub_models/utils/testing.py +++ b/qai_hub_models/utils/testing.py @@ -4,6 +4,9 @@ # --------------------------------------------------------------------- from __future__ import annotations +from typing import Callable +from unittest import mock + import numpy as np 
import pytest @@ -95,3 +98,20 @@ def assert_most_close( assert ( np.mean(not_close_values) <= diff_tol ), f"More than {diff_tol * 100}% of values were not close." + + +def mock_first_n(fn: Callable, n: int): + """ + Return a function that returns a Mock object for the first N calls + and calls the given fn on all subsequent calls. + """ + call_count = 0 + + def mock_fn(*args, **kwargs): + nonlocal call_count + call_count += 1 + if call_count <= n: + return mock.Mock() + return fn(*args, **kwargs) + + return mock_fn diff --git a/scripts/build_and_test.py b/scripts/build_and_test.py index 9fc4beb5..2d8eaece 100755 --- a/scripts/build_and_test.py +++ b/scripts/build_and_test.py @@ -14,12 +14,7 @@ from tasks.changes import ( REPRESENTATIVE_EXPORT_MODELS, get_all_models, - get_changed_models, - get_code_gen_changed_models, - get_models_to_run_general_tests, - get_models_to_test_export, - get_models_with_changed_definitions, - get_models_with_export_file_changes, + get_models_to_test, ) from tasks.constants import VENV_PATH from tasks.plan import ( @@ -258,37 +253,55 @@ def test_scripts(self, plan: Plan, step_id: str = "test_scripts") -> str: PyTestScriptsTask(self.venv_path), ) - @public_task( - "Run most tests for only added/modified models in Model Zoo. Includes most tests, uses shared global cache, and uses the same environment for each model." - ) + def _get_quantize_models_task(self, models) -> PyTestModelsTask: + return PyTestModelsTask( + self.python_executable, + models, + models, + self.venv_path, + venv_for_each_model=False, + use_shared_cache=True, + test_trace=False, + run_export_quantize=True, + run_export_compile=False, + skip_standard_unit_test=True, + ) + + @public_task("Quantize changed models in preparation for testing all of them.") @depends(["install_deps"]) - def test_changed_models( - self, plan: Plan, step_id: str = "test_changed_models" + def quantize_changed_models( + self, plan: Plan, step_id: str = "quantize_changed_models" ) -> str: - # model.py changed - model_changed_models = get_models_with_changed_definitions() - - # export.py or test_generated.py changed - export_changed_models = get_models_with_export_file_changes() - - # code-gen.yaml changed - code_gen_changed_models = get_code_gen_changed_models() - - # If model or code-gen changed, then test export. - models_to_test_export = model_changed_models | code_gen_changed_models - - # For all other models where export.py or test_generated.py changed, - # only test if they're part of REPRESENTATIVE_EXPORT_MODELS - models_to_test_export.update( - export_changed_models & set(REPRESENTATIVE_EXPORT_MODELS) + _, models_to_test_export = get_models_to_test() + return plan.add_step( + step_id, self._get_quantize_models_task(models_to_test_export) ) - # Set of models where model.py, demo.py, or test.py changed. 
-        models_to_run_tests = get_models_to_run_general_tests()
+    @public_task("Quantize representative models in preparation for testing them.")
+    @depends(["install_deps"])
+    def quantize_representative_models(
+        self, plan: Plan, step_id: str = "quantize_representative_models"
+    ) -> str:
+        return plan.add_step(
+            step_id, self._get_quantize_models_task(REPRESENTATIVE_EXPORT_MODELS)
+        )
 
-        # export tests can only run alongside general model tests
-        models_to_run_tests = models_to_run_tests | models_to_test_export
+    @public_task("Quantize all models in preparation for testing them.")
+    @depends(["install_deps"])
+    def quantize_all_models(
+        self, plan: Plan, step_id: str = "quantize_all_models"
+    ) -> str:
+        all_models = get_all_models()
+        return plan.add_step(step_id, self._get_quantize_models_task(all_models))
 
+    @public_task(
+        "Run most tests for only added/modified models in Model Zoo. Includes most tests, uses shared global cache, and uses the same environment for each model."
+    )
+    @depends(["install_deps", "quantize_changed_models"])
+    def test_changed_models(
+        self, plan: Plan, step_id: str = "test_changed_models"
+    ) -> str:
+        models_to_run_tests, models_to_test_export = get_models_to_test()
         return plan.add_step(
             step_id,
             PyTestModelsTask(
@@ -305,17 +318,17 @@ def test_changed_models(
     @public_task(
         "Run all tests for only added/modified models in Model Zoo. Includes all tests, and creates a fresh environment for each model."
     )
-    @depends(["install_deps"])
+    @depends(["install_deps", "quantize_changed_models"])
     def test_changed_models_long(
         self, plan: Plan, step_id: str = "test_changed_models_long"
    ) -> str:
-        default_test_models = REPRESENTATIVE_EXPORT_MODELS
+        models_to_run_tests, models_to_test_export = get_models_to_test()
         return plan.add_step(
             step_id,
             PyTestModelsTask(
                 self.python_executable,
-                get_changed_models() or default_test_models,
-                get_models_to_test_export() or default_test_models,
+                models_to_run_tests,
+                models_to_test_export,
                 self.venv_path,
                 venv_for_each_model=True,
                 use_shared_cache=False,
@@ -323,7 +336,7 @@ def test_changed_models_long(
         )
 
     @public_task("Run tests for all models in Model Zoo.")
-    @depends(["install_deps"])
+    @depends(["install_deps", "quantize_representative_models"])
     def test_all_models(self, plan: Plan, step_id: str = "test_all_models") -> str:
         # Excludes export tests, and uses the same environment for each model.
         all_models = get_all_models()
@@ -353,53 +366,46 @@ def create_perfs(self, plan: Plan, step_id: str = "generate_perfs") -> str:
             ),
         )
 
+    def _make_hub_scorecard_task(
+        self, compile: bool = False, profile: bool = False, quantize: bool = False
+    ) -> PyTestModelsTask:
+        all_models = get_all_models()
+        return PyTestModelsTask(
+            self.python_executable,
+            all_models,
+            all_models,
+            self.venv_path,
+            venv_for_each_model=False,
+            use_shared_cache=True,
+            run_export_compile=compile,
+            run_export_profile=profile,
+            run_export_quantize=quantize,
+            # If one model fails, we should still try the others.
+ exit_after_single_model_failure=False, + skip_standard_unit_test=True, + test_trace=False, + ) + @public_task("Run Compile jobs for all models in Model Zoo.") @depends(["install_deps"]) def test_compile_all_models( self, plan: Plan, step_id: str = "test_compile_all_models" ) -> str: - all_models = get_all_models() - return plan.add_step( - step_id, - PyTestModelsTask( - self.python_executable, - all_models, - all_models, - self.venv_path, - venv_for_each_model=False, - use_shared_cache=True, - run_export_compile=True, - run_export_profile=False, - # If one model fails to export, we should still try the others. - exit_after_single_model_failure=False, - skip_standard_unit_test=True, - test_trace=False, - ), - ) + return plan.add_step(step_id, self._make_hub_scorecard_task(compile=True)) @public_task("Run profile jobs for all models in Model Zoo.") @depends(["install_deps"]) def test_profile_all_models( self, plan: Plan, step_id: str = "test_profile_all_models" ) -> str: - all_models = get_all_models() - return plan.add_step( - step_id, - PyTestModelsTask( - self.python_executable, - all_models, - all_models, - self.venv_path, - venv_for_each_model=False, - use_shared_cache=True, - run_export_compile=False, - run_export_profile=True, - skip_standard_unit_test=True, - # "Profile" tests fail only if there is something fundamentally wrong with the code, not if a single profile job fails. - exit_after_single_model_failure=False, - test_trace=False, - ), - ) + return plan.add_step(step_id, self._make_hub_scorecard_task(profile=True)) + + @public_task("Run quantize jobs for all models in Model Zoo.") + @depends(["install_deps"]) + def test_quantize_all_models( + self, plan: Plan, step_id: str = "test_quantize_all_models" + ) -> str: + return plan.add_step(step_id, self._make_hub_scorecard_task(quantize=True)) @public_task("Verify all export scripts work e2e.") @depends(["install_deps"]) @@ -427,7 +433,7 @@ def test_all_export_scripts( ) @public_task("Run tests for all models in Model Zoo.") - @depends(["install_deps"]) + @depends(["install_deps", "quantize_all_models"]) def test_all_models_long( self, plan: Plan, step_id: str = "test_all_models_long" ) -> str: diff --git a/scripts/tasks/changes.py b/scripts/tasks/changes.py index a27a856f..4b438cf7 100644 --- a/scripts/tasks/changes.py +++ b/scripts/tasks/changes.py @@ -2,9 +2,10 @@ # Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. # SPDX-License-Identifier: BSD-3-Clause # --------------------------------------------------------------------- +import functools import os from pathlib import Path -from typing import Iterable, Optional, Set +from typing import Iterable, Optional, Set, Tuple from .constants import ( PY_PACKAGE_MODELS_ROOT, @@ -89,6 +90,32 @@ def _get_file_edges(filename) -> Set[str]: return dependent_files +@functools.lru_cache(maxsize=1) +def get_affected_files(changed_files: Iterable[str]) -> Set[str]: + """ + Given a list of changed python files, performs a Depth-First Search (DFS) + over the qai_hub_models directory to figure out which files were affected. + + Cached so that the graph traversal is done once, and `resolve_affected_models` + can be run with different args using the same base set of files. 
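+
+    Note: because of the lru_cache, `changed_files` must be hashable;
+    resolve_affected_models converts its list to a tuple before calling this.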
+ """ + changed_files = list(changed_files) + seen = set(changed_files) + while len(changed_files) > 0: + # Pop off stack + curr_file = changed_files.pop() + if curr_file in MANUAL_EDGES: + dependent_files = set(MANUAL_EDGES[curr_file]) + else: + dependent_files = _get_file_edges(curr_file) + # Add new nodes to stack + for dependent_file in dependent_files: + if dependent_file not in seen: + seen.add(dependent_file) + changed_files.append(dependent_file) + return seen + + def resolve_affected_models( changed_files: Iterable[str], include_model: bool = True, @@ -111,22 +138,10 @@ def resolve_affected_models( changed_files: List of filepaths to files that changed. Paths are relative to the root of this repository. """ - changed_files = list(changed_files) - seen = set(changed_files) - while len(changed_files) > 0: - # Pop off stack - curr_file = changed_files.pop() - if curr_file in MANUAL_EDGES: - dependent_files = set(MANUAL_EDGES[curr_file]) - else: - dependent_files = _get_file_edges(curr_file) - # Add new nodes to stack - for dependent_file in dependent_files: - if dependent_file not in seen: - seen.add(dependent_file) - changed_files.append(dependent_file) + # Convert to tuple so it can be used as a cache key + affected_files = get_affected_files(tuple(changed_files)) changed_models = set() - for f in seen: + for f in affected_files: file_path = Path(f) # Only consider directories directly in the top-level `models/` folder # (i.e. ignore `models/_shared`, `models/_internal`) @@ -167,6 +182,7 @@ def get_code_gen_changed_models() -> Set[str]: return set(changed_models) +@functools.lru_cache(maxsize=2) # Size 2 for `.py` and `code-gen.yaml` def get_changed_files_in_package(suffix: Optional[str] = None) -> Iterable[str]: """ Returns the list of changed files in zoo based on git tracking. @@ -277,7 +293,7 @@ def get_all_models() -> Set[str]: """ Resolve model IDs (folder names) of all models in QAIHM. """ - model_names = set() + model_names: Set[str] = set() for model_name in os.listdir(PY_PACKAGE_MODELS_ROOT): if os.path.exists(os.path.join(PY_PACKAGE_MODELS_ROOT, model_name, "model.py")): model_names.add(model_name) @@ -289,6 +305,40 @@ def get_all_models() -> Set[str]: for model in allowed_models: if model not in model_names: raise ValueError(f"Unknown model selected: {model}") - model_names = allowed_models + model_names = set(allowed_models) return model_names + + +def get_models_to_test() -> Tuple[Set[str], Set[str]]: + """ + This is the master function that is called directly in CI to determine + which models to test. + + Returns: + Tuple[list of models to run unit tests, list of models to run compile tests] + """ + # model.py changed + model_changed_models = get_models_with_changed_definitions() + + # export.py or test_generated.py changed + export_changed_models = get_models_with_export_file_changes() + + # code-gen.yaml changed + code_gen_changed_models = get_code_gen_changed_models() + + # If model or code-gen changed, then test export. + models_to_test_export = model_changed_models | code_gen_changed_models + + # For all other models where export.py or test_generated.py changed, + # only test if they're part of REPRESENTATIVE_EXPORT_MODELS + models_to_test_export.update( + export_changed_models & set(REPRESENTATIVE_EXPORT_MODELS) + ) + + # Set of models where model.py, demo.py, or test.py changed. 
+ models_to_run_tests = get_models_to_run_general_tests() + + # export tests can only run alongside general model tests + models_to_run_tests = models_to_run_tests | models_to_test_export + return models_to_run_tests, models_to_test_export diff --git a/scripts/tasks/constants.py b/scripts/tasks/constants.py index 62ac36aa..5e91c725 100644 --- a/scripts/tasks/constants.py +++ b/scripts/tasks/constants.py @@ -3,8 +3,29 @@ # SPDX-License-Identifier: BSD-3-Clause # --------------------------------------------------------------------- import os +import subprocess + + +def process_output(command): + return command.stdout.decode("utf-8").strip() + + +BASH_EXECUTABLE = process_output( + subprocess.run("which bash", stdout=subprocess.PIPE, shell=True, check=True) +) + + +def run_and_get_output(command, check=True): + return process_output( + subprocess.run( + command, + stdout=subprocess.PIPE, + shell=True, + check=check, + executable=BASH_EXECUTABLE, + ) + ) -from .util import run_and_get_output # Env Variable STORE_ROOT_ENV_VAR = "QAIHM_STORE_ROOT" diff --git a/scripts/tasks/test.py b/scripts/tasks/test.py index 477fb97c..1b4d6287 100644 --- a/scripts/tasks/test.py +++ b/scripts/tasks/test.py @@ -16,7 +16,12 @@ STORE_ROOT_ENV_VAR, ) from .task import CompositeTask, PyTestTask, RunCommandsTask -from .util import can_support_aimet, model_needs_aimet +from .util import ( + can_support_aimet, + check_code_gen_field, + get_is_hub_quantized, + model_needs_aimet, +) from .venv import ( CreateVenvTask, RunCommandsWithVenvTask, @@ -69,6 +74,7 @@ def __init__( run_general: bool = True, run_compile: bool = True, run_profile: bool = False, + run_quantize: bool = False, run_export: bool = False, run_trace: bool = True, install_deps: bool = True, @@ -114,12 +120,15 @@ def __init__( if run_profile: test_flags.append("profile") test_flags.append("inference") + if run_quantize: + test_flags.append("quantize") if run_trace: test_flags.append("trace") if run_export: test_flags.append("export") - if test_flags: - extras_args += ["-m", f'"{" or ".join(test_flags)}"'] + if not test_flags: + raise ValueError("Must specify which types of tests to run") + extras_args += ["-m", f'"{" or ".join(test_flags)}"'] # Create temporary directory for storing cloned & downloaded test artifacts. with TemporaryDirectory() as tmpdir: @@ -179,6 +188,7 @@ def __init__( skip_standard_unit_test: bool = False, test_trace: bool = True, run_export_compile: bool = True, + run_export_quantize: bool = False, run_export_profile: bool = False, run_full_export: bool = False, exit_after_single_model_failure=False, @@ -213,14 +223,13 @@ def __init__( ) print(f"Tests to be run for models: {models_for_testing}") - global_models = [] + global_models = set([]) if not venv_for_each_model: for model_name in models_for_testing: - yaml_path = Path(PY_PACKAGE_MODELS_ROOT) / model_name / "code-gen.yaml" - if yaml_path.exists(): - with open(yaml_path, "r") as f: - if "global_requirements_incompatible" not in f.read(): - global_models.append(model_name) + if not check_code_gen_field( + model_name, "global_requirements_incompatible" + ): + global_models.add(model_name) if len(global_models) > 0: globals_path = Path(PY_PACKAGE_SRC_ROOT) / "global_requirements.txt" @@ -238,7 +247,21 @@ def __init__( # Sort models for ease of tracking how far along the tests are. # Do reverse order because whisper is slow to compile, so trigger earlier. 
- for model_name in sorted(models_for_testing, reverse=True): + export_models = models_to_test_export + hub_quantized_models = [] + nonhub_quantized_models = [] + for model in sorted(models_for_testing, reverse=True): + if get_is_hub_quantized(model) and model in export_models: + hub_quantized_models.append(model) + else: + nonhub_quantized_models.append(model) + + if run_export_quantize: + models_to_run = hub_quantized_models + else: + # Run hub quantized models last to give quantize job time to complete + models_to_run = nonhub_quantized_models + hub_quantized_models + for model_name in models_to_run: # Run standard test suite for this model. is_global_model = model_name in global_models tasks.append( @@ -250,12 +273,12 @@ def __init__( install_deps=not is_global_model, run_trace=test_trace, run_general=not skip_standard_unit_test, - run_compile=run_export_compile - and model_name in models_to_test_export, - run_profile=run_export_profile - and model_name in models_to_test_export, - run_export=run_full_export and model_name in models_to_test_export, - raise_on_failure=False, # Do not raise on failure; let PyTestModelsTask::run_tasks handle this + run_compile=run_export_compile and model_name in export_models, + run_profile=run_export_profile and model_name in export_models, + run_quantize=run_export_quantize and model_name in export_models, + run_export=run_full_export and model_name in export_models, + # Do not raise on failure; let PyTestModelsTask::run_tasks handle this + raise_on_failure=False, ) ) diff --git a/scripts/tasks/util.py b/scripts/tasks/util.py index 9625ff3b..1f5f1f77 100644 --- a/scripts/tasks/util.py +++ b/scripts/tasks/util.py @@ -3,10 +3,19 @@ # SPDX-License-Identifier: BSD-3-Clause # --------------------------------------------------------------------- import contextlib +import functools import os import platform import subprocess import sys +from pathlib import Path + +from .constants import ( + BASH_EXECUTABLE, + PY_PACKAGE_MODELS_ROOT, + process_output, + run_and_get_output, +) class Colors: @@ -35,12 +44,30 @@ def new_cd(x): os.chdir(d) +@functools.lru_cache(maxsize=None) +def check_code_gen_field(model_name: str, field_name: str) -> bool: + """ + This process does not have the yaml package, so use this primitive way + to check if a code gen field is true and apply branching logic within CI/scorecard. 
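+
+    Note: this is a plain substring match, so the field must appear in
+    code-gen.yaml exactly as "<field_name>: true" to be detected.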
+ """ + yaml_path = Path(PY_PACKAGE_MODELS_ROOT) / model_name / "code-gen.yaml" + if yaml_path.exists(): + with open(yaml_path, "r") as f: + if f"{field_name}: true" in f.read(): + return True + return False + + def can_support_aimet(platform: str = sys.platform) -> bool: return platform == "linux" or platform == "linux2" +def get_is_hub_quantized(model_name) -> bool: + return check_code_gen_field(model_name.lower(), "use_hub_quantization") + + def model_needs_aimet(model_name: str) -> bool: - return "quantized" in model_name.lower() + return "quantized" in model_name.lower() and not get_is_hub_quantized(model_name) def default_parallelism() -> int: @@ -76,26 +103,10 @@ def on_mac(): return platform.uname().system == "Darwin" -def process_output(command): - return command.stdout.decode("utf-8").strip() - - def run(command): return subprocess.run(command, shell=True, check=True, executable=BASH_EXECUTABLE) -def run_and_get_output(command, check=True): - return process_output( - subprocess.run( - command, - stdout=subprocess.PIPE, - shell=True, - check=check, - executable=BASH_EXECUTABLE, - ) - ) - - def run_with_venv(venv, command, env=None): if venv is not None: subprocess.run( @@ -122,8 +133,3 @@ def run_with_venv_and_get_output(venv, command): ) else: return run_and_get_output(command) - - -BASH_EXECUTABLE = process_output( - subprocess.run("which bash", stdout=subprocess.PIPE, shell=True, check=True) -) diff --git a/scripts/util/extract_info_from_context_binary.py b/scripts/util/extract_info_from_context_binary.py new file mode 100644 index 00000000..0bf0aba3 --- /dev/null +++ b/scripts/util/extract_info_from_context_binary.py @@ -0,0 +1,79 @@ +# --------------------------------------------------------------------- +# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved. 
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+import argparse
+import json
+import os
+import subprocess
+
+QNN_TYPE_TO_STR = {
+    "QNN_DATATYPE_UFIXED_POINT_16": "uint16",
+    "QNN_DATATYPE_UFIXED_POINT_8": "uint8",
+    "QNN_DATATYPE_INT_32": "int32",
+}
+
+
+def run_utility(qnn_sdk, model_path):
+    json_path = f"{os.path.splitext(os.path.basename(model_path))[0]}.json"
+    subprocess.run(
+        [
+            f"{qnn_sdk}/qnn_sdk/default/bin/x86_64-linux-clang/qnn-context-binary-utility",
+            "--context_binary",
+            model_path,
+            "--json_file",
+            json_path,
+        ],
+        check=True,
+    )
+    return json_path
+
+
+def print_details_from_json(json_path):
+    with open(json_path, "r") as f:
+        data = json.load(f)
+
+    for graph in data["info"]["graphs"]:
+        print(f"Graph Name: {graph['info']['graphName']}")
+        input_spec = dict()
+        # "input" would shadow the builtin, so name the loop variable explicitly.
+        for graph_input in graph["info"]["graphInputs"]:
+            input_spec[graph_input["info"]["name"]] = (
+                tuple(graph_input["info"]["dimensions"]),
+                QNN_TYPE_TO_STR[graph_input["info"]["dataType"]],
+            )
+        print(f"Graph Input: {input_spec}")
+        out = []
+        for output in graph["info"]["graphOutputs"]:
+            out.append(output["info"]["name"])
+        print(f"Graph Output Names: {out}")
+        print()
+
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument(
+        "--model",
+        "-m",
+        type=str,
+        default=None,
+        help="Folder of context binaries whose graph names and input/output details are needed to create model.py",
+    )
+    parser.add_argument(
+        "--qnn",
+        type=str,
+        default=None,
+        help="QNN SDK path",
+    )
+    args = parser.parse_args()
+    assert args.qnn and args.model, "Must specify --model and --qnn"
+
+    for model_path in os.listdir(args.model):
+        if os.path.splitext(model_path)[-1] == ".bin":
+            print(f"Model {model_path}")
+            print("===================")
+            json_path = run_utility(args.qnn, os.path.join(args.model, model_path))
+            print_details_from_json(json_path)
+            print()
+            print()
+
+
+if __name__ == "__main__":
+    main()
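+
+# Example invocation (paths are illustrative, not part of the repo):
+#   python scripts/util/extract_info_from_context_binary.py \
+#       --model /path/to/context_binaries --qnn /path/to/qnn_sdk_root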