v0.16.2

See https://github.com/quic/ai-hub-models/releases/v0.16.2 for changelog.

Signed-off-by: QAIHM Team <[email protected]>
quic · Oct 21, 2024 · 32cf044
1 parent 1c5de3b
commit 32cf044

Showing 524 changed files with 34,989 additions and 23,652 deletions.
diff --git a/README.md b/README.md
@@ -38,7 +38,8 @@ Supported precision
 
 Supported chipsets
 * [Snapdragon 845](https://www.qualcomm.com/products/mobile/snapdragon/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-845-mobile-platform), [Snapdragon 855/855+](https://www.qualcomm.com/products/mobile/snapdragon/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-855-mobile-platform), [Snapdragon 865/865+](https://www.qualcomm.com/products/mobile/snapdragon/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-865-plus-5g-mobile-platform), [Snapdragon 888/888+](https://www.qualcomm.com/products/mobile/snapdragon/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-888-5g-mobile-platform)
-* [Snapdragon 8 Gen 1](https://www.qualcomm.com/products/mobile/snapdragon/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-8-gen-1-mobile-platform), [Snapdragon 8 Gen 2](https://www.qualcomm.com/products/mobile/snapdragon/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-8-gen-2-mobile-platform), [Snapdragon 8 Gen 3](https://www.qualcomm.com/products/mobile/snapdragon/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-8-gen-3-mobile-platform), [Snapdragon X Elite](https://www.qualcomm.com/products/mobile/snapdragon/pcs-and-tablets/snapdragon-x-elite)
+* [Snapdragon 8 Elite](https://www.qualcomm.com/products/mobile/snapdragon/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-8-elite-mobile-platform), [Snapdragon 8 Gen 3](https://www.qualcomm.com/products/mobile/snapdragon/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-8-gen-3-mobile-platform), [Snapdragon 8 Gen 2](https://www.qualcomm.com/products/mobile/snapdragon/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-8-gen-2-mobile-platform), [Snapdragon 8 Gen 1](https://www.qualcomm.com/products/mobile/snapdragon/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-8-gen-1-mobile-platform)
+* [Snapdragon X Elite](https://www.qualcomm.com/products/mobile/snapdragon/pcs-and-tablets/snapdragon-x-elite)
 
 Select supported devices
 * Samsung Galaxy S21 Series, Galaxy S22 Series, Galaxy S23 Series, Galaxy S24 Series
@@ -275,6 +276,7 @@ Qualcomm® AI Hub Models is licensed under BSD-3. See the [LICENSE file](../LICE
 | [ConvNext-Tiny-w8a16-Quantized](https://aihub.qualcomm.com/models/convnext_tiny_w8a16_quantized) | [qai_hub_models.models.convnext_tiny_w8a16_quantized](qai_hub_models/models/convnext_tiny_w8a16_quantized/README.md) | ✔️ | ✔️ | ✔️
 | [ConvNext-Tiny-w8a8-Quantized](https://aihub.qualcomm.com/models/convnext_tiny_w8a8_quantized) | [qai_hub_models.models.convnext_tiny_w8a8_quantized](qai_hub_models/models/convnext_tiny_w8a8_quantized/README.md) | ✔️ | ✔️ | ✔️
 | [DenseNet-121](https://aihub.qualcomm.com/models/densenet121) | [qai_hub_models.models.densenet121](qai_hub_models/models/densenet121/README.md) | ✔️ | ✔️ | ✔️
+| [DenseNet-121-Quantized](https://aihub.qualcomm.com/models/densenet121_quantized) | [qai_hub_models.models.densenet121_quantized](qai_hub_models/models/densenet121_quantized/README.md) | ✔️ | ✔️ | ✔️
 | [EfficientNet-B0](https://aihub.qualcomm.com/models/efficientnet_b0) | [qai_hub_models.models.efficientnet_b0](qai_hub_models/models/efficientnet_b0/README.md) | ✔️ | ✔️ | ✔️
 | [GoogLeNet](https://aihub.qualcomm.com/models/googlenet) | [qai_hub_models.models.googlenet](qai_hub_models/models/googlenet/README.md) | ✔️ | ✔️ | ✔️
 | [GoogLeNetQuantized](https://aihub.qualcomm.com/models/googlenet_quantized) | [qai_hub_models.models.googlenet_quantized](qai_hub_models/models/googlenet_quantized/README.md) | ✔️ | ✔️ | ✔️
@@ -306,6 +308,7 @@ Qualcomm® AI Hub Models is licensed under BSD-3. See the [LICENSE file](../LICE
 | [Swin-Small](https://aihub.qualcomm.com/models/swin_small) | [qai_hub_models.models.swin_small](qai_hub_models/models/swin_small/README.md) | ✔️ | ✔️ | ✔️
 | [Swin-Tiny](https://aihub.qualcomm.com/models/swin_tiny) | [qai_hub_models.models.swin_tiny](qai_hub_models/models/swin_tiny/README.md) | ✔️ | ✔️ | ✔️
 | [VIT](https://aihub.qualcomm.com/models/vit) | [qai_hub_models.models.vit](qai_hub_models/models/vit/README.md) | ✔️ | ✔️ | ✔️
+| [VITQuantized](https://aihub.qualcomm.com/models/vit_quantized) | [qai_hub_models.models.vit_quantized](qai_hub_models/models/vit_quantized/README.md) | ✔️ | ✔️ | ✔️
 | [WideResNet50](https://aihub.qualcomm.com/models/wideresnet50) | [qai_hub_models.models.wideresnet50](qai_hub_models/models/wideresnet50/README.md) | ✔️ | ✔️ | ✔️
 | [WideResNet50-Quantized](https://aihub.qualcomm.com/models/wideresnet50_quantized) | [qai_hub_models.models.wideresnet50_quantized](qai_hub_models/models/wideresnet50_quantized/README.md) | ✔️ | ✔️ | ✔️
 | | | | |
@@ -359,7 +362,9 @@ Qualcomm® AI Hub Models is licensed under BSD-3. See the [LICENSE file](../LICE
 | [MediaPipe-Face-Detection](https://aihub.qualcomm.com/models/mediapipe_face) | [qai_hub_models.models.mediapipe_face](qai_hub_models/models/mediapipe_face/README.md) | ✔️ | ✔️ | ✔️
 | [MediaPipe-Face-Detection-Quantized](https://aihub.qualcomm.com/models/mediapipe_face_quantized) | [qai_hub_models.models.mediapipe_face_quantized](qai_hub_models/models/mediapipe_face_quantized/README.md) | ✔️ | ✔️ | ✔️
 | [MediaPipe-Hand-Detection](https://aihub.qualcomm.com/models/mediapipe_hand) | [qai_hub_models.models.mediapipe_hand](qai_hub_models/models/mediapipe_hand/README.md) | ✔️ | ✔️ | ✔️
-| [YOLOv11-Detection](qai_hub_models/models/yolov11_det/README.md) | [qai_hub_models.models.yolov11_det](qai_hub_models/models/yolov11_det/README.md) | ✔️ | ✔️ | ✔️
+| [PPE-Detection](https://aihub.qualcomm.com/models/gear_guard_net) | [qai_hub_models.models.gear_guard_net](qai_hub_models/models/gear_guard_net/README.md) | ✔️ | ✔️ | ✔️
+| [Person-Foot-Detection](https://aihub.qualcomm.com/models/foot_track_net) | [qai_hub_models.models.foot_track_net](qai_hub_models/models/foot_track_net/README.md) | ✔️ | ✔️ | ✔️
+| [YOLOv11-Detection](https://aihub.qualcomm.com/models/yolov11_det) | [qai_hub_models.models.yolov11_det](qai_hub_models/models/yolov11_det/README.md) | ✔️ | ✔️ | ✔️
 | [YOLOv8-Detection](https://aihub.qualcomm.com/models/yolov8_det) | [qai_hub_models.models.yolov8_det](qai_hub_models/models/yolov8_det/README.md) | ✔️ | ✔️ | ✔️
 | [YOLOv8-Detection-Quantized](https://aihub.qualcomm.com/models/yolov8_det_quantized) | [qai_hub_models.models.yolov8_det_quantized](qai_hub_models/models/yolov8_det_quantized/README.md) | ✔️ | ✔️ | ✔️
 | [Yolo-NAS](https://aihub.qualcomm.com/models/yolonas) | [qai_hub_models.models.yolonas](qai_hub_models/models/yolonas/README.md) | ✔️ | ✔️ | ✔️
@@ -369,7 +374,7 @@ Qualcomm® AI Hub Models is licensed under BSD-3. See the [LICENSE file](../LICE
 | [Yolo-v7-Quantized](https://aihub.qualcomm.com/models/yolov7_quantized) | [qai_hub_models.models.yolov7_quantized](qai_hub_models/models/yolov7_quantized/README.md) | ✔️ | ✔️ | ✔️
 | | | | |
 | **Pose Estimation**
-| [FaceMap_3DMM](qai_hub_models/models/facemap_3dmm/README.md) | [qai_hub_models.models.facemap_3dmm](qai_hub_models/models/facemap_3dmm/README.md) | ✔️ | ✔️ | ✔️
+| [Facial-Landmark-Detection](https://aihub.qualcomm.com/models/facemap_3dmm) | [qai_hub_models.models.facemap_3dmm](qai_hub_models/models/facemap_3dmm/README.md) | ✔️ | ✔️ | ✔️
 | [HRNetPose](https://aihub.qualcomm.com/models/hrnet_pose) | [qai_hub_models.models.hrnet_pose](qai_hub_models/models/hrnet_pose/README.md) | ✔️ | ✔️ | ✔️
 | [HRNetPoseQuantized](https://aihub.qualcomm.com/models/hrnet_pose_quantized) | [qai_hub_models.models.hrnet_pose_quantized](qai_hub_models/models/hrnet_pose_quantized/README.md) | ✔️ | ✔️ | ✔️
 | [LiteHRNet](https://aihub.qualcomm.com/models/litehrnet) | [qai_hub_models.models.litehrnet](qai_hub_models/models/litehrnet/README.md) | ✔️ | ✔️ | ✔️
@@ -413,6 +418,15 @@ Qualcomm® AI Hub Models is licensed under BSD-3. See the [LICENSE file](../LICE
 | [Stable-Diffusion-v2.1](https://aihub.qualcomm.com/models/stable_diffusion_v2_1_quantized) | [qai_hub_models.models.stable_diffusion_v2_1_quantized](qai_hub_models/models/stable_diffusion_v2_1_quantized/README.md) | ✔️ | ✔️ | ✔️
 | | | | |
 | **Text Generation**
-| [Baichuan-7B](https://aihub.qualcomm.com/models/baichuan_7b_quantized) | [qai_hub_models.models.baichuan_7b_quantized](qai_hub_models/models/baichuan_7b_quantized/README.md) | ✔️ | ✔️ | ✔️
+| [Baichuan2-7B](https://aihub.qualcomm.com/models/baichuan2_7b_quantized) | [qai_hub_models.models.baichuan2_7b_quantized](qai_hub_models/models/baichuan2_7b_quantized/README.md) | ✔️ | ✔️ | ✔️
+| [IBM-Granite-3B-Code-Instruct](https://aihub.qualcomm.com/models/ibm_granite_3b_code_instruct) | [qai_hub_models.models.ibm_granite_3b_code_instruct](qai_hub_models/models/ibm_granite_3b_code_instruct/README.md) | ✔️ | ✔️ | ✔️
+| [IndusQ-1.1B](https://aihub.qualcomm.com/models/indus_1b_quantized) | [qai_hub_models.models.indus_1b_quantized](qai_hub_models/models/indus_1b_quantized/README.md) | ✔️ | ✔️ | ✔️
+| [JAIS-6p7b-Chat](https://aihub.qualcomm.com/models/jais_6p7b_chat_quantized) | [qai_hub_models.models.jais_6p7b_chat_quantized](qai_hub_models/models/jais_6p7b_chat_quantized/README.md) | ✔️ | ✔️ | ✔️
 | [Llama-v2-7B-Chat](https://aihub.qualcomm.com/models/llama_v2_7b_chat_quantized) | [qai_hub_models.models.llama_v2_7b_chat_quantized](qai_hub_models/models/llama_v2_7b_chat_quantized/README.md) | ✔️ | ✔️ | ✔️
 | [Llama-v3-8B-Chat](https://aihub.qualcomm.com/models/llama_v3_8b_chat_quantized) | [qai_hub_models.models.llama_v3_8b_chat_quantized](qai_hub_models/models/llama_v3_8b_chat_quantized/README.md) | ✔️ | ✔️ | ✔️
+| [Llama-v3.1-8B-Chat](https://aihub.qualcomm.com/models/llama_v3_1_8b_chat_quantized) | [qai_hub_models.models.llama_v3_1_8b_chat_quantized](qai_hub_models/models/llama_v3_1_8b_chat_quantized/README.md) | ✔️ | ✔️ | ✔️
+| [Llama-v3.2-3B-Chat](https://aihub.qualcomm.com/models/llama_v3_2_3b_chat_quantized) | [qai_hub_models.models.llama_v3_2_3b_chat_quantized](qai_hub_models/models/llama_v3_2_3b_chat_quantized/README.md) | ✔️ | ✔️ | ✔️
+| [Mistral-3B](https://aihub.qualcomm.com/models/mistral_3b_quantized) | [qai_hub_models.models.mistral_3b_quantized](qai_hub_models/models/mistral_3b_quantized/README.md) | ✔️ | ✔️ | ✔️
+| [Mistral-7B-Instruct-v0.3](https://aihub.qualcomm.com/models/mistral_7b_instruct_v0_3_quantized) | [qai_hub_models.models.mistral_7b_instruct_v0_3_quantized](qai_hub_models/models/mistral_7b_instruct_v0_3_quantized/README.md) | ✔️ | ✔️ | ✔️
+| [PLaMo-1B](https://aihub.qualcomm.com/models/plamo_1b_quantized) | [qai_hub_models.models.plamo_1b_quantized](qai_hub_models/models/plamo_1b_quantized/README.md) | ✔️ | ✔️ | ✔️
+| [Qwen2-7B-Instruct](https://aihub.qualcomm.com/models/qwen2_7b_instruct_quantized) | [qai_hub_models.models.qwen2_7b_instruct_quantized](qai_hub_models/models/qwen2_7b_instruct_quantized/README.md) | ✔️ | ✔️ | ✔️
diff --git a/qai_hub_models/_version.py b/qai_hub_models/_version.py
@@ -2,4 +2,4 @@
 # Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
 # SPDX-License-Identifier: BSD-3-Clause
 # ---------------------------------------------------------------------
-__version__ = "0.15.0"
+__version__ = "0.16.2"
diff --git a/qai_hub_models/asset_bases.yaml b/qai_hub_models/asset_bases.yaml
@@ -12,3 +12,4 @@ huggingface_path: qualcomm/{model_name}
 models_website_url: https://aihub.qualcomm.com
 models_website_relative_path: models/{model_id}
 email_template: qai_hub_models/scripts/templates/email_template.txt
+genie_url: https://github.com/quic/ai-hub-apps/tree/main/tutorials/llm_on_genie
diff --git a/qai_hub_models/conftest.py b/qai_hub_models/conftest.py
@@ -4,6 +4,7 @@
 # ---------------------------------------------------------------------
 def pytest_configure(config):
     config.addinivalue_line("markers", "compile: Run compile tests.")
+    config.addinivalue_line("markers", "quantize: Run quantize tests.")
     config.addinivalue_line("markers", "profile: Run profile tests.")
     config.addinivalue_line("markers", "inference: Run inference tests.")
     config.addinivalue_line("markers", "trace: Run trace accuracy tests.")
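
Review note on the `conftest.py` hunk above: the new `quantize` marker is registered the same way as the existing ones, through `config.addinivalue_line("markers", ...)`. The sketch below only illustrates that registration pattern. `RecordingConfig` and `registered_marker_names` are hypothetical helpers introduced here so the snippet runs without pytest installed; they are not part of pytest or of this repository.

```python
class RecordingConfig:
    """Hypothetical stand-in for pytest's config object; it only records ini lines."""

    def __init__(self):
        self.lines = {}

    def addinivalue_line(self, name, line):
        # Same call shape as the one used in conftest.py above.
        self.lines.setdefault(name, []).append(line)


def pytest_configure(config):
    # Mirrors the hunk: one "markers" line per test category.
    for name in ("compile", "quantize", "profile", "inference"):
        config.addinivalue_line("markers", f"{name}: Run {name} tests.")


def registered_marker_names(config):
    # The marker name is the text before the first colon.
    return [line.split(":", 1)[0] for line in config.lines.get("markers", [])]


config = RecordingConfig()
pytest_configure(config)
print(registered_marker_names(config))
```

With real pytest, a test decorated with `@pytest.mark.quantize` can then be selected with `pytest -m quantize`; which tests in this repository carry the marker is not visible in this excerpt.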

diff --git a/qai_hub_models/models/_shared/body_detection/app.py b/qai_hub_models/models/_shared/body_detection/app.py
@@ -0,0 +1,171 @@
+# ---------------------------------------------------------------------
+# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
+# SPDX-License-Identifier: BSD-3-Clause
+# ---------------------------------------------------------------------
+from typing import Callable, List
+
+import numpy as np
+import torch
+
+from qai_hub_models.utils.asset_loaders import load_image
+from qai_hub_models.utils.bounding_box_processing import batched_nms, box_xywh_to_xyxy
+from qai_hub_models.utils.image_processing import resize_pad
+
+
+def preprocess(img: np.ndarray, height: int, width: int):
+    """
+    Preprocess model input.
+
+    Inputs:
+        img: np.ndarray
+            Input image of shape [H, W, C]
+        height: int
+            Model input height.
+        width: int
+            Model input width.
+    Outputs:
+        input: torch.Tensor
+            Preprocessed model input. Shape is (1, C, H, W)
+        scale: float
+            Scaling factor between the input image and the network input image.
+        pad: List[float]
+            Left and top padding size, in that order.
+    """
+    img = torch.from_numpy(img).permute(2, 0, 1).unsqueeze_(0) / 255.0
+    input, scale, pad = resize_pad(img, (height, width))
+    return input, scale, pad
+
+
+def decode(output: List[torch.Tensor], thr: float) -> np.ndarray:
+    """
+    Decode model output to bounding boxes, class indices and scores.
+
+    Inputs:
+        output: List[torch.Tensor]
+            Model output.
+        thr: float
+            Detection threshold. Predictions scoring below the threshold are discarded.
+    Outputs: np.ndarray
+        Detection results. Shape is (N, 6). N is the number of detected objects. Each object is
+        represented by (class, x1, y1, x2, y2, score)
+    """
+    anchors = [
+        [[10, 13], [16, 30], [33, 23]],
+        [[30, 61], [62, 45], [59, 119]],
+        [[116, 90], [156, 198], [373, 326]],
+    ]
+    strides = (8, 16, 32)
+    result = []
+    for s, out in enumerate(output):
+        b, h, w, c = out.shape
+        out = out.reshape(b, h, w, 3, -1)
+        _, ny, nx, na = out.shape[:-1]
+        for y in np.arange(ny):
+            for x in np.arange(nx):
+                for a in np.arange(na):
+                    pred = out[0, y, x, a]
+                    obj_score = pred[4].sigmoid()
+                    cls_score = pred[5:].max().sigmoid()
+                    score = obj_score * cls_score
+                    if score < thr:
+                        continue
+                    c = np.argmax(pred[5:])
+                    bx = (pred[0].sigmoid() * 2 - 0.5 + x) * strides[s]
+                    by = (pred[1].sigmoid() * 2 - 0.5 + y) * strides[s]
+                    bw = 4 * pred[2].sigmoid() ** 2 * anchors[s][a][0]
+                    bh = 4 * pred[3].sigmoid() ** 2 * anchors[s][a][1]
+
+                    boxes = box_xywh_to_xyxy(
+                        torch.from_numpy(np.array([[[bx, by], [bw, bh]]]))
+                    )
+                    x1 = boxes[0][0][0].round()
+                    y1 = boxes[0][0][1].round()
+                    x2 = boxes[0][1][0].round()
+                    y2 = boxes[0][1][1].round()
+                    result.append([c, x1, y1, x2, y2, score])
+    return np.array(result, dtype=np.float32)
+
+
+def postprocess(
+    output: List[torch.Tensor],
+    scale: float,
+    pad: List[int],
+    conf_thr: float,
+    iou_thr: float,
+) -> np.ndarray:
+    """
+    Post process model output.
+    Inputs:
+        output: List[torch.Tensor]
+            Multi-scale model output.
+        scale: float
+            Scaling factor between the input image and the model input.
+        pad: List[int]
+            Padding added to fit the input image to the model input.
+        conf_thr: float
+            Confidence threshold of detections.
+        iou_thr: float
+            IoU threshold for non-maximum suppression.
+    Outputs: np.ndarray
+        Detected objects. Shape is (N, 6). N is the number of detected objects. Each object is
+        represented by (class, x1, y1, x2, y2, score)
+    """
+    result = decode(output, conf_thr)
+
+    result_final = []
+    for c in [0, 1]:
+        idx = result[:, 0] == c
+        boxes, scores = batched_nms(
+            iou_thr,
+            0,
+            torch.from_numpy(result[idx, 1:5]).unsqueeze_(0),
+            torch.from_numpy(result[idx, -1]).unsqueeze_(0),
+        )
+        scores[0].unsqueeze_(-1)
+        result_final.append(
+            torch.concat([torch.zeros_like(scores[0]) + c, boxes[0], scores[0]], 1)
+        )
+    result_final = torch.concat(result_final).numpy()
+    result_final[:, 1:5] = (
+        (result_final[:, 1:5] - np.array([pad[0], pad[1], pad[0], pad[1]])) / scale
+    ).round()
+    return result_final
+
+
+class BodyDetectionApp:
+    """Body detection application"""
+
+    def __init__(self, model: Callable[[torch.Tensor], torch.Tensor]) -> None:
+        """
+        Initialize BodyDetectionApp.
+
+        Inputs:
+            model: Callable[[torch.Tensor], torch.Tensor]
+                Detection model.
+        """
+        self.model = model
+
+    def detect(self, imgfile: str, height: int, width: int, conf: float) -> np.ndarray:
+        """
+        Detect objects in an input image.
+
+        Inputs:
+            imgfile: str
+                Input image file.
+            height: int
+                Model input height.
+            width: int
+                Model input width.
+            conf: float
+                Detection threshold.
+        Outputs: np.ndarray
+            Detection result. Shape is (N, 6). N is the number of detected objects. Each object is represented by
+            (cls_id, x1, y1, x2, y2, score)
+        """
+        img = np.array(load_image(imgfile))
+        input, scale, pad = preprocess(img, height, width)
+        output = self.model(input)
+        for t, o in enumerate(output):
+            output[t] = o.permute(0, 2, 3, 1).detach()
+        result = postprocess(output, scale, pad, conf, 0.5)
+        return result
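
Review note on `decode` and `postprocess` in the file above: the per-cell arithmetic can be checked in isolation. The following is a minimal NumPy sketch of that arithmetic, not the code from this commit. `decode_cell` and `unpad_boxes` are hypothetical names, and the sketch assumes that the (x, y) pair passed to `box_xywh_to_xyxy` is the box center and that `pad` holds the horizontal offset first and the vertical offset second, as the subtraction in `postprocess` suggests.

```python
import numpy as np


def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))


def decode_cell(pred, x, y, stride, anchor_wh):
    """Decode one raw prediction at grid cell (x, y) into (x1, y1, x2, y2).

    pred holds raw logits (tx, ty, tw, th, ...), as in the loop body of decode().
    """
    # Center: sigmoid output stretched to (-0.5, 1.5) around the cell, then scaled by stride.
    cx = (sigmoid(pred[0]) * 2 - 0.5 + x) * stride
    cy = (sigmoid(pred[1]) * 2 - 0.5 + y) * stride
    # Size: between 0 and 4 times the anchor size.
    w = 4 * sigmoid(pred[2]) ** 2 * anchor_wh[0]
    h = 4 * sigmoid(pred[3]) ** 2 * anchor_wh[1]
    # Assumption: center format converted to corner format.
    return np.array([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])


def unpad_boxes(boxes_xyxy, scale, pad):
    """Map corner boxes from model-input pixels back to original-image pixels."""
    offset = np.array([pad[0], pad[1], pad[0], pad[1]])
    return np.round((boxes_xyxy - offset) / scale)


# All-zero logits give sigmoid = 0.5, so the box is centered in its cell
# and has exactly the anchor size.
box = decode_cell(np.zeros(6), x=2, y=3, stride=8, anchor_wh=(10, 13))
print(box)
print(unpad_boxes(box, scale=0.5, pad=(5, 1.5)))
```

Because the sigmoid is monotonic, `pred[5:].max().sigmoid()` in `decode` equals the maximum of the per-class sigmoid scores, so the class chosen by `argmax` is the one whose score is multiplied with the objectness score.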