v0.17.0
See https://github.com/quic/ai-hub-models/releases/v0.17.0 for changelog.

Signed-off-by: QAIHM Team <[email protected]>
qaihm-bot committed Oct 31, 2024
1 parent 32cf044 commit 4b1d508
Showing 519 changed files with 28,039 additions and 19,321 deletions.
5 changes: 5 additions & 0 deletions .pre-commit-config.yaml
@@ -58,6 +58,11 @@ repos:
hooks:
- id: shellcheck
exclude: '\.yml$'
- repo: https://github.com/asottile/pyupgrade
rev: v3.16.0
hooks:
- id: pyupgrade
args: ['--py39-plus', '--keep-runtime-typing']
- repo: https://github.com/pycqa/isort
rev: 5.12.0
hooks:
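For context, the pyupgrade hook added above rewrites typing syntax to the Python 3.9 baseline used throughout this commit. The sketch below (illustrative, not part of this commit) shows the kind of change it makes; with `--keep-runtime-typing`, `Optional`/`Union` stay spelled via `typing` because the `X | Y` form only works at runtime on Python 3.10+.

```python
# Illustrative sketch of a pyupgrade rewrite under --py39-plus --keep-runtime-typing.
#
# Before:
#   from typing import Dict, List, Optional
#   def load(paths: List[str]) -> Dict[str, Optional[int]]: ...
#
# After: built-in generics (PEP 585) replace Dict/List, while Optional is kept
# because `int | None` would require Python 3.10 at runtime.
from typing import Optional


def load(paths: list[str]) -> dict[str, Optional[int]]:
    """Hypothetical helper, used only to show the annotation style."""
    return {p: None for p in paths}
```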
9 changes: 7 additions & 2 deletions README.md
@@ -51,11 +51,13 @@ and many more.

## Installation

We currently support **Python >=3.8 and <= 3.10.** We recommend using a Python
We currently support **Python 3.9, 3.10 (recommended), 3.11, and 3.12.** We recommend using a Python
virtual environment
([miniconda](https://docs.anaconda.com/free/miniconda/miniconda-install/) or
[virtualenv](https://virtualenv.pypa.io/en/latest/)).

*NOTE: Many quantized models are supported only with python 3.10*.

You can setup a virtualenv using:
```
python -m venv qai_hub_models_env && source qai_hub_models_env/bin/activate
@@ -181,7 +183,6 @@ Here is a simplified example of code that can be used to run the entire model
on a cloud hosted device:

```python
from typing import Tuple
import torch
import qai_hub as hub
from qai_hub_models.models.yolov7 import Model as YOLOv7Model
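# --- Editor's sketch (not part of this commit) -------------------------------
# A hedged continuation of the truncated example above, showing the usual
# trace -> compile -> fetch flow. from_pretrained(), get_input_spec(), and the
# "image" input name are assumptions based on qai_hub_models conventions.
torch_model = YOLOv7Model.from_pretrained()
input_spec = torch_model.get_input_spec()   # e.g. {"image": ((1, 3, 640, 640), "float32")} (assumed)
example_input = torch.rand(*input_spec["image"][0])

# Trace the model so AI Hub can compile it for a cloud-hosted device.
traced_model = torch.jit.trace(torch_model, (example_input,))

compile_job = hub.submit_compile_job(
    model=traced_model,
    device=hub.Device("Samsung Galaxy S23"),
    input_specs=input_spec,
)
target_model = compile_job.get_target_model()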
@@ -359,10 +360,14 @@ Qualcomm® AI Hub Models is licensed under BSD-3. See the [LICENSE file](../LICE
| [DETR-ResNet101-DC5](https://aihub.qualcomm.com/models/detr_resnet101_dc5) | [qai_hub_models.models.detr_resnet101_dc5](qai_hub_models/models/detr_resnet101_dc5/README.md) | ✔️ | ✔️ | ✔️
| [DETR-ResNet50](https://aihub.qualcomm.com/models/detr_resnet50) | [qai_hub_models.models.detr_resnet50](qai_hub_models/models/detr_resnet50/README.md) | ✔️ | ✔️ | ✔️
| [DETR-ResNet50-DC5](https://aihub.qualcomm.com/models/detr_resnet50_dc5) | [qai_hub_models.models.detr_resnet50_dc5](qai_hub_models/models/detr_resnet50_dc5/README.md) | ✔️ | ✔️ | ✔️
| [FaceAttribNet](https://aihub.qualcomm.com/models/face_attrib_net) | [qai_hub_models.models.face_attrib_net](qai_hub_models/models/face_attrib_net/README.md) | ✔️ | ✔️ | ✔️
| [FootTrackNet_Quantized](https://aihub.qualcomm.com/models/foot_track_net_quantized) | [qai_hub_models.models.foot_track_net_quantized](qai_hub_models/models/foot_track_net_quantized/README.md) | ✔️ | ✔️ | ✔️
| [Lightweight-Face-Detection](https://aihub.qualcomm.com/models/face_det_lite) | [qai_hub_models.models.face_det_lite](qai_hub_models/models/face_det_lite/README.md) | ✔️ | ✔️ | ✔️
| [MediaPipe-Face-Detection](https://aihub.qualcomm.com/models/mediapipe_face) | [qai_hub_models.models.mediapipe_face](qai_hub_models/models/mediapipe_face/README.md) | ✔️ | ✔️ | ✔️
| [MediaPipe-Face-Detection-Quantized](https://aihub.qualcomm.com/models/mediapipe_face_quantized) | [qai_hub_models.models.mediapipe_face_quantized](qai_hub_models/models/mediapipe_face_quantized/README.md) | ✔️ | ✔️ | ✔️
| [MediaPipe-Hand-Detection](https://aihub.qualcomm.com/models/mediapipe_hand) | [qai_hub_models.models.mediapipe_hand](qai_hub_models/models/mediapipe_hand/README.md) | ✔️ | ✔️ | ✔️
| [PPE-Detection](https://aihub.qualcomm.com/models/gear_guard_net) | [qai_hub_models.models.gear_guard_net](qai_hub_models/models/gear_guard_net/README.md) | ✔️ | ✔️ | ✔️
| [PPE-Detection-Quantized](https://aihub.qualcomm.com/models/gear_guard_net_quantized) | [qai_hub_models.models.gear_guard_net_quantized](qai_hub_models/models/gear_guard_net_quantized/README.md) | ✔️ | ✔️ | ✔️
| [Person-Foot-Detection](https://aihub.qualcomm.com/models/foot_track_net) | [qai_hub_models.models.foot_track_net](qai_hub_models/models/foot_track_net/README.md) | ✔️ | ✔️ | ✔️
| [YOLOv11-Detection](https://aihub.qualcomm.com/models/yolov11_det) | [qai_hub_models.models.yolov11_det](qai_hub_models/models/yolov11_det/README.md) | ✔️ | ✔️ | ✔️
| [YOLOv8-Detection](https://aihub.qualcomm.com/models/yolov8_det) | [qai_hub_models.models.yolov8_det](qai_hub_models/models/yolov8_det/README.md) | ✔️ | ✔️ | ✔️
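The table above adds several new detection entries (FaceAttribNet, FootTrackNet, PPE-Detection, Lightweight-Face-Detection). A minimal sketch of loading one of them, assuming the same `Model.from_pretrained()` wrapper convention used by the YOLOv7 example earlier in this README:

```python
# Sketch only: the module path comes from the table above; the no-argument
# from_pretrained() call is an assumption based on the common wrapper API.
from qai_hub_models.models.foot_track_net import Model as FootTrackNet

model = FootTrackNet.from_pretrained()
print(type(model).__name__)
```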
2 changes: 1 addition & 1 deletion qai_hub_models/_version.py
@@ -2,4 +2,4 @@
# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
# ---------------------------------------------------------------------
__version__ = "0.16.2"
__version__ = "0.17.0"
5 changes: 2 additions & 3 deletions qai_hub_models/datasets/__init__.py
@@ -3,7 +3,6 @@
# SPDX-License-Identifier: BSD-3-Clause
# ---------------------------------------------------------------------

from typing import Dict, List, Type

from .bsd300 import BSD300Dataset
from .coco import CocoDataset
@@ -12,15 +11,15 @@
from .imagenette import ImagenetteDataset
from .pascal_voc import VOCSegmentationDataset

ALL_DATASETS: List[Type[BaseDataset]] = [
ALL_DATASETS: list[type[BaseDataset]] = [
CocoDataset,
VOCSegmentationDataset,
BSD300Dataset,
ImagenetDataset,
ImagenetteDataset,
]

DATASET_NAME_MAP: Dict[str, Type[BaseDataset]] = {
DATASET_NAME_MAP: dict[str, type[BaseDataset]] = {
dataset_cls.dataset_name(): dataset_cls for dataset_cls in ALL_DATASETS
}

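The registry logic is unchanged aside from the annotation style: `DATASET_NAME_MAP` still maps each dataset's `dataset_name()` string to its class. A small usage sketch, where the `"coco"` key is an assumption about what `CocoDataset.dataset_name()` returns:

```python
# Sketch: looking a dataset class up by name via the registry above.
# The "coco" key is assumed; actual keys come from each dataset_name().
from qai_hub_models.datasets import DATASET_NAME_MAP

dataset_cls = DATASET_NAME_MAP["coco"]
dataset = dataset_cls()  # may download dataset assets on first use
print(len(dataset))
```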
3 changes: 1 addition & 2 deletions qai_hub_models/datasets/bsd300.py
@@ -5,7 +5,6 @@
from __future__ import annotations

import os
from typing import Tuple

import numpy as np
import torch
@@ -66,7 +65,7 @@ def _prepare_data(self):
def __len__(self):
return DATASET_LENGTH

def __getitem__(self, item) -> Tuple[torch.Tensor, torch.Tensor]:
def __getitem__(self, item) -> tuple[torch.Tensor, torch.Tensor]:
# We use the super resolution GT-and-test image preparation from AIMET zoo:
# https://github.com/quic/aimet-model-zoo/blob/d09d2b0404d10f71a7640a87e9d5e5257b028802/aimet_zoo_torch/quicksrnet/dataloader/utils.py#L51

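`__getitem__` now annotates its return type as a built-in `tuple`; behavior is unchanged, and each item is still an (input, ground-truth) tensor pair. A sketch of consuming it with a DataLoader, assuming the dataset can be constructed with its defaults:

```python
# Sketch, assuming BSD300Dataset() works with default arguments; each item is
# the (input, ground-truth) pair described by the annotation above.
from torch.utils.data import DataLoader

from qai_hub_models.datasets.bsd300 import BSD300Dataset

loader = DataLoader(BSD300Dataset(), batch_size=4)
lr_batch, hr_batch = next(iter(loader))
print(lr_batch.shape, hr_batch.shape)
```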
6 changes: 3 additions & 3 deletions qai_hub_models/datasets/coco.py
@@ -3,7 +3,7 @@
# SPDX-License-Identifier: BSD-3-Clause
# ---------------------------------------------------------------------
import os
from typing import Tuple, Union
from typing import Union

import torch
from torch.utils.data.dataloader import default_collate
@@ -56,7 +56,7 @@ class CocoDataset(BaseDataset, CocoDetection):
Contains ~5k images spanning 80 classes.
"""

def __init__(self, target_image_size: Union[int, Tuple[int, int]] = 640):
def __init__(self, target_image_size: Union[int, tuple[int, int]] = 640):
BaseDataset.__init__(self, str(COCO_DATASET.path(extracted=True)))
CocoDetection.__init__(
self,
@@ -78,7 +78,7 @@ def __init__(self, target_image_size: Union[int, Tuple[int, int]] = 640):
)

def __getitem__(self, item):
image, target = super(CocoDataset, self).__getitem__(item)
image, target = super().__getitem__(item)
width, height = image.size
boxes = []
labels = []
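The cleanup above swaps the legacy two-argument `super(CocoDataset, self)` call for the zero-argument form. A minimal, self-contained illustration (not from this commit) of why the two are equivalent in Python 3:

```python
# Minimal illustration: the zero-argument super() resolves the same MRO entry
# as the explicit two-argument form.
class Base:
    def __getitem__(self, item):
        return item * 2


class Child(Base):
    def __getitem__(self, item):
        return super().__getitem__(item)  # same as super(Child, self).__getitem__(item)


assert Child()[3] == 6
```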
5 changes: 2 additions & 3 deletions qai_hub_models/datasets/pascal_voc.py
@@ -3,7 +3,6 @@
# SPDX-License-Identifier: BSD-3-Clause
# ---------------------------------------------------------------------

from typing import Tuple

import numpy as np
import torch
@@ -30,7 +29,7 @@ class VOCSegmentationDataset(BaseDataset):
https://host.robots.ox.ac.uk/pascal/VOC/voc2012/
"""

def __init__(self, split: str = "train", image_size: Tuple[int, int] = (224, 224)):
def __init__(self, split: str = "train", image_size: tuple[int, int] = (224, 224)):
BaseDataset.__init__(self, str(VOC_ASSET.path().parent / DEVKIT_FOLDER_NAME))
assert split in ["train", "val", "trainval"]
self.split = split
@@ -44,7 +43,7 @@ def __init__(self, split: str = "train", image_size: Tuple[int, int] = (224, 224
self.images = []
self.categories = []

with open(splits_dir / (split + ".txt"), "r") as f:
with open(splits_dir / (split + ".txt")) as f:
lines = f.read().splitlines()

for line in lines:
5 changes: 3 additions & 2 deletions qai_hub_models/evaluators/base_evaluators.py
@@ -5,7 +5,8 @@
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Callable, Collection, Tuple, Union
from collections.abc import Callable, Collection
from typing import Union

import torch
from torch.utils.data.dataloader import DataLoader
@@ -15,7 +16,7 @@
_ModelIO: TypeAlias = Union[Collection[torch.Tensor], torch.Tensor]
# Typically is a torch DataLoader, but anything with the collection signature is acceptable.
_DataLoader: TypeAlias = Union[
DataLoader, Collection[Union[_ModelIO, Tuple[_ModelIO, _ModelIO]]]
DataLoader, Collection[Union[_ModelIO, tuple[_ModelIO, _ModelIO]]]
]


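The alias keeps its meaning under the new import style: per the comment above, a `_DataLoader` can be a real torch `DataLoader` or any collection of inputs or (input, ground-truth) pairs. A minimal sketch of the second form:

```python
# Sketch: a plain list of (input, ground_truth) tensor pairs satisfies the
# _DataLoader alias above; no torch DataLoader is required.
import torch

batches = [
    (torch.rand(1, 3, 224, 224), torch.randint(0, 10, (1,)))
    for _ in range(4)
]
for model_input, ground_truth in batches:
    print(model_input.shape, ground_truth.shape)
```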
8 changes: 6 additions & 2 deletions qai_hub_models/evaluators/detection_evaluator.py
@@ -4,7 +4,7 @@
# ---------------------------------------------------------------------
from __future__ import annotations

from typing import Collection
from collections.abc import Collection

import torch
from podm.metrics import ( # type: ignore
@@ -33,7 +33,7 @@ def __init__(
self.scale_x = 1 / image_height
self.scale_y = 1 / image_width

def add_batch(self, output: Collection[torch.Tensor], gt: Collection[torch.Tensor]):
def add_batch(
self,
output: Collection[torch.Tensor],
gt: Collection[torch.Tensor],
):
# This evaluator supports 1 output tensor at a time.
image_id, _, _, bboxes, classes = gt
pred_boxes, pred_scores, pred_class_idx = output
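From the unpacking above, `gt` carries five per-image tensors (two of which are discarded) and `output` carries three. The dummy tensors below show that layout only; shapes are assumptions for illustration and are not taken from this commit.

```python
# Dummy tensors showing the tuple layout unpacked by add_batch above.
import torch

gt = (
    torch.tensor([0]),             # image_id
    torch.tensor([640]),           # discarded by add_batch ("_")
    torch.tensor([640]),           # discarded by add_batch ("_")
    torch.rand(1, 5, 4),           # ground-truth bounding boxes (shape assumed)
    torch.randint(0, 80, (1, 5)),  # ground-truth class indices (shape assumed)
)
output = (
    torch.rand(1, 5, 4),           # predicted boxes
    torch.rand(1, 5),              # predicted scores
    torch.randint(0, 80, (1, 5)),  # predicted class indices
)
# evaluator.add_batch(output, gt)  # one image's tensors at a time
```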
55 changes: 40 additions & 15 deletions qai_hub_models/global_requirements.txt
@@ -6,61 +6,86 @@
# THIS FILE WAS AUTO-GENERATED. DO NOT EDIT MANUALLY.

Deprecated==1.2.11
aimet-torch==1.32.1.post1; sys_platform == "linux"
Pillow>10,<12
aimet-torch==1.32.1.post1; sys_platform == "linux" and python_version == "3.10"
albumentations==0.5.2
audio2numpy==0.1.2
basicsr==1.4.2
boto3==1.34.119
botocore==1.34.119
coverage==5.3.1
boto3>=1.34,<1.36
botocore>=1.34,<1.36
data-gradients==0.3.1
datasets==2.14.5
diffusers[torch]==0.21.4
easydict==1.10
easydict==1.13
einops==0.3.2
ftfy==6.1.1
gdown==4.7.1
gitpython==3.1.42
huggingface-hub>=0.23.1,<0.24
huggingface_hub>=0.23.1,<0.24
hydra-core==1.3.0
imageio[ffmpeg]==2.31.5
imagesize==1.4.1
jinja2==3.0.3
keyrings.envvars==1.1.0; python_version >= '3.9' # used only by CI
ipython==8.12.3
jinja2<3.2
jsonschema>4,<5
keyrings.envvars==1.1.0
kornia==0.5.0
librosa==0.10.1
matplotlib==3.7.5
mmcv==2.1.0
mmdet==3.2.0
mmpose==1.2.0
mypy==0.991
mypy==1.13.0
numpy>=1.23.5,< 2 # 1.23.5 required by AIMET
object-detection-metrics==0.4.post1
onnx>=1.14.1,<1.17 # ONNX must be at least 1.14.1. AIMET-torch and AIMET-ONNX use different ONNX versions.
onnxsim<=0.4.36
openai-whisper==20231117
pre-commit==3.5.0
opencv-python>4,<5
packaging>23,<24
pandas>=1.4.3,<2.3 # 1.4 required by AIMET
pre-commit==4.0.1
prettytable==3.11.0
psutil>6,<7
pycocotools==2.0.7
pytest-cov==4.1.0
pytest-xdist==3.3.1
pytorch-lightning==1.6.0
pytest>7,<9
pytest-cov>=5,<5.2
pytest-xdist>3,<4
pytorch-lightning>2,<3
pyyaml==6.0.2
qai_hub>=0.18.1
rapidfuzz==3.8.1
regex==2023.10.3
requests_toolbelt==1.0.0
ruamel-yaml==0.18.6
samplerate==0.2.1
schema==0.7.5
scikit-image==0.21.0
scikit-learn==1.1.3
scipy==1.8.1
scikit-image>0.21.0,<0.25
scikit-learn>1.1,<1.6
scipy>=1.8.1,<2 # 1.8.1 is for AIMET
seaborn==0.11.0
sentencepiece==0.2.0
shapely==2.0.3
soundfile==0.12.1
stringcase==1.2.0
tabulate==0.9.0
tensorboard==2.13.0
termcolor<=2.5.0
tflite==2.10.0
thop==0.1.1.post2209072238
timm==1.0.3
torch>=2.1.2,<2.5.0 # 2.1.2 is for AIMET. 2.5 won't work with torchvision yet.
torchmetrics==1.4.0.post0
torchvision>=0.16.2,<0.21
tqdm>=4.66
transformers==4.41.1
treelib==1.6.1
types-PyYAML==6.0.12.12
types-pillow==10.2.0.20240213
types-requests==2.31.0.6
types-tabulate==0.9.0.20240106
typing-extensions>=4.12.2
ultralytics==8.0.193
webdataset==0.2.86
wheel==0.44.0
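Several pins now carry environment markers, e.g. `aimet-torch` installs only on Linux with Python 3.10. The sketch below uses the `packaging` library (already available alongside pip; used here purely for illustration) to show how such a marker is evaluated:

```python
# Sketch: evaluating the environment marker used on the aimet-torch pin above.
from packaging.markers import Marker

marker = Marker('sys_platform == "linux" and python_version == "3.10"')
print(marker.evaluate())  # True only on Linux under Python 3.10
```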
42 changes: 10 additions & 32 deletions qai_hub_models/models/_shared/body_detection/app.py
@@ -2,7 +2,8 @@
# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
# ---------------------------------------------------------------------
from typing import Callable, List

from collections.abc import Callable

import numpy as np
import torch
@@ -12,36 +13,12 @@
from qai_hub_models.utils.image_processing import resize_pad


def preprocess(img: np.ndarray, height: int, width: int):
"""
Preprocess model input.
Inputs:
img: np.ndarray
Input image of shape [H, W, C]
height: int
Model input height.
width: int
Model input width
Outputs:
input: torch.Tensor
Preprocessed model input. Shape is (1, C, H, W)
scale: float
Scaling factor of input image and network input image.
pad: List[float]
Top and left padding size.
"""
img = torch.from_numpy(img).permute(2, 0, 1).unsqueeze_(0) / 255.0
input, scale, pad = resize_pad(img, (height, width))
return input, scale, pad


def decode(output: List[torch.Tensor], thr: float) -> np.ndarray:
def decode(output: list[torch.Tensor], thr: float) -> np.ndarray:
"""
Decode model output to bounding boxes, class indices and scores.
Inputs:
output: List[torch.Tensor]
output: list[torch.Tensor]
Model output.
thr: float
Detection threshold. Predictions lower than the thresholds will be discarded.
@@ -87,20 +64,20 @@ def decode(output: List[torch.Tensor], thr: float) -> np.ndarray:


def postprocess(
output: List[torch.Tensor],
output: list[torch.Tensor],
scale: float,
pad: List[int],
pad: list[int],
conf_thr: float,
iou_thr: float,
) -> np.ndarray:
"""
Post process model output.
Inputs:
output: List[torch.Tensor]
output: list[torch.Tensor]
Multi-scale model output.
scale: float
Scaling factor from input image and model input.
pad: List[int]
pad: list[int]
Padding sizes from input image and model input.
conf_thr: float
Confidence threshold of detections.
@@ -163,7 +140,7 @@ def detect(self, imgfile: str, height: int, width: int, conf: float) -> np.ndarr
(cls_id, x1, y1, x2, y2, score)
"""
img = np.array(load_image(imgfile))
input, scale, pad = preprocess(img, height, width)
img = torch.from_numpy(img).permute(2, 0, 1).unsqueeze_(0)
input, scale, pad = resize_pad(img, (height, width))
output = self.model(input)
for t, o in enumerate(output):
output[t] = o.permute(0, 2, 3, 1).detach()
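With the standalone `preprocess()` helper removed, `detect()` now converts the image tensor and calls `resize_pad` directly. A sketch of that flow on a dummy image; the 448x448 target size and the `.float()` cast are assumptions for illustration, not taken from this commit.

```python
# Sketch of the inlined preprocessing path: HWC uint8 image -> 1xCxHxW tensor
# -> resize_pad to the model's input size.
import numpy as np
import torch

from qai_hub_models.utils.image_processing import resize_pad

img = np.zeros((480, 640, 3), dtype=np.uint8)                          # H x W x C
tensor = torch.from_numpy(img).permute(2, 0, 1).unsqueeze_(0).float()  # 1 x C x H x W
model_input, scale, pad = resize_pad(tensor, (448, 448))               # target size assumed
print(model_input.shape, scale, pad)
```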