v0.20.0
See https://github.com/quic/ai-hub-models/releases/v0.20.0 for changelog.

Signed-off-by: QAIHM Team <[email protected]>
qaihm-bot committed Dec 12, 2024
1 parent 826c3ea commit 9857d3f
Showing 328 changed files with 30,803 additions and 19,280 deletions.
11 changes: 8 additions & 3 deletions README.md
@@ -235,6 +235,7 @@ and many more.
| [SINet](https://aihub.qualcomm.com/models/sinet) | [qai_hub_models.models.sinet](qai_hub_models/models/sinet/README.md) |
| [Segment-Anything-Model](https://aihub.qualcomm.com/models/sam) | [qai_hub_models.models.sam](qai_hub_models/models/sam/README.md) |
| [Unet-Segmentation](https://aihub.qualcomm.com/models/unet_segmentation) | [qai_hub_models.models.unet_segmentation](qai_hub_models/models/unet_segmentation/README.md) |
| [YOLOv11-Segmentation](https://aihub.qualcomm.com/models/yolov11_seg) | [qai_hub_models.models.yolov11_seg](qai_hub_models/models/yolov11_seg/README.md) |
| [YOLOv8-Segmentation](https://aihub.qualcomm.com/models/yolov8_seg) | [qai_hub_models.models.yolov8_seg](qai_hub_models/models/yolov8_seg/README.md) |
| | |
| **Object Detection**
@@ -243,8 +244,10 @@ and many more.
| [DETR-ResNet101-DC5](https://aihub.qualcomm.com/models/detr_resnet101_dc5) | [qai_hub_models.models.detr_resnet101_dc5](qai_hub_models/models/detr_resnet101_dc5/README.md) |
| [DETR-ResNet50](https://aihub.qualcomm.com/models/detr_resnet50) | [qai_hub_models.models.detr_resnet50](qai_hub_models/models/detr_resnet50/README.md) |
| [DETR-ResNet50-DC5](https://aihub.qualcomm.com/models/detr_resnet50_dc5) | [qai_hub_models.models.detr_resnet50_dc5](qai_hub_models/models/detr_resnet50_dc5/README.md) |
| [FaceAttribNet](https://aihub.qualcomm.com/models/face_attrib_net) | [qai_hub_models.models.face_attrib_net](qai_hub_models/models/face_attrib_net/README.md) |
| [Facial-Attribute-Detection](https://aihub.qualcomm.com/models/face_attrib_net) | [qai_hub_models.models.face_attrib_net](qai_hub_models/models/face_attrib_net/README.md) |
| [Facial-Attribute-Detection-Quantized](https://aihub.qualcomm.com/models/face_attrib_net_quantized) | [qai_hub_models.models.face_attrib_net_quantized](qai_hub_models/models/face_attrib_net_quantized/README.md) |
| [Lightweight-Face-Detection](https://aihub.qualcomm.com/models/face_det_lite) | [qai_hub_models.models.face_det_lite](qai_hub_models/models/face_det_lite/README.md) |
| [Lightweight-Face-Detection-Quantized](https://aihub.qualcomm.com/models/face_det_lite_quantized) | [qai_hub_models.models.face_det_lite_quantized](qai_hub_models/models/face_det_lite_quantized/README.md) |
| [MediaPipe-Face-Detection](https://aihub.qualcomm.com/models/mediapipe_face) | [qai_hub_models.models.mediapipe_face](qai_hub_models/models/mediapipe_face/README.md) |
| [MediaPipe-Face-Detection-Quantized](https://aihub.qualcomm.com/models/mediapipe_face_quantized) | [qai_hub_models.models.mediapipe_face_quantized](qai_hub_models/models/mediapipe_face_quantized/README.md) |
| [MediaPipe-Hand-Detection](https://aihub.qualcomm.com/models/mediapipe_hand) | [qai_hub_models.models.mediapipe_hand](qai_hub_models/models/mediapipe_hand/README.md) |
@@ -257,6 +260,7 @@ and many more.
| [YOLOv8-Detection-Quantized](https://aihub.qualcomm.com/models/yolov8_det_quantized) | [qai_hub_models.models.yolov8_det_quantized](qai_hub_models/models/yolov8_det_quantized/README.md) |
| [Yolo-NAS](https://aihub.qualcomm.com/models/yolonas) | [qai_hub_models.models.yolonas](qai_hub_models/models/yolonas/README.md) |
| [Yolo-NAS-Quantized](https://aihub.qualcomm.com/models/yolonas_quantized) | [qai_hub_models.models.yolonas_quantized](qai_hub_models/models/yolonas_quantized/README.md) |
| [Yolo-v3](https://aihub.qualcomm.com/models/yolov3) | [qai_hub_models.models.yolov3](qai_hub_models/models/yolov3/README.md) |
| [Yolo-v6](https://aihub.qualcomm.com/models/yolov6) | [qai_hub_models.models.yolov6](qai_hub_models/models/yolov6/README.md) |
| [Yolo-v7](https://aihub.qualcomm.com/models/yolov7) | [qai_hub_models.models.yolov7](qai_hub_models/models/yolov7/README.md) |
| [Yolo-v7-Quantized](https://aihub.qualcomm.com/models/yolov7_quantized) | [qai_hub_models.models.yolov7_quantized](qai_hub_models/models/yolov7_quantized/README.md) |
@@ -273,6 +277,8 @@ and many more.
| [Posenet-Mobilenet-Quantized](https://aihub.qualcomm.com/models/posenet_mobilenet_quantized) | [qai_hub_models.models.posenet_mobilenet_quantized](qai_hub_models/models/posenet_mobilenet_quantized/README.md) |
| | |
| **Depth Estimation**
| [Depth-Anything](https://aihub.qualcomm.com/models/depth_anything) | [qai_hub_models.models.depth_anything](qai_hub_models/models/depth_anything/README.md) |
| [Depth-Anything-V2](https://aihub.qualcomm.com/models/depth_anything_v2) | [qai_hub_models.models.depth_anything_v2](qai_hub_models/models/depth_anything_v2/README.md) |
| [Midas-V2](https://aihub.qualcomm.com/models/midas) | [qai_hub_models.models.midas](qai_hub_models/models/midas/README.md) |
| [Midas-V2-Quantized](https://aihub.qualcomm.com/models/midas_quantized) | [qai_hub_models.models.midas_quantized](qai_hub_models/models/midas_quantized/README.md) |

@@ -284,16 +290,15 @@ and many more.
| **Speech Recognition**
| [HuggingFace-WavLM-Base-Plus](https://aihub.qualcomm.com/models/huggingface_wavlm_base_plus) | [qai_hub_models.models.huggingface_wavlm_base_plus](qai_hub_models/models/huggingface_wavlm_base_plus/README.md) |
| [Whisper-Base-En](https://aihub.qualcomm.com/models/whisper_base_en) | [qai_hub_models.models.whisper_base_en](qai_hub_models/models/whisper_base_en/README.md) |
| [Whisper-Small-En](https://aihub.qualcomm.com/models/whisper_small_en) | [qai_hub_models.models.whisper_small_en](qai_hub_models/models/whisper_small_en/README.md) |
| [Whisper-Tiny-En](https://aihub.qualcomm.com/models/whisper_tiny_en) | [qai_hub_models.models.whisper_tiny_en](qai_hub_models/models/whisper_tiny_en/README.md) |

### Multimodal

| Model | README |
| -- | -- |
| | |
| [OpenAI-Clip](https://aihub.qualcomm.com/models/openai_clip) | [qai_hub_models.models.openai_clip](qai_hub_models/models/openai_clip/README.md) |
| [TrOCR](https://aihub.qualcomm.com/models/trocr) | [qai_hub_models.models.trocr](qai_hub_models/models/trocr/README.md) |

### Generative AI

2 changes: 1 addition & 1 deletion qai_hub_models/_version.py
@@ -2,4 +2,4 @@
# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
# ---------------------------------------------------------------------
__version__ = "0.19.1"
__version__ = "0.20.0"
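To confirm a local install picked up this bump, the version module shown above can be imported directly; a minimal sketch, assuming qai_hub_models is installed in the current environment:

# Minimal check (sketch): verify the installed package matches this release.
from qai_hub_models._version import __version__

assert __version__ == "0.20.0", f"expected 0.20.0, got {__version__}"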
2 changes: 2 additions & 0 deletions qai_hub_models/labels/ppe_labels.txt
@@ -0,0 +1,2 @@
helmet
vest
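For reference, a small sketch of how post-processing code might map detector class ids to these labels; the file path and helper name are illustrative assumptions, since the repo may expose its own label-loading utility:

from pathlib import Path

# Load the PPE label set added above; index i corresponds to class id i.
ppe_labels = Path("qai_hub_models/labels/ppe_labels.txt").read_text().splitlines()


def ppe_class_name(class_id: int) -> str:
    # Hypothetical helper: translate a predicted class id into a label string.
    return ppe_labels[class_id]


print(ppe_class_name(0))  # -> "helmet"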
qai_hub_models/models/_shared/depth_estimation/app.py
@@ -15,7 +15,20 @@
from qai_hub_models.utils.image_processing import pil_resize_pad, undo_resize_pad


class MidasApp:
class DepthEstimationApp:
"""
This class is required to perform end to end inference for Depth Estimation
The app uses 2 models:
* Midas
* DepthAnything
For a given image input, the app will:
* pre-process the image (convert to range[0, 1])
* Run DepthAnything inference
* Convert the depth into visual representation(heatmap) and return as image
"""

    def __init__(
        self,
        model: Callable[[torch.Tensor], torch.Tensor],
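A minimal usage sketch of the renamed DepthEstimationApp, mirroring the constructor and estimate_depth calls made by the shared demo below; the stand-in model, input size, and image path are illustrative assumptions, and a real depth model from the repo would normally be passed in:

import torch

from qai_hub_models.models._shared.depth_estimation.app import DepthEstimationApp
from qai_hub_models.utils.asset_loaders import load_image


def dummy_depth_model(image: torch.Tensor) -> torch.Tensor:
    # Stand-in for a real depth model: any callable mapping an image tensor
    # to a single-channel depth tensor, matching the Callable type above.
    return image.mean(dim=1, keepdim=True)


image = load_image("input.jpg")  # illustrative local path or URL
app = DepthEstimationApp(dummy_depth_model, 256, 256)  # (model, height, width), as in the demo
heatmap = app.estimate_depth(image)  # depth rendered as a heatmap image, per the docstring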
49 changes: 49 additions & 0 deletions qai_hub_models/models/_shared/depth_estimation/demo.py
@@ -0,0 +1,49 @@
# ---------------------------------------------------------------------
# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
# ---------------------------------------------------------------------

from qai_hub_models.models._shared.depth_estimation.app import DepthEstimationApp
from qai_hub_models.utils.args import (
    demo_model_from_cli_args,
    get_model_cli_parser,
    get_on_device_demo_parser,
    validate_on_device_demo_args,
)
from qai_hub_models.utils.asset_loaders import CachedWebModelAsset, load_image
from qai_hub_models.utils.base_model import BaseModel
from qai_hub_models.utils.display import display_or_save_image


# The demo will display a heatmap of the estimated depth at each point in the image.
def depth_estimation_demo(
    model_cls: type[BaseModel],
    model_id,
    default_image: CachedWebModelAsset,
    is_test: bool = False,
):
    parser = get_model_cli_parser(model_cls)
    parser = get_on_device_demo_parser(parser, add_output_dir=True)
    parser.add_argument(
        "--image",
        type=str,
        default=default_image,
        help="image file path or URL",
    )
    args = parser.parse_args([] if is_test else None)
    model = demo_model_from_cli_args(model_cls, model_id, args)
    validate_on_device_demo_args(args, model_id)

    # Load image
    (_, _, height, width) = model_cls.get_input_spec()["image"][0]
    image = load_image(args.image)
    print("Model Loaded")

    app = DepthEstimationApp(model, height, width)
    heatmap_image = app.estimate_depth(image)

    if not is_test:
        # Display or save the depth heatmap
        display_or_save_image(
            heatmap_image, args.output_dir, "out_heatmap.png", "heatmap"
        )
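For context, a sketch of how a concrete model's demo.py might reuse this shared helper; the Midas module layout and asset filename are assumptions that follow the MODEL_ID / MODEL_ASSET_VERSION pattern used elsewhere in this commit:

from qai_hub_models.models._shared.depth_estimation.demo import depth_estimation_demo
from qai_hub_models.models.midas.model import MODEL_ASSET_VERSION, MODEL_ID, Midas
from qai_hub_models.utils.asset_loaders import CachedWebModelAsset

# Default demo input hosted in the asset store (filename is illustrative).
INPUT_IMAGE_ADDRESS = CachedWebModelAsset.from_asset_store(
    MODEL_ID, MODEL_ASSET_VERSION, "test_input_image.jpg"
)


def main(is_test: bool = False):
    depth_estimation_demo(Midas, MODEL_ID, INPUT_IMAGE_ADDRESS, is_test)


if __name__ == "__main__":
    main()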
57 changes: 57 additions & 0 deletions qai_hub_models/models/_shared/face_attrib_net/demo.py
@@ -0,0 +1,57 @@
# ---------------------------------------------------------------------
# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
# ---------------------------------------------------------------------
import json
from pathlib import Path

from qai_hub_models.models._shared.face_attrib_net.app import FaceAttribNetApp
from qai_hub_models.models.face_attrib_net.model import (
    MODEL_ASSET_VERSION,
    MODEL_ID,
    OUT_NAMES,
    FaceAttribNet,
)
from qai_hub_models.utils.args import (
    demo_model_from_cli_args,
    get_model_cli_parser,
    get_on_device_demo_parser,
    validate_on_device_demo_args,
)
from qai_hub_models.utils.asset_loaders import CachedWebModelAsset, load_image

INPUT_IMAGE_ADDRESS = CachedWebModelAsset.from_asset_store(
    MODEL_ID, MODEL_ASSET_VERSION, "img_sample.bmp"
)


# Run FaceAttribNet end-to-end on a sample image.
def face_attrib_net_demo(model_cls: type[FaceAttribNet], is_test: bool = False):
    # Demo parameters
    parser = get_model_cli_parser(model_cls)
    parser = get_on_device_demo_parser(parser, add_output_dir=True)
    parser.add_argument(
        "--image",
        type=str,
        default=INPUT_IMAGE_ADDRESS,
        help="image file path or URL",
    )
    args = parser.parse_args([] if is_test else None)
    model = demo_model_from_cli_args(model_cls, MODEL_ID, args)
    validate_on_device_demo_args(args, MODEL_ID)

    # Load image
    _, _, height, width = model_cls.get_input_spec()["image"][0]
    orig_image = load_image(args.image)
    print("Model loaded")

    app = FaceAttribNetApp(model)
    output = app.run_inference_on_image(orig_image)
    out_dict = {}
    for i in range(len(output)):
        out_dict[OUT_NAMES[i]] = list(output[i].astype(float))

    output_path = (args.output_dir or str(Path() / "build")) + "/output.json"
    with open(output_path, "w", encoding="utf-8") as wf:
        json.dump(out_dict, wf, ensure_ascii=False, indent=4)
    print(f"Model outputs are saved at: {output_path}")