Deep Aggregation Vision Transformer for Infant Pose Estimation

Introduction

Movement and pose assessment of newborns lets trained pediatricians predict neurodevelopmental disorders, allowing early intervention for related diseases. However, most recent approaches to human pose estimation focus on adults, and infant pose estimation lacks both a large-scale public dataset and a powerful deep learning framework. In this paper, we fill this gap by proposing the Deep Aggregation Vision Transformer for human (infant) pose estimation (AggPose), a high-resolution transformer framework that extracts features in the early stages without using convolution operations. It generalizes Transformer + MLP to multi-scale deep layer aggregation within feature maps, enabling information fusion between different levels of vision tokens. We pre-train AggPose on COCO pose estimation and apply it to our newly released large-scale infant pose estimation dataset. The results show that AggPose effectively learns multi-scale features across different resolutions and significantly improves performance.
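To make the idea of fusing vision tokens across resolutions more concrete, here is a toy sketch in plain Python. It is not the AggPose implementation: it only illustrates one simple way to combine a high-resolution and a low-resolution token map, using nearest-neighbor upsampling followed by a per-token linear mixing. The function names (`upsample_nearest`, `linear`, `fuse`), the shapes, and the weights are all hypothetical and chosen for illustration.

```python
# Toy sketch only: NOT the AggPose layers, just an illustration of
# fusing token maps from two resolutions.
# Feature maps are nested lists shaped [height][width][channels].


def upsample_nearest(fmap, factor):
    """Repeat every token `factor` times along height and width."""
    out = []
    for row in fmap:
        wide_row = [list(tok) for tok in row for _ in range(factor)]
        for _ in range(factor):
            out.append([list(tok) for tok in wide_row])
    return out


def linear(token, weight, bias):
    """Per-token linear layer: weight is [out][in], bias is [out]."""
    return [
        sum(w * x for w, x in zip(w_row, token)) + b
        for w_row, b in zip(weight, bias)
    ]


def fuse(high_res, low_res, weight, bias):
    """Concatenate aligned tokens from both scales, then mix them per token."""
    factor = len(high_res) // len(low_res)
    up = upsample_nearest(low_res, factor)
    return [
        [linear(h_tok + l_tok, weight, bias) for h_tok, l_tok in zip(h_row, l_row)]
        for h_row, l_row in zip(high_res, up)
    ]


# Hypothetical inputs: a 4x4 map and a 2x2 map, one channel each.
high = [[[float(r * 4 + c)] for c in range(4)] for r in range(4)]
low = [[[10.0], [20.0]], [[30.0], [40.0]]]
weight = [[1.0, 0.5]]  # 1 output channel from 2 concatenated input channels
bias = [0.0]

fused = fuse(high, low, weight, bias)
```

The fused map keeps the high-resolution grid (4x4 here), and every output token depends on both its own high-resolution value and the coarser token that covers it. A real model would use learned weights, attention, and more than two scales.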

This work was accepted to the IJCAI-ECAI 2022 AI for Good Track.

Model: Google Drive

The modified code will be released soon.
