Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset #9
harshraj22 started this conversation in Paper Review
Inflated 3D ConvNet 📓
CVPR 2017
The authors tackle video action classification by reusing state-of-the-art image classification architectures. They use two streams (RGB frames and optical flow), pass each through its own I3D network, and combine the two outputs for the final classification. In the paper, the two networks are trained separately and their predictions are averaged at test time.
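The I3D architecture is the same as Inception-V1 (GoogLeNet), with an extra temporal dimension added to inflate the 2D filters and pooling kernels into 3D. The pretrained 2D weights are repeated N times along the time dimension and rescaled by 1/N, so that a video made of one repeated frame produces the same activations as the original 2D model does on that frame.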
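To make the weight inflation concrete, here is a minimal NumPy sketch, not the authors' code. The function name `inflate_kernel`, the `(out, in, kh, kw)` weight layout, and the random toy shapes are my own assumptions for illustration. It repeats a 2D kernel along a new time axis, divides by the temporal size as described in the paper, and checks that a "boring" video (one frame repeated) gives the same response as the 2D kernel on a single frame.

```python
import numpy as np

def inflate_kernel(w2d: np.ndarray, time_dim: int) -> np.ndarray:
    """Inflate a 2D conv kernel (out, in, kh, kw) to 3D (out, in, t, kh, kw).

    Hypothetical helper: copies the 2D weights time_dim times along a new
    temporal axis and rescales by 1 / time_dim.
    """
    w3d = np.repeat(w2d[:, :, np.newaxis, :, :], time_dim, axis=2)
    return w3d / time_dim

# Toy shapes (assumed): 4 output channels, 3 input channels, 7x7 kernel.
rng = np.random.default_rng(0)
w2d = rng.standard_normal((4, 3, 7, 7))
w3d = inflate_kernel(w2d, time_dim=7)

# One image patch the size of the kernel, and the same patch repeated
# over 7 frames to form a static ("boring") video patch.
patch = rng.standard_normal((3, 7, 7))                          # (in, h, w)
video_patch = np.repeat(patch[:, np.newaxis, :, :], 7, axis=1)  # (in, t, h, w)

# Filter response at a single location: sum of elementwise products.
resp2d = np.einsum("oihw,ihw->o", w2d, patch)
resp3d = np.einsum("oithw,ithw->o", w3d, video_patch)

print(np.allclose(resp2d, resp3d))
```

Without the 1/N rescaling, the 3D response on the static video would be N times larger than the 2D response, so the pretrained activation statistics would no longer carry over.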