Is TCN suitable for spatio-temporal data? #73
Comments
I have a similar question. I want to use a TCN for video data. Anyone have any ideas? |
I have found a solution using an encoder. |
I was wondering if you could share the solution? Thanks, |
Did you find a solution? |
Just use any encoder and set the TCN's input channels to the encoder's output dim for one time step. For example, if you have some CNN model that takes images (n_imgs, 112, 112) and outputs (n_imgs, channels), you simply feed that output into the TCN, making sure that n_channels = channels and that n_imgs is the sequence length, not the channels (possibly requiring reshaping). Lmk if that makes sense. |
You are correct in saying that we can use any CNN backbone initially to transform the input images (n_imgs, W, H, C) into (n_imgs, W', H', C'), where W', H', and C' are the dimensions of the last feature map. To collapse the spatial dimensions W' and H', we can employ either flattening or global average pooling (which is recommended), so that the shape becomes (n_imgs, C'). Afterward, we can feed the transformed data into the TCN, with C' as the channels and n_imgs as the sequence length. Please let me know if you need any further clarification. |
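To make the shape bookkeeping concrete, here is a rough, dependency-free sketch of the steps described in the comments above. It is only an illustration under assumptions: the function names (`global_average_pool`, `frames_to_tcn_input`, `causal_dilated_conv1d`) and the toy feature maps are hypothetical, plain Python lists stand in for framework tensors, and a single causal dilated convolution stands in for a full TCN (which would stack several such layers with residual connections). In practice you would do the same thing with your framework's pooling and 1D convolution layers.

```python
# Shape bookkeeping only, using plain Python lists (no framework assumed).
# All names and the toy data below are hypothetical, for illustration.

def global_average_pool(feature_map):
    """Collapse one frame's (W', H', C') feature map into a length-C' vector."""
    n_positions = len(feature_map) * len(feature_map[0])
    channels = len(feature_map[0][0])
    totals = [0.0] * channels
    for row in feature_map:
        for pixel in row:
            for c, value in enumerate(pixel):
                totals[c] += value
    return [t / n_positions for t in totals]


def frames_to_tcn_input(feature_maps):
    """Turn n_imgs feature maps into a (C', n_imgs) array: channels first, time last."""
    pooled = [global_average_pool(fm) for fm in feature_maps]  # (n_imgs, C')
    return [list(channel) for channel in zip(*pooled)]         # (C', n_imgs)


def causal_dilated_conv1d(sequence, weights, dilation=1):
    """One causal dilated 1D convolution producing a single output channel.

    sequence: (C_in, T), weights: (C_in, kernel_size).
    Inputs before t = 0 are treated as zero, so the output at time t
    depends only on inputs at times <= t.
    """
    n_steps = len(sequence[0])
    kernel_size = len(weights[0])
    output = []
    for t in range(n_steps):
        acc = 0.0
        for c, channel in enumerate(sequence):
            for k in range(kernel_size):
                # The last kernel tap sees time t, earlier taps look further back.
                source = t - (kernel_size - 1 - k) * dilation
                if source >= 0:
                    acc += weights[c][k] * channel[source]
        output.append(acc)
    return output


# Toy stand-in for encoder output: 4 frames, each a 2x2 map with 3 channels.
feature_maps = [
    [[[float(t + c) for c in range(3)] for _ in range(2)] for _ in range(2)]
    for t in range(4)
]

tcn_input = frames_to_tcn_input(feature_maps)
print(len(tcn_input), len(tcn_input[0]))  # channels C' = 3, length n_imgs = 4

weights = [[1.0, 1.0] for _ in range(3)]  # (C_in = 3, kernel_size = 2)
out = causal_dilated_conv1d(tcn_input, weights, dilation=2)
print(out)
```

The point to take away is the axis order: after pooling, the frame index becomes the sequence (time) axis and C' becomes the channel axis, which is the layout a 1D convolution over time expects.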
How does it perform? |
I have dimensional spatio-temporal data in which the spatial part is represented by 2D matrices (like an RGB image). How can I feed the data to the TCN?