Hello, I’ve been following your work, and I find it particularly interesting and helpful. May I ask two questions?
1. I'd like to learn more about how your work compares to the approach used in WF-VAE (https://github.com/PKU-YuanGroup/WF-VAE). Would you be willing to share some insights on this?
2. How does the inference speed compare?
We have tested the open-sourced WF-VAE model, and the comparison results under the same setting (causal model, video compression ratio: 4×8×8, input shape: 17×256×256, testing data: MCL-JCV, 30 FPS) are as follows:
| Method | Param. | PSNR | SSIM | LPIPS | FVD |
|---|---|---|---|---|---|
| WF-VAE-L-16chn | 317M | 33.76 | 0.928 | 0.091 | 90.8 |
| VidTok-16chn | 157M | 35.04 | 0.942 | 0.047 | 78.9 |
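For reference, a minimal sketch of how per-clip reconstruction metrics (PSNR/SSIM) can be computed for a 17×256×256 evaluation clip is shown below. The `tokenizer` object and its `encode`/`decode` interface are placeholders, not the actual VidTok or WF-VAE API; the metrics here come from `torchmetrics`.

```python
# Hedged sketch: compute PSNR/SSIM for one reconstructed clip.
# `tokenizer.encode` / `tokenizer.decode` are assumed interfaces, not the real API.
import torch
from torchmetrics.image import PeakSignalNoiseRatio, StructuralSimilarityIndexMeasure

psnr = PeakSignalNoiseRatio(data_range=1.0)
ssim = StructuralSimilarityIndexMeasure(data_range=1.0)

@torch.no_grad()
def evaluate_clip(tokenizer, clip):
    # clip: (1, 3, 17, 256, 256) float tensor in [0, 1]
    recon = tokenizer.decode(tokenizer.encode(clip))
    # Metrics are computed frame-wise: fold the temporal axis into the batch axis.
    b, c, t, h, w = clip.shape
    pred = recon.permute(0, 2, 1, 3, 4).reshape(b * t, c, h, w).clamp(0, 1)
    target = clip.permute(0, 2, 1, 3, 4).reshape(b * t, c, h, w)
    return psnr(pred, target).item(), ssim(pred, target).item()
```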
As for inference speed, we will share benchmark results in a later update.
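In the meantime, here is a minimal sketch of how single-clip inference latency could be measured on GPU with CUDA events; again, `tokenizer` and its `encode`/`decode` calls are placeholders for whichever model is being timed.

```python
# Hedged sketch: average per-clip reconstruction latency in milliseconds.
import torch

@torch.no_grad()
def time_reconstruction(tokenizer, clip, warmup=3, iters=10):
    clip = clip.cuda()
    for _ in range(warmup):  # warm-up runs to exclude CUDA init and kernel autotuning cost
        tokenizer.decode(tokenizer.encode(clip))
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    torch.cuda.synchronize()
    start.record()
    for _ in range(iters):
        tokenizer.decode(tokenizer.encode(clip))
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters  # average milliseconds per clip
```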