Common Data Processing Code for Multimodal Scene Understanding

Contents

- YouTube video download [VideoDownload]
- Extract video frames [Video_Frames_Extraction]
- Feature extraction [FeatureExtraction]
  - ImageBind