Integrations into popular evaluation frameworks like lmms_eval or vlmevalkit #2
Thank you for the suggestion! Yes, we will integrate our benchmark into lmms_eval and VLMEvalKit. We will work on this soon, after adding the results of some recent VLMs.
Looking forward to that!
@Violettttee Thanks for your questions.
@woodfrog Hi~ For image tasks the input looks like text = "This is the first example.<image>This is the second example <image>", and for video tasks it seems to be text = "This is the first example.<image><image><image>.......<image>This is the second example<image>......". So, under this circumstance, for video tasks it means we must load the video and extract all the frames in order to add the tokens to the text. That seems to take a lot of memory and time.
@Violettttee Thanks for the clarification. So the question is specifically about video tasks, am I right? In our evaluation pipeline, we indeed read the video and do frame sub-sampling based on per-model hyper-parameters (models with a larger context window size get a larger sampling rate). We didn't pre-convert a video into a list of images, mainly for two reasons: 1) if the pre-defined sampling rate is too large, the resulting image sequence becomes huge; 2) if the pre-defined sampling rate is too small, we might lose critical temporal information, which is also unfair to models with long context windows. Maybe you can follow our sub-sampling pipeline to prepare the data? Feel free to ask if you have any further questions.
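To make the sub-sampling idea concrete, here is a minimal sketch in Python. The helper names (`subsample_indices`, `build_prompt`) and the `frame_budget` parameter are illustrative assumptions, not the repository's actual API; `frame_budget` stands in for the per-model, context-window-dependent sampling rate described in the comment.

```python
# Hedged sketch: pick an evenly spaced subset of frame indices based on a
# per-model frame budget, then build an interleaved prompt with one <image>
# token per kept frame. Helper names here are hypothetical.

def subsample_indices(num_frames: int, frame_budget: int) -> list[int]:
    """Return up to `frame_budget` frame indices, evenly spaced over the video."""
    if num_frames <= frame_budget:
        return list(range(num_frames))
    step = num_frames / frame_budget
    return [int(i * step) for i in range(frame_budget)]

def build_prompt(prefix: str, num_kept_frames: int, suffix: str) -> str:
    """Insert one <image> placeholder per sampled frame between text segments."""
    return prefix + "<image>" * num_kept_frames + suffix

# Example: a 300-frame video under a budget of 8 frames.
indices = subsample_indices(num_frames=300, frame_budget=8)
prompt = build_prompt("This is the first example.", len(indices),
                      "This is the second example.")
```

Only the sampled frames need to be decoded and held in memory, which avoids materializing every frame of the video up front.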
Thank you for your great work!
I wonder if it could be integrated into popular evaluation frameworks like lmms_eval or VLMEvalKit for easier use by everyone?