VidComposition: Can MLLMs Analyze Compositions in Compiled Videos?

The advancement of Multimodal Large Language Models (MLLMs) has enabled significant progress in multimodal understanding, expanding their capacity to analyze video content. However, existing evaluation benchmarks for MLLMs primarily focus on abstract video comprehension, lacking a detailed assessmen...

Bibliographic Details
Main Authors: Tang, Yunlong; Guo, Junjia; Hua, Hang; Liang, Susan; Feng, Mingqian; Li, Xinyang; Mao, Rui; Huang, Chao; Bi, Jing; Zhang, Zeliang; Fazli, Pooyan; Xu, Chenliang
Format: Journal Article
Language: English
Published: 17-11-2024