ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning
CoRR (2024)
Abstract
Recently, many versatile Multi-modal Large Language Models (MLLMs) have
emerged. However, their capacity to query information depicted in
visual charts and to reason over the queried contents remains
under-explored. In this paper, to comprehensively and rigorously benchmark the
ability of the off-the-shelf MLLMs in the chart domain, we construct ChartX, a
multi-modal evaluation set covering 18 chart types, 7 chart tasks, 22
disciplinary topics, and high-quality chart data. Besides, we develop ChartVLM
to offer a new perspective on handling multi-modal tasks that strongly depend
on interpretable patterns, such as reasoning tasks in the field of charts or
geometric images. We evaluate the chart-related ability of mainstream MLLMs and
our ChartVLM on the proposed ChartX evaluation set. Extensive experiments
demonstrate that ChartVLM surpasses both versatile and chart-related large
models, achieving results comparable to GPT-4V. We believe that our study can
pave the way for further exploration in creating a more comprehensive chart
evaluation set and developing more interpretable multi-modal models. Both
ChartX and ChartVLM are available at:
https://github.com/UniModal4Reasoning/ChartVLM