Paraphrase and Solve: Exploring and Exploiting the Impact of Surface Form on Mathematical Reasoning in Large Language Models
CoRR (2024)
Abstract
This paper studies the relationship between the surface form of a
mathematical problem and its solvability by large language models. We find that
subtle alterations in the surface form can significantly impact the answer
distribution and the solve rate, exposing the language model's lack of
robustness, and its sensitivity to surface form, when reasoning through complex
problems. To improve mathematical reasoning performance, we propose
Self-Consistency-over-Paraphrases (SCoP), which diversifies reasoning paths
from specific surface forms of the problem. We evaluate our approach on four
mathematics reasoning benchmarks over three large language models and show that
SCoP improves mathematical reasoning performance over vanilla self-consistency,
particularly for problems initially deemed unsolvable. Finally, we provide
additional experiments and discussion regarding problem difficulty and surface
forms, including cross-model difficulty agreement and paraphrasing
transferability, and Variance of Variations (VOV) for language model
evaluation.
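
The idea behind SCoP, as the abstract describes it, can be illustrated with a minimal sketch: generate several paraphrases of a problem, sample reasoning paths from each paraphrase, and take a majority vote over all final answers. This is not the authors' implementation. The function name `scop_vote`, the callables `paraphrase` and `sample_answer`, and the default counts are all hypothetical stand-ins for whatever language model calls and settings are actually used, and whether the original wording is kept alongside the paraphrases is left out as an assumption.

```python
from collections import Counter
from typing import Callable


def scop_vote(
    problem: str,
    paraphrase: Callable[[str], str],      # hypothetical: returns one paraphrase of the problem
    sample_answer: Callable[[str], str],   # hypothetical: samples one reasoning path, returns its final answer
    num_paraphrases: int = 4,              # assumed value, not taken from the paper
    samples_per_paraphrase: int = 10,      # assumed value, not taken from the paper
) -> str:
    """Majority vote over answers sampled from several surface forms of one problem."""
    # Step 1: produce several surface forms of the same underlying problem.
    surface_forms = [paraphrase(problem) for _ in range(num_paraphrases)]

    # Step 2: sample reasoning paths from each surface form and collect final answers.
    answers = []
    for form in surface_forms:
        for _ in range(samples_per_paraphrase):
            answers.append(sample_answer(form))

    # Step 3: aggregate by majority vote, as in vanilla self-consistency.
    return Counter(answers).most_common(1)[0][0]
```

Vanilla self-consistency corresponds to the special case where every sample is drawn from the original wording; the sketch differs only in spreading the same sampling budget across paraphrased surface forms before voting.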