Why Can Large Language Models Generate Correct Chain-of-Thoughts?

CoRR (2023)

Abstract
This paper delves into the capabilities of large language models (LLMs), focusing on advancing the theoretical understanding of chain-of-thought prompting. We investigate how LLMs can be effectively induced to generate a coherent chain of thoughts. To achieve this, we introduce a two-level hierarchical graphical model tailored to natural language generation. Within this framework, we establish a geometric convergence rate that gauges the likelihood of an LLM-generated chain of thoughts relative to one originating from the true language. Our findings provide a theoretical justification for the ability of LLMs to produce the correct sequence of thoughts, (potentially) explaining performance gains on tasks that demand reasoning skills.
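To make the shape of such a guarantee concrete, the following display is a minimal illustrative sketch, not the paper's actual theorem: the discrepancy measure $d$, the constants $C$ and $\gamma$, and the use of the chain length $T$ as the rate parameter are all assumed notation.

% Illustrative only: d, C, gamma, T, p_LLM, and p* are assumed notation,
% not the paper's exact statement.
\[
  d\bigl(p_{\mathrm{LLM}}(s_{1:T}),\; p^{*}(s_{1:T})\bigr) \;\le\; C\,\gamma^{T},
  \qquad C > 0,\; 0 < \gamma < 1.
\]

Here $p_{\mathrm{LLM}}$ and $p^{*}$ would denote the distributions over a chain of thoughts $s_{1:T}$ under the LLM and the true language, respectively; a bound of this geometric form would imply that the likelihood gap shrinks rapidly, consistent with the abstract's claim.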
Keywords
large language models, language models, chain-of-thoughts