Unsupervised grammar inference using the minimum description length principle

MLDM'12: Proceedings of the 8th International Conference on Machine Learning and Data Mining in Pattern Recognition (2012)

Abstract
Context-free grammars (CFGs) are widely used in programming language descriptions, natural language processing, compilers, and other areas of software engineering that need to describe the syntactic structure of programs. Grammar inference (GI), the induction of CFGs from sample programs, is a challenging problem. We describe an unsupervised GI approach that uses simplicity as the criterion for directing the inference process and beam search for moving from a complex grammar to a simpler one. We use several operators to modify a grammar and the Minimum Description Length (MDL) principle to favor simple and compact grammars. The effectiveness of this approach is shown by a case study of a domain-specific language. The experimental results show that an accurate grammar can be inferred in a reasonable amount of time.
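The search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the grammar representation, the cost function `grammar_bits`, the single `merge` operator, and the toy grammar are all assumptions made for illustration. The cost covers only the grammar term of a description length; a full MDL score would also add the cost of encoding the samples given the grammar. That term is left out here because the only operator used merges nonterminals with identical rule sets, which does not change the generated language.

```python
import math
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical representation: nonterminal -> set of right-hand sides.
Rhs = Tuple[str, ...]
Grammar = Dict[str, FrozenSet[Rhs]]


def grammar_bits(g: Grammar) -> float:
    """Toy grammar cost: every token costs log2(number of distinct symbols) bits."""
    symbols: Set[str] = set(g)
    for rhss in g.values():
        for rhs in rhss:
            symbols.update(rhs)
    bits_per_token = math.log2(max(len(symbols), 2))
    # Each rule is charged for its left-hand side, its body, and an end marker.
    tokens = sum(len(rhs) + 2 for rhss in g.values() for rhs in rhss)
    return tokens * bits_per_token


def merge(g: Grammar, keep: str, drop: str) -> Grammar:
    """Rename `drop` to `keep` everywhere and pool the rules of both."""
    def rename(sym: str) -> str:
        return keep if sym == drop else sym

    pooled: Dict[str, Set[Rhs]] = {}
    for lhs, rhss in g.items():
        pooled.setdefault(rename(lhs), set()).update(
            tuple(rename(s) for s in rhs) for rhs in rhss
        )
    return {lhs: frozenset(rhss) for lhs, rhss in pooled.items()}


def neighbors(g: Grammar, start: str = "S") -> List[Grammar]:
    """Apply the one operator used here: merge nonterminals with identical rules."""
    nts = sorted(n for n in g if n != start)
    return [
        merge(g, a, b)
        for i, a in enumerate(nts)
        for b in nts[i + 1:]
        if g[a] == g[b]
    ]


def beam_search(initial: Grammar, beam_width: int = 3, max_steps: int = 10) -> Grammar:
    """Keep the `beam_width` cheapest grammars per step; stop when cost stops dropping."""
    beam, best = [initial], initial
    for _ in range(max_steps):
        candidates = [n for g in beam for n in neighbors(g)]
        if not candidates:
            break
        candidates.sort(key=grammar_bits)
        beam = candidates[:beam_width]
        if grammar_bits(beam[0]) >= grammar_bits(best):
            break
        best = beam[0]
    return best


# Toy input (invented): a redundant grammar with duplicated nonterminals.
toy: Grammar = {
    "S": frozenset({("A", "end"), ("B", "end")}),
    "A": frozenset({("N1",)}),
    "B": frozenset({("N2",)}),
    "N1": frozenset({("num",)}),
    "N2": frozenset({("num",)}),
}
simplified = beam_search(toy)
```

On the toy grammar the search first merges `N1` and `N2`, which makes `A` and `B` identical, and then merges those as well. Operators that generalize the grammar would need the omitted data term to stop the search from collapsing every nonterminal into one.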
Keywords
context free grammars, unsupervised gi approach, inference process, programming language description, natural language processing, accurate grammar, unsupervised grammar inference, compact grammar, simpler grammar, minimum description length principle, domain specific language, grammar inference, context free grammar, minimum description length, unsupervised learning