Why Initialization Matters for IBM Model 1: Multiple Optima and Non-Strict Convexity

HLT '11: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: Short Papers - Volume 2 (2011)

Abstract
Contrary to popular belief, we show that the optimal parameters for IBM Model 1 are not unique. We demonstrate that, for a large class of words, IBM Model 1 is indifferent among a continuum of ways to allocate probability mass to their translations. We study the magnitude of the variance in optimal model parameters using a linear programming approach as well as multiple random trials, and demonstrate that it results in variance in test set log-likelihood and alignment error rate.
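The non-uniqueness claim is easy to reproduce empirically. Below is a minimal sketch (not the authors' code) that trains IBM Model 1 with EM from several random initializations on a hypothetical toy corpus; the corpus, seeds, and iteration count are illustrative assumptions, not taken from the paper. Because the Model 1 likelihood is convex but not strictly convex, different seeds can converge to different translation tables that achieve (near-)identical training likelihood.

```python
# Minimal sketch (not the authors' code): IBM Model 1 trained by EM from
# different random initializations on a hypothetical toy corpus.
import math
import random
from collections import defaultdict

corpus = [
    (["the", "house"], ["das", "haus"]),
    (["the", "book"], ["das", "buch"]),
    # "rare" and "word" occur only in this pair, so Model 1 is indifferent
    # among a continuum of ways to split their probability mass between
    # the two translations: any split with
    # t(seltenes|rare) + t(seltenes|word) = 1 attains the same likelihood.
    (["rare", "word"], ["seltenes", "wort"]),
]

def train_model1(corpus, iters=50, seed=0):
    rng = random.Random(seed)
    # Random (non-uniform) initialization of t(f|e) over co-occurring pairs.
    t = defaultdict(dict)
    for es, fs in corpus:
        for e in es:
            for f in fs:
                t[e].setdefault(f, rng.random())
    for e in t:  # normalize each t(.|e) into a distribution
        z = sum(t[e].values())
        t[e] = {f: p / z for f, p in t[e].items()}
    for _ in range(iters):
        count = defaultdict(lambda: defaultdict(float))
        for es, fs in corpus:  # E-step: expected alignment counts
            for f in fs:
                z = sum(t[e][f] for e in es)
                for e in es:
                    count[e][f] += t[e][f] / z
        for e in count:  # M-step: renormalize expected counts
            z = sum(count[e].values())
            t[e] = {f: c / z for f, c in count[e].items()}
    return t

def loglik(corpus, t):
    # Model 1 log-likelihood, dropping length-dependent constants.
    return sum(math.log(sum(t[e][f] for e in es))
               for es, fs in corpus for f in fs)

for seed in (0, 1, 2):
    t = train_model1(corpus, seed=seed)
    rare = {f: round(p, 3) for f, p in t["rare"].items()}
    print(f"seed={seed}  loglik={loglik(corpus, t):.4f}  t(.|rare)={rare}")
```

On this toy corpus the runs report near-identical log-likelihoods but different t(.|rare) tables, mirroring the paper's point that the optimum is a continuum rather than a single point, so initialization determines which optimal parameters EM reaches.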
Keywords
IBM Model 1, optimal model parameters, alignment error rate, linear programming, multiple random trials, probability mass, test-set log-likelihood, initialization, multiple optima, non-strict convexity