Automated Cognate Detection as a Supervised Link Prediction Task with Cognate Transformer
Conference of the European Chapter of the Association for Computational Linguistics (2024)
Abstract
Identification of cognates across related languages is one of the primary
problems in historical linguistics. Automated cognate identification is helpful
for several downstream tasks including identifying sound correspondences,
proto-language reconstruction, phylogenetic classification, etc. Previous
state-of-the-art methods for cognate identification are mostly based on
distributions of phonemes computed across multilingual wordlists and make
little use of the cognacy labels that define links among cognate clusters. In
this paper, we present a transformer-based architecture inspired by
computational biology for the task of automated cognate detection. Beyond a
certain amount of supervision, this method performs better than the existing
methods, and shows steady improvement with further increase in supervision,
thereby proving the efficacy of utilizing the labeled information. We also
demonstrate that accepting multiple sequence alignments as input and having an
end-to-end architecture with a link prediction head saves much computation time
while simultaneously yielding superior performance.
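
To make the link-prediction framing concrete, the sketch below shows one way cognacy labels could be turned into pairwise link targets: two words for the same concept are linked when they share a cognate class. This is a minimal illustration and not the paper's implementation; the function name `cognate_links`, the tuple layout, and the toy wordlist (with simplified transcriptions) are assumptions introduced here.

```python
from itertools import combinations


def cognate_links(entries):
    """Derive pairwise link labels from cognate class labels.

    entries: list of (language, form, cognate_class) tuples for one concept.
    Returns a dict mapping index pairs (i, j), i < j, to 1 if the two
    words share a cognate class and 0 otherwise.
    """
    links = {}
    for (i, a), (j, b) in combinations(enumerate(entries), 2):
        links[(i, j)] = int(a[2] == b[2])
    return links


# Hypothetical toy wordlist for the concept "hand" (illustrative only).
entries = [
    ("English", "hand", 1),
    ("German", "hant", 1),
    ("French", "mɛ̃", 2),
    ("Spanish", "mano", 2),
]

links = cognate_links(entries)
for (i, j), label in links.items():
    print(entries[i][0], entries[j][0], label)
```

Under this framing, a supervised model would be trained to predict these 0/1 labels for word pairs, and cognate clusters could then be recovered from the predicted links.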