Entity Translation Mining from Comparable Corpora: Combining Graph Mapping with Corpus Latent Features

IEEE Transactions on Knowledge and Data Engineering (2013)

Abstract
This paper addresses the problem of mining named entity translations from comparable corpora, specifically English and Chinese named entity translations. We first observe that existing approaches use one or more of the following named entity similarity metrics: entity, entity context, and relationship. Motivated by this observation, we propose a new holistic approach that 1) combines all similarity types used and 2) additionally considers relationship context similarity between pairs of named entities, a missing quadrant in the taxonomy of similarity metrics. We abstract the named entity translation problem as the matching of two named entity graphs extracted from the comparable corpora. Specifically, named entity graphs are first constructed from the comparable corpora to extract relationships between named entities. Entity similarity and entity context similarity are then calculated for every pair of bilingual named entities. A reinforcing method is then used to reflect relationship similarity and relationship context similarity between named entities. We also discover "latent" features lost in the graph extraction process and integrate them into our framework. According to our experimental results, our holistic graph-based approach and its enhancement using corpus latent features are highly effective, and our framework significantly outperforms previous approaches.
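The abstract does not give the update rule of the reinforcing method, so the following is only a minimal sketch of the general idea under stated assumptions: each language has an entity graph (adjacency lists of co-occurring entities), an initial score per bilingual pair stands in for entity and entity context similarity, and a pair's score is repeatedly blended with how well its neighbours match. The function names `reinforce` and `best_match`, the blending weight `lam`, the mean-of-best-neighbour rule, and the toy graph are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Graph = Dict[str, List[str]]        # entity -> entities it is related to
Sim = Dict[Tuple[str, str], float]  # (english, chinese) -> score in [0, 1]


def reinforce(en_graph: Graph, zh_graph: Graph, base_sim: Sim,
              lam: float = 0.5, iterations: int = 10) -> Sim:
    """Propagate similarity along graph edges (illustrative rule, an assumption).

    Each round, the score of a pair (e, z) becomes a convex combination of its
    initial score and the neighbour support: for every neighbour u of e, take
    the best current score against any neighbour v of z, then average over u.
    """
    pairs = [(e, z) for e in en_graph for z in zh_graph]
    scores = {p: base_sim.get(p, 0.0) for p in pairs}
    for _ in range(iterations):
        updated = {}
        for e, z in pairs:
            en_nb, zh_nb = en_graph[e], zh_graph[z]
            if en_nb and zh_nb:
                # Best-matching Chinese neighbour for each English neighbour.
                best = [max(scores.get((u, v), 0.0) for v in zh_nb) for u in en_nb]
                support = sum(best) / len(best)
            else:
                support = 0.0  # no relationships to compare
            updated[(e, z)] = (1 - lam) * base_sim.get((e, z), 0.0) + lam * support
        scores = updated
    return scores


def best_match(scores: Sim, english: str) -> str:
    """Return the Chinese entity with the highest score for `english`."""
    candidates = {z: s for (e, z), s in scores.items() if e == english}
    return max(candidates, key=candidates.get)


# Toy data (hypothetical): "Clinton" has a confident initial match, while
# "Obama" is ambiguous before relationships are taken into account.
en_graph = {"Obama": ["Clinton"], "Clinton": ["Obama"], "Paris": []}
zh_graph = {"奥巴马": ["克林顿"], "克林顿": ["奥巴马"], "巴黎": []}
base_sim = {("Clinton", "克林顿"): 0.9,
            ("Obama", "奥巴马"): 0.2,
            ("Obama", "巴黎"): 0.2}

scores = reinforce(en_graph, zh_graph, base_sim)
print(best_match(scores, "Obama"))
```

In this toy setup the two candidates for "Obama" start with equal scores; only the relationship to an already well-matched neighbour separates them, which is the intuition behind matching entity graphs rather than isolated entity pairs.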
Key words
entity translation, entity translation problem, relationship similarity, entity context, comparable corpus, entity similarity, entity context similarity, graph mapping, holistic graph-based approach, entity graph, named entity translation mining, named entity graphs, chinese named entity translation, data mining, entity translation mining, bilingual named entities, combining graph mapping, graph extraction process, graph theory, natural language processing, reinforcing method, english named entity translation, entity similarity metrics, comparable corpora, text mining, relationship context similarity, corpus latent features, dictionaries, measurement, vectors, feature extraction