Scalable Document Image Information Extraction With Application To Domain-Specific Analysis

2019 IEEE International Conference on Big Data (Big Data), 2019

Abstract
Document images are ubiquitous, but existing methods mainly focus on text reading rather than information understanding. In this paper, we propose a novel document image information extraction framework with application to domain-specific analysis. The key gains of our system come from the modularized implementation of the document analysis modules needed for different document analysis problems. Further, we provide an efficient text recognition approach that trades off recognition performance against running speed for document images, and a novel information extraction method that uses both visual and semantic information. Our framework is scalable and customizable, and only a few annotations of the keyword-content mapping are needed for domain-specific document analysis.
Keywords
keyword-content mapping annotations, text recognition approach, scalable document image information extraction method, text reading, domain-specific document analysis, semantic information, visual information, document analysis problems, document analysis modules, novel document image information extraction framework, information understanding