Combining Knowledge about Text Types and Document Structures for Enhanced Content Curation

Qurator (2021)

Abstract
We present the conceptual design of a language technology (LT) system that enables enhanced document curation and processing of different document types by providing customized NLP workflows that respond and adapt to the extracted characteristics of the input documents. To optimize document and text understanding, the processing steps will incorporate not only textual features but also layout and document-type-related features, such as document structure and the communicative function of specific parts or constituents of a document (e.g., header, subtitle, paragraph, footer). We tackle the lack of standardized representation formats for many of these document features by presenting the first draft of an ontology (QOntology) that we plan to incorporate into the overall workflow manager. Since the work is still in progress, we present the theoretical background and conceptual design decisions of the approach, which will form the basis of experiments in future work.