Generating Semantics for the Life Sciences via Text Analytics

Semantic Computing (2011)

Abstract
The life sciences have a strong need for carefully curated, semantically rich fact repositories. Knowledge harvesting from unstructured textual sources is currently performed by highly skilled curators who manually feed semantics into such databases as a result of deep understanding of the documents chosen to populate such repositories. As this is a slow and costly process, we here advocate an automatic approach to the generation of database contents which is based on JREX, a high performance relation extraction system. As a real-life example, we target REGULONDB, the world's largest manually curated reference database for the transcriptional regulation network of E. coli. We investigate in our study the performance of automatic knowledge capture from various literature sources, such as PUBMED abstracts and associated full text articles. Our results show that we can, indeed, automatically re-create a considerable portion of the REGULONDB database by processing the relevant literature sources. Hence, this approach might help curators widen the knowledge acquisition bottleneck in this field.
Keywords
text analytics, knowledge acquisition bottleneck, E. coli, relevant literature source, REGULONDB database, life sciences, high performance relation extraction, various literature source, generating semantics, automatic approach, database content, automatic knowledge capture, curated reference database, database management systems, semantics, information extraction, databases, gene expression, text analysis, biomedical text mining, radio frequency