Semantic Architecture for the Extraction, Storage, Processing and Visualization of Internet Sources Through the Use of Scrapy and Crawler Techniques

INFORMATION AND COMMUNICATION TECHNOLOGIES OF ECUADOR (TIC.EC)(2019)

Abstract
Collecting structured data from the web is difficult: the data must first be abstracted from HTML pages, then processed so that any user can reuse it, and finally passed to a semantic process, and it is hard to find a single architecture that fulfills all of these objectives. This work has two main objectives, each addressing one of the major problems of today's web. (a) Information overload: to provide a solution for collecting data hosted on the web in HTML format by combining data collection tools (Scrapy, Selenium) and involving the user in monitoring the data to be collected. The limitations of existing tools that provide a similar service are also taken into consideration. (b) Conceptualization of the data: to give the user a workspace in which structured data can be transformed into semantic data following the principles of Linked Data. The process of adding semantics takes into account important aspects such as the reuse of vocabularies, which is supported by online catalogs that help in searching for existing vocabularies.
Keywords
Web data, Semantic web, Ontologies, RDF, Crawler, Scrapy
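
Objective (b) in the abstract, turning structured data into semantic data while reusing existing vocabularies, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a scraped record is already available as a Python dictionary whose keys happen to match schema.org property names, and it emits N-Triples lines using only the standard library. The function names, the `http://example.org/resource/` base URI, and the sample record are hypothetical.

```python
from urllib.parse import quote

# Reused vocabulary (assumption: schema.org terms fit the scraped fields).
SCHEMA = "http://schema.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(record, base_uri, type_name, id_field="id"):
    """Map one flat record to a list of N-Triples lines.

    Each key other than id_field is treated as a schema.org property name
    (an illustrative assumption); fields whose value is None are skipped.
    """
    subject = f"<{base_uri}{quote(str(record[id_field]), safe='')}>"
    lines = [f"{subject} <{RDF_TYPE}> <{SCHEMA}{type_name}> ."]
    for key, value in record.items():
        if key == id_field or value is None:
            continue
        literal = escape_literal(str(value))
        lines.append(f'{subject} <{SCHEMA}{key}> "{literal}" .')
    return lines


# Hypothetical record, as a crawler might yield it after extraction.
record = {"id": "item 1", "name": 'Example "Product"', "description": None}
triples = record_to_ntriples(record, "http://example.org/resource/", "Product")

for line in triples:
    print(line)
```

In a fuller pipeline the property names would more plausibly come from a user-defined mapping chosen with the help of a vocabulary catalog, rather than from the record keys directly.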