Domain-Specific Web Site Identification: The CROSSMARC Focused Web Crawler

msra(2003)

Abstract
This paper presents the techniques for identifying domain-specific Web sites that were implemented as part of the EC-funded R&D project CROSSMARC. The project aims to develop technology for extracting interesting information from domain-specific Web pages. It is therefore important for CROSSMARC to identify the Web sites in which interesting domain-specific pages reside, a task known as focused Web crawling. This is the role of the CROSSMARC Web crawler.
Keywords
web crawler,web crawling,web pages