When the Web is your Data Lake: Creating a Search Engine for Datasets on the Web

SIGMOD/PODS '20: International Conference on Management of Data Portland OR USA June, 2020(2020)

引用 5|浏览34
暂无评分
摘要
There are thousands of data repositories on the Web, providing access to millions of datasets. National and regional governments, scientific publishers and consortia, commercial data providers, and others publish data for fields ranging from social science to life science to high-energy physics to climate science and more. Access to this data is critical to facilitating reproducibility of research results, enabling scientists to build on others' work, and providing data journalists easier access to information and its provenance. In this talk, I will discuss our work on Dataset Search, which provides search capabilities over potentially all dataset repositories on the Web. I will talk about the open ecosystem for describing and citing datasets that we hope to encourage and the technical details on how we went about building Dataset Search. Finally, I will highlight research challenges in building a vibrant, heterogeneous, and open ecosystem where data becomes a first-class citizen.
更多
查看译文
关键词
datasets, neural networks, gaze detection, text tagging
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要