Reliability-Aware and Graph-Based Approach for Rank Aggregation of Biological Data

2019 15th International Conference on eScience (eScience)(2019)

引用 1|浏览5
暂无评分
摘要
Massive biological datasets are available in public databases and can be queried using portals with keyword queries. Ranked lists of answers are obtained by users. However, properly querying such portals remains difficult since various formulations of the same query can be considered (e.g., using synonyms). Consequently, users have to manually combine several lists of hundreds of answers into one list. Rank aggregation techniques are particularly well-fitted to this context as they take in a set of ranked elements (rankings) and provide a consensus, that is, a single ranking which is the "closest" to the input rankings. However, the problem of rank aggregation is NP-hard in most cases. Using an exact algorithm is currently not possible for more than a few dozens of elements. A plethora of heuristics have thus been proposed which behaviour are, by essence, difficult to anticipate: given a set of input rankings, one cannot guarantee how far from an exact solution the consensus ranking provided by an heuristic will be. The two challenges we want to tackle in this paper are the following: (i) providing an approach based on a pre-process to decompose large data sets into smaller ones where high-quality algorithms can be run and (ii) providing information to users on the robustness of the positions of elements in the consensus ranking produced. Our approach not only lies in mathematical bases, offering guarantees on the result computed but it has also been implemented in a real system available to life science community and tested on various real use cases.
更多
查看译文
关键词
rank aggregation, graph approach, consensus ranking, biological datasets
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要