Generalization within in silico screening

arXiv (2023)

Abstract
In silico screening uses predictive models to select a batch of compounds with favorable properties from a library for experimental validation. Unlike conventional learning paradigms, success in this context is measured by the performance of the predictive model on the selected subset of compounds rather than the entire set of predictions. By extending learning theory, we show that the selectivity of the selection policy can significantly impact generalization, with a higher risk of errors occurring when exclusively selecting predicted positives and when targeting rare properties. Our analysis suggests a way to mitigate these challenges. We show that generalization can be markedly enhanced when considering a model's ability to predict the fraction of desired outcomes in a batch. This is promising, as the primary aim of screening is not necessarily to pinpoint the label of each compound individually, but rather to assemble a batch enriched for desirable compounds. Our theoretical insights are empirically validated across diverse tasks, architectures, and screening scenarios, underscoring their applicability.
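
To make the distinction between whole-library performance and performance on the selected batch concrete, the following is a minimal sketch on synthetic data. It is not the paper's method or its estimator: the simulated library, the noisy scoring model, the top-k selection policy, and every function name here are assumptions introduced for illustration. It compares accuracy over the whole library with the hit rate inside the selected batch, and with the batch fraction the model itself would predict if its scores were read as probabilities.

```python
import random


def simulate_library(n, positive_rate, noise, seed=0):
    """Return (labels, scores) for a synthetic compound library.

    Hypothetical setup: labels are rare positives, and the 'model' score is
    a label-dependent signal plus Gaussian noise, clipped to [0, 1].
    """
    rng = random.Random(seed)
    labels = [1 if rng.random() < positive_rate else 0 for _ in range(n)]
    scores = [
        min(1.0, max(0.0, 0.2 + 0.6 * y + rng.gauss(0.0, noise)))
        for y in labels
    ]
    return labels, scores


def select_top_k(scores, k):
    """Selection policy: indices of the k highest-scoring compounds."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def overall_accuracy(labels, scores, threshold=0.5):
    """Conventional metric: accuracy over every compound in the library."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def batch_hit_rate(labels, batch):
    """Screening metric: fraction of true positives among selected compounds."""
    return sum(labels[i] for i in batch) / len(batch)


def predicted_batch_fraction(scores, batch):
    """Model's own estimate of the batch hit rate (mean score in the batch).

    Treating raw scores as probabilities is an assumption of this sketch.
    """
    return sum(scores[i] for i in batch) / len(batch)


if __name__ == "__main__":
    labels, scores = simulate_library(n=10_000, positive_rate=0.02, noise=0.2)
    batch = select_top_k(scores, k=100)
    print("overall accuracy:        ", overall_accuracy(labels, scores))
    print("hit rate in batch:       ", batch_hit_rate(labels, batch))
    print("predicted batch fraction:", predicted_batch_fraction(scores, batch))
```

Varying `positive_rate` and `k` in this toy setup is one way to explore how a rarer target property or a more selective policy changes the gap between the library-wide metric and the batch-level ones.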