Navigating the pitfalls of applying machine learning in genomics

Nature Reviews Genetics (2021)

Abstract
The scale of genetic, epigenomic, transcriptomic, cheminformatic and proteomic data available today, coupled with easy-to-use machine learning (ML) toolkits, has propelled the application of supervised learning in genomics research. However, the assumptions behind the statistical models and performance evaluations in ML software frequently are not met in biological systems. In this Review, we illustrate the impact of several common pitfalls encountered when applying supervised ML in genomics. We explore how the structure of genomics data can bias performance evaluations and predictions. To address the challenges associated with applying cutting-edge ML methods to genomics, we describe solutions and appropriate use cases where ML modelling shows great potential.
Key words
Machine learning, Statistical methods, Biomedicine (general), Human Genetics, Cancer Research, Agriculture, Gene Function, Animal Genetics and Genomics