A comparative study of the predictive performance of different descriptor calculation tools: Molecular-based elution order modeling and interpretation of retention mechanism for isomeric compounds from METLIN database

Journal of Chromatography A (2024)

Abstract
In the pharmaceutical industry, the need for analytical standards is a bottleneck for comprehensive evaluation and quality control of intermediate and end products. Such products are complex mixtures containing structurally related molecules. In this regard, chromatographic peak annotation, especially for critical pairs of isomers and closest structural analogs, can be supported by using a Quantitative Structure-Retention Relationship (QSRR) approach. In our study, we investigated the fundamental basis of the reversed-phase (RP) retention mechanism for 1141 isomeric compounds from the METLIN SMRT dataset. Nine different descriptor calculation tools combined with different feature selection methods (genetic algorithm (GA), stepwise, Boruta) and machine learning (ML) approaches (support vector machine (SVM), multiple linear regression (MLR), random forest (RF), XGBoost) were applied to provide a reliable molecular structure-based interpretation of the RP retention behaviour of the isomeric compounds. Strict internal and external validation metrics were used to select models with the best predictive capabilities (r_test > 0.73, order of elution > 60%). For the developed models, mean absolute errors were in the range of 60 to 110 s. Stepwise and GA showed the most suitable performance as descriptor selection methods, while SVM and XGBoost modeling gave satisfactory predictive characteristics in most cases. Validation performed on the published experimental data for structurally related pharmaceutical compounds confirmed the best accuracy of MLR modeling in combination with GA feature selection of general physico-chemical properties. The resulting models will be useful for the prediction of separation and identification of structurally related compounds in pharmaceutical analysis, providing a simultaneous understanding of the interaction mechanisms leading to their retention under RP conditions.
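The abstract reports three kinds of validation figures: a test-set correlation, a percentage of correctly predicted elution order, and a mean absolute error in seconds. The sketch below shows one plausible way such figures could be computed. It is not the authors' code: the function names are hypothetical, the retention times are made-up illustrative values, and the pairwise definition of elution-order accuracy (fraction of compound pairs ranked in the observed order) is an assumption, since the abstract does not define the metric.

```python
from itertools import combinations
from math import sqrt


def mean_absolute_error(observed, predicted):
    """Mean absolute difference between observed and predicted retention times."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_o * var_p)


def elution_order_accuracy(observed, predicted):
    """Fraction of compound pairs whose predicted order matches the observed order.

    Assumption: pairs with identical observed retention times are skipped,
    and a predicted tie counts as a miss.
    """
    concordant = 0
    total = 0
    for i, j in combinations(range(len(observed)), 2):
        d_obs = observed[i] - observed[j]
        if d_obs == 0:
            continue
        total += 1
        d_pred = predicted[i] - predicted[j]
        if d_obs * d_pred > 0:
            concordant += 1
    return concordant / total if total else float("nan")


# Hypothetical retention times in seconds for four isomers (illustrative only).
observed = [300.0, 420.0, 510.0, 640.0]
predicted = [350.0, 400.0, 600.0, 580.0]

print(mean_absolute_error(observed, predicted))     # 55.0
print(pearson_r(observed, predicted))
print(elution_order_accuracy(observed, predicted))  # 5 of 6 pairs in the observed order
```

In this toy case only the last two compounds are predicted in the wrong order, so five of the six pairs agree with the observed elution order even though the absolute errors reach 90 s. This illustrates why an order-based metric can be informative alongside the mean absolute error when the goal is annotating critical pairs of isomers.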
Keywords
RP-HPLC, RP retention mechanism, Molecular-based modeling, Elution order prediction, Descriptor selection