A Machine Learning Language to Build a QSAR Model of Pyrazoline Derivative Inhibitors Targeting Mycobacterium tuberculosis Strain H37Rv

Letters in Drug Design & Discovery(2023)

Abstract
Background: Machine learning has become an essential tool in drug research for generating the structural information needed to design drugs with higher biological activity. Quantitative structure-activity relationship (QSAR) modeling is one such technique. A QSAR study involves two main steps: generating descriptors, and building and validating the models.
Aim: To build a QSAR model of pyrazoline derivatives using the Python programming language, with data collected from diverse literature on the inhibition of Mycobacterium tuberculosis. Pyrazoline, a small-molecule scaffold, could block the biosynthesis of mycolic acids, resulting in mycobacterial death, which makes it relevant to anti-tubercular drug discovery.
Methods: We developed a new Python script that uses CDK descriptors as the independent variables and anti-tubercular bioactivity as the dependent variable to build and validate the best QSAR model. The built QSAR model was further validated using the external test set compounds. Three algorithms, viz. multiple linear regression, support vector machine, and partial least squares classifiers, were then used and their r² values compared.
Results: The QSAR model generated with the open-source Python program predicted the external test set compounds well. The ordinary least squares (OLS) regression gave an R² value of 0.514, an F value of 5.083, an adjusted R² value of 0.413, and a std. error of 0.092. The multiple linear regression gave an R² value of 0.5143, reg.coef_ values of -0.07795 (PC1), 0.01619 (PC2), 0.03763 (PC3), 0.07849 (PC4), and -0.09726 (PC5), and a reg.intercept_ of 4.8324. The support vector machine classifier of the sklearn module gave a model score of 0.5901, and the partial least squares regression gave an R² value of 0.5901. On validation, the model predicted values similar to those of the training set, and in the plot of predicted versus actual pMIC50 values, all data points fell along the central regression line.
Conclusion: The model scores obtained with this script via three diverse algorithms correlated well, with little difference between them; the model may be useful in designing similar pyrazoline analogs as anti-tubercular agents.
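The reported OLS statistics can be checked against each other with the standard textbook formulas for adjusted R² and the overall F statistic. The sketch below is only a consistency check, not the authors' script. It assumes five predictors (PC1 to PC5, as listed in the abstract) and a training set of 30 compounds. The sample size is not stated in the abstract; 30 is an assumption, chosen because it reproduces the reported values. The function and constant names are hypothetical.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for n samples and p predictors (intercept not counted)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def f_statistic(r2: float, n: int, p: int) -> float:
    """Overall F statistic of a linear regression with an intercept."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))


R2_REPORTED = 0.5143        # R^2 of the multiple linear regression (from the abstract)
N_PREDICTORS = 5            # PC1..PC5 (from the abstract)
N_COMPOUNDS_ASSUMED = 30    # ASSUMPTION: not stated in the abstract

adj = adjusted_r2(R2_REPORTED, N_COMPOUNDS_ASSUMED, N_PREDICTORS)
f_value = f_statistic(R2_REPORTED, N_COMPOUNDS_ASSUMED, N_PREDICTORS)

print(f"adjusted R^2 = {adj:.3f}")   # compare with the reported 0.413
print(f"F value      = {f_value:.3f}")  # compare with the reported 5.083
```

Under these assumptions both computed values agree with the reported adjusted R² and F value to three decimal places, which suggests the OLS figures are internally consistent for a training set of about 30 compounds.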
Key words
Machine learning, QSAR, Python, H37Rv strain, Mycobacterium tuberculosis, Pyrazoline derivatives