alexa Quantitative structure-retention relationship for the Kovats retention indices of a large set of terpenes: a combined data splitting-feature selection strategy.
General Science

General Science

Forest Research: Open Access

Author(s): Hemmateenejad B, Javadnia K, Elyasi M

Abstract Share this page

Abstract A data set consisting of a large number of terpenoids, the widely distributed compounds in nature that are found in abundance in higher plants, have been used to develop a quantitative structure property relationship (QSPR) for their Kovats retention index. QSPR models are usually obtained by splitting the data into two sets including calibration (or training) and prediction (or validation). All model building steps, especially feature selection procedure, are performed using this initial splitting, and therefore the performances of the resulted models are highly dependent on the initial data splitting. To investigate the effects of data splitting on the feature selection in the current article we proposed a combined data splitting-feature selection (CDFS) methodology for QSPR model development by producing several different training/validation/test sets, and repeating all of the model building studies. In this method, data splitting is achieved many times and in each case feature selection is performed. The resulted models are compared for similarity and dissimilarity between the selected descriptors. The final model is one whose descriptors are the common variables between all of resulted models. The method was applied to QSPR study of a large data set containing the Kovats retention indices of 573 terpenoids. A final 8-parametric multilinear model with constitutional and topological indices was obtained. Cross-validation indicated that the model could reproduce more than 90\% of variances in the Kovats retention data. The relative error of prediction for an external test set of 50 compounds was 3.2\%. Finally, to improve the results, structure-retention relationships were followed by nonlinear approach using artificial neural networks and consequently better results were obtained. This article was published in Anal Chim Acta and referenced in Forest Research: Open Access

Relevant Expert PPTs

Relevant Speaker PPTs

Recommended Conferences

Relevant Topics

Peer Reviewed Journals
Make the best use of Scientific Research and information from our 700 + peer reviewed, Open Access Journals
International Conferences 2017-18
Meet Inspiring Speakers and Experts at our 3000+ Global Annual Meetings

Contact Us

© 2008-2017 OMICS International - Open Access Publisher. Best viewed in Mozilla Firefox | Google Chrome | Above IE 7.0 version