Journal of Proteomics & Bioinformatics

Open Access

ISSN: 0974-276X

Abstract

NRPred-FS: A Feature Selection based Two-level Predictor for Nuclear Receptors

Pu Wang and Xuan Xiao

Motivation: Nuclear receptors (NRs) play a role in all developmental and physiological processes and are important drug targets in a wide variety of disease and healthy states. In recent years, many machine learning methods have been introduced to identify NRs and their subfamilies with high throughput and at low cost. However, these predictors were all developed on an outdated dataset from NucleaRDB, and none of them employs a feature selection technique, so their performance is limited.

Result: In this study, a feature-selection-based two-level predictor, called NRPred-FS, is developed. At the first level, it identifies whether a query protein is a nuclear receptor based on its sequence information alone. If it is, the prediction automatically continues to a second level, which assigns the protein to one of the following eight subfamilies: (1) Thyroid hormone like (NR1), (2) HNF4-like (NR2), (3) Estrogen like (NR3), (4) Nerve growth factor IB-like (NR4), (5) Fushi tarazu-F1 like (NR5), (6) Germ cell nuclear factor like (NR6), (7) knirps like (NR0A), and (8) DAX like (NR0B). The nuclear receptor sequences are encoded as sequence-derived feature vectors that incorporate various physicochemical and statistical features. The feature set is then optimized by a forward feature selection algorithm to reduce the feature dimension and to improve classification accuracy. As a demonstration, the method was rigorously tested on a benchmark dataset derived from the latest versions of NucleaRDB and UniProt. The overall prediction accuracies under leave-one-out cross-validation were about 97% at the first level and about 93% at the second level. As a convenience to users, NRPred-FS is freely accessible at http://www.jci-bioinfo.cn/NRPred-FS. It is hoped that it will be a useful tool for identifying NRs and their subfamilies.
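
The combination of forward feature selection and leave-one-out cross-validation described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the classifier or the exact features, so the 1-nearest-neighbour classifier, the squared Euclidean distance, the function names `loo_accuracy` and `forward_select`, and the toy data are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def loo_accuracy(X: Sequence[Sequence[float]], y: Sequence[int],
                 features: List[int]) -> float:
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    (an assumed stand-in classifier) using only the given feature indices."""
    if not features:
        return 0.0
    correct = 0
    for i in range(len(X)):
        best_j, best_d = -1, float("inf")
        for j in range(len(X)):
            if j == i:  # hold sample i out of the training set
                continue
            d = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if d < best_d:
                best_d, best_j = d, j
        if y[best_j] == y[i]:
            correct += 1
    return correct / len(X)


def forward_select(X: Sequence[Sequence[float]], y: Sequence[int],
                   max_features: Optional[int] = None) -> Tuple[List[int], float]:
    """Greedy forward selection: repeatedly add the single feature that most
    improves leave-one-out accuracy; stop when no feature improves it."""
    remaining = list(range(len(X[0])))
    selected: List[int] = []
    best_score = 0.0
    while remaining and (max_features is None or len(selected) < max_features):
        scored = [(loo_accuracy(X, y, selected + [f]), f) for f in remaining]
        # Ties are broken in favour of the lowest feature index.
        top_score, top_f = max(scored, key=lambda t: (t[0], -t[1]))
        if top_score <= best_score:
            break
        selected.append(top_f)
        remaining.remove(top_f)
        best_score = top_score
    return selected, best_score


# Toy data (hypothetical): feature 0 separates the classes, features 1 and 2 do not.
X_toy = [
    [0.0, 5.0, 1.0],
    [0.1, 1.0, 9.0],
    [0.2, 8.0, 4.0],
    [1.0, 5.1, 9.1],
    [1.1, 1.1, 1.1],
    [1.2, 8.1, 4.1],
]
y_toy = [0, 0, 0, 1, 1, 1]

selected, score = forward_select(X_toy, y_toy)
print(selected, score)
```

On this toy data the search keeps only the informative feature and discards the two uninformative ones. In a two-level setting such as the one described, the same procedure would plausibly be run separately for the NR versus non-NR task and for the subfamily task, since the most useful features may differ between levels.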
