Received date: March 20, 2013; Accepted date: April 24, 2013; Published date: April 26, 2013
Citation: Sharma BK, Singh P (2013) Chemometric Descriptor Based QSAR Rationales for the MMP-13 Inhibition Activity of Non-Zinc-Chelating Compounds. Med chem 3:168-178. doi:10.4172/2161-0444.1000134
Copyright: © 2013 Sharma BK, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
The MMP-13 inhibition activity of non-zinc-chelating compounds has been quantitatively analyzed in terms of chemometric descriptors. The statistically validated quantitative structure-activity relationship (QSAR) models provided rationales to explain the inhibition activity of these compounds. The descriptors, identified through combinatorial protocol in multiple linear regression (CP-MLR) analysis, have highlighted the role of the 3-path Kier alpha-modified shape index (S3K), complementary information content index of 1-order neighborhood symmetry (CIC1), eigenvalue sum from mass weighted distance matrix (SEigm), lowest eigenvalue n. 6 of Burden matrix / weighted by atomic van der Waals volumes (BELv6) and by atomic polarizabilities (BELp6), 3-order topological charge index (GGI3) and the functionality R--CR--R (C-025). From the statistically validated models, it appeared that the descriptors S3K, BELv6, BELp6 and SEigm make a positive contribution to activity and that their higher values are conducive to improving the MMP-13 inhibition activity of a compound. On the other hand, the descriptors CIC1, GGI3 and C-025 have a detrimental effect on activity. Therefore, the absence of the functionality R--CR--R and lower values of the descriptors CIC1 and GGI3 would be advantageous. PLS analysis has further corroborated the dominance of the CP-MLR identified descriptors. Applicability domain analysis revealed that the suggested models have acceptable predictability. All the compounds are within the applicability domain of the proposed models and were evaluated correctly.
QSAR; MMP-13 inhibitors; Combinatorial protocol in multiple linear regression (CP-MLR) analysis; Chemometric descriptors; Non-Zn-chelating compounds
The matrix metalloproteinases (MMPs), a family of more than 27 zinc- and calcium-containing enzymes, are involved in the degradation of extracellular matrix and tissue remodeling [1-3]. Of the collagenase family (MMP-1, MMP-8 and MMP-13), MMP-13, the most efficient type II collagen-degrading MMP [4,5], has become an attractive therapeutic target because its inhibition reduces the cartilage degradation associated with the progression of rheumatoid arthritis and osteoarthritis in animal models [6,7]. However, broad-spectrum MMP inhibitors exhibit a dose-limiting toxicity leading to side effects such as painful joint stiffening (musculoskeletal syndrome, MSS) and inflammation [8-15]. It was suggested that MSS is caused by the inhibition of normal extracellular matrix turnover due to inhibition of MMPs other than MMP-13 [16-21]. At present, it is unclear which MMP isoforms may be involved and to what extent they contribute to MSS. Thus, selective inhibition of MMP-13, devoid of MSS, may prove to be a better therapeutic research area.
MMPs have a tris(histidine)-bound zinc(II) ion that acts as the catalytic site for the hydrolysis of the substrate. Most MMP inhibitors achieve affinity through interaction with the catalytic zinc via a chelating moiety such as hydroxamic acid and by locating hydrophobic functionality in the S1′ pocket. The S1′ pocket varies in length and amino acid sequence among MMP isoforms. Such variations between MMP family members were, therefore, used to design MMP inhibitors with different selectivity profiles. MMP-13 has an additional region, S1′*, for inhibitor binding that has not been identified in other MMP isoforms. The most potent and selective MMP-13 inhibitors occupy only the S1′ and S1′* pockets [24-27], which reduces the need for a Zn-binding functionality.
In view of this, two new classes of potent and selective MMP-13 inhibitors, which have a unique binding mode at the active site and do not interact with the catalytic zinc, have recently been reported [28,29]. The general structure of these classes is shown in Figure 1. In the first series, the structural variations appeared at position R1 and in incision X, while in the second series, positions R2 and R3 have been varied. These functional variations are given in Chart 1.
The first series of compounds (1-23) was obtained through optimization with the aid of co-crystal structural information. For this, the hit compound (1) was extended out from the active site into the S1′ pocket by adding an aryl group through two different linking functionalities. The aryl ring occupies the entrance to the S1′ pocket, thus providing the opportunity to grow into the S1′ pocket to improve the potency against MMP-13 and the selectivity profile against other MMP isoforms. Depending on the linkage, different trends in potency and selectivity were observed for the respective aryl groups.
To further improve potency against MMP-13, the second series (compounds 23-55) was explored to investigate alternative ways of interacting with the S1′ pocket. The starting point of this optimization was the result of a hybridization of the hit structure (compound 1) with another series of MMP-13 inhibitors based on an overlay of their crystal structures. In these analogues, the aryl group was appended at the C-3 position of the phenyl ring through a methyl amide linkage, occupying the MMP S1′ pocket somewhat differently than the analogous functionality in the first class of compounds. Additionally, the cyclohexyl group, able to bind in the S2′ pocket, was also replaced by other smaller substituents to modify the lipophilicity of some of the congeners.
In both reported studies, however, the structure-activity relationships (SARs) were targeted at the alteration of substituents at different positions and provided no rationale to reduce the trial-and-error factors. Hence, in the present communication a 2D-quantitative SAR (2D-QSAR) study has been conducted to provide a rationale for drug design and to explore the possible mechanism of action. In a congeneric series, where a relative study is being carried out, the 2D-descriptors may play an important role in deriving significant correlations with the biological activities of the compounds. The novelty and importance of a 2D-QSAR study lie in the simplicity of calculating the different descriptors and of interpreting them (in a physical sense) to explain the inhibition actions of compounds at the molecular level.
For the present work, the non-Zn-chelating compounds (Chart 1), along with their in vitro inhibition activity of MMP-13, have been taken from the literature [28,29]. The inhibition activity, IC50, represents the concentration of a compound required to achieve 50% inhibition of MMP-13 against type II collagen. It is expressed as pIC50 on a molar basis and stands as the dependent variable for the present quantitative analysis. For modeling purposes, the complete data-set was divided into training- and test-sets. The training-set was used to derive statistically significant models while the test-set, consisting of nearly 25% of the total compounds, was employed to validate such models. The selection of test-set compounds was made through SYSTAT using the single linkage hierarchical cluster procedure involving the Euclidean distances of the inhibition activity, pIC50 values. The test-set compounds were selected from the generated cluster tree in such a way as to keep them at the maximum possible distance from each other. In SYSTAT, by default, the normalized Euclidean distances are computed to join the objects of a cluster. The normalized distances are root mean-squared distances. Single linkage uses the distance between the two closest members in clustering. It generates long clusters and provides scope to choose objects at intervals. For this reason, a single linkage clustering procedure was applied.
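The activity transformation described above can be sketched as follows. This is a minimal illustration, assuming the IC50 values are reported in nanomolar; the unit, the function name `pic50_from_nm` and the sample value are assumptions for illustration and are not taken from the data set.

```python
import math


def pic50_from_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 on a molar basis."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


# Illustrative value only: an IC50 of 100 nM corresponds to pIC50 = 7.
print(round(pic50_from_nm(100.0), 3))
```

A tenfold decrease in IC50 therefore raises pIC50 by one unit, which is why higher pIC50 values indicate more potent inhibitors.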
The structures of the compounds under study (Chart 1) have been drawn in 2D ChemDraw using the standard procedure. These structures were converted into 3D objects using the default conversion procedure implemented in the CS Chem3D Ultra. The generated 3D-structures of the compounds were subjected to energy minimization in the MOPAC module, using the AM1 procedure for closed shell systems, implemented in the CS Chem3D Ultra. This ensures a well-defined conformer relationship across the compounds of the study. All these energy-minimized structures of the respective compounds have been ported to the DRAGON software for computing the descriptors corresponding to the 0D-, 1D- and 2D-classes. The combinatorial protocol in multiple linear regression (CP-MLR) analysis and partial least-squares (PLS) [34-36] procedures have been used in the present work for developing QSAR models. A brief description of the computational procedure is given below.
The CP-MLR is a ‘filter’-based variable selection procedure for model development in QSAR studies. Its procedural aspects and implementation are discussed in some of our recent publications [37-42]. The thrust of this procedure is in its embedded ‘filters’. They are briefly as follows: filter-1 seeds the variables by limiting inter-parameter correlations to a predefined level (upper limit ≤ 0.79); filter-2 controls the entry of variables to a regression equation through the t-values of the coefficients (threshold value ≥ 2.0); filter-3 provides comparability of equations with different numbers of variables in terms of the square root of the adjusted multiple correlation coefficient of the regression equation, r-bar; filter-4 estimates the consistency of the equation in terms of cross-validated r2 or q2, with leave-one-out (LOO) cross-validation as the default option (threshold value 0.3 ≤ q2 ≤ 1.0). All these filters make the variable selection process efficient and lead to a unique solution. In order to collect the descriptors with higher information content and explanatory power, the threshold of filter-3 was successively incremented with increasing number of descriptors (per equation) by taking the r-bar value of the preceding optimum model as the new threshold for the next generation. Furthermore, in order to detect any chance correlations associated with the models recognized in CP-MLR, each cross-validated model has been put to a randomization test [43,44] by repeated randomization of the activity. For this, every model has been subjected to 100 simulation runs with scrambled activity. The scrambled activity models with regression statistics better than or equal to those of the original activity model have been counted to express the percent chance correlation of the model under scrutiny.
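The first of these filters can be illustrated with a short sketch. It is not the CP-MLR implementation: the function names (`pearson_r`, `filter1_subsets`) and the toy descriptor values are hypothetical, and only the 0.79 inter-correlation limit comes from the text. The sketch enumerates descriptor combinations and keeps those in which no pair is correlated beyond the limit.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)  # assumes neither descriptor is constant


def filter1_subsets(descriptors, size, max_r=0.79):
    """Yield descriptor-name tuples whose pairwise |r| stays within max_r."""
    for names in combinations(sorted(descriptors), size):
        if all(abs(pearson_r(descriptors[a], descriptors[b])) <= max_r
               for a, b in combinations(names, 2)):
            yield names


# Hypothetical descriptor columns; D2 is exactly 2 * D1, so that pair is rejected.
toy = {
    "D1": [0.1, 0.4, 0.5, 0.9, 1.0],
    "D2": [0.2, 0.8, 1.0, 1.8, 2.0],
    "D3": [0.9, 0.1, 0.7, 0.2, 0.6],
}
print(list(filter1_subsets(toy, 2)))
```

Only the subsets that pass this seeding step would go on to the regression-based filters (t-values, r-bar and q2), which are not reproduced here.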
To support the findings, a partial least squares (PLS) analysis has been carried out on the descriptors identified through CP-MLR. The study facilitates the development of a ‘single window’ structure-activity model and helps to rank the identified descriptors by their potential to explain the MMP-13 inhibition activity profiles of the compounds. It also gives an opportunity to compare the relative significance of the descriptors. The fraction contributions obtainable from the normalized regression coefficients of the descriptors allow this comparison within the modeled activity.
The utility of a QSAR model is based on its accurate prediction ability for new compounds. A model is valid only within its training domain and new compounds must be assessed as belonging to the domain before the model is applied. The applicability domain is assessed by the leverage values for each compound [45,46]. The Williams plot (the plot of standardized residuals versus leverage values, h) can then be used for an immediate and simple graphical detection of both the response outliers (Y outliers) and structurally influential chemicals (X outliers) in the model. In this plot, the applicability domain is established inside a squared area within ± x (s.d.) and a leverage threshold h*. The threshold h* is generally fixed at 3(k + 1)/n (n is the number of training-set compounds and k is the number of model parameters) whereas x=2 or 3. Prediction must be considered unreliable for compounds with a high leverage value (h > h*). On the other hand, when the leverage value of a compound is lower than the threshold value, the probability of accordance between predicted and observed values is as high as that for the training-set compounds.
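As a worked illustration of the leverage threshold, the sketch below evaluates h* = 3(k + 1)/n and applies the two Williams-plot limits to a single compound. The function names are hypothetical, and taking k = 5 for a five-descriptor model with n = 40 training compounds is an assumed reading of "number of model parameters".

```python
def leverage_threshold(n_train: int, k_params: int) -> float:
    """Warning leverage h* = 3(k + 1)/n."""
    return 3.0 * (k_params + 1) / n_train


def in_domain(leverage: float, std_residual: float,
              h_star: float, x: float = 3.0) -> bool:
    """True when a compound lies inside the squared area of the Williams plot."""
    return leverage <= h_star and abs(std_residual) <= x


# Assumed example: a five-descriptor model trained on 40 compounds.
h_star = leverage_threshold(40, 5)
print(h_star)  # 0.45
print(in_domain(0.20, 1.5, h_star))  # inside both limits
print(in_domain(0.50, 0.2, h_star))  # leverage above h*
```

A compound failing the leverage test is structurally influential (an X outlier), while one failing the residual test is a response (Y) outlier.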
For the compounds in Chart 1, a total of 495 descriptors belonging to the 0D- to 2D-classes of DRAGON have been computed and subjected to CP-MLR analysis. The preliminary assessment of the complete data-set suggested that compound 25, the only one having a methyl group at R2, remained an ‘outlier’. Similarly, compound 15, due to its uncertain activity value, could not fit into the trend of the remaining compounds of the series. Both these compounds were, therefore, ignored in the subsequent analyses. The remaining 53 compounds were further divided into a training-set and a test-set. Thirteen compounds (nearly 25% of the total population) have been selected for the test-set through SYSTAT. The identified test-set was then used for external validation of the models derived from the remaining forty compounds in the training-set. The squared correlation coefficient between the observed and predicted values of the test-set compounds, r2Test, was calculated to express the fraction of explained variance in the test-set, which is not part of the regression/model derivation. It is a measure of the goodness of the derived model equation. A high r2Test value is always desirable, but considering the stringency of test-set procedures, r2Test values in the range of 0.500–0.600 are often regarded as indicative of predictive models. Following the strategy of exploring only predictive models, CP-MLR resulted in 70 models in two descriptors, 99 models in three descriptors, 8 models in four descriptors and 13 models in five descriptors. The statistically most significant of them are given in Equations (1-10).
pIC50 = 5.983 + 2.242(0.324)S3K + 0.874(0.228)nRORPh
n = 40, r = 0.764, s = 0.525, F = 25.915, q2LOO = 0.505, q2L5O = 0.511, r2Test = 0.634 (1)
pIC50 = 6.230 + 1.434(0.417)VAR + 1.141(0.241)N-075
n = 40, r = 0.736, s = 0.551, F = 21.870, q2LOO = 0.470, q2L5O = 0.474, r2Test = 0.670 (2)
pIC50 = 6.598 + 3.470(0.415)S3K – 1.183(0.432)GGI3 – 1.541(0.323)C-025
n = 40, r = 0.846, s = 0.440, F = 30.112, q2LOO = 0.629, q2L5O = 0.616, r2Test = 0.504 (3)
pIC50 = 6.549 + 2.513(0.317)S3K – 1.745(0.327)C-025 + 0.556(0.260)C-027
n = 40, r = 0.833, s = 0.456, F = 27.299, q2LOO = 0.613, q2L5O = 0.615, r2Test = 0.553 (4)
pIC50 = 5.617 + 2.491(0.533)SEigm + 2.703(0.433)BELv6 – 3.152(0.648)GGI3 + 0.779(0.232)N-075
n = 40, r = 0.856, s = 0.432, F = 23.993, q2LOO = 0.646, q2L5O = 0.652, r2Test = 0.751 (5)
pIC50 = 7.052 + 1.354(0.518)S3K – 0.515(0.224)PJI2 + 1.521(0.510)C-006 – 1.316(0.343)C-025
n = 40, r = 0.854, s = 0.435, F = 23.665, q2LOO = 0.644, q2L5O = 0.634, r2Test = 0.658 (6)
pIC50 = 7.363 + 2.626(0.491)S3K – 1.629(0.553)CIC1 + 1.677(0.632)BELv6 – 2.601(0.627)GGI3 – 0.967(0.356)C-025
n = 40, r = 0.880, s = 0.404, F = 23.260, q2LOO = 0.661, q2L5O = 0.648, r2Test = 0.652 (7)
pIC50 = 7.358 + 2.591(0.504)S3K – 1.570(0.541)CIC1 + 1.602(0.614)BELp6 – 2.528(0.612)GGI3 – 0.957(0.359)C-025
n = 40, r = 0.879, s = 0.405, F = 23.101, q2LOO = 0.661, q2L5O = 0.675, r2Test = 0.675 (8)
pIC50 = 6.140 + 2.230(0.654)S3K + 1.551(0.650)SEigm + 1.304(0.607)BELp6 – 2.483(0.679)GGI3 – 1.102(0.359)C-025
n = 40, r = 0.870, s = 0.418, F = 21.176, q2LOO = 0.639, q2L5O = 0.654, r2Test = 0.620 (9)
pIC50 = 6.115 + 2.267(0.642)S3K + 1.576(0.660)SEigm + 1.332(0.622)BELv6 – 2.520(0.693)GGI3 – 1.120(0.356)C-025
n = 40, r = 0.870, s = 0.419, F = 21.161, q2LOO = 0.637, q2L5O = 0.618, r2Test = 0.619 (10)
Here n and F represent, respectively, the number of data points and the F-ratio between the variances of calculated and observed activities. The data within the parentheses are the standard errors associated with the regression coefficients. In all the above equations, the F-values remained significant at the 99% level. The indices q2LOO and q2L5O (> 0.5), except for the baseline Equation (2), account for the internal robustness of the models. For all the above models, the r2Test values are greater than 0.5, indicating that the selected test-set adequately supports their external validation. The descriptors in all the above models have been scaled between 0 and 1 to ensure that a descriptor does not dominate simply because it has larger or smaller pre-scaled values than the other descriptors. In this way, the scaled descriptors have equal potential to influence the QSAR models.
The signs of the regression coefficients indicate the direction of influence of the explanatory variables in the above models. A positive regression coefficient associated with a descriptor will augment the activity profile of a compound, while a negative coefficient will have a detrimental effect on it.
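The scaling step can be sketched as below. The text states only that descriptors were scaled between 0 and 1; the usual min-max formula is assumed here, and the function name `scale_0_1` and the sample values are hypothetical.

```python
def scale_0_1(values):
    """Min-max scale a descriptor column to the interval [0, 1] (assumed formula)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant descriptor cannot be range-scaled")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw descriptor values for three compounds.
print(scale_0_1([2.0, 4.0, 6.0]))  # [0.0, 0.5, 1.0]
```

With every descriptor on the same 0 to 1 range, the magnitudes of the regression coefficients in Equations (1-10) become directly comparable.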
Though Equations (1-10) emerged as significant predictive models, Equations (7-10) remained statistically more efficient. The latter four models, involving five descriptors each, could account for up to 77.44 percent of the variance in the observed activity of the compounds. In fact, a total of 13 such models, sharing 15 descriptors among them, have been obtained through CP-MLR, and only the four most significant ones have been documented as Equations (7-10). The 15 shared descriptors, along with their brief description, average regression coefficients and total incidences, are given in Table 2. Besides the descriptors listed in Table 2, the other identified descriptors PJI2 and VAR are from the topological class, nRORPh is from the functional class and C-027 is from the atom centred fragment class. PJI2 represents the 2D Petitjean shape index (Equation 6), VAR explains the variation in a molecular structure (Equation 2), nRORPh accounts for the number of ethers (aromatic) (Equation 1) and C-027 encodes the functionality R--CH--X (Equation 4). The further discussion is, however, based on the most significant Equations (7-10). The derived statistical parameters of these four models indicate that their level of significance is almost the same. These models were, therefore, used to calculate the activity profiles of all the compounds, which are included in Table 1 for comparison with the observed ones. A close agreement between them has been observed. Additionally, a plot of observed versus calculated activities is given in Figure 2 to illustrate the goodness of fit for each of these four models.
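The 77.44 percent figure follows directly from the correlation coefficient of Equation (7), since the explained variance is r squared. The sketch below reproduces it and, as a rough consistency check, recomputes the F-ratio from r, n and the number of descriptors using the standard multiple-regression relation F = (r²/k) / ((1 − r²)/(n − k − 1)). The function names are hypothetical, and small deviations from the reported F are expected because r is rounded to three decimals.

```python
def explained_variance_percent(r: float) -> float:
    """Percentage of variance explained by a model with correlation coefficient r."""
    return 100.0 * r * r


def f_ratio(r: float, n: int, k: int) -> float:
    """F-ratio of a k-descriptor linear regression fitted to n data points."""
    r2 = r * r
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Equation (7): r = 0.880, n = 40, five descriptors.
print(round(explained_variance_percent(0.880), 2))  # 77.44
print(round(f_ratio(0.880, 40, 5), 2))  # close to the reported F = 23.260
```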
|Cpd.||pIC50 (M)||Cpd.||pIC50 (M)|
|Obsd.||Calculated Eq.||Obsd.||Calculated Eq.|
ᵃSee footnote under Chart 1; ᵇcompounds in test set; ᶜcompound with uncertain activity and not included in the study; ᵈoutlier compound in present study.
Table 1: Observed and modeled MMP-13 inhibition activity of non-zinc-chelating compounds.
|S. No.||Descriptor||Descriptor class||Physical meaning||Average regression coefficient (incidence)|
|1||S3K||Topological||3-Path Kier alpha-modified shape index||2.135 (10)|
|2||CIC1||Topological||Complementary information content index of 1-order neighborhood symmetry||-1.949 (4)|
|3||SEigZ||Topological||Eigenvalue sum from Z weighted distance matrix (Barysz matrix)||1.733 (4)|
|4||SEigm||Topological||Eigenvalue sum from mass weighted distance matrix||1.811 (4)|
|5||BELv6||BCUT||Lowest eigenvalue n. 6 of Burden matrix / weighted by atomic van der Waals volumes||1.861 (7)|
|6||BELe1||BCUT||Lowest eigenvalue n. 1 of Burden matrix / weighted by atomic Sanderson electronegativities||-0.854 (4)|
|7||BELp6||BCUT||Lowest eigenvalue n. 6 of Burden matrix / weighted by atomic polarizabilities||1.382 (5)|
|8||GGI3||Galvez topological charge indices||3-Order topological charge index||-2.524 (12)|
|9||ATS5m||2D-autocorrelation||Broto-Moreau autocorrelation of a topological structure - lag 5 / weighted by atomic masses||1.765 (1)|
|10||MATS4m||2D-autocorrelation||Moran autocorrelation - lag 4 / weighted by atomic masses||0.747 (1)|
|11||nCs||Functional||number of total secondary C(sp3)||1.379 (1)|
|12||nNR2||Functional||number of tertiary amines (aliphatic)||-0.741 (2)|
|13||C-006||Atom centered fragment||CH2RX||1.198 (2)|
|14||C-025||Atom centered fragment||R--CR--R||-1.161 (7)|
|15||N-075||Atom centered fragment||R--N--R / R--N--X||0.668 (1)|
ᵃThe descriptors are identified from the five-parameter models that emerged from the CP-MLR protocol with filter-1 as 0.79, filter-2 as 2.0, filter-3 as 0.837, and filter-4 as 0.3 ≤ q2 ≤ 1.0, with a training set of 40 compounds. ᵇThe average regression coefficient of the descriptor over all models and the total number of its incidences. The arithmetic sign of the coefficient represents the actual sign of the regression coefficient in the models.
Table 2: Identified descriptorsᵃ along with their physical meaning, average regression coefficient and incidenceᵇ, in modeling the MMP-13 inhibition activity.
The descriptors participating in these models are S3K, CIC1, SEigm, BELv6, BELp6, GGI3 and C-025. These descriptors represent, respectively, the 3-path Kier alpha-modified shape index, complementary information content index of 1-order neighbourhood symmetry, eigenvalue sum from mass weighted distance matrix, lowest eigenvalue n. 6 of Burden matrix / weighted by atomic van der Waals volumes (v) and by atomic polarizabilities (p), 3-order topological charge index and the functionality R--CR--R.
The S3K encodes information about the centrality of branching in the H-depleted molecular graph. The CIC1 measures the deviation of the information content pertaining to neighbourhood symmetry of 1-order (IC1) from its maximum value. The descriptor SEigm determines the sum of all the eigenvalues of atomic mass weighted distance matrix of the H-depleted molecular graph.
From Equations (7-10), it appeared that the descriptors S3K, BELv6, BELp6 and SEigm make a positive contribution to activity while the descriptors CIC1, GGI3 and C-025 contribute negatively to it. Thus, to explore more potent analogues of the series, the values of the descriptors in a given model may be adjusted according to this strategy. For example, Equation (7) has revealed that higher values of the descriptors S3K and BELv6, a lower (or more negative) value of the descriptor CIC1 and the absence of the functionality R--CR--R are all conducive to improving the MMP-13 inhibition activity of a compound.
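To make the direction of these effects concrete, the sketch below evaluates Equation (7) for descriptor values already scaled to the 0 to 1 range. The coefficients are those of Equation (7); the function name `predict_pic50_eq7` and the two descriptor profiles are hypothetical and do not correspond to any compound in Chart 1.

```python
# Intercept and coefficients of Equation (7), for descriptors scaled to [0, 1].
EQ7_INTERCEPT = 7.363
EQ7_COEFFS = {
    "S3K": 2.626,
    "CIC1": -1.629,
    "BELv6": 1.677,
    "GGI3": -2.601,
    "C-025": -0.967,
}


def predict_pic50_eq7(scaled: dict) -> float:
    """Predicted pIC50 from Equation (7) for one set of scaled descriptor values."""
    return EQ7_INTERCEPT + sum(c * scaled[name] for name, c in EQ7_COEFFS.items())


# Two hypothetical profiles that differ only in S3K and GGI3.
base = {"S3K": 0.3, "CIC1": 0.5, "BELv6": 0.5, "GGI3": 0.5, "C-025": 0.0}
improved = dict(base, S3K=0.6, GGI3=0.2)
print(round(predict_pic50_eq7(base), 3))
print(round(predict_pic50_eq7(improved), 3))
```

Raising S3K and lowering GGI3 increases the predicted pIC50, in line with the signs of their coefficients.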
To corroborate the study further, a PLS analysis has also been carried out on the 15 descriptors identified through CP-MLR, and the results are given in Table 3. For this purpose, the descriptors have been autoscaled (zero mean and unit s.d.) to give each of them equal weight in the analysis. In the PLS cross-validation, four components have been found to be the optimum for these 15 descriptors, and they explained 78% of the variance in the activity (r2 = 0.780). The MLR-like PLS coefficients of these 15 descriptors are given in Table 3. The calculated activity values of the training- and test-set compounds are in close agreement with the observed ones and are listed in Table 1. For comparison, the plot of observed versus calculated activities (through PLS analysis) for the training- and test-set compounds is given in Figure 2. Figure 3 shows a plot of the fraction contribution of the normalized regression coefficients of these descriptors to the activity (Table 3). Actually, the 15 identified descriptors have shared 55 PLS models and the analysis could reveal four components (Table 3) as optimum to explain the MMP-13 inhibition activity.
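The autoscaling step and the idea of a fraction contribution can be sketched as follows. Autoscaling to zero mean and unit standard deviation is as described above; the use of the sample standard deviation and the definition of the fraction contribution as |b| divided by the sum of all |b| are assumptions made for illustration, since the text does not give the exact formula. The function names and numbers are hypothetical.

```python
from statistics import mean, stdev


def autoscale(values):
    """Scale a descriptor column to zero mean and unit (sample) standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def fraction_contributions(normalized_coeffs: dict) -> dict:
    """Assumed definition: share of each |coefficient| in the sum of all |coefficients|."""
    total = sum(abs(c) for c in normalized_coeffs.values())
    return {name: abs(c) / total for name, c in normalized_coeffs.items()}


# Hypothetical values, not taken from Table 3.
print(autoscale([1.0, 2.0, 3.0]))  # [-1.0, 0.0, 1.0]
print(fraction_contributions({"A": 2.0, "B": -1.0, "C": 1.0}))
```

Under this assumed definition the contributions sum to one, so descriptors can be ranked by their share of the modeled activity regardless of sign.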
|A: PLS equation|
|PLS components||PLS coefficient (s.e.)a|
|B: MLR-like PLS equation|
|S. No.||Descriptor||MLR-like coefficient (f.c.)b||Order|
|C: PLS regression statistics||Values|
ᵃRegression coefficient of PLS factor and its standard error. ᵇCoefficients of MLR-like PLS equation in terms of descriptors for their original values; f.c. is fraction contribution of regression coefficient, computed from the normalized regression coefficients obtained from the autoscaled (zero mean and unit s.d.) data.
Table 3: PLS and MLR-like PLS models from the descriptors of five-parameter CP-MLR models for MMP-13 inhibition activity.
The top ten descriptors in decreasing order of significance are C-025, nNR2, MATS4m, GGI3, C-006, BELv6, BELp6, S3K, N-075 and ATS5m (Table 3, Figure 3). Among these descriptors, C-025, GGI3, C-006, BELv6, BELp6, S3K and N-075 are part of the equations discussed above and convey the same inferences in the PLS analysis. The negative contribution of the functional group count descriptor nNR2 (number of tertiary aliphatic amine functionalities in a molecule) indicates that a higher number of such functional groups is detrimental to activity. The positive contribution of the atomic mass weighted 2D-autocorrelation descriptors (Moran autocorrelation, MATS4m, and Broto-Moreau autocorrelation, ATS5m) suggests that higher values of these are helpful in improving the activity profile. It is also observed that the PLS model from the dataset devoid of the 15 descriptors (Table 3) remained inferior in explaining the activity of the analogues.
On analyzing the applicability domain (AD) in the Williams plot (Figure 3) of the model based on the whole data set (Table 4), one compound (25; Chart 1) has been identified as an obvious ‘outlier’ for the MMP-13 inhibitory activity if the limit of normal values for the Y outliers (response outliers) was set as 3×(standard deviation) units. None of the compounds was found to have leverage (h) values greater than the threshold leverage (h*). For both the training-set and test-set, the suggested model matches the high quality parameters with good fitting power and the capability of assessing external data. Furthermore, all of the compounds were within the applicability domain of the proposed model and were evaluated correctly (Figure 4).
|pIC50 = 7.264 + 2.303(0.489)S3K – 1.822(0.590)CIC1 + 2.187(0.601)BELv6 – 2.456(0.662)GGI3 – 1.069(0.387)C-025|
|pIC50 = 7.280 + 2.207(0.502)S3K – 1.777(0.578)CIC1 + 2.140(0.585)BELp6 – 2.384(0.645)GGI3 – 1.056(0.388)C-025|
|pIC50 = 5.838 + 1.606(0.627)S3K + 1.937(0.642)SEigm + 1.990(0.553)BELp6 – 2.596(0.704)GGI3 – 1.063(0.390)C-025|
|pIC50 = 5.794 + 1.692(0.614)S3K + 1.969(0.653)SEigm + 2.018(0.567)BELv6 – 2.655(0.720)GGI3 – 1.080(0.389)C-025|
Table 4: Models derived for the whole data set (n=54) for the MMP-13 inhibition activity in descriptors identified through CP-MLR.
The MMP-13 inhibition activity of non-zinc-chelating compounds has been quantitatively analyzed in terms of chemometric descriptors. The statistically validated quantitative structure-activity relationship (QSAR) models provided rationales to explain the inhibition activity of these congeners. The descriptors identified through combinatorial protocol in multiple linear regression (CP-MLR) analysis have highlighted the role of the 3-path Kier alpha-modified shape index (S3K), complementary information content index of 1-order neighbourhood symmetry (CIC1), eigenvalue sum from mass weighted distance matrix (SEigm), lowest eigenvalue n. 6 of Burden matrix / weighted by atomic van der Waals volumes (BELv6) and by atomic polarizabilities (BELp6), 3-order topological charge index (GGI3) and the functionality R--CR--R (C-025). From the statistically validated models, it appeared that the descriptors S3K, BELv6, BELp6 and SEigm make a positive contribution to activity and that their higher values are conducive to improving the MMP-13 inhibition activity of a compound. On the other hand, the descriptors CIC1, GGI3 and C-025 have a detrimental effect on activity. Therefore, the absence of the functionality R--CR--R and lower values of the descriptors CIC1 and GGI3 would be advantageous. Such guidelines may be helpful in exploring more potent analogues of the series. The statistics that emerged from the test sets have validated the identified significant models. PLS analysis has further confirmed the dominance of the CP-MLR identified descriptors. Applicability domain analysis revealed that the suggested models have acceptable predictability. All the compounds are within the applicability domain of the proposed models and were evaluated correctly.
The financial support provided by the University Grants Commission, New Delhi, to one of the authors, PS, under the scheme of Emeritus Fellowship is thankfully acknowledged. The authors are also thankful to their Institutions for providing the necessary facilities to complete this work.