QSAR Model for Androgen Receptor Antagonism - Data from CHO Cell Reporter Gene Assays

For the development of QSAR models for Androgen Receptor (AR) antagonism, a training set based on reporter gene data from Chinese hamster ovary (CHO) cells was constructed. The training set is composed of data from the literature as well as new data for 51 cardiovascular drugs screened for AR antagonism in our laboratory. The data set represents a wide range of chemical structures and various functions. Twelve percent of the screened drugs were AR antagonists; three out of six statins showed AR antagonism, two showed cytotoxicity and one was negative. The newly identified AR antagonists are lovastatin, simvastatin, mevastatin, amiodarone, docosahexaenoic acid and dilazep. A total of 874 chemicals (231 positive, 643 negative) constitute the training set for the model. The Case Ultra expert system was used to construct the QSAR model. The model was cross-validated (leave-groups-out) with a concordance of 78.4%, a specificity of 86.1% and a sensitivity of 57.9%. The model was run on a set of 51,240 EINECS chemicals, and 74% were within the domain of the model. Approximately 9.2% of the chemicals in the domain of the model were predicted active for AR antagonism. Case Ultra identified common alerts among different chemicals. By comparing biophores (alerts in positive chemicals) and biophobes (alerts in negative chemicals), it appears that chlorine (Cl) and bromine (Br) enhance the AR antagonistic effect whereas nitrogen (N) seems to decrease the effect. A specific study of benzophenones and benzophenone derivatives indicates that a radical with a "high" number of atoms in the 4-position and/or other positions generally decreases the antiandrogenic effect.

Drug groups among the screened cardiovascular drugs (number of drugs in parentheses) included lipid-regulating fibrates (5), lipid-regulating nicotinates (1), lipid-regulating bile acid-binding resins (1), triglyceride-reducing polyunsaturated fatty acids (2), direct-acting vasodilators (4), vasodilators for ischaemic heart disease (1), and vasodilators for cerebral and peripheral vascular disorders (7).


Introduction
Inhibition of Androgen Receptor (AR) dependent reporter gene transcription provides an important piece of information that flags a potential endocrine-disrupting effect for a wide range of chemicals, e.g. pesticides, industrial chemicals and drugs [1][2][3][4]. In vitro data for AR antagonism may be used for priority setting for further studies, e.g. in vivo experiments that are more costly and time-consuming. Using QSAR models for AR antagonism considerably increases the capacity for priority setting, and this capacity increases further when the QSAR models and associated training sets are improved. Pesticides and industrial chemicals dominate in our existing QSAR model for AR antagonism [5]. It is well known that some drugs have an AR antagonistic effect, either as a primary mechanism of action for efficacy or as a secondary mechanism not directly involved in the pharmacological action of the drug [6][7][8].
Spironolactone, quinidine, procainamide, disopyramide, sotalol, amiodarone, ibutilide, dofetilide and statins are well-known Cardiovascular Disease (CVD) drugs [8,24]. An antiandrogenic effect of CVD drugs may thus be a possibility. Until now, spironolactone has been the only CVD drug present in the training set of our QSAR models for AR antagonism [4,5]. In this study our published QSAR model [5] was applied to analyze the potential occurrence of AR antagonism among CVD drugs. 343 CVD drugs were screened. The QSAR model revealed biophores (chemical structures characteristic of active AR antagonists) in about 40% of the drugs. Chemical structures unknown to the QSAR AR antagonism model were also identified in 40% of the CVD drugs (data not published). Therefore it was decided to analyze some of the CVD drugs for AR antagonism in the AR reporter gene assay described previously [4]. The purpose of this study was to identify new AR antagonists among the CVD drugs and to extend the domain of future AR antagonism QSAR models. The selection of CVD drugs for the AR reporter gene assay is described in Materials and Methods.
Different QSAR models for AR antagonism have been published. Our QSAR model from 2008 was constructed by use of the software MultiCASE and 528 chemicals assayed by use of different cellular reporter gene assays [4]. Later, three modeling systems (MultiCASE, Leadscope and MDL QSAR) were used for the construction of AR antagonism models. There were 923-942 chemicals in the training sets also assayed by use of different cellular reporter gene assays [5].
In addition, AR antagonism QSAR models based on a single cell type and on a specific functional group, the brominated flame retardants (BFRs), have been published [25,26]. Recently, Kovarich et al. [27] used a training set consisting of AR antagonism data from 24 BFRs in a QSAR model developed especially for prediction of BFRs. Human osteosarcoma (U2 OS) cells were used in the AR antagonism assay.
AR reporter gene assays are based on different cell types, e.g. Chinese hamster ovary (CHO-K1) cells [4], human mammary carcinoma cells (MDA-kb2) [28], U2 OS cells [29], African monkey kidney cells (CV-1) [30], human prostate adenocarcinoma PC-3 derived cell line (PALM) [31], human hepatoma liver cells (HepG2) [32] and yeast [33]. However, discrepancies between data from the different in vitro cell assays have been reported [34,35]. It has previously been described that effectors which interact with the ligand-receptor complex, like response elements, corepressor or coactivator proteins, or other transcription factors, are cell-type dependent [36]. In addition, Kojima et al. [37] reported low sensitivity of reporter gene assays based on yeast cells, HepG2 cells or HeLa cells (cervical cancer cells) as compared to reporter gene assays based on CHO cells. Thus, it may be an advantage to use data from a single cell type for the development of AR antagonism QSAR models. Our database for AR antagonism contains data from 1140 chemicals; data for around 900 of these chemicals are based on CHO cells. As our AR antagonism data are collected continuously for all cell types, CHO cells are probably the most often used cell type for AR antagonism assays. Thus the aim of this study was to use AR antagonism data based exclusively on CHO cells to form a training set for a new "CHO" AR antagonism QSAR model. CHO data from an existing training set [5], new CHO data from the literature and new experimental data were used.
New software for QSAR modeling is developed continuously. In this study a newly developed program, Case Ultra, from MultiCASE Inc. was used. The program is especially suitable for the unbalanced training set [38] that was used in this study.

In vitro AR Assay
The AR antagonism assay was performed as previously described [4]. Briefly: for each chemical, Chinese hamster ovary cells (CHO-K1) were run in two parallel lines; one transfected with the plasmids pSVAR0 and MMTV-LUC for antagonism, and another transfected with the plasmids pSVAR13 and MMTV-LUC for the cytotoxicity evaluation. The MMTV-LUC plasmid contains the gene coding for the reporter enzyme luciferase. The plasmid pSVAR0 contains the gene coding for the human androgen receptor (AR). The plasmid pSVAR13 contains the gene coding for AR without a ligand-binding domain (LBD). CHO cells transfected with pSVAR0/MMTV-LUC need R1881 for AR activation. CHO cells transfected with pSVAR13/MMTV-LUC are constitutively AR activated. The chemicals were tested at concentrations of 1, 3, 10, and 30 μM, and within each assay all data were related to the response of 0.1 nM R1881 (methyltrienolone), which was set to 100%. IC25, that is, the concentration of test compound showing a 25% inhibition of the activity induced by 0.1 nM R1881, was calculated for each compound. The criterion for determining "a positive" was that a 25% inhibition of the 0.1 nM R1881-induced response should be reached at a non-cytotoxic concentration ≤10 μM. For QSAR modeling purposes, chemicals showing 25% inhibition at concentrations higher than 10 μM belong to the group of "weak AR antagonists/not AR antagonists", also referred to as negatives.
All AR antagonism data were separated into two groups: a positive group (chemicals with IC25 ≤10 μM) and a negative group (chemicals with IC25 >10 μM or no activity).
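The two-group rule above can be sketched in a few lines of Python. This is only an illustration of the stated cut-off; the function name, the constant name and the use of `None` for "no activity" are assumptions introduced here, not part of the original assay protocol.

```python
from typing import Optional

# Cut-off stated in the text: IC25 <= 10 micromolar at a non-cytotoxic concentration.
IC25_CUTOFF_UM = 10.0


def classify_ar_antagonism(ic25_um: Optional[float]) -> str:
    """Assign a training-set class to one chemical (hypothetical helper).

    ic25_um: IC25 in micromolar, or None when 25% inhibition was not reached.
    """
    if ic25_um is not None and ic25_um <= IC25_CUTOFF_UM:
        return "positive"
    # Weak antagonists (IC25 > 10 uM) and inactive chemicals share one class.
    return "negative"


if __name__ == "__main__":
    # 29 uM is the calculated IC25 quoted in the text for benzophenone.
    for label, ic25 in [("IC25 = 3 uM", 3.0), ("IC25 = 29 uM", 29.0), ("no activity", None)]:
        print(label, "->", classify_ar_antagonism(ic25))
```

Note that under this rule "negative" covers both inactive chemicals and weak antagonists, which is how the term is used for the training set.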
Comparison of common data from the different laboratories showed 83% (29/35) to 91% (40/44) agreement. The compounds "2,4,5-trichlorophenoxyacetic acid" and "di-n-butyl phthalate" were indicated as weak AR antagonists by Araki et al. [46], in agreement with the findings of Vinggaard et al. [4], which classified the compounds as negative. The remaining data were further evaluated. Some chemicals were excluded due to significant discrepancies between data without other supporting data (fenchlorphos, butyl benzyl phthalate, and the steroids estrone and corticosterone). Other data were excluded due to contradictory IC25/IC20 values close to 10 μM (fenvalerate, ethoxyquin). Laboratory 3 found IC50 values of 26.9 and 35.9 μM for 4-tert-octylphenol and p-n-nonylphenol, respectively; according to laboratory 1, using IC25 as the cut-off, these two chemicals were classified "AR antagonism, high" and "negative", respectively. In the QSAR training set, 4-tert-octylphenol was set to positive and p-n-nonylphenol to negative, which also indicates that negative means either negative or weak positive. Takeuchi et al. [40] found di-n-butyl phthalate to be positive with an IC20 value of 4.8 μM. Because more reports showed no AR antagonism, di-n-butyl phthalate was included in the training set as a negative. In laboratory 3, dexamethasone was shown to have an IC50 value of 44.5 μM, but was estimated by laboratory 1 to have an IC25 in the range of 1-3 μM (AR antagonism, moderate); the decision was taken to include dexamethasone in the training set as a positive.
Benzophenones: Benzophenones represent a group of chemicals with functions as drugs and UV stabilizers in sunscreens, cosmetics and plastics [44]. Hydroxyl groups in benzophenones increase the antiandrogenic activity. The IC50 value for benzophenone was 77 μM and the calculated IC25 was 29 μM according to the reference [44]. Thus for the QSAR training set, benzophenone was classified as negative, while several of the hydroxylated benzophenone compounds were classified as positives [4,44].
Six BFRs analyzed by NFI were included in the training set [4,5]. New data from Kojima et al. [43] add a further 15 BFRs to the training set. Analyzed by both laboratories (laboratory 1 and laboratory 2), BDE-100 was found to be positive [4,43].
Cardiovascular drugs: 343 cardiovascular drugs were identified in Drug References [8,24], and predictions were made for these compounds by our AR antagonism QSAR model [5]. The predicted activity (positive, negative, out of domain) as well as the presence of biophores, deactivating fragments and unknown chemical fragments was recorded. The following criteria were used to select CVD drugs for experimental testing in the in vitro assay and for inclusion of the data in the training set:
• Possible AR antagonism according to the literature [11][12][13][14][15][16][17][18][19][20][21]
• Part of a drug group
• The presence of a biophore
• The presence of an unknown fragment
• Overall, a distribution between positive and negative of about 25% and 75%, respectively, i.e. twice the proportion of positives previously found for the EINECS chemicals [4,5].
*also belong to the direct-acting vasodilators.
The whole process for the 51 selected CVD drugs, from QSAR prediction to in vitro laboratory experiments to the prediction by our new QSAR model, is described in Supplement 1. Supplement 1 is available at http://qsar.food.dtu.dk/AntiAndrogensup1.zip
Data preparation for QSAR modeling: For the QSAR modeling, the chemical structures were described using SMILES (simplified molecular-input line-entry system) and imported into OASIS DataBase Manager (DBM) [50]. In DBM, a hydrolysis simulation was performed, and the set was examined for chemicals without at least two carbons (including inorganics) and for chemicals containing heavy atoms. Salts (e.g. sodium/potassium salts and hydrochlorides) were analyzed, and the ones not containing toxic ions were processed by removing the ion part(s) from the structure. Duplicate or conflicting occurrences were removed from the structure set. Thereafter, some SMILES codes were removed by the MultiCASE procedure for checking SMILES and by Data Kurator, the Case Ultra procedure for additional checking of SMILES for correctness.
During creation of the model, α-hexachlorocyclohexane and β-hexachlorocyclohexane were found to have the same 2-D structure (identical SMILES) and the same activity; only one was included in the training set. Dieldrin and endrin were found to have the same structure but different activities; neither of them was included in the training set.
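The duplicate handling described above (keep one record when identical structures agree in activity, drop all records when they conflict) can be sketched as follows. This is a minimal illustration, not the DBM or Case Ultra procedure; the function name and record layout are assumptions, and the SMILES strings are assumed to be already canonicalized so that identical 2-D structures give identical strings.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str, int]  # (name, canonical SMILES, activity: 1 = positive, 0 = negative)


def resolve_duplicates(records: Iterable[Record]) -> List[Record]:
    """Keep one record per structure; drop structures with conflicting activities."""
    by_smiles: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for name, smiles, activity in records:
        by_smiles[smiles].append((name, activity))

    kept: List[Record] = []
    for smiles, entries in by_smiles.items():
        activities = {activity for _, activity in entries}
        if len(activities) > 1:
            continue  # same structure, different activities: exclude all
        name, activity = entries[0]  # same structure, same activity: keep the first
        kept.append((name, smiles, activity))
    return kept


if __name__ == "__main__":
    # Placeholder structure keys and hypothetical activity labels, for illustration only.
    demo = [
        ("isomer 1", "STRUCTURE_A", 0),
        ("isomer 2", "STRUCTURE_A", 0),
        ("isomer 3", "STRUCTURE_B", 1),
        ("isomer 4", "STRUCTURE_B", 0),
        ("unique chemical", "STRUCTURE_C", 1),
    ]
    print(resolve_duplicates(demo))
```

Stereoisomers collapse to the same key here only because the structure notation is two-dimensional, which is the situation the text describes for the hexachlorocyclohexanes and for dieldrin/endrin.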
The training set is available as Supplement 2 at http://qsar.food.dtu.dk/AntiAndrogensup2.zip, and contains information on CAS numbers, machine-readable structure notations and activities.

Modeling methodology
Algorithm: The Case Ultra 64-bit 1.4.0.0 modeling system from MultiCASE Inc. was used [38]. It is a further development of MC4PC, MultiCASE, previously used [4,5]. Case Ultra uses SMILES codes to enter chemicals. The program is a fragment (alert)-based statistical model system that aims to discover fragment combinations which are relevant for the observed effect. Biophores are structural alerts that appear mostly in active molecules and therefore may be responsible for the observed activity. Case Ultra also looks at inactive chemicals in the training set to identify deactivating fragments, termed biophobes. Case Ultra has new functionalities and features and a new algorithm to discover structural alerts. New descriptors are added, e.g. E-state values, surface and volume descriptors, Gasteiger atom-based charges, vapor pressure, pKa and hydrogen bond donors/acceptors. Alerts are no longer only linear or with only one branch; they are now more general substructures. More model validation options exist, e.g. leave N% out N times, also for unbalanced training sets [38]. Case Ultra also uses physicochemical data (e.g. log (octanol/water) partition coefficient) as well as pharmacokinetic data (e.g. Lipinski rule of five and human intestinal absorption) and fragments as modulators (increasing or decreasing the activity prediction).
Applicability domain: While making a prediction, Case Ultra may report that the prediction is out of domain; this may be due to the presence of fragments not occurring in the training set. In the Case Ultra domain definition, up to one unknown fragment is accepted.
Predictions may also be inconclusive, e.g. when a chemical contains biophores as well as biophobes.
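The domain and inconclusive rules stated above can be illustrated with a toy decision function. This is not the Case Ultra algorithm, which weighs alerts statistically and uses modulators; the function name, the representation of a molecule as a set of fragment strings, and the rule that any biophore/biophobe co-occurrence is inconclusive are simplifying assumptions made for illustration.

```python
from typing import Set

# Stated in the text: up to one unknown fragment is accepted in the domain.
MAX_UNKNOWN_FRAGMENTS = 1


def prediction_status(
    query_fragments: Set[str],
    training_fragments: Set[str],
    biophores: Set[str],
    biophobes: Set[str],
) -> str:
    """Toy classification of a query molecule given as a set of fragments."""
    unknown = query_fragments - training_fragments
    if len(unknown) > MAX_UNKNOWN_FRAGMENTS:
        return "out of domain"
    has_biophore = bool(query_fragments & biophores)
    has_biophobe = bool(query_fragments & biophobes)
    if has_biophore and has_biophobe:
        return "inconclusive"  # simplification: the text says this *may* happen
    return "positive" if has_biophore else "negative"


if __name__ == "__main__":
    # Hypothetical fragment labels, not real Case Ultra alerts.
    training = {"f1", "f2", "f3", "f4"}
    print(prediction_status({"f1", "f9"}, training, {"f1"}, {"f3"}))
    print(prediction_status({"f1", "f8", "f9"}, training, {"f1"}, {"f3"}))
```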

Statistical analysis:
In Case Ultra, a specific program for unbalanced training sets is available and was used in this study. Leave-groups-out cross-validation was used. The Case Ultra model was validated by five repetitions of two-fold (50%) cross-validation [38,51]. The cross-validation result was evaluated by use of Cooper statistics [52]. Cooper statistics use sensitivity (ability to predict actives), specificity (ability to predict inactives), and concordance (overall accuracy) to describe the predictivity of a model.
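The validation scheme and the Cooper statistics named above can be sketched as follows. This is a minimal illustration in plain Python and not Case Ultra's implementation; the function names are hypothetical, and splitting each class separately (stratification) is an assumption made here so that both halves keep the positive/negative ratio of an unbalanced set.

```python
import random
from typing import Dict, Iterator, List, Sequence, Tuple


def cooper_statistics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Sensitivity, specificity and concordance in percent from confusion counts."""
    return {
        "sensitivity": 100.0 * tp / (tp + fn),  # actives predicted active
        "specificity": 100.0 * tn / (tn + fp),  # inactives predicted inactive
        "concordance": 100.0 * (tp + tn) / (tp + fn + tn + fp),  # overall accuracy
    }


def repeated_two_fold_splits(
    labels: Sequence[int], repeats: int = 5, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for repeated two-fold cross-validation."""
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    for _ in range(repeats):
        half_a: List[int] = []
        half_b: List[int] = []
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)
            middle = len(shuffled) // 2
            half_a.extend(shuffled[:middle])
            half_b.extend(shuffled[middle:])
        yield half_a, half_b  # train on A, test on B
        yield half_b, half_a  # then swap the roles


if __name__ == "__main__":
    labels = [1] * 231 + [0] * 643  # class sizes of the training set in the text
    print(len(list(repeated_two_fold_splits(labels))), "train/test pairs")
    print(cooper_statistics(tp=50, fn=50, tn=90, fp=10))  # made-up counts
```

With five repetitions this gives ten train/test pairs; the statistics would then be computed from the predictions pooled or averaged over those pairs.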

Cardiovascular drugs in the training set
Data for the newly assayed cardiovascular drugs are shown in Table 1. AR antagonism at concentrations ≤10 μM was found in six drugs out of 51. This corresponds to AR antagonism in 12% of the investigated cardiovascular drugs.
The initial QSAR prediction of 17 antiarrhythmics with our published MultiCASE model [5] showed dofetilide and ibutilide to be positive due to the presence of biophores. However, the molecules also contain deactivating fragments (biophobes). Only amiodarone among the antiarrhythmics contains solely a biophore. The in vitro assay showed that only amiodarone among the antiarrhythmics was positive. Among the lipid-regulating CVD drugs, two statins (lovastatin and simvastatin) out of six were predicted to be positive due to biophores only. These two statins were also found to be positive in the laboratory test. In addition, mevastatin was experimentally found to be positive. The QSAR prediction of atorvastatin, fluvastatin and pravastatin showed all three to contain biophores as well as unknown fragments. For the fibrates, the prediction showed four out of the five drugs to be negative; fenofibrate was predicted positive. QSAR prediction of two other lipid-regulating CVD drugs, acipimox and colestipol, showed acipimox to contain unknown fragments and colestipol to be negative. The in vitro assay showed both to be negative.
Among the polyunsaturated fatty acids, the in vitro assay showed docosahexaenoic acid to be positive.
For two of the CVD drug groups, the "Angiotensin II receptor antagonists" and the "Vasodilators, direct acting", the QSAR prediction showed all (five and five, respectively) to be out of domain due to the presence of unknown chemical fragments. Four of the "Angiotensin II receptor antagonists" also contained biophores. The in vitro assay showed all to be negative. Among the other vasodilators, the vasodilator used for ischaemic heart disease (dilazep) was predicted by QSAR to contain solely biophores and was found to be positive in the in vitro assay.
Lovastatin, simvastatin, mevastatin, amiodarone, docosahexaenoic acid and dilazep were included in the QSAR training set as positives. The other drugs were included as negatives, except for the chemicals producing cytotoxicity (two statins, atorvastatin and fluvastatin, and one antiarrhythmic, verapamil), which were excluded from the training set.
Data for 890 chemicals from the existing training set, new CHO data from the literature and the new experimental data for the CVD drugs made up the new training set. These chemicals were reduced to 874 chemicals. Among the newly analyzed CVD drugs, acipimox and bretylium were not accepted during the technical adaptation procedure.

Validation, applicability domain and chemicals with antiandrogenic effect
The five times two-fold 50% cross-validation of the Case Ultra QSAR model (Table 2) showed a sensitivity of 57.9%, a specificity of 86.1% and a concordance of 78.4%.
In total, 51,240 discrete organic EINECS (European INventory of Existing Commercial chemical Substances) chemicals were predicted using the modeling system. Table 2 shows the domain of the Case Ultra model to be 74% of the EINECS chemicals. Of the EINECS chemicals within the model domain, approximately 9.2% were predicted positive for AR antagonism.

Biophores
From the 874 chemicals in the training set for the Case Ultra model, 79 alerts were identified, 45 as biophores and 34 as biophobes. 35 of the biophores were present in three or more active molecules in the training set. Table 3 shows the most significant alerts in the training set of the Case Ultra model; in addition, other alerts are shown. The most significant alerts (alert no. 6 (cc(ccc)c) and alert no. 1 ((Cl)ccccc)) were primarily found in PCBs. Other alerts, e.g. no. 4 (c1ccccc1), were found in PCBs and in chemicals from other groups (e.g. brominated diphenyl ethers) as well.
Negative benzophenone derivatives containing alert no. 5 were characterized by having a radical with a "high" number of atoms in the 4-position and/or other positions, e.g. octyloxy, sulfonic acid, dibutylamino and 2-methylpropanoic acid 1-methylester. The CVD drug fenofibrate belongs to this group of negative benzophenone derivatives and also contains a deactivating fragment (alert no. 64 (Cc1ccccc1O); 0 out of 7 molecules containing this alert are active) (Figure 2).
Alert no. 12 (ccc(cc)c) was mostly found in PAHs with a distribution between positive and negative of 75%.
Alert no. 13 (cccc(O)c) was present in two of the newly identified AR antagonists with cardiovascular drug effect (amiodarone, an antiarrhythmic, and dilazep, a vasodilator used for treatment of ischaemic heart disease). Other chemicals with this alert were herbicides, fungicides and BFRs. Figure 3 shows alert no. 13 in prochloraz (a fungicide), BDE-100 (a BFR) and amiodarone (a CVD drug). Distribution between positive and negative chemicals in this alert group was 75%.
The three active statins contained a common alert (no. 27), an alert not found in other types of chemicals; the alert contained a high number of carbons (16) and oxygens (5). However, as in the initial MultiCASE QSAR prediction, lovastatin and simvastatin differ from mevastatin; in addition, lovastatin and simvastatin contained biophore no. 33, also found in 3-methylpent-1-ene and 1-hexene,3 methyl. Mevastatin also contained an additional alert (no. 26), an alert which was common with spironolactone, betulin, corticosterone acetate and mifepristone, all steroid-like drugs. For the last new drug with AR antagonistic effect, docosahexaenoic acid, the alert was the whole molecule.
Table 4 shows the alerts in the inactive chemicals (the biophobes). Nitrogen is present in four out of the ten most significant biophobes, two in ring structures. Sulfur and phosphorus were present in one alert each; otherwise carbon atoms and oxygen make up the alerts. The most significant biophobe was alert no. 46, consisting of a nitrogen-carbon structure ([n]c) placed in ring structures; 57 out of 58 molecules containing [n]c are inactive. Among the CVD drugs, all five Angiotensin II receptor antagonists contain alert no. 46. In addition, three other CVD drugs also contained alert no. 46. Alert no. 47 consists of the hydroxylated part (CO) of organic acids; 51 out of 54 molecules containing (CO) are inactive. Many of the inactive cardiovascular drugs contribute to the training set with molecules containing alert no. 47. However, the two mentioned polyunsaturated fatty acids experimentally found to be AR antagonists (docosahexaenoic acid and γ-linolenic acid) also contain alert no. 47.
* Chemicals predicted marginal or inconclusive are not included.
a Structural alerts are described using SMILES notation for aliphatic and aromatic compounds according to the Daylight Theory Manual [59]. Notation: lower case atoms are aromatic; ( ): branch point. When the alert covers part of an aromatic structure, the attached part of the alert (aliphatic or aromatic) is enclosed in parentheses.
b Two occurrences of the same biophore in the mentioned molecule.
c (1 - p-value) x 100
Table 3: The most significant alerts (biophores) and the alerts in the active CVD drugs in the Case Ultra AR antagonism model.
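To give a feel for how strongly an alert such as biophobe no. 46 (57 of 58 molecules inactive) or no. 47 (51 of 54 inactive) deviates from the overall composition of the training set (643 of 874 inactive), a one-sided binomial tail probability can be computed. The source does not state which test Case Ultra uses for the significance in footnote c, so the binomial test below is an assumption chosen for illustration, and the function name is hypothetical.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


# Fraction of inactive chemicals in the whole training set (643 negatives of 874).
baseline_inactive = 643 / 874

# Probability of seeing at least this many inactives by chance among the
# molecules carrying the alert, if the alert were unrelated to activity.
p_alert_46 = binomial_tail(57, 58, baseline_inactive)
p_alert_47 = binomial_tail(51, 54, baseline_inactive)

if __name__ == "__main__":
    print(f"alert no. 46: p = {p_alert_46:.2e}")
    print(f"alert no. 47: p = {p_alert_47:.2e}")
```

Both probabilities are far below conventional significance thresholds, consistent with these alerts being listed among the most significant biophobes, although the values need not match those computed by the software.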

Discussion
The CVD drugs and the AR antagonism
The in vitro assay showed an occurrence of 12% of AR antagonists among cardiovascular drugs. This is at the level previously found among 49,292 EINECS chemicals [5]. However, as for the EINECS chemicals, some drug groups contain AR antagonists more frequently than others. Among the antiarrhythmics only amiodarone was positive. The presence of an antiandrogenic effect among statins was confirmed. An intuitive evaluation of Figure 1 indicates that the AR antagonism activity of statins may be related to the pyran part and the cytotoxicity to the azole, benzene fluoride part. The presence of antiandrogenic effects of statins, and not of fibrates, acipimox and colestipol, is supported by the findings of a Finnish epidemiological study. This study showed statins to have a beneficial effect on prostate cancer prevention; this effect was not found for fibrates, acipimox and colestipol [19]. The finding of docosahexaenoic acid as positive was unexpected, but another polyunsaturated fatty acid, γ-linolenic acid, was also found to be positive in a previous study [4]. The vasodilator used for ischaemic heart disease (dilazep) was also found to be positive. This in vitro finding was expected only due to the prediction by our published QSAR model [5]. This study shows that CVD drugs that were predicted to be positive by our previously published QSAR model were generally confirmed to be positive in the in vitro assay.

The new QSAR model
The concordance of 78.4% was slightly higher than the concordances of our previous QSAR models for AR antagonism [4,5]. When the newly developed training set was modeled with the MultiCASE software, which we have used for previous QSAR modeling [5], a similar concordance of 78.7% was found. The domain of the present Case Ultra model was improved from about 60% to 74% as compared to our previous AR antagonism QSAR models [4,5].

Alerts (biophores/biophobes) and the CVD drugs
In some cases, an individual alert can make up the main part of a chemical group, e.g. PCBs or benzophenones, but can also be common to chemicals belonging to different chemical groups. PCBs, which in particular are AR antagonists (74% of 39 PCBs were experimentally shown to be AR antagonists [4]), do not have biophores in common with the CVD drugs in this study.
Evaluating positive and negative chemicals in the alert groups, it was found that a radical with a "high" number of atoms in the 4-position and/or other positions may decrease the AR antagonism of benzophenones. The CVD drug fenofibrate contains the actual biophore but also a radical with a "high" number of atoms in the 4-position and a significant biophobe, and shows no AR antagonism. This finding adds new knowledge to the identification of chemical structures of importance for the AR antagonism of benzophenones. Kawamura et al. [44] have previously shown that a hydroxyl group at the 2-position generally enhances the antiandrogenic activity of benzophenones (Figure 2). Thus, evaluating positive and negative chemicals in the alert groups of QSAR models may be a useful supplementary tool for potency evaluation.
Alert group no. 13 (cccc(O)c) was present in chemical groups with different functions, e.g. prochloraz (a fungicide), BDE-100 (a BFR) and amiodarone/dilazep (CVD drugs). This illustrates that it is valuable to include a wide range of chemical structures in the training set (Figure 3).
The three statins showing AR antagonism were present in one alert group (no. 27). This biophore contained the pyran part (Figure 1). Lovastatin/simvastatin and mevastatin were each also included in an additional biophore group. Mevastatin was in the same biophore group as steroid-like drugs, e.g. spironolactone. The statin with no AR antagonism, pravastatin, was in the large alert group
a Structural alerts are described using SMILES; see footnote to Table 3.
b (1 - p-value) x 100
Table 4: The most significant alerts (biophobes) in the inactive molecules and the alerts in the inactive CVD drugs and benzophenone in the Case Ultra AR antagonism model.

Biophobes
Case Ultra makes use of the ratio between active and inactive chemicals when predictions are made, and chemicals like fatty acids with unexpected AR antagonistic activity would probably be predicted as inactive, not least due to biophobe no. 47. Only one biophore will exist for such chemicals and cover the whole molecule. Thus, in spite of having QSAR models for important priority setting, experimental measurement should be performed when possible. The AR antagonism of docosahexaenoic acid and γ-linolenic acid [4] is a good example of this. This property of these unsaturated fatty acids deserves further attention.
By comparing biophores and biophobes it appears that chlorine (Cl) and bromine (Br) enhance the AR antagonistic effect while nitrogen seems to decrease the effect.

Comparison of AR antagonism data between assays based on different cell types
In general, good agreement exists between qualitative data from various AR reporter gene assays, and QSAR models have been developed by use of data from different assays [4,5]. Minor disagreements between data for AR antagonism in CHO cells and U2-OS cells are found [4,29,43]. A comparison of AR antagonism data for brominated flame retardants experimentally obtained in U2-OS cells as well as in CHO cells shows a possible disagreement of 27% (3/11), i.e. differences were found for BDE-153, BDE-190 and TBBPA.
Disagreement between AR antagonism in CHO cells and MDA (human mammary carcinoma) cells was also found; i.e. chemicals with AR antagonism in CHO cells were inactive in MDA cells, with steroids dominating, making up three (progesterone, cyproterone acetate and 17β-estradiol) out of four chemicals [4,28,53]. The fourth chemical was chlordane [2,34,54]. This disagreement for chlordane was also reported previously by Aït-Aïssa et al. [34]. The chemicals showing AR antagonism in MDA cells but inactive in CHO cells were all non-steroids (phenanthrene, pirimiphos-methyl, chlorpropham, metolachlor, pretilachlor) [2,4,28,34,55]. Chemicals like diethyl phthalate, aldrin and γ-lindane, having IC50 values at concentrations of 60-80 μM in MDA cells, were negative in CHO cells [4,28,40], where the cut-off for positive results was set at 10 μM. However, due to the relatively high concentration needed for AR antagonism in MDA cells, these chemicals were considered negative in the two cell assays.
Thus, discrepancies in AR antagonism between cellular assays should without a doubt be taken into account when QSAR models are to be developed. Data from some cell types seem more appropriate to include in the same training set than others, without exclusion of too many contradicting results. Individual cellular QSAR models as well as multicellular QSAR models for AR antagonism could be developed with advantage.
Discrepancies in AR antagonism between cells may be due to differences in mechanism as well as in the cell type used. Differences may include: the initiating ligand for the AR (R1881, DHT), metabolism, the reporter plasmid, the cellular presence of endogenous receptors (AR, glucocorticoid receptor (GR)), and the method of cytotoxicity evaluation (at the transcription complex level in CHO assays, or by phase-contrast microscopy for cellular vacuolization in MDA assays) [2,4,28,55].
Deviations between data from different CHO assays may also occur. This may be related to minor differences between protocols. Some laboratories use R1881 to initiate the AR-mediated transcription, while other laboratories use DHT. Some laboratories report IC25 and measure inhibition up to 10 μM; other laboratories report IC50 and measure inhibition up to 100 μM or higher. Compared to other cellular AR antagonism protocols, the cytotoxicity evaluations applied in the CHO assays seem relatively similar, i.e. the cytotoxicity of chemicals is evaluated by transcriptional activity [2,4], which is the ideal way to investigate cytotoxicity in this context.

Conclusion
In the development of QSAR models for Androgen Receptor (AR) antagonism, a training set based on Chinese hamster ovary (CHO) cells was constructed. Data from the literature and data on 51 cardiovascular drugs recently screened for AR antagonism at our laboratory at the National Food Institute (NFI) make up the training set. All data together represent a wide range of chemical structures and various functions. Twelve percent of the NFI-screened drugs are AR antagonists; three out of six statins showed AR antagonism, two showed cytotoxicity and one was negative. The newly identified AR antagonists are lovastatin, simvastatin, mevastatin, amiodarone, docosahexaenoic acid and dilazep.
A total of 874 (231 positive, 643 negative) chemicals constitute the training set for the model. The Case Ultra expert system was used to construct the QSAR model. Case Ultra showed a concordance of 78.4%, a sensitivity of 57.9% and a specificity of 86.1%. The model was run on a set of 51,240 EINECS chemicals, and 74% were within the domain of the model. Approximately 9.2% of the chemicals in the model domain were predicted active for AR antagonism.
Case Ultra identifies common alerts among chemical groups with the same as well as different functions. By comparing biophores and biophobes, it appears that Cl and Br may enhance the AR antagonistic effect while nitrogen seems to decrease the effect. A specific study of benzophenones and benzophenone derivatives indicates that a "high" number of atoms in the 4-position and/or other positions generally decreases the antiandrogenic effect.