HIV-PDI: A Protein Drug Interaction Resource for Structural Analyses of HIV Drug Resistance: 2. Examples of Use and Proof-of-Concept

The HIV-PDI resource was designed and implemented to address the problems of drug resistance with a central focus on the 3D structure of the target-drug interaction. Clinical and biological data, structural and physico-chemical information, and 3D interaction data concerning the targets (HIV protease) and the drugs (ARVs) were meticulously integrated and combined with tools dedicated to studying HIV mutations and their consequences for the efficacy of drugs. Here, the capabilities of the HIV-PDI resource are demonstrated for several different scenarios, ranging from retrieving information associated with patients to analyzing structural data relating cognate proteins and ligands. HIV-PDI allows such diverse data to be correlated, especially data linking antiretroviral drug (ARV) resistance to a given treatment with changes in three-dimensional interactions between a drug molecule and the mutated protease. Our work is based on the assumption that ARV resistance results from a loss of affinity between the mutated HIV protease and a drug molecule due to subtle changes in the nature of the protein-ligand interaction. Therefore, a set of patients whose resistance to first line treatment was corrected by a second line treatment was selected from the HIV-PDI database for detailed study, and several queries regarding these patients were processed via its graphical user interface. Considering the protease mutations found in the selected set of patients, our retrospective analysis was able to establish in most cases that the first line treatment was not suitable, and it predicted a second line treatment which agreed perfectly with the clinician’s prescription. The present study demonstrates the capabilities of HIV-PDI. We anticipate that this decision support tool will help clinicians and researchers find suitable HIV treatments for individual patients.
The HIV-PDI database is thereby useful as a data collection system that allows interpretation on the basis of all available information, thus supporting decision-making.


Introduction
The fast growth of HIV resistance to antiretroviral drugs (ARVs) is one of the main limitations in treating the disease [1,2]. Overcoming this resistance in HIV-infected patients is a major public health challenge in AIDS research today [3,4]. The ability of HIV to mutate and produce genetic variations has allowed HIV to develop resistance to many currently available ARVs [5,6]. Crystallographic studies have clearly shown that mutations in the HIV genome can induce structural modifications at the protein active sites targeted by ARVs, which therefore reduce ARV potencies [7][8][9]. Understanding the mechanisms by which these structural variations emerge and evolve at the molecular level may enable individual patient-specific drug treatments to be formulated. In this context, we developed the HIV-PDI (HIV protein drug interaction) resource, which was introduced in the accompanying article (paper 1). The main aim of this resource is to aggregate three-dimensional (3D) structure data relating to HIV-ARV interactions together with more classical biological and clinical data on HIV-infected patients.
While the ultimate goal of the HIV-PDI resource is to help clinical decision-making regarding HIV patients with ARV resistance, it will also facilitate the analysis of HIV resistance for basic research purposes. For example, it allows the progress of patients treated with a first line treatment to be tracked and correlated with structural modifications induced by mutations in the viral proteins. In the first part of this paper, we demonstrate several such analyses which can be performed using HIV-PDI. In the second part, we validate our basic hypotheses that resistance to treatment stems from a loss of affinity of the delivered ARVs due to mutation-induced structural modifications at the drug binding site, and that similar compensating interactions may be created using other ARVs.

HIV-PDI example of use
HIV-PDI was used as the source of all the data considered in this study in order to demonstrate the utility of the system. HIV-PDI is coupled with tools for visualizing and analyzing 3D protein-drug interactions, and with data mining programs. Each query can involve multiple database fields, including the target name, drug name or function, and therapeutic drug classification. This is illustrated here by several requests performed from specific graphical user interfaces (GUIs) in order to extract information and, eventually, to correlate various data stored in the database. From these requests, information or knowledge was extracted by exporting the data into tables, which was a prerequisite for further analysis.

HIV-PDI proof of principle
From the HIV-PDI database, we selected a subset of patients who presented resistance to a first line ARV treatment, which was overcome by a second line treatment. Given that the HIV-PDI database currently focuses on the well documented HIV protease and related drugs, the only change between the two lines of treatment had to be the protease inhibitor. From the biological data recorded in the HIV-PDI database, we retrieved all patients meeting the inclusion criteria using SQL script-based queries written in the PL/pgSQL language. Figure 1 summarizes the three main steps of the selection process. Briefly, for each of the 2029 patients recorded in HIV-PDI, a chronological list of the dates at which clinical and/or mutation data were available was generated. Patients who met the following criteria were selected: i) the treatment taken before the first HIV protease sequencing (T = x1) is known, ii) data on viral load, HIV protease sequence and treatment are available at the date of the genotyping test (T = 0), and iii) viral load data are available at the date following the first sequencing (T = y1).
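The three inclusion criteria can be illustrated with a minimal Python sketch. This is not the PL/pgSQL script used in the study: the `Visit` record, its field names, and the reading of T = x1 and T = y1 as the visits immediately before and after the first sequencing date are assumptions made only for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Visit:
    """One dated record for a patient (hypothetical schema, not the HIV-PDI one)."""
    when: date
    viral_load: Optional[float] = None        # copies/mL, if measured
    protease_sequence: Optional[str] = None   # present only on genotyping dates
    treatment: Optional[str] = None           # e.g. "SQV", if recorded


def meets_inclusion_criteria(visits: List[Visit]) -> bool:
    """Check criteria i-iii around the first protease sequencing (T = 0)."""
    timeline = sorted(visits, key=lambda v: v.when)
    # T = 0 is taken to be the first visit that carries a protease sequence.
    t0_index = next(
        (i for i, v in enumerate(timeline) if v.protease_sequence), None
    )
    if t0_index is None:
        return False
    if t0_index == 0 or t0_index == len(timeline) - 1:
        return False  # a visit before (x1) and a visit after (y1) are required
    before = timeline[t0_index - 1]
    t0 = timeline[t0_index]
    after = timeline[t0_index + 1]
    return (
        before.treatment is not None      # i) treatment known at T = x1
        and t0.viral_load is not None     # ii) viral load at T = 0 ...
        and t0.treatment is not None      #     ... and treatment at T = 0
        and after.viral_load is not None  # iii) viral load at T = y1
    )
```

Applying such a predicate to every patient's timeline yields the candidate set from which the later selection steps of Figure 1 would proceed.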
The 3D structures of drug-resistant protease mutants and wild type (WT) protease were selected in an apo form or in complex with protease inhibitors. The process of building the 3D protease mutants by homology modeling is described in the accompanying article. In order to predict the best molecules that overcome the resistance observed in the selected patients, a docking study was carried out with a collection of 40 NIAID-FDA compounds. The NIAID-FDA collection consists of ARVs from the FDA and NIAID datasets that target the HIV protease. Full details of the docking procedure are described in the accompanying article.

Cheminformatics analyses
Searching the HIV-PDI associated database with cheminformatic queries, such as retrieving ligands with a given substructure or searching for and comparing molecules by their 3D shapes, chemical groups, or functions, can help to understand the role of chemical moieties with regard to drug resistance. Hence, several cheminformatics routines have been made available in the GUI, e.g. substructure searching and Tanimoto similarity calculations. For this purpose, the main GUI page (see paper 1 for its description) offers search options on ligand categories in order to retrieve the file containing the required properties, which can then be used for further analyses in the database. For example, it is possible to specify the 2D or 3D chemical structure of any ligand as well as its InChIKey [10] or ChemAxon fingerprints [11]. This allows all similar compounds in the database to be retrieved, or all compounds selected by chemical groups or molecular structure to be identified (see Table 1, which shows the Tanimoto similarity scores between usual ARVs extracted from HIV-PDI). More advanced analyses can also be carried out, such as retrieving ligands with a similar shape, chemical substituents or pharmacophores to those of a given query ligand.
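For readers unfamiliar with the Tanimoto score, the following minimal Python sketch shows how it is computed from binary fingerprints and how ligands can be ranked against a query. Fingerprints are represented here as sets of "on" bit positions; the ligand names and bit values are invented for illustration and are not ChemAxon fingerprints or HIV-PDI data.

```python
def tanimoto(bits_a: set, bits_b: set) -> float:
    """Tanimoto coefficient |A ∩ B| / |A ∪ B| of two sets of 'on' bits."""
    if not bits_a and not bits_b:
        return 1.0  # convention chosen here for two empty fingerprints
    shared = len(bits_a & bits_b)
    return shared / (len(bits_a) + len(bits_b) - shared)


def rank_by_similarity(query: str, fingerprints: dict) -> list:
    """Return (name, score) pairs for all other ligands, most similar first."""
    reference = fingerprints[query]
    scores = [
        (name, tanimoto(reference, fp))
        for name, fp in fingerprints.items()
        if name != query
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy fingerprints with made-up bit positions (hypothetical ligands).
toy = {
    "ligand_A": {1, 4, 7, 9},
    "ligand_B": {1, 4, 7, 12},
    "ligand_C": {2, 3, 20},
}
print(rank_by_similarity("ligand_A", toy))
# [('ligand_B', 0.6), ('ligand_C', 0.0)]
```

A score of 1 indicates identical fingerprints and 0 indicates no shared bits, so a table such as Table 1 can be read as a pairwise matrix of these values.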

General structural analysis
Structural data relating to mutant and WT Protein Data Bank (PDB) structures of the HIV-1 protease are stored in the HIV-PDI database. This information can also be retrieved through the GUI. It is therefore possible to investigate the influence of given mutations on the 3D structures of the target proteins, to compare their 3D structures and to highlight their differences. Importantly, modifications of ARV-protease interaction patterns in mutant proteases can be investigated at the 3D structure level using the GUI in combination with Jmol, an open-source Java viewer for three-dimensional chemical structures [12]. For example, the main structural characteristics of the protease homo-dimer can easily be visualized for each structure in the database. Figure 2a shows the Asp25 position of the critical Asp25-Thr26-Gly27 triad of the active site [13,14], on whose vicinity subsequent studies can be focused. Other structural elements such as the anti-parallel β-sheets linking the two subunits and the so-called "flap structure" [15] in each subunit can also be inspected manually.
The structural differences that exist between the bound and free states of the proteins can also be visualized and compared (Figure 2b) [13,14]. Such analyses reveal that, in all of the holo structures, the flaps are pulled inward toward the bottom of the active site (the "closed" form), whereas a "semi-open" conformation is observed in the structures of the apo protease, with the flaps shifted away from the dual catalytic aspartate (Asp25) but still forming a lid over the active site and still maintaining contacts with each other. Such an analysis can be extended, for example, to compare the 3D interaction patterns of the holo structures containing a given class of ligand, or to compare all WT protein structures complexed with different ligands.
Many other types of queries are supported by HIV-PDI, such as getting a list of all existing WT structures (see Table 2, which shows the currently available WT structures of HIV-1 protease in the PDB), or requesting all existing mutated structures which are linked to resistance to a particular ARV (see Table 3, which shows the HIV-1 mutations that confer resistance to the main protease inhibitors). All such lists can be analyzed, and the associated 3D structures compared and dissected at both the protease and the ligand levels.
The HIV-PDI system can also be used to list the different types of mutations found in the protease (Table 3), and to consider their influence according to their 3D location within the protein structure [16][17][18][19][20][21]. For example, all of the protease structures available to date in complex with the drug amprenavir (APV) (see Table 4, which shows the protease structures complexed with amprenavir) were extracted from the HIV-PDI database and compared. Calculating the root mean square deviations (RMSD) between the Cα backbone atoms in these complexes shows that the overall protease conformation is maintained with an RMSD of 0.7 Å, regardless of the locations of the mutations (Figure 2c). Nonetheless, some small local flap modifications between mutants are observed, and these could be associated with changes in ARV binding affinities in the active site, thus explaining the reduced activity of the protease inhibitors in those mutants [14,22,23].
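As a reminder of what the RMSD figure measures, the sketch below computes RMSD = sqrt((1/N) Σ |a_i − b_i|²) over N paired Cα coordinates. It is a minimal Python illustration that assumes the two structures have already been superposed and that their Cα atoms are listed in the same residue order; the superposition step itself is not shown, and the function name is hypothetical rather than part of HIV-PDI.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) Cα coordinates.

    The coordinates are assumed to be already superposed; the result is in
    the same unit as the input (Å for PDB files).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Two toy "structures" whose atoms are all displaced by 1 Å along z.
structure_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
structure_2 = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0)]
print(ca_rmsd(structure_1, structure_2))  # 1.0
```

Restricting the coordinate lists to a region such as the flaps gives a local RMSD, which is one simple way to quantify the local modifications mentioned above.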

Figure 2 (b): Superposition of the wild-type apo protease (PDB code 3PHV; for clarity, only one monomer unit is displayed, as a grey ribbon on the right-hand side) and the homodimer of the wild-type holo protease in complex with two A00 molecules (PDB code 2AMP; ribbon colored according to secondary structure). The critical Asp25 of the apo form, displayed as white sticks, is shifted downward with regard to the Asp25 of the complex, displayed as orange sticks. The flap of the apo form is shifted upward compared to the flap of the complexed protease.
Figure 2 (c): Cα superposition of all 12 PDB structures complexed with amprenavir (see Table, Supplemental Digital Content 4, which shows the protease structures complexed with amprenavir). The mutation locations are shown as blue spheres.

Patient-specific HIV protease mutation analyses
HIV-PDI can also be used to extract patient data and to compare the associated structural information to that found in the PDB. This is demonstrated here using patient 39546, who was selected at random from the HIV-PDI database, and whose HIV protease was found to have the following mutations: L15V, E35D, R41K, I50L, and V82L. Extracting the clinical data for this patient shows that ARV resistance was revealed by virological failure (increased viremia) after first line treatment with the antiprotease amprenavir (APV). Comparing the protein-ligand complex of this patient's protease and that of a drug-resistant HIV-1 mutant from the PDB with the crystal structure of WT HIV-1 protease reveals several structural differences (Figure 3a). The flaps of the PDB mutant structure and of the patient mutant are more open than the flaps of the WT structure. Furthermore, the ear, whiskers, nose, eye and the two active sites are also slightly modified. Even though the protease of patient 39546 has more mutations (Figure 3a, panel A) than the 3EM3 drug-resistant mutated protease from the PDB (Figure 3a, panel B), the majority of this patient's mutations are far from the protease active site, except for I50L, which is situated on the flap very close to the active site. This suggests that the main mutation that influences ARV activity is the same (i.e. I50L) in these two proteases.
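A first, purely sequence-based screen of such a mutation list can be sketched in a few lines of Python. The set of positions below is taken from the residues that this article lists as important for drug binding and stability (Asp25, Gly27, Asp29, Asp30, Gly48, Ile50); the function names are hypothetical, and a position lookup of this kind is only a crude stand-in for the 3D proximity analysis performed with HIV-PDI.

```python
import re

# Positions of the residues listed in the text as important for drug binding.
BINDING_POSITIONS = {25, 27, 29, 30, 48, 50}

_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Split a label such as 'I50L' into (wild_type, position, mutant)."""
    match = _MUTATION.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognised mutation label: {label!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def mutations_at_binding_positions(labels, positions=BINDING_POSITIONS):
    """Keep only the mutations that fall on one of the given positions."""
    return [m for m in labels if parse_mutation(m)[1] in positions]


patient_39546 = ["L15V", "E35D", "R41K", "I50L", "V82L"]
print(mutations_at_binding_positions(patient_39546))  # ['I50L']
```

For patient 39546 the screen singles out I50L, consistent with the structural observation that it is the only mutation lying very close to the active site.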
Another type of evidence that may be obtained from HIV-PDI concerning this patient is the shape of the binding site cavity in the mutated protease. The effect of mutations on the structural features of the binding cavity may be analysed in order to explain variations in potency observed for a given drug and a mutated HIV-1 protease. For example, the WT HIV-1 protease structure (PDB code 3EKV), the structure of the I50L/A71V protease mutant (PDB code 3EM3), and a homology-modeled structure of the protease of patient 39546 may be compared by superposing their respective structures. Figure 3b shows the shapes of the three cavities calculated using the spherical harmonic comparison technique that we developed previously. This figure shows that although the three cavities have broadly similar shapes, the cavity of the patient is considerably smaller than those of the other two structures. This difference could explain why the two mutated HIV-1 proteases do not have the same affinity for APV as the WT HIV-1 protease. This will be confirmed by an analysis of the pattern of intermolecular interactions between the WT and mutated HIV-1 proteases complexed with APV.
(Figure caption residue: the main protease residues are Asp25, Gly27, Asp29, Asp30, Gly48 and Ile50 [24,25].)
Intermolecular interactions concerning one or several protein-ligand complexes can also be obtained from the HIV-PDI database, and these help to explain drug affinity changes due to mutations in the protease. For example, as shown in Figure 3c, several interactions have been lost in the mutated protease complex of patient 39546. These interactions concern residues of both the flap and the active site regions, which are necessary for inhibition of the protease [24,25].

Structural differences between patients' mutants and WT apo protease
According to our working hypothesis regarding the link between drug resistance and modifications of protein-drug interaction patterns, we selected a subset of patients whose resistance to first line treatment was overcome by a second therapy. A set of 362 patients matched the first criterion of a multi-step selection process, which finally gave a group of 15 patients (Figure 1). While this number is very small, it nonetheless provides a genuine test of the validity of our hypothesis. Table 5 lists detailed data related to the selected set of patients. The mutations present in the patients' proteases concern several locations where important residues for drug binding have been identified (see Table 6, which shows the residue mutations observed in the 15 patients) [16][17][18][19][20][21].
The overall model structures of the HIV-1 apo protease mutants isolated from the 15 patients are similar to the WT protease structure, with an RMSD between 0.9 and 1.5 Å for all main chain Cα atoms. The mutations present in each patient's protease caused only weak modifications of the global geometry of the protein (see Figure, Supplementary information 1, which shows the superposition of the mutated apo proteases of all 15 patients and the 2QMP WT structure). In more detail, the largest differences between the WT and the patients' mutant structures are seen in the most flexible protein regions, namely the flaps, ear, and cheek, while the catalytic triads of residues 25-27 and 25′-27′ show very low main chain atom deviations. These changes are similar to the structural changes observed experimentally in the whole set of PDB mutants, as may be confirmed by checking all the PDB files stored in the HIV-PDI resource.

Analysis of the binding mode of the antiprotease drugs in the patient mutated proteases
The 3D protein-ligand structures for the 15 patients' proteases and the 3 ARVs involved in these patients' first and second line treatments (namely indinavir, saquinavir, and nelfinavir) were extracted from the HIV-PDI database and compared. These holo structures were obtained by selecting the best poses of the ligands resulting from molecular docking. For all 15 selected patients, indinavir had a better docking score than the other tested molecules, and especially saquinavir (see Table 7, which shows the patient-related data and best compound docking solutions). Of course, such a result is only a preliminary indication, since a deeper analysis of the protein-ligand interactions is necessary to rank the docked compounds.
Table 6: Residue mutations observed in the 15 patients, colored according to their known effect on saquinavir* and indinavir** resistance. The analysis focuses on the residues important for drug binding and stability (Asp25, Gly27, Asp29, Asp30, Gly48, Ile50), the residues of the active site (Asp25-Thr26-Gly27), and the residues provoking specific substitutions (see Table, Supplemental Digital Content 3, which shows the HIV-1 mutations that confer resistance to the main protease inhibitors). Only the rows recoverable here are shown.

Patient ID  Mutations                                Treatment 1  Treatment 2
[...]       [...] I93L                               SQV          IDV
6693        T12S-I13V-K14R-G16E-E35D-N37A-L63Q-I64V  SQV          IDV
6696        I15V-P39T-A71T-L89P-L90M                 SQV          IDV
6706        N37S-L63R-K70V-V77I-I93V                 SQV          IDV
19255       M36I-L63P                                IDV          NFV

*Saquinavir (SQV) critical substitutions: red color; SQV additional substitutions: green color.
**Indinavir (IDV) critical substitutions: red color; IDV additional substitutions: yellow color.
_ Important residues for drug binding and stability: Asp25, Gly27, Asp29, Asp30, Gly48, Ile50.

At this stage, it is important to compare the detailed molecular interactions found in the complexes of the 15 selected patients' protease mutants with the 3 ARVs considered above (see Table 8, which shows the critical residues involved in ARV interactions and characterized in the crystal HIV-1 proteases from the HIV-PDI database) to those present in PDB complexes with the same ligands (see Table 9, which shows the list of PDB entry codes related to HIV-1 protease). In order to illustrate the use of these data, the 3D structure of the drug-resistant protease mutant of one selected patient (patient 6670) and the WT protease structure (PDB code 2QMP) were compared in complex with both saquinavir and indinavir. This example is typical of all 15 patients except patient 19255. The comparison of the lists of residues involved in key interactions with the ligands shows that several of them no longer form the canonical interactions (see Figure, Supplementary information 2, which shows the counts of critical protease-ARV interactions for saquinavir and indinavir in the WT and patient 6670 proteases). Indeed, the data (Supplementary information 2) show that, on balance, the interactions of saquinavir with the wild type (the expected interactions) and with the mutant protease of patient 6670 are less favorable than those of indinavir, and that the total numbers of interactions correspond exactly to the docking ranking. In the case of patient 19255, if we consider only the docking rank of nelfinavir, which was used as the second line treatment, this compound seems less favorable than indinavir, which was used as the first line treatment (rank 8 versus first rank, respectively).
If we now consider the interaction network-based re-scoring, that is, the differences in ARV-protease interactions between the indinavir-19255 protease and nelfinavir-19255 protease complexes, with a particular focus on the so-called critical residues, it appears that the nelfinavir complex presents 36% more critical interactions than the indinavir complex (see Table 10). It would therefore be possible to propose nelfinavir as a better second line compound than indinavir. This proposal corresponds to the treatment actually given to patient 19255.
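The counting step behind this kind of re-scoring can be sketched as follows. This is a minimal Python illustration, not the re-scoring procedure implemented in HIV-PDI: the critical set is assumed here to be the residues listed in this article as important for drug binding and stability, whereas the actual drug-specific sets are those of Table 8, and the contact lists and drug names are invented placeholders rather than patient data.

```python
# Assumed critical set, taken from the residues named in the text.
CRITICAL_RESIDUES = {"ASP25", "GLY27", "ASP29", "ASP30", "GLY48", "ILE50"}


def count_critical(contacts, critical=CRITICAL_RESIDUES):
    """Count (residue, chain) contacts whose residue is in the critical set."""
    return sum(1 for residue, _chain in contacts if residue in critical)


def percent_more(count_a, count_b):
    """Percentage by which count_a exceeds count_b."""
    if count_b == 0:
        raise ValueError("reference count must be non-zero")
    return 100.0 * (count_a - count_b) / count_b


def rescore(contacts_by_drug):
    """Rank drugs by their number of critical contacts, highest first."""
    counts = {drug: count_critical(c) for drug, c in contacts_by_drug.items()}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


# Hypothetical contact lists for two docked compounds (not real results).
contacts = {
    "drug_X": [("ASP25", "A"), ("ASP25", "B"), ("GLY27", "A"),
               ("ILE50", "A"), ("VAL82", "A")],
    "drug_Y": [("ASP25", "A"), ("ILE50", "B"), ("LEU23", "A")],
}
print(rescore(contacts))    # [('drug_X', 4), ('drug_Y', 2)]
print(percent_more(4, 2))   # 100.0
```

Counting contacts per chain, as done here, means that the same residue in the two monomers contributes twice, which mirrors the chain-labelled residue entries (for example ILE 50A and ILE 50B) in Table 10.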

Table 10: Residues in interaction with the drug of treatment 2 (the second treatment, given to overcome drug resistance) are listed. Those highlighted in yellow are part of the critical set (see Table, Supplemental Digital Content 8, which shows the critical residues involved in ARV interactions and characterized in the crystal HIV-1 proteases from the HIV-PDI database). These residues are no longer involved in the interaction with the drug of treatment 1 (the treatment that produced drug resistance).
(Table body not reproduced; recoverable row labels: GLY 48B, ILE 50A, ILE 50B. Associated table label: interactions present in all 14 saquinavir PDB complex structures presently available in HIV-PDI.)

Discussion
This article has highlighted some of the novel uses of the HIV-PDI resource. Compared to other databases focusing on HIV resistance, the innovative possibilities offered by HIV-PDI are in good accordance with recent trends in dealing with HIV drug resistance. For example, using chemoinformatics tools can contribute towards making better decisions in HIV drug discovery processes [43][44][45][46][47]. The chemoinformatics capabilities available in HIV-PDI were demonstrated above. When faced with resistance to a given compound such as APV, the Tanimoto score may be used to suggest possible rescue treatments. For example, according to their Tanimoto scores, darunavir, fosamprenavir, and amprenavir are closely related, and consequently darunavir and fosamprenavir should be eliminated as rescue options because they might have the same unfavourable behaviour towards the mutated protein target. On the other hand, if for some reason a particular anti-protease drug is not available, one can choose the next closest one according to its Tanimoto score.
Similarly, binding site cavity analyses such as those presented here are highly topical and represent an important current approach [48,49]. From such analyses, as demonstrated here for patient 39546, the binding capabilities of APV can be predicted to be modified in this patient's mutated protease due to the reduced size of the binding pocket, in agreement with the preceding analyses of 3D interaction patterns. Therefore, future work will include clustering of binding cavities or ligands using the shape descriptors together with target information and activity classes from the HIV-PDI database to predict putative targets for new small molecule structures [50].
One novel capability of HIV-PDI is to provide 3D structures for all the protease mutants. While most of these structures are built by homology, in the case of the HIV protease it has been reported that such models are reliable enough to be used for molecular docking studies [51]. The capability of HIV-PDI to refine these homology models through molecular dynamics (MD) simulations in explicit solvent would provide users with good quality 3D mutated protease structures. MD has proved to be a suitable modeling technique for HIV proteases [52].
Numerous docking studies on HIV resistance have been reported [32]. In this context, several publications have highlighted some of the limitations on the ability of available docking programs to predict ligand binding affinities correctly [53]. This is why the HIV-PDI resource uses additional criteria such as ligand comparisons (Table 1) and counting of critical residues (see Table 8 and Figure, Supplementary information 2) to perform a re-scoring step using an approach based on protein-ligand interaction networks, similar to the one recently proposed [54]. Thus, the particular case of patient 19255 in our preliminary proof-of-concept demonstrates the capacity of the HIV-PDI resource to provide a deeper analysis with the interaction network-based re-scoring when necessary.
Interestingly, if this procedure is applied to the other 14 patients, for whom indinavir was ranked top compared to saquinavir (ranked between 2 and 10 depending on the patient; data not shown), it appears that for all these patients indinavir presents 40-50% more interactions than saquinavir and consequently would be the best choice for second line treatment. In other words, for these 14 patients, a treatment incorporating indinavir instead of saquinavir would have been proposed to reduce the resistance. This predicted treatment is in perfect agreement with the actual treatment given to most of the selected patients, which gave rise to an improvement of the anti-protease activity. Consequently, it appears from the present study that differences in ARV-protease interaction patterns represent a key feature to be considered when explaining loss of drug potency in patients with HIV-1 protease mutants and when predicting a suitable treatment to reduce the drug resistance observed in these patients.

Conclusion
The results of the present retrospective analysis support the robustness of the molecular modeling process carried out for building protease mutants, docking ARVs, and characterizing protein-drug interaction patterns. The capacity of the HIV-PDI resource to aggregate relevant information, such as the critical interaction residues of diverse ARVs, makes possible a more robust analysis and a satisfactory proposal for a second line of treatment. This proof-of-concept will be generalized in the future by integrating other HIV targets, especially the reverse transcriptase, and by being applied in the field to patients for whom we have full treatment histories.