Characterization of the Leucine-Responsive Transcription Factor from Pathogenic Vibrio cholerae Using Molecular Modelling and Molecular Dynamics Simulations

Vibrio cholerae is the causative agent of cholera. It is responsible for a variety of diseases, specifically multidrug-resistant nosocomial infections and chronic lung infections in cystic fibrosis patients. One of the vital genes responsible for the organism's multidrug-resistant behaviour is PA3523, which codes for a multidrug efflux transporter. The expression of PA3523 is regulated by the dimeric transcription factor Lrp, which has a helix-turn-helix DNA-binding motif. So far, there have been no reports that characterize the Lrp protein from Vibrio cholerae from a structural point of view. In the present work, an attempt has been made to characterize the Lrp protein using a structural bioinformatics approach. The dimeric structure of Lrp was built by comparative modelling. The dimeric model was then docked onto the corresponding promoter region of the PA3523 gene encoding the multidrug efflux transporter. The docked complex of promoter DNA and Lrp was subjected to molecular dynamics simulations to identify the mode of DNA-protein interaction. So far, this is the first report that depicts the mechanistic details of gene regulation by the Lrp protein. This work may therefore help to illuminate the still obscure molecular mechanism behind disease propagation by Vibrio cholerae.


Introduction
Vibrio cholerae is a Gram-negative bacterium found mostly in soil, water and inside a number of host organisms. This organism is an opportunistic pathogen, particularly known as the causative agent of multidrug-resistant nosocomial infections. It is also the main cause of chronic lung infections in patients diagnosed with cystic fibrosis. The effect of Vibrio cholerae in cystic fibrosis patients is so severe that, once established in a patient's lungs, it is likely to remain a permanent fixture of the patient's life [1][2][3][4][5][6][7][8][9][10]. Recently, a Vibrio cholerae strain with an increased capability for antibiotic resistance was identified [7]. The most important characteristic of Vibrio cholerae is its resistance to multiple drugs. One of the vital genes responsible for this multidrug-resistant behavior is PA3523, which codes for a multidrug efflux transporter [11]. Efflux transporters are proteins bound to the bacterial cell membrane. These proteins function as pumps that expel toxic substances from bacterial cells and thereby confer single- and multiple-drug resistance. There are basically three different transport systems: active transport, which involves ATP hydrolysis for the entry or efflux of substances from cells; the phosphoenolpyruvate-dependent phosphotransferase system (PTS), in which a solute is phosphorylated as it is transported across the membrane; and a third type that uses ion gradients as the energy source for transport. An interesting property of these efflux transporters is that they intrinsically accept multiple substrates, thereby providing resistance to multiple structurally different drugs. These efflux transporters reduce the accumulation of antibiotics inside bacterial cells. However, the process of antibiotic efflux is very slow. It thus provides sufficient time for the bacterium to adapt to the antibiotics and become resistant through mutations or alteration of antibiotic targets. Such multidrug efflux transporters in bacterial pathogens are considered to be good targets for inhibitors, as reducing the efflux of multiple drug molecules from the bacterial cells may restore the clinical efficacy of older drugs and prevent the emergence of drug-resistant variants [12][13][14].
The expression of PA3523 is controlled by the transcription factor Lrp. Lrp is a transcriptional regulator with a helix-turn-helix DNA-binding motif, and it acts as a dimer in the presence of metal ions such as Cu [15]. However, to date no reports regarding the molecular mechanism of regulation of the PA3523 gene by the Lrp protein are available. In the present work, an attempt has been made to explore the molecular mechanism by which Lrp is involved in the regulation of PA3523 gene expression. A computational model of the Lrp protein was built, as no structure of the Lrp protein from Vibrio cholerae was available. The dimeric model of the Lrp protein was built and then docked onto the known promoter DNA region of the PA3523 gene. The interactions between the DNA and the Lrp protein were analyzed after molecular dynamics simulation of the DNA-protein complex. This computational analysis of the interactions between the promoter DNA and the Lrp protein is the first report that analyzes the involvement of the amino acid residues of the Lrp protein in the regulation of bacterial gene expression. This report would therefore pave the way for future genetic studies to identify the roles of the individual amino acids of the Lrp protein in bacterial gene regulation. The interaction analysis would also be useful for developing new drugs that target the important amino acid residues of the Lrp protein involved in binding to promoter DNA regions during PA3523 gene expression.

Sequence analysis and homology modeling of Lrp
The amino acid sequence of the Lrp protein from Vibrio cholerae was obtained from GenBank (RefSeq ID: YP_001217437.1) and was used to search the Brookhaven Protein Data Bank (PDB) [15] using the software BLAST [16] to identify a template. The models were built [18] with the A chain of 2GQQ as the template. Five models were built with high optimization levels. The loops generated in the models were refined by the loop refinement technique implemented in the MODELLER software tool. This was done in order to avoid generation of erroneous loop structures [18]. The modeled structure of the Lrp protein was then superimposed onto the crystal template without altering the coordinate system of atomic positions in the template. The root mean squared deviation (RMSD) for the superimposition was 0.5 Å. The model of the Lrp protein was then energy minimized in two steps. In the first step, the modelled structure was minimized without restraints. In the second step, the energy minimization was done with the backbone of the modeled protein fixed, to ensure proper interactions. All energy minimizations were done using the conjugate gradient (CG) method with CHARMM force fields [19].
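The RMSD quoted above can be illustrated with a short calculation. The following is a minimal sketch, assuming the two structures have already been superimposed and that equivalent atoms (for example Cα atoms) are paired in the same order; the coordinates are hypothetical and are not taken from the Lrp model or from 2GQQ.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD (same units as the input, e.g. Angstrom) between two paired,
    already-superimposed lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        squared_sum += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(squared_sum / len(coords_a))

# Hypothetical C-alpha coordinates: each model atom is displaced by 0.5 A.
template_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_ca = [(0.0, 0.5, 0.0), (3.8, 0.0, 0.5), (7.6, -0.5, 0.0)]

print(f"RMSD = {rmsd(template_ca, model_ca):.2f} A")  # RMSD = 0.50 A
```

In practice the optimal superposition is computed first (for instance with a least-squares fitting routine), and the formula above is then applied to the fitted coordinates.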

Validation of the model of Lrp protein
Regarding the main chain properties of the modeled protein, no considerable bad contacts, Cα tetrahedron distortions or hydrogen bond (H-bond) energy problems were found. No side chain distortions were observed when the side chain torsion angles were measured. The Z-scores calculated using the PROSA web server [https://prosa.services.came.sbg.ac.at/prosa.php] showed that the predicted homology model was well inside the range of typical native structures. The residue profiles of the three-dimensional model were further checked with VERIFY3D [20]. PROCHECK [21] analyses were performed in order to assess the stereochemical quality of the models, and Ramachandran plots [22] were drawn. No residues were found in the disallowed regions of the Ramachandran plots.
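A Ramachandran plot is built from the backbone torsion angles phi and psi of each residue. The following is a minimal sketch of how such a torsion angle can be computed from four atomic positions; the function names and the test coordinates are illustrative assumptions and are not part of PROCHECK.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees defined by four points, in (-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

def phi(c_prev, n, ca, c):
    """phi of residue i: C(i-1), N(i), CA(i), C(i)."""
    return dihedral(c_prev, n, ca, c)

def psi(n, ca, c, n_next):
    """psi of residue i: N(i), CA(i), C(i), N(i+1)."""
    return dihedral(n, ca, c, n_next)

# Hypothetical points chosen so the answer is easy to check by hand.
print(dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)))  # 90.0
```

Each residue's (phi, psi) pair is one point on the plot; validation tools then compare these points against the allowed and disallowed regions.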

Building the model of the promoter DNA
In order to find the interactions between promoter DNA and the Lrp protein, the nucleotide sequence of the promoter DNA region for the gene VC0395_A2223 from Vibrio cholerae was extracted. For this purpose the database CollecTF [15] was used. The database CollecTF showed the following nucleotide sequence to be the promoter DNA sequence: The nucleotide sequence was used to build the model of the corresponding DNA region using the CHARMM software tool and then subjected to energy minimizations. The resulting energy minimized structure was used for docking studies.

Molecular docking simulation
It is known that Lrp binds to DNA as a homo-dimer [15]. Thus a homo-dimeric model of Lrp was built by superimposing the models of Lrp onto the two chains of its crystal template 2GQQ. The homo-dimeric model of Lrp so obtained was subjected to energy minimization as per the protocol described in the section Sequence analysis and homology modeling of Lrp. In order to elucidate the mode of binding between the promoter DNA and Lrp, the models of the protein and the DNA were docked using the software PatchDock [23]. PatchDock uses surface information about the molecules and docks them on the basis of their surface complementarity. The final docked complexes are ranked on the basis of their geometric shape complementarity scores [23]. The highest-scoring docked structure of the DNA-protein complex was selected and analyzed visually using the DS modeling software suite. The docked complex was then energy minimized as per the protocol described in the section Sequence analysis and homology modeling of Lrp.
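The idea of ranking docked poses by geometric complementarity can be illustrated with a toy scoring function. This is only a sketch under simplifying assumptions and is not PatchDock's actual algorithm: it rewards ligand atoms that sit within a contact shell of the receptor and penalizes atoms that penetrate it. The thresholds, the penalty weight, the function names and all coordinates are hypothetical.

```python
import math

def shape_score(receptor, ligand, clash=3.0, contact=4.5, penalty=3.0):
    """Toy complementarity score: +1 per ligand atom in the contact shell,
    -penalty per ligand atom closer than `clash` to any receptor atom."""
    score = 0.0
    for atom in ligand:
        nearest = min(math.dist(atom, r) for r in receptor)
        if nearest < clash:
            score -= penalty
        elif nearest <= contact:
            score += 1.0
    return score

def rank_poses(receptor, poses):
    """Return pose names sorted from best to worst score."""
    return sorted(poses, key=lambda name: shape_score(receptor, poses[name]), reverse=True)

# Hypothetical receptor atoms and three candidate ligand poses.
receptor = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
poses = {
    "snug": [(0.0, 0.0, 4.0), (4.0, 0.0, 4.0)],      # both atoms in the contact shell
    "clashing": [(0.0, 0.0, 1.0), (4.0, 0.0, 4.0)],  # one atom penetrates the receptor
    "distant": [(0.0, 0.0, 10.0), (4.0, 0.0, 10.0)], # no contact at all
}

print(rank_poses(receptor, poses))  # ['snug', 'distant', 'clashing']
```

Real docking programs work on molecular surfaces rather than raw atom lists and search over many rigid-body transformations, but the ranking step follows the same pattern: score every candidate pose and keep the best.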

Molecular dynamics (MD) simulation
The MD simulation of the DNA-protein complex was performed using the CHARMM module of the DS modeling software suite. The initial coordinates were extracted from the energy minimized structure of the promoter DNA-Lrp docked complex. The complex was placed in an orthorhombic box with dimensions large enough to prevent self-interactions, and periodic boundary conditions were employed. The system was solvated with water molecules at the typical density of water at 298 K and 1.0 atm pressure, using the single point charge (SPC) model. The whole system was energy minimized, with the temperature held constant at the body temperature of 310 K using an NPT dynamics protocol. The LINCS [24] method was applied to constrain the covalent bond lengths, allowing an integration step of 2 fs. The particle mesh Ewald method [25] was applied to calculate the electrostatic interactions. A 100 ns dynamics run was then performed for the DNA-protein complex. The modes of binding interaction between Lrp and the corresponding promoter DNA were then analyzed using the DS modeling software suite.
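The simulation length and time step stated above fix the number of integration steps, and the density of water gives a rough upper bound on the number of solvent molecules in the box. The following is a minimal sketch of that arithmetic; the box dimensions are a hypothetical example, since the actual box size is not given in the text, and the water count ignores the volume occupied by the solute.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
WATER_MOLAR_MASS = 18.015     # g/mol
WATER_DENSITY_298K = 0.997    # g/cm^3 at about 298 K and 1 atm

def n_steps(total_ns, timestep_fs):
    """Number of integration steps for a run of total_ns at timestep_fs."""
    fs_per_ns = 1_000_000
    return round(total_ns * fs_per_ns / timestep_fs)

def water_number_density_per_nm3(density=WATER_DENSITY_298K, molar_mass=WATER_MOLAR_MASS):
    """Water molecules per cubic nanometre at the given mass density."""
    molecules_per_cm3 = density / molar_mass * AVOGADRO
    return molecules_per_cm3 / 1e21  # 1 cm^3 = 1e21 nm^3

def max_waters_in_box(lx_nm, ly_nm, lz_nm):
    """Upper bound on waters in an orthorhombic box, ignoring solute volume."""
    return int(lx_nm * ly_nm * lz_nm * water_number_density_per_nm3())

print(n_steps(100, 2))                 # 50000000
print(max_waters_in_box(8.0, 8.0, 10.0))  # hypothetical 8 x 8 x 10 nm box
```

A 100 ns run at 2 fs per step therefore requires fifty million integration steps, which is why bond constraints that permit the larger time step matter for the overall cost.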

Structure of Lrp from Vibrio cholerae
The structure of Lrp was built using chain A of 2GQQ as the template. The amino acid sequences of Lrp and chain A of 2GQQ showed 89% sequence identity. The modelled structure of the Lrp monomer is presented in Figure 1. The protein has two domains: the N-terminal DNA-binding domain and the C-terminal domain consisting of antiparallel beta sheets flanked by alpha helices. The two domains are connected via a linker loop region. The DNA-binding region of the protein was present at the N-terminus and comprised of long helix-

Interaction with promoter DNA
It is known that the transcription factor Lrp interacts with DNA as a homo-dimer. In order to analyze the interactions between Lrp and its promoter DNA region, a homo-dimeric model of Lrp was generated. The resulting model was docked onto the model of the promoter DNA region to find the DNA-protein interactions. The docked complex was subjected to MD simulation after energy minimization. The progress and completion of the MD simulation were monitored by plotting the RMSD of the backbone atoms of the docked complex against simulation time (Figure 4). Interestingly, the protein conformation does not undergo an appreciable change upon DNA binding. The resulting complex was then analyzed to find the probable modes of binding between the DNA and the protein. Mainly the amino acid residues from the second helix-turn-helix DNA-binding motif of Lrp were found to be involved in the interactions. Most of the interactions between the DNA and the protein were found to be ionic and polar in nature, involving the polar side chains of the amino acids of the Lrp protein. The electrostatic energy profile and the hydrogen bonding pattern are plotted in Figure 5 and Figure 6, respectively. The interaction patterns between DNA and protein were found to follow nearly uniform distributions. Notable among the interacting residues were Arg50, Arg51 and Gln55 from the helix-turn-helix DNA-binding motif of the Lrp protein; they were found to interact with the negatively charged DNA backbone through their positively charged (Arg) or polar (Gln) side chains. In addition, Asp16, Asp26, Lys36 and Arg37 from the same helix-turn-helix DNA-binding motif of Lrp were found to interact with the DNA phosphate backbone via their main chain atoms (Figure 7).
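Residue-level contacts of the kind described above are typically identified with a distance cutoff between protein atoms and DNA backbone atoms. The following is a minimal sketch of such an analysis, assuming a 3.5 Å cutoff; the cutoff, the function names and all coordinates are hypothetical and do not come from the simulated complex or from the DS modeling software suite.

```python
import math

MAIN_CHAIN_ATOMS = {"N", "CA", "C", "O"}

def find_contacts(protein_atoms, dna_atoms, cutoff=3.5):
    """Return (residue, atom, nucleotide, atom, kind, distance) tuples for
    every protein-DNA atom pair within `cutoff` Angstrom."""
    contacts = []
    for residue, p_name, p_xyz in protein_atoms:
        kind = "main chain" if p_name in MAIN_CHAIN_ATOMS else "side chain"
        for nucleotide, d_name, d_xyz in dna_atoms:
            distance = math.dist(p_xyz, d_xyz)
            if distance <= cutoff:
                contacts.append((residue, p_name, nucleotide, d_name, kind, round(distance, 2)))
    return contacts

# Hypothetical atoms: (residue label, atom name, coordinates in Angstrom).
protein_atoms = [
    ("Arg50", "NH1", (0.0, 0.0, 0.0)),    # side chain nitrogen
    ("Gln55", "NE2", (10.0, 0.0, 0.0)),   # side chain nitrogen
    ("Lys36", "N", (20.0, 0.0, 0.0)),     # main chain nitrogen
    ("Ala90", "CB", (40.0, 0.0, 0.0)),    # far from the DNA
]
dna_atoms = [
    ("DT5", "OP1", (2.8, 0.0, 0.0)),
    ("DA6", "OP2", (10.0, 3.0, 0.0)),
    ("DG7", "OP1", (20.0, 0.0, 3.2)),
]

for contact in find_contacts(protein_atoms, dna_atoms):
    print(contact)
```

Applying the same test to every saved frame of a trajectory and counting how often each pair appears gives a simple measure of how persistent a contact is.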

Conclusion
In this work an attempt has been made to analyze the molecular mechanism of the regulation of the gene PA3523, which codes for the multidrug efflux transporter from Vibrio cholerae. Vibrio cholerae is an opportunistic pathogen responsible for multidrug-resistant nosocomial infections and chronic lung infections in cystic fibrosis patients. The expression of the gene PA3523 is controlled by the protein Lrp. In order to analyze the molecular mechanism of the gene regulation process, a homology model of the Lrp protein from Vibrio cholerae was built, docked onto the promoter DNA and subjected to MD simulations. The metal ion binding region of the Lrp protein acted as a helper to the helix-turn-helix motif. The MD simulations also identified the amino acid residues from the helix-turn-helix motif of the Lrp protein that are responsible for DNA binding and interactions. So far, this is the first report that elucidates the mode of DNA binding and regulation by the Lrp protein from Vibrio cholerae. This study would therefore pave the way for future genetic and mutational studies to identify the roles of the amino acids of the Lrp protein in the regulation of the gene PA3523. Future drug development endeavors may target these amino acid residues of the Lrp protein to abolish its capability to bind to the promoter DNA region and initiate gene expression in highly pathogenic Vibrio cholerae.