Genetic Syndromes & Gene Therapy

Proteins are essential biomolecules for use in research, diagnostics, and therapy; however, toxicity, side effects, and lack of activity and specificity in manufactured proteins arise from altered properties and from factors rooted in the complexity of their structures, cellular specificity of action, dynamic folding, and reactivity. In vivo, proteins perform diverse functions and may play multiple roles, such as enzymatic activity coupled with genetic regulation, membrane traffic and rearrangements, and molecular transport or transmission, among many others. Reactive amino acid residues on protein surfaces participate in signal transduction pathways via cascades of reversible interactions and redox reactions. In isolated form, proteins lose the support of the native environment, resulting in loss of functional states. Replacing the cellular compounds with appropriate organic and inorganic molecules is critical during protein production to keep the proteins in a homogeneous state that is as close as possible to that of the native system. In addition, the formation of homo-oligomers may inactivate proteins, triggering immune responses in the targeted cells. It is challenging to identify the multiple factors that can affect protein properties during the entire process of production, from cloning to manufacturing. The fast growth of bioinformatic tools and databases allows statistical evaluation of a large variety of those critical factors and of how they contribute to the manufacturing process. By applying innovative multidisciplinary and design of experiments approaches, a rapid reduction of the variables is achievable, which can yield more consistent results and improve both the quality of the manufactured proteins and their success in clinical trials. This review article provides examples of enzyme and antibody production for use in research, diagnostics, and drug discovery.


Introduction
Over the past ten years, drug discovery efforts have focused progressively on protein research, with a growing recognition of the need for more comprehensive studies of the complex cellular functions of proteins [1][2][3]. The guidelines [4][5][6] reflect advances in fundamental knowledge, new technologies, bioinformatic tools, and the integration of data for modeling and phenotype/physiology predictions, with emphasis on case-by-case scientific approaches for improvement of protein characterization. This includes the use of appropriate alternative methods for safety and quality evaluation that can replace current standard methods, if accepted by regulatory authorities. It also includes comparisons of target sequence homology between species, specifically in vitro qualitative and quantitative cross-species comparison of relative activities, affinities, specificity and kinetics. Although regulatory agencies accept a minimum set of assays and information at the beginning of the research and development (R&D) process, the increasing information requirements at later steps compel the development of additional assays, which often leads to controversial results, repetition and delay of the whole process.
Purified proteins have an altered environment, properties and structure-activity relationships (SARs), which commonly lead to low affinity and specificity and cause immunogenicity, toxicity and side effects when the proteins are used as pharmaceutical drugs. A number of factors should be taken into account when purifying proteins during early stages of the R&D process in order to evaluate diverse cellular environments when transferring from expression and purification to in vitro, cell- and animal-based assays. Whether the purified proteins are used for drugs, diagnostics, or research, particular attention is required during the entire production process to ensure their proper folding and activity. In this way, more predictable and consistent results can be achieved after the manufacturing process and during protein storage.
This review covers some basic and practical aspects of protein purification for application in research, diagnostics, and drug discovery. It focuses on interactions between critical factors and on how new data from different Omics fields provide valuable information for developing more successful statistical design of experiments (DoE) approaches that improve quality while reducing costs through the parallel application of appropriate established assays during the whole process. Examples with two different protein classes and family members, analytical tools and general concepts are provided, with the aim of helping early discoveries translate into future market-valued products.

Protein identity and SARs: from homology to diversity
Information derived from sequencing and structural genomic projects is a great source for comparison of primary-to-tertiary and quaternary structures of proteins within a particular family or class of proteins. Continuous improvement of existing interrelated bio-resources (ExPASy, KEGG, NCBI, RCSB, DrugBank), along with newly developed resources such as the Guide to Pharmacology [7] and the Human Protein Atlas [8], among many others, allows for the identification of common patterns and differentiation of unique properties of any single target in various cellular environments. As an example, sequence and structure comparisons of the methionine aminopeptidase enzyme family are used to provide evidence for the significance of comparative data in planning of R&D, from cell culture and upstream processes to manufacturing, formulation, storage and delivery.
Methionine aminopeptidase (MetAP, EC 3.4.11.18) is among the most widely studied enzymes for the treatment of cancers and of bacterial and parasitic infections [9][10][11][12]. It is a ubiquitous enzyme, with key functions in protein synthesis, that co-translationally removes the initiator methionine (Met) from the nascent polypeptide chain [13][14][15][16][17][18]. MetAP has become an attractive target for research and drug discovery since it was first recognized as a cancer marker and the 3D structures of human and E. coli MetAPs, complexed with the natural anticancer inhibitors fumagillin and ovalicin, were solved in 1998 [19,20]. Since then, large-scale libraries of synthetic inhibitors have been generated and tested against MetAPs from different organisms as part of drug discovery efforts [21][22][23][24][25][26][27][28][29][30][31][32][33]. MetAP is encoded by a single gene in the majority of prokaryotes; at least two isoforms exist in eukaryotes. Three genes encoding human MetAPs have been identified, although one of them, the mitochondrial isoform of the enzyme, is still poorly studied [34]. MetAPs are classified into prokaryotic (MetAPIa-c) and eukaryotic (MetAPII) MetAPs, based on N-terminal extensions (Figure 1A and Figure 2) and an extra domain in the eukaryotic MetAPs (gray-colored domain in hMetAPII; Figures 1A, C and D, and Figure 2). With a high degree of sequence and structural conservation in the substrate binding domain, all of the family members exhibit activity towards polypeptides with small amino acid residues in the second and third positions. The diversity among MetAPI members is mainly related to heterogeneity in the N-termini and in the sites of interaction with the ribosome and for sub-cellular translocation. The eukaryotic MetAPs have been proposed to have more complex properties, including regulation of protein synthesis [35]. Under debate are the physiological relevance and the number of the metal ions that activate the enzyme. Originally, five conserved residues (1 His, 2 Asp and 2 Glu, Figure 1B) binding two Co2+ ions were identified in the substrate binding domain (pink-colored residues in Figures 1A, C and D, and Figure 2); however, later studies revealed that one, two or three ions of Co2+, Mn2+, Fe2+, Zn2+ or Ni2+ may activate the enzyme [25,27,36-39]. Many questions related to the cellular regulatory mechanisms of MetAPs remain unanswered. One of the main questions is how to distinguish the properties of each family member in order to develop drug discovery concepts for the treatment of a particular disease and, from there, to build strategies for purification, production and in vitro studies.
Figure 1A. Domain organization of MetAP from E. coli, M. tuberculosis, human MetAPI, human mitochondrial MetAP, rat MetAPII and human MetAPII. K1, K2 and D boxes at the N-termini of MetAPII signify basic and acidic motifs, respectively; the metal binding residues are shown in pink, the highly conserved residues in cyan and Cys in blue, based on the sequence alignment presented in Figure 2. Figure 1B. Divalent metal co-ordination in human MetAPI and MetAPII. Figure 1C. Ribbon structure of human MetAPI, the divalent metals (pink spheres) binding site and the NXV motif (boxed), PDB 2B3K. Figure 1D. Ribbon structure of human MetAPII, the divalent metals (pink spheres) binding site and the disulfide bond (boxed), PDB 1BN5. Figure 1E. The NXV motif in human MetAPI. Figure 1F. The disulfide bond in human MetAPII. The colors of the residues in the structures are the same as in Figure 1A and Figure 2; His and Met are colored in firebrick and forest, respectively. PyMOL was used for the ribbon drawings.
With the increasing amount of structural and other data coming from different Omics fields, it is currently possible to accumulate information that allows more comprehensive analysis: comparison of primary sequences and structures; identification of common patterns and dissimilarity; comparison of the available information and approaches from similar classes of proteins.
To date, 113 structures of MetAPs from different organisms have been submitted to the Protein Data Bank (PDB). Table 1 summarizes the data for metal ions bound to the enzyme and the number of Met, Cys and His residues in the corresponding sequences. We narrowed our analysis down to a few amino acids and metal ions mainly for illustration, but it could be extended to more amino acid residues: Met and Cys oxidize easily, whereas His has a strong affinity for transition metals, which activate the enzyme in this particular example. Based on sequence alignment (Figure 2) and 3D structures, we mapped conserved residues and domains that allow the MetAPs to be further distinguished. As shown in Figures 1C and E, hMetAPI forms a metal complex with Asn198 and Val200; both amino acids are conserved in the majority of the prokaryotic MetAPs and form this metal complex (Table 1 and Figure 2). As seen from the sequence alignment, Val200 is replaced by Cys228 in hMetAPII, which forms a disulfide bond with Cys448. The NXV motif exists in E. coli, M. tuberculosis, P. aeruginosa, R. prowazekii, T. brucei and hMetAPI (data not shown), and almost all of the solved structures from these organisms have bound metal ions, mainly Na+ and K+ (Table 1). The presence of Mn2+ in the P. aeruginosa NHV motif is most probably due to overloading of the purified enzyme with this metal for the activity assays; as a result, one Mn2+ in each of the structures (PDBs: 4FO7, 4FO8, 4JUQ) binds His residues located on the surface of the enzyme and forms a homodimer. Recently, Hogg and collaborators [40] identified a disulfide bond in the eukaryotic hMetAP and studied its role as a redox regulator of enzyme activity. As in another investigation by the same research group on the serine protease βII-tryptase [41], they found that the oxidized and the reduced hMetAPII had different specificities and efficiencies for hydrolysis of native peptide substrates.
Furthermore, the authors determined changes in the redox states of the enzyme in human glioblastoma cells under two different stresses associated with tumor growth: low nutrient supply (in the form of glucose) and hypoxia. They suggested that hMetAPII activity is controlled by reduction of an allosteric disulfide bond (present in 14 out of the 18 available 3D structures) rather than by an inhibitor or by compartmentalization, as had been previously assumed. The second conserved domain in MetAPs (NXV metal complex in MetAPI and disulfide bond in MetAPII) lies approximately 16-18 Å from the active site, and its contribution to enzyme specificity and mechanisms of action will require further detailed investigation.
As seen from Table 1, the content of amino acid residues in MetAPs is variable: 0-15 for Cys, 5-14 for Met and 6-19 for His. Their conservation and distribution in the protein structure provide useful information for developing selective approaches to protein purification and analysis for any single member of the family (highlighted in Figures 1C-F and Figure 2). Cys residues are the most reactive residues in protein chains and the major cause of the oxidation and aggregation during purification procedures that is also observed in other protein types (such as antibodies, discussed in the next section). The role of Met in protein properties has not been extensively studied. However, Met is recognized as one of the most frequently oxidized amino acids during protein purification and storage, particularly when it is located on the protein surface [42][43][44][45][46]. The strong binding affinity of His for transition metals requires special consideration, as do other reactive residues on the protein surface that may interact with excess metal when it is loaded onto the protein for subsequent activity assays. For example, attempts to activate the enzyme with increasing amounts of transition metals may lead to non-specific binding to reactive residues on the protein surface, as in E. coli (PDBs: 1YVM Glu122, 2MAT His54, 4PNC His54) and P. aeruginosa (PDBs: 4F07 His144, 4F08 His144, 4JUQ His144) MetAPs, or to residues distant from the metal binding site, as in hMetAPI (PDBs: 2GZ5 Glu119, Tyr186, His301; 2B3H His203). Metal oxidation or metal-induced oxidation of amino acid residues could explain the precipitation of MetAPs when loaded with Fe2+, as Fe3+ is the only iron ion observed in 3D structures, e.g., P. falciparum (PDB 3S6B) and E. cuniculi (PDBs: 3FM3, 3FMQ and 3FMR).
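The per-sequence counts of Cys, Met and His discussed above are straightforward to tabulate before planning a purification. The following is a minimal sketch, assuming a one-letter protein sequence as input; the function name and the 20-residue example fragment are hypothetical and are not taken from any MetAP entry or from Table 1.

```python
from collections import Counter

# Residues singled out in the text: Cys and Met (oxidation-prone)
# and His (strong affinity for transition metals).
REACTIVE = ("C", "M", "H")


def reactive_residue_profile(sequence: str) -> dict:
    """Count Cys, Met and His in a one-letter sequence and give their percentage."""
    seq = "".join(sequence.split()).upper()  # drop whitespace and line breaks
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return {
        aa: {
            "count": counts.get(aa, 0),
            "percent": round(100.0 * counts.get(aa, 0) / len(seq), 2),
        }
        for aa in REACTIVE
    }


# Hypothetical fragment for illustration only (not a real MetAP sequence).
example = "MAHCKLMNPHVCGDEHSTMC"
profile = reactive_residue_profile(example)
for aa, stats in profile.items():
    print(aa, stats["count"], f'{stats["percent"]}%')
```

Counting alone says nothing about surface exposure or conservation; those still have to come from the alignment and the 3D structures, as described in the text.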
Estimation of bound small molecules to proteins and their coordination with particular residues should be considered prior to the activity assays in order to ensure more accurate experiments and better reproducibility of results. Evaluation of metal content in cells for expression, media and during the purification process is a prerequisite for distinguishing physiologically relevant factors from artificial impurities derived from experimental errors. Cys oxidation status during the activity assays has still not been evaluated; more attention to Cys residues and their location in the polypeptide chain can provide important information, discussed in the next sections.

Heterogeneity of purified proteins
In cells, proteins undergo different types of conformational transition states, translational and post-translational modifications, and activity in a strictly controlled temporal and spatial environment. During expression and purification, proteins exist in heterogeneous populations, where partial modifications may be derived from cellular type, strains and stages of growth, media and buffer composition, primary and quaternary structure, temperature, etc. To date, monoclonal antibodies (mAbs) provide the best example of protein heterogeneity with a large variety of process and product related modifications.
Due to the natural diversity of antibodies, they are highly heterogeneous in isolated form, with different biochemical and physicochemical properties [47][48][49] and a high potential for immunotoxicity [50][51][52]. Different types of modifications have been classified in a variety of ways in order to better describe them for purification and analytical purposes. The majority of chemical modifications observed in mAbs involve N-terminal pyroglutamate formation, C-terminal lysine processing, deamidation, aspartate isomerisation, Met, Trp and Cys oxidation (disulfide bonds), glycosylation, glycation, peptide bond cleavage, non-reducible crosslinking, mutations and insertions [53][54][55][56][57][58][59][60][61][62][63][64][65][66]. Modifications that generate acidic and basic species in Abs could be considered for the development of appropriate chromatography strategies [66,67] and for further characterization of their quality or process attributes. While the optimization of antibody purification and the analytical tools for characterization of the most abundant modifications have been extensively developed, less attention has been paid, until recently, to variables related to Cys redox properties and their functional significance [70][71][72][73].
In the last few years, disulfide structural variations and scrambling in mAbs have become an area of extensive research because the high content of Cys residues in their structures is closely related to instability, misfolding, oxidation, aggregation and fragmentation [68,69,73]. Previously, it was assumed that Cys residues in Abs are paired and have a structural role. The classical view of disulfide organization in Abs is currently under revision, as new data support the notion that Cys residues, such as those located in the hinge regions, may play a functional role (Figure 3) [70,71]. Little is still known about scrambling of Cys in Abs. However, increasing evidence shows that Cys residues in the highly dynamic hinge regions may undergo rearrangements, as they are in close proximity to each other and are exposed to the surrounding environment with its variations in ionic strength. Most structural variations in IgG2 and IgG4 subclasses have been related to reduced Cys, with subsequent formation of non-native disulfide bonds during protein purification. The existence of unpaired Cys has been well documented by several research groups [73][74][75][76], and the instability of Abs due to Cys redox conversions, particularly in the hinge regions, is becoming more evident. In addition, the hinge regions are more susceptible to non-enzymatic fragmentation, where Cys, Asp, Gly, Ser, Thr, and Asn may facilitate peptide bond cleavage via specific mechanisms [77][78][79].
The close correlation between all of the factors mentioned above needs careful consideration during mAb manufacturing, as reversible modifications (such as disulfide scrambling and oligomer formation) at a particular stage of mAb production may trigger irreversible processes, such as degradation and aggregation [69,73,80]. Whether alterations in mAb structures and functions are process- or product-related, they need to be analyzed on a case-by-case basis. With an early application of appropriate analytical tools, a reduction in subsequent unwanted processes could be achieved. For example, during the purification process, even if the protein seems to be pure (based on gel electrophoresis), it may not contain a homogeneous population of protein. It may have fractions with different types of covalent modifications, Cys pairing or bound small molecules including metals, as is the case with the MetAPs mentioned above. Each of those fractions will behave differently in bioactivity analysis and upon storage. The existence of impurities, such as transition metals that may trigger oxidation over a longer period of time, needs special consideration and is discussed in the next section. A combination of analytical tests on each fraction during chromatographic separation is highly recommended in order to collect homogeneous protein at the end of the production cycle. Normally, shoulders of isolated peaks are relatively impure and should not be collected, as they contribute little to total yield but will likely necessitate additional purification. Thus, in-process tests of intermediate fractions will help not only to improve mAb production (yield and quality), but also to support further validation of mAbs for research and diagnostic applications. Moreover, this step will provide better knowledge about the causes of immunotoxicity when mAbs are used as drugs.

Redox- and metallo-proteome research in protein production
The oxidation of purified proteins is rarely studied; however, it is a matter of great concern. Protein oxidative damage is attributed to the action of reactive oxygen species (ROS) [81][82][83][84] generated in response to cellular stresses. Among the amino acid residues, the aromatic and sulfur-containing amino acids are the most sensitive to oxidation. In recent years, more attention has been paid to reversible and irreversible redox modifications of Cys at the molecular level and to their significance in protein folding and maturation, catalytic activity, signaling, and interactions with redox-active chemicals. The field of redox and Cys proteomes is expanding quickly, focusing on the central role that Cys plays in signal transduction via redox switches and thiol-disulfide exchange reactions with sulfhydryl-containing substrates, such as glutathione (GSH) and thioredoxin (Trx) [85][86][87][88][89]. Cys may exist in over ten oxidative forms, most of them reversible, and its abundance in proteins makes protein purification and in vitro studies very challenging. Cys distribution in proteins carries an evolutionary signature: its content increases with the complexity of the organism, ranging from 0.5% in prokaryotes to 2.26% in mammals among the analysed species [90]. The central thiol/disulfide redox couples (GSH/GSSG, Trx) are maintained at distinct, non-equilibrium potentials in mitochondria, nuclei, secretory pathways and extracellular space in eukaryotic cells [88,91,92]. Disulfide bonds in cellular proteins and in model peptides are the most studied Cys-oxidative forms, with increasing evidence for their functional redox properties [93][94][95][96]. Met is the other versatile sulfur-containing amino acid, although its redox properties have been studied less. Met oxidation is well documented in purified proteins and is associated with changes in structure, half-life, biological activity, and immunogenicity [42,43,97,98].
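To make the notion of a redox couple held at a particular potential more concrete, the half-cell potential of the GSSG/2GSH couple can be estimated with the Nernst equation. The sketch below is illustrative only: the midpoint potential of -240 mV at pH 7 is a commonly cited literature value used here as an assumption, and the concentrations are hypothetical rather than measurements from any compartment discussed in this review.

```python
import math

R = 8.314462618       # gas constant, J mol^-1 K^-1
F = 96485.33212       # Faraday constant, C mol^-1
E0_GSH_PH7 = -0.240   # V; assumed midpoint potential of GSSG/2GSH at pH 7


def gsh_half_cell_potential(gsh_molar, gssg_molar,
                            temp_kelvin=298.15, e0=E0_GSH_PH7):
    """Nernst potential (V) for GSSG + 2H+ + 2e- -> 2GSH at fixed pH."""
    if gsh_molar <= 0 or gssg_molar <= 0:
        raise ValueError("concentrations must be positive")
    n = 2  # electrons transferred per GSSG reduced
    return e0 - (R * temp_kelvin) / (n * F) * math.log(gsh_molar ** 2 / gssg_molar)


# Hypothetical concentrations: same GSH pool, ten-fold difference in GSSG.
more_reduced = gsh_half_cell_potential(5e-3, 5e-5)
more_oxidized = gsh_half_cell_potential(5e-3, 5e-4)
print(f"{more_reduced * 1000:.1f} mV vs {more_oxidized * 1000:.1f} mV")
```

Because GSH enters the expression squared, the potential depends on the absolute GSH concentration and not only on the GSH/GSSG ratio, which is one reason buffer composition during purification can shift the redox state of exposed Cys residues.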
Based on detailed studies of Met distribution and its role in model peptides, Levine and coworkers proposed that Met residues constitute an important defence mechanism [45,99-101].
Highly reactive oxygen species most likely oxidize side chains of amino acid residues and increase the susceptibility of protein chains to degradation. Iron, copper, and zinc are the most abundant transition metals in cells, and their deficiency is linked to a large variety of diseases [104][105][106][107][108][109][110][111]. All three metals coordinate the sulfur, nitrogen, and oxygen atoms of amino acid residues, unlike calcium, which coordinates only oxygen. Unlike iron and copper, zinc is redox inert and functions as an antioxidant, either by binding directly to Cys and protecting sulfhydryl groups from oxidation, or by binding to other amino acid residues, resulting in conformational changes and competition with iron and copper [112][113][114]. Zinc is known to play several fundamental roles in protein properties: it acts as an enzyme co-factor, forms zinc finger motifs and participates in redox switches via thiol-disulfide exchange. Thus, it is involved in the structure, function and protein-protein interactions of almost every single protein, either directly or through transduction pathways. The study of metal coordination and of how it contributes to the proper folding and activity of proteins is a relatively new area of research, known as bio-inorganic chemistry and, very recently, as Metallo-proteomics [115][116][117][118][119], which amalgamates proteomic and metallomic approaches to study proteins in cells and in isolated form. This area of research will greatly improve current understanding of oxidative damage in purified proteins and of the contribution of transition metals to these processes, as it has been observed previously that oxidation is possible only in the presence of particular transition metals.
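As a sketch of the kind of metal-dependent chemistry referred to above, the textbook Fenton and Haber-Weiss reactions are written out below for iron. They are included as a general illustration; it is an assumption that these are the specific mechanisms the cited studies have in mind. Copper can undergo analogous Cu+/Cu2+ cycling, whereas redox-inert zinc cannot take part in such a cycle.

```latex
\[
\begin{aligned}
\mathrm{Fe^{3+}} + \mathrm{O_2^{\bullet-}} &\rightarrow \mathrm{Fe^{2+}} + \mathrm{O_2} \\
\mathrm{Fe^{2+}} + \mathrm{H_2O_2} &\rightarrow \mathrm{Fe^{3+}} + \mathrm{OH^-} + {}^{\bullet}\mathrm{OH} \quad \text{(Fenton reaction)} \\
\mathrm{O_2^{\bullet-}} + \mathrm{H_2O_2} &\rightarrow \mathrm{O_2} + \mathrm{OH^-} + {}^{\bullet}\mathrm{OH} \quad \text{(net Haber-Weiss reaction)}
\end{aligned}
\]
```

The first two steps sum to the third, so the metal acts catalytically: trace amounts of iron carried through a purification can keep generating hydroxyl radicals as long as peroxide and a reductant are available.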
All of the above, when considered together, suggest that transition metal content (specifically that of iron, copper, and zinc) in cells, media, and during the protein purification process has a critical consequence for protein instability and biological activity. Therefore, preliminary information on protein amino acid composition, distribution of certain reactive residues and how they would be exposed to metal-induced damage in cells for expression (where the environment is different from the native cells), with different buffer compositions during and after purification, is an important consideration for experimental design. Strict control of oxygen supply during aerobic fermentation, metal composition, contamination, chelators and reducing reagents during the entire production cycle from cellular growth to protein storage should be included. As mentioned above, the nature of the metal may have a negative impact, or may protect protein from undesired oxidation. In either case, knowledge of these types of interactions is extremely helpful for better management of the protein production process.

Integration of classic with advanced analytical techniques
Dynamic protein structure is the subject of a wide variety of investigations aimed at better describing the correlation between structural variations and cellular functions. X-ray crystallography, small-angle X-ray scattering, nuclear magnetic resonance (NMR) spectroscopy and cryo-electron microscopy (cryo-EM) provide atomic resolution of proteins and structural ensembles directly; however, those methods still have limitations for characterizing the timescales of inter-conversions. The classical biophysical techniques of fluorescence, circular dichroism (CD), UV/Vis absorption, infrared and Raman spectroscopy, and electron paramagnetic resonance (EPR) provide information on kinetics, the energy landscape, and time-resolved protein dynamics. The two types of techniques complement each other for a better description of protein dynamics in isolated form, and provide an opportunity to learn more about how proteins carry out a variety of functions in cells. The description of protein properties (SARs) is challenging; the relative probabilities of defining their energetic states inspire scientists to search for new directions and methodologies in order to better understand and predict the multiple transitions and properties of proteins in different types of cells.
Cellular oxidative damage of proteins is primarily linked to the disruption of redox and metal ion balance, with increased formation of ROS [81,83,102-104]. The redox state of cells is predominantly dependent on iron- and copper-redox couples, followed by nickel, chromium and cadmium, and the common mechanisms of ROS formation and metal-induced oxidative damage involve these redox-active metals.
Currently, proteomics research is growing exponentially with new high-throughput (HTP) technologies, where MS-based methods continually expand the analytical spectrum, from typical peptide identification, post-translational modification and metabolite quantification to estimation of co-factor binding affinities and specificity. The preferred method, a combination of strong ion exchange and reverse-phase liquid chromatography with tandem mass spectrometry (MS/MS), allows large-scale proteomics and analysis of various endogenous and artefact modifications to characterize unexplained spectra, such as those obtained with potassium cations that mimic the mass increment of a phosphorylation group [120]. The correlation between protein oxidation and the nature of metals that affect protein stability, or that may generate artefacts during data processing, is particularly challenging to study by LC-MS. Therefore, improvement of methodologies for providing reliable information from large datasets is of great importance. Inductively coupled plasma mass spectrometry (ICP-MS) [121][122][123] is becoming a more widely recognized method that integrates well with other HTP proteomics approaches to provide precise information about the nature and stoichiometry of bound metals. The classical UV/Vis spectral assays, neglected for some time, are likewise expected to be integrated with advanced technologies.
The need for a more detailed description of multiple protein forms and conversions during the entire process, from isolation and purification to analytical studies, requires alternative approaches that draw on lessons from classical examples in cross-related fields. Typically, during protein production, absorption at 280 nm is measured to estimate concentration, based on extinction coefficients derived either from the λmax of the three chromophores in proteins (Phe, Trp and Tyr) or from experimental estimates. However, the entire UV/Vis profile of proteins (190-850 nm) is very informative for assessing improper folding, conformational transitions, protein-protein interactions, oligomer formation and aggregation, bound compounds, and impurities (Figure 4). The wavelength and intensity of absorption in the range of 190-300 nm are strongly influenced by the microenvironment surrounding the protein chromophores, including pH, rearrangements due to local interactions with small molecules, and disulfide reshuffling or protein-protein interactions, which may shift the maximum and intensity of absorption by 1-20 nm [124][125][126][127]. The co-ordination of transition metals or chromophores to proteins may raise shoulders in the range of 300-400 nm or peaks in the range of 400-800 nm in absorption spectra [128][129][130][131][132]. If UV/Vis assays are combined with other cost-effective colorimetric assays, such as PAR (4-(2-pyridylazo)resorcinol) for transition metal analysis [133][134][135] and DTNB (5,5'-dithiobis-(2-nitrobenzoic acid)) or its derivatives for Cys oxidation [136][137][138], this could be a powerful strategy for collecting extremely important information on protein status during the entire protein production process. The combination of low-cost classical and appropriate HTP assays, with a focus on particular factors and their interactions, allows for a reduction in experimental variables and more consistent in vitro results.
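The concentration estimate at 280 nm and the DTNB readout both reduce to the Beer-Lambert law, which can be sketched as follows. The molar absorptivities are commonly cited literature values used here as assumptions (5500 and 1490 M^-1 cm^-1 for Trp and Tyr, 125 for cystine, 14150 for the TNB anion at 412 nm); Phe is left out because it contributes negligibly at 280 nm. The function names, residue counts and absorbance readings are hypothetical.

```python
# Commonly cited molar absorptivities in M^-1 cm^-1 (assumptions, not from this review).
EPS_TRP_280 = 5500.0
EPS_TYR_280 = 1490.0
EPS_CYSTINE_280 = 125.0
EPS_TNB_412 = 14150.0  # TNB anion released by DTNB per free thiol


def epsilon_280(n_trp, n_tyr, n_cystine=0):
    """Approximate protein molar absorptivity at 280 nm from residue counts."""
    return n_trp * EPS_TRP_280 + n_tyr * EPS_TYR_280 + n_cystine * EPS_CYSTINE_280


def molar_concentration(absorbance, epsilon, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l)."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance / (epsilon * path_cm)


def free_thiols_per_protein(a412, a280, eps280, path_cm=1.0):
    """Moles of DTNB-reactive thiol per mole of protein.

    Assumes both readings refer to the same protein concentration
    (no dilution correction) and that A412 is blank-subtracted.
    """
    thiol = molar_concentration(a412, EPS_TNB_412, path_cm)
    protein = molar_concentration(a280, eps280, path_cm)
    return thiol / protein


# Hypothetical protein: 2 Trp, 10 Tyr, 1 disulfide; hypothetical readings.
eps = epsilon_280(n_trp=2, n_tyr=10, n_cystine=1)
conc = molar_concentration(0.52, eps)
ratio = free_thiols_per_protein(a412=0.283, a280=0.52, eps280=eps)
print(eps, conc, ratio)
```

A thiol-to-protein ratio that drifts between fractions or over storage time is an inexpensive early signal of Cys oxidation or disulfide reshuffling, which is the kind of in-process monitoring the text argues for.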

Statistical design of experiments
Statistical design of experiments (DoE) is a highly efficient foundation for describing multifactorial systems, such as biological ones, and is applicable to all stages of the bioprocess [139][140][141]. Beyond the different existing definitions of DoE, the main concept is to study more than two factors in parallel, searching for correlations between them; it differs from the traditional model, where experiments are designed sequentially in order to eliminate single-tested factors. The power of DoE to assess multiple factors in parallel with cross-interactions lies in attaining more precise information from fewer experiments. It is now possible to put statistical DoE into practice and to take into account the enormous amount of data coming from diverse areas of proteomics research (Figure 5). Detailed examples of statistical DoE application for bio-process optimization [142][143][144] and protein purification [144] provide evidence of how multidimensional combination and interaction of process parameters improve productivity. In bio-processing optimization, the common methodology is screening of variables such as pH, temperature, and media and mixture composition, with strict control of oxygen supply. Improvement in protein yield and activity for different types of expression systems has been achieved by monitoring the interactions between critical factors and assuring consistency of the fermentation process. The application of DoE to the purification of a variety of mAbs, with initial screening of 3-5 types of chromatography media followed by variation of pH, conductivity and additives, has permitted fast optimization of the process, improving recovery with fewer purification steps. DoE could be extended to other interacting factors, such as protein oxidation or other modifications in the presence of different ions during the purification process.
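The idea of varying several factors in parallel and reading off both main effects and interactions can be sketched with a two-level full factorial design. This is a minimal illustration, not the procedure used in the cited studies: the factor names are hypothetical, and the responses are simulated from a made-up model so that the expected effects are known in advance.

```python
from itertools import product


def full_factorial(names):
    """All runs of a two-level full factorial design, as dicts of coded levels (-1, +1)."""
    return [dict(zip(names, levels)) for levels in product((-1, 1), repeat=len(names))]


def effect(runs, responses, *names):
    """Main effect (one name) or interaction effect (several names).

    Computed as mean response where the product of the coded levels is +1
    minus mean response where it is -1.
    """
    plus, minus = [], []
    for run, y in zip(runs, responses):
        sign = 1
        for name in names:
            sign *= run[name]
        (plus if sign > 0 else minus).append(y)
    return sum(plus) / len(plus) - sum(minus) / len(minus)


# Hypothetical screening factors for a purification step.
factors = ["pH", "temperature", "conductivity"]
runs = full_factorial(factors)  # 2**3 = 8 runs

# Simulated yields from an invented model: pH helps, temperature hurts,
# pH and temperature interact, conductivity has no effect.
responses = [
    50 + 5 * r["pH"] - 3 * r["temperature"] + 2 * r["pH"] * r["temperature"]
    for r in runs
]

print("pH:", effect(runs, responses, "pH"))
print("temperature:", effect(runs, responses, "temperature"))
print("pH x temperature:", effect(runs, responses, "pH", "temperature"))
print("conductivity:", effect(runs, responses, "conductivity"))
```

Eight runs here estimate three main effects and their interactions at once, whereas changing one factor at a time would not reveal the pH-temperature interaction at all. With more factors, fractional factorial or other screening designs keep the run count manageable.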
It is important to distinguish reversible from irreversible modifications: how the reversible processes might be better controlled, and how the irreversible modifications affect protein properties. Information obtained from the early stages of a process will allow for faster selection of only those factors that are critical for proper protein folding and activity, with an increased probability of reproducing procedures and obtaining more consistent results. It remains a matter of choice how to select the factors for initial screening and how many interactions should be monitored within a given timeframe.
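The central DoE idea described above, varying several factors in parallel and estimating both their individual effects and their interactions, can be sketched with a two-level full factorial design. The Python example below is a minimal illustration under stated assumptions and not a procedure from the cited work: the function names are hypothetical, levels are coded as -1 (low) and +1 (high), and any responses passed in must come from the reader's own measurements.

```python
from itertools import product


def full_factorial(factor_names):
    """Return every run of a two-level design as a dict of coded levels (-1/+1)."""
    names = list(factor_names)
    return [dict(zip(names, levels))
            for levels in product((-1, 1), repeat=len(names))]


def effect(runs, responses, names):
    """Estimate a main effect (one name) or an interaction (several names).

    The effect is the mean response of the runs whose product of coded
    levels is +1 minus the mean response of the runs where it is -1.
    """
    high, low = [], []
    for run, y in zip(runs, responses):
        sign = 1
        for name in names:
            sign *= run[name]
        (high if sign > 0 else low).append(y)
    return sum(high) / len(high) - sum(low) / len(low)
```

For three factors, for instance pH, temperature and an additive, `full_factorial` yields 2^3 = 8 runs, from which three main effects and three two-factor interactions can be estimated. A one-factor-at-a-time series of similar size would reveal none of the interactions. For larger numbers of factors the run count grows as 2^k, which is why initial screening is often done with fractional designs before the critical factors are studied in full.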
The wealth of information from various sources needs to be carefully analysed in the context of the unique structural and dynamic properties of the protein of interest. Of major importance is the initial collection of data available for a particular target and family members that would allow applying similar approaches to other targets. The constant use of simple, cost-effective techniques in parallel and monitoring of protein complex behaviour will enhance knowledge of protein in vitro properties, and requirements for prevention of undesired processes and products. Most importantly, the collected information will be of fundamental significance for further decisions regarding appropriate bio-analytical assays with more comprehensive, robust and reliable product characterization. The main concepts of DoE and the essential analytical tools might have broad application to the entire production process, yet careful assessment of the appropriate target-specific approaches is required.
The MetAP and mAb examples provided in this review illustrate the need for further systematic approaches in order to better characterize proteins for application in research, diagnostics and drug discovery. As demonstrated with MetAP research, more detailed sequence/structure comparison and analysis allowed for the identification of conserved domains and residues that had not been characterized before. Our new understanding of the mechanisms of action and substrate specificity of MetAP enzymes is essential and is based on the recently identified allosteric disulfide bond with redox properties in the eukaryotic hMetAP. With the recent development of Omics fields, such as Redox- and Metallo-proteomics, the protein signaling network requires a different perception, one with more attention to the existence and contribution of Cys residues in protein structures. The great abundance of Cys in Abs demands particular attention, because Cys scrambling is a prerequisite for oxidation, aggregation, and loss of activity and specificity. By applying DoE during R&D and manufacturing processes, with constant monitoring of critical factors, many of those factors could be analysed and monitored through fewer experiments. As a result, it is possible to obtain consistency of product and process performance while also applying cost-effective approaches and improving quality.
The basic concepts outlined in this review may be applicable to other types of proteins, such as cytokines, receptors, hormones, growth factors and others. The multidimensional nature of proteins requires complex case-by-case approaches when a research team endeavours to study a particular drug target. With the rapid development of modern biotechnologies, their proper application requires a thorough understanding of protein nature and data interpretation. Biological processes could be described at the atomic and molecular levels through the collective efforts of scientists working in diverse, closely related fields. Bridging physics and chemistry with biology is a challenging task; however, crossing borders between life science disciplines is a vital process that opens great opportunities for bioanalytical application.
In cells, signal transduction, enzyme catalysis, protein-ligand and protein-protein interactions are biological processes that occur on a µs to ms timescale, coincident with the local motion and bond vibration of polypeptides on a fs to ns timescale. In isolated form, proteins exist in a large variety of dynamic forms; thus, the selection of appropriate experimental approaches will enhance the description of protein SARs and dynamics, both in vitro and in vivo. Novel data-mining algorithms, coupled with innovative, cost-effective, multidisciplinary approaches and HTP technologies from academia and industry, will support drug discovery efforts and protein manufacturing.