History of Progress and Challenges in Structural Biology

In 1953, James Watson and Francis Crick published the molecular structure of the DNA double helix, drawing on X-ray diffraction data obtained by Rosalind Franklin and Maurice Wilkins [1,2]. The simple base pairings, A-T and C-G, revolutionized virtually every field of biological science. Understanding the genomes of organisms, especially humans, was believed to be the Holy Grail for understanding development and disease, and this belief led to the race to publish the whole human genome. The Human Genome Project was completed in 2003 [3,4], about half a century after the discovery of the structure of DNA. While we now understand more about DNA structure and the genomes of several organisms, such as humans, fruit flies, laboratory mice, and rice, our understanding of protein structure is still limited.


Introduction
In the years between the discovery of DNA's structure and the completion of the Human Genome Project, the first three-dimensional structures of proteins were determined by X-ray crystallography. The limitations of protein structure determination include 1) the complexity of amino acids, 2) the difficulty of expressing proteins in large quantities for experimental structural studies, 3) the difficulty of phase determination, especially for proteins with an unknown structural fold, and 4) the difficulty of predicting whether a protein will crystallize. First, the larger alphabet of proteins, 20 amino acids compared with 4 bases in DNA, and the variety of protein structures make it hard to generalize about protein structure. Second, John Kendrew and Max Perutz published the very first crystal structures, of myoglobin and then hemoglobin [5,6]. Both are oxygen-binding proteins that can be readily collected in large quantities from nature. Very few, if any, abundant proteins remain without a determined structure. The challenge now is to find new ways to improve the expression and purification of unstable or membrane-bound proteins. Several protein expression systems, ranging from E. coli and yeast to insect and mammalian cells, are currently used to address this problem [7]. Third, multiple phasing methods have been introduced in protein crystallography, and phasing becomes easier as more structures become available: most structures have been determined by simple molecular replacement [8], which uses a known, related structure as a search model. Fourth, numerous approaches have been proposed for designing constructs suitable for protein crystallization. However, crystallizing a protein remains one of the most daunting and unpredictable tasks and is the rate-limiting step of structure determination [9].
There have been historical attempts to compile most known protein structures in order to predict the folding of proteins with no known structure. These attempts led to structural genomics and structural proteomics [10,11]. The main goal of structural genomics/proteomics projects is to combine high-throughput screening with the traditional methods of protein structural biology (X-ray crystallography and NMR) and with computational structural modeling to describe the three-dimensional structure of every protein encoded by a given genome [10,11]. While this is an enormous task that requires significant bioinformatics support, it allows fast access to novel protein structures. Unfortunately, most of these proteins have no known function, and therefore the immediate use and publication of the work can be limited [10,11]. It is important to note that traditional structural biology, which elucidates the structures of important proteins by X-ray crystallography and NMR alongside functional studies, is still the standard for publishing structural work in high-impact journals.
While learning novel protein folds is essential to moving structural biology forward, it is also important to collect as many experimental structures as possible, especially for promiscuous proteins and enzymes that can interact with a variety of ligands, substrates, or inhibitors [12]. Computational modeling of protein structure can be useful for determining the general fold of a protein [13] and, when combined with functional studies, for predicting interaction sites on its surface [14]. However, the tertiary or quaternary structure of multimeric or multidomain proteins can be difficult to predict without experimental data. In these cases, structural experiments can be essential for determining the role of embedded interaction surfaces that may be exposed or occluded depending on the bound ligand or protein partner [15].
For this special issue, we invited authors to publish their structural studies and reviews of protein structures in relation to human diseases. The challenge for protein structural biology is the continuing need for experimental data on novel structural folds. Coupled with this, there is a need for complementary bioinformatics and computational modeling that allow further hypothesis generation and a deeper understanding of relatively solid crystal/NMR structures. The combination of these technologies and collaboration will, in the near future, lead us to a complete understanding of protein folding and structure prediction, which will undoubtedly benefit science and medicine.