Received date: December 02, 2016; Accepted date: December 19, 2016; Published date: January 05, 2017
Citation: Kakubayashi N, Fujita E, Morikawa M, Ohashi S, Matsuo Y (2017) Concerted Evolution of the Replication-Dependent Histone Gene Family in Drosophila immigrans. J Data Mining Genomics Proteomics 8:210. doi: 10.4172/2153-0602.1000210
Copyright: © 2017 Kakubayashi N, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
The replication-dependent histone genes in Drosophila immigrans were analyzed to elucidate the evolutionary mechanism of the histone multigene family. A region of approximately 3.9 kb containing the H2A-H2B-H1 genes was cloned. Six independent clones were sequenced and analyzed for nucleotide variability. The average nucleotide sequence identity among repetitive copies in the region was more than 99%, indicating that the histone multigene family in D. immigrans has evolved in a concerted fashion, at a level similar to that in D. melanogaster. Amino acid variants were found at a low frequency. Analysis of the GC content at the 3rd codon position of the histone genes revealed that the decrease in GC content observed in D. hydei and D. americana occurred after the ancestor of these two species diverged from D. immigrans.
Concerted evolution; Histone multigene family; GC content; Drosophila
Evolutionary factors, including selection, genetic drift, migration and mutation, interact and cause genetic changes in organisms [1-2]. Evolutionary mechanisms can be studied by investigating measurable values, such as genetic variability [1,3-6], molecular evolutionary rate [7-10], concerted evolution of a multigene family [11-17] and changes in GC content [14-15,18-20], and by inferring the factors that affect them. Studying the evolution of a multigene family requires additional information, such as copy number and variability among copies [13,16-17]. Gene copies of replication-dependent histones in Drosophila melanogaster were reported to differ by 0.40% within a chromosome. In addition, differences of 0.6-0.9% were reported for the histone repeating unit in other Drosophila species [17,21-22]. Many species must be investigated to determine the kinds of evolutionary changes that have occurred and the level of nucleotide variability in a multigene family. The factors contributing to the evolutionary mechanism of a multigene family can be understood more clearly if the nucleotide variability among copies is studied in many species.
There are two types of histones: replication-dependent and replication-independent. In Drosophila, the histone genes of the replication-dependent type are tandemly clustered with approximately 110 copies; in contrast, the histone genes of the replication-independent type comprise only one or a few copies [25-28]. There is also a large difference in the gene structures of the two types of histones [25-28]. The two types of histone genes have evolved independently [29-31].
In this paper, to study the evolutionary mechanism of the histone multigene family, the nucleotide variability among repetitive genes was investigated in a region of approximately 3.9 kb containing the H2A-H2B-H1 genes of the histone repeating unit in D. immigrans. Studying the GC content in D. immigrans also provides information on GC content evolution in the lineage leading to species with a low GC content, i.e., D. americana and D. hydei.
Drosophila strain and DNA extraction
An isofemale strain of D. immigrans was donated by Kyushu University, Japan. Genomic DNA from D. immigrans was extracted from larvae with a DNA extraction kit (Sepa Gene Kit, Sanko Junyaku, Co., Ltd., Tokyo, Japan) and studied for variability in the gene family within the population.
A 3.9 kb region containing the H2A-H2B-H1 genes of the histone repeating unit was amplified by PCR from genomic DNA (Figure 1). PCR reactions were conducted with Takara EX Taq (Takara Bio, Kyoto, Japan) under the following conditions: 40 cycles of denaturation at 94°C for 1 min, annealing at 55°C for 2 min and polymerization at 70°C for 2 min, followed by extension for 5 s. The nucleotide sequences of the primers used for cloning the 3.9 kb region are shown in Table 1. The PCR products were then cloned into the plasmid vector pCR 2.1 (Invitrogen, Carlsbad, CA, USA). Six independent clones were analyzed for genetic variation of the multigene family in the 3.9 kb region (Figure 1).
|Primers||Sequence (5’-3’)||Primers||Sequence (5’-3’)|
Table 1: Sequences of the primers used for cloning and sequencing.
The sequencing strategy for the 3.9 kb region of the histone gene repeating unit is shown in Figure 1. The nucleotide sequences of the primers used for sequencing are indicated in Table 1. The PCR products were sequenced with a BigDye Terminator sequencing kit (Applied Biosystems, Foster City, CA, USA) using an ABI310 sequencer. DNA sequences for the 3.9 kb region of the histone gene repeating unit in D. immigrans are deposited in the DNA Data Bank of Japan (DDBJ). The accession numbers for clones imm 1, imm 5, imm 6, imm 8, imm 10 and imm 11 are LC194855, LC194856, LC194857, LC194858, LC194859, and LC194860, respectively.
The nucleotide sequences of the region were compared among the six independent clones. The sites that differed among the clones are shown in Figure 2. Seven of the 12 variable sites in the coding regions involved an amino acid change. In D. immigrans, amino acid variation was found at two sites (R-C, L-F) in H2A, one site (L-P) in H2B and four sites (A-P, K-R, T-A, K-E) in H1. Each variant type was found only once among the six samples (Figures 2 and 3). Nucleotide differences were observed over the whole region; however, indels were found only in the 3’ spacer of the H1 gene. The average nucleotide variability in the region was 0.28%, i.e., the average identity was 99.72%. The average variability among copies in D. immigrans was very small compared with the corresponding interspecies differences: 32% between D. immigrans and D. americana and 33% between D. immigrans and D. hydei. These results indicate that strong concerted evolution has occurred in the multigene family of replication-dependent histone genes in D. immigrans.
The GC contents at the 3rd codon position of the histone genes for H1, H2A, H2B and H3 are shown in Figure 3. For comparison, the GC contents at the 3rd codon position of the histone genes from D. melanogaster, D. americana and D. hydei are also shown. D. americana and D. hydei are Drosophila species that have a low GC content in the genes [19,21]. Although the GC content of D. melanogaster is lower than those of D. lutescens and D. takahashii, it is higher than those of D. americana and D. hydei. If the GC content at the 3rd codon position in D. immigrans is similar to that in D. melanogaster rather than to those in D. hydei and D. americana, the GC content must have changed after the divergence of D. immigrans and an ancestor of D. hydei and D. americana. Alternatively, if it is similar to those of D. hydei and D. americana, the GC content must have changed after the divergence of D. melanogaster and an ancestor of D. immigrans, D. hydei and D. americana. Figure 4 shows that the GC content of each histone gene (H3, H2A, H2B and H1) in D. immigrans was comparable to that in D. melanogaster. These results suggest that the low GC content observed in D. americana and D. hydei was caused by a decrease in GC content after the divergence of an ancestor of these species from D. immigrans.
Figure 4: Evolution of GC content at the 3rd codon positions of the H3 (yellow), H1 (grey), H2A (blue) and H2B (red) genes in D. melanogaster, D. americana, D. hydei and D. immigrans. The GC content data for the H3 gene were obtained from [15-16] for comparison. The evolutionary relationships among these Drosophila species are already known, as reported for the H3 genes.
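The GC content at the 3rd codon position compared above can be computed directly from an in-frame coding sequence. The following is a minimal sketch, not the authors' procedure: the function name `gc3_content` and the toy sequence are hypothetical, and the sketch assumes the input is a complete reading frame with no ambiguous bases.

```python
def gc3_content(cds: str) -> float:
    """Return the fraction of G or C among third codon positions.

    Assumes `cds` is an in-frame coding sequence (length a multiple of 3).
    Every codon supplied is counted, so trim the stop codon beforehand
    if it should be excluded.
    """
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a positive multiple of 3")
    third_positions = cds[2::3]  # every third base, starting at index 2
    gc_count = sum(base in "GC" for base in third_positions)
    return gc_count / len(third_positions)


# Hypothetical toy sequence (not a real histone gene): ATG GCC AAG TTT TAA
toy_cds = "ATGGCCAAGTTTTAA"
print(f"GC3 = {gc3_content(toy_cds):.2f}")
```

Applied to each histone gene of each species, a value of this kind would give the quantities plotted in Figure 4.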
The differences in nucleotide sequence observed among the histone gene repeating units of D. immigrans were caused by either nucleotide substitutions or small indels. Although nucleotide substitutions were observed in most regions, indels were observed only in the H1-H4 spacer region. This is understandable because indels, which change the length of the sequence, would be more deleterious than nucleotide substitutions. Alternatively, indel mutations may occur more frequently in the spacer region than in other regions. Some of the nucleotide substitutions in the coding regions cause amino acid changes in the histone proteins. Low-frequency amino acid variants of replication-dependent histones have been reported in the gene family. Such amino acid variants should be called ‘histone variants’ even when the histone is of the replication-dependent type. Histone proteins are highly conserved at the amino acid level, especially H3 and H4; that is, strong purifying selection acts at the amino acid level. Unlike the replication-independent histones, which are also called ‘variants’, the variants of the replication-dependent histones found here are presumably deleterious rather than conferring distinctive or new functions. The presence of multiple copies in the gene family presumably allows such surplus variants to persist.
The average nucleotide variability for the histone gene repeating unit was 0.28% in D. immigrans. Corresponding data for the histone repeating unit from other Drosophila species are shown in Table 2. Variability in D. immigrans was relatively low and was comparable to that in D. melanogaster, indicating strong concerted evolution. These results suggest that the interaction of evolutionary factors, such as selection, gene conversion, unequal crossing over and mutation, might be very similar in these two species.
|Species||Nucleotide variation (%)||No. of sequences||Reference|
|D. melanogaster||0.40*||10||Matsuo and Yamazaki|
|D. sechellia||0.61||3||Kakita et al.|
|D. yakuba||0.65||2||Kakita et al.|
|D. hydei||0.89||2||Fitch and Strausbauch, Kremer and Hennig|
|D. immigrans||0.28||6||This study|
Table 2: Nucleotide variation in the histone multigene family in Drosophila. *Data on variation within a chromosome were used.
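One common way to obtain an average variability figure of the kind listed in Table 2 is the mean proportion of differing sites over all pairs of aligned copies. The sketch below illustrates that calculation under stated assumptions: the paper does not spell out how indels were treated, so skipping gap columns is an assumption here, and the function names and toy alignment are hypothetical.

```python
from itertools import combinations


def pairwise_difference(a: str, b: str, gap: str = "-") -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap are skipped (an assumption;
    other treatments of indels are possible).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def mean_pairwise_difference(seqs: list[str]) -> float:
    """Average of pairwise_difference over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_difference(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignment of three copies (not data from this study)
toy_clones = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC"]
print(f"average variability = {100 * mean_pairwise_difference(toy_clones):.2f}%")
```

With six clones there are 15 pairs, so a single rare variant contributes to only five of the pairwise comparisons, which is one reason the average stays low when each variant occurs once.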
The low GC content observed in D. americana and D. hydei can be explained by a decrease in GC content after the divergence of an ancestor of these species from D. immigrans. The hypothesis for GC content evolution proposed in our previous works [14-16,20] explains the change as follows. The evolution of the GC content at the 3rd codon position was caused by a change in the efficiency of negative selection acting on AT content, or codon bias [14-16,20]. The efficiency of selection depends on the population size: as the population becomes larger, the efficiency of selection also becomes larger. Therefore, when the population size of an ancestor of D. americana and D. hydei became smaller after the ancestor diverged from D. immigrans, the GC content must have decreased through the relaxation of negative selection against A/T. This population size effect is also expected to be seen for other genes in the same genome.
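The dependence of selection efficiency on population size can be illustrated with the standard diffusion approximation for the fixation probability of a new mutation (Kimura's formula for genic selection, with effective size equal to census size). This is a textbook illustration under stated assumptions, not a model fitted in this study; the selection coefficient and population sizes below are hypothetical values chosen only to show the trend.

```python
import math


def fixation_probability(s: float, n: int) -> float:
    """Fixation probability of a new mutant in a diploid population of size n.

    Uses u = (1 - exp(-2s)) / (1 - exp(-4ns)), assuming genic selection
    with coefficient s and an initial frequency of 1/(2n).
    Note: math.expm1 overflows when -4*n*s exceeds roughly 709.
    """
    if n <= 0:
        raise ValueError("population size must be positive")
    if s == 0.0:
        return 1.0 / (2 * n)  # neutral expectation
    # expm1(x) = exp(x) - 1; the two minus signs cancel in the ratio
    return math.expm1(-2.0 * s) / math.expm1(-4.0 * n * s)


def relative_to_neutral(s: float, n: int) -> float:
    """Fixation probability divided by the neutral value 1/(2n)."""
    return fixation_probability(s, n) * 2 * n


# Hypothetical slightly deleterious change, e.g. G/C to A/T at a 3rd codon position
s_hypothetical = -1e-5
for n in (1_000, 100_000, 1_000_000):
    ratio = relative_to_neutral(s_hypothetical, n)
    print(f"N = {n:>9,}: fixation relative to neutral = {ratio:.3g}")
```

In this sketch the same slightly deleterious change fixes at close to the neutral rate in the smallest population and almost never in the largest, which is the sense in which a reduced population size relaxes negative selection against A/T.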
The 3.9 kb region containing the H2A-H2B-H1 genes from D. immigrans was analyzed for nucleotide variability among repeating units of the replication-dependent histone genes. Strong concerted evolution was found to have occurred in the histone multigene family in D. immigrans. Investigations of nucleotide variability in many Drosophila species will provide valuable data for understanding the evolutionary mechanisms of the multigene family.
This research was supported by a Grant-in-Aid for Scientific Research to Y. M. from the Ministry of Education, Culture, Sports, Science and Technology of Japan.