Sequence Diversity at Cyt-b and Co-1 mtDNA Genes in Animal Taxa Proved Neo-Darwinism

Nucleotide sequences of mitochondrial DNA (mtDNA) depict the process of molecular evolution and speciation in animals. A dataset of about twenty-three thousand sequences of two genes, Cyt-b and Co-1, from different species was analyzed at five taxonomic ranks across the Animal Kingdom. The results support the prevalence of geographic or allopatric speciation and suggest that Darwin's gradual evolution in animals also prevails at the molecular level. The suggested approach allows one to recognize the geographic and other speciation modes by using set theory equations with genetic terms as their components. It may solve a key problem of the Biological Species Concept, i.e. the inability of evolutionary studies to monitor reproductive isolation among species in nature, by defining the species rank through measurable estimates of genetic parameters.

*Corresponding author: Yuri Ph Kartavtsev, AV Zhirmunsky Institute of Marine Biology of the Far Eastern Branch of the Russian Academy of Sciences, Vladivostok 690041, Russia, Tel: +7-4232-311173; Fax: +7-4232-310900; E-mail: yuri.kartavtsev48@hotmail.com
Received July 06, 2013; Accepted October 07, 2013; Published October 14, 2013
Citation: Kartavtsev YP (2013) Sequence Diversity at Cyt-b and Co-1 mtDNA Genes in Animal Taxa Proved Neo-Darwinism. J Phylogen Evolution Biol 1: 120. doi:10.4172/2329-9002.1000120
Copyright: © 2013 Kartavtsev YP, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Introduction
Nucleotide diversity (p-distance and related measures) among individuals, which was reviewed recently for different animal taxa [1-3] as well as for plants, fungi, and sponges, provides a theoretical and empirical background for further development of molecular phylogenetics and DNA barcoding. The data show a sharp empirical signal: p-distance increases substantially with rank in the taxonomic hierarchy. This hierarchy consists of the gradations of animal diversity in nature, namely populations of a single species (1) and taxa of different rank: subspecies (plus semispecies and sibling species) (2), morphologically distinct species of a single genus (3), genera of the same family (4), and families of the same order (5). When analysed on a large scale, these associations between genetic distances and population-and-taxa gradations scientifically substantiate the global initiatives in molecular taxonomy, viz. CBOL (Consortium for the Barcode of Life; http://www.barcoding.si.edu/), iBOL (International Barcode of Life; http://www.DNAbarcoding.org) and the Tree of Life Project (http://tolweb.org/tree/). This does not mean that the molecular genetic approaches to the description of natural diversity are free from complications. There is an active discussion on the theme [1-5], and it is obvious that species identification is not a trivial task, because the species notion is complicated and has not been quantitatively defined in modern biology and evolutionary genetics [2,4]. At present, the most vivid contradiction in evolutionary biology is between the widely accepted Biological Species Concept (BSC) and the Phylogenetic Species Concept (PSC). While the degree of contradiction is more apparent than actual [6,7], researchers are still far from understanding the basic reasons for this contradiction.
However, misled by the vast opportunities of phylogenetic reconstruction from DNA sequences, some authors oppose the PSC to the BSC and even reject the analysis of the dynamics and divergence of current generations [8,9]. Fortunately, many geneticists share an understanding of the common nature of many intraspecies and interspecies divergence mechanisms and are far from such extreme views [6,7]. Simply speaking, the key parameters for differentiation in space and in time are gene flow (m) and effective population size (Ne). Under certain stable combinations, these two parameters lead to the accumulation of mutations and substitutions, thus forming the background of genetic diversity in the simplest case, when no natural selection is considered. While the genetic diversity of a single biological species in space may be reversible (e.g., two or more local populations may unite into one), divergence in time is not reversible once speciation is complete, and it must produce a relationship between genetic distance and taxon rank, because it is believed that the higher the taxon rank, the stronger the restriction of gene flow (the smaller m). Such accumulation of genetic diversity in time is well substantiated for the BSC and for the geographic or allopatric speciation mode [1-3]. Whether the BSC paradigm fits modern molecular data, and whether the geographic speciation mode along with others can be tested in a framework of genetic terms, are important questions arising in evolutionary biology and in the genetics of speciation. These issues are considered in a broader sense in this paper, which summarizes the author's view on the genetics of divergence and speciation [1-3].
The main objective of this paper is to annotate the levels of nucleotide diversity in animal populations and higher taxa (jointly considered as 5 comparison groups) using published data. The author also explains how molecular genetic variability and divergence are related to species identification (DNA barcoding), molecular phylogenetics, and speciation genetics.

Materials and Methods
A representative database of Cyt-b and Co-1 gene sequences, comprising about twenty-three thousand sequences from different species across the Animal Kingdom, was compiled. Details of the database underlying the genetic distance scores used in the current paper are given elsewhere [1-3]; here, a brief summary of the analysis of these sequence data is presented. The focus of the paper is on the primary nucleotide sequences of the genes (hereafter, sequences) and on their degree of divergence (the p-distance) among individuals at various taxonomic levels. Information on the p-distances and their derivatives in the database, together with details of the statistical analysis and comparison of these parameters, has been presented in part in earlier studies [1-3]. A considerable portion of the genetic diversity parameters in the original sources was either calculated from generated Cyt-b and Co-1 gene sequences or taken from the author's estimates for these genes elsewhere. Most of the sequences used in this study were retrieved from GenBank NCBI (http://www.ncbi.nlm.nih.gov) or BOLD (http://www.boldsystems.org/), or were obtained experimentally by the authors of the original papers from which they were taken.
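As an illustration of the distance measure used throughout, the following minimal Python sketch computes an uncorrected p-distance as the fraction of differing nucleotides between two aligned sequences. The function name `p_distance`, the requirement that the sequences be pre-aligned, and the decision to skip gaps and ambiguity codes are assumptions made for this example; they are not taken from the original analysis, whose procedures are described in [1-3].

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance: fraction of compared sites that differ.

    Assumes the two sequences are already aligned to equal length.
    Sites where either sequence has a gap or an ambiguity code are
    skipped (an assumption of this sketch, not of the source paper).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    valid = set("ACGT")
    compared = 0
    different = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in valid or b not in valid:
            continue  # skip gaps ('-') and ambiguous bases such as 'N'
        compared += 1
        if a != b:
            different += 1

    if compared == 0:
        raise ValueError("no comparable sites")
    return different / compared


if __name__ == "__main__":
    # Two hypothetical 8-bp fragments differing at one site.
    print(p_distance("ACGTACGT", "ACGTACGA"))
```

Under these assumptions, two sequences that differ at one of eight comparable sites have a p-distance of 1/8.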

Sequence divergence within species and at different levels of the taxonomic hierarchy
The conclusions of this paper, as noted above, are based on p-distances among mitochondrial DNA (mtDNA) sequences of the Cyt-b and Co-1 genes, as presented in three sources [1-3]. The BSC implies that a species is an isolated reproductive unit, and taxa above the species level are also generally assumed to be isolated entities. Sequences of individual genes demonstrate that the diversity of DNA markers increases with taxon rank and is lowest within species [10-13]. A revision of vast new sequence data [2,3] focused on the assignment of an individual specimen to a certain taxon, on the applicability of the two genes to DNA barcoding, on detection of the speciation mode by which a species originated, and on speciation genetics in general. The nucleotide diversity signal can also be translated into a gene tree with an appropriate set of tools [13,14], but this aspect is not dealt with in detail in this study.
Taking into account the variation in sample size (n) for each i-th distance measure in the comparison groups, we performed a two-way MANOVA on p-distances weighted by n (factor 1, comparison groups 1 to 5 above; factor 2, genes, Cyt-b and Co-1; a model with random effects of the factors was applied) (Figure 1). In this MANOVA, the effect of factor 1 (i.e., comparison group) was significant: F=4964.01, d.f.=4, 22227; P<0.000001. The effect of factor 2 (difference in mean p-distance between the two genes) proved to be non-significant: F=1.15, d.f.=1, 22227; P=0.2842. The interaction between factors 1 and 2 was also significant: F=101.05, d.f.=4, 22227; P<0.000001. The graph in Figure 1 clearly shows the meaning of the factor interaction: the p-distance values and their derivatives for the two genes differ in some of the five comparison groups; i.e., the substitution rates differ between Cyt-b and Co-1 in at least some of the groups of animal taxa compared. The number of sequences or specimens compared in these calculations is 22,232 (as indicated by the d.f.=22227 value above). Heterogeneity of gene evolution rates is widely known in the literature (e.g. [5,16]). The data presented in Figure 1 demonstrate that both genes show a trend of increasing mean p-distance with rising rank of the groups compared, from populations to orders. Because of the importance of this conclusion, the data presented in Figure 1 were additionally tested using the nonparametric Kruskal-Wallis ANOVA, and the same conclusion was reached [2,3]. Thus, the effects of saturation and homoplasy on the presented data for Cyt-b and Co-1 are small or negligible [2,3]. In other words, the data on diversity of the sampled mtDNA genes are not expected to be critically affected by these factors and are applicable in taxonomy and molecular phylogenetics up to the order level in various animal taxa.
Complementing the above information, vast DNA barcoding surveys have found that the majority of genera and even families are monophyletic [11,12,17,18]; thus, the molecular phylogenetic signal also shows a basic hierarchical ordering of genetic diversity (divergence) among natural taxa. Jointly, all these data suggest that such events as horizontal gene transfer and introgression of genes across taxon borders are negligible too and are unable to change the basic process of vertical gene transmission.
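The weighting of p-distances by sample size n can be illustrated with a short Python sketch that computes an n-weighted mean p-distance for each comparison group. This is only a sketch of the weighting idea, not a reproduction of the two-way analysis reported above; the function name `weighted_mean_by_group`, the record layout, and the numbers in the usage example are hypothetical and were chosen purely for illustration.

```python
from collections import defaultdict


def weighted_mean_by_group(records):
    """Return the n-weighted mean p-distance for each comparison group.

    `records` is an iterable of (group, p_distance, n) tuples, where
    `group` is a comparison-group label (e.g. 1 to 5) and `n` is the
    sample size behind that distance estimate (hypothetical layout).
    """
    weighted_sums = defaultdict(float)
    total_weights = defaultdict(int)
    for group, p, n in records:
        if n <= 0:
            raise ValueError("sample size n must be positive")
        weighted_sums[group] += p * n
        total_weights[group] += n
    return {g: weighted_sums[g] / total_weights[g] for g in weighted_sums}


if __name__ == "__main__":
    # Hypothetical values for illustration only (not data from the paper).
    example = [
        (1, 0.01, 10),
        (1, 0.03, 30),
        (2, 0.05, 20),
    ]
    for group, mean_p in sorted(weighted_mean_by_group(example).items()):
        print(group, round(mean_p, 4))
```

With weighting, an estimate based on 30 specimens contributes three times as much to the group mean as one based on 10 specimens, which is the motivation for weighting by n when sample sizes vary widely among sources.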

Applicability of sequence data to speciation mode detection, barcoding, and speciation genetics
As shown above, genetic differences are acquired gradually in time in isolated populations or their groups. The process of divergence then proceeds further, diversifying subspecies into semispecies, sibling species, morphologically distinct species, genera, and so on. This idea was conceptualized and tested experimentally in many sources [7,19-21]. The presented data on nucleotide sequences conclusively demonstrate that this process is implemented up to the order level (Figure 1). The distance estimates show good correspondence with analyses of the diversity of other markers [5,7,22]. This testifies to the applicability of p-distance for most intraspecies and interspecies comparisons of genetic divergence up to the order level in animals for the two genes compared. In other words, comprehensive analysis of vast data provides a theoretical and empirical background for further development of molecular phylogenetics, for the application of DNA barcoding to the re-description of biodiversity through specimen identification and attribution to a certain taxon, with quantification and database supply (see the iBOL/BOLD websites for details), and finally for the definition of speciation modes (see below). The distance signal in the different comparison groups is concordantly transmitted into gene tree topology when specified tools are used [14], as noted above. Certainly, many limitations should be kept in mind for precise molecular phylogenetic reconstructions, including saturation of nucleotide substitutions for certain taxa, homoplasy, topological constraints, the information capacity of marker genes (their number and complementarity), and biodiversity coverage. However, at the lowest taxonomic gradations, up to the genus level, these complications are usually minimal. Sometimes, e.g. for recently originated taxa or taxa that deviate substantially from the BSC, many complications may remain, and identification by genetic markers could fail even at this low level.
Nevertheless, the genetic diversity data reviewed here scientifically substantiate global initiatives such as CBOL, iBOL, and the Tree of Life Project.
Nowadays, evolutionary genetics lacks a speciation theory in a strict scientific sense, i.e. one that implies a formal, analytic model and the prediction of future events on its basis. In particular, such a model must predict the formation of species or at least distinguish different speciation modes on the basis of quantitatively defined parameters and/or their empirical estimates. The attempts made in this direction [23-27] do not meet the above requirements. An alternative is a scheme-and-algorithmic approach that has been developed to distinguish speciation modes (models) on the basis of key population genetic parameters and their estimates [1-3,28-30]. This approach, which I call the operation-and-genetic approach for delimiting speciation modes, may lay the foundation for a future genetic theory of speciation. Some verbal descriptions of the properties of speciation modes [24] are used as a basis for the evolutionary genetic concept of speciation. The development of the approach led to a classification scheme for seven known modes of speciation [1-3]. This scheme has been presented in specialized journals [2,3]. Note that this approach leads to a relatively simple experimental scheme, which allows a user 1) to organize an investigation of speciation in various groups of organisms, based on a focused approach with genetic terms, and 2) to obtain analytic expressions (equations) for each of the 7 specified speciation modes.
The descriptors in the equations are as follows. $D$, genetic distance at structural genes (any measure is relevant, e.g. Nei's $D_n$ [31]): $D_T$, among suggested parent taxa; $D_S$, among conspecific demes; $D_D$, among subspecies or sibling species. $p_T$ and $p_S$, p-distances, i.e. the fraction of different nucleotides in a pair of randomly sampled sequences (see references and explanations in [1-3,13]), for parent taxa and conspecific demes, respectively. $H_D/\pi_D$, mean heterozygosity/nucleotide diversity in a suggested daughter population; $H_P/\pi_P$, mean heterozygosity/nucleotide diversity in a suggested parent population. $E_P$, divergence in regulatory genes among suggested parent taxa; $E_D$, divergence in regulatory genes among suggested daughter taxa. $TM^+$, test for modification (positive); $TM^-$, test for modification (negative). $TM^+$ vs. $TM^-$ is an experimental test to detect scores for modifications or the signatures of reproductive isolation barriers (RIBs) that are responsible for the creation of differences in quantitative traits. This test could allow one to distinguish between epigenetic variation and a taxonomically significant difference.
Using the proposed scheme [2,3], one can determine the conditions required for speciation (necessity conditions) as well as the conditions sufficient for the formation of a species (sufficiency conditions). As shown in the preceding paragraph, experimentally measured descriptors are introduced (their number, including mtDNA and nDNA molecular markers, can be increased if necessary) to clarify how and in which form these conditions are manifested in a particular case of speciation or in a potential model. For instance, the divergent type of speciation D1 describes the classic geographic or allopatric speciation. The equations of the scheme help one to visualize and quantify the difference between speciation modes as functions, $\Phi_1(S)$ to $\Phi_7(S)$. According to the BSC, the D1 model implies that large populations become isolated (disruption of gene flow) and then evolve separately, accumulating mutations, nucleotide substitutions, and other changes over hundreds of generations. The RIBs between parent and daughter populations (taxa) arise from pleiotropic effects of these changed genes. The longer the time elapsed since the isolation event, the greater the distances between the corresponding taxa. Accordingly, the following descriptors are introduced in my notation: 1. $D_T > D_S$ and 4. $p_T > p_S$, where the subscripts T and S indicate genetic distances in the putative parental vs. daughter taxa, or in the species examined vs. conspecific populations (i.e. at higher and lower levels of the taxonomic hierarchy, respectively). Likewise, upon implementation of the D1 mode no significant differences in genetic diversity appear in either the structural genes or the regulatory part of the genome, because the initial and derived taxa have a large effective size, Ne, and thus a low rate of diversity loss over time. Consequently, we can introduce the parameters: 2. $H_D = H_P$, 3. $E_D = E_P$, and 5. $\pi_D = \pi_P$ (heterozygosity/nucleotide diversity, $H$ and $\pi$, and gene expression, $E$, in the daughter and the parental taxa; by definition, differences are absent in this case).
Finally, in some types of speciation, not only variability and genetic distances but also some quantitative loci (polygenes) are of major importance; these cannot be distinguished at the molecular level but lead to RIB formation. Hence, a new descriptor is introduced: 6. TM ($TM^+$ vs. $TM^-$; an experimental test for modification or for an RIB-important difference in some quantitative traits in natural taxa), which also allows one to distinguish between epigenetic variation and a taxonomically significant difference. The test for modification is an experimental investigation that could be designed to distinguish life forms (ecological modifications via epigenesis) from genetically caused irreversible changes in a derived daughter population of a new species. Single-gene changes in the structural or regulatory parts of genomes may be small in some speciation modes (e.g. D3 [1-3]), so molecular markers may not detect them, but TM could. In any case, the identification of speciation modes described above is an objective tool for speciation genetics and may be compared with some other quantitative approaches [4,32,33], although the approach above seems more advanced [1-3].
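The five molecular descriptors listed above for the D1 (geographic) mode can be written as a simple check. The following Python sketch is a hedged illustration only: the class `Descriptors`, the function `matches_d1`, and the relative tolerance used to decide whether two diversity estimates are "equal" are all hypothetical choices made for this example. In a real study, each inequality or equality would be judged by an appropriate statistical test on empirical estimates rather than by a fixed tolerance, and the TM descriptor is left out because the text does not state its expected value for D1.

```python
import math
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Empirical estimates of the descriptors (names follow the text)."""
    D_T: float   # genetic distance among suggested parent taxa
    D_S: float   # genetic distance among conspecific demes
    p_T: float   # p-distance among parent taxa
    p_S: float   # p-distance among conspecific demes
    H_D: float   # mean heterozygosity, daughter population
    H_P: float   # mean heterozygosity, parent population
    pi_D: float  # nucleotide diversity, daughter population
    pi_P: float  # nucleotide diversity, parent population
    E_D: float   # regulatory-gene divergence, daughter taxa
    E_P: float   # regulatory-gene divergence, parent taxa


def matches_d1(d: Descriptors, rel_tol: float = 0.1) -> bool:
    """Check the D1 conditions: D_T > D_S, p_T > p_S, H_D = H_P,
    E_D = E_P, pi_D = pi_P.

    Equality is approximated by a relative tolerance `rel_tol`, which
    is a placeholder assumption standing in for a statistical test.
    """
    def equal(x: float, y: float) -> bool:
        return math.isclose(x, y, rel_tol=rel_tol)

    return (
        d.D_T > d.D_S
        and d.p_T > d.p_S
        and equal(d.H_D, d.H_P)
        and equal(d.E_D, d.E_P)
        and equal(d.pi_D, d.pi_P)
    )


if __name__ == "__main__":
    # Hypothetical estimates for illustration only.
    example = Descriptors(D_T=0.30, D_S=0.05, p_T=0.10, p_S=0.01,
                          H_D=0.100, H_P=0.105, pi_D=0.004, pi_P=0.004,
                          E_D=0.2, E_P=0.2)
    print(matches_d1(example))
```

Other speciation modes in the scheme would correspond to different combinations of these conditions; those combinations are defined in [1-3] and are not reproduced here.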
Many critics of the BSC [34] argue that in most cases it is not possible to measure the notion critical for the BSC, reproductive isolation in nature, or to determine how effective an RIB established for a new species is. Under the approach proposed above, this inherent and principal weakness of the BSC is eliminated; i.e. the experimental or field estimates of the descriptors will give unambiguous evidence of whether the sufficiency conditions for speciation are met. Do all the data presented above imply that speciation always corresponds to the D1 type? Apparently, this is not true. At least the 7 speciation modes noted above are available, and their actual number in nature is unknown but may be high. Further development of the scheme is therefore planned, along with the creation of dedicated software. However, judging by the increase of p-distance with taxon rank (Figure 1), the geographic speciation mode is the most common. One can also conclude from these data that phyletic evolution prevails in the animal world.
Let us consider some complications and recent developments in evolutionary genetics that might weaken the suggested concept and the BSC in general. It is well known that mtDNA can spread across species barriers and may persist for many generations within the gene pools of species whose biological integrity is documented by other molecular markers and by phenotypic traits. Despite many contradictions, these facts are well documented by data from mice [35,36], frogs, fish, mussels, and other organisms [37]. Investigations of mtDNA genotypes in combination with nuclear DNA markers or isozyme loci have sometimes demonstrated the ability of mtDNA to introgress from one species into another if the hybrids between these species and their progeny are fertile, in which case it affects the nuclear vs. cytoplasmic background. This introgressive hybridization requires successful backcrosses of the ancestral hybrid female with males of the parental species or other taxa. Such introgression is independent of the recombination and segregation events occurring in the nuclear genome if natural selection maintaining nuclear-cytoplasmic compatibility is absent [31,38]. However, evidence of such selection appears increasingly often, indicating the operation of subtle selective mechanisms that maintain the interaction of nuclear and, for example, mitochondrial genes [39]. Nevertheless, a number of cases of mtDNA introgression (see references below) show that the possible selection, if it exists at all, is not strong enough to prevent hybridization and introgression. Thus, the instances of foreign mtDNA found in interspecies hybrids in nature, identified by other methods, may be proof of hybridization between closely related species (taxa) but may have little influence on the evolutionary fate of the species itself. Interspecies mtDNA transfer has been found in species of invertebrates (Drosophila) and vertebrates (Mus and Rana) [31,35-37,39-43].
The literature on the topic has already been considered with the aim of comparative analysis [6,36,42,44]. Based on an analytical approach to the analysis of nuclear-cytoplasmic equilibrium [45,46], an original method has been developed for testing the direction of hybridization and the intensity of introgression [37]. In this paper, I am able to touch only briefly upon the issue of mtDNA introgression, to elucidate its relationship with the species status in the BSC.
Asymmetry of introgression, as demonstrated for two frog species of the genus Hyla, is especially obvious in the data on nuclear-cytoplasmic interaction, and it is frequent in nature [37]. In many of the cases analyzed, the spread of mtDNA genes across interspecies borders, and probably of some mobile elements from "foreign" genomes as well, does not necessarily cause the disintegration of species. Moreover, in some cases, as supposed by the BSC, new mtDNA genes acquired via introgression may take part in the formation of RIBs. The action of RIBs depends upon the development of nuclear-cytoplasmic relations and on other biotic and abiotic events. An example of the current contradictory development of RIBs under hybridization, urbanization, and climate change is provided by different pairs of taxa in the Mytilus ex. group edulis complex [47-52]. Nowadays, the monitoring of hybridization and introgression is possibly one of the most pressing goals of evolutionary and general genetics. This is especially relevant given the available data on numerous claimed examples of reticulate evolution and gene introgression among many marine organisms [53,54]. Analysis of part of the evidence provided for animals [53] shows that in many cases the examples do not really represent interspecies introgression but rather intraspecies introgression, introgression between taxa of uncertain rank, or introgression between such taxa as subspecies or semispecies, which may not contradict the BSC. Some of the evidence in the review [53] indicates the presence of hybrids but not introgression itself. It is well known that the presence of F1 hybrids, even when genetically proved, does not necessarily mean that gene introgression has occurred [44]. For some groups of fish hybridization is common [44], but only sporadic introgression has occurred [55]. It is also quite well known that these events were caused by drastic climate changes in the past, as evidenced, for instance, for char [56,57].
Some evidence in the review [53] is based on morphological traits, but this may be only rough information and is hardly acceptable as a sign of gene introgression [28,44]. A very thorough investigation of fish species based on mtDNA and nuclear allozyme markers showed that in most cases the hybrids are only F1, which suggests that no real gene introgression may have occurred [57]. The cases of introgression presented for seaweeds [54] are currently rare and generally correspond with the facts known for terrestrial plants; i.e., they support the weakness of reproductive barriers in many plant "species". In this connection, it seems that claims of a crash of the modern BSC paradigm [53,58] are premature. On the contrary, the evidence summarized in the current paper and in reviews [1-3] shows that molecular genetic data are concordant with the BSC and with Neo-Darwinism in general.