Ashok Gupta*, Anuradha Bhardwaj, Supriya, Parvati Sharma, Yash Pal, Mamta and Sanjay Kumar
National Research Centre on Equines, Sirsa Road, Hisar (Haryana), India
Received date: September 17, 2015 Accepted date: October 01, 2015 Published date: October 07, 2015
Citation: Gupta A, Bhardwaj A, Supriya, Sharma P, Pal Y, et al. (2015) Mitochondrial DNA- a Tool for Phylogenetic and Biodiversity Search in Equines. J Biodivers Endanger Species S1:006. doi:10.4172/2332-2543.S1-006
Copyright: © 2015 Gupta A, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Assessing maternal lineage is essential for building a broad picture of evolution, phylogeny and genetic biodiversity within and among livestock breeds. In the recent past, there has been considerable progress in sequencing and analysing complete mammalian mitochondrial DNA (mtDNA) molecules. Most studies have focused on the mitochondrial D-loop region, the most variable part of mtDNA; its substitution rate is higher than that of the rest of the mitochondrial genome, which makes it a better genetic marker for assessing diversity. mtDNA possesses several favorable characteristics: it is present in large quantity in the cell, has a small genome, is haploid, is maternally inherited with an extremely low probability of paternal leakage, mutates faster than nuclear DNA, and changes mainly through mutation rather than recombination. These features make mtDNA one of the most useful and most frequently used markers in molecular systematics, and it has been widely employed to address questions of genetic diversity, population structure and population evolution in animals, including equines. Many native horse and pony breeds have been assessed for genetic diversity and ancestry using mtDNA in order to address questions of evolution, breed development and conservation.
Biodiversity; Equines; Inheritance; Displacement loop; Replication
Managing genetic diversity within populations is a prime factor in any breed conservation programme aimed at protecting a country's animal genetic resources [1,2]. Baseline molecular analysis provides a dependable tool that can be used together with quantitative approaches and traditional breeding strategies to design an efficient preservation strategy. Genetic distances can further be used to determine the population structure and genetic distinctiveness of a population or breed. Mitochondrial DNA (mtDNA) is a pivotal tool in evolutionary and population genetics, including molecular ecology. Owing to its elevated mutation rate, lack of recombination and maternal inheritance, the control region of mtDNA serves as a biomarker in phylogenetic studies. mtDNA sequence analysis has recently come into wide use as a functional marker system for population and evolutionary biology, because it provides a rich source of data for analysing genetic diversity and phylogeny. In recent times, much insight has been gained from the increasing use of the D-loop as a molecular marker for investigating inter-specific and intra-specific genetic differentiation in different animals, including equines.
Many studies have focused on the mitochondrial D-loop region, the most variable part of mtDNA, because its substitution rate is higher than that of the rest of the mitochondrial genome. It has also become a useful tool in forensic science owing to its high copy number, maternal inheritance and high levels of sequence polymorphism. Nonetheless, there is still a paucity of information on the structure and characteristics of the mitochondrial genomes of many species, information that could support efforts to advance epidemiological and phylogenetic studies and to address taxonomic questions [8-10]. There is therefore a need to focus on mitochondrial sequence analysis and its significance in exploring genetic diversity and generating evolutionary trees for different animal breeds. It is also important to review how mtDNA came to be an important genetic tool, what is understood about the mtDNA D-loop in evolutionary and population analyses, its role in disease diagnosis and prognosis, its application in forensic science, and what the future is likely to hold.
Every eukaryotic cell contains at least one copy of the entire nuclear genome, housed in its nucleus. In contrast, a cell may contain as many as several thousand mitochondria, with the number varying according to its energetic requirements. Mitochondria harbor a small but essential component of a eukaryote's genetic material. It has been known for many years that mitochondria are semi-autonomous, possessing their own genome and the machinery for replication, transcription and protein synthesis. This organelle plays a central role in numerous cellular functions such as metabolism (oxidative phosphorylation), apoptosis, thermogenesis and aging. Moreover, alterations in mitochondrial function contribute to several inherited and acquired human diseases and to the aging process. Mitochondrial DNA (mtDNA) possesses several favorable characteristics: it is present in large quantity (because each cell contains many mitochondria), has a small genome, is haploid, is maternally inherited with an extremely low probability of paternal leakage, mutates faster than nuclear DNA, and changes mainly through mutation rather than recombination. These features make mtDNA one of the most useful and most frequently used markers in molecular systematics, and it has been widely employed to address questions of genetic diversity, population structure, phylogeography and population evolution in animals [16-18]. Owing to a particular susceptibility to oxidative damage caused by high levels of Reactive Oxygen Species (ROS) generated in mitochondria, an inefficient DNA repair system and a lack of protective histones in this organelle, the mutation rate has been reported to be 10–17-fold higher in mtDNA than in nuclear DNA (nDNA). Different regions of the mitochondrial genome evolve at different rates, which allows a suitable region to be chosen for the question under study.
Mitochondrial DNA (mtDNA) is strictly maternally inherited, which means that mtDNA haplotypes should be shared by all individuals within a maternal family line. mtDNA is particularly useful for inferring phylogenetic relationships between closely related species within the same family, or even within the same species. While the exact origin of mitochondria is still uncertain, it is widely believed that they arose from an endosymbiotic relationship between a glycolytic proto-eukaryotic cell and an oxidative bacterium [23-25]. As such, they are able to maintain genomic independence from the nucleus. However, because proto-mitochondrial genes were integrated into the nuclear genome over the course of evolution, most mitochondrial proteins are encoded by nuclear DNA (nDNA) and imported into mitochondria. Although the replication of mtDNA is not coordinated with nDNA replication, the overall number of mitochondria per cell remains fairly constant for specific cell types during proliferation, suggesting that the creation of mitochondria is largely influenced by extra-mitochondrial signal transduction events. This conclusion is further supported by the observation that mitochondrial biogenesis continues even when mtDNA is deleted. Thus, the replication of mitochondria does not require the presence of mtDNA. The mitochondrial genome is a remarkable, albeit tiny, piece of DNA with relevance for medical and veterinary genetics, evolutionary genetics and the population genetics of all species, especially humans.
Human mitochondrial DNA is a 16,569 bp double-stranded, circular molecule encoding 37 genes: 13 encode polypeptides of the Oxidative Phosphorylation System (OXPHOS), while 22 tRNAs and two rRNAs are required for translation by mitoribosomes within the matrix [26,27]. The mtDNA has four main regions: the D-loop, the rRNA genes, the tRNA genes and the protein-coding genes. The regions that code for rRNA, tRNA and protein are together called the "coding region". The Displacement Loop (D-loop), a 1.2 kb non-coding region, is often referred to as the "hypervariable region" and contains essential transcription and replication elements. mtDNA is present in high copy numbers (10³–10⁴ copies per cell) in virtually all cells, and the vast majority of copies are identical at birth. Animal mtDNAs are extremely compact, being less than 20 kilobases in length and encoding fewer than 40 genes. In vertebrates, transcription is initiated bi-directionally at two promoters, PH and PL, for the heavy (H) and light (L) strands, respectively, within the D-loop regulatory region [31,32]. In the "strand asymmetric model" of mtDNA replication, the RNA transcript initiated at PL is cleaved in the vicinity of three evolutionarily conserved sequence blocks (CSB I, II and III), and H-strand replication is initiated at the sites of these cleavages. Thus, transcription is coupled to DNA replication.
Sequence changes in animal mitochondrial DNA (mtDNA) are of four principal types: sequence rearrangements, additions, deletions and nucleotide substitutions. In the derivation of phylogenetic relationships, most emphasis has been placed on nucleotide substitutions. In a number of species, overall substitution rates in mtDNA have been estimated to be about 5-10 times greater than in single-copy nuclear DNA, although rates vary between different parts of the mitochondrial genome. The two complementary strands of mtDNA are named the heavy and light strands (H-strand and L-strand, respectively) on the basis of their guanine (G) content. The guanine-rich H-strand encodes 28 of the 37 genes, while the L-strand encodes the remaining genes. A non-coding control region extending from nucleotide position 16,024 to 576 contains three conserved sequence blocks and a Displacement loop (D-loop). Promoters and enhancers for mitochondrial transcription, as well as the origin of replication for the H-strand, also reside in this region. Intra-specific and intra-individual length polymorphisms are most often observed in tandemly repeated structures in the control region [37,38]. The tandemly repeated arrays coincide with the 5' end of the D-loop DNA, where replication is initiated, and the 3' end, where the D-loop DNA is terminated. Moreover, the repeated sequences are usually associated with secondary structures [39-41]. The precise mechanisms causing mtDNA length-variation heteroplasmy are not known; however, several models have been suggested by various authors [42,43]. Because of the properties of its structure and of its mechanisms of DNA replication and damage repair, the mutation frequency of mitochondrial DNA has been reported to be about a hundredfold higher than that of nuclear DNA.
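The strand naming described above can be illustrated with a minimal Python sketch. It assumes that a sequence is supplied as one strand in the 5' to 3' direction and that the other strand is its reverse complement; the function names and the toy sequence are hypothetical and are not taken from any cited study.

```python
# Minimal sketch: compare guanine content of a strand and its complement.
# Function names and the toy sequence are illustrative assumptions.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def guanine_fraction(seq: str) -> float:
    """Fraction of bases in seq that are guanine."""
    seq = seq.upper()
    return seq.count("G") / len(seq)


if __name__ == "__main__":
    toy_strand = "ACCCTGACCA"  # hypothetical, cytosine-rich strand
    other_strand = reverse_complement(toy_strand)
    print(guanine_fraction(toy_strand), guanine_fraction(other_strand))
```

Because every C on one strand pairs with a G on the other, a cytosine-rich strand is necessarily paired with a guanine-rich one, which is the asymmetry that the heavy and light strand names reflect.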
The first mitochondrial genome to be sequenced was the human mitochondrial genome. The smallest mitochondrial genome sequenced is the 5,967 bp mtDNA of the parasite Plasmodium falciparum, and the largest is the massive 366,924 bp mtDNA of the model plant Arabidopsis thaliana. In all, GenBank archives more than 500 mtDNA sequences, with numerous additions each year. Mitochondrial mRNAs lack untranslated leader and trailing sequences, and more than half do not even have a stop codon. Stops are added upon polyadenylation, when a terminal U or UA is converted to UAA. The two ribosomal RNAs are the smallest known, at 1,559 and 954 bases; there is no 5S RNA; and the 22 tRNAs are used to read all codons. The mitochondrial genetic code differs from the standard nuclear code: UGA is read as tryptophan rather than as STOP; AGA and AGG, normally read as arginine, are read as STOPs; AUA is methionine rather than isoleucine; and the ubiquitous AUG start codon is sometimes replaced by AUA or AUU in mitochondrial genes. Subsequent studies of other mtDNAs have shown that the mitochondrial genetic code is not even universal among mitochondria. Yeast mitochondrial genomes, for example, are much larger and have not reassigned the AUA, AGA and AGG codons. Yeast mitochondria have instead reassigned CUN from leucine to threonine. The vast majority of the mitochondrial genome is under selective constraint, because mutations in coding areas are usually deleterious, whereas the D-loop contains no coding sequences, so mutations there are free to accumulate over time.
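The codon reassignments listed above can be made concrete with a short Python sketch. It builds the standard genetic code from its conventional TCAG-ordered string and then overrides only the four codons named in the text for vertebrate mitochondria; the function and variable names and the 12-base test sequence are illustrative assumptions.

```python
# Minimal sketch: standard versus vertebrate mitochondrial translation.
# Codons are written in the DNA alphabet (T in place of U).

BASES = "TCAG"
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard code: the codon b1+b2+b3 maps to position 16*i + 4*j + k.
STANDARD = {
    b1 + b2 + b3: STANDARD_AAS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Reassignments described in the text: UGA -> Trp, AGA/AGG -> STOP, AUA -> Met.
VERTEBRATE_MITO = dict(STANDARD, TGA="W", AGA="*", AGG="*", ATA="M")


def translate(dna: str, table: dict) -> str:
    """Translate complete codons of dna in frame 1; '*' marks a stop."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(table[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    seq = "ATGATATGAAGA"  # hypothetical codons: ATG ATA TGA AGA
    print(translate(seq, STANDARD))         # MI*R
    print(translate(seq, VERTEBRATE_MITO))  # MMW*
```

The same four codons give two different peptides, which is why a mitochondrial coding sequence translated with the standard table appears to contain premature stops.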
One section of mtDNA often carries an additional strand, creating a displacement loop, or D-loop. This non-coding control region is the most rapidly evolving part of the mitochondrial genome [45-47]. The D-loop is the longest non-coding region in mtDNA, at 1,124 bp (spanning nucleotide positions 16,024 to 576); it acts as a promoter for both the heavy and light strands of mtDNA and contains essential transcription and replication elements. mtDNA replication begins in the D-loop and results in the formation of a displacement loop with a newly synthesized heavy (H) strand of about 700 nt, known as 7S DNA. Both strands of the mtDNA are completely transcribed from the promoters in the D-loop. In addition to the promoter sequences, there are two small regions known as hypervariable regions I and II (HV1 at positions 16024–16324 and HV2 at positions 63–322). A D-loop is thus a DNA structure in which the two strands of a double-stranded DNA molecule are separated for a stretch and held apart by a third strand of DNA. The third strand has a base sequence complementary to one of the main strands and pairs with it, displacing the other main strand in that region. Within that region the configuration is therefore a form of triple-stranded DNA. The size of this region varies among animal species, from about 200 to 4,100 base pairs.
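Because the molecule is circular, a region such as the control region runs past the last numbered position and continues at position 1, whereas HV1 and HV2 are ordinary ranges. The Python sketch below shows one way to extract either kind of region, assuming 1-based inclusive coordinates as used in the text; the function name and the toy ten-letter sequence are hypothetical.

```python
# Minimal sketch: extract a region from a circular sequence using
# 1-based, inclusive coordinates. A start greater than the end means
# the region wraps past the last position back to position 1.

def circular_slice(seq: str, start: int, end: int) -> str:
    n = len(seq)
    if not (1 <= start <= n and 1 <= end <= n):
        raise ValueError("coordinates must lie within 1..len(seq)")
    if start <= end:
        return seq[start - 1:end]        # ordinary region, e.g. HV2
    return seq[start - 1:] + seq[:end]   # region that wraps the origin


if __name__ == "__main__":
    toy_genome = "ABCDEFGHIJ"  # stands in for a circular genome
    print(circular_slice(toy_genome, 2, 4))  # BCD
    print(circular_slice(toy_genome, 8, 3))  # HIJABC
```

With a real reference sequence loaded into a string, the same call with the coordinates quoted above would return the corresponding segment; loading the sequence is left out because it depends on the file format at hand.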
The substitution rate in the human D-loop has been estimated to be from 2.8 to 5 times the rate found in the rest of the mitochondrial genome, although there are a number of conserved blocks near the promoter sequences, one of which has been associated with a function.
D-loops occur in a number of particular situations, including in telomeres, in DNA repair and rearrangement, and as a semi-stable structure in circular mitochondrial DNA molecules. Researchers at Caltech discovered in 1971 that the circular mitochondrial DNA from growing cells included a short three-stranded segment, which they called a displacement loop. They found that the third strand was a replicated segment of the heavy strand (H-strand) of the molecule, which it displaced, and that it was hydrogen-bonded to the light strand (L-strand). Since then, it has been shown that the third strand is the initial segment generated by a round of heavy-strand replication that has been arrested shortly after initiation and is often maintained in that state for some period. The D-loop occurs in the main non-coding area of the mitochondrial DNA molecule, a segment called the control region or D-loop region. The D-loop region is the major control site for mtDNA expression, since it contains the leading-strand origin of replication and the major promoters for transcription. D-loop sequences in particular have been used to establish intra-specific and inter-specific relationships, determine maternal contributions, and trace the origin of modern and ancient animals.
The D-loop or control region, although non-coding, contains binding sites for two transcription factors, three Conserved Sequence Blocks (CSBs) associated with the initiation of replication, and the termination-associated sequences of the loop strand, all of which play an important role in the replication of the mitochondrial genome. Replication of mitochondrial DNA can occur in two different ways, both starting in the D-loop region. In one, replication of the heavy strand proceeds through a substantial part (e.g. two-thirds) of the circular molecule before replication of the light strand begins. The more recently reported mode starts at a different origin within the D-loop region and uses coupled-strand replication, with simultaneous synthesis of both strands [52,53]. Certain bases within the D-loop region are conserved, but large parts are exceedingly variable, and the region has proven useful for studying the evolutionary history of vertebrates. The region contains promoters, immediately adjacent to the D-loop structure, for the transcription of RNA from the two strands of mitochondrial DNA; this transcription is coupled with the initiation of DNA replication. The conventional view was that D-loops were non-functional leftovers of incomplete replication. The function of the D-loop is not yet fully understood, but recent research suggests that it participates in the organization of the mitochondrial nucleoid [55,56]. The partial mtDNA D-loop region of all the registered horse and pony breeds of India has been sequenced and submitted to GenBank under the following accession numbers: Manipuri ponies (HE565867 to HE565889); Spiti (HE572592 to HE572594 and HE565598 to HE565610); Zanskari (HE565647 to HE565670); Bhutia (HE565672 to HE565695); Marwari (HE572595 to HE572619); Kathiawari (HE580440 to HE580462); and, in addition, the D-loop sequences of Indian Thoroughbred horses (HE575410 to HE575432). Analysis revealed 70 haplotypes in Indian horses and ponies.
The non-coding region of mitochondrial DNA (mtDNA), the Displacement loop (D-loop), has emerged as a mutational hotspot because, as stated earlier, it can accumulate mutations at a high, neutral rate. Mutation rates in HVI and HVII are particularly high on average, and there is evidence that the rates also differ within these regions. As a result of the high average mutation rates and the absence of coding or regulatory sequences in the hypervariable regions, they have turned out to be an immensely valuable source for investigating intra-specific genetic variation and differentiation. Sequence analysis of these two regions is used not only in forensic analyses but also in medical diagnosis. Fliss et al. (2000) suggest that constitutively hypervariable areas such as the mtDNA D-loop are hot spots for somatic mutations in malignant tumors.
The D-loop contains essential transcription and replication elements, and mutations in this region may serve as a prospective sensor of cellular DNA damage and a marker of cancer development. D-loops also play an important role in telomeres: the T-loop, which is completed by a D-loop splice, protects the end of the chromosome from damage. Unlike all other regions of the mtDNA, the D-loop does not contain any functional genes, and most ancestral markers are found there. A mutation in this region is generally not lethal, so the individual survives to pass it on to future generations. The coding region of the mtDNA, by contrast, is considered essential for survival, so a mutation there is often lethal and is then not passed on. For this reason, over a period of thousands of years, many mutations accumulate in the D-loop but very few are found in the coding region, where only non-lethal mutations are transmitted to future generations.
With the goal of tracing ancestry, scientists usually begin by testing the D-loop because of its abundance of mutations, or "ancestral markers". The displacement loop, which makes up 5.5% of the mitochondrial genome, contains numerous polymorphisms, including insertions/deletions (indels). This region has the highest rate of polymorphisms per kilobase, which is not unexpected, since the D-loop is known to be the most highly mutable region of the mitochondrial genome. The D-loop has been divided into three regions that are highly conserved among 26 species: the Extended Termination-Associated Sequence (ETAS), which is divided into ETAS1 and ETAS2; the central region; and the CSB domain, comprising CSB1, CSB2 and CSB3. The ETAS has been implicated in the termination of heavy (H) strand synthesis, which is important in the termination of replication. The CSB domain contains elements important in the replication and transcription of mtDNA. Substitutions have been detected in both the ETAS and CSB regions.
Role of D-loop in genetic diversity, maternal lineage and phylogeny studies
Understanding the evolution and genetic diversity of various species, and classifying their populations by evolutionary significance, is essential if an appropriate conservation plan is to be conceived and carried out for both wild and captive populations. Mitochondrial (mt) sequences provide rich sources of data for research in evolutionary biology, population genetics and phylogenetics, and have been used in studies of various species, including pigs, Echinococcus, wild and domestic equids [60,62-66], chicken, fish, rhinoceros and goat. Complete mitochondrial DNA (mtDNA) sequences have been determined for more than 100 Chordata species, covering all classes of Chordata [71-75], since the first complete human mtDNA sequence was determined in 1981. Eight complete mtDNA sequences of reptile species have been determined: Alligator mississippiensis, Caiman crocodilus, Iguana iguana, Eumeces egregius, Dinodon semicarinatus, Chelonia mydas, Chrysemys picta and Pelomedusa subrufa. The 16,746-nucleotide (nt) mtDNA sequence of the Chinese alligator, Alligator sinensis, was determined using long-PCR and primer-walking methods and demonstrated that, among three crocodilian species, the Chinese alligator is most closely related to the American alligator. Molecular studies, mainly using mtDNA sequences, have identified 9 distinct genotypes within E. granulosus [77-82]. mtDNA sequences can be used to identify putative wild progenitors, the number of maternal lineages and their geographic origins. To some extent they may also provide important information on the geographic distribution of diversity within livestock species, although the usefulness of mtDNA sequence data will vary between species, depending on the demographic history of migration from the center(s) of domestication.
More particularly, mtDNA information supports the conclusion that there were at least five major centers of livestock domestication: the northern Andean chain (New World camelids), the northeast African region (donkey and likely taurine cattle), the Near East (taurine cattle, sheep, goat, pigs), south Asia (Indus Valley, indicine cattle and chicken) and East Asia (pigs, chicken, horse, buffalo) to which should be added the Hindu-Kush Himalayan region (yak) and North and Central Asia (horse).
Mitochondrial DNA studies in horses have proved useful for characterizing intra-breed and inter-breed relationships [6,63,64,66,83-90]. Mitochondrial DNA sequence polymorphism has been used to examine genetic relationships within breeds [85,91], among breeds [64,86] and between domestic and wild horse populations, and also to address questions of horse domestication [66,92]. Mitochondrial DNA analysis has been widely used to study wild and domestic equids, mainly because of the evolutionary information that can be drawn from sequence data [60,62-66]. mtDNA sequences are the markers of choice for domestication studies, because a mitochondrial DNA lineage can only have become established within a livestock population through the domestication of a wild female or through the incorporation of a female into the domestic stock. Royo et al. (2005) analyzed a 296 bp mtDNA fragment from the HVI region of 171 horses representing 11 native Iberian, Barb and Exmoor breeds to assess the maternal phylogeography of Iberian horses; their findings support a close genetic relationship between the ancestral mare populations of the Iberian Peninsula and Northern Africa. Phenotypic differences between the Northern and Southern Iberian groups of breeds are not explained by population subdivision based on maternal lineages. Northern Iberian ponies, which are phenotypically close to British ponies, especially the Exmoor, are the result of introgression rather than population replacement. The horse mtDNA sequence, which is 16,660 bp long, was determined by Xu and Árnason. They further demonstrated that the length of the D-loop varies because of variable numbers of an eight base pair (bp) repeat in the large conserved central sequence block of the control region. The number of repeats ranged from 2 to 29 copies, although the majority fell in the range of 22 to 27.
D-loop regions of equine mitochondrial DNA were cloned from three Thoroughbred horses by Polymerase Chain Reaction (PCR). The D-loop region was 1114 bp, 1115 bp and 1146 bp long in the three horses. The equine D-loop region is A/T rich, like many other mammalian D-loops. The large central conserved sequence block and the small conserved sequence blocks 1, 2 and 3, which are common to other mammals, were observed. However, the researchers found tandem repeats of an 8 bp equine-specific sequence, TGTGCACC, between conserved sequence blocks 1 and 2, and the number of tandem repeats differed among individual horses. The base composition of the repeat unit is G/C rich, as are the short repeats in the D-loops of rabbits and pigs. When the DNA sequences of the horse and other mammals are compared, the difference in D-loop length is mostly due to differences in the amount of sequence at both extremities, whereas the sequences are similar in the middle part of the D-loop. A comparison of the sequences of the three Thoroughbred horses showed that the region between tRNA (Pro) and the large central conserved sequence block was the richest in variation. PCR primers were designed in the D-loop region, and the expected maternal inheritance was confirmed by PCR-RFLP (restriction fragment length polymorphism).
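Counting copies of the 8 bp repeat described above is a simple string operation once a D-loop sequence is available. The Python sketch below reports the longest uninterrupted run of the unit TGTGCACC; the function name and the toy sequence are hypothetical, and real data may contain imperfect repeat copies that this exact-match approach would not count.

```python
# Minimal sketch: count copies in the longest uninterrupted tandem run
# of a repeat unit. Exact matching only; imperfect copies are ignored.
import re


def longest_tandem_run(seq: str, unit: str = "TGTGCACC") -> int:
    """Number of back-to-back copies of unit in the longest run in seq."""
    pattern = f"(?:{re.escape(unit)})+"
    runs = re.findall(pattern, seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


if __name__ == "__main__":
    # Hypothetical fragment: three tandem copies, a spacer, one more copy.
    toy = "AAAC" + "TGTGCACC" * 3 + "GG" + "TGTGCACC" + "TT"
    print(longest_tandem_run(toy))  # 3
```

Applied to sequences from several individuals, differences in the returned count would correspond to the individual-to-individual length variation reported for the equine D-loop.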
Ishida et al. (1995) were the first to use direct sequencing of the most variable region of the D-loop to estimate phylogenetic relationships within the genus Equus. They estimated the evolutionary rate of the studied region to be between 2 × 10⁻⁸ and 4 × 10⁻⁸ per site per year and provided new information, particularly concerning the evolution of domestic and Przewalski's horses. They concluded that the lineage of Przewalski's wild horse is not located at the deepest branch among the E. caballus sequences in the neighbor-joining trees they constructed. The observed topology of the trees is clearly inconsistent with an origin of the domestic horse from Przewalski's horse; Przewalski's horse was nevertheless shown to fall within the genetic variation of domestic horses, suggesting that the change in chromosome number occurred rather recently. Oakenfull and Ryder (1998) investigated variation in the mitochondrial control region and 12S rRNA in all four extant mitochondrial lineages of Przewalski's horse, none of which is descended from domestic/Przewalski's hybrids or domestic horse founders. Only two different sequences were found, one of which corresponds to that of Ishida et al. (1995). Variation was thus very low, even though the individuals apparently originated from three distinct geographical regions. The two sequences differed from each other, but both were clearly more similar to the published sequences of three Thoroughbreds and a Mongolian horse than to those of other equids.
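A per-site, per-year rate such as the one quoted above is typically used to convert sequence divergence into a rough time estimate. The Python sketch below assumes a strict molecular clock, an uncorrected proportion of differing sites (p-distance), and that the rate applies to each lineage separately, so that two lineages diverge at twice the rate; none of these assumptions is taken from the cited study, and the function names and the distance value are hypothetical.

```python
# Minimal sketch: p-distance between two aligned sequences and a rough
# divergence time under a strict clock, t = d / (2 * rate).
# Assumes the rate is per lineage; if a study reports a pairwise
# divergence rate instead, the factor of 2 must be dropped.

def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, skipping gaps and ambiguous bases."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def divergence_time_years(distance: float, rate_per_site_per_year: float) -> float:
    """Years since two lineages split, given their pairwise distance."""
    return distance / (2 * rate_per_site_per_year)


if __name__ == "__main__":
    d = 0.04  # hypothetical pairwise distance, for illustration only
    for rate in (2e-8, 4e-8):
        print(rate, divergence_time_years(d, rate))
```

For the hypothetical distance of 0.04, the two ends of the quoted rate range give roughly one million and half a million years, which shows how strongly such dates depend on the assumed rate. At larger distances, multiple substitutions at the same site make the uncorrected p-distance an underestimate, so a substitution model would be needed.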
Kim et al. (1999) sequenced the D-loop region to test the hypothesis that the horses inhabiting the island of Cheju in Korea are descendants of Mongolian horses. Because the sequences varied considerably within horse breeds, and because Cheju horses clustered with Mongolian horses as well as with horses from other, distant breeds, the authors proposed that the horses on Cheju island are of mixed maternal origin, and that horses may have been present on the island and traded before the Mongolian introduction. In a follow-up study, Yang et al. found 17 distinct haplotypes in almost all Cheju horses currently inhabiting the island. Another phylogenetic analysis was performed by Mirol et al. (2002), who investigated the relationship between Argentinean Creole and Spanish horses by direct sequencing and SSCP analysis.
Mitochondrial D-loop sequence variation among maternal lineages of Lipizzan, Arabian and Thoroughbred horses was determined by Kavar et al. (1999, 2002), Bowling et al. (2000) and Hill et al. (2002), respectively. Sixteen maternal lines of the Lipizzan horse were grouped into 13 distinct mitochondrial haplotypes with stable inheritance, and no sequence variation potentially attributable to mutation within maternal lines was observed. Historical data on the multiple origins of the Lipizzan breed were supported by the phylogenetic analysis, which produced a dendrogram with three separate branches. Sequencing of 212 Lipizzans revealed 37 haplotypes. A comparison of these sequences with 136 sequences of domestic and wild horses from GenBank showed that Lipizzan haplotypes cluster in the majority of the haplotype subgroups present in other domestic horses. These findings agree with historical data, according to which numerous Lipizzan maternal lines originating from founder mares of different breeds were established during the breed's history. The authors proposed that domestic horses could therefore have arisen either from a single large population or from several populations, assuming that strong migrations occurred during the early phase of domestication. The advantages of mtDNA over microsatellites are that the mitochondrial genome is exclusively maternally inherited, is haploid and does not undergo recombination, while the methods for assessing genetic diversity are similar to those used for microsatellites. Individuals from one matriline (dam line) are thus expected to share a single mtDNA haplotype. Moreover, the control region of mtDNA (D-loop) provides a highly informative tool for studying matrilineal relationships within breeds, detecting their differentiation and tracing them back to different founder mares.
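Grouping maternal lines into haplotypes, as in the studies above, amounts to collapsing identical aligned sequences and counting them. The Python sketch below does this and also computes haplotype diversity using Nei's estimator, h = n/(n-1) × (1 - Σ pᵢ²); applying this particular estimator is an assumption made here for illustration and is not stated to be the method of the cited authors. The function names and the four toy sequences are hypothetical.

```python
# Minimal sketch: collapse aligned sequences into haplotypes and
# compute haplotype diversity, h = n/(n-1) * (1 - sum(p_i^2)).
from collections import Counter


def haplotype_counts(sequences):
    """Map each distinct sequence (haplotype) to its number of carriers."""
    return Counter(seq.upper() for seq in sequences)


def haplotype_diversity(sequences):
    """Probability that two sampled individuals carry different haplotypes."""
    n = len(sequences)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = haplotype_counts(sequences)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


if __name__ == "__main__":
    # Hypothetical aligned fragments from four individuals.
    sample = ["ACGT", "ACGT", "ACGA", "TCGT"]
    print(len(haplotype_counts(sample)))  # 3 haplotypes
    print(haplotype_diversity(sample))
```

In a pedigree check, all individuals assigned to one dam line would be expected to fall into a single key of the counter; more than one key within a line would flag either a pedigree error or a new mutation.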
The drawbacks of mt-DNA analyses are that they cannot detect male-mediated gene flow or capture overall genomic diversity, because mt-DNA behaves as a single extra-nuclear haplotype.
The Arabian horses in the USA were traced in the maternal line to 34 mares showing 27 haplotypes. Bowling et al. (2000) observed single-base differences within two lines, which were interpreted as representing alternative fixations of past heteroplasmy, calling into question the traditional assumption that Arabian horses of the same strain necessarily share a common maternal ancestry. Seventeen haplotypes were found in 19 of the most common matrilineal families of the Thoroughbred horse. Using both SSCP analysis and direct sequence analysis, Hill et al. (2002) compared Thoroughbred families with 13 other diverse horse populations, revealing no significant differences in variation and suggesting a non-random partitioning of diversity among geographically diverse horse populations. Another analysis of maternal line variation was performed by Luis et al. (2002) in the Sorraia horse breed. Tracing back the maternal lineages revealed that only two different lines, and therefore only two haplotypes, have survived, and only one of these is present in the German population. The reduced number of surviving maternal lineages emphasizes the importance of establishing a conservation plan for this endangered breed.
Jansen et al. (2002) sequenced the mitochondrial D-loop of 318 horses from 25 oriental and European breeds, including American mustangs. A phylogenetic network was constructed, which showed that most of the 93 different mt-DNA types grouped into 17 distinct phylogenetic clusters. Several of the clusters correspond to breeds and/or geographic areas, notably cluster A2, which is specific to Przewalski's horses, cluster C1, which is distinctive for northern European ponies, and cluster D1, which is well represented in Iberian and northwest African breeds. Royo et al. (2005) analyzed a 296 bp mt-DNA fragment from the HVI region of 171 horses representing 11 native Iberian, Barb, and Exmoor breeds to assess the maternal phylogeography of Iberian horses. Their findings support a close genetic relationship between the ancestral mare populations of the Iberian Peninsula and Northern Africa. Phenotypic differences between the Northern and Southern Iberian groups of breeds are not explained by population subdivision based on maternal lineages. Northern Iberian ponies, which are phenotypically close to British ponies, especially Exmoor, are the result of introgression rather than population replacement. Cothran et al. investigated genetic variation in Zemaitukai horses using mt-DNA sequencing. The study was performed on 421 bp of the mt-DNA control region, and five distinct haplotypes were obtained for the five Zemaitukai maternal families, supporting the pedigree data. Lopes et al. (2005) examined the Lusitano horse maternal lineage based on mitochondrial D-loop sequence variation. Ivankovic et al. studied mitochondrial D-loop sequence variation among autochthonous horse breeds in Croatia.
Genetic variation in three Croatian cold-blood horse populations was analysed by sequencing the proximal part (nt 15,498–15,821) of the mt-DNA D-loop region, which revealed 26 polymorphic sites representing thirty haplotypes. These were clustered into eight haplogroups, indicating the presence of many ancient maternal lineages with high mt-DNA diversity. Glazewska et al. analyzed a 458 bp mt-DNA D-loop fragment from representatives of 15 Polish Arabian dam lines for phylogenetic analysis and reported 14 distinct haplotypes. Genetic information based on mt-DNA typing is of great importance for future breed conservation strategy, especially for critically endangered breeds such as the Murinsulaner horse.
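The summary statistics reported in such studies (polymorphic sites, number of haplotypes) can be computed from an alignment in a few lines. The sketch below is illustrative only: it assumes gap-free aligned sequences of equal length, uses invented toy data, and adds the standard unbiased haplotype diversity estimator, h = n/(n-1) × (1 - Σ p_i²), which the studies above may or may not have used.

```python
from collections import Counter


def polymorphic_sites(alignment):
    """Return 0-based column indices at which at least two different bases occur."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i in range(length) if len({seq[i] for seq in alignment}) > 1]


def haplotype_counts(alignment):
    """Count how many individuals carry each distinct sequence (haplotype)."""
    return Counter(alignment)


def haplotype_diversity(alignment):
    """Unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(alignment)
    if n < 2:
        raise ValueError("at least two sequences are required")
    freqs = [count / n for count in haplotype_counts(alignment).values()]
    return n / (n - 1) * (1.0 - sum(f * f for f in freqs))


# Toy alignment, invented for illustration only.
toy_alignment = ["ACGTAC", "ACGTAC", "ACTTAC", "ACTTAA"]

print(polymorphic_sites(toy_alignment))      # variable columns
print(len(haplotype_counts(toy_alignment)))  # number of distinct haplotypes
print(haplotype_diversity(toy_alignment))
```

Real D-loop alignments usually contain gaps and ambiguous bases, which this sketch treats as ordinary characters; a production analysis would need an explicit rule for them.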
The nucleotide sequence of the complete mitochondrial genome of the donkey, Equus asinus, has also been determined. The molecule is 16,670 bp long. The mt-DNA differences between the donkey and the horse suggest that the evolutionary separation of the two species occurred approximately 9 million years ago. Lei et al. analyzed 399 bp mt-DNA D-loop sequences from 26 individuals of 5 donkey breeds in China and found 23 polymorphic nucleotide sites, demonstrating abundant mitochondrial genetic diversity in Chinese donkeys. They further constructed a molecular phylogenetic tree by the Neighbor-Joining method from the mt-DNA D-loop sequences of the 5 Chinese donkey breeds, 6 sequences of Asian wild ass (Equus asinus kiang, Equus asinus kulan, Equus asinus hemionus) and 4 sequences of European domestic donkeys from GenBank. This was the first molecular-level report that Chinese donkey breeds originated from the African wild ass (Equus africanus africanus and Equus africanus somaliensis), not from the Asian wild ass. Similarly, Lu et al. analyzed 367 mt-DNA D-loop sequences of 399 bp (241 of which were collected from the literature) from 13 Chinese domestic donkey breeds and found 96 different haplotypes with 57 polymorphic sites; they also suggested that the maternal ancestors of Chinese domestic donkeys are very likely the Somali and Nubian lineages of the African wild ass rather than the Asian wild ass.
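Distance-based tree methods such as Neighbor-Joining take a matrix of pairwise distances between sequences as input. As a rough sketch of that first step, the code below computes the simplest such measure, the proportion of differing sites (p-distance), for every pair of sequences. The cited studies do not state here which distance measure they used, so the choice of p-distance, the function names and the toy sequences are all assumptions made for illustration.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)


def distance_matrix(sequences):
    """Return {(name_a, name_b): p-distance} for every ordered pair of names."""
    names = list(sequences)
    return {
        (a, b): p_distance(sequences[a], sequences[b])
        for a in names
        for b in names
    }


# Toy sequences with hypothetical labels, invented for illustration only.
toy_sequences = {
    "sample_1": "ACGTACGTAC",
    "sample_2": "ACGTACGTAT",
    "sample_3": "ACTTAGGTCT",
}

matrix = distance_matrix(toy_sequences)
for (a, b), d in matrix.items():
    if a < b:  # print each unordered pair once
        print(a, b, d)
```

A tree-building routine would then repeatedly join the pair of taxa that minimises the Neighbor-Joining criterion computed from this matrix; that step is not shown.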
The mt-DNA D-loop is a widely employed tool, but some problems associated with it need to be taken into consideration, because most studies have focused on polymorphisms in this small section, which comprises around 7% of the mitochondrial genome. The attraction of the D-loop lies in its high mutation rate, which means that researchers can analyse a relatively short sequence and still resolve differences between closely related sequences, and from these infer the relatedness of the animals as a whole. Regrettably, it is becoming increasingly clear that this very high mutation rate can obscure the phylogenetic signal. Three main problems with D-loop data have been acknowledged: back mutation, in which sites that have already undergone substitution return to their original state; parallel substitution, in which mutations occur at the same site in independent lineages; and rate heterogeneity, in which some sites mutate at a much higher rate than others in the same region, so that the data show evidence of mutational 'hot spots'. With the availability of complete genome sequences, however, a clearer picture of the mt-genome and the D-loop region can be obtained. Although the mitochondrial genome was one of the first genomes to be sequenced in its entirety, only in the recent past has technology progressed far enough for sequences of that length to be obtained with relative ease and for a study of appreciable size using whole genomes to be undertaken. That study became a significant milestone in the field of population genetics and may set a precedent for a new field, already termed "population genomics and phenomics."
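The effect of back and parallel mutation can be made concrete with a simple substitution model. The sketch below uses the Jukes-Cantor (1969) model, which is not mentioned in the source and is chosen here only because it is the simplest case: it assumes equal base frequencies and a single rate for all sites, so it illustrates multiple hits at one site but not rate heterogeneity or hot spots. Under this model, a true distance of d substitutions per site yields an expected observed proportion of differences p = 3/4 (1 - e^(-4d/3)), and the corresponding correction is d = -3/4 ln(1 - 4p/3).

```python
import math


def expected_p(d):
    """Expected observed proportion of differing sites for true distance d (JC69)."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def jc69_distance(p):
    """Jukes-Cantor corrected distance for an observed proportion of differences p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# As the true distance grows, the observed proportion lags further behind it
# and approaches the ceiling of 0.75 (saturation).
for d in (0.05, 0.5, 2.0):
    p = expected_p(d)
    print(f"true d = {d:.2f}  observed p = {p:.3f}  corrected = {jc69_distance(p):.2f}")
```

The observed proportion of differences is always smaller than the true number of substitutions per site and cannot exceed 0.75, which is why a fast-evolving region loses resolving power over longer time scales.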
Equine owners have a special attachment to their animals, along with a keen interest in their characteristics, pedigree, maternal lineage and evolution. They strive to maintain the best horse breeds and look to science to answer their questions. Mitochondrial DNA (mt-DNA) exists in a nested hierarchy of populations: there are numerous mt-DNAs within each mitochondrion, multiple mitochondria in each cell, a number of oocytes within each reproductive female, many females in each population, and so on. The mutational properties of mt-DNA make it extremely useful for genetic diversity studies and a pivotal tool for geneticists. For many years, molecular biologists have been comparing the mt-DNA D-loop of animals of diverse origins to build evolutionary trees and to address fundamental questions about the origin of animal populations, such as "how did a breed originate and develop over time?" or "where do we come from?". Mutations occur in mt-DNA at a fairly regular rate and are passed on to subsequent generations. It is these differences (polymorphisms) that, at the genotypic level, make a particular animal unique. Analysis of these differences shows how closely individual animals are related. Over the past decade, mt-DNA analysis has become established as a powerful tool for evolutionary studies of animals. These studies have used mt-DNA analysis to provide insights into population structure and gene flow, hybridization, phylogeography and phylogenetic relationships. A powerful synergism exists between the two fields of research: evolutionary studies provide comparative data on mt-DNA organization and function, while molecular investigations can and should improve the level of sophistication of evolutionary studies that use mt-DNA.
In the present review, we have tried to focus on the aspects of animal mt-DNA, especially that of equines, that are most relevant to its use in evolutionary and biodiversity studies.