RK Garg*, Nidhi Dubey, N Batav, Pooja Pandey and RK Singh
Centre of Excellence in Biotechnology, Council of Science and Technology, Madhya Pradesh, India
Received Date: July 27, 2017; Accepted Date: September 09, 2017; Published Date: September 14, 2017
Citation: Dubey N, Batav N, Pandey P, Garg RK (2017) Mitochondrial COI Gene Sequence Analyses of Puntius ticto Compared with Seven Species of Genus Puntius of Family Cyprinidae: A Finding for Phylogenetic Positioning and DNA Barcoding as Model Study for Cryptic Species Identification. J Proteomics Bioinform 10:214-221. doi: 10.4172/jpb.1000445
Copyright: © 2017 Dubey N, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Over the last three decades, mitochondrial DNA (mtDNA) has become the most popular marker of molecular diversity, owing to a combination of technical ease of use and its supposed biological and evolutionary properties. The present study examined the partial mitochondrial cytochrome c oxidase subunit I (mtcox1) gene sequence for the phylogenetic positioning of Puntius ticto among eight species of the genus Puntius and assessed its suitability for determining genetic differentiation within the genus. Five samples of P. ticto were collected from Halali reservoir; the partial mtcox1 region was sequenced and compared with sequences available in the online database of NCBI (National Center for Biotechnology Information, USA) for the remaining seven species (P. sophore, P. sarana, P. amphibius, P. chola, P. conchonius, P. dorsalis and P. gelius). Sequencing of 668 bp of the mtcox1 gene revealed 18 haplotypes (h), with haplotype (gene) diversity (Hd) of 0.981 ± 0.023 and nucleotide diversity (Pi) of 0.658. A total of 504 variable and parsimony sites were recorded; conserved positions were observed at equal frequency in the nucleotide sequence, and the variable regions were mostly located between nucleotides 27 and 76. The results indicate that the partial mtcox1 gene is polymorphic and can be a potential marker for determining the phylogenetic position of Puntius ticto relative to the other seven species. The present investigation may serve as a model study for wildlife scientists working to curb the smuggling of protected species under false pretenses, and it illustrates the importance of DNA barcoding in stopping such illegal trade.
Puntius ticto; Cytochrome C Oxidase subunit I gene; Gene diversity (Hd); Haplotypes (H); Nucleotide diversity (Pi); Conserved and variable gene fractions; DNA barcoding
Phylogeny and taxonomy play a crucial role in the measurement of biodiversity for conservation and environmental management. Complementing taxonomic descriptions with knowledge from molecular tools such as mitochondrial DNA sequences [1-3] is now considered relevant for understanding phylogenetic relationships, the precise systematics of taxa and the description of biodiversity [4-6]. Assessing biodiversity depends not only on the organization of organisms into taxonomic units, but also on understanding the phylogenetic relatedness of these taxonomic units. Mitochondrial DNA marker analysis has been increasingly used in recent years in population and phylogenetic surveys of organisms.
Studies of vertebrate species have generally shown that sequence divergence accumulates more rapidly in mitochondrial than in nuclear DNA. This has been attributed to a faster mutation rate in mtDNA, which may result from a lack of repair mechanisms during replication, and to a smaller effective population size due to the strict maternal inheritance of the haploid mitochondrial genome. Because of its rapid rate of evolution, mtDNA analysis has proven useful in clarifying relationships among closely related species. Because of its non-Mendelian mode of inheritance, the mtDNA molecule is considered a single locus. In addition, because mtDNA is maternally inherited, the phylogenies and population structures derived from mtDNA data may not reflect the complete picture of the nuclear genome if gender-biased migration, selection or introgression exists. The analysis of animal mtDNA polymorphisms is the most commonly used means of revealing phylogenetic relationships among closely related species and among populations of the same species [12-16].
The present investigation addresses the phylogenetic positioning of the important fish Puntius ticto and the possible existence of divergent species within the genus Puntius, using molecular characterization of the mtcox1 gene as primary data. These primary molecular data were compared, on the basis of DNA sequence analyses, with the online database of NCBI, USA, for phylogenetic positioning. Subsequently, P. ticto was barcoded using the Barcode of Life Data Systems, USA, so that a sound taxonomic identification could be carried out in a scientific manner with respect to biodiversity conservation priorities. The present investigation may also serve as a model study for wildlife scientists working to curb the smuggling of protected species under false pretenses, and it illustrates the importance of DNA barcoding in stopping such illegal trade.
Puntius ticto were caught from Halali reservoir, Bhopal. Incisions (not more than 5-6 mm deep) were made and the skin flap was removed (Figure 1). Small pieces of white muscle were cut using a surgical blade or small fine scissors. The muscle samples were placed on aluminum foil labelled with the fish number; the foil was folded, held temporarily on ice and finally stored at -80ºC in the laboratory until further analysis. In the present investigation, molecular research methodologies were adopted to delineate gene flow and hereditary traits among the P. ticto populations.
Extraction of genomic DNA from tissue samples
Total genomic DNA was extracted by the phenol:chloroform:isoamyl alcohol (25:24:1) method with some modifications [17-19]. One hundred mg of tissue was placed in a pre-chilled 1.5 ml Eppendorf tube and ground within the tube using a micro pestle. During grinding, 0.5 ml of digestion buffer (100 mM Tris-HCl, pH 8.0; 10 mM EDTA, pH 8.0; 1.4 M NaCl; 1% SDS; and 0.2% β-mercaptoethanol) was added, and the remaining 0.5 ml was added after grinding. Samples were incubated at 50ºC for 30-60 min on a dry bath with occasional shaking and then centrifuged at 5,000 rpm for 10 min at room temperature. The supernatant was collected in a fresh Eppendorf tube and an equal volume of phenol:chloroform:isoamyl alcohol (25:24:1) was added. The samples were centrifuged again at 10,000 rpm for 10 min at 4ºC, and the top aqueous layer was transferred to a new tube. Half a volume of 7.5 M ammonium acetate and 2 volumes of chilled 100% ethanol were added. The tubes were kept in a deep freezer for 1-2 h and centrifuged at 10,000 rpm for 10 min at 4ºC. For washing, 1 ml of 70% ethanol was added and the tubes were centrifuged for 10 min at 10,000 rpm at 4ºC. The supernatant was discarded and the pellet was dried for 1-2 h at room temperature. Finally, 50 μl of Tris-EDTA buffer (10 mM Tris-HCl, 1 mM EDTA, pH 7.6) was added and the tubes were left for 2 h to dissolve the pellet.
Quantification of extracted genomic DNA and integrity checking
The yield of DNA extracted from fish tissues, in ng/μl, was measured using a UV spectrophotometer (ND-1000) at wavelengths of 260 nm and 280 nm. The purity of the DNA was determined by calculating the ratio of absorbance at 260 nm to that at 280 nm. This ratio is commonly used to assess the purity of DNA with respect to protein contamination, since proteins (in particular, the aromatic amino acids) tend to absorb at 280 nm. A DNA sample is considered pure when the 260/280 ratio is near 1.8, but samples with ratios of 1.5 to 2.0 can readily be used for PCR. After the quality and quantity of the DNA had been checked, dilutions to 50 ng/μl were made as required for PCR amplification, or samples were treated with proteinase K or RNase to obtain pure DNA.
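The purity rule described above can be written as a small check. This is a minimal sketch, assuming the 1.5 to 2.0 acceptance window stated in the text and the interpretation given in the results (low ratios suggest protein, high ratios suggest RNA); the function name and the verdict labels are illustrative and not part of the original protocol.

```python
def assess_purity(a260: float, a280: float) -> tuple[float, str]:
    """Return the A260/A280 ratio and a coarse verdict.

    Thresholds follow the text: 1.5-2.0 is treated as usable for PCR.
    The suggested treatments are an assumption based on the stated
    interpretation of low (protein) and high (RNA) ratios.
    """
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    ratio = a260 / a280
    if ratio < 1.5:
        verdict = "treat with proteinase K"   # likely protein carry-over
    elif ratio > 2.0:
        verdict = "treat with RNase"          # likely RNA carry-over
    else:
        verdict = "usable for PCR"
    return round(ratio, 2), verdict


# Example readings (hypothetical absorbance values)
for a260, a280 in [(0.90, 0.50), (1.00, 0.80), (2.20, 1.00)]:
    print(a260, a280, assess_purity(a260, a280))
```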
PCR programming for mtDNA mtcox1 gene amplification
Total genomic DNA was extracted from fish tissue (muscle) using a modified phenol-chloroform method. Amplification was carried out in a 25 μl reaction mixture comprising 9 μl distilled water, 15 μl 2x PCR master mix, 3 μl template DNA, 3 μl forward primer and 3 μl reverse primer. Amplification followed a previously described program consisting of an initial denaturation at 94°C for 5 minutes, followed by 30 cycles of denaturation at 94°C for 30 seconds, annealing at 55°C for 60 seconds and extension at 72ºC for 90 seconds, and a final extension at 72ºC for 10 minutes. Mitochondrial DNA exhibits several properties that make it a useful tool in the study of phylogenetics, molecular evolution and even conservation genetics: a relatively simple genetic structure, a maternal mode of inheritance (in most situations) and a high rate of evolution and polymorphism. The mtcox1 gene was amplified with the universal primers FishF1 and FishR1, as shown in Table 1.
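The thermal profile described above can be written down as data, which makes it easy to check the programmed hold time. This is a minimal sketch; the class and variable names are illustrative, and the total excludes ramp times between temperatures, which depend on the thermocycler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


# Values taken from the cycling conditions stated in the text
INITIAL_DENATURATION = Step("initial denaturation", 94, 5 * 60)
CYCLE = (
    Step("denaturation", 94, 30),
    Step("annealing", 55, 60),
    Step("extension", 72, 90),
)
FINAL_EXTENSION = Step("final extension", 72, 10 * 60)
N_CYCLES = 30


def total_hold_seconds() -> int:
    """Sum of programmed hold times, ignoring temperature ramping."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL_DENATURATION.seconds + N_CYCLES * per_cycle + FINAL_EXTENSION.seconds


print(f"Programmed hold time: {total_hold_seconds() / 60:.0f} min")
```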
Table 1: Universal primers and their conditions used in the present investigation.
DNA sequencing and analysis
The DNA fragment was excised from the gel with a sharp scalpel and cleaned. The gel slice containing the desired fragment was weighed and transferred to a 1.5 ml microfuge tube. Binding buffer (400 μl) was added to 100 mg of gel slice in the tube (@ 0.40 μl/1 mg gel slice), and the tube was incubated at 50-60ºC for 10 minutes with occasional shaking until the agarose was completely dissolved. A higher proportion of binding buffer (@ 0.70 μl/1 mg gel slice) may be used for high-concentration gels (1.5-2.0%). The mixture was loaded onto a column (MX-10) and left to stand for 2 minutes. It was then centrifuged at 10,000 rpm for 2 minutes and the flow-through was discarded. Wash solution (500 μl) was added, the column was centrifuged at 10,000 rpm for 1 minute and the flow-through was discarded; this step was repeated to remove any residual wash buffer. The column was placed in a clean 1.5 ml microfuge tube, 30-50 μl of elution buffer was added to the centre of the column, and the column was incubated at room temperature for 2 minutes. The samples were then centrifuged at 10,000 rpm for 2 minutes to elute the DNA.
Sequences were subjected to BLAST on the National Center for Biotechnology Information (NCBI) website (www.ncbi.nlm.nih.gov/blast). All mtDNA (mtcox1 gene) sequences of P. ticto are planned for submission to GenBank. The amplified mtcox1 gene was sequenced in both directions to check the validity of the sequence data. All DNA sequences were aligned using CLUSTAL-W, and sequence composition was estimated using MEGA ver. 5.0 software. Further molecular parameters of genetic diversity, such as genetic differentiation values, nucleotide diversity and haplotype diversity, were calculated with MEGA ver. 5.0 and DnaSP ver. 5.0 software.
Modern population genetic approaches that incorporate genotypic analysis (those utilizing combined information from genotypes across multiple genetic loci) are revolutionizing the understanding of population structure and history. The development of the highly variable cox1 mitochondrial gene marker and of statistical methodologies for interpreting genetic data has provided the opportunity to gain an intricate understanding of population characteristics such as dispersal and genetic structure, which are important for the successful management of threatened species both in the wild and in captivity [27-29].
In the present investigation, we performed molecular studies on the “genetic variability of P. ticto obtained from Halali reservoir with special emphasis on gene flow and mitochondrial DNA cytochrome oxidase subunit I (mtDNA cox1) sequence variations” to assess the threats affecting the gene, its diversity and conservation aspects. We examined mtcox1 gene sequence variations and analyzed the phylogenetic position of the species with reference to genetic variability and the identification of conserved regions for DNA barcoding. The primary mtcox1 data were then compared with database sequences to determine the systematic position of Puntius ticto among Puntius species.
(a) DNA yield and quality for PCR amplification
The purity of the extracted genomic DNA samples was calculated from the O.D. 260/O.D. 280 ratio as described by Sambrook and Russell (2001). A 260/280 absorbance ratio of 1.8 indicates pure DNA, a ratio above 1.8 indicates RNA contamination, and a ratio below 1.8 indicates protein contamination. Polymerase chain reaction (PCR) amplification requires a DNA concentration close to 50 ng/μl; therefore, all DNA samples were diluted to 40 to 50 ng/μl using sterile Milli-Q double-distilled water for further molecular work. The dilutions are shown in Table 2. In the present investigation, the concentration of extracted DNA ranged from 70.5 ng/μl to 286.2 ng/μl, with 260/280 ratios of 1.78 to 1.99. According to Sambrook and Russell (2001), the concentration of extracted DNA should be about 50 ng/μl.
|S. No.||Quantification of DNA extracted (ng/µl)||260/280||DNA template||Dilution with water||Final quantification of DNA (ng/µl)|
Table 2: Quantification of extracted genomic DNA from n=5 of P. ticto of Halali reservoir, Bhopal.
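The dilution step follows the usual relation C1·V1 = C2·V2. The sketch below applies it to the lowest and highest stock concentrations reported in the text; the 100 µl final volume is a hypothetical choice for illustration, and the function name is not from the original protocol.

```python
def dilution_volumes(stock_ng_per_ul: float,
                     target_ng_per_ul: float,
                     final_volume_ul: float) -> tuple[float, float]:
    """Return (stock DNA volume, water volume) in µl using C1*V1 = C2*V2."""
    if stock_ng_per_ul <= 0 or target_ng_per_ul <= 0 or final_volume_ul <= 0:
        raise ValueError("concentrations and volume must be positive")
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already more dilute than the target")
    stock_volume = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    water_volume = final_volume_ul - stock_volume
    return stock_volume, water_volume


# Reported range of stock concentrations; final volume of 100 µl is assumed
for stock in (70.5, 286.2):
    dna_ul, water_ul = dilution_volumes(stock, 50.0, 100.0)
    print(f"{stock:6.1f} ng/µl stock: {dna_ul:5.1f} µl DNA + {water_ul:5.1f} µl water")
```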
The general goals of population genetic studies are to characterize the extent of genetic variation within species and to account for this variation. The amount of genetic variation within and between populations can be determined from the frequency of genes and the forces that affect their frequencies, such as migration, mutation, selection and genetic drift. During the last two decades, a large amount of genotype and allele frequency data has been obtained from a large number of species, including many fish species, primarily by means of protein- and DNA-based molecular genetic techniques. These studies have shown that most species are subdivided into more or less distinct units that differ genetically from each other. At this point, intraspecific groups of fish have to be described carefully to prevent confusion among terms such as race, tribe, population, subpopulation, stock and subspecies, which are intended to reflect the magnitude of differences among such subdivisions.
Therefore, in view of the above facts, the coding gene COX1 (cytochrome oxidase subunit I) of mitochondrial DNA was chosen as the marker. The COX1 gene is prominent in the estimation of gene flow and the species identification of fishes. Hence, it may also be useful for forensic delineation of other species from a conservation point of view.
The present investigation on the population genetics of Puntius ticto included (1) extraction of genomic DNA, (2) quantification of genomic DNA, (3) PCR amplification using the synthesized universal primers (FishF1: TCAACCAACCACAAAGACATTGGCAC and FishR1: TAGACTTCTGGGTGGCCAAAGAATCA), (4) gel elution, and (5) DNA sequencing and biostatistical analysis. DnaSP software (version 5) was used to analyze the number of polymorphic sites, DNA polymorphism, conserved DNA regions and the haplotype/DNA sequence data file. MEGA software (version 5) was used for phylogenetic analysis, the disparity index test among the 21 genotypes of Puntius ticto, nucleotide composition and maximum likelihood fits of 21 different nucleotide substitution models; domain data were calculated among the 05 genotypes of Puntius ticto collected from Halali reservoir. As a first result, good-quality genomic DNA was extracted, with a yield sufficient to provide the approximately 50-60 μl required for amplification.
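Basic properties of the two primers quoted above can be checked directly from their sequences. This is a small illustrative sketch; the helper names are hypothetical, and only length and GC content are computed because melting-temperature formulas depend on assumptions not stated in the text.

```python
# Primer sequences as quoted in the text (5' to 3')
FISH_F1 = "TCAACCAACCACAAAGACATTGGCAC"
FISH_R1 = "TAGACTTCTGGGTGGCCAAAGAATCA"


def gc_count(seq: str) -> int:
    """Number of G and C bases in the sequence."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


def gc_percent(seq: str) -> float:
    """GC content as a percentage of sequence length."""
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * gc_count(seq) / len(seq)


for name, primer in (("FishF1", FISH_F1), ("FishR1", FISH_R1)):
    print(f"{name}: length {len(primer)} nt, GC {gc_percent(primer):.1f}%")
```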
Visualized bands (Figure 2) were sliced out with sharp scissors. The gel piece containing the DNA fragment was purified using a gel purification kit (Genei, Bangalore), and all DNA samples were then sequenced on a DNA sequencer (ABI Model 3500, USA). All sequences were aligned using BLAST and Clustal-W against the population resource data available on NCBI (aligned report attached). The aligned sequences of the mtcox1 gene region have neither stop codons nor introns. Additionally, the third codon position has a very low frequency of G, as reported for mitochondrial DNA genes. The 05 DNA nucleotide sequences were compared with those of the remaining seven species (retrieved online) of the genus Puntius. Comparison with the Puntius ticto mitochondrion complete genome sequence (sequence ID gb|KF429932.1|) gave a score of 1011 bits (547). These sequences were therefore believed to represent true mitochondrial cox1 gene sequences rather than numts [34,35].
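The two checks mentioned above (absence of in-frame stop codons and a low G frequency at third codon positions) can be sketched as follows. The stop codons listed are those of the vertebrate mitochondrial genetic code; the function names and the short test sequence are hypothetical, and the reading frame of a real COI fragment would have to be determined from the alignment.

```python
from collections import Counter

# Stop codons of the vertebrate mitochondrial genetic code (translation table 2)
VERT_MT_STOPS = {"TAA", "TAG", "AGA", "AGG"}


def codons(seq: str, frame: int = 0) -> list[str]:
    """Split a sequence into complete codons starting at the given frame offset."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def has_internal_stop(seq: str, frame: int = 0) -> bool:
    """True if any complete codon in this frame is a mitochondrial stop codon."""
    return any(codon in VERT_MT_STOPS for codon in codons(seq, frame))


def third_position_freqs(seq: str, frame: int = 0) -> dict[str, float]:
    """Relative frequency of A, C, G and T at third codon positions."""
    thirds = Counter(codon[2] for codon in codons(seq, frame))
    total = sum(thirds.values())
    if total == 0:
        raise ValueError("sequence too short for this frame")
    return {base: thirds.get(base, 0) / total for base in "ACGT"}


toy = "CTTTATCTAGTA"  # hypothetical 12-nt fragment, not real COI data
print(has_internal_stop(toy, frame=0), third_position_freqs(toy, frame=0))
```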
(b) Gene Profiling Estimation through mtcox1 gene
Alignment of the sequences, performed using MEGA ver. 5.0 software with the NCBI, USA database, revealed eighteen different haplotypes defined by 21 genotypes. No size or heteroplasmy polymorphism was observed within or among individuals from the five sampled Puntius ticto populations of M.P. Statistical models for the evolution of molecular sequences play an important role in the study of evolutionary processes. For the evolutionary analysis of protein-coding sequences, three types of evolutionary models are available: (1) nucleotide, (2) amino acid and (3) codon substitution models.
Selecting appropriate models can greatly improve the estimation of phylogenies and the detection of positive selection. By analyzing nucleotide comparisons, nucleotide pair frequencies, conserved and variable inter-specific fish sequences and intra-specific fish population data, we show the superiority of the codon substitution models and discuss the advantages and disadvantages of the models. Parsimony-informative nucleotide regions were also obtained from the mtDNA gene in Puntius ticto. One parsimony-informative nucleotide site, located between GCC nucleotides, was found in all genotypes. The singleton index (SI index) obtained from the cox1 gene sequences of the Puntius ticto genotypes comprised 65 nucleotide sites.
After multiple alignment with MEGA ver. 5.0 software, the P. ticto mtDNA sequences comprised a total of 683 sites, among which no conserved sites and 504 variable sites were recorded (Table 3). Similar work was done by Martins et al. on Leporinus elongatus from the Parana River basin of South America, in which 153 variable sites and 97 conserved sites were found. Comparison of the sequences with one another in MEGA ver. 5.0 showed that the P. ticto population had no conserved sites, as stated above, while the variable sites recorded (Table 3) suggest that gene flow is limited. The conserved positions were observed at equal frequency in the nucleotide sequence, and the variable regions were mostly located between nucleotides 27 and 76.
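The site categories discussed above (conserved, variable, parsimony-informative and singleton) can be counted from an alignment column by column. The following is a minimal sketch under common definitions: a site is parsimony-informative when at least two states each occur in at least two sequences, and a variable site that is not parsimony-informative is counted as a singleton. Columns containing gaps or ambiguous bases are skipped, which mirrors a complete-deletion treatment; the toy alignment is hypothetical.

```python
from collections import Counter


def classify_sites(alignment: list[str]) -> dict[str, int]:
    """Count site categories in an alignment of equal-length sequences."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all sequences must have the same aligned length")
    counts = {"gapped": 0, "conserved": 0, "variable": 0,
              "parsimony_informative": 0, "singleton": 0}
    for column in zip(*alignment):
        bases = [base.upper() for base in column]
        if any(base not in "ACGT" for base in bases):
            counts["gapped"] += 1          # gap or ambiguity code: skip column
            continue
        states = Counter(bases)
        if len(states) == 1:
            counts["conserved"] += 1
            continue
        counts["variable"] += 1
        shared_states = sum(1 for n in states.values() if n >= 2)
        if shared_states >= 2:
            counts["parsimony_informative"] += 1
        else:
            counts["singleton"] += 1
    return counts


toy_alignment = ["ACGTA-", "ACGAAT", "ATCAAT", "ATCTGT"]  # hypothetical data
print(classify_sites(toy_alignment))
```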
|1.||Number of sequences||21|
|3.||Number of sites||683|
|4.||Total number of sites (excluding sites with gaps / missing data)||504|
|5.||Number of polymorphic (segregating) sites||504|
|6.||Total number of mutations, Eta||1451|
|7.||Number of Haplotypes, h||18|
|8.||Haplotype (gene) diversity, Hd||0.981|
|9.||Variance of Haplotype diversity||0.00051|
|10.||Standard Deviation of Haplotype diversity||0.023|
|11.||Nucleotide diversity, Pi: 0.72569||0.65875|
|12.||Sampling variance of Pi||0.0166705|
|13.||Standard deviation of Pi:||0.12911|
|14.||Nucleotide diversity (Jukes and Cantor), Pi(JC)||1.90849|
|15.||Theta (per site) from Eta||0.80022|
|16.||Theta (per site) from S, Theta-W||0.27795|
|17.||Variance of theta (no recombination)||0.0086176|
|18.||Standard deviation of theta (no recombination)||0.09283|
|19.||Variance of theta (free recombination)||0.0001533|
|20.||Standard deviation of theta (free recombination)||0.01238|
Table 3: Haplotype Diversity and Gene Diversity including Nucleotide Diversity.
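Haplotype (gene) diversity is commonly computed as Hd = n/(n-1) × (1 - Σp²), where n is the number of sequences and p is the frequency of each haplotype. The sketch below applies this formula. The individual haplotype counts were not reported, so the frequency distribution used here is purely hypothetical: it is one arrangement of 21 sequences into 18 haplotypes that happens to give a value close to the reported 0.981.

```python
def haplotype_diversity(counts: list[int]) -> float:
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = sum(counts)
    if n < 2:
        raise ValueError("at least two sequences are required")
    sum_sq = sum((count / n) ** 2 for count in counts)
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical distribution: one haplotype seen 3 times, one seen twice,
# and 16 seen once (18 haplotypes, 21 sequences in total)
hypothetical_counts = [3, 2] + [1] * 16
print(f"Hd = {haplotype_diversity(hypothetical_counts):.3f}")
```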
The ClustalW output represents the nucleotide sequence alignment of the observed variable sites of the 395 analyzed P. ticto. Although a few base substitutions were identified in the cox1 sequences of the species, short insertions/deletions were more frequent. Similarly, the occurrence of short base insertions and deletions has been observed in the mtDNA sequences of other fish, such as salmonid species.
(c) Polymorphic sites among 21 accessions of genus Puntius
A total of 21 DNA sequences were analyzed, representing eight species of the genus Puntius, of which one was sequenced in this study. Overall, the following values were obtained: number of sites, 683; total number of sites (excluding sites with gaps/missing data), 504; number of polymorphic (segregating) sites, 504; number of haplotypes (h), 18; haplotype (gene) diversity (Hd), 0.981; variance of haplotype diversity, 0.00051; standard deviation of haplotype diversity, 0.023; and nucleotide diversity (Pi), 0.65875. The average number of nucleotide differences (k) was 332.010. Under no recombination, the variances of k were Vst(k) 19747.870, Vs(k) 2115.053 and V(k) 21862.923; under free recombination, they were Vst(k) 110.670 and V(k) 121.737. Theta-W was 140.088. These values were obtained with respect to all eight species of the genus Puntius.
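Several of the values listed above are linked by simple formulas, which the sketch below reproduces from the reported inputs. Nucleotide diversity is the average number of pairwise differences divided by the number of sites compared (Pi = k/L), and Watterson's estimator is the number of segregating sites divided by the harmonic sum a_n = Σ 1/i for i = 1 to n-1. The function names are illustrative, and this is not a re-implementation of DnaSP.

```python
def harmonic_sum(n_sequences: int) -> float:
    """a_n = 1/1 + 1/2 + ... + 1/(n-1), used in Watterson's estimator."""
    if n_sequences < 2:
        raise ValueError("at least two sequences are required")
    return sum(1.0 / i for i in range(1, n_sequences))


def watterson_theta(segregating: int, n_sequences: int) -> float:
    """Theta per sequence from the number of segregating sites (or mutations)."""
    return segregating / harmonic_sum(n_sequences)


def nucleotide_diversity(k: float, n_sites: int) -> float:
    """Pi per site from the average number of pairwise differences k."""
    return k / n_sites


# Inputs as reported in the text and in Table 3
N, SITES, S, ETA, K = 21, 504, 504, 1451, 332.010

theta_w = watterson_theta(S, N)
print(f"Theta-W per sequence: {theta_w:.3f}")
print(f"Theta-W per site:     {theta_w / SITES:.5f}")
print(f"Theta from Eta/site:  {watterson_theta(ETA, N) / SITES:.5f}")
print(f"Pi per site:          {nucleotide_diversity(K, SITES):.5f}")
```

With the reported inputs, these formulas agree with the tabulated Theta-W, Theta from Eta and Pi values to the precision shown in Table 3.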
(d) COX1 gene-based phylogenetic positions among 08 species of genus Puntius
Five DNA sequences of P. ticto were analyzed in the laboratory. Sequences of the remaining Puntius species were downloaded from NCBI, USA. All sequences were aligned and compared using CLUSTAL W and MEGA ver. 5 software. The Puntius ticto accessions studied in the present work grouped in a single node (Figure 3), while the other species formed a second, larger group, with all accessions in a single node. The present results show that Puntius ticto is most similar to Puntius sarana, whereas the remaining species form secondary sub-branches (Figure 3). The same data were analyzed by the neighbor-joining, minimum evolution and other phylogenetic methods, which gave the same result as maximum likelihood.
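Distance-based methods such as neighbor joining start from a matrix of pairwise distances between aligned sequences. The sketch below computes the uncorrected p-distance and the Jukes-Cantor correction for each pair. It is an illustration of the general technique, not the procedure implemented in MEGA; the sequence labels and toy sequences are hypothetical.

```python
import math
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, ignoring positions with gaps or ambiguity."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance; undefined (infinite) for p >= 0.75."""
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def distance_matrix(seqs: dict[str, str]) -> dict[tuple[str, str], float]:
    """Jukes-Cantor distance for every unordered pair of labelled sequences."""
    return {(n1, n2): jukes_cantor(p_distance(seqs[n1], seqs[n2]))
            for n1, n2 in combinations(seqs, 2)}


toy = {"seq_1": "ACGTACGT", "seq_2": "ACGTACGA", "seq_3": "ACCTAGGA"}  # hypothetical
for pair, dist in distance_matrix(toy).items():
    print(pair, round(dist, 4))
```

The correction grows without bound as the proportion of differing sites approaches 0.75, so highly divergent pairs yield very large or undefined corrected distances.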
The present investigation shows that the Puntius ticto sequences amplified with COX1 closely resemble those in the NCBI database; this marker is therefore suitable for species identification, gene flow estimation and the study of genetic polymorphism at the DNA sequence level. The amplified data were also compared with the online database to generate an online phylogeny, which clearly indicates that all accessions fall within the database presented online and that Puntius ticto shows high resemblance to Puntius sarana. The following points also emerge:
• The database available on NCBI indicates that all 07 species are closely related.
• Good genetic variation was found in all 08 populations, showing that all species are vulnerable.
• The polymorphic site data show a greater number of variable (polymorphic) sites in natural water bodies, i.e., Halali reservoir.
• Haplotype (gene) diversity and nucleotide diversity are also greater in natural water bodies.
(e) Development of DNA barcodes for taxonomic identification
All three mitochondrial sequence markers were useful for the identification of the 50 target species. Sequences of the 08 Puntius species were obtained to compare the applicability of the COI gene as a marker for DNA barcoding. A data set of sequences of eight fish species from Halali reservoir was obtained, and these sequences have been uploaded to the Barcode of Life Data Systems, USA sequence database (Table 4 and Figures 4 and 5). No stop codons, insertions or deletions were observed in the cyt b and COI sequences, indicating that they represent fragments of functional mitochondrial genes and not nuclear mitochondrial pseudogenes (numts). In our data, the extent of overlap between the genetic variation observed within and between species differed among markers. The lack of a ‘‘barcoding gap’’ in COI was also observed in a comprehensive study of publicly available sequences of freshwater fishes from the Barcode of Life Database (BOLD) (Figure 4). P. ticto showed top-hit similarity scores of 99.00% against the data available in BOLD (Figure 6 and Table 4). This clearly indicates that the DNA sequences of P. ticto were successfully barcoded and that the developed dataset can reliably identify the species.
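One common way to examine a barcoding gap is to compare the largest distance found within any species with the smallest distance found between species: a gap exists when the latter exceeds the former. The sketch below illustrates this with uncorrected p-distances. The species labels and sequences are hypothetical, and this is not the procedure used by BOLD.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a.upper(), b.upper())) / len(a)


def barcoding_gap(records: list[tuple[str, str]]) -> tuple[float, float]:
    """Return (max intraspecific distance, min interspecific distance).

    Each record is a (species label, aligned sequence) pair.
    """
    intra: list[float] = []
    inter: list[float] = []
    for (sp1, seq1), (sp2, seq2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(seq1, seq2))
    if not intra or not inter:
        raise ValueError("need within-species and between-species pairs")
    return max(intra), min(inter)


toy_records = [                      # hypothetical labels and sequences
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "ACCTTCGAAC"),
    ("species_B", "ACCTTCGAAT"),
]
max_intra, min_inter = barcoding_gap(toy_records)
print(f"max intraspecific = {max_intra:.2f}, min interspecific = {min_inter:.2f}")
print("barcoding gap present" if min_inter > max_intra else "no barcoding gap")
```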
Figure 5: Phylogenetic analysis (COI). Neighbour-joining tree for partial sequences of the mitochondrial cytochrome oxidase subunit I gene of fishes from Halali reservoir, compared with the database available on www.boldsystems.org.
Figure 6: Similarity scores of the top 99 matches with the database available on www.boldsystems.org.in.
Table 4: Similarity matches and scores of top hits (>99.00) with the database available on www.boldsystems.org.in.
Our study confirms that COI barcoding can help in the identification of the majority of fish species as well as other aquatic fauna. Such molecular data should be used as a complement to morphological analysis in the taxonomic identification of a particular species. Once a DNA barcode of this type has been developed for a species, the smuggling, illegal trade and poaching of protected species can be better controlled.
The authors wish to thank the Director General, MPCST & Scientific Advisor to Government of Madhya Pradesh, for help and encouragement during the work. The authors also wish to thank their laboratory mates for their help during this work and the Department of Fisheries, Government of M.P., for providing fish specimens for the research study.