Population Genetics for Autosomal STR Loci in Sikh Population of Central India

This study is an attempt to generate a genetic database for three endogamous groups of the Sikh population (Arora, Jat and Ramgariha) of Central India. Eight autosomal STR loci (D16S539, D7S820, D13S317, FGA, CSF1PO, D21S11, D18S51 and D2S1338) were analysed in 140 unrelated Sikh individuals. In all three studied populations, all loci were in Hardy-Weinberg equilibrium except locus FGA in Ramgariha Sikh and locus D16S539 in Arora Sikh. An analysis of molecular variance (AMOVA) showed 1% variation among the three studied populations. The close genetic relationship between the Jat and Ramgariha Sikh populations was confirmed in the MDS plot generated from the pairwise genetic distances.


Introduction
Microsatellite markers are well suited to assessing the genetic structure of a population because of their ease of use, co-dominant inheritance, high polymorphism and high mutation rate [1,2]. Multiplex polymerase chain reaction (PCR) technology has made identification easier and has emerged as the dominant conclusive identification method in forensic investigation and anthropological studies. India is a country rich in ethnic, cultural and linguistic diversity. Although a considerable amount of information on polymorphism at microsatellite loci in humans is now available, such studies are confined to a limited number of groups [3][4][5][6][7][8][9][10][11][12][13]. Like most other Indian communities, Sikhs are endogamous by caste and exogamous by sub-caste [14]. Sikhism is India's fourth-largest religion and has existed for over 500 years, beginning with the birth of its founder, Guru Nanak Dev, in the late 15th century C.E. in the Punjab region, which today lies in India and Pakistan. The Sikh community has a stronghold in the state of Punjab, where roughly 60% of the population belongs to the Sikh faith. The state of Madhya Pradesh (MP) accounts for about 1.9% of the Sikh population [15]. Information on the genetic diversity of the Sikh population remains a large gap that needs to be filled. Very few genetic studies on Sikh populations have been carried out around the world [16][17][18][19][20][21][22][23]. However, no STR marker based study on the Sikh populations of Central India has been reported in the literature to date. The present data can therefore be used in forensic casework and individual identification for the selected population groups, and will enrich the available genetic information resources.
In the present study, 140 unrelated individuals from three Sikh populations of MP, India, Arora (n=40), Ramgariha (n=50) and Jat (n=50), were analysed at nine loci: the microsatellite loci D13S317 (13q22-31), D7S820 (7q11.21-22), D2S1338 (2q35-37.1), D21S11 (21q11.2-q21), D16S539 (16q24-qter), D18S51 (18q21.3), CSF1PO (5q33) and FGA (4q28), together with the sex-typing marker amelogenin (X:p22.1-22.3; Y:p11.2). The STR loci were chosen for two reasons. Firstly, they consist of tetranucleotide repeat units and are therefore less prone to polymerase slippage during enzymatic amplification [24]. Secondly, they are located on different chromosomes and are therefore substantially unlinked, which makes them ideal tools for studying genomic variation.

Sample collection
Venous blood was collected on FTA cards from a total of 140 unrelated healthy individuals belonging to three endogamous groups of the Sikh population (50 Ramgariha, 50 Jat Sikh and 40 Arora Sikh) from the Bhopal and Raisen districts of Madhya Pradesh, India.

DNA extraction
A 1.2 mm punch from a dried sample spot on the FTA paper was placed in a PCR tube. FTA purification reagent (200 μl) was added to the tube, which was incubated for 5 minutes at room temperature and then agitated continuously with a pipette. This process was repeated three times with FTA purification reagent and twice with 100 μl TE buffer. Finally, all remaining TE buffer was removed by pipetting and discarded, and the disc was allowed to dry overnight at room temperature before being used directly for PCR amplification.

PCR amplification
Multiplex PCR amplification of the nine loci (the STR loci D16S539, D13S317, D7S820, CSF1PO, FGA, D21S11, D2S1338 and D18S51, plus amelogenin) was performed using the AmpFlSTR® MiniFiler™ PCR Amplification Kit (Applied Biosystems, Foster City, CA, USA). The PCR reagents were standardized in the laboratory for consistency of results. PCR was performed at half the reaction volume of the manufacturer's recommended protocol [25] on a 9700 thermal cycler (Applied Biosystems, USA). For one washed 1.2 mm punch of FTA paper, the PCR mix comprised 5.0 µL reaction buffer, 2.5 µL primers and 5.0 µL MQ water, giving a final volume of 12.5 µL.
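When several punches are amplified together, the per-reaction volumes stated above can be scaled into a master mix. The following minimal Python sketch illustrates the arithmetic only; the function name `master_mix` and the 10% pipetting overage are assumptions made for this example and are not part of the reported protocol.

```python
# Per-reaction volumes (µL) as stated in the text; they sum to 12.5 µL.
PER_REACTION_UL = {"reaction buffer": 5.0, "primers": 2.5, "MQ water": 5.0}


def master_mix(n_reactions, overage=0.10):
    """Scale per-reaction volumes to a master mix.

    `overage` is a hypothetical allowance for pipetting loss (assumed, not
    taken from the protocol); set it to 0.0 for exact volumes.
    """
    scale = n_reactions * (1.0 + overage)
    return {reagent: round(volume * scale, 2)
            for reagent, volume in PER_REACTION_UL.items()}


if __name__ == "__main__":
    # Example: volumes for 10 reactions without overage.
    print(master_mix(10, overage=0.0))
```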

Genotyping of amplified fragments
The PCR products were genotyped by multicapillary electrophoresis with POP-4 polymer on an ABI Prism 3100-Avant Genetic Analyzer (Applied Biosystems, Foster City, CA, USA) according to the manufacturer's protocol provided with the kit. The data were analyzed using GeneMapper software v3.5 (Applied Biosystems, Foster City, CA, USA), and alleles were designated by comparison with the allelic ladder supplied with the kit. The peak detection threshold was set to 50 RFU for allele designation. All steps followed the laboratory's internal standards and included the respective kit controls.

Analysis of the data
Allele frequencies at the 8 STR loci were calculated with GenAlEx 6.5 software [26]. Several statistical parameters of forensic importance, namely the power of discrimination (PD), polymorphism information content (PIC), matching probability (PM) and power of exclusion (PE), were calculated using the PowerStats Excel spreadsheet program [27].
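As an illustration of how such parameters can be derived from genotype data for a single locus, the following minimal Python sketch uses commonly cited definitions: PM as the sum of squared observed genotype frequencies, PD = 1 − PM, the standard PIC formula based on allele frequencies, and PE = h²(1 − 2hH²), where h and H are the observed proportions of heterozygotes and homozygotes. The function name `forensic_parameters` and the input layout (one pair of alleles per individual) are assumptions made for this example; it is not the PowerStats implementation, whose details may differ.

```python
from collections import Counter
from itertools import combinations


def forensic_parameters(genotypes):
    """Compute PM, PD, PIC and PE for one locus.

    `genotypes` is a list of (allele_a, allele_b) pairs, one per individual
    (a hypothetical input layout chosen for this sketch).
    """
    n = len(genotypes)
    # Treat (a, b) and (b, a) as the same genotype.
    genotype_counts = Counter(tuple(sorted(g)) for g in genotypes)
    allele_counts = Counter(allele for g in genotypes for allele in g)

    genotype_freqs = [count / n for count in genotype_counts.values()]
    allele_freqs = [count / (2 * n) for count in allele_counts.values()]

    # Matching probability and power of discrimination.
    pm = sum(f ** 2 for f in genotype_freqs)
    pd_value = 1.0 - pm

    # Polymorphism information content.
    sum_p2 = sum(p ** 2 for p in allele_freqs)
    cross = sum(2 * p ** 2 * q ** 2 for p, q in combinations(allele_freqs, 2))
    pic = 1.0 - sum_p2 - cross

    # Power of exclusion from observed heterozygote/homozygote proportions.
    h = sum(1 for a, b in genotypes if a != b) / n
    big_h = 1.0 - h
    pe = h ** 2 * (1.0 - 2.0 * h * big_h ** 2)

    return {"PM": pm, "PD": pd_value, "PIC": pic, "PE": pe}


if __name__ == "__main__":
    # Made-up genotypes for illustration only; not data from this study.
    toy = [(10, 11), (11, 10), (11, 11), (10, 12)]
    print(forensic_parameters(toy))
```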
PD is the probability that two randomly chosen persons would not have matching DNA profiles [28], and the combined power of discrimination (CPD) is used to show that the selected loci can safely be used to establish a DNA-based database for the studied population. The combined probability of matching (CPM) describes the probability of finding two individuals with the same genotype in the population; a value close to zero means that such a match is highly unlikely. Heterozygosity is a measure of genetic variation within a population. High heterozygosity values for a population may be due to long-term natural selection for adaptation, to the mixed nature of the population or to historic mixing of different populations. A low level of heterozygosity may be due to isolation with the subsequent loss of unexploited genetic potential. Observed heterozygosity is defined as the percentage of loci heterozygous per individual. Low observed heterozygosity values indicate inbreeding and may also reflect a departure from Hardy-Weinberg equilibrium (HWE). Observed and expected heterozygosities were calculated, and HWE was tested with an exact test, using Arlequin v3.5 [29]. The same software package was used for AMOVA among the three endogamous populations. The genetic affinities among the three studied populations were examined by plotting a dendrogram using POPTREE software [30]. Nei's genetic distances between the studied populations and other published Indian populations [31][32][33] were calculated using POPTREE [30] and summarized graphically in a Principal Component Analysis (PCA) plot generated with Past v3.02a [34] to visualize population affinities.
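The heterozygosity and combined parameters described above can be sketched in a few lines of Python. In this sketch, observed heterozygosity is computed per locus as the proportion of heterozygous individuals, expected heterozygosity uses Nei's unbiased estimator (an assumption about the estimator, made for illustration), CPD is taken as 1 − ∏(1 − PD) and CPM as ∏PM across loci, which presumes the loci are independent. The function names are hypothetical and the sketch does not reproduce Arlequin's exact test.

```python
from collections import Counter
from math import prod


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one locus.

    `genotypes` is a list of (allele_a, allele_b) pairs, one per individual.
    """
    n = len(genotypes)
    h_obs = sum(1 for a, b in genotypes if a != b) / n

    allele_counts = Counter(allele for g in genotypes for allele in g)
    sum_p2 = sum((count / (2 * n)) ** 2 for count in allele_counts.values())
    # Nei's unbiased estimator with 2n gene copies (assumed for this sketch).
    h_exp = (2 * n / (2 * n - 1)) * (1.0 - sum_p2)
    return h_obs, h_exp


def combined_pd(pd_values):
    """Combined power of discrimination across independent loci."""
    return 1.0 - prod(1.0 - value for value in pd_values)


def combined_pm(pm_values):
    """Combined matching probability across independent loci."""
    return prod(pm_values)


if __name__ == "__main__":
    # Made-up genotypes and per-locus values for illustration only.
    print(heterozygosity([(10, 11), (11, 10), (11, 11), (10, 12)]))
    print(combined_pd([0.9, 0.8]), combined_pm([0.1, 0.2]))
```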

Results and Discussion
The genetic variation in allele frequency distribution at 8 STR loci and the statistical analysis of forensic parameters for the Arora, Jat and Ramgariha Sikh populations are shown in Tables 1, 2 and 3, respectively. A common pattern of allele distribution was observed at all the studied loci, which may be due to the practice of endogamy as a social rule. The distribution of the most common allele (MCA) and least common allele (LCA) in the three endogamous castes is presented in Table 4. When the significance level was corrected by the Bonferroni method [35], so that only P values < 0.05/8 = 0.00625 were considered statistically significant, only two deviations from Hardy-Weinberg equilibrium persisted: one at locus FGA in Ramgariha Sikh and the other at locus D16S539 in Arora Sikh. Analysis of molecular variance (AMOVA) was conducted to assess the intra- and inter-population variation in the three Sikh endogamous populations (Figure 1). The genetic relationship among them is shown in Figure 2. Jat and Ramgariha Sikh showed closer affinity to each other than to Arora Sikh. The PCA plot (Figure 3) compares the three studied populations with other published populations of India. All three populations differed significantly from the tribal population. This finding is similar to earlier reports [33][34] on caste and tribal populations, which indicate that social stratification has played a major role in shaping the genetic diversity of India.
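The Bonferroni step can be illustrated with a short Python sketch that divides the nominal significance level by the number of tests and flags loci whose P value falls below the corrected threshold. The function name `bonferroni_flags`, the locus labels and the P values are placeholders invented for this example; they are not results of this study.

```python
def bonferroni_flags(p_values, alpha=0.05):
    """Return the corrected threshold and the loci with p below it.

    `p_values` maps a locus label to its HWE test p-value.
    """
    threshold = alpha / len(p_values)
    flagged = sorted(locus for locus, p in p_values.items() if p < threshold)
    return threshold, flagged


# Placeholder p-values for eight hypothetical loci; NOT the study's results.
example_p = {
    "locus_A": 0.0400, "locus_B": 0.3100, "locus_C": 0.5200, "locus_D": 0.0010,
    "locus_E": 0.7700, "locus_F": 0.1200, "locus_G": 0.0900, "locus_H": 0.6400,
}
threshold, flagged = bonferroni_flags(example_p)

if __name__ == "__main__":
    print(threshold, flagged)
```

In this made-up example, locus_A is below the nominal 0.05 level but not below the corrected threshold of 0.05/8, so only locus_D remains flagged after correction.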