The Major Y-Chromosome Haplotype XI – Haplogroup R1a in Eurasia

More than 6600 unrelated males from Eurasia were analysed by molecular hybridization experiments for the p49a, f TaqI polymorphisms. A total of 846 subjects (12.7%) belonging to the haplotype XI/R1a haplogroup were identified and further analysed for the two SNPs Z280 and Z93, which define the European and Indian sub-haplogroups, respectively. Within Europe, approximate dating based on a set of 12 STRs and subsequent TMRCA calculations is given for the Eastern, Central, Northern, Western and South-Eastern regions.
*Corresponding author: Gérard Lucotte, Institute of Molecular Anthropology, Paris, France, Tel: 0698829261; E-mail: lucotte@hotmail.com
Received April 20, 2015; Accepted May 22, 2015; Published May 25, 2015
Citation: Lucotte G (2015) The Major Y-Chromosome Haplotype XI – Haplogroup R1a in Eurasia. Hereditary Genet 4: 150. doi:10.4172/2161-1041.1000150
Copyright: © 2015 Lucotte G. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Introduction
Analysis of Y-chromosome DNA sequence variation has provided major insights into human evolution and dispersals. The first published tools [1] permitting such a study were the informative p49a, f Y-chromosome-specific DNA probes, which map to the non-recombining (NRY) Yq11.2 region [2]. Because the non-recombining region of the Y chromosome is uniparentally transmitted and escapes recombination, its variation arises only by the sequential accumulation of rare new mutations along radiating paternal lineages.
The p49a, f Y-chromosome-specific probes (the DYS1 locus) were first used in Southern Blot (SB) experiments on genomic DNA samples of unrelated males living in the cosmopolitan town of Paris [3]: TaqI male-specific fragments A, C, D, F are polymorphic between individuals, and the first sixteen described haplotypes (numbered I-XVI) were identified in this population.
Haplotype XV (A3, C1, D2, F1, I1) was the most widespread haplotype found in this initial study. Elevated frequencies were later found in French Basques [4], and the geographic distribution of haplotype XV percentages in Europe reveals a gradient of decreasing frequencies from this Basque focus toward eastern peripheral countries [5,6]. Since the introduction of the YCC nomenclature [7], haplotype XV has been treated as equivalent to haplogroup R1b-M343.
Haplotype XI (A3, C0, D0, F1, I1), equivalent to haplogroup R1a-M420, was the second most frequent haplotype, after haplotype XV. Prior to our own work it was shown, using the p49a, f TaqI polymorphism, that haplotype XI occurs at a relatively elevated percentage in Hungary [8].
In our first study on the subject [9] we reported haplotype XI frequencies in more than 600 males originating from 13 different geographic locations in Eastern Europe, where haplotype XI represents the major haplotype. The highest frequencies were obtained from Ukraine (44%), Russia (43.9%) and Hungary (40.7%); percentages of haplotype XI geographic distribution show a gradient of decreasing frequencies from these areas of higher percentages toward southeastern and more western countries in Europe. In our second study [10] we reported haplotype XI frequencies in more than 3,500 males originating from European populations (including two from Turkey and one from Cyprus). Haplotype frequencies in the different geographic areas in which the samples were pooled confirm the high prevalence of haplotype XI in Eastern Europe (38.6%, versus 2.3% in West Europe and 3.1% in South-Western Europe).
At the end of these studies, haplotype XI was interpreted as having expanded in the territory of present-day Ukraine [11] and as having been spread by the Kurgan culture, which migrated into both Europe and the East, resulting in the expansion of Indo-European languages [12].
In the present study we extend the approach to other populations located in the North of Europe, in North-Africa, the Middle-East, in Iran and in Iraq, in Afghanistan, and in Pakistan and in India. The goal of this study is to construct a complete genetic map of the haplotype XI-R1a-M420 haplogroup.

Subjects and Methods Used
The population sample (79 populations) consisted of 6643 adult males originating from 52 countries (Table 1). All blood samples were collected from volunteer donors, with informed consent; their classification was based on their grandfathers' birthplaces. The geographic location of the populations analysed is shown in Figure 1.
Genomic DNA was extracted from whole blood by a classical method [13], using proteinase K and several successive phenol/chloroform extractions.
At least 5 µg of genomic DNA were restricted with TaqI enzyme and separated by electrophoresis on a 1.5% agarose gel. The restricted DNAs were then transferred to Hybond N+ membranes by the SB method, and hybridized with two probes (the 2.8 kb p49f EcoRI insert first, and the 0.9 kb p49a XbaI and BamHI second), according to Lucotte et al. [14]. The TaqI fragments revealed in genomic DNA are named alphabetically A-Q in order of decreasing size, and most of them are male-specific; among them, the A, C, D, F and I bands can be either present or absent in individuals (variants or zero).
The haplotype XI map was produced with the Spatial Analyst program (Arcview software) using the Kriging procedure [5]. We used inverse distance weighting (IDW), which performs well with sparse data. The IDW method was computed for the five nearest neighbors (the grid has 250 rows and 355 columns) with a power of 2 (so that distant points have more influence than they would with a higher power).
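The interpolation step can be illustrated with a minimal sketch of inverse distance weighting restricted to the five nearest neighbors with a power of 2, as described above. This is not the Spatial Analyst implementation: the function name `idw_estimate`, the use of planar Euclidean distance, and the toy coordinates and values are all assumptions made for illustration only.

```python
import math


def idw_estimate(samples, query, k=5, power=2.0):
    """Estimate a value at `query` from the k nearest (x, y, value) samples."""
    # Sort samples by planar distance to the query point (assumed metric).
    by_distance = sorted(
        (math.hypot(x - query[0], y - query[1]), value) for x, y, value in samples
    )
    nearest = by_distance[:k]
    # A query that coincides with a sample takes that sample's value exactly.
    if nearest[0][0] == 0.0:
        return nearest[0][1]
    # Each neighbor is weighted by 1 / distance**power.
    weights = [1.0 / dist ** power for dist, _ in nearest]
    weighted_sum = sum(w * value for w, (_, value) in zip(weights, nearest))
    return weighted_sum / sum(weights)


# Hypothetical toy samples (x, y, frequency in %); not data from this study.
toy_samples = [
    (0.0, 0.0, 10.0),
    (2.0, 0.0, 20.0),
    (0.0, 2.0, 30.0),
    (2.0, 2.0, 40.0),
    (1.0, 3.0, 50.0),
    (5.0, 5.0, 60.0),
]

estimate_power_2 = idw_estimate(toy_samples, (0.5, 0.0), k=5, power=2.0)
estimate_power_6 = idw_estimate(toy_samples, (0.5, 0.0), k=5, power=6.0)
print(estimate_power_2, estimate_power_6)
```

With a higher power the estimate is pulled toward the closest sample, which is why a low power such as 2 gives a smoother surface when sampling points are few.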
The two Single Nucleotide Polymorphisms (SNPs) tested on haplotype XI subjects were Z280 and Z93 (Table 2). Amplifications of 3-5 mµ genomic DNA were performed in ABI 7500 and GeneAmp 9700 thermal cyclers (Applied Biosystems, Foster City, CA).
Figure 1. The various shades of purple correspond to artificial discontinuities, with density percentages as indicated (arrow points indicate the geographical limit between haplotypes XV and XI).
To examine the Short Tandem Repeat (STR) variation within the two sub-haplogroups, DNA was amplified using a PowerPlex Y (Promega, Madison, WI) amplification kit including 12 Y-STR loci (Table 3) according to the manufacturer's instructions. Fragment sizes and allele designations were determined on an ABI 3130 Genetic Analyzer (Applied Biosystems, Foster City, CA) with GeneMapper ID-X v.1.2 software.
The rho statistic was used to estimate the time to the most recent common ancestor (TMRCA) of haplotypes within the sub-haplogroups R1a1-Z280 and R1a1-Z93. Evolutionary time estimates were calculated according to Zhivotovsky et al. [15], and the STR mutation rate was assumed to be 6.9 × 10⁻⁴ per 25 years.

The haplotype XI Map
On the map of Figure 1, each point represents the approximate geographic location of a population studied. Maximal haplotype XI values reported in Table 3 correspond to peaks in the landscape of haplotype XI frequencies; four such peaks are visible on the map: three in Europe (Kiev, 44%; Moscow, 43.9%; Hungary, 40.7%) and one, the highest, in Punjab (61.3%). Around these peak areas, there are apparent clines of decreasing haplotype XI frequencies. Between the two blocks of higher frequencies (Punjab, and the three peaks in Europe), the intermediate geographic region (the Caucasus, Eastern Turkey, some parts of the Balkans, Iraq, Iran, Afghanistan, Syria and the Near East, Alexandria, and as far as Libya) shows relatively low haplotype XI frequencies. Haplotype XI frequencies are lowest in the western part of Europe and in the rest of North Africa.
This map is broadly similar to that of the Indo-European languages [16].
Some researchers have suggested that the R1a1-M458 haplogroup may have originated in India (Sharma et al., 2009), in Central Asia [17], or "somewhere between South Asia and Eastern Europe" [18].
Concerning Europe, Table 4 shows the mean frequencies of haplotype XI in five major European regions: these frequencies are low in Western Europe (4.4%) and below 10% in South-Eastern Europe (7.5%), attain 15.3% in Northern Europe, climb to 29.6% in Central Europe, and culminate (34.7%) in Eastern Europe (represented by Russia, Belarus and Ukraine).

A clear genetic distinction between European and Indian bearers of haplotype XI
The origin and spread of haplotype XI-R1a in Eurasia long remained unknown, due to the lack of downstream SNPs within the R1a1 haplogroup [18]. The new SNP marker M458, which was reported in that study, was the first step toward describing a new R1a1-M198 sub-haplogroup with a specific geographic distribution in Eastern Europe.
Recently, because the "1000 Genomes Project" has provided many complete Y-chromosome sequences in its database [19], new Y-SNPs have become available to the scientific community [20]. In their seminal article on the Hungarian population compared to Malaysian Indians, Pamjav et al. [21] distinguished Indian Z93 populations from European Z280 populations.
Of the 394 European haplotype XI subjects in our samples, more than 92.1% were assigned to Z280, whereas 90.9% of the 231 Pakistano-Indian subjects belonged to Z93, in line with the trend previously shown in [21]; both of these SNP markers were found among our Near East/Middle East and Caucasian populations, which comprise 140 of our samples bearing haplotype XI. These results are very similar to those recently published by Underhill et al. [22]. We have not tested for the paragroup R1a-Z93* and its subdivisions described by Underhill et al. [22], which is most common in the South Siberian Altai region of Russia.

TMRCA in Pakistan-India and in the five European regions
Table 5 summarizes TMRCA estimations in populations belonging to the two sub-haplogroups Z93 and Z280. For Z93, the 192 Pakistano-Indians chosen for the estimation give an approximate TMRCA of 15.5 Kyears, which is substantially older than the value (10,272 ± 2,187 years) estimated by Pamjav et al. [21]; this difference can be partially explained by the fact that these authors included in their sample the Roma population group and the Hungarian Z93 chromosomes that result from Roma admixture.
For Z280 in the five European regions (each based on a mean of 50 subjects), TMRCA varies from 6.9 Kyears for the population of Northern Europe (the one with the lowest diversity) to 12.5 Kyears for the populations of Eastern Europe. These estimations agree with our previous one [10] of about 12 Kyears for the maximum coalescent time in Europe (Northern Europe excepted). In their recent study, Underhill et al. [22] proposed about 11.7 Kyears for Europe (for the Z282 haplotype), and about 12.5 Kyears for Siberia (for the Z93 haplotype).
The authors of two articles [23,24], based on either 67 or 111 STR markers, proposed the following dates: haplogroup R1a arose in Central Asia (apparently in South Siberia and/or neighboring regions) around 20 Kyears ago; no later than 12 Kyears ago bearers of R1a1 were already in Hindustan, then went across Anatolia and the rest of Asia Minor, apparently between 10 and 9 Kyears ago, and around 9-8 Kyears ago they arrived in the Balkans and spread over Eastern Europe to the British Isles.

Discussion
In the present study we have extended the detection of haplotype XI/haplogroup R1a subjects to countries not covered in our preceding articles [9,10]: mainly Northern Europe, Georgia and Armenia, the Near/Middle East, North Africa, Iran and Afghanistan, Pakistan and India. We found high haplotype XI frequencies in Afghanistan (18.4%), in Iran (26.5%), in Pakistan (28% and 30.4%) and in India; in this last subcontinent, the maximal value of 61.3% was found in Punjab.
We have again found in our samples the clear distinction initially established by Pamjav et al. [21] between Indian Z93 populations and European Z280 populations: all our South Asian populations are Z93, while almost all our European populations are Z280. Dating shows that the Z93 Pakistano-Indian group is the most ancient (about 15.5 Kyears); in Europe, the Eastern populations are the most ancient (about 12.5 Kyears) and the Northern ones the most recent (about 6.9 Kyears).
We have already quantified the pattern of the East-to-West decreasing cline of haplotype XI frequencies in Europe [9]: there is a highly significant (p<0.001) correlation between haplotype XI percentages and northern latitude. Another way to establish the increase of haplotype XI values in Europe from West to East is to consider these frequencies along the major river basins [10]: starting from relatively low values in the Seine and the Rhine basins (3.6% and 3.4%, respectively), the haplotype XI mean frequencies climb to 39.3% and 38.9% in the Oder and the Vistula basins, and culminate (44%, the Ukrainian value) in the Dnieper basin.
Results reported here concerning frequencies and dates raise the possibility of a wide and relatively rapid spread of R1a-Z280-related lineages, associated with prevalent Copper and Early Bronze Age societies that ranged from the Rhine River in the West to the Volga River in the East [25], including the Bronze Age Proto-Slavic culture that arose in Central Europe near the Vistula River [26]. However, our current data do not enable us to directly ascribe the patterns of the haplotype XI/R1a haplogroup to specific cultures in Europe.
Contrary to the major haplotype XV-R1b haplogroup, which is geographically well sub-structured in Western Europe [6], the other major European haplotype XI-R1a haplogroup shows little differentiation when studied with most of the current SNP markers; two exceptions [22] are R1a-Z284, which is confined to Northwest Europe and peaks at ≈20% in Norway, and R1a-M458, which is centred on Poland.
There is an abrupt geographic limit, located in Western Europe approximately along the 15° meridian of longitude [9], between the haplotype XV and XI distributions (Figure 1). According to recent estimations based on whole Y-chromosome sequences and using a rate of one SNP per 122 years [22], the bifurcation of R1 into R1b and R1a occurred ≈ 25,100 years ago. Our overall impression, based on the dates we obtained, is that the successive westward waves of expansion of lineages bearing haplotype XI were superimposed on areas of Western Europe previously occupied by populations bearing haplotype XV.
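The dates discussed in this article rest on a rho-based TMRCA estimate using the STR mutation rate of 6.9 × 10⁻⁴ per 25 years. The arithmetic can be sketched as follows, under several assumptions that the text does not confirm: the founder haplotype is taken to be the modal haplotype of the sample, rho is computed as the mean number of single-repeat steps separating each haplotype from that founder, and the rate is applied per locus. The function names and the toy repeat counts are hypothetical and serve only as an illustration.

```python
from collections import Counter

RATE_PER_LOCUS = 6.9e-4  # mutation rate per 25 years (per-locus use is an assumption)
YEARS_PER_UNIT = 25      # time unit attached to the rate above


def modal_haplotype(haplotypes):
    """Most frequent repeat count at each locus, used here as the assumed founder."""
    n_loci = len(haplotypes[0])
    return tuple(
        Counter(h[i] for h in haplotypes).most_common(1)[0][0] for i in range(n_loci)
    )


def rho_statistic(haplotypes, founder):
    """Mean number of repeat-unit steps between each haplotype and the founder."""
    total_steps = sum(
        abs(allele - root) for h in haplotypes for allele, root in zip(h, founder)
    )
    return total_steps / len(haplotypes)


def tmrca_years(haplotypes, founder=None):
    """Convert rho into years: rho / (number of loci * rate) * years per unit."""
    founder = founder or modal_haplotype(haplotypes)
    rho = rho_statistic(haplotypes, founder)
    return rho / (len(founder) * RATE_PER_LOCUS) * YEARS_PER_UNIT


# Hypothetical three-locus haplotypes (repeat counts); not data from this study.
toy_haplotypes = [
    (13, 25, 16),
    (13, 25, 16),
    (13, 26, 16),
    (14, 25, 15),
]

print(rho_statistic(toy_haplotypes, modal_haplotype(toy_haplotypes)))
print(tmrca_years(toy_haplotypes))
```

In practice the estimate is sensitive to the choice of founder haplotype and of mutation rate, so a sketch like this is useful mainly for checking the order of magnitude of a reported date.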