Genetic Diversity Studies on Selected Rice (Oryza sativa L) Populations Based on Aroma and Cooked Kernel Elongation

Rice (Oryza sativa L.) is the main staple food for more than half of the world’s population. Improving the cooking and eating quality of rice is one of the important objectives of many plant breeding programs. Aroma and cooked kernel elongation are two critical parameters that determine the market value, cooking and eating qualities of rice. The objective of this study was to evaluate the genetic diversity of thirteen (13) Oryza sativa L. populations from Kenya and Tanzania. Genetic diversity was determined using 8 simple sequence repeat (SSR) markers. Diversity data was analyzed using POWERMARKER version 3.25 and GENALEX v 6.5 software packages. The number of alleles per locus ranged from 2 to 4 with an average of 3.12 across 8 loci. The polymorphic information content (PIC) ranged from 0.2920 (RM 282) to 0.6409 (RM 339) in all loci with an average of 0.4821. Pair-wise genetic dissimilarity coefficients ranged from 0.1125 to 0.9003 with an average of 0.5312. The average gene diversity over all SSR loci for the 13 rice varieties was 0.6036, ranging from 0.3550 to 0.6391. Maximum genetic similarity was observed between Kilombero and Supa, and between BS 370 and BS 217. Minimum genetic similarity was observed between Kahogo and BS 217. Cluster analysis was used to group varieties by constructing dendrograms based on SSR data and morphological characterization of grains. The dendrogram based on SSR data formed two distinct clusters of the 13 rice varieties. RM 339 and RM 241 were the most informative markers and could be used for differentiating rice varieties from diverse geographical origins. Results obtained from this study demonstrated that trait-specific SSR markers can be relied upon in diversity studies among diverse and closely related genotypes. RM 339 and RM 241 markers are recommended for use in diversity studies and in quality assurance for grading of rice varieties. Further analysis should be carried out using a larger number of samples and markers to come up with a more conclusive report on the discriminating power of microsatellite markers based on rice grain quality traits.
*Corresponding author: Wambua F Kioko, Department of Biochemistry and Biotechnology, School of Pure and Applied Sciences, Kenyatta University, P.O. Box 43844-00100, Nairobi, Kenya, Tel: +254718507667; E-mail: festuswambua101@gmail.com
Received September 23, 2015; Accepted October 15, 2015; Published October 21, 2015
Citation: Kioko WF, Musyoki MA, Piero NM, Muriira KG, Wavinya ND, et al. (2015) Genetic Diversity Studies on Selected Rice (Oryza sativa L) Populations Based on Aroma and Cooked Kernel Elongation. J Phylogen Evolution Biol 3: 158. doi:10.4172/2329-9002.1000158
Copyright: © 2015 Kioko WF, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Introduction
Rice (Oryza sativa L.) is regarded as one of the major cereal crops with high agronomic and nutritional importance. It is a major source of human food for more than half of the world's population [1]. Rice is one of the food crops for which a complete genome sequence is available. Therefore, it is an ideal model plant for the study of grass genetics due to its relatively small genome size of 430 Mb compared to other plants [2]. In Kenya, rice is the third most important staple food after maize and wheat. Local production is estimated at between 45,000 and 80,000 tonnes whereas consumption is about 300,000 tonnes. This huge production-consumption gap is met through imports.
There is a wide genetic diversity available in rice among and between landraces, leaving a wide scope for future crop improvement. Landraces are the local varieties of a domesticated plant species which have developed over time through adaptation to their natural environment. The demand for productive and homogeneous crops has led to the development of a small number of standard, high yielding varieties. This has consequently resulted in tremendous loss of heterogeneous traditional cultivars through genetic erosion. Landraces preserve much of this lost diversity and are known to harbor great genetic potential for breeding new crop varieties that can cope with environmental and demographic changes [3]. Proliferation of rice varieties has narrowed down the number of combinations of morphological descriptors available to describe the uniqueness of a variety. Therefore, characterization and varietal identification of available landraces and improved varieties have become important in modern day crop improvement [4]. There are more than 100,000 rice varieties worldwide but the major categories include indica, japonica, basmati and glutinous [5]. This study was carried out using 13 randomly selected rice landraces and improved varieties largely grown in Kenya and Tanzania. Profiles of the rice varieties used are detailed in Table 1. Kenya is home to many varieties of rice and landraces. These varieties were developed through selection based on agronomic traits. This resulted in a wide spectrum of varieties that are highly valued in both domestic and foreign markets. In Kenya, rice consumers prefer aromatic rice, which is high in quality and hence in price. Unscrupulous traders often blend this fragrant rice, which has good cooking quality traits, with low quality non-fragrant rice to make more profit from their trade. Accurate evaluation of these traits is difficult and has constrained the development of better varieties.
Various conventional methods routinely used to evaluate and grade rice varieties include sensory and chemical methods. These methods are inconsistent and have failed to address these concerns due to low sensitivity, time consumption and large sample volume requirement. Aroma and cooked kernel elongation are crucial determinants of cooked rice grain quality. Aroma is caused by accumulation of 2-acetyl-1-pyrroline (2-AP). Its accumulation is controlled by the betaine aldehyde dehydrogenase 2 (BAD2) gene, also called the fragrance (fgr) gene, located on chromosome 8. Accumulation of 2-AP is caused by a mutation in the BAD2 gene involving an 8 bp deletion [6]. The cooked kernel elongation trait is influenced by several physicochemical and genetic factors, including genotype, aging temperature, aging time, water uptake, amylose content and gelatinization temperature. Cooked kernel elongation is influenced by the kne gene and the major QTL has been mapped on chromosome 8. Previous studies on genetic analysis have shown that genes and/or QTLs of cooked kernel elongation and aroma are linked [7].
Molecular characterization using PCR-based SSR markers provides a suitable method for varietal identification in rice supplies and for differentiating between the various grades of fragrant rice. This is because SSR markers are highly reproducible, co-dominant, interspersed throughout the genome and require only a small amount of tissue, hence they are cost effective to use. In addition, grain quality evaluation is a key step in the development of better rice varieties through marker-assisted selection. This study therefore aimed to validate SSR markers for diversity analysis as a tool for grading Kenyan rice. Eight SSR markers tightly linked to the QTLs and/or genes for aroma and cooked kernel elongation were used in this study.

Plant material
A total of 500 g of rice seeds of thirteen different rice varieties were collected from Mwea Irrigation Agricultural Development (MIAD) and Kilimanjaro Agricultural Training Center (KATC). The names and attributes of the rice varieties and the names of the corresponding sources are detailed in Table 1. The rice seeds were stored in the Molecular Biology laboratory at the Kenya Bureau of Standards, Nairobi, Kenya.

Genomic DNA extraction and simple sequence repeat (SSR) analysis
DNA was extracted from rice seeds of each sample by the cetyl trimethyl ammonium bromide (CTAB) method [8]. The quality of the extracted DNA was determined by running a 5 μl aliquot of each extracted DNA sample on a 1% agarose gel pre-stained with ethidium bromide. Further, the concentration and purity of DNA in each sample solution was determined using a nanodrop spectrophotometer (JENWAY GENOVA) at wavelengths of 230, 260 and 280 nm. Genetic diversity among the rice varieties was assessed using 8 SSR markers of the RM series selected from the Gramene database (http://www.gramene.org/). Details of the markers used in this study are described in Table 2. These markers were selected on the basis of tight linkage to the QTLs and/or genes for aroma and cooked kernel elongation. An optimization survey was first conducted empirically for each of the primers to determine the optimal annealing temperatures and primer concentrations so as to achieve a robust assay. The quantified DNA samples were amplified in 25 μl reaction volumes containing 5.0 μl template DNA (5 ng), 5.4 μl ddH2O, 6 μl PCR buffer (10X), 3.0 μl MgCl2 (50 mM), 3.6 μl dNTPs (2 mM), 0.6 μl of each primer (60 ng) and 0.8 μl of Taq DNA Polymerase (5 U/μl). Amplification was carried out in a thermal cycler with the following cycle profile: initial denaturation at 94°C for 4 min; 40 cycles of 1 min denaturation at 94°C, 30 sec annealing at 55°C or 62°C (depending on the marker used) and 1 min extension at 72°C; and then 4 min at 72°C for the final extension. The resultant PCR products were analysed by electrophoresis on 2% agarose gels. To achieve good separation of the PCR products, agarose gel electrophoresis was performed at 100 V for 1 hour. The gel was visualized using a high performance ultraviolet trans-illuminator, photographed using a gel documentation instrument and saved on a computer. The size of the amplified DNA was determined with reference to the 100 bp DNA ladder included in the gel as a size marker.
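As a quick consistency check on the reaction mix described above, the listed component volumes can be summed and the standard dilution rule applied. This is only an illustrative sketch: the dictionary keys and function names are hypothetical labels introduced here, and the volumes and stock concentrations are taken as stated in the text.

```python
# Component volumes (in microlitres) as listed in the text.
# The key names are illustrative labels, not terms from the original protocol.
components_ul = {
    "template_dna": 5.0,
    "ddH2O": 5.4,
    "pcr_buffer_10x": 6.0,
    "MgCl2_50mM": 3.0,
    "dNTPs_2mM": 3.6,
    "primer_forward": 0.6,   # "0.6 ul of each primer"
    "primer_reverse": 0.6,
    "taq_5U_per_ul": 0.8,
}


def total_volume(components):
    """Sum all component volumes, rounded to suppress float noise."""
    return round(sum(components.values()), 2)


def final_concentration(stock, volume_ul, total_ul):
    """Dilution rule: C_final = C_stock * V_added / V_total."""
    return stock * volume_ul / total_ul


total = total_volume(components_ul)
mgcl2_mM = final_concentration(50.0, components_ul["MgCl2_50mM"], total)
dntp_mM = final_concentration(2.0, components_ul["dNTPs_2mM"], total)
taq_units = 5.0 * components_ul["taq_5U_per_ul"]

print(f"total volume: {total} ul")          # 25.0 ul, matching the stated volume
print(f"final MgCl2:  {mgcl2_mM:.2f} mM")
print(f"final dNTPs:  {dntp_mM:.3f} mM")
print(f"Taq per reaction: {taq_units:.1f} U")
```

The listed volumes add up to the stated 25 μl, so the recipe is internally consistent; the derived final concentrations follow from the stated stocks only and are not values reported by the authors.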

Data analysis
Genetic data was analysed using POWERMARKER version 3.25 [9] and GenAlex version 6.5 [10] statistical software packages. Clearly resolved bands of the genotypes were manually scored using the binary coding system, '1' for presence of a band and '0' for absence of a band. The resultant binary matrix was analysed in POWERMARKER to assess the genetic diversity of each variety on the basis of the following parameters: major allele frequency, allele number, polymorphism information content (PIC) and gene diversity [11]. A dendrogram was constructed by cluster analysis using the un-weighted pair group method with arithmetic average (UPGMA) as implemented in PowerMarker and was viewed using TreeView. Analysis of molecular variance (AMOVA) was used to reveal the partitioning of variation within and among the populations. Principal coordinate analysis (PCoA) was carried out based on SSR data to generate a two-dimensional representation of the genetic relationships across the 13 rice varieties with the help of GENALEX version 6.501 software.
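The per-locus summary statistics named above can be sketched in a few lines. This is a minimal illustration, assuming the commonly used definitions (gene diversity as 1 − Σp², and PIC in the form given by Botstein et al., 1980); it is not a reproduction of PowerMarker's code, and the function names and the example allele calls are hypothetical.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(allele_calls):
    """Relative frequency of each allele among the calls at one locus."""
    counts = Counter(allele_calls)
    n = len(allele_calls)
    return {allele: c / n for allele, c in counts.items()}


def gene_diversity(freqs):
    """Expected heterozygosity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2 (assumed form)."""
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return gene_diversity(freqs) - cross


# Hypothetical locus: four homozygous lines, two alleles at equal frequency.
calls = ["150bp", "150bp", "160bp", "160bp"]
freqs = list(allele_frequencies(calls).values())

print(max(freqs))             # major allele frequency: 0.5
print(gene_diversity(freqs))  # 0.5
print(pic(freqs))             # 0.375
```

PIC is always at most equal to gene diversity because of the subtracted cross term, which is why the two statistics track each other closely across loci.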

Results
A total of eight simple sequence repeat (SSR) markers covering chromosomes 3, 4, 8 and 9 were utilized to characterize and assess the genetic diversity among thirteen rice varieties from Kenya and Tanzania. The ability of each of the eight microsatellite markers to determine genetic diversity among the varieties varied. A total of 25 alleles were detected from the 13 varieties using the eight SSR markers as shown in Table 3. The allelic richness per locus generated by each marker varied from 2 for RM 282 to 4 for RM 241 and RM 339 with an average of 3.125 alleles per locus. The maximum number of alleles per locus was obtained with markers RM 241 and RM 339. The minimum number of polymorphic alleles was observed with marker RM 282. As shown in Table 3, there was no association between the number of alleles detected and the number of SSR repeat motifs. RM 339 and RM 241 rice microsatellite markers demonstrated distinct bands in most of the improved aromatic rice varieties compared to all other varieties. Similar observations were made using basmati, local varieties, Japonica and Indica rice varieties from India [12]. Therefore, these markers could be used in combination for differentiating improved aromatic rice varieties from other rice varieties.
Alleles observed in less than 5% of all the rice varieties (commonly termed rare) were investigated and identified at three loci: RM 277, RM 241 and RM 339. A total of 5 rare alleles (20%) were detected, with the maximum number being observed at RM 241 followed by RM 339. Five of the rice varieties (38%) showed rare alleles. ITA 310, Wahiwahi and Supa had one rare allele each while IR 2793 had two rare alleles. It was found that markers RM 241 and RM 339, which detected a higher number of alleles (4), also detected more rare alleles. The level of polymorphism among the 13 rice varieties was evaluated by calculating polymorphic information content (PIC) values for each SSR locus. The PIC values varied from 0.292 for RM 282 to 0.641 for RM 339 with an average of 0.5019 per locus as shown in Table 4. The varying PIC values generated by the markers served as an indicator of the discriminating power of a particular marker by taking into account the number of alleles at each locus and their relative frequencies among the tested varieties. Six out of the eight markers (RM 277, RM 252, RM 241, RM 339, RM 215 and RM 225) had PIC values above 0.5. On this basis, RM 339 was considered the best marker for the 13 test genotypes. The results are summarized in Table 3.
Heterozygosity was analysed at each microsatellite locus across all the varieties using 8 SSR markers. No heterozygosity was observed (Ho=0) across the varieties, whereas expected heterozygosity (He), which is reflected by the gene diversity at each locus, ranged from 0.355 to 0.698 with an average value of 0.604. Heterozygosity deficiency concurred with high inbreeding coefficients (F) of 1.0 across all the varieties.
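The link between the absence of observed heterozygosity and an inbreeding coefficient of 1.0 follows directly from the usual fixation-index relation F = 1 − Ho/He. The sketch below assumes that definition (the text does not state which estimator the software applied), and the function name is illustrative.

```python
def inbreeding_coefficient(ho, he):
    """Fixation index F = 1 - Ho/He (assumed definition; requires He > 0)."""
    if he <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - ho / he


# With Ho = 0, F equals 1.0 whatever the expected heterozygosity is,
# e.g. at the lowest, highest and average He values reported in the text.
for he in (0.355, 0.698, 0.604):
    print(he, inbreeding_coefficient(0.0, he))  # F = 1.0 in every case
```

This is why F is uniform across loci here even though gene diversity varies: F depends on the ratio Ho/He, and that ratio is zero at every locus.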
A dissimilarity matrix based on the CS Chord (Cavalli-Sforza and Edwards, 1967) distance as implemented in PowerMarker was used to determine the genetic relatedness among the rice varieties as shown in Table 4. Pair-wise genetic dissimilarity estimates ranged from 0.113 to 0.900. It was found that Kilombero and Supa were the closest genotypes with the lowest genetic dissimilarity value of 0.113. This was closely followed by the BS 217 and BS 370 varieties with a dissimilarity value of 0.225. On the other hand, the highest level of dissimilarity was observed between Kahogo and BS 217 with a dissimilarity index of 0.900. Higher similarity coefficients were evident among improved rice varieties as compared to landraces.
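To make the scale of these dissimilarity values easier to read, the sketch below computes a chord distance between two genotype profiles. It assumes the Cavalli-Sforza and Edwards chord distance in the form D = (2 / (πm)) Σ_j sqrt(2(1 − Σ_i sqrt(p_ij q_ij))) over m loci; the function names and the allele labels are hypothetical, and the profiles are invented for illustration rather than taken from the study data.

```python
import math


def chord_distance(profile_a, profile_b):
    """Chord distance between two profiles (assumed Cavalli-Sforza/Edwards form).

    Each profile is a list with one dict per locus mapping allele -> frequency.
    """
    m = len(profile_a)
    total = 0.0
    for pa, pb in zip(profile_a, profile_b):
        # Sum of sqrt(p_i * q_i) over alleles present in both profiles.
        shared = sum(math.sqrt(pa[a] * pb[a]) for a in pa if a in pb)
        total += math.sqrt(2.0 * max(0.0, 1.0 - shared))
    return 2.0 / (math.pi * m) * total


def homozygous_profile(alleles):
    """Profile of a fully homozygous line: one allele at frequency 1 per locus."""
    return [{a: 1.0} for a in alleles]


# Hypothetical lines typed at 8 loci (allele labels are arbitrary).
line_1 = homozygous_profile(["a", "a", "a", "a", "a", "a", "a", "a"])
line_2 = homozygous_profile(["b", "a", "a", "a", "a", "a", "a", "a"])  # 1 locus differs
line_3 = homozygous_profile(["b", "b", "b", "b", "b", "b", "b", "b"])  # all 8 differ

print(round(chord_distance(line_1, line_2), 3))  # 0.113
print(round(chord_distance(line_1, line_3), 3))  # 0.9
```

Under this assumed form, two fully homozygous lines that differ at k of 8 loci are separated by k × 2√2/(8π) ≈ k × 0.1125, so values such as 0.113, 0.225 and 0.900 would correspond to differences at one, two and all eight loci respectively.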
A dendrogram based on UPGMA grouped the 13 rice varieties into two major clusters, I and II, as shown in Figure 1. Cluster I contained 6 rice varieties and was further subdivided into two sub-clusters, i and ii. Sub-cluster i consisted of three aromatic varieties from Tanzania: two improved varieties, Saro 5 and Supa, and one landrace, Kilombero. Sub-cluster ii consisted of two improved non-aromatic varieties, IR 64 and ITA 310, and one semi-aromatic landrace, Kahogo, from both source countries. Cluster II was more diverse and consisted of 7 rice varieties which were further subdivided into 2 sub-clusters, iii and iv, each having two other small clusters. Sub-cluster iii solely consisted of 3 improved varieties from Kenya, in which two aromatic Basmati genotypes, BS 217 and BS 370, clustered close together.
Sub-cluster iv consisted of 4 non-aromatic varieties from both countries and was further subdivided into two small clusters. One consisted of two landraces, Red Afaa and Wahiwahi, from Tanzania whereas the second consisted of two improved varieties, IR 2793 and IR 54, one from Kenya and the other from Tanzania respectively. The two landraces and the two improved varieties formed sub-clusters of their own and this could be due to the presence of some unique characteristics in them that are absent in the other varieties. The dendrogram placed the Basmati varieties in one sub group and this reveals the high level of genetic relatedness among them as shown in Figure 1. Clustering of these varieties close together could perhaps be due to sharing a close ancestral parentage.

Analysis of molecular variance (AMOVA)
The analysis of molecular variance (AMOVA) results showed statistically significant differentiation (P<0.001; Table 4). Of the total variation, 86% was contained within populations whereas a small but significant 14% was contained among the two populations (P<0.001; Table 4). These results indicated that a major genetic difference existed between individual rice varieties whereas a small proportion of diversity was evident when Kenyan populations were compared with Tanzanian populations.
Table 4: Analysis of molecular variance (AMOVA) based on 8 SSR loci. Degrees of freedom (DF), sum of squares (SS), mean squares (MS), estimated variation, % variation and P-values are shown.

Principal coordinates analysis (PCoA)
The genetic relationship between the rice varieties was also assessed using principal coordinate analysis (PCoA) based on the genetic distance matrix. The first and second axes of the PCoA explained 27.96% and 24.23% of the variance respectively, totalling 52.19%. Principal coordinate analysis revealed that huge genetic diversity existed in the test rice varieties, which formed two clusters, A and B, as shown in Figure 2. In cluster A, three improved and high quality rice varieties from Kenya (BS 217, BS 370 and BW 196) were grouped close together compared to Wahiwahi and Red Afaa, landraces from Tanzania which were grouped far apart in the same cluster. On the other hand, two improved aromatic varieties from Tanzania, Supa and Saro 5, were grouped close together with one aromatic landrace, Kilombero, in cluster B. These two clusters, A and B, corresponded well with the two major clusters I and II of the UPGMA dendrogram.

Discussion
The assessment of genetic diversity is crucial in germplasm characterization, conservation and breeding. The advent of DNA marker technology has greatly facilitated studies of genetic variation through the development of genetic markers to follow the inheritance of agronomically important traits. The results obtained from assessment of genetic diversity at the DNA level could be used in the development of better breeding strategies. Simple Sequence Repeat (SSR) markers were chosen for the analysis of genetic diversity among Kenyan and Tanzanian rice varieties because previous studies have shown that they are a reliable tool for differentiation of even closely related lines. In addition, these markers have numerous advantages over other PCR-based DNA markers, which include co-dominance, high abundance in the genome, suitability for high throughput screening, reproducibility and ease of automation [13].
Although diversity analysis in rice has been previously reported [14], very little is known about the relationship of Kenyan and Tanzanian rice landraces and improved varieties on the basis of molecular analysis. The results obtained in this study indicated a significant level of genetic variation among the rice varieties used. Alleles produced by the microsatellite assays were found to be shared among improved and landrace varieties, but comparatively fewer alleles were common to aromatic and non-aromatic rice varieties. A similar observation was made by [15] using a different set of rice varieties from Pakistan.
The microsatellite assays produced some variety-specific alleles in some of the varieties assessed. It was found that markers RM 241 and RM 339 detected a higher number of polymorphic alleles and more rare alleles. A similar observation was also made using Indian aromatic and quality rice accessions [16]. Rare alleles are highly informative in fingerprinting of rice varieties and this indicates the enormous value of the RM 241 and RM 339 markers in the creation of DNA fingerprints. These can be very useful in quality assurance for varietal identification of those varieties as well as for determination of cultivar purity. This phenomenon could be in support of the fact that some markers are reportedly more specific to subspecies genomes than others, and this aspect makes them very useful for discrimination of closely related genotypes [17].
The numbers of alleles detected in this study are comparable to those observed by [18] using Basmati and non-Basmati rice varieties from Pakistan, which had an allelic richness of 2-4 alleles with an average of 2.75 alleles per SSR locus. In contrast, the average number of alleles detected in this study was comparatively higher than the values obtained using Indian aromatic rice varieties, which had averages of 2.08 and 2.5 alleles per locus [19] and [5]. The average number of alleles per locus detected in this study was lower than the values obtained using different sets of rice germplasm from India, Portugal and Venezuela, which had 4.5, 7.7 and 13.0 alleles per locus for various classes of SSR markers [20][21][22]. The discrepancy among these reports might be due to the more diverse germplasm and the higher number of rice accessions used by these researchers.
The level of polymorphism as assessed by the polymorphism information content (PIC) was considerably high and ranged from 0.29 to 0.64 with an average of 0.53. The highest PIC value was observed at RM 339, an indication that this marker was the most polymorphic and informative. Similar PIC values were obtained using rice varieties from Pakistan [16]. In contrast, the average was higher than the average PIC value of 0.43 obtained using Taiwan modern elite varieties, domestic and imported germplasm [23]. The average PIC value obtained in this study was notably lower than that previously reported by [24] using Brazilian landraces and improved lines, which had an average PIC value of 0.61. This could indicate that the genotypes used in that study were more diverse due to differences in origin and ecotype. Generally, microsatellite markers exhibit high PIC values due to their co-dominant nature and multi-allelism [25].
The study demonstrated heterozygosity deficiency across all the study varieties, an indication that the study varieties were all pure breeds. This could be associated with forces such as inbreeding, as reflected by the high inbreeding coefficients (F) of 1.0 across all the varieties. In addition, it could also be a result of the hybrid incompatibility that exists within the rice species [26]. This is supported by the fact that rice is self-pollinated and the effects of cross-pollination are very minimal. However, the level of polymorphism as indicated by the number of alleles and the PIC values concurred with the levels of expected heterozygosity at each locus. This reflected the high genetic variability contained across the rice varieties. Heterozygosity deficiency and a high level of inbreeding have also been previously reported by [27,28] using different sets of rice varieties from China. However, these findings contradict an earlier report of a low level of heterozygosity among hybrid rice varieties [26].
The genetic dissimilarity coefficients among all varieties were consistent with an earlier report by [21] among eight rice varieties from Pakistan, which had dissimilarity coefficients ranging from 0.24 to 0.92. It was lower than the average genetic similarity of 0.79 obtained among 40 cultivated rice varieties and 5 wild relatives of rice [29]. A high degree of genetic similarity ranging from 0.67 to 0.91 was also reported by [18] among Basmati and non-Basmati long grain indica varieties using SSR markers. This discrepancy in the level of genetic similarity could perhaps be due to intra-specific variation in the germplasm used.
Cluster analysis based on the similarity coefficients conspicuously placed the 13 rice accessions into two major groups. Most of the aromatic and non-aromatic varieties clustered into close sub groups. Cluster analysis grouped 2 improved aromatic rice varieties from Kenya, BS 217 and BS 370, in a sub group distinct from the rest of the varieties studied. This indicates that these varieties are genetically similar and share common ancestors. These varieties share BW 196 as one of their parents in the pedigree. Similarly, the other aromatic varieties from Tanzania (Saro 5, Supa and Kilombero) were clustered closely in the same subgroup on the upper part of the dendrogram. This is consistent with the use of the quality trait aroma as a parameter of discrimination among the varieties.
Two improved non-aromatic varieties, IR 2793 and IR 54, from both source countries were grouped close together into one subgroup. These two varieties were introduced into East Africa from the International Rice Research Institute (IRRI) in the Philippines and possibly had similar ancestors. Wahiwahi and Red Afaa, non-aromatic landraces from Tanzania, also clustered close together on the lower part of the dendrogram. In other similar studies carried out using microsatellite markers, long slender grained Basmati varieties were placed in the same group whereas other short grain non-aromatic varieties clustered into different but close groups [30,31].
Analysis of molecular variance (AMOVA) revealed that the main contribution to the genetic variation was due to variation within populations. Indeed, 97% of the genetic variation was found within populations while differences among populations contributed only 3% to the total genetic variation. This small genetic difference between the two populations could perhaps be due to exchange of germplasm between the two countries. These results are comparable to what was obtained using Indian rice varieties [32].
In the PCoA scatter plot, the distances among the varieties reflected the genetic distances among them, hence varieties that were clustered close together were interpreted to be closely related and sharing similar quality traits whereas those clustered far apart were distantly related (Figure 2). Two major groups were identified by the cluster analyses, UPGMA and PCoA based on genetic distance, corresponding to improved varieties with high grain quality traits and to landraces. Most of the improved rice varieties showed high genetic similarity, which was supported by both UPGMA and PCoA. Since most rice breeding programs are geared towards improvement of grain quality, varieties with good cooking and eating qualities were grouped together in clusters IA and IIA. Clustering of the rice varieties by both methods revealed that there was no association between the observed pattern of variation and their geographical origin. Similar observations were made using Taiwan landraces and improved rice varieties [23]. Such non-congruence between the clustering pattern and geographical origin could be due to exchange of germplasm between the two origin countries.

Conclusions
Aroma and cooked kernel elongation are crucial quality traits that determine the market value, cooking and eating qualities of rice. Molecular analysis based on these two traits revealed that the improved rice varieties from both source countries used in this study had a low genetic diversity compared to landraces. This indicated a high genetic similarity among these varieties and could perhaps be due to high selection pressure for good quality traits and sharing of a common ancestry. RM 339 and RM 241 were found to be the most reproducible and diverse markers suitable for differentiating most of the rice varieties. These trait-specific markers demonstrated good sensitivity and the extent to which they can be relied upon for use in quality assurance, for characterization of other rice varieties as well as in breeding. This would be of benefit to both consumers and farmers. Rice varieties from both source countries were found to share some common alleles, with some being specific to particular rice varieties. The variety-specific alleles can be employed in variety identification and DNA fingerprinting to differentiate rice varieties in the market from different countries.