Development of Species-Specific Microsatellite Markers for Broomcorn
Millet (Panicum miliaceum L.) via High-Throughput Sequencing

Min-Xuan Liu; Yue Xu; Tian-Yu Yang; Zhi-Jun Qiao; Rui-Yun Wang; Yin-Yue Wang; Ping Lu

doi:10.4172/2329-8863.1000297

ISSN: 2329-8863

Advances in Crop Science and Technology

Min-Xuan Liu¹, Yue Xu², Tian-Yu Yang³, Zhi-Jun Qiao⁴, Rui-Yun Wang⁵, Yin-Yue Wang⁶ and Ping Lu¹*

¹The National Key Facility for Crop Gene Resources and Genetic Improvement/Institute of Crop Science, Chinese Academy of Agricultural Science, Beijing 100081, China

²School of Life Science, Jilin University, Changchun 130012, China

³Institute of Crop, Gansu Academy of Agricultural Sciences, Lanzhou 030000, Gansu, China

⁴Institute of Crop Genetic Resources, Shanxi Academy of Agricultural Sciences, Taiyuan 030031, Shanxi, China

⁵Shanxi Agricultural University, Taiyuan 030031, Shanxi, China

⁶Faculty of Life Science, Jilin Agricultural University, Changchun 130118, Jilin, China

*Corresponding Author: Ping Lu
The National Key Facility for Crop Gene Resources
and Genetic Improvement/Institute of Crop Science
Chinese Academy of Agricultural Science, Beijing 100081, China
Tel: +86-010-6215-9962
E-mail: lupingcaas@163.com

Received date: July 18, 2017; Accepted date: July 24, 2017; Published date: July 27, 2017

Citation: Liu MX, Xu Y, Yang TY, Qiao ZJ, Wang RY, et al. (2017) Development of Species-Specific Microsatellite Markers for Broomcorn Millet (Panicum miliaceum L.) via High-Throughput Sequencing. Adv Crop Sci Tech 5: 297. doi:10.4172/2329-8863.1000297

Copyright: © 2017 Liu MX, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

Objectives: To discover and develop simple sequence repeat (SSR) markers on a large scale from the P. miliaceum genome for effective use in future genetic studies.

Results: A total of 223,894 putative SSR-containing sequences were identified by next-generation sequencing. In all, 56,694 primer pairs were successfully designed, and 240 primer pairs were randomly selected for validation. The expected and observed heterozygosity varied from 0.0447 to 0.7713 and from 0 to 0.9545, respectively, and the mean Shannon information index (I) was 0.7254. A UPGMA dendrogram indicated the high quality and effectiveness of these novel genomic SSR markers developed via next-generation sequencing technology.

Conclusion: A large repertoire of SSR markers was successfully developed by next-generation sequencing of the P. miliaceum genome. These markers will be useful for the construction of genetic linkage maps, the identification of QTLs, and marker-assisted selection breeding.

Keywords

454 FLX titanium pyrosequencing; Marker development; Microsatellite; Panicum miliaceum L

Introduction

Broomcorn millet (Panicum miliaceum L., 2n=4x=36), an important member of the genus Panicum [1], was domesticated in China more than 10,000 years ago [2,3]. It is an outstanding crop because of its high adaptability to climate change and, in particular, its tolerance of abiotic stresses such as drought, salinity and infertility [4-6]. The growth duration of broomcorn millet is 60-90 days [7], the shortest among crops, and it is usually grown to open up new wastelands and deserts or as a remediation crop in the event of natural disasters [8,9]. Although broomcorn millet grains are used primarily as bird and livestock feed in the United States and Europe, the grains remain a staple for human consumption in China [10].

Over 8,600 accessions (varieties and landraces) of Panicum miliaceum have been deposited in the National Gene Bank located at the Chinese Academy of Agricultural Sciences (Beijing, China). Although abundant morphological variation exists among broomcorn millet landraces, the characterization and identification of this variation at the molecular level is limited. This limitation is primarily due to the tetraploid genome (2n=4x=36) of P. miliaceum and the paucity of sequencing data, which has limited molecular marker development [11]. To date, no more than seventy characterized SSR loci are available for broomcorn millet [12,13], and these loci have been validated in relatively few genetic backgrounds. Initially, Hu et al. used 46 simple sequence repeat (SSR) markers from rice, wheat, barley and oat to study the genetic diversity of 118 broomcorn millet landraces collected from various ecological areas in China [12]. By constructing an SSR-enriched library from broomcorn millet genomic DNA, Cho et al. developed and identified 25 polymorphic microsatellite markers to analyze the genetic diversity of 50 P. miliaceum accessions from Mongolia, India, the Republic of Korea, Russia, Italy, and Uzbekistan [13].

Microsatellites are a type of DNA marker frequently used in many areas of research [14]. However, for species for which no genomic resources are available, the effective utilization and de novo isolation of SSR markers are limited [15]. The application of next-generation sequencing (NGS) technology has brought about a revolution in biological and agricultural applications because it can sequence DNA at unprecedented speed [16,17]. In conjunction with selective hybridization, NGS technologies can be used in high-throughput applications to develop and identify sequences that flank SSR regions. Species-specific SSR markers in mung beans [18], endangered dwarf bulrushes [19], fava beans [20], and grass peas [21] have been identified using this method and can be used as locus-specific markers in downstream genotyping studies.

In this study, we used next-generation sequencing technology to obtain genomic SSR loci of broomcorn millet inexpensively and efficiently. Furthermore, 240 primer pairs were selected and amplified in 40 broomcorn millet genotypes with the aim of identifying novel millet-specific SSR markers for future study.

Materials and Methods

Plant material and DNA isolation

Twenty-four broomcorn millet accessions with different seed colors (white, grey, yellow, red, brown, and compound) were selected for 454 sequencing.

A set of 40 broomcorn millet accessions, comprising 16 landraces and 24 cultivars from various geographic origins in China, was used for SSR marker validation and genetic diversity analysis. Of these accessions, 5 were from Heilongjiang, 8 from Shanxi, 12 from Inner Mongolia, 4 from Shaanxi, 5 from Ningxia, 5 from Gansu, and 1 from Jilin.

Seeds were provided by the National Gene Bank of China located in Beijing, China. Detailed information on the plant materials is listed in Table 1 (Additional file 1).

For SSR development, approximately equivalent weights of 7-day-old leaves (15-20) of each genotype were collected and pooled. Total genomic DNA was extracted using a cetyltrimethylammonium bromide (CTAB) method as modified by Edward et al. [22]. The concentration and purity of isolated DNA were determined using a NanoDrop ND-1000 (NanoDrop Technologies Inc., Wilmington, DE, USA).

Library preparation and 454 sequencing

SSR-enriched libraries were constructed by selective hybridization with streptavidin-coated beads and eight probes: pAC, pGA, pAAG, pAAT, pAAC, pGATA, pATGT and pAAAT [23,24]. Library quality was checked by sequencing 192 clones selected randomly from the library. pEASY-T1 was used as the cloning vector, and insert fragments were validated by sequencing with an ABI3730XL DNA Analyzer. Libraries were considered high quality if sequence lengths were between 300 and 1,000 bp with a mean of 500-700 bp.

The eight DNA libraries enriched with SSR probes were pooled in equal amounts and then subjected to Roche 454 sequencing with the GS-FLX Titanium system (Beijing Autolab Biotechnology Co., Ltd, China). The sequencing data were processed to generate a standard flowgram file (broomcorn millet.sff). This file was submitted to the Sequence Read Archive (SRA) of the National Center for Biotechnology Information (NCBI) under accession number SRX1223614.

Read characterization, SSR loci search and primer pair design

The sequencing data were pretreated with normalization, correction and quality-filtering algorithms, and then screened to filter out weak signals and low-quality reads. The 454 adaptor sequences were trimmed from the read ends using the EMBOSS software package [25]. The length distribution of the reads and the nucleotide composition of all reads were then analyzed.

Before the SSR loci search, redundant sequences were removed from the clean reads with the CD-HIT program. A large-scale SSR search was then performed using the MISA tool. The minimum SSR length was 10 bp, and the minimum numbers of repeats for monomer, dimer, trimer, tetramer, pentamer, and hexamer motifs were 10, 6, 5, 5, 5, and 5, respectively. For compound SSRs, the maximum interruption between two SSRs was set to 100 bp.
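The repeat thresholds above can be illustrated with a minimal Python sketch. It is not MISA itself: the function name `find_ssrs` and the regular-expression approach are assumptions made for illustration, and only perfect (uninterrupted) repeats are reported.

```python
import re

# Minimum repeat counts per motif length, taken from the thresholds in the text
# (monomer 10, dimer 6, trimer to hexamer 5).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return sorted (start, end, motif, repeats) tuples for perfect SSRs.

    Positions are 0-based and end-exclusive. Hypothetical helper, not MISA.
    """
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # One motif of `size` bases followed by at least (min_rep - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit,
            # e.g. "AA" as a dimer or "GAGA" as a tetramer.
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // size))
    return sorted(hits)


if __name__ == "__main__":
    read = "TTT" + "GA" * 13 + "CCC"
    for hit in find_ssrs(read):
        print(hit)
```

With these thresholds every reported SSR is at least 10 bp long, so the minimum-length criterion is satisfied without a separate check.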

A Primer 3.0 interface module [26] was used to design primer pairs according to the criteria proposed by Faircloth.

SSR marker validation and genetic diversity analysis

Characteristics such as the total number of identified SSRs, the number of sequences containing more than one SSR, and the number of SSRs present in compound formation were analyzed using the MISA output files [27] and plotted using R [28] and OpenOffice Calc.

Two hundred forty SSR primer pairs were selected and synthesized for polymorphism screening. Each PCR reaction (10 μL) contained 1.6 μL of 10× PCR buffer (containing Mg²⁺), 0.2 μL of dNTPs, 0.5 U of Taq DNA polymerase, 0.5 μL of a 5 μmol/L solution of each primer, 50 ng of gDNA template, and 6.1 μL of ddH₂O. Amplification was performed on a PTC-100 Thermo-Cycler (MJ Research, USA) with the following program: pre-denaturation at 94°C for 5 min; 39 cycles of denaturation at 94°C for 45 s, annealing at 55°C for 50 s, and extension at 72°C for 1 min; and a final extension at 72°C for 10 min. The PCR products were resolved using 8% polyacrylamide gel electrophoresis (PAGE). DNA bands were visualized by silver nitrate staining. Allele sizes were determined using a 50 bp DNA ladder (Dingguo, Beijing, China).

After PCR amplification and PAGE, a total of 162 SSRs produced clear and reproducible polymorphic fragments and could be used in further work to assess the genetic diversity of the 40 P. miliaceum accessions from different geographical locations in China.

POPGENE 1.31 [29] was used to assess genetic variability, including the observed number of alleles (Na), the effective number of alleles (Ne), Nei's (1973) gene diversity (He) [30], and the Shannon–Weaver index (I).
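As an illustration of how these per-locus statistics are commonly defined, the sketch below computes them from a list of genotype calls. It assumes diploid-style codominant scoring and the textbook formulas Ne = 1/Σp², He = 1 − Σp² and I = −Σp ln p, where p is an allele frequency. The helper `locus_stats` is hypothetical, and POPGENE may apply sample-size corrections that are not reproduced here.

```python
import math
from collections import Counter


def locus_stats(genotypes):
    """Summarize one codominant locus.

    genotypes: list of (allele_a, allele_b) tuples, one per accession.
    Returns a dict with Na, Ne, He, Ho and I (hypothetical helper).
    """
    # Count every allele copy across all accessions.
    alleles = Counter(a for g in genotypes for a in g)
    total = sum(alleles.values())
    freqs = [n / total for n in alleles.values()]
    sum_p2 = sum(p * p for p in freqs)
    return {
        "Na": len(alleles),                      # observed number of alleles
        "Ne": 1.0 / sum_p2,                      # effective number of alleles
        "He": 1.0 - sum_p2,                      # expected heterozygosity
        "Ho": sum(a != b for a, b in genotypes) / len(genotypes),
        "I": -sum(p * math.log(p) for p in freqs),  # Shannon index
    }


if __name__ == "__main__":
    calls = [("A", "A"), ("A", "B"), ("B", "B"), ("A", "B")]
    print(locus_stats(calls))
```

For a locus with two equally frequent alleles this gives He = 0.5 and I = ln 2 ≈ 0.693, which is the upper bound of I for a biallelic locus.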

Genetic relationships between accessions were inferred from the similarity matrix obtained from the proportion of shared fragments [31] using the NTSYS 2.1 program, and accessions were clustered using the unweighted pair group method with arithmetic mean (UPGMA).
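The clustering step can be sketched in a few lines of standard-library Python. This is an illustrative sketch rather than the NTSYS implementation: it assumes that the proportion of shared fragments is computed as 2 × shared / (bands in A + bands in B), which is one common definition, and the function names `dice_similarity` and `upgma` are hypothetical.

```python
from itertools import combinations


def dice_similarity(bands_a, bands_b):
    """Proportion of shared fragments: 2 * shared / (count_a + count_b)."""
    a, b = set(bands_a), set(bands_b)
    return 2 * len(a & b) / (len(a) + len(b))


def upgma(names, dist):
    """Cluster with UPGMA.

    dist maps frozenset({name1, name2}) to a distance (e.g. 1 - similarity).
    Returns a nested-tuple tree and the list of merge heights.
    """
    # Each key is a (sub)tree; each value is the tuple of its member names.
    clusters = {name: (name,) for name in names}
    heights = []

    def avg(c1, c2):
        # UPGMA distance: mean over all member pairs of the two clusters.
        pairs = [(x, y) for x in clusters[c1] for y in clusters[c2]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: avg(*p))
        heights.append(avg(c1, c2) / 2)
        members = clusters.pop(c1) + clusters.pop(c2)
        clusters[(c1, c2)] = members
    (tree,) = clusters
    return tree, heights


if __name__ == "__main__":
    # Hypothetical band sizes (bp) scored for three accessions.
    bands = {"A": {100, 120, 140}, "B": {100, 120, 160}, "C": {180, 200, 100}}
    dist = {frozenset((x, y)): 1 - dice_similarity(bands[x], bands[y])
            for x, y in combinations(bands, 2)}
    print(upgma(list(bands), dist))
```

Averaging over all original member pairs, rather than over previously merged averages, is what makes the method "unweighted" with respect to individual accessions.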

Results

Quality evaluation of the SSR-enriched DNA library and read characterization of 454 sequencing data

The quality of the SSR-enriched broomcorn millet library was tested by sequencing 192 randomly selected clones. The results showed that the recombination rate within the constructed P. miliaceum library was 86.5%, and 30.7% of the cloned sequences contained SSR motifs, with inserts that varied between 200 and 1,000 bp in size.

A total of 1,087,428 reads were generated using the Roche 454 GS FLX Titanium platform, and 904,311 reads were retained for further analysis after adaptor removal. The most abundant nucleotide in the reads was adenine, accounting for 32.98% of the sequences, followed by cytosine (24.95%), guanine (23.71%), and thymine (18.34%). The average GC content was 48.66%. Most read lengths were between 350 and 500 bp, with a mean length of 370.4 bp and a maximum length of 565 bp (Figure 1).

Figure 1: Length frequency distribution of 454 clean read sequences.
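The base composition and GC content reported above follow from simple counts over the clean reads; for example, the GC content is the sum of the cytosine and guanine shares (24.95% + 23.71% = 48.66%). A minimal sketch of such a tally is shown below. The function name `base_composition` is an assumption for illustration, and ambiguous bases such as N are simply ignored.

```python
from collections import Counter


def base_composition(reads):
    """Return per-base percentages and the GC percentage across all reads."""
    counts = Counter()
    for read in reads:
        # Count only unambiguous bases; anything else (e.g. N) is skipped.
        counts.update(b for b in read.upper() if b in "ACGT")
    total = sum(counts.values())
    percent = {b: 100.0 * counts[b] / total for b in "ACGT"}
    return percent, percent["G"] + percent["C"]


if __name__ == "__main__":
    percent, gc = base_composition(["AACG", "TTGC", "acgtn"])
    print(percent, gc)
```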

Identification of SSR loci in the broomcorn millet genome

The microsatellite identification tool MISA (available from http://pgrc.ipk-gatersleben.de/misa/) was used for SSR loci mining. A total of 289,155 SSRs were identified in 223,894 reads that contained at least one SSR locus. Of these reads, 45,604 contained more than one SSR locus, and 61,908 SSRs were present in compound formation (Table 1).

Category	Numbers
Total number of sequences examined	904311
Total size of examined sequences (bp)	334957348
Total number of identified SSRs	289155
Number of SSR containing sequences	223894
Number of sequences containing more than 1 SSR	45604
Number of SSRs present in compound formation	61908

Table 1: MISA results from the genome survey.

We analyzed the distribution of SSR loci start positions and found that the total length of the SSR motif reads was 11,299,460 bp, with an average value of 199 bp. Most of the SSR motifs (78.6%) were situated within 320 bp of the 5'-terminus and in the middle regions of the cloned sequences. Few SSRs were located near the 3'-terminus (Figure 2). For later locus amplification, 56,694 SSR primer pairs were successfully designed with the Primer 3.0 software to meet criteria that included the size range of the amplification products, optimal melting temperature, and GC content (Additional file 2: Table 2).

Figure 2: Distribution of SSR motif start positions relative to the 5'-terminus of the cloned insert within the enriched libraries.

Primer pair ID	Repeat	F (5'-3')	R (5'-3')	Size (bp)	Ta(°C)
ICSBM2	(GA)₁₃	GGCTTTGCTAGGGTTTCTCC	GGTGTGAAGTTGCCCAGATT	226	60
ICSBM3	(GA)₁₂	GTGTCTCTTTCGTCTTGCCC	GGGACACTTCCACCATCATC	204	60
ICSBM5	(GT)₁₃	TGTCTAGACCATCGCCATCA	CACTCACACACACATTTTCTTGG	218	60
ICSBM8	(AC)₁₄	GTGGTACAGCTGCTCGTTCA	AGGAGGAACCAGGAAGCAAT	254	60
ICSBM10	(AC)₁₅	GTGGTACAGCTGCTCGTTCA	GTGGTACAGCTGCTCGTTCA	268	60
ICSBM13	(AC)₁₆	CGTTTTCTCGCTACACACGA	TGGACAACGGAAAACGTACA	194	60
ICSBM14	(CA)₁₁	CTGCTGCATGCCTTTACCTT	CGCTGCAGTTTTGGTCAGTA	252	60
ICSBM15	(CA)₁₂	ATGAATCACCCGATCCACAT	ACGCCAACATCAGCATATCA	209	60
ICSBM19	(CA)₁₃	ATGAATCACCCGATCCACAT	ACGCCAACATCAGCATATCA	211	60
ICSBM21	(CA)₁₄	GCTGTCGGTCAGTCCTGTTT	ACGCCAACATCAGCATATCA	161	60
ICSBM22	(CT)₁₂	ACTCATGGTTACGGCAACTG	GCGCGAGAGAGAGAGAGAGA	287	60
ICSBM24	(AC)₂₅	ATCGACGACTAGGCCCTGTA	GGCCGTCACTATATCTGTCACC	153	60
ICSBM27	(CA)₂₂	CGATGAACGAAAATTCACCC	GTTCATTCGTCCAAATGCCT	258	60
ICSBM29	(TG)₁₆	GAGATGGTGCGGATTCTGAG	TCATTTCCACTGTCACTGCC	146	60
ICSBM30	(AC)₁₈	CAGAGCAGTGCGGTATTGTG	TCGTTTGTTGTTCGGTTGTC	232	60
ICSBM31	(AC)₂₁	TCTGGACATGCTTTCACCAG	CCTACCTCGTAACACTGCGG	267	60
ICSBM33	(GA)₁₃	AATATCCCTTTTGTCGCACG	ATGCATTGATGGGCTTGATT	181	60
ICSBM35	(GA)₁₀	AGCAACGGAGGTGAGAGAGA	TCGACACACACGACACACAC	128	60
ICSBM39	(CT)₉	TTTCAGGGACTGGACTGGAC	GTAGGGGGTAGCTGAGAGCC	105	60
ICSBM40	(CT)₈	GCCTCCTGTCTTGTAGCGTC	AGGGTAGGCTGAGAGCCTGT	121	60
ICSBM43	(CA)₁₄	GCACACGCATCATCACAAGT	GCTCATTCAACGACAGATGC	280	60
ICSBM46	(GA)₁₃	CGTCCACCTTGGTGCTTATT	GCTGATTTTCTAACGGCTGC	236	60
ICSBM49	(GA)₁₉	CTGCATTCTCTGTTCACCCA	ATCCTTTCACTCGAGGGGTT	250	60
ICSBM51	(GT)₁₇	GCGCAGTAATATATTTCAGTAATTCA	GCATCATCGTCAAGACCTCA	225	58
ICSBM54	(GT)₁₈	GCGCAGTAATATATTTCAGTAATTCA	GCATCATCGTCAAGACCTCA	226	58
ICSBM59	(GT)₁₉	TCTTTTATGCGCGTAAGGCT	CACGAACACAAGAGAAGTAGCTCTCA	262	59
ICSBM60	(AC)₂₂	ATCGACGACTAGGCCCTGTA	TGCGGAGTGTCTTTGTTCTG	199	60
ICSBM67	(AC)₂₃	ATCGACGACTAGGCCCTGTA	TGTATGGAAAGCTCTGGCCT	159	59
ICSBM68	(GT)₁₀	ATTTGACCTGTGACCTCGCT	AGGGCTCTCGAGGAGTGTTT	195	60
ICSBM71	(GT)₁₇	GACCCAGCGATCAGTCTCTC	CTCTTGTCGTCTTGGTCCGT	205	60
ICSBM78	(GT)₂₁	ACCCAACCGTATATCCAACG	TGTCACAGTTGTCCTGGCAT	274	60
ICSBM80	(GT)₉	ATTTGACCTGTGACCTCGCT	CCTTTCTGTTTCTGCAAGCC	215	59
ICSBM81	(TC)₈	CAACAAGGTTGGTTGGCTTT	ATGCTGCTGCAGATGTTTTG	166	60
ICSBM85	(TG)₁₅	TGTGGGAGAGAAGTGGGC	CAAGGAAGGAATAAACCGCA	187	59
ICSBM86	(AC)₂₁	AGTTAACCCTTGTGATGCCC	CGTTGTTGGTCCTTCTGGTT	253	58
ICSBM90	(AC)₈	GCAGTGGGTCAGCTTATGGT	TCTCTCTGTGTGTGTGCGTG	210	60
ICSBM96	(CA)₁₆	TGAGATTGGCATCAAGCAAG	TTTCTGGTCAGTTCGGTCAG	287	58
ICSBM99	(CA)₁₉	GCCACACTAAATAAGCTTTGTGTC	TGGTCGTCACTGATTACGGA	236	60
ICSBM100	(GA)₁₅	GAGTTAGAGGACAGCGTGGC	TGCAGCAGAGAATGTGCTACT	210	58
ICSBM107	(GA)₁₇	GTCCTCACCTCCTTTTGGGT	CCTTCGTTTCTCTCTCGTCG	250	60
ICSBM109	(GT)₁₄	TTCTCCGTCAGCTCACATTG	TCCATTGTTCATTTAGTAGAAACCT	251	57
ICSBM113	(GT)₁₅	GACCCAGCGATCAGTCTCTC	CTCTTGTCGTCTTGGTCCGT	201	60
ICSBM119	(GT)₁₆	GACCCAGCGATCAGTCTCTC	GACCTCACCTCTTCGTCGTC	215	59
ICSBM120	(GT)₂₂	CGCACTAGCCCTTGTCTTTC	CGCCCTACGAACAAATCACT	225	60
ICSBM123	(TAG)₁₄	CGAGTCGGTGAAGAGAGACC	TTTGCAATGTTCACCCAACT	290	59
ICSBM126	(TC)₈	CAACAAGGTTGGTTGGCTTT	ATGCTGCTGCAGATGTTTTG	165	60
ICSBM127	(AC)₁₆	TATTCGAGCCCCATTTCTTG	GCGTTATCCGGATGATGAAG	184	60
ICSBM130	(AC)₁₇	CTGATCAAATCAATGCAGCAA	GTTTTTAGGTCCGTGGCGTAAAG	132	60
ICSBM132	(CA)₁₄	CACACAGATATTTGGCACCG	TGAGGATCCGAAAAGATTGG	216	60
ICSBM135	(CA)₇	GCCGGAGTATAGATCCGACA	GTCAGGCCGTGAACGTTATT	175	60
ICSBM139	(CA)₁₀	ATGCACGCACGAACACATA	TCTTGATCATCACCAGCACC	280	59
ICSBM144	(CA)₈	CACCATGTGTATGCGTGTGA	GGAGAGGAGCTTTCAGAACCA	234	60
ICSBM151	(CA)₁₀	CCTCTCCTTACACGGGGATT	TTGATTATGCTTTGGAGGGG	242	60
ICSBM159	(AC)₁₃	ATCGTAGAAACCATTGGCCC	TGACCCATGGACACTTTTCA	279	60
ICSBM165	(AC)₁₄	AATGTCACAGGTTTCCCTCG	GCGAGAAAGAGGAGAGGGTT	225	60
ICSBM172	(AC)₆	TATAGCCTCACCGCTCGTCT	GGCCTGAAAACTCAAATGGA	206	60
ICSBM180	(CA)₁₂	TGGGACAATATGGCAAGGTT	ACAAATGCCTGATGGTAGGC	237	60
ICSBM197	(CA)₁₃	CACACAGATATTTGGCACCG	TGAGGATCCGAAAAGATTGG	214	60
ICSBM205	(CA)₁₉	ATTTTCTGGGCAATTCAACG	GTCCTCATCCCTTCCCTCTC	191	60
ICSBM206	(CT)₆	GTGAAGAACTCTCGATCGGC	ACTGGGTAGTACACGGCGAG	305	60
ICSBM212	(CT)₆	AGCACTGAGGCACAATTCCT	GTGCTGGGGTTTGTGACTTT	231	60
ICSBM216	(GA)₁₇	CTACCGCTTCAAAACGAAGG	TGTCCCACTCTCCTACCTACTACC	178	59
ICSBM219	(GA)₂₀	ATGGGATGCACAGGTACACA	TCCTTAGGTCATCGTCCTATTTG	260	59
ICSBM227	(GT)₁₁	TTGATTATGCTTTGGAGGGG	CCTCTCCTTACACGGGGATT	248	60
ICSBM234	(GT)₁₅	GACCCAGCGATCAGTCTCTC	TTGTCGTCTTGTCCGTCG	198	59
ICSBM235	(TG)₈	GACCAGAGACTTGGGCTTTG	TCACTCACTCACTCATCCGC	243	59
ICSBM239	(GA)₆	CCTGGACACACACACACACA	TCTTGTCACTGTCGGCGTAG	233	60

Table 2: Characteristics of 66 polymorphic SSR markers developed in Panicum miliaceum L.

Abundance and length frequencies of SSR repeat motifs from broomcorn millet

The total number of identified SSRs was 289,155, which included 7,476 mononucleotide repeat motifs (2.59%), 256,429 dinucleotide repeat motifs (88.68%), 18,363 trinucleotide repeat motifs (6.35%), 6,059 tetranucleotide repeat motifs (2.10%), 384 pentanucleotide repeat motifs (0.13%), and 444 hexanucleotide repeat motifs (0.15%) (Figure 3). For mononucleotide SSRs, the (A/T)n repeat was more prevalent, at about five times the frequency of the (C/G)n repeat, especially at the 0-10 bp length. The most abundant repeat in the dinucleotide SSRs was the (AG/CT)n repeat, followed by the (AC/GT)n repeat. Together, these two repeats accounted for 98.62% of the dinucleotides characterized (Additional file 3). For trinucleotide SSRs, the (AAC/GTT)n, (AAG/CTT)n, and (CCG/CGG)n repeats were predominant, representing 27.24%, 19.28%, and 11.54%, respectively, of the trinucleotides identified (Additional file 4). Thirty-one tetranucleotide repeat motifs were recognized, and the most prevalent repeats were ACAT/ATGT (40.37%), ACGC/CGTG (15.23%), and ACTC/AGTG (12.02%) (Additional file 5). Together, pentanucleotide and hexanucleotide repeat motifs comprised only 0.29% of the total SSRs detected. The dominant pentanucleotide and hexanucleotide motifs were AGCTC/AGCTG (10.42% of all pentanucleotide motifs) and ACACCC/GGGTGT (43.47% of all hexanucleotide motifs), respectively (Additional files 6, 7, and 8).

Figure 3: Pie chart of the six SSR motif classes identified in the broomcorn millet sequences.
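Motif labels such as AG/CT or ACGC/CGTG group together all rotations of a motif on both DNA strands, so that (GA)n, (AG)n, (TC)n and (CT)n are counted as one class. The sketch below shows one way to derive such labels and class percentages. The function names are hypothetical, and the labelling rule (smallest rotation of the motif paired with the smallest rotation of its reverse complement) is an assumption that reproduces the labels used in the text.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def rotations(motif):
    """All cyclic rotations of a motif, e.g. 'CT' -> ['CT', 'TC']."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def motif_class(motif):
    """Return a label such as 'AG/CT' that merges rotations and both strands."""
    motif = motif.upper()
    revcomp = motif.translate(COMPLEMENT)[::-1]
    forward, reverse = min(rotations(motif)), min(rotations(revcomp))
    first, second = sorted((forward, reverse))
    return first + "/" + second


def class_shares(motifs):
    """Percentage of SSRs falling into each merged motif class."""
    counts = Counter(motif_class(m) for m in motifs)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


if __name__ == "__main__":
    print(motif_class("TC"), motif_class("CGTG"))
    print(class_shares(["GA", "TC", "AC", "CT"]))
```

The same percentage calculation applied to the counts in the text gives, for example, 256,429 / 289,155 ≈ 88.68% for dinucleotide repeats.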

Compound SSR analysis

In this study, 43,100 compound SSRs were identified, accounting for only 18.97% of all SSR sequences. Two types of compound SSRs were identified: one with an interruption between the motifs (C type; e.g., (AC)11taacactactcacacaaacacacacactctctcag(AC)10tcacact(CA)6) and another without an interruption between the two motifs (C* type; e.g., (CT)16(TCT)5). In total, 40,464 C type (93.88%) and 2,636 C* type (6.12%) compound SSRs were detected. These results indicated the complexity of the broomcorn millet genome.
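A compound SSR can be detected by grouping neighbouring SSRs on the same read when the gap between them does not exceed the 100 bp limit used in the search. The sketch below illustrates this grouping. It is an assumption-laden illustration rather than the MISA algorithm: the function name `group_compound` is hypothetical, and the labels follow the two examples quoted above, with "c" for a compound whose motifs are separated by intervening bases and "c*" for directly adjacent motifs.

```python
def group_compound(ssrs, max_gap=100):
    """Group SSRs on one read into simple and compound loci.

    ssrs: list of (start, end) pairs, 0-based and end-exclusive.
    Returns a list of (label, members); label is "simple", "c" or "c*".
    """
    groups = []
    for ssr in sorted(ssrs):
        # Extend the current group if the gap to its last SSR is small enough.
        if groups and ssr[0] - groups[-1][-1][1] <= max_gap:
            groups[-1].append(ssr)
        else:
            groups.append([ssr])
    labelled = []
    for members in groups:
        if len(members) == 1:
            label = "simple"
        else:
            gaps = [b[0] - a[1] for a, b in zip(members, members[1:])]
            label = "c" if any(g > 0 for g in gaps) else "c*"
        labelled.append((label, members))
    return labelled


if __name__ == "__main__":
    # Coordinates laid out like the two examples in the text:
    # (AC)11 + 35 bp + (AC)10 + 7 bp + (CA)6, and (CT)16 directly before (TCT)5.
    print(group_compound([(0, 22), (57, 77), (84, 96)]))
    print(group_compound([(0, 32), (32, 47)]))
```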

Validation of SSR markers and genetic diversity study of broomcorn millet in China

To assess the feasibility of using the SSR sequences to study genetic diversity, 240 SSR primer pairs were randomly chosen, synthesized, and amplified by polymerase chain reaction to determine band polymorphisms in 40 broomcorn millet genotypes from diverse geographical locations. PCR amplification showed that 103 SSR primer pairs generated a reproducible and distinct amplicon of the expected size. Of these primer pairs, 66 (27.5% of the 240 tested) were confirmed to amplify polymorphic bands from the 40 genotypes assessed (Table 2). Polymorphic variation in the SSRs is listed in Table 3. The number of alleles (Na) per locus ranged from 2 to 5 with a mean of 2.67. The number of amplified genotypes (Ng) ranged from 3 to 15 with an average of 5.21. The expected heterozygosity (He) and observed heterozygosity (Ho) varied from 0.05 to 0.77 (mean=0.45) and from 0 to 0.95 (mean=0.23), respectively. The Shannon information index (I) varied from 0.11 to 1.52 per locus, with a mean of 0.73. These results indicated that SSR identification via next-generation sequencing is feasible and efficient, and that these SSRs can be used for future studies of broomcorn millet genetics.

Primer pair ID	Ng	Na	I	Ho	He
ICSBM2	10	4	1.225	0.830	0.693
ICSBM3	6	3	1.098	0.435	0.670
ICSBM5	3	2	0.689	0.136	0.499
ICSBM8	10	4	1.289	0.288	0.706
ICSBM10	3	2	0.693	0.026	0.503
ICSBM13	6	3	0.585	0.034	0.307
ICSBM14	3	2	0.556	0.102	0.371
ICSBM15	3	2	0.522	0.091	0.341
ICSBM19	3	2	0.529	0.080	0.347
ICSBM21	6	3	0.952	0.552	0.565
ICSBM22	3	2	0.692	0.091	0.502
ICSBM24	6	3	0.682	0.023	0.372
ICSBM27	10	4	1.005	1.000	0.589
ICSBM29	3	2	0.692	0.057	0.501
ICSBM30	3	2	0.109	0.000	0.045
ICSBM31	10	4	1.180	0.322	0.632
ICSBM33	10	4	1.340	0.955	0.731
ICSBM35	6	3	0.830	0.205	0.500
ICSBM39	10	4	1.346	0.886	0.735
ICSBM40	15	5	1.519	1.000	0.771
ICSBM43	6	3	0.574	0.109	0.310
ICSBM46	6	3	0.594	0.114	0.315
ICSBM49	3	2	0.646	0.125	0.458
ICSBM51	10	4	0.627	0.094	0.295
ICSBM54	6	3	0.692	0.046	0.389
ICSBM59	6	3	0.539	0.034	0.287
ICSBM60	6	3	0.280	0.000	0.129
ICSBM67	6	3	0.269	0.000	0.120
ICSBM68	3	2	0.556	0.057	0.371
ICSBM71	3	2	0.693	0.058	0.503
ICSBM78	3	2	0.684	0.068	0.494
ICSBM80	6	3	0.880	0.330	0.514
ICSBM81	3	2	0.677	0.071	0.488
ICSBM85	10	4	1.019	0.215	0.581
ICSBM86	6	3	1.097	0.852	0.670
ICSBM90	3	2	0.419	0.023	0.253
ICSBM96	3	2	0.366	0.080	0.211
ICSBM99	3	2	0.659	0.080	0.469
ICSBM100	3	2	0.330	0.000	0.185
ICSBM107	3	2	0.676	0.023	0.486
ICSBM109	3	2	0.494	0.016	0.317
ICSBM113	3	2	0.692	0.114	0.502
ICSBM119	3	2	0.693	0.138	0.503
ICSBM120	6	3	1.081	0.193	0.659
ICSBM123	6	3	0.683	0.636	0.452
ICSBM126	3	2	0.693	0.011	0.503
ICSBM127	6	3	0.463	0.091	0.228
ICSBM130	6	3	0.532	0.136	0.273
ICSBM132	3	2	0.649	0.159	0.459
ICSBM135	3	2	0.388	0.057	0.229
ICSBM139	3	2	0.562	0.023	0.377
ICSBM144	3	2	0.249	0.023	0.128
ICSBM151	6	3	0.561	0.034	0.290
ICSBM159	6	3	0.936	0.897	0.583
ICSBM165	3	2	0.234	0.011	0.118
ICSBM172	6	3	1.071	0.818	0.653
ICSBM180	3	2	0.612	0.102	0.423
ICSBM197	6	3	1.028	0.609	0.623
ICSBM205	3	2	0.552	0.000	0.369
ICSBM206	3	2	0.633	0.094	0.448
ICSBM212	3	2	0.649	0.000	0.459
ICSBM216	6	3	0.707	0.188	0.389
ICSBM219	3	2	0.685	0.511	0.495
ICSBM227	10	4	1.331	0.796	0.729
ICSBM234	3	2	0.693	0.277	0.503
ICSBM235	10	4	1.248	0.309	0.699
ICSBM239	3	2	0.672	0.000	0.482
Mean	5.209	2.672	0.725	0.235	0.445
	2.766	0.786	0.302	0.300	0.173

Table 3: Polymorphic variation in SSR loci following amplification of 40 geographically diverse Panicum miliaceum L. accessions. Ng: number of genotypes; Na: observed number of alleles; I: Shannon information index; Ho: observed heterozygosity; He: expected heterozygosity.

Cluster analysis was performed on the 40 broomcorn millet accessions using the UPGMA method based on Nei's genetic distance. The dendrogram indicated that, at a genetic similarity value of 0.645, the 40 broomcorn millet accessions grouped into three clusters (Figure 4). Cluster 1 had six accessions, including a series of Longshu varieties and their parents, which originated from Heilongjiang. Cluster 2 consisted of accessions mainly from Shanxi (7), Gansu (4), and Ningxia (5) as well as several accessions from Inner Mongolia (5). Cluster 3 had eleven accessions, including most (8) of the fourteen varieties collected from Inner Mongolia. Two accessions in Cluster 3 originated from Shanxi. This pattern of diversity was in accordance with the results reported by Hu et al.

Figure 4: UPGMA dendrogram of 40 germplasm resources.

Discussion

Broomcorn millet is a vital native crop for food security in arid and semiarid areas

Broomcorn millet is an ancient domesticated crop, and the oldest historical reports of it date to 10,000-8,000 BC [32]. The crop was mentioned in nine poems of the ancient Chinese “Book of Poetry” (Shih Ching) and was regarded as the most important grain in ancient times [33]. Before rice and wheat became popular, Panicum was a staple food in the countries of Eastern Asia, from where it spread across the entire Eurasian continent [32,34-36]. Panicum remains an important food in these regions today [37,38]. Owing to its short growth cycle and its resistance to salt, alkali, and drought stresses, broomcorn millet is usually grown as a pioneer cereal in new wastelands and deserts or as a remediation crop after natural disasters. Previous researchers have identified large phenotypic variation in broomcorn millet germplasm [10], and using these rich genetic resources to improve broomcorn millet productivity is a promising field of study. However, genomic resources and highly polymorphic molecular markers for genetic analysis of broomcorn millet are lacking, and the crop has been regarded as a “genomic orphan”. Therefore, the development of user-friendly and highly polymorphic molecular markers is vital for the success of broomcorn millet breeding programs [39].

Development of microsatellite markers using 454 pyrosequencing

Microsatellites, also called “short tandem repeats” or “simple sequence repeats”, are tandem repeats of 1-6 nucleotides that are distributed throughout eukaryotic genomes. The number of repeats at an SSR locus varies widely between varieties or populations, and SSRs show codominant inheritance, so they are DNA markers used frequently in genomic research [14]. Although the application of microsatellites is simple and robust, the identification and development of microsatellites is highly challenging [19]. There are two traditional approaches to SSR loci development. The first involves the de novo construction of a genomic library followed by microsatellite development, and the second involves testing microsatellite primers previously developed for related species. Microsatellite development methods based on Sanger sequencing are often low throughput and typically yield only a few hundred sequences because Sanger sequencing is expensive [15]. In addition, enrichment libraries are generally constructed with a few specific tandem repeats that are selected without prior information on their abundance in the genome [40], which may bias genome representation. 454 GS-FLX technology (next-generation sequencing) provides new opportunities for microsatellite isolation owing to its high throughput, low operating cost, and more thorough representation of the genome [15,41]. To date, novel genomic SSR markers have been developed at high throughput for several crops via 454 GS-FLX sequencing [18-21,42]. In this study, massively parallel sequencing technology was adopted in order to discover numerous high-quality SSRs from the broomcorn millet genome quickly. A total of 1,087,428 broomcorn millet genomic reads were generated, and the clean reads had an average length of 370 bp. The MISA analysis showed that 289,155 SSRs were present in 223,894 of the 904,311 clean reads.
Among the SSR-containing reads, mononucleotide, dinucleotide, and trinucleotide repeat motifs were dominant in the broomcorn millet genomic sequences. This result is similar to findings in other crops [21,43]. The (AG/CT)n repeat accounted for 45.8% of the total identified SSRs and was the predominant repeat motif in the broomcorn millet genome. It was followed in abundance by (AC/GT)n, (A/T)n, and (AAC/GTT)n. In this study, (AAT/ATT)n, (ATC/ATG)n, and (ACT/AGT)n were rarely detected. The isolation and identification of unwanted repeat motifs such as (AAT/ATT)n, (ATC/ATG)n, and (ACT/AGT)n, which are present in low proportions, should increase the number of primers designed successfully.

SSRs can be effective molecular markers for the genomic analysis of broomcorn millet

SSR markers can link genotypic and phenotypic variation, which may help breeders to expedite the development of improved cultivars [44]. In addition, SSR markers are well suited to genetic linkage map construction, genetic diversity analysis, QTL mapping, gene cloning, and marker-assisted selection because of their wide distribution in eukaryotic genomes [45]. Next-generation sequencing technology can be used for the high-throughput identification of SSR loci in crop species, especially in “orphan” crop species that lack available genetic and genomic resources [46]. In this paper, 454 pyrosequencing was chosen to develop codominant and polymorphic SSR markers in broomcorn millet. This method was highly effective, as 103 of the 240 (42.9%) randomly selected SSR primer pairs amplified stable and clear bands by PCR. The 103 markers were then used for genetic diversity analysis and group delineation of 40 broomcorn millet accessions originating from 5 different ecotypes, and 66 markers were confirmed to be highly polymorphic in the test accessions.

The polymorphic markers improved our understanding of the level of genetic diversity among broomcorn millet accessions. For broomcorn millet, isozymes and protein electrophoresis have not been successful for intra-species grouping or classification [1,47]. Genomic SSR markers in broomcorn millet were also identified more efficiently than SSR markers transferred from other crops. Hu et al. reported that only 46 of the 983 SSR primer pairs (4.6%) tested could generate clear and reproducible polymorphic fragments [12]. The numerous genomic SSR markers developed in this study will facilitate the evaluation of genetic structure and the construction of high-resolution maps in broomcorn millet.

Conclusion

This study provides a broad discovery and characterization of microsatellite loci in the broomcorn millet genome using 454 GS FLX Titanium sequencing technology. Moreover, massive SSR-enriched sequence data were generated for the first time, facilitating the discovery and utilization of genomic SSR markers and thereby accelerating genomic and genetic research on broomcorn millet.

Acknowledgments

This study was supported by the National Natural Science Foundation of China (Grant No. 31301386), the National Millet Crops Research and Development System (NMCRDS), the China Agriculture Research System (CARS-07-12[1].5-A1) and the Agricultural Science and Technology Innovation Program (ASTIP) in CAAS. We are grateful to Dr. Dahai Wang and Liping Sun (Beijing Autolab Biotechnology Co., Ltd) for their special contributions to this work.