Comparative Genomic and Proteomic Phylogenetic Analysis of Indian Isolate of Partial Coat Protein Gene Sequence of Zucchini Yellow Mosaic Virus (ZYMV) Using Data Mining


Introduction
Crops belonging to the family Cucurbitaceae are generally known as cucurbits. As a group, cucurbits occupy the largest area among vegetable crops in India and in other tropical countries. Among the cucurbitaceous crops, summer squash is important because it is one of the earliest vegetables to reach the markets of India. Among the different plant pathogens, viruses are responsible for great losses to this crop. In cucurbit crops, viruses belonging to the genus Potyvirus have caused severe economic damage all over the world [1]. In particular, Zucchini yellow mosaic virus (ZYMV), a member of the genus Potyvirus in the family Potyviridae, has become one of the most damaging viruses, causing epidemics in commercial cucurbits worldwide [2]. In Korea, the disease caused by ZYMV is considered one of the major limiting factors for cucurbit production [3,4].
In this study, the partial coat protein gene sequence of an Indian isolate of ZYMV from the North Western Himalayan region was determined, and phylogenetic analysis of the test sequence was carried out at both the genomic and proteomic levels to gain insight into the evolutionary pattern of ZYMV. Phylograms and phylogenetic trees were constructed for all 14 countries, viz. Australia, Austria, California, China, France, Hungary, India, Israel, Japan, Korea, Poland, Singapore, Taiwan and USA, using PHYLIP 3.68 and EXOME TM HORIZON, respectively. The phylogenetic analysis with isolates from other countries was carried out to assess the worldwide distribution of ZYMV; tracing its phylogeny may help in understanding management of the disease. This work represents the first detailed phylogenetic study conducted with well-explained flowcharts of the methods used for constructing 64 phylograms and 64 phylogenetic trees.

Maintenance of the virus isolate
The virus cultures were maintained on healthy seedlings of the summer squash variety Australian Dark Green by mechanical sap inoculation under insect-proof glasshouse conditions.

Enzyme Linked Immunosorbent Assay (ELISA)
ZYMV-specific antibodies and alkaline phosphatase-linked antibodies supplied by BIOREBA AG (Switzerland) were used for ELISA, following the protocols provided with the ELISA kits (Figure 1). The positive and negative controls were also provided by the antibody supplier (BIOREBA AG, Switzerland).

RNA isolation
Total RNA was isolated from virus-infected summer squash leaves using the RNeasy Plant Mini Kit (Qiagen). RNA was also isolated from healthy control plants. First-strand cDNA was synthesized by RT-PCR using the specific oligonucleotide primer p9502 shown in Table 1, and the cDNA was further amplified by PCR in a thermal cycler (Applied Biosystems, USA) using specific primers (Table 1). The components of the RT-PCR and PCR reactions were standardized (Table 2), as were the thermal profile and the number of cycles.

Sequencing and translation of the sequenced PCR product
Sequencing was carried out using both the reverse and forward primers [5], and the partial coat protein sequence obtained was submitted to the NCBI database. The sequence was used as such for genomic studies at the nucleotide level and was also translated into protein using the Expert Protein Analysis System (ExPASy) tool for proteomic studies.

Sequence selection
Both nucleotide and protein sequences of the coat protein gene of ZYMV were retrieved from the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/) (Table 3).
The nucleotide and protein sequences listed in Table 3 were later used together with the test sequence for multiple sequence alignment and phylogenetic analysis using various online and offline bioinformatics tools.

Conversion of selected sequences into FASTA format
All 67 coat protein nucleotide and protein sequences, obtained from all over the world in GenBank format, were converted into FASTA format [6]. These FASTA-formatted sequences were then stored country-wise in separate Notepad text files.
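The conversion step can be illustrated with a minimal Python sketch. It is not the tool used in this study [6]; it assumes the standard GenBank flat-file layout (ACCESSION and DEFINITION lines, an ORIGIN block, and a closing "//"), keeps only the first line of a multi-line DEFINITION, and the function name genbank_to_fasta and the record shown are hypothetical.

```python
def genbank_to_fasta(genbank_text, line_width=60):
    """Convert GenBank flat-file text (one or more records) to FASTA text.

    Simplified sketch: only the first DEFINITION line is kept, and the
    sequence is taken from the ORIGIN block of each record.
    """
    records = []
    accession, definition, seq_parts, in_origin = None, "", [], False
    for line in genbank_text.splitlines():
        if line.startswith("ACCESSION"):
            fields = line.split()
            accession = fields[1] if len(fields) > 1 else "unknown"
        elif line.startswith("DEFINITION"):
            definition = line[len("DEFINITION"):].strip()
        elif line.startswith("ORIGIN"):
            in_origin = True
        elif line.startswith("//"):
            # "//" terminates a record: store it and reset the state.
            seq = "".join(seq_parts).upper()
            records.append((accession or "unknown", definition, seq))
            accession, definition, seq_parts, in_origin = None, "", [], False
        elif in_origin:
            # Sequence lines look like "        1 atggctgatg atacgt";
            # keep the letters, drop the position numbers and spaces.
            seq_parts.append("".join(ch for ch in line if ch.isalpha()))
    out = []
    for acc, desc, seq in records:
        out.append(f">{acc} {desc}".rstrip())
        out.extend(seq[i:i + line_width] for i in range(0, len(seq), line_width))
    return "\n".join(out) + "\n"


# Hypothetical record for illustration only (not a real accession).
example = (
    "LOCUS       XX000001    16 bp\n"
    "DEFINITION  Hypothetical example record.\n"
    "ACCESSION   XX000001\n"
    "ORIGIN\n"
    "        1 atggctgatg atacgt\n"
    "//\n"
)
print(genbank_to_fasta(example), end="")
```

Writing the output of one call per country to a separate file would reproduce the country-wise storage described above.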

Sequence alignment
In the present investigation, multiple sequence alignment was carried out for the nucleotide and protein sequences of the test ZYMV isolate and the 67 other ZYMV isolates retrieved from the NCBI database. Multiple sequence alignment was performed using the CLUSTAL W program [7].

Phylogenetic analysis
For phylograms, PHYLIP 3.68 software was used, and for phylogenetic trees, EXOME TM HORIZON was used.

Culture identification and collection
Under field conditions, summer squash plants infected with ZYMV develop a variety of symptoms. These range from mild to severe mosaic and include green blisters on leaves, vein clearing, and shoestringing of leaves (Figure 3).
For culture collection, a survey of various summer squash growing localities of Himachal Pradesh (H.P.) was conducted.

Mechanical transmission
The indicator plant Chenopodium amaranticolor Coste and Reyn was also used to confirm the presence of the test virus by observing the lesions produced.

Symptomatology
The first manifestation of the disease on the inoculated plants was observed 16-18 days after inoculation, in the form of vein clearing on the younger leaves. Later, the infected plants exhibited mottling and mild mosaic symptoms. As the infection progressed, the leaf lamina was drastically reduced in both shape and size. Leaves were deformed, with dark green blisters and distorted midribs. Virus

Serological detection
Infected leaves of summer squash showing prominent symptoms were subjected to serological indexing. The samples collected from the hill state of H.P. produced a prominent yellow colour, which was confirmed by the OD values obtained. Because these OD values were close to that of the positive control, severe ZYMV infection was confirmed in the samples drawn from District Una (H.P.) (Tables 4 and 5).

RNA isolation and molecular detection of the virus using RT-PCR and amplification of cDNA
The serological results indicated that the test virus was present and that its concentration was high. Infected and healthy plants were therefore used for RNA isolation. The isolated RNA was reverse transcribed into cDNA, and the cDNA was then amplified by PCR. The amplified product was 700 bp in size; sequencing of this PCR product with the forward and reverse primers yielded a sequence of 154 nucleotides (Sequence in FASTA Format).

Translation of the test sequence
The sequence was translated into its amino acid residues using the protein translator tool at the Target Assisted Iterative Screening (TAIS) network. Analysis of the amino acid sequence showed a longest open reading frame (5'-3') of 51 amino acids, with a methionine at an internal position (Protein Product).

Multiple sequence alignment
Multiple sequence alignment of the selected nucleotide and protein sequences of ZYMV with those of the Una (Indian) isolate was performed using the CLUSTAL W program [7], available online at the European Bioinformatics Institute (EBI) (http://www.ebi.ac.uk/). Country-wise CLUSTAL W alignments that included the query nucleotide and protein sequences were performed in the same way. These CLUSTAL W outputs were then used in the bioinformatics tools PHYLIP 3.68 and EXOME TM for constructing phylograms and phylogenetic trees.
Pairwise percentage similarity score matrices were also drawn up for each of the 67 nucleotide and protein sequences compared with the test isolate from Una (India). These data are arranged country-wise in tabular form (Table 6).
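A pairwise percentage score of this kind can be sketched in Python as follows. This is one common definition (identical positions divided by positions compared, with gap columns excluded) and is given only as an illustration; it is an assumption that it approximates the scores in Table 6, and it may not reproduce the CLUSTAL W values exactly. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two rows of the same alignment.

    Columns where either sequence has a gap are not counted.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        matches += (x == y)
    return 100.0 * matches / compared if compared else 0.0


def scores_against_test(test_seq, aligned_others):
    """Score every aligned sequence (name -> row) against the test row."""
    return {name: percent_identity(test_seq, seq)
            for name, seq in aligned_others.items()}


# Toy aligned rows for illustration only (not data from this study).
test_row = "ATG-CA"
others = {"isolate_1": "ATGACT", "isolate_2": "ATG-CA"}
print(scores_against_test(test_row, others))
```

Arranging the resulting dictionary by country of origin gives a table of the same shape as the country-wise score matrices described above.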

Phylogenetic Analysis
To trace the evolutionary patterns of the test virus and to determine its relationship with the other sequences selected from NCBI (Tables 7 and 8; Figure 4, included as supplementary data), phylograms and phylogenetic trees were constructed by the Maximum Likelihood (ML), Maximum Parsimony (MP), Neighbor Joining (NJ) and Unweighted Pair Group Method with Arithmetic Mean (UPGMA) methods, using PHYLIP 3.68 and EXOME TM, respectively.
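Of the four methods, UPGMA is the simplest to state: the two closest clusters are merged repeatedly, and the distance from the merged cluster to every other cluster is the size-weighted average of the two former distances. The Python sketch below illustrates that procedure only; it is not the PHYLIP or EXOME TM implementation, and the function name upgma, the labels and the toy distance matrix are hypothetical.

```python
def upgma(names, dist):
    """Build a UPGMA tree from a symmetric distance matrix.

    names: list of labels; dist: list of lists with dist[i][j] the
    distance between names[i] and names[j]. Returns a Newick string
    with branch lengths.
    """
    # cluster id -> (Newick label, number of leaves, node height)
    clusters = {i: (name, 1, 0.0) for i, name in enumerate(names)}
    n = len(names)
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        # Pick the closest pair of clusters.
        (a, b), dab = min(d.items(), key=lambda kv: kv[1])
        la, na, ha = clusters.pop(a)
        lb, nb, hb = clusters.pop(b)
        height = dab / 2.0
        label = f"({la}:{height - ha:.3f},{lb}:{height - hb:.3f})"
        # Distance to the merged cluster: size-weighted average.
        new_d = {}
        for c in clusters:
            dac = d[tuple(sorted((a, c)))]
            dbc = d[tuple(sorted((b, c)))]
            new_d[(c, next_id)] = (na * dac + nb * dbc) / (na + nb)
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        clusters[next_id] = (label, na + nb, height)
        next_id += 1
    label, _, _ = next(iter(clusters.values()))
    return label + ";"


# Toy distances for illustration only (not data from this study).
labels = ["A", "B", "C"]
matrix = [[0, 2, 6],
          [2, 0, 6],
          [6, 6, 0]]
print(upgma(labels, matrix))
```

UPGMA assumes a constant rate of change along all lineages, which is one reason the results are usually compared with NJ, MP and ML trees, as was done here.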

Phylograms and phylogenetic trees analysis of nucleotides and proteins
Australia: A total of 5 nucleotide and 5 protein sequences selected from the Australian sequences were analysed with the test virus; the trees were drawn and the results obtained with the different methods are briefly described. The test virus showed sequence similarity with DQ925447 in all the phylograms and phylogenetic trees constructed for ZYMV sequences from Australia. The test virus showed protein sequence similarity with ABL09422 in all the phylograms and phylogenetic trees constructed.
Austria: A total of 9 nucleotide and 9 protein sequences selected from the Austrian sequences were analysed with the test virus; the trees were drawn and the results obtained with the different methods are briefly described. The test virus showed sequence similarity with AJ420020 in all the phylograms and phylogenetic trees constructed for Austrian isolates. The test virus showed protein sequence similarity with CAD12315 and CAD12316 in all the phylograms and phylogenetic trees constructed for Austrian isolates.
China: A total of 20 nucleotide and 20 protein sequences selected from the Chinese sequences were analysed with the test virus; the trees were drawn and the results obtained with the different methods are briefly described. The test virus showed the least sequence similarity, and only with AJ316229, among all the phylograms and phylogenetic trees constructed for Chinese isolates. The test virus showed the least protein sequence similarity with some protein sequences in all the phylograms and phylogenetic trees constructed for Chinese isolates.
Hungary: A total of 4 nucleotide and 4 protein sequences selected from the Hungarian sequences were analysed with the test virus; the trees were drawn and the results obtained with the different methods are briefly described. The Hungarian sequences showed around 60% sequence similarity with the test sequence in all the phylograms and phylogenetic trees constructed for Hungarian isolates. The test virus showed protein sequence similarity with the CAD31036 and CAD31056 protein sequences in all the phylograms and phylogenetic trees constructed for Hungarian isolates.
Japan: A total of 6 nucleotide and 6 protein sequences selected from the Japanese sequences were analysed with the test virus; the trees were drawn and the results obtained with the different methods are briefly described. The test virus showed sequence similarity with AB188115 and AB188116 in all the phylograms and phylogenetic trees constructed for Japanese isolates. The test virus showed protein sequence similarity with the BAE75935, BAE75934 and BAD74201 protein sequences in all the phylograms and phylogenetic trees constructed for Japanese isolates.
Korea: A total of 5 nucleotide and 5 protein sequences selected from the Korean sequences were analysed with the test virus; the trees were drawn and the results obtained with the different methods are briefly described. The test virus showed sequence similarity with AJ429071 among all the phylograms and phylogenetic trees constructed for Korean isolates. The test virus showed protein sequence similarity with the CAD22062, AAQ17215 and AAQ17216 protein sequences in the phylograms and phylogenetic trees constructed for Korean isolates.
Taiwan: A total of 8 nucleotide and 8 protein sequences of the ZYMV CP selected from Taiwan were analysed with the test virus; the trees were drawn and the results are briefly described. The test virus showed sequence similarity with AF127933 in all the phylograms and phylogenetic trees constructed for Taiwanese sequences. The test virus showed less protein sequence similarity with the AAD44688 protein sequence, as revealed by all the phylograms and phylogenetic trees constructed for Taiwanese isolates.
A further 10 nucleotide and 10 protein CP gene sequences of ZYMV isolates from different countries were analysed with the test virus; the trees were drawn and the results obtained with the different methods are briefly described. Among these, sequences from California, France, India, Israel, Poland, South Africa, Singapore and USA were studied. The test virus showed maximum sequence similarity with D13914 in all the phylograms and phylogenetic trees constructed. The test virus showed protein sequence similarity with the ABM65098 and ABI97984 protein sequences.

Discussion
In the present study, the partial CP gene sequence of the Una (Indian) isolate of ZYMV was compared with 67 other isolates of ZYMV at both the genomic and proteomic levels to examine its evolutionary behaviour.
The viral cultures used in the present investigation were selected on the basis of visual symptoms. ZYMV is known to produce symptoms such as vein clearing, yellow mosaic, blistering and shoestringing of leaves, fruit and seed deformation, and stunting of plants [8]. There have been many reports of simple and rapid techniques to detect plant viruses using RT-PCR. In 2007, detection of ZYMV using RT-PCR was carried out in Cucumis sativus L. and Cucumis melo L. in Poland. Pospiezny et al. and Auger et al. identified a strain of ZYMV on squash by means of DAS-ELISA and PCR using the ZYMV-specific primers ZY-2 and ZY-3; a segment of 1186 bp was amplified and sequenced [10,12].
There are numerous other reports in which both PCR and RT-PCR have been used for rapid detection of ZYMV [13-17]. The amplified product of ~700 bp obtained in the present investigation is in agreement with the findings of Sharma, who reported a similar size (~700 bp) for ZYMV isolates from various infected summer squash plants of H.P. [18]. Prieto et al. also sequenced a fragment 395 bp in length from the 3' portion of the CP gene of a Chilean isolate of ZYMV. In the present case, however, a sequence of only 154 nucleotides was obtained, confirming only partial amplification and sequencing of the CP gene [19].
Multiple sequence alignment of the nucleotide and protein sequences of the test isolate with the 67 other ZYMV isolates retrieved from NCBI revealed that, among the countries studied, the alignment score was highest for USA. It was lowest for China in the case of nucleotides and lowest for Korea in the case of proteins. The alignment scores for the Indian sequence of ZYMV obtained with CLUSTAL W were 86% and 77% for nucleotides and proteins, respectively.
Shukla and Ward predicted the amino acid sequence of the ZYMV coat protein from USA and compared it with the published amino acid sequences of other potyviral coat proteins [20]. Overall homology ranged from 47.5 to 67.1%. This was in agreement with the 38 to 71% range of homology observed among distinct potyviruses, whereas different strains of the same virus showed greater than 90% homology.
In the present study, the phylogenetic relationship of the test isolate with the 67 isolates of ZYMV retrieved from the NCBI database was determined at both the nucleotide and protein levels by applying four methods, viz. UPGMA [21], neighbour joining [22], maximum likelihood [23,24] and maximum parsimony, using PHYLIP 3.68 and EXOME TM software [25]. The present phylogenetic analysis at the nucleotide level indicated that DQ925447 (Australia), AJ420020 (Austria) with significant bootstrap,