A Novel Algorithm to Design an Efficient siRNA by Combining the Pre Proposed Rules of siRNA Designing

Short interfering RNAs (siRNAs) can be used to suppress gene expression and have many potential applications in therapy, yet how to design an effective siRNA remains an open question. Numerous siRNA design tools have been developed recently. The set of candidates reported by these tools is usually large and often contains ineffective siRNAs. We began by filtering out ineffective siRNAs, specifically


Introduction
RNA interference (RNAi) is a powerful, specific process that is actively carried out by dedicated mechanisms in the cell. It is a gene regulatory pathway triggered in response to double-stranded RNA (dsRNA) (Hannon, 2002). dsRNA induces sequence-specific posttranscriptional silencing of cognate genes in numerous organisms (Cogoni and Macino, 2000; Zamore, 2002a; Denli and Hannon, 2003). The multidomain ribonuclease III enzyme Dicer cleaves long dsRNA into duplexes of 21-23 nucleotides (nt) termed short interfering RNAs (siRNAs) (Ketting et al., 2001; Knight and Bass, 2001), which direct the cleavage of complementary mRNA targets, a process known as RNA interference (Fire et al., 1998). Prior to target mRNA recognition, an siRNA duplex goes through an ATP-dependent unwinding process, and one strand is often preferentially loaded over the other onto the RNA-induced silencing complex (RISC), the multiple-turnover enzyme complex that mediates endonucleolytic cleavage in the RNAi pathway. The RISC is guided to cleave target mRNAs sharing perfect complementarity across the center of the complementary siRNA strand in the absence of high-energy cofactors (Nykanen et al., 2001; Hutvagner and Zamore, 2002; Martinez et al., 2002). siRNAs are not the only products of Dicer. Natural dsRNA-encoding genes, named microRNA (miRNA) genes, encode RNA products of about 70 nt that are predicted to form imperfect hairpin structures and are processed by Dicer into mature 21-23-nt miRNAs (Grishok et al., 2001; Hutvagner et al., 2001).
Gene silencing mediated by siRNA is not evident at the mRNA level but is apparently detectable at the protein level. Eliminating the off-target silencing mediated by 5' seed pairing is extremely challenging, and most siRNAs will have a number of unintended targets affected by 5' seed pairing. In discovery research, more than one siRNA against a given target is commonly used to ensure that a single phenotype results from on-target gene silencing by most of the siRNAs used. This "multiple siRNAs per target" approach is acceptable for discovery research but fails in potential therapeutic applications (Jayasena, 2006). Further refinement of design algorithms is therefore necessary, especially for siRNAs aimed at therapeutic development. miRNA target prediction algorithms are expected to be merged with those developed for siRNA design in order to eliminate candidate siRNAs with potential off-target gene silencing through the undesirable miRNA-related mechanism (Jayasena, 2006).
While it is desirable to incorporate all of the selection rules into a computer-aided siRNA design algorithm, the current difficulty is how to rank the published rules, especially when some of them are contradictory. Many computer-aided siRNA design tools have been published, and some have been made accessible through websites. However, none of these tools has successfully incorporated all the rules, and most of them treat the rules they employ without much differentiation. In general, the existing tools adopt a set of rules and assign each rule an equal or different score; each siRNA sequence is scored against every rule, and only those sequences scoring above a predefined threshold are selected as valid siRNA sequences. Such a simple selection procedure does not accommodate the possibility that some rules are critical for the validity of an siRNA sequence (they must be met), while other rules only affect the efficiency of the siRNA sequence (Hong et al., 2006). Moreover, these web-based tools give users very limited flexibility: users cannot reorganize the selection rules based on their own preferences or on recent research data (Hong et al., 2006). Although the actual mechanism is still unclear, the off-target effect of siRNA is largely attributed to partial sequence homology between the siRNA and its unintended targets.

Table 1

Moreover, Java code scriptlets were replaced with JSP tags to improve the readability of the tool. (f) JDBC: Because the JSP/Java application needs to communicate with a database, a JDBC connector was used.
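The distinction drawn above between rules that must be met and rules that only affect efficiency can be sketched in a few lines. This is a minimal illustration, not the rule set of any published tool: the specific rules, the 19-nt length, the weights, and the threshold are all hypothetical values chosen for the example.

```python
from typing import Callable, List, Tuple


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a non-empty sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


# Mandatory rules (illustrative only): failing any one rejects the candidate.
MANDATORY: List[Callable[[str], bool]] = [
    lambda s: len(s) == 19,                               # assumed core length
    lambda s: set(s) <= set("ACGU"),                      # no ambiguous bases
    lambda s: not any(base * 4 in s for base in "ACGU"),  # no run of 4 identical bases
]

# Efficiency rules (illustrative only): (description, check, weight).
EFFICIENCY: List[Tuple[str, Callable[[str], bool], int]] = [
    ("GC content between 30% and 52%",
     lambda s: 0.30 <= gc_fraction(s) <= 0.52, 1),
    ("A or U at position 19 of the sense strand",
     lambda s: s[18] in "AU", 2),
    ("at least three A/U in positions 15-19",
     lambda s: sum(b in "AU" for b in s[14:19]) >= 3, 1),
]


def evaluate(sense: str, threshold: int = 3) -> Tuple[bool, int]:
    """Return (accepted, score) for a sense-strand candidate.

    Mandatory rules act as hard filters; efficiency rules are summed
    and compared with a hypothetical threshold.
    """
    seq = sense.upper().replace("T", "U")
    if not all(rule(seq) for rule in MANDATORY):
        return False, 0
    score = sum(weight for _, check, weight in EFFICIENCY if check(seq))
    return score >= threshold, score
```

Separating the two lists means a candidate with a high efficiency score can still be rejected by a single mandatory rule, which a flat sum of scores cannot express.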

Hardware and environment
A Compaq Intel 2.80 PC with a Pentium dual-core processor running the Microsoft Windows Server 2003 operating system was used for development. Part of the work was performed in a Linux environment using the SUSE Linux 10.1 operating system.
Other software and packages used
MATLAB 6.5 statistical software was used to analyze the siRNA data, and various online siRNA design and validation tools were used. NCBI RefSeq (www.ncbi.nlm.nih.gov/refseq) was used as the experimental mRNA dataset. To test the efficacy of the designed algorithm, a Homo sapiens Alzheimer's mRNA sequence (Accession no. NM_001007532.2) was used as the reference sequence. siRNA predictions and their effectiveness were compared for the Ambion tool, the siRNA program of mEMBOSS 6.0.1, and our tool based on the designed algorithm (Figure 1), using the OligoWalk tool of the RNAstructure 4.5 package.

(b) siRNA database: The Ambion siRNA database and SIR were used to compare the designed siRNAs with pre-existing siRNA sequences.

(c) BLAST and Smith-Waterman search: It has been suggested that sequence homology left undetected by BLAST search plays a major role in siRNA design. In the present work, we employed two filters to screen for possible off-target effects. First, BLAST was applied to identify and remove any off-target matches among all the siRNA sequences that survived the three-phase selection procedure. Then, the remaining sequences were screened by the Smith-Waterman search.

additional rules that improve the efficiency of RNAi significantly when applied together. Each rule is assigned a score, and the scores are summed to give a total duplex score. Of a set of siRNAs with scores greater than 6, only about 17% were non-functional in their study, but results on independent test sets were not as successful, which may be caused by overfitting to the training data.

The source code of the algorithm was used to find all possible target sites distributed among the retrieved FASTA sequences. All siRNAs were designed on the basis of these target sites. The BLAST algorithm was applied for homology search and the FASTA algorithm to detect off-target sequences. The designed siRNAs were then compared with the target mRNA.
Finally, the effectiveness, secondary structure, duplex melting temperature, free energy, and specific effects (interaction with other biochemical pathways) of the designed siRNAs were calculated.
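The second off-target filter described above relies on Smith-Waterman local alignment, which can find short regions of similarity that a heuristic search may miss. The following is a minimal sketch of the scoring recurrence with a linear gap penalty. The match, mismatch, and gap values, the `min_score` cutoff, and the helper name `flag_off_targets` are assumptions made for illustration and are not taken from the tool.

```python
from typing import Dict, List


def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero: that is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def flag_off_targets(sirna: str, transcripts: Dict[str, str],
                     min_score: int) -> List[str]:
    """Names of transcripts whose local similarity to the siRNA reaches min_score."""
    return [name for name, seq in transcripts.items()
            if smith_waterman_score(sirna, seq) >= min_score]
```

In practice the transcripts passed to such a screen would exclude the intended target, so that any remaining hit above the cutoff marks a candidate for removal.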

Discussion
The melting temperature, calculated for the duplex formed by the antisense siRNA strand and the target mRNA, was higher for the siRNAs designed by our tool. This result indicates that siRNAs designed by our tool show a greater and more accurate tendency toward silencing. The duplex energy and overall energy values were comparatively lower, which indicates that the siRNAs designed by our tool will be more stable.

ISSN: 0974-7230 JCSB, an open access journal
Our siRNA design tool runs on any platform because it was built on Java/J2EE, which is platform independent. To speed up the design process, the databases involved were implemented locally, with MySQL as the relational database management system hosting the local database. JSP, servlet, and HTML scripts were written to communicate with the databases through a web-based graphical user interface (GUI), and the Java programming language was used to implement the design algorithms and to parse BLAST and FASTA output. MATLAB statistical software was used to analyze the siRNA data. The tool uses information from genetic databases such as NCBI RefSeq and the Human UniGene database, together with similarity search software such as BLAST and FASTA, to confirm the uniqueness of each siRNA design. Users of the web tool need not install the siRNA design tool locally. BLAST searches carried out in the program were performed using BLASTN from the NCBI standalone BLAST package with standard settings and no filtering. The web-based front end of the program, as well as the in silico digestion of the input query, the quality control algorithm, and the output parsing scripts, were written in Java. The JSP/servlet for interactive selection of sequences and manipulation of penalty scores from the graphical output was also programmed in Java.

The software accepts one or multiple target genes as input, in GenBank or FASTA format. Since the GenBank format provides the location of the coding region of the gene (CDS), it is the preferred format in this study. Once the start location is determined for each gene sequence, the selection process begins by collecting siRNA candidates. It shifts one nucleotide at a time along the sequence to exhaust all potential siRNA sequences and skips any sequence that contains nucleotides other than A, T/U, G, or C, because such uncertain regions may harbor single nucleotide polymorphisms (SNPs).
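The candidate collection step described above, a window that moves one nucleotide at a time and skips ambiguous bases, can be sketched as follows. The function name `enumerate_candidates` and the default window length of 21 nt are assumptions for illustration; the tool itself was written in Java and lets the user set the length.

```python
from typing import Iterator, Tuple


def enumerate_candidates(mrna: str, length: int = 21,
                         start: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (position, window) for every unambiguous window of the given length.

    `start` is the 0-based offset at which scanning begins, for example
    the start of the coding region when it is known.
    """
    seq = mrna.upper().replace("T", "U")
    allowed = set("ACGU")
    for pos in range(start, len(seq) - length + 1):
        window = seq[pos:pos + length]
        # Skip windows containing anything other than A, C, G, U (e.g. N).
        if set(window) <= allowed:
            yield pos, window
```

For example, with `length=4` the input `"ACGTNACGT"` yields only the windows at positions 0 and 5, because every window in between overlaps the ambiguous `N`.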
A major advantage of the selection process in this tool is that it allows users to adjust all the selection criteria, or even rearrange the filters in the three phases, through a configuration file. Users can also adjust the following from the graphical user interface (GUI) of the tool: the length of the siRNA, the range of GC content, the definition of polymers of A, U/T, G, and C, etc.
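One way to picture a configuration file that both sets the criteria and orders the filters across phases is sketched below. The JSON layout, the key names, and the filter names are hypothetical, since the actual configuration format of the tool is not described here; only the adjustable quantities (length, GC range, homopolymer definition) come from the text.

```python
import json
from typing import Callable, Dict, List

# Hypothetical configuration: criteria plus the order of filters in three phases.
CONFIG_TEXT = """
{
  "length": 19,
  "gc_range": [0.30, 0.55],
  "max_homopolymer": 3,
  "phases": [["length"], ["gc"], ["homopolymer"]]
}
"""


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases; 0.0 for an empty sequence."""
    return sum(b in "GC" for b in seq) / len(seq) if seq else 0.0


def longest_run(seq: str) -> int:
    """Length of the longest stretch of one repeated base."""
    best = run = 0
    prev = ""
    for base in seq:
        run = run + 1 if base == prev else 1
        best = max(best, run)
        prev = base
    return best


# Registry mapping filter names used in the config to their checks.
FILTERS: Dict[str, Callable[[str, dict], bool]] = {
    "length": lambda s, c: len(s) == c["length"],
    "gc": lambda s, c: c["gc_range"][0] <= gc_fraction(s) <= c["gc_range"][1],
    "homopolymer": lambda s, c: longest_run(s) <= c["max_homopolymer"],
}


def run_phases(candidates: List[str], config: dict) -> List[str]:
    """Apply each phase in the configured order and return the survivors."""
    survivors = list(candidates)
    for phase in config["phases"]:
        survivors = [s for s in survivors
                     if all(FILTERS[name](s, config) for name in phase)]
    return survivors
```

Because the phase order lives in the configuration rather than in the code, moving a filter from one phase to another requires editing only the `phases` list.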

Conclusion
Most designed siRNAs show off-target effects, which can be reduced by Smith-Waterman searching, but off-target effects can be removed completely by chemical modification of the nucleotide bases. The stability of siRNAs can be increased by converting them into short hairpin form. The algorithm calculated the RNA secondary structure and minimum free energy for each target sense and antisense sequence. Sequences were filtered to remove candidates with unfavorable thermodynamic properties. Effective siRNAs have a relatively lower duplex stability (Tm; less stable, more A/U rich) toward the 5'-end of the strand that remains in RISC (the 'guide strand') and a relatively higher Tm (more stable, more G/C rich) toward the 5'-end of the degraded or ejected strand.

Journal of Computer Science & Systems Biology - Open Access JCSB/Vol.3 Issue 1
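The end asymmetry described in the Conclusion can be approximated with a simple base count. The sketch below uses A/U richness in a short terminal window as a crude proxy for local duplex stability; it is not a nearest-neighbor thermodynamic calculation, and the window size of 4 and the function names are assumptions made for illustration.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def au_count(seq: str) -> int:
    """Number of A and U bases in the sequence."""
    return sum(base in "AU" for base in seq)


def end_asymmetry(sense: str, window: int = 4) -> int:
    """A/U count at the guide 5' end minus A/U count at the sense 5' end.

    A positive value means the 5' end of the guide (antisense) strand is
    more A/U rich, the pattern associated above with effective siRNAs.
    """
    sense = sense.upper().replace("T", "U")
    guide = reverse_complement(sense)
    return au_count(guide[:window]) - au_count(sense[:window])
```

A sense strand that starts G/C rich and ends A/U rich, such as `GCGCACGUACGUAAUAU`, gives a positive value, while the reversed pattern gives a negative one.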