A Tale of Effectors; Their Secretory Mechanisms and Computational Discovery in Pathogenic, Non-Pathogenic and Commensal Microbes

Secretory proteins that are involved in modulating hosts are called effectors. Lately, finding these important proteins among a large array of other gene products has been a focus area in many well-funded research programs. However, since biological data is accumulating at a much faster rate now than ever before, this search process can be compared to finding a needle in a haystack. Conventional laboratory-based methods require critical experiments, extended time and high cost, and in many cases fail to test the hypothesis. Using high-throughput sequencing technologies, whole genome sequences are generated much more quickly and efficiently. The avalanche of genomics data has opened new opportunities for the discovery of a large number of novel extracellular secretory proteins that usually go undetected with conventional methods. Recently, powerful bioinformatics methods have emerged that can predict effectors from whole genome data of pathogenic, commensal, symbiotic and environmental microorganisms much more easily. In this review, we present a broad overview of these biological molecules that modulate host responses in different ways in many organisms: pathogenic, non-pathogenic and commensal. We also catalogue the motifs associated with many secretory mechanisms and their prediction algorithms.


Introduction
Secretory mechanisms are highly specialized tools that microbes use to interact with their host systems. Certain specialized secretory proteins called effectors play a crucial role in host-microbe crosstalk. Effectors can be delivered into host cells via varied mechanisms, including specialized secretion systems, physical injection and protein translocation via signal sequences. The common attribute of every effector is the ability to cross the cell membrane of the microbial cell and move into the extracellular space. Once outside the microbial cell, whether inside or outside the host cell, the job of the effector is to modulate the host system.
The latest developments in genome sequencing technologies have advanced our knowledge of the genomes of a broad diversity of microbes, including bacteria, apicomplexa, fungi and oomycetes, as well as nematodes. Sequence analysis studies have revealed that effectors show very poor sequence similarity among themselves. Thus, the similarity-based methods that are often used for finding such molecules frequently fail. To circumvent this issue, many novel bioinformatics strategies have been devised for predicting effectors from the whole genome. A large number of previously unknown effectors have been predicted using various algorithms. Many of these effectors have now been characterized in the laboratory, demonstrating the efficacy of these methods [1]. While major advances have been made in generalizing the algorithms for in silico effector discovery, there is no comprehensive review on the classification and prediction of effectors in different microbial systems.
In this review article, we attempt to give an overview of secretory mechanisms in commensal, parasitic and saprophytic microbes. We discuss the Sec- and Tat-dependent and independent secretion systems in bacteria and other microbes. The different types of secretion systems, Type I through VII, are discussed with examples. Other less common secretion mechanisms, such as the chaperone-usher pathway and the Lol system, are also discussed. We elaborate on the motifs involved in secretion mechanisms and the available computational methods to predict effectors from genome sequences. This article summarizes many essential aspects of effectors and secretion systems in microbes that may be of help to researchers working in this area.

An Overview of Microbial Secretion mechanisms
The secretion process in bacteria involves two important pathways, the secretion (Sec) pathway and the twin-arginine translocation (Tat) pathway, both of which are universal across the tree of life. The Sec and Tat pathways are responsible for secreting proteins into the periplasmic space of Gram-negative bacteria, which have a double-membrane cell envelope, and to the exterior in Gram-positive bacteria, which have a single cell membrane. Secretion in Gram-negative bacteria generally involves transfer of proteins to the periplasmic space via the Sec or Tat pathway and then across the outer membrane through the Type II or Type V pathway, or less commonly via the Type I or Type IV pathway. However, some secreted proteins in Gram-negative bacteria are transported directly into the target cell in a single step across both membranes using the Type I, III, IV and VI pathways [2,3]. Among Gram-positive bacteria, Mycobacterium spp. utilize a special type VII secretion machinery to transport proteins across their hydrophobic cell membrane and impermeable cell wall [Figure 1].
Figure 1: Mechanism of bacterial major protein export and secretion. In the Sec pathway, unfolded proteins newly released in the periplasm can interact with chaperones and get transferred to the outer membrane, fold spontaneously or with the help of periplasmic chaperones, get secreted using other specialized systems, or become membrane-anchored via small lipid moieties guided by specialized lipoprotein sorting systems. Folded proteins are delivered via the Tat system. Bacterial effectors or DNA, following their delivery into the host, interact with its membrane or various organelles. Legends below the diagram describe different parts of bacteria and target host cells.

The Universal Sec and Tat Pathways
The Sec pathway carries out most of the protein export from the bacterial cytoplasm to the cell membrane and extracellular environment. These proteins have roles in nutrient acquisition, cell wall synthesis, and the virulence and viability of bacteria [4,5]. Proteins using the Sec pathway are almost always synthesized as unfolded preproteins carrying an N-terminal signal sequence that acts as an address tag [6]. For efficient preprotein translocation, the Sec system relies on the proton gradient and on ATP hydrolysis by the SecA motor protein as energy sources. SecA is, in some cases, implicated in bacterial virulence [5]. In contrast to the Sec pathway, the Tat secretion pathway is responsible for translocation of folded proteins across the cytoplasmic membrane of bacteria [7]. This secretion machinery recognizes a distinct motif rich in basic amino acids (Ser/Thr-Arg-Arg-X-Phe-Leu-Lys) in the N-terminal region of the signal peptide. The proteins are translocated by the Tat translocon, which consists of three membrane proteins, TatA, TatB and TatC, using only the proton gradient as energy source [7,8]. To prevent misrouting of Tat signal peptides into the Sec pathway, there are subtle differences between the properties of the two classes of proteins. The hydrophobicity of Tat proteins is slightly lower than that of Sec proteins [9]. Tat proteins carry out multiple cellular activities, including anaerobic metabolism, cell envelope biogenesis, metal acquisition and detoxification, and virulence [7,9,10].
The type I secretion system (T1SS) is capable of exporting secretory molecules outside, independent of the Sec and Tat systems [1]. For T1SS, ATP-binding cassette (ABC) transporters, outer membrane factors (OMFs) and membrane fusion proteins (MFPs) work in tandem for secretion. Several virulence factors, such as metalloproteases, adhesins and glycanases of plant pathogenic bacteria, and rhizobial proteins engaged in legume symbiosis are secreted through T1SS [2].
In the type II secretory system (T2SS), proteins that have already been secreted into the periplasmic space (via the Sec or Tat pathway) are exported out of the cell [11,12]. T2SS is conserved in Gram-negative bacteria and consists of a set of 12-16 conserved proteins that assemble into a supramolecular complex spanning the bacterial envelope, called the secreton [13]. T2SS is required by several plant and animal pathogens, including Vibrio cholerae, enterotoxigenic and enterohaemorrhagic Escherichia coli (ETEC and EHEC, respectively), Pseudomonas aeruginosa, Klebsiella spp., Legionella pneumophila and Yersinia enterocolitica. Virulence factors secreted via T2SS include exotoxin A, cholera toxin, and the pectinase and pectate lyase of the plant pathogens Erwinia carotovora and Xanthomonas campestris [12].
The type IV secretion system (T4SS) is unique in that it can transport nucleic acids into several hosts, including plants, animals, yeast or other bacteria, when in direct contact. Three types of T4SS are known: (i) the conjugation system, which translocates DNA intercellularly by a contact-dependent process; (ii) the effector translocator system, which transfers proteins or other macromolecules to the recipient cell; and (iii) the DNA release/uptake system, which translocates DNA to or from the extracellular milieu [14], as in the Helicobacter pylori ComB system. In Neisseria gonorrhoeae, DNA is secreted into the extracellular milieu. Brucella, Bartonella, Rickettsia and Anaplasma release virulence factors through a T4SS that consists of 12 proteins collectively termed the VirB/D4 family, similar in structure to the Agrobacterium tumefaciens VirB/D4 system [15]. Legionella and Coxiella spp. translocate their virulence factors through a T4SS that consists of two groups of proteins, the Dot (defective in organelle trafficking) and Icm (intracellular multiplication) proteins [15]. Recently, many new type IV secretory systems have been described. The novel GI type IV secretory system is responsible for the formation of a wide variety of genomic islands, as in the gonococcal genetic island (GGI) [16].
Interestingly, it has become apparent that the T2SS of Gram-negative bacteria, the type IV pilus system (T4PS) of Gram-positive and Gram-negative bacteria, the archaeal flagellum synthesis system and the transformation system of Gram-positive bacteria are evolutionarily related and share several structural and functional features [12].
Type V secretory systems (T5SS) are autotransporters [17]. Paradoxically, this is the simplest, largest and most recently discovered secretory system [11]. This system also depends on the Sec machinery for secretion of proteins into the periplasmic space via a signal peptide. After the signal sequence is cleaved, the passenger domain is exported using the C-terminal beta barrel, which forms a pore-like structure in the outer membrane. In the case of toxins, the passenger domain is cleaved from the beta barrel, forming a soluble component. These proteins are called autotransporters. Another subclass of T5SS is T5cSS, which is composed of a trimeric component in which each monomer contributes to the formation of the beta barrel. A further subclass, called two-partner secretion (TPS), is composed of pairs of proteins in which one partner carries the beta barrel and the other carries the secretory protein.
Over 500 types of proteins are secreted via the T5aSS class alone. Proteins secreted via the T5SS include adhesins such as AIDA-I and Ag43 of E. coli, Hia of Haemophilus influenzae, YadA of Yersinia enterocolitica and Prn of Bordetella pertussis. Toxins such as VacA of Helicobacter pylori; proteases such as the IgA proteases of Neisseria gonorrhoeae and Neisseria meningitidis, SepA of Shigella flexneri and PrtS of Serratia marcescens; and S-layer proteins such as rOmpB of Rickettsia sp. and Hsr of Helicobacter pylori are released through this secretory mechanism. T5bSS (TPS) secreted proteins include adhesins such as HecA/HecB of the plant pathogen Dickeya dadantii (Erwinia chrysanthemi) and cytolysins such as ShlA/ShlB of Serratia marcescens, HpmA/HpmB of Proteus mirabilis and EthA/EthB of Edwardsiella tarda [11].

The Sec/Tat Independent secretory systems
While T1SS and T4SS can also operate with or without the Sec/Tat secretory mechanisms, the Type III and Type VI secretion systems are the classical independent mechanisms.
Type III secretion system (T3SS): Some of the world's most important diseases of plants, animals and humans involve effector proteins delivered via T3SS. T3SS is a needle-like nanomachine that injects effectors directly into the cytoplasm of eukaryotic host cells to initiate infection. It consists of a basal apparatus embedded in the bacterial membrane, an external needle that protrudes from the cell surface and a tip complex that caps the needle. A translocon is assembled between the needle tip complex and the host cell, creating a pore in the host cell membrane [18]. The bacterial flagellum, which is a key motility organelle, also secretes virulence factors through its extracellular filament [19]. Following entry into host cells, T3SS effectors manipulate host cellular compartments and molecular pathways. Modification of cell signaling pathways, destruction of the cell membrane and mimicry of the structure or function of eukaryotic proteins are some of the strategies employed by the effectors to help bacteria in colonization, invasion and pathogenesis [20,21]. There are distinct regions, domains or motifs in the effectors that function in the infection process through protein-protein interaction, organelle targeting and immune modulation [22].
Type VI secretion system (T6SS): The T6SS of Gram-negative bacteria perforates prokaryotic or eukaryotic cells and releases toxic effector proteins directly into the target cells in a single cell-contact-dependent step. T6SSs are syringe-like contractile injection systems that are similar in structure and function to the cell-puncturing device of tailed bacteriophages [23]. The system was initially found to deliver effectors into eukaryotic cells; however, recent studies suggest that T6SSs are more important in mediating interbacterial interactions [24].
The Pseudomonas aeruginosa T6SS effectors T6S exported 1 (Tse1) and Tse3 degrade the peptidoglycan of other Gram-negative bacteria into small soluble fragments [25]. Several pathogenic genera, including Burkholderia, Pseudomonas, Yersinia and Vibrio, secrete the T6SS effector valine-glycine repeat protein G (VgrG), which has dual functions. VgrG serves as an integral structural component of T6SS and performs distinct effector activities that include cell adhesion, chitosan degradation and actin filament binding and modification. T6SS also targets the cell membrane [23]. T6SS phospholipase effectors, known as type 6 lipase effector (Tle) proteins, hydrolyse the component lipids of the cell membrane [26]. T6SS effectors also act as nucleases, which under experimental conditions degrade plasmid and chromosomal DNA [27]. T6SS plays an important role in mediating interbacterial antagonism, which in turn is important for bacterial pathogenicity. Enteric pathogens that have T6SS, including Salmonella enterica, Aeromonas hydrophila and Citrobacter rodentium, establish infection by disrupting the colonization barrier formed by the native microflora of the gut [24].
Type VII secretion system (T7SS): Secretory proteins in Gram-positive bacteria have a much shorter path to cover to be secreted. However, in organisms like Mycobacterium, where the cell wall core consists of three covalently linked structures (peptidoglycan, arabinogalactan and mycolic acids) and the mycolic acids form a hydrophobic layer with extremely low fluidity, secretion requires the specialized T7SS [28]. T7SS, or ESX (named after the early secreted antigenic target protein), transports proteins across the hydrophobic mycomembrane. Pathogenic mycobacteria have five different T7SSs, designated ESX-1 to ESX-5, some of which are important for mycobacterial virulence (ESX-1, ESX-5) and some for viability (ESX-3, ESX-5) [29]. Proteins secreted through T7SSs other than ESX-4 have characteristic N-terminal proline-glutamic acid (PE) or proline-proline-glutamic acid (PPE) motifs [29].
In the extracellular milieu, mycobacteria depend on ESX-3 for metal acquisition and, therefore, viability. Following internalization into macrophages, ESX-1 secreted factors acting as pore-forming toxins circumvent phagolysosomal degradation and facilitate translocation of pathogenic bacteria into the host cytosol. The cytosolic localization promotes mycobacterial replication, and ESX-5 secreted proteins manipulate the immune response of macrophages, inducing cell death and promoting bacterial dissemination.
Chaperone-usher pathway: Extracellular proteinaceous fibres are critical virulence factors of many Gram-negative bacteria; for example, the P and type 1 pili are virulence factors of uropathogenic Escherichia coli (UPEC), which causes the majority of urinary tract infections (UTIs). Pili, also termed fimbriae, are non-flagellar proteinaceous appendages composed of multiple pilin units. They are involved in host attachment and invasion, biofilm formation, cell motility and transport of proteins and DNA across membranes [30]. The chaperone-usher (CU) pathway is the most widespread among the several pathways that assemble adhesive pili at the surface of Gram-negative bacteria. Pili are assembled by two units, a periplasmic chaperone and an outer membrane pore-forming protein, the usher. The chaperone facilitates folding of pilus subunits in the periplasm and targets them to the usher. The usher is the assembly platform where pilus subunits are coordinated to form the pilus, which is released into the extracellular environment through the usher pore [30]. Pilus adhesins such as FimH interact with mannose residues on host epithelial cells to initiate pathogenesis [31]. Because of the important role of pili in virulence, several antibacterial agents have been designed that disturb pilus biogenesis or block bacterial adhesion to host cells [32].
Localization of lipoprotein (Lol) system: Bacterial lipoproteins are synthesized as precursors in the cytoplasm and processed into mature forms on the cytoplasmic membrane. Some lipoproteins play vital roles in the sorting of other lipoproteins, lipopolysaccharides and β-barrel proteins to the outer membrane. Helicobacter pylori colonizes the gastric mucosa of the human stomach with a variety of factors secreted from its outer membrane. Lipopolysaccharide and numerous outer membrane proteins are involved in adhesion and immune stimulation [4].

Evolutionary Patterns of Sec and Tat mechanisms
Although some information is available on the evolutionary relationships of Sec and Tat proteins, it does not cover life from different clades. To investigate the relationship between the Sec and Tat machineries across the kingdoms, we performed a fresh phylogenetic analysis of SecY and TatC proteins across the tree of life. We studied the evolutionary patterns of SecY and TatC proteins across 11 clades, namely Proteobacteria, Firmicutes, Cyanobacteria, Cryptophyta, Actinobacteria, fungi, algae, oomycetes, plants, Glaucocystophyceae and Animalia. Glaucocystophyceae belongs to a plant taxon that has reduced secretion machinery [33]. In the SecY protein cluster, a substantial amount of sequence conservation was observed. Some members of animal SecY grouped with algae and vice versa, indicating acquisition of these genes via lateral gene transfer in these organisms [Figure 2]. The TatC system, however, appeared to be much more diverse and dispersed in its evolutionary patterns, as was already discussed [34]. For instance, the oomycete TatC system showed very little sequence similarity with all other groups, indicating a much more polyphyletic origin. Members of different groups shared much less identity with each other (< 40%), and the oomycete group had the least identity with the remaining taxonomic groups (<= 25%) [ Figure.

Secretory systems in other microorganisms
In addition to bacterial effectors, there are a number of organisms, such as oomycetes, fungi, nematodes and protozoa, that cause devastating diseases by secreting a large number of effector proteins. Figure 4 summarizes the mechanisms by which these effectors modulate the host system.
In the case of many pathogenic fungi (e.g., powdery mildew and rust fungi, Magnaporthe oryzae and Colletotrichum higginsianum) and oomycetes (Hyaloperonospora arabidopsidis, Phytophthora infestans and Phytophthora sojae), penetration of the host cell wall is accomplished via a hypha that differentiates into a specialized feeding structure called a haustorium. Effectors are secreted into the extrahaustorial matrix and further translocated to the host cytoplasm [35-37]. Several haustorially expressed secreted proteins (HESPs; effectors) interact with cytoplasmic nucleotide-binding site and leucine-rich repeat (NBS-LRR) resistance proteins in the host. The intracellular protozoan parasites of humans (e.g. P. falciparum causing malaria, Toxoplasma spp. causing toxoplasmosis) form a specialized structure when entering a host cell; for P. falciparum this occurs during the blood stage of infection. This structure, the parasitophorous vacuole (PV), is functionally analogous to the haustorium of fungi and oomycetes. As in oomycetes and fungi, effectors are translocated across the PV to the host cytoplasm [38,39].
Plant parasitic nematodes (PPNs) are small roundworms infesting the roots of thousands of plant species. PPNs have a hollow, protrusible, syringe-like stylet, which mechanically pierces the host cell wall and injects gland secretions into the host cell cytoplasm [40]. The injectisomes of bacterial T3SS and T4SS are analogous to the stylets of PPNs [37,41]. A large number of nematode effectors are cell wall degrading or cell wall modifying enzymes; nematode effectors also affect plant signaling, hormone balance and cell morphogenesis [42].

Targeting of effectors to host cell, role of conserved motifs
Pathogenic microorganisms secrete various effector proteins to manipulate host cell machinery for virulence and survival. Effectors are composed of functionally distinct domains or motifs that are responsible for protein-protein interaction, enzymatic action or organelle targeting. Identifying these motifs provides important leads for designing antimicrobial peptides or drugs. Here we summarize some well-recognized motifs of bacterial effectors and their targeting mechanisms [43]. A large number of toxins secreted through T1SS, mostly exhibiting cytotoxic pore-forming activity, belong to the RTX family of proteins, characterized by the presence of arrays of glycine- and aspartate-rich repeats [44,45]. Following entry into host cells, bacterial effectors use specific motifs to reach their specific targets, including proteins in the nucleus, mitochondria, endoplasmic reticulum, cytoskeleton or plasma membrane, and carry out their activities [43,45]. A class of T3SS and T4SS effectors bearing the Glu-Pro-Ile-Tyr-Ala (EPIYA) sequence (EPIYA motif) elicit a pathogenic response by manipulating host cell signaling [46]; for example, the CagA protein of Helicobacter pylori is the archetypal EPIYA effector, implicated in gastric carcinoma [47]. A number of T3SS effectors whose motifs play roles during infection have been described [22]. Salmonella enterica serovar Typhimurium uses distinct motifs in its T3SS effectors to advance infection by exploiting host cell functions [48]. Proteins containing tetratricopeptide repeat (TPR) motifs are involved in the virulence mechanisms of a large number of T3SS effectors [49]. C-terminal amino acid composition and possible motifs have been exploited to predict T4SS effectors causing pathogenesis [50]. The Arg-Gly-Asp (RGD) motif present in extracellular host adhesion factors is used by some bacteria to attach to host cells.
The IPIO effector and some other proteins or peptides with an RGD motif interact with the adhesions between the cell wall and plasma membrane, leading to disruption of plant cells. Motifs composed of fewer than 10 amino acids, called short linear motifs (SLiMs), enable pathogenic bacteria to control intracellular processes of the host [51].
Citation: Bhowmick S, Tripathy S (2014) A Tale of Effectors; Their Secretory Mechanisms and Computational Discovery in Pathogenic, Non-Pathogenic and Commensal Microbes.
The cell wall peptidoglycan of Gram-positive bacteria acts as a surface organelle for the transport and assembly of many proteins that interact with hosts. Proteins secreted and released into the environment by living Gram-positive beneficial bacteria (probiotics) interact directly with mucosal cells, such as epithelial and immune cells [52]. Identification of these proteins and their motifs (Ala-X-Ala, the type I signal peptide motif, and Leu-Ala-Gly-Cys, the type II signal peptide or lipobox sequence) is important for designing probiotic effector molecules [53-55]. The extracellular pili of several probiotic bacteria induce immunomodulatory activity [56-58]. Xanthomonas bacteria inject TAL effector proteins into plant cells; these recognize effector-specific promoter sequences in nuclear DNA via unique repeat-variable diresidues (RVDs) and activate plant genes that either benefit the bacterium or trigger a host defense response [59]. Table 1 presents different examples of effectors with their targeting mechanisms. The availability of genome sequences has led to the identification of a plethora of putative translocated effector proteins in oomycetes [35,60]. Effectors of these pathogens are categorized into extracellular and intracellular types.
Extracellular effectors mediate protection against host defenses. They include the extracellular protease inhibitors EPI1 and EPI10, the cysteine protease inhibitors EPIC1 and EPIC2B, and glucanase inhibitors, all of which prevent degradation of pathogen cell wall components and the subsequent release of elicitors, thereby controlling host defence. The effectors that aid in invasion include some hydrolytic enzymes and toxins. Intracellular effector proteins act by suppressing host defense responses. However, some effectors can be recognized by host resistance (R) proteins and, as a result, initiate specific gene-for-gene defense responses. In such cases, these effectors are termed avirulence (Avr) proteins. A large number of Avr proteins contain a highly conserved amino acid motif, RXLR (Arg-any amino acid-Leu-Arg), which is often followed on its C-terminal side by an EER motif (Glu-Glu-Arg) [35,60]. The RXLR motif and its variants in both oomycetes and fungi enable the effectors to bind phosphatidylinositol-3-phosphate on the outer surface of plant and human cell plasma membranes, and this binding mediates effector entry through lipid raft mediated endocytosis [61]. Pseudoperonospora cubensis has a nuclear-localized effector (PcQNE) with an N-terminal QXLR motif, in which the first R of RXLR is substituted with a Q; this effector plays a pivotal role in pathogenicity [62]. Another group of cytoplasmic effectors of oomycetes, known as crinklers (CRN), cause leaf crinkling and necrosis of host plants [35,60]. They exhibit a highly conserved N-terminal Leu-Xaa-Leu-Phe-Leu-Ala-Lys (LxLFLAK) domain and a tri-peptide signature (Asp-Trp-Leu, DWL) in the C-terminal region that ends with a conserved His-Val-Leu-Val-Xaa-Xaa-Pro motif. Crinklers have a nuclear localization signal motif that targets them to the nucleus of the host cells.
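The RXLR-EER and LxLFLAK motifs described above lend themselves to a simple pattern search. The following is a minimal sketch in Python, not a published prediction tool: the function names, the 40-residue window allowed between RXLR and EER, and the 60-residue N-terminal window for LxLFLAK are illustrative assumptions that are not taken from the cited studies.

```python
import re

# Zero-width lookahead so that overlapping RXLR candidates are all visited.
RXLR = re.compile(r"(?=R.LR)")
EER = re.compile(r"EER")
LXLFLAK = re.compile(r"L.LFLAK")


def find_rxlr_eer(protein, max_gap=40):
    """Return (rxlr_start, eer_start) for the first RXLR that has an EER
    within max_gap residues downstream of it, or None if there is none.
    max_gap is an assumed, adjustable window (hypothetical default)."""
    protein = protein.upper()
    for match in RXLR.finditer(protein):
        window_start = match.start() + 4  # first residue after RXLR
        eer = EER.search(protein, window_start, window_start + max_gap)
        if eer:
            return match.start(), eer.start()
    return None


def has_lxlflak(protein, n_term_window=60):
    """True if an LxLFLAK motif occurs in the first n_term_window residues
    (the window size is an assumption made for this sketch)."""
    return LXLFLAK.search(protein.upper()[:n_term_window]) is not None


# Toy sequences invented for illustration, not real effectors.
print(find_rxlr_eer("MKLSAVVLAARFLRSTNDEERAAAK"))  # (10, 18)
print(has_lxlflak("MVKLYLFLAKGS"))                 # True
```

A pattern match of this kind only nominates candidates; it says nothing about secretion or function, so it would normally be combined with signal peptide prediction and the other filters discussed in this review.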
Transport of P. falciparum effectors across the PV membrane into the host cytosol requires an 11-amino-acid host cell-targeting (HT) motif (Rx1SRxLxE/D/Qx2x3x4) with a 5-amino-acid core (RxLxE/D/Q), alternatively known as the Plasmodium Export Element (PEXEL), that is conserved among diverse proteins [63]. The HT motif and PEXEL are identified by different algorithms and have slightly different specificities, but recognize the same core sequence (RxLxE/D/Q). Proteins with the HT motif cross the PV and enter the red blood cell, whereas proteins without HT are released into the PV. Recent studies show that export depends on the HT signal binding to the lipid phosphatidylinositol-3-phosphate (PI(3)P) in the parasite endoplasmic reticulum (ER). A protease in the parasite ER, plasmepsin V, cleaves the HT signal, and the proteins are then released into the erythrocyte cytoplasm [64]; however, a PI(3)P-independent transport pathway has also been reported [65]. The HT signal was originally identified as a highly conserved motif by pattern recognition programs in five functionally equivalent ~40-amino-acid vacuolar translocation sequences (VTS) [AFNNNLCSKNAKGLNLNKRLLYETQAHVDDVHHAHHADV]. The R, L and E residues (within RLLYE in the sequence above) are known to contribute to both PI(3)P binding and protein export to the erythrocyte. Substitution of the sequence KNAKGLN upstream of the HT motif also blocked export, but has not been tested for PI(3)P binding. The sequence QAHVDDVHHAHHADV indicates that charged amino acids downstream of the HT motif enable export independent of PI(3)P. Substitution of the HHAHHA sequence in repeat sequences does not block export [63,66].
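As an illustration of how the RxLxE/D/Q core can be located in a sequence, the sketch below scans the vacuolar translocation sequence quoted above with a regular expression. It is a plain pattern match written for this example and is not the algorithm used by either the HT or the PEXEL prediction programs; the function name is hypothetical.

```python
import re

# RxLxE/D/Q core; the lookahead lets overlapping candidates be reported.
PEXEL_CORE = re.compile(r"(?=(R.L.[EDQ]))")


def find_pexel_cores(protein):
    """Return a list of (0-based start, matched pentamer) for every
    occurrence of the RxLxE/D/Q core in the protein sequence."""
    return [(m.start(), m.group(1)) for m in PEXEL_CORE.finditer(protein.upper())]


# Vacuolar translocation sequence quoted in the text above.
VTS = "AFNNNLCSKNAKGLNLNKRLLYETQAHVDDVHHAHHADV"
print(find_pexel_cores(VTS))  # [(18, 'RLLYE')]
```

In this sequence the only match is RLLYE, which is flanked upstream by KNAKGLN and downstream by the charged QAHVDDVHHAHHADV stretch discussed in the text.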
Toxoplasma parasites secrete a number of effectors to manipulate the host immune system; these include the rhoptry protein kinases (the ROP class) ROP18, ROP5, ROP16 and ROP38, which accumulate in the PV. A conserved HGB motif was identified in ROP5, which is a major virulence determinant of this protozoan [67].
The cyst nematodes secrete active CLAVATA3/ESR (CLE)-like proteins that facilitate nematode parasitism by interacting with several plant signaling pathways [42]. Nuclear localization signals (NLSs) have been predicted in several secreted proteins. During parasitism, these effectors interact with nuclear proteins, leading to suppression of plant immunity.
Table 1 (excerpt): Targeted to LCV through prenylation (addition of a 15-carbon farnesyl or a 20-carbon geranylgeranyl isoprenoid group to a Cys residue in the CaaX motif) [91]. Targeted to SCV and SIF through prenylation [92].

Effectors localize themselves in Long Intergenic Regions
In eukaryotic pathogens, effector genes are found to be localized in long intergenic regions [68]. The most plausible reason is that when effectors integrate into a tightly packed, more stable, gene-rich region, that region becomes unstable and hence undergoes negative selection. Gene-sparse long intergenic regions, however, are more plastic and can tolerate transposon-mediated gene expansion [69]. Many phytopathogenic ascomycetes, such as Leptosphaeria maculans, are known to have effectors localized in AT-rich regions [70,61]. Recently, a great deal of genomics work has been carried out on effector systems in oomycete pathogens [71,76]. In oomycete pathogens, there are pockets in the genome, called gated communities, where all the housekeeping and conserved genes reside [72]. Any effector integration event is rapidly flushed out of these areas, and hence the term gated community has been coined for them [73]. The point of effector insertion is known to cause a break of colinearity between conserved genomes [Figure 5]. Recently, the chromosomal positions of effector genes have been built into algorithms for effector discovery.
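One way to use chromosomal position in practice is to measure the intergenic distance on either side of each gene and flag genes that sit in gene-sparse neighbourhoods. The sketch below assumes a simple list of (gene_id, contig, start, end) tuples with 1-based inclusive coordinates; the function names and the 1500 bp cut-off are hypothetical choices for illustration, strand is ignored, and nested genes are not handled.

```python
from collections import defaultdict


def flanking_gaps(genes):
    """Map gene_id -> (left_gap, right_gap) in base pairs.
    A gap is None when the gene is the first or last on its contig.
    Overlapping neighbours are reported as a gap of 0."""
    by_contig = defaultdict(list)
    for gene_id, contig, start, end in genes:
        by_contig[contig].append((start, end, gene_id))
    gaps = {}
    for items in by_contig.values():
        items.sort()  # order genes by start coordinate within each contig
        for i, (start, end, gene_id) in enumerate(items):
            left = max(start - items[i - 1][1] - 1, 0) if i > 0 else None
            right = max(items[i + 1][0] - end - 1, 0) if i + 1 < len(items) else None
            gaps[gene_id] = (left, right)
    return gaps


def genes_in_long_intergenic_regions(genes, min_gap=1500):
    """Gene ids whose two flanking gaps are both known and >= min_gap.
    min_gap is an assumed threshold, to be tuned for each genome."""
    return sorted(
        gene_id
        for gene_id, (left, right) in flanking_gaps(genes).items()
        if left is not None and right is not None
        and left >= min_gap and right >= min_gap
    )


# Toy annotation invented for illustration.
toy_genes = [
    ("A", "c1", 1, 100),
    ("B", "c1", 2101, 2200),
    ("C", "c1", 4201, 4300),
    ("D", "c1", 4351, 4400),
]
print(genes_in_long_intergenic_regions(toy_genes))  # ['B']
```

Genes at contig ends are excluded here because one of their flanking distances is unknown; a more permissive rule could be substituted if the assembly is fragmented.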

Algorithms for effector prediction
To develop algorithms that predict effectors, it is essential to understand a few of their common properties [74,75]. Effectors are almost always small proteins undergoing positive selection, expansion and rapid evolution [71,76]. They have an N-terminal secretory signal [77] and a C-terminal E-rich basic motif. Interspersed in the protein sequence is a relatively large number of cysteine residues for proper protein folding [78]. Several types of effectors have been found to have a specific signature motif that distinguishes and defines their physiological role [Table 1]. Effector genes often lack introns and are located in long intergenic regions associated with a transposon. In addition, effectors are often characterized by the lack of a transmembrane domain [37]. Depending on their localization in the host cell, they are known to carry an NLS (nuclear localization signal) or an MLS (mitochondrial localization signal) [1,79]. Since most effectors are located in genomic islands and are acquired through horizontal gene transfer (HGT) events, they differ subtly from the native genome in GC content, coding potential and other genomic properties. Based on these traits, many algorithms have been designed for efficient effector discovery [Figure 6].
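Two of the properties listed above, small size and a relatively high cysteine content, can be computed directly from a protein sequence. The sketch below is only an illustration of such a property filter: the 300-residue length limit and the 3% cysteine fraction are hypothetical thresholds chosen for the example and are not values reported in the cited literature.

```python
def effector_features(protein):
    """Return length, cysteine count and cysteine fraction of a protein.
    A trailing '*' (stop symbol from translation) is ignored."""
    protein = protein.upper().rstrip("*")
    length = len(protein)
    cysteines = protein.count("C")
    fraction = cysteines / length if length else 0.0
    return {"length": length, "cysteines": cysteines, "cys_fraction": fraction}


def is_small_cysteine_rich(protein, max_len=300, min_cys_fraction=0.03):
    """True for short, cysteine-rich sequences.
    Both thresholds are assumptions for this sketch and should be tuned."""
    features = effector_features(protein)
    return (0 < features["length"] <= max_len
            and features["cys_fraction"] >= min_cys_fraction)


# Toy peptide invented for illustration: 9 residues, 4 cysteines.
print(effector_features("MKCACCGHC")["cysteines"])  # 4
print(is_small_cysteine_rich("MKCACCGHC"))          # True
```

Such a filter is deliberately permissive; in a realistic workflow it would be combined with signal peptide, transmembrane domain and localization signal predictions.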
The TATFIND algorithm was developed to identify putative Tat substrates by looking for the presence of the conserved twin-arginine motif in bacterial signal peptides [80].
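The idea of searching signal peptides for the twin-arginine motif can be illustrated with a strict regular expression for the consensus given earlier in this review (Ser/Thr-Arg-Arg-X-Phe-Leu-Lys). This sketch is not TATFIND and does not reproduce its rules; the 40-residue N-terminal window and the function name are assumptions made for the example, and real Tat signal peptides often deviate from the strict consensus.

```python
import re

# Strict consensus from the text: [S/T]-R-R-x-F-L-K
TAT_CONSENSUS = re.compile(r"[ST]RR.FLK")


def find_tat_motif(protein, n_term_window=40):
    """Return the 0-based start of the first consensus match within the
    first n_term_window residues, or None if no match is found.
    The window size is an assumed value, not taken from TATFIND."""
    match = TAT_CONSENSUS.search(protein.upper()[:n_term_window])
    return match.start() if match else None


# Toy N-terminal sequences invented for illustration.
print(find_tat_motif("MSDKLTRRDFLKAAGV"))  # 5
print(find_tat_motif("MKKLLLAAALLAAQA"))   # None
```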
Effectors lack significant sequence homology with each other, so prediction methods based on homology almost always fail. Most gene prediction algorithms miss effectors in a newly sequenced organism [71,81]. Effector prediction in a newly sequenced organism can be carried out with the following steps.
Step 1: Genome-wide six-frame translation: The whole-genome six-frame translation can be carried out using the getorf program of the EMBOSS package [82].
Step 2: Filtration: Sequences shorter than 30 aa residues can be filtered out.
Step 3: Recursive BLAST search [83]: Perform a BLAST search against known effectors, merge the homologous sequences found by BLAST into the database, and repeat the search until no new BLAST hits are obtained.
Step 5: Coding potential filtering: For the given organism, create a codon usage table using the cusp program from the EMBOSS package and filter out sequences with coding potential below zero.
Step 6: Filter RCPs, SCPs and LIRs with higher coding potential: The six-frame ORF data containing repeat-containing proteins (RCPs), small cysteine-rich proteins (SCPs) and long intergenic regions (LIRs) [78] can be selected for the next process.
Step 7: HMM search: Run an HMM search on the dataset created from step 4 and continue the search recursively, adding new HMM-positive sequences to the HMM datasets. Continue until no new HMM hits are obtained.
Step 8: Motif search: Further filter the sequences obtained from step 5 and look for the motifs that are typical of the kind of effectors you are looking for.
Step 9: Permutation analysis: Shuffle the nucleotides flanking the motif of interest and count how often the motif sequence occurs by chance, that is, without any biological significance.
Step 10: Detecting signals: From the sequences with true-positive motifs, find the proteins having a nuclear localization signal (NLS), mitochondrial localization signal (MLS), C-terminal basic region [55], C-terminal E motifs, secretome-positive proteins [77] and transmembrane-negative proteins, and merge them with the dataset generated from step 6. Repeat step 7 until no new proteins are obtained. This final dataset is then clustered using ClustalW to group the proteins into distinct clusters for evolutionary studies.
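Three of the steps listed above (six-frame translation, length filtering and permutation analysis) can be sketched in a few lines of Python. The code below is a pure-Python stand-in written for illustration and is not the EMBOSS getorf program: it uses the standard genetic code, treats every stretch between stop codons as an ORF, and applies the 30-residue cut-off from Step 2. The function names, the number of shuffles and the random seed are assumptions, and the shuffling function permutes whatever sequence it is given, so the caller is expected to pass in the flanking region of interest.

```python
import random
import re
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate DNA in frame 0; codons with ambiguous bases become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_orfs(dna, min_aa=30):
    """Return (strand, frame, peptide) for every stop-to-stop stretch of
    at least min_aa residues in all six reading frames."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse)):
        for frame in range(3):
            for peptide in translate(seq[frame:]).split("*"):
                if len(peptide) >= min_aa:
                    orfs.append((strand, frame, peptide))
    return orfs


def chance_rate(sequence, motif_regex, n_shuffles=1000, seed=0):
    """Fraction of random shuffles of `sequence` that contain the motif.
    A high rate suggests the motif could easily arise by chance."""
    rng = random.Random(seed)
    letters = list(sequence)
    pattern = re.compile(motif_regex)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        if pattern.search("".join(letters)):
            hits += 1
    return hits / n_shuffles


# Toy input: a start codon, 30 alanine codons and a stop codon.
toy_dna = "ATG" + "GCT" * 30 + "TAA"
orfs = six_frame_orfs(toy_dna)
print(("+", 0, "M" + "A" * 30) in orfs)  # True
```

The peptides returned by six_frame_orfs could then be passed to the similarity, HMM and motif searches of the later steps, and chance_rate gives a rough empirical estimate of how often a motif appears in composition-matched random sequence.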

Conclusion
The avalanche of genomic information available over the last decade has enabled scientists to understand the key mechanisms responsible for the secretion process. Secretory proteins are no longer the elusive elements they once were. From whole genome sequences, canonical and non-canonical secretomes can now be predicted readily using modern computational methods. We have taken the most up-to-date information available on secretory mechanisms and summarized it in this article. The Sec/Tat-dependent and independent systems are discussed in detail, with examples, to serve as a catalogue for researchers working in this area. Using diverse genomic data, we further corroborated the observation that Tat machineries are more diverse whereas Sec pathways are quite conserved. We have compiled the secretion types with their associated functions and motifs in detail. In addition, we have summarized the software resources available for effector prediction for the benefit of wet lab researchers. In this article we have attempted to compile relevant information on effector biology and prediction across the microbial world. This review should serve as a quick guide for wet lab and dry lab scientists in understanding effector biology, motif finding and the prediction of novel effectors.