Proteomics Approaches towards Early Detection and Diagnosis of Cancer

Detecting cancer at an early stage is key to a better outcome of therapeutic intervention. Most routine screening and diagnostic tools for cancer lack sufficient sensitivity and/or specificity, and some are invasive. Proteomic technologies have recently shown great promise in the search for new clinical biomarkers for the early detection and diagnosis of cancer, as well as in the discovery of new therapeutic targets from accessible bio-specimens. They also have the potential to contribute to a better understanding of cancer biology and to help in making the right therapeutic decisions for patients. Whereas some proteomic approaches, such as those used for identifying proteins and analyzing their interactions and functions, are well established, others, such as protein expression profiling for biomarker discovery and validation, still suffer from robustness and reproducibility issues that must be resolved before they can be applied clinically in cancer.


Introduction
Genome sequencing and analysis have produced a wealth of information during the last two decades, including the full human genome sequence. The logical next step was to look at proteins, the workhorse biomolecules that are translated from genes, functionally govern cellular processes, and control disease progression and malignancy. Due to various cellular mechanisms, including alternative splicing and post-translational modifications of proteins (e.g., phosphorylation, glycosylation, acetylation, and proteolytic cleavage), it is estimated that the human proteome comprises more than 500,000 proteins [1,2], compared with about 22,000 protein-coding genes [3]. In addition, the dynamics of proteins are more complex than those of genes, owing to the various localizations of proteins within the cell, their half-lives, their interconnectivity in complexes and signaling pathways, and their response to stimuli such as disease and treatment [4]. Moreover, it is now well established that changes in the levels or abundance of genes and their transcripts do not always correlate with protein abundance [5,6]. Therefore, cancer can now be considered a proteomic disease that is more closely linked to post-transcriptional steps [7], even though genetic background still contributes in part to the predisposition to and development of this disease.
level is elevated in only about half of women with early-stage ovarian cancer [11,12]. CA-125 also has low specificity, since benign conditions such as endometriosis and pregnancy can elevate CA-125 levels [13,14]. AFP itself shows abnormal expression levels in only two-thirds of HCC patients [15]. These facts highlight the urgent, yet unmet, need to discover novel, sensitive, and accurate tumor markers for cancer screening, diagnosis, and prognosis, and to foster translational research in oncology that moves from bench to bedside.
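The sensitivity and specificity figures quoted above can be made concrete with a minimal sketch. The counts below are hypothetical and chosen only so that the sensitivity matches the "about half" quoted for early-stage ovarian cancer; the specificity value is purely illustrative and is not taken from any cited study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of diseased subjects that the marker correctly flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of disease-free subjects that the marker correctly clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical cohort: 100 early-stage patients, of whom 50 show an elevated marker.
sens = sensitivity(true_pos=50, false_neg=50)

# Hypothetical control group: 200 subjects without cancer, of whom 20 show an
# elevated marker because of benign conditions.
spec = specificity(true_neg=180, false_pos=20)

print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

A marker with these characteristics would miss half of the early-stage cases, which is why sensitivity and specificity have to be reported together when a screening marker is evaluated.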
On the other hand, the number of new protein biomarkers achieving FDA approval has trended downwards over the last decade, with three or fewer new markers approved per year across all diseases [4,16]. For cancer, until recently only nine FDA-approved blood-based markers were available, and most of them are used to monitor treatment [17]. This disappointing trend suggests that conventional approaches have reached the limit of their contribution and that there is an urgent need to develop and implement new approaches to discover new biomarkers and translate them into clinical use [18].
Altogether, these elements make the proteome and proteomics of great interest to both researchers and clinicians, in particular for complex diseases such as cancer. In addition, the ability to systematically and simultaneously identify and quantify large numbers of proteins positions proteomics at the forefront of efforts to understand cancer biology and to develop promising biomarkers and drug targets for the disease.
Proteome analysis relies principally on mass spectrometry (MS), though other approaches such as microarrays and antibody panels are available. MS-based proteomics, past its infancy, is starting to mature through clear developments in both technologies and experimental strategies [19]. Indeed, after the hype that followed the first landmark studies published a decade ago, which claimed the discovery of new blood markers with both high specificity and sensitivity, for example for ovarian cancer [20], joint efforts between clinicians, scientists, and technologists have helped to address many issues, and a reasonable hope has emerged from these approaches.
In this review we do not aim to summarize the numerous studies and results linked to onco-proteomics, but rather to give a brief description of the various strategies and MS-based approaches in use, with a focus on the most promising ones, which may lead the next steps in the development and implementation of proteomics in the field of cancer detection and diagnosis.

Proteomic Approaches for Cancer Studies
The discovery and clinical application of novel tumor markers for cancer screening, diagnosis, and prognosis are currently key focus areas of translational research in oncology and represent urgent needs in the fight against cancer [8].
MS-based proteomics can be used for the early discovery steps of biomarkers and for their validation, but also for clinical diagnosis and prognosis as an endpoint clinical assay (Figure 1). Although conventional proteomics has become a key approach in biomarker discovery and validation, some of the most recent MS technologies have not yet been introduced into clinical application because of their complexity or high cost [21].

Biomarker discovery
During the last few years, the development of innovative MS-based proteomic technologies and strategies for investigating the oncoproteome in depth has generated large lists of potential biomarkers, although most of them are still at the discovery and/or validation stage [22]. In addition, the high-throughput capacity of MS-based proteomic approaches for biomarker discovery and validation is seen as a key element in integrating the various oncogenomic and oncoproteomic data to fully comprehend cancer biology, in particular given the growing array of gene and protein data available in compiled and curated databases [23].
In MS-based proteomics research for biomarker discovery, two main approaches are currently used in cancer: protein identification and pattern recognition (Figure 1). Both approaches require high-capacity computing and bioinformatics systems to process the enormous amount of data produced by proteomic studies. Furthermore, confidence in the identified biomarker signatures requires that they be reproduced in different populations and by different laboratories. A discussion of the outcomes of cancer proteomic studies is beyond the scope of this review; some of those studies will, however, be briefly summarized alongside each MS-based proteomic technology in the next section.
Protein identification and quantification: Identification and quantitation of proteins and peptides by MS are usually carried out using one of two strategies. The first, the 'bottom-up' approach, reconstructs the protein sequence, and thus identifies and quantifies the protein, from the sequences of its peptide fragments after proteolytic digestion, by comparing the peptide mass spectra against appropriate databases. The second, the 'top-down' approach, identifies a protein directly from its full sequence without enzymatic digestion. The biomarker discovery approach can be comprehensive, aiming to enumerate, identify, and quantify as many protein components as possible in a biological sample in a data-based manner. Alternatively, protein identification and quantitation can focus on a small set of candidate proteins obtained, for example, from a co-immunopurification, or on shortlisted proteins linked to a specific signaling pathway or drug target that differentiates between diseased and control phenotypes, in a hypothesis-based manner [25]. Identification of protein targets immediately facilitates their quantification and validation, as well as the evaluation of their potential clinical value for further development [26]. Protein targets identified with MS can also be further characterized to understand their functional role in cancer biology. Key signaling hubs of pathways involved in carcinogenesis are obvious potential molecular targets for therapeutics, and inhibitor drugs against such proteins [27,28] are good examples of the hypothesis-based approach in protein identification.
In practice, the identification of proteins using MS is nowadays a well-established and straightforward proteomic task, even though some purification/enrichment steps might be needed. This will be developed in the section about MS platforms.
Pattern recognition: As mentioned above, cancer is characterized by the heterogeneity of its pathogenesis, reflected in the multiple dysregulated proteins and cellular pathways involved in the initiation and progression of the disease. In addition, most proteins produced by tumor cells are not unique to them and are also produced by non-cancerous cells [29]. Hence, it is now widely accepted that a single biomarker is unlikely to have sufficient sensitivity and/or specificity for population-based screening and early detection of cancer. Instead, a panel of biomarkers comprising several proteins, called a "protein signature" or "protein pattern", is thought to provide higher sensitivity and specificity [30-33]. In support of this view, many recent proteomic studies have reported that protein panels are more accurate tools for detecting cancer than individual proteins [34-39].
In this pattern recognition approach, MS combined with adequate bioinformatics tools is used to measure the mass and relative quantity of all proteins or peptides in a biological sample without proteolysis or deep fractionation. By comparing profiles (protein signatures) between samples taken from patients and those taken from their matched healthy controls, a list of differentially expressed proteins is generated and used for further validation (Figure 1). Those proteins are subsequently identified independently by MALDI-TOF-MS/MS or LC-MS/MS, or even with gold-standard assays, to rule out the possibility that the differences in protein signatures observed between those biological samples are due to experimental bias. This approach requires, however, large sample sets and good experimental reproducibility to overcome inter-individual physiological variability. The OVA1 test, an In Vitro Diagnostic Multivariate Index Assay (IVDMIA) consisting of a panel of protein markers and recently approved by the US FDA, represents the first clinical application of this protein signature approach [40]. This test is used to assess ovarian cancer risk in women diagnosed with an ovarian tumor prior to a planned surgery.
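The profile comparison described above can be sketched in a few lines. This is a minimal illustration, not the procedure of any cited study: the peak intensities are invented, and both the use of Welch's t statistic and the fixed cutoff `min_abs_t` are assumptions made for the example. A real analysis would also need spectrum alignment, normalization, and correction for multiple testing.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def differential_peaks(cases, controls, min_abs_t=3.0):
    """Return (m/z, log2 fold change, t) for peaks whose |t| reaches the cutoff.

    `cases` and `controls` map an m/z value to the list of intensities
    measured for that peak across the subjects of each group.
    """
    hits = []
    for mz in sorted(cases.keys() & controls.keys()):
        t = welch_t(cases[mz], controls[mz])
        if abs(t) >= min_abs_t:
            log2_fc = math.log2(mean(cases[mz]) / mean(controls[mz]))
            hits.append((mz, log2_fc, t))
    return hits


# Hypothetical intensities for two peaks in four patients and four matched controls.
cases = {
    2021.4: [10.0, 11.0, 9.0, 10.5],
    4210.8: [40.0, 42.0, 38.0, 41.0],
}
controls = {
    2021.4: [10.2, 9.8, 10.6, 9.9],
    4210.8: [20.0, 21.0, 19.0, 20.5],
}

hits = differential_peaks(cases, controls)
for mz, log2_fc, t in hits:
    print(f"m/z {mz}: log2 fold change {log2_fc:.2f}, t = {t:.1f}")
```

Only the peak at m/z 4210.8 is retained here, because its intensity roughly doubles in the patient group while the other peak does not differ between groups. The retained peaks would then be passed to the independent identification step.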

Clinical diagnostics
Even though MS-based proteomic approaches are powerful discovery tools, MS is currently used in clinical settings for only a few applications. This is due, at least partially, to its complexity and its low time and cost efficiency compared with other well-established standard techniques such as immunoassays or enzymatic tests [21]. Nevertheless, MS found its way into clinical analysis long ago, for example in neonatal screening programs for inborn errors of metabolism, in particular phenylketonuria, which are now established in several countries [41-43]. More recent implementations of MS-based proteomics in clinical settings are the above-mentioned OVA1 test for the clinical diagnosis of ovarian cancer [40] and the IVD MALDI Biotyper for the identification of microorganism species in clinical microbiology labs [44]. The latter is a MALDI-TOF MS-based benchtop platform for rapidly identifying bacteria and yeasts using a database of over 3,900 strains from about 2,000 well-characterized microbial species. Starting from a cultured colony, identification is performed by matching the measured protein fingerprint against the proprietary Biotyper database. Using this system, 30 to 60 strain identifications can be performed every hour with a reasonably low false positive rate, low operational costs, and low technical barriers for new operators [44,45].
Altogether, these recent developments are fostering the transition of MS from a discovery tool to a validation and diagnostic tool in clinical laboratory settings in the foreseeable future [21,46,47]. For other applications, and to overcome the issues hampering the wide introduction of MS into the clinic, the MS instrument does not need to be physically available in the clinical lab and could be used for specific needs within an appropriate research lab [21].

MS-based Proteomic Platforms for Cancer Studies
Sample preparation and protein enrichment
Considerations for biological samples: Considerable progress has been achieved in recent years in proteomic technologies and strategies, which has enabled many biological and pathophysiological mechanisms linked to proteins and their genes to be deciphered. The successful transition of these technologies from research tools to clinical diagnostic/screening platforms is, however, still challenged by some basic issues linked to human physiology and sample quality. Indeed, human clinical samples are complex and unstable during collection and analysis, since enzymes present in the samples degrade their quality and content; preserving sample integrity throughout all processing steps is therefore key to any analysis of their content [48,49]. The large dynamic range of protein concentrations and the presence of proteins in different states (various isoforms and PTMs) are other hurdles for proteomics to overcome.
For example, despite the development of standardized experimental protocols for the enrichment, separation, and quantification of proteins, there are still gaps in the expected reproducibility of proteomic analyses between laboratories, mainly due to changes in and degradation of protein samples during the pre-analytical (sample collection, handling, and storage) and analytical steps [50]. As a consequence, extra effort has been made during the last few years to overcome these issues, ensure acceptable reproducibility, and avoid experimental bias. Several groups have recently reported on the potential confounding effects of pre-analytical and analytical steps, along with various recommendations for best practices in specimen handling, with more stringent precautions to maintain the integrity of proteins and ensure the accuracy and reproducibility of proteomic results [48,49,51,52].
These recommendations include detailed SOPs that start from the experimental design, such as good matching between cases and controls (gender, age, other morbidities, etc.), a minimum required number of samples, and the use of different sets of samples for the discovery and verification/validation steps to avoid systematic bias and reduce the false discovery rate of disease markers [53]. They also include reporting as much information as possible about the way the biological samples were collected, handled, and pre-processed, since these pre-analytical steps are often carried out in a clinical setting and are not under the control of the investigators studying disease markers. Samples are also frequently analyzed a while after being stored, so it is important to have detailed reports of the storage conditions. Once the optimal conditions for sample collection and handling are established and the appropriate proteomic technologies are selected, there is also a need to share biospecimen resources between independent research groups and institutions for the biomarker discovery and/or validation steps. Finally, cohort and time-serial samples collected before the onset of cancer are particularly useful for discovering and validating cancer-specific biomarkers for early detection and follow-up of the disease.
In the following section, we summarize only the two most common techniques used for protein separation; many other techniques are available, but they are beyond the scope of this review.
Gel electrophoresis: Before being analyzed by MS, biological samples usually need, to some extent, preliminary separation, enrichment, or fractionation of their protein content because of their complexity. The most common approaches to protein separation are based on protein size or physicochemical properties.
One of the first techniques is one-dimensional polyacrylamide gel electrophoresis (1D-PAGE), in which proteins from biological samples are separated by size by applying an electric current to a gel matrix, through which smaller proteins move faster than larger ones. The resulting gel is then stained using various reagents such as Coomassie blue dye, silver staining [54], fluorescent dyes [55], or radiolabels, and protein bands can finally be viewed and quantified for analytical or preparative purposes. The 1D-PAGE technique has, however, a major limitation: only a few tens of proteins can be clearly separated at the same time from biological samples that may contain hundreds to thousands of different proteins. Another challenge for this technique is its low resolution, as it cannot separate proteins of very similar size such as protein isoforms.
Most of these inherent limitations were, however, later overcome by the development of a more complex gel-based separation method, two-dimensional gel electrophoresis (2D-PAGE) [56]. In this technique, proteins are separated in two independent steps using two distinct properties. First, proteins are separated in a gel strip according to their isoelectric point (pI), the pH at which the net charge of the protein is zero. The proteins are then separated in a second step by placing the gel strip on top of a standard sodium dodecyl sulphate polyacrylamide gel (SDS-PAGE), which separates the proteins according to their size as in 1D-PAGE.
The 2D-PAGE technique has, however, its own limitations, such as low throughput (only two samples can be processed at a time), the requirement for relatively large amounts of sample, and a laborious, time-consuming protocol. As the method relies on comparing spot intensities between gels obtained from different samples or subjects, inter-gel variability represented an additional hurdle. This was considerably improved by the development in 1997 of 2D-differential in-gel electrophoresis (2D-DIGE) [57], which allows two or three protein samples to be compared simultaneously on the same gel by using a different fluorescent dye for each sample. The inter-gel reproducibility of 2D-DIGE was further improved by including internal standards and by developing advanced algorithms and software for spot alignment and quantitation between gels.
Finally, for both 1D- and 2D-PAGE, protein bands need to be cut out of the gel and digested with proteases (e.g., trypsin) before being identified and/or quantified using MS and appropriate databases. 1D- and 2D-PAGE protein separations are mostly used upstream of MALDI-TOF MS for specific enriched or differentially expressed proteins, but only for limited sample sets because of their low throughput.
Liquid chromatography: Liquid chromatography (LC), more precisely nano-LC coupled to tandem MS (LC-MS/MS), is now widely implemented in most proteomic platforms to identify and quantify large numbers (thousands) of proteins from complex biospecimens. The LC principle is based on a column packed with a functionalized phase that separates the components of a mixture through a variety of chemical interactions between the proteins or peptides and the column. Proteins are usually digested before being bound to the column and are then eluted from it using appropriate elution gradients at flow rates of nL/min to µL/min, which allows the use of smaller sample amounts and gives better sensitivity and resolution. Protein fractionation/separation can also be carried out on a separate LC system before analysis by the MS system, in a method called offline LC/MALDI or LC/MS(/MS).
Multidimensional LC with orthogonal separation of peptide digests coupled to ESI-MS, also called multidimensional protein identification technology (MudPIT), is an emerging strategy in proteomics that involves two or more LC columns [21,58-61]. MudPIT allows the identification of a few thousand proteins from a given sample. It has high reproducibility and works well for hydrophobic, acidic, basic, very small, very large, and low-abundance proteins, which are difficult to analyze by traditional separation techniques [21].
In practice, LC protein separation is mostly used hyphenated with ESI-MS, for the identification and quantitation of either a subset of specific enriched proteins or proteins at large scale, using limited sample sets because of the low throughput of LC.

Mass spectrometry
A mass spectrometer, the key element in the MS-based proteomic approach, has three main components: an ionization source, a mass analyzer, and an ion detector (Figure 2). The two most common ion sources used in proteomics are based on Electrospray Ionization (ESI) from a liquid solution and Matrix-Assisted Laser Desorption/Ionization (MALDI) from solid crystals. The ion source produces gas-phase ions from sample molecules such as peptides and proteins by the addition or loss of one or more protons, in a so-called 'soft' ionization technique that maintains sample integrity. A mass analyzer is then used to separate ions with different mass-to-charge ratios (m/z). The main mass analyzers used in proteomics are Quadrupole (Q), Time of Flight (TOF), ion traps, and Fourier Transform Ion Cyclotron Resonance (FT-ICR). The number of ions at each m/z is then counted by the detector. Those ions and their numbers are finally presented by a signal processor (computer) as a mass spectrum with a series of peaks, each representing the charged proteins/peptides or their fragments extracted from a given sample (Figure 2).
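As a small worked example of the m/z quantity, the sketch below computes the m/z of a peptide ionized by the addition of protons (positive-ion mode) and recovers the neutral mass from an observed peak. The neutral mass of 1000.0 Da is an arbitrary illustrative value, and the function names are hypothetical.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da


def mz_protonated(neutral_mass: float, charge: int) -> float:
    """m/z of a molecule that has gained `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def neutral_mass_from_mz(mz: float, charge: int) -> float:
    """Neutral mass recovered from an observed m/z and a known charge state."""
    return charge * (mz - PROTON_MASS)


# Arbitrary illustrative peptide with a neutral mass of 1000.0 Da.
for z in (1, 2, 3):
    print(f"z = {z}: m/z = {mz_protonated(1000.0, z):.4f}")
```

The same molecule therefore appears at different positions in the spectrum depending on its charge state, which is typical of ESI, where multiply charged ions are common.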

MALDI-TOF:
In MALDI, proteins or peptides are mixed with a large excess of a suitable organic matrix and then spotted onto a plate. The dried mixture is then subjected to a laser pulse to generate clouds of matrix and protein ions. The protein ions are accelerated into a vacuum tube (the TOF mass analyzer) and travel through it until they reach a detector, which converts the number of ions into an intensity. As ions with different masses have different velocities, they reach the detector at different times and are represented as a plot of the m/z of the ions against their respective intensities, called a mass spectrum. This requires the instrument to be calibrated in advance with a mixture of known polypeptides to establish the relationship between the m/z of the ions and their time of flight. In addition, some MALDI instruments can provide partial amino acid sequences using Post-Source Decay (PSD) of ions with masses of up to about 4 kDa, or In-Source Dissociation (ISD) of larger ions. Compared with ESI-MS, MALDI has the advantages of simplicity, high sensitivity, and higher tolerance to buffer and salt contaminants.
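The calibration step can be illustrated with a simplified two-point model. For ions accelerated through the same potential, the flight time grows with the square root of m/z, so the sketch assumes t = t0 + k·sqrt(m/z) and solves for k and t0 from two calibrants. The calibrant m/z values and flight times are invented for the example; real instruments use more calibrants and often higher-order correction terms.

```python
import math


def fit_tof_calibration(cal_a, cal_b):
    """Fit t = t0 + k * sqrt(m/z) from two calibrants.

    Each calibrant is a (known m/z, measured flight time) pair.
    Returns (k, t0).
    """
    (mz_a, t_a), (mz_b, t_b) = cal_a, cal_b
    k = (t_b - t_a) / (math.sqrt(mz_b) - math.sqrt(mz_a))
    t0 = t_a - k * math.sqrt(mz_a)
    return k, t0


def time_to_mz(flight_time, k, t0):
    """Convert a measured flight time into m/z with the fitted calibration."""
    return ((flight_time - t0) / k) ** 2


# Hypothetical calibrants: m/z 900 detected at 15.2 µs, m/z 2500 at 25.2 µs.
k, t0 = fit_tof_calibration((900.0, 15.2), (2500.0, 25.2))

# An unknown ion detected at 20.2 µs.
unknown_mz = time_to_mz(20.2, k, t0)
print(f"k = {k:.3f}, t0 = {t0:.3f}, unknown m/z = {unknown_mz:.1f}")
```

With these invented numbers the unknown ion is assigned an m/z of about 1600, between the two calibrants, where a two-point calibration is most reliable.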
MALDI-TOF MS is versatile and can be used for various purposes, including proteomic applications. In a first approach, proteins are extracted from a biological sample, digested with a protease (e.g., trypsin), and then analyzed with MALDI-TOF MS to generate a list of peptide masses characteristic of each protein, an approach known as Peptide Mass Fingerprinting (PMF). Comparison of the generated peptide mass list with protein databases allows the digested proteins to be identified with reasonable confidence. This approach is particularly useful in biomarker discovery because of its high sensitivity (down to the attomole range), relatively wide dynamic range (3-4 orders of magnitude), and capacity for high-throughput screening [63]. Nevertheless, the PMF approach is only useful when the sample contains one or a few proteins at a time.
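The PMF principle can be sketched as an in-silico digestion followed by mass matching. The example below is a toy illustration under several stated assumptions: the two database sequences are made up, trypsin is modeled by the usual simplified rule (cleavage after K or R unless followed by P, no missed cleavages), neutral monoisotopic masses are compared rather than the singly protonated ions a MALDI spectrum reports, and the score is a plain count of matched peaks. Real search engines use probabilistic scoring.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056  # added once per peptide for the free termini


def tryptic_digest(sequence):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER_MASS


def pmf_score(observed_masses, sequence, tol=0.2):
    """Number of observed masses explained by the protein's tryptic peptides."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(
        any(abs(obs - theo) <= tol for theo in theoretical)
        for obs in observed_masses
    )


def rank_candidates(observed_masses, database, tol=0.2):
    """Rank database entries by decreasing number of matched masses."""
    scores = {name: pmf_score(observed_masses, seq, tol)
              for name, seq in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy database with made-up sequences (not real proteins).
database = {
    "protein_A": "MKWVTFISLLFLFSSAYSR",
    "protein_B": "GAVLIKPEPTIDERSTYQ",
}

# Simulated measurement: the peptides of protein_A with a 0.01 Da offset.
observed = [peptide_mass(p) + 0.01 for p in tryptic_digest(database["protein_A"])]
ranking = rank_candidates(observed, database)
print(ranking)
```

When several proteins are digested together, their peptide masses interleave and the match counts of unrelated database entries rise by chance, which is one way to see why PMF works best on samples containing one or a few proteins.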
Alternatively, as a second approach, proteins (or peptides) can be enriched from clinical samples such as plasma or tissue biopsies and then analyzed with MALDI-TOF MS to generate specific mass patterns of protein intensities without relying on protein identity. These patterns can be used as a 'diagnostic fingerprint' by comparing differential patterns between healthy control and diseased samples [39,48,49,63]. This approach is useful as a high-throughput preliminary discovery step that allows protein expression profiling of large sample sets at reasonable cost. In more advanced MALDI-TOF MS instruments, peptides can be fragmented to sequence them at least partially, which gives better confidence in the protein identification.
Because of their high throughput and versatility, MALDI-TOF MS platforms have been used extensively for human cancer detection, in particular for biomarker discovery and protein pattern signatures. MALDI-TOF MS has been used, in combination with a variety of statistical pattern-recognition and bioinformatics tools, in the early detection of various cancers such as breast, ovarian, prostate, colorectal, pancreatic, and lung cancer and melanoma (for detailed reviews, see for example [21,24,64-70]). Many protein and peptide peaks have been reported to bear significant diagnostic, prognostic, or predictive value for various cancers; however, the candidate biomarkers have not yet been validated for use in clinical patient care [70]. Table 1 summarizes a selected list of published studies using MALDI-TOF MS for cancer biomarker discovery.

SELDI-TOF:
Surface-Enhanced Laser Desorption-Ionization (SELDI) MS uses array chips (ProteinChip) with functionalized surfaces that selectively bind and enrich subsets of proteins. The array chips can have different physicochemical properties, such as reverse-phase, ion-exchange, immobilized-metal, or antibody affinity. As in the MALDI-TOF technique, an organic matrix is then added to the bound proteins, which are irradiated with a laser beam to generate polypeptide ions. The protein array chip is often coupled to TOF-MS and bioinformatics to derive proteome patterns for the samples analyzed [71]. Like other MS-based proteomic technologies, SELDI requires small amounts of sample (femtomole range), and it has real potential for clinical applications at the bedside to analyze samples for biomarker discovery, owing to its high throughput and ease of use [71-73]. The most common application of SELDI-TOF MS in cancer biomarker discovery is to find signature patterns correlated with healthy and diseased phenotypes. Nevertheless, SELDI-TOF MS suffers from its inability to directly and accurately identify the proteins within proteome patterns and from its relatively low mass resolution, which has limited its use [74]. Because of the high dynamic range of protein levels in serum and plasma and the low binding capacity of the ProteinChip array, the array can quickly be saturated with high-abundance proteins [50,75]; pre-fractionation steps are thus mandatory to identify biomarkers present at low abundance.
One of the first discovery studies carried out using SELDI-TOF MS for ovarian cancer detection [20] generated wide excitement in both the scientific community and the private sector, as its results were quickly converted into a commercially available diagnostic test (OvaCheck™, Correlogic, Inc., Germantown, MD). SELDI-TOF MS has since been widely used in research on signature detection of cancer protein patterns [74], including in ovarian cancer [20,76], prostate cancer [77-79], breast cancer [80,81], lung cancer [82], colon cancer [83], and liver cancer [84]. SELDI-TOF MS has also been applied, for example, to cancer relapse and prognosis in nasopharyngeal carcinoma [85,86]. More recently, the Lucid Proteomics System™, which combines SELDI-TOF MS and MALDI-TOF/TOF MS technologies, has provided further hope for biomarker discovery on a single platform with improved spectral resolution and reproducibility [87]. Table 2 summarizes a selected list of published studies using SELDI-TOF MS for cancer biomarker discovery.

ESI-MS (/MS):
Electrospray ionization MS (ESI-MS), and often tandem MS (ESI-MS/MS), are widely implemented in proteomic platforms for protein identification and quantitation from complex samples, including target protein characterization and biomarker discovery. Most often, ESI-MS/MS is used online with a nano-LC system, where the sensitivity and capacity of this soft ionization and identification technique are fully exploited for proteomic applications. This hyphenation generates more information in a given time, and it is also more suitable for relative quantification because of the ionization suppression issues linked to the complexity of biospecimens when they are analyzed with ESI-MS/MS or MALDI-MS.
On the other hand, quantitative proteomics based on LC-MS/MS approaches has led to major developments in the discovery of novel cancer biomarkers and potential therapeutic targets during the last decade. Indeed, LC-MS/MS is easily combined with quantitative techniques such as isotope-coded affinity tags (ICAT), isobaric tags for relative and absolute quantification (iTRAQ), or Stable Isotope Labeling by Amino Acids in Cell Culture (SILAC) [88-92]. Once the sample is introduced into the mass spectrometer, the mass shift due to labeling is easily detected between the paired labeled and native peptides or proteins, and quantification is typically achieved by calculating ratios between those paired species at the MS or MS/MS level. Therefore, proteins and peptides including putative biomarkers can be identified and quantified within
As an alternative to isotope labeling, label-free quantification provides simple, low-cost, and technically less stringent measurements of cancer proteomes [108]. The most straightforward method is relative quantification based on the peptides/proteins identified from spectra (spectral counting), with the assumption that precursor-ion intensities correlate with peptide abundance [109]. Samples from patients and controls, for example, are analyzed separately but with the same data acquisition protocol. The label-free approach usually needs more biological samples and experimental replicates than labeling approaches, along with high run-to-run reproducibility. This approach has been used in a number of oncoproteomic analyses [110]. Appropriate bioinformatics tools for spectral alignment and peptide quantification are needed to increase confidence in, and the future success of, label-free quantitative analysis [111].
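A minimal sketch of spectral counting is given below. It uses the normalized spectral abundance factor (NSAF), one commonly used normalization that the text above does not name and that is chosen here only for illustration: each protein's spectral count is divided by its length, then by the sum of these values over the run. The protein names, lengths, and counts are invented.

```python
def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factor for every protein in one run.

    `spectral_counts` maps protein name to the number of identified spectra,
    `lengths` maps protein name to its length in residues.
    """
    saf = {name: spectral_counts[name] / lengths[name] for name in spectral_counts}
    total = sum(saf.values())
    return {name: value / total for name, value in saf.items()}


# Hypothetical proteins with their lengths in residues.
lengths = {"P1": 100, "P2": 200, "P3": 400}

# Hypothetical spectral counts from two runs acquired with the same protocol.
control_counts = {"P1": 10, "P2": 20, "P3": 40}
patient_counts = {"P1": 30, "P2": 20, "P3": 40}

control_nsaf = nsaf(control_counts, lengths)
patient_nsaf = nsaf(patient_counts, lengths)

# Relative change of each protein between the patient and control runs.
ratios = {name: patient_nsaf[name] / control_nsaf[name] for name in lengths}
for name, ratio in ratios.items():
    print(f"{name}: patient/control NSAF ratio = {ratio:.2f}")
```

Because the values are normalized within each run, an increase in one protein lowers the relative share of the others, so ratios from a single pair of runs should be read with care. This is one reason the label-free approach needs replicates and high run-to-run reproducibility.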
Although LC-MS/MS is mostly used in the 'bottom-up' approach, some exciting applications based on the 'top-down' approach have recently been developed on these platforms, which may open new opportunities for cancer proteomic applications (see, for example, [112,113]).

Perspectives for Cancer Proteomics
Despite tremendous progress in MS technologies and in the development of standardized experimental protocols for the enrichment, separation, identification, and quantification of proteins, proteomics research is still limited by both the technologies and the bioinformatics tools currently available for analyzing proteins. Indeed, the complex nature of the human proteome, the huge dynamic range of protein concentrations, and the plethora of protein isoforms in specimens, added to the heterogeneity of diseases, are major hurdles for proteomics to overcome [114]. On the other hand, most initial proteomic studies were designed on a 'snapshot' basis, in which only a single time point from a given human sample was investigated, without taking into account significant processes taking place over time. Therefore, recent studies have started to look at serial time points to gain access to the temporal and spatial dynamics of proteins, in particular for screening and early detection of cancer, and have shown promising results (see for example [34,35]). Furthermore, neither the so-called data-based strategy nor the hypothesis-based strategy alone has delivered the expected outcomes from proteomic studies; combining the two is expected to help in deciphering more secrets of the disease-linked proteome and in accelerating the path from bench to bedside [115].
Taken together with the new metamorphosis happening in the proteomic field by including collaborative and inter-disciplinary efforts, some promising perspectives are foreseen for the near future. These include; retuning some previous approaches, developing new ones and combining proteomics with other 'omics' approaches around more targeted biological questions and strategies. In an attempt to evaluate the most common MS-based proteomic platforms, we summarized in table 3, the effectiveness and usefulness of these platforms for biomarker discovery and/or clinical diagnosis in the field of cancer.

Targeted MS-based proteomics
During the last decade, intensive work aimed at a general profiling of various accessible biofluids in the quest for new cancer biomarkers. Nevertheless, this strategy has shown its limitations because of the complexity and high dynamic range of those biospecimens, in which the most promising biomarkers leaked from tumors are present at concentrations many orders of magnitude lower than those of the common proteins. Therefore, a more targeted or directed strategy became necessary to give onco-proteomic studies new momentum.

Targeting diseased tissues and proximal fluids for biomarker discovery:
It is obvious that in tumor tissues and their proximal fluids (cerebrospinal fluid, tumor interstitial fluid, nipple aspirate, etc.), tumor-derived proteins are present at higher concentrations than in the bloodstream, into which protein biomarkers may be secreted or leaked. Therefore, targeting those local sites will considerably increase the chance of isolating and identifying cancer-specific markers [116,117]. According to this approach, potential biomarkers are first discovered in the tumors or their proximal fluids and then measured in plasma using highly sensitive, targeted assay technologies. This subsequent step is critical to check whether biomarker candidates are present in detectable amounts in blood, either by MS or by other independent assays, and thus whether non-invasive blood-based tests for cancer diagnosis or screening can be developed. This approach was recently used with success for prioritizing candidate markers linked to breast cancer [118] and cardiac injury [119] by combining biomarker candidate identification in tissue, and then in peripheral blood, with targeted MS (accurate inclusion mass screening and SRM) to detect only the preselected peptides.
As gene expression profiling studies are carried out on tissues and cells, this targeted proteomic strategy also has the advantage of using the same tumor tissue for the validation of candidate biomarkers by combining genomic and proteomic data [18].
Another approach to target biomarkers is to work directly on subcellular organelles (membranes, nucleus, mitochondria, endoplasmic reticulum, etc.) rather than on whole-cell extracts. For instance, this helps in obtaining a clear picture of the spatial distribution of proteins and their translocation between cell compartments, as well as in linking proteins to their function in a specific organelle (for example, drug receptors at the cell surface, transport mechanisms by vesicles or cell fate decisions at mitochondria) [25]. [120] and more than 50% of mammalian proteins are glycosylated [8]. PTMs in proteins, however, make it almost impossible to identify and quantify all isoforms of the same protein in a single experiment. Indeed, these modifications are characterized both by their very dynamic nature and by their low stoichiometry [19]. Thus, enrichment of specific and homogeneous sub-proteomes is becoming a routine procedure to increase the potential of discovered biomarkers. Many cancer proteomics studies have already focused on proteins with specific PTMs and yielded promising results pointing to the involvement of new pathways and enzymes in this disease (for example, [8,38,121-124]). These kinds of selective studies were further facilitated recently by the development of fragmentation techniques such as Electron Transfer Dissociation (ETD), which leave labile PTMs intact on the peptide backbone. Furthermore, the study of the human kinome (the complete set of protein kinases responsible for protein phosphorylation) in association with clinical outcomes should help to clarify disease mechanisms, identify therapy targets, and develop predictive applications [125]. The glycoproteome is also of particular interest because most traditional cancer biomarkers are glycoproteins and changes in glycosylation patterns have been reported in cancer cells; it is expected to remain a main source of biomarkers [121,126].
Despite all the progress achieved in MS technology, sample preparation and PTM enrichment, work remains to be done on data analysis, as most currently available software cannot cope with multiple PTMs on a single protein [19].
Targeting interactome: Protein interactions play a critical role in regulating biological functions at both the cellular and organism levels, as most proteins exert their function as part of multiprotein complexes. Unraveling the interactome in space and time promises to be a key step towards understanding and modeling complex cellular functions and behavior in cancer onset and progression. This systems-level understanding will shed light on the behavior of both proteins and genes in their networks and may reveal functional molecular groups that play important roles in the disease process and could serve as suitable biomarkers to be targeted [18]. This may hold considerable promise for improving drug efficacy, as tumor cells may escape treatment through alternative pathways or secondary interactions. Indeed, cellular pathways are highly dynamic and interconnected, probably to provide functional redundancy and compensating mechanisms should parts of a pathway become unavailable [25].
Protein-protein interactions are now studied by combining affinity purification, under near-physiological conditions, with MS to identify protein interaction partners [19]. As an example, expression profiles and protein interaction information were integrated, and protein interaction networks with expression patterns were identified [127-129] and then shown to be predictive of breast cancer prognosis.
One needs to bear in mind, however, when studying the interactome that many biologically relevant protein interactions will likely not survive the sample preparation steps and may only be measured with in vivo methods. Furthermore, some interactions that lack biological significance may also be introduced during sample processing and cell lysis. Therefore, the right biological and experimental controls should be included to ensure meaningful results.

Selected reaction monitoring quantitation
Although large numbers of putative cancer biomarkers have been identified through various studies and approaches, appropriate validation with reasonable throughput and cost still represents a bottleneck in biomarker development pipelines [130].
Selected Reaction Monitoring (SRM), also called Multiple Reaction Monitoring (MRM), is an emerging quantification strategy [131]. In this targeted approach, a protein of interest is selected for quantification based on its unique precursor peptides and their fragments (each precursor-fragment pair is called a transition) and analyzed using triple quadrupole or linear ion trap mass spectrometers. To ensure good quantification, at least two peptides per protein and two fragment ions per peptide are monitored (a minimum of four transitions per protein) during SRM. Owing to its sensitivity and specificity, SRM can even be applied to unfractionated samples, but it requires that several peptides from the targeted protein and their fragments be known in advance through initial LC-MS runs [132,133]. This makes SRM-type quantification more elaborate than conventional label-free quantification.
Owing to its multiplexing capability, specificity, and sensitivity, SRM can be used for validation of a single protein of interest but also for large-scale proteome validation [134]. Furthermore, SRM-based quantitation ensures good reproducibility across multiple laboratories [132] and can cover a high dynamic range for a reasonable number of samples.
For more rapid and cost-effective absolute quantitation and validation of biomarkers of interest, an optimized strategy called 'Monitoring initiated detection and sequencing-multiple reaction monitoring (MIDAS-MRM)' was recently developed to avoid the need for immunoassay-based validation techniques, which require costly and time-consuming development of specific antibodies for the targeted biomarkers [135]. Whiteaker et al. [118], for example, have also developed a targeted proteomics-based pipeline for verification of biomarkers in plasma, based first on triage of the most promising biomarkers, which are then verified with SRM. Therefore, SRM might be used as a standalone quantitation method in the clinical environment, with the particular advantage of being specific and reasonably cost-effective, as many triple quadrupole MS instruments are already established in clinical labs.
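As an illustration of what a transition list looks like, the sketch below computes the precursor (Q1) and y-ion fragment (Q3) m/z values for two peptides of one protein, giving the minimum of four transitions mentioned above. The peptide sequences are made-up placeholders, and the rule used here to pick fragments (simply the two largest y ions) is an assumption for illustration; in real assays, transitions are chosen from prior LC-MS observations, as noted in the text. Only unmodified residues and monoisotopic masses are considered.

```python
# Monoisotopic residue masses in Da (unmodified amino acids)
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O
PROTON = 1.007276   # mass of a proton

def precursor_mz(peptide, charge=2):
    """m/z of the intact peptide ion at the given charge state."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge

def y_ion_mz(peptide, n, charge=1):
    """m/z of the y ion containing the last n residues of the peptide."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide[-n:]) + WATER
    return (neutral + charge * PROTON) / charge

def build_transitions(peptides, fragments_per_peptide=2):
    """Return (peptide, Q1 m/z, fragment label, Q3 m/z) tuples.

    Placeholder selection rule: take the largest y ions of each peptide.
    """
    transitions = []
    for pep in peptides:
        q1 = precursor_mz(pep, charge=2)
        largest = len(pep) - 1
        smallest = max(1, largest - fragments_per_peptide + 1)
        for n in range(largest, smallest - 1, -1):
            transitions.append((pep, q1, f"y{n}", y_ion_mz(pep, n)))
    return transitions

# Two hypothetical tryptic peptides of one target protein
for pep, q1, label, q3 in build_transitions(["SAMPLER", "PEPTIDEK"]):
    print(f"{pep:10s} Q1={q1:9.4f}  {label:3s} Q3={q3:9.4f}")
```

Each printed row corresponds to one precursor-fragment pair that a triple quadrupole instrument would be set to monitor.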

MALDI-Imaging
Direct tissue analysis by MALDI MS imaging (MALDI-MSI) is a fast and multiplexed approach that opens the door to new perspectives in clinical proteomics, and it may become a valuable alternative to immunohistochemistry [68,136-139].
MALDI-MSI has the unique feature of giving access to the anatomical dimension, or spatial distribution, of markers in the tissue, information that is usually lost in liquid or tissue-extracted samples [140]. Moreover, starting biomarker mining at the disease-source tissue using MALDI-MSI will shorten the path to finding new disease-specific proteins. Accessible body fluids such as blood, urine or saliva can then be used to verify the presence of these specific markers, with the aim of developing non-invasive tests for cancer diagnosis or screening. Another advantage of MALDI-MSI is the possibility of analyzing tissue biopsies using both top-down and bottom-up approaches, as many protocols have already been tuned for this purpose and allow direct identification and quantitation of proteins and peptides on tissues [141-143]. Tissue profiling from large sets of biopsy samples is now possible thanks to the availability of appropriate algorithms for pattern comparison, allowing molecular tissue classification.
The development of 3D reconstruction of tissue composition through the imaging of many successive tissue sections has started to offer the opportunity to reconstruct complete tumor maps linked to specific marker distributions [144]. This will help in making the right medical decision, in particular if MALDI-MSI is combined with other histological and pathophysiological data on the same tissue obtained, for example, from PET, CT scan or MRI. Other directions for integrating MALDI-MSI into clinical settings include, for example, the development of profile-signature diagnosis for early detection of disease and to complement histopathology, and the assessment of therapeutic efficacy, toxicity or drug resistance, which may help in tailoring efficient personalized treatment [140]. All these developments and unique advantages are expected to soon make MALDI-MSI a key platform in clinical histopathology and may provide a new, reliable tool for multiplexed cancer diagnosis [140,145].
Nevertheless, some extra efforts are required before MALDI-MSI is well established at the bedside. These include more standardization in sample preparation as well as in data acquisition and analysis protocols, along with improvement in spatial resolution, which has not yet reached the subcellular level.

Surface plasmon resonance-MS
Recent developments have led to a closer integration of key technologies, providing a combined approach to enable full characterization (identification, quantitation, interaction, and function) of proteins from complex biospecimens. Chip-based Surface Plasmon Resonance (SPR) is an analytical, label-free, real-time biosensor technique that uses the interaction of light photons with free electrons (surface plasmons) on a gold surface to quantify changes in the amount of biomaterial bound to the surface [146,147]. SPR biosensors are often referred to as mass detectors because the mass of the molecules directly influences the signal reading. SPR biosensors, in particular Biacore instruments, have gained enormous popularity in biomedical research and the pharmaceutical industry owing to their sensitivity, robustness, flexibility and amenability to automation and higher throughput. One of the features of the technique is the possibility of obtaining real-time data on the interaction of a ligand with its receptor, allowing kinetic data to be determined along with an accurate determination of the amount of ligand [148]. Moreover, compared to other technologies, a smaller amount of material is needed to perform the experiments, labeling is not required and surface chemistries can be varied, allowing various immobilization strategies and interaction experiments.
Several studies have used SPR to detect cancer biomarkers at clinically relevant concentrations, highlighting the feasibility of using SPR in a clinical setting and of analyzing various specimens such as plasma, serum, saliva and tissues (for review see [149]). However, obtaining high sensitivity in complex biological samples under real physiological conditions remains one of the major challenges for bioanalytical applications of SPR biosensors. The combination of SPR with MS has created a powerful platform that pairs the unique advantage of interaction affinity analysis by SPR with ligand identification by MS to advance functional proteomics investigations. Here, proteins are affinity-purified, quantified and characterized in terms of their interactions by SPR, while the mass spectrometer identifies and structurally characterizes the biomolecules. Nevertheless, there are still some bottlenecks facing this technology coupling, including the binding capacity and specificity of the SPR sensor surface when using complex samples, and the low throughput added to the relatively high cost of these systems.
SPR-MS hyphenation has recently started to be used in cancer studies. For example, the use of SPR-MS enabled multiplexed binding quantitation and the characterization of a potential breast cancer marker (LAG3 protein) from plasma [150,151]. Table 4 summarizes a selected list of published studies using SPR platforms for cancer biomarker discovery.
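To illustrate how kinetic data can be read from real-time SPR signals, the following sketch simulates a sensorgram under the simple 1:1 (Langmuir) binding model. This model is an assumption introduced here for illustration and is not specified in the text; the rate constants, analyte concentration and maximum response are hypothetical values.

```python
import math

def equilibrium_response(conc, ka, kd, rmax):
    """Steady-state response for a 1:1 interaction.

    conc: analyte concentration (M); ka: association rate (1/(M*s));
    kd: dissociation rate (1/s); rmax: response at surface saturation (RU).
    """
    kd_eq = kd / ka  # equilibrium dissociation constant KD (M)
    return rmax * conc / (conc + kd_eq)

def association(t, conc, ka, kd, rmax):
    """Response at time t (s) after the start of analyte injection."""
    k_obs = ka * conc + kd  # observed rate constant during association
    return equilibrium_response(conc, ka, kd, rmax) * (1.0 - math.exp(-k_obs * t))

def dissociation(t, r0, kd):
    """Response at time t (s) after the injection ends, starting from r0."""
    return r0 * math.exp(-kd * t)

# Hypothetical parameters: KD = kd / ka = 10 nM
ka, kd, rmax = 1e5, 1e-3, 100.0
conc = 1e-8  # analyte injected at a concentration equal to KD

end_of_injection = association(300.0, conc, ka, kd, rmax)
print("Response after 300 s of association:", round(end_of_injection, 2))
print("Response 300 s into dissociation:",
      round(dissociation(300.0, end_of_injection, kd), 2))
```

In an actual experiment the procedure runs the other way: measured sensorgrams at several analyte concentrations are fitted with such a model to estimate the rate constants and the affinity.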

Omics integration
Protein biomarkers are expected to provide more direct answers to biological and clinical questions than genomic or transcriptomic data, as the majority of known molecular markers and pharmaceutical targets are indeed proteins. Nevertheless, and despite rapid advances in the past decade, protein identification and quantification technologies still lag behind those used in DNA sequencing and mRNA expression profiling on a genome-wide scale. This is mainly because proteins are extremely complex and dynamic, and their changes are more difficult to monitor than in genomic profiling [5]. Therefore, the integration of the data generated from various omics approaches such as genomics, transcriptomics, proteomics and metabolomics, in a systems biology approach, will help in making meaningful hypotheses and foster the discovery of real biomarkers by reducing the false positive rates at the discovery and validation stages (see Figure 3 for a summary of the combined approaches). Obviously, this requires bringing scientists, clinicians, bioinformaticians and technologists together in coordinated collaboration, which should reduce research costs and may shorten the way to win the battle against cancer. For example, Kulasingam et al. [12] combined multiple data sets of biomarker candidates linked to ovarian cancer, including clinical ascites fluid and various cell lines, and applied numerous filters; they selected for further validation only 2 promising biomarkers out of the many hundreds initially identified.
Moreover, to date more genomics experiments, with greater genome coverage achieved by gene profiling, have been carried out in large-scale clinical studies than proteomics experiments. Thus, including these genomics data sets in the candidate database helps to better incorporate clinical information (e.g., disease outcomes) at the discovery stage [18].
Briefly, there is a clear need to make use of every piece of information available on cancer biology and pathophysiology by implementing an integrative approach using multiple "omics" data sets to improve biomarker identification and validation, with the perspective of developing specific and sensitive clinical cancer markers or drug targets.
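The kind of cross-data-set filtering described above can be sketched very simply: keep only those candidates that are supported by a minimum number of independent data sets. The example below is a generic illustration of this idea and not the filtering procedure of any cited study; the data set names, candidate identifiers and the support threshold are all hypothetical.

```python
def integrate_candidates(datasets, min_support=2):
    """Keep candidates reported by at least `min_support` data sets.

    datasets: dict mapping a data set name to an iterable of candidate IDs.
    Returns a dict mapping each retained candidate to the sorted list of
    data sets that support it.
    """
    support = {}
    for name, candidates in datasets.items():
        for candidate in set(candidates):  # ignore duplicates within a set
            support.setdefault(candidate, []).append(name)
    return {
        candidate: sorted(sources)
        for candidate, sources in support.items()
        if len(sources) >= min_support
    }

# Hypothetical candidate lists from three omics layers
datasets = {
    "proteomics_proximal_fluid": ["PROT_A", "PROT_B", "PROT_C"],
    "proteomics_cell_lines": ["PROT_B", "PROT_C", "PROT_D"],
    "transcriptomics_tissue": ["PROT_C", "PROT_E"],
}

shortlist = integrate_candidates(datasets, min_support=2)
for candidate in sorted(shortlist):
    print(candidate, "supported by", ", ".join(shortlist[candidate]))
```

Raising the support threshold shrinks the shortlist, which mirrors the trade-off between reducing false positives and losing true candidates that are only detectable in one type of data.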

Conclusions
The development of MS technologies and of the proteomic field has enabled the generation of huge amounts of data characterizing the proteome of complex biological samples, as well as a better understanding of cancer biology. Comparative analyses of samples from healthy and diseased persons have become possible, leading to the identification of thousands of potentially specific biomarkers. In addition, the development of validation platforms such as SRM and microarrays, which offer the potential for highly multiplexed and sensitive analysis of the proteome, is an advantage for the development of new protein biomarkers.
[230] The transfer of biomarkers from the discovery field to clinical use is still, however, on a road paved with technical and physiology-linked pitfalls.
On the other hand, although MS-based proteomics approaches have not yet delivered the promised reliable cancer biomarkers to the clinic, they have dramatically increased our knowledge of cancer biology and helped to involve large scientific and industrial communities in the development of highly sensitive and accurate tools and strategies. Fortunately, to fulfill the potential of proteomics for finding biomarkers, clinicians, statisticians, epidemiologists and chemists have started to work together in an interdisciplinary approach, addressing the same question from different fields of expertise. Finally, the most recent developments in MS technologies and targeted proteomic-based approaches have given great but reasonable hope, and not hype, to the field of cancer biomarker discovery.