Proteomics Characterization of Haptoglobin Alpha-2 Subunit in Non-Small Cell Lung Cancer Serum and Various Human Materials


Introduction
Lung cancer causes more cancer-related deaths worldwide, in both men and women, than any other cancer. Many reports show that the incidence of lung cancer has increased significantly in Asian populations during the past few decades, where it ranks first or second among causes of cancer death [1][2][3]. Lung cancer is divided into two main types: small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC). NSCLC is the more common type, accounting for approximately 80% of all lung cancers. It has three main subtypes based on the kind of cancer cells: adenocarcinoma, squamous cell carcinoma and large cell carcinoma. Regardless of subtype, the five-year survival rate for lung cancer is less than 15% from the time of diagnosis [4,5]. Although more than half of lung cancers are diagnosed at a late stage, when cure is unlikely, the survival of patients diagnosed with stage I lung cancer is also surprisingly low. Thus, there is a great need to understand the molecular and biological alterations of lung cancer that lead to a poor prognosis and to use this information to improve diagnosis, patient management and therapy.
Over the past two decades, significant progress has been made toward understanding the molecular pathogenesis of human lung cancer through the identification and characterization of cancer-related genes and/or proteins that are genetically or epigenetically altered in human lung cancer. Numerous protein markers have been found in human serum, urine, seminal fluid and histological specimens that exhibit varying capacities to detect lung cancer and predict disease course. Human serum and plasma, which circulate throughout the body, provide a wealth of diagnostic information and are widely used by proteomics researchers for disease detection and therapeutic monitoring from a clinical viewpoint. Therefore, many researchers have attempted to identify all the proteins in serum and/or plasma and to reduce the complexity of the proteome by proteomic analysis of human blood serum [6][7][8]. However, to date, few of these markers have been adequately validated for clinical use, and many more remain unidentified.
Proteomic approaches are particularly promising for identifying protein markers in blood serum. The objective of this study was therefore to identify novel serum biomarkers from NSCLC patients using proteomic approaches, i.e. 2-DE, 2-D DIGE and mass spectrometry. The haptoglobin alpha-2 subunit (HAP2), which is up-regulated in NSCLC serum, is our target protein; its expression level was validated in various human materials to answer the question of whether HAP2 is produced by lung cancer tissues and/or cells. Moreover, sequential staining methods were used to characterize the post-translational modifications (PTMs) of HAP2, in which FITC-labeled lectin staining could detect specific carbohydrate moieties in HAP2. Co-immunoprecipitation with an anti-sialyl Lewis x antibody was also used to investigate the sialyl Lewis x modification of HAP2. We therefore initiated the present work to elucidate potential biomarkers in NSCLC serum, especially HAP2, and to characterize the PTMs and the source of HAP2.

Human serum specimens
All samples were obtained at diagnosis from eighteen NSCLC patients, who were divided into three groups according to histological cell type: adenocarcinoma (eight samples), squamous cell carcinoma (five samples) and unknown type (five samples). The ages of the patients ranged from 41 to 75 years (median age, 57 years). Serum was obtained from blood specimens that were allowed to clot for 30 min to 1 h at room temperature and then centrifuged at 3000×g for 10 min. The supernatants were collected, divided into small aliquots and stored at -80°C until further analysis. Each serum sample was thawed only once, shortly before use. The protein concentration was determined using the Bio-Rad Protein Assay with bovine serum albumin (BSA) as the standard.
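As an illustration of how a protein concentration can be read from a BSA standard curve, the following minimal sketch fits a straight line to standard readings and interpolates an unknown. The standard concentrations, absorbance values, the 100-fold dilution and the function names are all hypothetical and are not data or procedures from this study.

```python
# Hypothetical BSA standard curve; all numbers are illustrative only.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def protein_concentration(absorbance, slope, intercept, dilution_factor=1.0):
    """Interpolate a concentration (mg/ml) and correct for sample dilution."""
    return (absorbance - intercept) / slope * dilution_factor


# Assumed BSA standards (mg/ml) and blank-corrected absorbance readings.
bsa_mg_per_ml = [0.0, 0.1, 0.2, 0.4, 0.6, 0.8]
absorbance = [0.00, 0.06, 0.12, 0.24, 0.36, 0.48]

slope, intercept = fit_line(bsa_mg_per_ml, absorbance)

# Assumed reading for a serum sample diluted 100-fold before the assay.
serum_mg_per_ml = protein_concentration(0.42, slope, intercept, dilution_factor=100)
print(f"Estimated serum protein: {serum_mg_per_ml:.1f} mg/ml")
```

In practice the unknown should fall within the range of the standards, which is why serum is normally diluted before measurement.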

2-DE analysis
2-DE was performed with the IPGphor system (GE Healthcare). Five hundred µg of protein (generally 5-10 µl of sample) was added to 0.2% SDS/2.5 mM dithioerythritol (DTE), and the mixture was heated for 5 min in a 95°C heating block [9]. The samples were then diluted in a rehydration buffer containing 7 M urea, 2 M thiourea, 4% CHAPS, 65 mM DTE, 5 mM tributylphosphine (TBP) and 0.5% IPG buffer (pH 4-7 or pH 3-10 NL). After periodic vortexing for 1 h and centrifugation at 12000×g for 20 min, the sample solutions were applied onto IPG strips (Immobiline DryStrip, 18 cm, pH 4-7 or pH 3-10 NL, GE Healthcare). IEF was performed on the IPGphor (GE Healthcare) under the following conditions: gel rehydration for 14 h at 50 V, followed by focusing with the programmed settings for a total of 65 kVh. After IEF, the IPG strips were equilibrated in equilibration buffer (50 mM Tris-HCl, pH 8.8, 6 M urea, 30% v/v glycerol, 2% w/v DTE and a trace of bromophenol blue) for 15 min and then alkylated for 15 min in the same equilibration buffer, but with 2.5% w/v iodoacetamide replacing DTE. The IPG strips were placed on top of a 10-18% linear gradient polyacrylamide gel (18×18 cm) and covered with 0.5% agarose. The second-dimension separation was carried out at 45 mA per gel at 15°C until the bromophenol blue dye reached the bottom of the gel. At the end of each run, the 2-D gels were stained with SYPRO Ruby gel stain and scanned using a Typhoon 9200 laser scanner (GE Healthcare). The 2-D gel images were then analyzed using ImageMaster™ 2D Platinum software version 5.0 (GE Healthcare).

2-D DIGE analysis
The normal and lung cancer serum proteins were each minimally labeled with CyDye maleimide (Cy3, Cy5) according to the manufacturer's instructions (GE Healthcare). Before the first-dimension isoelectric focusing (IEF), aliquots of the CyDye-labeled samples were mixed together and added to rehydration buffer (7 M urea, 2 M thiourea, 4% CHAPS, 65 mM DTE, and 0.5% IPG buffer pH 4-7). The mixed sample was applied onto an IPG strip (Immobiline DryStrip, 18 cm, pH 4-7, GE Healthcare) and IEF was performed using an Ettan IPGphor (GE Healthcare), as described above. Following equilibration, the IPG strip was placed on top of a 10-18% polyacrylamide gel. The second-dimension separation was carried out at 45 mA per gel at 15°C until the bromophenol blue dye front reached the bottom of the gel. At the end of each run, the 2-D gel was scanned using a Typhoon 9200 laser scanner (GE Healthcare) with emission filters of 580 nm (Cy3) and 670 nm (Cy5).

In-Gel tryptic digestion
Protein spots were manually excised from the gels and transferred to 500 µl siliconized Eppendorf tubes. The gel pieces were washed twice with 200 µl of 50% ACN/25 mM ammonium bicarbonate buffer, pH 8.5, for 15 min each. The gel pieces were then washed once with 200 µl of 100% ACN and dried using a SpeedVac concentrator. Dried gel pieces were swollen in 20 µl of 25 mM ammonium bicarbonate containing 0.1 µg trypsin. Gel pieces were then crushed with a siliconized blue stick and incubated at 37°C for at least 16 h. Peptides were subsequently extracted twice with 50 µl of 50% ACN/5% TFA, and the extracts were combined and dried using a SpeedVac concentrator. The dried peptides were redissolved in 5 µl of 75% ACN/0.1% formic acid.

MALDI-TOF MS and MS/MS analysis
The samples were premixed in a ratio of 1:1 with matrix solution (5 mg/ml CHCA in 50% ACN, 0.1% v/v TFA and 2% w/v ammonium citrate) and spotted onto the 96-well format MALDI sample stage. Data were acquired directly on the Q-TOF Ultima™ MALDI instrument (M@LDI™; Micromass, Manchester, UK), which was fully automated with a predefined probe motion pattern and peak intensity threshold for switching over from MS survey scanning to MS/MS, and from one MS/MS to another. Within each well, parent ions that met the predefined criteria (any peak within the m/z 800-3000 range with intensity above 10 counts ± include/exclude list) were selected, starting from the most intense peak, for CID MS/MS using argon as the collision gas and a mass-dependent ± 5 V rolling collision energy until the end of the probe pattern was reached (all details are available at http://proteome.sinica.edu.tw). The MASCOT MS ion search program (http://www.matrixscience.com) was used for protein searching. Search parameters allowing for oxidation of methionine, carbamidomethylation of cysteine and one missed trypsin cleavage were selected for searching the Swiss-Prot database. Protein identification was repeated at least once using spots from different gels. In addition, descriptions of the identified proteins were retrieved from the Swiss-Prot and NCBI protein databases.
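The parent-ion selection criteria described above can be sketched as a simple filter over a survey-scan peak list. This is only an illustration of the stated rules (m/z 800-3000, intensity above 10 counts, optional include/exclude list, most intense peak first); the function name, the matching tolerance and the example peaks are assumptions, and the sketch does not reproduce the instrument's acquisition software.

```python
def select_precursors(peaks, mz_min=800.0, mz_max=3000.0, min_counts=10,
                      include=None, exclude=(), tol=0.5):
    """Return candidate parent ions, ordered from most to least intense.

    peaks   : list of (m/z, intensity) tuples from an MS survey scan
    include : optional list of m/z values; if given, only peaks within
              `tol` of a listed value are kept (assumed behaviour)
    exclude : m/z values to skip, matched within `tol` (assumed behaviour)
    """
    def near(mz, targets):
        return any(abs(mz - t) <= tol for t in targets)

    kept = []
    for mz, intensity in peaks:
        if not (mz_min <= mz <= mz_max):
            continue  # outside the m/z 800-3000 window
        if intensity <= min_counts:
            continue  # intensity must be above the threshold
        if near(mz, exclude):
            continue  # on the exclude list
        if include is not None and not near(mz, include):
            continue  # an include list is in force and the peak is not on it
        kept.append((mz, intensity))
    # MS/MS starts from the most intense qualifying peak.
    return sorted(kept, key=lambda p: p[1], reverse=True)


# Hypothetical survey scan: (m/z, counts).
survey = [(750.4, 120), (842.5, 35), (1045.6, 210),
          (1479.8, 8), (2211.1, 64), (3100.2, 90)]
queue = select_precursors(survey, exclude=[2211.1])
print(queue)
```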

LC-MS/MS
The tryptic peptides were analyzed by 1-D LC-nanoESI-MS/MS. The analysis was performed on an integrated nanoLC-MS/MS system (Micromass) comprising a three-pump Micromass/Waters CapLC™ system with an autosampler, a stream selection module configured for precolumn plus analytical capillary column, and a Micromass Q-Tof Ultima™ API mass spectrometer fitted with a nano-LC sprayer, operated under MassLynx™ 4.0 control. Injected samples were first trapped and desalted isocratically on an LC Packings PepMap™ C18 µ-Precolumn™ Cartridge (5 µm, 300 µm id×5 mm; Dionex, Sunnyvale, CA, USA) for 2 min with 0.1% formic acid delivered by the auxiliary pump at 30 µl/min, after which the peptides were eluted from the precolumn and separated on an analytical C18 capillary column (15 cm×5 µm id, packed with 5 µm Zorbax 300 SB C18 particles; Micro-Tech Scientific, Vista, CA, USA) connected inline to the mass spectrometer, at 300 nl/min using a 40 min fast gradient of 5% to 80% acetonitrile in 0.1% formic acid. The peptide fragmentation data were searched against the Swiss-Prot and NCBI protein databases using the MASCOT MS/MS ion search program (http://www.matrixscience.com) to identify the peptide sequences.
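The database searches in this study allow for methionine oxidation, cysteine carbamidomethylation and one missed trypsin cleavage. The sketch below shows, under simplifying assumptions, what those parameters mean for the peptide masses a search engine has to consider. It is not MASCOT's implementation: the function names and the toy sequence are hypothetical, carbamidomethylation is applied to every cysteine for simplicity, and trypsin is assumed to cleave after K or R except before P.

```python
import re

# Monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728
CARBAMIDOMETHYL = 57.02146  # added to each Cys (simplification)
OXIDATION = 15.99491        # optional on each Met


def tryptic_peptides(sequence, missed_cleavages=1):
    """Cleave after K/R (not before P) and allow up to N missed cleavages."""
    fragments = re.findall(r".*?[KR](?!P)|.+$", sequence)
    peptides = []
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


def mh_plus_masses(peptide):
    """Singly protonated monoisotopic masses for 0..n oxidized Met residues."""
    base = sum(MONO[aa] for aa in peptide) + WATER + PROTON
    base += CARBAMIDOMETHYL * peptide.count("C")
    return [base + OXIDATION * k for k in range(peptide.count("M") + 1)]


# Toy sequence chosen to show the proline rule; not a haptoglobin sequence.
for pep in tryptic_peptides("AGCKPMRLVK", missed_cleavages=1):
    masses = ", ".join(f"{m:.4f}" for m in mh_plus_masses(pep))
    print(pep, masses)
```

Each additional variable modification or missed cleavage enlarges the list of candidate masses, which is one reason search parameters are usually kept restrictive.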

Phosphoprotein and glycoprotein staining
Phosphoprotein and glycoprotein staining were performed using Pro-Q Diamond phosphoprotein gel stain and Pro-Q Emerald 488 glycoprotein gel stain, respectively, according to the manufacturer's instructions (Molecular Probes). The stained 2-D gels were scanned using a Typhoon 9200 laser scanner (GE Healthcare). The gels were post-stained with SYPRO Ruby to detect the total amount of protein on the gels.

NH2-terminal analysis
The protein spots on the 2-DE gel were electrotransferred to a polyvinylidene difluoride (PVDF) membrane. The membrane was stained with 0.2% Naphthol blue black, and the protein spots were excised to determine the NH2-terminal amino acid sequence. NH2-terminal sequencing was performed by Edman degradation using an Applied Biosystems Model 492 Procise sequencer (Applied Biosystems, Weiterstadt, Germany). The NH2-terminal sequences were searched against the protein database using the BLAST software at the ExPASy website (http://au.expasy.org/tools/#similarity).

Western blot analysis
Serum proteins were separated by 12.5% SDS-PAGE and electrotransferred to a PVDF membrane. The membranes were blocked for 2 h with 5% skim milk and 0.1% Tween-20 in PBS, followed by incubation with the primary antibody, polyclonal rabbit anti-human haptoglobin (1:40,000 dilution), for 2 h at room temperature. After washing three times with PBS-0.1% Tween-20, the primary antibody was detected with a horseradish peroxidase (HRP)-conjugated anti-rabbit secondary antibody (1:5,000 dilution) for 1 h at room temperature, and the signals were developed with an enhanced chemiluminescence western blotting detection system (ECL™ kit; GE Healthcare).
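The antibody dilutions above translate into very small stock volumes, as the following arithmetic sketch shows. The 10 ml working volume and the function name are assumptions for illustration, and "1:N" is taken to mean one part stock in N parts final volume.

```python
def stock_volume_ul(final_volume_ml, dilution_factor):
    """Volume of antibody stock (µl) needed for a 1:dilution_factor solution."""
    return final_volume_ml * 1000.0 / dilution_factor


# Assumed 10 ml of incubation solution per membrane.
primary_ul = stock_volume_ul(10, 40_000)    # anti-human haptoglobin, 1:40,000
secondary_ul = stock_volume_ul(10, 5_000)   # HRP-conjugated anti-rabbit, 1:5,000
print(f"primary: {primary_ul} µl, secondary: {secondary_ul} µl")
```

Because sub-microliter volumes are hard to pipette accurately, a dilution this high is commonly prepared through an intermediate dilution step.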

Co-Immunoprecipitation
Two primary antibodies were used in this experiment: polyclonal rabbit anti-human haptoglobin and monoclonal mouse anti-sialyl Lewis x. Five hundred micrograms of total protein from the serum samples was incubated with appropriate amounts of primary antibody at room temperature for 2 h. The mixture was then combined with protein A-sepharose (15 µl, GE Healthcare) and incubated at room temperature for 2 h with rocking. Immunoprecipitates were pelleted by centrifugation and washed three times with washing buffer (50 mM Tris-HCl (pH 7.5), 150 mM NaCl, 0.1% Nonidet P-40 (NP-40), 1 mM EDTA (pH 8.0), 0.25% gelatin and 0.02% sodium azide). Captured proteins were eluted by boiling the beads in sample loading buffer for 5 min at 100°C. Samples were subjected to SDS-PAGE and transferred to a PVDF membrane. Western blot analysis was performed as described above.

Screening of carbohydrate moiety using lectin-fluorescent staining
Four FITC-labeled lectins with different carbohydrate specificities were used: concanavalin A (ConA), specific for high α-mannose-type and hybrid-type oligosaccharides; wheat germ agglutinin (WGA), specific for N-acetyl-glucosamine and sialic acid residues; peanut agglutinin (PNA), specific for galactose and N-acetyl-galactosamine; and Ulex europaeus agglutinin (UEA-I), specific for fucose (Fucα1-2Gal-R). Briefly, serum proteins from normal and lung cancer serum were first separated by 2-DE and then transferred to a PVDF membrane. The membranes were blocked for 2 h with 5% non-fat skim milk and 0.1% Tween-20 in PBS, followed by incubation with FITC-labeled lectins (final concentration approximately 20 µg/ml) at room temperature for 2 h. After washing three times with PBS-0.1% Tween-20, the membranes were scanned using a Typhoon 9200 laser scanner (GE Healthcare) at a fluorescence wavelength of 526 nm.

2-DE of human serum samples
The differential protein expression profiles of normal healthy and NSCLC serum samples were compared by 2-DE analysis. Using wide-range pH 3-10NL IPG strips, the 2-D gel patterns of normal and NSCLC serum samples showed different protein profiles, particularly in the molecular weight ranges of 20 kDa and 30-55 kDa (Figure 1A). Since there was no significant difference in the highly acidic and basic areas, narrow-range pH 4-7L IPG strips were used to focus on the differing protein spots. These spots were distinctly resolved in the narrow pH range, especially at a molecular weight of about 20 kDa (Figure 1B). Two protein spots with an approximate molecular weight of 18 kDa were markedly changed, with a two-fold spot intensity in NSCLC serum (Figure 2A). After MALDI-TOF MS analysis, they were identified as the haptoglobin alpha-2 chain (HAP2). The probability-based scoring of the mass spectra of the HAP2 isoforms gave high database search scores, which indicated that these two spots were HAP2 (Table 1). The HAP2 isoforms in normal and lung cancer serum samples appeared at the same molecular weight but differed in pI, with values of 5.95 and 5.60 (normal sample) and 5.90 and 5.65 (cancer sample). The NH2-terminal amino acid sequences of these spots were also determined (Figure 2B). These sequences were used to search for matching proteins in the ExPASy database using the BLAST program, and the results identified these proteins as HAP2. In addition to 2-DE analysis, 2-D DIGE was used to analyze differential protein expression between normal and NSCLC serum samples, which were labeled with Cy3 (green) and Cy5 (red) dyes, respectively. The 2-D DIGE results showed distinctly different protein spots with different fluorescent colors (Figure 3), and the HAP2 isoform spots were markedly up-regulated in NSCLC serum, with higher fluorescence intensity in red, in agreement with the 2-DE result.
Therefore, it could be confirmed that the HAP2 isoforms were up-regulated proteins in NSCLC serum. The expression level of HAP2 in NSCLC serum samples was higher than that in normal serum samples (Figure 4A). To confirm and validate the protein expression level of HAP2 in normal and NSCLC serum samples, western blot analysis with anti-human haptoglobin was performed. The expression level of HAP2 in NSCLC serum samples was much higher than that in normal serum samples (Figure 4B). A comparison of average HAP2 band intensities between normal serum samples and NSCLC serum samples of various subtypes revealed high expression of HAP2 in adenocarcinoma, squamous cell carcinoma and unknown-type lung cancer, with intensities 3.90-, 2.45- and 3.55-fold, respectively, that of normal serum samples (Figure 4C). In addition, the protein expression levels of HAP2 in various human lung cancer materials, including serum, tissues, cell lines and culture media of cultivated lung cancer cell lines, were validated (Figure 5A-D). Interestingly, up-regulation of HAP2 was observed in the serum samples, while the HAP2 expression level in the lung tissues was variable. HAP2 expression was not observed in three types of lung cancer cell lines, but it was observed in HepG2 cells. This indicated that the up-regulation of HAP2 is specific to human serum samples. To further verify the above observations, the identity of HAP2 in serum samples was confirmed by co-immunoprecipitation followed by western blotting. The result clearly showed up-regulation of HAP2 in NSCLC serum samples compared to normal serum samples (Figure 5E).
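Fold values such as those reported for Figure 4C are ratios of the average band intensity of a patient group to that of the normal group. A minimal sketch of that calculation follows; the densitometry values and group sizes are hypothetical and were not taken from this study, and the function name is illustrative.

```python
from statistics import mean


def fold_change(group_intensities, control_intensities):
    """Ratio of the mean band intensity of a group to that of the controls."""
    return mean(group_intensities) / mean(control_intensities)


# Hypothetical band intensities (arbitrary densitometry units).
normal = [100, 120, 80]
groups = {
    "adenocarcinoma": [300, 360],
    "squamous cell carcinoma": [210, 230],
    "unknown type": [250, 290],
}

fold = {name: fold_change(values, normal) for name, values in groups.items()}
for name, value in fold.items():
    print(f"{name}: {value:.2f}-fold relative to normal serum")
```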

Detection of haptoglobin isoforms using different staining methods
To investigate the PTMs of the HAP2 isoforms, we used two fluorescent stains, Pro-Q Emerald 488 glycoprotein gel stain and Pro-Q Diamond phosphoprotein gel stain, to visualize the glycosylation and the phosphorylation of HAP2, respectively (Figure 6). The HAP2 isoforms in normal and NSCLC serum samples could be detected by both staining methods, and the fluorescence intensity of HAP2 was higher in the NSCLC serum samples than in the normal serum samples. Thus, HAP2 could be classified as both a glycoprotein and a phosphoprotein. We suggest that the HAP2 isoforms may be more heavily glycosylated and phosphorylated in NSCLC serum samples than in normal serum samples.

Carbohydrate specificity of HAP2 by fluorescent-labeled lectin staining
Following the detection of HAP2 as a glycoprotein by the Pro-Q Emerald 488 glycoprotein gel stain, we used fluorescent-labeled lectins to screen for specific carbohydrate moieties of HAP2 in normal and NSCLC serum samples. After staining of the 2-D gels with four types of FITC-labeled lectins, the HAP2 regions from NSCLC serum samples were clearly different from those of normal serum samples (Figure 7). Interestingly, the HAP2 isoforms in NSCLC serum samples gave a stronger positive fluorescent signal with FITC-labeled WGA than with FITC-labeled ConA or FITC-labeled PNA, and no signal was observed with FITC-labeled UEA-I. Given the specificity of WGA for sialylation and N-acetyl-glucosamine (GlcNAc), the WGA-bound HAP2 indicated that a high level of sialylation and/or GlcNAc in HAP2 may be associated with NSCLC development.

Detection of Sialyl Lewis x in HAP2
To further detect the expression of sialyl Lewis x in human serum samples, we used co-immunoprecipitation with an anti-sialyl Lewis x monoclonal antibody, followed by SDS-PAGE analysis. The SDS-PAGE profile showed high sialyl Lewis x expression on HAP2 in both the crude serum sample and the co-immunoprecipitated sample of NSCLC serum, while its expression was rarely observed in those of normal serum (Figure 8). Our result suggested that the expression of the sialyl Lewis x carbohydrate antigen in NSCLC serum might reflect a change in the glycosylation of HAP2 in NSCLC serum.

Discussion
At present, many human diseases need more diagnostic tools to aid in the early evaluation of biomarkers and in understanding their biological mechanisms. Proteomic approaches have been applied to find protein markers and to study quantitative changes in the proteins present in serum and other biological materials [10][11][12][13].
In this study, we report the proteomic analysis of potential protein markers in NSCLC serum using gel-based proteomic tools, including 2-DE, 2-D DIGE and MS. We found that HAP2 was markedly up-regulated in NSCLC serum samples and may play a critical role in NSCLC development. Haptoglobin is among the most abundant glycoproteins secreted by the liver and is composed of two different polypeptides (α- and β-chains) that are covalently linked via disulfide bonds [14][15][16]. In the 2-DE result, HAP2 was observed at a molecular weight of 18,000 Da, whereas the theoretical molecular weight of the matched haptoglobin entry (Swiss-Prot accession number P00738) was 41,717 Da. The matching peptide sequences and the NH2-terminal amino acid sequences matched the haptoglobin alpha chain, while there were no matches to the haptoglobin beta chain. The observed molecular weight of HAP2 was the same as that of the haptoglobin alpha chain. Thus, the identification of HAP2 in both normal and NSCLC serum samples was reliable. In addition, it is well known that the functional properties of haptoglobin include binding of free hemoglobin, protection against free radicals, inhibition of nitric oxide, inhibition of prostaglandin synthesis, bacteriostatic effects, angiogenesis, antibody-like properties and interactions with leukocytes [17]. A high expression level of haptoglobin in serum has been found in many cancers, such as breast cancer [18], hepatocellular carcinoma [19], ovarian cancer [14], pancreatic cancer [20], and lung cancer [15,21,22]. Although a few studies of haptoglobin have been reported in individual types of lung cancer, the expression and validation of haptoglobin in all types of lung cancer and in various human materials has never been reported.
Therefore, it is of great interest to investigate the expression level of haptoglobin in NSCLC serum samples and various human materials, and to determine the carbohydrate specificity of HAP2.
Analysis of individual human serum samples is necessary to examine whether the differences are observed in most histological types of lung cancer, and because there is some variation among individuals within the lung cancer group [23]. Validation of biomarkers is an important step in proteomic analysis for confirming a cancer marker before use in clinical applications. In this study, we validated the expression level of HAP2 in individual human serum specimens from various histological types of lung cancer by comparative 2-DE and western blot analyses. Both analyses gave the same result: up-regulation of HAP2 in individual NSCLC serum samples. The western blot analysis also demonstrated a higher expression level of HAP2 in adenocarcinoma than in squamous cell carcinoma and unknown-type lung cancer. This indicates that the HAP2 expression level showed the highest specificity for adenocarcinoma. Since serum samples from lung cancers whose histological type could not be identified also showed a high expression level of HAP2, the expression level of HAP2 in NSCLC serum could possibly be used for preliminary diagnosis of lung cancer. In addition, the expression of HAP2 was validated in various human materials such as serum, lung cancer tissues, lung cancer cells, and cell culture media. HAP2 expression could be detected clearly in human serum, with higher expression in NSCLC serum than in normal serum. There was no significant increase in the relative amount of HAP2 in lung cancer tissues, and no HAP2 was observed in primary normal cells or in three types of NSCLC cell lines. Because haptoglobin is synthesized by liver cells in response to a variety of stimuli and secreted into the bloodstream [17], we proposed that HAP2 may spread to other compartments; for example, HAP2 in the cell lines may be secreted into the cell culture media.
However, when we examined HAP2 expression in the cell culture media in the absence and presence of fetal bovine serum (FBS), no HAP2 was observed. This indicated that HAP2 is neither produced by lung cancer tissues and/or cells nor secreted by them into the biofluid. Therefore, the high expression level of HAP2 in NSCLC is specific to human serum samples. Moreover, HAP2 expression was observed in the HepG2 cell line. We suggest that HAP2 is not produced by lung tissues and/or cells, but is synthesized by the liver, especially by hepatocytes, and secreted into the serum or plasma [21,24].