Evaluation and Identification of Protein Blood Biomarkers for Alzheimer’s Disease: A Systematic Review and Integrative Analysis

Background: Alzheimer’s disease (AD) accounts for 80% of all dementia, but current treatments cannot provide a definitive cure. For this reason, researchers seek highly accurate preclinical biomarkers for minimal cognitive impairment. This study aimed to evaluate sporadically reported protein blood biomarkers (BBs) of AD and suggest new protein BB candidates for AD. Methods: A systematic PubMed review was performed on articles published between 1989 and March 2013, and articles and protein BBs of AD were screened based on eligibility criteria and quality. An integrative analysis was conducted to evaluate reported protein BBs and identify new protein BB candidates. Results: In total, 67 articles were included; 95 protein BBs were evaluated through a meta-scoring system based on five criteria. The highest meta-scored protein BB was Serpin A3 (meta-score: 36), followed by tumor necrosis factor-α (meta-score: 35). An integrative analysis using a protein-protein interaction (PPI) network revealed that 67 proteins are linked via 97 edges. In an extended PPI network, 105 proteins were connected via 240 edges and 63 linker molecules were discovered as new protein BB candidates. Conclusion: This study showed a total of 95 meta-scored protein BBs and identified 63 new biomarker candidates of AD.

Citation: Park J, Park JE, Lee J, Choi C (2014) Evaluation and Identification of Protein Blood Biomarkers for Alzheimer’s Disease: A Systematic Review and Integrative Analysis. J Mol Biomark Diagn 5: 190. doi:10.4172/2155-9929.1000190


Introduction
Alzheimer's disease (AD) accounts for 80% of all dementia and affects more than 35 million people worldwide. People over 65 years of age have a greater chance of developing AD, followed by death within 3-9 years after diagnosis [1,2]. Three types of medication, acetylcholinesterase inhibitors (donepezil, galantamine, and rivastigmine), an N-methyl-D-aspartate (NMDA) glutamate receptor antagonist (memantine), and anti-oxidants (vitamin E and selegiline), are currently used to treat patients with AD [3][4][5][6]. Although these therapeutic agents can decelerate disease progression, they do not confer permanent relief or prevent AD progression. For these reasons, researchers seek highly accurate preclinical biomarkers for minimal cognitive impairment, which might provide a wide therapeutic window.
Two major proteins, amyloid-beta (Aβ) and tau, are considered the traditional AD diagnostic biomarkers in cerebrospinal fluid (CSF) analysis and in neuroimaging such as positron emission tomography (PET) [7]. However, conventional detection of AD biomarkers has several drawbacks, such as invasiveness, low accessibility, high cost, and limited clinical applicability [8]. Compared to these CSF and neuroimaging biomarkers, blood biomarkers (BBs) have many advantages, including ease of accessibility and minimal invasiveness. Although numerous approaches have aimed to identify BBs of AD, selection of reliable BBs has encountered the hurdle of low reproducibility.
In this report, we conducted a systematic review of protein BBs for AD to discern the prominent biomarkers that can be applied in the clinical setting. In addition, we performed an integrative network analysis of scored protein BBs to suggest a list of potential protein BBs not yet identified. Using the meta-scoring system, this study provides the first step in proposing a list of biomarkers that could be applicable clinically.

Study selection and data extraction
Using the public PubMed electronic database, we searched for eligible reports published from 1989 through March 2013. We first selected reports using the keywords "Alzheimer's disease" and either "plasma biomarker" or "serum biomarker" (891 articles were initially selected). The search was limited to research articles written in English (168 articles were excluded). Studies were excluded if they examined other diseases, including depression and diabetes, or presented no clinical data (249 articles were excluded). Additionally, studies were excluded if they dealt with other types of AD biomarkers (CSF, neuroimaging, etc.), other molecules such as miRNA, lipids, or ions, or other fields of interest, such as drug efficiency (374 articles were excluded). Because identification of new biomarkers depends on network analysis of previously reported biomarkers, only protein biomarkers, for which databases are available, were selected for further analysis. Moreover, 33 articles that reported on Aβ, which could introduce a bias in the meta-scoring system, were excluded in the meta-scoring step; Aβ is included in the network analysis and GSEA as one of the AD-related genes (ADRs). All included studies were required to have more than one convincing feature that discriminates between normal subjects and those with AD. Most studies included the Mini-Mental State Examination (MMSE) score, a commonly used scoring system to evaluate the cognitive status of patients. Counting more than one biomarker per report was allowed. Only biomarkers with their own ID listed in the Human Protein Reference Database (HPRD) were selected, and biomarkers with the same HPRD ID were counted repeatedly.
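The exclusion counts reported above can be checked with a short tally. This is only an arithmetic sketch: the step labels are paraphrased, and it assumes that the 33 Aβ reports are subtracted from the articles remaining after the first three exclusions, which is consistent with the 67 included articles reported in this review.

```python
# Counts are taken from the text; step labels are paraphrased (assumption).
INITIAL_ARTICLES = 891
EXCLUSIONS = [
    ("not an English-language research article", 168),
    ("other diseases or no clinical data", 249),
    ("other biomarker types, molecules, or fields of interest", 374),
    ("amyloid-beta reports (excluded at the meta-scoring step)", 33),
]


def screening_flow(initial, exclusions):
    """Return (label, remaining) pairs after each sequential exclusion."""
    remaining = initial
    flow = []
    for label, excluded in exclusions:
        remaining -= excluded
        flow.append((label, remaining))
    return flow


flow = screening_flow(INITIAL_ARTICLES, EXCLUSIONS)
for label, remaining in flow:
    print(f"{remaining:4d} articles remain after excluding: {label}")
```

The final count produced by this tally matches the number of articles included in the meta-scoring step.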

Protein-protein interaction (PPI) network analysis
A gene set related to AD was retrieved from the Online Mendelian Inheritance in Man (OMIM) database (http://www.ncbi.nlm.nih.gov/omim) using the following keywords: human AND Alzheimer. Among 30 results, we used the 15 entries that have an official gene symbol (Table 1). An annotated protein-protein interaction (PPI) network was constructed using these ADRs and the meta-scored protein BBs. Visualization and network analysis were performed using Cytoscape [9]. Network clustering was conducted using the community cluster (GLay) network clustering algorithm plugged into Cytoscape [10].
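To illustrate what community clustering does on a PPI network, the following is a minimal pure-Python sketch of greedy modularity merging in the spirit of the fast greedy approach. It is not the GLay plugin or its implementation; the function name, the toy edge list, and the protein names P1 to P6 are hypothetical, and the sketch assumes an undirected network without self-loops or duplicate edges.

```python
def greedy_modularity_communities(edges):
    """Greedily merge communities while the modularity Q keeps increasing.

    Illustrative sketch only (not GLay). `edges` is a list of (u, v) pairs.
    """
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    label = {n: n for n in degree}        # node -> community label
    members = {n: {n} for n in degree}    # community label -> member nodes
    while True:
        # Count the edges running between each pair of distinct communities.
        between = {}
        for u, v in edges:
            cu, cv = label[u], label[v]
            if cu != cv:
                key = (min(cu, cv), max(cu, cv))
                between[key] = between.get(key, 0) + 1
        best_gain, best_pair = 0.0, None
        for (ci, cj), l_ij in sorted(between.items()):
            a_i = sum(degree[n] for n in members[ci]) / (2 * m)
            a_j = sum(degree[n] for n in members[cj]) / (2 * m)
            # Change in modularity if communities ci and cj are merged.
            gain = 2 * (l_ij / (2 * m) - a_i * a_j)
            if gain > best_gain:
                best_gain, best_pair = gain, (ci, cj)
        if best_pair is None:             # no merge improves Q any further
            break
        ci, cj = best_pair
        for n in members[cj]:
            label[n] = ci
        members[ci] |= members.pop(cj)
    return sorted(sorted(c) for c in members.values())


# Hypothetical toy network: two triangles joined by a single bridge edge.
toy_edges = [("P1", "P2"), ("P2", "P3"), ("P1", "P3"),
             ("P4", "P5"), ("P5", "P6"), ("P4", "P6"),
             ("P3", "P4")]
communities = greedy_modularity_communities(toy_edges)
print(communities)
```

On the toy network the two triangles are recovered as separate communities, because merging them across the single bridge edge would lower the modularity.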

Gene set enrichment analysis (GSEA)
Protein BBs, ADRs, and linker molecules were analyzed using the Gene Ontology (GO) Biological Process (GOBP) database. All GOBPs involved in both the BB-PPI and the extended PPI network were categorized into a total of 17 manual modules, including angiogenesis, behavior, blood coagulation, signal transduction, and transport, based on the keywords of the GOBP titles. To check whether each observed enrichment was significantly different from what could be obtained from a list of randomly selected proteins, we performed a random sampling analysis: we measured the ratio of the number of outcomes in which randomly selected proteins were enriched in a specific manual module more than the analyzed proteins in each PPI network. We performed ten thousand random samplings and evaluated empirical p-values.
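The random sampling analysis described above can be sketched as follows. This is a minimal illustration under stated assumptions: the background protein universe, the sampling without replacement, the sample size equal to the number of analyzed proteins, and all names in the code are assumptions, since the text does not specify them.

```python
import random


def empirical_p_value(analyzed, module, background, n_samples=10_000, seed=0):
    """Fraction of random protein sets that hit `module` more often than
    the analyzed set does (strictly more, following the wording above)."""
    rng = random.Random(seed)
    analyzed = set(analyzed)
    module = set(module)
    pool = sorted(background)             # fixed order for reproducibility
    observed = len(analyzed & module)
    exceed = 0
    for _ in range(n_samples):
        sample = rng.sample(pool, len(analyzed))
        if sum(1 for protein in sample if protein in module) > observed:
            exceed += 1
    return exceed / n_samples


# Hypothetical universe of 100 proteins and a module containing 10 of them.
background = [f"G{i}" for i in range(100)]
module = background[:10]
analyzed = background[:5] + background[50:55]   # 5 of 10 fall in the module
p_value = empirical_p_value(analyzed, module, background)
print(p_value)
```

A small empirical p-value indicates that random protein sets of the same size rarely reach the observed number of module members.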

Statistical analysis
The differences in age and MMSE score between control and AD subjects were evaluated using Welch's t-test. GSEA results were evaluated using Fisher's exact test, and an empirical p-value was calculated to compensate for biases that may be inherent in the dataset. P-values lower than 0.05 were considered significant.
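Because included reports typically give group means and standard deviations rather than raw data, Welch's t can be computed from summary statistics. The sketch below assumes that form of input; the function name and the example values are hypothetical, and only the cutpoint of 2.77 (the mean absolute Welch's t for age reported with Table 2) comes from this study.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    computed from group means, standard deviations, and sample sizes."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical age data: AD group (76 +/- 6 years, n=40) vs control
# group (74 +/- 6 years, n=40). These values are not from any included study.
t_age, df_age = welch_t(76.0, 6.0, 40, 74.0, 6.0, 40)

AGE_CUTPOINT = 2.77  # mean absolute Welch's t for age, from the Table 2 note
age_matched = abs(t_age) < AGE_CUTPOINT
print(round(t_age, 3), round(df_age, 1), age_matched)
```

A study whose absolute t-value for age falls below the cutpoint would be treated as better age-matched in the scoring described in this review.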

Report selection and description of the systematic review
After searching reports using the initial keywords, 891 articles were identified from PubMed (published from 1989 to March 2013). We conducted three sequential filtering steps to select qualified reports. Finally, 67 articles with 95 protein BBs were included in our study. Information on the included articles is shown in Supplementary Table 1, and a schematic diagram of the research is presented in Figure 1A.

Criteria of the meta-scoring system: evaluation of protein BBs
Each protein BB was scored using the designated rule, which is described in Figure 1B. The equation evaluates each protein BB by the sum of two multiplied representative values: one is the score of the report stating the protein BB, depending on four assigned criteria, and the other is the statistical significance (p-value) score of the protein BB derived from that report. Table 2 lists the five criteria: four criteria for each report and one for each protein BB, with a description of each scoring scale depending on their sub-categorization. Scoring scales for the number of control and AD subjects were decided based on their distribution patterns (Figure 2A and B). Scoring for the age criterion is based on the match of ages between control and AD subjects in each article. We used Welch's t-value as a measure of age match. We averaged the absolute values of Welch's t across all studies and used the average as the cutpoint for scoring. To avoid the effect of aging, studies with a t-value less than the average get a higher score than those with a t-value greater than the average (Figure 2C). The MMSE difference between control subjects and those with AD was also used as a criterion for the meta-scoring system. Because studies with larger MMSE differences were considered to have more accurate protein BBs, we gave more weight to those studies; to evaluate the MMSE difference between control subjects and those with AD, Welch's t-test was used (Figure 2D). After considering the four article-related criteria, the statistical significance of the protein BBs shown in each article was examined to reflect the degree of certainty and reliability of each BB (Figure 2E). Since we aimed to suggest new BBs for AD using PPI network analysis, variation in the nodes can dramatically affect the organization of the network structure and PPI partners. To select more definitely confirmed BBs, we assigned low meta-scores to negative results. In total, 95 protein BBs were meta-scored and their distribution is shown in Figure 2F.
Table 3 lists the meta-scored protein BBs with an increase (+) / decrease (-) tendency and the number of reports that referred to each protein BB.
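As a rough illustration of how meta-scores accumulate, the sketch below follows the additive reading given in the Table 2 note (per-article score is the sum of the five criterion scores, at most 4+4+2+2+3=15, and scores from multiple articles are added). This is an assumption: the exact equation of Figure 1B is not reproduced here, the text above also describes a multiplied form, and the criterion keys, biomarker names, and example scores are all hypothetical.

```python
from collections import defaultdict

# Maximum score per criterion, from the Table 2 note (keys are hypothetical).
MAX_SCORES = {
    "n_control": 4,      # number of control subjects
    "n_ad": 4,           # number of AD subjects
    "age": 2,            # age match between groups
    "mmse": 2,           # MMSE difference between groups
    "significance": 3,   # statistical significance of the biomarker
}


def article_score(scores):
    """Sum the five criterion scores for one biomarker in one article."""
    for criterion, value in scores.items():
        if value > MAX_SCORES[criterion]:
            raise ValueError(f"{criterion} score {value} exceeds its maximum")
    return sum(scores.values())


def meta_scores(reports):
    """Add per-article scores for each biomarker across all reports."""
    totals = defaultdict(int)
    for biomarker, scores in reports:
        totals[biomarker] += article_score(scores)
    return dict(totals)


# Hypothetical reports: (biomarker, criterion scores in one article).
reports = [
    ("BB_X", {"n_control": 4, "n_ad": 4, "age": 2, "mmse": 2, "significance": 3}),
    ("BB_X", {"n_control": 2, "n_ad": 1, "age": 1, "mmse": 2, "significance": 1}),
    ("BB_Y", {"n_control": 3, "n_ad": 3, "age": 2, "mmse": 1, "significance": 2}),
]
totals = meta_scores(reports)
print(totals)
```

Under this reading, a biomarker reported in several articles can exceed the per-article maximum of 15, which is how the top meta-scores in this review reach values above 30.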

PPI network of protein BBs and ADRs
To determine the biological relevance of the protein BBs in the pathogenesis of AD, we established a PPI network of meta-scored protein BBs and ADRs that are essential for AD pathogenesis. The PPI network of protein BBs and ADRs (BB-PPI) comprised a total of 67 nodes (58 were protein BBs, 7 were ADRs, and 2 were both protein BB and ADR) linked via 97 interactions (Figure 3A). Six blue nodes indicate BBs whose pattern of increase or decrease varied among articles (TNF, CRP, EGF, TGFB1, IL6, and CSF1 in Figure 3A). This variation can be explained by probable genotypic and epidemiological differences among independent cohorts, or by the inherent fluctuation of growth factors and immune-related molecules. Since most included articles present the statistical significance of biomarkers in a range format rather than as exact p-values, quantitative data fusion was unavailable.
Table note: The names of all genes were retrieved from the official gene symbols of HPRD.
Next, we used GSEA to examine whether the proteins in the BB-PPI had specific biological functions [11]. Several biological processes ranging from the single cell level to the organism level were discovered. Because the GO was complex, GOBPs were categorized into manually defined modules based on the titles of the GOBPs. In this way, GSEA can assign roles to the protein BBs of AD. The significantly assigned GOBPs (Fisher's exact p-value<0.05) are shown in Figure 3B with empirical p-values after ten thousand random samplings; the largest number of protein BBs belonged to the development/differentiation module. Even though membership of genes in a single module does not necessarily imply direct interactions, ontology mapping data could be utilized in various in vitro or in vivo assays evaluating designated biological functions of protein BBs.
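The Fisher's exact filter used for module enrichment can be sketched with a hypergeometric tail probability. This is a minimal illustration assuming a one-sided test for over-representation; whether the analysis was one-sided or two-sided, and which background was used, is not stated, and the function name and example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """One-sided Fisher's exact p-value for over-representation.

    k: analyzed proteins annotated to the module
    n: analyzed proteins in total
    K: background proteins annotated to the module
    N: background proteins in total
    Returns P(X >= k) for X following a hypergeometric distribution.
    """
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(n, K) + 1))
    return tail / total


# Hypothetical counts: 5 of 5 analyzed proteins fall in a module that
# holds 5 of the 10 background proteins.
p_enriched = enrichment_p_value(k=5, n=5, K=5, N=10)
print(p_enriched)
```

In the example only one of the 252 possible draws of five proteins places all five in the module, so the p-value is 1/252 and would pass the 0.05 threshold.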
Since a single biomarker does not provide adequate specificity and sensitivity in clinical diagnoses, several studies have attempted to improve the specificity and sensitivity using a multi-biomarker panel [12][13][14]. Similarly, we proposed a multi-biomarker panel with a collection of the highest meta-scored protein BB from all clusters from the BB-PPI. A network community is a set of densely connected nodes that could be linked to nodes out of the set. In terms of the BB-PPI, the community represents a group of proteins that have a large probability of being detected together in the blood. Therefore, we tried to find communities out of the BB-PPI using the Girvan-Newman fast greedy algorithm implemented in the community cluster (GLay) plugged into Cytoscape [15]. Six sub-networks were found with an average of 9.67 genes and an average meta-score of 12.19 (Figure 4). This sub-categorization of the BB-PPI network can serve as an efficient diagnostic panel because the selection of biomarkers in each sub-network can cover the overall AD-related ontologies. Improved diagnosis of AD can be achieved by using the combination of a blood-based multi-biomarker panel and other markers such as CSF or neuroimaging biomarkers.

Figure 1B (legend of the meta-score equation; symbols not recoverable): meta-score of a biomarker; protein blood biomarker; paper; criterion; set of criteria; score of a paper with a criterion; significance score of a biomarker in a paper.

Table 2 (columns: Criteria, Score; first criterion: the number of subjects). Meta-score of each biomarker was calculated by the sum of assigned scores in each criterion. Therefore, maximum meta-score per a single article can be 4+4+2+2+3=15 (the number of subjects for control [4] and AD [4] + age [2] + MMSE score [2] + statistical significance [3]). If one biomarker was reported by multiple articles, the meta-scores evaluated in each article were then added. *2.77 is mean of Welch's t-value for age range. †24.15 is mean of Welch's t-value for MMSE range.

Identification of new protein BB candidates through the extended PPI network
To identify new protein BB candidates with higher probability, we expanded the BB-PPI with additional proteins that interact with both protein BBs and ADRs. The extended PPI network was constructed to include a total of 105 nodes (30 protein BBs, 2 protein BB/ADRs, 10 ADRs, and 63 linker molecules) connected via 240 edges (Figure 5A). Compared to the BB-PPI, 63 linker molecules connected to both protein BBs and ADRs were newly added in the extended PPI network, and protein BBs without a connection to linker molecules were excluded to avoid unnecessary complexity. In total, 30 protein BBs that had an indirect connection with ADRs were included in the extended PPI network. Table 4 lists the linker molecules, including information about their secretion to the extracellular space. In a manner similar to that of the BB-PPI, the functions of protein BBs and linker molecules were assigned using GSEA. The number of genes belonging to each module is shown in Figure 5B. The development/differentiation module had the largest number of genes among the manual modules.

Table 3 note: +/- indicates the pattern of increase or decrease in blood of AD patients compared to control. Count indicates the number of articles that deal with each biomarker. If one biomarker was reported by multiple articles, the meta-scores evaluated in each article were added (final meta-score). The names of all biomarkers were retrieved from the official gene symbols of HPRD.
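The selection of linker molecules for the extended PPI network can be sketched as a neighbor lookup: a linker is a protein outside the seed sets that interacts with at least one protein BB and at least one ADR. This is an illustration under that assumption; the function name, the toy interaction list, and all protein names are hypothetical, and a neighbor that is both a BB and an ADR is counted toward both conditions.

```python
def find_linkers(interactions, biomarkers, adrs):
    """Return proteins outside both seed sets that interact with at least
    one protein BB and at least one ADR."""
    neighbors = {}
    for a, b in interactions:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seeds = biomarkers | adrs
    return {protein for protein, partners in neighbors.items()
            if protein not in seeds
            and partners & biomarkers
            and partners & adrs}


# Hypothetical toy interaction list (not HPRD data).
biomarkers = {"BB1", "BB2"}
adrs = {"ADR1"}
interactions = [
    ("BB1", "L1"), ("L1", "ADR1"),   # L1 bridges a BB and an ADR
    ("BB2", "X1"),                   # X1 touches only a BB
    ("ADR1", "X2"),                  # X2 touches only an ADR
]
linkers = find_linkers(interactions, biomarkers, adrs)
print(sorted(linkers))

# Node count of the extended network as reported in the text.
extended_nodes = 30 + 2 + 10 + 63
```

In the toy example only L1 qualifies as a linker, and the node counts reported for the extended network add up to 105.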

Discussion
Recently, the identification of new candidate biomarkers and disease-related genes has attracted the interest of researchers [8]. Despite many meta-analysis reports on AD [16][17][18][19][20][21], relatively few studies have focused on BBs. Conventional meta-analysis has mainly focused on scoring and integration of individual studies depending on the statistical method. In this systematic review and integrative analysis, however, we adopted network analysis to evaluate each protein BB of AD. Although numerous approaches utilizing proteomics or genomics have attempted to identify potential biomarkers for diverse diseases, several drawbacks remain, such as high cost and low accessibility [8,22]. PPI network analysis has been actively considered as a promising tool for identifying new candidate genes in diverse diseases such as myocardial infarction, acute renal failure, and chronic kidney disease [23][24][25][26]. In this study, we propose a list of 63 candidate BBs of AD identified through an integrative network analysis combining a PPI network analysis with GSEA. From a biological point of view, the integrative network analysis showed the relationships between biomarkers and ADRs as well as the AD-related ontologies in which they participate.
Linker molecules derived from an extended PPI network might be useful as new protein BBs. Even though not all linker molecules have significantly different expression levels in AD patient samples (data not shown), we speculated that several exceptions probably arise from the origin of analyzed biomarkers. Since we evaluated biomarkers originated from only blood, linker molecules may not expose different expression levels in usual cDNA microarray datasets, which are collected from diverse sources. Connectivity to other protein BBs or ADRs can provide clues that explain the possibility of linker molecules as potential biomarkers; linker molecules with high degree can have high correlation with AD pathogenesis or high utility as biomarkers. Exocytosis of linker molecules can increase their utility as biomarkers because proteins known to be secreted outside of cells can be easily detected in the bloodstream. Therefore, we further investigated whether the suggested linker molecules are secreted.

Figure 3 (caption, beginning missing): [...] and decrease (green) pattern of BBs compared to the control (a blue node indicates both increase and decrease). Node size of each BB or BB/ADR is proportional to its meta-score. (B) The biological modules based on the Gene Ontology Biological Process (GOBP), which included BBs and ADRs, were searched by Gene Set Enrichment Analysis (GSEA). GOBPs were categorized into manual modules based on their title, and then filtered by Fisher's exact p-value (p<0.05). After ten thousands of random samplings, empirical p-values (the ratio of the number of outcomes in which randomly selected proteins are enriched to specific manual modules more than analyzed proteins) were evaluated.

Figure 4: Sub-networks of the BB-PPI. BB-PPI was clustered into a total of six sub-networks by the community cluster (GLay) algorithm plugged into Cytoscape. Each sub-network was composed of an average of 9.67 genes; the average meta-score is described on the bottom right.
Even though some secreted linker molecules are known to function in major AD pathogenesis pathways such as NFT and plaque formation [1], further studies are needed to identify specific functional roles in AD-related pathogenesis.
Interest in BBs of AD is steadily increasing due to the potential advantages of high accuracy and low cost. Using integrative network analysis, this systematic review suggested new potential biomarkers for AD and evaluated protein BBs that were reported previously. However, even for highly meta-scored protein BBs and for the newly proposed potential BBs, further studies and extensive experimental validation are necessary before they can be applied in the clinical setting. We expect that future work will confirm the clinical use of these potential biomarkers and improve diagnostic modalities.