Figure 3: Overview of the PLS-DA analysis for the ALS vs CN comparison. (A) Plot of R2Y (explained variation) and Q2Y (predicted variation), showing how these parameters change with increasing model complexity. According to cross-validation, two components were significant in explaining the relationship between the descriptor matrix and the class response; nevertheless, three components were retained to allow score plotting. (B) The PLS-DA score plot reveals no overlap between the two clouds of items corresponding to the two categories examined, Amyotrophic Lateral Sclerosis (ALS, red cones) and control samples (CN, blue spheres). The axes of the plot indicate PLS-DA components 1-3. (C) Identification of the subquadrants, in terms of pI and MW, with the highest VIP values. Protein expression changes in the most influential subquadrants, those involved in discriminating the disease group from the healthy group, are plotted as a heat map. In this heat map, red indicates higher expression in ALS than in CN, blue indicates lower expression in ALS than in CN, and light green indicates similar expression in the two groups. (D) Validation plot from the permutation test. The X-axis denotes the correlation coefficient between the original and permuted response data, whereas the Y-axis shows the R2Y (triangles) and Q2Y (squares) values of all models. The last two points in the plot correspond to the R2Y and Q2Y values of the original model. Two regression lines have been fitted, one through the R2Y points and another through the Q2Y points. Their intercepts can be considered measures of the degree of overfitting and overprediction.
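
The intercept calculation described for panel (D) can be sketched as follows. This is a minimal illustration, not the software used in the study: the function name `ols_line` and all numeric values are hypothetical, chosen only to show how a least-squares line through the (correlation, R2Y) and (correlation, Q2Y) points yields the two intercepts. The point at correlation 1.0 stands for the original, unpermuted model.

```python
def ols_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2 points)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical values for illustration only; NOT data from the study.
# Each entry is one model: correlation between original and permuted
# class labels (X-axis), with that model's R2Y and Q2Y (Y-axis).
corr = [0.00, 0.25, 0.50, 1.00]   # 1.00 = original (unpermuted) model
r2y = [0.30, 0.45, 0.60, 0.90]
q2y = [-0.20, 0.05, 0.30, 0.80]

# The intercept is the fitted value at correlation 0, i.e. what a model
# trained on labels unrelated to the true classes would be expected to reach.
_, r2y_intercept = ols_line(corr, r2y)
_, q2y_intercept = ols_line(corr, q2y)

print(f"R2Y intercept: {r2y_intercept:.2f}")
print(f"Q2Y intercept: {q2y_intercept:.2f}")
```

Under this reading, an R2Y intercept well below the R2Y of the original model and a Q2Y intercept near or below zero are generally taken to suggest that the original model is not merely fitting noise; the exact acceptance thresholds depend on the conventions of the software used.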