Figure 3

Figure 3: Parameter estimates for variables under the H₀ hypothesis, for varying sample size n. This figure concerns estimates obtained on identification sets only. Each of the four panels corresponds to one sample size, n ∈ {100, 200, 400, 1000}. Each panel shows two distributions: (1) the estimates for the Ω_p0 variables obtained over 200 identification sets (grey histogram); (2) the estimates for the Ω_V variables obtained over 200 identification sets (histogram with horizontal hatching). The vertical continuous line indicates 0.2.
With n=100, the estimates for the Ω_p0 variables are highly variable, as shown by the wide distribution. Variables are selected from the extremes of the distribution of the Ω_p0 estimates, so the mean estimates of the Ω_V variables are far from their true means. As the sample size increases, the distribution of the estimates for the Ω_p0 variables becomes narrower and the mean of the estimates for the Ω_V variables decreases. This illustrates the regression to the mean phenomenon, which leads to the inappropriate selection of some FP variables that in fact have no effect on survival.
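
The selection effect described above can be reproduced with a minimal simulation sketch. It is not the authors' procedure. It assumes that each estimate under H₀ is drawn from a normal distribution centred on 0 with standard error 1/√n, and that a variable is selected when its absolute z-score exceeds 1.96. The function name `simulate_null_estimates`, the number of variables per set (`n_vars`) and the threshold `z_crit` are hypothetical choices made for illustration only.

```python
import random
import statistics


def simulate_null_estimates(n, n_sets=200, n_vars=50, z_crit=1.96, seed=0):
    """Return (mean |estimate| over all null variables,
               mean |estimate| over the selected null variables).

    Assumptions (not taken from the source): estimates are N(0, 1/sqrt(n))
    and selection keeps variables with |estimate| / se > z_crit.
    """
    rng = random.Random(seed)
    se = 1.0 / n ** 0.5  # assumed standard error, shrinking with sample size
    all_abs, selected_abs = [], []
    for _ in range(n_sets):          # one pass per identification set
        for _ in range(n_vars):      # null variables: true effect is 0
            est = rng.gauss(0.0, se)
            all_abs.append(abs(est))
            if abs(est) / se > z_crit:   # selection on the extremes only
                selected_abs.append(abs(est))
    # statistics.mean raises StatisticsError if nothing was selected
    return statistics.mean(all_abs), statistics.mean(selected_abs)


if __name__ == "__main__":
    for n in (100, 200, 400, 1000):
        mean_all, mean_sel = simulate_null_estimates(n)
        print(f"n={n:5d}  all={mean_all:.3f}  selected={mean_sel:.3f}")
```

Under these assumptions, the selected estimates are always larger in absolute value than the average null estimate, and their mean shrinks toward 0 as n grows. This matches the pattern described for the Ω_V histograms, although the actual values in the figure depend on the survival model and selection rule used in the study.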