![]() |
Figure 5: Parameter estimates for all selected variables. Top: n=100 for the
identification sets and n=1000 for the validation sets; bottom: n=1000 for the
identification sets and n=100 for the validation sets. Two distributions are
plotted: (1) the distribution of the estimates for the ΩR sets over the 200
identification datasets (histogram with horizontal hatching); (2) the
distribution of the estimates for the ΩR sets over the 200 × 50 validation
datasets (histogram with diagonal hatching). The vertical dotted line indicates
the mean of the latter distribution. The vertical continuous
line indicates 0.2. With n=100 in the identification datasets, the strength of association is poorly estimated, and the estimates are far lower on the validation datasets, even large ones (n=1000). With n=1000 in the identification datasets, however, the strength of association is correctly estimated and is consequently confirmed on the validation sets, whatever their size. |