A Comment on Sample Size Calculation for Analysis of Covariance in Parallel Arm Studies

We compare two sample size calculation approaches for analysis of covariance with one covariate. Exact simulation studies are conducted to compare the sample size calculation based on an approach by Borm et al. (2007) (referred to as the B approach) and an exact approach (referred to as the F approach). Although the B approach and the F approach have similar performance when the correlation coefficient is small, the F approach generally has a more accurate sample size calculation as compared to the B approach. Therefore, the F approach for sample size calculation is generally recommended for use in practice.
Citation: Shan G, Ma C (2014) A Comment on Sample Size Calculation for Analysis of Covariance in Parallel Arm Studies. J Biomet Biostat 5: 184. doi:10.4172/2155-6180.1000184. J Biomet Biostat, ISSN: 2155-6180, an open access journal, Volume 5, Issue 1, 1000184.


Introduction
Randomized clinical trials are commonly used to confirm the efficacy of a new treatment. Randomization has several advantages in clinical trials, such as reducing selection bias and making groups more comparable with respect to potential confounding factors [1]. Balanced studies are often conducted to maximize the power of a study for a given total sample size.
Sample size calculation plays a very important role in clinical trials. It has been studied for many years, and significant progress has been made [2-4]. As far as we know, sample size calculation approaches for analysis of covariance (ANCOVA) are very limited. Recently, Borm et al. [5] proposed a simple closed-form sample size calculation for ANCOVA with one covariate, where the covariate is the baseline measurement of the response outcome. Using the sample size from a two-sample t-test and the correlation between the response outcome and the covariate, they showed that this formula gives an accurate sample size. The other method is based on a ratio of mean squares [6,7], whose distribution is an F distribution under the null hypothesis and a non-central F distribution under the alternative. There has been no systematic comparison between these two approaches.
We review two existing sample size calculation approaches for ANCOVA with one covariate in Section 2. In Section 3, we compare the two approaches using exact simulation studies and illustrate them with an example from a randomized study. Section 4 is devoted to discussion.

Methods
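Let $Y_{ij}$ be the $j$th response outcome for the $i$th group, $i=1,2$; $j=1,2,\ldots,n_i$, and let $X_{ij}$ be the associated covariate. We consider the first group as the control and the second group as the treatment group in this article. The covariate can be viewed as the baseline measurement of the outcome. The regression model for the relationship between $Y$ and $X$ within the $i$th group is given as $Y_{ij}=\beta_{0i}+\beta_1 X_{ij}+\varepsilon_{ij}$, where $\beta_{0i}$ is the intercept for the $i$th group, $\beta_1$ is the common slope for both groups, and $\varepsilon_{ij}$ is the measurement error, which follows a normal distribution [8]. The mean difference between the two groups is the difference between the two intercepts.
Borm et al. [5] proposed a simple sample size calculation for ANCOVA by multiplying the number of subjects for the two-sample t-test by a design factor. The factor here is $1-\rho^2$, where $\rho$ is the correlation coefficient between the outcome and the covariate. Sample size calculation for the two-sample t-test is based on response outcomes. Given a significance level of $\alpha$, a pre-specified power $1-\beta$, a mean difference between the treatment group and the control of $\mu_2-\mu_1$, and a common standard deviation of the response outcome $\sigma$, the sample size per group is calculated as $n=2(Z_{1-\alpha/2}+Z_{1-\beta})^2\sigma^2/(\mu_2-\mu_1)^2$, where $Z_d$ is the $d$th percentile of a standard normal distribution. Borm et al. [5] showed that the total sample size for the ANCOVA, $N=2n(1-\rho^2)$, may not be accurate enough to retain the pre-specified power in small sample settings. They provided power plots showing that the power with this sample size formula is generally smaller than $1-\beta$ in small sample settings. For this reason, they proposed using $N=2n(1-\rho^2)+2$ as the sample size, that is, adding one subject to each group. They claimed that this sample size is accurate for all sample sizes.
The second method is an exact approach based on a ratio of mean squares, $F=MS_b/MS_w$, where $MS_b$ is the mean square between groups and $MS_w$ is the mean square within groups [7]. Under the null hypothesis of no difference between the two groups, the test statistic follows an $F_{1,N-3}$ distribution, and the threshold value is $F_\alpha$, where $\Pr(F_{1,N-3}\ge F_\alpha)=\alpha$. Under the alternative, the test statistic follows a non-central $F_{1,N-3,\lambda}$ distribution with non-centrality parameter $\lambda=N\sigma_b^2/\sigma_\varepsilon^2$ [7], where $\sigma_\varepsilon^2=(1-\rho^2)\sigma^2$, $\sigma_b^2=(n_1/N)(\mu_1-\mu)^2+(n_2/N)(\mu_2-\mu)^2$, $N=n_1+n_2$ is the total sample size, and $\mu=(n_1/N)\mu_1+(n_2/N)\mu_2$ is the overall response outcome mean. The power of the study is then the probability that the non-central F statistic is greater than or equal to the threshold $F_\alpha$, $\Pr(F_{1,N-3,\lambda}\ge F_\alpha)$. The required sample size is determined by increasing the sample size by one each time until the pre-specified power is reached.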
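The search used by the F approach can be sketched in a few lines of Python. This is a minimal illustration, not the authors' R code or the PASS implementation. It assumes balanced groups that grow by one subject per group at each step, and it uses the non-centrality parameter $\lambda=N\sigma_b^2/\sigma_\varepsilon^2$ with $\sigma_\varepsilon^2=(1-\rho^2)\sigma^2$. Because the Python standard library has no non-central F distribution, the sketch evaluates it as a Poisson-weighted sum of central F probabilities, each written as a regularized incomplete beta function. All function names (`betainc`, `ncf_sf`, `f_critical`, `ancova_power`, `f_approach_total`) are hypothetical.

```python
import math


def _beta_cf(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        even = m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m))
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1))
        for coef in (even, odd):
            d = 1.0 + coef * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + coef / c
            c = c if abs(c) > tiny else tiny
            h *= d * c
        if abs(d * c - 1.0) < eps:
            break
    return h


def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def ncf_sf(f, df2, lam, terms=500):
    """Pr(F >= f) for a non-central F with 1 and df2 degrees of freedom."""
    x = f / (f + df2)
    half = lam / 2.0
    weight = math.exp(-half)  # Poisson(lam / 2) probability of j = 0
    cdf = 0.0
    for j in range(terms):
        cdf += weight * betainc(0.5 + j, df2 / 2.0, x)
        weight *= half / (j + 1)
        if j > half and weight < 1e-16:
            break
    return 1.0 - cdf


def f_critical(alpha, df2):
    """Threshold F_alpha with Pr(F_{1,df2} >= F_alpha) = alpha, by bisection."""
    lo, hi = 0.0, 1e6
    for _ in range(80):
        mid = (lo + hi) / 2.0
        if ncf_sf(mid, df2, 0.0) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def ancova_power(n1, n2, mu1, mu2, sigma, rho, alpha):
    """Power Pr(F_{1,N-3,lambda} >= F_alpha) for given group sizes."""
    n_total = n1 + n2
    mu = (n1 * mu1 + n2 * mu2) / n_total
    var_b = (n1 * (mu1 - mu) ** 2 + n2 * (mu2 - mu) ** 2) / n_total
    var_e = (1.0 - rho ** 2) * sigma ** 2
    lam = n_total * var_b / var_e
    return ncf_sf(f_critical(alpha, n_total - 3), n_total - 3, lam)


def f_approach_total(delta, sigma, rho, alpha, power):
    """Smallest balanced total N reaching the target power (assumed search rule)."""
    n = 2  # smallest balanced design with N - 3 >= 1
    while ancova_power(n, n, 0.0, delta, sigma, rho, alpha) < power:
        n += 1
    return 2 * n


if __name__ == "__main__":
    print(f_approach_total(delta=0.6, sigma=1.2, rho=0.7, alpha=0.01, power=0.90))
```

The call at the bottom uses the settings of the rheumatoid arthritis example, so its result can be compared with the PASS-based totals reported there. Stepping the total by two rather than one keeps the design balanced; that choice is an assumption of this sketch.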

Method comparison
We refer to the approach proposed by Borm et al. [5] as the B approach, and to the approach based on the F distribution as the F approach. Power is calculated as the percentage of simulated trials with a significant ANCOVA p-value, based on 10,000 simulations. The calculated power is presented in Table 1 for $\alpha=0.05$, $\beta=0.2$, $\sigma=1$, and $\mu_2-\mu_1=0.5$, and in Table 2 for $\alpha=0.01$, $\beta=0.2$, $\sigma=1$, and $\mu_2-\mu_1=1$. The sample size for the F approach is calculated using PASS 12 [9]. As can be seen from both tables, the difference between the B approach and the F approach is negligible for small values of $\rho$. For large values of $\rho$, the power of the B approach is much lower than the pre-specified power; as shown in Table 2, it can be as low as 52%. Although the two approaches perform similarly when $\rho$ is small, the F approach generally gives a more accurate sample size than the B approach.
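The simulated power described above can be approximated with a short Monte Carlo sketch. The data-generating model is an assumption, since the text does not spell it out: the baseline $X$ is drawn from $N(0,\sigma^2)$ and the outcome is $Y=\mu_i+\rho X+\varepsilon$ with $\varepsilon\sim N(0,(1-\rho^2)\sigma^2)$, so that the within-group standard deviation of $Y$ is $\sigma$ and its correlation with $X$ is $\rho$. The ANCOVA F statistic is computed from residual sums of squares, and the threshold $F_\alpha$ is obtained as the square of the two-sided t critical value with $N-3$ degrees of freedom. Function names are hypothetical.

```python
import math
import random
from statistics import fmean


def t_two_sided_p(t, df, steps=2000):
    """Pr(|T| >= |t|) for Student's t, by Simpson integration of the density."""
    t = abs(t)
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    h = t / steps
    total = pdf(0.0) + pdf(t)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pdf(k * h)
    return 1.0 - 2.0 * total * h / 3.0


def f_critical(alpha, df2):
    """Threshold of F(1, df2): the squared two-sided t critical value."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if t_two_sided_p(mid, df2) > alpha:
            lo = mid
        else:
            hi = mid
    return ((lo + hi) / 2.0) ** 2


def _sums(xs, ys):
    """Centered sums of squares and cross-products (Sxx, Sxy, Syy)."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    return sxx, sxy, syy


def ancova_f(x1, y1, x2, y2):
    """F statistic (1 and N-3 df) for the group effect, adjusted for x."""
    within = [a + b for a, b in zip(_sums(x1, y1), _sums(x2, y2))]
    overall = _sums(x1 + x2, y1 + y2)
    rss_full = within[2] - within[1] ** 2 / within[0]         # group + covariate
    rss_reduced = overall[2] - overall[1] ** 2 / overall[0]   # covariate only
    n_total = len(x1) + len(x2)
    return (rss_reduced - rss_full) / (rss_full / (n_total - 3))


def simulated_power(n_per_group, delta, sigma, rho, alpha, n_sims=10000, seed=1):
    """Share of simulated trials in which the ANCOVA F test rejects."""
    rng = random.Random(seed)
    f_crit = f_critical(alpha, 2 * n_per_group - 3)
    resid_sd = sigma * math.sqrt(1.0 - rho ** 2)

    def arm(mean):
        xs = [rng.gauss(0.0, sigma) for _ in range(n_per_group)]
        ys = [mean + rho * x + rng.gauss(0.0, resid_sd) for x in xs]
        return xs, ys

    hits = 0
    for _ in range(n_sims):
        x1, y1 = arm(0.0)
        x2, y2 = arm(delta)
        if ancova_f(x1, y1, x2, y2) >= f_crit:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    # First simulation setting from the text, with a hypothetical rho and n.
    print(simulated_power(48, delta=0.5, sigma=1.0, rho=0.5, alpha=0.05))
```

With 10,000 simulations the Monte Carlo standard error near 80% power is about 0.4 percentage points, so differences of that size between two sample sizes should not be over-interpreted.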
A parallel randomized clinical trial is used to illustrate the sample size calculation based on the B approach and the F approach. Patients with rheumatoid arthritis are randomized to receive treatment with or without leflunomide [10]. The response outcome, the disease activity score, is measured before and after the treatment. The baseline measurement is used as the covariate in the ANCOVA model. This example is also used by Borm et al. [5]. The standard deviation is estimated as $\sigma=1.2$. At a significance level of $\alpha=0.01$ and 90% power, the total sample sizes required by the B approach to detect a mean difference of $\mu_2-\mu_1=0.6$ are 122, 86, and 46 for $\rho=0.7$, 0.8, and 0.9, respectively. The F approach requires total sample sizes of 126, 90, and 50. The sample size from the B approach is smaller than that from the F approach in each case, so it may not attain the pre-specified power of the study.
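The design-factor arithmetic behind the B-approach totals can be checked with a short Python sketch. It assumes the two-sided normal-approximation formula $n=2(Z_{1-\alpha/2}+Z_{1-\beta})^2\sigma^2/(\mu_2-\mu_1)^2$ for the t-test sample size and multiplies the total by $1-\rho^2$. The rounding rule, rounding each group up to a whole subject, is also an assumption. The text does not say how the extra subject per group proposed by Borm et al. interacts with rounding, so the sketch leaves it out. Function names are hypothetical.

```python
import math
from statistics import NormalDist


def t_test_n_per_group(delta, sigma, alpha, power):
    """Per-group n for a two-sided two-sample comparison (normal approximation)."""
    z = NormalDist().inv_cdf
    return 2.0 * (z(1.0 - alpha / 2.0) + z(power)) ** 2 * sigma ** 2 / delta ** 2


def design_factor_total(delta, sigma, rho, alpha, power):
    """Unrounded ANCOVA total N = 2 n (1 - rho^2)."""
    return 2.0 * t_test_n_per_group(delta, sigma, alpha, power) * (1.0 - rho ** 2)


def rounded_total(raw_total):
    """Assumed rounding: each group is rounded up to a whole subject."""
    return 2 * math.ceil(raw_total / 2.0)


for rho in (0.7, 0.8, 0.9):
    raw = design_factor_total(delta=0.6, sigma=1.2, rho=rho, alpha=0.01, power=0.90)
    print(rho, round(raw, 1), rounded_total(raw))
# prints:
# 0.7 121.4 122
# 0.8 85.7 86
# 0.9 45.2 46
```

Under this rounding the sketch gives the same totals as those quoted for the B approach in the example.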

Conclusions
The sample size formula proposed by Borm et al. [5] has a closed form and is computationally easy. However, the power of the study may be lower than the pre-specified power when $\rho$ is large. The power of the F approach is closer to the pre-specified power for all values of $\rho$. R code for the sample size calculation for both methods is available from the first author. For a study with multiple covariates, Borm et al. [5] recommended using $1-R^2$ as the design factor. We consider the comparison between this approach and the approach based on the ratio of mean squares as future work. Another possible direction for future work is sample size calculation based on exact approaches [11-14].