Finding Critical Values to Control Type I Error for a Biomarker Informed Two-Stage Winner Design


Background
Adaptive clinical trial designs have become increasingly popular in recent years. The PhRMA Working Group defines an adaptive design as a clinical study design that uses accumulating data to direct modification of aspects of the study as it continues, without undermining the validity and integrity of the trial [1]. These designs can potentially accelerate clinical development and improve efficiency.
However, the multiple interim looks and adaptive adjustments in such designs can inflate the type I error. Over the past decade, several statistical approaches have been proposed to control this inflation, some of which have been widely applied in practice. These include the error spending approach for classical group sequential plans [2][3][4]; combination of p-values, such as Fisher's combination test [5,6], the inverse normal method [7], and the sum of p-values approach [8]; the conditional error function [9][10][11]; the fixed weighting method [12]; the variance spending method [13,14]; and multiple testing methodology such as closed test procedures [15][16][17].
In addition to conventional adaptive designs, which use the same endpoint at the interim and final analyses of the study, interest has recently been drawn towards biomarker informed adaptive clinical trial designs, driven by the surge in advanced technology, especially in the "omics" space (e.g., genomics, proteomics). Biomarker informed adaptive designs make interim decisions based on inference on a potentially predictive biomarker, which may be a short-term endpoint that is indicative of the behavior of the primary endpoint.
Todd and Stallard [18] proposed a statistical approach to control the type I error rate for group sequential trials where the interim treatment selections are based only on the biomarker. Stallard [19] later proposed a method for group sequential trials that uses both the available biomarker and primary endpoint information for treatment selection; this method controls the type I error rate in the strong sense. Friede et al. [20] considered a biomarker informed drop-the-losers design using combination tests for adaptive designs and the closure principle for multiple testing to achieve strong control of the family-wise type I error rate. Scala and Glimm [21] studied the case of a correlated time-to-event biomarker and primary endpoint, where Bayesian predictive power combining evidence from both endpoints is used for interim selection, and investigated the precise conditions under which type I error control is attained. Jenkins et al. presented a type I error control approach for an enrichment design with a survival biomarker and primary endpoint, which allows both the subgroup and the full population as co-primary populations.
Shun et al. [22] studied a biomarker informed two-stage winner design with normal endpoints. In this design, the interim decisions are made by ranking the observed biomarker effects. They derived the unconditional distribution of the final test statistic for the design with two active treatment arms and one control arm, and proposed a normal approximation for calculating the critical value that preserves the type I error rate. However, the normal approximation procedure proposed by Shun et al. [22] cannot be extended to designs with more active treatment groups. In this manuscript, we extend their work and propose a novel type I error control approach for the biomarker informed two-stage winner design that can accommodate multiple active arms. Our approach preserves the type I error rate by adjusting the critical rejection values of the final test statistic of the design. The exact distribution of the final test statistic is derived, and R functions for calculating the adjusted critical rejection values from the skewed distribution are developed. The critical rejection values associated with a one-sided type I error rate of 0.025 for the biomarker informed two-stage winner design with up to 7 active treatment groups are also tabulated for easy reference.
Since a biomarker informed adaptive design has two endpoints, i.e., the biomarker endpoint and the clinical (primary) study endpoint, it is important to define a robust model to describe the relationship between them. Shun et al. [22] used the conventional approach to model the two normal endpoints in the biomarker informed two-stage winner design they considered; that is, a bivariate normal distribution with a correlation coefficient is used for modeling the two endpoints. However, the conventional approach has been shown to be inappropriate when little historical knowledge is available about how the means of the two endpoints are related [23]. Wang et al. [23] proposed a two-level correlation model to describe the relationship between the two endpoints. Besides the correlation coefficient between the two endpoints, their model incorporates a new variable that describes the mean-level correlation between the two endpoints. This variable, together with its distribution, reflects the uncertainty about the mean-level relationship between the two endpoints due to a small sample size of historical data. Wang et al. [23] show that the two-level correlation model is a better choice for modeling the two endpoints than the conventional model; in fact, the conventional model is a special case of the two-level correlation model. In this manuscript, we consider both the conventional and the two-level correlation models when finding critical rejection values for the final test statistic to preserve the type I error rate.

Biomarker informed two-stage winner design
In general, a biomarker informed adaptive design is a design that combines a phase II and a phase III study. It starts with several active treatment arms and a control arm, with planned interim analyses on the biomarker. At the interim, the inferior arms are terminated based on the biomarker effects (either by hypothesis testing or by ranking of observations), and only the most promising treatment (the "winner") is carried to the end of the study together with the control arm. The final comparison between the winner arm and the control arm is performed on data from both stages and on the study primary endpoint. This design has the potential to shorten the duration of the trial for drug development and can be cost effective.
Shun et al. [22] studied a "biomarker informed two-stage winner design". This design has only one interim look, taken when each treatment group has $n_1$ patients ($n_1$ is the interim sample size), and uses the ranking of biomarker observations for the interim selection. An additional $n_2$ patients are recruited for each of the winner arm and the control arm, and the final comparison is performed on the primary endpoint of the $2N$ ($N = n_1 + n_2$) patients.
Let K be the number of treatment groups (K−1 active treatment groups and 1 control group), and N be the maximum sample size for each treatment group. Assume the interim analysis is planned at the information time $I = n_1/N$. In the biomarker informed two-stage winner design, the interim decision rule is that if the interim biomarker observations satisfy $\bar{X}_j = \max(\bar{X}_1, \ldots, \bar{X}_{K-1})$, where $\bar{X}_j$ is the interim biomarker mean of treatment group j, then treatment j is selected as the most effective treatment, and only treatment group j and the control group are carried to the end of the study. When the interim biomarker outcomes are almost the same, the option of carrying more than one treatment group to the end of the study is not considered in this design, because either treatment group can be selected in this case. The final assessment is based on the study primary endpoint Y, comparing the selected treatment group and the control group.
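The interim rule and the resulting sample size can be illustrated with a short Python sketch. The helper names `select_winner` and `total_sample_size` and the toy data are illustrative assumptions, not part of the original design specification; ties are resolved by taking the first maximal arm, a choice the text leaves open.

```python
from statistics import mean


def select_winner(interim_biomarker):
    """Return the 1-based index of the active arm with the largest
    interim biomarker mean (hypothetical helper; first arm wins ties)."""
    means = [mean(arm) for arm in interim_biomarker]
    return means.index(max(means)) + 1


def total_sample_size(k_groups, n1, n2):
    """All K groups enroll n1 patients; only the winner and the
    control enroll a further n2 patients each."""
    return k_groups * n1 + 2 * n2


# Toy interim biomarker data for K - 1 = 3 active arms (n1 = 4 each).
interim = [
    [0.1, 0.4, -0.2, 0.3],   # arm 1
    [0.9, 0.5, 0.7, 0.2],    # arm 2
    [0.0, 0.6, 0.1, 0.4],    # arm 3
]
winner = select_winner(interim)
print("winner arm:", winner)
print("total patients:", total_sample_size(k_groups=4, n1=4, n2=6))
```

The control arm plays no role in the selection step; it only enters the final comparison on the primary endpoint.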

The two models for fitting the two endpoints
In this section, we briefly review two commonly used techniques for modeling the two endpoints (i.e., the biomarker and the study primary endpoint) in a biomarker informed two-stage winner design. We assume both endpoints are normally distributed in our context. For endpoints that are not normally distributed, a transformation could be considered.
The first is the conventional approach, which uses a bivariate normal distribution with a correlation coefficient to fit the two endpoints.
In this approach, the individual-level correlation coefficient ρ is the only variable describing the relationship between the biomarker and the primary endpoint. The second is the two-level correlation model proposed by Wang et al. [23], where a conditional bivariate normal distribution is used to model the two endpoints. This model considers both the individual-level and the mean-level correlation between the biomarker and the primary endpoint.
More specifically, let $u_{Xj}$ be the mean of the biomarker for treatment group j and $\sigma_X^2$ be its variance. For a fixed j, assume the biomarker observations $\{X_{ji}\}$ are independently and identically distributed as $N(u_{Xj}, \sigma_X^2)$. Denote the standardized mean of the biomarker for each treatment group by $u_{Xj}^* = u_{Xj}/\sigma_X$. Let $u_{Yj}$ be the mean of the study primary endpoint for treatment group j and $\sigma_Y^2$ be its variance, with the observations $\{Y_{ji}\}$ independently and identically distributed as $N(u_{Yj}, \sigma_Y^2)$. Denote the standardized mean of the study primary endpoint by $u_{Yj}^* = u_{Yj}/\sigma_Y$. $X_{ji}$ and $Y_{ji}$ are correlated with a correlation ρ for the same j and i, that is, $\mathrm{corr}(X_{ji}, Y_{ji}) = \rho$.
Since both endpoints are assumed to be normally distributed, the conventional approach uses a multivariate normal distribution with a correlation coefficient ρ for modeling the relationship between the biomarker and the primary endpoint:
$$\begin{pmatrix} X_{ji} \\ Y_{ji} \end{pmatrix} \sim N\left( \begin{pmatrix} u_{Xj} \\ u_{Yj} \end{pmatrix}, \begin{pmatrix} \sigma_X^2 & \rho\sigma_X\sigma_Y \\ \rho\sigma_X\sigma_Y & \sigma_Y^2 \end{pmatrix} \right).$$
This approach was used for modeling the two endpoints by Shun et al. [22], among others. It has the limitation that the means of both endpoints have to be specified when running power simulations, which can be challenging given the lack of solid historical knowledge about the relationship between the biomarker and the primary endpoint, a situation that is not uncommon.
The two-level correlation model proposed by Wang et al. [23] introduces a new variable, $R_j$, which is referred to as the estimated mean-level correlation between the biomarker and the primary endpoint. This new variable, along with its distribution, reflects the uncertainty of the mean-level relationship between the two endpoints, especially that due to a small sample size of historical data.
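As a concrete illustration of the conventional model described above, the following Python sketch draws (biomarker, primary endpoint) pairs with individual-level correlation ρ and checks the sample correlation. The function names and the parameter values are illustrative assumptions; only the bivariate normal structure comes from the text.

```python
import math
import random


def sample_pair(rng, u_x, u_y, sigma_x, sigma_y, rho):
    """Draw one (biomarker, primary endpoint) pair from a bivariate
    normal distribution with individual-level correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    x = u_x + sigma_x * z1
    # Mixing z1 into y induces corr(x, y) = rho.
    y = u_y + sigma_y * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return x, y


def sample_correlation(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(2024)
pairs = [sample_pair(rng, u_x=1.0, u_y=0.5, sigma_x=2.0, sigma_y=1.5, rho=0.6)
         for _ in range(20000)]
print(round(sample_correlation(pairs), 2))
```

Note that both means must be supplied to the sampler, which is exactly the limitation of the conventional model pointed out in the text.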
Assume $R_j$ is normally distributed and centered at $r_j$, the true mean-level correlation between the two endpoints, with variance $\sigma_{rj}^2$. The two-level correlation model can be written as follows: [equation missing in source]. The unconditional distribution of the model can be expressed as follows: [equation missing in source]. Notice that when $\sigma_{rj}^2 = 0$, the two-level correlation model reduces to the conventional model. Thus, for a biomarker informed two-stage winner design with K (K ≥ 3) treatments, the distribution of the final test statistic under the conventional model is: [equation missing in source].
As discussed in Wang et al. [23], the conventional approach tends to overestimate the power of a biomarker informed two-stage winner design when historical knowledge about the two endpoints is not solid, while the two-level correlation model provides reasonable results because the uncertainty about the mean-level correlation is taken into account in the model. Both models are considered in the next two sections, where we derive the distribution of the final test statistic for the biomarker informed two-stage winner design.

Test statistic and critical rejection region using conventional approach
To prevent type I error inflation in the biomarker informed two-stage winner design, we adjust the critical rejection values of the final test statistic of the design. In this section, we derive the exact distribution and give the critical rejection region for the final test statistic under the conventional model. As shown in the succeeding sections, the proposed approach works for biomarker informed two-stage winner designs with any number of active arms.
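Consider the following hypotheses:
$$H_0: u_{Yj} = u_{Y0} \ \text{for all } j = 1, \ldots, K-1 \quad \text{versus} \quad H_1: u_{Yj} > u_{Y0} \ \text{for some } j, \tag{1}$$
where $u_{Y0}$ denotes the mean of the primary endpoint in the control group. Let $G_j$ be the test statistic comparing the primary endpoint of the jth treatment group and the control group:
$$G_j = \sqrt{\frac{N}{2}}\,\frac{\bar{Y}_j - \bar{Y}_0}{\sigma_Y},$$
where $\bar{Y}_j$ is the mean of the primary endpoint measurements for treatment group j at final, and $\bar{Y}_0$ is the corresponding mean for the control group.
It can be shown that
$$G_j \sim N\left(\sqrt{\frac{N}{2}}\,\left(u_{Yj}^* - u_{Y0}^*\right),\ 1\right).$$
The final test statistic of the biomarker informed two-stage winner design can then be expressed as:
$$W = G_j \quad \text{if } \bar{X}_j = \max(\bar{X}_1, \ldots, \bar{X}_{K-1}), \quad j = 1, \ldots, K-1,$$
where $\bar{X}_j$ is the interim biomarker mean of treatment group j. That is, conditional on the interim selection, W takes on the value of the effect from the "winner" treatment group as the final test statistic.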
For the very general case under $H_1$, the distribution of the final test statistic W can be derived as: [equation missing in source] (3). It can be seen that the interim treatment selection of the design skews the distribution of its final test statistic. Hence, appropriate statistical adjustment is necessary in order to preserve the type I error rate of the design. With the general distribution of the final test statistic available, the type I error rate of the design can be preserved by adjusting the critical rejection value for the final test statistic.
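The source of the skewness can be made explicit with a short worked derivation. It is a sketch under assumptions that are not all stated in the text: known variances, $n_1$ patients per arm at the interim and $N$ per arm at the final analysis, a final statistic of the form $G_j = \sqrt{N/2}\,(\bar{Y}_j - \bar{Y}_0)/\sigma_Y$, and equal biomarker means across the active arms under the null. The symbols $c$, $Z_j$, $M$ and $\varepsilon$ are introduced here for illustration only.

$$
\begin{aligned}
\operatorname{Cov}\big(\bar{X}_j, \bar{Y}_j\big) &= \frac{1}{n_1 N}\sum_{i=1}^{n_1}\operatorname{Cov}(X_{ji}, Y_{ji}) = \frac{\rho\,\sigma_X\sigma_Y}{N},\\
\operatorname{corr}\big(\bar{X}_j, \bar{Y}_j\big) &= \frac{\rho\,\sigma_X\sigma_Y/N}{(\sigma_X/\sqrt{n_1})(\sigma_Y/\sqrt{N})} = \rho\sqrt{\frac{n_1}{N}},\\
c := \operatorname{corr}\big(\bar{X}_j, G_j\big) &= \rho\sqrt{\frac{n_1}{2N}}.
\end{aligned}
$$

Writing $Z_j$ for the standardized interim biomarker mean of arm $j$ and $M = \max(Z_1, \ldots, Z_{K-1})$, joint normality gives, under the assumed null,

$$
W \overset{d}{=} c\,M + \sqrt{1 - c^2}\,\varepsilon, \qquad \varepsilon \sim N(0, 1) \ \text{independent of } M,
$$

$$
P(W \le w) = \int_{-\infty}^{\infty} \Phi\!\left(\frac{w - c z}{\sqrt{1 - c^2}}\right) (K-1)\,\phi(z)\,\Phi(z)^{K-2}\,dz .
$$

Because $M$ has a positive mean and a right-skewed distribution when $K - 1 \ge 2$, $W$ is shifted and skewed to the right whenever $\rho > 0$, which is why the unadjusted normal critical value is too small.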
Under $H_0$, the distribution of the final test statistic (3) can be written as follows: [equation missing in source]. Denote the distribution of the final test statistic W under $H_0$ by $F_0$. Let $w_\alpha$ be the upper 100α percent quantile of $F_0$, that is, $F_0(w_\alpha) = 1 - \alpha$. The type I error rate of the design can be controlled at level α if the one-sided rejection region is $\{W > w_\alpha\}$.
Notice that controlling the type I error for the hypotheses (1) does not control the probability that the winning treatment will be deemed effective when it is in fact not effective. Even when the null hypothesis in (1) is rejected correctly, an error can still occur (i.e., the wrong treatment can be selected). Hence, another interesting index for the performance of the biomarker informed two-stage winner design is "power with best treatment", which is the probability that the null hypothesis is rejected when the best treatment is selected at the interim. Wang et al. [23] studied "power with best treatment". In general, it is lower than power; however, the difference depends on the trend of the mean-level relationship between the biomarker and the primary endpoint. Since this manuscript develops a method for controlling the type I error of the design, "power with best treatment" is not discussed further.
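The step from a null distribution to a critical value can be sketched numerically. The Python code below is not the authors' R implementation. It assumes known variances and equal biomarker means across active arms under the null, in which case W has the same distribution as c·M + sqrt(1 − c²)·ε, with c = ρ·sqrt(I/2), M the maximum of K−1 independent standard normals, and ε an independent standard normal. The names `null_cdf` and `critical_value` are hypothetical.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal


def null_cdf(w, n_active, rho, info, lo=-8.0, hi=8.0, steps=1600):
    """F_0(w) = P(W <= w) under the assumed null, by Simpson's rule.

    W = c*M + sqrt(1 - c^2)*eps, where M is the maximum of n_active
    independent standard normals and c = rho*sqrt(info/2).
    """
    c = rho * math.sqrt(info / 2.0)
    s = math.sqrt(1.0 - c * c)
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        z = lo + i * h
        # Density of the maximum of n_active standard normals.
        dens = n_active * STD.pdf(z) * STD.cdf(z) ** (n_active - 1)
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * STD.cdf((w - c * z) / s) * dens
    return total * h / 3.0


def critical_value(alpha, n_active, rho, info, tol=1e-8):
    """Solve F_0(w) = 1 - alpha for w by bisection."""
    lo, hi = 0.0, 8.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if null_cdf(mid, n_active, rho, info) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for arms in (1, 2, 3):
    print(arms, round(critical_value(0.025, arms, rho=0.8, info=0.5), 3))
```

With a single active arm or with ρ = 0 there is no selection effect, so the sketch should return the usual one-sided normal quantile; with more arms and positive ρ the critical value grows.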

Test statistic and critical rejection region using two-level correlation approach
In this section, we derive the approximate distribution and give the critical rejection region for the final test statistic of the biomarker informed two-stage winner design under the two-level correlation model.
The test statistic of the biomarker informed two-stage winner design comparing the primary endpoint of the jth treatment group and the control group under the two-level correlation model can be expressed as: [equation missing in source]. By the law of large numbers, $G_j$ is asymptotically normal: [equation missing in source]. Hence, under $H_0$, $G_j$ is asymptotically standard normal.
The final test statistic W is defined as in the previous section, and its distribution under the two-level correlation model can then be expressed as: [equation missing in source] (6). Under $H_0$, (6) can be simplified as: [equation missing in source]. When the variance $\sigma_{rj}^2$ is the same for all j, that is, the variability of the estimated mean-level correlation for each treatment group is the same, the above distribution can be approximated by: [equation missing in source].

R-functions to compute the critical value
We developed R functions for calculating the critical rejection values $w_\alpha$ for the biomarker informed two-stage winner design with any number of treatment groups K (K ≥ 3) (please refer to the appendix).
The R function convention_cv_Kk calculates the integral of $F_0(w)$ for a biomarker informed two-stage winner design with k−1 active treatment groups and a control group under the conventional model, and can therefore be used to find critical rejection values for the final test statistic. Notice that convention_cv_Kk is a function of α, K, $I = n_1/N$, and ρ. As it seems unlikely that ρ will be known in practice, the sample correlation coefficient $\hat{\rho}$ is suggested for calculating an approximate critical rejection value. How $\hat{\rho}$ affects the distribution of the test statistic is discussed in the next section.
The R function wang_cv_Kk calculates the corresponding values for a biomarker informed two-stage winner design with k−1 active treatment groups and a control group under the two-level correlation model proposed by Wang et al. [23]. If the functions involve unknown parameters, parameter estimates can be used to calculate an approximate critical rejection value. As an additional check, simulations should be run to ensure that the type I error rate is preserved.
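A simulation check of this kind might look like the following Python sketch, which covers the conventional model only. It works with arm-level summary statistics (standardized means) rather than patient-level data, and it assumes unit variances and a global null in which all biomarker and primary endpoint means are equal. The function name `simulate_type1` and the parameter values are illustrative assumptions.

```python
import math
import random


def simulate_type1(crit, n_active, rho, info, n_trials=50000, seed=1):
    """Monte Carlo type I error of the winner design when the final
    statistic is compared with `crit` (arm-level summary statistics,
    unit variances, all means equal: an assumed global null)."""
    rng = random.Random(seed)
    r = rho * math.sqrt(info)          # corr(interim X-bar, final Y-bar)
    s = math.sqrt(1.0 - r * r)
    rejections = 0
    for _ in range(n_trials):
        y_control = rng.gauss(0.0, 1.0)   # standardized final control mean
        best_x, winner_y = -math.inf, 0.0
        for _ in range(n_active):
            x = rng.gauss(0.0, 1.0)       # standardized interim biomarker mean
            y = r * x + s * rng.gauss(0.0, 1.0)  # standardized final mean
            if x > best_x:
                best_x, winner_y = x, y
        w = (winner_y - y_control) / math.sqrt(2.0)  # final statistic W
        if w > crit:
            rejections += 1
    return rejections / n_trials


print(simulate_type1(crit=1.96, n_active=1, rho=0.8, info=0.5))
print(simulate_type1(crit=1.96, n_active=3, rho=0.8, info=0.5))
```

Passing an adjusted critical value as `crit` instead of 1.96 turns the same routine into a check that the adjustment brings the simulated error rate back to the nominal level.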

Critical values
Tables 1-5 provide the critical rejection values $w_{0.025}$ for the biomarker informed two-stage winner design with up to 7 active treatment groups under the conventional model. As expected, the critical rejection value $w_{0.025}$ increases as ρ increases. The critical value is also a function of the information at the interim: with more information at the interim, the critical rejection value $w_{0.025}$ at the final analysis is larger. Likewise, the more active treatment groups there are, the larger the critical rejection value $w_{0.025}$ will be.
These tables also partially reflect how the estimation of ρ affects the distribution of the test statistic. Table 6 lists the errors in type I error from using $\hat{\rho}$ instead of ρ for a biomarker informed two-stage winner design with 3 active treatment groups and 1 control group. If $\hat{\rho} = 0.5$ is used when ρ = 0.8, the true type I error of the design is around 0.03, although the type I error rate is believed to be controlled at 0.025. If $\hat{\rho} = 1$ is used, the true type I error of the design is around 0.023. Therefore, the errors caused by misestimating ρ are in general within an acceptable range.
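The sensitivity to a misestimated ρ can be explored with the Python sketch below. It is not the authors' calculation: it assumes the conventional model with known variances and equal biomarker means under the null, so that W is distributed as c·M + sqrt(1 − c²)·ε with c = ρ·sqrt(I/2). The information time I = 0.5 is also an assumption, since the text does not state the value underlying Table 6. All function names are hypothetical.

```python
import math
from statistics import NormalDist

STD = NormalDist()


def null_cdf(w, n_active, rho, info, steps=1600, bound=8.0):
    """P(W <= w) under the assumed null (Simpson's rule over the
    maximum M of n_active independent standard normals)."""
    c = rho * math.sqrt(info / 2.0)
    s = math.sqrt(1.0 - c * c)
    h = 2.0 * bound / steps
    total = 0.0
    for i in range(steps + 1):
        z = -bound + i * h
        dens = n_active * STD.pdf(z) * STD.cdf(z) ** (n_active - 1)
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * STD.cdf((w - c * z) / s) * dens
    return total * h / 3.0


def critical_value(alpha, n_active, rho, info, tol=1e-8):
    """Bisection solve of F_0(w) = 1 - alpha."""
    lo, hi = 0.0, 8.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if null_cdf(mid, n_active, rho, info) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def true_type1(alpha, n_active, rho_hat, rho_true, info):
    """Actual type I error when the critical value is computed with
    rho_hat but the data follow rho_true."""
    w = critical_value(alpha, n_active, rho_hat, info)
    return 1.0 - null_cdf(w, n_active, rho_true, info)


for rho_hat in (0.5, 0.8, 1.0):
    print(rho_hat, round(true_type1(0.025, 3, rho_hat, 0.8, 0.5), 4))
```

Under these assumptions, underestimating ρ yields a critical value that is too small and an inflated error rate, while overestimating ρ is conservative, which is the direction of the effect described in the text.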

Discussion
In this manuscript, we have proposed a novel statistical approach for type I error control of the biomarker informed two-stage winner design. We preserve the type I error rate by adjusting the critical rejection values of the final test statistic of the design. The exact distribution of the final test statistic is derived under the conventional one-level correlation model, and the asymptotic distribution of the final test statistic is provided for the two-level correlation model of Wang et al. [23]. The critical rejection values $w_\alpha$ are computed through numerical integration. We developed R functions for calculating the adjusted critical rejection values from the skewed distribution of the final test statistic. As shown, the critical rejection value $w_{0.025}$ increases if any of the following increases: the correlation (ρ), the number of active treatment groups (K−1), or the information at the interim analysis ($n_1/N$).
Our proposed method circumvents the limitation of the normal approximation method proposed by Shun et al. [22] and works for designs with any number of treatment arms. However, it has the limitation that it works only for the biomarker informed two-stage winner design with normal interim and final endpoints. For designs with non-normal endpoints, transformations might be used to convert the data to follow a normal distribution. Developing novel approaches for type I error control for the biomarker informed two-stage winner design with non-normal endpoints would be an interesting topic for future studies.