Bias Analysis for the Principal Stratum Direct Effect in the Presence of Confounded Intermediate Variables


Introduction
Adjusting for an intermediate variable is a common analytic strategy in estimating a direct effect [1-4]. Even if the total effect is unconfounded, the direct effect is not identified when unmeasured variables affect both the intermediate (mediator) and outcome variables. The total and direct effects can be formalized most readily by representing the problem nonparametrically in terms of directed acyclic graphs and counterfactual notation [5,6]. For example, in the context of randomized trials (Figure 1), the total effect of a binary randomized treatment R on outcome Y is obtained without regard to the intermediate D as simply the contrast between E[Y | R = 1] and E[Y | R = 0]: i.e., the intention-to-treat (ITT) effect. However, the scientific question of interest often involves not the total effect of R on Y, but only the portion of that effect that is not transmitted through the influence of R on the intermediate D: i.e., the direct effect.
In many epidemiological and clinical studies in which investigators are interested in the direct effect, some factors confound the relationship between the intermediate and outcome variables. Such factors are often unmeasured or not controlled for. If they are not controlled for, the direct effect generally cannot be estimated without bias [7]. Thus, it is important to conduct a bias analysis for the direct effect in the presence of unmeasured confounding between the intermediate and outcome variables.
Here, we focus on the application of the principal stratification approach for estimating the direct effect of a randomized treatment. Using this approach, we develop bounds and a simple method of sensitivity analysis for the principal stratum direct effect (PSDE), which is the difference between expectations of potential outcomes within latent subgroups of subjects for whom the intermediate variable would be constant regardless of the randomized treatment assignment. For example, the PSDE is closely related to the issue of inference with a surrogate marker, where a good surrogate outcome serves as a mediator of the treatment effect, leaving little effect of the treatment to directly impact the true outcome of interest through other channels [8]. Although bounds on the PSDE have been presented [9,10], we develop bounds with narrower width by adding an assumption that is plausible in some situations. Methods of sensitivity analysis have also been presented [11-13], but they require some functional model and use somewhat complex formulae and calculations. Here, we develop a simple method whose formulae are much easier to use.
We require the monotonicity assumption, a standard assumption often used in the literature of causal inference [14,15], and introduce sensitivity parameters that are defined as the difference in potential outcomes with the same value of the intermediate variable between subjects who are assigned to the treatment group and those who are assigned to the control group. The remainder of this manuscript is organized as follows. We review the PSDE in the next section. In the third section, we introduce sensitivity parameters, and propose the bounds and a method of sensitivity analysis on these bases. The developed bounds and sensitivity analysis are applied to a randomized trial for coronary heart disease (CHD) in the fourth section. The last section discusses some implications of the developed approach.

Potential outcome and principal stratification
We assume a deterministic potential outcomes framework [16-18]. Let $Y_{R=r}$ and $D_{R=r}$ denote the respective values of the outcome and mediator that would have been observed if the treatment R had been set to r. We require the consistency assumption: $Y_{R=r} = Y$ when R = r, i.e., the potential outcome that would have been observed if the treatment had been set to r equals the observed outcome for subjects actually assigned to treatment r. We further assume the independence of treatment assignment: $Y_{R=r}$ is independent of R, which means that the treatment assignment gives no information about the distribution of potential outcomes. Note that independence between the potential outcome $Y_{R=r}$ and R does not mean that the observed outcome Y is independent of R.
Using the principal stratification approach [2,8], four principal strata are formulated when the randomized treatment assignment and intermediate variable are dichotomous. These four principal strata are the compliant-mediators ($D_{R=0} = 0$, $D_{R=1} = 1$), always-mediators ($D_{R=0} = D_{R=1} = 1$), never-mediators ($D_{R=0} = D_{R=1} = 0$), and defiant-mediators ($D_{R=0} = 1$, $D_{R=1} = 0$).
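The four strata can be illustrated with a small lookup. This is only a sketch: it assumes the coding in which a compliant-mediator has $D_{R=r} = r$ (which matches the description of the strata in the application section), and the names `Stratum` and `principal_stratum` are hypothetical. The numeric values follow the numbering t = 1, ..., 4 used for the strata in the next section.

```python
from enum import Enum


class Stratum(Enum):
    # Values follow the paper's numbering t = 1, 2, 3, 4.
    COMPLIANT = 1
    ALWAYS = 2
    NEVER = 3
    DEFIANT = 4


def principal_stratum(d_r0: int, d_r1: int) -> Stratum:
    """Map the pair of potential mediator values (D_{R=0}, D_{R=1}) to a stratum.

    Assumed coding: compliant-mediators have D_{R=r} = r.
    """
    table = {
        (0, 1): Stratum.COMPLIANT,  # mediator follows the assignment
        (1, 1): Stratum.ALWAYS,     # mediator is 1 under either assignment
        (0, 0): Stratum.NEVER,      # mediator is 0 under either assignment
        (1, 0): Stratum.DEFIANT,    # mediator opposes the assignment
    }
    return table[(d_r0, d_r1)]
```

In real data only one of the two potential mediator values is observed for each subject, so stratum membership is latent; the function is useful only for simulated data or for reasoning about the definitions.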

The principal stratum direct effect
Under the principal stratification approach, we focus on ITT effects in two of the four principal strata formed by the potential behavior. In Figure 1, the pathway between R and Y does not include D for the always- and never-mediating principal strata because the potential level of the mediator is constant within each of these two strata. Thus, the separate ITT effect of treatment within the always- and never-mediating principal strata is the PSDE [12].
We let t take the values 1, 2, 3, and 4, corresponding to the compliant-mediating, always-mediating, never-mediating, and defiant-mediating principal strata, respectively, and let C = t indicate membership in the t-th principal stratum. Then, the ITT effect for the t-th principal stratum is
$$\delta_{\mathrm{ITT}t} = E[Y_{R=1} \mid C = t] - E[Y_{R=0} \mid C = t].$$
The standard ITT effect over the whole population equals the weighted sum of the stratum-specific ITT effects across the four strata, with weights corresponding to the probabilities of membership in each principal stratum, $\pi_t = \Pr(C = t)$, such that $\sum_{t=1}^{4} \pi_t = 1$. Therefore,
$$\delta_{\mathrm{ITT}} = \sum_{t=1}^{4} \pi_t \delta_{\mathrm{ITT}t} = \sum_{t=1}^{4} \pi_t \left( E[Y_{R=1} \mid C = t] - E[Y_{R=0} \mid C = t] \right). \qquad (1)$$
The PSDE corresponds to the weighted sum of the ITT effects across the always- and never-mediating principal strata [its defining formula, equation (2), is not recoverable from the source text]. The relationships between $\pi_t$ and $p_r = \Pr(D = 1 \mid R = r)$ are as follows:
$$p_1 = \pi_1 + \pi_2, \qquad p_0 = \pi_2 + \pi_4,$$
because the subjects within each principal stratum should be homogeneously assigned to R = 0 and R = 1 due to randomization.
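The link between the observable proportions and the stratum probabilities can be sketched numerically. The sketch below assumes the monotonicity assumption announced in the Introduction (no defiant-mediators) and the coding in which compliant-mediators have $D_{R=r} = r$, so that the always-mediators have probability $p_0$, the never-mediators $1 - p_1$, and the compliant-mediators $p_1 - p_0$. The function names and all numbers are hypothetical and serve only to illustrate the weighted-sum decomposition of the ITT effect.

```python
def stratum_proportions(p0: float, p1: float) -> dict[int, float]:
    """Return {t: Pr(C = t)} for t = 1..4, ASSUMING no defiant-mediators."""
    if not 0.0 <= p0 <= p1 <= 1.0:
        raise ValueError("with this coding, monotonicity requires 0 <= p0 <= p1 <= 1")
    return {
        1: p1 - p0,   # compliant-mediators
        2: p0,        # always-mediators
        3: 1.0 - p1,  # never-mediators
        4: 0.0,       # defiant-mediators, excluded by monotonicity
    }


def overall_itt(pi: dict[int, float], itt_by_stratum: dict[int, float]) -> float:
    """Weighted sum of stratum-specific ITT effects with weights Pr(C = t)."""
    return sum(pi[t] * itt_by_stratum[t] for t in pi)


# Hypothetical proportions and stratum-specific ITT effects.
pi = stratum_proportions(p0=0.3, p1=0.6)
itt = overall_itt(pi, {1: 0.05, 2: -0.02, 3: 0.01, 4: 0.0})
```

With these made-up inputs the weights are 0.3, 0.3, 0.4, and 0, and the overall ITT effect is 0.3 × 0.05 + 0.3 × (-0.02) + 0.4 × 0.01 = 0.013.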

Bias Analysis
We define the D-specific sensitivity parameters as follows:
$$\alpha_d = E[Y_{R=1} \mid D = d, R = 1] - E[Y_{R=1} \mid D = d, R = 0],$$
$$\beta_d = E[Y_{R=0} \mid D = d, R = 1] - E[Y_{R=0} \mid D = d, R = 0],$$
where d = 0, 1. This definition of sensitivity parameters extends the idea of the bias factors introduced in the context of randomized trials with noncompliance [20-22]. In those reports, it was assumed that the treatment affects the outcome only through the mediator: i.e., that no direct pathway from R to Y exists in Figure 1. $\alpha_d$ and $\beta_d$ are the differences in potential outcomes with the same mediator value between subjects who are assigned to the treatment group and those who are assigned to the control group. The equality $E[Y_{R=r} \mid D = d, R = 1] = E[Y_{R=r} \mid D = d, R = 0]$ does not hold in general [1]. Thus, these sensitivity parameters are interpreted as biases caused by conditioning on the mediator. $\alpha_d = 0$ and $\beta_d = 0$ hold conditional on some covariates if these covariates include all of the confounders of the relationship between D and Y.
[Equations (3)-(8), which express the stratum-specific expectations of the potential outcomes in terms of the sensitivity parameters and the observed quantities $E_{dr} = E[Y \mid D = d, R = r]$ and $p_r$, are not recoverable from the source text.]
By substituting equations (4), (5), (7), and (8) into equation (2), the PSDE has the following formula: [equation (9) is not recoverable from the source text]. We propose bounds on the PSDE using equations (3)-(8) and a sensitivity analysis using equation (9).
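How the sensitivity parameters enter the stratum-specific effects can be sketched with a short derivation. It is not a transcription of the paper's numbered equations. It assumes that $\alpha_d$ and $\beta_d$ are the treatment-group minus control-group differences in $E[Y_{R=1} \mid D = d, R = \cdot]$ and $E[Y_{R=0} \mid D = d, R = \cdot]$, that $E_{dr} = E[Y \mid D = d, R = r]$, and that monotonicity holds with $p_1 > p_0$, so that the cell (D = 0, R = 1) contains only never-mediators and the cell (D = 1, R = 0) contains only always-mediators.

```latex
% Stratum-specific expectations (consistency, randomization, monotonicity assumed)
\begin{align*}
E[Y_{R=1} \mid C = 3] &= E[Y_{R=1} \mid D = 0, R = 1] = E_{01}, \\
E[Y_{R=1} \mid C = 2] &= E[Y_{R=1} \mid D = 1, R = 0] = E_{11} - \alpha_1, \\
E[Y_{R=0} \mid C = 2] &= E[Y_{R=0} \mid D = 1, R = 0] = E_{10}, \\
E[Y_{R=0} \mid C = 3] &= E[Y_{R=0} \mid D = 0, R = 1] = E_{00} + \beta_0.
\end{align*}

% Resulting ITT effects in the always- and never-mediating strata
\begin{align*}
E[Y_{R=1} \mid C = 2] - E[Y_{R=0} \mid C = 2] &= E_{11} - E_{10} - \alpha_1, \\
E[Y_{R=1} \mid C = 3] - E[Y_{R=0} \mid C = 3] &= E_{01} - E_{00} - \beta_0.
\end{align*}

% The cells (D = 1, R = 1) and (D = 0, R = 0) mix compliant-mediators with one
% other stratum, which ties each parameter to a between-stratum difference
\begin{align*}
\alpha_1 &= \frac{p_1 - p_0}{p_1}
  \bigl( E[Y_{R=1} \mid C = 1] - E[Y_{R=1} \mid C = 2] \bigr), \\
\alpha_0 &= \frac{p_1 - p_0}{1 - p_0}
  \bigl( E[Y_{R=1} \mid C = 3] - E[Y_{R=1} \mid C = 1] \bigr).
\end{align*}
```

Under these assumptions only $\alpha_1$ and $\beta_0$ affect the two stratum-specific effects that make up the PSDE, and each sensitivity parameter vanishes exactly when the relevant strata have equal expected potential outcomes, which is one way to read the statement that the parameters capture bias from conditioning on the mediator.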
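By making two assumptions, we improve inequality (10) to bounds with narrower width. Assumption 1 is newly proposed, whereas Assumption 2 has been presented previously.
ASSUMPTION 1. The expectation of potential outcomes for the compliant-mediators is between the expectations of potential outcomes for the always-mediators and never-mediators, which can be formalized as
$$E[Y_{R=r} \mid C = 2] \le E[Y_{R=r} \mid C = 1] \le E[Y_{R=r} \mid C = 3] \quad \text{or} \quad E[Y_{R=r} \mid C = 3] \le E[Y_{R=r} \mid C = 1] \le E[Y_{R=r} \mid C = 2].$$
Under the first ordering, equations (3)-(5) yield $\alpha_0 \ge 0$ and $\alpha_1 \ge 0$, and similarly, equations (6)-(8) yield [the corresponding result is not recoverable from the source text]. Therefore, the signs of $\alpha_0$ and $\alpha_1$ must be the same under Assumption 1. It is readily verified that the converse holds, i.e., the first ordering holds for r = 1 under $\alpha_0 \ge 0$ and $\alpha_1 \ge 0$. If the observed data show that $E_{01} \le E_{11}$, $\alpha_0 \ge 0$ and $\alpha_1 \ge 0$ cannot hold [the bounds that follow in this case, and in the converse case, are not recoverable from the source text]. Likewise, bounds on $E[Y_{R=0} \mid C = 1]$ under Assumption 1 are as follows: [not recoverable from the source text]. These bounds provide bounds on the PSDE [the resulting inequalities, stated conditionally on the observed ordering of $E_{01}$ and $E_{11}$, are not recoverable from the source text]. Note that other bounds on the sensitivity parameters can also be derived from equations (3) and (6).
Chiba [19] presented the following assumption to derive an estimator of the PSDE.
ASSUMPTION 2. [The statement of this assumption is not recoverable from the source text.]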
Sjölander et al. [11] presented a method that assumes structural regression models for $E[Y_{R=r} \mid C = c]$ and estimates the parameters using the expectation-maximization (EM) algorithm. Other researchers used Markov chain Monte Carlo techniques to estimate the parameters in the framework of Bayesian inference [12,13]. Here, we propose a method that requires neither functional models nor complex calculations: the sensitivity analysis uses only equation (9). The simplest approach is to vary the values of $\alpha_1$ and $\beta_0$ within their relevant ranges. Our approach can thus be regarded as a re-parameterization of the earlier approaches, with the advantage that its formula is much easier to use in the sensitivity analysis.
We can also apply the Monte Carlo sensitivity analysis (MCSA) [23-25] using equation (9). For the MCSA, investigators assume prior distributions for the sensitivity parameters and generate a large number (L) of estimates of the PSDE by drawing L sets of random values from these distributions. A frequency distribution of the L PSDE values is then generated; at this stage, the result does not incorporate the random error of the estimate. To incorporate the random error, the distributions of $E_{dr}$ and $p_r$ based on the observed data are also sampled.
If investigators do not have reasonable information about the prior distributions of the sensitivity parameters, they can use the bounds on $\alpha_1$ and $\beta_0$ introduced here: once these bounds are obtained, uniform distributions within the ranges can be applied.
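The MCSA procedure described above can be sketched in a few lines of standard-library Python. Several choices here are assumptions rather than the paper's specification: the stratum-specific effects are taken to be $E_{11} - E_{10} - \alpha_1$ (always-mediators) and $E_{01} - E_{00} - \beta_0$ (never-mediators), which follows under monotonicity; the PSDE is taken to be their average with weights $p_0$ and $1 - p_1$ normalized to sum to one; and the cell counts are invented and are not the LRC-CPPT data. The names `psde_point`, `binomial_rate`, and `mcsa_psde` are hypothetical.

```python
import random


def psde_point(e11, e10, e01, e00, p1, p0, alpha1, beta0):
    """PSDE for one parameter set (ASSUMED form: monotonicity, normalized weights)."""
    itt_always = e11 - e10 - alpha1      # assumed ITT effect among always-mediators
    itt_never = e01 - e00 - beta0        # assumed ITT effect among never-mediators
    pi_always, pi_never = p0, 1.0 - p1   # stratum probabilities under monotonicity
    return (pi_always * itt_always + pi_never * itt_never) / (pi_always + pi_never)


def binomial_rate(rng, n, k):
    """One draw of a proportion: Binomial(n, k / n) successes divided by n."""
    p = k / n
    return sum(rng.random() < p for _ in range(n)) / n


def mcsa_psde(cells, alpha1_range, beta0_range, n_draws=2000, seed=0):
    """Monte Carlo sensitivity analysis.

    cells[(d, r)] = (subjects, events) in the cell D = d, R = r
    (a hypothetical input layout). Returns the 2.5th, 50th, and 97.5th
    percentiles of the simulated PSDE values.
    """
    rng = random.Random(seed)
    n_arm = {r: cells[(0, r)][0] + cells[(1, r)][0] for r in (0, 1)}
    draws = []
    for _ in range(n_draws):
        # Random error: resample E_dr and p_r from binomial distributions.
        e = {cell: binomial_rate(rng, n, k) for cell, (n, k) in cells.items()}
        p = {r: binomial_rate(rng, n_arm[r], cells[(1, r)][0]) for r in (0, 1)}
        # Systematic error: uniform priors on the two sensitivity parameters.
        alpha1 = rng.uniform(*alpha1_range)
        beta0 = rng.uniform(*beta0_range)
        draws.append(psde_point(e[(1, 1)], e[(1, 0)], e[(0, 1)], e[(0, 0)],
                                p[1], p[0], alpha1, beta0))
    draws.sort()

    def pick(q):
        return draws[min(n_draws - 1, int(q * n_draws))]

    return pick(0.025), pick(0.5), pick(0.975)


# Hypothetical counts, NOT the LRC-CPPT data.
cells = {(1, 1): (300, 30), (0, 1): (200, 12), (1, 0): (150, 18), (0, 0): (350, 21)}
low, median, high = mcsa_psde(cells, (-0.03, 0.0), (-0.01, 0.0), n_draws=500)
```

Fixing `alpha1` and `beta0` at single values instead of sampling them gives the simpler analysis in which the parameters are varied by hand; dropping the binomial resampling gives the version that ignores random error.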

Application
Data
We illustrate the proposed bounds and sensitivity analysis using data from the Lipid Research Clinics Coronary Primary Prevention Trial (LRC-CPPT) [26]. The purpose of that study was to evaluate the efficacy of the cholesterol-lowering drug cholestyramine in the prevention of CHD in 3806 asymptomatic middle-aged men with hypercholesterolemia. In this study, 1888 subjects were randomly assigned to receive cholestyramine treatment (R = 0) and 1918 subjects were randomly assigned to receive a placebo (R = 1). During a follow-up period of 1 year, each CHD event was recorded (Y = 0 for no event and Y = 1 for an event). At the end of follow-up, cholesterol levels were recorded for each subject. We dichotomize cholesterol levels as D = 0 for < 280 mg/dL and D = 1 for ≥ 280 mg/dL, as in previous studies [27-30]. Data from the LRC-CPPT are displayed in Table 1 [27]. Note that this example is for illustrative purposes only, as the mediator D has been dichotomized and this can give rise to misleading inferences.
In the LRC-CPPT, the four principal strata are as follows. Compliant-mediators are subjects whose cholesterol levels were higher than 280 mg/dL when assigned to the placebo group, but lower than 280 mg/dL when assigned to cholestyramine treatment. For always-mediators, regardless of treatment assignment, cholesterol levels were always higher than 280 mg/dL. Conversely, for never-mediators, regardless of treatment assignment, cholesterol levels were always lower than 280 mg/dL (and never higher than 280 mg/dL). In contrast to compliant-mediators, defiant-mediators are subjects whose cholesterol levels were higher than 280 mg/dL when assigned to cholestyramine treatment, but lower than 280 mg/dL when assigned to the placebo group.

Bounds
Inequality (10) yielded bounds of -22.39% ≤ PSDE ≤ 27.06%. The width of the bounds is 49.45%, which is very wide and thus rather uninformative.
Although whether Assumptions 1 and 2 hold cannot be confirmed from the observed data, it is important to discuss their plausibility. In the LRC-CPPT, health-minded individuals may tend not to experience CHD and to have lower cholesterol levels than people who are not as health-conscious. Then, always-mediators, who are individuals with high cholesterol levels regardless of treatment assignment, may be the most likely to experience CHD. Conversely, never-mediators, who are individuals with low cholesterol levels regardless of treatment assignment, may be the least likely to experience CHD. The probability of experiencing CHD in compliant-mediators, whose cholesterol levels depend on treatment assignment, may be between the probabilities for always- and never-mediators. This observation suggests that $E[Y_{R=r} \mid C = 3] \le E[Y_{R=r} \mid C = 1] \le E[Y_{R=r} \mid C = 2]$. Therefore, Assumption 1 may hold. Investigators may not be able to insist that Assumption 2 holds unless the estimate of $\delta_{\mathrm{ITT}}$ is 0 or at least close to 0. Even if the estimate of $\delta_{\mathrm{ITT}}$ is close to 0, it may be difficult to insist on that.

Sensitivity analysis
Although we had no information about the distributions of the sensitivity parameters, we assumed that they followed uniform distributions with ranges obtained under Assumption 2, i.e., $-0.0364 \le \alpha_1 \le 0$ and $-0.0087 \le \beta_0 \le 0$. It was assumed that $E_{dr}$ and $p_r$ followed binomial distributions, with numbers and proportions estimated from the observed data.
We drew 100,000 sets of random values from these distributions and generated a frequency distribution of 100,000 PSDE values. The result is shown in Figure 2. The 50th percentile of the resulting PSDE distribution was 1.98% (2.5th percentile: 0.15%, 97.5th percentile: 3.81%), which was larger than $\delta_{\mathrm{ITT}}$. Again, the result shows that the PSDE is positive.

Discussion
We have proposed bounds and a simple method of sensitivity analysis for the PSDE. To obtain bounds with narrower width, we made Assumption 1. The advantages of the proposed bounds are that their formulae are simple and their width is narrow. Although the bounds have the weakness of requiring some untestable assumptions, Assumption 1 is reasonable in situations where the observed data show that $E_{01} \le E_{11}$ and $E_{00} \le E_{10}$, or $E_{11} \le E_{01}$ and $E_{10} \le E_{00}$.
In this paper, we have discussed randomized trials, where treatment is unconfounded. Some researchers may be interested in an extension to non-randomized studies, where treatment R is confounded. Such an extension can be achieved as follows, if all baseline covariates X are measured. In the presence of measured covariates X, all formulae in this paper hold when the expectations and probabilities are taken conditional on X. Then, the sensitivity parameters become
$$\alpha_{d,x} = E[Y_{R=1} \mid D = d, R = 1, X = x] - E[Y_{R=1} \mid D = d, R = 0, X = x],$$
$$\beta_{d,x} = E[Y_{R=0} \mid D = d, R = 1, X = x] - E[Y_{R=0} \mid D = d, R = 0, X = x].$$
With fixed values of $\alpha_{d,x}$ and $\beta_{d,x}$, we can estimate $\delta_{\mathrm{ITT}t}$ after adjusting for x, for example, using regression analysis. Thus, the covariate-adjusted version of equation (9) can be obtained because equations (1) and (2) are also defined with the adjusted $\delta_{\mathrm{ITT}t}$ and $\pi_t$. This shows that our method can be used in the presence of baseline covariates that should be adjusted for. In practice, it is very hard to specify the values or distributions of $\alpha_{d,x}$ and $\beta_{d,x}$ for all x. To reduce the number of sensitivity parameters, common values $\alpha_d$ (= $\alpha_{d,x}$) and $\beta_d$ (= $\beta_{d,x}$) for all x can be applied.
Sensitivity analysis techniques for the PSDE have previously been developed [11-13]. These techniques require some functional models and use somewhat complex formulae and calculations. An advantage of our approach is that its formulae are much easier to use. Applying the MCSA, investigators can use our approach without complex computer programming. However, our approach has the disadvantage that it assumes monotonicity. In the LRC-CPPT, investigators can insist that the monotonicity assumption holds if cholestyramine had beneficial effects for all subjects. In fact, however, cholestyramine may be beneficial on average but harmful for particular individuals. A logical next step in this research program would therefore be to relax the monotonicity assumption without resorting to complex formulae and calculations.
In this paper, we have discussed the PSDE, which is a causal effect that is not affected by intermediate variables. For example, such an effect is closely related to the issue of inference with a surrogate marker, where a good surrogate outcome serves as a mediator of the treatment effect, leaving little effect of the treatment to directly impact the true outcome of interest through other channels. The developed bounds and sensitivity analysis for the PSDE can be used in such situations, and can further be extended to the issues of inference with non-compliance and truncation by death.