Model Tumor Pattern and Compare Treatment Effects Using Semiparametric Linear Mixed-Effects Models

To analyze responses of solid tumors to treatments and to compare the effects of antitumor therapies, we applied semiparametric mixed-effects models to fit tumor volumes measured over time. The population and individual nonparametric functions were approximated by smoothing splines. We also proposed an intuitive method for comparing the antitumor effects of two different treatments. Biological interpretation is also discussed.

Journal of Biometrics & Biostatistics


Introduction
In anticancer drug development, demonstrating the antitumor activity of anticancer agents in preclinical animal models is important. Tumor volume is a commonly used endpoint of treatment efficacy in the evaluation of antitumor agents in such preclinical animal tumor models. Intuitively, tumor volumes of animals treated with different antitumor agents may be used to compare the antitumor activity of the treatments. Appropriate analysis of tumor volume is therefore important in anticancer drug development. Survival analysis based on the tumor growth delay [1][2][3] is often conducted, but it sometimes provides insufficient information or even an invalid comparison of two treatments when the survival times are the same but the tumor volumes are different. Another endpoint is tumor growth inhibition [2][3][4][5], which is generally assessed at a pre-specified time point. These approaches give valid results but can be inefficient because the information at other time points is discarded. An alternative approach is to fit tumor growth curves, for example by multivariate analysis or regression modeling [6][7][8]. More recently, Liang and Sha [9] applied a parametric nonlinear mixed-effects model [10,11] to analyze changes in tumor volume. To reduce the model's assumptions and make the methods more general and robust, Liang [12] proposed a nonparametric method to model tumor volume. Although these approaches use the entire dataset, each has its own limitations. For example, parametric mixed-effects models [13][14][15] impose strong assumptions on the underlying biological mechanisms and might produce coefficients with limited biological relevance [16], whereas nonparametric mixed-effects models impose no assumptions and may lose useful information when some is available. In this paper, we suggest a compromise strategy and propose a semiparametric linear mixed-effects (SLM) model to fit tumor volumes.
This research is motivated by data from a drug combination tumor xenograft study generated in the Pediatric Preclinical Testing Program (PPTP) [17]. In this study, the human rhabdomyosarcoma cell line Rh30 was used to evaluate the therapeutic enhancement of combining rapamycin with cytotoxic agents. A total of 140 SCID female mice were used to propagate subcutaneously implanted Rh30 tumors. After tumors grew to a certain size, tumor-bearing mice were randomized into 14 different treatment groups with 10 mice per group. Cytotoxic agents were administered at their maximum tolerated dose (MTD), 0.5 MTD or 0.63 MTD, with or without concomitant rapamycin treatment. All mice were treated for 6 weeks and followed for another 6 weeks without any treatment. The volume of each tumor was measured at the initiation of the study and weekly up to 12 weeks. Mice were usually euthanized when the tumor volume reached four times its initial volume, resulting in incomplete longitudinal tumor volume data, as shown in Figure 1.
We would like to establish the statistical significance of between-group differences in growth profiles and investigate the underlying biology. It is desirable to have interpretable parameters that represent characteristics of the growth curves, such as a slope interpreted as the tumor growth rate. We also want the model to be flexible enough to allow different shapes of the curves. As shown in Figure 1, there is generally an upward or downward trend for a given group, but the growth patterns appear non-linear and differ among the groups. Straight-line regression models are likely to underfit the data. Polynomial regression or nonlinear models fit the observations better, but their coefficient estimates can be sensitive to nonlinearity assumptions that cannot be evaluated robustly from the dataset at hand. In this case, it is reasonable to use a class of semiparametric models that model the trend parametrically while letting the rest of the model be driven by the data nonparametrically. We may thus take advantage of both the flexibility of nonparametric models and the interpretability and parsimony of parametric models.
We use smoothing splines to fit the nonparametric component; splines were initially developed for smooth interpolation [18]. In a statistical context it is more appealing to fit curves that pass near the noisy data rather than interpolate it exactly. The optimal curve under certain criteria can be found by solving a penalized least squares problem [19], discussed in a later section. The idea is to find a curve that is a good compromise between fidelity and smoothness. A semiparametric model combining linear predictors and smoothing splines can be written in the form of a linear mixed-effects (LME) model [20], which enables us to utilize the theory and computational tools developed for LME models.

Model and Method
The general semiparametric linear mixed-effects model assumes that [21]

    y = f + Xβ + Zb + ε,

where y is the vector of responses, f is an unknown smooth function evaluated at the design points, Xβ and Zb are the parametric fixed and random effects, and ε is the vector of random errors. The function f is assumed to lie in a reproducing kernel Hilbert space (RKHS); for more details on the topic see Aronszajn [22] and Wahba [23]. The estimate of f can be found by minimizing the following penalized sum of squared errors:

    Σ_{i=1}^{n} (y_i − f(t_i) − x_i'β − z_i'b)² + nλ ∫ (f''(t))² dt,    (3)

where the smoothing parameter λ balances fidelity to the data against smoothness of the estimate; it is treated as constant once estimated. λ can be selected by criteria such as generalized cross-validation (GCV), generalized maximum likelihood (GML) or unbiased risk (UBR); the three estimates behave similarly for large sample sizes [21]. GML is used to estimate the smoothing parameter in this paper.
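The fidelity–smoothness trade-off in the penalized criterion (3) can be sketched with a discrete analogue, replacing the integrated squared second derivative by a second-difference penalty. This is a toy illustration on synthetic data, not the estimator used in the paper:

```python
import numpy as np

def smooth_fit(t, y, lam):
    """Discrete analogue of penalized least squares: minimize
    ||y - f||^2 + lam * ||D f||^2, where D is the second-difference
    operator standing in for the integrated squared second derivative."""
    n = len(t)
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    # Normal equations: (I + lam * D'D) f = y
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 50)
y = np.sin(2 * np.pi * t) + rng.normal(0, 0.3, size=50)

f_rough = smooth_fit(t, y, lam=1e-4)   # small lam: close to interpolation
f_smooth = smooth_fit(t, y, lam=1e2)   # large lam: heavily penalized, nearly linear
```

As λ grows, the fit trades fidelity (larger residuals) for smoothness (smaller second differences), mirroring the role of λ in (3).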
The function that minimizes (3) has the form

    f(t) = Σ_{ν=1}^{m} d_ν φ_ν(t) + Σ_{i=1}^{n} c_i R₁(t_i, t),    (4)

where the coefficients d = (d₁, …, d_m)' and c = (c₁, …, c_n)' solve a system of linear equations (5), the φ_ν span the null space of the penalty, and R₁ is the reproducing kernel of its orthogonal complement. A more general form of (3), denoted (6), enables modeling of the covariance matrix through a weight matrix W, and different smoothing components through multiple RKHS decompositions, each with its own smoothing parameter. The minimizer of (6) is a simple extension of (4) and (5), and can be directly related [24,25] to the restricted maximum likelihood (REML) solution of a corresponding LME model (7). Bayesian credible bands are commonly used to evaluate smoothing spline fitted values [26] by assuming a Gaussian prior for f of the form

    F(t) = Σ_{ν=1}^{m} θ_ν φ_ν(t) + δ^{1/2} Z(t),    (8)

where θ ~ N(0, κI) and Z(t) is a zero-mean Gaussian process with covariance R₁(s, t); letting κ → ∞ yields a diffuse prior on the parametric part. The average coverage probability (ACP) is the average, over randomly selected design points t_i, of the probability that the fixed function value f(t_i) falls within the corresponding credible interval at significance level α. ACP has been shown to be close to the nominal coverage rate 1 − α in simulations [27] and by theoretical justification [28]. Note that this average coverage is weaker than pointwise coverage.
When comparing two groups with estimated functions f₁(t) and f₂(t), a credible band for the difference of the two groups can be derived as

    f₁(t) − f₂(t) ± z_{α/2} √(se₁(t)² + se₂(t)²),

where se₁(t) and se₂(t) are the posterior standard errors of the fitted curves f₁(t) and f₂(t), respectively. A check for group difference can be performed by examining whether the credible band covers the horizontal zero line. A similar idea was used in Bowman and Young [29] and Liang [30].
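A sketch of this comparison, assuming fitted curves and posterior standard errors are available on a common time grid (all function names and values below are hypothetical):

```python
import numpy as np
from scipy.stats import norm

def difference_band(f1, se1, f2, se2, alpha=0.05):
    """Credible band for f1(t) - f2(t), combining the posterior standard
    errors of the two fitted curves (assumes approximate posterior
    independence between the two groups)."""
    d = f1 - f2
    se = np.sqrt(se1**2 + se2**2)
    z = norm.ppf(1 - alpha / 2)
    return d - z * se, d + z * se

# hypothetical fitted curves on a common normalized time grid
t = np.linspace(0, 1, 13)
f1 = 1.0 + 2.0 * t
f2 = 1.0 + 0.5 * t
se = np.full_like(t, 0.2)
lo, hi = difference_band(f1, se, f2, se)
# the groups are deemed different wherever the band excludes zero
differs = (lo > 0) | (hi < 0)
```

In this toy case the band contains zero early on and excludes it at later times, i.e., the curves separate as t grows.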
As can be seen from (6), the RKHS formulation makes it easy to model parametric and nonparametric components by manipulating the RKHS decomposition, and it combines various spline models under a unified framework. Smoothing spline ANOVA (SS ANOVA) is available, with an interpretation similar to that of ordinary ANOVA. Suppose the model space can be decomposed as in (2). For a mouse w drawn from sub-population B_k, the mean function is a random variable, since w is a random sample from B_k. What we observe are realizations of this "true" mean function plus random errors. We use the label w to denote the mice we actually observe.
We define four averaging operators that project the function f onto the modular structures constituting this SLM model, yielding the nine-term decomposition in (11), which can be interpreted in parallel with classical mixed models as follows: μ₀ is a constant overall mean, and the remaining terms are main effects and interactions involving time, group and mouse. The first six terms are fixed effects; the last three terms are random effects, since they depend on the random variable w. The first three terms, which depend on time only, represent the mean curve for all mice. The middle three terms measure the departure of a particular group from the population mean curve. The last three terms measure the departure of a particular mouse from the mean curve of the sub-population from which the mouse was chosen.
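The effect of such averaging operators can be illustrated numerically. For a toy array of curves indexed by group, mouse and time (synthetic values, not the study data), the population mean curve, the group departures and the mouse departures reconstruct the data exactly, just as the fixed- and random-effect terms in (11) sum to f:

```python
import numpy as np

# volumes[k, w, j]: K groups, W mice per group, J time points (synthetic)
rng = np.random.default_rng(1)
K, W, J = 3, 4, 5
volumes = rng.normal(size=(K, W, J))

mean_curve = volumes.mean(axis=(0, 1))                     # mean curve for all mice
group_dev = volumes.mean(axis=1) - mean_curve              # group departures, shape (K, J)
mouse_dev = volumes - volumes.mean(axis=1, keepdims=True)  # mouse departures

# The pieces reconstruct the data exactly, and each departure averages to zero
recon = mean_curve + group_dev[:, None, :] + mouse_dev
```

The zero-average side conditions (group departures average to zero over groups, mouse departures over mice) are what make the decomposition identifiable, in parallel with classical ANOVA.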
For categorical variables, we can either estimate each level by shrinkage estimates, which reduce the overall mean squared error by penalizing departures from the overall mean, or fit each level separately as fixed categorical covariates. In this study it is preferable not to penalize group differences, because comparing groups is precisely our interest. Both shrinkage and fixed-effects estimates can be fitted under the RKHS framework with different RKHS decompositions. Suppose we model time with a cubic spline, treat the categorical variable group as fixed effects, and shrink the mouse factors toward constants (modeled as random effects); the fixed effects in Equation (11) can then be re-written as in (12).
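The contrast between fixed-effects and shrinkage estimates of a categorical variable can be sketched with a one-dimensional toy example (the function and data are hypothetical): penalizing departure from the overall mean pulls each level's estimate toward it.

```python
import numpy as np

def group_estimates(y, groups, lam=0.0):
    """Per-level means with optional shrinkage. lam = 0 gives the usual
    fixed-effects means; lam > 0 penalizes departure from the overall
    mean, shrinking each level toward it (a random-effects flavor)."""
    overall = y.mean()
    out = {}
    for g in np.unique(groups):
        yg = y[groups == g]
        # Minimizer of sum_i (y_i - mu_g)^2 + lam * (mu_g - overall)^2
        out[g] = (yg.sum() + lam * overall) / (len(yg) + lam)
    return out

y = np.array([1.0, 1.2, 0.8, 3.0, 3.2, 2.8])
groups = np.array(["A", "A", "A", "B", "B", "B"])
fixed = group_estimates(y, groups, lam=0.0)   # plain per-group means
shrunk = group_estimates(y, groups, lam=3.0)  # pulled toward the overall mean
```

As in the paper's choice, setting the penalty to zero for group keeps group contrasts unshrunk, while a positive penalty on mouse pools the mouse effects toward a common value.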

Analysis of Xenograft Tumor Data
In this section, we use the proposed method to analyze the data from the study described in the Introduction. Intuitively, the longer mice live, the more favorable the treatment combination is. But mice with the same survival times may have different tumor volumes. The differences in tumor volumes might reflect quality of life and reveal potential intervention mechanisms of the treatments. For instance, the scatterplots for Treatments F and G look quite similar. When we investigate the measurements from these two groups, mice given Treatment F tend to die earlier if their tumor volumes are not well controlled at early times, while mice in Group G are more likely to survive after having high initial tumor volumes.
Details of the data set are as follows. W = 140 mice were assigned to K = 14 treatments, and tumor volumes were measured on each mouse weekly for a maximum of 12 weeks. Time has been normalized to [0,1]. There are two categorical covariates, group and mouse, and a continuous covariate, time. We treat group and time as fixed. By design, mouse is nested within group and is treated as random.
The SLM model discussed in the Model and Method section enables us to (i) estimate the group (treatment) effects; (ii) estimate the population mean volume as a function of time; and (iii) predict the response over time for each mouse. For the purpose of this study we are most interested in (i) and (ii).
Based on the SS ANOVA decomposition in (11) and (12), we fit the following four models. Note that Term 7 in (11), the random effect in the intercepts, is not included. From the scatterplots in Figure 1, all mice start out at about the same level (as designed by the study), indicating that there is little mouse random effect in the intercept. Unsurprisingly, the random intercept term causes convergence problems as its near-zero variance is being estimated. Likelihood ratio tests of nested LME objects suggest that the random intercept is insignificant beyond the random slope. Therefore, we carry out the analysis without this term.
• Model 1 includes the first six terms and the eighth term in (11).
It fits different population mean curves for each group plus a random slope for each mouse. We assume that the random slopes and the random errors are normally distributed with zero means and are mutually independent.
• Model 2 includes all terms except the 7th term in (11). It fits different population mean curves for each group plus a random slope and a random smooth effect for each mouse. We assume normal distributions for the random effects and the random errors, and independence between the random effects and random errors. These are similar to the usual assumptions in LME models.
• Model 3 fits a first-order autoregressive correlation structure, AR(1), to Model 2; i.e., the within-mouse errors ε_kwj are no longer assumed to be independent.
• Model 4 uses an extra parameter beyond Model 3 to account for unequal variances of the within-group error terms ε_kwj by modeling the variance as an exponential function of time.
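The error-covariance structures added in Models 3 and 4 can be sketched directly. The following toy function (parameter names are ours, not the paper's) builds the within-mouse covariance matrix implied by AR(1) correlation between equally spaced weekly measurements and, optionally, an exponential variance function of time:

```python
import numpy as np

def within_mouse_cov(times, sigma=1.0, rho=0.5, gamma=0.0):
    """Within-mouse error covariance: AR(1) correlation rho^{|j-j'|}
    between measurement occasions j, j' (Model 3), and standard
    deviation sigma * exp(gamma * t_j) growing with time (Model 4;
    gamma = 0 recovers equal variances)."""
    t = np.asarray(times, dtype=float)
    j = np.arange(len(t))
    corr = rho ** np.abs(j[:, None] - j[None, :])
    sd = sigma * np.exp(gamma * t)
    return sd[:, None] * corr * sd[None, :]

t = np.linspace(0, 1, 5)
cov_m3 = within_mouse_cov(t, rho=0.5, gamma=0.0)  # AR(1), equal variances
cov_m4 = within_mouse_cov(t, rho=0.5, gamma=1.0)  # variance increasing with time
```

Model 4 nests Model 3 (γ = 0), which nests Model 2 (ρ = 0 as well), so the four models form a nested sequence suitable for likelihood ratio testing.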
Among all 14 treatments shown in Figure 1, some can be eliminated immediately, such as A and K, because of poor survival times. The analysis is then narrowed down to the 6 groups N, J, F, L, D and G, which have similar survival times but may have different tumor volume growth profiles (Figure 2). Such SLM models are solved by finding the solutions of their LME counterparts, as shown in (7). This connection can also be utilized to calculate AIC, BIC and LRTs in the sense of conventional parametric models for model selection and comparison, as shown in Table 1. LRTs of these four nested models suggest that Model 4 is most favorable. The serial correlation coefficient of the AR(1) structure was also estimated.

Figure 3 shows the predicted curves along with 95% Bayesian credible bands calculated from the posterior distributions of the fitted values with a diffuse prior, obtained by letting κ → ∞ in (8). Note that these predicted curves and credible bands represent the mean curves for the sub-populations B_k (Group k), not individuals. This explains why some estimated tumor growth patterns actually start to go down at the end of the study, such as D and F, reflecting the fact that only mice with lower tumor volumes survived among the populations of these groups.

Figure 4 shows the pairwise differences with Bayesian credible bands. If the credible band runs above or below zero, the two treatments are deemed different. Survival times are not under consideration here, so the treatment with lower tumor volumes is better (given that a mouse survives to that time). For example, Treatments N and J are not different from each other, since the zero line is fully contained in the credible band; D is better than L, because the credible band is mostly above zero. The results are summarized in Table 2.
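Given the log-likelihoods of two nested fits, the LRT used for this model comparison reduces to a chi-square tail probability. A minimal sketch with made-up log-likelihood values (not those of the fitted models):

```python
from scipy.stats import chi2

def lrt(loglik_small, loglik_big, df_diff):
    """Likelihood ratio test between two nested models: the statistic
    2 * (loglik_big - loglik_small) is referred to a chi-square
    distribution with df_diff degrees of freedom."""
    stat = 2.0 * (loglik_big - loglik_small)
    return stat, chi2.sf(stat, df_diff)

# hypothetical log-likelihoods for two nested models differing by one parameter
stat, p = lrt(-520.3, -515.1, df_diff=1)
```

When the extra parameter is a variance component tested on the boundary of its parameter space, the plain chi-square reference is known to be conservative, which is worth keeping in mind when reading such LRTs.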

Random effects in the smoothing components
For comparison, results from a simple ANOVA analysis of the log-transformed data are also listed in Table 2. The SLM model is not only able to detect more significantly different pairs, but also provides more insight into how the pairs differ from each other. It is worth pointing out that for Treatment G versus Treatment F, although ANOVA and SLM both detect the difference, ANOVA concludes that G has lower tumor volumes, while SLM picks F. Simply taking averages in ANOVA ignores the information from time and the correlation of measurements within each mouse, which reveal the different behaviors of the two treatment groups towards the end. Looking at the SLM predicted population patterns in Figure 3, the two tumor growth profiles are similar during the early period, but tumor volumes in Group F are lower than those in Group G among mice that survived to the final period of the study. For example, if we only look at tumor volumes past normalized time 0.8, the means for Groups F and G are 0.4270 and 0.6264, respectively.

Remark
The dimension of the random spline covariance matrix Q increases with the number of observations. As a result, direct computation might take a long time. For this specific dataset with all groups, it took about one day to compute the model with random smoothing effects on a personal computer. To speed up the program, a low-rank approximation algorithm was used to reduce the dimension of Q by eliminating eigenvectors corresponding to small eigenvalues [31]. In our example, with a cutoff value 0.001, the approximation reduced the computation time to about an hour and gave almost identical results.
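The low-rank idea can be sketched as a truncated eigendecomposition. This toy example (synthetic matrix; the cutoff is applied relative to the largest eigenvalue, which may differ from the paper's convention) drops eigenvectors whose eigenvalues are negligible:

```python
import numpy as np

def low_rank(Q, cutoff=1e-3):
    """Approximate a symmetric PSD matrix Q by discarding eigenvectors
    whose eigenvalues fall below cutoff * (largest eigenvalue)."""
    vals, vecs = np.linalg.eigh(Q)
    keep = vals > cutoff * vals.max()
    approx = (vecs[:, keep] * vals[keep]) @ vecs[:, keep].T
    return approx, int(keep.sum())

# A matrix with a rapidly decaying spectrum: most eigenvalues are negligible
n = 60
U, _ = np.linalg.qr(np.random.default_rng(2).normal(size=(n, n)))
vals = 2.0 ** -np.arange(n)
Q = (U * vals) @ U.T
Q_approx, rank = low_rank(Q, cutoff=1e-3)
```

Because spline-related covariance matrices typically have fast-decaying spectra, the retained rank is far below n, which is what yields the large computational savings reported above.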

Conclusion
The SLM model specifies only part of the mixed-effects model parametrically and leaves the rest to be modeled nonparametrically by the data itself. It combines the interpretability and parsimony of a parametric model with the flexibility of a nonparametric model, and it avoids estimating parameters with little biological relevance. The close connection with LME models enables us to utilize existing LME fitting procedures for computation and model selection. Even when the final goal is to build a fully parametric model, the SLM model can be useful for initial data exploration and can shed light on the parametric models to follow.
The costs of the SLM model are relatively heavy computation (though reasonable with optimization) and the difficulty of deriving closed-form inferences in the functional space. More research on model selection methods is needed as well.
Further effort can be invested in special post-hoc methods for the SLM model, such as multiple comparison corrections. The major difficulties for such extensions would be establishing an expression for the correction, giving a theoretical justification and calculating p-values.