Medical, Pharma, Engineering, Science, Technology and Business

^{1}Division of Public Health Sciences, Department of Surgery, Washington University School of Medicine, St. Louis, MO, USA

^{2}Division of Biostatistics, Washington University School of Medicine, St. Louis, MO, USA

^{3}Department of Ophthalmology and Visual Sciences, Washington University School of Medicine, St. Louis, MO, USA

- *Corresponding Author:
- Feng Gao

Division of Public Health Sciences

Campus Box 8100

Department of Surgery

Washington University School ofMedicine

660 S Euclid Avenue, St. Louis

MO 63110, USA

**Tel:**(314) 362- 3682

**Fax:**(314) 454-7941

**E-mail:**[email protected]

**Received date:** September 13, 2015; **Accepted date:** October 19, 2015; **Published date:** October26, 2015

**Citation:** Gao F, Miller JP, Beiser JA, Xiong C, Gordon MO (2015) Predicting Clinical Binary Outcome Using Multivariate Longitudinal Data: Application to Patients with Newly Diagnosed Primary Open-Angle Glaucoma. J Biom Biostat 6:254. doi:10.4172/2155-6180.1000254

**Copyright:** © 2015 Gao F, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Visit for more related articles at** Journal of Biometrics & Biostatistics

Primary open angle glaucoma (POAG) is a chronic, progressive, irreversible, and potentially blinding optic neuropathy. The risk of blindness due to progressive visual field (VF) loss varies substantially from patient to patient. Early identification of those patients destined to rapid progressive visual loss is crucial to prevent further damage. In this article, a latent class growth model (LCGM) was developed to predict the binary outcome of VF progression using longitudinal mean deviation (MD) and pattern standard deviation (PSD). Specifically, the trajectories of MD and PSD were summarized by a functional principal component (FPC) analysis, and the estimated FPC scores were used to identify subgroups (latent classes) of individuals with distinct patterns of MD and PSD trajectories. Probability of VF progression for an individual was then estimated as weighted average across latent classes, weighted by posterior probability of class membership given baseline covariates and longitudinal MD/PSD series. The model was applied to the participants with newly diagnosed POAG from the Ocular Hypertension Treatment Study (OHTS), and the OHTS data was best fit by a model with 4 latent classes. Using the resultant optimal LCGM, the OHTS participants with and without VF progression could be accurately differentiated by incorporating longitudinal MD and PSD.

Multivariate longitudinal data; Latent class growth model; Functional principal component analysis; Primary open angle glaucoma

Primary open angle glaucoma (POAG) is a chronic progressive and potentially blinding optic **neuropathy**. Visual loss is often monitored using visual field (VF) test. Depending on different algorithms used, the VF test divides retina into 74 or 52 locations and the sensitivity of vision at each location is assessed. Glaucomatous damage is **irreversible** because nothing yet can restore the optic nerve cells once they are dead. However, the risk of blindness due to progressive VF loss varies substantially from patient to patient. A natural history study on a cohort of patients newly diagnosed with **glaucoma** found out that the mean deviation (MD) index, a global summary measure for visual field test, in some patients can deteriorate at an alarming rate of 10 decibels (dB) per year, while in others the MD virtually did not change in 6 years without any treatment [1]. Early identification of those patients destined to rapid progressive visual loss is crucial to prevent further irreversible visual field loss. Conversely, for patients whose visual fields remain stable, the cost and morbidity associated with over-treatment could be avoided. Unfortunately, differentiating progressive and stable visual function is challenging, partially due to the high intra-individual variability in visual field data [2]. As in many other **chronic diseases**, multiple longitudinal biomarkers are often collected. The usefulness of longitudinal biomarkers for early detection of disease or to monitor disease progression has been long recognized in many chronic diseases such as cancer or cognitive aging. It is therefore worth while to pursue whether the prediction of VF progression in patients with POAG could be enhanced using periodically recorded biomarkers.

In the last decade, many methods have been proposed to incorporate longitudinal biomarkers for the prediction of clinical outcomes in biomedical research. Henderson et al. [3] proposed a flexible joint model that connects longitudinal and survival models with two correlated latent Gaussian processes. They assumed that the longitudinal and survival data are conditionally independent given the linking latent process and covariates. Guo and Carlin [4] developed the above joint model using a fully Bayesian approach through MCMC methods. Rizopoulos et al. [5] developed a joint model via so called shared-parameter model. They assumed that all associations between longitudinal and survival processes are induced by random effects and that the two processes are conditionally independent given the random effects. Similar shared parameter models have also been used for the joint analysis of **longitudinal data** and binary outcomes [6] or **multivariate** longitudinal biomarkers and time-to-event data [7]. In all the above methods, univariate or multivariate linear mixed models are often used to describe the change of longitudinal biomarkers over time. These models usually assume that all individuals are drawn from a single homogenous population and that all the trajectories are smoothly distributed around the population average.

An alternative approach for the joint modeling of longitudinal data and clinical outcome is a joint latent class model which uses a latent class growth model (LCGM) to describe trajectories of longitudinal biomarkers. It assumes that the population of subjects consists of heterogeneous but unobserved sub-populations (latent classes). A joint latent class model assumes that the dependence between longitudinal data and clinical outcomes is fully captured by the latent class structure. Due to its flexibility in modeling even complex dependence and its ability to incorporate covariates, many methods have been proposed in recent years to incorporate longitudinal biomarkers under the framework of latent class models [8]. Garre et al. [9] fitted a joint latent class model for the prediction of time to graft failure using longitudinal serum creatinine concentration. They used two latent classes to capture the heterogeneity of serum creatinine trajectory. The serum creatinine in the first class was described by an intercept-only random effect model, while the trajectories in the second class was modeled using a segmented random-effect model with a random change-point indicating the initiation of rapid deterioration in patient’s renal function. Proust-Lima and Taylor [10] developed a dynamic prognostic tool that incorporates the evolution of prostate-specific antigen (PSA) to predict the risk of prostate cancer recurrence for **patients** who completed radiation therapy. The within-class PSA trajectory was described by a linear mixed model including two known parametric functions for short- and long-term PSA trajectories respectively. Their research also found that the predictive accuracy was not markedly influenced by the choice of different number of latent classes. In a recent report, Maruyama et al. [11] also applied a joint latent class analysis approach to identify distinct patterns of longitudinal craving scores and to predict the binary clinical outcome of smoking cessation. LCGM-based predictions are also computationally attractive. Once a joint model has been fitted, there is no need to integrate out the random effects and all the individualized prediction can be calculated analytically. This feature makes it especially appealing for individualized real-time dynamic prediction [8].

In this article, we developed a latent class **growth model** (LCGM) using multiple longitudinal biomarkers to predict visual field (VF) progression in patients with newly diagnosed POAG. We used the data from ocular hypertension treatment study (OHTS), a phase III randomized trial of the safety and efficacy of topical hypotensive medication in preventing the development of POAG. The proposed LCGM was developed taking a stage-wise approach and based on longitudinal mean deviation (MD) and pattern standard deviation (PSD). Specifically, a functional principal component (FPC) analysis [12] was first performed to the longitudinal MD and PSD separately, and FPC scores were estimated to approximate the trajectories of MD and PSD. Subgroups (latent classes) of participants with distinct patterns of MD and PSD trajectories were then identified based on the resultant FPC scores.

OHTS is the largest randomized trial to date to test safety and efficacy of topical hypotensive medication in preventing the development of POAG. In OHTS, 1636 subjects were randomized to either observation or treatment with ocular hypotensive medication and followed for a median of 13 years. Outcome ascertainment was performed by specialized resource centers where readers were masked as to randomization assignment and information about the participant’s clinical status, and the attribution of abnormality due to POAG was performed by a masked Endpoint Committee. In OHTS, 362 eyes from 279 participants developed POAG during study. Disease progression was monitored regularly every 6 months using Humphrey 30-2 visual field (VF) tests. This dataset constitutes the largest cohort of POAG with known date of **diagnosis** and provides a unique opportunity to evaluate disease progression over time. Detailed information on OHTS is described in Gordon et al. [13].

Our analysis cohort consisted data from the first eye of 277 participants who developed POAG and had at least 7 follow-up visits. This study has been approved by Washington University Institutional Review Board. The overall median follow-up was 13 years, with a median pre-diagnosis follow-up of 8 years and median post-diagnosis follow-up of 4.8 years. We used mean deviation (MD) and pattern standard deviation (PSD) as the longitudinal predictors. MD reflects the deviation of a subject’s vision from the age-matched reference population, while PSD reflects the perturbation of vision across different testing locations and is more sensitive for early visual damage. The primary endpoint was the status of VF progression at the end of follow-up, as determined by post-diagnosis VF using the 2-omitting algorithm [14], and 57 eyes (20%) were labeled as “progressive”.

**Table 1** showed the demographic and clinical characteristics at diagnosis for progressive and stable cohorts separately. The categorical data were summarized as counts and frequencies, while the continuous variables were summarized in means and standard deviations or median and inter-quartile range as appropriate. As an illustration, in the proposed prediction model we only included following 5 variables at diagnosis as the baseline covariates: age (Age, decade), intraocular pressure (IOP, mmHg), central corneal thickness (CCT, μm), pattern standard deviation at randomization (PSD0, dB), and horizontal cup/ disc ratio (HCD). These 5 covariates were chosen because they have been identified and validated as independent predictors for the risk of POAG development [13]. The baseline covariates were standardized to have mean 0 and variance 1 throughout the remainder of this paper. As such, for these variables the odds ratios (OR) from the regression models actually represented the effect per 1-SD change.

Variables | Stable Eyes (N=220) | Progressive Eyes(N=57) | p-values |
---|---|---|---|

raceAfrican America Others | 68 (30.9) 152 (69.1) | 23 (40.3) 34 (59.7) | 0.176 |

gender M F |
125 (56.8) 95 (43.2) |
30 (52.6) 27 (47.4) |
0.570 |

age (decades) | 6.57 ± 0.95 | 6.64 ± 0.97 | 0.641 |

mean deviation (MD, dB) | -1.28 ± 2.06 | -2.90 ± 2.53 | <0.001 |

pattern standard deviation (PSD, dB) | 2.89 ± 1.26 | 3.68 ± 1.57 | <0.001 |

intraocular pressure (IOP, mmHg) | 21.6 ± 5.61 | 24.4 ± 7.77 | 0.003 |

central corneal thickness (CCT, µm) | 561 ± 38.22 | 552 ± 39.0 | 0.123 |

horizontal cup/disc ratio (HCD) | 0.53 ± 0.19 | 0.51 ± 0.20 | 0.373 |

median number of total follow-up visits (IQR) | 25 (22-27) | 23 (19-26) | 0.127 |

median number of pre-diagnosis visits (IQR) | 13 (8-19) | 12 (7-17) | 0.322 |

median number of post-diagnosis visits (IQR) | 10 (6-15) | 9 (5-13) | 0.349 |

**Table 1:** Summary statistics of predictors at POAG diagnosis, where categorical
variables are summarized as counts and frequencies, while continuous variables
are summarized as means ± standard deviations (STD) or medians and interquartile
range (IQR) as appropriate.

**Figure 1** shows the diagram of the proposed LCGM for monitoring

VF progression. The longitudinal series of MD and PSD are centered at the time of diagnosis, with black and red lines representing stable and progressive eyes respectively. Subgroups (latent classes) of participants are determined based on distinct patterns of MD and PSD trajectories. Rather than directly modeling the longitudinal change of MD and PSD in the LCGM, we first perform a functional principal component (FPC) analysis in MD and PSD separately to approximate its evolution over time. FPC analysis is chosen here because the trajectories in both MD and PSD, especially in these progressive eyes, tend to be non-linear and a conventional linear mixed model may be inadequate to fully capture the changes of MD and PSD over time.

**Functional principal component (FPC) analysis of longitudinal MD and PSD**

Functional principal component (FPC) analysis for sparse longitudinal data is performed following methods developed by Yao et al. [12] and is briefly summarized in this sub-section. FPC analysis provides a powerful tool of dimension-reduction for functional data (i.e., curves) [15]. It examines the sample covariance structure and finds the set of orthogonal principal component functions that maximize the variance along each component. Taking MD as an example, let m_{ij} denote the observation of MD for i^{th} participant at the j^{th} timepoint and X_{i}(t) be the corresponding smoothed trajectory, then FPC analysis specifies that

and where

is the error term with

represents the overall mean trajectory of MD;

is a vector of FPC scores for i^{th} participant and acts as random effects in statistical models with FPCs. It is a set of uncorrelated random coefficients having mean 0 and variance λ_{k} (i.e., eigenvalues);

is the corresponding k^{th} eigenfunction and describes the direction of deviation from the overall mean trajectory while an individual’s FPC score quantifies the extent to which their trajectory correlates with

Since the eigenvalues often decrease rapidly, the infinitedimensional curves can be well approximated by a very small number of FPCs.

In our FPC analysis, both MD and PSD series are smoothed using cubic B-splines with equally spaced knots, and the number of basis are selected based on an approximate leave-one-curve-out crossvalidation [12]. The first eigenfunction is estimated such that the eigenvalue λ_{1}(i.e., variance of ) is maximized subject to the condition The second principal component function can be obtained by first subtracting from the original smoother function,, and then treatingas the new collection of smoothed functions to find . This process can be performed iteratively for k=1, 2, …, K principle components.

Once the estimates of eigenfunction and eigenvalueλ_{k} are obtained, the FPC scores of individual participants can be calculated via numerical integration = dt by plugging in the estimated mean and eigenfunction. However, the conventional FPC procedure assumes that the density of measurements is sufficiently large. For sparse and irregularly measured longitudinal data like those in OHTS, the conditional expectation method [12] can be applied to obtain individual FPC scores, which are estimated as the best linear unbiased prediction (BLUP) given the observed trajectories and parameters,

and Where

and

In this paper, we only extract the first FPC to approximate the longitudinal MD and PSD.

**Developing LCGM based on the 1 ^{st}-FPC scores of longitudinal MD and PSD**

Let Y_{i}={Y_{i1}, Y_{i2}} denote the resultant 1^{st}-FPC scores of MD and PSD for i^{th} participant, Z_{i}={1, AGE_{i}, PSD0_{i}, IOP_{i}, CCT_{i}, HCD_{i}} be the vector of baseline covariates, Ci represent the latent class membership of i^{th} individual, and θ_{g} be the vector of class-specific parameters that differentiate the G latent classes, with i =1, 2, …, N, and g =1, 2, …, G, respectively. Then the distribution of Yi is a mixture distribution defined as,

The probability Pr(Ci = g) represents the size (mixing proportion) of latent class g in the mixture distribution and is modeled as a multinomial logistic regression, where α_{g} represents the log odds of membership in the class g relative to a reference class (class 1, say), with the parameter in the reference being 0 for identification.

describes the class-specific distribution of Yi and is assumed to follow a bivariate normal distribution. A common variance-covariance structure across all classes is often assumed in a typical latent class model. Due to high variability in both MD and PSD trajectories, we assume a class-specific variance in this model, but we also assume that the correlation between MD and PSD is fully captured by the latent classes, i.e.,

and

Given the estimated parameters and Y_{i}, each individual can be assigned to the most likely class based on the probability of class membership (often termed as posterior class-membership probability),

The best LCGM is selected by enumerating and comparing a set of competing models differing only in the number of classes. The model comparison is based primarily on the log likelihood values, including the Bayesian Information Criteria (BIC, with a smaller BIC indicating a better fit) and the Lo-Mendell-Rubin adjusted likelihood ratio test (LMR-LRT). A significant test of LMR-LRT indicates that the model with G-1 classes should be rejected in favor of the G-class GMM. In addition, we also specify that each class should include at least 5% of patients to ensure reliable within-class estimation.

**Probability of VF progression and its 95% confidence interval**

Once an optimal LCMG has been developed, the class-specific risk of VF progression can be readily obtained, then the probability of VF progression for a new patient with given MD and PSD series can be calculated in the following 3 steps:

• Estimating the 1^{st}-FPC score and its variance var( ) for the given MD and PSD series;

• Calculating the probability of class membership (*posterior class-membership probability*) given the baseline covariates and estimated 1^{st}-FPC scores;

• Estimating the probability of VF progression as weighted average of class-specific risk, weighted by *posterior classmembership probability.*

Due to the complexity of modeling, we have to rely on numerical bootstrap resampling method to construct the 95% confidence interval of VF progression. Specifically, 1000 samples of the 1^{st}-FPC scores for MD and PSD are generated from a normal distribution,˜N(, () The probability of posterior class membership and the probability of VF progression are calculated independently in each bootstrap resample. The 95% confidence interval of VF progression is then obtained by taking the 2.5^{th} and 97.5^{th} percentiles of 1000 corresponding estimates.

In this article, FPC analysis is performed using the library “fpca” in the statistical package R [12], while the parameter estimation for LCGM is implemented using statistical package Mplus which is called via the library “MplusAutomation” in R [16].

**FPC analysis of longitudinal MD and PSD**

Based on a leave-one-curve-out cross-validation [12], the trajectories in both MD and PSD could be best smoothed using 7 basis functions. The 1^{st}-FPC accounted for 89.7% and 90.8% of total variance in MD and PSD trajectories, respectively, and **Figure 2** visualized the estimated trajectories based on FPC analysis. **Figure 2A** showed the mean trajectory of MD (solid line) and mean trajectory ± 1^{st} eigenfunction of FPC (broken lines). In general, MD decreased over time, especially after POAG diagnosis. **Figure 2A** also indicated that the 1^{st}-FPC primarily captured the post-diagnosis variability [15]. To illustrate the performance of 1^{st}-FPC in approximating MD trajectories, we randomly selected 4 individuals who had different magnitude of estimated 1^{st}-FPC MD scores. All the observed trajectories (solid lines) were well approximated by the estimated curves (broken lines) based on the 1^{st}-FPC MD scores (**Figure 2B**). A positive FPC score corresponded to a higher than average MD trajectory while a negative score corresponded to a lower than average MD trajectory.

**Figure 2:** Estimated trajectories of MD and PSD using functional principal component (FPC) analysis. 2A): mean trajectory of MD (solid line) and mean
± 1st-eigenfunction (broken lines); 2B): observed (solid lines) and estimated (broken lines) of MD trajectories in 4 randomly selected participants who
had different levels of 1st-FPC scores; 2C): mean trajectory of PSD (solid line) and mean ± 1st-eigenfunction (broken lines); 2D): observed (solid lines)
and estimated (broken lines) of PSD trajectories in 4 randomly selected participants who had different levels of 1st-FPC scores.

Similarly, the 1^{st}-FPC of PSD captured the post-diagnosis variability and PSD increased over time in general (**Figure 2C**). A positive FPC score corresponded to a lower than average PSD trajectory while a negative score corresponded to a higher than average PSD trajectory. Among these randomly selected 4 individuals, all the observed PSD trajectories (solid lines) were well approximated by the estimated curves (broken lines) using the 1^{st}-FPC PSD scores (**Figure 2D**).

**Developing LCGM using 1 ^{st}-FPC scores of longitudinal MD and PSD**

**Table 2** showed the fitting statistics of 7 competing LCGMs based on the resultant 1^{st}-FPC scores. The first 5 models differed only in the number of latent classes and they all assume class-specific variance with no correlation. Model 2 (3 classes) and Model 3 (4 classes) had similar performance based on Bayesian Information Criteria (BIC) and Lo-Mendell-Rubin adjusted likelihood ratio test (LMR-LRT). Since McLachlan and Peel [17] show that BIC tends to underestimate the number of classes, we chose Model 3 as the best fit to the OHTS data. Two additional models (both with 4 classes) were further considered, one (Model 6) assuming class-specific correlation and the other (Model 7) assuming a common variance-covariance structure across all classes, but none of them showed improved fitting statistics.

Models | # latent class (G) | #parameters | BIC | LMR-LRT | Minimal class size |
---|---|---|---|---|---|

1 (class-specific variance) | 2 | 14 | 4026.6 | <0.001 | 27.7% |

2 (class-specific variance) | 3 | 24 | 3988.1 | 0.005 | 22.6% |

3 (class-specific variance) |
4 |
34 |
3988.6 |
0.024 |
6.0% |

4 (class-specific variance) | 5 | 44 | 4006.2 | 0.092 | 6.0% |

5 (class-specific variance) | 6 | 54 | 4032.1 | <0.001 | 3.2% |

6 (class-specific correlation) | 4 | 38 | 4006.3 | 0.576 | 6.0% |

7 (common variance) | 4 | 28 | 4107.4 | <0.001 | 1.1% |

**Table 2:** Fitting statistics of 7 competing latent class growth models, where Lo-Mendell-Rubin likelihood ratio test (LMR-LRT) compares the improvement in fit between
neighboring class models, with a smaller p-value favoring the G-class model over the model with G-1 classes (null hypothesis).

**Table 3** presented estimated parameters for the best model. The first class contained 84 eyes (30%), had the highest mean FPC score in MD and the lowest mean FPC score in PSD, and only 2 out of 84 eyes (2.4%) were progressed at the end of study. In contrast, the last class (16 eyes, 6%) had the lowest mean MD and the highest mean PSD, and all 16 eyes (100%) were progressed at the end of study. The 2^{nd}- and 3^{rd} class were somewhere in between and accounted for 44% and 19% of the participants, with 10.6% (13 out of 123 eyes) and 48.1% (26 out of 54 eyes) being progressive, respectively. We therefore labeled Classes 1 to 4 as “Stable, low risk”, “Stable, high risk”, “Progressive, moderate”, and “Progressive, rapid”, respectively.

Variables | Class 1 (ref.): Stable, low risk | Class 2: Stable, high risk | Class 3:Progressive, moderate | Class 4:Progressive, rapid |
---|---|---|---|---|

#Eyes#Progressive Eyes (%) |
84 2 (2.4%) |
123 13(10.6%) |
54 26 (48.1%) |
16 16 (100%) |

FPC parametersMean of MD scores Mean of PSD scores Variance of MD scores Variance of PSD scores |
12.63 ± 0.83 -5.73 ± 0.28 19.27 ± 4.29 4.09 ± 0.76 |
2.44 ± 1.16 -3.67 ± 0.59 54.21 ± 12.42 22.93 ± 4.43 |
-11.03 ± 1.97 9.86 ± 1.58 107.03 ± 22.40 68.70 ± 10.16 |
-53.75 ± 5.63 20.90 ± 3.84 370.12± 118.65 165.54 ± 45.92 |

CovariatesIntercept age pattern standard deviation intraocular pressure central corneal thickness horizontal cup/disc ratio |
- - - - - - |
1.63 ± 0.75* 0.69 ± 0.35* 2.62 ± 0.80# -0.24 ± 0.36 0.53 ± 0.43 0.08 ± 0.25 |
0.47 ± 0.73 0.82 ± 0.38* 3.58 ± 0.85# 0.44 ± 0.33 -0.39 ± 0.44 0.01 ± 0.26 |
-2.26 ± 1.15* 1.78 ± 0.50# 4.36 ± 0.89# 0.98 ± 0.55 -0.33 ± 0.72 0.43 ± 0.38 |

*p<0.05, #p<0.001: indicating the significance comparing to the reference group (Class 1).

**Table 3:** Estimated parameters and their standard errors of the latent class model based on FPC scores.

**Table 3** also showed the effects of baseline covariates on the classification of MD and PSD trajectories. Only age and PSD at diagnosis were predictive of MD and PSD trajectories. Treating Class 1 as the reference, for example, participants with old age were more likely to fall into the Classes 2, 3, or 4 (with OR=1.99, 2.27, and 5.93, respectively). Similarly, participants with high PSD at diagnosis were also more likely to be in the high risk groups (with OR=13.74, 35.87, and 78.26 for Classes 2, 3, and 4, respectively). **Figure 3** presented the individual trajectories of MD and PSD within each class. Both MD and PSD series are centered at the time of diagnosis, with red and black lines representing progressive and stable eyes, respectively.

**Individualized prediction of VF progression**

Once the eigenfunction and eigenvalue of the 1^{st}-FPC were estimated and the optimal LCMG were developed, the probability of VF progression for an individual with given series of MD and PSD could be calculated analytically.

**Figure 4** illustrated the predicted probability of VF progression at the end of study for two randomly selected individuals, one with a progressive eye and one with a stable eye. The upper panel showed the observed (solid lines) and estimated (broken lines) MD series measured 5 years prior to diagnosis (**Figure 4A**), the observed and estimated PSD series (**Figure 4B**), as well as the predicted probabilities and corresponding 95% confidence intervals of VF progression at the end of study (**Figures 4C-4H**). The two individuals could not be differentiated given MD and PSD series measured 5 years prior to diagnosis. Given a series of MD and PSD measured 2 years prior to diagnosis, the progressive participant had a relatively higher probability of VF progression, but there was substantial overlap in the 95% confidence intervals (**Figure 4**, Middle panel). However, the two participants were well separated given MD and PSD series up to 1 year post diagnosis (**Figure 4**, Lower panel).

**Figure 4:**Predicted probabilities of VF progression for 2 randomly selected participants. **Upper panel:** observed (solid lines) and estimated (broken
lines) MD and PSD series measured 5 years prior to POAG diagnosis, and the estimated probabilities of VF progression and corresponding 95% CIs; **Middle panel:** observed (solid lines) and estimated (broken lines) MD and PSD series measured 2 years prior to POAG diagnosis, and the estimated
probabilities of VF progression and corresponding 95% CIs; **Lower panel:** observed (solid lines) and estimated (broken lines) MD and PSD series
measured up to 1 year post POAG diagnosis, and the estimated probabilities of VF progression and corresponding 95% CIs.

To investigate the overall discriminating ability of the prediction model, the method was then applied to all individuals given different length of MD and PSD series, and the ROC curves and area under ROC (AUC) from the estimated probabilities were compared. A logistic regression including the 5 baseline variables plus MD at diagnosis was also fitted as a comparison. **Figure 5** showed that the ROC using prediagnosis trajectories had a slightly greater area (AUC=0.721) than the ROC using covariates at diagnosis only (AUC=0.705). It also showed that the predictive accuracy was improved in ROCs including more post-diagnosis VFs (with AUC=0.777, 0.845 and 0.879 for up to 1-, 2-, 3-year post-diagnosis MD/PSD series respectively). We admit that the predictive accuracy in **Figures 4 and 5** may be optimistic because these individuals were also used in the model development. However, an external validation is in progression using the fellow eye of the OHTS participants as well as data from the Collaborative Initial Glaucoma Treatment Study (CIGTS), another randomized phase III trial to assess the treatment efficacy on patients with newly diagnosed glaucoma [18].

In this paper, we developed a model to predict a binary outcome using multivariate longitudinal data. We used a latent class growth model (LCGM) to capture the dependence between longitudinal data and binary outcome as well as the association among multiple biomarkers themselves. LCGM is a semi-parametric statistical technique that describes individual trajectories over time after accounting for unobserved heterogeneity within the population. This model allows us to predict the probability of clinical outcome for a subject with particular covariates and series of biomarkers. We apply the method to dynamically predict visual field (VF) progression in patients with newly diagnosed POAG. The results showed that these participants with progressive visual loss could be accurately differentiated from those with stable vision by incorporating longitudinal biomarkers.

Many methods have been proposed in recent years to incorporate longitudinal biomarkers under latent class model framework [8-11] and most of them take a full parametric approach to describe the change of longitudinal biomarkers over time. In contrast, we performed a dimension reduction to convert repeated measurements into a scalar by a functional principal component (FPC) analysis, one of basic tools in functional data analysis (FDA). FDA is a collection technique about the analysis of information on curves or functions [15]. Unlike conventional methods such as ANOVA that only focus on a limited time points, FDA analyzes data about curves, surfaces, or anything else varying over a continuum. Measured data in a traditional FDA are often recorded over a densely sampled grid, i.e., in the format of infinite-dimension functions or curves such as those generated from automated sensing equipment. In a typical longitudinal study like OHTS, however, the data are often measured sparsely at discrete time points. The emerging techniques such as conditional expectation (PACE) [12] have provided a valuable bridge between functional data and classical longitudinal data. As illustrated in the OHTS data, PACEbased estimates of 1^{st}-FPC scores well approximated the trajectories of MD and PSD, and provided a flexible and yet powerful tool to incorporate multivariate longitudinal data.

There were several limitations in this study. One limitation was that it took a multi-stage approach to calculate the probability of VF progression and failed to account for the uncertainty associated with the estimated FPC scores. One particular feature of the OHTS is that both longitudinal predictors and clinical outcome are based on VF tests. That is, the status of VF progression is determined using VF tests by a point-wise linear regression, while the longitudinal predictors (MD and PSD) are two global indices derived from the same VF tests. A regression approach results in circular use of information and may be inappropriate for OHTS data. We therefore took a stage-wise approach to develop prediction model and the LCGM was only based on baseline covariates and longitudinal MD/PSD.

Another limitation of this study was that, due to relatively short post-diagnosis follow-up, VF progression was only determined as a single snapshot at the end of study and failed to take the onset time of progression into consideration. To deal with this issue, an external validation is in progression using data from the Collaborative Initial Glaucoma Treatment Study (CIGTS). In the validation study (with a median follow-up of 9 years), the status of VF progression will be determined not only at the end of study but also by a moving window over time.

In summary, despite its limitations, the proposed method provided a flexible and yet powerful tool to incorporate even complex multivariate trajectories for predicting distal clinical outcomes. Although the model in this paper is developed for the prediction of VF progression in patients with newly diagnosed POAG, the methodology is applicable to any chronic diseases with multiple longitudinal predictors. When the clinical outcome are defined independent of longitudinal predictors, however, we expect that a joint modelling approach that classifies participants based on the patterns of clinical outcome and longitudinal biomarkers simultaneously could better account for the association between two random processes and improve the predictive accuracy.

This study is supported by grants from the National Eye Institute National Institute of Health, Bethesda, MD (EY091369 and EY09341) and NCI Cancer Center Support Grant P30 CA091842.

- Heijl A, Bengtsson B, Hyman L, Leske MC; Early Manifest Glaucoma Trial Group (2009) Natural history of open-angle glaucoma. Ophthalmology 116: 2271-2276.
- Spry PG, Johnson CA (2002) Identification of progressive glaucomatous visual field loss. SurvOphthalmol 47: 158-173.
- Henderson R, Diggle P, Dobson A (2000) Joint modelling of longitudinal measurements and event time data. Biostatistics 1: 465-480.
- Guo X, Carlin BP (2004) Separate and joint modeling of longitudinal and event time data using standard computer packages. The American Statistician 58: 1-9.
- Rizopoulos D, Verbeke G, Lesaffre E (2009)Fully exponential Laplace approximations for the joint modelling of survival and longitudinal data. Journal of the Royal Statistical Society71: 637=654.
- Horrocks J, Van Den Heuvel MJ (2009) Prediction of Pregnancy: A Joint Model for Longitudinal and Binary Data. Bayesian Anal 4: 523-538.
- Luo S (2014) A Bayesian approach to joint analysis of multivariate longitudinal data and parametric accelerated failure time. Stat Med 33: 580-594.
- Proust-Lima C1, Séne M, Taylor JM, Jacqmin-Gadda H (2014) Joint latent class models for longitudinal and time-to-event data: a review. Stat Methods Med Res 23: 74-90.
- Garre FG, Zwinderman AH, Geskus RB,Sijpkens YWJ (2008)A joint latent class changepoint model to improve the prediction of time to graft failure. Journal of the Royal Statistical Society: Series A 171: 299-308.
- Proust-Lima C, Sene M, Taylor JM (2009) Development and validation of a dynamic prognostic tool for prostate cancer recurrence using repeated measures of posttreatment PSA: a joint modeling approach. Biostatistics 10: 535-549.
- Maruyama N, Takahashi F, Uno H,Takeuchi M (2012) Predicting Binary Outcomes Using Trajectory Classes. Statistics in Biopharmaceutical Research. 4: 284-292.
- Yao F, Muller HG, Wang JL (2005) Functional data analysis for sparse longitudinal data. Journal of the American Statistical Association 100:577-590.
- Gordon MO, Beiser JA, Brandt JD, Heuer DK, Higginbotham EJ, et al. (2002) The Ocular Hypertension Treatment Study: baseline factors that predict the onset of primary open-angle glaucoma. Arch Ophthalmol 120: 714-720.
- Gardiner SK, Crabb DP (2002) Examination of different pointwise linear regression methods for determining visual field progression. Invest Ophthalmol Vis Sci 43: 1400-1407.
- Byrd D, Lee S, Campos-Astorkiza R (2008) Phrase boundary effects on the temporal kinematics of sequential tongue tip consonants. J AcoustSoc Am 123: 4456-4465.
- Muthen LK, MuthenBO (1998-2010)Mplus User’s Guide (6thedn.) Muthen&Muthen,Los Angeles, CA.
- McLachlan G, Peel D (2000) Finite Mixture Models.Wiley Inter-Science Publication New York, USA.
- Musch DC, Gillespie BW, Lichter PR, Niziol LM, Janz NK; CIGTS Study Investigators (2009)Visual field progression in the Collaborative Initial Glaucoma Treatment Study the impact of treatment and other baseline factors.Ophthalmology116: 200-207.

Select your language of interest to view the total content in your interested language

- Adomian Decomposition Method
- Algebra
- Algebraic Geometry
- Algorithm
- Analytical Geometry
- Applied Mathematics
- Artificial Intelligence Studies
- Axioms
- Balance Law
- Behaviometrics
- Big Data Analytics
- Binary and Non-normal Continuous Data
- Binomial Regression
- Bioinformatics Modeling
- Biometrics
- Biostatistics methods
- Biostatistics: Current Trends
- Clinical Trail
- Cloud Computation
- Combinatorics
- Complex Analysis
- Computational Model
- Computational Sciences
- Computer Science
- Convection Diffusion Equations
- Cross-Covariance and Cross-Correlation
- Data Mining Current Research
- Deformations Theory
- Differential Equations
- Differential Transform Method
- Findings on Machine Learning
- Fourier Analysis
- Fuzzy Boundary Value
- Fuzzy Environments
- Fuzzy Quasi-Metric Space
- Genetic Linkage
- Geometry
- Hamilton Mechanics
- Harmonic Analysis
- Homological Algebra
- Homotopical Algebra
- Hypothesis Testing
- Integrated Analysis
- Integration
- Large-scale Survey Data
- Latin Squares
- Lie Algebra
- Lie Superalgebra
- Lie Theory
- Lie Triple Systems
- Loop Algebra
- Mathematical Modeling
- Matrix
- Microarray Studies
- Mixed Initial-boundary Value
- Molecular Modelling
- Multivariate-Normal Model
- Neural Network
- Noether's theorem
- Non rigid Image Registration
- Nonlinear Differential Equations
- Number Theory
- Numerical Solutions
- Operad Theory
- Physical Mathematics
- Quantum Group
- Quantum Mechanics
- Quantum electrodynamics
- Quasi-Group
- Quasilinear Hyperbolic Systems
- Regressions
- Relativity
- Representation theory
- Riemannian Geometry
- Robotics Research
- Robust Method
- Semi Analytical-Solution
- Sensitivity Analysis
- Smooth Complexities
- Soft biometrics
- Spatial Gaussian Markov Random Fields
- Statistical Methods
- Studies on Computational Biology
- Super Algebras
- Symmetric Spaces
- Systems Biology
- Theoretical Physics
- Theory of Mathematical Modeling
- Three Dimensional Steady State
- Topologies
- Topology
- mirror symmetry
- vector bundle

- 6th International Conference on
**Biostatistics**and**Bioinformatics**

November 13-14, 2017, Atlanta, USA

- Total views:
**11664** - [From(publication date):

October-2015 - Aug 19, 2017] - Breakdown by view type
- HTML page views :
**7851** - PDF downloads :
**3813**

Peer Reviewed Journals

International Conferences 2017-18