
^{1}Department of Epidemiology, University of Kentucky, USA

^{2}Department of Statistics, University of Kentucky, USA

^{3}Department of Biostatistics, University of Kentucky, USA

^{4}Sanders-Brown Center on Aging, University of Kentucky, USA

***Corresponding Author:**
Richard J Charnigo
Department of Biostatistics
University of Kentucky, USA
**Tel:** (859) 218-2072
**E-mail:** [email protected]

**Received date:** October 21, 2013; **Accepted date:** October 21, 2013; **Published date:** October 25, 2013

**Citation:** Abner EL, Charnigo RJ, Kryscio RJ (2013) Markov Chains and Semi-Markov Models in Time-to-Event Analysis. J Biomet Biostat S1:e001. doi:10.4172/2155-6180.S1-e001

**Copyright:** © 2013 Abner EL, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


A variety of statistical methods are available to investigators for analysis of time-to-event data, often referred to as survival analysis. Kaplan-Meier estimation and Cox proportional hazards regression are commonly employed tools but are not appropriate for all studies, particularly in the presence of competing risks and when multiple or recurrent outcomes are of interest. Markov chain models can accommodate censored data, competing risks (informative censoring), multiple outcomes, recurrent outcomes, frailty, and non-constant survival probabilities. Markov chain models, though often overlooked by investigators in time-to-event analysis, have long been used in clinical studies and have widespread application in other fields.

The analysis of time-to-event data in human and animal studies presents several statistical challenges. In addition to the familiar problem of censored observations, there may be multiple types of failure under consideration (the “competing risk problem” [1]); clinically relevant outcomes other than failure may be observed during follow-up [2], including those that alter the risk of failure or can occur more than once [3,4]; and individual susceptibility to failure (i.e., frailty) may not be constant over time [5]. While traditional time-to-event analysis methods like Kaplan-Meier product-limit estimation and Cox proportional hazards regression are implemented easily and use censored data efficiently when the assumption of uninformative censoring holds, analyses involving informative censoring, multiple outcomes, or non-constant survival probabilities may be well suited for application of Markov processes [6]. A contemporary approach to the informative censoring problem in Cox regression involves a multivariate survival analysis [7].

A Markov process is a stochastic process that describes the movement of an individual through a finite number of defined states, one (and only one) of which must contain the individual at any particular time. Possible movements among states may be depicted with a transition matrix or state diagram [2,3,6]. In order for the process to terminate, at least one of the states must be absorbing, i.e., individuals have zero probability of leaving the state once it has been entered. Death, for example, is an absorbing state used commonly in clinical studies, but it is also a well-known competing risk for clinical outcomes in studies of older persons [2,4]. Markov processes may be continuous or discrete as well as time-homogeneous or time-nonhomogeneous. The focus of this editorial will be discrete, time-homogeneous Markov processes called Markov chains.
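As a minimal sketch of these definitions, the following Python fragment encodes a transition matrix for a hypothetical three-state illness-death model. The state names and probabilities are illustrative assumptions only and are not estimates from any study cited here. The helper functions check that each row is a probability distribution, identify absorbing states, and advance a cohort by one observation cycle.

```python
# Hypothetical three-state model; names and probabilities are assumptions
# chosen for illustration, not estimates from data.
STATES = ["healthy", "impaired", "dead"]

# P[i][j] = probability of moving from state i to state j in one cycle.
P = [
    [0.85, 0.10, 0.05],  # from healthy
    [0.10, 0.70, 0.20],  # from impaired
    [0.00, 0.00, 1.00],  # from dead (absorbing: cannot be left)
]


def is_stochastic(matrix, tol=1e-9):
    """Return True if every row is non-negative and sums to 1."""
    return all(
        all(p >= 0 for p in row) and abs(sum(row) - 1.0) <= tol
        for row in matrix
    )


def absorbing_states(matrix, names):
    """Return the names of states whose self-transition probability is 1."""
    return [names[i] for i, row in enumerate(matrix) if row[i] == 1.0]


def step(distribution, matrix):
    """Advance a distribution over states by one observation cycle."""
    n = len(matrix)
    return [
        sum(distribution[i] * matrix[i][j] for i in range(n))
        for j in range(n)
    ]


print(is_stochastic(P))             # True
print(absorbing_states(P, STATES))  # ['dead']
print(step([1.0, 0.0, 0.0], P))     # [0.85, 0.1, 0.05]
```

Because the same matrix is applied at every cycle, this sketch is time-homogeneous by construction.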

Markov chain models allow analysts to calculate the probability and rate (or intensity) of movement associated with each transition between states within a single observation cycle as well as the approximate number of cycles spent in a particular state. When observations are made at regular intervals, the number of cycles can be interpreted as time in a state. Time spent in all states prior to absorption can be summed to estimate the total survival time. Use of Markov chains requires two fundamental assumptions: (i) transition probabilities are constant over time (time homogeneity); and (ii) the probability of the next transition depends only on the current state (the first-order Markov property). These models are attractive for time-to-event analysis. They accommodate the simultaneous analysis of multiple events of interest and inclusion of competing risks through the states defined in the model, as well as consideration of individual frailty through subject-specific random effects [8,9].
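The expected number of cycles spent in each state before absorption can be obtained from the standard fundamental matrix of an absorbing chain, N = (I − Q)^{-1}, where Q holds the transition probabilities among the non-absorbing states. The sketch below assumes a hypothetical two-state Q (healthy, impaired) with illustrative values, and uses a small hand-written Gauss-Jordan inverse so that it runs without external libraries.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity on the right.
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Hypothetical transition probabilities among the transient states only
# (rows and columns: healthy, impaired); the remainder of each row is the
# probability of absorption (death) in that cycle.
Q = [
    [0.85, 0.10],
    [0.10, 0.70],
]

n = len(Q)
I_minus_Q = [[(1.0 if i == j else 0.0) - Q[i][j] for j in range(n)]
             for i in range(n)]

# N[i][j] = expected number of cycles spent in state j when starting in
# state i, counting the starting cycle itself.
N = invert(I_minus_Q)

# Summing across states gives the expected total cycles before absorption.
expected_cycles = [sum(row) for row in N]

print(round(expected_cycles[0], 2))  # starting healthy: 11.43
print(round(expected_cycles[1], 2))  # starting impaired: 7.14
```

When observation cycles are equally spaced, multiplying these counts by the cycle length converts them to time.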

Censored data, both right and left, are appropriate for Markov chains. In a Markov chain model, for example, an individual who never reaches an absorbing state (right-censored), whether because the study observation is ongoing or the subject has withdrawn or been lost to follow-up, can contribute information to the model regarding the transitions he or she did make, which is an advantage over traditional survival analysis methodology [6]. Because individuals are not required to enter the transition matrix in any particular state, left-censored data are also accommodated. Interval censoring is not formally accommodated in Markov chains, which assume that transitions take place only once per observation cycle, either at the beginning or the end. In reality, transitions may take place at any time, and multiple unobserved transitions may take place between cycle assessments. Approaches such as the half-cycle correction, where transitions are assumed to occur in the middle of the observation cycle [3], have been proposed to mitigate bias resulting from assuming that transitions take place only at the cycle’s beginning or end. If the clinical model and data structure support the assumption that all transitions are unidirectional (i.e., no reverse transitions are possible), a semi-Markov model, which is a special case of Markov chain where the time spent in the current state depends on both the prior and future adjoining states [10], could be considered for interval-censored data [4,10].
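To illustrate how censored individuals still contribute information, the sketch below estimates transition probabilities as simple row proportions of observed transition counts. The three visit histories are hypothetical, and the estimator assumes time homogeneity, no covariates, and exactly one observed state per cycle.

```python
from collections import defaultdict

# Hypothetical visit histories, one observed state per cycle.
histories = {
    "A": ["healthy", "healthy", "impaired", "dead"],  # reached absorption
    "B": ["healthy", "impaired", "impaired"],         # right-censored
    "C": ["impaired", "healthy", "healthy"],          # entered as impaired, then censored
}


def estimate_transition_probabilities(histories):
    """Estimate P(next state | current state) as observed row proportions."""
    counts = defaultdict(lambda: defaultdict(int))
    for states in histories.values():
        # Every consecutive pair of visits is one observed transition,
        # regardless of where the history starts or stops.
        for current, nxt in zip(states, states[1:]):
            counts[current][nxt] += 1
    probabilities = {}
    for current, row in counts.items():
        total = sum(row.values())
        probabilities[current] = {nxt: k / total for nxt, k in row.items()}
    return probabilities


probs = estimate_transition_probabilities(histories)
print(probs["healthy"])   # {'healthy': 0.5, 'impaired': 0.5}
```

Subjects B and C never reach the absorbing state, yet each supplies two transitions to the counts. Subject C also shows that a history need not begin in any particular state.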

Finally, unlike traditional time-to-event analysis where only one outcome is possible for each individual, Markov chains allow analysts to calculate survival times in multiple states. This is particularly attractive for studies of chronic diseases with well-defined phases, like cancer [11] and autoimmune diseases [12], where remission and recurrence are of interest in addition to overall survival, and dementia due to neurodegenerative disease, where pre-clinical and mildly symptomatic disease states are increasingly of interest to researchers working to identify treatments and prevention strategies [2,4]. As with traditional time-to-event analysis, survival curves may be estimated from model results [13]. Mean survival times may be inferred using matrix solution, Markov cohort simulation, or Markov Chain Monte Carlo simulation [3]. These calculations are more cumbersome, but still possible, when transition probability estimates are derived from covariate-adjusted regression models [14]. By contrast, semi-Markov models estimate mean survival times directly without the need for additional calculations [4].
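A Markov cohort simulation of mean survival can be sketched as follows, again using a hypothetical three-state transition matrix with illustrative values. The fragment also shows one common way of applying the half-cycle correction discussed earlier: averaging the survival totals obtained by counting state membership at the start and at the end of each cycle. The 500-cycle horizon is an assumption chosen so that essentially the whole cohort has been absorbed.

```python
# Hypothetical transition matrix (states: healthy, impaired, dead);
# values are illustrative assumptions only.
P = [
    [0.85, 0.10, 0.05],
    [0.10, 0.70, 0.20],
    [0.00, 0.00, 1.00],
]


def cohort_simulation(matrix, start, alive_states, n_cycles):
    """Return the fraction of the cohort alive at cycle boundaries
    0, 1, ..., n_cycles."""
    n = len(matrix)
    dist = list(start)
    alive = [sum(dist[s] for s in alive_states)]
    for _ in range(n_cycles):
        dist = [sum(dist[i] * matrix[i][j] for i in range(n))
                for j in range(n)]
        alive.append(sum(dist[s] for s in alive_states))
    return alive


alive = cohort_simulation(P, start=[1.0, 0.0, 0.0],
                          alive_states=[0, 1], n_cycles=500)

# Membership counted at the start of each cycle
# (transitions treated as occurring at the end of the cycle).
at_start = sum(alive[:-1])
# Membership counted at the end of each cycle
# (transitions treated as occurring at the start of the cycle).
at_end = sum(alive[1:])
# Half-cycle correction: average the two, i.e. a trapezoidal rule.
half_cycle = 0.5 * (at_start + at_end)

print(round(at_start, 2))    # 11.43 cycles
print(round(half_cycle, 2))  # 10.93 cycles
```

With a long horizon, the uncorrected start-of-cycle total agrees with the matrix solution for the same chain, and the corrected value is half a cycle smaller.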

The time homogeneity assumption can be assessed with a likelihood ratio test, and the first-order Markov property assumption can be examined with a chi-square test [6,15]. The time homogeneity assumption is often difficult to meet, particularly in studies of chronic disease where studies are years long, single observation cycles can span a year or more, and increasing age generally corresponds to greater risk of disease or death. However, this concern can be mitigated by data stratification (e.g., by age group or study period) or regression modeling, where the effect of covariates is included in the estimation of transition probabilities [16]. In regression, covariates may be either fixed or time-dependent.
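A likelihood ratio statistic for time homogeneity, in the spirit of Anderson and Goodman [15], compares period-specific transition proportions with proportions pooled over periods. The sketch below is an illustration under stated assumptions: the transition counts are hypothetical, the degrees of freedom are computed only over destination states actually observed from each origin state, and the statistic is meant to be compared with a chi-square critical value rather than converted to a p-value here.

```python
import math


def lr_time_homogeneity(counts_by_period):
    """Likelihood ratio statistic and degrees of freedom for the null
    hypothesis that transition probabilities are equal across periods.

    counts_by_period[t][i][j] = number of i -> j transitions in period t.
    """
    n_states = len(counts_by_period[0])
    pooled = [
        [sum(c[i][j] for c in counts_by_period) for j in range(n_states)]
        for i in range(n_states)
    ]
    stat, df = 0.0, 0
    for i in range(n_states):
        pooled_total = sum(pooled[i])
        if pooled_total == 0:
            continue  # no transitions observed out of state i
        reachable = sum(1 for j in range(n_states) if pooled[i][j] > 0)
        periods = sum(1 for c in counts_by_period if sum(c[i]) > 0)
        df += (periods - 1) * (reachable - 1)
        for c in counts_by_period:
            row_total = sum(c[i])
            for j in range(n_states):
                if c[i][j] > 0:
                    p_period = c[i][j] / row_total
                    p_pooled = pooled[i][j] / pooled_total
                    stat += 2.0 * c[i][j] * math.log(p_period / p_pooled)
    return stat, df


# Hypothetical two-state counts (alive, dead) for two study periods.
early = [[90, 10], [0, 100]]
late = [[60, 40], [0, 100]]

stat, df = lr_time_homogeneity([early, late])
# With 1 degree of freedom, the 5% chi-square critical value is about 3.84.
print(round(stat, 1), df)
```

In this contrived example the statistic is far above the critical value, so homogeneity would be rejected. Stratifying by period, as the paragraph above suggests, amounts to fitting the period-specific proportions instead of the pooled ones.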

Even when the fundamental model assumptions are met, application of the Markov chain model may still be unsuccessful. Data density, i.e., the observed frequency of each transition type, may be too sparse in some cells to implement the regression model. Sparse cells, where few events are observed, may lead to inaccurate estimation or prevent model convergence. In addition, there is no widely accepted goodness of fit test for the model.

Although Markov models have been used in clinical applications for over 60 years [17], incorporation of subject-specific random effects in Markov chains to account for individual propensity to make transitions is a relatively recent development [7]. However, inclusion of random effects makes estimation of the likelihood quite complex, and fitting such models can be time consuming. More importantly, their meaning must be carefully considered. Models that utilize tunnel states (i.e., non-absorbing states from which reverse transitions are not possible) [3], for example, complicate the use of random effects.

In closing, Markov chains are useful tools for survival analysis that allow for more nuanced modeling than is available in most standard time-to-event methods. While the focus of this editorial has been clinical studies, Markov chains have clear applications in diverse fields including labor research [18], finance [19], political science [20], chemical engineering [21], and demography [22]. However, while many journal readers and reviewers may readily comprehend the results from Markov models, they may lack familiarity with the underlying statistical assumptions, particularly in fields where the use of Markov models is not yet widespread. If so, they may neglect to challenge investigators to demonstrate that these assumptions are tenable. Given that improper use of Markov models may result in biased estimation, perhaps some standardization in the reporting of Markov model results and assumption verification is needed.

This research was partially funded with support from grants to the University of Kentucky’s Sanders-Brown Center on Aging, R01 AG038651-01A1 and P30 AG028383, from the National Institute on Aging, as well as a grant to the University of Kentucky’s Center for Clinical and Translational Science, UL1TR000117, from the National Center for Advancing Translational Sciences.

1. Prentice RL, Kalbfleisch JD, Petersen AV, Flournoy N, Farewell VT, et al. (1978) The analysis of failure times in the presence of competing risks. Biometrics 34: 541-554.
2. Abner EL, Kryscio RJ, Cooper GE, Fardo DW, Jicha GA, et al. (2012) Mild Cognitive Impairment: Statistical models of transition using longitudinal clinical data. Int J Alzheimers Dis 2012: 291920.
3. Sonnenberg FA, Beck JR (1993) Markov models in medical decision making: a practical guide. Med Decis Making 13: 322-338.
4. Kryscio RJ, Abner EL, Lin Y, Cooper GE, Fardo DW, et al. (2013) Adjusting for mortality when identifying risk factors for transitions to MCI and dementia. J Alzheimers Dis 35: 823-832.
5. Aalen OO (1994) Effects of frailty in survival analysis. Stat Methods in Med Res 3: 227-243.
6. Hillis A, Maguire M, Hawkins BS, Newhouse MM (1986) The Markov process as a general method for nonparametric analysis of right-censored medical data. J Chron Dis 39: 595-604.
7. Crowder M (2012) Multivariate survival analysis and competing risks. CRC Press, Taylor and Francis Group: Boca Raton, Florida.
8. Salazar JC, Schmitt FA, Yu L, Mendiondo MS, Kryscio RJ (2007) Shared random effects analysis of multi-state Markov models: application to a longitudinal study of transitions to dementia. Statist Med 26: 568-580.
9. Song C, Kuo L, Derby CA, Lipton RB, Hall CB (2011) Multi-stage transitional models with random effects and their application to the Einstein Aging Study. Biometrical J 53: 938-955.
10. Kang M, Lagakos SW (2007) Statistical methods for panel data from a semi-Markov process, with application to HPV. Biostatistics 8: 252-264.
11. Kay R (1986) A Markov model for analyzing cancer markers and disease states in survival studies. Biometrics 42: 855-865.
12. Pan F, Goh JW, Cutter G, Su W, Pleimes D, et al. (2012) Long-term cost-effectiveness model of interferon beta-1b in the early treatment of multiple sclerosis in the United States. Clin Ther 34: 1966-1976.
13. Sendi PP, Craig BA, Pfulger D, Gafni A, Bucher HC (1999) Systematic validation of disease models for pharmacoeconomic evaluations. J Eval Clin Prac 5: 283-295.
14. Yu L, Griffith WS, Tyas SL, Snowdon DA, Kryscio RJ (2010) A nonstationary Markov transition model for computing the relative risk of dementia before death. Statist Med 29: 639-648.
15. Anderson TW, Goodman LA (1957) Statistical inference about Markov chains. Ann Math Stat 28: 89-110.
16. Kalbfleisch JD, Lawless JF (1985) The analysis of panel data under a Markov assumption. J Am Stat Assoc 80: 863-871.
17. Fix E, Neyman J (1951) A simple stochastic model of recovery, relapse, death and loss of patients. Human Biol 23: 205-241.
18. Pedersen J, Bjorner JB, Burr H, Christensen KB (2008) Transitions between sickness absence, work, unemployment, and disability in Denmark 2004–2008. Scand J Work Environ Health 38: 516-526.
19. Hochreiter R, Wozabal D (2010) Evolutionary estimation of a coupled Markov chain credit risk model. Natural Computing in Computational Finance 293: 31-44.
20. Boskin MJ, Nold FC (1975) A Markov model of turnover in aid to families with dependent children. J Human Resources 10: 467-481.
21. Tamir A (1998) Markov chains in chemical engineering. Elsevier B.V.: Amsterdam, The Netherlands.
22. van Raalte AA, Caswell H (2013) Perturbation analysis of indices of lifespan variability. Demography 50: 1615-1640.
