^{1}Department of Statistics, University of Kentucky, 725 Rose St, Lexington, KY 40536, USA

^{2}Department of Biostatistics, University of Kentucky, 725 Rose St, Lexington, KY 40536, USA

*Corresponding Author: Yanbing Zheng, Department of Statistics, University of Kentucky, 725 Rose St, Lexington, KY 40536, USA

**E-mail:**[email protected]

**Received date:** August 07, 2012; **Accepted date:** August 08, 2012; **Published date:** August 13, 2012

**Citation:** Zheng Y, Charnigo R (2012) On Selecting Spatial-Temporal Autologistic Regression Models for Binary Lattice Data. J Biom Biostat 3:e112. doi:10.4172/2155-6180.1000e112

**Copyright:** © 2012 Zheng Y, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

In many biological and physical sciences, rapid advances in technical capabilities have dramatically increased the amount of data that are collected across space and over time. Spatial-temporal models are important tools for the analysis of spatial data collected repeatedly over time and have been applied to a wide range of problems, including modeling patterns in lung cancer [1], breast cancer [2], birth defects [3], and West Nile virus [4]; see also Cressie [5], Rue and Held [6], and Schabenberger and Gotway [7]. In particular, for binary data that are observed on a spatial lattice over time, spatial-temporal autologistic regression models relate binary responses to covariates while accounting for spatial and temporal dependence simultaneously [8,9].

Let *y_{it}* denote the binary response at site i and time t, modeled for t = s + 1,..., m. Further, for a given time point t, we assume that the response variable follows an autologistic model.

Here x_{jit} denotes the j^{th} covariate at site i and time t, β_{j} are regression coefficients, θ_{k} for k = 1,..., q are spatial autoregressive coefficients, and the remaining parameters are temporal autoregressive coefficients and, for l = 1,..., s, spatial-temporal interactive coefficients. For a given site i, we can partition the neighborhood into subsets N_{k}(i), k = 1,..., q. For example, in the bark beetle infestation example of Zhu et al. [8], the study region is a regular square grid. Then we can define N_{k}(i), the k^{th}-order neighbors of a given site i, to contain the k^{th} nearest neighbors in terms of distance, for k = 1,..., q. Taking q = 2 for example, we note that θ_{1} ≠ 0, θ_{2} = 0 corresponds to spatial autocorrelation along the north-south and west-east directions, while θ_{1} = 0, θ_{2} ≠ 0 corresponds to spatial autocorrelation along the northwest-southeast and northeast-southwest directions. To account for anisotropy, we could further partition N_{k}(i) by direction as in Zhu et al. [10]. In general, the magnitude of θ_{k} reflects not only the extent but also the direction of spatial autocorrelation.
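
The neighborhood partition on a regular square grid can be illustrated with a short sketch. It assumes the responses at one time point are stored as a 2-D 0/1 NumPy array and that sites outside the grid contribute nothing to the sums; the function name `neighbor_sums` and this edge treatment are illustrative choices, not taken from the cited papers.

```python
import numpy as np

def neighbor_sums(y):
    """Return (first_order, second_order) neighbor sums for a 2-D 0/1 array.

    first_order[r, c]  sums the north, south, west and east neighbors of (r, c).
    second_order[r, c] sums the four diagonal neighbors of (r, c).
    Sites outside the grid count as 0 (an assumed edge treatment).
    """
    y = np.asarray(y, dtype=int)
    p = np.pad(y, 1)  # zero border, so edge sites simply have fewer neighbors
    first = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
    second = p[:-2, :-2] + p[:-2, 2:] + p[2:, :-2] + p[2:, 2:]
    return first, second

# Checkerboard pattern: all dependence runs along the diagonals.
y = np.array([[1, 0, 1],
              [0, 1, 0],
              [1, 0, 1]])
first, second = neighbor_sums(y)
print(first[1, 1], second[1, 1])
```

In this checkerboard the centre site has no occupied first-order neighbors and four occupied second-order neighbors, which is the kind of pattern the case θ_{1} = 0, θ_{2} ≠ 0 is meant to capture.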

Some special cases of the above spatial-temporal autologistic regression models (cf. Reyes [11]) are as follows:

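• Spatial independence: all spatial autoregressive coefficients and all spatial-temporal interactive coefficients equal zero.

• Temporal independence: all temporal autoregressive coefficients and all spatial-temporal interactive coefficients equal zero.

• Spatial-temporal separable neighborhood structure: all spatial-temporal interactive coefficients equal zero.

• Spatial-temporal non-separable neighborhood structure: some spatial-temporal interactive coefficients are non-zero.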

In what follows, for simplicity we focus on the spatial-temporal separable neighborhood structure.
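
As a hedged illustration of the separable case, one conditional specification consistent with the verbal description above can be written as follows. The intercept \beta_0, the number of covariates p, the indexing \theta_{q+l} for the temporal coefficients, and conditioning only on the s preceding time points are all assumptions made for this sketch; they are not notation confirmed by the source.

```latex
% Sketch of a separable spatial-temporal autologistic model (assumed notation).
% N_k(i): k-th order spatial neighbors of site i;  s: assumed temporal order.
\operatorname{logit} P\bigl(y_{it} = 1 \mid y_{i't},\, i' \neq i;\; y_{i,t-1}, \ldots, y_{i,t-s}\bigr)
  = \beta_0 + \sum_{j=1}^{p} \beta_j x_{jit}
  + \sum_{k=1}^{q} \theta_k \sum_{i' \in N_k(i)} y_{i't}
  + \sum_{l=1}^{s} \theta_{q+l}\, y_{i,t-l},
  \qquad t = s+1, \ldots, m .
```

Under this sketch, setting every \theta_k to zero gives spatial independence and setting every \theta_{q+l} to zero gives temporal independence; a non-separable model would add terms that multiply neighbor responses at lagged time points.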

Some interesting statistical problems for autologistic regression models include how to select covariates and determine an appropriate spatial and temporal neighborhood structure. For example, in studying the impact of climate change on bark beetle infestation of pine forests in North America, some of the most important scientific objectives are to identify and quantify the effects of environmental conditions (e.g. climate change) on bark beetle infestation. Also of great interest is describing the extent and direction of bark beetle dispersal [12]. Judicious selection of covariates and spatial-temporal neighborhood structure permits fulfillment of the aforementioned scientific objectives.

For binary spatial-temporal lattice data, there is no consensus on how to perform model selection. Particularly regarding spatial-temporal neighborhood structure, this lack of consensus has resulted in researchers employing creative but ad hoc methods whose statistical properties are not fully understood. For example, Zhu et al. [13] selected covariates using backward elimination based on t-ratios of the parameter estimates under a pre-specified spatial and temporal neighborhood structure for their analysis of the southern pine beetle outbreak in North Carolina, United States. Zhu et al. [8] pre-selected the spatial and temporal neighborhood structure without including covariates using the AIC and then, once the neighborhood structure was specified, chose covariates for their analysis of the mountain pine beetle outbreak in British Columbia, Canada. Using pre-selected covariates, Bandyopadhyay et al. [9] employed a Bayesian paradigm to compare several different spatial dependence structures for dental caries data. As these examples suggest, covariates and neighborhood structure are usually not selected simultaneously, since examining all possible combinations of covariates and neighborhood structure may be prohibitively time-consuming.

In the remainder of this editorial, we discuss some possibilities for selection of covariates and spatial-temporal neighborhood structure, based on the premise of determining which regression and autoregressive coefficients are non-zero. One idea would be to consider a penalized log-likelihood function Q(η) via adaptive LASSO [14], with separate regularization parameters for the regression coefficients β, the spatial autoregressive coefficients θ_{k}, and the temporal autoregressive coefficients. However, for the spatial-temporal autologistic regression model, there is no explicit representation of the likelihood function. One possibility would be to replace the likelihood function by the pseudolikelihood function [15]. Another would be to use the Monte Carlo likelihood function (see, e.g., Geyer and Thompson [16], Huffer and Wu [17]), which consistently estimates the likelihood function but is computationally intensive.
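
A minimal sketch of the pseudolikelihood variant of this idea follows. It assumes the conditional model has logistic form, so that the log-pseudolikelihood looks like an ordinary logistic regression log-likelihood in a design matrix `Z` whose columns hold covariates, neighbor sums, and lagged responses. The function names, the weight form 1/|initial estimate|^gamma, and the absence of any sample-size scaling of the penalty are illustrative assumptions rather than the authors' specification.

```python
import numpy as np

def log_pseudolikelihood(eta, Z, y):
    """Sum over sites of log conditional probabilities under a logistic form.

    Z : (n, d) array; each row holds covariates, neighbor sums and lagged
        responses for one site-time pair (assumed layout).
    y : (n,) array of 0/1 responses.
    """
    lin = Z @ eta
    # log p = y * lin - log(1 + exp(lin)), computed stably via logaddexp
    return np.sum(y * lin - np.logaddexp(0.0, lin))

def adaptive_lasso_objective(eta, Z, y, eta_init, lam, gamma=1.0):
    """Penalized log-pseudolikelihood with adaptive LASSO weights.

    eta_init : initial (e.g. unpenalized) estimates, assumed non-zero.
    lam      : scalar or per-coefficient array of regularization parameters.
    """
    weights = 1.0 / np.abs(eta_init) ** gamma
    penalty = np.sum(lam * weights * np.abs(eta))
    return log_pseudolikelihood(eta, Z, y) - penalty

# Toy usage with made-up numbers.
Z = np.array([[1.0, 0.0], [1.0, 2.0], [1.0, 1.0], [1.0, 3.0]])
y = np.array([0, 1, 0, 1])
print(adaptive_lasso_objective(np.array([-1.0, 0.8]), Z, y,
                               eta_init=np.array([-1.5, 1.0]), lam=0.1))
```

Passing `lam` as an array allows different regularization parameters for regression, spatial, and temporal coefficients, in the spirit of the separate penalties described above.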

To maximize Q(η), one possibility is to deploy a Newton-Raphson (NR) type algorithm based on a local quadratic approximation (LQA). The LQA algorithm has been used widely and shown to produce reliable results in practice, even for dependent data [18]. However, this algorithm is slow, and a coefficient shrunk to 0 during the iteration of the algorithm remains at 0 throughout all subsequent iterations. Other methods may be considered for non-Gaussian distributions. For example, Madigan and Ridgeway [19] considered LARS-type algorithms for logistic regression, while Genkin et al. [20] proposed Bayesian logistic regression with a Laplace prior for large-scale text categorization. Park and Hastie [21] developed a path algorithm for variable selection in a generalized linear model based on a predictor-corrector method. We conclude this editorial by calling for further research on efficient variable and neighborhood structure selection for autologistic regression models, which will equip scientists with more advanced statistical tools for exploring and analyzing spatial-temporal lattice data.
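
The LQA iteration mentioned above can be sketched for a logistic-form objective with an L1-type penalty. This is an illustrative implementation under stated assumptions, not the algorithm of any cited paper: the penalty |η_j| is replaced near the current iterate by a quadratic with curvature lam_j / |η_j|, a Newton-Raphson step is taken on the resulting surrogate, and any coefficient whose magnitude falls below `eps` is set to 0 and dropped for good. The name `lqa_logistic` and the defaults for `eps`, `tol`, and `n_iter` are hypothetical.

```python
import numpy as np

def lqa_logistic(Z, y, lam, eta_start, n_iter=100, eps=1e-6, tol=1e-8):
    """Newton-Raphson with a local quadratic approximation of an L1 penalty.

    lam may be a scalar or a per-coefficient array (e.g. adaptive weights
    already multiplied in). Coefficients at or below eps in magnitude are
    fixed at 0 for all later iterations.
    """
    eta = np.array(eta_start, dtype=float)
    lam = np.broadcast_to(np.asarray(lam, dtype=float), eta.shape)
    active = np.abs(eta) > eps
    eta[~active] = 0.0
    for _ in range(n_iter):
        if not active.any():
            break
        Za, ea = Z[:, active], eta[active]
        mu = 1.0 / (1.0 + np.exp(-(Za @ ea)))       # fitted probabilities
        D = np.diag(lam[active] / np.abs(ea))        # LQA penalty curvature
        H = Za.T @ (Za * (mu * (1.0 - mu))[:, None]) + D
        g = Za.T @ (y - mu) - D @ ea                 # surrogate gradient
        eta_new = eta.copy()
        eta_new[active] = ea + np.linalg.solve(H, g)
        small = np.abs(eta_new) <= eps
        eta_new[small] = 0.0
        active = active & ~small                     # zeros never re-enter
        converged = np.max(np.abs(eta_new - eta)) < tol
        eta = eta_new
        if converged:
            break
    return eta

# Toy data: intercept plus one binary predictor (made-up numbers).
Z = np.array([[1, 0]] * 4 + [[1, 1]] * 4, dtype=float)
y = np.array([0, 0, 0, 1, 0, 1, 1, 1], dtype=float)
print(lqa_logistic(Z, y, lam=0.5, eta_start=[-0.5, 1.0]))
```

With `lam=0` the update reduces to plain Newton-Raphson for logistic regression. The `active` mask makes explicit the drawback noted above: once a coefficient is set to 0 it cannot return.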

- Richardson S, Abellan JJ, Best N (2006) Bayesian spatio-temporal analysis of joint patterns of male and female lung cancer risks in Yorkshire (UK). Stat Methods Med Res 15: 385-407.
- Jin X, Carlin BP (2005) Multivariate parametric spatiotemporal models for county level breast cancer survival data. Lifetime Data Anal 11: 5-27.
- Earnest A (2010) Addressing issues in sparseness, ecological bias and formulation of the adjacency matrix in Bayesian spatio-temporal analysis of disease counts. Queensland University of Technology.
- Hartley DM, Barker CM, Le Menach A, Niu T, Gaff HD, et al. (2012) Effects of temperature on emergence and seasonality of West Nile virus in California. Am J Trop Med Hyg 86: 884-894.
- Cressie NAC (1993) Statistics for Spatial Data. 2nd edn, J Wiley.
- Rue H, Held L (2005) Markov Random Field: Theory and Application. Chapman and Hall, London.
- Schabenberger O, Gotway CA (2004) Statistical Methods for Spatial Data Analysis. Chapman and Hall/CRC.
- Zhu J, Zheng Y, Carroll AL, Aukema BH (2008) Autologistic regression analysis of spatial-temporal binary data via Monte Carlo maximum likelihood. J Agric Biol Environ Stat 13: 84-98.
- Bandyopadhyay D, Reich BJ, Slate EH (2009) Bayesian modeling of multivariate spatial binary data with applications to dental caries. Stat Med 28: 3492-3508.
- Zhu J, Huang HC, Reyes PE (2010) On selection of spatial linear models for lattice data. J R Stat Soc Series B Stat Methodol 72: 389-402.
- Reyes PE (2010) Selection of spatial and spatial-temporal linear models for lattice data. University of Wisconsin, Madison.
- Aukema BH, Carroll AL, Zheng Y, Zhu J, Raffa KF, et al. (2008) Movement of outbreak populations of mountain pine beetle: Influences of spatiotemporal patterns and climate. Ecography 31: 348-358.
- Zhu J, Huang HC, Wu J (2005) Modeling spatial-temporal binary data using Markov random fields. J Agric Biol Environ Stat 10: 212-225.
- Zou H (2006) The adaptive lasso and its oracle properties. J Am Stat Assoc 101: 1418-1429.
- Besag J (1975) Statistical analysis of non-lattice data. Statistician 24: 179-195.
- Geyer CJ, Thompson EA (1992) Constrained Monte Carlo maximum likelihood for dependent data. J R Stat Soc Series B Stat Methodol 54: 657-699.
- Huffer FW, Wu H (1998) Markov Chain Monte Carlo for autologistic regression models with application to the distribution of plant species. Biometrics 54: 509-524.
- Wang H, Li G, Tsai CL (2007) Regression coefficient and autoregressive order shrinkage and selection via the lasso. J R Stat Soc Series B Stat Methodol 69: 63-78.
- Madigan D, Ridgeway G (2004) Discussion of "least angle regression" by Efron et al. Ann Stat 32: 465-469.
- Genkin A, Lewis DD, Madigan D (2007) Large-scale Bayesian logistic regression for text categorization. Technometrics 49: 291-304.
- Park MY, Hastie T (2007) L_{1}-regularization path algorithm for generalized linear models. J R Stat Soc Series B Stat Methodol 69: 659-677.
