Received date: April 22, 2014; Accepted date: April 24, 2014; Published date: April 28, 2014
Citation: Yue YR, Wang XF (2014) Spatial Gaussian Markov Random Fields: Modelling, Applications and Efficient Computations. J Biomet Biostat 5:e128. doi:10.4172/2155-6180.1000e128
Copyright: © 2014 Yue YR, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are are credited.
Visit for more related articles at Journal of Biometrics & Biostatistics
A powerful modelling tool for spatial data is the framework of Gaussian Markov random fields (GMRFs), which are discrete domain Gaussian random fields equipped with a Markov property. GMRFs allow us to combine the analytical results for the Gaussian distribution as well as Markov properties, thus allow for the development of computationally efficient algorithms. Here we briefly review popular spatial GMRFs, show how to construct them, and outline their recent developments and possible future work.
Gaussian Markov random fields; Markov chain Monte Carlo; spatial statistics; Cholesky factorization; Integrated nested laplace approximation
Gaussian Markov random fields (GMRFs) are powerful and important tools for modeling spatial data. They have been widely used in different areas of spatial statistics including disease mapping, spatialtemporal modeling and image analysis. Constructing a GMRF is straightforward: it is just a finite-dimensional random vector following a multivariate Gaussian distribution with additional conditional independence properties, hence termed as Markov. It is convenient and invaluable to combine the analytical results for the Gaussian distribution and the Markov properties, which enables us to solve a large class of statistical models. Historically, the most common method to make inference for the parameters in GMRFs has been maximum likelihood [1,2]. The behavior of maximum likelihood estimator is asymptotic in nature and their small sample behavior is often unknown. On the other hand, the Markov property has become a requirement for constructing efficient Markov chain Monte Carlo (MCMC) algorithms for GMRFs. Rue  showed that the Markov property makes it possible to apply numerical methods on sparse matrices. He proposed fast algorithms for sampling and evaluating the log-density of a GMRF, and conducted efficient MCMC-based inferences. Rue and Held  provides a comprehensive account of the main properties of GMRFs, emphasizes the strong connection between GMRFs and numerical methods for sparse matrices, and outlines various applications of GMRFs for statistical inference (e.g., spatial statistics, time-series analysis, graphical models).
More specifically, a GMRF is a Gaussian random vector x = (x1,… ,xn)’with Markov properties: for some i≠j, xi and xj are independent conditional on other variables x-ij. It is defined over a set of discrete indexed locations connected by a graph labelled by G, which shows the conditional independence properties of x. We say x is a GMRF with respect to G with mean μ and precision (inverse covariance) matrix Q if its density has the form
The Markov properties in x are conveniently encoded in matrix Q: Qij= 0 if and only if xi and xj conditionally independent. The pattern of zero and non-zero elements in such a matrix is called its sparsity structure. The total number of non-zero elements divided by the total number of elements is called the density of the matrix. In most cases only O(n) of the n2 entries of Q are not zeroes, so Q has a very low density and is a sparse matrix. This allows for a fast Cholesky decomposition of Q as LL0, where L is the lower triangular matrix that inherits the sparseness of Q. Therefore, only non-zero terms in L are computed and the nodes can be reordered to decrease the number of the non-zero terms. The typical cost of this decomposition depends on the dimension of the GMRF: it is O(n) for one dimension, O(n3/2) for two dimensions and O(n2) for three dimensions. Using the Cholesky triangle, it is easy to produce random samples from a GMRF, and compute log-density of (1) and marginal variances; see Rue and Held  for more technical details.
GMRFs have been used in a wide range of common statistical models such as generalized linear (mixed) models, generalized additive models, dynamic linear models, and spatial and spatiotemporal models, among others. Those models can be written as hierarchical models which assume an-dimensional latent GMRF with a sparse precision matrix to be point-wise observed through nd conditional independent data y. The estimation of the models takes advantage of modern techniques for sparse matrices. We here focus on the (two dimensional) spatial GMRFs that have been extensively used in the hierarchical analysis for spatial data . We briefly review popular spatial GMRFs, show how to construct them, and outline their recent developments and possible future work.
From definition (1), we can see that a GMRF is completely determined by its precision matrix Q: different Q’s bring different Markov properties to the field. To build matrix Q, one has to specify a particular neighbourhood structure, corresponding to the type of the spatial data. Below we show how to construct a few GMRFs that have been widely used in spatial statistics.
GMRFs on irregular lattices
An important type of spatial data is so called areal data, where the locations of observations are geographic regions (e.g., the states of the US) with adjacency information. For areal spatial data, the firstorder GMRFs on irregular lattices are often preferred. To construct neighbourhood structure, we may define two regions are neighbors if they share a common border. Other ways to define neighbors are also possible. Between neighboring regions i and j, a Gaussian increment is defined as xi-xj~N(0,1/wijτ), where wij are positive and symmetric weights. We can let wij= 1 if we believe region i equally depends on its neighbors, or let wij be, for example, the inverse Euclidean distance between region centroids if we think the neighbors somehow contribute differently. Assuming the increments are independent, the density of this GMRF is given by:
Where i ~ j denotes the set of all unordered pairs of neighbors. The corresponding precision matrix Q has entries
Where the summation over neighbors of node i on a lattice. Since the sum of each row is zero, Q is singular with rank n-1. It is easy to see that the conditional distribution of xi is
where the conditional mean of xi depends on its neighboring nodes x j through weights wi j and its conditional variance depends on weight sum wi+.
GMRFs on regular lattices
Another important type of spatial data are point-referenced data, where the spatial locations are points with known coordinates. When the coordinates constitute a regular lattice, one can construct first and second-order GMRFs as described below.
For a regular lattice with n=n1n2 nodes, let (i,j) denote the node in the ith row and jth column. In the interior, we can define the nearest four nodes of (i,j) as neighbors, i.e., (i+1,j), (i-1,j), (i,j+1), (i,j-1). Along the boundaries, we define the neighbors of (i,j) to be the two or three adjacent nodes. For example, the neighbors of (i,j) are (i,j+1) and (i+1, j) if (i,j) is the upper left corner. Without further weights, the corresponding precision matrix and the full conditionals of xi are given in (3) and (4) with wij=1, respectively. By weighting the horizontal and vertical neighbors differently, this GMRF can be extended to an anisotropic model . In practice the first-order GMRF models may not provide sufficient spatial smoothing as we need. To increase the smoothness of the field, we can use higher-order neighborhood structures. One way to build a second-order neighborhood structure for xijin the interior is based on the increment
which is the sum of second-order differences in vertical and horizontal directions. For the nodes on or near the boundaries we need different increments; see Yue and Speckman  for details. By assuming all the increments follow independent Gaussian distributions, we build a second-order GMRF on a regular lattice. The co-efficients of the corresponding Q can be found by expanding the quadratic terms of those increments. Using graphical notation, the conditional distribution of xij in the interior has mean and precision
where the locations denoted by a ‘ • ’ represent the neighbors that xi j depends on and the number in front of each grid denotes the weight given to the corresponding ‘ ○ ’ locations. This second-order GMRF is closely related to thin-plate spline , which has been widely used as a spatial smoother [1,8]. Yue and Speckman  showed that this GMRF can also be derived by discretizing the penalty function of the thinplate spline.
Continuous indexed GMRFs
The GMRFs mentioned above are all discrete indexed in nature and thus only work appropriately for the spatial data on lattices. If the observations are measured at irregularly-spaced locations, one often bins the locations to a regular lattice first and then apply the GMRF to the summary statistics calculated for the grids. As a result, we lose information from binning process and the the spatial resolution of the lattice significantly affects the inference. Another problem is that a regular GMRF can only capture the smoothness of a spatial field, but not its structure of spatial correlations, because it defines the precision matrix (not covariance matrix). Thus, we cannot make inference regarding the correlation between two locations.
To address the issues mentioned above, Lindgren et al.  derived a new class of continuous indexed GMRFs, which are explicit mappings of Mat´ern Gaussian fields that have been extensively used in statistical modeling of spatial data. Letting x(u) be a realization of spatial field x at location , such GMRFs are derived by solving a stochastic partial differential equation (SPDE)
where is the two-dimensional Laplacian operator, κ>0 is the spatial scale, α> 0 controls the smoothness of the realizations, τ>0 controls the variance, and wis the spatial Gaussian white noise. The link to the Mat´ern smoothness n and variance σ2 is ν= α-d/2 and σ2=Γ(ν)(Γ(α)(4π)d/2κ2ντ2)-1, where d is the spatial dimension. A measure of the spatial range can be empirically derived as Lindgren et al.  solve the SPDE (6) using finite element method.They first approximate x(u) with piecewise linear basis functions defined on a triangular domain, and then turn (6) into a system of linear equations to derive the GMRF solution.
As shown in Lindgren et al. , the derived GMRF is the best piecewise linear approximation to the continuous solution to the SPDE given a triangulation. Since it is a GMRF representation of Mat´ern fields, it allows us to capture both spatial correlation and spatial smoothness in a spatial process. Another exciting aspect of such models is their flexibility. There is no conceptual or computational barrier to extending them to being, for example, non-stationary, multivariate and spatial-temporal GMRFs. It is even possible to construct them on the sphere and other manifolds.
The smoothness of a GMRF is determined by the scale of increments, which is often invariant across the space. It means that the GMRF provides the same amount of smoothing at every location. This, however, limits the GMRF to stationary spatial processes only. We here present a few adaptive GMRFs that are recently developed to deal with non-stationary spatial processes.
Brezger et al.  extended the first-order GMRF in (2) to being adaptive by letting weights wi j vary with locations. More specifically, they take independent gamma priors on each wij, that is wij~ Gamma (1/2,1/2), making the marginal distribution of difference (xi-xj) a Cauchy distribution. This approach allows varying strength of interactions between neighboring sites i ~ j, but without spatial prior structure.
Similarly, Yue and Speckman  extended the second-order GMRF in (5) to being non-stationary by spatially adaptive modeling its conditional variance, i.e., Var (xij|x-ij) = (20τij)-1. Thus, a small value of τij (large variance) leads to less smoothing, appropriate when xij shows increased local variation. Then, let τij = τγij, so that τ is the global scale parameter and γij controls the local smoothing of the field. Finally, a first-order spatial GMRF is taken as a prior on log(γij) to make the smoothing vary over the space. The resulting GMRF is able to capture both local and global features of a spatial process while retains the nice Markov properties for computation. Yue et al.  proposed a similar adaptive GMRF but with independent gamma priors on γij.
The SPDE models can be generalized to be non-stationary in several ways Lindgren et al. . Let spatial scale κ and precision τ in (6) to depend on the coordinate u, and allow them vary slowly through log linear models. Therefore, one can analyze how the spatial nonstationarity depends on certain covariates. Bolin and Lindgren  used a nested SPDE approach to develop a large class of non-stationary covariance functions, such as oscillating covariance functions. Fuglstad et al.  introduced a new class of non-stationary spatial GMRFs with varying local anisotropy by adding a diffusion matrix that varies with position to the Δ operator in (6).
The adaptive GMRF models mentioned above have been successfully applied to the function magnetic resonance imaging (fMRI) data [10,13], the precipitation data [6,14], the global ozone data  and the neuroimaging meta-analysis .
Spatial-temporal GMRF models are extensions of spatial GMRF models to account for additional temporal variation. Think about a sequence of T graphs in time and let xit denote the ith node in tth graph. A common extension to a spatial-temporal GMRF is to take into account temporal neighbors in addition to spatial neighbors. The temporal neighbors can be the same nodes in the previous and next graphs, that is, xit-1 and xit+1. The precision matrix of the corresponding GMRF can be written as where QT and QS are precision matrices in time and space, respectively. We can use conditional autoregressive process (Besag, 1974)  to model temporal correlations and the spatial GMRFs presented above to model spatial correlations. Then, both QT and QS are sparse matrices, making Q sparse as well. Cameletti et al.  employed a model of this type to perform spatial-temporal analysis on particulate matter concentration data.
It is possible to build a multivariate GMRF by extending the SPDE framework considered above . The idea is to replace a single SPDE with a system of SPDEs. The co-variances matrices constructed with this approach are automatically symmetric positive definite. The sparse precision matrix of this multivariate GMRF facilitates its implementation to large data sets. Hu et al.  have shown that these models are related (but not equivalent) to the multivariate Mat´ern fields constructed by.
The GMRFs are often implemented in a class of structured additive regression models, which are quite flexible and have been extensively used . In these models, the response variable yi is assumed to belong to an exponential family, where the mean μi is linked to a structured additive predictor ηi through a link function g(•), so that g(μi) = ηi. The predictor hi accounts for various covariate effects in an additive way:
Where fj’s are unknown functions of the covariates μi, βk’s represent the linear effect of zk, and the εi’s are error terms. The structured additive regression models cover a large class of spatial and spatiotemporal models, geostatistical and geoadditive models. The spatial GMRFs can be used to model fj when the covariate uj are spatial locations. The model (7) can handle various types of responses (e.g., continuous, binary and count data) and very different forms that the unknown fj can take (e.g., random, nonlinear and spatial effects). As a matter of fact, most Bayesian models used in spatial statistics are of this form .
To fit the spatial models of form (7), one can perform standard MCMC methods to simulate the posterior distributions. Using GMRFs as priors in the Bayesian hierarchical models, it is feasible to construct accurate GMRF approximations to the full conditionals (if they are not Gaussian), based on which efficient blockwise Metropolis-Hastings algorithms can be used to explore the posteriors . One is able to take advantage of the sparseness of GMRFs and implement the fast Cholesky factorization algorithms to speed up the MCMC algorithms. This can now be easily done using an R package named spam developed by Furrer and Sain . Recently, Rue et al.  has introduced a novel Bayesian inference tool based on integrated nested Laplace approximations (INLA). The INLA method can directly compute very accurate approximations to the posterior marginal distributions for latent Gaussian models. It is much faster than MCMC and can be easily implemented using R-INLA package (http://www.r-inla.org). The package contains a variety of popular GMRFs that are readily to be used.