Assessing Univariate and Bivariate Spatial Clustering in Modelled Disease Risks

Models for spatial variation in relative disease risk often consider posterior probabilities of elevated disease risk in each area, but for health prioritisation, the interest may also be in the broader clustering pattern across neighbouring areas. The classification of a particular area as high risk may or may not be consistent with risk levels in the surrounding areas. Local join-count statistics are used here in conjunction with Bayesian models of area disease risk to detect different forms of disease clustering over groups of neighbouring areas. A particular interest is in spatial clustering of high risk, which can be assessed by high probabilities of elevated risk across both a focus area and its surrounding locality. An application considers univariate spatial clustering in suicide deaths in 922 small areas in the North West of England, extending to an analysis of bivariate spatial clustering in suicide deaths and hospital admissions for intentional self-harm in these areas. J o ur na l o f B iometrics & Bistatis t i c s ISSN: 2155-6180 Journal of Biometrics & Biostatistics Citation: Congdon P (2013) Assessing Univariate and Bivariate Spatial Clustering in Modelled Disease Risks. J Biomet Biostat 4: 161. doi:10.4172/21556180.1000161 J Biomet Biostat ISSN:2155-6180 JBMBS, an open access journal Page 2 of 9 Volume 4 • Issue 2 • 1000161 accumulated over iterations. Consider binary measures bi of disease risk for areas i=1,...,n, with bi=1 for elevated risk, bi=0 otherwise. For example, one may define bi=1 for ri>τr, where τr is a relative risk threshold (e.g., τr=1 or τr=1.25, and bi=0) otherwise. If relative risks have average 1, and τr=1 then the region-wide proportion of areas with elevated risk, E(bi)=π, will be approximately 0.5. In spatial disease applications with correlated relative risks ri, binary indicators such as bi=I(ri>1) will also tend to be spatially correlated. Region-wide spatial clustering in the bi can be measured by join-count statistics, based on concordance in risk status between area pairs. Thus, a join-count measuring clustering in high risk across a region is 11 1 1 , n n ij i j i j J w b b = = = ∑ ∑ with 0.5J11 known as the BB statistic [6]. Differing health status in neighbouring area-pairs is measured by a weighted total of joins with bi and bj discordant, which can be denoted 2 10 1 1 ( ) . n n ij i j i j J w b b = = = − ∑ ∑ Observed join-count totals can be compared with totals expected under a null hypothesis of spatial independence [7]. The expected total of concordant joins under the hypothesis of no spatial dependence is E(J11)=S0π 2 where 0 1 1 , n n ij i j S w = = = ∑ ∑ and J11 will exceed E(J11) when there is spatial patterning in the disease outcome [6]. Similarly the observed J10 can be compared with E(J10)=2S0π(1-π), and will be less than this expected total when there is disease clustering. It may be noted that in a modelling application with ri unknown, the indicators bi (and related parameters such as π) are also unknown, and sampled at each MCMC iteration. A localised set of join-count statistics (with area i as the focus) can be used to decide whether area i and nearby areas form a high risk cluster, or demonstrate an alternative risk pattern in the locality. For measuring joint high risk, with both area i and its neighbouring areas being high risk, one has 11 1 1 ( 1) ( 1) , n n i i ij j i ij j j j J I b w I b b w b = = = = = = ∑ ∑ where I(A)=1 if condition A holds, and I(A)=0 otherwise. When the focus is on area i, it is relevant to distinguish discordant high-low risk pairings (bi=1, bj=0) from low-high risk pairings (bi=0, bj=1). The relevant local join-count statistics in these cases are then 10 1 1 ( 1) ( 0) (1 ), n n i i ij j i ij j j j J I b w I b b w b = = = = = = − ∑ ∑


Introduction
Spatial analyses of disease incidence or mortality in small areas are often used to identify elevated risk. For example, posterior probabilities of elevated disease risk in each area may be obtained from Bayesian models of area disease counts [1][2][3]. However, elevated risk identified in a particular area may not extend to nearby areas, whereas spatial clustering of high risk across several adjacent or nearby areas may be of particular importance for health policy prioritisation. Identification of high risk localities may be more precise than identification of elevated risk in individual areas, especially for less frequent outcomes. Interest in spatial clustering of high risk may extend to the geographic pattern of interrelated disease outcomes (e.g., different forms of mental illness or different cardiovascular diseases).
For identifying such clustering in conjunction with a disease model, local indices of spatial association, and particular adaptations of them, become relevant. The present analysis considers use of local join-count statistics to detect high risk locality clustering for disease count data aggregated into small areas. These statistics are used in conjunction with an area disease model, and Bayesian inferences based on updating of prior assumptions using Markov chain Monte Carlo (MCMC) estimation.
The proposed join-count statistics methodology is initially applied to clustering of suicide deaths in 922 small areas in NW England. The relative mortality risks are modelled using a Bayesian spatial convolution prior [4] with alternative forms of spatial interaction (e.g., binary adjacency vs distance decay) between neighbouring areas considered. Results obtained using the join count method in conjunction with a spatial model are compared with the widely used (albeit non-Bayesian) spatial scan method, as implemented in the FlexScan and SaTScan packages. The application is then extended to analyze bivariate clustering in suicide deaths and self-harm hospitalisations.

Methods: Join-Count Measures for Local Clustering in Risk
In Bayesian small area disease applications, the classification of an area as high risk typically depends on unknown parameters. Consider disease counts (y i ,i=1,...n) with expected values e i obtained by multiplying area populations by the region-wide disease rate, and with where the r i are relative disease risks in area i with average 1. Such risks often show spatial correlation, and ignoring such correlation can lead to biased and inefficient inference, as the observations are not independent [5]. A widely applied model (known as the convolution model) involves two sets of random effects: (a) spatially structured effects s i to represent spatially correlated risks and following a conditional autoregressive (CAR) scheme [4], If there are no covariates to model spatial patterning of risks, the spatial random effects represent spatial pattern in the disease outcome, whereas otherwise they capture spatial structure in the residuals. To assess actual clustering, one may obtain measures such as Moran's I for the s i ; this entails deriving the index at each MCMC iteration, with posterior inferences (e.g. credible intervals) based on the values accumulated over iterations.
Consider binary measures b i of disease risk for areas i=1,...,n, with b i =1 for elevated risk, b i =0 otherwise. For example, one may define b i =1 for r i >τ r , where τ r is a relative risk threshold (e.g., τ r =1 or τ r =1.25, and b i =0) otherwise. If relative risks have average 1, and τ r =1 then the region-wide proportion of areas with elevated risk, E(b i )=π, will be approximately 0.5.
In spatial disease applications with correlated relative risks r i , binary indicators such as b i =I(r i >1) will also tend to be spatially correlated. Region-wide spatial clustering in the b i can be measured by join-count statistics, based on concordance in risk status between area pairs. Thus, a join-count measuring clustering in high risk across a region is with 0.5J 11 known as the BB statistic [6]. Differing health status in neighbouring area-pairs is measured by a weighted total of joins with b i and b j discordant, which can be denoted Observed join-count totals can be compared with totals expected under a null hypothesis of spatial independence [7]. The expected total of concordant joins under the hypothesis of no spatial dependence is and J 11 will exceed E(J 11 ) when there is spatial patterning in the disease outcome [6]. Similarly the observed J 10 can be compared with E(J 10 )=2S 0 π(1-π), and will be less than this expected total when there is disease clustering. It may be noted that in a modelling application with r i unknown, the indicators b i (and related parameters such as π) are also unknown, and sampled at each MCMC iteration.
A localised set of join-count statistics (with area i as the focus) can be used to decide whether area i and nearby areas form a high risk cluster, or demonstrate an alternative risk pattern in the locality. For measuring joint high risk, with both area i and its neighbouring areas being high risk, one has 11 1 1 The relevant local join-count statistics in these cases are then The count J 10i captures situations where area i is high risk, but nearby areas are mostly low risk, so that area i may be termed a high risk local outlier. The count J 01i would be elevated when area i itself does not have high risk, but neighbouring areas are mostly high risk. Finally, represents localities where both the focus and surrounding areas are low risk. The expected number of common high risk joins with area i as the focus (i.e. area i is a high risk cluster member) is E(J 11i )=S 0i π 2 , while E(J 10i )=S 0i π(1-π) and E(J 00i )=S 0i (1-π) 2 .
Consider a sequence t=1,....T of MCMC samples. From the indicators ( ) t i b of elevated risk at each MCMC iteration, one may estimate probabilities of elevated risk in area i specifically (without regard to the broader locality), namely One may also monitor join-counts indicating locality-wide elevated with posterior estimates  ( ) 11 11 1 / .
The estimated proportion of joins in the locality centred on area i that are joint high risk, namely provides a summary index of high risk across that locality. By contrast, the proportion of joins centred on area i that are (1,0) pairs provides an index that area i is a high risk outlier relative to the broader locality.
The join-counts 11i J and 10i J can be written as ( 1) i ij j j respectively, from which it follows that 11 10 0 π will be elevated when both  i H is elevated, and risk in the surrounding locality is elevated also. By contrast,  10i π will be elevated when  i H is elevated, but risk in the surrounding locality is relatively low. Similarly, 01 00 = ∑ ∑ of modelled relative risks across localities L i that include both the focus area i and areas adjacent to it. These weighted averages can be monitored during the MCMC updating and the probabilities that R i exceed 1 obtained. However, this test may be affected by unusually high relative risks in one or two areas within the locality, or by situations where a low risk area is surrounded by high risk areas.
Another option is to compare the sampled J 11i at each MCMC iteration to the expected count S 0i π 2 under a no clustering hypothesis, and obtain estimates of the probabilities

Methods: Clustering in Bivariate Risk
An extension of the proposed join-count statistics is in the detection of elevated bivariate risk across localities, based on local join-counts for the joint high risk binary event. Methods for bivariate spatial association have been proposed [8], and bivariate LISA methods indicate association between the value for one variable at a given location and the average of another variable at neighbouring locations. However, there is no widely applied cluster detection method (e.g. spatial scan technique) for bivariate outcomes. Let A and B denote two health outcomes with one option for priors on the random effects being To assess high risk clustering in both events jointly, the bivariate local join-counts can be monitored. The estimated probability of elevated bivariate risk in area i specifically is but this elevated bivariate risk may not apply across the broader locality. However, the estimated proportion of bivariate joins in the locality centred on area i that are joint high risk, namely provides a summary index of high bivariate risk across that locality. For detecting isolated elevated bivariate risk (high risk in the focus area but not extending to the broader locality), the relevant join count is Just as implications about smoothed relative risks may depend on the form of spatial interaction assumed [5], so may the inferences about clustering patterns. Implications about risk patterns for interdependent events, especially when one event is less frequent than another, may also be influenced by the form of random effects assumption (and the extent to which there is borrowing of strength). For example, clustering inferences in the less common outcome may be affected if a bivariate spatial prior (allowing correlation in spatial risks between outcomes within areas) is adopted instead of separate univariate spatial priors as in equation (3).

Results: Locality Risk Patterns in Suicide Deaths in NW England
While relatively rare, suicide is a major reason for premature mortality. To assess risk patterns in individual areas as compared to their broader localities, we consider suicide deaths y i over the period 2006 to 2010 in 922 small areas (Middle Level Super Output Areas or MSOAs) across the North West of England (Table 1). These areas are designed to be of similar size in population terms, with an average population of 7500. Expected deaths e i are based on applying an England wide schedule of age specific suicide rates to MSOA populations, with scaling applied to ensure The average mortality count is 3.5, but event totals y i in individual areas vary widely, and moment estimates of relative risk (sometimes called standard mortality ratios or SMRs), also vary widely. Such moment estimates are unreliable with variance instability when there are small numbers of suicide deaths, as in many MSOAs [9,10].
To provide stabilised estimates of relative risk including spatial borrowing of strength, a convolution model is applied with N τ A flat prior on 0 β is assumed, and a gamma prior with index 1 and shape 0.001 on the inverse spatial variance 2 1 / σ [11,12]. Convergence is improved by linking the variance parameters; thus 2 where ρ is assigned an exponential prior with rate 1. Inferences in this and subsequent models are based on the second halves of two chain runs of 10,000 iterations, with convergence assessed according to BGR statistics [13].
Localities are defined as areas adjacent to area i, though the weighting attached to different areas within such localities can be varied. To assess possible sensitivity regarding inferences about locality risk, alternative assumptions about w ij are investigated: equal weighting of all adjacent areas as compared to alternative forms of inverse distance > and local joincounts

Inferences for Locality Risks
Consider first a binary adjacency assumption for the w ij (w ij =1 if areas are adjacent, w ij =0, otherwise), under which the Moran spatial correlation index for the s i is obtained as 0.56 with 95% interval (0.47, 0.66). Figure 1 maps out the posterior mean relative risks r i across the region, though this map tends to be dominated by low density rural areas (such as in the Lake District in the northern part of the map). Subsequently higher resolution maps are used to depict risk and clustering patterns, since in the case study, high risk clustering tends to be in densely populated urban areas. Maps of the administrative geography of the region (including maps of MSOAs) are available at the UK Map Collection page http://www.ons.gov.uk/ons/guide-method/ geography/beginner-s-guide/maps/index.html.   Probabilities that the average locality risk R 1i exceeds 1 are all over 0.968. The average locality risks R 1i may be used to confirm what the join-count statistics indicate, in particular the  11i π statistics, but in themselves are not conclusive about elevated risk common to both a focus area and areas around it. Weighted averages such as R 1i may be affected by unusually high relative risks in a subset of areas within the locality, whereas  11i π is specifically focussing on elevated risk status across all areas in a locality. An example is provided by area 28 which has 1 suicide death against 4.4 expected, with an estimated exceedance probability H 28 =0.36. However, the areas adjacent to area 28 have 34 deaths in relation to 20 expected, with a probability of 0.98 that the locality wide R 1, 28 exceeds 1 (where the locality encompasses area 28). Note that this type of pattern would be detected by the join-counts J 01i and corresponding probabilities 01 Delineation of high risk localities using local join-counts in conjunction with a relative risk model such as equation (1) contrasts with the spatial scan procedure which is applied to observed area disease counts without any modelling preliminaries, for example, smoothing or borrowing strength procedures to reduce unreliability in fixed effects relative risk estimates. Despite this fundamental difference, the localities of table 1 can be compared with clusters identified by the SaTScan and FleXScan packages developed by Kulldorff [14] and Tango and Takahashi [15], respectively. It also implies to the work done by Holowaty et al. [1] and Więckowska et al. [16]. SaTScan identifies five high rate clusters with Monte-Carlo p-values under 0.2. The average  11i π is 0.666 for the 29 areas in these five clusters, and the overlap with the local join-count method is apparent in that only 2.2% of the 922 areas have The local join-count procedure also provides estimates of  10 , i π which will be elevated when  i H is elevated, but risk in the surrounding locality is relatively low. These may be considered as local high risk outliers, discordant in terms of health status from their neighbours. To demonstrate the contrasting risk patterns between the focus area and surrounding areas, we define A i , encompassing areas adjacent to the focus area i but not including that area.

Locality Risk Patterns under Alternative Spatial Weights
Inferences from convolution or other area disease count models may be affected by the form of spatial interaction assumed. Two alternatives to binary adjacency are considered, which involve down weighting areas at greater distance from the focus area (with inter-area distances based on population centroids). These assume distance decay according (γ>0) with values of γ=0.5 and γ=1 considered. These values are based on a preliminary analysis using model (4) to find an optimal value for γ using a discrete prior over values {0, 0.1, 0.2,..., 1.5}, which produced a posterior mean for γ of 0.69.
We focus on elevated locality risk in particular, and table 3 summarises locality risk patterns under the two distance decay options.
The table considers only areas with  11i π over 0.70, ranked by  11 . i π The weighted averages of modelled relative risks across localities L i (centred on and including area i) now adjust also for distance decay as well as expected deaths, namely

Results: Bivariate Spatial Clustering under Alternative Spatial Priors
We now consider local join-counts for detecting bivariate risks that are both significantly elevated and also spatially clustered. Consider    It can be seen from both figures that high suicide and self-harm rates occur widely through these three districts, but that elevated levels of both self-harm and suicide together are apparent in areas coded 330, 333, 334, 336, and 340 (in the centre of the southern boundary), and also in a north-west aligned band of Wigan MSOAs in the central part of the map.
The lower panel of table 4 shows two new cluster centres (197,492), as compared to the upper panel, these being areas where selfharm risk (both observed and modelled) is high, and estimated suicide risk is pulled towards the risk for the more common outcome under the bivariate spatial prior. Using extra information about risk patterns provided by a more frequent outcome (or by intercorrelation between outcomes in general) is generally regarded as beneficial. This is a form of borrowing strength [19] enabling stronger inferences for an infrequent outcome. However, analysis such as that here, of potential impacts on inferences about clustering, may provide an additional facet for assessing sensitivity to alternative spatial priors.

Criteria for Cut-off Points
Considering the results of both the univariate and bivariate clustering analyses together, one may set out some criteria for choosing focus areas for high risk localities. The choice of cut-off for  11i π (or  11 AB i π for bivariate outcomes) should be based on the profile of their ranked values, in conjunction with information about risk variation (e.g. the profile of H i and R i ). The necessary interconnection with H i follows from the relation    11 10 .
Health outcome data for small areas vary considerably in the extent to which significant variations in area relative risk (and hence locality clustering) can be detected and this affects cut-off choice. For example, for the relatively rare suicide outcome, there are only 18 areas with Whereas suicide is a rare outcome, self-harm is around 25 times more frequent. When a univariate clustering analysis (comparable to that carried out for completed suicide and reported on above) is carried out for self-harm, there are 275 MSOAs with H i exceeding 0.95, and 16 MSOAs with  11 0.9, i π > so a higher cut-off point could be used to detect high risk clusters for this outcome.  For the bivariate outcome analysis (suicide and self-harm) without borrowing of strength between outcomes (e.g., Table 4  π > was used. The implications of using a slightly lower cut-off point could be considered, since even in this analysis, the locality relative risks R Ai and R Bi are significantly elevated (above 1) at lower values of  11 AB i π than the illustrative cutoff taken.
It follows from the above discussion that there are no simple rules for a low threshold  11i π or  11 AB i π below which clustering is implausible. It depends on the profile of H i and R i as well as on the profile of  11i π . Also relevant is the relative size of  11i π and  10 , i π the latter being the probability of a high risk area surrounded by low risk areas.
Where an area has Pr( 1| ) i R y > below 0.9, or H i below 0.75, or  10i π clearly exceeding  11i π then high risk clustering becomes considerably less likely.

Conclusions
Small area disease models often use exceedance probabilities for each individual area to make inferences about risk patterns. However, elevated risk in an area may not necessarily extend to the surrounding locality. This paper has sought to identify areas where elevated risk extends to the broader locality using local join-count statistics. These statistics can identify local outliers as well as high risk cluster centres, and can be applied to assess high risk clustering in more than one health outcome.
The procedure here can be used in conjunction with a disease model where risk status is unknown, so enabling the clustering implications of contrasting likelihood and prior assumptions (e.g. regarding pooling between areas, and outcomes) to be assessed. In particular, inferences about clustering patterns in two outcomes considered jointly may well be influenced by alternative assumptions, particularly when a spatial prior borrows strength over outcomes as well as areas. Sensitivity of clustering inferences to alternative priors for spatial effects, such as the approach of Leroux et al. [20] in contrast to the convolution prior, also provides an additional area of research.