ISSN: 2155-6180
Journal of Biometrics & Biostatistics

Assessing Univariate and Bivariate Spatial Clustering in Modelled Disease Risks

Peter Congdon*

Department of Geography, Queen Mary, University of London, Mile End Rd, London E1 4NS, UK

*Corresponding Author:
Peter Congdon
Department of Geography
Queen Mary, University of London
Mile End Rd, London E1 4NS, UK
E-mail: [email protected]

Received Date: January 24, 2012; Accepted Date: February 19, 2013; Published Date: February 23, 2013

Citation: Congdon P (2013) Assessing Univariate and Bivariate Spatial Clustering in Modelled Disease Risks. J Biomet Biostat 4:161. doi:10.4172/2155-6180.1000161

Copyright: © 2013 Congdon P. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Abstract

Models for spatial variation in relative disease risk often consider posterior probabilities of elevated disease risk in each area, but for health prioritisation, the interest may also be in the broader clustering pattern across neighbouring areas. The classification of a particular area as high risk may or may not be consistent with risk levels in the surrounding areas. Local join-count statistics are used here in conjunction with Bayesian models of area disease risk to detect different forms of disease clustering over groups of neighbouring areas. A particular interest is in spatial clustering of high risk, which can be assessed by high probabilities of elevated risk across both a focus area and its surrounding locality. An application considers univariate spatial clustering in suicide deaths in 922 small areas in the North West of England, extending to an analysis of bivariate spatial clustering in suicide deaths and hospital admissions for intentional self-harm in these areas.

Keywords

Relative risk; Spatial clustering; Bayesian; Bivariate clustering; Join-count

Introduction

Spatial analyses of disease incidence or mortality in small areas are often used to identify elevated risk. For example, posterior probabilities of elevated disease risk in each area may be obtained from Bayesian models of area disease counts [1-3]. However, elevated risk identified in a particular area may not extend to nearby areas, whereas spatial clustering of high risk across several adjacent or nearby areas may be of particular importance for health policy prioritisation. Identification of high risk localities may be more precise than identification of elevated risk in individual areas, especially for less frequent outcomes. Interest in spatial clustering of high risk may extend to the geographic pattern of interrelated disease outcomes (e.g., different forms of mental illness or different cardiovascular diseases).

For identifying such clustering in conjunction with a disease model, local indices of spatial association, and particular adaptations of them, become relevant. The present analysis considers use of local join-count statistics to detect high risk locality clustering for disease count data aggregated into small areas. These statistics are used in conjunction with an area disease model, and Bayesian inferences based on updating of prior assumptions using Markov chain Monte Carlo (MCMC) estimation.

The proposed join-count statistics methodology is initially applied to clustering of suicide deaths in 922 small areas in NW England. The relative mortality risks are modelled using a Bayesian spatial convolution prior [4] with alternative forms of spatial interaction (e.g., binary adjacency vs distance decay) between neighbouring areas considered. Results obtained using the join count method in conjunction with a spatial model are compared with the widely used (albeit non-Bayesian) spatial scan method, as implemented in the FlexScan and SaTScan packages. The application is then extended to analyze bivariate clustering in suicide deaths and self-harm hospitalisations.

Methods: Join-Count Measures for Local Clustering in Risk

In Bayesian small area disease applications, the classification of an area as high risk typically depends on unknown parameters. Consider disease counts (yi, i=1,…,n) with expected values ei obtained by multiplying area populations by the region-wide disease rate, and with $\sum_i e_i = \sum_i y_i$. Then the yi may be taken as Poisson,

$$y_i \sim \text{Poi}(e_i r_i)$$

where the ri are relative disease risks in area i with average 1. Such risks often show spatial correlation, and ignoring such correlation can lead to biased and inefficient inference, as the observations are not independent [5]. A widely applied model (known as the convolution model) involves two sets of random effects: (a) spatially structured effects si to represent spatially correlated risks and following a conditional autoregressive (CAR) scheme [4],

$$s_i \mid s_{[i]} \sim N\left(\frac{\sum_j w_{ij} s_j}{S_{0i}},\ \frac{\sigma^2}{S_{0i}}\right)$$

where wij are symmetric spatial interactions (with wii=0), s[i] represents the collection of s effects excluding si, and $S_{0i} = \sum_j w_{ij}$; and (b) iid random effects ui, typically normal with ui ~ N(0,τ²), to represent possible overdispersion, or excess variability in relation to the Poisson assumption. Then if there are covariates Xi relevant to explaining variations in area disease risk, one has:

$$\log(r_i) = X_i\beta + s_i + u_i \qquad (1)$$

If there are no covariates to model spatial patterning of risks, the spatial random effects represent spatial pattern in the disease outcome, whereas otherwise they capture spatial structure in the residuals. To assess actual clustering, one may obtain measures such as Moran’s I for the si; this entails deriving the index at each MCMC iteration, with posterior inferences (e.g. credible intervals) based on the values accumulated over iterations.
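The per-iteration monitoring of Moran's I can be sketched as follows. This is a minimal illustration rather than the paper's code: the four-area weight matrix `W` and the sampled effects `s_draws` are invented toy values standing in for MCMC output.

```python
def morans_i(W, s):
    """Moran's I for values s under a symmetric weight matrix W (zero diagonal)."""
    n = len(s)
    mean = sum(s) / n
    dev = [x - mean for x in s]
    S0 = sum(sum(row) for row in W)
    cross = sum(W[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / S0) * cross / sum(d * d for d in dev)

# Hypothetical setup: 4 areas on a line (0-1, 1-2, 2-3), binary adjacency.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

# Hypothetical sampled spatial effects s_i, one list per MCMC iteration.
s_draws = [
    [1.0, 2.0, 3.0, 4.0],
    [0.5, 0.7, -0.2, -1.0],
    [0.1, -0.1, 0.2, -0.2],
]

# Derive the index at each iteration, then summarise over iterations.
I_draws = [morans_i(W, s) for s in s_draws]
posterior_mean_I = sum(I_draws) / len(I_draws)
print(I_draws, posterior_mean_I)
```

In a real run the list `I_draws` would hold thousands of values, and a credible interval would be read off its empirical percentiles.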

Consider binary measures bi of disease risk for areas i=1,…,n, with bi=1 for elevated risk, bi=0 otherwise. For example, one may define bi=1 for ri>τr, where τr is a relative risk threshold (e.g., τr=1 or τr=1.25), and bi=0 otherwise. If relative risks have average 1, and τr=1, then the region-wide proportion of areas with elevated risk, E(bi)=π, will be approximately 0.5.

In spatial disease applications with correlated relative risks ri, binary indicators such as bi=I(ri>1) will also tend to be spatially correlated. Region-wide spatial clustering in the bi can be measured by join-count statistics, based on concordance in risk status between area pairs. Thus, a join-count measuring clustering in high risk across a region is

$$J_{11} = \sum_i \sum_j w_{ij}\, b_i b_j$$

with 0.5J11 known as the BB statistic [6]. Differing health status in neighbouring area-pairs is measured by a weighted total of joins with bi and bj discordant, which can be denoted

$$J_{10} = \sum_i \sum_j w_{ij}\, (b_i - b_j)^2$$

Observed join-count totals can be compared with totals expected under a null hypothesis of spatial independence [7]. The expected total of concordant joins under the hypothesis of no spatial dependence is E(J11)=S0π², where $S_0 = \sum_i \sum_j w_{ij}$, and J11 will exceed E(J11) when there is spatial patterning in the disease outcome [6]. Similarly the observed J10 can be compared with E(J10)=2S0π(1-π), and will be less than this expected total when there is disease clustering. It may be noted that in a modelling application with ri unknown, the indicators bi (and related parameters such as π) are also unknown, and sampled at each MCMC iteration.
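As a rough illustration of the region-wide comparison, the sketch below computes J11 and J10 for a toy map and sets them against their expectations under spatial independence. The adjacency matrix and indicator vector are invented, and the squared-difference form used for J10 is an assumption chosen to agree with E(J10)=2S0π(1-π).

```python
# Hypothetical setup: 4 areas on a line (0-1, 1-2, 2-3), binary adjacency.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
b = [1, 1, 0, 0]  # hypothetical high-risk indicators b_i at one iteration

def global_join_counts(W, b):
    """Return (J11, J10, S0, pi) for symmetric weights W and indicators b."""
    n = len(b)
    J11 = sum(W[i][j] * b[i] * b[j] for i in range(n) for j in range(n))
    # Assumed form: (b_i - b_j)^2 equals 1 only for discordant pairs.
    J10 = sum(W[i][j] * (b[i] - b[j]) ** 2 for i in range(n) for j in range(n))
    S0 = sum(sum(row) for row in W)
    pi = sum(b) / n
    return J11, J10, S0, pi

J11, J10, S0, pi = global_join_counts(W, b)
E_J11 = S0 * pi ** 2            # expected concordant high-risk joins
E_J10 = 2 * S0 * pi * (1 - pi)  # expected discordant joins
print(J11, E_J11, J10, E_J10)
```

Here the two high-risk areas are neighbours, so J11 lies above its expectation and J10 below it, which is the pattern described for clustered risk.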

A localised set of join-count statistics (with area i as the focus) can be used to decide whether area i and nearby areas form a high risk cluster, or demonstrate an alternative risk pattern in the locality. For measuring joint high risk, with both area i and its neighbouring areas being high risk, one has

$$J_{11i} = \sum_j w_{ij}\, I(b_i = 1,\ b_j = 1)$$

where I(A)=1 if condition A holds, and I(A)=0 otherwise. When the focus is on area i, it is relevant to distinguish discordant high-low risk pairings (bi=1, bj=0) from low-high risk pairings (bi=0, bj=1). The relevant local join-count statistics in these cases are then

$$J_{10i} = \sum_j w_{ij}\, I(b_i = 1,\ b_j = 0)$$

and

$$J_{01i} = \sum_j w_{ij}\, I(b_i = 0,\ b_j = 1)$$

The count J10i captures situations where area i is high risk, but nearby areas are mostly low risk, so that area i may be termed a high risk local outlier. The count J01i would be elevated when area i itself does not have high risk, but neighbouring areas are mostly high risk. Finally,

$$J_{00i} = \sum_j w_{ij}\, I(b_i = 0,\ b_j = 0)$$

represents localities where both the focus and surrounding areas are low risk. The expected number of common high risk joins with area i as the focus (i.e. area i is a high risk cluster member) is E(J11i)=S0iπ², while E(J10i)=S0iπ(1-π) and E(J00i)=S0i(1-π)².
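The four local join-counts for a given focus area can be sketched in a few lines. The toy adjacency matrix and indicator vector below are invented for illustration, and the function name is hypothetical.

```python
# Hypothetical setup: 4 areas on a line (0-1, 1-2, 2-3), binary adjacency.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
b = [1, 1, 0, 0]  # hypothetical high-risk indicators b_i

def local_join_counts(W, b, i):
    """Local join-counts J11i, J10i, J01i, J00i with area i as the focus."""
    counts = {"J11": 0, "J10": 0, "J01": 0, "J00": 0}
    for j in range(len(b)):
        # The key encodes (b_i, b_j); non-neighbours add weight 0.
        counts[f"J{b[i]}{b[j]}"] += W[i][j]
    return counts

focus_high = local_join_counts(W, b, 1)  # high-risk focus, mixed neighbours
focus_low = local_join_counts(W, b, 2)   # low-risk focus, mixed neighbours
print(focus_high, focus_low)
```

For any focus area the four counts sum to S0i, since every weighted join falls into exactly one of the four categories.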

Consider a sequence t=1,…,T of MCMC samples. From the indicators bi(t) of elevated risk at each MCMC iteration, one may estimate probabilities of elevated risk in area i specifically (without regard to the broader locality), namely

$$H_i = \frac{1}{T}\sum_{t=1}^{T} b_i^{(t)}$$

One may also monitor join-counts indicating locality-wide elevated risk, $J_{11i}^{(t)} = \sum_j w_{ij}\, I(b_i^{(t)} = 1,\ b_j^{(t)} = 1)$, with posterior estimates

$$\hat{J}_{11i} = \frac{1}{T}\sum_{t=1}^{T} J_{11i}^{(t)}$$

The estimated proportion of joins in the locality centred on area i that are joint high risk, namely

$$\pi_{11i} = \hat{J}_{11i} / S_{0i}, \qquad \hat{J}_{11i} = \frac{1}{T}\sum_{t=1}^{T} J_{11i}^{(t)}$$

provides a summary index of high risk across that locality. By contrast, the proportion of joins centred on area i that are (1,0) pairs

$$\pi_{10i} = \hat{J}_{10i} / S_{0i}, \qquad \hat{J}_{10i} = \frac{1}{T}\sum_{t=1}^{T} J_{10i}^{(t)}$$

provides an index that area i is a high risk outlier relative to the broader locality.

The join-counts J11i and J10i can be written as $J_{11i} = b_i \sum_j w_{ij} b_j$ and $J_{10i} = b_i \sum_j w_{ij}(1 - b_j)$ respectively, from which it follows that

$$J_{11i} + J_{10i} = b_i S_{0i}$$

and hence that

$$\pi_{11i} + \pi_{10i} = H_i$$

Hence, π11i will be elevated when Hi is elevated and risk in the surrounding locality is also elevated. By contrast, π10i will be elevated when Hi is elevated, but risk in the surrounding locality is relatively low. Similarly, $J_{01i} + J_{00i} = (1 - b_i) S_{0i}$, and defining $\pi_{01i} = \hat{J}_{01i}/S_{0i}$ and $\pi_{00i} = \hat{J}_{00i}/S_{0i}$, one has

$$\pi_{01i} + \pi_{00i} = 1 - H_i$$
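The posterior summaries and the decomposition of the marginal exceedance probability into a "clustered" and an "outlier" part can be checked numerically. The sketch below assumes a handful of invented indicator draws in place of real MCMC output; the names `draws` and `posterior_join_summaries` are hypothetical.

```python
# Hypothetical setup: 4 areas on a line (0-1, 1-2, 2-3), binary adjacency.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

# Hypothetical sampled indicators b_i^(t), one list per MCMC iteration.
draws = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 1],
]

def posterior_join_summaries(W, draws, i):
    """Return (H_i, pi_11i, pi_10i) estimated from indicator draws."""
    T = len(draws)
    S0i = sum(W[i])
    H = sum(b[i] for b in draws) / T
    # J11i = b_i * sum_j w_ij b_j and J10i = b_i * sum_j w_ij (1 - b_j)
    J11 = [b[i] * sum(w * bj for w, bj in zip(W[i], b)) for b in draws]
    J10 = [b[i] * sum(w * (1 - bj) for w, bj in zip(W[i], b)) for b in draws]
    pi11 = sum(J11) / (T * S0i)
    pi10 = sum(J10) / (T * S0i)
    return H, pi11, pi10

H1, pi11_1, pi10_1 = posterior_join_summaries(W, draws, 1)
print(H1, pi11_1, pi10_1)
```

With these toy draws, area 1 is high risk in three of four iterations, and that probability splits evenly between joins with high-risk and low-risk neighbours.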

Areas can be ranked in terms of π11i to indicate which are likely high risk cluster centres. Alternative tests regarding high risk clustering in the locality around area i might be envisaged. One involves expectation weighted averages

$$R_i = \frac{\sum_{j \in L_i} e_j r_j}{\sum_{j \in L_i} e_j}$$

of modelled relative risks across localities Li that include both the focus area i and areas adjacent to it. These weighted averages can be monitored during the MCMC updating and the probabilities that Ri exceed 1 obtained. However, this test may be affected by unusually high relative risks in one or two areas within the locality, or by situations where a low risk area is surrounded by high risk areas.
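A minimal sketch of this locality average, and of the probability that it exceeds 1, is given below. The expected counts `e`, the sampled relative risks `r_draws` and the locality membership are all invented for illustration.

```python
def locality_risk(e, r, locality):
    """Expectation weighted average of relative risks over a set of areas."""
    return sum(e[j] * r[j] for j in locality) / sum(e[j] for j in locality)

e = [4.0, 2.0, 3.0, 5.0]  # hypothetical expected deaths per area
locality = [0, 1, 2]      # hypothetical locality: focus area 1 plus neighbours 0 and 2

# Hypothetical sampled relative risks r_i, one list per MCMC iteration.
r_draws = [
    [1.5, 1.2, 0.8, 0.9],
    [0.9, 0.9, 1.0, 1.2],
    [2.0, 1.4, 0.7, 0.6],
]

R_draws = [locality_risk(e, r, locality) for r in r_draws]
posterior_mean_R = sum(R_draws) / len(R_draws)
prob_R_above_1 = sum(R > 1 for R in R_draws) / len(R_draws)
print(R_draws, posterior_mean_R, prob_R_above_1)
```

In the third draw the average exceeds 1 even though area 2 is below 1, which illustrates how one or two high values can dominate this measure.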

Another option is to compare the sampled J11i at each MCMC iteration to the expected count S0iπ² under a no clustering hypothesis, and obtain estimates of the probabilities $h_{11i} = \Pr(J_{11i} > S_{0i}\pi^2 \mid y)$.

If bi=1, $J_{11i} = \sum_j w_{ij} b_j$ and the comparison $J_{11i} > S_{0i}\pi^2$ reduces to $\sum_j w_{ij} b_j / S_{0i} > \pi^2$, a condition very likely to be met in high risk localities (where risk is elevated in both the focus area and surrounding areas, so that π11i is high). Hence, the comparison $J_{11i} > S_{0i}\pi^2$ will tend to have a similar probability of holding as that for I(bi=1) in such localities.
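The comparison of the sampled local join-count with its null expectation can be sketched as below. The indicator draws are invented, and estimating π at each iteration by the mean of the sampled indicators is an assumption made for this illustration.

```python
# Hypothetical setup: 4 areas on a line (0-1, 1-2, 2-3), binary adjacency.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

# Hypothetical sampled indicators b_i^(t), one list per MCMC iteration.
draws = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 1],
]

def exceedance_probability(W, draws, i):
    """Share of draws in which J11i exceeds its expectation S0i * pi^2."""
    S0i = sum(W[i])
    hits = 0
    for b in draws:
        pi = sum(b) / len(b)  # assumed per-iteration estimate of pi
        J11i = b[i] * sum(w * bj for w, bj in zip(W[i], b))
        hits += J11i > S0i * pi ** 2
    return hits / len(draws)

h11_1 = exceedance_probability(W, draws, 1)
print(h11_1)
```

The comparison can only hold in iterations where the focus area is itself high risk, so this probability cannot exceed the marginal exceedance probability for the focus area.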

Methods: Clustering in Bivariate Risk

An extension of the proposed join-count statistics is in the detection of elevated bivariate risk across localities, based on local join-counts for the joint high risk binary event. Methods for bivariate spatial association have been proposed [8], and bivariate LISA methods indicate association between the value for one variable at a given location and the average of another variable at neighbouring locations. However, there is no widely applied cluster detection method (e.g. spatial scan technique) for bivariate outcomes. Let A and B denote two health outcomes Equation, Equationand consider join-counts corresponding to the joint high risk classification:

$$b_{ABi} = I(r_{Ai} > 1,\ r_{Bi} > 1)$$

The event risks {rAi ,rBi} can be obtained via the models

    Equation    (2)

Equation

with one option for priors on the random effects being

    Equation   (3)

Equation

Equation

To assess high risk clustering in both events jointly, the bivariate local join-counts

$$J_{11ABi} = \sum_j w_{ij}\, I(b_{ABi} = 1,\ b_{ABj} = 1)$$

can be monitored. The estimated probability of elevated bivariate risk in area i specifically is

$$H_{ABi} = \frac{1}{T}\sum_{t=1}^{T} b_{ABi}^{(t)}$$

but this elevated bivariate risk may not apply across the broader locality. However, the estimated proportion of bivariate joins in the locality centred on area i that are joint high risk, namely

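$$\pi_{11ABi} = \hat{J}_{11ABi} / S_{0i}, \qquad \hat{J}_{11ABi} = \frac{1}{T}\sum_{t=1}^{T} J_{11ABi}^{(t)}$$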

provides a summary index of high bivariate risk across that locality. For detecting isolated elevated bivariate risk (high risk in the focus area but not extending to the broader locality), the relevant join-count is

$$J_{10ABi} = \sum_j w_{ij}\, I(b_{ABi} = 1,\ b_{ABj} = 0)$$
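The bivariate join-counts follow the univariate pattern once a joint indicator is formed. The sketch below assumes a threshold of 1 for both outcomes and uses invented relative risks; the function names are hypothetical.

```python
# Hypothetical setup: 4 areas on a line (0-1, 1-2, 2-3), binary adjacency.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

def joint_indicator(rA, rB, threshold=1.0):
    """1 where both outcomes exceed the (assumed) threshold, else 0."""
    return [int(a > threshold and c > threshold) for a, c in zip(rA, rB)]

def bivariate_local_counts(W, bAB, i):
    """Return (J11ABi, J10ABi) with area i as the focus."""
    J11 = bAB[i] * sum(w * bj for w, bj in zip(W[i], bAB))
    J10 = bAB[i] * sum(w * (1 - bj) for w, bj in zip(W[i], bAB))
    return J11, J10

# Hypothetical relative risks for outcomes A and B at one iteration.
rA = [1.3, 1.2, 0.9, 1.1]
rB = [1.5, 1.1, 1.4, 0.8]

bAB = joint_indicator(rA, rB)
counts_area0 = bivariate_local_counts(W, bAB, 0)
counts_area1 = bivariate_local_counts(W, bAB, 1)
print(bAB, counts_area0, counts_area1)
```

Areas 2 and 3 are each elevated on only one outcome, so neither counts as joint high risk, and area 1 has one concordant and one discordant join.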

Just as implications about smoothed relative risks may depend on the form of spatial interaction assumed [5], so may the inferences about clustering patterns. Implications about risk patterns for interdependent events, especially when one event is less frequent than another, may also be influenced by the form of random effects assumption (and the extent to which there is borrowing of strength). For example, clustering inferences in the less common outcome may be affected if a bivariate spatial prior (allowing correlation in spatial risks between outcomes within areas) is adopted instead of separate univariate spatial priors as in equation (3).

Results: Locality Risk Patterns in Suicide Deaths in NW England

While relatively rare, suicide is a major reason for premature mortality. To assess risk patterns in individual areas as compared to their broader localities, we consider suicide deaths yi over the period 2006 to 2010 in 922 small areas (Middle Level Super Output Areas or MSOAs) across the North West of England (Table 1). These areas are designed to be of similar size in population terms, with an average population of 7500. Expected deaths ei are based on applying an England wide schedule of age specific suicide rates to MSOA populations, with scaling applied to ensure $\sum_i e_i = \sum_i y_i$.

| Locality | Index of focus area | π11i (posterior estimate) | Hi (posterior estimate) | Relative risk ri in focus area (posterior mean) | h11i (posterior mean) | Total areas in locality (including focus) | Indices of areas in locality (other than focus) | Modelled relative risk R1i across locality (expectation weighted average) | Pr(R1i>1) (elevated locality risk) | SMR across locality |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 594 | 0.854 | 0.979 | 1.717 | 0.979 | 7 | 590, 592, 593, 595, 597, 599 | 1.547 | 0.999 | 2.151 |
| 2 | 16 | 0.834 | 0.989 | 1.845 | 0.989 | 9 | 5, 10, 11, 15, 17, 21, 22, 25 | 1.433 | 1.000 | 1.872 |
| 3 | 11 | 0.820 | 0.971 | 1.729 | 0.971 | 5 | 5, 18, 15, 16 | 1.478 | 0.997 | 2.011 |
| 4 | 249 | 0.761 | 0.969 | 1.692 | 0.966 | 5 | 247, 248, 251, 252 | 1.385 | 0.991 | 1.782 |
| 5 | 856 | 0.753 | 0.977 | 1.742 | 0.975 | 7 | 849, 851, 853, 857, 858, 859 | 1.375 | 0.995 | 1.957 |
| 6 | 595 | 0.743 | 0.889 | 1.460 | 0.889 | 5 | 593, 594, 596, 599 | 1.427 | 0.992 | 1.844 |
| 7 | 251 | 0.741 | 0.885 | 1.420 | 0.885 | 6 | 247, 248, 250, 252, 258 | 1.534 | 0.999 | 2.181 |
| 8 | 597 | 0.735 | 0.966 | 1.755 | 0.963 | 4 | 594, 599, 601 | 1.452 | 0.989 | 1.901 |
| 9 | 590 | 0.726 | 0.986 | 1.882 | 0.985 | 5 | 587, 589, 592, 594 | 1.465 | 0.993 | 1.741 |
| 10 | 710 | 0.723 | 0.966 | 1.768 | 0.943 | 4 | 709, 711, 712 | 1.405 | 0.968 | 1.580 |
| 11 | 10 | 0.714 | 0.895 | 1.452 | 0.894 | 7 | 2, 5, 6, 13, 16, 17 | 1.353 | 0.998 | 1.757 |
| 12 | 258 | 0.713 | 0.999 | 2.207 | 0.994 | 8 | 250, 251, 252, 255, 259, 262, 264 | 1.369 | 0.995 | 1.660 |
| 13 | 15 | 0.713 | 0.870 | 1.397 | 0.870 | 6 | 8, 11, 16, 18, 21 | 1.419 | 0.998 | 1.795 |

Table 1: Suicide mortality, areas with highest estimates (π11i) for elevated locality risk.

The average mortality count is 3.5, but event totals yi in individual areas vary widely, as do moment estimates of relative risk yi/ei (sometimes called standardised mortality ratios or SMRs). Such moment estimates are unreliable, with unstable variance, when there are small numbers of suicide deaths, as in many MSOAs [9,10].
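The instability of moment estimates at small counts can be illustrated with the usual plug-in Poisson standard error, sqrt(y)/e. This is a sketch under that approximation, with invented counts; the function name is hypothetical.

```python
import math

def smr_with_se(y, e):
    """Moment estimate y/e and its plug-in Poisson standard error sqrt(y)/e."""
    return y / e, math.sqrt(y) / e

# Two hypothetical areas with the same SMR but very different expected counts.
small_area = smr_with_se(4, 3.5)      # close to the average MSOA count
large_area = smr_with_se(400, 350.0)  # a much larger population at risk
print(small_area, large_area)
```

Both areas have the same point estimate, but the standard error for the small area is ten times larger, which is why smoothing across neighbouring areas is helpful for rare outcomes.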

To provide stabilised estimates of relative risk including spatial borrowing of strength, a convolution model is applied with yi ∼ Poi(eiri ), where

$$\log(r_i) = \beta_0 + s_i + u_i \qquad (4)$$

where the si follow the CAR prior with variance parameter σ² and ui ~ N(0,τ²). A flat prior on β0 is assumed, and a gamma prior with index 1 and shape 0.001 on the inverse spatial variance 1/σ² [11,12]. Convergence is improved by linking the variance parameters; thus τ² = σ²/ρ, where ρ is assigned an exponential prior with rate 1. Inferences in this and subsequent models are based on the second halves of two chain runs of 10,000 iterations, with convergence assessed according to BGR statistics [13].

Localities are defined as areas adjacent to area i, though the weighting attached to different areas within such localities can be varied. To assess possible sensitivity regarding inferences about locality risk, alternative assumptions about wij are investigated: equal weighting of all adjacent areas as compared to alternative forms of inverse distance decay, $w_{ij} = d_{ij}^{-\gamma}$. The binary indicators $b_i = I(r_i > 1)$ and local join-counts J11i are monitored to provide posterior estimated probabilities π11i of high risk common to the focus and its locality, and estimated marginal probabilities of elevated risk, namely Hi.

Inferences for Locality Risks

Consider first a binary adjacency assumption for the wij (wij=1 if areas are adjacent, wij=0 otherwise), under which the Moran spatial correlation index for the si is obtained as 0.56 with 95% interval (0.47, 0.66). Figure 1 maps out the posterior mean relative risks ri across the region, though this map tends to be dominated by low density rural areas (such as in the Lake District in the northern part of the map). Subsequently higher resolution maps are used to depict risk and clustering patterns, since in the case study, high risk clustering tends to be in densely populated urban areas. Maps of the administrative geography of the region (including maps of MSOAs) are available at the UK Map Collection page http://www.ons.gov.uk/ons/guide-method/geography/beginner-s-guide/maps/index.html.


Figure 1: Suicide Relative Risk, Posterior Means.

The estimated π11i have an average of 0.221, with a 0.975 percentile of 0.638, and with the maximum π11i being 0.854. The estimated marginal probabilities Hi have an average of 0.426, with a 0.975 percentile of 0.923, and a maximum of 0.999. The π10i, which provide indicators of isolated high risk not extending to the broader locality, have an average of 0.205, with a maximum of 0.583. The estimated h11i are also shown; as discussed above, these are similar to Hi in localities characterised by high risk clustering, but their ordering of potential cluster centres is similar to that of the π11i. Of the 13 areas with highest π11i values, 10 are also among the 13 areas with highest h11i values.

There are 5 areas with π11i over 0.75, and 13 areas with π11i over 0.70. Table 1 summarises locality risk patterns for the 13 areas with π11i over 0.70, ranked by the size of π11i, and also including estimates of Hi and ri (posterior means). The relatively low values for both π11i and Hi reflect the rarity of the suicide outcome; more frequent outcomes (such as self-harm hospitalisations considered in the bivariate analysis) are more likely to have high π11i and Hi (e.g. close to 1). Table 1 also shows posterior means of expectation weighted averages $R_{1i} = \sum_{j \in L_i} e_j r_j / \sum_{j \in L_i} e_j$ of modelled relative risks across localities Li, encompassing both the focus area i and areas adjacent to it. Also shown are estimated probabilities that R1i exceed 1, namely that the entire locality has elevated risk, and unsmoothed suicide SMRs across localities

$$\text{SMR}_{L_i} = \frac{\sum_{j \in L_i} y_j}{\sum_{j \in L_i} e_j}$$

The π11i identify focus areas with high probabilities of elevated risk and of belonging to a high risk locality, rather than clusters per se. So some areas are present in more than one locality in table 1; for example, areas 594 and 599 appear twice. There are 47 distinct areas in the localities in table 1, and their posterior mean ri range from 0.99 to 2.21 with average 1.36.

Probabilities that the average locality risk R1i exceeds 1 are all over 0.968. The average locality risks R1i may be used to confirm what the join-count statistics indicate, in particular the π11i statistics, but in themselves are not conclusive about elevated risk common to both a focus area and areas around it. Weighted averages such as R1i may be affected by unusually high relative risks in a subset of areas within the locality, whereas π11i specifically focuses on elevated risk status across all areas in a locality. An example is provided by area 28, which has 1 suicide death against 4.4 expected, with an estimated exceedance probability H28=0.36. However, the areas adjacent to area 28 have 34 deaths in relation to 20 expected, with a probability of 0.98 that the locality wide R1,28 exceeds 1 (where the locality encompasses area 28). Note that this type of pattern would be detected by the join-counts J01i and corresponding probabilities π01i = J01i / S0i.

Delineation of high risk localities using local join-counts in conjunction with a relative risk model such as equation (1) contrasts with the spatial scan procedure, which is applied to observed area disease counts without any modelling preliminaries (for example, smoothing or borrowing strength procedures to reduce unreliability in fixed effects relative risk estimates). Despite this fundamental difference, the localities of table 1 can be compared with clusters identified by the SaTScan and FlexScan packages developed by Kulldorff [14] and Tango and Takahashi [15], respectively. The comparison also relates to the work of Holowaty et al. [1] and Wieckowska et al. [16]. SaTScan identifies five high rate clusters with Monte-Carlo p-values under 0.2. The average π11i is 0.666 for the 29 areas in these five clusters, and the overlap with the local join-count method is apparent in that only 2.2% of the 922 areas have π11i over 0.666. Similarly, the 59 areas identified by FlexScan (in 7 clusters with p-values under 0.2) have an average π11i of 0.618.

The most likely cluster identified by SaTScan [14] contains areas {248, 249, 251, 252, 253, 258}, while FlexScan identifies the area set {249, 251, 252, 253, 258, 262, 263, 265, 271} as its leading secondary cluster (with lowest p-value after the most likely cluster). Areas {248, 249, 250, 251, 252, 255, 258, 259, 262, 264} are included in the localities identified using join-count statistics in table 1, and in fact consist of neighbouring areas in Tameside, a local authority district in the south east of the region, with the district of Oldham to the north and Stockport to the south. Figure 2 (of MSOAs in the three local government districts of Tameside, Oldham and Stockport) shows a cluster of MSOAs in the centre of the mapped sub-region, mostly in Tameside, all having posterior mean relative risks above 1.10.


Figure 2: Modelled Suicide Relative Risks, Tameside, Oldham and Stockport.

The most likely cluster identified by FlexScan consists of the areas {587, 588, 590, 591, 593, 594, 597, 599}, and the similar area set {590, 592, 593, 594, 595, 597, 599} is also the leading secondary cluster identified by SaTScan. Areas {587, 589, 590, 592, 593, 594, 595, 596, 597, 599 and 601} are included in the areas in table 1, and consist of a set of areas in the coastal town of Blackpool. Figure 3 (of MSOAs in the three local government districts of Blackpool, Wyre and Fylde) shows this cluster of adjacent MSOAs at the westernmost centre of the plot, all having posterior mean relative risks above 1.15 except for area 589, which has a modelled relative risk of 0.994 but is encompassed within surrounding higher risk areas.


Figure 3: Modelled Suicide Relative Risks, Blackpool, Wyre and Fylde.

The local join-count procedure also provides estimates of π10i, which will be elevated when Hi is elevated, but risk in the surrounding locality is relatively low. These may be considered as local high risk outliers, discordant in terms of health status from their neighbours. To demonstrate the contrasting risk patterns between the focus area and surrounding areas, we define Ai, encompassing areas adjacent to the focus area i but not including that area.

Thus, table 2 shows the 12 MSOAs with π10i over 0.5, the modelled relative risk ri (posterior mean) in the focus area, and posterior mean relative risk in the surrounding area, namely

$$R_{2i} = \frac{\sum_{j \in A_i} e_j r_j}{\sum_{j \in A_i} e_j}$$

| Index of focus area | Hi (posterior estimate) | π10i (posterior estimate) | Modelled relative risk ri in focus area (posterior mean) | Number of areas in surrounding locality (excluding focus) | Modelled relative risk R2i across rest of locality (excluding focus) | Pr(R2i>1) (elevated risk in adjacent areas) | SMR across rest of locality |
|---|---|---|---|---|---|---|---|
| 532 | 0.853 | 0.583 | 1.379 | 5 | 0.920 | 0.248 | 0.768 |
| 620 | 0.800 | 0.582 | 1.358 | 2 | 0.890 | 0.239 | 0.490 |
| 439 | 0.787 | 0.580 | 1.288 | 6 | 0.868 | 0.113 | 0.702 |
| 772 | 0.974 | 0.550 | 1.805 | 4 | 1.009 | 0.492 | 0.739 |
| 554 | 0.739 | 0.545 | 1.233 | 6 | 0.847 | 0.091 | 0.693 |
| 406 | 0.804 | 0.521 | 1.299 | 6 | 0.917 | 0.234 | 0.856 |
| 862 | 0.896 | 0.514 | 1.469 | 4 | 0.977 | 0.411 | 0.695 |
| 763 | 0.743 | 0.509 | 1.252 | 3 | 0.887 | 0.216 | 0.869 |
| 158 | 0.855 | 0.506 | 1.329 | 7 | 0.956 | 0.329 | 0.877 |
| 740 | 0.960 | 0.503 | 1.145 | 7 | 1.022 | 0.551 | 0.754 |
| 744 | 0.765 | 0.502 | 1.244 | 4 | 0.924 | 0.271 | 0.782 |
| 198 | 0.749 | 0.502 | 1.251 | 5 | 0.900 | 0.209 | 0.532 |

Table 2: Suicide mortality, areas with highest estimates (π10i) for outlier high risk.

Also shown are unsmoothed suicide SMRs across adjacent areas, $\sum_{j \in A_i} y_j / \sum_{j \in A_i} e_j$. For all but one area, the probabilities that R2i (average risk in the locality excluding the focus area) exceed 1 are under 0.5, whereas the probabilities Hi of elevated risk in the focus area itself all exceed 0.7.

Locality Risk Patterns under Alternative Spatial Weights

Inferences from convolution or other area disease count models may be affected by the form of spatial interaction assumed. Two alternatives to binary adjacency are considered, which involve down weighting areas at greater distance from the focus area (with inter-area distances based on population centroids). These assume distance decay according to $w_{ij} = d_{ij}^{-\gamma}$ (γ>0), with values of γ=0.5 and γ=1 considered. These values are based on a preliminary analysis using model (4) to find an optimal value for γ using a discrete prior over values {0, 0.1, 0.2, …, 1.5}, which produced a posterior mean for γ of 0.69.
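A sketch of how distance-decay weights might be built for adjacent areas is given below. It assumes a power-law decay in centroid distance, w_ij = d_ij^(-γ), applied only to adjacent pairs; that functional form, the centroid coordinates and the function name are assumptions for illustration rather than details confirmed by the text.

```python
import math

def distance_decay_weights(coords, adjacency, gamma):
    """Weights d_ij ** (-gamma) for adjacent pairs, 0 elsewhere (assumed form)."""
    n = len(coords)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and adjacency[i][j]:
                d = math.dist(coords[i], coords[j])  # centroid distance
                W[i][j] = d ** (-gamma)
    return W

# Hypothetical population centroids and adjacency for three mutually adjacent areas.
coords = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0)]
adjacency = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]

W_binary = distance_decay_weights(coords, adjacency, 0.0)  # gamma = 0: equal weights
W_half = distance_decay_weights(coords, adjacency, 0.5)
W_one = distance_decay_weights(coords, adjacency, 1.0)
print(W_half[0], W_one[0])
```

Under this form, setting γ to 0 recovers binary adjacency, and larger γ shifts weight towards the nearest neighbours, so S0i and the local join-counts change with γ.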

We focus on elevated locality risk in particular, and table 3 summarises locality risk patterns under the two distance decay options. The table considers only areas with π11i over 0.70, ranked by π11i. The weighted averages of modelled relative risks across localities Li (centred on and including area i) now also adjust for distance decay as well as expected deaths, namely

Equation

Distance Decay Coefficient (γ) equals 1

| Locality | Index of focus area | π11i (posterior estimate) | Hi (posterior estimate) | Relative risk ri in focus area (posterior mean) | Indices of areas in locality (other than focus) | Modelled relative risk R3i across locality | Pr(R3i>1) (elevated locality risk) | SMR across locality |
|---|---|---|---|---|---|---|---|---|
| 1 | 710 | 0.833 | 0.969 | 1.77 | 709, 711, 712 | 1.52 | 0.978 | 1.58 |
| 2 | 16 | 0.831 | 0.985 | 1.76 | 5, 10, 11, 15, 17, 21, 22, 25 | 1.40 | 1.000 | 1.87 |
| 3 | 594 | 0.825 | 0.967 | 1.59 | 590, 592, 593, 595, 597, 599 | 1.43 | 0.997 | 2.15 |
| 4 | 11 | 0.815 | 0.964 | 1.61 | 5, 8, 15, 16 | 1.41 | 0.997 | 2.01 |
| 5 | 711 | 0.810 | 0.888 | 1.49 | 709, 710, 712 | 1.55 | 0.984 | 1.58 |
| 6 | 595 | 0.754 | 0.889 | 1.39 | 593, 594, 596, 599 | 1.38 | 0.988 | 1.84 |
| 7 | 249 | 0.748 | 0.964 | 1.58 | 247, 248, 251, 252 | 1.34 | 0.986 | 1.78 |
| 8 | 597 | 0.730 | 0.957 | 1.60 | 594, 599, 601 | 1.36 | 0.975 | 1.90 |
| 9 | 856 | 0.728 | 0.975 | 1.66 | 849, 851, 853, 857, 858, 859 | 1.33 | 0.988 | 1.96 |
| 10 | 337 | 0.723 | 0.979 | 1.70 | 332, 334, 340, 542 | 1.37 | 0.983 | 1.65 |
| 11 | 712 | 0.715 | 0.853 | 1.43 | 707, 709, 710, 711 | 1.43 | 0.971 | 1.30 |
| 12 | 333 | 0.712 | 0.912 | 1.45 | 330, 334, 336, 340 | 1.29 | 0.967 | 1.64 |
| 13 | 10 | 0.711 | 0.881 | 1.41 | 2, 5, 6, 13, 16, 17 | 1.32 | 0.992 | 1.76 |
| 14 | 732 | 0.709 | 0.933 | 1.52 | 27, 728, 729, 731, 736 | 1.35 | 0.963 | 1.64 |
| 15 | 15 | 0.709 | 0.860 | 1.34 | 8, 11, 16, 18, 21 | 1.36 | 0.996 | 1.79 |

Distance Decay Coefficient (γ) equals 0.5

| Locality | Index of focus area | π11i (posterior estimate) | Hi (posterior estimate) | Relative risk ri in focus area (posterior mean) | Indices of areas in locality (other than focus) | Modelled relative risk R3i across locality | Pr(R3i>1) (elevated locality risk) | SMR across locality |
|---|---|---|---|---|---|---|---|---|
| 1 | 594 | 0.839 | 0.972 | 1.64 | 590, 592, 593, 595, 597, 599 | 1.47 | 0.998 | 2.15 |
| 2 | 16 | 0.831 | 0.988 | 1.80 | 5, 10, 11, 15, 17, 21, 22, 25 | 1.41 | 0.999 | 1.87 |
| 3 | 11 | 0.818 | 0.968 | 1.67 | 5, 8, 15, 16 | 1.44 | 0.995 | 2.01 |
| 4 | 710 | 0.789 | 0.973 | 1.78 | 709, 711, 712 | 1.47 | 0.977 | 1.58 |
| 5 | 249 | 0.763 | 0.967 | 1.64 | 247, 248, 251, 252 | 1.37 | 0.984 | 1.78 |
| 6 | 711 | 0.763 | 0.895 | 1.48 | 709, 710, 712 | 1.49 | 0.980 | 1.58 |
| 7 | 595 | 0.754 | 0.894 | 1.42 | 593, 594, 596, 599 | 1.39 | 0.988 | 1.84 |
| 8 | 856 | 0.749 | 0.978 | 1.71 | 849, 851, 853, 857, 858, 859 | 1.36 | 0.993 | 1.96 |
| 9 | 597 | 0.726 | 0.959 | 1.66 | 594, 599, 601 | 1.38 | 0.978 | 1.90 |
| 10 | 251 | 0.722 | 0.871 | 1.38 | 247, 249, 251, 252, 258 | 1.46 | 1.000 | 2.18 |
| 11 | 15 | 0.715 | 0.868 | 1.37 | 8, 11, 16, 18, 21 | 1.39 | 0.994 | 1.79 |
| 12 | 10 | 0.713 | 0.895 | 1.42 | 2, 5, 6, 13, 16, 17 | 1.33 | 0.993 | 1.76 |
| 13 | 333 | 0.713 | 0.915 | 1.48 | 330, 334, 336, 340 | 1.30 | 0.965 | 1.64 |
| 14 | 258 | 0.705 | 0.999 | 2.16 | 250, 251, 252, 255, 259, 262, 264 | 1.38 | 0.998 | 1.66 |

Table 3: Elevated locality risks, local join-count statistics and distance decay options.

Table 3 shows posterior mean R3i and probabilities that R3i exceed 1. Unsmoothed suicide SMRs across the locality, $\sum_{j \in L_i} y_j / \sum_{j \in L_i} e_j$, are defined as before.

There is considerable overlap between table 3 and table 1 in those focus areas identified as having both elevated “own area” risk (high Hi) and elevated risk across the locality also. Thus of the 13 areas with high π11i identified in table 1, 11 also appear as focus areas in the top panel (high distance decay wij) of table 3, and the other two (areas 251, 590) are included in the broader localities listed there. All 13 cluster-centre areas identified in table 1 appear as such areas in the lower panel of table 3 (less marked distance decay).

Results: Bivariate Spatial Clustering under Alternative Spatial Priors

We now consider local join-counts for detecting bivariate risks that are both significantly elevated and also spatially clustered. Consider suicide deaths yAi for 2006-10 as discussed above, and self-harm hospitalisations yBi for 2006-7 to 2010-11 (five financial years, with ICD10 X60-X84 codes) across the 922 MSOAs in NW England (the data can be obtained at http://www.apho.org.uk/resource). Expected hospitalisations eBi are based on England wide age specific rates, with scaling applied to ensure $\sum_i e_{Bi} = \sum_i y_{Bi}$. Self-harm is often a precursor to later actual suicide, but considerably more frequent with average event count Equation.

A convolution model is applied with $y_{Ai} \sim \text{Poi}(e_{Ai} r_{Ai})$, $y_{Bi} \sim \text{Poi}(e_{Bi} r_{Bi})$, but comparing two alternative procedures to provide stabilised estimates of relative risk. The first includes spatial borrowing of strength within outcomes, but without such borrowing between outcomes, and unrelated CAR and iid priors for each event

Equation

Equation

with Equation, Equationand EquationSpatial interactions wij are binary based on adjacency. The second procedure assumes Equation follow a bivariate CAR prior [17] with unknown within area covariance matrix

Equation

with Equation taken to be Wishart with 2 degrees of freedom and identity scale matrix.

The bivariate indicators

$$b_{ABi} = I(r_{Ai} > 1,\ r_{Bi} > 1)$$

are monitored in each case to provide bivariate join counts J11ABi. From these one obtains indicators of elevated bivariate risk encompassing both the focus area and its surrounding locality

$$\pi_{11ABi} = \hat{J}_{11ABi} / S_{0i}$$

One may also estimate the weighted locality relative risks for each event, namely

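$$R_{Ai} = \frac{\sum_{j \in L_i} e_{Aj} r_{Aj}}{\sum_{j \in L_i} e_{Aj}}$$

$$R_{Bi} = \frac{\sum_{j \in L_i} e_{Bj} r_{Bj}}{\sum_{j \in L_i} e_{Bj}}$$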

and the probabilities that they exceed 1.

For the model without pooling between outcomes, table 4 (top panel) shows there are 14 areas with π11ABi exceeding 0.5. It can be seen that stronger locality inferences hold for the more frequent second outcome (self-harm), with all the probabilities Pr(RBi >1) being 1. However, the identified localities also have Pr(RAi >1) exceeding 0.9 for all 14 cluster centres, and exceeding 0.95 for 12 cluster centres.

(a) Without pooling between outcomes

| Index of focus area | π11ABi posterior estimate | HABi | Relative risk rAi in focus area | Relative risk rBi in focus area | Indices of areas in locality (other than focus) | Modelled relative risk RAi across locality | Modelled relative risk RBi across locality | Pr(RAi >1) (elevated locality risk) | Pr(RBi >1) (elevated locality risk) | SMR across locality | Self-harm SHR across locality |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 594 | 0.77 | 0.99 | 1.74 | 2.12 | 590, 592, 593, 595, 597, 599 | 1.56 | 1.71 | 1.00 | 1.00 | 2.15 | 1.73 |
| 333 | 0.75 | 0.94 | 1.52 | 1.76 | 330, 334, 336, 340 | 1.33 | 1.38 | 0.97 | 1.00 | 1.64 | 1.39 |
| 597 | 0.64 | 0.98 | 1.80 | 2.12 | 594, 599, 601 | 1.47 | 1.57 | 0.99 | 1.00 | 1.90 | 1.58 |
| 16 | 0.58 | 0.99 | 1.76 | 2.60 | 5, 10, 11, 15, 17, 21, 22, 25 | 1.47 | 1.28 | 1.00 | 1.00 | 1.87 | 1.29 |
| 29 | 0.58 | 0.84 | 1.31 | 1.22 | 22, 25, 26, 27, 32, 33 | 1.30 | 1.24 | 0.98 | 1.00 | 1.53 | 1.25 |
| 312 | 0.57 | 0.94 | 1.52 | 2.56 | 309, 310, 311, 315 | 1.21 | 1.60 | 0.91 | 1.00 | 1.30 | 1.60 |
| 315 | 0.54 | 0.77 | 1.19 | 1.95 | 310, 311, 312, 316, 318, 323, 327 | 1.22 | 1.63 | 0.95 | 1.00 | 1.48 | 1.64 |
| 251 | 0.54 | 0.95 | 1.51 | 1.90 | 247, 249, 250, 252, 258 | 1.52 | 1.34 | 1.00 | 1.00 | 2.18 | 1.34 |
| 573 | 0.54 | 0.80 | 1.27 | 1.40 | 569, 572, 574, 577 | 1.30 | 1.73 | 0.97 | 1.00 | 1.70 | 1.75 |
| 731 | 0.53 | 0.79 | 1.26 | 1.29 | 729, 732, 733, 736 | 1.33 | 1.26 | 0.97 | 1.00 | 1.79 | 1.26 |
| 33 | 0.52 | 0.82 | 1.28 | 1.53 | 26, 29, 32, 174, 176 | 1.19 | 1.35 | 0.91 | 1.00 | 1.23 | 1.36 |
| 76 | 0.52 | 0.78 | 1.21 | 1.24 | 72, 73, 74, 78, 79, 81 | 1.22 | 1.16 | 0.96 | 1.00 | 1.33 | 1.17 |
| 856 | 0.50 | 0.99 | 1.68 | 2.22 | 849, 851, 853, 857, 858, 859 | 1.37 | 1.33 | 0.99 | 1.00 | 1.96 | 1.34 |
| 729 | 0.50 | 0.95 | 1.50 | 1.29 | 726, 727, 731, 732, 733, 734 | 1.24 | 1.31 | 0.93 | 1.00 | 1.34 | 1.32 |

(b) With pooling between outcomes

| Index of focus area | π11ABi posterior estimate | HABi | Relative risk rAi in focus area | Relative risk rBi in focus area | Indices of areas in locality (other than focus) | Modelled relative risk RAi across locality | Modelled relative risk RBi across locality | Pr(RAi >1) (elevated locality risk) | Pr(RBi >1) (elevated locality risk) | SMR across locality | Self-harm SHR across locality |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 333 | 0.87 | 0.99 | 1.69 | 1.77 | 330, 334, 336, 340 | 1.40 | 1.38 | 1.00 | 1.00 | 1.64 | 1.39 |
| 594 | 0.82 | 1.00 | 1.98 | 2.13 | 590, 592, 593, 595, 597, 599 | 1.71 | 1.72 | 1.00 | 1.00 | 2.15 | 1.73 |
| 197 | 0.79 | 1.00 | 1.87 | 4.21 | 190, 195, 196, 201 | 1.31 | 1.81 | 0.98 | 1.00 | 0.98 | 1.82 |
| 315 | 0.76 | 0.96 | 1.42 | 1.95 | 310, 311, 312, 316, 318, 323, 327 | 1.34 | 1.63 | 1.00 | 1.00 | 1.48 | 1.64 |
| 33 | 0.74 | 0.95 | 1.43 | 1.54 | 26, 29, 32, 174, 176 | 1.30 | 1.35 | 0.99 | 1.00 | 1.23 | 1.36 |
| 313 | 0.73 | 0.97 | 1.48 | 2.70 | 306, 308, 311, 314, 317, 318 | 1.25 | 1.73 | 0.97 | 1.00 | 1.14 | 1.74 |
| 29 | 0.72 | 0.93 | 1.37 | 1.22 | 22, 25, 26, 27, 32, 33 | 1.36 | 1.24 | 1.00 | 1.00 | 1.53 | 1.25 |
| 772 | 0.71 | 0.99 | 1.86 | 2.24 | 767, 768, 773, 777 | 1.29 | 1.60 | 0.95 | 1.00 | 1.42 | 1.61 |
| 312 | 0.69 | 1.00 | 1.83 | 2.58 | 309, 310, 311, 315 | 1.33 | 1.60 | 0.99 | 1.00 | 1.30 | 1.60 |
| 76 | 0.69 | 0.91 | 1.32 | 1.24 | 72, 73, 74, 78, 79, 81 | 1.30 | 1.16 | 0.99 | 1.00 | 1.33 | 1.17 |
| 733 | 0.68 | 0.92 | 1.36 | 1.37 | 729, 731, 734, 735, 736 | 1.37 | 1.33 | 0.99 | 1.00 | 1.43 | 1.33 |
| 573 | 0.68 | 0.90 | 1.37 | 1.40 | 569, 572, 574, 577 | 1.48 | 1.74 | 1.00 | 1.00 | 1.70 | 1.75 |
| 731 | 0.67 | 0.92 | 1.41 | 1.29 | 729, 732, 733, 736 | 1.43 | 1.27 | 0.99 | 1.00 | 1.79 | 1.26 |
| 492 | 0.67 | 0.97 | 1.56 | 3.20 | 489, 494, 495, 501 | 1.26 | 2.14 | 0.95 | 1.00 | 1.34 | 2.15 |

Table 4: Bivariate risk, areas with highest probabilities for cluster centres.

Inferences regarding the rarer outcome, both for the focus area and the locality, become stronger when there is pooling between the two outcomes (Table 4, lower panel). The pooling model is in fact supported by the data, since the Deviance Information Criterion [18] is reduced from 11311 to 11221, and the posterior estimate (with 95% CrI) for ρAB is 0.75 (0.64, 0.84). Moran spatial correlation indices for sAi and sBi are obtained as 0.45 (0.39, 0.52) and 0.42 (0.41, 0.45) respectively.
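A posterior summary of a Moran index for spatial effects, such as those quoted above, can be obtained by evaluating the index at each MCMC draw and then summarising the resulting values. The sketch below uses the standard global Moran's I formula with binary adjacency weights; whether the paper used exactly this form is an assumption, and the function names and the (draws by areas) array layout are hypothetical.

```python
import numpy as np

def moran_I(s, W):
    """Global Moran's I evaluated separately for each posterior draw.

    s : (T, n) posterior draws of spatial effects (a 1-D vector is also accepted).
    W : (n, n) spatial weight matrix, here binary adjacency.
    Returns an array of T index values.
    """
    s = np.atleast_2d(np.asarray(s, dtype=float))
    W = np.asarray(W, dtype=float)
    n = W.shape[0]

    z = s - s.mean(axis=1, keepdims=True)            # centre each draw
    cross = np.einsum("ti,ij,tj->t", z, W, z)        # sum_ij w_ij z_i z_j
    total = (z ** 2).sum(axis=1)                     # sum_i z_i^2
    return (n / W.sum()) * cross / total

def posterior_summary(draws):
    """Posterior mean with an equal-tailed 95% credible interval."""
    lower, upper = np.percentile(draws, [2.5, 97.5])
    return float(np.mean(draws)), float(lower), float(upper)
```

For example, `posterior_summary(moran_I(sA_draws, W))` would return a mean and interval in the same form as the figures reported for sAi, where `sA_draws` stands for a hypothetical array of sampled spatial effects.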

There are in fact now 28 areas with Equation exceeding 0.60, but Table 4 contains the same number of cluster centres under the two options in order to facilitate comparison. The locality with the highest Equation under the pooling model consists of five MSOAs in Wigan (areas 330, 333, 334, 336, and 340), and has 28 suicide deaths (against 17 expected), and 627 self-harm hospitalisations against 450.6 expected. Other MSOAs in Wigan with elevated and clustered bivariate risk are apparent in Table 4 (the 4th, 6th and 9th focus areas in the lower panel). Figures 4 and 5 show modelled relative risks for the two outcomes in MSOAs in Wigan (MSOAs in centre), and in the adjacent St Helens and Bolton districts. It can be seen from both figures that high suicide and self-harm rates occur widely through these three districts, but that elevated levels of both self-harm and suicide together are apparent in areas coded 330, 333, 334, 336, and 340 (in the centre of the southern boundary), and also in a north-west aligned band of Wigan MSOAs in the central part of the map.
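As a quick arithmetic check, the observed and expected counts quoted for the Wigan locality centred on area 333 can be turned into crude locality ratios. The variable names are illustrative only, and the expected suicide count is taken as the rounded value of 17 given in the text.

```python
# Observed and expected counts for the five-MSOA Wigan locality (from the text).
observed_suicides, expected_suicides = 28, 17.0
observed_self_harm, expected_self_harm = 627, 450.6

# Crude locality ratios: observed events divided by expected events.
smr = observed_suicides / expected_suicides      # about 1.65
shr = observed_self_harm / expected_self_harm    # about 1.39

print(f"SMR = {smr:.2f}, SHR = {shr:.2f}")
```

These ratios agree with the locality SMR of 1.64 and SHR of 1.39 shown for focus area 333 in Table 4; the small difference in the SMR presumably reflects rounding of the expected count.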


Figure 4: Modelled Suicide Risk (Bivariate Analysis) Wigan, Bolton, St Helens.


Figure 5: Modelled Self-Harm Risk (Bivariate Analysis) Wigan, Bolton, St Helens.

The lower panel of Table 4 includes cluster centres that do not appear in the upper panel, notably areas 197 and 492, where self-harm risk (both observed and modelled) is high, and estimated suicide risk is pulled towards the risk for the more common outcome under the bivariate spatial prior. Using extra information about risk patterns provided by a more frequent outcome (or by intercorrelation between outcomes in general) is generally regarded as beneficial. This is a form of borrowing strength [19], enabling stronger inferences for an infrequent outcome. However, an analysis such as that here, of potential impacts on inferences about clustering, may provide an additional facet for assessing sensitivity to alternative spatial priors.

Criteria for Cut-off Points

Considering the results of both the univariate and bivariate clustering analyses together, one may set out some criteria for choosing focus areas for high risk localities. The choice of cut-off for Equation (or Equation for bivariate outcomes) should be based on the profile of their ranked values, in conjunction with information about risk variation (e.g. the profile of Hi and Ri). The necessary interconnection with Hi follows from the relation Equation.

Health outcome data for small areas vary considerably in the extent to which significant variations in area relative risk (and hence locality clustering) can be detected, and this affects cut-off choice. For example, for the relatively rare suicide outcome, there are only 18 areas with Pr(bi = 1 | y) = Hi exceeding 0.95, and a cut-off Equation > 0.7 was used, with a minimum Hi of 0.87 among the 13 areas above this cut-off. A slightly lower cut-off could be entertained, though the 17th ranked area in terms of Equation (with Equation = 0.69) has a relatively low Hi of 0.79, below the threshold of Hi = 0.8 for elevated risk suggested by Richardson et al. [3]. The probabilities Pr(Ri >1| y) that the locality wide modelled SMRs exceed 1 are also relevant, provided the Ri are obtained for localities where both the focus and surrounding areas have elevated risk. All 13 areas with Equation > 0.7 have Pr(Ri >1| y) exceeding 0.95.

Whereas suicide is a rare outcome, self-harm is around 25 times more frequent. When a univariate clustering analysis (comparable to that carried out for completed suicide and reported on above) is carried out for self-harm, there are 275 MSOAs with Hi exceeding 0.95, and 16 MSOAs with Equation > 0.9, so a higher cut-off point could be used to detect high risk clusters for this outcome.

For the bivariate outcome analysis (suicide and self-harm) without borrowing of strength between outcomes (e.g., Table 4 upper panel), there are 14 areas with Equation exceeding 0.95, and a relatively low cut-off of Equation was used. The implications of using a slightly lower cut-off point could be considered, since even in this analysis, the locality relative risks RAi and RBi are significantly elevated (above 1) at lower values of Equation than the illustrative cut-off taken.

It follows from the above discussion that there are no simple rules for a low threshold Equation or Equation below which clustering is implausible. It depends on the profile of Hi and Ri as well as on the profile of Equation. Also relevant is the relative size of Equation and Equation, the latter being the probability of a high risk area surrounded by low risk areas. Where an area has Equation below 0.9, or Hi below 0.75, or Equation clearly exceeding Equation, then high risk clustering becomes considerably less likely.
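The kind of screening discussed in this section can be expressed as a simple filter over area-level posterior summaries. The following is only an illustrative sketch: the names `pi11` (probability of clustered high risk), `pi10` (probability of a high risk area surrounded by low risk areas) and `H` (exceedance probability) are labels chosen here, and the cut-offs are arguments to be set by the analyst from the ranked profiles rather than fixed rules.

```python
import numpy as np

def flag_cluster_centres(pi11, pi10, H, pi11_cut, H_min):
    """Return indices of candidate high-risk cluster centres, ranked by pi11.

    An area is retained when its clustering probability exceeds pi11_cut,
    its exceedance probability is at least H_min, and clustered high risk
    is more probable than isolated high risk (pi11 > pi10).
    """
    pi11 = np.asarray(pi11, dtype=float)
    pi10 = np.asarray(pi10, dtype=float)
    H = np.asarray(H, dtype=float)

    keep = (pi11 > pi11_cut) & (H >= H_min) & (pi11 > pi10)
    candidates = np.flatnonzero(keep)

    # Rank retained areas from highest to lowest clustering probability.
    order = np.argsort(-pi11[candidates], kind="stable")
    return candidates[order]
```

Running the filter over a grid of cut-off values and comparing the retained sets is one way to examine how sensitive the list of cluster centres is to the threshold chosen.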

Conclusions

Small area disease models often use exceedance probabilities for each individual area to make inferences about risk patterns. However, elevated risk in an area may not necessarily extend to the surrounding locality. This paper has sought to identify areas where elevated risk extends to the broader locality using local join-count statistics. These statistics can identify local outliers as well as high risk cluster centres, and can be applied to assess high risk clustering in more than one health outcome.

The procedure here can be used in conjunction with a disease model where risk status is unknown, so enabling the clustering implications of contrasting likelihood and prior assumptions (e.g. regarding pooling between areas, and outcomes) to be assessed. In particular, inferences about clustering patterns in two outcomes considered jointly may well be influenced by alternative assumptions, particularly when a spatial prior borrows strength over outcomes as well as areas. Sensitivity of clustering inferences to alternative priors for spatial effects, such as the approach of Leroux et al. [20] in contrast to the convolution prior, also provides an additional area of research.

References
