ISSN: 2155-6180
Journal of Biometrics & Biostatistics

Spatial Disease Cluster Detection: An Application to Childhood Asthma in Manitoba, Canada

Mahmoud Torabi*

University of Manitoba, Canada

*Corresponding Author:
Mahmoud Torabi
Department of Community Health Sciences
University of Manitoba
750 Bannatyne Ave
Winnipeg,Manitoba, R3E 0W3, Canada
Tel: +001-204-272-3136
Fax: +001-204-789-3905
E-mail: [email protected]

Received date: March 20, 2012; Accepted date: April 25, 2012; Published date: April 27, 2012

Citation: Torabi M (2012) Spatial Disease Cluster Detection: An Application to Childhood Asthma in Manitoba, Canada. J Biom Biostat S7:010. doi:10.4172/2155-6180.S7-010

Copyright: ©2012 Torabi M. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

Cluster detection is an important part of spatial epidemiology because it may suggest potential factors associated with disease and thus guide further investigation of the nature of diseases. Many different methods have been proposed to test for disease clusters. The most popular methods for detecting spatial focused clusters are the circular spatial scan statistic (CSS), the flexible spatial scan statistic (FSS) and Bayesian disease mapping (BYM). Only the last of these is based on a rigorous modeling approach; however, Bayesian inference may depend on the choice of priors. We propose a frequentist approach, which yields maximum likelihood estimates, to identify potential focused clusters. The proposed approach is based on the recently introduced method of data cloning. It also provides predictions (and prediction intervals) for relative risk values. The advantages of the data cloning approach are that the answers are independent of the choice of priors and that non-estimable parameters are flagged automatically. We illustrate the proposed approach, and compare it with the aforementioned approaches, by analyzing a dataset of childhood asthma visits to hospital in the province of Manitoba, Canada, during 2000-2010. Our results show that the potential clusters are mainly located in the north-central part of the province.

Keywords

Asthma cases; Bayesian computation; Geographic epidemiology; Prediction; Random effects; Spatial cluster detection

Introduction

Asthma is a severe disease that inflames and narrows the airways, causing difficulty in breathing. The primary cause of asthma is known to be sensitization to allergic and non-allergic triggers. Allergic triggers include mould, animal dander, pollen, cockroaches, and dust mites; non-allergic triggers include weather, humidity, rain/precipitation, high surface pressure, low solar irradiance, winds, air pollution, respiratory viral infections, chemicals, and certain drugs. The major risk factors for developing asthma are known to be a family history of asthma and/or allergy (eczema, allergic rhinitis); exposure, in infancy, to high levels of antigens such as house dust mites; and exposure to tobacco smoke or to chemical irritants in the workplace.

According to the World Health Organization, asthma is now a serious public health problem with over 300 million sufferers worldwide [1]. Over the past two decades, asthma has reached epidemic proportions in large areas of North America. Asthma rates have been increasing remarkably, particularly in children: the disease occurs in up to 12% of all children in North America, and about twice as frequently in children living in poorer conditions, such as inner cities [2]. Asthma affects approximately 8% of the Canadian population [3]. According to Statistics Canada, 10% of Canadian children had been diagnosed with asthma (2008-2009), and asthma is the major cause of hospitalization of children in Canada [4]. Asthma is responsible for increasing numbers and proportions of emergency room visits and hospitalizations, with some increase in deaths as well [2]. With such an impact, it is important to identify trends in asthma incidence that may prompt further epidemiological studies to identify risk factors and any changes in important factors. Trends may occur over regions, and the focus of our paper is to examine geographical variation in the number of asthma visits to hospital during 2000 to 2010 in the province of Manitoba, Canada.

A limited region within the study regions with a high ratio of disease cases is defined as a spatial cluster [5]. The identification of a cluster of disease can help to find potential factors associated with disease and lead to improved understanding of etiology. Moreover, identification of clusters may lead to more detailed investigations to find out the association between exposures and disease interventions [6].

Statistical cluster detection methods are generally classified into two main categories, focused and general (also called non-focused). Methods for focused cluster detection are designed to identify regions with an excess number of cases in the vicinity of potential causes (e.g., a toxic waste site) [7,8]. On the other hand, methods for general clusters are designed to identify regions with an excess number of cases anywhere in the study region. Typically, these models accommodate extra-Poisson variability in different ways [9,10,11]. These methods are reviewed and compared in [12].

Methods for focused cluster detection include, but are not limited to, the circular spatial scan statistic (CSS) [13], the flexible spatial scan statistic (FSS) [14], and Bayesian disease mapping (BYM) [9]. Methods for general cluster detection include the Besag and Newell (BN) test [15,16] and the maximizing excess events test (MEET) [17]. The aim of focused tests is to test the null hypothesis of no local spatial cluster, while the general tests are used to detect potential clusters anywhere in the study region. In other words, for the focused tests (CSS, FSS, and BYM), the goal is to find a cluster for a specific region of interest, and consequently the test statistics are designed to capture the potential cluster. For the general tests (BN and MEET), the goal is to find any significant cluster in the study region without specifying any region of interest.

In this paper, we concentrate on focused cluster detection approaches. With advances in computational power, the Bayesian approach, especially the non-informative Bayesian approach, has become quite popular as a modeling approach for identifying potential clusters in a research study. However, the inference may depend on the choice of priors.

Recently, Lele et al. [18] introduced an alternative approach, called data cloning (DC), to compute the maximum likelihood estimates (MLE) and their standard errors for general hierarchical models. Lele et al. [19] also described an approach to compute predictions and prediction intervals for the random effects. The DC approach is thus well suited to address the issues in spatial focused cluster detection within the frequentist paradigm. Other advantages of the DC method are that the answers are invariant to the choice of priors and that non-estimable parameters are flagged automatically.

In this paper, we propose a frequentist approach via data cloning for identifying potential focused clusters. In particular, we evaluate the performance of the proposed approach, and compare it with other focused cluster detection approaches such as CSS, FSS and BYM, by applying them to a real dataset of childhood asthma visits to hospital in the province of Manitoba, Canada, during 2000-2010.

Materials and Methods

Study subjects

The study was based on a yearly dataset of asthma visits to hospital by children (age ≤ 20) in the Canadian province of Manitoba during the 2000-2010 fiscal years (see http://atlas.nrcan.gc.ca/site/english/maps/reference/national/can_political_e/map.pdf for a map of Canada). The population of Manitoba was stable during the study period, rising from 1.15 million in 2000 to 1.20 million in 2010, with an average population of children of around 336,000. The province consisted of eleven Regional Health Authorities that were responsible for the delivery of health care services. These eleven regions were further sub-divided into 56 Regional Health Authorities Districts (RHAD). The RHAD are the geographic units used in our model, and all data were linked to these geographic boundaries. For simplicity, we call these regions 1,2,...,56. In addition, a population-based centroid was provided for each RHAD; these centroids were not necessarily geographic centres. The data were aggregated over the study period 2000-2010.

The number of asthma visits totaled 14,691 over the study period, with mean and median numbers of yearly cases per region of 26 and 17 (range 3 to 422), respectively. The regional child population sizes varied from 319 to 173,400, with mean and median of 5,998 and 2,432, respectively. The largest population was in region 56, while region 42 had the smallest population.

The key data requirements for the focused methods are the number of cases and either the number of expected cases or the population size for each region. When the expected number of disease cases varies by important strata, such as year and gender, adjustments can be made. Here the expected number of disease cases is adjusted by year (1-10) and gender (male, female). We first briefly review the spatial focused cluster detection methods CSS, FSS, and BYM, and then explain the proposed data cloning approach.
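The paper does not spell out how the stratum adjustment is carried out. One common choice is indirect standardization, in which province-wide stratum rates are applied to each region's stratum populations. The following Python sketch illustrates that idea under this assumption; the function name, record layout, and toy numbers are hypothetical and are not taken from the Manitoba data.

```python
from collections import defaultdict

def expected_counts(records):
    """Indirectly standardized expected counts per region.

    records: list of (region, stratum, cases, population) tuples, where a
    stratum is e.g. a (year, gender) pair. This layout is an assumption.
    """
    cases_by_stratum = defaultdict(float)
    pop_by_stratum = defaultdict(float)
    for _region, stratum, cases, pop in records:
        cases_by_stratum[stratum] += cases
        pop_by_stratum[stratum] += pop

    # Overall (all-region) rate within each stratum.
    rate = {s: cases_by_stratum[s] / pop_by_stratum[s] for s in pop_by_stratum}

    # Expected cases: stratum rate times the region's stratum population.
    expected = defaultdict(float)
    for region, stratum, _cases, pop in records:
        expected[region] += rate[stratum] * pop
    return dict(expected)

# Toy data: two regions, one year, two genders (made-up numbers).
toy = [
    ("A", (1, "male"), 10, 1000),
    ("A", (1, "female"), 5, 1000),
    ("B", (1, "male"), 30, 1000),
    ("B", (1, "female"), 15, 1000),
]
E = expected_counts(toy)
```

With this construction the expected counts sum to the total number of observed cases, which is the convention that equation-based scan statistics typically rely on.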

Circular spatial scan statistic (CSS)

The spatial scan statistic has been used in a wide range of applications within the field of epidemiology [20]. The circular spatial scan statistic imposes a circular window S on each region, and for each region the radius of the circle varies from zero up to a pre-specified maximum distance d or until a pre-specified maximum number of regions K is included in the cluster. Let $S_{i:j}$ ($j=1,\ldots,J$) denote the window composed of region $i$ and its $(j-1)$ nearest neighbours. The set of all windows to be scanned by the circular spatial scan statistic is $S_1=\{S_{i:j};\ i=1,\ldots,m;\ j=1,\ldots,J\}$. For each circle, a likelihood ratio statistic is computed based on the numbers of observed and expected cases within and outside the circle. Let $L_0$ and $L_i$ ($i=1,\ldots,m$) be the likelihoods under the null and alternative hypotheses, where the null hypothesis is no cluster around region $i$ and the alternative hypothesis is a cluster formed by region $i$ and its nearest neighbours. Then the likelihood ratio statistic is given by

$$\frac{L_i}{L_0}=\left(\frac{C_i}{E_i}\right)^{C_i}\left(\frac{N-C_i}{N-E_i}\right)^{N-C_i} I(C_i>E_i), \tag{1}$$

where $C_i$ and $E_i$ denote the observed and expected numbers of cases in a circle, respectively, $N$ is the total number of cases in the study region, and $(N-C_i)$ and $(N-E_i)$ denote the observed and expected numbers of cases outside the circle, respectively. Note that the indicator function $I(\cdot)$ is equal to one when $C_i>E_i$ and zero otherwise.

The circles with the highest likelihood ratio values are identified as potential clusters. We can implement this method using SaTScan [21] or FleXScan [22] software. In general, K is chosen so that a window includes at most 50% of the population at risk. We used K = 15, the FleXScan default, and since our example uses aggregate data, the region centroid had to fall within the radius of the circle for the region to be part of the circle.
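To make the scanning procedure concrete, the following Python sketch evaluates the logarithm of the Poisson likelihood ratio for every circular window built from nearest-neighbour centroids and returns the window with the largest value. It is an illustration only and not the SaTScan or FleXScan implementation: the function names and toy data are hypothetical, expected counts are assumed to sum to the total number of cases, and the Monte Carlo step that those packages use to assess significance is omitted.

```python
import math

def log_lr(c, e, n):
    """Log likelihood ratio for one window (Poisson model).

    c, e: observed and expected cases inside the window; n: total cases.
    Returns 0 when c <= e, mirroring the indicator I(c > e).
    """
    if c <= e:
        return 0.0
    inside = c * math.log(c / e)
    # The outside term vanishes in the limit c -> n.
    outside = (n - c) * math.log((n - c) / (n - e)) if c < n else 0.0
    return inside + outside

def circular_scan(centroids, cases, expected, max_regions):
    """Return (best log LR, sorted region indices) over all circular windows."""
    n = sum(cases)
    m = len(centroids)
    best = (0.0, [])
    for i in range(m):
        # Regions ordered by centroid distance from region i (i itself first).
        order = sorted(range(m), key=lambda k: math.dist(centroids[i], centroids[k]))
        c = e = 0.0
        for j in range(min(max_regions, m)):
            c += cases[order[j]]
            e += expected[order[j]]
            stat = log_lr(c, e, n)
            if stat > best[0]:
                best = (stat, sorted(order[: j + 1]))
    return best

# Toy example: four regions on a line, excess cases in regions 0 and 1.
centroids = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
cases = [30, 25, 5, 4]
expected = [16.0, 16.0, 16.0, 16.0]
best = circular_scan(centroids, cases, expected, max_regions=2)
```

In the toy data the two western regions jointly carry the excess, so the best window combines them rather than selecting either region alone.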

Flexible spatial scan statistic (FSS)

This method is similar to CSS; however, the detected cluster is allowed to be flexible in shape while remaining confined to a relatively small neighbourhood of each region. The flexible scan statistic imposes an irregularly shaped window S on each region by connecting its adjacent regions. For each region $i$, irregularly shaped windows of length $j$ (sets of $j$ connected regions including $i$) are formed, with $j$ running from 1 to the pre-specified maximum $J$. The connected regions are restricted to subsets of the set consisting of region $i$ and its $(J-1)$ nearest neighbours, where $J$ is the pre-specified maximum length of a cluster. The set of all windows to be scanned by the flexible spatial scan statistic is then $S_2=\{S_{i:j(k)};\ i=1,\ldots,m;\ j=1,\ldots,J;\ k=1,\ldots,k_{ij}\}$, where $k$ indexes the $k_{ij}$ connected windows of length $j$ that contain region $i$. Note that the circular spatial scan statistic considers $J$ circles for a given region $i$, whereas the flexible spatial scan statistic considers these $J$ circles in addition to all sets of connected regions whose centroids are located within the $J$-th largest concentric circle. As a consequence, $S_2$ is much larger than $S_1$, which contains at most $mJ$ windows. Under the Poisson assumption, the test statistic for the flexible spatial scan statistic based on the likelihood ratio test is obtained from (1), where the window in (1) now ranges over $S_2$ rather than $S_1$. We implement this method with the FleXScan software, using the default setting J=15. As with the circular spatial scan statistic, the windows with the highest likelihood ratio values are identified as potential clusters.
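The difference between the two window sets can be illustrated by enumerating the flexible windows for one region. The Python sketch below lists every connected subset that contains region i and lies within region i plus its nearest neighbours. It is a brute-force illustration under assumed inputs (hypothetical function names, centroid coordinates, and an adjacency dictionary); FleXScan's own search is not reproduced here, and the enumeration grows exponentially with the maximum length.

```python
import math
from itertools import combinations

def is_connected(nodes, adjacency):
    """True if the regions in `nodes` form one connected piece."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adjacency[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes

def flexible_windows(i, centroids, adjacency, max_len):
    """All connected windows containing region i, drawn from region i
    and its (max_len - 1) nearest neighbours by centroid distance."""
    m = len(centroids)
    nearest = sorted(range(m), key=lambda k: math.dist(centroids[i], centroids[k]))
    others = [k for k in nearest[:max_len] if k != i]
    windows = []
    for size in range(len(others) + 1):
        for extra in combinations(others, size):
            candidate = (i,) + extra
            if is_connected(candidate, adjacency):
                windows.append(frozenset(candidate))
    return windows

# Toy map: four regions in a row, each adjacent to its immediate neighbours.
centroids = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
windows_for_1 = flexible_windows(1, centroids, adjacency, max_len=3)
windows_for_0 = flexible_windows(0, centroids, adjacency, max_len=3)
```

For the end region 0 the subset {0, 2} is rejected because regions 0 and 2 do not share a boundary, which is exactly the connectivity restriction that distinguishes flexible windows from arbitrary subsets.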

Bayesian disease mapping (BYM)

A Bayesian approach using Markov chain Monte Carlo (MCMC) can also be used for cluster detection [9,10,23,24]. This approach was first used by Besag et al. (BYM) [9] and the model consists of two parts. In the first part, the cases are assumed to follow a Poisson distribution with an area-specific mean $\theta_i E_i$:

$$C_i \mid \theta_i \sim \text{Poisson}(\theta_i E_i), \quad i=1,\ldots,m,$$

where $C_i$ and $E_i$ are the observed and expected numbers of cases in region $i$, respectively. The second part of the model is given by

$$\log(\theta_i)=\mu+\eta_i,$$

where $\theta_i$ is the relative risk (RR) in region $i$, $\mu$ is the overall mean (on the log scale) over the regions and $\eta_i$ represents spatially correlated random effects. We use a conditionally autoregressive (CAR) model to capture the spatial random effects $\eta_i$. A variety of CAR models may be obtained by taking a collection of mutually compatible conditional distributions $p(\eta_i \mid \eta_j,\ j\in\partial_i)$, $i=1,\ldots,m$, where $\partial_i$ refers to the set of neighbours of the $i$-th region [9]. We consider the following general model for the spatial effects $\eta=(\eta_1,\ldots,\eta_m)'$:

$$\eta \sim N\!\left(0,\ \sigma_\eta^2\,(I_m-\lambda_\eta D)^{-1}P\right), \tag{2}$$

where $P$ is an $m\times m$ diagonal matrix with elements $P_{ii}=1/E_i$; $D$ is an $m\times m$ matrix with elements $D_{ij}=(E_j/E_i)^{1/2}$ if regions $i$ and $j$ are adjacent and $D_{ij}=0$ otherwise; $\sigma_\eta^2$ is the spatial dispersion parameter; $\lambda_\eta$ measures the spatial autocorrelation, with $\lambda_{\min}^{-1}<\lambda_\eta<\lambda_{\max}^{-1}$, where $\lambda_{\min}$ and $\lambda_{\max}$ are the smallest and largest eigenvalues of $P^{-1/2}DP^{1/2}$; and $I_m$ is an identity matrix of dimension $m$ (see [25] for details of this proper CAR model). The parameters can then be estimated within the Bayesian framework (MCMC) using vague priors. This produces the posterior distributions for the parameters in the model.

A cluster is defined as a region whose estimated relative risk is significantly larger than 2 (in terms of its credibility set) [26]. To implement this method, we used WinBUGS software [25] to compute the relative risk values.
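The decision rule can be sketched in a few lines of Python once posterior draws of the relative risks are available. The sketch below is an assumption-laden illustration rather than the author's WinBUGS code: it uses a 95% equal-tailed interval, ranks flagged regions by the lower interval bound (the paper does not state how its ordering was derived), and works on simulated log-normal draws instead of real MCMC output. All names are hypothetical.

```python
import math
import random

def lower_bound(draws, level=0.95):
    """Lower end of an equal-tailed credible interval from posterior draws."""
    ordered = sorted(draws)
    index = int((1.0 - level) / 2.0 * len(ordered))
    return ordered[index]

def flag_clusters(draws_by_region, threshold=2.0, level=0.95):
    """Regions whose interval lies entirely above `threshold`,
    ordered from the highest lower bound to the lowest (an assumed ranking)."""
    bounds = {r: lower_bound(d, level) for r, d in draws_by_region.items()}
    flagged = [r for r, b in bounds.items() if b > threshold]
    return sorted(flagged, key=lambda r: -bounds[r])

# Simulated stand-in for MCMC output: log-normal draws around a chosen median.
rng = random.Random(0)

def fake_draws(median, sd, n=4000):
    return [median * math.exp(rng.gauss(0.0, sd)) for _ in range(n)]

draws = {
    "a": fake_draws(3.5, 0.10),  # clearly above 2
    "b": fake_draws(2.2, 0.20),  # point estimate above 2, interval is not
    "c": fake_draws(2.6, 0.05),  # above 2 with a tight interval
}
flagged = flag_clusters(draws)
```

Region "b" shows why the rule is conservative: its central estimate exceeds 2, but its interval extends below the threshold, so it is not declared a cluster.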

Frequentist approach using data cloning for disease mapping (DC)

The DC method uses the Bayesian computational approach for frequentist purposes. In DC, the observations $y=(y_1,\ldots,y_m)'$ are imagined to be collected independently by $L$ different individuals, all of whom obtain exactly the same set of observations $y$; the cloned data are denoted $y^{(L)}=(y,y,\ldots,y)$. The posterior distribution of the model parameters $\alpha$ conditional on the data $y^{(L)}$ is then given by

$$\pi_L(\alpha \mid y^{(L)})=\frac{[\mathcal{L}(\alpha;y)]^{L}\,\pi(\alpha)}{C(L;y)}, \tag{3}$$

where $\mathcal{L}(\alpha;y)$ is the likelihood of the original data, $\pi(\alpha)$ is the prior distribution on the parameter space and $C(L;y)=\int[\mathcal{L}(\alpha;y)]^{L}\,\pi(\alpha)\,d\alpha$ is the normalizing constant. The expression $[\mathcal{L}(\alpha;y)]^{L}$ is the likelihood for $L$ copies of the original data. Lele et al. [18,19] showed that, for $L$ large enough, $\pi_L(\alpha \mid y^{(L)})$ converges to a multivariate Normal distribution with mean equal to the MLE $\hat{\alpha}$ of the model parameters and variance-covariance matrix equal to $1/L$ times the inverse of the Fisher information matrix for the MLE. This factor of $1/L$ adjusts for the fact that the cloned dataset carries $L$ times more information than the original dataset. Hence, this distribution is nearly degenerate at the MLE $\hat{\alpha}$ for large $L$. Moreover, the sample mean vector of the random numbers generated from (3) provides the MLE of the model parameters, and $L$ times their sample variance-covariance matrix is an estimate of the asymptotic variance-covariance matrix of the MLE $\hat{\alpha}$. Lele et al. [19] also provided various checks to determine an adequate number of clones $L$.
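The effect of cloning can be seen in closed form in a deliberately simplified setting. The Python sketch below assumes a single common relative risk θ with counts distributed as Poisson(θ·E_i) and a gamma prior on θ, so the cloned posterior is again a gamma distribution and no MCMC is needed. This is not the spatial model of the paper, and the function name and toy numbers are hypothetical; the point is only that, as the number of clones grows, the posterior mean approaches the MLE and L times the posterior variance approaches the inverse Fisher information, whatever the prior.

```python
def cloned_posterior(y, e, clones, shape, rate):
    """Posterior mean and clones-scaled variance for theta.

    Assumed toy model: y_i ~ Poisson(theta * e_i), theta ~ Gamma(shape, rate).
    With the data repeated `clones` times, the posterior is
    Gamma(shape + clones * sum(y), rate + clones * sum(e)).
    """
    a = shape + clones * sum(y)
    b = rate + clones * sum(e)
    mean = a / b
    variance = a / b ** 2
    return mean, clones * variance

# Made-up counts and expected counts for three regions.
y = [12, 7, 20]
e = [10.0, 8.0, 15.0]

mle = sum(y) / sum(e)            # maximum likelihood estimate of theta
asym_var = mle / sum(e)          # inverse Fisher information at the MLE

# Two very different priors: vague and strongly informative.
vague_1 = cloned_posterior(y, e, clones=1, shape=0.001, rate=0.001)
strong_1 = cloned_posterior(y, e, clones=1, shape=50.0, rate=5.0)
vague_big = cloned_posterior(y, e, clones=1000, shape=0.001, rate=0.001)
strong_big = cloned_posterior(y, e, clones=1000, shape=50.0, rate=5.0)
```

With one copy of the data the informative prior pulls the posterior mean far from the MLE, while with 1000 clones both priors give essentially the same mean and scaled variance.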

Prediction of relative risk: Prediction of relative risk (random effects), particularly from the frequentist viewpoint, is usually problematic. A naive approach, when $\alpha$ is estimated using the data, is to use $\pi(\theta \mid y,\hat{\alpha})$, where $\pi(\theta \mid y,\hat{\alpha}) \propto f(y \mid \theta)\,g(\theta \mid \hat{\alpha})$. However, this approach does not take into account the variability introduced by estimating the model parameters. An approach that has been suggested in the literature (e.g., Hamilton [27]) to take into account the variation of the estimator is to use the density

$$\pi(\theta \mid y)=\frac{\int f(y \mid \theta)\,g(\theta \mid \alpha)\,\phi\!\left(\alpha;\hat{\alpha},I^{-1}(\hat{\alpha})\right)d\alpha}{C(y)}, \tag{4}$$

where $f(\cdot \mid \cdot)$ and $g(\cdot \mid \cdot)$ are appropriate distributions, $C(y)$ is the normalizing constant, and $\phi(\cdot;\xi,\sigma^2)$ denotes the Normal density with mean $\xi$ and variance $\sigma^2$, which here are equal to the MLE $\hat{\alpha}$ and the inverse of the Fisher information matrix $I^{-1}(\hat{\alpha})$, respectively. In this paper, we obtain predictions of the RR using the density in equation (4) along with MCMC sampling. As in the Bayesian approach, a cluster is defined as a region whose estimated relative risk is significantly larger than 2 (in terms of its prediction interval). We used the dclone package [28] in R [29] to compute the relative risk values.

Note that these focused methods have different assumptions. While the CSS and FSS methods are distribution free, the number of cases in BYM and DC methods is assumed to follow a Poisson distribution. We also need to specify the number of regions to be included in the cluster for the CSS and FSS methods while it is not required for the BYM and DC methods.

Results

We compared the methods CSS, FSS, BYM, and DC for detecting potential clusters of childhood asthma visits to hospital over the 10-year period (2000-2010) in the province of Manitoba, Canada.

In Figure 1, the areas that are statistically significant (potential clusters) are shown for each method separately. The results are summarized in Table 1, which also reports the order of the significant regions for each method. More precisely, the regions are ordered according to how strongly each one qualifies as a cluster. For instance, 1 in the DC column means that region 37 is the most likely to constitute a significant cluster, while 6 means that region 26 is the least likely among the flagged regions. Hence, it is easy to see which regions contribute most to a cluster.

Figure 1: Regional health authorities districts (RHAD) identified as potential clusters (shaded regions) for methods CSS, FSS, BYM, and DC.

                         Methods
Region   Ci    Ei     CSS   FSS   BYM   DC
10       273   121    1     -     -     -
14       156   80     1     -     -     -
20       362   229    1     -     -     -
21       292   138    1     -     -     -
25       296   170    -     1     -     -
26       359   124    1     1     6     6
28       356   105    -     -     5     4
29       394   213    -     1     -     -
31       333   231    1     -     -     -
32       135   48     -     1     -     -
33       73    23     1     1     -     -
34       218   52     1     1     3     3
35       257   96     1     1     -     -
36       327   91     1     1     4     5
37       624   167    1     1     1     1
38       49    16     1     1     -     -
39       117   33     -     1     -     -
40       240   80     1     1     -     -
41       268   70     1     1     2     2

Ci and Ei are observed and expected number of cases in region i; CSS, FSS, BYM, and DC are circular spatial scan statistic, flexible spatial scan statistic, Bayesian disease mapping, and the method of data cloning, respectively.

Table 1: The order of significant regions for methods CSS, FSS, BYM, and DC.

The CSS and FSS methods identified somewhat similar regions as potential clusters, with 13 regions for the FSS method and 14 regions for the CSS method. In particular, the CSS method detected the regions {10,14,20,21,26,31,33,34,35,36,37,38,40,41} as potential clusters while the FSS method identified the regions {25,26,29,32,33,34,35,36,37,38,39,40,41}. The main reason for the different results is the non-circular shape of some regions in the province of Manitoba, which the FSS method is able to capture as potential clusters but the CSS method is not.

The DC method detected the regions {26,28,34,36,37,41} as potential clusters. The same regions were also identified as potential clusters by the BYM method but with a different order of significance (e.g., regions 28 and 36). However, the BYM approach may depend on the choice of priors, and different priors may give different results; we used a gamma distribution with shape and scale parameters 0.001 for the inverse of the variance component and a Normal distribution with mean 0 and variance 10^6 for the fixed effect. It is worth mentioning that the regions identified as potential clusters by the DC and BYM methods were also detected by the CSS and FSS methods, except for region 28.
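The overlap between the methods can be checked directly from the region sets listed above and the counts in Table 1. The short Python sketch below does this with plain set operations; the crude ratios Ci/Ei it computes are simple observed-to-expected ratios and are not the model-based relative risk estimates produced by BYM or DC.

```python
# Region sets as reported in the text.
css = {10, 14, 20, 21, 26, 31, 33, 34, 35, 36, 37, 38, 40, 41}
fss = {25, 26, 29, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41}
dc = {26, 28, 34, 36, 37, 41}
bym = set(dc)  # BYM flagged the same regions as DC

both_scan = css & fss                 # flagged by both scan statistics
only_css = css - fss
only_fss = fss - css
missed_by_scan = dc - (css | fss)     # flagged by DC/BYM but by neither scan

# Observed (Ci) and expected (Ei) counts from Table 1 for the DC regions.
table = {26: (359, 124), 28: (356, 105), 34: (218, 52),
         36: (327, 91), 37: (624, 167), 41: (268, 70)}
crude_ratio = {r: c / e for r, (c, e) in table.items()}
```

All six DC regions have a crude observed-to-expected ratio above 2, which is in line with the threshold used in the cluster definition, and region 28 is the only one that neither scan statistic flagged.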

Discussion

The most popular approaches for detecting spatial focused clusters are distribution free methods such as CSS and FSS. The Bayesian method (BYM) which is based on a Poisson model is also popular as a method for identifying spatial focused clusters. However, the Bayesian inference may depend on the choice of priors.

Using DC, we have proposed a frequentist approach that identifies potential clusters with a high ratio of disease. The advantages of the DC approach are that the answers are independent of the choice of priors and that non-estimable parameters are flagged automatically. We applied the proposed approach to a real dataset of childhood asthma visits to hospital in the province of Manitoba, Canada, and compared it with other methods such as CSS, FSS, and BYM. The CSS and FSS methods detected some different regions as potential clusters because of the non-circular shape of some regions in the province of Manitoba. The BYM and DC methods identified fewer regions as potential clusters than the CSS and FSS methods. Although the results of DC and BYM were similar for detecting potential clusters in our analysis, one may get different results for BYM, unlike DC, when using different priors.

In the BYM and DC approaches, we conservatively defined a region as a cluster if the credibility set (BYM) or prediction interval (DC) of the estimated relative risk lay entirely above two. One may define a different decision rule in which the threshold for the estimated relative risk is larger or smaller than two [30].

We adjusted our expected number of asthma cases for two important factors, gender and year. The proposed method can also be easily extended to include covariates directly, which may be required for some applications.

In general, the potential clusters are located in the north-central part of the province. These findings may represent real increases or may be indicative of different distributions of important covariates, such as demographic characteristics of the population of the north-central region, that are unmeasured and unadjusted for in our modeling. Further investigation is needed to explore these findings.

Acknowledgements

The author is grateful for the helpful comments and suggestions of a referee. This work was supported by grants from the University Research Grants Program (URGP) at the University of Manitoba and the Natural Sciences and Engineering Research Council of Canada (NSERC).

Disclaimer: The interpretations, conclusions and opinions expressed in this paper are those of the author and do not necessarily reflect the position of Manitoba Health. This study is based in part on data provided by Manitoba Health through Manitoba Centre for Health Policy. The interpretation and conclusions contained herein are those of the researcher and do not necessarily represent the views of the government of Manitoba.

References
