Arétouyap Z^{1,*}, Njandjock Nouck P^{1}, Nouayou R^{1}, Méli’i JL^{1}, Kemgang Ghomsi FE^{1}, Piepi Toko AD^{1} and Asfahani J^{2}  
^{1}Postgraduate School of Science, Technology and Geosciences, University of Yaounde I, P.O. Box 812 Yaounde, Cameroon  
^{2}Applied Geophysics Division, Head Atomic Energy Commission, P.O. Box 6091 Damascus, Syria  
Corresponding Author: Arétouyap Z, Postgraduate School of Science, Technology and Geosciences, University of Yaounde I, P.O. Box 812 Yaounde, Cameroon, Tel: +237 675086759, Email: [email protected]
Received August 06, 2015; Accepted October 21, 2015; Published October 30, 2015  
Citation: Arétouyap Z, Nouck NP, Nouayou R, Méli’i JL, Kemgang Ghomsi FE, et al. (2015) Influence of the Variogram Model on an Interpolative Survey Using Kriging Technique. J Earth Sci Clim Change. 6:316. doi:10.4172/2157-7617.1000316  
Copyright: © 2015 Arétouyap Z, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.  
Visit for more related articles at Journal of Earth Science & Climatic Change
Geostatistics is an efficient and effective method to continuously assess the content, the spatiotemporal distribution and the correlation of a discretely sampled deposit. It begins with an exploratory analysis that evaluates the consistency and distribution of the data through histograms and Q-Q plots, continues with a structural analysis that evaluates data correlation and dependency through the variogram, and ends with a predictive analysis using kriging. This prediction method is used in various geographical investigations: meteorology, demography, hydrology, orography, economy, pollution, etc. Even when dedicated software is used, it is generally the user's responsibility to manually select a suitable variogram model. The main objectives of this paper were to highlight how the choice of a variogram model can affect the results of an interpolative predictive analysis and to show how a best-fitted model can be selected. The results, illustrated with an example, show that the choice of the variogram model influences both the endpoints and the amplitude of the range of the kriged values. However, the direction of variation of the interpolated values is independent of the variogram model: different variogram models (with the same characteristics) produce different thematic maps, but the areas of minimum and maximum values remain unchanged. Cross-validation statistics such as mean error (ME), mean square error (MSE), root mean square error (RMSE), average standard error (ASE) and root mean square standardized error (RMSSE) can help to ascertain the performance of the developed models.
Keywords 
Kriging; Predictive analysis; Spatial analysis; Structural analysis; Variogram 
Introduction 
Environmental science can be considered the field of science that studies the interactions of the physical, chemical and biological components of the environment, as well as the relationships and effects of these components with the organisms in the environment. In order to foresee what would happen if drastic events such as extreme rainfall, extreme deforestation or chemical pollution occur, environmentalists work to understand the complex relationships between multiple disciplines, including biology, chemistry and geology. This discipline can thus address various issues such as populations, weather, surface water, land, mountains, vegetation, economy, urbanization, natural hazards, mining, energy, water resources, pollution and sanitation. Hence, it has three main goals: to learn how the natural world works, to understand how we as humans interact with the environment, and to determine how we affect the environment. The third goal also includes finding ways to deal with these effects. All these categories use (geo)statistical approaches to resolve natural and human problems that have a spatial dimension. Geomatics is one of the important specialties because most of the phenomena and matters studied in geology need to be mapped, either for simple illustration (reprography or presentation) or for the assessment (prediction or forecasting), management and allocation of the world's physical and/or human resources. In particular, assessing a variable is very delicate because it is a matter of interpolating that variable where no measurement has been conducted, or of establishing a correlation between data of different natures. For this purpose, several software packages have been developed, including ArcGIS and Golden Surfer, and are widely used by thousands of scientists worldwide for various aims. 
Interpolative and autocorrelation approaches are essential in a number of geological and environmental investigations. Cheng et al. [1] studied the autocorrelation of road network data; Gerkman [2] modelled the spatial pattern of econometric parameters in a small-scale neighborhood setting; O’Kelly et al. [3] modelled spatial interaction from Irish commuting data; Yates and Sanjeevi [4] modelled the assessment of the impact of vulnerability in the protection of critical infrastructure; Singleton et al. [5] combined geodemographics and spatial interaction in an integrated model for higher education; LeSage and Llano [6] modelled spatial interaction with spatially structured origin and destination effects; Nazneen [7] applied the ordered-response model to the analysis of urban land-use development intensity patterns; Bourgault [8] revisited multi-Gaussian kriging for the estimation of spatial distributions; Kolyukhin and Tveranger [9] statistically analyzed the fracture-length distribution sampled under truncation and censoring effects; LeSage and Sheng [10] spatially examined endogenous versus exogenous interaction; Mack et al. [11] analyzed spatiotemporal industrial composition; and Sun et al. [12] mapped soil particle-size fractions using compositional kriging, cokriging and additive log-ratio cokriging. More recently, geostatistics has been used by Arétouyap et al. [13] to characterize aquifers in the Pan-African context; by Binita and Marshall Shepherd [13] for a temporal and spatial assessment of climate change vulnerability; by Chaney and Rojas Guyler [14] to establish the geographic variability in adolescent drug use and its correlates; by Eidsvik [15] to model reservoirs; by Keumseok et al. [16] to compare spatial patterns of simulated obesity prevalence with measures of low income and food accessibility; by Melnikova et al. [17], who reviewed history matching through a smooth formulation of multiple-point statistics; by Mishra and Chaudhuri [18] to characterize spatiotemporal trends in vegetation greenness in the Uttarakhand Himalayas; and by Zunkel [19] to establish a network of all 14 tornado sirens and examine the number of residents included and not included in that network. 
Most of these modelling, geospatialization and interpolation studies were conducted with ArcGIS and Golden Surfer. These software packages rely on interpolation techniques such as minimum curvature, inverse distance, spline functions, trend surface and kriging [20]. Kriging is distinguished from the other techniques by its unbiasedness; it is known as the BLUE (Best Linear Unbiased Estimator). It is therefore one of the most widely used methods for this purpose across the environmental sciences. Zamani and Mirabadi [21] used it to optimize the sensor orientation in railway wheel detectors; Diodato to assess the spatial uncertainty of nitrates in aquifers; Arétouyap et al. [22] used it to analyze weather changes in Central Africa and also to study the distribution of physicochemical parameters of groundwater in the Adamawa area of Cameroon [23,24] to identify an excursion set; Hamel et al. [25] to produce scintillation maps; Nshagali et al. [26] to analyze the distribution of pH and iron concentration in the crystalline basement of an equatorial region. The use of this method is growing with the development of new mining platforms across the newly industrialized countries (Cameroon, Australia, South Africa, Mexico, Ethiopia, Brazil, Turkey, Philippines, etc.). This method, so efficient, effective and popular with geoscientists, has a very important preliminary step upon which the reliability of interpolation and prediction depends: the structural analysis focused on the variogram. 
This step is so important that, in many versions of Golden Surfer, it is the user's responsibility to select a suitable variogram model. That is likely why Van Groenigen studied the influence of variogram parameters on optimal sampling schemes for mapping by kriging. The main objectives of this paper are (1) to highlight how the choice of a variogram model can affect the results of an interpolative predictive analysis and (2) to show how a best-fitted model can be selected. 
Methods 
Data and study area 
In this experimental analysis, we used a dataset of aquifer resistivity computed from vertical electrical soundings conducted in the Pan-African context of Adamawa, Cameroon [13]. This field campaign was carried out in order to characterize local aquifers, and the corresponding dataset is presented in Table 1. 
Study area 
The Pan-African region of Adamawa is located in the heart of Central Africa between 6°-8° North latitude and 11°-16° East longitude (Figure 1). It extends over a length of about 410 km from west to east between Nigeria and the Central African Republic, for a total area of 67,827 km^{2}. From March to October, the region receives an average rainfall of 1,540 mm per year. The temperature is moderate, with an annual average of around 25°C (Arétouyap et al. 2014). On the hydrological level, the Adamawa region is called “the water tower of Cameroon” because it feeds three of the four major watersheds of the country, namely the Lake Chad basin, the Niger basin in the north and the Sanaga Atlantic basin in the south. This region consists of two major geological domains: 
 The old basement, which includes highly metamorphosed formations (migmatite, gneiss and mica) and intrusive bodies composed of granites; 
 The cover formations, which include red lateritic soils, sedimentary rocks (sandstones and conglomerates) and volcanic rocks (basalt and trachyte). 
This region is the seat of a Pan-African granite-gneissic basement, represented by granites, gneisses and Pan-African migmatites. The geological formations encountered are basalts, trachytes and trachyphonolites resting mostly on concordant and discordant alkaline granites [27]. Two major fractures cross the region in two directions: 
 The first, oriented N30°E and the most common, is that of the ‘‘Cameroon volcanic line’’; 
 The second, oriented N70°E, is the ‘‘Adamawa line’’ or ‘‘Adamawa shear zone’’. 
The soils of the region are lateritic and classified into two types [28,29]: red soils derived from ancient metamorphic rocks and red soils formed on old basalts. 
Variogram 
Kriging is widely regarded as one of the best interpolation techniques because it is unbiased. Nevertheless, it requires the data to be spatially correlated and dependent. This is checked through a structural analysis conducted by means of the variogram, a tool used to describe the spatial continuity of a phenomenon [30]. The theoretical formulation of the variogram γ(h) uses the concept of variance (Var) applied to the difference between two observations z(x) and z(x+h) separated by a distance h (Eqn. 1). 
$$\gamma(h) = \frac{1}{2}\,\mathrm{Var}\left[z(x) - z(x+h)\right] \quad (1)$$
In practice, only the experimental variogram γ_{e}(h) is calculated from the observations using Eqn. 2. 
$$\gamma_{e}(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \left[z(x_{i}) - z(x_{i}+h)\right]^{2} \quad (2)$$
where γ_{e}(h) is the estimated value of the variogram for lag h; N(h) is the number of pairs of points separated by distance h; z(x_{i}) and z(x_{i}+h) are the values of z at positions x_{i} and x_{i}+h, respectively. Ideally, a point of the experimental variogram is considered representative if N(h) ≥ 30. A suitable theoretical variogram model is then fitted to these experimental points. The main eligible models are the nugget effect, linear, gravimetric, cubic, pentaspherical, spherical, exponential, power, Gaussian, Cauchy and logarithmic variograms. A model is admissible if any variance calculated from the model is positive [31]. The description of a variogram model is based on the quantification of the parameters identified in Figure 2. The range (length) a is the distance at which the correlation between observations becomes zero. At this distance, the variogram reaches the sill (scale) σ^{2}, which is the sum of the nugget variance C_{0} and the partial sill (variance) C. The nugget effect derives from various sources such as measurement errors, the existence of a microstructure smaller than the size of the sample and/or the presence of a microstructure with a range less than the distance between the two closest observations. It may be impossible to quantify the contribution of each source. 
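As an illustration of the experimental variogram described above, the following minimal Python sketch groups all pairs of observations by lag and averages half their squared differences. The function name `experimental_variogram`, the half-lag tolerance used to assign pairs to a lag, and the toy values are assumptions made for this example; they are not taken from the study or from any particular software.

```python
from itertools import combinations
from math import dist


def experimental_variogram(coords, values, lag_width, n_lags):
    """Return (lag, semivariance, pair count) for lags lag_width * 1..n_lags."""
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for (xi, zi), (xj, zj) in combinations(zip(coords, values), 2):
        # Assign the pair to the nearest lag multiple (tolerance: half a lag width)
        k = round(dist(xi, xj) / lag_width)
        if 1 <= k <= n_lags:
            sums[k - 1] += (zi - zj) ** 2
            counts[k - 1] += 1
    result = []
    for k in range(n_lags):
        # Semivariance is half the mean squared difference; None if no pair fell in the lag
        gamma = sums[k] / (2 * counts[k]) if counts[k] else None
        result.append(((k + 1) * lag_width, gamma, counts[k]))
    return result


# Toy 1-D transect with made-up values (not the Table 1 data)
coords = [(0.0,), (1.0,), (2.0,), (3.0,)]
values = [1.0, 3.0, 2.0, 5.0]
variogram = experimental_variogram(coords, values, lag_width=1.0, n_lags=3)

# Lags with fewer than 30 pairs would not be considered representative
weak_lags = [lag for lag, _, n_pairs in variogram if n_pairs < 30]
```

With real data, the pair counts returned for each lag make it easy to apply the N(h) ≥ 30 rule of thumb before fitting a model.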
Kriging 
Kriging is a commonly used method of interpolation (prediction) for spatial data. The data are a set of observations of some variable(s) of interest, with some spatial correlation present. Usually, the result of kriging is the expected value and variance computed for every point within a region. Thus, it is a direct approach with a unique solution to an estimation problem and can be used to estimate the unknown value Z* of a variable at a point from the surrounding known values Z_{i} using the following Eqn. 3. 
$$Z^{*} = \sum_{i=1}^{n} \lambda_{i} Z_{i} \quad (3)$$
Where λ_{i} are the kriging weights and n is the number of surrounding observations. 
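Obtaining a minimum estimation variance means minimizing the expression given by Eqn. 4, in which Z denotes the true (unknown) value at the point to be estimated. 
$$\sigma_{e}^{2} = \mathrm{Var}\left[Z - Z^{*}\right] \quad (4)$$
Substituting the linear estimator, Eqn. 4 can be rewritten as Eqn. 5. 
$$\sigma_{e}^{2} = \mathrm{Var}[Z] + \sum_{i=1}^{n}\sum_{j=1}^{n} \lambda_{i}\lambda_{j}\,\mathrm{Cov}[Z_{i}, Z_{j}] - 2\sum_{i=1}^{n} \lambda_{i}\,\mathrm{Cov}[Z, Z_{i}] \quad (5)$$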
To ensure that the linear estimator is unbiased (Eqn. 5), the constraint Σ_{i} λ_{i} = 1 should be integrated into the model. This constraint means that the local average of the observations is constant throughout the field. The minimization of a quadratic function under an equality constraint (Eqn. 6) is performed by the method of Lagrange, which involves the Lagrange multiplier μ: 
With the substitution of the Eqn. 6 can be rewritten as Eqn. 7. 
Setting to zero all the partial derivatives of Eqn. 7 with respect to each λ_{i} and to μ yields the ordinary kriging system: 
(8) 
The minimum estimation variance of the system (kriging variance) is determined by substituting the kriging equations of Eqn. 8 to obtain Eqn. 9. 
(9) 
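For reference, a standard textbook form of the constrained minimization described above is sketched below. It is written under assumptions: σ_{e}^{2} stands for the estimation variance Var[Z - Z*], Z for the true value at the point to be estimated, and the factor 2 in front of μ is a common convention chosen here for convenience. It is not claimed to reproduce the exact notation of Eqns 6 to 9.

```latex
\begin{align*}
% Lagrangian: estimation variance plus the unbiasedness constraint
L(\lambda_1,\dots,\lambda_n,\mu)
  &= \sigma_e^{2} + 2\mu\Big(\sum_{i=1}^{n}\lambda_i - 1\Big) \\
% Setting the partial derivatives to zero gives the ordinary kriging system
\sum_{j=1}^{n}\lambda_j\,\mathrm{Cov}[Z_i,Z_j] + \mu
  &= \mathrm{Cov}[Z,Z_i], \qquad i = 1,\dots,n \\
\sum_{j=1}^{n}\lambda_j &= 1 \\
% Substituting the system back into the estimation variance
\sigma_k^{2}
  &= \mathrm{Var}[Z] - \sum_{i=1}^{n}\lambda_i\,\mathrm{Cov}[Z,Z_i] - \mu
\end{align*}
```

With this sign convention, the multiplier μ enters the kriging variance σ_{k}^{2} with a minus sign; other texts define the Lagrangian with -2μ, which flips that sign.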
In practice, it is easier to use the matrix form of the kriging system (Eqn. 10): 
(10) 
Where K_{s} is the (n × n) matrix of covariances between observations, k_{s} is the (n × 1) vector of covariances between the n observations and the point to be estimated, and λ is the vector of kriging weights. The solution of this system is provided in matrix form as given by Eqn. 11. 
(11) 
Where (12) 
And finally, (13) 
Thus, Eqn. 13 is used to calculate the kriging weights λ_{i} needed to estimate a point defined by the linear estimator with Eqn. 3 [22,32]. 
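To make the matrix formulation concrete, the sketch below solves an ordinary kriging system numerically for a single target point. It is a minimal illustration under stated assumptions: it uses the augmented (n+1) × (n+1) system that carries the unbiasedness constraint, the helper names `exponential_covariance` and `ordinary_kriging` are hypothetical, the covariance model (exponential, no nugget) is chosen only for simplicity, and the observations are made up rather than taken from Table 1.

```python
import numpy as np


def exponential_covariance(h, partial_sill=5000.0, a=50.0):
    """Hypothetical covariance model C(h) = C * exp(-3h/a), with no nugget."""
    return partial_sill * np.exp(-3.0 * np.asarray(h, dtype=float) / a)


def ordinary_kriging(coords, values, target, covariance=exponential_covariance):
    """Solve the ordinary kriging system for one target point.

    Returns the estimate Z*, the kriging variance and the weights.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)

    # Covariances between every pair of observations (n x n block)
    pair_dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    system = np.ones((n + 1, n + 1))
    system[:n, :n] = covariance(pair_dist)
    system[n, n] = 0.0  # last row and column carry the constraint sum(weights) = 1

    # Covariances between each observation and the target point
    rhs = np.ones(n + 1)
    rhs[:n] = covariance(np.linalg.norm(coords - target, axis=1))

    solution = np.linalg.solve(system, rhs)
    weights, mu = solution[:n], solution[n]

    estimate = float(weights @ values)
    variance = float(covariance(0.0) - weights @ rhs[:n] - mu)
    return estimate, variance, weights


# Made-up observations (not the Table 1 data): target midway between two points
coords = [[0.0, 0.0], [20.0, 0.0]]
values = [100.0, 300.0]
estimate, variance, weights = ordinary_kriging(coords, values, target=[10.0, 0.0])
```

By symmetry the two weights are equal in this toy case, and kriging at one of the sample locations returns the measured value with a kriging variance of zero, which is a convenient sanity check for any implementation.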
Methodological step 
To highlight the influence of the variogram model on the kriging results, we used a database of aquifer resistivity determined in the Adamawa-Cameroon region in order to investigate the productivity of local aquifers. Four different variogram models (logarithmic, Gaussian, exponential and spherical) with the same nugget effect (C_{0}=200 Ω^{2}m^{2}), the same sill (σ^{2}=5200 Ω^{2}m^{2}) and the same range (a=50 m) were used to interpolate the data by kriging. These variogram models are expressed by Eqns 14-17. 
(14) 
(15) 
(16) 
(17) 
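As a rough illustration of how models sharing the same nugget, sill and range still differ in shape, the sketch below evaluates commonly used forms of the spherical, exponential and Gaussian variograms with the parameters quoted above. The functional forms, and in particular the factor 3 (the "practical range" convention) in the exponential and Gaussian models, are assumptions: conventions vary between textbooks and software, and these may not match Eqns 14-17 exactly. The logarithmic model is left out because its parameterization is not recoverable from the text.

```python
import math

NUGGET = 200.0                 # C0, from the text
SILL = 5200.0                  # total sill sigma^2 = C0 + C, from the text
RANGE = 50.0                   # a, from the text
PARTIAL_SILL = SILL - NUGGET   # C


def spherical(h, c0=NUGGET, c=PARTIAL_SILL, a=RANGE):
    """Spherical model: reaches the sill exactly at h = a."""
    if h == 0:
        return 0.0
    if h >= a:
        return c0 + c
    r = h / a
    return c0 + c * (1.5 * r - 0.5 * r ** 3)


def exponential(h, c0=NUGGET, c=PARTIAL_SILL, a=RANGE):
    """Exponential model (assumed practical-range form): about 95% of C at h = a."""
    if h == 0:
        return 0.0
    return c0 + c * (1.0 - math.exp(-3.0 * h / a))


def gaussian(h, c0=NUGGET, c=PARTIAL_SILL, a=RANGE):
    """Gaussian model (assumed practical-range form): parabolic near the origin."""
    if h == 0:
        return 0.0
    return c0 + c * (1.0 - math.exp(-3.0 * (h / a) ** 2))


# Compare the three shapes at a few lags (in metres)
for h in (5.0, 25.0, 50.0):
    print(h, round(spherical(h), 1), round(exponential(h), 1), round(gaussian(h), 1))
```

At short lags the Gaussian model stays close to the nugget while the exponential model rises quickly, which is one reason why models with identical parameters can produce different kriging weights.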
Correct variogram model fitting 
The variogram model is chosen from a set of mathematical functions that describe spatial relationships. The appropriate model is chosen by matching the shape of the experimental variogram curve to the shape of the curve of the mathematical function. This is clearly illustrated in the “Golden Surfer” software used in this study. The variogram is used in the second step of the interpolative kriging technique; this step is preceded by an exploratory data analysis and followed by a prediction [33]. During the exploratory data analysis, the data were checked for consistency, outliers were removed and the statistical distribution was identified. The data distribution is considered normal when the mean and the median are very similar. High skewness values, however, indicate the existence of outliers, that is, measured values that are very high or very low compared with the rest of the dataset. Outliers are caused by a bad measurement or a bad recording, and must be transformed when they exist. During the prediction phase, four variogram models were fitted in order to select the best-fitted one. The predictive performance of the fitted models is checked on the basis of cross-validation tests. The values of mean error (ME), mean square error (MSE), root mean square error (RMSE), average standard error (ASE) and root mean square standardized error (RMSSE) are estimated to ascertain the performance of the developed models. If the predictions are unbiased, the ME should be almost nil. However, because the ME depends on the scale of the data and is insensitive to errors in the variogram, it is generally standardized by the MSE, which should ideally be zero. 
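The exploratory check described in this section (comparing the mean with the median and inspecting the skewness) could be sketched as follows. The helper names `skewness` and `summarize` are hypothetical, the moment-based skewness formula is one of several in use, and the sample values are made up for illustration rather than taken from Table 1.

```python
from statistics import mean, median, pstdev


def skewness(data):
    """Moment skewness: third central moment divided by the cubed standard deviation."""
    m = mean(data)
    s = pstdev(data)
    return sum((x - m) ** 3 for x in data) / (len(data) * s ** 3)


def summarize(data):
    """Collect the statistics used to judge how close the data are to normality."""
    return {"mean": mean(data), "median": median(data), "skewness": skewness(data)}


# Made-up resistivity-like values with one suspiciously high reading
sample = [110.0, 120.0, 125.0, 130.0, 135.0, 140.0, 600.0]
summary = summarize(sample)
```

In this toy sample the single high reading pulls the mean well above the median and produces a strongly positive skewness, the pattern that would prompt a closer look at possible outliers.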
Results 
Using the logarithmic model, the estimated resistivity ranges between 195 and 267 Ωm. The Gaussian and spherical models produce values ranging from 100 to 480 Ωm, while the exponential model provides a range of 120-420 Ωm. In general, each model produced a different result. The difference may lie in the endpoints of the range or in its amplitude. These differences are summarized in Table 2 and shown in Figure 3. 
Discussion 
In the particular case of this study, the spherical and Gaussian models estimated values in the same interval (100-480 Ωm). In general, however, each variogram model provides a distinct result. Despite these differences, all thematic maps show the same variation trend: the minimum and maximum values fall in almost the same regions from one map to another. These observations are consistent with results published by many other authors worldwide. In particular, Webster and Olivier [34] raised several concerns related to the influence of the variogram model on predictive investigations, and Chilès and Delfiner [31] modeled spatial uncertainty in geostatistics. It is therefore evident that the quality and reliability of an interpolation by kriging strongly depend on the structural analysis of field data, that is to say, on the variogram model. The predictive performance of the fitted models is checked with the cross-validation statistics (ME, MSE, RMSE, ASE and RMSSE) described in the Methods. RMSE and ASE should be compared to indicate whether the prediction errors were correctly assessed: if they are close, the variability of the predictions is correctly assessed. If the RMSE is less than the ASE (or the RMSSE is less than 1), the variability of the predictions is overestimated; if the RMSE is greater than the ASE (or the RMSSE is greater than 1), it is underestimated. 
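The cross-validation statistics discussed above can be computed from the measured values, the cross-validated predictions and the kriging standard errors, as in the sketch below. The formulas follow definitions commonly used in geostatistical software and are assumptions here, since they may differ in detail from Eqns 18-22. In particular, the statistic described in the text as ideally zero is read as the mean standardized error, so both that quantity and the plain mean square error are returned. The function name and the numbers are illustrative only.

```python
import math


def cross_validation_metrics(measured, predicted, kriging_std):
    """Summary statistics of cross-validation errors (assumed common definitions)."""
    n = len(measured)
    errors = [p - m for p, m in zip(predicted, measured)]
    # Each error divided by the kriging standard error at the same location
    standardized = [e / s for e, s in zip(errors, kriging_std)]
    return {
        "ME": sum(errors) / n,
        "mean_square_error": sum(e * e for e in errors) / n,
        "mean_standardized_error": sum(standardized) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "ASE": math.sqrt(sum(s * s for s in kriging_std) / n),
        "RMSSE": math.sqrt(sum(u * u for u in standardized) / n),
    }


# Made-up cross-validation results (not the values behind Table 3)
measured = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 310.0]
kriging_std = [10.0, 10.0, 10.0]
metrics = cross_validation_metrics(measured, predicted, kriging_std)
```

In this toy case the RMSE equals the ASE and the RMSSE is 1, which corresponds to the situation where the variability of the predictions is correctly assessed.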
Once the best model is selected, it is used to draw the thematic map that provides the spatial distribution of the parameter to be estimated. All these errors are expressed by Eqns (18)-(22) below [33,35]. 
(18) 
(19) 
(20) 
(21) 
(22) 
Where σ^{2}(x_{i}) is the kriging variance at location x_{i}, and Z*(x_{i}) and Z(x_{i}) are the estimated and the measured values of the parameter at location x_{i}, respectively. Table 3 shows that the logarithmic model is the best-fitted one. Figure 4 likewise shows that the logarithmic model is the one that best matches the experimental variogram. 
This study should have many applications and impacts in environmental and earth sciences. Many environmental variables, such as rainfall, as well as earth deposits and parameters, usually need to be predicted or estimated. However, measurements cannot be carried out continuously: the parameter to be estimated is measured discretely, and kriging is then used to obtain continuous information. Nowadays, this variogram-based technique is used by many scientists in fields as varied as civil protection [21], meteorology [22,30] and geochemistry [13,23,26,32,33]. If authors do not take into account the paramount impact of the variogram model in such investigations, the study will be sketchy and the results unreliable. This explains the importance of the present paper. 
Many other studies have been carried out to highlight the delicateness of modelling and assessment. Nshagali et al. [26] brought up the effects of scale in spatial interaction models; Patuelli and Giuseppe Arbia [36-40] published an editorial on the advances in the statistical modelling of spatial interaction data. The present paper, however, tackles the issue of selecting a suitable variogram model. In fact, interpolation software automatically proposes an arbitrary linear or nugget model to the user (Figure 5). When this model is displayed, the user should select and “add” a model that is suitable for the dataset and then fit it. 
Conclusion 
Many scientific studies use geographical theory and methodology to resolve environmental, social and human problems. Some of them deal with issues of resource assessment, prediction and management. Such investigations are generally based on interpolative techniques such as kriging. This technique, very important for geoscientists in particular and for other scientists in general, includes a prior step, namely the structural analysis based on the variogram. This step is so important that it determines the reliability and even the veracity of the kriging results. It is therefore necessary to apply cross-validation tests carefully in order to select the best-fitted variogram model before the predictive analysis. In fact, the selection of a variogram model can be explicit or implicit (incorporated in the software). This article illustrates how the use of an inappropriate variogram model can seriously distort the results of an evaluation, assessment or prediction survey. 
References 

Table 1  Table 2  Table 3 
Figure 1  Figure 2  Figure 3  Figure 4  Figure 5 