Received Date: July 13, 2017 Accepted Date: July 20, 2017 Published Date: July 25, 2017
Citation: Huo A, Zheng X, Wang G, Xie J, Yu D, et al. (2017) GA-SVM Applied in Assessing the Water Trophic State of South Lake Qujiang based on Multispectral RS. J Environ Anal Toxicol 7: 494. doi: 10.4172/2161-0525.1000494
Copyright: © 2017 Huo A, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Eutrophication has become a major water quality problem in most urban landscape waters of the world. Despite extensive research over the last four to five decades, many of the key issues in eutrophication science remain unsolved. In this paper, a new method based on the Support Vector Machine (SVM) is proposed to monitor and evaluate the water trophic state of Qujiang South Lake. SVM is suitable for a limited number of samples because of its strong nonlinear mapping ability. Model parameters can be chosen automatically by a Genetic Algorithm (GA), which gives the combined Genetic Algorithm-Support Vector Machine (GA-SVM) high precision in solving regression problems. Enhanced Thematic Mapper (ETM) data can be used to estimate the chlorophyll-a (Chl-a) concentration of the water body. The characteristic band ratio and the SVM method are used to establish a model of Chl-a concentration through remote sensing (RS). The comprehensive eutrophication condition can then be evaluated from the RS results. Results show that the prediction accuracy of the GA-SVM method is better than that of the traditional statistical regression method and a neural network. In addition, the RS retrieval results corresponded with the in situ measured values, indicating that GA-SVM is effective. Furthermore, RS data can be downloaded free of charge, so the method is also more economical than in situ measurement. GA-SVM can also be used to assess eutrophication in larger lakes.
Support vector machine; Genetic algorithm; Water quality comprehensive evaluation; Remote sensing; Enhanced thematic mapper
With the fast-growing economy and population, lakes have become eutrophic or even hyper-eutrophic, particularly in urban areas. The balance of water nutrients has been altered by excess plant nutrients, industrial waste, and domestic sewage entering lakes and rivers in various ways, causing dissolved oxygen depletion and algal overgrowth. Water eutrophication affects the service functions of a lake, such as fishery, water supply, and ecosystem support. Local weather also influences local water eutrophication. An important mission for scientists is therefore to monitor and predict the trophic status of lake water bodies in a timely and accurate manner. Phosphates, nitrates and ammonia are the common chemicals that cause eutrophication [1-4]. According to the driving force and the corresponding parameters [5,6], the continuum of water body trophic state is divided into three categories: oligotrophic, mesotrophic and eutrophic [7-9]. Phytoplankton is a water quality indicator. The composition of phytoplankton is affected by the water body trophic state and other related factors, for instance, mixing regime [7,11], water depth [12,13], temperature, dissolved oxygen concentration [15-17], ratio of euphotic to mixing depth, and ratio of macro-nutrients [18,19]. However, a number of issues remain unresolved despite a large number of international publications over many decades of study. Many RS retrieval models cannot be related to one another, and conflicts still exist between different retrieval models; resolving them depends partly on the outcomes of innovative approaches. In this paper, multispectral RS technology was used to monitor water eutrophication, which differs from traditional water quality monitoring methods. It makes the monitoring of water body eutrophication efficient and convenient.
An RS inversion model of Chl-a concentration was built using GA-SVM and RS technology to monitor the water body of Qujiang South Lake in Xi’an City. Furthermore, the model can be used to assess eutrophication over larger water areas. On this basis, more realistic and efficient methods can be chosen to protect the ecological environment of water bodies against further eutrophication.
Qujiang South Lake is the major urban landscape water body of the ancient city of Xi’an, China. It is about five kilometres from the Xi’an City center and covers an area of about 1 km². It has been one of the most famous tourist attractions since ancient times. The name Qujiang originated from the zigzag shape of the lake in ancient times. The water of Qujiang Pool comes from Huang Ditch at Yiyu Kou of Zhongnan Mountain. The lake is narrow and winding, and it is longer in the north-south direction than in the east-west direction. Qujiang Pool was renamed Qujiang Park. Eutrophication in Qujiang South Lake has become serious because of the developing tourism in Xi’an City and the lack of pollution management. RS monitoring and evaluation of the water eutrophication of Qujiang South Lake is therefore of great importance. Figure 1 shows the study area of Qujiang South Lake in Xi’an City, China.
Physical, chemical and biological analyses are usually used as research methods in water eutrophication monitoring. In this paper, physical analysis is the major research method. Table 1 gives the correlation coefficients between the ETM (Enhanced Thematic Mapper) bands. The results are similar to previous research on inland water. Table 2 gives the correlation coefficients between the water quality parameters (Chl-a, TN, TP, SD) and the band spectral reflectance of ETM. In Table 2 the minimum value is -0.5916 and the maximum value is 0.9383, a range of 1.5299, so the correlation values are widely spread. The correlation coefficients between TM1, TM2, TM3 and Chl-a are low. TM1 shows a negative correlation, which is due to absorption. Furthermore, compared with band 6, all other bands show good relationships with three of the water quality parameters (Chl-a, TN, SD). There is a relatively strong correlation between the infrared bands (R4, R5, R7) and the visible bands (R1, R2, R3). The correlation between TP and all bands is low, especially in the infrared and visible bands. The R6 band correlates poorly with the water quality parameters Chl-a, TN, TP and SD (in Table 2 the correlation coefficients are 0.2401, 0.2701, 0.1436 and 0.0135, respectively), which indicates that some band values are strongly correlated with the variation of water quality parameters while others are not. Where algae and Chl-a absorb, reflectivity decreases as the Chl-a concentration increases. Although TM2 corresponds to the Chl-a reflection peak and its reflectivity rises with increasing Chl-a content, the band cannot reflect the Chl-a concentration well because of yellow substance (a complex of colored organic compounds dissolved in the water), suspended substances, the intrinsic color of water, and atmospheric disturbance. The TM3 band corresponds to the strong absorption peak area of Chl-a.
The prominent scattering-absorption features of Chl-a include strong absorption between 450-475 nm (blue). The reflectance peak near 700 nm and its ratio to the reflectance at 670 nm have been used to develop a variety of algorithms to retrieve Chl-a in turbid waters. Spectral bands in the blue to green region are appropriate for identifying Chl-a concentrations with acceptable precision. TM3 visible spectral bands and their ratios are widely used to determine Chl-a. The reflectivity decreases when the Chl-a concentration increases, indicating a negative correlation. Water itself absorbs strongly in the band, so the band cannot fully reflect the Chl-a information. The latter three bands demonstrate a higher correlation than the TM1, TM2, and TM3 bands. The TM4 band corresponds strongly to the Chl-a reflection peak area and reaches the maximum. This illustrates that there is a strong correlation between TM4 and Chl-a, in a band otherwise marked by strong water absorption, the influence of the color of the water, and a low signal-to-noise ratio. The TM5 and TM7 bands generally cannot be used to indicate the Chl-a concentration in water, because, according to the correlation coefficients, the two bands do not correspond strongly to the Chl-a reflection peak. The correlation between the above bands and the Chl-a concentration is reflected in a single TM image, in which the extremely bright area represents a high concentration of Chl-a in the water body. The images of the previous bands therefore cannot visually reflect the concentration of Chl-a in the water body (Tables 1 and 2).
Table 1: ETM inter-band correlation.
Note: TN, TP, and SD are total nitrogen, total phosphorus and transparency respectively; R(1-7) are the 7 bands of TM satellite images.
Table 2: Correlation between Band and water quality parameters.
To avoid the influence of other factors, and based on the correlation discussion above and the factors that contribute to the final results, we chose TM3, TM4 and their ratio (B4-B3)/(B4+B3) as the indexes for quantitatively monitoring the concentration of Chl-a from RS images. A detailed description of the theory of SVM technology is provided in many papers, so only an overview of an SVM model is presented here. In a regression SVM model, one must estimate the functional dependence of the dependent variable y on a set of independent variables x. The model assumes that the relationship between the independent and dependent variables is given by a deterministic function f(x) = ω·φ(x) + b plus some noise (y = f(x) + noise). The noise is bounded by the error tolerance (ε). Here, ω (vector of coefficients) and b (constant) are the parameters of the regression function, and φ is the feature mapping associated with the kernel function. The aim is then to find a functional form for f(x). This can be obtained by training the SVM model on a sample set, i.e., the training set, a process that involves the sequential optimization of an error function. Because ε-SVM regression has been commonly used in regression research, ε-SVM regression is used in this study. Considering its effectiveness and simplicity, and because an SVM classifier model can handle more than one kind of algorithm with a short training time, the SVM classifier model was adopted in this paper. It is the earliest way of employing SVM to solve multi-classification problems.
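The band index and the ε-tolerance described above can be illustrated with a minimal Python sketch. The function names and the numeric values in the usage note are hypothetical and are not taken from the paper; the linear predictor assumes an identity feature map, which is a simplification of the kernel-based model.

```python
def normalized_band_index(b3, b4):
    """Return (B4 - B3) / (B4 + B3) for one pixel or sample."""
    total = b4 + b3
    if total == 0:
        raise ValueError("B3 + B4 must be non-zero")
    return (b4 - b3) / total


def linear_predict(x, w, b):
    """f(x) = w * x + b for a single feature (identity feature map assumed)."""
    return w * x + b


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Loss used in epsilon-SVM regression: errors within epsilon cost nothing."""
    return max(0.0, abs(y_true - y_pred) - epsilon)
```

For example, with hypothetical gray values B3 = 30 and B4 = 50 the index is 0.25, and a prediction that misses the measured value by less than ε contributes zero loss.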
In this research, a Genetic Algorithm (GA) was used to optimize the SVM model parameters. GA can quickly find the globally best solution in a complex search space. Each candidate solution (viz., the set of model parameters) is represented by a coding. Initially, multiple solutions are generated randomly and searched simultaneously under the guidance of a fitness (adaptation) function. The solutions then evolve through the mechanisms of natural selection, crossover (exchange) and mutation (variation). The search direction of the algorithm is decided by the fitness function, which is defined as:
Here the two terms are the actual measured value and the calculated value of the i-th test sample, respectively. A smaller Mean Absolute Deviation (MAD) value corresponds to a higher fitness value and better-suited parameter values. As a result, the probability that the appropriate set of parameters is inherited by the next generation increases.
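The selection, crossover and mutation steps described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the fitness is taken to be the MAD itself (lower is fitter), and a hypothetical two-parameter line a*x + b stands in for the SVM so that the example stays self-contained. The population size, mutation rate and elitism are illustrative choices, not the settings used in the study.

```python
import random


def mad(measured, predicted):
    """Mean absolute deviation between measured and predicted values."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def fitness(params, xs, ys):
    """Lower MAD means fitter; the model here is a hypothetical line a*x + b."""
    a, b = params
    return mad(ys, [a * x + b for x in xs])


def genetic_search(xs, ys, bounds, pop_size=30, generations=60, seed=0):
    rng = random.Random(seed)
    # Random initial population inside the search bounds.
    pop = [tuple(rng.uniform(lo, hi) for lo, hi in bounds) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda p: fitness(p, xs, ys))
        history.append(fitness(pop[0], xs, ys))
        parents = pop[: pop_size // 2]   # selection: keep the fitter half
        children = [pop[0]]              # elitism: the best individual survives
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            # Crossover: average each gene of the two parents.
            child = [(g1 + g2) / 2 for g1, g2 in zip(p1, p2)]
            # Mutation: occasional Gaussian perturbation, clipped to the bounds.
            child = [
                min(hi, max(lo, g + rng.gauss(0, 0.1 * (hi - lo))))
                if rng.random() < 0.2 else g
                for g, (lo, hi) in zip(child, bounds)
            ]
            children.append(tuple(child))
        pop = children
    best = min(pop, key=lambda p: fitness(p, xs, ys))
    history.append(fitness(best, xs, ys))
    return best, history
```

Because the best individual is always carried over, the best MAD recorded in `history` never gets worse from one generation to the next. In a real GA-SVM setup the tuple would hold the SVM parameters and `fitness` would train and validate the model.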
Table 3  shows rating standards.
|Comprehensive nutrition state index range||Lake (reservoir) nutrition status|
|TLI(Σ) ≤ 30||Oligotrophic|
|30 < TLI(Σ) ≤ 50||Mesotrophic|
|50 < TLI(Σ) ≤ 60||Eutrophication (TLI(Σ) > 50): Mild|
|60 < TLI(Σ) ≤ 70||Eutrophication (TLI(Σ) > 50): Moderate|
Table 3: Lake (reservoir) trophic state grading standards.
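The grading in Table 3 amounts to a threshold lookup, which might be written as the short Python sketch below. The function name is hypothetical, and because the table as reproduced here lists no class above 70, the sketch flags such values rather than assigning a grade.

```python
def trophic_state(tli):
    """Map a comprehensive index TLI(Σ) to the classes listed in Table 3."""
    if tli <= 30:
        return "oligotrophic"
    if tli <= 50:
        return "mesotrophic"
    if tli <= 60:
        return "mild eutrophication"
    if tli <= 70:
        return "moderate eutrophication"
    # No class above 70 appears in Table 3 as reproduced here; do not guess one.
    return "above the ranges listed in Table 3"
```

For instance, a hypothetical index of 55 falls in the mild eutrophication class, and 65 in the moderate class.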
The measured data and RS image data
The RS data of ETM images and the in situ water quality sample data were acquired during the same period of April 2012 in Qujiang South Lake, Xi’an City, Shaanxi Province. Chl-a, TN, TP and SD as well as COD were measured, because organic pollutants are the main pollutants of urban landscape water bodies. Before being used to invert water quality parameters, the RS image data were preprocessed, including geometric and radiometric correction. Regions of interest (ROIs) were collected from the image, and the images were reclassified taking the ROIs into account. Special glass bottles with a volume of 2.5 L were used to sample the water, and 13 water quality samples were collected from Qujiang South Lake (Figure 1) and numbered from qj1 to qj13. Temperature, pH value and GPS position were recorded at the same time. At each location, six points were selected about 50 cm below the water surface: four points at the north, east, south, and west ends of the water body, and two points at the center of the lake. That is, the actual number of sample points is 13 × 6 = 78. All samples were then mixed and placed into two special glass bottles. One bottle was used for laboratory testing of Chl-a, and the other was used to check the remaining indicators. The lake water was sampled on a sunny day when the water surface was basically calm. The COD analysis was performed in the laboratory immediately after delivery of the water sample (Figure 2).
Based on the above ETM bands, a multiple SVM inversion model was established, and the parameters of the model were chosen by GA. Because of the limited number of samples, K-fold cross-validation was applied as follows: the sample set was separated randomly into 7 mutually disjoint subsets; six subsets were used in turn as the training set, and the remaining one was used for validation; the LIBSVM library was employed to construct the SVM [23], and the computations were carried out in the DPS software. The data of the first six points were used to construct the model and the data of the other seven to test it. Table 4 compares the measured Chl-a with the B4 gray value, the B4/B3 value, and the (B4-B3)/(B4+B3) value. The MAD and the correlation coefficient R² are used to assess the retrieval results. MAD, R² and the prediction results of the above three methods for the 7 samples are presented in Table 5.
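The K-fold splitting step can be illustrated with the following Python sketch. It is a generic K-fold partition under assumed settings (13 samples, 7 folds, a fixed random seed); it is not the LIBSVM or DPS procedure itself, and the function names are hypothetical.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Split sample indices 0..n_samples-1 into k disjoint, shuffled folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold j takes every k-th index starting at j, so fold sizes differ by at most one.
    return [indices[j::k] for j in range(k)]


def train_validation_splits(n_samples, k, seed=0):
    """Yield (train, validation) index lists, holding out one fold at a time."""
    folds = k_fold_indices(n_samples, k, seed)
    for held_out in range(k):
        train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        yield train, folds[held_out]
```

With 13 samples and 7 folds, six folds hold two samples and one fold holds a single sample; each sample is used for validation exactly once.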
Table 4: Chlorophyll-a contrast and Band ratio.
|B4 predicted value (µg/L)||B4/B3 predicted value (µg/L)||(B4-B3)/(B4+B3) predicted value (µg/L)|
Table 5: The comparison results of the retrieved Chl-a formed by five test samples.
When inversion is conducted with the different band combinations, the result of (B4-B3)/(B4+B3) is better than those of B4/B3 and B4. The value of R² obtained with (B4-B3)/(B4+B3) is more than 0.79. The other two methods (B4/B3 and B4) cannot reach this level because they are affected by other water quality factors, whose influence is reduced by the (B4-B3)/(B4+B3) arithmetic. Consequently, accurate predictions of Chl-a can be obtained through appropriate band arithmetic and the combination of GA and SVM. The correlation values are presented graphically in Figure 2, together with the regression line and its formula.
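As a rough illustration of how an R² value such as the one quoted above could be computed, the Python sketch below takes R² to be the squared Pearson correlation between measured and predicted values. That reading is an assumption, since the paper does not spell out its formula, and the function name is hypothetical.

```python
def r_squared(measured, predicted):
    """Squared Pearson correlation between measured and predicted values."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_m == 0 or var_p == 0:
        raise ValueError("R^2 is undefined when either series is constant")
    return cov * cov / (var_m * var_p)
```

A perfectly linear relation between the two series gives a value of 1, and weaker agreement gives values closer to 0.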
According to the inversion results, the comprehensive trophic state of Qujiang South Lake was evaluated using formulas (5)-(6) (Table 6 and Figure 3). ArcMap software was used to draw the thematic map given in Figure 3.
Table 6 and Figure 3 show that the trophic state of Qujiang South Lake lies between mild and moderate eutrophication. The eutrophication distribution assessed by RS is consistent with the measured results. The trophic state is higher in the south and lower in the north. Sampling points 6, 9, 12 and 13 are in moderate eutrophication; sampling points 6 and 9 are influenced by people around the dock. At sampling points 12 and 13, the flow velocity is low, aquatic animals and plants are abundant, and the nitrogen and phosphorus contents are high.
|Sampling Numbers||TLI||Trophic State|
Note: TLI- Comprehensive nutrition state index
Table 6: Comprehensive trophic state evaluation result for each sampling point in Qujiang South Lake.
The Chl-a concentration over twelve months was retrieved using the estimation model and RS image data. Dynamic monitoring was performed according to the single-factor evaluation index. The conclusions are summarized in Table 7. Eutrophication is serious in March, April, May and June, declines in July and August, and increases again in October and November. It is in a low state in winter and spring, showing clear seasonal change. Chl-a is the major influencing factor in May, June and October. Furthermore, these months are the peak period for tourists, which affects eutrophication and raises the Chl-a result.
Note: TLI(chl-a)-the corresponding Trophic State Index of chl-a
Table 7: Remote sensing dynamic monitoring value in one year.
RS technology is a cost-effective tool for environmental research that has been applied to assess the eutrophication of large urban landscape water bodies. However, improving the accuracy of the inversion requires more in situ monitoring stations and more real-time measurements of water quality parameters in the study area. Because weather conditions also affect the quality and resolution of ETM image data, better image quality would produce better results. For example, cloudless clear-sky satellite RS image data are well suited to water quality RS inversion.
This research indicates that the RS technique combined with the GA-SVM model is a new method for studying urban landscape water eutrophication. Three results were obtained: (1) precise results for Chl-a concentration may be obtained by using the ratio of ETM4 (Chl-a reflection peak) and ETM3 (Chl-a absorption peak); (2) the trophic state of Qujiang South Lake lies between mild and moderate eutrophication. In the current study the eutrophic degree is high in the south and low in the north. Eutrophication is serious in March, April, May and June, declines during July and August, and tends to increase from October to November. The seasonal change of trophic state is distinct; (3) for several areas of a lake, this method can provide a relatively accurate estimate of trophic status by sampling ETM images using GA-SVM, and it offers an innovative way to explore RS monitoring of water quality. The results show that the GA-SVM method is promising and is an acceptable water quality monitoring method with reasonable accuracy. The technique can also be used to assess the eutrophication of urban landscape water bodies over large regions with much less effort and time and at a lower cost, because RS images can be downloaded free of charge. For public safety, lake monitoring will still be needed to evaluate algal blooms. However, this method can be used as a first step to determine which lakes should be monitored more closely.
The authors would like to thank the key project of science and technology of social development in Shaanxi Province (No. 2016SF-411) and the Key Projects of the Chinese Academy of Sciences (KFZD-SW-306-2) for financial support. We would also like to thank the National Natural Science Foundation of China for its support (Grant Nos. 41130753, 41672255, 41302250 and 4167020392). The authors are grateful to Associate Professor Jifeng Guo. Special thanks go to graduate students Hongman He, Haicun Mi, and Ruichong Liu for their continuous field work support. The authors are also grateful to the reviewers for their insightful remarks, which improved the manuscript.