Evaluation of Surface Water Quality Impacted by Sewage Overflows from Animal and Residential Lagoon Systems using Principal Component Analysis

Copyright: © 2013 Ikem A, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Introduction
The efficiency of animal agriculture in the United States has increased over the past half century, but the accompanying environmental degradation is a major public concern [1]. In the United States, forty-four (44) percent of assessed river and stream miles were reported as impaired by the United States Environmental Protection Agency (EPA). The leading causes of this impairment included pathogens, mercury, nutrients, organic enrichment and low dissolved oxygen. The major sources of impairment, as reported, were atmospheric deposition, agriculture, hydrologic modifications, and unspecified sources [2]. In the State of Missouri, about 286 stream miles have been classified as impaired by point source wastewater discharges. Animal agriculture is prominent, and there are currently 427 Class I concentrated animal feeding operations (CAFOs) located in Missouri. These are operations containing at least 1,000 beef cattle, 2,500 large swine, or 100,000 broiler chickens [3]. Discharge of sewage-derived nitrogen as ammonia can accelerate eutrophication and create toxicity problems for aquatic organisms [4]. Sewage can also be a source of several pollutants in surface waters, and previous articles reported the presence of pharmaceutical compounds [5,6], Escherichia coli and micropollutants [7], hormones [8], polyfluorinated compounds [9], volatile organic compounds [10] and potentially toxic elements [11] in both sewage effluents and surface waters.
Apart from pollution caused by animal agriculture, domestic septic tank systems are equally significant pollution sources to surface waters and can impact local stream chemistry and ecology [12]. For example, export of nutrients (nitrogen and phosphorus) from inland waters to downstream areas is creating hypoxic conditions and dead zones [13] in the Gulf of Mexico. Wastewater systems can generally be classified as decentralized or centralized. Decentralized wastewater systems, often called "septic" or "on-site" systems, derive their name from their location. They treat wastewater close to the source, typically providing treatment on the property of individual homes or businesses. Decentralized systems also include systems serving clusters of individual homes, large capacity septic systems, and a small collection of treatment systems. These systems also treat wastewater close to the source, using small pipes to collect small volumes of domestic wastewater, unlike centralized urban wastewater treatment systems that pipe large amounts of wastewater many miles through sewers before it reaches the treatment facility [14]. A septic tank system is designed as a settlement chamber that allows for anaerobic reduction of organic and suspended materials in wastewater [15]. Septic systems usually achieve a high rate of conversion of organic nitrogen to ammonium but usually fail to lower total nitrogen (TN) loads.
Diffuse pollution from areas upstream of Gans Creek and point-source pollution from sewage wastewater overflows (domestic and confined animal lot lagoon systems) threaten the health of the Gans Creek ecosystem. Protecting ecosystems and adequately managing the export of nutrients from inland waters require an understanding of ecosystem conditions and of the influence of both diffuse and point pollution on water quality. The objectives of this study are threefold: (1) to assess the water quality of Gans Creek at upstream, midstream and downstream sites, (2) to measure the physical and chemical characteristics of sewage overflows (CALLO and DSLO) and assess their influence on Gans Creek water quality, and (3) to ascertain the significant analytical variables and controlling processes influencing Gans Creek water quality using principal component analysis (PCA).

Site description and sampling
Missouri has an area of 69,000 square miles and a population of 5.6 million people, according to the 2000 census. About half of this population is concentrated on opposite sides of the State in the Kansas City and St. Louis metro areas, leaving most other areas of the State, and its waters, rural in nature. Surface water and groundwater in Missouri vary considerably in quantity and quality, corresponding closely with geology and land use. There are currently 22,370.3 miles of classified streams in Missouri, 82,126 miles of unclassified streams, and 456 classified lakes totaling 293,696 acres [16].
The study area was Gans Creek, one of several streams in Columbia (Boone County), Missouri, United States. Gans Creek is a tributary of Little Bonne Femme Creek in Boone County south of Columbia. The stream is designated as category A for whole body contact recreation use. Gans Creek was first listed as impaired by bacteria in Missouri's 2012 303(d) list of impaired waters, which was approved in whole by the U.S. Environmental Protection Agency on Nov. 13, 2012. Missouri's whole body contact bacteria criteria are based on specific levels of risk of acute gastrointestinal illness. The level of risk correlating to the category A criterion is no more than 8 illnesses per 1,000 swimmers in fresh water [16]. Gans Creek is listed in the Missouri-Moreau watershed, which has a size of 8816 km² and a human population density of 220.4 persons per square km. The watershed elevation ranged from 183 to 274 m, and land use is characterized by 17.2% cropland, 35.7% grassland, 37.3% forest, 1.7% wetland, 6.9% development and 1.3% water. The main soil classification is silty loam, and average annual precipitation is approximately 1.1 m. Flooding of Gans Creek is common after heavy precipitation.
Missouri has extensive farmland, accounting for 66% of its land use. Major crops are corn and soybean. Animal agriculture (poultry, hog and beef production) is also significant in Missouri. The University of Missouri animal lot facility contained approximately 700 animals (mainly cattle). Five hundred (500) of these animals were fed concentrate meals, and the remaining animals were allowed to graze on farmland most times of the year. Periodically, grazing land was sprayed with confined animal lot wastewater stored in a lagoon within the facility. Elsewhere in the study area, the residential park served by the DSLO lagoon, located on the southeastern portion of the study area, consisted of approximately 100 trailers that housed an average of 4 individuals per trailer. An overflow pipe from the aerated single cell lagoon discharged wastewater through a ditch into Gans Creek. The overflow from the domestic lagoon was discharged every week of the year except during dry periods. Figure 1 presents the aerial map of the sampling area (Gans Creek, beef facility and residential park), and the designated sample collection points are described in Table 1. Four sampling points were designated along Gans Creek, representing upstream (G1), midstream (G2) and downstream (G3 and G4) positions with respect to streamflow direction. The sewage wastewater overflows (CALLO and DSLO) were represented by H1 and T1, respectively. The combined sewage wastewater overflows (H1 and T1) were represented by TH1 and TH2 along a ditch that flowed by gradient into Gans Creek. The mixing zone of the combined wastewaters and Gans Creek was between the G2 and G3 sites. Monthly sampling was conducted throughout the study period using a grab sampler, except in January 2010 because of freezing conditions.
Three types of samples were collected at the sites: (1) samples for pH, chloride (Cl⁻), nitrate-nitrogen (NO₃-N), nitrite-nitrogen (NO₂-N) and ammonia-nitrogen (NH₃-N) determination, (2) samples for elemental measurement, and (3) samples for total organic carbon (TOC) and total nitrogen (TN) analysis. Sample containers and preservation followed recommended procedures [17]. All samples were transported to the laboratory in ice coolers, immediately refrigerated at 4°C, and later analyzed for the various water quality determinands.

Sample measurements
The pH and electrical conductivity (EC) of samples were measured using the Thermo Orion pH/ORP/conductivity meter (Model 555A) purchased from Fisher Scientific (Hanover Park, IL 60133, USA). Appropriate buffer solutions were used to calibrate the meter before measurements were taken. The variables Cl⁻, NO₃-N, NO₂-N and NH₃-N were quantified spectrophotometrically using the mercuric thiocyanate (method 8113), cadmium reduction (method 8039), ferrous sulfate (method 8153), and ammonia salicylate (method 8155) procedures, respectively (Hach, Loveland, Colorado, USA). TOC and TN concentrations in samples were determined on the Shimadzu TOC/TN analyzer. Carbon and nitrogen were determined by nondispersive infrared (NDIR) and chemiluminescence detection, respectively (Shimadzu Scientific Instruments, Columbia, MD 21046, USA). Carbon and nitrogen measurements were validated with known standards during every batch run. The analytical recoveries obtained for carbon and nitrogen standards ranged from 95 to 102%. For elemental species measurement, samples from Gans Creek and the wastewaters were first digested with nitric acid using the Ethos EZ microwave digester supplied by Milestone Inc. (Controls Dr., Shelton, CT 06484, USA). Digestion was initiated by adding 5 ml of nitric acid to 45 ml of sample, and the mixture was subjected to a two-step microwave digestion (Step 1: 160°C for 10 min at 1050 W; Step 2: 165°C for 10 min at 1050 W). The cooled digests were then placed in acid-washed LDPE bottles (60 ml capacity) and later analyzed with the Varian Vista-Pro CCD simultaneous inductively coupled plasma-optical emission spectrometer (ICP-OES) (Varian Inc., California, USA).
The ICP-OES was calibrated with diluted mixed standards containing aluminum (Al), boron (B), sodium (Na), calcium (Ca), copper (Cu), magnesium (Mg), arsenic (As), barium (Ba), cadmium (Cd), cobalt (Co), potassium (K), iron (Fe), lead (Pb), sulfur (S), manganese (Mn), nickel (Ni), strontium (Sr), vanadium (V), zinc (Zn), mercury (Hg) and total phosphorus (TP). A certified standard reference material (SRM 1643e, trace elements in water) purchased from the National Institute of Standards and Technology (NIST) was used to validate the accuracy of our analytical method. The recoveries of the elements in SRM 1643e ranged from 90 to 108%. Method blanks were also analyzed with every batch of samples, and the instrument was recalibrated after every set of 10 samples.
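The recovery check described above is a simple ratio of measured to certified concentration. The sketch below illustrates that arithmetic only; the function names and the example concentrations are hypothetical and are not taken from the study, and the 90 to 108% window is used merely as an illustrative acceptance range because it is the range of recoveries reported above.

```python
def percent_recovery(measured, certified):
    """Return the analytical recovery (%) of a measured value against a certified value."""
    if certified <= 0:
        raise ValueError("certified value must be positive")
    return 100.0 * measured / certified


def within_window(recovery, low=90.0, high=108.0):
    """Check whether a recovery falls inside an illustrative acceptance window."""
    return low <= recovery <= high


# Hypothetical example values (not from the study), in the same concentration units.
certified_value = 10.0
measured_value = 9.6
recovery = percent_recovery(measured_value, certified_value)
print(f"recovery = {recovery:.1f}%, acceptable = {within_window(recovery)}")
```

The same function can be applied element by element to a reference material, flagging any element whose recovery falls outside the chosen window before sample results are reported.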

Statistical data analysis
Data summary, plots and diagrams: Microsoft Excel (Microsoft Office®) was used for basic data summary. In addition, time series plots and Piper diagrams were prepared with AqQA version 1.1 software (Rockware Inc.).

Principal Component Analysis (PCA): Data treatment produces pictorial and summary information that is easy to interpret. Principal component analysis (PCA), hierarchical cluster analysis (HCA), factor analysis (FA) and discriminant analysis (DA) are statistical tools that several authors have used to summarize large datasets [18-21]. PCA can explain variability in water quality datasets, identify the most important analytical variables and assist in predicting the controlling processes that influence water quality. PCA models the hidden data structure, explains similarities and differences among samples, and provides correlations among variables [22]. As earlier reported [23], PCA and principal factor analysis (PFA) normally involve the following five major steps: (1) code the variables $x_1, x_2, \ldots, x_p$ to have zero means and unit variance, i.e., standardize the measurements so that they all have equal weights in the analysis; (2) calculate the covariance matrix $C$; (3) find the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_p$ and the corresponding eigenvectors $a_1, a_2, \ldots, a_p$; (4) discard any components that account for only a small proportion of the variation in the dataset; and (5) develop the factor loading matrix and perform a varimax rotation on it to infer the principal stations. The first few principal components (PCs) tend to explain a large percentage of the total variance of the dataset and may be used to describe variation in water quality across the study area. Often these patterns are related to specific sources of contamination [24]. Details of PCA techniques and their explanation can be found in published literature [22,25].
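Steps (1) to (3) above, together with the construction of a loading matrix, can be sketched with NumPy as follows. This is a minimal illustration on synthetic data, not the authors' SPSS workflow; the function name `pca_steps` and the random dataset are assumptions made for the example, and loadings are taken here as eigenvectors scaled by the square root of their eigenvalues, which is one common convention.

```python
import numpy as np


def pca_steps(data):
    """Run PCA on a (samples x variables) array via the covariance of standardized data."""
    # Step 1: standardize every variable to zero mean and unit variance.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    # Step 2: covariance matrix of the standardized data (equal to the correlation matrix).
    c = np.cov(z, rowvar=False)
    # Step 3: eigenvalues and eigenvectors; eigh suits symmetric matrices and
    # returns eigenvalues in ascending order, so reorder to descending.
    eigvals, eigvecs = np.linalg.eigh(c)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Unrotated loadings: each eigenvector column scaled by sqrt(eigenvalue).
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return eigvals, eigvecs, loadings


# Synthetic data (hypothetical): 40 samples, 5 variables, three sharing one latent factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=(40, 1))
data = np.hstack([latent + 0.3 * rng.normal(size=(40, 3)), rng.normal(size=(40, 2))])

eigvals, eigvecs, loadings = pca_steps(data)
explained = eigvals / eigvals.sum()  # proportion of total variance per component
print(np.round(100 * explained, 1))
```

Because the variables are standardized, the eigenvalues sum to the number of variables, so each eigenvalue divided by that number gives the share of total variance carried by its component; step (4) then amounts to dropping components whose share is small.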
To extract the important variables that controlled water quality in the Gans Creek study area, we used our dataset consisting of 8 sampling stations and 31 analytical variables (pH, EC, Al, As, Ba, Na, Mg, K, Ca, Cd, Cr, Co, Cu, …, inorganic carbon (IC) and TOC). The number of PCs extracted to explain the underlying data structure was based on the "Kaiser criterion" [26], under which only factors with eigenvalues greater than unity are retained. The dataset was subjected to PCA with varimax rotation to ascertain the controlling variables and to help predict their sources. PCA was conducted using SPSS 16.0 for Windows software [27].
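The retention rule and the rotation described above can be illustrated as follows. This is a hedged sketch on synthetic data: the helper names `kaiser_retain` and `varimax` are hypothetical, the varimax routine is a commonly used SVD-based iteration written without Kaiser normalization, and it is not claimed to reproduce SPSS output, which may differ depending on normalization and convergence settings.

```python
import numpy as np


def kaiser_retain(eigvals):
    """Number of components with eigenvalue greater than one (Kaiser criterion)."""
    return int(np.sum(np.asarray(eigvals) > 1.0))


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a (variables x components) loading matrix by varimax."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient-like target matrix of the varimax criterion.
        target = rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt  # nearest orthogonal matrix to the target direction
        if s.sum() < criterion * (1.0 + tol):
            break
        criterion = s.sum()
    return loadings @ rotation


# Synthetic data (hypothetical): two latent factors, three observed variables each.
rng = np.random.default_rng(1)
f1, f2 = rng.normal(size=(50, 1)), rng.normal(size=(50, 1))
data = np.hstack([f1 + 0.3 * rng.normal(size=(50, 3)),
                  f2 + 0.3 * rng.normal(size=(50, 3))])

corr = np.corrcoef(data, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

n_keep = kaiser_retain(eigvals)
unrotated = eigvecs[:, :n_keep] * np.sqrt(eigvals[:n_keep])
rotated = varimax(unrotated)
print(n_keep)
print(np.round(rotated, 2))
```

An orthogonal rotation leaves each variable's communality (the row sum of squared loadings) unchanged; it only redistributes variance among the retained components so that each variable tends to load heavily on a single component.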

Physico-chemical characteristics of lagoon overflows
The summary data for Gans Creek water quality and the lagoon overflows are presented in Table 2a and b. The pH of CALLO and DSLO ranged from 7.50 to 8.00 over the four seasons combined, and values in this range fell within the EPA stipulated guideline for drinking water [28]. The trend of major ionic species in both overflows was: […]. Figure 2a shows that DSLO samples consisted of Ca²⁺-Cl⁻ type, mixed Ca²⁺-Mg²⁺-Cl⁻ type and mixed Ca²⁺-Na⁺-HCO₃⁻ type ionic compositions. The DSLO samples were characterized by alkaline earth metals with high or moderate chloride concentrations. CALLO samples (Figure 2b) were largely of the Ca²⁺-Mg²⁺-Cl⁻-SO₄²⁻ water type, dominated by Ca²⁺ and SO₄²⁻ ions. The ionic composition of the lagoon overflows is influenced by the activities at the beef facility and the residential park. EC values of the overflows (DSLO: 878 µS cm⁻¹ and CALLO: 1281 µS cm⁻¹) were most pronounced during the winter period because of elevated concentrations of dissolved salts. DSLO samples generally had higher NH₃-N levels (Figure 2c) than CALLO samples. The sources of NH₃-N in DSLO and CALLO were attributed to organic matter decomposition and urea hydrolysis. The concentration of B in DSLO was highest during the spring period (B = 0.367 mg L⁻¹). The sources of B in sewage overflows were detergent products; boron does not substantially adsorb in the sewer system and is not easily removed during sewage treatment processes [29].
The water quality data were summarized and reported as annual averages. The annual average NH₃-N in DSLO samples (T1 = 8.48 mg L⁻¹) was approximately 12.5 times the corresponding annual average for CALLO samples (H1 = 0.68 mg L⁻¹). The concentration of TP in DSLO (T1 = 2.37 mg L⁻¹) was approximately 4 times the annual average obtained for CALLO samples (H1 = 0.63 mg L⁻¹). However, the annual average recorded for TOC in CALLO […]

Gans Creek chemistry
Nutrient criteria are presently not available for Missouri streams, so the EPA drinking water standards, the Missouri Department of Natural Resources criteria for protection of aquatic life [30] and the Australian water quality guidelines [31] were used to assess the quality of Gans Creek. Gans Creek was characterized as an intermittent and perennial stream. Water quality constituents were found to vary with the flow regime during the seasons. Ambient and water temperatures followed the typical Missouri pattern of four seasons (spring, summer, fall and winter). Water temperature was ≤4°C in the winter months and as high as 29°C during the summer period. The pH of Gans Creek (range: 7.0-8.3) can be classified as neutral to slightly alkaline throughout the sampling months, which implied no major harm to biota when compared with the EPA drinking water guideline. Electrical conductivity (EC, in µS cm⁻¹) varied considerably, from 139.3 (G1, the most upstream site) in August 2009 to 690 (G4, the most downstream site) in February 2010. Thirty-one percent of samples exceeded the upper EC value (250 µS cm⁻¹) for slightly disturbed water bodies (Table 2c). The Piper plot of the hydrochemical facies for Gans Creek samples is shown in Figure 3a. Gans Creek samples can be classified as a Ca²⁺-Mg²⁺-Cl⁻-SO₄²⁻ water type, dominated mostly by Ca²⁺ and SO₄²⁻ ions.
The seasonal dataset showed the highest spatial difference for EC, Na, Mg, K, Ca, Cl⁻ and SO₄²⁻ during the winter period. On an annual scale, Ca was the most prominent element, and the order of abundance of major cations and anions followed the trend observed for the sewage overflows. NO₃-N ranged from non-detectable to 8.9 mg L⁻¹ in the samples and was the dominant form of combined nitrogen. It accounted for 94-97% and 30-94% in May 2009 (spring) and February 2010 (winter), respectively. The annual average NO₃-N observed for Gans Creek was below the 10 mg L⁻¹ drinking water guideline. Nitrate is usually the dominant form of combined nitrogen in natural waters; however, nitrate at levels greater than 5 mg L⁻¹ may reflect unsanitary conditions, since one major source of nitrates in aquatic systems is human and animal excreta [32]. Figure 3b shows the time series plot of NH₄⁺ concentrations in Gans Creek during the sampling period. The monthly recorded data for NH₃-N ranged from non-detectable to 0.69 mg L⁻¹. The concentrations of NH₃-N at the G3 site were highest in September 2009 and February 2010 and were probably derived from both CALLO and DSLO. Other forms of nitrogen in Gans Creek were related to diffuse pollution from upstream areas. Ammonia, nitrite and nitrate constitute the various forms of inorganic nitrogen. Apart from dissolved oxygen (DO), un-ionized NH₃ [33] and NH₄⁺ [34] are important surface water parameters that can drastically impact the biological community structure at high concentrations. Alkaline stream conditions (pH) and temperature affect the level of un-ionized NH₃. The EPA 1999 criteria at pH 8 and 25°C for acute and chronic NH₃ levels are 5.6 mg N L⁻¹ (salmon present) and 1.2 mg N L⁻¹ (fish early life stages present), respectively [35,36].
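The dependence of un-ionized NH₃ on pH and temperature can be illustrated with a short calculation. The sketch below is an assumption added for illustration and was not part of the study's method: it uses the widely cited temperature relation for the ammonium dissociation constant, pKa = 0.09018 + 2729.92/T (T in kelvin), ignores ionic strength effects, and the function name `unionized_fraction` is hypothetical.

```python
def unionized_fraction(ph, temp_c):
    """Fraction of total ammonia present as un-ionized NH3 (freshwater, low ionic strength)."""
    temp_k = temp_c + 273.15
    # Assumed empirical relation for the NH4+/NH3 dissociation constant.
    pka = 0.09018 + 2729.92 / temp_k
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Conditions spanning the pH and temperature ranges reported for Gans Creek.
for ph, temp_c in [(7.0, 4.0), (8.0, 25.0), (8.3, 29.0)]:
    frac = unionized_fraction(ph, temp_c)
    print(f"pH {ph}, {temp_c} °C: {100 * frac:.2f}% un-ionized")

# Un-ionized NH3 corresponding to the highest monthly total ammonia value reported (0.69 mg/L),
# evaluated at pH 8 and 25 °C purely as an example condition.
nh3_unionized = 0.69 * unionized_fraction(8.0, 25.0)
```

Under these assumptions the un-ionized fraction at pH 8 and 25°C is on the order of 5%, and it rises with both pH and temperature, which is why the same total ammonia concentration poses a greater toxicity risk in warm, alkaline water.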
Approximately 83% of Gans Creek samples exceeded the ANZECC default trigger value of 0.01 mg L⁻¹ ammonia (Table 2c) designated for physical and chemical stressors in tropical Australia for slightly disturbed lowland river ecosystems [31].
TP concentrations were higher in spring and fall than in winter (TP in Gans Creek ranged from 0.07 to 0.31 mg L⁻¹). The mean seasonal concentration at all sampled sites on Gans Creek exceeded the TP trigger value (10 µg L⁻¹; Table 2c) for slightly disturbed lowland rivers [31]. Phosphorus is non-toxic, but high concentrations can result in excess growth of aquatic plants, with various consequences for fish species. The dynamics of nitrogen and phosphorus in a stream may control the numbers and types of organisms in the biological community and the eutrophication process. Phosphorus recorded for Gans Creek was mostly derived from household detergent present in DSLO samples.
The annual seasonal mean (range: 1.11-1.30 mg L⁻¹) of Al in Gans Creek samples was generally higher than the trigger value (0.055 mg L⁻¹ Al) recommended for disturbed lowland rivers [31] and the Missouri Department of Natural Resources acute Al level (0.75 mg L⁻¹) for protection of aquatic life. However, Zn levels were generally below 18 µg L⁻¹. The concentrations of B were highest during the spring (G2: 0.045 mg L⁻¹) and summer (G3: 0.060 mg L⁻¹) periods. The DSLO wastewater was the major source of B in Gans Creek during the four seasons.

Results for PCA
PCA reduced the data matrix, extracted the most influential parameters and explained the total variability in the dataset of 31 analytical parameters, which yields 31 principal components (PCs). Each PC is represented by 31 variables with their correlation loadings constrained between +1 and -1. The best interpretation of a given PC is to associate the loadings of each variable with the possible sources of contamination at the study area. Table 3 shows our observed initial eigenvalues, cumulative variance, rotation sums of squared loadings, communalities and the correlation loadings on the PCs for the variables. A previous report classified factor loadings as 'strong', 'moderate', and 'weak', corresponding to absolute loading values of >0.75, 0.75-0.50 and 0.50-0.30, respectively [35]. In this study, four varimax rotated principal components (RC1, RC2, RC3, and RC4) represented approximately 60% of the total variability of the entire dataset. The most significant analytical variables on each of the four components were as follows. RC1 had the highest variance (20%) in the dataset, and the most significant controlling variables on this component were Mg, Ca, SO₄²⁻, Sr, EC, pH, and Na. Strong positive correlations (>0.79) were obtained for EC, Mg, Ca, Sr and SO₄²⁻, and moderate positive correlations for pH (0.56) and Na (0.63). The direct positive relationships of these variables on RC1 resulted from the high solubilities of the elements in solution, with a proportionate increase in EC value. From the correlation loadings on RC1, the solubilities of Mg, Ca, Sr, Na and SO₄²⁻ were influenced by pH at the G1, G2, G3 and G4 sites. Therefore, mineralization of cations and anions (a natural controlling process) in Gans Creek was the dominant factor, accounting for 20% of the variability of the dataset in the study area.
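The loading classes quoted above can be written as a small helper. This is an illustrative sketch only: the function name `classify_loading` and the "negligible" label for absolute loadings below 0.30 are assumptions, while the thresholds and the example loadings for pH and Na on RC1 come from the text.

```python
def classify_loading(value):
    """Classify a factor loading by its absolute value using the quoted thresholds."""
    magnitude = abs(value)
    if magnitude > 0.75:
        return "strong"
    if magnitude >= 0.50:
        return "moderate"
    if magnitude >= 0.30:
        return "weak"
    return "negligible"  # label assumed here; the quoted scheme stops at 0.30


# Loadings reported for RC1: pH and Na values as given, 0.79 as the cited bound for the strong group.
rc1_loadings = {"pH": 0.56, "Na": 0.63, "strong group (lower bound)": 0.79}
for name, loading in rc1_loadings.items():
    print(f"{name}: {loading:+.2f} -> {classify_loading(loading)}")
```

Using the absolute value means that a strongly negative loading is also classed as strong, since the sign indicates the direction of the association rather than its strength.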
RC2 accounted for approximately 19% of the total variance of our dataset, and the most significant controlling experimental variables on RC2 were PO₄³⁻, TN, B, NH₄⁺, Na, Cl⁻ and NO₃⁻. The parameters PO₄³⁻, B, NH₄⁺ and TN were the most positively correlated variables (>0.80), followed by Na, Cl⁻ and NO₃⁻ with moderate correlations (0.55-0.72). The DSLO was the major contributor of B (Table 2a), and of NH₄⁺ and TP (Figure 3a), to Gans Creek relative to CALLO. Therefore, the associations of these variables on RC2 represent the contributions of DSLO and CALLO and can be called the "sewage overflow factor". Thus, the sewage overflow factor represented approximately 31% of the influence on the quality of Gans Creek among the four major rotated components.
The variables Fe, Al, Cr, V and NO₃⁻ had strong positive associations on RC3. Inorganic carbon was also strongly but negatively correlated. The strongly correlated variables identified on RC3 represented the influence of diffuse pollution, such as runoff and other human activities upstream of Gans Creek. RC3 accounted for approximately 12% of the total variance in the dataset and represented 20% of the influence on Gans Creek water quality. RC4 accounted for 9% of the total variability of the dataset, and the strongly correlated variables on this component were Cu, Cd, V, Zn and TOC. These variables on RC4 were probably related to runoff from cattle grazing areas near the stream segment.
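The "influence" percentages quoted for the rotated components appear to be each component's share of the variance explained by the four retained components combined. That reading is an assumption; the short calculation below shows the arithmetic using the rounded variance percentages given in the text, so the results are approximate.

```python
# Percent of total variance explained by each rotated component, as reported (rounded).
rotated_variance = {"RC1": 20.0, "RC2": 19.0, "RC3": 12.0, "RC4": 9.0}

total_explained = sum(rotated_variance.values())  # approximately 60% of total variance

# Share of each component within the variance explained by the four components together.
shares = {name: 100.0 * value / total_explained for name, value in rotated_variance.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1f}% of explained variance")
```

With these rounded inputs, RC2 (the sewage overflow factor) works out to a little under 32% and RC3 to 20%, in line with the approximate figures quoted for those components.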

Conclusions
This study evaluated the water quality of Gans Creek, the characteristics of human and animal sewage wastewater overflows, and their impact on Gans Creek. Throughout the sampling period, Gans Creek water chemistry was dominated by major cationic and anionic species. We observed the highest spatial difference in EC, Na, Mg, K, Ca, Cl⁻ and SO₄²⁻ values during the winter period. Comparison of the CALLO and DSLO datasets showed that the DSLO was the highest contributor of NH₃-N, Cl⁻, TP, TOC and B to Gans Creek. The PCA result explained approximately 60% of the total variability of the analytical dataset in this study. Mineralization due to natural processes, the sewage overflow factor, diffuse pollution upstream of Gans Creek and runoff from cattle grazing areas near the study sites were the main controlling processes at the sampling area. Over twenty analytical variables were found to be important in the chemistry of Gans Creek, and the heaviest impact on Gans Creek quality was due to the presence of dissolved materials and ammonia.
A nutrient reduction plan was implemented by the beef research facility manager in 2010 to mitigate the pollution problems at the study site. Some of the nutrient reduction strategies implemented were: creation of a protected riparian area along sections of the creek, a nutrient management plan for manure application, grass filter strips along the creek corridor, setback distances from the creek, and fencing livestock out of the creek's main channel. We recommend that the wastewater from the DSLO be continually evaluated for treatment effectiveness and that a pollution reduction plan be implemented to reduce ammonia-nitrogen in sewage outflows. If possible, channeling the wastewater to a constructed lagoon that allows for effective treatment and discharge may be considered.
Notes to Table 2c: EC: electrical conductivity; TP: total phosphorus; NH₃-N: ammonia-nitrogen; NO₃-N: nitrate-nitrogen; TOC: total organic carbon; SO₄²⁻: sulfate. *ANZECC default trigger values designated for physical and chemical stressors in tropical Australia for slightly disturbed lowland river ecosystems [31]. Trigger values are used to assess risk of adverse effects due to nutrients, biodegradable organic matter and pH in various ecosystem types. **ANZECC trigger values for toxicants at 95% level of protection of species for typical slightly-moderately disturbed systems [31]. a Lower values from rivers draining rainforest catchments. b Trigger value of aluminum at pH > 6.5. c Total ammonia as NH₃-N at pH 8. d The values have been calculated using a hardness of 30 mg L⁻¹ CaCO₃.