Prediction of Effluent Treatment Plant Performance in a Dairy Industry Using Artificial Neural Network Technique

Artificial Neural Network (ANN) models are increasingly used to predict wastewater treatment plant variables. Such forecasts help operators take corrective action and manage the process in line with the norms. ANNs have proved useful in overcoming some of the limitations of conventional mathematical models of wastewater treatment plants, which struggle with the plants' complex mechanisms, dynamics and variability. This study examines the use of ANN techniques to predict influent and effluent Biochemical Oxygen Demand (BOD), Chemical Oxygen Demand (COD) and Total Suspended Solids (TSS) for an effluent treatment process. A feed-forward ANN trained with the back-propagation learning algorithm was applied to predict effluent BOD, COD and TSS, using historical plant data collected from the effluent treatment plant of a dairy industry. A suitable architecture for the neural network models was determined after several rounds of training and testing. The efficiencies of the plant for BOD, COD and TSS were 85%, 78% and 75%, respectively. The ANN-based models were found to offer an efficient and robust tool in prediction and


Introduction
Wastewater treatment is an important initiative that must be taken more seriously for the betterment of society and our future. It is a process in which contaminants are removed from wastewater and household sewage to produce a waste stream or solid waste suitable for discharge or reuse. Wastewater treatment methods fall into three categories: physical, chemical and biological. The major types of wastewater treatment plant are as follows:
1. Effluent Treatment Plants (ETP)
2. Sewage Treatment Plants (STP)
3. Common and Combined Effluent Treatment Plants
The principal objective of wastewater treatment is generally to allow human and industrial effluents to be disposed of without danger to human health or unacceptable damage to the natural environment. Irrigation with wastewater is both disposal and utilization, and is indeed an effective form of wastewater disposal (as in slow-rate land treatment). However, some degree of treatment must normally be provided to raw municipal wastewater before it can be used for agricultural or landscape irrigation or for aquaculture. The quality of treated effluent used in agriculture has a great influence on the operation and performance of the wastewater-soil-plant or aquaculture system. In the case of irrigation, the required quality of effluent depends on the crop or crops to be irrigated, the soil conditions and the system of effluent distribution adopted. Through crop restriction and the selection of irrigation systems that minimize health risk, the degree of pre-application wastewater treatment can be reduced. A similar approach is not feasible in aquaculture systems, where more reliance must be placed on control through wastewater treatment.
Modelling of an effluent treatment plant (ETP) is important for predicting plant performance and operation. However, some important process variables cannot be measured on-line; for example, BOD5 requires a 5-day incubation, which makes it difficult to detect and resolve problems in time. Modelling an ETP is therefore a difficult task, and most of the available models are only approximate, resting on assumptions that may be severe. These features make it difficult to achieve optimum performance of the ETP using conventional modelling techniques. This, in turn, necessitates the development of more advanced modelling techniques to predict the behaviour of the ETP. Neural networks, which are inspired by the structure and operation of the brain and central nervous system, have been found to be a promising technique for forecasting from historical data, and an ANN model can predict the concentration of effluent parameters.

Theoretical Background
Artificial neural network (ANN) models, also known as connectionist models, parallel distributed processing (PDP) models and neuromorphic systems, are a branch of artificial intelligence (AI). They are mathematical models of theorised mind and brain activities, which attempt to exploit the massively parallel local processing and distributed storage structure of the human brain and the central nervous system [1]. Artificial neural networks are loosely based on the structure of natural neural networks (NNN) but exhibit only a very small portion of their capabilities. A neural network is characterized by its architecture, which represents the pattern of connections between nodes, its method of determining the connection weights, and its activation function [2]. Like natural or biological neural networks, ANNs consist of interconnected processing elements (neurons) and satisfy the "locality constraint", which means that processing elements are only allowed to receive information supplied locally. As a result, the input to a processing element can only be directly affected by a node connected to its input path. In an ANN, processing elements (PEs) or nodes are equivalent to the neurons of an NNN. Processing elements are usually analog and non-linear, possess a small local memory, and are slow compared with advanced digital circuitry. Individual processing elements are usually arranged in layers [3]. Two of these layers, the input buffer (layer) and the output buffer, are connected to the environment. Data are presented to the network at the input buffer, and the response to the input is presented at the output buffer. The layers between the input buffer and the output buffer are called hidden layers. Hidden layers enable the network to cater to non-linearities. At each node (PE) in a layer, information is received, stored, processed and communicated further to nodes in the next layer.
Neurons in adjacent layers are linked by connection weights, which determine the strength of the relationship between two connected neurons. The output from a neuron is multiplied by the connection weight before being introduced as input to the neuron in the next layer. Nodes in the various layers are either fully or partially interconnected. Each connection has an associated adaptation coefficient or "weight" representing the synaptic strength of the neural connection. Different weight values represent connections of varying strength. A zero weight represents the absence of a connection, and a negative weight represents an inhibitory relationship between two PEs. These weights are adjusted using a learning rule [4].
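The weighted-sum behaviour described above can be illustrated with a minimal sketch of one layer of processing elements. The logistic (sigmoid) activation, the bias terms and all numeric values are assumptions made for illustration only; the text does not state which activation function or weights were used in the study.

```python
import math

def sigmoid(z):
    """Logistic activation (an assumed choice), mapping any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """Compute the outputs of one layer of processing elements.

    weights[j][i] is the connection weight from input i to neuron j.
    """
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        # Each incoming signal is multiplied by its connection weight, then summed.
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(sigmoid(z))
    return outputs

# Hypothetical, already-scaled inputs (for example COD, BOD and TSS mapped to [0, 1]).
x = [0.5, 0.2, 0.8]

# Two hypothetical hidden neurons. In the first, the zero weight means the second
# input is not connected, and the negative weight makes the third input inhibitory.
W_hidden = [[1.0, 0.0, -1.0],
            [0.5, 0.5, 0.5]]
b_hidden = [0.0, 0.0]

hidden = layer_forward(x, W_hidden, b_hidden)
print(hidden)
```

Changing the second input leaves the first neuron's output unchanged, because the weight on that connection is zero.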

Materials and Methodology
The Kozhikode dairy, located 15 km from Calicut, a district of Kerala state, sits in lush green surroundings. At present, the dairy has a milk processing capacity of 2,00,000 liters per day. However, milk production crosses 4,00,000 liters per day during the festival season. The original design capacity of the effluent treatment plant is 300 cubic meters, with a milk-to-water ratio of 1:3. Water consumption is now limited to 1.6 times the volume of milk handled, through better water usage and conservation.

Training and test set
The development of ANN models requires representative data, which must be divided into two sets: a set used to carry out the learning procedure (training data) and a set used to evaluate the performance of the ANN model (test data) [5][6][7]. Studies have shown that the way the data are divided can have a significant impact on the results obtained [8]. The fraction of the data used for training should contain enough patterns for the network to mimic adequately the underlying relationship, including trends and patterns, between the input and output variables. An important requirement for good generalization capacity of the ANN is the completeness of the training database: if important variables are not measured, or are not available, the ANN may give a small training error but a large testing error. The training set consists of data patterns that the network processes repeatedly in order to learn trends and patterns in the data. During the learning process, the network is periodically evaluated using the test set patterns to ensure that it is not simply memorising the training data. In this study, the training set consisted of 87.5% of the total data (175 of 200 records), and the remaining 25 records were used for testing [9][10][11]. The test data were not presented to the network as training patterns; they were used only to test the network's generalization ability and to monitor its performance. Using the Run/Save Best menu of the MATLAB nntool, training was stopped at regular intervals and the performance (generalization ability) of the networks was tested by presenting the test set to the trained networks [12][13][14]. Training was continued until a plateau was reached in the RMS prediction error of the test set. The neural network model was created in MATLAB, which offers a platform for the simulation application.
The MATLAB toolbox opens the Network/Data Manager window, which allows the user to import, create, use and export neural networks and data (Figures 1-3).
The Network properties are as follows: 1. Network inputs: COD, BOD, and TSS.
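The data division and stopping criterion described in this section can be sketched as follows. This is an illustrative Python outline, not the MATLAB procedure used in the study: the function names, the random seed, the `patience` and `min_improvement` parameters and the RMS history are all hypothetical, and only the 175/25 split of 200 records comes from the text.

```python
import random

def split_records(records, n_train, seed=0):
    """Randomly split records into a training set and a test set."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

def reached_plateau(test_rms_history, patience=3, min_improvement=1e-4):
    """Return True once the test-set RMS error has stopped improving.

    The error is treated as flat when the best of the last `patience` checks
    is no more than `min_improvement` below the best earlier value.
    """
    if len(test_rms_history) <= patience:
        return False
    best_before = min(test_rms_history[:-patience])
    best_recent = min(test_rms_history[-patience:])
    return best_before - best_recent <= min_improvement

# Placeholder identifiers standing in for the 200 plant records.
records = list(range(200))
train_set, test_set = split_records(records, n_train=175)
print(len(train_set), len(test_set))

# Hypothetical test RMS values recorded each time training was paused.
rms_history = [0.40, 0.25, 0.15, 0.11, 0.10, 0.10, 0.10, 0.10]
print(reached_plateau(rms_history[:5]), reached_plateau(rms_history))
```

With this hypothetical history, the check reports no plateau while the error is still falling and reports a plateau once the last three evaluations bring no further improvement.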

ANN modelling
In this study, the performance of the effluent treatment plant is simulated by a multi-layer neural network, and the performance of the plant is evaluated over a period of six months using this model. The BOD, COD and TSS values of the influent and effluent were recorded to analyse the results. The application randomly divides the input vectors and target vectors into two sets: 87.5% are used for training, and the remainder are used to validate that the network is generalizing and to stop training before over-fitting. The architecture of the best model was selected on the basis of the minimum Root Mean Square (RMS) error. The best model obtained was the neural network with 9 hidden neurons, which achieved an RMS error of 0.0984 and a regression value of 0.99959 after 278 epochs. Figure 4 shows the regression plot of the best model selected. The model thus developed was validated by predicting the performance of the effluent treatment plant. Thus 15 data
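The two figures of merit used for model selection, the RMS error and the regression (R) value, can be computed as in the sketch below. It assumes that R is the Pearson correlation coefficient between measured and predicted values. The sample concentrations and the RMS values for 5, 7 and 11 hidden neurons are hypothetical; only the value 0.0984 for 9 hidden neurons is taken from the text.

```python
import math

def rms_error(measured, predicted):
    """Root mean square of the differences between measured and predicted values."""
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def correlation(measured, predicted):
    """Pearson correlation coefficient (assumed here to be the reported R value)."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)

def select_best(rms_by_hidden_neurons):
    """Return the hidden-neuron count whose model has the lowest RMS error."""
    return min(rms_by_hidden_neurons, key=rms_by_hidden_neurons.get)

# Hypothetical measured and predicted effluent concentrations.
measured = [30.0, 45.0, 60.0, 80.0]
predicted = [31.0, 44.0, 61.0, 79.0]
print(rms_error(measured, predicted), correlation(measured, predicted))

# RMS error per candidate architecture; all but the entry for 9 are hypothetical.
candidates = {5: 0.21, 7: 0.15, 9: 0.0984, 11: 0.12}
print(select_best(candidates))
```

An R value close to 1 indicates that the predictions follow the measurements closely, while the RMS error expresses the typical size of the prediction error in the units of the data.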

Conclusion
This paper presented a study on predicting the performance of the ETP of a dairy industry using an artificial neural network. The study focused on estimating the Root Mean Square (RMS) error from the inputs and outputs given to the ANN. The results indicated a high correlation coefficient (R-value) between the measured and predicted output variables, reaching 0.99959. The model developed in this work therefore has acceptable generalization capability and accuracy, and neural network modelling could effectively simulate and predict the performance of the effluent treatment plant of the dairy industry. It is concluded that ANN provides an effective analysis and diagnostic tool for understanding and simulating the non-linear behaviour of the plant.