An Intelligent System Based on Back Propagation Neural Network and Particle Swarm Optimization for Detection of Prostate Cancer from Benign Hyperplasia of Prostate

Conventional clinical diagnostic methods are generally based on a single classifier. In the present paper, we combine the principles of two algorithms and propose a new but simple hybrid classifier, called BPNN-PSO, in which a back-propagation neural network (BPNN) is optimized by particle swarm optimization (PSO). The algorithm optimizes the BPNN with PSO and reduces the computational time of the BPNN training phase. Its performance has been tested on prostate cancer: a total of 360 medical records collected from patients suffering from neoplasia diseases were used to train and test the proposed algorithm. The results show that the proposed BPNN-PSO algorithm achieves very high diagnostic accuracy (98%), which demonstrates its usefulness in supporting the clinical decision process for prostate cancer. Comparing the simulated results of the two cases, training the neural network with PSO gives more accurate (in terms of sum squared error) and faster (in terms of number of iterations and simulation time) results than BPNN alone. By using this hybrid method for building machine learning classifiers, we can significantly improve diagnostic performance with respect to the results of clinical practice.


Introduction
Prostate Cancer (PC) is the most commonly diagnosed non-skin cancer and the second leading cause of cancer deaths in western countries. Prostate cancer is a disease in which cancer develops in the prostate, a gland in the male reproductive system. Cancer occurs when cells of the prostate mutate and begin to multiply out of control. These cells may spread (metastasize) from the prostate to other parts of the body, especially the bones and lymph nodes. Prostate cancer develops most frequently in men over fifty; the prostate is found exclusively in the male reproductive tract. In general, prostate cancer is diagnosed by prostate biopsy, performed when suspicion arises from a Prostate-Specific Antigen (PSA) test, rectal examination, or transrectal findings. For a definitive diagnosis, a biopsy is needed in addition to transrectal ultrasonography and rectal examination. Although biopsy is required for a conclusive diagnosis, patients with low cancer risk may avoid this procedure, which is not without risk because of the complications that may arise. Biopsy is an invasive and costly procedure that carries the risk of damaging the rectal mucosa. Therefore, before they agree to a biopsy, patients may prefer a different methodology that may yield a more accurate result with fewer associated risks [1-6].
Many prediction methods have been developed that determine a patient's cancer risk based on age, ethnic background, family history of cancer, and PSA level. In recent years, diagnostic research on prostate diseases has focused on new methodologies, such as artificial intelligence methods. Artificial Neural Networks (ANN), one of the artificial intelligence methods, are widely used in the classification of cancer because of their speed in responding to the analysis of marker effects [2,5,7].

Materials and Method
As prostate cancer receives more attention from both high-risk patients and urologists, the number of clinically diagnosed prostate cancer cases is increasing dramatically; however, the ability of urologists to review such enormous amounts of clinical data and provide a diagnosis seems to be lagging somewhat. Additionally, diagnostic conclusions can vary widely between individual urologists. In the last decade, more applications and algorithms have been used and evaluated in order to design intelligent methods that support medical diagnostic decisions [1,2,4,7,8].
In the present study, two algorithms are used. In order to improve the ability of the conventional neural network to escape from a local optimum, the PSO algorithm was used to modify the network parameters.

Back Propagation Neural Network (BPNN) algorithm
Among Artificial Neural Networks (ANN), the three-layer feed-forward network, which can be trained to approximate virtually any continuous function, has in recent years been the most commonly used type of Multilayer Perceptron (MLP). An MLP propagates information from the input layer towards the output layer. It has an input layer of neurons, a number of hidden layers, and an output layer (Figure 1). With a sufficient number of hidden neurons in the hidden layers, an MLP can map any nonlinear input-output function to an arbitrary degree of accuracy. All layers are fully connected and of the feed-forward type. The outputs are nonlinear functions of the inputs and are controlled by the weights applied to the data.
In addition to these methods, a heuristic optimization algorithm is used to increase their success and speed. PSO, as a heuristic optimization method, has been successfully applied to train MLP neural networks. It is proposed for updating the network weights because it is easy to implement, has a small number of parameters to be set, works directly with real numbers, and needs no derivative information. In this study, in order to improve the ability of the conventional neural network to escape from a local optimum, the PSO algorithm was used to modify the network parameters and precision [4,9,10].
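As a rough illustration of how a three-layer feed-forward network maps inputs to outputs, the sketch below computes one forward pass in plain Python. The layer sizes (4-3-1), the weight values, the example input, and the function names are hypothetical; the sigmoid transfer function is the one the paper reports using in its experiments.

```python
import math

def sigmoid(z):
    """Logistic transfer function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def layer(inputs, weights, biases):
    """One fully connected layer: sigmoid(w . x + b) for every neuron."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Propagate an input vector through the hidden layer to the output layer."""
    hidden = layer(x, w_hidden, b_hidden)
    return layer(hidden, w_out, b_out)

# Hypothetical 4-3-1 network: 4 inputs, 3 hidden neurons, 1 output neuron.
x = [0.5, 0.2, 0.1, 0.7]                      # made-up, already scaled features
w_hidden = [[0.1, -0.2, 0.3, 0.0],
            [0.0, 0.4, -0.1, 0.2],
            [-0.3, 0.1, 0.2, 0.1]]
b_hidden = [0.0, 0.1, -0.1]
w_out = [[0.5, -0.4, 0.3]]
b_out = [0.05]

y = forward(x, w_hidden, b_hidden, w_out, b_out)   # one value in (0, 1)
```

Because every neuron applies a sigmoid, the single output lies strictly between 0 and 1 and can be thresholded to obtain a two-class decision.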

Particle Swarm Optimization (PSO) algorithm
The Particle Swarm Optimization (PSO) algorithm is a stochastic optimization algorithm based on swarm intelligence. One of the first applications of PSO was the training of neural networks, and one key advantage of PSO over other optimization algorithms in this task is its comparative simplicity. As described by Eberhart and Kennedy, the PSO algorithm is an adaptive algorithm based on a social-psychological metaphor: a population of individuals adapts by returning stochastically toward previously successful regions in the search space, and is influenced by the successes of their topological neighbors. Each particle in the swarm represents a candidate solution to the optimization problem, and if the solution is made up of a set of variables, the particle is correspondingly a vector of variables. In a PSO system each particle is "flown" through the multidimensional search space, adjusting its position according to its own experience and that of neighboring particles. The particle therefore makes use of the best position encountered by itself and by its neighbors to move toward an optimal solution. The performance of each particle is evaluated using a predefined fitness function, which encapsulates the characteristics of the optimization problem. The main operators of the PSO algorithm are the velocity and the position of each particle. In each iteration, particles evaluate their positions according to the fitness function [6,11,12]. Then the velocity of each particle is updated according to equation (1):
$$v_{id}(t+1) = w\,v_{id}(t) + c_1 r_1 \left(p_{id} - x_{id}(t)\right) + c_2 r_2 \left(p_{gd} - x_{id}(t)\right) \tag{1}$$
where t is the current step number and w is the inertia weight. Researchers have shown that for large values of the inertia weight, the global search ability of the algorithm increases. Nevertheless, once the algorithm approaches the optimum solution, a large inertia weight becomes a disadvantage.
For this reason, methods that adjust the inertia weight adaptively have been proposed. c_1 and c_2 are the acceleration constants, r_1 and r_2 are two random numbers in the range [0,1], x_{id}(t) is the current position of the particle, p_{id} is the best solution this particle has reached, and p_{gd} is the best solution all the particles have reached. After calculating the velocity, the new position of each particle is calculated according to equation (2):
$$x_{id}(t+1) = x_{id}(t) + v_{id}(t+1) \tag{2}$$
The PSO algorithm applies these update equations repeatedly until a specified number of iterations has been exceeded, or until the velocity updates are close to zero [11-16].
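The velocity and position updates described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: it assumes the standard inertia-weight form of the update, and the function names `pso_step` and `pso_minimize`, the sphere test function, the fixed inertia weight, the velocity limits, and all default parameter values are assumptions chosen for the example.

```python
import random

def pso_step(x, v, pbest, gbest, w, c1, c2, v_min, v_max, rng):
    """Apply one velocity and position update to every particle, in place."""
    for i in range(len(x)):
        for d in range(len(x[i])):
            r1, r2 = rng.random(), rng.random()
            vel = (w * v[i][d]
                   + c1 * r1 * (pbest[i][d] - x[i][d])   # pull toward own best
                   + c2 * r2 * (gbest[d] - x[i][d]))     # pull toward swarm best
            v[i][d] = max(v_min, min(v_max, vel))        # keep velocity within limits
            x[i][d] += v[i][d]

def pso_minimize(fitness, dim, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, v_min=-0.5, v_max=0.5, seed=0):
    """Minimize `fitness` over `dim` real variables; all defaults are illustrative."""
    rng = random.Random(seed)
    x = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]
    pbest_val = [fitness(p) for p in x]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]                                # best value after each iteration
    for _ in range(n_iter):
        pso_step(x, v, pbest, gbest, w, c1, c2, v_min, v_max, rng)
        for i in range(n_particles):
            val = fitness(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = x[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history

def sphere(p):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(c * c for c in p)

best, best_val, history = pso_minimize(sphere, dim=3)
```

The recorded `history` can never increase, because the swarm best is replaced only when a strictly better position is found. The sketch stops after a fixed number of iterations; stopping when the velocities approach zero would be an equally valid criterion.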

Hybrid algorithm of PSO and BPNN
The PSO algorithm is a global algorithm with a strong ability to find the global optimum. Its disadvantage is that the search around the global optimum is very slow: particle swarm optimization has been shown to converge rapidly during the initial stages of a global search, but near the global optimum the search process slows down considerably. In contrast, the Back Propagation Neural Network (BPNN) has a strong ability to find a local optimum, but its ability to find the global optimum is weak. In other words, BPNN converges faster around the global optimum and, at the same time, can reach a higher convergence accuracy. By combining PSO with BPNN, a new algorithm referred to as the PSO-BPNN hybrid algorithm is formulated. Some researchers have used PSO to train neural networks and found that a PSO-based ANN has better training performance, a faster convergence rate, and better predictive ability than a BP-based ANN. The fundamental idea of this hybrid algorithm is that, at the beginning of the search for the optimum, PSO is employed to accelerate training. PSO builds a set number of ANNs, initializes all network weights to random values, and starts training each one. On each pass through the data set, PSO compares the fitness of each network. The network with the highest fitness is considered the global best. The other networks are updated based on the global best network rather than on their personal error or fitness [6,11,12,16-19].

Dataset
In order to design the PSO-BPNN hybrid algorithm, data records were obtained from the Department of Urology, Imam Khomeini Hospital. The original database contained 360 records of patients who underwent radical prostatectomy for prostate cancer between 1 January 2006 and 31 December 2010. The dataset poses a two-class problem: each record is either positive for cancer or negative, that is, benign hyperplasia of prostate (BPH). Laboratory data belonged to 181 patients with cancer and 179 patients diagnosed with BPH. Results of the final diagnosis after biopsy, indicating whether the patients had cancer or not, were used in the study. The dataset included laboratory and demographic data such as PSA, free PSA, the ratio fPSA/tPSA, and age. The descriptive statistics of the preoperative parameters are given in Table 1. To show the distribution of the prostate neoplasia dataset, which includes prostate cancer and benign hyperplasia of prostate, the data are presented in Figure 2, which plots the raw dataset according to its first three features (age, PSA, and free PSA).

Experiment Result and Discussion
In the present research, the ANN parameters were optimized by the PSO algorithm. The adopted feed-forward neural network has three layers, and the sigmoid function was used as the transfer function for all neurons. The hybrid algorithm combines the features of the BP neural network with PSO in order to improve the traditional BP neural network. Each particle is randomly initialized to a certain position in the problem space. The number of dimensions of the problem space is equal to the number of components to be optimized. After the network had been constructed and topologically arranged, the training phase of the algorithm was started. The training and test strategy based on the PSO-BP algorithm can be seen in Figure 3. The following figure shows the general block diagram of the proposed system. This paper suggests the use of PSO as the training algorithm in order to optimize the weights and biases [6]. The structure of a particle is given by P_i, which is determined by Equation (3).
The flowchart given in Figure 2 shows the training and testing processes of PSO-BP. The ANN training process starts with random initialization of the weights and biases, which are the numerical values of the connections between layers. The elements of a particle are these weights and biases, as given in Equation 4. The number of connections between layers determines the particle size, that is, the dimension of the search space. For training, the stop criterion is chosen as either the maximum generation number or the target fitness value of the gbest particle [11,16].
Particles are then initialized randomly and updated afterwards according to equations (4) and (5). In most research, the parameter values have been selected by trial and error. In the present research, as usual, the number of particles, the maximum generation, and the number of neurons in the hidden layer were investigated by trial and error for each model. The initial values of α and w in equation (5) were chosen as 0.975 and 0.9, respectively; w_max and w_min were 0.9 and 0.4. The constants c1 and c2 were equal to each other, with a value of 2.1. The limits V_min and V_max were selected as -0.1 and 0.1, respectively. These values provided fast convergence to the target [11].
The basic idea behind medical tests is to calculate the probability of patients being sick based on the patients' test results. Receiver Operating Characteristics (ROC) analysis is an established method of measuring the diagnostic performance of medical tests. The ROC curve is a good measure when the performance of different classifiers needs to be compared. ROC analysis is a standard approach used to determine the sensitivity and specificity of a diagnosis. Sensitivity is the ability of a test to correctly identify those who are truly ill, and specificity is its ability to correctly identify those who are truly healthy. Sensitivity and specificity are the basic expressions for interpreting a diagnostic test in ROC analysis [20-23]. Sensitivity, specificity, and accuracy are expressed in equations (6)-(8), where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives:
$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{6}$$
$$\text{Specificity} = \frac{TN}{TN + FP} \tag{7}$$
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{8}$$
In the present research, BPNN-PSO was run separately for each number of neurons in the hidden layer. The figure shows the changes in fitness of the best particle for each BPNN-PSO during the training process.
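The idea that each particle holds all weights and biases of one network can be sketched as follows. This is only an illustration under stated assumptions: the exact layout of Equation (3) is not reproduced here, so the ordering used by the hypothetical `decode` function, the 4-3-1 layer sizes, the made-up samples, and the use of the sum of squared errors as the quantity to minimize are all choices made for the example.

```python
import math

# Hypothetical layer sizes: 4 inputs, 3 hidden neurons, 1 output neuron.
N_IN, N_HID, N_OUT = 4, 3, 1
# Search-space dimension = number of weights and biases to optimize.
DIM = N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT

def sigmoid(z):
    """Logistic transfer function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def decode(particle):
    """Split a flat particle into (w_hidden, b_hidden, w_out, b_out)."""
    if len(particle) != DIM:
        raise ValueError(f"expected {DIM} values, got {len(particle)}")
    it = iter(particle)
    w_hidden = [[next(it) for _ in range(N_IN)] for _ in range(N_HID)]
    b_hidden = [next(it) for _ in range(N_HID)]
    w_out = [[next(it) for _ in range(N_HID)] for _ in range(N_OUT)]
    b_out = [next(it) for _ in range(N_OUT)]
    return w_hidden, b_hidden, w_out, b_out

def predict(particle, x):
    """Forward pass of the network encoded by `particle` on one input vector."""
    w_h, b_h, w_o, b_o = decode(particle)
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_h, b_h)]
    return [sigmoid(sum(w * h for w, h in zip(row, hidden)) + b)
            for row, b in zip(w_o, b_o)]

def sse(particle, samples):
    """Sum of squared errors over (features, target) pairs; lower is better."""
    return sum((predict(particle, x)[0] - t) ** 2 for x, t in samples)

# Made-up, already scaled samples: target 1.0 = cancer, 0.0 = BPH.
samples = [([0.6, 0.3, 0.2, 0.5], 1.0),
           ([0.4, 0.1, 0.3, 0.2], 0.0)]
zero_particle = [0.0] * DIM
error = sse(zero_particle, samples)
```

With this encoding, a swarm optimizer only needs to call `sse` on each particle; it never has to know the network topology beyond the dimension `DIM`.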
The hybrid algorithm in the present study combines the PSO algorithm with the back-propagation algorithm, so as to merge the strong global search ability of particle swarm optimization with the strong local search ability of back-propagation. After BPNN-PSO was implemented according to the design strategy, confusion matrix and ROC analyses were conducted to assess the success rate of the algorithm. The BPNN-PSO that was devised classified 59 of the 60 test data successfully. The confusion matrix obtained through all of the training and test data sets is given in Table 2. When the diagnostic test conducted on all of the data in Table 2 is examined, the following results are reached. Sensitivity is the ability to correctly identify the patients who truly have prostate cancer. From equation (6), sensitivity was found to be 96.87%.
In the present study, specificity is the ability to correctly identify the patients who truly have benign hyperplasia of prostate. From equation (7), specificity was found to be 100%.
In addition, accuracy is the proportion of all patients, with prostate cancer or with benign hyperplasia of prostate, that the test diagnoses correctly. From equation (8), accuracy was found to be 98.33%.
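The three measures can be recomputed from confusion-matrix counts with a few lines of Python. The counts used below are not taken from Table 2, which is not reproduced here; they are values inferred from the reported 59 correct classifications out of 60 test records together with the stated percentages, and the function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # true cancers correctly flagged
    specificity = tn / (tn + fp)                 # true BPH cases correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # all correct diagnoses
    return sensitivity, specificity, accuracy

# Inferred counts (an assumption): 31 true positives, 1 false negative,
# 28 true negatives, 0 false positives, for a total of 60 test records.
sens, spec, acc = diagnostic_metrics(tp=31, fn=1, tn=28, fp=0)
```

Under these assumed counts, sensitivity is 31/32 = 0.96875, specificity is 28/28 = 1, and accuracy is 59/60, which agrees with the reported 96.87%, 100%, and 98.33%.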

Conclusions
In this work, we investigated the use of a back-propagation neural network based on PSO for classifying prostate cancer and benign hyperplasia of prostate. For this purpose, several parameters were varied within their standard ranges and the resulting performance was calculated; the optimized parameters were then selected according to that performance. To analyze the performance of the proposed methodology, several statistically validated indexes were used. Previously reported results were also compared with our proposal. The proposed system has the advantages of automation: it is rapid, easy to process, noninvasive, and cheap for clinical application. Although this system does not diagnose cancer conclusively, the information it provides can help doctors decide whether a biopsy is necessary. Such a diagnostic method can help physicians make accurate decisions in distinguishing prostate cancer from benign hyperplasia of prostate.