alexa Solving Systems of Nonlinear Equations Using A Globally Convergent Optimization Algorithm | Open Access Journals
ISSN: 2229-8711
Global Journal of Technology and Optimization
Make the best use of Scientific Research and information from our 700+ peer reviewed, Open Access Journals that operates with the help of 50,000+ Editorial Board Members and esteemed reviewers and 1000+ Scientific associations in Medical, Clinical, Pharmaceutical, Engineering, Technology and Management Fields.
Meet Inspiring Speakers and Experts at our 3000+ Global Conferenceseries Events with over 600+ Conferences, 1200+ Symposiums and 1200+ Workshops on
Medical, Pharma, Engineering, Science, Technology and Business

Solving Systems of Nonlinear Equations Using A Globally Convergent Optimization Algorithm

Sona Taheri*, Musa Mammadov

Centre for Informatics and Applied Optimization, School of Science, Information Technology and Engineering, University of Ballarat, Victoria, Australia

*Corresponding Author:
Sona Taheri
Centre for Informatics and Applied Optimization
School of Science, Information Technology and Engineering
University of Ballarat, Victoria, Australia
E-mail: [email protected], [email protected]

Received date: February 2010, Revised date: September 2011, Accepted date: May 2012

Visit for more related articles at Global Journal of Technology and Optimization

Abstract

Solving systems of nonlinear equations is a relatively complicated problem for which a number of different approaches have been presented. In this paper, a new algorithm is proposed for the solutions of systems of nonlinear equations. This algorithm uses a combination of the gradient and the Newton’s methods. A novel dynamic combinatory is developed to determine the contribution of the methods in the combination. Also, by using some parameters in the proposed algorithm, this contribution is adjusted. We use the gradient method due to its global convergence property, and the Newton’s method to speed up the convergence rate. We consider two different combinations. In the first one, a step length is determined only along the gradient direction. The second one is finding a step length along both the gradient and the Newton’s directions. The performance of the proposed algorithm in comparison to the Newton’s method, the gradient method and an existing combination method is explored on several well known test problems in solving systems of nonlinear equations. The numerical results provide evidence that the proposed combination algorithm is generally more robust and efficient than other mentioned methods on some important and difficult problems.

Keywords

Systems of nonlinear equations, Newton’s Method, Gradient method, Line search, Global convergence

Introduction

The solutions of systems of equations have a well-developed mathematical and computational theory when solving linear systems, or a single nonlinear equation. The situation is much more complicated when the equations in the system do not exhibit nice linear or polynomial properties. In this general case, both the mathematical theory and computational practices are far from complete understanding of the solution process.

Systems of nonlinear equations arise in various domains of practical importance such as engineering, medicines, chemistry, and robotics [15, 21, 37]. They appear also in many geometric computations such as intersections, minimum distance, creation of centenary curves, and when solving initial or boundary value problems in ordinary or partial differential equations [13] and [16]. The application of nonlinear systems in load flow calculation in power system has been done by Spong and et. all [32] in which their results of block Guass-Sidel iteration are compared with those of Newton-Raphson iteration. Solving such a system involves finding all the solutions of equations contained in the mentioned system.

In this paper, we consider the problem of finding solutions to a system of nonlinear equations of the form

image (1)

where image and imagerefers to n variables image We denote the i-th component of F by fi, whereimage is a nonlinear function and twice continuously differentiable on a convex set image

There is a class of methods for the numerical solutions of the system (1), which arises from iterative procedure used for systems of linear equations [12]. These methods use reduction to simpler one-dimensional nonlinear equations for the components image

There are some iterative methods for solving systems of nonlinear equations in the book written by Kelley [15]. A wide range class of iterative methods for solving systems of nonlinear equations has been suggested in the papers [2, 11, 25, 26].

Most of the methods for solving (1) are optimization-based methods [1, 4, 6, 11, 17, 22, 37]. In the approach proposed in [22], the system (1) is transformed in to a constraint optimization problem. At each step, some equations that are satisfied at the current point are treated as constraints and the other ones as objective functions. In a strategy based on optimization methods, at each iteration, a quadratic function is minimized to determine the next feasible point to step to. The quadratic function is the squared norm of the original system.

To find a solution of (1), one can transform the system (1) into an unconstrained optimization problem and then solving the new unconstrained problem instead by applying an optimization method. The transformed problem is formulated as:

image (2)

where, here and throughout the paper, image stands for the Euclidean norm. Obviously, optimal solutions of problem (2) with the zero value of the objective function correspond to global solutions of system (1).

In the last decades, many publications, both in theoretical and especially numerical issues, have been done for solving the problem (2) [3, 5, 9, 10, 18, 24, 27, 31, 33, 35]. Many search direction methods such as the gradient method, the Newton’s method, the quasi-Newton methods, the conjugate gradient and coordinate direction methods have been applied to find a minimizer of (2).

The steepest descent method (or gradient method) is a commonly used method. It has the globally convergence property, however, this method suffers from the slow speed and is easy plunging into local minima. In order to accelerate these difficulties, many methods have been used [10]. One way is the use of combination of different local optimization methods. It has been found that these methods show significant reduction in the number of iterations and the expense of function evaluations. In recent years, there has been a growing interest in applying these combination methods [7, 29, 30, 36]. Buckley [7] proposed a strategy of using a conjugate gradient search direction for most iterations and using periodically a quasi- Newton step to improve the convergence. This algorithm offers the user the opportunity to specify the amount of available storage. Wang and et al. [36] proposed a revised conjugate gradient projection method, that is, a combination of the conjugate projection gradient and the quasi-Newton methods for nonlinear inequality constrained optimization problems. Recently, Y. Shi [29] proposed a combined method of the Newton’s and the steepest descent methods for solving nonlinear systems of equations within each iteration. Further in [30], in order to deal with an unconstrained problem, the combination of the steepest descent with the Newton and the quasi-Newton methods were developed and compared with some traditional and existing methods.

Our procedure here for solving systems of nonlinear equations is based on the combination of local optimization methods. We apply the gradient and the Newton’s methods for our combination algorithm. They are combined into an integrated procedure, and especially the dynamic combination is of our interest challenge. The combined algorithms proposed in this paper are different from the existing algorithms [7, 29, 30, 36]. In the other words, we propose a novel algorithm with a new combination which offers the user the opportunity to specify the amount contribution of the methods.

The rest of the paper is organized as follows: Section 2 gives a brief review to preliminaries about optimization. In Section 3, we review the descent methods. We present the proposed combination algorithm in Section 4. The global convergence property of this algorithm has been proved in Section 5. We have demonstrated the efficiency of the proposed algorithm with some experiments in Section 6. Section 7 concludes the paper.

Preliminaries

Usually, optimization methods are iterative. The basic idea is that, with an initial guess of the optimal values of the variables,image an optimization method generates a sequence an optimization method generates a sequence image of improved estimates until it reaches a solution. When image is a finite sequence, the last point is the optimal solution; when image is infinite, it has a limit point which is the optimal solution of the problem. The strategy used to move from one iterate to the next distinguishes one algorithm from another. A typical behavior of an algorithm which is regarded as acceptable is that the iterates image move steadily towards the neighborhood of a point local minimizer, and then rapidly converge to that point. When a given convergence rule is satisfied, the iteration will be terminated. In general, the most natural stopping criterion is image

where image stands forimage and f is defined by (2). image is a prescribed error tolerance.

Let image be the k-th iterate image search direction, andimage step length, then the k-th iteration is image (4)

There are two fundamental strategies for moving from the current point image to a new stateimage Trust region [24, 38] and Line search [24, 28, 31, 33].

 

In the trust region strategy, the information gathered about f is used to construct a model function whose behavior near the current point xkis similar to that of the actual objective function f. When x is far from xk the model may not be a good approximation of f. Therefore, the search for a minimizer of the model is restricted to some region around xk.

In the line search strategy, the algorithm chooses a direction dk and searches along this direction from the current iterate xk for a new iterate with a lower function value.

The line search and trust-region approaches differ in the order in which they choose the direction and distance of the move to the next iterate. Line search starts by fixing the direction dk and then identifying an appropriate distance, namely the step length image In trust region, firstly a maximum distance is chosen, the trust region radius, and then a direction and a step that attain the best possible improvement subject to this distance constraint is found. If this step proves to be unsatisfactory, the distance measure will be reduced and tried again [24].

A trust region method is effective since it limits the step to a region of greater confidence in the local model and attempts to utilize more information from the local model for finding a shortened step. However, trust region models are more difficult to formulate and solve than a line search strategy [31]. In this paper, we will focus on line search strategies.

Line Search

Line search methods are traditional and efficient methods for solving unconstrained minimization problems. Its convergence has attracted more attention in recent years [3, 19, 35].

The success of a line search method depends on effective choices of both the direction dk and the step length image. It is clarified that the search direction plays a main role in the algorithm and that step length guarantees the global convergence in some cases.

There are two alternatives for finding the distance to move along image namely the exact line search and inexact line search [19, 28, 31, 33]. In the exact line search, the following onedimensional minimization problem will be solved to find a step length α

image (5)

If we choose image  such that the objective function has acceptable descent amount, i.e., it means the descent imageimage (6)

is acceptable by users, such a line search is called inexact line search. Since, in practical computation, exact optimal step length generally cannot be found, and it is also expensive to find almost exact step length, therefore the inexact line search with less computation load is highly popular.

A simple condition we could impose on % in an inexact line search is to require a reduction in image:

image (7)

It has been shown that this requirement is not enough to produce convergence to optimal point [24, 33]. The difficulty is that there is not always a sufficient reduction in f at each step, a concept we discuss next.

There are several inexact line search rules for choosing an appropriate step length image for example the Armijo rule, the Goldstein rule, and the Wolfe-Powell rules [24, 31, 33], which are described briefly in the following

Armijo Rule and Goldstein Rule

Armijo rule is as follows:

image (8)

image are tried successively until the above inequality is satisfied for m=mk

Goldstein presented the following rule. Let

image

be an interval. In order to guarantee the function decreases sufficiently, we want to choose α such that it is away from the two end points of the interval I.

The Goldstein conditions are

image (9) and image (10)

which exclude those points near the right end point and the left end point.

Wolfe-Powell Rule

It is possible that the rule (10) excludes the minimizing value of α outside the acceptable interval. Instead, the Wolfe-Powell gives another rule to replace (10):

image

Therefore, the step length αk in the Wolfe-Powell rule will be determined along the direction dk satisfying:

image (11)

and

image(12)

The Wolfe-Powell rule is a popular inexact line search rule. We will use it in our algorithm and all experiments in this paper.

Search Directions

The search direction in gradient-based methods often has the form

image (13)

where Bk is a symmetric and nonsingular matrix. For example, in the gradient method Bk is simply the identity matrix, dk = — gk [3,24,33] image

corresponds to the Newton’s method with image being available, where Hk is an exact Hessian of f [24,33]. In quasi-Newton methods, Bk is an approximation to the Hessian Hk that is updated at every iteration by means of a low-rank formula [5, 9, 24, 33]. In the conjugate gradient method, dk is defined by

image [8,18, 24, 33].

When dk is defined by (13) and Bk is positive definite, we have image and therefore dk is a descent direction.

The search direction dk is generally required to satisfy the descent condition:

image (14)

The condition (14) guarantees that dk is a descent direction of f at xk [24, 33].

Descent Methods

Many techniques have been devoted for solving (2), as well as (1). These problems are usually carried out using iterative methods due to the fact that there are generally no analytical methods to solve these problems. Among the variety of the exiting methods, the descent direction methods are the most popular techniques because of their fast convergence property. A general descent direction algorithm is given in the Algorithm 1.

Algorithm 1. A General Descent Framework

0. Lets image be a given initial point, and image an error tolerance. Each iteration image of a descent direction method contains the following steps:

1. If image then stop.

2. Compute a descent direction dk at xk satisfying (14).

3. Determine an appropriate step length αk > 0.

4. Set image and go to the next iteration.

Let image be the level set, and consider the Wolfe-Powell conditions (11) and (12) to determine αk then the global convergence of the Algorithm 1 is given by the following Theorem [33].

Theorem 1. Let αk in the above descent direction algorithm be defined by (11) and (12). Let also dk satisfies

image(15)

for some image and for all k, where image is the angle between dk and image If  image exists and is uniformly continuous on the level set Ω then either gk = 0 for some k, orimageimage

Proof can be found in [33], Theorem 2.5.4.

One of the most widely used methods satisfying Theorem 1 is the gradient method, in which image Although the method is globally convergent and usually works well in some early steps, as a stationary point is approached, it may descend very slowly. In fact, it is shown that the convergence rate of the gradient method is at least linear, and the following bound holds image (16)

where image are the largest and the smallest eigenvalues of the Hessian matrix, respectively.

In order to cope with the above-mentioned difficulties, one can use the Newton’s method with the quadratic convergence property. At the k-th iteration, the classical Newton’s direction is the solution of the following system:

image (17)

where Hk is the Hessian matrix at xk . If H is positive definite, then the Newton’s direction is a descent direction and consequently the system has a unique solution. Even when H is positive definite, it is not guaranteed that Newton’s method will be globally convergent. Although the Newton’s method generally converges faster than the gradient method, it depends strongly on a starting point. On the other hand, the application of the Newton’s method for solving the nonlinear equations is expensive due to the direct calculations of second order derivatives of the function, H. A number of techniques avoiding the direct computation of H may be used. Upon different approximation there are different methods. In this category are the quasi-Newton methods which approximate second derivatives in a most subtle and efficient way. Another alternative is the use of a fusion of different local optimization methods which lead naturally to powerful algorithms and has been attracted extensive attention in recent years. One of the most successful methods of this category, introduced by Shi [30], uses a combination of the gradient method and the Newton’s method. This algorithm is an efficient algorithm for solving problem (2) due to its global convergence property. In our experiments, we compare our results with this combination algorithm and refer it by ShA. The direction in algorithm ShA is very close to the Newton’s direction. However, practical implementations show that, in some cases the gradient method can be a more suitable choice than the Newton’s method. For instance, when the difference of the function values, in two previous iterations, and also the value of the gradient in the previous iteration is large enough, the gradient method may work better than the Newton’s method.

Proposed Algorithm

Our aim here is to present an algorithm with two different combinations for solving the problem (2), as well as the problem (1). Both proposed combinations are constructed so that they satisfy in the condition of descent methods and as well as the Theorem 1.

Let image be four parameters so thatimageimage Take any positive constants image such thatimageimage and initializeimage by 1. Let also T and image be very large and small positive numbers, respectively, and let image The steps of the proposed algorithm are as follows.

Algorithm 2. A Combination of the Gradient and Newton’s Methods

0. Choose a starting point image and an error toleranceimage . For image do

1. If image then stop.

2. If the Newton’s direction d1is not computable, due to the singularity of the Hessian, then compute the gradient direction image

3. Compute the gradient direction d2 and the Newton’s direction d1 at xk that satisfies (17).

4. Set image and image

5. If image step 6.

6. If image

7. If image go to step 9, otherwise go to the next step.

8. Use rules (11) and (12) to determine a step length image along the directionimage and go to step 12.

9. Compute image as follows:

image (18)

and set image

10. Ifimage and and go back to step 9.

11. Consider one of the following two versions to calculate Sk: a. Use rules (11) and (12) to determine a step length image along the direction imageimage. If imageimage otherwise set image

b. Use rules (11) and (12) to determine a step length image along the direction image

12. Set image

Parameters b1 ,b2 and b3are positive constants so that they offer the user the opportunity to specify the amount contribution of the methods. More precisely, when the slope of the function is slight, the algorithm tends to the Newton’s method, otherwise the contribution of gradient is increased and is considered close to the gradient method. In (18), when a difference between two previous values of the function is high then image is close to 0 and image Moreover, this equation is a dynamic form and has a crucial rule in the algorithm so that it specifies the amount contribution of the methods. It, also, guaranties that, near the solution, we get the optimal point with a super-linear convergence rate.

In step 11 of the above algorithm, we use two different strategies by means of the combination. Step 11.α is a new combination and different from the existing methods in the literature. In this combination, the step length αk is determined only along the gradient direction. In other words, we use a novel combination of the pure Newton’s method (i.e., αk = 1) and the gradient method. The second one is the usual combination which has been developed in some research works. The step length in this case is found along a combination of the gradient and the Newton’s directions.

Global convergence Theorem

Here, we establish the global convergence of the proposed combination algorithm based on the global convergence property of the Theorem 1.

Theorem 2. Consider using the Algorithm 2 to solve the problem (2). Assume that exists and is uniformly continuous on the level set image. Then either gk = 0 for some k, or image

Proof. Let assume that image is bounded below for al k. It is clear that in this case, image for all k. Denote image We will show that the direction dk obtained by the algorithm satisfies condition (15) of the Theorem image

image (19)

for all image whereimage is the angle between image

Suppose image is obtained at Step 8. Then image (Step 8), and it is easy to see that image it means (15) holds. Now, we consider other cases: case 1: image is obtained via Step 11.b and case 2: image is obtained via Step 11.a. We will proof each case separately as follows:

Take any intege k

1. In this case, we assume that image is obtained at Step 11.b. Then image is chosen as a descent direction and according to steps 9-10, the number image can be chosen so that the inequality imageimage Therefore, we have

image(20)

that is, (19) holds and therefore the obtained direction, dk, satisfies in the assumption of the Theorem 1, hence the remainder proof is similar to the proof of the Theorem 1 in [33]. 2. In this case, we assume Sk is obtained at Step 11.a, i.e. Sk = image

If the number of cases in Sk obtained by Step 11.a is finite, then it means Sk is defined by the gradient direction,d2 for all sufficiently large k and therefore the proof will be easily obtained.

Now suppose it is not finite, i.e., there is a subsequence image such that image is obtained via Step 11.a.

By considering the first condition in 11.a, since fk is bounded below we have

image

In addition, from image we obtain

image

Now, we are going to show image Suppose it is not true. Then there exist a subsequence image such that image

Here we consider two cases: imageimage does not converge to zero. The case (i) leads to contradiction by applying the second condition in Step 11.a. In the case (ii), let us consider image which is contradiction by image Therefore, the proof is complete, image

Experiments and Results

We have evaluated the performance of the proposed algorithm for several well known benchmark test problems given in [20, 34].In the proposed algorithm, we use two different combination as described in steps 11.a and 11.b; we refer these cases as 'Ala' and 'Alb', respectively. The group of methods we have compared includes Ala, Alb, the gradient method (GM), the Newton’s method (NM), and ShA presented in [8]. In all algorithms we use the Wolfe-Powell line search rules to find an acceptable step length.

The calculations were carried out using MATLAB. The comparison of the methods is based on the following criteria: all methods are terminated if the gradient converges to a predefined tolerance, image or the iteration number exceeds 500.

The parameters image used in this paper are: imageimageimage

We have listed the following ten test problems used in the experiments. To define the test functions, the general formats 1 to 3 have been adopted [20, 34]:

1. Dimension n.

2. Function definition, image

3. Standard initial point x0.

Problem 1. Helical Valley function

image

image

where

image

Problem 2. Powell Singular function

image

image

Problem 3. Wood function

image

Problem 4. Watson function

image

image

image

image

Problem 5. Extended Kearfott function

image

Problem 6. Extended Eiger-Sikorski-Stenger

image

Problem 7. Variably dimensional function

image

Problem 8. Discrete Boundary Value function

image

Problem 9. Extended Rosenbrock function

image

Problem 10. Trigonometric function

image

image

image

Table 1 lists the performance of the above-mentioned algorithms relative to the number of iterations used. We have multiplied the given initial points by 10 to have an additional initial point. In this table, “TP” and “IP” stand for test problem and initial point, respectively. Table 2 shows the summary of convergence results for the Table 1. In order to compare the algorithms with more initial points, we have generated 50 random initial points uniformly distributed from their domains with the intersection of image The summary of the convergence results of the algorithms considering these random

global-journal-technology-Number-iterations

Table 1. Number of iterations for 10 test problems

global-journal-technology-Summary-iterations

Table 2. Summary of convergence results for Table 1

initial points is given in Table 3. In these tables, notations “AC” and “NC” stand for the almost convergence and not convergence, respectively. Convergence means that the method finds the solution and image almost convergence means that the method finds a solution almost close to the optimal local solution and image otherwise not convergence.

Te numerical results in Tables 1 to 3, demonstrate the high performance of the proposed combination algorithm compared to other mentioned methods. This is confirmed by the number of iterations obtained, and the convergence properties. For example, the proposed algorithm, Ala, converges in all test problems for two different initial points. Alb converges in nine test problems out of ten. This algorithm finds the solution in the Wood function almost near the optimal solution. Although the algorithm proposed by Shi, ShA, convergences in nine test problems out of ten, but it fails to find the solution in the problem 3. Also, the number of iterations obtained by ShA is more than the proposed algorithms, in average. This is worse for the Newton’s and the gradient methods with more AC and NC properties.

global-journal-technology-Convergence-results

Table 3: Convergence results, by considering 50 initial random points for each test problem

Conclusion

A combined algorithm of the gradient and the Newton’s methods has been presented for solving systems of nonlinear equations. We have considered two different combinations. One of them is a usual case which has been recently introduced in some research works. Another one is a new combination and different from others in the literature. According to the numerical experiments, it is clear the proposed algorithm, especially the proposed algorithm with the new combination, is more efficient than others.

References

Select your language of interest to view the total content in your interested language
Post your comment

Share This Article

Article Usage

  • Total views: 11891
  • [From(publication date):
    December-2012 - Nov 22, 2017]
  • Breakdown by view type
  • HTML page views : 8112
  • PDF downloads : 3779
 

Post your comment

captcha   Reload  Can't read the image? click here to refresh

Peer Reviewed Journals
 
Make the best use of Scientific Research and information from our 700 + peer reviewed, Open Access Journals
International Conferences 2017-18
 
Meet Inspiring Speakers and Experts at our 3000+ Global Annual Meetings

Contact Us

Agri & Aquaculture Journals

Dr. Krish

[email protected]

1-702-714-7001Extn: 9040

Biochemistry Journals

Datta A

[email protected]

1-702-714-7001Extn: 9037

Business & Management Journals

Ronald

[email protected]

1-702-714-7001Extn: 9042

Chemistry Journals

Gabriel Shaw

[email protected]

1-702-714-7001Extn: 9040

Clinical Journals

Datta A

[email protected]

1-702-714-7001Extn: 9037

Engineering Journals

James Franklin

[email protected]

1-702-714-7001Extn: 9042

Food & Nutrition Journals

Katie Wilson

[email protected]

1-702-714-7001Extn: 9042

General Science

Andrea Jason

[email protected]

1-702-714-7001Extn: 9043

Genetics & Molecular Biology Journals

Anna Melissa

[email protected]

1-702-714-7001Extn: 9006

Immunology & Microbiology Journals

David Gorantl

[email protected]

1-702-714-7001Extn: 9014

Materials Science Journals

Rachle Green

[email protected]

1-702-714-7001Extn: 9039

Nursing & Health Care Journals

Stephanie Skinner

[email protected]

1-702-714-7001Extn: 9039

Medical Journals

Nimmi Anna

[email protected]

1-702-714-7001Extn: 9038

Neuroscience & Psychology Journals

Nathan T

[email protected]

1-702-714-7001Extn: 9041

Pharmaceutical Sciences Journals

Ann Jose

[email protected]

1-702-714-7001Extn: 9007

Social & Political Science Journals

Steve Harry

[email protected]

1-702-714-7001Extn: 9042

 
© 2008- 2017 OMICS International - Open Access Publisher. Best viewed in Mozilla Firefox | Google Chrome | Above IE 7.0 version
adwords