
2010 International Forum on Information Technology and Applications

Improved Methods of the BP Neural Network Algorithm and Its Limitations


Chaoju Hu, Fen Zhao
Department of Computer Science, North China Electric Power University, Baoding 071003, China
zhaofen242@163.com

ABSTRACT: The BP algorithm is a very important and classic learning algorithm, with a wide range of applications in pattern recognition, image processing and analysis, and control. In practice, however, the BP algorithm still has shortcomings: its convergence is slow, it tends to converge to local minimum points rather than the global optimum, it requires many iterations, and its numerical stability is poor. To solve these problems, many scholars have proposed improved algorithms. This article summarizes three frequently used improvements: learning rate adjustment, adding a momentum term, and optimizing the initialization of the weights. These three methods can improve the convergence speed and error precision of a BP network; at the same time, they improve convergence behavior and reduce the possibility of the network falling into a local minimum or oscillating.

KEYWORDS: convergence speed; local minimum; learning rate; momentum

I. INTRODUCTION

The BP neural network (Back Propagation Neural Network), also known as the error back-propagation neural network, is a feed-forward network composed of non-linear transformation units. A general multi-layer feed-forward network is a BP neural network.

The BP neural network is the most widely used neural network: in practical applications of artificial neural networks, 80%-90% of the models are based on the BP network or its variations. Among the many neural network models, the BP network model was recognized earliest and is applied most widely. It is also the core of the feed-forward network and embodies the essence of the artificial neural network. Today, BP networks are applied widely in pattern recognition, image processing and analysis, and control.

The BP learning algorithm is the basic method for training artificial neural networks, and it is a very important and classic learning algorithm. Its essence is to minimize the error function; with it, we can adjust the weights of a multi-layer feed-forward neural network. The proposal of this learning algorithm played a significant role in promoting the development of artificial neural networks.
II. INTRODUCTION OF THE BP NETWORK MODEL

A BP network is a network trained with the backward propagation algorithm, which is also known as the BP algorithm. First, samples are propagated from the input layer, through the middle layer, to the output layer, where the output-layer neurons produce the network's response to the input. Then, following the direction that reduces the error between the target output and the actual output, the algorithm corrects all connection weights layer by layer, from the output layer back through the middle layer [1]. Training is the basic feature of a neural network: to reach the desired prediction accuracy, the weights and thresholds of the network's connection nodes are adjusted by repeatedly comparing, during training, the value predicted by the network with the actual value.

The main idea of the BP algorithm is to divide learning into two phases. The first phase (forward propagation) presents the input information; starting from the input layer, the actual output value of each unit is calculated layer by layer through the hidden layers. In the second phase (the reverse pass), if the expected output has not been obtained at the output layer, the difference between the actual output and the expected output (i.e., the error) is calculated recursively, layer by layer, and the weights are adjusted on the basis of this difference. More specifically, for each weight the product of the receiving unit's error and the sending unit's activation is calculated; the actual weight change for a pattern is given by this weight error derivative, and the changes can be accumulated over the pattern set.

The BP algorithm is a typical learning method for feed-forward neural networks and belongs to the family of gradient descent algorithms; it is a supervised learning algorithm. From Fig. 1 we can see that the key of the BP algorithm is the forward transmission of information and the back-propagation of error for each neuron.

At present, most people use the multi-layer network model shown in Figure 2. The input layer receives input information from the outside world and passes it to each neuron of the middle layer. The middle layer is the internal information-processing layer, responsible for transforming the information; according to the required processing capacity, it can be designed as a single hidden layer or as multiple hidden layers. The last hidden layer passes the processed information to each neuron of the output layer, which completes one forward pass and outputs the processing result. When the actual output disagrees with the desired output, the network enters the error back-propagation phase: starting from the output layer, the error is back-propagated to the hidden layers and the input layer, and the weights of each layer are corrected by gradient descent on the error. This alternation of forward information transmission and error back-propagation, during which the weights of each layer are adjusted continuously, is the learning and training process of the neural network. It continues until the network's output error is reduced to an acceptable level or a predetermined number of learning iterations is reached [2].
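The two-phase procedure can be made concrete with a short sketch. The following Python fragment is our minimal illustration, not code from the paper: it assumes a single hidden layer, a sigmoid activation f, and the squared-error measure used in Section III, and all names (train_step, W1, W2, eta) are ours.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_step(x, d, W1, W2, eta):
    """One forward pass and one weight correction for input x and desired output d.
    W1, W2 are the hidden- and output-layer weight matrices (updated in place)."""
    # Phase 1 (forward propagation): compute each layer's output in turn.
    y = sigmoid(W1 @ x)                  # hidden-layer activations
    o = sigmoid(W2 @ y)                  # actual output of the network

    # Phase 2 (reverse pass): error terms, computed from the output layer backward.
    delta_out = (d - o) * o * (1.0 - o)             # output-layer error
    delta_hid = (W2.T @ delta_out) * y * (1.0 - y)  # error propagated to hidden layer

    # Each weight changes by (receiving unit's error) x (sending unit's activation).
    W2 += eta * np.outer(delta_out, y)
    W1 += eta * np.outer(delta_hid, x)
    return 0.5 * np.sum((d - o) ** 2)    # squared error, for monitoring convergence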

The standard BP algorithm is the most commonly used method for neural network training. It is also the most mature of the training algorithms and has been applied widely, owing to its simple structure, ease of operation, modest computational cost, strong parallelism, and ability to model arbitrary non-linear input-output relationships [3]. In theory, the BP algorithm can approximate any non-linear function; in practice, however, because there is no theoretical basis for selecting the parameters in many network training tasks, applications of neural networks are often unsatisfactory. So far, the limitations of the BP algorithm have not been overcome fundamentally [4].

Figure 1. Model of an artificial neuron: inputs X1, ..., Xn with weights w1j, ..., wnj produce the output yj.

III. THE LIMITATIONS OF THE BP NETWORK AND ITS IMPROVEMENT

In practice, the BP algorithm has been found to have the following shortcomings:
(1) The convergence speed of the algorithm is very slow.
(2) It easily converges to local minimum points and cannot reach the global optimum.
(3) It requires many iterations.
(4) Its numerical stability is poor, and its parameters (learning rate, initial weights) are difficult to tune [5].

Recently, to remedy the slow convergence, long training time, and other defects of the BP network, many scholars have proposed improved algorithms. However, when BP networks are used to solve practical problems, there is no viable theoretical guidance on how many layers should be selected, how many neurons each layer should contain, or which transfer function and training algorithm should be chosen; workable settings can only be found through experiments, which invisibly increases the workload of research and programming. The main directions for optimizing the BP algorithm are weight adjustment, an adaptive learning rate, and the network structure. From a survey of the literature, the authors conclude that the following improved methods are mainly used.

Concerning the network structure: using more hidden units to solve a problem leads to fewer local minima, while using fewer hidden units leaves more local minima. Increasing the number of hidden-layer nodes can improve how accurately the network matches the training set; however, to improve the network's adaptability to new patterns, an appropriate reduction of the hidden-layer nodes is required. Therefore, for a particular network, the number of hidden-layer nodes should be chosen by weighing accuracy against generality.
1) Learning rate adjustment [1]

The slow convergence of BP network learning is mainly caused by a learning rate that is too small; a variable learning rate can improve it.

One approach is to use a larger learning rate at the beginning of training, and to decrease it as the number of learning steps grows while the error between the actual output and the expected output is still large. The network then modifies its weights sharply at the start of training, so the error decreases rapidly, and the learning rate is reduced gradually as training stabilizes. This not only speeds up the convergence of the network but also helps to avoid oscillation.

Another approach is to start training with a smaller learning rate. If successive training steps decrease the error, the learning rate is increased exponentially; if the error grows sharply, the learning rate is reduced quickly. Although this can improve the training speed to a certain extent, improper control of the learning rate adjustment can lead to oscillation of the network.
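The second strategy can be written down directly. The sketch below is our illustration rather than the paper's; the growth, cut, and tolerance factors are assumptions chosen only to make the rule concrete.

def adjust_learning_rate(eta, prev_error, curr_error,
                         up=1.05, down=0.5, tolerance=1.04):
    """Variable learning rate: grow eta while the error keeps falling,
    cut it quickly if the error grows noticeably."""
    if curr_error < prev_error:
        return eta * up        # error fell: increase eta (exponential growth over steps)
    if curr_error > tolerance * prev_error:
        return eta * down      # error grew sharply: reduce eta quickly
    return eta                 # error roughly unchanged: keep eta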

2) Adding a momentum term

Using an additional momentum term makes it possible to glide past local minima. The momentum added by this method is essentially equivalent to a damping term: it reduces the oscillation tendency of the learning process and improves convergence. This is the most widely applied of the improved algorithms [6].

The weight-update formula with a momentum term is
$\Delta w(t+1) = -\eta\,\frac{\partial E}{\partial w} + \alpha\,\Delta w(t)$

where $\eta$ is the learning rate and $\alpha$ is the momentum factor, generally taken to be about 0.9. Adding the momentum term speeds up the network because its real effect is that the step of the learning process is no longer a constant value but changes continually. After the momentum term is introduced, the adjustment moves in the average direction toward the bottom of the error surface and no longer swings widely; the momentum thus plays a buffering role.

If the system enters a flat region of the error-function surface, the error changes very little from step to step, so $\Delta w(t+1) \approx \Delta w(t)$, and on average the update becomes

$\Delta w = -\frac{\eta}{1-\alpha}\,\frac{\partial E}{\partial w}$

In this expression the coefficient $\eta/(1-\alpha)$ is much larger than $\eta$ alone, so the adjustment becomes more effective and the weights leave the saturated zone as quickly as possible.


Clearly, the introduction of the momentum term accelerates the pace of learning, and the effect of this method on the convergence rate is very obvious. However, the approach has an obvious fault: the parameters $\eta$ and $\alpha$ can only be determined by experiment.
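As an illustration of the update rule above (a minimal sketch with our own naming, not the paper's code), the momentum term simply adds a fraction alpha of the previous weight change to the current gradient step:

import numpy as np

def momentum_step(w, grad_E, prev_dw, eta=0.1, alpha=0.9):
    """Weight update with a momentum term:
    dw(t+1) = -eta * dE/dw + alpha * dw(t)."""
    dw = -eta * grad_E + alpha * prev_dw
    return w + dw, dw    # new weights, plus the change to reuse at the next step

In a flat region grad_E is nearly constant, so dw settles at -eta * grad_E / (1 - alpha); with alpha = 0.9 the effective step is ten times eta, which is exactly the accelerating effect described above.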

Figure 2. Multi-layer BP neural network structure: input layer, hidden layer, and output layer, with the error back-propagated from the comparison with the desired output.

3) Optimizing the initialization of the weights [7,8]

The initial weights strongly influence the final solution. Weight correction is the main factor that determines the convergence speed and precision of the network, and it is also the core of the algorithm. According to the error formula

$E = \sum_{p}\sum_{k}\left(d_k^p - o_k^p\right)^2 = \sum_{p}\sum_{k}\left(d_k^p - f\left(\sum_{j} w_{jk}\, y_j\right)\right)^2$
the weights $w_{ij}$ that minimize $E$ can be obtained by the least-squares method. When the error is at its minimum, the weights assigned in this way make it easier to meet the required accuracy, and the number of training iterations is reduced.

In addition, because the $p$ groups of sample data are called cyclically, not just a single set of data, the data used are correlated. Therefore, when the weights are initialized, all of them must be set together. The weights of the first layer are $v_{ij}$ and are required to satisfy

$\sum_{i=1}^{p} v_{ij} = 1, \quad j = 1, 2, \ldots, n$

Such weights distribute the sample data uniformly and achieve better results.
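A minimal sketch of such an initialization follows; it is our illustration, and the uniform random starting values are an assumption, with only the column-sum constraint taken from the text above.

import numpy as np

def init_first_layer(p, n, rng=None):
    """First-layer weights v of shape (p, n) with each column summing to 1,
    i.e. the sum over i of v[i, j] equals 1 for j = 1, ..., n."""
    if rng is None:
        rng = np.random.default_rng()
    v = rng.uniform(0.0, 1.0, size=(p, n))     # positive random start (assumption)
    return v / v.sum(axis=0, keepdims=True)    # normalize every column to sum to 1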

IV. CONCLUSION

This paper has summarized three kinds of improved methods for the slow convergence and long training time of the network. Learning rate adjustment avoids the oscillation phenomenon by decreasing and increasing the learning rate as training proceeds. Adding a momentum term is in effect adding a damping term, which reduces the oscillation tendency of the learning process, improves convergence, and helps the network escape local minima and reach convergence. When the error of the network does not fall for a long time, the weights of the network are increased and the initialization of the weights is optimized. These three methods can improve the convergence speed and error precision of a BP network; at the same time, they improve convergence behavior and reduce the possibility of the network falling into a local minimum or oscillating.

REFERENCES
[1] Aoxiang Li, Jian Chen. Improved Methods of BP Neural Network Parameters. Technology Analysis, 2009(01).
[2] Yibing Sun. Discussion on the Advantages and Disadvantages of BP Network and Its Improvement. Science and Technology Innovation Leader, 2009(24).
[3] Hua Yu, Wenquan Wu, Liang Cao. Improved Algorithm of BP Neural Network and Its Application. Computer Knowledge and Technology, Vol. 5, No. 19, July 2009, pp. 5256-5258.
[4] Benguo Yu. The Limitations of BP Neural Network and the Study of Its Improvement. Journal of Shanxi Agricultural University (Natural Science Edition), 2009, 29(1).
[5] Hai-ou Guan, Shaohua Xu, Yuhu Zuo, Xiaodan Ma. An Improved Algorithm of BP Neural Network in Wheat Scab Forecasting. Heilongjiang Bayi Farming University, 2009(6).
[6] Xiaoyan Liu. Improved Method of BP Neural Network Algorithm. Gansu Science and Technology, 2008(11).
[7] Yikai Wang, Guoming Lai. Improvement of BP Neural Network Prediction and the Application in the Stock. Journal of Hanshan Normal University, 2008(12).
[8] Xiaoya Huang. Data Mining Technology Applied Research Based on BP Neural Network Algorithm. Journal of Nantong Vocational College, 2007(12).
