
Proceedings of the International Conference on
Information Technologies (InfoTech-2016)
20-21 September 2016, Bulgaria

WEB DISTRIBUTED COMPUTING FOR EVOLUTIONARY TRAINING OF ARTIFICIAL NEURAL NETWORKS 1

Todor Balabanov, Delyan Keremedchiev, Ilia Goranov
Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, acad. Georgi Bonchev Str., block 2, office 514, 1113 Sofia, Bulgaria
e-mail: todorb@iinf.bas.bg
Abstract: Evolutionary algorithms (EAs) are widely used for training artificial neural networks (ANNs). EAs are computationally attractive because the training problem can be separated into smaller pieces, and these smaller pieces can be calculated on different machines (distributed computing). Distributed computing platforms are well established, and the most popular one is BOINC, created in Berkeley. A common problem of distributed computing platforms is the heterogeneity of the computational environment. The best way to address heterogeneity is to use a well-established technology such as AJAX. In this study a web-based distributed computing platform (JavaScript and AJAX) is presented. The platform is used for ANN training with EAs.
Key words: artificial neural networks, evolutionary algorithms, optimization, AJAX

1. INTRODUCTION
Artificial neural networks (ANNs) date back to 1943 and the work of Warren McCulloch and Walter Pitts [1]. In recent developments the most common ANN model is a three-layer network trained with back propagation of error (BP). This type of ANN is a directed weighted graph. Each node has its own activity, and the strength of the connections between the nodes determines how the individual nerve elements interact. Conventionally the network is divided into three layers: the first layer accepts information from the external environment and is referred to as the input layer. The third layer presents information from the network to the external environment and is called the output layer.
1 This work was supported by private funding of Velbazhd Software LLC.


Between the input and output layers stands a hidden layer, which has an essential role in the operation of the network but also hides most of the theoretical unknowns (e.g., what size of the hidden layer, or how many hidden layers, is optimal). In the classical three-layer ANN model, only forward connections from input to output are accepted. The presence of feedback is characteristic of recurrent ANNs. ANNs are most frequently used in classification or prediction tasks [2] [3]. The main task of the classic three-layer ANN is to approximate the function between the input and the output. This process is called training. Training amounts to finding such values of the weights in the network that the network performs the task it was designed for. Once trained, ANNs are extremely effective in practice, but the learning process is often slow and not very efficient [4] [5] [6]. Many learning algorithms (exact or approximate) have been developed for finding optimal values of the weights. They are based on gradient methods, evolutionary algorithms (EAs) and heuristic approaches to global optimization [7] [8] [9] [10] [11]. When population-based algorithms are used for ANN training, a significant advantage is the possibility to run the learning process in a parallel implementation, or even as calculations in a distributed environment [12] [13]. Calculations in a distributed environment are highly applicable to large tasks that can be computed in parallel. ANN training, when based on algorithms from the EA group, is ideal for implementation in a distributed environment. Different distributed computing platforms exist, and the most popular is BOINC with its project SETI@home [14] [15]. A major drawback of the most popular platforms for calculations in a distributed environment is the challenge of working in a heterogeneous environment, where the hardware and operating systems of the individual computers vary greatly. This shortcoming is clearly illustrated in the BOINC platform. The creator of a distributed computing project is responsible for developing client programs for almost every configuration (hardware - operating system) that should be supported in the project. Even the mere presence of 32-bit and 64-bit Windows, Linux and Mac OS X systems leads to developing at least six different client applications. The variety of hardware and operating systems is a major obstacle to the scalability of a distributed computing platform. At the same time, high expandability of the system can be achieved if the calculations are carried out with a technology that is widespread across different hardware platforms and operating systems. Such a technology is the web browser and the JavaScript programming language it supports. Any modern web browser allows asynchronous requests to the web server by using the AJAX technology.
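As an illustration only, a minimal sketch of such an asynchronous request follows; the URL and the response handling are hypothetical and serve just to show the mechanism:

    // Minimal AJAX sketch (hypothetical URL and response format).
    var request = new XMLHttpRequest();
    request.open("GET", "/work-unit", true); // true = asynchronous
    request.onreadystatechange = function () {
        // readyState 4 means the response has been fully received.
        if (request.readyState === 4 && request.status === 200) {
            var work = JSON.parse(request.responseText);
            // ...carry out the received piece of the computation...
        }
    };
    request.send();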
2. PROBLEM DEFINITION
This study presents a model for training an ANN with an evolutionary learning algorithm - differential evolution (DE) [16] [17], which belongs to the EA group. ANN training is done in a distributed environment. The network is represented as JavaScript code and communication with the central node is performed with asynchronous AJAX requests. The goal of the ANN training is forecasting of currency rates on the global currency market [18].
Forecasting is done by a mathematical model based on an ANN whose training is carried out with DE in the form of distributed computing. The ANN topology is a subject of research, and for this reason in the developed system the ANN topology is set by the user. The eligible topology types are: multi-layer, multi-feedback and fully-connected. The model is based on the classical ANN as used in the models with BP. A linear transfer function is used:
u[i] = sum( w[i][j] * x[j] )                                          (1)

The transfer function defines how the input signals, in combination with the weight coefficients, influence the activity of the respective neuron. Although models with other types of transfer functions are available, at this stage preference is given to the simplest model based on linearity.
The result of the transfer function is normalized using a threshold function (the selected model uses a sigmoid function with values ranging from 0.0 to 1.0). Normalization is needed because of the different number of connections between the different neurons.
x[i] = 1 / (1 + exp(-u[i]))                                           (2)

The sigmoid function is preferred as a threshold function because of its suitable properties in terms of differentiability and its asymptotes at plus and minus infinity. It is possible to use a binary function or a linear function (to improve performance), but their properties affect the results that can be achieved by the ANN. If the network works with values from -1 to +1, the hyperbolic tangent can be used as a threshold function.
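As an illustration, a minimal JavaScript sketch of this forward pass, combining (1) and (2), is shown below; the array layout of the weights is an assumption made only for the example:

    // Forward pass sketch for one layer (illustrative weight layout).
    // weights[i][j] is the connection weight from input j to neuron i,
    // x is the vector of incoming activities.
    function forwardPass(weights, x) {
        var result = [];
        for (var i = 0; i < weights.length; i++) {
            // Linear transfer function, equation (1).
            var u = 0;
            for (var j = 0; j < x.length; j++) {
                u += weights[i][j] * x[j];
            }
            // Sigmoid threshold function, equation (2), keeps the
            // activity in the range from 0.0 to 1.0.
            result.push(1 / (1 + Math.exp(-u)));
        }
        return result;
    }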
3. PROBLEM SOLUTION
Training is implemented with the help of DE. Within the DE algorithm each chromosome represents a set of weights for a specific ANN topology, evaluated over a certain number of training examples. The individual chromosomes (sets of weights) are therefore loaded sequentially into the structure of the ANN, after which the training examples are fed in. The error which the ANN makes on each example is calculated, and the summed error over the specific examples defines the viability coefficient of the chromosome (of the set of weights). The way the viability coefficient is determined is one of the key issues for the success of the overall forecast. For time series it is reasonable to submit the training examples in chronological order. This could be a problem for an ANN trained with an algorithm such as BP. It is also reasonable for chronologically earlier training examples to exert less influence in shaping the viability coefficient. Once rated, the individual chromosomes enter the computational scheme of DE, which performs selection, crossover and mutation.
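The following JavaScript sketch outlines the two described steps under some assumptions: the chronological weighting of the error is one possible form (the exact form is not fixed here), the DE variant is the common DE/rand/1/bin scheme, and "net" stands for a hypothetical object exposing load(weights) and forward(input):

    // Viability sketch: the summed, chronologically weighted squared
    // error of the network for a given chromosome (set of weights).
    function viability(net, chromosome, examples) {
        net.load(chromosome);
        var error = 0;
        for (var t = 0; t < examples.length; t++) {
            var out = net.forward(examples[t].input);
            var weight = (t + 1) / examples.length; // later examples count more
            for (var k = 0; k < out.length; k++) {
                var d = out[k] - examples[t].expected[k];
                error += weight * d * d;
            }
        }
        return error; // lower summed error = more viable chromosome
    }

    // DE step sketch (DE/rand/1/bin): build a trial chromosome from the
    // target and three distinct random individuals a, b and c.
    function trial(target, a, b, c, F, CR) {
        var result = new Array(target.length);
        var forced = Math.floor(Math.random() * target.length);
        for (var j = 0; j < target.length; j++) {
            result[j] = (j === forced || Math.random() < CR)
                ? a[j] + F * (b[j] - c[j]) // mutation with differential term
                : target[j];               // crossover keeps the target gene
        }
        return result; // selection keeps it only if its viability is better
    }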


In the parallel version of the algorithm, local copies of the ANN and of DE are created. The training takes place locally. Each chromosome in the population denotes an individual point in the search space, and different individuals are grouped relatively close together in this space. In parallel calculations on multiple computers, such grouping of the individuals can result in locally different areas of research in the solution space. In this respect, the most significant element is the synchronization policy. The synchronization process includes broadcasting and gathering the best individuals at a common centralized location (in a client-server architecture). This global set of most persistent individuals can be used in the creation of further local search areas. Instead of a client-server architecture it is possible to implement a peer-to-peer solution, with the absence of a centralized server. Each locally running application then communicates with the other locally running applications. A major advantage of the proposed distributed system is its extremely high degree of expandability.
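A minimal sketch of the client-server synchronization step might look as follows; the endpoint name and the message format are assumptions:

    // Synchronization sketch (hypothetical endpoint and message format):
    // report the locally best chromosome and receive the global best set.
    function synchronize(bestChromosome, bestViability, onGlobalBest) {
        var request = new XMLHttpRequest();
        request.open("POST", "/synchronize", true); // asynchronous AJAX call
        request.setRequestHeader("Content-Type", "application/json");
        request.onreadystatechange = function () {
            if (request.readyState === 4 && request.status === 200) {
                // The server answers with the globally best individuals,
                // which can seed further local search areas.
                onGlobalBest(JSON.parse(request.responseText));
            }
        };
        request.send(JSON.stringify({
            weights: bestChromosome,
            viability: bestViability
        }));
    }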
The advantages of the proposed model are the following. First, using DE for ANN training avoids the danger of the catastrophic forgetting that occurs in BP-based training. The second advantage is the ability to train, with DE, ANNs with recurrent links. The third advantage is the ability to train an ANN with DE regardless of the order in which the training examples are submitted. The fourth advantage is the ability to train various copies of the ANN, and to do so in parallel. This results in improved performance and better coverage of the search area.
The proposed model has the following disadvantages. An ANN, regardless of the problem being solved, has a very slow and difficult learning process. An ANN is very effective after being trained, but the slow training process requires large amounts of computing resources. Using DE slows the learning process compared to BP-based algorithms; it is nevertheless preferred because of the many advantages listed. Combining DE with BP presents an interesting direction for future research, although this cannot be achieved without some compromise with the topology of the ANN. Even though it is a relatively clean technology, the development of a distributed computing system is significantly more complicated than writing sequential programs, and even more complicated than writing parallel programs. Technological drawbacks are therefore also possible.
4. EXPERIMENTAL RESULTS AND DISCUSSION
The performance of JavaScript implementations is the main argument for avoiding the language when carrying out large-volume calculations or computations that require greater accuracy. JavaScript falls into the group of scripting languages, whose code is not compiled to processor instructions. Its code is interpreted by software modules called interpreters.


Fig. 1. Comparison of the speed of information dissemination in the ANN during the forward pass, between C++ and JavaScript. The X axis shows the number of neurons; the Y axis shows the computation time (in milliseconds).

Fig. 2. Standard deviation of the speed of information dissemination in the ANN during the forward pass, between C++ and JavaScript. The X axis shows the number of neurons; the Y axis shows the standard deviation.


At any time the interpreter may suspend the execution of the program code. Therefore the calculations are not sufficiently reliable. To check the difference in performance of the proposed AJAX-JavaScript solution, a series of experiments on the dissemination of information in the ANN during the forward pass were made. The code for the forward pass is developed in two separate modules of the VitoshaTrade [18] system - a module in C++ and a module in JavaScript, respectively. It is apparent from Fig. 1 that both speeds are comparable for networks with a size of 10 to 100 neurons. Each experiment was executed 30 times and the average values are presented in the figure.
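For reference, the timing of such repeated measurements in the browser can be sketched as follows; the 30 repetitions match the experimental setup, while the measured function is a placeholder:

    // Timing sketch: execute a measured function repeatedly and average
    // the wall-clock time, as in the reported experiments.
    function averageTime(measured, repetitions) {
        var total = 0;
        for (var r = 0; r < repetitions; r++) {
            var start = Date.now(); // millisecond resolution
            measured();
            total += Date.now() - start;
        }
        return total / repetitions; // mean time in milliseconds
    }

    // Example usage with the forward pass from the sketch above:
    // averageTime(function () { forwardPass(weights, input); }, 30);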
As regards performance, C++ and JavaScript are comparable, but the computing process in C++ shows higher stability, as can be seen in Fig. 2, which presents the standard deviation of the time needed for the computation. This difference is mainly due to the presence of an interpreter and a web browser, something that is not present in the calculations with C++. In the category of languages like C++, the program is first translated to assembly language and then to machine code.
5. CONCLUSION
The realization of calculations in a distributed environment as an AJAX web-based system leads to a very high degree of expandability of the system. Practically, the distributed computation can run on any device supporting a modern web browser able to run JavaScript and AJAX. The calculation is carried out within the web browser, which is a process in an address space of the operating system; the OS in turn runs on the physical hardware. Although comparable in performance, the calculations are less reliable (because of the presence of an interpreter), as opposed to implementations in languages like C/C++ or assembler.
REFERENCES
[1] McCulloch, Warren; Walter Pitts (1943), A Logical Calculus of Ideas Immanent in Nervous Activity, Bulletin of Mathematical Biophysics 5 (4): 115-133. doi:10.1007/BF02478259.
[2] Zissis, Dimitrios (October 2015), A cloud based architecture capable of perceiving and
predicting multiple vessel behaviour, Applied Soft Computing 35.
[3] Forrest MD (April 2015), Simulation of alcohol action upon a detailed Purkinje neuron model
and a simpler surrogate model that runs >400 times faster, BMC Neuroscience 16 (27).
doi:10.1186/s12868-015-0162-6.
[4] Werbos, P.J. (1975), Beyond Regression: New Tools for Prediction and Analysis in the
Behavioral Sciences.
[5] Schmidhuber, Jurgen (2015), Deep learning in neural networks: An overview, Neural Networks 61: 85-117. arXiv:1404.7828. doi:10.1016/j.neunet.2014.09.003.
[6] Edwards, Chris (25 June 2015), Growing pains for deep learning, Communications of the ACM 58 (7): 14-16. doi:10.1145/2771283.

216

PROCEEDINGS of the International Conference InfoTech-2016

[7] M. Forouzanfar, H. R. Dajani, V. Z. Groza, M. Bolic, and S. Rajan, (July 2010), Comparison of
Feed-Forward Neural Network Training Algorithms for Oscillometric Blood Pressure
Estimation, 4th Int. Workshop Soft Computing Applications. Arad, Romania: IEEE.
[8] de Rigo, D., Castelletti, A., Rizzoli, A.E., Soncini-Sessa, R., Weber, E. (January 2005), A selective improvement technique for fastening Neuro-Dynamic Programming in Water Resources Network Management, In Pavel Zítek (ed.), Proceedings of the 16th IFAC World Congress, IFAC-PapersOnLine. Prague, Czech Republic: IFAC. doi:10.3182/20050703-6-CZ-1902.02172. ISBN 978-3-902661-75-3.
[9] Ferreira, C. (2006), Designing Neural Networks Using Gene Expression Programming, In A. Abraham, B. de Baets, M. Köppen, and B. Nickolay, eds., Applied Soft Computing Technologies: The Challenge of Complexity, pages 517-536, Springer-Verlag.
[10] Da, Y., Xiurun, G. (July 2005), T. Villmann, ed. An improved PSO-based ANN with simulated
annealing technique. New Aspects in Neurocomputing, 11th European Symposium on
Artificial Neural Networks. Elsevier. doi:10.1016/j.neucom.2004.07.002.
[11] Wu, J., Chen, E. (May 2009), Wang, H., Shen, Y., Huang, T., Zeng, Z., eds., A Novel Nonparametric Regression Ensemble for Rainfall Forecasting Using Particle Swarm Optimization Technique Coupled with Artificial Neural Network, 6th International Symposium on Neural Networks, ISNN 2009. Springer. doi:10.1007/978-3-642-01513-7_6. ISBN 978-3-642-01215-0.
[12] Rumelhart, D.E; James McClelland (1986), Parallel Distributed Processing: Explorations in the
Microstructure of Cognition. Cambridge, MIT Press.
[13] Russell, Ingrid, Neural Networks Module, Retrieved 2012.
[14] D. P. Anderson, J. Cobb, E. Korpela, M. Lebofsky, and D. Werthimer, SETI@home: An
experiment in public-resource computing, Communications of the ACM, Nov. 2002, Vol. 45
No. 11, pp. 56-61.
[15] D. Anderson. BOINC, A System for Public-Resource Computing and Storage, In proceedings
of the 5th IEEE/ACM International GRID Workshop, Pittsburgh, USA, 2004.
[16] Storn, R., Price, K., Differential Evolution - A Simple and Efficient Heuristic for Global Optimization over Continuous Spaces, Journal of Global Optimization, vol. 11, Dordrecht, pp. 341-359, 1997.
[17] Price, K., An introduction to differential evolution, In David Corne, Marco Dorigo, and Fred Glover, editors, New Ideas in Optimization, pp. 79-108, McGraw-Hill, UK, 1999.
[18] Balabanov, T., VitoshaTrade - Distributed System for Forex Forecasting by Artificial Neural
Networks and Evolutionary Algorithms.
https://github.com/TodorBalabanov/VitoshaTrade/tree/master/ajax/
