MODELLING FUTURE TELECOMMUNICATIONS SYSTEMS

BT Telecommunications Series

Edited by

P. Cochrane
Advanced Applications and Technologies, BT Laboratories, Martlesham Heath, UK

and

D.J.T. Heatley
Advanced Mobile Media, BT Laboratories, Martlesham Heath, UK
Contents

Contributors  vii

Preface, Peter Cochrane and David J T Heatley  ix

The future
P Cochrane

Modelling interactions between new services
M H Lyons  11

Fractal populations
S Appleby  22

Internal markets
I Adjali, J L Fernandez-Villacanas Martin and M A Gell  45

Hierarchical modelling
M A H Dempster  84

Distributed restoration
D Johnson, G N Brown, C P Botham, S L Beggs and I Hawker  124

Intelligent switching
R Weber  144

10 Neural networks
S J Amin, S Olafsson and M A Gell  153

11  168

12  201

13 Evolving software
J L Fernandez-Villacanas Martin  224

14  245

15 Evolution of strategies
S Olafsson  264

16
S Olafsson  285

17  311

Index  345
Contributors

I Adjali
S Amin
S Appleby
S L Beggs
C P Botham
G N Brown
R A Butler
C A Carrasco, School of Engineering, Staffordshire University
P Cochrane
M A H Dempster
J L Fernandez-Villacanas Martin
M A Gell
I Hawker
D J T Heatley
D Johnson
M H Lyons
P W A McIlroy
E A Medova
S Olafsson
C T Pointon, School of Engineering, Staffordshire University
S Steward
R Weber
C S Winter
Preface
Since the invention of the electric telegraph, generations of engineers have
concerned themselves with the modelling of systems and networks. Their goal
has been, and continues to be, the gaining of fundamental insights and
understanding leading to the optimum exploitation of available technology.
For over 130 years this has brought about startling advances in the development of transmission systems, switching and networks. We are now within
sight of realizing a global infrastructure that represents the nervous system
of the planet, with telecommunications governing and underpinning all of
mankind's activity. It is therefore vital that we continue to expand our
understanding of all facets of this global infrastructure, from the constituent
parts through to market demands.
At a time when national networks are achieving 100% digital transmission
and switching, with optical fibre dominating over copper cables, and with
satellite and microwave radio, demand for mobility and flexible access is on
the increase, and a new awareness of complexity has arisen. Firstly, the world
of telecommunications is becoming increasingly complex and inherently
nonlinear, with the interaction of technologies, systems, networks and
customers proving extremely difficult to model. Secondly, the relevance of
established models and optimization criteria is becoming questionable as
we move towards the information society. For example, minimizing
bandwidth usage or charging for distance hardly seems appropriate when
both are becoming increasingly low cost and irrelevant with the deployment
of optical fibre systems. Conversely, optimizing the performance and cost
of system hardware and software independently of each other seems shortsighted when either can represent a dominant risk. In a similar vein we could
also challenge the continuation of established, but little understood,
technologies and approaches in software and packet switching.
The key question is whether we are optimizing the right parameters to
the right criteria. There are no universal answers or solutions to this question
as we live in a sea of rapidly changing technology, applications and demand.
Even a crude global model remains just a gleam in our engineering eye, but
a much coveted objective. In the meantime, we have to settle for an
independent and disconnected series of models and assume we can cope with
the rising level of chaos (in the mathematical sense)! Probably the single,
most focused hope that we can foster is the ideal of widespread (even global)
simplification. Switching and transmission systems hardware has already
undergone a meteoric rise in complexity, followed quite naturally by incredible
simplification, and there are now signs that software may ultimately share
the same good fortune. In contrast, their interaction with services, compounded by the unpredictability of the market-place, shows no such tendency
- so far!
The ideal of a single, all-embracing model that will identify and correctly
optimize the right parameters is undoubtedly some way off. It may even be
unattainable in the strict sense due to the rapid development of new technologies, services and societies, and so we may never attain true global
optimization. Nevertheless, work towards understanding that goal and the
barriers must continue. It is therefore the purpose of this book to highlight
a meaningful sample of the diverse developments in future system and
network modelling. Our selection has been purposeful and designed to
contrast with, and challenge, the progressively established wisdoms and
practices of previous decades. We contend that telecommunications is undergoing fundamental change across a broad range of technologies, and as such
can adopt new strategies to dramatic effect. The key difficulty is the
transformation of established human preconception. For example, one fibre
in a cable can be more reliable than ten in parallel; the duplication of power
supplies can realize a higher level of network reliability than alternative
routeing; conventional software routines amounting to millions of lines of
code can be replaced by just a few hundred by using the principles of artificial
life; conventional market models do not necessarily apply to telecommunications, etc. All of these are known to be true and yet fly in the face of
current expectations and acceptability.
In gathering together this selection of diverse topics, we have tried, with
the help of the best crystal ball available, to indicate the most likely directions
for the long-term development of telecommunications. In this task we have
enjoyed the full co-operation and support of the individual authors whose
respective works all support our future vision. That is not to say that there
have not been, or do not remain, points of contention. Quite the contrary.
Nor is our selection complete - we have merely taken a snapshot, the best
available at this epoch, to indicate some of the most promising and likely
directions. We hope that you, the reader, will find our selection agreeable
and that you will share in our excitement for the challenge ahead.
Peter Cochrane
David J T Heatley
THE FUTURE
P Cochrane
1.1 INTRODUCTION
challenge a number of the established wisdoms and indicate the likely impact
of the changes forecast and the implications for future networks.
1.2 NEW NETWORKS
In less than 15 years, the longer transmission spans afforded by optical fibre
have seen a reduction in the number of switching nodes and repeater stations.
The arrival of the optical amplifier and network transparency will accelerate
this process and realize further improvements across a broad range of
parameters, including:
improved reliability;
1.3
10 Gbit/s and higher rates almost endlessly. An interesting concept now arises
- the notion of the infinite backplane. It could be used to link, for example,
Birmingham, Sheffield and Leeds through the use of optically amplifying
fibre that offers total transparency. Such concepts naturally lead to the idea
of replacing switches by an optical ether operating in much the same way
as radio and satellite systems today. The difference is the near-infinite
bandwidth of the optical ether. Demonstrators have already shown that a
central office with up to two million lines could be replaced by an ether
system, but suitable optical technology is probably still some 15 or so years
away. Systems of this kind would see all the software, control and
functionality located at the periphery of networks with the Telco probably
becoming a bit carrier only!
1.4 SIGNAL FORMAT

1.5

1.6 SOFTWARE
In the software domain very minor things pose a considerable risk, which,
it appears, might grow exponentially in the future. New ways of negating
this increasing risk are necessary as the present trajectory looks unsustainable
in the long term. Perversely, the unreliability of hardware is coming down
rapidly whilst that of software is increasing, so much so that we are now
seeing sub-optimal system and network solutions. From any engineering
perspective this growing imbalance needs to be addressed. If it is not, we
can expect to suffer an increasing number of ever more dramatic failures.
It is somewhat remarkable that we should pursue a trajectory of
developing ever more complex software to do increasingly simple things. This
is especially so, when we are surrounded by organisms (moulds and insects)
that have the ability to perform complex co-operative tasks on the basis of
very little (or no) software. An ant colony is one example where very simple
rule-sets and a computer with ~200 (English garden ant) to 2000 (Patagonian
ant) neurons are capable of incredibly complex behaviour. In recent studies,
the autonomous network telepher (ANT) has been configured as a contender
for the future control of networks. Initial results from simulation studies
have shown considerable advantages over conventional software. For network
restoration, only 400 lines of ANT code replaced the >10^6 lines presently
used in an operational network. Software on this scale (~1000 lines) is within
the grasp of the designer's full understanding, and takes only a few days
to write and test by a one-man team.
1.7
on the earthquake scale, 6.0 marks the boundary between minor and
major events - a magnitude 6 outage would represent, say, 100 000
people losing service for an average of 10 hours;
1.8 NETWORK MANAGEMENT

the mean number of reports per day = [N(N - 1)]/(MTBF in days) ... (1.2)
For example, a network of 500 000 nodes with a mean time before failure
(MTBF) of 10 years will suffer an average of 137 node failures and will
generate an average of 68.5 million reports per day. Assuming each node
is communicating with all the others is, in general, unreasonable, and the
opposite extreme is the least connected case, which leads to:
the mean number of reports per day = [N^2/6]/(MTBF in days)
... (1.3)
Whilst there are network configurations and modes of operation that
realize a fault report rate proportional to N, the nature of telecommunications
networks to date tends to dictate an N^2 growth. A large national network
with thousands of nodes can generate information at rates of ~1 Gbyte/day
under normal operating conditions. Clearly, maximizing the MTBF and
minimizing N have to be key design objectives. A generally hidden penalty
associated with the N^2 growth is the computer hardware and software, plus
transmission and monitoring hardware overhead. For very large networks
this is now growing to the point where it is starting to rival the revenue-earning
elements - a trend that cannot be justified or sustained.
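The figures quoted here follow directly from the report-rate formulas; a quick numerical check (the function names are illustrative, not the chapter's):

```python
# Numerical check of the fault-report arithmetic in this section.
# Equation (1.2) is taken as the fully connected case, in which each
# failure is reported by roughly all N nodes; (1.3) is the
# least-connected case quoted above.

def failures_per_day(n_nodes, mtbf_years):
    """Mean number of node failures per day."""
    return n_nodes / (mtbf_years * 365)

def reports_fully_connected(n_nodes, mtbf_years):
    """Eq. (1.2): mean reports per day = N(N - 1)/(MTBF in days)."""
    return n_nodes * (n_nodes - 1) / (mtbf_years * 365)

def reports_least_connected(n_nodes, mtbf_years):
    """Eq. (1.3): mean reports per day = [N^2/6]/(MTBF in days)."""
    return (n_nodes ** 2 / 6) / (mtbf_years * 365)

N, MTBF = 500_000, 10
print(round(failures_per_day(N, MTBF)))                  # ~137 failures per day
print(round(reports_fully_connected(N, MTBF) / 1e6, 1))  # ~68.5 million reports per day
```

This reproduces the 137 failures and 68.5 million reports per day quoted above, and makes the design point concrete: halving N cuts the fully connected report rate by roughly a factor of four.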
1.9
1.10
All of our experience of systems and networks to date, coupled with the
general development of photonics and electronics, points towards networks
of fewer and fewer nodes, vastly reduced hardware content, with potentially
limitless bandwidth through transparency. With networks of thousands of
nodes, failures tend to be localized and isolated - barring software-related
events! The impact of single or multiple failures is then effectively contained
by the 'law of large numbers' with individual customers experiencing a
reasonably uniform and flat grade of service. However, as the number of
nodes is reduced, the potential for catastrophic failures increases, with the
grade of service seen at the periphery becoming extremely variable. The point
at which such effects become apparent depends on the precise network type,
configuration, control and operation; but, as a general rule, networks with
< 50 nodes require design attention to avoid quantum effects occurring under
certain traffic and operational modes. A failure of a node or link today, for
a given network configuration and traffic pattern, may affect only a few
customers and go almost unnoticed. The same failure tomorrow could affect
large numbers of customers and be catastrophic purely due to a different
configuration and traffic pattern.
1.11 A GLOBAL MODEL
operator, for serving customer needs will become the essential credo as the
level of competition increases. Models giving an end-to-end network view
of the service - interfaces, protocols, signalling, connection, performance
and activity - are therefore the next big challenge.
1.12 CONCLUSIONS
BIBLIOGRAPHY

Cochrane P: 'Future trends in telecoms transmission', Proc IEE, Pt F, 131/7, p 669 (December 1984).

Cochrane P, Heatley D J T and Todd C J: ITU World Telecoms Conference, Geneva '91, p 105 (1991).

Cochrane P and Brain M C: IEEE Comsoc Mag, 26/11, pp 45-60 (November 1988).

IEEE Special Issue: 'Fiber in the subscriber loop', LTS, 3/4 (November 1992).

IEEE Special Issue: 'Realizing global communications', COMSOC Mag, 30/10 (1992).

IEEE Special Issue: 'The 21st century subscriber loop', COMSOC Mag, 29/3 (1991).

IEEE Special Issue: 'Global deployment of SDH compliant networks', COMSOC Mag (August 1990).

IEEE Telecommunications Network Design & Planning, J-SAC, 7/8 (October 1989).

World Communications - going global with a networked society, ITU Publication (1991).

Hughes C J: 'Switching - state-of-the-art', BT Technol J, 4, No 1, pp 5-19 and 4, No 2, pp 5-17 (1986).

Hawker I: 'Future trends in digital telecoms transmission networks', IEE ECEJ, 2/6, pp 251-290 (December 1990).

Brown G N et al: 3rd IEE Conference on Telecommunications, Edinburgh, pp 319-323 (1991).

Olshansky R: 'Sixty-channel FM video SCM optical communication system', IEEE OFC'88, p 192 (1988).

Chidgey P J and Hill G R: 'Wavelength routeing for long haul networks', ICC '89, p 23.3 (1989).

Healey P et al: 'SPIE digital optical computing II', 1215, pp 191-197 (1990).
MODELLING INTERACTIONS
BETWEEN NEW SERVICES
M H Lyons
2.1 INTRODUCTION

competition between a telecommunications service and a rival (non-telecommunications) service offering similar facilities.
2.2 GROWTH MODEL
parameter includes factors such as price and quality of service and is defined
formally as:
Pi = the probability that a customer will, all things being equal,
purchase service i. This definition implies Σi Pi = 1.
new customers;
existing customers.
2.2.1
New customers
If n1 and n2 are the number of existing customers to services 1 and 2 respectively, then N (the total number of existing customers to the service class) is given by:

N = n1 + n2 ... (2.2)
and the growths of services 1 and 2 are given by:
Δn1 = [P1F1/(P1F1 + P2F2)]RN ... (2.3a)

Δn2 = [P2F2/(P1F1 + P2F2)]RN ... (2.3b)
2.2.2
Existing customers
The number of existing customers (N) is constant during a unit time period.
However, some redistribution of existing customers between the services may
occur (Fig. 2.1).
Fig. 2.1 Transfer of existing customers between service 1 (n1) and service 2 (n2); the transfer rate from service 1 to service 2 is J12 = CP2F2n1.
2.2.3
Overall growth
The overall equations describing net growth of services 1 and 2 in unit time
are obtained by summing the new customers arising from growth and transfer:
Δn1 = [P1F1/(P1F1 + P2F2)]RN + C(P1F1n2 - P2F2n1) ... (2.7a)

Δn2 = [P2F2/(P1F1 + P2F2)]RN + C(P2F2n1 - P1F1n2) ... (2.7b)

2.3
2.3.1
When the two services are fully interconnected, a customer can communicate
with the whole of the service class, i.e. F1 = F2 = 1. This situation applies,
for example, to competition between rival PSTN operators. Substituting for
F1 and F2 in equation (2.7), and using the fact that (P1 + P2) = 1, the
following expressions are obtained:
Δn1 = P1RN + C(P1n2 - P2n1) ... (2.8a)

Δn2 = P2RN + C(P2n1 - P1n2) ... (2.8b)

so that, in the long run, ni tends to PiN.
Fig. 2.2 Number of customers of services 1 and 2 against year.
service 2. The equilibrium market shares of services 1 and 2 are 60% and
40% respectively, reflecting the values of P1 and P2.
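The fully interconnected case is easy to reproduce by iterating equations (2.8a) and (2.8b) directly; the growth rate, transfer coefficient and starting numbers below are illustrative choices, not values from the chapter:

```python
# Discrete-time iteration of equations (2.8a) and (2.8b) for two fully
# interconnected services (F1 = F2 = 1). r is the growth rate of the
# overall service class and c the transfer coefficient; both values
# here are illustrative.

def step(n1, n2, p1, r, c):
    """One unit time period of the fully interconnected growth model."""
    p2 = 1.0 - p1
    N = n1 + n2
    dn1 = p1 * r * N + c * (p1 * n2 - p2 * n1)   # eq. (2.8a)
    dn2 = p2 * r * N + c * (p2 * n1 - p1 * n2)   # eq. (2.8b)
    return n1 + dn1, n2 + dn2

n1, n2 = 100.0, 900.0          # service 1 starts with a 10% share
for _ in range(200):
    n1, n2 = step(n1, n2, p1=0.6, r=0.05, c=0.1)

share1 = n1 / (n1 + n2)
print(f"share of service 1 after 200 periods: {share1:.3f}")
```

Whatever the initial split, the share of service 1 converges to P1 (0.6 here, giving the 60/40 equilibrium described above).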
This model has been used to estimate the future growth of mobile
communications services in the UK by assuming that mobile could be
considered to be in competition with connections to the PSTN. The results
are shown in Fig. 2.3.
Fig. 2.3 Estimated growth of UK cellular mobile compared with the PSTN (number of customers, log scale, against year, 1987-1999).
2.3.2
Fig. 2.4 Number of customers against year.

Fig. 2.5
2.3.3
before, substituting in equations (2.7a) and (2.7b) gives the overall growth
for services 1 and 2:
Δn1 = ... (2.10a)

Δn2 = ... (2.10b)
Fig. 2.6 Market share (%, log scale) of videophones against year for P = 0.8 and P = 0.5.
Pvideo = 0.8 and Pvideo = 0.5. Both curves assume a growth rate for the
overall service class (videotelephony + POTS) of 5%, an initial market
penetration by videophones of 0.1% and a value of C = 1. It can be seen
that if P = 0.8, then videophones would reach an equilibrium 75% of the
market by 2010, whereas for P = 0.5, the market share remains static at a
mere 0.1%. Smaller values of P would lead to a decline in market share.
2.4 CONCLUSIONS
3 FRACTAL POPULATIONS
S Appleby
3.1 INTRODUCTION
This chapter presents a review of fractal and related techniques which may
be useful for the planning or analysis of large networks to serve the human
population. The work divides naturally into two areas:
firstly, the use of fractals for modelling and characterizing the spatial
distribution of human population;
Finally, these two areas are combined to show how fractal structure in
the population affects the design of a distribution network.
The motivation for this review is to make the techniques described here
more widely known amongst the telecommunications engineering community
and to show how these techniques can be used.
The main reason for a telecommunications operator to be interested in
these techniques is because a graph-theoretical approach is not tractable for
large networks; there are too many possible network configurations. If an
underlying structure could be found in the population distribution, then it
might allow a number of problems to be solved without designing the network
3.2 FRACTAL GEOMETRY
law then the exponent of the power law is a characteristic of the coast.
Mandelbrot proposed interpreting the value of the exponent of the power
law as an indicator of the dimension of the coast.
This 'divider dimension' is only one of many dimensions that may be
used to characterize a fractal. In general the process of measuring the
dimension of a shape proceeds as follows. One forms an approximation of
the shape such that all the detail below some length is obscured (in this chapter
this length will be called the resolution). The coastline example above used
the dividers set at a particular spacing to form an approximation of the
coastline which obscured all detail below the divider spacing. The coastline
can be approximated by joining the points where the dividers cross the coast.
The next step is to establish how much information is required to specify
the location of a point in the shape to within the resolution. In the case of
the coastline, the amount of information required to specify the pair of line
segment ends that straddle a point on the coastline is used. This is the
logarithm of the number of segments. The amount of information is then
plotted against the logarithm of the resolution. If the resulting graph is a
straight line then, for the purposes of this chapter, the shape is a fractal.
The dimension of the shape is the negative of the gradient of the line.
When measuring the dimension of a distribution such as the population
distribution it is more suitable to partition the plane into squares of a given
size and count the number of people living in each square in order to form
the approximation of the actual distribution at a particular resolution. In
this case the size of the squares is the resolution. The next concern is the
amount of information required to determine in which square a particular
member of the population (selected at random) lives. There are many different
information measures that could be used but these can all be shown to be
special cases of the generalized information given by:
Iq = [1/(1 - q)] log Σi pi^q ... (3.1)

where pi is the probability that a randomly selected member of the population lives in square i.
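As a sketch of how the generalized information of equation (3.1) is computed in practice, the code below box-counts a synthetic fractal point set (a chaos-game Sierpinski triangle, whose exact dimension log 3/log 2 ≈ 1.585 is known); all function names and parameter values are illustrative:

```python
import math
import random

# Box-counting sketch of the generalized information of equation (3.1),
# tried on a synthetic fractal with a known dimension.

def sierpinski_points(n, seed=1):
    """Generate points on the Sierpinski triangle by the chaos game."""
    random.seed(seed)
    verts = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]
    x, y = 0.1, 0.1
    pts = []
    for _ in range(n):
        vx, vy = random.choice(verts)
        x, y = (x + vx) / 2, (y + vy) / 2
        pts.append((x, y))
    return pts[100:]                      # discard the transient

def generalized_information(pts, box, q):
    """I_q at a given resolution (box size); q = 0 counts occupied boxes."""
    counts = {}
    for x, y in pts:
        key = (int(x / box), int(y / box))
        counts[key] = counts.get(key, 0) + 1
    n = len(pts)
    probs = [c / n for c in counts.values()]
    if q == 1:                            # Shannon information (the q -> 1 limit)
        return -sum(p * math.log(p) for p in probs)
    return math.log(sum(p ** q for p in probs)) / (1 - q)

pts = sierpinski_points(60_000)
boxes = [2 ** -k for k in range(2, 7)]
info = [generalized_information(pts, b, q=0) for b in boxes]
# the dimension is the negative gradient of information against log(resolution)
d0 = (info[-1] - info[0]) / (math.log(boxes[0]) - math.log(boxes[-1]))
print(f"estimated D0 = {d0:.2f}")
```

The estimated D0 comes out close to log 3/log 2; grid misalignment and the finite sample shift it a little, which is exactly the kind of deviation from a straight line discussed later for the census data.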
3.3 FRACTAL GEOGRAPHY
the town and yet communication with the people in the town takes place
through the town's perimeter. This may explain the dendritic town
morphologies noted by Fotheringham, Batty and Longley [12].
In a series of papers Longley, Batty and co-workers [6, 12-15] investigated
the use of fractals for modelling urban morphology. They tried a number
of fractal-generating algorithms with the primary interest of discovering
whether simple algorithms could explain the complex shapes exhibited by
urban population distributions. Two algorithms of particular note are
diffusion limited aggregation (DLA) and the dielectric breakdown model.
A DLA cluster begins with a seed particle. A second particle is allowed
to randomly walk on a lattice until it collides with the seed particle (collision
meaning that it occupies a neighbouring lattice site) or until it wanders beyond
some limit whereupon it is discarded. Another particle is then released and
the process continues; particles either stick to the growing cluster or wander
beyond the given limit. Figure 3.1 shows a DLA cluster.
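The DLA procedure just described can be sketched in a few lines. The launch and kill radii and the particle count below are kept small so the sketch runs quickly; published studies grow far larger clusters with more carefully chosen boundaries:

```python
import math
import random

# Minimal diffusion-limited aggregation on a square lattice: walkers
# either stick to the growing cluster or are discarded once they
# wander beyond the kill radius, as described in the text.

def grow_dla(n_launches, kill_radius=40, seed=7):
    random.seed(seed)
    cluster = {(0, 0)}                       # seed particle at the origin
    max_r = 1
    nbrs = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    for _ in range(n_launches):
        # launch each walker from a circle just outside the cluster
        r = max_r + 3
        ang = random.uniform(0.0, 2.0 * math.pi)
        x, y = int(r * math.cos(ang)), int(r * math.sin(ang))
        while True:
            dx, dy = random.choice(nbrs)
            x, y = x + dx, y + dy
            if x * x + y * y > kill_radius ** 2:     # wandered too far:
                break                                # discard the walker
            if any((x + ex, y + ey) in cluster for ex, ey in nbrs):
                cluster.add((x, y))                  # sticks to the cluster
                max_r = max(max_r, int(math.hypot(x, y)) + 1)
                break
    return cluster

cluster = grow_dla(300)
print(len(cluster), "particles in the cluster")
```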
Clusters constructed in this way have no characteristic length. It is not
at all obvious why this should be the case since the lattice upon which the
cluster is built clearly has a characteristic length. The scale invariance seems
to be due to the way that the growing arms of the cluster screen the inner
perimeter sites from the diffusing particles.
DLA was originally proposed as a model for a number of natural processes
[16]. There have been many papers published which study different aspects
of DLA and similar growth processes. The sources that are of most relevance
to the current work are those that discuss the occupancy probability
distribution [17-24] and the various simple algorithms that produce complex
dendritic structures [25-27].
Measurements of the fractal dimension of the DLA clusters reveal that
the D0 dimension is approximately 1.7. In comparison Batty, Longley and
Fotheringham [6] found the dimension of Taunton in Somerset to be between
1.6 and 1.7.
The DLA process in its original form has no parameters to adjust and
so cannot be fitted to actual data on the shape of towns. For example, a
method of adjusting the fractal dimension to fit that actually measured would
be beneficial. The dielectric breakdown model (DBM) has such a parameter
that can be adjusted.
DBM is closely related to DLA since both processes are governed by
Laplace's equation:
∇²φ = 0 ... (3.2)
In both DLA and DBM this is solved in two dimensions with the
appropriate boundary conditions which assume that the growing cluster is
Fig. 3.1 A DLA cluster.

pi ∝ |∇φi|^η ... (3.3)
3.4
The q dimensions have been measured for cities in the United States and
Great Britain [31]. Figures 3.2 and 3.3 show generalized information as a
function of resolution, for a range of values of q, for the United States and
for Great Britain respectively. Natural logarithms have been used to calculate
the information.
Fig. 3.2 Generalized information against resolution for the United States.
Fig. 3.3 Generalized information against resolution (km) for Great Britain.
Figure 3.5 shows information plotted against resolution for the United
States, for a range of values of q, when all cities with a population below
50 000 inhabitants have been neglected. The graph shows that the quality
and extent of the linear region have been reduced. It is worth noting, though,
that the gradient of the linear region has only decreased slightly compared
with the graph in Fig. 3.2.
Fig. 3.4 Generalized information against resolution (km).

Fig. 3.5 Generalized information against resolution (km) for the United States, with cities below 50 000 inhabitants neglected.
The effect of replacing cities with points will be most pronounced for
larger q values and smaller resolutions. For example, Fig. 3.4 shows a
considerable variation in information for small resolutions.
The values of Dq for the United States and Great Britain are presented in
Tables 3.1 and 3.2.
Table 3.1 Generalized dimensions Dq for the United States.

q      Dq
0.0    1.58 ± 0.05
0.5    1.52 ± 0.03
1.0    1.46 ± 0.05
1.5    1.36 ± 0.05
2.0    1.26 ± 0.05
2.5    1.17 ± 0.1
3.0    1.11 ± 0.1

Table 3.2 Generalized dimensions Dq for Great Britain.

q      Dq
0.0    1.55 ± 0.05
0.5    1.53 ± 0.05
1.0    1.49 ± 0.05
1.5    1.45 ± 0.05
2.0    1.29 ± 0.1
2.5    1.16 ± 0.1
3.0    1.10 ± 0.1
3.5 LARGE GRAPHS
What constitutes a large graph will depend on the problem at hand. In general,
though, a graph is large when no deterministic technique is able to provide
a solution and one needs to resort to stochastic or approximate methods.
If it is true that the morphology of towns is determined to a large extent
by a volume/surface relationship, then there is likely to be a clear structure
to the communication that takes place between people as a function of their
locations since the freedom of communication is, in part, measured by the
surface of the town. There is some justification, therefore, in studying the
relationship between fractally distributed populations and the networks that
are required to interconnect that population. This has led to some work on
the concept of 'fractal graphs'.
3.5.1
Fractal graphs
The name 'fractal graph' was used by Bedrosian and co-workers [32-34]
to describe networks used to interconnect large populations that are
distributed in a fractal manner. The population distributions studied initially
were simple power-law distributions about a single population centre (as has
been suggested is the case for a town). The population was interconnected
by a minimum spanning tree where the cost of each link in the network was
simply equal to its length. Various statistics were then calculated for the
resulting network. In later work the populations were extended to multi-foci
populations based on DLA.
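A sketch of this construction: a population scattered with power-law density about a single centre, interconnected by a minimum spanning tree whose link cost is simply its length (Prim's algorithm here); the density exponent and population size are illustrative choices:

```python
import math
import random

# 'Fractal graph' sketch: single-focus power-law population joined by a
# minimum spanning tree with link cost equal to link length.

def power_law_population(n, exponent=2.0, seed=4):
    random.seed(seed)
    pts = []
    for _ in range(n):
        r = random.random() ** exponent        # density concentrated at the centre
        a = random.uniform(0.0, 2.0 * math.pi)
        pts.append((r * math.cos(a), r * math.sin(a)))
    return pts

def mst_length(points):
    """Total edge length of the minimum spanning tree (Prim's algorithm)."""
    n = len(points)
    in_tree = [False] * n
    best = [float("inf")] * n
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total

pop = power_law_population(200)
print(f"total link length of the MST: {mst_length(pop):.2f}")
```

Statistics of the resulting tree can then be gathered; as the text goes on to note, taking link length as the only cost ignores node costs and capacity-dependent link costs, which is why the networks produced this way are unrealistic.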
The single focus population distributions produced in this work may be
justified by early work on population distributions which used a power law
model of population density centred on a single point. The work of Batty
and Longley [13] provides some justification of the multi-foci population
distributions based on DLA. The networks produced are not at all realistic
though. The reason is that the total length of the links is not the only factor
determining the cost of the network. There are many other factors which
need to be included such as the costs of the nodes and the fact that the cost
of any network component is dependent on its capacity.
3.5.2
consists of all of those occupied links which are interconnected. Again there
exists a critical probability and a series of interrelated exponents.
The connection between percolation theory and the study of networks
arises when one wishes to calculate the conductivity of a substance in which
the bonds may assume different impedance values with different probabilities.
One technique which has been successfully used to analyse systems near
critical points is that of the renormalization group [36]. Renormalization
group techniques have been used to predict the critical probabilities of a
number of percolation systems and have been used to estimate their critical
exponents [35]. The renormalization group is an approximation technique
whereby one iteratively replaces the lattice at one scale with the lattice at
a coarser scale. Take, for example, site percolation on a square, two-dimensional lattice with lattice parameter b and with a probability that any
site is occupied of p, and then approximate this lattice by a lattice with a
larger lattice parameter, say 2b (see Fig. 3.6); then each point on the coarser
lattice will represent four points on the finer lattice. Since, in this case, interest
lies in the clusters of connected occupied sites, the collection of four sites
is replaced with an occupied site only if the four points on the finer lattice
are connected. The probability that a group of four adjacent sites on the
finer grid is connected can be estimated and, as a result, the probability that
the group is replaced by an occupied site can also be estimated. If this
probability is smaller than p then at each renormalization the occupation
probability will decrease and therefore the correlation length relative to the
renormalized lattice parameter will also decrease (correlation length increases
monotonically with p). The critical probability will be that which is invariant
under renormalization.
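Finding the invariant probability can be illustrated numerically. The sketch below uses one common textbook b = 2 replacement rule (an assumption here, not necessarily the chapter's), in which the coarse site is occupied when the 2 x 2 block contains a horizontally spanning configuration, giving p' = p^4 + 4p^3(1 - p) + 2p^2(1 - p)^2 = p^2(2 - p^2):

```python
# Renormalization sketch for site percolation on a square lattice with
# a 2 x 2 cell (b -> 2b), under an assumed horizontal-spanning rule.

def renormalize(p):
    """Occupation probability after one b -> 2b renormalization step."""
    return p ** 2 * (2 - p ** 2)

def critical_probability(lo=0.01, hi=0.99, iters=60):
    """Bisect for the nontrivial fixed point renormalize(p*) = p*."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if renormalize(mid) < mid:   # occupation shrinks: below threshold
            lo = mid
        else:                        # occupation grows: above threshold
            hi = mid
    return (lo + hi) / 2

p_star = critical_probability()
print(f"invariant occupation probability p* = {p_star:.4f}")
```

The fixed point is (sqrt(5) - 1)/2 ≈ 0.618, somewhat above the accepted site-percolation threshold of ≈ 0.593 for this lattice; a small renormalization cell only estimates the critical probability, which is why the text describes the renormalization group as an approximation technique.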
To understand the conductivity of a bond-percolation process one needs
to study the 'backbone' of the incipient cluster since the links which pass
no current will not contribute to the conductivity even if they are part of
the incipient cluster. Orbach [37] and Aharony et al [38] have studied the
dynamics of percolation clusters at the critical point (the latter workers in
terms of eigen-dimensions of fractal transfer matrices discussed in section
3.5.3).
Renormalization techniques with two parameters have also been used to
estimate the dimension of diffusion-limited aggregation clusters [17].
Site percolation has a particular relevance to the design of networks that
consist of a number of nodes interconnected by links. In this case, site
percolation can be used as a simple model for the propagation of states from
one node to another across the network. For example, if there is a certain
probability that a node will become overloaded then percolation theory may
be useful in determining the statistics of the regions of overloaded nodes.
Fig. 3.6 Renormalization of a square lattice with lattice parameter b.
3.5.3

3.6 A simple example

Fig. 3.7

Fig. 3.8
37
be placed at the same location with respect to the half-size triangles as the
single node was with respect to the full-size triangle. Each of the smaller
triangles has one-half of the linear dimension of the larger triangle and one
third of the population. The total length of cable used in each of the three
smaller triangles will then be one sixth of that used when there was just a
single node. There are three smaller triangles so the total length of cable is
one half of that in the single node case.
In this example, the capacities of the links connecting the users to the
first layer nodes do not change as the number of nodes is increased. With
the assumption that the cost of a cable of given capacity is proportional to
its length, it is clear that the cost of the cable is reduced by one half when
the number of nodes is increased by a factor of three. The cable cost as a
function of the number of nodes is then a power law whose exponent is
log(2)/log(3). This is the reciprocal of the Hausdorff dimension of the
Sierpinski triangle [1].
As far as the cost of the nodes is concerned, the symmetry of the
distribution (at least in this example) means that each node will serve the
same fraction of the population and therefore will be expected to have the
same cost. The total cost of the nodes in a network layer will be proportional
to the number of nodes in that layer.
Knowing both the total cost of nodes and the total cable cost as a function
of the number of nodes means that one can choose the number of nodes
that minimizes the cost of the first layer in the network. This process can
be repeated for subsequent layers by treating the first layer nodes as the
population to be served by successive layers.
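The trade-off described above can be written down directly: cable cost falls as n^(-log 2/log 3) while node cost rises linearly with n, so the cheapest layer size is where the two terms balance. The two cost coefficients below are illustrative, not taken from the chapter:

```python
import math

# Cost trade-off from the triangle example: tripling the number of
# first-layer nodes halves the total cable length.

ALPHA = math.log(2) / math.log(3)          # ~0.631, reciprocal of log3/log2

def layer_cost(n_nodes, node_cost=1.0, cable_cost=100.0):
    """Total cost of one network layer with n_nodes first-layer nodes."""
    return node_cost * n_nodes + cable_cost * n_nodes ** -ALPHA

# scaling check: tripling the node count should halve the cable term
print(3 ** -ALPHA)                         # one half, up to rounding

best = min(range(1, 200), key=layer_cost)
print("cheapest number of first-layer nodes:", best)
```

Repeating this minimization layer by layer, with each layer's nodes as the next layer's population, is exactly the procedure the text describes.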
3.6.2
So far the generalized q dimensions have been reviewed, and the connection
between the fractal properties of a population distribution and the cost of
a network to interconnect that population has been indicated. In this section
the actual population distributions of the United States and Great Britain
obtained from census data will be used to see if the relations that were
suggested in the previous section hold in realistic cases.
In the previous section the total link length was predicted to be a power
law function of the number of nodes, and the index of the power law to be the inverse
of the fractal dimension of the distribution. This function can be inverted
so that the number of nodes is a function of the total link length. In this
case, the logarithm of the number of nodes in the first layer of the network
will be equal to I_q when q = 0. This can be extended further to calculate I_q
for other values of q as a function of total link length. In this case the total
link length plays the role of the resolution in the calculation of the generalized
q dimensions.
Figures 3.9 and 3.11 show the total link length plotted against the number
of nodes for the United States and Great Britain respectively. Figures
3.10 and 3.12 show the generalized information plotted against the total link
length. The linear regions in the graphs are not as well defined as those
10 12
E
.c
i5>
c:
..9!
<Il
:0
CIl
0
.Q
109
10
100
number of nodes
Fig. 3.9
6
c:
.2
iii 4
E
(;
~ 3
o"'---
-':-::--
Fig. 3.10
40
FRACTAL POPULAnONS
number of nodes
Fig. 3.11
.2
iii
.E.!;;
109
total cable length. km
Fig. 3.12
41
3.6.3
The previous section used the k-means algorithm to show that the moments
of cable length do indeed scale in the same way as in the Sierpinski triangle
example. The next logical step would be to ask whether the generalized
dimensions can be related directly to the cost of a layer of a distribution
network.
To cope with the most general case of fractal population distribution (i.e.
where the distribution has well-defined but arbitrary generalized dimensions),
some assumptions need to be made regarding the way that the costs of the
network components depend on their parameters.
For network nodes it must be assumed that the cost of a node is a function
of the population served by that node only. The cost model for the node
is then represented by a weighted sum of power-law functions whose
exponents take on the same values of q as the q dimensions, D_q. In practice
this means that q should be between 0 and 3 since the dimensions are not
well defined outside of this range. This allows the possibility, say, that the
node cost function could be the sum of a constant value and a value that
is proportional to the number of people served by the node.
For links in the network, again it must be assumed that the cost of a link
is a function of the population served by the link, and that the cost of a link
is proportional to the length of the link.
With these assumptions regarding the cost models of the network
components, it is possible to expand the costs of each type of network
component in terms of the generalized entropy, and (with suitable
renormalization) as a function of the q dimensions. This means that
characterization of a population distribution in terms of its q dimensions
is enough to estimate the cost of each layer of a distribution network.
One benefit of deriving an equation to estimate the network cost is that
it can be differentiated and optimized with respect to parameters in the
equation. For example, if it is assumed that the catchment areas of the first
layer nodes all have the same diameters, then the optimum diameter can be
found very easily.
This connection between the generalized q dimensions and the cost of
a distribution network is reported in Appleby [42].
3.7
CONCLUSIONS
This chapter reviews work on three main topics - the fractal structure of
the spatial distribution of the human population, the analysis of scale-invariant graphs and the use of the fractal nature of the population to estimate
the cost of a distribution network to serve the population.
The fact that there is such a strong, easy to characterize structure in the
spatial distribution of the population should simplify the task of making
design decisions regarding networks on a national scale. Such a structure could
also be used for issuing planning guidelines which are designed to minimize
the capital cost of a national network.
The work on scale-invariant graphs has introduced the concept of critical
phenomena to networking. This means that for large networks, small changes
in a network parameter may have very dramatic consequences, possibly
resulting in end-to-end blocking.
Finally, it has been shown that the fractal dimensions of the population
distribution impinge directly on the cost and hence the design of a distribution
network.
REFERENCES
1.
2.
3.
4.
5.
Batty M: 'Cities as fractals: simulating growth and form', in 'Fractals and Chaos',
pp 43-49, Springer-Verlag (1991).
6.
7.
8.
9.
4

INTERNAL MARKETS

I Adjali, J L Fernandez-Villacanas Martin and M A Gell
4.1
INTRODUCTION
Fig. 4.1 [number of operators plotted against year]
Monopoly
Monopoly service provision
Business oriented
Stable pricing regime
One network
The emergence of the CIP will bring enormous changes not only to the
ways in which a communications company carries out its business to win
and keep its customers but also to the basic philosophy underlying network
engineering. Having to interconnect and interwork with large numbers of
competitive networks may undermine the traditional philosophy of the
Central Office which has until now dominated telecommunications. It is
unlikely that the 'Central Office' philosophy will be able to cope with the
operation of global networks and their myriad interoperations with networks
Fig. 4.2 [the PUP's homogeneous network contrasted with the CIP's heterogeneous network and its competitive access providers]
4.2
THE MODEL
The computational system is formulated as a one-step Markov process: the master equation for the probability P(n,t) contains transition terms of the form [n_j p_i(n')P(n',t) − n_j p_i(n)P(n,t)], and expanding it yields a deterministic equation for the mean <f> together with equations for the fluctuations <f²>, with coefficients a_1 and a_2.

Fig. 4.3
Fig. 4.4 [schematic of organisms/systems and their pay-offs: the macroscopic layer records how well organisms perform]

Fig. 4.5
the reinvestment of the profit, etc), the total pay-off would be the composition
of those for each individual system parameter identified in the independent
set. It should be noted that, as in its biological counterpart, the system bases
are common to all system parameters, in the same way as amino acids are
always coded by combinations of three of the same four bases.
Depending on the content of the relaxation box, when a system parameter
decides to change its appreciation value (this time can vary for the different
parameters), there is said to be a pay-off mutation. The content of a random
number of boxes changes by the amount indicated in the greed box. This
mutation mechanism turns into a competitive process as systems try to
improve their pay-offs. In an environment with a large number of systems,
these learn to adopt evolutionary strategies that defeat (or suppress) the
weakest with lowest pay-offs. Social behaviours, such as parasitism, flock
formation, antiparasitism seen in biological systems, will emerge in these new
telecommunications, computational and market systems as players learn how
to change their attributes to dominate the market.
If a system is devised in which resources provide several services with
different pay-offs to a pool of randomly distributed agents, these competition
mechanisms would help the resources to become more attractive to the agents.
The pay-off pool may be finite, i.e. resources cannot increase their pay-offs
permanently if there are no resources that are lowering theirs; this represents
a dynamic conservation process. The size of this pool can be fixed initially
or can be left to expand or contract. In a computational model the limits
would be imposed by the initial size of allocated memory and the processor
speed as resources would compete for space and CPU time. The agent-resource system is open-ended when resources evaluate their pay-offs
depending on the model constraints instead of being fixed beforehand. There
is therefore a perpetual force, the pay-off mutations, pushing the system out
of local minima in the search for improved solutions.
So far consideration has been given to the resources changing their pay-offs, competing and therefore communicating with other resources. But the
agents can also communicate with one another. Inter-agent communication
is achieved through the introduction of an alternative set of agent boxes that
allow for individual characteristics for each agent. Some of these boxes are
templates (e.g. 4 boxes with binary 1s or 0s code for 16 different templates)
that can be compared with other agents' templates. This comparison process
enables agents to construct plans for market tactics and strategies.
When the number of resources is low, the diversity is not high enough
for the resources to learn from evolution; that is why learning must come
from looking at the system and analysing the competitive behaviour of a few
strategies that have been previously introduced. An example along these lines
will be discussed in section 4.3.2.
Fig. 4.6
Schematic diagram of a decentralized agent/resource system made of computational
agents sharing two resources. The system's behaviour is determined by the agents' evaluation
of the pay-off corresponding to each resource.
4.3
to study the case where resources (as well as agents) compete within the
market environment - as this requires that the resources change and
adapt their pay-offs to the current market situation, time-dependent
solutions with changing pay-offs would, therefore, be sought.
4.3.1

Stationary solution
Numerical values for the input parameters (coming into the transition
probability function p) will be taken from Kephart et al [10] to provide
comparison with some previously published (but restrictive) results. In a
simple case, p can be made a function of the fractional number of agents
f using resource 1, through the pay-offs G1 and G2 for using resources 1 and
2 respectively:
... (4.1)
Figure 4.7 shows the pay-offs G1 and G2 as a function of f. They model
a simple competitive behaviour (opposing gradients) between agents so that
the pay-off for using each resource decreases with the number of agents
already using the same resource.
An agent will therefore choose to switch to the other resource if its pay-off is larger. The system reaches a stability point when the two pay-offs are
equal, so agents will prefer staying with the resource they are using. For G1
and G2 given in equation (4.1), this optimal behaviour of the system occurs
for f = 0.75, i.e. 75% of all agents are using resource 1. The decision region
can be made less sharply defined by introducing an uncertainty element in
the pay-off evaluation of agents. This can be achieved by introducing
Fig. 4.7 [pay-offs G1 and G2 plotted against f]
Gaussian noise with standard deviation σ around the true value of the pay-off. The resulting transition probability p is given by:
p = ½[1 + erf((G1 − G2)/(2σ))]    ... (4.2)
and shown in Fig. 4.8 for a value σ = 0.125. The two limiting cases of σ = 0
and σ = ∞ correspond respectively to perfect knowledge (f = 0.75) and
complete lack of information on pay-offs, leading to the uniform distribution
of agents (f = 0.5).
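Equation (4.2) is straightforward to evaluate directly. In this sketch the linear pay-offs are hypothetical stand-ins with opposing gradients, chosen so that G1 = G2 at f = 0.75; the actual coefficients, taken from Kephart et al [10], are not reproduced in the text:

```python
import math

# Hypothetical linear pay-offs with opposing gradients, equal at f = 0.75.
def g1(f):
    return 4.0 - 2.0 * f   # resource 1: less attractive as f grows

def g2(f):
    return 1.0 + 2.0 * f   # resource 2: less attractive as 1 - f grows

def p_switch(f, sigma):
    """Transition probability of equation (4.2) under Gaussian pay-off noise."""
    return 0.5 * (1.0 + math.erf((g1(f) - g2(f)) / (2.0 * sigma)))

print(p_switch(0.75, 0.125))  # pay-offs equal: p = 0.5
print(p_switch(0.50, 0.125))  # resource 1 under-used: p close to 1
print(p_switch(0.50, 1e9))    # near-total uncertainty: p close to 0.5
```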
Fig. 4.8 [probability of choosing resource 1 plotted against f, for σ = 0.125]
Fig. 4.9
fact was noticed by Kephart et al [10] and used in systems with delayed
information to reduce the effects of persistent oscillations and chaos, which
are manifestations of nonlinearities in the fluctuations.
The following conclusions can be made:
the first order nonlinear corrections are sufficient for correctly estimating
fluctuation effects in the system, especially if the uncertainty parameter
is not too small;
Fig. 4.10    Effect of different σ2 for each resource (σ2 = 0.04, 0.24, 0.54 and 0.98) for a system with cubic pay-offs (leading to a bistable system), with uncertainty σ1.
4.3.2
In this model notation, and for this particular example, the pay-offs
consist of only one system parameter, f, that is expressed in terms of five
system bases (linear, constant, random, relaxation and greed), of which
only the first two can mutate while the random, relaxation and greed
components are fixed.
Starting from an initial distribution f, which will depend on σ1 and σ2,
the less-dominant resource (G2) mutates (increases) its slope and intercept
proportionally to Δf = f − 0.5. In order to constrain the system it is assumed
that these increases are equally matched by the decreases in slope and intercept
for resource 1:
G2 = (c + Δc)f + (d + Δd)

and

G1 = (a − Δc)f + (b − Δd)    ... (4.5)

where:

Δc = γΔf + δ

and

Δd = αΔf + β    ... (4.6)
Δc and Δd have two contributions each - one depending on how badly they are losing
to the competing resource (sensitivity) and another random component
introducing noise (using here, for simplicity, α = γ and δ = β).
The competing process goes as follows. The deterministic equation gives
an initial distribution f that allows the resources to calculate how much they
have to mutate their pay-offs to become more attractive to the agents. The
new pay-offs are re-introduced, together with σ1 and σ2, to calculate the new
probability p, which is simultaneously used to solve the deterministic equation
and the fluctuation equations, thereby giving the market share value f as a
function of time and the evolving pay-offs.
As in biological systems, the rate at which mutations happen is
fundamental in achieving evolutionary improvement. In this simple case it
has been observed that after a mutation, irrespective of the initial
configuration, equilibrium is always reached after 100 units of time. This
is the relaxation time that must be introduced in order to take full advantage
of all the mutations; changes would otherwise happen too quickly for the
system to adapt and gain benefit.
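The mutation cycle of equations (4.5) and (4.6) can be sketched as below, with hypothetical pay-off coefficients, the noise terms δ and β set to zero for clarity, and the equilibrium share computed directly from G1(f) = G2(f) rather than by integrating the dynamical equations:

```python
# Linear pay-offs G1 = a*f + b and G2 = c*f + d; the equilibrium market
# share f of resource 1 solves G1(f) = G2(f), i.e. f = (d - b)/(a - c).
def equilibrium(a, b, c, d):
    return (d - b) / (a - c)

a, b = -2.0, 4.0     # resource 1 (hypothetical coefficients)
c, d = 2.0, 1.0      # resource 2
gamma = alpha = 0.1  # sensitivity terms, taking alpha = gamma as in the text
history = []
for epoch in range(6):
    f = equilibrium(a, b, c, d)
    history.append(f)
    df = f - 0.5
    dc, dd = gamma * df, alpha * df   # noise terms delta and beta set to 0
    c, d = c + dc, d + dd             # losing resource mutates up (4.5)
    a, b = a - dc, b - dd             # matched decrease for the other
print(history)  # starts at 0.75 and relaxes towards the equal share f = 0.5
```

With the random noise restored, and df free to change sign once f crosses 0.5, the same update rule produces the alternating 'tit-for-tat' behaviour reported for Fig. 4.11.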
Figure 4.11 shows the evolution of the market share of resource 1 in a
system with two resources and pay-offs, as in equation (4.1), taking
σ1 = 0.14, σ2 = 0.40 and a step-size Δt = 0.01. The evolving pay-offs were
updated every 100 time units, and calculated with a sensitivity parameter
α = 0.1 and a noise parameter, β, with random values in the interval (0,1).
The simulation starts with resource 1 having a market share of 26% and
no initial fluctuations. At t = 100, as the system is nearly at its equilibrium
point (f ≈ 0.73), the first update of the pay-off values is introduced according
Fig. 4.11 [fraction of agents using resource 1 plotted against time]
to the prescription in equation (4.5). This has the effect of changing the
equilibrium distribution in favour of resource 2; the market share f of resource
1 now decreases towards the value of 0.58, which it reaches fairly quickly
before the next pay-off update (at t = 200) takes place. This second update
again reduces the equilibrium distribution, to the value of f = 0.49, resulting
in resource 1 having a slightly smaller market share. At the next update
(t = 300), resource 1 fights back and pulls the equilibrium value to f = 0.57.
A general 'tit-for-tat' behaviour is observed where the actions of the two
resources are equally matched (fighting with the same intensity). From the
long-time behaviour of the system observed in Fig. 4.11, it is further
concluded that, after a given number of iterations, the system will settle into
an equal distribution of agents over the two resources. It is interesting to
observe that fluctuations do not affect the evolution of the system drastically.
They seem, however, to introduce a short oscillation in the market share
before the system settles into the equilibrium distribution. Stochastic effects
may have a more important impact if nonlinear pay-offs are used. The
inclusion of more resources is also likely to remove the symmetrical 'tit-for-tat' behaviour observed here and may lead to more complex strategies in the
competition between resources. These issues are being investigated and will
be reported elsewhere [11].
4.4

CONCLUSIONS
among different systems arises when mutations are introduced at the system
bases' level. New perception values arise in the form of compositions
of system parameters (genes) that are the result of random combinations of
mutated system bases.
In order to test the time-independent approximation in the case of this
agent/resource system, a system with two resources was used and pay-off
functions associated with the two resources were considered in order to model
a simple competitive strategy between agents, with an uncertainty parameter
monitoring the accuracy of information available. Sensitivity to accuracy of
the information available to agents was also studied; the main observation
is that higher uncertainty leads to the suppressing of nonlinear noise effects.
This is compatible with the conclusion in Kephart et al [10] that an increase
in the uncertainty parameter lowers the threshold for persistent oscillations
and chaos in systems with time delay, since these non-optimal behaviours
are the result of nonlinearities taking over in the dynamical equations. It has
also been shown how the one-step Markov formulation enables the exact time-independent distribution to be found in the case of a bistable system, which
results from nonlinear pay-off functions.
Also a simplified time-evolution scenario has been modelled by making
two resources with linear pay-offs compete for the agents, with two mutating
system bases (linear and constant) coding for only one gene f. After an initial
period of instability the system adopts a 'tit-for-tat' cycle where one resource
dominates the other, only to give way to the competing one after a fixed
relaxation interval. Time-dependent solutions, in complex systems with pay-offs that result from the combination of several system parameters each with
up to eight mutating bases, have been studied in a separate work [11].
As indicated in the introduction, the emergence of markets inside
communications systems will lead to major changes in the ways in which
communications businesses are run. Increasing levels of competition will lead
to a speeding up of all business operations, including price setting for
communications services. Ability to unravel, understand and probe the myriad
of ultrafast business cycles, rhythms and discontinuities erupting inside the
global communications network will provide the intelligent operator with a
novel source of competitive edge. Competitive real-time pricing for non-free
services, coupled with strategic planning and business operations, may emerge
as an important component of automated network intelligence. The work
reported here represents an initial step towards developing new tools which
will enable us to meet some of the challenges associated with the increasingly
competitive, turbulent and volatile communications business.
REFERENCES
1.
2.
3.
4.
5.
6.
7.
8.
Van Kampen N G: 'A power series expansion of the master equation', Can J
Phys, 39, p 551 (1961).
9.
5
EVALUATION OF HOPFIELD
SERVICE ASSIGNMENT
M R W Manning and M A Gell
5.1
5.1.1
INTRODUCTION
The assignment problem
With the rapid increase in complexity of telecommunications and computational systems, an urgent requirement is the development of techniques
for dealing with difficult optimization problems, many of which are associated
with the assignment of tasks to resources. Typical application areas are:
where α_ij are the gain terms such that 0 ≤ α_ij ≤ 1 and T_i is the time estimated
by the task's agent that it would take for an ideally suited resource (α_ij = 1)
to carry out the task. An estimate of the actual execution time, if task i is
assigned to resource j, is taken to be T_i/α_ij in the present model, so that time
increases as the resource becomes more unsuitable.
If it is assumed that tasks can be dealt with individually, then the
assignment problem can be tackled by simply allocating each task to the
resource corresponding to the highest gain term. In large systems, however,
there will be many incoming tasks at any given decision instant and the
problem then becomes one of switching the tasks through to the resources
in an optimum manner at high speed.
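For the individual-task case just described, allocation reduces to a per-task argmax over the gain terms. A minimal sketch with a made-up gain matrix (it ignores the contention between simultaneous tasks that motivates the neural approach):

```python
# Hypothetical gain matrix alpha[i][j] in [0, 1]: suitability of
# resource j for task i, with ideal execution times T_i.
alpha = [
    [0.9, 0.2, 0.4],
    [0.1, 0.8, 0.3],
    [0.5, 0.6, 0.7],
]
T = [10.0, 20.0, 15.0]

# Task-by-task greedy choice: each task takes the highest-gain resource,
# with estimated execution time T_i / alpha_ij.
choices = []
for i, row in enumerate(alpha):
    j = max(range(len(row)), key=row.__getitem__)
    choices.append(j)
    print(f"task {i} -> resource {j}, estimated time {T[i] / row[j]:.1f}")
```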
5.1.2
Background
The Hopfield neural network [3] has been extensively used for solving
optimization problems, such as the travelling salesman problem (TSP) [4],
but also for simpler problems such as its use for switching purposes [5-8].
The switching application is relevant as it is very similar to the assignment
problem, even though it is an easier optimization task because there are no
gain terms involved. The assignment problem, also known as the resource
allocation problem, has been dealt with in the form of the concentrator
assignment problem [9] and the list-matching problem [10]. It has also been
considered in the generalized higher-order case [11, 12].
The present study provides more information regarding the parameters
used and the various considerations in the simulation of the Hopfield net
for considerably larger problem sizes. Another aspect of the present work
is its emphasis on the performance of the Hopfield net at system level, in
particular highlighting its behaviour when operating in an overload condition.
5.1.3
Framework
Studies into the performance of task distribution systems have been made,
and in particular, a comprehensive theoretical framework has recently been
developed for describing processes in service systems [2, 18]. The neural
network can be used within market-based systems [19], where the gain terms
may be considered as a function of cost or price structures. The objective
of the present study is to present the Hopfield net as an implementation
method for distributed task allocation in these and more general systems (Fig.
5.1).
Nevertheless, it is important to understand that the work presented in
this chapter is just a first step towards developing a more sophisticated model,
as the mechanisms controlling the adjustment of the gain terms α_ij are a
critical feature of the envisaged system [18, 19] (see also Chapters 4 and
16). The present work does not deal with this aspect, but instead takes the
α_ij gain terms to be known a priori, assuming a uniform distribution of these
terms for the purposes of the simulations. The hardware-implemented neural
network would therefore perform the assignment operation only as part of
a more comprehensive processing system.
[inputs: tasks, jobs, service-requests, fault identification]
Fig. 5.1
Use of a Hopfield neural net to optimize the allocation of incoming tasks to available
resources. The objective of the neural net is to optimize the gain terms of the chosen task/resource
pairs.
This chapter introduces the Hopfield neural network, then uses the
presented model to solve a task allocation problem by making certain
assumptions about the task allocation process. The relevant factors affecting
the convergence of the neural net are discussed. Simulation results are
presented and, in particular, a range of performances is determined as the
optimization process is forced to become localized instead of being more
global. The merits of the Hopfield net approach are discussed, as well as
the extensions required for a general framework.
5.2
Fig. 5.2
Here the weight W_pq connects the output v_p of the pth neuron to the
input u_q of the qth neuron, while I_q represents an internal bias in the qth
neuron. The summation is over all neurons in the network. The activation
function f has to be some nonlinear monotonically increasing function - a standard continuous-valued sigmoid function is used, with neuron outputs
ranging from 0 to 1:

... (5.3)

where β is the gain factor, which controls the steepness of the sigmoid
function.
Hopfield showed that the net will always converge provided the synaptic
strengths (weights) in the net are symmetric, i.e. if W_pq = W_qp for all
interconnections. The proof is based on the use of an overall energy function
associated with the net [3]:
E = −½ Σ_p Σ_q W_pq v_p v_q − Σ_p I_p v_p + (1/τ) Σ_p ∫_½^{v_p} f⁻¹(v) dv    ... (5.4)

where τ is the neuron time constant, which throughout this work is set to
unity. Note that the lower bound of the integral term in E representing the
internal energy of each neuron is taken to be ½ instead of the zero value
given in Hopfield [3]. This is because the neuron sigmoid function in this
case ranges from 0 to 1, whereas in Hopfield's model it is from −1 to 1. Equation
(5.4) corresponds to the following dynamical equation:
du_p/dt = −u_p/τ + Σ_q W_pq v_q + I_p = −dE/dv_p    ... (5.5)
The energy function will decrease until it reaches one of its minima because
dE/dt ≤ 0 [3] - these equilibrium points are the attractors of the system,
corresponding to solutions of the energy function as determined by the set
of weights W_pq and biases I_q in the network.
5.3
5.3.1
Assumptions
In this study, two basic assumptions are made about the system:
task indivisibility;
resource dedication.
worthwhile. If the assignment was successful, there will be one non-zero entry
in the row corresponding to the input task.
The second assumption is that each processor can take on at most one
task, and, unless interrupted, that processor is dedicated to the task to which
it has been assigned. Each column in the output matrix therefore has at most
one non-zero entry, depending on whether the processor corresponding to
that column has had a task assigned to it or not. This is not a restriction
on the type of processors used in the system, as long as, for any processor
capable of dealing with several tasks, the task allocation controller is made
aware of the number of effective processors, this number being equal to the
number of tasks that the processor could handle at that instant in time. The
possibility of a certain level of control hierarchy is therefore also implied.
5.3.2
Because of the two assumptions made, there will be at most one non-zero
entry in any of the columns or rows of the output matrix, depending on
whether a task was assigned to the corresponding resource or not. In addition
to the minimization of both row and column sums, the purpose of the
controller is to maximize throughput, i.e. to have the maximum number of
non-zero entries or assignments in the output matrix. These constraints can
be summarized by the first three terms in the following energy function that
the controller has to minimize:
E = A/2 Σ_{i=1..M} Σ_{j=1..N} Σ_{l≠j} v_ij v_il + B/2 Σ_{j=1..N} Σ_{i=1..M} Σ_{k≠i} v_ij v_kj
  + C/2 (min(M,N) − Σ_{i=1..M} Σ_{j=1..N} v_ij) + D Σ_{i=1..M} Σ_{j=1..N} v_ij (1 − α_ij)    ... (5.6)
where v_ij is the entry for the ith row and jth column. The first and second
terms correspond to row and column sum minimization respectively, while
the third term results in matrix sum maximization. The fourth term
corresponds to the quantity to be optimized, which in this case is the sum
of chosen gain terms α_ij. The objective is for the neural net to minimize the
sum of cost terms c_ij = (1 − α_ij) whilst still respecting the constraints
imposed by the first three terms.
The expression in equation (5.6) is compared to the energy function for
a Hopfield net in equation (5.4) to obtain the weight and bias terms, the
integral term in equation (5.4) being required in addition to the above function
to ensure that there are two attractors corresponding to a neuron being either
on or off:

W_ij,kl = −A δ_ik (1 − δ_jl) − B δ_jl (1 − δ_ik)

I_ij = C/2 − D(1 − α_ij)    ... (5.7)

where W_ij,kl is the weight between neuron ij and neuron kl, and I_ij is the bias
for neuron ij; δ_ij = 1 if i = j and 0 otherwise. Substituting the weight and bias
terms back into the dynamical equation (5.5) gives:
du_ij/dt = −u_ij − A Σ_{l≠j} v_il − B Σ_{k≠i} v_kj + C/2 − D(1 − α_ij)    ... (5.8)
where the neuron inputs u_ij are initialized to zero. (Small random initialization values could also be used, but this was not found to make any
difference.) Equation (5.8) is the required differential equation to update
the neurons at each iteration, resulting in minimization of the energy function
and therefore leading to the solution of the optimization problem.
Once the neural net has converged using the dynamical equation (5.8),
then only N neurons (or min(N,M) neurons if N ≠ M in the case of a
rectangular array) will remain active, the rest having been turned off. If the
neuron in position (i,j) is active then this is the control for task i to be assigned
to resource j.
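A compact simulation of this procedure, Euler-integrating equation (5.8) with the parameter values quoted later in the chapter (A = B = 1000, C = 25, D = 1, β = 1, Δt = 10⁻²). Since equation (5.3) is not reproduced in this text, a standard 0-to-1 sigmoid v = ½(1 + tanh(βu)) is assumed, and the 4 x 4 gain matrix is invented for the example:

```python
import math

def hopfield_assign(alpha, A=1000.0, B=1000.0, C=25.0, D=1.0,
                    beta=1.0, dt=0.01, steps=3000):
    """Euler-integrate the dynamical equation (5.8) and return the outputs v."""
    M, N = len(alpha), len(alpha[0])
    u = [[0.0] * N for _ in range(M)]
    v = [[0.5] * N for _ in range(M)]
    for _ in range(steps):
        # Synchronous update: compute all derivatives from the current state.
        du = [[(-u[i][j]
                - A * sum(v[i][l] for l in range(N) if l != j)
                - B * sum(v[k][j] for k in range(M) if k != i)
                + C / 2.0 - D * (1.0 - alpha[i][j]))
               for j in range(N)] for i in range(M)]
        for i in range(M):
            for j in range(N):
                u[i][j] += du[i][j] * dt
                # Assumed 0-to-1 sigmoid with gain beta (one standard choice).
                v[i][j] = 0.5 * (1.0 + math.tanh(beta * u[i][j]))
    return v

# Invented 4 x 4 gain matrix with a clearly best resource for each task.
alpha = [[0.9, 0.1, 0.2, 0.1],
         [0.2, 0.8, 0.1, 0.1],
         [0.1, 0.2, 0.9, 0.1],
         [0.1, 0.1, 0.2, 0.8]]
v = hopfield_assign(alpha)
assignment = [max(range(4), key=lambda j: v[i][j]) for i in range(4)]
print(assignment)
```

After convergence the surviving active neurons mark one task/resource pair per row and column; here, with a diagonally dominant gain matrix, each task should pair with its best-gain resource.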
5.3.3
The optimization term for the gain terms α_ij in the energy function (equation
(5.6)) results in the neuron biases being adjusted by the α_ij (equation (5.7)).
An alternative is to follow the switching approach [5] and simply initialize
the neuron inputs u_ij with the values of the gain terms α_ij. Using a simplified
dynamical equation without the D-term (equation (5.8)) will lead to the
neurons with the highest initial values being chosen that still satisfy the constraints given by the A, B and C terms. The method involves centring the
inputs around zero and multiplying them by a factor of, for example, 10²
to improve convergence time [22]. While this approach does present a speed
advantage (approximately twice as fast), it leads to slightly worse results (by
about 1%) and exhibits a higher sensitivity to the choice of the A, B and
C parameters (e.g. up to 3% difference for the two sets given in section 5.4.2).
The C term in the energy function (equation (5.6)) relies on the fact that
the dynamics are such that the summation of v_ij terms always approaches
the min(M,N) term from below, thus guaranteeing that there is no sign-change leading to erroneous operation. For more complex problems such
as the TSP [4], the C term is squared so that only the magnitude is considered
in the energy function. This results in the C/2 expression in the dynamical
equation (5.8) being replaced by C(min(M,N) − Σ_i Σ_j v_ij). As can be
expected, this modification leads to no performance gain for the simple assignment problem.
Tagliarini and Page [9] use a D term which is added to the weights W_ij,kl
rather than the biases I_ij. The D term in this case is more complex, as the
symmetry of the weights matrix needs to be maintained. No advantage was
obtained from this method.
Brandt et al [10] use an additional feedback term F v_ij, where F is a
constant, which is added to the right hand side of the dynamical equation
(5.8). This term originates from a slightly different formulation of the energy
function. Again, the results obtained were no better than for the simpler
dynamical equation (5.8).
5.4
Attractors
To obtain neuron output states v_ij of either zero or one, the final neuron
input u_ij has to be either a large negative or a large positive value. To
guarantee the existence of these two attractors, i.e. asymptotically stable
states, the dynamical equation (5.8) is considered under conditions of
equilibrium, i.e. with du/dt = 0, for the two cases to obtain the following
(see Chapter 10):

v_ij = 0  ⇒  u_ij = −A − B + C/2 − D(1 − α_ij) < 0    ... (5.9)

v_ij = 1  ⇒  u_ij = C/2 − D(1 − α_ij) > 0    ... (5.10)

To satisfy these inequalities for any value of α_ij, equation (5.9) uses the
maximum value α_ij = 1, while in equation (5.10) the minimum value α_ij = 0
is taken. Additionally, it is beneficial (see Chapter 10) to keep the negative
attractor in equation (5.9) larger in magnitude than the positive attractor in equation
(5.10) - as the solution requires far more zeros than ones, a larger negative
attractor can improve the operation of the network. These conditions lead
to the ordering:

A > C/2 > D    ... (5.11)
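These conditions are easy to check numerically for the two parameter sets reported in section 5.4.2, evaluating the worst cases of equations (5.9) and (5.10) at the extreme α_ij values just described:

```python
# Parameter sets quoted in section 5.4.2: (A, B, C, D).
for A, B, C, D in [(1000, 1000, 25, 1), (200, 200, 100, 1)]:
    assert A > C / 2 > D                  # ordering (5.11)
    off = -A - B + C / 2                  # worst case of (5.9), alpha_ij = 1
    on = C / 2 - D                        # worst case of (5.10), alpha_ij = 0
    assert off < 0 < on
    print(A, B, C, D, off, on)
```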
5.4.2

Parameter determination

... (5.12)

It can be seen from this that the parameters β, C and Δt play a similar
role in the evolution. Because of the decay term in the dynamical equation
(5.5), and also the updating process, there remain significant differences
between the way these terms affect the operation of the network.
The performance of the net is significantly affected by changes in the
gain factor β in the activation function of the neurons in equation (5.3). A
large value of β implies that the neuron outputs will nearly always be very
close to zero or to one. Because the neuron states are then close to their
attractors, this makes it impossible to escape from false local minima in which
the net may have become trapped. For this reason, one has to reduce β,
provided that the optimization parameters are made large, although if β is
too small, the net will find it difficult to evaluate correctly the differences
between the gain terms αij, especially for large networks. Extensive testing
over the range of matrix sizes considered (up to 128 x 128) led to the choice
of β = 1.
The step-size Δt used in the updating procedure u_new = u_old + (du/dt)Δt
of the neuron inputs is important. If Δt is too large, errors will result in the
updating. Simulations for a range of matrix sizes gave results close to optimal
from Δt ≈ 5 x 10^-2 onwards, with convergence times increasing rapidly from
Δt ≈ 5 x 10^-4. The chosen operating value of Δt = 10^-2 showed no deterioration compared to smaller step-sizes.
The set of optimization parameters used was A = B = 1000, C = 25 and
D = 1, which performed well over the range of problem sizes. An example
of a set which gave better results than the first set (0.3% closer to optimal)
over the range studied was A = B = 200, C = 100 and D = 1. Whilst the latter
set was fine for a uniform distribution of αij, the performance degraded for
the extreme case of using binary αij for large-size matrices (> 50 x 50) due
to the smaller value of A used. The parameters should work over the whole
range of sizes and ideally for any type of αij distribution. In any case, no
parameters were found which gave more than marginal improvements over
the first set.
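The update scheme just described can be sketched in code. Since the dynamical equation (5.8) is not reproduced in this excerpt, the row/column-penalty form of the dynamics used below is an assumption; the function name and the small gain matrix are illustrative only, while the parameter values are the first set above (A = B = 1000, C = 25, D = 1, β = 1, Δt = 10^-2).

```python
import numpy as np

def hopfield_assign(alpha, A=1000.0, B=1000.0, C=25.0, D=1.0,
                    beta=1.0, dt=1e-2, iters=4000, seed=0):
    """Hopfield-style net for the n x n assignment problem: seek a 0-1
    matrix V with one unit per row and column, favouring high alpha[i, j]."""
    rng = np.random.default_rng(seed)
    n = alpha.shape[0]
    u = 0.02 * (rng.random((n, n)) - 0.5)      # small random initial inputs
    for _ in range(iters):
        v = 0.5 * (1.0 + np.tanh(beta * u))    # sigmoid activation (eq (5.3))
        row = v.sum(axis=1, keepdims=True)     # row occupancy
        col = v.sum(axis=0, keepdims=True)     # column occupancy
        du = (-u                               # decay term
              - A * (row - v)                  # row constraint penalty
              - B * (col - v)                  # column constraint penalty
              + (C / 2.0) * (n - v.sum())      # global drive towards n active units
              - D * (1.0 - alpha))             # optimization term favouring high alpha
        u = u + du * dt                        # u_new = u_old + (du/dt) * dt
    return (v > 0.5).astype(int)

alpha = np.array([[0.9, 0.1, 0.2],
                  [0.2, 0.8, 0.1],
                  [0.1, 0.3, 0.7]])
V = hopfield_assign(alpha)
```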
5.4.3
Performance
The performance of the Hopfield net was evaluated for two types of
distribution of gain terms αij - uniformly distributed and binary. Figure
5.3 gives the performance over matrix sizes from 5 x 5 to 50 x 50, averaged
over 100 trials in each case.

Fig. 5.3  Performance of the Hopfield neural net compared with the optimal solution, showing
(a) average and (b) worst case against problem size. The comparison is for uniformly distributed (-)
and for binary (----) gain terms αij.

The average performance for the uniformly
distributed case is approximately 2% from optimal, gradually improving
with increasing matrix size. As can be expected from the simpler problem,
the extreme case of binary values for αij results in near-optimal solutions.
Because of the random nature of the inputs, small problem sizes suffer
from trials that locally give non-uniform distributions of αij over the matrix,
resulting in reduced quality of solution. The worst-case solutions in
Fig. 5.3(b) reflect the fact that the large problem sizes result in more of a
uniform distribution of αij across the matrix, which the neural net finds
easier to solve.
A different method of generating a uniform distribution of αij across the
matrix gave averages that were of the order of 0.1% (instead of 2%) from
optimal for a 10 x 10 matrix.
The results compare well with published ones, such as an average of 7.5%
from optimal for 5 x 12 matrices [9], or 50% of trials within 3% of optimal
for 7 x 7 matrices [17]. The neuroprocessor architecture in Eberhardt et al
[14] outperforms the general-purpose neural-net structure used here, giving
worst case results of 0.5% from optimal for matrices up to size 64 x 64.
Difficulties exist with comparisons in general, as there is a level of uncertainty
as to how the input data has been generated.
5.4.4
Considerations
The operation of the neural network relies on the balance of the constraint
terms (A, B and C terms) and of the optimization term (D term) in equation
(5.6). The result is therefore not necessarily improved by simply increasing
the value of D, as the solution may move to the high αij values too fast. The
net converges to a set of chosen neurons with some corresponding to higher
αij, but with the whole set corresponding to an overall lower sum of αij
terms. The goodness of the solution was most sensitive to the value of D,
yet did not improve by increasing D for the given parameter set.
A second possibility for increasing the goodness of the solution is to add
noise into the system [23]. In fact, this is a necessity if binary αij are used,
in order to escape local minima during convergence. Noise of order 10^-2
was found to be most appropriate for this. It was injected into the system
either by addition to the dynamical equation (5.8) or to the gain factor β
in equation (5.3). This noise did not affect the goodness of the solution.
5.5
SIMULATION RESULTS
5.5.1
Basic performance
Fig. 5.4  Averaged utilization and number of lost requests for a 64-resource system against
task arrival rate. Performance is given for three sampling intervals: 0.2 (--), 0.07 (----) and
0.02 (-.-.-.). Optimization becomes global as the sampling interval is sufficiently increased, hence
the greater performance deterioration for small sampling intervals in overload conditions (constant
task arrival rates, constant task times Tj = 1).
5.5.2
Overload conditions
5.5.3
Scalability
The set of curves is similar for other matrix sizes. To demonstrate the
scalability of the application, the highest task arrival rate for which no lost
requests occur is found over a fixed number of runs of 500, using a sampling
interval of 0.2. In Fig. 5.5 this breakpoint is shown to increase linearly with
the number of resources in the system, given an identical distribution of gain
terms aij for all system sizes, which in this case is a uniform distribution
between 0 and 1. Figure 5.5 indicates that the system is well-behaved in that
an increase in the number of resources leads to a linear increase in overall
performance.
In terms of speed the neural net scales fairly well - the 290 iterations
needed for a 10 x 10 matrix increase to about 1000 iterations for a 200 x 200 matrix
for the parameters given in section 5.4.2. It should be noted that these figures
could probably be improved through the fine-tuning of the simulations. A
trade-off between convergence speed and goodness of solution can be made
by an appropriate choice of parameter set, e.g. halving these times for a drop
of a couple of percentage points in performance.
Fig. 5.5  Scalability of application - the maximum arrival rate, allowing no lost requests
over 500 runs, increases linearly with the number of resources (constant task arrival rates,
task times Tj = 1, sampling interval = 0.2).
5.6
The Hopfield neural net has been proposed as an optimization tool for
dynamic task assignment in an environment consisting of different types of
resource. The purpose of the present work has been to determine all
parameters affecting the operation of the Hopfield net, which then led on
to an assessment of the net's performance. Because the task assignment
problem is a general one, the results presented here can also be applied in
other planning or control systems.
Although the Munkres algorithm [13] works optimally and exceeds the
performance of the Hopfield net in the restricted case of linear assignment
only, the neural network has the advantage of flexibility of application. For
more general higher-order cases, the Hopfield net easily outperforms any
conventional algorithms [15]. It has also been successfully adapted for
carrying out many-to-many assignments instead of one-to-one assignments.
An example of this is 2-to-l assignment, if each processor can accept two
tasks rather than just one. This is done by replacing the summation terms
in the dynamical equation (5.8) by sums of product terms. Because of its
inherent parallelism, the neural net can obtain significant speed improvements
over the conventional algorithm if it is implemented in a parallel hardware
structure [16, 17], especially for large-size problems.
It is intended to construct a system of clusters of resources connected
to the network via Hopfield nets acting as controllers. A range of topologies
can be considered, e.g. in Fig. 5.6 a consortium of resource clusters is formed
around a ring network which is accessed in turn via a controller by the external
network, thus allowing a hierarchy of controllers to be set up if necessary.
Applications in such heterogeneous decentralized systems are discussed further
in Chapter 4.
An essential feature of such systems is that a job consisting of many tasks
will require communication between the various tasks. So, in addition to the
execution costs, there will be communications costs incurred, which will be
negligible if the tasks are performed on the same machine, and which, if the
tasks are carried out within the same resource cluster, will be less than for
communications between different clusters. The work described in this chapter has now been
successfully extended to take into account additional constraints, such as these
communications costs, allowing the method to be applied to more general
higher-order problems [15].
Fig. 5.6  Resource clusters connected to the network via Hopfield nets acting as controllers.
5.7
CONCLUSIONS
REFERENCES
1.
2.
3.
4.
5.
Ali M M and Nguyen H T: 'A neural net controller for a high-speed packet
switch', Proc Int Telecommunications Symposium, pp 493-497 (1990).
6.
7.
8.
9.
10. Brandt R D, Wang Y, Laub A J and Mitra S K: 'Alternative networks for solving
the traveling salesman problem and the list-matching problem', Proc of the IEEE
Int Conf on Neural Networks, pp 333-340 (July 1988).
11. Fang L and Li T: 'Neural networks for generalized assignment', Proc of 2nd
IASTED Int Symposium, 'Expert Systems and Neural Networks', pp 78-80
(August 1990).
REFERENCES
83
12. Li T and Fang L: 'A comparative study of competition based and mean field
networks using quadratic assignment', Proc of 2nd IASTED Int Symposium,
'Expert Systems and Neural Networks', pp 81-83 (August 1990).
13. Munkres J: 'Algorithms for assignment and transportation problems', J Soc Ind
Appl Math, 5, pp 32-38 (1957).
14. Eberhardt S P, Daud T, Kerns D A, Brown T X and Thakoor A P: 'Competitive
neural architecture for hardware solution to the assignment problem', Neural
Networks, 4, pp 432-442 (1991).
15. Bousono C and Manning M: 'The Hopfield neural network applied to the
quadratic assignment problem', to appear in the Neural Computing Applications
Journal.
16. Moopenn A, Duong T and Thakoor A P: 'Digital-analog hybrid synapse chips
for electronic neural networks', in Touretzky D (Ed): 'Advances in Neural
Information Processing Systems 2', Morgan Kaufman Publishers, pp 769-776
(1990).
17. Duong T, Eberhardt S P, Tran M, Daud T and Thakoor A P: 'Learning and
optimization with cascaded VLSI neural network building-block chips', Int Joint
Conf on Neural Networks, pp 184-189 (June 1992).
18. Adjali I and Gell M A: 'Self-organization in open computational systems', Phys
Rev E, 49, (5-A), pp 3833-3842 (1994).
19. Gell M, Fernandez-Villacanas J L, Adjali I, Manning M and Amin S: 'Self-organization and markets inside communications', Proc of 6th Annual Conf on
Neural Networks, Genetic Algorithms and Chaos Theory: Intelligent Financial
and Business Systems, London (February 1994).
20. Amin S J, Olafsson S and Gell M A: 'Constrained optimization for switching
using neural networks', Proc of the Int Workshop on Applic of Neural Networks
to Telecomms, INNS Press, pp 106-111 (1993).
21. Aiyer S V, Niranjan M and Fallside F: 'A theoretical investigation into the
performance of the Hopfield model', IEEE Trans on Neural Networks, 1, No
2 (June 1990).
22. Manning M R and Gell M A: 'A neural net service assignment model', BT Technol
J, 11, No 2, pp 50-56 (April 1994).
23. Hertz J, Krogh A and Palmer R G: 'Introduction to the theory of neural
computation', Santa Fe Institute, Addison-Wesley (1991).
HIERARCHICAL
MODELLING
M A H Dempster
6.1
INTRODUCTION
Management science has for over thirty years been concerned with
mathematical and computer models at the macro, meso and micro levels of
detail to support corporate decision-making in planning, management and
control, which reflects the classical three-level military hierarchical planning
concepts of strategic long-run, tactical medium-run and operational short-run [1] (see Table 6.1). As planning moves down the corporate hierarchy
it becomes increasingly detailed and involves shorter timescales and many
more, but smaller, uncertainties. The mathematical modelling involved at
successively lower levels reflects these differences - paralleling the macro,
meso and micro-scale mathematical models of classical physics (e.g. see
Woods [2]) - increasing in complexity at each level (see Table 6.2). In a
stationary corporate environment, operational planning models - involving
mainly management and control functions - can become extremely complex.
In a highly dynamic uncertain environment, useful mathematical and
computer models tend to become simpler, as it is the strategic and tactical
decisions involving rarer major uncertainties which are critical for survival.
Over the last two decades, supported by rapid technological advances in
computing and telecommunications, complex corporate information systems
have developed, involving multiple decision-support systems at each level,
which, taken together, are referred to as hierarchical planning systems [3].
Table 6.1
Planning, management and control hierarchy after Anthony [1]. Strategic and
tactical levels handle complexities and uncertainties by aggregation in a stable environment in
order to focus on rare major environmental change.
                        Lead time    Cost    Uncertainty
Level 1  Strategic
Level 2  Tactical
Level 3  Operational

(Lead time, cost and uncertainty all increase towards the strategic level.)

Table 6.2
Macro, meso and micro modelling levels in physics and queuing networks; complexity
increases at each level.

         Physics                  Queuing networks          Mathematics

MACRO    Fluid flows              Reflected fluid flows     Euler (incompressible),
         High density                                       Navier-Stokes (compressible)
         Irreversible                                       PDEs

MESO     Kinetics                 Reflected Brownian        Poisson, Boltzmann,
         Medium density           motions                   Vlasov PDEs
         Partly reversible

MICRO    Particle dynamics        Standard discrete         Many-body ODEs
         Low density              event (flow) processes
         Reversible
About a decade ago it was proposed that the behaviour and performance
of such complex human-computer systems could be evaluated in terms of
relatively simple three-level stochastic optimization models [4, 5] (an idea
originally introduced in the context of military logistics by Dantzig in the
1940s). This has subsequently been demonstrated in the fields of
manufacturing and distribution.
A recent text [6] on integrated voice-data telecommunications network
design discusses the three-level corporate planning hierarchy in the context
of private corporate network design, management and control, providing
mathematical models at each level of detail and recommending the establishment of a corporate network team concerned with tasks at each level. In this
chapter tentative steps are taken towards appropriate three-level stochastic
optimization models for understanding and designing integrated hierarchical
planning systems in the telecommunications industry. In this context, it should
Fig. 6.1  Open systems interconnection (OSI) seven-layer architecture [7]. Each layer presents
a virtual link to the next higher layer and data-flow rates and planning lead times vary directly
with depth.
life cycle. Currently, queuing models are used to describe the network, while
deterministic optimization methods are the primary tools in both design and
routeing problems. The purpose of the hierarchical models proposed in this
chapter is to perform integrated optimization with random elements
represented at each level at an appropriate level of detail. To this end, the
next section outlines recent mathematical work on successively aggregated
meso and macro approximations of micro-level queuing networks - the
traditional tool of telecommunications performance engineering. Building
on the familiar discrete-event stochastic processes of queuing theory by
appropriate rescaling of time and aggregation of events, reflected Brownian
motions and deterministic fluid flow processes are involved (or, in the case
of bursty traffic, Markov modulated fluid stochastic processes) (see Table
6.2). In section 6.3, the recent (functional) central limit theory for stochastic
processes which yields these results is briefly outlined. These ideas are applied
to some preliminary two- and three-level planning models in section 6.4. In
the final section of the chapter a few directions for future research and open
mathematical problems are outlined, whose pursuit and solution would
contribute to the applicability of the approach introduced here to integrated
network planning for future multi-point, multimedia and multi-rate
connections as discussed, for example, in Hui [8].
6.2
Fig. 6.2
Open queuing network model with independent identically-distributed inter-event
exogenous input and potential service processes and infinite queue buffers.
(potential) service rate μj and the switching fractions Pjk of particles that are
routed directly to node k on link (j,k) after service at node j, on each of
the J := |N| nodes j in the network. The row vector/matrix triplet (λ0, μ',
P) specifies the long-run average performance of the system and the (total)
inflow vector λ' of (total) arrival rates λj at nodes j = 1, ..., J is the maximum
solution of the traffic equations:

λ' = λ0 + (λ' ∧ μ')P                                     ... (6.1)
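The traffic equations (6.1), λ' = λ0 + (λ' ∧ μ')P, can be solved numerically by iterating the monotone map from above, which converges to the maximum solution. A minimal sketch (the function name and two-node example are illustrative):

```python
import numpy as np

def traffic_rates(lam0, mu, P, iters=200):
    """Fixed-point iteration for the traffic equations (6.1).
    Starting above every fixed point of the monotone map gives a
    decreasing sequence converging to the maximum solution."""
    lam = lam0 + mu @ P + 1.0              # dominates any feasible solution
    for _ in range(iters):
        lam = lam0 + np.minimum(lam, mu) @ P
    return lam

# two-node tandem: all traffic served at node 1 is routed to node 2
lam0 = np.array([0.5, 0.0])                # exogenous arrival rates
mu = np.array([1.0, 1.0])                  # service rates
P = np.array([[0.0, 1.0],
              [0.0, 0.0]])                 # switching fractions
rates = traffic_rates(lam0, mu, P)         # -> [0.5, 0.5]
```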
which represents the difference between the equilibrium cumulative input and
potential output processes (and which, together with P, completely specifies
the system) and the equilibrium lost output (due to empty nodes) process:

Y' := {Y'(t) := (Y1(t), ..., YJ(t)) : t ≥ 0}

Indeed, given X' and P, the pair (Y', Q') is the unique solution of:

Q' = X' + Y'(I - P) ≥ 0'                                 ... (6.2)

(where 0' is the process which is identically zero), for which Y' is non-decreasing
with Y'(0) := 0' and has co-ordinates Yj which increase at time t ≥ 0 only
when Qj(t) = 0, j = 1, ..., J. These relations may be expressed as the statements that, given
X' and P, the process Y' ≥ 0' is the unique non-negative solution of the
abstract order complementarity problem [14] defined by equation (6.2) and:

[X' + Y'(I - P)] dY' = 0'                                ... (6.3)

where Q' is defined by equation (6.2) and Y'(t) = Σ_{ti ≤ t} ΔY'(ti), i.e. the sum
of the jumps ΔY'(ti) of the process Y' at jump epochs ti up to time t. The
lost output process Y' is termed the regulator of the queue-length process
Q' and it follows that it is the least element process satisfying the inequalities
in equation (6.2) and non-negativity.
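In the single-node special case (J = 1, P = 0) the regulator has the explicit one-dimensional Skorokhod form Y(t) = max(0, max_{s ≤ t} (-X(s))), which the following sketch computes along a discretized path (function name illustrative):

```python
import numpy as np

def reflect(x):
    """One-dimensional Skorokhod reflection: for a net-input path x with
    x[0] >= 0, return (q, y) with q = x + y >= 0, y non-decreasing and
    increasing only when q = 0 (y is the regulator of q)."""
    y = np.maximum.accumulate(np.maximum(-x, 0.0))  # running max of (-x)+
    q = x + y
    return q, y

x = np.array([1.0, 0.5, -0.5, -1.0, 0.0])  # sampled potential net throughput
q, y = reflect(x)
# q = [1.0, 0.5, 0.0, 0.0, 1.0]; y = [0.0, 0.0, 0.5, 1.0, 1.0]
```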
Aggregating numbers of particles by the scaling, e.g. X'n/√n, and
accelerating time by the scaling nt, the queue length process Q' emerges in
the (almost sure) limit as reflected Brownian motion (RBM) [15] (from a
functional central limit theorem for stochastic processes) regulated by a
suitable increasing (local time) process Ŷ and driven by a Brownian motion
X̂ (see Fig. 6.3). Specifically, the processes (X̂, Ŷ, Q̂) represent the asymptotic
heavy-traffic diffusion approximations to the (X', Y', Q') processes of this
queuing network as the exogenous input rates λ0n', service rates μn' and
initial potential throughputs X'n(0) increase at rate √n as n → ∞. When
numbers of particles are more heavily aggregated by the scaling, e.g. X'n/n,
with the same time acceleration nt, the queue length process Q' emerges in
the (almost sure) limit as deterministic fluid flow (from a suitable functional
strong law of large numbers for stochastic processes) regulated by a
deterministic increasing process Ȳ' and driven by the deterministic expected
potential net throughput process:

... (6.4)
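The fluid (strong-law) scaling can be illustrated for a single queue's net-input process (a hypothetical slotted sketch; the chapter's setting is a multidimensional network): dividing the cumulative net input by n recovers the deterministic drift.

```python
import numpy as np

def scaled_net_input(n, lam=0.5, mu=1.0, seed=1):
    """Fluid scaling X(nt)/n of a slotted single-queue net-input process
    with Poisson arrivals (rate lam) and potential services (rate mu).
    As n grows, the scaled endpoint approaches the drift lam - mu."""
    rng = np.random.default_rng(seed)
    steps = rng.poisson(lam, n) - rng.poisson(mu, n)  # net input per slot
    X = np.concatenate([[0], np.cumsum(steps)])
    return X / n

x = scaled_net_input(100_000)
# x[-1] is close to lam - mu = -0.5 by the strong law of large numbers
```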
Fig. 6.3  Almost sure limits of the network processes: the diffusion scaling X'n/√n with
time acceleration nt yields Brownian motion and reflected Brownian motion (X̂, Ŷ, Q̂), while
the fluid scaling X'n/n yields the expected potential throughput and its deterministic
reflection (X̄, Ȳ, Q̄).

... (6.5)
[X̄' + Ȳ'(I - P)] dȲ' = 0                                ... (6.6)

where Ȳ'(t) = t(μ' - λ') ∨ 0. Hence the time derivative of the jth co-ordinate
Ȳj of the lost output process is identically zero unless μj > λj,
when the queue at j is identically 0 due to the fact that the instantaneous outflow
rate exceeds the instantaneous inflow rate. Formally, to apply the theory of
Borwein and Dempster [14] the sample paths of the processes (X', Y', Q'),
(X̂, Ŷ, Q̂) and (X̄, Ȳ, Q̄) may be considered as elements of suitable spaces of
functions of time - respectively, left-limited right-continuous functions,
continuous functions and continuously differentiable functions.
From a geometric point of view, the resulting order-complementarity
problems represent abstractly the dynamical situation in which the appropriate
queue length process Q' evolves in the interior of the non-negative orthant
of R^J exactly as the corresponding potential throughput process X'
(representing all network nodes occupied) until it hits one or more lower
dimensional faces of this cone (representing empty nodes), when the
appropriate regulating lost output process Y' acts minimally to reflect it back
(in directions dictated by the Leontief matrix I - P [14, 16]) into the interior
of the non-negative orthant (see Fig. 6.4).
Fig. 6.4  The queue length process starts at X(0) = Q(0) and evolves as Q(t) = X(t) in the
interior of the non-negative orthant (co-ordinates node 1, node 2, node 3) until reflected
minimally at its faces.
6.3

P{X1 + ... + Xn > na} = O(e^(-nI(a)))                    ... (6.7)

where I(a) is the large deviation rate function.
Fig. 6.5  Classical and functional limit theory for sums of independent identically distributed
random variables {Xn} with EX = μ and var X = σ²: at the micro level, the sums X1 + ... + Xn
themselves; at the meso level, the almost sure Brownian motion limit BM(μ, σ²) with drift; at
the macro level, the almost sure deterministic limit nEX; together with the extreme value
estimate P{X1 + ... + Xn > na} = O(e^(-nI(a))).
the Gaussian distribution in the central limit theorem. The large deviation
result (equation (6.7)) is represented by the negative exponential distribution
of the extreme values of the asymptotic Brownian motion and the Poisson
process nature of their occurrence over time.

A standard (vector) Brownian motion (or Wiener) process W has
continuous sample paths, a Gaussian state distribution N(0, tI) for W(t),
t ≥ 0, and stationary independent increments, i.e. the distributions of the
increments W(t) - W(s) depend only on t - s and the random (vector)
variables:

W(t1) - W(t0), ..., W(tn) - W(tn-1)

are independent for any n ≥ 1 and 0 ≤ t0 < ... < tn < ∞. The sample paths of
the Wiener process, although continuous, are extremely erratic (technically,
they are of unbounded variation and at almost all points of time they do
not possess a derivative). A process X is a (vector) Brownian motion with
drift μ and (co)variance (matrix) σ if it has the form:

X(t) = X(0) + μt + σW(t)                                 ... (6.8)
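Equation (6.8) translates directly into simulation by summing independent Gaussian increments of W. A minimal sketch (the discretization and function name are illustrative):

```python
import numpy as np

def brownian_with_drift(x0, mu, sigma, T=1.0, n=1000, seed=42):
    """Simulate X(t) = X(0) + mu*t + sigma*W(t) of equation (6.8), using
    the stationary independent Gaussian increments of the Wiener process W."""
    rng = np.random.default_rng(seed)
    dt = T / n
    dW = rng.normal(0.0, np.sqrt(dt), size=n)   # increments of W
    W = np.concatenate([[0.0], np.cumsum(dW)])
    t = np.linspace(0.0, T, n + 1)
    return t, x0 + mu * t + sigma * W

t, X = brownian_with_drift(0.0, 0.5, 0.2)
```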
6.4

... (6.9)

min  E[h(x, d)]
s.t. Σ_{a∈A} xa ≤ b,  x ≥ 0                              ... (6.10)

where, for first stage capacities x and demand d, h(x, d) is the optimal value
of the second stage routeing problem:

min  Σ_{w∈W} sw
s.t. Σ_{p∈Qa} fp ≤ Ca + xa,  a ∈ A                       ... (6.11)
     Σ_{p∈Pw} fp + sw = dw,  w ∈ W                       ... (6.12)
Here the (routed) flow fp is the random stationary state number of DS1
connections routed by the bandwidth manager using path p ∈ Pw, the set of
allowable paths (restricted to three in the application) associated with the
origin-destination (OD) (node) pair w ∈ W, Qa is the set of paths utilizing link
a ∈ A and sw is the random number of unserved DS1 requests from the total
random demand dw associated with the pair w. The inequality (6.11) is the
(almost sure) capacity constraint involving the current embedded link
capacities Ca and the second stage decision variables of additional allocated
capacities xa, a ∈ A, while equation (6.12) represents the demand constraints
with (almost sure) non-negative slacks sw, w ∈ W, which drive the entire
planning process.
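The role of the slacks sw can be made concrete with a toy single-OD-pair recourse evaluation. This greedy path fill is only an illustrative stand-in for solving the second stage problem (6.11)-(6.12) exactly (which requires a linear programming solver); all names and numbers are hypothetical.

```python
def unserved(demand, paths, cap):
    """Greedily route demand for one OD pair over its allowable paths
    subject to link capacities; return the unserved slack s_w.
    paths: tuples of link indices; cap: mapping of link capacities."""
    cap = dict(cap)
    left = demand
    for p in paths:
        room = min(cap[a] for a in p)  # bottleneck capacity of path p
        f = min(left, room)            # flow routed on this path
        for a in p:
            cap[a] -= f
        left -= f
    return left

# three links, two allowable paths between a single OD pair
s = unserved(10, paths=[(0, 1), (2,)], cap={0: 4, 1: 6, 2: 3})  # -> 3
```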
Although not stated this way, the demand vector d is modelled as a
stationary Markov modulated fluid state variable with equiprobable
independent rates on each link. Rate estimates come from Kalman filtering
of actual network traffic and involve 5-10 rates on 100 OD pairs, leading
to an astronomical number 5^100 - 10^100 of network demand states. This is
beyond the range of current (and probably future) numerical algorithms [20]
for solving explicitly the complete certainty equivalent form of the two-stage
recourse problem (6.10) - even when the requirement of integral flows is
dropped. Hence, an iterative algorithm, termed stochastic decomposition
[21], combining Benders' decomposition with network state sampling, has
been employed to solve problem (6.10) in the sense of providing tight
confidence bounds (see also Dantzig and Infanger [22]) on expected unserved
requests. This solution was also validated by dynamically routed simulations,
all in reasonable computing times on contemporary UNIX workstations.
On the other hand, the full three-level model is a new order of
computational difficulty, even with continuous variable assumptions, without
considerable further simplifying analysis, which remains to be done.
Fig. 6.6  ATM network: voice, data and video sources on customer premises are multiplexed
onto the ATM network.
Fig. 6.7  Layered ATM traffic model: each layer (path, call, burst, cell) is characterized by
a traffic rate λ, a state N and a grade of service (GoS) θ - (λpath, Npath, θpath), (λcall, Ncall,
θcall), (λburst, Nburst, θburst) and (λcell, Ncell, θcell).
capacity units) between the OD pair w ∈ W. The result is a deterministic two-stage
planning model with second stage a classical multicommodity flow
problem involving link provision costs βa, a ∈ A, and OD pair revenues rw,
w ∈ W, namely:

min  Σ_{a∈A} βa Ca - Σ_{w∈W} rw (Σ_{p∈Pw} fp)            ... (6.14)
s.t. Σ_{p∈Qa} fp ≤ Ca,   a ∈ A                           ... (6.15)
     Σ_{p∈Pw} fp ≤ cw,   w ∈ W                           ... (6.16)
     fp ≥ 0,   p ∈ Pw,  w ∈ W                            ... (6.17)
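A toy instance of this deterministic model can be checked by brute force over a flow grid; a real instance would use a linear programming solver, and all names and numbers below are illustrative. Capacities are provisioned to exactly cover the routed flows, which is optimal whenever all βa > 0.

```python
import itertools
import numpy as np

# toy instance of (6.14)-(6.17): one OD pair, two allowable paths
beta = {0: 1.0, 1: 1.0, 2: 2.5}   # link provision costs beta_a
r_w, c_w = 4.0, 5.0               # revenue per unit flow; demand bound c_w
paths = [(0, 1), (2,)]            # allowable paths P_w as tuples of links

best = None
for f in itertools.product(np.arange(0.0, 5.25, 0.25), repeat=2):
    if sum(f) > c_w:
        continue                  # demand constraint (6.16)
    C = {a: 0.0 for a in beta}
    for fp, p in zip(f, paths):
        for a in p:
            C[a] += fp            # provision just enough capacity, (6.15)
    cost = sum(beta[a] * C[a] for a in C) - r_w * sum(f)   # objective (6.14)
    if best is None or cost < best[0]:
        best = (cost, f)
# best routes all 5 units on the cheaper path (0, 1): cost -10.0
```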
6.5
This chapter treats two topics which are - at least in the author's view -
closely related. The first is a practical concern with the use of three-level
hierarchical models for integrated network planning; the second is a
mathematical concern with aggregating the flow of discrete network events
for use with more appropriate models at earlier, higher levels of the planning
process, and providing a justification for the use of deterministic flow models
for network design.
Three-level hierarchical stochastic optimization models could help in
understanding, as an integrated whole, piecemeal complex computer-based
planning and management systems for future networks. This has been
tentatively demonstrated by the models of the previous section.
Progress towards this lofty goal would be aided by rigorously extending
the results of section 6.3 to queuing networks with finite node buffer
capacities, when problem (6.2)-(6.3) becomes the order complementarity
problem:
0' ≤ Z'

0' ≤ Q' := [X' + (Y' - Z')(I - P)] ≤ B',
where B' is a constant process representing fixed node capacities and Z'
represents the buffer overflow loss process. An optimization problem on a
single node for such a system is studied in Harrison [15].
Progress would also be aided by an extension of the model (6.9) or (6.10)
of section 6.4 to incorporate a dynamic third stage allowing non-stationary
network demand processes to illuminate network capacity expansion
planning. Efficient process path sampling and numerical optimization
procedures based on nested Benders' decomposition have yet to be designed
for such models, but progress in efficient simulation of diffusion processes
[27] is relevant to this endeavour.
In conclusion, it is clear that the application to telecommunications
network planning of multilevel stochastic optimization models is mathematically and computationally challenging. Hopefully, this chapter has also
indicated their potential as practical aids to future network planning problems
in the industry.
REFERENCES
1.
2.
3.
4.
5.
6.
7.
8.
9.
102
HIERARCHICAL MODELLING
10. Kelly F P: 'Reversibility and stochastic networks', Chapter 8, Wiley, New York
(1979).
11. Kelly F P: 'Loss networks', Ann Appl Probability, 1, pp 319-378 (1991).
12. Kleinrock L: 'Queueing systems', Vols 1 and 2, Wiley, New York (1975).
13. Molloy M K: 'Fundamentals of performance modelling', Macmillan, New York
(1989).
14. Borwein J M and Dempster M A H: 'The order complementarity problem', Maths
of OR, 14, pp 534-554 (1989).
15. Harrison J M: 'Brownian motion and stochastic flow systems', Wiley, New York
(1985).
16. Chen H and Mandelbaum A: 'Leontief systems, RBVs and RBMs', in Davis
M H A and Elliott R J (Eds): 'Applied stochastic analysis', Gordon and Breach,
New York, pp 1-43 (1991).
17. Davis M H A: 'Piecewise-deterministic Markov processes: a general class of nondiffusion stochastic models', J Royal Stat Soc, B46, pp 353-388 (1984).
18. Dempster M A H: 'Optimal control of piecewise deterministic processes', in Davis
M H A and Elliott R J (Eds): 'Applied stochastic analysis', Gordon and Breach,
New York, pp 303-325 (1991).
19. Sen S, Doverspike R D and Cosares S: 'Network planning with random demand',
Tech Report, Systems and Industrial Engineering Dept, University of Arizona
(December 1992).
20. Dempster M A H and Gassmann H I: 'Computational comparison of algorithms
for dynamic stochastic programming', Submitted to ORSA J on Computing.
21. Higle J L and Sen S: 'Stochastic decomposition: an algorithm for two-stage linear
programs with recourse', Maths of OR, 16, pp 650-669 (1991).
22. Dantzig G B and Infanger G: 'Large scale stochastic linear programs: importance
sampling and Benders' decomposition', Tech Report SOL91-94, Dept of
Operations Research, Stanford University [to appear in Ann of OR] (1991).
23. Medova E A: 'ATM admission control and routeing', Internal BT technical report
(December 1993).
24. Hui J Y, Gursoy M B, Moayeri N and Yates R D: 'A layered broadband switching
architecture with physical or virtual path configurations', IEEE J on Selected
Areas in Communications, 9, pp 1416-1426 (1991).
25. Labourdette J-F P and Acampora A S: 'Logically rearrangeable multihop
lightwave networks', IEEE Trans Comms, 39, pp 1223-1230 (1991).
26. Medova E A: 'Network flow algorithms for routeing in networks with wavelength
division multiplexing', Proc lith UK Teletraffic Symp, Cambridge (1994).
27. Newton N J: 'Variance reduction for simulated diffusions', Tech Report, Dept
of Electronic Systems Engineering, University of Essex (1992).
7
GRAPH-THEORETICAL
OPTIMIZATION METHODS
E A Medova
7.1
Communications networks of any kind - from early telegraph and circuit-switched telephone networks to future integrated broadband networks - are
represented most naturally by a graph G(V,E), where vertices, or nodes, of
V are essentially switches (telephones or computer terminals) and the edges
or arcs of E are the transmission links. Classification of networks, for example
into local area networks (LANs), metropolitan area networks (MANs) or
wide area networks (WANs), will result in a change of the technical definitions
of network nodes and their geographical coverage, but the graph representation preserves the concepts of 'interconnectivity' and 'reachability' in terms
of existing paths leading from anyone node to any other node. This is the
precise reason why graph-theoretical methods are of great importance for
design and routeing in telecommunications networks.
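The notion of reachability just mentioned can be made concrete with a breadth-first search over an adjacency-list representation (an illustrative sketch; names and the small graph are hypothetical):

```python
from collections import deque

def reachable(adj, s):
    """Breadth-first search: the set of vertices reachable from s in a
    graph given as an adjacency list (vertex -> list of neighbours)."""
    seen = {s}
    q = deque([s])
    while q:
        v = q.popleft()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                q.append(w)
    return seen

adj = {0: [1], 1: [2], 2: [], 3: [0]}
r = reachable(adj, 0)  # -> {0, 1, 2}
```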
Graph theory has its own extensive vocabulary, which differs slightly from author to author. A knowledge of this theory is important, since solutions of graph problems based on intuition can be misleading, and a slight change of graph structure can turn a problem into one that is computationally intractable. Although there have been many applications of graph theory to network design and analysis over a long period, probabilistic analysis and Erlang traffic theory prevail over it as basic tools because of tradition and the educational background of communications engineers.
Fig. 7.1

The adjacency matrix M = (m_ij) of a graph on vertices v1,...,vn is defined by setting m_ij = 1 if there is an edge from v_i to v_j and m_ij = 0 otherwise (in particular m_ii = 0).

For the graph of Fig. 7.1 the incidence matrix is given by:

         e1   e2   e3   e4   e5
  v1     -1    1    0    0
  v2      1    0    1    1
  v3      0   -1   -1    0
  v4      0    0    0   -1

with the entries +1 and -1 in each column marking the two end nodes of the corresponding edge.
switching networks (see Fig. 7.3), i.e. open acyclic (no cycles) networks with N input nodes, N output nodes and at least N log N internal nodes;
Fig. 7.2
Fig. 7.3
Fig. 7.4
Fig. 7.5
7.2
7.3
in terms of the average number of 'packets' transmitted per unit time and
quality of service is measured in terms of the average delay per packet. The
basic underlying quantities are of course random variables whose averages
and other statistics are used for performance assessments. Analytical
expressions for such measures are usually not accurate and are very difficult
to use in optimization models.
Stochastic optimization is a challenging area for research, with very
interesting applications to telecommunications, as for example in dynamic
alternative routeing (DAR) [7,8,9] and the work on private networks of Higle and Sen [10, 11].
An alternative is to use deterministic optimization models and to measure
performance on a link in terms of (perhaps a fixed factor times) the average
traffic carried by the link, with the implicit stationarity assumption that the
statistics of the traffic entering the network do not change over the time period
being studied. This assumption is adopted here and the formulation of flow
models as in Bertsekas and Gallager [12] is described.
The traffic arrival rate f_ij is called the flow on link (i,j), expressed in data-units/sec, where the data-units can be bits, packets, messages, etc. The objective function to be optimized is of the form:

  Σ_(i,j) D_ij(f_ij)   ... (7.1)

An alternative is to minimize the maximum link utilization:

  max_(i,j) [f_ij/c_ij]   ... (7.3)

where c_ij is the capacity of link (i,j).
Fig. 7.6  Optimum path p = {1,4,5,6}.
Fig. 7.7  Destinations for OD pairs w1 and w2.
subject to:

  f_ij = Σ over all paths p containing (i,j) of x_p   ... (7.4)

  Σ_{p in P_w} x_p = r_w   for all w in W

  x_p ≥ 0
N(V,A)
Maximum flow problem (MF) - for the single commodity flow problem
consider the following notation:
  f_ij      flow on arc (i,j)
  a_i > 0   supply node
  a_i = 0   trans-shipment node
  a_i < 0   demand node
  min Σ_{(i,j) in A} c_ij f_ij

  s.t.  Σ_j f_ij - Σ_j f_ji = a_i   for all i

        f_ij ≥ 0 (integer),   (i,j) in A   ... (7.5)
A solution is being sought for the constraints which will yield an extreme value (minimum) of the objective (cost) function. When all costs c_ij are set to -1, the problem becomes equivalent to MF.
The main idea of the primal cost improvement solution method [14] is
to start with a feasible flow vector and to generate a sequence of other feasible
flow vectors, each having a smaller primal cost than its predecessor. If the
current flow vector is not optimal, an improved flow vector can be obtained
by pushing flow along a simple cycle C with negative cost, where C+ and
C- are the sets of forward and backward arcs of C. The simplex method
[20] for finding negative cost cycles is the most successful in practice and
it can also be used to give the proofs of important analytical results concerning
graph algorithms for network flow problems.
It can be shown that a basic feasible solution B of the flow conservation
constraints corresponds to a subgraph NB which is a spanning tree of the
network represented by G. This is the principal result which relates the simplex
method of linear programming and graph-theoretical algorithms.
The network simplex method can in fact be used to solve a variety of
optimization problems such as assignment, transportation (both special cases
of trans-shipment involving bipartite graphs), and capacitated network flow
  Σ_i Σ_j f_ij   ... (7.6)

For the multicommodity flow problem with r commodities:

  min Σ_{k=1..r} Σ_{(i,j) in A} c^k_ij f^k_ij

  s.t.  Σ_{j:(i,j) in A} f^k_ij - Σ_{j:(j,i) in A} f^k_ji = a^k_i,   k = 1,...,r

        Σ_{k=1..r} f^k_ij ≤ u_ij   ... (7.7)
The MFP belongs to the class of problems for which exact solutions in integers are believed to be computationally infeasible for large networks.
A standard heuristic uses linear programming to solve the problem in real
numbers and then adjusts the solution found to get an approximate integer
solution to the original problem [18]. A new heuristic procedure [21] has
been developed in the context of the optical network design problem using
the best known polynomial algorithms from Simeone et al [22].
Network design - any of the above problems can be modified to incorporate
a network design objective by adding the constraints:
  Σ_{k=1..r} f^k_ij ≤ u_ij y_ij   ... (7.8)

where y_ij is a 0-1 variable which represents whether or not a link (i,j) is to be included in the network, with corresponding cost term q_ij y_ij.
When any of the above problems has a suitable special structure, a large
number of efficient non-simplex algorithms have been developed for solutions
of each particular problem. Non-simplex methods may often be classified
as either greedy methods or dynamic programming.
A greedy method works in a sequence of stages, considering one input at a time. At each stage a decision is made as to whether a particular input forms part of an optimum solution to the problem at hand. This is done by considering the inputs in an order determined by some selection procedure, which may or may not be in terms of the objective (cost) function of the problem. In some cases the greedy algorithm generates a sub-optimal solution.
A well-known greedy algorithm is Kruskal's algorithm for finding minimum spanning trees. Interest in spanning trees for networks arises from the property that a spanning tree is a subgraph G' of a (nondirected) graph G such that V(G') = V(G) and G' is connected with the smallest number of links. If the nodes of G represent cities and the links represent possible (bidirectional) communications links connecting two cities, then the minimum number of links needed to connect n cities is n - 1. The spanning trees of G represent all feasible choices. In practical situations the links will have weights assigned to them, e.g. the length of the link, the congestion on the link, or the cost of construction of the link. The design problem is to select a set of communications links that would connect all the specified cities and have minimum total cost or be of minimum length. Therefore the interest here is in finding a spanning tree of G with minimum 'cost' (suitably interpreted). A greedy method to obtain a minimum-cost spanning tree builds this tree edge by edge. Kruskal's algorithm chooses the next edge for the solution by considering the edges of the graph in non-decreasing order of 'cost'.
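A minimal sketch of Kruskal's algorithm, using a simple union-find structure; the five edges and their costs in the usage example are invented.

```python
def kruskal(n, edges):
    """Minimum spanning tree by Kruskal's algorithm.
    edges: list of (cost, u, v) with nodes numbered 0..n-1."""
    parent = list(range(n))
    def find(x):                      # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    tree, total = [], 0
    for cost, u, v in sorted(edges):  # non-decreasing order of 'cost'
        ru, rv = find(u), find(v)
        if ru != rv:                  # edge joins two components: keep it
            parent[ru] = rv
            tree.append((u, v))
            total += cost
            if len(tree) == n - 1:    # n - 1 links connect n cities
                break
    return total, tree
```

For instance, kruskal(4, [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 0, 3), (5, 0, 2)]) selects three links of total cost 6.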
Dynamic programming is another algorithm design method that can be
used when the solution to the problem at hand may be viewed as the result
of a sequence of decision stages. For some problems, an optimal sequence
of decisions may be found by making the decisions one at a time and never
making an erroneous decision. This is true for all problems (optimally)
solvable by the greedy method. For many other problems, it is not possible
to make stepwise decisions (based only on local information) in such a manner
that the sequence of decisions made is optimal. For example, the shortest path from node i to node j in a network is impossible to find by the greedy method. But to find a shortest path from node i to all other nodes in a network G on n nodes, Dijkstra's (dynamic programming) algorithm yields an optimal solution in O(n²) basic steps.
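The O(n²) bound comes from the simple array implementation, sketched here with invented weights; w[i][j] holds the arc length, or infinity when no arc exists.

```python
INF = float('inf')

def dijkstra(w, s):
    """Shortest distances from origin s on an n-node network.
    The array implementation below runs in O(n^2) basic steps."""
    n = len(w)
    dist = [INF] * n
    dist[s] = 0
    done = [False] * n
    for _ in range(n):
        # greedy/dynamic-programming step: close off the nearest open node
        u = min((i for i in range(n) if not done[i]),
                key=lambda i: dist[i], default=None)
        if u is None or dist[u] == INF:
            break
        done[u] = True
        for v in range(n):            # relax every arc leaving u
            if w[u][v] < INF and dist[u] + w[u][v] < dist[v]:
                dist[v] = dist[u] + w[u][v]
    return dist
```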
One theoretical way to solve problems for which it is not possible to make
a sequence of stepwise decisions leading to an optimal decision sequence is
to try all possible decision sequences, which is termed complete enumeration
and usually involves a number of sequences exponential in the problem size.
Dynamic programming often reduces the amount of enumeration required by using the Principle of Optimality [19]:
'An optimal sequence of decisions has the property that, whatever the
initial state and decisions are, the remaining decisions must constitute
an optimal decision sequence with regard to the state resulting from the
first decision.'
The difference between the greedy method and dynamic programming
is that in the greedy method only one decision sequence is ever generated.
In dynamic programming many decision sequences may need to be generated
to solve the problem at hand. This is illustrated in the context of 'shortest'
path problems.
Shortest-path problems (SP) - three types of shortest-path problem and corresponding solution methods are of interest:

- from one node to another node, i.e. one origin-destination pair (Dijkstra's algorithm);
- from every node to every other node (the Floyd-Warshall algorithm);
- from one node to all other nodes (the Bellman-Ford algorithm).
calculation of the shortest path from any node to all others [12]. For packet-switched networks, such as Arpanet, the asynchronous distributed version of the Bellman-Ford shortest-path algorithm has been proposed [12]. For real-time application it was shown that this basic algorithm converges to the optimal routeing distances if the link lengths in the network stabilize and all cycles have strictly positive length. However, this convergence can be very slow, which is a particular problem in the case of link failure, when the algorithm will keep iterating without effective end. This behaviour is known as counting, and in this case data messages cycle back and forth between nodes, which is called looping. It is obvious that such a problem may completely destroy communication, particularly in a high-speed network.
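A toy synchronous version of the distance-vector iteration illustrates the convergence property; the real Arpanet algorithm is asynchronous, and the topology, link lengths and termination check here are invented for the sketch.

```python
INF = float('inf')

def bellman_ford_distances(nodes, links, dest, max_rounds=100):
    """Synchronous distance-vector iteration towards one destination.
    links: dict {(u, v): length}, treated as bidirectional. Each round every
    node recomputes its estimate from its neighbours' previous estimates."""
    d = {v: (0 if v == dest else INF) for v in nodes}
    for _ in range(max_rounds):
        nd = {}
        for v in nodes:
            if v == dest:
                nd[v] = 0
                continue
            best = INF
            for (a, b), length in links.items():
                if a == v:
                    best = min(best, length + d[b])
                if b == v:
                    best = min(best, length + d[a])
            nd[v] = best
        if nd == d:        # estimates have stabilized
            return d
        d = nd
    return d               # may still be iterating, e.g. after a link failure
```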
7.4
Fig. 7.9
Fig. 7.10
Fig. 7.11
7.5
CONCLUSIONS
It is important to stress that more attention should be paid by telecommunications engineers to theoretical work already completed at a very advanced level in mathematics and theoretical computer science. However, the chosen methods must be carefully tailored to the application at hand.
Fig. 7.12  A ring configured from an existing point-to-point fibre optic mesh network.
Fig. 7.13
REFERENCES
1.
2.
Lea C-T: 'Bipartite graph design principle for photonic switching systems', IEEE,
Trans Commun, 38, No 4, pp 529-538 (1990).
3.
4.
5.
6.
7.
8.
9.
l, pp 319-378
(1991).
10. Higle J L and Sen S: 'Recourse constrained stochastic programming', Proc 6th
Int Conf on Stochastic Programming, Udine, Italy (1992).
11. Sen S, Doverspike R D and Cosares S: 'Network planning with random demand',
Research Report, Systems and Industrial Engineering Dept, University of Arizona
(December 1992).
12. Bertsekas D and Gallager R: 'Data networks', Prentice Hall, Englewood Cliffs
(1987).
13. Kleinrock L: 'Queuing systems: Vol II', Computer Applications (1976).
14. Bertsekas D: 'Linear network optimization, algorithms and codes', MIT Press
(1991).
15. Christofides N: 'Graph theory, an algorithmic approach', Academic Press (1975).
16. Gondran M and Minoux M: 'Graphs and algorithms', Wiley, New York (1984).
17. Hu T C: 'Combinatorial algorithms', Addison-Wesley (1982).
18. Hu T C: 'Integer programming and network flows', Addison-Wesley (1970).
19. Lawler E: 'Combinatorial optimization: networks and matroids', Holt, Rinehart and Winston, New York (1976).
20. Dantzig G B: 'Linear programming and extensions', Princeton (1963).
21. Medova E A: 'Network flow algorithms for routeing in networks with wavelength
division multiplexing', Proc 11th UK Teletraffic Symposium, pp 3/1-3/10 (March
1994).
22. Simeone B, Toth P, Gallo G, Maffioli F and Pallotino S (Eds): 'Fortran codes
for network optimization', Annals of Operational Research, 11 (1988).
23. Awerbuch B and Peleg D: 'Routeing with polynomial communication-space trade-off', SIAM J on Discrete Math, 5, pp 151-162 (1992).
24. Horowitz E and Sahni S: 'Fundamentals of computer algorithms', Computer
Science Press, Potomac, MD (1978).
25. Labourdette J-F P and Acampora A S: 'Partially reconfigurable multihop lightwave networks', Proc IEEE Globecom '90, 300.6, pp 1-7 (1990).
26. Upfal E: 'An O(n logn) deterministic packet-routeing scheme', J of ACM, 39,
pp 55-70 (1992).
27. Garey M R and Johnson D S: 'Computers and intractability: a guide to the theory of NP-completeness', W H Freeman and Co (1979).
28. Lovasz L: 'Communication complexity', in Korte B et al (Eds): 'Algorithms and Combinatorics', 9, Springer-Verlag, Berlin (1990).
29. Medova E A: 'Optimum design of reconfigurable ring multiwavelength networks',
Proc Tenth UK Teletraffic Symposium, BT Laboratories, pp 9/1-9/9 (April 1993).
30. Medova E A: 'Using QAP bounds for the circulant TSP to design reconfigurable
networks', in Pardalos P and Wolkowics H (Eds): 'Proc DlMACS Workshop
on the QAP', American Mathematical Society, Providence (1994).
DISTRIBUTED RESTORATION
D Johnson, G N Brown, C P Botham, S L Beggs, I Hawker
8.1
INTRODUCTION
8.2
NETWORK PROTECTION -
AN OVERVIEW
Fig. 8.1
will affect service to the customer. For data links with a drop-out time as
low as 500 ms, both the MTBF and circuit availability remain much higher
than for alternative restoration methods.
Compared with DRAs, 'end-to-end' path protection is equally fast but less flexible and much more expensive in standby hardware. DRAs allow protection capacity to be shared across the network, considerably reducing the redundancy necessary for a given level of restorability.
8.3
PRINCIPLES
Each end node compares its own unique identity number with the NID
(node identity) field lodged in its receive register, i.e. the identity of the
node at the other end of the failed span. The node with the lowest number
will become a sender node and the other will become a chooser node
(Fig. 8.2(b)). This selection is arbitrary but necessary to ensure that sender
and chooser nodes are clearly identified.
The sender node sets the target fields of the signatures on all of its
protection line systems to the identity of the chooser node and the source
field to its own identity. This is the start of the sender flooding phase
which is used to identify make-good paths (Fig. 8.2(c)).
Nodes receiving these new signatures react by changing the source and target fields on their outgoing protection links to the same values, thus rebroadcasting the flood message. These intermediate nodes are said to play a tandem role (Fig. 8.2(d)).
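The sender/chooser selection and the flooding phase can be mimicked by a small breadth-first flood. This toy sketch, with invented node identities and spare links, ignores capacities and timing, which the real DRA must handle.

```python
from collections import deque

def restore_span(protection_links, failed_span):
    """Toy model of the sender/chooser flooding phase. protection_links is a
    dict {node: set(neighbours)} of spare-capacity links; failed_span = (a, b)."""
    a, b = failed_span
    # the node with the lower identity becomes the sender, the other the chooser
    sender, chooser = min(a, b), max(a, b)
    # sender flooding: signatures carry (source, target); tandems rebroadcast
    parent = {sender: None}
    frontier = deque([sender])
    while frontier:
        node = frontier.popleft()
        if node == chooser:                      # chooser sees its own identity
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return list(reversed(path))          # make-good path sender->chooser
        for nbr in sorted(protection_links.get(node, ())):
            if nbr not in parent:                # tandem node: rebroadcast once
                parent[nbr] = node
                frontier.append(nbr)
    return None                                  # no spare capacity available
```

For a failed span between nodes 1 and 2 with spare links 1-3, 3-4 and 4-2, the flood identifies the make-good path 1, 3, 4, 2.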
Fig. 8.2
8.4
all-spans mode - each span will fail in turn and invoke the DRA to
find alternative routes, the time taken to find each replacement route
for all failed links being recorded and displayed for each span;
interactive mode -
all-nodes mode - similar to the all-spans mode, except that each node in the network fails in turn.
Fig. 8.3  Inputs to the simulator are the type of DRA and relevant data, together with the physical network data and topology; outputs are the volume of messaging traffic and a graphical representation of the DRA.
Fig. 8.4
8.5
This section presents simulation results for span restoration times based on
a hypothetical SDH transport network. Protection links were added to the
network using a heuristic algorithm (described in section 8.8) to enable
restoration of any single span failure. The basic topology of the network
(Fig. 8.5) comprises 30 nodes and 57 spans with a total of 332 working links.
The simulation was run using TENDRA in the all-spans mode described
earlier.
Fig. 8.5  Test network.
The results (Fig. 8.6) indicate that distributed span restoration in an SDH network is feasible in about 1 s. This offers the possibility of restoration within the call drop-out threshold, providing customers with an uninterrupted service; 5 ms processing and 20 ms crossconnection times were assumed, which represent modest modifications to current crossconnect specifications [12].
Fig. 8.6  Distribution of restoration times (ms).
8.6
Ideally networks should be resilient not just to span failures but also to
multiple cable failures and node failures. All network faults should be
imperceptible to customers. In this section alternative approaches to
distributed restoration are considered [13] to determine which offer the best
prospects for achieving this aim.
8.6.1
ALTERNATIVE APPROACHES
Table 8.1

                           Pre-planned      Real time
  Speed                    very fast        fast
  Storage required                          none
  Risk of non-restoration  small            extremely small
8.6.2
A further option is for the DRA to construct a restoration path between the end nodes of each failed path¹ rather than the end nodes of the failed span².
Path restoration is more efficient in its use of spare capacity than span
restoration because the whole path is re-routed and re-optimized. It is also
more flexible because it can restore multiple span and node failures. Figure
8.7 illustrates how path restoration can be initiated - a span carrying two
paths has failed, and the end nodes of the span have detected the failure and
have sent messages along the affected paths to notify their end nodes. A sender
and chooser node are selected for each path and restoration completed as
previously described. Table 8.2 summarizes the main features of span versus
path restoration for span failures.
Fig. 8.7
Table 8.2

                              Span restoration    Path restoration
  Speed                       fast                moderate
  Spare capacity utilization  moderate            very good
                              difficult           simple
¹ A 'path' is defined here as a bi-directional circuit, routed from one node to another via any number of intermediate nodes.
² A 'span' refers to the collection of all line systems directly between two nodes.
8.6.3
Node restoration seeks to restore all paths through a failed node. The two
principal methods are path restoration, as described in section 8.6.2, or local
restoration of paths within spans adjacent to the failed node (Fig. 8.8).
In the latter method, every node records the identities of the previous two
nodes visited by each path so that, when a node sees an alarm, it is able to
initiate restoration either with its neighbour or with its neighbour's
neighbours.
Table 8.3 compares the features of node restoration between adjacent
nodes and end-to-end path restoration.
8.6.4
From the preceding discussion, it can be seen that no one method of applying
distributed restoration is better than all others in all circumstances. For
Fig. 8.8
Table 8.3

                              Restoration between    End-to-end path
                              adjacent nodes         restoration
  Speed                       fast                   moderate
  Spare capacity utilization  moderate               good
                              difficult              simple
try real time span restoration for any spans not restored;
try real time end-to-end path restoration for any outstanding faults (slow,
but copes with any network fault including node failures).
Fig. 8.9  Levels of protection: first-level, pre-planned span restoration; second-level, real time span restoration; third-level, real time path restoration.
8.7
The stages may be compared as follows:

  Stage                            1                      2                    3                    4
  Route finding                    central, pre-planned   central,             distributed,         distributed,
                                   or real time           pre-planned          pre-planned          pre-planned
                                                                               and real time        and real time
  Route storage                    central                distributed          distributed          none
  Dependency on management centre  total                  for pre-planning     none                 none
  Restoration time                 2-5 min                1 sec if plans OK    1 sec                < 1 sec
8.8
Figure 8.10 shows the restorability versus redundancy plot for an example
network design (see Fig. 8.5).
The benefit of optimizing true cost rather than the number of protection
links depends on the variation of link costs within the network. Networks
containing a wide variation in link lengths, a range of environments, or which
use a mixture of transmission technologies (e.g. fibre and radio) may benefit
significantly from true-cost optimization.
Placing spare capacity in such a way that the restoration algorithm can
find all the make-good routes is harder for node failures than for span failures
since the design and restoration algorithms must use the same technique for
handling contention for spare capacity between the failed paths. An extended
heuristic design algorithm solves this problem by prioritizing restoration
actions.
The planning algorithm described above has been applied successfully
to networks across Europe.
Fig. 8.10  Restorability versus redundancy (%).
8.9
Fig. 8.11
the network operator (or end user) selects the source (sender) and
destination (chooser) nodes and requests a number of circuits;
the sender automatically transmits messages to each of its nearest neighbours, and messages 'flood' the whole network eventually reaching the
chooser; network resources are reserved at each node during flooding;
Simple metrics have been associated with the assigned routes to regulate path length and to avoid areas of the network which are already heavily loaded.
The algorithm has been validated on a test network (Fig. 8.12) and all circuits
were assigned in less than 100 ms, for example, between nodes X and Y.
8.10
CONCLUSIONS
DRAs are simple, fast and can find multiple diverse routes around a network
failure. No databases are required and no co-ordinated or centralized control
is needed to find the routes. Any new line system or node is automatically
protected without the need to modify protection plans.
DRAs have several distinct advantages over traditional centrally
controlled approaches.
The algorithms are fast. Recent simulation work using the TENDRA simulation model has demonstrated the potential for sub-second restoration in an SDH network, compared to minutes for centralized restoration.
Fig. 8.12
APPENDIX
List of acronyms
APS     automatic protection switching
ATM     asynchronous transfer mode
DCS     digital cross-connect system
DRA     distributed restoration algorithm
GUI     graphical user interface
ISO     International Organization for Standardization
MTBF    mean time between failures
MTTR    mean time to repair
OSI     open systems interconnection
SDH     synchronous digital hierarchy
TENDRA
REFERENCES
1.
3.
4.
Chujo T, Komine H, Miyazaki K, Ogura T and Soejima T: 'Distributed self-healing network and its optimum spare-capacity assignment algorithm', Electronics and Communications in Japan, Part 1, 74, No 7 (1991).
5.
6.
7.
8.
9.
Schickner M J: 'Service protection in the trunk network: Part 2 - automatically-switched (1-for-N) route protection system', British Telecommunications Eng J, 7, pp 96-100 (July 1988).
10. Schickner M J: 'Service protection in the trunk network: Part 3 - automatically-switched digital service protection network', British Telecommunications Eng J, 7, pp 101-109 (July 1988).
11. McCafferty J K and Spada M E: 'Network restoration: putting the network back
together again - faster', Telephone Engineer and Management, 97, No 16,
pp 25-28 (August 1993).
12. Bellcore: 'Digital cross-connect systems in transport network survivability', SR-NWT-002514, Issue 1 (January 1993).
13. Johnson D, Brown G N, Beggs S L, Botham C P, Hawker I, Chng R S K, Sinclair M C and O'Mahony M J: 'Distributed restoration strategies in telecommunications networks', Proc International Communications Conference (ICC'94), New Orleans, USA (May 1994).
14. Chng R S K, Sinclair M C, Donachie S J and O'Mahony M J: 'Distributed
restoration algorithm for multiple failures in a reconfigurable network', Proc
5th Bangor Communications Symposium, University of Wales, Bangor, UK,
pp 203-206 (June 1994).
9
INTELLIGENT SWITCHING
R Weber
9.1
INTRODUCTION
9.2
Consider a single switch whose bandwidth is such that it can route c cells per second and whose input buffer holds B cells. It is reasonable to imagine that calls can be classified into m classes, such as video, telephone, fax or file transfer; calls in the same class have the same statistical characteristics. Suppose that the proportion of calls in class i is fixed at p_i, so that if there are N calls in progress the number in class i is Np_i. Denote by F(N,B,c) the average number of cells that are lost per second. Suppose the QoS constraint requires the frequency of cell loss to be less than some small amount, say F(N,B,c) ≤ 10⁻⁸. Given this constraint, one asks for the maximum possible value of N. This is a difficult question. One way to address it is by a queuing theory approach: one selects a probabilistic model for the burst traffic and then attempts to calculate F(N,B,c). However, any traffic model that is simple enough to be treated by this type of analysis is unlikely to be rich enough to encompass the variety of traffic characteristics one would expect to meet in practice.
The approach here is different. On-line measurements are made to estimate the cell-loss rate and to decide whether additional calls can be admitted. For example, if the switch is presently carrying 200 calls, the present cell-loss rate might be estimated as less than 10⁻⁸, and further that it will not become greater than 10⁻⁸ even if a further 20 calls were to be routed through the switch. The problem with this approach is that it is difficult to estimate the cell-loss rate when this rate is so small. Cells are lost very infrequently and there will be little information on which to base an estimate of the loss rate. Furthermore, it is not at all clear how F(N,B,c) scales in N. What is the number of extra calls that can be safely routed through the switch if all we know is that F(200,B,c) is about 10⁻¹⁰?
Fortunately, key insights are provided by the theory of large deviations.
This is a theory that is concerned with rare events - precisely such events
as infrequent buffer overflows. The theory is a rich one that has many
applications (see Bucklew [1]). Three important insights of the theory are
that:
if a rare event occurs, then it does so in the most likely of the ways that
it can happen.
  log Φ(N,B,c) = -BH(N,c) + o(B)   ... (9.1)

where:

  H((1+ε)N, c) = H(N, c/(1+ε))   ... (9.2)

Note that equations (9.1) and (9.2) imply that for any k > 1:

  Φ((1+ε)N, B, c) = [Φ(N, B/k, c/(1+ε)) e^{o(B)}]^k   ... (9.3)

In equations (9.1) and (9.3) the terms in o(B) are such that o(B)/B → 0 as B → ∞; in other words, these are asymptotics for large B. Equation (9.3), with ε = 0, provides a method of estimating Φ(N,B,c). The idea is to observe
Fig. 9.1  Cell-loss probability against buffer size for N = 340 on/off sources (on/off periods 25/50 ms, peak rate λ = 2500 cells/s), c = 350 000 cells/s and ε = 0, 0.01, 0.02.
the offered traffic, but to pretend that the buffer is only as large as B/k, for some k > 1. The frequency with which a buffer of this size would overflow during its busy periods can be estimated by a simple on-line simulation that can be implemented in software and simply counts cells in a small virtual buffer of size B/k. If, for example, k = 4, then the on-line simulation would keep track of the contents of a buffer that is only one quarter as large as the real one. If the frequency of buffer overflow in this smaller buffer is estimated to be 8 × 10⁻³, then by equation (9.3) an estimate of the frequency of buffer overflow in the actual buffer is the fourth power of this quantity, i.e. 4.096 × 10⁻⁹. Since the frequency of overflow in the small virtual buffer is relatively large, in this example 8 × 10⁻³, it should be possible to obtain a reasonable estimate of the overflow rate.
Equation (9.3) also suggests a way of estimating whether or not it is possible to increase the number of calls that are routed through the switch by some small percentage, say by 100ε%. Again, over some period of time, we should conduct an on-line simulation to measure the cell-loss rate that would occur if the buffer were of size B/k and the switch bandwidth were of size c/(1+ε) cells per second. The result of this simulation can be used to estimate the cell-loss rate in the actual buffer for a bandwidth of c/(1+ε). By equation (9.2), it follows that if the QoS constraint is satisfied under the reduced bandwidth then 100ε% more calls can be routed through the switch when it operates with the true bandwidth, c. Note that the assumption that p is fixed corresponds to the assumption that the extra εN calls will occur in the same mix of classes as those calls already present at the switch.
The on-line simulator that carries out the two estimation procedures
described above has been called MINOS (monitor for inferring network
overflow statistics), evoking the island of Crete where, at the Computer
Science Institute in 1990, the idea of this simulator was originated by
Courcoubetis, Walrand and Weber [2]; there, also, details of the derivation
of equations (9.1) and (9.3) can be found. A principal advantage of MINOS
is that it does not require any assumptions to be made about the statistical
nature of the traffic. MINOS uses actual observed traffic to make its
inferences; it is adaptive and adjusts its recommendations to changing patterns
and types of calls.
Subsequent research has addressed various issues concerned with practical implementation; some relevant remarks are given here.

To obtain a better estimate of Φ(N,B,c), it is valuable to estimate Φ(N,B/k,c) at three values of k. This is because a more refined version of the large-B asymptotic is:

  log Φ(N,B/k,c) = A

and, for large N:

  log Φ(N,B,c) = -NI(B₀,c₀) + o(N)   ... (9.4)
9.3
EFFECTIVE BANDWIDTHS
Here we adopt an approach which also has some of the advantages of the previous section, namely a simple characterization of source statistics that can be measured on-line. However, the emphasis here is on the direct calculation of effective bandwidths for calls in different classes. Suppose that a switch handles m classes of traffic and has the capacity to handle c cells per second. A number of authors have described models for which the condition, that the switch can carry N_i sources of class i with the probability of buffer overflow kept smaller than some specified amount, can be written:

  c ≥ Σ_{i=1..m} N_i α_i   ... (9.5)

where α_i is the effective bandwidth of a source in class i, i = 1,...,m; see, for example, Kelly [4] and Courcoubetis and Walrand [5]. When a source is bursty, its effective bandwidth will be somewhat greater than its average rate.
However, because at any given moment some sources are producing cells
above their average rates and others below, there is potential for statistical
multiplexing. This means that each source's effective bandwidth need not
be as great as its peak rate.
Again the analysis is based on the theory of large deviations. We suppose that time is discrete and that each of the N_i sources in class i delivers to the buffer numbers of cells at successive time points that are independently distributed as the stationary process {X^i_t, t = 1,2,...}. As in the previous section, let Φ(N,B,c) be the probability that the buffer overflows during a busy period, where Np_i is the number of calls in class i. Kesidis and Walrand [7] have shown that equation (9.1) holds, with:

  Λ_i(δ) = lim_{T→∞} (1/T) log E[exp(δ Σ_{t=1..T} X^i_t)]
  c ≥ Σ_i N_i α_i(δ/B)

where α_i(δ/B) = Λ_i(δ/B)/(δ/B), which is thus identified as the effective bandwidth of class i. Courcoubetis and Weber have shown that it is possible to make the expansion:

  α_i(δ/B) = m_i + (δ/2B)γ_i + o(δ/B)   ... (9.6)

where m_i = E[X^i_t] is the mean rate of a class-i source, and γ_i is often called the index of dispersion. It is also π times the spectral density evaluated at 0, i.e.:

  γ_i = γ_i(0) + 2 Σ_{k=1..∞} γ_i(k)

where γ_i(k) is the kth-order autocovariance of the process {X^i_t}. The above converges for well-behaved, purely nondeterministic second-order stationary processes. In the case that the numbers of cells that a source produces in successive periods are independent, γ_i is the variance. In general, γ_i can be estimated
estimated from the data by spectral estimation techniques (see, for example,
Chatfield [8]). It is attractive that effective bandwidths might be estimated
observed data, since it is unlikely that any theoretical model is rich enough
to adequately model all traffic classes.
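The estimate can be sketched directly from the autocovariance sum above. A minimal illustration; the biased sample autocovariance and the finite truncation lag are assumptions of this sketch, not the chapter's procedure:

```python
# Hedged sketch: estimating the index of dispersion from observed cell counts
# by truncating the autocovariance sum at a finite lag. The biased sample
# autocovariance and the truncation lag are assumptions of this sketch.

def index_of_dispersion(x, max_lag):
    n = len(x)
    m = sum(x) / n
    def gamma(k):  # biased sample autocovariance at lag k
        return sum((x[t] - m) * (x[t + k] - m) for t in range(n - k)) / n
    return gamma(0) + 2 * sum(gamma(k) for k in range(1, max_lag + 1))

print(index_of_dispersion([0, 1, 0, 1], 1))  # -0.125: alternation reduces dispersion
```

In practice a spectral estimator with a smoothing window, as in Chatfield [8], would replace the naive truncation.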
It is interesting to observe what happens if the source is pre-smoothed
by linear filtering, say:

Y_t^i = a_0 X_t^i + a_1 X_{t−1}^i + ... + a_p X_{t−p}^i

where a_0 + ... + a_p = 1 is imposed, so that the mean does not change.
Then, because the spectral density of the filtered process satisfies
f̃_i(0) = |T(0)|² f_i(0), and the transfer function satisfies
|T(0)| = |Σ_k a_k| = 1, it is found that δ_i does not change. Pre-smoothing,
effected by averaging inflows over several periods, decreases the variance
but simultaneously increases higher-order autocovariances; the combined
effect is that the effective bandwidth is unchanged. This is not too
surprising, since the effects of pre-smoothing are not really seen by a
very large buffer, and it is large buffers with which these effective bandwidths
are concerned.
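The invariance can be checked in the simplest case of an i.i.d. source passed through a filter whose coefficients sum to one. The closed-form autocovariance γ(j) = σ² Σ_k a_k a_{k+j} used here is a standard linear-filtering result, not a formula from the chapter:

```python
# Hedged check of the invariance claim for an i.i.d. source of variance sigma2
# smoothed by a filter with coefficients a summing to 1. The closed-form
# autocovariance gamma(j) = sigma2 * sum_k a_k a_{k+j} is a standard
# linear-filtering result, assumed here rather than taken from the chapter.

def smoothed_index(a, sigma2):
    p = len(a)
    gamma = [sigma2 * sum(a[k] * a[k + j] for k in range(p - j)) for j in range(p)]
    return gamma[0] + 2 * sum(gamma[1:])

# Averaging over four periods: the variance drops, but the index stays at sigma2.
print(smoothed_index([0.25, 0.25, 0.25, 0.25], 1.0))  # 1.0
```

The variance term γ(0) falls to σ²/4, but the positive lag terms grow to compensate, leaving the index, and hence the effective bandwidth, unchanged.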
This makes sense in various ways. It has the right dimensionality properties,
scaling correctly in time and in cells. It agrees asymptotically, as θ/B → 0,
with the effective bandwidths given in de Veciana et al [9].
9.4
CONCLUSIONS
This chapter has explained how insights arising from the theory of large
deviations can be used to make on-line estimates of the cell-loss rates arising
at the buffered switches of an ATM network.
The approach also suggests that the effective bandwidth of a bursty traffic
source can be computed as a function of the mean source rate and its index
of dispersion. These ideas are simple to implement, have worked well in some
simulations, and are presently receiving further development and refinement.
REFERENCES
1.
2.
3.
4.
5.
6.
7.
8. Chatfield C: 'The analysis of time series: theory and practice', Chapman and
Hall, London (1975).
9. de Veciana G, Olivier C and Walrand J: 'Large deviations for birth death Markov
fluids', Probability in the Engineering and Informational Sciences, 7, pp 237-235
(1993).
10
NEURAL NETWORKS
S J Amin, S Olafsson and M A Gell
10.1
INTRODUCTION
10.2
The neural network used in this study is the Hopfield model [8, 9]. It consists
of a high number of simple processing elements interconnected via the neural
weights. Due to the high number of neural connections, the Hopfield network
provides massive processing capabilities. At any moment in time each neuron
is described by two continuous variables, the neural activity level x_ij and the
neural output y_ij. These variables are related by the nonlinear monotonically
increasing processing function f:

y_ij = f(x_ij) ... (10.1)

In this work f is taken to be the sigmoid function:

f(x_ij) = 1 / (1 + exp(−βx_ij)) ... (10.2)

where β is the gain factor, which controls the steepness of the sigmoid
function, as illustrated in Fig. 10.1.
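As a small illustration of equation (10.2); the function and parameter names are ours:

```python
# Minimal sketch of the sigmoid processing function of equation (10.2);
# beta is the gain factor controlling the steepness of the transition.
import math

def f(x, beta):
    return 1.0 / (1.0 + math.exp(-beta * x))

print(f(0.0, 1.0))  # 0.5 for any beta; larger beta steepens the transition
```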
Fig. 10.1  The sigmoid processing function.
dx_ij/dt = −a x_ij + Σ_{k,l=1}^n T_{ij,kl} y_kl + I_ij ... (10.3)
T_{ij,kl} is the weight matrix which describes the connection strength between
the neurons indexed by (ij) and (kl). I_ij describes the external bias which can
be supplied to each neuron. Hopfield has shown that, for the case of
symmetric connections T_{ij,kl} = T_{kl,ij} and monotonically increasing
processing function, the dynamical system of equation (10.3) possesses a
Lyapunov (energy) function which decreases on the system's trajectories. The
existence of such a function guarantees that the system converges towards
equilibrium states which define point attractors for the dynamics. The
Hopfield energy function [8] is of the form:
E = −(1/2) Σ_{ij,kl} T_{ij,kl} y_ij y_kl − Σ_{i,j=1}^n I_ij y_ij + Σ_{i,j=1}^n A_ij ∫₀^{y_ij} f⁻¹(y) dy ... (10.4)

where the A_ij are positive constants. Setting A_ij = a for all i, j, the equation
of motion (10.3) can be written as:

ẋ_ij = −∂E/∂y_ij ... (10.5)
where the dot denotes differentiation with respect to time. From this relation
it can be derived that:
Ė = Σ_{i,j=1}^n (∂E/∂y_ij) ẏ_ij = −Σ_{i,j=1}^n ẋ_ij ẏ_ij = −Σ_{i,j=1}^n (df(x_ij)/dx_ij) (ẋ_ij)² ≤ 0 ... (10.6)
The inequality follows from the fact that the processing function is
monotonically increasing. Without the integral term in equation (10.4) the
time derivative of the energy function becomes:
Ė = −Σ_{i,j=1}^n ( Σ_{k,l=1}^n T_{ij,kl} y_kl + I_ij ) (df(x_ij)/dx_ij) ẋ_ij
  = −Σ_{i,j=1}^n (df(x_ij)/dx_ij) (ẋ_ij)² − a Σ_{i,j=1}^n x_ij (df(x_ij)/dx_ij) ẋ_ij ... (10.7)
10.3

x_ij = Σ_{k,l=1}^n T_{ij,kl} y_kl + I_ij ... (10.8)

The requests presented to the switch are described by the matrix:

Y = ( y_11 ... y_1n )
    (  :         :  )
    ( y_n1 ... y_nn )

whose entries are determined by the number of packets r_ij at input line i
requesting a connection to output line j:

y_ij = 0 if r_ij = 0,  1 if r_ij ≥ 1 ... (10.9)
In this formulation the rows represent the input lines, and the columns
represent the output lines. The above condition states that there could be
more than one packet in input line i requesting a transmission to the output
line j. Every index pair (ij) defines a connection channel. During each time
slot, only one packet can be permitted per channel. In the general case the
configuration matrix, which sets up the channel connections, can maximally
contain one non-vanishing element in each row and column. If there is a
queue at each input and a request for a connection to each output, then the
request matrix Y = (y_ij) is said to be full. This amounts to every column and
every row containing some non-vanishing entry. Furthermore, if there is more
than one non-vanishing entry in any row or column, then more than one
input is requesting to be connected to the same output. Since only one input
can be connected to one output at any time, the switching mechanism will
have to choose only one request at any time and force the rest not to be
connected. Also, the switching mechanism must not generate non-vanishing
entries in a column or row, which initially had zero entries. In the case of
a full request matrix the optimal switching requires the following mapping:
[y_1, ... , y_n] → configuration matrix ... (10.10)

in which the full request matrix is mapped on to a configuration matrix
containing exactly one non-vanishing element in every row and every column;
in general several such output configuration matrices are admissible for a
given input (request) matrix.
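The construction of the binary request matrix from packet counts, as in equation (10.9), can be sketched as:

```python
# Hedged sketch of equation (10.9): thresholding the packet counts r_ij into
# the binary request matrix (1 where at least one packet requests the channel).

def request_matrix(r):
    return [[1 if rij >= 1 else 0 for rij in row] for row in r]

print(request_matrix([[2, 0], [1, 3]]))  # [[1, 0], [1, 1]]
```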
10.4

From the analysis of the configuration matrices one can construct an energy
function for the switching problem. This function can then be compared with
the Hopfield energy function to find the resulting weight connection matrix
and the external biases. From the description in the previous section it is easily
established that an energy function for the switching problem [6] is given by:

E = (A/2) Σ_{i,j,l=1; l≠j}^n y_ij y_il + (B/2) Σ_{i,j,k=1; k≠i}^n y_ij y_kj + (C/2) ( n − Σ_{i,j=1}^n y_ij )² ... (10.11)
This energy function takes on zero values only for configuration matrices
that are solutions for a full request matrix Y (i.e. for an input matrix which
has at least one non-vanishing element in each row and one non-vanishing
element in each column). The last term on the right hand side of equation
(10.11) takes on positive values if the request matrix contains one or more
zero rows or columns. A comparison with equation (10.4) without the integral
term gives the following expressions for the weights and biases:
T_{ij,kl} = −A δ_ik (1 − δ_jl) − B δ_jl (1 − δ_ik) − C

I_ij = Cn + C/2 ... (10.12)
where δ_ij is the Kronecker delta. Substituting this back into the dynamical
equation gives:

dx_ij/dt = −a x_ij − A Σ_{l=1, l≠j}^n y_il − B Σ_{k=1, k≠i}^n y_kj + C( n − Σ_{k,l=1}^n y_kl ) + C/2 ... (10.13)
which is the desired differential equation for the switching problem. Under
this dynamical equation the neural activities are a dissipative process. They
will develop towards neural configurations which minimize the energy
function (equation (10.11)) and are therefore solutions to the switching
problem.
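A rough numerical sketch of these dynamics, using a simple Euler step for the full-request case. The parameter values, step size, random initialization and the +C/2 bias term follow our reading of equations (10.12) and (10.13) and are illustrative assumptions, not the chapter's simulation settings:

```python
# Hedged Euler-integration sketch of the switching dynamics (equation (10.13))
# for a full n x n request matrix. All parameter values are illustrative and
# chosen to satisfy 0 < C < 2(A + B); the +C/2 bias follows our reconstruction.
import math
import random

def crossbar_dynamics(n, A=1.0, B=1.0, C=1.0, a=1.0, beta=5.0,
                      dt=0.01, steps=4000, seed=0):
    rng = random.Random(seed)
    x = [[rng.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(n)]
    f = lambda v: 1.0 / (1.0 + math.exp(-beta * v))  # sigmoid of eq. (10.2)
    for _ in range(steps):
        y = [[f(x[i][j]) for j in range(n)] for i in range(n)]
        total = sum(map(sum, y))
        for i in range(n):
            for j in range(n):
                row = sum(y[i][l] for l in range(n) if l != j)  # A-term: same input
                col = sum(y[k][j] for k in range(n) if k != i)  # B-term: same output
                x[i][j] += dt * (-a * x[i][j] - A * row - B * col
                                 + C * (n - total) + C / 2.0)
    # Threshold the final outputs into a candidate configuration matrix.
    return [[1 if f(v) > 0.5 else 0 for v in row] for row in x]
```

With suitable parameters the thresholded output approaches a permutation-type configuration matrix, i.e. one selected connection per row and column.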
10.5
The fixed points x_0 of the dynamics are obtained by setting:

dx_ij/dt = 0 ... (10.14)

which, using equation (10.13), gives:

x_{0,ij} = −A Σ_{l≠j} f(x_{0,il}) − B Σ_{k≠i} f(x_{0,kj}) + C( n − Σ_{k,l=1}^n f(x_{0,kl}) ) + C/2 ... (10.15)

For a valid configuration matrix a neuron that is switched off has the
fixed-point activity:

x_{0,ij} = −A − B + C/2 ... (10.16)

which must be non-positive if the off state is to be stable:

−A − B + C/2 ≤ 0 ... (10.17)

A neuron that is switched on has the fixed-point activity:

x_{0,ij} = C/2 ... (10.18)

which must be positive:

C/2 > 0 ... (10.19)

Combining equations (10.17) and (10.19) gives the following bound on the
optimization parameter C:

0 < C < 2(A + B) ... (10.20)
10.6

SPECIAL CASE

In this section some general properties of the connection matrix are discussed,
such as the degree of connectivity it establishes in the network. A symmetric
connection matrix can be achieved by putting A = B, in which case the
condition in equation (10.20) reads:

0 < C < 4A ... (10.21)
10.7
SIMULATION RESULTS
10.7.1
0 < C < 2A ... (10.22)
Fig. 10.2  Energy against number of iterations.

Fig. 10.3  Neural outputs against number of iterations and neuron index.

Fig. 10.4  Energy against number of iterations.

Fig. 10.5  Neural outputs against number of iterations and neuron index.
10.7.2
Imposed attractor
Fig. 10.6  Result for simulations of 8 × 8 input matrix with A = B = 1250 and C = 100; all
neurons are changing according to the dynamics of the network.
Fig. 10.7  Result for simulations of 8 × 8 input matrix with A = B = 1250 and C = 100; non-requested neurons are forced towards the imposed attractor.
10.8
In this chapter a study of the stability and sensitivity properties of the Hopfield
dynamical model applied to a crossbar switch has been presented. The
dynamical equation (for the switching energy function) has been assessed
with respect to the setting of internal optimization parameters and
information about network stability and performance obtained. Various
bounds on the values of the optimization parameters were studied and
extensive simulations have been performed to verify these bounds. The role
of optimization parameters (A, B, C) and the role of an additional imposed
attractor (x = −2A) have also been demonstrated. Also, the use of the
random gain parameter β, as given by equation (10.2), is crucial to push the
network out of local minima in which the system may get trapped. The
approach established here for the application of the Hopfield dynamical
model to crossbar switching is relevant to other problems, e.g. resource
allocation, in which an optimization process represents a key step in reaching
a solution. Such examples are ubiquitous in telecommunications and
computational systems.
The simple approach to the determination of convergent optimization
parameters in the Hopfield dynamical model presented here has shown how
the introduction of an imposed attractor into the network enables the model
to be computationally operational for large random input matrices. This
capability has been exploited in the study of a switching problem, which was
previously handicapped by slow convergence of the network. It has also been
shown how improved computational speeds can be obtained.
REFERENCES
1.
2.
3.
4.
5.
6.
7.
Marrakchi A and Troudet T: 'A neural network arbitrator for large crossbar
packet switches', IEEE Transactions on Circuits and Systems, 36, No 7,
p 1039 (1989).
-
8.
9.
10. Aiyer S V, Niranjan M and Fallside F: 'A theoretical investigation into the
performance of the Hopfield model', IEEE Transactions on Neural Networks, 1,
No 2 (June 1990).
11
11.1
INTRODUCTION
Optical fibre transmission systems have now largely replaced their copper
forebears. This has been achieved by overlaying the copper pair and coaxial
systems with optical fibre, thereby realizing vastly increased repeater spacing,
smaller cable size, increased capacity and reliability, and orders of magnitude
in reduction of costs. Despite these radical changes, the approach to network
design, reliability and performance assessment has seen little change, merely
a scaling of the established copper techniques, figures and assumptions, with
minor modifications to accommodate the new family of optoelectronic
components. The validity of this is questionable since the move from copper
to glass has eradicated key failure mechanisms such as moisture and corrosion,
while at the same time the considerable increase in repeater spacings has
removed the need for power feeding and so has moved the reliability risks
towards surface switching stations. Present system and network models do
not reflect the full impact of these improvements or attach sufficient
importance to the novel features of this new technology.
The established approach to network management, maintenance, repair
and restoration is perhaps best described as curative, i.e. preventative or preemptive measures to improve overall performance are generally not adopted.
However, there is an increasing body of evidence that suggests such an
approach is possible. This is increasingly so with the performance enhancements realized by fibre systems that, in turn, allow the detection of fibre
11.2

RELIABILITY

11.2.1

General
Transmission technology has always been a rapidly changing field with new
generations of system evolving on an ever-shortening time scale. Within a
ten-year period two generations may be realized and deployed in very large
numbers, with an expected service life of 7-10 years. Even undersea cables,
traditionally a techno-cautious field, are now being upgraded with higher
capacity (5-10 times) systems in the same time frame. Not surprisingly,
therefore, the reliability equation is constructed with data of an immature
and often dubious pedigree. Never before has there been a time when the
statistical reliability data and evidence being accumulated for a system under
study has been overtaken by a new generation containing, in part at least,
some radically new technology - and this is likely to become the norm.
It is also worth noting that even when a technology pedigree has been
established over a long period of time, a reliability model is still likely to
succumb to relatively large prediction errors in either direction. Even with
the utmost of care, a system containing thousands of components may
experience the odd rogue that slipped through, or suspect batches installed
in error - stipulated storage and operating conditions exceeded for individual
components or sub-modules and complete systems, human intervention
introducing a progressive weakening of sub-systems and systems, the impact
of static discharge, electromagnetic radiation by man, acts-of-God, etc. The
task ahead is therefore difficult and complex and it is necessary to make a
number of fundamental and simplifying assumptions. Whilst the absolute
nature of the results presented here can be challenged, their relative accuracy
is sufficient for the purpose of comparison. Moreover, the results fall in line
with operational experience where available.
In this study six fundamental assumptions are made:
Whilst these assumptions are not strictly true they are sufficient to
construct a meaningful comparative model. It is also useful to make the
following additional assumptions based upon practical experience with real
systems:
all components are tested and proven to be within specification, and have
verified characteristics;
all elements can be allocated a mean time between failure (MTBF) that
is either computed or estimated based on field experience; availability then
follows as:

availability = MTBF / (MTTR + MTBF)
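The availability relation translates directly into code; the function names are ours:

```python
# Direct transcription of the steady-state availability relation
# A = MTBF / (MTTR + MTBF); function names are ours.

def availability(mtbf, mttr):
    return mtbf / (mttr + mtbf)

def unavailability(mtbf, mttr):
    return 1.0 - availability(mtbf, mttr)

print(availability(95.0, 5.0))  # 0.95
```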
11.2.2
Since 1969 the progress of integrated circuit technology has seen a doubling
of electronic circuit density each year. Whilst data/clock rates, power
consumption, and performance have improved at a more modest rate, they
have generally been sufficient to place transmission system development at
the leading edge. The key reason for this is that the high-speed elements of
transmission systems have required only a modest degree of integration,
whereas the lower-speed elements have taken advantage of significant
integration. By and large the field is dominated by silicon technology with
current bit rates reaching beyond 10 Gbit/s. The use of GaAs, although
capable of significantly higher speeds, is still relatively rare and may be
confined to optical devices and specialist amplifiers and logic elements. The
bulk of the electronic terminal and repeater technology therefore has a long
and traceable heritage through its silicon basis, affording some confidence
in device and component performance prediction. Furthermore, there is now
a reasonably solid body of evidence and data to be found in publicly available
handbooks of reliability data (e.g. HRD 4).
For the purposes of historical completeness this chapter presents reliability
analyses on copper systems (twisted pair and coaxial), multi-mode and single
mode fibre systems. From the data available, mean reliability figures have
been derived across European and North American systems for all system
types, including the now dominant 565/1800 Mbit/s plesiochronous digital
hierarchy (PDH) systems. The numbers of these systems deployed can be
counted in their thousands and thereby provide a good foundation for
extrapolations into higher order and future systems. For the synchronous
digital hierarchy (SDH) systems under development, a scaling from the
existing PDH systems has been combined with manufacturers' data and trial
results. In the case of optically amplified systems and the use of wavelength-division multiplexing (WDM) and wavelength-division hierarchies (WDH),
the best available data from reported system experiments and trials has been
assumed. In all cases, the objective has been to be balanced and consistent
in order to benchmark comparisons.
11.2.3
Transmission cables and line plant vary within and between countries and
comprise:
direct bury;
river, lake, sea and ocean crossings at depths of less than 1 m to greater
than 3 km, with or without armour.
fibres do not corrode and require far fewer joints because they can be
installed in longer continuous lengths;
the increased repeater spacing owing to the low loss of fibre has eradicated
the need for power feed conductors and buried/surface repeaters in
terrestrial systems, and is now only necessary on undersea systems that
exceed 150-250 km in length;
despite early fears, fibre technology has turned out to be more resilient
and easier to handle than its copper forebears;
fibre and copper, however, are equally susceptible to the spade and backhoe
digger.
It is important to differentiate between two kinds of link availability: one
is the link availability experienced by the customer, the other by the
operator. The two are different due to the nature of individual cable failures
and the use of N + 1 stand-by, or network protection/diverse routeing. To
understand this, consider these two extreme failure scenarios.
Only a single fibre between two end points is broken - to effect a repair
the operator has to take the entire cable out of service and insert an extra
length, so effectively a total outage is seen across the whole cable. Without
any circuit or network protection this is what the customer sees too.
However, if automatic protection is provided, the customer will only
experience a momentary break in transmission. The worst experience the
customer is subject to is when manual re-routeing is used and the MTTR
can extend to several minutes or even hours for remote locations.
All the fibres between two end points are simultaneously broken - without
circuit or network protection the customer sees the same outage
time as the operator. The last customer is restored when the last fibre
is spliced. If network protection is provided, it is only the operator who
is inconvenienced.
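The customer's view of a single failure can be caricatured in a few lines; the 50 ms automatic protection-switch time is purely an illustrative assumption:

```python
# Caricature of the customer-seen outage for a single cable failure: a
# momentary break if automatic protection switches traffic, the full repair
# time otherwise. The 50 ms switch time is an illustrative assumption only.

def customer_outage_seconds(mttr_seconds, protected, switch_seconds=0.05):
    return switch_seconds if protected else mttr_seconds

print(customer_outage_seconds(3600, protected=False))  # full repair time seen
print(customer_outage_seconds(3600, protected=True))   # momentary break only
```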
Table 11.1

Practice                    Failure mechanisms             MTBF (years)     MTTR
                                                           Fibre   Copper   (days)

buried earthenware ducts                                   10               <1

direct bury                 as above                                        <1

overhead on poles           as above + windage,                             <1
                            snow and ice,
                            high loads and cranes,
                            tree falls, road accidents,
                            buckshot

undersea, shallow depth                                    >15     15       <7
(non-armour)

undersea, deep lay          water ingress, corrosion,      >40     40       <20
                            sharks
Table 11.2

Practice                    Failure mechanisms             MTBF (years)     MTTR
                                                           Copper  Fibre    (days)

buried earthenware ducts                                   160     100      <1

direct bury                 as above                       40      25       <1

overhead on poles           as above + windage,            2.5              <1
                            snow and ice,
                            high loads and cranes,
                            tree falls, road accidents,
                            buckshot
11.2.4
Repeaters
When optical fibre systems were first being developed, pulse code modulation
(PCM) and coaxial digital systems had already matured and were in
widespread use in the network. Repeaters were placed at regular intervals
along the line to reshape, regenerate and retime (i.e. the 3R process) the signal,
thus ensuring a consistent performance over long lines that would otherwise
be impossible. A migration of that same technology into the optical regime
was the obvious next step. The first optoelectronic repeaters were realized
by merely introducing lasers and photodetectors into standard electronic
circuitry. The fact that fibre does not introduce the same degree of signal
distortion as experienced on copper meant that optoelectronic repeaters were
devoid of sophisticated equalization circuitry. With modern day optoelectronic repeaters it is possible to achieve repeater spacings well in excess of
100 km, although in practice spacings tend to be a more modest 30-50 km.
This contrasts significantly with the 1-2 km for coaxial systems using electronic
repeaters.
With the recent development of optical amplifiers, particularly erbium-doped fibre amplifiers (EDFA), repeaters will be all-optical in the signal path,
with a small amount of electronics retained purely for monitoring and
management functions. This technology will open up the full bandwidth of
fibre routes, effectively transforming them into transparent optical pipes with
almost unlimited bandwidth. This, in turn, will see the introduction of
wavelength-division multiplexing (WDM) with signal formats resembling
those of the frequency-division multiplexing (FDM) copper systems, albeit
in the optical regime.
Table 11.3 lists the FIT figures for the individual components in electronic,
optoelectronic and all-optical repeaters. These figures are shown through eight
eras of digital transmission technology, spanning the introduction of simple
PCM in the 1960s through to the optically amplified systems of the 1990s
and beyond. In each case the total FIT figure is converted into an MTBF
from which an unavailability figure is computed using the assumption that
the MTTR is constant at 5 hours. From Fig. 11.1 it is clear that the
introduction of power-supply duplication to offset the high FIT of DC/DC
converters, which were introduced on a large scale during the 1980s, has a
significant bearing on the unavailability of terrestrial systems. Undersea
systems benefit from a further improvement in unavailability by using power
feeding along cable conductors and doing away with individual DC/DC
converters.
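The FIT-to-MTBF-to-unavailability conversion used for the tables can be sketched as follows (1 FIT = one failure per 10⁹ device-hours; the 5-hour MTTR follows the text, the function names are ours):

```python
# Hedged sketch of the conversion behind Tables 11.3 and 11.4:
# 1 FIT = one failure per 1e9 device-hours; MTTR fixed at 5 hours per the text.

HOURS_PER_YEAR = 8760.0

def mtbf_years(total_fits):
    return 1e9 / total_fits / HOURS_PER_YEAR

def unavailability(total_fits, mttr_hours=5.0):
    mtbf_hours = 1e9 / total_fits
    return mttr_hours / (mttr_hours + mtbf_hours)
```

Summing the FIT figures of a repeater's components and applying these two functions reproduces the style of MTBF and unavailability entries in the tables.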
Full duplication of repeaters is seldom undertaken in practice as the overall
improvement in system reliability is negligible when compared with the risk
posed by the cable. It would also involve the added complexity of hot standby switching, sensing and control, which is not trivial.
11.2.5
Terminal stations
Terminal stations are the network nodes at each end of a long transmission
link and are effectively the interface point (i.e. switching centre, central office,
etc) to the local loop. Essentially they comprise repeaters (electronic or
optical), multiplexers (MUX) and switches, together with management and
control elements. Table 11.4 lists the FIT figures for the key components
across eight eras of digital transmission technology. In each case the total
FIT figure is converted into an MTBF and unavailability, again assuming
a constant MTTR of 5 hours. Figure 11.2 shows the unavailability is again
dominated by the risk associated with the power supply DC/DC converter.
Duplicating the power supply reduces the risk to well below that imposed
by the station battery and power system for remote locations. Again, full
duplication of terminal equipment in terrestrial systems is seldom undertaken
as the overall system advantage is negligible when compared with the risk
posed by the cable and repeaters. Furthermore, it involves the added
complexity of hot stand-by switching, sensing and control, which again is
not trivial but is less of a problem to deal with than in the repeater case.
Full terminal duplication is most commonly adopted on undersea systems,
since the highest achievable availability is warranted on such critical
international routes.
Table 11.3  Component counts, FIT figures, MTBF (years) and unavailability for 3R
repeater systems across eight eras of digital transmission technology: copper-pair PCM
(1.5/2 Mbit/s, 1960s), coaxial PDH (90/140 Mbit/s, 1970s), multi-mode fibre PDH
(6/8 Mbit/s, 1970/80s), single-mode fibre PDH (90/140 Mbit/s, 1980s; 0.56/1.8 Gbit/s,
1980/90s), single-mode SDH (2.4/10 Gbit/s, 1990s), and optically amplified lumped and
distributed systems (2.5/10 Gbit/s SDH, 1990s; 10+ Gbit/s WDH, 2000+).
Table 11.4  Component counts, FIT figures, MTBF (years) and unavailability for
multiplex terminal equipment across the same eras, from copper-pair PCM
(24/32 × 64 kbit/s) and coaxial PDH through multi-mode and single-mode PDH and
SDH to optically amplified WDM/WDH systems (2000+).
Fig. 11.1  Unavailability against technology era for repeater equipment.
Fig. 11.2  Unavailability against technology era for terminal station equipment.
11.3
In order to model circumstances in national and international routes, end-to-end
system lengths of 100 km, 1000 km and 10 000 km are assumed. For
each system length the model computes the correct number of line repeaters
for the technology era. For example, repeater spacings in the copper eras
were constrained to 2 km, whereas the later optical fibre eras readily
accommodate 50 km. In the case of a 100 km system length, the model
therefore invokes 49 repeaters for the copper eras, reducing to 1 for the fibre
eras. For 1000 km and 10 000 km systems these figures scale linearly. This
is implicit from this point on.
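The repeater-count rule above can be sketched as follows, assuming, per the text, terminal stations at both ends and an integer number of spans:

```python
# Hedged sketch of the repeater-count rule: a link of L km with repeater
# spacing s km needs L/s - 1 in-line repeaters, terminals occupying both ends.

def repeater_count(length_km, spacing_km):
    return length_km // spacing_km - 1

print(repeater_count(100, 2))   # 49, copper eras
print(repeater_count(100, 50))  # 1, fibre eras
```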
11.3.1
Terrestrial systems
Fig. 11.3  Unavailability against technology era for a 100 km terrestrial system
(line (duct), repeaters, and MUX with duplicated power supply).
Fig. 11.4  Unavailability against technology era for a 1000 km terrestrial system
(line (duct), repeaters, and MUX with duplicated power supply).

If, however, power-supply duplication is used, the repeaters and cable become
equally dominant with time. The MUX plays little part in the equation. The
reducing number of repeaters with time again accounts for the observed
improvement in their cascaded reliability.
As might be predicted, the reliability of 10 000 km systems is broadly
similar to that observed for the 1000 km case because the length-dependent
elements (i.e. cable and repeaters) dominate the reliability equation. So, once
again, Fig. 11.5 shows that the reliability of the cascaded repeaters dominates
throughout the technology eras if power supply duplication is absent, with
cable and MUX reliability having little influence. If, however, power-supply
duplication is used, the repeaters and cable become equally dominant with
time. Duplicating the power supply of the terminal MUX now has such a
marginal impact that it is arguably not worthwhile.
11.3.2
Undersea systems
Optical fibre undersea systems have been in service since the mid-late 1980s,
and this is reflected in the graphs presented in this section.
For 100 km undersea systems Fig. 11.6 shows very clearly that cable
unavailability always dominates, and this continues to be so for 1000 km
Fig. 11.5    [Unavailability (%) versus technology era for a 10 000 km terrestrial system: line (duct), repeaters and MUX (duplicated power supply).]
Fig. 11.6    [Unavailability (%) versus technology era for a 100 km undersea system.]
(Fig. 11.7) and 10 000 km (Fig. 11.8) system lengths. This observation is not a reflection on the quality of undersea cables; quite the contrary, it is a direct consequence of the unavoidably high MTTR. Clearly the time taken to dispatch a cable ship plus crew, locate the fault, recover the cable, effect the repair and replace the cable is significantly more than the corresponding time for terrestrial systems (compare Table 11.1).
Fig. 11.7    [Unavailability (%) versus technology era for a 1000 km undersea system.]
Fig. 11.8    [Unavailability (%) versus technology era for a 10 000 km undersea system: cable and repeaters.]
11.3.3
Fig. 11.9    [Unavailability (%) versus technology era with N + 1 stand-by, for N = 1 and N = 10.]
Fig. 11.11    [Unavailability (%) versus technology era.]
stand-by may have been the right solution in the early days of PCM when other critical factors were at play, but today, and more so in the future, it is clear that this approach wins no meaningful advantage over the significantly simpler use of power supply duplication. At a time when telcos are striving to increase network utilization, N + 1 stand-by is also now expensive in the broad sense. Realistically a worthwhile advantage can only be achieved by introducing network protection (diverse routeing), in which case power supply duplication could also be dispensed with.
11.3.4
Add-drops
Fig. 11.12    [Unavailability (%) versus technology era with duplicated power supply, for 0 to 10 add-drops.]
11.4
Fig. 11.13    [Unavailability (%) versus technology era with no duplicated power supply, for 0 and 10 add-drops.]
Fig. 11.14    [Unavailability (%) versus technology era for 0 and 10 add-drops.]
poles and wire drops. All of these are good targets for other utilities and
accidents generally. As a result the risks encountered are orders of magnitude
greater per unit length compared with the long-lines environment, but
fortunately the distances are short and so the risks are manageable.
In this analysis of reliability in the local loop, the same overall approach
has been adopted as was used thus far for long-distance systems, the primary
difference being the significantly shorter line lengths involved and the higher
risk of failure or damage to line plant. Not surprisingly it is found that the
overall balance between the various MTBFs and unavailabilities for the local
loop is dramatically shifted.
11.4.1
Route configurations
Fig. 11.15    [Local-loop route configurations by technology era: twisted pair from the local MUX (overhead, direct bury or buried duct), coax or twisted-pair feeds via pole-top concentrators, and, in the 1990/2000s, single-mode fibre to the home.]
11.4.2
Reliability
From Fig. 11.16 it is clear that overall unavailability in the local loop has
remained relatively constant through the eras. At first sight this would appear
inconsistent with the fact that reliability improvements are constantly being
realized and deployed. However, it must be remembered that, as the new
technologies are introduced, bringing the benefit of, for example, increased
capacity and new facilities, they also come with a price - an initial reduction
in effective reliability until the new technology matures. The early days of fibre are a classic example of this. The ensemble effect of this is a broadly
constant reliability over the eras, but it is important to note that this is in
concert with significant increases in capacity, system reach and performance,
and equally significant reductions in equipment and operating costs. Some
of these trends are illustrated in Fig. 11.17.
Comparing the overall reliabilities of the local loop and 100 km systems
(see Fig. 11.9) yields similar figures. The reliability of customer-to-customer
links (i.e. local loop + long line + local loop) of this approximate length
is therefore evenly distributed between the local loop and long line, which
is an optimum situation. However, for links in excess of 100 km it is found
that their reliability is dominated by the long-line portion. For example, these
results show that a 10 000 km route (compare Fig. 11.11) has a failure risk
today more than one hundred times that of the local loop. Does this mean
there can be complacency about the local loop in the knowledge that for many
traffic connections the reliability problems arise elsewhere? Most definitely
not, because it is vital that telecommunications is viewed as an end-to-end
Fig. 11.16    [Unavailability (%) versus technology era for local-loop configurations: CO <duct> LO <direct bury> <O/H> customer; CO <duct> LO <direct bury> customer; CO <duct> LO <duct> customer; CO <duct> customer (no LO).]
Fig. 11.17    [Trends in capacity, reach, performance and cost versus technology era.]
11.5
11.5.1
CABLE REPAIR
Throughout the history of cable transmission the philosophy has been to avoid
putting all your eggs in one basket, i.e. to distribute circuits across several
pairs, coaxial tubes and fibres. This approach was encouraged by technologies
that could support only a limited amount of traffic on individual bearers.
With the advent of the optical amplifier, this philosophy is about to be
changed in a most dramatic way. The ability to place all circuits on one
amplified optical fibre using WDM actually improves the overall circuit
availability. Consider the rationale behind this rather surprising statement.
Suppose a cable contains a number of parallel fibres each of which is carrying
an equal share of the total traffic in the cable. When the cable is hit, how
long does it take to repair? A fixed time to detect, locate, dispatch, and
prepare the cable for fibre splicing might be supposed - say 24 hours (it
would probably be considerably less, but this scenario is purposely being
pessimistic). Then the repair crew start to splice the individual fibres - say
15-30 minutes per fibre. The MTTRs and unavailabilities that follow from
this are listed in Table 11.5 as a function of the number of fibres in the cable,
and these are summarized in Fig. 11.18. For even a modest number of fibres,
say 200, the MTTR and unavailability increase rapidly. From the point of
view of the last fibre to be repaired, which could be your fibre, this is
unacceptable. Now consider the same cable breakage, but with all the traffic previously carried by those 200 fibres on a single fibre using WDM. The MTTR and unavailability are then only marginally above the best-case figures governed by the static time to detect, locate, dispatch and prepare the cable. To the customer as well as the operator this represents a significantly better availability.
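The arithmetic behind this argument can be sketched directly. The 24-hour static time and the 15-30 minute splice times are the figures assumed in the text; the 10-year cable MTBF used below is an assumption that reproduces the Table 11.5 unavailability figures closely:

```python
def cable_mttr_days(n_fibres: int, splice_min: float, static_h: float = 24.0) -> float:
    """MTTR: fixed detect/locate/dispatch/prepare time plus a per-fibre splice time."""
    return (static_h + n_fibres * splice_min / 60.0) / 24.0

def unavailability_pct(mttr_days: float, mtbf_days: float = 10 * 365.0) -> float:
    """Steady-state unavailability MTTR / (MTBF + MTTR), as a percentage."""
    return 100.0 * mttr_days / (mtbf_days + mttr_days)

for n in (1, 200, 1000):
    m30 = cable_mttr_days(n, 30)
    print(f"N={n:5d}  MTTR={m30:6.2f} days  U={unavailability_pct(m30):.3f}%")
```

With 1000 fibres at 30 minutes each the MTTR grows from about a day to nearly 22 days; concentrating the same traffic on one WDM fibre keeps the repair close to the static 24-hour floor.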
11.5.2
Table 11.5    Cable MTTR and unavailability figures for a given amount of traffic concentrated on N fibres.

                ---- 30 min splice per fibre ----    ---- 15 min splice per fibre ----
Number of       MTTR      Unavail-      Avail-       MTTR      Unavail-      Avail-
fibres          (days)    ability (%)   ability (%)  (days)    ability (%)   ability (%)
1                1.02      0.028         99.97        1.01      0.028         99.97
2                1.04      0.029         99.97        1.02      0.028         99.97
10               1.21      0.031         99.97        1.1       0.030         99.97
100              3.08      0.085         99.92        2.04      0.056         99.94
200              5.17      0.142         99.86        3.08      0.085         99.92
500             11.42      0.313         99.69        6.21      0.170         99.83
1000            21.83      0.598         99.40       11.42      0.313         99.69
Fig. 11.18    [MTTR (days) and unavailability (%) versus number of fibres (1 to 1000).]
level or error event activity, and then use the phase or path delay as a distance
calibration, to be able to accurately position the fault. It is also conceivable
that network protection switching could be invoked before the fibre break
interrupts service to the customer.
11.6
Failure prediction
Recent studies have shown that it is possible to differentiate between the causes of error bursts on the basis of the error-pattern statistics (see Chapter 12).
Further development of these techniques might enable discrimination between
events and alarms to the point where maintenance action can be focused and
explanations furnished automatically. Failure type, time and location
forecasting is an area now needing attention in order that network reliability
and performance be further enhanced.
11.6.3
Network management
mean number of reports per day ≈ N(N - 1)/(MTBF in days)
For example, a network of 500 000 switching nodes, each with an MTBF of 10 years, will suffer an average of 137 node failures per day and will generate an average of 68.5 million reports per day. The assumption in the above formula
that each node is communicating with all the others is, of course, somewhat
extreme. At the opposite extreme there is the least connected case, which
leads to:
mean number of reports per day ≈ N(N - 1)/(2 × MTBF in days)
which predicts that the reports still count in the millions. Whilst there are certain network configurations and modes of operation that realize fault report rates proportional to N, the nature of telecommunications networks to date tends to dictate a ~N² growth. Indeed, a large national network with thousands of switching nodes can generate information at rates of ~2 Gbyte/day under normal operating conditions. Maximizing the MTBF and minimizing N must clearly be key design objectives.
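The worked example above can be checked in a few lines. The N(N - 1) numerator for the fully connected case is inferred from the quoted figures of 137 failures and 68.5 million reports per day:

```python
N = 500_000          # switching nodes
MTBF_DAYS = 10 * 365  # 10-year MTBF per node, in days

# Expected node failures per day across the whole network.
failures_per_day = N / MTBF_DAYS

# Fully connected case: every other node reports every failure.
reports_per_day = N * (N - 1) / MTBF_DAYS

print(round(failures_per_day), round(reports_per_day / 1e6, 1))
```

The quadratic dependence on N is the essential point: halving the node count quarters the report volume.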
11.6.4
Software
Today's networks rely heavily on software for their management and service
provision operations, while the software itself is becoming more complex,
involving in some instances millions of lines of code (Fig. 11.19). Coupled
Fig. 11.19    [Software scale comparison in 'miles of code': telecommunications examples (ATM/SDH control centre, System-X exchange, small exchange, TXE-10 switch) alongside other examples (SDI, a nuclear reactor, the space shuttle, Encyclopaedia Britannica, the complete works of Shakespeare and one human's limit).]
with this is the fact that even minor errors in software, either in the base code or in its implementation, pose a considerable risk to network operation, as evidenced by recent outages, which may be quantified using the modified Richter scale in Fig. 11.20.
If the present trajectory in software development is maintained, the magnitude of the risk will grow exponentially into the future. In contrast, the reliability of hardware is improving rapidly whilst that of software is declining, so much so that sub-optimal system and network solutions are being seen. From any engineering perspective this growing imbalance needs to be addressed. If it is not, an increasing number of ever more dramatic failures can be expected. A number of technological developments hold promise that this trend can be checked - for example, optical transparency within the network, which utilizes very simple switching and control methodologies, and distributed intelligence, whereby network control becomes less centralized and hence less critical.
Fig. 11.20    [Modified Richter scale for network failures, ranging from local line faults (individual complaints, within normal contract) through local network and switch node failures (local and national press reports) to national and international network control failures (government concern and action, major economic disruption); example outages from references [1-5].]
11.7
PEOPLE

11.8
QUANTUM EFFECTS
All experience of systems and networks to date, coupled with the general
development of photonics and electronics, points towards networks of fewer
and fewer nodes, vastly reduced hardware content, with potentially limitless
bandwidth through transparency. With networks of thousands of nodes,
failures tend to be localized and isolated - barring software related events.
The impact of single or multiple failures is then effectively contained by the
'law of large numbers', with individual customers experiencing reasonably
uniform grade of service. However, as the number of nodes is reduced the
potential for catastrophic failures increases, with the grade of service
experienced at the periphery becoming extremely variable. The point at which
such effects become apparent depends on the precise network configuration,
control and operation, but, as a general rule, networks with less than 50 nodes
require careful design to avoid quantum effects occurring under certain
operational modes, i.e. a failure of a node or link today for a given network
configuration and traffic pattern may affect only a few customers and go almost unnoticed; the same failure tomorrow could affect large numbers
of customers and be catastrophic purely due to a different configuration and
traffic pattern in existence at the time. Caution should be exercised when
moving towards networks with fewer nodes, while at the same time increasing
the extent of mobile communications at their periphery.
11.9
CONCLUSIONS
for terrestrial fibre systems the repeater and cable risks are generally
dominant, whereas only cable risks dominate in undersea systems;
the reliability of long lines can be on a par with the future local loop
when diverse routeing is introduced;
check the validity of these assumptions and solutions would be well advised.
What was effective and appropriate yesterday may not remain so tomorrow.
This chapter has examined the reliability of optical transmission systems
and networks against a perspective of past, present and future technologies.
The challenge ahead is the realization of transparent optical networks requiring a minimum of software and human intervention. Such networks will ultimately be required to satisfy the future demands of mobile computing and communications on a global scale. Looking beyond this, new forms of fibre (e.g. those that give even lower loss than silica, and/or contain programmable structures to realize integrated signal processing) and networks must be found to create even more reliable solutions.
REFERENCES
1. Mason C: 'Software problem cripples AT&T long-distance network', Telephony, 218,
No 4, p 10 (January 1990).
2. Neumann P G: 'Some reflections on a telephone switching problem', Commun of ACM,
33, No 7, p 154 (July 1990).
3. 'SS7 errors torpedo networks in DC, LA', Telephony, 221/1 (1 July 1991).
4. 'DSC admits software bug led to outages', Telephony, 221/3, pp 8-9 (15 July 1991).
5. Davenport P: 'Scarborough returns to electronic dark ages', The Times, Issue 63864
(15 November 1990).
BIBLIOGRAPHY
Cochrane P, Heckingbottom R and Heatley D J T: 'The hidden benefits of optical
transparency', Optical Fiber Communication Conference (OFC'94), USA (February 1994).
Cochrane P, Heatley D J T et al: 'Optical communications - future prospects', IEE Electronics and Communication Engineering Journal, 5, No 4, pp 221-232 (August 1993).
Cochrane P and Heatley D J T: 'Optical fibre systems and networks in the 21st century',
Interlink 2000 Journal, pp 150-154 (February 1992).
Cochrane P and Heatley D J T: 'Optical fibres - the BT experience', Conference
on Fiber-optic Markets, Newport, USA (21-23 October 1991).
Cochrane P, Heatley D J T and Todd C J: 'Towards the transparent optical network',
6th World Telecommunication Forum, Geneva (7-15 October 1991).
Heatley D J T and Cochrane P: 'Future directions in long haul optical fibre transmission systems', 3rd IEE Conference on Telecommunications, Edinburgh, pp 157-164 (17-20 March 1991).
Hill A M: 'Network implications of optical amplifiers', Optical Fiber Communication
Conference (OFC'92), San Jose USA, paper WF5, p 1218 (2-7 February 1992).
Butler R A and Cochrane P: 'Correlation of interference and bit error activity in a digital transmission system', IEE Electronics Letters, 26, No 6, p 363 (March 1990).
12
PRE-EMPTIVE NETWORK
MANAGEMENT
R A Butler and P Cochrane
12.1
INTRODUCTION
Instead of waiting for systems to fail before taking remedial action, it might
be possible to remove systems from service by diverting traffic prior to failure.
Also, the likely failure mechanism might be identified from the transient
pattern of error events, facilitating rapid repair and restoration. If these objectives were achieved they would lead to extensive cost savings and network
performance enhancement. The basis for this alternative monitoring strategy
has been formulated from practical system experience and predicated on the total lack of adequate burst error models.
randomly generated errors have a negative exponential arrival statistic, which
can be demonstrated to be true. All other errors have been assumed to have
some form of compound Poisson arrival [1]. Whilst this might be true, no
one has been successful in formulating a general mathematical model that
fits anything but a small selection of the recorded error events from practical
networks. There are two possible explanations for this apparent difficulty. First, the statistics of individual bursts may be different depending on their origin, such as power transients, lightning, capacitor breakdown, human intervention, radio interference, etc. Secondly, the reported models generally
try to fit statistical distributions to the error signals produced after the line
decoding operation. It is contended here that decoding and retiming circuits
significantly distort the burst error statistics and further complicate the model.
202
12.2
¹ CCITT has now become ITU-T - the Telecommunication Standardization Sector of the ITU.
12.3
EXISTING MODELS
Berger and Mandelbrot [9] observed the clustering of error events and
noted their appearance to be characteristic of a process governed by a
Pareto distribution. They claim that the interval between successive errors
is statistically independent of earlier activity and attempted to fit this
model to measurements made on the German telephone network. The
model is reasonable for the data used, but considerable complications
need to be added to fully explain the data obtained by other researchers.
This has led to the rejection of this model [10, 11].
Bond [12] describes error bursts in terms of the gap distribution. Gaps g are counted from the start of one error to the start of the next, so the minimum value of g is 1 bit. Successive gaps form a sequence of not necessarily independent random variables.
Pullum [17] assumed that the occurrence of bursts can be described by a Poisson distribution with a parameter m1, and that the occurrence of errors within a burst can be described by a second Poisson distribution with a parameter m2. These are then combined into a single distribution, first described in 1939 [18], and known as Neyman's Type A (NTA) contagious distribution. The model attempts to fit a distribution to the errors emerging from measurements [1], but no account is taken of the order in which the error events arrive, and vital information is lost in respect of the physical cause of a burst.
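Pullum's compound construction is straightforward to sample: draw a Poisson number of bursts, then a Poisson number of errors within each burst. The parameter names m1 and m2 follow the text; the numeric values below are arbitrary illustrative choices:

```python
import random

def neyman_type_a(m1: float, m2: float, rng: random.Random) -> int:
    """One draw from Neyman's Type A: Poisson(m1) bursts, Poisson(m2) errors in each."""
    def poisson(mean: float) -> int:
        # Knuth's method: multiply uniforms until the product falls below e^-mean.
        limit, k, prod = pow(2.718281828459045, -mean), 0, rng.random()
        while prod > limit:
            k += 1
            prod *= rng.random()
        return k
    return sum(poisson(m2) for _ in range(poisson(m1)))

rng = random.Random(1)
samples = [neyman_type_a(3.0, 5.0, rng) for _ in range(20_000)]
print(sum(samples) / len(samples))   # should be close to m1 * m2 = 15
```

The sample mean approaches m1·m2, but as the text notes, the distribution alone discards the arrival order of the error events.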
204
12.4
MATHEMATICAL MODEL
The established models reported in the literature [9, 12, 13, 17] for burst
effects attempt to define a statistical distribution for the occurrence of errors.
This has limitations as the distribution is dependent on the cause of the error
burst and has, so far, defied all attempts at generalization. Many texts
concerned with random processes in communications stress the importance
of autocorrelation in distinguishing signals from noise. In terms of this work
the signal is the cause of the error burst and the autocorrelation function
of a burst is given by:
R(τ) = Σ_{t=0}^{N} f(t) f(t+τ),    τ = 0, 1, 2, 3, ..., N      ... (12.1)
where τ, t and T are quantized in terms of bit periods. In the general case
the power spectral density of an error burst is given by:
S(ω) = T sinc²(ωT/2) [R_0 + 2 Σ_{i=1}^{N} R_i cos(iωT)]      ... (12.2)
where:
the sinc² term gives the power spectral density of a burst its characteristic shape;
the cosine series introduces maxima and minima unique to each error
burst.
In order to study the variation of the power spectral density of error bursts
with system and interference wave form parameters, power spectral densities
need to be compared. To reduce the storage and computation necessary,
algorithmic compressions are required to produce metrics that adequately
describe the structure. Such metrics are relatively simple to compute and it
has been found that a suitable mechanism for this comparison can be based
on peak amplitude comparison of the terms in the cosine series. Standard
statistical metrics have been used to characterize the cosine term peak
amplitudes in both measured and simulated results:
mean, μ = Σ_{i=1}^{N} i R_i / Σ_{i=1}^{N} R_i      ... (12.3)

standard deviation, σ = [Σ_{i=1}^{N} R_i (i - μ)² / Σ_{i=1}^{N} R_i]^{1/2}      ... (12.4)

skewness = Σ_{i=1}^{N} R_i (i - μ)³ / (σ³ Σ_{i=1}^{N} R_i)      ... (12.5)

kurtosis = Σ_{i=1}^{N} R_i (i - μ)⁴ / (σ⁴ Σ_{i=1}^{N} R_i)      ... (12.6)
Examples of these have been calculated for sample error bursts, as shown
in Fig. 12.1.
These metrics have been found to be sufficient in practice as they produce
a unique set of values [19-21]. It must be remembered that a time-reversed
error burst will produce the same autocorrelation function and every error
burst has a palindrome. In practice, the likelihood of this occurring is
sufficiently small to be of no real significance and is merely recorded here
as a matter of completeness. Even if by some unlikely cause palindrome effects
turn out to be significant the necessary algorithm for their detection and
differentiation is trivial. Experience gained through simulation and
measurement has shown that the mean metric is best suited for comparative
purposes [22]. The variance, skewness and kurtosis may have some value
in separating bursts with similar mean metrics.
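Equations (12.3)-(12.6) can be applied directly to the autocorrelation terms of a burst. A minimal sketch, which reproduces the ten-consecutive-error row of Fig. 12.1:

```python
def acf(burst):
    """Autocorrelation R(tau) of a 0/1 error burst, tau = 0..len(burst)-1 (eq. 12.1)."""
    n = len(burst)
    return [sum(burst[t] * burst[t + tau] for t in range(n - tau)) for tau in range(n)]

def metrics(burst):
    """Mean, standard deviation, skewness and kurtosis of the ACF terms R_1..R_N
    (eqs (12.3)-(12.6); R_0 is excluded from the sums)."""
    r = acf(burst)[1:]
    total = sum(r)
    mu = sum(i * ri for i, ri in enumerate(r, 1)) / total
    def moment(p):
        return sum(ri * (i - mu) ** p for i, ri in enumerate(r, 1)) / total
    sigma = moment(2) ** 0.5
    return mu, sigma, moment(3) / sigma ** 3, moment(4) / sigma ** 4

# Burst 'eeeeeeeeee' of Fig. 12.1 - prints (3.67, 2.21, 0.57, 2.36).
print(tuple(round(x, 2) for x in metrics([1] * 10)))
```

The spaced burst e-e-e-e-e likewise yields a mean of 4.00 bits, matching its row in Fig. 12.1.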
Consider a very specific error burst containing N errors, each separated by an interval of I bits. Then:

mean = (N + 1) I / 3      ... (12.7)
CCITT Recommendation G.821 [7] defines system failure when the BER exceeds 10⁻³ for ten consecutive seconds. Assuming the errors occur at exactly 1000-bit intervals for a system operating at a bit rate of R bits per second, then:

mean ≈ 10R/3 + 1000/3      ... (12.8)
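The closed form (12.7) can be checked against a brute-force autocorrelation for regularly spaced errors; a small consistency check, not from the original text:

```python
def mean_metric(burst):
    """Mean of the autocorrelation terms R_1..R_N (eq. 12.3) for a 0/1 burst."""
    n = len(burst)
    r = [sum(burst[t] * burst[t + tau] for t in range(n - tau)) for tau in range(1, n)]
    return sum(i * ri for i, ri in enumerate(r, 1)) / sum(r)

# N errors separated by I bits: eq. (12.7) predicts a mean of (N + 1) I / 3.
for n_err, interval in ((5, 2), (4, 7), (10, 3)):
    burst = ([1] + [0] * (interval - 1)) * (n_err - 1) + [1]
    assert abs(mean_metric(burst) - (n_err + 1) * interval / 3) < 1e-9
print("eq. (12.7) agrees with the brute-force ACF mean")
```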
Fig. 12.1    Metrics for sample error bursts (e = error, - = error-free bit; NC = not computable; ACF sketches omitted):

error burst       mean (bits)   stand. dev. (bits)   skewness   kurtosis
eeeeeeeeee           3.67            2.21               0.57       2.36
eeeeeeee             3.00            1.73               0.58       2.33
eeeeee               2.33            1.25               0.59       2.27
eeee                 1.67            0.75               0.63       2.04
ee                   1.00            NC                 NC         NC
e-e-e-e-e            4.00            2.00               0.60       2.20
ee--ee--ee           4.47            2.50               0.28       2.08
eee---eee            4.13            2.47              -0.07       1.47
for systems operating at very low error rates, large mean metric values
result;
12.5
SIMULATION
This was repeated 400 times for a given set of parameters. Allowing for the published constraints of Monte Carlo simulation [23, 24], which involve 1000-10 000 runs, this number was shown to give acceptable confidence intervals. An element of batch processing was incorporated into the simulation, with a series of parameter sets specified and simulated to illustrate the variability of metrics with time. Each set of parameters has been termed a 'scene'. The visualization is based on a statistical analysis of the simulated results, yielding an average result and a 95% confidence interval for each scene.
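The per-scene statistics described here amount to a mean and a 95% confidence interval over the runs. A sketch: the normal-approximation 1.96 multiplier and the deterministic toy data are assumptions for illustration, not values from the chapter:

```python
def scene_summary(runs):
    """Average and 95% confidence half-width (normal approximation) for one scene."""
    n = len(runs)
    mean = sum(runs) / n
    var = sum((x - mean) ** 2 for x in runs) / (n - 1)  # sample variance
    half_width = 1.96 * (var / n) ** 0.5
    return mean, half_width

# 400 runs per scene, as in the text; toy data standing in for simulated metrics.
runs = list(range(400))
mean, hw = scene_summary(runs)
print(round(mean, 2), round(hw, 2))
```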
12.6
MEASUREMENT SYSTEM
Fig. 12.2    [Measurement system block diagram: an interference waveform is injected into the line system; code errors, bit errors, AIS and the recovered clock are captured by an IBM PC compatible.]
12.7
INTERFERERS
A t exp(-t/T)      ... (12.9)
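The 'peak and decay' waveform of eq. (12.9) rises to its maximum at t = T and then decays. A quick check; the amplitude and sampling grid are arbitrary illustrative choices:

```python
import math

def peak_and_decay(t: float, amplitude: float, tau: float) -> float:
    """Interferer waveform A * t * exp(-t / tau) of eq. (12.9)."""
    return amplitude * t * math.exp(-t / tau)

TAU = 100.0   # decay time constant in bit periods
samples = [peak_and_decay(t, 1.0, TAU) for t in range(1000)]
# d/dt [t e^(-t/tau)] = 0 at t = tau, so the peak falls at sample index TAU.
print(samples.index(max(samples)))
```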
12.8
12.8.1
... (12.10)
The results from the measurements and bit-by-bit simulation are illustrated
in Fig. 12.3. As the decay time constant increases from 100 bits, more cycles
of the interferer contribute to the error activity. This increase in error activity
leads to an increase in the value of the measured mean metrics. Once the
decay time constant reaches 600 bits all the cycles in the 320-bit duration
interferer are contributing to the error activity. Consequently, increasing the
decay time constant does not yield any significant increase in the value of
the mean metrics. The trends of the measured bit and code error mean metrics
are similar [21]. The bit-by-bit simulation result is similar, but by no means
a close match, to the measured results. The measured and Monte Carlo
simulation results for bit errors are compared in Fig. 12.4 and show good
agreement with Fig. 12.3.
An important feature of the measured and Monte Carlo simulation results
is the similarity between the bit and code error mean metric values for each
duration. In practice only code error metrics are available to predict the bit
error performance the customer is receiving. On the basis of these results
the code error mean metric appears to be a good indicator of the bit error
mean metric.
210
120
100
::'
80
2
:c
c5
'C
l/
a;
E
co
40
(])
E
c:
20
0
0
/'
.
.I
,"
..
60
.I
200
,"
"
,
,,
,
,
,,
,
,
,,
,
,,
{.
,,-,,
",
"
"
,,''
~""
"
"
----- simulation
........... measured bit errors
- . - measured code errors
400
600
800
1000
Fig. 12.3
Fig. 12.4    [Mean metric versus decay time constant (bits): measured versus Monte Carlo simulation.]
12.8.2
When the signal-to-noise ratio (SNR) is low the mean metrics have a high value and are rather unfocused, giving wider confidence intervals as the low SNR produces many errors. For larger SNR values the mean metrics begin to focus towards particular values as the number of errors reduces. The trends of the bit and code error mean metrics in Fig. 12.5 are again similar, with the code error mean metric having slightly lower values.
Fig. 12.5    [Mean metric versus signal-to-noise ratio: bit errors and code errors.]
Fig. 12.6    [Mean metric: bit errors and code errors.]
duration for a decaying sine wave interferer with the transmitter operating
in a nonlinear region of its characteristic [21].
Having been able to identify changes in the parameters of a decaying sine
wave interferer and the system, attempts were made to identify changes in
the parameters of a second type of interferer. Unfortunately, there is clearly
no correlation (Fig. 12.9) between the measured and simulated results. The
Monte Carlo simulation also reveals the same discrepancy. This problem is
compounded by the good correlation between the measured and simulated
results for a decaying sine wave interferer with HDB3 coded data, as shown
in Figs. 12.3 and 12.4. In order to identify the cause of this discrepancy the
measured results were studied in detail. It emerged that the error density within the burst caused by the decaying sine wave interferer is typically 21.5%
for bit errors and 16.5% for code errors. The error density for the peak and
decay interferer is 40.7% for bit errors and 18.9% for code errors. Bearing
in mind that the nature of the interferers is such that this error density will
affect both polarities of marks for the decaying sine wave interferer, but only
one polarity of marks for the peak and decay interferer, then the peak and
decay interferer is subjecting one polarity of marks to a far greater error
density than the decaying sine wave interferer.
Fig. 12.7    Metrics for HDB3 coding relative to interferer duration for a linear transmitter.
Fig. 12.8    Metrics for HDB3 coding relative to interferer duration for a nonlinear transmitter.
Fig. 12.9    [Mean metric versus duration (bits) for the peak and decay interferer: simulation, bit errors and code errors.]
dictated by the received marks. When the DSV is out of bounds the decoder
is assumed to be functioning incorrectly. An equivalent of AIS is injected,
causing the suspension of bit and code error recording. During periods of
AIS the bit and code error interval counters continue to increment, awaiting
the recovery of the decoder.
Results from this modified Monte Carlo simulation are presented in
Figs. 12.10 and 12.11. For the decaying sine wave interferer (Fig. 12.10),
the results follow a similar trend to the measurements and simulation. In
the peak and decay case (Fig. 12.11) the modified Monte Carlo simulation
is now reacting to the high error density by introducing AIS, and the mean
metric value is consequently reduced and now resembles the measured results.
Decoder behaviour has thus been identified as the major cause of discrepancy.
The Monte Carlo simulator was also modified so it could react in a similar
way to a practical decoder. Fine tuning of this modification to include decoder
delays was attempted but with little effect on the metric values.
The original simulation, based on mark error probability, is a record of what actually happened, and of what could be achieved by way of metric variation if the decoding process could cope with the error burst densities involved.
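The decoder modification described above can be caricatured in a few lines. The DSV bound and the sample sequences below are illustrative assumptions; real HDB3 decoder behaviour is considerably more involved:

```python
def recorded_errors(dsv_trace, error_trace, bound=3):
    """Count errors, suspending recording (AIS) while the DSV is out of bounds.

    dsv_trace  : running digital sum value per bit period
    error_trace: 1 where an error occurred, else 0
    """
    recorded = 0
    for dsv, err in zip(dsv_trace, error_trace):
        if abs(dsv) <= bound:          # decoder assumed to be working
            recorded += err
        # while AIS is asserted the interval counters keep running,
        # but bit and code errors are not recorded
    return recorded

dsv = [0, 1, 2, 4, 5, 4, 2, 1, 0]
errors = [0, 1, 1, 1, 1, 1, 1, 0, 0]
print(recorded_errors(dsv, errors))    # out-of-bounds errors are suppressed
```

This reproduces the qualitative effect seen in Fig. 12.11: dense bursts push the decoder out of bounds, and the recorded error count, hence the mean metric, falls.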
Fig. 12.10    [Mean metric versus decay time constant (bits) for the decaying sine wave interferer, using the modified Monte Carlo simulation.]
Fig. 12.11    [Mean metric for the peak and decay interferer, using the modified Monte Carlo simulation.]
12.9
The 5B6B block code [31] is well documented and, in contrast to the HDB3 case, the decoder action is precisely defined. Consequently a good model of decoder action is possible. For all the simulations the SNR = 6 (i.e. the background bit error ratio ≈ 10⁻⁹), and the decision threshold was set to its nominal position. A set of 100 simulation runs is presented for each scene. It is assumed that the decoder remains in alignment during the error burst, which is reasonable for the interferer durations studied here. Griffiths [31] quotes the mean total realignment time of a 5B6B decoder to be 750 bits.
12.9.1
The results for the peak and decay interferer with varying decay time constant
are illustrated in Fig. 12.12 and show a mean metric that follows the now
familiar trend with a tight 95% confidence interval, even for this limited
number of runs.

Fig. 12.12  Metrics for 5B6B coding for a peak and decay interferer with varying decay time constant. [figure: mean metric versus decay time constant; curves for bit errors, code errors and scaled code errors]

It should be noted that the code error indications are based
on words. The code error mean metric axis requires a scaling factor of 5 to
be applied for comparison with the bit error mean metric axis. When this
scaling is applied, the mean metric is found to have similar values for both
bit and code errors. It is therefore used in all further results for 5B6B coding.
As the decay time constant of the decaying sine wave interferer increases,
more cycles of the interferer contribute to the error activity, until all the cycles
in the 320-bit duration are contributing to the error activity, at which point
no further increase in the mean metric is possible. The mean metrics have
a tight 95% confidence interval even for this limited number of runs
(Fig. 12.13).
As the duration of the decaying sine wave interferer increases, more
cycles of the interferer contribute to the error activity. The mean metrics have
a tight 95% confidence interval (Fig. 12.14).
12.9.2
The results from the Monte Carlo simulation of the peak and decay interferer
with varying decision threshold offset are illustrated in Fig. 12.15. As the
decision threshold offset sweeps from −100% to +100% the mean metrics
focus towards a particular value and then defocus. The most focused set of
results coincides with the minimum values for the mean metrics at an offset
of 0%. In this region little, if any, error activity is produced.

Fig. 12.13  Metrics for 5B6B coding for a decaying sine wave interferer with varying decay time constant. [figure: mean metric versus decay time constant; curves for bit errors and code errors]

Fig. 12.14  Metrics for 5B6B coding for a decaying sine wave interferer with varying duration. [figure: mean metric versus duration, bits; curves for bit errors and code errors]

Fig. 12.15  Metrics for 5B6B coding for a peak and decay interferer with varying decision threshold offset. [figure: mean metric versus decision threshold offset, %; curves for bit errors and code errors]
As the decision threshold offset sweeps from −100% to +100% the mean
metrics focus towards a particular value and then defocus (Fig. 12.16). The
most focused set of results coincides with the minimum values for the mean
metrics at an offset of +40%. In this region the effect of the interferer
is counteracted by the decision threshold offset and little error activity results.
Here the scaling of the code error axis by a factor of 5 does not give such
a convincing correlation between bit and code error mean metrics at the
extremes of the decision threshold offset. This can be explained by the wide
spread of metric values that are produced by the high number of background
errors introduced by the larger decision threshold offset.
Fig. 12.16  Metrics for 5B6B coding for a peak and decay interferer with varying decision threshold offset. [figure: mean metric versus decision threshold offset, %; curves for bit errors and code errors]
12.10
CONCLUSIONS
A mechanism has been devised which can detect changes in interferer and
system parameters from the error activity they induce in a transmission
system. In particular, variations in interference parameters such as duration
and decay time constant can be detected, as can variations in system parameters
such as SNR, decision threshold offset and linearity.
The metrics produced behave in similar, almost identical ways for both bit
and code error activity. This enables the use of code error metrics as 'in-service'
indicators of bit error performance. Decoder actions have also been
shown to be critical, and under high error-density conditions can inhibit true
error detection and metric determination. An 'in-service' detection of system
ailments and changes from code-error activity has therefore been demonstrated,
and this may allow the cause of impending failures to be predicted
from the transient patterns.
Identifying the cause of an error burst relies only on the decay and
duration times plus the form for all transient interferers.
REFERENCES
1. Berger J M and Mandelbrot B: 'A new model for error clustering in telephone
circuits', IBM Journal of Research and Development, 7, Part 3, pp 224-236
(1963).
13
EVOLVING SOFTWARE
C S Winter, P W A McIlroy and
J L Fernandez-Villacanas Martin
13.1
INTRODUCTION
13.2
BIOLOGICAL EVOLUTION
In 1859 Darwin first described how 'evolution' could lead to the formation
of new species of animals [1]. Darwin's model of evolution required a diverse
population of organisms that competed to obtain sufficient resources to
reproduce. Those organisms that obtained the necessary resource had to be
able to pass on to their offspring information on the strategy that they had
used. Evolution has come to be associated with the phrase 'survival of the
fittest', from the struggle for resource. This conceals, however, the importance
of several other elements of the evolutionary model. The four key elements
are - competition, diversity, selection and reproduction.
13.3

13.3.1
Genetic algorithms
Genetic algorithms were first described by Holland [4] in his seminal work
in the field. A genetic algorithm consists of a linear string of symbols. The
length of the string is fixed. The symbols represent possible functions or states
of the system. In the simplest form the string consists of binary symbols and
each position in the string represents one of two possible functions or values
that the system may possess. The string thus represents the 'genotype'.
Each string is evaluated to see how effective the particular combinations
of values or functions it represents are at tackling the given problem. The
success of the string at tackling the problem is expressed by a numerical value
- its 'fitness'. Conversion of the string symbols to their respective functions
or values is equivalent to the conversion of genotype to phenotype in biology.
The fitness function then represents success in competition and reproduction
- a high fitness implies more progeny in the next generation. An example
that has been tackled in this way is the travelling salesman problem. The
string might encode the order of the cities to be visited and the fitness would
be related to the distance travelled on that particular journey.
After the assignment of fitness two strings are selected for replication.
The chances of selection are related to the string's 'fitness' although the precise
selection mechanism differs between the various implementations of genetic
algorithms. The two parent strings produce a child by means of the 'crossover'
operator. Figure 13.1 shows how the simplest crossover operator works. First,
a point is chosen on one parent string - all the string values before this point
are passed to the offspring, and all the string values after the equivalent point
on the other parent are passed on. More sophisticated crossover operators,
such as two-point selection, have also been studied. Mutation is typically a
secondary, low-frequency event which inverts the binary value at a single
point on the string. This process of selection, crossover and mutation is
repeated until there are the same number of children as parents, at which
point all the parents are killed.
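The selection, crossover and mutation loop described above can be sketched in a few lines of Python. The one-max fitness function, population size and mutation rate below are illustrative choices for demonstration, not parameters taken from any system described in this chapter:

```python
import random

def fitness(bits):
    # Toy fitness: count of ones ('one-max'); a real problem would first
    # decode the string into the functions or values it represents.
    return sum(bits)

def select(pop):
    # Fitness-proportionate (roulette-wheel) selection.
    total = sum(fitness(s) for s in pop)
    r = random.uniform(0, total)
    acc = 0.0
    for s in pop:
        acc += fitness(s)
        if acc >= r:
            return s
    return pop[-1]

def crossover(p1, p2):
    # Single-point crossover as in Fig. 13.1: values before the point come
    # from one parent, values after it from the other.
    point = random.randrange(1, len(p1))
    return p1[:point] + p2[point:]

def mutate(bits, rate=0.01):
    # Low-frequency secondary event: invert a bit with small probability.
    return [b ^ 1 if random.random() < rate else b for b in bits]

def generation(pop):
    # Produce as many children as parents; the parents are then discarded.
    return [mutate(crossover(select(pop), select(pop))) for _ in pop]

random.seed(0)
pop = [[random.randint(0, 1) for _ in range(16)] for _ in range(30)]
for _ in range(40):
    pop = generation(pop)
best = max(pop, key=fitness)
```

Under this scheme a high-fitness string leaves more progeny in the next generation, which is the role the fitness function plays in competition and reproduction.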
Fig. 13.1  [figure: single-point crossover; a crossover point splits parent 1 and parent 2, producing the child]
The great strength of the genetic algorithm lies with this crossover
operator. Holland showed that the search for possible solutions actually
proceeds much faster than the number of individuals in the population might
indicate. This is because each string represents not just itself, but a whole
family of related strings where each binary position can be replaced by a
'#' or 'don't care' symbol. Thus the string '1100' also represents the solutions
'#100', '#1##', etc. The fitness of any string is effectively the sum of the
fitness of each of these representations or 'schemata'. The crossover operator
works by maximizing the frequency of the best schemata in the population.
Since each string represents many schemata (for a 4-bit string, 2⁴ = 16
schemata), the search proceeds much faster than might first appear. This
'schema' theory has explained why such evolutionary processes are more
efficient than a random search. An excellent introduction and explanation
of it is given in Jones [9] and a detailed mathematical analysis in Holland
[4]. Unfortunately schema theory can only be applied to strings of fixed
length. Clearly for some problems the form of the solution is known but
not its exact parameters (in the travelling salesman problem, for instance,
the number of cities is known but not the order). Genetic algorithms work
well on these problems. However, for many problems neither the form nor
the parameters of the solution are known. For these problems genetic
algorithms are largely useless. Despite this limitation genetic algorithms have
been successfully used in an enormous range of optimization problems and
are a well-proven technique, demonstrating that evolutionary approaches can
be used to tackle problems where other heuristic techniques fail.
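The schema counting is easy to verify directly: replacing each position of a string with either its own value or the '#' symbol enumerates every schema the string is an instance of, giving 2^L schemata for a string of length L. A short illustrative check:

```python
from itertools import product

def schemata(string):
    # Every schema matching the string: each position either keeps its
    # own value or is replaced by the '#' (don't care) symbol.
    out = []
    for mask in product([False, True], repeat=len(string)):
        out.append(''.join('#' if m else c for m, c in zip(mask, string)))
    return out

# The 4-bit string '1100' from the text represents 2^4 = 16 schemata.
s = schemata('1100')
```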
13.3.2
Genetic programming
(T1 - (T2 + T1)) + (If ((T3 + T2) <= T1) T3 else T2)
Fig. 13.2
Fig. 13.3
Genetic program crossover operator. The short arrows mark the place the crossover
operator breaks the chains into two sub-trees.
breaks most of the requirements of the schema theory (fixed length, positional
relationship, etc).
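The sub-tree crossover of Fig. 13.3 can be sketched with parse trees held as nested lists. The trees, terminal names and helper functions below are illustrative, not taken from any particular genetic programming package:

```python
import random

# Parse trees as nested lists: [operator, child, child] or a terminal string.

def nodes(tree, path=()):
    # Enumerate the path to every node in the tree.
    yield path
    if isinstance(tree, list):
        for i, child in enumerate(tree[1:], start=1):
            yield from nodes(child, path + (i,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, subtree):
    # Return a copy of the tree with the node at 'path' swapped for 'subtree'.
    if not path:
        return subtree
    copy = list(tree)
    copy[path[0]] = replace(copy[path[0]], path[1:], subtree)
    return copy

def crossover(parent1, parent2):
    # Pick a random node in each parent and graft the sub-tree taken from
    # parent 2 into parent 1, as marked by the arrows in Fig. 13.3.
    cut1 = random.choice(list(nodes(parent1)))
    cut2 = random.choice(list(nodes(parent2)))
    return replace(parent1, cut1, get(parent2, cut2))

random.seed(1)
p1 = ['+', ['-', 'T1', 'T2'], 'T3']
p2 = ['*', 'T2', ['+', 'T1', 'T3']]
child = crossover(p1, p2)
```

Note that the child is always syntactically valid, which is exactly the point raised later in the chapter: validity is guaranteed, meaningfulness is not.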
13.3.3
Genetic algorithms are not creative - they can only produce solutions that
lie in the region bounded by the largest and smallest numbers the strings can
represent. They do not generate new functions. Genetic programs can generate
new functions - by combining elements from the original set to produce
functional sub-trees, although an inappropriate selection of functions at the
start can inhibit or even stop the evolutionary process. Neither genetic
algorithms nor genetic programs can be used if the desired outcome is a group
of programs that co-operate with each other to tackle the task. A radically
different approach to evolving systems has been pioneered by Ray [7] and
Skipper [8]. They attempted to model biology directly rather than the
abstractions used in genetic algorithms and programs. Despite this, they offer
insights into how evolving software might develop and techniques that might
be included into the more conventional approaches. Here the Tierra model
13.3.4
select a herd of horses, race them, discard the losers and breed from the
winners, and then repeat this process until one horse runs the race
sufficiently quickly for the original purpose (= selective breeding);
select a nice open grassy plain like the Serengeti (food but no cover),
populate it with lions, release a bunch of assorted animals and wait (a
few million years), and then come back and select those animals that
have evolved to escape from the lions (= open-ended evolution).
The first method is how genetic algorithms work. You know what you
want, you formulate the problem precisely and you already know the general
form of the solution. It is efficient but unoriginal, and requires considerable
pre-simulation knowledge. The second method is how artificial life techniques
work. You create a general environment that represents the nature of the
problem you want to solve. Then you sit back and see what comes out. It
might be a fast horse; it might be a rhinoceros; both solve the problem as
expressed! The artificial life approach is less controlled but correspondingly
more creative. It can produce solutions of a form never thought of or
anticipated. Such simulations may be a first step to producing originality
and innovation with a computer. The next section describes some of the
frustrating creativity that artificial life programs can demonstrate.
13.4
The chapter has concentrated on how genetic programming and artificial life
techniques work. These both seem more promising than genetic algorithms
for program writing because they are designed from the beginning to evolve
programs of variable length. Genetic algorithms generally use fixed-length
strings. Although variable string length implementations exist [10, 11], their
theoretical basis has yet to be demonstrated. Such implementations fit uneasily
into the schema theory, thus largely negating the advantages of the genetic
algorithm approach.
13.4.1
Genetic programs
Koza has used genetic programming to tackle a range of problems [6]. Koza's
method has been applied to a number of problems using a C implementation
of this technique [12] and a series of trials of the techniques has begun. The
problems tackled fall under the heading of 'pattern recognition'. They consist
of trying to find patterns in an apparently chaotic or very noisy data set and
trying to identify generalized features in a pattern. Curve fitting, trends in
noisy data and feature recognition all represent complex problems that might
be solved using fairly short evolved sequences of code. Tackett [12] has
already shown that genetic programming can produce efficient pattern
recognizers - evolving a program that identified tank targets from noisy
IR detector data. Whilst similar work can be done with neural networks,
Tackett [13] found that the genetic programs outperformed the neural nets.
The genetic programs were found to be remarkably good at curve fitting,
quickly deriving a formula very close to that originally used to generate a
test set of data points. Similarly, faced with an experimental time series, the
genetic program evolved an algorithm that predicted the subsequent behaviour
of the data series better than that obtained through simple linear regression
analysis. Now, programs are being evolved that, hopefully, will recognize
facial features by deriving a suitable algorithm from a set of test faces.
Problems like pattern recognition, where it is difficult for a human
programmer to know where to start, are particularly appealing to evolving
software practitioners. Although it is important to choose a suitable initial
function set (i.e. it is necessary to specify the set of operators [+, -, *, /,
IF, LOG, etc] available to the GP), it is not necessary to specify in advance
how the GP will combine the operators it uses. The evolution time required
for the first two problems has been of the order 1-10 hours and thus within
the power of current machines. The resulting programs have possessed less
than 100 nodes and leaves.
There is a well-developed theory to explain why the crossover operator
in genetic algorithms works efficiently (see the schema theory in section 13.3).
Koza shows that crossover, rather than mutation or random search, is also
the key to the speed of genetic programming [6]. However it is not clear
why the genetic program crossover operator should be an efficient search
tool. At a first examination a randomly selected sub-tree is unlikely to improve
another tree when it is grafted on. Although the sub-tree is physically
unchanged, one would expect its change in location to change its meaning.
So, although the new tree is likely to be syntactically valid, one would not
expect it to produce a meaningful result. The explanation lies both in the
nature of the evolutionary process and in the concept that the meaning of
a sub-tree depends little upon where it is moved.
Examining an evolved program it is quickly noticed that many branches
of the tree are similar. Structures have developed that provide general
problem-solving tools that are hierarchically arranged. This can be pictured
for pattern recognition as the development of crude classifiers that operate
by progressively filtering their output upwards through a tree of similar crude
classifiers. The tree is then self-similar. At whatever level the tree is viewed,
the same classifier structure exists. This has great evolutionary benefits, since
there is only one basic structure that evolves in a variety of patterns and is
modified slightly to fine-tune the structure. Now any sub-tree has a similar
meaning when it is grafted on to another tree. The evolutionary process drives
the trees in this direction because such 'self-similar' trees will always have
output reasonably similar to their parents. Trees whose offspring are
unpredictable are likely to produce a high ratio of failed children which will lead
in the long run to their elimination from the pool of possible solutions. Thus,
a 'self-similar' solution offers many evolutionary advantages and will tend
to arise regardless of the function set put in initially. It is interesting to
disassemble such programs and observe what basic functional units have
evolved.
The great advantage of genetic programming over other techniques is that
the evolved parse trees can often be simplified and presented in a manner
that a human programmer can understand. This 'understandability' and ease
of relationship to, in particular, standard programs makes the technique more
user-friendly and easier to integrate into standard programming environments. Thus, genetic programming might soon become a standard tool
for developing small, but complex, pieces of code in normal development
procedures.
13.4.2
Tierran simulations have shown that the programs can increase the speed
of their self-copy algorithms by a factor of six when the simulator is run
overnight. They have achieved these improvements by significant alterations
to their original programs, such as 'unrolling the loop' [7]. Thus, Tierra
might be a good tool for the optimization of application programs, and may
be particularly useful for the programming of massively parallel machines.
However, the only problem tackled by the Tierran organisms described in
the literature is that of efficient self-reproduction. So the question remains
- can Tierran programs do something else apart from reproducing?
Tierra has been altered so that the creatures can be forced to solve practical
problems rather than just reproduce. The original instruction set was modified
so that programs could read and write from I/O buffers. These buffers can
be used by the programs to communicate directly with other creatures or
with the user. The user places the problem in the input buffer and the creature
replies via the output buffers. The slicer queue was modified so that creatures
communicating to the user were rewarded with extra CPU time according
to how well they performed the task.
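The reward mechanism might be pictured as follows. This is a hypothetical sketch rather than the actual Tierra slicer code: the creature records, slice sizes and scoring function are invented for illustration, but the principle is the one described above, a minimal slice for every organism plus extra CPU time in proportion to task performance:

```python
from collections import deque

BASE_SLICE = 10      # minimal instructions granted even to non-co-operators
REWARD_SLICE = 100   # extra instructions per unit of task score

def run_slicer(creatures, score_task, cycles):
    # Round-robin slicer queue: each cycle the creature at the head is
    # granted CPU time according to its task score, then requeued.
    queue = deque(creatures)
    for _ in range(cycles):
        creature = queue.popleft()
        grant = BASE_SLICE + int(REWARD_SLICE * score_task(creature))
        creature['cpu'] += grant
        queue.append(creature)

def score(creature):
    # Second task from the text: drive all 32 bit positions of the answer
    # to one; the score is the fraction of bits achieved.
    return bin(creature['answer']).count('1') / 32

creatures = [{'name': 'solver', 'answer': 0xFFFFFFFF, 'cpu': 0},
             {'name': 'cheat', 'answer': 0, 'cpu': 0}]
run_slicer(creatures, score, cycles=10)
```

The 'cheat' creature still accumulates CPU time from the minimal slice, which is exactly the loophole the Tierran organisms exploited: reproduction on the minimal grant can be a better strategy than solving the task.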
The first problem fed to the Tierran creatures was a simple maze-running
problem, the intention being to reward those creatures that learnt to follow
the maze quickly. The second problem was to evolve a general algorithm
for converting any input 32-bit integer so that all the bit positions were unity.
The results were both disappointing and surprising. In both cases the
organisms found efficient ways to 'cheat'. In the maze-running problem, the
programs ignored the problem and indeed evolved maze-running software
out of their system altogether. Then, instead, they optimized their
reproduction loop using the minimal CPU time permitted to organisms which refused
to co-operate. It was necessary to 'feed' the bugs a minimal time, even if they
were not tackling the problem, in order to let the process of evolution start.
Viewed from the organism's perspective the problem is how to use the CPU
resource available to reproduce efficiently. Perhaps the environment failed
to provide sufficient reward and they found a creative solution to reproducing
efficiently in the minimal time available. Clearly the bugs found that evolving
a good maze-running algorithm in order to gain more resource was not a
satisfactory evolutionary strategy in this environment. The slicer implementation was altered further and the second problem fed to the simulator. This
time the programs kept intact the low-quality algorithm initially embedded
in their code but evolved their reproductive code much more rapidly than
their bit-manipulation code. These programs were clearly more concerned
about reproduction than task solving. From the human perspective they
'cheated' and exploited loopholes in the environment to maximize their reproductive success. Although artificial life programs are particularly good at
this, any evolutionary system, be it genetic algorithm/programming or Tierra,
will exploit any available loopholes in the fitness functions where it enhances
its reproductive success. It is very easy to anthropomorphize when describing
the apparent behaviour of these programs and speak in general terms of their
'behaviour'.
Apart from going their own way, Tierran programs have a further
disadvantage compared with genetic programming: it can be extremely difficult to
disassemble a creature to work out how it is functioning and even more
difficult to analyse how the whole environment is operating.
Despite these difficulties some progress has been made with C-zoo.
Evidence of programs co-operating to tackle a task has been observed by
altering the interactions controlling the creatures in the C-zoo. Co-operation
between programs is difficult to evolve in genetic algorithms or genetic
programming because the evolving programs do not directly interact with
each other, and the interactions are limited to a rather remote selection
procedure. In C-zoo and Tierra, interactions through templates left in the
memory allow a wide range of competitive and co-operative behaviour. It
would be interesting to maintain this feature whilst making the system more
task-oriented.
13.4.3
Hermes
Fig. 13.4  Hermes classifier system. A sensor can detect messages either from the environment or its own internal list and place them back into either register. [figure: rules shown as condition/action bit-string pairs, e.g. 11001100, 11110010, 11110000, 10101010, 10000001, linking sensor conditions to effector actions]
The mutation operator can cause both bit mutations within a rule and rule
duplication or deletion. This means the complexity and number of rules in
a creature can increase or decrease with time under evolutionary pressure.
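One matching cycle of a classifier system such as that in Fig. 13.4 can be sketched as follows. The rule and message values are modelled on the bit strings in the figure, but the single-step semantics and the use of '#' wild cards in conditions are assumptions for illustration rather than the exact Hermes implementation:

```python
def matches(condition, message):
    # '#' is a wild card; every other position must agree exactly.
    return all(c in ('#', m) for c, m in zip(condition, message))

def step(rules, messages):
    # Every rule whose condition matches any current message fires and
    # contributes its action string as a new message for the next cycle.
    new = []
    for condition, action in rules:
        if any(matches(condition, msg) for msg in messages):
            new.append(action)
    return new

rules = [('11##1100', '11110010'),   # (condition, action) pairs
         ('10101010', '10000001')]
messages = ['11001100']              # message detected by a sensor
out = step(rules, messages)
```

Under the mutation operator described above, entire (condition, action) pairs in the rule list could also be duplicated or deleted, so a creature's rule count changes under evolutionary pressure.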
Hermes has been used to evolve solutions to a variant of the travelling
salesman problem. In this variant there are a number of tasks located on
a grid and a number of 'salesmen' who must visit these tasks and complete
them in the minimum time. Each salesman has a particular skill set and is
rewarded for completing jobs according to how well their skill set matches
that job. However, during the day more tasks appear - how do you best
utilize your salesmen to complete the maximum number of jobs? This is a
'multiple travelling salesmen in a fog' problem. The Hermean representation
is very computationally expensive so that evolution is relatively slow compared
with genetic programming. However, evolution of the programs has been
observed, but, to date, no examples of inter-program co-operation have been
clearly identified. Like Tierra, disassembling Hermean programs to find out
how and why they work is time consuming.
13.5
FUTURE DEVELOPMENTS
13.5.1
Time scales
A good programmer writes 20-30 lines of debugged code per day - roughly
300 bytes of object code. How quickly do programs such as Tierra or Koza's
genetic programming evolve an equivalent amount of code? The comparison
depends on a number of assumptions.
Only the evolution of small programs has been studied. The scaling to
large problems is unknown. Section 13.3 briefly discussed this issue. Until
workstations of sufficient power are available the techniques will be
difficult to test on long programs.
Computer model     Processing speed (MIPS)   Year available   Time for 30 lines
Mac Quadra                  10                   1992             100 days
Sun Sparc II                30                   1992              30 days
HP Apollo 720              100                   1992              10 days
Human                        -                   1992               1 day
Workstation             3 × 10³                  2000               1 day
Workstation             3 × 10⁵                  2010              2 hours
Workstation             3 × 10⁷                  2020               3 mins
The predicted year that such a serial computer will be available for about
£25 000 at 1992 prices is indicated in column 3. Column 4 shows the calculated
and observed times for 'producing' thirty lines of code. This data is presented
graphically in Fig. 13.5. Thus, if genetic programming can be shown to scale,
it is predicted that around the turn of the century such techniques will compete
with humans at writing code. From then on the power of the machines will
rapidly overtake the human programmer.
Fig. 13.5  Time taken by evolving software to produce 300 bytes of object code on various current and anticipated workstations. [bar chart: time, days, on a logarithmic scale from 0.001 to 100]

13.5.2
model' - has been shown to be considerably faster at evolving than the static
test case.
Programs developed using rigorous mathematical models should, in
theory, be provable as error-free. Evolving software will only be statistically
error-free depending on the number of test cases to which it has been exposed
unless formal methods can be developed for verifying the generated code.
Programs that are shown to possess errors would probably have to be evolved
further. However, evolving software tends to develop programs that are fault
and error tolerant. This is because of the nature of the evolutionary process,
particularly with the co-evolving parasite model. A program that evolves to
cope well with a subset of the data and poorly with the rest can be viewed
as 'brittle' and error-sensitive. Changes in the test set will rapidly kill these
programs. So programs that return reasonably close, but not fully
satisfactory, answers to a fuller range of the test set of problems are more
likely to survive. A good example of this process can be seen in Tierra. The
low flaw rate of execution of instructions drives evolution towards programs
that can tolerate small errors in their execution. A program that requires a
thousand instructions to be executed exactly as they are written is likely to
rise quickly to the top of the reaper queue and to extermination. Tierran
programs can be hard to disassemble precisely because, for them, it is more
important to evolve error tolerance than readability! Until larger programs
can be evolved it will be difficult to judge the error rate in an evolved,
as opposed to a hand-written, program.
13.5.3
The most striking examples of evolved software come from the field of genetic
programming. However, the problems tackled, whilst interesting and
complex, only require relatively short lengths of code. Indeed, to prevent
the simulator grinding to a halt the maximum depth of the parse tree is
deliberately limited to, typically, fifteen layers. The question then becomes:
'How does genetic programming scale with the length of program to be
evolved?' The simple answer is nobody knows. Holland discusses the
consequences of gene string length and the number of alleles (independent
symbols that can occupy a single site on the gene string - for binary
representations this is two [0, 1]) on the size of the genetic search space and
thus on the search time (see in particular Jones [9] for a clear discussion).
The search space for a genetic algorithm and the number of schemata in the
population both grow as power functions. Unfortunately Holland's analysis
does not apply to open-ended systems. Genetic programs that seem to evolve
similar structures may actually search a smaller space than first appears and
thus may also show a less than exponential growth in difficulty with the size
of the program to be evolved.
13.6
CONCLUSIONS
REFERENCES
1.
2.
3.
Conrad M: 'Structuring adaptive surfaces for effective evolution', Proc 2nd Ann
Conf Evol Prog, p 1 (1993).
4.
5.
6.
7.
Ray T: 'An approach to the synthesis of life', Artificial Life II, p 371 (1992).
8.
Skipper J: 'The computer zoo - evolution in a box', Proc 1st Eur Conf on Art
Life, p 355 (1992).
9.
10. Harvey I: 'Species adaptation genetic algorithms: a basis for a continuing SAGA',
Proc 1st Eur Conf on Art Life, p 346 (1992).
11. Goldberg D E, Deb K and Korb B: 'An investigation of messy genetic algorithms',
Technical Report TCGA-90005, TCGA, University of Alabama (1990).
12. Tackett W A and Carmi A: 'SGPC: simple genetic programming in C', University
of Southern California, Dept of EE Systems and Hughes Missile Systems Co.
13. Tackett W A: 'Genetic programming for feature discovery and image
discrimination', Proc of the Fifth International Conference on Genetic
Algorithms, p 303 (1993).
14. Holland J H: 'Escaping brittleness: the possibilities of general purpose learning
algorithms applied to parallel rule-based systems', in Michalski R S, Carbonell
J G and Mitchell T M (Eds): 'Machine Learning: An Artificial Intelligence
Approach, Vol II', Morgan Kaufmann, Los Altos, pp 593-623 (1986).
15. Smith S F: 'Flexible learning of problem solving heuristics through adaptive
search', Proc 8th Int Conf on Art Int (1983).
16. Hillis W D: 'Co-evolving parasites improve simulated evolution as an optimisation
procedure', Artificial Life II, p 313 (1992).
14
14.1
INTRODUCTION
is self-regulating;
14.2
the agents should be able to dynamically alter their task allocations and
number.
So far it has been argued that the use of mobile agents in conjunction
with indirect inter-agent communication can lead to intrinsically robust,
well-structured code. This potential benefit remains academic until it can be
demonstrated that such agents can be programmed to carry out useful tasks
in a way which includes this benefit.
In an extreme case a single mobile agent can do any task that a central
controller could do. The mobile agent has access to the same information,
since it can move to any part of the system and read any data that would
be accessible to a central controller. It can then be said that there is no
restriction to the tasks that non-intercommunicating mobile agents can do
compared to a central controller. This means that the issue is not whether
a mobile agent can carry out useful tasks but whether mobile agents can be
used to carry out tasks in a way which offers benefits over central or fully
distributed control.
14.3
14.4
14.5
AN EXAMPLE APPLICATION
14.5.1
The load management agent (load agent) provides the lowest level of control
in the system. It is designed to manage the network by distributing traffic
evenly. When a load agent is launched on a node in the network it updates
the routes from that node to all other nodes by modifying appropriate
routeing tables throughout the network. An agent primarily acts on behalf
of the node where it was launched but can have a beneficial effect elsewhere
in the network - when a routeing table at a node is changed it will re-route
traffic from all nodes via the new route, not just that sourced from the agent's
own node. When an agent has completed a single update for its node it
terminates.
The algorithm for constructing routes is based on a well-known optimal
route algorithm by Dijkstra [7]. In this example the algorithm is used to
find the path from the source node (s-node) to each other node in the network
that has the maximum amount of spare capacity available. The spare capacity
of a route is defined to be equal to the smallest spare capacity of any
component on the route.
The internal state of the agent consists of a list of records each with the
following fields:

node identifier;
permanent/temporary flag;
spare capacity;
contributing neighbour.
There is one record for each node that the agent knows about. Initially,
the agent only knows that the s-node exists, so there is only one record. The
s-node record is permanent (as indicated by the flag) and its spare capacity
is set to infinity. The contributing neighbour field is not used for the s-node.
The agent interrogates the s-node to find out who its neighbours are and
what the spare capacity to each neighbour is. A temporary record is then
created for each neighbour which indicates the s-node as the neighbour that
contributed to the calculation of the spare capacity to that neighbour.
The temporary record with the largest spare capacity is made permanent.
This will be called the newly promoted node. Figure 14.1 shows the way that
the algorithm 'grows' the network of permanently labelled nodes by
promoting temporarily labelled nodes at the periphery. At this point the agent
knows the route from the s-node to the newly promoted node. In this case
it is rather trivial; traffic for the newly promoted node should simply be passed
directly to that node from the s-node.
The agent now makes its first move to the newly promoted node so that
this node becomes the current node. The agent creates new temporary labels
for each of those neighbours of the current node that it does not already
have records for. If the agent already has a record for a neighbour it will
check to see if the route via the current node has a higher spare capacity
than that indicated in the record. If this is the case then the record will be
altered to indicate the new, higher capacity and show the contributing
neighbour as the current node. Once again the temporary record with the
largest spare capacity is made permanent. It is now known that the route
to the newly promoted node should be the same as the route to the neighbour
that contributed to the calculation of the spare capacity to the newly promoted
Fig. 14.1 The algorithm 'grows' the network of permanently labelled nodes from the s-node by promoting temporarily labelled nodes at the periphery.
node. To update the routeing tables accordingly, the agent needs to visit every
node between the contributing neighbour and the s-node.
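The route construction described above is a widest-path variant of Dijkstra's algorithm [7]: instead of minimizing a summed cost, it maximizes the bottleneck (the smallest spare capacity on the route). A minimal sketch, assuming a simple adjacency-map representation of the network; the data structures and names are illustrative, not the authors' implementation:

```python
import math

def widest_paths(capacity, s):
    """Dijkstra variant: for each node, find the route from the s-node
    whose bottleneck (smallest spare capacity on the route) is maximal.
    `capacity` maps node -> {neighbour: spare capacity of the link}."""
    spare = {s: math.inf}   # best known bottleneck capacity from s
    contrib = {s: None}     # contributing neighbour for that value
    permanent = set()       # permanently labelled nodes
    while len(permanent) < len(spare):
        # Promote the temporary record with the largest spare capacity.
        node = max((n for n in spare if n not in permanent),
                   key=lambda n: spare[n])
        permanent.add(node)
        # Create or improve temporary records for the node's neighbours.
        for nbr, cap in capacity.get(node, {}).items():
            if nbr in permanent:
                continue
            bottleneck = min(spare[node], cap)
            if bottleneck > spare.get(nbr, -math.inf):
                spare[nbr] = bottleneck
                contrib[nbr] = node
    return spare, contrib
```

The `contrib` map plays the role of the contributing-neighbour field: following it from any node back to the s-node recovers the maximum-spare-capacity route.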
The order in which the agent visits the nodes to update the routeing tables
is very important. If the agent passed from the s-node to the newly promoted
node then there is a possibility that, at some stage, the routeing tables would
contain invalid routes that were either circular, or terminated before reaching
the destination. If, instead, the agent moves from the destination back to
the s-node then no route which was previously valid would be invalidated
by the changes that the agent makes to the routeing tables.
To see how updating a route from destination to source does not leave
an invalid route, consider the example shown in Fig. 14.2. Suppose an agent
is in the process of modifying a path which was originally valid. Now trace
the path from the source node to the destination node with the routeing tables
in this intermediate state. At some point on the path a node may be
encountered whose routeing table has been modified by the agent. If no such
Fig. 14.2 Updating a route from destination back to the s-node (key: current agent location; starting node; old route; new route being created by agent).
node is found then, by assumption, the route to the destination node is valid.
If a modified node is encountered, then the path segment between the source
node and the first modified node must be valid (since the path has been traced
that far). The path segment between the first modified node and the
destination node must be valid since it is being written by the agent. This
argument is independent of the choice of source node and so will be valid
for all source nodes in the network.
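The destination-to-source update order can be sketched with next-hop routeing tables; the representation below is an illustrative assumption, not the authors' code. The assertion inside `install_route` checks the invariant argued above: at every intermediate state the route from the s-node remains valid.

```python
def route_from(tables, src, dst, max_hops=100):
    """Follow the next-hop routeing tables from src towards dst; return
    the node sequence and whether dst was reached (False on a dead end
    or a circular route)."""
    path, node = [src], src
    while node != dst and len(path) <= max_hops:
        node = tables.get(node, {}).get(dst)
        if node is None or node in path:   # dead end or loop
            return path, False
        path.append(node)
    return path, node == dst

def install_route(tables, new_route, dst):
    """Install new_route (s-node ... dst) by visiting nodes from the
    destination end back towards the s-node, as the agent does."""
    for i in range(len(new_route) - 2, -1, -1):
        node, next_hop = new_route[i], new_route[i + 1]
        tables.setdefault(node, {})[dst] = next_hop
        # Invariant: the already-rewritten tail is valid by construction,
        # and the untouched head still follows the old (valid) route.
        assert route_from(tables, new_route[0], dst)[1]
```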
14.5.2
Parent agents provide the second level of control in the system. These agents
are responsible for managing the population level and task allocation of load
agents. The parent agent detects nodes with high utilization values and then
launches load agents on nodes that are sourcing the most traffic. In this way
the parent agent provides a link between the nodes that are most likely to
become overloaded and the nodes that are most likely to cause the overload.
A parent agent steps randomly around the network gathering information
for its internal records from each node that it visits. From each node the
agent records the traffic sourcing rate and current utilization. Traffic sourcing
rate values are used to build a 'sourcing-rate history' for each node in the
network that the agent knows about. Utilization values are used to form a
'utilization history', which is an average of the utilization values of the last
n nodes that the agent has visited. Using this information, a parent agent
can determine the level of management required in the network at any time
by analysing how evenly traffic is distributed. The parent agent uses techniques similar to those used in economics to analyse income distributions [8].
The parent agent updates an internal record with the following fields for
each node visited in the network:
node identifier;
on this node (perhaps there are already too many processes on this node)
the parent agent moves to the next node in its ranking table until it successfully
launches an agent. The agent will move to the next ranked node if:
it estimates that the node still has an agent working for it;
it believes that an agent has crashed - in this case the parent agent will
clear the records of the crashed agent, thus allowing a load agent to be
launched on this node next time.
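The launch cycle over the ranking table can be sketched as follows, with the liveness and crash tests passed in as predicates; this decomposition is hypothetical, chosen only to mirror the two skip rules just listed:

```python
def launch_cycle(ranking, agent_alive, agent_crashed, clear_records, try_launch):
    """Walk the ranking table of traffic-sourcing nodes and launch one
    load agent, skipping nodes that are already served."""
    for node in ranking:
        if agent_crashed(node):
            clear_records(node)   # a load agent can be launched here next time
            continue
        if agent_alive(node):
            continue              # the node still has an agent working for it
        if try_launch(node):
            return node           # successful launch ends the cycle
    return None                   # fall back to 'gather-information' mode
```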
Parent agents must be able to detect load agents that have crashed and
replace them quickly. The parent agent does this by using load agent timestamps to calculate average load agent lifetimes (the time taken for the load
agent to do a route update for an s-node and then terminate). When a load
agent is launched it registers its start time at the s-node and when it finishes
it posts its start time again in another field at the s-node. By comparing these
two fields a parent agent can determine if the last load agent to be launched
has finished successfully. If this is the case, it will know that it can launch
another load agent. If a load agent crashes or is delayed due to an unusually
heavy work-load, the parent agent will become aware when the time elapsed
from the last posted start time becomes larger than the average life-time.
To cope with variations in the lifetimes, the parent agent has a safety margin
to allow agents to overrun under heavy traffic conditions. When this time
has elapsed the parent agent will assume the load agent has crashed and will
clear its records from the s-node to allow another load agent to be launched
in future. If that scheme fails for any reason such as the system suddenly
slowing down and hence making load agent lifetimes much longer, it will
not matter, as a node can have a number of load agents working for it at
any time. The agent lifetime average is calculated so that the last lifetime
is weighted heavily. This is essentially a last lifetime value with a small
historical content.
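The heavily last-weighted lifetime average and the safety margin can be sketched as below; the weight and margin values are illustrative assumptions, not figures from the original system:

```python
class LifetimeMonitor:
    """Sketch of the parent agent's load-agent supervision: an average
    lifetime that weights the last observation heavily, plus a safety
    margin before a missing agent is presumed crashed."""

    def __init__(self, weight=0.8, margin=1.5):
        self.weight = weight    # weight given to the most recent lifetime
        self.margin = margin    # overrun allowance under heavy traffic
        self.average = None

    def record(self, lifetime):
        if self.average is None:
            self.average = lifetime
        else:
            # essentially the last lifetime with a small historical content
            self.average = (self.weight * lifetime
                            + (1 - self.weight) * self.average)

    def presumed_crashed(self, elapsed):
        """True once the time elapsed since the posted start time exceeds
        the average lifetime by more than the safety margin."""
        return self.average is not None and elapsed > self.margin * self.average
```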
It is crucial that parent agents regulate themselves from information
gathered from the environment in order that an ensemble of parent agents
can self-organize with no direct inter-agent communication. The parent agent
must know not only that there is a need for more load agents, but also when
there are enough load agents and, therefore, adding more would not be useful.
The parent agent will stop the launch cycle and return to the 'gather-information' mode if:
the traffic sourced by the ranked node will not significantly increase load
on the network;
14.5.3
soon as they read another parent monitor's records. Crashed parent agents
will be replaced after only slightly longer than their typical cycle time;
meanwhile load agent management will not suffer as other parent agents can
cope equally well. The preferred number of parent agents can be changed
when the system is running and the population will adjust accordingly.
14.6
14.6.1
The network
To test the agents' ability to spread traffic evenly, a network was needed
that had a number of alternative routes from node to node. This network
must not be fully interconnected as there would always be a direct link to
all other nodes and the network would be limited only by the capacity of
its links. Also the network cannot be a tree network as there will be no
routeing decisions for the agents to make since there is only ever one route
from node to node. A proposed UK network was chosen as the test network
(see Fig. 14.3). This network is sufficiently interconnected to give the agents
routeing decisions to make and it also has the benefit of being a realistic
network topology.
A network description file was created to set the positions of the 30 nodes
in the network and the links connecting them. Links are defined as
unidirectional and so nodes were connected with a link in each direction to
represent a circuit. In this example all nodes in the network were given the
same capacity.
14.6.2
Traffic in the network was represented by blocks of calls that were 2.5%
of each node's total capacity. The source and destination nodes were chosen
so that there was a range of call distances, where the distance is the number
of nodes through which a call travels. As the initialized routes for the network
were known, it was possible to deliberately overload particular nodes by
organizing many calls to cross at one node. This was done for three nodes
in the network.
The call blocks were scheduled to start at regular staggered intervals and
the durations were arranged so that they all finished approximately together.
The traffic generator module is able to apply a traffic profile to the network
at any specified rate.
Fig. 14.3 The proposed UK network used as the test network.
14.6.3
distribution as shown in Fig. 14.4(a). It should be noted that nine nodes are
completely unused.
To test the system with agents present, the routes were initialized as before,
and a parent monitor was started on a node. The parent monitor then
launched the required number of parent agents. For a network of this size,
two parent agents are sufficient and this value had been pre-set. The traffic
generator was then started with the same traffic profile as before. As there
is a random element in the behaviour of the system, the agents were tested
with the traffic profile a number of times.
As soon as the parent agents found traffic in the network, they calculated
where load agents would be needed. As the amount of traffic in the network
increased, so did the number of load agents. As traffic was added, the
maximum spare capacity routes could differ from the initial default routes.
The load agents changed routes in response to this and moved call blocks
away from congested nodes. In the tests, the number of load agents typically
peaked at about seven. When the traffic stabilized and then was removed
from the network no more load agents were launched. Figure 14.4(b) shows
a typical result, with the network at the point of peak traffic.
Fig. 14.4 Node utilization (a) without agents and (b) with agents. Key: not used; 0-25%; 25-50%; 50-75%; 75-100%; >100%.
Comparison of the two results in Fig. 14.4 shows the benefit of the agents.
The agent-managed network had a maximum node utilization of 85% - no
nodes are overloaded. This was achieved because the agents spread the traffic
more evenly across all the nodes, leaving no node unused.
14.7
The use of mobile agents combines some of the benefits of central control
and distributed control. At each stage of the algorithm the load agent learns
about the network incrementally, by interrogating a node to discover its status
and the status of the links to its neighbours; no central store of information
is required although non-local information is held within the agents.
Individual agents can fail whilst the overall control system continues to
function. Dijkstra's algorithm is very difficult to implement in a fully
distributed manner and yet it has been implemented here in a very simple
way that retains a number of the benefits of distributed control.
The agents exhibit both algorithmic and heuristic behaviours. They exhibit
algorithmic behaviour when calculating routeing tables but heuristic behaviour
when managing their workload by choosing whether to launch agents,
terminate themselves or move to another node. This is a powerful way of
introducing a degree of apparent intelligence into an otherwise purely
algorithmic system.
Distributed control has advantages and disadvantages compared to central
control of a distributed resource. Primarily, distributed control can be more
robust and faster reacting than central control. The robustness occurs since
there is no single controller whose failure would cause the whole control
system to fail. On the other hand, central control is potentially able to produce
a result that is closer to optimum since a central controller will have global
information available to it.
Mobile agents appear to be distributed in the sense that there is no central
controller. However, the agents have a view of the distributed system which
is not local and, indeed, may be global since, in principle, a mobile agent
could visit every part of the distributed system to gather data.
The concept of mobile agents seems to embody the advantages of
distributed control in that they are very robust and yet they can have a view
of the system which includes many nodes.
As well as providing some of the benefits of central control, the mobile
agents can inherit some of the disadvantages of central control. In central
control, decisions may be based on data that is quite old, since it will take
time to poll all parts of the system. Mobile agents also obtain the data that
they require in a serial manner and so could also suffer from the problem
of carrying out actions based on old data.
The age of the data that an agent uses can be controlled by restricting
the number of nodes that an agent visits to gather data before it implements
a control action. If the number of nodes visited by an agent is too small,
the agent may suffer from the problem that its control is far from globally
optimal. However, the use of mobile agents in this way allows one to obtain
the desired balance between the characteristics of distributed control and those
of central control.
14.8
CONCLUSIONS
The main conclusion from this work is that the use of mobile agents offers
a radically different way to control a distributed system such as a
communications network. The use of mobile agents extends the philosophy
behind object oriented programming to make it more amenable to
programming distributed control algorithms.
The use of mobile agents provides a means of taking advantage of work
in robotics where a novel architecture, called the Subsumption Architecture,
has been shown to be superior to traditional artificial intelligence in terms
of the complexity of the environment it can cope with and the simplicity of
the resulting control structure.
Distributed resource controllers based on mobile agents would have quite
different characteristics from those based on a more traditional approach
to control problems. There would be an automatic increase in robustness,
the penalty for which would be the inability to formally predict the
performance of the resource being controlled by the mobile agents. This may
not be a disadvantage for certain aspects of control since in most control
problems some kind of heuristic algorithm will be needed (even if it is usually
provided by a human controller).
Mobile agents would be of most benefit in the kinds of application where
robustness and the ability to self-regulate are more important than the speed
of response. They are likely to be suitable for carrying out background
optimization tasks by adjusting parameters to keep the system within
operating limits. This attribute would allow agent control systems to carry
out some of the mundane tasks currently undertaken by human supervisors.
REFERENCES
1.
2.
3.
4.
5.
Brooks R: 'A robust layered control system architecture', IEEE Journal of Robotics and Automation, RA-2, No 1 (1986).
6.
7.
8.
15
EVOLUTION OF STRATEGIES
S Olafsson
15.1
INTRODUCTION
Applications of dynamic game theory to task allocation on distributed processor systems have been discussed in Olafsson [9]. Evolutionary models [10]
have also been applied to the analysis of the market diffusion of different,
and competing, network technologies [11]. Further applications of the results
will be presented elsewhere.
The chapter is organized as follows. Section 15.2 reviews some of the
basic concepts of dynamic evolutionary game theory (hereafter mostly called
dynamic game theory), as introduced by Maynard Smith [4], Taylor and
Jonker [12], and Zeeman [13]. Section 15.3 gives a formal definition of
the evolutionarily stable strategy [5] and discusses some of its general
properties. Here it is proved that the equilibrium states for the game can
be derived analytically, i.e. without simulating the game dynamics. In section
15.4 some stability properties of the equilibrium strategies are analysed. It
is shown how the analysis of a linear system reveals the number of pure
strategies which contribute to an evolutionarily stable strategy. Furthermore,
it is proved in this section that the fitness of an equilibrium strategy can be
evaluated in terms of the eigenvalue spectrum of the game's stability matrix.
In the final section a few examples are also discussed and results of analytical
calculations and simulations presented.
15.2
This section introduces the basic concepts of dynamic game theory. Let
s = (s_1, s_2, ..., s_n) define the finite set of strategies available to a population
of competitors. The vector p = (p_1, p_2, ..., p_n) describes the probabilities with
which the strategies are used, i.e. p_i = P(s_i) is the probability that a
competitor uses strategy s_i. The pay-offs associated with the various
strategies are presented in the form of a gain matrix G = (G_ij), 1 <= i,j <= n. The
precise meaning of the matrix elements is as follows: G_ij is the pay-off to
an individual applying strategy s_i against an individual using strategy s_j. The
components of the vector f = (f_1, f_2, ..., f_n) present the fitness values assigned
to the various strategies. In general, the probabilities, the pay-offs and the
fitness values are functions of time. The triplet Γ = (G,s,p) will be taken to
define a game.
As a result of a contest between two opponents applying any one of the
available strategies, their respective fitness values do, in general, change. For
the evaluation of the fitness a rule discussed by Maynard Smith [4] is
adopted. It gives the fitness of the strategies by the following expression,
f(p) = f_0 + Gp. The component f_i(p) = f_0,i + (Gp)_i gives the fitness of a player
using strategy s_i, when contesting a population using a strategy defined by
the probability distribution p = (p_1, ..., p_n). Similarly, if a population of
competitors plays the available strategies with the probability distribution
p, its mean fitness is given by:

⟨f⟩ = p·f(p) = p·f_0 + p·Gp
Generally, the mean fitness of the population serves, at any moment in time,
as a benchmark against which the fitness values of the pure strategies are
to be compared.
One would, in general, expect the dynamic to move the game towards
probability distributions which favour strategies with high fitness. In a
biological context, the interpretation of this fact can be twofold. Either the
present population modifies its strategy towards a probability distribution
which improves the average fitness of the community, or those members of
the population which are already applying the high fitness strategies are
rewarded by a higher number of descendants, which consequently inherit
the strategies of their parents. The effect of this is also that more individuals
will be playing the successful strategy. From a mathematical point of view
both interpretations are identical.
Many workers have considered it to be a disadvantage that evolutionary
game theory treats the growth of strategies as an asexual process. In fact,
this feature of the theory is a very desirable one from the market strategist's
point of view. Here, the success of a strategy has in general two very different
effects. Firstly, it implies a market expansion for its user, and, secondly, the
strategy multiplies in the sense that it gets used by any number of competitors
which become aware of its success. Both cases can easily be captured by the
elements of dynamic game theory [9], if the gain matrix elements are made
dependent on the strategy probability distribution.
This work considers mainly the following updating rule:

ṗ_i = p_i (f_i - ⟨f⟩)    ... (15.1)
where the dot means derivative with respect to time. The reasons for
considering this system of equations rather than those studied in Taylor and
Jonker [12] and Zeeman [13] have mainly to do with the context in which
this study arose, i.e. using dynamic game theory for the analysis of
competitive markets. Realistically, market-like systems are in general not
globally stable, as the position of the equilibria will depend on the system's
initial configuration. Furthermore, for most markets it is likely that a number
of gain matrix elements are negative, as one can expect losses when applying
some of the available strategies. Unlike in the case studied in Taylor and
Jonker [12] and Zeeman [13] the effects of these negative values cannot
be removed by just adding a constant vector to each column of the gain
matrix. They are real in the sense that they affect the attractor behaviour
of the dynamic system [8].
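As a sketch, assuming rule (15.1) takes the replicator form ṗ_i = p_i (f_i - ⟨f⟩) with f = f_0 + Gp (the form consistent with the linearization in equation (15.9)), the dynamics can be integrated by a simple Euler scheme:

```python
def simulate(G, p, f0=None, dt=0.01, steps=20000):
    """Euler integration of the updating rule, assumed here to be of
    replicator type: dp_i/dt = p_i (f_i - <f>), with f = f0 + G p and
    <f> the population mean fitness."""
    n = len(p)
    if f0 is None:
        f0 = [0.0] * n
    for _ in range(steps):
        f = [f0[i] + sum(G[i][j] * p[j] for j in range(n)) for i in range(n)]
        mean = sum(p[i] * f[i] for i in range(n))
        p = [p[i] + dt * p[i] * (f[i] - mean) for i in range(n)]
    return p
```

Note that this Euler step conserves the probability sum exactly, since the increments satisfy Σ_i ṗ_i = Σ_i p_i f_i - ⟨f⟩ Σ_i p_i = 0 whenever Σ_i p_i = 1.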
The main questions addressed in this chapter are the following.
Does the system in equation (15.1) find these strategies (if they exist)?
15.3
STRONG STRATEGIES
Before tackling the problems related to the questions stated in the previous
section a more precise definition of what is meant by a winning strategy is
needed. Let Π be the state space of all possible probability distributions, i.e.:

Π = { p | p_i >= 0, Σ_{i=1}^{n} p_i = 1 }
i.e. if the highest element in the row vector G_i = (G_i1, ..., G_in) is on the
diagonal of the gain matrix G. The following example demonstrates that some
games have no strongest strategy.
Example -
Assume that p∈Π is the strongest strategy in the game Γ = (G,p,s). Then
p·Gr > q·Gr, ∀r∈Π, which implies p·G > q·G. It is straightforward to
establish that this inequality leads to the two incompatible conditions
-p_1 > -q_1 and p_1 > q_1, where p_1 and q_1 are the first components of the
probability vectors p and q. This game therefore has no strategy which is
stronger than all other strategies.
As mentioned in the introduction, Maynard Smith and Price [5]
introduced the concept of the evolutionarily stable strategy, which is a
strongly biologically motivated concept. The definition given here is taken from Taylor
and Jonker [12] and Zeeman [13]. It generalizes the definition given in
Maynard Smith [4].
Definition 15.2 - a strategy p∈Π is called an evolutionarily stable strategy
(ESS) if for all strategies q∈Π - {p} one or the other of the two conditions
holds:

p·Gp > q·Gp    ... (15.2a)

p·Gp = q·Gp and p·Gq > q·Gq    ... (15.2b)
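The two ESS conditions can be checked numerically for a two-strategy game; the grid of competing strategies q and the tolerance below are illustrative choices, not part of the original treatment:

```python
def is_ess(G, p, grid=101, tol=1e-9):
    """Check the standard ESS conditions for a two-strategy game:
    p is an ESS if for every q != p either p.Gp > q.Gp, or
    p.Gp = q.Gp and p.Gq > q.Gq. Here q ranges over a grid of mixtures."""
    def payoff(x, y):  # x . G y
        return sum(x[i] * G[i][j] * y[j] for i in range(2) for j in range(2))
    for k in range(grid):
        x = k / (grid - 1)
        q = (x, 1.0 - x)
        if abs(q[0] - p[0]) < tol:
            continue                       # q coincides with p
        a, b = payoff(p, p), payoff(q, p)
        if a > b + tol:
            continue                       # first condition holds
        if abs(a - b) <= tol and payoff(p, q) > payoff(q, q) + tol:
            continue                       # second condition holds
        return False
    return True
```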
The ESS gives a formalized definition of the best strategy in an
evolutionary context [4]. In particular the definition implies the ability to
resist the invasion of new strategies, possibly generated through mutations.
This point will be discussed later. Before discussing in what type of situation
one or the other condition, stated in the definition of the ESS, is satisfied,
it is necessary to make some statements about a system being in ESS. Here,
a simple proof of a theorem first proved by Bishop and Cannings [14] will
be given.
Theorem 15.1 - let p be an ESS. Then the fitness of p is equal to that of
all the pure sub-strategies s_i contributing to p, i.e. f_p = f_i.

Proof - let W = {1 <= i <= n | p_i ≠ 0}. The ESS is given by the linear combination p = Σ_{i∈W} p_i s_i. Assume f_p > f_i for some i∈W. Then:

f_p = Σ_{k∈W} p_k f_k = Σ_{k∈W-{i}} p_k f_k + p_i f_i < Σ_{k∈W-{i}} p_k f_k + p_i f_p    ... (15.3)

Set q_k = p_k / (1 - p_i) for k∈W-{i}. Then q defines a strategy whose fitness
against p exceeds f_p, contradicting the assumption that p is an ESS. Hence:

f_i = f_j = f_p,  ∀ i,j ∈ W    ... (15.4)
The result stated in Theorem 15.1 is intuitively clear. If the fitness of the
various contributing strategies were not equal, the probabilities would be
shifted so as to achieve that condition. A brief inspection of the equation
for the probability evolution makes this clear.

Maynard Smith [4] defined the ESS to be the ability of a population
to resist mutant strategies. A brief description of his original arguments and
how they relate to the formal definition in equation (15.2) will now be given.
Let p∈Π be an equilibrium state for a population, i.e. p describes how the
... (15.6a)
... (15.6b)
G = ( -1  2
       0  1 )
This is the so-called Hawk-Dove game which has been discussed in detail
by Maynard Smith [4] and others (see for example Zeeman [13]). Here it
will be demonstrated that with respect to the dynamic (equation (15.1)) this
simple game has two equilibria, only one of which is an ESS. Assume that
the system has settled in an equilibrium state for which one writes
p = p_1 s_1 + p_2 s_2. Assuming that both strategies contribute with a non-vanishing probability, i.e. p_1 ≠ 0 and p_2 ≠ 0, then f_1 = f_2 = ⟨f⟩. These
conditions lead to a matrix equation of the form shown in equation (15.5).
The probabilities solving the fitness-constraint conditions can therefore be
found as solutions to this matrix equation. As the matrix equations can be
scaled by an arbitrary factor, one can write the linear system as Gp = 1_2,
where 1_2 is a two-component unity vector, 1_2 = (1, 1)^T. The normalized
solutions to Gp = 1_2 will represent the equilibrium state p. They are, as one
would expect, p_1 = p_2 = 1/2. Later, this result will be generalized to include
multi-strategy games. It should be noted that the fitness values associated
with this probability distribution are f_1 = f_2 = 0.5 and the average fitness is
the same, ⟨f⟩ = 0.5.
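The linear system Gp = 1_2 can be solved in closed form for two strategies. A sketch, assuming the canonical Hawk-Dove gain matrix G = ((-1, 2), (0, 1)), which reproduces the stated equilibrium p_1 = p_2 = 1/2 and fitness 0.5:

```python
def equilibrium_2x2(G):
    """Solve G p = (1, 1)^T by Cramer's rule and normalize; the scaled
    solution gives the candidate equilibrium of a two-strategy game."""
    (a, b), (c, d) = G
    det = a * d - b * c
    if det == 0:
        return None
    p1, p2 = (d - b) / det, (a - c) / det   # unnormalized solution
    s = p1 + p2
    return (p1 / s, p2 / s)
```

For the Hawk-Dove matrix the unnormalized solution is (1, 1), which normalizes to (1/2, 1/2); substituting back gives the equal fitness values f_1 = f_2 = 0.5.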
Fig. 15.1 Probability trajectories for the Hawk-Dove game. Trajectories initialized in the interval (0, √(1/2)) evolve towards the attractor p = (1/2, 1/2). When the system is initialized in the interval (√(1/2), 1) it converges towards the attractor p = (1, 0).
Theorem 15.2 - let p be a solution with positive components of the linear
system:

Σ_{j∈W} G_ij p_j = 1,  i∈W    ... (15.7)

Then q_i = p_i / Σ_{j∈W} p_j defines an equilibrium state for the game
Γ = (G,p,s).
Theorem 15.3 demonstrates how all strategies of one particular game can
be found by extending the results of Theorem 15.2 to include all possible
sub-matrices of the gain matrix.
Theorem 15.3 - let G be a real n x n gain matrix. For every system of indices
(i_1, i_2, ..., i_k) for which all components of the solution p_(k) to the linear
system defined by the sub-matrix G^{i_1 i_2 ... i_k} are positive, i.e.
p_(k),i > 0, i = 1, ..., m = rank(G^{i_1 i_2 ... i_k}), the normalized vector:

q_(k) = (q_(k),1, ..., q_(k),m),  q_(k),i = p_(k),i / Σ_{j=1}^{m} p_(k),j

represents an equilibrium state of the game Γ = (G,p,s).

15.4
This section discusses in some detail the connection between the equilibrium
states for rule (15.1) and the ESS. From previous analysis it is clear that ESS
defines an equilibrium, but it is not clear whether all equilibrium states also
define an ESS. Furthermore, it will be discussed whether and then under what
conditions the equilibrium states are unique. The following two Lemmas can
be proved by using elementary linear algebra.
Lemma 15.1 - let G be the n x n gain matrix for the game r = (G,p,s) where
every component of p has a non-zero value, then this equilibrium is unique
only if rank(G) = n. This does not exclude the game having a number of
different equilibria each one with less than n non-vanishing components.
From this it follows that if the equilibrium states with n non-vanishing
components are not ESS, then the game has no n component ESS. The
following Lemma states that the n component equilibrium states are in fact
ESSs.
Lemma 15.2 - let G be an n x n gain matrix and det(G) ≠ 0. Then any n
component stable equilibrium state of the game Γ = (G,p,s) is also an ESS.
As discussed, an ESS defines an equilibrium state for the dynamic
equations. It is important to understand the stability of the ESS. The precise
meaning of this statement is the following: 'Does a perturbation in the
probability state vector p∈Π lead to a new equilibrium or does the new
(perturbed) strategy lose and the system fall back to its previous
equilibrium?' The question has to be approached in a dynamic context.
Assume that p_0 = (p_0,1, p_0,2, ..., p_0,n) is an equilibrium state for rule (15.1).
Linearising rule (15.1) in this state gives an equation of the form q̇ = Δ(p_0)q
with:
Δ(p_0)_ij = p_0,i ( G_ij - Σ_{k=1}^{n} (G_jk + G_kj) p_0,k )    ... (15.9)
Theorem 15.4 -

Theorem 15.5 -

f_p(p) = 1/2 [tr(Λ) - tr(Δ(p))]    ... (15.10)

where the matrix elements of Λ are defined by Λ_ij = G_ij p_j and
tr(Λ) = Σ_i G_ii p_i.
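Equation (15.9) and the trace relation of Theorem 15.5 can be checked numerically; this sketch assumes the canonical Hawk-Dove gain matrix ((-1, 2), (0, 1)) and the matrix definitions above:

```python
def stability_matrix(G, p):
    """Matrix Delta(p0) of equation (15.9), obtained by linearizing the
    dynamic around an equilibrium p0."""
    n = len(p)
    return [[p[i] * (G[i][j]
                     - sum((G[j][k] + G[k][j]) * p[k] for k in range(n)))
             for j in range(n)] for i in range(n)]

def mean_fitness_from_traces(G, p):
    """Trace relation of Theorem 15.5 (as reconstructed):
    f_p(p) = (tr(Lambda) - tr(Delta(p))) / 2, with Lambda_ij = G_ij p_j."""
    n = len(p)
    D = stability_matrix(G, p)
    tr_lam = sum(G[i][i] * p[i] for i in range(n))
    tr_delta = sum(D[i][i] for i in range(n))
    return 0.5 * (tr_lam - tr_delta)
```

For the Hawk-Dove equilibrium p = (1/2, 1/2) the trace formula reproduces the mean fitness p·Gp = 0.5 computed directly, and both diagonal entries of Δ are negative, indicating a stable equilibrium.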
Consider, for example, the gain matrix:

G = ( 2  3  1
      1  2  3
      3  1  2 )

for the game Γ = (G,p,s). Define:

... (15.12)
Fig. 15.2 Probability trajectories plotted in the (p_1, p_2) plane.
15.5
EXAMPLES
In this section the results from previous sections will be applied to some
concrete cases. It will be shown how the number of contributing strategies
can be derived from the analysis of linear systems of the type in equation
(15.5). Furthermore, it will be emphasized that it is possible for the evolving
system to arrive at non-stable equilibria with high fitness values. These states
are characterized by their dependence on the system's initial strategy state.
Example 1 - in this example the methods developed so far are used to analyse
the four-strategy game defined by the gain matrix:
The four 3 x 3 sub-matrices A_1, A_2, A_3 and A_4 are found by removing
one row of the gain matrix together with the corresponding column. These
are the equilibrium states for the subgames defined by A_2, A_3 and A_4:

p_(2) = (0.39, 0.12, 0.49);  p_(3) = (0.48, 0.28, 0.24);  p_(4) = (0.45, 0.41, 0.14)
The equilibrium states p_(2), p_(3), p_(4) are therefore stable. Looking at these
vectors as defining strategies where one of the strategies has not been selected,
p_(2) would be taken to mean p̄_2 = (0.39, 0.0, 0.12, 0.49) when viewed as a
strategy within the initial game. Similarly one can define the following two
states p̄_3 = (0.48, 0.28, 0.0, 0.24) and p̄_4 = (0.45, 0.41, 0.14, 0.0). All the
states p̄_2, p̄_3, p̄_4 define stable equilibrium states for the dynamics of the initial
4 x 4 game. For example, if the system is initialized as p_2 = (q_1, 0, q_3, q_4) it
eventually converges towards p̄_2. The same is true for the other states p̄_3, p̄_4.
Fig. 15.3 Solution trajectories for the four-strategy game with the gain matrix in
Example 1. The two different initial conditions lead to the same equilibrium strategy.
Fig. 15.4 The gain matrix is the same as in Fig. 15.3. The probability vector p̄_2 = (0.39, 0.0,
0.12, 0.49) defines a non-stable equilibrium. By perturbing its second component, the system
evolves towards its global equilibrium at p = (0.43, 0.35, 0.12, 0.10).
From the above it is clear that the equilibrium states can be analysed in
terms of the algebraic properties of the gain matrix. Given the gain matrix
one would in general not have to simulate the system in rule (15.1) to find
the equilibrium states. These can be found by solving the linear system in
Theorem 15.2.
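Finding the equilibrium states by solving the linear system can be sketched as follows; the bordered-matrix construction and the 3 x 3 example game are illustrative assumptions, not taken from the text:

```python
import numpy as np

def interior_equilibrium(G):
    """Solve G p = f * 1 together with sum(p) = 1 for the candidate
    equilibrium mixture p and its mean fitness f."""
    n = G.shape[0]
    A = np.zeros((n + 1, n + 1))
    b = np.zeros(n + 1)
    A[:n, :n] = G            # G p ...
    A[:n, n] = -1.0          # ... minus f on every row equals zero
    A[n, :n] = 1.0           # normalization row: sum(p) = 1
    b[n] = 1.0
    x = np.linalg.solve(A, b)
    return x[:n], x[n]       # p, f

# Illustrative cyclic 3-strategy game; the symmetric mixture is the equilibrium.
G = np.array([[0.0, 2.0, 1.0],
              [1.0, 0.0, 2.0],
              [2.0, 1.0, 0.0]])
p, f = interior_equilibrium(G)
# Negative components in p would signal that no equilibrium state draws
# on all pure strategies, as in Example 2 below.
```
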
Example 2 - in the following, an example is considered which has been discussed by Maynard Smith [4] and Zeeman [13]. It is a four-strategy game, the so-called 'Hawk-Dove-Bully-Retaliator' (HDBR) game. The gain matrix is given by the following expression:

            H   D   B   R
        H   1   6   6   0
  G =   D   3   2   0   4
        B   3   6   3   2
        R   1   2   6   4
The letters for the individual strategies have been included to indicate the expected benefits when one strategy is played against another. First one has to solve the linear equation Gp = I(4), only to find that the vector p contains two negative components. One concludes that there is no equilibrium state containing some contribution from all the pure strategies. Furthermore, the fact that there are two negative components in the solutions to Gp = I(4) shows that there are no equilibrium states containing more than two contributing strategies. This can be demonstrated by considering the eigenvalues of the matrices found by removing some of the rows and columns. First consider the following sub-matrices:
  A1 = ( 2 0 4 ; 6 3 2 ; 2 6 4 ),    A2 = ( 1 6 0 ; 3 3 2 ; 1 6 4 ),

  A3 = ( 1 6 0 ; 3 2 4 ; 1 2 4 ),    A4 = ( 1 6 6 ; 3 2 0 ; 3 6 3 ),

where the matrix Ai; i = 1, 2, 3, 4 is found by removing the ith row together with the ith column. Only in the case of the first three matrices do the equations Ai p = I(3) have solutions with non-negative components. The normalized solution for the first sub-matrix is p̄1 = (1/3, 0, 2/3), with corresponding solutions p̄2 and p̄3 for A2 and A3.
Fig. 15.5   Probability trajectories for a three-component subgame, A1, of the Hawk-Dove-Bully-Retaliator game. The component p2 is plotted as a function of the component p3 for six different initial conditions.
Fig. 15.6   Probability trajectories for a three-component subgame, A3, of the Hawk-Dove-Bully-Retaliator game. The component p2 is plotted as a function of the component p1 for seven different initial conditions. The component p3 was initialized at the value 0.5. Four of the initial states settle in the ESS state at p = (2/3, 1/3, 0). The remaining initial states evolve towards non-stable equilibrium states.
… various initial values for p1. By initializing the system in p = (r, 0.5, 0.5 - r), r ∈ [0.27, 0.32], one finds the trajectories shown in Fig. 15.6. For r ≥ 0.285 the system converges to the point p = (2/3, 1/3, 0), which has the average fitness value ⟨f(p)⟩ = 2.66. If on the other hand r < 0.285 the system converges towards a state of the general form p(q) = (0, q, 1 - q) with the above-mentioned q-dependent fitness. The evolution of the fitness for the various initial conditions is shown in Fig. 15.7.

The states p(q) = (0, q, 1 - q) are non-stable equilibrium states, as each one of them is arrived at through one particular initial condition. For q < 0.67 these states have a fitness which is higher than that of the attractor state p = (2/3, 1/3, 0), i.e. 2.66, but they are not stable with respect to perturbations. They are therefore not ESS.
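The convergence to the ESS at p = (2/3, 1/3, 0) can be reproduced with a standard discrete-time replicator iteration. This is only a sketch: the proportional-fitness update may differ in detail from rule (15.1), and the entries of A3 below are those of the (H, D, R) sub-matrix of the HDBR gain matrix:

```python
import numpy as np

# (H, D, R) sub-matrix A3 of the HDBR game, obtained by removing the
# Bully row and column from the 4 x 4 gain matrix.
A3 = np.array([[1.0, 6.0, 0.0],
               [3.0, 2.0, 4.0],
               [1.0, 2.0, 4.0]])

p = np.array([0.30, 0.50, 0.20])     # initial state with r = 0.30 >= 0.285
for _ in range(4000):
    fitness = A3 @ p                 # expected payoff of each pure strategy
    p = p * fitness / (p @ fitness)  # proportional-fitness (replicator) update

# p approaches the ESS (2/3, 1/3, 0); the mean fitness p.A3.p approaches 8/3.
```
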
As mentioned at the beginning of this example, two of the components of the solution of Gp = I(4) are negative - the second and the fourth. By removing the corresponding rows and columns and solving the reduced linear equation A2,4 p = I(2) with

  A2,4 = ( 1 6 ; 3 3 )

one finds the eigenvalues (-0.02, -2.98).
Fig. 15.7   The evolution of the average fitness for the various initial conditions of Fig. 15.6.
15.6   CONCLUSIONS
Some of the limitations of conventional game theory are due to its lack of dynamics. In their seminal work, von Neumann and Morgenstern [1] expressed some regret at the static nature of game theory. These limitations were partly overcome with the development of methods for finding equilibria in zero-sum games [16]. Since then, most of game theory has made strong use of the concept of strategic equilibrium. It is probably the most frequently used game-theoretic concept in applications to market analysis and strategic games. Some of the economic applications are market equilibrium, co-operation, bargaining and public goods, to mention just a few.
In contrast to static game theory, evolutionary game theory analyses the temporal evolution of strategies. A fundamental concept in evolutionary game theory is that of an evolutionarily stable strategy. Whether the strategies of the game evolve towards an evolutionarily stable strategy or not often depends on the initial strategy applied by the population. The various trajectories represent the learning processes initiated under different conditions, but motivated only by maximizing returns. This fact is considered to be of essential importance in applications to economics and market analysis. In contrast to the conditions amongst animals, market operators have more freedom in constructing the initial strategy distribution.
This chapter has discussed in some detail the mathematical structure of
evolutionary game theory. It has been demonstrated that the evolutionarily
stable strategy introduced in Maynard Smith and Price [5] can be found
analytically by solving a set of linear equations. Furthermore, it has been
shown how an algebraic analysis of the gain matrix allows one to find the
number of pure strategies which contribute to the evolutionarily stable
strategy.
It has also been demonstrated that adaptive strategies are better characterized by stability than by optimality. Simulations have shown
that some unstable states, reached through specific initial conditions, can
have higher fitness values than evolutionarily stable strategies for the same
game. This is best demonstrated by the fact that the fitness of a population
can be expressed in terms of the sum over the eigenvalue spectrum of the
game's stability matrix.
REFERENCES

1.

2. Lewontin R C: 'Evolution and the theory of games', J Theor Biol, 1, pp 382-403 (1961).

3.

4.

5. Maynard Smith J and Price G R: 'The logic of animal conflict', Nature, 246, pp 15-18 (1973).

6.

7.

8.

9.
16

16.1   INTRODUCTION
The recent development of open computer systems has led to new challenges
for the management of computer resources and their efficient utilization
[1, 2]. Interconnected heterogeneous computer systems, operating in a highly parallel manner, are very different in their behaviour from conventional computational resources operating in an isolated and non-interconnected
manner. The management of these systems requires a new approach aimed
at an efficient distribution of computational requirements in an environment
subject to continuous change and evolution.
First attempts to apply market principles to the allocation of computer
time were reported in Sutherland [3], where the price of computer time was
allowed to depend on general demand and the relative priority of users, so
that the more important users had easier access to computer resources.
However, even the most impoverished users could be allocated some computer
time not needed by anyone else. By applying this auction principle it was
found that the computer utilization was very high.
In recent years, applications of market-like and game-theoretic principles to resource allocation on open heterogeneous computer systems have been studied more rigorously, and some extensive theoretical frameworks for this approach have now been worked out [4, 5]. In the course of these studies the view has emerged that the market approach offers considerable benefits compared with centrally controlled and synchronized networks. In the last ten years, some researchers [6-8] have reported on implementations of such market-based allocation schemes, in which the utilities perceived by the M tasks of the N processors are collected in the matrix:

  G = ( G1,1  …  G1,N ; … ; GM,1  …  GM,N )        … (16.1)
Because of the limited information available, the dynamics of the task allocation process is a probabilistic one.
In this work a new model for task distribution on an open multiprocessor
system is introduced [10, 11]. It is more general than the one described in
Kephart et al [4] as it can, without any modification, be applied to an
arbitrary number of tasks and processors. The model is dynamic and the
basic set of equations describes the time evolution of a matrix quantity,
Pm,n, which gives the probability that task m is dealt with by processor n.
The limited information available to the tasks, and therefore the degree of
uncertainty in the system dynamics, is reflected in the time evolution of Pm,n. For example, minimal knowledge of the potential benefits of choosing one or another processor results in no preference at all, leading to a near-equal probability distribution of tasks over the available processors. On the other hand, reliable knowledge of the benefits that a task can expect by using one processor rather than another will manifest itself in a preference for a few processors, and possibly in a structured, i.e. uneven, task distribution.
As in statistical systems in general, the structure inherent in the probability
distribution can be described in terms of the entropy function (see, for
example, Chandler [12]). In the model developed here an entropy function
is introduced for each task. The task entropies give some information on
the utilization of the processor system. Each task entropy provides
information on how that task distributes itself, in a probabilistic manner,
on the available processors. Low values for any of the task entropies
demonstrate a preference for some processors over others.
In section 16.2 the basic concepts of the model are introduced. In addition to the above-mentioned probabilities Pm,n and the utility functions Gk,i, section 16.2 introduces a gain parameter β and the so-called transfer function W(k),m,n, which measures the transition rate for task k moving from processor n to processor m. The transfer function is responsible for the continuous redistribution of the tasks, i.e. the time evolution of Pm,n. In this notation W(k),m,n dt measures the probability of moving task k from n to m in the time interval dt. The system evolution can also be described in terms of a quantity fn which gives the expected value for the fraction of the total number of tasks being dealt with by the nth processor.
The task entropies are introduced in section 16.3. The special distributions
leading to maximum and minimum values for the task entropies are discussed,
as well as their importance as a 'watchdog' for the system's utilization. The
section also discusses the use of task entropies as a metric for the processor
system's suitability for dealing with the incoming computational requirements.
A number of different choices for the utility functions are discussed in
section 16.4 and the dynamical results of each particular choice are described.
16.2   BASIC FORMALISM

The probabilities Pm,n are normalized over the processors:

  Σ_{n=1}^{N} Pm,n = 1,   ∀m        … (16.2)

and the expected fraction of the total number of tasks dealt with by the nth processor is:

  fn = (1/M) Σ_{m=1}^{M} Pm,n        … (16.3)

The transfer function is taken to have the sigmoid form:

  W(k),m,n = (w0/√π) ∫_{-∞}^{βΓm,n} e^{-t²} dt = (w0/2) [1 - erf(-β Γm,n)]        … (16.4)

where Γm,n is the gain the task perceives in moving from processor n to processor m, and β is the gain parameter.
Fig. 16.1   Tasks, processors and the transfer rates W(k),n,m which connect them.

Fig. 16.2   How the transfer function is responsible for the transfer of jobs from one processor to another.
It has been argued [11] that the evolution of the probability matrix Pm,n(t) can be described by the master equation:

  Ṗm,n = Σ_{l=1}^{N} [ W(m),n,l Pm,l - W(m),l,n Pm,n ]        … (16.5)

Fig. 16.3   The shape of the transfer function for three different gain parameters.
If the transfer rates are taken to be identical for all tasks:

  W(m),n,l = wn,l ,   ∀m        … (16.6)

the dynamics can be written for the expected fractions alone:

  ḟn = Σ_{l=1}^{N} wn,l fl - Σ_{l=1}^{N} wl,n fn        … (16.7)

where the dot denotes the time derivative, and fn is the expected value for the fraction of tasks using the nth processor. The assumption in equation (16.6) is a rather unrealistic one, as it does not consider any task-dependent requirements but assumes that all the tasks perceive the processor system in an identical manner. The network can consist of a wide variety of service providers, offering services as diverse as digitized voice, electronic mail, interactive videotex services and facilities which perform lengthy numerical calculations. The tasks submitted to the network are of an equally diverse nature.
It can be shown [11] that, in the case of only two processors, equation (16.7) reduces to the following expression:

  ḟ1 = w1,2 (1 - f1) - w2,1 f1        … (16.8)
This equation is simply the one studied by Kephart et al [4] and Huberman and Hogg [5] in the case of a two-processor or two-strategy system. It is interesting that even this simple system can display immensely complicated behaviour, including chaotic phenomena which arise when time delays are introduced [4]. It is concluded that the equations studied by these authors [4, 5] constitute a special subset of the general case of equation (16.5).
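For constant transfer rates the two-processor reduction is a single linear equation for f1, relaxing to the balance point w1,2/(w1,2 + w2,1); a quick numerical sketch (the rate values are illustrative):

```python
# Forward-Euler integration of the two-processor case: the fraction f1 on
# processor 1 obeys f1' = w12*(1 - f1) - w21*f1 when the rates are constant.
w12, w21 = 0.8, 0.2          # illustrative constant transfer rates
f1, dt = 0.5, 0.01
for _ in range(5000):
    f1 += dt * (w12 * (1.0 - f1) - w21 * f1)
print(round(f1, 3))          # -> 0.8, the fixed point w12 / (w12 + w21)
```
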
16.3   TASK ENTROPIES

For each task an entropy function is defined over its probability distribution on the processors:

  Sk = - Σ_{n=1}^{N} Pk,n ln Pk,n ,   k = 1, …, M        … (16.9)
If all the components are close in value the various tasks have a similar
distribution on the processor system. Given that the computational and
processing capabilities of the processors are in general very different and
together span a large spectrum of processing facilities, similar values for the
S components imply that the processor system is equally suitable for all of
the tasks. For a multitude of different tasks this is generally not the case.
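The task entropies of equation (16.9) are straightforward to evaluate numerically; a minimal sketch (the function name is an illustrative choice):

```python
import numpy as np

def task_entropies(P):
    """S_k = -sum_n P[k, n] ln P[k, n] for each task k (eq. (16.9)).
    P is the M x N probability matrix; each row sums to one."""
    Q = np.where(P > 0.0, P, 1.0)        # convention: 0 ln 0 = 0
    return -np.sum(P * np.log(Q), axis=1)

# An even distribution maximizes each entropy at ln N ...
P_even = np.full((3, 4), 0.25)
# ... while a deterministic assignment minimizes it at zero.
P_det = np.eye(3, 4)
```
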
16.4   SIMULATIONS

In the two-processor model of Kephart et al [4] the utilities are linear functions of the fractional occupations:

  Gi = ai + bi fi ,   i = 1, 2        … (16.10)

where ai and bi, i = 1, 2 are positive or negative constants. Here, the utilities are expressed in terms of the fractions of jobs presently distributed on the two processors.
Because of the more general nature of the model introduced in this work, it is not sufficient to express the N x M elements of the utility matrix only in terms of the N expected fractions of jobs f1, …, fN on the processor system. As it is the time evolution of Pm,n that is of interest, the aim is to express the utility matrix elements in terms of these probabilities, or functions thereof. The elements of the probability matrix Pm,n relate to the fractional averages as expressed in equation (16.3). Further motivations for this choice will be discussed later in this section.

In all the simulations conducted, the number of tasks and processors was kept constant: 25 processors and 20 tasks.
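The simulation set-up can be sketched as follows. The gain for a move from processor b to processor a is assumed here to be the utility difference G[m, a] - G[m, b], and all numerical values are illustrative rather than taken from the text:

```python
import math
import numpy as np

# Sketch: master-equation dynamics (eq. (16.5)) with the sigmoid transfer
# function (eq. (16.4)) and arbitrary utilities in [0, A] with A = 1.
rng = np.random.default_rng(0)
M, N = 20, 25                      # tasks and processors, as in the text
beta, w0, dt, steps = 1.0, 1.0, 0.01, 2000

G = rng.uniform(0.0, 1.0, (M, N))  # utility of processor i to task m
P = np.full((M, N), 1.0 / N)       # initially even distribution

erf = np.vectorize(math.erf)
# W[m, a, b]: transition rate of task m from processor b to processor a,
# driven by the assumed gain G[m, a] - G[m, b].
W = 0.5 * w0 * (1.0 - erf(-beta * (G[:, :, None] - G[:, None, :])))
out = W.sum(axis=1)                # total outflow rate from each processor

for _ in range(steps):             # forward-Euler step of eq. (16.5)
    P += dt * (np.einsum('mab,mb->ma', W, P) - out * P)

entropies = -np.sum(P * np.log(P), axis=1)   # task entropies, eq. (16.9)
# Each row of P stays normalized, and the mean task entropy falls below
# ln N as the tasks develop preferences for the better processors.
```
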
16.4.1   Arbitrary preference
In the first experiments (Fig. 16.4) it is assumed that the utility matrix elements are functions with arbitrary values in some interval [0, A], A > 0. The smaller A is, the closer in value are the elements of the utility matrix. This, in turn, means that the tasks perceive the processors as being similar with respect to the benefits of using them. Under these circumstances one would expect the initial probability distribution to stay fairly even, because tasks are not particularly encouraged to use one processor rather than another. If, however, the value of A is increased, a preference is likely to arise and the initially even distribution may generate structures which reflect the different values of the matrix elements Gk,i. This evolution is reflected in changes of the entropy. Figures 16.4(a) and (c) show the time evolution of the expected fractional task distribution on the system's processors for two different values of A. Figures 16.4(b) and (d) show the corresponding task entropy evolution. Both Gk,i and β are kept at fixed values during the simulation.
Fig. 16.4   (a) and (c) represent the fractional distribution of tasks on the processors for two different values of A: A = 1.0 and 10.0; (b) and (d) show the evolution of the entropy functions associated with the tasks.
16.4.2   Self-confident choice
Next, the assumption that Gk,i = Pk,i, i.e. that task k believes its benefits of using processor i are directly proportional to the probability that it is already using that processor, makes equation (16.5) nonlinear in Pk,i, and implies that, as certain tasks increase their usage of particular processors, the more likely they are to use them in the future. In this case, redundancy (unemployment) for a number of processors would be expected, at least for sufficiently high values of β. A few results using this choice are presented in Figs. 16.5(a)-(d).

In this case, it is assumed that the tasks are programmed in such a manner that they respond to the present probability distribution of tasks on the processor system. A knowledge of this probability can be achieved if the tasks record the past pattern of probability distribution, i.e. an estimate for the present probabilities is achieved by examining the past usage of the various processors.
Fig. 16.5   Fractional task distribution, (a) and (c), and task entropy evolution, (b) and (d), when the gain matrix Gk,i is equal to the probability distribution Pk,i. The graphs show the two cases β = 20.0 and 50.0.
16.4.3   Limited self-confidence

The self-confident choice can be supplemented by a feed-back term which makes a processor less attractive once its usage exceeds a critical limit:

  Gk,i = Pk,i (a - b Pk,i)        … (16.12)

The gain function reaches its maximum at Pk,i = a/2b, beyond which the perceived utility falls.

Fig. 16.6   The gain functions as they depend on the probability distribution, for four different values of b.
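A gain function with a critical usage limit can be modelled by the quadratic form G(p) = p(a - bp), whose maximum lies at p = a/2b; a quick numerical check (the parameter values are illustrative):

```python
import numpy as np

# Quadratic gain with a critical usage limit: G(p) = p * (a - b * p).
a, b = 1.0, 0.75
p = np.linspace(0.0, 1.0, 100001)
G = p * (a - b * p)
p_star = p[np.argmax(G)]     # numerical location of the maximum
# p_star agrees with the analytic critical limit a / (2 * b) = 2/3
```
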
16.4.4   Do as the others do

Alternatively, the utility of processor i to task m can be set proportional to the usage made of that processor by all the other tasks:

  Gm,i = (1/M) Σ_{n≠m} Pn,i        … (16.13)

i.e. processor i becomes more attractive to task m the more it is used by other tasks. Two instances of this choice are represented in Figs. 16.8(a)-(d) for two different values of the gain parameter, β = 20.0 and 50.0.
Fig. 16.7   The gain matrix is given in section 16.4.3. Graphs (a) and (c) give the fractional task distribution on the processors and graphs (b) and (d) represent the associated task entropies. It is noticeable that the task distribution becomes more even with increasing values of b. This is reflected in overall increased values for the task entropies.
It is obvious from equation (16.13) that the value of the utility matrix element Gm,i is close to the average probability P̄i of the tasks using processor i. Indeed, equation (16.13) can be rewritten in terms of this average probability as follows:

  Gm,i = P̄i - (1/M) Pm,i        … (16.14)
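The identity between equations (16.13) and (16.14) can be confirmed numerically; this sketch assumes the utility in (16.13) is the average (1/M) Σ_{n≠m} Pn,i:

```python
import numpy as np

# Numerical check that averaging over the other tasks, eq. (16.13),
# equals Pbar_i - P[m, i] / M, eq. (16.14), with Pbar the column mean.
rng = np.random.default_rng(1)
M, N = 20, 25
P = rng.random((M, N))
P /= P.sum(axis=1, keepdims=True)    # rows are probability vectors
G13 = (P.sum(axis=0) - P) / M        # (1/M) * sum over n != m of P[n, i]
G14 = P.mean(axis=0) - P / M         # right-hand side of eq. (16.14)
# G13 and G14 agree elementwise
```
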
Fig. 16.8   The fractional distribution of tasks and the associated task entropies where the gain functions are chosen as in section 16.4.4. Results for two different gain parameter values are displayed.
16.5   PROCESSOR ENTROPY
In section 16.3 the concept of a task entropy was introduced. For each task
one entropy function was defined. It was seen that the time evolution of the
task entropies gives some information on the utilization of the processor
system. However, the limitations of the task entropies as a watchdog for the
utilization of the whole processor system were also discussed, and it was
pointed out that the only real information they give is how the individual
tasks are distributed on the available processors. For example, a high value
for the entropy of the kth task only means that this particular task has a
close-to-even probability distribution over the system's processors. The task
therefore does not express any real preference for any one processor.
If all the task entropies are high, none of the tasks has a preference for any one of the available processors. Under these circumstances, the processor
system would be well utilized, with the tasks evenly distributed over the
processor system. If, on the other hand, each of the task entropies is very
low, one knows that all the tasks have strong preference for only one or a
few of the processors. However, by considering the task entropies alone, one
cannot decide whether all the tasks have preference for the same few or
different few processors. One cannot therefore reliably assess the utilization
of the whole processor system. This point is demonstrated by analysing the
results of some of the simulations discussed in previous sections.
Figure 16.4(b) only demonstrates that each task has a similar entropy value. This does not mean that all the tasks are evenly distributed over all the processors, as is clearly shown in Fig. 16.4(a). The graph in Fig. 16.4(b) says only that the tasks have a similar distribution. This fact is demonstrated in Fig. 16.9(a), from where it can be seen that the probability distributions for the tasks are similar, but that they are not necessarily even, which explains the distribution as represented in Fig. 16.4(a). On the other hand, the fact that the task entropies in Fig. 16.4(b) are relatively high means that the task distribution is fairly even. This results in reasonable system utilization. The distribution in Fig. 16.9(b) explains in the same way the results demonstrated by the graphs in Figs. 16.4(c) and (d). Here, the individual task entropies are lower than in Fig. 16.4(b), resulting in poorer system utilization. A look at Fig. 16.4(a) shows that no single processor receives more than 5% of the total work-load and no processor has less than 1%. The situation is completely different in the case demonstrated in Fig. 16.4(c), where one processor receives more than 25% of the total work-load and about 15 processors are almost idle.
It is concluded that the task entropies alone are not, in general, a reliable
measure for the utilization of the processor system. Their main value lies
in the fact that they measure the suitability of the whole processor system
for a given task. The processor system is particularly well suited for the
execution of a task if, as a result of the bidding process, it is likely to be
given to any one of a large number of processors.
A quantity better suited for monitoring the total distribution of tasks on
the whole processor system is the expected fractional task distribution
introduced in section 16.2. In terms of this distribution function the 'processor
entropy' is defined as follows:
  S = - Σ_{n=1}^{N} fn ln fn        … (16.15)
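The difference between the task entropies and the processor entropy is easily demonstrated. In this sketch (the function name and the 8 x 8 example are illustrative), every task is fully committed to a single processor, so all task entropies vanish, yet the processor entropy cleanly separates a congested system from a well-utilized one:

```python
import numpy as np

def processor_entropy(P):
    """S = -sum_n f_n ln f_n (eq. (16.15)), with f_n the expected
    fraction of tasks on processor n (eq. (16.3))."""
    f = P.mean(axis=0)
    g = np.where(f > 0.0, f, 1.0)        # convention: 0 ln 0 = 0
    return -np.sum(f * np.log(g))

M = N = 8
P_same = np.zeros((M, N)); P_same[:, 0] = 1.0   # every task on processor 0
P_diff = np.eye(M)                               # one task per processor
# Task entropies are zero in both cases, but the processor entropy is
# 0 for P_same (one congested processor) and ln 8 for P_diff.
```
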
Fig. 16.9   The probability distribution of tasks on the available processors (with reference to Fig. 16.4).
16.6   ECONOMIC EQUILIBRIUM
This section briefly discusses how some general problems from the theory
of economic equilibrium relate to the model developed in this work. It is
demonstrated how a task distribution, which maximally utilizes the system's
resources, can be constructed by maximizing the task entropies subject to
constraints given by the available processor resources.
It is tempting to take the approach to the task allocation problem in which
the processors are looked at as being consumers and the tasks as commodities.
Each consumer has limited resources and can therefore only consume a subset
of all available commodities. The aim of an effective allocation procedure
is to distribute the commodities (tasks) on the consumers (processors) such
that the total consumption exhausts the total resources. This point will now
be discussed in terms of the equilibrium states of the dynamical allocation
equation (16.5), i.e. states which satisfy the stationary condition:

  Ṗm,n = 0 ,   ∀m, n        … (16.16)
Let Rm,n be the resources available to processor n in dealing with tasks of type m. Then the allocation, described by the distribution P⁰m,n, is optimal only if:

  Σ_{n=1}^{N} p(m) P⁰m,n f⁰n = Σ_{n=1}^{N} Rm,n ,   ∀m        … (16.17)

Maximizing the task entropies subject to these resource constraints then leads, through the usual Lagrange-multiplier construction, to the optimal distributions of equations (16.18)-(16.20).
16.7   DISCUSSION
Rescaling all the utility matrix elements by the same amount increases the variance in the benefits as perceived by the tasks. This results in an increasing workload for some of the processors and less work for others. The results for this type of utility matrix were presented in Figs. 16.4(a)-(d).
By putting the utility matrix elements equal to the momentary probability
distribution, Gk,i = Pk,i (the self-confident choice), the perceived utility
changes as the system evolves. As the initial values of Pk,i are not perfectly
even, this kind of choice will eventually lead to a preferential distribution
of the tasks, at least for sufficiently high gain parameters. This was
demonstrated in Figs. 16.5(a)-(d) for two different gain parameters.
The self-confident choice has some shortcomings as discussed in section
16.4.3. These can be rectified by introducing a feed-back effect into the utility
function (the limited self-confident choice). This guarantees that a processor
becomes less attractive if its usage exceeds a certain critical limit. The effects
of this choice on the expected fractional task distribution and the entropy
evolution can be seen in Figs. 16.7(a)-(d). The threshold defined by the critical
limit (equation (16.12 relates to the resources available to the processor.
When these are fully stretched the processor becomes less attractive and tasks
are allocated to alternative processors.
Finally the option is considered of making a processor more attractive
to task m the more it is used by other tasks. As demonstrated in section 16.4.4
this choice leads to a very small variance for the values of the utility matrix and consequently a fairly even task distribution (Figs. 16.8(a) and (b)). A sharp deviation from an even distribution can only be observed for large gain parameters, β = 50.0 (Figs. 16.8(c) and (d)).
Earlier sections have discussed at some length how the task entropy gives
valuable information on the probability distribution of the allocation of tasks
to the processor system. It supplies a metric for the suitability of the processor
system for the execution of the incoming tasks. However, it does not, in
general, present a reliable metric for the actual utilization of the whole system.
To supplement the utilization metric a scalar quantity, called processor
entropy, is introduced. The processor entropy gives information on how the
totality of tasks has been distributed on the processors. Figures 16.10-16.13
plot the processor entropies of all the different utility function choices
discussed in section 16.4.
In general, the different resources available to the various processors will
put constraints on the probability distributions under which the entropy
functions are to be maximized. This important point is discussed in section
16.6. It is demonstrated how this optimal task distribution can be found by
applying the principle of maximum lack of knowledge. As the simulations
of section 16.4 have clearly demonstrated, the choice of the utility function
fixes the distribution of tasks on the processor system. Bearing in mind that
Fig. 16.10   The graph shows the processor entropies for the case represented in Figs. 16.4(a)-(d), but now for three different values of A.
Fig. 16.11   Processor entropies for two different gain parameters and a gain function set equal to the probability distribution. The associated task distributions are given in Figs. 16.5(a) and (c).
Fig. 16.12   The processor entropies for the case presented in Figs. 16.7(a)-(d) for three different values of b.
Fig. 16.13   Processor entropies for the 'do as the others do' choice of section 16.4.4.
16.8   CONCLUSIONS
Fig. 16.14   The formal representation of the processing of jobs on a heterogeneous network. The incoming job is split into different tasks in a suitable way and the network's processors are informed of the presence of these outstanding jobs. They are invited to send in bids, which will be evaluated by the job provider. On the basis of that evaluation, the job provider writes down a gain function which quantifies the perceived benefits of using any one of the processors which have submitted a bid. The task solutions are returned by the individual processors to be joined into a final solution.
REFERENCES

1. Hewitt C: 'The challenge of open systems', Byte, 10, pp 223-242 (April 1985).

2.

3.

4.

5.

6.

7.

8.

9.

10. Olafsson S: 'A model for task allocation', Internal BT Report (September 1993).

11. Olafsson S: 'A general model for task distribution of an open heterogeneous processor system', IEEE Transactions on Systems, Man and Cybernetics, 24, Pt II (1994).

12. Chandler D: 'Introduction to modern statistical mechanics', Oxford University Press (1987).

13. Wallich P and Corcoran E: 'Games that networks play', Scientific American, p 92 (July 1991).

14. Jaynes E T: 'Information theory and statistical mechanics', Phys Rev, 106, pp 620-630 (1957).
17

COMPLEX BEHAVIOUR IN NONLINEAR SYSTEMS

C T Pointon, R A Carrasco and M A Gell

17.1   INTRODUCTION
An example of the requirement for decentralized control in communication systems is within the context of emerging communications free-trade
zones (CFTZ), in which numerous communications service providers will be
operating different networks with differing characteristics [3]. Differing
services may be offered by different networks; networks will be competing
against each other and using resources in other networks; different networks
may have different levels of reliability and may be managed to give various
qualities of service. In such zones, networks will have to interconnect and
interoperate as both users and network operators draw upon resources
scattered within the disordered communications conglomeration.
The communications conglomeration in a CFTZ will consist of large open
collections of locally controlled, asynchronous and concurrent processes
interacting with an unpredictable environment. Decisions made at any point
in the system will be based upon local, imperfect, delayed and conflicting
information. Such systems will have operational characteristics which are
likely to be very different from those of the homogeneous network in the
public utility paradigm dealing with one type of information (e.g. voice)
controlled by a central office. Decentralized control of communications and
computational structures will become an overriding prerequisite for the
integrity and security of the highly complex communications systems which
will emerge [3, 4].
With increasing network complexity and correspondingly increasing distribution, it will become essential to develop decentralized control and coordination mechanisms: it will become increasingly difficult to control global
networks with centralized control systems. Globally distributed networks will
raise many new problems, particularly in the areas of signalling and signal
processing as many networks evolve and operate asynchronously.
New ways of engineering communications and processing systems will
be required to take account of the high levels of diversity, distribution and
decentralization. This will lead to fundamental changes in terms of the ways
in which command, co-ordination and control functions are perceived; interworking will raise many issues which go far beyond basic issues of protocols
for interconnection. Since the complete information exchange required for
a centralized decision-making process may not be feasible, particularly as
many information and communications systems are by nature decentralized,
practical constraints may make distributed decentralized control mandatory,
especially for extended multi-commodity service and network systems with
rapidly varying service configurations and user demands. The co-ordination
of decentralized decision makers is, however, a formidable problem.
Decentralization by its very nature introduces uncertainty into the decision
process - remote components of the same system can only have limited
information about each other and the overall system. Hence, decisions must be made on the basis of incomplete and uncertain information.
17.2
Service processing elements in telecommunications networks can be visualized as processing units in a multi-process environment which handle a certain
number of tasks including service originations, collection and processing of
service requests, disconnects and overhead activities including operating
system overheads and network management audits. Figure 17.1 shows a
simplified processor schedule for the processing unit in a typical switching
system.
The system operates as follows. A timer of duration T is initiated by the
overhead routines at the start of a processor cycle. The processor completes
all outstanding jobs in the higher priority data process queue before proceeding to a lower priority data process queue. Following the expiration of
the timer, the processor suspends work on any outstanding jobs and initiates
a new cycle. Note that the timer value T effectively determines the nature
of the processor schedule. If T is small, the schedule resembles a non-preemptive priority schedule, whilst a large T has the effect of assigning equal
priority to each of the queues in a schedule that is indicative of polling. The
application of this model can be viewed within a number of contexts.
Fig. 17.1    Simplified processor schedule for the processing unit in a typical switching system (overhead activities at t = 0, followed by the data process queues in decreasing priority).
Figure 17.2 shows a block diagram of the switching system under consideration. The operation of the switching system is as follows. Inbound data I[k] arrives at data process 1 (Q1[k]). This process introduces a fixed time delay d into the data stream. Once served (at the rate R1), the data O1[k] is forwarded to data process 2 (Q2[k]), where it is processed at the rate R2 and subsequently passes out of the system (given by O2[k]). Prioritization is introduced into the system by serving data process 2 before data process 1. Operating on a fixed time cycle, the processor decides which data process to serve based upon the quantity of data in each data process queue, Q1[k] and Q2[k]. If data is present at the second data process queue, then it will be processed first. Within any given cycle, data in data process 1 will be served only if data process 2 has been emptied. If both data processes are emptied, the system waits until the beginning of the next time cycle. This represents a non-pre-emptive schedule with data process 2 having the highest priority.
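The non-pre-emptive, two-queue service discipline just described can be sketched in a few lines (a minimal illustration; the function and parameter names are assumptions, not taken from the chapter):

```python
def run_cycle(q1, q2, r1, r2):
    """One processor cycle: data process 2 (high priority) is served
    first at rate r2; data process 1 is served only if process 2
    empties, and then only with the fraction of the cycle left over."""
    served2 = min(q2, r2)
    q2 -= served2
    if q2 == 0:
        spare = 1.0 - served2 / r2   # fraction of the cycle remaining
        q1 -= min(q1, r1 * spare)
    return q1, q2

# Process 2 empties (1.0 unit) using half the cycle; process 1 then
# receives half of its rate r1 = 2.0, draining one unit.
print(run_cycle(q1=3.0, q2=1.0, r1=2.0, r2=2.0))   # (2.0, 0.0)
```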
The data flow through the system may be studied using discrete time analysis (see Appendix A).
Fig. 17.2    Block diagram of the switching system: inbound data I[k] enters the data process 1 queue Q1[k], which introduces a fixed delay d; once served, the data O1[k] feeds the data process 2 queue Q2[k], whose output O2[k] is the outbound data.
O2[k] = min( Q2[k] + O1[k-d], R2 )        ... (17.1)

O1[k] = min( Q1[k] + I[k], R1( 1 - O2[k]/R2 ) )        ... (17.2)

Q1[k+1] = f( Q1[k] + I[k] + (R1/R2)( Q2[k] - Q2[k+1] + Q1[k-d] - Q1[k-d+1] + I[k-d] - R2 ) )        ... (17.3)

Q2[k+1] = f( Q2[k] + Q1[k-d] - Q1[k-d+1] + I[k-d] - R2 )        ... (17.4)

where the function min(i, j) is the minimum of the two real arguments i and j and, for some argument x, the function f is:

f(x) = ( x + |x| )/2
Equations (17.1) and (17.2) express the contention and priority of the
two data process queues whilst equations (17.3) and (17.4) are conservation
rules. Since equations (17.3) and (17.4) describe a nonlinear system with
feedback, an analytic treatment of the flow equations is not feasible.
However, it is possible to perform a certain amount of analysis by generating
the system's attractors for different service loading conditions using functional
iteration. For telecommunications applications, an understanding of the
system's reaction to an input data stream that exhibits occasional peak values
is essential. In order to simplify the problem, a digital switching function
is considered with a peak value of amplitude I_peak, which occurs for the duration of one processor time cycle at several instances of time. The switching function can be described as:
I[k] = I_norm + I[k-T0] + Σ_i I[k-T_i] - Σ_j I[k-T_j],    for i ≠ j        ... (17.5)

where

I[k-T] = { I_norm,             k < T
         { I_peak > I_max,     k ≥ T        ... (17.6)

and N is the total number of sample intervals, k is the time step (sample), T0, T_i and T_j are the switching time variables, and I_max is the maximum steady-state capacity of the data processor.
Numerical simulations of this nonlinear data processor system model have
revealed four basic operational modes in which the system either returns to
its original state after a peak loading, enters into long-lived oscillatory
behaviour, degrades into chaotic states or results in unbounded (overloaded)
behaviour.
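These modes can be reproduced by functional iteration of the flow equations (17.3) and (17.4). The sketch below is an illustration only: the recursions as coded are a reconstruction, the parameter values follow section 17.4.2 (R1 = 1, R2 = 0.5, d = 3), and the input profile, function names and step count are assumptions.

```python
def f(x):
    # f(x) = (x + |x|)/2, i.e. the positive part max(x, 0)
    return (x + abs(x)) / 2

def iterate(I, R1=1.0, R2=0.5, d=3, steps=200):
    Q1 = [0.0] * (steps + 1)   # low priority (data process 1) queue length
    Q2 = [0.0] * (steps + 1)   # high priority (data process 2) queue length
    for k in range(d, steps):  # queues assumed empty for the first d samples
        # conservation rule for the high priority queue, eq. (17.4)
        Q2[k + 1] = f(Q2[k] + Q1[k - d] - Q1[k - d + 1] + I(k - d) - R2)
        # contention rule for the low priority queue, eq. (17.3)
        Q1[k + 1] = f(Q1[k] + I(k) + (R1 / R2)
                      * (Q2[k] - Q2[k + 1] + Q1[k - d] - Q1[k - d + 1]
                         + I(k - d) - R2))
    return Q1, Q2

# Constant rate I_norm = 0.125 (below the critical value R2/2 = 0.25)
# with a single-cycle peak I_peak = 2.0 at k = 10.
Q1, Q2 = iterate(lambda k: 2.0 if k == 10 else 0.125)
```

With I_norm below the critical value the transient decays and the queue returns towards the transparent (empty) state; raising I_norm towards and beyond the critical value reproduces the oscillatory, chaotic and unbounded modes described above.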
17.3
The block diagram of Fig. 17.3 shows the architecture of the signal processing
system for studying the complex behaviour of the nonlinear array. The
personal computer (PC) provides the man-machine interface to the processing
system and executes a suite of host software which allows:
data obtained from the array and processed by the DSP to be uploaded
to the PC for display in commercial graphics software packages.
Fig. 17.3    Architecture of the signal processing system: the PC host computer communicates over an RS232 serial interface with the Texas Instruments TMS320C25 digital signal processor target system which, via its address, control and data buses, connects to the data generator, the data acquisition interface and the array of nonlinear data processing elements. Parameters exchanged include the output data rate, the two data process lengths, the two process rates, the input data and the data processor clock.
The DSP target module provides a convenient interface for both the
generation and acquisition of data to and from the array respectively. The
user-defined array parameter values are assembled by the DSP for onward
transmission to the internal registers of the data generator hardware.
Conversely, data obtained from the array by the data acquisition interface
are received by the DSP for digital signal processing operations and returned
to the PC host. Signal transformation processing techniques (e.g. phase space
decomposition, Fast Fourier Transform) are used to analyse real time dynamical behaviour of the heterogeneous array.
The array has scope to emulate global information networks consisting
of thousands of elements. First results from the array, programmed initially
as cascades of double-process elements with simple embedded nonlinearity
[5-8], have reproduced traffic phenomena which have recently been revealed
in teletraffic studies in real telecommunications networks. The array provides
a rapid and flexible method of exploring phenomena in diverse communications systems.

17.3.1
The processor system was realized in VLSI using field programmable gate
array devices with two intensive shift register fabrics representing the two
data process queues connected in cascade. The data processes provided storage
for the two data streams contending for execution by the data processor.
The flow of data between the data processes was controlled by a data direction
control logic unit in the form of a Moore algorithmic state machine.
A block diagram of the data processor element is shown in Fig. 17.4.
The data processor element represents the nonlinear data processor system
and consists of two data processes that are connected serially and their
operation controlled by the data processor controller. A peripheral device
which provides the source data to the system is also shown.
The data processor element operates as follows. When the Write1 line is asserted high by the input data generator, data available on the Data line is systematically written to the first data process. Both data processes are identical in structure but differ in their interface connections. Whereas the Data and Write1 inputs on data process 1 are sourced by the data generator, in the case of the second data process these are provided by the Output line of data process 1 and the Write2 line of the data processor controller.
The input clock to each of the data processes is provided by the data processor controller, which takes one of two possible nominal rates. The first rate, Rload, is used for loading data into the particular data process at a rate determined by the master clock. The second rate, R_offload^(i) for i ∈ {1, 2}, is a system parameter that determines the frequency at which data is removed from each of the data processes. Furthermore, as the two data processes are connected in series, a transfer of information between the first and second data processes requires that the output rate of the first data process be synchronized to the input data rate of the second data process. In either case, such transfers are initiated by the data processor controller by means of the Read1, Read2 and Write2 signals, based upon the status of the output ready lines, OPRdy1 and OPRdy2, of the respective data processes.
17.3.1.1
Fig. 17.4    Block diagram of the data processor element. A master clock drives the data processor controller, which supplies the Clk1 and Read1 signals to data process 1 and the Clk2, Write2 and Read2 signals to data process 2, and monitors their OPRdy1 and OPRdy2 status lines. The input data generator drives the Data and Write1 inputs of data process 1, whose output feeds the input of data process 2; the output of data process 2 is the output data of the element.
The data processor controller contains two timers. The first is responsible for the generation of the data processor fixed schedule
cycle and is self-sufficient, i.e. it generates the active low terminal count status
signal AckTimer at the appropriate moment and consequently asserts the
reset mechanism RstDelay.
The second timer is responsible for the generation of the status signal
AckDelay following a given number of elapsed processor time cycles. The
value of the counter variable can be changed as necessary to explore a wide
range of system performances.
Fig. 17.5    State transition diagram of the data processor controller, showing the transitions between states State0 to State5 conditioned on the OPRdy1 and OPRdy2 status inputs.
If data process 2 is empty at the start of a processor cycle, but data process
1 contains data, then data process 1 will be served. If neither data process
contains data at the beginning of a processor cycle, then the system remains
idle until the next processor cycle, with the exception of inbound data entering
data process 1. This is defined by the path StateO-State2-StateO.
Data contained within data process 1 are processed by means of the data
processor controller initiating a fixed delay of duration d processor cycles
before the transfer of data from the first data process to the second data process at a rate determined by the parameter R_offload^(1). Upon the expiration
of this delay the system remains idle until the end of the present processor
cycle, when a new cycle is initiated. This is achieved by traversing the
transitional path State3-State4-State5. Data contained within data process
2 at the beginning of a processor cycle are processed by removal at a rate
governed by R_offload^(2). As previously discussed, if this operation is performed within a given processor cycle, then data process 1 resumes service. Otherwise
data process 2 continues to be served over a number of processor cycles if
necessary, until the entire contents of the data process have been removed.
Data processor controller equations

... (17.7)

... (17.8)

[ OPRdy2 ∧ p ∧ (¬q) ∧ r ]        ... (17.9)

... (17.10)

[ ( ... AckDelay ) ∧ (¬p) ∧ r ] ∨ [ (¬p) ∧ (¬q) ∧ r ]        ... (17.11)

RstDelay = [ (¬p) ∧ (¬q) ∧ r ]        ... (17.12)
17.3.1.2
The data processor model previously described is representative of a time-shared multi-process class of system that adopts a first-come first-served non-pre-emptive discipline, and therefore requires intermediary storage for the
processes that await the attention of the processor. The use of two intensive
shift register components naturally preserves the sequence of the arriving data
and permits the storage of any outstanding data that awaits execution by
the data processor.
The structure of each of the data processes is such that, at each data bit position x, there exists a data register R_x and an associated controller Con_x. Each register has three operating modes, hold, load and shift, with the exception of the first stage, which has only two (load and hold):
the hold mode maintains the contents of the associated data register bit
position for the duration of the current clock cycle;
the load mode is responsible for loading the associated data register bit
with the data value available at the immediate left data register output;
the shift mode is responsible for invoking a right shift operation on the
corresponding data register bit position.
These modes are derived for all stages (apart from the first stage) from the corresponding controller outputs M_x^0 and M_x^1 (where x represents the particular stage), which form the control inputs to a 3-to-1 multiplexer at the data register input. In the case of the first stage, a single multiplexer control input M_{n-1}^0 is used to define the two operating states load and hold. Each multiplexer has an additional output enable En control input to ensure data integrity.
Each controller utilises two states full and empty, which indicate the status
of the corresponding data register. There are three stages of controller element
- last, ith and first. The ith stage controller is a generic stage, and the
remaining two stages are its derivatives, with the constraint that the last stage
(controller) requires the shift output function, whereas the first stage does not.
Table 17.1 shows the input, output and register operating mode combinations that apply to the first, ith and last stage controllers, Con_{n-1}, Con_i and Con_0 respectively.
A state transition diagram for the generic ith stage controller is shown
in Fig. 17.6, from which the set of state transition tables (given in Appendix
B) and resulting implementable set of equations relating to the first and last
stage controllers can be deduced.
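The fall-through storage behaviour that these per-stage modes implement in hardware can be modelled behaviourally. The sketch below uses assumed names and a list-based model; the chapter's design is register-transfer logic, not software.

```python
# Behavioural sketch of the fall-through shift register used for each
# data process: written data ripples down to the lowest empty stage,
# so arrival order is preserved; index 0 is the output stage R_0.
class DataProcess:
    def __init__(self, depth):
        self.regs = [None] * depth       # None marks an empty stage

    def write(self, bit):
        # load: the new bit falls through to the lowest empty position
        for i in range(len(self.regs)):
            if self.regs[i] is None:
                self.regs[i] = bit
                return True
        return False                     # register fabric full

    def read(self):
        # shift: emit R_0 and right-shift the remaining bits one stage
        if self.regs[0] is None:
            return None                  # OPRdy would be low
        out = self.regs[0]
        self.regs = self.regs[1:] + [None]
        return out

dp = DataProcess(depth=4)
for b in (1, 0, 1):
    dp.write(b)
print([dp.read(), dp.read(), dp.read()])   # FIFO order preserved: [1, 0, 1]
```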
Table 17.1    Data process controller input, output and register operating mode combinations.

                            First stage              ith stage                       Last stage
                            controller (Con_{n-1})   controller (Con_i)              controller (Con_0)
Inputs                      Read, Write, R_{n-2}     Read, Write, R_{i-1}, R_{i+1}   Read, Write, R_1
Outputs                     R_{n-1}                  R_i                             Output ready, R_0
Register operating modes    Hold, Load               Hold, Load, Shift               Hold, Load, Shift
Fig. 17.6    State transition diagram for the generic ith stage data process controller.
NextR_i = ...        ... (17.13)

M_i^0 = Read ∧ R_{i+1}        ... (17.14)

M_i^1 = (¬Read) ∧ R_i        ... (17.15)

En_i = [ (¬ ... ) ] ∨ [ (¬ ... ) ]        ... (17.16)

The equations, derived from Table B3 (see Appendix B), for the first stage of the data process controller are:

NextR_{n-1} = [ Write ∧ (¬Read) ∧ R_{n-2} ] ∨ [ Write ∧ R_{n-1} ] ∨ [ (¬Read) ∧ R_{n-1} ]        ... (17.17)

M_{n-1}^0 = (¬Read) ∧ R_{n-2}        ... (17.18)

...        ... (17.19)

The equations, derived from Table B4 (see Appendix B), for the last stage data process controller are:

NextR_0 = [ Write ∧ (¬Read) ] ∨ [ R_1 ∧ R_0 ] ∨ [ Write ∧ R_0 ] ∨ [ (¬Read) ∧ R_0 ]        ... (17.20)

M_0^0 = Read ∧ R_1        ... (17.21)

M_0^1 = [ (¬Read) ∧ R_0 ]        ... (17.22)

En_0 = [ ... ∧ Read ∧ (¬R_1) ]        ... (17.23)

OPRdy = R_0
17.4    RESULTS AND DISCUSSION
It has been shown that the data processor system will return to a transparent
state following a momentary overload of arbitrary size in the rate of the input
data stream, as long as the normal input data rate is less than half of the
frequency of the second data process rate (i.e. I_norm < R2/2) [7]. An input data rate below this critical value leads to a stable mode of operation, whereas
a rate which exceeds the critical value results in an unstable mode of operation
which can lead to chaotic and unbounded behaviour. Through the use of
four differing sets of input stimuli, a single data processor was driven into
stable, unstable, chaotic and unbounded modes of operation. The first set
of input data satisfied the critical value condition and enabled the processor
to return to its transparent state following a series of excessive input
perturbations. However, the second, third and fourth sets of input stimuli
were chosen so that they exceeded the critical value and, following a series
of momentary overloads in the input data rate, resulted in sustained
oscillations and chaotic and boundless behaviours respectively in the length
of the low priority data process. The high priority data process on the other
hand, exhibited long-lived oscillations and chaotic states in response to these
stimuli, whilst the output data rate developed 'hard-clipped' oscillations with
a variable degree of regularity.
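The four regimes are separated by the same critical value in each case. A quick check of the condition I_norm < R2/2 against the four stimuli is sketched below; the regime labels simply restate the observations above and are not derived by the test.

```python
# Stability test I_norm < R2/2 for the four input stimuli; the regime
# labels restate the chapter's observed outcomes for R2 = 0.5.
R2 = 0.5
stimuli = [(0.125, "stable"), (0.25, "oscillatory"),
           (0.332, "chaotic"), (0.5, "unbounded")]
for i_norm, regime in stimuli:
    ok = i_norm < R2 / 2
    print(f"I_norm = {i_norm:5.3f}  critical condition "
          f"{'satisfied' if ok else 'violated'}  ({regime})")
```

Only the first stimulus satisfies the condition, matching the single stable outcome.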
17.4.1
Fig. 17.7
Graphs showing the calculated phase space surface trajectory of the low priority
data process queue length for the cases of (a) I_norm = 0.125 and I_peak = 2.0, (b) I_norm = 0.25 and I_peak = 2.0, (c) I_norm = 0.332 and I_peak = 2.0, and (d) I_norm = 0.5 and I_peak = 2.0. For clarity,
representative axes are labelled in (a) only. In (a) the system returns to a steady or transparent
state following the initial transient behaviour. The transient behaviour is indicated by the eight
peaks and the steady-state behaviour corresponds to the remaining flat portion of the phase
portrait. In (b) the system does not return to a transparent state, but exhibits long-lived oscillations.
In (c) the length of the data process queue varies in an apparently random fashion and in (d)
the data process grows without bound. The construction of the phase portraits is described
in Appendix C.
Fig. 17.7    (Contd).
Driving the system into the unbounded mode results in continuous growth of the process queue length, which is clearly visible in Fig. 17.7(d),
in that the phase portrait grows away from the origin.
Figure 17.8 shows the numerically simulated phase spatial evolution of
the high priority data process queue length when the rates I_norm ∈ {0.125, 0.25, 0.332, 0.5} are successively used in conjunction with the above input data
rate profile to stimulate the data processor. It is shown that the application
of an input profile adopting the rate I_norm = 0.125 results in a rapidly
diminishing transient response, for which the data process queue length is
empty (indicated by the smoothness of the base of the phase portrait in
Fig. 17.8(a)) after approximately 42 schedule cycles. In contrast, a normal
input of I_norm = 0.25 results in an initial transient, which after approximately 100 cycles leads to long-lived oscillatory behaviour which is reflected by the irregular structure in the phase portrait in Fig. 17.8(b). This behaviour is superseded as the data process degrades into chaotic states in which the queue lengths vary in an irregular manner when the system is driven with an input data rate of I_norm = 0.332. This is illustrated by the rich dynamical structure
of the centre forward portion of the phase portrait in Fig. 17.8(c). As the
frequency of the service requests increases still further, the high priority
process is granted the processor schedule; meanwhile the low priority process
queue continuously grows. Once the low priority process regains an execution
slot, the effect of the fixed delay is such that no data is transferred between
processes and the high priority process becomes idle. This effect is
compounded with time and explains why under conditions of input
information overload, the low priority process queue grows without bound,
whilst the high priority process contains little data. Assuming an input data
rate of I_norm = 0.5, Fig. 17.8(d) shows that the high priority data process
queue remains relatively inactive when the rate of data arrivals exceeds a
critical value; the size of the fixed delay ensures that few jobs accumulate
in the higher priority queue.
Figure 17.9 shows how the output data rate varies against elapsed
processor cycles. In each case, the output rate is equal to the input rate prior
to the cycle in which the system was perturbed by the first single cycle of
peak rate in the input data stream. The output data rate subsequently begins
to oscillate between the assigned value of the high priority data process rate
and zero. However, whereas the system that was perturbed with the 'undercapacity' data stream quickly returns to stability (since the data processor
is able to satisfy the computational demands of the input data stream), the
'overloading' data stream, that violates the critical value of input data rate,
results in continued oscillatory behaviour. Driving the system further into
either the chaotic regime or unbounded mode results in similar sustained
oscillatory behaviour.
Fig. 17.8 Graphs showing the calculated phase space surface trajectory of the high-priority
data process queue length for the cases of (a) I_norm = 0.125 and I_peak = 2.0, (b) I_norm = 0.25 and I_peak = 2.0, (c) I_norm = 0.332 and I_peak = 2.0 and (d) I_norm = 0.5 and I_peak = 2.0. The representative
labelling of the axes is given in Fig. 17.7(a) and described in Appendix C. In (a) the system
returns to a steady or transparent state following the initial transient behaviour. The transient
behaviour is indicated by the two peaks and the steady-state behaviour corresponds to the
remaining flat portion of the phase portrait. In (b) the system does not return to a transparent
state, but exhibits long-lived oscillations. In (c) the length of the data process queue varies in
an apparently random manner and in (d) the data process remains virtually empty throughout
the system's execution.
Fig. 17.8    (Contd).
Fig. 17.9
Graphs showing the calculated variation of the output data rates against discrete
processor cycles for the cases of (a) I_norm = 0.125 and I_peak = 2.0 and (b) I_norm = 0.25 and I_peak = 2.0. In (a) the system returns to a steady or transparent state following the initial transient
behaviour. In (b) the system does not return to a transparent state, but exhibits long-lived
oscillations.
17.4.2
The first test application of the array has been the study of the processor
batching behaviours which frequently emerge in teletraffic systems. The
nonlinear processing system was driven into the stable, unstable and chaotic
modes of operation using different sets of input stimuli, each having a profile
which remained constant with a magnitude I_norm prior to a peak value of magnitude I_peak which occurred for one processor cycle, after which a constant rate I_norm was established for the remaining cycles. The DSP was
used to calculate the phase portraits (see Appendix C) and frequency spectra
of the dynamical first data process queue, in real time. The phase portraits
represent the system dynamics projected on to a two-dimensional plane where
the first point of the original time series is represented by the phase space
ordinate and the third point of the time series, the phase space abscissa. The
second point of the time series is represented by the ordinate of the second
point in phase space. The connection of the points results in a trajectory which
enables the state dynamics of the nonlinear processing system to be visualized.
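The pairing rule described above amounts to a delay embedding of the queue-length time series. A simplified sketch is given below; the exact index rule used on the DSP is given in Appendix C, and the helper name here is an assumption.

```python
def phase_portrait_2d(series):
    """Form the 2-D phase trajectory by pairing each sample with its
    successor: point k is (series[k], series[k+1])."""
    return [(series[k], series[k + 1]) for k in range(len(series) - 1)]

# A short queue-length excerpt; connecting the points in order traces
# the trajectory through phase space.
points = phase_portrait_2d([0.0, 1.25, 0.625, 0.0])
print(points)   # [(0.0, 1.25), (1.25, 0.625), (0.625, 0.0)]
```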
Figure 17.10 shows the resulting phase portrait when the system is operated
in a stable mode. A specific set of system parameters for the stable mode of operation was chosen: R_offload^(1) = 1, R_offload^(2) = 0.5, I_norm = 0.125, I_peak = 2.0
Fig. 17.10    Phase portrait of the nonlinear data processor system when operated in a stable mode (axes x_k, x_{k+1} = Q1[4i+j]).
and d = 3. The reader is referred to Burton and Gell [7], where a more detailed discussion of the system behaviours is given. From the transparent state, in which the length of the first data process queue is zero (given by the point (0,0)), the trajectory visits a series of points in phase space prior to
entering the basin of attraction of the fixed point (0,0). Thus, the system
returns to its transparent state following the transient response to an input
perturbation.
Figure 17.11 shows the phase portrait of the data processor system when
operated in an unstable mode, with the following parameters assumed: R_offload^(1) = 1, R_offload^(2) = 0.5, I_norm = 0.25, I_peak = 2.0 and d = 3. Starting from the
transparent state given by the point (0,0), the system exhibits transient
behaviour about the input perturbation and subsequently adopts a limit cycle
bounded by the points (2,1)-(1,1)-(0,1)-(1,1)-(1,2). Figure 17.12 shows the
phase portrait of the data processor system when operated in the chaotic
mode. Starting from the transparent state given by the point (0,0), the system
quickly adopts the form of a strange attractor - a complicated geometric
form [10, 11].
Figure 17.13 shows the power spectrum of the low-priority data process
queue length Q1[k] when the nonlinear data processor system was operated
Fig. 17.11    Phase portrait of the nonlinear data processor system when operated in an unstable mode.
Fig. 17.12    Phase portrait of the nonlinear data processor system when operated in the chaotic mode (axes x_k, x_{k+1} = Q1[4i+j]).
Fig. 17.13    Power spectrum of the low-priority data process queue length (log power against frequency f).
Fig. 17.14    Power spectrum (log power against frequency f).
17.5    CONCLUSIONS
APPENDIX A
Nonlinear data processor system discrete time analysis
With reference to Fig. 17.2, the data flow through the system may be studied
using discrete time analysis. The data process queue lengths at a sampling
interval k+ 1 are given by the queue length in the previous interval k plus the
net flow in that interval:
Q1[k+1] = Q1[k] + I[k] - O1[k]        ... (A17.1)

and

Q2[k+1] = Q2[k] + O1[k-d] - O2[k]        ... (A17.2)
The output of the second process can be one of two possible values: the maximum rate at which the second data process queue can be served, R2, or, if there is a smaller amount of data on the second data process queue, the contents of the process during the previous processor time cycle, Q2[k] + O1[k-d]. The output of the first process is similarly constrained, with the additional provision that only the residual capacity after the second data process is served is available to serve the first data process. That is, the maximum amount of data that can be dealt with by the first process is the rate of the first process, R1, multiplied by the fraction of the time cycle remaining once the second process has finished. The time taken by the second process is given by:

O2[k]/R2        ... (A17.3)

so the capacity remaining for the first process is:

R1( 1 - O2[k]/R2 )        ... (A17.4)

If there is less than this amount of data in the first process queue then the output of the first process will be the contents of the process during the previous time cycle, Q1[k] + I[k]:

O2[k] = min( Q2[k] + O1[k-d], R2 )        ... (A17.5)

O1[k] = min( Q1[k] + I[k], R1( 1 - O2[k]/R2 ) )        ... (A17.6)
... (A17.7)

... (A17.8)
Substituting for O1[k-d] from equation (A17.1) gives:

O2[k] = min( Q2[k] + Q1[k-d] - Q1[k-d+1] + I[k-d], R2 )        ... (A17.11)

so that the term ( 1 - O2[k]/R2 ) is equivalent to:

(1/R2) f( R2 - Q2[k] - Q1[k-d] + Q1[k-d+1] - I[k-d] )        ... (A17.12)
Substituting equation (A17.6) for the O1[k] term in equation (A17.1) results in:

Q1[k+1] = Q1[k] + I[k] - min( Q1[k] + I[k], R1( 1 - O2[k]/R2 ) )        ... (A17.13)

which, using equations (A17.11) and (A17.12), may be rewritten as:

Q1[k+1] = f( Q1[k] + I[k] + (R1/R2)( Q2[k] - Q2[k+1] + Q1[k-d] - Q1[k-d+1] + I[k-d] - R2 ) )        ... (A17.17)
Q2[k+1] = Q2[k] + Q1[k-d] - Q1[k-d+1] + I[k-d] - O2[k]        ... (A17.18)

and:

... (A17.19)

Hence:

Q2[k+1] = f( Q2[k] + Q1[k-d] - Q1[k-d+1] + I[k-d] - R2 )        ... (A17.20)
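The final step rests on the identity x - min(x, R2) = f(x - R2), where x stands for the sum Q2[k] + Q1[k-d] - Q1[k-d+1] + I[k-d]. A quick numerical check of this identity (a sketch, not part of the original derivation):

```python
def f(x):
    return (x + abs(x)) / 2   # f(x) = (x + |x|)/2 = max(x, 0)

R2 = 0.5
# With O2[k] = min(x, R2), equation (A17.18) gives Q2[k+1] = x - min(x, R2),
# which the identity reduces to the f-form of equation (A17.20).
for x in (-1.0, 0.0, 0.3, 0.5, 0.7, 2.0):
    assert x - min(x, R2) == f(x - R2)
print("identity verified for sample values")
```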
APPENDIX B
State transition tables
The state transition diagrams for both the data processor controller and the
data processes can be synthesized into an intermediate tabular form, known
as a state transition table, and used to determine the governing equations.
Table B1    State transition table for the data processor controller showing the inputs OPRdy1, OPRdy2, AckTimer and AckDelay, the present state variables p, q and r, the next state variables Nextp, Nextq and Nextr, and the outputs Read1, Read2, Sel1, Sel2, Write2 and RstDelay.
Table B2    State transition table for the ith stage data process controller showing the inputs Write, Read, R_{i-1} and R_{i+1}, present and next states R_i and NextR_i, and the output control words En_i, M_i^0 and M_i^1, which correspond respectively to the three operating modes Hold, Load and Shift.
Table B3    State transition table for the first stage data process controller showing the inputs Write, Read and R_{n-2}, present and next states R_{n-1} and NextR_{n-1}, and the output control words En_{n-1} and M_{n-1}^0, which correspond respectively to the two operating modes Hold and Load.
Table B4    State transition table for the last stage data process controller showing the inputs Write, Read and R_1, present and next states R_0 and NextR_0, and the output control words En_0, M_0^0 and M_0^1, which correspond respectively to the three operating modes Hold, Load and Shift.
APPENDIX C
Phase-space representation of time series
The two-dimensional phase portraits presented in this chapter are projected on to a two-dimensional plane according to the following rule:

x_k, x_{k+1} = Q1[4i+j]        ... (C17.1)

for i = 0, 1, 2, ..., N, j = 0, 1 and k = 0, 2, 4, ..., where x_i, y_i and x_{i+1}, y_{i+1} are the phase-space co-ordinates, Q1[.] is the time series describing the length of the low-priority data process queue, and N is the length of the time series. The three-dimensional phase portraits show the phase trajectory plotted as a surface and projected on to a two-dimensional plane according to the following rule:

x_k, x_{k+1}, x_{k+2} = Q1[9i+j]

y_k, y_{k+1}, y_{k+2} = Q1[9i+j+3]

z_k, z_{k+1}, z_{k+2} = Q1[9i+j+6]        ... (C17.2)
REFERENCES
1.

2. Olafsson S and Gell M A: 'Application of an evolutionary model to telecommunication services', European Transactions on Telecommunications, 4, No 1, pp 69-75 (1993).

3.

4.

5.

6.

7. Burton F and Gell M A: 'Data flow through a double process system', European Transactions on Telecommunications, 4, No 2, pp 221-230 (1993).

8.

9.

10. Cvitanovic P: 'Universality in chaos', second edition, Adam Hilger, Bristol (1989).

11. Bai-Lin H: 'Elementary symbolic dynamics and chaos in dissipative systems', World Scientific Publishing Co, Singapore (1989).

12. Mees A and Sparrow C: 'Some tools for analyzing chaos', Proc of the IEEE, 75, No 8, pp 1058-1070 (1987).
Index
Abscissa, phase space 333
Access, see Network, Switching
Activation function 69, 72
Add-drop 185, 186
multiplexer, see ADM
Adjali I 45
ADM 96
Agent 61
autonomy 247
behaviour 261
box 53
central control 261
co-operation 58
distributed control 261
distribution 62
inter-agent
communication 248
load management 250, 254-256,
258, 260
mobile 246, 248-249, 262-263
parent 253, 255-258, 260
performance 259
randomly distributed 53
robustness 256-257, 261
self-organizing 247, 255, 262
uniformly distributed 56, 59
see also Software
AI 9
distributed 248
AIS 208, 215
Alarm 125
Alarm insert signal, see AIS
Algorithm
Bellman-Ford 116,117
Benders' decomposition 97
bucket brigade 237
complexity 117
compression 204
convergence 118
Dijkstra's 116
distributed circuit
assignment 139
distributed control 262
distributed network 129
distributed restoration, see DRA
dynamic programming 117
embodied 117
Floyd-Warshall 116
general 235
genetic 227-228, 230, 232-233,
242-243
graph theoretical 109, 113
greedy 115, 116
heuristic 130, 137
k-means 38-39, 41
Kruskal 115
maximal flow 109, 117
maze-running 236
message flow 109
minimum cost-flow 117
Moore state machine 319
Munkres 67, 80
non-simplex 115
optimal route 251
optimization 121
parallel distributed 118
Prim 117
recursion 117
restoration 137
route-finding 138, 139
self-copy 235
shortest-path 117
simplex 113, 121
span failure 135
stochastic decomposition 97
Amin S J 153
Amino-acid 51, 231
Amplification
optical 9, 172, 175, 176, 177, 199
Analysis
equilibrium 48
market-oriented 48
marginal 48
performance 86
probabilistic 104
sensitivity 23
stochastic 144
Annealing
hysteretic 67
simulated 137
ANT 5
Anthropomorphize 236
Appleby S 22, 245
Approximation
distribution 57
heavy-traffic diffusion 89
linear noise 50
macro fluid 90
meso diffusion 90
time-dependent 63
time-independent 63
Van Kampen 54, 56, 62
APS 125
Arc 103-104
incoming 104, 107
minimal total 112
outgoing 104, 107
Architecture
meshed 137
network 138, 336
ring 120
seven-level OSI 86
subsumption 248-249, 262
switching 1, 107
time and space bottlenecks
in 107
Arpanet 112, 118
Array
nonlinear 317, 336
see also VLSI
Artificial intelligence, see AI
'Artificial Life' 224, 233, 236, 239
simulation 232
systems 232
Assignment
capacity 112
capacity and flow 112
flow 112
probabilistic 290
see also Resource
Asynchronous transfer mode,
see ATM
ATM 4, 49, 97, 124, 144, 151, 153
network control 153
photonic 4
Autocorrelation 204, 205
Autocovariance 150
Automatic protection switching,
see APS
Autonomous network telepher,
see ANT
Backplane 4
Bandwidth 1, 8, 9, 96, 98, 144, 147
effective 149-151
limitation 197
limitless 197, 199
linear programme 100
management 145
transparency 197
unlimited 175
Battery
central 190
Beggs S L 124
Behaviour 236
antiparasitism 53
chaotic 326, 336
colony 232
co-operative 236
emergent 225
flock 53
formation 51
oscillatory 317, 329
parasitism 51, 232
predator 232
social 53, 232
task-oriented 236
tit-for-tat 62, 63
unbounded 326
virus 232
see also Market, Competence
Bellman-Ford, see Algorithm
Bell System 95
Benchmarking 172
BER 169, 201, 216
long-term 201
Billing 3, 9
Binary symbol 227
Biological
ants 237
competition 62
crossover 226-229, 237
evolution 224
message 237
mutation 226, 227
pay-off model 52-53
phenomena 224-225
sex ratio 265
stratification 51
systems 12, 51, 53, 61
techniques 239
Bit error analysis 200
Bit error ratio, see BER
Blocking
burst 99
call 99
Erlang 99
Botham C P 124
Breakage, see Maintenance
Breeding 232
Bridge 108
Broadband, see ISDN
Broadcast 3, 120
Brown G N 124
Brownian motion 25, 87, 94
local time process 91
reflected 89, 92
Buffer 87, 144, 146, 147
finite node 100
non-blocking 120
overflow 145, 146, 147, 149
queues 109
storage memory 119
Building block 224
Burst error 194
Business systems
automated 48
Butler R A 200
Bytes
frame overhead 126, 136
Cable
break 124, 199
coaxial 168, 172, 183, 188,
190, 192
direct bury 172, 188
failure
accidental 188
corrosion 168, 173, 188, 197
digger 173
dominant mechanism 183
moisture 168, 173
multiple 132
spade 173
statistical independence 170
windage 188
fibre 169
outage 173-174
overhead 172, 188
repair, see Maintenance
risk 199
ship 182
size 168
undersea 169, 172, 182
Call
drop-out threshold 131
fax 145
file transfer 145
loss rate 148
telephone 124
video 124
Capital
initial 52
Cardinality 105
order 105
size 105
Carrasco R A 311
CCITT 153
Cell 51
header 97
loss rate 144, 146, 147, 148, 151
production rate 149
route 145
packetized 97, 124
Cellular 2, 3, 16, 20
'Central Office' 46, 48, 120, 176,
245, 312
Channel
allocation, virtual 97, 100
capacity 49
connection 157
identifier, virtual 97
Chaotic
attractor 25
phenomena 292
regime 329
state 49, 58, 63, 317, 326, 329
Characteristic
length 26
statistical 145
Children 235
CIP 46
Circuit
assignment 138, 139, 140
bi-directional 133
electronic 171
equalization 175
hot standby 183
integrated 171
multiple failure 184
protection 174
standby 184
Circulant, see Network topology
Classifier 234, 237
Clock
recovered 208
Cluster 32
Coaxial, see Cable
Cochrane P 1, 168, 200
Code
evolved 243
generation 243
Coding
5B6B 216-219
debugged 239
error axis 219
HDB3 208-216
Coefficient
cubic 51
linear 51
quadratic 51
Communications, see Telecommunications
Competence 249
Competition 45, 53, 61, 63,
225, 311
co-evolving 232
sensitivity 61
demand 97
linear 112
probability 98
Converter
DC/DC 176
Copper 1, 3, 168-169
drop 197
twisted pair 172, 177, 183, 188
systems 172, 174, 175, 188
Correlation 212, 219, 225
length 32
Cosine series 204
Cost 52
distance-related
lowest 199
minimal total 115
negative 113
operating 190
primal improvement 113
reduction 168
running 197
transmission 2
Counting 118
CPU 53, 225, 231-232, 235-236
Craftsman 224
Crossbar 107
Crossconnect 131, 141
Crossover
operator 234
see also Biological
Curve
fitting 12, 229, 233-234
Koch 25
Customer
chance 12
control 9
expectation 9
Damage, see Maintenance, Cable
DAR 110
Darwin C R 225, 226
Data
census 37
chaotic 233
clock rate 171
noisy 233
random stream 207
reliability, see Reliability
Database 49, 139, 140
access time 136
DBM 26
DCS 96, 137
computer-controlled 125
Debugging 249
Decision-making 57
optimal sequence 115, 116
Principle of Optimality 116
routeing 117
stages 115
stepwise 115, 116
Decision threshold 211, 216,
217-219
Decoder 215
Decomposition 242
phase space 318
Delay
fixed feedback 315
Delta
function 156
Kronecker 159
Dempster M A H 84
Dendritic structure 26
see also Morphology
Density
power spectral 150, 204
Dielectric breakdown model,
see DBM
Diffusion 95
Diffusion limited aggregation,
see DLA
see also Equation
Digital crossconnect system, see DCS
Digital system processor, see DSP
Digital sum variation, see DSV
Dijkstra, see Algorithm
systems 25, 58
Hopfield 155
irreversible nonlinear 339
Laplace's 26
macroscopic 59
master 50
non-vanishing contribution 161
reliability 170, 172, 181
stochastic differential 95
traffic 88
Equilibrium 19,20,56,61,89,
159-160, 162, 268, 270-279,
308
queue length process 88
regulator 89
see also Analysis
strategy 273, 278, 282
Erbium-doped fibre amplifier,
see EDFA
Error
activity 211
bit 207, 208
burst 201, 202, 205, 212,
215, 220
palindrome effect 205
code 207, 208, 213, 215, 220
density 212, 213, 215, 220
detection 220
free seconds, see EFS
interval logger 208
probability 213
randomly generated 200
statistics 200
transient pattern 200
see also BER
ESS 265-266, 269, 281, 282, 283
Evolution 45, 53, 60, 61, 225-226,
228, 229, 234-236, 238, 243
'environment-oriented' 232
model 232
open-ended 233
strategy 226, 236
'task-oriented' 232
Evolutionarily stable strategy,
see ESS
Execution time 66, 247
Expansion, large system-size 50
Exponents 41
Facsimile 17, 20
Failures in ten, see FIT
Fast Fourier Transform, see FFT
FDM 175
Fernandez-Villacanas Martin J L
45, 224
FFT 318
Fibre 1-3, 137, 169, 192
low-loss 173
multimode 172, 188
one per customer 175
reliability 168-200
signal distortion 175
single mode 172, 188
splice 74
system 174, 175
technology 169
terrestrial 199
to the home, see FTTH
to the kerb, see FTTK
undersea 181, 199
Filtering
linear 150
see also Kalman filtering
FIT 171, 176
Fitness 227, 229, 232, 236, 239, 276
Flow
averaging 150
capacitated network 113, 118
commodity 111, 114
conservation 112-114
basic feasible solution 113
control 86
deterministic fluid 87, 89
maximum 112-114
minimum cost 113
Graph
acyclic 105
bipartite 107, 113
complete 105
connected 105
dimensions 105
directed 96, 104
disconnected 108
distinct points 109
model 104
network-directed 109
nondirected 115
planar 109
redundancy 106
regular 105, 107
representation 103
switching 107
theory 103-104
problems 103-104, 109
vocabulary 103
trivial 108
see also Fractal, Tree
Greed 52, 60
see also Algorithm
Hardware
monitoring overhead 7
unreliability 5
Hausdorff dimension 37
Hawk-Dove game 271
Hawk-Dove-Bully-Retaliator
game 279, 281
Hawker I 124
Heatley D J T 168
Heuristic 114, 118, 228
Holding time
negative exponential 92
Holland J H 227
Homeworking 199
Hopfield net 65-67, 68-70, 71-77,
80-82, 153-154, 162
attractor 73-74, 160, 162
imposed 164
negative 160, 164
multistage 107
spanning tree 32, 105
Interface 9
humanized 1
International Standards Organization, see ISO
Invariance 151
ISDN 19, 201
broadband 97
ISO 126, 141
IT 45
Ito integral 91
Jackson, see Theory
Johnson D 124
Joint 173
Kalman filtering 97
Kephart J O 58, 62
Kleinrock independence
assumption 87, 110
Koza, see Genetic
Kruskal, see Algorithm
Kurtosis 205
LAN 103
Landsat images 25
Language
C 229, 233, 240
C++ 229
fitness specification 243
LISP 228-229
Mathematica 229
Scheme 229
XLisp 229
Laplace, see Equation
Law of large numbers 8, 90, 93, 197
Leaky feeder 3
LED 208, 211
Light emitting diode, see LED
Lightning 194, 220
Linear
dependence 243
programming 113, 114, 137
regression 234
Market
conditions 45
discontinuities 47
disequilibrium 48
dynamics 52
emergent behaviour 49
environment 54, 58
evolution 47
forces 45
global 51, 311
pluralism 47
principles 285
process
agent/resource 49
auctioning 49
bartering 49
bidding 49
share 15-16, 19, 20, 59, 60, 62
strategy 53, 62, 63
Markov 54, 62-63, 95, 170
modulated fluid 87, 92, 97, 99,
146, 148, 151
piecewise deterministic 92
process 151
holding times 151
Matrix
adjacency cost 116
configuration 158
connection 161, 164
symmetric 161
gain 266-267, 271, 273-276, 279,
283, 298
incidence 104
interscale transfer 34-35
Leontief 92
pay-off 286
preference 293
probability 290
routeing transition 88
stability 266, 274-275, 277,
280-281, 283
utility 286, 298, 305, 308
vertex adjacency 104
weight 155
Maze running 235
software 235
McIllroy P W A 224
Mean time before failure, see MTBF
Mean time to repair, see MTTR
Measure
conventional information 24
performance 7, 9, 109, 110
see also Metric
Medova E A 103
Megastream 139
Memory 225, 231, 236
Menger, see Theory
Message passing
fast 125
Method, see Algorithm, Heuristic
Metric 139, 209
confidence 207, 208, 211, 217
decision-point 206
mean 205-206, 211, 215, 216,
217, 219
pattern 206
Microwave 2
MIMD 231
MINOS 147, 148
MIPS 240
MMP 28
Mobility 3, 198
Mode
all-nodes 129
all-spans 129
behavioural 336
chaotic 326, 333, 337
free-run 129
interactive 129
operational 197
simulation 129
stable 326
unbounded 326
unstable 326, 333, 334, 335, 337
Model
biological 230
burst error 200
closed network 87, 105
connectivity 109
Byzantine general problem
109
evolutionary 225
hierarchical 84
multicommodity flow 117
multinomial logic 28
network flow 109, 110
open network 87, 105
inter-arrival process 88
service-time process 88
optimization 110
probabilistic 125
reliability 169, 170
three-level stochastic
optimization 85, 95
Moment generating function 149
Monitor for inferring network
overflow statistics, see MINOS
Monopoly 46
Morphology 25
dendritic town 26
urban 26
MTBF 7, 125, 129, 171, 174, 176,
187, 194, 195
MTTR 129, 130, 171, 173, 174,
176, 182, 192, 193
Multilayer framework 141
Multimedia 199
Multiple-instruction multi-data,
see MIMD
Multiplexing 97, 148, 176, 183
Boolean 229
control 323
duplication 182
statistical 149
terminal 180
Multiplicative multinomial process,
see MMP
Mutation 51, 53, 61, 63
bit 238
operator 238
pay-off 53
search 234
Netput process 89
Network 1, 46-47, 311
access 5, 15
audit 138
balanced 113
capacity 2, 125, 127
availability 140
bounded link 114
upgrade 197
utilization 125, 287
circuit-switched 103, 250
classification 103
communications 105, 109
complexity 6, 312
congestion 250
control 80, 124
distributed 138
control software 9
cost optimization 137, 138
customer-to-customer 199
data 109
design 85, 114, 168
digitalization 5
disjoint path 106, 108, 109
down time, see MTBF
element manager 136
end-to-end view 9
extension 140
failure 1, 6, 8, 124, 125, 132,
139, 170, 171, 173, 185, 194
location forecasting 194
statistics 174-175, 194
see also Cable failure
failure-resilient 120
flexibility 124, 190
future 121
heterogeneous 337
sub-second 140
time 136
see also DRA
ring 137
self-routeing 107
software 124
sparsely connected 161
star 35, 100
switching 105,107,109,120,144
telegraph 103
test 140
throughput 109
topology 86, 100, 107, 109, 118,
129, 130
irregular mesh 107
connected mesh 107, 112
ring (circulant) 107
traffic rebalancing 100
transparency 2, 8, 9
unstable behaviour 164, 313
utilization 184
vulnerability 108
wide area, see WAN
see also Hopfield, ATM
Neural
activity level 154
negative 161
McCulloch and Pitts model 68
network 65-66, 68, 76, 77, 80,
153, 158, 233
back-propagation 153
velocity 162
weights 154
synaptic 161
Neuron 71, 72-74, 155, 160, 161,
164
Neuroprocessor 76
Node
balanced bottleneck 88
bottleneck 88
in chip 118
chooser 127
destination 110
failure 124, 125, 132, 134, 137
reduction 8
flexibility 180
geographical coverage 103, 110
identity (NID)
intermediate 133
interrogation 261
message sink 109
message source 109
non-bottleneck 88
occupancies 88
ordered pair 104
paths 103
protection 127
reduction 197
restoration 134
simultaneous failure 7
strict bottleneck 88
switch 195, 199
tandem 127
technical definition of 103
termination 35
tree 228
unordered pair 104
see also Origin-destination,
Vertices
Noise 61, 76
Gaussian 56, 207
nonlinear 63
see also Approximation
Nonlinear control strategy 229
NP-hard 118
Object oriented programming 110,
262
OD 96, 100, 110
multiple-pair flow 114
pair revenues 99
Offspring 225
Olafsson S 153, 264, 285
Open systems interconnection,
see OSI
Optical
free-space 2, 3
HDB3 modem, see Coding
network
design 114
multiwavelength 120
transparency 200
wavelength 117, 169
receiver 208
technology 199
transmitter 208, 211
transmitter-receiver pair 100,
120
see also Fibre, Network
Optimization 85, 112, 113, 153,
158, 228
combinatorial 107
control 243
criterion 115
deterministic 110
linear 112
nonlinear 112, 118
objective function 110, 121
parameter 154, 159-161,
162-164
stochastic 86, 110
transportation 113
Optoelectronic
component 168
Order complementarity 89, 100
Organic molecule 51
Organism 51, 225, 236
Origin-destination, see OD
Oscillation
'hard-clipped' 326
persistent 58, 63, 326, 329
see also Behaviour
OSI 125
Output
equilibrium lost 89
potential 89
Overload 66
avoidance 138
Packet 120, 156-158
delay 110
transmission 109
Paradigm
competitive industry, see CIP
public utility, see PUP
Parallelism 80, 121
Parameter
control 59
interference 209, 216, 220
preference 12
system 63, 217
uncertainty 58-59, 63, 286
variation 211
Parent 226-227
monitor 257, 260
Path
connecting 109
cyclic 105
delay 3
directed 110
geographical 184
length 127
restoration 133
end-to-end 134-135
see also DRA
shortest 116
simple 105
single 107
virtual 97, 100
see also Vertex
Pattern 224, 229
recognition 233, 234
Pay-off 58
changing 60, 61
cubic 59
linear 60
nonlinear 63
perceptions 62
stochastic effects 62
see also Biological, Mutation
Power
consumption 171
feeding 168, 176, 188
grid distribution 194
outage 184
spectrum 335, 336
supply duplication 176, 180-181,
184, 199
surge 221
transient 194
Pre-smoothing 151
Pricing
real time strategies 48, 63
setting 63
Prim, see Algorithm
Probability 13, 24, 51
cell-loss 99
critical 33
distribution 266, 275, 280
factors
advertising 51
dealing 51
special offers 51
trust 51
momentary 305
system 58
transition 56
see also Distribution
Profit 52
maximization 112
Program 224, 229
application 235
'brittle' 242
co-operation 238
C-zoo 231, 235-236
error-free 242
error-sensitive 242
error-tolerant 243
evolved 239
genetic 242-243
Hermes 236-239
heterogeneous 237, 285, 286
length 243
Re-investment 52
Relationship
cost/price 240
parasitic 232
power-law 23-24
weighted sum 41
predator/prey 12
symbiotic 232
Relaxation 53, 60, 61
Reliability 2, 7, 125, 172, 179, 180,
186, 190, 199
end-to-end 173, 179, 186, 199
hardware 196
long line systems 179, 197
operational 197
optical 199-200
statistical 170, 172
see also Fibre optic, Local
loop
Renormalization 32, 33, 41
Renyi A 24
Repair, see Maintenance
Repeater 175-176, 183, 186, 199
buried/surface 173
cascaded 180-182
duplication 176
line 179
optoelectronic 175, 176
reliability 180-197
spacing 168, 173, 175, 199
technology 172
Repeater stations 2, 197
Replication 232
Reproduction 225, 227, 235
Requirements capture 239
Re-routeing 173
Resilience 86, 125, 141, 173
Resolution 24
logarithm 24
Resource 225
allocation 65-67, 313
linear 67
struggle 225
Revenue 12
Reward mechanism 51
r.f. 173
Richter scale 6, 196
Robot control 248
Routeing 86, 103, 109, 111-112,
117, 119, 120, 153, 185, 198
alternative 125, 129, 172, 258
availability 176
configuration 173
cost 247
in data 117
diverse 173, 184, 198-199
duplication 198
dynamic alternative, see DAR
hierarchical adaptive 120
multiple diverse 127, 139
optimal 110-111, 116
protocol 144
restoration 127, 129-130, 132
selection 120, 258
table 250
Rule
complexity 238
condition-action 237
deletion 238
duplication 238
Sampling interval 77, 78
adaptive 78
Satellite 3
geostationary 3
link 3
low earth orbit 3
mobile 2
Schema 228, 230, 233, 234, 242
SDH 124, 125-126, 130-131, 139,
172
restoration routes in 136
see also Network
Search space 242
Selection 225, 227
Serengeti 223
Service 8, 9
address 315
translation 315
availability 9
competition 11-12, 15, 20
customer 125
development 311
diversity 9
expected life 169
modelling interactions 11-20,49
new 18
origination 315
provider 45
telephony, see Telephony
uninterrupted 131
Shannon information 24
Sierpinski triangle 35-37, 41
Sigmoid function 69, 154
Signalling 3, 9, 136
common channel 49
duration 315
overhead 136
Signal-to-noise ratio, see SNR
Signature
acknowledgement 127
index numbers 127
Silicon technology 171, 172, 200
Simplex, see Algorithm
Simulated annealing method 35
Simulation
availability 129
computer 225
Monte Carlo 207, 209, 212, 215,
217
on-line 127, 131
speed 129
Skewness 205
SNR 211, 216, 220
Software 1, 5, 9, 125, 195, 200
agent 209
decomposition 243
engineering 224
evolving 242, 243
practitioner 234
robustness 246, 247
scaling 242, 243
self-regulating 110, 208
SONET 107
survivability 107
Span
failure 127, 130, 131, 133, 137
restoration 125
pre-planned 135
real time 135
Spatial distribution, see Distribution
Spectrum 2
photonic 9
telephone 103
State transition
diagram 342
table 342
Static discharge 170
Stationary process 149
nondeterministic 150
Steward S 245
Strategy
evolutionarily stable, see ESS
integrated restoration 135
mutant 270
strong 268-273
winning 268
Subdivision
recursive 28
Subgraph, see Graph
'Survival of the fittest' 225
Switching 9, 87, 144, 149, 168, 176,
185, 198, 313
access 153-154
adaptive 153
broadband 153
computer terminal 103
crosspoint node 107
failure 107
fraction 88
hot standby 176
interconnectivity 198
mechanism 157
packet 153
protection, see APS
redundancy 161
robust 153
station 168
Symmetry 59
Synchronicity 208
Synchronized optical network,
see SONET
Synchronous digital hierarchy,
see SDH
System
agent/resource 50-51, 59, 63
attractor 70, 155, 316, 334
availability 171
base 51
linear 60
constant 60
random 60
relaxation 60
greed 60
behaviour 58
bistable 59
brittle 49
client/server 49
complexity 49, 54, 58, 313
crash 257
decision-support 85
development 311
distributed 245, 246, 262
failure 313
fault-tolerant 314
ferromagnetic 59
fluctuations 49
internal 57
nonlinear 50, 54, 57, 58
heterogeneous 314
hierarchical planning 85, 86
intelligent 198
irregular operation 313
long-distance 198
market-like 286
N+1 standby 125, 173, 183-184
nonlinear processing 314, 316,
319, 333, 335, 338
time cycle 320
open 49
open-ended 242
operating 245
performance 320
reliability 176
repeaterless 180
self-organizing 49, 198
statistical 287
teletraffic 333
terrestrial 173, 176, 180, 182, 184
unavailability 186, 187, 191, 198
undersea 173, 176, 181, 199
utilization 308
see also ATM, Biological,
Parameter, Telecommunications
Systolic chip 188
Tariff 9
Task allocation 70-73,80,253,305
arbitrary preference 294
controller 70, 71
'do as the others do' 297
dynamic 285, 288
predetermined 308
self-confident choice 295-297,
305
Telecommunications
complexity 105, 118
reduced 198
convergence with computing/
media 45
decentralization 311, 312
design 112
distribution 311
diversity 311
engineer 104
evolutionary process 135, 243
global 199
infrastructure 8, 45
market 45, 47, 51
mobile 200
operator 22, 46, 311
routeing, see Routeing
UK 45, 48
USA 45
Telephony 15, 20
see also Cellular
Telex 17, 20
Temperature
Curie 59
TENDRA 124, 129, 130, 140
Terminal
duplication 176
station 176
Testing 239
Theory
central limit 92-94
central-place 25
deterministic 49, 100
dynamic systems 274
game 264, 282
dynamic 266-268
evolutionary 264-266,
282-283
principles 285
zero-sum 282
Jackson's 110
large deviation 94, 145
mean-field 62
Menger's 108
stochastic 49, 87, 94
see also Graph, Traffic,
Percolation
Throughput
maximization 158
Tools 224
Topology
multiple multi-butterfly 120
multi-ring 120
see also Network
Traffic
approximation 87-91
average 110
bursty 87, 144-146, 149
busy period 149
class 144, 149, 150
congestion 109, 313
diversion 200
Erlang theory 104
future 20
generator 250
input arrival rate 111
intensity 88, 144, 260
management 86, 253
modelling 8
modes 8
offered 127
pattern 8, 258
profile 250, 258
queue length 109
stationarity assumption 110
studies 318
synchronization 313
waiting time 109
Transfer function 287, 291
Transmission 7, 49, 171, 198
cable 192
capacity 110
diplex, see WDM
duplex 193
length 115
reliability 173
technology 137, 169, 176
link 103, 104, 115, 118
see also Edge
Trans-shipment 112, 113
Travelling salesman problem 66,
73, 120, 227, 238
Vertex
incidence 104
path 105
set 107
see also Matrix
Videotelephony 18, 19, 20
quality 19
Virtual circuit 117
VLSI 25, 67, 319
field programmable gate
array 314, 319
WAN 103
Wavelength division hierarchy,
see WDH
Wavelength division multiplexing,
see WDM
WDH 172
WDM 2, 100, 120, 172, 175,
192, 193
soliton 4
Weber R 144
Wiener process 94
Winners 232
Winter C S 224
Workstation, see PC
Zone, communications free-trade 312