
Introduction to Optimization

Dimitri P. Solomatine
Professor of Hydroinformatics

Some of the optimization problems in water resources

multi-purpose reservoir operation: what should be the releases ensuring maximum satisfaction of the users?
resources allocation: how much water should every user get?
model calibration (error minimization): what are the model parameters ensuring minimum error?
urban drainage network optimization (design and rehabilitation): which pipes to change to ensure minimum damage from flooding?
etc.

D.P. Solomatine. Introduction to Optimization 2

Single-objective optimization

Optimization: minimization or maximization
We typically optimize some function, called the objective function, or criterion

Single-objective optimization problem:
F(x1, x2, …, xN) → min
where x1, x2, …, xN are real numbers (vector X in N-dimensional space)

Multi-objective optimization problem:
F1(x1, x2, …, xN) → min
...
Fm(x1, x2, …, xN) → min

F → min is equivalent to −F → max

Example: x1, x2, …, xN are reservoir releases; F(x1, x2, …, xN) is the amount of hydropower → max

[Plot: the "X squared" example function f(x) = x²]

Optimization problem

Three components of an optimization problem:


Objective function(s): what do you want to minimize or maximize?
Decision variables: the variables on which the objective function(s) depend(s) and which are to be identified
Constraints: the ranges of these variables and other considerations limiting the choice of the decision variables' values

Decision making in planning

identify and quantify objective function(s), OFs


define decision variables and constraints
collect data
generate alternatives
evaluate alternatives (= calculate the OF for each of them)
select the preferred alternative (= choose the alternative with the minimum or maximum value of the OF)
implement the selected alternative
(the steps from generating alternatives to selecting the preferred one constitute the process of optimization)

D.P. Solomatine. Hydroinformatics and optimization. 5

Classical optimization

D.P. Solomatine. Introduction to Optimization 6

Unconstrained optimization (1)

Minimize the function f(x) = (x − 15)²

Solution: df/dx = 0 ⇒ 2(x − 15) = 0 ⇒ x* = 15

[Plot: f(x) = (x − 15)² ("X squared") on 0 ≤ x ≤ 35, minimum at x = 15]

D.P. Solomatine. Introduction to Optimization 7

Unconstrained optimization (2)

(simplified) necessary condition for optimum of f(x):


df / dx = 0

Multi-dimensional case f(x), x ∈ Rⁿ: all partial derivatives are zero

∂f(x)/∂x_i = 0,  i = 1, 2, …, n

or, written in compact form, the gradient is zero:

∇f(x) = ( ∂f(x)/∂x₁, ∂f(x)/∂x₂, …, ∂f(x)/∂xₙ ) = 0

D.P. Solomatine. Introduction to Optimization 8
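A minimal sketch of checking the zero-gradient condition numerically; the two-variable quadratic used here is an illustrative assumption, not a function from the slides.

```python
# Check the necessary condition (zero gradient) at the solution found by a gradient-based method.
import numpy as np
from scipy.optimize import approx_fprime, minimize

def f(x):
    return (x[0] - 15.0) ** 2 + (x[1] - 3.0) ** 2   # illustrative smooth function

res = minimize(f, x0=np.array([0.0, 0.0]))          # unconstrained minimization (BFGS by default)
grad = approx_fprime(res.x, f, 1e-8)                # finite-difference estimate of the gradient
print(res.x, grad)                                  # x* ~ [15, 3], gradient ~ [0, 0]
```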

Constrained optimization (1)

Minimize the function f(x) = (x − 15)² with the constraint 5 < x < 25

Solution: df/dx = 0 ⇒ 2(x − 15) = 0 ⇒ x* = 15 (always works?)

[Plot: f(x) = (x − 15)² with the constraints 5 < x < 25; the unconstrained minimum x = 15 lies inside the feasible interval]

D.P. Solomatine. Introduction to Optimization 9

Constrained optimization (2)

Minimize the function f(x) = (x − 15)² with the constraint 5 < x < 10

Solution: df/dx = 0 ⇒ x* = 15 DOES NOT WORK!
(the stationary point lies outside the feasible interval; the constrained minimum is at the boundary, x* = 10)

[Plot: f(x) = (x − 15)² with the constraints 5 < x < 10 and the constrained minimum at the boundary]

D.P. Solomatine. Introduction to Optimization 10

Constrained optimization (3)

General non-linear optimization problem:

f(x) → min
subject to g_i(x) ≤ 0 (inequalities)

Inequalities can be converted to equalities (by introducing so-called slack variables):

f(x) → min
g_k(x) = 0,  k = 1, …, m (equalities)

This problem can be solved by introducing Lagrange


multipliers

D.P. Solomatine. Introduction to Optimization 11

Constrained optimization (4): Lagrange multipliers

f(x) → min
g_k(x) = 0,  k = 1, …, m

Instead of f(x), the Lagrangian function is minimized:

L(x, λ) = f(x) + Σ_{i=1}^{m} λ_i g_i(x)

(n+m) partial derivatives are set to zero, and then (n+m)


algebraic equations with (n+m) unknowns are to be
solved

D.P. Solomatine. Introduction to Optimization 12
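A minimal sketch of the Lagrange-multiplier approach using symbolic algebra; the example problem (minimize (x − 15)² + y² subject to x + y − 10 = 0) is an illustrative assumption, not from the slides.

```python
import sympy as sp

x, y, lam = sp.symbols('x y lambda', real=True)
f = (x - 15)**2 + y**2
g = x + y - 10

L = f + lam * g                                        # Lagrangian L(x, lambda) = f(x) + lambda*g(x)
stationarity = [sp.diff(L, v) for v in (x, y, lam)]    # the (n + m) partial derivatives set to zero
print(sp.solve(stationarity, (x, y, lam), dict=True))  # [{x: 25/2, y: -5/2, lambda: 5}]
```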

Constrained optimization (5): penalty functions

f(x) → min
g_k(x) ≤ 0,  k = 1, …, m

Just make f(x) very large outside the constraints:

f_new(x) = f(x), if g_k(x) ≤ 0 for all k = 1, …, m
f_new(x) = VeryLargeValue, otherwise

D.P. Solomatine. Introduction to Optimization 13
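A minimal sketch of the penalty idea on the earlier example (minimize (x − 15)² subject to 5 ≤ x ≤ 10), combined with a crude direct search over the box; the grid search is an illustrative assumption.

```python
import numpy as np

def f(x):
    return (x - 15.0) ** 2

def f_penalized(x, very_large=1e12):
    g = [5.0 - x, x - 10.0]                     # constraints written as g_k(x) <= 0
    return f(x) if all(gi <= 0.0 for gi in g) else very_large

xs = np.linspace(0.0, 35.0, 10001)              # direct search: evaluate many candidate points
vals = np.array([f_penalized(x) for x in xs])
print(xs[vals.argmin()])                        # ~10.0, the constrained minimum
```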

Constrained optimization (6): use of numerical methods in real-life problems

f(x) → min
g_k(x) = 0,  k = 1, …, m

Real-life problems have very complex constraints and objective functions, and typically do not allow for the use of Lagrange multipliers
iterative schemes were developed where at each step the gradient is calculated or assessed, and a step is made towards a minimum

D.P. Solomatine. Introduction to Optimization 14

Steepest descent

select a starting point x0
calculate the direction (based on the gradient ∇f(x), or smarter)
try to minimize along this line: this requires making trial steps
make a definite step towards the minimum along this line (how large is the step?)

x_{t+1} = x_t − α_t ∇f(x_t)

recalculate the direction
repeat until the minimum is reached
problem: how to find the step size α_t?
D.P. Solomatine. Introduction to Optimization 15
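A minimal steepest-descent sketch with a fixed step size; the test function, gradient, and step size are illustrative assumptions (a real implementation would find α_t by a line search).

```python
import numpy as np

def f(x):
    return (x[0] - 15.0) ** 2 + 10.0 * x[1] ** 2

def grad_f(x):
    return np.array([2.0 * (x[0] - 15.0), 20.0 * x[1]])

x = np.array([0.0, 5.0])          # starting point x0
alpha = 0.04                      # fixed step size (normally chosen by a line search)
for _ in range(200):
    g = grad_f(x)
    if np.linalg.norm(g) < 1e-8:  # stop when the gradient is (almost) zero
        break
    x = x - alpha * g             # x_{t+1} = x_t - alpha * grad f(x_t)
print(x)                          # approximately [15, 0]
```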

Steepest descent in detail

the optimal step size

D.P. Solomatine. Introduction to Optimization 16

Possible problems with the convergence

D.P. Solomatine. Introduction to Optimization 17

Descent using smarter schemes

conjugate gradient methods (Fletcher-Reeves, Polak-Ribiere algorithms)
quasi-Newton methods (using second-derivative information): Davidon-Fletcher-Powell (DFP), Broyden-Fletcher-Goldfarb-Shanno (BFGS)

D.P. Solomatine. Introduction to Optimization 18
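A minimal sketch of a quasi-Newton (BFGS) run via a library routine; the Rosenbrock test function and starting point are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

rosen = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2   # Rosenbrock test function
res = minimize(rosen, x0=np.array([-1.2, 1.0]), method='BFGS')
print(res.x)                                                  # close to [1, 1]
```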

Optimization along the line, without calculation of derivatives

Used at different stages of the optimization of complex functions of many variables

Line optimization: finding a minimum of a function of one variable, without calculating derivatives
bracketing a minimum by golden section search
using parabolic approximation (method of Brent)

[Figure: points a, b, c bracketing a minimum]

D.P. Solomatine. Introduction to Optimization 20

Line optimization: the principle of bracketing

The minimum is originally bracketed by points 1, 3, 2.
Choose some point 4 between 2 (worst) and 3 (best).
Evaluate the function at 4. If 4 is better than 2, replace 2 by 4.
Select some point 5 between 1 (worst) and 3 (best). If 5 is better than 1, replace 1 by 5.
Then 6, which replaces 4.
The rule at each stage is to keep a center point that is lower than the two outside points.
After the steps shown, the minimum is bracketed by points 3, 6, 5.
How to choose the points?

[Figure: F(x) with the starting bracket x1, x3, x2 and the successive trial points 4, 5, 6]
D.P. Solomatine. Introduction to Optimization 21

Line optimization: bracketing the minimum by golden section search

Two points a, c are given. How to choose b, i.e. what should W be?

Solution: imagine b is known, so that points a, b, c are given, and the minimum is between b and c. How to choose x?

W = (b − a)/(c − a),  1 − W = (c − b)/(c − a),  Z = (x − b)/(c − a)

To minimize the worst-case possibility, choose x so that the next interval is equally likely to lie to the left and to the right of x:  W + Z = 1 − W
Scale similarity:  Z/(1 − W) = W
Combining:  W² − 3W + 1 = 0,  solution:  W = (3 − √5)/2 ≈ 0.38197
D.P. Solomatine. Introduction to Optimization 22
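A minimal golden-section search sketch, assuming a unimodal function on [a, c]; the test function is illustrative.

```python
def golden_section(f, a, c, tol=1e-6):
    W = (3 - 5 ** 0.5) / 2                     # 0.38197..., the golden-section fraction
    b, x = a + W * (c - a), c - W * (c - a)    # two interior points
    fb, fx = f(b), f(x)
    while c - a > tol:
        if fb < fx:                            # minimum lies in [a, x]
            c, x, fx = x, b, fb
            b = a + W * (c - a)
            fb = f(b)
        else:                                  # minimum lies in [b, c]
            a, b, fb = b, x, fx
            x = c - W * (c - a)
            fx = f(x)
    return (a + c) / 2

print(golden_section(lambda x: (x - 15.0) ** 2, 0.0, 35.0))   # ~15.0
```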

Line optimization: Brent search (1)

three points are chosen


parabolic approximation is used at each step
the minimum of parabola becomes the new point in
bracketing

D.P. Solomatine. Introduction to Optimization 23

Line optimization: Brent search (2)

original points: 1, 2, 3
a parabola is drawn through them; its minimum (4) replaces 3
a new parabola is drawn through 1, 4, 2; its minimum is at 5, which is close to the function minimum

[Figure: F(x) (the curve itself is not known, but the function can be evaluated at each x); points x1, x2, x3, the fitted parabolas, and the best estimate of the minimum]
D.P. Solomatine. Introduction to Optimization 24

Stopping rules of iterative algorithms

Iterate until f(x_n) ≈ f(x_{n−1})

More formally: run iterations until the relative tolerance becomes smaller than some required relative accuracy ε_y:

|f(x_n) − f(x_{n−1})| / |f(x_n)| < ε_y

Additional stopping rules:
the changes in x become small enough: |x_n − x_{n−1}| / |x_n| < ε_x
the number of iterations reaches some predefined n_max

Linear programming

D.P. Solomatine. Introduction to Optimization 26

Linear optimization (linear programming)

Particular case when f(x) and g_k(x) are linear functions:

f(x) = c_0 + Σ_{i=1}^{n} c_i x_i

g_k(x) = a_{k0} + Σ_{i=1}^{n} a_{ki} x_i ≤ 0

The method to solve it was first proposed by Kantorovich (1939) for the problem of optimal cutting of plywood; Nobel Prize of 1975.

Popular method of solving: simplex method of Dantzig


(1947)
D.P. Solomatine. Introduction to Optimization 27

Solving linear programming problem

f(x) = 2x1 + x2 → max
subject to 4(x1 + x2) ≤ 28, x1 ≥ 0, x2 ≥ 0

Gradually increase Z in the equation of the objective function Z = 2x1 + x2 (a family of parallel lines in the (x1, x2) plane).
Stop when we are about to leave the constrained region.
Solution: x* = (7, 0), f(x*) = 14

[Figure: the constraints and the objective-function lines in the (x1, x2) plane; the optimal point is a corner of the feasible region]
D.P. Solomatine. Introduction to Optimization 28
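A minimal sketch solving the slide's LP with a library routine (linprog minimizes, so the objective is negated).

```python
from scipy.optimize import linprog

c = [-2.0, -1.0]                 # maximize 2*x1 + x2  ->  minimize -2*x1 - x2
A_ub = [[4.0, 4.0]]              # 4*(x1 + x2) <= 28
b_ub = [28.0]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x, -res.fun)           # [7. 0.] 14.0
```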

Dynamic programming

Universal approach to solving dynamic problems
example: reservoir optimization, where the releases are functions of time

Steps in dynamic programming:
1. Break the problem into smaller sub-problems.
2. Solve these sub-problems optimally, applying this three-step process recursively.
3. Use these optimal solutions to construct an optimal solution for the original problem.

D.P. Solomatine. Introduction to Optimization 29

Dynamic programming: universal approach to solving dynamic problems

1. The problem can be divided into stages with a decision required at each stage (e.g. stage = time; decision = reservoir release)
2. Each stage has a number of states associated with it (e.g. volume)
3. The decision at one stage transforms one state into a state in the next stage (release → new volume at the next time step)
4. Given the current state, the optimal decision for each of the remaining states does not depend on the previous states or decisions
5. There exists a recursive relationship that identifies the optimal decision for stage j, given that stage j+1 has already been solved
6. The final stage must be solvable by itself

[Figure: three-stage example. Q → S2 = Q − x1 → S3 = S2 − x2 → S3 − x3; S_i = state (volume), x_i = decision (discharge), NB_i(x_i) = net benefits of the decisions]

D.P. Solomatine. Introduction to Optimization 30

Dynamic programming

[Figure: states (volumes) Q, s2 = Q − x1, s3 = s2 − x2, s3 − x3; decisions (releases) x1, x2, x3; inflow q_i]

Bellman's principle of optimality: an optimal policy has the property that,
whatever state the process is in at a given stage and
whatever the decision taken from that state,
the remaining decisions must constitute an optimal sub-policy for the state resulting from this decision:
if the red path is optimal when starting from stage 0,
then the blue (dashed) path is also optimal when starting from stage 1
D.P. Solomatine. Introduction to Optimization 31

Dynamic programming

NB_j(x) = net benefit of making decision x at stage j
f_j(s) = total net benefit of having state s at stage j (includes benefits at all subsequent stages)

[Figure: stages with decisions x1, x2, x3; states s2 = Q − x1, s3 = s2 − x2; NB3(x) shown for the 4 possible states s3]

Calculate all immediate benefits NB_j(x) of all possible decisions x at each stage j (stages 1, 2, 3)

Determine the maximum possible benefit f3(s3) for each state s3:
f3(s3) = max{ NB3(x3) },  0 ≤ x3 ≤ s3

Determine the maximum possible benefit f2(s2) associated with decisions x2 and x3, given initial state s2 (for each state s2):
f2(s2) = max{ NB2(x2) + f3(s2 − x2) },  0 ≤ x2 ≤ s2

Determine the benefit f1(Q) associated with decisions x1, x2 and x3, given initial state Q (the original problem):
f1(Q) = max{ NB1(x1) + f2(Q − x1) },  0 ≤ x1 ≤ Q

In general form:
f_j(s_j) = max{ NB_j(x_j) + f_{j+1}(s_j − x_j) },  0 ≤ x_j ≤ s_j
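A minimal sketch of the backward DP recursion for a 3-stage allocation problem; the total volume Q = 4, integer decisions, and the benefit functions NB_j are illustrative assumptions, not values from the slides.

```python
def solve_dp(Q, NB):
    n = len(NB)                                   # number of stages
    f = [{} for _ in range(n + 1)]                # f[j][s] = best total benefit from stage j, state s
    best = [{} for _ in range(n)]                 # best[j][s] = optimal decision at stage j
    for s in range(Q + 1):
        f[n][s] = 0.0                             # no benefit after the last stage
    for j in range(n - 1, -1, -1):                # backward over stages
        for s in range(Q + 1):                    # every feasible state
            # f_j(s) = max_{0 <= x <= s} [ NB_j(x) + f_{j+1}(s - x) ]
            x_best = max(range(s + 1), key=lambda x: NB[j](x) + f[j + 1][s - x])
            best[j][s] = x_best
            f[j][s] = NB[j](x_best) + f[j + 1][s - x_best]
    s, schedule = Q, []                           # recover the optimal release schedule
    for j in range(n):
        schedule.append(best[j][s])
        s -= best[j][s]
    return f[0][Q], schedule

NB = [lambda x: 5 * x - x ** 2,                   # illustrative stage benefits
      lambda x: 3 * x,
      lambda x: 4 * x - 0.5 * x ** 2]
print(solve_dp(4, NB))
```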

Multi-criterial (multi-objective) optimization

Optimization problems are usually multi-objective:
F1(x1, x2, …, xN) → min
...
Fm(x1, x2, …, xN) → min

there are several objectives that are to be optimized
often they are in conflict, i.e. minimizing one does not mean minimizing another
a solution (the values of the decision variables) is a compromise
Examples:
multi-purpose reservoir operation: power vs. irrigation vs. navigability
water allocation: irrigation vs. water supply to cities
pipe network optimization (design and rehabilitation): costs vs. reduction of flood damage
model calibration (error minimization): models good "on average" vs. good for forecasting floods

Pareto optimal set of solutions

[Figure: objective space (OF 1, OF 2) showing the Pareto set, which dominates all other solutions; the points best for OF 1 and best for OF 2; a compromise solution; and the ideal point (which does not exist)]

Example:
Multi-purpose reservoir optimization:
use of a Neural Network as a replicator of a hydraulic/hydrologic model

Case study: Apure River, Venezuela

D.P. Solomatine, L.A. Torres. Neural network approximation of a hydrodynamic model in optimizing reservoir operation. Proc. 2nd Intern. Conf. on Hydroinformatics, Zurich, 1996, 201-206.

Multi-objective optimization: Apure river

Posing the problem

river basin with several reservoirs, hydropower production, commercial navigation downstream
Multicriterial problem posed: identify the releases Rt, t = 1, …, 52, from the reservoir and:
minimize the time of the non-navigable situation downstream
minimize the difference between the target and achieved power production lines

Reservoirs / dams: La Honda, La Cuevas, La Vueltosa
[Map: reservoirs/dams and sub-catchment runoff]
D.P. Solomatine. Introduction to Optimization 38

Decision space and objective (criteria) space

Decision space (variables):

releases Rt, t = 1, …, 52, from the reservoir

Objectives (criteria) space


Objective 1: minimize difference between target and
achieved power production lines
Objective 2: minimize time of non-navigable situation
downstream

D.P. Solomatine. Introduction to Optimization 39

Mapping of decision space into objective space

[Figure: alternative solutions in the decision space (releases R1, R2, …, RN) are mapped, by evaluation with a model, into the objectives space (OF 1 = D = difference between the target and achieved power production, OF 2 = T = time of non-navigability); the figure shows the dominated (inferior) solutions, the non-inferior (best) solutions, which are incomparable and form the Pareto layer, and the ideal point (which does not exist)]
D.P. Solomatine. Introduction to Optimization 40

Converting a multi-objective problem to a single-objective problem

consider a 2-objective problem: Z1 → min and Z2 → min

weighting method: Z = w1 Z1 + w2 Z2
disadvantage: how to choose w1, w2? one objective is compensated by another

method of ideal point: Z = sqrt(Z1² + Z2²)
disadvantage: the data must be normalized; one objective is compensated by another

constraint method: consider only Z2 < FixedValue, then Z1 → min
disadvantage: less sensitive to the value of Z2

[Figure: the three methods illustrated in the (Z1, Z2) objective space, with the optimal solution marked]
D.P. Solomatine. Introduction to Optimization 41
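A minimal sketch of the three scalarization methods on a toy 2-objective problem; the functions, weights and threshold are illustrative assumptions, not values from the case study.

```python
import numpy as np

def Z1(x):                       # e.g. power-production mismatch (illustrative)
    return (x - 2.0) ** 2

def Z2(x):                       # e.g. time of non-navigability (illustrative)
    return (x - 5.0) ** 2

xs = np.linspace(0.0, 7.0, 7001)

w1, w2 = 0.7, 0.3                # 1) weighting method
x_w = xs[np.argmin(w1 * Z1(xs) + w2 * Z2(xs))]

x_i = xs[np.argmin(np.sqrt(Z1(xs) ** 2 + Z2(xs) ** 2))]   # 2) ideal point (normalize first in practice)

feasible = Z2(xs) < 4.0          # 3) constraint method: Z2 below a fixed value, then minimize Z1
x_c = xs[feasible][np.argmin(Z1(xs)[feasible])]

print(x_w, x_i, x_c)
```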

Reservoir optimisation in Apure river basin using the Mike11/NAM model

Optimisation: the time of navigability was put into a constraint, and the mismatch with the power production target was minimized
The MIKE-11/NAM modelling system was used to model river hydrodynamics and the hydrology of the 21 adjacent sub-catchments
The problem: MIKE11/NAM is too slow and cannot be put in the optimization loop (software constraints)

[Flowchart: Start → generate release schedules → run the Mike11/NAM model (river and catchment modelling) to produce water levels → check navigability constraints → optimal solution reached? → Stop]

D.P. Solomatine. Role of ICT. 42

Reservoir optimisation in Apure river basin using a surrogate ANN model (Mike11/NAM → Neural Network emulator)

Surrogate modelling:
Mike11/NAM was run offline for a variety of scenarios and inputs
The generated data was used to train a Neural Network surrogate model (a model of a model)
This ANN model was used instead of Mike11/NAM in the optimization loop

[Flowchart: Start → generate release schedules → run the Neural Network replicating the Mike11/NAM model to produce water levels → check navigability constraints → optimal solution reached? → Stop]

D.P. Solomatine. Role of ICT. 43

Methods to solve real-life single-criterial optimization problems

Multi-extremum functions (many local extrema)
Functions not known analytically

D.P. Solomatine. Introduction to Optimization 44

Multi-extremum functions

a minimum is global if the function is everywhere convex (concave upward); otherwise it may be only a local minimum

∂²f(x)/∂x_i² ≥ 0,  i = 1, 2, …, n

[Plot: the convex "X squared" example f(x) = (x − 15)² with constraints and its minimum]

D.P. Solomatine. Introduction to Optimization 45

Multi-extremum problems

Minimize the function f(x) = 2 + x² + sin(10x)
Solution: df/dx = 0 has many answers (local minima), but only one of them is global

[Plot: f(x) = 2 + x² + sin(10x) on −1.5 ≤ x ≤ 2, showing several local minima]

D.P. Solomatine. Introduction to Optimization 46

Multi-extremum problems

D.P. Solomatine. Introduction to Optimization 47

Problems when objective function is not


known analytically
if objective function is not known analytically then we
cannot compute derivatives (gradients)
this is a typical case when the values of objective
function are calculated by a computer program
this typically means employing direct search methods

D.P. Solomatine. Introduction to Optimization 48

Process of calibration as an optimization problem (1)

[Flowchart: the input X drives both the physical system (measured output yMES) and the model y = f(X, P) (model output yMOD); the error e(yMES − yMOD) is computed; if the error is not small enough, the model parameters P are updated and the loop repeats; otherwise stop]
D.P. Solomatine. Introduction to Optimization 49

Process of calibration as an optimization


problem (2)
A popular way to measure the error is the root mean squared error:

E(P) = sqrt( (1/T) Σ_{t=1}^{T} w_t (OBS_t − MOD_t(P))² ) → min

OBS_t = observed output variable values (numbers)
MOD_t(P) = values generated by the model at time moments t = 1…T
P is the parameter vector {P1, …, Pn} = the independent variables in the optimization problem

Problems in solving this problem:
there is no analytical expression for E(P), so we cannot find the derivatives (gradient)
typically it is multi-extremum
D.P. Solomatine. Introduction to Optimization 50
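A minimal sketch of calibration as optimization: a toy two-parameter model fitted to synthetic "observations" by a direct global search; the model, data and bounds are illustrative assumptions, not the lecture's case study.

```python
import numpy as np
from scipy.optimize import differential_evolution

t = np.arange(1, 51)
obs = 3.0 * np.exp(-0.1 * t) + np.random.normal(0.0, 0.05, t.size)   # synthetic observations

def model(t, P):
    a, k = P
    return a * np.exp(-k * t)

def E(P, w=1.0):
    # root mean squared (weighted) error between observations and model output
    return np.sqrt(np.mean(w * (obs - model(t, P)) ** 2))

res = differential_evolution(E, bounds=[(0.0, 10.0), (0.0, 1.0)])    # derivative-free global search
print(res.x, res.fun)    # parameters close to (3.0, 0.1) and a small error
```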

Global (multi-extremum) optimization

Find an optimizer x* that generates the minimum f* of the objective function f(x),
where x ∈ X and f(x) is defined in the finite interval (box) region of the n-dimensional Euclidean space:
X = {x ∈ Rⁿ: a ≤ x ≤ b} (componentwise)

D.P. Solomatine. Introduction to Optimization 51

Approaches to solving optimization problem


depend on the properties of f(x)
f(x) is a single-extremum function expressed analytically; if its
derivatives can be computed, then gradient-based methods may be
used:
conjugate gradient methods;
quasi-Newton methods (using the second derivatives)
f(x) is a single-extremum function which is not analytically
expressed, so the derivatives cannot be computed and direct search
should be used:
downhill simplex descent (Nelder & Mead 1965)
rotating directions (Rosenbrock 1960);
direction set method (Powell 1964), combined with line search
(Brent 1973)
no assumptions are made about the properties of f(x):
direct search is needed
multi-extremum (global) optimization can be employed
D.P. Solomatine. Introduction to Optimization 52

Example: Hosaki function of 2 variables; two minima, global at (4, 2), local at (1, 2)

f(x1, x2) = (1 − 8x1 + 7x1² − (7/3)x1³ + (1/4)x1⁴) · x2² · e^(−x2)

D.P. Solomatine. Introduction to Optimization 53

Rastrigin function of 2 var.


admissible domain [−1, 1] × [−1, 1]; global minimum of 0.0 at (0, 0) and more than 50 local minima

f(x1, x2) = 2 + x1² + x2² − cos(18 x1) − cos(18 x2)

D.P. Solomatine. Introduction to Optimization 54

Another example

D.P. Solomatine. Introduction to Optimization 55

Main approaches to direct search and global


optimization
set (space) covering techniques;
random search methods, including evolutionary and genetic algorithms;
methods based on multiple local searches (multistart), using clustering;
other methods (simulated annealing, trajectory techniques, tunneling approach, analysis methods based on a stochastic model of the objective function).

D.P. Solomatine. Introduction to Optimization 56

Random search methods

A sub-class of direct search methods

D.P. Solomatine. Introduction to Optimization 57

Random search methods

employs an idea of direct search (since derivatives are


not known)
two conflicting objectives:
exploring the search space: generating points in the unexplored regions
exploiting the best solutions: using the best points (found so far) to find even better points

D.P. Solomatine. Introduction to Optimization 58

Random search methods: some algorithms

pure (direct) random search (uniform sampling): N points are drawn from a uniform distribution in X and f is evaluated at these points; the smallest function value is the assessment of the minimum f*
adaptive random search (non-uniform sampling), Pronzato 1984
evolutionary (genetic) algorithms
controlled random search (Price 1983, Storey and Ali 1994)
shuffled complex evolution (a combination of downhill simplex and multistart), Duan et al. 1992
adaptive cluster covering (Solomatine 1995, 1999)

D.P. Solomatine. Introduction to Optimization 59

Pure random search

N points are drawn from a uniform distribution in X and f is evaluated at these points; the smallest function value is the assessment of the minimum f*

D.P. Solomatine. Introduction to Optimization 60
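A minimal sketch of pure (uniform) random search over a box; the Rastrigin-type test function from the earlier slide is used as an illustration.

```python
import numpy as np

def pure_random_search(f, a, b, N=10000, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(a, b, size=(N, len(a)))     # N points drawn uniformly in the box
    values = np.array([f(x) for x in X])
    i = values.argmin()                         # smallest value = assessment of the minimum f*
    return X[i], values[i]

rastrigin = lambda x: 2 + x[0]**2 + x[1]**2 - np.cos(18*x[0]) - np.cos(18*x[1])
print(pure_random_search(rastrigin, a=[-1.0, -1.0], b=[1.0, 1.0]))
```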

Genetic and evolutionary algorithms

D.P. Solomatine. Introduction to Optimization 61

Genetic algorithm (GA)

Main idea: try to emulate the natural evolution


Terminology is borrowed from natural genetics
Genetic operators: recombination (to combine good
points), mutation (to generate new points), selection
(what points to leave for the next population)
Evolution: iterative generation of organisms (points) and
death (removal) of the unfit (with low function value)

D.P. Solomatine. Introduction to Optimization 62

GA: genes, chromosomes, binary coding

Terminology is borrowed from natural genetics:
all points together = population
each point {x1, x2, …, xn} = chromosome
chromosomes are made of genes, so each variable (coordinate) xi = gene

D.P. Solomatine. Introduction to Optimization 63

GA: example of recombination in 2D

In 2 dimensions recombination can be seen as the exchange of coordinates:
S = (xS1, xS2)
T = (xT1, xT2)
Offspring:
S' = (xS1, xT2)
T' = (xT1, xS2)

The hope is that the generated offspring are also good

D.P. Solomatine. Introduction to Optimization 64

GA: recombination in the multidimensional case

Number of variables n = 4. Two points are considered:
S = ( xS1 xS2 xS3 xS4 )
T = ( xT1 xT2 xT3 xT4 )
(the cut is made after the first gene)
Offspring:
S' = ( xS1 xT2 xT3 xT4 )
T' = ( xT1 xS2 xS3 xS4 )

D.P. Solomatine. Introduction to Optimization 65

GA: binary coding

Often binary coding of the variables with l bits is used.
For example, if x1 = 14 and the number of bits l = 5, then x1 = 01110
For the number of variables n = 4, the point (5, 20, 9, 25) is represented by the following chromosome:
x1    x2    x3    x4
5     20    9     25      (decimal)
00101 10100 01001 11001   (binary) (n·l = 20 bits)

D.P. Solomatine. Introduction to Optimization 66

GA: recombination with binary coding

S = 00101 10100 01001 11001
T = 10001 11111 00011 00011
(the cut is made after the 12th bit)
then the result of the recombination operator will be two offspring S' and T':

S' = 00101 10100 01011 00011
T' = 10001 11111 00001 11001
     x1    x2    x3    x4

D.P. Solomatine. Introduction to Optimization 67
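A minimal sketch of binary encoding and one-point crossover (l = 5 bits per variable); with the cut after the 12th bit it reproduces the offspring shown on the slide above.

```python
def encode(point, l=5):
    return ''.join(format(v, '0{}b'.format(l)) for v in point)

def decode(chrom, l=5):
    return [int(chrom[i:i + l], 2) for i in range(0, len(chrom), l)]

def crossover(S, T, cut):
    # one-point crossover: exchange the tails after position `cut`
    return S[:cut] + T[cut:], T[:cut] + S[cut:]

S = encode([5, 20, 9, 25])       # 00101 10100 01001 11001
T = encode([17, 31, 3, 3])       # 10001 11111 00011 00011
S1, T1 = crossover(S, T, cut=12)
print(S1, decode(S1))            # 00101 10100 01011 00011
print(T1, decode(T1))            # 10001 11111 00001 11001
```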

GA: recombination operator in general case

Two parents (binary representation):
S = { S1, …, Sκ, Sκ+1, …, Snl }
T = { T1, …, Tκ, Tκ+1, …, Tnl }

Two offspring:
S' = { S1, …, Sκ, Tκ+1, …, Tnl }
T' = { T1, …, Tκ, Sκ+1, …, Snl }

There is also multi-point crossover

D.P. Solomatine. Introduction to Optimization 68

GA: mutation operator

In a chromosome S = { S1, …, Sκ, …, Snl }, randomly change a bit Sκ

D.P. Solomatine. Introduction to Optimization 69

GA: selection operator

In the resulting population P(t) of parents Si and offspring Sk, select some of them, thus creating population P(t+1)
How to select:
select the best points out of the set of offspring points (so that the parents are immediately excluded)
select out of the joint population of parents and offspring
select out of the joint population of parents and offspring, but allowing parents to survive in the population only for a fixed number of generations
In the two last cases the selection is called elitist, since it keeps only the best points, thus guaranteeing monotonically improving performance

D.P. Solomatine. Introduction to Optimization 70

GA algorithm: iterate until the desired accuracy is achieved

0. Set the population number t = 0.
1 (initialize population). Construct a set P(t) of N points by drawing them from X according to a certain rule.
2 (evaluate initial population). Compute the function values in all points of the population.
3 (recombine). Recombine points from P(t) according to the recombination operator, creating new points (individuals) called offspring; add them to the population, thus creating set P1(t).
4 (mutate). Modify (mutate) coordinates of some points from P1(t) according to the mutation operator, creating set P2(t).
5 (evaluate population). Compute the function values in all points of the population P2(t).
6 (select). Apply the selection operator and select some best points from P2(t), thus creating the next population P(t+1).
7. If the termination criterion is not fulfilled, then increase t and go to step 3; otherwise stop.
D.P. Solomatine. Introduction to Optimization 71

Evolutionary algorithms (evolutionary strategies) - German school

Mutation:
performed with respect to the corresponding variances of a certain n-dimensional normal distribution.
Recombination of two variants:
discrete recombination is similar to a crossover in GA with real-valued coding;
intermediate recombination creates a point S', such that
S'_i = S_i + α (T_i − S_i)
where α ∈ [0, 1] is a uniform random variable.
Selection in ES, like in GA, can be done in two ways:
selecting the best points out of the set of offspring points (so-called (μ,λ)-selection)
out of the union of parents and offspring ((μ+λ)-selection).
D.P. Solomatine. Introduction to Optimization 72

Simplex descent method

D.P. Solomatine. Introduction to Optimization 73

Simplex descent method of Nelder and Mead (1965)

four points are chosen to form a simplex (in an n-dimensional space a simplex has n+1 vertices)
the function is evaluated at all of them
the worst point is reflected through the plane going through the other three and replaced by its reflection; this forms a new simplex
the process is repeated until the desired accuracy is reached

[Figure: evaluate all points → reflect the worst point → evaluate the new point → a new simplex is thus formed → repeat the process]
D.P. Solomatine. Introduction to Optimization 74
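A minimal sketch of a derivative-free downhill simplex (Nelder-Mead) search via a library routine; the test function is an illustrative assumption.

```python
import numpy as np
from scipy.optimize import minimize

def f(x):
    return (x[0] - 15.0) ** 2 + (x[1] + 3.0) ** 2

res = minimize(f, x0=np.array([0.0, 0.0]), method='Nelder-Mead')
print(res.x)                                  # close to [15, -3]
```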

Simplex descent method: possible operators

(a) reflection away from bad


point

(b) reflection and expansion


away from bad point

(c) contraction along one


dimension from bad point

(d) contraction along all


dimensions toward the best
point
D.P. Solomatine. Introduction to Optimization 75

Adaptive cluster covering algorithm (ACCO)

D.P. Solomatine. Introduction to Optimization 76

Main ideas used in ACCO

random generation of initial set


reduction: removal of points with high values so that only
good points are left
clustering of remaining points and establishing the
regions where a local minimum could be located
iterate for each region (box):
generate more points in the box
move the box towards the centroid
shrink the box
stop when desired accuracy is reached

D.P. Solomatine. Introduction to Optimization 77

Reduction: example for a single-variable function

the function is not known analytically, but it can be evaluated
randomly generated points falling into regions with low function values indicate potential locations of local minima
leave only the black (good) points

[Figure: randomly generated points on a one-dimensional function; only the points with low values are kept]
D.P. Solomatine. Introduction to Optimization 78

Adaptive cluster covering (ACCO): step 1

randomly generate points
reduce (leave only the good ones)
cluster
iterate for each cluster (box):
generate more points
shift the center
shrink the box

D.P. Solomatine. Introduction to Optimization 79

Adaptive cluster covering (ACCO): step 2

randomly generate points
reduce (leave only the good ones)
cluster
iterate for each cluster (box):
generate more points
shift the center
shrink the box

D.P. Solomatine. Introduction to Optimization 80

Adaptive cluster covering (ACCO): step 3

randomly generate points
reduce (leave only the good ones)
cluster
iterate for each cluster (box):
generate more points
shift the center
shrink the box

D.P. Solomatine. Introduction to Optimization 81

Adaptive cluster covering (ACCO) : step 4

randomly generate points
reduce (leave only the good ones)
cluster
iterate for each cluster (box):
generate more points
shift the center
shrink the box

D.P. Solomatine. Introduction to Optimization 82

Adaptive cluster covering (ACCO)

Main principles:
reduction
clustering
adaptation
periodic
randomization

D.P. Solomatine. Introduction to Optimization 83

Global optimization tool GLOBE

D.P. Solomatine. Introduction to Optimization 84

Optimization tool GLOBE (by Solomatine) in calibration of a rainfall-runoff model

D.P. Solomatine. Introduction to Optimization 85

GLOBE in minimization of a test function

D.P. Solomatine. Introduction to Optimization 86

Global optimization tool GLOBE:
the following algorithms are implemented

CRS2 (controlled random search, by Price 1983)
CRS4 (modification by Ali & Storey 1994)
GA with a one-point crossover, with a choice between real-valued or binary 15-bit coding, various random bit mutation, between tournament and fitness rank selection, and between elitist and non-elitist versions
Multis - multistart algorithm (based on Powell-Brent direct minimization)
M-Simplex - multistart algorithm (based on the downhill simplex method of Nelder and Mead)
adaptive cluster covering (ACCO)
adaptive cluster covering with local search (ACCOL)
adaptive cluster descent (ACD)
adaptive cluster descent with local search (ACDL)

D.P. Solomatine. Introduction to Optimization 87

Case study:
using global optimization in calibration of
models

D.P. Solomatine. Introduction to Optimization 88

Process of calibration as an optimization problem (1)

[Flowchart: the input X drives both the physical system (measured output yMES) and the model y = f(X, P) (model output yMOD); the error e(yMES − yMOD) is computed; if the error is not small enough, the model parameters P are updated and the loop repeats; otherwise stop]
D.P. Solomatine. Introduction to Optimization 89

Process of calibration as an optimization


problem (2)
A popular way to measure the error is the root mean squared error:

E(P) = sqrt( (1/T) Σ_{t=1}^{T} w_t (OBS_t − MOD_t(P))² ) → min

OBS_t = observed output variable values (numbers)
MOD_t(P) = values generated by the model at time moments t = 1…T
P is the parameter vector {P1, …, Pn} = the independent variables in the optimization problem

Problems in solving this problem:
there is no analytical expression for E(P), so we cannot find the derivatives (gradient)
typically it is multi-extremum
D.P. Solomatine. Introduction to Optimization 90

Process of calibration as an optimization problem (2)

A popular way to measure the error is the root mean squared error:

E(P) = sqrt( (1/T) Σ_{t=1}^{T} w_t (OBS_t − MOD_t(P))² ) → min

OBS_t = observed output variable values (numbers)
MOD_t(P) = values generated by the model at time moments t = 1…T
P is the parameter vector {P1, …, Pn} = the independent variables in the optimization problem

Problems in solving this problem:
there is no analytical expression for E(P), so we cannot find the derivatives (gradient)
typically it is multi-extremum
often calibration is a multi-objective problem
D.P. Solomatine. Introduction to Optimization 91

Using GLOBE for models calibration

D.P. Solomatine. Introduction to Optimization 92

Experience in using GO for model calibration

calibration of a lumped hydrological model (Solomatine


1995)
calibration of an electrostatic mirror model (Vdovine et
al., 1995)
calibration of a 2D free-surface hydrodynamic model
(Constantinescu 1996)
calibration of an ecological model of plant growth
calibration of a distributed groundwater model (Solomatine et al. 1999)
parameter estimation of support vector machines (2004)

D.P. Solomatine. Introduction to Optimization 93

Lumped conceptual rainfall-runoff model

they operate with different but mutually interrelated storages representing physical elements in a catchment
all parameters and variables represent average values over the entire catchment
the description of the hydrological process cannot be based directly on the equations that are supposed to be valid for individual soil columns
the equations are semi-empirical, but still with a physical basis
model parameters cannot usually be assessed from field data alone, but have to be obtained through calibration
D.P. Solomatine. Introduction to Optimization 94

Tank model of Sugawara (1974)

eight parameters
that cannot be
measured:
d1, d2
k1, k2, k3, k4
s1, s2
they have to be
identified
(calibrated)

D.P. Solomatine. Introduction to Optimization 95

Rainfall-runoff model: non-calibrated

D.P. Solomatine. Introduction to Optimization 96

Rainfall-runoff model: calibrated

D.P. Solomatine. Introduction to Optimization 97

Performance indicators of GO algorithms

effectiveness (how close the algorithm gets to the global


minimum);
efficiency (running time) measured by the number of
function evaluations needed
reliability (robustness) of the algorithms can be measured
by the number of successes in finding the global
minimum, or at least approaching it sufficiently close

D.P. Solomatine. Introduction to Optimization 98

Performance of various GO algorithms

[Chart: calibration of the SIRT rainfall-runoff model (8 variables); best function value vs. number of function evaluations (thousands) for CRS2, GA, ACCOL/3LC, Multis, M-Simplex, CRS4 and ACDL/3LC]
D.P. Solomatine. Introduction to Optimization 99

Experience in using GO for other problems

solution of a dynamic reservoir optimization problem:
instead of performing an exhaustive search (solving the dynamic programming problem), a faster randomized partial search was used
optimization of pipe networks:
posing a non-linear or discrete optimization problem is not always possible
the problem by its nature is multi-extremum

D.P. Solomatine. Introduction to Optimization 100

Water distribution networks
Using EPANET models and single- and multi-objective optimization

Abebe, A.J., and Solomatine D. (1998). Application of global optimization to the design
of pipe networks. Proc. Int. Conf. on Hydroinformatics, Balkema, Rotterdam.
L. Alfonso, A. Jonoski, D.P. Solomatine (2009). Multi-objective optimization of
operational responses for contaminant flushing in water distribution networks. ASCE J.
Water Res. Planning & Management.
101

Example 1: rehabilitation of the water distribution


network (WDN) using EPANET model
13 12 12
Test example: Hanoi Network (fragment) 11
Model: EPANET 11
34 10
33 26 27 28 9
31 30 25 26 27 16 15 14 10 9
15 14 13
16
8
32 25 17
17 8

30 24 18 7

18 7
24 19
31
6
19
30 29 23 20 3 4 5
29 28 23 20 3 4 5 6

21 2 RESERVOIR
34 Pipes 21 2 100

31 Nodes 22 1
1 Reservoir 22
1

D.P. Solomatine. Introduction to Optimization 102

Example 2: optimization of water distribution
network (WDN) rehabilitation

Optimization of Networks with Predetermined Topology


Number and length of pipes
Demand at every node (including pressure)
Other hydraulic elements
Commercially available pipe sizes
Decision Variables
Diameter of each pipe in the network
Result: with optimal pipe diameters costs are 20% lower
A.J. Abebe, D.P. Solomatine. Application of global optimization to
the design of pipe networks. Proc. 3rd Int. Conf. on
Hydroinformatics. Copenhagen, Denmark, August 1998.
D.P. Solomatine. Introduction to Optimization 103

WDN: Objective function and constraints

The actual cost of the network Ca has many components, but here a simplified approach is used: it is calculated from the cost per unit length associated with the diameter, and the length of each pipe:

C_a = Σ_{i=1}^{n} c(D_i) · L_i
Constraints:
Hydrodynamic constraints
Continuity constraint
Energy balance
Commercial constraints
Market available (discrete) pipe diameters should be used
Nodal head constraints
Satisfaction of min/max nodal head

D.P. Solomatine. Introduction to Optimization 104

Network optimization based on EPANET
model

Hydraulic model

Optimization tool

Optimal solutions

D.P. Solomatine. Introduction to Optimization 105

Example 2: Multi-objective optimisation of operational responses for contaminant flushing in a WDN

Test case study: fragment of the WDN in Villavicencio, Colombia

[Figure: network fragment with the contamination source, valves V2-V7, hydrants H1 and H2, and the contaminant trace (percent) at junction J9]
D.P. Solomatine. Introduction to Optimization 106

Organization of computation

EPANET is used as the modelling engine

[Diagram: the optimizer GLOBE (GA) writes a candidate solution to the G.pin text file; the COPA program runs EPANET, estimates the number of affected nodes, compares the initial status with the current configuration, calculates the objective function, and writes the value to the G.rsp text file, which is read back by GLOBE]
D.P. Solomatine. Introduction to Optimization 107

Solutions

One of the solutions: close valve V7, open hydrant H2

[Figure: network fragment showing the contamination source, the closed valve V7, the opened hydrant H2, and the contaminant trace at junction J9]
D.P. Solomatine. Introduction to Optimization 108

Combining criteria: method of "ideal point"

Select the solution closest to the "ideal point"

[Chart: Sector 11 optimisation results; number of affected junctions vs. number of changes in the network]
D.P. Solomatine. Introduction to Optimization 109

Multi-objective optimization: finding the Pareto layer of "good solutions"

The "best" solution is to be selected by a decision maker

[Chart: Pareto front for the optimisation problem, Sector 11, Villavicencio; number of affected junctions vs. number of movements]

D.P. Solomatine. Introduction to Optimization 110

Urban drainage networks optimization

Rehabilitation of drainage
networks using hydrodynamic
models and multi-objective
optimization
W. Barreto Cordero, R.K. Price, D.P. Solomatine, Z. Vojinovic.
Approaches to multi-objective multi-tier optimization in urban
drainage planning. Proc. 7th Intern. Conf. on Hydroinformatics,
Nice, Research Publishing, 2006.
W. Barreto Cordero, Z. Vojinovic, R. Price, D.P. Solomatine. A
Multi-objective Evolutionary Approach for Rehabilitation of Urban
Drainage Systems. J Water Res. Planning & Mang., 2009 (in
review).
111

Hydrodynamic modelling of sewers and


drainage
DHI's MOUSE (MOdel for Urban SEwers) or US EPA's SWMM (Stormwater Management Model) systems are widely used
St. Venant equations: SWMM uses an explicit scheme, MOUSE uses the Abbott implicit scheme with a self-adaptive time step

D.P. Solomatine. Introduction to Optimization 112

Flood damage functions

Cost_flood = f(Volume)

Above ground:
Volume   Y (m)   A (m²)   Cost
0.00     0.00    0.00     0.00
1500     1.25    320      12525.25

Under ground:
L (m)   D (m)   Aging   Cost
100     0.50    2       0.00
500     1.25    8       12525.25

Cost_pipes = f(L · C)
D.P. Solomatine. Introduction to Optimization 113

Objective functions

Multi-objective approach:

Flooding damages:
f1 = Σ_{i=1}^{ncells} Σ_{k=1}^{nlu} fc_i^k(dep_i) · Cmax^k → min

Rehabilitation costs (pipe renewal):
f2 = Σ_{i=1}^{n} (L_i · C_i) → min

D.P. Solomatine. Introduction to Optimization 114

Multi-objective optimization:
example of urban drainage system rehabilitation

[Figure: a decision (rehabilitation option) = a set of pipe diameters Di (i = 1..N) in the decision space is mapped into the criteria space (Z1 = flood damage due to overflows, Z2 = costs of implementing a particular rehabilitation option of the drainage network); the figure shows the dominated (inferior) solutions, the non-inferior (best) solutions, which are incomparable and form the Pareto layer, and the ideal point (which does not exist)]
D.P. Solomatine. Introduction to Optimization 115

Linking the hydrodynamic model and the optimization software

Optimization: replacing pipes, creating additional storages

[Diagram: Wastewater System Pipe Network Model (MOUSE) ↔ Data Processor ↔ Optimization Procedure (GLOBE, NSGA-II)]
D.P. Solomatine. Introduction to Optimization 116

Example: Drainage network

D.P. Solomatine. Introduction to Optimization 117

Pipe replacement options tested during


optimization process

D.P. Solomatine. Introduction to Optimization 118

NSGAX optimization software:
search for optimal pipe replacement option

D.P. Solomatine. Introduction to Optimization 119

Results of optimization: set of solutions

[Chart: flood damage due to overflows vs. costs of implementing the rehabilitation option]

D.P. Solomatine. Introduction to Optimization 120

Parallel computing in computationally
intensive optimization problems

121

Parallel Computing

Clusters of computers:
1. Parallel Virtual Machine (PVM):
Heterogeneous Cluster Networks
Context management (virtual machine)
Fault tolerant
2. Message-Passing Interface (MPI):
Standard for Super Computers (MPP)
Commercial and free implementations
Future of Parallel computations ?

D.P. Solomatine. Introduction to Optimization 122

Parallel Computing: organisation of the computational process

[Diagram: the serial approach (a single Master) vs. the Master/Slave approach (a Master distributing work to Slaves 1-4); the timeline distinguishes serial time, parallel time and communication time]
D.P. Solomatine. Introduction to Optimization 123

Parallel Computing: parallelization of the optimization algorithm

Parallelized random search by GA (master/slave scheme):

[Diagram: the master (nsgam.exe) initializes a random population, applies the genetic operators, and sends individuals for evaluation; the slaves (nsgas.exe, via Ifacemouse.exe and Mouse.exe) evaluate the population in parallel]

Master (nsgam.exe):
1. Start slaves
2. Send configuration command
3. Send individual
4. Receive evaluated individual
5. Go to step 3 until the end
6. Send termination command

Slave (nsgas.exe):
1. Wait for master command
2. Select command: configuration / evaluate individual / terminate
3. Go to step 1 until the end
D.P. Solomatine. Introduction to Optimization 124

Parallel Computing: computers used in the
cluster
Tested cluster configuration:

1 Cpu AMD Athlon 800MHz, RAM 128MB, HD 20GB,


OS: LINUX Debian etch

4 Cpus Celeron 1.8GHz, RAM 128MB, HD 20GB,


OS: Windows 2000

4 Cpus Celeron 2.0GHz, RAM 128MB, HD 20GB,


OS: Windows 2000

2 Cpus P4 2.4GHz and 1.8GHz, RAM 256MB, HD 40GB,


OS: Windows 2000

D.P. Solomatine. Introduction to Optimization 125

Parallel Computing: arranging the cluster

D.P. Solomatine. Introduction to Optimization 126

Parallel Computing: XVPM interface

D.P. Solomatine. Introduction to Optimization 127

Parallel Computing: reduction in computing


time

Results after 5000 function evaluations:

1 CPU = 12.8 hours
10 CPUs = 4.8 hours (a 62.5% reduction!)
D.P. Solomatine. Introduction to Optimization 128

Water networks optimization: conclusions and recommendations

optimization leads to more economical solutions
due to the complexity of the objective function, the use of direct search (global optimization) is needed
multi-objective optimization provides a more flexible approach and involves a decision maker
there is a certain difference in the performance of the various methods, so using several of them is recommended; if a fast evaluation is needed, ACCO could be the first choice
some real-valued algorithms need to be updated to allow for encoding and decoding of discrete solutions
see more:
A.J. Abebe, D.P. Solomatine. Application of global optimization to the design of pipe
networks. Proc. 3rd Int. Conf. on Hydroinformatics. Copenhagen, Denmark, August 1998.
Ostfeld A. (2006) Enhancing Water-Distribution System Security Through Modeling, ASCE
Journal of Water Resources Planning and Management, Vol. 132, No. 4, pp. 209-210
D.P. Solomatine. Introduction to Optimization 129

Global (direct) optimization: conclusions

there is a set of GO methods that can be effectively used in various water-related problems
some methods (e.g., ACCO, with its low number of function evaluations) can be several times more efficient than, e.g., the widely used genetic algorithms
this is especially important for problems with expensive (time-demanding) functions to optimize

D.P. Solomatine. Introduction to Optimization 130

Optimization: conclusions

Water systems are very expensive, so optimizing them by even several percent could mean considerable savings
Characterizing the system performance is typically done by running models
This means the objective function is not known analytically, so global direct optimization (random search) methods are often the only choice
Multi-objective optimization allows for the inclusion of various factors, stakeholders and decision makers
Optimization constitutes an important part of the hydroinformatics technology:
Data → Models → Knowledge → Decisions → Impacts
D.P. Solomatine. Introduction to Optimization 131

Thank you for your attention

132

