Benders Decomposition for Dummies

How I learned it

Gro Klæboe
Contents

For whom is this note intended?
What is Benders decomposition?
What are the benefits of using Benders decomposition?
Presentation of the case study
Dressing up the problem in the standard notation
  The first stage
  The second stage
  The optimality cuts – the link between the first and the second stage
  Taking care of stochasticity
  Example of an optimality cut
The Benders procedure from beginning to end
  The steps in Benders decomposition
    Step 1: Initialization
    Step 2: Sub problems
    Step 3: Convergence test
    Step 4: Master problem
  But how do I know how many iterations to run?
List of notation
References
For whom is this note intended?
The primary target group for this memo is very narrow: me. I need to write down
what I have just understood, so that I don't forget it. Other people who might
benefit from this memo are probably quite like me in the following respects:
Since I don't intend to publish these notes anywhere other than on my home page, I
don't bother to pretend that Google isn't my main source of information.
The list of references therefore includes other lecture notes, memos and other
useful stuff found on the net.
Within stochastic programming, one often refers to the L-shaped method [1],
which was developed by Van Slyke and Wets. I think they were the first to borrow
the Benders decomposition technique and apply it to stochastic programming.
(Someone named Kelley touched upon the same ideas.) The method
is basically the same as Benders decomposition, with the addition of
feasibility cuts (see [1] ch. 5.1 for details). Since we will work in this note
with problems with relatively complete recourse, we can disregard the feasibility
cuts.
The first stage problem is thus to determine how much lumber to buy, and how
much finishing- and carpentry-skilled labour to hire. The random variables are
the demand quantities, and the second stage problem is to figure out how many
desks, tables and chairs to produce, recognizing that production opportunities are
limited by the inputs bought in stage 1.
Ok. Let’s turn to the standard formulation. For the first stage, the notation is as
follows:
min z = c^T x + θ
s.t.  Ax = b
      x ≥ 0
If you are like me, it is often useful to see the vectors with some real values to
get a feel of the problem, so here is the objective.
min z = [c_l c_f c_c][x_l; x_f; x_c] + θ = [2 4 5.2][x_l; x_f; x_c] + θ

(Here [a; b; c] denotes a column vector. In the very first iteration θ is unbounded below, i.e. −∞.)
But what do the first-stage restrictions look like? In Higle's original problem,
none are specified (probably because Higle does not show how to solve the
problem using Benders). There are no restrictions on how much lumber, finishing
and carpentry labour we can buy. But when solving the case, we'll find that
to get started we really need some upper limits on the x-vector; otherwise the
first stage problem will be unbounded in some of the starting iterations. This will
often be the case with Benders. Luckily, it's usually not hard to think of some
limits on x that would be relevant to include. There might be a budget limit
on how much money the producer could spend on buying inputs, or, in the
extreme case, the availability of resources will ultimately be limited by global
availability. (There cannot be more labour hours available for carpentry in one
specific hour than there are people in the world!) So, I'll give you some upper
limits on x, which I know are sufficient for the levels of demand: lumber: max
3500 bd ft; finishing labour: max 1500 hours; carpentry labour: max 875 hours.
With this information at hand, the A-matrix is simply the identity matrix, and the
restriction Ax ≤ b becomes:
[1 0 0; 0 1 0; 0 0 1][x_l; x_f; x_c] ≤ [3500; 1500; 875]
Solving this problem for the first iteration of the first stage, it is obvious that the
solution is:
[x_l; x_f; x_c] = [0; 0; 0]
min w = q^T y
s.t.  W y = h(ω_s) − T x
      y ≥ 0
Some comments on the notation and stochasticity here. T is often named the
technology matrix, W the recourse matrix, h the resource vector¹ and q the cost
vector. All of these might be functions of the stochastic variable, but to use
Benders, W must be independent of ω. That is known as fixed recourse. Also, life
is easier when q is independent of ω, as this interferes with feasibility cuts in
some ways that I have not really thought very hard about (the interested reader
is referred to chapter 3 of [5]). However, in our case only demand is stochastic –
which means (as I will show) that only the resource vector is stochastic.
The objective in the second stage is really to maximize income given the available
inputs and demand constraints. However, as we want to work with a minimization
problem, we formulate the problem so that we minimize the negative income
from selling furniture.
¹ Note that if you are reading Higle, she denotes the resource vector by r rather than h.
min z = [q_d q_t q_c][y_d; y_t; y_c] = [−60 −40 −10][y_d; y_t; y_c]
Then, let's move on to the restriction matrix for the second stage. There are two
sets of equations in this particular problem: the input restrictions, saying that you
cannot make more furniture than the stock of inputs allows you (the first three
rows of the W, h and T vectors/matrices), and the demand restrictions, saying that you
cannot sell more to the market than the demand scenario allows you to (the three
last rows of the W, h and T vectors/matrices).
W y = h(ω_s) − T x, with

W = [8 6 1; 4 2 1.5; 2 1.5 0.5; 1 0 0; 0 1 0; 0 0 1]
y = [y_d; y_t; y_c]
h(ω_s) = [0; 0; 0; d_d(ω_s); d_t(ω_s); d_c(ω_s)]
T = [−1 0 0; 0 −1 0; 0 0 −1; 0 0 0; 0 0 0; 0 0 0]
x = [x_l; x_f; x_c]
Note that many of the rows in the technology matrix consist of all zeros. This took
a bit of time for me to realize (but maybe you are smarter than me and grasp it at
once): all restrictions that limit the second stage problem but are not
functions of the first stage variables will have all-zero rows in the T-matrix. However,
they are still important, because they influence the objective function in the
second stage problem, and as we will see later on, the second stage objective will
enter the approximated second stage cost (θ). Note also that the three last rows
of the h-vector have stochastic elements. Thus, we will actually not have only
one second stage problem – we will have S (in our case 3) problems, one for
each scenario.
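To make the all-zero-row observation concrete, here is a small sketch in Python. The scenario probabilities and demand figures are the ones used in the calculations later in this note; the list layout is just my own bookkeeping, not anything prescribed by the method.

```python
# Second-stage data for the Dakota case: W and T are fixed (fixed recourse),
# while the resource vector h depends on the demand scenario.
W = [[8, 6, 1],
     [4, 2, 1.5],
     [2, 1.5, 0.5],
     [1, 0, 0],
     [0, 1, 0],
     [0, 0, 1]]
T = [[-1, 0, 0],
     [0, -1, 0],
     [0, 0, -1],
     [0, 0, 0],
     [0, 0, 0],
     [0, 0, 0]]
# (probability, (d_d, d_t, d_c)) per scenario
scenarios = [(0.3, (50, 20, 200)),
             (0.4, (150, 110, 225)),
             (0.3, (250, 250, 500))]

def h(demand):
    # three input rows (zeros), then the three stochastic demand rows
    return [0, 0, 0] + list(demand)

# The rows of T that are non-zero (input rows) are exactly the rows where
# h is deterministic, and the all-zero rows of T carry the stochastic demand.
for prob, demand in scenarios:
    for t_row, h_entry in zip(T, h(demand)):
        if any(t_row):           # input row: coupled to the first stage ...
            assert h_entry == 0  # ... and always zero in h
```

Running this confirms the structural claim above for all three scenarios.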
One thing that is imprecise in this exposition is that the second stage
restrictions are represented by an equality constraint, whereas they really should be
less-than-or-equal constraints. I guess the condensed form Wy = h − Tx
involves some slack variables to make the equation hold. However, I will ignore
this for now, since I am going to solve the problem through the high-level OR
program GAMS rather than by matrix calculations. If you are trying to solve the
problem in MATLAB, Maple or equivalent, you should probably think about this a
bit.
The optimality cuts – the link between the first and the
second stage
Since you are reading this text, you have probably looked at the Benders
decomposition method and know that the first and the second stage problems are
tied together through the optimality cuts. Thus, after the first initialization, the
first stage problem gets a set of extra equations (one for each iteration) that
limit θ.
min z = c^T x + θ
s.t.  Ax = b
      E_l x + θ ≥ e_l,  l = 1, …, L
      x ≥ 0
Note that θ is unbounded, so negative values are allowed. Note also
that θ has no subscript l, so for each iteration there will be more and more
restrictions limiting θ, ultimately pushing θ to its optimal value. But what do
these restrictions represent? Let us rewrite the optimality cut in the Birge &
Louveaux [1] notation:
θ ≥ e_l − E_l x,  l = 1, …, L
This equation says that θ must be greater than the right hand side. Thus, we limit
the objective (a minimization problem) by saying that the θ-part must be at least
as great as something. But what is this something? To put it shortly, the
something is the expected objective value of the second stage problem, e_l (given
some fixed values of x), less the contribution to reducing the second stage objective
value from changing the first stage variables x.
The next question is then: how do we know how much changing x is going to
reduce the second stage objective? The answer is that we use the marginal
values on the restrictions that x enters. If the marginal value is negative,
increasing x would reduce the second stage objective (thus making it better,
since it is a minimization problem), whereas a positive marginal value would
increase it. The coefficient E_l tells us how fast the change is.
[A word of caution: it is really easy to get sign-confused when working with
optimality cuts. Birge & Louveaux's formulation is more intuitively appealing,
whereas Higle's formulation is more mathematically correct. When reading the
two texts side by side it might seem as if they disagree on the sign of the x-term,
but they don't. In Birge & Louveaux's notation, E_l is the simplex multiplier of
Tx, whereas in Higle, the term βx is the simplex multiplier of −Tx. Therefore,
the two restrictions are in fact the same.]
Where does this whole idea of optimality cut come from, and why does it work?
The answer is duality theory. Let us go back to the formulation of the second
stage problem. If you know your linear programming basics, you know that you
can formulate the dual of the second stage problem as follows:
max π^T [h(ω_s) − T x]
s.t.  π^T W ≤ q^T
      π ≥ 0
where π is the vector of dual variables, and all other matrices should be known
from the definition of the primal second stage problem. [The last restriction,
π ≥ 0, really only holds if Wy ≤ h(ω_s) − Tx, but since I guess slack variables are
included in the B&L formulation, I don't worry too much about this.]
For the optimal dual solution, the objective of the dual second stage problem will be
equal to the objective of the primal second stage problem. For any other feasible
dual solution, the objective value of the dual problem will be less than the second
stage objective. This means that the objective of the dual second stage problem
acts as a lower bound on the true second stage cost.
E_l ≡ Σ_{s=1}^{S} p_s (π_{ls})^T T_s
With these definitions at hand, it is easy to see that the weak duality property (3)
is equivalent to the optimality cut.
In the first stage problem, the objective is still to minimize the cost of purchasing
inputs while balancing it against the income that those inputs could generate in the
second stage:

min z = [c_l c_f c_c][x_l; x_f; x_c] + θ = [2 4 5.2][x_l; x_f; x_c] + θ
We still have the first stage constraints limiting how much input it is possible to
buy:
[1 0 0; 0 1 0; 0 0 1][x_l; x_f; x_c] ≤ [3500; 1500; 875]
From the 3 earlier iterations, we have the following three restrictions on theta:
θ ≥ 0 + [−6.25 −2.5 0][x_l; x_f; x_c]

θ ≥ 0 + [0 −20 0][x_l; x_f; x_c]

θ ≥ 0 + [0 0 −30][x_l; x_f; x_c]
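Since a minimization master must respect every cut, the effective floor on θ at a given x is the largest of the right hand sides. A quick sketch of that evaluation; the point x = (3500, 1500, 875) is just the upper-limit vector from earlier, picked for illustration:

```python
# Cuts in the form theta >= e + g·x, copied from the three cuts above.
cuts = [(0, (-6.25, -2.5, 0)),
        (0, (0, -20, 0)),
        (0, (0, 0, -30))]

def theta_floor(x, cuts):
    # theta must be at least as large as every cut's right hand side,
    # so the binding restriction is the maximum over all cuts
    return max(e + sum(gi * xi for gi, xi in zip(g, x)) for e, g in cuts)

x = (3500, 1500, 875)
print(theta_floor(x, cuts))  # -25625.0: the first (lumber) cut binds here
```

Note that at x = 0 all three cuts give θ ≥ 0, which matches the observation below that the second stage objective is zero without inputs.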
If you reflect a bit upon the values of these three optimality cuts, you'll
notice that without at least one unit of each input, the Dakota Furniture Company
will be unable to produce a single table, desk or chair, and the objective value of
the second stage (e_l) is therefore 0 in all three cuts. Also, the feedback from the
second stage tells us that it is the lack of lumber that is the most binding
constraint, leading to a negative marginal value² on lumber in the first iteration.
Since the cost of lumber (2) is less than the expected income per unit of lumber
(6.25), we buy lumber in the second iteration, but then it is the lack
of finishing labour that is most restrictive and yields a negative marginal
value. Again, we decide to buy finishing labour as well, only to discover that it would be
beneficial also to have carpentry-skilled labour.

² Remember: with a minimization objective, a negative marginal value indicates a better
objective if we increase the level.
With this information we solve the first stage problem and get the following
decision on how much input to buy:
[x_l; x_f; x_c]_{l=4} = [3500; 1250; 833.33]
[I wonder what information it is that makes the program buy less than the
maximum of finishing and carpentry?]
Also, let us have a look at the marginal value of increased demand in the three
scenarios in this iteration:
Given the structure of this problem³, where the h-vector has zeros in all rows
where the T-matrix is non-zero, and vice versa, the constant term of the cuts can
be calculated in two ways: either directly, through the marginal values on the
h-vector, or indirectly, by subtracting the marginal values of the T-matrix
multiplied by the first stage decisions from the second stage objective.
Let us start out with the conventional way of calculating it that [1] prescribes:
e_l ≡ Σ_{s=1}^{S} p_s (π_{ls})^T h_s
Thus, for our problem, this becomes:
e_{l=4} = Σ_{s=1}^{3} p_s [π_{d,s} π_{t,s} π_{c,s}][d_{d,s}; d_{t,s}; d_{c,s}]
        = 0.3·[−60 −40 −10][50; 20; 200] + 0.4·[−60 −40 −10][150; 110; 225] + 0.3·[0 −10 0][250; 250; 500]
        = −8750
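The sum above is easy to check numerically; a minimal sketch:

```python
# Direct calculation of e_l: probability-weighted duals times the h-vector.
p  = [0.3, 0.4, 0.3]
pi = [[-60, -40, -10], [-60, -40, -10], [0, -10, 0]]    # demand-row duals
d  = [[50, 20, 200], [150, 110, 225], [250, 250, 500]]  # demand per scenario

e = sum(ps * sum(a * b for a, b in zip(pis, ds))
        for ps, pis, ds in zip(p, pi, d))
print(round(e, 6))  # -8750.0
```

The three scenario terms are 0.3·(−5800), 0.4·(−15650) and 0.3·(−2500).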
However, we can choose to do this in another way. Remember that the objective
of the second stage problem is equal to the dual variables times the right hand
side. We may exploit this in the following way:
w_s = π_s^T [h_s − T x]
π_s^T h_s = w_s + π_s^T T x
e_l ≡ Σ_{s=1}^{S} p_s π_s^T h(ω_s) = Σ_{s=1}^{S} p_s (w_s + π_s^T T x)
So, calculating this for our case gives the following. (Note that the marginal values
refer to −Tx, so we need to replace the + with a − sign to get the calculations
correct.)
e_{l=4} = Σ_{s=1}^{3} p_s (w_s − [π_{l,s} π_{f,s} π_{c,s}][x_l; x_f; x_c]_{l=4})
        = 0.3·(−5800 − [0 0 0][3500; 1250; 833.33])
        + 0.4·(−15650 − [0 0 0][3500; 1250; 833.33])
        + 0.3·(−21250 − [0 −15 0][3500; 1250; 833.33])
        = −8750

³ I have not really thought through whether this structure is a necessary condition for
being able to calculate e_l in two ways, or whether it can also be done with other types of
problems.
Which method is better depends on the structure of the problem. With a large
second stage problem (an h-vector with many non-zero elements), I think the
second approach is faster.
Let us now calculate E_l. That is just a question of calculating the probability-
weighted sums of the marginals over all scenarios. For lumber and carpentry the
value of increased inputs is 0 in all scenarios, but for finishing the marginal
value is −15 in the high demand scenario. This leaves us with a coefficient of
0.3·0 + 0.4·0 + 0.3·(−15) = −4.5 for finishing. The fourth cut will then look as
follows:
θ ≥ −8750 + [0 −4.5 0][x_l; x_f; x_c]
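The coefficient vector of the cut is just the probability-weighted sum of the input-row marginals; sketched:

```python
# E_l: probability-weighted input-row marginals (lumber, finishing, carpentry).
p = [0.3, 0.4, 0.3]
marginals = [[0, 0, 0], [0, 0, 0], [0, -15, 0]]   # one row per scenario

E = [sum(ps * m[i] for ps, m in zip(p, marginals)) for i in range(3)]
e = -8750
cut = (e, E)   # reads: theta >= e + E·x
print(E)
```

Only the finishing component is non-zero, giving the −4.5 coefficient above.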
The cut tells us that, from the second stage point of view, the expected objective
value given the input vector bought in iteration l = 4 is −$8750 (an expected income
of $8750), and that this income could be increased by $4.5 for each extra unit of
finishing labour that was available. Is this interpretation really correct?
- Each iteration contains one run of the first stage problem and s runs of the
  second stage problem, where s is the number of scenarios.
- The goal of the first stage problem is to decide upon the first stage
  decision variables, x. These are then transferred to the second stage
  problem of the same iteration and kept constant for all scenarios.
- The goal of the second stage problem is to find the expected value of the
  second stage objective, and also the gradient of the second stage objective
  with respect to the first stage variables.
- Each iteration of the second stage problem adds one cut to the first stage
  problem, whereas each run of the first stage problem only replaces the old
  set of first stage variables.
When you solve the first stage problem, the cost of the second stage problem is
bounded by the optimality cuts. Theta can be reduced by increasing the first
stage decision variables (the x's), a decision that has to be weighed against the
increased first stage costs. But the gradient describing how increased x reduces the
second stage cost is based on an optimal second stage decision given x, so the
estimate of theta from the first stage will act as a lower bound on the second
stage cost.

The second stage problem calculates the cost of the second stage given
optimal second stage decisions under fixed first stage variables.
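The interplay between cuts, theta and convergence is easiest to see on a toy problem. The sketch below is not the Dakota case: it is a one-variable problem I made up (buy x units of input at cost 2, sell min(x, demand) at price 5, demand 1 or 3 with equal probability), and the master problem is solved by grid search instead of an LP solver, so every name and number here is my own construction.

```python
def benders_toy(c=2.0, price=5.0, demands=(1.0, 3.0), probs=(0.5, 0.5),
                x_max=4.0, theta_min=-1000.0, tol=1e-9, max_iter=50):
    grid = [i * 0.5 for i in range(int(x_max / 0.5) + 1)]
    cuts = []                      # each cut reads: theta >= a + b*x
    for v in range(max_iter):
        # Master problem: pick x minimizing c*x + theta, theta floored by cuts
        def theta_floor(x):
            return max([a + b * x for a, b in cuts], default=theta_min)
        x = min(grid, key=lambda xx: c * xx + theta_floor(xx))
        theta = theta_floor(x)
        # Sub problems: expected second stage value and subgradient at x
        w = sum(p * (-price * min(x, d)) for p, d in zip(probs, demands))
        g = sum(p * (-price if x < d else 0.0) for p, d in zip(probs, demands))
        # Convergence test: theta has caught up with the true second stage value
        if theta >= w - tol:
            return x, c * x + theta
        cuts.append((w - g * x, g))   # supporting cut of w(x) at the current x
    raise RuntimeError("no convergence")

x_opt, obj = benders_toy()
print(x_opt, obj)  # 3.0 -4.0
```

Each cut is a supporting line of the (convex) expected second stage value at the current x, which is exactly why theta underestimates the second stage cost until convergence.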
The steps in Benders decomposition
This section follows [6].
Step 1: Initialization
v := 1 {iteration number}
UB := +∞ {upper bound}
LB := −∞ {lower bound}
Solve the initial master problem:
    min c^T x
    s.t. Ax = b
         x ≥ 0
x^v := x* {optimal values}
Go to step 2.

In later iterations the master problem (with the optimality cuts added) also yields θ, and the updates after solving it become:
x^v := x* {optimal values}
θ^v := θ*
LB := c^T x^v + θ^v
Go to step 2.
Let w^v = e^v − E^v x^v. If θ^v ≥ w^v, then stop: x^v is an optimal solution. Otherwise,
add the optimality cut to the master problem and run another iteration. However,
this knowledge is not very useful in predicting the progress of your algorithm,
since the gap w^v − θ^v does not necessarily decrease steadily. This is illustrated
for our case study in Figure 1.
Figure 1: Gap (w^v − θ^v) between the probability weighted sum of second stage objectives and the
approximation (theta) in the Dakota furniture problem, over iterations 1–11.
With large problems at hand, it is quite frustrating not to know how much closer
you are to the optimal solution, but just to sit and hope for an optimal solution in
the next iteration.
Figure 2: Example of bounding of the master objective of the Dakota furniture problem (upper and lower bounds over iterations 1–11).
However, finding out how these upper and lower bounds are calculated is rather
tedious. Let me present some intuition behind them here:
Since we have a minimization problem, the lower bound of the objective is the
best possible objective, right? And the upper bound represents the worst case. So,
after the first iteration we know that the expected profit of the Dakota furniture
company is between $14875 and $0 – not a very precise measure, huh?
Ok, now I reveal how the upper and lower bounds are calculated. The lower bound is
simply the optimal value of the master objective:

LB = c^T x^v + θ^v
The upper bound is the first stage cost plus the probability weighted sum of
optimal responses in the second stage:

UB = c^T x^v + Σ_s p_s w_s |_{x^v}
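With the scenario responses from the l = 4 iteration earlier in this note, the upper bound can be evaluated directly. The θ^v for that iteration was not reported above, so the lower bound is left as a function of θ^v rather than a number:

```python
# Bounds in the L-shaped method, evaluated with this note's l = 4 numbers.
c = [2, 4, 5.2]
x = [3500, 1250, 833.33]
p = [0.3, 0.4, 0.3]
w = [-5800, -15650, -21250]   # optimal second stage responses given x

first_stage_cost = sum(ci * xi for ci, xi in zip(c, x))
UB = first_stage_cost + sum(ps * ws for ps, ws in zip(p, w))

def LB(theta_v):
    # lower bound = optimal master objective, c^T x^v + theta^v
    return first_stage_cost + theta_v

print(round(UB, 2))
```

Here the first stage cost is about 16333.32 and the expected second stage response is −14375, so UB is about 1958.32 for this iteration.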
Aren't these two measures really the same? No, there is one big difference. When
calculating the lower bound, you are allowed to vary the first stage decisions
while simultaneously taking into account the impact of your choice of x in both the
first and the second stage – although the effect on the second stage can only be
approximated through the restrictions on theta. In the Dakota furniture problem,
this means that we choose the amount of inputs to buy while accounting for the
fact that lumber, finishing and carpentry represent both a direct cost and an
income opportunity in the second stage, and trade these two objectives off
against each other.
When calculating the upper bound, however, the choice of inputs is fixed, and the
only thing we can do about it is to make the best out of it when demand is
revealed. However, if the choice of inputs happened to be optimal, we will also
have an optimal solution.
Birge & Louveaux [1] describe the theory behind this in chapter 9. They state
that "The L-shaped method (…) is based on iteratively providing a lower bound
on the recourse objective, Q(x)." For details on the use of bounding, I refer the
interested reader to chapter 9, and section 9.3 in particular.
The lower bound improves continuously from iteration to iteration, whereas the upper
bound may stay the same for several iterations.
List of notation
e_l, E_l – optimality cut coefficients from iteration l
h – resource vector
l – iteration number in the L-shaped method
p – probability
s – scenario
T – technology matrix
W – recourse matrix
x – first stage decision variables
y – second stage decision variables
θ – approximation of the expected second stage objective
π – dual variables
References
[1] J. R. Birge and F. Louveaux, Introduction to Stochastic Programming. New
York: Springer, 1997.
[2] U. G. Christensen and A. B. Pedersen, "Lecture Note on Benders'
Decomposition," ed, 2008.
[3] C. C. Carøe and R. Schultz, "Dual Decomposition in Stochastic Integer
Programming," Konrad-Zuse-Zentrum für Informationstechnik Berlin, 1996.
[4] J. L. Higle, "Stochastic Programming: Optimization When Uncertainty
Matters," in Tutorials in Operations Research, ed: INFORMS, 2005.
[5] P. Kall and S. W. Wallace, Stochastic programming, 1 ed.: John Wiley &
Sons, 1994.
[6] E. Kalvelagen, "Benders Decomposition for Stochastic Programming with
GAMS," ed, 2003, p. 10.