BY
SIVA THEJA MAGULURI
DISSERTATION
Submitted in partial fulfillment of the requirements
for the degree of Doctor of Philosophy in Electrical and Computer Engineering
in the Graduate College of the
University of Illinois at Urbana-Champaign, 2014
Urbana, Illinois
Doctoral Committee:
Professor R. Srikant, Chair
Professor Bruce Hajek
Professor Pramod Viswanath
Assistant Professor Yi Lu
Associate Professor Lei Ying, Arizona State University
ABSTRACT
ACKNOWLEDGMENTS
When I look back at my past self at the time when I arrived here at UIUC as
a fresh college graduate, I see how much my experience at UIUC has helped
in my growth and how much I have learned here. I have been very fortunate
to have had an opportunity to work in CSL. The one person who played the
most important role in my experience here is obviously my advisor, Prof.
Srikant. He is not only an amazing teacher but also a great mentor. I thank
him for his constant guidance and support all through my grad school. He
taught me the basics of how research is done, how it is communicated and
above all how to think about research problems. He has always been there
to support me. My heart-felt thanks for his support, care, and concern for
my well-being.
I thank Prof. Lei Ying from Arizona State University, who has been my
coauthor for much of the work that went into this dissertation. His suggestions and insights have been very important in this work. I thank Prof. Bruce Hajek, who was my co-advisor during my master's. I also thank Prof. Bruce Hajek, Prof. Lei Ying, Prof. Yi Lu and Prof. Pramod Viswanath for serving
on my committee. I thank the anonymous reviewers who have taken time
to review my papers and have given their valuable feedback to improve the
results that went into this dissertation. Thanks are also due to Weina Wang
for her important comments on one of the results.
I thank UIUC for providing me with one of the best possible learning
environments. In addition to an excellent research environment in CSL,
the university has provided me an opportunity to learn through the various
courses that I took from ECE, CS and Maths. These courses had some of
the best teachers and I have thoroughly enjoyed taking them.
I thank my group members Rui, Chong, Javad and postdocs Chandramani
and Joohwan who have always been ready for discussions. Thanks are due
to Sreeram who has been a great partner in course projects and for the
Champaign-Urbana. If not for this family, my life here, with the harsh
winter, would have been very challenging. They have also provided me a
smooth transition from life in India to the US.
Over the last year of my stay in Chambana, Vedanta Study Circle and
my friends there have become the defining aspect of my life. I thank Chaitanya, Ramsai, Srikanthan, Raghavendra, Suraj, Sasidhar and Kalyan for
their company, our weekly meetings, philosophical discussions, outings and
relaxing dinners.
In the last year of my grad school, I had to undergo a major surgery. It
was one of the biggest challenges in my life and I could face it only with the
support of so many of my friends and well-wishers. I express my heartfelt
gratitude to Pooja, Chaitanya, Manju Aunty, Pushpa Aunty and Manish.
Since my parents could not be around, my advisor, Prof. Srikant, assumed
that role. He drove in harsh winter weather to visit me and he constantly
inquired about my well-being and progress. My sister, Hima Bindu, took
a week off in the very first month of her new job to attend to my needs.
I should specially mention my friend Ramsai in this context. He has been
my constant companion, caretaker and has attended to my every need. He
has shown in practice the philosophy of Karma Yoga (unselfish action). I do not have words to express my gratitude to him, and so I quote Swami Vivekananda, the famous Indian monk: "In happiness, in misery, in famine, in pain, in the grave, in heaven, or in hell, who never gives me up is my friend. Is such friendship a joke? A man may have salvation through such friendship. That brings salvation if we can love like that." I am fortunate to
have found such a friend in Ramsai.
I thank my parents for their unconditional love, constant support and encouragement in all my endeavors. They have made several sacrifices to give me the best in everything. Since my childhood, my mother has motivated me to excel and has taken pride in my success. She has supported my decision to pursue the PhD in a far-off country instead of a lucrative career in management closer to home, despite not completely appreciating why. I
thank my sister for her love and support.
Most importantly, I thank the Lord who, out of His grace, has more than
provided for my every need. I thank him for all the experiences he has put
me through and for giving me the strength to go through them. He has given
me the grad school experience which was a wonderful learning opportunity
not only academically and professionally, but also personally and spiritually.
He put me in touch with great friends, teachers and well-wishers. I humbly
dedicate this dissertation to Him.
TABLE OF CONTENTS
1 INTRODUCTION
1.1 Best Fit Is Not Throughput Optimal
1.2 A Stochastic Model for Cloud Computing

2 THROUGHPUT OPTIMALITY: KNOWN JOB SIZES
2.1 A Centralized Queueing Approach
2.2 Resource Allocation with Load Balancing
2.3 Simpler Load Balancing Algorithms
2.4 Discussion and Simulations
2.5 Conclusions

3 UNKNOWN JOB SIZES
3.1 Algorithm
3.2 Throughput Optimality: Geometric Job Sizes
3.3 Throughput Optimality: General Job Size Distribution
3.4 Local Refresh Times
3.5 Simulations
3.6 Conclusion

4 LIMITED PREEMPTION
4.1 Unknown Job Sizes
4.2 Conclusion

5 DELAY OPTIMALITY
5.1 Algorithm with Limited Preemption - Known Job Sizes
5.2 Throughput Optimality
5.3 Heavy Traffic Optimality
5.4 Power-of-Two-Choices Routing and MaxWeight Scheduling
5.5 Conclusions

6 CONCLUSION
6.1 Open Problems and Future Directions

A PROOF OF LEMMA 3.6
B PROOF OF LEMMA 3.7
CHAPTER 1
INTRODUCTION
Memory    CPU            Storage
15 GB     8 EC2 units    1,690 GB
17.1 GB   6.5 EC2 units  420 GB
7 GB      20 EC2 units   1,690 GB

Figure 1.1: The number of backlogged jobs under the best-fit policy and a MaxWeight policy
for all k. We let Nmax denote the maximum number of VMs of any type that
can be served on any server.
$A_m(t)$ denotes the total number of type-$m$ job arrivals at time $t$; routing is done so that $\sum_i A_{mi}(t) = A_m(t)$.
In each time slot, jobs are served at each server according to a scheduling
algorithm. Let Dmi (t) denote the number of type-m jobs that finish service
at server i in time slot t. Then the queue lengths evolve as follows:
$$Q_{mi}(t+1) = Q_{mi}(t) + A_{mi}(t) - D_{mi}(t).$$
The cloud system is said to be stable if the expected total queue length is
bounded, i.e.,
"
#
XX
lim sup E
Qmi (t) < .
t
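As a concrete illustration, the queue evolution above can be sketched in a few lines. This is a toy sketch of the model only; the arrival and departure values are placeholders, not the output of any particular routing or scheduling policy:

```python
def step_queues(Q, A, D):
    """One slot of the evolution Q_mi(t+1) = Q_mi(t) + A_mi(t) - D_mi(t).

    Q, A, D are dicts keyed by (job_type m, server i). Departures are capped
    at what is actually available, so queue lengths stay nonnegative.
    """
    return {k: Q[k] + A.get(k, 0) - min(D.get(k, 0), Q[k] + A.get(k, 0))
            for k in Q}

# Two job types, two servers, one slot with placeholder arrivals/departures.
Q = {(m, i): 0 for m in range(2) for i in range(2)}
Q = step_queues(Q, A={(0, 0): 3, (1, 1): 2}, D={(0, 0): 1})
```

Stability in the sense above then amounts to the time average of `sum(Q.values())` remaining bounded under the chosen routing and scheduling.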
$$\bar{C} = \left\{ N \in \mathbb{R}^M_+ : N = \sum_{i=1}^{L} N^{(i)} \text{ and } N^{(i)} \in \text{Conv}(\mathcal{N}_i) \right\},$$
$$C = \left\{ \lambda \in \mathbb{R}^M_+ : (\lambda \circ S) \in \bar{C} \right\},$$
where $(\lambda \circ S)$ denotes the Hadamard product, or entrywise product, of the vectors $\lambda$ and $S$ and is defined as $(\lambda \circ S)_m = \lambda_m S_m$. We use $\bar{\lambda}_m$ to denote $\lambda_m S_m$, so $\lambda \in C$ is the same as $\bar{\lambda} = (\lambda \circ S) \in \bar{C}$. We use $\text{int}(\cdot)$ to denote the interior of a set.
We next use a simple example to illustrate the definition of C.
Example 1.2. Consider a simple cloud system consisting of three servers.
Servers 1 and 2 are of the same type (i.e., they have the same amount of
resources), and server 3 is of a different type. Assume there are two types of
VMs. The set of feasible VM configurations on servers 1 and 2 is assumed to
be N1 = N2 = {(0, 0), (1, 0), (0, 1)}, i.e., each of these servers can at most host
either one type-1 VM or one type-2 VM. The set of feasible configurations on
[Figure: illustration of $\mathcal{N}_1$, $\text{Conv}(\mathcal{N}_1)$, $\text{Conv}(\mathcal{N}_3)$ and the resulting region $\bar{C}$ for Example 1.2.]
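The integer points of $\bar{C}$ for such a small instance can be enumerated directly by summing one feasible configuration per server. In the sketch below, $\mathcal{N}_1 = \mathcal{N}_2$ are taken from Example 1.2, while $\mathcal{N}_3$ is an assumed placeholder, not the set defined in the text:

```python
from itertools import product

# N1 and N2 are from Example 1.2; N3 is an assumed stand-in for the
# larger server 3, NOT the configuration set defined in the dissertation.
N1 = N2 = [(0, 0), (1, 0), (0, 1)]
N3 = [(0, 0), (1, 0), (0, 1), (1, 1)]

# Integer points of C-bar reachable without time sharing: entrywise sums
# of one configuration from each server.
C_bar_points = {tuple(a + b + c for a, b, c in zip(n1, n2, n3))
                for n1, n2, n3 in product(N1, N2, N3)}
```

Taking the convex hull of each $\mathcal{N}_i$ (time sharing within a server) then fills in the region between these integer points.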
CHAPTER 2
THROUGHPUT OPTIMALITY: KNOWN
JOB SIZES
Since the job sizes are known at arrival, when the jobs are queued, one
knows the total backlogged workload in the queue, i.e., the total amount of
time required to serve all the jobs in the queue. One can use this information
in the resource allocation problem. Let qmi (t) denote the total backlogged
workload of type m jobs at server i.
In this chapter, we will first draw an analogy with scheduling in an ad hoc
wireless network. We will show that the algorithms for ad hoc wireless networks can
be directly used here for a simplified system. Then, we present an almost
throughput optimal resource allocation algorithm for the cloud computing
system. The results in this chapter have been presented in [20] and [21].
Figure 2.1: Interference constraints for six users and three links, and the corresponding interference graph
The proposition states that the system is stable, i.e., $\sum_m \sum_i q_{mi}(t)$ remains bounded in expectation, whenever $\bar{\lambda} \in \text{int}(\bar{C})$.
The proof is based on bounding the drift of a quadratic Lyapunov function. The drift is shown to be negative outside a finite set, and the Foster-Lyapunov theorem is invoked to prove positive recurrence of the Markov chain corresponding to the system as long as the arrivals are within the capacity region. We skip the proof of this proposition since it is much simpler and is along the same lines as that of Theorem 2.1.
One drawback of MaxWeight scheduling in wireless networks is that its complexity increases exponentially with the number of wireless nodes. Moreover, it needs to be implemented in a centralized manner. However, for the cloud system, the server-by-server implementation in Algorithm 1 has much lower complexity and can be implemented in a distributed manner. Consider the following example. If there are L servers and each server has S allowed configurations, then when each server computes a separate MaxWeight allocation, it has to find a schedule from S allowed configurations. Since there are L servers, this is equivalent to finding a schedule from LS possibilities. However, for a centralized MaxWeight schedule, one has to find a schedule from $S^L$ schedules. Moreover, the complexity of each server's scheduling problem depends only on its own set of allowed configurations, which is independent of the total number of servers. Typically the data center is scaled by adding more servers rather than adding more allowable configurations.
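The per-server decomposition described above can be sketched as follows; the configuration sets and workloads below are illustrative, with the weights playing the role of the backlogged workloads $q_{mi}$:

```python
def maxweight_config(configs, weights):
    """Return the configuration N maximizing sum_m weights[m] * N[m]."""
    return max(configs, key=lambda N: sum(w * n for w, n in zip(weights, N)))

# Each of the L servers searches only its own S configurations (L*S
# evaluations in total), versus S**L joint schedules for a centralized
# MaxWeight over the whole cluster.
configs_per_server = [[(0, 0), (1, 0), (0, 1)],
                      [(0, 0), (2, 0), (0, 1)]]
workloads = [[5, 1], [1, 4]]  # q_{mi} for each server i, indexed by type m
schedule = [maxweight_config(c, q)
            for c, q in zip(configs_per_server, workloads)]
```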
Therefore, since one cannot compute a MaxWeight schedule in every time slot, a natural alternative is to choose a MaxWeight schedule every few time slots and to perform some reasonable scheduling between these time slots. It turns out that using a MaxWeight schedule once in a while is good enough.
Before we present the algorithm, we outline the basic ideas first. We group
T time slots into a super time slot, where T > Smax . At the beginning of a
super time slot, a configuration is chosen according to the MaxWeight algorithm. When jobs depart a server, the remaining resources in the server
are filled again using the MaxWeight algorithm; however, we impose the
constraint that only jobs that can be completed within the super slot can
be served. So the algorithm myopically (without consideration of the future) uses resources, but is queue-length aware since it uses the MaxWeight
algorithm. This is described more precisely in Algorithm 2.
Algorithm 2 Myopic MaxWeight allocation:
We group T time slots into a super time slot. At time slot $t$, consider the $i$th server. Let $N^{(i)}(t^-)$ be the set of VMs that are hosted on server $i$ at the beginning of time slot $t$, i.e., these correspond to the jobs that were scheduled in the previous time slot but are still in the system. These VMs cannot be reconfigured due to our nonpreemption requirement. The central controller finds a new vector of configurations $\widetilde{N}^{(i)}(t)$ to fill up the resources not used by $N^{(i)}(t^-)$, i.e.,
$$\widetilde{N}^{(i)}(t) \in \arg\max_{N \,:\, N + N^{(i)}(t^-) \in \mathcal{N}_i} \sum_m q_{mi}(t)\, N_m.$$
Note that this myopic MaxWeight allocation algorithm differs from the
server-by-server MaxWeight allocation in two aspects: (i) jobs are not interrupted when served and (ii) when a job departs from a server, new jobs
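The eligibility rule inside a super time slot can be isolated as a small check. This is a sketch under the assumption that slots within a super slot are indexed $0, \ldots, T-1$; it is not the full Algorithm 2:

```python
T = 10  # super time slot (frame) length; must exceed the maximum job size

def can_start(job_size, t):
    """A job may start service at global slot t only if it completes by the
    end of the current super time slot, so that the beginning of each frame
    is a point where a fresh MaxWeight configuration can be chosen."""
    t_in_frame = t % T
    return t_in_frame + job_size <= T
```

With this rule, every frame boundary is guaranteed to be free of mid-service jobs, at the cost of possibly idling resources near the end of a frame.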
For all other $t$, at the beginning of the time slot, a new configuration is chosen as follows:
$$\widetilde{N}^{(i)}(t) \in \arg\max_{N \,:\, N + N^{(i)}(t^-) \in \mathcal{N}_i} \sum_m q_{mi}(t)\, N_m. \tag{2.3}$$
a super time slot. For any $t$ such that $1 \leq (t \bmod T) \leq T - S_{\max}$, for each server $i$,
$$\begin{aligned}
\sum_m q_{mi}(t)\, N^{(i)}_m(t-1)
&= \sum_m q_{mi}(t)\, N^{(i)}_m(t^-) + \sum_m q_{mi}(t)\left(N^{(i)}_m(t-1) - N^{(i)}_m(t^-)\right) \\
&\stackrel{(a)}{\geq} \sum_m q_{mi}(t)\, N^{(i)}_m(t^-) + \sum_m q_{mi}(t)\,\widetilde{N}^{(i)}_m(t)\,\mathbb{I}_{\{q_{mi}(t) \geq S_{\max} N_{\max}\}} \\
&\geq \sum_m q_{mi}(t)\, N^{(i)}_m(t^-) + \sum_m q_{mi}(t)\,\widetilde{N}^{(i)}_m(t) - M S_{\max} N_{\max}^2 \\
&\stackrel{(b)}{\geq} \sum_m q_{mi}(t)\, N^{(i)}_m(t) - M S_{\max} N_{\max}^2,
\end{aligned}$$
where inequality (a) follows from the definition of $\widetilde{N}^{(i)}_m(t)$, and inequality (b) holds because when $q_{mi}(t) \geq S_{\max} N_{\max}$, there are enough type-$m$ jobs to be allocated to the servers, and when $1 \leq (t \bmod T) \leq T - S_{\max}$, all backlogged jobs are eligible to be served in terms of job sizes. Now, since $|q_{mi}(t) - q_{mi}(t-1)| \leq a_{mi}(t-1) + N_{\max}$, we have
$$\sum_m q_{mi}(t-1)\, N^{(i)}_m(t-1) \geq -\delta_0 + \sum_m q_{mi}(t)\, N^{(i)}_m(t), \tag{2.4}$$
where $\delta_0 = M N_{\max}^2 (S_{\max} + 1)$.
Let $V(t) = |q(t)|^2$ be the Lyapunov function, and let $t = nT + \tau$ for $0 \leq \tau < T$. Then,
$$\begin{aligned}
&E[V(t+1) - V(t) \mid q(nT) = q] &(2.5)\\
&= E\left[\left. 2\sum_m\sum_i q_{mi}(t)\big(a_{mi}(t) - N^{(i)}_m(t)\big) + \sum_m\sum_i \big(a_{mi}(t) - N^{(i)}_m(t)\big)^2 \,\right|\, q(nT) = q\right] &(2.6)\\
&\leq K_1 + 2E\left[\left.\sum_m\sum_i q_{mi}(t)\,a_{mi}(t) \,\right|\, q(nT) = q\right] - 2E\left[\left.\sum_m\sum_i q_{mi}(t)\, N^{(i)}_m(t) \,\right|\, q(nT) = q\right] &(2.7)\\
&= K_1 + 2\sum_m E\left[\left. q_{m\, i^*_m(t)}(t)\, a_m(t) \,\right|\, q(nT) = q\right] - 2E\left[\left.\sum_m\sum_i q_{mi}(t)\, N^{(i)}_m(t) \,\right|\, q(nT) = q\right] &(2.8)\\
&= K_1 + 2\sum_m \lambda_m\, E\left[\left. q_{m\, i^*_m(t)}(t) \,\right|\, q(nT) = q\right] - 2E\left[\left.\sum_m\sum_i q_{mi}(t)\, N^{(i)}_m(t) \,\right|\, q(nT) = q\right] &(2.9)\\
&\leq K_1 + 2\sum_m \tau\lambda_m^2 + 2\sum_m \lambda_m\, q_{m\, i^*_m} - 2E\left[\left.\sum_m\sum_i q_{mi}(t)\, N^{(i)}_m(t) \,\right|\, q(nT) = q\right], &(2.10),(2.11)
\end{aligned}$$
where $K_1 = MLN_{\max}^2 + \sum_m (\sigma_m^2 + \lambda_m^2 + 2\lambda_m N_{\max})$ and $i^*_m = i^*_m(nT) = \arg\min_{i \in \{1,2,\ldots,L\}} q_{mi}$. Equation (2.8) follows from the definition of $a_{mi}$ in the routing algorithm in (5.2); (2.9) follows from the independence of the arrival process from the queue length process; and (2.10) follows from the fact that $E[q_{m\, i^*_m(t)}(t)] \leq E[q_{m\, i^*_m(nT)}(t)] \leq E\big[q_{m\, i^*_m(nT)}(nT) + \sum_{t'=nT}^{t-1} a_m(t')\big] = E[q_{m\, i^*_m(nT)}] + \tau\lambda_m$.
Now, applying (2.4) repeatedly for $t \in [nT, (n+1)T - S_{\max}]$, summing over $i$, and using the fact that $\sum_i a_{mi}(t) = a_m(t)$, we get
$$\sum_i \sum_m q_{mi}(t)\, N^{(i)}_m(t) \geq -\delta_0 L(t - nT) + \sum_i \sum_m q_{mi}(nT)\, N^{(i)}_m(nT) - \sum_{t'=nT}^{(n+1)T - S_{\max} - 1} \sum_m a_m(t')\, N_{\max}. \tag{2.12}$$
Since $\frac{(1+\epsilon)T}{T - S_{\max}}\,\bar{\lambda} \in \bar{C}$, there exist $\bar{\lambda}^i$ such that $\frac{(1+\epsilon)T}{T - S_{\max}}\,\bar{\lambda}^i \in \text{Conv}(\mathcal{N}_i)$ for all $i$ and $\bar{\lambda} = \sum_i \bar{\lambda}^i$. According to the definition of the MaxWeight configuration $N^{(i)}(nT)$,
$$\frac{(1+\epsilon)T}{T - S_{\max}} \sum_m q_{mi}(nT)\,\bar{\lambda}^i_m \leq \sum_m q_{mi}(nT)\, N^{(i)}_m(nT). \tag{2.13}$$
Thus, we get
$$\begin{aligned}
\sum_i \sum_m q_{mi}(t)\, N^{(i)}_m(t)
&\geq -\delta_0 L(t - nT) + \sum_i \sum_m q_{mi}(nT)\, N^{(i)}_m(nT) - \sum_{t'=nT}^{(n+1)T - S_{\max} - 1} \sum_m a_m(t')\, N_{\max} &(2.14)\\
&\geq -\delta_0 L(t - nT) + \frac{(1+\epsilon)T}{T - S_{\max}} \sum_i \sum_m q_{mi}(nT)\,\bar{\lambda}^i_m - \sum_{t'=nT}^{(n+1)T - S_{\max} - 1} \sum_m a_m(t')\, N_{\max}. &(2.15)
\end{aligned}$$
Substituting the conditional expectation of (2.15) into (2.11), we get, for $1 \leq (t \bmod T) \leq T - S_{\max}$,
$$E[V(t+1) - V(t) \mid q(nT) = q] \leq K_1 + 2\sum_m \tau\lambda_m^2 + 2\delta_0 L(t - nT) + 2N_{\max} T \sum_m \lambda_m + 2\sum_m \lambda_m\, q_{m\, i^*_m} - 2(1+\epsilon)\frac{T}{T - S_{\max}} \sum_i \sum_m q_{mi}\,\bar{\lambda}^i_m. \tag{2.16}$$
Note that $\lambda_m\, q_{m\, i^*_m} \leq \bar{\lambda}_m\, q_{m\, i^*_m} = \sum_i \bar{\lambda}^i_m\, q_{m\, i^*_m} \leq \sum_i \bar{\lambda}^i_m\, q_{mi}$. We will now use this relation and sum the drift for $\tau$ from $0$ to $T-1$. Using (2.16) for $\tau \in [0, T - S_{\max}]$, and (2.11) for the remaining $\tau$, we get
$$\begin{aligned}
E[V((n+1)T) - V(nT) \mid q(nT) = q]
&\leq T K_1 + 2\sum_m (\lambda_m^2 + \lambda_m N_{\max}) \sum_{\tau=0}^{T-1} \tau + 2L\delta_0 \sum_{\tau=0}^{T - S_{\max} - 1} \tau \\
&\quad + 2T \sum_{i,m} \bar{\lambda}^i_m\, q_{mi} - 2(1+\epsilon)\frac{T}{T - S_{\max}}\,(T - S_{\max}) \sum_{i,m} \bar{\lambda}^i_m\, q_{mi} \\
&\leq K_2 - 2\epsilon T \sum_{i,m} \bar{\lambda}^i_m\, q_{mi},
\end{aligned}$$
where $K_2 = T K_1 + 2\sum_m (\lambda_m^2 + \lambda_m N_{\max}) \sum_{\tau=0}^{T-1} \tau + 2L\delta_0 \sum_{\tau=0}^{T - S_{\max} - 1} \tau$. Let
$$B = \Big\{ Y : \sum_i \sum_m q_{mi}(Y)\,\bar{\lambda}^i_m \leq K_2 / 2\epsilon T \Big\}.$$
The set $B$ is finite. This is because there are only a finite number of $q \in \mathbb{Z}^{ML}_+$ vectors satisfying $\sum_i \sum_m q_{mi}\,\bar{\lambda}^i_m \leq K_2/2\epsilon T$, and for each $q$, there are a finite number of states $Y$ such that $q = q(Y)$. Clearly the drift $E[V((n+1)T) - V(nT) \mid q(nT) = q]$ is negative outside the finite set $B$. Then from the Foster-Lyapunov theorem [22, 23], we have that the sampled Markov chain $\widetilde{Y}(n) \triangleq Y(nT)$ is positive recurrent, and so $\lim_{n} E[\sum_i \sum_m q_{mi}(nT)] < \infty$. For any time $t$ between $nT$ and $(n+1)T$,
we have
"
#
"
#
t1
XX
XX
XX X
E
qmi (t) E
qmi (nT ) +
ami (t0 )
i
=E
"
XX
i
m t0 =nT
#
qmi (nT ) + T
20
X
m
m.
Therefore, we have
$$\lim_{t\to\infty} E\left[\sum_i\sum_m q_{mi}(t)\right] \leq \lim_{n\to\infty} E\left[\sum_i\sum_m q_{mi}(nT)\right] + T\sum_m \lambda_m < \infty.$$
This proves the theorem.
Since the workload $q_{mi}(t)$ is always at least as much as the number of jobs $Q_{mi}(t)$, from Proposition 1.1 we have that any arrival rate $\lambda \notin C$ is not supportable. Thus, we have that Algorithm 3 is almost throughput optimal. By this, we mean that given any arrival rate $\lambda \in C$, we can choose the parameter $T$ so that the system is stable.
$$a_{mi}(t) = \begin{cases} a_m(t) & \text{if } i = i^*_m(t) \\ 0 & \text{otherwise.} \end{cases}$$
Otherwise, the algorithm is identical to the JSQ-Myopic MaxWeight algorithm considered earlier. The following theorem shows that the throughput performance of the power-of-two-choices algorithm is similar to that of the JSQ routing algorithm when all the servers are identical.

Theorem 2.2. When all the servers are identical, any load vector $\lambda$ that satisfies $\frac{T}{T - S_{\max}}\,\lambda \in \text{int}(C)$ is stabilizable under the power-of-two-choices routing and myopic MaxWeight allocation algorithm.
Proof. Again, we use $V(t) = |q(t)|^2$ as the Lyapunov function. Then, from (2.7), we have
$$\begin{aligned}
E[V(t+1) - V(t) \mid q(nT) = q] &\leq K_1 + 2E\left[\left.\sum_m\sum_i q_{mi}(t)\,a_{mi}(t) \,\right|\, q(nT) = q\right] \\
&\quad - 2E\left[\left.\sum_m\sum_i q_{mi}(t)\, N^{(i)}_m(t) \,\right|\, q(nT) = q\right]. &(2.17)
\end{aligned}$$
For fixed $m$, let $X_m(t)$ be the random variable that denotes the two servers chosen by the routing algorithm at time $t$ for jobs of type $m$. $X_m(t)$ is then uniformly distributed over all sets of two servers. Now, using the tower property of conditional expectation, we have
$$\begin{aligned}
&E\left[\left.\sum_i q_{mi}(t)\,a_{mi}(t) \,\right|\, q(nT) = q\right] \\
&= E\Big[\, E\Big[\sum_i q_{mi}(t)\,a_{mi}(t) \,\Big|\, q(nT) = q,\ X_m(t) = \{i', j'\}\Big] \,\Big|\, q(nT) = q\Big] \\
&= E\big[\, E\left[\min(q_{mi'}(t), q_{mj'}(t))\, a_m(t) \mid q(nT) = q,\ X_m(t) = \{i', j'\}\right] \,\big|\, q(nT) = q\big] &(2.18)\\
&\leq E\left[\, E\left[\left.\frac{q_{mi'}(t) + q_{mj'}(t)}{2}\, a_m(t) \,\right|\, q(nT) = q,\ X_m(t) = \{i', j'\}\right] \,\Big|\, q(nT) = q\right] \\
&= E\left[\left.\frac{\lambda_m}{L}\sum_i q_{mi}(t) \,\right|\, q(nT) = q\right] &(2.19)\\
&\leq \frac{\lambda_m}{L}\, E\left[\left.\sum_i q_{mi}(nT) + \sum_{t'=nT}^{t-1}\sum_i a_{mi}(t') \,\right|\, q(nT) = q\right] \\
&\leq \frac{\lambda_m}{L}\left(\sum_i q_{mi} + \tau\lambda_m\right), &(2.20)
\end{aligned}$$
where (2.20) follows from the fact that $E[\sum_i a_{mi}(t')] = E[a_m(t')] = \lambda_m$.
Since the scheduling algorithm is identical to Algorithm 3, the bound in (2.12) still holds:
$$\sum_i \sum_m q_{mi}(t)\, N^{(i)}_m(t) \geq -\delta_0 L(t - nT) + \sum_i \sum_m q_{mi}(nT)\, N^{(i)}_m(nT) - \sum_{t'=nT}^{(n+1)T - S_{\max} - 1} \sum_m a_m(t')\, N_{\max}. \tag{2.21}$$
Since $\frac{(1+\epsilon)T}{T - S_{\max}}\,\bar{\lambda} \in \bar{C}$ and we assumed that all the servers are identical, $\bar{C}$ is obtained by summing $L$ copies of $\text{Conv}(\mathcal{N})$; so $\frac{(1+\epsilon)T}{T - S_{\max}}\,\frac{\bar{\lambda}}{L} \in \text{Conv}(\mathcal{N})$, and by the definition of the MaxWeight configuration,
$$\frac{(1+\epsilon)T}{T - S_{\max}} \sum_m q_{mi}(nT)\,\frac{\bar{\lambda}_m}{L} \leq \sum_m q_{mi}(nT)\, N^{(i)}_m(nT).$$
Thus, we get
$$\sum_i \sum_m q_{mi}(t)\, N^{(i)}_m(t) \geq -\delta_0 L(t - nT) + \frac{(1+\epsilon)T}{T - S_{\max}} \sum_i \sum_m q_{mi}(nT)\,\frac{\bar{\lambda}_m}{L} - \sum_{t'=nT}^{(n+1)T - S_{\max} - 1} \sum_m a_m(t')\, N_{\max}.$$
Now, substituting this and (2.20) in (2.17) and summing over $t \in [nT, (n+1)T - 1]$, as in the proof of Theorem 2.1, we get
$$\begin{aligned}
E[V((n+1)T) - V(nT) \mid q(nT) = q]
&\leq T K_1 + 2\sum_m (\lambda_m^2 + \lambda_m N_{\max}) \sum_{\tau=0}^{T-1} \tau + 2L\delta_0 \sum_{\tau=0}^{T - S_{\max} - 1} \tau \\
&\quad + 2T \sum_{i,m} \frac{\lambda_m}{L}\, q_{mi} - 2(1+\epsilon)\frac{T}{T - S_{\max}}\,(T - S_{\max}) \sum_{i,m} \frac{\bar{\lambda}_m}{L}\, q_{mi} \\
&\leq K_2 - 2\epsilon T \sum_{i,m} \frac{\lambda_m}{L}\, q_{mi},
\end{aligned}$$
where the last inequality uses $\lambda_m \leq \bar{\lambda}_m$.
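A sketch of the power-of-two-choices routing step analyzed above; the queue values are illustrative, and ties go to the first sampled server:

```python
import random

def po2_route(queues, rng=random.Random(0)):
    """Sample two distinct servers uniformly at random and send the
    arriving jobs of this type to the one with the smaller queue."""
    i, j = rng.sample(range(len(queues)), 2)
    return i if queues[i] <= queues[j] else j

dest = po2_route([5, 0, 7, 3])
```

Each routing decision inspects only two queues, yet the theorem above shows its throughput matches JSQ routing when the servers are identical.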
$$a_{mi}(t) = \begin{cases} a_m(t) & \text{if } i = \widetilde{i}_m(t) \\ 0 & \text{otherwise,} \end{cases}$$
where $\widetilde{i}_m(t)$ is the server kept by the pick-and-compare rule, i.e., the arg min of the queue length between the previously stored server and a newly sampled server.
Theorem 2.3. Assume $a_m(t) \leq a_{\max}$ for all $t$ and $m$. Any job load vector $\lambda$ that satisfies $\frac{T}{T - S_{\max}}\,\lambda \in \text{int}(C)$ is stabilizable under the pick-and-compare routing and myopic MaxWeight allocation algorithm.
Proof. Consider the irreducible Markov chain $\overline{Y}(t) = (Y(t), \{\widetilde{i}_m(t)\}_m)$ and the Lyapunov function $V(t) = |q(t)|^2$. Then, as in the proof of Theorem 2.2, similar to (2.17), for $t \geq nT$ we have
$$\begin{aligned}
E[V(t+1) - V(t) \mid q(nT) = q,\ \widetilde{i}_m(nT) = i_0]
&\leq K_1 + 2E\left[\left.\sum_m\sum_i q_{mi}(t)\,a_{mi}(t) \,\right|\, q(nT) = q,\ \widetilde{i}_m(nT) = i_0\right] \\
&\quad - 2E\left[\left.\sum_m\sum_i q_{mi}(t)\, N^{(i)}_m(t) \,\right|\, q(nT) = q,\ \widetilde{i}_m(nT) = i_0\right]. &(2.22)
\end{aligned}$$
This is possible because if $\lambda_m > 0$ and $\lambda$ is not on the boundary of $C$, one can find such an $\epsilon > 0$. Analogously to (2.15), we have
$$\sum_i \sum_m q_{mi}(t)\, N^{(i)}_m(t) \geq -L(t - nT)\,\delta'' + \frac{(1+\epsilon)T}{T - S_{\max}} \sum_i \sum_m q_{mi}(nT)\,\bar{\lambda}^i_m, \tag{2.23}$$
where $\delta'' = M N_{\max}(a_{\max} + N_{\max}) + M S_{\max} N_{\max}^2$.
We also need a bound on the increase in $\sum_i \sum_m q_{mi}(t)\, N^{(i)}_m(t)$ over multiple super time slots:
$$\begin{aligned}
\sum_i\sum_m q_{mi}(nT)\, N^{(i)}_m(nT)
&\leq \sum_i\sum_m q_{mi}((n+n_0)T)\, N^{(i)}_m(nT) + n_0 T L M N_{\max}^2 \\
&\leq \sum_i\sum_m q_{mi}((n+n_0)T)\, N^{(i)}_m((n+n_0)T) + n_0 T L\,\delta'',
\end{aligned}$$
where the second inequality follows from the fact that we use MaxWeight scheduling every $T$ slots and from the definition of $\delta''$. Now, again, using (2.13) and (2.23), for any $t$ such that $1 \leq (t \bmod T) \leq T - S_{\max}$, we have
$$\sum_i\sum_m q_{mi}(t)\, N^{(i)}_m(t) \geq -L(t - nT)\,\delta'' + \frac{(1+\epsilon)T}{T - S_{\max}} \sum_i\sum_m q_{mi}(nT)\,\bar{\lambda}^i_m. \tag{2.24}$$
Fix $m$. Let $i^m_{\min} = \arg\min_{i \in \{1,2,\ldots,L\}} q_{mi}(nT)$. Note that $|q_{mi}(t) - q_{mi}(t-1)| \leq a_{\max} + N_{\max}$. Therefore, once there is a $t_0 \geq nT$ such that $\widetilde{i}_m(t_0)$ satisfies
$$q_{m\,\widetilde{i}_m(t_0)}(t_0) \leq q_{m\, i^m_{\min}}(t_0), \tag{2.25}$$
then, for all $t \geq t_0$, we have $q_{m\,\widetilde{i}_m(t)}(t) \leq q_{m\, i^m_{\min}}(nT) + (t - nT)(a_{\max} + N_{\max})$. The probability that (2.25) does not happen is at most $\left(1 - \frac{1}{L}\right)^{t_0 - nT}$. Choose $t_0$ so that this probability is less than $p = \epsilon/4$. Choose $k$ so that $kT > (t_0 - nT)$ and $((n+k)T - t_0) + (t_0 - nT) \leq kT(1 + \epsilon/4)$.
Then
$$\sum_{t=nT}^{(n+k)T-1} E\left[\left.\sum_i q_{mi}(t)\,a_{mi}(t) \,\right|\, q(nT) = q,\ \widetilde{i}_m(nT) = i_0\right]
= \sum_{t=nT}^{t_0} E[\,\cdot\,] + \sum_{t=t_0}^{(n+k)T-1} E[\,\cdot\,]. \tag{2.26}$$
Bounding the second sum using the minimum queue when (2.25) holds (which has probability at least $1-p$) and the full sum $\sum_i q_{mi}$ otherwise, and bounding $|q_{mi}(t) - q_{mi}(nT)|$ by $(t - nT)(a_{\max} + N_{\max})$, we get
$$\begin{aligned}
&\leq K_3 + (1-p)\,((n+k)T - t_0) \sum_i \bar{\lambda}^i_m\, q_{mi} + (1-p)(t_0 - nT) \sum_i \bar{\lambda}^i_m\, q_{mi} + p\,kT \sum_i \bar{\lambda}^i_m\, q_{mi} &(2.30)\\
&\leq (1-p)\,kT\left(1 + \frac{\epsilon}{4}\right) \sum_i \bar{\lambda}^i_m\, q_{mi} + \left(1 + \frac{\epsilon}{4}\right) p\,kT \sum_i \bar{\lambda}^i_m\, q_{mi} + K_3 &(2.31)\\
&\leq K_3 + kT\left(1 + \frac{\epsilon}{4}\right)^2 \sum_i \bar{\lambda}^i_m\, q_{mi} \leq K_3 + kT\left(1 + \frac{3\epsilon}{4}\right) \sum_i \bar{\lambda}^i_m\, q_{mi}, &(2.32)
\end{aligned}$$
where $K_3 = \lambda_m (a_{\max} + N_{\max}) \sum_{\tau=0}^{kT} \tau$.
Similarly, from (2.24), we have
$$\sum_{t=nT}^{(n+k)T-1} 2E\left[\left.\sum_m\sum_i q_{mi}(t)\, N^{(i)}_m(t) \,\right|\, q(nT) = q,\ \widetilde{i}_m(t) = i_0\right] \geq 2(1+\epsilon)\frac{T}{T - S_{\max}}\, k(T - S_{\max}) \sum_m\sum_i \bar{\lambda}^i_m\, q_{mi} - K_4'$$
for a constant $K_4'$ that does not depend on $q$. Combining the two bounds, we get
$$E[V((n+k)T) - V(nT) \mid q(nT) = q,\ \widetilde{i}_m(nT) = i_0] \leq K_4 + 2kT\left(1 + \frac{3\epsilon}{4}\right) \sum_m\sum_i \bar{\lambda}^i_m\, q_{mi} - 2(1+\epsilon)kT \sum_m\sum_i \bar{\lambda}^i_m\, q_{mi} \leq K_4 - \frac{\epsilon}{2}\, kT \sum_m\sum_i \bar{\lambda}^i_m\, q_{mi},$$
where $K_4 = kT K_1 + M K_3 + 2L\delta''\, kT \sum_{\tau=0}^{S_{\max} - 1} \tau$. The result follows from the Foster-Lyapunov theorem [22, 23].
Note that the assumption that the arrivals in each time slot are bounded by $a_{\max}$ can easily be relaxed, similar to the proofs of Theorems 2.1 and 2.2.
Figure 2.2: Comparison of the mean delays in the cloud computing cluster in the case with a common queue and in the case with power-of-two-choices routing when the frame size is 4000

and 500. Therefore, the average job size is 130.5 and the maximum job size is 500. We call this distribution A.
We further assume that the number of type-$i$ jobs arriving at each time slot follows a binomial distribution with parameters $(\lambda_i^{(1)}/130.5,\ 100)$. We varied the traffic intensity parameter from 0.5 to 1 in our simulations.
traffic intensity parameter from 0.5 to 1 in our simulations. Here traffic
intensity is the factor by which the load vector has to be divided so that it lies
on the boundary of the capacity region. Each simulation was run for 500, 000
time slots. First we study the difference between power-of-two-choice routing
and JSQ routing by comparing the mean delays of the two algorithms at
various traffic intensities for different choices of frame sizes. Our simulation
results indicate that the delay performance of the two algorithms was not
very different. For concision, we only provide a representative sample of our
simulations here for the case where the frame size is 4000 in Figure 2.2.
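The arrival process used in these simulations can be generated as below. This is a sketch: `p` stands for the per-trial probability implied by the stated parameters and the traffic intensity, which is an assumption about the setup rather than the exact code used:

```python
import random

def binomial_arrivals(p, n=100, rng=random.Random(1)):
    """Number of jobs of one type arriving in a slot: Binomial(n, p)."""
    return sum(1 for _ in range(n) if rng.random() < p)
```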
Next, we show the performance of our algorithms for various values of the
frame size T in Figure 2.3. Again, we have only shown a representative sample
for the power-of-two-choices routing (with myopic MaxWeight scheduling).
From Theorems 2.1 and 2.2, we know that any load less than $\frac{T - S_{\max}}{T}$ times the boundary of the capacity region is supportable.
2.5 Conclusions
In this chapter, we have studied a few different resource allocation algorithms
for the stochastic model of IaaS cloud computing that was presented in the
Introduction. We assume that job sizes are known and are bounded and that
CHAPTER 3
UNKNOWN JOB SIZES
The algorithms presented in Chapter 2 assume that the job sizes are bounded
and are known when the job arrives into the system. This assumption is not
realistic in some settings. Many times, the duration of a job is not known
until the job finishes. Moreover the bounded job sizes assumption excludes
simple job size distributions like geometric distribution because it is not
bounded. In this chapter, we will present algorithms when job sizes are not
known.
The scheduling algorithm presented in this chapter is inspired by the one
studied in [19] for input-queued switches. Since a MaxWeight schedule cannot be chosen in every time slot without disturbing the jobs in service, a
MaxWeight schedule is chosen only at every refresh time. A time slot is
called a refresh time if no jobs are in service at the beginning of the time
slot. Between the refresh times, either the schedule can be left unchanged
or a greedy MaxWeight schedule can be chosen. It was argued that such a
scheduling algorithm is throughput optimal in a switch.
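The refresh-time rule is simple to state in code: a new MaxWeight schedule is sampled exactly at slots where no job is mid-service. This sketch shows only that bookkeeping; the MaxWeight step itself, and the greedy scheduling between refresh times, are elided:

```python
def is_refresh_time(remaining_service):
    """A slot is a refresh time if no job is in service at its beginning,
    i.e., every tracked job has zero remaining service requirement."""
    return all(r == 0 for r in remaining_service)
```

The difficulty addressed in this chapter is showing that such refresh times recur often enough in expectation, without appealing to Blackwell's renewal theorem.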
The proof of throughput optimality in [19] is based on first showing that the
duration between consecutive refresh times is bounded so that a MaxWeight
schedule is chosen often enough. Blackwell's renewal theorem was used to show this result. Since Blackwell's renewal theorem is applicable only in steady state, we were unable to verify the correctness of the proof.
Furthermore, to bound the refresh times of the system, it was claimed
in [19] that the refresh time for a system with infinitely backlogged queues
provides an upper bound over the system with arrivals. This is not true for
every sample path. For a set of jobs with given sizes, the arrivals could be
timed in such a way that the system with arrivals has a longer refresh time
than an infinitely backlogged system.
For example, consider the following scenario. Let the original system be
called system 1 and the system with infinitely backlogged queues, system 2.
System 1 could have empty queues while system 2 never has empty queues.
Say T0 is a time when all jobs finish service for system 2. This does not
guarantee that all jobs finish service for system 1. This is because system 1
could be serving just one job at time $T_0 - 1$, when there could be an arrival of a job that is two time slots long. Let us say that it can be scheduled simultaneously
with the job in service. This job then will not finish its service at time T0 ,
and so T0 is not a refresh time for system 1.
The result in [19] does not impose any conditions on job size distribution.
However, this insensitivity to job size distribution seems to be a consequence
of the relationship between the infinitely backlogged system and the finite
queue system which is assumed there, but which we do not believe is true in
general.
In particular, the examples presented in [28], as well as an example similar to the counterexample to the Best-Fit policy in Chapter 1, show that the policy presented in [19] is not throughput optimal when the job sizes are deterministic.
Here, we develop a coupling technique to bound the expected time between two refresh times. With this technique, we do not need to use Blackwell's renewal theorem. The coupling argument is also used to precisely state how the system with an infinitely backlogged queue provides an upper bound on the mean duration between refresh times.
In this chapter we present a throughput-optimal scheduling and load balancing algorithm for a cloud data center when the job sizes are unknown. Job sizes are assumed to be unknown not only at arrival but also at the beginning of service. This algorithm is based on using queue lengths (the number of jobs in the queue) as weights in the MaxWeight schedule instead of the workload, as in the algorithm in Chapter 2. The scheduling part of our algorithm
is based on [19], but includes an additional routing component. Further, our
proof of throughput-optimality is different from the one in [19] due to the
earlier mentioned reasons.
If the job sizes are known, one can then use backlogged workload as the
weight in the algorithm presented here. In that case, this algorithm does not
waste any resources, unlike the algorithms in Chapter 2, which force a refresh time every T time slots, potentially wasting resources in the process. In
particular, when the job sizes have high variability, the amount of wastage
can be high. However, the algorithm in this chapter works even when the
job sizes are not bounded, for instance, when the job sizes are geometrically
distributed.
In terms of proof technique, in this chapter, we make the following contributions:
1. We use a coupling technique to show that the mean duration between
refresh times is bounded. We then use Wald's identity to bound the
drift of a Lyapunov function between the refresh times.
2. Our algorithm can be used with a large class of weight functions to
compute the MaxWeight schedule (for example, the ones considered in
[29]) in the case of geometric job sizes. For general job sizes, we use
log-weight functions. Log-weight functions are known to have good
performance properties [30] and are also amenable to low-complexity
implementations using randomized algorithms [31, 32].
3. Since we allow general job-size distributions, it is difficult to find a Lyapunov function whose drift is negative outside a finite set, as required by
the Foster-Lyapunov theorem which is typically used to prove stability
results. Instead, we use a theorem in [33] to prove our stability result,
but this theorem requires that the drift of the Lyapunov function be
(stochastically) bounded. We present a novel modification of the typical Lyapunov function used to establish the stability of MaxWeight
algorithms to verify the conditions of the theorem in [33].
We will first present the refresh-time-based scheduling algorithm and argue
that the refresh times are bounded. We illustrate the use of this result by
first proving throughput optimality in the simple case when the job sizes are
geometrically distributed. In the following section, we present the proof for
the case of general job size distributions. The results in this chapter have
been presented in [34] and [35].
3.1 Algorithm
There are two main challenges in this setting. The first is that, since the job
sizes are unknown, the total backlogged workload cannot be used to calculate
the weight of different schedules. We address this by using the number of jobs
If it is not a refresh time, $\widetilde{N}^{(i)}_m(t) = \widetilde{N}^{(i)}_m(t-1)$. However, $\widetilde{N}^{(i)}_m(t)$ jobs of type $m$ may not be present at server $i$, in which case all the jobs in the queue that are not yet being served will be included in the new configuration. If $\overline{N}^{(i)}_m(t)$ denotes the actual number of type-$m$ jobs selected at server $i$, then the configuration at time $t$ is $N^{(i)}(t) = \overline{N}^{(i)}(t)$. Otherwise, i.e., if there are enough jobs at server $i$, $N^{(i)}(t) = \widetilde{N}^{(i)}(t)$.
arbitrarily high variance would be allowed, but purely heavy-tailed distributions would not be allowed under our model.
Obtain a new process $\widetilde{X}(n)$ by sampling the Markov chain $X(t)$ at the refresh times, i.e., $\widetilde{X}(n) = X(t_n)$. Note that $\widetilde{X}(n)$ is also a Markov chain because, conditioned on $\widetilde{Q}(n) = Q(t_n) = q_0$ (and so $N(t_n)$), the future evolution of $X(t)$, and so of $\widetilde{X}(n)$, is independent of the past.
Using $V(X) = V(Q) = \sum_m \sum_i S_m Q_{mi}^2$ as the Lyapunov function, we will first show that the drift of the Markov chain is negative outside a bounded set. This gives positive recurrence of the sampled Markov chain from the Foster-Lyapunov theorem. We will then use this to prove the positive recurrence of the original Markov chain.
First consider the following one-step drift of $V(t)$. Let $t = t_n + \tau$ for $0 \leq \tau < z_n$.
$$\begin{aligned}
V(t+1) - V(t) &= \sum_m \sum_i S_m \big(Q_{mi}(t) + A_{mi}(t) - D_{mi}(t)\big)^2 - S_m Q_{mi}^2(t) \\
&= 2\sum_m \sum_i S_m Q_{mi}(t)\big(A_{mi}(t) - D_{mi}(t)\big) + \sum_m \sum_i S_m \big(A_{mi}(t) - D_{mi}(t)\big)^2 \\
&\leq 2\sum_m \sum_i S_m Q_{mi}(t)\big(A_{mi}(t) - D_{mi}(t)\big) + K_5,
\end{aligned}$$
where $K_5 = L(A_{\max} + N_{\max})^2 \big(\sum_m S_m\big)$.
Now using this relation in the drift of the sampled system, we get the following. With a slight abuse of notation, we denote $E[(\cdot) \mid Q(t_n) = q]$ by $E_q[(\cdot)]$.
$$\begin{aligned}
E[V(\widetilde{Q}(n+1)) - V(\widetilde{Q}(n)) \mid \widetilde{Q}(n) = q]
&= E[V(t_{n+1}) - V(t_n) \mid Q(t_n) = q] \\
&= E_q\left[\sum_{\tau=0}^{z_n - 1} V(t_n + \tau + 1) - V(t_n + \tau)\right] \\
&\leq E_q\left[2\sum_{\tau=0}^{z_n - 1} \sum_{m,i} \Big( S_m Q_{mi}(t_n+\tau)\, A_{mi}(t_n+\tau) - S_m Q_{mi}(t_n+\tau)\, D_{mi}(t_n+\tau) \Big)\right] + K_5\, E_q[z_n]. &(3.3)
\end{aligned}$$
The last term above is bounded by K5 K1 from Lemma 3.1. We will now
bound the first term in (3.3). From the definition of Ami in the routing
algorithm in (3.1), we have
\begin{align}
2E_q&\Big[\sum_{\tau=0}^{z_n-1} \sum_m \sum_i \bar{S}_m Q_{mi}(t_n+\tau)\, A_{mi}(t_n+\tau)\Big] \notag \\
&= 2E_q\Big[\sum_{\tau=0}^{z_n-1} \sum_m \bar{S}_m Q_{m i_m(t_n+\tau)}(t_n+\tau)\, A_m(t_n+\tau)\Big] \notag \\
&\le 2\sum_m \bar{S}_m q_{m i_m}\, E_q\Big[\sum_{\tau=0}^{z_n-1} A_m(t_n+\tau)\Big] + 2 A_{\max}^2 \Big(\sum_m \bar{S}_m\Big) E_q[z_n^2] \notag \\
&\le 2 A_{\max}^2 K_2 \sum_m \bar{S}_m + 2E_q[z_n] \sum_m \bar{S}_m q_{m i_m} \lambda_m, \tag{3.4}
\end{align}
where $i_m(t) = \arg\min_{i \in \{1,2,\dots,L\}} Q_{mi}(t)$ and $i_m = i_m(t_n)$. Since $Q_{m i_m(t_n+\tau)}(t_n+\tau) \le Q_{m i_m(t_n)}(t_n+\tau) \le Q_{m i_m}(t_n) + \tau A_{\max}$, because the load at each queue cannot
increase by more than $A_{\max}$ in each time slot, we get the first inequality.
Let $Y(t) = \{Y_{mi}(t)\}_{m,i}$ denote the state of the jobs of type $m$ at server $i$.
When $Q_{mi}(t) \neq 0$, $Y_{mi}(t)$ is a vector of size $N_m^{(i)}(t)$, and $Y_{mi}^j(t)$ is the amount
of time the $j$th type-$m$ job that is in service at server $i$ has been scheduled.
Note that the departures $D(t)$ can be inferred from $Y(t)$. Let $\mathcal{F}_\tau^{(n)}$ be the
filtration generated by the process $Y(t_n+\tau)$. Then, $A(t_n+\tau+1)$ is independent of $\mathcal{F}_\tau^{(n)}$ and $z_n$ is a stopping time for $\mathcal{F}_\tau^{(n)}$. Then, from Wald's identity$^1$
[36, Chap. 6, Cor. 3.1 and Sec. 4(a)] and Lemma 3.1, we have (3.4).
Now we will bound the second term in (3.3). To do this, consider the
following system. For every job of type $m$ that is in the configuration $\tilde{N}_m^{(i)}(t_n)$,
consider an independent geometric random variable of mean $\bar{S}_m$ to simulate
potential departures or job completions. Let $I_{j,m}^i(t)$ be an indicator function
denoting whether the $j$th job of type $m$ at server $i$ in configuration $\tilde{N}_m^{(i)}(t_n)$ has
a potential departure at time $t$. Because of the memoryless property of the
geometric distribution, the $I_{j,m}^i(t)$ are i.i.d. Bernoulli with mean $1/\bar{S}_m$.

$^1$Wald's identity: Let $\{X_n : n \in \mathbb{N}\}$ be a sequence of real-valued random variables
such that all $X_n$ have the same expectation and there exists a constant $C$ such that
$E[|X_n|] \le C$ for all $n \in \mathbb{N}$. Assume that there exists a filtration $\{\mathcal{F}_n\}_{n \in \mathbb{N}}$ such that $X_n$
and $\mathcal{F}_{n-1}$ are independent for every $n \in \mathbb{N}$. Then, if $N$ is a finite-mean stopping time with
respect to the filtration $\{\mathcal{F}_n\}_{n \in \mathbb{N}}$, $E\big[\sum_{n=1}^N X_n\big] = E[X_1]\,E[N]$.
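The memoryless property invoked here can be checked numerically: for a geometrically distributed job size, the probability of departing in the next served slot is $1/\bar{S}_m$ regardless of how much service the job has already received. A Monte Carlo sketch (the mean 4.0 is an arbitrary choice for illustration):

```python
import random

random.seed(0)
S_BAR = 4.0              # illustrative mean job size; departure prob = 1/S_BAR
p = 1.0 / S_BAR
survived, departed_next = 0, 0
for _ in range(200000):
    size = 1
    while random.random() >= p:  # geometric(p) job size on {1, 2, ...}
        size += 1
    if size > 3:                 # job still present after 3 served slots
        survived += 1
        if size == 4:            # it departs in the very next served slot
            departed_next += 1
rate = departed_next / survived
assert abs(rate - p) < 0.01      # memorylessness: still Bernoulli(1/S_BAR)
```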
If the $j$th job was scheduled, $I_{j,m}^i(t)$ corresponds to an actual departure. If
not (i.e., if there was unused service), there is no actual departure corresponding to it. Let $\hat{D}_{mi}(t) = \sum_{j=1}^{\tilde{N}_m^{(i)}(t_n)} I_{j,m}^i(t)$ denote the number of potential departures of type $m$ at server $i$. Note that if $Q_{mi}(t) \ge N_{\max}$, $\hat{D}_{mi}(t) = D_{mi}(t)$,
since there is no unused service in this case. Also, $\hat{D}_{mi}(t) \ge D_{mi}(t)$ and
$\hat{D}_{mi}(t) \le N_{\max}$. Thus, we have
$$Q_{mi}(t)\, D_{mi}(t) \ge Q_{mi}(t)\, \hat{D}_{mi}(t) - N_{\max}^2. \tag{3.5}$$
Note that $Q_{mi}(t) \ge Q_{mi}(t-1) - N_{\max}$, since $N_{\max}$ is the maximum possible
number of departures in each time slot. So, we have
$$Q_{mi}(t_n+\tau)\,\hat{D}_{mi}(t_n+\tau) \ge \big(Q_{mi}(t_n) - \tau N_{\max}\big)\, \hat{D}_{mi}(t_n+\tau) \ge Q_{mi}(t_n)\,\hat{D}_{mi}(t_n+\tau) - \tau N_{\max}^2.$$
Using this with (3.5), we can bound the second term in (3.3) as follows:
\begin{align}
2E_q&\Big[\sum_{\tau=0}^{z_n-1} \sum_{i,m} \bar{S}_m Q_{mi}(t_n+\tau)\, D_{mi}(t_n+\tau)\Big] \notag \\
&\ge 2E_q\Big[\sum_{\tau=0}^{z_n-1} \sum_{i,m} \bar{S}_m Q_{mi}(t_n+\tau)\, \hat{D}_{mi}(t_n+\tau)\Big] - 2LN_{\max}^2 \Big(\sum_m \bar{S}_m\Big) E_q[z_n] \notag \\
&\ge 2E_q\Big[\sum_{i,m} \bar{S}_m Q_{mi}(t_n) \sum_{\tau=0}^{z_n-1} \hat{D}_{mi}(t_n+\tau)\Big] - K_6 \notag \\
&= 2E_q[z_n] \sum_{i,m} q_{mi}\, \tilde{N}_m^{(i)} - K_6, \tag{3.6}
\end{align}
where $K_6 = LN_{\max}^2 \sum_m \bar{S}_m (2K_1 + K_2)$. Let $\hat{\mathcal{F}}_\tau^{(n)}$ denote the filtration generated by $\{Y(t_n+\tau), \hat{D}(t_n+\tau)\}$. Then, $\mathcal{F}_\tau^{(n)} \subseteq \hat{\mathcal{F}}_\tau^{(n)}$. Since $z_n$ is a stopping
time with respect to the filtration $\mathcal{F}_\tau^{(n)}$, it is also a stopping time with respect to the filtration $\hat{\mathcal{F}}_\tau^{(n)}$. Since $\hat{D}(t_n+\tau+1)$ is independent of $\hat{\mathcal{F}}_\tau^{(n)}$,
Wald's identity can be applied here. $\hat{D}_{mi}(t_n+\tau)$ is a sum of $\tilde{N}_m^{(i)}(t_n)$ independent Bernoulli random variables, each with mean $1/\bar{S}_m$. Thus, we have
$E[\hat{D}_{mi}(t)] = \tilde{N}_m^{(i)}(t_n)/\bar{S}_m$. Using this in Wald's identity we get (3.6).
Since $(\lambda, \bar{S}) \in \operatorname{int}(\hat{\mathcal{C}})$, there exists $\epsilon > 0$ such that $((1+\epsilon)\lambda, \bar{S}) \in \hat{\mathcal{C}}$,
i.e., there exist $\{(1+\epsilon)\lambda^i\}_i$ such that $(1+\epsilon)\lambda^i \bar{S} \in \operatorname{Conv}(\mathcal{N}^i)$ for all $i$ and $\lambda = \sum_i \lambda^i$.
According to the scheduling algorithm (3.2), for each server $i$, we have
$$\sum_m Q_{mi}(t_n)\, \tilde{N}_m^{(i)}(t_n) \ge (1+\epsilon) \sum_m Q_{mi}(t_n)\, \lambda_m^i \bar{S}_m. \tag{3.7}$$
Substituting (3.4), (3.6) and (3.7) in (3.3), and using $q_{m i_m} \le q_{mi}$ for all $i$
together with $\lambda_m = \sum_i \lambda_m^i$, we get
\begin{align*}
E[V(\tilde{Q}(n+1)) - V(\tilde{Q}(n)) \mid \tilde{Q}(n) = q]
&\le K_7 + 2E_q[z_n]\Big(\sum_{i,m} q_{mi} \lambda_m^i \bar{S}_m - (1+\epsilon)\sum_{i,m} q_{mi} \lambda_m^i \bar{S}_m\Big) \\
&= K_7 - 2\epsilon\, E_q[z_n] \sum_{i,m} q_{mi} \lambda_m^i \bar{S}_m,
\end{align*}
where $K_7 = 2A_{\max}^2 K_2 \sum_m \bar{S}_m + K_5 K_1 + K_6$. Thus, the drift of the sampled
Markov chain is negative outside a bounded set, and the Foster-Lyapunov theorem gives positive recurrence of the sampled chain along with a bound
$\limsup_{n} \sum_{m,i} E[Q_{mi}(t_n)] \le K_3$. Since the queue lengths can change by at most
$A_{\max} + N_{\max}$ per slot and the refresh intervals have mean bounded by $K_1$, as $t \to \infty$ we get
$$\limsup_{t \to \infty} \sum_{m,i} E[Q_{mi}(t)] \le K_3 + K_1 LM (A_{\max} + N_{\max}).$$
The expected backlogged workload at each queue is
$$\overline{Q}_{mi}(t) = \sum_{j=1}^{Q_{mi}} W_m(l_j),$$
where $l_j$ is the duration of completed service for the $j$th job in the queue.
Note that $l_j = 0$ if the job was never served.

The expected backlog evolves as follows:
$$\overline{Q}_{mi}(t+1) = \overline{Q}_{mi}(t) + \overline{A}_{mi}(t) - \overline{D}_{mi}(t),$$
where $\overline{A}_{mi}(t) = A_{mi}(t)\bar{S}_m$, since each arrival of type $m$ brings in an expected
load of $\bar{S}_m$, and $\overline{D}_{mi}(t)$ is the departure of load.
Let $\hat{p}_{ml} = P(S_m = l+1 \mid S_m > l)$. A job of type $m$ that has been scheduled for $l$
amount of time has a backlogged workload of $W_m(l)$. It departs in the next
time slot with probability $\hat{p}_{ml}$. With probability $1 - \hat{p}_{ml}$, the job does
not depart, and the expected remaining load changes to $W_m(l+1)$. So, the
workload departure due to this job is
$$\overline{D}_{mi}(t) = \begin{cases} W_m(l) & \text{with prob. } \hat{p}_{ml} \\ W_m(l) - W_m(l+1) & \text{with prob. } 1 - \hat{p}_{ml}. \end{cases} \tag{3.8}$$
This means that $\overline{D}_{mi}(t)$ could sometimes be negative, i.e., the expected
backlog could increase even if there are no arrivals. Since the job
size distribution is lower bounded by a geometric distribution by Assumption
3.1, the expected remaining workload is upper bounded by that of a geometric
distribution. We will now show this formally.
From Assumption 3.1 on the job size distribution, we have
$$P(S_m = l+1 \mid S_m > l) \ge C \quad \Rightarrow \quad P(S_m > l+1 \mid S_m > l) \le (1-C).$$
Then, using the relation $P(S_m > l+k+1 \mid S_m > l) = P(S_m > l+k+1 \mid S_m > l+k)\, P(S_m > l+k \mid S_m > l)$, one can inductively show that $P(S_m > l+k \mid S_m > l) \le (1-C)^k$ for $k \ge 1$. Then,
$$W_m(l) = E[S_m - l \mid S_m > l] = \sum_{k=0}^{\infty} P(S_m - l > k \mid S_m > l) \le \sum_{k=0}^{\infty} (1-C)^k = 1/C. \tag{3.9}$$
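The bound (3.9) can be verified exactly for any concrete distribution satisfying Assumption 3.1. The sketch below uses an arbitrary finite-support distribution, specified by illustrative hazard probabilities $\hat{p}_{ml}$ that are each at least $C$:

```python
C = 0.2                                  # lower bound on the hazard (Assumption 3.1)
hazard = [0.2, 0.5, 0.3, 0.9, 1.0]       # illustrative P(S = l+1 | S > l), each >= C
tail = [1.0]                             # tail[l] = P(S > l)
for h in hazard:
    tail.append(tail[-1] * (1.0 - h))

def W(l):
    # E[S - l | S > l] = sum_{k >= l} P(S > k) / P(S > l)
    return sum(tail[l:]) / tail[l]

assert all(W(l) <= 1.0 / C + 1e-12 for l in range(len(hazard)))
```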
Then from (3.8), the increase in workload backlog due to the departure term
for each scheduled job is at most $W_m(l+1)$, which is bounded by $1/C$.
There are at most $N_{\max}$ jobs of each type that are scheduled. The arrival into
the backlog queue is at most $A_{\max}\bar{S}_{\max}$. Thus, we have
$$\overline{Q}_{mi}(t+1) - \overline{Q}_{mi}(t) \le A_{\max}\bar{S}_{\max} + \frac{N_{\max}}{C}. \tag{3.10}$$
Similarly, since the maximum workload departure for each scheduled job
is $1/C$, we have
$$\overline{Q}_{mi}(t+1) - \overline{Q}_{mi}(t) \ge -\frac{N_{\max}}{C}. \tag{3.11}$$
Since every job in the queue has at least one more time slot of service left,
$\overline{Q}_{mi}(t) \ge Q_{mi}(t)$. Since $W_m(l) \le 1/C$, we have the following lemma.

Lemma 3.2. There exists a constant $\tilde{C} \ge 1$ such that $Q_{mi}(t) \le \overline{Q}_{mi}(t) \le \tilde{C} Q_{mi}(t)$ for all $i$, $m$ and $t$.
Unlike the case of geometric job sizes, the actual departures in each time
slot depend on the amount of finished service. However, the expected workload departure in each time slot is constant even for a general job size
distribution. This is the reason we use a Lyapunov function that depends
on the workload. We prove this result in the following lemma; it is a key
result that we need for the proof.
Lemma 3.3. If a job of type $m$ has been scheduled for $l$ time slots, then the
expected departure in the backlogged workload is $E[\overline{D}_m \mid l] = 1$. Therefore, we
have $E[\overline{D}_m] = 1$.

Proof. Recall $\hat{p}_{ml} = P(S_m = l+1 \mid S_m > l)$. We have
\begin{align*}
W_m(l) &= E[S_m - l \mid S_m > l] \\
&= \hat{p}_{ml} \cdot 1 + (1 - \hat{p}_{ml})\big(1 + E[S_m - (l+1) \mid S_m > l+1]\big) \\
&= 1 + W_m(l+1)(1 - \hat{p}_{ml}).
\end{align*}
Thus, we have
$$W_m(l) - W_m(l+1) = 1 - W_m(l+1)\,\hat{p}_{ml}. \tag{3.12}$$
Then, from (3.8),
$$E[\overline{D}_m \mid l] = \hat{p}_{ml} W_m(l) + (1 - \hat{p}_{ml})\big(W_m(l) - W_m(l+1)\big) = W_m(l) - W_m(l+1)(1 - \hat{p}_{ml}) = 1$$
from (3.12). Since $E[\overline{D}_m \mid l] = 1$ for all $l$, we have $E[\overline{D}_m] = \sum_l E[\overline{D}_m \mid l] P(l) = 1$.
As in the case of geometric job sizes, we will show stability by first showing
that the system obtained by sampling at refresh times has negative drift. For
the reasons mentioned in the introduction, here we will use $g(y) = \log(1+y)$ and
the Lyapunov function
$$V(\overline{Q}) = \sum_{i,m} G(\overline{Q}_{mi}), \quad \text{where } G(q) = \int_0^q g(y)\,dy.$$
We will show that the drift of $V(\overline{Q})$ between two refresh times is negative.
To do this, we will need the following generalized form of the well-known Wald's
identity.
Lemma 3.4 (Generalized Wald's identity). Let $\{X_n : n \in \mathbb{N}\}$ be a sequence
of real-valued random variables and let $N$ be a nonnegative integer-valued
random variable. Assume that

(D1) $\{X_n\}_{n \in \mathbb{N}}$ are all integrable (finite-mean) random variables,

(D2) $E[X_n I_{\{N \ge n\}}] = E[X_n]\, P(N \ge n)$ for every natural number $n$, and

(D3) $\sum_{n=1}^{\infty} E[|X_n| I_{\{N \ge n\}}] < \infty$.

Then, $E\big[\sum_{n=1}^N X_n\big] = \sum_{n=1}^{\infty} E[X_n]\, P(N \ge n)$.

We will also use the following drift condition for stability. Let $Z(n)$ be a
process adapted to a filtration $\{\mathcal{F}_n\}$ satisfying: (C1) there exist $\kappa > 0$ and $\delta > 0$
such that $E[Z(n+1) - Z(n) \mid \mathcal{F}_n] \le -\delta$ whenever $Z(n) \ge \kappa$, and (C2) $|Z(n+1) - Z(n)| \le K$
for some constant $K$. Then (Lemma 3.5), there exist $\theta > 0$ and $C^\star$ such that
$\limsup_{n \to \infty} E[e^{\theta Z(n)}] \le C^\star$.
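A quick Monte Carlo sanity check of Wald's identity in its basic form (a special case of Lemma 3.4, with $N$ drawn independently of the $X_n$ so that condition (D2) holds trivially); the parameters are arbitrary:

```python
import random

random.seed(1)
p, q = 0.3, 0.25          # X_n ~ Bernoulli(p); N ~ Geometric(q), independent
trials = 200000
total_sum, total_N = 0.0, 0
for _ in range(trials):
    N = 1
    while random.random() >= q:   # draw the stopping time first,
        N += 1                    # independently of the X_n
    total_N += N
    total_sum += sum(1 for _ in range(N) if random.random() < p)
lhs = total_sum / trials          # estimates E[sum_{n=1}^N X_n]
rhs = p * (total_N / trials)      # estimates E[X_1] * E[N]
assert abs(lhs - rhs) < 0.03
```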
We will use this lemma with the filtration generated by the process $X(t)$
and consider the drift of a Lyapunov function. However, the Lyapunov function corresponding to the logarithmic $g(\cdot)$ does not satisfy the Lipschitz-like
bounded drift condition C1 even though the queue lengths have bounded
increments.

Typically, if an $\alpha$-MaxWeight algorithm is used (i.e., one where the weight for
the queue of type-$m$ jobs at server $i$ is $Q_{mi}^\alpha$ with $\alpha > 1$) corresponding to the
Lyapunov function $V(Q) = \sum_{i,m} (Q_{mi})^{1+\alpha}$, one can modify this and use
the corresponding $(1+\alpha)$-norm by considering the new Lyapunov function
$U(Q) = \big(\sum_{i,m} (Q_{mi})^{1+\alpha}\big)^{\frac{1}{1+\alpha}}$ [37]. Since this is a norm on $\mathbb{R}^{LM}$, it has
the bounded drift property. One can then obtain the drift of $U(\cdot)$ in terms
of the drift of $V(\cdot)$.

Here, we do not have a norm corresponding to the logarithmic Lyapunov
function, so we define a new Lyapunov function $U(\cdot)$ as follows. Note that
$G(\cdot)$ is a strictly increasing function on the domain $[0, \infty)$, $G(0) = 0$ and
$G(q) \to \infty$ as $q \to \infty$. So, $G(\cdot)$ is a bijection and its inverse $G^{-1}(\cdot) :
[0, \infty) \to [0, \infty)$ is well-defined. Define
$$U(\overline{Q}) = G^{-1}(V(\overline{Q})) = G^{-1}\Big(\sum_{i,m} G(\overline{Q}_{mi})\Big). \tag{3.13}$$
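For $g(y) = \log(1+y)$, $G$ has the closed form $G(q) = (1+q)\log(1+q) - q$, and $G^{-1}$ can be evaluated by bisection. The sketch below computes $U(\overline{Q})$ for an arbitrary queue vector and checks that $\max_{m,i} \overline{Q}_{mi} \le U(\overline{Q}) \le \sum_{m,i} \overline{Q}_{mi}$:

```python
import math

def G(q):
    # G(q) = integral_0^q log(1 + y) dy = (1 + q) log(1 + q) - q
    return (1 + q) * math.log(1 + q) - q

def G_inv(v):
    # G is strictly increasing with G(0) = 0, so invert by bisection
    lo, hi = 0.0, 1.0
    while G(hi) < v:
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if G(mid) < v:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

Q = [3.0, 0.5, 10.0, 2.0]          # arbitrary workload vector
U = G_inv(sum(G(q) for q in Q))    # U(Q) = G^{-1}(sum_mi G(Q_mi))
assert abs(G(U) - sum(G(q) for q in Q)) < 1e-6
assert max(Q) <= U <= sum(Q)
```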
Lemma 3.6. For any $\overline{Q}^{(1)}$ and $\overline{Q}^{(2)}$,
$$U(\overline{Q}^{(2)}) - U(\overline{Q}^{(1)}) \le \frac{V(\overline{Q}^{(2)}) - V(\overline{Q}^{(1)})}{\log(1 + U(\overline{Q}^{(1)}))}.$$
The proof of this lemma is based on the concavity of $U(\cdot)$ and is similar to
the one in [37]. The proof is presented in Appendix A.

We will need the following lemma to verify the conditions C1 and C2 in
Lemma 3.5.
\begin{align}
V(t+1) - V(t) &= \sum_{m,i} G(\overline{Q}_{mi}(t+1)) - G(\overline{Q}_{mi}(t)) \notag \\
&\le \sum_{m,i} \big(\overline{Q}_{mi}(t+1) - \overline{Q}_{mi}(t)\big)\, g(\overline{Q}_{mi}(t+1)) \tag{3.14} \\
&= \sum_{m,i} \big(\overline{Q}_{mi}(t+1) - \overline{Q}_{mi}(t)\big)\big(g(\overline{Q}_{mi}(t+1)) - g(\overline{Q}_{mi}(t))\big) \notag \\
&\quad + \sum_{m,i} \big(\overline{A}_{mi}(t) - \overline{D}_{mi}(t)\big)\, g(\overline{Q}_{mi}(t)), \tag{3.15}
\end{align}
where (3.14) follows from the convexity of $G(\cdot)$. To bound the first term in
(3.15), first consider the case when $\overline{Q}_{mi}(t+1) \ge \overline{Q}_{mi}(t)$. Since $g(\cdot)$ is strictly
increasing and concave, we have
\begin{align*}
\big|g(\overline{Q}_{mi}(t+1)) - g(\overline{Q}_{mi}(t))\big| &= g(\overline{Q}_{mi}(t+1)) - g(\overline{Q}_{mi}(t)) \\
&\le g'(\overline{Q}_{mi}(t))\big(\overline{Q}_{mi}(t+1) - \overline{Q}_{mi}(t)\big) \le \big|\overline{Q}_{mi}(t+1) - \overline{Q}_{mi}(t)\big|,
\end{align*}
where the second inequality follows from $g'(\cdot) \le 1$. Similarly, we get the
same relation even when $\overline{Q}_{mi}(t) > \overline{Q}_{mi}(t+1)$.

So the first term in (3.15) can be bounded as
$$\sum_{m,i} \big(\overline{Q}_{mi}(t+1) - \overline{Q}_{mi}(t)\big)\big(g(\overline{Q}_{mi}(t+1)) - g(\overline{Q}_{mi}(t))\big) \le \sum_{m,i} \big|\overline{Q}_{mi}(t+1) - \overline{Q}_{mi}(t)\big|^2 \le K_8, \tag{3.16}$$
where $K_8$ is a constant bounding $\sum_{m,i} |\overline{Q}_{mi}(t+1) - \overline{Q}_{mi}(t)|^2$, which exists by (3.10) and (3.11).
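The Lipschitz-type bound used above, $|g(a) - g(b)| \le |a - b|$ for $g(y) = \log(1+y)$ on $[0, \infty)$, can be spot-checked on a grid:

```python
import math

g = lambda q: math.log(1 + q)
grid = [i * 0.37 for i in range(60)]   # arbitrary nonnegative values
for a in grid:
    for b in grid:
        # g is concave with g'(0) = 1, so its slope never exceeds 1
        assert abs(g(a) - g(b)) <= abs(a - b) + 1e-12
```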
Thus,
$$V(t+1) - V(t) \le K_8 + \sum_{m,i} \big(\overline{A}_{mi}(t) - \overline{D}_{mi}(t)\big)\, g(\overline{Q}_{mi}(t)). \tag{3.17}$$
Using this in the drift of the sampled system,
\begin{align}
E[V(\tilde{Q}(n+1)) - V(\tilde{Q}(n)) \mid \tilde{Q}(n) = q]
&= E_q\Big[\sum_{\tau=0}^{z_n-1} V(t_n+\tau+1) - V(t_n+\tau)\Big] \notag \\
&\le E_q\Big[\sum_{\tau=0}^{z_n-1} \sum_{m,i} g(\overline{Q}_{mi}(t_n+\tau))\big(\overline{A}_{mi}(t_n+\tau) - \overline{D}_{mi}(t_n+\tau)\big)\Big] + K_8\, E_q[z_n]. \tag{3.18}
\end{align}
The last term above is bounded by K8 K1 from Lemma 3.1. We will now
bound the first term in (3.18).
\begin{align}
E_q&\Big[\sum_{\tau=0}^{z_n-1} \sum_m \sum_i g(\overline{Q}_{mi}(t_n+\tau))\,\overline{A}_{mi}(t_n+\tau)\Big] \notag \\
&= E_q\Big[\sum_{\tau=0}^{z_n-1} \sum_m \bar{S}_m\, g\big(\overline{Q}_{m i_m(t_n+\tau)}(t_n+\tau)\big)\, A_m(t_n+\tau)\Big] \notag \\
&\overset{(a)}{\le} E_q\Big[\sum_{\tau=0}^{z_n-1} \sum_m \bar{S}_m\, g\Big(\overline{q}_{m \bar{i}_m} + \tau\Big(A_{\max}\bar{S}_m + \frac{N_{\max}}{C}\Big)\Big)\, A_m(t_n+\tau)\Big] \notag \\
&\overset{(b)}{\le} E_q\Big[\sum_{\tau=0}^{z_n-1} \sum_m \bar{S}_m\Big(g(\overline{q}_{m \bar{i}_m}) + \tau\Big(A_{\max}\bar{S}_m + \frac{N_{\max}}{C}\Big)\Big)\, A_m(t_n+\tau)\Big] \notag \\
&\overset{(c)}{\le} K_9 + E_q[z_n] \sum_m \bar{S}_m\, g(q_{m i_m})\, \lambda_m, \tag{3.19}
\end{align}
where $i_m(t) = \arg\min_{i \in \{1,2,\dots,L\}} Q_{mi}(t)$, $i_m = i_m(t_n)$, $\bar{i}_m(t) = \arg\min_{i \in \{1,2,\dots,L\}} \overline{Q}_{mi}(t)$,
$\bar{i}_m = \bar{i}_m(t_n)$ and $K_9 = A_{\max}^2 K_2 \sum_m \bar{S}_m + \frac{A_{\max} N_{\max}}{C} K_2 \sum_m \bar{S}_m + K_1 \log(\tilde{C}) \sum_m \bar{S}_m \lambda_m$.
The first equality follows from the definition of $A_{mi}$ in the routing algorithm in (3.1). Since $\overline{Q}_{m i_m(t_n+\tau)}(t_n+\tau) \le \overline{Q}_{m \bar{i}_m(t_n)}(t_n+\tau) \le \overline{Q}_{m \bar{i}_m}(t_n) + \tau(\bar{S}_m A_{\max} + N_{\max}/C)$, because the load at each queue cannot increase by more
than $A_{\max}\bar{S}_m + N_{\max}/C$ in each time slot, we get (a). Inequality (b) follows
from the concavity of $g(\cdot)$ and $g'(\cdot) \le 1$.

Inequality (c) follows from Wald's identity and Lemma 3.1, together with
Lemma 3.2, which gives $g(\overline{q}_{m \bar{i}_m}) \le g(\tilde{C} q_{m i_m}) \le g(q_{m i_m}) + \log(\tilde{C})$. For Wald's
identity, we decompose the workload departure of each scheduled job as follows.
$$\overline{D}_{mi}(t) = \sum_{j=1}^{N_m^{(i)}(t)} \overline{D}_{mi}^{(j)}(t).$$
Define
$$\hat{\overline{D}}_{mi}^{(j)}(t) = \begin{cases} \overline{D}_{mi}^{(j)}(t) & \text{if the } j\text{th job in } \tilde{N}_m^{(i)}(t_n) \text{ was scheduled} \\ 1 & \text{if the } j\text{th job was unused,} \end{cases} \tag{3.20}$$
$$\hat{\overline{D}}_{mi}(t) = \sum_{j=1}^{\tilde{N}_m^{(i)}(t_n)} \hat{\overline{D}}_{mi}^{(j)}(t). \tag{3.21}$$
Then
$$g(\overline{Q}_{mi}(t))\,\overline{D}_{mi}(t) \overset{(a)}{\ge} g(\overline{Q}_{mi}(t))\big(\hat{\overline{D}}_{mi}(t) - N_{\max} I_{\{Q_{mi}(t) < N_{\max}\}}\big) \overset{(b)}{\ge} g(\overline{Q}_{mi}(t))\,\hat{\overline{D}}_{mi}(t) - K_{10}, \tag{3.22}$$
where $K_{10} = g(\tilde{C}N_{\max})\,N_{\max}$. Since there is no unused service if $Q_{mi}(t) \ge N_{\max}$,
we have (a). Inequality (b) follows from the facts that $\overline{Q}_{mi}(t) \le \tilde{C} Q_{mi}(t)$ from
Lemma 3.2 and $N_m^{(i)}(t) \le N_{\max}$.
\begin{align}
E_q&\Big[\sum_{\tau=0}^{z_n-1} \sum_{i,m} g(\overline{Q}_{mi}(t_n+\tau))\,\overline{D}_{mi}(t_n+\tau)\Big] \notag \\
&\ge E_q\Big[\sum_{\tau=0}^{z_n-1} \sum_{i,m} g(\overline{Q}_{mi}(t_n))\,\hat{\overline{D}}_{mi}(t_n+\tau)\Big] - K_{11} \notag \\
&= \sum_{i,m} g(\overline{q}_{mi})\, E_q\Big[\sum_{\tau=0}^{z_n-1} \hat{\overline{D}}_{mi}(t_n+\tau)\Big] - K_{11}, \tag{3.23}
\end{align}
where $K_{11} = LM(K_{10}K_2 + N_{\max}\,g(N_{\max})\,K_1)$. We will now use the generalized Wald's identity (Lemma 3.4), verifying conditions (D1), (D2) and (D3).
Clearly, (D1) is true because the $\hat{\overline{D}}_{mi}(t_n+\tau)$ have finite mean, by definition and
from Lemma 3.3.

From the definition of $\hat{\overline{D}}_{mi}(t_n+\tau)$, from (3.8) and (3.9), $|\hat{\overline{D}}_{mi}(t_n+\tau)| \le N_{\max}/C$. So,
\begin{align*}
\sum_{\tau=1}^{\infty} E_q\big[|\hat{\overline{D}}_{mi}(t_n+\tau)|\, I_{\{z_n \ge \tau\}}\big] &\le \frac{N_{\max}}{C} \sum_{\tau=1}^{\infty} E_q\big[I_{\{z_n \ge \tau\}}\big] = \frac{N_{\max}}{C} \sum_{\tau=1}^{\infty} P_q(z_n \ge \tau) \\
&= \frac{N_{\max}}{C}\, E_q[z_n] \le \frac{N_{\max}K_1}{C}.
\end{align*}
This verifies (D3). We verify (D2) by first proving the following claim.

Claim 3.1.
$$E_q\big[\hat{\overline{D}}_{mi}(t_n+\tau) \,\big|\, z_n \ge \tau\big] = E_q\big[\hat{\overline{D}}_{mi}(t_n+\tau)\big].$$

Proof. Consider the departures due to each job, $\hat{\overline{D}}_{mi}^{(j)}(t)$, as defined in (3.20).
Intuitively, conditioned on $\{z_n \ge \tau\}$, we have a conditional distribution on
the amount of finished service for each of the jobs. However, from Lemma
3.3, the expected departure is 1 independent of the finished service. Thus, the
conditional workload departure due to each job is 1. This is the same as
the unconditional departure, again from Lemma 3.3. We will now prove this
more formally.

The event $\{z_n \ge \tau\}$ is a union of finitely many disjoint events $\{E_\alpha : \alpha =
1, \dots, A\}$. Each of these events $E_\alpha$ is of the form $\{(q(t_n), A(t_n), D(t_n)), (q(t_n+1), A(t_n+1), D(t_n+1)), \dots, (q(t_n+\tau-1), A(t_n+\tau-1), D(t_n+\tau-1))\}$. In
other words, each event is a sample path of the system up to time $t_n+\tau$ and
contains complete information about the evolution of the system from time
$t_n$ up to time $t_n+\tau$. Let $l_{mi}^{(j)}$ be the amount of finished service for the $j$th
job of type $m$ at server $i$; $l_{mi}^{(j)}$ is completely determined by $E_\alpha$. Conditioned
on $E_\alpha$, $\hat{\overline{D}}_{mi}^{(j)}(t)$ depends only on $l_{mi}^{(j)}$. It is independent of the other jobs in the
system, and is also independent of the past departures. Thus, we have
$$E_q\big[\hat{\overline{D}}_{mi}^{(j)}(t_n+\tau) \,\big|\, E_\alpha\big] = E_q\big[\hat{\overline{D}}_{mi}^{(j)}(t_n+\tau) \,\big|\, l_{mi}^{(j)}\big] = 1.$$
Then,
$$E_q\big[\hat{\overline{D}}_{mi}^{(j)}(t_n+\tau) \,\big|\, z_n \ge \tau\big] = \sum_\alpha P(E_\alpha \mid z_n \ge \tau)\, E_q\big[\hat{\overline{D}}_{mi}^{(j)}(t_n+\tau) \,\big|\, E_\alpha\big] = \sum_\alpha P(E_\alpha \mid z_n \ge \tau) = 1.$$
Similarly, from Lemma 3.3 and (3.20), we have $E_q\big[\hat{\overline{D}}_{mi}^{(j)}(t_n+\tau)\big] = 1$. Summing over $j$, from (3.21), we have the claim.
Since
$$E_q\big[\hat{\overline{D}}_{mi}(t_n+\tau)\, I_{\{z_n \ge \tau\}}\big] = E_q\big[\hat{\overline{D}}_{mi}(t_n+\tau) \,\big|\, z_n \ge \tau\big]\, P(z_n \ge \tau) = E_q\big[\hat{\overline{D}}_{mi}(t_n+\tau)\big]\, P(z_n \ge \tau),$$
we have (D2). Therefore, using the generalized Wald's identity (Lemma 3.4)
in (3.23), we have
$$E_q\Big[\sum_{\tau=0}^{z_n-1} \sum_{i,m} g(\overline{Q}_{mi}(t_n+\tau))\,\overline{D}_{mi}(t_n+\tau)\Big] \ge E_q[z_n] \sum_{i,m} g(\overline{q}_{mi})\,\tilde{N}_m^{(i)}(t_n) - K_{11}.$$
The key idea is to note that the expected workload departure for each
scheduled job is 1 from Lemma 3.3. We use this, along with the generalized
Wald's identity, to bound the departures similarly to the case of geometric job
sizes.
Since $(\lambda, \bar{S}) \in \operatorname{int}(\hat{\mathcal{C}})$, there exist $\{\lambda^i\}_i$ such that $\lambda = \sum_i \lambda^i$ and $\lambda^i \bar{S} \in
\operatorname{int}(\operatorname{Conv}(\mathcal{N}^i))$ for all $i$. Then, there exists an $\epsilon > 0$ such that $(\lambda^i + \epsilon)\bar{S} \in
\operatorname{Conv}(\mathcal{N}^i)$ for all $i$. From Lemma 3.2, we have $g(\overline{q}_{mi}) \le g(\tilde{C}q_{mi}) \le \log(\tilde{C}(1 +
q_{mi})) = g(q_{mi}) + \log(\tilde{C})$. The last inequality, which is an immediate consequence of the $\log$ function, has also been exploited in [38], [39] for a different
problem. For each server $i$, we have
$$\sum_m \big(g(\overline{q}_{mi}) - \log(\tilde{C})\big)(\lambda_m^i + \epsilon)\bar{S}_m \le \sum_m g(q_{mi})(\lambda_m^i + \epsilon)\bar{S}_m \overset{(a)}{\le} \sum_m g(q_{mi})\,\tilde{N}_m^{(i)}(t_n) \le \sum_m g(\overline{q}_{mi})\,\tilde{N}_m^{(i)}(t_n),$$
where (a) follows from the MaxWeight scheduling algorithm (3.2) since $(\lambda^i + \epsilon)\bar{S} \in \operatorname{Conv}(\mathcal{N}^i)$, and the last inequality follows from $q_{mi} \le \overline{q}_{mi}$. Substituting this and (3.19) in (3.18), we get
\begin{align*}
E[V(\tilde{X}(n+1)) - V(\tilde{X}(n)) \mid \tilde{Q}(n) = q, \tilde{Y}(n)]
&\le K_{12} + E_q[z_n]\Big(\sum_m g(q_{m i_m})\lambda_m \bar{S}_m - \sum_{i,m} g(\overline{q}_{mi})(\lambda_m^i + \epsilon)\bar{S}_m\Big) \\
&\overset{(a)}{\le} K_{13} - \epsilon \bar{S}_{\min} K_1 \sum_{i,m} g(q_{mi}) \\
&\le K_{13} - \epsilon \bar{S}_{\min} K_1 \log(1 + U(q)) \le -\frac{\epsilon \bar{S}_{\min} K_1}{2} \log(1 + U(q))
\end{align*}
whenever $U(q) > e^{(2K_{13}/\epsilon \bar{S}_{\min} K_1)}$. Thus, $U(n)$ satisfies condition C1 of Lemma
3.5 for the filtration generated by $\{\tilde{X}(n)\}$.
From Lemma 3.6, Lemma 3.7 and (3.16), we have
\begin{align*}
U(t_n+\tau+1) - U(t_n+\tau) &\le \frac{V(t_n+\tau+1) - V(t_n+\tau)}{\log(1 + G^{-1}(V(\overline{Q}(t_n+\tau))))} \\
&\le \frac{K_8 + A_{\max}\sum_{m,i} \bar{S}_m\, g(\overline{Q}_{mi}(t_n+\tau))}{\log(1 + G^{-1}(V(\overline{Q}(t_n+\tau))))} \\
&\le \frac{K_8}{\log(1 + G^{-1}(V(\overline{Q}(t_n+\tau))))} + LM A_{\max}\bar{S}_{\max} \overset{(a)}{\le} K_{14} \quad \text{if } U(t_n+\tau) > 0,
\end{align*}
where $K_{14} = \frac{K_8}{\log(2)} + LM A_{\max}\bar{S}_{\max}$. Since $U(\overline{Q}) > 0$ if and only if $V(\overline{Q}) > 0$ if
and only if $\overline{Q} \ne 0$, there is at least one nonzero component of $\overline{Q}$, and
so $V(t_n+\tau) \ge G(1)$. This gives the inequality (a). If $U(t_n+\tau) = 0$, the
one-step increase of $U$ is again bounded by a constant, since each $\overline{Q}_{mi}$ can
increase by at most $A_{\max}\bar{S}_{\max} + N_{\max}/C$ in one slot by (3.10). Thus, $U(n)$ also
satisfies condition C2 of Lemma 3.5, and so there exist $\theta > 0$ and $K_4 > 0$ such
that $\limsup_{n \to \infty} E[e^{\theta U(\tilde{X}(n))}] \le K_4$. Since $G(\cdot)$ is
convex, from Jensen's inequality, we have
$$G\Big(\frac{\sum_{m,i} \overline{Q}_{mi}(t_n)}{LM}\Big) \le \frac{\sum_{m,i} G(\overline{Q}_{mi}(t_n))}{LM} = \frac{V(\overline{Q}(t_n))}{LM}. \tag{3.25}$$
Since $G^{-1}(\cdot)$ is increasing, this gives $\sum_{m,i} \overline{Q}_{mi}(t_n) \le LM\, G^{-1}(V(\overline{Q}(t_n))) = LM\, U(\tilde{X}(n))$.
Thus, we have $\limsup_{n \to \infty} \sum_m \sum_i E[\overline{Q}_{mi}(t_n)] \le LM K_4'$ for a constant $K_4'$ obtained
from the exponential moment bound.
For any $t > 0$, if $t_{n+1}$ is the next refresh time after $t$, from (3.11) we have
$$Q_{mi}(t) \le \overline{Q}_{mi}(t) \le \overline{Q}_{mi}(t_{n+1}) + z_n \frac{N_{\max}}{C} \le \tilde{C} Q_{mi}(t_{n+1}) + z_n \frac{N_{\max}}{C},$$
and so
$$\sum_{m,i} E[Q_{mi}(t)] \le \sum_{m,i} E\Big[\tilde{C}\, Q_{mi}(t_{n+1}) + z_n \frac{N_{\max}}{C}\Big].$$
As $t \to \infty$, we get
$$\limsup_{t \to \infty} \sum_{m,i} E[Q_{mi}(t)] \le \limsup_{n \to \infty} \sum_{m,i} E\Big[\tilde{C}\, Q_{mi}(t_n) + z_n \frac{N_{\max}}{C}\Big] \le \tilde{C}\, LM K_4' + \frac{K_1 LM N_{\max}}{C}.$$
If it is not a refresh time, $\tilde{N}_m^{(i)}(t) = \tilde{N}_m^{(i)}(t-1)$.
Proposition 3.2. Assume that all the servers are identical and the job size
distribution satisfies Assumption 3.1. Then, any job load vector that satisfies
$(\lambda, \bar{S}) \in \operatorname{int}(\hat{\mathcal{C}})$ is supportable under random routing and MaxWeight scheduling at local refresh times, as described in Algorithm 6 with $g(q) = \log(1+q)$.
We skip the proof here because it is very similar to that of Theorem 3.1.
Since routing is random, each server is independent of the other servers in the
system. So, one can show that each server is stable under the job load vector
$(\lambda/L, \bar{S})$ using the Lyapunov function in (3.13). This then implies that the
whole system is stable.
In the next section, we study the performance of these algorithms by simulations.
3.5 Simulations

In this section, we use simulations to compare the performance of the algorithms presented so far. We use the same simulation setup as in Chapter 2,
with 100 identical servers, three types of jobs and three maximal VM configurations for each server, viz., $(2,0,0)$, $(1,0,1)$ and $(0,1,1)$. We consider two
load vectors, $\lambda^{(1)} = (1, \frac{1}{3}, \frac{2}{3})$ and $\lambda^{(2)} = (1, \frac{1}{2}, \frac{1}{2})$, which are on the boundary
of the capacity region of each server. $\lambda^{(1)}$ is a linear combination of all
three maximal schedules, whereas $\lambda^{(2)}$ is a combination of two of the three
maximal schedules.

We consider three different job size distributions. Distribution A is the
same bounded distribution that was considered in Chapter 2, which models
the high variability in job sizes. When a new job is generated, with probability 0.7, its size is an integer that is uniformly distributed between 1 and 50;
with probability 0.15, it is an integer that is uniformly distributed between
251 and 300; and with probability 0.15, it is uniformly distributed between
451 and 500. Therefore, the average job size is 130.5.

Distribution B is a geometric distribution with mean 130.5. Distribution
C is a combination of distributions A and B with equal probability, i.e., the
size of a new job is sampled from distribution A with probability 1/2 and
from distribution B with probability 1/2.

We further assume the number of type-$i$ jobs arriving at each time slot
follows a binomial distribution with parameters $(\alpha\lambda_i/130.5,\, 100)$.

All the plots in this section compare the mean delay of the jobs under
various algorithms. The parameter $\alpha$ is varied to simulate different traffic
intensities. Each simulation was run for one million time slots.
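The stated mean of distribution A is easy to verify from its three uniform pieces:

```python
# Distribution A: uniform on {1..50} w.p. 0.7, on {251..300} w.p. 0.15,
# and on {451..500} w.p. 0.15.
pieces = [(0.7, 1, 50), (0.15, 251, 300), (0.15, 451, 500)]
mean = sum(p * (lo + hi) / 2 for p, lo, hi in pieces)
assert abs(mean - 130.5) < 1e-9
```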
Figure 3.1: Comparison of mean delay under Algorithms 4 and 6 for load
vector $\lambda^{(1)}$ and job size distribution A
3.5.2 Heuristics
In this section, we study the performance of some heuristic algorithms. We
have seen in the previous subsection that the idea of using local refresh times
is good. Since JSQ routing provides better load balancing than random
routing, a natural algorithm to study is one that does JSQ routing and
updates schedules at local refresh times. This leads us to Algorithm 5. Since
we dont know if Algorithm 5 is throughput optimal, we study its performance
using simulations.
We also consider another heuristic algorithm motivated by Algorithm 4
Figure 3.2: Comparison of mean delay under Algorithms 5, 7 and 6 for load
vector $\lambda^{(1)}$ and job size distribution A
and Algorithm 3 in Chapter 2 as follows. Routing is done according to the
join the shortest queue algorithm. At refresh times, a MaxWeight schedule
is chosen at each server. At all other times, each server tries to choose a
MaxWeight schedule myopically. It does not preempt the jobs that are in
service. It adds new jobs to the existing configuration so as to maximize
the weight using g(Qmi (t)) as weight without disturbing the jobs in service.
This algorithm tries to emulate a MaxWeight schedule in every time slot by
greedily adding new jobs with higher priority to long queues. We call this
Algorithm 7.
This algorithm has the advantage that, at refresh times, an exact MaxWeight
schedule is chosen automatically, so the servers need not keep track of the
refresh times. It is not clear whether this algorithm is throughput optimal. The
proof in Section 3.3 is not applicable here because one cannot use Wald's
identity to bound the drift. This algorithm is an extension of Algorithm 3
in Chapter 2 when the super time slots are taken to be infinite. However, as
stated in Chapter 2, Algorithm 3 is almost throughput optimal only when
the super time slot is finite. Investigating the throughput optimality of Algorithm 7 under
appropriate assumptions is an open problem.
Figures 3.2, 3.3 and 3.4 compare the mean delay of the jobs under Algorithms 5, 7 and 6 with the three job size distributions using the load vector
$\lambda^{(1)}$. Figure 3.5 shows the case when the load vector $\lambda^{(2)}$ is used.
Figure 3.3: Comparison of mean delay under Algorithms 5, 7 and 6 for load
vector $\lambda^{(1)}$ and job size distribution B
Figure 3.4: Comparison of mean delay under Algorithms 5, 7 and 6 for load
vector $\lambda^{(1)}$ and job size distribution C
Figure 3.5: Comparison of mean delay under Algorithms 5, 7 and 6 for load
vector $\lambda^{(2)}$ and job size distribution A
The simulations indicate that both Algorithms 5 and 7 have better delay
performance than Algorithm 6 for all job size distributions and both the load
vectors. The performance improvement is more significant at higher traffic
intensities. In the cases studied, simulations suggest that Algorithms 5 and
7 are also throughput optimal. Since we do not know if this is always true, it
is an open question for future research to characterize the throughput region
of Algorithms 5 and 7.
In section 3.2, it was noted that a wide class of weight functions can be
used for MaxWeight schedule in the case of geometric job sizes. However,
the proof in section 3.3 required a log(1 + q) weight function for general
job size distributions. So, we now study the delay performance under linear
and log weight functions. Figure 3.6 shows the delay of Algorithms 5 and
7 under the weight functions, q and log(1 + q). Job size distribution A was
used with the load vector $\lambda^{(1)}$. It can be seen that there is no considerable
difference in performance between the two weight functions. It is an open
question whether Algorithms 4 and 6 are throughput optimal under more
weight functions.
Figure 3.6: Comparison of mean delay under Algorithms 5 and 7 for load
vector $\lambda^{(1)}$ and job size distribution A with log and linear weights
3.6 Conclusion
In this chapter, we studied various algorithms for the problem of routing
and nonpreemptive scheduling of jobs with variable, unknown and unbounded
sizes in a cloud computing data center. The key idea in these algorithms is to
choose a MaxWeight schedule at either local or global refresh times. We have
presented two algorithms that are throughput optimal. The key idea in the
proof is to show that the refresh times occur often enough and then use this
to show that the drift of a Lyapunov function is negative. We then presented
two heuristic algorithms and studied their performance using simulations.
CHAPTER 4
LIMITED PREEMPTION
For all other $t$, at the beginning of the time slot, a new configuration
is chosen as follows:
$$\tilde{N}^{(i)}(t) \in \arg\max_{N :\, N + N^{(i)}(t^-) \in \mathcal{N}^i} \; \sum_m g(Q_{mi}(t))\, N_m. \tag{4.1}$$
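A brute-force sketch of this non-preemptive update: jobs already in service are kept, and additional jobs are chosen to maximize the added weight subject to the combined configuration remaining feasible. The feasible set and queue values are hypothetical:

```python
import math
from itertools import product

# Hypothetical feasible set: configurations dominated by some maximal one.
MAXIMAL = [(2, 0, 0), (1, 0, 1), (0, 1, 1)]

def feasible(N):
    return any(all(n <= c for n, c in zip(N, cap)) for cap in MAXIMAL)

def myopic_update(Q, N_current):
    """Choose extra jobs N to add (jobs in service stay) maximizing
    sum_m g(Q_m) * N_m with g(q) = log(1 + q), keeping N + N_current feasible."""
    g = [math.log(1 + q) for q in Q]
    best, best_w = (0,) * len(Q), 0.0
    for add in product(range(3), repeat=len(Q)):     # small brute force
        if feasible(tuple(a + c for a, c in zip(add, N_current))):
            w = sum(gi * ai for gi, ai in zip(g, add))
            if w > best_w:
                best, best_w = add, w
    return best

added = myopic_update(Q=[9, 0, 4], N_current=(1, 0, 0))   # -> (1, 0, 0)
```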
We will need the following definition.
Definition 4.1. For a random variable $S$, $W(s) = E[S - s \mid S > s]$ is called
its mean residual life function or mean excess function.

If $S$ is a random variable denoting the job size, then the mean residual life
is the expected remaining amount of service of that job, given that it has been
served for $s$ time slots.

Assumption 4.1. The mean residual life function of the job size distribution of
type-$m$ jobs is upper bounded by a constant $1/C_m$.

Example 4.1. A geometric random variable has a constant mean residual life
function and so satisfies Assumption 4.1. Any distribution with finite support
also satisfies Assumption 4.1.
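The constant mean residual life of the geometric distribution in Example 4.1 follows from $P(S > j) = (1-p)^j$ and can be checked numerically (the value $p = 0.3$ is arbitrary):

```python
p = 0.3                                  # illustrative geometric parameter
def W(s, cutoff=4000):
    # W(s) = sum_{j >= s} P(S > j) / P(S > s), with P(S > j) = (1 - p) ** j
    return sum((1 - p) ** j for j in range(s, cutoff)) / (1 - p) ** s

for s in (0, 1, 5, 10):
    assert abs(W(s) - 1 / p) < 1e-6      # constant mean residual life 1/p
```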
Example 4.2. The Zeta distribution, a heavy-tailed distribution and a discrete
counterpart of the Pareto distribution, is given by $P(S = k) = k^{-a}/\zeta(a)$, where
$\zeta(a)$ is the Riemann zeta function and $a > 1$. Then, the mean residual life
function is
$$W(s) = E[S - s \mid S > s] = \frac{\sum_{s'=s+1}^{\infty} P(S \ge s')}{P(S \ge s+1)} = \frac{\sum_{s'=s+1}^{\infty} \sum_{s''=s'}^{\infty} (s'')^{-a}}{\sum_{s'=s+1}^{\infty} (s')^{-a}}.$$
Note that $\sum_{s''=s'}^{\infty} (s'')^{-a} > \int_{s'}^{\infty} x^{-a}\,dx = \frac{(s')^{1-a}}{a-1}$. So,
$$W(s) > \frac{1}{a-1}\, \frac{\sum_{s'=s+1}^{\infty} (s')^{1-a}}{\sum_{s'=s+1}^{\infty} (s')^{-a}} \ge \frac{s+1}{a-1}\, \frac{\sum_{s'=s+1}^{\infty} (s')^{-a}}{\sum_{s'=s+1}^{\infty} (s')^{-a}} = \frac{s+1}{a-1}.$$
Therefore, $W(s)$ cannot be upper bounded by a constant, and so the Zeta distribution does not satisfy Assumption 4.1.
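The growth of $W(s)$ for the Zeta distribution can be seen numerically via truncated tail sums (the exponent $a = 2.5$ and the truncation point are arbitrary; truncation slightly underestimates $W(s)$, which only strengthens the check against the lower bound):

```python
A = 2.5                     # tail exponent a (illustrative)
CUTOFF = 20000              # truncation point for the infinite sums
pmf = [0.0] + [k ** -A for k in range(1, CUTOFF)]
suffix = [0.0] * (CUTOFF + 1)            # suffix[s] ~ P(S >= s) (unnormalized)
for k in range(CUTOFF - 1, 0, -1):
    suffix[k] = suffix[k + 1] + pmf[k]
numer = [0.0] * (CUTOFF + 1)             # numer[s] = sum_{s' >= s} suffix[s']
for k in range(CUTOFF - 1, 0, -1):
    numer[k] = numer[k + 1] + suffix[k]

def W(s):
    # W(s) = sum_{s' > s} P(S >= s') / P(S >= s + 1); normalization cancels
    return numer[s + 1] / suffix[s + 1]

assert W(5) > W(1)                       # residual life grows with age
for s in (1, 5, 20):
    assert W(s) > (s + 1) / (A - 1)      # the lower bound derived above
```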
We will now show that Algorithm 8 is throughput optimal with an appropriately chosen $g(\cdot)$ when the job size distributions satisfy Assumption 4.1.
The process $X(t) = (\overline{Q}(t), Y(t))$ is a Markov chain, where $Y(t)$ is defined
in Section 3.2 of Chapter 3. Let $W_m(l)$ be the mean residual life of a job
of type $m$ given that it has already been served for $l$ time slots; in other
words, $W_m(l) = E[S_m - l \mid S_m > l]$. Note that $W_m(0) = \bar{S}_m$. Then, we denote
the expected backlogged workload at each queue by $\overline{Q}_{mi}(t)$. Thus,
$$\overline{Q}_{mi}(t) = \sum_{j=1}^{Q_{mi}} W_m(l_j),$$
where $l_j$ is the duration of completed service for the $j$th job in the queue.
Note that $l_j = 0$ if the job was never served.

The expected backlog evolves as follows:
$$\overline{Q}_{mi}(t+1) = \overline{Q}_{mi}(t) + \overline{A}_{mi}(t) - \overline{D}_{mi}(t),$$
where $\overline{A}_{mi}(t) = A_{mi}(t)\bar{S}_m$, since each arrival of type $m$ brings in an expected
load of $\bar{S}_m$, and $\overline{D}_{mi}(t)$ is the departure of load.
Let $\hat{p}_{ml} = P(S_m = l+1 \mid S_m > l)$. A job of type $m$ that has been scheduled for $l$
amount of time has a backlogged workload of $W_m(l)$. It departs in the next
time slot with probability $\hat{p}_{ml}$. With probability $1 - \hat{p}_{ml}$, the job does
not depart, and the expected remaining load changes to $W_m(l+1)$, so the
departure in this case is $W_m(l) - W_m(l+1)$. In effect, we have
$$\overline{D}_{mi}(t) = \begin{cases} W_m(l) & \text{with prob. } \hat{p}_{ml} \\ W_m(l) - W_m(l+1) & \text{with prob. } 1 - \hat{p}_{ml}. \end{cases} \tag{4.3}$$
This means that $\overline{D}_{mi}(t)$ could sometimes be negative, which means
the expected backlog could increase even if there are no arrivals.

Then from (4.3), the increase in workload backlog due to the departure term
for each scheduled job is at most $W_m(l+1)$, which is upper
bounded by $1/C_m$ due to Assumption 4.1. There are at most $N_{\max}$ jobs
of each type that are scheduled. The arrival into the backlog queue is at most
$A_{\max}\bar{S}_{\max}$. Then, defining $C = \min_m \{C_m\}$, we have
$$\overline{Q}_{mi}(t+1) - \overline{Q}_{mi}(t) \le A_{\max}\bar{S}_{\max} + \frac{N_{\max}}{C_m} \le A_{\max}\bar{S}_{\max} + \frac{N_{\max}}{C}. \tag{4.4}$$
Similarly, since the maximum workload departure for each scheduled job
is $1/C_m$, we have
$$\overline{Q}_{mi}(t+1) - \overline{Q}_{mi}(t) \ge -\frac{N_{\max}}{C_m} \ge -\frac{N_{\max}}{C}. \tag{4.5}$$
Theorem 4.1. Assume that the job arrivals satisfy $A_m(t) \le A_{\max}$ for all $m$
and $t$ and that the job size distribution satisfies Assumption 4.1. Then, any
job load vector that satisfies $(\lambda, \bar{S}) \in \operatorname{int}(\hat{\mathcal{C}})$ is supportable under JSQ routing
and myopic MaxWeight allocation as described in Algorithm 8 with
$g(q) = \log(1+q)$.
Proof. When the queue length vector is $Q_{mi}(t)$, let $Y(t) = \{Y_{mi}(t)\}_{m,i}$ denote the state of the jobs of type $m$ at server $i$. When $Q_{mi}(t) \ne 0$, $Y_{mi}(t)$ is
a vector with one entry for each job that was partially served. Some of the
partially served jobs are currently scheduled, and some were interrupted at
the beginning of a super time slot. Therefore, the size of $Y_{mi}(t)$ is at most
$Q_{mi}(t)$, and $Y_{mi}^j(t)$ is the amount of time the $j$th partially served type-$m$ job
at server $i$ has been served.

It is easy to see that $X(t) = (\overline{Q}(t), Y(t))$ is a Markov chain under Algorithm 8.
Obtain a new process $\tilde{X}(n)$ by sampling the Markov chain $X(t)$ every $T$
time slots, i.e., $\tilde{X}(n) = X(nT)$. Note that $\tilde{X}(n)$ is also a Markov chain.
We will show stability of $X(t)$ by first showing that the Markov chain
$\tilde{X}(n)$ corresponding to the sampled system is stable.
With a slight abuse of notation, we will use $V(t)$ for $V(\overline{Q}(t))$, and similarly
$V(n)$, $U(t)$ and $U(n)$. We will establish this result by showing that the
Lyapunov function $U(n)$ satisfies both the conditions of Lemma 3.5. We will
study the drift of $U(n)$ in terms of the drift of $V(n)$ using Lemma 3.6. First
consider the following one-step drift of $V(t)$:
\begin{align}
V(t+1) - V(t) &= \sum_{m,i} G(\overline{Q}_{mi}(t+1)) - G(\overline{Q}_{mi}(t)) \notag \\
&\le \sum_{m,i} \big(\overline{Q}_{mi}(t+1) - \overline{Q}_{mi}(t)\big)\, g(\overline{Q}_{mi}(t+1)) \tag{4.6} \\
&= \sum_{m,i} \big(\overline{Q}_{mi}(t+1) - \overline{Q}_{mi}(t)\big)\big(g(\overline{Q}_{mi}(t+1)) - g(\overline{Q}_{mi}(t))\big) \notag \\
&\quad + \sum_{m,i} \big(\overline{A}_{mi}(t) - \overline{D}_{mi}(t)\big)\, g(\overline{Q}_{mi}(t)), \tag{4.7}
\end{align}
where (4.6) follows from the convexity of $G(\cdot)$. To bound the first term in
(4.7), first consider the case when $\overline{Q}_{mi}(t+1) \ge \overline{Q}_{mi}(t)$. Since $g(\cdot)$ is strictly
increasing and concave, we have
\begin{align*}
\big|g(\overline{Q}_{mi}(t+1)) - g(\overline{Q}_{mi}(t))\big| &= g(\overline{Q}_{mi}(t+1)) - g(\overline{Q}_{mi}(t)) \\
&\le g'(\overline{Q}_{mi}(t))\big(\overline{Q}_{mi}(t+1) - \overline{Q}_{mi}(t)\big) \le \big|\overline{Q}_{mi}(t+1) - \overline{Q}_{mi}(t)\big|,
\end{align*}
where the second inequality follows from $g'(\cdot) \le 1$. Similarly, we get the
same relation even when $\overline{Q}_{mi}(t) > \overline{Q}_{mi}(t+1)$.

So the first term in (4.7) can be bounded as
$$\sum_{m,i} \big(\overline{Q}_{mi}(t+1) - \overline{Q}_{mi}(t)\big)\big(g(\overline{Q}_{mi}(t+1)) - g(\overline{Q}_{mi}(t))\big) \le \sum_{m,i} \big|\overline{Q}_{mi}(t+1) - \overline{Q}_{mi}(t)\big|^2 \le K_8, \tag{4.8}$$
so that
$$V(t+1) - V(t) \le K_8 + \sum_{m,i} \big(\overline{A}_{mi}(t) - \overline{D}_{mi}(t)\big)\, g(\overline{Q}_{mi}(t)). \tag{4.9}$$
Then, for the drift of the sampled system,
\begin{align}
E_q[V((n+1)T) - V(nT)] &= E_q\Big[\sum_{\tau=0}^{T-1} V(nT+\tau+1) - V(nT+\tau)\Big] \notag \\
&\le E_q\Big[\sum_{\tau=0}^{T-1} \sum_{m,i} g(\overline{Q}_{mi}(nT+\tau))\big(\overline{A}_{mi}(nT+\tau) - \overline{D}_{mi}(nT+\tau)\big)\Big] + K_8 T. \tag{4.10}
\end{align}
We will now bound the first term in (4.10).
\begin{align}
E_q&\Big[\sum_{\tau=0}^{T-1} \sum_m \sum_i g(\overline{Q}_{mi}(nT+\tau))\,\overline{A}_{mi}(nT+\tau)\Big] \notag \\
&= E_q\Big[\sum_{\tau=0}^{T-1} \sum_m \bar{S}_m\, g\big(\overline{Q}_{m i_m(nT+\tau)}(nT+\tau)\big)\, A_m(nT+\tau)\Big] \notag \\
&\overset{(a)}{\le} E_q\Big[\sum_{\tau=0}^{T-1} \sum_m \bar{S}_m\, g\Big(\overline{q}_{m \bar{i}_m} + \tau\Big(A_{\max}\bar{S}_m + \frac{N_{\max}}{C}\Big)\Big)\, A_m(nT+\tau)\Big] \notag \\
&\overset{(b)}{\le} K_{20} + \sum_m \bar{S}_m\, g(\overline{q}_{m \bar{i}_m})\, E\Big[\sum_{\tau=0}^{T-1} A_m(nT+\tau)\Big] \notag \\
&\le K_{21} + T \sum_m \bar{S}_m\, g(q_{m i_m})\, \lambda_m, \tag{4.11}
\end{align}
where $i_m(t) = \arg\min_{i \in \{1,2,\dots,L\}} Q_{mi}(t)$, $i_m = i_m(nT)$, $\bar{i}_m(t) = \arg\min_{i \in \{1,2,\dots,L\}} \overline{Q}_{mi}(t)$,
$\bar{i}_m = \bar{i}_m(nT)$, $K_{20} = \sum_m (A_{\max}^2 \bar{S}_m + N_{\max}A_{\max}/C)\,T(T-1)/2$ and $K_{21} = K_{20} +
T \log(\tilde{C}) \sum_m \bar{S}_m \lambda_m$. The first equality follows from the definition of $A_{mi}$ in
the routing algorithm in (4.1). Since $\overline{Q}_{m i_m(nT+\tau)}(nT+\tau) \le \overline{Q}_{m \bar{i}_m(nT)}(nT+\tau) \le \overline{Q}_{m \bar{i}_m}(nT) + \tau(\bar{S}_m A_{\max} + N_{\max}/C)$, because the load at each queue cannot
increase by more than $A_{\max}\bar{S}_m + N_{\max}/C$ in each time slot from (4.4), we get
(a). Inequality (b) follows from the concavity of $g(\cdot)$ and $g'(\cdot) \le 1$. Note that
Lemma 3.2 gives $g(\overline{q}_{m \bar{i}_m}) \le g(\tilde{C} q_{m i_m}) \le g(q_{m i_m}) + \log(\tilde{C})$. This gives (4.11).
Now consider the second term in (4.10).
\begin{align}
E_q&\Big[\sum_{\tau=0}^{T-1} \sum_{i,m} g(\overline{Q}_{mi}(nT+\tau))\,\overline{D}_{mi}(nT+\tau)\Big] \notag \\
&= \sum_{\tau=0}^{T-1} E_q\Big[E\Big[\sum_{i,m} g(\overline{Q}_{mi}(nT+\tau))\,\overline{D}_{mi}(nT+\tau) \,\Big|\, Y(nT+\tau), \overline{Q}(nT+\tau), N_m^{(i)}(nT+\tau)\Big]\Big] \notag \\
&= E_q\Big[\sum_{\tau=0}^{T-1} \sum_{i,m} g(\overline{Q}_{mi}(nT+\tau))\, N_m^{(i)}(nT+\tau)\Big], \tag{4.12}
\end{align}
where the last equality follows from Lemma 3.3, since the expected departure
is 1 for any job that is scheduled.
To bound this term, we will first show that the decrease of $\sum_m g(\overline{Q}_{mi}(t))N_m^{(i)}(t)$
within a super time slot is bounded. For any $t$ such that $nT \le t < (n+1)T$, for
each server $i$, jobs in service are not preempted and new jobs are added
according to (4.1). Comparing the configuration after the update, $N^{(i)}(t)$,
with the one just before it, $N^{(i)}(t^-)$, and separating the cases $Q_{mi}(t) \ge N_{\max}$
and $Q_{mi}(t) < N_{\max}$, one obtains
$$\sum_m g(\overline{Q}_{mi}(t))\,N_m^{(i)}(t) \ge \sum_m g(\overline{Q}_{mi}(t-1))\,N_m^{(i)}(t-1) - \delta_0,$$
where $\delta_0 = M N_{\max}\, g(\tilde{C}N_{\max}) + M N_{\max}^2$. Using this recursively for $t = nT + \tau$
with $\tau < T$, we get
$$\sum_m g(\overline{Q}_{mi}(nT+\tau))\,N_m^{(i)}(nT+\tau) \ge \sum_m g(\overline{Q}_{mi}(nT))\,N_m^{(i)}(nT) - \tau\delta_0.$$
At the beginning of the super time slot, a MaxWeight schedule is chosen at
each server, so, since $(\lambda^i + \epsilon)\bar{S} \in \operatorname{Conv}(\mathcal{N}^i)$ and using $g(\overline{q}_{mi}) \le g(q_{mi}) + \log(\tilde{C})$
as before, for each server $i$,
$$\sum_m g(\overline{Q}_{mi}(nT))\,N_m^{(i)}(nT) \ge \sum_m g(\overline{Q}_{mi}(nT))(\lambda_m^i + \epsilon)\bar{S}_m - \big(\log(\tilde{C}) + g(N_{\max})\big)\sum_m (\lambda_m^i + \epsilon)\bar{S}_m.$$
Summing over the servers and over $\tau$, we get
$$\sum_{\tau=0}^{T-1} \sum_i \sum_m g(\overline{Q}_{mi}(nT+\tau))\,N_m^{(i)}(nT+\tau) \ge T \sum_i \sum_m g(\overline{Q}_{mi}(nT))(\lambda_m^i + \epsilon)\bar{S}_m - K_{22},$$
where $K_{22}$ collects the constant terms, including $L\delta_0 T(T-1)/2$.
Therefore,
$$E_q\Big[\sum_{\tau=0}^{T-1} \sum_{i,m} g(\overline{Q}_{mi}(nT+\tau))\,\overline{D}_{mi}(nT+\tau)\Big] \ge T \sum_{i,m} g(\overline{q}_{mi})(\lambda_m^i + \epsilon)\bar{S}_m - K_{22}.$$
Substituting this and (4.11) in (4.10), we get
\begin{align*}
E_q[V((n+1)T) - V(nT)] &\le K_{23} + T\Big(\sum_m g(q_{m i_m})\lambda_m \bar{S}_m - \sum_{i,m} g(\overline{q}_{mi})(\lambda_m^i + \epsilon)\bar{S}_m\Big) \\
&\overset{(a)}{\le} K_{24} - \epsilon \bar{S}_{\min} T \sum_{i,m} g(q_{mi}) \\
&\le K_{25} - \epsilon \bar{S}_{\min} T \log(1 + U(q)) \le -\frac{\epsilon \bar{S}_{\min} T}{2} \log(1 + U(q))
\end{align*}
whenever $U(q) > e^{(2K_{25}/\epsilon \bar{S}_{\min} T)}$. Thus, $U(n)$ satisfies condition C1 of Lemma
3.5 for the filtration generated by $\{\tilde{X}(n)\}$.
We will now verify condition C2. From Lemma 3.6, Lemma 3.7 and (4.8), we have
\begin{align*}
U(nT+\tau+1) - U(nT+\tau) &\le \frac{V(nT+\tau+1) - V(nT+\tau)}{\log(1 + G^{-1}(V(\overline{Q}(nT+\tau))))} \\
&\le \frac{K_8 + A_{\max}\sum_{m,i} \bar{S}_m\, g(\overline{Q}_{mi}(nT+\tau))}{\log(1 + G^{-1}(V(\overline{Q}(nT+\tau))))} \\
&\le \frac{K_8}{\log(1 + G^{-1}(V(\overline{Q}(nT+\tau))))} + LM A_{\max}\bar{S}_{\max} \overset{(a)}{\le} K_{14} \quad \text{if } U(nT+\tau) > 0,
\end{align*}
where $K_{14} = \frac{K_8}{\log(2)} + LM A_{\max}\bar{S}_{\max}$. Since $U(\overline{Q}) > 0$ if and only if $V(\overline{Q}) > 0$ if
and only if $\overline{Q} \ne 0$, there is at least one nonzero component of $\overline{Q}$, and so
$V(nT+\tau) \ge G(1)$. This gives the inequality (a). If $U(nT+\tau) = 0$, from
(4.8), we have $U(nT+\tau+1) - U(nT+\tau) \le K_{15} = G^{-1}(K_8)$. Thus, condition C2
of Lemma 3.5 holds, and there exist constants $\theta > 0$ and $K_4 > 0$ such that
$\limsup_{n \to \infty} E[e^{\theta U(\tilde{X}(n))}] \le K_4$.
m,i
G Qmi (nT )
V (Q(nT )).
LM
(4.13)
Qmi (nT )
m,i
m,i
LM U (X(n))
e
e
.
P P
Thus, we have limn m i E[Qmi (nT )] LM
K4 .
As t , we get
X
X
Nmax
e
lim sup E[Qmi (t)] lim sup E CQmi (nT )+T
C
t
n
m,i
m,i
e
C
LM
T LM Nmax
K
+
.
4
4.2 Conclusion
To ameliorate the poor performance of the completely nonpreemptive algorithms studied so far, in this chapter we studied the cloud resource allocation problem when jobs are allowed to be preempted once in a while. We assumed that the job sizes are unknown and unbounded, and presented a throughput-optimal algorithm.
CHAPTER 5
DELAY OPTIMALITY
MaxWeight scheduling and JSQ routing. In section 5.2, we show its throughput optimality. In section 5.3, we apply the above three-step procedure and prove heavy-traffic optimality when all the servers are identical. The lower bound in this case is identical to the case of the MaxWeight scheduling problem in ad hoc wireless networks in [37]. The state-space collapse and upper bound do not directly follow from the corresponding results for the MaxWeight algorithm in [37] due to the additional routing step here. In section 5.4, we present heavy-traffic optimality when power-of-two-choices routing is used instead of JSQ for the identical-server case.
Note on Notation: The set of real numbers, the set of non-negative real numbers, and the set of positive real numbers are denoted by $\mathbb{R}$, $\mathbb{R}_+$ and $\mathbb{R}_{++}$ respectively. We use a slightly different notation in this chapter. We denote vectors in $\mathbb{R}^M$ or $\mathbb{R}^L$ by $x$, in normal font. We use bold font $\mathbf{x}$ only to denote vectors in $\mathbb{R}^{ML}$. The dot product in the vector spaces $\mathbb{R}^M$ or $\mathbb{R}^L$ is denoted by $\langle x, y\rangle$ and the dot product in $\mathbb{R}^{ML}$ is denoted by $\langle \mathbf{x}, \mathbf{y}\rangle$.
$C \triangleq \sum_{i=1}^{L} C_i = \{s : s \le \sum_{i=1}^{L} s^i \text{ s.t. } s^i \in C_i\}$. Here $s^i$ just denotes an element of $C_i$ and not the $i$th power of $s$, and $\sum$ here denotes the Minkowski sum of sets. Therefore, $C$ is again a convex polytope in the nonnegative quadrant of $\mathbb{R}^M$, and it can be described by a set of hyperplanes as follows:
$$C = \{s \ge 0 : \langle c^{(k)}, s\rangle \le b^{(k)},\; k = 1, \ldots, K\},$$
where $K$ is the number of hyperplanes that completely defines $C$, and $(c^{(k)}, b^{(k)})$ completely defines the $k$th hyperplane $H^{(k)}: \langle c^{(k)}, s\rangle = b^{(k)}$. Since $C$ is in the first quadrant, we have
$$\|c^{(k)}\| = 1, \quad c^{(k)} \ge 0, \quad b^{(k)} \ge 0 \quad\text{for } k = 1, 2, \ldots, K.$$
Similar to $C$, define $S = \sum_{i=1}^{L} S_i$.

Lemma 5.1. For each server $i$, there is a $b^{(k)}_i$ such that $\langle c^{(k)}, s\rangle = b^{(k)}_i$ is the boundary of the capacity region $C_i$, and $b^{(k)} = \sum_{i=1}^{L} b^{(k)}_i$. Moreover, for every set $\{\lambda^{(k)}_i \in C_i\}$ such that $\lambda^{(k)} = \sum_{i=1}^{L} \lambda^{(k)}_i$ and $\lambda^{(k)} \in C$ lies on the $k$th hyperplane $H^{(k)}$, we have that $\langle c^{(k)}, \lambda^{(k)}_i\rangle = b^{(k)}_i$.

Proof. Define $b^{(k)}_i = \max_{s \in C_i}\langle c^{(k)}, s\rangle$. Then, since $C = \sum_{i=1}^{L} C_i$, we have $b^{(k)} = \sum_{i=1}^{L} b^{(k)}_i$. Again, by the definition of $C$, for every $\lambda^{(k)} \in C$ there are $\lambda^{(k)}_i \in C_i$ for each $i$ such that $\lambda^{(k)} = \sum_{i=1}^{L}\lambda^{(k)}_i$. However, these may not be unique. We will prove that any such decomposition satisfies $\langle c^{(k)}, \lambda^{(k)}_i\rangle = b^{(k)}_i$.

Each server chooses a schedule from
$$\arg\max_{s \in C_i}\; \sum_{m=1}^{M} s_m\, q_{mi}.$$
Let $Y_{mi}(t)$ denote the state of the queue for type-$m$ jobs at server $i$. If there are $J$ such jobs, $Y_{mi}(t)$ is a vector of size $J$ and $Y^j_{mi}(t)$ is the (backlogged) size of the $j$th type-$m$ job at server $i$. It is easy to see that $\mathbf{Y}(t) = \{Y_{mi}(t)\}_{mi}$ is a Markov chain under JSQ routing and MaxWeight scheduling. Let $q_{mi}(t)$ denote the queue length of backlogged workload of type-$m$ jobs at server $i$. Then the vector of backlogged workloads, $\mathbf{q}(t) = \{q_{mi}(t)\}_{mi}$, is a function $\mathbf{q}(\mathbf{Y})$ of the state $\mathbf{Y}(t)$ given by $q_{mi}(t) = \sum_j Y^j_{mi}(t)$.

However, note that $\mathbf{Y}(t)$ is not a time-homogeneous Markov process. Sampling the process $\mathbf{Y}(t)$ every $T$ time slots, we obtain the process $\widetilde{\mathbf{Y}}(n) = \mathbf{Y}(nT)$, which is a time-homogeneous Markov process. Moreover, note that the process $\widehat{\mathbf{Y}}(n) = (\mathbf{Y}(nT), \mathbf{Y}(nT+1), \ldots, \mathbf{Y}(nT+T-1))$ is also a time-homogeneous Markov process.

The queue lengths of backlogged workload evolve according to the following equation:
$$q_{mi}(t+1) = q_{mi}(t) + a_{mi}(t) - \hat s^i_m(t) = q_{mi}(t) + a_{mi}(t) - s^i_m(t) + u_{mi}(t), \qquad(5.1)$$
where $u_{mi}(t)$ is the unused service, given by $u_{mi}(t) = s^i_m(t) - \hat s^i_m(t)$; $s^i_m(t)$ is the MaxWeight or myopic MaxWeight schedule as defined in Algorithm 9, $\hat s^i_m(t)$ is the actual schedule chosen by the scheduling algorithm, and the arrivals are
$$a_{mi}(t) = \begin{cases} a_m(t) & \text{if } i = i_m(t),\\ 0 & \text{otherwise.}\end{cases} \qquad(5.2)$$
Here, $i_m$ is the server chosen by the routing algorithm for type-$m$ jobs. Note that
$$u_{mi}(t) = 0 \quad\text{when}\quad q_{mi}(t) + a_{mi}(t) \ge D_{\max}\, s_{\max}. \qquad(5.3)$$
Also, denote $s = (s_m)_m$, where
$$s_m = \sum_{i=1}^{L} s^i_m. \qquad(5.4)$$
Denote $\mathbf{a} = (a_{mi})_{mi}$, $\mathbf{s} = (s^i_m)_{mi}$ and $\mathbf{u} = (u_{mi})_{mi}$. Also denote $\mathbf{1}$ to be the vector with 1 in all components.
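Equations (5.1)-(5.3) can be sketched in code as follows, assuming a deterministic toy instance; the function names (`jsq_route`, `step`) and the within-slot ordering of routing before service are illustrative conventions, not part of the model's specification.

```python
def jsq_route(q, m):
    """JSQ routing (5.2): all type-m arrivals in a slot go to the server
    with the smallest backlogged workload for type m."""
    return min(range(len(q[m])), key=lambda i: q[m][i])

def step(q, arrivals, schedule):
    """One slot of the workload evolution (5.1):
    q_mi(t+1) = q_mi(t) + a_mi(t) - s_mi(t) + u_mi(t),
    where u_mi(t) is the unused service, nonzero only when a queue holds
    less work than the schedule offers (cf. (5.3))."""
    M, L = len(q), len(q[0])
    unused = [[0.0] * L for _ in range(M)]
    for m in range(M):
        i_m = jsq_route(q, m)
        q[m][i_m] += arrivals[m]  # a_mi(t): all type-m work routed to i_m(t)
        for i in range(L):
            served = min(schedule[m][i], q[m][i])
            unused[m][i] = schedule[m][i] - served  # u_mi(t) >= 0
            q[m][i] -= served
    return q, unused

# Two job types, two servers; deterministic toy data.
q, u = step([[0.0, 3.0], [1.0, 0.0]], arrivals=[2.0, 2.0],
            schedule=[[1.0, 1.0], [1.0, 1.0]])
```

With these toy numbers, each type's arrivals go to its emptier server, and all offered service is used except when a queue is empty.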
whenever $\widetilde{\mathbf{Y}}_0 \notin \mathcal{B}$.

Since we assume that the total number of job arrivals, job sizes and total number of departures in the queue are bounded in each time slot, and since the set $\mathcal{B}$ is finite, we have that the set $\mathcal{B}'$ is also finite. Then, using the Foster-Lyapunov theorem, we have that the Markov process $\widehat{\mathbf{Y}}(n)$ is also positive recurrent. Note that this is the only instance in this chapter where we assume that $a_m(t) \le a_{\max}$. This assumption can easily be relaxed.

Note that the steady-state distributions of the processes $\widetilde{\mathbf{Y}}(n)$ and $\widehat{\mathbf{Y}}(n)$ are related. Let $(\mathbf{q}_0, \mathbf{q}_1, \ldots, \mathbf{q}_{T-1})$ be the queue lengths of backlogged workloads corresponding to the state of the Markov process $\widehat{\mathbf{Y}}(n)$. Let $(\pi_0(\mathbf{q}), \pi_1(\mathbf{q}), \ldots, \pi_{T-1}(\mathbf{q}))$ denote their marginal probability distributions in steady state. Then, the probability distribution of the queue lengths of the sampled process $\widetilde{\mathbf{Y}}(n)$ is given by $\pi_0(\mathbf{q})$.

Now that we have shown that a steady state exists, in the next section we will show that the queue lengths of backlogged workload are optimal in steady state in the heavy-traffic regime.
$$\epsilon^{(k)} = \min_{s \in H^{(k)}}\|\lambda - s\|, \qquad \lambda^{(k)} = \lambda + \epsilon^{(k)} c^{(k)}, \qquad(5.5)$$
where $\epsilon^{(k)} > 0$ for each $k$ since $\lambda \in \text{interior}(C)$. We let $\epsilon \triangleq (\epsilon^{(k)})_{k=1}^{K}$ denote the vector of distances to all hyperplanes. Note that $\lambda^{(k)}$ may be outside the capacity region.

Let $\mathbf{q}^{(\epsilon)}(t)$ be the backlogged workload queue length process when the arrival rate is $\lambda^{(\epsilon)}$.

Define $\mathbf{c}^{(k)} \in \mathbb{R}^{ML}_+$, indexed by $m, i$, as
$$c^{(k)}_{mi} = \frac{c^{(k)}_m}{\sqrt{L}}. \qquad(5.6)$$
We expect that the state-space collapse occurs along the direction $\mathbf{c}^{(k)}$. This is intuitive. For a fixed $m$, JSQ routing tries to equalize the queue lengths of backlogged workload across servers. For a fixed server $i$, we expect that the state-space collapse occurs along $c^{(k)}$ when approaching the hyperplane $H^{(k)}$, as shown in [37]. Thus, for JSQ routing and MaxWeight, we expect that the state-space collapse occurs along $\mathbf{c}^{(k)}$ in $\mathbb{R}^{ML}$.

For each $k \in K^{(\epsilon)}_o$, define the projection and perpendicular component of $\mathbf{q}^{(\epsilon)}$ with respect to the vector $\mathbf{c}^{(k)}$ as follows:
$$\mathbf{q}^{(\epsilon,k)}_{\parallel} \triangleq \big\langle \mathbf{c}^{(k)}, \mathbf{q}^{(\epsilon)}\big\rangle\,\mathbf{c}^{(k)}, \qquad \mathbf{q}^{(\epsilon,k)}_{\perp} \triangleq \mathbf{q}^{(\epsilon)} - \mathbf{q}^{(\epsilon,k)}_{\parallel}.$$
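The lifting (5.6) and the parallel/perpendicular decomposition can be sketched as follows, with arbitrary example values for M, L, c and q; the helper names are illustrative.

```python
import math

def c_block(c, L):
    """Lift c in R^M to c^(k) in R^{ML} via c_mi = c_m / sqrt(L)  (5.6);
    the result is again a unit vector."""
    return [[cm / math.sqrt(L) for _ in range(L)] for cm in c]

def project(q, c):
    """Split q in R^{ML} into q_par = <c^(k), q> c^(k) and q_perp = q - q_par."""
    L = len(q[0])
    cb = c_block(c, L)
    dot = sum(cb[m][i] * q[m][i] for m in range(len(q)) for i in range(L))
    q_par = [[dot * v for v in row] for row in cb]
    q_perp = [[q[m][i] - q_par[m][i] for i in range(L)] for m in range(len(q))]
    return q_par, q_perp

def norm_sq(x):
    """Squared Euclidean norm of a vector stored as an M x L array."""
    return sum(v * v for row in x for v in row)

# Example: M = 2 job types, L = 2 servers, unit vector c = (0.6, 0.8).
c = [0.6, 0.8]
q = [[3.0, 1.0], [2.0, 4.0]]
q_par, q_perp = project(q, c)
```

The two components are orthogonal and satisfy the Pythagorean identity used repeatedly in this chapter.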
Theorem 5.2 states that, in steady state, for each $k \in K^{(\epsilon)}_o$,
$$\frac{\zeta^{(\epsilon,k)}}{2} - B^{(\epsilon,k)}_1 \le \epsilon^{(k)}\,\mathbb{E}\big[\big\langle \mathbf{c}^{(k)}, \mathbf{q}^{(\epsilon)}(t)\big\rangle\big] \le \frac{\zeta^{(\epsilon,k)}}{2} + B^{(\epsilon,k)}_3,$$
where $B^{(\epsilon,k)}_3$ is $o\big(\frac{1}{\epsilon^{(k)}}\big)$ and $\zeta^{(\epsilon,k)} = \frac{1}{L}\big(\big\langle (c^{(k)})^2, (\sigma^{(\epsilon)})^2\big\rangle + (\epsilon^{(k)})^2\big)$, so that in the heavy-traffic limit
$$\lim_{\epsilon^{(k)} \downarrow 0}\epsilon^{(k)}\,\mathbb{E}\big[\big\langle \mathbf{c}^{(k)}, \mathbf{q}^{(\epsilon)}(t)\big\rangle\big] = \frac{\zeta^{(k)}}{2}, \qquad \zeta^{(k)} = \frac{1}{L}\big\langle (c^{(k)})^2, \sigma^2\big\rangle.$$
Note that since the process $\mathbf{Y}(t)$ is not time-homogeneous, the steady state referred to in the theorem is that of the process $\widehat{\mathbf{Y}}(n)$. The theorem gives a bound on $\mathbb{E}\big[\langle \mathbf{c}^{(k)}, \mathbf{q}(t)\rangle\big]$, where the expectation is taken according to the distribution $\pi_{t \bmod T}(\cdot)$, and the bound is valid for all $\tau \in \{0, 1, 2, \ldots, T-1\}$. We do not need the assumption $a_m(t) \le a_{\max}$ in the proof of heavy-traffic optimality. This assumption was needed for throughput optimality in Theorem 5.1 (and so for the existence of a steady state) and can easily be relaxed.

We will prove this theorem by following the three-step procedure described before: by first obtaining a lower bound, then showing state-space collapse, and finally using the state-space collapse result to obtain an upper bound.
Lower bound: We obtain a lower bound on $\mathbb{E}\big[\langle \mathbf{c}^{(k)}, \mathbf{q}^{(\epsilon)}\rangle\big] = \mathbb{E}\Big[\sum_{m=1}^{M}\frac{c^{(k)}_m}{\sqrt{L}}\sum_{i=1}^{L} q_{mi}\Big]$ in steady state using the lower bound on the queue length of a single-server queue. For the cloud system, given the capacity region and the face $F^{(k)}$, we will construct a single-server queue with appropriate arrivals and service rates to obtain a lower bound.

Consider a single-server queuing system, $\phi^{(\epsilon)}(t)$, with arrival process $\frac{1}{\sqrt{L}}\langle c^{(k)}, a^{(\epsilon)}(t)\rangle$ and service process given by $\frac{b^{(k)}}{\sqrt{L}}$ at each time slot. Then $\phi^{(\epsilon)}(t)$ is stochastically smaller than $\langle \mathbf{c}^{(k)}, \mathbf{q}^{(\epsilon)}(t)\rangle$. Thus, for every $t$, we have
$$\mathbb{E}\big[\big\langle \mathbf{c}^{(k)}, \mathbf{q}^{(\epsilon)}(t)\big\rangle\big] \ge \mathbb{E}\big[\phi^{(\epsilon)}(t)\big].$$
The process $\phi^{(\epsilon)}(t)$ is positive recurrent since the arrival rate $\frac{1}{\sqrt{L}}\langle c^{(k)}, \lambda^{(\epsilon)}\rangle$ is smaller than the departure rate $\frac{b^{(k)}}{\sqrt{L}}$. Therefore, a steady-state distribution exists for $\phi^{(\epsilon)}(t)$. Using $\phi^2$ as the Lyapunov function and noting that its drift should be zero in steady state, we get that in steady state [37],
$$\epsilon^{(k)}\,\mathbb{E}\big[\phi^{(\epsilon)}(t)\big] \ge \frac{\zeta^{(\epsilon,k)}}{2} - B^{(\epsilon,k)}_1, \qquad(5.7)$$
where $B^{(\epsilon,k)}_1 = \frac{b^{(k)}\epsilon^{(k)}}{2}$ and $\zeta^{(\epsilon,k)} = \frac{1}{L}\big(\big\langle (c^{(k)})^2, (\sigma^{(\epsilon)})^2\big\rangle + (\epsilon^{(k)})^2\big)$. Thus, in steady state, in the heavy-traffic limit as $\epsilon^{(k)} \downarrow 0$, we have that
$$\lim_{\epsilon^{(k)} \downarrow 0}\epsilon^{(k)}\,\mathbb{E}\big[\phi^{(\epsilon)}(t)\big] \ge \frac{\zeta^{(k)}}{2}, \qquad(5.8)$$
where $\zeta^{(k)} = \frac{1}{L}\big\langle (c^{(k)})^2, \sigma^2\big\rangle$.

Note that this lower bound is a universal lower bound that is valid for any joint routing and scheduling algorithm.
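The single-server construction above is the standard Lindley recursion; a minimal sketch follows, with arbitrary arrival and service values rather than the model's parameters.

```python
def lindley(arrivals, service):
    """Single-server queue phi(t+1) = max(phi(t) + a(t) - s, 0), used as a
    stochastic lower bound on <c^(k), q(t)> with s = b^(k)/sqrt(L)."""
    phi, path = 0.0, []
    for a in arrivals:
        phi = max(phi + a - service, 0.0)
        path.append(phi)
    return path

# Deterministic check: arrivals below the service rate drain the queue.
path = lindley([2.0, 0.0, 0.0, 5.0, 0.0], service=1.0)
```

The queue is nonnegative by construction, and builds up only when cumulative arrivals outpace the constant service.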
Figure 5.1: Illustration of the capacity region, the vector $c^{(k)}$ and state-space collapse. As the arrival rate approaches the boundary of the capacity region, the queue length vector $\mathbf{q}$ grows as $O\big(\frac{1}{\epsilon}\big)$, but the component perpendicular to $\mathbf{c}^{(k)}$, i.e. $\mathbf{q}_{\perp}$, remains bounded.

The component of the queue length vector perpendicular to $\mathbf{c}^{(k)}$ is bounded. This is illustrated in Figure 5.1. So this bounded component does not contribute to the first-order term in $\frac{1}{\epsilon^{(k)}}$, in which we are interested. Therefore, it is sufficient to study a bound on the queue length of backlogged workload along $\mathbf{c}^{(k)}$. This is called state-space collapse.
Define the following Lyapunov functions:
$$V(\mathbf{q}) \triangleq \|\mathbf{q}\|^2 = \sum_{i=1}^{L}\sum_{m=1}^{M} q^2_{mi}, \qquad W^{(k)}_{\perp}(\mathbf{q}) \triangleq \big\|\mathbf{q}^{(k)}_{\perp}\big\|, \qquad W^{(k)}_{\parallel}(\mathbf{q}) \triangleq \big\|\mathbf{q}^{(k)}_{\parallel}\big\|,$$
$$V^{(k)}_{\parallel}(\mathbf{q}) \triangleq \big\langle \mathbf{c}^{(k)}, \mathbf{q}\big\rangle^2 = \big\|\mathbf{q}^{(k)}_{\parallel}\big\|^2 = \frac{1}{L}\left(\sum_{i=1}^{L}\sum_{m=1}^{M} q_{mi}\, c_m\right)^2.$$
With a slight abuse of notation, we will use $V(\mathbf{Y})$ for $V(\mathbf{q}(\mathbf{Y}))$, and similarly for the other Lyapunov functions. Define the one-step drift of the above Lyapunov functions for the original system $\mathbf{Y}(t)$ and for the sampled system $\widetilde{\mathbf{Y}}(n)$ respectively as follows:
$$\Delta V(\mathbf{Y}) \triangleq \big[V(\mathbf{q}(\mathbf{Y}(t+1))) - V(\mathbf{q}(\mathbf{Y}(t)))\big]\,\mathbb{I}(\mathbf{Y}(t) = \mathbf{Y}),$$
$$\widetilde{\Delta} V(\widetilde{\mathbf{Y}}) \triangleq \big[V(\mathbf{q}(\mathbf{Y}((n+1)T))) - V(\mathbf{q}(\mathbf{Y}(nT)))\big]\,\mathbb{I}(\mathbf{Y}(nT) = \widetilde{\mathbf{Y}}).$$
Similarly define $\Delta W^{(k)}_{\perp}(\mathbf{Y})$, $\Delta V^{(k)}_{\parallel}(\mathbf{Y})$, $\widetilde{\Delta} W^{(k)}_{\perp}(\widetilde{\mathbf{Y}})$ and $\widetilde{\Delta} V^{(k)}_{\parallel}(\widetilde{\mathbf{Y}})$. We will show that the state-space collapse happens along the direction of $\mathbf{c}^{(k)}$ for the sampled system $\widetilde{\mathbf{Y}}(n)$.

We will need Lemma 3.5 by Hajek [33], which gives a bound on $\mathbb{E}\big[\big\|\mathbf{q}^{(k)}_{\perp}\big\|\big]$ in steady state if the drift $\widetilde{\Delta} W^{(k)}_{\perp}(\widetilde{\mathbf{Y}})$ is negative. If we further assume that the Markov chain $\{X[t]\}_t$ is positive recurrent, then $Z(X[t])$ converges in distribution to a random variable $\bar Z$ for which
$$\mathbb{E}\big[e^{\theta^{\star}\bar Z}\big] \le C^{\star},$$
which directly implies that all moments of $\bar Z$ exist and are finite.

We will use this result for the Markov process $\widetilde{\mathbf{Y}}(n)$ and the Lyapunov function $W^{(k)}_{\perp}(\widetilde{\mathbf{Y}})$. We will also use the following lemma (similar to Lemma 7 in [37]) to bound $\widetilde{\Delta} V^{(k)}_{\parallel}(\widetilde{\mathbf{Y}})$. The proof follows from the concavity of the square-root function and the Pythagorean theorem. See [37] for details.
Lemma 5.3. Let $\widetilde{\mathbf{q}} \triangleq \mathbf{q}(\widetilde{\mathbf{Y}})$. The drift of $W^{(k)}_{\perp}(\widetilde{\mathbf{Y}})$ can be bounded as follows:
$$\widetilde{\Delta} W^{(k)}_{\perp}(\widetilde{\mathbf{Y}}) \le \frac{1}{2\big\|\widetilde{\mathbf{q}}^{(k)}_{\perp}\big\|}\Big(\widetilde{\Delta} V(\widetilde{\mathbf{Y}}) - \widetilde{\Delta} V^{(k)}_{\parallel}(\widetilde{\mathbf{Y}})\Big), \qquad \widetilde{\mathbf{q}} \in \mathbb{R}^{ML}_+. \qquad(5.9)$$

Note that
$$\begin{aligned}
\mathbb{E}\big[\widetilde{\Delta} V(\widetilde{\mathbf{Y}})\,\big|\,\mathbf{Y}(nT) = \widetilde{\mathbf{Y}}\big]
&= \mathbb{E}\big[V(\mathbf{Y}((n+1)T)) - V(\mathbf{Y}(nT))\,\big|\,\mathbf{Y}(nT) = \widetilde{\mathbf{Y}}\big]\\
&= \sum_{t=nT}^{nT+T-1}\mathbb{E}\big[V(\mathbf{Y}(t+1)) - V(\mathbf{Y}(t))\,\big|\,\mathbf{Y}(nT) = \widetilde{\mathbf{Y}}\big]\\
&= \sum_{t=nT}^{nT+T-1}\mathbb{E}\Big[\mathbb{E}\big[V(\mathbf{Y}(t+1)) - V(\mathbf{Y}(t))\,\big|\,\mathbf{Y}(t) = \mathbf{Y}, \mathbf{Y}(nT) = \widetilde{\mathbf{Y}}\big]\,\Big|\,\mathbf{Y}(nT) = \widetilde{\mathbf{Y}}\Big]\\
&\overset{(a)}{=} \sum_{t=nT}^{nT+T-1}\mathbb{E}\Big[\mathbb{E}\big[\Delta V(\mathbf{Y})\,\big|\,\mathbf{Y}(t) = \mathbf{Y}\big]\,\Big|\,\mathbf{Y}(nT) = \widetilde{\mathbf{Y}}\Big],
\end{aligned}$$
where (a) follows from the fact that $\mathbf{Y}(t)$ is Markov. Similarly, one can bound the conditional drift of $\widetilde{\Delta} V^{(k)}_{\parallel}(\widetilde{\mathbf{Y}})$. Then, using these equalities in (5.9), we get
$$\mathbb{E}\big[\widetilde{\Delta} W^{(k)}_{\perp}(\widetilde{\mathbf{Y}})\,\big|\,\mathbf{Y}(nT) = \widetilde{\mathbf{Y}}\big] \le \frac{1}{2\big\|\widetilde{\mathbf{q}}^{(k)}_{\perp}\big\|}\sum_{t=nT}^{nT+T-1}\mathbb{E}\Big[\mathbb{E}\big[\Delta V(\mathbf{Y}) - \Delta V^{(k)}_{\parallel}(\mathbf{Y})\,\big|\,\mathbf{Y}(t) = \mathbf{Y}\big]\,\Big|\,\mathbf{Y}(nT) = \widetilde{\mathbf{Y}}\Big]. \qquad(5.10)$$
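The telescoping step above, which writes the T-step drift as a sum of one-step drifts, can be illustrated numerically with an arbitrary scalar Lyapunov trajectory.

```python
def one_step_drifts(V_path):
    """One-step drifts V(t+1) - V(t) along a sample path; their sum
    telescopes to the T-step drift V(T) - V(0) used in (5.10)."""
    return [V_path[t + 1] - V_path[t] for t in range(len(V_path) - 1)]

# A super time slot of T = 4 steps of an arbitrary scalar Lyapunov value.
V_path = [9.0, 7.5, 8.0, 6.0, 5.5]
drifts = one_step_drifts(V_path)
```

The intermediate values cancel in the sum, which is exactly why bounding each one-step drift bounds the T-step drift.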
In the following, we will use $\mathbf{q}^{(\epsilon)}$ for $\mathbf{q}(\mathbf{Y}^{(\epsilon)})$ and $\widetilde{\mathbf{q}}^{(\epsilon)}$ for $\mathbf{q}(\widetilde{\mathbf{Y}}^{(\epsilon)})$. We will start by bounding the one-step drift of $V^{(k)}_{\parallel}$:
$$\begin{aligned}
\mathbb{E}\big[\Delta V^{(k)}_{\parallel}(\mathbf{Y}^{(\epsilon)})\,\big|\,\mathbf{Y}^{(\epsilon)}(t) = \mathbf{Y}^{(\epsilon)}\big]
&= \mathbb{E}\Big[\big\langle \mathbf{c}^{(k)}, \mathbf{q}^{(\epsilon)}(t) + \mathbf{a}^{(\epsilon)}(t) - \mathbf{s}^{(\epsilon)}(t) + \mathbf{u}^{(\epsilon)}(t)\big\rangle^2 - \big\langle \mathbf{c}^{(k)}, \mathbf{q}^{(\epsilon)}(t)\big\rangle^2\,\Big|\,\mathbf{Y}(t) = \mathbf{Y}^{(\epsilon)}\Big]\\
&\ge \mathbb{E}\Big[\big\langle \mathbf{c}^{(k)}, \mathbf{a}^{(\epsilon)}(t) - \mathbf{s}^{(\epsilon)}(t)\big\rangle^2 - 2\big\langle \mathbf{c}^{(k)}, \mathbf{s}^{(\epsilon)}(t)\big\rangle\big\langle \mathbf{c}^{(k)}, \mathbf{u}^{(\epsilon)}(t)\big\rangle + 2\big\langle \mathbf{c}^{(k)}, \mathbf{q}^{(\epsilon)}(t)\big\rangle\big\langle \mathbf{c}^{(k)}, \mathbf{a}^{(\epsilon)}(t) - \mathbf{s}^{(\epsilon)}(t)\big\rangle\,\Big|\,\mathbf{Y}(t) = \mathbf{Y}^{(\epsilon)}\Big]\\
&\ge -2\big\langle c^{(k)}, s_{\max}\mathbf{1}\big\rangle^2 + 2\big\langle \mathbf{c}^{(k)}, \mathbf{q}(\mathbf{Y}^{(\epsilon)})\big\rangle\Big(\big\langle \mathbf{c}^{(k)}, \mathbb{E}\big[\mathbf{a}^{(\epsilon)}(t)\,\big|\,\mathbf{Y}(t) = \mathbf{Y}^{(\epsilon)}\big]\big\rangle - \big\langle \mathbf{c}^{(k)}, \mathbb{E}\big[\mathbf{s}^{(\epsilon)}(t)\,\big|\,\mathbf{Y}(t) = \mathbf{Y}^{(\epsilon)}\big]\big\rangle\Big)\\
&= -K_{26} + \frac{2\big\|\mathbf{q}^{(\epsilon,k)}_{\parallel}\big\|}{\sqrt{L}}\sum_{m=1}^{M} c^{(k)}_m\left(\lambda^{(\epsilon)}_m - \sum_{i=1}^{L}\mathbb{E}\big[s^{i(\epsilon)}_m(t)\,\big|\,\mathbf{Y}(t) = \mathbf{Y}^{(\epsilon)}\big]\right) \qquad(5.11)\\
&= -K_{26} + \frac{2\big\|\mathbf{q}^{(\epsilon,k)}_{\parallel}\big\|}{\sqrt{L}}\sum_{m=1}^{M} c^{(k)}_m\left(\lambda^{(k)}_m - \epsilon^{(k)} c^{(k)}_m - \sum_{i=1}^{L}\mathbb{E}\big[s^{i(\epsilon)}_m(t)\,\big|\,\cdot\big]\right) \qquad(5.12)\\
&= -K_{26} - \frac{2\epsilon^{(k)}}{\sqrt{L}}\big\|\mathbf{q}^{(\epsilon,k)}_{\parallel}\big\| + \frac{2\big\|\mathbf{q}^{(\epsilon,k)}_{\parallel}\big\|}{\sqrt{L}}\sum_{i=1}^{L}\sum_{m=1}^{M} c^{(k)}_m\left(\lambda^{i(k)}_m - \mathbb{E}\big[s^{i(\epsilon)}_m(t)\,\big|\,\cdot\big]\right) \qquad(5.13)\\
&\ge -K_{26} - \frac{2\epsilon^{(k)}}{\sqrt{L}}\big\|\mathbf{q}^{(\epsilon,k)}_{\parallel}\big\|, \qquad(5.14)
\end{aligned}$$
where $K_{26} = 2M s^2_{\max}$ and recall $\mathbf{q}^{(\epsilon,k)}_{\parallel} \triangleq \mathbf{q}(\mathbf{Y}^{(\epsilon)})^{(k)}_{\parallel}$. Equation (5.11) follows from the fact that the sum of the arrival rates at the servers is the same as the external arrival rate. Equation (5.12) follows from (5.6). From the definition of $C$, we have that there exist $\lambda^{i(k)} \in C_i$ such that $\lambda^{(k)} = \sum_{i=1}^{L}\lambda^{i(k)}$. This gives (5.13). From Lemma 5.1, we have that for each $i$ there exists $b^{(k)}_i$ such that $\sum_{m=1}^{M} c^{(k)}_m \lambda^{i(k)}_m = b^{(k)}_i$ and $\langle c^{(k)}, s^{i(\epsilon)}\rangle \le b^{(k)}_i$ for every $s^{i(\epsilon)}(t) \in C_i$. Therefore, we have
$$\sum_{m=1}^{M} c^{(k)}_m\left(\lambda^{i(k)}_m - \mathbb{E}\big[s^{i(\epsilon)}_m(t)\,\big|\,\mathbf{Y}(t) = \mathbf{Y}^{(\epsilon)}\big]\right) \ge 0,$$
which gives (5.14).
Now, we will bound the one-step drift of $V(\cdot)$. By expanding the drift of $V(\mathbf{q}^{(\epsilon)})$ and using (5.3), it can easily be seen that
$$\mathbb{E}\big[\Delta V(\mathbf{Y}^{(\epsilon)})\,\big|\,\mathbf{Y}^{(\epsilon)}(t) = \mathbf{Y}^{(\epsilon)}\big] \le K_{27} + \mathbb{E}_{\mathbf{Y}^{(\epsilon)}}\left[\sum_{i=1}^{L}\sum_{m=1}^{M} 2 q^{(\epsilon)}_{mi}\big(a_{mi}(t) - s^i_m(t)\big)\right], \qquad(5.15)$$
where $K_{27} = M\sum_m\big(\lambda_m^2 + \sigma_m^2\big) + 2M s^2_{\max}(1 + D_{\max})$ and $\mathbb{E}_{\mathbf{Y}^{(\epsilon)}}[\cdot]$ is shorthand for $\mathbb{E}[\,\cdot\,|\,\mathbf{Y}^{(\epsilon)}(t) = \mathbf{Y}^{(\epsilon)}]$. Under JSQ routing, all type-$m$ arrivals are routed to the server $i^{(\epsilon)}_m$, so
$$\mathbb{E}_{\mathbf{Y}^{(\epsilon)}}\left[\sum_{i=1}^{L}\sum_{m=1}^{M} 2 q^{(\epsilon)}_{mi}\, a_{mi}(t)\right] = \sum_{m=1}^{M} 2 q^{(\epsilon)}_{m i^{(\epsilon)}_m}\lambda^{(\epsilon)}_m.$$
Then we have
$$\begin{aligned}
\mathbb{E}\big[\Delta V(\mathbf{Y}^{(\epsilon)})\,\big|\,\mathbf{Y}^{(\epsilon)}(t) = \mathbf{Y}^{(\epsilon)}\big]
&\le K_{27} + 2\sum_{m=1}^{M}\lambda^{(\epsilon)}_m q^{(\epsilon)}_{m i^{(\epsilon)}_m} - 2\,\mathbb{E}_{\mathbf{q}^{(\epsilon)}}\left[\sum_{i=1}^{L}\sum_{m=1}^{M} q^{(\epsilon)}_{mi}\, s^i_m(t)\right] \qquad(5.16)\\
&= K_{27} + 2\sum_{m=1}^{M}\lambda^{(\epsilon)}_m\left(q^{(\epsilon)}_{m i^{(\epsilon)}_m} - \sum_{i=1}^{L}\frac{q^{(\epsilon)}_{mi}}{L}\right) + 2\sum_{m=1}^{M}\lambda^{(\epsilon)}_m\sum_{i=1}^{L}\frac{q^{(\epsilon)}_{mi}}{L} - 2\,\mathbb{E}_{\mathbf{q}^{(\epsilon)}}\left[\sum_{i=1}^{L}\sum_{m=1}^{M} q^{(\epsilon)}_{mi}\, s^i_m(t)\right]. \qquad(5.17)
\end{aligned}$$
For the last two terms, using $\lambda^{(\epsilon)}_m = \lambda^{(k)}_m - \epsilon^{(k)} c^{(k)}_m$,
$$2\sum_{m=1}^{M}\lambda^{(\epsilon)}_m\sum_{i=1}^{L}\frac{q^{(\epsilon)}_{mi}}{L} - 2\,\mathbb{E}_{\mathbf{Y}^{(\epsilon)}}\left[\sum_{i=1}^{L}\sum_{m=1}^{M} q^{(\epsilon)}_{mi}\, s^i_m(t)\right] = -\frac{2\epsilon^{(k)}}{\sqrt{L}}\big\|\mathbf{q}^{(\epsilon,k)}_{\parallel}\big\| + 2\sum_{i=1}^{L}\mathbb{E}_{\mathbf{Y}^{(\epsilon)}}\left[\sum_{m=1}^{M} q^{(\epsilon)}_{mi}\left(\frac{\lambda^{(k)}_m}{L} - s^i_m(t)\right)\right]. \qquad(5.18)$$
Substituting all the bounds, i.e., (5.14), (5.17), and (5.18) in (5.10), and using $\widetilde{\mathbf{q}}^{(\epsilon)}$ for $\mathbf{q}(\widetilde{\mathbf{Y}}^{(\epsilon)})$, we get
$$\begin{aligned}
\mathbb{E}\big[\widetilde{\Delta} W^{(k)}_{\perp}(\widetilde{\mathbf{Y}}^{(\epsilon)})\,\big|\,\mathbf{Y}^{(\epsilon)}(nT) = \widetilde{\mathbf{Y}}^{(\epsilon)}\big]
\le \frac{1}{2\big\|\widetilde{\mathbf{q}}^{(\epsilon,k)}_{\perp}\big\|}\Bigg(K_{28}T
&+ 2\sum_{t=nT}^{nT+T-1}\mathbb{E}\Bigg[\sum_{m=1}^{M}\lambda^{(\epsilon)}_m\Bigg(q^{(\epsilon)}_{m i_m(t)}(t) - \sum_{i=1}^{L}\frac{q^{(\epsilon)}_{mi}(t)}{L}\Bigg) \qquad(5.19)\\
&+ 2\sum_{i=1}^{L}\mathbb{E}_{\mathbf{Y}^{(\epsilon)}}\Bigg[\sum_{m=1}^{M} q^{(\epsilon)}_{mi}(t)\Bigg(\frac{\lambda^{(k)}_m}{L} - s^i_m(t)\Bigg)\Bigg]\,\Bigg|\,\mathbf{Y}^{(\epsilon)}(nT) = \widetilde{\mathbf{Y}}^{(\epsilon)}\Bigg]\Bigg), \qquad(5.20)
\end{aligned}$$
where $K_{28} = K_{26} + K_{27}$ and $i_m(t)$ denotes the server to which type-$m$ jobs were routed at time $t$. We will first bound the terms in (5.19). Note that from the queue evolution equation (5.1), for $nT \le t < (n+1)T$ the change in each workload within a super time slot is bounded:
$$\big|q^{(\epsilon)}_{mi}(t) - q^{(\epsilon)}_{mi}(nT)\big| \le \sum_{t'=nT}^{nT+T-1}a_{mi}(t') + T s_{\max}, \qquad(5.21)$$
and, since the server $i_m(t)$ has the smallest workload at time $t$ for type-$m$ jobs,
$$q^{(\epsilon)}_{m i_m(t)}(t) \overset{(a)}{\le} \frac{1}{L}\sum_{i=1}^{L} q^{(\epsilon)}_{mi}(t) \le \frac{1}{L}\sum_{i=1}^{L}\left(q^{(\epsilon)}_{mi}(nT) + \sum_{t'=nT}^{nT+T-1} a_{mi}(t')\right) + T s_{\max}.$$
Using $\mathbb{E}_{\widetilde{\mathbf{Y}}^{(\epsilon)}}[\cdot]$ for $\mathbb{E}[\,\cdot\,|\,\mathbf{Y}^{(\epsilon)}(nT) = \widetilde{\mathbf{Y}}^{(\epsilon)}]$, we bound the terms in (5.19). We will assume that the arrival rate $\lambda^{(\epsilon)}$ is such that there exists a $\delta' > 0$ such that $\lambda^{(\epsilon)}_m > \delta'$ for all $m$. This assumption is reasonable because we are interested in the limit when the arrival rate is on the boundary of the capacity region.
$$\begin{aligned}
\sum_{t=nT}^{nT+T-1}\sum_{m=1}^{M}\lambda^{(\epsilon)}_m\,\mathbb{E}_{\widetilde{\mathbf{Y}}^{(\epsilon)}}&\left[q^{(\epsilon)}_{m i_m(t)}(t) - \sum_{i=1}^{L}\frac{q^{(\epsilon)}_{mi}(t)}{L}\right]\\
&\le K_{29} + T\sum_{m=1}^{M}\lambda^{(\epsilon)}_m\left(\widetilde q^{(\epsilon)}_{m i_m} - \sum_{i=1}^{L}\frac{\widetilde q^{(\epsilon)}_{mi}}{L}\right)
= K_{29} - \frac{T}{L}\sum_{m=1}^{M}\lambda^{(\epsilon)}_m\sum_{i=1}^{L}\left(\widetilde q^{(\epsilon)}_{mi} - \widetilde q^{(\epsilon)}_{m i_m}\right)\\
&\overset{(5.22)}{\le} K_{29} - \frac{T}{L}\sum_{m=1}^{M}\lambda^{(\epsilon)}_m\sqrt{\sum_{i=1}^{L}\left(\widetilde q^{(\epsilon)}_{mi} - \widetilde q^{(\epsilon)}_{m i_m}\right)^2}
\overset{(5.23)}{\le} K_{29} - \frac{T}{L}\sum_{m=1}^{M}\lambda^{(\epsilon)}_m\sqrt{\sum_{i=1}^{L}\left(\widetilde q^{(\epsilon)}_{mi} - \frac{1}{L}\sum_{i'=1}^{L}\widetilde q^{(\epsilon)}_{mi'}\right)^2}\\
&\overset{(5.24)}{\le} K_{29} - \frac{T\delta'}{L}\sum_{m=1}^{M}\sqrt{\sum_{i=1}^{L}\left(\widetilde q^{(\epsilon)}_{mi} - \frac{1}{L}\sum_{i'=1}^{L}\widetilde q^{(\epsilon)}_{mi'}\right)^2}
\overset{(5.25)}{\le} K_{29} - \frac{T\delta'}{L}\sqrt{\sum_{m=1}^{M}\sum_{i=1}^{L}\left(\widetilde q^{(\epsilon)}_{mi} - \frac{1}{L}\sum_{i'=1}^{L}\widetilde q^{(\epsilon)}_{mi'}\right)^2},
\end{aligned}$$
where $K_{29} = 4T^2\sum_{m=1}^{M}\big(\lambda^{(\epsilon)}_m\big)^2 + 4T^2 s_{\max}\sum_{m=1}^{M}\lambda^{(\epsilon)}_m$ accounts, via (5.21), for replacing $\mathbf{q}^{(\epsilon)}(t)$ by $\widetilde{\mathbf{q}}^{(\epsilon)} = \mathbf{q}^{(\epsilon)}(nT)$. Inequality (5.22) follows from the fact that the $\ell_1$ norm of a vector in $\mathbb{R}^L$ is no smaller than its $\ell_2$ norm. The minimum mean-square constant estimator of a vector is its empirical mean: in other words, for a vector $x \in \mathbb{R}^L$, the convex function $\sqrt{\sum_i (x_i - y)^2}$ is minimized at $y = \frac{1}{L}\sum_i x_i$. This gives (5.23). Inequality (5.24) follows from the assumption that $\lambda^{(\epsilon)}_m > \delta'$. Inequality (5.25) follows from the observation that $\big(\sum_m x_m\big)^2 \ge \sum_m x^2_m$ for nonnegative $x_m$.
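The fact behind (5.23), that the empirical mean minimizes the summed squared deviations over all constant estimators, is easy to check numerically; the sample vector below is arbitrary.

```python
def sq_dev(x, y):
    """Sum of squared deviations of the entries of x from the constant y."""
    return sum((xi - y) ** 2 for xi in x)

x = [4.0, 1.0, 7.0, 0.0]
mean = sum(x) / len(x)
# Constants other than the mean can only increase the squared deviation.
others = [mean + d for d in (-2.0, -0.5, 0.5, 2.0)]
```

Since the square root is monotone, the same constant also minimizes the root of the summed squared deviations used in the proof.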
We will now bound the terms in (5.20). To do this, we will first show that the sum $\sum_{m=1}^{M} q^{(\epsilon)}_{mi}(t)\, s^i_m(t)$ does not change by much within $T$ time slots. For any $t$ with $nT \le t < (n+1)T$,
$$\sum_{m=1}^{M} q^{(\epsilon)}_{mi}(t)\, s^i_m(t+1) \overset{(a)}{\le} \sum_{m=1}^{M} q^{(\epsilon)}_{mi}(t)\, s^i_m(t), \qquad \sum_{m=1}^{M} q^{(\epsilon)}_{mi}(t)\, s^i_m(t+1) \overset{(b)}{\ge} \sum_{m=1}^{M} q^{(\epsilon)}_{mi}(t)\, s^i_m(t) - M s^2_{\max}(D_{\max}+1),$$
where inequality (a) follows from the scheduling algorithm and the definition of $s^i_m(t)$, and inequality (b) holds because when $q^{(\epsilon)}_{mi}(t) \ge D_{\max} s_{\max}$, there are enough jobs so that there is no unused service. From (5.1), we then get
$$\sum_{m=1}^{M} q^{(\epsilon)}_{mi}(t+1)\, s^i_m(t+1) \overset{(c)}{\ge} \sum_{m=1}^{M} q^{(\epsilon)}_{mi}(t)\, s^i_m(t) - M s^2_{\max}(D_{\max}+1) - M s^2_{\max}.$$
Using this, the terms in (5.20) can be related to the queue lengths at time $nT$:
$$\begin{aligned}
2\sum_{t=nT}^{nT+T-1}\sum_{i=1}^{L}\mathbb{E}_{\widetilde{\mathbf{Y}}^{(\epsilon)}}\left[\sum_{m=1}^{M} q^{(\epsilon)}_{mi}(t)\left(\frac{\lambda^{(k)}_m}{L} - s^i_m(t)\right)\right]
&\le 2T\sum_{i=1}^{L}\mathbb{E}_{\widetilde{\mathbf{Y}}^{(\epsilon)}}\left[\sum_{m=1}^{M} q^{(\epsilon)}_{mi}(nT)\left(\frac{\lambda^{(k)}_m}{L} - s^i_m(nT)\right)\right] + 2T^2\sum_{m=1}^{M}\epsilon^{(k)}_m\lambda^{(\epsilon)}_m + 2TLK_{30}\\
&\overset{(a)}{=} K_{31} + 2T\sum_{i=1}^{L}\min_{r^i \in C_i}\mathbb{E}_{\widetilde{\mathbf{Y}}^{(\epsilon)}}\left[\sum_{m=1}^{M}\widetilde q^{(\epsilon)}_{mi}\left(\frac{\lambda^{(k)}_m}{L} - r^i_m\right)\right], \qquad(5.26)
\end{aligned}$$
where $K_{31} = 2T^2\sum_{m=1}^{M}\epsilon^{(k)}_m\lambda^{(\epsilon)}_m + 2TLK_{30} + 2TMLD_{\max}s^2_{\max}$. Equation (a) is true because of MaxWeight scheduling. Note that in Algorithm 9, the actual service allocated to jobs of type $m$ at server $i$ is the same as that of the MaxWeight schedule as long as the corresponding queue length of backlogged workload is greater than $D_{\max}s_{\max}$. This gives the additional $2TMLD_{\max}s^2_{\max}$ term.
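The per-server MaxWeight step used in (a) above can be sketched as an argmax over a finite set of feasible configurations; the configuration list and queue values below are illustrative placeholders, not the thesis's Algorithm 9 itself.

```python
def maxweight_schedule(q_i, configs):
    """Per-server MaxWeight: pick s in C_i maximizing sum_m q_mi * s_m.
    Solving this at each server solves the joint problem over the
    product of the per-server capacity regions."""
    return max(configs, key=lambda s: sum(qm * sm for qm, sm in zip(q_i, s)))

# Corner points of a toy two-type capacity region C_i.
configs = [(0, 0), (2, 0), (0, 2), (1, 1)]
s = maxweight_schedule([5.0, 1.0], configs)
```

Because the objective is linear in s, it suffices to search the corner points of the polytope, which is why a finite configuration list is enough here.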
Assuming all the servers are identical, we have that for each $i$, $C_i = \{\lambda/L : \lambda \in C\}$, so $C_i$ is a scaled version of $C$. Thus, $\lambda^{(k)}_i = \lambda^{(k)}/L$. Since $k \in K^{(\epsilon)}_o$, we also have that $k \in K^{(\epsilon)}_{oi}$ for the capacity region $C_i$. Thus, there exists $\xi^{(k)} > 0$ so that
$$\mathcal{B}^{(k)} \triangleq H^{(k)}_i \cap \Big\{r \in \mathbb{R}^M_+ : \big\|r - \lambda^{(k)}/L\big\| \le \xi^{(k)}\Big\}$$
lies strictly within the face of $C_i$ that corresponds to $F^{(k)}$. (Note that this is the only instance in the proof of Theorem 5.2 in which we use the assumption that all the servers are identical.) Call this face $F^{(k)}_i$. Thus we have
$$\begin{aligned}
\sum_{i=1}^{L}\min_{r^i \in C_i}\sum_{m=1}^{M}\widetilde q^{(\epsilon)}_{mi}\left(\frac{\lambda^{(k)}_m}{L} - r^i_m\right)
&\le \sum_{i=1}^{L}\min_{r^i \in \mathcal{B}^{(k)}}\sum_{m=1}^{M}\widetilde q^{(\epsilon)}_{mi}\left(\frac{\lambda^{(k)}_m}{L} - r^i_m\right) \qquad(5.27)\\
&= \sum_{i=1}^{L}\min_{r^i \in \mathcal{B}^{(k)}}\sum_{m=1}^{M}\left(\widetilde q^{(\epsilon)}_{mi} - \sum_{m'=1}^{M}\widetilde q^{(\epsilon)}_{m'i}\, c_{m'}\, c_m\right)\left(\frac{\lambda^{(k)}_m}{L} - r^i_m\right) \qquad(5.28)\\
&= -\xi^{(k)}\sum_{i=1}^{L}\sqrt{\sum_{m=1}^{M}\left(\widetilde q^{(\epsilon)}_{mi} - \sum_{m'=1}^{M}\widetilde q^{(\epsilon)}_{m'i}\, c_{m'}\, c_m\right)^2} \qquad(5.29{-}5.30)\\
&\le -\xi^{(k)}\sqrt{\sum_{i=1}^{L}\sum_{m=1}^{M}\big(\widetilde q^{(\epsilon)}_{mi}\big)^2 - \sum_{i=1}^{L}\left(\sum_{m'=1}^{M}\widetilde q^{(\epsilon)}_{m'i}\, c_{m'}\right)^2}, \qquad(5.31)
\end{aligned}$$
where (5.28) holds because $\sum_m c_m\big(\frac{\lambda^{(k)}_m}{L} - r^i_m\big) = 0$ for $r^i \in \mathcal{B}^{(k)} \subset H^{(k)}_i$. The vector $\widetilde q^{(\epsilon)}_i \triangleq \big(\widetilde q^{(\epsilon)}_{mi}\big)_m$ is in $\mathbb{R}^M$. Its component along $c \in \mathbb{R}^M$ is $\widetilde q^{(\epsilon)}_{i\parallel} = \big(\sum_{m=1}^{M}\widetilde q^{(\epsilon)}_{mi}\, c_m\big)c$, and the component perpendicular to $c$ is $\widetilde q^{(\epsilon)}_{i\perp} = \widetilde q^{(\epsilon)}_i - \widetilde q^{(\epsilon)}_{i\parallel}$. The term in (5.28) is an inner product in $\mathbb{R}^M$ between $\widetilde q^{(\epsilon)}_{i\perp}$ and $\frac{\lambda^{(k)}}{L} - r^i$, which is minimized when $r^i$ is chosen on the boundary of $\mathcal{B}^{(k)}$ so that $\frac{\lambda^{(k)}}{L} - r^i = -\xi^{(k)}\,\widetilde q^{(\epsilon)}_{i\perp}/\big\|\widetilde q^{(\epsilon)}_{i\perp}\big\|$; this gives (5.29)-(5.30). Finally, (5.31) follows from $\sum_i\big\|\widetilde q^{(\epsilon)}_{i\perp}\big\| \ge \sqrt{\sum_i\big\|\widetilde q^{(\epsilon)}_{i\perp}\big\|^2}$.

Putting everything together,
$$\begin{aligned}
\mathbb{E}\big[\widetilde{\Delta} W^{(k)}_{\perp}(\widetilde{\mathbf{Y}}^{(\epsilon)})\,\big|\,\mathbf{Y}^{(\epsilon)}(nT) = \widetilde{\mathbf{Y}}^{(\epsilon)}\big]
&\le \frac{1}{2\big\|\widetilde{\mathbf{q}}^{(\epsilon,k)}_{\perp}\big\|}\Bigg(K_{28}T + K_{29} - \frac{2T\delta'}{L}\sqrt{\sum_{m,i}\left(\widetilde q^{(\epsilon)}_{mi} - \frac{1}{L}\sum_{i'}\widetilde q^{(\epsilon)}_{mi'}\right)^2}\\
&\qquad\qquad + K_{31} - 2T\xi^{(k)}\sqrt{\sum_{i,m}\big(\widetilde q^{(\epsilon)}_{mi}\big)^2 - \sum_{i}\Big(\sum_{m'}\widetilde q^{(\epsilon)}_{m'i}\, c_{m'}\Big)^2}\Bigg) \qquad(5.32)\\
&\overset{(a)}{\le} \frac{K_{32}}{2\big\|\widetilde{\mathbf{q}}^{(\epsilon,k)}_{\perp}\big\|} - \frac{T\theta'_0}{2\big\|\widetilde{\mathbf{q}}^{(\epsilon,k)}_{\perp}\big\|}\sqrt{\sum_{m,i}\left(\widetilde q^{(\epsilon)}_{mi} - \frac{1}{L}\sum_{i'}\widetilde q^{(\epsilon)}_{mi'}\right)^2 + \sum_{i,m}\big(\widetilde q^{(\epsilon)}_{mi}\big)^2 - \sum_{i}\Big(\sum_{m'}\widetilde q^{(\epsilon)}_{m'i}\, c_{m'}\Big)^2}\\
&\overset{(b)}{\le} \frac{K_{32}}{2\big\|\widetilde{\mathbf{q}}^{(\epsilon,k)}_{\perp}\big\|} - \frac{T\theta'_0}{2\big\|\widetilde{\mathbf{q}}^{(\epsilon,k)}_{\perp}\big\|}\sqrt{\big\|\widetilde{\mathbf{q}}^{(\epsilon)}\big\|^2 - \big\|\widetilde{\mathbf{q}}^{(\epsilon,k)}_{\parallel}\big\|^2}
= \frac{K_{32}}{2\big\|\widetilde{\mathbf{q}}^{(\epsilon,k)}_{\perp}\big\|} - \frac{T\theta'_0}{2}\\
&\le -\frac{T\theta'_0}{4} \qquad\text{whenever}\qquad W^{(k)}_{\perp}\big(\mathbf{q}(\widetilde{\mathbf{Y}}^{(\epsilon)})\big) = \big\|\widetilde{\mathbf{q}}^{(\epsilon,k)}_{\perp}\big\| \ge \frac{2K_{32}}{T\theta'_0}, \qquad(5.33)
\end{aligned}$$
where $K_{32} = K_{28}T + K_{29} + K_{31}$ and $\theta'_0 = \min\big\{\frac{\delta'}{\sqrt{M}}, \xi^{(k)}\big\}$. Inequality (a) follows from the fact that $(\sqrt{x} + \sqrt{y})^2 \ge x + y$, i.e., $\sqrt{x} + \sqrt{y} \ge \sqrt{x+y}$. Inequality (b) follows from the following claim, which is proved in Appendix C.
Claim 5.1. For any $\widetilde{\mathbf{q}} \in \mathbb{R}^{ML}$,
$$\frac{1}{L}\sum_{m=1}^{M}\left(\sum_{i=1}^{L}\widetilde q_{mi}\right)^2 + \sum_{i=1}^{L}\left(\sum_{m=1}^{M}\widetilde q_{mi}\, c_m\right)^2 \le \sum_{i=1}^{L}\sum_{m=1}^{M}\widetilde q^2_{mi} + \frac{1}{L}\left(\sum_{i=1}^{L}\sum_{m=1}^{M}\widetilde q_{mi}\, c_m\right)^2.$$
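Claim 5.1 can be sanity-checked numerically; the sketch below evaluates both sides on random data for a randomly drawn unit vector c (all dimensions and values are arbitrary test inputs, not model parameters).

```python
import random

def claim_terms(q, c):
    """Evaluate the left and right sides of Claim 5.1 for q in R^{M x L}
    and a unit vector c in R^M."""
    M, L = len(q), len(q[0])
    lhs = sum(sum(q[m]) ** 2 for m in range(M)) / L \
        + sum(sum(q[m][i] * c[m] for m in range(M)) ** 2 for i in range(L))
    rhs = sum(q[m][i] ** 2 for m in range(M) for i in range(L)) \
        + sum(q[m][i] * c[m] for m in range(M) for i in range(L)) ** 2 / L
    return lhs, rhs

random.seed(0)
ok = True
for _ in range(200):
    M, L = random.randint(1, 4), random.randint(1, 4)
    c = [random.random() + 0.01 for _ in range(M)]
    norm = sum(x * x for x in c) ** 0.5
    c = [x / norm for x in c]  # normalize so ||c|| = 1
    q = [[10 * random.random() for _ in range(L)] for _ in range(M)]
    lhs, rhs = claim_terms(q, c)
    ok = ok and lhs <= rhs + 1e-9
```

Equality holds, for example, when q is a rank-one array aligned with c, which is the tight case of the claim.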
In addition to (5.33), since the departures in each time slot are bounded and the arrivals are finite, there is a $D < \infty$ such that $P(|\Delta Z(X)| \le D) = 1$ almost surely. Now, applying Lemma 5.2, we have the following proposition.

Proposition 5.1. Assume all the servers are identical and the arrival rate $\lambda^{(\epsilon)} \in \mathrm{int}(C)$ is such that there exists a $\delta' > 0$ with $\lambda^{(\epsilon)}_m > \delta'$ for all $m$. Consider the sampled system $\widetilde{\mathbf{Y}}^{(\epsilon)}(n)$ under JSQ routing and myopic MaxWeight scheduling according to Algorithm 9. For every $k \in K^{(\epsilon)}_o$, there exists a set of finite constants $\{N^{(k)}_r\}_{r=1,2,\ldots}$ such that in steady state, $\mathbb{E}\big[\big\|\mathbf{q}^{(k)}_{\perp}(\widetilde{\mathbf{Y}}^{(\epsilon)}(n))\big\|^r\big] \le N^{(k)}_r$ for each $r = 1, 2, \ldots$

We have written the proofs more generally whenever we can, so that it is clear where we need the identical-server assumption. In particular, in this subsection, up to Equation (5.3.2), we do not need this assumption, but we have used the assumption after that, in analyzing the drift of $V(\mathbf{q})$. The upper bound in the next section is valid more generally if one can establish state-space collapse for the non-identical server case. However, at this time, this is an open problem.
We will now obtain an upper bound on the queue length of backlogged workload, $\mathbb{E}\big[\langle \mathbf{c}^{(k)}, \mathbf{q}^{(\epsilon)}(t)\rangle\big]$, and show that in the asymptotic limit as $\epsilon^{(k)} \downarrow 0$ this coincides with the lower bound. For ease of exposition, we will omit the superscript $(\epsilon)$ in this section. In order to obtain a matching upper bound, we consider the drift of the same Lyapunov function that was used in the lower bound, viz., $V^{(k)}_{\parallel}(\cdot)$. As a result, we have the following lemma.

Lemma 5.4. In steady state,
$$\sum_{t=nT}^{nT+T-1}\mathbb{E}\Big[\big\langle \mathbf{c}^{(k)}, \mathbf{q}(t)\big\rangle\big\langle \mathbf{c}^{(k)}, \mathbf{s}(t) - \mathbf{a}(t)\big\rangle\Big] \qquad(5.34)$$
$$= \frac{1}{2}\sum_{t=nT}^{nT+T-1}\mathbb{E}\Big[\big\langle \mathbf{c}^{(k)}, \mathbf{s}(t) - \mathbf{a}(t)\big\rangle^2\Big] + \frac{1}{2}\sum_{t=nT}^{nT+T-1}\mathbb{E}\Big[\big\langle \mathbf{c}^{(k)}, \mathbf{u}(t)\big\rangle^2\Big] \qquad(5.35)$$
$$\quad + \sum_{t=nT}^{nT+T-1}\mathbb{E}\Big[\big\langle \mathbf{c}^{(k)}, \mathbf{q}(t) + \mathbf{a}(t) - \mathbf{s}(t)\big\rangle\big\langle \mathbf{c}^{(k)}, \mathbf{u}(t)\big\rangle\Big]. \qquad(5.36)$$

Proof. Consider the $T$-step drift of $V^{(k)}_{\parallel}$:
$$\begin{aligned}
\widetilde{\Delta} V^{(k)}_{\parallel}(\mathbf{Y}(nT))
&= V^{(k)}_{\parallel}(\mathbf{q}(\mathbf{Y}((n+1)T))) - V^{(k)}_{\parallel}(\mathbf{q}(\mathbf{Y}(nT)))
= \sum_{t=nT}^{nT+T-1}\Big[V^{(k)}_{\parallel}(\mathbf{q}(t+1)) - V^{(k)}_{\parallel}(\mathbf{q}(t))\Big]\\
&= \sum_{t=nT}^{nT+T-1}\Big[\big\langle \mathbf{c}^{(k)}, \mathbf{q}(t+1)\big\rangle^2 - \big\langle \mathbf{c}^{(k)}, \mathbf{q}(t)\big\rangle^2\Big]\\
&\overset{(a)}{=} \sum_{t=nT}^{nT+T-1}\Big[\big\langle \mathbf{c}^{(k)}, \mathbf{a}(t) - \mathbf{s}(t)\big\rangle^2 + 2\big\langle \mathbf{c}^{(k)}, \mathbf{a}(t) - \mathbf{s}(t)\big\rangle\big\langle \mathbf{c}^{(k)}, \mathbf{q}(t)\big\rangle\\
&\qquad\qquad + 2\big\langle \mathbf{c}^{(k)}, \mathbf{q}(t) + \mathbf{a}(t) - \mathbf{s}(t)\big\rangle\big\langle \mathbf{c}^{(k)}, \mathbf{u}(t)\big\rangle + \big\langle \mathbf{c}^{(k)}, \mathbf{u}(t)\big\rangle^2\Big].
\end{aligned}$$
Equation (a) follows from (5.1). Noting that the expected drift of $V^{(k)}_{\parallel}$ is zero in steady state, we have the lemma.

Note that the expectation in the lemma is according to the steady-state distribution of the process $\widehat{\mathbf{Y}}(n)$, i.e., at time $t$, the queue length distribution is $\pi_{\tau}(\mathbf{q})$ where $\tau = t \bmod T$.

We will obtain an upper bound on $\mathbb{E}\big[\langle \mathbf{c}^{(k)}, \mathbf{q}(t)\rangle\big]$ by bounding each of the above terms. Before that, we need the following definitions and results.
Let $\pi^{(k)}_{\tau}$ be the steady-state probability that the MaxWeight schedule chosen is from the face $F^{(k)}$ at a time $t$ such that $t \bmod T = \tau$, i.e.,
$$\pi^{(k)}_{\tau} = P\big(\langle c, s(t)\rangle = b^{(k)}\big) \quad\text{whenever } t \bmod T = \tau,$$
where $s_m = \sum_{i=1}^{L} s^i_m$. Define $\pi^{(k)} = \frac{1}{T}\sum_{\tau=0}^{T-1}\pi^{(k)}_{\tau}$. Also, define
$$\gamma^{(k)} = \min\big\{b^{(k)} - \langle c, r\rangle : r \in S \setminus F^{(k)}\big\}.$$
Then we have the following claim.

Claim 5.2. For any $\epsilon^{(k)} \ge 0$, $1 - \pi^{(k)} \le \frac{\epsilon^{(k)}}{\gamma^{(k)}}$, and
$$\sum_{t=nT}^{nT+T-1}\mathbb{E}\left[\left(\frac{b^{(k)}}{\sqrt{L}} - \big\langle \mathbf{c}^{(k)}, \mathbf{s}(t)\big\rangle\right)^2\right] \le \frac{T\epsilon^{(k)}}{L\gamma^{(k)}}\Big(\big(b^{(k)}\big)^2 + \langle c, s_{\max}\mathbf{1}\rangle^2\Big). \qquad(5.37)$$

Proof. Since the arrivals are supported in steady state,
$$\sum_{t=nT}^{nT+T-1}\mathbb{E}\big[\big\langle c^{(k)}, s(\mathbf{q}(t))\big\rangle\big] \ge T\big\langle c^{(k)}, \lambda\big\rangle = T\big(b^{(k)} - \epsilon^{(k)}\big).$$
Splitting according to whether the schedule lies on $F^{(k)}$,
$$\sum_{t=nT}^{nT+T-1}\mathbb{E}\big[\big\langle c^{(k)}, s(\mathbf{q}(t))\big\rangle\,\mathbb{I}\big(\langle c, s(t)\rangle \ne b^{(k)}\big)\big] \ge T\big(b^{(k)} - \epsilon^{(k)}\big) - T b^{(k)}\pi^{(k)},$$
and since $\langle c, s\rangle \le b^{(k)} - \gamma^{(k)}$ off the face $F^{(k)}$, we get $\frac{1}{T}\sum_{\tau}\big(1 - \pi^{(k)}_{\tau}\big) \le \frac{\epsilon^{(k)}}{\gamma^{(k)}}$. For (5.37),
$$\begin{aligned}
\sum_{t=nT}^{nT+T-1}\mathbb{E}\left[\left(\frac{b^{(k)}}{\sqrt{L}} - \big\langle \mathbf{c}^{(k)}, \mathbf{s}(t)\big\rangle\right)^2\right]
&= \frac{1}{L}\sum_{t=nT}^{nT+T-1}\mathbb{E}\Big[\big(b^{(k)} - \langle c, s(t)\rangle\big)^2\Big]\\
&= \frac{1}{L}\sum_{t=nT}^{nT+T-1}\big(1 - \pi^{(k)}_{t \bmod T}\big)\,\mathbb{E}\Big[\big(b^{(k)} - \langle c, s(t)\rangle\big)^2\,\Big|\,\langle c, s(t)\rangle \ne b^{(k)}\Big]\\
&\le \frac{T}{L}\big(1 - \pi^{(k)}\big)\Big(\big(b^{(k)}\big)^2 + \langle c, s_{\max}\mathbf{1}\rangle^2\Big)
\le \frac{T\epsilon^{(k)}}{L\gamma^{(k)}}\Big(\big(b^{(k)}\big)^2 + \langle c, s_{\max}\mathbf{1}\rangle^2\Big).
\end{aligned}$$

Claim 5.3. Let $q^i \in \mathbb{R}^M_+$ for each $i \in \{1, 2, \ldots, L\}$, and denote $\mathbf{q} = (q^i)^L_{i=1} \in \mathbb{R}^{ML}_+$. Then
$$\max_{\mathbf{s} \in \widetilde{C}}\big\langle \mathbf{q}, \mathbf{s}\big\rangle = \sum_{i=1}^{L}\max_{s^i \in C_i}\big\langle q^i, s^i\big\rangle,$$
where $\widetilde{C}$ denotes the set of feasible joint schedules. Indeed, if $(s^i)^L_{i=1} \in \widetilde{C}$ were such that $\sum_i\langle q^i, s^i\rangle < \sum_i\max_{s^i \in C_i}\langle q^i, s^i\rangle$, then there would exist an $i \le L$ such that $\langle q^i, s^i\rangle < \max_{s^i \in C_i}\langle q^i, s^i\rangle$, which is a contradiction.

Therefore, choosing a MaxWeight schedule at each server is the same as choosing a MaxWeight schedule from the convex polytope $\widetilde{C}$. Since there are a finite number of feasible schedules, given $\mathbf{c}^{(k)} \in \mathbb{R}^{ML}_+$ such that $\|\mathbf{c}^{(k)}\| = 1$, there exists an angle $\theta^{(k)} \in (0, \frac{\pi}{2}]$ such that, for all $\mathbf{q} \in \big\{\mathbf{q} \in \mathbb{R}^{ML}_+ : \|\mathbf{q}^{(k)}_{\parallel}\| \ge \|\mathbf{q}\|\cos\theta^{(k)}\big\}$ (i.e., for all $\mathbf{q} \in \mathbb{R}^{ML}_+$ such that $\angle_{\mathbf{q}\mathbf{c}^{(k)}} \le \theta^{(k)}$, where $\angle_{ab}$ represents the angle between vectors $a$ and $b$), we have
$$\big\langle \mathbf{c}^{(k)}, \mathbf{s}(t)\big\rangle\,\mathbb{I}(\mathbf{q}(t) = \mathbf{q}) = \frac{b^{(k)}}{\sqrt{L}}\,\mathbb{I}(\mathbf{q}(t) = \mathbf{q}).$$
We can bound the unused service at each time $t$ in steady state as follows:
$$\mathbb{E}\big[\big\langle \mathbf{c}^{(k)}, \mathbf{u}(t)\big\rangle\big] \le \mathbb{E}\big[\big\langle \mathbf{c}^{(k)}, \mathbf{s}(t) - \mathbf{a}(t)\big\rangle\big] = \mathbb{E}\big[\big\langle \mathbf{c}^{(k)}, \mathbf{s}(t)\big\rangle\big] - \frac{1}{\sqrt{L}}\big(b^{(k)} - \epsilon^{(k)}\big) \le \frac{\epsilon^{(k)}}{\sqrt{L}}, \qquad(5.38)$$
where the last inequality follows from the fact that the MaxWeight schedule lies inside the capacity region, and so $\mathbb{E}\big[\langle \mathbf{c}^{(k)}, \mathbf{s}(t)\rangle\big] \le \frac{b^{(k)}}{\sqrt{L}}$.

We will also need the following bound. Since the change in workload within $T$ time slots is bounded as in (5.21), we get
$$\big\|\mathbf{q}^{(k)}_{\parallel}(t) - \mathbf{q}^{(k)}_{\parallel}(nT)\big\| = \big|\big\langle \mathbf{q}^{(k)}(t) - \mathbf{q}^{(k)}(nT), \mathbf{c}^{(k)}\big\rangle\big| \le \big\|\mathbf{q}^{(k)}(t) - \mathbf{q}^{(k)}(nT)\big\| \le \sum_{t'=nT}^{nT+T-1}\big\|\mathbf{a}(t')\big\| + MLT s_{\max}. \qquad(5.39)$$
, we have
first term in (5.35). Again, using the fact that the arrival rate is
nTX
+T 1
2 i
c(k) , s(t) a(t)
t=nT
nTX
+T 1
"
"
# nTX
#
+T 1
(k)
b(k) 2
2
b(k)
(k)
=
E
c , a(t)
+
E c , s(t)
L
L
t=nT
t=nT
nT +T 1 (k)
(k)
(k) X
b
c , s(t)
2
E
L t=nT
L
!2
(k)
nT +T 1
(k)
(b) X
1
+ c , b
E c(k) , a(t)
L
L
t=nT
"
#
nTX
+T 1
2
b(k)
(k)
c , s(t)
E
+
L
t=nT
nTX
+T 1 h
nT +T 1
(c) 1
(k)
i
(k) X
2
E
c , a(t)
+ 2
E c(k) , a(t)
L t=nT
L t=nT
2
(k)
T (k) (k) 2
2
b
+
hc,
s
1i
+T
+
max
L
L (k)
2
T (k) (k) 2
T D (k) 2 2 E T (k)
2
c
, +
+
b
+
hc,
s
1i
(5.40)
max
L
L
L (k)
1 (k) (k) 2
T
2
(,k)
+ (k) b
+ hc, smax 1i
,
(5.41)
=
L
L
(a)
D
2
2
2 E
((k) )
where (,k) was earlier defined as (,k) = L + 1L c(k) , ()
. The
last term in (a) is dropped to get (b) since it is negative. Inequality (c)
follows from (5.37) in Claim 5.2.
ini (5.40) is obtained by noting
h
The first term
2
and so E
=
that E [a(t)] =
c(k) , a(t)
= var c(k) , a(t)
103
Consider the second term in (5.35):
$$\sum_{t=nT}^{nT+T-1}\mathbb{E}\Big[\big\langle \mathbf{c}^{(k)}, \mathbf{u}(t)\big\rangle^2\Big] \le \big\langle c^{(k)}, \mathbf{1}s_{\max}\big\rangle\sum_{t=nT}^{nT+T-1}\mathbb{E}\big[\big\langle \mathbf{c}^{(k)}, \mathbf{u}(t)\big\rangle\big] \le \frac{T\epsilon^{(k)}}{\sqrt{L}}\big\langle c^{(k)}, \mathbf{1}s_{\max}\big\rangle. \qquad(5.42)$$
For the cross term, define $\widecheck{\mathbf{u}} = (u_{mi})_{m \in L^{(k)}_{++}}$, and define the projections $\widecheck{\mathbf{q}}^{(k)}_{\parallel} = \langle \mathbf{c}^{(k)}, \widecheck{\mathbf{q}}\rangle\mathbf{c}^{(k)}$ and $\widecheck{\mathbf{q}}^{(k)}_{\perp} = \widecheck{\mathbf{q}} - \widecheck{\mathbf{q}}^{(k)}_{\parallel}$; similarly define $\widecheck{\mathbf{u}}^{(k)}_{\parallel}$ and $\widecheck{\mathbf{u}}^{(k)}_{\perp}$. Then, in steady state, for all times $t$ we have
$$\begin{aligned}
\mathbb{E}\Big[\big\langle \mathbf{c}^{(k)}, \mathbf{q}(t) + \mathbf{a}(t) - \mathbf{s}(t)\big\rangle\big\langle \mathbf{c}^{(k)}, \mathbf{u}(t)\big\rangle\Big]
&= \mathbb{E}\Big[\big\langle \mathbf{c}^{(k)}, \mathbf{q}(t+1)\big\rangle\big\langle \mathbf{c}^{(k)}, \mathbf{u}(t)\big\rangle\Big] - \mathbb{E}\Big[\big\langle \mathbf{c}^{(k)}, \mathbf{u}(t)\big\rangle^2\Big]\\
&\le \mathbb{E}\big[\big\langle \widecheck{\mathbf{q}}(t+1), \widecheck{\mathbf{u}}(t)\big\rangle\big] + \mathbb{E}\big[\big\langle \widecheck{\mathbf{q}}^{(k)}_{\perp}(t+1), \widecheck{\mathbf{u}}^{(k)}_{\perp}(t)\big\rangle\big] \qquad(5.43)\\
&\le D_{\max}s_{\max}\,\mathbb{E}\big[\langle \mathbf{1}, \widecheck{\mathbf{u}}(t)\rangle\big] + \sqrt{N^{(k)}_2\, s_{\max}\,\mathbb{E}\big[\langle \mathbf{1}, \widecheck{\mathbf{u}}(t)\rangle\big]}, \qquad(5.44)
\end{aligned}$$
using the Cauchy-Schwarz inequality, $\mathbb{E}\big[\|\widecheck{\mathbf{q}}^{(k)}_{\perp}(t+1)\|^2\big] \le N^{(k)}_2$ from state-space collapse, and $\|\widecheck{\mathbf{u}}(t)\|^2 \le s_{\max}\langle \mathbf{1}, \widecheck{\mathbf{u}}(t)\rangle$. Note that
$$\mathbb{E}\big[\langle \mathbf{1}, \widecheck{\mathbf{u}}(t)\rangle\big] \le \frac{1}{c^{(k)}_{\min}}\,\mathbb{E}\big[\big\langle \mathbf{c}^{(k)}, \widecheck{\mathbf{u}}(t)\big\rangle\big] = \frac{1}{c^{(k)}_{\min}}\,\mathbb{E}\big[\big\langle \mathbf{c}^{(k)}, \mathbf{u}(t)\big\rangle\big] \le \frac{\epsilon^{(k)}}{\sqrt{L}\,c^{(k)}_{\min}},$$
where $c^{(k)}_{\min} = \min_{m \in L^{(k)}_{++}} c^{(k)}_m > 0$ and the last inequality follows from (5.38). Thus, we have
$$\sum_{t=nT}^{nT+T-1}\mathbb{E}\Big[\big\langle \mathbf{c}^{(k)}, \mathbf{q}(t) + \mathbf{a}(t) - \mathbf{s}(t)\big\rangle\big\langle \mathbf{c}^{(k)}, \mathbf{u}(t)\big\rangle\Big] \le \frac{T D_{\max}s_{\max}\,\epsilon^{(k)}}{\sqrt{L}\,c^{(k)}_{\min}} + T\sqrt{\frac{N^{(k)}_2\, s_{\max}\,\epsilon^{(k)}}{\sqrt{L}\,c^{(k)}_{\min}}}. \qquad(5.45)$$
We will now consider the left-hand-side term in (5.34). Given that the arrival rate is $\lambda$, for every $t$ in steady state we have
$$\mathbb{E}\Big[\big\langle \mathbf{c}^{(k)}, \mathbf{q}(t)\big\rangle\big\langle \mathbf{c}^{(k)}, \mathbf{s}(t) - \mathbf{a}(t)\big\rangle\Big] = \frac{\epsilon^{(k)}}{\sqrt{L}}\,\mathbb{E}\big[\big\langle \mathbf{c}^{(k)}, \mathbf{q}(t)\big\rangle\big] - \mathbb{E}\left[\big\|\mathbf{q}^{(k)}_{\parallel}(t)\big\|\left(\frac{b^{(k)}}{\sqrt{L}} - \big\langle \mathbf{c}^{(k)}, \mathbf{s}(t)\big\rangle\right)\right]. \qquad(5.46)$$
Now, we will bound the last term in this equation, summed over $T$ time slots in steady state, using the following lemma.

Lemma 5.5. In steady state, we have
$$\sum_{t=nT}^{nT+T-1}\mathbb{E}\left[\big\|\mathbf{q}^{(k)}_{\parallel}(t)\big\|\left(\frac{b^{(k)}}{\sqrt{L}} - \big\langle \mathbf{c}^{(k)}, \mathbf{s}(t)\big\rangle\right)\right] \le T\sqrt{K_{33} + \cot^2\big(\theta^{(k)}\big)\,K_{34}}\;\sqrt{\frac{\epsilon^{(k)}}{L\gamma^{(k)}}\Big(\big(b^{(k)}\big)^2 + \langle c, s_{\max}\mathbf{1}\rangle^2\Big)}, \qquad(5.47)$$
where $K_{33} = MLT^2 s^2_{\max} + 2\sqrt{M}LT s_{\max}\big(\|\lambda^{(\epsilon)}\| + \|\sigma\|^2 + T^2\|\lambda^{(\epsilon)}\|^2 + T^2\|\sigma\|^2\big)$ and $K_{34} = N^{(k)}_2 + 4N^{(k)}_1\sqrt{K_{33}} + T\big(\|\lambda^{(\epsilon)}\| + \|\sigma\|\big)^2 + MLT s_{\max} + 4K_{33}$.
Proof. The idea is that $\big\|\mathbf{q}^{(k)}_{\parallel}(nT)\big\| \le \cot\big(\angle_{\mathbf{q}(nT)\mathbf{c}^{(k)}}\big)\big\|\mathbf{q}^{(k)}_{\perp}(nT)\big\| \le \cot\big(\theta^{(k)}\big)\big\|\mathbf{q}^{(k)}_{\perp}(nT)\big\|$ whenever $\angle_{\mathbf{q}(nT)\mathbf{c}^{(k)}} > \theta^{(k)}$. Intuitively, this means that, in steady state, when $\angle_{\mathbf{q}(nT)\mathbf{c}^{(k)}} > \theta^{(k)}$, $\mathbf{q}(nT)$ must be small; otherwise, it would contradict the state-space collapse result that $\mathbf{q}^{(k)}_{\perp}(nT)$ is small. Here is the precise argument, split into three cases.

Case (i): $\angle_{\mathbf{q}(nT)\mathbf{c}^{(k)}} > \theta^{(k)}$. Using (5.39) and the Cauchy-Schwarz inequality,
$$\begin{aligned}
\sum_{t=nT}^{nT+T-1}\mathbb{E}\left[\big\|\mathbf{q}^{(k)}_{\parallel}(t)\big\|\left(\frac{b^{(k)}}{\sqrt{L}} - \big\langle \mathbf{c}^{(k)}, \mathbf{s}(t)\big\rangle\right)\right]
&\le \sum_{t=nT}^{nT+T-1}\mathbb{E}\left[\left(\big\|\mathbf{q}^{(k)}_{\parallel}(nT)\big\| + \sum_{t'=nT}^{nT+T-1}\|\mathbf{a}(t')\| + MLT s_{\max}\right)\left(\frac{b^{(k)}}{\sqrt{L}} - \big\langle \mathbf{c}^{(k)}, \mathbf{s}(t)\big\rangle\right)\right]\\
&\le \sqrt{\cot^2\big(\theta^{(k)}\big)\,\mathbb{E}\big[\big\|\mathbf{q}^{(k)}_{\perp}(nT)\big\|^2\big] + \mathbb{E}\Big[\Big(\sum_{t'}\|\mathbf{a}(t')\| + MLT s_{\max}\Big)^2\Big]}\;\sum_{t=nT}^{nT+T-1}\sqrt{\mathbb{E}\left[\left(\frac{b^{(k)}}{\sqrt{L}} - \big\langle \mathbf{c}^{(k)}, \mathbf{s}(t)\big\rangle\right)^2\right]} \qquad(5.48)\\
&\le T\sqrt{K_{33} + \cot^2\big(\theta^{(k)}\big)\,N^{(k)}_2}\;\sqrt{\frac{\epsilon^{(k)}}{L\gamma^{(k)}}\Big(\big(b^{(k)}\big)^2 + \langle c, s_{\max}\mathbf{1}\rangle^2\Big)}, \qquad(5.49)
\end{aligned}$$
using (5.37) in the last step.

Case (ii): $\angle_{\mathbf{q}(t)\mathbf{c}^{(k)}} \le \theta^{(k)}$ for all $t \in \{nT, \ldots, nT+T-1\}$. By Claim 5.3, we have that $\big\langle \mathbf{c}^{(k)}, \mathbf{s}(nT)\big\rangle = \frac{b^{(k)}}{\sqrt{L}}$. One can then inductively argue as follows for other times $t$. Suppose that at time $t$, $\big\langle \mathbf{c}^{(k)}, \mathbf{s}(t)\big\rangle = \frac{b^{(k)}}{\sqrt{L}}$. Then at time $t+1$, $\mathbf{s}(t)$ is still feasible and it maximizes $\langle \mathbf{q}(t+1), \mathbf{s}\rangle$ since $\angle_{\mathbf{q}(t+1)\mathbf{c}^{(k)}} \le \theta^{(k)}$. Therefore, myopic MaxWeight chooses a schedule such that $\big\langle \mathbf{c}^{(k)}, \mathbf{s}(t+1)\big\rangle = \frac{b^{(k)}}{\sqrt{L}}$. Therefore, in this case,
$$\sum_{t=nT}^{nT+T-1}\mathbb{E}\left[\big\|\mathbf{q}^{(k)}_{\parallel}(t)\big\|\left(\frac{b^{(k)}}{\sqrt{L}} - \big\langle \mathbf{c}^{(k)}, \mathbf{s}(t)\big\rangle\right)\right] = 0.$$

Case (iii): $\angle_{\mathbf{q}(nT)\mathbf{c}^{(k)}} \le \theta^{(k)}$ but $\angle_{\mathbf{q}(t)\mathbf{c}^{(k)}} > \theta^{(k)}$ for some $t \in \{nT+1, \ldots, nT+T-1\}$. Let $t_0$ denote the first time $\angle_{\mathbf{q}(t)\mathbf{c}^{(k)}}$ exceeds $\theta^{(k)}$. Then, similar to Case (ii), up to time $t_0$, a schedule is chosen so that $\big\langle \mathbf{c}^{(k)}, \mathbf{s}(t)\big\rangle = \frac{b^{(k)}}{\sqrt{L}}$, and the remaining terms can be bounded similar to Case (i). As in Case (i), we have that $\big\|\mathbf{q}^{(k)}_{\parallel}(t_0)\big\| \le \big\|\mathbf{q}^{(k)}_{\perp}(t_0)\big\|\cot\theta^{(k)}$. Also, note that from (5.21), the bound in (5.39) is valid for $\big\|\mathbf{q}^{(k)}_{\parallel}(t) - \mathbf{q}^{(k)}_{\parallel}(t_0)\big\|$ too. We then bound $\mathbb{E}\big[\big\|\mathbf{q}^{(k)}_{\perp}(t_0)\big\|^2\big]$ using state-space collapse:
$$\mathbb{E}\big[\big\|\mathbf{q}^{(k)}_{\perp}(t_0)\big\|^2\big] \le \mathbb{E}\left[\left(\big\|\mathbf{q}^{(k)}_{\perp}(nT)\big\| + 2\sum_{t=nT}^{nT+T-1}\|\mathbf{a}(t)\| + 2MLT s_{\max}\right)^2\right] \le N^{(k)}_2 + 4N^{(k)}_1\sqrt{T\big(\|\lambda^{(\epsilon)}\| + \|\sigma\|\big)^2 + MLT s_{\max}} + 4K_{33}.$$
The last inequality follows from the state-space collapse result, Proposition 5.1: $\mathbb{E}\big[\big\|\mathbf{q}^{(k)}_{\perp}(nT)\big\|^2\big] \le N^{(k)}_2$ and $\mathbb{E}\big[\big\|\mathbf{q}^{(k)}_{\perp}(nT)\big\|\big] \le N^{(k)}_1$. Combining the three cases, we have the lemma.
Now, substituting (5.46), (5.47), (5.41), (5.42) and (5.45) in (5.36), we get
$$\epsilon^{(k)}\sum_{t=nT}^{nT+T-1}\mathbb{E}\big[\big\langle \mathbf{c}^{(k)}, \mathbf{q}(t)\big\rangle\big] \le \frac{T\,\zeta^{(\epsilon,k)}}{2} + T B^{(\epsilon,k)}_2,$$
where
$$B^{(\epsilon,k)}_2 = \frac{\epsilon^{(k)}}{2\gamma^{(k)}}\Big(\big(b^{(k)}\big)^2 + \langle c, s_{\max}\mathbf{1}\rangle^2\Big) + \frac{\epsilon^{(k)}}{2}\big\langle c^{(k)}, \mathbf{1}s_{\max}\big\rangle + \frac{D_{\max}s_{\max}\,\epsilon^{(k)}}{c^{(k)}_{\min}} + \sqrt{\frac{\epsilon^{(k)}}{\gamma^{(k)}}\Big(\big(b^{(k)}\big)^2 + \langle c, s_{\max}\mathbf{1}\rangle^2\Big)}\sqrt{K_{33} + \cot^2\big(\theta^{(k)}\big)K_{34}} + \sqrt{\frac{L N^{(k)}_2\, s_{\max}\,\epsilon^{(k)}}{c^{(k)}_{\min}}}.$$
We will now use (5.39) to get a bound on $\mathbb{E}\big[\langle \mathbf{c}^{(k)}, \mathbf{q}(t)\rangle\big]$ in steady state for any time $t$. Let $nT \le t \le nT+T-1$. Then,
$$\epsilon^{(k)}\,\mathbb{E}\big[\big\langle \mathbf{c}^{(k)}, \mathbf{q}(t)\big\rangle\big] \le \epsilon^{(k)}\,\mathbb{E}\big[\big\langle \mathbf{c}^{(k)}, \mathbf{q}(nT)\big\rangle\big] + \epsilon^{(k)}\,\mathbb{E}\big[\big\|\mathbf{q}^{(k)}_{\parallel}(t) - \mathbf{q}^{(k)}_{\parallel}(nT)\big\|\big] \le \frac{\zeta^{(\epsilon,k)}}{2} + B^{(\epsilon,k)}_2 + 2\epsilon^{(k)}\,\mathbb{E}\left[\sum_{t'=nT}^{nT+T-1}\|\mathbf{a}(t')\|\right] + 2\epsilon^{(k)} MLT s_{\max} \triangleq \frac{\zeta^{(\epsilon,k)}}{2} + B^{(\epsilon,k)}_3, \qquad(5.51)$$
where $B^{(\epsilon,k)}_3 \to 0$ as $\epsilon^{(k)} \downarrow 0$. Thus, in the heavy-traffic limit,
$$\lim_{\epsilon^{(k)} \downarrow 0}\epsilon^{(k)}\,\mathbb{E}\big[\big\langle \mathbf{c}^{(k)}, \mathbf{q}(t)\big\rangle\big] \le \frac{\zeta^{(k)}}{2},$$
which matches the lower bound (5.8).
We have so far, obtained heavy traffic optimality of c(k) , q(t) , which depends on the vector c(k) . In other words, we have optimality of a particular
linear combination of queue lengths and we do not have a choice on this linear
combination. In this subsection, we will extend the heavy traffic optimality
result to ||q(t)||2 . We will do that by first obtaining lower and upper bounds
n
on c(k) , q(t) for any n 1.
Proposition 5.2. Consider the cloud computing system described in Section 5.1. Under the same conditions as in Theorem 5.2, for any $n \ge 1$, we have
$$
n!\left(\frac{\zeta^{(\epsilon,k)}}{2}\right)^{\!n} - B_{1,n}^{(\epsilon,k)} \le (\epsilon^{(k)})^n\, \mathbb{E}\left[\left\langle c^{(k)}, q^{(\epsilon)}(t)\right\rangle^n\right] \le n!\left(\frac{\zeta^{(\epsilon,k)}}{2}\right)^{\!n} + B_{2,n}^{(\epsilon,k)},
$$
where $\zeta^{(\epsilon,k)} = \frac{1}{L}\left\langle (c^{(k)})^2, (\sigma^{(\epsilon)})^2\right\rangle$ and $B_{1,n}^{(\epsilon,k)}, B_{2,n}^{(\epsilon,k)}$ are $o\!\left(\frac{1}{\epsilon^{(k)}}\right)$. The lower bound is a universal lower bound applicable to all resource allocation algorithms. The upper bound is attained by Algorithm 9. Thus, in the heavy traffic limit,
$$
\lim_{\epsilon^{(k)} \to 0} (\epsilon^{(k)})^n\, \mathbb{E}\left[\left\langle c^{(k)}, q^{(\epsilon)}(t)\right\rangle^n\right] = n!\left(\frac{\zeta^{(k)}}{2}\right)^{\!n},
$$
where $\zeta^{(k)} = \frac{1}{L}\left\langle (c^{(k)})^2, (\sigma)^2\right\rangle$.
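The limits in Proposition 5.2 follow the moment pattern $\mathbb{E}[X^n] = n!\,\theta^n$ of an exponential random variable with mean $\theta$, here with $\theta = \zeta^{(k)}/2$, consistent with the classical heavy-traffic picture of an exponentially distributed scaled queue length. The snippet below is only a Monte Carlo check of that moment identity on a toy exponential variable (the value of `theta` is arbitrary), not a simulation of the queueing system:

```python
import math
import random

# Check E[X^n] = n! * theta^n for X ~ Exp(mean = theta) by Monte Carlo.
# theta plays the role of zeta/2 in Proposition 5.2; its value here is arbitrary.
random.seed(1)
theta = 2.0
samples = [random.expovariate(1.0 / theta) for _ in range(200_000)]

for n in (1, 2, 3):
    empirical = sum(x ** n for x in samples) / len(samples)
    exact = math.factorial(n) * theta ** n
    print(n, round(empirical, 2), exact)
```

The empirical moments agree with $n!\,\theta^n$ to within Monte Carlo error, matching the pattern of the heavy-traffic limits above.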
The proof is based on the three-step procedure used in the previous subsections. The first step is to prove the lower bound in Proposition 5.2. It follows directly from Lemma 10 in [37], which is a bound on the $n$th power of a single server queue, along the lines of (5.7). State space collapse as stated in Proposition 5.1 is applicable here. The upper bound is obtained by first setting the drift of the following Lyapunov function to zero in steady state:
$$
V_{n\|}^{(k)}(q) \triangleq \left\langle c^{(k)}, q^{(\epsilon)} \right\rangle^{n} = \big\|q_{\|}^{(k)}\big\|^{n} = \frac{1}{(\sqrt{L})^{n}}\left(\sum_{i=1}^{L}\sum_{m=1}^{M} q_{mi}\, c_m\right)^{\!n}.
$$
This yields
$$
(n+1)\,\epsilon^{(k)} \sum_{t=nT}^{nT+T-1} \mathbb{E}\left[\left\langle c^{(k)}, q(t)\right\rangle^{n}\right] \qquad (5.52)
$$
$$
\le \sum_{t=nT}^{nT+T-1}\sum_{j=0}^{n-1} \binom{n+1}{j} (\epsilon^{(k)})^{n-1-j}\, \mathbb{E}\left[\left\langle c^{(k)}, q(t)\right\rangle^{j} \left(\left\langle c^{(k)}, a(t)\right\rangle - b^{(k)}\right)^{n-j+1}\right] \qquad (5.53)
$$
$$
+ \sum_{t=nT}^{nT+T-1}\sum_{j=0}^{n} \binom{n+1}{j} (\epsilon^{(k)})^{n-1-j}\, \mathbb{E}\left[\left(\left\langle c^{(k)}, q(t) + a(t)\right\rangle - b^{(k)}\right)^{j} \left(b^{(k)} - \left\langle c^{(k)}, s(t)\right\rangle\right)^{n-j+1}\right] \qquad (5.54)
$$
$$
+ \sum_{t=nT}^{nT+T-1}\sum_{j=0}^{n} \binom{n+1}{j} (\epsilon^{(k)})^{n-1-j}\, \mathbb{E}\left[\left(\left\langle c^{(k)}, q(t) + a(t) - s(t)\right\rangle - b^{(k)}\right)^{j} \left\langle c^{(k)}, u(t)\right\rangle^{n-j+1}\right]. \qquad (5.55)
$$
Each of these terms is then bounded, similar to the previous subsection and Appendices C and D of [37]. We skip the details here. Unlike the proof of Theorem 5.2, we need the assumption that $a_m(t) \le a_{\max}$ in the proof of heavy traffic optimality in Proposition 5.2. We can now obtain heavy traffic optimality of the second moment of queue lengths:
$$
2\left(\frac{\zeta^{(\epsilon,k)}}{2}\right)^{\!2} - B_{1,2}^{(\epsilon,k)} \le (\epsilon^{(k)})^2\, \mathbb{E}\left[\big\|q^{(\epsilon)}(t)\big\|^{2}\right] \le 2\left(\frac{\zeta^{(\epsilon,k)}}{2}\right)^{\!2} + B_{3,2}^{(\epsilon,k)},
$$
and so, in the heavy traffic limit,
$$
\lim_{\epsilon^{(k)} \to 0} (\epsilon^{(k)})^2\, \mathbb{E}\left[\big\|q^{(\epsilon)}(t)\big\|^{2}\right] = \frac{(\zeta^{(k)})^{2}}{2}.
$$
The lower bound is a universal lower bound applicable to all resource allocation algorithms. The upper bound is attained by Algorithm 9. Thus, Algorithm 9 is second moment heavy traffic optimal.
Proof. From the Pythagorean theorem, we have
$$
\big\|q^{(\epsilon)}(t)\big\|^{2} = \big\|q_{\|}^{(\epsilon)}(t)\big\|^{2} + \big\|q_{\perp}^{(\epsilon)}(t)\big\|^{2}.
$$
The result then follows from Proposition 5.2 and the state space collapse in Proposition 5.1, which gives $\mathbb{E}\left[\big\|q_{\perp}^{(\epsilon)}(t)\big\|^{2}\right] \le N_2^{(k)}$.
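The decomposition used in this proof can be sketched numerically: project the queue-length vector onto the direction $c$ and verify the Pythagorean identity. The vectors below are made-up values for illustration; `c` stands in for any unit vector:

```python
# Decompose a queue-length vector into components parallel and perpendicular
# to a unit direction c, as in the proof above. Values are illustrative only.
q = [3.0, 1.0, 4.0, 1.0]
c = [0.5, 0.5, 0.5, 0.5]          # unit vector: sum of squares is 1

dot = sum(qi * ci for qi, ci in zip(q, c))
q_par = [dot * ci for ci in c]                    # projection onto c
q_perp = [qi - pi for qi, pi in zip(q, q_par)]    # residual component

sq = lambda v: sum(x * x for x in v)
# Pythagorean identity: ||q||^2 = ||q_par||^2 + ||q_perp||^2
print(sq(q), sq(q_par) + sq(q_perp))   # both equal 27.0 here
```

State space collapse says the perpendicular component `q_perp` stays bounded as the load increases, so in heavy traffic $\|q\|^2$ is dominated by the parallel component.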
In the following section, we will study the power-of-two-choices routing algorithm, which is easier to implement than JSQ.

All the type $m$ job arrivals in this time slot are then routed to the server with the smaller backlogged workload among these two, i.e.,
$$
i_m(t) = \arg\min_{i \in \{i^1_m(t),\, i^2_m(t)\}} q_{mi}(t).
$$
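The routing rule above can be sketched as follows; this is a minimal illustrative helper (the function name and data layout are hypothetical, not the dissertation's code):

```python
import random

def route_power_of_two(queues, rng=random):
    """Power-of-two-choices routing sketch: sample two distinct servers
    uniformly at random and return the index of the one with the smaller
    backlogged workload. `queues` is a list of per-server workloads."""
    i1, i2 = rng.sample(range(len(queues)), 2)
    return i1 if queues[i1] <= queues[i2] else i2

rng = random.Random(0)
queues = [5.0, 0.0, 7.0, 3.0]
chosen = route_power_of_two(queues, rng)
print(chosen)
```

Compared with JSQ, this rule needs the workloads of only two sampled servers per arrival rather than all $L$ of them, which is what makes it easier to implement.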
Then, we have that the cloud computing system is heavy traffic optimal.
In other words, we have the following result.
Theorem 5.4. Theorems 5.1 and 5.2 hold when power-of-two-choices routing
is used instead of JSQ routing in Algorithm 9.
Proof. The proof of Theorem 5.1 when the power-of-two-choices algorithm is used is very similar to the proof of Theorem 2.2 in Chapter 2, and so we skip it. The proof of this theorem is similar to that of Theorem 5.2, so here we present only the differences that arise in the proof due to power-of-two-choices routing.
We will use the same Lyapunov functions defined in the proof of Proposition 5.2. We will first bound the one-step drift of the Lyapunov function $V(.)$. Note that the bound (5.15) on the drift of $V(.)$ is valid. Recall that we use $q^{(\epsilon)}$ for $q(Y^{(\epsilon)})$.
$$
\mathbb{E}\left[\Delta V(Y^{(\epsilon)}) \,\middle|\, Y^{(\epsilon)}(t) = Y^{(\epsilon)}\right] \le K_{27} + \mathbb{E}_{Y^{(\epsilon)}}\left[\sum_{i=1}^{L}\sum_{m=1}^{M} 2 q^{(\epsilon)}_{mi}\left(a_{mi}(t) - s_{im}(t)\right)\right]. \qquad (5.56)
$$
The arrival term here can be bounded as follows. Let $X_j = (i^1_m, i^2_m)$ denote the two servers randomly chosen by the power-of-two-choices algorithm for routing of type $m$ jobs.
$$
\mathbb{E}_{Y^{(\epsilon)}}\left[\sum_{m=1}^{M}\sum_{i=1}^{L} 2 q^{(\epsilon)}_{mi}\, a_{mi}(t)\right] = \mathbb{E}\left[\mathbb{E}\left[\sum_{m=1}^{M}\sum_{i=1}^{L} 2 q^{(\epsilon)}_{mi}\, a_{mi}(t) \,\middle|\, Y(t) = Y^{(\epsilon)}, X_j = (i^1_m, i^2_m)\right] \middle|\, Y(t) = Y^{(\epsilon)}\right]
$$
$$
\stackrel{(a)}{=} \mathbb{E}\left[\sum_{m=1}^{M} 2 a_m(t) \min\{q^{(\epsilon)}_{mi^1_m}, q^{(\epsilon)}_{mi^2_m}\} \,\middle|\, Y(t) = Y^{(\epsilon)}\right] = \sum_{m=1}^{M} \lambda_m \frac{1}{\binom{L}{2}} \sum_{i < i'} 2\min\{q^{(\epsilon)}_{mi}, q^{(\epsilon)}_{mi'}\}
$$
$$
\stackrel{(b)}{\le} \sum_{m=1}^{M} \lambda_m \left(\frac{1}{L}\sum_{i=1}^{L} 2 q^{(\epsilon)}_{mi} - \frac{1}{\binom{L}{2}}\left(q^{(\epsilon)}_{m\,\max} - q^{(\epsilon)}_{m\,\min}\right)\right),
$$
where $q^{(\epsilon)}_{m\,\max} = \max_i q^{(\epsilon)}_{mi}$ and $q^{(\epsilon)}_{m\,\min} = \min_i q^{(\epsilon)}_{mi}$. Equation (a) follows from the definition of power-of-two-choices routing, and (b) follows from the fact that $2\min\{q^{(\epsilon)}_{mi^1_m}, q^{(\epsilon)}_{mi^2_m}\} \le q^{(\epsilon)}_{mi^1_m} + q^{(\epsilon)}_{mi^2_m}$ for all $i^1_m$ and $i^2_m$, and $2\min\{q^{(\epsilon)}_{m\,\max}, q^{(\epsilon)}_{m\,\min}\} \le (q^{(\epsilon)}_{m\,\max} + q^{(\epsilon)}_{m\,\min}) - (q^{(\epsilon)}_{m\,\max} - q^{(\epsilon)}_{m\,\min})$. Then, from (5.56), we have
$$
\mathbb{E}\left[\Delta V(Y^{(\epsilon)}) \,\middle|\, Y^{(\epsilon)} = Y^{(\epsilon)}\right] \le K_{27} - \sum_{m=1}^{M} \lambda_m \frac{1}{\binom{L}{2}}\left(q^{(\epsilon)}_{m\,\max} - q^{(\epsilon)}_{m\,\min}\right) + 2\,\mathbb{E}_{q^{(\epsilon)}}\left[\sum_{m=1}^{M} \lambda_m \sum_{i=1}^{L} \frac{q^{(\epsilon)}_{mi}}{L} - \sum_{i=1}^{L}\sum_{m=1}^{M} q^{(\epsilon)}_{mi}\, s_{im}(t)\right]. \qquad (5.57)
$$
The terms in (5.57) are identical to the ones in (5.17) and so can be bounded by (5.18). Moreover, the bound on the one-step drift of the Lyapunov function $V_{\|}(.)$ in (5.14), as well as the bound on the drift $\widetilde{W}^{(k)}(\widetilde{Y}^{(\epsilon)})$ in (5.10), are valid here since they do not depend on the routing policy. Substituting (5.14), (5.57) and (5.18) in (5.10), and using $\widetilde{q}^{(\epsilon)}$ for $q(\widetilde{Y}^{(\epsilon)})$, we get
$$
\mathbb{E}\left[\Delta \widetilde{W}^{(k)}(\widetilde{Y}^{(\epsilon)}) \,\middle|\, Y^{(\epsilon)}(nT) = \widetilde{Y}^{(\epsilon)}\right] \le K_{28} - \mathbb{E}\left[\sum_{t=nT}^{nT+T-1}\sum_{m=1}^{M} \lambda_m \frac{1}{\binom{L}{2}}\, \frac{1}{2\big\|\widetilde{q}_{\perp}^{(\epsilon,k)}\big\|}\left(q^{(\epsilon)}_{m\,\max(t)}(t) - q^{(\epsilon)}_{m\,\min(t)}(t)\right)\right] \qquad (5.58)
$$
$$
+ 2\sum_{t=nT}^{nT+T-1} \mathbb{E}_{Y^{(\epsilon)}}\left[\sum_{m=1}^{M} \lambda_m \sum_{i=1}^{L} \frac{q^{(\epsilon)}_{mi}}{L} - \sum_{i=1}^{L}\sum_{m=1}^{M} q^{(\epsilon)}_{mi}\, s_{im}(t) \,\middle|\, Y^{(\epsilon)}(nT) = \widetilde{Y}^{(\epsilon)}\right]. \qquad (5.59)
$$
In the notation $q^{(\epsilon)}_{m\,\max(t)}(t_0)$, $\max(t)$ denotes the server with the highest workload at time $t$, even though we are interested in the workload at time $t_0$. We will first bound the terms in (5.19). Using (5.21), we have
$$
q^{(\epsilon)}_{m\,\min(t)}(t) \overset{(a)}{\ge} q^{(\epsilon)}_{m\,\min(nT)}(nT) - \sum_{t'=nT}^{nT+T-1} \left(a_m(t') + s_{\max}\right), \qquad q^{(\epsilon)}_{m\,\max(t)}(t) \overset{(b)}{\le} q^{(\epsilon)}_{m\,\max(nT)}(nT) + \sum_{t'=nT}^{nT+T-1} \left(a_m(t') + s_{\max}\right),
$$
where (a) follows from the fact that the server $\min(t)$ has the smallest workload at time $t$ for type-$m$ jobs. Using $\mathbb{E}_{\widetilde{Y}^{(\epsilon)}}[\,.\,]$ for $\mathbb{E}[\,.\,|\,Y^{(\epsilon)}(nT) = \widetilde{Y}^{(\epsilon)}]$, we bound the terms in (5.58). As in Section 5.3.2, we will assume that the arrival rate $\lambda^{(\epsilon)}$ is such that there exists a $\delta > 0$ such that $\lambda^{(\epsilon)}_m \ge \delta$ for all $m$.
$$
\sum_{t=nT}^{nT+T-1}\sum_{m=1}^{M} \lambda^{(\epsilon)}_m \frac{1}{\binom{L}{2}}\, \mathbb{E}_{\widetilde{Y}^{(\epsilon)}}\left[q^{(\epsilon)}_{m\,\max(t)}(t) - q^{(\epsilon)}_{m\,\min(t)}(t)\right]
$$
$$
\ge \sum_{t=nT}^{nT+T-1}\sum_{m=1}^{M} \lambda^{(\epsilon)}_m \frac{1}{\binom{L}{2}}\left( \mathbb{E}_{\widetilde{Y}^{(\epsilon)}}\left[q^{(\epsilon)}_{m\,\max(nT)}(nT) - q^{(\epsilon)}_{m\,\min(nT)}(nT)\right] - 2T\lambda^{(\epsilon)}_m - 2Ts_{\max}\right)
$$
$$
\ge \sum_{t=nT}^{nT+T-1}\sum_{m=1}^{M} \lambda^{(\epsilon)}_m \frac{1}{\binom{L}{2}}\left(\widetilde{q}^{(\epsilon)}_{m\,\max} - \widetilde{q}^{(\epsilon)}_{m\,\min}\right) - K_{29}
$$
$$
\ge \sum_{t=nT}^{nT+T-1}\sum_{m=1}^{M} \lambda^{(\epsilon)}_m \frac{1}{\binom{L}{2}} \sqrt{\frac{2}{L}\sum_{i=1}^{L}\left(\widetilde{q}^{(\epsilon)}_{mi} - \frac{1}{L}\sum_{i'=1}^{L} \widetilde{q}^{(\epsilon)}_{mi'}\right)^{2}} - K_{29}
$$
$$
\ge \delta''\sum_{t=nT}^{nT+T-1} \sqrt{\frac{2}{L}\sum_{m=1}^{M}\sum_{i=1}^{L}\left(\widetilde{q}^{(\epsilon)}_{mi} - \frac{1}{L}\sum_{i'=1}^{L} \widetilde{q}^{(\epsilon)}_{mi'}\right)^{2}} - K_{29},
$$
where $K_{29}$ was already defined as $K_{29} = 4T^{2}\sum_{m=1}^{M}(\lambda^{(\epsilon)}_m)^{2} + 4T s_{\max}\sum_{m=1}^{M}\lambda^{(\epsilon)}_m$, and $\delta'' = \frac{\delta(M-1)}{L}$. This term is the same as the term in (5.24) with $\delta''$ instead of $\delta$, and can then be bounded by the term in (5.25) with $\delta''$ instead of $\delta$. Noting that the terms in (5.59) are identical to the ones in (5.20), we can bound them using (5.26) and (5.31) as in Section 5.3.2, and we get (5.32) with $\delta''$ instead of $\delta$. Note that the remainder of the proof of state-space collapse in Section 5.3.2 does not use the routing policy and is valid when $\delta$ is replaced with $\delta''$. Moreover, the proofs of the lower bound in Section 5.3.1 and the upper bound in Section 5.3.3 are also valid here. Thus, the proof of Theorem 5.4 is complete.
Similarly, we have the following result on heavy traffic optimality of $\mathbb{E}\left[\big\|q^{(\epsilon)}(t)\big\|^2\right]$ under power-of-two-choices routing, which we state without proof.

Proposition 5.3. Proposition 5.2 and Theorem 5.3 hold when the power-of-two-choices routing algorithm is used instead of JSQ routing in Algorithm 9.
5.5 Conclusions
In this chapter, we studied the resource allocation problem assuming that
the job sizes are known and preemption is allowed once every few time slots.
We presented two algorithms based on myopic MaxWeight scheduling combined with join-the-shortest-queue or power-of-two-choices routing. We have shown that these algorithms are not only throughput optimal, but also delay optimal in the heavy traffic regime when all the servers are assumed to be identical.
CHAPTER 6
CONCLUSION
APPENDIX A
PROOF OF LEMMA 3.6
Since $G(.)$ is a strictly increasing bijective convex function on the open interval $(0, \infty)$, it is easy to see that $G^{-1}(.)$ is a strictly increasing concave function on $(0, \infty)$. Thus, for any two positive real numbers $v_2$ and $v_1$,
$$
G^{-1}(v_2) - G^{-1}(v_1) \le (v_2 - v_1)\left(G^{-1}(v_1)\right)',
$$
where $(.)'$ denotes the derivative. Let $u = G^{-1}(v)$. Then,
$$
v = G(u), \qquad \frac{dv}{du} = G'(u) = g(u) = g(G^{-1}(v)), \qquad \frac{du}{dv} = \frac{1}{g(G^{-1}(v))}.
$$
Since $\frac{du}{dv} = \left(G^{-1}(v)\right)' = \frac{1}{g(G^{-1}(v))}$, we have
$$
G^{-1}(v_2) - G^{-1}(v_1) \le \frac{v_2 - v_1}{g(G^{-1}(v_1))}.
$$
Using $V(Q^{(1)})$ and $V(Q^{(2)})$ for $v_1$ and $v_2$, we get the lemma.
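The concavity bound above can be sanity-checked numerically with a concrete convex function. The choice $G(u) = e^u - 1$ below (so $g(u) = e^u$ and $G^{-1}(v) = \log(1+v)$) is a hypothetical stand-in, not the $G(.)$ of Lemma 3.6:

```python
import math

# Check G^{-1}(v2) - G^{-1}(v1) <= (v2 - v1) / g(G^{-1}(v1)) for a sample
# convex increasing bijection G(u) = e^u - 1 on (0, inf). Here
# g(u) = G'(u) = e^u and G^{-1}(v) = log(1 + v).
G_inv = lambda v: math.log1p(v)
g = math.exp

for v1, v2 in [(0.5, 3.0), (3.0, 0.5), (1.0, 1.0001), (10.0, 0.1)]:
    lhs = G_inv(v2) - G_inv(v1)
    rhs = (v2 - v1) / g(G_inv(v1))
    print(lhs <= rhs + 1e-12)   # True in every case: tangent line of a
                                # concave function lies above the function
```

The inequality holds on both sides of $v_1$ because a tangent line to a concave function dominates the function everywhere, which is exactly the step the lemma relies on.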
APPENDIX B
PROOF OF LEMMA 3.7
Since the arithmetic mean is at least as large as the geometric mean and since $G(.)$ is strictly increasing, we have
$$
G\left(\left(\prod_{i,m}(1 + Q_{mi})\right)^{\frac{1}{LM}} - 1\right) \le G\left(\frac{\sum_{i,m}(1 + Q_{mi})}{LM} - 1\right) = \frac{\sum_{i,m}(1 + Q_{mi})}{LM}\,\log\left(\frac{\sum_{i,m}(1 + Q_{mi})}{LM}\right) \stackrel{(a)}{\le} \frac{1}{LM}\sum_{i,m}(1 + Q_{mi})\log(1 + Q_{mi}) = V(Q),
$$
where inequality (a) follows from the log sum inequality. Now, since $G(.)$ and $\log(.)$ are strictly increasing, we have
$$
e^{\left(\frac{1}{LM}\sum_{i,m}\log(1 + Q_{mi})\right)} \le 1 + G^{-1}(V(Q)) \quad \Longrightarrow \quad \frac{1}{LM}\sum_{i,m}\log(1 + Q_{mi}) \le \log\left(1 + G^{-1}(V(Q))\right). \qquad \text{(B.1)}
$$
Now to prove the second inequality, note that since $Q_{mi}$ is nonnegative for all $i$ and $m$,
$$
\left(\sum_{i,m}\log(1 + Q_{mi})\right)\left(\sum_{i',m'}(1 + Q_{m'i'})\right) \ge \sum_{i,m}(1 + Q_{mi})\log(1 + Q_{mi}) \ge \sum_{i,m} Q_{mi}.
$$
The last two inequalities again follow from the fact that $G(.)$ and $\log(.)$ are strictly increasing.
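Two of the elementary facts used above, the AM-GM inequality on the values $1 + Q_{mi}$ and the pointwise bound $(1+x)\log(1+x) \ge x$ for $x \ge 0$, can be checked numerically. The queue values below are made-up:

```python
import math

# Numeric check of two facts from Appendix B on arbitrary nonnegative
# queue lengths Q_mi (illustrative values; L*M = 6 entries here).
Q = [0.0, 1.5, 4.0, 0.2, 7.0, 2.0]
n = len(Q)

# AM-GM on the shifted values 1 + Q_mi
gm = math.prod(1 + q for q in Q) ** (1 / n)
am = sum(1 + q for q in Q) / n
print(gm <= am)                                             # True

# (1+x) log(1+x) >= x pointwise, hence also after summing
print(sum((1 + q) * math.log(1 + q) for q in Q) >= sum(Q))  # True
```

The first fact drives the opening inequality of the proof; the second yields the final lower bound $\sum_{i,m}(1+Q_{mi})\log(1+Q_{mi}) \ge \sum_{i,m} Q_{mi}$.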
120
APPENDIX C
PROOF OF CLAIM 5.1
For any two job types $m$ and $m'$ as well as servers $i$ and $i'$, we have
$$
2(\widetilde{q}^{(\epsilon)}_{mi} - \widetilde{q}^{(\epsilon)}_{mi'})c_{m'}\,(\widetilde{q}^{(\epsilon)}_{m'i} - \widetilde{q}^{(\epsilon)}_{m'i'})c_{m} \le \left[(\widetilde{q}^{(\epsilon)}_{mi} - \widetilde{q}^{(\epsilon)}_{mi'})c_{m'}\right]^2 + \left[(\widetilde{q}^{(\epsilon)}_{m'i} - \widetilde{q}^{(\epsilon)}_{m'i'})c_{m}\right]^2.
$$
Summing over $m < m'$ gives
$$
2\sum_{m<m'} (\widetilde{q}^{(\epsilon)}_{mi} - \widetilde{q}^{(\epsilon)}_{mi'})c_{m'}(\widetilde{q}^{(\epsilon)}_{m'i} - \widetilde{q}^{(\epsilon)}_{m'i'})c_{m} \le \sum_{m<m'}\left[(\widetilde{q}^{(\epsilon)}_{mi} - \widetilde{q}^{(\epsilon)}_{mi'})c_{m'}\right]^2 + \left[(\widetilde{q}^{(\epsilon)}_{m'i} - \widetilde{q}^{(\epsilon)}_{m'i'})c_{m}\right]^2, \qquad \text{(C.1)}
$$
and
$$
\sum_{m=1}^{M}\sum_{m'=1}^{M} (\widetilde{q}^{(\epsilon)}_{mi} - \widetilde{q}^{(\epsilon)}_{mi'})c_{m}\,(\widetilde{q}^{(\epsilon)}_{m'i} - \widetilde{q}^{(\epsilon)}_{m'i'})c_{m'} \le \sum_{m=1}^{M}\sum_{m'=1}^{M} (\widetilde{q}^{(\epsilon)}_{mi} - \widetilde{q}^{(\epsilon)}_{mi'})^2 (c_{m'})^2. \qquad \text{(C.2)}
$$
The left-hand sides of (C.1) and (C.2) are equal for the following reason. The two sums in the LHS of (C.2) can be split into three cases, viz., $m = m'$, $m < m'$ and $m > m'$. The term corresponding to $m = m'$ is zero. The other two cases correspond to the same term, which gives the factor 2 in (C.1). Considering the three cases, it can be shown that the right-hand sides of (C.1) and (C.2) are equal. Noting that $\sum_{m'=1}^{M}(c_{m'})^2 = 1$ and summing over $i, i'$ such that $i < i'$, we get
$$
\sum_{i<i'} \left(\sum_{m=1}^{M}(\widetilde{q}^{(\epsilon)}_{mi} - \widetilde{q}^{(\epsilon)}_{mi'})c_m\right)^{\!2} \le \sum_{i<i'} \sum_{m=1}^{M}\left(\widetilde{q}^{(\epsilon)}_{mi} - \widetilde{q}^{(\epsilon)}_{mi'}\right)^{2}, \qquad \text{(C.3)}
$$
which over all $i, i'$ reads
$$
\sum_{i=1}^{L}\sum_{i'=1}^{L}\left(\sum_{m=1}^{M}\widetilde{q}^{(\epsilon)}_{mi}c_m\right)\left(\sum_{m=1}^{M}(\widetilde{q}^{(\epsilon)}_{mi} - \widetilde{q}^{(\epsilon)}_{mi'})c_m\right) \le \sum_{i=1}^{L}\sum_{i'=1}^{L}\sum_{m=1}^{M}\left[\left(\widetilde{q}^{(\epsilon)}_{mi}\right)^{2} - \widetilde{q}^{(\epsilon)}_{mi}\,\widetilde{q}^{(\epsilon)}_{mi'}\right]. \qquad \text{(C.4)}
$$
The left-hand side of (C.4) is obtained using the same method as in (C.2), as follows. The two sums in the LHS of (C.4) can be split into three cases, viz., $i = i'$, $i < i'$ and $i > i'$. The term corresponding to $i = i'$ is zero, and the pair $(i, i')$ with $i < i'$ contributes
$$
\left(\sum_{m=1}^{M}\widetilde{q}^{(\epsilon)}_{mi}c_m\right)\left(\sum_{m=1}^{M}(\widetilde{q}^{(\epsilon)}_{mi} - \widetilde{q}^{(\epsilon)}_{mi'})c_m\right) + \left(\sum_{m=1}^{M}\widetilde{q}^{(\epsilon)}_{mi'}c_m\right)\left(\sum_{m=1}^{M}(\widetilde{q}^{(\epsilon)}_{mi'} - \widetilde{q}^{(\epsilon)}_{mi})c_m\right),
$$
which is the same as the term in the left-hand side of (C.3). Similarly, the right-hand side term can be obtained. Expanding the products in (C.4), we get
$$
\sum_{i=1}^{L}\sum_{i'=1}^{L}\left(\sum_{m=1}^{M}\widetilde{q}^{(\epsilon)}_{mi}c_m\right)^{\!2} - \sum_{i=1}^{L}\sum_{i'=1}^{L}\left(\sum_{m=1}^{M}\widetilde{q}^{(\epsilon)}_{mi}c_m\right)\left(\sum_{m=1}^{M}\widetilde{q}^{(\epsilon)}_{mi'}c_m\right) \le \sum_{m=1}^{M}\left[\sum_{i=1}^{L}\sum_{i'=1}^{L}\left(\widetilde{q}^{(\epsilon)}_{mi}\right)^{2} - \sum_{i=1}^{L}\sum_{i'=1}^{L}\widetilde{q}^{(\epsilon)}_{mi}\,\widetilde{q}^{(\epsilon)}_{mi'}\right],
$$
i.e.,
$$
L\sum_{i=1}^{L}\left(\sum_{m=1}^{M}\widetilde{q}^{(\epsilon)}_{mi}c_m\right)^{\!2} - \left(\sum_{i=1}^{L}\sum_{m=1}^{M}\widetilde{q}^{(\epsilon)}_{mi}c_m\right)^{\!2} \le L\sum_{m=1}^{M}\sum_{i=1}^{L}\left(\widetilde{q}^{(\epsilon)}_{mi}\right)^{2} - \sum_{m=1}^{M}\left(\sum_{i=1}^{L}\widetilde{q}^{(\epsilon)}_{mi}\right)^{\!2}.
$$
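The Cauchy-Schwarz step behind this appendix says that when the weight vector $c$ has unit norm, $\left(\sum_m d_m c_m\right)^2 \le \sum_m d_m^2$ for any differences $d_m = \widetilde{q}_{mi} - \widetilde{q}_{mi'}$. A quick randomized check (purely illustrative values):

```python
import random

# Verify (sum_m d_m c_m)^2 <= sum_m d_m^2 when sum_m c_m^2 == 1,
# on random difference vectors d. This is the inequality used in Claim 5.1.
rng = random.Random(42)
M = 5
c = [rng.random() for _ in range(M)]
norm = sum(x * x for x in c) ** 0.5
c = [x / norm for x in c]            # normalize so sum of c_m^2 is 1

for _ in range(100):
    d = [rng.uniform(-10, 10) for _ in range(M)]
    lhs = sum(dm * cm for dm, cm in zip(d, c)) ** 2
    rhs = sum(dm * dm for dm in d)
    assert lhs <= rhs + 1e-9
print("bound holds on all random instances")
```

Summing this per-pair inequality over $i < i'$ is exactly how (C.3) follows from (C.1) and (C.2).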
REFERENCES
[24] M. Mitzenmacher, "The power of two choices in randomized load balancing," Ph.D. dissertation, University of California at Berkeley, 1996.
[25] Y. T. He and D. G. Down, "Limited choice and locality considerations for load balancing," Performance Evaluation, vol. 65, no. 9, pp. 670-687, 2008.
[26] H. Chen and H. Q. Ye, "Asymptotic optimality of balanced routing," 2010, http://myweb.polyu.edu.hk/lgtyehq/papers/ChenYe11OR.pdf.
[27] L. Tassiulas, "Linear complexity algorithms for maximum throughput in radio networks and input queued switches," in Proc. IEEE Infocom., 1998.
[28] Y. Ganjali, A. Keshavarzian, and D. Shah, "Cell switching versus packet switching in input-queued switches," IEEE/ACM Trans. Networking, vol. 13, pp. 782-789, 2005.
[29] A. Eryilmaz, R. Srikant, and J. R. Perkins, "Stable scheduling policies for fading wireless channels," IEEE/ACM Trans. Network., vol. 13, no. 2, pp. 411-424, 2005.
[30] V. Venkataramanan and X. Lin, "On the queue-overflow probability of wireless systems: A new approach combining large deviations with Lyapunov functions," IEEE Trans. Inform. Theory, 2013.
[31] D. Shah and J. Shin, "Randomized scheduling algorithm for queueing networks," The Annals of Applied Probability, vol. 22, no. 1, pp. 128-171, 2012.
[32] J. Ghaderi and R. Srikant, "Flow-level stability of multihop wireless networks using only MAC-layer information," in WiOpt, 2012.
[33] B. Hajek, "Hitting-time and occupation-time bounds implied by drift analysis with applications," Advances in Applied Probability, pp. 502-525, 1982.
[34] S. T. Maguluri and R. Srikant, "Scheduling jobs with unknown duration in clouds," in INFOCOM, 2013 Proceedings IEEE, 2013, pp. 1887-1895.
[35] S. T. Maguluri and R. Srikant, "Scheduling jobs with unknown duration in clouds," to appear in IEEE/ACM Transactions on Networking.
[36] S. Karlin and H. M. Taylor, A First Course in Stochastic Processes. Academic Press, 1975.
[37] A. Eryilmaz and R. Srikant, "Asymptotically tight steady-state queue length bounds implied by drift conditions," Queueing Systems, pp. 1-49, 2012.
[38] J. Ghaderi, T. Ji, and R. Srikant, "Connection-level scheduling in wireless networks using only MAC-layer information," in INFOCOM, 2012, pp. 2696-2700.
[39] T. Ji and R. Srikant, "Scheduling in wireless networks with connection arrivals and departures," in Information Theory and Applications Workshop, 2011.
[40] M. Bramson, "State space collapse with application to heavy-traffic limits for multiclass queueing networks," Queueing Systems Theory and Applications, pp. 89-148, 1998.
[41] R. J. Williams, "Diffusion approximations for open multiclass queueing networks: Sufficient conditions involving state space collapse," Queueing Systems Theory and Applications, pp. 27-88, 1998.
[42] M. I. Reiman, "Some diffusion approximations with state space collapse," in Proceedings of International Seminar on Modelling and Performance Evaluation Methodology, Lecture Notes in Control and Information Sciences. Berlin: Springer, 1983, pp. 209-240.
[43] J. M. Harrison, "Heavy traffic analysis of a system with parallel servers: Asymptotic optimality of discrete review policies," Ann. App. Probab., pp. 822-848, 1998.
[44] J. M. Harrison and M. J. Lopez, "Heavy traffic resource pooling in parallel-server systems," Queueing Systems, pp. 339-368, 1999.
[45] S. L. Bell and R. J. Williams, "Dynamic scheduling of a parallel server system in heavy traffic with complete resource pooling: asymptotic optimality of a threshold policy," Electronic J. of Probability, pp. 1044-1115, 2005.
[46] A. Stolyar, "MaxWeight scheduling in a generalized switch: State space collapse and workload minimization in heavy traffic," Adv. Appl. Prob., vol. 14, no. 1, 2004.
[47] J. F. C. Kingman, "Some inequalities for the queue GI/G/1," Biometrika, pp. 315-324, 1962.
[48] S. T. Maguluri, R. Srikant, and L. Ying, "Heavy traffic optimal resource allocation algorithms for cloud computing clusters," in International Teletraffic Congress, 2012, pp. 1-8.
[49] S. T. Maguluri, R. Srikant, and L. Ying, "Heavy traffic optimal resource allocation algorithms for cloud computing clusters," Performance Evaluation, vol. 81, pp. 20-39, 2014.