
The Load-Distance Balancing Problem
Eddie Bortnikov (Yahoo!)
Samir Khuller (Maryland)
Yishay Mansour (Google)
Seffi Naor (Technion)
The Load-Distance Balancing Problem

 Given n clients and k servers s1, s2, …, sk, we need to assign each client to a server.
 The cost for client i assigned to server sj is as follows:
Cost(i) = Distance(i,sj) + Delay(j,Lj)
Delay(j,Lj) is a FUNCTION of the number of
clients Lj assigned to sj.
OBJECTIVE: Min Max Cost(i)
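
As a small illustration of the objective, the sketch below evaluates Max Cost(i) for a given assignment. The names `max_cost`, `dist`, `delay`, and `assign` are illustrative and do not come from the slides; delay functions are assumed to be Python callables of the load.

```python
from collections import Counter

def max_cost(dist, delay, assign):
    """Return the max over clients i of Distance(i, s_j) + Delay(j, L_j).

    dist[i][j] : distance from client i to server j
    delay[j]   : callable mapping a load L_j to a delay (assumed non-decreasing)
    assign[i]  : index j of the server that client i is assigned to
    """
    load = Counter(assign)  # L_j = number of clients assigned to server j
    return max(dist[i][j] + delay[j](load[j]) for i, j in enumerate(assign))

# Hypothetical instance: two servers with linear delays, three clients.
dist = [[1, 5], [1, 5], [1, 5]]
delay = [lambda L: 2 * L, lambda L: 2 * L]
print(max_cost(dist, delay, [0, 0, 0]))  # all on server 0: 1 + 2*3 = 7
print(max_cost(dist, delay, [0, 0, 1]))  # split: max(1 + 4, 5 + 2) = 7
```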
An Example

[Figure: three servers s1, s2, s3 and five clients A, B, C, D, E.]

 Each server has its own delay function (it can be arbitrary; we just assume it is non-decreasing).
 Note that C is closer to s1, but prefers to attach to s2 since
Dist(C,s2)+Delay(s2,2)<Dist(C,s1)+Delay(s1,3)

 Objective is to Minimize Max Cost for any client


Related Work

 Lots of research on locating facilities. Here the facilities are all given; we just have to compute an assignment of clients to facilities.
 Notion of capacities has been used for
various covering problems such as Vertex
Cover, K Centers, Facility Location etc.
Main Results
 The problem is NP-hard.
 We develop a polynomial-time 2-approximation.
 We show that the bound 2 cannot be improved to 2-ε
unless NP=P.
 With triangle inequality in the distance function the
hardness reduces to 5/3-ε.
 When all clients and servers are on a line, it is solvable
in polynomial time.
 For Min Sum Cost(i), we can solve in polynomial
time using Min-Weight Matching.
NP-hardness by Exact Set Cover

 Given N elements and a collection S of K sets (each set has size m), does there exist a subset S’ of S such that each element belongs to exactly one set in S’?
In other words, we need to pick exactly N/m
subsets from S, to cover each element once.
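
To make the definition concrete, here is a minimal brute-force checker. It is only a sketch for tiny instances (it takes exponential time, as expected for an NP-hard problem); the function name `exact_cover` and the 0-based element numbering are assumptions made for illustration.

```python
from itertools import combinations

def exact_cover(n_elements, sets):
    """Return indices of sets covering 0..n_elements-1 exactly once, or None."""
    m = len(sets[0])  # all sets are assumed to have the same size m
    if n_elements % m:
        return None
    universe = set(range(n_elements))
    # An exact cover must consist of exactly N/m sets.
    for chosen in combinations(range(len(sets)), n_elements // m):
        picked = [sets[j] for j in chosen]
        # N/m sets of size m cover all N elements iff they are pairwise
        # disjoint, so a full union already implies "exactly once".
        if set().union(*picked) == universe:
            return list(chosen)
    return None

sets = [{0, 1}, {2, 3}, {1, 2}]
print(exact_cover(4, sets))  # [0, 1]
```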
Example of Exact Cover

 Here we have N=16 elements and 9 sets (m=4).


 The FOUR blue sets form an EXACT COVER, and
we discard the FIVE orange sets.
Reduction from Exact Cover

 Each element is a client. In addition we create a collection of M(K-N/m) dummy clients.
 Subset Sj in S corresponds to server sj.
 Dist(dummy,server) = d1
 Dist(i,sj) = d2 if i ∈ Sj, otherwise ∞
 d2 >> d1
[Figure: dummy clients at distance d1 from server sj; clients (elements) i at distance d2 from sj.]
Reduction from Exact Cover

 Delay functions for servers are basically a step function.
 Delay(j,Lj) = Δ-d2, when load is at most m.
 Delay(j,Lj)= Δ-d1, when load exceeds m, but is at
most M.
[Figure: step delay function, with value Δ-d2 up to load m and value Δ-d1 from there up to load M.]
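
The construction can be sketched in code as follows. The function name `build_ldb_instance` and the concrete numbers are hypothetical, and the delay for loads above M is set to infinity as an assumption, since the slides only specify the delay up to load M.

```python
import math
from collections import Counter

def build_ldb_instance(n_elements, sets, m, M, d1, d2, Delta):
    """Build distances and the shared step delay function of the reduction."""
    K = len(sets)
    n_dummies = M * (K - n_elements // m)
    # Real client i is at distance d2 from server j iff element i is in S_j.
    real = [[d2 if i in S else math.inf for S in sets] for i in range(n_elements)]
    # Every dummy client is at distance d1 from every server.
    dummy = [[d1] * K for _ in range(n_dummies)]

    def delay(load):
        if load <= m:
            return Delta - d2
        if load <= M:
            return Delta - d1
        return math.inf  # assumption: not specified on the slides

    return real + dummy, delay

# Hypothetical instance: N=4, m=2, K=3; sets 0 and 1 form an exact cover.
sets = [{0, 1}, {2, 3}, {1, 2}]
dist, delay = build_ldb_instance(4, sets, m=2, M=3, d1=1, d2=10, Delta=12)
assign = [0, 0, 1, 1, 2, 2, 2]  # real clients follow the cover, dummies go to s2
load = Counter(assign)
cost = max(dist[i][j] + delay(load[j]) for i, j in enumerate(assign))
print(cost)  # 12, i.e. exactly Delta
```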
Reduction from Exact Cover

 Suppose there is a solution to exact cover. Then there is a solution to the LDB problem with cost at most Δ.
 For each chosen subset Sj, the corresponding server sj gets m clients, each at distance d2. Since the delay is Δ-d2, the total cost is at most Δ.
 The dummies are all assigned to the remaining K-N/m servers; each gets M dummies, at cost d1 + (Δ-d1) = Δ.
 The proof in the other direction requires some work!
Proof (cont.)
 Each server can support at most M dummy clients if
the total cost does not exceed Δ, and no more than
m real clients.
 Suppose a server supports both real and dummy
clients; then the total number of servers with real
clients is k’ > N/m.
 These serve at most (mk’-N) dummy clients, while
the rest can serve only M(K-k’) dummy clients.
 Adding the two shows (some algebra needed) that
we can only assign < M(K-N/m) dummy clients if
M>m.
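
The algebra can be spelled out as follows, using the quantities above (K servers, k’ > N/m servers holding real clients, and M > m). The number of dummy clients that can be placed is at most

```latex
\begin{align*}
(mk' - N) + M(K - k') &= MK - N - (M - m)\,k' \\
  &< MK - N - (M - m)\,\frac{N}{m} \\
  &= M\left(K - \frac{N}{m}\right),
\end{align*}
```

where the strict inequality uses M - m > 0 and k’ > N/m. So not all M(K-N/m) dummy clients can be assigned at cost at most Δ.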
Hardness Results follow….

 We can set d1=ε and d2=Δ-ε. A solution to EXACT COVER exists if and only if a solution with cost Δ exists for LDB.
 If there is no solution to EXACT COVER,
then every solution to LDB has cost at least 2(Δ-ε).
 However, here we do violate triangle
inequality in the distance function.
Hardness results with triangle inequality

 We need to set d1 = ⅓Δ and d2 = Δ-ε.
 With these parameters, the distance between a (real) client i and a server sj such that i is not in Sj is at least 5/3Δ-ε.
 A solution to EXACT COVER exists if and
only if a solution with cost Δ exists for LDB.
 If there is no solution to EXACT COVER,
then every solution to LDB has cost at least 5/3Δ-ε.
Approximation Algorithm with factor 2

 Suppose a solution exists with maximum cost Δ.


 For each server sj, we can compute an upper bound
on the number of clients that can be served with a
delay of at most Δ (say L*j).
 For each client i we can compute the subset of
servers that are within distance Δ (S*i).
 Now it is just a flow problem to check if an assignment
exists where each client i is assigned to a server
from S*i and each sj has load at most L*j.
 Minimizing Δ gives a trivial 2-approximation: in any such assignment each client pays at most Δ for distance plus at most Δ for delay.
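
The steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the feasibility check uses augmenting paths for a capacitated bipartite assignment (a special case of max flow) instead of a general flow solver, the candidate values of Δ are taken to be all distances and all delay values, and the names `feasible` and `two_approx` are hypothetical.

```python
def feasible(dist, delay, delta):
    """Return an assignment using only distances and delays <= delta, or None."""
    n, k = len(dist), len(dist[0])
    # L*_j: largest load whose delay is at most delta (delays non-decreasing).
    cap = []
    for j in range(k):
        L = 0
        while L < n and delay[j](L + 1) <= delta:
            L += 1
        cap.append(L)
    # S*_i: servers within distance delta of client i.
    allowed = [[j for j in range(k) if dist[i][j] <= delta] for i in range(n)]
    clients_at = [[] for _ in range(k)]

    def place(i, seen):
        """Try to place client i, relocating other clients if necessary."""
        for j in allowed[i]:
            if j in seen:
                continue
            seen.add(j)
            if len(clients_at[j]) < cap[j]:
                clients_at[j].append(i)
                return True
            for slot, other in enumerate(clients_at[j]):
                if place(other, seen):      # 'other' moved to a different server
                    clients_at[j][slot] = i
                    return True
        return False

    for i in range(n):
        if not place(i, set()):
            return None
    assign = [None] * n
    for j, clients in enumerate(clients_at):
        for i in clients:
            assign[i] = j
    return assign

def two_approx(dist, delay):
    """Return (delta, assignment) for the smallest feasible candidate delta."""
    n, k = len(dist), len(dist[0])
    candidates = {d for row in dist for d in row}
    candidates |= {delay[j](L) for j in range(k) for L in range(1, n + 1)}
    for delta in sorted(candidates):
        assign = feasible(dist, delay, delta)
        if assign is not None:
            return delta, assign  # every client then pays at most 2 * delta
    return None

# Hypothetical instance: two servers with linear delays, three clients.
dist = [[1, 5], [1, 5], [1, 5]]
delay = [lambda L: 2 * L, lambda L: 2 * L]
delta, assign = two_approx(dist, delay)
print(delta, assign)  # delta is 5 on this instance
```

Since feasibility is monotone in Δ, the linear scan over candidates could be replaced by a binary search.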
All servers and clients on a line

 Use dynamic programming!


Minimizing the Sum of Costs

 We reduce this to min cost matching in a bipartite graph.
 Let G=(X,Y,E) where nodes in X correspond to the n clients and there are nk nodes in Y: n nodes corresponding to each server.
 We ask for a min cost matching to find a solution.
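
A minimal sketch of this reduction follows. The slides do not state the edge weights, so the choice below is an assumption: the t-th copy of server j charges the marginal total delay t·Delay(j,t) − (t−1)·Delay(j,t−1), which gives the right sum when t·Delay(j,t) is convex in t. The matching itself uses a standard Hungarian (Kuhn-Munkres) routine written out in plain Python; all function names are illustrative.

```python
from collections import Counter

def hungarian(cost):
    """Min-cost assignment of every row to a distinct column (rows <= columns)."""
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u, v = [0] * (n + 1), [0] * (m + 1)     # dual potentials
    p, way = [0] * (m + 1), [0] * (m + 1)   # p[j]: row matched to column j (1-based)
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [INF] * (m + 1), [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:                            # flip the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [0] * n
    for j in range(1, m + 1):
        if p[j]:
            match[p[j] - 1] = j - 1
    return match

def min_sum_assignment(dist, delay):
    """Assign clients to servers via matching into n copies of each server."""
    n, k = len(dist), len(dist[0])

    def total(j, t):  # total delay paid by t clients sharing server j
        return t * delay[j](t) if t else 0

    # Column j*n + (t-1) is the t-th copy of server j (assumed marginal weights).
    cost = [[dist[i][j] + total(j, t) - total(j, t - 1)
             for j in range(k) for t in range(1, n + 1)]
            for i in range(n)]
    return [col // n for col in hungarian(cost)]

def sum_cost(dist, delay, assign):
    load = Counter(assign)
    return sum(dist[i][j] + delay[j](load[j]) for i, j in enumerate(assign))

# Hypothetical instance: two servers with linear delays, three clients.
dist = [[1, 5], [1, 5], [1, 5]]
delay = [lambda L: 2 * L, lambda L: 2 * L]
assign = min_sum_assignment(dist, delay)
print(sum_cost(dist, delay, assign))  # 17: two clients on server 0, one on server 1
```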
Capacitated K Centers

 The related problem of choosing K facilities has been considered (each client should be assigned to a nearby facility and the load on the facility should not be too high): [Khuller & Sussman] (K,L,5R) or ((2/c)K, cL, 2R), related to K-centers clustering.
Conclusions

 Can we improve the 2 approximation when triangle inequality holds?
 Can we improve the 2 approximation when
Delay functions satisfy specific properties?
What is a natural delay function?
 Are there other special cases that can be
solved in polynomial time?
