TABLE 1
Term      Definition
S, si     switch set with n switches: S = {s1, · · · , sn}
w(si)     weight of si, defined as the number of out-going flows
PC(si)    potential controller set of the ith switch
rc(si)    the real controller of the ith switch
C, ci     controller set with m controllers: C = {c1, · · · , cm}
w(ci)     weight of ci, defined as the sum of the weights of RS(ci)
PS(ci)    potential switch set of the ith controller
RS(ci)    real switch set of the ith controller
AN(ci)    adjacent node set (1-hop neighbors) of ci

As the system runs, the weight of a controller ci may grow excessively, making it unbalanced compared with the other controllers. In this condition, we must migrate some switches in RS(ci) to other available controllers, in order to reduce its workload and keep the traffic of the whole network balanced.

Our problem then becomes balancing the traffic load among the partitions in real time, migrating switches among controllers whenever the balance is broken. We formulate this problem as the Load Balancing problem for Devolved Controllers (LBDC). In our system, each controller can dynamically migrate switches to, or receive switches from, its adjacent controllers to keep the traffic load balanced.

Fig. 1 illustrates the migration pattern. Here controller cj dominates 17 switches (the red switches) and controller ci dominates 13 switches (the blue switches). Since the traffic between ci and cj is unbalanced, cj is migrating one of its switches to ci.

Fig. 1. An example of regional balancing migration.

Theorem 1. LBDC is NP-complete.

Proof. We prove the NP-completeness of LBDC by considering a decision version of the problem and showing a reduction from the PARTITION problem [19]. We construct an instance of LBDC: there are two controllers c1, c2 and |A| switches. Each switch sa represents an element a ∈ A, with weight w(sa) = size(a). Both controllers can control every switch in the network (PS(c1) = PS(c2) = {sa | a ∈ A}). Then, given a YES solution A′ for PARTITION, we have a solution RS(c1) = {sa | a ∈ A′}, RS(c2) = {sa | a ∈ A \ A′} with σ = 0. The reverse direction is trivial. The reduction can be done within polynomial time, which completes the proof.
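To make the construction in the proof concrete, here is a minimal sketch in Python (the simulation language of Sec. 5.1). The helper names are ours, not the paper's.

# Sketch of the PARTITION -> LBDC construction in the proof of Theorem 1.
# Helper names (make_lbdc_instance, deviation) are illustrative only.
def make_lbdc_instance(A):
    """PARTITION multiset A -> switches with w(sa) = size(a); PS(c1) = PS(c2) = all."""
    switches = {i: size for i, size in enumerate(A)}
    potential = {'c1': set(switches), 'c2': set(switches)}
    return switches, potential

def deviation(switches, assignment):
    """sigma = |w(RS(c1)) - w(RS(c2))| for the two-controller instance."""
    load = {'c1': 0, 'c2': 0}
    for s, c in assignment.items():
        load[c] += switches[s]
    return abs(load['c1'] - load['c2'])

A = [3, 1, 1, 2, 2, 1]                       # YES instance: {3, 2} vs {1, 1, 2, 1}
switches, _ = make_lbdc_instance(A)
assignment = {0: 'c1', 3: 'c1', 1: 'c2', 2: 'c2', 4: 'c2', 5: 'c2'}
assert deviation(switches, assignment) == 0  # sigma = 0 iff A' witnesses PARTITION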
Next we demonstrate our results for LBDC. We implement the schemes within the OpenFlow framework, which makes the system fairly easy to configure and deploy, and turns devolved controllers from a theoretical model into an implementable prototype. Besides, our schemes are topology-free, and thus scale to any DCN topology such as Fat-Tree, BCube, Portland, etc.

III. LINEAR PROGRAMMING AND ROUNDING
We then present our LBDC-Randomized Rounding (LBDC-RR) algorithm, as described below.

First, we claim that our LBDC problem can be described in another way through the definition and properties of set cover: given a universe U of n switch elements, a collection S = {S1, · · · , Sn} of subsets of U, and a cost assignment function c : S → Z+, find the subcollection of S with the minimum deviation that covers all the switches of the universal switch set U.

We will show that each switch element is covered with constant probability by the controllers with a specific switch set, which is picked by this process. Repeating this process O(log n) times, and picking a subset of switches if it is chosen in any of the iterations, we get a set cover with high probability, by a standard coupon collector argument. The expected minimum deviation of the cover (or, say, of the controller-switch matching) picked in this way is O(log n) · OPT_f ≤ O(log n) · OPT, where OPT_f is the cost of an optimal solution to the LP relaxation.

Summing up over all switch elements a ∈ U, we get

    Pr[C is not a valid switch set cover] ≤ n · (1 / 4n) ≤ 1/4.

Therefore the LBDC-RR algorithm is efficient, and we can solve the LBDC problem using linear programming and randomized rounding. Alg. 1 shows the formal description of LBDC-RR.

Algorithm 1: Randomized Rounding (LBDC-RR)
1 Let x = p be an optimal solution to the LP;
2 foreach set Si ∈ S do
3     Pick Si with probability xSi;
4 Repeat lines 2-3 for O(log n) rounds;
5 Pick a subcollection as a min-cover;
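As a concrete illustration, here is a minimal Python sketch of the rounding loop of Alg. 1. It assumes the fractional optimum x is already available from an external LP solver; the instance below is a toy of our own, not from the paper.

# Sketch of Alg. 1's rounding loop; x[i] is the fractional LP value for set S_i.
import math
import random

def lbdc_rr(sets, x, universe, rounds=None):
    n = len(universe)
    rounds = rounds or int(math.ceil(2 * math.log(n + 1))) + 1   # O(log n) rounds
    chosen = set()
    for _ in range(rounds):
        for i in range(len(sets)):
            if random.random() < x[i]:       # pick S_i with probability x_i
                chosen.add(i)
    covered = set()
    for i in chosen:
        covered |= sets[i]
    return chosen, covered == universe       # a valid cover with high probability

universe = {1, 2, 3, 4}
sets = [{1, 2}, {2, 3}, {3, 4}]
x = [0.9, 0.5, 0.9]                          # pretend fractional optimum
picked, valid = lbdc_rr(sets, x, universe)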
IV. ALGORITHM DESIGN

Using linear programming and rounding, we can solve LBDC perfectly in theory. However, it is usually time-consuming and impractical to solve an LP in real-world applications. Thus, designing efficient and practical heuristics for real systems is essential. In this section, we propose a centralized and a distributed greedy algorithm for switch migration when the traffic load becomes unbalanced among the controllers. We then describe the OpenFlow-based migration protocols that we use in this system.

4.1 Centralized Migration

Centralized Migration is split into two phases. The first phase is used for configuring and initializing the DCN. As the traffic load changes due to various applications, we have to come to the second phase for dynamic migration among the devolved controllers. Fig. 2 illustrates the general workflow of Centralized Migration, which includes Centralized Initialization and Centralized Regional Balanced Migration.

Fig. 2. General workflow of Centralized Migration (states: Initialization State, Network Traffic Varies, Unbalanced).
In the first phase, LBDC-CI assigns each switch to the lightest controller among its potential controllers; among equally light candidates, the one with the smallest |RS(ci)| is preferred. If there are still several candidates, we choose the closer controller by physical distance. Finally, if we still cannot make a decision, we pick one at random. We refer to this cascade as the Tie-Break Law, and design LBDC-CI as shown in Alg. 2.

Algorithm 2: Centralized Initialization (LBDC-CI)
Input: S with w(si); C with w(ci);
Output: an m-partition of S to C
1 RemList = {s1, s2, · · · , sn};
2 while RemList ≠ ∅ do
3     Pick si from RemList;
4     Let ℓ = arg min{w(cj) | cj ∈ PC(si)};
5     Assign si to cℓ (by the Tie-Break Law);
6     Remove si from RemList;

LBDC-CI needs to search the RemList to assign the switches. The while loop is executed once for each switch in RemList (O(n) iterations), and each iteration scans the potential controllers for the lightest one, which takes O(m). Hence in the worst case the running time is O(mn). If we use a priority heap to store the RemList, we can improve the performance and reduce the overall running time to O(m log n).
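A minimal Python sketch of Alg. 2, with the first two rules of the Tie-Break Law folded into the argmin key; the data layout is ours, not the paper's.

# Sketch of LBDC-CI: assign each switch to its lightest potential controller.
def lbdc_ci(switch_weights, controllers, PC):
    load = {c: 0.0 for c in controllers}          # w(c_j)
    RS = {c: [] for c in controllers}             # real switch sets
    for s, w in switch_weights.items():           # RemList
        # Tie-Break Law, first two rules: lightest weight, then smallest |RS|.
        best = min(PC[s], key=lambda c: (load[c], len(RS[c])))
        RS[best].append(s)
        load[best] += w
    return RS

RS = lbdc_ci({'s1': 5, 's2': 3, 's3': 4},
             ['c1', 'c2'],
             {'s1': ['c1', 'c2'], 's2': ['c1'], 's3': ['c1', 'c2']})
# -> s1 to c1, s2 to c1 (its only option), s3 to c2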
As the system runs, the traffic load may vary frequently and will influence the balanced status among the devolved controllers. Correspondingly, we have to begin the second phase and design the centralized migration algorithm (LBDC-CM) to alleviate the situation.

Centralized Regional Balanced Migration: During the migration process, we must assess when a controller needs to execute a migration. Thus we come up with a threshold and an effluence to judge the traffic load balancing status of the controllers. Here we define Thd as the threshold and Efn as the effluence. If the workload of a controller is lower than Thd, it becomes relatively idle and available to receive more switches migrated from those controllers with workload overhead. If the workload of a controller is higher than Efn, it is in an overload status and should assign its switches to other idle controllers.

The algorithm runs periodically, round by round, and stops the migration once the system is balanced again. In each round, we sample the current weight of each controller and calculate Avg_now = Σ_{i=1}^{m} w(ci)/m. In all, the linear expectation is

    Thd = α · Avg_now + (1 − α) · Avg_last,
    Efn = β · Thd.                                  (14)
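In code, the bookkeeping of Eqn. (14) amounts to a few lines; ALPHA and BETA below mirror the α and β defaults chosen in Sec. 5.1 (0.7 and 1.5).

# Sketch of the Eqn. (14) threshold update and Step 1 of LBDC-CM.
ALPHA, BETA = 0.7, 1.5

def update_thresholds(weights, avg_last):
    """weights: the sampled w(c_i) of the m controllers in this round."""
    avg_now = sum(weights) / float(len(weights))
    thd = ALPHA * avg_now + (1 - ALPHA) * avg_last   # threshold
    efn = BETA * thd                                 # effluence
    return avg_now, thd, efn

weights = [12, 30, 9, 21]
avg_now, thd, efn = update_thresholds(weights, avg_last=16.0)
over_list = [i for i, w in enumerate(weights) if w > efn]   # overloaded controllers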
The core principle of LBDC-CM is migrating the heaviest switch to the lightest controller greedily. Alg. 3 describes the details. Note that AN(ci) denotes the neighbor set of ci.

Algorithm 3: Centralized Migration (LBDC-CM)
Input: S with w′(si); C with w′(ci); PendList = OverList = {∅};
1 Step 1: Add ci → OverList if w′(ci) > Efn;
2 Step 2: Find cm of max weight in OverList;
3 if ∃ cn ∈ AN(cm) : w′(cn) < Thd then
4     repeat
5         Pick sm ∈ RS(cm) of max weight;
6         if ∃ cf ∈ AN(cm) ∩ PC(sm) : w′(cf) < Thd then Send sm → cf;
7         else Ignore the current sm in cm;
8     until w′(cm) ≤ Thd or all w′(cf) ≥ Thd;
9     if w′(cm) > Efn then move cm to PendList;
10    else remove cm from OverList;
11 else
12    Move cm from OverList to PendList;
13 Step 3: Repeat Step 2 until OverList = {∅};
14 Let OverList = PendList, and repeat Step 2 until PendList becomes stable;
15 Step 4: Now PendList has several connected components CCi (1 ≤ i ≤ |CC|);
16 foreach CCi ∈ CC do
17     Search ∪_{cj ∈ CCi} AN(cj);
18     Compute avg_local = w′(CCi ∪ AN(CCi)) / (|CCi| + |AN(CCi)|);
19     while ∃ cj ∈ CCi : w′(cj) ≥ γ · avg_local do
20         Migrate smax ∈ RS(cj) to cmin ∈ AN(CCi);
21         Remove cj ∈ CCi from PendList;

We use Desi_now in the Relative Weight Deviation (RWD) when evaluating the limited LBDC-CM and the LBDC-CM with switch priority introduced below, and we use Avg_now in place of Desi_now in the RWD when evaluating naive LBDC-CM and LBDC-DM.

Limited LBDC-CM uses the current load ratio of each controller, rather than the value of the current weight, to judge whether the devolved controllers are unbalanced; thus we only need to calculate the average load ratio instead. According to Eqn. (15), the procedure of limited LBDC-CM is very similar to the naive LBDC-CM in Alg. 3. The only difference lies in the comparison steps: when judging whether a controller is overloaded, we compare w′(ci) to its local Efn_i and Thd_i.

Sometimes it is better to distribute si to c2 if the current loads of both controllers are below their thresholds. Thus we come up with LBDC-CM with switch priority. In this scheme, each switch has a value list, which stores the value of each mapping between this switch and its potential controllers. We want to balance the traffic load of the network while making the whole value as large as possible. We use vij to denote the value we gain by attaching switch si to controller cj; these values are stored in a matrix Value. The implementation is quite similar to limited LBDC-CM, except that we change the migration scheme used in Step 2 of limited LBDC-CM, as sketched below.
Algorithm 4: LBDC-CM with switch priority
1 Step 2: Find cm ∈ OverList with max w′(cm)/wm(cm);
2 if ∃ cn ∈ AN(cm) : w′(cn) < Thd_n then
3     repeat
4         if ∃ cf ∈ AN(cm) : w′(cf) < Thd_f then
5             Sort PS(cm) by vif : si ∈ PS(cm);
6             Pick sk with max vkf in cm, and pick the max sk to break ties;
7             Send sk → cf;
8     until w′(cm) ≤ Thd_m or all w′(cf) ≥ Thd_f;
9     if w′(cm) > Efn_m then move cm to PendList;
10 else
11    Move cm from OverList to PendList;

In this scheme, we add the process of sorting the switch list according to the value matrix, which contributes an O(log n) factor if we use heap sorting. Thus the time complexity is O(n log m log n) if we use a priority heap to store the PendList and the OverList, and the space complexity is still O(n²), since we need matrices to store the values and the mapping relations.

Distributed Regional Balanced Migration: In the second phase, each controller uses the threshold and the effluence to judge its own status and decide whether it should start a migration. Since in a distributed system a controller can only obtain the information of its neighborhood, the threshold is not a global one that suits all the controllers, but an independent value calculated by each controller locally. The algorithm also runs periodically for several rounds; in each round, each controller samples AN(ci) and applies the linear expectation of Eqn. (14).

LBDC-DM monitors a controller's own traffic status by comparing its current load with its threshold. When the traffic load grows larger than Efn, the controller enters the sending state and initiates a double-commit transaction to transfer heavy switches to nearby nodes. Alg. 5 shows the distributed migration procedure.

Algorithm 5: Distributed Migration (LBDC-DM)
Sending Mode: (when w′(c_my) ≥ Efn)
1 if ∃ ci ∈ AN(c_my) in receiving or idle state then …
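Since Thd in LBDC-DM is computed locally, each controller applies Eqn. (14) over its 1-hop neighborhood only. A small sketch of that local computation (names are ours):

# Sketch: per-controller threshold in LBDC-DM, computed from the controller's
# own neighborhood sample AN(c_i) rather than a global average.
def local_threshold(my_id, AN, weights, avg_last, alpha=0.7, beta=1.5):
    sample = [weights[c] for c in AN[my_id]] + [weights[my_id]]
    avg_now = sum(sample) / float(len(sample))
    thd = alpha * avg_now + (1 - alpha) * avg_last
    return thd, beta * thd                      # (Thd, Efn)

weights = {'c1': 40, 'c2': 10, 'c3': 12}
AN = {'c1': ['c2', 'c3']}
thd, efn = local_threshold('c1', AN, weights, avg_last=20.0)
sending = weights['c1'] >= efn                  # c1 enters sending mode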
Even so, there is no explicit acknowledgment from the switch that these messages have been received. Thus, in order to guarantee that all these messages are delivered, ci sends a Barrier Request and waits for the Barrier Reply, only after which it signals the end of migration to the final master cj.

Phase 4: Assign controller cj as the final master of sm. cj sets its role as the master of sm by sending a Role-Request message to sm. It also updates the distributed information store to record this. The switch sets ci to slave when it receives the Role-Request message from cj. Then cj remains active and processes all messages from sm for this phase.

The above migration protocol requires 6 round-trip times to complete the migration. But notice that we need to trigger migration only once in a while, when the load conditions change, as we discussed in the algorithm design subsections.
V. PERFORMANCE EVALUATION

In this section, we evaluate the performance of our centralized and distributed protocols. We examine the case where the traffic demand changes and test whether the metric of balanced workload among controllers is minimized. We also take the number of migrated switches into consideration. Furthermore, we investigate how different parameters affect the outcomes.

5.1 Environment Setup

We build simulations in Python 2.7 to assess the performance of our designs. We place 10,000 switches and 100 controllers in a 100 × 100 m² square. Switches are evenly distributed in this square; that is, each switch is 1 m away from any of its neighbors. The controllers are also evenly spread, and each controller is 10 m away from its neighbors. Each controller can monitor all the switches within 30 m, and can communicate with other controllers within a range of 40 m. We assume the weight of each switch follows the Pareto distribution with parameter αp = 3. We run a small simulation to choose the most appropriate α, β and γ, so that the simulated environment stays close to the actual situation in terms of traffic conditions, controller workload, migration frequency, etc. [13]–[15], [24]. Thus we set α = 0.7, β = 1.5, γ = 1.3 as the default configuration.
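For reference, a condensed sketch of this setup in Python 2.7-compatible code; the exact grid offsets for the controllers are our assumption.

# Sketch of the Sec. 5.1 environment: 10,000 switches on a 1 m grid, 100
# controllers on a 10 m grid, Pareto(alpha_p = 3) switch weights.
import random

ALPHA_P = 3.0
COVER_RANGE, PEER_RANGE = 30.0, 40.0

switches = [(x, y) for x in range(100) for y in range(100)]        # 1 m spacing
controllers = [(x + 4.5, y + 4.5) for x in range(0, 100, 10)
                                  for y in range(0, 100, 10)]      # 10 m spacing
weights = [random.paretovariate(ALPHA_P) for _ in switches]

def within(p, q, r):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 <= r * r

# PC(s): the controllers able to monitor switch s (30 m coverage).
PC = {i: [j for j, c in enumerate(controllers) if within(s, c, COVER_RANGE)]
      for i, s in enumerate(switches)}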
5.2 System Performance Visualization Results

We employ the default configuration described above to examine the operation of the system. We first apply initialization, then change the traffic demands dynamically to emulate unpredictable user requests, and then apply naive LBDC-CM and the other variants to alleviate the spot congestion. We use the relative weight deviation to evaluate the performance of our algorithms.

We analyze the performance of our four algorithms. Consider a DCN with 10 × 10 controllers arranged as a square array. At the start of a time slot, the weights of the switches are updated and then we run the migration algorithms. The weight of the switches follows a Pareto distribution with αp = 3. Figure 3 shows the initial traffic state of the system; the color scale represents the working state of each controller: the darker the color, the busier the controller. Figures 4, 5, 6 and 7 illustrate the performance of naive LBDC-CM, limited LBDC-CM, Priori LBDC-CM and LBDC-DM respectively. We can see that after the migration, the whole system becomes more balanced.

Fig. 3. Initial state. Fig. 4. Naive LBDC-CM migration. Fig. 5. Limited LBDC-CM migration. Fig. 6. Priori LBDC-CM migration. Fig. 7. LBDC-DM migration.

Actually, the performance of LBDC-DM is poor when the number of controllers is relatively limited. This phenomenon is attributed to the system setting that one controller can only cover switches within 30 m: when the controllers are few, more switches have to be controlled by one particular controller, without many choices. As the number of controllers increases, LBDC-DM achieves better performance and a higher improvement ratio.

Intuitively, increasing the number of controllers may increase the deviation, but it may lead to a lower migration frequency. To balance the number of controllers against the migration frequency, we need to carefully set the α, β, and γ values. If there are sufficient controllers to manage the whole system, we can adjust the three parameters so that the system maintains a stable state longer, while if the number of controllers is reduced, we have to raise the migration threshold to fully utilize the controllers. The effect of these parameters is further discussed in Sec. 5.4.

5.3 Horizontal Protocol Performance Comparison

We designed three variations of LBDC-CM. Naive LBDC-CM is the simplest and applicable to most cases; if the controllers are heterogeneous, or the switches have a space priority toward their physically closest controller, we can implement limited LBDC-CM or LBDC-CM with switch priority respectively. Finally, we have the distributed LBDC-DM protocol. Now let us compare the performance of the four migration protocols.

Fig. 8. Relative weight deviation protocol comparison. Fig. 9. Performance improvement for different protocols. Fig. 10. Relative weight deviation without traffic changes. Fig. 11. Migrated switch count without traffic changes. Fig. 12. Relative weight deviation as traffic changes. Fig. 13. Migrated switch count as traffic changes.

Comparison on number of controllers. Firstly, we vary the number of controllers from 30 to 210 with a step of 20 and check the change of the relative weight deviation of the system. The simulation results are shown in Fig. 8 and Fig. 9. We compare the relative weight deviation of the initial bursty traffic state and of the state after the migration. We find that after the migration, the relative weight deviation of all the controllers decreases, which shows that our four protocols improve the system performance significantly
compared with the initial state, whether in the relative weight deviation part or in the improvement part. As the number of controllers increases, the improvement ratio also increases; it is quite intuitive that more controllers will share the jobs to reach a balanced state. Both figures show that our algorithm performs well as the number of controllers grows, which indicates that our scheme is suitable for mega data centers.
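The relative weight deviation itself is easy to compute. Since the excerpt does not spell out its exact formula, the sketch below uses one plausible definition (mean absolute deviation normalized by the mean); the paper's formula may differ.

# Assumed RWD definition: mean absolute deviation of controller weights,
# normalized by the mean weight.
def rwd(weights):
    avg = sum(weights) / float(len(weights))
    return sum(abs(w - avg) for w in weights) / (len(weights) * avg)

before = rwd([50, 5, 30, 3])      # bursty initial state
after = rwd([23, 21, 24, 20])     # state after migration
improvement = 100.0 * (before - after) / before   # the "rate" of Fig. 9 / TABLE 2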
The naive LBDC-CM performs the best because it considers all possible migrations from a global perspective. It is even better than LBDC-DM, but the difference between them decreases as the number of controllers increases. It is better if we add more controllers to the network to achieve a balanced traffic load; in reality we may only run the other protocols, such as LBDC-DM, limited LBDC-CM and LBDC-CM with switch priority. For the limited LBDC-CM, the maximum workload of the controllers also follows a Pareto distribution with αp = 3, and we amplify it by a constant to make sure the total traffic load does not exceed the capacity of all controllers. For LBDC-CM with switch priority, we allocate a value to each mapping of a switch and a controller, which is inversely proportional to their distance; we can also see that it grows significantly as the number of controllers increases. Overall, we can conclude that all four protocols perform quite well in balancing the workload of the entire system.
Run-Time performance w.r.t. static traffic loads. Figure 10 and Figure 11 show the relative weight deviation and the migrated switch number w.r.t. the four protocols at different time slots, under the condition that the global traffic load never changes (the weight of each switch is constant). We can see that the relative weight deviation is decreasing, but the values of limited LBDC-CM and LBDC-CM with switch priority are higher than that of the naive LBDC-CM. This is because in limited LBDC-CM and LBDC-CM with switch priority, each controller has a different upper bound, which influences the migration. For example, if some switches can only be monitored by a certain controller, and that controller is overloaded, a high relative weight deviation results, since we cannot move those switches to other controllers. In addition, controllers in LBDC-CM with switch priority even have a preference when choosing potential switches. In terms of migrated switch numbers, we can see that as time goes by, all four protocols remain stable on the number of migrated switches. LBDC-DM has the lowest number of migrated switches because its controllers can only obtain a local traffic view, resulting in a relatively low migration frequency.

Run-Time performance w.r.t. dynamic traffic loads. Figure 12 and Figure 13 show the relative weight deviation and the migrated switch number w.r.t. the four protocols at different time slots, under the condition that the global traffic load changes dynamically (the weight of each switch is dynamic). Even as the traffic load changes across time slots, the migrated switch number stays relatively stable. If controller c1 is overloaded, it releases some of its dominated switches to nearby controllers. However, if in the next round the switches monitored by those controllers gain a higher traffic load and make the nearby controllers overloaded, the switches may be sent back to controller c1. Thus, to avoid such a frequent swapping phenomenon, we can set an additional parameter for each switch: if its role has been changed in the previous slot, it stays put in the current slot, as sketched below.
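A minimal sketch of that anti-oscillation rule; the pinning container is ours.

# Pin any switch that migrated in the previous time slot for one slot.
def migratable(switch_id, moved_last_slot):
    return switch_id not in moved_last_slot

moved_last_slot = {'s17'}                      # migrated in slot t-1
candidates = ['s17', 's42']
eligible = [s for s in candidates if migratable(s, moved_last_slot)]   # ['s42']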
We may also consider the deviation of load balancing among switches to further improve the system performance. Since the balancing problem among controllers is essentially a “higher level” of the balancing problem among switches, we can implement some load balancing strategies among the switches [25]–[28] and combine the two layers together to achieve a better solution.
5.4 Parameter Specification

Next we explore the impact of the threshold parameters α, β, γ. Here α balances conservativeness against radicalness, β is a crucial parameter that decides whether or not to migrate switches in all four protocols, and γ is used in Step 4 of LBDC-CM. We examine the impact of changing α, β and γ together. TABLE 2 lists the statistics for α ranging between 0.25 and 0.75, β ranging between 1.15 and 1.35, and γ ranging between 1.15 and 1.35. The improvement rate and the number of migrated switches are mostly decreasing as β increases, which is actually correct according to the definition of the threshold. A sketch of the sweep loop follows the table.

TABLE 2
Influence of the α, β and γ factors

α     β     γ     Initial RWD   RWD after LBDC-CM   Rate (%)   Migrated switches
0.25  1.15  1.15  181.87        14.41               92.08      6344
0.25  1.15  1.35  188.54        16.90               91.04      6236
0.25  1.35  1.15  193.81        11.83               93.90      6536
0.25  1.35  1.35  182.76        16.18               91.15      6224
0.75  1.15  1.15  196.73        12.01               93.90      6705
0.75  1.15  1.35  187.62        17.29               90.79      6244
0.75  1.35  1.15  178.77        15.46               91.35      6305
0.75  1.35  1.35  181.01        14.29               92.11      6231
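The sweep behind TABLE 2 can be reproduced with a loop of this shape; run_trial stands in for the Sec. 5.1 simulator and is not code from the paper.

# Sketch of the TABLE 2 parameter sweep over the (alpha, beta, gamma) corners.
from itertools import product

def sweep(run_trial):
    rows = []
    for a, b, g in product((0.25, 0.75), (1.15, 1.35), (1.15, 1.35)):
        initial_rwd, final_rwd, migrated = run_trial(alpha=a, beta=b, gamma=g)
        rate = 100.0 * (initial_rwd - final_rwd) / initial_rwd
        rows.append((a, b, g, initial_rwd, final_rwd, rate, migrated))
    return rows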
VI. RELATED WORK

As data centers become more important in industry, there has been tremendous interest in designing efficient DCNs [1], [2], [29]–[32]. Traffic engineering has likewise been identified as one of the most crucial issues in the area of cloud computing.

The existing DCN usually adopts a centralized controller for aggregation, coordination and resource management [1], [2], [10], [31], which can be energy efficient and can leverage a global view of traffic to make routing decisions. Indeed, using a centralized controller makes the design simpler and sufficient for a fairly large DCN.

However, using a single omniscient controller introduces scalability concerns when the scale of the DCN grows dramatically. To address these issues, researchers installed multiple controllers across the DCN by introducing devolved controllers [4]–[8], [33] and used dynamic flow as an example [5] to illustrate the detailed configuration. The introduction of devolved controllers alleviates the scalability issue, but still introduces some additional problems.

Meanwhile, several works devising distributed controllers [6]–[8] have been proposed for SDN [34] to address the issues of scalability and reliability that a centralized controller suffers from. Software-Defined Networking (SDN) is a network technology that decouples the control plane logic from the data plane and uses a programmable software controller to manage network operation and the state of network components. The SDN paradigm has emerged over the past few years through several initiatives and standards. The leading SDN protocol in the industry is the OpenFlow protocol, specified by the Open Networking Foundation (ONF) [35], which regroups the major network service providers and network manufacturers. The majority of current SDN architectures, OpenFlow-based or vendor-specific, rely on a single controller or master/slave controllers, i.e., a physically centralized control plane. Recently, proposals have been made to physically distribute the SDN control plane, either with a hierarchical organization [36] or with a flat organization [7]. These approaches avoid a single point of failure (SPOF) and make it possible to scale up by sharing load among several controllers. In [34], the authors present distributed NOX-based controllers that interwork through extended GMPLS protocols. HyperFlow [7] is, to the best of our knowledge, the only work so far that also tackles distributing the OpenFlow control plane for the sake of scalability. In contrast to our approach, which designs a traffic load balancing scheme with a well-designed migration protocol under the OpenFlow framework, HyperFlow proposes to push (and passively synchronize) all state (controller-relevant events) to all controllers. This way, each controller thinks it is the only controller, at the cost of requiring minor modifications to applications.

HyperFlow [7], Onix [34], and Devolved Controllers [4] try to distribute the control plane while remaining logically centralized, using a distributed file system, a distributed hash table, and a pre-computation of all possible combinations respectively. Despite their ability to distribute the SDN control plane, these approaches impose a strong requirement: a consistent network-wide view in all the controllers. On the contrary, Kandoo [36] proposes a hierarchical distribution of the controllers based on two layers of controllers.