
Computer Networks :

Protocols and Practice


Part 10 : MultiProtocol Label Switching

Olivier Bonaventure
http://inl.info.ucl.ac.be/

CNPP/2008.10. © O. Bonaventure 2008

These slides are licensed under the Creative Commons Attribution Share-Alike 3.0 license. You can obtain detailed information about this license at http://creativecommons.org/licenses/by-sa/3.0/
MPLS
MultiProtocol Label Switching
Outline
Multiprotocol Label Switching
The label swapping forwarding paradigm
Integrating label swapping and IP

Utilisations of MPLS
Destination based packet forwarding
Simpler ISP backbones
Traffic engineering
QoS support
Fast restoration


A good textbook on MPLS is the following

B. Davie and Y. Rekhter. MPLS Technology and Applications. Morgan Kaufmann, 2000.
A more practical book on MPLS, centered around Cisco routers is :
I. Pepelnjak and J. Guichard, MPLS and VPN Architectures, Cisco Press, 2001

The MPLS technology is standardised notably within IETF, see


http://www.ietf.org/html.charters/mpls-charter.html

Most of the standardisation documents on MPLS, including the deprecated ones may be found at
N. Demizu. Multi layer routing. Available from http://www.watersprings.org/links/mlr/.
High performance packet switching

The challenge
To efficiently support high speed networks, a
router should be capable of switching and
forwarding packets at line rate on all interfaces

Inter-packet time on high-speed interfaces

Memory access time : 10 nsec for SRAM

Architecture of a normal IP router

Control
The "best" paths selected from the routing table
built by the routing protocols are installed in the
forwarding table

[Figure: architecture of a normal IP router. Control plane: routing protocol and routing table. Forwarding plane: forwarding table, with classification (Class.), policing (Pol.) and shaping (Shap.) applied to IP packets.]

Forwarding decision based on longest match


Update of TTL and checksum fields in IP packets
Label swapping
A simpler forwarding paradigm
Principle
On packet arrival, router analyzes
Packet Label
Input Port
Based on its label forwarding table, router decides
Output Port for outgoing packet
Packet Label for outgoing packet

[Figure: routers R1, R2 and R3 exchanging labelled packets (labels A, B, C, X, Y).]

R1 Label forwarding table
Inport  Inlabel  Outport  Outlabel
West    A        East     X
West    B        East     Y

Cost of table lookup
a single memory access !


Label swapping is notably used in ATM and Frame Relay networks.
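
As a minimal sketch of the lookup described on this slide, R1's label forwarding table can be modelled as a dictionary keyed by (input port, input label). The function name `swap_label` is illustrative only and does not come from the slides.

```python
# R1's label forwarding table from the slide.
# Key: (input port, input label); value: (output port, output label).
LABEL_FORWARDING_TABLE = {
    ("West", "A"): ("East", "X"),
    ("West", "B"): ("East", "Y"),
}


def swap_label(inport, inlabel):
    """Return (outport, outlabel) using one exact-match lookup."""
    return LABEL_FORWARDING_TABLE[(inport, inlabel)]


if __name__ == "__main__":
    print(swap_label("West", "A"))
```

Unlike the longest-match lookup of an IP router, the key is matched exactly, which is why a hardware table can answer in a single memory access.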


Architecture of a label switch

Technology specific control plane
ATM
Frame relay
Optical switch
...

[Figure: architecture of a label switch. Control plane above a label forwarding table; classification (Class.), policing (Pol.) and shaping (Shap.) applied to labelled packets.]

Forwarding decision based on exact match


Update of the labels included inside packet
Label-swapping example

BR2 Routing table


default via BR3
BR1, BR3 : direct
Cra Routing table BR3 Routing table
Cra, Crb via BR1 default -> Internet
default via BR1 Label forwarding table BR1, BR2, CRd : direct
Inport Inlabel Outport Outlabel
Label table South L2 S- E L1 Cra, Crb via BR1
Flow1 : L1 via BR1
Flow2 : L2 via BR1
BR2 Crd

Flow 1
Flow 2 Cra
BR3
BR1

Routing table Crb


default via BR1

Label table Flow 3 BR1 Routing table


Flow3 : L1 via BR1 default via BR3 BR3 Label forwarding table
BR2, Cra, Crb : direct Inport Inlabel Outport Outlabel
West L1 East L1
Label forwarding table West L2 East L3
Inport Inlabel Outport Outlabel N-W L1 North L1
West L1 North L2
West L2 East L1
S-W L1 East L2

In principle, an MPLS flow can be considered as a layer 2.5 flow, namely a flow that belongs to an intermediate layer between layer 2 and layer 3.
Historical perspective

Early 1990's
IP routers were essentially high-speed computers
with special software
Impossible to implement IP forwarding in hardware
Performance of routers was limited by their CPU
forwarding assistance with caches to avoid expensive route
lookups

Asynchronous Transfer Mode


Emerging technology relying on label swapping
Hardware implementation was easy at high speeds
155 Mbps, 622 Mbps
To support voice (telephony) all packets were divided in
48-byte cells with a 5-byte header carrying the label


ATM is still widely used as the backbone for ADSL access, but many ADSL deployments are moving towards Ethernet-based access networks
to replace ATM

A description of ATM may be found in :


Martin De Prycker, Asynchronous Transfer Mode. Solutions for Broadband ISDN (Prentice-Hall, 1993)

Other pointers may be found : http://en.wikipedia.org/wiki/Asynchronous_Transfer_Mode


Historical perspective (2)

Early 2000's
Moore's Law
CPU performance doubles every 18 months
VLSI integration allows more complex CPUs
IP routers
New advances in forwarding algorithms
IP forwarding can be easily performed in hardware
chips are complex but wire speed at 2.4 Gbps or 10 Gbps works
Label swapping is no longer necessary from a performance
or hardware implementation point of view

ATM
48+5 bytes cell size is major drawback and implementation
cost
ATM interfaces on IP routers do not exist above 622 Mbps
ATM mainly restricted to medium speed ADSL->1Gbps


Additional information about Gordon Mooreʼs law may be found at http://www.intel.com/technology/mooreslaw/index.htm


Moore's law mainly applies to high volume products such as PC CPUs. It does not completely apply to the specialised ASICs found in routers; however, the performance improvement for ASICs is similar.
Historical perspective (3)

Current motivations for MPLS


Applications
destination based routing
traffic engineering
QoS
fast restoration
Virtual Private Networks

Utilisation of MPLS to control optical or


transmission devices close to label swapping
fibre switch
lambda switch
TDM (SONET/SDH) switches


A detailed description of the utilisation of MPLS to control optical networks is outside the scope of this course. A good description of this utilisation of MPLS may be found in GMPLS: Architecture and Applications (The Morgan Kaufmann Series in Networking) by Adrian Farrel and Igor Bryskin
Historical perspective (4)

MPLS controlled devices


MPLS and IP routing are used to establish flows
through these devices
equipment uses a special forwarding mechanism

Fibre switch
Fibre switching table
Inport  InFiber  OutPort  OutFiber
West    F1       East     F5
West    F2       East     F3
West    F3       South    F2

[Figure: a fibre switch with fibres F1 to F5 on each side, and a λ switch.]
MPLS
MultiProtocol Label Switching
Outline
Multiprotocol Label Switching
The label swapping forwarding paradigm
Integrating label swapping and IP

Utilisation of MPLS
Destination based packet forwarding
Simpler ISP backbones
Traffic engineering
QoS support
Fast restoration

Integrating label swapping and IP

[Figure: edge routers Ra and Rb send pure IP packets (flows 1, 2 and 3) into a backbone made of LSR1, LSR2 and LSR3; packets are labelled inside the backbone and leave it as pure IP packets at the egress edge routers.]

Core LSR : Label-Switching Router
packet forwarding based only on labels

Ingress Edge Router
inserts labels on packets sent through backbone

Egress Edge router
removes labels before sending packets outside MPLS network


The main standardisation documents on MPLS may be found at http://www.ietf.org/html.charters/mpls-charter.html :


R.Callon, E.Rosen, and A.Viswanathan. Multiprotocol label switching architecture. Internet RFC 3031, January 2001.
Integrating label swapping and IP (2)

Problems to be solved
What is a labelled IP packet ?

What is the behaviour of a core LSR ?

What is the behaviour of an edge LSR ?

Labelled IP packets

Generic solution
Insert special 32 bits header in front of IP packet
1 2 3
01234567890123456789012345678901

Label

20 bits :1048576 different labels

Technology specific solutions


Reuse the already available "labels" below layer 3
Frame Relay
Asynchronous Transfer Mode
Fibre/lambda switching with special label semantics
SONET/SDH with special label semantics


The encoding of the MPLS label is defined in :


E.Rosen, Y.Rekhter, D.Tappan, D.Farinacci, G.Fedorkow, T.Li, and A.Conta. MPLS label stack encoding. RFC 3032, 2001.

The utilisation of MPLS to support ATM and frame relay switches are discussed in :
B.Davie, J.Lawrence, K.McCloghrie, E.Rosen, G.Swallow, Y.Rekhter, and P.Doolan. MPLS using LDP and ATM VC switching. Internet RFC
3035, January 2001.
A.Conta, P.Doolan, and A.Malis. Use of label switching on frame relay networks specification. Internet RFC 3034, January 2001.
Operations performed on labelled packet

Three types of operations


PUSH
insert a label in front of a received packet
SWAP
change the value of the label of a received labelled
packet
POP
remove the label in front of a received labelled packet

Perform PUSH Perform SWAP


Performs POP

Ra
Flow 2 LSR3 Rd
LSR1
Pure IP
packets Pure IP
Rb Flow 3 Labelled packets packets

Content of the
Label forwarding table

InLabel -> NextHop; Op

InLabel
Identification of the label of the incoming packet

Op
Operation to be performed on label
swap label in front of packet
push new label in front of packet
pop label in front of packet

NextHop
Next hop for the packet
outgoing interface
packet sent to another LSR
LSR itself
packet destination is LSR
packet needs to be routed as a normal IP packet after pop

Label spaces

An LSR may manage its labels in two ways

Per interface label space
one label space for each interface
2^20 distinct labels for each interface of the LSR

L3's per-interface Label forwarding table
Inlabel    Outport  Operation
West:L3    South    SWAP(L0)
West:L4    East     SWAP(L3)
South:L3   West     SWAP(L5)

Per LSR label space
a single label space for all interfaces
2^20 distinct labels for the LSR

L4's per-LSR Label forwarding table
Inlabel    Outport  Operation
L0         East     SWAP(L0)
L4         South    SWAP(L3)
L3         East     SWAP(L5)
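
The difference between the two label spaces is only the lookup key. The sketch below reuses the two tables from this slide; the function names are illustrative assumptions.

```python
# L3 uses a per-interface label space: the key includes the input interface,
# so label L3 can mean two different things on West and on South.
L3_TABLE = {
    ("West", "L3"): ("South", "SWAP(L0)"),
    ("West", "L4"): ("East", "SWAP(L3)"),
    ("South", "L3"): ("West", "SWAP(L5)"),
}

# L4 uses a per-LSR label space: the label alone identifies the entry.
L4_TABLE = {
    "L0": ("East", "SWAP(L0)"),
    "L4": ("South", "SWAP(L3)"),
    "L3": ("East", "SWAP(L5)"),
}


def lookup_per_interface(table, inport, label):
    """Exact match on (input interface, label)."""
    return table[(inport, label)]


def lookup_per_lsr(table, inport, label):
    """Exact match on the label only; the input interface is ignored."""
    return table[label]


if __name__ == "__main__":
    print(lookup_per_interface(L3_TABLE, "West", "L3"))
    print(lookup_per_interface(L3_TABLE, "South", "L3"))
    print(lookup_per_lsr(L4_TABLE, "West", "L3"))
```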
LSP
Label switched path
IP packets
IP packets
L0 L2
LSP2 POP L4's Label forwarding table
Inport Inlabel Outport Operation
Ingress: L1 POP
SE L6 NW SWAP(L2)
L5, L6, L7, L4 L4 SE L1 NE SWAP(L5)
Egress:L0
PUSH(L3) L7's Label forwarding table
IP packets Inport Inlabel Outport Operation
L3 L7 SW L5 NW SWAP(L6)
SW L0 NW SWAP(L1)

LSP 1 L6's Label forwarding table


Inport Inlabel Outport Operation
Ingress: L3 L5 L6
W L5 NE SWAP(L5)
L5, L7, L6, L4, W L0 NE SWAP(L0)
Egress:L2
IP packets L5's Label forwarding table
L1 Inport Inlabel Outport Operation
PUSH(L4) S L E SWAP(L5)
LSP NW L3 E SWAP(L0)

a path followed by a labelled packet over several hops


starting at an ingress LSR and ending at an egress LSR
a LSP is usually unidirectional (ingress -> egress)

In this case, we assume that per interface label space is used. The same example can of course be drawn for per-LSR label space.
How to improve scalability ?

L0 L2

Small LSP
Ingress: L1, L5, L7, L4, Egress:L0 L4

L3 L7

Small LSP
Ingress: L3, L5, L7, L4, Egress:L2 L5 L6

L6 only sees one LSP


can use smaller LFT
L1
Large LSP carrying several different LSPs
Ingress: L5, L6, Egress:L7
Scalability of LSPs
it should be possible to place small LSPs inside
large LSPs

The need to support flows carrying other flows is common to most networking technologies. ATM networks rely on the utilisation of virtual
paths that can carry a large number of virtual circuits.
Labelled IP packets (more)

How to support hierarchy of LSPs ?


it should be possible to place small LSPs inside
large LSPs
ideally, there should be no predefined limit on the
number of levels supported

Solution adopted by MPLS


each labelled packet can carry a stack of labels
1 2 3
01234567890123456789012345678901

Label S

label at the top of the stack appears first in packet


S=1 if the label is at the bottom of the stack
S=0 if the label is not at the bottom of the stack

The stack of labels is one of the major innovations of MPLS compared to the other label-based forwarding techniques. The utilisation of a
stack is, of course, the reason why the two basic operations of ingress and egress LSRs are called push and pop.
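
The following sketch packs and unpacks a label stack as a sequence of 32-bit entries. The slides show only the Label and S fields here (and the TTL field later); the full layout used below (20-bit label, 3 experimental bits, 1-bit S, 8-bit TTL) is taken from RFC 3032, and the function names and default values are assumptions made for illustration.

```python
import struct


def encode_entry(label, s, ttl=64, exp=0):
    """Pack one label stack entry: Label(20) | Exp(3) | S(1) | TTL(8)."""
    if not 0 <= label < 2 ** 20:
        raise ValueError("label must fit in 20 bits")
    if s not in (0, 1) or not 0 <= exp < 8 or not 0 <= ttl < 256:
        raise ValueError("field out of range")
    word = (label << 12) | (exp << 9) | (s << 8) | ttl
    return struct.pack("!I", word)  # network byte order


def encode_stack(labels, ttl=64):
    """labels[0] is the top of the stack and appears first in the packet."""
    last = len(labels) - 1
    return b"".join(
        encode_entry(label, 1 if i == last else 0, ttl)
        for i, label in enumerate(labels)
    )


def decode_stack(data):
    """Read entries until S=1; return (labels, remaining payload)."""
    labels = []
    offset = 0
    while True:
        (word,) = struct.unpack_from("!I", data, offset)
        offset += 4
        labels.append(word >> 12)
        if (word >> 8) & 1:  # S bit set: bottom of the stack
            return labels, data[offset:]


if __name__ == "__main__":
    packet = encode_stack([100, 200]) + b"IP payload"
    print(decode_stack(packet))
```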
Content of the
Label forwarding table (more)

InLabel -> NextHop; Operation

InLabel
Identification of the incoming label
Interface:label for per interface label space
label for per-LSR label space

Operation
Operation to be performed on stack
swap label on top of the stack
(swap label on top of the stack and) push new label on top of the stack
pop the label stack

NextHop
Next hop for the packet
outgoing interface
LSR itself
packet destination is LSR
packet needs to be routed after pop operation

In the case of multicast LSPs it should be possible to establish LSPs as trees. In this case, a LSR may serve as a branch point and a packet
might need to be replicated on several outgoing interfaces. For this reason, each line of the label forwarding table may contain several pairs
(NextHop; Operation), one for each outgoing interface on the multicast tree.
Content of the
Label forwarding table (more) (2)
Example
L3's Label forwarding table
Inlabel  Nexthop     Operation
L3       North-East  SWAP(L0)
L4       East        POP
L6       Local       POP
L5       South-East  PUSH(L1)
L9       South-East  SWAP(L8); PUSH(L4)

[Figure: LSR L3 and its neighbours, with the label stacks carried by packets before and after L3.]

In this example, the packet received with label L6 has a special treatment. First, label L6 is popped and the destination is the local LSR. After
the POP operation, the packet contains label L3 and according to the first line of the label forwarding table, this label is swapped to L0 and the
packet is sent on the NE link.
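
The special treatment of label L6 can be sketched as a loop: when the next hop is the LSR itself and a label remains after the POP, the table is consulted again. The table below is L3's table from this slide; the `forward` function and the `"route as IP"` marker are illustrative assumptions.

```python
# L3's label forwarding table: inlabel -> (next hop, list of operations).
TABLE = {
    "L3": ("North-East", [("SWAP", "L0")]),
    "L4": ("East", [("POP", None)]),
    "L6": ("Local", [("POP", None)]),
    "L5": ("South-East", [("PUSH", "L1")]),
    "L9": ("South-East", [("SWAP", "L8"), ("PUSH", "L4")]),
}


def forward(stack):
    """stack[0] is the top label. Return (next hop, resulting stack)."""
    stack = list(stack)
    while True:
        nexthop, operations = TABLE[stack[0]]
        for op, label in operations:
            if op == "SWAP":
                stack[0] = label
            elif op == "PUSH":
                stack.insert(0, label)
            elif op == "POP":
                stack.pop(0)
        if nexthop != "Local":
            return nexthop, stack
        if not stack:
            # No label left: the packet is handled as a normal IP packet.
            return "route as IP", stack
        # Next hop is the LSR itself and a label remains: look it up again.


if __name__ == "__main__":
    print(forward(["L6", "L3"]))
    print(forward(["L9"]))
```

The first call reproduces the case in the note: L6 is popped locally, then L3 is swapped to L0 and the packet leaves on the North-East link.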
MPLS and label stacks
Example
L0 L2

Small LSP L4's Label forwarding table


Inport Inlabel Outport Operation
Ingress: L1, L5, L7, L4, Egress:L0 L4 SE L1 NE SWAP(L5)
SE L6 NW SWAP(L2)

L3 L7

L7's Label forwarding table


Small LSP Inport Inlabel Outport Operation
Ingress: L3, L5, L7, L4, Egress:L2 L5 L6 SW L0 NW SWAP(L1)
SW L5 NW SWAP(L6)

L1 L6's Label forwarding table


Inport Inlabel Outport Operation
W L1 NE POP

L5's Label forwarding table L6 only sees one LSP !


Inport Inlabel Outport Operation
NW L3 E SWAP(L0), PUSH L1 Large LSP carrying several different LSPs
S L4 E SWAP(L5), PUSH L1 Ingress: L5, L6, Egress:L7


In practice, the large LSP between L5 and L7 would pass through several intermediate LSRs. For graphical reasons, only L6 is shown on the
slide.
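
To see why L6 needs a single entry, the two small LSPs can be traced through the tables of this slide. The sketch assumes the hop order L5, L6, L7, L4 and the input ports implied by the tables (L5 East to L6 West, L6 North-East to L7 South-West, L7 North-West to L4 South-East); the `trace` function is hypothetical.

```python
# Per-interface tables from the slide: (inport, top label) -> (outport, operations).
TABLES = {
    "L5": {("NW", "L3"): ("E", [("SWAP", "L0"), ("PUSH", "L1")]),
           ("S", "L4"): ("E", [("SWAP", "L5"), ("PUSH", "L1")])},
    "L6": {("W", "L1"): ("NE", [("POP", None)])},
    "L7": {("SW", "L0"): ("NW", [("SWAP", "L1")]),
           ("SW", "L5"): ("NW", [("SWAP", "L6")])},
    "L4": {("SE", "L1"): ("NE", [("SWAP", "L5")]),
           ("SE", "L6"): ("NW", [("SWAP", "L2")])},
}

# Assumed hop order and input port at each hop after L5.
NEXT_HOPS = [("L6", "W"), ("L7", "SW"), ("L4", "SE")]


def trace(first_inport, stack):
    """Follow a packet from L5 to L4; return (last outport, stack after each hop)."""
    stack = list(stack)
    history = []
    outport = None
    for lsr, inport in [("L5", first_inport)] + NEXT_HOPS:
        outport, operations = TABLES[lsr][(inport, stack[0])]
        for op, label in operations:
            if op == "SWAP":
                stack[0] = label
            elif op == "PUSH":
                stack.insert(0, label)
            elif op == "POP":
                stack.pop(0)
        history.append((lsr, tuple(stack)))
    return outport, history


if __name__ == "__main__":
    print(trace("NW", ["L3"]))  # small LSP entering L5 from the North-West
    print(trace("S", ["L4"]))   # small LSP entering L5 from the South
```

Both small LSPs reach L6 with the same top label L1, so L6 forwards them with one entry while the inner labels stay untouched until L7.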
Behaviour of ingress edge LSR

How does edge LSR at ingress determine


the label to be used to forward a received
packet ?
Principle
1. Divide the set of all possible packets into several
Forwarding Equivalence Classes (FEC)
a FEC is a group of IP packets that are forwarded in the same
manner (e.g. over the same path, with the same forwarding
treatment)
examples
all packets sent to the same destination prefix
all packets sent to the same BGP next hop

2. Associate the same label to all the packets that belong


to the same FEC


The FEC is defined in RFC3031


Behaviour of ingress edge LSR (2)

Example
LSP1 : Ingress=Ra, L1, L2, Egress=Rc
LSP2 : Ingress=Ra, L1, L3, Egress=Rd
LSP3 : Ingress=Ra, L1, Egress=Rb

Ra's Mapping table
IP dest prefix    LSP
138.48.0.0/16     LSP3
139.165.0.0/16    LSP2
12.0.0.0/8        LSP1
11.0.0.0/8        LSP2
192.163.13.1      LSP1

[Figure: ingress Ra connected to LSR L1; LSP 1 via L2 to Rc, LSP 2 via L3 to Rd, LSP 3 to Rb; the destination prefixes are attached behind the egress routers.]


In this example, edge LSR Ra groups all the packets sent to the same egress router (Rb, Rc, Rd) inside a single LSP. Other mappings are of course possible, but this mapping is the one most often used in practice.
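
A minimal sketch of Ra's mapping step, assuming the FEC is selected by longest prefix match on the destination address. The table is Ra's mapping table from the slide (the host address is written as a /32); `select_lsp` is an illustrative name.

```python
import ipaddress

# Ra's mapping table: destination prefix -> LSP.
MAPPING = {
    "138.48.0.0/16": "LSP3",
    "139.165.0.0/16": "LSP2",
    "12.0.0.0/8": "LSP1",
    "11.0.0.0/8": "LSP2",
    "192.163.13.1/32": "LSP1",
}
TABLE = [(ipaddress.ip_network(prefix), lsp) for prefix, lsp in MAPPING.items()]


def select_lsp(destination):
    """Return the LSP of the longest matching prefix, or None if no FEC matches."""
    address = ipaddress.ip_address(destination)
    matches = [(net.prefixlen, lsp) for net, lsp in TABLE if address in net]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0])[1]


if __name__ == "__main__":
    for destination in ("138.48.32.1", "11.2.3.4", "192.163.13.1", "8.8.8.8"):
        print(destination, select_lsp(destination))
```

A packet with no matching FEC returns `None` here; what a real ingress LSR does in that case (for example plain IP forwarding) is outside this sketch.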
MPLS
MultiProtocol Label Switching
Outline
Multiprotocol Label Switching
The label swapping forwarding paradigm
Integrating label swapping and IP

Utilisations of MPLS
Destination based packet forwarding
Simpler ISP backbones
Traffic engineering
QoS support
Fast restoration

Destination-based packet forwarding

The problem

LSR2 Rc
Labelled IP
Pure IP packets
packets Ra Pure IP
LSR3 Rd packets
LSR1

Rb

How to provide transit service when


Edge LSRs are able to attach and remove labels
Edge LSR and Core LSRs run IP routing protocols and
maintain IP routing tables
Core LSR can only forward labelled packets
Core LSR cannot route IP packets efficiently !


The destination-based packet forwarding problem was initially solved to support LSRs that were not able to forward IP packets efficiently at high speed. It is now used only with high-end routers/LSRs in the backbone of large networks. Those routers are able to forward both IP packets and labelled packets at line rate, but MPLS is still used for its other benefits, even if not for pure performance reasons.
Destination-based packet forwarding (2)

Manual solution
Create full mesh of LSPs between all edge LSRs

LSR2 Rc

Pure IP
packets Ra Pure IP
LSR3 Rd packets
LSR1

Rb

Problems to be solved
N edge LSRs -> N*(N-1) unidirectional LSPs
How to automate LSP establishment ?
How to reduce the number of required LSPs ?

The main issue with the full mesh of LSPs is the size of the label forwarding tables in the core LSRs. If each LSR needs to maintain a table with N^2 entries, this may create performance or memory problems in large networks. Although MPLS supports 2^20 different labels on each interface, MPLS allows LSRs to only support a limited number of labels. This number could depend on the amount of memory available on the LSR or on its interfaces.
Architecture of a core LSR

[Figure: architecture of a core LSR. Control plane: IP routing protocol, IP routing table, label distribution and label table. Forwarding plane: label forwarding table, with classification (Class.), policing (Pol.) and shaping (Shap.) applied to labelled packets.]


In practice, a core LSR is also able to forward and route IP packets, but the achieved performance is often much lower than with labelled
packets. This is the reason why IP packet forwarding is not shown in this architecture.
Architecture of an ingress edge LSR

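[Figure: architecture of an ingress edge LSR. Control plane: IP routing protocol, IP routing table, label distribution and label table. Forwarding plane: incoming IP packets go through "Identify FEC" and "Attach Label", with classification (Class.), policing (Pol.) and shaping (Shap.), and leave as labelled packets.]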


This figure corresponds to the ingress part of an edge LSR.


The egress part of an edge LSR will receive labelled IP packets and will remove the labels before acting as a normal IP router.
Distributing labels

How to fill the label forwarding tables of all


LSRs in a given network ?
Use a special protocol to distribute FEC-label
mappings
LDP : Label Distribution Protocol
RSVP-TE : extensions to RSVP

Piggyback FEC-label mappings inside messages


sent by routing protocol
possible if routing protocol is extensible
BGP can be easily modified to associate label with route
RIP cannot be used because its syntax is not extensible
link-state protocols (OSPF, IS-IS) do not distribute routes


We discuss in this section the LDP protocol. The utilisation of RSVP and BGP to distribute labels will be explained later.
How to distribute labels ?

Who determines the FEC-label mapping ?


packets are sent by upstream LSR
towards downstream LSR

FEC-label mappings are sent by


downstream LSR towards upstream LSR

Flow of packets

L1 L2

FEC-label mappings


This way of allocating the labels is useful for implementation reasons. For example, consider an LSR that uses high-speed label forwarding tables that contain only 1024 entries. With a downstream allocation of the labels, this LSR can simply allocate only the values between 0 and 1023 and be sure to receive packets with those values. The lookup in the label forwarding table can be implemented as a direct access to LFT[L] where L is the received label.
If the upstream node was allocating the labels, then the LSR should be able to receive packets with any label value. To use a LFT with 1024 entries, it could rely on a secondary table used to map received labels to LFT indexes. The LFT lookup would then be implemented as LFT[index[L]] at the cost of two memory accesses.
An alternative would be to use hash functions instead of a table, but then the hash collisions must be taken into account.
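
The two lookup strategies in this note can be sketched as follows. The class names are hypothetical, and the Python dictionary used for the secondary `index` table only stands in for whatever structure real hardware would use.

```python
LFT_SIZE = 1024  # size of the high-speed table assumed in the note


class DownstreamAllocatedLFT:
    """The LSR chooses its own labels, so a label is directly an LFT index."""

    def __init__(self):
        self.lft = [None] * LFT_SIZE
        self.next_free = 0

    def allocate(self, entry):
        """Pick the next free label in 0..1023 and install the entry."""
        label = self.next_free
        self.lft[label] = entry
        self.next_free += 1
        return label  # advertised to the upstream LSR

    def lookup(self, label):
        return self.lft[label]  # LFT[L]: one memory access


class UpstreamAllocatedLFT:
    """The upstream node chooses labels, so any 20-bit value may arrive."""

    def __init__(self):
        self.lft = [None] * LFT_SIZE
        self.index = {}  # secondary table: received label -> LFT index
        self.next_free = 0

    def install(self, label, entry):
        self.index[label] = self.next_free
        self.lft[self.next_free] = entry
        self.next_free += 1

    def lookup(self, label):
        return self.lft[self.index[label]]  # LFT[index[L]]: two accesses


if __name__ == "__main__":
    downstream = DownstreamAllocatedLFT()
    label = downstream.allocate(("East", "POP"))
    print(label, downstream.lookup(label))

    upstream = UpstreamAllocatedLFT()
    upstream.install(900000, ("East", "POP"))
    print(upstream.lookup(900000))
```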
Label distribution modes

When to distribute the FEC-label mapping?


Downstream on demand label distribution
upstream LSR requests and downstream allocates label

Label Request(FEC)

L1 L2

Label Mapping(FEC:Label)
advantages
the FEC-label mappings are only distributed when needed
LSRs only need to store the FEC-label mappings that are in use

drawback
when a next-hop fails, some time may elapse while a new
FEC-label mapping is being requested from the new next-hop


We will see RSVP-TE later as an example of downstream on demand label distribution.


Label distribution modes (2)

When to distribute the FEC-label mapping ?


Unsolicited downstream label distribution
downstream LSR announces independently FEC-label
mappings to upstream LSR

Label Mapping(FEC1:Label2)

L1 L2 L3

Label Mapping(FEC1:Label1) FEC1


advantage
Each LSR can obtain several labels for each FEC
in case of failure, LSR can quickly switch from one label to
another
drawback
Labels may not be distributed at the best time


We will see LDP and BGP as examples of unsolicited downstream label distribution.
Label Distribution Protocol

LDP : Label Distribution Protocol


Designed to distribute FEC-label mappings on a
hop-by-hop basis inside network when labels
cannot be distributed by routing protocols
Neighbour discovery over UDP
determine whether the neighbour is a LSR or a normal router
Distribution of FEC-label mappings over TCP
several modes of distribution are supported by LDP
we will only provide some examples of LDP
Main messages
Initialisation
establishment of a LDP session
Keepalive
used to verify that the LDP session is still up
Label mapping
used by LSR to announce a FEC-label mapping
Label withdrawal
Used by LSR to withdraw a previous FEC-label mapping

The Label Distribution Protocol is defined in :


L.Andersson, P.Doolan, N.Feldman, A.Fredette, and B.Thomas. LDP specification. Internet RFC 3036, January 2001.
B. Thomas, E. Gray, LDP Applicability, RFC 3037, 2001
Additional details may be found in :
C.Boscher, P.Cheval, L.Wu, and E.Gray. LDP state machine. RFC 3215, January 2002.
Neighbour discovery

Principle
LSRs periodically send LDP Hello packets to their neighbours
on "all routers" multicast address
LDP neighbour discovery uses UDP port 646
neighbours respond with LDP Hello if they are LSR

LDP Hello LDP Hello


R L2 L1
LDP Hello
L1 is LSR
R is not LSR
LSR with highest IP address becomes active and
establishes TCP connection for LDP on port 646
LSR with lowest IP address becomes passive and waits for the
establishment of TCP connection for LDP session
TCP session established on port 646
LDP session establishment allows negotiation of options

The initialisation of the LDP session is done by sending the INITIALIZATION message. This message may contain several options. If the
remote LSR accepts the LDP session with the proposed options, it replies with a KEEPALIVE message.

KEEPALIVE messages are regularly exchanged over each LDP session to ensure that the LDP session is still up and running and to detect
failures.
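
The active/passive rule from the slide can be sketched as a numeric comparison of the two addresses. The function name `ldp_role` and the sample addresses are assumptions; only the "highest address is active" rule and port 646 come from the slide.

```python
import ipaddress

LDP_PORT = 646  # used for both UDP discovery and the TCP session


def ldp_role(local_address, remote_address):
    """Return "active" if the local LSR must open the TCP connection."""
    local = int(ipaddress.ip_address(local_address))
    remote = int(ipaddress.ip_address(remote_address))
    if local == remote:
        raise ValueError("both LSRs use the same address")
    return "active" if local > remote else "passive"


if __name__ == "__main__":
    print(ldp_role("10.0.0.2", "10.0.0.1"))
    print(ldp_role("9.255.255.255", "10.0.0.1"))
```

The addresses are compared as integers, not as strings, which is why "9.255.255.255" is lower than "10.0.0.1" in the second call.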
LDP messages

Initialization
used during LDP session establishment to announce and
negotiate options
Keepalive
sent periodically in absence of other messages
Label mapping
used by LSR to announce a FEC-label mapping
Label withdrawal
used by LSR to withdraw a previous FEC-label mapping
Label Release
used by LSR to indicate that it will not use a previously
received FEC-label mapping
Label Request
used by LSR to request a label for a specific FEC


We do not cover the label-release and label request messages in this presentation.
Destination based forwarding

Principle
Labelled packets should follow the same path
through the network as if they were pure IP packets

LSR2 Rc

Pure IP
packets Ra Pure IP
LSR3 Rd packets
LSR1

Rb

create a tree shaped LSP rooted on each egress LSR


similar to the way IP routing would forward packets
one tree per egress LSR, reduces total number of LSPs
distribute the labels to build those trees

Note that by using a tree-shaped LSP, it is possible to significantly reduce the size of the label forwarding tables of the core LSR.
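
A small sketch of the counts involved, using the N*(N-1) figure given earlier for the full mesh and one tree per egress LSR for the tree-shaped solution. The function names are illustrative.

```python
def full_mesh_lsps(edge_lsrs):
    """Unidirectional LSPs needed for a full mesh between edge LSRs."""
    return edge_lsrs * (edge_lsrs - 1)


def tree_lsps(edge_lsrs):
    """One tree-shaped LSP rooted on each egress LSR."""
    return edge_lsrs


if __name__ == "__main__":
    for n in (4, 10, 100):
        print(n, full_mesh_lsps(n), tree_lsps(n))
```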
Destination based forwarding (2)
How does LDP create the LSPs ?

LA Routing table
RC : East [via LB]
LB Routing table
RC : East
LD Routing table
RC : North [via LA]
LE Routing table
RC : North-East

[Figure: LSRs LA, LB, LD and LE and router RC; packets to RC.]


In this example, we assume that the link between LD and LE is a lower bandwidth link whose metric is higher than the other links in the
network. For this reason, the link is not preferred by the IGP. The other links have the same metric.

To be able to create an LSP, LDP needs to know the routes for each destination for which a LSP will be established. In practice, the routing
table will be much larger than as shown on the slide.
Destination based forwarding (3)
First step : choose destination for which a LSP
will be established

LA Routing table
RC : East [via LB]
LB Routing table
RC : East
LB finds that next hop RC is a pure IP router
It will advertise a label mapping for the FEC RC
LD Routing table
RC : North [via LA]
LE Routing table
RC : North-East
LE finds that next hop RC is a pure IP router
It will advertise a label mapping for the FEC RC

[Figure: LSRs LA, LB, LD and LE and router RC.]


In the following slides, we no longer show the IP routing tables, but they are still used by LDP.
Destination based forwarding (4)
Advertising the mappings at LB and LE

LA
Label table : -
FEC table : RC -> PUSH L4, East
Received Mappings : RC : East:L4 (chosen)

LB
Label table : South:L3 -> East:POP
              West:L4 -> East:POP
Received Mappings : RC : South:L2 (unused)

LD
Label table : -
FEC table : -
Received Mappings : RC : East:L1 (unused)

LE
Label table : East:L1 -> NE:POP
              North:L2 -> NE:POP
Received Mappings : RC : North:L3 (unused)

Messages
LB to LA : L Mapping RC:L4
LB to LE : L Mapping RC:L3
LE to LB : L Mapping RC:L2
LE to LD : L Mapping RC:L1


At this point, LE is able to receive MPLS packets from LD and LB. However, since the IGP path to RC from LB and LD is not via LE, LE will not
receive such MPLS packets.

The first LSP that has been established is the one between LA and LB. LA has chosen the mapping advertised by LB because its IGP path to
reach RC is via LB.

LE did not select the mapping advertised by LB. LB did not select the mapping advertised by LE.
Destination based forwarding (5)

LA
Routing table : RC : East [via LB]
Label table : South:L3 -> East:L4
FEC table : RC -> PUSH L4, East
Received Mappings : RC : East:L4 (chosen)

LB
Routing table : RC : East
Label table : South:L3 -> East:POP
              West:L4 -> East:POP
Received Mappings : RC : South:L2 (unused)

LD
Routing table : RC : North [via LA]
Label table : -
FEC table : RC -> PUSH L3, North
Received Mappings : RC : East:L1 (unused)
                    RC : North:L3 (chosen)

LE
Routing table : RC : North-East
Label table : East:L1 -> NE:POP
              North:L2 -> NE:POP
Received Mappings : RC : North:L3 (unused)

Messages
LA to LD : L Mapping RC:L3

[Figure: MPLS packets flow from LD via LA to LB; IP packets flow from LB to RC.]


The blue arrows show the tree-shaped LSP used by LD and LA to send labelled packets whose destination is RC via LB which is their
preferred egress LSR. LE sends directly its packets to RC.

Note that the MPLS packets flow in the opposite direction of the mappings.
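
The way LA and LD pick one of their received mappings can be sketched as follows: among the mappings received for a FEC, the LSR uses the one that arrived on the interface that its routing table selects for that FEC. The data reproduces the tables of the example; `build_fec_table` is a hypothetical helper.

```python
def build_fec_table(routing_table, received_mappings):
    """Select, for each FEC, the label advertised by the IGP next hop.

    routing_table: FEC -> outgoing interface chosen by the IGP
    received_mappings: FEC -> {interface: label received on that interface}
    """
    fec_table = {}
    for fec, interface in routing_table.items():
        label = received_mappings.get(fec, {}).get(interface)
        if label is not None:
            fec_table[fec] = ("PUSH", label, interface)
    return fec_table


# LD reaches RC via LA (North) and received mappings from LE and LA.
LD_ROUTES = {"RC": "North"}
LD_RECEIVED = {"RC": {"East": "L1", "North": "L3"}}

# LA reaches RC via LB (East) and received one mapping from LB.
LA_ROUTES = {"RC": "East"}
LA_RECEIVED = {"RC": {"East": "L4"}}

if __name__ == "__main__":
    print(build_fec_table(LD_ROUTES, LD_RECEIVED))
    print(build_fec_table(LA_ROUTES, LA_RECEIVED))
```

The mapping that LD received from LE (East:L1) stays in its received mappings but is not installed, which matches the "(unused)" annotation.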
Destination based forwarding (6)

Packet flow
1. LD pushes label L3 to packets sent towards RC : [D: RC] becomes [L3 | D: RC]
2. LA swaps label L3 with label L4 : [L4 | D: RC]
3. LB pops label L4 and sends to RC : [D: RC]

LA
Routing table : RC : East [via LB]
Label table : South:L3 -> East:L4
FEC table : RC -> PUSH L4, East
Received Mappings : RC : East:L4 (chosen)

LB
Routing table : RC : East
Label table : South:L3 -> East:POP
              West:L4 -> East:POP
Received Mappings : RC : South:L2 (unused)

LD
Routing table : RC : North [via LA]
Label table : -
FEC table : RC -> PUSH L3, North
Received Mappings : RC : East:L1 (unused)
                    RC : North:L3 (chosen)

LE
Routing table : RC : North-East
Label table : East:L1 -> NE:POP
              North:L2 -> NE:POP
Received Mappings : RC : North:L3 (unused)


This example shows the transmission of one packet from the workstation attached to LD to destination RC.
Destination based forwarding (7)

How to deal with link failures ?


LB notices the link failure
Since LB received a mapping from LE, it may immediately update its label
table to "reroute" the traffic towards LE

LE will accept traffic from LB and will remove the RC : North:L3
label from its received mappings

LA
Routing table : RC : East [via LB]
Label table : South:L3 -> East:L4
FEC table : RC -> PUSH L4, East
Received Mappings : RC : East:L4 (chosen)

LB
Routing table : RC : East (failed)
Label table : South:L3 -> East:POP
              West:L4 -> South:L2
Received Mappings : RC : South:L2

LD
Routing table : RC : North [via LA]
Label table : -
FEC table : RC -> PUSH L3, North
Received Mappings : RC : East:L1
                    RC : North:L3 (chosen)

LE
Routing table : RC : North-East
Label table : East:L1 -> NE:POP
              North:L2 -> NE:POP
Received Mappings : RC : North:L3

Messages
LB to LE : L Withdraw RC:L3

When LSR LB detects the failure of the link that it uses to reach RC, it can immediately update its label forwarding table to use the label that it received from LE. LB does not need to change the label that it advertised to LA since it is still able to reach RC via LE. However, LB must inform LE that the label L3 that it advertised earlier is no longer available. This is done with the label withdraw message.

After some time, the routing protocol will update the routing tables to reflect the link failure. The updates to the routing tables will trigger the
distribution of new label mappings.
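
LB's reaction to the failure can be sketched as a table update that reuses a mapping kept from another neighbour. The data is LB's state from the example; the `reroute` function, its return format and the rule of never sending traffic back to the neighbour it came from are assumptions made for this sketch.

```python
def reroute(label_table, retained, failed_port):
    """Update a label table after the link on failed_port goes down.

    label_table: (inport, inlabel) -> (outport, "POP" or outgoing label)
    retained: port -> label received earlier from the neighbour on that port
    Returns (updated table, entries whose label must be withdrawn).
    """
    updated = dict(label_table)
    withdrawn = []
    for (inport, inlabel), (outport, _action) in label_table.items():
        if outport != failed_port:
            continue
        # Never send the packet back to the neighbour it came from.
        alternatives = [
            (port, label)
            for port, label in retained.items()
            if port not in (failed_port, inport)
        ]
        if alternatives:
            updated[(inport, inlabel)] = alternatives[0]
        else:
            withdrawn.append((inport, inlabel))
    return updated, withdrawn


LB_LABEL_TABLE = {
    ("South", "L3"): ("East", "POP"),
    ("West", "L4"): ("East", "POP"),
}
LB_RETAINED = {"South": "L2"}  # mapping received from LE for RC

if __name__ == "__main__":
    table, withdrawn = reroute(LB_LABEL_TABLE, LB_RETAINED, "East")
    print(table)
    print(withdrawn)
```

With LB's data, the entry for traffic from LA becomes West:L4 -> South:L2, while the label L3 advertised to LE has no usable alternative and is reported for withdrawal, as on the slide.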
MPLS and transient loops

What if routing has created a transient loop


while LSP is being established ?
LSP could be looped inside network

How to recover from looped LSPs ?


Discard looped packets as in IP
TTL field in MPLS header
1 2 3
01234567890123456789012345678901

Label S TTL

not suitable for technologies with their own header (ATM)


Prevent loops inside LDP
indicate the entire path of LSP in LDP Label request and
LDP label mapping messages

We do not discuss in this presentation the utilisation of LDP to prevent loops.


Utilisation of the MPLS TTL

Example : an IP packet with destination RC travels along LD -> LA -> LB -> RC

LD
Routing table : RC : North [via LA]
Label table : -
FEC table : RC -> PUSH L3, North
Received Mappings : RC : East:L1 (unused), RC : North:L3 (chosen)
Action : adds label L3, copies TTL from IP packet, decrements MPLS TTL
Packet sent to LA : [L3 | D: RC]

LA
Routing table : RC : East [via LB]
Label table : South:L3 -> East:L4
FEC table : RC -> PUSH L4, East
Received Mappings : RC : East:L4 (chosen)
Action : swaps label L3 with label L4, decrements MPLS TTL
Packet sent to LB : [L4 | D: RC]

LB
Routing table : RC : East
Label table : South:L3 -> East:POP, West:L4 -> East:POP
Received Mappings : RC : South:L2 (unused)
Action : decrements MPLS TTL, removes label L4 and copies MPLS TTL in IP TTL before sending to RC
Packet sent to RC : [D: RC]

LE
Routing table : RC : North-East
Label table : East:L1 -> NE:POP, North:L2 -> NE:POP
Received Mappings : RC : North:L3 (unused)
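
The TTL handling on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Packet` class and the `push`, `swap` and `pop` helpers are hypothetical names, and each helper returns `False` when the TTL expires and the packet would be discarded.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    ip_ttl: int
    label: Optional[str] = None
    mpls_ttl: Optional[int] = None


def push(pkt: Packet, label: str) -> bool:
    """Ingress LSR: copy the IP TTL into the MPLS header, then decrement it."""
    pkt.label = label
    pkt.mpls_ttl = pkt.ip_ttl - 1
    return pkt.mpls_ttl > 0


def swap(pkt: Packet, new_label: str) -> bool:
    """Transit LSR: swap the label and decrement the MPLS TTL."""
    pkt.label = new_label
    pkt.mpls_ttl -= 1
    return pkt.mpls_ttl > 0


def pop(pkt: Packet) -> bool:
    """Last LSR: decrement the MPLS TTL, copy it back into the IP TTL, remove the label."""
    pkt.mpls_ttl -= 1
    if pkt.mpls_ttl <= 0:
        return False
    pkt.ip_ttl = pkt.mpls_ttl
    pkt.label = None
    pkt.mpls_ttl = None
    return True


packet = Packet(ip_ttl=64)
push(packet, "L3")   # LD adds label L3
swap(packet, "L4")   # LA swaps L3 with L4
pop(packet)          # LB removes L4 before sending to RC
print(packet.ip_ttl)  # 61 : one decrement per LSR, as if the packet had crossed three IP routers
```

A packet caught in a looped LSP keeps being swapped, so its MPLS TTL eventually reaches zero and `swap` reports that it must be discarded.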
LDP in practice

How do network operators use LDP ?


Most common deployment is to create a full
mesh of LSPs among all LSRs so that each
LSR is able to send MPLS packets to any
LSR
Each LSR usually advertises a label for its loopback address

What will be the LSPs in the network below ?


LA LB

LD LE


Using loopback addresses to advertise mappings is preferred over addresses associated with a physical
interface because a loopback address is always up and does not become unreachable if a physical
interface fails.

In the example network above, the metric of all links is set to 1 except the link between LA and LD.
MPLS
MultiProtocol Label Switching
Outline
Multiprotocol Label Switching
The label swapping forwarding paradigm
Integrating label swapping and IP

Utilisation of MPLS
Destination based packet forwarding
Simpler ISP backbones
Traffic engineering
QoS support
Fast restoration

MPLS in large ISP networks

Pure IP-based ISP network


eBGP on border routers
current full BGP Internet routing table
approximately 220,000 active routes
iBGP full mesh
4 border, 3 core routers RG
21 iBGP sessions (7 x 6 / 2)
RH
RA B4 R5
EBGP

R1
B6 R7 RF
RB B3

R2 AS1
RE
RC
To correctly forward IP packets, border and
RD backbone routers need a full routing table
For this, they need to be part of the iBGP mesh
MPLS in large ISP networks (2)

BGP free ISP backbone


Backbone router
Maintains internal routing table of ISP network
only knows how to reach routers inside ISP
RG
RH
RA B4 R5
EBGP
IBGP

R1
B6 R7 RF
RB B3

R2 AS1 RE
RC Border router
Maintains full BGP routing table
RD routing table indicates for each destination
AS path towards destination
IP address of next-hop to reach destination

In a large ISP network, there are two ways to advertise with BGP the nexthop used to reach a destination. For example, consider the prefix p
advertised by router RG via eBGP to router R5.
The normal operation with BGP is that R5 advertises prefix p to its iBGP peers with RG as a nexthop. However, this solution requires the ISP to
advertise in its IGP all its interdomain links (e.g. R5-RG, R5-RH, ...), which increases the size of the IGP tables.
A second solution that is often used is to allow the border router that receives a route via eBGP to advertise itself as the nexthop when
advertising the route to iBGP peers. In the case of prefix p, router R5 would advertise this prefix with itself as the nexthop to its iBGP peers. This
requires a special configuration on the BGP routers. We assume this special configuration is used for the example described in this section.
MPLS in large ISP networks (3)

Principle of the solution


Use a hierarchy of labels
top label is used to reach egress border router (blue LSP)
second label is used to reach eBGP peer (red/green LSP)

RG

RH
RA B4 R5
Egress Border router
packets are label switched
R1
B6 R7 RF
RB B3
AS1
R2 Ingress Border router RE
RC Maintains full BGP routing table
Attaches two labels based on routing table
RD top label to reach egress border router
bottom label used to reach nexthop

This slide shows the LSPs that are used to reach RG and RH, the two eBGP peers of R5, from three ingress border routers : R1, R2 and R7.
The comments associated with R2 and R5 show the operations performed when packets are sent from RC or RD towards RG or RH.
MPLS in large ISP networks (4)

How to distribute the two labels ?


LDP can distribute the label used to reach the
egress BGP router
R5 routing table
B3/32 : West
B4/32 : West
R2/32 : West
+ full BGP table AS1

LM : R5/32:L8 LM : R5/32:L9 LM : R5/32:POP

B3 R5C
R2 LDP LDP B4 LDP

R2 routing table B3 routing table


B3/32 : East R2/32 : West B4 routing table
B4/32 : East B4/32 : East R2/32 : West
R5/32 : East R5/32 : East B3/32 : West AS2
+ full BGP table R5/32 : East


There is an iBGP session, not shown in the slide, between all border routers of the ISP, including R2 and R5. B3 and B4 do not participate in
the iBGP full mesh and thus do not maintain BGP routing tables.
MPLS in large ISP networks (5)

How to distribute the two labels ?


BGP can distribute the label associated with
each prefix

Routes distributed by R5 via iBGP :
13.0.0.0/8, label L1, next-hop : R5/32
15.0.0.0/8, label L2, next-hop : R5/32

Topology : R2 - B3 - B4 - R5, with an iBGP session between R2 and R5
R5 learns 13.0.0.0/8 (AS1) and 15.0.0.0/8 (AS2) via eBGP

R2 FEC mapping with iBGP routes + labels
R5/32 : PUSH L8
13.0.0.0/8 : PUSH(L1);PUSH(L8)
15.0.0.0/8 : PUSH(L2);PUSH(L8)

R5 routing table
B4/32 : East
B3/32 : East
R2/32 : East
13.0.0.0/8 : North
15.0.0.0/8 : South

MPLS in large ISP networks (6)

Topology : R2 - B3 - B4 - R5, with an iBGP session between R2 and R5
R5 reaches 13.0.0.0/8 (AS1) and 15.0.0.0/8 (AS2) via eBGP

Packet towards 15.10.1.1
R2 -> B3 : [15.10.1.1 | L2 | L8]
B3 -> B4 : [15.10.1.1 | L2 | L9]
B4 -> R5 : [15.10.1.1 | L2]

Packet towards 13.0.0.4
R2 -> B3 : [13.0.0.4 | L1 | L8]
B3 -> B4 : [13.0.0.4 | L1 | L9]
B4 -> R5 : [13.0.0.4 | L1]

R2 routing table
B3/32 : East
B4/32 : East
R5/32 : East
R2 LDP FEC mapping
R5/32 : PUSH L8
R2 FEC mapping with iBGP routes + labels
13.0.0.0/8 : PUSH(L1);PUSH(L8)
15.0.0.0/8 : PUSH(L2);PUSH(L8)

B3 routing table
R2/32 : West
B4/32 : East
R5/32 : East
B3 Label table
West:L8 -> East:L9

B4 routing table
B3/32 : West
R2/32 : West
R5/32 : East
B4 Label table
West:L9 -> East:POP

R5 routing table
B4/32 : East
B3/32 : East
R2/32 : East
13.0.0.0/8 : North
15.0.0.0/8 : South
R5 Label table
West:L3 -> Local:POP
L1 -> North:POP
L2 -> South:POP


The modifications to BGP to support the distribution of labels are described in : Y. Rekhter and E. Rosen. Carrying label information in BGP-4.
Internet RFC 3107, May 2001.
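
The two-level label stack used in this example can be traced with a short Python sketch. The tables are transcribed from the slides (R2 pushes the BGP label then the LDP label, B3 swaps L8 with L9, B4 pops L9, R5 uses the BGP label to select its exit interface). The function name `trace` and the list-based stack are illustrative assumptions.

```python
# Tables transcribed from the slides; the top of the label stack is the last list element.
R2_FEC = {
    "13.0.0.0/8": ["L1", "L8"],  # PUSH(L1);PUSH(L8)
    "15.0.0.0/8": ["L2", "L8"],  # PUSH(L2);PUSH(L8)
}
CORE_LABEL_TABLES = {
    "B3": {"L8": ("swap", "L9")},
    "B4": {"L9": ("pop", None)},
}
R5_LABEL_TABLE = {"L1": "North", "L2": "South"}


def trace(prefix):
    """Return the label stack seen after each LSR and the exit port chosen by R5."""
    stack = []
    for label in R2_FEC[prefix]:          # ingress border router pushes two labels
        stack.append(label)
    hops = [("R2", list(stack))]
    for lsr in ("B3", "B4"):              # backbone LSRs only look at the top label
        action, new_label = CORE_LABEL_TABLES[lsr][stack[-1]]
        stack.pop()
        if action == "swap":
            stack.append(new_label)
        hops.append((lsr, list(stack)))
    exit_port = R5_LABEL_TABLE[stack.pop()]  # egress border router pops the BGP label
    return hops, exit_port


print(trace("13.0.0.0/8"))
print(trace("15.0.0.0/8"))
```

B3 and B4 never consult an IP or BGP routing table in this trace, which is why the backbone can remain BGP free.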
MPLS
MultiProtocol Label Switching
Outline
Multiprotocol Label Switching
The label swapping forwarding paradigm
Integrating label swapping and IP

Utilisation of MPLS
Destination based packet forwarding
Simpler ISP backbones
Traffic engineering
QoS support
Fast restoration

Traffic load in large IP networks


Source : http://support.free.fr/reseau/carte_reseau.html built with http://netmon.grnet.gr/weathermap/

Other examples may be found at

Internet2 weather map : http://atlas.grnoc.iu.edu/I2.html

nordunet
http://www.nordu.net/stat-q/load-map/ndn-map,,traffic,peak

http://monitor.belnet.be
Evolution of traffic load

Example with GEANT


The GEANT topology may be found at : www.geant.net/upload/pdf/Visio-AP01G009_SCH0102_2-GeantTopologya3(T)Visio2000Isis.pdf

The measurements were presented in S. Uhlig, B. Quoitin, S. Balon, and J. Lepropre. Providing public intradomain traffic matrices to the
research community. ACM SIGCOMM Computer Communication Review, 36(1), http://inl.info.ucl.ac.be/publications/providing-public-intradomain-traffic-
Evolution of link load

Example with GEANT

Traffic engineering

Problem
Shortest path chosen by IP routing does not
always lead to a good network utilisation
fish problem
many packets R4 R8
towards R9
R1

R3 R7
many packets
towards R8

R5 R6
R2 R9

How to better optimise the network utilisation ?


How to react to changes in traffic conditions
Traffic engineering
simple case study
A's routing table
F: d=1 : West
D: d=1 : South
B: d=1 : East
G: d=2 : West
C: d=2 : South
E: d=3 : South

B's routing table
A: d=1 : North
C: d=1 : South
D: d=2 : South
E: d=2 : South
F: d=2 : North
G: d=3 : South

Traffic matrix
A->B : 0.6
A->F : 0.6
B->F : 0.6
A->E : 0.6

Resulting link loads
Link A-F : A->F : 0.6 and B->F : 0.6, link load : 120% !
Link A-B : A->B : 0.6 from A to B, B->F : 0.6 from B to A
A->E : 0.6 on each link of its path via D

In this example, we assume that each link can carry one unit of traffic.
IP based traffic engineering

How to solve the traffic engineering problem in a pure IP network ?
Two types of solutions

Router-level traffic engineering


Allow a router to use several paths instead of a single
one for a given route
Possible with most router implementations

Network-level traffic engineering


Force aggregate traffic flows to follow some paths
inside the network
Possible in some cases by playing with link costs

Equal Cost Multipath

Simple network-level load balancing mechanism supported in OSPFv2
Principle
OSPF distributes the complete network topology to all
routers inside network
based on this topology, each router computes the
routes towards all destinations
if a router finds several equal cost paths reaching one
destination, it may balance its traffic over these paths
load balancing is done at the discretion of this router
without coordinating with other routers
since routes are equal cost routes, loops will not occur provided
that the routing table is stable

Equal Cost Multipath (2)

Example

Traffic flow
A->E : 0.6

A's routing table
F: d=1 : West
D: d=1 : South
B: d=1 : East
G: d=2 : West, South
C: d=2 : South, East
E: d=3 : East, South, West

F's routing table
...
E: d=2 : South

B's routing table
...
E: d=2 : South

D's routing table
...
E: d=2 : South-West, South-East

G's routing table
...
E: d=1 : South-East

C's routing table
...
E: d=1 : South-West

Resulting split of the A->E flow
A sends 0.2 via F, 0.2 via D and 0.2 via B
D sends 0.1 via G and 0.1 via C
G sends 0.3 to E and C sends 0.3 to E
How to dispatch IP packets ?

Principle
For each destination, remember the P equal paths
instead of a single one
place those paths in forwarding table
when a packet arrives, load balancing algorithm
selects one path among the P available paths

[Figure : router datapath with classification (Class.), policing (Pol), forwarding table + load balancing, and shaping (Shap.) of IP packets]

Forwarding decision based on longest match


Load balancing implies that forwarding table may contain
several entries for each destination
Load balancing algorithms

Simple solution
(Deficit) Round-Robin or variants to dispatch
packets on a per packet basis

Advantages
easy to implement since number of paths is small
traffic will be divided over the equal cost paths on a per
packet basis
each path will carry the same amount of packets/traffic

Drawbacks
two packets from the same TCP connection may be
sent on different paths and thus be reordered
TCP performance can be affected by reordering


References to load balancing algorithms include :

C. Hopps, Analysis of an Equal Cost MultiPath algorithm, RFC2992, Nov. 2000


Z. Cao, Z. Wang, E. Zegura, Performance of Hashing-Based Schemes for Internet Load Balancing, INFOCOM2000,
http://www.ieee-infocom.org/2000/papers/650.ps
Per TCP connection
load balancing
Principle
Perform load balancing on a per TCP connection
basis
Router identifies the connection to which each
packet belongs and all packets from same
connection are sent on same path
no reordering inside TCP connections

Issues to address
How to efficiently select the path for each TCP connection ?
Router should not need to maintain a table containing
IP src, IP dest, Src port, Dest port : path to utilise
TCP connections towards a busy server should not all be
sent on the same path

Per TCP connection
load balancing (2)
How to perform load balancing without
maintaining state for each TCP connection ?
Principle
concatenate IP src, IP dest, IP protocol, Src port, and
Dest port from the IP packet inside a bit string
bitstring = [IP src:IP dest:IP protocol:Src port:Dest port]
compute path = Hash(bitstring) mod P
hash function should be easy to implement and should produce
very different numbers for close bitstring values
candidate hash functions are CRC, checksum, ...
Advantages
all packets from TCP connection sent on same path
traffic to a server will be divided over the links
Drawback
does not work well if a few TCP connections carry a large
fraction of the total traffic
Polarisation issues if all routers use same hash
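
The stateless selection described above, path = Hash(bitstring) mod P, can be sketched as follows. This is a minimal illustration that assumes IPv4 addresses and uses CRC32 from the Python standard library as the hash; the function name `select_path` is hypothetical and real routers may use other hash functions.

```python
import socket
import struct
import zlib


def select_path(src_ip, dst_ip, protocol, src_port, dst_port, num_paths):
    """Map a flow to one of `num_paths` equal cost paths without per-flow state."""
    bitstring = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BHH", protocol, src_port, dst_port)
    )
    return zlib.crc32(bitstring) % num_paths


# All packets of one TCP connection (protocol 6) map to the same path.
first = select_path("192.0.2.1", "198.51.100.7", 6, 40000, 80, num_paths=3)
second = select_path("192.0.2.1", "198.51.100.7", 6, 40000, 80, num_paths=3)
print(first == second)  # True
```

Because the path depends only on header fields, no table of connections is required, but a few very large connections can still leave the paths unevenly loaded.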

The polarisation problems happen if all routers use exactly the same hash as shown above. In this case, all routers of the network will
compute the same hash and packets that are sent on the first interface by the first router will also be sent on the first interface by the second
router... A possible way to avoid these polarisation issues
Limitations of Equal Cost Multipath

Traffic matrix
A->B : 0.6
A->F : 0.6
B->F : 0.6
A->E : 0.6

A's routing table
F: d=1 : West
D: d=1 : South
B: d=1 : East
G: d=2 : West, South
C: d=2 : South, East
E: d=3 : East, South, West

Resulting link loads
Link A-F : A->F : 0.6, B->F : 0.6 and A->E : 0.2, link load : 140% !
Link A-B : A->B : 0.6 and A->E : 0.2 from A to B, B->F : 0.6 from B to A
Link A-D : A->E : 0.2
Links D-G and D-C : A->E : 0.1 each
Links G-E and C-E : A->E : 0.3 each

Drawbacks of ECM
load balancing only works for exactly equal cost paths
and few paths are exactly equal
local decision taken by each individual router

Extensions to Equal Cost Multipath have been proposed to allow routers to split traffic over non-shortest paths, but these extensions have not
been implemented and deployed on real routers.

Some networks are designed to use a large number of equal cost paths to favour load balancing. Other networks are designed to avoid equal
cost paths in order to ease management and debugging of problems.
IP-based
network level traffic engineering
How to improve the traffic distribution
throughout the entire network ?
Principle
IGP link cost influences the utilisation of this link
Typical IGP link cost settings include
1 for each link to select shortest path measured in hops
link delay to select shortest path measured in seconds
f(bandwidth) to select shortest high-bandwidth path
example :

Careful selection of the IGP link costs to balance traffic
rerouting traffic away from a busy link by manually tweaking costs
optimising the flow of traffic inside a network for a given traffic
matrix by considering it as a classical optimisation problem
Can be difficult if routers do not support ECM
possible with some restrictions when routers support ECM


For example, the default OSPF link cost setting on Cisco routers is described in OSPF Design guide, available from http://www.cisco.com/warp/public/104/1.html

A method to optimally set the OSPF weights for a known traffic matrix was proposed in
B. Fortz and M. Thorup, Internet traffic engineering by optimizing OSPF weights, Proc. IEEE Infocom 2000, March 2000, available from http://www.ieee-infocom.org

Other references include

Y. Wang, Z. Wang, and L. Zhang. Internet traffic engineering without full mesh overlaying. In INFOCOM2001, April 2001. Available from http://www.ieee-infocom.org.
D. Lorenz, A. Orda, D. Raz, and Y. Shavitt. How good can IP routing be ? Technical Report 2001-17, DIMACS, May 2001. Available from http://www.eng.tau.ac.il/~shavitt/pub/DIMACS01-17.ps.
Anja Feldmann, Albert Greenberg, Carsten Lund, Nick Reingold, and Jennifer Rexford. Netscope: Traffic engineering for IP networks. IEEE Network Magazine, March 2000.

Several tools have been developed to aid traffic engineering. Some commercial tools include :
Cariden http://www.cariden.com
WANDL http://www.wandl.com

as well as opensource tools :


http://totem.info.ucl.ac.be
IP-based
network level traffic engineering (2)
Traffic matrix
A->B : 0.6
A->F : 0.6
B->F : 0.6
A->E : 0.6

A's routing table
F: d=1 : West
D: d=1 : South
B: d=1 : East
G: d=2 : West, South
C: d=2 : South, East
E: d=3 : East, South, West

Resulting link loads
Link A-F : A->F : 0.6, B->F : 0.6 and A->E : 0.2, link load : 140% !
Link A-B : A->B : 0.6 and A->E : 0.2 from A to B, B->F : 0.6 from B to A
Link A-D : A->E : 0.2
Links D-G and D-C : A->E : 0.1 each
Links G-E and C-E : A->E : 0.3 each

How to improve the traffic distribution ?


A should send traffic towards E only via its South port
B should send traffic towards F only via its South port
Possible by changing the IGP link costs
IP-based
network level traffic engineering (3)
Possible setting of the IGP link costs
Link costs
A-F : 3, A-B : 3, A-D : 2
F-G : 1, D-G : 1, D-C : 1, B-C : 1, G-E : 1, C-E : 1

A's routing table
F: d=3: West
D: d=2: South
B: d=3: East
E: d=4: South
G: d=3: South
C: d=3: South

B's routing table
A: d=3: North
C: d=1: South
D: d=2: South
E: d=2: South
F: d=4: South
G: d=3: South

C's routing table
B: d=1: North
A: d=3: North-West
D: d=1: North-West
E: d=1: South-West
F: d=3: N-W, S-W
G: d=2: N-W, S-W

D's routing table
A: d=2: North
B: d=2: South-East
C: d=1: South-East
E: d=2: S-E, S-W
F: d=2: S-W
G: d=1: S-W

Traffic matrix
A->B : 0.6
A->F : 0.6
B->F : 0.6
A->E : 0.6
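
The effect of these link costs can be checked with a small Python sketch that computes per-link loads under equal cost multipath. It is an illustration under assumptions: the nine-link topology and the cost of each link are inferred from the routing tables on these slides, each router splits traffic evenly over its equal cost next hops, and the function names are hypothetical.

```python
import heapq
from collections import defaultdict

LINKS = [("A", "F"), ("A", "B"), ("A", "D"), ("F", "G"), ("D", "G"),
         ("D", "C"), ("B", "C"), ("G", "E"), ("C", "E")]
UNIT_COSTS = {link: 1 for link in LINKS}
TUNED_COSTS = dict(UNIT_COSTS)
TUNED_COSTS.update({("A", "F"): 3, ("A", "B"): 3, ("A", "D"): 2})
DEMANDS = {("A", "B"): 0.6, ("A", "F"): 0.6, ("B", "F"): 0.6, ("A", "E"): 0.6}


def build_graph(link_costs):
    """Turn {(u, v): cost} into a symmetric adjacency map."""
    graph = defaultdict(dict)
    for (u, v), cost in link_costs.items():
        graph[u][v] = cost
        graph[v][u] = cost
    return graph


def distances_to(graph, target):
    """Dijkstra from `target`; costs are symmetric, so this is each router's distance to it."""
    dist = {target: 0}
    heap = [(0, target)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, cost in graph[u].items():
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                heapq.heappush(heap, (d + cost, v))
    return dist


def ecmp_link_loads(link_costs, demands):
    """Load carried by each directed link when every router splits evenly over equal cost next hops."""
    graph = build_graph(link_costs)
    loads = defaultdict(float)
    for (src, dst), volume in demands.items():
        dist = distances_to(graph, dst)
        inflow = defaultdict(float)
        inflow[src] = volume
        # Visit routers from farthest to closest so upstream traffic is known first.
        for u in sorted(graph, key=lambda n: dist[n], reverse=True):
            if u == dst or inflow[u] == 0:
                continue
            next_hops = [v for v, c in graph[u].items() if dist[v] + c == dist[u]]
            for v in next_hops:
                loads[(u, v)] += inflow[u] / len(next_hops)
                inflow[v] += inflow[u] / len(next_hops)
    return loads


unit_loads = ecmp_link_loads(UNIT_COSTS, DEMANDS)
tuned_loads = ecmp_link_loads(TUNED_COSTS, DEMANDS)
print(round(unit_loads[("A", "F")], 2))   # 1.4 : the 140% link of the previous slides
print(round(max(tuned_loads.values()), 2))
```

With the tuned costs, B reaches F via its South port and A reaches E only via D, so no link carries more than one unit of traffic in this sketch.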
Changing IGP weights in GEANT

Example : invcap


Other changes to the IGP weights are possible, of course.


MPLS-based traffic engineering

Principle of the solution


Build a normal IP or IP+MPLS network
packet forwarding on shortest path towards destination
Collect traffic statistics at edge routers and
information about link load
identify the most congested parts of the network
Ingress routers establish LSPs along a well
chosen path to divert large traffic flows away from
heavily loaded links

Issues to solve
How can an ingress router determine an
acceptable path for a given traffic flow ?
How to establish a LSP along a chosen path ?


Xipeng Xiao, Alan Hannan, Brook Bailey, and Lionel M. Ni. Traffic engineering with MPLS in the Internet. IEEE Network Magazine, March 2000.
Selecting a path with constraints

A
B

R1 R2

R4
A requires a LSP with
10 Mbps bandwidth R3
< 50 msec delay
no packet losses
C

How can we establish a LSP with QoS constraints through the network ?
Need information about capabilities of each link
Need an algorithm to select the best path
according to specific constraints
Constrained routing

What should be added to traditional routing algorithms ?
a way to distribute information about current
network state
routers must know load of remote links to choose paths
meeting constraints for flows with QoS guarantees

a way to compute a path subject to constraints


current routing algorithms find shortest path
how can we find a path with
minimum hop count
at least 10 Mbps
at most 10 msec of delay

Distributing load information

Distance vector routing protocols [RIP,BGP]


routers conspire to distribute routing table
difficult to inform routers of load on remote links
difficult to support constrained routing

Link state routing protocols [OSPF, IS-IS]


routers conspire to distribute network map
simple to add information about network load
routers distribute link state packets with load info
delay is already distributed as the IGP metric
bandwidth/link load is main information to distribute
tradeoff between frequent distribution (accurate
information) and rare distribution (avoid network overload)
each router knows topology and load of each link
and can find constrained paths
Distributing load information (2)

Potential problem
Link load information is not distributed immediately
routers must establish flows based on partial
information about the current load in the network
D=1
D=3 B=6 D=10
B=6 R5 B=8
R2
D=3 R6
D=3 B=2 1
R1 B=2 D=1
B=6 D=6
D=5 R3 R4 B=5
B=8 2

1. new flow [B=4] is created between R4 and R6
2. before information about load changes is distributed, R3 wants to create a new
flow [B=2] towards R6
R3 believes that R3-R4-R6 is the best path

In this slide, B means bandwidth and D means Delay


Constraints

Three types of constraints on path selection


Additive constraint
find path minimising the sum of the link values
example
hop count
link delay or cost
Multiplicative constraint
find path minimising a product over the link values
example
loss rate
Concave constraint
find path containing links whose characteristic is always
above a given constraint
example
bandwidth
resource class or color
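
The three families of constraints differ in how per-link values combine along a path. The Python sketch below illustrates this with hypothetical helper names; it assumes that link loss rates are independent, so the probability of delivery along a path is the product of the per-link delivery probabilities.

```python
import math


def path_delay(link_delays):
    """Additive : the path value is the sum of the link values."""
    return sum(link_delays)


def path_loss_rate(link_loss_rates):
    """Multiplicative : delivery probabilities multiply along the path."""
    return 1 - math.prod(1 - p for p in link_loss_rates)


def path_bandwidth(link_bandwidths):
    """Concave : the path value is set by the worst link."""
    return min(link_bandwidths)


print(path_delay([3, 1, 6]))       # 10
print(path_bandwidth([6, 2, 8]))   # 2
print(round(path_loss_rate([0.1, 0.1]), 2))  # 0.19
```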

Finding a constrained path

Single additive or multiplicative constraints


apply Dijkstra's algorithm
example

C=1
C=3 R2 R5 C=10

C=3 C=3 R6
R1
C=1 C=6
C=5 R3 R4

2 or more additive/multiplicative constraints


unfortunately problem is NP hard
need to evaluate all possible paths to find exact solution
several heuristics have been proposed in literature to find
acceptable solutions

In the figure above, C is the IGP cost associated with each link. Dijkstra's algorithm builds a shortest path tree and is
run on each router to determine the shortest path tree from this router to all routers inside the network. This tree is
computed incrementally as follows.
First, the tree only contains the router which performs the computation.
Then, all the routers that are adjacent to the router performing the computation are considered to be candidates and are put
on the candidate list with costs equal to the cost of the links between this router and the candidate router. The candidate
router with the smallest cost is added to the shortest path tree and removed from the candidate list. All the neighbors of
this router are then examined to see if a better path can be found for a router on the candidate list. The candidate list is updated and
the algorithm continues until all routers are added to the tree.
The shortest path tree from R3 is shown below :
Finding a constrained path (2)

Concave constraints
fortunately easy to handle
remove from the network map all links that do not
satisfy the constraint
use Dijkstra's algorithm on the reduced map
example
find shortest 3 Mbps path from R3 to R6

C=1
C=3 B=6 C=10
B=6 R5 B=8
R2

C=3 C=3 R6
R1 B=2 C=1 B=10
B=1 C=6
C=5 R3 R4 B=1
B=8


In the figure above, C is the IGP cost associated with each link and B is the available bandwidth on each link (in Mbps).

Another example would be to only utilise links supporting some kind of protection or to avoid satellite links.
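
Handling a concave constraint, prune the links that do not satisfy it and then run Dijkstra's algorithm on the reduced map, can be sketched as follows. The topology below is a hypothetical one loosely modelled on the figure: the assignment of costs and bandwidths to individual links is an assumption, since the figure layout cannot be recovered from the text.

```python
import heapq

# Hypothetical map : (router, router) -> (IGP cost C, available bandwidth B in Mbps)
LINKS = {
    ("R1", "R2"): (3, 6), ("R2", "R5"): (1, 6), ("R5", "R6"): (10, 8),
    ("R1", "R3"): (5, 8), ("R2", "R3"): (3, 2), ("R3", "R4"): (1, 10),
    ("R4", "R5"): (3, 1), ("R4", "R6"): (6, 1),
}


def shortest_path(links, source, target, min_bandwidth=0):
    """Dijkstra on the map reduced to links offering at least `min_bandwidth`."""
    graph = {}
    for (u, v), (cost, bandwidth) in links.items():
        if bandwidth < min_bandwidth:
            continue  # concave constraint : drop links that cannot carry the flow
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    heap = [(0, source, [source])]
    visited = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, link_cost in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(heap, (cost + link_cost, neighbour, path + [neighbour]))
    return None  # no path satisfies the constraint


print(shortest_path(LINKS, "R3", "R6"))                   # (7, ['R3', 'R4', 'R6'])
print(shortest_path(LINKS, "R3", "R6", min_bandwidth=3))  # (19, ['R3', 'R1', 'R2', 'R5', 'R6'])
```

In this hypothetical map the unconstrained shortest path crosses a 1 Mbps link, so the 3 Mbps request is routed over a longer path with a higher IGP cost.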
Constrained routing in IP networks

Several solutions proposed by researchers


Lessons learned
Constrained routing should be applied to
flows and not on a per packet basis
Currently, constrained routing in IP networks
is only used with MPLS to establish LSPs
Bandwidth and delay are key constraints
delay jitter is less important and difficult to efficiently support
Path selection should be performed by the source
the source of a flow selects an explicit route
intermediate nodes perform connection admission control but do
not perform any constrained routing decision
path selection algorithm does not need to be standardised
if the new flow is acceptable, establishment continues
otherwise the source will have to compute another path

Existing constrained routing protocols


OSPF-TE, ISIS-TE, PNNI (ATM)


ISIS-TE is described in :

H. Smit and T. Li. IS-IS extensions for Traffic Engineering (RFC 3784), updated by RFC 4205
Intermediate System to Intermediate System (IS-IS) Extensions in Support of Generalized Multi-Protocol Label Switching (GMPLS) (RFC 4205)

OSPF-TE is defined in

D. Katz, K. Kompella, D. Yeung, Traffic Engineering (TE) Extensions to OSPF Version 2 (RFC 3630)
PNNI is described in
ATM Forum. Private Network-Network Interface specification version 1.0 (PNNI 1.0). ATM Forum specification af-pnni-0055.000, March 1996.

Additional information about QoS routing protocols may be found in :


Shigang Chen and Klara Nahrstedt. An overview of quality of service routing for next-generation high-speed networks: Problems and solutions.
IEEE Network Magazine, 12(6):64--79, November 1998.

G.Apostolopoulos, R.Guerin, S.Kamat, A.Orda, A.Przygienda, and D.Williams. QoS routing mechanisms and OSPF extensions. Internet Draft,
draft-guerin-qos-routing-ospf-04.txt, Internet Engineering Task Force, December 1998. Work in progress.

E.Crawley, R.Nair, B.Rajagopalan, and H.Sandick. RFC 2386: A framework for QoS-based routing in the Internet, August 1998. Status:
INFORMATIONAL
OSPF-TE
An example constrained routing protocol
Extension to OSPF designed to aid in the
establishment of traffic engineered LSPs
OSPF-TE distributes new info about each link
link type and link Id
local and remote IP addresses
traffic engineering metric
additional metric to specify the cost of this link
maximum bandwidth
maximum amount of bandwidth usable on this link
maximum reservable bandwidth
maximum amount of bandwidth that can be reserved by LSPs
unreserved bandwidth
amount of bandwidth that is not yet reserved by LSPs
resource class/color
can be used to specify the type of link (e.g. Expensive link would
be colored in red and cheap links in green)


OSPF-TE is described in :

D. Katz, K. Kompella and D. Yeung. Traffic Engineering (TE) Extensions to OSPF Version 2. RFC 3630


Using RSVP to distribute MPLS labels

Principle
RSVP supports downstream on-demand label
allocation
RSVP extension for MPLS called RSVP-TE

Ingress LSR sends PATH message towards egress LSR
PATH message includes Label Request Object

Egress LSR sends label back in RESV message
RESV propagates the labels hop-by-hop


RSVP-TE is defined in the following documents :

RFC3209 RSVP-TE: Extensions to RSVP for LSP Tunnels. D. Awduche, L. Berger, D. Gan, T. Li, V. Srinivasan, G. Swallow. December 2001. (Format: TXT=132264 bytes) (Status: PROPOSED STANDARD)

RFC3210 Applicability Statement for Extensions to RSVP for LSP-Tunnels. D. Awduche, A. Hannan, X. Xiao. December 2001. (Format: TXT=17691 bytes) (Status: INFORMATIONAL)

Besides RSVP, a second protocol which is an extension to LDP can be used to establish LSPs for traffic engineering purposes :
RFC3212 Constraint-Based LSP Setup using LDP. B. Jamoussi, Ed., L.
Andersson, R. Callon, R. Dantu,. January 2002. (Format: TXT=87591
bytes) (Status: PROPOSED STANDARD)
RFC3213 Applicability Statement for CR-LDP. J. Ash, M. Girish, E. Gray,
B. Jamoussi, G. Wright. January 2002. (Format: TXT=14489 bytes)
(Status: INFORMATIONAL)
RFC3214 LSP Modification Using CR-LDP. J. Ash, Y. Lee, P. Ashwood-Smith, B. Jamoussi, D. Fedyk, D. Skalecki, L. Li. January 2002.
(Format: TXT=25453 bytes) (Status: PROPOSED STANDARD)
Using RSVP to distribute MPLS labels (2)

LSP establishment with RSVP-TE


LS ---- L1 ---- L2 ---- L3 ---- LD

PATH messages (LS -> L1 -> L2 -> L3 -> LD)
PATH contains address of LD, LABEL_Request and TSpec
no label associated with PATH
L1 finds next hop=L2
LD knows that LS wishes to establish LSP with Tspec

RESV messages (LD -> L3 -> L2 -> L1 -> LS)
LD allocates a label on link L3-LD
LD -> L3 : RESV[Label=L1]
L3 -> L2 : RESV[Label=L7]
L2 -> L1 : RESV[Label=L0]
L1 -> LS : RESV[Label=L9]
Rspec indicates reservation

L1 Label table
West:L9 -> East:L0
L2 Label table
West:L0 -> East:L7
L3 Label table
West:L7 -> East:L1

Packets towards LD will use label L9
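
The downstream on-demand allocation shown on this slide can be reproduced with a short Python sketch. It is a simplification under assumptions: the PATH phase is reduced to knowing the ordered list of LSRs, the label chosen by each LSR is given as input, and the name `establish_lsp` is hypothetical. Note that on the slide `L1` is both the name of an LSR and the value of a label.

```python
def establish_lsp(path, allocated_labels):
    """Walk the RESV message upstream and build each LSR's label table.

    path             : LSR names from ingress to egress
    allocated_labels : label that each LSR advertises to its upstream neighbour
    Returns the label used by the ingress and {lsr: {incoming_label: outgoing_label}}.
    """
    label_tables = {}
    downstream_label = None  # label received in the RESV from the downstream neighbour
    for lsr in reversed(path[1:]):
        incoming = allocated_labels[lsr]
        if downstream_label is not None:
            label_tables[lsr] = {incoming: downstream_label}
        downstream_label = incoming
    return downstream_label, label_tables


PATH = ["LS", "L1", "L2", "L3", "LD"]
ALLOCATED = {"LD": "L1", "L3": "L7", "L2": "L0", "L1": "L9"}  # values from the slide
ingress_label, tables = establish_lsp(PATH, ALLOCATED)
print(ingress_label)  # L9
print(tables)
```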

RSVP-TE : detailed example

What happens inside the LSR ?

LS ---- L1 ---- LD

Content of PATH message
IP source : LS
IP destination : LD
Previous Hop : LS
Session : Egress: LD, Tunnel Id : X
LSP Id : Ingress: LS, Id : Y
LABEL_REQUEST
Tspec : X Mbps

L1's state
Session : Egress: LD, Tunnel Id : X
LSP Id : Ingress: LS, Id : Y
upstream=LS
Tspec : X Mbps

Tunnel Id
identifies the LSP being established between LS and LD
LSP Id
identifies an "instance" of the LSP being established
Label Request
identification of the network protocol (IP, IPv6) carried on LSP
RSVP-TE : detailed example (2)

Content of the RESV message

LS ---- L1 ---- LD

IP source : LD
IP destination : L1
Previous Hop : LD
Filter Spec : SE
Session : Egress: LD, Tunnel Id : X
LSP Id : Ingress: LS, Id : Y
Label : L1
Rspec : Y Mbps

L1's state
Session : Egress: LD, Tunnel Id : X
LSP Id : Ingress: LS, Id : Y
upstream=LS
Tspec : X Mbps
Rspec : Y Mbps

Shared Explicit filter
filter contains complete identification of LSP
allows flexible handling of LSP

Identification of LSP : [Ingress, Egress, Tunnel ID, LSP Id]


RSVP-TE : Explicit Routes

How to establish LSP along non-shortest path ?
Solution
Ingress LSR may specify the route to be followed by an
LSP being established

Route specification is composed of a list of


IP addresses
Subnet prefixes
Autonomous System numbers

Two types of route specifications


Strict route
the LSP must pass through each LSR specified by ingress
LSR
Loose route
the LSP can pass through non-specified LSR between two
specified LSRs
RSVP-TE : Explicit Routes (2)

R3 prefers to reach R7 via R4

R0 -> R1 : PATH [ERO: R1,R3,R7,R8]
R1 -> R3 : PATH [ERO: R3,R7,R8]
R3 -> R4 : PATH [ERO: R4,R7,R8]
R4 -> R7 : PATH [ERO: R7,R8]
R7 -> R8 : PATH [ERO: R8]

Other routers in the figure : R2, R5, R6
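
The hop-by-hop processing of the ERO in this example can be sketched as follows. This is a simplified model and not the exact RSVP-TE procedure: each router removes itself from the head of the list, and a router may insert intermediate hops in front of a loose next hop. The function names and the `preferences` structure are hypothetical.

```python
def ero_sent_by(router, received_ero, preferred_via=None):
    """Return the ERO that `router` places in the PATH message it forwards."""
    if not received_ero or received_ero[0] != router:
        raise ValueError(f"{router} is not the first hop of {received_ero}")
    remaining = list(received_ero[1:])
    # Loose hop expansion : insert locally preferred intermediate routers.
    if remaining and preferred_via and remaining[0] in preferred_via:
        remaining = list(preferred_via[remaining[0]]) + remaining
    return remaining


def trace(initial_ero, preferences):
    """List (router, ERO sent downstream) for every router that forwards the PATH message."""
    hops = []
    ero = list(initial_ero)
    while ero:
        router = ero[0]
        ero = ero_sent_by(router, ero, preferences.get(router))
        if ero:
            hops.append((router, ero))
    return hops


# R0 sends PATH [ERO: R1,R3,R7,R8]; R3 prefers to reach R7 via R4.
HOPS = trace(["R1", "R3", "R7", "R8"], {"R3": {"R7": ["R4"]}})
for router, ero in HOPS:
    print(router, "sends PATH [ERO:", ",".join(ero) + "]")
```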


When RSVP-TE is used to establish LSPs, this is usually inside a single AS. In this case, LSPs are established between loopback addresses of routers and
usually a full-mesh of LSPs is used. Some providers also use several LSPs between pairs of important routers to
- perform load balancing,
- ensure that one LSP will always remain available even in case of link failures
- support different types of service (e.g. voice, normal Internet) over different paths

There are some deployments with RSVP-TE across ASes. In this case, the ERO can specify a prefix or AS instead of specifying an IP address.
MPLS for traffic engineering

Principle
ingress LSR establishes LSPs (with guaranteed
resources) along chosen paths

Issues to consider
How to reserve bandwidth for a given LSP ?
Rely on Tspec and Rspec

How to support varying traffic flows ?


It should be possible to dynamically modify the LSP
resources. If there are not enough resources to support an
increase, the LSP should keep the old resources

How to dynamically reroute an established LSP ?


For example more bandwidth is available on another path
or because one link used by the LSP failed
RSVP-TE : Resource increase

How to smoothly increase the bandwidth of an LSP ?
Simple solutions
1. Change resources in PATH and RESV messages
• if there are not enough resources available in the network to
support the bandwidth increase, network will send RESVErr and
entire LSP will be removed from network
• not suitable for important LSP

2. Try to establish new LSP


• create new LSP and once accepted remove old LSP
• drawback : the new LSP might be rejected due to the resources
already used by the existing LSP

RSVP-TE : Resource increase (2)

Smooth resource increase


100 Mbps 100 Mbps
LS L1 LD

PATH[Tunnel=X,Id=Y, Bw=60]

PATH[Tunnel=X,Id=Y, Bw=60]

L1's state
Filter Shared Explicit
Session : Egress: LD, Tunnel Id : X
LSP Id : Ingress: LS, Id : Y
Session : Egress: LD, Tunnel Id : X
LSP Id : Ingress: LS, Id : Z
upstream=LS
Bandwidth : 60 Mbps

PATH_new[Tunnel=X,Id=Z, Bw=80]

PATH_new[Tunnel=X,Id=Z, Bw=80]

RSVP-TE : Resource increase (3)

100 Mbps 100 Mbps


LS L1 LD

RESV[Bw=80, SE[Tunnel=X,Id=Y;Tunnel=X,Id=Z]]
RESV[Bw=80, SE[Tunnel=X,Id=Y;Tunnel=X,Id=Z]]

L1's state
Filter Shared Explicit
Session : Egress: LD, Tunnel Id : X
LSP Id : Ingress: LS, Id : Y
Session : Egress: LD, Tunnel Id : X
LSP Id : Ingress: LS, Id : Z
upstream=LS
Bandwidth : 80 Mbps

PATHtear[Tunnel=X,Id=Y]
PATHtear[Tunnel=X,Id=Y]
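
The benefit of the Shared Explicit filter in this example can be illustrated with a small admission control sketch. It assumes a simplified model in which, with Shared Explicit, all LSP instances of the same tunnel share one reservation equal to the largest request; the function names are hypothetical.

```python
def reserved_bandwidth(reservations, shared_explicit):
    """Total bandwidth reserved on a link for {(tunnel, lsp_id): bandwidth}."""
    if not shared_explicit:
        return sum(reservations.values())
    per_tunnel = {}
    for (tunnel, _lsp_id), bandwidth in reservations.items():
        per_tunnel[tunnel] = max(per_tunnel.get(tunnel, 0), bandwidth)
    return sum(per_tunnel.values())


def admit(capacity, reservations, tunnel, lsp_id, bandwidth, shared_explicit=True):
    """Accept the new LSP instance only if the link can still carry all reservations."""
    candidate = dict(reservations)
    candidate[(tunnel, lsp_id)] = bandwidth
    if reserved_bandwidth(candidate, shared_explicit) > capacity:
        return False
    reservations[(tunnel, lsp_id)] = bandwidth
    return True


link = {("X", "Y"): 60}                      # existing LSP instance on a 100 Mbps link
print(admit(100, link, "X", "Z", 80))        # True : max(60, 80) = 80 fits
del link[("X", "Y")]                         # PATHtear[Tunnel=X,Id=Y]
print(reserved_bandwidth(link, True))        # 80

independent = {("X", "Y"): 60}
print(admit(100, independent, "X", "Z", 80, shared_explicit=False))  # False : 60 + 80 > 100
```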

RSVP-TE : Changing routes

How to change the route of an explicitly routed LSP ?
Same principle as for resource increase

Old_Path and New_Path


share resources on R7-R8
R4 Old_PATH
Old_Path and New_Path Old_PATH
share resources on R0-R1
and R1-R3 R7 Old_PATH
R3 R2
Old_PATH New_PATH New_PATH
New_PATH
New_PATH R8
Old_PATH R5 R6
R0 R1
New_PATH
Differences between Old_Path and New_Path
ERO object
LSP Id
Traffic engineering
with MPLS
A's routing table
  F: d=1 : West
  D: d=1 : South
  B: d=1 : East
  G: d=2 : West
  C: d=2 : South
  E: d=3 : South

A's mapping table
  Destination  Label
  B            LB
  E            LE
  F            LF

B's routing table
  A: d=1 : North
  C: d=1 : South
  D: d=2 : South
  E: d=2 : South
  F: d=2 : North
  G: d=3 : South

B's mapping table
  Destination  Label
  F            LF

Traffic matrix
  A->B : 0.6
  A->F : 0.6
  B->F : 0.6
  A->E : 0.6

[Figure: network with routers A, B, C, D, E, F and G; flows labelled A->F, B->E, A->E and B->F]

MPLS
MultiProtocol Label Switching
Outline
Multiprotocol Label Switching
The label swapping forwarding paradigm
Integrating label swapping and IP

Utilisations of MPLS
Destination based packet forwarding
Simpler ISP backbones
Traffic engineering
QoS support
Fast restoration

MPLS and IP QoS

What could be the benefit of MPLS to support IP QoS ?
With differentiated or integrated services, the path
followed by IP packets is independent of their QoS
since those architectures did not change routing

When MPLS is used, IP packets with distinct QoS requirements may be placed inside distinct LSPs
that follow different paths inside the network
MPLS allows distinct routes to be used for packets with
distinct QoS requirements

MPLS and Differentiated Services

How can we support Differentiated Services inside an MPLS network ?
LSR must know QoS required by each packet
Two complementary methods
Indicate the QoS required by all packets of a given LSP at
LSP establishment time (e.g. with special RSVP objects)

Encode QoS info inside MPLS label


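 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                Label                  | Exp |S|       TTL     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+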


EXP field is part of MPLS header and contains 3 bits
special RSVP objects can be used to map Exp field with DSCP
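
The position of the EXP field inside the 32-bit MPLS header can be illustrated with a short sketch that packs and unpacks one label stack entry. The slide only states that EXP contains 3 bits; the other field widths used here (20-bit label, 1-bit S flag, 8-bit TTL) are those of the standard MPLS label stack encoding, and the function names are hypothetical.

```python
import struct


def pack_entry(label, exp, s, ttl):
    """Encode one MPLS label stack entry as 4 bytes in network byte order.

    Assumed layout: Label (20 bits) | Exp (3 bits) | S (1 bit) | TTL (8 bits).
    """
    if not (0 <= label < 2**20 and 0 <= exp < 8 and s in (0, 1) and 0 <= ttl < 256):
        raise ValueError("field out of range")
    word = (label << 12) | (exp << 9) | (s << 8) | ttl
    return struct.pack("!I", word)


def unpack_entry(data):
    """Decode 4 bytes into the fields of one MPLS label stack entry."""
    (word,) = struct.unpack("!I", data)
    return {
        "label": word >> 12,
        "exp": (word >> 9) & 0x7,
        "s": (word >> 8) & 0x1,
        "ttl": word & 0xFF,
    }


entry = pack_entry(label=16, exp=5, s=1, ttl=64)
print(entry.hex())
print(unpack_entry(entry))
```

An LSR that relies on the EXP field only needs the shift and mask shown in `unpack_entry` to find the service requested for a packet.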


The MPLS support of Diffserv is discussed in :

F.Le Faucheur, L.Wu, B.Davie, S.Davari, P.Vaananen, R.Krishnan, P.Cheval, and J.Heinanen. MPLS support for differentiated services. Internet draft, draft-ietf-mpls-diff-ext-09.txt, work in progress, April 2001.
Other information may be found in :

S.Ganti, S.Seddigh, and B.Nandy. MPLS support of differentiated services using E-LSP. Internet draft, draft-ganti-mpls-diffserv-elsp-00.txt, work in progress, April 2001.
Diffserv support with MPLS

The "IP" way, aka E-LSPs


a single LSP may carry packets receiving
several differentiated services
each MPLS router relies on EXP header field to
determine the service for each received packet
EXP field is part of MPLS header and contains 3 bits

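 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                Label                  | Exp |S|       TTL     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+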


ATM/frame relay encapsulations do not carry these 3 bits
useful to reduce the number of required LSPs in large
networks

Diffserv support with MPLS (2)

The "ATM" way, aka L-LSPs


one LSP carries packets receiving a single service
the EXP field of the MPLS header may be used to specify
the drop preference for each packet (e.g. for AF)
MPLS router decides the service for a received packet
based on its label value
the service used by the LSP is specified at LSP
establishment time
useful in ATM/frame relay networks or when
number of LSPs is not a constraint
each L-LSP may have its own explicit route
more L-LSPs than E-LSPs will be needed
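
The difference between the two approaches can be sketched as two classification functions. The mapping tables below are purely hypothetical (real mappings are configured or signalled at LSP establishment time); the sketch only shows which header field drives the decision in each case.

```python
# Hypothetical mappings, for illustration only.
EXP_TO_SERVICE = {0: "best-effort", 1: "AF1", 5: "EF"}   # E-LSP: EXP selects the service
LABEL_TO_SERVICE = {100: "AF1", 200: "EF"}               # L-LSP: label selects the service
EXP_TO_DROP_PREFERENCE = {0: "low", 1: "medium", 2: "high"}


def classify_elsp(label, exp):
    """E-LSP: one LSP carries several services, the EXP field picks the service."""
    return EXP_TO_SERVICE.get(exp, "best-effort")


def classify_llsp(label, exp):
    """L-LSP: the label fixes the service, EXP may carry the drop preference."""
    return LABEL_TO_SERVICE[label], EXP_TO_DROP_PREFERENCE.get(exp, "low")


print(classify_elsp(label=100, exp=5))   # service chosen from EXP, label ignored
print(classify_llsp(label=100, exp=2))   # service chosen from label, EXP = drop preference
```

With E-LSPs, the number of services per LSP is bounded by the 3 bits of EXP. With L-LSPs, one label (and thus one LSP) is needed per service, which explains why more L-LSPs than E-LSPs are required.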

MPLS
MultiProtocol Label Switching
Outline
Multiprotocol Label Switching
The label swapping forwarding paradigm
Integrating label swapping and IP

Utilisations of MPLS
Destination based packet forwarding
Simpler ISP backbones
Traffic engineering
QoS support
Fast restoration

Dealing with link/router failures

Providing a good service means providing the promised bandwidth/delay ...
... even in case of network failures

[Figure: example network with routers R0, R3, R4, R5, R6, R7, R8 and R9]
How to deal with router/link failures
Detect the failed component
Detection by the routing protocol itself (adjacency information)
Detection with the help of layer 2 information (carrier lost)
Propagate the bad news inside the network

Detecting link/router failures
How to detect link/router failures ?
Rely on failure detection of physical layer
possible with SONET/SDH and some types of modems
detection delay may vary from 10 msec to a few seconds
difficult on LAN interfaces such as Ethernet
Relying on the routing protocols
BGP
once peering session has been established, KEEPALIVE messages
are sent every 30 seconds
peer is dead if no KEEPALIVE within 90 seconds
OSPF
Hello packets sent every 10 seconds
Neighbor is dead if no hello received in 40 seconds
RIP
router sends routing table every 30 seconds
neighbor is dead if no table received within 180 seconds

Untuned routing protocols may need tens of seconds to detect failures !
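
The detection delays implied by these default timers can be estimated with a small sketch. It uses a simplified model (an assumption, not part of the slides): messages are sent exactly every `interval` seconds, there is no jitter or transmission delay, and a neighbor is declared dead `dead_after` seconds after the last message received from it.

```python
# Default timers quoted on the slide: (message interval, dead interval) in seconds.
TIMERS = {
    "BGP": (30, 90),
    "OSPF": (10, 40),
    "RIP": (30, 180),
}


def detection_bounds(interval, dead_after):
    """Return (best case, worst case) detection delay in seconds.

    Worst case: the failure occurs right after a message was received.
    Best case: the failure occurs just before the next message was due.
    """
    return dead_after - interval, dead_after


for protocol, (interval, dead_after) in TIMERS.items():
    best, worst = detection_bounds(interval, dead_after)
    print(f"{protocol}: failure detected after {best} to {worst} seconds")
```

Under this model, even the fastest of the three protocols needs at least 30 seconds with its default timers.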


Newer implementations of RIPv2 should support triggered updates. With triggered updates, bad news (i.e. the metric of a route being
set to infinity to indicate a link failure) may be sent immediately. However, once a triggered update has been sent, a router should not send
another update for the same route within a period of 1 to 5 seconds. This dampening process is used to reduce the network load, but it may
also increase the convergence time inside the network.
Most implementations allow the keepalive timers used in each routing protocol to be configured. However, care must be taken not to set a
keepalive timer to too low a value. For example, consider BGP. In theory, a keepalive timer of 1 millisecond would be interesting to quickly
detect failures. In practice, using such a timer would create problems. Since BGP is implemented as an application running on the central CPU
of the router, which also supports SNMP, RIP/OSPF, console access, ..., it is usually impossible to guarantee that BGP would be able to send
a KEEPALIVE every millisecond. Furthermore, even if BGP had a dedicated CPU, BGP relies on TCP, and the loss of one TCP segment
caused by random transmission errors could force TCP to wait for the retransmission of the lost segment before delivering the KEEPALIVE
message.
Detecting link/router failures

Another solution
Bidirectional forwarding detection (BFD)
Simple UDP based protocol
can be implemented on router linecards directly
Two modes of operation

[Figure: two exchanges between R3 and R4. Left: a BFD Poll answered by a BFD Final. Right: BFD Setup (10 msec) and BFD Setup (20 msec), followed by periodic BFD packets.]

O(10 msec) failure detection becomes possible
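
The order of magnitude of the detection time can be sketched as follows. The sketch assumes the asynchronous mode of BFD, where each side announces the interval at which it wishes to send and the interval at which it accepts to receive, and where a session is declared down after a given number of consecutive packets are missed. The parameter names and the detect multiplier of 3 are assumptions; the 10 and 20 msec values are those shown in the figure, interpreted here as the intervals proposed by the two routers.

```python
def detection_time_ms(remote_detect_mult, local_required_min_rx_ms, remote_desired_min_tx_ms):
    """Detection time computed by the local router, in milliseconds.

    The remote router sends at the slower of the two announced rates, and the
    local router declares the session down after `remote_detect_mult`
    consecutive packets are missed.
    """
    agreed_interval = max(local_required_min_rx_ms, remote_desired_min_tx_ms)
    return remote_detect_mult * agreed_interval


# One router accepts packets every 10 msec, the other wishes to send every 20 msec.
print(detection_time_ms(3, 10, 20))
# Both routers agree on 10 msec.
print(detection_time_ms(3, 10, 10))
```

With intervals of this size, the detection time stays in the range of a few tens of milliseconds, far below the hold times of the routing protocols.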



Bidirectional Forwarding Detection (BFD) is intended to be a generic protocol able to detect that a link is up and running with a much shorter
detection time than the routing protocols. BFD supports several modes of operation, see :
D. Katz, D. Ward, Bidirectional Forwarding Detection, Internet draft, http://www.ietf.org/internet-drafts/draft-ietf-bfd-base-09.txt
work in progress, Feb 2009
Dealing with link/router failures

How long does it take to propagate the failure information inside the network ?
O(a few sec.) inside a single AS
some features of routing protocols implementations may
introduce additional delays
a popular router OS introduces a default delay of five seconds
between the reception of an OSPF routing update and the
computation of the updated routing table
this reduces CPU utilization but increases convergence time
a convergence time of around half a second is possible
today in large networks

O(tens sec.) across the Internet


measurements on the Internet show that a failure is recovered
within several tens of seconds, but in some cases it may take
several minutes


Several groups are working on improving the convergence of IGP protocols. These improvements include the development of algorithms to
incrementally update the IGP routing table and the tuning of the timers used by a given implementation.
For more information, see e.g. :
C.Alaettinoglu, V.Jacobson, and H.Yu. Towards millisecond IGP convergence. Internet draft, draft-alaettinoglu-ISIS-convergence-00.ps, work
in progress, November 2000.

Cengiz Alaettinoglu and Steve Casner, ISIS Routing on the Qwest Backbone: a Recipe for Subsecond ISIS Convergence, NANOG 24,
http://www.nanog.org/mtg-0202/cengiz.html, February 2002

A. Retana, IP Routing protocol scalability - theory and examples, 2001, NANOG http://www.nanog.org/mtg-0110/retana.html
MPLS based fault-tolerance

Can we do better with MPLS ?


MPLS forwarding does not depend on routing
an MPLS router can forward packets even when routing
has not converged, provided that a secondary LSP exists

Solution
Establish secondary LSPs to protect important LSPs
secondary LSP is established and maintained inside the network
but carries traffic only in case of failure of the primary LSP

when an outgoing link or router fails, inform the head-end LSR
to stop using the primary LSP and switch all traffic to
the protection LSP
this operation can be done by the MPLS router itself without any
cooperation with other routers provided the protection LSP exists

End-to-End secondary LSP

First solution
Secondary LSP established between ingress LSR
and egress LSR
In case of failure, PathErr message sent to ingress
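[Figure: network with routers R0, R3, R4, R5, R6 and R7. The primary LSP goes through R5 and R6. A protection LSP through R4 protects the primary LSP from failure. A PathErr message travels back to R3, which reroutes traffic on the secondary LSP upon reception of the PathErr.]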
Drawbacks
PathErr may take some time to reach ingress LSR
packets may be lost between failure detection and reception of
PathErr
one secondary LSP for each primary LSP

Selection of path for secondary LSP

How to select a suitable path for a secondary LSP to protect a primary LSP ?
Principle of the solution
Ingress LSR selects path for primary LSP by using its
path selection algorithm on the entire network topology
Knowing the path of primary LSP, ingress LSR computes
a path for the secondary LSP by using its path selection
algorithm on the network topology where the resources
used by primary LSP have been removed
secondary LSP must rely on different physical resources than
primary LSP
this implies that the ingress LSR needs to know the physical
resources used by each LSP
RSVP carries this information in the RRO object
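
The principle above can be sketched with a plain shortest-path computation. This is a minimal illustration under stated assumptions: the path selection algorithm is reduced to Dijkstra on link costs (a real ingress LSR would also take bandwidth constraints into account), and the topology and costs are hypothetical, loosely based on the routers that appear in the figures of this part.

```python
import heapq


def build_graph(links):
    """Build an undirected graph {node: {neighbor: cost}} from (a, b, cost) tuples."""
    graph = {}
    for a, b, cost in links:
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


def shortest_path(graph, src, dst):
    """Dijkstra: return the cheapest path from src to dst as a list of nodes, or None."""
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return None


def without_links(graph, path):
    """Copy of the topology without the links used by path (link-disjoint)."""
    used = {frozenset(pair) for pair in zip(path, path[1:])}
    return {n: {m: w for m, w in nbrs.items() if frozenset((n, m)) not in used}
            for n, nbrs in graph.items()}


def without_transit_routers(graph, path):
    """Copy of the topology without the transit routers of path (router-disjoint)."""
    transit = set(path[1:-1])
    return {n: {m: w for m, w in nbrs.items() if m not in transit}
            for n, nbrs in graph.items() if n not in transit}


# Hypothetical topology and costs.
topology = build_graph([
    ("R0", "R3", 1), ("R3", "R5", 1), ("R5", "R6", 1),
    ("R6", "R7", 1), ("R3", "R4", 2), ("R4", "R7", 2),
])

primary = shortest_path(topology, "R3", "R7")
link_disjoint = shortest_path(without_links(topology, primary), "R3", "R7")
router_disjoint = shortest_path(without_transit_routers(topology, primary), "R3", "R7")
print(primary, link_disjoint, router_disjoint)
```

In this small topology, both pruned computations return the path through R4. If no path remains in the pruned topology, `shortest_path` returns `None`, which corresponds to a primary LSP that cannot be protected end-to-end.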

Selection of path for secondary LSP (2)

Types of end-to-end protection LSPs


router-disjoint protection LSP
does not utilize the same transit routers as the primary LSP
link-disjoint protection LSP
does not utilize the same links as the primary LSP

Router disjoint protection LSP
R3 computes explicit path for protection LSP based on topology without R5 and R6
RESV RRO:R3-R5-R6-R7
[Figure: network with routers R0, R3, R4, R5, R6 and R7; the primary LSP follows R3-R5-R6-R7]

Improving MPLS protection

Issues with end-to-end protection LSPs


one protection LSP must be established for each
primary LSP
doubles the number of LSPs in the network
if bandwidth is reserved for primary, bandwidth must be
reserved for secondary
failure information must travel back to ingress
LSR

It would be interesting to
Reduce the time required to switch to the
protection LSP
Reduce the number of protection LSPs
established inside the network

Failures to consider

Two types of failures


Router failure
a router and all the links attached to this router fail
Link failure
a link fails between two adjacent routers

[Figure: network with routers R0, R3, R4, R5, R6 and R7 and a primary LSP through R5 and R6. One protection LSP protects the primary LSP from the failure of link R3-R5; another protection LSP protects the primary LSP from the failure of R6.]


In practice, it might also be useful to protect LSPs against groups of failures. For example, consider the utilization of different wavelengths over the
same optical fiber. In the slide above, links R5-R0 and R5-R6 could travel over the same physical fiber. If this fiber fails, then both R5-R0 and
R5-R6 would fail together. In this case, if the primary LSP uses link R5-R6, then the secondary LSP should not utilize link R5-R0.
It is possible to deal with those issues by advertising in the IGP the physical resources that are used by each link, but these details are outside
the scope of this presentation.
Protecting a complete path

Principle
one protection LSP is used to protect each link
[Figure: network with routers R0, R3, R4, R5, R6 and R7. The primary LSP goes through R3, R5, R6 and R7, with one protection LSP for each of the links R3-R5, R5-R6 and R6-R7.]
Failure of link R6-R7 : R6 will bypass the failure and divert traffic on the protection LSP
Advantage
traffic is immediately switched to secondary LSP
Drawback
a large number of protection LSPs may be required


In this slide, we assume that link protection was requested. Those protection LSPs are called detour LSPs in the standardization documents.

P. Pan, D. Gan, G. Swallow, J. Vasseur, D. Cooper, A. Atlas, M. Jork, Fast Reroute Extensions to RSVP-TE for LSP Tunnels, Internet draft,
draft-ietf-mpls-rsvp-lsp-fastreroute-00.txt, work in progress, Jan 2002

Per-LSP link failure protection

Principle
Protection LSP established by each LSR to
protect each link used by primary LSP
Primary LSP (path chosen by ingress)
Protection LSP for link R3-R4 (path chosen by R3)

R3's Label forwarding table
  Inlabel  Outport  Outlabel
  L1       North-E  L2
  *        South-E  L3

R3's Label forwarding table after failure
  Inlabel  Outport  Outlabel
  L1       South-E  L3

R4's Label forwarding table
  Inlabel  Outport  Outlabel
  L2       North-E  L5

R5's Label forwarding table
  Inlabel  Outport  Outlabel
  L3       E        L3

R6's Label forwarding table
  Inlabel  Outport  Outlabel
  L3       NE       L4

R7's Label forwarding table
  Inlabel  Outport  Outlabel
  L4       NW       L2

[Figure: network with routers R3, R4, R5, R6 and R7]


Extensions to RSVP-TE have been proposed to allow the ingress LSR to request the establishment of automatic link protection LSPs by each
LSR on the path of a primary LSP. See

P. Pan, D. Gan, G. Swallow, J. Vasseur, D. Cooper, A. Atlas, M. Jork, Fast Reroute Extensions to RSVP-TE for LSP Tunnels, Internet draft,
draft-ietf-mpls-rsvp-lsp-fastreroute-00.txt, work in progress, Jan 2002

The ingress LSR, when establishing its LSP may request the LSP to be protected against link failures or router failures.
R3 computes the path for the protection LSP by finding in the network the best path to reach R4 without using link R3-R4.

Note that in this example, we use per-LSR label space. This allows R4 to send the packets received with label L2 on the primary LSP,
independently of the interface from which the packets were received (from R3 on the primary or from R7 on the detour LSP). For this, R4
simply suggests the utilization of label L2 in the RESV message that it sends upon reception of the Path message sent by R3 to establish the
detour LSP to protect the primary LSP from the failure of link R3-R4.
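
The behaviour of the detour LSP can be checked with a small simulation of the label forwarding tables of the slide. The tables come from the slide; the wiring between output ports and neighbouring routers is an assumption read off the figure, and the function names are hypothetical.

```python
# Label forwarding tables from the slide: router -> {inlabel: (outport, outlabel)}.
TABLES = {
    "R3": {"L1": ("North-E", "L2")},
    "R4": {"L2": ("North-E", "L5")},
    "R5": {"L3": ("E", "L3")},
    "R6": {"L3": ("NE", "L4")},
    "R7": {"L4": ("NW", "L2")},
}

# Assumed wiring of output ports to neighbours (None = the packet leaves the example).
NEIGHBOUR = {
    ("R3", "North-E"): "R4",
    ("R3", "South-E"): "R5",
    ("R5", "E"): "R6",
    ("R6", "NE"): "R7",
    ("R7", "NW"): "R4",
    ("R4", "North-E"): None,
}


def forward(tables, router, label):
    """Follow a packet hop by hop; return a list of (router, inlabel, outlabel)."""
    hops = []
    while router is not None:
        outport, outlabel = tables[router][label]
        hops.append((router, label, outlabel))
        router, label = NEIGHBOUR[(router, outport)], outlabel
    return hops


def after_failure_of_link_r3_r4(tables):
    """R3 alone updates its table: L1 is now swapped to L3 and sent South-E."""
    updated = dict(tables)
    updated["R3"] = {"L1": ("South-E", "L3")}
    return updated


print(forward(TABLES, "R3", "L1"))
print(forward(after_failure_of_link_r3_r4(TABLES), "R3", "L1"))
```

In both cases the packet reaches R4 with label L2, so R4's table does not change. This relies on the per-LSR label space mentioned above, since R4 receives the packet from a different interface after the failure.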

Per-LSP router failure protection

Principle
Protection LSP established by each LSR

Primary LSP
Protection LSP for R3-R2-R4

R2's Label forwarding table
  Inlabel  Outport  Outlabel
  L2       North-E  L4

R3's Label forwarding table
  Inlabel  Outport  Outlabel
  L1       North-E  L2
  *        South-E  L3

R3's Label forwarding table after failure
  Inlabel  Outport  Outlabel
  L1       South-E  L3

R4's Label forwarding table
  Inlabel  Outport  Outlabel
  L4       North-E  L5

R5's Label forwarding table
  Inlabel  Outport  Outlabel
  L3       E        L3

R6's Label forwarding table
  Inlabel  Outport  Outlabel
  L3       NE       L4

R7's Label forwarding table
  Inlabel  Outport  Outlabel
  L4       NW       L4

[Figure: network with routers R2, R3, R4, R5, R6 and R7]

Protecting several LSPs together

Primary LSP1, Primary LSP2
Protection LSP for link R3-R4

R3's Label forwarding table
  Inport  Inlabel  Outport  Operation
  West    L1       North-E  SWAP(L2)
  West    L2       North-E  SWAP(L3)
  *       *        South-E  PUSH(L3)

R3's Label forwarding table after failure
  Inport  Inlabel  Outport  Operation
  West    L1       South-E  SWAP(L2);PUSH(L3)
  West    L2       South-E  SWAP(L3);PUSH(L3)

R4's Label forwarding table
  Inlabel  Outport  Outlabel
  L2       North-E  L5
  L3       North    L6

R5's Label forwarding table
  Inport  Inlabel  Outport  Operation
  NW      L3       E        SWAP(L4)

R7's Label forwarding table
  Inport  Inlabel  Outport  Operation
  SW      L4       NW       POP

[Figure: network with routers R3, R4, R5, R6 and R7]


This type of protection LSP is also called a bypass tunnel

Extensions to RSVP-TE have been proposed to allow the ingress LSR to request the utilization of bypass tunnels by each LSR on the path of
a primary LSP to be protected. See
P. Pan, D. Gan, G. Swallow, J. Vasseur, D. Cooper, A. Atlas, M. Jork, Fast Reroute Extensions to RSVP-TE for LSP Tunnels, Internet draft,
draft-ietf-mpls-rsvp-lsp-fastreroute-02.txt, work in progress, Feb 2003
In this example, R3 establishes a bypass tunnel to protect LSP1 and LSP2 from the failure of link R3-R4. For this, R3 creates a protection LSP
to reach R4 without using link R3-R4. If link R3-R4 fails, then R3 will update its label forwarding table to first put the same labels as on the
primary LSP and then push label L3 on top of those labels. Packets from LSP1 and LSP2 will then be sent to router R7 that will POP the top
label before forwarding the packets towards R4.
We assume in the example that R4 uses a per-LSR label space. Note also that R7 is removing (POP) the label of the bypass tunnel before
sending packets to R4. This allows R4 to receive the packets with the same labels as on the primary LSP.
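
The label stack manipulations of the bypass tunnel can be traced with the following sketch. The operations of R3, R5, R7 and R4 are those of the slide. The slide does not show R6's table, so the entry used for R6 (the tunnel label is left unchanged) is an assumption, and the helper names are hypothetical.

```python
def apply(stack, operations):
    """Apply SWAP/PUSH/POP operations to a label stack (top of stack = last element)."""
    stack = list(stack)
    for op, *arg in operations:
        if op == "SWAP":
            stack[-1] = arg[0]
        elif op == "PUSH":
            stack.append(arg[0])
        elif op == "POP":
            stack.pop()
        else:
            raise ValueError(f"unknown operation {op}")
    return stack


def trace(stack, hops):
    """Return the label stack as it leaves each router along hops."""
    result = []
    for router, operations in hops:
        stack = apply(stack, operations)
        result.append((router, stack))
    return result


def hops_after_failure(r3_operations, r4_operations):
    """Path followed after the failure of link R3-R4, through the bypass tunnel."""
    return [
        ("R3", r3_operations),
        ("R5", [("SWAP", "L4")]),
        ("R6", [("SWAP", "L4")]),   # assumption: R6's table is not shown on the slide
        ("R7", [("POP",)]),
        ("R4", r4_operations),
    ]


lsp1 = trace(["L1"], hops_after_failure([("SWAP", "L2"), ("PUSH", "L3")], [("SWAP", "L5")]))
lsp2 = trace(["L2"], hops_after_failure([("SWAP", "L3"), ("PUSH", "L3")], [("SWAP", "L6")]))
print(lsp1)
print(lsp2)
```

Only the top label changes inside the tunnel. After the POP at R7, the packets of LSP1 and LSP2 arrive at R4 with labels L2 and L3 respectively, i.e. the same labels as on the primary LSPs.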
