IP Tec For Mobile Networks

Technology
IP for Mobile Networks

TTP18031 D0 SG DEN I1.0
STUDENT GUIDE
All Rights Reserved Alcatel-Lucent 2009
All rights reserved Alcatel-Lucent 2008

Passing on and copying of this document, use and communication of its contents
not permitted without written authorization from Alcatel-Lucent

Terms of Use and Legal Notices

1. Safety Warning
Both lethal and dangerous voltages may be present within the products used herein. The user is strongly advised not to wear
conductive jewelry while working on the products. Always observe all safety precautions and do not work on the equipment
alone.
The equipment used during this course may be electrostatic sensitive. Please observe correct anti-static precautions.
2. Trade Marks
Alcatel-Lucent and MainStreet are trademarks of Alcatel-Lucent.
All other trademarks, service marks and logos (Marks) are the property of their respective holders, including Alcatel-Lucent.
Users are not permitted to use these Marks without the prior consent of Alcatel-Lucent or such third party owning the Mark. The
absence of a Mark identifier is not a representation that a particular product or service name is not a Mark.
Alcatel-Lucent assumes no responsibility for the accuracy of the information presented herein, which may be subject to change
without notice.
3. Copyright
This document contains information that is proprietary to Alcatel-Lucent and may be used for training purposes only. No other
use or transmission of all or any part of this document is permitted without Alcatel-Lucent's written permission, and must
include all copyright and other proprietary notices. No other use or transmission of all or any part of its contents may be used,
copied, disclosed or conveyed to any party in any manner whatsoever without prior written permission from Alcatel-Lucent.
Use or transmission of all or any part of this document in violation of any applicable legislation is hereby expressly prohibited.
User obtains no rights in the information or in any product, process, technology or trademark which it includes or describes, and
is expressly prohibited from modifying the information or creating derivative works without the express written consent of
Alcatel-Lucent.
4. Disclaimer
In no event will Alcatel-Lucent be liable for any direct, indirect, special, incidental or consequential damages, including lost
profits, lost business or lost data, resulting from the use of or reliance upon the information, whether or not Alcatel-Lucent has
been advised of the possibility of such damages.
Mention of non-Alcatel-Lucent products or services is for information purposes only and constitutes neither an endorsement, nor
a recommendation.
This course is intended to train the student about the overall look, feel, and use of Alcatel-Lucent products. The information
contained herein is representational only. In the interest of file size, simplicity, and compatibility and, in some cases, due to
contractual limitations, certain compromises have been made and therefore some features are not entirely accurate.
Please refer to technical practices supplied by Alcatel-Lucent for current information concerning Alcatel-Lucent equipment and
its operation, or contact your nearest Alcatel-Lucent representative for more information.
The Alcatel-Lucent products described or used herein are presented for demonstration and training purposes only. Alcatel-Lucent disclaims any warranties in connection with the products as used and described in the courses or the related
documentation, whether express, implied, or statutory. Alcatel-Lucent specifically disclaims all implied warranties, including
warranties of merchantability, non-infringement and fitness for a particular purpose, or arising from a course of dealing, usage
or trade practice.
Alcatel-Lucent is not responsible for any failures caused by: server errors, misdirected or redirected transmissions, failed
internet connections, interruptions, any computer virus or any other technical defect, whether human or technical in nature.
5. Governing Law
The products, documentation and information contained herein, as well as these Terms of Use and Legal Notices are governed by
the laws of France, excluding its conflict of law rules. If any provision of these Terms of Use and Legal Notices, or the
application thereof to any person or circumstances, is held invalid or unenforceable for any reason, including, but not limited to,
the warranty disclaimers and liability limitations, then such provision shall be deemed superseded by a valid, enforceable
provision that matches, as closely as possible, the original provision, and the other provisions of these Terms of Use and Legal
Notices shall remain in full force and effect.

Course Outline
1. TCP/IP Basics
   1. Basic Concepts
2. Ethernet technology
   1. Bridges and Switches
   2. Virtual LANs
3. Point to Point transport
   1. PPP/ML-PPP
4. IP Layer
   1. IP addressing
   2. Routing principles
   3. Redundancy (HSRP/VRRP)
5. Transport Layer
   1. User Datagram Protocol (UDP)
   2. Transmission Control Protocol (TCP)
   3. SIGTRAN
6. Application Services
   1. Synchronization (NTP)
   2. FTP/SFTP
   3. Voice over IP (VoIP)
7. Quality of Service
   1. QoS problems
   2. Mechanisms of the QoS
8. MPLS overview
   1. Label switching
   2. Traffic engineering
   3. MPLS services
9. IPSEC Introduction
   1. Security association
   2. Tunnel setup
   3. IKE

About this Student Guide

Conventions used in this guide
Note
Provides you with additional information about the topic being discussed.
Although this information is not required knowledge, you might find it useful or
interesting.
Technical Reference
(1) 24.348.98 Points you to the exact section of Alcatel-Lucent Technical
Practices where you can find more information on the topic being discussed.
Warning
Alerts you to instances where non-compliance could result in equipment damage or
personal injury.
Where you can get further information
If you want further information you can refer to the following:

Technical Practices for the specific product
Technical support page on the Alcatel-Lucent website: http://www.alcatel-lucent.com

Section 1
TCP/IP Overview
Module Objectives
Upon completion of this module, you should be able to:
Describe the basic concepts of communication over an IP network
Describe the role of the first four layers of the TCP/IP stack
Explain the operating principle of the main protocols that make up the
TCP/IP stack

1.1 Basic Concepts

Network Categories
LAN
MAN
WAN

Networks generally fall into three categories, depending on their size and geographical coverage:
Local Area Network (LAN): coverage is limited to a university campus, company premises, etc.
Metropolitan Area Network (MAN): coverage extends to a geographical area, the size of a town. MANs
provide high-speed links between several LANs in the same geographical area (less than one hundred
kilometers).
Wide Area Network (WAN): coverage extends to wide geographical areas.

Network Topologies
Bus
Star
Central
Ring

An IT system is made up of computers connected to each other by communication links (network cables, etc.)
and hardware devices (network boards and other equipment that enables data to circulate properly). The
physical layout of the network (the spatial configuration) is known as the physical topology. Topologies
generally fall into the following categories:
bus topology: in a bus topology, all the computers are connected to the same transmission link.
star topology: in a star topology, the computers in the network are connected to a central equipment
system.
ring topology: in a network with ring topology, the computers are connected to each other in a ring and
communicate in turn.

Connectionless Communication Mode
[Figure: Connectionless network. Packets P1, P2 and P3 sent to the same destination travel along different routes and can arrive in a different order.]

In a connectionless network:
Every packet must carry the destination address.
No connection is established: flows to the same destination can travel along different routes.
Data can arrive at the destination in any order.

Connection-Oriented Communication Mode
[Figure: Connection-oriented network compared with a connectionless network. In the connection-oriented network, packets P1, P2 and P3 follow the same path and arrive in order. Phases: path establishment, data transfer, path release.]

In a connection-oriented network, a connection must be established when two devices wish to communicate.
The intermediate nodes must preserve the context of this connection.
Connection-oriented communication is characterized by:
the setting up of a virtual circuit.
the identification of data by a path identifier.
the delivery of data in the order it is sent.
the need to release the connection after communication.
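
The three phases listed above (path establishment, data transfer, path release) can be sketched as a toy intermediate node. This is only an illustrative model: the class name `VirtualCircuitNode`, its methods and the port name are hypothetical and do not correspond to any real switch API.

```python
class VirtualCircuitNode:
    """Toy intermediate node that keeps per-connection context."""

    def __init__(self):
        # path identifier -> outgoing port: the context the node must preserve
        self.circuits = {}

    def establish(self, path_id, out_port):
        """Path establishment: record where data for this circuit must go."""
        self.circuits[path_id] = out_port

    def forward(self, path_id, payload):
        """Data transfer: data is identified only by its path identifier."""
        if path_id not in self.circuits:
            raise KeyError(f"no circuit {path_id}: establish the path first")
        return (self.circuits[path_id], payload)

    def release(self, path_id):
        """Path release: free the context once communication is over."""
        del self.circuits[path_id]


node = VirtualCircuitNode()
node.establish(path_id=7, out_port="port-2")
# All packets of the circuit follow the same path, in the order sent.
delivered = [node.forward(7, p) for p in ["P1", "P2", "P3"]]
node.release(7)
```

Because every packet of the circuit leaves through the same port, ordering is preserved; a connectionless node would instead look up the destination address of each packet independently.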

Network Interconnection
[Figure: TCP/IP network interconnection linking several LANs across a WAN.]

The main role of TCP/IP is the interconnection of networks.

The main difficulty lies in the fact that networks can fall into very diverse categories.
Indeed, connecting networks can involve local business networks based on the following types of topology:
bus
ring
star
Connecting networks can also involve long-haul mesh networks such as:
ATM
Frame Relay
Public Switched Telephone Networks
The role of TCP/IP is therefore to provide universal communication services over diverse physical networks.

Communication Needs
Many kinds of connections:
- Point-to-point (leased lines, PSTN, etc.)
- Point-to-multipoint (Local Area Networks)
- Virtual connections (Wide Area Networks)
Some rules are essential for communications: Protocols
Various operating systems: DOS, UNIX, LINUX, etc.
To facilitate user tasks (file transfer, mail exchange, surfing the Net, etc.), some additional software is offered: Services

Network interconnection brings into play different types of links:

point-to-point links.
multipoint links (deployed mainly in local networks).
virtual-circuit links used in WAN networks (e.g. ATM, Frame Relay, X25).
Network interconnection also brings into play different operating systems, the main ones being:
DOS
Unix
Linux
These operating systems function on machines built by different equipment manufacturers.

Rules therefore had to be defined to enable dialog. These communication rules are known as protocols.
Additional software also had to be developed and integrated in the TCP/IP protocol stack to make it easier for
users wishing to:
transfer files,
exchange e-mails,
surf the internet,
perform many other tasks.
These types of software are known as services.

TCP/IP Model
OSI layers and the corresponding TCP/IP protocols:
Layers 5 to 7 (Session, Presentation, Application): HTTP, TELNET, FTP, SMTP, DNS, TFTP, SNMP
Layer 4 (Transport): TCP, UDP
Layer 3 (Network): IP, ICMP, ARP
Layer 2 (Link): IEEE 802.2 (LLC)/802.1 (Bridging), IEEE 802.3 (CSMA/CD), ATM, PPP/ML PPP, HDLC...
Layer 1 (Physical): 1000Base-SX, 1000Base-LX, 1000Base-CX, 100BaseT, 1000Base-T

When people refer to communication software, they generally mean the Open Systems Interconnection (OSI) architecture,
which was developed by the International Organization for Standardization (ISO) between 1977 and 1984. The OSI model is broken down
into 7 layers. Each layer plays a specific role: the physical layer is responsible for the transmission of bits over the
transmission medium; the data link layer is responsible for the transmission of frames between devices that are
interconnected physically; the network layer is responsible for routing packets within the network; the transport layer is
responsible for end-to-end message transmission; the session layer is responsible for dialog synchronization; the
presentation layer is responsible for data representation and format conversion; and the application layer is responsible
for hosting network-oriented utilities and applications.
TCP/IP does not follow exactly the same pattern as OSI. The lower-level TCP/IP protocols do not fulfill the role defined by
OSI for the physical and data link layers. At level 3, IP complies with the OSI model. You will discover other very
important network-layer protocols such as ARP and ICMP. At level 4, two transport protocols are used: TCP and UDP. Finally,
services are integrated in the three upper layers of the OSI model.
Here are a few examples: HTTP for surfing the internet; Telnet for remote control of a device; FTP for file transfer; SMTP
for e-mail exchange; DNS for internet addressing; TFTP for file transfer, SNMP for network administration.
When people refer to TCP/IP layers or protocols, they are referring not only to these two protocols but to all the
protocols in the stack, which includes TCP and IP.
The TCP/IP sources are available free of charge and were developed independently of any particular architecture,
operating system, or proprietary structure. They can therefore be ported to any type of platform. They form an
open system that is continually evolving and therefore highly popular.
TCP/IP operates over a diverse range of media and technologies such as serial links, coaxial cables, optical fiber, radio
links, ADSL, ATM networks, etc.
The addressing mode is shared by all TCP/IP users regardless of the platform they use. If the address is unique,
communication can take place even if the hosts are on different sides of the world.
The higher protocols are standardized to allow for wide-ranging developments over all types of machines.

Standardization
[Figure: Internet standardization bodies]
ISOC: Internet Society
IAB: Internet Architecture Board
ICANN: Internet Corporation for Assigned Names and Numbers (www.icann.org)
IETF: Internet Engineering Task Force
IESG: Internet Engineering Steering Group
Area 1 to Area 7, each made up of Working Groups (WG)
IANA: Internet Assigned Numbers Authority (www.iana.org)
RFC editor: http://www.rfc-editor.org/rfcsearch.html

TCP/IP Standardization
The organization responsible for standardization is the "Internet Society". It is made up of individual members
as well as organizations and industrial companies.
The Internet Society is headed by the IAB, which comprises twelve members elected for 2 years.
The IAB is supported by the IETF for studies into new standards and the IANA, which is mainly charged with
assigning official values to certain fields of various protocols and allocating Internet IP addresses.
The IETF is managed by the IESG.
The IETF is divided into Areas. Working Groups are set up within the Areas.
Each Area specializes in a particular Internet field:
one Area is responsible for applications.
another for the Internet.
another for routing.
another for security issues.
another for transport protocols.
the final Area for performance.
It should be noted that the IANA, which was originally formed under the auspices of the American
government, now answers to the ICANN, a non-governmental organization. The new organization has not
affected the responsibilities of the IANA, which continues performing the same functions.
The standards are issued in the form of Request For Comments (RFCs) and are free of charge and available
online.

Use of Layers in a TCP/IP Communication

[Figure: A client host (IP@ a) reaches the FTP service (port 21) of a server (IP@ b) across two routers. The server also offers www (port 80) and Mail (port 25). The IP source and destination addresses (a, b) stay the same end to end, while the physical source and destination addresses change on each link: s1d2, then s8d7, then s4d15.]

When two users wish to communicate, one acts as the Client and the other as the Server: in the IP world, the client is the
user requesting a service and the server is the user providing it.
Here, the Server is capable of providing various services but the Client wishes to request one service only.
The transport layer is responsible for targeting the required service. For this, each application is allocated an
official number known as a "port number". (N.B. the IANA is responsible for allocating a port number to every
new service.) The transport layer passes the datagram down to the IP layer. This IP packet must be sent to the
remote server. For this reason, every machine connected to the IP network is assigned a logical
address called an IP address. One of IP's jobs is to insert a header. The main fields in this header are the packet
source and destination addresses. The packet is then sent to the data link layer, which encapsulates it in a
frame with a header containing the physical source and destination addresses. Finally, the frame is
transferred to the transmission medium.
All the machines connected to this transmission medium analyze the frame header, but only the
router interface recognizes its own physical address. It extracts the contents of the frame and passes them to
its IP layer. The router's network layer analyzes the packet header, especially the destination IP
address. Its routing table indicates the outgoing interface and the next physically connected device the
packet must pass through to reach its final destination. The IP packet is transferred to the data link layer,
which encapsulates it in a frame. This time, the physical source address is the address of the router's outgoing interface
and the physical destination address is the address of the next router interface. Once again, only the next router
recognizes its physical address in the frame transported by the transmission medium. It therefore extracts the
packet from the frame and passes it to its network layer. The network layer routes the packet to the
outgoing interface using its routing table.
Finally, the frame is transferred to the last link. The destination machine recognizes its physical address in
the header and sends the contents to its IP. The IP of the final destination machine recognizes its own IP
address in the destination IP field of the packet received. The contents of the packet are then sent to the
transport layer, which examines the header. Thanks to the destination port number contained in the layer-4
protocol header, the data is routed to the service chosen by the Client.
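
The walk-through above can be sketched in a few lines of Python. This is a simplified model rather than a real protocol stack: the class names, the `send` and `deliver` helpers, and the source port 1025 are assumptions made for illustration; the addresses (a, b), the physical addresses per hop and the port numbers are taken from the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:          # transport layer: ports select the service
    src_port: int
    dst_port: int
    data: str


@dataclass(frozen=True)
class Packet:           # network layer: logical (IP) addresses
    src_ip: str
    dst_ip: str
    segment: Segment


@dataclass(frozen=True)
class Frame:            # data link layer: physical addresses
    src_phys: int
    dst_phys: int
    packet: Packet


SERVICES = {21: "FTP", 25: "Mail", 80: "www"}


def send(data, hops):
    """Encapsulate once at layers 4 and 3, then re-frame on every link."""
    segment = Segment(src_port=1025, dst_port=21, data=data)  # 1025 is arbitrary
    packet = Packet(src_ip="a", dst_ip="b", segment=segment)
    return [Frame(src, dst, packet) for src, dst in hops]


def deliver(frame, my_phys, my_ip):
    """Destination host: check physical address, then IP address, then port."""
    if frame.dst_phys != my_phys or frame.packet.dst_ip != my_ip:
        return None
    segment = frame.packet.segment
    return SERVICES[segment.dst_port], segment.data


frames = send("data", hops=[(1, 2), (8, 7), (4, 15)])
result = deliver(frames[-1], my_phys=15, my_ip="b")
```

The point of the sketch is that the `Packet` object is identical in all three frames, whereas each `Frame` carries the physical addresses of one link only.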

Answer the Questions

The OSI reference model is quite similar to TCP/IP, with one major
exception. Where does the difference come from?
Layer 1
Layer 3
The top of the stack

Answer the Questions [cont.]

What are the attributes of protocol layering that are used by TCP/IP?
Application layer runs only at endpoints

Independent of data link (layer 2) protocol
Independent of network (layer 3) protocol
Independent of physical facilities used

End of Section

Section 2
Ethernet technology
Module Objectives
Upon completion of this module, you should be able to explain:

the principle of CSMA/CD operation
the Ethernet 802.3 frame format
the benefits of VLANs
the VLAN tagging process
the 802.1x authentication mechanisms

1. Ethernet principles

CSMA/CD mechanism
[Figure: 4-port hub (HUB = multiport repeater) with stations attached through RJ45 connectors over links shorter than 100 m. Each interface has a transmit pair (T) and a receive pair (R), an internal loopback and collision detection. Note: the hub does not forward the signal on the input port.]

1990: Ethernet arrives on the twisted pair

With the launch of standard 802.3i (10Base-T), the IEEE opened the doors to the Ethernet explosion. The
standard provided for the construction of star networks on simple category-3 Unshielded Twisted Pair (UTP)
cables.
Despite the introduction of the 10Base2 standard, cabling remained tedious.
Because most buildings were already cabled with copper pairs, research was carried out into how these
copper pairs could be used for data transmission.
The results led to the use of 2 pairs for each station:
one for transmitting,
one for receiving.
A frame sent by a station's transmit pair had to be received by the receive pair of all the other stations. A
device was required to perform this function. This device was known as a HUB.
The Hub also serves as a repeater that amplifies the signals:
The maximum distance between station and Hub is 100 meters.
The connection is made using RJ45 connectors.
Access to the medium remains the same (CSMA/CD):

The station wishing to transmit must first of all ensure that the medium is available.
The transmitted signals are routed by the Hub to all the receive pairs of the other ports.
It should be noted that the Hub that receives signals on the receive pair of one of its ports routes these
signals to the transmission pairs of all the other ports, except the port that received the signals (ingress port).
To ensure collision detection, each 10/100Base-T network interface board (NIC) has internal loopback.
10/100Base-T: Link Status

[Figure: Link status. A station listens before transmitting (medium busy or free), and collisions can occur on the hub. Link test pulses (normal link pulses) are sent periodically on idle links and drive the Link LED; the LED goes off when the link is broken.]

A machine that does not realize it has a faulty transceiver may start transmitting despite CSMA and cause
collisions. To prevent such a situation from arising, a signal is emitted (when the segment is inactive) to
validate the link. This signal is known as the "Link Test Pulse" or "Normal Link Pulse" and is a 5 MHz pulse
emitted every 16 ± 8 ms.
In general, a LED is associated with the signal. If the "Link" LEDs on the two interconnected devices are on,
the segment is functioning correctly.
When there are no frames to transmit, each device emits a series of test signals (link test pulses),
interspersed with silences, over the transmit pair. The receive pair of the transceiver at the other end of the
link waits for this signal in order to check the integrity of the line or rather of its receive pair (pair 2).

10/100/1000 Base T: Cables

Speed        Standard       Name
10 Mb/s      10 Base T      Ethernet
100 Mb/s     100 Base TX    Fast Ethernet
1000 Mb/s    1000 Base T    Gigabit Ethernet
Medium: twisted pair, baseband; UTP category 5 or STP category 5; RJ45 connectors
UTP: Unshielded Twisted Pair
STP: Shielded Twisted Pair

10Base-T refers to the Ethernet cabling standard based on twisted pairs.

100 Base T comes in several flavors (T2, T4, TX). Today, it is mainly 100 Base TX that is used.
1000 Base T is a Gigabit Ethernet technology using twisted pairs (IEEE 802.3ab).
Various cables can be used. They generally comprise 4 copper-wire pairs. The most common are:
UTP cables: category-5 unshielded twisted pairs,
STP cables: category-5 shielded twisted pairs.
The connections are made using 8-pin RJ45 connectors.

Category 5e cables are suitable for Gigabit Ethernet (up to 100 m).

10Base-T: Hub Connection

[Figure: Cascaded 10Base-T hubs. Every station-to-hub and hub-to-hub link is at most 100 m; with 4 repeaters between two stations, the maximum end-to-end distance is 500 m.]

Characteristics of a 10Base-T LAN

The maximum distance between the Host or router and the Hub is 100 meters.
The number of ports on the Hub is variable.
To increase the number of ports on a 10Base-T LAN, several Hubs can be cascaded. The distance between 2
Hubs is also limited to 100 meters.
The maximum distance between 2 stations is limited to 500m and there can be no more than 4 Hubs between
2 stations.

Fast Ethernet 100Base-T: Hub Connection
[Figure: Cascaded 100Base-T hubs. Station-to-hub links are at most 100 m and the hub-to-hub link about 20 m; with 2 repeaters between two stations, the maximum end-to-end distance is 220 m.]

Fast Ethernet Cabling

100Base-T (also known as "Fast Ethernet") is subject to certain restrictions:
Although the maximum distance between the stations and the Hub is still 100 meters, the maximum distance
between Hubs has fallen to around 20 meters.
The number of Hubs between 2 stations must not exceed 2, which means that the maximum distance
between 2 stations falls to 220 meters.
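
The 500 m and 220 m limits follow directly from adding up the individual link lengths. The short calculation below illustrates this; the helper name `max_span_m` is hypothetical and only the link lengths come from the text.

```python
def max_span_m(station_links_m, inter_hub_links_m):
    """End-to-end distance: two station-to-hub links plus the hub-to-hub links."""
    return sum(station_links_m) + sum(inter_hub_links_m)


# 10Base-T: 4 hubs between 2 stations means 3 hub-to-hub links of up to 100 m each.
ten_base_t = max_span_m([100, 100], [100, 100, 100])

# 100Base-T: 2 hubs between 2 stations means 1 hub-to-hub link of about 20 m.
fast_ethernet = max_span_m([100, 100], [20])

print(ten_base_t, fast_ethernet)
```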

Logical Address and Physical Address
[Figure: Alice and Bob. IP@ = logical address; MAC@ = physical address.]
IP: Internet Protocol
MAC: Medium Access Control

The Medium Access Control (MAC) is part of the data link layer and is responsible for transmitting blocks of
bits (i.e. frames) between devices that are connected to each other physically.
Before looking in detail at the format of a MAC frame, let's consider the different addressing methods in
TCP/IP.
Two types of address are used in TCP/IP:
The logical address or IP address
The physical address or MAC address
To understand why 2 types of address are used, an analogy can be drawn with the traditional telephone
network.
The logical address can be compared to people's names, and the physical address to their telephone
numbers.
When a person, let's say Alice, wishes to communicate with Bob, her first thought is:
"I'm going to call Bob." However, when she actually makes the call, she will probably have to look in a phone
directory and dial Bob's telephone number.
The principle is the same in TCP/IP. A station wishes to send a data packet to another station. It indicates the
logical IP address of the remote station. But, in practice, this IP packet will be transported in a frame using
physical addresses. Later on, you will see that the tables mapping IP addresses to MAC addresses are built automatically by
means of the Address Resolution Protocol (ARP).

Unicast MAC Address
[Figure: Five stations on a LAN with MAC addresses 00.80.9f.00.02.03, 00.18.55.92.a2.08, 00.53.27.32.02.c8, 00.6f.66.32.0b.08 and 00.35.d6.39.cb.0a. A frame with Dest: 00.53.27.32.02.c8 is placed on the medium.]

Let's first look at physical Ethernet addressing.

There are different types of MAC addresses. First of all, the unicast address: this type of address is assigned
to each Ethernet card and is globally unique.
It should be noted that a station with n interfaces will have n MAC addresses.
Unicast addressing is used when a frame needs to be sent to a single, specific station.
The frame placed on the transmission medium can be read by all the stations connected to the LAN.
All of the station interface cards decode the destination MAC address field.
But only the station whose address matches the destination MAC address interrupts its processor to deliver the
contents of the frame to it. The other stations ignore the frame.

Broadcast MAC Address
[Figure: The same five stations (00.80.9f.00.02.03, 00.18.55.92.a2.08, 00.53.27.32.02.c8, 00.6f.66.32.0b.08, 00.35.d6.39.cb.0a). A frame with Dest: ff.ff.ff.ff.ff.ff is placed on the medium.]

The second type of MAC address is the Broadcast address.

This time, a station wishes to send data to all the stations connected to the LAN. Rather than sending n
frames in unicast mode, the transmit station (egress station) uses broadcast addressing. This means that the
destination MAC address field contains only 1s.
Once again, the frame is placed on the transmission medium.
All the connected interfaces read the destination MAC address and see that it is a broadcast.
All the interfaces interrupt their processors to deliver the contents of the frame to them.

Multicast MAC Address
[Figure: The same five stations (00.80.9f.00.02.03, 00.18.55.92.a2.08, 00.53.27.32.02.c8, 00.6f.66.32.0b.08, 00.35.d6.39.cb.0a). The stations that have joined the group also hold the multicast address 01.00.5e.00.00.09. A frame with Dest: 01.00.5e.00.00.09 is placed on the medium.]

The last type of MAC address is the Multicast address.

Certain stations can join a group and receive a second address, known as a multicast address, that is shared
by all stations in the group.
A station wishing to send a frame solely to the stations in the group puts the multicast address in the
destination address field of the frame.
All interfaces connected to the link decode the frame but only stations with the multicast address interrupt
their processors to deliver the frame data to them.
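
The three delivery rules (unicast, broadcast, multicast) can be summarized as a single acceptance test applied by each interface. The sketch below is a simplified model: the function names are hypothetical, and the choice of which two stations have joined the group 01.00.5e.00.00.09 is an assumption made for the example.

```python
BROADCAST = "ff.ff.ff.ff.ff.ff"


def accepts(own_mac, joined_groups, dest_mac):
    """Return True if the interface passes the frame up to its processor."""
    dest = dest_mac.lower()
    if dest == BROADCAST:          # broadcast: every station accepts
        return True
    if dest in joined_groups:      # multicast: only members of the group accept
        return True
    return dest == own_mac.lower() # unicast: only the addressed station accepts


# Station MAC address -> multicast groups it has joined (membership is assumed).
stations = {
    "00.80.9f.00.02.03": {"01.00.5e.00.00.09"},
    "00.18.55.92.a2.08": set(),
    "00.53.27.32.02.c8": set(),
    "00.6f.66.32.0b.08": set(),
    "00.35.d6.39.cb.0a": {"01.00.5e.00.00.09"},
}


def receivers(dest_mac):
    """List the stations that accept a frame sent to dest_mac."""
    return [mac for mac, groups in stations.items() if accepts(mac, groups, dest_mac)]
```

For instance, `receivers("00.53.27.32.02.c8")` yields a single station, while `receivers(BROADCAST)` yields all five.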

MAC Address - Details

MAC address: 6 bytes (48 bits)
O.U.I.: Organizationally Unique Identifier (assigned by IEEE), made up of the I/G bit, the U/L bit and the vendor code (22 bits)
Serial number (24 bits): managed by manufacturer
I/G bit:
0: Individual (or Unicast), associated with only one piece of equipment
1: Group (or Multicast), associated with a group of equipment
U/L bit:
0: Universal, unique address
1: Local, local meaning
Hexadecimal representation (12 digits)
Examples: CISCO: 00.10.7B.xx.xx.xx
ALU: 00.80.9F.xx.xx.xx

What is the format of a MAC address?

MAC addresses comprise 48 bits or 6 bytes.
How can you ensure that a unicast address is unique?
The IEEE standardization body assigns each Ethernet card manufacturer a 22-bit number.
It is then up to the manufacturer to allocate serial numbers as the cards come off the assembly line and
ensure that the numbers are unique.
MAC addresses generally comprise 12 hexadecimal digits. The codes assigned to manufacturers CISCO and
Alcatel-Lucent, for example, begin with:
00.10.7b for CISCO,
00.80.9f for Alcatel-Lucent.
Certain manufacturers are assigned several codes.

The first 2 bits transmitted play a special role (in the usual hexadecimal notation, they are the 2 least significant bits of the first byte):
The "Universal / local" bit is not used in Ethernet but rather in Token Ring technology.
The first bit transmitted, the "Individual / group" bit, is, however, very important since it determines whether the address is unicast (if
the bit is set to 0) or multicast (if the bit is set to 1).
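
As an illustration, the fields of a MAC address can be extracted as follows. This is a minimal sketch: the function name `parse_mac` and the dictionary keys are hypothetical. It relies on the fact that Ethernet transmits each byte least significant bit first, so in the written hexadecimal form the I/G bit is bit 0 of the first byte and the U/L bit is bit 1.

```python
def parse_mac(mac):
    """Split a dotted MAC address into its flag bits, OUI and serial number."""
    octets = bytes(int(part, 16) for part in mac.split("."))
    if len(octets) != 6:
        raise ValueError("a MAC address has exactly 6 bytes")
    first = octets[0]
    return {
        "group": bool(first & 0x01),   # I/G bit: 0 = individual, 1 = group
        "local": bool(first & 0x02),   # U/L bit: 0 = universal, 1 = local
        "oui": octets[:3].hex(),       # first 3 bytes, assigned by the IEEE
        "serial": octets[3:].hex(),    # last 3 bytes, managed by the manufacturer
    }


unicast = parse_mac("00.80.9f.00.02.03")
multicast = parse_mac("01.00.5e.00.00.09")
```

With these inputs, `unicast["group"]` is `False` and `multicast["group"]` is `True`, which matches the unicast and multicast examples shown earlier.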
Some people may wonder whether, with the explosion of the Internet, 48 bits is enough to cover current, and
indeed future, requirements.
In fact, 48 bits is more than enough since it offers a capacity of around 281 thousand billion combinations.
Even though the first 2 bits have special functions, there is still enough capacity to provide every man, woman and
child on the planet with about 12,000 Ethernet cards.
Let's look at it from another angle: if industry produced 100 million interface cards a day, every day of the
year (i.e. 500 times more than is currently produced), it would take about 2,000 years to use up the address space
available.
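
These orders of magnitude can be checked with a few lines of arithmetic. The world population of 6 billion is an assumption chosen to match the period of the course; the other figures come from the text.

```python
total_addresses = 2 ** 48            # all 48-bit combinations
usable_addresses = 2 ** 46           # 2 bits are reserved for the I/G and U/L flags

world_population = 6_000_000_000     # assumption: roughly 6 billion people
cards_per_person = usable_addresses // world_population

cards_per_day = 100_000_000          # hypothetical production rate from the text
years_to_exhaust = usable_addresses / (cards_per_day * 365)

print(total_addresses)               # about 281 thousand billion
print(cards_per_person)              # about 12,000
print(round(years_to_exhaust))       # about 2,000
```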
Ethernet frame format
Ethernet frame (length 64 to 1518 bytes, measured from MAC @ dest. to FCS):
Field          Bytes         Content
Preamble       7             7 x AA (synchronization)
SFD            1             Start Frame Delimiter: 10101011
MAC @ dest.    6
MAC @ src.     6
Ether type     2             Indicates the higher-level protocol; value > 5DCH or 1500D
Data           46 to 1500    Padding is added if needed to reach the minimum size of 46
FCS            4             Control
Ether type examples: IP: 0800H, ARP: 0806H, IPv6: 86DDH
Max Transmission Unit (MTU): 1500
IP: Internet Protocol
ARP: Address Resolution Protocol
FCS: Frame Check Sequence
MTU: Maximum Transmission Unit

1980: Beginnings of 10Mbps Ethernet

In Ethernet Version 2, frames begin with a preamble comprising 7 bytes, each of which has the hexadecimal
value "AA". The aim of this preamble is to enable stations currently listening to synchronize with the transmit
(egress) station. "A" in hexadecimal corresponds to 1010 in binary. So, the preamble is a long string of alternating 1s
and 0s that generates a clock signal on the transmission medium.
Next, a Start Frame Delimiter (SFD) byte enables stations to detect the end of the preamble and the
beginning of the actual frame itself.
Then there are the destination and source MAC-address fields.
This frame is transporting data intended for higher-level protocols. So the transmit station also uses the
2-byte "Ether type" field to specify which protocol located just above Ethernet is the destination for the data: for
example, 0800 (hexadecimal) if IP is the destination layer, 0806 if it is ARP, etc.
These are official values assigned by the IEEE Registration Authority and also listed by the IANA. They are always above 5DC in hexadecimal or 1500 in decimal.
Next is the data field. To ensure a minimum of 64 bytes for compliance with the collision-detection
requirements, the data field must contain at least 46 bytes. The transmit station may therefore need to use
padding.
To prevent the transmit station from monopolizing the medium for too long, the data in the frame must not
exceed 1500 bytes.
Finally, frame integrity is checked via a 4-byte Frame Check Sequence (FCS) field.
Frame size is measured after the SFD field, i.e. from the destination MAC address to the FCS field inclusive.
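The rules above (46-byte minimum data field with padding, 1500-byte maximum, 4-byte FCS, size measured from the destination MAC address to the FCS) can be illustrated with a minimal sketch. The function name `build_frame` and the sample addresses are hypothetical; the FCS is computed with the standard-library CRC-32, which uses the same polynomial as Ethernet, and the preamble and SFD are left out.

```python
import struct
import zlib

MIN_DATA, MAX_DATA = 46, 1500   # data-field limits from the text

def build_frame(dst: bytes, src: bytes, ether_type: int, data: bytes) -> bytes:
    """Return an Ethernet II frame from destination MAC to FCS (no preamble/SFD)."""
    if len(dst) != 6 or len(src) != 6:
        raise ValueError("MAC addresses are 6 bytes long")
    if ether_type <= 0x05DC:
        raise ValueError("Ether type values are above 0x05DC")
    if len(data) > MAX_DATA:
        raise ValueError("data field exceeds 1500 bytes")
    data = data.ljust(MIN_DATA, b"\x00")          # pad short payloads to 46 bytes
    body = dst + src + struct.pack("!H", ether_type) + data
    fcs = struct.pack("<I", zlib.crc32(body))     # CRC-32, least significant byte first
    return body + fcs

frame = build_frame(bytes.fromhex("ffffffffffff"),
                    bytes.fromhex("020000000001"), 0x0800, b"hello")
print(len(frame))   # shortest possible frame
```

A 5-byte payload is padded up to the minimum frame size, and a 1500-byte payload gives the maximum frame size.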

Section 2 Page 14
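Other Ethernet frame formats

[Figure: encapsulation of an IP packet in the two Ethernet frame formats.]
Eth II frame: MAC @ dest. (6 bytes) | MAC @ src. | Ether type (0800) | Data (46 to 1500 bytes) | Padding | FCS
802.3 frame: MAC @ dest. | MAC @ src. | Length | Data (46 to 1500 bytes) | Padding | FCS
LLC 802.2 (carried in the 802.3 data field): DSAP (AA) | SSAP (AA) | Control (03) | Data (1497 bytes)
SNAP (carried in the LLC data field): O.U.I (00.00.00) | PID (0800) | Data (IP packet, 1492 bytes)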

Ethernet Technology
In Ethernet II, an IP packet is directly encapsulated in the MAC frame. The maximum packet length is 1500
bytes. Encapsulation is described in RFC 894.
In 1983, IEEE decided to standardize this protocol. In IEEE, the packet first goes through the Subnetwork
Access Protocol (SNAP) where 5 bytes are added. The main one is the 2-byte Protocol Identification (PID) field, which
indicates the encapsulated protocol.
Next, it goes through a Logical Link Control (LLC) where:
the DSAP and SSAP fields contain the value "AA", which indicates that LLC encapsulates SNAP,
the Control field contains the value "03", which signifies "Unnumbered Information".
And finally, IEEE 802.3 formats the frame. The format of the IEEE 802.3 frames for Ethernet is identical to the
Ethernet II format except for one field: the Ethertype field from Ethernet II has been replaced by a payload
length field, which necessarily takes a value less than or equal to 1500 in decimal or 5DC in hexadecimal.
Encapsulation is described in RFC 1042.
N.B. When using SNAP encapsulation, the maximum size for IP packets is 1492 bytes.
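The 1492-byte limit follows directly from the header sizes: 3 bytes of LLC plus 5 bytes of SNAP are taken out of the 1500-byte data field. The sketch below builds that 8-byte header and also shows how a receiver can tell the two frame formats apart from the 2-byte field after the source MAC address. The function names are hypothetical.

```python
import struct

ETH_MAX_DATA = 1500

def classify(type_or_length: int) -> str:
    """Interpret the 2-byte field that follows the source MAC address."""
    return "802.3 length" if type_or_length <= 0x05DC else "Ethernet II Ether type"

def llc_snap_header(pid: int, oui: bytes = b"\x00\x00\x00") -> bytes:
    """LLC (DSAP=AA, SSAP=AA, Control=03) followed by SNAP (3-byte OUI, 2-byte PID)."""
    return bytes([0xAA, 0xAA, 0x03]) + oui + struct.pack("!H", pid)

header = llc_snap_header(0x0800)            # IP carried over LLC/SNAP
print(header.hex(), ETH_MAX_DATA - len(header))
print(classify(0x0800), "/", classify(1500))
```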

Section 2 Page 15

In Ethernet, when a transmitter detects a collision, it:
Signals to upper layer that the network is out of service

Waits a random period of time before retrying
Puts a jam indication on the line
Stops the frame transmission

Ethernet Technology

Section 2 Page 16

Associate each protocol to its defining characteristic.
802.2
Ethernet
802.3
Logical Link Control (LLC)
MAC
IP
Contention Resolution
Network Address

Ethernet Technology

Section 2 Page 18
Repeaters
[Figure: a repeater joining two segments. It acts as a signal amplifier and provides media adaptation between 10Base-T, 10Base2 and AUI (10Base5).]
Ethernet Technology
You saw earlier that the length of Ethernet segments is limited and that to extend a LAN, repeaters are
needed to regenerate the signals.
Certain repeaters can also work as adapters enabling transfer from 10Base2 to 10Base5 or 10Base-T.
Repeaters are just signal amplifier devices. They are not intelligent devices.
So, when a station transmits a frame to another station located on the same segment, the repeater
propagates the signals over the other segments. This means that any station located on another segment is
prevented from accessing the transmission medium until the operation is complete.
Lining stations up on the same LAN is the first simple, low-cost step for a local area network. The downside
with this type of architecture is that the number of collisions increases rapidly as traffic increases, which
means a significant reduction in the speed at which data is exchanged.
It would be useful to have devices capable of filtering. An initial solution could be the use of bridges.

Section 2 Page 19
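Bridges - Frame Forwarding

[Figure: a bridge connects LAN 1 (port eth0) to LAN 2 (port eth1). A frame from "c" to "a" stays on LAN 1; a frame from "c" to "f" is forwarded to LAN 2.]
Bridge filter table:
MAC@   Port
a      eth0
b      eth0
c      eth0
d      eth1
e      eth1
f      eth1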

Ethernet Technology
The filtering configuration can be defined manually by storing in the bridge memory the MAC addresses of the
stations associated with each of its ports.
When a frame is moving along a segment, the bridge analyzes the destination MAC address. If the address is
on the same port as the one that detected the frame, the bridge blocks the frame.
If this is not the case, the bridge propagates the frame to the port that corresponds to the destination MAC
address.
It should be noted that bridges do not filter broadcasts and multicasts.
On a large LAN, manual configuration can be time-consuming and maintenance complicated.
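The forwarding decision described above can be sketched as a simple table lookup. The table reproduces the slide (stations "a" to "c" on eth0, "d" to "f" on eth1), with single letters standing in for MAC addresses. Treating an unknown destination like a broadcast is an assumption of this sketch, since the text only covers broadcasts and multicasts.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"

# Static filter table from the slide: stations a-c on eth0, d-f on eth1.
FILTER_TABLE = {"a": "eth0", "b": "eth0", "c": "eth0",
                "d": "eth1", "e": "eth1", "f": "eth1"}
PORTS = {"eth0", "eth1"}

def forward(dest: str, in_port: str) -> set:
    """Return the set of ports the frame is sent out on (empty set = blocked)."""
    out_port = FILTER_TABLE.get(dest)
    if out_port is None:                 # broadcast, multicast or unknown: not filtered
        return PORTS - {in_port}
    if out_port == in_port:              # destination is on the arrival segment
        return set()
    return {out_port}

print(forward("a", "eth0"))        # frame from c to a: blocked
print(forward("f", "eth0"))        # frame from c to f: sent to eth1
print(forward(BROADCAST, "eth0"))  # broadcast: not filtered
```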

Section 2 Page 20
Self-Learning Bridge
[Figure: "a" sends a frame to "b" over a network that contains loops. Each bridge records "a" in its filter table (MAC@ / Port); the bridges that receive the frame on both ports see "a" on ports 2 and 1 at once (marked "!!!").]

Ethernet Technology
Let's now consider the limits of the "Self-Learning Bridge" mechanism.

The network cabling has changed and certain destinations can now be reached via several routes.
"a" sends a frame to "b".
Bridge 1 learns the location of "a".
It doesn't know where "b" is located and therefore broadcasts the frame on its other ports. Bridges 2 and 3 then learn the
location of "a".
Bridges 2 and 3 in turn broadcast the frame.
Bridges 4 and 5 are now faced with a dilemma. Both their ports receive a frame with the source MAC address
"a". This would mean that "a" is located on both port 1 and port 2.
As a result, copies of the frame keep circulating around the loops and very soon take up all the available
bandwidth.
As you have seen, the "Self-Learning Bridge" mechanism has its limits: it can only function if there are no
loops in the network.
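The dilemma can be reproduced with a very small model of a self-learning bridge. This is a sketch under simplifying assumptions: the class name `LearningBridge` is hypothetical, single letters stand in for MAC addresses, and there is no ageing of table entries.

```python
class LearningBridge:
    def __init__(self, ports):
        self.ports = set(ports)
        self.table = {}                       # MAC address -> port

    def receive(self, src, dst, in_port):
        """Learn the source, then return the ports the frame is sent out on."""
        self.table[src] = in_port             # learn (or overwrite) the location of src
        out_port = self.table.get(dst)
        if out_port is None:                  # unknown destination: send everywhere else
            return self.ports - {in_port}
        return set() if out_port == in_port else {out_port}

bridge = LearningBridge(ports=[1, 2])
print(bridge.receive("a", "b", in_port=1))    # "b" unknown, so the frame goes out on port 2
print(bridge.receive("a", "b", in_port=2))    # the same frame comes back through a loop
print(bridge.table)                           # "a" now appears to be on port 2
```

Each time the looping copy arrives, the bridge rewrites its entry for "a" and sends the frame out again, which is why the mechanism only works on a loop-free network.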

Section 2 Page 21
Spanning Tree Protocol

[Figure: left, a bridged topology (bridges 109, 114, 234, 175, 447, 492, 562) containing three loops; right, the same network as a tree representation rooted at bridge 109 after loop suppression.]

Ethernet Technology
To overcome this problem but still maintain the automatic mechanism, a special protocol known as the
Spanning Tree Protocol (STP) is implemented in the bridges.
This relatively complex protocol uses Bridge Protocol Data Unit (BPDU) messages to establish specific dialog
between the bridges.
The bridges represent the network topology in the form of a tree. They select a bridge to be the root bridge
and then draw in the connections to form a tree structure. The nodes represent the bridges and the leaves on
the tree are the stations.
The bridges detect loops and remove them. This means there is only one path for getting from one station to
another station, as with a tree for getting from one leaf to another.
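The idea of electing a root and keeping a single path to every bridge can be sketched with a breadth-first search. This is not the real protocol: STP exchanges BPDUs and takes port costs into account, whereas the sketch simply picks the lowest bridge ID as root and counts hops. The bridge IDs are those shown on the slide, but the links between them are hypothetical because the cabling cannot be recovered from the figure.

```python
from collections import deque

# Hypothetical links between the bridge IDs shown on the slide.
LINKS = [(109, 114), (109, 234), (114, 234), (234, 175),
         (175, 447), (447, 562), (175, 562), (562, 492)]

def spanning_tree(links):
    """Return (root, kept_links, blocked_links) using a breadth-first search."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    root = min(neighbours)                    # lowest bridge ID becomes the root
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in sorted(neighbours[node]):
            if nxt not in parent:             # first path found to nxt is kept
                parent[nxt] = node
                queue.append(nxt)
    kept = {frozenset((n, p)) for n, p in parent.items() if p is not None}
    blocked = {frozenset(link) for link in links} - kept
    return root, kept, blocked

root, kept, blocked = spanning_tree(LINKS)
print("root:", root)
print("blocked links:", sorted(tuple(sorted(link)) for link in blocked))
```

With 7 bridges, exactly 6 links remain active; every other link closes a loop and is blocked.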

Section 2 Page 22
Switch: Principle
[Figure: a 4 x 10Mb/s-port switch. The switching fabric connects the transmit (T) and receive (R) sides of the ports so that two communications take place simultaneously.]
4-port switch => the traffic could reach 2 x 10Mb/s
Ethernet Technology
In the past, bridges generally only had 2 ports.

During the 90s, the introduction of 10Base-T links, as well as progress in the field of microprocessors,
Application-Specific Integrated Circuits (ASICs), and memories, made it possible to design bridges with more
ports, which were capable of routing frames simultaneously to several ports at the transmission rate of the
medium.
For marketing reasons, the Switch was born.
But the switch is nevertheless just a bridge equipped with numerous ports.
When a station transmits a frame, the Switch, just like a bridge, analyzes the destination MAC address and,
based on the information in its filter memory, sends the frame to the appropriate link(s).
At the same time, another station can also transmit a frame that will be routed by the Switch to the right
output port(s).
So, unlike the Hub, the Switch makes it possible to increase transmission-medium bandwidth by performing
several operations simultaneously.

Section 2 Page 23
Switch: Full and Half Duplex

[Figure: half duplex: a station attached to a hub, with transmit and receive paths, loopback and collision detection. Full duplex: a station attached to a switch port, with transmit and receive buffers on each side.]

Ethernet Technology
Segmentation
On a segment with several stations, various mechanisms must be implemented:
A mechanism for accessing the transmission medium i.e. listening to the link to determine whether it is
available or unavailable,
A mechanism for detecting collisions.
Correct communication is always in half-duplex mode. Indeed, at any given time, a single station transmits
while the others listen.
Collisions can occur in cases where frames transmitted by several stations are mixed up on the receive pair.
Generally, therefore, both the station side and the switch side can be configured to function in half-duplex or
full-duplex mode.
Micro-segmentation
In the case of micro-segmentation, where a single station is connected to a switch port, collisions cannot
occur. Indeed, there is only one transmitter on a pair.
Consequently, the station wishing to transmit does not need to use the collision-detection mechanism.
Moreover, the station should function in full-duplex mode if it has that capability.
By default, the NICs of stations wishing to transmit listen to the transmission medium beforehand. If they
detect traffic, they postpone transmission to avoid causing a collision.
So, if on a micro segment this mechanism is not disabled, the station (or the port of the Switch in the other
direction) will continue to function in half-duplex mode and delay transmission for fear of causing a collision.
The NIC internal loopback mechanism must therefore be disabled. This can be configured manually or via the
auto-negotiation mechanism.

Section 2 Page 24
Switch: Auto-Negotiation
[Figure: link state detection uses the Normal Link Pulse (16.8ms); auto-negotiation uses the Fast Link Pulse (2ms, 17..33 pulses). Negotiable modes: 100BASE-TX Full Duplex, 100BASE-T4, 100BASE-TX, 10BASE-T Full Duplex, 10BASE-T.]
Ethernet Technology
Auto-Negotiation
Most Ethernet interfaces, such as adapters (NICs) for PCs or workstations and Switches, are capable of
adapting their transmission speed (10/100) and mode (Half or Full Duplex).
This is done at start-up by exchanging the Fast Link Pulse (FLP), which is the equivalent of the Normal Link
Pulse (NLP) used for the 10Base-T integrity test.
This means that two devices with auto-negotiation capability can define the best method for working
together from the options specified below (in order of preference):
1. Full-duplex 100Base-TX
2. 100Base-T4
3. 100Base-TX
4. Full-duplex 10Base-T
5. 10Base-T
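The selection rule amounts to picking the first entry of the preference list that both ends advertise. A minimal sketch, with a hypothetical function name `negotiate` and modes represented as plain strings:

```python
# Preference order from the list above, best first.
PRIORITY = ["100BASE-TX full duplex", "100BASE-T4", "100BASE-TX",
            "10BASE-T full duplex", "10BASE-T"]

def negotiate(local, remote):
    """Return the most preferred mode advertised by both ends, or None."""
    common = set(local) & set(remote)
    for mode in PRIORITY:
        if mode in common:
            return mode
    return None

nic = ["100BASE-TX full duplex", "100BASE-TX", "10BASE-T full duplex", "10BASE-T"]
old_switch = ["10BASE-T full duplex", "10BASE-T"]
print(negotiate(nic, old_switch))
```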

Section 2 Page 25
Switch: Full-Duplex Mode Advantage

[Figure: comparison of segmentation (hub) and micro-segmentation (switch).]
Segmentation (hub): 10 Mb/s | Shared bandwidth | Half duplex | Access contention (wait for free medium) | Collision detection
Micro-segmentation (switch): Independent rate for each station (10 Mb/s, 100 Mb/s) | Full Bw | Full duplex | No need for access contention (no delay) | No need for collision detection | Extended length

Ethernet Technology
To conclude, let's compare the characteristics of segmentation and micro-segmentation:

With segmentation, transmission speed is the same for all stations; with micro-segmentation, transmission
speed is independent between stations.
With segmentation, the bandwidth is shared between all the stations; with micro-segmentation, each
station uses the full bandwidth.
With segmentation, the medium-control mechanism must be implemented, implying operation in half-duplex mode; with micro-segmentation, this mechanism isn't required and full-duplex mode is therefore
possible.
With segmentation, the collision-detection mechanism must be implemented; with micro-segmentation,
collision detection isn't required.
Finally, with segmentation, the maximum distance between 2 stations is limited to enable collision
detection; with micro segmentation, there is no limit since collisions are no longer possible. The limit is solely
dependent on the signal transmission technique. Repeaters can always be installed.
1997: Full Duplex Ethernet

The arrival of standard 802.3x enabled communication simultaneously in both directions.
In full-duplex mode, two stations can exchange an aggregate 200Mbps (100Mbps in each direction) over a Fast Ethernet point-to-point link.

Section 2 Page 26
Network design (1) - Hubs
[Figure: a building cabled with hubs. Panel 1 (Wiring) and panel 2 (Communication) show the Sales, R&D and Finances stations and the Import and Export departments attached to the hubs.]

Ethernet Technology
Let's now consider a scenario in which a building is cabled using Hubs and how communication takes place
between two stations.
The frames exchanged are broadcast over the whole LAN, preventing other exchanges from taking place
simultaneously and also bothering stations that are not concerned by the transaction.

Section 2 Page 27
Network design (2) - Bridge and hubs
[Figure: the Sales, R&D and Finances stations and the Import and Export departments are attached to two hubs separated by a filtering bridge.]

Ethernet Technology
Compared with a cable set-up based on segmentation, you can see that communication is more effective
when the stations are on the same segment.

Section 2 Page 28
Network design (2) - Bridge and hubs
[Figure: the same two hubs joined by a bridge, serving the Sales, R&D and Finances stations and the Import and Export departments.]

Ethernet Technology
But the same drawbacks exist for communications between stations located on different segments.

Section 2 Page 29
Network design (3) - Switches
[Figure: micro-segmentation. Panel 1 (Wiring) and panel 2 (Communication) show the Sales, R&D and Finances stations and the Import and Export departments each attached to a port of a single switch.]

Ethernet Technology
Installing a switch can bring numerous advantages in terms of:

cabling, since the connections are centralized in a single technical location. A switch usually has a large
number of ports. Some of them can be stacked and interconnected using special links.
communication, thanks to micro-segmentation.

Section 2 Page 30

What is the advantage of Full-Duplex Ethernet over Half-Duplex

Ethernet?
Effective doubling of the link bandwidth
Simpler Management
Support of Voice

Ethernet Technology

Section 2 Page 31

What Ethernet operation mode allows a device to either transmit or

receive?
Auto-negotiation
Full duplex
Half duplex
Spanning Tree

Ethernet Technology

Section 2 Page 32

Match each Ethernet technology to its appropriate function.
Half duplex
Matches speed
Full duplex
Finds a backup after failure
Auto-negotiation
Spanning tree
One simultaneous transmitter

200Mbits/s on Fast Ethernet

Ethernet Technology

Section 2 Page 33

Imagine that you are an Ethernet switch, examining a frame header to

determine what to do. Match each situation to the appropriate action.
Match address of ingress port
Match entry for one egress port
Broadcast
Forward
No matches
Filter
All ones
Flood

Ethernet Technology

Section 2 Page 34

Match each protocol to the appropriate layer.
Ethernet
Physical
UDP
Data link
Auto-negotiation
Network
IP
Transport

Ethernet Technology

Section 2 Page 35
3. Virtual LAN

Ethernet Technology

Section 2 Page 36
3. Virtual LANs
Problem
[Figure: Finances (F) and Marketing (M) stations attached to one switch (SW); a broadcast frame (ff:ff:ff:ff:ff:ff) reaches every station.]
Physical and logical topology: a single network
Ethernet Technology
Broadcast traffic is seen and processed by all the users connected to the switch, regardless of
whether they are concerned by the content of the message. Security is also weak in this
environment: a user with a packet sniffer will be able to see the content of many messages.

Section 2 Page 37
3. Virtual LANs
Solution
VLAN id          Members
10 (Marketing)   Ports 2, 5, 6
20 (Finances)    Ports 1, 3, 4
[Figure: the Finances (F) and Marketing (M) stations remain attached to the same switch (SW) in the physical topology, but a broadcast frame (ff:ff:ff:ff:ff:ff) only reaches the stations of its own VLAN.]
Logical topology: two isolated networks

Ethernet Technology
The best solution available for simple broadcast containment is the use of VLANs. Even though users are still
physically connected to the same device, they will be isolated in different logical networks and no traffic
from a VLAN can be seen by a user of another VLAN.
The simplest way to create a VLAN in a switch is per port. Each port is explicitly assigned to a VLAN. The
port-to-VLAN association is stored by the switch in a VLAN table. Each VLAN is identified by a VLAN id,
which is a number between 0 and 4095 (values 0 and 4095 are reserved). Usually, VLANs are also given a label that is easier to remember
than a number. By default all ports in the switch are members of VLAN 1. Configuring a VLAN for a port
means removing the port from VLAN 1 and assigning it to a new VLAN.
After VLANs have been implemented, instead of forwarding broadcast traffic to every port, the switch
will forward a broadcast frame only to the ports that are members of the same VLAN as the port
originating it. Unicast traffic will be forwarded to the destination port only if it is a member of the same
VLAN as the source.
Inter-VLAN communication is not possible at layer 2. A layer 2 switch cannot switch frames between two
different VLANs.
Other methods to implement VLANs: by MAC address, by protocol, LANE (LAN emulation for ATM
transport).
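Port-based VLAN forwarding can be sketched with the VLAN table from the slide (VLAN 10 Marketing on ports 2, 5, 6 and VLAN 20 Finances on ports 1, 3, 4). The function names are hypothetical and MAC learning is left out to keep the example short.

```python
# Port -> VLAN id, taken from the VLAN table on the slide.
PORT_VLAN = {1: 20, 3: 20, 4: 20,      # Finances
             2: 10, 5: 10, 6: 10}      # Marketing

def broadcast_ports(in_port):
    """Ports that receive a broadcast frame entering on in_port."""
    vlan = PORT_VLAN[in_port]
    return {p for p, v in PORT_VLAN.items() if v == vlan and p != in_port}

def unicast_allowed(src_port, dst_port):
    """A unicast frame is delivered only inside the VLAN of its source port."""
    return PORT_VLAN[src_port] == PORT_VLAN[dst_port]

print(broadcast_ports(2))        # Marketing broadcast stays on Marketing ports
print(unicast_allowed(1, 3))     # Finances to Finances
print(unicast_allowed(1, 2))     # Finances to Marketing
```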

Section 2 Page 38
3. Virtual LANs
Access links
VLAN             Members
10 (Marketing)   Ports 2, 5, 6
20 (Finances)    Ports 1, 3, 4
[Figure: an Ethernet switch with stations on Ports 1 to 6; a broadcast frame (ff:ff:ff:ff:ff:ff) is delivered on the access links as an untagged Ethernet frame.]
Untagged Ethernet Frame: Dest | Src | Ethertype | Data | FCS

Ethernet Technology
An access port is a switch port that is connected to a terminal device, e.g. a PC or printer. It is a member
of a single VLAN.
As all the traffic originated on or destined for this port is for the same VLAN, no particular mechanism is
needed to mark the frames (the VLAN membership of the port is already known to the switch). In this
case, the port will be untagged. The untagged VLAN is also called the native VLAN.

Section 2 Page 39
3. Virtual LANs
VLAN spanning multiple switches - Problem
[Figure: switches SW1 and SW2 are interconnected through Port 7 on each side; stations on Ports 1 to 6 of each switch exchange broadcast frames (ff:ff:ff:ff:ff:ff).]
SW1 VLAN table:
VLAN id          Members
10 Marketing     Ports 2, 5, 6, 7
20 Finances      Ports 1, 3, 4, 7
SW2 VLAN table:
VLAN id          Members
10 Marketing     Ports 3, 6, 7
20 Finances      Ports 1, 4, 7
11 Engineering   Ports 2, 5

Ethernet Technology

Section 2 Page 40
3. Virtual LANs
VLAN tagging
[Figure: the same two switches, SW1 and SW2, interconnected through Port 7; frames crossing the link between the two Port 7s carry a VLAN tag.]
SW1 VLAN table:
VLAN id          Members
Marketing        Ports 2, 5, 6, 7
Finances         Ports 1, 3, 4, 7
SW2 VLAN table:
VLAN id          Members
Marketing        Ports 3, 6, 7
Finances         Ports 1, 4, 7
Engineering      Ports 2, 5

Ethernet Technology
To extend a VLAN to span several switches, the switches will be interconnected using trunks.
Unlike the access links, trunks can carry the traffic of multiple VLANs. To identify the VLAN a frame
belongs to, a label or tag is added to the frame. It contains information about the VLAN originating the
frame. A frame carrying a VLAN tag is called a tagged frame.
In a trunk, only one VLAN can be untagged (the native VLAN). Frames originated in all the other VLANs
must be labelled before transport.

Section 2 Page 41
3. Virtual LANs
Trunking
[Figure: SW1 and SW2 connected by a trunk between their Port 7s. Trunks must carry traffic for multiple VLANs: frames of VLAN 10 and VLAN 20 are tagged, frames of VLAN 1 are untagged.]
Tagged frame: Dest | Src | 802.1q tag | Ethertype | Data | FCS
Port 7 is member of:
VLAN 10 -> tag = 10
VLAN 20 -> tag = 20
VLAN 1 -> untagged

Ethernet Technology
By default, a trunk carries all the VLANs configured in the switch. The process of removing unused VLANs
from the trunk is called VLAN pruning.

Section 2 Page 42
3. Virtual LANs
802.1Q tagging
Untagged frame: Destination Address | Source Address | Length/Type | Data | PAD | FCS
Tagged frame: Destination Address | Source Address | Ethertype = 0x8100 | Tag Control Information | Length/Type | Data | PAD | FCS
Ethertype = 0x8100: the next field contains a VLAN tag. The tag adds 4 bytes to the frame.
Tag Control Information: User Priority (802.1p) | CFI | VID (VLAN ID) 12 bits
User Priority (3 bits) - used for Class of Service (CoS) marking in 802.1p
CFI (1 bit) - Canonical Format Identifier. Set to 0 for Ethernet networks
VLAN id (12 bits) - VLAN identifier. It can take values in the range between 0 and 4095. Value 1 is usually assigned to the Default VLAN
Ethernet Technology
The tagging scheme proposed by the 802.3ac standard recommends the addition of the four octets after
the source MAC address. Their presence is indicated by a particular value of the EtherType field (called
TPID), which has been fixed to be equal to 0x8100. When a frame has the EtherType equal to 0x8100,
this frame carries the tag IEEE 802.1Q/802.1p. The tag is stored in the following two octets and it
contains 3 bits of user priority, 1 bit of Canonical Format Identifier (CFI), and 12 bits of VLAN ID (VID).
The 3 bits of user priority are used by the 802.1p standard; the CFI is used for compatibility reasons
between Ethernet-type networks and Token Ring-type networks. The VID is the identification of the
VLAN, which is basically used by the 802.1Q standard; being on 12 bits, it allows the identification of
4096 VLANs.
After the two octets of TPID and the two octets of the Tag Control Information field there are two octets
that originally would have been located after the Source Address field where there is the TPID. They
contain either the MAC length in the case of IEEE 802.3 or the EtherType in the case of Ethernet II.
Note: Adding a tag to a frame implies that the FCS field has to be recomputed by the switch.
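The layout of the four tag octets (2 octets of TPID, then 3 bits of priority, 1 bit of CFI and 12 bits of VID) can be sketched with a pair of helper functions. The names `add_tag` and `read_tag` are hypothetical, and the frames handled here carry no FCS, so the recomputation mentioned in the note is left out.

```python
import struct

TPID = 0x8100   # EtherType value announcing an 802.1Q tag

def add_tag(frame: bytes, vid: int, priority: int = 0, cfi: int = 0) -> bytes:
    """Insert a 4-byte 802.1Q tag after the source MAC of an untagged frame (no FCS)."""
    if not 0 <= vid <= 0xFFF or not 0 <= priority <= 7 or cfi not in (0, 1):
        raise ValueError("field out of range")
    tci = (priority << 13) | (cfi << 12) | vid       # Tag Control Information
    return frame[:12] + struct.pack("!HH", TPID, tci) + frame[12:]

def read_tag(frame: bytes):
    """Return (priority, cfi, vid) for a tagged frame, or None if untagged."""
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID:
        return None
    return tci >> 13, (tci >> 12) & 1, tci & 0xFFF

untagged = bytes(12) + b"\x08\x00" + b"payload"      # dest + src + EtherType + data
tagged = add_tag(untagged, vid=10, priority=5)
print(read_tag(tagged), read_tag(untagged))
```

The original EtherType (0x0800 here) ends up right after the tag, as described in the text.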

Section 2 Page 43
3. Virtual LANs
Aggregation layer problem
[Figure: sites of Customer 1 (VLAN 40, VLAN 41, VLAN 42) and Customer 2 (VLAN 40, VLAN 30) attached to the Service Provider Network; frames tagged 40 from both customers cross the provider network.]
Frame format: Dest | Src | 802.1q tag | Ethertype | Data | FCS
A single VLAN space to share among all clients = No overlapping allowed

Ethernet Technology
A Service Provider that offers transport services to its clients must support the client VLANs, i.e.
transparently transport the VLAN tag across the network. It means that all the provider's customers are
sharing the same VLAN space, i.e. VLAN id range 1 to 4095.
Two customers configuring their networks independently might choose VLAN identifiers that are identical. In
that case, the provider egress switch cannot tell which customer network is the actual destination of the frame.
For this reason, no overlapping can be allowed. Besides, the maximum limit of 4095 VLANs is usually sufficient for
enterprise networks but might not be enough for a Provider network.

Section 2 Page 44
3. Virtual LANs
Q in Q tagging
VLAN ID 10 -> Customer1 -> port 2
VLAN ID 20 -> Customer2 -> port 5
[Figure: sites of Customer 1 (VLAN 40, VLAN 41) and Customer 2 (VLAN 40, VLAN 30) attached to the Service Provider Network; Customer 1 frames of VLAN 40 cross the provider network with the two tags 10 and 40.]
Frame format: Dest | Src | Customer ID | Site ID | Ethertype | Packet | FCS
The CPE adds a tag to identify the customer. Overlapping VLAN ids
in different customers are not a problem
Ethernet Technology
A solution to the problem in the previous slide might be the use of an additional VLAN tag. This tag could be
inserted by the provider or the remote CPE and it will identify the customer or service. This method of
encapsulation is called Q in Q.
With Q in Q encapsulation, every customer can potentially use the whole VLAN id space.
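Tag stacking can be sketched by pushing a second tag in front of the customer tag. The outer tag values (10 for Customer 1, 20 for Customer 2) come from the slide. The outer TPID 0x88A8 is an assumption borrowed from IEEE 802.1ad; the text does not say which value is used, and some equipment reuses 0x8100. The helper names are hypothetical and the frames carry no FCS.

```python
import struct

C_TPID = 0x8100      # customer tag, as in 802.1Q
S_TPID = 0x88A8      # assumption: provider tag value from IEEE 802.1ad

def push_tag(frame: bytes, tpid: int, vid: int) -> bytes:
    """Insert a tag (priority and CFI left at 0) right after the source MAC."""
    return frame[:12] + struct.pack("!HH", tpid, vid & 0xFFF) + frame[12:]

def tags(frame: bytes):
    """Return the list of VLAN ids found after the source MAC, outermost first."""
    found, offset = [], 12
    while True:
        tpid, tci = struct.unpack("!HH", frame[offset:offset + 4])
        if tpid not in (C_TPID, S_TPID):
            return found
        found.append(tci & 0xFFF)
        offset += 4

base = bytes(12) + b"\x08\x00" + bytes(46)                  # untagged IP frame
cust1 = push_tag(push_tag(base, C_TPID, 40), S_TPID, 10)    # Customer 1, VLAN 40
cust2 = push_tag(push_tag(base, C_TPID, 40), S_TPID, 20)    # Customer 2, VLAN 40
print(tags(cust1), tags(cust2))
```

Both customers use VLAN 40, yet the outer tag keeps their frames distinguishable inside the provider network.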

Section 2 Page 45
4. LAN Authentication

Ethernet Technology

Section 2 Page 46
Who are you ?
Authorized User
Protected resources
Unauthorized User

Ethernet Technology
IEEE 802.1x-2001 - Port-based network access control

802.1aa - Revision of the 802.1x, work in progress

Section 2 Page 47
802.1x components
[Figure: Supplicants reach the Authenticators over a wired connection (Network Access Server) or a wireless association (Access Point). The Authenticators guard the Protected Network and consult the Authentication Server (RADIUS). Steps (1), (2) and (3) are marked on the figure.]
1. Authenticator detects the presence of the client and sets port to unauthorized state. The authenticator sends an EAP-Request to the supplicant.
2. Supplicant responds and the authenticator forwards the response to the RADIUS server. The RADIUS will verify the client credentials.
3. If the authentication server accepts the request, the authenticator sets the port to authorized state and normal traffic is forwarded.

Ethernet Technology
IEEE 802.1X is an IEEE standard for port-based Network Access Control. It provides an authentication
mechanism to devices wishing to attach to a LAN, either establishing a point-to-point connection or
preventing it if authentication fails. It is used for most wireless 802.11 access points and is based on the
Extensible Authentication Protocol (EAP).
802.1X involves communications between a supplicant, authenticator, and authentication server. The
supplicant is often software on a client device, such as a laptop, the authenticator is a wired Ethernet
switch or wireless access point, and an authentication server is generally a RADIUS database. The
authenticator acts like a security guard to a protected network. The supplicant (i.e., client device) is not
allowed access through the authenticator to the protected side of the network until the supplicant's
identity is authorized.
Upon detection of the new client (supplicant), the port on the switch (authenticator) is enabled and set to
the "unauthorized" state. In this state, only 802.1X traffic is allowed; other traffic, such as DHCP and HTTP, is
blocked at the data link layer. The authenticator sends out the EAP-Request identity to the supplicant, the
supplicant responds with the EAP-response packet that the authenticator forwards to the authenticating
server. If the authenticating server accepts the request, the authenticator sets the port to the "authorized"
mode and normal traffic is allowed. When the supplicant logs off, it sends an EAP-logoff message to the
authenticator. The authenticator then sets the port to the "unauthorized" state, once again blocking all non-EAP traffic.
Note: In wireless environments, instead of a physical link, the supplicant creates an association with an
access point.
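The port behaviour described above can be sketched as a two-state object. The class and method names are hypothetical, and the traffic types are plain strings chosen for the example; a real authenticator inspects frames rather than labels.

```python
class ControlledPort:
    """Authenticator port: unauthorized until the authentication server accepts."""

    def __init__(self):
        self.authorized = False               # a newly detected client starts unauthorized

    def on_server_reply(self, accepted: bool):
        self.authorized = accepted            # accept -> authorized, reject -> unauthorized

    def on_eap_logoff(self):
        self.authorized = False               # supplicant logged off

    def allows(self, traffic: str) -> bool:
        """802.1X (EAPOL) traffic always passes; anything else needs authorization."""
        return traffic == "eapol" or self.authorized

port = ControlledPort()
print(port.allows("dhcp"), port.allows("eapol"))   # before authentication
port.on_server_reply(accepted=True)
print(port.allows("http"))                         # after a successful authentication
port.on_eap_logoff()
print(port.allows("http"))                         # after logoff
```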

Section 2 Page 48
EAP message format
EAP packet: Code (1 byte) | Identifier (1 byte) | Length (2 bytes, total packet length) | Data
Code: 1 = Request, 2 = Response, 3 = Success, 4 = Failure
EAP Request/Response Packet: Code (1 byte) | Identifier (1 byte) | Length (2 bytes, total packet length) | Type (1 byte) | Type-Data
Type:
1 = Identity
2 = Notification
3 = Nak (response only)
4 = MD5-Challenge
5 = OTP (One Time Password)
9 = RSA Public Key Authentication
13 = EAP-TLS
17 = EAP-Cisco Wireless (LEAP)
21 = EAP-TTLS
22 = Remote Access Service
23 = UMTS Authentication and Key Agreement
25 = PEAP
26 = MS-EAP Authentication
EAP Configuration Negotiation Packet: Code | Length
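The packet layout on this slide (1-byte Code, 1-byte Identifier, 2-byte Length covering the whole packet, then a 1-byte Type and Type-Data for Requests and Responses) can be decoded with a few lines. The function name `parse_eap` and the sample packet are hypothetical.

```python
import struct

CODES = {1: "Request", 2: "Response", 3: "Success", 4: "Failure"}

def parse_eap(packet: bytes) -> dict:
    """Decode the fixed EAP header and, for Request/Response, the Type field."""
    code, identifier, length = struct.unpack("!BBH", packet[:4])
    info = {"code": CODES.get(code, "unknown"),
            "identifier": identifier, "length": length}
    if code in (1, 2) and length > 4:          # Request/Response carry a Type byte
        info["type"] = packet[4]
        info["type_data"] = packet[5:length]
    return info

identity_request = struct.pack("!BBHB", 1, 7, 5, 1)   # Request, id 7, Type 1
print(parse_eap(identity_request))
print(parse_eap(struct.pack("!BBH", 3, 7, 4)))        # Success has no Type field
```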

Ethernet Technology

Section 2 Page 49
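Authentication Prot. (0xC227)
802.1x authentication
[Figure: message flow between the Supplicant, the Authenticator (NAS or Access Point) and the Authentication Server (RADIUS). EAPOL encapsulation is used between supplicant and authenticator, RADIUS encapsulation between authenticator and authentication server.]
1. Authenticator: presence detected
2. Authenticator -> Supplicant (EAPOL): EAP - Identity Request
3. Supplicant -> Authenticator (EAPOL): EAP-Response (Identity)
4. Authenticator -> Authentication Server: RADIUS Access-Request carrying the EAP-Response (Identity)
5. Authentication Server -> Authenticator: RADIUS Access-Accept carrying EAP-Success, or RADIUS Access-Reject carrying EAP-Failure
6. Authenticator -> Supplicant (EAPOL): EAP - Success or EAP - Failure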

Ethernet Technology
EAP - Extensible Authentication Protocol (RFC 2284)

RADIUS support for EAP (RFC 3579)
The protocol used to carry EAP messages in 802.1x is called EAP encapsulation over LANs (EAPOL).
It is currently defined for Ethernet-like LANs including 802.11 wireless, as well as Token Ring and
FDDI LANs. A type 0 EAPOL frame carries an EAP message. The type 0 indicates to the receiver (either
supplicant or authenticator) that it should strip off the EAPOL encapsulation and process the EAP data.
EAP messages are encapsulated and transported within Ethernet frames with the Ethertype field set to the
value 0x888E. EAPOL carries the EAP messages across the LAN between the Authenticator and the supplicant, while RADIUS or DIAMETER
carries them between the Authenticator and the authentication server.
The standard requires the implementation of the following EAP-methods:
MD5 challenge
One Time passwords (OTP)
Generic Token Card
In addition, there are many proprietary and RFC-based EAP-methods: EAP-TLS, EAP-TTLS, EAP-FAST, EAP-LEAP, etc.

Section 2 Page 50
Blank page

Ethernet Technology

Section 2 Page 51
End of Section

Ethernet Technology

Section 2 Page 52
Section 3
Point to Point Transport
IP Technology

Section 3 Module
Page 1

Section 3 Module
Page 2
1. Point-to-Point protocol (PPP)
3 3


Section 3 Module
Page 3
1. Point to Point protocol
What is PPP ?
PPP frame: Flag (7E) | Address (FF) | Control (03) | Protocol (2 bytes) | Payload (maximum 1500 bytes) | FCS (2 or 4 bytes) | Flag (7E)
[Figure: a PPP connection between two routers across a transport network (leased line, SDH/PDH, ISDN, PSTN, L2TP/GRE tunnels, etc), and a PPP connection between a client and a Network Access Server (NAS) across an access network (PSTN, ISDN, Wifi, GPRS/UMTS) in front of the IP network.]


PPP is a connection-oriented protocol that enables layer two links over a variety of different physical
layer connections. It is supported on both synchronous and asynchronous lines, and can operate in half-duplex or full-duplex mode. It was designed to carry IP traffic but is general enough to allow any type of
network layer datagram to be sent over a PPP connection. As its name implies, it is for point-to-point
connections between exactly two devices, and assumes that frames are sent and received in the same
order.
PPP is a complete link layer protocol suite for devices using TCP/IP, which provides framing,
encapsulation, authentication, quality monitoring and other features to enable robust operation of
TCP/IP over a variety of physical layer connections.
Flag: Indicates the start of a PPP frame. Always has the value 01111110 binary (0x7E)
Address: this field has no real meaning. It is thus always set to 11111111 (0xFF or 255 decimal), which
is equivalent to a broadcast (it means all stations).
Control: in PPP it is set to 00000011 (0x03 or 3 decimal).
Protocol: Identifies the protocol of the datagram encapsulated in the Information field of the frame.
Information: Zero or more bytes of payload that contains either data or control information, depending
on the frame type. For regular PPP data frames the network-layer datagram is encapsulated here. For
control frames, the control information fields are placed here instead.
Padding: In some cases, additional dummy bytes may be added to pad out the size of the PPP frame.
Frame Check Sequence (FCS): A checksum computed over the frame to provide basic protection against
errors in transmission. This is a CRC code similar to the one used for other layer two protocol error
protection schemes such as the one used in Ethernet. It can be either 16 bits or 32 bits in size (default is
16 bits). The FCS is calculated over the Address, Control, Protocol, Information and Padding fields.
Flag: Indicates the end of a PPP frame. Always has the value 01111110 binary (0x7E)
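The field list above can be turned into a small frame builder. This is a sketch with several simplifications: it uses the default 16-bit FCS, computed bit by bit over the Address, Control, Protocol and Information fields, adds no padding, and leaves out the escaping of flag bytes that may occur inside the payload. The function names are hypothetical; the protocol value 0x0021 used in the example identifies an IP datagram.

```python
import struct

def fcs16(data: bytes) -> int:
    """Bitwise CRC-16 as used for the 16-bit PPP FCS (reflected polynomial 0x8408)."""
    fcs = 0xFFFF
    for byte in data:
        fcs ^= byte
        for _ in range(8):
            fcs = (fcs >> 1) ^ 0x8408 if fcs & 1 else fcs >> 1
    return fcs ^ 0xFFFF

def build_ppp_frame(protocol: int, payload: bytes) -> bytes:
    """Flag | Address FF | Control 03 | Protocol | Information | FCS | Flag."""
    body = b"\xff\x03" + struct.pack("!H", protocol) + payload
    return b"\x7e" + body + struct.pack("<H", fcs16(body)) + b"\x7e"

frame = build_ppp_frame(0x0021, b"ip-datagram")
print(frame.hex())
```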
Section 3 Module
Page 4
PPP connection setup

[Figure: a client sets up a PPP connection with the NAS across the access network to reach the IP network. The set-up goes through the following phases.]
1. LCP negotiation: compression, authentication protocol selection, ...
2. Authentication: PAP, CHAP, MS-CHAP, EAP, ...
3. NCP negotiation: IP address, ...
4. Data transfer


Even though PPP is called a protocol, and even though it is considered part of TCP/IP (depending on
whom you ask), it is really more a protocol suite than a particular protocol. The operation of PPP is based
on procedures defined in many individual protocols.
The PPP standard itself describes three main components of PPP:
PPP Encapsulation Method: The primary job of PPP is to take higher-layer messages such as IP datagrams
and encapsulate them for transmission over the underlying physical layer link. To this end, PPP defines a
special frame format for encapsulating data for transmission, based on the framing used in the HDLC
protocol. The PPP frame has been specially designed to be small in size and contain only simple fields, to
maximize bandwidth efficiency and speed in processing.
Link Control Protocol (LCP): The PPP Link Control Protocol (LCP) is responsible for setting up,
maintaining and terminating the link between devices. It is a flexible, extensible protocol that allows
many configuration parameters to be exchanged to ensure that both devices agree on how the link will
be used.
Network Control Protocols (NCPs): PPP supports the encapsulation of many different layer three
datagram types. Some of these require additional setup before the link can be activated. After the
general link setup is completed with LCP, control is passed to the PPP Network Control Protocol (NCP)
specific to the layer three protocol being carried on the PPP link. For example, when IP is carried over
PPP the NCP used is the PPP Internet Protocol Control Protocol (IPCP). Other NCPs are defined for
supporting the IPX protocol, the NetBIOS Frames (NBF) protocol, and so forth.

1 Overview
PPP standards

Network layer:  IP, IPX, AppleTalk
Link layer:     PPP, made up of NCP, the authentication protocols (CHAP, PAP), LCP and HDLC framing
Physical layer: ADSL/ATM, SDH/PDH, ISDN

Additional PPP functional groups

LCP Support Protocols: Several protocols are included in the PPP suite that are used during the link
negotiation process, either to manage it or to configure options. Examples include the authentication
protocols CHAP and PAP, which are used by LCP during the optional authentication phase.
LCP Optional Feature Protocols: A number of protocols have been added to the basic PPP suite over the
years to enhance its operation after a link has been set up and datagrams are being passed between
devices. For example, the PPP Compression Control Protocol (CCP) allows compression of PPP data, the
PPP Encryption Control Protocol (ECP) enables datagrams to be encrypted for security, and the PPP
Multilink Protocol (ML/PPP) allows a single PPP link to be operated over multiple physical links. The use
of these features often also requires additional setup during link negotiation, so several define
extensions (such as extra configuration options) that are negotiated as part of LCP.

LCP frame format

An LCP packet is carried in a PPP frame whose Protocol field is C021:

Protocol = C021 (16 bits) | Code (8 bits) | Ident | Length (16 bits) | Data

Ident: request/response number.
Length = Code + Ident + Length + Data.
In configuration packets, Data is a list of options: Type | Length | Data, where the option Length = Type + Length + Data.

Code values
Set up:
1: Configure Request
2: Configure Ack
3: Configure Nack
4: Configure Reject
Termination:
5: Terminate Request
6: Terminate Ack
Link management:
7: Code Reject
8: Protocol Reject
9: Echo Request
10: Echo Reply
11: Discard Request
Extension:
12: Identification
13: Time Remaining

Option types
1: Maximum Receive Unit
2: Asynchronous Control Character Map
3: Authentication Protocol (PAP, CHAP)
4: Link Quality Protocol
5: Magic number (loop detection)
7: Protocol field compression
8: Address and control field compression
9: FCS alternative
10: Self describing padding (padding ff)
13: Callback
14: Compound frame

There are three classes of LCP packets:

1. Link Configuration packets, used to establish and configure a link (Configure-Request, Configure-Ack, Configure-Nak and Configure-Reject).
2. Link Termination packets, used to terminate a link (Terminate-Request and Terminate-Ack).
3. Link Maintenance packets, used to manage and debug a link (Code-Reject, Protocol-Reject, Echo-Request, Echo-Reply and Discard-Request).
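
As an illustration of the Code / Ident / Length / options layout, the following Python sketch builds and parses an LCP packet. The helper names `build_lcp` and `parse_lcp` are hypothetical, and the sketch assumes that every option follows the Type | Length | Data pattern described on the slide.

```python
import struct


def build_lcp(code: int, identifier: int, options: list[tuple[int, bytes]]) -> bytes:
    """Build Code | Ident | Length | options; each option is Type | Length | Data."""
    data = b"".join(bytes([opt_type, 2 + len(value)]) + value
                    for opt_type, value in options)
    # The packet Length covers Code, Ident, Length and Data (4 header octets + data).
    return struct.pack("!BBH", code, identifier, 4 + len(data)) + data


def parse_lcp(packet: bytes):
    """Return (code, identifier, [(option type, option data), ...])."""
    code, identifier, length = struct.unpack("!BBH", packet[:4])
    data = packet[4:length]  # octets beyond Length are padding and are ignored
    options, i = [], 0
    while i + 2 <= len(data):
        opt_type, opt_len = data[i], data[i + 1]
        if opt_len < 2:  # malformed option, stop parsing
            break
        options.append((opt_type, data[i + 2:i + opt_len]))
        i += opt_len
    return code, identifier, options


# Configure-Request (code 1) with MRU = 1000 (type 1) and a magic number (type 5).
request = build_lcp(1, 0x1F, [(1, struct.pack("!H", 1000)),
                              (5, bytes.fromhex("00217cbb"))])
print(request.hex())
print(parse_lcp(request))
```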

LCP options

Option                              Type  Length  Data
Maximum Receive Unit                01    04      MRU, 16 bits (default 1500)
Asynchronous Control Character Map  02    06      Map, 32 bits (default 0xffffffff)
Authentication Protocol             03    04      C023 (PAP) or C223 (CHAP), 16 bits (default: no authentication)

(For CHAP the option also carries a one-octet Algorithm field, so its Length is 05.)

Maximum-Receive-Unit
This Configuration Option may be sent to inform the peer that the implementation can receive
larger frames, or to request that the peer send smaller frames. If smaller frames are requested, an
implementation MUST still be able to receive 1500 octet frames in case link synchronization is lost.
Authentication-Protocol
On some links it may be desirable to require a peer to authenticate itself before allowing
network-layer protocol packets to be exchanged. This Configuration Option provides a way to negotiate the
use of a specific authentication protocol. By default, authentication is not required.
Quality-Protocol
On some links it may be desirable to determine when, and how often, the link is dropping data. This
process is called link quality monitoring.
This Configuration Option provides a way to negotiate the use of a specific protocol for link quality
monitoring. By default, link quality monitoring is disabled.
Async-Control-Character-Map
This Configuration Option provides a way to negotiate the use of control character mapping on
asynchronous links. By default, PPP maps all control characters into an appropriate two character
sequence. However, it is rarely necessary to map all control characters and often it is unnecessary
to map any characters.

LCP options (continued)

Option                         Type  Length  Data
Magic number                   05    06      Magic number, 32 bits
Protocol compression           07    02      none (by default no compression)
Address & Control compression  08    02      none (by default no compression)

Frame layout when compression is not active (default):
Flag 7E | Address FF | Control 03 | Protocol | ... | CRC | Flag 7E
Frame layout with both compressions active:
Flag 7E | Prot (1 byte) | ... | CRC | Flag 7E

Magic-Number
The Magic-Number field is four octets and aids in detecting links which are in the looped-back
condition.
Protocol-Field-Compression
This Configuration Option provides a way to negotiate the compression of the Data Link Layer
Protocol field. By default, all implementations MUST transmit standard PPP frames with two octet
Protocol fields. However, PPP Protocol field numbers are chosen such that some values may be
compressed into a single octet form which is clearly distinguishable from the two octet form.
Address-and-Control-Field-Compression
This Configuration Option provides a way to negotiate the compression of the Data Link Layer
Address and Control fields. By default, all implementations MUST transmit frames with Address
and Control fields and MUST use the hexadecimal values 0xff and 0x03 respectively. Since these
fields have constant values, they are easily compressed. This Configuration Option is sent to
inform the peer that the implementation can receive compressed Address and Control fields.
Compressed Address and Control fields are formed by simply omitting them.
Callback
This Configuration Option provides a method for an implementation to request a dial-up peer to
call back. This option might be used for many diverse purposes, such as savings on toll charges.
Compound-Frames
This Configuration Option provides a method for an implementation to send multiple PPP
encapsulated packets within the same frame.

One way LCP negotiation example

A (NAS) sends the Configure-Requests and B (Client) answers them.

1. A -> B  Configure-Request Id: 1f / MRU: 1000; asyncmap: 0; Auth: PAP; MagicNb: 2f 4e6a; Prot-Compression; Addr/ctl-compression
   B's evaluation: MRU: 1000 (ack); asyncmap: 0 (nack); Auth: PAP (ack); MagicNb: 2f 4e6a (ack); Prot-Compression (rej); Addr/ctl-compression (ack)
2. B -> A  Configure-Reject Id: 1f / Prot-Compression
3. A -> B  Configure-Request Id: 20 / MRU: 1000; asyncmap: 0; Auth: PAP; MagicNumber: 2f 4e6a; Addr/ctl-compression
   B's evaluation: MRU: 1000 (ack); asyncmap: 0 (nack); Auth: PAP (ack)
4. B -> A  Configure-Nack Id: 20 / asyncmap: 0x2000
5. A -> B  Configure-Request Id: 21 / MRU: 1000; Auth: PAP; MagicNumber: 2f 4e6a; Addr/ctl-compression
   (A prefers the default value of asyncmap, so the option is no longer sent)
   B's evaluation: MRU: 1000 (ack); Auth: PAP (ack)
6. B -> A  Configure-Ack Id: 21 / MRU: 1000; Auth: PAP; MagicNumber: 2f 4e6a; Addr/ctl-compression

The process starts with the initiating device (here, A) creating a Configure-Request frame that contains a
variable number of configuration options that it wants to see set up on the link. This is basically device
A's wish list for how it wants the link created.
The other device (B) receives the Configure-Request and processes it. It then has three choices of how to
respond:
1. If every option in it is acceptable in every way, device B sends back a Configure-Ack
(acknowledge). The negotiation in this direction is complete.
2. If all the options that device A sent are valid ones that device B recognizes and is capable of
negotiating, but it doesn't accept the values device A sent, then device B returns a Configure-Nak
(negative acknowledge) frame. This message includes a copy of each configuration option that B
found unacceptable, carrying a value that B would accept.
3. If any of the options that A sent were either unrecognized by B, or represent ways of using the link
that B considers not only unacceptable but not even subject to negotiation, it returns a
Configure-Reject containing each of the objectionable options.
Even after receiving a Nak or a Reject, device A can retry the negotiation with a new Configure-Request.
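
The three possible answers can be sketched as a small decision function. This is a simplified model and not a full LCP state machine: options are plain dictionary entries, and the names `respond`, `supported` and `acceptable` are assumptions made for the illustration. The values reuse those of the one-way negotiation example.

```python
CONFIGURE_ACK, CONFIGURE_NAK, CONFIGURE_REJECT = 2, 3, 4


def respond(requested: dict, supported: set, acceptable: dict):
    """Choose the reply to a Configure-Request.

    requested:  option name -> value proposed by the sender
    supported:  option names the responder is willing to negotiate
    acceptable: option name -> the only value the responder accepts
                (options absent from this dict are accepted with any value)
    """
    # Unknown or non-negotiable options are rejected first.
    rejected = {name: value for name, value in requested.items()
                if name not in supported}
    if rejected:
        return CONFIGURE_REJECT, rejected
    # Known options with an unacceptable value are Nak'ed with a hint.
    naked = {name: acceptable[name] for name, value in requested.items()
             if name in acceptable and acceptable[name] != value}
    if naked:
        return CONFIGURE_NAK, naked
    return CONFIGURE_ACK, dict(requested)


supported = {"MRU", "asyncmap", "Auth", "MagicNb", "Addr/ctl-compression"}
acceptable = {"asyncmap": 0x2000}

req_1f = {"MRU": 1000, "asyncmap": 0, "Auth": "PAP", "MagicNb": "2f 4e6a",
          "Prot-Compression": True, "Addr/ctl-compression": True}
req_20 = {k: v for k, v in req_1f.items() if k != "Prot-Compression"}
req_21 = {k: v for k, v in req_20.items() if k != "asyncmap"}

for request in (req_1f, req_20, req_21):
    print(respond(request, supported, acceptable))
```

Rejecting before Nak'ing mirrors the example, where Prot-Compression is rejected before the asyncmap value is discussed.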

LCP negotiation example

Exchange between the CLIENT and the NAS (each side sends its own Configure-Request):

CLIENT -> NAS  LCP Conf-Req Id:1 { Async_map:0x000a0000, Magic_number:0x00217cbb, Prot_comp, Addr/ctl_comp, Callback }
NAS -> CLIENT  LCP Conf-Req Id:1 { MRU:1524, Async_map:0x00000000, Authent_prot:PAP, Prot_comp, Addr/ctl_comp }
CLIENT -> NAS  LCP Conf-Ack Id:1 { MRU:1524, Async_map:0x00000000, Authent_prot:PAP, Prot_comp, Addr/ctl_comp }
NAS -> CLIENT  LCP Conf-Rej Id:1 { Callback }
CLIENT -> NAS  LCP Conf-Req Id:2 { Async_map:0x000a0000, Magic_number:0x00217cbb, Prot_comp, Addr/ctl_comp }
NAS -> CLIENT  LCP Conf-Ack Id:2 { Async_map:0x000a0000, Magic_number:0x00217cbb, Prot_comp, Addr/ctl_comp }

Password Authentication Protocol

Client ("Connect To" dialog): User name Jack, Password secret
NAS user database: username Alice password test; username Jack password secret

Client -> NAS  PAP Authenticate Request: Jack + secret
NAS -> Client  PAP Authenticate Ack

The Password Authentication Protocol (PAP) provides a simple method for the peer to establish its
identity using a 2-way handshake. This is done only upon initial link establishment.
PAP is not a strong authentication method. Passwords are sent over the circuit "in the clear", and there
is no protection from playback.
When PAP is enabled, the remote device attempting to connect to the access server is required to send
an authentication request. If the username and password specified in the authentication request are
accepted, the access server sends an authentication acknowledgement.
After you have enabled CHAP or PAP, the access server will require authentication from remote devices
dialing in to the access server. If the remote device does not support the enabled protocol, the call will
be dropped.
To use CHAP or PAP, you must perform the following tasks:
1. Enable PPP encapsulation.
2. Enable CHAP or PAP on the interface.
3. For CHAP, configure host name authentication and the secret or password for each remote
system with which authentication is required.

PAP message format (RFC 1334)

Protocol = C023 | Code | Ident | Length | Data

Code values:
1: Authenticate Request
2: Authenticate Ack
3: Authenticate Nack

Data of an Authenticate Request: ID length | Peer ID | PW length | Password
Data of an Authenticate Ack or Nack: Message length | Message

The Code field is one octet and identifies the type of PAP packet. PAP Codes are assigned as follows:
1 Authenticate-Request
2 Authenticate-Ack
3 Authenticate-Nak
Identifier
The Identifier field is one octet and aids in matching requests and replies.
Length
The Length field is two octets and indicates the length of the PAP packet including the Code,
Identifier, Length and Data fields. Octets outside the range of the Length field should be treated as
Data Link Layer padding and should be ignored on reception.
Data
The Data field is zero or more octets. The format of the Data field is determined by the Code
field.
Peer-ID
The Peer-ID field is zero or more octets and indicates the name of the peer to be
authenticated.
Password
The Password field is zero or more octets and indicates the password to be used for
authentication.
Message
The Message field is zero or more octets, and its contents are implementation
dependent. It is intended to be human readable, and MUST NOT affect operation of the
protocol. It is recommended that the message contain displayable ASCII characters.
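
To make the Authenticate-Request layout concrete, here is a short Python sketch using the user name and password from the slide. The function names are illustrative only. The output shows why PAP is weak: the password appears unmodified in the packet.

```python
import struct

AUTHENTICATE_REQUEST = 1


def build_pap_request(identifier: int, peer_id: bytes, password: bytes) -> bytes:
    """Code | Ident | Length | Peer-ID length | Peer-ID | Passwd length | Password."""
    data = bytes([len(peer_id)]) + peer_id + bytes([len(password)]) + password
    return struct.pack("!BBH", AUTHENTICATE_REQUEST, identifier, 4 + len(data)) + data


def parse_pap_request(packet: bytes):
    """Return (code, identifier, peer_id, password) from an Authenticate-Request."""
    code, identifier, _length = struct.unpack("!BBH", packet[:4])
    id_len = packet[4]
    peer_id = packet[5:5 + id_len]
    pw_len = packet[5 + id_len]
    password = packet[6 + id_len:6 + id_len + pw_len]
    return code, identifier, peer_id, password


packet = build_pap_request(1, b"Jack", b"secret")
print(packet)                     # the password is visible in clear text
print(parse_pap_request(packet))
```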
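
CHAP (Challenge Handshake Authentication Protocol)

Client ("Connect To" dialog): Username Jack, Password secret
NAS: hostname ISP_a; user database: username Alice password test; username Jack password secret

NAS -> Client  Challenge: ISP_a + random number
Client         applies the non-reversible algorithm (MD5)
Client -> NAS  Response: Jack + MD5 result
NAS            applies the same MD5 calculation with the stored password and compares the results
NAS -> Client  Success: authentication succeeded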

The Challenge-Handshake Authentication Protocol (CHAP) is used to periodically verify the identity of the
peer using a 3-way handshake.
When CHAP is enabled on an interface and a remote device attempts to connect to it, the access server
sends a CHAP packet to the remote device. The CHAP packet requests or "challenges" the remote
device to respond. The challenge packet consists of an ID, a random number, and the host name of the
local router.
When the remote device receives the challenge packet, it concatenates the ID, the remote device's
password, and the random number, and then computes a one-way hash (MD5) over the result. The
remote device sends the hash back to the access server, along with the name associated with the
password used in the calculation.
When the access server receives the response, it uses the name it received to retrieve a password stored
in its user database. The retrieved password should be the same password the remote device used in
its calculation. The access server then computes the hash over the same concatenated information with
the retrieved password. If the result matches the result sent in the response packet, authentication
succeeds.
The benefit of using CHAP authentication is that the remote device's password is never transmitted in
clear text. This prevents other devices from stealing it and gaining illegal access to the ISP's network.
On many access servers, CHAP transactions occur only at the time a link is established, and the access
server does not request a password during the rest of the call, although the protocol itself allows the
challenge to be repeated at any time. (The local device can, however, respond to such requests from
other devices during a call.)
After you have enabled CHAP, the access server will require authentication from remote devices dialing
in to the access server. If the remote device does not support the enabled protocol, the call will be
dropped.
To use CHAP, you must perform the following tasks:
1. Enable PPP encapsulation.
2. Enable CHAP on the interface.
3. For CHAP, configure host name authentication and the secret or password for each remote system
with which authentication is required.

CHAP message format

Protocol = C223 | Code | Ident | Length | Data

Code values:
1: Challenge
2: Response
3: Success
4: Failure

Data of a Challenge: Challenge length (1 byte) | Challenge value | Name of system transmitting this packet
Data of a Response: Response length (1 byte) | Response value (128 bits, i.e. 16 bytes, with MD5) | Name of system transmitting this packet
Data of a Success or Failure: Message (optional)

Challenge and Response

The Challenge packet is used to begin the Challenge-Handshake Authentication Protocol. The
authenticator MUST transmit a CHAP packet with the Code field set to 1 (Challenge).
A Challenge packet MAY also be transmitted at any time during the Network-Layer Protocol phase to
ensure that the connection has not been altered.
Whenever a Challenge packet is received, the peer MUST transmit a CHAP packet with the Code field set
to 2 (Response).
Whenever a Response packet is received, the authenticator compares the Response Value with its own
calculation of the expected value. Based on this comparison, the authenticator MUST send a Success or
Failure packet.
The Challenge Value is a variable stream of octets. Each Challenge Value should be unique and
unpredictable, and the Challenge Value MUST be changed each time a Challenge is sent.
The Response Value is the one-way hash calculated over a stream of octets consisting of the Identifier,
followed by (concatenated with) the "secret", followed by (concatenated with) the Challenge Value.
The Name field is one or more octets representing the identification of the system transmitting the
packet.
The Message field is zero or more octets, and its contents are implementation dependent. It is intended
to be human readable, and MUST NOT affect operation of the protocol. It is recommended that the
message contain displayable ASCII characters.
Note: Because the Success might be lost, the authenticator MUST allow repeated Response packets after
completing the Authentication phase. To prevent discovery of alternative Names and Secrets, any
Response packets received having the current Challenge Identifier MUST return the same reply Code
returned when the Authentication phase completed (the message portion MAY be different). Any
Response packets received during any other phase MUST be silently discarded.
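
The Response Value calculation can be sketched in a few lines of Python with the standard `hashlib` module. The function names and the 16-byte challenge size are assumptions made for the illustration; the secret is the one shown on the CHAP slide.

```python
import hashlib
import hmac
import os


def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """One-way hash over Identifier | secret | Challenge Value (MD5)."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


def authenticator_check(identifier: int, stored_secret: bytes,
                        challenge: bytes, response: bytes) -> bool:
    """Recompute the expected value and compare it with the received Response."""
    expected = chap_response(identifier, stored_secret, challenge)
    return hmac.compare_digest(expected, response)


# Authenticator side: a fresh random Challenge Value for every Challenge.
identifier = 1
challenge = os.urandom(16)

# Peer side: only the hash travels over the link, never the secret itself.
response = chap_response(identifier, b"secret", challenge)

print(len(response))                                                    # 16 bytes = 128 bits
print(authenticator_check(identifier, b"secret", challenge, response))  # Success
print(authenticator_check(identifier, b"test", challenge, response))    # Failure
```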

NCP message format (IPCP, the NCP for IP)

Protocol = 8021 | Code | Ident | Length | Data

Ident: request/response number.
In configuration packets, Data is a list of options: Type | Length | Data.

Code values
Set-up:
1: Configure Request
2: Configure Ack
3: Configure Nack
4: Configure Reject
Release:
5: Terminate Request
6: Terminate Ack
Link management:
7: Code Reject

Option types
1: obsolete
2: IP compression protocol (RFC 1332)
3: IP Address (RFC 1332)
4: Mobile-IPv4 (RFC 2290)
129: Primary DNS Server Address (RFC 1877)
130: Primary NBNS Server Address (RFC 1877)
131: Secondary DNS Server Address (RFC 1877)
132: Secondary NBNS Server Address (RFC 1877)

NBNS = WINS

The IP Control Protocol (IPCP) is the NCP for IP and is responsible for configuring, enabling, and disabling
the IP protocol on both ends of the point-to-point link. The IPCP options negotiation sequence is the
same as for LCP, thus allowing the possibility of reusing the code.
IP-Compression-Protocol: Provides a way to negotiate the use of a specific compression protocol. By
default, compression is not enabled. Van Jacobson TCP/IP header compression reduces the size of the
TCP/IP headers to as few as three bytes. This can be a significant improvement on slow serial lines,
particularly for interactive traffic.
The IP-Compression-Protocol Configuration Option is used to indicate the ability to receive compressed
packets. Each end of the link must separately request this option if bi-directional compression is
desired.
IP-Address: Provides a way to negotiate the IP address to be used on the local end of the link. It allows
the sender of the Configure-Request to state which IP-address is desired, or to request that the peer
provide the information. The peer can provide this information by NAKing the option, and returning a
valid IP-address.
If negotiation about the remote IP-address is required, and the peer did not provide the option in its
Configure-Request, the option SHOULD be appended to a Configure-Nak. The value of the IP-address
given must be acceptable as the remote IP-address, or indicate a request that the peer provide the
information. By default, no IP address is assigned.
DNS Server Address: Defines a method for negotiating with the remote peer the address of the primary
and secondary DNS server to be used on the local end of the link. If the local peer requests an invalid server
address (which it will typically do intentionally), the remote peer specifies the address by NAKing this
option, and returning the IP address of a valid DNS server. By default, no address is provided.
NBNS Server Address: Defines a method for negotiating with the remote peer the address of the
primary and secondary NBNS server to be used on the local end of the link. If the local peer requests an
invalid server address (which it will typically do intentionally), the remote peer specifies the address by
NAKing this option, and returning the IP address of a valid NBNS server. By default, no NBNS
address is provided.
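
The IP-Address negotiation described above (request 0.0.0.0, receive a Nak carrying a valid address, request that address, receive an Ack) can be sketched as follows. The function names are hypothetical, the packets carry only the IP-Address option (type 3), and the address 194.1.2.3 is simply an example value.

```python
import ipaddress
import struct

CONF_REQ, CONF_ACK, CONF_NAK = 1, 2, 3
IP_ADDRESS_OPTION = 3


def ipcp_packet(code: int, identifier: int, address: str) -> bytes:
    """Code | Ident | Length | option (Type 03, Length 06, 4-byte address)."""
    option = bytes([IP_ADDRESS_OPTION, 6]) + ipaddress.IPv4Address(address).packed
    return struct.pack("!BBH", code, identifier, 4 + len(option)) + option


def server_reply(request: bytes, assigned: str) -> bytes:
    """Ack the request if it already carries the assigned address, else Nak it."""
    _code, identifier, _length = struct.unpack("!BBH", request[:4])
    requested = str(ipaddress.IPv4Address(request[6:10]))
    code = CONF_ACK if requested == assigned else CONF_NAK
    return ipcp_packet(code, identifier, assigned)


first = ipcp_packet(CONF_REQ, 1, "0.0.0.0")            # "please give me an address"
nak = server_reply(first, "194.1.2.3")                 # Nak with a valid address
offered = str(ipaddress.IPv4Address(nak[6:10]))
second = ipcp_packet(CONF_REQ, 2, offered)             # request the offered address
ack = server_reply(second, "194.1.2.3")                # Ack: address is configured

for packet in (first, nak, second, ack):
    print(packet.hex())
```

Each packet is ten octets long, which corresponds to a Length field of 0A.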

IPCP address negotiation

Client -> ISP  IPCP (8021) Req: Code=01, Ident, Length=0A, IP@ option: Type 03, Length 06, 0.0.0.0 (or wished IP@)
ISP -> Client  IPCP (8021) Nack: Code=03, Ident, Length=0A, IP@ option: Type 03, Length 06, 194.1.2.3 (valid IP@)
Client -> ISP  IPCP (8021) Req: Code=01, Ident, Length=0A, IP@ option: Type 03, Length 06, 194.1.2.3
ISP -> Client  IPCP (8021) Ack: Code=02, Ident, Length=0A, IP@ option: Type 03, Length 06, 194.1.2.3


IPCP: Van Jacobson compression

[Figure: Van Jacobson TCP/IP header compression]
Uncompressed headers (rows of 4 bytes):
IP: Version, Header length, Type Of Service, Datagram length, Identification, Flag, Datagram Offset, TTL, Protocol, Checksum, Source IP address, Destination IP address
TCP: Source port nb, Destination port nb, Sequence number, Ack. number, Header length, Reserved, flags (URG, ACK, PSH, RST, SYN, FIN), Window size, Checksum, Urgent pointer
Compressed header:
1 byte of flags indicating the presence of the following fields (C, I, P, S, A, W, U), Connection nb, Checksum TCP, Urgent Pointer (u), Window delta (w), Acknowledge delta (a), Sequence delta (s), ID delta (i), Data

One important option used with IPCP is Van Jacobson Header Compression, described in RFC 1144. It
reduces the size of the combined IP and TCP headers from 40 bytes to approximately 4. To do so, each end
of the link records the state of a set of TCP connections and replaces the full headers with encoded
updates. This works well in the normal case, where many of the fields are unchanged or are incremented by
small amounts between successive IP datagrams of a session.

Data compression negotiation: CCP

CCP: Compression Control Protocol

Protocol = 80FD (16 bits) | Code | Ident | Length (16 bits) | Data

Ident: request/response number.
In configuration packets, Data is a list of options: Type | Length | Data.

Code values
Setup:
1: Configure Request
2: Configure Ack
3: Configure Nack
4: Configure Reject
Release:
5: Terminate Request
6: Terminate Ack
Link management:
7: Code Reject
14: Reset-Request
15: Reset-Ack

Option types
0: OUI
1: Predictor type 1
2: Predictor type 2
3: Puddle Jumper
4-15: unassigned
16: Hewlett Packard PPC
17: Stac Electronics LZS
18: Microsoft PPC
19: Gandalf FZA
20: V.42bis compression
21: BSD LZW Compress


IP packet transfer

The IP datagram is carried in a PPP frame whose Protocol field is 0021:

Flag 7E (1 byte) | Address FF | Control 03 | Protocol 0021 | IP datagram | CRC (2 bytes) | Flag 7E (1 byte)

The Address, Control and Protocol fields could be compressed.

2. Multilink Point-to-Point protocol (MP)

Multilink PPP stack

Transport Layer Protocol
Network Layer Protocol
Multilink PPP
PPP (Line 1) | PPP (Line 2) | PPP (Line 3)

Multilink PPP is an optional feature of PPP, so it must be designed to integrate seamlessly into regular
PPP operation. To accomplish this, MP is implemented as a new architectural sublayer within PPP. In
essence, a Multilink PPP sublayer is inserted between the regular PPP mechanism and any network
layer protocols using PPP. This allows MP to take all network layer data to be sent over the PPP link and
spread it over multiple physical connections, without causing either the normal PPP mechanisms or the
network layer protocol interfaces to PPP to break.
It works by fragmenting whole PPP frames and sending the fragments over different physical links.

Multilink PPP option negotiation

1. LCP Configure Request { MRU = 1500; MRRU = 1500; End-Point Disc = 00-00-10-0B-F2-3A }
2. LCP Configure Nack { MRU = 1490; MRRU = 1490 } (reply from the peer)
3. LCP Configure Request { MRU = 1490; MRRU = 1490; End-Point Disc = 00-00-10-0B-F2-3A }
4. LCP Configure Ack { MRU = 1490; MRRU = 1490; End-Point Disc = 00-00-10-0B-F2-3A } (reply from the peer)

To use Multilink PPP, both devices must have it implemented as part of their PPP software and must
negotiate its use. This is done by LCP as part of the negotiation of basic link parameters in the Link
Establishment phase. Three new configuration options are defined to be negotiated to enable Multilink
PPP:
Multilink Maximum Received Reconstructed Unit: Provides the basic indication that the device
starting the negotiation supports MP and wants to use it. The option contains a value specifying the
maximum size of PPP frame it supports. If the device receiving this option does not support Multilink
PPP it must respond with a Configure-Reject LCP message.
Multilink Short Sequence Number Header Format: Allows devices to negotiate use of a shorter
sequence number field for MP frames, for efficiency.
Endpoint Discriminator: Uniquely identifies the system; used to allow devices to determine which links
go to which other devices.
Before MP can be used, a successful negotiation of at least the Multilink Maximum Received Reconstructed
Unit option must be performed on each of the links between the two devices. Once this is done and an LCP
link exists for each of the physical links, a virtual bundle is made of the LCP links and Multilink PPP is
enabled.

Multilink PPP mechanism

[Figure: on the sending side, the MP (Multilink PPP) sublayer splits a PPP frame into Frag.1, Frag.2 and Frag.3, which are sent over PPP Line 1, PPP Line 2 and PPP Line 3 respectively; on the receiving side, the MP sublayer reassembles the fragments into the original PPP frame.]

Multilink PPP basically sits between the network layer and the regular PPP links and acts as a middleman:
Transmission: Multilink PPP accepts datagrams received from any of the network layer protocols configured
using appropriate NCPs. It first encapsulates them into a modified version of the regular PPP frame. It then
takes that frame and decides how to transmit it over the multiple physical links. Typically, this is done by
dividing the frame into fragments that are evenly spread out over the set of links. These are then
encapsulated and sent over the physical links. However, an alternative strategy can also be implemented,
such as alternating full-sized frames between the links. Also, smaller frames are typically not
fragmented, nor are control frames such as those used for link configuration.
Reception: Multilink PPP takes the fragments received from all physical links and reassembles them into
the original PPP frame. That frame is then processed like any PPP frame, by looking at its Protocol field and
passing it to the appropriate network layer protocol.
The fragmenting of data in MP introduces a number of complexities that the protocol must handle. For
example, since fragments are being sent simultaneously, we need to identify them with a sequence
number to facilitate reassembly. We also need some control information to identify the first and last
fragments of a frame.

Multilink PPP frame format

Network layer: IP PDU = IP Header + IP Data
MP sub-layer: original PPP frame with ACFC & PFC = Prot. 0x21 (1 byte) | IP PDU, split into Frag.1 (which starts with Prot. 0x21), Frag.2 and Frag.3
PPP, one frame per line:
Line 1: Flag 0x7E | Add. 0xFF | Ctrl. 0x03 | Protocol 0x003D | MP Flags | Sequence Number | Frag.1 | CRC
Line 2: Flag 0x7E | Add. 0xFF | Ctrl. 0x03 | Protocol 0x003D | MP Flags | Sequence Number | Frag.2 | CRC
Line 3: Flag 0x7E | Add. 0xFF | Ctrl. 0x03 | Protocol 0x003D | MP Flags | Sequence Number | Frag.3 | CRC

Several of the fields that normally appear in a whole PPP frame aren't needed if that frame is going to
then be divided and placed into other PPP Multilink frames, so when fragmentation is to occur, they are
omitted when the original PPP frame is constructed, for efficiency's sake. Specifically:
The Flag fields at the start and end are used only for framing for transmission and aren't needed in the
logical frame being fragmented.
The FCS field is not needed, because each fragment has its own FCS field.
The compression options that are possible for any PPP frame are used when creating this original frame:
Address and Control Field Compression and Protocol Compression. This means that there are no Address
or Control fields in the frame, and the Protocol field is only one byte in size.
These changes save seven bytes on each PPP frame to be fragmented when the default 16-bit FCS is used
(two Flag bytes, two FCS bytes, the Address byte, the Control byte and one Protocol byte). As a result, the
original PPP frame has a very small header, consisting of only a one-byte Protocol field. The Protocol value of each
fragment is set to 0x003D to indicate an MP fragment, while the Protocol field of the original frame becomes
the first byte of data in the first fragment.
Each MP fragment carries the following fields:
Beginning Fragment Flag: When set to 1, flags this fragment as the first of the split-up PPP frame. It is
set to 0 for other fragments.
Ending Fragment Flag: When set to 1, flags this fragment as the last of the split-up PPP frame. It is set
to 0 for other fragments.
Reserved (2 or 6 bits): Not used, set to zero.
Sequence Number (12 or 24 bits): When a frame is split up, the fragments are given consecutive sequence
numbers so the receiving device can properly reassemble them.
Fragment Data: The actual fragment from the original PPP frame.
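
The following Python sketch illustrates fragmentation and reassembly with the long header format (Beginning flag, Ending flag, 6 reserved bits, 24-bit sequence number). It is a simplified model: the function names are illustrative, the frame is split evenly over the links, and lost fragments and sequence number wrap-around are not handled.

```python
import random
import struct


def fragment(frame: bytes, n_links: int, first_seq: int = 0) -> list[bytes]:
    """Split a non-empty logical PPP frame into at most n_links MP fragments."""
    if not frame or n_links < 1:
        raise ValueError("need a non-empty frame and at least one link")
    size = -(-len(frame) // n_links)  # ceiling division
    chunks = [frame[i:i + size] for i in range(0, len(frame), size)]
    fragments = []
    for i, chunk in enumerate(chunks):
        begin = 1 if i == 0 else 0
        end = 1 if i == len(chunks) - 1 else 0
        seq = (first_seq + i) & 0xFFFFFF
        # 32-bit MP header: B | E | 6 reserved zero bits | 24-bit sequence number
        header = struct.pack("!I", (begin << 31) | (end << 30) | seq)
        fragments.append(header + chunk)
    return fragments


def reassemble(fragments: list[bytes]) -> bytes:
    """Order fragments by sequence number and rebuild the original frame."""
    parsed = []
    for frag in fragments:
        (word,) = struct.unpack("!I", frag[:4])
        parsed.append((word & 0xFFFFFF, word >> 31, (word >> 30) & 1, frag[4:]))
    parsed.sort(key=lambda item: item[0])
    if parsed[0][1] != 1 or parsed[-1][2] != 1:
        raise ValueError("first or last fragment is missing")
    return b"".join(item[3] for item in parsed)


# Logical frame with ACFC and PFC: one-byte Protocol 0x21 followed by the IP PDU.
original = b"\x21" + b"IP datagram payload"
frags = fragment(original, n_links=3)

arrived = frags[:]
random.shuffle(arrived)  # the three lines may deliver in any order
print(reassemble(arrived) == original)
```

Each fragment would then be sent in its own PPP frame with Protocol 0x003D on one of the lines.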
End of Section
Section 4
IP Layer
IP Technology

Document History
Edition
Date
Author
Remarks
01
YYYY-MM-DD
First edition

1. IP Addressing
Analogy between PSTN Dialing and IP

Telephone dialing: each national PSTN has a country code (French PSTN = 33, Barbados PSTN = 1246, Finnish PSTN = 358, Russian PSTN = 7).
Telephone number = country code | designation number (the border between the two fields varies)

IP numbering: Class-A networks (large IP networks), Class-B networks (medium IP networks), Class-C networks (small IP networks).
IP address = Network ID | Host ID (the border depends on the class)

To understand the IP addressing format, an analogy can be drawn with the telephone numbering system.
Various countries have telephone networks.
Each country has a country code. Some codes comprise only one digit, some 2, others 3, etc.
So, to reach a particular telephone, you need to dial a number made up of:
a country code,
a designation number.
The boundary between the two fields varies according to the size of the country.
The total number of digits cannot exceed a certain limit. This means that small countries with 4-digit
country codes have less capacity in terms of number of subscribers possible than large countries with
single-digit country codes.
This is also the case with IP addressing where there are:
a few large networks,
a few more medium-sized networks,
a large number of small networks.
A device IP address is divided into two parts:

the Network Identifier (or Net ID),
the station identifier known as the Host ID.
The boundary between these 2 fields also varies.

The boundary can be placed in one of 3 positions and thus determines three types of network:
class-A networks,
class-B networks,
class-C networks.

Network Size

Class  Leading bits  Net ID   Host ID  Number of networks  Number of hosts  Net ID range
A      0             7 bits   24 bits  126                 16 777 214       1.0.0.0 to 126.0.0.0
B      10            14 bits  16 bits  16 384              65 534           128.0.0.0 to 191.255.0.0
C      110           21 bits  8 bits   2 097 152           254              192.0.0.0 to 223.255.255.0

The Class-A network type, which uses 7 bits for the Net ID, enables the creation of only 126 networks.
Obviously, 128 combinations are possible with 7 bits but, as you will see later on, certain values are
reserved. The 24-bit Host ID means that a large number of stations can be connected per network (up to
16,777,214). So, Net IDs for Class-A networks can range from 1.0.0.0 to 126.0.0.0
The Class-B network type, which uses 14 bits for the Net ID, enables the creation of 16,384 networks. The
16-bit Host ID means that a maximum of 65,534 stations can be connected per network. So, Net IDs for
Class-B networks can range from 128.0.0.0 to 191.255.0.0
The Class-C network type, which uses 21 bits for the Net ID, enables the creation of up to 2,097,152
networks. However, with only 8 bits for the Host ID no more than 254 stations can be connected per
network. So, Net IDs for Class-C networks can range from 192.0.0.0 to 223.255.255.0
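The figures quoted for the three classes can be rederived from the bit counts. The following is a minimal sketch, not a reference implementation; the names `CLASSES` and `classful_info` are hypothetical and chosen only for this illustration.

```python
# (class name, leading bits, Net ID bits, Host ID bits), as given in the text
CLASSES = [("A", "0", 7, 24), ("B", "10", 14, 16), ("C", "110", 21, 8)]

def classful_info(address):
    """Return the class and capacity implied by the first byte of an address."""
    first_byte_bits = format(int(address.split(".")[0]), "08b")
    for name, leading, net_bits, host_bits in CLASSES:
        if first_byte_bits.startswith(leading):
            networks = 2 ** net_bits
            if name == "A":
                networks -= 2          # two Class-A Net ID values are reserved
            hosts = 2 ** host_bits - 2  # all-zeros and all-ones Host IDs are reserved
            return {"class": name, "networks": networks, "hosts": hosts}
    raise ValueError("not a Class A, B or C address")

print(classful_info("128.5.4.1"))
```

Calling the function on any address whose first byte is between 128 and 191 reports Class B, with 16,384 networks and 65,534 hosts per network.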
The IP addresses, which are made up of 32 bits, enable over 4 billion combinations. This would seem to be
enough capacity to satisfy the world's IP-address requirements.
So why is there a lack of IP addresses at the moment?
Because of this class-based organization.
Because Class-C networks allow a maximum of only 254 hosts, they severely restrict the development
potential of a business's network. So much so that in the 1980s, even small businesses were asking for Class-B
Net IDs, which enable the connection of about 65,000 hosts.
In reality, few Class-B networks actually use all the IP-address potential available.
If a Class-B Net ID is assigned to a network and only 2,000 addresses are used, the other 63,000 addresses
are unusable and therefore completely wasted.
Indeed, the same Net ID cannot be used elsewhere in the world. You will see later on that the routers
analyze the destination address of IP packets and first try to reach the network (i.e. the Net ID) of the
destination station. If several networks located in different areas have the same Net ID, you can imagine
the confusion at router level.

Special IP @: Broadcast Limited to the Network

[Figure: A station on network 172.245.0.0 sends an IP packet with source 172.245.0.1 and destination 255.255.255.255 (all 32 bits set to 1). An IP-level broadcast triggers an Ethernet-level broadcast: the frame carries destination MAC ff:ff:ff:ff:ff:ff, source MAC 01:00:2a:01:22:11, type 0800, the IP packet and the FCS.]

We have seen that special multicast and broadcast addresses are used at MAC level. Similarly, special IP
addresses have also been defined at IP level.
The first special address is the broadcast.
A station wishing to transmit an IP packet to all stations connected to the same network uses a broadcast.
In such cases, all the IP-address bits are set to 1.
An IP-level broadcast, which has the destination address 255.255.255.255, automatically triggers a MAC-level broadcast, which has a destination MAC address in which all the bits are set to 1 (ff:ff:ff:ff:ff:ff).
It should be noted that broadcast packets can never go through a router.

Special IP @: Unknown Source IP @

[Figure: A station that has no IP address yet broadcasts a DHCP IP-address request: IP source 0.0.0.0, IP destination 255.255.255.255, destination MAC ff:ff:ff:ff:ff:ff, source MAC 00:01:2a:01:22:11, type 0800. The DHCP server takes an address from its address pool.]

IP@ = 0.0.0.0 can be used by a host at start-up to obtain the IP @ of a BOOTP or DHCP server.

Another special address is the unknown address.

The station addresses can be provided dynamically by a server. This server can be a "bootp" server or a
"DHCP" server.
Therefore, a station without an IP address that wishes to communicate over the network first sends an IP-address request to a server.
The station does this by generating an IP packet with the source address 0.0.0.0 (signifying unknown
address) and a broadcast destination address (because the station doesn't know the server address).
This packet is passed down to the MAC layer, which encapsulates it in a broadcast frame.
The server will take an available address from its address pool.
The special address 0.0.0.0 is used as the source address at start-up only.

Special IP @: Host Loopback

The IP @ 127.x.x.x allows communication between 2 applications on the same host.

[Figure: Application 1 reaches Application 2 through the local IP protocol using IP @ 127.0.0.1. This address is not sent over the network.]

The class-A network 127.0.0.0 is defined as the loopback network. Addresses from that network are
assigned to interfaces that process data within the local system. These loopback interfaces do not access a
physical network.
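The three special addresses described so far can be recognized programmatically. The sketch below uses Python's standard `ipaddress` module; the function name `describe` and the labels it returns are hypothetical, chosen for this illustration only.

```python
import ipaddress

LIMITED_BROADCAST = ipaddress.IPv4Address("255.255.255.255")

def describe(address):
    """Label an address as one of the special cases from the text, or 'ordinary'."""
    ip = ipaddress.IPv4Address(address)
    if ip == LIMITED_BROADCAST:
        return "limited broadcast"   # all 32 bits set to 1
    if ip.is_unspecified:
        return "unknown source"      # 0.0.0.0, used at start-up
    if ip.is_loopback:
        return "loopback"            # any address in network 127.0.0.0
    return "ordinary"

for sample in ("255.255.255.255", "0.0.0.0", "127.0.0.1", "172.245.0.1"):
    print(sample, describe(sample))
```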

NetID

Each network has a unique NetID. The router interface also has an IP @.

[Figure: A router with two interfaces (eth0 and eth1) joins two hub-based networks. Network 200.98.76.0 holds hosts 200.98.76.1 to 200.98.76.253 and the router interface 200.98.76.254. Network 192.100.17.0 holds hosts 192.100.17.1 to 192.100.17.253 and the router interface 192.100.17.254. Class-C network => 254 hosts maximum.]

Let's now take the example of this router, which has 2 interfaces.
Two networks can be created.
The networks in this particular example are Class-C networks because the first byte of each Net ID lies
between 192 and 223.
They have different Net IDs. Indeed, each time a packet passes through a router it must change networks.
Each router interface has an address that is part of the addressing space of the network to which it is
connected. It should be noted that administrators generally assign the highest addresses to the router
interfaces. This means Host ID 254 in this example since Host ID 255 is reserved for broadcasts.
Each station connected to this network is assigned an IP address comprising the network Net ID and an
available Host ID.
The same applies to the second network: the router interface is assigned an address on this network and all
the stations connected to the second network will have an address containing this network Net ID.
To conclude:
A Class-C network has 254 usable host addresses: Host IDs "0" and "255" are reserved.

Public Addresses, Private Addresses

Public IP@ (examples on the Internet: 154.11.22.33, 195.51.63.1, 9.1.2.3):
assigned by IANA,
globally unique.
Private IP@ (example: 10.6.7.8, used in two separate private networks 10.0.0.0):
address ranges reserved by IANA,
can be used several times,
cannot circulate on the Internet.

Internet addressing comprises two types of internet address:

public addresses,
private addresses.
A public address is an official address assigned by the IANA, which is the body responsible for allocating
Internet IP addresses.
This type of address is globally unique.
The IANA has set aside certain blocks of addresses for private networks.
These addresses are never assigned to Internet stations and cannot circulate on the Internet.
Several private networks can use the same Net ID. There is no ambiguity as long as the networks are not
interconnected.

Ranges of Private Addresses

Class A: 10.0.0.0 to 10.255.255.255 (1 network)
Class B: 172.16.0.0 to 172.31.255.255 (16 networks)
Class C: 192.168.0.0 to 192.168.255.255 (256 networks)

[Figure: private networks use private IP@; the Internet uses public IP@.]

The blocks of addresses set aside by the IANA are as follows:

In Class A, the network 10.0.0.0.
In Class B, 16 networks with Net IDs 172.16 to 172.31.
In Class C, 256 networks with Net IDs 192.168.0 to 192.168.255.
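A quick way to test whether an address falls in one of these private blocks is sketched below. The three blocks are written in prefix form (10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16), which covers exactly the ranges listed above; the names `PRIVATE_BLOCKS` and `is_private` are hypothetical.

```python
import ipaddress

# The three private ranges, each expressed as one block
PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),      # 10.0.0.0 to 10.255.255.255
    ipaddress.ip_network("172.16.0.0/12"),   # 172.16.0.0 to 172.31.255.255
    ipaddress.ip_network("192.168.0.0/16"),  # 192.168.0.0 to 192.168.255.255
]

def is_private(address):
    """Return True when the address belongs to one of the private blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in PRIVATE_BLOCKS)

print(is_private("10.6.7.8"), is_private("154.11.22.33"))
```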

Private IP Networks and Internet Connection

[Figure: A station with IP@ 10.10.10.8 on Intranet 1 (NetID 10.10.10.0) sends a packet to 194.5.3.12 on the Internet. Because the packet carries a private IP address, it is deleted at the Internet access point.]

Let's assume that a private network administrator decides to connect his/her network to the Internet.
But private IP addresses are not allowed to circulate over the Internet. The Internet access router destroys
any packet with private addresses.

Network Address Translation (NAT)

NAT table in the access router:
Private IP @    Public IP @
10.10.10.4      212.17.22.21
(unused)        212.17.22.22
(unused)        212.17.22.23

[Figure: Private network 10.10.10.0 (stations .1 to .4) reaches the Internet server 194.5.3.12 through a NAT router.
Outgoing packet on the private side: IPsrc 10.10.10.4, IPdest 194.5.3.12.
Outgoing packet on the Internet side: IPsrc 212.17.22.21, IPdest 194.5.3.12.
Reply on the Internet side: IPsrc 194.5.3.12, IPdest 212.17.22.21.
Reply on the private side: IPsrc 194.5.3.12, IPdest 10.10.10.4.]

A solution does exist to enable private stations to communicate with other stations on the Internet: the
Network Address Translation (NAT) function.
The administrator asks the IANA to allocate a pool of public addresses and configures the NAT function in the
Internet access router.
When a station from the private network sends a packet to a station on the Internet, the access router
intercepts the packet, stores the source IP address and replaces it with an available public IP address from
the pool.
The packet has been transformed and can now circulate over the Internet.
The Internet server can reply by exchanging the source and destination public addresses in the IP-packet
header.
The access router consults its table to restore the private IP address before sending the packet to the
private network.
The NAT function has its limits: at any given time, the number of stations surfing the Internet cannot exceed
the number of public addresses allocated by the IANA.
Other mechanisms can be used such as Port Address Translation (PAT) or proxies, which are beyond the
scope of the TCP/IP beginner's course.
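The translation steps described above can be sketched as a small table-driven class. This is an illustration under simplifying assumptions (one public address per private station, no timeouts, no ports); the class name `BasicNat` and its methods are hypothetical and do not correspond to any router's actual configuration interface.

```python
class BasicNat:
    """Minimal one-to-one address translation backed by a pool of public addresses."""

    def __init__(self, public_pool):
        self.free = list(public_pool)   # public addresses not yet in use
        self.private_to_public = {}
        self.public_to_private = {}

    def outbound(self, src, dst):
        """Replace the private source address with a public one from the pool."""
        if src not in self.private_to_public:
            if not self.free:
                raise RuntimeError("no public address available in the pool")
            public = self.free.pop(0)
            self.private_to_public[src] = public
            self.public_to_private[public] = src
        return self.private_to_public[src], dst

    def inbound(self, src, dst):
        """Restore the private destination address from the stored mapping."""
        return src, self.public_to_private[dst]

nat = BasicNat(["212.17.22.21", "212.17.22.22", "212.17.22.23"])
print(nat.outbound("10.10.10.4", "194.5.3.12"))
print(nat.inbound("194.5.3.12", "212.17.22.21"))
```

Once every address in the pool is mapped, `outbound` raises an error for any further station, which is the limit of plain NAT mentioned in the text.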

DHCP

[Figure: message exchange between a client and two DHCP servers, in order:
the client broadcasts dhcp_discover (MAC@ + requested services);
server 1 broadcasts dhcp_offer (IP@1 + services) and server 2 broadcasts dhcp_offer (IP@2 + services);
the client broadcasts dhcp_request (MAC@ + server1 + requested services);
a broadcast dhcp_ack (IP@1 + services) closes the exchange, followed by the lease time.]

Role of DHCP (extension of BOOTP)

allocates IP addresses dynamically.
provides other useful information for client configuration (DNS address, etc.).
facilitates administration (remote client configuration).
Principle:
Several servers can reply to a request.
DHCP-Discover: sent by the client to locate the DHCP servers.
DHCP-Offer: carries the address and services offered by a DHCP server.
DHCP-Request: the client accepts one server's offer. Also used to extend the lease.
DHCP-Ack: the server sends the client the configuration.
DHCP-Nack: this message can be sent back to the client when, for example, the server refuses to extend
the lease or the client was too slow to reply to the offer.
The IP addresses are supplied:
for a limited period ("lease time") expressed in seconds (from 0 to 100 years),
permanently ("permanent lease"); lease time = ff.ff.ff.ff (all bits set to 1).
Certain IP addresses can be allocated to specific clients (MAC@/IP@).

Default Gateway

[Figure: Station 1.0.0.1 (MAC 102030) wants to send a packet to 2.0.0.2, which is in another network. At IP level the station asks "Is the destination within the local net?"; the answer is no, so it uses its default gateway, IP@ 1.0.0.254. Its ARP cache holds 1.0.0.2 = 405060 but no entry yet for 1.0.0.254, so it sends an ARP request for 1.0.0.254. The router's ARP response gives MAC 908070, which is stored in the cache. The frame then sent carries destination MAC 908070, source MAC 102030, type 0800 (IP), IP@src 1.0.0.1 and IP@dest 2.0.0.2.]

As you can imagine, if the destination IP address is the address of a station located on the other side of the globe, you
cannot use the broadcast mechanism as it will flood the Internet with messages. That's precisely why routers never
propagate broadcasts. A broadcast is always restricted to the network in which it was generated. How, then, can
stations in different networks communicate with each other?
In fact, at the IP level of a station, when a packet needs to be sent, the first question IP considers is "Is the destination
address inside or outside the network?"
If the destination address is inside the same network, the usual procedure applies: consultation of the ARP table, ARP
procedure if necessary, etc.
If, however, the destination address is outside the network, the station configuration must indicate the address of the
default router through which the packet must be routed to reach the destination.
This parameter is often called the "default gateway". The transmit station must now transfer the packet as far as this
default gateway. The default gateway has an interface connected to the same network as the transmit station and
therefore has an IP address in the same network (with the same Net ID).
This station knows how to send a packet to another station connected to the same network. It consults its ARP table. If
the MAC address of the default gateway is not yet known, it initiates an ARP procedure by generating a request in the
form of a broadcast. This broadcast will not leave the network but will reach the interface of the router that is
connected to the same network.
The router will reply by sending its interface MAC address. This MAC address will be stored in the station's ARP table.
And, finally, the IP packet intended for the remote station will be encapsulated in a frame whose destination MAC
address is the MAC address of the next router. This router is appropriately named "next hop".
It is now the router's job to consult its routing table to establish which is the best outgoing interface to use to reach the
final destination. Once again, the routing table indicates the IP address of the next router that will move the packet
nearer to its final destination. A new ARP procedure might be initiated between these 2 routers to retrieve the MAC
address of the next router, and so on.
So, once again, you can see that the physical addresses are used constantly to move the IP packets through the network
to their final destination.

Destination IP @ "Inside" or "Outside" the LAN?

[Figure: Host 128.5.4.1 (MAC 102030, class B, default gateway 128.5.15.5) sends a packet to 128.5.26.2. The Net IDs are the same, so the destination is in the same network. The ARP cache holds 128.5.26.2 = 908070 and 128.5.15.5 = 405060. The frame goes directly to the destination station: destination MAC 908070, source MAC 102030, type 0800 (IP), IP@src 128.5.4.1, IP@dest 128.5.26.2. The router 128.5.15.5 (MAC 405060) leading to the Internet is not used.]

Once the station has been configured, when an IP packet needs to be sent to the address 128.5.26.2, the
station determines whether this IP address is inside or outside its network.
First of all, it analyzes its own IP address to determine which class its own network belongs to. In this
example, 128 indicates a Class-B network address.
Once the station knows the class, it knows where the boundary is between the Net ID and the Host ID for its
own network. Here, the Net ID is two bytes long.
The station therefore compares just the Net ID bytes of the source and destination addresses.
In this example, the Net IDs are identical, which means that the destination IP address is located in the
same network as the transmit station.
The station does not need to send the packet through the default gateway. It just needs to consult the ARP
table directly and possibly initiate an ARP procedure on its LAN if the corresponding MAC address is not yet
known. Here, the ARP table has been updated.
The transmit station can therefore encapsulate the packet in an Ethernet frame whose destination MAC
address will be the MAC address of the IP packet destination station.

Destination IP @ "Inside" or "Outside" the LAN? (2)

[Figure: Host 128.5.4.1 (MAC 102030, class B, default gateway 128.5.15.5) sends a packet to 128.6.6.6. The Net IDs differ, so the destination is in another network. The ARP cache holds 128.5.26.2 = 908070 and 128.5.15.5 = 405060. The frame goes to the router: destination MAC 405060, source MAC 102030, type 0800 (IP), IP@src 128.5.4.1, IP@dest 128.6.6.6. The router 128.5.15.5 (MAC 405060) leads to the Internet.]

Let's now assume that this station wishes to send a packet to IP address 128.6.6.6.
Once again, it analyzes its own IP address and determines that it's a Class-B network address. It isolates the
2 Net ID bytes of both addresses and compares them.
This time, the Net IDs are different and the destination station is therefore located in another network.
This means that the packet must go through a router, which will be the default gateway defined in the
station configuration.
The station knows the router's IP address and now needs to find the corresponding MAC address.
The station consults its ARP cache. In this case, the cache contains the MAC address. If it had not contained
the address, the station would have launched an ARP procedure.
The packet is encapsulated in an Ethernet frame whose destination MAC address is the address of the next
router on the route leading to the final destination (rather than the MAC address of the final destination
station).
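The class-based decision walked through in the last two examples can be sketched as follows. The function names `net_id_length` and `arp_target` are hypothetical; the sketch assumes the sending station derives the Net ID length from the class of its own address, as described above.

```python
def net_id_length(address):
    """Number of Net ID bytes implied by the class of the address."""
    first = int(address.split(".")[0])
    if first < 128:
        return 1   # Class A
    if first < 192:
        return 2   # Class B
    if first < 224:
        return 3   # Class C
    raise ValueError("not a Class A, B or C address")

def arp_target(host_ip, dest_ip, default_gateway):
    """Return the IP address whose MAC address the station must look up."""
    n = net_id_length(host_ip)
    same_network = host_ip.split(".")[:n] == dest_ip.split(".")[:n]
    return dest_ip if same_network else default_gateway

print(arp_target("128.5.4.1", "128.5.26.2", "128.5.15.5"))  # same network
print(arp_target("128.5.4.1", "128.6.6.6", "128.5.15.5"))   # other network
```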

Subnetworks

[Figure: Network 128.5.0.0, connected to the Internet, is divided into two subnetworks: subnet 128.5.4.0 with stations 128.5.4.1 to 128.5.4.5, and subnet 128.5.8.0 with stations 128.5.8.1 to 128.5.8.5.]

The class-based system for network classification lacks the flexibility needed to handle the explosion in the
number of IP networks and devices.
In 1984, to prevent too many stations from being connected to the same network and also because the
distance between sites was increasing, the decision was taken to introduce the "subnetwork" or "subnet"
concept with the aim of offering administrators of large networks an extra hierarchical level.
The Net IDs of these subnetworks borrow a few bits from the Host ID to ensure that the subnetworks are
clearly identified.
Here, the Class-B network 128.5.0.0, which had a capacity of 65,534 host stations, has been
divided into 2 subnetworks with Net IDs 128.5.4.0 and 128.5.8.0 respectively.
So three bytes are used for the Net ID in these subnetworks.
And, of course, all the stations belonging to network 128.5.4.0 have IP addresses starting with 128.5.4 and
all the stations connected to network 128.5.8.0 have IP addresses starting with 128.5.8

Subnet Mask

The "Subnet Mask" indicates the length of the NetID part in the IP address.

[Figure: Station 128.5.4.3 (MAC 102030) wants to send a packet to 128.5.8.4. At IP level it must decide whether the destination is within the local net. If it is not, the packet goes to the default gateway, the router interface with IP@ 128.5.4.1 (MAC 304050), which leads to the other network. Station 128.5.4.5 (MAC 708090) is on the sender's network.]

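With the class-based rule, station 128.5.4.3 would conclude that 128.5.8.4 is in its own network, because
both addresses share the Class-B Net ID 128.5, even though the destination is in another subnetwork behind
the router. What can be done to resolve this problem?

The dividing line between Net ID and Host ID can no longer be based on the network class.
Since the introduction of the subnetwork concept, a new parameter has also been developed: the "Subnet
Mask".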

Subnet Mask Mechanism

            Decimal          Binary
Src IP@:    138.5.17.5       10001010 00000101 00010001 00000101
Dest IP@:   138.5.19.37      10001010 00000101 00010011 00100101
Mask:       255.255.252.0    11111111 11111111 11111100 00000000
Net ID:     138.5.16.0       10001010 00000101 00010000 00000000

Let's look at the Subnet Mask and at the mechanism for determining whether a destination IP address is "inside"
or "outside" the transmit station network.
Let's consider an example.
There are two IP addresses: a source address and a destination address. The question is: Are these two
addresses in the same subnetwork?
If the Net ID is 3 bytes long, the answer is no.
If the Net ID is 2 bytes long, the answer is yes.
It is clear that an additional parameter is required to indicate the length of the Net ID. This parameter is
the Subnet Mask.
You will see that the difficulty in processing addresses lies in the fact that they are expressed as decimal
numbers.
To make things completely clear, let's convert the mask into binary, then apply the mask to both the
source and destination addresses.
The Net ID of the source IP address now appears clearly and can be compared to the corresponding bits of
the destination address.
You can now see clearly that the 2 addresses are in the same subnetwork.
What is the Net ID of this subnetwork? Once again, there is a slight difficulty concerning translation of the
third byte: its six masked bits 000100, followed by two zero bits, give 16, so the Net ID is 138.5.16.0.
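Applying a mask is a bitwise AND between the address and the mask. The sketch below reproduces the comparison for the addresses on the slide, read here as 138.5.17.5 and 138.5.19.37 with mask 255.255.252.0; the function names `net_id` and `same_subnet` are hypothetical.

```python
import ipaddress

def net_id(address, mask):
    """Return the Net ID obtained by ANDing the address with the mask."""
    ip = int(ipaddress.IPv4Address(address))
    m = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(ip & m))

def same_subnet(a, b, mask):
    """Two addresses are in the same subnetwork when their Net IDs match."""
    return net_id(a, mask) == net_id(b, mask)

print(net_id("138.5.17.5", "255.255.252.0"))
print(same_subnet("138.5.17.5", "138.5.19.37", "255.255.252.0"))
print(same_subnet("138.5.17.5", "138.5.19.37", "255.255.255.0"))
```

With a 3-byte mask (255.255.255.0) the same two addresses fall in different subnetworks, which matches the "3 bytes: no" case in the text.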

Subnet Mask Notation

Dotted decimal notation
IP @:      138.5.19.37
Netmask:   255.255.252.0   (11111111 11111111 11111100 00000000)

Prefix notation
IP @:      138.5.19.37/22

There are two methods for giving a network mask:

dotted decimal,
prefix.
In dotted decimal notation, each byte of the mask is given in decimal.

In prefix notation, the prefix indicates how many bits of the mask are set to 1.
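The two notations are interchangeable, as the following sketch shows. The helper names `prefix_to_mask` and `mask_to_prefix` are hypothetical, and `mask_to_prefix` assumes a valid mask, that is, one whose 1 bits are contiguous from the left.

```python
def prefix_to_mask(prefix):
    """Convert a prefix length (0 to 32) into a dotted decimal mask."""
    value = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

def mask_to_prefix(mask):
    """Count the bits set to 1 in a dotted decimal mask (assumed contiguous)."""
    bits = "".join(format(int(byte), "08b") for byte in mask.split("."))
    return bits.count("1")

print(prefix_to_mask(22))
print(mask_to_prefix("255.255.252.0"))
```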

Search for a Router

[Figure: PC configuration: host IP@ 128.5.4.3 (MAC 102030), subnet mask 255.255.255.0, default gateway 128.5.4.1. The destination, IP@ 128.5.8.4 (MAC aabbcc), is on subnet 128.5.8.0. The router has interface 128.5.4.1 (MAC 304050) on subnet 128.5.4.0 and interface 128.5.8.1 on subnet 128.5.8.0. Station 128.5.4.5 (MAC 708090) is on subnet 128.5.4.0. The ARP cache holds 128.5.4.5 = 708090 and 128.5.4.1 = 304050. The frame sent carries destination MAC 304050, source MAC 102030, type 0800 (IP), IPsrc 128.5.4.3 and IPdest 128.5.8.4.]

Let's now consider whether the mask has solved the problem of communicating between subnetworks.
The subnet mask must be included in all station configurations along with the default gateway and the IP
address.
Thanks to previous traffic, the ARP cache already contains the MAC addresses of the stations in the same
network.
This station wishes to transmit a packet to the station with the address 128.5.8.4.
From now on, it's the mask rather than the class that determines the Net ID of the source network.
This time, then, the transmit station discovers that the destination address is outside the network and so
sends the packet to the default gateway using the address in the configuration.
The station consults its ARP cache, which contains the corresponding MAC address.
A frame can therefore be transmitted to the router. The frame contains the IP packet intended for the
remote station.
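The mask-based decision and the ARP cache lookup can be combined in a short sketch. The function name `frame_destination` is hypothetical; the values come from the slide, and a returned MAC address of `None` stands for the case where an ARP procedure would be needed.

```python
import ipaddress

def frame_destination(host_ip, mask, gateway, dest_ip, arp_cache):
    """Return (IP address to resolve, MAC address from the ARP cache or None)."""
    # strict=False lets us build the network from a host address plus its mask
    network = ipaddress.ip_network(f"{host_ip}/{mask}", strict=False)
    inside = ipaddress.ip_address(dest_ip) in network
    target = dest_ip if inside else gateway
    return target, arp_cache.get(target)

arp_cache = {"128.5.4.5": "708090", "128.5.4.1": "304050"}
print(frame_destination("128.5.4.3", "255.255.255.0", "128.5.4.1",
                        "128.5.8.4", arp_cache))
```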

Classful/Classless Addressing

"Classful" addressing
Which class of network is to be selected for an enterprise with 500 hosts?
Class-C network (254 hosts max.): too small.
Class-B network (65,534 hosts max.): waste of IP addresses.
"Classless" addressing
Network aggregation

CIDR stands for Classless Inter-Domain Routing.

Historically, IP addresses were assigned within classes: Class A (8 bits of network address, 24 bits of host
address), Class B (16 bits of network address, 16 bits of host address) and Class C (24 bits of network
address, 8 bits of host address). With the advent of CIDR, address space is now allocated on a bit boundary
basis.

Classless Inter-Domain Routing (CIDR)

Network:   201.78.48.0/23
Binary:    11001001 01001110 00110000 00000000
NetID:     23 bits
HostID:    9 bits, giving 510 hosts (the equivalent of 2 class-C networks) for an enterprise with 500 hosts

CIDR makes it possible to allocate just the required number of IP addresses.


Classless Inter-Domain Routing (CIDR) [cont.]

Net     Prefix            Binary                                 Hosts
Net1    201.78.48.0/23    11001001 01001110 00110000 00000000    510
Net2    201.78.50.0/23    11001001 01001110 00110010 00000000    510
Net3    201.78.52.0/22    11001001 01001110 00110100 00000000    1022
Net4    201.78.56.0/21    11001001 01001110 00111000 00000000    2046

Aggregated routing-table entries:
Destination       Next hop
201.78.48.0/22    IP@2
201.78.48.0/20    IP@1

CIDR makes it possible to aggregate addresses in the routing tables.

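The host counts and the aggregation on this slide can be checked with Python's standard `ipaddress` module. This is a sketch for verification only; the names `nets`, `usable_hosts` and `aggregate` are hypothetical.

```python
import ipaddress

# The four networks from the slide, in prefix notation
nets = [ipaddress.ip_network(p) for p in (
    "201.78.48.0/23", "201.78.50.0/23", "201.78.52.0/22", "201.78.56.0/21")]

def usable_hosts(network):
    """Host addresses left once the all-zeros and all-ones Host IDs are removed."""
    return network.num_addresses - 2

# collapse_addresses merges adjacent blocks into the fewest possible prefixes
aggregate = list(ipaddress.collapse_addresses(nets))

print([usable_hosts(n) for n in nets])
print(aggregate)
```

The four networks collapse into the single prefix 201.78.48.0/20, so one routing-table entry can stand for all of them.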

2. IP Routing

Format of the IP Datagram

[Figure: IP datagram laid out in 4-byte words:
Version, Header length, Type Of Service, Datagram length
Identification, Flag, Datagram offset
TTL, Protocol, Checksum
Source IP address
Destination IP address
Options
Data]

Here are the different fields.

We will now look at them one by one but will not deal with them in order since many of them are
interlinked.
As you can see, the packets are generally represented in the form of 4-byte words. The importance of this
will become clearer later on in the module.

The Different Types of Routing

Static
Generates no traffic and saves bandwidth
Easy to create for simple networks
Manual programming
No re-routing in case of failure
Risk of error occurring
Dynamic
Automatically re-routes the traffic in case of failure
Ideal for large networks
Generates traffic on the network
Leads to a processing overload in the routers

Static Routing
Static routing is carried out manually by the network administrator. The administrator is responsible for
detecting and propagating routes throughout the network. The administrator enters the routes manually in
the configuration of each of the network's routing devices.
Once the router has been configured, it simply transfers the packets using the predetermined ports. There
is no communication between the routers concerning the actual network topology.
In small networks with little redundancy, the static routing process is quite easy to manage. However, this
method has certain drawbacks as far as the management of IP routing tables is concerned:
the static routes require a high level of coordination and maintenance in complex network environments,
the static routes do not adapt dynamically to the operating state of the network. When a destination
subnetwork becomes unreachable, the static routes leading to this network remain in the routing table.
Traffic continues to be transmitted to this destination. Until the network administrator updates the static
routes in line with the new network topology, traffic cannot be routed along other existing routes.
Dynamic Routing
Dynamic routing algorithms enable routers to detect and adapt automatically to the routes in the network.

Principle of Dynamic Routing

The router announces which networks it can reach.
The router calculates the routes from the announcements.

Algorithms of Routing Protocols

Routing algorithm
Link State: OSPF, IS-IS
Distance Vector: EIGRP, RIP, BGP (path vector)

RIP: Routing Information Protocol
IS-IS: Intermediate System to Intermediate System
OSPF: Open Shortest Path First
EIGRP: Enhanced Interior Gateway Routing Protocol
BGP: Border Gateway Protocol

Several dynamic routing protocols are currently used for automatic route detection. The difference
between these protocols lies in the way they detect and calculate new routes to destination networks.
They can be divided into two main categories:
distance vector protocols.

link state protocols.

Classes of Routing Protocols

[Figure: The Internet shown as a set of autonomous systems (Janet, Sphinx, Sprint, DFN, Renater). Each runs its own interior protocol (IS-IS, OSPF, IGRP or EIGRP) and the autonomous systems are interconnected by BGP.]

2 classes of protocols:
Interior Gateway Protocol (RIP, IGRP, OSPF, IS-IS, etc.)
Exterior Gateway Protocol (EGP, BGP)

Autonomous Systems (ASs) are logical portions of a larger IP network. ASs are usually networks inside
organizations. They are controlled by a single administration authority.
Certain routing protocols are used to determine routing paths within an AS while others are used to
interconnect several ASs:
Interior Gateway Protocols: enable routers to exchange information within an AS. Examples: OSPF and
RIP.
Exterior Gateway Protocols: enable ASs to exchange information with other ASs. Example: BGP.
The interior protocols are used to manage routing information within each AS. The figure also shows the
exterior protocols, which manage information on routing between ASs.
Numerous interior routing processes can be used within an AS. When this arises, the AS must present itself
to the other ASs with a single, coherent routing plan. The AS must provide a coherent view of its internal
destinations.

Routing Table: Principle

[Figure: Router R2 connects networks 204.92.75.0, 204.92.77.0 and 204.92.76.0, where its address is .1. Router R1 connects network 204.92.76.0 (interface e1, address .2) and network 192.168.201.0 (interface e0, address .1).]

R1 configuration:
# interface e1
ip address 204.92.76.2 255.255.255.0
# interface e0
ip address 192.168.201.1 255.255.255.0
# ip route 0.0.0.0 0.0.0.0 204.92.76.1

R1 routing table:
Network              Mask             Next hop       If
204.92.76.0          255.255.255.0                   e1
192.168.201.0        255.255.255.0                   e0
0.0.0.0 (default)    0.0.0.0          204.92.76.1    e1

Let's now look at what a routing plan is by means of the following example.
There are 4 networks:
the network with Net ID 204.92.77.0,
the network 204.92.75.0,
the network 204.92.76.0,
the network 192.168.201.0.
As usual, each router interface has an IP address in the network it belongs to.
Let's now look at the R1 routing table, or rather let's construct the R1 routing table.
The routing table will not include routes to every station as it would be enormous. Instead, it will include
the routes needed to reach each network. A network is represented by its Net ID, that is, an IP address
associated with a mask.
First route: to reach the stations in network 204.92.76, traffic doesn't need to go through another router as
Ethernet interface 1 (e1) is connected directly to this network.
Similarly, to reach the stations in network 192.168.201, traffic can go through Ethernet interface 0 (e0).
We could then continue to describe all the other networks. But let's imagine that all the world's other
internet networks are located on the left of R1. Describing all the networks would be tedious and the
routing table would be huge. So, to make the task easier, a default route can be included in the routing
table. This default route would be used solely when no other route in the table can be used to route the
packet.
So, here, any IP packet whose destination address doesn't begin with 204.92.76 or 192.168.201 must be
sent to router R2, which is known as the "next hop". The routing table therefore contains the IP address of
the router R2 interface that shares the same network as R1. It also contains router R1's outgoing interface.
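The way R1 uses this table can be sketched as a lookup that ANDs the destination with each mask and keeps the most specific match, so that the default route (mask 0.0.0.0) is used only when nothing else matches. The names `R1_TABLE`, `to_int` and `lookup` are hypothetical; a next hop of `None` stands for a directly connected network.

```python
# (network, mask, next hop, outgoing interface) for router R1
R1_TABLE = [
    ("204.92.76.0", "255.255.255.0", None, "e1"),
    ("192.168.201.0", "255.255.255.0", None, "e0"),
    ("0.0.0.0", "0.0.0.0", "204.92.76.1", "e1"),   # default route
]

def to_int(address):
    """Convert a dotted decimal address into a 32-bit integer."""
    a, b, c, d = (int(part) for part in address.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

def lookup(table, dest_ip):
    """Return (next hop, interface) for the matching route with the longest mask."""
    dest = to_int(dest_ip)
    best = None
    for network, mask, next_hop, interface in table:
        m = to_int(mask)
        if dest & m == to_int(network) and (best is None or m > best[0]):
            best = (m, next_hop, interface)
    if best is None:
        raise LookupError("no route to " + dest_ip)
    return best[1], best[2]

print(lookup(R1_TABLE, "192.168.201.7"))
print(lookup(R1_TABLE, "204.92.75.8"))
```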

Routing Table

[Figure: same topology as on the previous slide, with networks 204.92.75.0/24, 204.92.77.0/24, 204.92.76.0/24 and 192.168.201.0/24, and routers R1 and R2.]

R1 routing table:
Network              Mask             Next hop       If
204.92.76.0          255.255.255.0                   e1
192.168.201.0        255.255.255.0                   e0
0.0.0.0 (default)    0.0.0.0          204.92.76.1    e1

R2 routing table (fill in this table):
Network              Mask             Next hop       If

Exercise
Try and fill in the routing table for router R2.
Several solutions are possible.

Routing Table: Exercise (Solution)

R1 routing table:
Network              Mask             Next hop       If
204.92.76.0          255.255.255.0                   e1
192.168.201.0        255.255.255.0                   e0
0.0.0.0 (default)    0.0.0.0          204.92.76.1    e1

R2 routing table (solution):
Network              Mask             Next hop       If
204.92.76.0          255.255.255.0                   e0
204.92.77.0          255.255.255.0                   e1
204.92.75.0          255.255.255.0                   e2
192.168.201.0        255.255.255.0    204.92.76.2    e0

One possible solution is shown here.

You can start by adding the routes to the networks connected directly to router R2.
To reach network 204.92.76, go through Ethernet interface 0.
To reach the last network you can now:
either introduce a default route,
or specify a route for the last Net ID.
The preference here is for the Net ID. So, to reach network 192.168.201, go to the next hop (i.e. router R1)
via Ethernet interface 0.

Routing Table: Metric

[Figure: R1 now has a third interface, e2, connected directly to network 204.92.77.0, so that a primary route and a secondary route exist between R1 and R2.]

R1 routing table:
Network          Mask             Next hop       If    Metric
204.92.76.0      255.255.255.0                   e1    0
192.168.201.0    255.255.255.0                   e0    0
204.92.77.0      255.255.255.0    204.92.76.1    e1    1
204.92.77.0      255.255.255.0                   e2    0
204.92.75.0      255.255.255.0    204.92.76.1    e1    1
204.92.75.0      255.255.255.0    204.92.77.1    e2    1

Let's now alter the diagram so that there are several routes leading to a destination. The routing table
must be updated. So, in R1 there is now a second, direct route to 204.92.77 through Ethernet interface 2.
The question that now arises is "Which one of the 2 routes will R1 choose to reach network 204.92.77?". This
is the role of another routing-table parameter known as the "metric".
Here, for example, the metric corresponds to the number of hops to the destination network. It is 0 when
the network is connected directly. The router chooses the lowest-cost route.
The routing table is not quite up to date. At the moment, it shows only one route for reaching network
204.92.75 (the route that goes through network 204.92.76) when, in fact, another route via Ethernet 2 and
the next hop 204.92.77.1 can be used.
The type of routing just constructed is static routing, which means that it is set up by an operator.
You can see that:
static routing is relatively complex to set up in a large network,
design errors, route omissions and even typing errors can easily occur in the routing tables.
On top of that, this type of routing is not self-adjusting: it cannot adapt to events
that occur in the network such as link breakage, router failure, etc.
It is for these reasons that dynamic routing protocols such as RIP, OSPF, BGP, etc. were developed.
The levels of performance and sophistication of these protocols vary and each has its own advantages
and disadvantages.
Static routing also offers advantages in certain specific circumstances.
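
The metric rule described above (the router chooses the lowest-cost route) can be sketched as follows. This is a minimal sketch under stated assumptions: ROUTES and best_route are hypothetical names, the entries are R1's routes to 204.92.77.0 and 204.92.75.0 from the slide, and the tie-breaking behaviour (first entry wins) is an arbitrary choice made for the example.

```python
# R1's candidate routes: (network, next hop, interface, metric).
# A next hop of None marks a directly connected network (metric 0).
ROUTES = [
    ("204.92.77.0", "204.92.76.1", "e1", 1),
    ("204.92.77.0", None, "e2", 0),
    ("204.92.75.0", "204.92.76.1", "e1", 1),
    ("204.92.75.0", "204.92.77.1", "e2", 1),
]


def best_route(routes, network):
    """Return the route to `network` with the lowest metric, or None."""
    candidates = [route for route in routes if route[0] == network]
    if not candidates:
        return None
    # min() keeps the first entry when several routes share the lowest metric.
    return min(candidates, key=lambda route: route[3])


print(best_route(ROUTES, "204.92.77.0"))  # ('204.92.77.0', None, 'e2', 0)
print(best_route(ROUTES, "204.92.75.0"))  # ('204.92.75.0', '204.92.76.1', 'e1', 1)
```

For 204.92.75.0 both routes cost one hop, so the metric alone does not separate them; real routers apply further rules in that case.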

4 IP Protocol
Routing table scanning
Destination address of the datagram: 192.168.1.17
Prefix             Next Hop
192.168.0.0/16     R4
194.1.0.0/16       R1
194.1.16.0/20      R2
192.168.1.0/24     R3
Full scan of the routing table
Choice of the longest prefix
Problem:
Which of the 2 matching entries (192.168.0.0/16 or 192.168.1.0/24) must be used for the datagram addressed to 192.168.1.17? A priori one does not know, because the
datagram does not carry the size of the prefix (mask).
Rule:
The entry with the longest prefix is retained.
It is thus necessary:
to scan the whole routing table,
to retain all the matching prefixes, and
to choose among those the one that has the longest mask. Here, it is 192.168.1.0/24.
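
The three steps above can be sketched with Python's standard ipaddress module. This is an illustrative sketch that scans the table linearly, as the slide describes; TABLE and longest_prefix_match are hypothetical names, and real routers use faster data structures than a full scan.

```python
import ipaddress

# Routing table from the slide: (prefix, next hop).
TABLE = [
    ("192.168.0.0/16", "R4"),
    ("194.1.0.0/16", "R1"),
    ("194.1.16.0/20", "R2"),
    ("192.168.1.0/24", "R3"),
]


def longest_prefix_match(table, destination):
    """Scan the whole table and return the next hop of the longest matching prefix."""
    dest = ipaddress.ip_address(destination)
    best = None  # (network, next hop) of the best match found so far
    for prefix, next_hop in table:
        network = ipaddress.ip_network(prefix)
        if dest in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, next_hop)
    return best[1] if best else None


print(longest_prefix_match(TABLE, "192.168.1.17"))  # R3
print(longest_prefix_match(TABLE, "192.168.2.17"))  # R4
```

The address 192.168.1.17 matches both 192.168.0.0/16 and 192.168.1.0/24; the /24 entry wins because its mask is longer.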

4 IP Protocol
Time To Live (TTL)
[Figure: IPv4 header (Version, Header length, Type Of Service, Datagram length, Identification, Flag, Datagram Offset, TTL, Protocol, Checksum, Source IP address, Options, Data) with the TTL field in focus. A packet leaves its source with TTL=64 and each router it crosses decrements the value (63, 62, 61, 60, ...). A packet whose TTL reaches 0 is destroyed.]
This is the TTL or "Time To Live" field.

In theory, this field indicates the maximum time a packet is allowed to stay in the network. Each router
must decrease the TTL field based on its processing time.
In practice, all routers process packets in less than one second. So, it is now usual practice for routers that
process the packet to decrease the TTL value by one.
When a packet is transmitted by a station, it starts out with a certain value in the TTL field. Then, each
time the packet passes through a router, this value decrements. This packet must arrive at its destination
before the value of the field reaches 0, otherwise the packet is destroyed by a router.
What is the purpose of such a field?
When an IP packet gets lost in the network, the TTL eventually reaches 0, which means that any router can
destroy it.
This happens, for example, when a packet gets stuck in a loop following a routing problem.
You may remember that this phenomenon was mentioned earlier when discussing routing and, in particular,
loops that can occur when the default route is used incorrectly.
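
The effect of the TTL field on a normal path and on a routing loop can be sketched as follows. This is a simplified model, not router code: forward is a hypothetical function, each router decrements the TTL by exactly one, and the loop between R1 and R2 is an invented example.

```python
from itertools import cycle


def forward(ttl, routers):
    """Send a packet with the given initial TTL through the routers in order."""
    hops = 0
    for router in routers:
        hops += 1
        ttl -= 1  # each router that processes the packet decrements the TTL by one
        if ttl <= 0:
            # TTL exhausted before the destination was reached: the router destroys the packet.
            return {"delivered": False, "dropped_by": router, "hops": hops}
    return {"delivered": True, "ttl": ttl, "hops": hops}


# Normal path through four routers: the packet arrives with TTL 60.
print(forward(64, ["R1", "R2", "R3", "R4"]))
# Routing loop between R1 and R2: the packet is destroyed after 64 hops.
print(forward(64, cycle(["R1", "R2"])))
```

Without the TTL, the packet in the second case would circulate between the two routers indefinitely.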

4 IP Protocol
Encapsulated Protocol
[Figure: a MAC frame (destination MAC address, source MAC address, Type = 0800 (IP), Data, FCS) carries an IP packet. The Protocol field of the IP header (17 for UDP in this example) identifies the higher-level protocol that receives the data: ICMP, TCP or UDP.]
When the destination station of a MAC frame receives the frame, it is the EtherType field that indicates
which higher-level protocol the contents must be sent to.
This is also the case for IP. The "protocol" field indicates which higher-level protocol is the destination of
the packet data.
The IANA assigns the official codes for this field.
The protocols most commonly encapsulated in IP are ICMP (protocol number 1), TCP (6) and UDP (17).
The TCP and UDP protocols will be studied during this training module.
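
How a receiver might read the "protocol" field can be sketched with the struct module. This is a minimal sketch assuming a 20-byte IPv4 header without options; encapsulated_protocol and the sample header are hypothetical, and the checksum of the sample is left at 0 rather than computed.

```python
import struct

# Protocol numbers assigned by the IANA for the three protocols named in the text.
PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP"}


def encapsulated_protocol(ip_header):
    """Return the name of the higher-level protocol carried by an IPv4 packet."""
    if len(ip_header) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes long")
    # TTL is byte 8 and Protocol is byte 9 of the IPv4 header.
    ttl, protocol = struct.unpack_from("!BB", ip_header, 8)
    return PROTOCOLS.get(protocol, f"unknown ({protocol})")


# Sample header: version 4, header length 5 words, TTL 64, protocol 17,
# source 10.1.1.10, destination 20.20.20.4 (checksum not computed here).
sample = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 20, 0, 0, 64, 17, 0,
    bytes([10, 1, 1, 10]), bytes([20, 20, 20, 4]),
)
print(encapsulated_protocol(sample))  # UDP
```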

4 IP Protocol
Layers in a TCP/IP Communication
[Figure: a client host sends data to a server that offers several services (www, Mail, FTP) across a chain of routers. At each end the data passes through the Transport, Network (IP) and Link layers. The IP source and destination addresses (a, b) stay the same from end to end, while the physical source and destination addresses change on every link (for example s1/d2, then s8/d7, then s4/d15).]
When two users wish to communicate, one is the Client because in the IP world the client is defined as the
user requesting the service while the other is the Server because that user provides the service.
Here, the Server is capable of providing various services but the Client wishes to request one service only.
The transport layer is charged with targeting the required service. For this, each application is allocated an
official number known as a "port number". (N.B. the IANA is responsible for allocating a port number to
every new service.) The transport layer sends the datagram to the lower-layer IP. This IP packet must be
sent to the remote server. For this reason, every machine connected to the IP network is assigned
a logical address called an IP address. One of IP's jobs is to insert a header. The main fields in this header are
the packet source and destination addresses. The packet is then sent to the data link layer, which
encapsulates it in a frame with a header containing the physical source and destination addresses. Finally,
the frame is transferred to the transmission medium.
All the machines connected to this transmission medium analyze the frame header, but only the
router interface recognizes its own physical address. It extracts the contents of the frame and transmits them to
the upper-layer IP. The router's network layer analyzes the packet header, especially its destination IP
address. Its routing table indicates the outgoing interface and the next physically connected device the
packet must pass through to reach its final destination. The IP packet is transferred to the data link layer,
which encapsulates it in a frame. This time, the physical source address is the source router interface
address and the physical destination address is the address of the next router interface. Once again, only
the router recognizes its physical address in the frame transported by the transmission medium. It
therefore extracts the packet from the frame and sends its contents to its network layer. The network layer
routes the packet to the outgoing interface using its routing table.
Finally, the frame is transferred to the last link. The destination machine recognizes its physical address in
the header and sends the contents to its IP. The IP of the final destination machine recognizes its own IP
address in the destination IP field of the packet received. The contents of the packet are then sent to the
transport layer, which examines the header. Thanks to the destination port number contained in the layer-4 protocol header, the data is routed to the service chosen by the Client.

4 IP Protocol
Best Effort
But what does IP provide?
Not reliable
No error recovery
Connectionless
Best effort
Which services are provided by the IP layer?

IP is not reliable. This means that it cannot guarantee that the data it sends will be routed correctly. In the
event that a packet is lost, IP does not perform error recovery.
IP offers a connectionless service. This means that it does not set up a connection with the remote IP
layer before sending data. Each datagram is managed independently of the other datagrams, even when a large file is being
transferred between remote entities. This implies that the datagrams can arrive out of order, or be duplicated, lost or
altered.
IP just tries to deliver the datagrams and provides a "Best effort" service.

3. IP Redundancy
4 IP Protocol
Router Discovery Problem
[Figure: workstation 10.1.1.10 (netmask 255.255.255.0, gateway 10.1.1.1) shares a LAN with Router A (10.1.1.1) and Router B (10.1.1.2). Both routers lead to the network where host 20.20.20.4 is located. The workstation sends traffic from 10.1.1.10 to 20.20.20.4.]
Router A is the default gateway responsible for handling packets for network 10.1.1.0/24. If the
connection between Router A and the network goes down or if the router becomes unavailable, fast
converging routing protocols, such as the Enhanced Interior Gateway Routing Protocol (Enhanced IGRP)
and Open Shortest Path First (OSPF) can respond within seconds so that Router B is prepared to transfer
packets that would otherwise have gone through Router A.
However, in spite of fast convergence, if Router A goes down, the users in network 10.1.1.0 might not be
able to communicate with the external segments even after the routing protocol has converged. That's
because IP hosts usually do not participate in routing protocols. Instead, they are configured statically
with the address of a single router, such as Router A. Until someone manually modifies the configuration
of the machine to use the address of Router B instead of Router A, the user cannot communicate with the
other network segments.
Some IP hosts use proxy Address Resolution Protocol (ARP) to select a router. If the user's workstation
was running proxy ARP, it would send an ARP request for the IP address 20.20.20.4. Router A would reply
on behalf of that station and would offer its own media access control (MAC) address. With proxy ARP,
stations in external segments are seen as if they were connected to the same segment. If Router A fails,
machine 10.1.1.10 will continue to send packets destined for 20.20.20.4 to the MAC address of Router A
even though those packets have nowhere to go and are lost. The user either waits for ARP to acquire the
MAC address of Router B by sending another ARP request or reboots the workstation to force it to send an
ARP request. In either case, for a significant period of time, it will not be able to communicate with any
external destination, even when routing protocols have converged and Router B is ready to forward
packets.
Some IP hosts use the Routing Information Protocol (RIP) to discover routers. The drawback of using RIP is
that it is slow to adapt to changes in the topology. If stations in network 10.1.1.0 were configured to use
RIP, 3 to 10 minutes might elapse before RIP makes another router available.
Some newer IP hosts use the ICMP Router Discovery Protocol (IRDP) to find a new router when a route
becomes unavailable. A host that runs IRDP listens for hello multicast messages from its configured
router and uses an alternate router when it no longer receives those hello messages. If the station was
running IRDP, it would detect that Router A is no longer sending hello messages and would start sending
its packets to Router B. However, for legacy devices that do not support IRDP, it is not an option.
4 IP Protocol
Principle
[Figure: Router A (active; interface IP address 10.1.1.1, MAC address 00:10:7B:81:9A:9B) and Router B (standby; interface IP address 10.1.1.2, MAC address 00:10:7B:81:9C:EC) form a virtual router, standby group 1 (IP address 10.1.1.3, MAC address 00:00:0C:07:AC:01). Callout: standby group number to which participating physical interfaces belong. The workstation (netmask 255.255.255.0) uses 10.1.1.3 as its gateway to reach 20.20.20.4.]
One way to achieve high availability is to use the Hot Standby Router Protocol (HSRP), which provides redundancy for IP networks
by ensuring that user traffic recovers immediately and transparently from first-hop failures in
router interfaces.
By sharing an IP address and a MAC (Layer 2) address, two or more routers can act as a single "virtual"
router. The members of the virtual router group continually exchange status messages. This way, one
router can assume the routing responsibility of another, should it go out of commission for either planned
or unplanned reasons. Hosts continue to forward IP packets to a consistent IP and MAC address, and the
changeover of devices doing the routing is transparent.
Using HSRP, a set of routers works in concert to present the illusion of a single virtual router to the hosts
on the LAN. This set is known as an HSRP group or a standby group. A single router elected from the
group is responsible for forwarding the packets that hosts send to the virtual router. This router is known
as the Active router. Another router is elected as the Standby router. In the event that the Active router
fails, the Standby assumes the packet-forwarding duties of the Active router. Although an arbitrary
number of routers may run HSRP, only the Active router forwards the packets sent to the virtual router.
To minimize network traffic, only the Active and Standby routers send periodic HSRP messages once the
protocol has completed the election process. If the Active router fails, the Standby router takes over as
the Active router. If the Standby router fails or becomes the Active router, then another router is
elected as the Standby router.
On a particular LAN, multiple hot standby groups may coexist and overlap. Each standby group emulates
a single virtual router. The individual routers may participate in multiple groups. In this case, the router
maintains separate state and timers for each group.
Each standby group has a single, well-known MAC address, as well as an IP address.
In most cases when you configure routers to be part of an HSRP group, they listen for the HSRP MAC
address for that group as well as their own burned-in MAC address. The exception is routers whose
Ethernet controllers only recognize a single MAC address (for example, the Lance controller on the Cisco
2500 and Cisco 4500 routers). These routers use the HSRP MAC address when they are the Active router,
and their burned-in address when they are not.
HSRP uses the following MAC address on all media except Token Ring:
0000.0c07.ac**
(where ** is the HSRP group number in hexadecimal)
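
The construction of the virtual MAC address can be sketched as follows. This is an illustrative helper only: hsrp_virtual_mac is a hypothetical name, and the 0 to 255 range assumes the one-byte group field implied by the two-digit placeholder above.

```python
def hsrp_virtual_mac(group):
    """Build the HSRP virtual MAC address 0000.0c07.acXX for a standby group."""
    if not 0 <= group <= 255:
        raise ValueError("the group number must fit in one byte (0 to 255)")
    # The last byte of the address is the group number written in hexadecimal.
    return f"0000.0c07.ac{group:02x}"


print(hsrp_virtual_mac(1))   # 0000.0c07.ac01
print(hsrp_virtual_mac(10))  # 0000.0c07.ac0a
```

Standby group 1 therefore gets 0000.0c07.ac01, which is the address 00:00:0C:07:AC:01 shown for the virtual router on the slide.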

4 IP Protocol
Operation
[Figure: Router A (active, 10.1.1.1) and Router B (standby, 10.1.1.2) each send Hello messages to the multicast address 224.0.0.2. Standby group 1 has the IP address 10.1.1.3 and the MAC address 00:00:0C:07:AC:01. The workstation (netmask 255.255.255.0, gateway 10.1.1.3) reaches 20.20.20.4 through the virtual router. When Router A sends no more hellos, Router B enters active mode.]
The routers in an HSRP group send and receive keepalives using the multicast address 224.0.0.2 and
UDP port 1985. By default, the hello interval is 3 seconds. Once the hold time (10 seconds by default, that is, just over 3 hello intervals) passes without hearing
from the active router, the standby router automatically becomes the active router. Each router is
configured with a priority number; the router with the highest priority number in a standby group is the
active router.
Preemption
The HSRP preemption feature enables the router with highest priority to immediately become the Active
router. Priority is determined first by the priority value that you configure, and then by the IP address. In
each case a higher value is of greater priority.
When a higher priority router preempts a lower priority router, it sends a coup message. When a lower
priority active router receives a coup message or hello message from a higher priority active router, it
changes to the speak state and sends a resign message.
Preempt Delay
The preempt delay feature allows preemption to be delayed for a configurable time period, allowing the
router to populate its routing table before becoming the active router.
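
The priority rule described above (configured priority first, then IP address, a higher value winning in each case) can be sketched as follows. This is an illustrative sketch only: elect_active and the priority values are hypothetical, and the sketch ignores HSRP states, timers and whether preemption is enabled.

```python
import ipaddress


def elect_active(routers):
    """Return the name of the router that wins the comparison.

    `routers` is a list of (name, priority, interface IP address) tuples.
    """
    def rank(router):
        name, priority, address = router
        # Compare the priority first, then the IP address as a number.
        return (priority, int(ipaddress.ip_address(address)))

    return max(routers, key=rank)[0]


# Router A is given a higher priority than Router B.
print(elect_active([("Router A", 110, "10.1.1.1"), ("Router B", 100, "10.1.1.2")]))  # Router A
# With equal priorities, the higher IP address wins.
print(elect_active([("Router A", 100, "10.1.1.1"), ("Router B", 100, "10.1.1.2")]))  # Router B
```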


What is the function of Network Address Translation (NAT)?
To map private IP addresses to public IP addresses

To map symbolic addresses to numeric addresses
To convert classless addresses into classful (CIDR) ones
To map non-IP addresses to IP addresses

How are all the routers within an Autonomous System administered?
By BGP-4
By the same manufacturer
By the same organization
By UDP

Associate each IPv4 header field with its appropriate description.
Fields:
Source Address
Time To Live
Header Checksum
Total Length
Descriptions:
Identifies errors in the IP header
Origin of the packet
Counter to avoid routing loops
Size of the packet in bytes
End of Section
Section 5
Transport Layer
IP Technology

Document History
Edition
Date
Author
Remarks
01
YYYY-MM-DD
First edition

1. User Datagram Protocol (UDP)
5. Transport layer
Situation of the UDP Protocol
[Figure: the TCP/IP protocol stack.]
Application: NTP, TFTP, SNMP, DNS (over UDP); Telnet, FTP, SMTP, HTTP (over TCP)
Transport: UDP, TCP
Network: IP, ICMP, ARP
Link: LLC 802.2, SNAP, MAC (FDDI, Token Ring, Ethernet ISO 802.3, Ethernet V2)
Physical: optical fiber, shielded twisted pair, 10Base-T, 10Base2, 10Base5

5. Transport layer
Connectionless Service
[Figure: packets P1, P2 and P3 are sent in order over UDP/IP. The IP network offers a connectionless service, so the packets can arrive in a different order, and the receiving UDP passes them on in the order received.]
UDP does not reorder packets
UDP offers "connectionless" service
You have already seen that IP offers service in connectionless mode only.
This means that the IP network does not ensure that all the packets from the same flow follow the same
route and therefore cannot guarantee that these packets will arrive in the same order they were
transmitted.
UDP also functions in connectionless mode and therefore does not offer mechanisms for reordering
packets.
To summarize, both UDP and IP offer connectionless mode service only.

5. Transport layer
UDP: a Non-Reliable Protocol
[Figure: two users exchange a letter through the postal service.]
Postal service: not reliable
Nevertheless, people appreciate this service
It is the role of users to develop a procedure if they want reliable
communication
(For example: if there is no reply after 3 days, the letter is sent again)
You know that IP is not a reliable protocol. Does UDP increase reliability?
Well, actually, no it doesn't.
This seems unfortunate, but is it really detrimental for data communication?
In fact, it all depends on the data type.
Once again, UDP can be compared to the postal service.
When you send a letter by post, in most cases the service is not guaranteed.
The letter can quite easily be lost in transit. It doesn't happen often but it is possible.
All users are well aware of this possibility but are relatively confident when they use this mode of
communication. The same goes for UDP.
However, there is nothing stopping users from developing a reliable mechanism for sending post while
continuing to use the postal service.
For example, a user sends a letter and requests a reply. If a reply has not been received after n days, the
user can send the letter again.

5. Transport layer
UDP for Applications Tolerating Information Loss
[Figure: a time server uses NTP to send the date and time to network management devices every 10 s across an IP network; voice conversations are carried across an IP network.]
Other applications are relatively tolerant in the event of information loss.

This is the case, for example, with voice transmission over IP.
The IP packets of a conversation transport digitized voice samples.
If a packet is lost, only a few samples are lost.
Such losses don't really matter because the human ear is capable of compensating for the defects itself.
Moreover, retransmitting the packet would be pointless because it would arrive too late to be reinserted in the
conversation in the right place.
At this level, UDP offers a significant advantage. It is so simple that it produces only very slight
transmission delay. This is an extremely important factor in ensuring that the voice packets are
transmitted properly without any echo phenomena.
To ensure that a network is supervised properly, the network devices must be perfectly synchronized.
For example, alarm messages with alarm raise times generated by the various devices can only be
analyzed correctly if the device clocks are synchronized.
For this reason, time servers use the Network Time Protocol (NTP) to distribute time.
NTP is based on UDP, which is not, however, reliable.
If NTP distributes the time every 10 seconds, losing a message is not of major importance:
firstly, each device has its own internal clock and uses these messages to resynchronize its internal
clock,
secondly, it is not worth implementing a mechanism to retransmit lost messages because time will have
passed by the time the message is retransmitted and the information will be out of date.
In any case, a new message will be received within less than 10 seconds.

5. Transport layer
UDP for Applications Using Simple Exchanges
[Figure: a client wants to open http://alcatel.com. Its DNS application, which needs reliability, asks the name server "What is the IP address of alcatel.com?" over UDP, which is not reliable. The request is sent a second time and the name server replies "alcatel.com" = 169.109.33.06.]
The application must implement a procedure for recovering errors
Other applications are based on UDP even though they need a good level of reliability.
These are generally applications that need to perform extremely simple exchanges such as "request-reply" exchanges.
Take the case of the Domain Name System (DNS), which uses a "name server" to translate domain names
such as "alcatel.com" into IP addresses.
This is done using a dialog protocol that runs on top of UDP.
When a Client asks for a translation, it obviously wishes to receive a result. However, this level of
reliability is not guaranteed with UDP.
So, the DNS application asks a name server to translate a domain name. This request is made using non-reliable UDP and IP.
As it happens, the packet is destroyed in the network but there is no reaction from IP or UDP.
It is therefore up to the application to recover the error.
How does it do this?
Quite simply, by triggering a reply Timer when the request is sent.
If, at timeout, a reply has not been received, the Client simply resends the request.
And, hopefully, this time the exchange will proceed as planned.
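
The timer-and-resend procedure described above can be sketched as follows. This is a minimal sketch, not the actual DNS resolver algorithm: query_with_retry, its timeout and retry values, and the LossySocket stand-in are all hypothetical, and the request and reply are plain text rather than real DNS messages. The function accepts any object with the sendto, recvfrom and settimeout methods of a UDP socket.

```python
import socket


def query_with_retry(sock, request, server, timeout=2.0, max_tries=3):
    """Send a request over UDP and resend it each time the reply timer expires."""
    sock.settimeout(timeout)  # the reply timer
    for attempt in range(1, max_tries + 1):
        sock.sendto(request, server)
        try:
            reply, _ = sock.recvfrom(512)
            return reply, attempt
        except socket.timeout:
            continue  # no reply before timeout: resend the request
    raise TimeoutError(f"no reply after {max_tries} tries")


class LossySocket:
    """Stand-in for a UDP socket whose first datagram is lost (demonstration only)."""

    def __init__(self):
        self.sent = 0

    def settimeout(self, timeout):
        pass  # the simulated timer expires immediately

    def sendto(self, data, address):
        self.sent += 1

    def recvfrom(self, bufsize):
        if self.sent < 2:
            raise socket.timeout("simulated loss of the first request")
        return b"alcatel.com = 169.109.33.06", ("name-server", 53)


reply, attempts = query_with_retry(LossySocket(), b"alcatel.com?", ("name-server", 53))
print(reply, attempts)
```

With a real network, the same function could be given a socket created with socket.socket(socket.AF_INET, socket.SOCK_DGRAM).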

5. Transport layer
Format of the UDP Message
Bytes 0-1              Bytes 2-3
UDP source port        UDP destination port
UDP message length     UDP checksum
Data
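Now that you have seen the UDP applications, let's look at the fields that make up the UDP header.
In reality, this header is extremely simple. It is made up of 4 fields of 2 bytes each.
You are now familiar with the role of the source and destination port fields.
The message length field gives the size in bytes of the whole UDP message (header and data), and the checksum is used to detect transmission errors.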
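
The four header fields can be packed and unpacked with the struct module. This is a minimal sketch: build_udp_message and parse_udp_message are hypothetical names, the port numbers in the example are arbitrary, and the checksum is left at 0 instead of being computed.

```python
import struct

UDP_HEADER = "!HHHH"  # four 16-bit fields in network byte order
UDP_HEADER_LENGTH = 8


def build_udp_message(source_port, destination_port, data, checksum=0):
    """Prefix the data with an 8-byte UDP header."""
    length = UDP_HEADER_LENGTH + len(data)  # the length covers header and data
    header = struct.pack(UDP_HEADER, source_port, destination_port, length, checksum)
    return header + data


def parse_udp_message(message):
    """Split a UDP message into its header fields and data."""
    source_port, destination_port, length, checksum = struct.unpack_from(UDP_HEADER, message)
    return {
        "source_port": source_port,
        "destination_port": destination_port,
        "length": length,
        "checksum": checksum,
        "data": message[UDP_HEADER_LENGTH:length],
    }


message = build_udp_message(1024, 53, b"query")
print(len(message))  # 13
print(parse_udp_message(message))
```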

5. Transport layer
Main UDP Well-Known Ports
UDP "Well-known" ports
7: Echo
9: Discard
11: Systat - logged users
13: Daytime
15: Netstat
19: Chargen
37: Time
43: Whois
53: DNS - Domain Name Server (Query)
67: BOOTPs - Bootstrap Protocol, Server
68: BOOTPc - Bootstrap Protocol, Client
69: TFTP - Trivial File Transfer Protocol
111: RPC - Remote Procedure Call
123: NTP - Network Time Protocol
161: SNMP - Simple Network Management Protocol
162: SNMP - Traps

5. Transport layer
Synthesis
UDP added value:
Connectionless
Not reliable
No flow control
No error recovery
[Figure: Applications 1, 2 and 3 share a single UDP entity.]
UDP simply performs multiplexing/demultiplexing
To conclude on the subject of UDP:

What is the added value of UDP with respect to IP?
It doesn't improve reliability.
It doesn't control flows.
It doesn't provide an error recovery mechanism.
It doesn't ensure that datagrams are delivered in the order they are sent.
So, what does it do?

UDP simply enables the multiplexing and demultiplexing of data exchanged between several
applications.
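
Demultiplexing by destination port can be sketched as follows. This is an illustrative sketch, not an operating system's implementation: demultiplex is a hypothetical function, and each application is represented by a simple list acting as its inbox.

```python
def demultiplex(datagrams, applications):
    """Hand each datagram to the application registered on its destination port.

    `datagrams` is an iterable of (destination port, data) pairs and
    `applications` maps a port number to a list used as that application's inbox.
    Datagrams for a port with no application are returned as undeliverable.
    """
    undeliverable = []
    for port, data in datagrams:
        inbox = applications.get(port)
        if inbox is None:
            undeliverable.append((port, data))
        else:
            inbox.append(data)
    return undeliverable


# Three applications listening on well-known UDP ports: DNS, NTP and SNMP.
applications = {53: [], 123: [], 161: []}
received = [(123, b"time"), (53, b"name query"), (161, b"get"), (53, b"second query"), (9999, b"?")]
print(demultiplex(received, applications))  # [(9999, b'?')]
print(applications[53])                     # [b'name query', b'second query']
```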


For which kind of transfer is UDP used?
Electronic mail
File transfer
Voice over IP
Web page transfer

For what does the Real-time Transport Control Protocol (RTCP) provide
a performance monitoring channel?
An associated call setup
An IP packet
An RTP flow

What are the characteristics of the User Datagram Protocol (UDP)?
Connectionless
Layer 3 protocol for TCP
Unacknowledge packet retransmission
Unreliable service

Can the Real-Time Transport Protocol be used to provide a minimum
guaranteed network transit delay?
Yes
No

What are the transport services provided by the Real-Time Transport
Protocol (RTP)?
Identification of lost packets
Identification of out-of-sequence packets
Packet retransmission
Relative timing information
2. Transmission Control Protocol (TCP)
5. Transport layer
Situation of the TCP Protocol
[Figure: the TCP/IP protocol stack.]
Application: NTP, TFTP, SNMP, DNS (over UDP); DNS, Telnet, FTP, SMTP, HTTP (over TCP)
Transport: UDP, TCP
Network: IP, ICMP, ARP
Link: LLC 802.2, SNAP, MAC (FDDI, Token Ring, Ethernet ISO 802.3, Ethernet V2)
Physical: optical fiber, 10Base-T, 10Base2, 10Base5
Illustrating the position of TCP in the TCP/IP stack obviously shows that TCP is located in the transport
layer but, more particularly, it presents the main applications that run over this protocol:
HTTP, which enables users to surf the internet.
FTP, which enables effective file transfer.
TELNET, which enables systems to be remote controlled.
SMTP, which enables the sending of electronic mail.
DNS, which is used for translating domain names into IP addresses and which has the particular feature
of functioning over both UDP, as seen previously, and over TCP. In fact, it uses TCP solely to update
databases between name servers.

5. Transport layer
Connection-Oriented Service
[Figure: packets P1, P2 and P3 are sent in order over TCP/IP. The IP network offers a connectionless service, so the packets can arrive in a different order; the receiving TCP puts them back in order (P1, P2, P3) before passing them on.]
TCP reorders the packets received
TCP offers a connection-oriented service
Sequence numbers must be inserted and managed by TCP
Although TCP is installed over IP, which is a connectionless protocol, it offers a connection-oriented
service. This means that TCP ensures that packets sent in a particular order over an IP network will be
delivered to the applications in the order they were sent. To make this possible, TCP must insert
sequence numbers in the datagrams.

5. Transport layer
Error Recovery
[Figure: the application in a cash dispenser sends "Withdrawal: 50" to the application at the central bank. TCP carries the message in segment P1 across the IP network (not reliable), and the receiving TCP acknowledges it (P1-OK).]
TCP is reliable
TCP brings reliability.

All applications that wish to ensure reliable transmission but do not have their own reliability
mechanisms use TCP. TCP offers this service using mechanisms that are complex but effective.
We have seen that this kind of mechanism was sometimes integrated in the applications running above
UDP. However, when using complicated mechanisms that use a lot of memory space, it is preferable to
implement them once only in TCP rather than reproducing them n times in the various applications.

5. Transport layer
TCP Format
Header (min 20 bytes, max 60 bytes), drawn as rows of 4 bytes:
Source port number | Destination port number
Sequence number
Acknowledgement number
Header length | Reserved | URG ACK PSH RST SYN FIN | Window size
Checksum | Urgent pointer
Options (optional)
Followed by: Data (optional)
Header length: expressed in 4-byte words
At IP level, the unit of transmission is called a "packet"

At TCP level, the unit of transmission is called a "segment"
To enable the implementation of all TCP mechanisms, the header naturally has more fields than the UDP
header. It has a minimum of 20 bytes, which can reach 60 depending on the options.
It is the "Header length" field, expressed in 4-byte words, that determines the size of the header.
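
Reading the "Header length" field and the flags can be sketched with the struct module. This is a minimal sketch: parse_tcp_header is a hypothetical function, the sample segment is invented for the example, and its checksum is left at 0 rather than computed.

```python
import struct

# Flag bits of the TCP header, from the least significant bit upwards.
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]


def parse_tcp_header(segment):
    """Decode the fixed 20-byte part of a TCP header."""
    (source_port, destination_port, sequence, acknowledgement,
     offset_and_flags, window, checksum, urgent) = struct.unpack_from("!HHIIHHHH", segment)
    header_words = offset_and_flags >> 12  # the top 4 bits hold the header length in 4-byte words
    if header_words < 5:
        raise ValueError("a TCP header is at least 5 words (20 bytes) long")
    return {
        "source_port": source_port,
        "destination_port": destination_port,
        "sequence": sequence,
        "acknowledgement": acknowledgement,
        "header_length": header_words * 4,  # 5 words = 20 bytes, 15 words = 60 bytes
        "flags": [name for bit, name in enumerate(FLAG_NAMES) if offset_and_flags & (1 << bit)],
        "window": window,
    }


# Sample segment: ports 1024 -> 80, sequence number 40, header length 5 words, SYN flag set.
segment = struct.pack("!HHIIHHHH", 1024, 80, 40, 0, (5 << 12) | 0x02, 8192, 0, 0)
header = parse_tcp_header(segment)
print(header["header_length"], header["flags"])  # 20 ['SYN']
```

Because the field is 4 bits wide, its largest value is 15, which gives the maximum header size of 60 bytes.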

5. Transport layer
Main Well-Known Ports
5: RJE - Remote Job Entry
7: Echo
9: Discard
11: Systat - logged users
13: Daytime
15: Netstat
19: Chargen
20: FTP - File Transfer Protocol, Data
21: FTP - File Transfer Protocol, Commands
23: TELNET - Remote connection
25: SMTP - Simple Mail Transfer Protocol
53: DNS - Domain Name Server (zone transfer)
80: HTTP - Hypertext Transfer Protocol
110: POP3 - Post Office Protocol
443: HTTPS - Secure HTTP
As in UDP, certain ports are used for particular services such as Discard, Echo, Time, etc.

5. Transport layer
Sequence Number and Flags
[Figure: the TCP header (source port number, destination port number, sequence number, acknowledgement number, header length, reserved bits, flags URG ACK PSH RST SYN FIN, window size, checksum, urgent pointer, options, data), with the sequence number, acknowledgement number and flags in focus.]
Several fields are used to ensure that segments are sequenced correctly and errors recovered:
the sequence number of the first byte in this segment,
the acknowledgement number, which indicates the next byte that the station sending this segment expects to receive.
There are several flag bits:
URG (Urgent)
ACK (Ack)
PSH (Push)
RST (Reset)
SYN (Synchro)
FIN (Final)
Instead of just giving the definition and function of each field and flag, let's look at some examples of
how they are used.

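5. Transport layer
Connection Establishment
Three-way handshake
[Figure: message sequence between the Client side (Appli, TCP) and the Server side (TCP, Appli).]
Client Appli to Client TCP: Connect-Request
Client TCP to Server TCP: SYN (Seq. = x)
Server TCP to Server Appli: Connect-Indication
Server Appli to Server TCP: Connect-Response
Server TCP to Client TCP: SYN (Seq. = y) / ACK (Ack. = x + 1)
Client TCP to Server TCP: ACK (Ack. = y + 1) / (Seq. = x + 1)
Client TCP to Client Appli: Connect-Confirm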
Communication between 2 applications operating over TCP therefore begins with a connection
establishment procedure called the "three-way handshake".
The Client application sends TCP a Connect-Request primitive with the destination port, the IP address,
etc.
TCP on the Client side starts by selecting a sequence number at random. It inserts it in the TCP header
and sets the SYN (synchronization) flag.
TCP on the Server side (the remote station) uses the primitive Connect-Indication to notify the
application corresponding to the port number and provides certain parameters such as the calling IP
address.
The Server application uses the primitive Connect-Response to ask TCP to accept the connection.
The TCP on the Server side then chooses a random sequence number.
It sends back its own sequence number along with the SYN flag and indicates that the request has been
received by setting the ACK flag and sending back the sequence number received incremented by 1.
When TCP on the Server side sends this header back, it is not sure that the information will reach its
destination. For this reason, the other station acknowledges receipt of the message by incrementing its
sequence number and sending back the sequence number received plus 1.
In the meantime, TCP on the Client side uses the primitive Connect-Confirm to inform its application
that the session connection has been established.
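
The sequence and acknowledgement numbers exchanged during the three-way handshake can be sketched as follows. This is a simplified model that only computes the three headers; three_way_handshake is a hypothetical function, and real TCP implementations choose their initial sequence numbers with more care than a plain random draw.

```python
import random

SEQUENCE_SPACE = 2 ** 32  # sequence numbers are 32 bits wide and wrap around


def three_way_handshake(x=None, y=None):
    """Return the three segments of a connection establishment as (sender, header) pairs."""
    x = random.getrandbits(32) if x is None else x  # client's initial sequence number
    y = random.getrandbits(32) if y is None else y  # server's initial sequence number
    return [
        ("client", {"flags": {"SYN"}, "seq": x}),
        ("server", {"flags": {"SYN", "ACK"}, "seq": y, "ack": (x + 1) % SEQUENCE_SPACE}),
        ("client", {"flags": {"ACK"}, "seq": (x + 1) % SEQUENCE_SPACE, "ack": (y + 1) % SEQUENCE_SPACE}),
    ]


for sender, header in three_way_handshake(x=100, y=300):
    print(sender, sorted(header["flags"]), header["seq"], header.get("ack"))
```

With x = 100 and y = 300, the server acknowledges 101 and the client acknowledges 301, which matches the x + 1 and y + 1 values on the slide.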

5. Transport layer
Data Reordering
[Figure: message sequence between the sending and receiving TCP entities. The establishment phase ends with Seq. = 40, then the transfer phase begins.]
Data-Request ("abcd"): (Seq. = 40) / Data "abcd"
Data-Request ("efg"): (Seq. = 44) / Data "efg" (delayed in the network)
Data-Request ("hi"): (Seq. = 47) / Data "hi"
Data-Request ("jkl"): (Seq. = 49) / Data "jkl"
Receiver: ACK = 44, Data-Indication ("abcd")
Receiver: ACK = 52, Data-Indication ("efghijkl")
Once the session has been established between the 2 applications, data can be exchanged in both
directions.
To make this example easier to understand, the data will be transferred in one direction only.
It should be noted that TCP must ensure that the data is passed on to the applications in the same order
it was sent.
Let's assume that the sequence number is currently 40. The application uses the Data-Request primitive
to ask its TCP to transmit the 4 characters "abcd". TCP therefore sends this data with its current
sequence number, that is, 40.
The remote station passes this data on to its application. TCP itself acknowledges receipt of the data
by sending back an acknowledgement number equal to the sequence number received incremented by
the number of bytes received, which in this case is 4, giving 44.
In the meantime, the sender wishes to send 3 more characters, "efg". TCP does not wait for the
previous data to be acknowledged before transmitting new data, as long as the receiver's window allows it.
TCP therefore transmits this data immediately. The sequence number of the first byte in the segment is
now 44.
This segment may be carried over another route, which will mean a longer transmission delay.
Next, 2 other characters, "hi", are transmitted with the sequence number 47. This time, the route taken
by the segment is a lot quicker and the segment even reaches its destination before the previous
segment.
Next, another 3 characters, "jkl", with the sequence number 49 are sent along the same faster route.
This segment also overtakes the segment that is still being transported over the longer route.
On the receive side, this data is not passed on to the application because it is no longer in the order in
which it was sent. Only when the missing segment (sequence number 44) is received is all the data waiting
in TCP passed on to the application. And only then is the acknowledgement sent back to the sender.
This example shows how TCP uses sequence numbers to ensure that data is delivered in the same order it
was sent.
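The reordering in this example can be reproduced with a small receive buffer keyed by sequence number. The sketch below is an illustration under simplifying assumptions: the name `deliver_in_order` is hypothetical, segments never overlap, and an acknowledgement is generated only when data is delivered, as in the example above.

```python
def deliver_in_order(segments, first_seq):
    """Return (delivered_data, ack_number) for each delivery to the application."""
    expected = first_seq   # sequence number of the next byte awaited
    waiting = {}           # out-of-order segments, keyed by sequence number
    events = []
    for seq, data in segments:
        waiting[seq] = data
        delivered = ""
        # Hand over every segment that is now contiguous with earlier data.
        while expected in waiting:
            chunk = waiting.pop(expected)
            delivered += chunk
            expected += len(chunk)
        if delivered:
            events.append((delivered, expected))
    return events

# Arrival order from the example: the "efg" segment takes the slow route.
arrival_order = [(40, "abcd"), (47, "hi"), (49, "jkl"), (44, "efg")]
events = deliver_in_order(arrival_order, first_seq=40)
print(events)
```

The two deliveries correspond to the two Data-Indications of the example: "abcd" with acknowledgement 44, then "efghijkl" with acknowledgement 52.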
Flow Control
[Figure: flow control across the IP network. Through the window size, TCP gives a credit to each sender.]
So, you have seen how TCP offers a connection-oriented service that:
draws on procedures for opening and closing sessions and transferring data,
defines mechanisms for reordering data.
Let's now look at another functionality offered by TCP: flow control.

Let's take the example of a station that is dedicated to the Server function. This station is likely to
receive considerable flows of data.
It is therefore essential for the station to have a means of regulating the flows end-to-end.
Once again, TCP can offer this type of service.
Each station transmits and receives data, but the receive function of one station grants the transmit
function of the other station a credit that represents the number of incoming bytes the receive function
is willing to accept.
The amount of credit varies dynamically and is defined via the "Window size" field in the TCP header.
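The credit mechanism can be expressed as a small calculation on the sender side. The sketch below is an assumption-laden illustration: the function `sendable_bytes` and its parameter names are hypothetical, and the byte counts in the usage lines are example values.

```python
def sendable_bytes(window, last_ack, next_seq, pending):
    """Bytes the sender may transmit now without exceeding the receiver's credit."""
    in_flight = next_seq - last_ack           # sent but not yet acknowledged
    credit_left = max(0, window - in_flight)  # part of the window still unused
    return min(pending, credit_left)

# The receiver has acknowledged up to byte 2000 and grants a credit of 1000 bytes.
first = sendable_bytes(window=1000, last_ack=2000, next_seq=2000, pending=1400)
# After sending those bytes with no new acknowledgement, the sender must wait.
stalled = sendable_bytes(window=1000, last_ack=2000, next_seq=3000, pending=400)
# A new acknowledgement (3000) with a credit of 600 lets the rest go.
resumed = sendable_bytes(window=600, last_ack=3000, next_seq=3000, pending=400)
print(first, stalled, resumed)
```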

Flow Control: Window size
[Figure: transmission window and receiver buffer. The receiver announces ACK nb = 2000 / Window = 1000. The sender transmits 500 bytes with SEQ nb = 2000, then 500 bytes with SEQ nb = 2500, while the announced window falls to 500 and then 0. After a new announcement of Window = 600, the sender transmits 400 bytes with SEQ nb = 3000, leaving Window = 200.]
Retransmit Timeout
[Figure: retransmit timeout. The Round Trip Time is measured across the Internet on the SYN / SYN, ACK / ACK exchange; the Retransmit TimeOut is derived from the Round Trip Time and runs while a segment is waiting for its ack.]
TCP uses various timers. The main one, the Retransmit Timeout, is used for the waiting-for-acknowledgement period.
The problem, of course, lies in assigning the right value to this timer since the time taken to
acknowledge a segment depends on numerous parameters:
distance between the stations,
link speed,
system processing time,
traffic in the network,
etc.
Instead of assigning a set value to the timer, TCP sets the timer according to a parameter known as RTT
or Round Trip Time. This parameter measures the time between when a segment is sent and when an
acknowledgement is received.
The Retransmit Timeout is then set based on this RTT.
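The text does not say how the timeout is derived from the RTT. One common method is the one standardized in RFC 6298, which smooths the RTT samples and adds a margin proportional to their variation. The sketch below follows that RFC under simplifying assumptions: the class name `RtoEstimator` is hypothetical, the clock granularity term is ignored, and the RTT samples are invented example values.

```python
class RtoEstimator:
    """Retransmit timeout computed from RTT samples, in the style of RFC 6298."""
    ALPHA = 1 / 8    # weight of a new sample in the smoothed RTT
    BETA = 1 / 4     # weight of a new sample in the RTT variation
    K = 4            # multiplier applied to the variation
    MIN_RTO = 1.0    # seconds, lower bound recommended by the RFC

    def __init__(self):
        self.srtt = None     # smoothed round trip time
        self.rttvar = None   # round trip time variation
        self.rto = 1.0       # value used before any measurement

    def update(self, rtt_sample):
        if self.srtt is None:
            # First measurement initializes both estimators.
            self.srtt = rtt_sample
            self.rttvar = rtt_sample / 2
        else:
            # The variation is updated before the smoothed RTT.
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - rtt_sample))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt_sample
        self.rto = max(self.MIN_RTO, self.srtt + self.K * self.rttvar)
        return self.rto

estimator = RtoEstimator()
first = estimator.update(0.5)    # seconds
second = estimator.update(0.5)
print(first, second)
```

With two identical samples of 0.5 s, the timeout starts at 1.5 s and tightens to 1.25 s as the measured variation decreases.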

Congestion Control: "Slow Start" Algorithm

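[Figure: slow start example between emitter and receiver. Segments of 512 bytes are sent and each Ack carries Window size = x. The congestion window (cwnd) takes the values 1, 2, 4: an exponential increase of the number of segments (axis: 5 to 20) per Round Trip Time.]
Cwnd: Congestion Window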
You have seen the flow-control mechanisms that use the "Window size" field in the TCP header. This field
is set by the end stations, implying that the flow control is end-to-end flow control.
But how does flow control work when there is, say, congestion in the network?
Routers, of course, only process data up to level 3. They do not intervene in level 4 TCP and therefore do
not modify parameters in the segment.
It is therefore TCP that implements a congestion-control algorithm. It is not based on another protocol or
particular fields in the messages exchanged but consists in analyzing network behavior and, in particular,
the network's ability to return acknowledgements.
If an acknowledgement is not returned, you could assume that the segment has been destroyed during
transmission because a particular interface has changed one of the bits in the frame. In practice, this
type of error is relatively uncommon and accounts for less than 1% of messages transmitted. When an
acknowledgement is not returned, it is usually due to congestion in the network.
TCP implements an algorithm known as "slow start".
The transmit station starts by subjecting the network to a kind of test that consists in transmitting a
segment to the remote station.
If the transmit station receives an acknowledgement, it tests the network again by this time transmitting
two consecutive segments.
If it receives the corresponding acknowledgements, it then transmits 4 consecutive segments and waits
for the acknowledgements and so on, exponentially, until a segment or acknowledgement is lost, in
which case another "slow start" process begins.
Numerous algorithms have been suggested over recent years and engineers are continuing to look for
other solutions.
New TCP implementations generally use a combination of the 4 basic internet standard algorithms:
the "slow-start" algorithm that you have just seen,
the "congestion avoidance" algorithm,
the "fast retransmit" algorithm,
the "fast recovery" algorithm.

Slow Start Algorithm and Congestion Avoidance

[Figure: number of segments (axis: 5 to 25) per Round Trip Time. Slow start runs until congestion is detected at 16 segments; the threshold is then set to 16/2 = 8; slow start resumes up to ssthreshold, followed by congestion avoidance with a linear increase.]
ssthreshold: slow start threshold
The "congestion avoidance" algorithm is used in conjunction with the slow-start algorithm.

If, during a slow start, congestion is detected while the transmit station is in the process of transmitting
16 consecutive segments, the transmit station starts by dividing this value by 2 and storing the result (in
this case 8) in a variable.
Next, it restarts the slow-start process and transmits one, then two, then four segments, continuing
exponentially until it reaches the value stored in the variable. When it reaches this value, it goes into
what is known as the "congestion avoidance" phase. During this phase, the number of segments
transmitted increases linearly rather than exponentially.
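The interplay of the two phases can be traced round trip by round trip. The sketch below is a simplified model of the behavior described above, not an implementation of any particular TCP stack: the function name `cwnd_trace` is hypothetical, the window is counted in whole segments, and a single congestion event is assumed when 16 segments are in flight.

```python
def cwnd_trace(rounds, loss_at_cwnd=16):
    """Congestion window (in segments) per round trip, with one congestion event."""
    cwnd, ssthresh, lost_once = 1, None, False
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if not lost_once and cwnd == loss_at_cwnd:
            ssthresh = cwnd // 2   # store half the window that caused congestion
            cwnd = 1               # restart the slow-start process
            lost_once = True
        elif ssthresh is None or cwnd < ssthresh:
            cwnd *= 2              # slow start: exponential increase
            if ssthresh is not None:
                cwnd = min(cwnd, ssthresh)
        else:
            cwnd += 1              # congestion avoidance: linear increase
    return trace

trace = cwnd_trace(12)
print(trace)
```

The trace shows the exponential climb to 16, the restart at 1, a second exponential climb that stops at the stored value 8, and then a linear increase.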

Synthesis
TCP provides:
Flow control
Reliability
Error recovery
Multiplexing/demultiplexing
Connection-oriented service
You have now seen the basics of TCP. A more extensive examination of TCP could include:
a more detailed look at flow-control algorithms with the "Nagle" and "fast retransmit" algorithms,
an analysis of selective acknowledgement mechanisms,
etc.
However, this would require much more time and would only really be useful for developers.
To summarize and conclude, it can be said that TCP offers applications a large number of services:
Firstly, it provides reliability thanks to the use of sequence numbers and acknowledgement
mechanisms.
It also implements error recovery.
It provides full-duplex flow-control mechanisms, which optimize communication.
Although it operates on a datagram network, it provides connection-oriented service, which ensures
that data is delivered in the order it was transmitted.
And finally, it enables the multiplexing of several data flows.
You now have a solid grasp of the basics of transport-level TCP/IP and are capable of identifying the
advantages and disadvantages of TCP compared with UDP.


Which two TCP fields are used to ensure reliable delivery of data?

What actions can a receiver take to slow down the pace at which a sender transmits segments?
Create a larger input buffer
Acknowledge received packets less quickly
Set a low value for the Window size

The diagram represents the download of 2 files of 50 kBytes and 2 MBytes respectively. For each one,
different window sizes have been tested in order to determine the most efficient one. The window sizes
used are 8, 16, 32 and 64 kBytes. Complete the graphic with the missing parameters.
[Figure: % downlink efficiency (0 to 90) against window size in kBytes. Two curves are to be labelled with their file size (50 kB or 2 MB) and the window sizes 8, 16, 32 and 64 are to be placed on the horizontal axis.]
3. Signalling Transport (SIGTRAN)
SIGTRAN and SCTP
IETF/SIGTRAN protocols opposite the SS7 stack

[Figure: the SIGTRAN adaptation layers M2PA, M2UA, M3UA, SUA, IUA and V5UA, all carried over SCTP over IP, shown opposite the SS7 stack (TCAP / MAP, ISUP / SCCP, MTP-3, MTP-2, MTP-1) and the user protocols Q.931 and V5.2.]
SCTP offers a compromise between the reliability of non-real-time TCP and the "best effort" real-time
delivery of UDP.
SCTP protocol
[Figure: SCTP Endpoint A and SCTP Endpoint B. In each endpoint an SCTP User Application sits on the SCTP Transport Service, which sits on the IP Network Service. The two transport services are linked by an SCTP association over the network transport. An SCTP endpoint might hold one or several IP addresses.]
To reliably transport SS7 messages over IP networks, the IETF SIGTRAN working group devised the Stream
Control Transmission Protocol (SCTP). SCTP allows the reliable transfer of signaling messages between
signaling endpoints in an IP network. To establish an association between SCTP endpoints, one endpoint
provides the other endpoint with a list of its transport addresses (multiple IP addresses in combination
with an SCTP port). These transport addresses identify the addresses which will send and receive SCTP packets.
SCTP Endpoint: An SCTP endpoint is a logical sender or receiver of SCTP segments. This endpoint is a
combination of one or more IP addresses and a port number.
Association: SCTP works by establishing a relationship between SCTP endpoints. Such a relationship is
known as an association and is defined by the SCTP endpoints involved and the current protocol state.
Segments and Chunks: When SCTP wishes to send a piece of information to the remote end, it sends an
SCTP segment to the IP layer and IP routes the packet to the destination.
A number of chunks follow the common SCTP header, and each chunk comprises a chunk header plus
some chunk-specific content. This content can be either SCTP control information or SCTP user
information.
Streams: A stream is a one-way logical channel between SCTP endpoints, carrying a sequence of user
messages between two SCTP users. During association establishment, the number of streams from SCTP
endpoint A to B and from SCTP endpoint B to A is specified.
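The layout "common header followed by chunks" can be illustrated with a short parser. The field sizes used below (a 12-byte common header made of two 16-bit ports, a 32-bit verification tag and a 32-bit checksum, then chunks with an 8-bit type, 8-bit flags and a 16-bit length, padded to a multiple of 4 bytes) come from the SCTP specification rather than from this guide. The function name `parse_sctp_packet`, the port number, the chunk types and the sample bytes are example assumptions, and the checksum is left at zero rather than computed.

```python
import struct

def parse_sctp_packet(packet: bytes):
    """Split an SCTP packet into its common header and a list of chunks."""
    src_port, dst_port, vtag, checksum = struct.unpack_from("!HHII", packet, 0)
    header = {"src_port": src_port, "dst_port": dst_port,
              "verification_tag": vtag, "checksum": checksum}
    chunks, offset = [], 12
    while offset + 4 <= len(packet):
        ctype, flags, length = struct.unpack_from("!BBH", packet, offset)
        if length < 4:
            break  # malformed chunk: stop rather than loop forever
        value = packet[offset + 4: offset + length]
        chunks.append({"type": ctype, "flags": flags, "value": value})
        offset += (length + 3) & ~3  # skip padding up to a 4-byte boundary
    return header, chunks

# Hand-built sample: one chunk with a 5-byte value (padded), one empty chunk.
sample = (struct.pack("!HHII", 2905, 2905, 0x1234, 0)
          + struct.pack("!BBH", 4, 0, 9) + b"hello" + b"\x00" * 3
          + struct.pack("!BBH", 5, 0, 4))
header, chunks = parse_sctp_packet(sample)
print(header, chunks)
```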
SCTP Association
[Figure: transfer of records 0 and 1 of files 1, 2 and 3 from Endpoint A to Endpoint B. Over a single TCP connection all records share one ordered flow, so records that have been received stay buffered behind a missing one. Over an SCTP association each file travels in its own stream (Stream 0, Stream 1, Stream 2), so records are buffered only in the stream that is waiting for a missing record and the other streams are received normally.]
IP signaling traffic is usually composed of many independent message sequences between many different
signalling endpoints.
SCTP allows signaling messages to be independently ordered within multiple streams (unidirectional
logical channels established from one SCTP endpoint to another) to ensure in-sequence delivery between
associated endpoints. By transferring independent message sequences in separate SCTP streams, it is less
likely that the retransmission of a lost message will delay the delivery of other messages in unrelated
sequences (a problem known as head-of-line blocking). Because TCP delivers a single ordered byte stream
and is therefore subject to head-of-line blocking, the SIGTRAN Working Group recommends SCTP rather than
TCP for the transmission of signalling messages over IP networks.
In summary, SCTP provides:
acknowledged error-free non-duplicated transfer of signaling information
in-sequence delivery of messages within multiple streams, with an option for order-of-arrival delivery
of individual messages
optional bundling of multiple messages into a single SCTP packet
data fragmentation as required
network-level fault tolerance through support of multi-homing at either or both ends of an association
appropriate congestion avoidance behaviour and resistance to flooding (denial-of-service) and
masquerade attacks
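Head-of-line blocking can be made concrete with a toy model of what a receiver is allowed to hand to its user. The sketch below is an illustration only: the function `deliverable` is hypothetical, messages are identified by a (file, record) pair with one stream per file, and the loss of the first message is an assumed scenario.

```python
def deliverable(sent, arrived, per_stream):
    """Messages that can be handed to the user, in delivery order."""
    out = []
    blocked = set()   # ordering scopes that are waiting for a missing message
    for message in sent:              # 'sent' is in transmission order
        stream, _record = message
        scope = stream if per_stream else "connection"
        if scope in blocked:
            continue                  # must wait behind the missing message
        if message in arrived:
            out.append(message)
        else:
            blocked.add(scope)        # everything later in this scope waits
    return out

sent = [(file_id, record) for file_id in (1, 2, 3) for record in (0, 1)]
arrived = set(sent) - {(1, 0)}        # file 1, record 0 is lost in transit

single_flow = deliverable(sent, arrived, per_stream=False)   # TCP-like ordering
multi_stream = deliverable(sent, arrived, per_stream=True)   # SCTP-like ordering
print(single_flow, multi_stream)
```

With one ordered flow nothing can be delivered until the lost message is retransmitted, whereas with one stream per file only the remaining record of file 1 has to wait.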
SIGTRAN M2UA
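[Figure: M2UA. A Signalling End Point (MTP3 user, MTP3, MTP2, MTP1) is connected over SS7 to a Signalling Gateway (no SP number), where a NIF (Nodal Interworking Function) joins MTP2/MTP1 to M2UA/SCTP/IP. Across the IP network, the Application Server runs MTP3 user, MTP3, M2UA, SCTP and IP; the rest of the SS7 stack is located there. M2UA backhauls MTP2 primitives, and the signalling relationship runs between the Signalling End Point and the Application Server.]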
An Application Server contains a set of one or more unique Application Server Processes (ASPs). Normally,
one or more of these ASPs are actively processing traffic.

SIGTRAN M2UA
[Figure: M2UA is not symmetrical. The ASP carries the MTP3 user (e.g. ISUP) and MTP3 over M2UA/SCTP/IP; the SGW carries M2UA/SCTP/IP on the IP network side and MTP2/MTP1 on the SS7 side, and exchanges MTP2 primitives with the ASP.]
SIGTRAN M2PA
[Figure: SIGTRAN M2PA. Node A (SEP: MTP3 user, MTP3, MTP2, MTP1) is connected over SS7 to Node B, an IPSP with its own SP number running MTP3 users and a full MTP3 over MTP2/MTP1 on the SS7 side and over M2PA/SCTP/IP on the IP side. Node C is an IPSP running MTP3 users and a full MTP3 over M2PA/SCTP/IP. Signalling is peer-to-peer.]
SEP: Signalling End Point
IPSP: Internet Protocol Server Process
M2PA: MTP2 User Peer-to-Peer Adaptation Layer
SIGTRAN M3UA
[Figure: SIGTRAN M3UA. Node A (SEP: MTP3 user, MTP3, MTP2, MTP1) is connected over SS7 to Node B, an IPSP with its own SP number that interworks MTP3 over MTP2/MTP1 with M3UA/SCTP/IP. Node C is an IPSP running MTP3 users over M3UA/SCTP/IP. The M3UA layers operate peer-to-peer.]
SEP: Signaling End Point
IPSP: Internet Protocol Server Process
M3UA: MTP3 User Adaptation Layer
Section 6
Application Services
IP Technology

1. Network Time Protocol (NTP)
NTP strata hierarchy

[Figure: NTP strata hierarchy. Stratum 0 reference sources (GPS satellite, terrestrial radio source) feed the stratum 1 NTP servers. Below them, stratum 2 and stratum 3 devices act as both client and server, with peer relationships between devices of the same stratum, and plain clients appear at strata 2, 3 and down to 15. An internal hardware clock can be synchronized by an NTP source.]
NTP is a protocol designed to synchronize the clocks of computers over a network. This protocol has been specifically
designed for Internet environments and uses a client/server model to provide service. NTP version 3 is an Internet draft
standard, formalized in RFC 1305. NTP version 4 is a significant revision of the NTP standard; it has since been
formalized in RFC 5905.
At the top of any NTP hierarchy are one or more reference clocks. These are electronic clocks synchronized to a
common time reference, for instance, GPS signals, radio signals or extremely accurate frequency control. The accuracy
of the other clocks is judged according to how close that clock is to the reference clock (stratum), the network
latency to the clock and its claimed accuracy.
NTP uses the UDP protocol on port 123 for communication between clients and servers. Attempts are made at
designated intervals until the server responds. The interval ranges from once every minute up to once every 17 minutes
depending on a number of factors.
NTP works on a hierarchical model in which a small number of servers give time to a large number of clients. The clients
on each level, or stratum, are in turn potential servers to an even larger number of clients of a higher-numbered
stratum. Stratum numbers increase from the primary (stratum 1) servers to the high-numbered strata at the leaves of
the tree. Clients can use time information from multiple servers to automatically determine the best source of time
and prevent bad sources from corrupting their own time.
Servers that are directly connected to a reference clock are termed stratum 1. A reference clock connected to a
stratum 1 server is referred to as a stratum 0 device. Clients never communicate directly with a stratum 0 device; they
always go through a stratum 1 server synchronized to it.
Clients of stratum 1 servers are referred to as stratum 2 clients. If they serve time to clients, they are also referred to as
stratum 2 servers and the clients they serve are known as stratum 3 clients. This continues to higher-numbered strata.
The maximum NTP stratum number for a client is 15; however, in practice, it is rare to find clients with a stratum
number above 4 or 5 in most real-world configurations.
NTP operation principle

[Figure: NTP exchange between Device A and Device B (sync source).
Device A sends an NTP packet at 10:00:00 (transmission time, T1).
Device B receives it at 10:14:00 (reception time, T2).
Device B sends the response at 10:14:01 (response transmission time, T3).
Device A receives the response at 10:00:03 (response arrival time, T4).]

$\text{RTT delay (Device A)} = (T_4 - T_1) - (T_3 - T_2) = (10{:}00{:}03 - 10{:}00{:}00) - (10{:}14{:}01 - 10{:}14{:}00) = 2\ \text{s}$

$\text{Offset (Device A)} = \frac{(T_2 - T_1) + (T_3 - T_4)}{2} = \frac{(10{:}14{:}00 - 10{:}00{:}00) + (10{:}14{:}01 - 10{:}00{:}03)}{2} = 13\ \text{min}\ 59\ \text{s}$
The procedure for synchronizing system clocks is as follows:
1. Device_A sends an NTP packet to Device_B, carrying a timestamp that identifies the time when it is sent
(that is, 10:00:00, noted as T1).
2. When the packet arrives, Device_B inserts its own timestamp, which identifies 10:14:00 (noted as T2),
into the packet.
3. Before this NTP packet leaves, Device_B inserts its own timestamp once again, which identifies 10:14:01
(noted as T3).
4. When receiving the response packet, Device_A inserts a new timestamp, which identifies 10:00:03
(noted as T4), into it.

At this time, Device_A has enough information to calculate the following two parameters:
The delay for an NTP packet to make a round trip between Device_A and Device_B:
$(T_4 - T_1) - (T_3 - T_2)$
The time offset of Device_A with regard to Device_B:
$\frac{(T_2 - T_1) + (T_3 - T_4)}{2}$
Device_A can then set its own clock according to the above information to synchronize its clock to that of
Device_B.
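The two formulas can be checked against the timestamps of the example. The sketch below is a direct transcription of the formulas; the helper names `ntp_delay_and_offset` and `clock` are assumptions made for the example, and times are handled as same-day wall-clock values.

```python
from datetime import datetime

def ntp_delay_and_offset(t1, t2, t3, t4):
    """Round-trip delay and clock offset from the four NTP timestamps."""
    delay = (t4 - t1) - (t3 - t2)
    offset = ((t2 - t1) + (t3 - t4)) / 2
    return delay, offset

def clock(hhmmss):
    """Parse an HH:MM:SS string into a datetime (date part is irrelevant here)."""
    return datetime.strptime(hhmmss, "%H:%M:%S")

delay, offset = ntp_delay_and_offset(
    clock("10:00:00"),   # T1: Device_A sends the request
    clock("10:14:00"),   # T2: Device_B receives it
    clock("10:14:01"),   # T3: Device_B sends the response
    clock("10:00:03"),   # T4: Device_A receives the response
)
print(delay.total_seconds(), offset.total_seconds())
```

With these values the round trip delay is 2 seconds and the offset is 839 seconds, that is, 13 minutes 59 seconds.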

NTP operation modes

[Figure: NTP operation modes. Broadcast or multicast mode: the NTP server sends a periodical broadcast or multicast to NTP Client 1 and NTP Client 2. Polling or client/server mode: a client sends an NTP Request (1) and the NTP server returns an NTP Response (2).]
The bandwidth requirements for NTP are also minimal. Unencrypted NTP Ethernet packets are 90 bytes long
(76 bytes long at the IP layer). A broadcast server sends out a packet about every 64 seconds. A non-broadcast
client/server exchange requires 2 packets per transaction. When first started, transactions occur about
once per minute, slowing gradually to once per 17 minutes under normal conditions. Poorly synchronized
clients will tend to poll more often than well synchronized clients. In NTP version 4 implementations, the
minimum and maximum intervals can be extended beyond these limits, if necessary.
A unicast client sends a request to a designated server at its unicast address and expects a reply from which
it can determine the time and, optionally, the roundtrip delay and local clock offset relative to the
server.
A multicast server periodically sends an unsolicited message to a designated IPv4 or IPv6 local broadcast
address or multicast group address and ordinarily expects no requests from clients. A multicast client
listens on this address and ordinarily sends no requests.
For IPv4, the IANA has assigned the multicast group address 224.0.1.1 for NTP, which is used both by
multicast servers and anycast clients.
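The packet sizes quoted above follow from the 48-byte NTP header. The sketch below builds a bare client request and redoes the arithmetic. It rests on details taken from the NTP specification rather than from this guide: the first byte packs the leap indicator, version and mode, and NTP timestamps count seconds from 1900. The function names are hypothetical, and the header sizes assume IPv4 without options and an Ethernet header counted without its trailer.

```python
import struct

NTP_EPOCH_OFFSET = 2208988800  # seconds from 1900-01-01 (NTP) to 1970-01-01 (Unix)

def build_client_request(version=3):
    """48-byte NTP header for a client request (mode 3), other fields zero."""
    leap_indicator, mode = 0, 3
    first_byte = (leap_indicator << 6) | (version << 3) | mode
    return struct.pack("!B47x", first_byte)

def ntp_to_unix(seconds, fraction):
    """Convert a 64-bit NTP timestamp (two 32-bit words) to Unix time."""
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32

request = build_client_request()
ip_layer_size = len(request) + 8 + 20   # plus UDP header and IPv4 header
ethernet_size = ip_layer_size + 14      # plus Ethernet header
print(len(request), ip_layer_size, ethernet_size)
```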

2. File Transfer Protocol (FTP)
FTP: Data and Control Connections
[Figure: FTP client and FTP server across a TCP/IP network, with numbered steps 1 to 13. The client connects from an ephemeral port (1843) to the server's well-known control port 21, announces "Data port: 1955" and sends Get "file1" (> get file1). The data connection is then set up between the server's well-known data port 20 and the client's ephemeral port 1955, and File1 is transferred to the client.]
FTP is a standardized protocol (STD 9). It is described in RFC 959, File Transfer Protocol (FTP),
and updated by RFC 2228, FTP Security Extensions.
To access files on a remote station, the user must provide the server with user identification information.
The server is responsible for authenticating the information before authorizing access to the files.
FTP uses TCP as its transport protocol in order to offer reliable end-to-end connections.
Data can be transferred in both directions.
The FTP server waits for connection requests on port 21. Two connections are used:
The first one, the control connection (server port 21), carries the login and the FTP commands and replies,
and follows the conventions of the TELNET protocol.
The second one, the data connection, is for data transfer. In active mode, the server opens it from port 20
to the port announced by the Client.
Use of Passive Mode
In passive mode, data-connection establishment is reversed. The FTP server selects an (ephemeral) port
and informs the Client of this port number; the Client then opens the data connection to that port.
Establishing both the control connection and the data connection from the Client side facilitates
configuration, especially when a firewall is used.
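In both modes the data port is exchanged on the control connection as six decimal numbers: four for the IP address and two for the port, the port being the first number times 256 plus the second. This encoding comes from RFC 959. The sketch below shows it for the active-mode PORT command and for the passive-mode 227 reply; the function names and the IP addresses are example assumptions.

```python
import re

def port_command(ip, port):
    """Active mode: PORT command announcing the client's data address."""
    p1, p2 = divmod(port, 256)
    return "PORT " + ",".join(ip.split(".") + [str(p1), str(p2)])

def parse_pasv_reply(reply):
    """Passive mode: extract (ip, port) from a 227 reply sent by the server."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError("not a passive-mode reply")
    h1, h2, h3, h4, p1, p2 = (int(group) for group in match.groups())
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2

# Active mode with the client data port 1955 used in the slide.
command = port_command("192.0.2.10", 1955)
# Passive mode: the server announces the port it has selected.
server_address = parse_pasv_reply("227 Entering Passive Mode (192,0,2,20,19,137)")
print(command, server_address)
```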

Secure FTP
SSH File Transfer Protocol (sometimes called Secure File Transfer Protocol or SFTP) is a network protocol
that provides file transfer and manipulation functionality over any reliable data stream. It is typically
used with version two of the SSH protocol (TCP port 22) to provide secure file transfer, but is intended to
be usable with other protocols as well.
The SFTP protocol allows for a range of operations on remote files; it is more like a remote file system
protocol. An SFTP client's extra capabilities compared to an SCP client include resuming interrupted
transfers, directory listings, and remote file removal.
SFTP is not FTP run over SSH, but rather a new protocol designed from the ground up by the IETF SECSH
working group.
Note: WinSCP provides no mechanism for key generation. A program such as puttygen.exe is needed
to generate the key files.
3. Voice over IP (VoIP)
Role of RTP-RTCP
[Figure: role of RTP and RTCP. Each audio flow and each video flow is an RTP flow accompanied by its own RTCP flow, in each direction. In each terminal, the audio and video RTP/RTCP pairs run over UDP over IP.]
RTP: Real-time Transport Protocol
RTCP: Real-time Transport Control Protocol
Real-time Transport Protocol (RTP)
The aim of RTP is to provide a standardized means of transmitting real-time data (audio, video, etc.)
over IP. RTP's main role consists in providing sequence numbering for IP packets so that the voice or
video data can be reconstructed even if the underlying network has changed the order of the packets.
More generally, RTP enables:
identification of the type of information being transported,
addition of time stamps and sequence numbering to the information being transported,
checks to ensure that the packets have arrived at their destination.
In addition, RTP can be used with multicast packets to route conversations to multiple destinations.
In multimedia communications, each medium (voice, video, etc.) has an RTP session with an associated
RTCP feedback function. RTP sessions that have the same address are distinguished by different UDP
ports. For each participant, the session is defined by:
an IP address
a pair of UDP ports (RTP and RTCP)
Real-Time Transport Control Protocol (RTCP)
RTCP is a control protocol used in association with RTP. It measures performance but does not provide
any service guarantee. RTCP is based on the periodic transmission of control packets by all the participants in
a session. RTCP provides sender and receiver information.
It is a control protocol for RTP flows that enables the transmission of basic information about the
participants in a session and about Quality of Service.

RTP functions
[Figure: RTP functions between A and B across an IP network: recover the time base of real-time data, reorder packets (P3, P2, P1), detect lost packets (P3, P1), allow conferences.]
What RTP can do
RTP can:
recover the time base of audio, video and, more generally, real-time data flows,
quickly detect packet loss and inform the source of it (within a delay compatible with the application),
be transported in multicast packets in order to deliver media to multiple addressees.
RTP cannot:
act at router level,
control the Quality of Service (QoS),
reserve resources,
guarantee packet delivery or retransmit missing packets.

Problem Inherent to VoIP

[Figure: 20 voice samples are sent in order (1 to 20). After the network delay they arrive in the order 1, 2, 3, 4, 6, 7, 8, 10, 11, 12, 14, 9, 15, 18, 17, 19, 13, 20, 16. After the reconstruction delay the flow played out is 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 14, 15, 17, 18, 19, 20.]
Voice starts out as a synchronous flow.

This flow is broken down into packets to be transported over the network.
The sender has generated 20 packets. The receiver may receive the packets out of sequence as a result
of variations in transmission times (delay) through the network.
In addition to this variable transmission delay, time is required to reorganize and reconstruct the flow.
When the flow is reconstructed in this example, it is considered that:
packet 5 has been lost,
packets 13 and 16 have arrived too late to be included in the reconstructed flow.
The late arrival of these packets results in a lower level of quality.
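The three categories in this example (lost, late, out of sequence) can be derived from the packet numbers alone. The sketch below uses the arrival order and the reconstructed flow shown in the slide; the variable names are assumptions made for the example, and no playout timing is modelled.

```python
sent = list(range(1, 21))
# Order in which the packets leave the network (packet 5 never arrives).
arrived = [1, 2, 3, 4, 6, 7, 8, 10, 11, 12, 14, 9, 15, 18, 17, 19, 13, 20, 16]
# Packets actually present in the reconstructed flow.
played = [1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 14, 15, 17, 18, 19, 20]

lost = sorted(set(sent) - set(arrived))          # never received
too_late = sorted(set(arrived) - set(played))    # received but missed playout
# Packets that arrived after a packet with a higher number.
out_of_order = [n for prev, n in zip(arrived, arrived[1:]) if n < prev]

print(lost, too_late, out_of_order)
```

Packets 9 and 17 arrive out of sequence but early enough to be reinserted, whereas packets 13 and 16 arrive out of sequence and too late.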

Format of the RTP Message

[Figure: RTP header layout. First 32-bit word: V, P, X, CC, M, Payload Type, Sequence number. Following words: Timestamp, Synchronization Source Identifier (SSRC), Contributing Source Identifier (CSRC), then an optional profile-dependent extension with its size, and the data (payload).]
The sequence number allows for:
Detection of lost datagrams
Detection of duplications
Reordering of datagrams
The SSRC identifies the source (important in conference mode).

Payload Type   Codec          Media
0              G.711, µ-law   Audio
8              G.711, A-law   Audio
9              G.722          Audio
4              G.723          Audio
15             G.728          Audio
18             G.729          Audio
34             H.263          Video
31             H.261          Video
The Version field, V: 2 bits. Indicates the version of the protocol (V=2).
The Padding field, P: 1 bit. If P equals 1, the packet contains additional padding bytes at the end of the
payload.
The Extension field, X: 1 bit. If X equals 1, the fixed header is followed by a header extension.
The CSRC count field, CC: 4 bits. Contains the number of CSRCs that follow the fixed header.
The Marker field, M: 1 bit. Its meaning is defined by an application profile.
The Payload Type (PT) field: 7 bits. This field identifies the type of payload (audio, video, image, text,
html, etc.). See the IANA site ASSIGNED NUMBERS (http://www.iana.org/numbers.html) for the
various standardised codes (RTP Payload types (PT) for standard audio and video encodings).
The Sequence number field: 16 bits. Its initial value is random and it increments by one each time a
packet is sent. It can be used to detect packet loss.
The Timestamp field: 32 bits. Reflects the sampling instant of the first byte in the RTP packet. The
sampling instant must be derived from a clock that increments monotonically and linearly in time to
allow synchronisation and jitter calculations.
The SSRC field: 32 bits. Identifies the synchronisation source (or more simply the source). This identifier is
chosen randomly by the application and is intended to be unique amongst all the sources of the same session.
The CSRC list: 32 bits per identifier. Identifies the contributing sources (conference), that is, the sources
(SSRCs) which have contributed to the data contained in the packet. The number of identifiers is given in
the CC field.
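The field sizes listed above map directly onto a 12-byte fixed header followed by the CSRC list. The sketch below decodes them with bit masks. It is an illustration rather than a complete RTP parser: the function name `parse_rtp_header` is hypothetical, the header extension signalled by X is not decoded, and the sample packet is hand-built with invented values.

```python
import struct

def parse_rtp_header(packet: bytes):
    """Decode the fixed RTP header and the CSRC list that follows it."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack_from("!BBHII", packet, 0)
    cc = byte0 & 0x0F
    csrc = list(struct.unpack_from(f"!{cc}I", packet, 12)) if cc else []
    return {
        "version": byte0 >> 6,              # V: 2 bits
        "padding": (byte0 >> 5) & 1,        # P: 1 bit
        "extension": (byte0 >> 4) & 1,      # X: 1 bit
        "csrc_count": cc,                   # CC: 4 bits
        "marker": byte1 >> 7,               # M: 1 bit
        "payload_type": byte1 & 0x7F,       # PT: 7 bits
        "sequence_number": seq,             # 16 bits
        "timestamp": timestamp,             # 32 bits
        "ssrc": ssrc,                       # 32 bits
        "csrc": csrc,                       # CC x 32 bits
        "payload_offset": 12 + 4 * cc,      # valid when X = 0
    }

# Sample: version 2, payload type 8 (G.711 A-law), 160 bytes of payload.
sample = struct.pack("!BBHII", 0x80, 8, 1000, 160, 0x11223344) + b"\x00" * 160
header = parse_rtp_header(sample)
print(header)
```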

Timestamp: Jitter Control
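[Figure: two consecutive RTP packets on a 0 / 20 ms / 40 ms time axis. Sampling: 8 kHz, so the timestamp unit is 125 µs. The first packet has Timestamp 0 and a payload of 160 samples starting at the 1st sample; the second packet has Timestamp 160, since 160 samples x 125 µs = 20 ms.]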
The 32-bit Timestamp field reflects the sampling instant of the first byte in the RTP packet. The
sampling instant must be derived from a clock that increments monotonically and linearly in time to
allow synchronization and jitter calculations.
The initial value of the Timestamp should be random.
The sampling frequency is defined for each type of payload.
For most audio codecs, it is generally 8000Hz.
For H.261, it is 90kHz.
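The relationship between clock rate, packet duration and timestamp can be computed directly, and the timestamps then feed a jitter estimate. The jitter formula used below is the interarrival jitter estimator of RFC 3550, which is not described in this guide; the function names and the arrival times are example assumptions.

```python
def timestamp_increment(clock_rate_hz, packet_ms):
    """Timestamp units covered by one packet of the given duration."""
    return clock_rate_hz * packet_ms // 1000

def update_jitter(jitter, previous, current):
    """RFC 3550 style running jitter estimate.

    previous and current are (arrival_time, rtp_timestamp) pairs,
    both expressed in timestamp units.
    """
    # Difference between the spacing at arrival and the spacing at sending.
    d = (current[0] - previous[0]) - (current[1] - previous[1])
    return jitter + (abs(d) - jitter) / 16

audio_step = timestamp_increment(8000, 20)     # 8 kHz audio, 20 ms packets
video_step = timestamp_increment(90000, 40)    # 90 kHz clock, 40 ms frames

# Second packet arrives 16 units (2 ms at 8 kHz) later than its spacing implies.
jitter = update_jitter(0.0, previous=(0, 0), current=(176, 160))
print(audio_step, video_step, jitter)
```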

Voice Sampling
[Figure: analog voice signal sampled at regular intervals and quantized to 3-bit codes.]
Frequency: 8 kHz, i.e. 8,000 samples per second (t = 0.125 ms between samples)
Amplitude: 3 bits, i.e. 8 different values
Result: 101 100 100 101 000 010 011 011 011 001 110 101 100
The method used to digitize an analog signal such as voice depends on 2 parameters: frequency and
amplitude.
Together these parameters determine the quality of the sample and the amount of information required to
reconstruct the message.

Problems
Decoding: 101 100 100 101 000 010 011 011 011 001 110 101 100
[Figure: the signal decoded from the 3-bit codes, compared with the original signal.]
How might this be resolved?
Increase the number of samples and the amplitude, but the bandwidth used increases.
Compression required
When an analog signal is reconstructed using digital information, the reconstructed signal differs from the
original one.
If we want to produce a digital signal that is closer to the original, the number of samples and the
number of amplitude values must be increased. However, this means that the amount of information to be
transported also increases.
One answer could be to compress the information.
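The trade-off between fidelity and bandwidth can be measured on a test tone. The sketch below quantizes a sine wave with evenly spaced levels and compares 3 bits per sample with 8 bits per sample at 8 kHz. The function names, the 440 Hz tone and the uniform quantizer are assumptions chosen for the illustration; real voice codecs use more elaborate schemes.

```python
import math

def quantize(value, bits):
    """Map a value in [-1, 1] to the nearest of 2**bits evenly spaced levels."""
    levels = 2 ** bits
    step = 2 / (levels - 1)
    index = round((value + 1) / step)
    return -1 + index * step

def max_error(sample_rate_hz, bits, tone_hz=440, duration_s=0.01):
    """Largest gap between a sampled sine wave and its quantized version."""
    count = int(sample_rate_hz * duration_s)
    samples = [math.sin(2 * math.pi * tone_hz * i / sample_rate_hz)
               for i in range(count)]
    return max(abs(s - quantize(s, bits)) for s in samples)

def bit_rate(sample_rate_hz, bits):
    """Bits per second needed to carry the uncompressed samples."""
    return sample_rate_hz * bits

coarse_error, coarse_rate = max_error(8000, 3), bit_rate(8000, 3)
fine_error, fine_rate = max_error(8000, 8), bit_rate(8000, 8)
print(coarse_error, coarse_rate, fine_error, fine_rate)
```

Going from 3 to 8 bits per sample reduces the reconstruction error but raises the bit rate from 24 kbit/s to 64 kbit/s, which is why compression becomes attractive.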

Compression Carrot Soup
5 carrots
1 onion chopped
20 g butter
1 chicken stock cube
1 potato
150 g chicken
1 litre water
Salt and pepper
How should the list of ingredients be written to make carrot soup?
To illustrate the various compression families, let's consider the example of 3 chefs who wish to write
down the list of ingredients required to make carrot soup.

Compression Carrot Soup [cont.]
First method
Ingredients:
Ingredients:
55 carrots
carrots
11 onion
onion chopped
chopped
20
20 gg butter
butter
11 chicken
chicken stock
stock cube
cube
11 potato
potato
150
150 gg chicken
chicken
11 litre
litre water
water
Salt
and
Salt and pepper
pepper
6 19
5 carrots
1 onion chopped
20 g butter
1 potato
150 g chicken
1 litre water
Salt and pepper
92 characters
The first chef writes the list of ingredients as it is. He uses 92 characters.

Second method
Lexicon:
C = carrot
O = onion chopped
B = butter
P = chicken stock cube
T= potato
PL = 150 g chicken
E = water
SP = salt and pepper
Ingredients:
5 C
1 O
20 g of B
1 P
1 T
1 PL
1 E
SP
21 characters
The second chef uses a standard lexicon from a recipe book and writes the list of ingredients using the
lexicon. He uses 21 characters.

Third method
Lexicon:
C = carrot
O = onion chopped
B = butter
P = chicken stock cube
T= potato
PL = 150 g chicken
E = water
SP = salt and pepper
The chicken stock cube and the butter do not give this soup much taste
Ingredients:
5 C
1 O
1 T
1 PL
1 E
SP
13 characters
The third chef uses a standard lexicon from a recipe book and, drawing on his experience, determines which
ingredients add the least taste to the soup. He then writes the list of ingredients, leaving some
of them out because they don't make much difference to the soup. He uses 13 characters.

Associate each chef with his method for writing the recipe.
Character counts: 13, 21, 92
Methods: no compression; compression with shared lexicon; destructive compression with shared lexicon

Destructive Codec
5 carrots
1 onion chopped
20 g butter
1 potato
150 g chicken
1 litre water
Salt and pepper
The chicken stock cube and the butter do not give this soup much taste
5 carrots
1 potato
150 g chicken
1 litre water
Salt and pepper
The onion does not give this soup much taste
Ingredients:
5 C
1 O
1 T
1 PL
1 E
SP
Ingredients:
5 C
1 T
1 PL
1 E
SP
When a chef decides which ingredients must be removed, he changes the list of ingredients slightly. He
therefore changes the original recipe. Another chef may then read the recipe and also decide to change it.
The soup could end up tasting different from the original recipe.

Destructive Codec [cont.]
They all make carrot soup but the quality of the soup is subjective
Which is the nicest?
The recipe has been changed so it must now be decided which is the nicest carrot soup.

Quality: R Factor - MOS
R Factor   MOS (Mean Opinion Score)   User satisfaction
90 - 100   4.1 - 5.0                  Very satisfied
80 - 90    3.7 - 4.1                  Satisfied
70 - 80    3.4 - 3.7                  Some users dissatisfied
60 - 70    2.9 - 3.4                  Many users dissatisfied
50 - 60    2.4 - 2.9                  Nearly all users dissatisfied
Below 50   Below 2.4                  Not recommended
(Slide annotation: "More reliable")
The MOS terminology is defined by ITU-T P.800.1.
The PESQ (Perceptual Evaluation of Speech Quality) MOS is defined by ITU-T P.862.
R Factor
The ITU-T has defined a model for assessing the quality of a codec. This benchmark can be used to
compare the quality of one codec with another.
The R factor is calculated on a scale of 0 to 100 (E-model) based on user perception: 100 is excellent and 0
is poor. The R factor calculation begins with an unimpaired signal. If there is no network or equipment, quality is
perfect.
This is expressed by the equation:
R = Ro (e.g. 93.2)
But the network and equipment impair the signal, thus reducing signal quality as it travels from one end to
the other:
R = Ro - Is - Id - Ie-eff + A, where:
Ro: represents the basic signal-to-noise ratio, including noise sources such as circuit noise and room
noise.
Is: a combination of all impairments which occur more or less simultaneously with the voice
signal.
Id: represents the impairments caused by delay.
Ie-eff: the effective equipment impairment factor, which represents impairments caused by low bit-rate
codecs. It also includes impairment due to packet losses of random distribution.

A: the Advantage factor, which allows for compensation of impairment factors when the user benefits from
other advantages of access.
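The E-model sum above can be sketched in a few lines of Python. This is only an illustration: the function name `r_factor` and the impairment values used in the usage note are assumptions chosen for the example, not figures from this guide. Only the default Ro = 93.2 comes from the text.

```python
def r_factor(ro=93.2, i_s=0.0, i_d=0.0, ie_eff=0.0, a=0.0):
    """Return R = Ro - Is - Id - Ie-eff + A, clamped to the 0..100 scale.

    ro      basic signal-to-noise ratio (93.2 is the example value in the text)
    i_s     simultaneous impairments (Is)
    i_d     delay impairments (Id)
    ie_eff  effective equipment impairment (low bit-rate codec, packet loss)
    a       advantage factor (A)
    """
    r = ro - i_s - i_d - ie_eff + a
    # The R scale is defined from 0 (poor) to 100 (excellent).
    return max(0.0, min(100.0, r))


if __name__ == "__main__":
    # Hypothetical impairments, chosen only to show the arithmetic.
    print(r_factor())                        # unimpaired signal
    print(r_factor(i_d=5.0, ie_eff=10.0))    # delay plus codec impairment
```

With no impairments the function returns Ro unchanged. Each impairment term lowers R, and the advantage factor raises it.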
Mean Opinion Score (MOS)
MOS is based on a subjective test. It corresponds to voice quality as perceived by testers, who give a
score between 1 and 5. PESQ (Perceptual Evaluation of Speech Quality) is a related MOS measure defined by ITU-T P.862.

Coding Rate
G.711
Amplitude encoding: 8 bits
Sampling: 8 kHz (t = 0.125 ms)
Bit rate: 64 kbps
Encoding delay: 125 µs
These codecs do not use the compression method. This means that the rate is calculated using the
following formula:
Rate = amplitude bits x sampling frequency

G.711 is the reference codec. It works as previously described with an amplitude of 8 bits and a sampling
frequency of 8 kHz.
G.726 Adaptive Differential Pulse Code Modulation (ADPCM) uses a compression whereby only the
difference between two samples is encoded. In this case, the amplitude can be reduced to 2 bits with an
acceptable loss of quality.
The most common rate for this codec is 32 kbps.
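The rate formula above is simple enough to check numerically. The sketch below is illustrative only: the function name `pcm_rate_kbps` is an assumption. It applies Rate = amplitude bits x sampling frequency to G.711, and then shows that the same arithmetic with 2 to 5 bits per sample at 8 kHz gives 16 to 40 kbps, the range quoted for G.726.

```python
def pcm_rate_kbps(bits_per_sample, sampling_hz=8000):
    """Rate = amplitude bits x sampling frequency, expressed in kbps."""
    return bits_per_sample * sampling_hz / 1000


if __name__ == "__main__":
    # G.711: 8 bits per sample at 8 kHz.
    print("G.711:", pcm_rate_kbps(8), "kbps")
    # G.726 ADPCM: 2 to 5 bits per encoded difference, still sampled at 8 kHz.
    for bits in range(2, 6):
        print("G.726 with", bits, "bits:", pcm_rate_kbps(bits), "kbps")
```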

Coding Rate [cont.]
G.726 ADPCM
Defines the differences between two samples
Amplitude: 2 - 5 bits
Sampling: 8 kHz
Rate: 16 - 40 kbps (usually 32 kbps)
Encoding delay: 125 µs
[Figure: absolute sample values compared with the difference between sample values, starting from sample no. 1]
These codecs use compression methods, which means that the formula used for the previous codec is not
applicable.
G.726 ADPCM (Adaptive Differential Pulse Code Modulation) uses a compression whereby only the
difference between two samples is encoded. In this case, the amplitude can be reduced to 2 bits with an
acceptable loss of quality.
The information is then compressed using a lexicon and an algorithm that models human voice production with
a mathematical model.
As soon as the receiver and the sender have agreed on the lexicon to be used, only the vocal cord
impulses are sent.

Coding Rate [cont.]
Destructive compression with shared lexicon
+ VAD / SID / CNG
(Voice Activity Detection / Silence Insertion Descriptor / Comfort Noise Generation)
G.729
Sampling: 8 kHz
20 bytes every 20 ms
2 bytes every 20 ms during silences
Rate: 8 kbps
Encoding delay: 15 ms
AMR
Sampling: 8 kHz
Between 95 and 244 bits every 20 ms
39 bits every 160 ms during silences
Rate: 4.75 - 12.2 kbps
Encoding delay: 20 ms
These codecs use compression methods, which means that the formula used for the previous codec is not
applicable.
G.729 samples the voice using a method similar to G.711. The information is then compressed using a
lexicon and an algorithm that models human voice production with a mathematical model.
As soon as the receiver and the sender have agreed on the lexicon to be used, only the vocal cord
impulses are sent.
Adaptive Multi-Rate (AMR) uses a compression similar to G.729. However, the rate is not fixed: 8
levels of quality and data rate have been defined (from AMR 4.75 kbps to 12.2 kbps).

Frequency Hiding
[Figure: signal level plotted against frequency, showing a high-level sound next to a lower-level sound at a neighboring frequency]
The human ear will not hear this neighboring frequency sound: it can therefore be hidden
The encoding of sound is based on human hearing. Among its principal properties, three are used
to compress a sampled audio flow:
Sensitivity to certain frequencies: the human ear is not designed to hear certain frequencies.
Recall that the frequency of a sound indicates its pitch, which is similar to the colour of an object
(the colour itself being due to a frequency). A high-pitched sound has a high frequency, whereas a low-pitched
sound has a low frequency.
Certain sounds are too high-pitched to be perceived by the human ear. These are known as ultrasound, which
certain animals can perceive.
Other sounds are too low-pitched to be heard. These are known as infrasound. Inaudible sounds do not require
encoding.
Frequency hiding: a strong sound will hide a lower-level sound with a close frequency.

Temporal Hiding
[Figure: signal level plotted against time, showing a high-level sound followed by a lower-level sound]
This sound comes after a high-level sound with the same frequency and will therefore not be audible
by the human ear: it can be hidden
Temporal hiding: the ear also tends to mask sounds produced just before or after the emission of a
relatively strong noise.
This noise drowns out any sound emitted afterwards. These sounds are not perceived by the human ear and
therefore do not require encoding.

Codec Quality
Codec             R factor   MOS    User satisfaction
G.711 (64 kbps)   89.3       4.1    Satisfied
AMR (12.2 kbps)   84.3       3.90   Satisfied
G.726 (32 kbps)   82.3       3.85   Satisfied
G.729             68.8       3.27   Many users dissatisfied
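The R factor scale from the quality slide can be turned into a small lookup to classify each codec. This is a sketch under the assumption that each satisfaction label covers the 10-point band below its upper tick (90 to 100 "Very satisfied", 80 to 90 "Satisfied", and so on). The names `R_BANDS` and `satisfaction` are illustrative.

```python
# Lower bound of each band on the R scale, highest band first (assumed banding).
R_BANDS = [
    (90, "Very satisfied"),
    (80, "Satisfied"),
    (70, "Some users dissatisfied"),
    (60, "Many users dissatisfied"),
    (50, "Nearly all users dissatisfied"),
]


def satisfaction(r):
    """Return the user satisfaction label for an R factor value."""
    for floor, label in R_BANDS:
        if r >= floor:
            return label
    return "Not recommended"


if __name__ == "__main__":
    # R factors quoted in this section for each codec.
    codecs = {"G.711": 89.3, "AMR 12.2": 84.3, "G.726": 82.3, "G.729": 68.8}
    for name, r in codecs.items():
        print(name, r, satisfaction(r))
```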

Packet Sizes
Packet size per 20 ms:
Codec                   Without VAD   With VAD
G.711 (64 kbps)         160 bytes     -
G.726 ADPCM (32 kbps)   80 bytes      -
AMR (12.2 kbps)         30.5 bytes    20 bytes
G.729 (8 kbps)          20 bytes      14 bytes
VAD: Voice Activity Detection (35% silence)

Voice over IP (VoIP) sends information in packets. To understand the calculations of traffic over IP, imagine
that you are filling a cup from a jug. The flow from the jug is constant, but a drop leaves the cup every 20
ms.
To calculate the rate over IP we need to convert the codec rate into the number of bytes transferred
during 20 ms.
With AMR:
12.2 kbps means 244 bits every 20 ms (12.2 kbps x 20 ms)
Because we need bytes instead of bits, 244 bits / 8 = 30.5 bytes

If we assume 35% silence, this means that 30.5 bytes are transferred during 65% of the time.
So 30.5 x 65% = 19.82 bytes

But during the silence (35% of the time), we transfer 39 bits (about 5 bytes) every 160 ms (equivalent to 0.6
bytes every 20 ms)
So 0.6 x 35% = 0.21 bytes

Then, if we consider AMR with 35% silence: 19.82 + 0.21 = ~20 bytes every 20 ms
G.711 and G.726 do not include VAD.
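The averaging above can be reproduced with a short function. This is a sketch: the function name and parameter names are assumptions, and the 35% silence ratio is the one used in this section. The G.729 figures (20 bytes per 20 ms when active, 2 bytes per 20 ms during silences) come from the Coding Rate slide.

```python
def avg_bytes_per_frame(active_bits, silence_bits, silence_interval_ms,
                        silence_ratio=0.35, frame_ms=20):
    """Average payload per frame period when VAD is used.

    active_bits          bits sent every frame_ms while the user is talking
    silence_bits         bits sent every silence_interval_ms during silences
    silence_ratio        fraction of time spent in silence (35% in the text)
    """
    active_bytes = active_bits / 8
    # Scale the silence descriptor down to one frame period.
    silence_bytes = (silence_bits / 8) * frame_ms / silence_interval_ms
    return active_bytes * (1 - silence_ratio) + silence_bytes * silence_ratio


if __name__ == "__main__":
    # AMR 12.2: 244 bits per 20 ms, 39 bits every 160 ms during silences.
    print("AMR  :", round(avg_bytes_per_frame(244, 39, 160), 2), "bytes per 20 ms")
    # G.729: 160 bits per 20 ms, 16 bits (2 bytes) every 20 ms during silences.
    print("G.729:", round(avg_bytes_per_frame(160, 16, 20), 2), "bytes per 20 ms")
```

Rounded to whole bytes, the two results correspond to the 20-byte and 14-byte VAD packet sizes quoted for AMR and G.729.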

IP Overhead
[Figure: packet layout for each codec, with sizes in bytes as labelled on the slide: IP (20), UDP and RTP (12), followed by the payload: G.711 (160), G.726 (80), AMR carried over NbUP (30.5, or 20 with VAD), G.729 (14 with VAD)]
G.711: useful bandwidth 64 kbps, used bandwidth 81 kbps
AMR: useful bandwidth 12.2 kbps
As shown previously, voice is transferred over IP using a set of layers:

NbUP is a protocol used only with AMR. It defines the rate of AMR transported, and expected.
RTP is a datagram protocol that is designed for real-time data such as streaming audio and video.
UDP is a connectionless datagram protocol. It is a "best effort" or "unreliable" protocol - not because it is
particularly unreliable, but because it does not verify that packets have reached their destination, and
gives no guarantee that they will arrive in order.
IP performs the basic task of getting packets of data from source to destination. IP can carry data for a
number of different higher level protocols; these protocols are each identified by a unique IP Protocol
Number.
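The effect of the protocol layers on bandwidth can be estimated with a rough sketch. The header sizes used here are assumptions, not values taken from this guide: they are the standard minimum sizes of an IPv4 header without options (20 bytes), a UDP header (8 bytes), and a fixed RTP header (12 bytes). NbUP and any layer-2 framing are ignored, so the results will not exactly match the slide (for example, this gives 80 kbps for G.711 where the slide quotes 81 kbps).

```python
# Assumed minimum header sizes in bytes (not taken from this guide).
IP_HEADER = 20    # IPv4 without options
UDP_HEADER = 8
RTP_HEADER = 12   # fixed RTP header, no CSRC list or extensions


def ip_bandwidth_kbps(payload_bytes, frame_ms=20, extra_bytes=0):
    """Bandwidth on the IP layer for one voice packet every frame_ms.

    extra_bytes can hold any additional per-packet overhead (hypothetical
    parameter, for example an NbUP header or layer-2 framing).
    """
    total_bytes = payload_bytes + IP_HEADER + UDP_HEADER + RTP_HEADER + extra_bytes
    # bits per millisecond is numerically equal to kbps
    return total_bytes * 8 / frame_ms


if __name__ == "__main__":
    for codec, payload in [("G.711", 160), ("G.726", 80),
                           ("AMR with VAD", 20), ("G.729 with VAD", 14)]:
        print(codec, ip_bandwidth_kbps(payload), "kbps")
```

The smaller the codec payload, the larger the share of the 40 header bytes in the total, which is why low bit-rate codecs gain less on IP than their codec rate alone suggests.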

Benefit
Codec   R factor (voice quality)   Number of calls using STM-1
G.711   89.3                       1848
G.726   82.3                       2995
AMR     84.3                       5546
G.729   68.9                       6511
Number of TDM calls (G.711) using STM-1 = 1953 (63 x 31)
This diagram shows that with STM-1 (149 Mb/s), it is worth using voice over IP only if you don't use G.711.
In fact, with STM-1 you can transport 63 PCMs using G.711, which means 31 TSs x 63 PCMs = 1,953 calls.
VoIP becomes viable with G.726, AMR and G.729, but with G.729 quality decreases to the R factor
category "Many users dissatisfied".
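The comparison with TDM can be checked with a few lines of Python. This is a sketch: the VoIP call counts are read from the chart on this slide, and their assignment to each codec (calls increasing as the codec rate decreases) is an assumption. The names `TDM_CALLS`, `VOIP_CALLS`, and `gain_vs_tdm` are illustrative.

```python
# TDM capacity of an STM-1: 63 PCMs x 31 time slots per PCM.
TDM_CALLS = 63 * 31

# Calls per STM-1 read from the chart (codec assignment is an assumption).
VOIP_CALLS = {"G.711": 1848, "G.726": 2995, "AMR": 5546, "G.729": 6511}


def gain_vs_tdm(calls):
    """Ratio of VoIP calls to TDM calls on the same STM-1 (above 1 means a gain)."""
    return calls / TDM_CALLS


if __name__ == "__main__":
    print("TDM calls:", TDM_CALLS)
    for codec, calls in VOIP_CALLS.items():
        print(codec, calls, "calls, gain x", round(gain_vs_tdm(calls), 2))
```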

Which is the most reliable way to compare the quality of different speech coders?
A variety of electrical measurements
An electrical technique called Perceptual Speech Quality Measurement (PSQM)
Human listening tests rated by Mean Opinion Scores (MOSs)
Percentage of Completed Calls Dropped

End of Section
Section 7
Quality of Service
IP Technology

7. QoS in IP networks
Why implement Quality of Service?
[Figure: VoIP, broadcast TV, streaming video, and audio/video conferencing carried over an IP network connected to a PBX and the PSTN/ISDN]
Quality of Service on IP is currently a highly topical subject.

When IP was designed, it was primarily intended for non-real-time communication such as e-mail and
FTP. It provided "best effort" service: the network does the best it can to deliver the packets but offers
no guarantees and only a single service level, so some packets may be lost or may
arrive late. Performance is acceptable only when the network is lightly loaded.
A switched network dedicates resources specifically to two end users for the entire length of their call. A
packet network using statistical multiplexing allocates resources only when input sources offer something
to send. This characteristic of statistical multiplexing increases the probability that the
network sometimes receives incoming packets at a rate greater than it can process. The packets then wait in a
buffer, or queue, until they are processed.
Now that switched networks and packet networks are converging, real-time traffic
such as voice and video must be carried over packet networks.
Quality of Service must be implemented to support these new types of traffic:
Interactive: audio and video (videoconference)
Non-interactive: audio and video streaming (radio, TV)
Until now, IP networks were dedicated to transporting data, and only best effort was provided to the users.
As networks converge and voice and video applications are carried over IP, delay affects
interactive conversation.
The ITU says that a packet delay of 150-200 ms degrades the interactivity of a conversation.
IP needs enough intelligence to differentiate one packet from another and provide different service levels
based on the requirements of the applications.

What is Quality of Service?
Operational means of differentiating packet flows by bounding:
delay
throughput
jitter
packet discard probability
Priority mechanisms must be put in place between the different types of traffic:

real time, and
non-real time.
To tackle this subject, several factors must be examined:

Various approaches to QoS

Admission control
Prioritization
Queuing
Work of the IETF

Integrated services
Differentiated services
Applications
RTP and RTCP
What is Quality of Service?

A means of differentiating packet flows in terms of:
throughput,
delay (beyond 150 ms, interactive conversation is degraded),
jitter,
probability of packet loss.

Delay and jitter
Delay: amount of time it takes the packet to get through the network
Delay = f(route, line speed, queue size, network load)
Jitter = delay variation
Jitter = f(route, line speed, queue size, network load)
Performance degradation is primarily due to the queues.

The only properties on which the network can act are:
The network access policy with regard to load
The management of the queues in the network equipment
The Quality of Service mechanisms act on:

The control of traffic entering the network ("traffic admission control")
The way these queues are managed ("queue management") and the assignment of priorities
The treatment applied to packets when the queue is full ("tail drop")
The access policy at the output port
...
Packets arriving at network equipment (switch, router) are placed in a queue on the ingress interface, then
processed, before being placed in a queue on the egress interface.
Definitions:
Delay: the end-to-end transmission time of packets in the network is a function not only of the
distance separating the two communicating entities and of the rate of the links used, but also
of the size of the queues and the load on the network.
Jitter: each packet of the same flow can be delayed by a different amount of time. Jitter corresponds to this
variation in packet transmission time. Jitter is also a function of the size of the queues and the
load on the network. A packet crossing network equipment with a very short queue will be sent
much more quickly on the egress interface, especially if the network load is not very high. Another packet
taking a different route and crossing routers with longer queues will be delayed much
longer.
Throughput: the throughput mentioned in the SLA contract is the guaranteed throughput. Certain
packets can be discarded if the queue is full, which affects the throughput.
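To make the jitter definition concrete, the sketch below estimates jitter from packet send and receive times. This guide does not give a formula, so the smoothed interarrival jitter estimator of RFC 3550 (the RTP specification) is used here as an assumption. The function name and the sample timestamps are illustrative.

```python
def interarrival_jitter(send_times, recv_times):
    """Smoothed jitter estimate in the style of RFC 3550.

    For consecutive packets, D is the difference between the spacing at the
    receiver and the spacing at the sender. The estimate moves 1/16 of the
    way towards |D| for each packet.
    Times may be in any unit (ms here); the result is in the same unit.
    """
    jitter = 0.0
    for i in range(1, len(send_times)):
        d = (recv_times[i] - recv_times[i - 1]) - (send_times[i] - send_times[i - 1])
        jitter += (abs(d) - jitter) / 16
    return jitter


if __name__ == "__main__":
    sent = [0, 20, 40]                               # one voice packet every 20 ms
    print(interarrival_jitter(sent, [50, 70, 90]))   # constant 50 ms delay
    print(interarrival_jitter(sent, [50, 70, 95]))   # third packet 5 ms late
```

A constant delay, however large, produces no jitter. Only the variation between packets contributes.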
Causes of the delay
Switching delay: of the order of tens of µs and is therefore negligible
Serialization delay
1500-byte frame on a low-speed link (64 kb/s): 187 ms
1500-byte frame on a high-speed link (10 Gb/s): 1.2 µs
Propagation delay
5 ms per 1000 km
Queuing delay
varies with queue occupancy
Delay variation results exclusively from variation in the queuing delay at every hop
In advanced high-speed routers, the switching delay is of the order of tens of microseconds and is therefore
negligible. Thus, the one-way delay in a network is caused by three main components:
Serialization delay at each hop: this is the time it takes to clock all the bits of the packet onto the wire.
This is very significant on a low-speed link (187 milliseconds (ms) for a 1500-byte packet on a 64-kbps
link) and is entirely negligible at high speeds (1.2 microseconds for a 1500-byte packet on a 10-Gbps link).
For a given link, this is clearly a fixed delay.
Propagation delay end-to-end: this is the time it takes for the signal to physically propagate from one end
of the link to the other. This is constrained by the speed of light on fiber (or the propagation speed of
electrical signals on copper) and is about 5 ms per 1000 km. Again, for a given link, this is a fixed delay.
Queuing delay at each hop: this is the time spent by the packet in an egress queue waiting for
transmission of other packets before it can be sent on the wire. This delay varies with queue occupancy,
which in turn depends on the packet arrival distribution and queue service rate.
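The three delay components can be combined in a short sketch that reproduces the figures quoted above. The function names are illustrative, and the queuing delay is simply passed in as a number because it depends on the load at each hop.

```python
def serialization_delay_s(frame_bytes, link_bps):
    """Time to clock all the bits of a frame onto the wire, in seconds."""
    return frame_bytes * 8 / link_bps


def propagation_delay_s(distance_km, ms_per_1000_km=5.0):
    """Propagation time using the rule of thumb of about 5 ms per 1000 km."""
    return distance_km / 1000 * ms_per_1000_km / 1000


def one_way_delay_s(frame_bytes, link_rates_bps, distance_km, queuing_s=0.0):
    """Sum of serialization at each hop, end-to-end propagation, and queuing."""
    serialization = sum(serialization_delay_s(frame_bytes, r) for r in link_rates_bps)
    return serialization + propagation_delay_s(distance_km) + queuing_s


if __name__ == "__main__":
    print(serialization_delay_s(1500, 64_000))           # low-speed link, in seconds
    print(serialization_delay_s(1500, 10_000_000_000))   # 10 Gb/s link, in seconds
    print(propagation_delay_s(1000))                     # 1000 km, in seconds
    print(one_way_delay_s(1500, [64_000], 1000))         # one slow hop over 1000 km
```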

QoS requirements
Telephony application
Mouth-to-ear max delay < 150 ms
Telephony application delay (packetization time, codec encoding, ...): 40 ms
Delay + jitter < 110 ms
0.1% < Packet loss < 0.5% (undetectable)
Interactive applications
300 ms < Delay < 400 ms
Jitter not really relevant
0.5% < Packet loss < 1% (involves rare retransmission)
Non-interactive applications
0.1% < Packet loss < 0.5% (drives the throughput via TCP)
Delay and jitter irrelevant
Although many applications using a given network may each potentially have their own specific QoS
requirements, they can actually be grouped into a limited number of broad categories with similar QoS
requirements. These categories are called classes of service. The number and definition of such classes of
service is arbitrary and depends on the environment.
In the context of telephony, we'll call the delay between when a sound is made by a speaker and when that
sound is heard by a listener the mouth-to-ear delay. Telephony users are very sensitive to this mouth-to-ear delay because it might impact conversational dynamics and result in echo. A mouth-to-ear delay
below 150 ms results in very high-quality perception for the vast majority of telephony users. Hence, this
is used as the design target for very high-quality voice over IP (VoIP) applications. Less-stringent design
targets are also used in some environments where good or medium quality is acceptable.
Because the codec on the receiving VoIP gateway effectively needs to decode a constant rate of voice
samples, a de-jitter buffer is used to compensate for the delay variation in the received stream. This
buffer effectively turns the delay variation into a fixed delay. VoIP gateways commonly use an adaptive
de-jitter buffer that dynamically adjusts its size to the delay variation currently observed. This means
that the delay variation experienced by packets in the network directly contributes to the mouth-to-ear
delay.
Therefore, assuming a delay budget of 40 ms for the telephony application itself (packetization time, voice
activity detection, codec encoding, codec decoding, and so on), you see that the sum of the VoIP one-way
delay target and the delay variation target for the network for high-quality telephony is 110 ms end to
end (including both the core and access links).
Assuming random distribution of loss, a packet loss of 0.1- 0.5 % results in virtually undetectable, or very
tolerable, service degradation and is often used as the target for high-quality VoIP services.
For interactive mission-critical applications, an end-to-end RTT on the order of 300-400 ms is usually a
sufficient target to ensure that an end user can work without being affected by network-induced delay.
Delay variation is not really relevant. A loss ratio of about 0.5-1% may be targeted for such applications,
resulting in sufficiently rare retransmissions.
For noninteractive mission-critical applications, the key QoS element is to maintain a low loss ratio (with
a target in the range of 0.1-0.5 %) because this is what drives the throughput via the TCP congestion
avoidance mechanisms. Only loose commitments on delay are necessary for these applications, and delay
variation is irrelevant.
Admission control and Queue management
[Figure: ping delay between the US and Europe over 24 hours, in ms (200 to 1000), with curves for the highest, second highest, second lowest, and lowest delay; the uncontrollable delay and the controllable delay are marked]
In the absence of routing change, because the serialization delay and propagation delay are fixed by physics
for a given path, the delay variation in a network results exclusively from variation in the queuing delay
at every hop. In the event of a routing change, the corresponding change of the traffic path is likely to
result in a sudden variation in delay.
The uncontrollable delay is due to:
the topology of the network
the travel time (primarily a function of the distance)
the bandwidth of the connections
the size of the packets sent by the applications.
The controllable delay is generated by:

the latency of the packets in the buffers (queues)

Line speed and delay
Interleaving on a slow-speed link (56 kb/s)
1: 1500-byte data frame sent at t0
2: 66-byte voice frame sent at t0 + 10 µs
3: 214 ms serialization delay!!
Interleaving on a fast-speed link (1 Gb/s)
1: 1500-byte data frame sent at t0
2: voice frame sent at t0 + 10 µs
3: 12 µs serialization delay!!
Gigabit Ethernet changes the way you look at statistical multiplexing. Let me remind you that a standard
Ethernet frame is 1500 bytes. This means a 1500 byte packet can be transmitted in roughly 12
microseconds across a 1 Gbps link, assuming that the link actually delivers the full 1 Gbps. In reality,
8B/10B encoding reduces that a little, but not significantly. If a voice packet has to wait around for even
100 microseconds before it can be forwarded, who cares? You need to deliver voice packets in 150,000
microseconds, end to end to keep your voice users happy, otherwise, they will complain about delay.
1- A 1500-byte data frame is sent to the router
2- A 66-byte voice frame is sent an instant behind the data frame
3- The voice frame must wait 214 ms (or 0.012 ms at 1 Gb/s) for the data frame to be sent

Admission control: SLA (Service Level Agreement)
[Figure: a user connects to the IP network of a network operator, which applies traffic admission control and traffic shaping. The Service Level Agreement is a legal contract covering performances, packet loss, end-to-end delay, availability, and guarantees]
Service Level Agreements

A Service Level Agreement (SLA) is a service contract signed between a user and a carrier. It defines the
type of service which the carrier must provide to the user and includes a traffic profile to be
respected by the user.
The SLA defines:
Performance, in the form of traffic parameters. For example: the carrier will support 100 kb/s of
high-priority traffic for the customer
Delays (absolute delay, round-trip time, ...)
Availability in terms of service maintenance (mean time to answer, mean time to repair, ...)
Packet loss (percentage of loss)
Guarantees (penalties in the event of non-compliance, ...)
The network operator guarantees a certain quality of service. In return, it must check that the customer does
not exceed the agreed traffic profile. For this reason it sets up the following functions:
"Traffic admission control": admission control describes how carriers control the traffic entering the
network.
"Traffic shaping" (or "traffic policing"): traffic shaping controls the rate at which traffic enters the
network. Typically, carriers shape traffic to ensure that the customer conforms to the Service Level
Agreement. For example, if the customer sends high-priority traffic at more than 100 kb/s, the carrier shapes it
at the network entry point to ensure that only 100 kb/s enters the network.
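One common way to enforce such a rate at the network entry point is a token bucket. This guide does not name a specific algorithm, so the sketch below is an assumption chosen for illustration. The class name, the burst size, and the explicit `now` parameter (used instead of a real clock so the example is deterministic) are all hypothetical. This version rejects non-conforming packets (policing). A shaper would queue them and send them later instead.

```python
class TokenBucket:
    """Token bucket limiting traffic to rate_bps with a burst of burst_bytes."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate_bytes_per_s = rate_bps / 8
        self.capacity = burst_bytes
        self.tokens = float(burst_bytes)   # start with a full bucket
        self.last_time = 0.0

    def allow(self, packet_bytes, now):
        """Return True if the packet conforms to the profile at time now (seconds)."""
        elapsed = now - self.last_time
        self.last_time = now
        # Tokens accumulate at the contracted rate, up to the burst size.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_bytes_per_s)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False


if __name__ == "__main__":
    # 100 kb/s of high-priority traffic, as in the SLA example above.
    bucket = TokenBucket(rate_bps=100_000, burst_bytes=1500)
    print(bucket.allow(1500, now=0.0))   # conforms: the bucket starts full
    print(bucket.allow(1500, now=0.0))   # exceeds the profile: no tokens left
    print(bucket.allow(1500, now=0.2))   # conforms again after the bucket refills
```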

Queue management
Best effort
Switch / Router
Only one queue per port (First In First Out)
Less expensive
Implicit admission control
No shaping
Multipriority queues
Switch / Router
High, medium, and low priority queues per output port
Traffic covered by SLA
Explicit admission control
Shaping
More expensive
[Figure: input ports feed a switch fabric; with best effort each output port has a single queue, while with multipriority queues each output port has high, medium, and low queues and a high-priority flow is placed in the priority traffic queue]
How do carriers manage traffic once inside the network?

Previously, switches and routers provided only "best effort": they had only one queue per
port, so the first packet in was always the first packet out (FIFO). Best effort
means:
No guarantee in terms of delivery or delay (low cost)
No control over inbound traffic
Today, modern switches and routers have multiple priority queues per port, for example:
High priority
Medium priority
Low priority
so that packets can be placed in the queue matching the desired QoS.

The packets are then extracted from the various queues according to the deadlines to be met.
The network device (switch or router) receiving incoming packets selects the queue according to the
markings in the IP packet (the DiffServ Code Points).
Tail drop affects the packets arriving after a queue has reached its maximum capacity.
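The behavior of an output port with multiple priority queues and tail drop can be sketched as follows. This is an illustration under assumptions: the class name `PriorityPort` is hypothetical, and strict priority scheduling (always serve the highest non-empty queue first) is only one possible way of extracting packets. Real equipment may use weighted or deadline-based schedulers.

```python
from collections import deque


class PriorityPort:
    """Output port with high, medium, and low queues and tail drop."""

    LEVELS = ("high", "medium", "low")

    def __init__(self, max_queue_len):
        self.max_queue_len = max_queue_len
        self.queues = {level: deque() for level in self.LEVELS}

    def enqueue(self, packet, level):
        """Place a packet in its queue. Return False if it is tail-dropped."""
        queue = self.queues[level]
        if len(queue) >= self.max_queue_len:
            return False   # queue full: the arriving packet is discarded
        queue.append(packet)
        return True

    def dequeue(self):
        """Return the next packet to send, serving higher priorities first."""
        for level in self.LEVELS:
            if self.queues[level]:
                return self.queues[level].popleft()
        return None


if __name__ == "__main__":
    port = PriorityPort(max_queue_len=2)
    port.enqueue("data-1", "low")
    port.enqueue("voice-1", "high")
    port.enqueue("data-2", "low")
    print(port.enqueue("data-3", "low"))   # low queue is full: tail drop
    print(port.dequeue())                  # the voice packet leaves first
```

Within one queue the order stays first in, first out. Priority only decides which queue is served next.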

Stateful/stateless QoS
[Figure: Stateful QoS: the ATM/FR switches along a service flow hold info about the flow. Stateless QoS: the equipment holds no info about the flow]
A flow is a sequence of packets from a source to a destination that requires the same network service.
Example: all the packets of a conversation.
Stateless/stateful
Unlike "stateful" equipment (ATM, FR, and X.25 switches), "stateless" equipment does not store
any information about flows. Routers are stateless equipment: when they receive a packet, they process it and
forward it on the egress interface, but do not store any information after this processing. Another packet,
even one belonging to the same flow, will have to undergo the same processing.
Historically, stateful QoS ("IntServ") was favored in ATM and FR networks, but on the public Internet
it poses scaling problems. Indeed, how can each network element store
QoS information for millions of connections?
For this reason the IETF, in the 1990s, moved to stateless QoS in the form of "Differentiated Services"
(DiffServ).
The Type-of-Service (IPv4) or Class-of-Service (IPv6) field is used to manage QoS: its value
determines which queue is used.

Stateful QoS: Integrated Services - RSVP
[Figure: resource reservation performed at each router along the path]
IntServ (Integrated Services) is a model that makes it possible to provide stateful QoS.

The first IntServ protocol to be defined was RSVP (Resource Reservation Protocol). With RSVP, the
routers keep track of the flows requesting QoS.
RSVP is a signaling protocol that makes it possible to reserve resources on the path between a source and a
destination.
RSVP is used by an application to ask the network to ensure a certain quality of service for a given flow.
The same protocol is used between the routers of the network to build and maintain the
state tables related to the flow.
RSVP identifies one-way flows and is designed to support multicast exchanges (radio, TV) as well as
unicast. The reservation of resources is initiated by the receiver of the flow.
The sender of the flow regularly sends "Path" control messages towards the receiver. Each receiver
answers with a "Resv" message in which it indicates the quality-of-service criteria that suit
it. The necessary resources, if available, are reserved by the routers on the path from the receiver towards the
sender. In the case of a multicast flow, the various converging reservations are merged.

Stateless QoS: Type of Service
Informs crossed networks about the desired Quality of Service
[Figure: IPv4 header (Version, Header length, Type of Service, Datagram length, Identification, TTL, Protocol, Checksum, Source IP address, Options) with the Type of Service byte expanded, bits 0 to 7]
Type of Service byte: Precedence bits, Delay, Throughput, Reliability (RFC 791), Cost (RFC 1349)
DSCP (RFC 2474)
Service Type:
The service type is an indication of the quality of service requested for this IP datagram
The Type of Service is used to indicate the quality of the service desired. The type of service is an
abstract or generalized set of parameters which characterize the service choices provided in the
networks that make up the internet. This type of service indication is to be used by gateways to select
the actual transmission parameters for a particular network, the network to be used for the next hop,
or the next gateway when routing an internet datagram.

ToS: Precedence (RFC 791)
Precedence bits
Indicate the priority of the datagram:
000 : Routine
001 : Priority
010 : Immediate
011 : Flash
100 : Flash override
101 : Critical
110 : Inter-network control
111 : Network control
Precedence:
is intended to denote the importance or priority of the datagram.
This field specifies the nature and priority of the datagram:
000: Routine
001: Priority
010: Immediate
011: Flash
100: Flash override
101: Critical
110: Internetwork control
111: Network control
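As an illustration, the precedence can be read from the three most significant bits of the Type of Service byte. The sketch below is only an example: the function name precedence_of is hypothetical and is not part of any router software.

```python
# Precedence names from RFC 791, indexed by the 3-bit precedence value.
PRECEDENCE_NAMES = [
    "Routine", "Priority", "Immediate", "Flash",
    "Flash override", "Critical", "Internetwork control", "Network control",
]

def precedence_of(tos_byte: int) -> str:
    """Return the precedence name carried in the top 3 bits of a ToS byte."""
    if not 0 <= tos_byte <= 0xFF:
        raise ValueError("the Type of Service field is a single byte")
    return PRECEDENCE_NAMES[tos_byte >> 5]

print(precedence_of(0b10100000))  # Critical
```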

Section 7 Page 16
ToS : Precedence management

[Figure: a congested router handling packets marked with precedence 4, 3, 2, 1 and 0.]

Section 7 Page 17
DiffServ
DSCP (Differentiated Services Code Point)
[Figure: IPv4 header in which the Type of Service byte carries the DSCP: Class Selector Code Points (first 3 bits), code point pool bit (0: standard, 1: experimental or local usage), followed by unused bits.]
DiffServ is a stateless approach to QoS. There is no call establishment procedure.

DiffServ solves the main problem encountered by IntServ: scalability as the network grows. The solution
consists in pushing all the packet classification functions (marking, "policing", i.e. checking that the
sender respects the contract) and traffic shaping out to the routers located at the borders of the network,
while the core routers only have to apply preset behaviors (Per-Hop Behaviors). In the core of the network
all the packets are marked, and these marks are used by "DS-capable" routers to determine the behavior to
apply to them. The various behaviors act on queue management and on the algorithms that select the
packets to be discarded when a queue is congested.
The router chooses the behavior according to the mark present in the packet very quickly, since there is
only one field to analyze in the packet header. Traffic is differentiated by the IP packet header field
"ToS" (IPv4) or "Traffic Class" (IPv6). It is the role of the applications or of the border routers to set
these bits correctly. In any event, the network operator checks that the SLA is respected.
DiffServ assumes that the routers or switches have multi-priority queues.
DiffServ assumes that this simple mechanism is enough to ensure a sufficient and acceptable QoS.
To remain compatible with the previous specifications, and because most older routers handle
"precedence", DiffServ defines the first 3 bits as the "Class Selector Code Points", whose function is
backward compatible with the old "precedence" field.
Packets are thus marked with a DSCP value that selects a PHB ("Per-Hop Behavior").
Advantages:
No signaling protocol: each packet carries its own QoS marking
No per-flow information to be stored in each network element
No dimensioning (scaling) problem
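The relationship between the old ToS byte, the DSCP and the Class Selector bits can be sketched as follows. This is a minimal illustration; the helper names dscp_of and class_selector_of are hypothetical.

```python
def dscp_of(tos_byte: int) -> int:
    """DSCP is the 6 most significant bits of the ToS / Traffic Class byte."""
    return (tos_byte & 0xFF) >> 2

def class_selector_of(dscp: int) -> int:
    """First 3 bits of the DSCP, compatible with the old precedence field."""
    return (dscp >> 3) & 0b111

tos = 0b10111000  # example byte: DSCP 101110 followed by the 2 unused bits
dscp = dscp_of(tos)
print(f"{dscp:06b}", class_selector_of(dscp))  # 101110 5
```

A router that only understands precedence would read the same byte as precedence 5, which is why the Class Selector bits keep the older equipment usable.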

Section 7 Page 18
Diffserv : principle of operation
[Figure: DiffServ node. Input, Classifier, Traffic conditioning (Meter, Marker, Shaper/Dropper), Queue management, Scheduler, Output. Per-Hop-Behavior shares in the example (% of use): EF 65/100, AF2 20/100, AF1 10/100, BE 5/100.]
DiffServ is a flexible model. It is up to each operator to decide how many classes of service to support,
which PHBs to use, which traffic conditioning mechanisms to use, and how to allocate capacity to each
PHB to achieve the required QoS for each class of service.
The DiffServ Code Point determines the Per-Hop Behavior of the network nodes.
If all the traffic on an access link uses the same Code Point, then the PHB depends upon load.
Traffic in the high priority queue should wait less time and experience better network quality of service.
First of all, the packets received on an interface are classified according to their PHB.
There are 3 main types of traffic, corresponding to 3 types of PHB:
EF for "Expedited Forwarding": traffic with low delay, low jitter and a guaranteed bandwidth
AF for "Assured Forwarding": traffic whose bandwidth can be divided according to policies
BE for "Best Effort": traffic that the network does its best to convey, without any guarantee
Before being directed to the output interface, the traffic goes through the "traffic conditioning"
process, where it is measured to check that it strictly respects the subscribed contract in terms of
rate, volume, etc.
Such traffic can be downgraded, i.e. marked with a lower-quality PHB, or even discarded, according to
the adopted policy.
The "Scheduler" process applies a flow-handling policy to send the packets to the output interface.
Another process, "queue management", is also set up, mainly for the event of congestion: it discards
certain packets as soon as they enter a congested router, in order to avoid aggravating the
problem.

Section 7 Page 19
Diffserv : encoding
[Figure: DSCP encoding. Class Selector Code Point bits, code point pool and unused bits, with the values used for Best effort, Assured Forwarding (Class 1 to Class 4) and Expedited Forwarding.]
The packets are classified into IETF-defined per-hop behaviors (PHBs), including:

assured forwarding (AF)
expedited forwarding (EF)
and best effort
The EF PHB is intended to support low-loss, low-delay, and low-jitter services. The EF PHB guarantees that its
traffic is serviced at a rate that is at least equal to a configurable minimum service rate (regardless of the
offered load of non-EF traffic) at both long and short intervals. Configuring the minimum service rate higher
than the actual EF load arrival rate allows the EF queue to remain very small (even at short intervals) and
consequently lets the EF objectives be met. Traffic that is characterized as EF will receive the lowest
latency, jitter and assured bandwidth services which is suitable for applications such as VoIP. Codepoint
101110 is recommended for the EF PHB.
AF specifies forwarding of packets in one of four AF classes. Within each AF class, a packet is assigned one of
three levels of drop precedence. Each corresponding PHB is known as AFij, where i is the AF class and j is
the drop precedence. Each AF class is allocated a certain number of resources, including bandwidth.
Forwarding is independent across AF classes. Within an AF class, packets of drop precedence p experience a
level of loss lower than (or equal to) the level of loss experienced by packets of drop precedence q, if p <
q. Packets are protected from reordering within a given AF class regardless of their precedence level. To
minimize long-term congestion in an AF queue, active queue management (such as Random Early Detection)
is required with different thresholds for each drop profile.
The AF PHB groups are intended to address common applications that require low loss as long as the
aggregate traffic from each site stays below a subscribed profile and that may need to send traffic beyond
the subscribed profile with the understanding that this excess traffic does not get the same level of
assurance. AF allows carving out the bandwidth between multiple classes in a network according to desired
policies.
Best-effort forwarding is the behavior available in all routers (including those that aren't running
DiffServ) for standard traffic: the router simply delivers as many packets as possible, as soon as possible.
This PHB is intended for all traffic for which no special QoS commitments are contracted. The default PHB
essentially specifies that a packet marked with a DSCP value of 000000 receives the traditional best-effort service.
Section 7 Page 20
Diffserv : Assured Forwarding

[Figure: Assured Forwarding code points. The Class Selector bits carry the AF class (Class 1 to Class 4), the next two bits carry the drop precedence (Low drop precedence: AF11, Medium drop precedence: AF12, High drop precedence: AF13), the last DSCP bit belongs to the code point pool, and the remaining bits of the byte are unused.]
Assured Forwarding (AFxy) PHB group

In a typical application, a company uses the Internet to interconnect its geographically distributed sites and
wants an assurance that IP packets within this intranet are forwarded with high probability as long as the
aggregate traffic from each site does not exceed the subscribed information rate.
Assured Forwarding (AF) PHB group is a means for a provider to offer different levels of forwarding
assurances for IP packets received from a customer.
Four AF classes are defined, where each AF class is in each node allocated a certain amount of
forwarding resources (buffer space and bandwidth). IP packets that wish to use the services provided by
the AF PHB group are assigned by the customer or the provider into one or more of these AF classes
according to the services that the customer has subscribed to.
Within each AF class IP packets are marked (again by the customer or the provider) with one of three
possible drop precedence values. In case of congestion, the drop precedence of a packet determines the
relative importance of the packet within the AF class. A congested node tries to protect packets with a
lower drop precedence value from being lost by preferably discarding packets with a higher drop
precedence value.
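The AF code points recommended by RFC 2597 follow a regular pattern: the class occupies the first 3 bits of the DSCP and the drop precedence the next 2 bits. The sketch below computes them; the function name af_dscp is hypothetical.

```python
def af_dscp(af_class: int, drop_precedence: int) -> int:
    """Return the 6-bit DSCP of AFij (i = class 1..4, j = drop precedence 1..3)."""
    if af_class not in (1, 2, 3, 4) or drop_precedence not in (1, 2, 3):
        raise ValueError("AF defines classes 1-4 and drop precedences 1-3")
    return (af_class << 3) | (drop_precedence << 1)

for i in range(1, 5):
    print("  ".join(f"AF{i}{j}={af_dscp(i, j):06b}" for j in range(1, 4)))
```

The first printed line gives AF11=001010, AF12=001100 and AF13=001110, i.e. the same class with an increasing drop precedence.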

Section 7 Page 21
Diffserv : Traffic control by Token Bucket

[Figure: token bucket. Tokens enter the bucket at a constant rate. For each packet, tokens equivalent to the packet size are removed if available in the bucket. Enough tokens: in-profile traffic. Not enough tokens: out-of-profile traffic, and the packet is discarded or marked.]
The SLA makes provision for in-profile traffic.

Customer traffic that does not fit the SLA is known as out-of-profile traffic and is not prioritized in the
same manner as in-profile traffic.
DiffServ marks both the priority and the in-profile or out-of-profile status of a packet.
The carrier has several options for processing out-of-profile traffic. Since the customer isn't paying for it,
because it's not part of the SLA, the carrier can discard it immediately, mark it and discard it
only if the network is congested, or carry it and charge the customer a premium
price.
How do carriers determine whether the traffic they receive is in-profile?
"Policing" is carried out by the access equipment at the edge of the network ("access edge"). The
method used is the "token bucket".
Principle of the token bucket:
Tokens are poured into the bucket at a given rate, without exceeding the capacity of the bucket. Each
token represents a number of bytes of the flow for which a certain QoS is ensured.
When packets arrive on this flow, tokens are removed in proportion to the size of the packets.
If the bucket has sufficient tokens, the traffic is "in-profile", i.e. the SLA covers the traffic.
If the bucket does not have sufficient tokens, the traffic is "out-of-profile". For these packets,
the network operator applies a policy, which can be to discard the packets immediately or only
if the network is congested.
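The token bucket principle can be sketched in a few lines. This is an illustration under stated assumptions, not an operator implementation: one token stands for one byte, the bucket starts full, and the class name TokenBucket, its parameters and the traffic values are all hypothetical.

```python
class TokenBucket:
    """Token bucket policer: one token represents one byte (assumption)."""

    def __init__(self, rate_bytes_per_s: float, capacity_bytes: float):
        self.rate = rate_bytes_per_s
        self.capacity = capacity_bytes
        self.tokens = capacity_bytes  # assumption: the bucket starts full
        self.last = 0.0

    def conforms(self, now: float, packet_bytes: int) -> bool:
        """Refill the bucket, then report whether the packet is in-profile."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes  # in-profile: consume the tokens
            return True
        return False  # out-of-profile: tokens are left untouched

bucket = TokenBucket(rate_bytes_per_s=1000, capacity_bytes=1500)
results = [bucket.conforms(t, size) for t, size in [(0.0, 1000), (0.1, 1000), (1.0, 1000)]]
print(results)  # [True, False, True]
```

The second packet arrives too soon after the first one and is out-of-profile; what happens to it (discard, remark, carry at a premium) is the operator's policy and is deliberately left outside the sketch.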

Section 7 Page 22
Queue management - FIFO

[Figure: switch / router with input ports, switch fabric and output ports. Input queues hold packets waiting for switch fabric resources; output queues hold packets waiting for link access. A FIFO queue has a front, a tail and a maximum depth (fixed or adjustable size). With tail drop, high-priority packets may be delayed or discarded.]
Queuing occurs at every switch or router between the source and the destination. Delay-sensitive traffic
requires queue management through the network to support end-to-end QoS requirements.
A queue holds packets awaiting access to a resource:
Input queues hold traffic awaiting access to the switch fabric
Output queues hold traffic awaiting transmission onto the link
The highest level of QoS is reached when queues are empty, i.e. all controllable delay is
controlled, and only uncontrollable delay remains.
The method of managing the queue allows for some delay control.
Delay variation, or jitter, is very important to real-time applications. Transmitted real-time flows
must be played out by the destination at a constant rate. If all the packets in a flow encounter the same
queue length, they wait about the same time. However, queue sizes vary over time, resulting in packets
experiencing different delays through the network.
Using queue management techniques, we can try to eliminate or minimize jitter for delay-sensitive traffic.
FIFO queuing:
The first packet that gets into the queue is the first one that gets out of the queue. The device services the
packet at the front of the queue; arriving packets land at the tail of the queue.
Queue depth identifies the size, or maximum number of packets, of the queue.
If the queue is full to capacity, the equipment simply drops arriving packets.
FIFO queuing has some negative characteristics, especially when used in a traffic-prioritized network.
Devices do not process the packets based on an established hierarchy. High-priority packets may be delayed
or discarded while the network devices process low-priority packets.
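The FIFO behaviour with tail drop can be illustrated with a short sketch. The class name FifoQueue and the packet names are hypothetical; the queue depth is counted in packets.

```python
from collections import deque

class FifoQueue:
    """FIFO queue with a maximum depth and tail drop."""

    def __init__(self, max_depth: int):
        self.max_depth = max_depth
        self.packets = deque()
        self.dropped = []

    def enqueue(self, packet: str) -> bool:
        """Add a packet at the tail, or drop it if the queue is full."""
        if len(self.packets) >= self.max_depth:
            self.dropped.append(packet)  # tail drop, whatever the priority
            return False
        self.packets.append(packet)
        return True

    def dequeue(self):
        """Serve the packet at the front of the queue (None if empty)."""
        return self.packets.popleft() if self.packets else None

q = FifoQueue(max_depth=3)
for pkt in ["low-1", "low-2", "low-3", "HIGH-1"]:
    q.enqueue(pkt)
print(list(q.packets), q.dropped)  # ['low-1', 'low-2', 'low-3'] ['HIGH-1']
```

The high-priority packet is lost simply because it arrived last, which is the weakness described above.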

Section 7 Page 23
Multi-priority-queuing
[Figure: switch / router with input ports, switch fabric and output ports; each output port has high, medium and low priority queues fed by the incoming flows.]
Queue management method : Head-of-line blocking

High-priority packets are served first, then medium, then low
Medium and low priority packets may be discarded
Multi-priority queuing uses a hierarchy to determine which packets to service first by defining a separate
queue for each priority level. The device analyzes the packet to determine its priority and places the packet
into the appropriate queue based on that determination.
Routers and switches using multi-priority queuing process high-priority traffic first, then medium, then low
priority traffic. Therefore, incoming high priority traffic delays medium priority traffic, which delays low
priority traffic.
Because the device services high priority packets regardless of the medium or low priority network load,
medium and low priority packets may not be serviced at all and are therefore discarded. This effect of the
queue management technique is known as head-of-line blocking.
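A strict multi-priority scheduler can be sketched as below. The function name serve and the queue contents are hypothetical; each "slot" stands for one transmission opportunity on the output link.

```python
from collections import deque

PRIORITIES = ("high", "medium", "low")

def serve(queues: dict, slots: int) -> list:
    """Send up to `slots` packets, always from the highest non-empty queue."""
    sent = []
    for _ in range(slots):
        for prio in PRIORITIES:
            if queues[prio]:
                sent.append(queues[prio].popleft())
                break
    return sent

queues = {
    "high": deque(["H1", "H2", "H3"]),
    "medium": deque(["M1", "M2"]),
    "low": deque(["L1"]),
}
print(serve(queues, slots=4))  # ['H1', 'H2', 'H3', 'M1']
```

With only four slots, the low priority packet is never served; if high priority traffic keeps arriving it can wait indefinitely.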
Class-based queuing
Network devices may employ a more sophisticated method of queuing to avoid head of line blocking. This
method is known as class-based queuing.
CBQ does not assign absolute priorities to traffic, but rather assigns a ratio of the resource (e.g. bandwidth)
to each class, or priority. If a particular class uses less than its allocated portion, the other classes
can use it.

Section 7 Page 24
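WFQ : Weighted Fair Queuing

[Figure: four WFQ queues weighted 40%, 30%, 20% and 10%. When the two highest-priority queues are empty (0%, 0%), the two remaining queues receive 67% and 33%.]
WFQ constantly recalculates resource allocation to maintain the programmed ratio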
Weighted Fair Queuing allocates a certain ratio of the resource to each priority but, unlike CBQ, it
accommodates and manages traffic consisting of variably sized packets.
WFQ can give certain traffic priority without starving the lower priority queues, and maintains the resource
allocation ratio constantly over time. Computational complexity is the major disadvantage of WFQ.
Assume we have 4 queues and we have strategically weighted the resource for each.
To make weighted fair queuing work, we must dynamically understand the traffic on the network and be
able to dynamically change the weight of the available resource in response to the traffic pattern.
In this example, we want to maintain a ratio between the priority queues of 4 to 3 to 2 to 1, from the
highest to the lowest.
If the switch or router sends a 1500-byte packet from the lowest priority queue, it will not serve that queue
again until it serves:
4 times that amount from the high priority queue,
3 times that amount from the medium priority queue,
and 2 times that amount from the low priority queue.
If some queues are empty, the device reallocates the resource to the non-empty queues in such a
way as to maintain the same ratios.
In the lower example, because the two higher priority queues are empty, the weighting between the
low and lowest queues is 2 to 1, so the low priority queue gets 67% of the resource and the lowest priority
queue gets 33% of the resource. WFQ constantly recalculates resource allocation.
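The reallocation arithmetic of the example (weights 4, 3, 2, 1) can be checked with a short sketch. It only computes the resource shares, it is not a packet scheduler, and the function name wfq_shares and the backlog values are hypothetical.

```python
def wfq_shares(weights: dict, backlog: dict) -> dict:
    """Share of the resource per queue; empty queues give up their weight."""
    active = {q: w for q, w in weights.items() if backlog.get(q, 0) > 0}
    total = sum(active.values())
    return {q: (active[q] / total if q in active else 0.0) for q in weights}

weights = {"high": 4, "medium": 3, "low": 2, "lowest": 1}
all_busy = wfq_shares(weights, {"high": 5, "medium": 5, "low": 5, "lowest": 5})
two_busy = wfq_shares(weights, {"high": 0, "medium": 0, "low": 5, "lowest": 5})
print({q: round(s * 100) for q, s in all_busy.items()})
print({q: round(s * 100) for q, s in two_busy.items()})
```

The first line prints the 40/30/20/10 split, the second one the 0/0/67/33 split of the lower example.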

Section 7 Page 25
WFQ : Discard probability

[Figure: probability of a packet being discarded versus queue fill, for a queue with a head, a tail and a maximum depth. With tail drop the probability stays at 0 until the queue is 100% full, then jumps to 1.]
WFQ leads to TCP flow troubles
Any queue with a maximum depth contains the potential to fill up thus causing the discard of arriving
packets.
Straight tail drop describes the process of dropping packets based solely on available space in the queue
when packets arrive.
An empty queue offers a 0% probability of dropped packets, and the probability is 100% once the
queue reaches capacity.
This abrupt shift in probability causes the problem known as performance oscillation.
Let's review TCP flow control by means of the next figures.

Section 7 Page 26
Performance oscillation with WFQ

[Figure: the same tail-drop curve (probability of being discarded versus queue fill). When the queue is 100% full, packets from many TCP flows are dropped.]
An empty queue offers a 0% probability of dropped packets, and the probability is 100% once the
queue reaches capacity.
This abrupt shift in probability causes the problem known as performance oscillation.
TCP offers some built-in network congestion control mechanisms. If a given TCP flow experiences a certain
pattern of packet loss (unacknowledged packets), it assumes network congestion. TCP very quickly
decreases the transmission rate for packets of that flow. TCP slowly builds up the transmission rate as
congestion eases.
However, if the network drops packets from many TCP flows within a short period of time, they all slow
down and then build back up again together.
This oscillation results in an inefficient use of network resources.
When many TCP flows slow down, throughput drops and the network operates with resources
underutilized.
When many TCP flows speed up, the network congests and drops packets, and network throughput
subsequently falls.

Section 7 Page 27
RED : Random Early Detection

The probability of dropped packets is a function of the smoothed queue occupancy
[Figure: probability of a packet being discarded versus average queue depth. The probability rises progressively towards 1 as the smoothed queue occupancy approaches 100%. Some packets are randomly dropped early while others still find sufficient space.]
A more efficient model allows traffic to build to a point supporting high throughput without experiencing
congestion. This efficiency is the goal of Random Early Detection (RED).
RED drops some packets before the queue fills. The probability of packet loss increases with the occupancy
of the queue.
In this way congestion only impacts a small subset of the TCP flows. The affected flows slow their
transmission rate, reducing the load on the network enough to avoid full queues and oscillation while
maintaining an acceptable overall throughput.
RED works on the smoothed queue occupancy, i.e. the average length of the queue maintained over some
period of time. If this average exceeds the average queue threshold, the probability of packets being
dropped increases and we start randomly dropping incoming packets. Working on the average rather than
on the instantaneous length means we don't overreact and drop packets when we don't really need to
do so.
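A simplified version of the RED calculation is sketched below. It follows the usual two-threshold form of RED (no drop below a minimum threshold, certain drop above a maximum threshold, linear increase in between); the threshold values, the maximum probability max_p and the smoothing weight are hypothetical tuning parameters, and the refinements of the published algorithm are left out.

```python
def update_average(avg_depth: float, current_depth: float, weight: float = 0.2) -> float:
    """Smoothed queue occupancy: exponentially weighted moving average."""
    return (1 - weight) * avg_depth + weight * current_depth

def red_drop_probability(avg_depth: float, min_th: float, max_th: float, max_p: float) -> float:
    """Drop probability as a function of the smoothed queue occupancy."""
    if avg_depth < min_th:
        return 0.0  # queue is short: never drop
    if avg_depth >= max_th:
        return 1.0  # queue is nearly full: always drop
    return max_p * (avg_depth - min_th) / (max_th - min_th)

for depth in (10, 50, 90):
    print(depth, red_drop_probability(depth, min_th=20, max_th=80, max_p=0.1))
```

A router would compare this probability with a random number for each arriving packet, so that only some flows are asked to slow down at any given moment.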
There is a potential problem with basic RED:
It may be that 10% of the TCP flows (bulk data flows) send 90% of the packets. RED statistically drops more
packets from large flows, therefore the TCP flows that make up 90% of the traffic will be hit first
by RED.
However, if those large flows are of the highest priority, then high-priority packets are dropped.
Dropping high-priority packets from a QoS-enabled network is not what a service provider wants to
do. Service providers want to ensure that high-priority traffic gets the services and the resources that
it needs, as outlined in an SLA.

Section 7 Page 28
WRED : Weighted Random Early Detection

[Figure: probability of a packet being discarded versus average queue depth, with one threshold per queue (high-priority queue and low-priority queue).]
The WRED (Weighted RED) concept implements one threshold per queue.

This principle makes it possible, for example, to increase the minimum RED threshold of the high-priority
queue, allowing more high-priority packets into the queue before packets are randomly dropped. By decreasing
the RED threshold of the low-priority queue, the queue holds only a few low-priority packets before packets
are randomly dropped.
This sort of RED adjustment increases the network's ability to process high-priority traffic. WRED allows
network administrators to flexibly allocate queue resources to best serve traffic prioritization needs.
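The per-queue thresholds can be illustrated as follows. The threshold values, the maximum probability max_p and the names used here are hypothetical; the drop curve is the same simplified two-threshold form commonly used to describe RED.

```python
# Hypothetical (min, max) thresholds per queue, in average queue depth.
WRED_THRESHOLDS = {
    "high-priority": (60, 90),
    "low-priority": (20, 50),
}

def wred_drop_probability(queue: str, avg_depth: float, max_p: float = 0.1) -> float:
    """RED drop probability using the thresholds of the given queue."""
    min_th, max_th = WRED_THRESHOLDS[queue]
    if avg_depth < min_th:
        return 0.0
    if avg_depth >= max_th:
        return 1.0
    return max_p * (avg_depth - min_th) / (max_th - min_th)

for queue in WRED_THRESHOLDS:
    print(queue, wred_drop_probability(queue, avg_depth=40))
```

At the same average depth of 40, the high-priority queue drops nothing while the low-priority queue has already started random drops.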

Section 7 Page 29
Answer the questions
7 30
Quality of Service

Section 7 Page 30
4 QoS
7 33
Quality of Service

Section 7 Page 33
End of Section
7 34
Quality of Service

Section 7 Page 34
Section 8
Multiprotocol Label Switching
(MPLS)
IP Technology

Section 8 Page 1
Blank Page
8 2

Document History
Edition
Date
Author
Remarks
01
YYYY-MM-DD
First edition

Section 8 Page 2
1. Label Switching Principles
8 3


Section 8 Page 3
Why MPLS?
Multiprotocol Label Switching offers a number of advantages:

A traffic engineered path with a guaranteed reservation of resources in order
to provide the required QoS.
A fast switchover to a backup path in case of failure (in the order of a few
milliseconds).
The implementation of Layer 2 and 3 VPN services.
VLL (Virtual Leased Line): Layer 2 point to point service
VPLS (Virtual Private LAN Service): Layer 2 point to multipoint service
VPRN (Virtual Private Routed Network): Layer 3 VPN
MPLS does not replace classical IP routing but optimizes it

8 4


Section 8 Page 4
8. Multiprotocol Label Switching
MPLS location
[Figure: protocol stack. Applications (layers 5 to 7) over TCP / UDP, over IP, over MPLS, over layer 2 (PPP, Ethernet, Frame Relay, ATM), over layer 1 Physical (Optical, Electrical).]
IP Routing Table
Destination      Next Hop
134.5.0.0/16     200.5.1.5
134.5.1.0/24     200.2.3.4
MPLS Table
In         Out
(2, 84)    (4, 12)
(2, 85)    (3, 99)


MPLS does not replace classical IP routing but optimizes it.

If the IP routing table is modified, the label table must be modified.

Section 8 Page 5
LSR : Label Switch Router
[Figure: an MPLS network made of LSRs. IP routing is performed on either side of the MPLS network and label switching inside it. Legend: IP Router, Label Switching Router.]

An MPLS network is composed of Label Switching Routers (LSRs).

A Label Switching Router (LSR) is a traditional router with more processing capacity that runs the
MPLS protocols. Among other things, it knows how to manage a second table in addition to the
routing table: the label switching table.
An LSR can be:
An IP router
An ATM switch
A Frame Relay switch
A DWDM optical switch
The label table depends entirely on the traditional IP routing table.

If the IP routing table is modified, the label table must be modified.

Section 8 Page 6
LER : Label Edge Router

[Figure: MPLS network with an ingress LER, transit LSRs and an egress LER.]
Ingress LER processing: traffic as it enters the MPLS domain:
examines inbound IP packets
classifies packet for QoS
assigns initial label (label push)
Transit LSR processing: traffic within the MPLS domain:
forwards MPLS packets using label swapping (label swap)
Egress LER processing: traffic as it leaves the MPLS domain:
removes label (label pop)

The LER converts both IP packets into MPLS packets and MPLS packets into IP packets.
On the ingress side, the LER examines the incoming packet to determine whether the packet should be
labeled. In an MPLS network, the LERs serve as quality of service (QoS) decision points.
The function of the LSR is to examine incoming packets. Provided that a label is present, the LSR will look
up and follow the label instructions and then forward the packet according to the instructions. The LSR
performs a label-swapping function.

Section 8 Page 7
LSP : Label Switched Path
[Figure: an LSP crossing the MPLS network from LER to LER through several LSRs, with a different label on each hop (for example Label: 21, Label: 56).]

A path through the network, known as a Label Switched Path (LSP), must be defined and the QoS
parameters along that path must be established. The QoS parameters determine
how many resources to commit to the path, and
what queuing and discarding policy to establish at each LSR for packets

Section 8 Page 8
Principle of the Label switching

MPLS does not replace classical IP routing but optimizes it
Switching Table
In (port, label)    Out (port, label)
(1, 22)             (2, 17)
(1, 24)             (3, 17)
(1, 25)             (4, 19)
(2, 23)             (3, 12)
[Figure: an IP packet (IPs: 154.1.2.3, IPd: 86.6.7.8) arrives on Port 1 with label 25 and leaves on Port 4 with label 19; the IP packet and its data are unchanged.]

Label swapping is based on an exact match, not on the longest prefix match as in IP.
The MPLS header is simple and short compared to the IP header.
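The exact-match lookup can be sketched with a dictionary keyed on (port, label). The entries are the ones read from the switching table of the slide; the pairing of In and Out columns and the function name switch are assumptions made for this illustration.

```python
# (in_port, in_label) -> (out_port, out_label), as read from the slide.
SWITCHING_TABLE = {
    (1, 22): (2, 17),
    (1, 24): (3, 17),
    (1, 25): (4, 19),
    (2, 23): (3, 12),
}

def switch(in_port: int, in_label: int):
    """Exact-match lookup: returns (out_port, out_label) or None."""
    return SWITCHING_TABLE.get((in_port, in_label))

print(switch(1, 25))  # (4, 19)
```

A single dictionary access replaces the comparison of the destination address against every prefix of a routing table.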

Section 8 Page 9
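Principle of FEC (Forwarding Equivalence Class)

A FEC may be a group of IP destination addresses using the same LSP
[Figure: packets IP1 and IP2, bound for destinations IP@1 and IP@2, enter the same LSP at the ingress LER. On each hop both packets carry the same label (labels 23, 6 and 14 appear along the path) until the egress LER delivers them to their destinations.]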

The Forwarding Equivalence Class is an important concept in MPLS. An FEC is any subset of packets
that are treated the same way by a router. Here "treated" can mean forwarded out of the same
interface with the same next hop and label. It can also mean given the same class of service, output
on the same queue, given the same drop preference, or any other option available to the network operator.
When a packet enters the MPLS network at the ingress node, the packet is mapped into an FEC.
FECs also allow for greater scalability in MPLS. The limited flexibility and the large number of (short-lived)
flows in the Internet limit the applicability of both IP Switching and MPOA (Multiprotocol over ATM).
With MPLS, the aggregation of flows into FECs of variable granularity provides scalability that meets
the demands of the public Internet as well as enterprise applications.

Section 8 Page 10
Flow aggregation
[Figure: two LSPs crossing the MPLS network between LERs through LSRs; labels shown along the paths include 91, 56, 15, 52, 43 and 88.]
Ingress Routing Table
Destination       FEC Label
134.5.0.0/16      91
200.3.2.0/24      91
56.42.1.0/24      91
123.2.0.0/16      52
10.8.128.0/20     52
Aggregation can also be done:
By protocol
By application (destination port)
By traffic priority
By source address
FEC : Forwarding Equivalence Class

FEC = A subset of packets that are all treated the same way by a router
The concept of FECs provides for a great deal of flexibility and scalability
In conventional routing, a packet is assigned to a FEC at each hop (i.e. an L3 look-up); in MPLS this is
done only once, at the network ingress

The mapping can also be done on a wide variety of parameters, address prefix (or host),
source/destination address pair, or ingress interface. This greater flexibility adds functionality to
MPLS that is not available in traditional IP routing.
The FEC for a packet can be determined by one or more of a number of parameters, as specified by the
network manager. Among the possible parameters:
Source or destination IP addresses or IP network addresses
Source or destination port numbers
IP protocol ID
Differentiated services codepoint
IPv6 flow label
..
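The mapping of a packet to a FEC at the ingress can be sketched with the destination prefixes of the flow aggregation slide. Only the destination address is used here; the function name ingress_label is hypothetical, and a real LER could also use the other parameters listed above.

```python
import ipaddress

# Destination prefix -> FEC label, as in the ingress routing table of the slide.
FEC_TABLE = {
    "134.5.0.0/16": 91,
    "200.3.2.0/24": 91,
    "56.42.1.0/24": 91,
    "123.2.0.0/16": 52,
    "10.8.128.0/20": 52,
}

# Longest prefixes first, so that the first match is the most specific one.
NETWORKS = sorted(
    ((ipaddress.ip_network(prefix), label) for prefix, label in FEC_TABLE.items()),
    key=lambda item: item[0].prefixlen,
    reverse=True,
)

def ingress_label(dst: str):
    """Classify a destination address once, at the ingress; None if no FEC."""
    addr = ipaddress.ip_address(dst)
    for network, label in NETWORKS:
        if addr in network:
            return label
    return None

print(ingress_label("134.5.1.5"), ingress_label("10.8.130.1"), ingress_label("8.8.8.8"))
# 91 52 None
```

Three different destination prefixes end up with label 91: they belong to the same FEC and follow the same LSP.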

Section 8 Page 11
MPLS Forwarding Example

[Figure: MPLS forwarding example. A packet for 134.5.1.5 follows LSP3 and a packet for 200.3.2.7 follows LSP5 from the ingress LER to the egress LER.]
Ingress LER
Routing Table (Destination, LSP): 134.5.0.0/16 LSP3; 200.3.2.0/24 LSP5
MPLS Table (Dest, Proc, Out): LSP3 Push (2, 84); LSP5 Push (3, 99)
Transit LSRs
MPLS Table (In, Proc, Out): (2, 84) Swap (6, 31); (1, 99) Swap (2, 56); (3, 56) Swap (5, 31)
Egress LER
MPLS Table (In, Proc, Out): Pop, ---
Routing Table (Destination, Next Hop): 134.5.0.0/16 134.5.6.1; 200.3.2.0/24 200.3.1.1

Labels are imposed on the packets only once, at the edge of the MPLS network, at the ingress E-LSR
(Edge Label Switch Router), where the datagram is processed in order to assign it a
specific label.
What is important here is that this computation is carried out only once, the first time a
datagram of a flow arrives at the ingress E-LSR.
The label is removed at the other end by the egress E-LSR.
The mechanism is thus as follows:
The ingress LSR (E-LSR) receives the IP packet, classifies it, assigns a label and
transmits the labeled packet.
The transit LSRs use the label in the packet to switch it until the packet reaches the egress LSR.
The egress LSR removes the label and routes the packet to its final destination.
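The three operations (push, swap, pop) can be sketched on a packet represented as a dictionary. The labels 84 and 31 are taken from the LSP3 example; the packet representation and the function names are hypothetical.

```python
def push(packet: dict, label: int) -> dict:
    """Ingress LER: put a label on top of the (possibly empty) label stack."""
    return {**packet, "labels": [label] + packet["labels"]}

def swap(packet: dict, table: dict) -> dict:
    """Transit LSR: replace the top label using the label table."""
    top, *rest = packet["labels"]
    return {**packet, "labels": [table[top]] + rest}

def pop(packet: dict) -> dict:
    """Egress LER: remove the top label; the IP packet is routed afterwards."""
    return {**packet, "labels": packet["labels"][1:]}

packet = {"dst": "134.5.1.5", "labels": []}
at_ingress = push(packet, 84)
at_transit = swap(at_ingress, {84: 31})
at_egress = pop(at_transit)
print(at_ingress["labels"], at_transit["labels"], at_egress["labels"])  # [84] [31] []
```

The destination address is never examined between the ingress and the egress: only the label changes from hop to hop.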

Section 8 Page 12
Penultimate Hop Popping

The label is removed (popped)
[Figure: the forwarding example with penultimate hop popping. The ingress LER still pushes (2, 84) for LSP3 and (3, 99) for LSP5, and on LSP5 the first transit LSR swaps (1, 99) to (2, 56). The penultimate LSR of each LSP performs a Pop instead of a Swap (In (2, 84) on LSP3, In (3, 56) on LSP5) and sends the packet with a null label, so the egress LER only uses its routing table (134.5.0.0/16 next hop 134.5.6.1; 200.3.2.0/24 next hop 200.3.1.1).]

The label at the top of the stack is removed (popped) by the upstream neighbor of the egress LSR.
The egress LSR does not have to do a label lookup and remove the label itself:
One lookup is saved in the egress LSR
The egress LSR only needs to do an IP lookup to find the more specific route
The egress LSR need NOT receive a labelled packet

Section 8 Page 13
Hierarchical LSP tunnels : Label stacking

[Figure: hierarchical LSP tunnels. LSPa (label 25) and LSPb (label 13) enter a common tunnel, LSPc: the LSR at the tunnel entry pushes label 42 on top of both, the LSRs inside the tunnel swap only the top label (42 to 18, then 18 to 31) and leave the inner labels 25 and 13 untouched, and the top label is popped at the end of the tunnel to reveal the inner label.]

Hierarchical tunnel concept

One of the most powerful features of MPLS is label stacking. A labelled packet may carry many labels,
organized as a last-in-first-out stack. Processing is always based on the top label. At any LSR, a label
may be added to the stack (push operation) or removed from the stack (pop operation). Label stacking
allows the aggregation of LSPs into a single LSP for a portion of the route through a network, creating a
tunnel. At the beginning of the tunnel, an LSR assigns the same label to packets from a number of
LSPs by pushing the label onto the stack of each packet. At the end of the tunnel, another LSR pops
the top element from the label stack, revealing the inner label. This is similar to ATM, which has one
level of stacking (virtual channels inside virtual paths), but MPLS supports unlimited stacking.
Label stacking provides considerable flexibility. An enterprise could establish MPLS-enabled networks at
various sites and establish numerous LSPs at each site. The enterprise could then use label stacking to
aggregate multiple flows of its own traffic before handing it to an access provider. The access provider
could aggregate traffic from multiple enterprises before handing it to a larger service provider. Service
providers could aggregate many LSPs into a relatively small number of tunnels between points of
presence. Fewer tunnels means smaller tables, making it easier for a provider to scale the network
core.
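Label stacking can be sketched with lists used as last-in-first-out stacks, the top label being the first element. The label values (25, 13, 42, 18) are only illustrative and the names used are hypothetical.

```python
TUNNEL_LABEL = 42
CORE_TABLE = {42: 18}  # the core LSR needs one entry for the whole tunnel

def enter_tunnel(stack: list) -> list:
    """Tunnel entry: push the tunnel label on top of the existing stack."""
    return [TUNNEL_LABEL] + stack

def core_swap(stack: list) -> list:
    """Core LSR: swap the top label only, the inner labels are not examined."""
    return [CORE_TABLE[stack[0]]] + stack[1:]

def leave_tunnel(stack: list) -> list:
    """Tunnel exit: pop the top label, revealing the inner label."""
    return stack[1:]

lsps = [[25], [13]]  # two LSPs aggregated into the same tunnel
in_tunnel = [enter_tunnel(s) for s in lsps]
after_core = [core_swap(s) for s in in_tunnel]
at_exit = [leave_tunnel(s) for s in after_core]
print(in_tunnel, after_core, at_exit)
# [[42, 25], [42, 13]] [[18, 25], [18, 13]] [[25], [13]]
```

Whatever the number of LSPs carried in the tunnel, the table of the core LSR keeps a single entry, which is the scaling benefit described above.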

Section 8 Page 14
MPLS shim label
[Figure: 32-bit MPLS shim label entry: Label (20 bits), EXP (3 bits, experimental use), S (1 bit, bottom of stack) and TTL (8 bits, Time To Live). S and TTL are explained in the following diagrams.]

Exp : 3 bits reserved for experimental use; for example, these bits could communicate DS
(Differentiated Services) information or PHB (Per-Hop Behaviour) guidance

S : set to one for the oldest entry in the stack, and zero for all other entries
Time To Live (TTL): 8 bits used to encode a hop count, or time to live, value
Label value : locally significant 20-bit label
Labels 0 through 15 are reserved labels, as specified in draft-ietf-mpls-label-encaps-07.txt.

A value of 0 represents the "IPv4 Explicit NULL Label". This label value is only legal when it is the sole
label stack entry. It indicates that the label stack must be popped, and the forwarding of the packet
must then be based on the IPv4 header.
A value of 1 represents the "Router Alert Label". This label value is legal anywhere in the label stack
except at the bottom. When a received packet contains this label value at the top of the label
stack, it is delivered to a local software module for processing. The actual forwarding of the packet
is determined by the label beneath it in the stack. However, if the packet is forwarded further, the
Router Alert Label should be pushed back onto the label stack before forwarding. The use of this
label is analogous to the use of the "Router Alert Option" in IP packets. Since this label cannot occur
at the bottom of the stack, it is not associated with a particular network layer protocol.
A value of 2 represents the "IPv6 Explicit NULL Label". This label value is only legal when it is the sole
label stack entry. It indicates that the label stack must be popped, and the forwarding of the packet
must then be based on the IPv6 header.
A value of 3 represents the "Implicit NULL Label". This is a label that an LSR may assign and
distribute, but which never actually appears in the encapsulation. When an LSR would otherwise
replace the label at the top of the stack with a new label, but the new label is "Implicit NULL", the
LSR will pop the stack instead of doing the replacement. Although this value may never appear in
the encapsulation, it needs to be specified in the Label Distribution Protocol, so a value is reserved.
Values 4-15 are reserved for future use.
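
The field layout described above (20-bit label, 3-bit Exp, 1-bit S, 8-bit TTL) fits in one 32-bit word. The sketch below packs and unpacks one label stack entry under that layout; the function names and the `RESERVED` table are illustrative only.

```python
import struct

# Reserved label values as listed in the text above.
RESERVED = {
    0: "IPv4 Explicit NULL",
    1: "Router Alert",
    2: "IPv6 Explicit NULL",
    3: "Implicit NULL",
}


def pack_entry(label, exp, s, ttl):
    """Encode one label stack entry as 4 bytes in network byte order."""
    if not (0 <= label < 2**20 and 0 <= exp < 8 and s in (0, 1) and 0 <= ttl < 256):
        raise ValueError("field out of range")
    word = (label << 12) | (exp << 9) | (s << 8) | ttl
    return struct.pack("!I", word)


def unpack_entry(data):
    """Decode 4 bytes into the label, exp, s and ttl fields."""
    (word,) = struct.unpack("!I", data)
    return {
        "label": word >> 12,
        "exp": (word >> 9) & 0x7,
        "s": (word >> 8) & 0x1,
        "ttl": word & 0xFF,
    }


entry = pack_entry(label=25, exp=0, s=1, ttl=9)
fields = unpack_entry(entry)
```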

Bottom of stack
[Figure: label stacking across several LSRs. Ingress LSRs push a first label on LSPa (label 25) and LSPb (label 13); these entries carry S=1. A downstream LSR pushes label 42 on top of both (S=0) and a further LSR pushes label 88 (S=0), so only the bottom entry of each stack has S=1. Each LSR's MPLS table has In, Proc and Out columns, for example In 1,25 / Proc Push / Out 3,42.]


S : bottom of stack
Time to Live (TTL)

[Figure: TTL handling across an MPLS network. An IP packet enters through the ingress LER, crosses several LSRs with a different label on each hop (25, 39, 21) and leaves through the egress LER; the TTL carried in the shim label (Label, EXP, S, TTL) decreases by one at each hop.]

A key field in the IP packet header is the TTL field (IPv4), or Hop Limit (IPv6). In an ordinary IP-based
internet, this field is decremented at each router and the packet is dropped if the count falls to zero. This is
done to avoid looping or having the packet remain too long in the internet because of faulty routing.
Because an LSR does not examine the IP header, the TTL field is included in the label so that the TTL
function is still supported. The rules for processing the TTL field in the label are as follows:
When an IP packet arrives at an ingress edge LSR of an MPLS domain, a single label stack entry is added
to the packet. The TTL value of this label stack entry is set to the value of the IP TTL value. If the IP TTL
field needs to be decremented, as part of the IP processing, it is assumed that this has already been done.
When an MPLS packet arrives at an internal LSR of an MPLS domain, the TTL value in the top label stack
entry is decremented. Then:
If this value is zero, the MPLS packet is not forwarded. Depending on the label value in the label stack
entry, the packet may be simply discarded, or it may be passed to the appropriate "ordinary" network
layer for error processing (for example, for the generation of an Internet Control Message Protocol
[ICMP] error message).
If this value is positive, it is placed in the TTL field of the top label stack entry for the outgoing MPLS
packet, and the packet is forwarded. The outgoing TTL value is a function solely of the incoming TTL
value, and is independent of whether any labels are pushed or popped before forwarding. There is no
significance to the value of the TTL field in any label stack entry that is not at the top of the stack.
When an MPLS packet arrives at an egress edge LSR of an MPLS domain, the TTL value in the single
label stack entry is decremented and the label is popped, resulting in an empty label stack. Then:
If this value is zero, the IP packet is not forwarded. Depending on the label value in the label stack entry,
the packet may be simply discarded, or it may be passed to the appropriate "ordinary" network layer for
error processing.
If this value is positive, it is placed in the TTL field of the IP header, and the IP packet is forwarded using
ordinary IP routing. Note that the IP header checksum must be modified prior to forwarding.
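
The three rules above (copy at ingress, decrement at each internal LSR, decrement and copy back at egress) can be traced with a short sketch. It assumes a single-entry label stack and models "not forwarded" as `None`; the function names and TTL values are hypothetical.

```python
def ingress_label_ttl(ip_ttl):
    """Ingress edge LSR: the label TTL is set to the IP TTL.

    The IP TTL is assumed to have been decremented already by IP processing.
    """
    return ip_ttl


def decrement(ttl):
    """Decrement a TTL; None means the packet is not forwarded."""
    ttl -= 1
    return ttl if ttl > 0 else None


def cross_domain(ip_ttl, internal_lsrs):
    """Return the IP TTL leaving the egress LER, or None if the packet expired."""
    ttl = ingress_label_ttl(ip_ttl)
    for _ in range(internal_lsrs):
        ttl = decrement(ttl)      # internal LSR: decrement the top label TTL
        if ttl is None:
            return None
    # Egress edge LSR: decrement, pop the label, write the value into the IP header.
    return decrement(ttl)


outgoing_ttl = cross_domain(ip_ttl=9, internal_lsrs=2)
```

With these values the label TTL goes 9, 8, 7 across the domain and the IP packet leaves the egress with TTL 6, the same result as three ordinary router hops.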

Transparent TTL
[Figure: transparent TTL. A packet from 80.1.2.3 to 209.8.7.6 crosses an MPLS network that uses private addressing (routers 10.1.1.1 to 10.5.5.5, from LER1 to LER5). The label TTL starts at 255 and falls to 254 and 253 as the label changes (25, 46, 63); the IP TTL is 3 on arrival, 2 inside the MPLS network and 1 on exit.]

In transparent mode, the ingress router sets the label TTL to 255, a value high enough to allow the
packet to cross the MPLS network in normal conditions (no loop). The IP TTL field is decremented by 1 by the ingress LER. When the MPLS label is removed by the egress LER, the IP TTL is not updated
with the value of the TTL in the MPLS label. The egress LER decrements the IP TTL by 1, just like a
normal router would do.
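
A minimal sketch of transparent mode, assuming one decrement of the label TTL per internal LSR and using the values that appear in the slide (IP TTL 3 on arrival, label TTL starting at 255). The function name is hypothetical.

```python
def transparent_path(ip_ttl_in, internal_lsrs):
    """Return (IP TTL leaving the egress LER, last label TTL seen in the network)."""
    ip_ttl = ip_ttl_in - 1        # ingress LER decrements the IP TTL
    label_ttl = 255               # label TTL is set to 255, not copied from IP
    for _ in range(internal_lsrs):
        label_ttl -= 1            # internal LSRs only touch the label TTL
    ip_ttl -= 1                   # egress LER decrements the IP TTL, ignoring the label TTL
    return ip_ttl, label_ttl


result = transparent_path(ip_ttl_in=3, internal_lsrs=2)
```

However many LSRs the packet crosses, the IP TTL drops by exactly two, so the MPLS network looks like two router hops to the end systems.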

EXPerimental : direct mapping

[Figure: direct mapping into the EXP field at the LER. Sources shown: the 3-bit 802.1p user priority in the 802.1q tag (user priority, CFI, VLAN_id) of the Ethernet header, and the IP header ToS (Type of Service) byte, read either as 3 Precedence bits or as a DiffServ Code Point made of Class and DP (Drop Precedence) bits. Target: the 3-bit EXP field of the label (Label, EXP, S, TTL).]

The EXP field of the MPLS Shim Header is used by the LSR to determine the PHB to be applied to the
packet.
The Exp bits are set by creating an ingress policy on the ingress LSR. This ingress policy sets the Exp bits in
relation to values associated with the frames and packets traversing the LSP. For example, if a VLAN
trunk port is tunneled through the LSP, the EXP bits can be set by directly copying the values contained
within the three 802.1p priority bits of the 802.1Q headers. Once packets/frames have reached the egress
LSR, an egress policy can be created on the egress LSR that maps the Exp bits back into the bit values of
the packets or frames.
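
The direct copy of the 802.1p priority bits into the EXP field can be sketched as bit manipulation. The 802.1Q tag control field is 16 bits (3 priority bits, 1 CFI bit, 12 VLAN id bits) and the EXP field occupies bits 9 to 11 of the 32-bit shim entry. The function names and sample values are hypothetical; real equipment does this through ingress policy configuration.

```python
def priority_from_tci(tci):
    """Extract the 3-bit 802.1p user priority from a 16-bit 802.1Q TCI."""
    return (tci >> 13) & 0x7


def set_exp(entry_word, exp):
    """Return the 32-bit shim entry with its EXP field replaced."""
    cleared = entry_word & ~(0x7 << 9)
    return cleared | ((exp & 0x7) << 9)


def get_exp(entry_word):
    """Read the EXP field back, as an egress policy would."""
    return (entry_word >> 9) & 0x7


tci = (5 << 13) | 100                    # priority 5, VLAN id 100
entry = (25 << 12) | (1 << 8) | 64       # label 25, EXP 0, S=1, TTL 64
marked = set_exp(entry, priority_from_tci(tci))
```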

Notion of Upstream and Downstream LSRs
[Figure: an LSP from the ingress LER through an LSR to the egress LER, toward 171.68.10/24, with the upstream and downstream directions marked.]
Router-C is the downstream neighbour of Router-B for destination 171.68.10/24
Router-B is the downstream neighbour of Router-A for destination 171.68.10/24
LSRs know their downstream neighbours through the IP routing protocol
Next-hop address is the downstream neighbour

MPLS networks allocate labels from the downstream routers toward the upstream routers, that is, toward the
source of a packet flow.
The term downstream refers to the direction of packet flow. Control messages usually flow
upstream.

Label distribution method
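[Figure: two label distribution methods toward Net_x. Downstream unsolicited: the downstream LSR sends "FEC: net_x label y" without being asked. Downstream on-demand: the upstream LSR sends a demand "FEC: net_x" and the downstream LSR answers with a response "FEC: net_x label y".]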

Label distribution method

Downstream on demand :
An LSR can distribute a FEC label binding in response to an explicit request
Downstream Unsolicited label distribution:
Allows an LSR to distribute label bindings to LSRs that have not explicitly requested them
Downstream On-Demand (DoD) Label Distribution
In downstream on-demand mode, label mappings are provided to an upstream LSR when requested. Because labels
will not usually be requested unless needed, this approach results in substantially less label-release traffic for
unwanted labels when conservative label retention is in use and when the number of candidate interfaces that will
not be used for a next hop is relatively large.
Downstream Unsolicited (DU) Label Distribution
In downstream unsolicited mode, label mappings are provided to all peers for which the local LSR might be a next
hop for a given FEC. This would typically be done at least once during the lifetime of a peer relationship between
adjacent LSRs.
The label manager may use trigger points (such as time intervals) to send out labels or label-refresh messages every
45 seconds. Or a label manager may use a change in the standard routing tables as a trigger; when a route changes,
the label manager may send out label updates to all affected routers.
Both methods can be used in the same network at the same time; however, each LSR must be aware of the distribution
method used by its peer.

Label distribution control

Two control methods
Independent control
Each router makes its switch table from its routing table and informs neighbors
Ordered control
Egress LER is responsible for distributing labels
[Figure: two chains of routers from the ingress LER through LSRs to the egress LER, one for each control method.]

Control of Label Distribution

Two modes are used to load cross-connect tables: independent control and ordered control.
Independent control
Independent control is a term given to a situation in which there is no designated label manager and
when every router has the ability to listen to routing protocols, generate cross-connect tables, and
distribute them freely. Independent control provides for faster network convergence. Any router that
hears of a routing change can relay that information to all other routers. The disadvantage is that
there is no single point of control that is generating traffic, which makes engineering more difficult.
An LSR binds a label to a FEC independently, whether or not it has received a label from the next hop
for the FEC. The LSR then advertises the label to its neighbors.
Consequence: an upstream label can be advertised before a downstream label is received.
Ordered Control
The other model of loading tables is ordered control. In the ordered control mode, one router
(typically the egress LER) is responsible for distributing labels. Ordered control has the advantages of
better traffic engineering and tighter network control; however, its disadvantages are that
convergence time is slower and the label controller is a single point of failure.
An LSR only binds and advertises a label for a particular FEC if it is the egress LSR for that FEC or if it has
already received a label binding from its next hop.
Both methods are supported in the standard and are fully interoperable.
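
The difference between the two control modes comes down to one decision rule per FEC. The sketch below states that rule as a predicate; the function and parameter names are hypothetical and no real LDP implementation is implied.

```python
def may_advertise(mode, is_egress, has_downstream_binding):
    """Decide whether an LSR may bind and advertise a label for a FEC."""
    if mode == "independent":
        # Bind and advertise at any time, with or without a downstream label.
        return True
    if mode == "ordered":
        # Only the egress LSR, or an LSR that already holds a binding
        # from its next hop, may advertise.
        return is_egress or has_downstream_binding
    raise ValueError(f"unknown control mode: {mode}")


# A transit LSR that has not yet heard from its next hop:
independent_ok = may_advertise("independent", is_egress=False, has_downstream_binding=False)
ordered_ok = may_advertise("ordered", is_egress=False, has_downstream_binding=False)
```

In ordered mode the transit LSR waits, which is why the LSP is set up from the egress back toward the ingress and why convergence is slower.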

Downstream unsolicited and Ordered control

[Figure: downstream unsolicited distribution with ordered control for FEC 171.68.10.0/24 across LSR1 to LSR8. The binding starts at LSR1 ("Use label #33") and propagates upstream in numbered steps, each LSR advertising its own label ("Use label #99", "Use label #612", "Use label #216") to its upstream neighbours.]

LSR1 discovers a next hop for a particular FEC

LSR1 generates a label for the FEC and communicates the binding to LSR2
LSR2 inserts the binding into its forwarding tables

Downstream On-Demand and Ordered control

[Figure: downstream on-demand distribution with ordered control for FEC 171.68.10.0/24. Label requests ("Req label FEC: 171.68.10.0/24", steps 1 to 3) travel from LSR4 toward LSR1, and label responses ("Use label #33", "Use label #99", "Use label #216", steps 4 to 6) travel back.]

1- LSR4 recognizes LSR3 as its next-hop for an FEC. A request is made to LSR3 for a binding between
the FEC and a label.

2- LSR3 recognizes LSR2 as its next-hop for the FEC. A request is made to LSR2 for a binding between
the FEC and a label.

3- LSR2 recognizes LSR1 as its next-hop for the FEC. A request is made to LSR1 for a binding between
the FEC and a label.

4- LSR1 is the egress LSR for that particular FEC, so LSR1 replies to LSR2 with a label. LSR2 updates its
switching table.
5- Because LSR2 now holds a label binding and has a pending request from upstream LSR3, LSR2 replies to LSR3 with a
label. LSR3 updates its switching table.

6- Because LSR3 now holds a label binding and has a pending request from upstream LSR4, LSR3 replies to LSR4 with a
label. LSR4 updates its switching table.

7- LSR6 recognizes LSR5 as its next-hop for an FEC. A request is made to LSR5 for a binding between
the FEC and a label.

8- LSR5 recognizes LSR2 as its next-hop for the FEC. A request is made to LSR2 for a binding between
the FEC and a label.

9- LSR2 recognizes the FEC and has a next hop for it, so it creates a binding and replies to LSR5.

Label retention modes

[Figure: bindings for FEC 171.68.10/24 ("Use label #33", "Use label #576", "Use label #63", "Use label #45") advertised among LSR1 to LSR5. An LSR may receive label bindings from multiple LSRs. Two cases are shown: liberal label retention and conservative label retention.]

Label retention mode

Liberal retention mode
LSR retains labels from all neighbors

Improves convergence time when the next hop becomes unavailable
Requires more memory and label space
Conservative retention mode

LSR retains labels only from next-hop neighbors (according to routing)

LSR discards all other labels for this FEC
Frees memory and label space
The label retention method trades off label capacity against speed of adaptation to routing changes
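
The two retention modes can be compared with a small sketch. The neighbor names and the assignment of labels to neighbors are hypothetical; the point is only which bindings survive in each mode.

```python
def retain(bindings, next_hop, mode):
    """Filter the bindings received for one FEC.

    bindings maps a neighbor name to the label that neighbor advertised.
    """
    if mode == "liberal":
        return dict(bindings)                       # keep everything
    if mode == "conservative":
        return {n: label for n, label in bindings.items() if n == next_hop}
    raise ValueError(f"unknown retention mode: {mode}")


# Hypothetical bindings received by one LSR for FEC 171.68.10/24.
received = {"LSR2": 33, "LSR3": 45, "LSR4": 63}

liberal = retain(received, next_hop="LSR2", mode="liberal")
conservative = retain(received, next_hop="LSR2", mode="conservative")

# If routing moves the next hop to LSR3, the liberal table already holds a
# usable label, while the conservative table must wait for a new binding.
fallback_liberal = liberal.get("LSR3")
fallback_conservative = conservative.get("LSR3")
```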

Label Distribution Protocols

Populating MPLS switching tables
Manually
Only on very small networks
Automatically
LDP protocol
Based on existing IP routing tables
MP-BGP protocol
Traffic Engineering
RSVP-TE protocol
Based on Explicit path

There are several ways to populate the MPLS switching tables.

Manually, which is only realistic for a very limited number of equivalence classes (FECs).
Automatically:
By means of the Label Distribution Protocol (LDP), which is entirely automatic and which builds, on
the basis of the information contained in the IP routing tables, the LSPs for each of the equivalence
classes recognized in the routing tables. With this approach, paths are built hop by hop with an
operation principle similar to that of the IP routing protocols.
By means of BGP4 extended with label distribution information, known as Multiprotocol BGP (MP-BGP).
By explicitly supplying the path that the LSPs must follow and the quality of service they must ensure.
These solutions are based on two protocols:

The ReSerVation Protocol Traffic Engineering (RSVP-TE) is a modification of RSVP, which is already
present in the equipment of many manufacturers.
LDP is the hop-by-hop distribution protocol defined by the MPLS working group of IETF. It is totally
independent of the pre-existing protocols. The operation mode of LDP is based on the model of the IP
routing protocols. LDP uses the routing table generated by these protocols to build the MPLS switching
tables. The principle of LDP is simple: each LSR assigns a label to each equivalence class recognized in its
routing table and advertises it to each of its neighbor LSRs. The neighbor then uses this label for all the
packets of this equivalence class that it sends to the LSR.

LDP: Functions
[Figure: two LSRs, with LSR-IDs 1.2.3.4 and 5.6.7.8, running the Label Distribution Protocol between them.]
Neighbors management
Session establishment with parameter negotiation
Label/FEC association exchange

LDP is the hop-by-hop distribution protocol defined by the MPLS working group of IETF. It is totally
independent of the pre-existing protocols.
The Label Distribution Protocol (LDP) is entirely automatic. This protocol builds, on the basis of the
information contained in the IP routing tables, the LSPs for each of the equivalence classes recognized
in the routing tables. With this approach, paths are built hop by hop with an operation principle similar
to that of the IP routing protocols.
It uses the routing table to build the MPLS switching tables. It automatically establishes a path (LSP) for
each equivalence class. It offers different label distribution and label retention modes, which
allow it to adapt to different uses.
For LDP to work, all the internal LSRs of a domain must know the same FECs. For that reason, it
is not possible to aggregate the entries of the IP routing tables inside an MPLS domain. The border LSRs are
the only ones that can aggregate prefixes. If prefixes could be aggregated inside a domain,
downstream LSRs would no longer be able to de-aggregate packets that, although intended for
different networks, would carry the same label.
LDP ensures 3 main functions:
neighbors management.
the establishment of LDP sessions and the negotiation of the parameters of operation.
the exchange of FEC/label associations and more generally of switching information.
Each LSR has a unique identifier which is generally the IP address of a loopback interface.

LDP: Association Exchange
[Figure: LDP messages between an upstream LSR and a downstream LSR for NetID x and NetID y. Messages shown: Label mapping (FEC: NetID y, #L22), Label mapping (FEC: NetID x, #L63), Label Release (FEC: NetID x, #L63) and Label Withdraw (FEC: NetID y, #L22).]


Once an LDP session is established, several types of message are used to exchange Label/FEC
associations:
Label Request (F): this message is sent by the upstream LSR to ask which label must be used for the
packets belonging to the FEC.

Label Mapping (F, L): the downstream LSR uses this message to assign the upstream LSR a label to
be used for the packets corresponding to the FEC. This message can be unsolicited or can be sent on
receipt of a label request.
Label Withdraw (F, {L, *}): the downstream LSR informs the upstream LSR that the association between label L
and FEC F is no longer valid and that this label must not be used anymore. When the label is omitted
(*), all the associations corresponding to FEC F are invalidated. The downstream LSR uses this
message, for example, after a routing change or when it can no longer route FEC F.
Label Release (F, L): the upstream LSR informs the downstream LSR that it no longer needs the F/L
association. The upstream LSR can send this message because the routing has just changed or
because it received an unsolicited and unnecessary label assignment.
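
The effect of Label Mapping and Label Withdraw on the upstream LSR can be sketched as updates to a per-peer table of FEC/label associations. This is an illustration of the message semantics described above, not of the LDP wire format; the function name, message strings, FEC names and label values are hypothetical.

```python
def apply_message(table, msg, fec, label=None):
    """Update the associations learned from one downstream peer.

    table maps a FEC to the set of labels currently valid for it.
    """
    if msg == "mapping":
        table.setdefault(fec, set()).add(label)
    elif msg == "withdraw":
        if label is None:
            table.pop(fec, None)          # label omitted (*): drop every association
        elif fec in table:
            table[fec].discard(label)     # drop one association only
    else:
        raise ValueError(f"unexpected message: {msg}")
    return table


learned = {}
apply_message(learned, "mapping", "NetID x", 63)
apply_message(learned, "mapping", "NetID y", 22)
after_mappings = {fec: set(labels) for fec, labels in learned.items()}

apply_message(learned, "withdraw", "NetID y")   # wildcard withdraw for NetID y
```

Label Request and Label Release travel in the opposite direction (upstream to downstream) and are therefore not handled by this table.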

LDP: Label Distribution Protocol

[Figure: an LSP for FEC 138.120 built by LDP across a chain of LSRs. Label Mapping messages (LSP-id: x) carrying label 33 and then label 25 travel upstream, and each LSR fills in its table (FEC, In if, In label, Out if, Out label), for example in interface 1, in label 25, FEC 138.120, out interface 3, out label 33.]

LDP defines a set of procedures and messages by which one LSR (Label Switched Router) informs another
of the label bindings it has made. The LSR uses this protocol to establish label switched paths through a
network by mapping network layer routing information directly to data-link layer switched paths.
Two LSRs (Label Switched Routers) which use LDP to exchange label mapping information are known as
LDP peers and they have an LDP session between them. In a single session, each peer is able to learn
about the other's label mappings; in other words, the protocol is bi-directional.
Label Distribution Protocol (LDP) is often used to establish MPLS LSPs when traffic engineering is not
required. It establishes LSPs that follow the existing IP routing, and is particularly well suited for
establishing a full mesh of LSPs between all of the routers on the network.
LDP can operate in several modes to suit different requirements:
In on-demand mode, the ingress router sends an LDP label request to the next hop router, as
determined from its IP routing table. This request is forwarded on through the network hop-by-hop
by each router. Once the request reaches the egress router, a return message is generated. This
message confirms the LSP and tells each router the label mapping to use on each link for that LSP.
In unsolicited mode, the egress routers broadcast label mappings for each external link to all of their
neighbors. These broadcasts are fanned across every link through the network until they reach the
ingress routers. Across each hop, they inform the upstream router of the label mapping to use for
each external link, and by flooding the network they establish LSPs between all of the external
links.
The main advantage of LDP over RSVP is the ease of setting up a full mesh of tunnels using unsolicited
mode, so it is most often used in this mode to set up the underlying mesh of tunnels needed by
MPLS-enabled VPNs.
2. MPLS Traffic Engineering
Drawbacks of IP routing
Traffic based on the lowest metrics
over-utilized links
under-utilized links
traffic congestion
Changing the metric causes traffic redirection
Only serves to move problem around
Lacks granularity
MPLS-TE
[Figure: a network between A and C with per-link costs, shown before and after one link metric is changed; the traffic moves from one set of links to another.]

Rerouting traffic by raising metrics along the current path has the desired effect of forcing the traffic onto
another path.
Since interior gateway protocol (IGP) route calculation was topology driven and based on a simple
additive metric such as the hop count or an administrative value, the traffic patterns on the network
were not taken into account when the IGP calculated its forwarding table. As a result, traffic was not
evenly distributed across the network's links, causing inefficient use of expensive resources. Some links
became congested, while other links remained underutilized. This might have been satisfactory in a
sparsely-connected network, but in a richly-connected network (that is, bigger, more thickly meshed
and more redundant) it is necessary to control the paths that traffic takes in order to balance loads.
IGP is topology driven as opposed to being resource driven!
As Internet service provider (ISP) networks became more richly connected, it became more difficult to
ensure that a metric adjustment in one part of the network did not cause problems in another part of
the network. Traffic engineering based on metric manipulation offers a trial-and-error approach rather
than a scientific solution to an increasingly complex problem.
IGP metric manipulation has a snap effect when it comes to redirecting traffic (not an even
distribution)
ISPs became uncomfortable with size of Internet core
Large growth spurt imminent
Routers too slow
IGP metric engineering too complex
IGP routing calculation was topology driven, not traffic driven
Router based cores lacked predictability

Goal
Traffic engineering:
Optimization of resource usage
(congestion risks are limited)
Quick re-routing
Guarantee of the Quality of Service (QoS)

MPLS-TE is used for protection and traffic engineering applications:

Optimization of resource usage: explicit constraint-based routing enables better load sharing than IP
routing, which is based on the shortest path towards a given destination.

Quick re-routing: explicit routing makes it possible to protect primary tunnels with backup tunnels set up in
advance. Because the backup is pre-established, no protocol convergence is needed during a failure and
the re-routing delays are very short (less than 100 ms). This technology is called MPLS Fast
Reroute.
Guarantee of the Quality of Service (QoS): MPLS-TE is not, strictly speaking, a QoS mechanism and
cannot by itself guarantee the quality of service. It makes a QoS offer easier to set up. It is important
to remember that the bandwidth reservation performed by MPLS-TE is purely logical and
is only used to facilitate constraint-based routing. It is a control plane function; no real
resource is allocated in the forwarding plane. A bandwidth guarantee requires at least combining MPLS-TE
with a rate limitation mechanism on the tunnel head routers. DiffServ-Aware MPLS-TE (DS-TE)
is an extension of MPLS-TE that routes tunnels per DiffServ class of
service and reserves resources per class of service rather than globally. It enables finer
traffic engineering. Combined with per-class rate limitation on the tunnel
head routers, DS-TE makes it possible to ensure that the traffic per class of service on a link does not exceed
a given threshold, and to guarantee not only the bandwidth but also the transit time in a tunnel.
The purpose of traffic engineering is to maximize the amount of traffic that can transit the network
while maintaining the quality of service, in order to defer network investments
(links, routers) as long as possible (RFC 3272). Traffic engineering works on a fixed network topology.
Traffic engineering is complementary to network engineering. Network engineering consists in finding a
topology, or in modifying the topology (links, nodes), to support the demand or an increase in demand.
The two are often distinguished by the following phrase: "traffic engineering consists
in making the traffic go where the bandwidth is, whereas network engineering consists in putting
bandwidth where the traffic goes."

MPLS-TE
First request : an LSP from ingress to egress LSR of 500 Mb/s

Second request : an LSP from ingress to egress LSR of 150 Mb/s
Third request : an LSP from ingress to egress LSR of 100 Mb/s
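[Figure: network between the ingress LSR and the egress LSR with links of 2.5 Gb/s, 622 Mb/s, 155 Mb/s and 45 Mb/s; the three requested LSPs (500 Mb/s, 150 Mb/s and 100 Mb/s) are placed on paths with enough capacity.]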

Metric-based traffic controls continued to be an adequate traffic engineering solution until 1994 or 1995.
At this point, some ISPs reached a size at which they did not feel comfortable moving forward with
either metric-based traffic controls or router-based cores.
Traditional software-based routers had the potential to become traffic bottlenecks under heavy load
because their aggregate bandwidth and packet-processing capabilities were limited.
It became increasingly difficult to ensure that a metric adjustment in one part of a huge network did not
create a new problem in another part. And router-based cores did not offer the high-speed interfaces
or deterministic performance that ISPs required as they planned to grow their core networks.
Traffic engineering (RFC 3346)

Traffic Engineering is the process where data is routed through the network according to a management
view of the availability of resources and the current and expected traffic.
The class of service and quality of service required for the data can also be factored into this process.
Traffic Engineering may be under the control of manual operators. They monitor the state of the network
and route the traffic or provision additional resources to compensate for problems as they arise.
Alternatively, Traffic Engineering may be driven by automated processes reacting to information fed
back through routing protocols or other means.
Traffic Engineering helps the network provider make the best use of available resources.
One of the main uses for MPLS will be to allow improved Traffic Engineering on the ISP backbone
networks.

Constrained SPF
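CSPF (Constrained Shortest Path First) calculation
Inputs: available bandwidth, priority, attributes, administrative weight (carried by OSPF-TE, IS-IS)
[Figure: tunnel from a to c requiring 200 Mb/s, over a topology with nodes a, b, c, d, e whose links are annotated with a cost (C=1 or C=2) and a bandwidth (10 Mb/s, 100 Mb/s, 200 Mb/s, 500 Mb/s, 1 Gb/s). A table lists path, cost and available bandwidth for each candidate path.]
Path       Available BW
a-c        10
a-b-c      100
a-d-e-c    500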

The ingress LSR determines the physical path for each LSP by applying a Constrained Shortest Path First
(CSPF) algorithm to the information in the TE-database. CSPF is a shortest-path-first algorithm that
has been modified to take into account specific restrictions when calculating the shortest path across
the network. Input into the CSPF algorithm includes:
Topology link-state information learned from the IGP and maintained in the TE-database
Attributes associated with the state of network resources (such as total link bandwidth, reserved link
bandwidth, available link bandwidth, and link color) that are carried by IGP extensions and stored in
the TE-database
Administrative attributes required to support traffic traversing the proposed LSP (such as bandwidth
requirements, maximum hop count, and administrative policy requirements) that are obtained from
user configuration
The output of the CSPF calculation is an explicit route consisting of a sequence of LSR addresses that
provides the shortest path through the network that meets the constraints. This explicit route is then
passed to the signaling component, which establishes forwarding state in the LSRs along the LSP. The
CSPF algorithm is repeated for each LSP that the ingress LSR is required to generate.
The following constraints can be taken into account:
bandwidth reservation
inclusion or exclusion of specific links
inclusion of specific nodes to traverse
optional backup paths
The network continuously keeps track of these constraints and floods them through IGP extensions. To launch a
new LSP, the operator configures the LSP constraints at the ingress LSR; the network then actively participates in
selecting a path that meets the constraints and represents it as an explicit route.
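
A minimal sketch of the CSPF idea with a single constraint, bandwidth: links that cannot carry the requested bandwidth are pruned, then an ordinary shortest-path search (Dijkstra) runs on what remains. The topology, costs and bandwidth values below are assumptions chosen for illustration, and the function name is hypothetical; a real implementation also handles priorities, link colors and explicit include/exclude lists.

```python
import heapq


def cspf(links, src, dst, bandwidth):
    """Return (cost, path) of the cheapest path whose links all offer
    at least `bandwidth`, or None if no such path exists.

    links maps (u, v) to (cost, available_bw); links are bidirectional.
    """
    graph = {}
    for (u, v), (cost, bw) in links.items():
        if bw >= bandwidth:                      # constraint: prune thin links
            graph.setdefault(u, []).append((v, cost))
            graph.setdefault(v, []).append((u, cost))

    heap = [(0, src, [src])]
    visited = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return cost, path                    # explicit route for signaling
        if node in visited:
            continue
        visited.add(node)
        for nxt, link_cost in graph.get(node, []):
            if nxt not in visited:
                heapq.heappush(heap, (cost + link_cost, nxt, path + [nxt]))
    return None


# Hypothetical TE database: (cost, available bandwidth in Mb/s).
te_db = {
    ("a", "c"): (1, 10),
    ("a", "b"): (1, 200),
    ("b", "c"): (2, 100),
    ("a", "d"): (1, 1000),
    ("d", "e"): (2, 1000),
    ("e", "c"): (1, 500),
}

route_200 = cspf(te_db, "a", "c", 200)
route_50 = cspf(te_db, "a", "c", 50)
```

For a 200 Mb/s tunnel the direct link and the path through b are pruned, so the result is the longer path through d and e, even though its cost is higher.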

MPLS-TE components
Destination
Bandwidth
Maximum, Reservable, Unreserved per priority
Affinities
Preemption
Protection by fast reroute
Optimized metric

Destination: The source of the TE LSP is the head-end router where the TE LSP is configured, whereas its
destination must be explicitly configured.
Bandwidth: One of the attributes of a TE LSP is the bandwidth it requires. The
traffic flow pattern between two points is rarely constant and is usually a function of the time of
day, not to mention the traffic growth triggered by the introduction of new services in the network or
simply by increased use of existing services. Hence, it is the responsibility of the network administrator to
determine the bandwidth requirement between two points and how often it should be reevaluated.
You can adopt a very conservative approach by considering the traffic peak, or use X percent of the peak or
averaged bandwidth values. After you determine the bandwidth requirement, you can apply an
over/underbooking ratio, depending on the overall objectives. Another approach consists of relying on
the routers to compute the required bandwidth based on the observed traffic sent to a particular TE
LSP.
Affinities: Affinities are represented by a field that must match on the set of links a TE LSP traverses.
Preemption: The notion of preemption refers to the ability to define up to eight levels of priority (0 to 7). In the
case of resource contention, this allows a higher-priority TE LSP to preempt (and, consequently, tear
down) lower-priority TE LSP(s) if both cannot be accommodated due to lack of bandwidth resources on
a link.
Protection by Fast Reroute: MPLS Traffic Engineering provides an efficient local protection scheme
called Fast Reroute to quickly reroute TE LSPs to a presignaled backup tunnel within tens of
milliseconds.
Optimized Metric: The notion of shortest path is always related to a particular metric. Typically, in an IP
network, each link has a metric, and the shortest path is the path such that the sum of the link metrics
along the path is minimal. MPLS TE also uses metrics to pick the shortest path for a tunnel that
satisfies the constraints specified. MPLS TE has introduced its own metric. When MPLS TE is configured
on a link, the router can flood two metrics for a particular link: the IGP and TE metrics (which may or
may not be the same).

Explicit Paths
[Figure: explicit path to Y for the tunnel from a to g, given as a table of hops (10.2.2.2, 10.3.3.3, 10.1.1.1), each marked strict (mandatory path) or loose (loose path).]

RSVP-TE : LSP and path

LSP: Point-to-Point entity
Path : tunnel instance
Tunnel is made of one or several LSPs
[Figure: a tunnel carried by two LSPs, labelled Path 1 and Path 2.]

RSVP-TE makes the distinction between LSP and tunnel:

a tunnel is a unidirectional point-to-point routing entity. It is made of one or more LSPs, each LSP
matching a particular path;
an LSP is a path.
The LSPs associated with an MPLS-TE tunnel can be modified during the life of the tunnel. This allows,
for instance, reoptimization without loss of traffic.
Modifying the path of a tunnel without loss of traffic consists of establishing a new LSP on the new
path and switching the traffic from the old LSP to the new LSP before tearing down the old LSP (a
procedure known as make-before-break). The two LSPs are identified as belonging to the same tunnel,
and the bandwidth is reserved only once on the links they share.
A tunnel is identified by the combination of the address of the head-end router, the address of the
destination router and a tunnel number (tunnel-Id) allocated by the head-end router. The combination
of the head-end address and the tunnel-Id ensures that tunnel identifiers are unique in the network.
In the same way, an LSP is identified by the combination of the tunnel-Id and an LSP number (LSP-Id)
allocated by the head-end router. The combination of the tunnel-Id and the LSP-Id ensures that LSP
identifiers are unique in the network. The tunnel-Id does not vary during the life of the tunnel. The
set of LSPs associated with a tunnel can vary during the life of the tunnel. For example, in the event
of an LSP failure or a reoptimization, a new LSP with a new LSP-Id is created for the tunnel.
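The tunnel and LSP identifiers and the make-before-break procedure described above can be sketched as
follows. This is a minimal illustration, not a router implementation: the class name Tunnel, its methods
and the sample addresses are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Tunnel:
    """Hypothetical model of an MPLS-TE tunnel owned by its head-end router."""
    head_end: str
    tail_end: str
    tunnel_id: int                       # allocated by the head end, never changes
    lsps: Dict[int, List[str]] = field(default_factory=dict)  # LSP-Id -> path
    active_lsp_id: Optional[int] = None
    next_lsp_id: int = 1                 # LSP-Ids are also allocated by the head end

    def key(self) -> Tuple[str, str, int]:
        # Head-end address + destination address + tunnel-Id identify the tunnel.
        return (self.head_end, self.tail_end, self.tunnel_id)

    def _signal(self, path: List[str]) -> int:
        lsp_id = self.next_lsp_id
        self.next_lsp_id += 1
        self.lsps[lsp_id] = list(path)
        return lsp_id

    def establish(self, path: List[str]) -> int:
        self.active_lsp_id = self._signal(path)
        return self.active_lsp_id

    def reoptimize(self, new_path: List[str]) -> int:
        old = self.active_lsp_id
        new = self._signal(new_path)     # make: the new LSP gets a new LSP-Id
        self.active_lsp_id = new         # switch the traffic to the new LSP
        if old is not None:
            del self.lsps[old]           # break: only now tear down the old LSP
        return new
```

The tunnel key stays the same across a reoptimization, while the LSP-Id changes each time a new LSP is
signaled.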

RSVP-TE : Principle
[Figure: an LSP is signaled across a chain of LSRs. Path messages (Tunnel-ID: n, LSP-ID: x) travel
downstream carrying the ERO (b,c on the first hop, then c) and the traffic parameters. Resv messages
(Tunnel-ID: n, LSP-ID: x) travel upstream carrying a label (15, 33) and the final traffic parameters.
The resulting label tables (FEC, operation, in label, out label, next hop, out interface) show
Push 15 at the ingress and Swap 15 to 33 at the next LSR.]
ERO : Explicit Route Object

Generic RSVP (Resource reSerVation Protocol) uses a message exchange to reserve resources across a
network for IP flows. The Extensions to RSVP for LSP Tunnels (RSVP-TE) enhance generic RSVP so that
it can be used to distribute MPLS labels.
RSVP-TE is a separate protocol at the IP level. It uses IP datagrams (or UDP at the margins of the
network) to communicate between LSR peers. It does not require the maintenance of TCP sessions,
but as a consequence it must handle the loss of control messages itself.
The basic flow for setting up an LSP using RSVP-TE is:
1. The traffic parameters required for the session or administrative policies for the network enable LSR
A to determine that the route for the new LSP should go through LSR B, which might not be the same
as the hop-by-hop route to LSR C. LSR A builds a Path message with an explicit route of (B,C) and
details of the traffic parameters requested for the new route.
2. LSR A then forwards the Path message to LSR B as an IP datagram.
3. LSR B receives the Path request, determines that it is not the egress for this LSP, and forwards the
request along the route specified in the request. It modifies the explicit route in the Path message
and passes the message to LSR C.
4. LSR C determines that it is the egress for this new LSP, determines from the requested traffic
parameters what bandwidth it needs to reserve and allocates the resources required. It selects a
label for the new LSP.
5. LSR C distributes the label to LSR B in a Resv message, which also contains actual details of the
reservation required for the LSP.
6. LSR B receives the Resv message and matches it to the original request using the LSP ID contained in
both the Path and Resv messages.
7. LSR B determines what resources to reserve from the details in the Resv message, allocates a label
for the LSP and sets up the forwarding table.
8. LSR B passes the new label to LSR A in a Resv message.
9. The processing at LSR A is similar, but it does not have to allocate a new label and forward this to an
upstream LSR because it is the ingress LSR for the new LSP.
Path and Resv messages are refreshed periodically unless refresh is suppressed.
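The flow above can be sketched as a small simulation. It is a simplified illustration only: the function
name signal_lsp, the starting label value and the table layout are assumptions, and bandwidth reservation,
refresh and error handling are left out. The egress is given an ordinary label here, which is a
simplification.

```python
from itertools import count


def signal_lsp(routers, first_label=16):
    """Simulate Path (downstream) then Resv (upstream) along `routers`.

    `routers` is the ordered list of LSRs from ingress to egress.
    Returns (ero_sent, table): the ERO each LSR puts in the Path message it
    sends, and the label operation installed on each LSR.
    """
    # Path phase: each LSR removes itself from the explicit route before
    # forwarding, so A sends (B, C) and B sends (C).
    ero_sent = {}
    for i, lsr in enumerate(routers[:-1]):
        ero_sent[lsr] = routers[i + 1:]

    # Resv phase: walk back from egress to ingress. Every LSR except the
    # ingress allocates a label and advertises it to its upstream neighbor.
    labels = count(first_label)
    table = {}
    downstream_label = None
    for i in range(len(routers) - 1, -1, -1):
        lsr = routers[i]
        if i == len(routers) - 1:            # egress: label is popped
            in_label = next(labels)
            table[lsr] = ("pop", in_label, None)
        elif i == 0:                         # ingress: no label to allocate
            in_label = None
            table[lsr] = ("push", None, downstream_label)
        else:                                # transit: swap in -> out
            in_label = next(labels)
            table[lsr] = ("swap", in_label, downstream_label)
        downstream_label = in_label
    return ero_sent, table
```

Calling signal_lsp(["A", "B", "C"]) shows the ERO shrinking hop by hop and the labels being distributed
in the opposite direction to the data flow.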
Path Protection Secondary/Standby LSP

[Figure: a failure on the path of the LSP triggers a failure notification toward the head-end router,
followed by (3) a new path computation and (4) the signaling of the new LSP (backup tunnel).]

The default mode of network recovery in MPLS Traffic Engineering is a global restoration mechanism:
Global: the node in charge of rerouting a TE LSP affected by a network element failure is the head-end
router.
Restoration: when the head-end router is notified of the failure, a new path is dynamically
computed, and the TE LSP is signaled along the new alternate path (assuming one can be found).
An LSP is initially set up. The link fails. After a period of time (the fault detection time), the upstream
router detects the failure. This period of time essentially depends on the failure type and the Layer 1
or 2 protocol. If you assume a Packet over SONET (PoS) interface, the fault detection time is
usually on the order of a few milliseconds. In the absence of a hold-off timer, the router upstream of
the failure immediately sends the failure notification (RSVP-TE Path Error message) to the head-end
router.
Accurately quantifying the time required to perform the set of operations just described is particularly
difficult because of the many variables involved. These include the network topology (and hence the
number of nodes the failure notification and the new LSP signaling messages have to go through and
the propagation times of those through fiber), the number of TE LSPs affected by the failure, the CPU
power of the routers, and so on. We can provide an order of magnitude. On a significantly large and
loaded network, the CSPF time and RSVP-TE processing time per node are usually a few milliseconds.
Then the propagation delay must be taken into account in the failure notification time as well as in the
signaling time. So, on a continental network, MPLS TE head-end rerouting would be on the order of
hundreds of milliseconds.
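The order of magnitude above can be illustrated with a rough additive model. This is only a sketch: the
function name, the decomposition into four terms and every sample value below are assumptions chosen for
illustration, with fiber propagation taken as about 5 microseconds per kilometre. Per-node processing of
the failure notification is ignored.

```python
def head_end_reroute_ms(detect_ms, notify_km, cspf_ms,
                        new_path_km, new_path_nodes, per_node_ms,
                        us_per_km=5.0):
    """Rough estimate of the head-end rerouting time in milliseconds."""

    def propagation_ms(km):
        return km * us_per_km / 1000.0

    # Failure notification travels from the point of failure to the head end.
    notify_ms = propagation_ms(notify_km)
    # Signaling the new LSP needs a Path message down and a Resv message back,
    # each processed by every node on the new path.
    signaling_ms = 2 * (propagation_ms(new_path_km)
                        + new_path_nodes * per_node_ms)
    return detect_ms + notify_ms + cspf_ms + signaling_ms


# Assumed sample values: 5 ms detection, failure 2000 km from the head end,
# 5 ms CSPF, new path of 4000 km through 10 nodes at 3 ms per node.
estimate = head_end_reroute_ms(5, 2000, 5, 4000, 10, 3)
```

With these assumed values the estimate is 120 ms, which is consistent with the order of magnitude quoted
above for a continental network.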
MPLS TE Reroute is undoubtedly the simplest MPLS TE recovery mechanism because it does not require
any specific configuration and minimizes the required amount of backup state in the network. The
downside is that its rerouting time is neither as fast nor as predictable as that of the other MPLS TE
recovery techniques discussed next. Indeed, the fault first has to be signaled to the head-end router,
followed by a path computation and the signaling of a new TE LSP along another path, if any (with
some risk that no backup path can be found, or at least none with equivalent constraints).
7750 SR : Up to seven secondary or standby LSPs can be specified for each primary LSP. All the
secondary paths are considered equal and the first available path is used.

Fast Reroute
[Figure: network of routers R1 to R9 showing the protected LSP and the detour or bypass LSPs listed
below.]
R1's backup: R1>R6>R7>R8>R3
R2's backup: R2>R7>R8>R4
R3's backup: R3>R8>R9>R5
R4's backup: R4>R9>R5

There are two different methods for local protection. In the one-to-one backup method, a PLR (Point of
Local Repair) computes a separate backup LSP, called a detour LSP, for each LSP that the PLR protects.
In the facility backup method, the PLR creates a single bypass tunnel that can be used to protect
multiple LSPs.
The facility backup fast reroute method uses a facility backup tunnel, or bypass, to bypass a failed link
or a failed node. This method takes advantage of MPLS's label stacking capabilities, and all LSPs
protected using this method are protected using a single, common bypass tunnel. Their original labels
are left intact, and another label is pushed on top to direct it through the bypass tunnel. At the egress
end of the tunnel, the traffic is merged back into the original path by popping the outer label and
examining the inner label to find out where the packet should go.
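The label stacking used by the facility backup method can be sketched as follows. This is a minimal
illustration that models a packet as a list of labels with the top of the stack first; the function names
and label values are assumptions, not taken from a real router.

```python
def enter_bypass(label_stack, bypass_label):
    """At the PLR: keep the original labels and push the bypass label on top."""
    return [bypass_label] + list(label_stack)


def leave_bypass(label_stack):
    """At the egress end of the bypass tunnel: pop the outer label.

    Returns the inner label to examine and the stack that continues on
    the original path.
    """
    remaining = list(label_stack[1:])
    return remaining[0], remaining


# Two protected LSPs (assumed labels 100 and 200) share one bypass tunnel
# (assumed label 900).
in_bypass_a = enter_bypass([100], 900)
in_bypass_b = enter_bypass([200], 900)
```

Both packets carry the same outer label inside the bypass tunnel, and each recovers its own label when
the outer label is popped at the merge point.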
One-to-One Backup: a local repair method in which a backup LSP is separately created for each protected
LSP at a Point of Local Repair.
Each upstream node sets up a detour LSP that avoids only the immediate downstream node, and merges
back on to the actual path of the LSP as soon as possible. If it is not possible to set up a detour LSP
that avoids the immediate downstream node, a detour can be set up to the downstream node on a
different interface.
The detour LSP may take one or more hops before merging back on to the main LSP path.

3. MPLS VPN Services

Layer 2 VPN services

Point-to-Point Service
PWE3 (Pseudo Wire Emulation Edge to Edge)
or VPWS (Virtual Private Wire Service)
a.k.a (Alcatel-Lucent) : VLL (Virtual Leased Line)
[Figure: LAN A and LAN C connected point-to-point between two PEs.]

Point-to-Multipoint Service
VPLS : Virtual Private LAN Service
[Figure: LAN B and LAN D sites connected through several PEs.]


Layer-3 VPNs worked well for a number of customers; however, a significant percentage of the
marketplace used legacy systems and networks for which a Layer-2 VPN solution would be better suited.
Businesses found that Layer-3 VPNs met only part of the end users' requirements. Early adopters of
MPLS therefore discovered that there was a market demand for Layer-2 VPNs as well.
For MPLS carriers wishing to capture the FR and ATM marketplace, VPWS offers rapid service conversion.
Customers are able to keep their FR or ATM connection with the same equipment. The difference
is that traffic is now carried encapsulated in an MPLS header over an MPLS network.
In VPWS, the service provider provides a pseudo-wire across the network. This overlay model provides
circuit emulation from customer to customer. It provides services similar to ATM and FR; however,
significant cost savings can be realized using MPLS.
As these needs were identified, different architectures were suggested for MPLS Layer-2 VPNs, including:
PWE3 (Pseudo Wire Emulation Edge to Edge), or VLL (Virtual Leased Line). One of the important features
of this solution is that the configuration and management required in the provider network are much
simpler than for leased lines or for the MPLS and Martini solutions mentioned above, which makes it
cheaper for the provider to supply such a service. In addition, this type of VPWS is more flexible than
using leased lines.
VPLS (Virtual Private LAN Services), or TLS (Transparent LAN Service).

3.1 Virtual Private Wire Service (VPWS)

Point to Point VPN (Pseudowire) Principle

[Figure: sites 1 and 2 of the Red, Blue and Green customers attach to PE1, PE2 and PE3. Point-to-point
pseudo-wires carrying ATM, FR, PPP, HDLC or Ethernet are tunneled through LSPs across the P routers.]
Encapsulation method: RFC 4448 (aka Martini draft, after Luca Martini)
Signaling protocol (pseudo-wire setup and control): LDP (RFC 4447) or MP-BGP (RFC 4761)

PWE3 (Pseudo Wire Emulation Edge to Edge)

An improvement on this approach is to use the PWE3 extensions to MPLS that are currently being
standardized by the IETF in the PWE3 working group.
These extensions improve scalability by using a fixed number of MPLS LSPs between PE devices in the
provider network. Emulated, point-to-point layer 2 connections (known as pseudo-wires or Martini
pseudo-wires, after the author of the original draft) are then created between pairs of PE devices by
tunneling through such an LSP.

Encapsulation Ethernet
Example : Transport of Ethernet
a.k.a (Alcatel-Lucent) : epipe (Ethernet pt2pt)
[Figure: Ethernet frames from the Red and Blue sites cross the LSP between PE1 and PE2 through a P
router. Fields shown: MPLS label entries of 4 bytes each (Label, EXP, S, TTL), with n tunnel label
entries followed by the pseudo-wire label entry; Control Word (4 bytes) with optional sequence number;
MAC @ dest, MAC @ src, EtherType, Data, FCS.]
RFC 4448 : Encapsulation Methods for Transport of Ethernet over MPLS Networks

Raw Mode vs. Tagged Mode

Raw mode - the PW represents a connection between two Ethernet ports. If the CE tags the frame, the tag
is not meaningful to the PE, and the frame is delivered across the MPLS network as it was received.
Tagged mode - the PW represents a connection between two VLANs. The tag is used by the service
provider to distinguish the traffic. Each VLAN is represented by a different PW.
Control Word
When carrying Ethernet over an MPLS backbone, frame sequence may need to be preserved. The optional
control word addresses this requirement. In general, applications running over Ethernet do not require
strict frame ordering.
QoS Considerations
The ingress PE may consider the user priority (PRI) field [802.1Q] of the VLAN tag header when determining
the value to be placed in a QoS field of the encapsulating protocol (e.g., the EXP fields of the MPLS label
stack). In a similar way, the egress PE may consider the QoS field of the encapsulating protocol (e.g., the
EXP fields of the MPLS label stack) when queuing the frame for transmission toward the CE.
Ethernet pt2pt epipe - aka VPWS (Virtual Private Wire Service)
Transparent to the subscriber's data and protocols
True VLL - No MAC learning
The service provider can apply proper QoS treatment, billing and ingress/egress filtering, shaping and
policing
Draft-martini-l2circuit-eth-mpls service encapsulation
VC label dynamically assigned (T-LDP) or provisioned
Signaling LDP (TLDP)

[Figure: PE1 and PE2, connected by LSPs, set up a bidirectional VC with PWID = 66. Each PE attaches the
VC to a local Ethernet interface (If 0, If 1).]
Manual configuration is required on each PE: PWID (66) and remote PE (PE2 on PE1, PE1 on PE2).
Each PE sends an LDP LABEL_MAPPING message to the other; one advertises VPN label 18, the other VPN
label 23:
PW type : Ethernet
FEC : Virtual Circuit
PWID : 66
MTU : 1500
Control Word : Present / not present
Label : 18 (respectively 23)
RFC 4447 : Pseudowire set up and maintenance using LDP

Virtual Circuit FEC Element

C - Control Word present

VC Type - FR, ATM, Ethernet, HDLC, PPP, ATM cell. Assigned values are specified in "IANA Allocations
for Pseudowire Edge to Edge Emulation (PWE3)".
VC Info Length - length of VCID field
Group ID - user configured - group of VCs representing port or tunnel index
PW ID - used with VC type to identify unique VC.
Interface Parameters - Specific I/O parameters
Note that the PW ID and the PW type MUST be the same at both endpoints.
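The matching rule in the note above can be sketched as a simple check. This is a hypothetical data model
for illustration only: the class PwFec, its field names and the function can_bind are assumptions, and
only the rule stated above (same PW ID and same PW type) is checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PwFec:
    """Subset of the fields advertised for a pseudo-wire (illustrative)."""
    pw_type: str
    pw_id: int
    group_id: int = 0
    control_word: bool = False
    label: int = 0


def can_bind(local: PwFec, remote: PwFec) -> bool:
    # The PW ID and the PW type must be the same at both endpoints.
    return local.pw_id == remote.pw_id and local.pw_type == remote.pw_type


pe1 = PwFec(pw_type="Ethernet", pw_id=66, label=18)
pe2 = PwFec(pw_type="Ethernet", pw_id=66, label=23)
mismatch = PwFec(pw_type="Ethernet", pw_id=67, label=23)
```

The labels differ on purpose: each endpoint advertises its own label, and only the PW ID and PW type
have to agree.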

Signaling LDP/MP-BGP
[Figure: CE1, CE2 and CE3 of the RED VPN attach to PEs of the MPLS network through FR virtual circuits
(VC100 to VC301); the PEs are interconnected by LSPs.]
Local label mapping per CE:
RED-CE1: VC100 / L12 (CE1, CE2); VC101 / L13 (CE1, CE3)
RED-CE2: VC200 / L21 (CE1, CE2); VC201 / L23 (CE3, CE2)
RED-CE3: VC300 / L31 (CE3, CE1); VC301 / L32 (CE3, CE2)
RFC 4761 : Using BGP for Auto-Discovery and Signaling

Another solution is described in draft-kompella-ppvpn-l2vpn. This draft gives a mechanism for creating a
VPWS using MP-BGP as both an auto-discovery protocol and a signaling protocol.
In this solution, each PE device uses Multi-Protocol BGP (MP-BGP) to advertise the CE devices and VPNs
connected to it, together with the MPLS labels used to route data to them. Consequently, when this
information is received by the other PE devices, they learn how to set up the VPWS.

Data exchange
[Figure: a packet carrying inner label L12 crosses the MPLS network over an LSP between the PEs. The
outer LSP label changes along the path (L54, L31 and L35 appear in the figure) while the inner label
L12 is unchanged.]
Local label mapping per CE:
RED-CE1: VC100 / L12 (CE1, CE2); VC101 / L13 (CE1, CE3)
RED-CE2: VC200 / L21 and L12 (CE1, CE2); VC201 / L23 and L32 (CE3, CE2)
RED-CE3: VC300 / L31 (CE3, CE1); VC301 / L32 (CE3, CE2)

CE 2 can now send data to CE 1.

To do so, it simply sends the data over VC 1, which is the VC corresponding to CE 1. PE 2 knows that data
received on VC 1 should be tunneled to PE 1 with label L12 (the label for data from CE 2 to CE 1).
Then, because the connection between PE 1 and PE 2 is achieved through an LSP, a second label is stacked
on top. This top label can be swapped by the core LSRs.
The egress PE discards the top label. The inner label L12 is then removed and the data is forwarded over
VC 2 to CE 1.
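The two-label forwarding described above can be sketched as three small functions. This is an
illustration only: the function names, the outer label values (54 and 31) and the dictionary-based tables
are assumptions; the inner label L12 and the VC names follow the text.

```python
def ingress_pe(payload, vc_label, tunnel_label):
    """PE 2: push the VC label, then the LSP (tunnel) label on top."""
    return {"labels": [tunnel_label, vc_label], "payload": payload}


def core_lsr(packet, swap_table):
    """Core LSR: swap the top label only; the inner label is untouched."""
    top, *rest = packet["labels"]
    return {"labels": [swap_table[top]] + rest, "payload": packet["payload"]}


def egress_pe(packet, vc_table):
    """PE 1: discard the top label, then use the inner label to pick the VC."""
    _top, inner = packet["labels"]
    return vc_table[inner], packet["payload"]


# Assumed values: outer label 54 swapped to 31 by one core LSR.
sent = ingress_pe(b"data from CE 2", vc_label="L12", tunnel_label=54)
in_core = core_lsr(sent, swap_table={54: 31})
out_vc, delivered = egress_pe(in_core, vc_table={"L12": "VC 2"})
```

Only the PEs need to know the inner label; the core LSRs forward on the top label alone.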

Lasserre-V.Kompella vs. K.Kompella

                                   Lasserre-V.Kompella              K.Kompella
Authors                            Marc Lasserre, Vach Kompella     Kireeti Kompella
Signaling or auto-configuration    LDP                              MP-BGP
(tunnels establishment and
routing information exchanges)
Auto-discovery                     No: to do manually or using      MP-BGP
(learning which other PE routers   proprietary solutions
are participating in the VPLS)
Supported by                       Many vendors (Alcatel-Lucent)    Juniper
Remarks                                                             Complex. Spends bandwidth

Signaling (also called auto-configuration): the mechanism by which tunnels are established and routing
information is exchanged.
Auto-discovery: the process by which one PE router learns which other PE routers are participating in the
VPLS.
The main difference between the two drafts is that Vach Kompella advocates using LDP for VPLS signaling,
while Kireeti Kompella proposes MP-BGP, which can both signal and discover the other VPLS nodes.
Currently, Juniper is the only company supporting Kireeti's draft (Draft Kompella). Most vendors planning
on offering VPLS are behind Vach's solution, co-authored with Marc Lasserre.
The two drafts have very similar names and both relate to how routers assign labels, but there are subtle
differences.
Alcatel supports the approach to label distribution specified in the draft named Lasserre-V.Kompella. This
specification uses LDP for assigning the label for a pseudo-wire LSP. This is convenient because
routers in an MPLS network already support LDP signaling for their LSPs. LDP has been designed to establish
signaling relationships with directly connected neighbors as well as indirectly connected neighbors and is
easily extensible.
The Lasserre-V.Kompella draft does not define an auto-discovery method, so LDP must be extended, discovery
must be configured manually, or a proprietary solution must be developed.
The alternative approach, supported by Juniper, is named K.Kompella. It uses MP-BGP for signaling the
assigned labels. Again, the routers in an MPLS network already use BGP, and use MP-BGP for the MPLS L3 VPN
service, so this is convenient. However, since BGP propagates its updates to all peers, much like a
broadcast mechanism, it may not be bandwidth efficient.
K-Kompella Pros:
Similar to L3VPNs (uses MP-BGP, like L3VPNs)
Easier to add PEs to a VPN
Don't have to run LDP
Uses Auto-Discovery
K-Kompella Cons:
Not as widely supported as Lasserre-V.Kompella
BGP is essentially a broadcast mechanism (wasted bandwidth, security)
3.2 Virtual Private LAN Services (VPLS)

VPLS : Virtual Private LAN Service

Point-to-Multipoint Service
[Figure: customer sites of VPLS A and VPLS B attach to PE1, PE2 and PE3, directly or through an MTU.
Customer VPLS traffic is tunneled through the MPLS network over LSPs set up between the PEs, across
the P routers.]
MTU : Multi Tenant Unit

One of the main differences between a VPWS and the VPLS described above is that the VPWS only provides
a point-to-point service, whereas the VPLS provides a point-to-multipoint service. This also means that
the requirements on the CE devices are quite different. In a VPWS, layer 2 switching must be carried out
by the CE routers, which have to choose which Virtual Wire to use to send data to another customer site.
In comparison, the CE routers in a VPLS simply send all traffic destined for other sites to the PE router.
MTUs are typically located in large buildings, serving different customers.
In the IETF L2VPN terminology, an MTU is called a Layer 2 PE (L2PE).
Customers designated VPLS A and VPLS B are part of two independent Virtual Private LANs.
Tunnel LSPs are set up between PEs.
Layer 2 VC LSPs are set up inside the tunnel LSPs.
The CE at the ingress side simply reviews Layer-2 addresses and forwards information to the CE on the
egress side based upon Layer-2 switching or bridging tables.
All customer sites using VPLS appear to be on the same LAN, regardless of their location. From the
customer edge device point of view, the WAN is not visible.
Customer edge devices appear to each other as connected via a single logical learning bridge with fully
meshed ports.
VPLS combines the best of Frame Relay VPN and IP.
Defined in draft-lasserre-vkompella-vpls-l2vpn-08.txt

VPLS : LAN emulation

[Figure: sites A, B and C attach to PE1, PE2 and PE3 around the MPLS network. Each PE hosts a VPLS
bridge performing IEEE 802.1D bridging (MAC learning); the bridges are interconnected by pseudo-wires,
so the network appears to the sites as a single switch (LAN emulation).]

The MPLS core acts like a Layer-2 Bridge (LAN switch).

VPLS Forwarding
Learns MAC addresses per pseudo-wire (VC LSP)
Forwarding based on MAC addresses
Replicates multicast & broadcast frames
Floods unknown frames
Split-horizon for loop prevention
Standard IEEE 802.1D code

Used to interface with customer facing ports
Might run STP with CEs
Used to interface with VPLS
Might run STP between PEs
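The forwarding rules listed above (MAC learning, flooding of unknown and broadcast frames, split horizon
between pseudo-wires) can be sketched for one VPLS instance on one PE. This is a simplified illustration:
the class and port names are assumptions, multicast is not modelled separately from broadcast, and MAC
aging and STP are left out.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class VplsBridge:
    """One VPLS instance on a PE: access ports plus pseudo-wires to other PEs."""

    def __init__(self, access_ports, pseudowires):
        self.access_ports = set(access_ports)
        self.pseudowires = set(pseudowires)
        self.mac_table = {}                      # MAC address -> port or PW

    def receive(self, src_mac, dst_mac, in_port):
        """Learn the source, then return the sorted list of output ports."""
        self.mac_table[src_mac] = in_port        # learn per port / pseudo-wire
        known = None if dst_mac == BROADCAST else self.mac_table.get(dst_mac)
        if known is not None:
            candidates = {known}                 # known unicast
        else:
            candidates = self.access_ports | self.pseudowires   # flood
        candidates.discard(in_port)              # never send back where it came from
        if in_port in self.pseudowires:
            # Split horizon: a frame received from a pseudo-wire is never
            # forwarded to another pseudo-wire.
            candidates -= self.pseudowires
        return sorted(candidates)
```

Split horizon is what allows the full mesh of pseudo-wires between PEs to stay loop free in this model
without relying on a spanning tree in the core.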
There are two proposed standards for implementing a VPLS. They differ based on their approach to the following:
Auto-Discovery What technique is used to enable backbone routers that participate in a VPLS domain to find each
other?
Signaling What protocol is used to set up MPLS tunnels and distribute labels?
Draft Lasserre-Vkompella VPLS
This solution uses LDP for signaling and does not use a protocol for auto-discovery. Any network organization that
implements it would have to know what backbone routers were a part of a VPLS instance. For every VPLS instance on
a backbone router, the network organization would have to configure that backbone router with the addresses of all
of the other backbone routers that are part of that VPLS instance. This approach is both operationally demanding and
error prone, and it introduces another protocol (LDP) into the network.
Draft Kompella VPLS
A Draft Kompella VPLS uses MP-BGP for both auto-discovery and signaling. Using MP-BGP for auto-discovery greatly
simplifies the configuration of VPLS without introducing an additional protocol into the network.

VPLS : Virtual Forwarding Instance
[Figure: three PEs around the MPLS network, interconnected by LSPs carrying pseudo-wires. Each PE holds
a Red VFI table and a Blue VFI table. The Red and Blue sites 1, 2 and 3 attach through Ethernet ports
(e0, e1, e2, e3), some of them on VLAN tag 8.]
VFI : Virtual Forwarding Instance


Provider Edge routers track MAC addresses in VPLS networks by using Virtual Forwarding Instances (VFIs).
VFIs are tables that contain MAC addresses for a given VPLS service or customer.
VFIs can be assigned to a physical port such as an Ethernet interface, or to a VLAN.
VFIs separate one customer's MAC addresses and VLANs from another's.
Thus, PEs associate received frames with a particular pseudo-wire, using the VFI assigned to the port.

VPLS : Encapsulations
[Figure: PE1 and PE2 act as self-learning bridges (running the Spanning Tree Protocol) for the Red and
Blue sites 1 and 2. The Red and Blue VFI tables on each PE map MAC addresses (a,b,c / d,e,f / g,h,i for
Red; l,m,n / o,p,q / r,s,t for Blue) to a local interface or to an LSP label and a VPN label. Frames
cross the MPLS network as: LSP label, VPN label, control word (CW), Ethernet frame. VPN labels 34 and
65 appear in the figure.]
LSP PE1 to PE2 : 56-25-3
LSP PE2 to PE1 : 12-27-3


These VFIs contain MAC addresses and/or VLAN tags as well as any QoS policies. They also contain inner
labels used for a given Pseudo-Wire or set of pseudo-Wires established for the customer.
Here we see the encapsulation of Ethernet over an MPLS network for a VPLS service.
A standard Ethernet frame is received off the LAN on the customer edge switch (which can also be an MTU).
The frame is forwarded to the PE. The PE then looks up the VFI assigned to the port. From information
stored in the VFI, the PE then adds the VPLS/MPLS headers, which include:
A control word
A VPN label that represents the pseudo-wire
The network MPLS label that reaches the destination PE

VPLS : Hierarchical VPLS (H-VPLS)
[Figure: a flat topology (full mesh, VPLS scalability problem) compared with a hierarchical hub-and-spoke
topology.]

There are some scalability limitations that apply equally.
VPLS places a significant burden on the PE devices. In particular, the PE device performs routing in the
provider network, maintains the MPLS tunnels in the provider network together with the pseudo-wires
on top of these, and performs MAC learning for all of the attached VPLSs. This means that the
PE device will need enough processing power and memory to maintain the forwarding state for
hundreds or thousands of VPLS instances, each of which could have thousands of MAC addresses.
The size of a single VPN instance is limited by the efficiency of the MAC learning and bridging
algorithms deployed. As a general rule of thumb, it is likely to be possible to connect tens of sites to a
single VPLS VPN, but not hundreds.
Hierarchical VPLS
VPLS requires a full mesh of pseudo-wires between all PE devices, which causes scalability problems.
It is beneficial to select one PE as the hub and to set up tunnels only between this hub PE and the
other (spoke) PEs.
This architecture has a direct impact on the signaling overhead.
This approach seems to be well established as a good solution to the core LSP scalability issue.
It reduces:
the number of connections;
the replication requirement (in the basic model, when a frame is received whose destination MAC
address is unknown, it has to be flooded: the PE replicates the frame to all other PE routers in the
network mesh).
However, it does not reduce the number of MAC addresses that need to be maintained. The PE still does the
Ethernet bridging.
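The reduction in the number of connections can be made concrete with a short calculation. This is a
sketch under a simplifying assumption: a single hub PE and one pseudo-wire between the hub and each other
PE. The function names are chosen for the example.

```python
def full_mesh_pseudowires(pe_count):
    """Flat VPLS: one pseudo-wire between every pair of PEs."""
    return pe_count * (pe_count - 1) // 2


def hub_and_spoke_pseudowires(pe_count):
    """One hub PE connected to each of the other PEs."""
    return max(pe_count - 1, 0)


# Growth of both topologies for a few assumed network sizes.
comparison = {n: (full_mesh_pseudowires(n), hub_and_spoke_pseudowires(n))
              for n in (5, 10, 50)}
```

The full mesh grows with the square of the number of PEs, while the hub-and-spoke count grows linearly.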

VPLS : De-coupled VPLS

[Figure: MTUs connected through VLANs to the PEs of the MPLS network; callouts contrast thousands of MAC
addresses with hundreds of MAC addresses.]
MTU: Multi Tenant Unit


De-coupled VPLS distributes the VPLS functions between PEs and MTUs
De-coupled VPLS reduces the number of MAC addresses to maintain, and the number of signaling
connections but does not limit the number of pseudo-wires as the hierarchical VPLS does.
All Ethernet MAC functions (MAC switching, learning, aging, flooding, STP, etc) and Pseudo-wires
termination functions are performed in the MTU, while the auto-discovery and LSR (MPLS) functions are
performed in the PEs
The link between MTU and PE is able to maintain multiple virtual circuits implemented using VLAN tags (or
MPLS labels).
PE acts as an LSR/LER. It does not implement Ethernet bridging functions.
The result in this architecture is that MTUs perform all the replication and MAC functions and the PEs
establish a Pseudo-Wire mesh for each MTU-to-MTU link necessary for connectivity using MPLS
provisioning and signaling.

3.3 Virtual Private Routed Network (VPRN)

VRF : Virtual Routing and Forwarding
[Figure: CE routers of the Blue, Red and Yellow VPNs attach to PE routers; each PE holds one VRF (VRF
Blue, VRF Red, VRF Yellow) per attached VPN.]

VRFs : VPN (or virtual) Routing and Forwarding Table.

Each VPN uses its own forwarding table.
At a PE, a VRF represents the context that is specific to an attached VPN; a VRF is primarily associated to
(is identified by) the one or more sub-interfaces through which the sites belonging to this VPN are
connected.
In this architecture, each PE maintains a virtual router for each VPN forwarding table. Fully meshed tunnels
are advertised across the core using VR protocols. The core of the MPLS network does not combine data
from several sites. Since the data is kept separate, this design has the added benefit of additional
security in that a misconfiguration will not impact security of the data. The downside of this design could
prove to be one of scalability and the need for complex configuration.
Multiple VRFs are used on PE routers.
Each VPN needs a separate Virtual Routing and Forwarding instance (VRF) in each PE router. This:
provides VPN isolation;
allows overlapping, private IP address spaces to be used by different organizations.
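The isolation provided by per-VPN forwarding tables can be sketched with one routing table per VRF,
selected by the incoming interface. This is a minimal illustration: the class ProviderEdge, its methods
and the interface and CE names used below are assumptions made for the example.

```python
import ipaddress


class ProviderEdge:
    """PE holding one routing table per VRF; interfaces are bound to VRFs."""

    def __init__(self):
        self.interface_vrf = {}   # interface name -> VRF name
        self.vrfs = {}            # VRF name -> {network: next hop}

    def bind_interface(self, interface, vrf_name):
        self.interface_vrf[interface] = vrf_name
        self.vrfs.setdefault(vrf_name, {})

    def learn_route(self, interface, prefix, next_hop):
        # A route learned on an interface goes only into that interface's VRF.
        vrf = self.vrfs[self.interface_vrf[interface]]
        vrf[ipaddress.ip_network(prefix)] = next_hop

    def lookup(self, interface, destination):
        # Longest-prefix match restricted to the VRF of the incoming interface.
        vrf = self.vrfs[self.interface_vrf[interface]]
        address = ipaddress.ip_address(destination)
        matches = [net for net in vrf if address in net]
        if not matches:
            return None
        return vrf[max(matches, key=lambda net: net.prefixlen)]


pe = ProviderEdge()
pe.bind_interface("if_red", "Red")
pe.bind_interface("if_green", "Green")
pe.learn_route("if_red", "10.1.0.0/16", "CE-red")
pe.learn_route("if_green", "10.1.0.0/16", "CE-green")
```

The same prefix is installed twice without conflict because each lookup only ever sees the table of one
VRF.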

PE to CE Router Connectivity
[Figure: CE1 to CE4 attach to the PEs of the MPLS network using OSPF, RIP, eBGP or static routing; the
PEs exchange customer routes with MP-BGP.]

The control flow consists of two subflows.

The first control subflow is responsible for the exchange of routing information between the CE and PE
routers at the edges of the provider's backbone and between the PE routers across the provider's
backbone.
The second control subflow is responsible for the establishment of LSPs across the providers backbone
between PE routers.
Different IGPs or eBGP are supported between PE and CE peers. The PE learns customer routes from the
attached CEs. Protocols used between CE and PE routers to populate VRFs with customer routes:

BGP-4, useful in stub VPNs and transit VPNs
RIPv2
OSPF
static routing, particularly useful in stub VPNs
Customer routes are distributed to other PEs with MP-BGP
Note:
Customer routes need to be advertised between PE routers
Customer routes are not leaked into backbone IGP

Overlapping VPN
[Figure: eight sites attach to PEs of the MPLS network, which hold VRF Red, VRF Blue and VRF Green.]
Site 1: Red VPN, 10.1/16
Site 2: Blue VPN, 10.5/16
Site 3: Green VPN, 10.1/16
Site 4: Red VPN, 10.2/16
Site 5: Blue VPN and Green VPN, 10.4/16
Site 6: Green VPN, 10.2/16
Site 7: Green VPN, 10.3/16
Site 8: Red VPN, 10.3/16

A site can be part of different VPNs

A site belonging to different VPNs may or may not be used as a transit point between VPNs
If two or more VPNs have a common site, address space must be unique among these VPNs

CE-PE routing
[Figure: same eight sites as on the previous page, with CE1 to CE8 attached to interfaces of PE1 (If_11,
If_12, If_13), PE2 (If_21, If_22) and PE3 (If_31, If_32, If_33). The PEs are interconnected through a P
router and exchange routes with MP-BGP.]
Each interface is assigned manually to a VRF.
CE1 (Site 1, Red VPN) advertises 10.1/16 in a RIP message sent to 224.0.0.9 (10.1/16, cost = 1), received
on If_11.
VRF Red (interface If_11): route 10.1/16, gateway a, interface If_11
VRF Green (interface If_13): route 10.1/16, gateway b, interface If_13
The two VRFs hold the same prefix (address overlapping).

At a PE, a VRF represents the context that is specific to an attached VPN; a VRF is primarily associated to
(is identified by) the one or more sub-interfaces through which the sites belonging to this VPN are
connected.
In this example :
PE 1 is configured to associate VRF Red with the interface (or subinterface) if_11 over which it learns
routes from CE 1. When CE 1 advertises the route for prefix 10.1/16 to PE 1, PE 1 installs a local route to
10.1/16 in VRF Red.
PE 1 is configured to associate VRF Green with the interface (or subinterface) if_13 over which it learns
routes from CE 3. When CE 3 advertises the route for prefix 10.1/16 to PE 1, PE 1 installs a local route to
10.1/16 in VRF Green.
Then, the routes have to be propagated through the MPLS network.
Overlapping Customer Address Spaces
VPN customers often manage their own networks and use the RFC 1918 private address space.
If customers do not use globally unique IP addresses, the same 32-bit IPv4 address can be used to identify
different systems in different VPNs. The result can be routing difficulties because BGP assumes that each
IPv4 address it carries is globally unique.
To solve this problem, Layer3 VPNs support a mechanism that converts non-unique IP addresses into
globally unique addresses by combining the use of the VPN-IPv4 address family with the deployment of
Multiprotocol BGP Extensions (MP-BGP).

Route Distinguisher and VPN-IPv4

VPN-IPv4 is a globally unique, 96-bit routing prefix
[Figure: CE1 (Site 1, Red VPN, 10.1/16) on If_11 and CE3 (Site 3, Green VPN, 10.1/16) on If_13 attach to
PE1 with the same prefix.]
Route distinguisher (various formats):
Type 00 00: ASN + assigned number sub-field. Autonomous System Number (ASN) assigned by IANA.
Type 00 01: IP address + assigned number sub-field. Used when the MPLS/VPN network uses a private AS
number (loopback address of the PE router that originates the route).
Type 00 02: ASN + assigned number sub-field. Autonomous System Number (ASN) assigned by IANA.


VPN-IPv4 Address Family

One challenge posed by overlapping address spaces is that if conventional BGP sees two different routes to
the same IPv4 address prefix (where the prefix is assigned to systems in different VPNs), BGP treats the
prefixes as if they are equivalent and installs only one route. As a result, the other system is unreachable.
Eliminating this problem requires a mechanism that allows BGP to disambiguate the prefixes so that it is
possible to install two completely different routes to that address, one for each VPN. RFC 2547bis
supports this capability by defining the VPN-IPv4 address family.
BGP was originally designed to carry routing information only for the IPv4 address family. To overcome this
limitation, the IETF defined the Multiprotocol Extensions for BGP4 (MP-BGP). MP-BGP is
designed to carry routing information for other address families between peer routers (PEs). In particular, it:
propagates VPN-IPv4 addresses,
carries additional BGP route attributes (e.g. route target) called extended communities.
The ability to use this particular address family is indicated during BGP capabilities exchange between two
MP-BGP peers during their initial session startup.
A VPN-IPv4 address is a 12-byte quantity composed of an 8-byte RD followed by a 4-byte IPv4 address
prefix.
The service provider must ensure that each RD is globally unique. For this reason, the use of the public ASN
space or the public IP address space guarantees that each RD is globally unique.
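As an illustration of the 12-byte format described above, the following minimal Python sketch builds two VPN-IPv4 addresses from the same IPv4 prefix. It assumes a Type 0 route distinguisher (2-byte type, 2-byte ASN, 4-byte assigned number). The ASN 64496, the assigned numbers 1 and 3, and the function names are hypothetical values chosen for the example.

```python
import ipaddress
import struct


def make_rd_type0(asn: int, assigned: int) -> bytes:
    """Build an 8-byte Type 0 RD: 2-byte type, 2-byte ASN, 4-byte assigned number."""
    return struct.pack("!HHI", 0, asn, assigned)


def vpn_ipv4(rd: bytes, prefix: str) -> bytes:
    """Concatenate the 8-byte RD with the 4-byte IPv4 network address."""
    network = ipaddress.ip_network(prefix)
    return rd + network.network_address.packed


# Hypothetical RDs for VRF Red and VRF Green on the same provider network.
red = vpn_ipv4(make_rd_type0(64496, 1), "10.1.0.0/16")
green = vpn_ipv4(make_rd_type0(64496, 3), "10.1.0.0/16")

print(len(red), red.hex())
print(len(green), green.hex())
print("distinct routes:", red != green)
```

Both customers use 10.1/16, but the two 12-byte values differ in their RD part, so BGP can carry them as two separate routes.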
Notes :
VPN-IPv4 addresses are used only within the service provider network.
VPN customers are not aware of the use of VPN-IPv4 addresses.
VPN-IPv4 addresses are carried only in routing protocols that run across the provider's backbone.
VPN-IPv4 addresses are not carried in the packet headers of VPN data traffic as it crosses the provider's
backbone.

Route Distinguisher

[Figure: PE1, PE2 and PE3 are interconnected through a P router. Each PE holds one VRF per attached VPN, bound to the interface(s) facing the customer site(s). The route distinguisher is manually configured.]

PE    VRF      Interface(s)    RD       Attached site(s)
PE1   Red      if_11           RD-1     Site 1 (Red VPN, CE1, 10.1/16)
PE1   Blue     if_12           RD-2     Site 2 (Blue VPN, CE2, 10.5/16)
PE1   Green    if_13           RD-3     Site 3 (Green VPN, CE3, 10.1/16)
PE2   Red      if_21           RD-4     Site 4 (Red VPN, CE4, 10.2/16)
PE2   Brown    if_22           RD-5     Site 5 (Blue VPN and Green VPN, CE5, 10.4/16)
PE3   Red      if_31           RD-8     Site 8 (Red VPN, CE8, 10.3/16)
PE3   Green    if_32, if_33    RD-67    Site 6 (Green VPN, 10.2/16) and Site 7 (Green VPN, CE7, 10.3/16)

The route distinguisher (RD) must be defined at VRF creation time. A route distinguisher makes non-unique routes unique. It travels in MP-BGP update messages.
This parameter is used when the VPN private routes are distributed via the backbone to the other sites. The
RDs enable the overlapping of addresses between VPNs.
Route distinguishers are not automatically set up at the PE router; instead, each element requires manual
input based on the topology design of the VPN, and therefore each VPN requires manual setup of VRFs.
The VRF tables have attributes. The network administrator configures these attributes, together with the route
distinguisher, to control the distribution of VPN routes to the VPN members.
All further customer-related VPN operations are fully automated by the MPLS network, significantly simplifying operations
and reducing operational costs for the service provider.

VPN labels exchange

[Figure: two LSPs, one in each direction, between PE1 and PE2 across two P routers. The VPN label is carried unchanged under the outer LSP label.]

VPN labels:

VRF Red: VPN label 1001 at PE1, VPN label 2001 at PE2
VRF Blue: VPN label 1002 at PE1, VPN label 2002 at PE2

LSP label tables:

Router                          In if   In label / FEC   Operation   Out if   Out label
PE1                             -       FEC PE2          Push        if_a     12
First P router (next to PE1)    a       12               Swap        b        19
First P router (next to PE1)    b       29               Pop         a        -
Second P router (next to PE2)   c       19               Pop         d        -
Second P router (next to PE2)   d       21               Swap        c        29
PE2                             -       FEC PE1          Push        if_2a    21

Scalability is enhanced because PE routers are not required to maintain a dedicated VRF for all of the VPNs
supported by the provider's network. Each PE router is only required to maintain a VRF for each of its
directly connected sites.

User data flow

[Figure: two packets cross the backbone from PE1 to PE2 through the P routers Px and Py. A packet from 10.1.2.3 (Site 1, Red VPN) to 10.2.4.2 (Site 4) carries the label stack 12 + 2001 when leaving PE1, 19 + 2001 after Px, and 2001 alone after Py. A second packet, from Site 3 (Green VPN) to Site 5, carries 12 + 2002, then 19 + 2002, then 2002 alone.]

VRF tables (the inner label is the VPN label, the outer label is the LSP label, "-" means a locally attached route):

PE1, VRF Red
Route      Output if   Inner label   Outer label
10.1/16    if_11       1001          -
10.2/16    if_1a       2001          12
10.3/16    if_1b       3001          13

PE1, VRF Green
Route      Output if   Inner label   Outer label
10.1/16    if_13       1003          -
10.2/16    if_1b       3002          13
10.3/16    if_1b       3003          13
10.4/16    if_1a       2002          12

PE2, VRF Red
Route      Output if   Inner label   Outer label
10.2/16    if_21       2001          -
10.1/16    if_2a       1001          21
10.3/16    if_2b       3001          23

PE2, VRF of Site 5 (Blue VPN and Green VPN)
Route      Output if   Inner label   Outer label
10.4/16    if_22       2002          -
10.1/16    if_2a       1003          21
10.2/16    if_2b       3002          23
10.3/16    if_2b       3003          23
10.5/16    if_2a       1002          21

Route distribution on the control plane has enabled the building of the VRFs and thus prepared the transfer
of IP traffic between sites. The above figure illustrates two simultaneous data transfers:
from a host at Site 1 to, for example, some server at Site 4 (with IP address 10.2.4.2) and,
from a host at Site 3 to some other server at Site 5 (with IP address 10.4.1.8).
When the IP packet with destination address 10.2.4.2 is received by PE1 from CE1, the Red VRF is consulted, since all packets that
arrive on if_11 are associated with VRF Red. The entry corresponding to the
10.2/16 route indicates if_1a as output interface, and a label stack:

Outer label (12): identifies the remote PE
Inner label (2001): identifies the remote CE
The label stack is inserted in front of the IP packet, the data link header is inserted in front of the label
stack and the resulting frame is queued on the output interface. Similarly, when the IP packet with
destination address 10.4.1.8 is received by PE1 from CE3, the Green VRF is consulted and the entry
corresponding to the 10.4/16 route indicates if_1a as output interface, 12+2002 as label stack, as well as
(not shown) a data link header. The packet is encapsulated and queued on the output interface in the same way.
The two frames are sent on the LSP (PE1's output interface: if_1a). At the Px router, the top labels
are swapped (19 replaces 12) and the labelled packets are forwarded towards Py, which is the penultimate
hop in the LSP.
Py therefore pops the outer labels and sends the packets towards PE2 with only the inner label in front.
At egress PE2, the relevant VRF sub-interface is retrieved from the VPN label and the original IPv4 packet
is finally forwarded to the CE, through which it reaches the server within the site.
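The forwarding steps described above can be sketched in a few lines of Python. This is only a simplified model: the table contents are taken from the label values quoted in the text (12, 19, 2001, if_1a, if_21), while the dictionary layout and function names are assumptions made for the example.

```python
import ipaddress

# Simplified tables, limited to the Red VPN flow described in the notes.
PE1_VRF_RED = {"10.2.0.0/16": {"out_if": "if_1a", "inner": 2001, "outer": 12}}
PX_LFIB = {12: ("swap", 19)}      # Px swaps outer label 12 for 19
PY_LFIB = {19: ("pop", None)}     # Py is the penultimate hop and pops the outer label
PE2_VPN_LABELS = {2001: "if_21"}  # VPN label -> interface towards the CE


def ingress_pe(vrf, destination):
    """Look up the destination in the VRF and return the label stack [outer, inner]."""
    address = ipaddress.ip_address(destination)
    for prefix, entry in vrf.items():
        if address in ipaddress.ip_network(prefix):
            return [entry["outer"], entry["inner"]]
    raise LookupError("no route in this VRF for " + destination)


def p_router(lfib, stack):
    """Apply the swap or pop operation to the top (outer) label only."""
    action, new_label = lfib[stack[0]]
    if action == "swap":
        return [new_label] + stack[1:]
    return stack[1:]  # pop


stack = ingress_pe(PE1_VRF_RED, "10.2.4.2")
after_px = p_router(PX_LFIB, stack)
after_py = p_router(PY_LFIB, after_px)
egress_if = PE2_VPN_LABELS[after_py[0]]

print(stack, after_px, after_py, egress_if)
```

Note that the P routers never look at the inner label or at the IP header: only the PE routers hold VPN routes.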

End of Section


Section 9
IPSEC VPN Services
IP Technology


Document History
Edition
Date
Author
Remarks
01
YYYY-MM-DD
First edition

1. IPSEC Services

Services Offered by IPSEC
Integrity check
Authentication of data
Confidentiality
Protection against replay

The IP Security Protocol (IPsec) is a set of mechanisms intended to protect the traffic at the IP level (IPv4 or
IPv6).
The security services offered are:

integrity in connectionless mode,
authentication of the source of the data,
protection against replay,
confidentiality (confidentiality of the data and partial protection against traffic analysis).
These services are supplied at the level of the IP layer. Consequently, they offer protection for IP and for all
the higher-level protocols. IPsec is optional in IPv4 but is mandatory for all implementations of IPv6.
A first release of the proposed mechanisms was published in the form of an RFC in 1995, but did not deal with
key management. A second release, dealing with the IKE key management protocol, was published in
November 1998.

Operating modes: Transport Mode

[Figure: transport mode. Hosts A and B apply IPsec end to end across the Internet. The source and destination addresses (A, B) stay visible. The figure marks the IP header as "not entirely protected" and the data it carries as "entirely protected".]

In transport mode, only the data coming from the higher-level protocol and carried by the IP datagram is
protected.
This mode can only be used between terminal devices. Indeed, if intermediate devices were used, there would be a risk
that, depending on routing, the packet reaches its final destination without going through the
gateway supposed to decrypt it.
The original IP packet is not encapsulated into another IP packet.
The entire packet can be authenticated (AH protocol, see section 9.3 Security Mechanisms).
The packet payload can be encrypted (ESP protocol, see section 9.3 Security Mechanisms).
The original header does not change while passing through the Internet.

Operating modes: Tunnel Mode (example 1)

[Figure: tunnel mode between two IPsec gateways, x and y, each connecting an intranet to the Internet. The IP packet from A to B is entirely protected and carried inside a new packet from x to y, so the source and destination addresses A and B are hidden while crossing the Internet.]

In tunnel mode, the IP header is also protected (authentication, integrity and/or confidentiality). The whole original packet is
encapsulated in a new packet. The purpose of the header of this new packet is to transport the original packet
to the end of the tunnel, where it is de-encapsulated. Therefore, tunnel mode can be used
by both terminal devices and security gateways.
This mode provides greater protection against traffic analysis because it hides the original source and
final destination addresses.
The original IP packet is encapsulated in another IP packet.
The entire packet can be authenticated and/or encrypted.

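Operating modes: Tunnel Mode for dial-in (example 2)

[Figure: tunnel mode for dial-in. Terminal A itself runs IPsec and sets up a tunnel across the Internet to the IPsec gateway y of the intranet where B is located. The IP packet from A to B is entirely protected inside the tunnel.]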

The tunnel mode can also be used by terminal devices.

Security Mechanisms: Services Offered by AH and ESP
AH (Authentication Header) ensures:

data integrity
the authentication of the source of the data
protection against replay, optionally.
ESP (Encapsulating Security Payload) can ensure:

integrity of data
the authentication of the source of the data
protection against replay, and/or
confidentiality

In addition to standard IP processing, IPsec uses two security mechanisms to provide security for IP traffic:
Authentication Header (AH) and Encapsulating Security Payload (ESP).
AH
AH does not offer confidentiality, which means that widespread use of this standard is possible over the
Internet, including in places where exporting, importing and using encryption for confidentiality purposes is
restricted by law. This is one of the reasons why two distinct mechanisms are used.
In AH, integrity and authentication are provided together, using an additional block of data attached to
the message to be protected. This data block is called the Integrity Check Value (ICV), which refers
generically to:

either a Message Authentication Code (MAC),
or a digital signature.
For reasons of performance, the algorithms currently offered are all MAC algorithms rather than digital signatures.
Anti-replay protection is provided using a sequence number. It is available only if Internet Key Exchange (IKE)
is used, because in manual mode there is no connection set-up during which the counter can be reset.
ESP
Confidentiality can be selected independently of the other services. However, using confidentiality without
integrity/authentication (directly in ESP or with AH) leaves traffic vulnerable to certain types of active
attack that could weaken the confidentiality service. As in AH, the authentication and integrity services go
hand-in-hand and are often referred to as "authentication". They are based on the use of an ICV (in practice,
a MAC). Anti-replay protection can only be selected if authentication has been selected and IKE is used. It is
provided using a sequence number that is checked by the recipient of the packets.
Unlike AH, where an additional header is simply added to the IP packet, ESP uses encapsulation:
the original data is encrypted, then encapsulated between an ESP header and an ESP trailer.

Authentication Header (AH)

AH header layout (one row per 32-bit word):

Next header (1 byte) | Payload length (1 byte) | Reserved (2 bytes)
Security Parameters Index (SPI)
Sequence number
Authentication data (ICV: Integrity Check Value), variable length

AH: RFC 2402

The various fields of an AH are:

Next header: identifies the type of payload that follows the AH header.
Payload length: indicates the length of the AH in 32-bit words, minus 2 (since AH is an IPv6
extension header).
Reserved: this field is reserved for future use. It must be set to 0.
Security Parameters Index (SPI): a 32-bit arbitrary value that, combined with the
destination IP address and the security protocol (AH), uniquely identifies the Security Association (SA) of this datagram.
Sequence number: this field gives the packet number and is incremented by 1 at each transmission. It
is used to prevent replay, since this number is not allowed to cycle for a
given SA (a new SA must then be created after 2^32 packets). This field is mandatory for the sender, but the
recipient may choose not to take it into account. In the latter case, the number is allowed to cycle.
Authentication data: contains the Integrity Check Value (ICV) of the packet.
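The field layout above can be illustrated with a short Python sketch that packs and parses an AH header. It is a minimal illustration, not an IPsec implementation: the function names are hypothetical, the SPI value is arbitrary, and the ICV is a 12-byte placeholder of zeros rather than a computed value.

```python
import struct

# Fixed part of AH: next header, payload length, reserved, SPI, sequence number.
AH_FIXED = struct.Struct("!BBHII")


def build_ah(next_header: int, spi: int, seq: int, icv: bytes) -> bytes:
    """Pack an AH header; the payload length is the AH length in 32-bit words minus 2."""
    if len(icv) % 4:
        raise ValueError("the ICV must be padded to a multiple of 32 bits")
    total_words = (AH_FIXED.size + len(icv)) // 4
    return AH_FIXED.pack(next_header, total_words - 2, 0, spi, seq) + icv


def parse_ah(data: bytes) -> dict:
    """Unpack the fixed fields and use the payload length to locate the ICV."""
    next_header, payload_len, _reserved, spi, seq = AH_FIXED.unpack_from(data)
    total_len = (payload_len + 2) * 4
    return {
        "next_header": next_header,
        "payload_len": payload_len,
        "spi": spi,
        "seq": seq,
        "icv": data[AH_FIXED.size:total_len],
    }


# Tunnel mode example: next header 4 (IP-in-IP), placeholder 96-bit ICV.
header = build_ah(next_header=4, spi=0x1234, seq=1, icv=bytes(12))
fields = parse_ah(header)
print(len(header), fields)
```

With a 96-bit ICV the header is 24 bytes long, i.e. 6 words, so the Payload length field carries 4.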

AH: Next Header Field: Example

[Figure: two examples of the AH Next header field.
Transport mode: IP header (source A, destination B, protocol 51), AH header with Next header = 17, then the UDP header (source port, destination port, length, checksum) and data.
Tunnel mode: outer IP header (source x, destination y, protocol 51), AH header with Next header = 4, then the original IPv4 header (source A) and data.]

Next header:
It identifies the type of payload that follows the AH header. The value of this field is taken from the numbers assigned by IANA (see the site
http://www.iana.org/assignments/port-numbers).
AH can be used in transport or tunnel mode.

When used in transport mode, the field must carry the value of the protected higher-level protocol, for example
UDP (17) or TCP (6).
When used in tunnel mode:

The value 4 indicates, in IPv4, an IP-in-IP encapsulation.
The value 41 indicates an IPv6 encapsulation.

AH: Authentication Data

[Figure: example in tunnel mode. Outer IP header (source X, destination Y, protocol 51), AH header (Next header = 4, Payload length, Reserved, Sequence number, Authentication data), original IP header (source A, destination B) and data. The figure marks the outer IP header fields that are excluded from authentication.]

Authentication data:
This field contains the Integrity Check Value (ICV) of the packet.
The length of this field must be a multiple of 32 bits. All implementations must respect this length and
therefore add padding to this field if required.
Some fields of the IP header may be modified by intermediate routers. Consequently, they cannot be taken into account in
the calculation of the authentication data. The fields excluded from the authentication are:

Type of Service (TOS)
Fragment Offset (always set to 0 since AH only applies to unfragmented packets)
Flags
Time To Live (TTL)
IP header checksum
Options

The protocol number carried in the IP header for AH is 51.

The default algorithms that are supplied for all implementations of IPsec for AH are HMAC-MD5-96
[RFC2403] and HMAC-SHA-1-96 [RFC2404].
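As a hedged illustration of a 96-bit ICV, the sketch below computes an HMAC-SHA-1 value with the Python standard library and keeps the first 12 bytes. The key and data are arbitrary test values, the function names are hypothetical, and the preparation of the real AH input (zeroing the mutable IP header fields) is not shown.

```python
import hashlib
import hmac


def icv_hmac_sha1_96(key: bytes, data: bytes) -> bytes:
    """Compute HMAC-SHA-1 over the data and truncate the result to 96 bits."""
    return hmac.new(key, data, hashlib.sha1).digest()[:12]


def verify_icv(key: bytes, data: bytes, received_icv: bytes) -> bool:
    """Recompute the ICV and compare it in constant time."""
    return hmac.compare_digest(icv_hmac_sha1_96(key, data), received_icv)


key = b"\x0b" * 20          # arbitrary shared secret for the example
message = b"Hi There"       # stands in for the protected packet content
icv = icv_hmac_sha1_96(key, message)

print(len(icv) * 8, "bits:", icv.hex())
print("valid:", verify_icv(key, message, icv))
print("tampered:", verify_icv(key, message + b"!", icv))
```

A 12-byte ICV is already a multiple of 32 bits, so no padding of the Authentication data field is needed in this case.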

Encapsulating Security Payload (ESP)

ESP packet layout (one row per 32-bit word unless noted):

Security Parameters Index (SPI)
Sequence number
Payload data (IP header + application data), variable length
Padding (0 to 255 bytes)
Padding length (1 byte) | Next header (1 byte)
Authentication data, variable length

ESP: RFC 2406

The fields SPI, Sequence number, Next header and Authentication data (optional) are defined as for AH.
The Payload data field contains the encrypted data. Note the following:

If the encryption algorithm (for example, DES in Cipher Block
Chaining (CBC) mode) needs cryptographic synchronization data, i.e., an Initialization Vector (IV), this data may be carried in
the Payload data field.
The Padding field is used for the following reasons:

With block encryption, the algorithm may require the data to be encrypted to have a certain size. Padding
brings the content to the size required by the algorithm.
Padding may also be required to align the end of the ESP packet on a 4-byte boundary.
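The two padding reasons above can be combined in a small calculation. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the default block size of 8 bytes corresponds to a cipher such as DES, and the padding bytes follow the 1, 2, 3, ... pattern that RFC 2406 specifies as the default.

```python
import math


def esp_pad_length(payload_len: int, block_size: int = 8) -> int:
    """Padding bytes needed so that payload + padding + 2 trailer bytes
    (Padding length, Next header) is a multiple of the block size and of 4 bytes."""
    alignment = block_size * 4 // math.gcd(block_size, 4)
    return (-(payload_len + 2)) % alignment


def build_esp_trailer(payload_len: int, next_header: int, block_size: int = 8) -> bytes:
    """Return Padding + Padding length + Next header for a payload of the given size."""
    pad_len = esp_pad_length(payload_len, block_size)
    padding = bytes(range(1, pad_len + 1))
    return padding + bytes([pad_len, next_header])


trailer = build_esp_trailer(payload_len=20, next_header=4)
print(esp_pad_length(20), trailer.hex(), (20 + len(trailer)) % 8)
```

For a 20-byte payload and an 8-byte block, 2 bytes of padding are added so that the encrypted part is 24 bytes long.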
The main algorithms that can be used with ESP are:

Confidentiality:

triple DES (mandatory) (168-bit key),
DES (56-bit key),
RC5, AES, CAST, IDEA, IDEA triple, Blowfish, RC4,
NULL when there is no need of encryption.
Authentication:

HMAC-MD5 (mandatory),
HMAC-SHA-1 (mandatory),
DES-MAC, HMAC-RIPE-MD, KPDK-MD5
NULL when authenticity is not selected.

ESP Format

[Figure: example in tunnel mode. Outer IP header (source X, destination Y, protocol 50), ESP header (including the Sequence number), original IP header and data, ESP trailer (Padding, Padding length, Next header = 4) and ESP authentication data. The original IP header, the data and the ESP trailer are encrypted. Authentication covers the ESP header, the encrypted part and the ESP trailer.]

ESP: protocol number 50

The emitter:

Encapsulates, in the Payload data field of ESP, the data carried by the original datagram and, in tunnel mode, the original IP
header.
Adds padding if necessary.
Encrypts the result (Data, Padding, Padding length and Next header fields).
If required, adds cryptographic synchronization data (initialization vector) at the beginning of the Payload
data field.
If authentication has been selected, it is always applied after the data has been encrypted. This
makes it possible to check the validity of the received datagram before decrypting it,
which is an expensive operation. Unlike AH, the authentication in ESP only applies to the ESP packet
(header + payload + trailer) and includes neither the IP header nor the Authentication data field.

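ESP Position in Transport Mode

[Figure: ESP in transport mode between A and B across the Internet. The ESP header is inserted after the IP header, whose source and destination addresses stay visible. The payload is followed by the ESP trailer and the ESP authentication data. The payload and the ESP trailer are subject to encryption (if supplied). The ESP header, the payload and the ESP trailer are subject to authentication (if supplied).]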

In transport mode, only the data coming from the higher-level protocol and carried by the IP datagram is
protected.

How to Find the Path Maximum Transmission Unit

[Figure: path MTU discovery in two phases over a path whose successive links have MTU = 1536, MTU = 1024 and MTU = 512.
Phase 1: the sender emits a 1500-byte packet with the DF (don't fragment) flag set. The router in front of the MTU = 1024 link returns ICMP Destination unreachable (Path MTU Discovery: 1024).
Phase 2: the sender emits a 1024-byte packet with the DF flag set. The router in front of the MTU = 512 link returns ICMP Destination unreachable (Path MTU Discovery: 512).
ICMP message: Type 3, Code 4 (need of fragmentation), CRC, next-hop MTU, Data (IP header + first 64 bits).]

It is essential to know the Path Maximum Transmission Unit (PMTU), mainly when there is a large amount of data
to be transmitted. Indeed, if long packets are sent along the path, some routers will have to perform
fragmentation, which is expensive in terms of resources and processing time. The recipient will also have to
perform complex re-assembly operations.
Generally, data transfer applications (FTP for example) prefer to determine the PMTU and to emit packets
that do not exceed this PMTU to get faster transfers.
The PMTU is discovered by emitting IP packets with the "don't fragment" flag set.

At first, the emitter transmits a packet of maximum length.
A router that cannot forward a packet of such a length sends back, in an ICMP message, the value of the
next-hop MTU.
The sender can then emit a new packet whose length is equal to the received MTU. This packet is also
emitted with the "don't fragment" flag set.
The previous 2 steps are repeated until a packet reaches the recipient.
The length of the last packet correctly transmitted is used as a reference for the rest of the traffic.
This way, the sender can find the MTU of a path (PMTU).
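The discovery loop described above can be simulated as follows. This is a toy model rather than real network code: the path is represented only by the list of link MTUs (1536, 1024 and 512, as in the slide), and the function name is an assumption made for the example.

```python
def discover_pmtu(link_mtus, first_size):
    """Simulate PMTU discovery with the DF flag set.

    link_mtus lists the MTU of each link along the path, in order.
    Returns the discovered PMTU and the list of probe sizes that were sent.
    """
    size = first_size
    probes = []
    while True:
        probes.append(size)
        # The first link that is too small makes its router return an ICMP
        # message carrying the next-hop MTU.
        blocking_mtu = next((mtu for mtu in link_mtus if mtu < size), None)
        if blocking_mtu is None:
            return size, probes  # the packet reached the recipient
        size = blocking_mtu      # retry with the MTU reported by the router


pmtu, probes = discover_pmtu([1536, 1024, 512], first_size=1500)
print("probes sent:", probes)
print("path MTU:", pmtu)
```

Each probe that is too large costs one round trip to the blocking router, which is why the sender keeps the result as a reference for the rest of the traffic.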

Information Sent Back in the ICMP Message

[Figure: host A sends a 1500-byte packet (IP header with DF set, TCP/UDP header of 20 bytes with source port xx and destination port 21, application data) towards an FTP server. Router Z cannot forward it on a link with MTU = 1000 and returns an ICMP message to A: Type: Dest. unreachable, Code: Fragment needed, Next MTU: 1000, Data: the IP header plus the first 8 bytes of the packet having caused the ICMP message (including source port xx and destination port 21), which allows the sender to find out the application.]

The ICMP message sent back by the router that cannot forward the packet, because fragmentation is
not allowed, is the following:

Type = 3 (Destination Unreachable)
Code = 4 (Fragmentation needed and DF set)
Next-Hop MTU in the 16 low-order bits of the second word of the ICMP header (called "unused" in RFC 792),
with the 16 high-order bits set to zero.
Data: contains the IP header + 64 bits of the packet that caused this ICMP message.
Thanks to these 64 bits, the sender is able to find the application that initiated the transmission (source
and destination port numbers).
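A minimal parsing sketch for this message is given below. It assumes the layout described above (type, code, checksum, 16 zero bits, 16-bit next-hop MTU, then the quoted IP header and 8 bytes). The function names are hypothetical, the checksum is left at zero for brevity, and the quoted packet is a synthetic 20-byte IP header followed by ports 1024 and 21.

```python
import struct


def build_frag_needed(next_hop_mtu: int, quoted_packet: bytes) -> bytes:
    """Build an ICMP type 3 / code 4 message (checksum not computed in this sketch)."""
    return struct.pack("!BBHHH", 3, 4, 0, 0, next_hop_mtu) + quoted_packet


def parse_frag_needed(icmp: bytes):
    """Return the next-hop MTU and the ports found in the quoted packet, or None."""
    icmp_type, code, _checksum, _unused, mtu = struct.unpack_from("!BBHHH", icmp)
    if (icmp_type, code) != (3, 4):
        return None
    quoted = icmp[8:]
    ip_header_len = (quoted[0] & 0x0F) * 4  # IHL field, in bytes
    src_port, dst_port = struct.unpack_from("!HH", quoted, ip_header_len)
    return {"mtu": mtu, "src_port": src_port, "dst_port": dst_port}


# Synthetic quoted packet: 20-byte IPv4 header (version 4, IHL 5) + first 8 bytes of TCP.
quoted = bytes([0x45]) + bytes(19) + struct.pack("!HHI", 1024, 21, 0)
info = parse_frag_needed(build_frag_needed(1000, quoted))
print(info)
```

The destination port 21 recovered from the 64 quoted bits is what lets the sender associate the new MTU with the FTP transfer.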

"Don't Fragment" Flag

[Figure: ESP tunnel between two firewalls (FW) connecting two intranets across the Internet. The DF bit of the original IP header (source A) must be copied into the outer IP header (source X, destination Y, protocol 50), which precedes the ESP header (Sequence number), the encrypted original packet, the ESP trailer (Padding, Padding length, Next header = 4) and the Authentication data.]

IPsec: how to find the PMTU

When a system (host or gateway) adds an encapsulation header (ESP tunnel or AH tunnel), it MUST support the
option of copying the DF bit from the original packet to the encapsulation header (and of processing
PMTU ICMP messages). This means that it MUST be possible to configure, for each interface, how the system processes the DF
bit (set, clear, or copy from the encapsulated header).

Information Sent Back in the ICMP Message if IPsec

[Figure: host A sends a packet through a firewall (FW) with address x, which encapsulates it in an IPsec tunnel. Router Z on the Internet cannot forward the encapsulated packet on a link with MTU = 1000 and returns an ICMP message to x: Type: Dest. unreachable, Code: Fragment needed, Next MTU: 1000, Data: the outer IP header plus the first 8 bytes that follow it (the beginning of the IPsec header). From these 8 bytes the firewall cannot tell directly which host and which application sent the original packet.]

The PMTU message with a 64-bit IPsec header

If the PMTU ICMP message contains only 64 bits of the IPsec header (the minimum for IPv4), then a security gateway
MUST support the following options on a per-SPI/SA basis:
a) If it is possible to determine the source host (or to narrow it down to a manageable number of possible
sources), one option consists in sending the PMTU information to all the possible source hosts.
b) If it is not possible to determine the source host, another option consists in storing the PMTU with the
SA and waiting until the next packet(s) arrive(s) from the source host for the SA concerned. If the
packet(s) exceed(s) the PMTU, the packet(s) is/are discarded and one or more PMTU ICMP messages are
generated with the new packet(s) and the updated PMTU. The ICMP message(s) relating to the
problem is/are then sent to the source host. The PMTU information is used for any message that comes later.
PMTU calculation
The PMTU value that is sent back to the host must take into account the IPsec header that has been
added, whichever it is: AH transport, ESP transport, AH/ESP transport, ESP tunnel, AH tunnel.
Note: In certain situations, the addition of IPsec headers might result in an effective
PMTU (as seen by the host or the application) that is too small. To avoid this, the implementation can
set a threshold under which it does not report a reduced PMTU. The implementation then applies
IPsec and fragments the resulting packet according to the PMTU. As a consequence,
the use of the available bandwidth is more efficient.

Reminder: NAT Function

[Figure: hosts 10.10.10.1 and 10.10.10.4 on private network 10.10.10.0 reach FTP server 194.5.3.12 (TCP port 21) through a NAT device with public address 212.17.22.13. Both hosts use TCP source port 2125. The server sees two sockets from 212.17.22.13: one with port 2125 and one with port 1024.]

Translation table:

Prot   Private IP@    Port   Public IP@      Port
tcp    10.10.10.4     2125   212.17.22.13    2125
tcp    10.10.10.1     2125   212.17.22.13    1024

The Network Address Translation (NAT) and Port Address Translation (PAT) functions allow several users to
access the Internet simultaneously.
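The translation table of the slide can be reproduced with a small Python model. This is a sketch under assumptions: the class and method names are hypothetical, and the allocation policy (keep the private port when it is free, otherwise take the next free port starting at 1024) is chosen only so that the result matches the example values; real NAT devices may allocate ports differently.

```python
class Pat:
    """Toy port address translator for a single public IP address."""

    def __init__(self, public_ip, first_free_port=1024):
        self.public_ip = public_ip
        self.next_port = first_free_port
        self.outbound = {}  # (proto, private_ip, private_port) -> public_port
        self.inbound = {}   # (proto, public_port) -> (private_ip, private_port)

    def translate_out(self, proto, private_ip, private_port):
        key = (proto, private_ip, private_port)
        if key not in self.outbound:
            port = private_port
            if (proto, port) in self.inbound:  # port already used by another host
                while (proto, self.next_port) in self.inbound:
                    self.next_port += 1
                port = self.next_port
            self.outbound[key] = port
            self.inbound[(proto, port)] = (private_ip, private_port)
        return self.public_ip, self.outbound[key]

    def translate_in(self, proto, public_port):
        return self.inbound.get((proto, public_port))


nat = Pat("212.17.22.13")
first = nat.translate_out("tcp", "10.10.10.4", 2125)
second = nat.translate_out("tcp", "10.10.10.1", 2125)
print(first, second, nat.translate_in("tcp", 1024))
```

The translator depends on the transport-layer port to tell the two hosts apart, which is the point that becomes a problem for protocols without ports.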

Several NAT Devices May Be Crossed

[Figure: host @A in an intranet reaches server @B (TCP port 21) through two NAT devices in sequence: the intranet NAT (public address @1) and the ISP NAT (public address @X). Return packets are translated back in the reverse order.]

Translation tables:

NAT device     Prot   Private IP@   Port   Public IP@   Port
Intranet NAT   TCP    @A            1234   @1           4567
ISP NAT        TCP    @1            4567   @X           8901

A communication may pass through several NAT devices.

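IPsec Problem Inherent to NAT

[Figure: example in tunnel mode. A packet from A to B is encapsulated by the firewall (FW) @1 in an ESP packet (IP source @1, destination @Y, protocol 50) towards the firewall @Y. ESP carries no port number, so the ISP NAT (public address @X) cannot fill in the port fields of its translation entry (shown as "???").]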

The NAT problem

RFC 3715: IPsec-Network Address Translation (NAT) Compatibility Requirements
RFC 3947: Negotiation of NAT Traversal in the IKE
RFC 3948: UDP Encapsulation of IPsec ESP Packets

NAT Traversal

[Figure: NAT traversal with UDP encapsulation. Firewall @1 sends the ESP packet (A to B) inside a UDP datagram (protocol 17, source port 500, destination port 500) to @Y. The ISP NAT translates the source to @X, port 4567, and records the entry UDP, @1, 500, @X, 4567. The reply from @Y (UDP source port 500, destination port 4567) is translated back to @1, port 500.]

The UDP header is a standard [RFC0768] header, where the Source Port and Destination Port MUST be the same
as that used by IKE traffic. The IPv4 UDP Checksum SHOULD be transmitted as a zero value, and receivers
MUST NOT depend on the UDP checksum being a zero value. The SPI field in the ESP header MUST NOT be a
zero value.
RFC 3947: Negotiation of NAT-Traversal in the IKE
This document describes how to detect one or more Network Address Translation devices (NATs) between IPsec
hosts, and how to negotiate the use of UDP encapsulation of IPsec packets through NAT boxes in Internet Key
Exchange (IKE).
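The UDP encapsulation rules quoted above can be sketched as follows. The code is illustrative only: the function names are hypothetical, the ESP packet is a dummy byte string, and the port is passed as a parameter because the notes only require it to be the same as the one used by IKE traffic (the example uses 4500, the port registered for IPsec NAT traversal, while the slide shows 500).

```python
import struct


def udp_encapsulate_esp(esp_packet: bytes, ike_port: int) -> bytes:
    """Prepend a UDP header to an ESP packet (same source and destination port,
    checksum transmitted as zero, SPI not allowed to be zero)."""
    (spi,) = struct.unpack_from("!I", esp_packet)
    if spi == 0:
        raise ValueError("the SPI of a UDP-encapsulated ESP packet must not be zero")
    length = 8 + len(esp_packet)
    return struct.pack("!HHHH", ike_port, ike_port, length, 0) + esp_packet


def udp_decapsulate_esp(datagram: bytes) -> bytes:
    """Strip the UDP header without relying on the checksum value."""
    _src, _dst, length, _checksum = struct.unpack_from("!HHHH", datagram)
    return datagram[8:length]


esp = struct.pack("!II", 0x1001, 1) + b"encrypted payload"  # SPI, sequence number, data
datagram = udp_encapsulate_esp(esp, ike_port=4500)
print(struct.unpack_from("!HHHH", datagram), udp_decapsulate_esp(datagram) == esp)
```

Once ESP is carried in UDP, a NAT device can translate the source port as it does for any other UDP flow.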

What is the role of the sequence number in the IPsec headers?
Detect packet loss
Detect attempts of replay
Put the packets in order before supplying them to the applications

If two IPsec tunnels are set up between the same pair of entities, which
parameter identifies each tunnel unambiguously?
Impossible at IPsec level as the packet is encrypted
The SPI field
The port number at transport level
The Authentication data field

In which operating mode does ESP hide the source and
destination addresses of IP packets?
Tunnel
Transport


2. IPSEC operation


Security Association (SA)

[Figure: two IPsec gateways, X and Y, connected across the Internet by Security Associations, one per direction. Each SA is identified by the destination IP address (Y in one direction, X in the other), the protocol (ESP in the example) and the SPI, and is associated with parameters such as the encryption algorithm and the protocol (AH / ESP).]

The mechanisms mentioned previously rely on cryptography and consequently use a number of
parameters (encryption algorithms, keys, selected mechanisms, etc.) on which the communicating parties
must agree. IPsec uses the Security Association (SA) to manage these parameters.
An IPsec Security Association is a simplex connection that supplies security services to the traffic
it carries. It can be considered as a data structure that stores the set of parameters
associated with a given communication.
An SA is unidirectional. As a consequence, to protect both directions of a typical communication, two
associations are required, one for each direction. The security services of an SA are provided by the use of either AH or ESP.
If AH and ESP are both applied to the traffic concerned, two SAs (or more) are created. They are
referred to as a bundle of SAs.
Each association has a unique identification, i.e., a triplet made up of:

The packet destination address.
The identifier of the security protocol used (AH or ESP).
A Security Parameter Index (SPI). An SPI is a 32-bit value which is carried in clear in the header of each packet
exchanged. The SPI is chosen by the recipient.
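The triplet described above maps naturally onto a lookup key. The following Python sketch shows one possible representation. The class name, the stored parameters and the addresses (taken from the 192.0.2.0/24 documentation range) are assumptions made for the example, not the data model of a real implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SaId:
    """Identification of an SA: destination address, security protocol, SPI."""
    dest_ip: str
    protocol: str  # "AH" or "ESP"
    spi: int


# Two tunnels towards the same gateway are told apart by their SPI only.
sad = {
    SaId("192.0.2.2", "ESP", 0x1001): {"mode": "tunnel", "traffic": "ftp"},
    SaId("192.0.2.2", "ESP", 0x1002): {"mode": "tunnel", "traffic": "other"},
    SaId("192.0.2.1", "ESP", 0x2001): {"mode": "tunnel", "traffic": "return path"},
}


def lookup_sa(database, dest_ip, protocol, spi):
    """Return the parameters of the SA matching a received packet, or None."""
    return database.get(SaId(dest_ip, protocol, spi))


print(lookup_sa(sad, "192.0.2.2", "ESP", 0x1002))
print(lookup_sa(sad, "192.0.2.2", "AH", 0x1002))
```

Because each SA is one-way, the return traffic of the same communication appears as a separate entry with its own destination address and SPI.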

Security Association Database (SAD)

One-way SA
Several SAs towards several partners
Several SAs towards the same partner {(traffic type or destination)}

[Figure: an IPsec gateway x holds one SAD for outgoing traffic (SA1, SA3, SA5) and one SAD for incoming traffic (SA2, SA4, SA6), with SAs set up across the Internet towards several remote IPsec gateways.]

The Security Association Database (SAD) is used to manage the active security associations. It contains all
the parameters relating to each SA. The IPsec gateway looks up the SAD to know how to process each packet
it sends or receives.
IPsec uses two databases:

One for the outgoing traffic,
the other for the incoming traffic.
Indeed:

Several SAs may be set between several partners.
Several SAs may be set towards the same partner.
Different types of protection may be defined according to the types of the applications.
Different types of protection may be defined according to the direction.

SAD: Synthesis

[Figure: the SAD holds one entry per SA (SAx, SAy). Each entry is identified by the destination IP address, the protocol (AH / ESP) and the SPI, and stores: sequence number counter, policy when the counter reaches the maximum value, anti-replay window for the incoming traffic, algorithms used, Time To Live, mode (transport / tunnel), path MTU information.]

The IPsec processing between two partners requires the following parameters:

Policy when the counter reaches the maximum value,
Anti-replay window for the incoming traffic,
Selection of the AH or ESP algorithm and of the associated parameters,
Time To Live (in seconds or amount of bytes),
Mode (transport or tunnel),
Information path MTU.
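Among these parameters, the anti-replay window is the one that is easiest to illustrate in code. The sketch below implements a common sliding-window check with a bitmap. It is a simplified model: the class name and the window size of 32 are assumptions, and a real receiver would update the window only after the ICV of the packet has been verified.

```python
class AntiReplayWindow:
    """Sliding window over received sequence numbers for one incoming SA."""

    def __init__(self, size=32):
        self.size = size
        self.highest = 0  # highest sequence number accepted so far
        self.bitmap = 0   # bit i set => sequence number (highest - i) already seen

    def check_and_update(self, seq):
        """Return True if the packet is new and inside the window, False otherwise."""
        if seq == 0:
            return False  # sequence numbers start at 1
        if seq > self.highest:
            shift = seq - self.highest
            self.bitmap = ((self.bitmap << shift) | 1) & ((1 << self.size) - 1)
            self.highest = seq
            return True
        offset = self.highest - seq
        if offset >= self.size:
            return False  # too old: left of the window
        if self.bitmap & (1 << offset):
            return False  # already received: replay
        self.bitmap |= 1 << offset
        return True


window = AntiReplayWindow(size=32)
results = [window.check_and_update(seq) for seq in (1, 3, 2, 3, 100, 50)]
print(results)
```

Late packets are still accepted as long as they fall inside the window and have not been seen before, which tolerates reordering in the network.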

Section 9 Page 28
Security Policy Database (SPD) (Example of Outgoing Traffic)

[Figure: example of an SPD lookup for outgoing traffic. An IP packet (IPs: 155.2.8.1, IPd: 194.1.2.6, Prot: 6 (TCP), Portsrc: 1024, Portdest: 21 (FTP)) leaves network 155.2.8.0 for network 194.1.2.0 across the Internet. Host 129.9.9.9 is also reachable across the Internet.]
SPD, SA selection parameters of the first rule:
Dest IP@ = 194.1.2.*
Src IP@ = 155.2.8.*
Transport = TCP
Dest port = 21 (ftp)
Action: apply (ESP, SA Id1)
SPD, SA selection parameters of the second rule:
Dest IP@ = 129.9.9.9
Src IP@ = 155.2.8.2
Transport = TCP
Dest port = any
Action: apply (AH, SA Id2)
SAD for outgoing traffic: SA1 and SA2, each with its algorithms, Time To Live and mode (transport/tunnel).

How and when does IPsec processing apply to the IP traffic?

The Security Policy Database (SPD) determines which SA (or sequence of SAs) applies to an IP datagram. There may be:
An SA for each particular type of IP traffic (fine-grained),
An SA for a set of traffic types (coarse-grained).
The protections offered by IPsec are based on choices defined in the SPD. This database is set up
and maintained by a user, a system administrator or an application that they run. Using this database, each
packet is either given security services, allowed to bypass IPsec, or discarded.
The SPD contains an ordered list of rules. Each rule comprises a number of criteria that determine which part
of the traffic it covers. The criteria that can be used are the fields made available by the headers
of the IP and transport layers. They define the granularity at which security services can be applied
and directly affect the number of corresponding SAs. When traffic matches a rule, the security services of that
rule are applied to it. The rule indicates the characteristics of the corresponding SA (or bundle of SAs): protocol(s),
modes, required algorithms, etc.
The SPD consists of:
Selection parameters:
Destination IP address (exact address/range of addresses/wildcard)
Source IP address (exact address/range of addresses/wildcard)
ToS
Transport protocol
Source/destination ports (exact port/range of port numbers/wildcard)
User identification (e-mail or X.500 name)
One or several mechanisms that can be applied:
ESP, AH, cryptographic algorithm, fine/coarse-grained, tunnel/transport, etc.
SA (or sequence of SAs) ID:
Pointer to the corresponding entry in the SAD (SA endpoint).
An action:
discard (deletes the IP packet),
bypass IPsec (lets the packet through without IPsec processing),
apply IPsec (applies the security services of an SA or of a group of SAs, called an SA bundle).
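
The rule matching described above can be sketched in Python as follows. This is a minimal illustration, assuming that address ranges are written in CIDR notation, that rules are tried in order and that a packet matching no rule is discarded. The names `Packet`, `SpdRule` and `select_rule` are hypothetical. The two rules reproduce the example of the slide "Security Policy Database (Example of Outgoing Traffic)".

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Packet:
    src_ip: str
    dst_ip: str
    protocol: str            # transport protocol name, e.g. "TCP"
    dst_port: int


@dataclass
class SpdRule:
    """Hypothetical SPD rule: selectors, an action and an SA identifier."""
    src_net: str             # exact address or range, in CIDR notation
    dst_net: str
    protocol: str            # "any" acts as the wildcard
    dst_port: Optional[int]  # None acts as the wildcard
    action: str              # "apply", "bypass" or "discard"
    sa_id: Optional[str] = None

    def matches(self, pkt: Packet) -> bool:
        return (
            ipaddress.ip_address(pkt.src_ip) in ipaddress.ip_network(self.src_net)
            and ipaddress.ip_address(pkt.dst_ip) in ipaddress.ip_network(self.dst_net)
            and self.protocol in ("any", pkt.protocol)
            and self.dst_port in (None, pkt.dst_port)
        )


def select_rule(spd: List[SpdRule], pkt: Packet) -> SpdRule:
    """Return the first matching rule; the SPD is an ordered list."""
    for rule in spd:
        if rule.matches(pkt):
            return rule
    # Assumed default when no rule matches: discard the packet.
    return SpdRule("0.0.0.0/0", "0.0.0.0/0", "any", None, "discard")


spd = [
    SpdRule("155.2.8.0/24", "194.1.2.0/24", "TCP", 21, "apply", "SA Id1"),
    SpdRule("155.2.8.2/32", "129.9.9.9/32", "TCP", None, "apply", "SA Id2"),
]

ftp_packet = Packet("155.2.8.1", "194.1.2.6", "TCP", 21)
chosen = select_rule(spd, ftp_packet)
```

With the packet of the slide (155.2.8.1 to 194.1.2.6, TCP, destination port 21), `chosen` is the first rule, whose SA identifier is "SA Id1".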

SPD and SAD Management

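[Figure: SPD and SAD management. The administrator configures the policies manually in the SPD. The IP / IPsec layer (AH, ESP), located between the transport layer (TCP, UDP) and the link layer, looks up the SPD and the SAD; the SPD points to the SAD. SA creation requests are sent to the Internet Key Exchange (IKE) application, which negotiates, modifies and discards SAs. Applications (HTTP, FTP, POP) use sockets above the transport layer.]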

Outgoing traffic:
When the IPsec "layer" receives data to be sent:
It first looks up the Security Policy Database (SPD) to determine how to process this data.
If the SPD indicates that security mechanisms must be applied to the traffic, the IPsec layer retrieves the
characteristics required for the corresponding SA and looks up the SA Database (SAD):
If the required SA already exists, it is used to process the traffic concerned.
If not, IPsec uses IKE to establish a new SA with the required characteristics.
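
The decision sequence for outgoing traffic can be summarized by the short Python sketch below. It is an illustration under stated assumptions: the SPD lookup result is represented by a plain dictionary, the SAD by a dictionary indexed by SA identifier, and IKE by a stand-in function (`fake_ike`) that returns fixed values. None of these names comes from a real IPsec stack.

```python
from typing import Callable, Dict, List


def process_outgoing(policy: Dict[str, str],
                     sad: Dict[str, dict],
                     negotiate_with_ike: Callable[[str], dict],
                     log: List[str]) -> str:
    """Decide what happens to one outgoing packet, given its SPD policy."""
    action = policy["action"]
    if action in ("discard", "bypass"):
        log.append(action)
        return action

    sa_id = policy["sa_id"]
    if sa_id not in sad:
        # No suitable SA yet: ask IKE to establish one, then store it in the SAD.
        sad[sa_id] = negotiate_with_ike(sa_id)
        log.append(f"negotiated {sa_id}")
    log.append(f"protected with {sa_id}")
    return "apply"


def fake_ike(sa_id: str) -> dict:
    # Stand-in for a real IKE negotiation; the values are illustrative only.
    return {"protocol": "ESP", "mode": "tunnel"}


events: List[str] = []
outgoing_sad: Dict[str, dict] = {}
policy = {"action": "apply", "sa_id": "SA1"}
process_outgoing(policy, outgoing_sad, fake_ike, events)  # first packet: the SA is created
process_outgoing(policy, outgoing_sad, fake_ike, events)  # second packet: the SA is reused
```

After the two calls, `events` records one negotiation followed by two protected packets, which shows that IKE is only involved when the SAD does not yet hold the required SA.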

SA Negotiation: Key Management
Manual
Automatic (IKE protocol):
With "Preshared Keys"
With "Certificates" (Certification Authority)

The distribution and the management of keys are critical operations. IPsec uses two methods of key distribution:

Manual
Automatic
Manual key exchange

The administrators at each end of a tunnel must configure all the security parameters. This approach is suitable for
small static networks. However, key distribution may become problematic over long distances since the keys may
be compromised during transit. Moreover, the keys must be regenerated regularly.
Automatic key exchange (IKE)

IKE supplies a method for:

Negotiating the protocols, algorithms and keys to be used.
Authenticating the parties, that is, making sure from the beginning of the exchange that you are communicating
with the right party (primary authentication services).
Managing the keys once they have been chosen, and supplying the mechanisms to do so (key management).
With "Preshared key":

A "preshared key" is a secret key that both parties must know before starting a communication.
To authenticate the participants in an IKE session, the two ends must exchange the preshared key in advance, in a
secure way. In this respect, the problem of distributing the key securely is the same as with manual keys. However,
once the preshared key has been distributed, IKE can automatically create new keys at predefined intervals, which is
not possible with manual keys. Changing the keys frequently improves security considerably, and doing it
automatically significantly reduces the burden of key management. Note, however, that changing the keys
frequently increases the traffic, so a compromise must be found between security and the efficiency of data
transmission.
With "certificates":
A certificate is used to authenticate the participants in an IKE negotiation. Each end generates a private/public
key pair and obtains a certificate. The participants can retrieve the public key of their peer and ask a Certification
Authority (CA) that they both trust to check the signature.
Origins of IKE
[Figure: IKE builds on three protocols. ISAKMP defines the procedures of authentication and SA management. OAKLEY defines the groups that will be used for the Diffie-Hellman exchange. SKEME is the Secure Key Exchange Mechanism. The DOI relates IKE to IPsec.]

IKE is derived from a set of protocols: ISAKMP, OAKLEY and SKEME. Together, they constitute
a protocol stack that enables automatic key exchange.
OAKLEY
OAKLEY defines the groups that will be used for the Diffie-Hellman exchange. There are 5 groups, which are
called the OAKLEY Groups. Among these 5 groups, there are three groups of classical modular
exponentiation (MODP) and two elliptic curve groups.
Secure Key Exchange Mechanism for Internet (SKEME)
SKEME is a secure key exchange mechanism. The authentication methods are based on this mechanism:
in particular, the method with public-key encryption comes from the SKEME exchange. Developed specifically for
IPsec, SKEME is an extension of Photuris.
Internet Security Association and Key Management Protocol (ISAKMP)
ISAKMP defines the procedures for the negotiation, setup, modification and deletion of SAs. ISAKMP defines a
framework to negotiate the security associations, but does not impose anything as regards SA parameters. A
document called Domain of Interpretation (DOI) must define the negotiated parameters and describe how
to use ISAKMP in a given framework. A DOI identifier is used to interpret the content of ISAKMP messages.
IKE therefore uses the ISAKMP message formats (header and payload).
IPsec Domain of Interpretation (IPsec DOI)
[RFC 2407] defines the DOI used to run ISAKMP with IPsec.
The IPsec DOI is, in brief, a document that contains the definitions of all the security parameters needed for
the negotiation of a VPN tunnel to succeed, i.e., essentially all the attributes required to negotiate SAs and
IKE.
IKE Phases
Phase 1:
A secure channel is set up to execute IKE

Main mode: a secure channel is set up
Aggressive mode: a secure channel is also set up, but the
partners' identities are not protected (simpler and faster)
Phase 2:
Negotiation of SA parameters
Quick mode: performs an SA negotiation

IKE runs over UDP, on the well-known port 500.
IKE phases
IKE is a two-phase protocol:

Phase 1:

Both partners set up a secure channel (IKE SA) to execute IKE. They negotiate how to authenticate and
secure the channel.
Phase 2:

Both partners negotiate the IPsec SA parameters.
IKE modes
Oakley provides three modes of key exchange and SA establishment.

Two modes for phase 1 of IKE:

Main mode: performs a phase 1 IKE exchange by setting up a secure channel,

Aggressive mode: simpler and faster, but does not protect the identity of the negotiating partners,
since they must transmit their identity before a secure channel has been negotiated.
One mode for phase 2 of IKE:

Quick mode: negotiates an IPsec SA.

IKE Phase 1: Main Mode
[Figure: exchange of six messages between the initiator and the responder.]
Msg #1 and Msg #2: negotiation of basic and hash algorithms
Msg #3 and Msg #4: exchange of public keys and signature
Msg #5 and Msg #6: check of identities (encrypted exchange)

Internet Key Exchange - Phase 1 - Main Mode

The aim of phase 1 is to set up a secure, authenticated channel between both parties.
IKE Main Mode consists of six messages that are exchanged between the initiator and the responder in order to
set up an IKE SA. The first four messages are sent in clear and are used to determine the security parameters of
future exchanges.

In the first exchange (messages 1 and 2), both parties agree on the basic and hash algorithms:

Authentication (Preshared-key / RSA certificate).
Hash (MD5 / SHA-1 / ...)
Encryption (DES / 3DES / AES / ...)
DH groups (1 .. 5)
In the second exchange (messages 3 and 4), they exchange:

the public keys in the case of a Diffie-Hellman exchange,

a random number called a nonce (number used once); each party must then sign the nonce of the other party
and send it back, to counter replay attempts.
In the third exchange (messages 5 and 6), they check their identities.

Parameters Used in the IKE SA Negotiation

Main parameters to secure the ISAKMP tunnel:
Encryption algorithms (DES, 3DES, AES)
Hash algorithms (MD5, SHA)
Authentication method (Preshared-Key or Certificate)
Diffie-Hellman groups (DH Group 1: 768-bit modulus, DH Group 2: 1024-bit modulus, ...)

Setting up a secure channel for IKE negotiation

The initiator proposes several parameters for the secure exchange of keys and the authentication:

The encryption algorithm to protect the data (DES, 3DES, AES128, etc.)
The hash algorithm to reduce the data to be signed (MD5, SHA)
An authentication method to sign the data (Preshared-Key, Certificate: RSA, DSA)
The choice of the Diffie-Hellman group (among 5)
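
The way a responder could answer such a list of proposals can be illustrated in Python. The sketch below is a simplification: it assumes that a proposal is a plain set of four named values and that the responder accepts the first proposal of the initiator that also appears in its own policy. The function name `choose_proposal` and the dictionary keys are hypothetical; the actual ISAKMP payload encoding is not shown.

```python
from typing import Dict, List, Optional

Proposal = Dict[str, str]


def choose_proposal(offered: List[Proposal], acceptable: List[Proposal]) -> Optional[Proposal]:
    """Return the first offered proposal that the responder also accepts."""
    for proposal in offered:
        if proposal in acceptable:
            return proposal
    return None  # no common proposal: the negotiation fails


initiator_offer = [
    {"encryption": "AES128", "hash": "SHA", "auth": "Preshared-Key", "dh_group": "2"},
    {"encryption": "3DES", "hash": "MD5", "auth": "Preshared-Key", "dh_group": "1"},
]
responder_policy = [
    {"encryption": "3DES", "hash": "MD5", "auth": "Preshared-Key", "dh_group": "1"},
]

chosen = choose_proposal(initiator_offer, responder_policy)
```

In this example the responder does not support the first proposal, so `chosen` is the second one (3DES, MD5, preshared key, DH group 1).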

IKE Phase 2: Messages
[Figure: exchange of three messages (Msg #1, Msg #2, Msg #3) covering the negotiation of security protocols for IPsec and the authentication. The result is an IPsec tunnel made of an outgoing SA (SPI: x) and an incoming SA (SPI: y).]
SPI: Security Parameters Index

Phase 2: Quick Mode

The messages exchanged during phase 2 are protected as regards authenticity and confidentiality thanks
to the elements negotiated in phase 1.

Authenticity is ensured by the addition of a HASH block after the ISAKMP header.
Confidentiality is ensured by the encryption of all the message blocks.
All the messages of phase 2 are encrypted and authenticated by the IKE SA and the shared secret keys.
Phase 2 always comprises three messages. As there is no long operation in this phase, it has been called
"Quick mode". In this phase, both parties negotiate the parameters of the IPsec SAs and calculate the second set
of shared secret keys from the secret value generated in phase 1 and from new random numbers.
Phase 2 is quick because it uses secret-key cryptography instead of public-key cryptography, which is
much slower and more expensive.
Quick Mode is used to negotiate SAs for security protocols such as IPsec. Each negotiation
actually produces two SAs, one for each direction of communication.
More precisely, the different exchanges in this mode have the following purposes:

Negotiating a set of IPsec parameters (SA bundles).
Exchanging nonces, used to generate a new key from the secret generated in phase 1 with the Diffie-Hellman
protocol. Optionally, a new Diffie-Hellman exchange can be performed to obtain the property
of Perfect Forward Secrecy (PFS), which is not provided if a new key is only generated from an old one and
from nonces.
Optionally, identifying the traffic that this SA bundle will protect, by means of selectors (IDi and IDr optional
blocks; without these blocks, the IP addresses of both parties are used).
Security Parameters Index (SPI): The SPI field is an arbitrary 32-bit value that, combined with the destination
IP address and the security protocol (AH or ESP), uniquely identifies the Security Association (SA) of a datagram.
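
The derivation of IPsec keys from the phase 1 secret and from fresh nonces can be sketched as follows. The sketch is modeled on the keying material formula of RFC 2409, KEYMAT = prf(SKEYID_d, protocol | SPI | Ni_b | Nr_b), with the new Diffie-Hellman result prepended when PFS is used. The use of HMAC-SHA1 as the prf, the random stand-in for SKEYID_d and the function name `derive_keymat` are assumptions made for this illustration.

```python
import hashlib
import hmac
import os


def derive_keymat(skeyid_d: bytes, protocol: int, spi: int,
                  nonce_i: bytes, nonce_r: bytes, dh_secret: bytes = b"") -> bytes:
    """Derive IPsec SA keying material from the phase 1 secret and fresh nonces.

    dh_secret is empty without PFS; with PFS it holds the result of the new
    Diffie-Hellman exchange carried out in Quick Mode.
    """
    data = dh_secret + bytes([protocol]) + spi.to_bytes(4, "big") + nonce_i + nonce_r
    return hmac.new(skeyid_d, data, hashlib.sha1).digest()


skeyid_d = os.urandom(20)               # stands in for the secret derived in phase 1
ni, nr = os.urandom(16), os.urandom(16)  # nonces of the initiator and of the responder
ESP = 3                                  # IPsec DOI protocol identifier for ESP

key_x = derive_keymat(skeyid_d, ESP, 0x1001, ni, nr)  # SA identified by SPI x
key_y = derive_keymat(skeyid_d, ESP, 0x2002, ni, nr)  # SA identified by SPI y
```

Because the SPI enters the computation, the two one-way SAs of the same negotiation receive different keys even though they share the same nonces and the same phase 1 secret.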

Negotiation of SAs for the Data

The initiator proposes:
Encryption algorithms
Hash algorithms
An authentication method
If Perfect Forward Secrecy (PFS) then Diffie-Hellman groups
The type of protection to be used (ESP or AH)
A time to live

Negotiation of security protocols for IPsec

The initiator proposes several parameters to secure the key exchange and the authentication:

Encryption algorithms to protect the data (DES, 3DES, AES128, etc.)
Hash algorithms to reduce the data to be signed (MD5, SHA)
If Perfect Forward Secrecy (PFS) is used, the choice of the Diffie-Hellman group (among 5)
A Pseudo-Random Function (PRF) used to hash some values during key exchanges for verification purposes
(this is optional; a hash algorithm can simply be used)
A time to live, expressed in time and in volume.
For the IPsec AH protocol, the transform algorithms that can be negotiated are MD5, SHA and DES (MD5 and
SHA are mandatory to implement).
For the IPsec ESP protocol, the transform algorithms that can be negotiated as a basis for authentication are
MD5, SHA and DES. The possible encryption algorithms are DES, 3DES, RC5, IDEA, ... (DES being mandatory to
implement).

Perfect Forward Secrecy (PFS)

[Figure: Alice and Bob exchange data across the network in two successive sessions.]
Session 1: secret of Alice $a_1$, secret of Bob $b_1$
$A_1 = g^{a_1} \bmod n$
$B_1 = g^{b_1} \bmod n$
$K_1 = g^{a_1 b_1} \bmod n$
Session 2: secret of Alice $a_2$, secret of Bob $b_2$
$A_2 = g^{a_2} \bmod n$
$B_2 = g^{b_2} \bmod n$
$K_2 = g^{a_2 b_2} \bmod n$
Even if $K_1$ is compromised, $K_2$ remains secure.

Perfect Forward Secrecy

When encrypted data goes through a public network, an attacker has many opportunities to intercept it.
You can reduce the risk that intercepted data is decrypted by using larger and larger keys. But the larger the
keys, the slower and the more complex the encryption. This may degrade the network performance.
A good compromise consists in using keys of a reasonable length and changing them frequently. This approach
has a constraint of its own: the new keys must not be generated from the old ones. Otherwise, if one key is
discovered, all the traffic might be compromised.
So, it is necessary to implement a method to generate a new key that does not depend at all on the value of
the current key. In this way, if someone obtains your current key, this person can analyze only a small part
of the traffic. He/she will have to crack another, entirely independent key to analyze the rest of the
traffic.
Two variants may be used to generate the keys for the encryption, hash and
authentication of the SAs negotiated for a given application:
The keys are simply derived from the ISAKMP SAs.
New keys are generated by exchanging new DH values, so that they are independent of the keys of the ISAKMP SAs.
The second variant is called Perfect Forward Secrecy.
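
The independence of successive session keys can be illustrated with a toy Diffie-Hellman exchange in Python. The parameters below are assumptions chosen for readability: the modulus n is a 127-bit prime, far too small for real use, and the base g is arbitrary. The function name `run_session` is hypothetical.

```python
import secrets

# Toy public parameters, far too small for real use.
n = 2**127 - 1   # a Mersenne prime; IKE groups use moduli of 768 bits or more
g = 5            # public base


def run_session() -> int:
    """One Diffie-Hellman exchange with fresh secrets; returns the session key K."""
    a = secrets.randbelow(n - 3) + 2   # secret of Alice, never sent
    b = secrets.randbelow(n - 3) + 2   # secret of Bob, never sent
    big_a = pow(g, a, n)               # A = g^a mod n, sent to Bob
    big_b = pow(g, b, n)               # B = g^b mod n, sent to Alice
    k_alice = pow(big_b, a, n)         # Alice computes B^a mod n
    k_bob = pow(big_a, b, n)           # Bob computes A^b mod n
    assert k_alice == k_bob            # both values equal g^(ab) mod n
    return k_alice


k1 = run_session()   # session 1, secrets a1 and b1
k2 = run_session()   # session 2, independent secrets a2 and b2
```

Since session 2 draws new secrets and does not reuse anything from session 1, knowing `k1` gives no information about `k2`.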


In which entities are the Security Associations (SAs) stored?
SPD
SAD
Which parameters might IPSEC use to select an SA?
Source IP address
Port number
ToS
Protocol

Which entity selects the SAs to be used to forward a packet?
SPD
SAD
What are the main pieces of information stored in an IPSEC SA?
Algorithms used
Time To Live of SA
Operating mode (Tunnel/transport)
ESP/AH protocol
Sequence number counter
Anti-replay window

What is the role of IKE in phase 1?
To negotiate the parameters of IPsec SA
To set up a secure channel
To select the algorithms for the traffic of data
In which mode is SPI selected?

Main mode
Aggressive mode
Quick mode

Which function does the aggressive mode not ensure?
Authentication
Non-repudiation
Protection of the parties' identities
Even with automatic key management by IKE, the Preshared key method still requires
a secret key to be entered manually. What is the role of this key?
To ensure authentication
To perform encryption
To perform a hash

What does Perfect Forward Secrecy (PFS) provide?
Any new key is generated independently of the current key
The protection of the identities
A longer time to live for the keys
End of Section
