Introduction
Course Objective: To make the audience understand TCP/IP stack internals and TCP/IP networking
Course Schedule: 8 days (June 9-11, 16-18, 24-25), 9:30 am to 6:00 pm
Course Delivery: Quiz, top-down to start with, protocol, demo, APIs
Scope: Messaging aspects of protocol, coding with APIs
Miscellaneous: Q & A, breaks
Responsibilities: Presence, participation, punctuality
Outline
FTP Server Configuring FTP Server Connecting and using FTP Server Terminating Client
Client-Server Communication
1. Server is configured and started; it runs perpetually
2. User starts the client
3. User provides the server's address to the client
4. User, through the client, receives the service
5. User terminates the client
6. Server continues to run
Note: A computer hosts hundreds of programs
Client-Server Characteristics
Both client and server are programs
These programs communicate using the Internet
Communication involves the exchange of messages between client and server
Each message is a binary sequence
Concurrent Server handles multiple clients in interleaved fashion Examples: Web Server, Mail Server
Synchronous Server completes the request of a client before starting the next request from queue
Examples: Radius Server, SNMP Agent
Addresses IP Address
An IP address is a 4-octet number that identifies a host uniquely.
A host needs at least one IP address to send or receive a message. It can also have more than one IP address.
Messages sent (from other computers) to these addresses are delivered to the host.
Exercise: Find your own IP addresses using ipconfig
Hi, I am Google!
I am Yahoo
IP Address
Dr. Hari T.S. Narayanan
Domain Name
Note: Domain names are for users; messages do not use them for addressing
Port Number
An IP address by itself is not sufficient for a client to send and receive messages. Why? There may be hundreds of programs running on a host.
A port number is a 2-byte value that uniquely identifies a program on a host.
Host 192.168.1.2
Web Server
80
21
23
Listing server programs
Controlling server programs
Port assignment file: C:\WINDOWS\system32\drivers\etc\services
Changing the port assignment for a server
Use SNMP to demonstrate this
Each transport layer has its own full range of port numbers. The port numbers are divided into three ranges:
Well Known Ports (0-1023)
Registered Ports (1024-49151)
Dynamic and/or Private Ports (49152-65535)
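The three ranges can be checked mechanically. The sketch below is only an illustration; the function name `port_range` and the returned labels are assumptions made for this example, not part of any standard API.

```python
def port_range(port: int) -> str:
    """Return the name of the range that a 2-byte port number falls into."""
    if not 0 <= port <= 65535:
        raise ValueError("a port number must fit in 2 bytes (0-65535)")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic/private"


# Ports taken from the slides (21, 23, 80) plus two arbitrary higher ports.
for p in (21, 23, 80, 8080, 50000):
    print(p, port_range(p))
```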
subnet 3
D2
Subnet 1
Subnet 3
Links L1
1. Every packet is carried in a frame
2. Only frames are sent on the medium
3. Network Interface Controller (NIC) implements the Link Layer
4. NIC connects a node to a subnet/link
5. (Only) the frame header is used in sending, filtering, and forwarding frames
6. NIC is responsible for sending, filtering, and forwarding frames using the MAC address!
7. MAC address is hard-wired to the NIC
8. Sending a packet from L1 to D1 and from L1 to D2
9. IP address is associated with a NIC (not a host)!
MAC Address
Example: 02-00-4c-4f-4f-50
Vendor ID
MAC address allocations for a company can be found by going to: http://standards.ieee.org/regauth/oui/index.shtml
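As a small illustration of the Vendor ID idea, the sketch below splits a MAC address into its first three octets (the vendor ID, or OUI) and the remaining three octets, and reads the two flag bits in the first octet. The function name `parse_mac` and the dictionary keys are assumptions made for this example.

```python
def parse_mac(mac: str) -> dict:
    """Split a MAC address such as 02-00-4c-4f-4f-50 into its parts."""
    octets = bytes(int(part, 16) for part in mac.replace(":", "-").split("-"))
    if len(octets) != 6:
        raise ValueError("a MAC address has exactly 6 octets")
    return {
        "vendor_id": octets[:3].hex("-"),      # first 3 octets (OUI)
        "device_id": octets[3:].hex("-"),      # last 3 octets
        "multicast": bool(octets[0] & 0x01),   # group (I/G) bit
        "locally_administered": bool(octets[0] & 0x02),  # U/L bit
    }


print(parse_mac("02-00-4c-4f-4f-50"))
```

For the example address from the slide, the locally-administered bit is set, which is consistent with it belonging to a software loopback adapter rather than to a registered vendor.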
Observation
Addresses used
Four types of addresses are used:
IP Address
Domain Name
Link Layer address (MAC address or hardware address)
Port Number
Address Structure
Network Layer Address - IP Address - unique and structured, but likely to change when a host moves
Example: 192.168.1.128
Link Layer Address - MAC Address - unique & fixed, but flat
Example: 08:56:27:6f:2b:9c
TCP/IP Networking
Networking is implemented using a number of software and hardware modules, instead of a single monolithic piece of h/w & s/w
This collection is loosely grouped into several layers
A layer offers multiple options for networking
These layers enable message flow between client and server
Each layer offers a specific networking function in various flavors with multiple modules
Application/Layer
Physical Layer
Medium
Why Layering?
There are many reasons for building layered architecture. The following are some of the important ones:
To isolate different networking functions
To offer a standard interface at functional boundaries
To modularize network functions
To abstract lower-level layers
Port No
ICMP IGMP
TCP
UDP
IPX IP ARP/RARP
protocol type
Frame type
Accessing an Interface
A networking layer is accessed by a higher layer using a well-defined h/w and/or s/w interface
S/W interfaces are the ones that we discuss predominantly in this course
For example: an application layer in the TCP/IP suite accesses a transport layer (TCP, UDP, SCTP, ...) using the Socket API
In other words, TCP/UDP layers offer their transport service to higher layers through the Socket API
How about the network layer, link layer, ...?
Encapsulation
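[Figure: encapsulation. An application message becomes a TCP segment (or UDP datagram), then an IP packet, then an Ethernet frame on the physical medium. Sizes in bytes: Ethernet header 14, IP header 20, TCP header 20, frame payload 46-1500, Ethernet trailer 4.]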
Service
Protocol
Install Wireshark
Wireshark will also install the WinPcap library
Wireshark Tutorial
Copy pcap.dll to c:/Program Files/Tcl/
Start Wireshark
Install Scotty
Ethernet Header
Standard Bodies
IEEE (most of the physical & link layer standards)
FAQ: http://standards.ieee.org/faqs/sa-faq.html
Free downloads: http://www.ieee.org/publications_standards/index.html#IEEE_Standards
IETF (most of the higher layer standards)
For Newbies: The Tao of IETF
IEEE 802.3 (CSMA/CD), 802.5 (Token Ring), 802.11, FDDI
Internet Society (ISOC)
Internet Architecture Board (IAB)
IETF: Internet Engineering Task Force
IRTF: Internet Research Task Force
IANA: Internet Assigned Numbers Authority
InterNIC: IP address distribution
Two types of documents: Internet Drafts (IDs) and Requests for Comments (RFCs)
Drafts have a short life
Well-defined standards track
Only a small set of the RFCs are standards
Ping Command
Checking for IP connectivity
Usage: ping <Otherhost>
Loopback Interface
Used for Inter Process Communication (IPC)
Loopback address: 127.*.*.*
Host
Clients
Servers
Loop back Interface
1. Click Start, and then click Control Panel.
2. Double-click Printers and Other Hardware, and then click Next.
3. Under See Also in the left pane, click Add Hardware, and then click Next.
4. Click Yes, I have already connected the hardware, and then click Next.
5. At the bottom of the list, click Add a new hardware device, and then click Next.
6. Click Install the hardware that I manually select from a list, and then click Next.
7. Click Network adapters, and then click Next.
8. In the Manufacturer box, click Microsoft.
9. In the Network Adapter box, click Microsoft Loopback Adapter, and then click Next.
10. Click Finish.
Loopback driver installation
Configuring loopback driver
Assign IP address 192.168.1.2 and netmask 255.255.255.0
arp -s 192.168.1.2 02-00-4c-4f-4f-50
route add 192.168.1.2 mask 255.255.255.255 192.168.1.2
Capturing frames with loopback driver (in Wireshark)
Networking Hierarchy
Subnet 2
Subnet 1
internet
Subnet 3
LAN Segment
A shared LAN segment is inherently a broadcast domain
Shared LAN segments are created using hubs and bridges
Switched LAN segments require controlled broadcast
Subnet
Hosts and router ports within a subnet share the same subnet ID
Subnet ID is the IP address prefix that is common to all the IP addresses in the subnet
Subnet is a link layer broadcast domain
Router is the gateway between subnets
Router terminates subnet broadcast
192.168.1
Router Port
192.168.3 192.168.9
192.168.2
Switching
Establish end-to-end connection (logical duplex pipe)
Transmit data/media using the connection
Close the connection
Discuss the implications
Routing
For every packet, pass the packet to the network and ask it to deliver it to its destination with best effort
Discuss the implications
Sequencing guaranteed
Connectionless Messaging
A
Sequencing not guaranteed No dedicated pipe between A & B Pipe is shared Global address needed
Ideal for 1-to-n communication Inherently robust Needs big transfer tables
IP is connectionless networking
Both connection-oriented and connectionless transport can be offered on top of IP
TCP is a connection-oriented protocol; UDP is a connectionless protocol
Both are implemented using the connectionless IP network
IP Address Classes
Class   Leading bits   Net-id     Host-id
A       0              7 bits     24 bits
B       10             14 bits    16 bits
C       110            21 bits    8 bits
D       1110           Multicast address
E       1111           Reserved
(Bit positions run from 0 to 31; field boundaries fall at bits 7, 15, and 23.)
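The class of an address follows directly from the leading bits of its first octet. A minimal sketch, with `address_class` as a name chosen only for this example:

```python
def address_class(ip: str) -> str:
    """Return the classful category (A-E) of a dotted-decimal IPv4 address."""
    first = int(ip.split(".")[0])
    if not 0 <= first <= 255:
        raise ValueError("first octet out of range")
    if first < 128:   # leading bit 0
        return "A"
    if first < 192:   # leading bits 10
        return "B"
    if first < 224:   # leading bits 110
        return "C"
    if first < 240:   # leading bits 1110 (multicast)
        return "D"
    return "E"        # leading bits 1111 (reserved)


for ip in ("10.0.0.2", "128.192.1.1", "192.168.1.2", "224.0.0.1"):
    print(ip, address_class(ip))
```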
Subnet Addressing
Example: network 211.77.20, with the last octet split into Subnet ID (3 bits) and Host ID (5 bits)
Bit positions: 0, 8, 16, 24
Default mask: 11111111 11111111 11111111 00000000 (255.255.255.0)
Subnet mask:  11111111 11111111 11111111 11100000 (255.255.255.224)
The 3 left-most 0 bits of the default mask are changed to 1, and the binary subnet mask is converted to dotted decimal (255.255.255.224).
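The mask arithmetic can be reproduced with Python's standard `ipaddress` module. This is a sketch under the assumption of a 24-bit default prefix with 3 borrowed bits, as in the example above; `subnet_mask` is a helper name invented here.

```python
import ipaddress


def subnet_mask(default_prefix: int, subnet_bits: int) -> str:
    """Turn the left-most `subnet_bits` host bits into 1s and return the mask."""
    prefix = default_prefix + subnet_bits
    if not 0 <= prefix <= 32:
        raise ValueError("prefix length must be between 0 and 32")
    mask = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(mask))


print(subnet_mask(24, 3))

# Enumerate the subnets that 3 subnet bits create inside 211.77.20.0/24.
network = ipaddress.ip_network("211.77.20.0/24")
for subnet in network.subnets(prefixlen_diff=3):
    print(subnet, "usable hosts:", subnet.num_addresses - 2)
```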
Host Sending
The sending host knows its MAC address
The sending host knows its gateway's IP address
Application provides the server's (destination) IP address
IP/Link Layer maintains the ARP cache
The server's MAC address is required to complete the datagram
1. Host A checks if Server B is in the same subnet. It is.
2. Host A sends a broadcast frame asking for the MAC address of Server B (IP address).
3. This request frame is seen by all hosts & servers within the subnet.
4. Server B responds to Host A with its MAC address.
5. Host A saves the server's IP address and MAC address in its ARP cache and starts sending/receiving frames to/from Server B.
DST IP (4)
SRC IP (4)
Message
SRC IP, SRC MAC, and SRC Port are known to the message sender
DST IP is provided by the user
DST Port is a well-known port in client-server applications
That leaves us with DST MAC, which is not known to the sender
ARP stands for Address Resolution Protocol
Each entry in an ARP table contains an IP address and the corresponding MAC address
ARP entries live only for a short duration: 2 to 10 mins
Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

Interface: 10.0.0.224 --- 0x2
  Internet Address      Physical Address      Type
  10.0.0.2              00-80-c6-f9-29-a7     dynamic
1. Host A checks if Server B is in the same subnet. It is not.
2. Host A sends a broadcast frame asking for the MAC address of
3. This request frame is seen by all hosts & servers within the subnet.
4. Router A responds to Host A with its Port 1 MAC address.
5. Host A saves the server's IP address and Router Port 1 MAC address in its ARP table and starts sending/receiving frames to/from Router A.
6. Router A routes packets from Host A to the server.
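The first step in both walk-throughs is the same-subnet test. The sketch below shows that test and picks the IP address whose MAC address the host then needs. It assumes the common configuration in which a host resolves its configured gateway for off-subnet destinations; the function name `arp_target` and all addresses are illustrative only.

```python
import ipaddress


def arp_target(src_ip: str, netmask: str, dst_ip: str, gateway_ip: str) -> str:
    """Return the IP address whose MAC address the sender must resolve."""
    # strict=False lets us build the subnet from a host address plus its mask.
    subnet = ipaddress.ip_network(f"{src_ip}/{netmask}", strict=False)
    if ipaddress.ip_address(dst_ip) in subnet:
        return dst_ip      # same subnet: resolve the server directly
    return gateway_ip      # different subnet: frames go to the router port


# Hypothetical Host A at 192.168.1.10 with gateway 192.168.1.1.
print(arp_target("192.168.1.10", "255.255.255.0", "192.168.1.2", "192.168.1.1"))
print(arp_target("192.168.1.10", "255.255.255.0", "192.168.3.5", "192.168.1.1"))
```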
The IP layer on a host can be configured to do routing in addition to acting as a host
When an IP datagram is received, the IP layer checks if the destination IP is one of its own IP addresses or an IP broadcast
If so, the datagram is delivered to the protocol module specified in the protocol field of the datagram
If not:
If the host is configured as a router, the datagram is forwarded using the IP routing table
Else the datagram is silently dropped
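The receive-side decision above is a three-way branch. A minimal sketch, where the function name `ip_input`, its parameters, and the returned strings are all assumptions made for illustration:

```python
def ip_input(dst_ip: str, own_ips: set, broadcast_ips: set, is_router: bool) -> str:
    """Decide what the IP layer does with a received datagram."""
    if dst_ip in own_ips or dst_ip in broadcast_ips:
        return "deliver"   # hand to the module named in the protocol field
    if is_router:
        return "forward"   # look up the routing table and send onward
    return "drop"          # plain hosts silently discard it


own = {"192.168.1.2", "127.0.0.1"}
bcast = {"192.168.1.255", "255.255.255.255"}
print(ip_input("192.168.1.2", own, bcast, is_router=False))
print(ip_input("10.0.0.2", own, bcast, is_router=True))
print(ip_input("10.0.0.2", own, bcast, is_router=False))
```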
ARP finds the physical address of a host, given its IP address, by issuing an ARP broadcast within the subnet
This information is stored in the ARP cache and used in IP datagram transmission
The ARP cache is a table where each entry contains a host's IP address and the corresponding physical address
ARP entries also contain the host name and an expiration counter. Default expiration time is 20 mins
The arp command can be used to list the entries of an ARP cache
Example: arp -a
[Figure: ARP operation, steps (1) to (9): hostname, resolver, IP address, TCP, IP, ARP, the Ethernet drivers of the hosts on the segment, and the ARP Request sent as an Ethernet broadcast.]
Link Layer
Responsible for
Creating a frame and sending it to the next node
Receiving a frame and processing it
Link Layers
SLIP PPP Ethernet
Motivation
Multidrop Versus
Desktop/ Router/ Switch Desktop/ Router/ Switch
Point-to-point
FTP data
FTP data
TCP segment
IP-H TCP-H
FTP data
IP datagram
IP-H TCP-H
FTP data
Frame
Network
END (0xC0) and ESC (0xDB) are used to create the frame
No type field: supports only IP
IP address issue
No Frame Check Sequence (FCS) or CRC!
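[Figure: SLIP framing of an IP datagram. The frame is delimited by END (c0). A c0 byte inside the datagram is sent as db dc, and a db byte inside the datagram is sent as db dd.]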
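The byte stuffing shown for SLIP can be sketched in a few lines. The constant and function names below are chosen for this example; the byte values are the ones on the slide (c0, db, dc, dd).

```python
END, ESC, ESC_END, ESC_ESC = 0xC0, 0xDB, 0xDC, 0xDD


def slip_encode(datagram: bytes) -> bytes:
    """Wrap an IP datagram in END bytes, escaping END and ESC in the payload."""
    out = bytearray([END])
    for b in datagram:
        if b == END:
            out += bytes([ESC, ESC_END])
        elif b == ESC:
            out += bytes([ESC, ESC_ESC])
        else:
            out.append(b)
    out.append(END)
    return bytes(out)


def slip_decode(frame: bytes) -> bytes:
    """Reverse slip_encode for a single frame."""
    out = bytearray()
    body = iter(frame.strip(bytes([END])))
    for b in body:
        if b == ESC:
            nxt = next(body)
            out.append(END if nxt == ESC_END else ESC)
        else:
            out.append(b)
    return bytes(out)


payload = bytes([0x01, 0xC0, 0xDB, 0x02])
print(slip_encode(payload).hex(" "))
```

Note that nothing in the frame identifies the carried protocol or protects against corruption, which is exactly the set of deficiencies listed for SLIP.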
PPP
Motivated by the deficiencies of SLIP
Includes a type field
IP address can be exchanged
Includes Frame Check Sequence (FCS) or CRC!
PPP Implementation
PPP Encapsulation (RFC 1661) PPP Link Control Protocol (RFC 1570) PPP Network Control Protocol for IP (RFC 1332)
PPP Link configuration (LCP)
Authentication (LCP) - Optional
Network Protocol configuration (NCP)
Protocol
Data
Reject
Link Termination packets: Req and Ack
Link Maintenance packets: Code-Reject, Protocol-Reject, ...
Optionally supports authentication of the peer
LCP negotiation is started administratively or when the carrier is sensed
Identifier - Decimal value which aids in matching requests and replies.
Length - Length of the LCP packet, including the Code, Identifier, Length and Data fields.
Data - Variable length field which may contain one or more configuration options.
LCP Codes
Code   Name
1      Configure-Request
2      Configure-Ack
3      Configure-Nak
4      Configure-Reject
5      Terminate-Request
6      Terminate-Ack
7      Code-Reject
8      Protocol-Reject
9      Echo-Request
10     Echo-Reply
11     Discard-Request
12     Identification
13     Time-Remaining
LCP Data includes one or more options encoded using TLV
Each option includes a default value
LCP options need not be symmetrical!
Peer 2
http://technet.microsoft.com/en-us/library/cc957992.aspx
1. The remote access server sends a CHAP Challenge message containing a session ID and an arbitrary challenge string.
2. The remote access client returns a CHAP Response message containing the user name in cleartext and a hash of the challenge string, session ID, and the client's password using the MD5 one-way hashing algorithm.
3. The remote access server duplicates the hash and compares it to the hash in the CHAP Response. If the hashes are the same, the remote access server sends back a CHAP Success message. If the hashes are different, a CHAP Failure message is sent.
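The three steps can be sketched with Python's `hashlib`. This sketch assumes the usual CHAP ordering in which the one-byte session ID (identifier) comes first, followed by the password and then the challenge; the function names and the sample password are placeholders.

```python
import hashlib
import hmac
import os


def chap_response(session_id: int, password: bytes, challenge: bytes) -> bytes:
    """Client side: MD5 over session ID, password, and challenge string."""
    return hashlib.md5(bytes([session_id]) + password + challenge).digest()


def chap_verify(session_id: int, password: bytes, challenge: bytes,
                response: bytes) -> bool:
    """Server side: duplicate the hash and compare it with the response."""
    expected = chap_response(session_id, password, challenge)
    return hmac.compare_digest(expected, response)


challenge = os.urandom(16)            # arbitrary challenge string
resp = chap_response(7, b"s3cret", challenge)
print("Success" if chap_verify(7, b"s3cret", challenge, resp) else "Failure")
print("Success" if chap_verify(7, b"wrong", challenge, resp) else "Failure")
```

The password itself never crosses the link; only the hash does, and a fresh challenge makes a captured response useless for replay.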
Network Layer negotiation starts right after the completion of LCP
A series of messages is exchanged for each of the networking layer protocols (IP, IPX, AppleTalk, ...)
IP layer negotiation is done using IPCP
IPCP negotiates client IP parameters (IP address, DNS, etc.)
There is no negotiation for gateway and netmask
There are no ARP table updates
Routing table is updated
Observations to be made
Sequence of messages for LCP, CHAP, and NCP (IP)
Message Code and Length
Correlating Identification between request and response
Observe MAC addresses; find the vendor ID for them
Authentication Challenge, userId, IP address and DNS negotiation
IGMP, DHCP, and DNS messages that appear after LCP and NCP
TLV encoding of IP addresses etc.
LCP and NCP Data Options: RFC 1661
MRU
Callback
IP & DNS Server Address
Authentication: Username, Password
Optimization: Address, Control and Protocol Field compression
In a multipoint setup, PPP starts after a message sequence
This sequence is used to identify and connect to a peer
Earlier DSL modems were using this
PPPoE is one such message sequence
This enabled multiple users behind a gateway
PPP Gateway
MAC Addresses
MAC Addresses used in Serial Link are not device specific! There is no registered vendor ID for these MAC addresses
Deployment Scenario
Point-to-point connections between bridges/switches (PoS)
Dialup connections (PPPoE and PPPoA)
VPN connections (L2TP, PPTP)
Ethernet Characteristics
Ethernet is a LAN Link Layer standard
Most popular LAN standard
Least expensive
Comes in half-duplex and full-duplex forms
Comes in several speeds: 10/100/1000/10000 Mbps
Comes with several media options (wireless, fiber, coaxial, twisted pair, ...)
Wireless LAN variations: 802.11x (CSMA/CA)
Initial competition from Token Ring, later from ATM, now none!
Variety of related standards:
VLAN Bridges
Spanning Tree Protocol
Multiple Spanning Tree
Rapid Spanning Tree
Link Aggregation
Port-Based Network Access Control
1. Sense the medium (Carrier Sense). If the medium is idle, transmit (& listen); otherwise go to the next step.
2. If the medium is busy, continue to listen until the medium is idle, then transmit immediately.
3. If a collision is detected during transmission:
Transmit a jam signal for one slot.
Wait for a random time and reattempt (up to 16 times).
The random time is generated according to exponential back-off.
Collision is detected by monitoring the voltage: a high voltage means two or more transmitters are colliding.
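The random wait can be sketched as truncated binary exponential back-off. The cap of 10 doublings and the 51.2 microsecond slot time are the classic 10 Mb/s Ethernet values and are assumptions relative to the slide, which states only the 16-attempt limit; the function name is hypothetical.

```python
import random

SLOT_TIME_US = 51.2   # assumed slot time for 10 Mb/s Ethernet
MAX_ATTEMPTS = 16     # from the slide: reattempt up to 16 times
BACKOFF_CAP = 10      # assumed: the range stops doubling after 10 collisions


def backoff_slots(collisions: int, rng=random) -> int:
    """Pick how many slot times to wait after the n-th successive collision."""
    if collisions < 1:
        raise ValueError("called only after at least one collision")
    if collisions >= MAX_ATTEMPTS:
        raise RuntimeError("too many collisions: give up on this frame")
    k = min(collisions, BACKOFF_CAP)
    return rng.randrange(2 ** k)   # uniform in 0 .. 2^k - 1


for n in (1, 2, 3, 10, 15):
    slots = backoff_slots(n)
    print(f"collision {n}: wait {slots} slots = {slots * SLOT_TIME_US:.1f} us")
```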
Pad
FCS 4
PRE: Preamble
SFD: Start of Frame Delimiter
DA: Destination Address
SA: Source Address
FCS: Frame Check Sequence
P listens to the medium; there is no signal, so P starts sending a frame to Q. Q does the same and starts sending a frame to P.
When the first few bits of P's frame reach Q, Q detects the collision. Similarly, P detects the collision when bits of Q's frame reach P.
If the frames were small, P and Q would not have identified the collision. Why?
To detect a collision, P must still be transmitting when Q's first bit reaches P, and vice versa.
In the worst case, Q may start sending its frame just before P's first bit arrives at Q.
That last point suggests that the frame cannot be smaller than a certain size: it should be large enough to detect a collision in the worst case.
In other words, the segment cannot be larger than a certain size.
Note: The frame size and segment size constraints are created by collision detection.
Besides this, there are other reasons for the segment size limitation.
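The relationship between frame size and segment size is simple arithmetic: a sender must keep transmitting for at least one worst-case round trip. The sketch below assumes the classic 10 Mb/s Ethernet figures (a 51.2 microsecond round-trip budget), which are not stated on the slide; `min_frame_bits` is a name made up for the example.

```python
def min_frame_bits(rate_bps: int, round_trip_ns: int) -> int:
    """Smallest frame (in bits) that is still on the wire after one round trip."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-rate_bps * round_trip_ns // 1_000_000_000)


bits = min_frame_bits(10_000_000, 51_200)   # 10 Mb/s, 51.2 us round trip
print(bits, "bits =", bits // 8, "bytes")

# At 10x the data rate with the same frame size, the allowed round-trip
# time (and therefore the segment span) shrinks by a factor of 10.
print(min_frame_bits(100_000_000, 5_120), "bits at 100 Mb/s")
```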
Noise and signal attenuation constrain LAN segments to a few hundred meters.
Several different networking elements are used to extend the span of LANs.
These enhancements still have to satisfy the round-trip constraint if collision detection is required.
These enhancements also need to satisfy any other constraints suggested by the standards.
Repeater
A repeater is a bidirectional analog amplifier that amplifies and retransmits signals.
Layer 1 device.
Can double the size of a LAN segment.
Segment 1 Segment 2
Repeater
The standard suggests a limit of 4 repeaters between any two stations on a LAN, that is, a maximum of 5 segments.
Repeaters don't understand frame formats.
Collisions affect the entire extended network.
Noise propagates throughout the extended network.
Hub
Hub is a multilink repeater with star topology In other respects, a hub is similar to a repeater
B D C
Hub
Another Representation
Bridge
A bridge is a device that connects two or more LAN segments.
Unlike a repeater, a bridge receives, processes, and retransmits frames.
A bridge is invisible to the other attached computers.
Bridges can connect dissimilar LAN segments.
Segment 2 (Token Ring)
P1 P3
Segment 1 (Ethernet)
P2
Bridge Characteristics
Layer 2 Device. Can do frame filtering. Isolate collision and noise. Two types of bridging:
Bridging
A bridge uses a forwarding table to forward frames.
Initially, this table is empty.
The table is populated by examining the source address in the frames received.
If there is no forwarding entry for a frame, then the frame is forwarded to all the other ports.
Entries are removed when they age more than 5 minutes.
Source MAC   Port
A            1
B            1
C            1
D            2
E            2
F            2
P            3
Q            3
[Figure: a bridge with ports 1, 2, and 3. Hosts A, B, C are on port 1; D, E, F on port 2; P, Q on port 3.]
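The learning and forwarding rules above fit in a small class. This is a teaching sketch, not a model of any real bridge implementation; the class name, method name, and the use of single letters as MAC addresses are all assumptions for the example.

```python
import time


class LearningBridge:
    """Forwarding table that learns source addresses and ages entries out."""

    def __init__(self, ports, max_age=300.0):   # 300 s = 5 minutes
        self.ports = set(ports)
        self.max_age = max_age
        self.table = {}                          # MAC -> (port, last_seen)

    def receive(self, src, dst, in_port, now=None):
        """Process one frame and return the list of ports it is sent out on."""
        now = time.monotonic() if now is None else now
        # Age out stale entries, then learn where the sender lives.
        self.table = {mac: (port, seen) for mac, (port, seen)
                      in self.table.items() if now - seen <= self.max_age}
        self.table[src] = (in_port, now)
        entry = self.table.get(dst)
        if entry is None:
            return sorted(self.ports - {in_port})   # unknown: flood
        out_port = entry[0]
        return [] if out_port == in_port else [out_port]   # filter or forward


bridge = LearningBridge([1, 2, 3])
print(bridge.receive("A", "D", in_port=1, now=0))   # D unknown: flooded
print(bridge.receive("D", "A", in_port=2, now=1))   # A was learned on port 1
print(bridge.receive("B", "A", in_port=1, now=2))   # same segment: filtered
```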
Switches
A switch is a bridge that is configured to work like a hub in a star topology.
A frame received on a port is processed and forwarded to the right port using a forwarding table.
Each computer thinks it is on a segment by itself.
Unlike bridges, switches support a large number of ports.
P1 P32
To Uplink
Bridge:
Supports fewer than 5 ports (interfaces)
Software implementation can easily handle the traffic
Interface connects to a LAN segment
Price per port is higher than a comparable switch
Switch:
The workgroup switch, one of the smallest, can support 16/32/64 ports
Port volume requires a hardware solution
Interface connects to a computer
Price per port is very low
Broadcast Storm
Two LAN segments A and B are still connected if one of the two switches fails!
Segment B
Switch A Switch B
Broadcast
Segment A
Host
Router
Port 2
Host
Router
Disclaimer
The terms switch and bridge are used interchangeably in this presentation.
Spanning Tree Protocol (STP) offers a solution to manage redundancy by eliminating loops and cycles and using redundancy during failure
Converts a LAN with loops & cycles into a rooted tree LAN
One of the first standardized protocols for STP is IEEE 802.1D
RSTP: Rapid STP, a variation of STP, IEEE 802.1w (now merged with IEEE 802.1D)
MSTP: Multiple STP, a variation of STP for VLANs
All switches in a LAN exchange Bridge Protocol Data Units (BPDUs) periodically (typically every 2 secs) to achieve the following:
The election of a unique root switch for the stable spanning-tree network topology
The election of a port within a switch to connect to the root
The election of a designated switch for every switched LAN segment
The removal of loops in the switched network by placing redundant switch ports in a backup state
Retaining connectivity in the presence of failures
Priority.
This assignment is used in choosing and configuring the root bridge
By default, all bridges are given the same priority ID of 0x8000
The similarly structured Port ID (8-bit priority + 8-bit port ID) uses 0x80 as the default value for priority
Root Port: forwards traffic to the root; one per bridge
Designated Port: forwards traffic away from the root; multiple
Non-designated Port: inactive with respect to data
Spanning Tree forwards frames only on root ports and designated ports. It blocks on other ports
STP PDUs are not forwarded, but the information is relayed!!
Data Rate   Recommended Link Cost
4 Mb/s      250
10 Mb/s     100
16 Mb/s     62
100 Mb/s    19
1 Gb/s      4
10 Gb/s     2
bridge
2. On each bridge, a root port is selected: the port with the least-cost path to the root bridge
3. On each LAN segment, a designated bridge is selected: the bridge with the least-cost path to the root bridge. The corresponding port is the designated port
Frames are forwarded only on marked ports
Special frames, Bridge Protocol Data Units (BPDUs), are sent between neighboring switches
Not forwarded!
Two types of BPDUs: Configuration & Topology Change Notification (TCN)
Sent as a link layer multicast to 01-80-c2-00-00-00
Determines how often the switch broadcasts its BPDU message to other switches. Max Age is the time interval that a switch stores a BPDU before discarding it. While executing the STP, each switch port keeps a copy of the "best" BPDU that it has heard. If the source of the BPDU loses contact with the switch port, the switch will notice that and TCN will be initiated.
Monitors the time spent by a port in the learning and listening states. The timeout value is the
Not every change to the network is a topological change
A forwarding port going down is a topological change
A change is noticed by an STP bridge when the best BPDU ages out with Max-Age
The STP bridge that identifies the change notifies its neighbor using TCN
The neighbor acknowledges the TCN with TCA and forwards the TCN to the root node
The root acknowledges the TCN with TCA and broadcasts configuration BPDUs with the TC bit set for the next 35 secs
Sending and forwarding TCA and TCN are independent of Hello Time
Forwarding table entries are aged out with Forward Delay
The learning phase starts on all designated ports, including the migrated port
Timers that are too short can lead to loops and instabilities
Timers that are too long can lead to long convergence times
STP converges in 30 to 50 seconds with default settings
Convergence Delay
MaxAge (20 sec) + Listening (15 sec) + Learning (15 sec) = 50 seconds
RSTP has two more port designations:
Alternate Port: backup for the Root Port
Backup Port: backup for the Designated Port on the segment
In RSTP, all bridges send BPDUs automatically, while in STP the root triggers BPDUs
In RSTP, bridges act to bring the network to convergence, while in STP bridges passively wait for time-outs before changing port states
In RSTP, a port status change is sensed to trigger TCN
Port states are renamed as follows:
1. Discarding (Disabled, Blocked, and Listening)
2. Learning (Learning)
3. Forwarding (Forwarding)
Port roles:
1. Root (Root)
2. Designated (Designated)
3. Backup or Alternate (Non-designated)
Root S1
S2
S3
Alternate
Designated
Backup
All nodes connected to a hub are in the same physical LAN (collision & broadcast domain)
All nodes connected to a bridge port are in the same physical LAN (collision & broadcast domain)
By default, all nodes connected to a switch are in the same physical LAN (broadcast domain)
A VLAN groups nodes/hosts independent of the device or the port to which they are connected
This means a VLAN may span multiple switches, or there may be multiple VLANs in a single switch
VLANs
Host 3 (VLAN 1)
Host 4 (VLAN 2)
S1 Trunk 1 Trunk 2 S2
Host 1 (VLAN1)
Host 2 (VLAN 2)
Topological Change
If the link between A and C goes down, then all the hosts which are mapped to port 2 of A will have to be shifted to either port 1 or port 3
X
Libraries:
WinPcap (Windows C library)
Libpcap (Linux/Unix C library)
Pcap (multiplatform Tcl library)
Tools:
Wireshark
Tcpdump
Day 3 - IP Layer
IP Header- Fields
Version
Size: 4 bits
Value: 4 for IPv4 and 6 for IPv6
Header Length
Size: 4 bits
Units: number of words (4 octets each)
Example: a value of 5 indicates 20 octets of header
Total Length
Size: 16 bits
Units: octets
Includes header, optional fields, and IP payload
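These three fields occupy the first four octets of the header and can be pulled apart with the standard `struct` module. A minimal sketch; the function name and dictionary keys are invented for the example, and the sample bytes are the first four octets of the exercise header used later in this deck.

```python
import struct


def parse_ip_prefix(header: bytes) -> dict:
    """Decode Version, Header Length, TOS, and Total Length from an IP header."""
    ver_ihl, tos, total_length = struct.unpack("!BBH", header[:4])
    return {
        "version": ver_ihl >> 4,                 # upper 4 bits
        "header_len": (ver_ihl & 0x0F) * 4,      # words -> octets
        "tos": tos,
        "total_length": total_length,            # header + options + payload
    }


print(parse_ip_prefix(bytes.fromhex("4560003c")))
```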
IP Header- Fields
Type of Service bits:
1000 (8)   Minimize Delay
0100 (4)   Maximize Throughput
0010 (2)   Maximize Reliability
0001 (1)   Minimize Cost
Identification 16 bits
Flags 3 bits
IP Header
Size 8 bits
Protocol
IP Header
Header Checksum Size 16 bits Includes all the other header fields Source IP Address Destination IP Address IP Optional Fields Loose source routing Strict Source Routing Record Route Count
Code: indicates the type of option
Len: length of the option in bytes
Ptr: shows where to store/read the next IP address; starts at 4
Data 4 bytes Data 4 bytes
OR
code 1
Data 4 bytes
Data 4 bytes
Normal Header
Optional Header
Fragmentation
TTL, checksum relationship
Calculate the checksum for the following IP header:
45 60 00 3c 3b 11 00 00 38 01 00 00 d1 00 00 68 c0 a8 01 02
Reference: http://mathforum.org/library/drmath/view/54379.html
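One way to check a hand calculation for this exercise is the sketch below, which computes the 16-bit one's-complement checksum over the header with the checksum field (octets 10 and 11) taken as zero. The function name `ip_checksum` is an assumption for the example.

```python
def ip_checksum(header: bytes) -> int:
    """One's-complement sum of 16-bit words, folded and inverted."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(int.from_bytes(header[i:i + 2], "big")
                for i in range(0, len(header), 2))
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Header from the exercise; the checksum field is already zero.
hdr = bytes.fromhex("4560003c 3b110000 38010000 d1000068 c0a80102")
checksum = ip_checksum(hdr)
print(f"checksum = 0x{checksum:04x}")

# A receiver recomputes over the full header, checksum included, and expects 0.
filled = hdr[:10] + checksum.to_bytes(2, "big") + hdr[12:]
print("verifies:", ip_checksum(filled) == 0)
```

Because TTL is part of the summed header, every router that decrements TTL must also update the checksum, which is the TTL/checksum relationship the slide points at.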
Points of Controversy
Do we need more than 255 hops? If the hop count is allowed to be very large, looping packets will be relayed many times before being discarded.
Should packets be larger than 64K? Allowing very large packets increases the size of queues and the variability of queuing delays.
Can we live without a checksum? Some IPv4 routers started to cut corners by not verifying checksums, to gain an advantage over the competition. Removing the checksum altogether offers all routers the same advantage.
Routing Process
Routing
One of the important functions of an IP network
The same IP code is used in routers and hosts
A table (the routing table) is used in the routing mechanism (routing packets)
The routing table is created at bootup, updated periodically (30 secs), and accessed frequently
Routing policy is used in deciding which routes go into the routing table
Routing Process
Routing daemon
Routing Updates From adjacent routers
route command
UDP
Yes
TCP
Our packet
routing table
Source routing
Process IP options
IP Layer
Network Interfaces
IP input Queue
IP=128.192.1.1
entry   netmask       net-address   next-hop      hop-count   (comments)
1       255.255.0.0   128.192.0.0   128.192.6.7   0           DCN
2       0.0.0.0       0.0.0.0       128.192.1.1   1           DR
netmask           net-address     next-hop          hop-count   (comments)
255.255.255.0     128.192.6.0     128.192.6.250     0
255.255.255.0     128.192.7.0     128.192.7.250     0
255.255.255.0     128.192.150.0   128.192.150.250   0
255.255.255.0     128.192.232.0   128.192.232.250   0
255.255.255.255   131.144.4.10    128.192.232.2     1
255.255.0.0       168.15.0.0      128.192.232.2     2
0.0.0.0           0.0.0.0         128.192.232.2     1
Routing Principles
The routing table drives all the routing decisions
IP Output searches the routing table with the given IP address in the following order:
1. Search for a matching host address in the destination field
2. Search for a matching network address in the destination field
3. Search for a default entry in the destination field
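The search order (host, then network, then default) falls out naturally if the entry with the longest matching netmask wins: a host entry has mask 255.255.255.255 and the default entry has mask 0.0.0.0. The sketch below applies that idea to the seven-entry routing table shown earlier; the function and variable names are assumptions for the example.

```python
import ipaddress

# (netmask, net-address, next-hop) rows from the routing table in the slides.
ROUTES = [
    ("255.255.255.0",   "128.192.6.0",   "128.192.6.250"),
    ("255.255.255.0",   "128.192.7.0",   "128.192.7.250"),
    ("255.255.255.0",   "128.192.150.0", "128.192.150.250"),
    ("255.255.255.0",   "128.192.232.0", "128.192.232.250"),
    ("255.255.255.255", "131.144.4.10",  "128.192.232.2"),
    ("255.255.0.0",     "168.15.0.0",    "128.192.232.2"),
    ("0.0.0.0",         "0.0.0.0",       "128.192.232.2"),
]


def to_int(dotted: str) -> int:
    return int(ipaddress.IPv4Address(dotted))


def lookup(dst: str, routes=ROUTES):
    """Return (net-address, next-hop) of the most specific matching entry."""
    best = None
    for mask, net, hop in routes:
        if to_int(dst) & to_int(mask) == to_int(net):
            if best is None or to_int(mask) > best[0]:
                best = (to_int(mask), net, hop)
    return None if best is None else (best[1], best[2])


for dst in ("131.144.4.10", "128.192.150.77", "168.15.3.4", "8.8.8.8"):
    print(dst, "->", lookup(dst))
```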
When an interface is configured using ipconfig:
For loopback and point-to-point interfaces, the route is to a host
For broadcast interfaces, the route is to a subnetwork
Routes can be added manually using the route add command
ICMP route discovery protocol
Routing daemon: dynamic routing protocols
Standalone host: only a loopback entry
Host connected to a single LAN: loopback and one subnet entry
Host connected to a single router: default entry points to the router
Host-specific and router-specific entries
ICMP Redirect
0 type 5
78
15 16
31
code 0-3
checksum
R2 IP Address
R1
R2
Host sending Router Solicitation message
Router sending periodic broadcast advertisements
The number of bits allocated for host is too many! Some of these bits are taken to create subnets. This created additional entries in Routing tables. Supernetting is created to reduce this explosion of routing table entries. It is also known as CIDR. It recommends certain ways to allocate IP addresses.
Supernetting
Dividing a class A/B/C net into a number of smaller nets is subnetting
Grouping a number of class A/B/C nets into a bigger net is supernetting
Pros: Routing entries are reduced in neighboring routers
Cons: Broadcast domain size is larger
Implementation could eliminate the cons in certain conditions
CIDR or Supernetting
netId
Router 1 Router 2
hostId
G1
129.168
host/net 129.168.x.x
Gateway G1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1x x x x x x x x
netId
subnetid
hostId
CIDR

Host/SubnetId   Host/SubnetId (binary)   Gateway
129.168.8       129.168.00001000         G1
129.168.9       129.168.00001001         G1
129.168.10      129.168.00001010         G1
129.168.11      129.168.00001011         G1
129.168.12      129.168.00001100         G2
129.168.13      129.168.00001101         G2
129.168.14      129.168.00001110         G2
129.168.15      129.168.00001111         G2
Supernetted:
129.168.8       129.168.000010xx         G1
129.168.12      129.168.000011xx         G3

Host/SubnetId   Host/SubnetId (binary)   Gateway
129.168.8       129.168.00001000         G1
129.168.9       129.168.00001001         G1
129.168.10      129.168.00001010         G1
129.168.11      129.168.00001011         G1
129.168.12      129.168.00001100         G1
129.168.13      129.168.00001101         G1
129.168.14      129.168.00001110         G1
129.168.15      129.168.00001111         G1
Supernetted:
129.168.8       129.168.00001xxx         G1
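The aggregation in these tables can be reproduced with `ipaddress.collapse_addresses` from the Python standard library. The sketch treats each 129.168.x entry as a /24 network, which is an assumption based on the binary third octet shown in the tables.

```python
import ipaddress

# The eight subnets 129.168.8 .. 129.168.15, each taken as a /24.
subnets = [ipaddress.ip_network(f"129.168.{n}.0/24") for n in range(8, 16)]

# Two gateways: 8-11 share the prefix 000010xx, 12-15 share 000011xx.
first_half = list(ipaddress.collapse_addresses(subnets[:4]))
second_half = list(ipaddress.collapse_addresses(subnets[4:]))
print(first_half, second_half)

# One gateway: all eight share the prefix 00001xxx.
everything = list(ipaddress.collapse_addresses(subnets))
print(everything)
```

Eight routing entries shrink to two entries with 22-bit prefixes, or to a single entry with a 21-bit prefix, which is the reduction that supernetting is meant to achieve.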
supernetted Router 1
Router 2
G1 G3
129.168 .8
.9
.10
.11
G2
Router 3
129.168
.12
.13
.14
.15
Membership Management Protocol (IGMPv1, v2, & v3 & MLD)
Multicast Routing Protocol (DVMRP, PIM-S, PIM-D)
Filtering
Unless specified otherwise, the Ethernet interface delivers to IP only those frames which are:
addressed to that interface, or
multicast/broadcast
Similarly, the IP layer delivers to the transport layers only those datagrams which are:
addressed to that IP address, or
multicast/broadcast
UDP broadcast: the port number is used to filter
Ethernet Broadcast
IP Multicast
Multicast
Applications
Multicast Group
Host group
It can span multiple networks
Membership in a host group is dynamic
There is no restriction on the number of members
A multicasting host need not be a member
A host can be a member of multiple groups
A multicast group has no size limits
Multicast groups can be either transient or permanent
IANA lists some well-known host groups, also known as permanent host groups:
224.0.0.1 means all systems on this subnet
224.0.0.2 means all routers on this subnet
224.0.1.1 is for NTP
224.0.0.9 is for RIP-2
224.0.0.22 is for IGMP V3 Report
The most significant 3 bytes of the MAC address are always set to 01 00 5e.
The most significant bit of the 3 least significant bytes is always set to 0.
The remaining 23 bits are copied from the IP multicast group address.
This is not a unique mapping!
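The mapping rule can be written out directly. In the sketch below, `multicast_mac` is a name invented for the example; the second and third calls show two different group addresses landing on the same MAC address, which is why the mapping is not unique.

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group address to its Ethernet multicast address."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a multicast (class D) address")
    low23 = int(addr) & 0x7FFFFF            # keep only the low 23 bits
    mac = (0x01005E << 24) | low23          # prefix 01-00-5e, then a 0 bit
    return mac.to_bytes(6, "big").hex("-")


print(multicast_mac("224.0.0.1"))
print(multicast_mac("224.0.1.1"))
print(multicast_mac("225.128.1.1"))   # differs only in the 5 discarded bits
```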
Multicast end-to-end
The sending process specifies a destination (multicast) IP address
The device driver on that host converts this to the corresponding Ethernet address
The receiving process notifies its IP layer about its participation in IP multicast
The device driver on the receiving host starts receiving these multicast frames (joining the multicast group)
When a multicast datagram is received, it is given to all participating processes (port numbers)
X X
162
IGMP is the multicast group membership protocol; there are 3 versions of IGMP (v1, v2, and v3)
There are two IGMP message types of concern to the IGMPv3 protocol: Query (0x11) and Report (0x22)
There is no separate version field: message type values code the version!
Query message fields: Type=0x11, Max Resp Code, Checksum, Group Address, xxx, QRV, QQIC, Number of Sources (N)
IGMP
Report
Report (Query) IP Layer maintains its own filter information Role: Periodic Query to prune and deprune by maintaining state information Querier Election Querier backup
Querier
The Join operation is equivalent to IPMulticastListen ( socket, interface, multicastaddress, EXCLUDE, {} ) and the Leave operation is equivalent to: IPMulticastListen ( socket, interface, multicastaddress, INCLUDE, {} ) where {} is an empty source list.
General Queries are sent with an IP destination address of 224.0.0.1 Group-Specific and Group-and-Source-Specific Queries are sent with an IP destination address equal to the multicast address of interest.
Number of Sources(N)
Auxiliary Data
IGMP Report Messages are sent either to respond to a Query or on filter state change. A "Current-State Record" (in response to a Query)
All of the above messages are sent to 224.0.0.22 (all IGMPv3 multicast routers)
Examples
Old State      New State      State-Change Record Sent
---------      ---------      ------------------------
INCLUDE (A)    INCLUDE (B)    ALLOW (B-A), BLOCK (A-B)
EXCLUDE (A)    EXCLUDE (B)    ALLOW (A-B), BLOCK (B-A)
INCLUDE (A)    EXCLUDE (B)    TO_EX (B)
EXCLUDE (A)    INCLUDE (B)    TO_IN (B)
Assume: A is { 1 2 3 4 5} and B is {1 2 6 7 8}
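The four cases in the examples can be worked through with set arithmetic. This is a minimal sketch using the assumed sets A and B from above; the function name and the tuple representation of records are illustrative choices, not part of any IGMP implementation.

```python
def state_change_records(old_mode, old_sources, new_mode, new_sources):
    """Return the state-change records for an interface filter change."""
    a, b = set(old_sources), set(new_sources)
    if old_mode == "INCLUDE" and new_mode == "INCLUDE":
        return [("ALLOW", b - a), ("BLOCK", a - b)]
    if old_mode == "EXCLUDE" and new_mode == "EXCLUDE":
        return [("ALLOW", a - b), ("BLOCK", b - a)]
    if new_mode == "EXCLUDE":
        return [("TO_EX", b)]   # INCLUDE (A) -> EXCLUDE (B)
    return [("TO_IN", b)]       # EXCLUDE (A) -> INCLUDE (B)

A = {1, 2, 3, 4, 5}
B = {1, 2, 6, 7, 8}
print(state_change_records("INCLUDE", A, "INCLUDE", B))
print(state_change_records("EXCLUDE", A, "INCLUDE", B))
```

With these sets, B-A is {6, 7, 8} and A-B is {3, 4, 5}, so an INCLUDE-to-INCLUDE change allows sources 6, 7, 8 and blocks 3, 4, 5.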
Query Robustness Variable (QRV): Default 2
Query Interval: Default 125 sec
Query Response Interval: Default 100 (10 secs)
Group Membership Interval: This value MUST be QRV times (the Query Interval) plus (one Query Response Interval).
Other Querier Present Interval: This value MUST be QRV times (the Query Interval) plus (one half of Query Response Interval).
Startup Query Interval: 1/4th of Query Interval
Startup Query Count: Robustness Value
Last Member Query Interval: 10 (1 sec)
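The derived timers follow directly from the three configurable values. A small sketch of the arithmetic with the defaults listed above (all values in seconds; the function and key names are illustrative):

```python
def igmp_timers(qrv=2, query_interval=125.0, query_response_interval=10.0):
    """Compute the derived IGMPv3 timer values from the configurable ones."""
    return {
        "group_membership_interval": qrv * query_interval + query_response_interval,
        "other_querier_present_interval": qrv * query_interval + query_response_interval / 2,
        "startup_query_interval": query_interval / 4,
        "startup_query_count": qrv,
    }

for name, value in igmp_timers().items():
    print(name, value)
```

With the defaults, a group membership expires after 2 x 125 + 10 = 260 seconds without a Report, and a non-querier takes over after 2 x 125 + 5 = 255 seconds of silence from the querier.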
IGMP Snooping
By default, Layer 2 switches do not have the intelligence of routers to selectively forward IP multicast; they simply forward it on all other ports
This reduces the effective bandwidth
IGMP Snooping is a feature added to Layer 2 switches
Simply enabling this feature at the interface or VLAN level prunes multicast traffic
IGMP Proxy
IGMP Proxy is useful in setting up multicast in SOHO networks IGMP proxy acts as a multicast host to upstream routers and acts as a multicast router for downstream hosts. Thus, IGMP proxy can be made to run in two different modes, as: Router mode, Proxy (Host) mode
Listens in multicast promiscuous mode.
Listens for IGMP Host Membership Report messages and Leave Group messages.
Sends IGMP Host Membership Queries.
Maintains entries in the IPv4 multicast forwarding table.
IGMP router mode can be enabled on multiple interfaces.
For each interface, a specific version of IGMP can be configured. The default version is IGMP v3.
PIM is a multicast routing protocol
There are three flavors of this protocol: PIM-Dense, PIM-Sparse, and Bidirectional-PIM.
PIM-Dense floods the entire network (from source to listeners) and then starts pruning redundant branches; the flooding is inefficient
PIM-Sparse builds the multicast tree bottom-up (listener to source)
PIM-Sparse
PIM-S is a multicast routing protocol
It does not depend on any specific unicast routing protocol (RIP, OSPF, ...)
It maintains its own multicast routing table
It also makes use of the unicast forwarding table of the routers to avoid loops.
Loops are identified using Reverse Path Forwarding (RPF)
Mobile IP Networks
Mobility, Header Compression, Tunneling
IETF Specifications
RFC2002 - IP Mobility Support - Describes a process for handling mobility of IP nodes.
RFC2003 - IP Encapsulation within IP - Describes IP tunneling (message encoding). Also describes various potential issues with IP tunneling and their solutions: avoiding routing loops and handling ICMP messages.
RFC2004 - Minimal Encapsulation within IP - An optimization of RFC2003 encapsulation.
RFC2507 - IP Header Compression - For special cases: IP carried over PPP, carrying a stream of UDP/RTP type payload.
RFC2508 - Compressing IP/UDP/RTP Headers for Low-Speed Serial Links
SCTP-ROHC - SCTP Robust Header Compression Draft
Internet
CN
Router
Operations
Packet Delivery
Agent Discovery
  Agent Solicitation
  Agent Advertisement
Registration
Tunneling and Encapsulation
  IP-in-IP
  Minimal Encapsulation
Packet Delivery
[Figure: packet delivery in Mobile IP. Labels: source CN, Internet, Router HA in the home network (encapsulator), tunnel, Router FA (decapsulator), destination MN in the foreign network; steps 1, 3, 4]
IP in IP
IP Header Compression
TCP/IP header size was never an issue with traditional delay-insensitive applications, except when the links were too slow (a few kilobytes) and the overhead of the header was high. Even in those adverse conditions, it was not critical. However, with the use of TCP/IP for delay-sensitive applications, this issue is becoming critical. There are some RFCs (2507-2509) we plan to examine under this initiative.
Fortunately, there are opportunities to reduce the header size in new scenarios.
Opportunities: tunneling, PPP, and repeating header information
Scope of RFC 2507: PPP
Headers compressed: IPv4, IPv6 base and extension, UDP, and TCP.
General method: establish a baseline and send only the delta.
Some Terminologies
Context: The context is the uncompressed version of the last header sent (compressor) or received (decompressor) over the link
Context ID: The context for a packet stream is associated with a context identifier (CID)
Generation: The context for non-TCP packet streams is also associated with a generation
FULL_HEADER - Indicates a packet with an uncompressed header.
COMPRESSED_NON_TCP - Indicates a non-TCP packet with a compressed header.
COMPRESSED_TCP - Indicates a packet with a compressed TCP header.
COMPRESSED_TCP_NODELTA - Indicates a packet with a compressed TCP header where all fields that are normally sent as the difference to the previous value are instead sent as-is.
CONTEXT_STATE - Indicates a special packet sent from the decompressor to the compressor to communicate a list of (TCP) CIDs for which synchronization has been lost.
UDP Characteristics
Connection-less (best effort)
No acknowledgement
Exchanges messages, one message at a time (simplex)
No flow and congestion control
Error check but no error control
Pseudo Checksum
The pseudo-header helps to detect misdelivered packets, that is, packets that arrive at the wrong address. However, the pseudo-header violates the protocol hierarchy because the IP addresses used in it belong to the IP layer and not to the UDP layer.
Pseudo-header fields (each row is 32 bits):
Source Address
Destination Address
00000000 | Protocol Number | UDP Segment Length
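To make the role of the pseudo-header concrete, here is a minimal sketch of the UDP checksum computation over the pseudo-header plus the UDP segment. The function names are illustrative; the protocol number 17 for UDP and the 16-bit one's-complement sum are standard.

```python
import socket
import struct

def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum with end-around carry."""
    if len(data) % 2:
        data += b"\x00"                      # pad to an even number of bytes
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)
    return total

def pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    # Source Address, Destination Address, zero byte, protocol (17), UDP length
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 17, udp_length))

def udp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum of a UDP segment whose checksum field is currently zero."""
    total = ones_complement_sum(pseudo_header(src_ip, dst_ip, len(segment)) + segment)
    checksum = ~total & 0xFFFF
    return checksum or 0xFFFF                # 0 is reserved for "no checksum"
```

A receiver repeats the sum over the pseudo-header and the received segment (checksum included); a result of 0xFFFF means the datagram is intact and was delivered to the address it was sent to.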
% udp receive udp1 <first hello received> % udp receive udp1 <2nd hello received>
Time
NetProWise
Characteristics of TCP
Connection-oriented (state based)
Reliable: timeout, buffering, checksum, positive cumulative acknowledgement, optional selective acknowledgement
Exchanges a byte stream: different from message (datagram) exchange, message transparent
Duplex
Flow control and congestion control
Limitations: limited to a single stream, no built-in reliability through multi-homing, timing issues
IP Header
TCP Header
TCP data
SYN  Synchronize sequence numbers to initiate a connection.
RST  Reset connection.
PSH  Push data to the receiving process ASAP.
URG  Urgent pointer is valid.
ACK  Acknowledgement is valid.
FIN  Sender is finished sending.
1. SYN: The requesting end (client) sends a segment with the SYN flag set, carrying the destination port and the client's initial sequence number (ISN).
2. SYN and ACK: The server replies with the SYN flag set and its own ISN, and ACKs the client's SYN with the next sequence number expected from the client (the client's ISN plus 1).
3. ACK: The client must ACK this SYN with the server's ISN plus 1.
Close Connection
1. FIN: Client sends a FIN
2. ACK: Server ACKs the client's FIN
3. FIN: Server sends a FIN
4. ACK: Client ACKs the server's FIN
Half-open: One end has closed or aborted the connection (for example, after a crash) without the knowledge of the other end.
Half-close: The client has no more requests and has sent its FIN, and the server has ACKed the FIN, but the server still has more data to send to the client.
Active/Passive close: The first host to issue a FIN performs the active close; the other host performs the passive close.
Maximum Segment Size (MSS)
Stream Sequencing
The sequence number field is used to maintain the order of the byte stream.
The sequence number indicates the octet order
The sequence number starts at a random offset (the ISN)
It is incremented by the number of octets carried in each segment
It rolls over to 0 after exceeding the maximum 32-bit value
The acknowledgment number is used to implement positive cumulative acknowledgment
Acknowledgement is piggybacked
Multiple datagrams are acknowledged with a single acknowledgement
Acknowledgement is implemented using an individual timer for each datagram
If the acknowledgement is not received, the datagram is resent and the timer is reset
After n tries the connection is declared failed.
Client
socket ()
Data (request)
Data reply
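# Service provider: read one line from the client and echo it back
proc echo {s a port} {
    puts "Received Request from: $a:$port"
    set l [gets $s]
    puts $s "Echoed: $l"
    # Flush the reply to the client before closing the connection
    flush $s
    puts "Echoed to stdout: $l"
    close $s
}
# Register echo as the handler for incoming connections on port 9999
socket -server echo 9999
puts "Ready"
# Enter the event loop so that connections are accepted
vwait x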
TCP Control
Flow Control using Sliding Window
Congestion Control using Slow Start
Silly Window Problem
A simple positive acknowledgement protocol wastes a substantial amount of network bandwidth
The sender must delay sending a new packet until it receives an acknowledgement for the previous packet
Sender               Receiver
Send packet 1
                     Receive packet 1, send ACK 1
Receive ACK 1
Send packet 2
                     Receive packet 2, send ACK 2
Receive ACK 2
Sliding Window
Uses network bandwidth better
Allows the sender to transmit multiple packets
The protocol places a small window on the sequence and transmits all packets that lie inside the window.
Initial window:  1 2 3 4 5 6 7 8 9 10
Window slides:   1 2 3 4 5 6 7 8 9 10
The window advances/slides upon the arrival of an ack
The sender sends only packets in the window
The receiver usually sends cumulative acks
Sender               Receiver
Send packet 1
Send packet 2
Send packet 3
                     Receive packet 1, send ACK 1
                     Receive packet 2, send ACK 2
                     Receive packet 3, send ACK 3
Receive ACK 1
Receive ACK 2
Receive ACK 3
SWP
The number of unacknowledged packets is constrained by the window size
The window size is limited to a small, fixed number. For example, if the window size is 8, the sender is permitted to transmit 8 datagrams before it receives an acknowledgement
Once the sender receives an ack for the first packet inside the window, it slides the window
The performance of SWP depends on the window size and the speed at which the receiver accepts packets
With a window size of 1, a sliding window protocol is exactly the same as the simple positive ack protocol
SWP
A well tuned SWP keeps the network completely saturated with packets
SWP partitions
The window partitions the sequence of the packets into three sets
Those packets to the left of the window have been successfully transmitted, received, and acknowledged
Those packets to the right have not yet been transmitted
Those packets that lie in the window are being transmitted
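The sliding window behaviour described above can be sketched as a small simulation. This is an idealized model, assuming no loss and that acknowledgements arrive in order, one at a time; the function names are illustrative.

```python
def simulate_sliding_window(num_packets: int, window_size: int):
    """Return the sender-side event sequence for a loss-free transfer."""
    base = 1              # oldest unacknowledged packet (left edge of window)
    next_to_send = 1
    events = []
    while base <= num_packets:
        # Transmit every packet that currently lies inside the window.
        while next_to_send < base + window_size and next_to_send <= num_packets:
            events.append(("send", next_to_send))
            next_to_send += 1
        # The ack for the oldest packet arrives and the window slides by one.
        events.append(("ack", base))
        base += 1
    return events

def partition(num_packets: int, base: int, window_size: int):
    """Split packet numbers into (acknowledged, in window, not yet sent)."""
    right = min(base + window_size, num_packets + 1)
    return (list(range(1, base)),
            list(range(base, right)),
            list(range(right, num_packets + 1)))

print(simulate_sliding_window(3, 3))
print(partition(10, 4, 3))
```

With `window_size=1` the event sequence alternates send and ack, which is the simple positive acknowledgement protocol; with a window of 3 all three packets go out before the first ack is needed.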
Caused by poor TCP flow control
If a server is unable to process all incoming data, it requests that its clients reduce the amount of data they send at a time
If the server continues to be unable to process all incoming data, the window becomes smaller and smaller
Sometimes the data transmitted is smaller than the packet header, making data transmission extremely inefficient
That is, the window size shrinks to a silly value: the overhead is about 40 bytes to send one byte of data
SWS Solution
Nagle's algorithm is used to solve SWS
Purpose: to avoid inefficient use of bandwidth
Sender operation:
  Buffer all user data if any unacknowledged data is outstanding
  OK to send if all data is ACKed or there is a full packet (MSS) worth of data to send
Receiver operation:
  The receiver resists advertising a window bigger than the one it is currently advertising unless it can be increased by at least MIN(one MSS, 0.5 * receiver's available buffer)
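The sender and receiver rules for avoiding the silly window can be written as two small decision functions. This is a sketch of the rules as stated on the slide, not of any particular TCP stack; the parameter names are assumptions made for the example.

```python
def nagle_can_send(buffered_bytes: int, unacked_bytes: int, mss: int) -> bool:
    """Sender side: send only if nothing is outstanding or a full MSS is ready."""
    if buffered_bytes == 0:
        return False                      # nothing to send
    return unacked_bytes == 0 or buffered_bytes >= mss

def receiver_may_grow_window(advertised: int, free_space: int,
                             buffer_size: int, mss: int) -> bool:
    """Receiver side: advertise a bigger window only if the increase is
    at least MIN(one MSS, half of the receiver's buffer)."""
    return free_space - advertised >= min(mss, buffer_size // 2)

print(nagle_can_send(buffered_bytes=10, unacked_bytes=500, mss=1460))
print(receiver_may_grow_window(advertised=1000, free_space=2500,
                               buffer_size=8192, mss=1460))
```

In the first call a 10-byte write is held back because data is still unacknowledged; in the second the window may grow because the increase of 1500 bytes exceeds one MSS.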
Problem: The normal response to lost packets is retransmission. This will only make congestion worse.
General approach
The receiver advertises its "receive window size".
The sender keeps track of the "congestion window size".
The amount of unacknowledged data must then be <= minimum { congestion window, receive window }.
Slow-Start
Start with ConWin = 1 MSS (maximum segment size)
Send one segment
If acknowledged within the time limit, double the ConWin ("slow start"), up to a certain threshold (another variable in the sender, initially 64K)
After reaching the threshold, increase ConWin by 1 MSS each time the acknowledgement is received in time.
Slow-start
If there is a timeout, indicating a lost packet, then reset:
  threshold = 1/2 of the congestion window at the time of the timeout
  ConWin = 1 MSS
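The growth of the congestion window can be traced with a short simulation. This is a simplified per-round model in units of MSS, not a real TCP implementation. It assumes that on a timeout the new threshold is half of the congestion window in effect at that moment, which is how standard TCP congestion control (RFC 5681) describes it; the function name and the `timeouts` parameter are illustrative.

```python
def simulate_slow_start(rounds: int, threshold: int = 64, timeouts=()):
    """Return the congestion window (in MSS) used in each round."""
    con_win = 1
    history = []
    for r in range(rounds):
        history.append(con_win)
        if r in timeouts:
            # Lost packet: halve the threshold basis and restart from 1 MSS.
            threshold = max(con_win // 2, 1)
            con_win = 1
        elif con_win < threshold:
            # Slow start: double, but do not overshoot the threshold.
            con_win = min(con_win * 2, threshold)
        else:
            # Past the threshold: grow linearly by 1 MSS per round.
            con_win += 1
    return history

print(simulate_slow_start(8, threshold=8))
print(simulate_slow_start(9, threshold=8, timeouts={5}))
```

The first trace shows exponential growth up to the threshold followed by linear growth; the second shows the restart from 1 MSS after a timeout in round 5.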
TCP Keep-Alive
To make sure that the other end of the TCP connection is still alive.
Homework: your summary for each protocol/module should include motivation, operation, message format, message exchange, API, deployment, configuration parameters, constants, repeatability of the hands-on, and miscellaneous items
This is due next week in bullet form; cut and paste is OK too. You need to own your summary!
No quiz today and tomorrow
Final exam next week covering all the material covered, with hands-on
Routing (RIP/BGP/MPLS), IPSec, Switch Architecture next week
The rest of the material needs to be covered this week!
The requirements of SS7 are unique and difficult to meet with either UDP or TCP
SS7 requires:
  Reliable transport
  Strict timing
  Reliability using multi-homing
  Better timing by avoiding head-of-the-queue blocking
SCTP offers
Messages can be ordered or un-ordered
Each SCTP packet can contain multiple user fragments and control fragments
Stronger validation
Flexible acknowledgement mechanism
Flow and congestion control
Path selection and monitoring
Reliable, unreliable, and partially reliable associations between end points
SCTP Association
An SCTP association provides reliable transfer of messages
There can be only one association between two end points
An end point can have any number of associations
An association can also be viewed as a collection of paths between the two ends of the transport
Only one path is used at a given time
[Figure: SCTP 1, SCTP 2, and SCTP 3 connected over an IP network; Association 1, Association 2, and a path]
SCTP Association
An association is created by a 4-way handshake
The primary path is used at the start of the association
If the primary path fails, the association can use an alternate path
Paths are monitored using HEARTBEAT
[Figure: SCTP 1 and SCTP 2 connected by multiple paths over an IP network]
SCTP Stream
Stream: a sequence of user messages (as opposed to a sequence of bytes) that are to be delivered in order
An association can carry multiple streams
The number of streams is negotiated at association setup time
Each stream is identified by a Stream ID (SID), a unique integer within an association
The SCTP user specifies the stream ID (16-bit integer) when a message is passed on to the SCTP layer
Messages within a stream are identified by their stream sequence numbers (SSN)
The SCTP layer assigns the sequence number
The sequence number is a 16-bit integer
Sequencing within the stream can also be optionally bypassed; the SSN is not used in such a case
When required, a (user) message is fragmented to conform to the given MTU
The fragmented message is assembled at the receiving end before it is sent to the destination SCTP user
The B and E flags and the Transmission Sequence Number (TSN) are used in assembling fragments
SCTP assigns a TSN to each user data fragment or un-fragmented message
The TSN is independent of the Stream Sequence Number (SSN)
The receiving end acknowledges all TSNs received, even if there are gaps in the sequence
Thus, reliable delivery is kept functionally separate from sequenced stream delivery
Retransmission is based on (lack of) timely acknowledgement
Retransmission is conditioned by congestion avoidance procedures
When a message is fragmented, the assembly is based on TSN, as all fragments use the same SSN!
[Figure: SCTP packet layout. Chunk 1 (Control): Type, Length. DATA chunk: Type, Resv, U/B/E flags, Length, TSN, Stream ID S, Stream Sequence #n, Payload Protocol Identifier, User Data (seq n of Stream S)]
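[Figure: association setup with SCTP Node B]
INIT chunk (to Node B)
Node B: receive INIT, create COOKIE, send INIT-ACK
INIT-ACK (from Node B)
Initiator: receive INIT-ACK, stop INIT timer, start COOKIE timer
COOKIE ECHO (to Node B)
COOKIE-ACK (from Node B)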
[Figure: data transfer, path monitoring, and shutdown with SCTP Node B]
DATA, acknowledged with a SACK chunk
HEARTBEAT: receive HEARTBEAT, send HEARTBEAT-ACK
SHUTDOWN, answered with SHUTDOWN-ACK
Receive SHUTDOWN-ACK, send SHUTDOWN COMPLETE, stop SHUTDOWN timer, delete TCB
ID   Description
0    DATA, Payload Data.
1    INIT, Initiation.
2    INIT ACK, Initiation Acknowledgement.
3    SACK, Selective Acknowledgement.
4    HEARTBEAT, Heartbeat Request.
5    HEARTBEAT ACK, Heartbeat Acknowledgement.
6    ABORT, Abort.
7    SHUTDOWN, Shutdown.
8    SHUTDOWN ACK, Shutdown Acknowledgement.
9    ERROR, Operation Error.
10   COOKIE ECHO, State Cookie.
11   COOKIE ACK, Cookie Acknowledgement.
12   ECNE, Reserved for Explicit Congestion Notification Echo.
13   CWR, Reserved for Congestion Window Reduced.
14   SHUTDOWN COMPLETE, Shutdown Complete.
Chunk Format
Chunk Type: refer to the list of chunk types
Chunk Type bits XX: used when the chunk type is not understood at the receiver
  00 stop processing the packet, no need to report
  01 stop processing the packet, report the error
  10 skip the chunk and process the rest of the packet, no report
  11 skip the chunk and process the rest of the packet, report the error
Chunk Length: in bytes, includes the type, flags, and length fields
Chunk Flags: the value depends on the chunk type value
Layout: Chunk Type (XXYYYYYY) | Chunk Flags | Chunk Length | Chunk Data
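The chunk header and the handling of unknown chunk types can be sketched as follows. The header is one byte of type, one byte of flags, and a two-byte length in network byte order; the function names and the returned tuple shapes are illustrative.

```python
import struct

# Action for an unrecognized chunk type, keyed by the two high-order bits (XX).
UNKNOWN_CHUNK_ACTIONS = {
    0b00: ("stop", False),   # stop processing the packet, no report
    0b01: ("stop", True),    # stop processing the packet, report the error
    0b10: ("skip", False),   # skip the chunk, process the rest, no report
    0b11: ("skip", True),    # skip the chunk, process the rest, report error
}

def parse_chunk_header(data: bytes):
    """Return (chunk type, chunk flags, chunk length) from the first 4 bytes."""
    chunk_type, flags, length = struct.unpack("!BBH", data[:4])
    return chunk_type, flags, length

def unknown_chunk_action(chunk_type: int):
    """Return (action, report_error) for a chunk type the receiver does not know."""
    return UNKNOWN_CHUNK_ACTIONS[chunk_type >> 6]

print(parse_chunk_header(bytes([0, 0x03, 0x00, 0x14])))
print(unknown_chunk_action(0xC1))
```

The first call decodes a DATA chunk (type 0) with the B and E flag bits set and a length of 20 bytes; the second shows that an unknown type starting with bits 11 is skipped and reported.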
Optional Parameters
Optional parameters are coded using Type/tag, Length, and Value
Parameter types are coded similarly to chunk types
The most significant two bits encode the required action when the parameter type is not known to the receiver
INIT Chunk
Initiation Tag: non-zero random 32-bit nonce value
Receiver Window Credit: initial window size used for flow control
# of Outbound Streams: number of streams the sender wishes to use
Max # of Inbound Streams: maximum number of streams the sender supports
Initial TSN: initial 32-bit TSN used for data transfer, which is also a random value (it may be copied from the initiation tag)
The INIT-ACK chunk is identical to the INIT chunk, but the type value is 2
INIT YES YES YES YES YES YES NO NO YES YES YES
INIT-ACK YES YES NO YES YES YES YES YES YES YES YES
Cookie-Echo and Cookie-ACK help prevent resource attacks
Both can be bundled with data chunks
U: Unordered; there is no Stream Sequence Number, both B and E bits are set to 1 when U is set
B: Beginning Fragment
E: End Fragment
Data Fragments
[Figure: received TSNs 109963, 109964, 109965, 109967, 109968, 109969, and 109970; TSNs 109966 and 109971 missing; two duplicates received]
Cumulative TSN: the highest consecutive TSN that the SACK sender has received
Receiver Window Credit: current RWND available for the peer to send
# of Fragments: number of Gap ACK Blocks included
# of Duplicates: number of Duplicate TSN reports included
Gap Ack Block Start / End TSN offset: the start and end offset for a range of consecutive TSNs received, relative to the cumulative ack point
The TSNs not covered by a Gap Ack Block indicate TSNs that are missing
Duplicate TSN: a TSN that has been received more than once
Note that the same TSN may be reported more than once
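The cumulative TSN and the Gap Ack Blocks can be derived from the set of received TSNs. A minimal sketch using the TSN values from the data fragments example; it ignores 32-bit TSN wrap-around, and the function name and argument names are illustrative.

```python
def sack_summary(received_tsns, first_expected: int):
    """Return (cumulative TSN, list of (start offset, end offset) gap blocks)."""
    received = set(received_tsns)
    # The cumulative TSN is the highest TSN received without any gap before it.
    cumulative = first_expected - 1
    while cumulative + 1 in received:
        cumulative += 1
    # Each gap block is a run of consecutive TSNs above the cumulative point,
    # expressed as offsets relative to the cumulative TSN.
    blocks = []
    for tsn in sorted(t for t in received if t > cumulative):
        offset = tsn - cumulative
        if blocks and blocks[-1][1] == offset - 1:
            blocks[-1][1] = offset
        else:
            blocks.append([offset, offset])
    return cumulative, [tuple(b) for b in blocks]

received = [109963, 109964, 109965, 109967, 109968, 109969, 109970]
print(sack_summary(received, first_expected=109963))
```

Here the cumulative TSN is 109965 and one Gap Ack Block with offsets 2 to 5 covers TSNs 109967 through 109970; TSN 109966 falls outside every block, so the sender knows it is missing.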
Heartbeat Data
Type=5 Flags=0 Length=variable Length=variable
Param Type = 1
Heartbeat Data
Value   Cause Code
-----   ----------
1       Invalid Stream Identifier
2       Missing Mandatory Parameter
3       Stale Cookie Error
4       Out of Resource
5       Unresolvable Address
6       Unrecognized Chunk Type
7       Invalid Mandatory Parameter
8       Unrecognized Parameters
9       No User Data
10      Cookie Received While Shutting Down
SCTP Deployment
ISDN/
/ISDN
SCTP uses an extended socket API
There are SCTP libraries for Windows and Linux
Download and install the SCTP Echo Client-Server with the SCTP library
Capture the Echo Client-Server interaction
Observe: handshakes, chunk types, cookie, TSN, acknowledgement, heartbeat, encoding of flags, window size, ...
Check the reliability part by disabling one of the interfaces on the server side
Client Window
echo_client server:8000
Connecting to [10.0.0.10]:8000 ...
Connecting Completed.
hello
SERVER: hello
SCTP Features
4-way handshake with cookie
Demonstrate reliability by disabling the loopback adapter
Acknowledgement
Streams
IPv6
Introduction
IPv6 Stack
Transport Layer: TCPv4 replaced with TCPv6, UDPv4 replaced with UDPv6
Internet Layer: IPv4 replaced with IPv6, ICMPv4 replaced with ICMPv6
Link Layer: Removed ARP, Added ND, OSPFv2 replaced with OSPFv3
Application Layer: HTTP, FTP, SMTP, SIP, RTP, DNS, DHCPv6, etc
Transport Layer: TCPv6, UDPv6, SCTPv6
Internet Layer: IPv6, ICMPv6, IPsec
Revised
  Addresses increased from 32 bits to 128 bits
  Time to Live renamed Hop Limit
  Protocol renamed Next Header
  Type of Service renamed Traffic Class
Streamlined
  Fragmentation fields moved out of base header
  IP options moved out of base header
  Header Checksum eliminated
  Header Length field eliminated
  Length field excludes IPv6 header
  Alignment changed from 32 to 64 bits
Extended
  Flow Label field added
IPv6 addresses are 16 octets (128 bits) in size
Represented in colon-separated hexadecimal notation
This splits the 128-bit address into eight 16-bit fields
Examples:
2001:df8:5403:3000:b5ea:976d:679f:30f5   A global unicast address using EUI-64
2001:df8:5403:3000::1e                   A manually assigned unicast address
fe80::b5ea:97ff:fe9f:30f5                A link-local address using EUI-64
ff02::1                                  A multicast address (all nodes on link)
::1                                      The loopback address for IPv6
::                                       The unspecified address (all zeros)
In addition to manual address configuration, IPv6 offers three types of auto-configuration:
1. Stateless Auto Configuration
2. Stateful Auto Configuration (with DHCPv6)
3. Both
EUI-64 is an IEEE specification
A host can automatically assign itself a unique 64-bit IPv6 interface identifier without manual or DHCP intervention
This unique identifier is derived from the 48-bit Ethernet address
Using this, both locally and globally scoped addresses for an interface can be generated
Issue of privacy!
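How the 64-bit interface identifier is derived from a 48-bit Ethernet address can be sketched as follows: the bytes ff:fe are inserted in the middle of the MAC address and the universal/local bit of the first byte is inverted (the modified EUI-64 format of RFC 4291). The MAC address used below is a made-up example, and the function names are illustrative.

```python
import ipaddress

def eui64_interface_id(mac: str) -> bytes:
    """Build the 64-bit modified EUI-64 interface identifier from a MAC."""
    octets = bytearray(int(part, 16) for part in mac.replace("-", ":").split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02                       # invert the universal/local bit
    return bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])

def eui64_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Combine a /64 prefix with the EUI-64 interface identifier."""
    network = ipaddress.IPv6Network(prefix)
    interface_id = int.from_bytes(eui64_interface_id(mac), "big")
    return ipaddress.IPv6Address(int(network.network_address) | interface_id)

print(eui64_address("fe80::/64", "00:1a:2b:3c:4d:5e"))
print(eui64_address("2001:df8:5403:3000::/64", "00:1a:2b:3c:4d:5e"))
```

The same interface identifier appears in both the link-local and the global address, which is the privacy issue noted above: the hardware address can be recognized wherever the host connects.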
The scope of an address specifies in what part of the network it is valid and unique.
Node-local: ::<number>
Link-local: prefix 1111 1110 10 (FE80) followed by 54 bits of 0
Site-local*
Global: prefix b001/3 (b for binary)
Address Resolution
Maps an on-link IPv6 address to its corresponding link-layer address
Address resolution is never performed on multicast addresses.
When a node has a unicast packet to send to a neighbor, but does not know the neighbor's link-layer address, it performs address resolution.
Address Resolution
Step 1 - If neighbor B's entry is NOT in the neighbor cache, then neighbor discovery is initiated
Step 2 - A new INCOMPLETE entry is created for B in the neighbor cache, and a Neighbor Solicitation message is sent to B using the solicited-node multicast address of B. This solicitation message includes A's MAC address as the Source Link-layer Address option
Step 3 - B receives the Neighbor Solicitation message and responds with a Neighbor Advertisement message, sent to A's MAC address.
Step 4 - A receives the Neighbor Advertisement message from B, and then updates B's entry in its Neighbor Cache (REACHABLE)
Step 5 - A can now send the original packet it wanted to send to B using B's MAC address.
Note: There is no broadcast; even the multicast is limited!
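The multicast address that A uses in Step 2 is formed from the prefix ff02::1:ff00:0/104 plus the low-order 24 bits of B's unicast address. A minimal sketch, using address examples that appear earlier in these slides; the function name is illustrative.

```python
import ipaddress

SOLICITED_NODE_BASE = int(ipaddress.IPv6Address("ff02::1:ff00:0"))

def solicited_node_multicast(unicast: str) -> ipaddress.IPv6Address:
    """Return the solicited-node multicast address for a unicast address."""
    low24 = int(ipaddress.IPv6Address(unicast)) & 0xFFFFFF
    return ipaddress.IPv6Address(SOLICITED_NODE_BASE | low24)

print(solicited_node_multicast("fe80::b5ea:97ff:fe9f:30f5"))
print(solicited_node_multicast("2001:df8:5403:3000::1e"))
```

Only nodes whose addresses end in the same 24 bits listen on the resulting group, which is why the solicitation reaches far fewer hosts than an ARP broadcast would.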
The Windows IPv6 Neighbor Cache (NC) is the equivalent of the IPv4 ARP cache
The Neighbor Cache can be managed from the Windows command shell using the ipv6 utility
Lab: Demo
NC with no neighbor entry
Ping neighbor and create an entry
Identify Neighbor Solicitation and Neighbor Advertisement messages triggered by ping6
Each host maintains the following tables for routing IPv6 data packets
Neighbor Cache:
  ARP cache equivalent; maps IP addresses to link-layer addresses
  Holds neighbor addresses, Host/Router flag, pointer to queued packets, and details of the unreachability algorithm
Destination Cache:
  A set of entries about destinations to which traffic has been sent recently.
  The Destination Cache maps a destination IP address to the IP address of the next-hop neighbor
Prefix List:
  A list of the prefixes that define a set of addresses that are on-link.
  Prefix List entries are created from information received in Router Advertisements.
Default Router List:
  A list of routers to which packets may be sent.
  Router list entries point to entries in the Neighbor Cache
First, the Destination Cache is examined for a match. If there is a match, the NC is used to find the DA for the link layer.
Otherwise, next-hop determination is performed:
  The destination IP address in the packet is checked against the Prefix List for the longest prefix match, to find out if the destination is on-link or off-link
  If the destination is on-link, the next-hop address is the same as the destination address in the packet.
  Otherwise, the sender selects a router from the Default Router List based on certain criteria (explained later)
Next-hop determination is not performed on every packet that is sent. Instead, the results of next-hop determination are saved in the Destination Cache (which also contains Redirect results)
Once the IP address of the next hop is known, the Neighbor Cache is examined for the link-layer address
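The next-hop determination steps can be sketched with plain Python data structures. This is a simplified model: the caches are a dictionary and two lists, and router selection simply takes the first entry of the Default Router List, which is an assumption made here because the selection criteria are covered later. All names and addresses are illustrative.

```python
import ipaddress

def next_hop(destination, destination_cache, prefix_list, default_routers):
    """Return the next-hop IPv6 address for a destination, caching the result."""
    dest = ipaddress.IPv6Address(destination)
    if dest in destination_cache:                 # 1. Destination Cache hit
        return destination_cache[dest]
    if any(dest in prefix for prefix in prefix_list):
        hop = dest                                # 2. on-link: next hop is the destination
    elif default_routers:
        hop = default_routers[0]                  # 3. off-link: pick a default router
    else:
        raise LookupError("no default router available")
    destination_cache[dest] = hop                 # save for later packets
    return hop

cache = {}
prefixes = [ipaddress.IPv6Network("2001:df8:5403:3000::/64")]
routers = [ipaddress.IPv6Address("fe80::1")]
print(next_hop("2001:df8:5403:3000::1e", cache, prefixes, routers))
print(next_hop("2001:db8::99", cache, prefixes, routers))
```

After the next hop is known, a real stack would look up its link-layer address in the Neighbor Cache, starting address resolution if no entry exists.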
http://technet.microsoft.com/en-us/library/dd392266%28WS.10%29.aspx
Observing NC States
Delete NC entry
1. Router Discovery
2. Address Resolution
3. Subnet Prefix Discovery
4. Parameter Discovery
5. Stateless Address Auto configuration (SAA)
6. Next-hop Determination
7. Neighbor Unreachability Detection (NUD)
8. Duplicate Address Detection (DAD)
9. Redirect (more prevalent in IPv6)
NDP Messages
There are five ICMPv6 messages that ND uses to accomplish these things:
1. Router Solicitation
2. Router Advertisement
3. Neighbor Solicitation
4. Neighbor Advertisement
5. Redirect
On-link Unicast IPv6 Link Layer address IsRouter Flag Reachability State address
Destination Cache
Router
A node can determine the link-local address of the router(s) on the local link at any time, specifically at power on
The node sends a Router Solicitation to multicast group ff02::2 with its link-local address as the source address
All routers on the link respond with Router Advertisement messages, sent either to ff02::1 or to the node's link-local address.
Source addresses from advertisements are used to build the
Checksum
A Router Advertisement can be sent as a response to a Router Solicitation or as a periodic message
IP SA: the link-local address from which this message is sent
IP DA: multicast (all nodes on the link) or the source address of an invoking Router Solicitation
One of the option fields is the Subnet Prefix
Checksum
Router Life Time
Stateless Auto-configuration
Duplicate Address Detection Router & Prefix Discovery
Router Advertisement
NO
YES
M=0
YES NO
Stateless Configuration
Manual Configuration
Procedure is identical to router discovery, except for extracting the subnet prefix option from advertisement
Both hosts and routers perform DAD on all unicast and anycast addresses regardless of how they are obtained
OffLink
Router Selection
Not ReachableNeighbour
Unreachability Detection
NO
Reachable
Router Selection
NO ENTRY EXISTS
11
INCOMPLETE
Send Packet
6
Migration Mechanisms
1. Co-existence: It involves all client and server nodes supporting both IPv4 and IPv6 in their network stacks.
2. Tunneling: Carrying IP in IP
3. Translation: Using a NAT-like mechanism that maps one type of IP to another
4. Application Proxies: Changes at the application layer to enable interworking between clients and servers that are of different types
5. IPv4-Mapped IPv6 Address: For instance, IPv4 address 192.168.1.1 is mapped to IPv6 address ::ffff:192.168.1.1
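An IPv4-mapped IPv6 address places the 32-bit IPv4 address in the low-order bits, preceded by 16 one bits (ffff) and 80 zero bits. A short sketch using Python's standard `ipaddress` module; the function name is illustrative.

```python
import ipaddress

def to_ipv4_mapped(ipv4: str) -> ipaddress.IPv6Address:
    """Return the IPv4-mapped IPv6 address (::ffff:a.b.c.d) for an IPv4 address."""
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address((0xFFFF << 32) | int(v4))

mapped = to_ipv4_mapped("192.168.1.1")
print(mapped.exploded)
print(mapped.ipv4_mapped)        # recovers the embedded IPv4 address
```

The form without the ffff marker (::192.168.1.1) is a different, deprecated format known as the IPv4-compatible address; the `ipv4_mapped` property returns None for it.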
IPv4 addresses are expected to be exhausted (at IANA) by mid 2011 (Feb), this year, in a few months!
We will have about 4 billion devices with IPv4 addresses
Most of them may not be ready for IPv6: Windows XP machines, PlayStations, CPE DSL modems, ...
Some content on the Internet may stay on IPv4 as well
This requires hosts and routers to run Dual Stack.
There are several commercial offerings of Dual Stack
Link layer support for IPv4 and IPv6
IPv6 ND and IPv4 ARP for address resolution
ICMP and ICMPv6
TCP-v6 and UDP-v6 with revised checksum calculation
Socket API for v4 and v6
All 4 combinations of connections are supported with Dual Stack
The only issues are the additional complexity of implementation and deployment, and the additional memory requirements.
Application Layer: HTTP, FTP, SMTP, SIP, RTP, DNS, DHCPv4, DHCPv6, etc
Transport Layer (v4): TCPv4, UDPv4        Transport Layer (v6)
Internet Layer (v4): IPv4, ICMPv4         Internet Layer (v6)
Link Layer
DS-Lite Approach
Move as many devices to IPv6 as possible
Build an IPv6 transition bridge for the IPv4 long tail
Goal: Provide IPv4 service without providing a dedicated IPv4 address
Technology:
  Leverage IPv6 access infrastructure
  Provide only IPv6 addresses to endpoints
  Share IPv4 addresses in the access networks
DS-lite: IPv4/IPv6 tunnel + provider NAT
Why CGN: fewer public addresses are available in IPv4
172.16.0.1/32 is also a private address in the following diagram
The public address 192.0.2.1/32 could be used with multiple private addresses (home networks), thus conserving public addresses
[Figure: host 10.0.0.1 behind an Edge NAT; 172.16.0.1/32 (RFC 1918 address) toward the ISP; 192.0.2.1/32 (public address) toward the IPv4 Internet]
Before translation: IPv4 Src: 10.0.0.1, Dst: 128.0.0.1; TCP Port Src: 10000, Dst: 80
After translation: IPv4 Src: 129.0.0.1, Dst: 128.0.0.1; TCP Port Src: 5000, Dst: 80
Broker enables tunneling between IPv6 client and server over IPv4 network
What is DHCP ?
A host requires an IP address, netmask, etc. to boot and connect to a network.
DHCP provides a framework for a host to obtain this information automatically.
DHCP builds on BOOTP.
Motivation
Ease of use
DHCP integrates the functions of RARP, TFTP, and ICMP
RARP was developed for Sun's diskless X-terms
It is based on the client-server paradigm
The host to be configured runs the DHCP client
Question: How can a host without any IP address participate in a client-server interaction over the network?
DHCP System
There can be multiple clients obviously! There can be multiple DHCP servers in an organization There can be multiple relay agents within a subnet
Allow automated assignment of configuration parameters to new clients, to avoid hand configuration.
Support fixed or permanent allocation of configuration parameters to specific clients.
Study Configuration
DHCP Messages
Cost
NAT Host
IPyip
1. Typically, in a SOHO environment, NAT enables a client in a private network to communicate with a server on the Internet.
2. A NAT can also enable a client on the Internet to connect to a virtual server in a private network.
3. A NAT can also enable communication between a client in a private network and a virtual server in another private network.
[Figure: SOHO network. Hosts 192.168.1.2 and 192.168.1.3 (server in DMZ) sit behind 192.168.1.1. The negotiated IP address 222.182.22.39 is dynamically assigned by the ISP on the ADSL line. Other labels: Internet, ISP, ADSL Provider.]
[Figure: NAT at 192.168.1.1 on the ADSL line, between the private network and the ADSL Provider, ISP, and Internet. Virtual servers: 1. Web 2. FTP]
Number  Mapped Port  Transport  Private Port  Type
1       5600         UDP        4500          D
2       7200         TCP        3400          D
3       80           TCP        80            S
4
5
Port Mapping
Port mapping is used to map multiple private IP addresses to a single registered IP address.
Whenever possible, the original source port number is preserved. A new source port number is generated only if some other host is already using the original source port number.
IP address mapping must be on.
Outgoing packets are port-mapped only if their source port numbers are greater than 1024.
Incoming packets are port-mapped as needed to allow forwarding of replies to the original host.
Incoming packets with a port number contained in the Permanent Port Mapping Table are forwarded to the host and port number specified in that table.
Incoming packets with a mapping already in the Port Mapping Table are unmapped normally.
For incoming UDP packets, such mappings usually expire after 60 seconds, unless the mapping was made permanent by adding the client application's destination port to the UDP Port Table.
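The port-preservation rule can be sketched as a small mapping table. This is an illustration only: the class `PortMapper`, its method names, and the starting replacement port 1025 are assumptions, and timers, the permanent table, and the "greater than 1024" rule are left out.

```python
class PortMapper:
    """Toy port-mapping table for one registered (public) IP address."""

    def __init__(self, public_ip, first_free_port=1025):
        self.public_ip = public_ip
        self.out = {}    # (proto, private_ip, private_port) -> mapped port
        self.back = {}   # (proto, mapped port) -> (private_ip, private_port)
        self.next_port = first_free_port

    def outgoing(self, proto, private_ip, private_port):
        """Return the (public_ip, mapped_port) used for an outgoing packet."""
        key = (proto, private_ip, private_port)
        if key in self.out:
            return self.public_ip, self.out[key]
        port = private_port
        # Keep the original source port unless another host already uses it.
        while (proto, port) in self.back:
            port = self.next_port
            self.next_port += 1
        self.out[key] = port
        self.back[(proto, port)] = (private_ip, private_port)
        return self.public_ip, port

    def incoming(self, proto, mapped_port):
        """Return the private (ip, port) for a reply, or None if unmapped."""
        return self.back.get((proto, mapped_port))

nat = PortMapper("222.182.22.39")
first = nat.outgoing("tcp", "192.168.1.2", 5000)   # port preserved
second = nat.outgoing("tcp", "192.168.1.3", 5000)  # clash, new port chosen
```

Two hosts using the same source port end up with distinct mapped ports, so replies can be forwarded back unambiguously.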
DSL NAT
Firewall
Introduction
A firewall helps keep computers more secure.
A firewall regulates the network traffic passing through it based on its configuration.
The success of a firewall depends very much on the way it is configured.
Firewall technology emerged in the late 1980s, when the Internet was a fairly new technology.
A firewall's function is similar to that of ACLs and anti-virus software, but it is distinct from both.
Firewall Configuration
User Class
Customers, Partners, Consultants, Vendors, Employees, Administrators, Executives, ...
Data Class
User Data, Employee Data, Product Data, Corporate Data, ...
Corporate data needs to be highly secure.
Completed product specifications are exposed to customers and the public.
Product design details are protected from the public, customers, ...
Windows Firewall
Internet Gateway & Management Console
Internet PSTN
1 DSLAM
NAS Server
4 2/3
ISP Intranet
RADIUS Server
Firewall Operations
Turning the firewall on or off
Allowing and disallowing applications through the firewall
Filtering based on port numbers
Logging firewall events
Configuring the firewall for a specific interface
Allowing applications from a certain trusted host
Filtering various ICMP packets
Review of pin-hole creation
FTP data transfer
Close the pin-holes
Understand the stateful behavior of Windows Firewall
Confidentiality
Web Server
DB Server Router
Firewall
Technology Enterprises Group
Firewall
Public access from the Internet is bounded to the DMZ.
Intranet users are also bounded by the DMZ.
There is no direct communication from an Internet application to an intranet application; the DMZ acts as mediator.
The intranet uses private IP addresses; the DMZ, including the inner firewall, uses public addresses.
The inner firewall runs NAT.
Prepared or Annotated by Dr. Hari
Illustrate ACL
Runs on UDP
No authentication
No encryption
No listing of directories
Server port 69
Get and put of files are possible
Three transfer modes are defined: netascii (ASCII), octet (binary), and mail
Opcode  Command  Description
1       RRQ      Read Request
2       WRQ      Write Request
3       DATA     File Data
4       ACK      Data Acknowledge
5       ERROR    Error

Data packet:   Opcode (2 bytes) | Block # (2 bytes) | Data (0-512 bytes)
ACK packet:    Opcode (2 bytes) | Block # (2 bytes)
Error packet:  Opcode (2 bytes) | Error Code (2 bytes) | ErrMsg (string) | 0 (1 byte)
In an Error packet the Opcode is 5. Error Code is an integer indicating the nature of the error:
0  Not defined, see error message (if any)
1  File not found
2  Access violation
3  Disk full or allocation exceeded
4  Illegal TFTP operation
5  Unknown transfer ID
6  File already exists
7  No such user
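The packet layouts above can be sketched with Python's `struct` module. The function names are hypothetical. The request layout (filename, zero byte, mode, zero byte) is not shown on the slides and is taken here from the TFTP specification (RFC 1350) as an assumption.

```python
import struct

RRQ, WRQ, DATA, ACK, ERROR = 1, 2, 3, 4, 5

def make_request(opcode, filename, mode="octet"):
    """Read/Write request: opcode, filename, 0, mode, 0 (layout per RFC 1350)."""
    return (struct.pack("!H", opcode)
            + filename.encode("ascii") + b"\0"
            + mode.encode("ascii") + b"\0")

def make_data(block, payload):
    """Data packet: 2-byte opcode, 2-byte block number, 0-512 bytes of data."""
    if len(payload) > 512:
        raise ValueError("a TFTP data block carries at most 512 bytes")
    return struct.pack("!HH", DATA, block) + payload

def make_ack(block):
    """Acknowledgement: 2-byte opcode, 2-byte block number."""
    return struct.pack("!HH", ACK, block)

def make_error(code, message):
    """Error packet: 2-byte opcode, 2-byte error code, message, 1 zero byte."""
    return struct.pack("!HH", ERROR, code) + message.encode("ascii") + b"\0"

def parse_opcode_and_block(packet):
    """Return (opcode, second 16-bit field) of a DATA, ACK, or ERROR packet."""
    return struct.unpack("!HH", packet[:4])
```

A data block shorter than 512 bytes marks the end of the transfer, which is why the payload size is checked but not padded.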
TFTP - Trace
Extended implementations support options (the TFTP option extension, RFC 2347); block size is one of the options.
Typically, a client proposes a non-default block size in its first message, the Read/Write request.
The server acknowledges this proposal with an additional opcode, Options Acknowledgement (OACK).
If the client is putting a file, it specifies the file size in its first request.
If the server is sending the file, it specifies the file size in the Options Acknowledgement.
http://tftpd32.jounin.net/
Motivation
IP addresses (numbers) are difficult to remember.
We use domain names (strings) instead.
Domain names are only for human consumption; they are not used as "to" or "from" addresses.
A mechanism is required to convert a domain name to its corresponding IP address and vice versa.
This is accomplished by the Domain Name System (DNS).
DNS is implemented using the client-server paradigm.
The socket library supports two functions for DNS-IP mapping.
The mechanism by which Internet software translates names to addresses and vice versa
A globally distributed, loosely coherent, scalable, reliable, dynamic database
Comprised of three components:
  A name space
  Name servers making that name space available
  Resolvers (clients) that query the servers about the name space
DNS is a mapping service and NOT a directory service
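[Figure: DNS name space tree. The root has the null label "". Top-level domains: com, edu, gov, int, mil, net, org. Lower-level nodes shown: netpro, metainfo, berkeley, vit, nato, army, uu, west, east, www, dakota, tornado.]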
Each node in the tree must have a label: a string of up to 63 characters.
The DNS protocol makes NO limitation on what binary values are used in labels.
Sibling nodes must have unique labels.
The null label is reserved for the root node.
A domain name is the sequence of labels from a node to the root, separated by dots ("."), read left to right.
The name space has a maximum depth of 127 levels.
Domain names are limited to 255 characters in length.
A URL could be longer than 255 characters.
A node's domain name identifies its position in the name space.
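The limits listed above can be checked mechanically. The following is a rough sketch; the function name `check_domain_name` is hypothetical, labels are restricted to ASCII for simplicity, and the 255 limit is counted on the encoded form (one length octet per label plus the root's zero octet), which is an assumption about how the limit is measured.

```python
def check_domain_name(name):
    """Split a domain name into labels, enforcing the name space limits."""
    if name.endswith("."):          # a trailing dot names the root explicitly
        name = name[:-1]
    labels = name.split(".") if name else []
    for label in labels:
        if not label:
            raise ValueError("the null label is reserved for the root node")
        if len(label.encode("ascii")) > 63:
            raise ValueError("label longer than 63 characters: %r" % label)
    if len(labels) > 127:
        raise ValueError("name space depth exceeds 127 levels")
    # Encoded length: a length octet before each label, plus the root's 0.
    encoded_length = sum(len(label) + 1 for label in labels) + 1
    if encoded_length > 255:
        raise ValueError("domain name longer than 255 octets")
    return labels

print(check_domain_name("www.netpro.com."))
```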
Subdomains
One domain is a subdomain of another if its apex node is a descendant of the other's apex node.
More simply, one domain is a subdomain of another if its domain name ends in the other's domain name.
So sales.netpro.com is a subdomain of netpro.com and of com, and netpro.com is a subdomain of com.
Zones
[Figure: the netpro.com domain versus the netpro.com zone, with the separate zones rwc.netpro.com and ams.netpro.com. Other labels shown: "" (root), .arpa, acmebw, rwc, molokai, skye, www, cheddar.]
Name Servers
Name servers store information about the name space in units called zones.
The name servers that load a complete zone are said to have authority for, or be authoritative for, the zone.
Usually, more than one name server is authoritative for the same zone. This ensures redundancy and spreads the load.
Also, a single name server may be authoritative for many zones.
Two main types of servers:
  Authoritative: maintains the data
    Master: where the data is edited
    Slave: where the data is replicated to
  Caching: stores data obtained from an authoritative server
Name Resolution
A DNS query has three parameters:
  A domain name (e.g., www.netpro.com). Remember, every node has a domain name!
  A class (e.g., IN)
  A type (e.g., A)
A name server receiving a query from a resolver looks for the answer in its authoritative data and its cache.
If the server isn't authoritative for the answer and the answer isn't in the cache, the answer must be looked up.
DNS Client
[Figure: DNS client operation, steps (1) to (9). Labels: hostname, Resolver, IP address, FTP, "Establish connection with IP address", TCP, "Send IP datagram to IP address", IP, ARP, "ARP Request (Ethernet broadcast)", Ethernet Driver.]
#include <netdb.h>

struct hostent *gethostbyname(const char *name);
struct hostent *gethostbyaddr(const void *addr, int len, int type);

struct hostent {
    char  *h_name;       /* official name of host */
    char **h_aliases;    /* alias list */
    int    h_addrtype;   /* host address type */
    int    h_length;     /* length of address */
    char **h_addr_list;  /* list of addresses */
};
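Python's standard `socket` module wraps the same resolver call. A minimal sketch follows; the helper name `lookup` and the dictionary keys (chosen to mirror the `hostent` fields) are illustrative only. A dotted-quad string is resolved without any DNS traffic, which makes it a convenient first test.

```python
import socket

def lookup(name):
    """Resolve a host name and return the hostent-like fields as a dict."""
    official, aliases, addresses = socket.gethostbyname_ex(name)
    return {
        "h_name": official,        # official name of host
        "h_aliases": aliases,      # alias list
        "h_addr_list": addresses,  # list of IPv4 addresses (strings)
    }

result = lookup("127.0.0.1")
print(result["h_addr_list"])
```

Calling `lookup("www.netpro.com")` would go through the resolver and requires network access; the outcome depends on the local DNS configuration.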
[Figure: name resolution for "ping www.netpro.com.", steps 1 to 8. Hosts shown: m.root-servers.net, f.gtld-servers.net, ns1.sanjose.netpro.net, annie.west.sprockets.com.]
Universal response: The same query always gets the same answer no matter where it was asked or which name server(s) were queried
AF Afghanistan AL Albania
DZ Algeria ...
YU Yugoslavia ZM Zambia
ZW Zimbabwe
Type         Meaning
CNAME        Canonical NAME
AAAA or A6   Quad A
SOA          Start Of Authority
PTR          PoinTeR
HINFO        Host INFOrmation
MX           Mail EXchange
The Windows client resolver is the program that maps domain names to IP addresses.
The client resolver sends requests to DNS to achieve this.
The client resolver keeps a cache of the mappings it has learnt.
This cache can be managed using the ipconfig utility:
  ipconfig /displaydns displays the content of the cache
  ipconfig /flushdns clears this cache
Issue the following two commands and review the DNS records:
  ipconfig /flushdns
  ipconfig /displaydns
ping www.yahoo.com
Observe DNS frames in Ethereal Observe the cache
DNS Protocol
UDP as transport (also runs on TCP)
Server's well-known port: 53
DNS PDU format
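The PDU layout itself is not reproduced on the slide. As an illustration, the sketch below builds a query following the DNS specification (RFC 1035): a 12-byte header, the name as length-prefixed labels, then type and class. The function names, the query ID 0x1234, and the choice to set only the "recursion desired" flag are assumptions.

```python
import struct

def encode_name(name):
    """Encode a domain name as length-prefixed labels ending with a 0 octet."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\0"

def build_query(name, qtype=1, qclass=1, query_id=0x1234):
    """Build a DNS query PDU: type 1 is A, class 1 is IN."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, 0 answers,
    # 0 authority records, 0 additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    return header + encode_name(name) + struct.pack("!HH", qtype, qclass)

query = build_query("www.netpro.com")
print(len(query), query.hex())
```

Sending `query` in a UDP datagram to port 53 of a name server would produce the frames that the Ethereal exercise asks to observe.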
RTP
RTCP
TCP
(till now)
RTP
Does NOT assume the underlying network is reliable and delivers PDUs in sequence
Fixed Header
by mixers)
Mixer
RTP mixer - an intermediate system that receives & combines RTP PDUs of one or more RTP sessions into a new RTP PDU
Reports contain statistics such as the number of RTP PDUs sent, the number of RTP PDUs lost, and inter-arrival jitter.
They are used by the application to modify sender transmission rates and for diagnostic purposes.
Typically, several RTCP PDUs of different types are transmitted in a single UDP PDU
VLAN 3
VLAN 2
Virtual LAN
A Virtual LAN is a collection of hosts that behave as though they are on one LAN segment even though they are not.
They are in a controlled broadcast domain with no collisions.
VLANs are created and supported by Layer 2 and multi-layer switches.
Virtual LAN
VLAN 3
VLAN 2
Switches create individual collision-free LAN segments (1 host per segment).
This eliminates not only collisions but also the inherent broadcast.
Broadcast is a desirable characteristic at a certain level.
Controlled broadcast is added to switched networks through the VLAN concept.
Because broadcast is controlled, there is no need to restrict it to ports in a single device.
VLAN ID = 10
VLAN ID = 20
Port 5 3 2
Port 7 6 4
Engineering
Marketing
VLAN Types
Layer 1: Port based. Example: Ports 5, 3, and 2 are in VLAN 10, and Ports 7, 6, and 4 are in VLAN 20. A broadcast from Port 5 is confined to Ports 3 and 2.
Layer 2: Src MAC, Dst MAC, Frame Type
Layer 3: Src IP, Dst IP, ...
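The port-based example can be sketched as a lookup table. The table `PORT_VLAN` copies the port numbers and VLAN IDs from the example above; the function name `broadcast_ports` is hypothetical.

```python
# Port-to-VLAN assignment from the example: VLAN 10 and VLAN 20.
PORT_VLAN = {5: 10, 3: 10, 2: 10, 7: 20, 6: 20, 4: 20}

def broadcast_ports(ingress_port, port_vlan=PORT_VLAN):
    """Return the ports that receive a broadcast arriving on ingress_port."""
    vlan = port_vlan[ingress_port]
    return sorted(port for port, v in port_vlan.items()
                  if v == vlan and port != ingress_port)

print(broadcast_ports(5))   # ports sharing VLAN 10 with port 5
print(broadcast_ports(7))   # ports sharing VLAN 20 with port 7
```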
VLAN Operations
Intra-VLAN Bridging
Inter-VLAN Routing
A VLAN could span multiple switches Consider a scenario as shown in the following figure We need a mechanism to identify a frame with a VLAN in this situation, otherwise a frame cannot be delivered to right VLAN at destination switch
VLAN 1 VLAN 2 VLAN 3
VLAN 1 VLAN 2 Trunk Link VLAN 3
IEEE 802.1Q defines a new frame format to support the existence of VLANs.
The VLAN Tag also supports QoS using the User Priority field.
The VLAN Tag is used to receive and forward frames to the appropriate ports.
802.1Q frame: Destination Address | Source Address | 802.1Q VLAN Tag (4 bytes) | Type/Len | Data | Frame Check
802.1Q VLAN Tag: Tag Protocol ID 0x8100 (2 bytes) | User Priority (3 bits) | CFI (1 bit) | VLAN ID (12 bits)
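A short sketch of how the 4-byte tag is packed and unpacked. The function names are hypothetical; the bit layout (3-bit priority, 1-bit CFI, 12-bit VLAN ID after the 0x8100 Tag Protocol ID) follows IEEE 802.1Q.

```python
import struct

TPID = 0x8100  # Tag Protocol ID for 802.1Q

def build_tag(vlan_id, priority=0, cfi=0):
    """Pack a 4-byte 802.1Q tag."""
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    if not 0 <= priority <= 7:
        raise ValueError("User Priority must fit in 3 bits")
    if cfi not in (0, 1):
        raise ValueError("CFI is a single bit")
    tci = (priority << 13) | (cfi << 12) | vlan_id
    return struct.pack("!HH", TPID, tci)

def parse_tag(tag):
    """Unpack a 4-byte 802.1Q tag into its fields."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID:
        raise ValueError("not an 802.1Q tag")
    return {"priority": tci >> 13, "cfi": (tci >> 12) & 1, "vlan_id": tci & 0xFFF}

tag = build_tag(10, priority=5)
print(tag.hex(), parse_tag(tag))
```

The 12-bit VLAN ID is what limits a single tagging level to about 4K VLANs.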
Trunk AB carries tagged frames. Capture frames and check the contents for VLAN Tag
A B
VLAN 2
VLAN 3
VLAN 2
VLAN 3
Hierarchical VLANs
Using multilevel tagging one could build hierarchical networks. These solutions are specifically meant for carrier level networks that are transporting user level frames There are two such solutions:
This is an IEEE amendment to IEEE 802.1Q (single-level VLAN tagging).
It permits multi-level tagging so that a service provider's Metro Ethernet can carry users' tagged frames.
The Priority field in the VLAN Tag can be used to create different service categories.
Issues: Limited to 4K access tags. Every core switch needs to learn and maintain forwarding entries for every customer MAC address.
Destination Address | Source Address | VLAN Tag1 | VLAN Tag2 | Type/Len | Data | Frame Check
Trunk (link) aggregation provides a graceful and inexpensive scaling solution.
It also provides fault tolerance, increased availability, and load balancing.
Using aggregation control protocols, a logical trunk with multiple physical trunks is created and managed dynamically.
LACP and PAgP are two link aggregation protocols.
Trunk Lines
1. Link aggregation among more than two systems is not supported.
2. Link aggregation is supported only on links using the IEEE 802.3 MAC.
3. Link aggregation is supported only on point-to-point links with MACs operating in full duplex mode.
4. All links in a Link Aggregation Group operate at the same data rate (e.g., 10 Mb/s, 100 Mb/s, or 1000 Mb/s).
Link Aggregation presents a single MAC interface to the MAC client.
The Link Aggregation layer's main function is to distribute and collect frames. This is achieved using the Aggregator.
Link Aggregator
Aggregator
An Aggregator binds to one or more ports within a switch (system).
The Aggregator takes care of frame distribution and collection on behalf of the MAC client.
A switch may contain multiple Aggregators, serving multiple MAC Clients.
A given port will bind to (at most) a single Aggregator at any time.
A MAC Client is served by a single Aggregator at a time.
LACP Modes
Mode: on, off, passive, active
LACP Parameters
System Priority: The system priority is used with the switch MAC address to form the system ID.
Port Priority: The port priority is used with the port number to form the port identifier.
Administrative Key: The ability of a port to aggregate with other ports is defined by the administrative key.
Ports fastEthernet 0/1 - 3 are to be configured in the EtherChannel:
Switch-A(config)# interface range fastEthernet 0/1 - 3
Switch-A(config-if-range)# channel-protocol lacp
  (configures LACP as the protocol to be used in this channel)
Switch-A(config-if-range)# channel-group 1 mode active
  (configures mode active to be used in LACP)
External router with a port for each VLAN
External router that allows trunking
Multi-layer switch
PC0, PC1, and Port 5 of the switch are in VLAN2.
PC2, PC3, and Port 6 of the switch are in VLAN3.
Router port 1 is the gateway to VLAN2.
Router port 2 is the gateway to VLAN3.
The trunk that connects the switch to the router can carry both VLAN 2 and VLAN 3 traffic.
The router interface supports two sub-interfaces; each sub-interface is a gateway to one of the VLANs.
VLAN 2
VLAN 3
ROUTER CONFIGURATION
(config)# interface fa 0/0.1
(config-if)# encapsulation dot1q 2
(config-if)# ip address 10.10.10.1 255.255.255.0
(config-if)# exit
(config)# interface fa 0/0.2
(config-if)# encapsulation dot1q 3
(config-if)# ip address 11.0.0.1 255.255.255.0
(config-if)# exit
SVIs are virtual VLAN interfaces on multilayer switches that enable routing between VLANs.
Except for VLAN 1 (the default VLAN), SVIs for other VLANs are to be created and configured manually, as follows:
  Enable IP routing
  Create the VLANs
  Create the SVI
  Assign an IP address to each SVI
  Enable the interface
  Optional: Enable an IP routing protocol
VLAN 2
VLAN 3
Configuring Inter-VLAN Routing with SVIs
Switch(config)# ip routing
Switch(config)# vlan 10
Switch(config)# interface vlan 10
Switch(config-if)# ip address 10.10.1.1 255.0.0.0
Switch(config-if)# no shutdown
Switch(config)# router rip
Advantages of VLAN
Performance
Virtual Workgroups
Ease of Administration
Reduced Cost
Security
Introduction
RADIUS is a protocol for remote user AAA.
It's primarily used by ISPs.
RADIUS is specified in RFC 2138 (RFC 2866 for accounting).
Stateless protocol
Includes limited built-in security
Built on the client-server model; client and server share a secret key
No room for extension
Attributes are coded using TLV
Operation
Internet Gateway & Management Console
Internet PSTN
1 DSLAM
NAS Server
RADIUS Server
ISP Intranet
Message Flow
Logging session Access-request Access-response
Authentication Request
Authentication is for accessing network resources via Network Access Server (NAS). The NAS issues a RADIUS access request (AccessRequest) message to the RADIUS Server requesting authorization to grant access. The request includes identification typically in the form of username and password provided by the user.
Authentication Response
Access Accept - The user is granted access. Access Reject - Reasons may include failure to provide proof of identification or inactive user account. Access challenge - Requests additional information from the user such as a secondary password, PIN, token card challenge response.
Accounting Data
The user's session start time
The user's session end time
Total packets transferred during the session
Volume of data transferred during the session
Reason for session ending
RADIUS PDU fields: Code | Id | Length | Authenticator | Attributes ...
Code is a single octet that identifies the type of RADIUS packet. If the server receives an invalid code, it discards the packet.
RADIUS requests and replies are coded in a common format.
The server listens on UDP port 1812 for Access-Requests and on port 1813 for Accounting-Requests.
Identifier is a single octet used to match requests and replies.
Length is two octets and gives the length of the packet.
RADIUS Codes
Valid RADIUS Codes
1    Access-Request
2    Access-Accept
3    Access-Reject
4    Accounting-Request
5    Accounting-Response
11   Access-Challenge
255  Reserved
The NAS at 192.168.1.16 sends an Access-Request UDP packet to the RADIUS Server for a user named nemo logging in on port 3 with password "arctangent".
01 00 00 38 0f 40 3f 94 73 97 80 57 bd 83 d5 cb
98 f4 22 7a 01 06 6e 65 6d 6f 02 12 0d be 70 8d
93 d4 13 ce 31 96 e4 3f 78 2a 0a ee 04 06 c0 a8
01 10 05 06 00 00 00 03
 1  Code = Access-Request (1)
 1  Identifier = 0
 2  Length = 56
16  Request Authenticator
Attribute List:
 6  User-Name = "nemo"
18  User-Password
 6  NAS-IP-Address = 192.168.1.16
 6  NAS-Port = 3
Note: The leading number on each line is the length in octets; 6 is the attribute length.
The RADIUS server authenticates nemo and sends an Access-Accept UDP packet to the NAS, telling it to telnet nemo to host 192.168.1.3.
02 00 00 26 86 fe 22 0e 76 24 ba 2a 10 05 f6 bf
9b 55 e0 b2 06 06 00 00 00 01 0f 06 00 00 00 00
0e 06 c0 a8 01 03
 1  Code = Access-Accept (2)
 1  Identifier = 0 (same as in Access-Request)
 2  Length = 38
16  Response Authenticator
Attribute List:
 6  Service-Type (6) = Login (1)
 6  Login-Service (15) = Telnet (0)
 6  Login-IP-Host (14) = 192.168.1.3
Note: The leading number on each line is the length in octets; 6 is the attribute length.
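The Access-Request trace can be decoded with a few lines of Python. This is a sketch: the function name `parse_radius` is hypothetical, and the hex string is the Access-Request packet quoted above. Each attribute is a TLV whose length octet covers the type and length octets as well as the value.

```python
import ipaddress
import struct

REQUEST_HEX = (
    "01 00 00 38 0f 40 3f 94 73 97 80 57 bd 83 d5 cb"
    " 98 f4 22 7a 01 06 6e 65 6d 6f 02 12 0d be 70 8d"
    " 93 d4 13 ce 31 96 e4 3f 78 2a 0a ee 04 06 c0 a8"
    " 01 10 05 06 00 00 00 03"
)

def parse_radius(packet):
    """Return (code, identifier, authenticator, [(type, value), ...])."""
    code, identifier, length = struct.unpack("!BBH", packet[:4])
    if length != len(packet):
        raise ValueError("Length field does not match packet size")
    authenticator = packet[4:20]
    attributes = []
    pos = 20
    while pos < length:
        attr_type, attr_len = packet[pos], packet[pos + 1]
        if attr_len < 2 or pos + attr_len > length:
            raise ValueError("malformed attribute")
        attributes.append((attr_type, packet[pos + 2:pos + attr_len]))
        pos += attr_len
    return code, identifier, authenticator, attributes

code, identifier, authenticator, attributes = parse_radius(bytes.fromhex(REQUEST_HEX))
user_name = attributes[0][1].decode("ascii")
nas_ip = ipaddress.IPv4Address(attributes[2][1])
nas_port = int.from_bytes(attributes[3][1], "big")
print(code, identifier, user_name, nas_ip, nas_port)
```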
An attribute-value (A-V) pair represents a variable. The A-V pairs usually appear as AttributeName=Value in the configuration files and AttributeName :Type:Value in the log files.
Some Attributes
Code  Attribute
1     User-Name
2     User-Password
3     CHAP-Password
4     NAS-IP-Address
5     NAS-Port
6     Service-Type
7     Framed-Protocol
Request Authenticator
All attributes are clear text with the exception of User-Password.
The Request Authenticator (ReqA) is used to encrypt the password as follows:
Client and server share a secret S.
p1, p2, ..., pn are 16-octet blocks of the password.
c1 = p1 XOR MD5(S + ReqA)
c2 = p2 XOR MD5(S + c1)
...
cn = pn XOR MD5(S + c(n-1))
The password is coded as c1, c2, c3, ..., cn in the PDU.
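The chaining scheme can be sketched with `hashlib`. The function names are hypothetical and the shared secret below is a made-up example value, not the one used to produce the trace in these slides. Padding the password with zero octets to a multiple of 16 is an assumption taken from the RADIUS specification.

```python
import hashlib

def hide_password(password, secret, request_authenticator):
    """Hide a password: c1 = p1 XOR MD5(S + ReqA), ci = pi XOR MD5(S + c(i-1))."""
    padded = password + b"\0" * (-len(password) % 16)
    hidden = b""
    previous = request_authenticator
    for i in range(0, len(padded), 16):
        mask = hashlib.md5(secret + previous).digest()
        block = bytes(p ^ m for p, m in zip(padded[i:i + 16], mask))
        hidden += block
        previous = block          # the next mask depends on this cipher block
    return hidden

def reveal_password(hidden, secret, request_authenticator):
    """Inverse operation, as performed by the RADIUS server."""
    clear = b""
    previous = request_authenticator
    for i in range(0, len(hidden), 16):
        mask = hashlib.md5(secret + previous).digest()
        block = hidden[i:i + 16]
        clear += bytes(c ^ m for c, m in zip(block, mask))
        previous = block
    return clear.rstrip(b"\0")

req_auth = bytes.fromhex("0f403f9473978057bd83d5cb98f4227a")  # from the trace
secret = b"example-shared-secret"                              # hypothetical
hidden = hide_password(b"arctangent", secret, req_auth)
print(hidden.hex())
```

Because XOR is its own inverse, the server recovers the password by recomputing the same MD5 masks from the shared secret.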
Response Authenticator
Limitations
Vulnerabilities: http://www.untruth.org/~josh/security/radius/radius-auth.html
No room to extend
Limited security
Unreliable transport (UDP)
Diameter
RFC 3588
Introduction
Diameter vs RADIUS
1. Diameter uses TCP or SCTP for transport
2. Offers both network layer (IPSec) and transport layer (TLS) security
3. Diameter is not fully compatible with RADIUS
4. Larger address space for AVPs and identifiers (32 bits instead of 8 bits)
5. Uses the peer-to-peer paradigm to some extent
6. Supports both stateful and stateless models
7. Dynamic discovery of peers (using DNS SRV and NAPTR)
8. Capability negotiation
9. Supports application layer acknowledgements, defines failover methods and state machines
10. Error notification
11. Better roaming support
12. New commands and attributes can be defined
13. Aligned on 32-bit boundaries
The Diameter node that receives the user connection request will act as the Diameter client. In most cases, a Diameter client will be a NAS. Diameter client will collect user information. Then send an access request to Diameter server node Diameter server can also act as client in some instances
Diameter Message
Diameter messages are synchronous.
Request and response share the same command code.
Values are coded using AVPs (TLV).
AVPs carry the details of AAA as well as routing, security, and capability information between two Diameter nodes.
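Command code  Messages (request / answer, as named in RFC 3588)
274           Abort-Session-Request / Abort-Session-Answer
271           Accounting-Request / Accounting-Answer
257           Capabilities-Exchange-Request / Capabilities-Exchange-Answer
280           Device-Watchdog-Request / Device-Watchdog-Answer
282           Disconnect-Peer-Request / Disconnect-Peer-Answer
258           Re-Auth-Request / Re-Auth-Answer
275           Session-Termination-Request / Session-Termination-Answer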
Message Format
R = The message is a request (1) or an answer (0).
P = The message is proxiable (1) and may be proxied, relayed, or redirected, or it must be processed locally (0).
E = The message is an error message (1) or a regular message (0).
T = The message is potentially being re-transmitted (1) or is being sent for the first time (0).
The last 4 bits are reserved.
Header Fields
Application-ID: Identifies the application: authentication, accounting, or a vendor-specific application.
Hop-by-Hop ID: A unique ID used to match requests and answers.
End-to-End ID: A time-limited unique ID used to detect duplicate messages.
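The flag bits and header fields can be sketched as follows. The function names are hypothetical, and the 20-byte layout (1-byte version, 3-byte length, 1-byte flags, 3-byte command code, then three 32-bit IDs) is taken from RFC 3588 because the header figure is not reproduced here.

```python
import struct

FLAG_R, FLAG_P, FLAG_E, FLAG_T = 0x80, 0x40, 0x20, 0x10

def build_header(code, app_id, hop_by_hop, end_to_end, flags, length=20, version=1):
    """Pack the 20-byte Diameter header."""
    return (bytes([version]) + length.to_bytes(3, "big")
            + bytes([flags]) + code.to_bytes(3, "big")
            + struct.pack("!III", app_id, hop_by_hop, end_to_end))

def parse_header(data):
    """Unpack the 20-byte Diameter header into a dict."""
    app_id, hop_by_hop, end_to_end = struct.unpack("!III", data[8:20])
    flags = data[4]
    return {
        "version": data[0],
        "length": int.from_bytes(data[1:4], "big"),
        "request": bool(flags & FLAG_R),
        "proxiable": bool(flags & FLAG_P),
        "error": bool(flags & FLAG_E),
        "retransmitted": bool(flags & FLAG_T),
        "code": int.from_bytes(data[5:8], "big"),
        "application_id": app_id,
        "hop_by_hop": hop_by_hop,
        "end_to_end": end_to_end,
    }

# Header of a request with command code 257 (body AVPs omitted).
header = build_header(257, app_id=0, hop_by_hop=1, end_to_end=2, flags=FLAG_R)
print(parse_header(header))
```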
Message Flow
Applications
A Diameter Application is a protocol based on the Diameter base protocol Each application is defined by an application identifier An application can add new command codes and/or new mandatory AVPs. Adding a new optional AVP does not require a new application.
List of Applications
Diameter Mobile IPv4 Application (MobileIP, RFC 4004)
Diameter Network Access Server Application (NASREQ, RFC 4005)
Diameter Extensible Authentication Protocol Application (RFC 4072)
Diameter Credit-Control Application (DCCA, RFC 4006)
Diameter Session Initiation Protocol Application (RFC 4740)
Various applications in the 3GPP IP Multimedia Subsystem: both the HSS and the SLF communicate using the Diameter protocol
Generic Bootstrapping Architecture: Bootstrapping Server Function
Introduction
D-H (Diffie-Hellman) is a procedure to create a shared secret key over an insecure network.
The module is organized as follows:
Discrete Logarithm
Logarithm and exponentiation are inversely related.
An ordinary logarithm $\log_a(b)$ is a solution of the equation $a^x = b$ over the real numbers.
Example: $\log_{10}(100) = 2$ is the solution to $10^x = 100$. In other words, 2 is the logarithm of 100 to the base 10.
A discrete logarithm is similar to an ordinary logarithm, but is defined for elements of a finite cyclic group.
Discrete Logarithm: $k$ is a discrete logarithm of $a$ to the base $g$ modulo $p$ if $g^k \equiv a \pmod{p}$.
Example: Since $3^6 \equiv 1 \pmod{7}$, 6 is a discrete logarithm of 1 to the base 3 modulo 7.
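Primitive Root: $g$ is a primitive root modulo $p$ if, for every $a$ coprime to $p$, there is an integer $k$ such that $g^k \equiv a \pmod{p}$. $g$ is also referred to as a generator.
Example: 3 is a generator modulo 7 because $3^1, 3^2, \ldots, 3^6 \bmod 7$ are 3, 2, 6, 4, 5, 1, which covers every element of $\{1, \ldots, 6\}$.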
NetProWise
The congruence classes for multiplication modulo $p$ (where $p$ is a prime) form the set $\{1, 2, 3, \ldots, p-1\}$.
Discrete exponentiation: the $k$th power of an element $g$ of this set is defined as $g^k \bmod p$.
Discrete logarithm: the inverse operation of discrete exponentiation.
For example, take the equation $3^k \equiv 13 \pmod{17}$ for $k$. $k = 4$ is a solution, but it is not the only solution. Since $3^{16} \equiv 1 \pmod{17}$, it also follows that if $n$ is an integer then $3^{4+16n} \equiv 13 \cdot 1^n \equiv 13 \pmod{17}$. Hence the equation has infinitely many solutions of the form $4 + 16n$.
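Both definitions are easy to check by brute force for small moduli. The function names below are hypothetical; the search is exhaustive and only practical for tiny values of p, which is exactly why large primes make the problem hard.

```python
def discrete_logs(g, a, p, limit):
    """Return every k in 1..limit with g**k congruent to a modulo p."""
    return [k for k in range(1, limit + 1) if pow(g, k, p) == a % p]

def is_primitive_root(g, p):
    """True if the powers of g reach every element of {1, ..., p-1}."""
    return {pow(g, k, p) for k in range(1, p)} == set(range(1, p))

print(discrete_logs(3, 13, 17, 40))   # solutions of the form 4 + 16n
print(is_primitive_root(3, 7))
print(is_primitive_root(2, 7))
```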
1. Choose public numbers: $p$ (a large prime number) and $g$ (a primitive root, or generator, mod $p$).
2. A generates a random $X_A$ and sends $Y_A$ to B: $Y_A = g^{X_A} \bmod p$. Note: Finding $X_A$ from $Y_A$, $g$, and $p$ is not easy, as there is no unique solution!
3. B generates a random $X_B$ and sends $Y_B$ to A: $Y_B = g^{X_B} \bmod p$.
4. A calculates the secret key: $K = (Y_B)^{X_A} \bmod p$.
5. B calculates the secret key: $K = (Y_A)^{X_B} \bmod p$.
Example
Diffie-Hellman Example
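1. Public numbers used in the calculations below: $p = 11$, $g = 2$.
2. A generates a random $X_A$ and sends B: $Y_A = g^{X_A} \bmod p$. With $X_A = 4$: $Y_A = 2^4 \bmod 11 = 16 \bmod 11 = 5$.
3. B generates a random $X_B$ and sends A: $Y_B = g^{X_B} \bmod p$. With $X_B = 6$: $Y_B = 2^6 \bmod 11 = 64 \bmod 11 = 9$.
4. A calculates the secret key: $K = (Y_B)^{X_A} \bmod p = 9^4 \bmod 11 = 5$.
5. B calculates the secret key: $K = (Y_A)^{X_B} \bmod p = 5^6 \bmod 11 = 5$.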
1. They generate the same key: $K = (Y_B)^{X_A} \bmod p = (Y_A)^{X_B} \bmod p$.
2. An eavesdropper cannot find $K$ from any transmitted value: $p$, $g$, $Y_A$, $Y_B$.
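The exchange can be replayed with Python's three-argument `pow`, which performs modular exponentiation. The function names are hypothetical, and the values p = 11, g = 2, X_A = 4, X_B = 6 are the small numbers from the worked example; real deployments use very large primes.

```python
def public_value(g, x, p):
    """Y = g^X mod p, the value sent over the insecure network."""
    return pow(g, x, p)

def shared_key(y_peer, x_own, p):
    """K = (Y_peer)^X_own mod p, computed privately by each side."""
    return pow(y_peer, x_own, p)

p, g = 11, 2          # public numbers
x_a, x_b = 4, 6       # private random numbers of A and B

y_a = public_value(g, x_a, p)   # A sends this to B
y_b = public_value(g, x_b, p)   # B sends this to A

k_a = shared_key(y_b, x_a, p)   # computed by A
k_b = shared_key(y_a, x_b, p)   # computed by B
print(y_a, y_b, k_a, k_b)
```

Only p, g, Y_A, and Y_B cross the network; the private values X_A and X_B never leave their owners.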
1. Keys Agree
$(Y_B)^{X_A} \bmod p = (g^{X_B} \bmod p)^{X_A} \bmod p = (g^{X_B})^{X_A} \bmod p = g^{X_B X_A} \bmod p$
$(Y_A)^{X_B} \bmod p = (g^{X_A} \bmod p)^{X_B} \bmod p = (g^{X_A})^{X_B} \bmod p = g^{X_A X_B} \bmod p$
An eavesdropper cannot find $K$ from any transmitted value: $p$, $g$, $Y_A$, $Y_B$.
$K = (Y_B)^{X_A} \bmod p = (Y_A)^{X_B} \bmod p$
To find $K$ without $X_A$ or $X_B$, we need to find $x$ and $y$ such that $(Y_B)^x \bmod p = (Y_A)^y \bmod p$.
Finding discrete logarithms is (probably) hard!
While the problem of computing discrete logarithms and the problem of integer factorization are distinct problems they share some properties: both problems are difficult (no efficient algorithms are known for non-quantum computers), for both problems efficient algorithms on quantum computers are known, algorithms from one problem are often adapted to the other, and the difficulty of both problems has been utilized to construct various cryptographic systems.
Remote Shell
SSH Client
SSH Tunnel
SSHD
User
KEX messages primarily enable negotiation of various feature options. In addition, they create a shared secret between the peers.
PuTTY messages with Wireshark
Network
What is chassis?
A chassis is an enclosure for the components of a switch or computing element.
Types of chassis: Rack-Mount, Wall-Mount, Tower, Tabletop, Commercial, Industrial, Rugged, Military, horizontally mounted, vertically mounted
A chassis includes: Card Cage, Backplane/Midplane, Fan Tray, Power Supply, Circuit Boards (Cards), Front Panel, Cable Management Arm, ...
Each chassis is produced to support a certain form factor (standard size).
Backplane hosts bus (wires) and connectors to which circuit boards are connected Circuit boards use the bus interface and switch fabric to communicate with each other Switch fabric itself is one of the circuit boards
Bus
Bus interface
Normal backplane participates in the switching required between arrays of cards hosted on a chassis A Virtual backplane participates in switching traffic from a card on a chassis to a card on a different chassis
Backplane
Backplane connects several printed circuit board cards together to make up a complete switching system or computing system Backplanes are normally used in preference to cables because of their greater reliability. Backplane hosts one or more communication buses (/interface) In computing area, PICMG provides standards for the backplane interface.
Midplane
Purely Parallel
Single master clock
Shared, parallel multi-drop bus
Multiplexes address and data signals
A number of control signals (sideband)
Typically centralized bus access arbitration
Example: PCI
Low latency
High bandwidth?
PCI Bridge
System Bus
Device 1 Device 2
Purely Serial
Clock/data line
Shared, serial multi-drop bus or serial point-to-point
Packetized message exchange
Very few control lines
Distributed bus access procedure
Example: Ethernet
High latency
Low bandwidth
Point-to-point connection between devices
A point-to-point connection can be switched using a switch fabric (either store-and-forward or cross-connect)
Blending several serial lines for a connection
Each line carries data/clock
Packetized messages with blended lines
Limited control lines
Form Factor
Form factor is a collection of specification for size, connection, and power requirements of a circuit board There are standard and de-facto standard specifications
Format     Size in cm
Nano-ITX   12 x 12
Mini-ITX   17 x 17
Micro-ATX  24.4 x 24.4
Switch Fabric
A bus by itself is sufficient when the number of devices sharing it is limited.
We need a different, scalable mechanism when a large amount of non-terminating, identical data needs to flow between devices persistently. This is when we need a switch fabric between devices.
Switch Fabric: The functional block in a switch/router that moves data from an incoming port to an appropriate output port.
Switch fabric and bus technology work together to implement the switching function within a node.
The switch fabric is independent of the bus technology and of the function offered by the node (routing/switching).
Types of switching:
Store-and-Forward 3
Memory
Connect
Data Flow
Standards
Fan Tray
Fan Tray includes set of fans to cool either power supply or card cage Status of the fan tray is indicated on the front panel Status is also integrated with management solution - typically appear as environmental alarm/trap
Power Supply
Power Supply provides the power to all the cards and fans Redundant power supply provide fault tolerance Status at the front panel Status integrated with management solution typically appear as environmental alarm/trap
Card Composition
Networking Boxes
  Control Card and Management Card
  Switching Card
  Data Card
  Peripheral Card
Computing Boxes
  CPU Card
  Switching Card
  I/O Card
  Networking Card
The Control Card, besides running other critical (management) functions, includes the following control plane functions:
  Running tasks to learn packet forwarding/routing information
  Then sharing this information with I/O cards and neighboring nodes
  Initializing and maintaining state machines for various links and connections
  Forwarding the data packets to the right egress port (this and the previous function are now typically implemented on other cards with an NP/ASIC/FPGA/processor)
  Controlling and monitoring the other peripherals (fan, power supply)
The control function can be in a single card or can be distributed over multiple cards (data cards and control cards).
Redundant control cards increase fault tolerance.
A secondary control card could do back-up and/or load balancing.
Two control cards are typically used.
Using some arbitration mechanism, one of them takes control and the other waits as stand-by.
The stand-by periodically checks the status of the active one using heartbeat messages sent over the backplane.
When the active one fails, the stand-by takes over.
The stand-by sends an alarm to the management application when the active control card fails.
Time to recover through secondary should be minimum Configuration of the secondary needs to be in synch
The control function is distributed.
Part of the control function is implemented on an I/O-resident NP, ASIC, or FPGA.
The control function common to multiple I/O cards is implemented in the Control and Management Card.
The Control Card includes a general-purpose processor that runs an RTOS.
This RTOS supports the common control function and the management interface as various tasks.
Flash memory with run-time images and configuration files
EPROM with bootstrap code, and RAM
The control card uses the backplane to communicate with other cards in the chassis.
The control card uses the I2C interface to control and monitor fans and the power supply.
The control card uses a network interface (IP over Ethernet) to communicate with remote management applications.
The control card also supports a local console (serial interface) for local configuration and management.
The LED panel provides status information for local users.
There is a reset button to reset the switch.
Flash Memory
Contains one or more revisions of the run time code Switch can be made to run any one of these revisions Contains one or more versions of the configuration data Switch can be made to start with any one of these configuration data
In addition to the run time code in flash, a switch can be made to boot from a remote file server (booting over the network)
From the console, the switch boot configuration can be updated with the boot source (LAN/Flash/local drive), server's IP address, image path name, userid, etc.
Once the boot configuration is updated, subsequent boots make use of these options
If the boot source is selected as LAN, the switch can boot over the network using BOOTP and TFTP.
Router
Router Architecture
Controller card implements control and management functions Interface cards implement data functions Router backplane is used to exchange frames and control information between cards
Routing Components
Core Components:
Topology & Address exchange includes routing daemons (RIP, OSPF) and manual updates Packet forwarding is replicated on all interfaces
Special Services
Packet Forwarding
IP Packet Validation
Destination IP Address Parsing
Table Lookup
Packet Lifetime Control
Checksum Calculation
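The validation, lifetime control, and checksum steps listed above can be sketched as follows. This is a minimal illustration, assuming a 20-byte IPv4 header without options; the function names `ipv4_checksum` and `forward_header` are hypothetical and the table lookup step is left out.

```python
import struct
from typing import Optional

def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of 16-bit words, as used by the IPv4 header checksum."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:                      # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def forward_header(header: bytes) -> Optional[bytes]:
    """Validate the header, decrement TTL, and recompute the checksum.

    Returns the rewritten header, or None if the packet must be dropped.
    """
    if len(header) != 20 or header[0] != 0x45:   # version 4, IHL 5 (assumed: no options)
        return None
    if ipv4_checksum(header) != 0:               # a valid header sums to zero
        return None
    ttl = header[8]
    if ttl <= 1:                                 # lifetime exhausted
        return None
    hdr = bytearray(header)
    hdr[8] = ttl - 1
    hdr[10:12] = b"\x00\x00"                     # zero the checksum field before recomputing
    hdr[10:12] = struct.pack("!H", ipv4_checksum(bytes(hdr)))
    return bytes(hdr)
```

A real forwarding path would typically update the checksum incrementally instead of recomputing it over the whole header.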
CPU is in data path Control and management tasks compete for CPU cycles Packets cross bus two times!
Distributed Forwarding and forwarding table (Route Cache) Learning from CPU CPU still forwards packets that cannot be handled by Line cards Packet crosses the bus once (in most cases)
Limitations
Traffic dependent throughput CPU is still a bottleneck for exceptional packets Increasing port density could be limited by CPU capacity
Header alone is processed by forwarding engine Rest of the packet stays in a buffer at the ingress line card Once the forwarding line card is decided, both header and payload are moved to the buffer there
Features
The route processor (controller) runs the routing protocols Multiple forwarding engines are shared by line cards Forwarding only IP headers to the forwarding engines eliminates an unnecessary packet payload transfer over the bus. Packet payloads are always directly transferred between the interface modules and they never go to either the forwarding engines or the route processor unless they are specifically destined to them.
Third-generation routers offered a higher-bandwidth switch fabric as a replacement for the shared bus.
Forwarding Engine
The forwarding engine works in stages:
First stage: the IP header is checked and a forwarding entry is identified based on the hashed output of the destination address (this may not provide the exact next-hop address).
Second stage: the hashed output is checked; if it is not the right one, processor intervention is requested. A new TTL and checksum are computed.
Third stage: the updated TTL and checksum are put in the IP header, and the updated IP header is written out along with link-layer information from the forwarding table.
Decentralized Architecture
Each network interface includes the processing power and the buffer space needed for handling all the packets flowing through it.
Forwarding Table
Ixia test tools: IxVoice H.248/MEGACO Test Library, IxVoice T1/E1/Analog Test Library, IxVoice MGCP Test Library, IxVoice SIP Test Library, IxVoice H.323 Test Library, IxLoad: Advanced Voice H.248, Catapult VoIP
System for 40/100G Ethernet testing. Spirent TestCenter for emulating Voice, Video, Data and Mobile devices to validate functionality & benchmark performance of multiplay networks. Spirent TestCenter for Mobile Backhaul. Spirent TestCenter for Enterprise & Data Center Networks. Spirent TestCenter for Enterprise Routers & Switches
DUT
Admin Interface
Admin Windows
Test solution for IP & Ethernet based Mobile Backhaul available from Spirent, IXIA,
Routing Protocols
Routing table in a node is used to route packets from a node to rest of the nodes in network (Internet) Routers are critical nodes in routing packets between networks In large networks creating and updating routing table entries manually is cumbersome and slow Routing protocols are used to build loop/cycle free optimal routes in routing table automatically IP is a routed protocol and RIP, BGP, and OSPF are routing protocols
Terminology:
RIP is one of the oldest routing protocols; it uses UDP
RFC 1058
Two versions: RIPv1 and RIPv2
Metric is hop count
Distance Vector Algorithm (Bellman-Ford)
Routing information about the entire AS is shared
Only RIP messages from neighbors are used
RIP messages are sent periodically
Normal Operation
When the RIP daemon starts, it sends a request through all its interfaces asking for the other routers' complete route table. This is an IP subnet broadcast.
Request Received: If it is a request for the full table, then the entire table is sent in one or more messages. Otherwise, each entry in the request is processed and the metric for that entry (the current metric, or 16 (0x10) if there is no route) is sent.
Response Received: The response is validated and the route table is updated, as explained in the next slide.
RIP Algorithm
Receive a RIP message (a response)
Add one hop for each advertised dest
Repeat for each advertised dest:
    If (dest not in routing table)   # New entry
        Add the advertised info to the table
    Else If (next-hop is the same)   # Could be an update
        Replace with the advertised one
    Else If (advertised hop count < one in the table)   # Better route
        Replace entry in the routing table
Return
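The update rule above can be sketched in Python. This is a minimal illustration, not a full RIP implementation: the table layout (destination mapped to a metric and next hop) and the name `process_rip_response` are assumptions, and the check that skips unreachable new destinations is an addition that the slide does not state.

```python
INFINITY = 16  # RIP's "unreachable" metric

def process_rip_response(table, neighbor, advertised):
    """Apply one RIP response to the routing table.

    table:      dict mapping destination -> (metric, next_hop)
    neighbor:   the router that sent the response
    advertised: dict mapping destination -> metric as advertised by the neighbor
    """
    for dest, metric in advertised.items():
        metric = min(metric + 1, INFINITY)       # add one hop, capped at infinity
        entry = table.get(dest)
        if entry is None:                        # new entry
            if metric < INFINITY:                # assumption: ignore unreachable new routes
                table[dest] = (metric, neighbor)
        elif entry[1] == neighbor:               # same next hop: accept the update
            table[dest] = (metric, neighbor)
        elif metric < entry[0]:                  # better route through another neighbor
            table[dest] = (metric, neighbor)
    return table
```

Accepting any update from the current next hop, even a worse one, is what lets bad news (a growing metric) propagate.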
RIP Example
RIP messages use UDP datagrams on port 520 (Peer-to-Peer)
Size of datagram is limited to 512 bytes (allows 25 routes)
A RIP message could be a request (1) or a reply (2)
Address family identifier: 2 for IPv4
Distance: hop count from the advertising router to the destination net
Response: solicited or unsolicited
If a request carries a single entry with a zero address family identifier (and a metric of 16), the request is for the entire table; otherwise it is for the listed entries
RIP Version 2
Version: 0x02.
Route Tag: Route Tag.
IP Address: Destination IP address. It could be a natural network address, subnet address or host address.
Subnet Mask: Mask of the destination address.
Next Hop: If set to 0.0.0.0, it indicates that the originator of the route is the best next hop; otherwise it indicates a next hop better than the originator of the route.
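As a rough illustration of the RIPv2 fields described above, the sketch below packs and parses a response message with Python's `struct` module. It assumes the standard layout of a 4-byte header (command, version, zero) followed by 20-byte route entries; the helper names `build_response` and `parse_message` are hypothetical.

```python
import socket
import struct

HEADER = struct.Struct("!BBH")        # command, version, must-be-zero
ENTRY = struct.Struct("!HH4s4s4sI")   # AFI, route tag, IP address, subnet mask, next hop, metric

def build_response(routes):
    """routes: iterable of (ip, mask, next_hop, metric) tuples with dotted-quad strings."""
    msg = HEADER.pack(2, 2, 0)        # command 2 = response, version 2
    for ip, mask, next_hop, metric in routes:
        msg += ENTRY.pack(2, 0,       # AFI 2 = IPv4, route tag 0
                          socket.inet_aton(ip),
                          socket.inet_aton(mask),
                          socket.inet_aton(next_hop),
                          metric)
    return msg

def parse_message(data):
    """Return (command, version, list of entry dicts)."""
    command, version, _ = HEADER.unpack_from(data, 0)
    entries = []
    for offset in range(HEADER.size, len(data), ENTRY.size):
        afi, tag, ip, mask, next_hop, metric = ENTRY.unpack_from(data, offset)
        entries.append({
            "afi": afi,
            "route_tag": tag,
            "ip": socket.inet_ntoa(ip),
            "mask": socket.inet_ntoa(mask),
            "next_hop": socket.inet_ntoa(next_hop),
            "metric": metric,
        })
    return command, version, entries
```

With 25 entries the message is 4 + 25 x 20 = 504 bytes, which is how the 512-byte datagram limit translates into 25 routes per message.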
If AFI is 0xFFFF then the entry carries an Authentication Type and 16 octets of authentication data.
Two types of authentication: 2 represents plain text authentication, while 3 represents MD5.
Authentication: Authentication data, including password information when plain text authentication is adopted, or including key ID, MD5 authentication data length and sequence number when MD5 authentication is adopted.
RIP Timers
Periodic timer: Controls advertising of regular update messages (25-35 sec)
Expiration timer: Governs the validity of a route (180 sec)
Every time an update is received (every 30 sec on average) the expiration timer is reset
If no update is received before this timer expires, the metric is set to 16 (0x10)
A route can be advertised with a metric of 16 for 120 sec before it gets purged
This allows neighbors to learn that the route is invalid
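The timer behaviour above can be summarised in a small state function. This is a sketch only; the name `route_state` and the state labels are hypothetical, and the defaults are the 180 sec and 120 sec values quoted on the slide.

```python
def route_state(seconds_since_update, expiration=180, garbage=120):
    """Classify a route by how long ago its last update was received."""
    if seconds_since_update < expiration:
        return "valid"                 # advertised with its real metric
    if seconds_since_update < expiration + garbage:
        return "invalid"               # advertised with metric 16 so neighbors learn of the failure
    return "purged"                    # removed from the routing table
```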
Limitations or problems
Slow Convergence with changes in the network Instability
Home Work
OSPF
SPF: Shortest Path First
SPF is Dijkstra's algorithm for finding the shortest path between two nodes in a graph
SPF is used in Open SPF (OSPF) and MPLS
Tutorial on SPF
OSPF Introduction
Autonomous System (AS) interior routing protocol
Runs directly on the IP layer
A popular intra-AS protocol
An AS is divided into multiple areas for OSPF
Routing information is exchanged and kept as Link State Advertisements (LSA)
An OSPF network is divided into a number of areas
Every OSPF network must contain an Area 0
Area 0 is designed as the transit area
Other regular areas should be attached directly to Area 0 and only to Area 0
[Figure: OSPF areas. Labels: Area 0, Area 1, Area 2, routers R1 to R5, Other Networks]
Backbone Router (BR): Routers within area 0.
Area Border Router (ABR): Connects area 0 to non-backbone areas.
AS Border Router (ASBR): Connects the OSPF AS to networks outside it.
Internal Router (IR): Routers within regular areas with no direct connectivity to area 0.
[Figure: routers R1 to R5 in Area 0 and Area 2. Panels: a. Neighbor Relationship, b. Adjacency Relationship]
OSPF Artifacts
Neighbor and Adjacency: Neighbor and adjacency tables are built using Hello packets.
Link State Advertisements: LSAs are exchanged between routers. They describe all other routers in the network (or in an area of the network) and their attached networks.
LSAs are stored in the topology table or database (LSDB).
The best path to each destination is held in the routing table.
Adjacency
Adjacency is formed using OSPF hello packets
Hello packets are sent to the multicast address AllSPFRouters (224.0.0.5)
Adjacency is established when these match: subnet mask, hello interval, area ID, and authentication
Once the adjacency is established, routers start exchanging Link State information (LSA) with the DR/BDR using AllDRouters (224.0.0.6)
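The matching rule for adjacency can be sketched as a simple field-by-field comparison. This is an illustration only, limited to the four parameters named on the slide; the class `HelloParams` and the function `mismatches` are hypothetical names.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class HelloParams:
    subnet_mask: str
    hello_interval: int   # seconds
    area_id: str
    authentication: str

def mismatches(local: HelloParams, remote: HelloParams):
    """Return the names of the parameters that differ; an empty list means adjacency can proceed."""
    return [f.name for f in fields(HelloParams)
            if getattr(local, f.name) != getattr(remote, f.name)]
```

Returning the mismatching field names, rather than a bare true/false, mirrors how such problems are usually diagnosed when two routers refuse to become adjacent.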
LSAs, also called link-state protocol data units (PDUs), facilitate the exchange of link-state information.
LSAs are reliable; they are acknowledged after receipt.
LSAs are flooded throughout the area (or throughout the domain if there is only one area).
LSAs have a sequence number and a set lifetime, so each router can recognize whether it has the most current version of the LSA.
LSAs are refreshed periodically to confirm topology information before they age out of the LSDB.
Each router maintains an LSA database (LSDB)
A separate LSDB is maintained for each area connected to the router
LSDBs maintained by different IRs in an area are identical
Each LSA carries a sequence number and a timer. The timer is used to age out old LSAs. By default an LSA is refreshed every 30 mins and ages out after 60 mins
Sequence numbers are 32 bits in size. The first legal sequence number is 0x80000001. Larger numbers are more recent
Sequence Number
Sequence numbers identify the relative age of LSA records
Sequence numbers change only under two conditions:
The LSA changes because a route is added or deleted
The LSA is refreshed before it ages out (LSA updates are flooded within the area every half hour, even if nothing changes)
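Because the first legal sequence number is 0x80000001, the 32-bit value is compared as a signed integer. The sketch below shows only that comparison; the full OSPF rule for deciding which LSA instance is newer also looks at the checksum and age, which are left out here. The function names are hypothetical.

```python
def to_signed32(value: int) -> int:
    """Interpret a 32-bit unsigned value as a signed two's-complement integer."""
    return value - (1 << 32) if value & 0x80000000 else value

def is_newer(seq_a: int, seq_b: int) -> bool:
    """True if sequence number seq_a is more recent than seq_b."""
    return to_signed32(seq_a) > to_signed32(seq_b)
```

Read as signed numbers, 0x80000001 is the most negative usable value and 0x7FFFFFFF the largest, so "larger is more recent" holds across the whole range.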
OSPF Operation
Using Hello packets, adjacency is developed between routers.
Designated Router (DR) and Backup DR are elected.
Using the adjacency, routers exchange DBD packets. These packets list the LSAs (IDs) stored in each router's LSDB.
Receiving routers can use this list to ask for the detailed LSAs they are missing, using LSR packets
LSRs are answered with LSU packets
LSUs are acknowledged using LSAck packets
OSPF Operation
At the end of these exchanges each router will have a complete and identical LSDB for its area
After that, LSAs are exchanged periodically (every 30 mins), or immediately in response to network changes
Hello messages are exchanged at every hello interval for keepalive
LSAs carry a link-state age field and are refreshed every 30 minutes. This acts as an aging timer for the LSAs. When the timer expires, the router that originally sent the entry sends the LSA again, with a higher sequence number, in a link-state update (LSU). The LSU can contain one or more LSAs
Type refers to one of the 5 types of OSPF packets (Hello, DBD, LSR, LSU, LSAck):
1. Hello (Type 1): Discovers neighbors and builds adjacencies between them. Hello packets are exchanged every 10 secs.
2. Database Description (DD/DBD) (Type 2): Checks for database synchronization between routers.
3. Link-State Request (LSR) (Type 3): Requests specific link-state records from another router.
4. Link-State Update (LSU) (Type 4): Sends specifically requested link-state records.
5. Link-State Acknowledgment (LSAck) (Type 5): Acknowledges the other packet types.
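The Type field sits in the common OSPF packet header. As a hedged sketch, the code below parses the 24-byte OSPFv2 header (version, type, packet length, router ID, area ID, checksum, authentication type, authentication data) and maps the type code to a name; `parse_ospf_header` is a hypothetical helper and no checksum verification is done.

```python
import socket
import struct

# version, type, packet length, router ID, area ID, checksum, AuType, authentication
OSPF_HEADER = struct.Struct("!BBH4s4sHH8s")

PACKET_TYPES = {
    1: "Hello",
    2: "Database Description",
    3: "Link-State Request",
    4: "Link-State Update",
    5: "Link-State Acknowledgment",
}

def parse_ospf_header(data: bytes) -> dict:
    """Decode the fixed OSPFv2 header at the start of an OSPF packet."""
    version, ptype, length, router_id, area_id, checksum, autype, _auth = \
        OSPF_HEADER.unpack_from(data, 0)
    return {
        "version": version,
        "type": PACKET_TYPES.get(ptype, "unknown"),
        "length": length,
        "router_id": socket.inet_ntoa(router_id),
        "area_id": socket.inet_ntoa(area_id),
        "checksum": checksum,
        "autype": autype,
    }
```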
Hello Data
Hello data enables neighbor discovery Hello data is used for: listing neighbors, to do keepalive, and in electing DR and BDR It ensures bidirectional communication
DD Data
Two routers exchange database description (DD) packets describing their LSDBs for database synchronization Contents in DD packets include the header of each LSA (uniquely representing a LSA). The recipient checks whether the LSA is available using the LSA header.
LSR
After exchanging DD packets, any two routers know which LSAs of the peer routers are missing from the local LSDBs. In this case, they send LSR (link state request) packets, requesting the missing LSAs. The packets contain the digests of the missing LSAs.
LSU
LSU (Link State Update) packets are used to send the requested LSAs to peers, and each packet carries a collection of LSAs. The LSU packet format is shown below.
OSPF responds instantly to network changes.
It sends triggered updates when a network change occurs.
Periodic updates are sent at long intervals, such as every 30 minutes.
OSPF routers in an area contain full information about all the routers and links in the area. This means each router can independently select a loop-free and efficient path, based on cost, to reach every network in the area.
User
A simple view of the Internet is a flat network. In reality, however, a distributed and dynamic hierarchy is overlaid on the flat routing network for business reasons. This hierarchy is made of different types of ISPs: Tier 1, Tier 2, and
An Internet Exchange (IX) is a layer 2 service that facilitates interconnection between ISPs and large-scale, network-savvy content providers. Most IXPs offer public peering, typically using Ethernet
To have complete Internet connectivity you must be able to reach all destinations on the net. Your packets have to get delivered to every destination. This is easy (default routes). Packets from everywhere else have to find you. This is done by having your ISP(s) advertise routes for you.
Tracing ISPs
1. Visual TraceRoute tool, Home User Edition: http://www.visualroute.com/
2. Whois by IP Address: http://tools.whois.net/whoisbyip/
BGP - Introduction
BGP can be used as an inter-Autonomous System (AS) routing protocol as well as an intra-AS routing protocol
BGP enables exchange of network reachability information among peers (BGP Speakers) using TCP
BGP enables the identification of loop-free inter-domain routes or paths between Autonomous Systems
A peer could be an internal peer or an external peer
A route is defined as a unit of information that pairs a destination with the attributes of a path to that destination
Adj-RIBs-In: Stores routing information learned from inbound messages
Adj-RIBs-Out: Stores the route information selected for advertisement to peers
Loc-RIB: Stores the local routing information derived from the routing information of Adj-RIBs-In
Message Formats
The maximum message size is 4096 octets.
The following type codes are defined: 1 - OPEN, 2 - UPDATE, 3 - NOTIFICATION, 4 - KEEPALIVE
An OPEN message is sent to establish the peer connection
A KeepAlive message is sent as an acknowledgement of the OPEN message
The initial data flow is the entire BGP routing table with UPDATE messages
Incremental updates are sent as the routing tables changes again using UPDATE messages
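Every BGP message starts with the same fixed header: a 16-octet marker, a 2-octet length, and a 1-octet type. The sketch below builds and parses that header; it assumes an all-ones marker (the usual case without authentication), and the helper names `build_message` and `parse_header` are hypothetical.

```python
import struct

MARKER = b"\xff" * 16
HEADER_LEN = 19          # 16 (marker) + 2 (length) + 1 (type)
MAX_MESSAGE = 4096
TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}

def build_message(msg_type: int, body: bytes = b"") -> bytes:
    """Prefix a message body with the BGP fixed header."""
    length = HEADER_LEN + len(body)
    if length > MAX_MESSAGE:
        raise ValueError("BGP messages are limited to 4096 octets")
    return MARKER + struct.pack("!HB", length, msg_type) + body

def parse_header(data: bytes):
    """Return (length, type name) from the first 19 octets of a message."""
    marker, length, msg_type = struct.unpack("!16sHB", data[:HEADER_LEN])
    if marker != MARKER:
        raise ValueError("unexpected marker")
    return length, TYPES.get(msg_type, "unknown")
```

A KEEPALIVE is just the header, so `build_message(4)` is 19 octets long; a NOTIFICATION with a one-octet error code and one-octet subcode is 21 octets plus any data.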
Open Message
Open is the first message, after TCP connection is established Response/Confirmation is a KeepAlive message Version 4
KeepAlive Message
HoldTime is in seconds
KeepAlive interval is typically one third of the HoldTime (1 sec if HoldTime is 3 sec)
If the negotiated HoldTime is zero then there is no KeepAlive exchange.
Notification Message
The BGP connection is closed immediately after sending it. Message contains Error Code, Error Subcode, and additional data describing the error code Message Length = 21 + Data Length
[Figure: NOTIFICATION message layout: Error Code, Error Subcode, Data]
Update Message
UPDATE messages are used to transfer routing information between BGP peers.
Loops and cycles are eliminated in the resulting BGP topology based on UPDATE message content.
Minimum size is 19 (Header) + 4 (2 Length fields) = 23 bytes
An UPDATE message can advertise at most one route
Unfeasible Routes Length (2) Withdrawn Routes (variable) Total Path Attribute Length (2) Path Attributes (variable) Network Layer Reachability Information (variable)
Backup slides
1. BGP AS Number Tool: http://networktools.nl/as-info
2. Regional Internet Registry APNIC: http://www.apnic.net/
3. India's Internet Exchange: http://nixi.in/
4. NIXI Membership Agreement:
5. BGP Tools: http://www.bgp4.as/tools
6. CERN Internet Exchange: http://cixp.web.cern.ch/cixp/
7. Peering: http://peering.drpeering.net/
Study Exercises
Capture and study BGP KeepAlive (KA) message Capture and study BGP Update message for Withdrawn (WR) route List the neighbors of AS2 and AS1 List the BGP summary of AS2
Content
VPN Overview
Virtual Private Network provides a secure remote access to users on Internet. These users could be:
Employees in a branch office An individual employee with home office An employee who is away on a business travel
All these users are connecting to their Corporate Intranet using VPN over the Internet. Users could be accessing Corporate servers or accessing other users (desktops).
VPN System
[Figure: VPN system. Labels: Remote Office, Internet, VPN Tunnels, SOHO]
VPN Components
VPN gateway: Anchors the end points of a VPN.
VPN client: Enables remote clients to participate in a VPN in the absence of VPN gateways.
VPN tunnels (tunneling): Sending encrypted network-layer datagrams within another network-layer datagram. VPN gateways do the encapsulation and de-encapsulation.
[Figure: tunnel across nodes S1 to S6. Legend: Plain Text IP Packet; Encrypted IP Packet; Encrypted IP Packet within Plain header IP Packet]
VPNs are set up over IPSec, MPLS, T1, and DSL.
VPN gateways are used to anchor the VPN endpoints.
VPN gateway functionality is often rolled into the firewall.
[Figure: VPN deployment. Labels: Firewall, Corporate office Intranet, Remote Office, Internet, VPN Gateway, VPN Tunnel, Road Warrior with VPN Client S/W, SOHO User]
Switching Network
Two paths (incoming and outgoing) are setup either dynamically or statically between a pair of communicating end points. A path is a non-empty sequence of nodes Each and every packet from one end point to the other one takes the same path Use of a path is likely to provide predictable performance (delay and throughput) Forwarding of packets is done at each node in the path using locally scoped labels
Why MPLS?
Unlike switched networks, IP based networks failed to provide Quality of Service (QoS) There are other known issues with IP networks: Routing table size Destination based routing lookup at every hop Layer 2 topology may be different from Layer 3 topology leading to suboptimal paths There is no provision to do Traffic Engineering (TE)
MPLS
MPLS is a forwarding mechanism, often described as Layer 2.5, in which packets are forwarded based on labels. MPLS over an IP network is the case we are interested in
Labels usually correspond to IP destination networks (equal to traditional IP forwarding). Labels can also correspond to other parameters, such as QoS or source address. MPLS was designed to support forwarding of other protocols as well
MPLS Architecture
Control plane: Exchanges Layer 3 routing information and labels; contains complex mechanisms to exchange routing information, such as OSPF/BGP, and to exchange labels; such as LDP, and RSVP Data plane: Forwards packets based on labels; has simple forwarding engine
[Figure: MPLS label placement. Labels: LDP, IP Header (Layer 3), Payload]
MPLS uses a 32-bit label field that contains the following information: 20-bit label 3-bit experimental field (EXP) used in QoS 1-bit bottom-of-stack indicator 8-bit TTL field
The protocol identifier in a Layer 2 header specifies that the payload starts with a label (or labels) and is followed by an IP header.
The bottom-of-stack bit indicates whether the next header is another label or a Layer 3 header.
The receiving router uses the top label only.
Ethernet type value for MPLS: 0x8847
PPP protocol value for MPLS: 0x0281
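The 32-bit label stack entry described above (20-bit label, 3-bit EXP, 1-bit bottom-of-stack, 8-bit TTL) can be packed and unpacked with a few shifts. This is a minimal sketch; the function names are hypothetical.

```python
import struct

def pack_label(label: int, exp: int, bottom: bool, ttl: int) -> bytes:
    """Encode one label stack entry as 4 bytes in network byte order."""
    if not (0 <= label < 1 << 20 and 0 <= exp < 8 and 0 <= ttl < 256):
        raise ValueError("field out of range")
    word = (label << 12) | (exp << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)

def unpack_label(data: bytes):
    """Decode the first 4 bytes into (label, exp, bottom_of_stack, ttl)."""
    (word,) = struct.unpack("!I", data[:4])
    return word >> 12, (word >> 9) & 0x7, bool((word >> 8) & 0x1), word & 0xFF

def parse_stack(data: bytes):
    """Read entries until the bottom-of-stack bit is set; return (entries, remaining payload)."""
    entries = []
    offset = 0
    while True:
        entry = unpack_label(data[offset:offset + 4])
        entries.append(entry)
        offset += 4
        if entry[2]:                     # bottom of stack: Layer 3 header follows
            return entries, data[offset:]
```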
MPLS Forwarding
An LSR can perform the following functions: Insert (impose/push) a label or a stack of labels on ingress Swap a label with a next-hop label or a stack of labels in the core Remove (pop) a label on egress Label Switched Path (LSP) means the path along which a FEC travels through an MPLS network.
As packets enter an MPLS network they are classified into groups, each called a Forwarding Equivalence Class (FEC)
Labels are assigned at the ingress point based on this classification.
Classification is done using the destination IP address, port numbers, and various other information from the packet header
An FEC may correspond to multiple labels; however, a label can only represent a single FEC.
Example of an FEC: All packets that are to be directed to the same egress router
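The ingress classification and the push, swap, and pop operations can be sketched together. This is an illustration under simplifying assumptions: the FEC is chosen by destination prefix only, the table contents and label values are made up, and all names are hypothetical.

```python
import ipaddress

# Hypothetical FEC table at the ingress LSR: destination prefix -> label to push.
FEC_TABLE = {
    ipaddress.ip_network("10.1.0.0/16"): 100,
    ipaddress.ip_network("10.1.2.0/24"): 200,
}

def classify(dst: str):
    """Pick the label for the longest matching prefix, or None if no FEC matches."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in FEC_TABLE if addr in net]
    if not matches:
        return None
    return FEC_TABLE[max(matches, key=lambda net: net.prefixlen)]

def push(stack, label):
    """Ingress: impose a label on top of the stack."""
    return [label] + stack

def swap(stack, lfib):
    """Core: replace the top label using the local label table (incoming -> outgoing)."""
    return [lfib[stack[0]]] + stack[1:]

def pop(stack):
    """Egress: remove the top label."""
    return stack[1:]
```

For example, a packet to 10.1.2.9 would be classified to label 200 at ingress, have that label swapped at each core LSR according to its own table, and leave the MPLS network with an empty stack after the pop.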
MPLS
Label Distribution Protocol
Introduction
When packet with a certain label arrives, it is processed and sent out with a different label. These labels could be configured manually for small networks However, we need a protocol using which labels could be agreed upon dynamically MPLS Label Distribution Protocol is one such mechanism.
Principle
In IP Network the labels are primarily generated based on route table entries and other quality criteria LDP uses the route information generated by routing protocols like OSPF, BGP,
IPSec is integrated at the IP layer (layer 3) of the TCP/IP stack, so it provides security for almost all protocols in the TCP/IP suite.
Introduction
It is a framework that enables one to develop secure message exchange between applications
[Table: IP protocol numbers 1, 2, 4, 6, 17, 41, 47, 50]
Computing ICV (Integrity Check Value)
(MD5 or SHA-1)
IPSec (AH) calculates the ICV (Integrity Check Value) by leaving out the mutable fields of the IP header: TOS, TTL, Fragment Offset, Flags, and Header Checksum. If any of the other fields of the IP header change in flight, ICV validation will fail at the receiving end. This is the expected behavior for all packets. However, NAT changes the source IP address in flight without adjusting the ICV. This means that ICV validation will fail at the receiving end and the packet will be dropped. Thus, there is an issue with NAT and IPSec interworking.
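The NAT problem can be demonstrated with a keyed hash over the fields that must not change. The sketch below is deliberately simplified: it covers only the source address, destination address, and payload, uses HMAC-SHA-1 truncated to 96 bits, and the key, addresses, and function name are illustrative assumptions.

```python
import hashlib
import hmac
import socket

def integrity_check_value(key: bytes, src_ip: str, dst_ip: str, payload: bytes) -> bytes:
    """Keyed hash over the immutable fields (simplified to addresses and payload)."""
    covered = socket.inet_aton(src_ip) + socket.inet_aton(dst_ip) + payload
    return hmac.new(key, covered, hashlib.sha1).digest()[:12]   # truncated to 96 bits

key = b"shared-secret"          # assumed pre-agreed key
payload = b"application data"

# The sender computes the value over its private source address.
sent = integrity_check_value(key, "192.168.1.10", "203.0.113.5", payload)

# A NAT device rewrites the source address; the receiver recomputes over what it sees.
recomputed = integrity_check_value(key, "198.51.100.7", "203.0.113.5", payload)

nat_breaks_validation = not hmac.compare_digest(sent, recomputed)
```

Without the address rewrite the two values are identical, so the receiver accepts the packet; with it, the comparison fails even though the payload is untouched.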
IPSec System
Key can be exchanged manually in the worst case. This does not scale well and it is not safe either The Diffie-Hellman key exchange algorithm allows two parties to agree on a shared key value without requiring encryption.
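Alice holds a secret $a$ and the public values $g$, $p$; Bob holds a secret $b$.
Alice -> Bob: $g$, $p$, $A$, where $A = g^a \bmod p$
Bob -> Alice: $B$, where $B = g^b \bmod p$
Alice computes $K = B^a \bmod p$
Bob computes $K = A^b \bmod p$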
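The Diffie-Hellman exchange above can be checked with Python's built-in modular exponentiation. The numbers are toy values chosen for illustration only (real deployments use primes of 2048 bits or more), and the function name is hypothetical.

```python
def diffie_hellman(p: int, g: int, a: int, b: int):
    """Return (A, B, K_alice, K_bob) for secrets a and b and public values g, p."""
    A = pow(g, a, p)          # Alice sends A = g^a mod p
    B = pow(g, b, p)          # Bob sends B = g^b mod p
    k_alice = pow(B, a, p)    # Alice computes K = B^a mod p
    k_bob = pow(A, b, p)      # Bob computes K = A^b mod p
    return A, B, k_alice, k_bob

# Toy parameters: p = 23, g = 5, Alice's secret a = 6, Bob's secret b = 15.
A, B, k_alice, k_bob = diffie_hellman(23, 5, 6, 15)
```

Both sides arrive at the same key because B^a = g^(ab) = A^b (mod p), while an eavesdropper sees only g, p, A, and B.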
Key exchange algorithms by themselves are not sufficient; we also need protocols to offer complete secure services.
ISAKMP (RFC 2408) is a framework to set up secure connections with different authentication and encryption algorithms.
ISAKMP defines the framework for authentication and key exchange by providing procedures for negotiating, establishing, changing, and deleting SAs.
ISAKMP is independent of key exchange protocols and security algorithms.
It can be implemented over any transport protocol. UDP support (port 500) is mandatory
ISAKMP primarily defines procedures and packet formats to create and manage a Security Association (SA) between two ends of a connection
A Security Association identifies a collection of negotiated algorithms and their parameters for a secure connection. An SA contains all the information required for the realization of various network security services.
ISAKMP for IPSec operates in two phases: phase 1 and phase 2
In phase 1 a security association (SA) is established between the two ends. This SA is used in phase 2 to exchange AH and ESP parameters to set up another SA for IPSec
IPSec supports both manual and automatic management of SAs and keys.
Use of SAs for IPSec requires two databases, SPD and SAD:
1. The Security Policy Database (SPD) stores the security requirements or policy requisites for an SA to be established
2. The Security Association Database (SAD) contains the parameters of each active SA
ISAKMP (RFC 2408) provides a framework for authentication and key exchange but does not define them.
IKE (RFC 2409) describes a protocol using part of Oakley and part of SKEME in conjunction with ISAKMP.
Oakley describes a series of key exchanges, called "modes", and details the services provided by each (RFC 2412)
SKEME describes a versatile key exchange technique which provides anonymity, repudiability, and quick key renewal.
The Internet IP Security Domain of Interpretation for ISAKMP is defined in RFC 2407
OAKLEY
The Oakley protocol defines several modes for the key exchange process. These modes correspond to the two negotiation phases defined in the ISAKMP protocol. For phase 1, the Oakley protocol defines two principal modes: main and aggressive. IPSec for Windows does not implement aggressive mode. For phase 2, the Oakley protocol defines a single mode, quick mode.
There is a main header and this is followed by various payload type header. ISAKMP permits vendors and applications to define new payload types. Message sequences are grouped with exchange type There is one exchange type defined for Phase 1 of IPSec and one for Phase 2 ISAKMP permits vendors and applications to define new exchange types.
Attribute Encoding
Attributes are coded in various payloads.
Generic attribute types are defined in appropriate documents
Vendor and application specific attributes can be added
Attributes used by various implementations are specified and registered with IANA (RFC 2401).
ISAKMP provides the following generic format for encoding an attribute:
AF (1 bit) and Attribute Type (15 bits)
AF = 0: followed by Attribute Length (2 octets) and a variable-length Attribute Value
AF = 1: followed by a 2-octet Attribute Value
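The two attribute encodings can be sketched as below, assuming the ISAKMP layout in which the AF bit is the most significant bit of the 2-octet type field. With AF = 1 the next two octets are the value itself; with AF = 0 they are a length, followed by that many octets of value. The function names are hypothetical.

```python
import struct

def encode_attribute(attr_type: int, value) -> bytes:
    """Encode an int as a fixed-size (AF=1) attribute, or bytes as a variable-length (AF=0) one."""
    if not 0 <= attr_type <= 0x7FFF:
        raise ValueError("attribute type must fit in 15 bits")
    if isinstance(value, int):
        if not 0 <= value <= 0xFFFF:
            raise ValueError("fixed-size attribute values must fit in 2 octets")
        return struct.pack("!HH", 0x8000 | attr_type, value)    # AF = 1: type, value
    data = bytes(value)
    return struct.pack("!HH", attr_type, len(data)) + data      # AF = 0: type, length, value

def decode_attribute(data: bytes):
    """Return (attr_type, value, octets_consumed) for the attribute at the start of data."""
    first, second = struct.unpack_from("!HH", data, 0)
    attr_type = first & 0x7FFF
    if first & 0x8000:                      # AF = 1: second field is the value
        return attr_type, second, 4
    return attr_type, data[4:4 + second], 4 + second
```

The consumed-octets count lets a caller walk a list of attributes packed back to back inside a payload.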
E(ncryption Bit) (1 bit) - If set (1), all payloads following the header are encrypted A(uthentication Only Bit) (1 bit) - This bit is intended for use with the Informational Exchange with a Notify payload C(ommit Bit) (1 bit) - This bit is used to signal key exchange synchronization.
SA Payload: A Security Association Payload includes one or more Proposal Payloads, Domain of Interpretation (DOI) and Situation under which the negotiation is taking place. Proposal Payload: A Proposal Payload lists security mechanisms, or transforms, to be negotiated to secure the communications channel. Transform Payload contains information used during Security Association negotiation. A proposal payload contains one or more Transform Payloads. Vendor ID Payload contains a vendor defined constant. The constant is used by vendors to identify and recognize remote instances of their implementations.
Domain of Interpretation (DOI): A DOI defines payload formats, exchange types, and conventions for naming security-relevant information such as security policies or cryptographic algorithms and modes.
IPSec is one of the DOIs
The IPSec DOI spec is in RFC 2407
Situation is a DOI-specific field
Protocol-Id (1 octet) - Specifies the protocol identifier for the current negotiation. Examples might include IPSEC ESP, IPSEC AH, OSPF, TLS, etc. Security Parameter Index (SPI): An identifier for a Security Association, relative to some security protocol. Each security protocol has its own "SPI-space". A (security protocol, SPI) pair may uniquely identify an SA.
Next Payload Proposal # Reserved Protocol-ID Payload Length SPI Size #of Transforms
Transform Payload
Transform # (1 octet) - Identifies the Transform number for the current payload. Transform-Id (1 octet) - Specifies the Transform identifier for the protocol within the current proposal. These transforms are defined by the DOI and are dependent on the protocol being negotiated.
Next Payload Transform # Reserved Transform-ID Payload Length Reserved
The Vendor ID Payload contains a vendor defined constant. The constant is used by vendors to identify and recognize remote instances of their implementations.
Next Payload Reserved Payload Length
Exchange Type
An exchange type defines a sequence of messages with various payload types to achieve a specific objective SA Establishment, SA Modification,
Base, Identity Protection, Authentication Only, Aggressive, Some exchange types are defined in RFC 2408.
Vendors or Applications can define their own exchange types.
Exchange type 2 indicates the Identity Protection message sequence (type 1 is the Base exchange). This is Phase 1 (Main mode) of IPSec setup
IDENTITY PROTECTION EXCHANGE
No 1. Initiator => Responder: HDR, SA. Note: Begin ISAKMP-SA or Proxy negotiation
No 2. Responder => Initiator: HDR, SA. Note: Basic SA agreed upon
No 3. Initiator => Responder: HDR; KE; NONCE
No 4. Responder => Initiator: HDR; KE; NONCE. Note: Key Generated (by Initiator and Responder)
No 5. Initiator => Responder: HDR*; IDii; AUTH. Note: Initiator Identity Verified by Responder
No 6. Responder => Initiator: HDR*; IDir; AUTH. Note: Responder Identity Verified by Initiator; SA established
The Security Association, Proposal, and Transform payloads are used to negotiate and establish SAs An SA establishment message consists of a single SA payload followed by at least one, and possibly many, Proposal payloads and at least one, and possibly many, Transform payloads associated with each Proposal payload. Because these payloads are considered together, the SA payload will point to any following payloads and not to the Proposal payload included with the SA payload.
Requirements of DOI
1. define the naming scheme for DOI-specific protocol identifiers
2. define the interpretation for the Situation field
3. define the set of applicable security policies
4. define the syntax for DOI-specific SA Attributes (Phase II)
5. define the syntax for DOI-specific payload contents
6. define additional Key Exchange types, if needed
7. define additional Notification Message types, if needed
IPSec Situation
ISAKMP (RFC 2408) provides a framework for authentication and key exchange but does not define them. IKE (RFC-2409) describes a protocol using part of Oakley and part of SKEME in conjunction with ISAKMP. The Internet IP Security Domain of Interpretation for ISAKMP (RFC 2407) Oakley describes a series of key exchanges-- called "modes"-- and details the services provided by each (RFC 2412) SKEME describes a versatile key exchange technique which provides anonymity, repudiability, and quick key refreshment.
There are two basic methods used to establish an authenticated key exchange: Main Mode and Aggressive Mode. Each generates authenticated keying material from an ephemeral Diffie-Hellman exchange. The SA payload MUST precede all other payloads in a phase 1 exchange. Except where otherwise noted, there are no requirements for ISAKMP payloads in any message to be in any particular order.
IKE performs main mode to establish protection mechanisms and keys for quick mode IKE main mode negotiation occurs in three parts:
Part one: Negotiation of protection mechanisms Part two: Diffie-Hellman exchange Part three: Authentication
Sender: Payload
Initiator: ISAKMP header, Security Association (contains proposals)
Responder: ISAKMP header, Security Association (contains a selected proposal)
Initiator: ISAKMP header, Key Exchange (contains Diffie-Hellman key), Nonce, additional payloads (depending on authentication method)
Responder: ISAKMP header, Key Exchange (contains Diffie-Hellman key), Nonce, additional payloads (depending on authentication method)
Identification Protection
Main Mode Message, Sender: Payload
1, Initiator: ISAKMP header, Security Association (contains proposals)
2, Responder: ISAKMP header, Security Association (contains a selected proposal)
3, Initiator: ISAKMP header, Key Exchange (contains Diffie-Hellman key), Nonce, Initiator Kerberos Token
4, Responder: ISAKMP header, Key Exchange, Nonce, Responder Kerberos Token
5*, Initiator: ISAKMP header, Identification, Initiator Hash
6*, Responder: ISAKMP header, Identification, Responder Hash
{B,8} {C,4}
1. A list of nodes that are known to be on the shortest path to a destination. This list is called the PATH list. It is important to understand that the only things on the PATH list are paths to a destination that are known to be the shortest path to that destination.
2. A list of next nodes that might or might not be on the shortest path to a destination. This list is called the TENTative or TENT list.
Algorithm
Step 1. Put "self" on the PATH list with a distance of 0 and a next hop of self.
PATH List: {A,0,A}
TENT List: (empty)
Step 2. Described on the next slide.
Step 3. Find the neighbor in the TENT list with the lowest cost, add that neighbor to the PATH list, and repeat Step 2. If the TENT list is empty, stop.
Step 2 of Algorithm
Take the node just placed on the PATH list, and call it the PATH node. Look at the PATH node's list of neighbors. Add each neighbor in this list to the TENT list with a next hop of the PATH node, unless the neighbor is already in the TENT or PATH list with a lower cost.
Call the node just added to the TENT list the TENT node. Set the cost to reach the TENT node equal to the cost to get from the root node to the PATH node plus the cost to get from the PATH node to the TENT node.
If the node just added to the TENT list already exists in the TENT list, but with a higher cost, replace the higher-cost entry with the node currently under consideration.
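The three steps can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not a router implementation: the function name `spf` and the four-node `graph` are hypothetical, with link costs chosen so that the results agree with the entries {B,5,B}, {C,8,B} and {D,12,B} used in these slides. The sketch also assumes that a node reached through another node inherits that node's next hop, so the next hop is always the first hop from the root.

```python
def spf(graph, root):
    """Run the PATH/TENT procedure; return {node: (cost, next_hop)}."""
    path = {root: (0, root)}   # Step 1: self on PATH, cost 0, next hop self
    tent = {}
    path_node = root
    while True:
        # Step 2: examine the neighbors of the node just placed on PATH.
        base_cost, base_hop = path[path_node]
        for neighbor, link_cost in graph[path_node].items():
            if neighbor in path:
                continue                      # already has its shortest path
            cost = base_cost + link_cost
            # Neighbors of the root are their own next hop; others inherit.
            hop = neighbor if path_node == root else base_hop
            if neighbor not in tent or cost < tent[neighbor][0]:
                tent[neighbor] = (cost, hop)  # add, or replace a costlier entry
        # Step 3: move the lowest-cost TENT entry to PATH, or stop.
        if not tent:
            return path
        path_node = min(tent, key=lambda n: tent[n][0])
        path[path_node] = tent.pop(path_node)


# Hypothetical topology: link costs are assumptions made for this sketch.
graph = {
    "A": {"B": 5, "C": 10},
    "B": {"A": 5, "C": 3, "D": 7},
    "C": {"A": 10, "B": 3},
    "D": {"B": 7},
}
routes = spf(graph, "A")
for node, (cost, hop) in routes.items():
    print(node, cost, hop)
```

With this topology, C first enters the TENT list at cost 10 through the direct link and is then replaced by the cost-8 entry through B, which is the replacement rule at the end of Step 2.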
PATH List: {A,0,A}
TENT List: {B,5,B} {C,10,C}

PATH List: {A,0,A} {B,5,B}
TENT List: {C,10,C}

PATH List: {A,0,A} {B,5,B} {C,8,B}
TENT List: {D,12,B}

Node | Cost | Next hop
A | 0 |
B | 5 | B (directly connected)
C | 8 | B
D | 12 | B
Example
[Figure: example graph with numeric link labels]
OSPF Version 3
RFC 2740
Introduction
Most important, OSPFv3 uses the same fundamental mechanisms as OSPFv2. Constants and variables such as timers and metrics are also the same. OSPFv3 is not backward-compatible with OSPFv2. The LSA structure is modified to accommodate IPv6 addresses.
Per-link protocol processing rather than per-interface
Removal of addressing semantics
Neighbors are always identified by Router ID
Addition of a link-local flooding scope
Use of link-local addresses
Support for multiple OSPF instances per link
Removal of OSPF-specific authentication
More flexible handling of unknown LSA types
OSPFv3 Header
The formats of the main message header, the Hello message, and the DD message are different in OSPFv3. Authentication fields are removed and an Instance ID is added.
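The 16-byte OSPFv3 packet header can be decoded with Python's `struct` module. This is a minimal sketch: the field layout follows the packet header defined in RFC 2740, while the function name `parse_ospfv3_header`, the dictionary keys, and the sample bytes are assumptions made for illustration.

```python
import ipaddress
import struct

# Version(1) Type(1) Length(2) RouterID(4) AreaID(4)
# Checksum(2) InstanceID(1) Reserved(1), all in network byte order.
OSPFV3_HEADER = struct.Struct("!BBHIIHBB")


def parse_ospfv3_header(data):
    """Decode the fixed OSPFv3 header at the start of `data`."""
    (version, ptype, length, router_id, area_id,
     checksum, instance_id, _reserved) = OSPFV3_HEADER.unpack_from(data)
    return {
        "version": version,
        "type": ptype,
        "length": length,
        # Router ID and Area ID are still 32-bit values, shown dotted.
        "router_id": str(ipaddress.IPv4Address(router_id)),
        "area_id": str(ipaddress.IPv4Address(area_id)),
        "checksum": checksum,
        "instance_id": instance_id,
    }


# Hypothetical sample: a version 3 Hello (type 1) header from router 1.1.1.1.
sample_header = OSPFV3_HEADER.pack(3, 1, 16, 0x01010101, 0, 0, 0, 0)
print(parse_ospfv3_header(sample_header))
```

There are no authentication type or authentication data fields to decode; the Instance ID byte occupies part of the space they leave behind.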
Hello Message
No Network Mask field
Size of the Options field is increased to 24 bits
Router Dead Interval is decreased to 16 bits
The DD message in OSPFv3 contains a larger Options field
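The field sizes listed above show up directly when the Hello body (the part after the packet header) is unpacked. This is a sketch assuming the RFC 2740 Hello layout; the function name `parse_hello_body` and the sample values are hypothetical.

```python
import ipaddress
import struct

# InterfaceID(4) RtrPri(1) Options(3) HelloInterval(2)
# RouterDeadInterval(2) DR(4) BDR(4); neighbor Router IDs follow.
HELLO_FIXED = struct.Struct("!IB3sHHII")


def parse_hello_body(body):
    """Decode an OSPFv3 Hello body; note there is no network mask."""
    (interface_id, priority, options_raw, hello_interval,
     dead_interval, dr, bdr) = HELLO_FIXED.unpack_from(body)
    neighbors = [
        str(ipaddress.IPv4Address(body[i:i + 4]))
        for i in range(HELLO_FIXED.size, len(body) - 3, 4)
    ]
    return {
        "interface_id": interface_id,
        "priority": priority,
        "options": int.from_bytes(options_raw, "big"),  # 24-bit field
        "hello_interval": hello_interval,
        "dead_interval": dead_interval,                 # 16-bit field
        "dr": str(ipaddress.IPv4Address(dr)),
        "bdr": str(ipaddress.IPv4Address(bdr)),
        "neighbors": neighbors,
    }


# Hypothetical sample values, chosen only to exercise the parser.
sample_hello = HELLO_FIXED.pack(
    5, 1, (0x13).to_bytes(3, "big"), 10, 40, 0x01010101, 0
) + bytes([2, 2, 2, 2])
print(parse_hello_body(sample_hello))
```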
LSA Header
There is no Options field, and the Link State Type field is 16 bits rather than the 8-bit Type field of OSPFv2.
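The 16-bit LS Type can be split into its parts when the 20-byte LSA header is decoded. This is a sketch assuming the RFC 2740 layout, in which the top bit of the LS Type is the U bit, the next two bits give the flooding scope, and the low 13 bits are the function code. The names `parse_lsa_header` and `SCOPES` and the sample values are assumptions made for illustration.

```python
import ipaddress
import struct

# LSAge(2) LSType(2) LinkStateID(4) AdvertisingRouter(4)
# SequenceNumber(4) Checksum(2) Length(2)
LSA_HEADER = struct.Struct("!HHIIIHH")

SCOPES = {0: "link-local", 1: "area", 2: "AS", 3: "reserved"}


def parse_lsa_header(data):
    """Decode an OSPFv3 LSA header and split the LS Type field."""
    (age, ls_type, ls_id, adv_router,
     seq, checksum, length) = LSA_HEADER.unpack_from(data)
    return {
        "age": age,
        "ls_type": ls_type,
        "u_bit": ls_type >> 15,                  # handling of unknown types
        "scope": SCOPES[(ls_type >> 13) & 0x3],  # flooding scope bits
        "function_code": ls_type & 0x1FFF,
        "link_state_id": ls_id,
        "advertising_router": str(ipaddress.IPv4Address(adv_router)),
        "sequence": seq,
        "checksum": checksum,
        "length": length,
    }


# Hypothetical sample: LS Type 0x2001 (area scope, function code 1).
sample_lsa = LSA_HEADER.pack(1, 0x2001, 0, 0x01010101, 0x80000001, 0, 20)
print(parse_lsa_header(sample_lsa))
```

Carrying the flooding scope inside the type is what lets a router flood an LSA correctly even when it does not recognize the function code.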