Escolar Documentos
Profissional Documentos
Cultura Documentos
UC San Diego
Center for Networked Systems
Member Companies
Research
Interests
Center Faculty
Project
Proposals
Diverse Research
Projects
-Multiple faculty
-Multiple students
-Multidisciplinary
-CNS Research Theme
Current Team
Amin Vahdat
Mohammad Al-Fares
Hamid Bazzaz
Harshit Chitalia
Sambit Das
Shaya Fainman
Nathan Farrington
Brian Kantor
Harsha Madhyastha
Pardis Miri
Erik Rubow
Aram Shahinfard
Vikram Subramanya
Malveeka Tewari
Meg Walraed-Sullivan
Acknowledgements
Industry
Ericsson, HP, Cisco, Facebook, Yahoo!, Google, Broadcom
Government
National Science Foundation: building ~300 node data
center network prototype
Academia
Stanford (NetFPGA + OpenFlow)
Motivation
Motivation
Motivation
Motivation
Economies of scale
Price/port constant with number of hosts
Must leverage commodity merchant silicon
Management
Modular design
Avoid actively managing 100s-1000s network elements
Core
EoR
ToR
$25
33% BW
$20
$15
$10
$5
$0
0
5000
10000
Hosts
15000
20000
25000
Pod 0
Pod 1
Pod 2
Pod 3
Pod 0
Pod 1
Pod 2
Pod 3
Pod 0
Pod 1
Pod 2
Pod 3
Pod 0
Pod 1
Pod 2
Pod 3
Pod 0
Pod 1
Pod 2
Pod 3
Management
Thousands of individual elements that must be
programmed individually
Our Work
Routing/Forwarding [ongoing]
PortLand Goals
No forwarding loops
System Comparison
System
Topology
Forwarding
Routing
ARP
Loops
Multicast
TRILL
header
with TTL
ISIS,
extensions
based on
MOSPF
Switch
State
Addressing
General
O(# of
global
hosts)
Flat: MAC in
MAC encap
Switch
broadcast
Switch
maps
MAC
address
to
remote
switch
SEATTLE General
O(# of
global
hosts)
Flat
Switch
broadcast
One-hop Unicast
DHT
loops
possible
PortLand
O(# of
ports)
Hierarchical
TRILL
Multirooted
tree
New
construct:
groups
Broadcast
free routing;
Multi-rooted
spanning
trees
Design Overview
Fabric Manager
Maintains IPPMAC mapping
Facilitates fault tolerance
Layer 2:
Scalability
Small Switch
State
Seamless
VM
Migration
Flat MAC
Addresses
Layer 3:
IP Addrs
Layer 2:
Flat MAC
Addresses
Scalability
Flooding
Layer 3:
IP Addrs
~
LSR
Broadcast
based
Small Switch
State
Host MAC
Address
Out
Port
9a:57:10:ab:6e
Seamless
VM
Migration
No change
62:25:11:bd:2d
3
to IP
manual
configuration
endpoint
ff:30:2a:c9:f3
0
Host IP
Address
Out
Port
10.2.x.x
10.4.x.x
IP endpoint
changes
Cost Consideration:
Flat vs. Hierarchical Addresses
Pair-wise
communication
Out
Port
0:2:x:x:x:x
0:4:x:x:x:x
0:6:x:x:x:x
0:8:x:x:x:x
State Of Art
Address Resolution
Broadcast based
Routing
Broadcast based
Forwarding
POD 0
POD 1
Pod Number
POD 2
POD 3
0 1 0 1 0 1 0 1
Position Number
0 1 0 1 0 1 0 1
PMAC: pod.position.port.vmid
PMAC: pod.position.port.vmid
PMAC: pod.position.port.vmid
Location characteristic
Technique
2) Pod number
3) Position number
Switch Identifier
A0:B1:FD:56:32:01
Pod Number
??
Position
??
Tree Level
??
Switch Identifier
A0:B1:FD:56:32:01
Pod Number
??
Position
??
Tree Level
??
Switch Identifier
A0:B1:FD:56:32:01
Pod Number
??
Position
??
Tree Level
??
0
Switch Identifier
B0:A1:FD:57:32:01
Pod Number
??
Position
??
Tree Level
??
Switch Identifier
B0:A1:FD:57:32:01
Pod Number
??
Position
??
Tree Level
??
Switch Identifier
B0:A1:FD:57:32:01
Pod Number
??
Position
??
Tree Level
??
Propose
1
Switch Identifier
A0:B1:FD:56:32:01
Pod Number
??
Position
??
Tree Level
0
Propose
0
Switch Identifier
D0:B1:AD:56:32:01
Pod Number
??
Position
??
Tree Level
0
Yes
Switch Identifier
A0:B1:FD:56:32:01
Pod Number
??
Position
1??
Tree Level
0
Switch Identifier
D0:B1:AD:56:32:01
Pod Number
??
Position
0
Tree Level
0
Switch Identifier
D0:B1:AD:56:32:01
Pod Number
??
Position
0
Tree Level
0
Pod 0
Switch Identifier
D0:B1:AD:56:32:01
Pod Number
??
00
Position
0
Tree Level
0
Pseudo MAC
10.5.1.2
00:00:01:02:00:01
10.2.4.5
00:02:00:02:00:01
ARP mappings
Soft state
Network map
Administrator
configuration
10.5.1.2
MAC ??
10.5.1.2
00:00:01:02:00:01
Address
HWtype
HWAddress
Flags
10.5.1.2
ether
00:00:01:02:00:01
Mask
Iface
eth1
PortLand Prototype
20 OpenFlow NetFPGA switches
TCAM + SRAM for flow entries
Software MAC rewriting
3 tiered fat-tree
16 end hosts
Implementations underway for
HP, Broadcom, and Fulcrum
switches
PortLand: Evaluation
Measurements
Configuration
Results
Network convergence
time
Keepalive frequency =
10 ms
Fault detection time = 50
ms
65 ms
RTOmin = 200ms
~200 ms
Multicast convergence
time
110ms
RTOmin = 200ms
~200 ms 600 ms
27,000+ hosts,
100 ARPs / sec per host
400 Mbps
CPU requirements of
fabric manager
27,000+ hosts,
100 ARPs / sec per host
70 CPU cores
Related Work
Floodless in SEATTLE
Lookup scheme to enable layer 2 routing based on DHT
DCell
Alternative topology for data center interconnect
Fewer modifications to switch, but end host modifications
required
Topology is not rearrangeably nonblocking
Rbridges/TRILL
Layer 2 routing protocol with MAC in MAC encapsulation
Related Work
PortLand
Modify network fabric to
Work with arbitrary operating systems and virtual machine
monitors
Maintain the boundary between network and end-host
administration
Conclusions
Questions?