
Converged networks with Fibre Channel over Ethernet and Data Center Bridging

Technology brief, 2nd edition

Introduction
Traditional data center topology
Early attempts at converged networks
Network convergence with FCoE
  10 Gigabit Ethernet
  HP Virtual Connect Flex-10
  HP Virtual Connect FlexFabric with FCoE
Emerging standards for network convergence
  FCoE standard
    FCoE protocol encapsulation
    Fibre Channel Forwarder
    ENode
  FCoE and Ethernet
  DCB standards
    Priority-based Flow Control
    Enhanced Transmission Selection
    Quantized Congestion Notification
    Data Center Bridging Exchange
Migrating to converged fabrics
HP strategy
For more information
Call to action

Introduction
Using application-specific networks for data, management, and storage is complex and costly. Network convergence is a more economical solution that simplifies your data center management by partially or completely consolidating all block-based storage and Ethernet-based data communications networks onto a single fabric. Any network topology constructed with one or more switched network nodes is a fabric. Converged networks consolidate two or more network types onto a single fabric. The promise of network convergence is that it will reduce the cost of qualifying, buying, powering, cooling, provisioning, maintaining, and managing network-related equipment. The challenge is determining the best adoption strategy for your business.

This technology brief does the following for you:
- Defines converged networks
- Summarizes previous attempts to create them
- Explains Fibre Channel over Ethernet (FCoE) technology
- Describes how converged network topologies and converged network adapters (CNAs) work together to tie multiple networks into a single, converged infrastructure
- Introduces the networking standards required to support this new breed of converged networks
- Explains how the new standards will affect how you design and deploy your converged network infrastructure over the next several years

Traditional data center topology


Traditional data centers typically have underused capacity, inflexible single-purpose resources, and high management costs. Typical data center infrastructure designs include separate, heterogeneous network devices for different types of data. Each device adds to the complexity, cost, and management overhead. Many data centers support three or more types of networks that serve these purposes:
- Block storage data management
- Remote management
- Business-centric data communications
Supporting multiple network types requires unique switches, network adapters, network management systems, and technology to unify these networks.

Early attempts at converged networks


There have been many attempts to create converged networks over the past decade. Fibre Channel Protocol (FCP) is a lightweight mapping of SCSI onto the Fibre Channel (FC) layer 1 and 2 transport protocol (Figure 1, left). FC carries not only FCP traffic but also IP traffic, creating a converged network. The cost of FC and the acceptance of Ethernet as the de facto standard for LAN communications prevented widespread FC use except for data center SANs in enterprise businesses.

InfiniBand (IB) technology provides a converged network capability by transporting inter-processor communication, LAN, and storage protocols. The two most common storage protocols for IB are SCSI Remote Direct Memory Access Protocol (SRP) and iSCSI Extensions for RDMA (iSER). Both use the RDMA capabilities of IB: SRP builds a direct SCSI-to-RDMA mapping layer and protocol, and iSER copies data directly to the SCSI I/O buffers without intermediate data copies (Figure 1, left of center). These protocols are lightweight but not as streamlined as FC. Widespread deployment was impractical because of the perceived high cost of IB and the complex gateways and routers needed to translate from these IB-centric protocols and networks to the native FC storage devices in data centers. High Performance Computing (HPC) environments that have adopted IB as their standard transport network use the SRP and iSER protocols.

Figure 1. Comparison of multiple protocol stacks for converged networks

(Protocol stacks shown, left to right: Fibre Channel, InfiniBand with SRP/iSER, iSCSI, FCIP/iFCP, and FCoE/DCB)

Internet SCSI (iSCSI) was an attempt to bring a direct SCSI to TCP/IP mapping layer and protocol to the mass Ethernet market, to drive costs lower, and to allow deploying SANs over existing Ethernet LAN infrastructure. iSCSI technology (Figure 1, center) was very appealing to the small and medium business market because of the low-cost software initiators and the ability to use any existing Ethernet LAN. However, iSCSI typically requires new iSCSI storage devices that lack the features in devices using FC interfaces. Also, iSCSI to FC gateways and routers are very complex and expensive. They do not scale cost effectively for the enterprise. Most enterprise businesses have avoided iSCSI or have used it for lower tier storage applications or for departmental use.

FC over IP (FCIP) and Internet FC Protocol (iFCP) map FCP and FC characteristics to LANs, MANs, and WANs. Both of these protocols map FC framing on top of the TCP/IP protocol stack (Figure 1, right of center). FCIP is a SAN extension protocol to bridge FC SANs across large geographical areas. It is not for host/server or target/storage attachment. The iFCP protocol allows Ethernet-based hosts to attach to FC SANs through iFCP-to-FC SAN gateways. These gateways and protocols were never widely adopted except for SAN extension because of their complexity, lack of scalability, and cost.

Network convergence with FCoE


FCoE is the next attempt to converge block storage protocols onto Ethernet. FCoE relies on an Ethernet infrastructure that uses a new set of Data Center Bridging (DCB) standards defined by the IEEE (Figure 1, right). Converged Enhanced Ethernet (CEE) is Ethernet infrastructure that implements DCB. Although the DCB standards can apply to any IEEE 802 network, most use the term to refer to enhanced Ethernet, making DCB and CEE equivalent terms. We use the term DCB to refer to an Ethernet infrastructure that implements at least the minimum set of DCB standards needed to carry FCoE protocols.

A traffic class (TC) is a traffic management element. DCB enhances low-level Ethernet protocols to send different traffic classes to their appropriate destinations. It also supports lossless behavior for selected TCs, for example, those that carry block storage data. FCoE with DCB tries to mimic the lightweight nature of native FC protocols: it does not incorporate TCP or even IP. This means that FCoE is a nonroutable protocol meant for local deployment within a data center. The main advantage of FCoE is that switch vendors can easily implement the logic for converting FCoE/DCB to native FC in high-performance switch silicon. FCoE solutions should cost less as they become widely used.

10 Gigabit Ethernet
One obstacle to using Ethernet for converged networks has been its limited bandwidth. As 10 Gigabit Ethernet (10 GbE) technology becomes more widely used, 10 GbE network components will fulfill the combined data and storage communication needs of many applications. With 10 GbE, converged Ethernet switching fabrics handle multiple TCs for many data center applications. DCB-capable Ethernet gives you maximum flexibility in selecting network management tools. As Ethernet bandwidth increases, fewer physical links can carry more data (Figure 2).

Figure 2. Multiple traffic types sharing the same link

HP Virtual Connect Flex-10


Virtual Connect (VC) Flex-10 technology lets you partition the Ethernet bandwidth of each 10 Gb Ethernet port into up to four FlexNICs. The FlexNICs function and appear to the system as discrete physical NICs, each with its own PCI function and driver instance. The partitioning must be in increments of 100 Mb. While FlexNICs share the same physical port, traffic flow for each is isolated with its own MAC address and VLAN tags between the FlexNIC and the associated VC Flex-10 module. Using the VC Manager CLI or GUI, you can set and control the transmit bandwidth available to each FlexNIC according to server workload needs. With the VC Flex-10 modules now available, each dual-port Flex-10-enabled server or mezzanine card supports up to eight FlexNICs, four on each physical port. Each VC Flex-10 module can support up to 64 FlexNICs.

Flex-10 adds LAN convergence to VC's virtual I/O technology. It aggregates up to four separate traffic streams into a single 10 Gb pipe connecting to VC modules. VC then routes the frames to the appropriate external networks. This lets you consolidate and better manage physical connections, optimize bandwidth, and reduce cost.
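To make the partitioning rules concrete, here is a minimal Python sketch, not HP tooling, that checks a hypothetical FlexNIC bandwidth plan against the constraints described above (up to four FlexNICs per port, allocations in 100 Mb increments, total within the 10 Gb port). The function name and example values are illustrative assumptions:

# Hypothetical check of a FlexNIC bandwidth plan against the rules described above.
def validate_flexnic_plan(allocations_mbps: list[int], port_mbps: int = 10_000) -> None:
    if not 1 <= len(allocations_mbps) <= 4:
        raise ValueError("a physical port supports up to four FlexNICs")
    for bw in allocations_mbps:
        if bw <= 0 or bw % 100 != 0:
            raise ValueError(f"{bw} Mb: bandwidth must be set in 100 Mb increments")
    if sum(allocations_mbps) > port_mbps:
        raise ValueError("combined FlexNIC bandwidth exceeds the 10 Gb port")

# Example: one port carved into four traffic streams totaling 10 Gb.
validate_flexnic_plan([4000, 3000, 500, 2500])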

HP Virtual Connect FlexFabric with FCoE


Now that we have achieved an acceptable level of LAN convergence with Flex-10 technology, the next logical step is to add LAN/SAN convergence technology. Virtual Connect FlexFabric broadens Virtual Connect Flex-10 technology to provide solutions for converging different network protocols. We plan to deliver the FlexFabric vision by converging technology, management tools, and partner product portfolios into a virtualized fabric for the data center.

Emerging standards for network convergence


Converged networks require new standards. The International Committee for Information Technology Standards (INCITS) T11 technical committee creates the standards related to storage and storage networking technologies. The IEEE 802.1 Work Group is responsible for developing two types of standards:
- Standards common to all IEEE 802 defined network types (for example, Ethernet and Token Ring)
- Standards necessary to support communication within and between these network types

FCoE standard
FCoE is an emerging technology under development by the INCITS T11 technical committee. INCITS/ANSI T11.3 FC-BB-5 is the official standard. It includes two protocol definitions: FCoE and the FCoE Initialization Protocol (FIP). The FCoE protocol defines the encapsulation of FC frames into Ethernet frames. FIP defines a fabric discovery protocol, creates an Ethernet version of FC fabric login services, and defines the protocols for handling MAC address assignment and association with World Wide Names (WWNs). FCoE relies on the improved flow control, well-defined traffic shaping, and multiple TC support that the IEEE 802.1 DCB standards provide.

FCoE protocol encapsulation
FCoE is different from previous attempts to move SCSI traffic over Ethernet. The FCoE protocol allows efficient, high-performance conversion between FCoE links and FC links in layer 2 switches. DCB enhancements offer lossless operation for selected TCs. This lets us place the FC protocol directly on top of layer 2 (link layer) Ethernet, so we don't have to rely on more complex transport protocols such as TCP to ensure lossless behavior. Implementing FCoE in this way lets us develop devices such as adapters and switches that use most of the existing FC logic on top of the new DCB/Ethernet physical interfaces.

The FCoE protocol encapsulation standard requires IEEE 802.1Q tags. Each FCoE frame contains explicit TC/priority tags for efficient processing in layer 2 DCB-capable Ethernet switches. Because FCoE is a layer 2 protocol and does not use the layer 3 IP protocol, data centers deploy FCoE for intra-data center use with a span similar to a switched LAN subnet or SAN fabric. FCoE encapsulates FC frames, including the FC frame delimiters, header, payload, and frame check sequence, within Ethernet frames using the format illustrated in Figure 3.

Figure 3. Illustration of an FCoE frame

(Fields: Ethernet header | FCoE Start of Frame | FC header | FC payload | FC Frame Check Sequence | FCoE End of Frame)
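As a rough illustration of this encapsulation, the following Python sketch packs a raw FC frame into an FCoE Ethernet frame. It assumes the publicly documented FC-BB-5 layout (FCoE EtherType 0x8906, a 14-byte FCoE header ending in the SOF code, and an EOF-plus-reserved trailer); the SOF/EOF code values and the helper name are illustrative assumptions, not details taken from this brief:

# Minimal sketch of FCoE encapsulation; illustrative, not a compliant implementation.
import struct

FCOE_ETHERTYPE = 0x8906          # EtherType registered for FCoE
SOF_I3, EOF_T = 0x2E, 0x42       # example SOF/EOF code values (assumed for illustration)

def encapsulate_fc_frame(dst_mac: bytes, src_mac: bytes, vlan_pcp: int,
                         vlan_id: int, fc_frame: bytes) -> bytes:
    """Wrap a raw FC frame (header + payload + CRC) in an FCoE Ethernet frame."""
    # The 802.1Q tag carries the priority (PCP) used by PFC/ETS for the storage class
    vlan_tag = struct.pack("!HH", 0x8100, (vlan_pcp << 13) | vlan_id)
    fcoe_header = bytes(13) + bytes([SOF_I3])   # version nibble + reserved, then SOF
    fcoe_trailer = bytes([EOF_T]) + bytes(3)    # EOF followed by reserved padding
    ethertype = struct.pack("!H", FCOE_ETHERTYPE)
    return dst_mac + src_mac + vlan_tag + ethertype + fcoe_header + fc_frame + fcoe_trailer

# Example (hypothetical MACs; fc_frame would be a fully built FC frame with CRC):
# frame = encapsulate_fc_frame(fcf_mac, fpma_mac, vlan_pcp=3, vlan_id=1002, fc_frame=raw_fc)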

Layer 2 encapsulation provides several advantages to FCoE over previous converged network implementations:
- Because devices use existing FC logic, FCoE devices use existing FC driver models for the new converged network adapters.
- We can easily implement FCoE in switches because the logic necessary to convert between FCoE and FC is simple.
- Existing FC security and management operations, procedures, and applications do not change when using an FCoE/DCB infrastructure for a partially or completely converged network.
- FCoE takes advantage of a lossless 10 GbE fabric with significantly higher bandwidth than 8 Gb FC fabrics (actually 6.4 Gb plus encoding overhead in the FC protocol).
- Future protocols can use the enhanced DCB Ethernet features that support FCoE.

Fibre Channel Forwarder
A Fibre Channel Forwarder (FCF) is a function within a switch that acts as a translation point for converting FCoE traffic between DCB-enabled Ethernet ports and native FC ports. There is one FCF function in a switch for each upstream FC fabric connected to the FC ports of that switch. In other words, there can be more than one FCF function in a switch. An FCF also provides the portal where converged network adapters access the traditional SAN fabric services, for example fabric login, name services, and zoning services. When first initialized, converged network adapters discover the available FCFs in a DCB network. Through management direction, they attach themselves to at least one FCF to begin communication with a SAN fabric. During fabric login, FCFs provide the mechanism that negotiates MAC address provisioning for the FCoE portion of a converged network adapter. The most commonly used mechanism is Fabric Provisioned MAC Addresses (FPMA). It operates like FC addressing in an FC network, where the address used in the frames is allocated at fabric login time. This is different from normal Ethernet NIC functions, which typically have a static address burned into them at the factory.

ENode
An ENode is a device that takes the place of the traditional LAN NIC and the FC HBA in a host or server. It is commonly called a converged network adapter (CNA). It provides both data communications and block storage communications through a converged network implemented with DCB-capable Ethernet. An ENode merges the traffic from the NIC and from the SCSI/FC functions into a stream of Ethernet frames to the DCB-enabled Ethernet network. Within the DCB network, a DCB/FCoE/FC switch disaggregates the converged traffic streams and sends the different TCs to their appropriate destinations: legacy LANs, legacy FC nodes, or DCB network nodes.

The ENode (Figure 4) consists of these components:
- FCoE Controller: uses the FCoE Initialization Protocol (FIP) to discover the SAN fabrics through the FCFs and provisions the virtual N_Ports (VN_Ports) and FCoE Link End Points (LEPs).
- FCoE LEPs: convert FC frames to FCoE frames on the transmit side and FCoE frames to FC frames on the receive side. There is one LEP for each VN_Port established in the ENode.
- VN_Ports: instantiate virtual N_Ports with N_Port ID Virtualization (NPIV) capability, similar to a traditional FC HBA. The VN_Ports in an ENode include information about the MAC address to WWN translations required for proper communications with FCFs in a converged network.
- FC Function: the traditional logic implemented in an FC HBA. It handles storage discovery, storage connection management, error recovery, and the host bus (PCIe) interface to the upper-layer SCSI drivers. This function behaves so much like an FC HBA function that CNAs and HBAs from the same vendor typically use the same storage drivers in the host operating systems. This makes deploying both converged and non-converged systems in a data center very easy during the transition to a converged infrastructure.
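A compact sketch of this composition follows, using plain Python dataclasses to show the one-LEP-per-VN_Port pairing; all field and method names are illustrative assumptions, not identifiers from FC-BB-5:

# Sketch of the ENode composition described above (illustrative names only).
from dataclasses import dataclass, field

@dataclass
class VNPort:
    wwpn: str                  # World Wide Port Name presented to the FC fabric
    fabric_mac: str = ""       # FPMA assigned by the FCF at fabric login

@dataclass
class FcoeLinkEndPoint:
    vn_port: VNPort            # each LEP encapsulates/decapsulates for one VN_Port

@dataclass
class ENode:
    nic_mac: str                               # data-communications (NIC) side
    vn_ports: list[VNPort] = field(default_factory=list)
    leps: list[FcoeLinkEndPoint] = field(default_factory=list)

    def fabric_login(self, wwpn: str, fpma: str) -> VNPort:
        """FCoE controller behavior: instantiate a VN_Port and its paired LEP."""
        port = VNPort(wwpn, fpma)
        self.vn_ports.append(port)
        self.leps.append(FcoeLinkEndPoint(port))
        return port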

Figure 4. FCoE architecture components

FCoE and Ethernet


FCoE requires DCB-enabled Ethernet. The IEEE is working to enhance the IEEE 802 network standards to allow FC, or any TC requiring lossless behavior, to run efficiently over many types of IEEE 802 compliant MAC layer protocols, including Ethernet. We expect FCoE standard ratification in late 2010. It is important to understand that FCoE will not work on legacy Ethernet networks because it requires a lossless form of Ethernet; FC cannot tolerate the dropped frames that Ethernet allows today. It is possible to create a lossless Ethernet network using the existing IEEE 802.3x flow control mechanisms, but if the network carries multiple TCs, those mechanisms can cause Quality of Service (QoS) issues, limit the ability to scale a network, and affect performance.

DCB standards
DCB is not just the name for a set of new standards the IEEE is developing. It is also a term often used for Ethernet designed to carry multiple TCs, some with lossless behavior. You can think of DCB-enabled Ethernet as applying the DCB standards to the IEEE 802.3 Ethernet standards to create a new set of products implementing this improved version of Ethernet. The change from legacy Ethernet to DCB-enabled Ethernet requires hardware and software changes, so you can't upgrade legacy Ethernet NICs and switches to DCB support to carry FCoE traffic. Fortunately, you only have to update the data paths in a data center that carry FCoE with DCB-enabled Ethernet devices.

For full end-to-end data center use, all equipment manufacturers must agree to adopt four new IEEE protocols. The proposed standards are still under development, and full ratification of the complete set may take until late 2010 or 2011. One result of these ongoing standardization efforts is that DCB/FCoE products offered on the market today will likely need frequent software upgrades or even new hardware by the time DCB/FCoE technology is fully mature.

The DCB Task Group within the IEEE 802.1 Higher Layer LAN Protocols Work Group is defining DCB for protocols and technologies that apply to data center-oriented LAN communications. The standards they develop apply to all IEEE 802 network types, but they implicitly target Ethernet as the primary implementation. Table 1 lists the four new technologies defined in three DCB draft standards.
Table 1. DCB draft standards for IEEE 802 networks

Draft standard     New technology
IEEE 802.1Qbb      Priority-based Flow Control (PFC)
IEEE 802.1Qaz      Enhanced Transmission Selection (ETS) and DCB Capability Exchange Protocol (DCBX)
IEEE 802.1Qau      Quantized Congestion Notification (QCN)

These standards serve three general purposes:
- Allow IEEE 802 LANs to carry multiple traffic classes
- Support lossless behavior on a subset of these traffic classes
- Formally define standard frame transmission scheduling mechanisms to support multiple traffic classes

You don't have to use all four of these protocols to implement a DCB network, and you don't need to use all the options available in each protocol. However, if vendors do not implement the entire set, products may limit the possible scale or features. Because the standards are evolving, current DCB/FCoE products do not implement all of these protocols or all of their supported options. Therefore, we must discuss their deployment limitations within a data center.

Priority-based Flow Control
Legacy FC networks support a link-level flow control mechanism known as buffer-to-buffer or credit-based flow control. This lightweight, high-performance mechanism lets FC operate in a lossless manner. Credit-based flow control provides the reliable layer 2 network required for block storage traffic, for example SCSI. To transport FC and SCSI protocols over Ethernet and maintain a lightweight implementation, a similar mechanism must be provided for Ethernet networks.

Legacy Ethernet uses a simple flow control mechanism: pause frames let a congested port on an Ethernet NIC or switch tell its link partner to pause all traffic for a specified time. This approach can limit performance when a network device port has multiple queues for receiving incoming frames of varying priority or TCs: if one queue becomes full, the device must send a pause frame to the other side of the link. This pauses all traffic, regardless of TC/priority.

Supporting lossless behavior of block storage protocols on legacy Ethernet networks requires using legacy pause frames. However, this forces all traffic on that link to be lossless, and the most bursty or bandwidth-driven TCs dictate the behavior of all TCs. Many types of traffic flows, for example real-time audio/video data streams, don't require lossless transmission and don't perform well on a lossless link. Even traditional TCP-based traffic flows optimized for lossy communications environments often don't perform well in lossless environments that simultaneously transport different classes of traffic with vastly different characteristics.

In Figure 5, low-bandwidth, latency-sensitive traffic for voice/video/financial transactions (green) and higher bandwidth bulk traffic for storage (red) are sent on a link. The receiving device has two sets of queues for receiving and storing data, one for green traffic and the other for red traffic. In this example, the high bandwidth bulk traffic will fill the red receive queue. Although the green traffic has plenty of queue space available, the receiving device sends a pause frame because the red queue is full. The transmitting device receives this pause frame and stops all traffic on the link. Long delays interrupt the low latency traffic.

Figure 5. Legacy pause-based flow control

Priority-based Flow Control (PFC) uses a modified version of the pause frame called a Per Priority Pause (PPP) frame. PPP allows the pause frame to specify which priorities, and thus which TCs, to pause. PFC uses the priority levels carried in the class of service fields of the 802.1Qbb PPP frame header. When a network device has one or more receive queues that are nearly full, it constructs a PPP frame to send to the remote link partner. The remote device examines the class of service fields to determine which priorities/TCs to pause. The port's transmit function stops sending the priorities/TCs going to the full ingress queues on the congested device without affecting priorities/TCs going to unfilled queues on that device.
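For illustration, the following sketch builds a PPP frame using the publicly documented PFC MAC Control layout (EtherType 0x8808, opcode 0x0101, a priority enable vector, and eight per-priority pause times). The constants and helper name are assumptions for this example, not details from this brief:

# Sketch of a PFC (per-priority pause) MAC Control frame.
import struct

PAUSE_DST = bytes.fromhex("0180C2000001")   # reserved MAC Control multicast address
MAC_CONTROL_ETHERTYPE = 0x8808
PFC_OPCODE = 0x0101

def build_pfc_frame(src_mac: bytes, pause_quanta: dict[int, int]) -> bytes:
    """pause_quanta maps priority (0-7) to pause time in quanta (0 means XON)."""
    enable_vector = 0
    times = [0] * 8
    for prio, quanta in pause_quanta.items():
        enable_vector |= 1 << prio          # mark this priority's time field as valid
        times[prio] = quanta
    payload = struct.pack("!HH8H", PFC_OPCODE, enable_vector, *times)
    frame = PAUSE_DST + src_mac + struct.pack("!H", MAC_CONTROL_ETHERTYPE) + payload
    return frame.ljust(60, b"\x00")          # pad to minimum Ethernet frame size (pre-FCS)

# Example: pause priority 3 (the storage class here) while other priorities keep flowing.
# frame = build_pfc_frame(bytes.fromhex("000000000001"), {3: 0xFFFF})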

Figure 6 illustrates the same scenario up to the point where the receiving node needs to send a pause frame. A PPP frame dictates pausing the red TC. The pause takes advantage of the class of service fields to restrict the pause to only classes of traffic that have nearly full queues. The transmitting station stops sending red traffic; the latency-sensitive green traffic continues to flow properly.

Figure 6. PFC-based flow control

Receive queues in a DCB Ethernet device have high and low watermarks. If a queue fills up to the high watermark, the device generates a PPP frame. If the level of the queue drops below the low watermark, the device sends a PPP frame specifying a zero time to indicate that the link partner may send traffic for the affected TCs immediately. This allows an XON/XOFF-type operation per priority/TC. A single PPP frame can specify XON/XOFF behavior independently for any of up to eight priorities/TCs, which reduces control frame overhead when devices support PFC on multiple TCs. The FCoE protocol requires DCB-enabled Ethernet devices to support only one PFC-enabled priority/TC; not all eight priorities/TCs have to support PFC. Many devices on the market today support only one PFC-enabled priority/TC. In the future, devices should support a greater number of PFC-enabled priorities/TCs, but that is not required for basic FCoE transport over DCB-enabled Ethernet links.
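A rough sketch of that watermark-driven XON/XOFF behavior might look like the following; the queue depths, watermark values, and the send_ppp callback are assumptions for illustration:

# Per-priority XON/XOFF signaling driven by receive-queue watermarks (illustrative).
MAX_PAUSE_QUANTA = 0xFFFF   # XOFF: ask the link partner to stop this priority
XON = 0x0000                # zero time: the link partner may resume immediately

class PriorityReceiveQueue:
    def __init__(self, priority: int, high_watermark: int, low_watermark: int):
        self.priority = priority
        self.high = high_watermark
        self.low = low_watermark
        self.paused = False

    def on_depth_change(self, depth: int, send_ppp) -> None:
        """send_ppp(priority, quanta) transmits a per-priority pause frame."""
        if not self.paused and depth >= self.high:
            send_ppp(self.priority, MAX_PAUSE_QUANTA)   # XOFF only this priority/TC
            self.paused = True
        elif self.paused and depth <= self.low:
            send_ppp(self.priority, XON)                # resume this priority/TC
            self.paused = False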

Enhanced Transmission Selection
Legacy Ethernet supports multiple traffic management elements called traffic classes (TCs). IEEE 802.1Q (VLAN) tags with a class of service (CoS) field assign a transmission priority to each TC. You can implement up to eight TCs (TC0 through TC7) in an Ethernet device. Current standards and product implementations focus on transmitting the traffic classes in strict priority order. For applications operating completely at layer 2 (the MAC layer), strict priority does not allow the fair, deterministic bandwidth control typically preferred for all but the very highest priority traffic classes. This includes converged networks that handle block storage traffic using a layer 2 encapsulation protocol such as FCoE.

One common misunderstanding about many modern Ethernet devices, particularly Ethernet switches, is that they already have bandwidth control and traffic shaping capabilities that support layer 2 protocols like FCoE. These devices typically define traffic classes based on layer 3 (IP) or layer 4 information in frames, not on the priority field of the IEEE 802.1Q tag or the Ethertype (protocol) field in the Ethernet frame header.

The Enhanced Transmission Selection (ETS) standard formally defines how the port transmit logic of an Ethernet device selects the next frame to send from one or more priority/traffic class queues for layer 2 (MAC-based) protocols. This lets the device allocate bandwidth between layer 2 defined traffic classes and support strict priority scheduling for traffic classes requiring it.

ETS refines the existing TCs by adding a bandwidth-sharing algorithm that you can assign to each of the supported TCs. When you configure a TC to use the ETS bandwidth-sharing algorithm, you must provide a bandwidth percentage. Traffic class queues that are part of TCs assigned a strict priority scheduling algorithm (typically the default algorithm) are processed in strict priority order. They have three typical uses:
- Extremely high priority network control or management traffic
- Low-bandwidth, low-latency traffic
- Jitter (variable latency) sensitive or intolerant traffic
The ETS standard specifies that once all the strict priority TC queues are empty, the device sends frames from the TCs assigned an ETS scheduling algorithm. A single ETS TC can have more than one priority queue.

There is a common misconception about the ETS bandwidth-sharing algorithm. Some people think that the bandwidth percentage assigned to an ETS traffic class is a percentage of the link bandwidth of the port. That is not true. ETS bandwidth percentages represent the percentage of bandwidth available after satisfying all of the strict priority TCs. That is, if the strict priority TCs take up 4 Gb/s of a 10 Gb/s link, an ETS queue assigned 50 percent bandwidth is asking for 50 percent of the remaining 6 Gb/s, or 3 Gb/s.

The ETS standard does not specify the bandwidth allocation algorithm that DCB-enabled Ethernet devices must use to select frames from the TCs. Device vendors decide the best algorithms for their products. The standard does suggest that deficit weighted round robin (DWRR) and a handful of other algorithms would suffice. The ETS standard also does not specify the algorithm for selecting frames for transmission from multiple priority queues assigned to the same TC. The standard suggests that using a strict priority algorithm between these queues is one possibility.

As Ethernet frames of varying priority queue up for transmission on a port, the device maps them into priority queues and traffic classes. The device then places the frames into independent priority or traffic class queues. The network administrators responsible for managing the port on the network device configure these assignments. The ETS standard specifies that these administrators are also responsible for assigning the scheduling algorithm for each traffic class. In Figure 7, priority 5 frames are in TC4, and priority 1 frames are in TC1.
Strict priority is the scheduling algorithm for both of these TCs, so the device sends their frames before any frames of TCs assigned the ETS scheduling algorithm. In this case, the device sends frames for TC4 before any frames from TC1. If there are no frames in the queue for TC4, then the device sends frames in TC1 before any frames in any of the other TCs. The TCs assigned ETS scheduling (TC0, TC2, and TC3) have been allocated 50, 40, and 10 percent of the available bandwidth, respectively. These allocations are the percentage of bandwidth available after the transmit requirements of TC4 and TC1 are satisfied.


Figure 7. Example of an Enhanced Transmission Selection (ETS) configuration

Also in Figure 7, note that TC2 has priority queues 2 and 3. The ETS standard suggests that frames transmit from the TC2 queues in strict priority order. In this example, the device sends any frames in the queue for priority 3 before any frames in the queue for priority 2. Again, the standard leaves the implementation of scheduling for these intra-TC queues to device vendors. Vendors might schedule the two priority queues in strict priority order or in round robin, and some implementations may be configurable to allow either mode.

The FCoE protocol requires DCB-enabled Ethernet devices to support at least two TCs with ETS scheduling: one for traditional data communication traffic and one for FCoE traffic. Many devices on the market support only two TCs with ETS capability. Future generations of hardware should support more TCs capable of ETS bandwidth scheduling, but this is not required for basic FCoE transport over DCB-enabled Ethernet links.

Those who adopt this technology must clearly understand another important aspect of ETS performance: ETS bandwidth allocation is merely a best-effort specification of a minimum bandwidth guarantee. Many factors can limit a device's ability to meet these bandwidth requirements consistently. The bandwidth consumed by the strict priority queues directly affects the amount of bandwidth available for ETS traffic classes. When a port receives a per priority pause (PPP) frame from its link partner, all transmission from that traffic class, or from the paused priority queue within the traffic class, stops for the duration of the pause. This can dramatically reduce the effective throughput of that traffic class. Finally, implementing congestion notification can also affect the amount of data transmitted from a traffic class, but not as severely as PFC's effect on ETS.
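To make the bandwidth math concrete, here is a minimal sketch, not a vendor implementation, that gives strict-priority TCs what they demand and then splits the leftover bandwidth by ETS percentages. The per-TC numbers are illustrative and mirror the 4 Gb/s example above:

# ETS-style bandwidth accounting: strict-priority TCs are served first, then ETS TCs
# share whatever bandwidth remains according to their configured percentages.
def ets_allocations(link_gbps: float,
                    strict_demand_gbps: dict[str, float],
                    ets_percent: dict[str, int]) -> dict[str, float]:
    """Return the bandwidth each traffic class can expect on one link."""
    assert sum(ets_percent.values()) == 100, "ETS percentages must total 100"
    alloc = dict(strict_demand_gbps)                 # strict TCs get what they ask for
    remaining = link_gbps - sum(strict_demand_gbps.values())
    for tc, pct in ets_percent.items():
        alloc[tc] = remaining * pct / 100            # share of the leftover, not of the link
    return alloc

# Strict TCs consume 4 Gb/s of a 10 Gb/s link, so the TC assigned 50 percent
# receives 3 Gb/s (50 percent of the remaining 6 Gb/s), TC2 gets 2.4 Gb/s, TC3 gets 0.6 Gb/s.
print(ets_allocations(10.0, {"TC4": 3.0, "TC1": 1.0}, {"TC0": 50, "TC2": 40, "TC3": 10}))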

Quantized Congestion Notification
The IEEE 802.1Qau standard specifies a protocol called Quantized Congestion Notification (QCN). The QCN protocol supports end-to-end flow control in large, multi-hop, DCB-enabled, switched Ethernet infrastructures. It is one of the most significant standards for enabling converged network deployments in moderate to large data centers. PFC protects against occasional bursty congestion on a single link between DCB-enabled devices. QCN protects larger multi-hop or end-to-end converged networks from persistent or chronic congestion. These multi-hop networks are susceptible to congestion because typical tree-like network architectures tend to have choke points where multiple sources of data compete for network resources and bandwidth to reach a smaller number of destinations. Typical shared storage traffic patterns especially compound this issue. QCN does not guarantee a lossless environment in the DCB-enabled LAN. You must use QCN in conjunction with PFC to provide lossless operation with smooth congestion management across large DCB-enabled networks.

QCN uses a special new tag that allows sources of traffic, for example CNAs, to identify traffic flows to all interconnect devices in a QCN-enabled DCB network. QCN defines two specific points in a network that implement the QCN protocol: congestion points and reaction points. The QCN protocol has these basic procedural elements:
- Reaction points initiate traffic into the network. They can include CNAs, target nodes, or DCB-enabled switches that bridge between native FC networks and the DCB-enabled Ethernet network. Reaction points tag their frames with traffic flow information identifying the source and destination of the traffic flow.
- When transmit queues fill up because of congestion from oversubscription, congestion points (typically switches) statistically sample the frames in the congested transmit queues to identify the traffic flows contributing most to the congestion.
- The congestion point device calculates congestion feedback quanta for each traffic source sampled. The device uses information from the sampled traffic flow tags to send congestion notifications back to the traffic sources.
- Upon receiving a congestion notification, a reaction point uses the feedback quanta to reduce the transmission rate for that traffic flow to that specific destination. QCN does not affect traffic sent on unrelated flows to unrelated destinations.
- If a reaction point receives no further congestion notification messages, it slowly increases its transmit rates until they reach normal levels.

Most DCB-enabled Ethernet switches will implement congestion points. We can roughly equate QCN operation to the TCP window algorithms that restrict traffic flow when a device detects lost frames. In the case of QCN, however, the protocol operates at layer 2 in the network and uses high-performance, low-level hardware to improve the network's ability to react to congestion. Figure 8 illustrates a multi-hop network that implements QCN.
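Before looking at the figure, here is a highly simplified sketch of the reaction-point behavior described above: multiplicative rate decrease on feedback and gradual recovery otherwise. The gain and recovery rules are illustrative assumptions, not values from IEEE 802.1Qau:

# Simplified reaction-point rate control in the spirit of the QCN description above.
class ReactionPoint:
    def __init__(self, line_rate_gbps: float = 10.0, gain: float = 1 / 128):
        self.line_rate = line_rate_gbps
        self.current_rate = line_rate_gbps   # per-flow transmit rate
        self.target_rate = line_rate_gbps
        self.gain = gain                     # scales the feedback quanta

    def on_congestion_notification(self, feedback_quanta: int) -> None:
        """Larger feedback quanta means a bigger rate cut for this flow."""
        self.target_rate = self.current_rate
        cut = min(0.5, self.gain * feedback_quanta)   # never cut more than half at once
        self.current_rate *= (1.0 - cut)

    def on_recovery_timer(self) -> None:
        """No recent notifications: move back toward the target, then toward line rate."""
        self.current_rate = min(self.line_rate,
                                (self.current_rate + self.target_rate) / 2)
        self.target_rate = min(self.line_rate, self.target_rate * 1.05)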

Figure 8. QCN congestion notification

(Figure labels: Congestion Points, Reaction Points, Storage, Data Flow, Congestion Notification Messages)

In this example, multiple CNAs in servers are sending write data to a common storage device through a multi-hop network. As a switch queue fills and surpasses a high watermark, the switch sends congestion notification messages to the server CNAs. The switch selects the server CNAs by statistically sampling the congested queue. The congestion notification occurs dynamically: the switch sends higher feedback quanta to CNAs producing the most traffic and lower feedback quanta to sources producing less traffic. As a result, the CNAs throttle down their transmit rates on the congested traffic flows. The decrease in traffic flow rates reduces the number of frames in the congested queue in the switch to achieve a more sustainable, balanced level of performance. As the congestion eases, the switch reduces or stops sending notifications and the CNAs begin to accelerate their throughput rates. This active feedback protocol continuously balances traffic flow.

It is possible to construct simple converged networks of one or two switch hops without QCN. In fact, the FCoE protocol does not require the use of QCN in DCB-enabled Ethernet equipment. However, the general understanding is that building relatively complex multi-hop or end-to-end, data-center-wide converged networks based on DCB-enabled Ethernet equipment requires enabling QCN in that infrastructure. Networks that use the QCN protocol face several challenges:
- QCN protocol complexity: Implementing the flow tagging, statistical sampling, and congestion messaging is relatively complex. Identifying the proper timing and quanta of notification feedback to satisfy a wide variety of operating conditions is also difficult.
- Difficult interoperability process: Perfecting multi-vendor interoperability could take several years because of protocol complexity.
- No QCN support in current generation products: No DCB/FCoE products shipping today support the QCN protocol. Furthermore, most, if not all, products will require a hardware upgrade to support QCN. Products claiming to support QCN have unproven, untested hardware implementations, and vendors haven't performed rigorous interoperability tests with production-level QCN software.
- Complete end-to-end support requirement: To enable QCN in a network, the entire data path must support the QCN protocol. All hardware across the DCB-enabled network must support QCN. This poses a significant problem because upgrading existing first-generation, DCB-based converged networks requires replacing or upgrading all DCB components.

Because of these challenges, only one-hop and two-hop networks will be reliable until next-generation hardware becomes available to support QCN. Most currently shipping hardware cannot support QCN and cannot be software upgraded to add this support. Therefore, support for larger DCB-based network deployments will require hardware upgrades.

Data Center Bridging Exchange
The Data Center Bridging Exchange (DCBX) protocol provides two primary functions:
- Lets DCB-enabled Ethernet devices/ports advertise their DCB capabilities to their link partners
- Lets DCB-enabled Ethernet devices push preferred parameters to their link partners

DCBX supports discovery and exchange of network configuration information between DCB-compliant peer devices. DCBX enhances the Link Layer Discovery Protocol (LLDP) with more network status information and more parameters than LLDP provides. The specification separates DCBX exchange parameters into administered and operational groups. The administered parameters contain network device configurations. The operational parameters describe the operational status of network device configurations. Devices can also specify a willingness to accept DCBX parameters from the attached link partner.
This is most commonly supported in CNAs that allow the attached DCB-enabled switch to set up their parameters.


NOTE: Link Layer Discovery Protocol (LLDP), IEEE 802.1AB, defines a protocol and a set of managed objects that can be used for discovering the physical topology and connection end-point information from adjacent devices in 802 LANs and MANs. The protocol is not restricted from running on non-802 media.

Table 2. DCBX supported parameters

PFC parameters advertised:
- Indication of which priorities have PFC enabled
- Willingness to accept PFC recommendations (CNA)
- Number of priorities that can support PFC
- MACsec bypass capability

ETS parameters advertised:
- Number of traffic classes supported on the port
- Priority to traffic class mapping
- Willingness to accept ETS recommendations (CNA)
- Traffic class bandwidth allocations (for ETS TCs)
- Bandwidth allocation algorithms for each TC

QCN parameters advertised:
- Not currently in the standard

Other parameters advertised:
- How applications, for example FCoE, map to priorities

Figure 9 illustrates DCBX parameter negotiation between a CNA and the attached switch port where neither device is willing to accept DCBX parameter recommendations. In this case, the CNA and switch advertise DCB capabilities to each other. The adapter chooses a storage traffic priority that is not compatible with the switch. The CNA and switch cannot properly exchange storage traffic with one another so communication on that link does not happen. Typically, this generates an error that prompts you to reconfigure either the CNA or the switch parameters to make them compatible. The same situation can occur on links between switches.


Figure 9. DCBX static parameter exchange

(The CNA parameters and switch parameters are incompatible, so the link is blocked.)
The DCBX protocol's strength lies in its ability to perform dynamic negotiation using attributes called recommended and willingness. CNAs and switches using DCBX can advertise their willingness to adopt parameter settings from their link partner. In the example shown in Figure 10, a CNA communicates its initial ETS and PFC information and its willingness to consider parameters from the switch. The switch acknowledges this willingness and sends the CNA recommended values for the ETS and PFC parameters. If the CNA can successfully adopt the recommended parameters, it re-advertises its DCBX parameters using the recommended values. The two devices are then able to communicate on the established link.
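The following sketch illustrates this willing/recommended negotiation in simplified form; the parameter structure and merge rule are assumptions for illustration only, not the DCBX TLV format:

# Simplified view of DCBX willing/recommended negotiation.
from dataclasses import dataclass

@dataclass
class DcbParams:
    pfc_priorities: frozenset[int]      # priorities with PFC enabled
    ets_bandwidth: dict[int, int]       # traffic class -> percent of available bandwidth
    willing: bool                       # accept the link partner's recommendation?

def negotiate(local: DcbParams, remote: DcbParams) -> DcbParams:
    """Return the operational parameters the local port ends up running."""
    if local.willing and not remote.willing:
        # Adopt the peer's recommended settings (typical for a CNA facing a switch)
        return DcbParams(remote.pfc_priorities, remote.ets_bandwidth, local.willing)
    # Otherwise keep the local configuration; a mismatch must be resolved by the admin
    return local

cna = DcbParams(frozenset({3}), {0: 50, 1: 50}, willing=True)
switch = DcbParams(frozenset({3}), {0: 60, 1: 40}, willing=False)
operational = negotiate(cna, switch)    # the CNA adopts the switch's ETS recommendation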


Figure 10. DCBX dynamic negotiation

(Message sequence: CNA advertises willingness; switch sends recommended DCB parameters; CNA re-advertises with the new parameters.)

Migrating to converged fabrics


In a one-hop architecture, converged traffic goes from a server to a switch that splits it into Ethernet and Fibre Channel. In a two-hop architecture, converged traffic passes through a second switch before the split. The more switch hops in a DCB-enabled network, the more difficult it is to keep the network operating at peak efficiency while minimizing congestion. Figure 11 shows the expected industry path to convergence.

Figure 11. Industry path to convergence


This is the first phase of migration to converged fabrics. CNAs will connect to converged fabric access switches that support DCB-enabled Ethernet, legacy Ethernet, and legacy FC. The CNAs will provide converged connectivity between servers and the first-hop switch, which then disaggregates the traffic to the legacy LAN and SAN infrastructure. Figure 12 compares traditional deployment to the first phase of converged network deployment.

Figure 12. Comparison of traditional deployment and converged network, phase 1

Figure 13 shows how the next phases of deployment may occur as you update existing data centers or build new ones. Eventually a server will require only a single pair of redundant CNAs. Converged network switches will replace separate FC, 10 GbE, and IB switches.


Figure 13. Converged network deployment, phases 2 and 3

HP strategy
We believe that the transition to DCB/FCoE can be graceful. It need not disrupt existing network infrastructures if you first deploy at the server-to-network edge and then migrate farther into the network. With this approach, you gain the immediate benefit of reduced cable and adapter hardware with the least amount of disruption to the overall network architecture. As you deploy new servers, you can deploy DCB/FCoE with new CNAs and DCB/FCoE/FC enabled edge/access switches. Doing this optimizes, simplifies, and reduces the cost of the server-to-network edge infrastructure, and you won't have to replace the entire data center communications infrastructure.

You should start by implementing DCB/FCoE technology only with those servers requiring access to FC SAN storage targets. Not all servers need access to FC SANs: many data centers average about 60 to 80 percent LAN-only network attachment, so only the remaining 20 to 40 percent of servers need both LAN and SAN connectivity. Looking forward, many IT organizations are re-evaluating the network storage connectivity of their server infrastructure. Besides DCB/FCoE technology, other methods of converging traffic include iSCSI protocols with 10 Gb storage devices and file-oriented network storage protocols such as NFS or CIFS. Neither of these technologies requires a DCB-enabled Ethernet network; both can operate on traditional 1/10 Gb Ethernet infrastructure.

Transitioning the server-to-network edge first to accommodate FCoE/CEE maintains the existing architecture and management roles, keeping the existing SAN and LAN topologies. Updating the server-to-network edge offers the greatest benefit and simplification without disrupting the data center architecture.


For more information


- HP Multifunction Networking Products: http://h18004.www1.hp.com/products/servers/proliantadvantage/networking.html
- HP ProLiant networking Ethernet network adapters: http://h18004.www1.hp.com/products/servers/networking/index-nic.html
- Server-to-network edge technologies: converged networks and virtual I/O technology brief: http://h20000.www2.hp.com/bc/docs/support/SupportManual/c02044591/c02044591.pdf
- Ethernet technology for industry-standard servers technology brief: http://h20000.www2.hp.com/bc/docs/support/SupportManual/c02475134/c02475134.pdf
- HP FlexFabric and Flex-10 technology technology brief: http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01608922/c01608922.pdf
- Server virtualization technologies for x86-based HP BladeSystem and HP ProLiant servers technology brief: http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01067846/c01067846.pdf
- HP Virtual Connect Technology web page: http://isscontent.americas.hpqcorp.net/products/blades/virtualconnect/

Call to action
Send comments about this paper to TechCom@HP.com

Copyright 2010 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein. TC101220TB, December 2010
