
iPCA Quality Awareness

Technology White Paper

1 IP/Ethernet Networks Cannot Measure Service Quality
1.1 iPCA Overview
1.2 iPCA Benefits

2 iPCA Applications
2.1 Device-Level Measurement
2.2 Network-Level Measurement

3 Traditional Service Quality Measurements
3.1 Factors Affecting Service Quality
3.2 Quality Guarantee and Fault Location Difficulties on IP Networks
3.3 Problems in Traditional IP Network Quality Measurement Technologies

4 Summary
4.1 iPCA Identifies and Locates Problems

1 IP/Ethernet Networks Cannot Measure Service Quality
1.1 iPCA Overview
IP and Ethernet have been widely used as basic network technologies. Both IP and
Ethernet networks are connectionless networks. Connectionless networks feature
good scalability and service transparency but, in contrast to connection-oriented
networks, can only identify ingress and egress of data flows and do not provide
any connection information. Therefore, service quality monitoring and guarantees
are difficult.
Packet Conservation Algorithm for Internet (iPCA) provides quality measurement
at the device and network levels. Device-level measurement can be implemented
when all devices in an area are Huawei agile devices. In this case, Quality
Awareness and accurate fault location can be provided for the entire network
simply by enabling device-level measurement on each agile device. Agile
device-level measurement encompasses cards, switch fabric units, and links, and can
accurately detect any problems that affect user experience, no matter the cause.
Packet conservation means that the number of packets leaving a system (network,
link, device, or card) equals the number of packets arriving at the system. If data
flows passing through a system comply with packet conservation, packet loss does
not occur and packet transmission quality is ensured. iPCA monitors packet loss,
which is a major factor affecting user experience on IP networks.

Figure 1-1 iPCA diagram
(Figure labels: arriving packets, internally generated packets, absorbed packets; monitored system: network/device/card/link.)

The iPCA quality measurement principle is simple. A monitored system is normal
if the following condition is met:

Number of packets arriving at the system + Number of internally generated packets
= Number of packets departing the system + Number of packets absorbed by the system

If this condition is not met, some packets have been dropped. In practice,
however, quality measurement is complex because many factors must be considered,
such as packet counter synchronization and the specific situation of the
monitored system. The detailed measurement process is outside the scope of this
document.
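The packet conservation condition can be illustrated with a minimal Python sketch. The `Counters` type and the function names are hypothetical and exist only for this illustration. The sketch assumes that all four counters were read over the same, already synchronized measurement interval, which is the part that real measurement makes difficult.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counters:
    """Packet counters for one monitored system over one measurement interval."""
    arrived: int    # packets arriving at the system
    generated: int  # packets generated inside the system
    departed: int   # packets departing the system
    absorbed: int   # packets absorbed (terminated) by the system


def dropped_packets(c: Counters) -> int:
    """Packets that are unaccounted for; 0 means packet conservation holds."""
    return (c.arrived + c.generated) - (c.departed + c.absorbed)


def is_normal(c: Counters) -> bool:
    """A monitored system is normal when packet conservation holds."""
    return dropped_packets(c) == 0


# Hypothetical counter values for illustration only.
healthy = Counters(arrived=1000, generated=20, departed=990, absorbed=30)
lossy = Counters(arrived=1000, generated=20, departed=980, absorbed=30)

print(is_normal(healthy), dropped_packets(healthy))  # True 0
print(is_normal(lossy), dropped_packets(lossy))      # False 10
```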

1.2 iPCA Benefits


iPCA helps you monitor network quality in real time to find and solve network
problems in a timely manner. This technology ensures good user experience and
quick identification of failure points. To implement device-level measurement, you
only need to enable the device-level measurement capability on each device. To
implement network-level measurement, you need to define a monitored domain
and enable an iPCA instance. After you configure the iPCA domain and alarm
thresholds, iPCA can detect problems in the domain. Then you can handle the
problems before these problems degrade the user experience.
When Huawei agile devices establish a network with Huawei non-agile devices
or third-party devices, you can enable network-level measurement in addition to
device-level measurement on the agile devices. In this way, the agile devices can
monitor the quality of non-agile devices and the network while distinguishing
the problems in agile and non-agile areas. To implement this function, familiarize
yourself with iPCA implementation and properly define the monitored flows and
packet conservation domain.
If iPCA does not cover the entire network or the alarm thresholds are not properly
set, network problems are often discovered through user complaints rather than
real-time alarms. In this case, historical device-level or network-level measurement data
provided by iPCA can be used to locate faults. For example, if you do not know where
a fault has occurred, you can check iPCA records to find the possible failure points.

2 iPCA Applications
2.1 Device-Level Measurement
iPCA device-level measurement redefines a Multiple-Input-Multiple-Output (MIMO)
system, and monitors MIMO line cards, links, and switch fabric units to measure
transmission quality in each device as well as links between devices. It not only
quickly measures network quality, but also identifies the specific line card, switch
fabric unit, or link to replace, helping you quickly identify and handle network
problems. After iPCA device-level measurement is deployed on a network, network
administrators only need to check alarms against the predefined quality parameter
thresholds.
As shown in the following figure, iPCA monitors loss of incoming and outgoing
packets in area 1 (ENP cards), area 2 (switch fabric units), and area 3 (links), to
measure network quality and accurately identify failure points.
In area 1, iPCA treats each ENP card as an independent MIMO area, and measures
the packet loss ratio on each ENP card. Normally, the number of packets leaving an
ENP card is equal to the number of packets arriving at the ENP card. If the number
of outgoing packets is smaller than the number of incoming packets, packets have
been dropped.
The switch fabric units in area 2 are not programmable, but can be monitored
using ENP cards. Each switch fabric unit is connected to multiple ENP cards.
Packet loss on the switch fabric units and links between them can be measured by
counting incoming and outgoing packets on the ENP cards.

Figure 2-2 Typical networking for device-level measurement
(Figure labels, device view: CPU, switch fabric, ENP card, non-ENP card.)
(Figure labels, network view: branch, WAN, headquarters, eSight; iPCA device-level measurement is enabled.)

If all devices in the network or an area are Huawei agile devices, you only need to
enable device-level measurement on the agile devices. Then the agile devices can
provide Quality Awareness and accurate fault location.
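The way device-level measurement narrows a problem down to a card, switch fabric unit, or link can be sketched as follows. This is only an illustration: the component names, counter values, and the 0.1% threshold are hypothetical, and the sketch assumes that each monitored area already exposes one incoming and one outgoing packet count for the same interval.

```python
def loss_ratio(packets_in: int, packets_out: int) -> float:
    """Fraction of incoming packets that did not leave the monitored area."""
    if packets_in <= 0:
        return 0.0
    return max(packets_in - packets_out, 0) / packets_in


def components_over_threshold(counters: dict, threshold: float) -> dict:
    """Return {component: loss ratio} for every area whose ratio exceeds threshold.

    `counters` maps a component name to an (incoming, outgoing) packet pair.
    """
    ratios = {name: loss_ratio(p_in, p_out) for name, (p_in, p_out) in counters.items()}
    return {name: ratio for name, ratio in ratios.items() if ratio > threshold}


# Hypothetical counters for area 1 (ENP card), area 2 (switch fabric unit),
# and area 3 (link).
example_counters = {
    "ENP card 1": (100_000, 100_000),
    "switch fabric unit 1": (200_000, 199_000),
    "link 1": (50_000, 49_999),
}

print(components_over_threshold(example_counters, threshold=0.001))
```

With these values only the switch fabric unit exceeds the threshold, so that unit is the one an administrator would inspect first.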

2.2 Network-Level Measurement


iPCA network-level measurement applies to networks established by agile and
non-agile devices. To implement network-level measurement, you should be familiar
with iPCA implementation and be able to properly define the monitored domain.
The following conditions must be met to implement network-level measurement:
The monitored object is a network established by multiple network devices. The network can be a single-input-single-output system or a multiple-input-multiple-output system.
The monitored flow must traverse the monitored network, and cannot be generated or terminated in the network.
The edge devices (measurement systems) of the network must support iPCA.
Typical networking for network-level measurement works in the following way:
1. The access devices for video and VoIP services are Huawei agile devices.
Network-level measurement is deployed to monitor the quality and faults of
Huawei non-agile devices or third-party devices. In this way, problems in the
agile and non-agile domains can be distinguished and end-to-end service quality
can be monitored.

Non-agile devices and agile devices connected to them form a MIMO domain.
iPCA measurement is enabled on interfaces of Huawei agile devices. Based on the
packet conservation principle, input traffic volume is the total number of packets
sent from Huawei agile devices to non-agile devices in the domain, and output
traffic volume is the total number of packets that Huawei agile devices receive
from the non-agile devices. If input traffic volume is larger than output traffic
volume, packets have been dropped in the domain.
2. Measurement of the WAN (outside the campus network) is implemented as
follows:
Huawei agile devices at the edge of the WAN form a measurement domain.
The agile devices monitor all the interfaces through which packets are sent to
or from the WAN.
The agile devices count the total number of incoming and outgoing packets
to determine the domain's packet loss ratio.

(Figure labels: branch, WAN, headquarters, eSight; device with iPCA enabled.)
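The domain calculation used in network-level measurement can be sketched in Python as follows. The `EdgeInterface` type, the site names, and the counter values are hypothetical. The sketch relies on the conditions stated earlier: the monitored flows traverse the domain and are neither generated nor terminated inside it, so every packet sent into the domain should eventually be received from it on some edge interface.

```python
from typing import Iterable, NamedTuple


class EdgeInterface(NamedTuple):
    """Counters on one agile-device interface facing the monitored domain."""
    name: str
    sent_into_domain: int      # packets this interface sent into the domain
    received_from_domain: int  # packets this interface received from the domain


def domain_loss_ratio(interfaces: Iterable[EdgeInterface]) -> float:
    """Packet loss ratio of the whole domain, summed over all edge interfaces."""
    interfaces = list(interfaces)
    total_in = sum(i.sent_into_domain for i in interfaces)
    total_out = sum(i.received_from_domain for i in interfaces)
    if total_in == 0:
        return 0.0
    return max(total_in - total_out, 0) / total_in


# Hypothetical counters for three edge devices around a WAN.
edge = [
    EdgeInterface("headquarters", sent_into_domain=6000, received_from_domain=3900),
    EdgeInterface("branch A", sent_into_domain=2000, received_from_domain=2950),
    EdgeInterface("branch B", sent_into_domain=2000, received_from_domain=3050),
]

# 10000 packets entered the domain and 9900 left it.
print(domain_loss_ratio(edge))
```

Individual interfaces are not expected to balance on their own, because traffic enters at one edge and leaves at another. Only the totals over the whole domain are compared.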

3 Traditional Service Quality Measurements
3.1 Factors Affecting Service Quality
Packet loss: Packet loss is the most important factor affecting service quality. For
Transmission Control Protocol (TCP) flows on an IP network, packet loss triggers
retransmission and a rapid reduction of the TCP sending rate, which greatly reduces
transmission speed, increases response time, and lowers bandwidth utilization. For
User Datagram Protocol (UDP) flows (mainly video and voice flows), packet loss
severely affects service quality. Most quality issues in data, voice, and video
services, such as slow access speeds, delayed responses, video pixelation, and
unclear voice, are caused by packet loss. The following table summarizes the causes
of packet loss.

Table 3-1 Possible causes of packet loss

Type of packet loss: Controlled packet loss on network devices
  Network devices drop some packets according to certain rules. When the network is properly planned and network devices are correctly configured, controlled packet loss is mainly triggered by network attacks.
Causes:
  Packet loss in port queues: link bandwidth is insufficient
  ACL-triggered packet loss: the packets do not meet ACL rules, the ACL configuration is incorrect, or the network is attacked
  CAR-triggered packet loss: the traffic rate exceeds the CAR limit, the CAR configuration is incorrect, or the network is attacked
  Loss of packets with TTL of 0: a routing loop may exist on the network
  Packet loss caused by route absence: route calculation or routing configuration is incorrect, or the network is attacked
  Error packet loss: the configuration is incorrect or the network is attacked

Type of packet loss: Unexpected packet loss
  Packets are dropped due to link or hardware failures. Such packet loss is uncontrollable, and packets of any priority may be dropped. Unexpected packet loss is the major factor that affects service quality.
Causes:
  Small buffer: switches with small buffer sizes are used at incorrect positions and cannot handle heavy traffic on the network
  Link failure: optical fibers are broken, optical transceiver parameters are set incorrectly, or network cables are not properly connected
  Hardware failure: hardware components are aging or affected by harsh environments

Bandwidth: Network bandwidth is another fundamental element that affects
services. A service cannot be provided if the available bandwidth is lower than
the minimum bandwidth required by the service. Traditional Quality of Service
(QoS) designs focus on allocation of link bandwidth. TCP flows, which dominate
traffic on an IP network, adapt to the available bandwidth. That is, TCP reduces
its traffic rate when bandwidth is insufficient, so insufficient bandwidth does
not cause obvious packet loss in TCP flows if network devices have a large buffer
size. However, insufficient bandwidth causes severe packet loss in port queues for
UDP flows that carry video and voice services. When this occurs, the network cannot
deliver normal video and voice services. Note that sufficient bandwidth does not
necessarily mean high service quality, because quality is also affected by other
factors such as network failures, misuse, and misconfigurations.

Table 3-2 Impact of insufficient bandwidth on services

Service flow: TCP flow (data services, streaming media, desktop cloud)
Impact of insufficient bandwidth: Transmission speed is low. Packets normally are forwarded on devices with large buffer sizes, while devices with small buffer sizes drop a large number of packets.

Service flow: UDP flow (video, IPTV, and voice services)
Impact of insufficient bandwidth: A large number of service packets are dropped, and the network cannot deliver these services.

Latency/jitter: Packet loss and retransmission are the major factors that cause
latency and jitter in service transmission. A network's own latency and jitter are
significant only for voice, video, or desktop cloud services, which have high
requirements for real-time transmission.
End-to-end network latency = Signal transmission latency + Device forwarding
latency + Port buffering latency

Signal transmission and device forwarding latency are almost constant and only
need to be measured at the early stage of network construction. If signal
transmission and device forwarding latency cannot meet service requirements, you
need to adjust network planning, including the latency on primary and backup
paths. Real-time monitoring of signal transmission and device forwarding latency
is not very helpful for improving service quality.
The only factor that can cause changes in network latency is port buffering latency,
which is mostly caused by network overloads. Variable port buffering latency
depends on link loads. The only way to shorten variable latency is to avoid link
overloads by properly planning traffic transmission.
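The latency decomposition above can be worked through with a small Python sketch. All numbers are hypothetical. The buffering formula (queued bits divided by link rate) is a standard approximation introduced here for illustration, not a formula taken from this document.

```python
def buffering_latency_s(queued_bytes: int, link_rate_bps: float) -> float:
    """Time a newly arrived packet waits behind `queued_bytes` already in the port queue."""
    return queued_bytes * 8 / link_rate_bps


def end_to_end_latency_s(transmission_s: float, forwarding_s: float, buffering_s: float) -> float:
    """End-to-end latency = signal transmission + device forwarding + port buffering."""
    return transmission_s + forwarding_s + buffering_s


# Hypothetical path: 5 ms signal transmission and 0.1 ms device forwarding (both
# near constant), with the egress port on a 1 Gbit/s link.
TRANSMISSION_S = 0.005
FORWARDING_S = 0.0001
LINK_RATE_BPS = 1e9

for queued_bytes in (0, 125_000, 1_250_000):
    buffering = buffering_latency_s(queued_bytes, LINK_RATE_BPS)
    total = end_to_end_latency_s(TRANSMISSION_S, FORWARDING_S, buffering)
    print(f"queue={queued_bytes} bytes  buffering={buffering * 1000:.1f} ms  total={total * 1000:.1f} ms")
```

Only the buffering term changes between the three cases, which is why link load, rather than the constant terms, is what planning can influence after the network is built.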
Timely identification and resolution of user experience issues: Service quality
guarantees are important measures for identifying and resolving problems that
affect user experience. A "problem" may include but is not limited to a failure. A
failure is a problem that interrupts services. Many other problems do not
interrupt services but can still degrade service quality, for example,
packet loss on links, hardware-triggered packet loss, incorrect configuration,
incorrect network planning, and network attacks.

3.2 Quality Guarantee and Fault Location Difficulties on IP Networks
On traditional connection-oriented networks such as Asynchronous Transfer Mode
(ATM) and Frame Relay (FR) networks, each Virtual Channel (VC) has an identifier.
Quality monitoring and guarantees and fault location are performed based on the
Operation, Administration and Maintenance (OAM) settings of each VC.
Currently, most IP networks use coarse-grained bandwidth management policies
and do not have quality monitoring and guarantee mechanisms. As a result, the
networks provide only connectivity and cannot ensure good user experience.
However, the networks and network administrators are unaware of these issues
because there is no system to monitor service quality on the entire network.
Administrators try to locate network problems only after receiving complaints from
users. Even then, it often takes a long time to locate and solve a problem due to
lack of real-time monitoring mechanisms and effective problem location methods.
This problem location process is inefficient and severely affects user experience.

3.3 Problems in Traditional IP Network Quality Measurement Technologies
Some technologies, such as ping, BFD/NQA, and Y.1731, have been developed to monitor
network quality and check connectivity on IP networks. However, these
technologies require point-to-point connections, and many connections need to be
created to monitor transmission quality of all services. In addition, the application
of these technologies is restricted by their disadvantages such as inconsistent
service paths and small scope of fault detection. For these reasons, they are mainly
used for quality monitoring or fault location between a few nodes or on private
links, and cannot provide effective quality measurement and fault detection.
Traditional network quality measurement technologies have the following problems:
N² connection issue of point-to-point monitoring: N nodes on a network must
set up N×(N-1)/2 bidirectional connections or N×(N-1) unidirectional connections.
When there are many nodes on a network, network expansion is difficult.

Inconsistent service paths: If load balancing functions such as Eth-Trunk+VSS
and ECMP are configured, NQA/Y.1731 packets are not transmitted over the
same paths as the service packets. Therefore, NQA/Y.1731 cannot accurately
monitor service flows.
Inability to detect some failure points: For example, NQA and BFD probe packets
are transmitted along the path CPU → control channel → uplink forwarding
channel → uplink interface. Service packets, however, are transmitted along the
path access interface → access forwarding channel → switch fabric → uplink
forwarding channel → uplink interface. Therefore, NQA and BFD cannot detect
failures of access links, access interface cards, or switch fabric units. In
addition, the latency and jitter they measure mostly reflect CPU processing
latency rather than real network latency and jitter.
Inability to simulate real service traffic through out-of-band measurement: NQA
and BFD use out-of-band traffic to simulate service traffic, but out-of-band traffic
is only a sample of real service traffic and is much slower than real service traffic.
Therefore, NQA and BFD can only detect network disconnections or severe
packet loss. Increasing out-of-band traffic only worsens network congestion.
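Two of the problems listed above can be put into numbers with a short Python sketch. The connection counts follow directly from the N×(N-1)/2 and N×(N-1) expressions. The detection probability is an added illustration that assumes each probe packet is lost independently with the same probability as service packets, which is a simplification and not a model stated in this document. The function names are hypothetical.

```python
def bidirectional_connections(nodes: int) -> int:
    """Point-to-point sessions needed when one session covers both directions."""
    return nodes * (nodes - 1) // 2


def unidirectional_connections(nodes: int) -> int:
    """Point-to-point sessions needed when each direction needs its own session."""
    return nodes * (nodes - 1)


def probe_detection_probability(loss_ratio: float, probes: int) -> float:
    """Chance that at least one of `probes` probe packets is lost.

    Assumes independent loss with probability `loss_ratio` per probe.
    """
    return 1.0 - (1.0 - loss_ratio) ** probes


# Connection count grows quadratically with the number of nodes.
for n in (10, 100, 1000):
    print(n, bidirectional_connections(n), unidirectional_connections(n))

# A sparse probe stream (10 probes per interval here) rarely sees light loss
# but almost always sees severe loss.
for ratio in (0.001, 0.2):
    print(ratio, probe_detection_probability(ratio, probes=10))
```

Under this assumption, ten probes notice a 0.1% loss ratio in roughly one interval out of a hundred, while a 20% loss ratio is noticed in most intervals. This matches the observation that out-of-band probing mainly detects disconnections or severe packet loss.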

4 Summary
4.1 iPCA Identifies and Locates Problems
iPCA technology provides quality and problem detection capabilities in connectionless
networks. If packet loss is detected in an iPCA domain, network devices can trigger
alarms according to the preconfigured alarm threshold (packet loss ratio and
duration). In this way, the fault location scope is narrowed from the entire network to
an iPCA domain, greatly improving the efficiency of fault location and rectification. To
analyze the specific cause of packet loss, network administrators need to check the
packet counters of iPCA and other device information.
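An alarm threshold made of a packet loss ratio and a duration could be evaluated along the following lines. This is a sketch under assumptions: the document does not specify how the duration is evaluated, so the example treats it as a number of consecutive measurement intervals. The function name and sample values are hypothetical.

```python
from typing import Iterable


def alarm_raised(loss_ratios: Iterable[float], ratio_threshold: float, duration_intervals: int) -> bool:
    """Return True if the loss ratio stays above the threshold for
    `duration_intervals` consecutive measurement intervals."""
    consecutive = 0
    for ratio in loss_ratios:
        # Count how long the threshold has been exceeded without interruption.
        consecutive = consecutive + 1 if ratio > ratio_threshold else 0
        if consecutive >= duration_intervals:
            return True
    return False


# Hypothetical per-interval loss ratios measured for one iPCA domain.
sustained = [0.0, 0.02, 0.03, 0.0, 0.02]
intermittent = [0.02, 0.0, 0.02, 0.0]

print(alarm_raised(sustained, ratio_threshold=0.01, duration_intervals=2))     # True
print(alarm_raised(intermittent, ratio_threshold=0.01, duration_intervals=2))  # False
```

Requiring a duration as well as a ratio keeps a single short burst of loss from raising an alarm, at the cost of reporting sustained problems slightly later.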
iPCA significantly improves fault location efficiency while ensuring network quality
and user experience. Network administrators must have certain network knowledge
and fault location experience to find the specific causes of network problems.

Copyright Huawei Technologies Co., Ltd. 2014. All rights reserved.
No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.

Trademark Notice
HUAWEI and the Huawei logos are trademarks or registered trademarks of Huawei Technologies Co., Ltd.
Other trademarks, product, service and company names mentioned are the property of their respective owners.

General Disclaimer
The information in this document may contain predictive statements including,
without limitation, statements regarding the future financial and operating results,
future product portfolio, new technology, etc. There are a number of factors
that could cause actual results and developments to differ materially from those
expressed or implied in the predictive statements. Therefore, such information
is provided for reference purpose only and constitutes neither an offer nor an
acceptance. Huawei may change the information at any time without notice.

HUAWEI TECHNOLOGIES CO., LTD.
Huawei Industrial Base
Bantian Longgang
Shenzhen 518129, P.R. China
Tel: +86-755-28780808
Version No.: M3-032102-20140218-C-1.0
www.huawei.com