Abstract
This document presents some typical FTP behaviour as seen on mobile
networks. Each point is illustrated with trace captures or figures and the
mechanisms or parameters involved are highlighted.
Audience:
People performing e2e performance tests and/or analysing e2e results
People performing radio, core or service platforms performance tests and
analysis looking to understand the origin of the behaviour observed
Prerequisites:
For details on Wireshark (Ethereal), refer to "To Start - Wireshark training.ppt"
For a general overview of all data protocols in mobile networks, refer to
"To Start - Data protocols in mobile networks"
Playground
Simplified view - key points
[Figure: TCP connection; end-point labels: "TCP behaviour & parameters? (RWIN)" and "TCP behaviour?"]
Once the radio aspect has been taken care of (coverage, congestion, mobility),
the #1 issue is TCP PACKET LOSSES
Congestion on interfaces or buffer overflow?
Firewall, IP Router or Ethernet card bad configuration?
Side effects due to policing (GGSN) or content billing platforms?
then TCP parameters can limit the max possible performance
RWIN
Agenda
Introduction:
Why look at FTP performance?
Reminder:
Time/sequence graph on Wireshark
More about TCP
TCP behaviours: Slow start / Acknowledgments / Retransmissions
TCP parameters: RWIN / SACK / MTU-MSS / Window scaling / Timestamping
TCP typical call-flow
to conclude
Impact of mobile network:
Bursts: DL bursts, UL bursts
Mobility: Intra-system, Inter-system
Windows OS specificities
-UL FTP capture on client side - XP
Conclusion
What to secure and check first?
In practice: trace route / quick look at TCP/IP messages
Annex:
-Performing FTP transfers: DOS, scripts, tools
-Updating Windows registry - tools
Introduction
However, the key point is that all those protocols are based on TCP/IP and
the main driver for performance is TCP. Most of this document discusses TCP
behaviors and performance.
Note: Do not use FTP or TCP when looking for the pure radio performance of a wireless system. UDP is more suitable
in this case, because it does not provide any congestion control mechanism.
Reminder
[Figure: capture; legend: Data, ACK]
Graphical view of :
-TCP data segments and ACK.
-CWIN: amount of data for which
no ACK has been received
-RWIN: maximum value for CWIN
-RTT: time between data sending
and ACK receiving for the same
segment.
Time/Sequence graph on Wireshark
DL FTP example - Receiver side
[Figure: capture; legend: Data, ACK; annotations: RWIN, TCP segments, TCP ACK]
Graphical view of:
-TCP data segments and ACK. ACKs are sent almost immediately by the client: one ACK
for 2 segments.
-CWIN: NO (server side only).
-RWIN: yes but not relevant
-RTT: NO (server side only)
More about TCP - TCP behaviours
Note: This is the basic typical TCP behaviour. Some implementations may differ.
Acknowledgement strategy
TCP DATA packets are numbered with a sequence number (unit: bytes).
TCP ACK packets acknowledge all data up to a given sequence number.
To avoid overloading the reverse path, the goal for the receiver is to generate
1 ACK message for 2 DATA packets.
Delayed ACK is used for that purpose: if only 1 data segment has been
received, wait for a 2nd one during Delayed ACK timer, then send the ACK.
Note: in case some data packets have been lost, the receiver detects that a
segment is missing (with sequence number) and generates a duplicate ACK
without waiting for the delayed ACK timer.
Note: To go further, see Windows registry parameters: (Modification is not required)
TcpDelAckTicks (HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters) and TcpAckFrequency (new in XP, same location)
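The acknowledgement strategy above can be sketched as a small receiver model. This is a simplified illustration, not a real TCP stack: the function name `receiver_acks` is hypothetical, every segment is assumed to carry exactly `mss` bytes, and the delayed ACK timer is represented only by a final flush.

```python
def receiver_acks(segments, mss=1460):
    """Return the ACK numbers a receiver would emit for the arriving segments.

    segments: starting sequence numbers (bytes), in arrival order.
    Assumption: each segment carries exactly `mss` bytes.
    """
    acks = []
    next_expected = 0   # lowest byte not yet received in order
    buffered = set()    # out-of-order segments waiting for the gap to be filled
    unacked = 0         # in-order segments received since the last ACK
    for seq in segments:
        if seq == next_expected:
            next_expected += mss
            gap_filled = False
            # Absorb buffered segments that are now in order.
            while next_expected in buffered:
                buffered.remove(next_expected)
                next_expected += mss
                gap_filled = True
            unacked += 1
            # Goal: 1 ACK for 2 DATA packets (ACK at once when a gap closes).
            if unacked == 2 or gap_filled:
                acks.append(next_expected)
                unacked = 0
        elif seq > next_expected:
            # A segment is missing: duplicate ACK, sent without waiting.
            buffered.add(seq)
            acks.append(next_expected)
            unacked = 0
        # seq < next_expected: old duplicate, ignored in this sketch.
    if unacked:
        acks.append(next_expected)  # delayed ACK timer expiry
    return acks
```

For example, with 1000-byte segments arriving as 0, 1000, 3000, 4000, 2000, the model emits one normal ACK, two duplicate ACKs for byte 2000, then a cumulative ACK once the missing segment arrives.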
Retransmissions
1- Timeouts
The server continuously computes the RTD of the connection, i.e. the time
needed to receive an ACK after sending a packet.
If for a given segment, no ACK is received after RTO = 2*RTD, a timeout occurs
and the packet is retransmitted.
After a timeout, the congestion window (cwnd) is reduced to the minimum (2) and
the transfer is resumed with slow start.
If the retransmitted packet is not acknowledged, the RTO is doubled (exponential
timeout). The server stops the transfer definitively if nothing is received for 1 min.
Note that RTD computation is not straightforward: the value is typically smoothed, e.g. $RTD_n = \alpha \cdot RTD_{cur} + (1 - \alpha) \cdot RTD_{n-1}$
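The smoothing and the exponential timeout can be sketched as follows. This follows the simplified model of this slide (RTO = 2*RTD, doubled after each unacknowledged retransmission). The weight `alpha = 0.125` is an assumed value, the function names are hypothetical, and the 1 min limit is modelled simply as a cap on the elapsed time.

```python
def smooth_rtd(samples, alpha=0.125):
    """Smoothed RTD after each sample: RTD_n = alpha*RTD_cur + (1-alpha)*RTD_(n-1).

    alpha is an assumed weight; real stacks may use another value.
    """
    rtd = samples[0]
    history = [rtd]
    for cur in samples[1:]:
        rtd = alpha * cur + (1 - alpha) * rtd
        history.append(rtd)
    return history


def retransmission_times(rtd, max_wait=60.0):
    """Times (seconds after the first send) at which a never-acknowledged
    segment is retransmitted, with RTO = 2*RTD doubled after each attempt."""
    rto = 2 * rtd
    t = rto
    times = []
    while t <= max_wait:
        times.append(t)
        rto *= 2      # exponential timeout
        t += rto
    return times
```

With an RTD of 0.5 s, the sketch retransmits 1, 3, 7, 15 and 31 s after the first send, then gives up.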
Retransmissions
2- Fast retransmit / Fast recovery (1/2)
Fast retransmit is a way to retransmit the packets lost without waiting for
the timeout to expire.
If the server receives several (typically 2) duplicate ACKs reporting the same
sequence number, it is very likely that the packet just after has been lost. It
is then immediately retransmitted.
After Fast recovery, the server state is congestion avoidance, i.e. linear
increase of cwnd up to RWIN.
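The duplicate ACK detection that triggers fast retransmit can be sketched as below. This is an illustration only: `fast_retransmit_points` is a hypothetical helper, and the threshold of 2 duplicate ACKs is the typical value quoted above (other stacks may use a different threshold).

```python
def fast_retransmit_points(acks, dup_threshold=2):
    """Return the sequence numbers that would be fast-retransmitted.

    acks: ACK numbers in the order the server receives them.
    A retransmission is triggered when the same ACK number has been
    repeated `dup_threshold` times after its first occurrence.
    """
    retransmits = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == dup_threshold:
                # The segment starting at `ack` is very likely lost.
                retransmits.append(ack)
        else:
            last_ack = ack
            dup_count = 0
    return retransmits
```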
More about TCP - TCP parameters
Note: RWIN can be set in Windows registry with TcpWindowSize in HKEY_LOCAL_MACHINE \ SYSTEM \
CurrentControlSet \ Services \ Tcpip \ Parameters.
Other parameters: TcpWindowSize / interface and GlobalMaxTcpWindowSize (Max limit of RWIN whatever other
settings)
RWIN (3/4)
Windows 2000/XP Typical values
On Windows 2000 / XP, RWIN is common to all network technologies (see next slide for
VISTA).
A trade-off is needed for the value to be correct for GPRS, EDGE, 3G, HSDPA, HSUPA
but also ADSL with Ethernet cable or WIFI!
Without SACK option, the receiver indicates the next expected segment in the
Acknowledgements. The sender knows that all frames before this one have been
received.
With SACK option, the receiver may also report the frames correctly received after a
possible lost frame. The sender knows that there is no point in retransmitting them.
This gives better retransmission management when several successive packets are lost.
Note: SACK option can be set in Windows registry with SackOpts in HKEY_LOCAL_MACHINE \ SYSTEM \
CurrentControlSet \ Services \ Tcpip \ Parameters. (Win XP)
MTU (Maximum Transmission Unit)
MSS (Maximum Segment Size)
MTU is the maximum frame size at IP level, including headers.
1500 bytes is the maximum MTU permitted on Ethernet. A high MTU leads to a lower header impact
on throughput (<3% for 1500) and a faster TCP slow-start.
MTU can be set in Windows registry to control the maximum size of DL frames.
MTU is used by TCP stack to compute the MSS, i.e. maximum amount of data
carried in one TCP packet.
If timestamping is not activated,
TCP header = 20 bytes and IP header = 20 bytes,
so: MSS = MTU - 40 (TCP + IP headers)
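The MSS computation can be checked with a few lines. This is a sketch that assumes IPv4 without IP options (20-byte IP header). The 32-byte TCP header with timestamping is the value given in the Timestamping note of this document. The function names are hypothetical.

```python
IP_HEADER = 20                   # bytes, IPv4 without options (assumption)
TCP_HEADER = 20                  # bytes, no TCP options
TCP_HEADER_WITH_TIMESTAMP = 32   # bytes, timestamp option padded to 32 bits


def mss(mtu, timestamps=False):
    """Maximum amount of data carried in one TCP packet for a given MTU."""
    tcp = TCP_HEADER_WITH_TIMESTAMP if timestamps else TCP_HEADER
    return mtu - IP_HEADER - tcp


def header_overhead(mtu, timestamps=False):
    """Fraction of each full-size IP packet used by TCP + IP headers."""
    return 1 - mss(mtu, timestamps) / mtu
```

For MTU = 1500 this gives MSS = 1460 without timestamping and 1448 with it, and a header share below 3%.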
The MSS is included in the TCP 3-way handshake to inform the other peer of the maximum
size that can be received. Example of TCP handshake initiated by MS:
SYN informs of the maximum size that the server can send to the mobile (i.e. in DL)
SYN-ACK informs of the maximum size that the mobile can send to the server (i.e. in UL)
ACK no info about MSS
There is no MSS negotiation during the handshake : it is only information exchange.
Window Scaling
RWIN value is coded with 16bits, so maximum value is 0xFFFF, i.e. 65535bytes.
This may not be enough for HSxPA networks (refer to previous RWIN section).
Window Scaling option has been implemented to allow RWIN > 65535, by multiplying the
RWIN advertised in the ACK by 2^n.
The option is activated by default in Windows 2000 / XP / VISTA. It is used when RWIN is
defined as > 65535 bytes for W2k / XP and always used for VISTA.
Note: The same registry key is used to activate 2 TCP options: Window Scaling and Timestamp: Tcp1323Opts in
HKEY_LOCAL_MACHINE \ SYSTEM \ CurrentControlSet \ Services \ Tcpip \ Parameters.
0 (disable both options) / 1 (window scaling enabled only) / 2 (timestamps enabled only) / 3 (both options enabled)
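The scaling arithmetic and the Tcp1323Opts values can be illustrated as below. This is a sketch with hypothetical function names. The upper limit of 14 on the exponent comes from RFC 1323, not from this document, and the sketch picks the smallest exponent that fits, which is not necessarily what a given TCP stack chooses.

```python
MAX_UNSCALED = 0xFFFF   # largest value of the 16-bit RWIN field
MAX_SHIFT = 14          # upper limit on the exponent n (RFC 1323)


def window_scale(rwin):
    """Smallest exponent n such that rwin / 2^n fits in the 16-bit field."""
    n = 0
    while (rwin >> n) > MAX_UNSCALED:
        n += 1
    if n > MAX_SHIFT:
        raise ValueError("RWIN too large for window scaling")
    return n


def advertised(rwin):
    """Return (16-bit field value, exponent n, effective window in bytes)."""
    n = window_scale(rwin)
    field = rwin >> n
    return field, n, field << n


def decode_tcp1323opts(value):
    """Decode the Tcp1323Opts registry value (0 to 3) described in the note."""
    return {"window_scaling": bool(value & 1), "timestamps": bool(value & 2)}
```

For instance, an RWIN of 256960 bytes is advertised as 64240 with n = 2.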
Timestamping (1/2)
When no timestamp is included in TCP packets, estimating the RTD may be
difficult.
On sender side: RTD is the time to receive the ACK after sending a packet.
How to associate an ACK with a packet when the packet has been retransmitted?
On receiver side: How to compute the RTD?
Duration between packet and ACK is almost null (stay local): not possible this way.
How to know when the packet has been sent? When the ACK will be received?
Some algorithms exist but are not completely reliable.
Note: The same registry key is used to activate 2 TCP options: Window Scaling and Timestamp: Tcp1323Opts in
HKEY_LOCAL_MACHINE \ SYSTEM \ CurrentControlSet \ Services \ Tcpip \ Parameters.
0 (disable both options) / 1 (window scaling enabled only) / 2 (timestamps enabled only) / 3 (both options enabled)
Note: Timestamp option is 10bytes long but TCP header size must be a multiple of 32bits (4bytes). As a result, TCP
header goes from 20 to 32 bytes when using timestamping (and not 30bytes as we might have expected).
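The padding rule can be written in one line of arithmetic. A minimal sketch: `tcp_header_size` is a hypothetical helper, and the 60-byte ceiling is the general TCP header limit, not a value taken from this document.

```python
def tcp_header_size(option_bytes=0):
    """TCP header size in bytes, padded to a multiple of 32 bits (4 bytes)."""
    if option_bytes < 0:
        raise ValueError("option_bytes must be >= 0")
    size = 20 + option_bytes          # 20-byte base header + options
    padded = (size + 3) // 4 * 4      # round up to the next multiple of 4
    if padded > 60:                   # TCP header cannot exceed 60 bytes
        raise ValueError("too many option bytes for a TCP header")
    return padded
```

With the 10-byte timestamp option the result is 32 bytes, as stated in the note.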
More about TCP - Typical call flow
[Figure: protocol stack (FTP, TCP, IP, BSS protocols, core protocols) across MOBILE, BSC / PCU, SGSN, GGSN and INTERNET]
Radio retransmission: Normal BLER (Block Error Rate) depending on radio
conditions and BSS strategy (link adaptation)
EDGE (GPRS)
DL bursts
When capturing IP trace of DL FTP transfer on client side, EDGE transfers
typically present the following pattern:
[Client side] We can notice a lot of small bursts,
presenting instantaneous throughput higher
than maximum theoretical radio throughput.
This is an artifact coming from:
-EDGE radio BLER
-In-order delivery from RLC to LLC layers.
On radio interface,
BLER >10% is not unusual in EDGE.
RLC blocks are retransmitted after 200~300ms.
During this timeframe, the blocks coming next are
buffered by RLC layer [in-order delivery]
When the erroneous block has been correctly
retransmitted, all packets are sent to upper layer
almost instantly.
Note that GPRS BLER is typically much lower, in
particular when only CS1-2 are used.
EDGE (GPRS)
UL bursts
When capturing IP trace of DL FTP transfer on server side, EDGE transfers with
old mobiles (before 3GPP Rel4) typically present bursts of ACK in UL.
On the contrary, those bursts are not visible on client side, meaning that the TCP
stack sends ACK regularly.
We may think of UL BLER; however, the most impacting point is the support of
Extended UL TBF feature on BSS side.
[Figure: protocol stack (FTP, TCP, IP) across MOBILE, UTRAN, SGSN, GGSN and INTERNET]
Radio retransmission: Target BLER (Block Error Rate) set in UTRAN
UMTS R99
DL bursts
When capturing IP trace of DL FTP transfer on client side, UMTS R99
transfers typically present the following pattern:
[Client side] We can notice a few small bursts,
presenting instantaneous throughput higher
than maximum theoretical radio throughput.
This is an artifact coming from:
-Target BLER
-In-order delivery from RLC to upper layers.
Notes:
The phenomenon gets more and more important when the throughput increases.
For a reason unknown to date, this is mostly observed on ALU UTRAN
This is not observed with HSUPA as the retransmissions are very fast on E-DCH.
HSDPA
DL behavior with UL ACK bursts
The pattern observed on IP trace is the following on server side:
2/ ACK burst: all buffered ACKs are sent at once by the RNC
Note that the main problem for performance is the TCP retransmissions after packet losses in Core network. The idle
periods while waiting for ACK retransmissions have much less impact on overall throughput, or even no impact at all if
the RWIN is correctly set (i.e. large enough).
Impacts of mobile networks - Mobility
EDGE (GPRS) mobility
With LLC rerouting
Most retransmissions are generally avoided.
However, there may be a problem if the reselection takes too long, in which
case the server may retransmit some (or all) packets due to TCP timeout,
even though they were not lost.
The most efficient solution is then to activate both NACC and LLC-rerouting:
reselections have much less impact on the applicative layer if there is only a very
short gap and no packet loss.
Notes:
The lack of LLC rerouting is even more impacting in case of video streaming, where retransmissions are impossible (for
streaming over UDP)
3G (R99, HSDPA, HSUPA) mobility
Reminders
Soft handovers are available in UMTS R99 for voice and packet calls.
As a result, mobility in itself does not have any impact at application level.
No interruption / No data loss: same performance in mobility & in static
2G->3G radio signaling duration is shorter because LAU & RAU are performed
simultaneously.
2G->3G radio procedures may be sped up by:
Combined MSC/SGSN, to speed up LAU/RAU
3G-2G intersystem mobility
Applicative impact
3G-2G mobility is a very long procedure that strongly impacts user experience.
This is due to TCP behavior that uses exponential timeouts, as described in the
example below.
RT1 is the first timeout after 1s (example)
Other timeouts are exponential, waiting 2, 4 and 8s.
[Figure: timeline in seconds with marks 1, 3, 7, 10, 15 and retransmission timeouts RT1 to RT4. The radio interruption runs from the beginning to the end of intersystem mobility; the TCP interruption lasts until a TCP packet reaches the UE and the transfer restarts.]
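The effect of exponential timeouts on the interruption seen by TCP can be sketched as follows. This is a simplified model with a hypothetical function name: it assumes that every retransmission sent while the radio link is interrupted is lost, that the first timeout is 1 s as in the example, and it ignores transmission delays.

```python
def tcp_interruption(radio_interruption, first_timeout=1.0):
    """Time (s) of the first retransmission sent once the radio link is back.

    Retransmissions happen after first_timeout, then after waits that
    double each time (2, 4, 8 s, ...), i.e. at 1, 3, 7, 15 s, ...
    """
    t = first_timeout
    wait = first_timeout
    while t < radio_interruption:
        wait *= 2     # exponential timeout
        t += wait
    return t
```

With a hypothetical radio interruption of 10 s, the retransmissions at 1, 3 and 7 s are lost and the transfer only restarts at 15 s: the TCP interruption is 5 s longer than the radio interruption.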
[Figure annotation: Offset due to buffering?]
Those 3 points show undoubtedly that we do not have direct access to the
TCP stack. There is at least an offset between the ACK seen in DL and the
packets seen in UL, probably due to some buffering internal to Win XP.
Only rely on DL to extract timestamp or interruption time on Win XP.
Conclusion
What to secure and check first?
Recommendations
In practice:
Perform a trace route to internet
Quick look at the TCP/IP messages
Recommendations
Mastering end-to-end performance in mobile networks requires
addressing very different aspects:
see next slides to understand how to detect TCP problems from traces
In practice
What to check first?
Perform a trace route to internet
Trace route principle is to use the Time-To-Live (TTL) field in IP header to measure
accessibility, reliability and response time of all IP routers between the client and the server.
A PING request is sent with a TTL field increased from 1 to the maximum value.
Each router decreases the TTL by 1 and should return a specific message (TTL exceeded
in transit) when reaching 0. On a mobile network, the first router to answer is the GGSN.
Note that trace route may not be relevant in certain configurations (e.g. firewall blocking PING
messages, congested routers that respond very slowly, ...)
Notes: Cyberkit (freeware: http://www.snapfiles.com/get/cyberkit.html) is a good choice to have full and easy control
over the trace route procedure.
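The TTL principle can be illustrated with a small simulation. This is not a real trace route: it sends no packets, and the hop names in the usage example are hypothetical. It only shows which node answers for each TTL value, and what happens when a node stays silent.

```python
def trace_route(path, max_ttl=30):
    """Simulate trace route over a known path.

    path: list of (name, answers) tuples, the last one being the destination.
    Returns a list of (ttl, responder), with "*" for a node that stays silent.
    """
    if not path:
        return []
    results = []
    for ttl in range(1, max_ttl + 1):
        # Each node decreases TTL by 1: the request stops at node number
        # `ttl`, or at the destination if the path is shorter.
        hop = min(ttl, len(path)) - 1
        name, answers = path[hop]
        results.append((ttl, name if answers else "*"))
        if hop == len(path) - 1:
            break     # destination reached
    return results


# Hypothetical path on a mobile network: the GGSN is the first router to answer.
example = trace_route([("GGSN", True), ("router-1", False),
                       ("router-2", True), ("server", True)])
```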
In practice
What to check first?
Quick look at the TCP/IP messages
Before developing your own personal ways of analyzing IP
traces, here are some steps you can follow.
Is it regular or not?
if yes, zoom to be sure
regular pattern tends to show that there is no problem at applicative level.
(bottleneck may come from any other network interface, including radio)
you can see small steps, but the overall pattern is regular.
It may be due to radio BLER (typical for EDGE or HSDPA, see previous
sections)
If the small steps are regular, it tends to show an applicative limitation. In
particular, if the size of the steps is roughly equal to RWIN, the performance is
most probably limited by a too small RWIN.
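The RWIN limitation can be quantified: since at most one RWIN of data can be unacknowledged at any time, the throughput cannot exceed RWIN / RTT. The sketch below uses hypothetical function names, and the values in the usage note are illustrative only.

```python
def max_throughput_kbps(rwin_bytes, rtt_s):
    """Upper bound on TCP throughput (kbit/s): one RWIN per round trip."""
    return rwin_bytes * 8 / rtt_s / 1000


def required_rwin(throughput_kbps, rtt_s):
    """RWIN (bytes) needed to sustain a target throughput at a given RTT."""
    return throughput_kbps * 1000 * rtt_s / 8
```

For example, an RWIN of 17520 bytes with a 500 ms RTT cannot exceed about 280 kbit/s, and sustaining 3600 kbit/s at 100 ms RTT needs an RWIN of about 45000 bytes.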
Note: DOS UL throughput is not reliable for small files (i.e. <500 kB).
Transfer duration may be extracted from Wireshark IP trace in that case.
Performing FTP transfers
DOS scripts
It may be useful to write short scripts to log to the server and perform FTP
transfers automatically.
We need two files: FTP_cmd.bat and FTP_instructions.txt.
Example:
FTP_cmd.bat:
ftp -i -s:FTP_instructions.txt IP@        (IP@ = server IP address; -i disables the per-file prompt of mget)
FTP_instructions.txt:
user
password
lcd c:\temp        change local directory
mget 1Mo_*        download all files beginning with 1Mo_
bye        close the FTP session
Note: Many FTP servers are also available for free (Filezilla Server, ...)
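The same automation can be sketched in Python with the standard ftplib module, which also gives access to the transfer duration. This is only a sketch under assumptions: the function names are hypothetical, the host, user and password are placeholders to be supplied by the caller, and the measured duration includes FTP control exchanges, so it stays unreliable for small files.

```python
import fnmatch
import os
import time
from ftplib import FTP


def select_files(names, pattern="1Mo_*"):
    """Keep the file names matching the pattern (same idea as mget 1Mo_*)."""
    return [n for n in names if fnmatch.fnmatchcase(n, pattern)]


def throughput_kbps(size_bytes, duration_s):
    """Average throughput in kbit/s for a transfer."""
    return size_bytes * 8 / duration_s / 1000


def download_all(host, user, password, pattern="1Mo_*", local_dir="."):
    """Download all matching files; return {file name: throughput in kbit/s}."""
    results = {}
    with FTP(host) as ftp:
        ftp.login(user, password)
        for name in select_files(ftp.nlst(), pattern):
            sizes = []
            start = time.monotonic()
            with open(os.path.join(local_dir, name), "wb") as out:
                def store(chunk):
                    out.write(chunk)          # save the data locally
                    sizes.append(len(chunk))  # count the received bytes
                ftp.retrbinary("RETR " + name, store)
            duration = time.monotonic() - start
            results[name] = throughput_kbps(sum(sizes), duration)
    return results
```

A call such as download_all(host, user, password, local_dir="c:\\temp") would mirror the DOS example; an IP trace remains the reference for precise durations.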
Annex - Updating Windows registry
Tools
Updating Windows registry
Tool
The easiest way to update the Windows registry is to use a dedicated tool, like
Dr.TCP, available at http://www.dslreports.com/drtcp