02a - FTP Generic Optim.1.04 ALU

MOBILE NETWORKS
FTP E2E diagnostic & optimization

Guidelines
Christophe PLANTIN,
Regional Service delivery EMEA
V1.04
Abstract
This document presents some typical FTP behaviour as seen on mobile
networks. Each point is illustrated with trace captures or figures, and the
mechanisms or parameters involved are highlighted.
No magic optimisation rule is provided though; the appropriate actions
depend on local configurations and constraints.
In any case, understanding the behaviour encountered is the first step in
finding a satisfying solution.
Audience:
People performing e2e performance tests and/or analysing e2e results
People performing radio, core or service platforms performance tests and
analysis looking to understand the origin of the behaviour observed
Prerequisites:
For details on Wireshark (Ethereal), refer to To Start- Wireshark
training.ppt
For general overview about all data protocols in mobile networks, refer to
To Start - Data protocols in mobile networks
Playground
Simplified view - key points
[Figure: simplified end-to-end view. Labels: "TCP behaviour & parameters? (RWIN)", "TCP behaviour?", "TCP connection"]
Once the radio aspect has been taken care of (coverage, congestion, mobility),
the #1 issue is TCP PACKET LOSSES
Congestion on interfaces or buffer overflow?
Firewall, IP Router or Ethernet card bad configuration?
Side effects due to policing (GGSN) or content billing platforms?
then TCP parameters can limit the max possible performance
RWIN
Agenda
Introduction:
Why look at FTP performance?
Reminder:
Time/sequence graph on Wireshark
More about TCP
TCP behaviours Slow start / Acknowledgments / Retransmissions
TCP parameters RWIN / SACK / MTU-MSS / Window scaling / Timestamping
TCP typical call-flow
to conclude
Impact of mobile network:
Bursts DL bursts, UL bursts
Mobility Intra-system, Inter-system
Windows OS specificities
-UL FTP capture on client side - XP
Conclusion
What to secure and check first ?
In practice trace route / quick look at TCP/IP messages
Annex:
-Performing FTP transfers DOS, scripts, tools
-Updating windows registry - tools
Introduction

FTP [File Transfer Protocol] is the basic solution to perform reliable file
transfer. It may be run from the DOS command line or with any FTP client (see annex).
It is not widely used by end-users: typical real-life downloads or uploads are
performed with:
WEB browser, e.g. Internet Explorer, Firefox, over HTTP, or
e-mail client, e.g. Outlook, Thunderbird, over POP, SMTP, IMAP.
However, the key point is that all those protocols are based on TCP/IP and
the main driver for performance is TCP. Most of this document discusses TCP
behaviours and performance.
FTP gives an easy-to-understand and representative insight that
reflects any transfer method over TCP.
Note: Do not use FTP or TCP when looking for the pure radio performance of a wireless system. UDP is more suitable
in this case, because it does not provide any congestion control mechanism.
Reminder
Time/Sequence graph on Wireshark

Time/sequence graph is THE essential tool for TCP analysis
and, as a consequence, for FTP performance analysis.
It can be drawn from client or server traces.
On receiver side: it shows the performance.
On sender side: it helps to understand TCP behaviour.
Basics: for a given TCP socket (i.e. TCP port), TCP data and TCP ACK
are represented on the same graph.
Both are plotted according to their sequence number.
- Data segments are drawn in thick dark strokes
- The highest ACKed sequence number is drawn as a light grey line
Sequence number and time are typically set to 0 at the TCP connection
opening.
DL FTP example - Server side
[Figure: server-side capture. Labels: Data, ACK]
Graphical view of:
-TCP data segments and ACK.
-CWIN: amount of data for which no ACK has been received
-RWIN: maximum value for CWIN
-RTT: time between sending a data segment and receiving the ACK for the same segment.
DL FTP example - Receiver side
[Figure: receiver-side capture. Labels: Data, ACK, RWIN, TCP segments, TCP ACK]
Graphical view of:
-TCP data segments and ACK. ACKs are sent almost immediately by the client: one ACK for 2 segments.
-CWIN: NO (server side only).
-RWIN: yes but not relevant
-RTT: NO (server side only)
More about TCP TCP behaviours
Slow start / Acknowledgments / Retransmissions

Global recommendations
Slow Start / Congestion avoidance

Reminder: the TCP concept is to optimize bandwidth use without any prior
knowledge of the network. Throughput increase must then be progressive.
Slow Start phase: after the TCP port opening
or after some retransmissions (timeout).
The server increases its congestion window
exponentially up to the slow start threshold:
ssthresh.
This is typically or 66% of RWIN (depends on
implementation)
Congestion avoidance phase: after slow start
or after some retransmissions (fast retransmit).
The server increases its congestion window
linearly (1 segment per RTT) up to the RWIN.
Slow start example on client (receiver) side:
note that only 2 packets are sent at first.
Note: This is the basic typical TCP behaviour. Some implementations may differ.
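The two phases can be sketched with a small simulation. This is only an illustration of the basic behaviour described on this slide, not the algorithm of any particular TCP stack: the function name, the unit (whole segments) and the numeric values used in the usage example are assumptions.

```python
def cwnd_growth(rwin_segments, ssthresh_segments, initial_cwnd=2, rounds=12):
    """Return the congestion window (in segments) at the start of each round trip.

    Simplified model: no packet loss, one update per RTT.
    """
    cwnd = initial_cwnd
    history = []
    for _ in range(rounds):
        history.append(cwnd)
        if cwnd < ssthresh_segments:
            # Slow start: exponential growth, capped at the slow start threshold.
            cwnd = min(cwnd * 2, ssthresh_segments)
        else:
            # Congestion avoidance: 1 more segment per RTT, capped at RWIN.
            cwnd = min(cwnd + 1, rwin_segments)
    return history


if __name__ == "__main__":
    # Hypothetical values: RWIN of 44 segments, threshold of 29 segments.
    print(cwnd_growth(rwin_segments=44, ssthresh_segments=29, rounds=8))
```

With these hypothetical values the window doubles from 2 up to the threshold, then grows by one segment per round trip until it reaches the RWIN.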
Acknowledgement strategy
TCP DATA packets are numbered with sequence number (unit: bytes).
TCP ACK packets acknowledge all data up to a given sequence number.
To avoid overloading the reverse path, the goal for the receiver is to generate
1 ACK message for 2 DATA packets.
Delayed ACK is used for that purpose: if only 1 data segment has been
received, the receiver waits for a 2nd one during the Delayed ACK timer, then sends the ACK.
Delayed ACK timer is typically set to 200ms (Windows default).
[Client side] ACK is not sent immediately
by the client due to Delayed ACK. This
is the normal behaviour.
In the example, the timer is slightly below 200ms.
Note: in case some data packets have been lost, the receiver detects that a
segment is missing (with sequence number) and generates a duplicate ACK
without waiting for the delayed ACK timer.
Note: To go further, see Windows registry parameters: (Modification is not required)
TcpDelAckTicks (HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters) and TcpAckFrequency (new in XP, same location)
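As an illustration of the "1 ACK for 2 DATA packets" rule with a 200ms Delayed ACK timer, here is a minimal sketch. It assumes in-order arrivals only (no duplicate ACK case) and uses a hypothetical function name; real stacks handle many more cases.

```python
def delayed_ack_times(arrivals_ms, delack_ms=200):
    """Return the times (ms) at which the receiver sends an ACK.

    arrivals_ms: sorted arrival times of in-order data segments.
    """
    acks = []
    pending_since = None  # arrival time of a segment still waiting for its ACK
    for t in arrivals_ms:
        if pending_since is not None and t - pending_since >= delack_ms:
            # The timer expired before this segment arrived: ACK the lone segment.
            acks.append(pending_since + delack_ms)
            pending_since = None
        if pending_since is None:
            pending_since = t   # first segment of a pair: start the timer
        else:
            acks.append(t)      # second segment: ACK both immediately
            pending_since = None
    if pending_since is not None:
        # Last lone segment: ACK sent when the timer expires.
        acks.append(pending_since + delack_ms)
    return acks


if __name__ == "__main__":
    print(delayed_ack_times([0, 10, 20, 30, 500]))
```

In the usage example, the first four segments produce two immediate ACKs, while the isolated last segment is only acknowledged when the timer expires.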
Retransmissions
1- Timeouts
The server continuously computes the RTD of the connection, i.e. the time
needed to receive an ACK after sending a packet.
If for a given segment, no ACK is received after RTO = 2*RTD, a timeout occurs
and the packet is retransmitted.
After a timeout, the congestion window (cwnd) is reduced to the minimum (2) and
the transfer is resumed with slow start.
If the retransmitted packet is not acknowledged, the RTO is doubled (exponential
timeout). The server stops the transfer definitively if nothing is received for 1min.
[Gn side] The packet is still not acknowledged
after RTO (~9s here) and is repeated after
timeout. The transfer is resumed with a slow
start (exponential increase).
Note also that a lot of data packets have been
lost and are repeated in the example.
[Figure labels: Timeout, Slow Start]
Note that RTD computation is not straightforward: the value is typically smoothed, e.g. $RTD_n = \alpha \cdot RTD_{cur} + (1 - \alpha) \cdot RTD_{n-1}$
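A minimal sketch of the RTD smoothing and of the RTO rule given on this slide (RTO = 2*RTD, doubled after each unacknowledged retransmission). The smoothing weight of 0.125 is an assumption taken from common TCP practice, and the function names are illustrative only.

```python
def update_rtd(previous_rtd_s, measured_rtd_s, alpha=0.125):
    """Smoothed RTD: weighted average of the new sample and the previous estimate."""
    return alpha * measured_rtd_s + (1 - alpha) * previous_rtd_s


def rto_s(rtd_s, failed_attempts=0):
    """RTO = 2 * RTD, doubled for each retransmission that was not acknowledged."""
    return 2 * rtd_s * 2 ** failed_attempts


if __name__ == "__main__":
    rtd = 0.100  # hypothetical initial estimate, in seconds
    for sample in (0.120, 0.090, 0.300):
        rtd = update_rtd(rtd, sample)
    print(round(rtd, 4), [rto_s(rtd, n) for n in range(4)])
```

The smoothing keeps a single late ACK (the 300ms sample here) from inflating the RTO too abruptly.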
Retransmissions
2- Fast retransmit / Fast recovery (1/2)
Fast retransmit is a way to retransmit the packets lost without waiting for
the timeout to expire.
If the server receives several (typically 2) duplicate ACKs reporting the same
sequence number, it is very likely that the packet just after has been lost. It
is then immediately retransmitted.
[Gn side] After a few DupACKs reporting the same
sequence number, the lost segment is repeated.
In the example, it looks like we have 5 DupACKs
before the repetition; in fact the Gn-Server RTD is not
null and the server probably reacts after only 2.
[Figure labels: Retransmission, TCP DupACK]
Question: Why not retransmit after only 1 DupACK?
Answer: it would trigger a retransmission when 2 frames are received out of order (because a DupACK is generated if
frame N+1 is received just before frame N).
Retransmissions
2- Fast retransmit / Fast recovery (2/2)
When entering the Fast retransmit phase, the server divides the
congestion window (cwnd) by 2 to slow the transfer down.
However, receiving DupACKs proves that some packets still reach the
receiver. Fast recovery then increases the congestion window by 1
segment for each DupACK.
After Fast recovery, the server state is congestion avoidance, i.e. linear
increase of cwnd up to RWIN.
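The DupACK trigger and the two window reductions (timeout versus fast retransmit) can be sketched as follows. This is a simplified illustration: the threshold of 2 DupACKs is the typical value quoted in these slides, the function names are hypothetical, and fast recovery window inflation is not modelled.

```python
def fast_retransmits(ack_numbers, dupack_threshold=2):
    """Return the sequence numbers the sender retransmits without waiting for a timeout.

    ack_numbers: ACK sequence numbers in the order they reach the sender.
    """
    retransmitted = []
    last_ack = None
    dup_count = 0
    for ack in ack_numbers:
        if ack == last_ack:
            dup_count += 1
            if dup_count == dupack_threshold:
                # Same sequence number reported again: the segment starting
                # at this sequence number is assumed lost and is resent.
                retransmitted.append(ack)
        else:
            last_ack = ack
            dup_count = 0
    return retransmitted


def cwnd_after_loss(cwnd_segments, timeout):
    """Window after a loss: minimum (2) after a timeout, halved after fast retransmit."""
    if timeout:
        return 2
    return max(cwnd_segments // 2, 2)


if __name__ == "__main__":
    print(fast_retransmits([1000, 2000, 2000, 2000, 2000, 5000]))
    print(cwnd_after_loss(40, timeout=False), cwnd_after_loss(40, timeout=True))
```

A single DupACK (two frames swapped on the way) does not trigger anything, which is the reason for not reacting after only 1 DupACK.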
More about TCP - TCP parameters
RWIN / SACK / Window scaling / Timestamping

RWIN (1/4)
Receive Window (unit: bytes)
The RWIN is configured at receiver side, showing how much data it can process at
TCP level.
There shall never be more than RWIN bytes of unacknowledged data in the network.
RWIN is then the maximum possible value for cwnd (congestion window).
RWIN is linked to the bandwidth-delay product (BDP) of a system.
We should send enough data in advance to provide the throughput (bandwidth) during the
round-trip (delay), i.e. at the very least BDP data in advance.
If RWIN < BDP, the RWIN will limit the performance.
To reach maximal performance in a live network with varying delays and packet
losses, a margin should also be considered, e.g.:
RWIN = 2 * BDP = 2 * Expected_Throughput * RTD
This value must then be validated with tests.
Example: is RWIN = 17k enough in HSDPA cat6 (~3Mbps)?
Typical delay (RTD) for HSDPA is ~100ms (optimistic
approximation based on the average small ping response time).
Theoretical max throughput: RWIN / RTD = 17000*8 / 0.1 =
1.36Mbps << 3Mbps.
The maximum radio throughput will never be reached.
[Figure labels: Data, ACK]
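The arithmetic of this slide can be reproduced with two small helpers. The function names are illustrative, and the factor 2 is the example margin given above, to be validated by tests.

```python
def max_throughput_bps(rwin_bytes, rtd_s):
    """Theoretical maximum throughput when at most RWIN bytes are in flight per RTD."""
    return rwin_bytes * 8 / rtd_s


def recommended_rwin_bytes(expected_throughput_bps, rtd_s, margin=2):
    """RWIN = margin * BDP, with BDP = expected throughput * RTD (in bytes)."""
    bdp_bytes = expected_throughput_bps * rtd_s / 8
    return margin * bdp_bytes


if __name__ == "__main__":
    # Values from the slide: RWIN = 17000 bytes, RTD = 100 ms, ~3 Mbps target.
    print(max_throughput_bps(17000, 0.1))      # about 1.36 Mbps
    print(recommended_rwin_bytes(3e6, 0.1))    # about 75000 bytes
```

The BDP at ~3Mbps and 100ms is about 37500 bytes, so a 17k RWIN is below the BDP even before any margin is applied.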
RWIN (2/4)
Illustration
RWIN limitation can be seen on Wireshark (TCP trace graph).
It shows a particular pattern where the server sends some data for a
while but then must stop to wait for acknowledgements because it reached
the RWIN limitation.
[Client side] The packets are received by groups
of RWIN = 17k in the example.
This is not enough for HSDPA: after having
sent 17k, the server must wait for the ACK.
We can measure graphically the end-user
throughput (i.e. avg over the transfer) compared
to the radio throughput (i.e. throughput over
active periods).
The difference is a pure loss of performance.
[Figure labels: Server waiting for ACK, RWIN, RWIN data sent by server]
Note: RWIN can be set in Windows registry with TcpWindowSize in HKEY_LOCAL_MACHINE \ SYSTEM \
CurrentControlSet \ Services \ Tcpip \ Parameters.
Other parameters: TcpWindowSize / interface and GlobalMaxTcpWindowSize (Max limit of RWIN whatever other
settings)
RWIN (3/4)
Windows 2000/XP Typical values
On Windows 2000 / XP, RWIN is common to all network technology. (see next slide for
VISTA)
A trade-off is needed for the value to be correct for GPRS, EDGE, 3G, HSDPA, HSUPA
but also ADSL with Ethernet cable or WIFI!
Risk of having a too small RWIN:

The maximum throughput available on radio cannot be reached, because the server often stops
sending data to wait for acknowledgments. (see previous page)
Risk of having a too high RWIN:
The amount of data in flight between server and client (i.e. cwnd) may be too large:
Possible buffer overflow in various nodes of the network, then frame drops and throughput decrease.
Long queuing in buffers that could impact interactivity when several services run in parallel, e.g. Mail +
WEB. (No impact when only one service is run at a time)
CASE 1: HSDPA cat6 (3.6Mbps) is the target: RWIN = 64k is enough.

in any case, if network not ready, cat8 more likely to be limited by other factors than RWIN
CASE 2: HSDPA cat8-10 (..14.4Mbps) is the target: RWIN = 128k required.
note: network shall be designed accordingly (radio quality, >4 E1s)
CASE 3: HSxPA cat14-24 (..42Mbps) is the target: RWIN = 256~512k required.
note: using VISTA or 7 seems more appropriate!
RWIN (4/4)
Windows VISTA
DL FTP transfers are far from optimal with the first version of VISTA. This is
due to a new feature: RWIN auto-tuning.
The feature sets the RWIN automatically depending on the transfer,
in order to achieve optimal throughput (Microsoft claims).
However, the resulting RWIN was much too small, which resulted in
bad throughput, typically 20% worse than XP and up to 50% worse!
[Post processing from server side trace] RWIN (blue line) with
VISTA is very unstable and not
high enough. RWIN does not
change with XP.
[Figure labels: VISTA, XP]
Microsoft is aware of the problem and delivered a patch that
corrects the issue. VISTA + patch is now comparable with XP.
Without the patch, throughput is much better with the timestamp option
but still not at the level of XP with the right RWIN.
Note: The patch is KB940646. See http://support.microsoft.com/default.aspx?scid=kb;EN-US;940646 for details and
download. It is included in VISTA SP1 (H1 2008)
SACK
Selective Acknowledgement
TCP SACK option is negotiated during TCP port opening (3-way handshake).
Without SACK option, the receiver indicates the next expected segment in the
acknowledgements. The sender knows that all frames before this one have been
received.
With SACK option, the receiver may also report the frames correctly received after a
possible lost frame. The sender knows that there is no point in retransmitting them.
This gives better retransmission management in case of several successive packet losses.
TCP SACK should be enabled.
It should already be enabled by default in Win2000 / XP / VISTA.
The packets correctly received after the first
packet loss are mentioned as an option in the
TCP ACK message.
Note: SACK option can be set in Windows registry with SackOpts in HKEY_LOCAL_MACHINE \ SYSTEM \
CurrentControlSet \ Services \ Tcpip \ Parameters. (Win XP)
MTU (Maximum Transfer Unit)
MSS (Maximum Segment Size)
MTU is the maximum frame size at IP level, including headers.
1500 is the maximum value permitted for MTU. A high MTU leads to a lower header impact
on throughput (<3% for 1500) and a faster TCP slow-start.
MTU can be set in Windows registry to control the maximum size of DL frames.
MTU is used by TCP stack to compute the MSS, i.e. maximum amount of data
carried in one TCP packet.
If timestamping is not activated,
TCP header = 20 bytes and IP header = 20 bytes,
so: MSS = MTU - 40 (TCP + IP headers)
The MSS is included in the TCP 3-way handshake to inform the other peer of the maximum
size that can be received. Example of TCP handshake initiated by MS:
SYN informs of the maximum size that the server can send to the mobile (i.e. in DL)
SYN-ACK informs of the maximum size that the mobile can send to the server (i.e. in UL)
ACK no info about MSS
There is no MSS negotiation during the handshake: it is only an information exchange.
Recommendation: The maximum possible value shall be set, i.e. 1500.

Note: MTU can be specified in Windows XP registry for Dial-Up (RAS) with ProtocolMTU in HKEY_LOCAL_MACHINE\SYSTEM \
ControlSet002 \ Services \ NdisWan \ Parameters \ Protocols \ 1
For other interfaces, MTU in HKEY_LOCAL_MACHINE \ SYSTEM \ CurrentControlSet \ Services \ Tcpip \ Parameters \ interfaces
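The MSS and header overhead figures quoted on this slide can be checked with a short calculation. The helper names are illustrative, and the 20-byte IP and TCP headers assume no IP or TCP options.

```python
def mss_bytes(mtu_bytes, ip_header_bytes=20, tcp_header_bytes=20):
    """MSS = MTU minus IP and TCP headers (no options assumed)."""
    return mtu_bytes - ip_header_bytes - tcp_header_bytes


def header_overhead(mtu_bytes):
    """Share of each IP frame that is spent on TCP/IP headers."""
    return (mtu_bytes - mss_bytes(mtu_bytes)) / mtu_bytes


if __name__ == "__main__":
    for mtu in (576, 1500):  # 576 is only a hypothetical small MTU for comparison
        print(mtu, mss_bytes(mtu), f"{header_overhead(mtu):.1%}")
```

With MTU = 1500 the MSS is 1460 bytes and the header share stays below 3%, whereas a smaller MTU pays the same 40 bytes on every, shorter, packet.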
Window Scaling
RWIN value is coded on 16 bits, so the maximum value is 0xFFFF, i.e. 65535 bytes.
This may not be enough for HSxPA networks (refer to previous RWIN section).
Window Scaling option has been implemented to allow RWIN > 65535, by multiplying the
RWIN advertised in the ACK by 2^n.
Window Scaling option support and the factor n are included in
SYN/SYN-ACK messages during the TCP handshake.
Window Scaling shall be supported on both ends, but the
factor n may depend on the direction.
Real RWIN = Advertised RWIN * 2^n
The option is activated by default in Windows 2000 / XP / VISTA. It is used when RWIN is
defined as > 65535 bytes for W2k / XP and always used for VISTA.
Recommendation: No need to update anything on client side.
However, make sure that Window Scaling is supported on network side when HSxPA
requires RWIN > 65k, e.g. to fully support category 8 devices (up to 7.2Mbps)
Note: The same registry key is used to activate 2 TCP options: Window Scaling and Timestamp: Tcp1323Opts in
HKEY_LOCAL_MACHINE \ SYSTEM \ CurrentControlSet \ Services \ Tcpip \ Parameters.
0 (disable both options) / 1 (window scaling enabled only) / 2 (timestamps enabled only) / 3 (both options enabled)
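The relation between the real RWIN, the 16-bit advertised value and the factor n can be sketched as below, together with a decoder for the Tcp1323Opts values listed in the note. The helper names are illustrative; the upper bound of 14 for n comes from the TCP window scale specification, and how a given stack actually picks n is not covered here.

```python
def window_scale_shift(rwin_bytes, max_shift=14):
    """Smallest n such that RWIN / 2^n fits in the 16-bit window field."""
    shift = 0
    while (rwin_bytes >> shift) > 0xFFFF:
        shift += 1
    if shift > max_shift:
        raise ValueError("RWIN too large for TCP window scaling")
    return shift


def advertised_window(rwin_bytes, shift):
    """Value carried in the 16-bit window field of the TCP header."""
    return rwin_bytes >> shift


def real_window(advertised, shift):
    """Real RWIN = Advertised RWIN * 2^n."""
    return advertised << shift


def decode_tcp1323opts(value):
    """Decode the Tcp1323Opts registry value: bit 0 = window scaling, bit 1 = timestamps."""
    return {"window_scaling": bool(value & 1), "timestamps": bool(value & 2)}


if __name__ == "__main__":
    rwin = 128 * 1024
    n = window_scale_shift(rwin)
    print(n, advertised_window(rwin, n), real_window(advertised_window(rwin, n), n))
    print(decode_tcp1323opts(3))
```

Note that an RWIN of exactly 128k needs n = 2 in this sketch, because 65536 does not fit in the 16-bit field.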
Timestamping (1/2)
When no timestamp is included in TCP packets, estimating the RTD may be
difficult.
On sender side: RTD is the time to receive the ACK after sending a packet.
How to associate an ACK with a packet when the packet has been repeated?
On receiver side: How to compute the RTD?
Duration between packet and ACK is almost null (stays local): not possible this way.
How to know when the packet has been sent? When the ACK will be received?
Some algorithms exist but are not completely reliable.
Timestamping option support is included in SYN/SYN-ACK messages during the
TCP handshake. Timestamping shall be supported on both ends.
Two timestamps are then included in all TCP messages (data and ACKs):
TSval: Timestamp Value, current time
TSecr: Timestamp Echo Reply, i.e. timestamp value included in last received message.
Note: The same registry key is used to activate 2 TCP options: Window Scaling and Timestamp: Tcp1323Opts in
HKEY_LOCAL_MACHINE \ SYSTEM \ CurrentControlSet \ Services \ Tcpip \ Parameters.
0 (disable both options) / 1 (window scaling enabled only) / 2 (timestamps enabled only) / 3 (both options enabled)
Timestamping (2/2)
Advantages:
By computing a precise RTD, smarter TCP algorithms can be used, e.g. to avoid useless
retransmissions or adapt some parameters in real time (see RWIN
auto-tuning with Windows VISTA).
Drawback:
Timestamping option typically adds 12 bytes to TCP header, and
theoretically impacts the maximum achievable throughput.
The reduction is <1% (12/1460) and can be neglected compared to
other disturbances coming from radio or packet losses.
[Figure: timestamp exchange, events at times t1 to t4]
Data sent at t1: TSval = t1, TSecr = 0
ACK sent at t2: TSval = t2, TSecr = t1. Received at t3: RTD = t3 - t1
Data sent at t3: TSval = t3, TSecr = t2. Received at t4: RTD = t4 - t2
ACK sent at t4: TSval = t4, TSecr = t3
Recommendation: Timestamping should be activated.

Not used by default in Win2k / XP but used if other peer requests it.
Used by default with Win VISTA.
Note: Timestamp option is 10bytes long but TCP header size must be a multiple of 32bits (4bytes). As a result, TCP
header goes from 20 to 32 bytes when using timestamping (and not 30bytes as we might have expected).
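The padding rule in the note above (10-byte option, 32-byte header) and the <1% throughput reduction can be verified with a few lines. The helper names are illustrative.

```python
def padded_length(option_bytes, word_bytes=4):
    """Round the option length up to a multiple of 32 bits (4 bytes)."""
    return -(-option_bytes // word_bytes) * word_bytes


def tcp_header_bytes(option_bytes=0, base_header_bytes=20):
    """TCP header size once options have been padded."""
    return base_header_bytes + padded_length(option_bytes)


def payload_reduction(option_bytes, mss_without_options=1460):
    """Share of the payload given up to carry the padded option."""
    return padded_length(option_bytes) / mss_without_options


if __name__ == "__main__":
    print(padded_length(10), tcp_header_bytes(10), f"{payload_reduction(10):.2%}")
```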
More about TCP Typical call flow
TCP port opening and release

Typical call flow
[Figure: TCP call flow between client and server]
SYN: RWIN. Options: SACK, timestamp, MSS, Window Scaling
SYN-ACK: RWIN. Options: SACK, timestamp, MSS, Window Scaling
ACK: RWIN. Options: timestamp
Data / Data / ACK / Data / Data / ACK
FIN-ACK / ACK / FIN-ACK / ACK
NOTES
SYN: RWIN, MSS and Window Scaling factor are valid for client side only.
SYN-ACK: RWIN, MSS and Window Scaling factor are valid for server side only.
If SACK, timestamp and WS fields are present, it indicates support from the server.
DATA / ACK: RWIN is advertised in all TCP headers.
TCP options:
Timestamp in all TCP headers (if used)
SACK in ACK only if selective acknowledgment needs to be reported.
More about TCP to conclude
Recommendations according to TCP behaviour

Recommendations according to TCP
behaviour
We saw that packet losses durably impact TCP throughput and
user experience: the congestion window is reduced in all cases
(after a timeout and after a fast retransmit).
Packet losses must be troubleshot very carefully!
Here are some factors that may create packet losses:

IP routing or firewall problems
Routers badly configured in Core Network: conflict between Half and Full
duplex Ethernet.
Bad dimensioning of various interfaces (overload)
Overloaded network element, or network element with too small buffers.
QoS activation in some network element (policing) if not on purpose!
TCP behaviour of network elements that intercept the data flow, e.g proxy
used for content billing
to be continued
Impacts of mobile networks Bursts
EDGE (GPRS), R99, HSDPA

Reminder; DL bursts, UL bursts
Reminder: EDGE (GPRS)
Data layers
TCP congestion control and retransmission
[Figure: protocol stacks. FTP and TCP end to end; IP; BSS protocols; Core protocols. Nodes: MOBILE, BSC / PCU, SGSN, GGSN, INTERNET]
Radio retransmission: Normal BLER (Block Error Rate) depending on radio
conditions and BSS strategy (link adaptation)
EDGE (GPRS)
DL bursts
When capturing IP trace of DL FTP transfer on client side, EDGE transfers
typically present the following pattern:
[Client side] We can notice a lot of small bursts,
presenting instantaneous throughput higher
than the maximum theoretical radio throughput.
This is an artifact coming from:
-EDGE radio BLER
-In-order delivery from RLC to LLC layers.
On radio interface,
BLER >10% is not unusual in EDGE.
RLC blocks are retransmitted after 200~300ms.
During this timeframe, the blocks coming next are
buffered by the RLC layer [in-order delivery].
When the erroneous block has been correctly
retransmitted, all packets are sent to the upper layer
almost instantly.
Note that GPRS BLER is typically much lower, in
particular when only CS1-2 are used.
EDGE (GPRS)
UL bursts
When capturing IP trace of DL FTP transfer on server side, EDGE transfers with
old mobiles (before 3GPP Rel4) typically present bursts of ACK in UL.
On the contrary, those bursts are not visible on client side, meaning that the TCP
stack sends ACK regularly.
We may think of UL BLER; however, the most impacting point is the support of
Extended UL TBF feature on BSS side.
Without the feature, the UL TBF is released as soon as possible. If some

ACK arrives at that time, it has to be buffered while releasing the old UL
TBF and re-establishing a new one (i.e. >300ms). All ACK buffered are
then transferred with the new TBF, and then the TBF is closed.
With the feature, the UL TBF is not released immediately. In most cases,
only one UL TBF is enough to send all ACK during an FTP DL transfer.
As a result, there is no more delay due to TBF release/establishment,
and the ACK shape on server side is the same as it was on client side.
The feature must be supported by BSS and by the mobile (OK for
mobiles from 3GPP Rel4)
Note: Extended UL TBF modifies the pattern observed at IP level, but also the performance.
RTD is much lower (roughly half: 300ms instead of 600ms), which leads to a very impressive performance enhancement for
interactive applications such as WEB and WAP. This is also visible for short transfers (e.g. MMS). Note also that the
RWIN needed goes down to ~20kbytes; on the contrary, the minimum RWIN is ~30kbytes without the feature.
Reminder: UMTS R99
Data layers
TCP congestion control and retransmission
[Figure: protocol stacks. FTP and TCP end to end; IP. Nodes: MOBILE, UTRAN, SGSN, GGSN, INTERNET]
Radio retransmission: Target BLER (Block Error Rate) set in UTRAN
UMTS R99
DL bursts
When capturing IP trace of DL FTP transfer on client side, UMTS R99
transfers typically present the following pattern:
[Client side] We can notice a few small bursts,
presenting instantaneous throughput higher
than the maximum theoretical radio throughput.
This is an artifact coming from:
-Target BLER
-In-order delivery from RLC to LLC layers.
On radio interface, target BLER is configured in

UTRAN, typical value is 1%.
RLC blocks are retransmitted after ~200ms.
During this timeframe, the blocks coming next are
buffered by RLC layer [in-order delivery]
When the erroneous block has been correctly
retransmitted, all packets are sent to upper layer
almost instantly.
Note: Throughput is much more stable in UMTS R99 compared to EDGE:
-No link adaptation in UMTS R99, only fast power control that aims at achieving the target BLER
-Target BLER (1%) much lower than typical EDGE BLER, creating far fewer bursts.
Reminder: HSDPA
Data layers
TCP retransmission
Retransmission possible but no target BLER.
Radio retransmission in HSDPA at MAC-hs level between UE and
NodeB. Normal BLER depending on UTRAN strategy.
HSDPA
DL behaviour
With HSDPA introduction, we have now 3 levels of possible retransmissions:
TCP level, e2e retransmission (mobile / server)
RLC level, between mobile and RNC
MAC-hs level, between mobile and NodeB.
Most HSDPA implementations are designed to run with 10% BLER. Erroneous blocks are
retransmitted very quickly on the radio (after ~10ms), due to the HARQ
management at NodeB level and very short TTI of 2ms.
As a result, the IP trace does not show a few isolated bursts but rather a lot of small
bursts. Note that the phenomenon is amplified when using high HSDPA category
(cat8): @6Mbps, 10ms is 5 TCP packets.
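The "5 TCP packets" figure follows from a simple product of throughput and retransmission delay. In the sketch below, the 1500-byte packet size is an assumption (the MTU recommended earlier) and the function name is illustrative.

```python
def packets_during(throughput_bps, delay_s, packet_bytes=1500):
    """Number of full-size packets that arrive while one block is being retransmitted."""
    return throughput_bps * delay_s / 8 / packet_bytes


if __name__ == "__main__":
    # HSDPA cat8 example from the slide: 6 Mbps, ~10 ms HARQ retransmission.
    print(packets_during(6e6, 0.010))
```

The same helper shows why the burst size grows with throughput: doubling the bit rate doubles the number of packets held back during one retransmission.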
[Client side] The full transfer is made of very

small bursts.
Note: There is almost no repetition at RLC level (Mobile RNC) in HSDPA.

Exceptions: - very bad radio and retransmission failure at MAC-hs (e.g. after 7 attempts at MAC-hs)
- Inter-NodeB mobility with MAC-hs buffer deletion.
HSDPA
DL behavior with UL ACK bursts
When using HSDPA without HSUPA, the UL path is still on R99 DCH channel.
As a result, the target BLER @RLC level applies in UL: this may slow down UL
TCP ACK and create bursts of ACK in UL.
This triggers the following chain reaction:
Radio BLER: an RLC block is not correctly received.
The RLC block is retransmitted; during this time (100~200ms), new ACKs reach the RNC but
cannot be forwarded to the SGSN due to in-order delivery.
When the block is finally retransmitted, all ACKs are sent almost instantly from the RNC to
the server.
The server reaction is to immediately send the corresponding data in DL: typically 2 TCP
data packets per ACK: this may be a very large amount of data!
If there is some bandwidth limitation on any interface between the server and the UE
(and there typically is one on Gi, Gn or Iu-PS), some packets are lost.
TCP must retransmit the lost packets, leading to a throughput decrease.
Notes:
The phenomenon gets more and more important when the throughput increases.
For a reason unknown to date, this is mostly observed on ALU UTRAN.
This is not observed with HSUPA as the retransmissions are very fast on E-DCH.
HSDPA
The pattern observed on IP trace is the following on server side:
[Server side] Typical pattern of data loss
after ACK burst on ALU UTRAN
HSDPA cat8 with UL DCH
1/ Wait while erroneous RLC block is retransmitted on radio
2/ ACK burst: all buffered ACKs are sent at once by the RNC
3/ Server reaction: big data burst in DL (all RWIN is
sent in a few milliseconds)
4/ Packets have been lost and must be repeated by server
5/ Risk of long suboptimal
throughput (cwnd much
less than RWIN)

HSDPA
Recommendations:
There is no universal solution but some actions may limit the impact of UL ACK bursts:
UTRAN side: minimize the target BLER if possible (may impact coverage), i.e. from 1% to
0.5 or 0.1%. We could also increase the minSIR associated to UL channels. In both cases,
this will minimize the retransmission probability and then the number of bursts.
UTRAN side: careful setting of parameters controlling the RLC retransmissions, i.e.
Status_prohibit timers, Poll_prohibit timers. This will minimize the time needed to retransmit
RLC blocks and then minimize the burst size.
Core side: track bottlenecks and review interface dimensioning to avoid packet loss during
the bursts.
It could be possible to perform shaping in UL to smooth ACK bursts. It is complex because it
should only apply to TCP ACK and not to the real UL data. Note: the feature is already
available on Bytemobile booster. It could prevent the DL burst from being sent.
It could be possible to perform shaping in DL to smooth data bursts before they reach the
GGSN. It is complex because the bursts are very short. It could avoid packet losses in DL.
Note that the main problem for performance is the TCP retransmissions after packet losses in Core network. The idle
periods while waiting for ACK retransmissions have much less impact on overall throughput, or even no impact at all if
the RWIN is correctly set (i.e. large enough).
Impacts of mobile networks Mobility
EDGE (GPRS), R99, HSDPA, HSUPA

Intra-system / Inter-system
EDGE (GPRS) mobility
Reminders
No handover in 2G Packet Switched (GPRS or EDGE)
Autonomous cell reselection from the mobile (called NC0 mode)
The mobile performs measurements, computes the reselection criteria (C1, C2..) and
takes the decision of moving to a new cell.
Radio resources (TBF) are not pre-established on the new cell: this explains why the radio
gap is much longer than during a voice handover for example.
Key BSS feature for reselection duration is NACC (Network Assisted Cell
Change): it avoids the need to listen to the SysInfo of the target cell, gain of 1~1.5s. <applies to
intra-BSC mobility and requires Rel4 mobiles>
Key BSS features for applicative impact are NACC and LLC-rerouting: LLC
rerouting allows the LLC frames buffered in the BSS to be transferred from the old cell buffer to
the new cell buffer <applies to intra-RA mobility>. Without LLC rerouting, TCP
retransmission is necessary, whereas most retransmissions can be avoided
with the feature.
LLC-rerouting is called SCR (Seamless Cell Reselection) on Motorola BSS and
Enhanced Flush Procedure on Nokia BSS.
Without LLC rerouting - Illustration
The pattern is typically the following:
[Client side] Typical pattern of data loss during 2G-2G mobility
- Packets routed to the right cell at first attempt
- Packets that were buffered in old cell bucket: lost
- Timeout
- Retransmissions (slow start)
Notes: The lack of LLC rerouting is even more impacting in case of video streaming, where retransmissions are
impossible (for streaming over UDP)
With LLC rerouting
Most retransmissions are generally avoided.
However, there may be a problem if the reselection takes too long, in which
case the server may retransmit some (or all) packets due to TCP timeout,
even if they were not lost.
The most efficient solution is then to activate both NACC and LLC-rerouting:
reselections have much less impact on the applicative layer if there is a very
short gap and no packet loss.
Recommendation: Reselection number should be minimized.

Reselections have an important impact and must be minimized by careful radio
parameter tuning. (CRH, CRO).
This is valid in any case: with and without NACC and/or LLC-rerouting
3G (R99, HSDPA, HSUPA) mobility
Reminders
Soft handovers are available in UMTS R99 for voice and packet calls.
As a result, mobility in itself does not have any impact at application level.
No interruption / No data loss: same performance in mobility & in static
Only hard handovers are available in DL for HSDPA.

As a result, mobility in itself slightly impacts the user experience:
Short interruption (~400ms is a typical value)
In some cases (inter-NodeB), MAC-hs data loss is possible: retransmissions
are needed at RLC level. Retransmissions at TCP level (timeout) are also likely to occur
if the RLC retransmissions take too long (even if no RLC loss).
However, the most important impact of mobility in HSDPA is the radio link quality
that is likely to decrease strongly at cell edge
Soft handovers are available in HSUPA

As a result, mobility in itself does not have any impact at application level.
No interruption / No data loss
Note that there may be some disturbance as the reverse path is carried over
HSDPA and there is an interruption in HSDPA.
3G-2G intersystem mobility
Reminders
3G to 2G radio signaling duration is roughly 10s. Typical steps are:
UE switch to the target technology ~3s
Location Area Update 2.5~3s
Inter procedure 1~3s
Routing Area Update ~2.5s
3G to 2G radio procedures may be sped up by:
Gs interface, which allows combined LAU / RAU procedures
Combined MSC, to speed up LAU
Combined SGSN, to speed up RAU
3G to 2G NACC (sometimes called eNACC) to send 2G SysInfo on 3G layer, thus
speeding up the first 2G access
2G to 3G radio signaling duration is shorter because LAU & RAU are performed
simultaneously.
2G to 3G radio procedures may be sped up by:
Combined MSC/SGSN, to speed up LAU/RAU
3G-2G intersystem mobility
Applicative impact
3G-2G mobility is a very long procedure that strongly impacts user experience.
This is due to TCP behavior that uses exponential timeouts, as described in the
example below.
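RT1 is the first retransmission timeout, after 1s (example value).
The following timeouts are exponential, waiting 2, 4 and 8s.
[Figure: timeline in seconds (marks at 1, 3, 7, 10 and 15) with retransmissions RT1 to RT4.
The radio interruption runs from the beginning to the end of the intersystem mobility.
The TCP interruption lasts longer: it ends only when a retransmitted TCP packet reaches
the UE and the transfer restarts.]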
Applicative interruption time typically reaches 15~20s.
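The exponential timeout mechanism described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the 1s first timeout is the example value from the slide, and real TCP stacks derive their timeout from the measured RTT and cap the number of retries.

```python
def retransmission_times(first_rto_s=1.0, retries=4):
    """Cumulative times (in seconds) at which the sender retransmits,
    assuming the timeout doubles after each unsuccessful attempt."""
    times = []
    elapsed = 0.0
    rto = first_rto_s
    for _ in range(retries):
        elapsed += rto
        times.append(elapsed)
        rto *= 2
    return times


def tcp_interruption(radio_interruption_s, first_rto_s=1.0, retries=6):
    """Time of the first retransmission sent after the radio link is back.
    Returns None if all retries are exhausted before the radio recovers."""
    for t in retransmission_times(first_rto_s, retries):
        if t >= radio_interruption_s:
            return t
    return None


# Example with the slide values: timeouts of 1, 2, 4 and 8s
print(retransmission_times())   # cumulative retransmission times
print(tcp_interruption(10.0))   # radio interruption of roughly 10s
```

With a radio interruption of about 10s, the retransmission sent at 7s is lost and the transfer can only restart with the next one, which is why the applicative interruption is noticeably longer than the radio interruption.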

Minimize the number of intersystem mobility events whenever possible.
Special care should be taken with 3G to 2G static mobility, i.e. in case of 3G
coverage loss and recovery, as ping-pongs are likely to occur in this case.
Windows OS specificities
UL FTP capture with XP

UL FTP capture with XP

Theoretically, a UL FTP capture on laptop side should look the
same as a DL FTP capture on server side.
However, we may observe strange patterns instead, such as the following
(3G to 2G mobility):
1/ The cwnd does not increase and is much below the RWIN
2/ Lots of DupACKs but no retransmission
3/ Retransmissions of packets that have already been acknowledged !!
Offset due to buffering?
These 3 points clearly show that the capture does not give direct access to the
TCP stack. There is at least an offset between the ACKs seen in DL and the
packets seen in UL, probably due to some buffering internal to Win XP.
Only rely on DL to extract timestamps or interruption time on Win XP.
Conclusion
What to secure and check first?
Recommendations
In practice:
Perform a trace route to internet
Quick look at the TCP/IP messages
Recommendations
Mastering end-to-end performance in mobile networks requires addressing
very different aspects:
Radio network quality is a prerequisite, but it is not enough

Radio signal quality? Congestion on radio interface? Mobility that triggers data interruption
or packet losses?
At application level, the #1 issue is TCP PACKET LOSSES
Congestion on interfaces or buffer overflow?
Side effects due to policing (GGSN) or content billing platforms?
but TCP parameters can also limit the max possible performance
RWIN: limits the maximum performance if too small, or creates too much buffering if too big
All technologies up to HSDPA cat6 (3.6Mbps): RWIN = 64k
Strategy focused on HSDPA cat8-10 (7.2-14.4Mbps): RWIN = 128k + Window Scaling
Latest HSxPA (42Mbps): Windows 7 safer
See the next slides to understand how to detect TCP problems from traces.
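To illustrate why RWIN caps the throughput, the short Python sketch below applies the bandwidth-delay product relation (at most one RWIN of data in flight per round trip). The function names are hypothetical, and the 100ms RTT is an assumption chosen for the example, not a value from this document: measure the real RTT on the network under test.

```python
def max_throughput_mbps(rwin_bytes, rtt_s):
    """Upper bound on TCP throughput: one full RWIN per round trip."""
    return rwin_bytes * 8 / rtt_s / 1e6


def min_rwin_bytes(target_mbps, rtt_s):
    """Smallest RWIN (bandwidth-delay product) that does not cap the target rate."""
    return target_mbps * 1e6 * rtt_s / 8


ASSUMED_RTT_S = 0.100  # assumption for illustration only

for label, rwin in (("64k", 65535), ("128k", 131072)):
    print(label, round(max_throughput_mbps(rwin, ASSUMED_RTT_S), 2), "Mbps max")

for rate in (3.6, 7.2, 14.4):
    print(rate, "Mbps needs RWIN >=", round(min_rwin_bytes(rate, ASSUMED_RTT_S)), "bytes")
```

Under this assumed RTT, a 64k window caps the transfer at roughly 5.2 Mbps. The longer the RTT, the larger the RWIN needed for the same bit rate.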
In practice
What to check first?
Perform a trace route to internet
The principle of trace route is to use the Time-To-Live (TTL) field of the IP header to measure
accessibility, reliability and response time of all IP routers between the client and the server.
PING requests are sent with a TTL value increasing from 1 up to the maximum value.
Each router decreases the TTL by 1 and should return a specific message (TTL exceeded
in transit) when it reaches 0. On a mobile network, the first router to answer is the GGSN.
Note that trace route may not be relevant in certain configurations (e.g. firewall blocking PING
messages, congested routers that respond very slowly, etc.)
If the trace route is successful (i.e.
most nodes respond, results are
reproducible), it is possible to
detect the network element or router
creating the issue by comparing the
success rate and response time of
each hop.
[Figure: trace route result showing the loss rate per hop]
In this example, the loss rate
explodes between the 4th and 5th hop.
Notes: Cyberkit (freeware: http://www.snapfiles.com/get/cyberkit.html) is a good choice to have full and easy control
over the trace route procedure.
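The hop comparison described above can be automated once the per-hop loss rates have been collected. The Python sketch below is a minimal illustration: the function name is hypothetical and the loss values are invented sample data, not measurements from this document.

```python
def suspect_hop(loss_rates):
    """loss_rates[i] is the loss rate (0.0 to 1.0) measured at hop i + 1.
    Returns the pair of hops (previous, current) with the largest loss increase."""
    best_pair = None
    best_increase = 0.0
    for i in range(1, len(loss_rates)):
        increase = loss_rates[i] - loss_rates[i - 1]
        if increase > best_increase:
            best_increase = increase
            best_pair = (i, i + 1)  # hops are numbered from 1
    return best_pair


# Invented sample data: loss rate per hop, hop 1 being the GGSN
sample = [0.00, 0.01, 0.00, 0.01, 0.30, 0.31]
print(suspect_hop(sample))
```

A hop showing loss that is not carried to the following hops is usually a router giving low priority to its own replies rather than a real problem, which is why the sketch looks at the increase between consecutive hops and not at isolated values.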
In practice
Before developing your own personal ways of analyzing IP
traces, here are some steps you can follow.
Starting point: FTP DL transfer with disappointing throughput.

We have access to the trace on client side.
1) overview of the messages:
Does Wireshark flag errors, retransmissions, Dup ACKs, or any other
messages highlighted in red?
if yes, there may be some packet loss problems
Warning: if 2 packets do not arrive in order for any reason, the first one is tagged
Previous Packet lost and the 2nd one is tagged Retransmission.
This is not a problem if the 2 packets have simply been swapped, in which case Wireshark
is wrong (it is not really a retransmission).
Note: If the time between the 2 packets is <RTT, this cannot be a real retransmission.
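The RTT rule above can be applied mechanically to the packets tagged as retransmissions. The Python sketch below assumes that the frame numbers and the time gaps (between the packet tagged Previous Packet lost and the one tagged Retransmission) have already been read from the trace; the function name and the sample values are hypothetical.

```python
def classify_flagged_packets(flagged, rtt_s):
    """flagged: list of (frame_number, gap_s) pairs for packets tagged Retransmission.
    A gap shorter than the RTT cannot be a real retransmission: the sender
    had no time to detect a loss, so the packets were only reordered."""
    result = {}
    for frame, gap_s in flagged:
        if gap_s >= rtt_s:
            result[frame] = "possible retransmission"
        else:
            result[frame] = "reordering"
    return result


# Hypothetical values: RTT of 150ms, two packets flagged by the analyser
print(classify_flagged_packets([(120, 0.002), (348, 0.450)], rtt_s=0.150))
```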
In practice
2) overview of the messages (continued):
Find the following information in the trace:
RWIN: refer to previous section

a small RWIN (<65k, or <128k for cat8) could prevent the transfer from reaching
maximum throughput
Average packet size

should typically be >1400
if not, the MTU is not set correctly, or a network element forces the MTU to be
reduced (with path MTU discovery)
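These two checks can be scripted once the RWIN and the data packet sizes have been exported from the trace. The Python sketch below is a minimal illustration: the function name is hypothetical, the thresholds are the slide values read literally (65k as 65000, 1400 for the packet size), and the list is assumed to contain only the data packets of the transfer, not the small ACKs.

```python
def check_basic_parameters(rwin_bytes, data_packet_sizes,
                           rwin_min_bytes=65_000, min_avg_size=1400):
    """Return the average data packet size and a list of findings."""
    findings = []
    if rwin_bytes < rwin_min_bytes:
        findings.append("RWIN may be too small to reach maximum throughput")
    average = sum(data_packet_sizes) / len(data_packet_sizes)
    if average <= min_avg_size:
        findings.append("average packet size is low: check the MTU settings")
    return average, findings


# Hypothetical trace values
print(check_basic_parameters(65535, [1460, 1460, 1460]))
print(check_basic_parameters(16384, [576, 576, 576]))
# For a cat8 target, pass rwin_min_bytes=128_000 instead of the default
```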
In practice
3) draw the Time/Sequence graph
Is it regular or not?
if yes, zoom to be sure
A regular pattern tends to show that there is no problem at applicative level
(the bottleneck may come from any other network interface, including radio).
You may see small steps while the overall pattern remains regular.
They may be due to radio BLER (typical for EDGE or HSDPA, see previous
sections).
If the small steps are regular, it tends to show an applicative limitation. In
particular, if the size of the steps is roughly equal to RWIN, the performance is
most probably limited by a too small RWIN.
if no, is it due to packet losses?

if yes, zoom on the sections concerned and determine whether the throughput
decrease is due to the loss
if not, this can be due to mobility, modification of radio conditions or resource
availability (more or fewer TS in 2G, Radio Bearer modification in 3G, cell
change)
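The comparison between the step size and RWIN can also be done numerically instead of by eye. The Python sketch below assumes the points of the Time/Sequence graph have been exported as (time in seconds, relative sequence number in bytes) pairs, sorted by time and starting from sequence 0; the function names, the gap threshold and the 10% tolerance are all assumptions made for the example.

```python
from statistics import median


def step_sizes(points, gap_s):
    """Split the transfer into steps separated by idle gaps longer than gap_s
    and return the number of bytes carried by each step."""
    sizes = []
    last_step_end = 0
    for (t, seq), (t_next, _) in zip(points, points[1:]):
        if t_next - t > gap_s:
            sizes.append(seq - last_step_end)
            last_step_end = seq
    sizes.append(points[-1][1] - last_step_end)
    return sizes


def limited_by_rwin(sizes, rwin_bytes, tolerance=0.10):
    """True if the typical step size is roughly equal to RWIN."""
    return abs(median(sizes) - rwin_bytes) <= tolerance * rwin_bytes


# Invented sample: two bursts of 64000 bytes separated by a 200ms pause
points = [(0.00, 16000), (0.01, 32000), (0.02, 48000), (0.03, 64000),
          (0.23, 80000), (0.24, 96000), (0.25, 112000), (0.26, 128000)]
sizes = step_sizes(points, gap_s=0.1)
print(sizes, limited_by_rwin(sizes, rwin_bytes=65535))
```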
Annex - Performing FTP transfers
DOS, scripts, other FTP clients

Performing FTP transfers
DOS
DOS provides an FTP client which can be accessed from the command line. The
throughput given by DOS at the end of the transfer is used as a reference*.
C:\> ftp myservername  or  C:\> ftp IP@    Connection

ftp> dir                    Lists remote directory files
ftp> get filename           Download filename
ftp> put filename           Upload filename
ftp> mget (mput) 1Mo_*      Download (upload) all files beginning with 1Mo_
ftp> hash                   Display # signs showing the transfer progress
ftp> cd                     Change remote directory
ftp> lcd                    Change local directory
ftp> help                   Show help
ftp> bye                    Close ftp session and quit
Note: DOS UL throughput is not reliable for small files (i.e. <500 kB).
Transfer duration may be extracted from the Wireshark IP trace in that case.
DOS scripts
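It may be useful to write short scripts to log in to the server and perform FTP
transfers automatically.
We need two files: FTP_cmd.bat and FTP_instructions.txt.
Example:
FTP_cmd.bat:
ftp -s:FTP_instructions.txt IP@
FTP_instructions.txt (the text on the right is an annotation, not part of the file):
user
password
lcd c:\temp                 change local directory
mget 1Mo_*                  download all files beginning with 1Mo_
We can now launch the connection with FTP_cmd.bat.
Note: mget asks for a confirmation before each file unless interactive prompting is
turned off, e.g. with the -i option of the ftp command.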
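When a scripting language is available on the test laptop, the same automation can be sketched with Python's standard ftplib module. This is only an illustration, not the method used in this document: the function names are hypothetical, and the throughput is computed here from the data transfer time only, so it may differ from the value reported by DOS and should be validated against it.

```python
import time
from ftplib import FTP


def throughput_kbps(nbytes, duration_s):
    """Average throughput in kbit/s for nbytes transferred in duration_s seconds."""
    return nbytes * 8 / duration_s / 1000


def timed_download(host, user, password, filename):
    """Download one file (content is discarded) and return (bytes, seconds)."""
    received = 0

    def count(chunk):
        nonlocal received
        received += len(chunk)

    with FTP(host) as ftp:
        ftp.login(user, password)
        start = time.monotonic()
        ftp.retrbinary(f"RETR {filename}", count)
        duration = time.monotonic() - start
    return received, duration


# Usage (hypothetical server and file name, not executed here):
#   nbytes, seconds = timed_download("myservername", "user", "password", "1Mo_file1")
#   print(throughput_kbps(nbytes, seconds), "kbit/s")
```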

Other FTP clients
A lot of FTP client software is available for Windows, often for free.
The graphical user interface may be useful when one needs to quickly see the
content of a server. Such software may also be used for performance
tests, but the throughput value reported may be computed differently from DOS:
consider it with care and validate the relevance of the result with a few samples, or
simply use only DOS for performance tests.
Well-known FTP clients:

-SmartFTP (http://www.smartftp.com/ - free)
-FileZilla (http://sourceforge.net/projects/filezilla/ - free)
Note: Many FTP servers are also available for free (e.g. FileZilla Server)
Annex - Updating Windows registry
Tools
Updating Windows registry
Tool
The easiest way to update the Windows registry is to use a dedicated tool, like
Dr.TCP, available at http://www.dslreports.com/drtcp
Restarting the network interface is
enough to apply the
parameters directly associated with this
interface; however, Windows must be
restarted for global parameters (such as
RWIN) to take effect.
Always check in the IP trace that the setting
has really been taken into account.
thank you!
