
TIME-WAIT Hack

For High Performance Ephemeral Connection in Linux TCP Stack


E A Faisal
eafaisal@nexoprima.com

$ whoami
Engku Ahmad Faisal

github.com/efaisal
twitter.com/efaisal
facebook.com/eafaisal
plus.google.com/u/0/+EAFaisal

Linux user since 1996/1997


Attempted to contribute to open source projects:
few accepted, most rejected ;-P

$ whoami
Worked with Nexo Prima Sdn Bhd
Open Source Cloud Infrastructure

Virtualisation: oVirt/OpenStack
Storage: Gluster/Ceph

High Availability & Scalability Infrastructure

Linux-based solutions

System Performance Tuning & Profiling

Focusing on web-based applications on the Linux platform

TCP STATE MACHINE

TCP :: ACTIVE CLOSE


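[Diagram: active close, transitions labelled event/action]

3-way handshake -> ESTABLISHED
ESTABLISHED --close()/fin--> FIN_WAIT_1
FIN_WAIT_1 --ack/- --> FIN_WAIT_2
FIN_WAIT_1 --fin/ack--> CLOSING
FIN_WAIT_1 --fin+ack/ack--> TIME_WAIT
FIN_WAIT_2 --fin/ack--> TIME_WAIT
CLOSING --ack/- --> TIME_WAIT
TIME_WAIT --2MSL Timeout--> CLOSED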
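The active-close path can be sketched as a small lookup table. This is only an illustration of the diagram, not kernel code; the names ACTIVE_CLOSE and walk are made up for this example.

```python
# Active-close transitions: (state, event) -> (action, next_state).
# An event is a call or a received segment; "-" means nothing is sent.
# ACTIVE_CLOSE and walk() are illustrative names, not a real API.
ACTIVE_CLOSE = {
    ("ESTABLISHED", "close()"): ("fin", "FIN_WAIT_1"),
    ("FIN_WAIT_1", "ack"): ("-", "FIN_WAIT_2"),
    ("FIN_WAIT_1", "fin"): ("ack", "CLOSING"),
    ("FIN_WAIT_1", "fin+ack"): ("ack", "TIME_WAIT"),
    ("FIN_WAIT_2", "fin"): ("ack", "TIME_WAIT"),
    ("CLOSING", "ack"): ("-", "TIME_WAIT"),
    ("TIME_WAIT", "2MSL timeout"): ("-", "CLOSED"),
}


def walk(events, state="ESTABLISHED"):
    """Return every state visited while applying the events in order."""
    visited = [state]
    for event in events:
        _action, state = ACTIVE_CLOSE[(state, event)]
        visited.append(state)
    return visited


if __name__ == "__main__":
    # The common case: our FIN is acked, then the peer sends its own FIN.
    print(" -> ".join(walk(["close()", "ack", "fin", "2MSL timeout"])))
```

Every path from ESTABLISHED to CLOSED in this table passes through TIME_WAIT, which is why the side that closes first always pays the 2MSL wait.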

TCP :: ACTIVE CLOSE


By the initiator of close()
TIME-WAIT & 2MSL are there for good reasons:

because of the nature of the Internet - packets get lost, are retransmitted, or arrive late


to ensure the other end has closed properly

RFC 793 sets the MSL to 2 minutes, so 2MSL is 4 minutes


2MSL:

MS Windows - 4 minutes
Linux - 1 minute (hard coded)

TIME-WAIT is good for TCP communication over the Internet

TCP :: PASSIVE CLOSE


[Diagram: passive close, transitions labelled event/action]

3-way handshake -> ESTABLISHED
ESTABLISHED --fin/ack--> CLOSE_WAIT
CLOSE_WAIT --close()/fin--> LAST_ACK
LAST_ACK --ack/- --> CLOSED

TCP :: PASSIVE CLOSE


By the receiver of close()
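CLOSE-WAIT

lasts until the local application calls close()


tcp_fin_timeout (60 seconds by default in Linux) limits how long an orphaned FIN_WAIT_2 connection is kept on the closing side; it does not control CLOSE-WAIT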

WARNING!
Some resources on the Web wrongly tell their readers to tweak tcp_fin_timeout to tune TIME-WAIT

WEB APPLICATION OF TODAY

SIMPLIFIED WEB APP STACK


[Diagram: simplified web app stack]

Client -> Load Balancer -> Web App
Web App -> Cache, Database, MQ, REST API

WEB APP STACK


Supporting services for the Web App layer typically use TCP as the transport protocol
The Web App layer is both:

a TCP server listening for connections from clients


a TCP client connecting to various supporting services

Consider a LAMP stack + memcached server

Each HTTP request opens a TCP connection to memcached


At the end of the request, the connection is closed
OMG! Ephemeral connection!

If we have more supporting services (MQ, REST API, etc.), there may be more open/close operations for each request
HTTP is considered ephemeral by nature
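A minimal sketch of the open/use/close pattern described above, run against a throwaway listener on the loopback interface instead of a real memcached. The helper names (serve, ephemeral_requests) and the payload are made up for illustration. Because the client closes first, the client side is the active closer and its local port is the one that ends up in TIME_WAIT.

```python
import socket
import threading


def serve(listener, count):
    """Accept `count` connections and let each client close first."""
    for _ in range(count):
        conn, _addr = listener.accept()
        with conn:
            while conn.recv(1024):  # recv() returns b"" after the client's FIN
                pass


def ephemeral_requests(count):
    """Open and close `count` short-lived connections; return local ports used."""
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: any free port
    server = threading.Thread(target=serve, args=(listener, count), daemon=True)
    server.start()
    ports = []
    try:
        for _ in range(count):
            with socket.create_connection(listener.getsockname()) as client:
                ports.append(client.getsockname()[1])
                client.sendall(b"get some_key\r\n")
            # Leaving the with-block closes the client socket: active close.
        server.join()
    finally:
        listener.close()
    return ports


if __name__ == "__main__":
    print(ephemeral_requests(5))
```

Each call takes a fresh local port from the ephemeral range, which is the resource the following slides are concerned with.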

IMPACT AND PROBLEMS

BUSY SERVER WITH EPHEMERAL CONNECTIONS


Busy server, e.g. 1,000 HTTP requests/second
The Web App layer also opens TCP connections to backend services at that rate or more
In 1 minute, we're going to have tens of thousands of lingering TCP connections in TIME_WAIT (1,000/second × 60 seconds = 60,000)
You can check using the netstat or ss command
$ ss -nt state time-wait
$ netstat -tn | grep TIME_WAIT
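The same count can be taken programmatically. The sketch below assumes the usual Linux /proc/net/tcp layout, where the fourth column (st) is a hexadecimal state code and 06 means TIME_WAIT; count_states is a made-up helper name, and IPv6 sockets would need /proc/net/tcp6 as well.

```python
from collections import Counter

# State codes found in the "st" column of /proc/net/tcp (hexadecimal).
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_states(proc_net_tcp_text):
    """Count sockets per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) > 3:
            # Unknown codes are kept as-is instead of being dropped.
            counts[TCP_STATES.get(fields[3], fields[3])] += 1
    return counts


if __name__ == "__main__":
    with open("/proc/net/tcp") as f:  # Linux only
        print(count_states(f.read()).get("TIME_WAIT", 0))
```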

PROBLEMS: CONNECTION TABLE SLOT


A connection in TIME-WAIT state holds a local port for 1 minute
The local port range is finite - a port is a 16-bit integer
In many distros, the default range is around 30,000 ports
Can be changed: net.ipv4.ip_local_port_range
If the local port range is exhausted, any connect() fails with EADDRNOTAVAIL
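A back-of-the-envelope sketch of what the port budget allows. It assumes every closed connection pins one local port for the full 60 seconds and that all connections go from one client address to one server address and port. The range 32768-60999 in the comment is only an example value; read your own from net.ipv4.ip_local_port_range. The function names are made up for illustration.

```python
TIME_WAIT_SECONDS = 60  # Linux value quoted in these slides


def parse_port_range(text):
    """Parse the two integers found in ip_local_port_range."""
    low, high = (int(part) for part in text.split())
    return low, high


def max_sustained_rate(low, high, hold_seconds=TIME_WAIT_SECONDS):
    """New connections/second before the range runs dry (simplified model)."""
    return (high - low + 1) / hold_seconds


if __name__ == "__main__":
    # Linux only. An example range of 32768-60999 gives roughly 470/second.
    with open("/proc/sys/net/ipv4/ip_local_port_range") as f:
        low, high = parse_port_range(f.read())
    print(f"{low}-{high}: about {max_sustained_rate(low, high):.0f} conn/s")
```

Under these assumptions, a rate of 1,000 new backend connections per second is well above what a range of about 30,000 ports can sustain.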

PROBLEMS: ADDITIONAL MEMORY & CPU USAGE


Memory Usage to Hold Socket Structures

Not really significant, but annoying enough

Additional CPU Usage

Searching for a free port uses CPU


CPU cycles are wasted iteratively purging tons of TIME_WAIT connections

EXISTING & POTENTIAL SOLUTIONS

SOLUTION 1: tcp_tw_reuse
From the Linux doc:
"Allow to reuse TIME-WAIT sockets for new connections when it is safe from protocol viewpoint. Default value is 0. It should not be changed without advice/request of technical experts."
Commonly recommended to be enabled
$ echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse
Depends on another kernel param being enabled: net.ipv4.tcp_timestamps
Does it really work?
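Since tcp_tw_reuse only helps when tcp_timestamps is also on, it is worth checking both together. A minimal sketch, assuming any non-zero value counts as enabled for both settings; read_int and tw_reuse_effective are made-up helper names.

```python
from pathlib import Path

PROC_IPV4 = Path("/proc/sys/net/ipv4")


def read_int(name, base=PROC_IPV4):
    """Read an integer tunable such as tcp_tw_reuse from proc fs."""
    return int((base / name).read_text().strip())


def tw_reuse_effective(tw_reuse, timestamps):
    """True only when both settings are non-zero (assumption: non-zero = on)."""
    return tw_reuse != 0 and timestamps != 0


if __name__ == "__main__":
    # Linux only.
    reuse = read_int("tcp_tw_reuse")
    stamps = read_int("tcp_timestamps")
    print("tcp_tw_reuse usable:", tw_reuse_effective(reuse, stamps))
```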

SOLUTION 2: TIME-WAIT NEGOTIATION


Proposed by Theodore Faber, Joe Touch & Wei Yue from the University of Southern California in 1999
No code available; the authors claimed to have experimental code written for SunOS 4.1.3
Involves modifying TCP by adding a new TCP option called TW-Negotiate, negotiated during the three-way handshake
Not a viable solution, simply a theoretical one

INTRODUCING LINUXTCPTW

LINUXTCPTW
Implementation of an old idea

Once discussed on the kernel core dev mailing list - make TIME-WAIT tunable


Rejected by the kernel core devs - TIME-WAIT is there for good reasons
Easily abused to make TCP non-compliant with the standard
Open source project to create a patch set for the kernel that makes TIME-WAIT configurable
Introduces a new kernel param - tcp_timewait_len
A new entry in proc fs - /proc/sys/net/ipv4/tcp_timewait_len
Can be configured with sysctl - net.ipv4.tcp_timewait_len
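A small sketch for checking whether the running kernel carries the patch: the proc fs entry named above only exists on a patched kernel. The value is returned as raw text because the slides do not state its unit or format; read_timewait_len is a made-up helper name.

```python
from pathlib import Path

# Entry added by the linuxtcptw patch set; absent on stock kernels.
TIMEWAIT_LEN = Path("/proc/sys/net/ipv4/tcp_timewait_len")


def read_timewait_len(path=TIMEWAIT_LEN):
    """Return the raw tunable text, or None if the kernel is not patched."""
    try:
        return path.read_text().strip()
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    value = read_timewait_len()
    if value is None:
        print("tcp_timewait_len not present: kernel is not patched")
    else:
        print("tcp_timewait_len =", value)
```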

THE PROJECT
Project lives at https://github.com/efaisal/linuxtcptw/
Binary releases available for CentOS 6 and 7 at https://github.com/efaisal/linuxtcptw/releases
Unfortunately not battle-tested in a production environment yet - any volunteers?
Currently working on the Ubuntu 14.04 LTS kernel

THANK YOU
