
System Tuning Info for Linux Servers

NOTE: Most of the info on this page is about 3 years, and one or two kernel versions, out of date!
This page is about optimizing and tuning Linux based systems for server oriented tasks. Most of the info presented here I've used myself, and have found it to be beneficial. I've tried to avoid the well-trodden ground (hdparm, turning off hostname lookups in apache, etc) as that info is easy to find elsewhere.
Some cases where you might want to apply some of this info include benchmarking, high traffic web sites, or any load spike (say, a web transfered virus is pegging your servers with bogus requests).
Disk Tuning
File system Tuning
SCSI Tuning
Disk I/O Elevators
Network Interface Tuning
TCP Tuning
File limits
Process limits
Threads
NFS
Apache and other web servers
Samba
Openldap tuning
SysV shm
Ptys and ttys
Benchmarks
System Monitoring
Utilities
System Tuning Links
Music
Thanks
TODO
Changes
File and Disk Tuning
Benchmark performance is often heavily based on disk I/O performance. So getting as much disk I/O throughput as possible is the real key.
Depending on the array, the disks used, and the controller, you may want to try software raid. It is tough to beat software raid performance on a modern cpu with a fast disk controller.
The easiest way to configure software raid is to do it during the install. If you use the gui installer, there are options in the disk partition screen to create an "md", or multiple device, linux talk for a software raid partition. You will need to make partitions on each of the drives of type "linux raid", and then after creating all these partitions, create a new partition, say "/test", and select md as its type. Then you can select all the partitions that should be part of it, as well as the raid type. For pure performance, RAID 0 is the way to go.
Note that by default, I believe, you are limited to 12 drives in an MD device, so you may be limited to that. If the drives are fast enough, that should be sufficient to get >100 MB/s pretty consistently.
One thing to keep in mind is that the position of a partition on a hard drive does have performance implications. Partitions stored at the very outer edge of a drive tend to be significantly faster than those on the inside. A good benchmarking trick is to use RAID across several drives, but only use a very small partition on the outside of each disk. This gives both consistent performance, and the best performance. On most modern drives, or at least drives using ZCAV (Zoned Constant Angular Velocity), this tends to be the sectors with the lowest address, aka, the first partitions. For a way to see the differences illustrated, see the ZCAV page.
This is just a summary of software RAID configuration. More detailed info can be found elsewhere, including the Software-RAID-HOWTO, and the docs and man pages from the raidtools package.
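For reference, a minimal sketch of what an /etc/raidtab for a two drive RAID 0 array might look like (the device names and chunk size here are assumptions, adjust them for your hardware):
raiddev /dev/md0
        raid-level              0
        nr-raid-disks           2
        persistent-superblock   1
        chunk-size              64
        # assumed component partitions, of type "linux raid"
        device                  /dev/sda1
        raid-disk               0
        device                  /dev/sdb1
        raid-disk               1
After writing the file, `mkraid /dev/md0` builds the array, and the md device can then be mkfs'ed and mounted like any other block device.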
File System Tuning
Some of the default kernel parameters for system performance are geared more towards workstation performance than file server/large disk io type of operations. The most important of these is the "bdflush" value in /proc/sys/vm/bdflush.
These values are documented in detail in /usr/src/linux/Documentation/sysctl/vm.txt.
A good set of values for this type of server is:
echo 100 5000 640 2560 150 30000 5000 1884 2 > /proc/sys/vm/bdflush
(you change these values by just echo'ing the new values to the file. This takes effect immediately. However, it needs to be reinitialized at each kernel boot. The simplest way to do this is to put this command into the end of /etc/rc.d/rc.local)
Also, for pure file server applications like web and samba servers, you probably want to disable the "atime" option on the filesystem. This disables updating the "atime" value for the file, which indicates the last time a file was accessed. Since this info isn't very useful in this situation, and causes extra disk hits, it's typically disabled. To do this, just edit /etc/fstab and add "noatime" as a mount option for the filesystem.
for example:
/dev/rd/c0d0p3          /test          ext2          noatime          1 2
With these file system options, a good raid setup, and the bdflush values, filesystem performance should be sufficient.
The disk i/o elevators is another kernel tuneable that can be tweaked for improved disk i/o in some cases.
SCSI Tuning
SCSI tuning is highly dependent on the particular scsi cards and drives in question. The most effective variable when it comes to SCSI card performance is tagged command queueing.
For the Adaptec aic7xxx series cards (2940's, 7890's, *160's, etc) this can be enabled with a module option like:
aic7xxx=tag_info:{{0,0,0,0,}}
This enables the default tagged command queueing on the first device, on the first 4 scsi ids.
options aic7xxx aic7xxx=tag_info:{{24.24.24.24.24.24}}
in /etc/modules.conf will set the TCQ depth to 24.
You probably want to check the driver documentation for your particular scsi modules for more info.
Disk I/O Elevators
On systems that are consistently doing a large amount of disk I/O, tuning the disk I/O elevators may be useful. This is a 2.4 kernel feature that allows some control over latency vs throughput by changing the way disk io elevators operate.
This works by changing how long the I/O scheduler will let a request sit in the queue before it has to be handled. Since the I/O scheduler can collapse some requests together, having a lot of items in the queue means more can be coalesced, which can increase throughput.
Changing the max latency on items in the queue allows you to trade disk i/o latency for throughput, and vice versa.
The tool "/sbin/elvtune" (part of util-linux) allows you to change these max latency values. Lower values mean less latency, but also less throughput. The values can be set for the read and write queues separately.
To determine what the current settings are, just issue:
/sbin/elvtune /dev/hda1
substituting the appropriate device of course. Default values are 8192 for read, and 16384 for writes.
To set new values of 2000 for read and 4000 for write, for example:
/sbin/elvtune -r 2000 -w 4000 /dev/hda1
Note that these values are for example purposes only, and are not recommended tuning values. That depends on the situation.
The units of these values are basically "sectors of writes before reads are allowed". The kernel attempts to do all reads, then all writes, etc, in an attempt to prevent disk io mode switching, which can be slow. So this allows you to alter how long it waits before switching.
One way to get an idea of the effectiveness of these changes is to monitor the output of `iostat -d -x DEVICE`. The "avgrq-sz" and "avgqu-sz" values (average size of request and average queue length, see the man page for iostat) should be affected by these elevator changes. Lowering the latency should cause the "avgrq-sz" to go down, for example.
See the elvtune man page for more info. Some info from when this feature was introduced is also at Lwn.net.
This info contributed by Arjan van de Ven.
Network Interface Tuning
Most benchmarks benefit heavily from making sure the NICs in use are well supported, with a well written driver. Examples include eepro100, tulip's, newish 3com cards, and acenic and syskonnect gigabit cards.
Making sure the cards are running in full duplex mode is also very often critical to benchmark performance. Depending on the networking hardware used, some of the cards may not autosense properly and may not run full duplex by default.
Many cards include module options that can be used to force the cards into full duplex mode. Some examples for common cards include:
alias eth0 eepro100
options eepro100 full_duplex=1
alias eth1 tulip
options tulip full_duplex=1
Though full duplex gives the best overall performance, I've seen some circumstances where setting the cards to half duplex will actually increase throughput, particularly in cases where the data flow is heavily one sided.
If you think you're in a situation where that may help, I would suggest trying it and benchmarking it.
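If the driver supports the MII ioctls, the `mii-tool` utility from the net-tools package can be used to check and force the link mode. A sketch (not all drivers support this, and the media name is just one common choice):
# show the currently negotiated link mode
/sbin/mii-tool eth0
# force 100Mbit full duplex
/sbin/mii-tool -F 100baseTx-FD eth0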
TCP tuning
For servers that are serving up huge numbers of concurrent sessions, there are some tcp options that should probably be enabled. With a large # of clients doing their best to kill the server, it's probably not uncommon for the server to have 20000 or more open sockets.
In order to optimize TCP performance for this situation, I would suggest tuning the following parameters.
echo 1024 65000 > /proc/sys/net/ipv4/ip_local_port_range
Allows more local ports to be available. Generally not an issue, but in a benchmarking scenario you often need more ports available. A common example is clients running `ab` or `http_load` or similar software.
In the case of firewalls, or other servers doing NAT or masquerading, you may not be able to use the full port range this way, because of the need for high ports for use in NAT.
Increasing the amount of memory associated with socket buffers can often improve performance. Things like NFS in particular, or apache setups with large buffers configured, can benefit from this.
echo 262143 > /proc/sys/net/core/rmem_max
echo 262143 > /proc/sys/net/core/rmem_default
This will increase the amount of memory available for socket input queues. The "wmem_*" values do the same for output queues.
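To bump the output queues to match, the equivalent knobs would be (same illustrative values as above):
echo 262143 > /proc/sys/net/core/wmem_max
echo 262143 > /proc/sys/net/core/wmem_default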
Note: With 2.4.x kernels, these values are supposed to "autotune" fairly well, and some people suggest just instead changing the values in:
/proc/sys/net/ipv4/tcp_rmem
/proc/sys/net/ipv4/tcp_wmem
There are three values here, "min default max".
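For example (these particular numbers are illustrative assumptions, not recommendations):
# min, default, and max socket buffer sizes, in bytes
echo "4096 87380 4194304" > /proc/sys/net/ipv4/tcp_rmem
echo "4096 65536 4194304" > /proc/sys/net/ipv4/tcp_wmem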
The following settings reduce the amount of work the TCP stack has to do, so they are often helpful in this situation:
echo 0 > /proc/sys/net/ipv4/tcp_sack
echo 0 > /proc/sys/net/ipv4/tcp_timestamps
File Limits and the like
Open tcp sockets, and things like apache, are prone to opening a large number of file descriptors. The default number of available FDs is 4096, but this may need to be upped for this scenario.
The theoretical limit is roughly a million file descriptors, though I've never been able to get close to that many open.
I'd suggest doubling the default, and trying the test. If you still run out of file descriptors, double it again.
For example:
echo 128000 > /proc/sys/fs/inode-max
echo 64000 > /proc/sys/fs/file-max
and as root:
ulimit -n 64000
Note: On 2.4 kernels, the "inode-max" entry is no longer needed.
You probably want to add these to /etc/rc.d/rc.local so they get set on each boot.
There are more than a few ways to make these changes "sticky". In Red Hat Linux, you can use /etc/sysctl.conf and /etc/security/limits.conf to set and save these values.
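For example, a sysctl.conf entry to make the file-max setting above persistent might look like this (a sketch; the sysctl key name mirrors the /proc path):
# /etc/sysctl.conf -- fs.file-max maps to /proc/sys/fs/file-max
fs.file-max = 64000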
If you get errors of the variety "Unable to open file descriptor", you definitely need to up these values.
You can examine the contents of /proc/sys/fs/file-nr to determine the number of allocated file handles, the number of file handles currently being used, and the max number of file handles.
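For example (the numbers here are purely illustrative):
cat /proc/sys/fs/file-nr
2858    543     64000
The three columns are the allocated, in-use, and maximum file handle counts described above.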
Process Limits
For heavily used web servers, or machines that spawn off lots and lots of processes, you probably want to up the limit of processes for the kernel.
Also, the 2.2 kernel itself has a max process limit. The default value for this is 2560, but a kernel recompile can take this as high as 4000. This is a limitation in the 2.2 kernel, and has been removed from 2.3/2.4.
If you're running into the limit of how many tasks the kernel can handle by default, you may have to rebuild the kernel after editing:
/usr/src/linux/include/linux/tasks.h
and change:
#define NR_TASKS 2560 /* On x86 Max 4092, or 4090 w/APM configured. */
to
#define NR_TASKS 4000 /* On x86 Max 4092, or 4090 w/APM configured. */
and:
#define MAX_TASKS_PER_USER (NR_TASKS/2)
to
#define MAX_TASKS_PER_USER (NR_TASKS)
Then recompile the kernel.
also run:
ulimit -u 4000
Note: This process limit is gone in the 2.4 kernel series.
Threads
Limitations on threads are tightly tied to both file descriptor limits, and process limits.
Under Linux, threads are counted as processes, so any limits on the number of processes also apply to threads. In a heavily threaded app like a threaded TCP engine, or a java server, you can quickly run out of threads.
For starters, you want to get an idea of how many threads you can open. The `thread-limit` util mentioned in the Tuning Utilities section is probably as good as any.
The first step to increasing the possible number of threads is to make sure you have boosted any process limits as mentioned before.
There are a few things that can limit the number of threads, including process limits, memory limits, mutex/semaphore/shm/ipc limits, and compiled in thread limits. For most cases, the process limit is the first one to run into, then the compiled in thread limits, then the memory limits.
To increase the limits, you have to recompile glibc. Oh fun! And the patch is essentially two lines! Woohoo!
--- ./linuxthreads/sysdeps/unix/sysv/linux/bits/local_lim.h.akl  Mon Sep 4 16:37:42 2000
+++ ./linuxthreads/sysdeps/unix/sysv/linux/bits/local_lim.h      Mon Sep 4 16:37:56 2000
@@ -64,7 +64,7 @@
 /* The number of threads per process.  */
 #define _POSIX_THREAD_THREADS_MAX      64
 /* This is the value this implementation supports.  */
-#define PTHREAD_THREADS_MAX    1024
+#define PTHREAD_THREADS_MAX    8192

 /* Maximum amount by which a process can descrease its asynchronous I/O
    priority level.  */
--- ./linuxthreads/internals.h.akl  Mon Sep 4 16:36:58 2000
+++ ./linuxthreads/internals.h      Mon Sep 4 16:37:23 2000
@@ -330,7 +330,7 @@
    THREAD_SELF implementation is used, this must be a power of two and
    a multiple of PAGE_SIZE.  */
 #ifndef STACK_SIZE
-#define STACK_SIZE     (2 * 1024 * 1024)
+#define STACK_SIZE     (64 * PAGE_SIZE)
 #endif

 /* The initial size of the thread stack.  Must be a multiple of PAGE_SIZE.  */
Now just patch glibc, rebuild, and install it. If you have a package based system, I seriously suggest making a new package and using it.
Some info on how to do this is at Jlinux.org. They describe how to increase the number of threads so Java apps can use them.
NFS
A good resource on NFS tuning on linux is the linux NFS HOWTO. Most of this info is gleaned from there.
But the basic tuning steps include:
Try using NFSv3 if you are currently using NFSv2. There can be very significant performance increases with this change.
Increasing the read write block size. This is done with the rsize and wsize mount options. They need to be the mount options used by the NFS clients. Values of 4096 and 8192 reportedly increase performance a lot. But see the notes in the HOWTO about experimenting and measuring the performance implications. The limits on these are 8192 for NFSv2 and 32768 for NFSv3.
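For example, a client side mount using the bigger block sizes might look like this (the server name and mount point are placeholders):
mount -o rsize=8192,wsize=8192 fileserver:/export /mnt/export
# or the equivalent /etc/fstab line:
fileserver:/export  /mnt/export  nfs  rsize=8192,wsize=8192  0 0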
Another approach is to increase the number of nfsd threads running. This is normally controlled by the nfsd init script. On Red Hat Linux machines, the value "RPCNFSDCOUNT" in the nfs init script controls this value. The best way to determine if you need this is to experiment. The HOWTO mentions a way to determine thread usage, but that doesn't seem supported in all kernels.
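On a Red Hat Linux box, bumping the thread count means editing something like the following in the nfs init script (16 here is just a starting point for experimentation, not a recommendation):
# in /etc/rc.d/init.d/nfs
RPCNFSDCOUNT=16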
Another good tool for getting some handle on NFS server performance is `nfsstat`. This util reads the info in /proc/net/rpc/nfs[d] and displays it in a somewhat readable format.
Some info intended for tuning Solaris, but useful for its description of the nfsstat format.
See also the tcp tuning info.
Apache config
Make sure you start a ton of initial daemons if you want good benchmark scores.
Something like:
#######
MinSpareServers 20
MaxSpareServers 80
StartServers 32
# this can be higher if apache is recompiled
MaxClients 256
MaxRequestsPerChild 10000

Note: Starting a massive amount of httpd processes is really a benchmark hack. In most real world cases, setting a high number for max servers, and a sane spare server setting, will be more than adequate. It's just the instant-on load that benchmarks typically generate that the StartServers helps with.
The MaxRequestsPerChild should be bumped up if you are sure that your httpd processes do not leak memory. Setting this value to 0 will cause the processes to never reach a limit.
One of the best resources on tuning these values, especially for app servers, is the mod_perl performance tuning documentation.
Bumping the number of available httpd processes
Apache sets a maximum number of possible processes at compile time. It is set to 256 by default, but in this kind of scenario, can often be exceeded.
To change this, you will need to change the hardcoded limit in the apache source code, and recompile it. An example of the change is below:
--- apache_1.3.6/src/include/httpd.h.prezab  Fri Aug 6 20:11:14 1999
+++ apache_1.3.6/src/include/httpd.h         Fri Aug 6 20:12:50 1999
@@ -306,7 +306,7 @@
  * the overhead.
  */
 #ifndef HARD_SERVER_LIMIT
-#define HARD_SERVER_LIMIT 256
+#define HARD_SERVER_LIMIT 4000
 #endif

 /*
To make use of this many apache's, however, you will also need to boost the number of processes supported, at least for 2.2 kernels. See the section on kernel process limits for info on increasing this.
The biggest scalability problem with apache, 1.3.x versions at least, is its model of using one process per connection. In cases where there are large amounts of concurrent connections, this can require a large amount of resources. These resources can include RAM, scheduler slots, ability to grab locks, database connections, file descriptors, and others.
In cases where each connection takes a long time to complete, this is only compounded. Connections can be slow to complete because of large amounts of cpu or i/o usage in dynamic apps, large files being transfered, or just talking to clients on slow links.
There are several strategies to mitigate this. The basic idea being to free up heavyweight apache processes from having to handle slow to complete connections.
Static Content Servers
If the servers are serving lots of static files (images, videos, pdf's, etc), a common approach is to serve these files off a dedicated server. This could be a very light apache setup, or in many cases, something like thttpd, boa, khttpd, or TUX. In some cases it is possible to run the static server on the same server, addressed via a different hostname.
For purely static content, some of the other smaller more lightweight web servers can offer very good performance. They aren't nearly as powerful or as flexible as apache, but for very specific performance crucial tasks, they can be a big win.
Boa: http://www.boa.org/
thttpd: http://www.acme.com/software/thttpd/
mathopd: http://mathop.diva.nl
If you need even more ExtremeWebServerPerformance, you probably want to take a look at TUX, written by Ingo Molnar. This is the current world record holder for SpecWeb99. It probably owns the right to be called the world's fastest web server.
Proxy Usage
For servers that are serving dynamic content, or ssl content, a better approach is to employ a reverse-proxy. Typically, this would be done with either apache's mod_proxy, or Squid. There can be several advantages from this type of configuration, including content caching, load balancing, and the prospect of moving slow connections to lighter weight servers.
The easiest approach is probably to use mod_proxy and the "ProxyPass" directive to pass content to another server. mod_proxy supports a degree of caching that can offer a significant performance boost. But another advantage is that since the proxy server and the web server are likely to have a very fast interconnect, the web server can quickly serve up large content, freeing up an apache process, while the proxy slowly feeds out the content to clients. This can be further enhanced by increasing the amount of socket buffer memory that's available to the kernel. See the section on tcp tuning for info on this.
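A minimal sketch of such a setup in the proxy's httpd.conf (the backend hostname and path are placeholders):
# forward /app/ requests to the backend server, and rewrite
# redirects from the backend so clients keep talking to the proxy
ProxyPass        /app/ http://backend.example.com/app/
ProxyPassReverse /app/ http://backend.example.com/app/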
proxy links
Info on using mod_proxy in conjunction with mod_perl
webtechniques article on using mod_proxy
mod_proxy home page
Squid
Using mod_proxy with Zope
ListenBacklog
One of the most frustrating things for a user of a website is to get "connection refused" error messages. With apache, the common cause of this is for the number of concurrent connections to exceed the number of available httpd processes that are available to handle connections.
The apache ListenBacklog parameter lets you specify what backlog parameter is set to listen(). By default on linux, this can be as high as 128.
Increasing this allows a limited number of httpd's to handle a burst of attempted connections.
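For example (512 is an arbitrary illustration; note that the kernel side limit in /proc/sys/net/core/somaxconn may also cap the effective backlog, and can be raised the same echo way as the other /proc tunables):
# in httpd.conf
ListenBacklog 512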
There are some experimental patches from SGI that accelerate apache. More info at:
http://oss.sgi.com/projects/apache/
I haven't really had a chance to test the SGI patches yet, but I've been told they are pretty effective.
Samba Tuning
Depending on the type of tests, there are a number of tweaks you can do to samba to improve its performance over the default. The default is best for general purpose file sharing, but for extreme uses, there are a couple of tweaks.
The first one is to rebuild it with mmap support. In cases where you are serving up a large amount of small files, this seems to be particularly useful. You just need to add a "--with-mmap" to the configure line.
You also want to make sure the following options are enabled in the /etc/smb.conf file:
read raw = no
read prediction = true
level2 oplocks = true
One of the better resources for tuning samba is the "Using Samba" book from O'Reilly. The chapter on performance tuning is available online.
Openldap tuning
The most important tuning aspect for OpenLDAP is deciding what attributes you want to build indexes on.
I use the values:
cachesize 10000
dbcachesize 100000
sizelimit 10000
loglevel 0
dbcacheNoWsync

index cn,uid
index uidnumber
index gid
index gidnumber
index mail
If you add these parameters to /etc/openldap/slapd.conf before entering the info into the database, they will all get indexed and performance will increase.
SysV shm
Some applications, databases in particular, sometimes need large amounts of SHM segments and semaphores. The default limit for the number of shm segments is 128 for 2.2 kernels.
This limit is set in a couple of places in the kernel, and requires a modification of the kernel source and a recompile to increase them.
A sample diff to bump them up:
--- linux/include/linux/sem.h.save  Wed Apr 12 20:28:37 2000
+++ linux/include/linux/sem.h       Wed Apr 12 20:29:03 2000
@@ -60,7 +60,7 @@
        int semaem;
 };
-#define SEMMNI  128             /* ?  max # of semaphore identifiers */
+#define SEMMNI  512             /* ?  max # of semaphore identifiers */
 #define SEMMSL  250             /* <= 512 max num of semaphores per id */
 #define SEMMNS  (SEMMNI*SEMMSL) /* ? max # of semaphores in system */
 #define SEMOPM  32              /* ~ 100 max num of ops per semop call */
--- linux/include/asm-i386/shmparam.h.save  Wed Apr 12 20:18:34 2000
+++ linux/include/asm-i386/shmparam.h       Wed Apr 12 20:28:11 2000
@@ -21,7 +21,7 @@
  * Keep _SHM_ID_BITS as low as possible since SHMMNI depends on it and
  * there is a static array of size SHMMNI.
  */
-#define _SHM_ID_BITS   7
+#define _SHM_ID_BITS   10
 #define SHM_ID_MASK    ((1<<_SHM_ID_BITS)-1)
 #define SHM_IDX_SHIFT  (_SHM_ID_BITS)
Theoretically, the _SHM_ID_BITS can go as high as 11. The rule is that _SHM_ID_BITS + _SHM_IDX_BITS must be <= 24 on x86.
In addition to the number of shared memory segments, you can control the maximum amount of memory allocated to shm at run time via the /proc interface. /proc/sys/kernel/shmmax indicates the current limit. Echo a new value to it to increase it.
echo "67108864" > /proc/sys/kernel/shmmax
to double the default value.
A good resource on this is "Tuning The Linux Kernel's Memory".
The best way to see what the current values are is to issue the command:
ipcs -l
Ptys and ttys
The number of ptys and ttys on a box can sometimes be a limiting factor for things like login servers and database servers.
On Red Hat Linux 7.x, the default limit on ptys is set to 2048 for i686 and athlon kernels. Standard i386 and similar kernels default to 256 ptys.
The config directive CONFIG_UNIX98_PTY_COUNT defaults to 256, but can be set as high as 2048. For 2048 ptys to be supported, the value of UNIX98_PTY_MAJOR_COUNT needs to be set to 8 in include/linux/major.h.
With the current device number scheme and allocations, the maximum number of ptys is 2048.
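The corresponding kernel config fragment would look something like this (a sketch, from the character devices section of the kernel config):
CONFIG_UNIX98_PTYS=y
CONFIG_UNIX98_PTY_COUNT=2048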
Benchmarks
Lies, damn lies, and statistics.
But aside from that, a good set of benchmarking utilities are often very helpful in doing system tuning work. It is impossible to duplicate "real world" situations, but that isn't really the goal of a good benchmark. A good benchmark typically tries to measure the performance of one particular thing very accurately. If you understand what the benchmarks are doing, they can be very useful tools.
Some of the common and useful benchmarks include:
Bonnie
Bonnie has been around forever, and the numbers it produces are meaningful to many people. If nothing else, it's a good tool for producing info to share with others. This is a pretty common utility for testing driver performance. Its only drawback is it sometimes requires the use of huge datasets on large memory machines to get useful results, but I suppose that goes with the territory.
Check Doug Ledford's list of benchmarks for more info on Bonnie. There is also a somewhat newer version of Bonnie called Bonnie++ that fixes a few bugs, and includes a couple of extra tests.
Dbench
My personal favorite disk io benchmarking utility is `dbench`. It is designed to simulate the disk io load of a system when running the NetBench benchmark suite. It seems to do an excellent job at making all the drive lights blink like mad. Always a good sign.
Dbench is available at the Samba ftp site and mirrors.
http_load
A nice simple http benchmarking app that does integrity checking, parallel requests, and simple statistics. Generates load based off a test file of urls to hit, so it is flexible.
http_load is available from ACME Labs.
dkftpbench
A (the?) ftp benchmarking utility. Designed to simulate real world ftp usage (large number of clients, throttling connections to modem speeds, etc). Handy. Also includes the useful dklimits utility.
dkftpbench is available from Dan Kegel's page.
tiobench
A multithreaded disk io benchmarking utility. Seems to do a good job at pounding on the disks. Comes with some useful scripts for generating reports and graphs.
The tiobench site.
dt
dt does a lot: disk io, process creation, async io, etc.
dt is available at The dt page.
ttcp
A tcp/udp benchmarking app. Useful for getting an idea of max network bandwidth of a device. Tends to be more accurate than trying to guesstimate with ftp or other protocols.
netperf
Netperf is a benchmark that can be used to measure the performance of many different types of networking. It provides tests for both unidirectional throughput, and end-to-end latency. The environments currently measurable by netperf include: TCP and UDP via BSD Sockets, DLPI, Unix Domain Sockets, Fore ATM API, HiPPI.
Info: http://www.netperf.org/netperf/NetperfPage.html
Download: ftp://ftp.sgi.com/sgi/src/netperf/
Info provided by Bill Hilf.
httperf
httperf is a popular web server benchmark tool for measuring web server performance. It provides a flexible facility for generating various HTTP workloads and for measuring server performance. The focus of httperf is not on implementing one particular benchmark but on providing a robust, high-performance tool that facilitates the construction of both micro- and macro-level benchmarks. The three distinguishing characteristics of httperf are its robustness, which includes the ability to generate and sustain server overload, support for the HTTP/1.1 protocol, and its extensibility to new workload generators and performance measurements.
Info: http://www.hpl.hp.com/personal/David_Mosberger/httperf.html
Download: ftp://ftp.hpl.hp.com/pub/httperf/
Info provided by Bill Hilf.
Autobench
Autobench is a simple Perl script for automating the process of benchmarking a web server (or for conducting a comparative test of two different web servers). The script is a wrapper around httperf. Autobench runs httperf a number of times against each host, increasing the number of requested connections per second on each iteration, and extracts the significant data from the httperf output, delivering a CSV or TSV format file which can be imported directly into a spreadsheet for analysis/graphing.
Info: http://www.xenoclast.org/autobench/
Download: http://www.xenoclast.org/autobench/downloads
Info provided by Bill Hilf.
General benchmark Sites
Doug Ledford's page
ReiserFS benchmark page
System Monitoring
Standard, and not so standard, system monitoring tools that can be useful when trying to tune a system.
vmstat
This util is part of the procps package, and can provide lots of useful info when diagnosing performance problems.
Here's a sample vmstat output on a lightly used desktop:
   procs                  memory      swap        io     system        cpu
 r  b  w  swpd   free   buff  cache  si  so  bi  bo   in   cs  us  sy  id
 1  0  0  5416   2200   1856  34612   0   1   2   1  140  194   2   1  97
And here's some sample output on a heavily used server:
   procs                  memory      swap        io     system        cpu
 r  b  w  swpd   free   buff  cache  si  so  bi  bo    in     cs  us  sy  id
16  0  0  2360 264400  96672   9400   0   0   0   1    53     24   3   1  96
24  0  0  2360 257284  96672   9400   0   0   0   6  3063  17713  64  36   0
15  0  0  2360 250024  96672   9400   0   0   0   3  3039  16811  66  34   0
The interesting number here is the first one, the number of processes on the run queue. This value shows how many processes are ready to be executed, but can not be run at the moment because other processes need to finish. For lightly loaded systems, this is almost never above 1-3, and numbers consistently higher than 10 indicate the machine is getting pounded.
Other interesting values include the "system" numbers for in and cs. The in value is the number of interrupts per second a system is getting. A system doing a lot of network or disk I/O will have high values here, as interrupts are generated every time something is read or written to the disk or network.
The cs value is the number of context switches per second. A context switch is when the kernel has to take the executable code for one program out of the cpu and switch in another. It's actually _way_ more complicated than that, but that's the basic idea. Lots of context switches are bad, since it takes a fairly large number of cycles to perform a context switch, so if you are doing lots of them, you are spending all your time changing jobs and not actually doing any work. I think we can all understand that concept.
netstat
Since this document is primarily concerned with network servers, the `netstat` command can often be very useful. It can show status of all incoming and outgoing sockets, which can give very handy info about the status of a network server.
One of the more useful options is:
netstat -pa
The `-p` option tells it to try to determine what program has the socket open, which is often very useful info. For example, someone nmap's their system and wants to know what is using port 666. Running netstat -pa will show you that it is "satand" running on that tcp port.
One of the most twisted, but useful invocations is:
netstat -a -n | grep -E "^(tcp)" | cut -c 68- | sort | uniq -c | sort -n
This will show you a sorted list of how many sockets are in each connection state. For example:
      9 LISTEN
     21 ESTABLISHED
ps
Okay, so everyone knows about ps. But I'll just highlight one of my favorite options:
ps -eo pid,%cpu,vsz,args,wchan
Shows every process, their pid, % of cpu, memory size, name, and what syscall they are currently executing. Nifty.
Utilities
Some simple utilities that come in handy when doing performance tuning.
dklimits
A simple util to check the actual number of file descriptors available, ephemeral ports available, and poll()-able sockets. Handy. Be warned that it can take a while to run if there are a large number of fd's available, as it will try to open that many files, and then unlink them.
This is part of the dkftpbench package.
fd-limit
A tiny util for determining the number of file descriptors available.
fd-limit.c
thread-limit
A util for determining the number of pthreads a system can use. This and fd-count are both from the system tuning page for Volano chat, a multithreaded java based chat server.
thread-limit.c
System Tuning Links
http://www.kegel.com
Check out the "c10k problem" page in particular, but the entire site has _lots_ of useful tuning info.
http://linuxperf.nl.linux.org/
Site organized by Rik van Riel and a few other folks. Probably the best linux specific system tuning page.
http://www.citi.umich.edu/projects/citi-netscape/
Linux Scalability Project at UMich.
NFS Performance Tuning
Info on tuning linux kernel NFS in particular, and linux network and disk io in general.
http://home.att.net/~jageorge/performance.html
Linux Performance Checklist. Some useful content.
http://www.linux.com/enhance/tuneup/
Miscellaneous performance tuning tips at linux.com.
http://www.psc.edu/networking/perf_tune.html#Linux
Summary of tcp tuning info.
Music
Careful analysis and benchmarking has shown that servers will respond positively to being played the appropriate music. For the common case, this can be about anything, but for high performance servers, a more careful choice needs to be made.
The industry standard for pumping up a server has always been "Crazy Train" by Ozzy Osbourne. While this has been proven over and over to offer increased performance, in some circumstances I recommend alternatives.
A classic case is the co-located server. Nothing like packing up your pride and joy and shipping it to strange far off locations like Sunnyvale and Herndon, VA. It's enough to make a server homesick, so I like to suggest choosing a piece of music that will remind them of home and tide them over till the bigger servers stop picking on them. For servers from North Carolina, I like to play the entirety of "feet in mud again" by Geezer Lake. Nothing like some good old NC style avant-metal-alterna-prog.
Commentary, controversy, chatter, chit-chat. Chat and irc servers have their own unique set of problems. I find the polyrhythmic and incessant restatement of purpose of "Elephant Talk" by King Crimson a good way to bend those servers back into shape.
btw, Zach says "Crazy Train" has the best guitar solo ever.
Thanks
Folks that have sent me new info, corrected info, or just sat still long enough for me to ask them lots of questions:
Zach Brown
Arjan van de Ven
Zach Beane
Michael K. Johnson
James Manning
TODO
add info about mod_proxy, caching, ListenBacklog, etc
add info for oracle tuning
any other useful server specific tuning info I stumble across
add info about kernel mem limits, PAE, bigmem, LFS and other kernel related stuff likely to be useful
Changes
Nov 19 2001
s/conf.modules/modules.conf, info on httperf/autobench/netperf from Bill Hilf.
Oct 16 2001
Added links to the excellent mod_perl tuning guide, and the online chapter for tuning samba. Added some info about the use of MaxRequestsPerChild, mod_proxy, and ListenBacklog to the apache section.
alikins@redhat.com
