
Drupal Back End Performance Optimization for large websites
Khalid Baheyeldin
March 5, 2009
DrupalCon Washington, DC

About Khalid

Software development and consulting for 24 years
Drupal addict since 2003
Core contributions
  Site offline maintenance feature
  Logging and alerts (syslog)
  Reverse Proxy
  Other patches...
Contributed modules (37+?)
  Userpoints, Nagios, Second Life, Adsense, Job search, Favorite nodes, Flag content, Nudge, Stock API and module, Currency API and module, Custom Error, Image watermark, Site menu, Email logging and alerts, Technorati, Referral, Node vote
Member of
  Drupal Association (General Assembly)
  security/infrastructure teams
Co-founder of 2bits.com, Inc.
Blog at http://baheyeldin.com

About 2bits.com

Founded 1999
Based in Waterloo, Ontario (Canada)
Active member of the Drupal community since 2003
37+ contributed modules on drupal.org
Listed on Drupal.org's service providers section
Maintain modules that run on drupal.org (donations, feature, lists, fee, ...)
Event sponsorship (DrupalCon, DrupalCamps)

2bits.com Services

Clients mainly in the USA and Canada, as well as in Europe
Performance tuning and optimization
Drupal site monitoring
Development/customization of modules
Subcontracting development projects (developers' developer)
Server provisioning, installation, upgrades
Automated backups

Agenda

Introduction
The LAMP Stack
  Linux, Apache, MySQL, PHP
Drupal
  Database queries
  Modules
  Caching
Measurement and monitoring tools
Case studies
Questions, discussion

Definitions

Performance
Scalability
High Availability
Load Balancing
Performance Assessment/Analysis
Performance Optimization/Tuning

Goals

Define your objectives and goals first
  Do you want faster response to the end user per page?
  Do you want to handle more page views?
  Do you want to minimize downtime?
Each is different, but they can be related
Most often, everyone wants them all, does not need them all, and yet is willing to pay for none!

Diminishing Returns

Often, there is some low-hanging fruit, easy to pick, that provides noticeable improvement with relatively little effort
After that, it gets harder and harder to achieve more performance (more effort, less return)
  More infrastructure (split server, multiple web heads)
  Patching of Drupal
  Rearchitecting the application (e.g. CCK, Views)

Diagnosis

A proper diagnosis is essential for any solution
Otherwise, you are running blind
Like a doctor who says: let us try medicine A and surgery B, as well as procedure C, and see, maybe things will get better, all without lab tests and examinations!
Must be based on proper data
Analysis of the data collected

Validation

Validate the results on a test server
Copy the site (MySQL dump and tar archive, maybe without images)
Recreate the site
Measure again and see if the relative times are about the same
Avoid a wild goose chase

Hardware

Physical server matters
  Dedicated
  VPS
Multiple cores are the norm now
  4 are better than 2, and 8 are better than 4
Lots of RAM (caching the file system and the database, as much as possible)
Multiple disks, if you can, for different file systems
  Always mirrored!
Not applicable to shared hosting

Multiple Servers

One database server + multiple web servers
Can use DNS round robin for load leveling
Or proper load balancers (commercial, free)
Even a reverse proxy (squid, like drupal.org uses)
Do it only if you have the budget
  Complexity is expensive (running cost)
  Tuning a system can avoid (or delay) the split

The LAMP stack

Most commonly used stack for hosting Drupal and similar applications
  Linux
  Apache
  MySQL
  PHP
Most of this presentation applies to *BSD as well. Parts apply to Windows (anyone use it?).

Linux

Use a proven stable distro (Debian stable, Ubuntu Server LTS, CentOS)
Use recent versions
Use whatever distro your staff has expertise in
Be a minimalist, avoid bloat
  Install only what you need (e.g. no X11, no desktop, no Java, no PostgreSQL if you are only using MySQL, ... etc.)

Linux (cont'd)

Balance compiling your own vs. upgrades
Compile your own
  Pros: full control on specific versions
  Cons: not easy (more work) to do security upgrades
Using deb/rpm
  Pros: easy to upgrade security releases, less work
  Cons: whatever versions your distro has

Apache

Most popular, most supported, most stable and feature rich
Cut the fat
  Enable only mod_php and mod_rewrite (as a start)
  Disable everything else (mod_python, mod_perl, ...)

Apache

MaxClients (prevent swapping/thrashing)
  Too low: you can't serve a traffic spike (Digg, Slashdot)
  Too high: your memory cannot keep up with the load, and you start swapping (server dies!)
MaxRequestsPerChild
  To terminate the process faster, and free up memory
KeepAlive
  Should be low (~3 seconds)
mod_gzip/deflate
  Compress HTML, CSS, JS, ...
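
The MaxClients trade-off above comes down to simple arithmetic: divide the RAM you can spare for Apache by the typical size of one Apache process. The sketch below is only a rough starting point. The function name and the figures in the example are illustrative assumptions, not measurements from this talk, and the result should be checked against real memory usage on the server.

```python
def estimate_max_clients(total_ram_mb, reserved_mb, apache_process_mb):
    """Rough upper bound for MaxClients that should avoid swapping.

    total_ram_mb:      physical RAM on the server
    reserved_mb:       RAM set aside for MySQL, the OS and file system cache
                       (an assumption you have to supply yourself)
    apache_process_mb: typical resident size of one Apache + PHP process
    """
    if apache_process_mb <= 0:
        raise ValueError("apache_process_mb must be positive")
    available = total_ram_mb - reserved_mb
    if available <= 0:
        return 0
    return int(available // apache_process_mb)


# Hypothetical server: 4 GB RAM, 1.5 GB reserved, 40 MB per Apache process
print(estimate_max_clients(4096, 1536, 40))
```

If the number that comes out is too low for your traffic spikes, the options are the ones listed in this talk: shrink the Apache processes, add RAM, or move work off the web server.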

Apache Alternatives

lighttpd (lighty)
  Popular with Ruby on Rails
  1 MB per process
  Recently: reports of really bad memory leaks
nginx
  Newcomer
  More stable than lighty (no leaks)

Apache Alternatives

Only run PHP as FastCGI
Both lighttpd and nginx run that way
Separate processes
Covered later (PHP)

MySQL

Most popular database for Drupal
Not the best database from the technology point of view (ACID, transactions, concurrency), but still adequate for the job
Various pluggable engines

MySQL Engines

MyISAM
  Faster for reads
  Less overhead
  Poor concurrency (table level locking)
InnoDB
  Transactional
  Slower in some cases (e.g. SELECT COUNT(*))
  Better concurrency (good for heavily hit tables, such as sessions, watchdog, ...)
  Oracle owns the engine now...

MySQL Engines

New engines, owned by MySQL AB
  Falcon. Not mature enough to match InnoDB, benchmarks show it is still slow, but promising
  Maria
SolidDB
PBXT
  PrimeBase XT

MySQL tuning

Query cache
  Probably the most important thing to tune
Also important
  Table cache
  Key buffer
  InnoDB (e.g. sessions, watchdog, ...)
  Temp tables on Linux tmpfs (in memory)

MySQL replication

Now in use on drupal.org
  INSERT/UPDATE/DELETE go to the master
  SELECTs go to the slave
Noticeable improvement
Patch here: http://drupal.org/node/147160
Beware of complexity (code and infrastructure)
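
The read/write split described above can be illustrated with a small routing function. This is a sketch of the idea only, not the drupal.org patch: the function name and the server labels are made up for the example, and a real setup also has to cope with replication lag (a SELECT issued right after a write may not see that write on the slave yet).

```python
def choose_server(sql):
    """Pick a server for a query: SELECTs go to the slave,
    everything else (INSERT, UPDATE, DELETE, DDL, ...) goes to the master."""
    words = sql.lstrip().split(None, 1)
    verb = words[0].upper() if words else ""
    if verb == "SELECT":
        return "slave"
    # Anything that might modify data, or that we cannot classify,
    # is sent to the master to stay on the safe side.
    return "master"


print(choose_server("SELECT nid, title FROM node WHERE status = 1"))
print(choose_server("UPDATE node SET status = 0 WHERE nid = 5"))
```

Sending unknown statements to the master is a deliberately conservative choice: a misrouted read costs a little performance, while a misrouted write would never replicate.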

PHP

Use a recent version
  5.2 minimum for Drupal 7.x, and many 6.x contribs
Install an opcode cache/accelerator
  eAccelerator
  APC
  Xcache
  Zend (commercial)

Opcode caches

Benefits
  Dramatic speed up of applications, especially complex ones like Drupal
  Significant decrease in CPU utilization
  Considerable decrease in memory utilization
  The biggest impact on a busy site
APC vs. eAccelerator vs. Xcache benchmark on 2bits.com
Drawbacks (for other than APC)
  Other than APC, they may crash often
  Use logwatcher to auto restart Apache

Unless...

Accelerators will not help in certain cases
  When it is not just code execution
  Network connections (Web 2.0 widgets, emails, some ads)
  Sorting of arrays
  Heavy database access
  Combinations of the above (tagadelic, node access modules, admin_menu, forum, tracker)

APC admin

mod_php

Normally, Apache mod_php is the most commonly used configuration
Shared nothing
  No state retained between requests
  Fewer issues
  Most tested and supported
Stay with mod_php if you can.
Can be as low as 10-12 MB per process
Saw it as high as 100 MB (but depends on modules installed, Apache modules, ...)

PHP as CGI

CGI is the oldest method, from the early 90s.
Forks a process for each request, and hence very inefficient.
Some hosts offer it by default (security) or as an option (e.g. running a specific PHP version).
Don't use it!

FastCGI

FCGI is faster than CGI (uses a socket to the PHP process, not forking)
Mostly with lighttpd and nginx, since it is the only way to run PHP for those servers, but also with Apache
Better separation of permissions (e.g. shared hosting)
  If you have one server and one Linux user, permissions may not be an issue.
Of late, Apache with fcgid has proven to be stable as well as better on memory usage (major savings).

mod_php vs. fcgid

Other ways for PHP

Roadsend PHP compiler
  Compiles PHP to native code!
  http://code.roadsend.com/pcc
PHC (incomplete, Parrot spinoff)
  http://www.phpcompiler.org/
Caucho Quercus
  Implementation of PHP written in Java!
  http://quercus.caucho.com/

Drupal

Mainly database intensive (100s of queries per page)
Can be CPU bound (certain modules, resource starved hosts, ...)
Can be memory intensive (lots of modules, or if untuned)
Bottlenecks are worked on as they are found by the community
Some modules are known to be slow (more on that later)
Not all sites are affected by all bottlenecks

Drupal (cont'd)

Disable modules that you do not need.
Make sure cron runs regularly
Enable throttle
  Be wary about throttle and cache

Module calls network?

Does your module do stuff over the network?
For every page view?
  Email 2,000 users on node/comment submit (og!)
  Call web 2.0 widgets (e.g. Digg this)?
Don't!
Cache the data
Use job_queue
Or queue_mail module
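
"Cache the data" can be as simple as remembering the result of the slow network call for a fixed time, so that most page views never touch the network. The sketch below shows the idea in isolation; the class and parameter names are invented for this example and are not part of Drupal or of any module mentioned here.

```python
import time


class TTLCache:
    """Keep the result of a slow call for ttl_seconds before fetching again."""

    def __init__(self, ttl_seconds, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so the cache can be tested
        self.store = {}             # key -> (expiry timestamp, value)

    def get_or_fetch(self, key, fetch):
        now = self.clock()
        entry = self.store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]         # still fresh: no network call
        value = fetch()             # the slow part, e.g. a remote widget API
        self.store[key] = (now + self.ttl, value)
        return value


# Hypothetical usage: refresh a widget count at most once every 5 minutes
cache = TTLCache(300)
count = cache.get_or_fetch("widget:node/1", lambda: 42)
print(count)
```

The lambda stands in for the real remote call. With a 5 minute lifetime, a page viewed 1,000 times in that window triggers one remote request instead of 1,000.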

Media files

Large video and audio files tie up resources for a long time
Especially on slow connections, or unstable ones (users try to download again and again)
Serve them from a separate box
  http://example.com for PHP
  http://media.example.com for video/audio
  The Video module already supports this (but you have to manually FTP the videos)
Use a content delivery network (CDN), e.g. Akamai.

CDN

Content Delivery Network
  Servers in different locations (e.g. Europe, US East coast and US West coast)
  Monthly fees, as well as volume fees. Pricing varies wildly
  Proximity based, user requests fulfilled from nearest servers
  Akamai, Panther Express

Caching Reverse Proxy

Squid Cache
  Stores static files (css, js, images)
  Needs a patch for HTML (i.e. Drupal generated pages)
    On 2bits.com for Drupal 6.x
  Requests never reach the web server, let alone PHP or the database!
    Vast performance improvement
  Intermediate proxies still an issue
Varnish
  Newer than Squid

Drupal caching

For anonymous visitors only
Does not affect authenticated users
Enable page caching
  May expire too often on a busy site, causing slowdowns!
  Set the cache expiry minimum (Drupal 5 and later)
Aggressive caching can have some implications, but gives better performance

Drupal caching (cont'd)

Certain parts of the cache are always on and cannot be turned off (but see later)
  Filter
  Menu
  Variables
  Forms

Boost

Drupal module
Creates HTML for pages and stores it in files
Requires changes to .htaccess and symlinks
Usable on shared hosts as well as VPS/dedicated
Vastly enhances the ability to handle traffic spikes
Make sure you TRUNCATE sessions when installing, otherwise you will see stale pages
Can leave dangling symlinks in the file system

Drupal caching (cont'd)

If you use Squid as a cache, then those may not apply
Consider other caching modules that use files
  FS Fastpath
    Still some of Drupal's PHP is executed
    Useful for shared hosting
  File Cache
    Uses flat files to store the cached objects outside the DB
    Available in cacherouter module too

Pluggable caching

Using $conf variable in settings.php
  'cache_include' => './includes/yourcache.inc'
Allows you to have a custom caching module
Developers' tip: can be used to disable cache for development (stub functions that do nothing)

Block caching

Contrib module for Drupal 5.x
In core since Drupal 6 (but less configurability)
Eliminates the overhead of generating blocks for each page view
64% improvement (Drupal 6.x)

memcached

Distributed object caching in memory
Written by Danga for LiveJournal
No disk I/O (database or files)
Can span multiple servers (over a LAN)
Give it a lot of RAM
Uses Drupal pluggable caching
Requires patches and schema changes for Drupal 5.x
Should be seamless for Drupal 6.x (at least core)

memcached (cont'd)

How much of an effect does memcache have? See how many SELECTs were reduced in early July compared to the earlier month!

memcached (cont'd)

Watch out for:
  Must start Apache after memcached restart
Also:
  Gets complex as you add instances
  Gets more complex as you add instances on other servers

Advanced Caching

Contributed module, set of patches
For authenticated users
  block_cache
  comment_cache
  node_cache
  path_cache
  search_cache
  taxonomy_cache

Slow modules

Statistics module
  Adds extra queries
  Even slower on InnoDB (COUNT(*) slow)
  Disable Popular Content block
gsitemap (XML sitemap)
  Had an extra join, patch accepted
  Can't handle more than 50,000 nodes
  Exhausted memory
  New version rewritten to use a flat file

Slow modules (cont'd)

Aggregator2
  Uses body field (text) to store an ID
  Joins on it
  Abandoned!
Tagadelic with free tagging (many 1,000s)
Admin_menu (adds up to 500 ms)
Node access modules with large number of nodes (10,000 or more)

Measure and Monitor

How do you know you have a problem?
  Wait till users complain (site is sluggish, timeouts)?
  Wait till you lose audience? Loss of interest from visitors?
Different tools for various tasks

top

Classic UNIX/Linux program
Real time monitoring (i.e. what the system is doing NOW, not yesterday)
Load average
CPU utilization (user, system, nice, idle, wait I/O)
Memory utilization
List of processes, sorted, with CPU and memory
Can change order of sorting, as well as time interval, and many other things

htop

Similar to top
Multiprocessor (individual cores)
Fancy colors

atop

ATS Top
Different format and info
Shows network stats
Runs a collection daemon in the background

vmstat

From BSD/Linux
Shows aggregate for the system (no individual processes)
Shows snapshot or incremental
Processes in the run queue and blocked
Swapping
CPU user, system, idle and io wait
First line is average since last reboot

netstat

Shows active network connections (all and ESTABLISHED)
  netstat -anp
  netstat -anp | grep EST
Remember that delivering content to dialup users can be slow, because the other end is slow
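
When there are too many connections to read by eye, the netstat output can be summarized per remote address. The following is a minimal sketch that assumes the usual Linux column layout of `netstat -anp` (protocol, queues, local address, foreign address, state). The function name and the sample addresses are made up for illustration.

```python
from collections import Counter


def established_by_remote_ip(netstat_output):
    """Count ESTABLISHED TCP connections per remote IP address."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expect: proto, recv-q, send-q, local, foreign, state, [pid/program]
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != "ESTABLISHED":
            continue
        remote_ip = fields[4].rsplit(":", 1)[0]   # strip the port
        counts[remote_ip] += 1
    return counts


SAMPLE = """\
Proto Recv-Q Send-Q Local Address     Foreign Address       State       PID/Program name
tcp        0      0 0.0.0.0:80        0.0.0.0:*             LISTEN      1234/apache2
tcp        0      0 192.0.2.10:80     203.0.113.5:51234     ESTABLISHED 1240/apache2
tcp        0      0 192.0.2.10:80     203.0.113.5:51240     ESTABLISHED 1241/apache2
tcp        0      0 192.0.2.10:80     198.51.100.7:40000    TIME_WAIT   -
"""

for ip, n in established_by_remote_ip(SAMPLE).most_common():
    print(ip, n)
```

A single address holding many connections is the kind of pattern that shows up when a page hangs on a remote host or a crawler is hammering the site.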

apachetop

Reads and analyses Apache's access log
Shows all/recent hits
  Requests per second, KB/sec, KB/req
  2xx, 3xx, 4xx, 5xx
List of requests being served
Good to detect crawlers
To run it use:
  apachetop -f /var/log/access.log
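
If apachetop is not installed, a rough equivalent of its "who is hitting me" view can be had by counting requests per client in the access log. This sketch assumes the common Apache log formats, where the client address is the first whitespace-separated field. The function name and the sample lines are invented for the example.

```python
from collections import Counter


def top_clients(log_lines, limit=5):
    """Return the `limit` busiest client addresses as (address, hits) pairs."""
    hits = Counter()
    for line in log_lines:
        parts = line.split(None, 1)
        if parts:                    # skip blank lines
            hits[parts[0]] += 1
    return hits.most_common(limit)


SAMPLE_LOG = [
    '203.0.113.9 - - [05/Mar/2009:10:00:01 -0500] "GET /node/1 HTTP/1.1" 200 5120',
    '203.0.113.9 - - [05/Mar/2009:10:00:01 -0500] "GET /node/2 HTTP/1.1" 200 4096',
    '198.51.100.7 - - [05/Mar/2009:10:00:02 -0500] "GET / HTTP/1.1" 200 10240',
    '203.0.113.9 - - [05/Mar/2009:10:00:02 -0500] "GET /node/3 HTTP/1.1" 200 6000',
]

for address, count in top_clients(SAMPLE_LOG):
    print(address, count)
```

On a real server you would pass the open log file instead of the sample list, since iterating over a file yields its lines.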

mtop, mytop

mtop/mytop
  Like top, but for MySQL
  Real time monitoring (no history)
  Shows slow queries and locks
If you have neither
  SHOW FULL PROCESSLIST
  mysqladmin processlist
    run from cron?

Other MySQL tools

Mysqlreport
  Displays statistics
  No recommendations
MySQL DB tuning primer
  A shell script that reads variables from MySQL
  Useful recommendations

Slow Query Log

Has to be enabled in my.cnf
Lists queries taking more than N seconds
Very useful to identify bottlenecks
Best way to interpret it:
  Use mysql_slow_log_parser script
  Also mysqlsla script

Stress testing

How many requests per second can your site handle?
Are you ready for a digg?
Do you know your performance and bottlenecks before you deploy? Or after?
The challenge is finding a realistic workload and simulating it
If you find bottlenecks, submit patches

Stress testing (cont'd)

ab/ab2 (Apache benchmark)
  ab -c 50 -n 10000 http://example.com/
  Requests per second
  Average response time per request
  Use -C for authenticated sessions
  http://httpd.apache.org/docs/2.0/programs/ab.html

Stress testing (cont'd)

Siege
  Another HTTP server load test tool
  http://www.joedog.org/JoeDog/Siege
JMeter
  Written in Java
  Desktop
  http://jakarta.apache.org/jmeter/

Graphical Monitoring

Munin
  Nice, easy to understand graphs
  History over a day, week, month and year
  CPU, memory, network, Apache, MySQL, and much more
  Can add your own monitoring scripts (e.g. we wrote one for php-cgi when running fcgid)
Cacti
  Similar features

Nagios

A monitoring platform
  Alerts by email, XMPP, SMS, ...
  Alerts about many things
New module for Drupal (5.x and 6.x)
  Pending core and contrib releases (security!)
  Database schema updates
  File directory permissions
  Performance
  Much more
  API too!

Web site statistics

Definitions
  Hits (every page, graphic, video, css, js file)
  Page views (e.g. a node, a taxonomy list)
  Visits
  Unique visits (advertisers care about this)
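
The difference between hits and page views is easy to see in code: every request is a hit, but only requests for actual pages count as page views. The sketch below classifies request paths by file extension. The extension list and the function name are assumptions chosen for this example, and real log analyzers use more elaborate rules.

```python
import posixpath
from urllib.parse import urlsplit

# Assumed list of static asset extensions; adjust for your own site
STATIC_EXTENSIONS = {".css", ".js", ".png", ".gif", ".jpg", ".jpeg",
                     ".ico", ".flv", ".mp3"}


def count_hits_and_page_views(paths):
    """Return (hits, page_views) for an iterable of request paths."""
    hits = 0
    page_views = 0
    for path in paths:
        hits += 1                                   # every request is a hit
        clean = urlsplit(path).path                 # drop any query string
        extension = posixpath.splitext(clean)[1].lower()
        if extension not in STATIC_EXTENSIONS:
            page_views += 1                         # a node, a listing, ...
    return hits, page_views


requests = ["/node/1", "/misc/drupal.js", "/themes/x/style.css?v=2",
            "/taxonomy/term/3", "/files/logo.png"]
print(count_hits_and_page_views(requests))
```

For these five sample requests there are five hits but only two page views, which is why quoting hits overstates traffic.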

Site Statistics

Do you know how many page views per day your site gets? (not just visits!)
Google Analytics
  Measures humans only (javascript)
  Does not count access to feeds
  Nor search engines and spam bots
Awstats
  Measures everything (also bandwidth!)
  Relies on Apache's logs

Drupal tools

Devel module
  Total page execution
  Query execution time
  Query log
  Memory utilization
Trace module
  More for debugging, but also useful in knowing what goes on under the hood

Performance Logging

Started as an independent project by 2bits
Now part of Devel (5.x, 6.x and 7.x, in dev)
Aims at collecting info for analysis of performance
  Which pages use most queries
  Which pages use most time to generate
  Average and maximums
Logs to database (dev/test) or APC (ok for live sites)
Can be combined with stress testing (ab/siege)

Drupal tools (cont'd)

Loadtest module
  Google Summer of Code 2007
  Load testing of Drupal
  Measures timings for discrete components
  Need to write SimpleTest-like tests
  Has a project page on drupal.org

Case studies

In real life action...

Case 1: Million pages

Can Drupal do 1,000,000 page views a day?
Yes we can!
How?
  Dedicated server (single server in this case)
  Lean site (no views, no CCK, no locale, no statistics, but has voting API, fivestar, subscriptions)
  Memcache is a life saver
  APC
  Fcgid instead of mod_php (saves memory)

Case 2: Slow forums

Node access (taxonomy access lite in this case) was used to make some forums private
Not needed in that case

Case 3: 10s of seconds

A site which took tens of seconds to process the submission of a node
Og was sending emails to 1,000s of users instantly
Used job_queue module
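
The fix in this case is to take the email sending out of the page request: the submit handler only records what needs to be sent, and a later background run (for example from cron) delivers the messages in batches. The sketch below shows that pattern in general terms. It is not the job_queue module's code, and the class and method names are made up for illustration.

```python
from collections import deque


class MailQueue:
    """Defer slow work (sending mail) from the request to a later batch run."""

    def __init__(self):
        self.pending = deque()

    def enqueue(self, recipient, subject):
        # Called during the page request: cheap, no network access
        self.pending.append((recipient, subject))

    def process(self, send, batch_size):
        # Called later, e.g. on each cron run: sends at most batch_size mails
        sent = 0
        while self.pending and sent < batch_size:
            recipient, subject = self.pending.popleft()
            send(recipient, subject)
            sent += 1
        return sent


queue = MailQueue()
for n in range(5):
    queue.enqueue(f"user{n}@example.com", "New post in your group")

# Stand-in for a real mail function
delivered = queue.process(lambda to, subject: print("mail to", to), batch_size=2)
print(delivered, "sent,", len(queue.pending), "still queued")
```

The node submission now returns as soon as the recipients are queued, and the batch size caps how much mail each background run tries to send.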

Case 4: Authenticated

Intranet application for 200 concurrent users (96,000 in users table)
Could only do 30!
Using CREATE TEMPORARY TABLE on each user's home page
2bits.com was able to scale it in our labs to 200 users
  Use cache_get()/set() for this query
  InnoDB for sessions, watchdog, accesslog tables

Case 5: Hangs

Page loading hangs
The site used a digg widget that did a fsockopen() call, and digg's API host was down!
Using netstat, we knew the IP address and saw many connections
Removed...

Case 6: Slow LAN

Site took 10-12 seconds
Two VPS's, one for web, the other for database
10 Mbps connection! Not enough, very high query time
Increased to 60 Mbps, much better...

Case 7: Crawler!

Site is very slow
A crawler was hitting it repeatedly
Turned out to be a worm
Use apachetop
Use iptables to block the IPs
Freed up resources instantly

CPU 100%

What was it?
Was OK for a day
eAccelerator (svn303 + PHP 5)
Note CPU utilization (100%, then high, then dropped low when good version used)

Memory

Swapping means you don't have enough RAM
Excessive swapping (thrashing) is server hell!
Reduce the size of Apache processes (no SVN DAV)
Reduce the number of Apache processes (MaxClients)
Turn off processes that are not used (e.g. Java, extra copies of email servers, other databases)
Buy more memory! Cost effective and worth it.

Memory

Impact on memory usage when there is no opcode cache vs. with an opcode cache (eAccelerator in this case)

Disk I/O

First eliminate swapping, if you get hit by it.
Get the fastest disks you can. 7200 RPM at a minimum.
Turn off PHP error logging to /var/log/*/error.log
Consider disabling watchdog module in favor of syslog (Drupal 6 will have that option), or hack the code
Optimize MySQL once a week, or once a day

Network

Normally not an issue, but make sure you have enough bandwidth
Private gigabit LAN if you have two servers
Occasionally you will have stubborn crawlers though
Or even a DDoS
Or worse, extortion
Can eat up resources, including network

DDoS

Distributed Denial of Service
  Aggressive crawlers
  Worms probing for vulnerabilities
Sap the energy from your site
External problem
Diagnosis: look in the web server log or use apachetop
Solution: use iptables to block the addresses
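
Once the offending addresses are known, the iptables rules can be generated rather than typed by hand. The sketch below only builds the command strings and does not run them, so the list can be reviewed first. It uses the standard `-A INPUT -s <address> -j DROP` rule form; the function name and the sample addresses are made up for this example.

```python
import ipaddress


def iptables_block_commands(addresses):
    """Build one DROP rule per address; raises ValueError on a bad address."""
    commands = []
    for address in addresses:
        ip = ipaddress.ip_address(address)      # validates the input
        tool = "iptables" if ip.version == 4 else "ip6tables"
        commands.append(f"{tool} -A INPUT -s {ip} -j DROP")
    return commands


for command in iptables_block_commands(["203.0.113.9", "198.51.100.7"]):
    print(command)
```

Validating each address before it reaches a shell command matters here, because the input usually comes straight from a log file that the attacker can influence.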

Further reading

Drupal Performance Tuning and Optimization section
http://2bits.com

Discussion

Questions?
Comments?
