
Visualizing Garbage Collection in Ruby and Python

@pat_shaughnessy October 24th 2013

Note: This post is based on a presentation I just did at RuPy in Budapest. Instead of just posting my slides I thought it would be more useful if I wrote it down as a blog post while it's still fresh in my mind. I'll post the video link here when the RuPy folks have it online. FYI, I'm planning to do a similar presentation at RubyConf, except I'll remove the Python info and instead compare how GC works inside of MRI vs. JRuby and Rubinius. For a more detailed explanation of GC in Ruby and of Ruby internals generally, see my upcoming book, Ruby Under a Microscope, due out very soon from No Starch Press.

If your algorithms and business logic are the brain of your application, which organ would the garbage collector be?

Since this is the "Ruby Python" conference, I thought it would be fun to compare how garbage collection works inside of Ruby and Python. But before we get to that, why talk about garbage collection at all? After all, it's not the most glamorous, exciting topic, is it? How many of you get excited by garbage collection? [ A number of RuPy attendees actually raised their hands! ]

Recently in the Ruby community there was a blog post about how you can speed up your unit tests by changing your Ruby GC settings. I think that's great. It's good to run your tests faster and to run your app with fewer GC pauses, but somehow GC doesn't really excite me. It seems like a boring, dry, technical topic at first glance.

But actually, garbage collection is a fascinating topic: GC algorithms are both an important part of computer science history and a subject of cutting-edge research. For example, the Mark and Sweep algorithm used by MRI Ruby is over 50 years old, while one of the GC algorithms used inside of Rubinius, an alternative implementation of Ruby, was invented just recently in 2008.

However, the name "garbage collection" is actually a misnomer.

The Beating Heart of Your Application


GC systems do much more than just "collect garbage." In fact, they perform three important tasks: they allocate memory for new objects, identify garbage objects, and reclaim memory from garbage objects.

Imagine if your application was a human body: all of the elegant code you write, your business logic, your algorithms, would be the brain or the intelligence inside the application. Following this analogy, what part of the body do you think the garbage collector would be? [ I got lots of fun answers from the RuPy audience: kidneys, white blood cells :) ]

I think the garbage collector is the beating heart of your application. Just as your heart provides blood and nutrients to the rest of your body, the garbage collector provides memory and objects for your application to use. If your heart stopped beating you would die in seconds. If the garbage collector stopped or ran slowly, as if it had clogged arteries, your application would slow down and eventually die!

A Simple Example
It's always helpful to work through theories using examples. Here's a simple class, written in both Python and Ruby, that we can use as an example today:
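A minimal Python sketch of such a class might look like the following. It assumes a constructor that stores a single value; the attribute name `value` and the sample strings are illustrative assumptions, not something the text specifies.

```python
class Node:
    """A tiny example class: each node just remembers one value."""

    def __init__(self, val):
        # Save the argument in an instance variable (attribute).
        self.value = val


n1 = Node("ABC")
n2 = Node("DEF")
print(n1.value, n2.value)
```

The Ruby counterpart would define `initialize` instead of `__init__` and create objects with `Node.new`.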

By the way, it's amazing to me how similar this code is in both languages: Ruby and Python are really just slightly different ways of saying the same thing. But are the languages also implemented in a similar way internally?

The Free List


When we call Node.new(1) above, what does Ruby do, exactly? How does Ruby go about creating a new object for us? Surprisingly, it does very little! In fact, long before your code starts to run, Ruby creates thousands of objects ahead of time and places them on a linked list, called the free list. Here's what the free list might look like, conceptually:

Imagine each of the white squares above is an unused, precreated Ruby object. When we call Node.new, Ruby simply takes one of these objects and hands it to us:

In the diagram above, the gray square on the left represents an active Ruby object we're using in our code, while the remaining white squares are unused objects. [ Note: of course, my diagrams are a simplified version of reality. In fact, Ruby would use another object to hold the string "ABC," a third object to hold the class definition of Node, and still other objects to hold the parsed, abstract syntax tree (AST) representation of my code, etc. ]

If we call Node.new again, Ruby just hands us another object:

John McCarthy's 1960 implementation of Lisp contained the first garbage collector. (Courtesy MIT Museum)

This simple algorithm of using a linked list of precreated objects was invented over 50 years ago by a legendary computer scientist named John McCarthy, while he was working on the original implementation of Lisp. Lisp was not only one of the first functional programming languages, but also contained a number of other groundbreaking advances in computer science. One of these was the concept of automatically managing your application's memory using garbage collection.

The standard version of Ruby, also known as "Matz's Ruby Interpreter" (MRI), uses a GC algorithm similar to the one used by McCarthy's implementation of Lisp in 1960. For better or worse, Ruby uses a 53 year old algorithm for garbage collection. Just as Lisp did, Ruby creates objects ahead of time and hands them to your code when you allocate new objects or values.
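To make the free list idea concrete, here is a small simulation written in Python. It is only a sketch of the concept described above, not MRI's actual C code: the names `Slot`, `FreeList` and `allocate` are hypothetical, and the list holds five slots rather than thousands.

```python
class Slot:
    """One precreated object slot; `next` links unused slots together."""

    def __init__(self):
        self.value = None
        self.next = None


class FreeList:
    def __init__(self, size):
        # Build the linked list of unused slots before any user code runs.
        self.head = None
        for _ in range(size):
            slot = Slot()
            slot.next = self.head
            self.head = slot

    def allocate(self, value):
        # Hand out the first unused slot; no new memory is requested here.
        slot = self.head
        if slot is None:
            raise MemoryError("free list is empty; a GC run would start here")
        self.head = slot.next
        slot.next = None
        slot.value = value
        return slot

    def __len__(self):
        count, slot = 0, self.head
        while slot is not None:
            count += 1
            slot = slot.next
        return count


free_list = FreeList(5)
n1 = free_list.allocate("ABC")
n2 = free_list.allocate("DEF")
print(len(free_list))
```

Allocation in this model is just unlinking the head of the list, which is why creating an object costs so little until the list runs out.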

Allocating Objects in Python


We've seen that Ruby creates objects ahead of time and saves them in the free list. What about Python? While Python also uses free lists for various reasons internally (it recycles certain objects such as lists), it normally allocates memory for new objects and values differently than Ruby does. Suppose we create a Node object using Python:

Python, unlike Ruby, will ask the operating system for memory immediately when you create the object. (Python actually implements its own memory allocation system which provides an additional layer of abstraction on top of the OS heap. But I don't have time to get into those details today.) When we create a second object, Python will again ask the OS for more memory:

Seems simple enough; at the moment we create an object Python takes the time to find and allocate memory for us.

Ruby leaves unused objects lying around in memory until the next GC process runs.

Ruby Developers Live in a Messy House


Back to Ruby. As we allocate more and more objects, Ruby will continue to hand us precreated objects from the free list. As it does this, the free list will become shorter:

...and shorter:

Notice as I continue to assign new values to n1, Ruby leaves the old values behind. The ABC, JKL and MNO nodes remain in memory. Ruby doesn't immediately clean up old objects my code is no longer using! Working as a Ruby developer is like living in a messy house, with clothes lying on the floor or dirty dishes in the kitchen sink. As a Ruby developer you have to work with unused, garbage objects surrounding you.

Python Developers Live in a Tidy Household

Python cleans up garbage objects immediately after your code is done using them.

Garbage collection works quite differently in Python than in Ruby. Let's return to our three Python Node objects from earlier:

Internally, whenever we create an object Python saves an integer inside the object's C structure, called the reference count. Initially, Python sets this value to 1:

The value of 1 indicates there is one pointer or reference to each of the three objects. Now suppose we create a new node, JKL:

Just as before, Python sets the reference count in JKL to be 1. However, also notice that since we changed n1 to point to JKL, it no longer references ABC, and that Python decremented its reference count down to 0. At this point, the Python garbage collector immediately jumps into action! Whenever an object's reference count reaches zero, Python immediately frees it, returning its memory to the operating system:

Above Python reclaims the memory used by the ABC node. Remember, Ruby simply leaves old objects lying around and doesn't release their memory. This garbage collection algorithm is known as reference counting. It was invented by George Collins in 1960, not coincidentally the same year John McCarthy invented the free list algorithm. As Mike Bernstein said in his fantastic presentation on garbage collection at the Gotham Ruby Conference back in June: "1960 was a good year for Garbage Collectors..."

Working as a Python developer is like living in a tidy house; you know, the kind of place where your roommates are a bit OCD and are constantly cleaning up after you. As soon as you put down a dirty dish or glass, someone has already put it away in the dishwasher!

Now for a second example. Suppose we set n2 to refer to the same node as n1:

Above to the left you can see Python has decremented the reference count for DEF and will immediately garbage collect the DEF node. Also note that JKL now has a reference count of 2, since both n1 and n2 point to it.
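The two reassignments above can be observed from Python itself. The sketch below uses the standard `weakref` module to watch an object without adding a reference to it. It assumes the CPython interpreter, where objects are freed as soon as their count reaches zero; other Python implementations may delay this. The variable names are illustrative.

```python
import weakref


class Node:
    def __init__(self, val):
        self.value = val


n1 = Node("ABC")
abc_alive = weakref.ref(n1)   # lets us check liveness without adding a reference

n1 = Node("JKL")              # ABC loses its only reference here
abc_freed_immediately = abc_alive() is None

n2 = Node("DEF")
def_alive = weakref.ref(n2)
n2 = n1                       # DEF is dropped; JKL now has two references
def_freed_immediately = def_alive() is None

print(abc_freed_immediately, def_freed_immediately, n1 is n2)
```

A weak reference returns `None` once its target has been freed, so no explicit garbage collection call is needed to see the effect.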

Mark and Sweep


Eventually a messy house fills up with trash and life can't continue as usual. After your Ruby program runs for some time, the free list will eventually be entirely used up:

Here all of the precreated Ruby objects have been used by our application (they are all gray) and no objects remain on the free list (no white squares are left).

At this point Ruby uses another algorithm invented by McCarthy known as Mark and Sweep. First Ruby stops your application; Ruby uses "stop the world garbage collection." Ruby then loops through all of the pointers, variables and other references our code makes to objects and other values. Ruby also iterates over internal pointers used by its virtual machine. It marks each object that it is able to reach using these pointers. I indicate these marks using the letter M here:

Above the three objects marked with "M" are live, active objects that our application is still using. Internally, Ruby actually uses a series of bits known as the free bitmap to keep track of which objects are marked or not:


Ruby saves the free bitmap in a separate memory location in order to take full advantage of Unix copy-on-write optimization. For more information on this, see my article Why You Should Be Excited About Garbage Collection in Ruby 2.0.

If the marked objects are live, the remaining, unmarked objects must be garbage, meaning they are no longer being used by our code. I'll show the garbage objects as white squares below:

Next Ruby sweeps the unused, garbage objects back onto the free list:


Internally this happens quite quickly, since Ruby doesn't actually copy objects around from one place to another. Instead, Ruby places the garbage objects back onto the free list by adjusting internal pointers to form a new linked list.

Now Ruby can give these garbage objects back to us the next time we create a new Ruby object. In Ruby, objects are reincarnated, and enjoy multiple lives!
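The mark and sweep steps described above can be sketched as a small Python simulation. This is a conceptual model under stated assumptions, not Ruby's implementation: the heap is a plain list of six hypothetical `Obj` records, the mark bits live in a separate list (standing in for the bitmap), and the names `mark`, `sweep` and `next_free` are made up for illustration.

```python
class Obj:
    def __init__(self, index):
        self.index = index      # position in the heap, used for the bitmap
        self.refs = []          # other objects this one points to
        self.next_free = None   # link used only while on the free list


def mark(roots, marked):
    # Follow every reference reachable from the roots, setting mark bits.
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if not marked[obj.index]:
            marked[obj.index] = True
            stack.extend(obj.refs)


def sweep(heap, marked):
    # Relink every unmarked object into a new free list; nothing is copied.
    free_head = None
    for obj in heap:
        if not marked[obj.index]:
            obj.refs = []
            obj.next_free = free_head
            free_head = obj
    return free_head


heap = [Obj(i) for i in range(6)]
heap[0].refs.append(heap[1])          # 0 -> 1
heap[3].refs.append(heap[4])          # 3 -> 4, but nothing points to 3
roots = [heap[0], heap[2]]            # variables our program still holds

marked = [False] * len(heap)          # mark bits kept outside the objects
mark(roots, marked)
free_head = sweep(heap, marked)

free_indexes = []
node = free_head
while node is not None:
    free_indexes.append(node.index)
    node = node.next_free
print(sorted(free_indexes))
```

Objects 3 and 4 are swept even though one refers to the other, because neither is reachable from a root. Sweeping only rewires `next_free` links, which mirrors the point that no objects are copied.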

Mark and Sweep vs. Reference Counting


At first glance, Python's GC algorithm seems far superior to Ruby's: why live in a messy house when you can live in a tidy one? Why does Ruby force your application to stop running periodically each time it cleans up, instead of using Python's algorithm?

Reference counting isn't as simple as it seems at first glance, however. There are a number of reasons why many languages don't use a reference counting GC algorithm like Python does:

First, it's difficult to implement. Python has to leave room inside of each object to hold the reference count. There's a minor space penalty for this. But worse, a simple operation such as changing a variable or reference becomes a more complex operation since Python needs to increment one counter, decrement another, and possibly free the object.

Second, it can be slower. Although Python performs GC work smoothly as your application runs (cleaning dirty dishes as soon as you put them in the sink), this isn't necessarily faster. Python is constantly updating the reference count values. And when you stop using a large data structure, such as a list containing many elements, Python might have to free many objects all at once. Decrementing reference counts can be a complex, recursive process.

Finally, it doesn't always work. As we'll see in my next post containing my notes from the rest of this presentation, reference counting can't handle cyclic data structures: data structures that contain circular references.

Until Next Time...


Next week I'll type up the rest of the presentation. I'll discuss how Python handles cyclic data structures, and how GC works in the upcoming Ruby 2.1 release.

Generational GC in Python and Ruby


October 30th 2013


Both the Ruby and Python garbage collectors handle old and young objects differently.

Last week I wrote up the first half of my notes from a presentation I did at RuPy called "Visualizing Garbage Collection in Ruby and Python." I explained how standard Ruby (also known as Matz's Ruby Interpreter or MRI) uses a garbage collection (GC) algorithm called mark and sweep, the same basic algorithm developed for the original version of Lisp in 1960. We also saw how Python uses a very different GC algorithm, also invented 53 years ago, called reference counting.

As it turns out, along with reference counting Python employs a second algorithm called generational garbage collection. This means Python's garbage collector handles newly created objects differently than older ones. And starting with the upcoming version 2.1 release, MRI Ruby will also employ generational GC for the first time. (Two alternative implementations of Ruby, JRuby and Rubinius, have been using generational garbage collection for years. I'll talk about how garbage collection works inside of JRuby and Rubinius next week at RubyConf.)

Of course, the phrase "handles new objects differently from older ones" is a bit of hand-waving. What exactly are new and old objects? Exactly how do Ruby and Python handle them differently? Today I'll continue to describe how these two garbage collectors work and answer these questions. But before we get to generational GC, we first need to learn more about a serious theoretical problem with Python's reference counting algorithm.

Cyclic Data Structures and Reference Counting in Python


We saw last time that Python uses an integer value saved inside of each object, known as the reference count, to keep track of how many pointers reference that object. Whenever a variable or other object in your program starts to refer to an object, Python increments this counter; when your program stops using an object, Python decrements the counter. Once the reference count becomes zero, Python frees the object and reclaims its memory.


Since the 1960s, computer scientists have been aware of a theoretical problem with this algorithm: if one of your data structures refers to itself, if it is a cyclic data structure, some of the reference counts will never become zero. To better understand this problem let's take an example. The code below shows the same Node class we used last week:

We have a constructor (these are called __init__ in Python) which saves a single attribute in an instance variable. Below the class definition we create two nodes, ABC and DEF, which I represent using the rectangles on the left. The reference count inside both of our nodes is initially one, since one pointer (n1 and n2, respectively) refers to each node.

Now let's define two additional attributes in our nodes, next and prev:

Unlike in Ruby, using Python you can define instance variables or object attributes on the fly like this. This seems like a bit of interesting magic missing from Ruby. (Disclaimer: I'm not a Python developer so I might have some of the nomenclature wrong here.) We set n1.next to reference n2, and n2.prev to point back to n1. Now our two nodes form a doubly linked list using a circular pattern of pointers. Also notice that the reference counts of both ABC and DEF have increased to two. There are two pointers referring to each node: n1 and n2 as before, and now next and prev as well.

Now let's suppose our Python program stops using the nodes; we set both n1 and n2 to null. (In Python null is known as None.)


Now Python, as usual, decrements the reference count inside of each node down to 1.
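The steps above can be reproduced in a few lines of Python. This sketch assumes CPython and temporarily switches off the cycle collector with `gc.disable()` so that only reference counting is at work; the `Node` class and the `abc_alive` helper name are illustrative.

```python
import gc
import weakref


class Node:
    def __init__(self, val):
        self.value = val


gc.disable()                 # keep the cycle collector out of the picture

n1 = Node("ABC")
n2 = Node("DEF")
n1.next = n2                 # ABC -> DEF
n2.prev = n1                 # DEF -> ABC, closing the cycle

abc_alive = weakref.ref(n1)
n1 = None
n2 = None

# Each node is still referenced by the other, so neither count reached zero.
still_in_memory = abc_alive() is not None
print(still_in_memory)

gc.enable()
```

With the collector disabled, the two nodes stay in memory even though the program can no longer reach them through any variable.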

Generation Zero in Python


Notice in the diagram just above we've ended up with an unusual situation: we have an "island," a group of unused objects that refer to each other but have no external references. In other words, our program is no longer using either node object, so we expect Python's garbage collector to be smart enough to free both objects and reclaim their memory for other purposes. But this doesn't happen because both reference counts are one and not zero. Python's reference counting algorithm can't handle objects that refer to each other!

Of course, this is a contrived example, but your own programs might contain circular references like this in subtle ways that you may not be aware of. In fact, as your Python program runs over time it will build up a certain amount of "floating garbage," unused objects that the Python collector is unable to process because the reference counts never reach zero.

This is where Python's generational algorithm comes in! Just as Ruby keeps track of unused, free objects using a linked list (the free list), Python uses a different linked list to keep track of active objects. Instead of calling this the "active list," Python's internal C code refers to it as Generation Zero. Each time you create an object or some other value in your program, Python adds it to the Generation Zero linked list:

Above you can see when we create the ABC node, Python adds it to Generation Zero. Note that this isn't an actual list that you see and access in your program; this linked list is entirely internal to the Python runtime. Similarly, when we create the DEF node, Python adds it to the same linked list:


Now Generation Zero contains two node objects. (It will also contain every other value our Python code creates, and many internal values used by Python itself.)
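Although the linked list itself stays internal, the standard `gc` module offers a way to peek at it. The sketch below assumes CPython 3.8 or later, where `gc.get_objects` accepts a `generation` argument and returns a snapshot of the tracked objects. Whether a given object shows up in generation 0 can vary between interpreter versions and builds, so the code reports the result rather than relying on it.

```python
import gc


class Node:
    def __init__(self, val):
        self.value = val


gc.disable()                           # avoid a collection promoting objects mid-demo
node = Node("ABC")

tracked = gc.is_tracked(node)          # True if the cycle collector knows about it
young = gc.get_objects(generation=0)   # snapshot of the youngest generation
in_generation_zero = any(obj is node for obj in young)
print(tracked, in_generation_zero)

gc.enable()
```

The snapshot is an ordinary Python list built on request, not the runtime's own linked list.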

Detecting Cyclic References


Later Python loops through the objects in the Generation Zero list and checks which other objects each object in the list refers to, decrementing reference counts as it goes. In this way, Python accounts for internal references from one object to another that prevented Python from freeing the objects earlier. To make this a bit easier to understand, let's take an example:

Above you can see the ABC and DEF nodes contain a reference count of 1. Three other objects are in the Generation Zero linked list also. The blue arrows indicate some of the objects are referred to by other objects that are located elsewhere: references from outside of Generation Zero. (As we'll see in a moment, Python also uses two other lists called Generation One and Generation Two.) These objects have higher reference counts because of the other pointers referring to them.

Below you can see what happens after Python's garbage collector processes Generation Zero.


By identifying internal references, Python is able to reduce the reference count of many of the Generation Zero objects. Above in the top row you can see that ABC and DEF now have a reference count of zero. This means the collector will free them and reclaim their memory. The remaining live objects are then moved to a new linked list: Generation One.

In a way, Python's GC algorithm resembles the mark and sweep algorithm Ruby uses. Periodically it traces references from one object to another to determine which objects remain live, active objects our program is still using, just like Ruby's marking process.
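The subtraction idea can be sketched as a small simulation over plain dictionaries. This is an illustrative model, not CPython's C code: the function name `find_garbage` and the sample objects `config` and `item` are assumptions. It also includes one step the description above only implies, namely that anything reachable from an externally referenced object has to be kept.

```python
def find_garbage(objects, refcounts, references):
    """Return the names of unreachable objects within one generation.

    objects: names of the objects in the generation being examined
    refcounts: name -> total reference count
    references: name -> names of objects (in the same generation) it points to
    """
    # Step 1: copy the counts, then subtract references that come from
    # inside the generation. What is left counts only outside references.
    external = {name: refcounts[name] for name in objects}
    for name in objects:
        for target in references.get(name, []):
            external[target] -= 1

    # Step 2: anything still above zero is referenced from outside, and
    # everything reachable from such an object must be kept as well.
    live = set()
    stack = [name for name in objects if external[name] > 0]
    while stack:
        name = stack.pop()
        if name not in live:
            live.add(name)
            stack.extend(references.get(name, []))

    return [name for name in objects if name not in live]


objects = ["ABC", "DEF", "config", "item"]
refcounts = {"ABC": 1, "DEF": 1, "config": 1, "item": 1}
references = {"ABC": ["DEF"], "DEF": ["ABC"], "config": ["item"]}
garbage = find_garbage(objects, refcounts, references)
print(garbage)
```

Here `item` also drops to zero after the subtraction, but it survives because `config`, which a variable outside the generation still holds, points to it. Only the ABC and DEF island is reported as garbage.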

Garbage Collection Thresholds in Python


When does Python perform this marking process? As your Python program runs, the interpreter keeps track of how many new objects it allocates, and how many objects it frees because of zero reference counts. Theoretically, these two values should remain the same: every new object your program creates should eventually be freed.

Of course, this isn't the case. Because of circular references, and because your program uses some objects longer than others, the difference between the allocation count and the release count slowly grows. Once this delta value reaches a certain threshold, Python's collector is triggered and processes the Generation Zero list using the subtract algorithm above, releasing the "floating garbage" and moving the surviving objects to Generation One.

Over time, objects that your Python program continues to use for a long time are migrated from the Generation Zero list to Generation One. Python processes the objects on the Generation One list in a similar way, after the allocation-release count delta value reaches an even higher threshold value. Python moves the remaining, active objects over to the Generation Two list.

In this way, the objects that your Python program uses for long periods of time, that your code keeps active references to, move from Generation Zero to One to Two. Using different threshold values, Python processes these objects at different intervals. Python processes objects in Generation Zero most frequently, Generation One less frequently, and Generation Two even less often.
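The thresholds and counters mentioned above are visible through the standard `gc` module. The short sketch below only reads them and then requests a collection of the youngest generation. The actual numbers depend on the interpreter version and on what the program has done so far, and the precise meaning of the second and third values is spelled out in the `gc` module documentation.

```python
import gc

# Three thresholds, one per generation.
threshold0, threshold1, threshold2 = gc.get_threshold()

# Current counters that are compared against those thresholds.
count0, count1, count2 = gc.get_count()

print("thresholds:", (threshold0, threshold1, threshold2))
print("counts:    ", (count0, count1, count2))

# Force a collection of the youngest generation only; the return value is
# the number of unreachable objects found.
found = gc.collect(0)
print("unreachable objects found in generation 0:", found)
```

`gc.set_threshold` changes these values, which is one way to make collections more or less frequent for a particular workload.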

The Weak Generational Hypothesis


This behavior is the crux of the generational garbage collection algorithm: the collector processes new objects more frequently than old objects. A new, or young object is one that your program has just created, while an old or mature object is one that has remained active for some period of time. Python promotes an object when it moves it from Generation Zero to One, or from One to Two.

Why do this? The fundamental idea behind this algorithm is known as the weak generational hypothesis. The hypothesis actually consists of two ideas: that most new objects die young, while older objects are likely to remain active for a long time.

Suppose I create a new object using Python or Ruby:

According to the hypothesis, my code is likely to use the new ABC node only for a short time. The object is probably just an intermediate value used inside of one method and will become garbage as soon as the method returns. Most new objects will become garbage quickly in this way. Occasionally, however, my program creates a few objects that remain important for a longer time, such as session variables or configuration values in a web application.

By processing the new objects in Generation Zero more frequently, Python's garbage collector spends most of its time where it will benefit the most: it processes the new objects which will quickly and frequently become garbage. Only rarely, when the allocation threshold value increases, does Python's collector process the older objects.

Back to the Free List in Ruby


The upcoming release of Ruby, version 2.1, now uses a generational garbage collector algorithm for the first time! (Remember, other implementations of Ruby, such as JRuby and Rubinius, have been using this idea for years.) Let's return to the free list diagrams from my last post to understand how this works.

Recall that when the free list is used up, Ruby marks the objects your program is still using:


In this diagram, we see there are three active objects because the pointers n1, n2 and n3 still refer to them. The remaining objects, the white squares, are garbage. (Of course, the free list will actually contain thousands of objects that refer to each other in complex patterns. My simple diagrams help me communicate the basic ideas behind Ruby's GC algorithm without getting bogged down in the details.)

Also recall that Ruby moves the garbage objects back onto the free list, because now they can be recycled and reused by your program when it allocates new objects:

Generational GC in Ruby 2.1


Starting with Ruby 2.1, the GC code takes an additional step: it promotes the remaining active objects to the mature generation. (The MRI C source code actually uses the word old and not mature.) This diagram shows a conceptual view of the two Ruby 2.1 object generations:


On the left is a different view of the free list. We see the garbage objects in white, and the remaining live, active objects in gray. The gray objects were just marked.

Once the mark and sweep process is finished, Ruby 2.1 will consider the remaining marked objects to be mature:


Instead of using three generations like Python, Ruby 2.1's garbage collector uses just two. On the left are new objects in the young generation, and on the right are old objects in the mature generation. Once Ruby 2.1 has marked an object once, it considers it to be mature. Ruby bets the object has remained active for a long enough time that it will not become garbage quickly.

Important note: Ruby 2.1 doesn't actually copy objects around in memory. These generations don't consist of different areas of physical memory. (Some GC algorithms used by other languages and other Ruby implementations, known as copying garbage collectors, do actually copy objects when they are promoted.) Instead, Ruby 2.1 uses code internally that doesn't include previously marked objects in the mark and sweep process again. Once an object has been marked once it won't be included in the next mark and sweep process.

Now suppose your Ruby program continues to run, creating more new, young objects. These appear in the young generation again, on the left:

Just like Python, Ruby's collector focuses its efforts on the young generation. It only includes the new, young objects created since the last GC process occurred in the mark and sweep algorithm. This is because many new objects are likely to be garbage already (the white boxes on the left). Ruby doesn't re-mark the mature objects on the right. Because they already survived one GC process, they are likely to remain active and not become garbage for a longer time. By only marking young objects, Ruby's GC code runs much faster. It just skips over the mature objects entirely, reducing the amount of time your program is waiting for garbage collection to finish.


Occasionally Ruby runs a "full collection," re-marking and re-sweeping the mature objects as well. Ruby decides when to run a full collection by monitoring the number of mature objects. When the number of mature objects has doubled since the last full collection, Ruby clears the marks and considers all the objects to be young again.

Write Barriers
One important challenge to this algorithm is worth further explanation. Suppose your code creates a new, young object and adds it as a child of an existing, mature object. For example, this would happen if you added a new value to an array that had existed for a long time.

Here again on the left we see new, young objects and mature objects on the right. On the left side the marking process has identified that 5 new objects are still active (gray), while two new objects are garbage (white). But what about the new, young object in the center? This is the one shown in white with a question mark. Is this new object garbage or active?

It's active, of course, because there's a reference to it from a mature object on the right. But remember Ruby 2.1 doesn't include mature objects in mark and sweep (until a full collection occurs). This means the new object will be incorrectly considered garbage and released, causing your program to lose data!

Ruby 2.1 overcomes this challenge by monitoring the mature objects to see if your program adds a reference from them to a new, young object. Ruby 2.1 uses an old GC technique called write barriers to monitor changes to mature objects: whenever you add a reference from one object to another (whenever you write to or modify an object), a write barrier is triggered. The barriers check whether the source object is mature, and if so add the mature object to a special list. Later Ruby 2.1 includes just these modified mature objects in the next mark and sweep process, preventing active, young objects from being incorrectly considered garbage.

Ruby 2.1's actual implementation of write barriers is quite complex, primarily because existing C extensions don't contain them. Koichi Sasada and the Ruby core team used a number of clever solutions to overcome this challenge as well. To learn more about these technical details, watch Koichi's fascinating presentation from EuRuKo 2013.
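The write barrier idea can be sketched as a toy model in Python. Everything here is a simplification for illustration and not MRI's implementation: `Obj`, `add_reference`, `minor_gc` and the `remembered` list are hypothetical names, and the model ignores C extensions and full collections entirely.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []      # objects this one points to
        self.old = False    # becomes True once the object survives a marking


remembered = []             # old objects modified since the last collection


def add_reference(source, target):
    # Write barrier: record old objects that gain a new outgoing reference.
    if source.old and source not in remembered:
        remembered.append(source)
    source.refs.append(target)


def minor_gc(roots, young):
    # Start from young roots plus the young children of remembered old objects.
    stack = [obj for obj in roots if not obj.old]
    for source in remembered:
        stack.extend(child for child in source.refs if not child.old)

    marked = set()
    while stack:
        obj = stack.pop()
        if id(obj) not in marked:
            marked.add(id(obj))
            # Old objects are never traversed here; that is the time saving.
            stack.extend(child for child in obj.refs if not child.old)

    survivors = [obj for obj in young if id(obj) in marked]
    garbage = [obj for obj in young if id(obj) not in marked]
    for obj in survivors:
        obj.old = True      # promote survivors to the old generation
    remembered.clear()
    return survivors, garbage


array = Obj("long-lived array")
array.old = True                     # pretend it survived an earlier GC
new_value = Obj("new value")
temp = Obj("temporary")
add_reference(array, new_value)      # barrier fires: array is remembered

survivors, garbage = minor_gc(roots=[array], young=[new_value, temp])
print([o.name for o in survivors], [o.name for o in garbage])
```

Without the `remembered` list, `new_value` would be reachable only through an old object that the minor collection skips, and it would be reported as garbage along with `temp`.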

Standing on the Shoulders of Giants


At first glance, Ruby and Python seem to implement garbage collection very differently. Ruby uses John McCarthy's original mark and sweep algorithm, while Python uses reference counting. But when we look more closely, we see that Python uses bits of the mark and sweep idea to handle cyclic references, and that both Ruby and Python use generational garbage collection in similar ways. Python uses three separate generations, while Ruby 2.1 uses two.

This similarity should not be a surprise. Both languages are using computer science research that was done decades ago, before either Ruby or Python was even invented. I find it fascinating that when you look "under the hood" at different programming languages, you often find the same fundamental ideas and algorithms are used by all of them. Modern programming languages owe a great deal to the groundbreaking computer science research that John McCarthy and his contemporaries did back in the 1960s and 1970s.



Why You Should Be Excited About Garbage Collection in Ruby 2.0


March 23rd 2012

While not very glamorous, Bitmap Marking Garbage Collection is a dramatic, creative innovation!

You may have heard last week how Innokenty Mihailov's great Enumerable::Lazy feature was accepted into the Ruby 2.0 code base. But you may not have heard about an even more significant change that was merged into Ruby 2.0 in January: a new algorithm for garbage collection called "Bitmap Marking." The developer behind this sophisticated and innovative change, Narihiro Nakamura, has been working on this since 2008 at least and also implemented the "Lazy Sweep" garbage collection algorithm already included in Ruby 1.9.3. The new Bitmap Marking GC algorithm promises to dramatically reduce overall memory consumption by all Ruby processes running on a web server!

But what does "bitmap marking" really mean? And exactly why will it reduce memory consumption? If you know Japanese you can read a detailed academic paper published in 2008 by Narihiro Nakamura along with Yukihiro ("Matz") Matsumoto. I was so interested I spent some time this week studying the garbage collection code in MRI Ruby myself, and this article will summarize what I learned. You won't get any Ruby programming tips here today, but hopefully you'll come away with a better understanding of how garbage collection actually works internally, of why Ruby 2.0 is something to look forward to, and of how innovative and creative the Ruby core developers really are.

Mark and Sweep


As I explained in my article from January, Never create Ruby strings longer than 23 characters, every Ruby string value is saved internally by MRI in a C structure called RString, short for "Ruby String." Each RString structure is split into two halves like this:


At the bottom we have the actual string data itself, while at the top I've shown the word "flags" to represent various internal metadata values about the string that Ruby keeps track of. It turns out that all values used by your Ruby program are saved in similar structures called RArray, RHash, RFile, etc. They all share the same basic layout: some data and the same set of flags. The common name for this type of structure, which is shared across all the internal object types, is RValue, meaning "Ruby Value." Ruby allocates and organizes these RValue structures in arrays called "heaps." Here's a conceptual diagram of a Ruby heap array, containing the three string values along with many other RValues:

As your Ruby program runs, whenever you create a new variable or value of some type the Ruby interpreter finds an available RValue structure in the heap and uses it to save the new value. Of course, you don't need to worry about this at all; it's all handled automatically and smoothly for you.
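
If you'd like to peek at these heap slots from Ruby itself, here is a small sketch using ObjectSpace.count_objects. I'm assuming you are running MRI; other Ruby implementations may not provide this method, and the exact numbers will differ from machine to machine.

```ruby
# ObjectSpace.count_objects is specific to MRI (C Ruby); other
# implementations may not provide it.
before = ObjectSpace.count_objects

# Create some new string values and keep them referenced so they stay alive.
strings = Array.new(10_000) { |i| "string number #{i}" }

after = ObjectSpace.count_objects

puts "Total slots in all heaps: #{after[:TOTAL]}"
puts "Free slots:               #{after[:FREE]}"
puts "Slots holding strings:    #{after[:T_STRING]} (was #{before[:T_STRING]})"
```

Each of the new strings occupies one slot, so the total slot count can never be smaller than the number of values you are holding on to.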

Well, it's not that smooth at times, actually. What happens when the RValue structures in the heap run out, when there are none left to save a new value your program requires? This actually happens more frequently than you might expect because there are many RValue structures that you might not be aware of, created internally by Ruby. In fact, your Ruby code itself is converted into a large number of RValue structures as it is parsed and converted into byte code. When there are no more RValue structures available and your program needs to save a new value, Ruby runs its "garbage collection" (GC) code. The garbage collector's job is to find which of these RValues are no longer being referenced by your program and can be recycled and reused for some other value.

Here's how it works, at a high level. First, the GC code "marks" all of the active RValue structures. That is, it loops through all of the variables and other active references that your program has to RValue structures, and marks each one using one of those internal flags, called FL_MARK.

This is the first half of Ruby's "Mark and Sweep" GC algorithm. The marked structures are actively being used by your Ruby program and cannot be freed or reused. Once all of the active structures are marked, the remaining structures are "swept" into a single linked list using the "next" pointer in each RValue structure:

In this diagram, I've shown the FL_MARK flags in the heap array with the letter "M," and below that you can see the list of unmarked RValues, called the "free list":

As you might guess, the free list can now be used to provide new RValue structures to your Ruby program as it continues to run. Now every time your Ruby program allocates a new object or value, it uses an RValue from the free list, and removes it from the list. Eventually the free list will become empty again and Ruby will have to start another garbage collection. After a while it might be that there are no unmarked structures left in the heap at all, that all of the available RValues are being used, in which case Ruby will allocate an entire new heap with more RValue structures. (Actually it allocates new heaps 10 at a time.) A typical Ruby program might end up having many different heap arrays.
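
To make the two phases concrete, here is a toy model of mark and sweep written in Ruby. This is only a sketch of the idea: Slot, ToyHeap and their methods are names I made up for illustration, and MRI's real implementation is C code that works on RValue structures, not on Ruby objects.

```ruby
# A toy model of the mark and sweep steps. Slot, ToyHeap and their
# methods are hypothetical names for illustration, not MRI internals.
Slot = Struct.new(:value, :refs, :marked, :next_free)

class ToyHeap
  attr_reader :slots, :free_list

  def initialize(size)
    @slots = Array.new(size) { Slot.new(nil, [], false, nil) }
    @free_list = nil
    # Initially every slot is on the free list.
    @slots.reverse_each { |slot| push_free(slot) }
  end

  # Take a slot off the free list and store a value in it.
  def allocate(value)
    raise "heap exhausted: a real VM would run GC or add a heap" if @free_list.nil?
    slot = @free_list
    @free_list = slot.next_free
    slot.next_free = nil
    slot.value = value
    slot.refs = []
    slot
  end

  # Mark phase: flag every slot reachable from the roots.
  def mark(roots)
    stack = roots.dup
    until stack.empty?
      slot = stack.pop
      next if slot.marked
      slot.marked = true
      stack.concat(slot.refs)
    end
  end

  # Sweep phase: chain unmarked slots into the free list, clear marks.
  def sweep
    @free_list = nil
    @slots.reverse_each do |slot|
      if slot.marked
        slot.marked = false
      else
        slot.value = nil
        slot.refs = []
        push_free(slot)
      end
    end
  end

  # Walk the linked free list and count its slots.
  def free_count
    count = 0
    slot = @free_list
    while slot
      count += 1
      slot = slot.next_free
    end
    count
  end

  private

  def push_free(slot)
    slot.next_free = @free_list
    @free_list = slot
  end
end

heap = ToyHeap.new(5)
a = heap.allocate("a")
b = heap.allocate("b")
c = heap.allocate("c")   # never referenced from a root
a.refs << b              # a refers to b

heap.mark([a])           # a is the only root
heap.sweep
puts heap.free_count     # prints 3
```

After the sweep, the slot that held "c" is back on the free list together with the two slots that were never used, while "a" and "b" survive because they were reachable from the root.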

Copy-On-Write: how Unix shares memory across different child processes


Before we can get to "Bitmap Marking" and why it's important, we first need to learn about a feature of Linux and other Unix and Unix-like operating systems that is related to memory management and memory allocation: Copy-On-Write optimization. On these OSes, when a process calls fork to create a child process which is a copy of itself, the new child process will share all of the memory (all of the data, variables, etc.) that the parent had previously allocated. This makes the fork call much faster by avoiding copying memory around unnecessarily, and also reduces the total amount of memory required. This is called "Copy-On-Write" because a separate copy of a shared memory segment is made only when and if one of the processes tries to modify it. This is similar to the trick that the Ruby interpreter itself uses to manage RString values; for details check out a post I wrote in January about this: "Seeing double: how Ruby shares string values." To understand this better, take a look at this conceptual diagram of a Ruby process:

Here I've shown a Ruby program that has two heaps as an example. Now suppose this Ruby program is running on a web server (maybe it's a Rails web application) and now a second HTTP request arrives from another user:

Now we have two Ruby processes running. Possibly this server is running Apache with something like Passenger that forks a separate Ruby process to handle each HTTP request. The nice thing about Copy-On-Write optimization in Linux is that many of the RValue structures in the heap arrays can be shared between these two Ruby programs, since they often contain the same values. It might not seem that this would be the case at first glance; why would many, or any, of the variables in two Ruby programs be the same? But remember on a web server you are actually running two or more copies of the same code, creating the same variables over and over again. Also, many of the RValue structures in the heap actually correspond to the parsed version of your Ruby program itself: the nodes in the "Abstract Syntax Tree" (AST). Since each process is running the same code, all of these nodes will have the same values and won't ever change. Of course, some of the data values will be different and will be saved separately inside each process: user data typed into web forms and submitted, results of SQL queries on different records, etc.

But, as great as this sounds, it doesn't actually work for Ruby! Why not? Well, because as soon as Ruby has to run the Mark & Sweep garbage collection algorithm I explained above, all of those AST nodes and many other RValue structures in the heap are marked, since they are still being used by the Ruby program. This means they are modified to set the FL_MARK flag, and the Copy-On-Write code in the operating system has to start creating new copies of the memory. So in fact on a typical Ruby web server this is what happens:

That one little FL_MARK bit is wreaking havoc! It prevents what would normally be a tremendous reduction in server memory usage from actually happening. One important note here: Hongli Lai from Phusion, the creators of the popular Passenger middleware component that connects Apache with Rack based Ruby apps, patched Ruby 1.8 and created a new version of Ruby known as Ruby Enterprise Edition that solves this problem and contains a number of other performance improvements. So in fact many Ruby 1.8 apps that use REE have been able to take advantage of Unix Copy-On-Write for years now. But Copy-On-Write still doesn't work with standard MRI Ruby 1.8 or 1.9.
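
To get a feel for how much damage one flag bit per object can do, here is a small back-of-the-envelope simulation. It is not a measurement of a real process: the page size, slot size, heap size and the share of live objects are all values I assumed for illustration, and the arithmetic simply works out which memory pages a mark phase would write to.

```ruby
# Toy simulation; PAGE_SIZE, SLOT_SIZE and SLOT_COUNT are assumed
# values chosen for illustration, not MRI or kernel constants.
PAGE_SIZE  = 4096     # bytes per memory page (a common size, assumed here)
SLOT_SIZE  = 40       # bytes per object slot (assumed)
SLOT_COUNT = 10_000   # slots in the simulated heap

# Suppose only every tenth slot holds a live object, such as an AST node.
live_slots = (0...SLOT_COUNT).step(10).to_a

# Marking writes a flag inside each live object, so the page that
# contains the object is written to and must be copied by the child.
dirty_pages = live_slots.map { |i| (i * SLOT_SIZE) / PAGE_SIZE }.uniq

# Number of pages the simulated heap occupies, rounded up.
total_pages = (SLOT_COUNT * SLOT_SIZE + PAGE_SIZE - 1) / PAGE_SIZE

puts "Heap pages shared after fork:  #{total_pages}"
puts "Pages copied after one GC run: #{dirty_pages.size}"
```

With these numbers only one slot in ten is live, yet every one of the 98 pages contains at least one live object, so a single mark phase forces all of them to be copied.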

Garbage Collection in Ruby 2.0: Bitmap Marking


Here's where Narihiro Nakamura's changes for Ruby 2.0 come in! Instead of using the FL_MARK bit in each of the RValue structures to indicate that Ruby is still using a value and that it cannot be freed, Ruby 2.0 saves this information in something called a "bitmap" instead. No, here "bitmap" does not refer to an image file; "bitmap" in this context refers to a literal collection of bits mapped back to the RValue structures:

For each heap in Ruby 2.0 there is now a corresponding memory structure that contains a series of 1 or 0 bit values. As you might guess, the 1 values are equivalent to the FL_MARK flag being set in a Ruby 1.8 or Ruby 1.9 process, while a 0 is equivalent to the FL_MARK flag not being set. In other words, the FL_MARK bits have been moved out of the RString and other object value structures, and into this separate memory area called the bitmap. Narihiro implemented this by adding a header structure to the beginning of each heap which contains a pointer to the bitmap corresponding to that heap's RValue structures, along with some other values.

What this means is that Ruby 2.0 can now mark all of the in-use structures during the "mark" portion of the GC processing without actually modifying the structures themselves, allowing Unix to continue to share memory across different Ruby processes! The bitmaps themselves, of course, are modified frequently by Ruby 2.0, but since they use a contiguous stream of bits they are actually quite small and can be saved separately in each process without using too much memory.

One interesting and important detail here is that the memory allocated for heaps now must be "aligned." What this means is that when allocating memory for the heap, instead of calling malloc as usual, the Ruby C code calls posix_memalign, which on a Linux or Unix operating system returns the new memory aligned to a power of two address boundary. What the heck does that mean? Well, if you're familiar with C programming or bitwise arithmetic, it allows the Ruby C code to quickly calculate the location of the "header" structure, which contains the pointer to the bitmap, from a given RValue object's memory address. Let's take another look at a Ruby 2.0 heap:

Suppose that the Ruby 2.0 garbage collector code needs to mark the fifth RValue object in this heap, referred to by the ptr value. The memory alignment trick allows Ruby 2.0 to take the ptr value and quickly calculate the address of its heap header structure. All Ruby 2.0 has to do is mask out the last few bits of the RValue address, the "68" hexadecimal offset in this example, to obtain the address of the header structure, at "membase" or 0x8066C000 in this 32-bit example.
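
The masking trick is easy to try out with plain integers. The sketch below is written in Ruby for readability, even though the real code is C, and every constant in it is my own assumption: I picked a 16 KB alignment, a 24 byte header and a 20 byte RValue so that an object at offset 68 (hexadecimal) comes out as the fifth one in the heap. The address is illustrative too. None of these values are taken from the MRI source.

```ruby
# All constants below are assumptions for illustration; they are not
# taken from the MRI source.
HEAP_ALIGN  = 0x4000          # assume heaps are aligned to 16 KB
ALIGN_MASK  = HEAP_ALIGN - 1  # the low bits that hold the offset
HEADER_SIZE = 24              # assumed size of the heap header in bytes
RVALUE_SIZE = 20              # assumed size of one RValue in bytes

# Clear the low bits of an object address to find its heap header.
def header_address(ptr)
  ptr & ~ALIGN_MASK
end

# Index of the object's mark bit within the heap's bitmap.
def bit_index(ptr)
  ((ptr & ALIGN_MASK) - HEADER_SIZE) / RVALUE_SIZE
end

# The bitmap lives outside the objects; a plain Integer stands in for it.
def mark(bitmap, ptr)
  bitmap | (1 << bit_index(ptr))
end

ptr = 0x8066C068              # illustrative address of the fifth RValue
membase = header_address(ptr)
bitmap = mark(0, ptr)

puts format("membase   = 0x%08X", membase)
puts format("bit index = %d, bitmap = %05b", bit_index(ptr), bitmap)
```

Notice that marking only changes the bitmap integer; nothing at the object's own address is written, which is exactly what keeps the shared heap pages untouched.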

Conclusion
Garbage collection isn't the most glamorous or interesting part of the Ruby language at first glance, but as we've seen, if you take a close look at how it works there's a lot of interesting innovation going on. Practically speaking, the Bitmap Marking change will help MRI Ruby 2.0 work better in production web server environments, reducing memory consumption dramatically. But I view Bitmap Marking less as a practical improvement that will help my Rails apps run better, and more just as an exciting, creative solution to a complex problem. It was great fun learning how GC works in Ruby 2.0 and I hope you now have a better appreciation of all the hard work the talented Ruby core team is doing!
