Agenda
• Overview
  – Hadoop, Google
• PaaS Techniques
  – File System: GFS, HDFS
  – Programming Model: MapReduce, Pregel
  – Storage System for Structured Data: BigTable, HBase
Hadoop
• Hadoop is
  – A distributed computing platform
  – A software framework that lets one easily write and run
    applications that process vast amounts of data
  – Inspired by published papers from Google
[Figure, bottom to top: A Cluster of Machines → Hadoop Distributed
File System (HDFS) → MapReduce → HBase → Cloud Applications]
Google
• Google published the designs of its web-search engine
  – SOSP 2003: "The Google File System"
  – OSDI 2004: "MapReduce: Simplified Data Processing on Large Clusters"
  – OSDI 2006: "Bigtable: A Distributed Storage System for Structured Data"
Google vs. Hadoop

                        Google            Hadoop
  Develop Group         Google            Apache
  Sponsor               Google            Yahoo!, Amazon
  Resource              open document     open source
  File System           GFS               HDFS
  Programming Model     MapReduce         Hadoop MapReduce
  Storage System
  (for structured data) BigTable          HBase
  Search Engine         Google            Nutch
  OS                    Linux             Linux / GPL
Agenda
• Overview
  – Hadoop, Google
• PaaS Techniques
  – File System: GFS, HDFS
  – Programming Model: MapReduce, Pregel
  – Storage System for Structured Data: BigTable, HBase
FILE SYSTEM
File System Overview
Distributed File Systems (DFS)
Google File System (GFS)
Hadoop Distributed File System (HDFS)
File System Overview
• System that permanently stores data
• Stores data in units called "files" on disks and other media
• Files are managed by the Operating System
• The part of the Operating System that deals with files is
  known as the "File System"
  – A file is a collection of disk blocks
  – The File System maps file names and offsets to disk blocks
• The set of valid paths forms the "namespace" of the file system
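The name-and-offset-to-block mapping just described can be sketched in a few lines of Python. Everything here (the `ToyFileSystem` class, the 4 KB block size) is illustrative, not any real filesystem's on-disk layout:

```python
BLOCK_SIZE = 4096  # bytes per disk block (a common size for local filesystems)

class ToyFileSystem:
    """Maps (file name, byte offset) to disk blocks, as described above."""
    def __init__(self):
        self.next_block = 0
        self.files = {}  # name -> list of block numbers (the namespace)

    def allocate(self, name, num_bytes):
        # Allocate enough whole blocks to hold num_bytes.
        count = (num_bytes + BLOCK_SIZE - 1) // BLOCK_SIZE
        blocks = list(range(self.next_block, self.next_block + count))
        self.next_block += count
        self.files[name] = blocks
        return blocks

    def block_for(self, name, offset):
        # The file system's core job: file name + offset -> disk block.
        return self.files[name][offset // BLOCK_SIZE]

fs = ToyFileSystem()
fs.allocate("/var/log/app.log", 10000)         # needs 3 blocks
print(fs.block_for("/var/log/app.log", 5000))  # -> 1 (the second block)
```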
What Gets Stored
• User data itself is the bulk of the file system's contents
• Also includes metadata, on a volume-wide and per-file basis
  – Volume-wide: available space, formatting info, character set, …
  – Per-file: name, owner, modification date, …
Design Considerations
• Namespace
  – Physical mapping
  – Logical volume
• Consistency
  – What to do when more than one user reads/writes the same file?
• Security
  – Who can do what to a file?
  – Authentication / Access Control List (ACL)
• Reliability
  – Can files avoid damage at power outages or other hardware
    failures?
Local FS on Unix-like Systems (1/4)
• Namespace
  – root directory "/", followed by directories and files
• Consistency
  – "sequential consistency": newly written data are immediately
    visible to open reads
• Security
  – uid/gid, mode of files
  – Kerberos: tickets
• Reliability
  – journaling, snapshot
Local FS on Unix-like Systems (2/4)
• Namespace
  – Physical mapping
    • a directory and all of its subdirectories are stored on the
      same physical media
    • /mnt/cdrom
    • /mnt/disk1, /mnt/disk2 when you have multiple disks
  – Logical volume
    • a logical namespace that can contain multiple physical media
      or a partition of a physical medium
    • still mounted like /mnt/vol1
    • dynamic resizing by adding/removing disks without reboot
    • splitting/merging volumes as long as no data spans the split
Local FS on Unix-like Systems (3/4)
• Journaling
  – Changes to the filesystem are logged in a journal before
    being committed
    • useful if an atomic action needs two or more writes, e.g.,
      appending to a file (update metadata + allocate space +
      write the data)
    • a journal can be played back to recover data quickly in
      case of hardware failure
  – What to log?
    • changes to file content: heavy overhead
    • changes to metadata: fast, but data corruption may occur
  – Implementations: ext3, ReiserFS, IBM's JFS, etc.
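A minimal sketch of the journaling idea, assuming an in-memory stand-in for the on-disk journal. `JournaledStore` and its methods are hypothetical names, not a real filesystem API:

```python
class JournaledStore:
    """Metadata journaling sketch: log the change first, then apply it.
    After a crash, replaying the journal restores committed state."""
    def __init__(self):
        self.journal = []   # stands in for the durable on-disk journal
        self.metadata = {}  # stands in for the live filesystem metadata

    def set(self, key, value):
        self.journal.append((key, value))  # 1. write the journal entry
        self.metadata[key] = value         # 2. apply to the filesystem

    def recover(self):
        # Replay the journal from scratch, e.g. after a power outage
        # that hit between steps 1 and 2.
        replayed = {}
        for key, value in self.journal:
            replayed[key] = value
        self.metadata = replayed

store = JournaledStore()
store.set("/a", {"size": 100})
store.journal.append(("/b", {"size": 5}))  # crash: journaled but not applied
store.recover()
print(store.metadata["/b"]["size"])  # -> 5: the journaled write survives
```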
Local FS on Unix-like Systems (4/4)
• Snapshot
  – A snapshot is a copy of a set of files and directories at a
    point in time
    • read-only snapshots, read-write snapshots
    • usually done by the filesystem itself, sometimes by LVMs
    • backing up data can be done on a read-only snapshot without
      worrying about consistency
  – Copy-on-write is a simple and fast way to create snapshots
    • the current data is the snapshot
    • a request to write to a file creates a new copy, and work
      proceeds from there afterwards
  – Implementations: UFS, Sun's ZFS, etc.
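The copy-on-write idea can be sketched as follows; `CowFile` is a toy illustration (real filesystems do this at the block-pointer level on disk):

```python
class CowFile:
    """Copy-on-write snapshot sketch: a snapshot shares the current
    blocks; a write after the snapshot replaces only the live block."""
    def __init__(self, blocks):
        self.blocks = blocks   # live view: list of block contents
        self.snapshots = []

    def snapshot(self):
        # Cheap: share the current block references, copy no data.
        self.snapshots.append(list(self.blocks))

    def write(self, index, data):
        # Writing replaces the block in the live view only; snapshots
        # keep referring to the old block object.
        self.blocks[index] = data

f = CowFile([b"old0", b"old1"])
f.snapshot()
f.write(1, b"new1")
print(f.blocks[1], f.snapshots[0][1])  # b'new1' b'old1'
```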
FILE SYSTEM
File System Overview
Distributed File Systems (DFS)
Google File System (GFS)
Hadoop Distributed File System (HDFS)
Distributed File Systems
• Allows access to files from multiple hosts, shared via a
  computer network
• Must support concurrency
  – Make varying guarantees about locking, who "wins" with
    concurrent writes, etc.
  – Must gracefully handle dropped connections
• May include facilities for transparent replication and fault
  tolerance
• Different implementations sit in different places on the
  complexity/feature scale
When is a DFS Useful?
• Multiple users want to share files
• The data may be much larger than the storage space of one
  computer
• A user wants to access his/her data from different machines at
  different geographic locations
• Users want a storage system
  – Backup
  – Management
• Note that a "user" of a DFS may actually be a "program"
Design Considerations of DFS (1/2)
• Different systems have different designs and behaviors on the
  following features
  – Interface
    • file system, block I/O, custom made
  – Security
    • various authentication/authorization schemes
  – Reliability (fault-tolerance)
    • continue to function when some hardware fails (disks, nodes,
      power, etc.)
Design Considerations of DFS (2/2)
  – Namespace (virtualization)
    • provide a logical namespace that can span across physical
      boundaries
  – Consistency
    • all clients get the same data all the time
    • related to locking, caching, and synchronization
  – Parallelism
    • multiple clients can access multiple disks at the same time
  – Scope
    • local area network vs. wide area network
FILE SYSTEM
File System Overview
Distributed File Systems (DFS)
Google File System (GFS)
Hadoop Distributed File System (HDFS)
Google File System
How to process large data sets and easily utilize the resources
of a large distributed system ...

Google File System
• Motivations
• Design Overview
• System Interactions
• Master Operations
• Fault Tolerance
Motivations
• Fault-tolerance and auto-recovery need to be built into the
  system
• Standard I/O assumptions (e.g., block size) have to be
  re-examined
• Record appends are the prevalent form of writing
• Google applications and GFS should be co-designed
DESIGN OVERVIEW
Assumptions
Architecture
Metadata
Consistency Model
Assumptions (1/2)
• High component failure rates
  – Inexpensive commodity components fail all the time
  – Must monitor itself and detect, tolerate, and recover from
    failures on a routine basis
• Modest number of large files
  – Expect a few million files, each 100 MB or larger
  – Multi-GB files are the common case and should be managed
    efficiently
• The workloads primarily consist of two kinds of reads
  – large streaming reads
  – small random reads
Assumptions (2/2)
• The workloads also have many large, sequential writes that
  append data to files
  – Typical operation sizes are similar to those for reads
• Well-defined semantics for multiple clients that concurrently
  append to the same file
• High sustained bandwidth is more important than low latency
  – Place a premium on processing data in bulk at a high rate,
    while few operations have stringent response-time requirements
Design Decisions
• Reliability through replication
• Single master to coordinate access and keep metadata
  – Simple, centralized management
• No data caching
  – Little benefit on the client: large data sets / streaming reads
  – No need on the chunkserver: rely on existing file buffers
  – Simplifies the system by eliminating cache coherence issues
• Familiar interface, but customize the API
  – No POSIX: simplify the problem, focus on Google apps
  – Add snapshot and record append operations
DESIGN OVERVIEW
Assumptions
Architecture
Metadata
Consistency Model
Architecture
[Figure: GFS architecture — clients, a single master, and many
chunkservers]
• Each chunk is identified by an immutable and globally unique
  64-bit chunk handle
Roles in GFS
• Roles: master, chunkserver, client
  – Commodity Linux boxes, user-level server processes
  – Client and chunkserver can run on the same box
• Master holds metadata
• Chunkservers hold data
• Client produces/consumes data
Single Master
• The master has global knowledge of chunks
  – Easy to make decisions on placement and replication
• From distributed systems we know this is a:
  – Single point of failure
  – Scalability bottleneck
• GFS solutions:
  – Shadow masters
  – Minimize master involvement
    • never move data through it, use it only for metadata
    • cache metadata at clients
    • large chunk size
    • master delegates authority to primary replicas in data
      mutations (chunk leases)
Chunkserver Data
• Data organized in files and directories
  – Manipulation through file handles
• Files stored in chunks (cf. "blocks" in disk file systems)
  – A chunk is a Linux file on the local disk of a chunkserver
  – Unique 64-bit chunk handles, assigned by the master at
    creation time
  – Fixed chunk size of 64 MB
  – Read/write by (chunk handle, byte range)
  – Each chunk is replicated across 3+ chunkservers
Chunk Size
• Each chunk is 64 MB
• A large chunk size offers important advantages when stream
  reading/writing
  – Less communication between client and master
  – Less memory space needed for metadata in the master
  – Less network overhead between client and chunkserver (one TCP
    connection for a larger amount of data)
• On the other hand, a large chunk size has its disadvantages
  – Hot spots
  – Fragmentation
DESIGN OVERVIEW
Assumptions
Architecture
Metadata
Consistency Model
Metadata
GFS master
  - Namespace (file, chunk)
  - Mapping from files to chunks
  - Current locations of chunks
  - Access control information
All in memory during operation
Metadata (cont.)
• Namespace and file-to-chunk mapping are kept persistent
  – operation logs + checkpoints
• Operation logs: a historical record of mutations
  – represents the timeline of changes to metadata in concurrent
    operations
  – stored on the master's local disk
  – replicated remotely
• A mutation is not done or visible until the operation log is
  stored locally and remotely
  – the master may group operation logs for batch flush
Recovery
• Recover the file system = replay the operation logs
  – the "fsck" of GFS after, e.g., a master crash
• Use checkpoints to speed up
  – memory-mappable, no parsing
  – Recovery = read in the latest checkpoint + replay logs taken
    after the checkpoint
  – Incomplete checkpoints are ignored
  – Old checkpoints and operation logs can be deleted
• Creating a checkpoint must not delay new mutations
  1. Switch to a new log file for new operation logs; all
     operation logs up to now are now "frozen"
  2. Build the checkpoint in a separate thread
  3. Write it locally and remotely
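The checkpoint-plus-log recovery scheme can be sketched as follows; `MasterLog` is a hypothetical in-memory stand-in for the master's durable state, not the real GFS code:

```python
class MasterLog:
    """Recovery sketch: state is rebuilt from the latest checkpoint
    plus the operation logs recorded after it."""
    def __init__(self):
        self.state = {}
        self.log = []          # operation log since the last checkpoint
        self.checkpoint = {}   # snapshot of state (memory-mappable in GFS)

    def mutate(self, key, value):
        self.log.append((key, value))  # logged before becoming visible
        self.state[key] = value

    def take_checkpoint(self):
        # Freeze current logs, snapshot state, start a new log file.
        self.checkpoint = dict(self.state)
        self.log = []

    def recover(self):
        self.state = dict(self.checkpoint)  # read the latest checkpoint
        for key, value in self.log:         # replay newer operation logs
            self.state[key] = value

m = MasterLog()
m.mutate("/a", 1)
m.take_checkpoint()
m.mutate("/b", 2)
m.state = {}       # simulate a master crash losing in-memory state
m.recover()
print(sorted(m.state))  # -> ['/a', '/b']
```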
Chunk Locations
• Chunk locations are not stored on the master's disks
  – The master asks chunkservers what they have during master
    startup, or when a new chunkserver joins the cluster
  – It decides chunk placements thereafter
  – It monitors chunkservers with regular heartbeat messages
• Rationale
  – Disks fail
  – Chunkservers die, (re)appear, get renamed, etc.
  – Eliminates the synchronization problem between the master and
    all chunkservers
DESIGN OVERVIEW
Assumptions
Architecture
Metadata
Consistency Model
Consistency Model
• GFS has a relaxed consistency model
• File namespace mutations are atomic and consistent
  – handled exclusively by the master
  – a namespace lock guarantees atomicity and correctness
  – order defined by the operation logs
• File region mutations: complicated by replicas
  – "Consistent": all replicas have the same data
  – "Defined": consistent + the replica reflects the mutation
    entirely
  – A relaxed consistency model: not always consistent, and not
    always defined, either
Consistency Model (cont.)
[Figure: states of a file region after a mutation — defined,
consistent but undefined, inconsistent]
Google File System
• Motivations
• Design Overview
• System Interactions
• Master Operations
• Fault Tolerance
SYSTEM INTERACTIONS
Read/Write
Concurrent Write
Atomic Record Appends
Snapshot
While Reading a File
[Sequence diagram: Application — GFS Client — Master — Chunkserver]
Open:
1. The application calls Open(name, read); the client returns a
   file handle.
Read:
2. The application calls Read(handle, offset, length, buffer).
3. The client sends (handle, chunk_index) to the master, which
   replies with (chunk_handle, chunk_locations); the client caches
   the (handle, chunk_index) → (chunk_handle, locations) mapping.
4. The client selects a replica and sends (chunk_handle,
   byte_range) to that chunkserver, which returns the data.
5. The client passes the data and a return code back to the
   application.
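The client side of this exchange can be sketched as below; `GfsClientCache` and `master_lookup` are hypothetical names standing in for the real client library and the RPC to the master:

```python
CHUNK_SIZE = 64 * 1024 * 1024  # GFS's fixed 64 MB chunk size

def chunk_index(offset):
    # The client translates a byte offset into a chunk index before
    # asking the master.
    return offset // CHUNK_SIZE

class GfsClientCache:
    """Sketch of the client-side (handle, chunk_index) -> locations
    cache described in the read flow above."""
    def __init__(self, master_lookup):
        self.master_lookup = master_lookup
        self.cache = {}

    def locate(self, handle, offset):
        key = (handle, chunk_index(offset))
        if key not in self.cache:          # only a miss contacts the master
            self.cache[key] = self.master_lookup(*key)
        return self.cache[key]

# Toy master: chunk i of any file lives on three fixed chunkservers.
client = GfsClientCache(lambda h, i: (f"chunk-{i}", ["cs1", "cs2", "cs3"]))
print(chunk_index(200 * 1024 * 1024))  # -> 3
print(client.locate("handle-42", 0))   # ('chunk-0', ['cs1', 'cs2', 'cs3'])
```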
While Writing to a File
[Sequence diagram: Application — GFS Client — Master — Primary
Chunkserver — Secondary Chunkservers]
1. The application calls Write(handle, offset, length, buffer).
2. The client queries the master (or its cache) and receives
   (chunk_handle, primary_id, replica_locations); the master
   grants a lease to the primary if one was not granted before.
Data Push:
3. The client pushes the data to all replicas; each chunkserver
   acknowledges that the data was received.
Commit:
4. The client sends the write request (with data ids) to the
   primary.
5. The primary assigns a mutation order and writes to disk, then
   forwards the order to the secondaries, which do the same.
6. The secondaries reply "complete" to the primary; the primary
   replies "completed" to the client; the client returns a return
   code to the application.
Lease Management
• A crucial part of concurrent write/append operation
  – Designed to minimize the master's management overhead by
    authorizing chunkservers to make decisions
• One lease per chunk
  – Granted to a chunkserver, which becomes the primary
  – Granting a lease increases the version number of the chunk
  – Reminder: the primary decides the mutation order
• The primary can renew the lease before it expires
  – Piggybacked on the regular heartbeat message
• The master can revoke a lease (e.g., for snapshot)
• The master can grant the lease to another replica if the
  current lease expires (primary crashed, etc.)
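A lease is essentially a lock with an expiration time; the sketch below illustrates granting, renewal, and expiry. `ChunkLease` is an illustrative name, and the 60-second duration is the initial timeout reported in the GFS paper:

```python
class ChunkLease:
    """Lease sketch: a lock with an expiration time, held by the
    primary replica of a chunk."""
    DURATION = 60.0  # seconds

    def __init__(self, primary, now):
        self.primary = primary
        self.expires = now + self.DURATION
        self.version = 1  # granting a lease bumps the chunk version

    def valid(self, now):
        return now < self.expires

    def renew(self, now):
        # Piggybacked on the heartbeat; only the current holder renews.
        if self.valid(now):
            self.expires = now + self.DURATION

lease = ChunkLease("chunkserver-7", now=0.0)
lease.renew(now=50.0)
print(lease.valid(100.0))  # -> True: renewal pushed expiry to 110 s
print(lease.valid(200.0))  # -> False: master may grant it elsewhere
```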
Mutation
1. Client asks the master for replica locations
2. Master responds
3. Client pushes data to all replicas; replicas store it in a
   buffer cache
4. Client sends a write request to the primary (identifying the
   data that had been pushed)
5. Primary forwards the request to the secondaries (identifies
   the order)
6. The secondaries respond to the primary
7. The primary responds to the client
Mutation (cont.)
• Mutation = write or append
  – must be done for all replicas
• Goal
  – minimize master involvement
• Lease mechanism for consistency
  – master picks one replica as primary, gives it a "lease" for
    mutations
  – a lease = a lock that has an expiration time
  – primary defines a serial order of mutations
  – all replicas follow this order
• Data flow is decoupled from control flow
SYSTEM INTERACTIONS
Read/Write
Concurrent Write
Atomic Record Appends
Snapshot
Concurrent Write
• If two clients concurrently write to the same region of a file,
  any of the following may happen to the overlapping portion:
  – Eventually the overlapping region may contain data from
    exactly one of the two writes
  – Eventually the overlapping region may contain a mixture of
    data from the two writes
• Furthermore, if a read is executed concurrently with a write,
  the read operation may see all of the write, none of the write,
  or just a portion of the write
Consistency Model (remind)
[Figure: three replicas of chunk C1]
• Write "x" at region j in C1: while some replicas still miss the
  write, the region is inconsistent; once all replicas hold "x",
  it is consistent
• Concurrently write "xyz" and "abc" at region j in C1: all
  replicas may end up with the same interleaving (e.g. "xyzabc"),
  so the region is consistent but undefined
Write / Concurrent Write Trade-offs
• Some properties
  – concurrent writes leave the region consistent, but possibly
    undefined
  – failed writes leave the region inconsistent
• Some work has moved into the applications
  – e.g., self-validating, self-identifying records
Atomic Record Appends
• GFS provides an atomic append operation called "record append"
  – The client specifies data, but not the offset
• GFS guarantees that the data is appended to the file atomically
  at least once
  – GFS picks the offset and returns it to the client
  – works for concurrent writers
• Used heavily by Google apps
  – e.g., for files that serve as multiple-producer/single-consumer
    queues
  – contain merged results from many different clients
How Record Append Works
• Query and Data Push are similar to the write operation
• The client sends the write request to the primary
• If appending would exceed the chunk boundary
  – The primary pads the current chunk, tells the other replicas
    to do the same, and replies to the client asking it to retry
    on the next chunk
• Else
  – commit the write in all replicas
• Any replica failure: the client retries
[Figure: record append "abc" to chunk C1 with three replicas]
• A failed append leaves the region inconsistent and undefined:
  some replicas hold "abc", others do not
• After the client retries and the append succeeds on all
  replicas, the region is defined, interspersed with the
  inconsistent region left by the failed attempt
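The pad-and-retry behavior at a chunk boundary can be sketched as below. `PrimarySketch` is an illustrative toy (tiny 16-byte chunks so the padding is visible; real GFS chunks are 64 MB):

```python
CHUNK_SIZE = 16  # tiny chunks for illustration (real GFS: 64 MB)

class Chunk:
    def __init__(self):
        self.data = b""

class PrimarySketch:
    """Record-append sketch: pad the chunk and ask the client to
    retry when the record would cross a chunk boundary."""
    def __init__(self):
        self.chunks = [Chunk()]

    def record_append(self, record):
        chunk = self.chunks[-1]
        if len(chunk.data) + len(record) > CHUNK_SIZE:
            # Pad the current chunk and open the next one.
            chunk.data += b"\x00" * (CHUNK_SIZE - len(chunk.data))
            self.chunks.append(Chunk())
            return None  # "retry on the next chunk"
        offset = (len(self.chunks) - 1) * CHUNK_SIZE + len(chunk.data)
        chunk.data += record
        return offset  # GFS picks the offset and returns it

p = PrimarySketch()
p.record_append(b"0123456789")               # lands at offset 0
assert p.record_append(b"abcdefgh") is None  # would cross boundary: pad
print(p.record_append(b"abcdefgh"))          # -> 16 (start of chunk 1)
```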
SYSTEM INTERACTIONS
Read/Write
Concurrent Write
Atomic Record Appends
Snapshot
Snapshot
• Makes a copy of a file or a directory tree almost
  instantaneously
  – minimizes interruptions of ongoing mutations
  – copy-on-write with reference counts on chunks
• Steps
  1. a client issues a snapshot request for source files
  2. the master revokes all leases of affected chunks
  3. the master logs the operation to disk
  4. the master duplicates the metadata of the source files,
     pointing to the same chunks and increasing the reference
     count of the chunks
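The metadata-only nature of step 4 can be sketched with an in-memory master; `SnapshotMaster` is a hypothetical name, and lease revocation and logging are omitted:

```python
class SnapshotMaster:
    """Snapshot sketch: duplicate metadata to the same chunks and
    bump each chunk's reference count; no chunk data moves."""
    def __init__(self):
        self.files = {}     # path -> list of chunk handles
        self.refcount = {}  # chunk handle -> reference count

    def create(self, path, handles):
        self.files[path] = list(handles)
        for h in handles:
            self.refcount[h] = self.refcount.get(h, 0) + 1

    def snapshot(self, src, dst):
        # Step 4: duplicate metadata, pointing at the same chunks.
        self.create(dst, self.files[src])

m = SnapshotMaster()
m.create("/home/user/bar", ["c1", "c2"])
m.snapshot("/home/user/bar", "/save/user/bar")
print(m.refcount["c1"])  # -> 2: a later write to c1 must copy it first
```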
After Snapshot (Read/Write)
[Figure: reading and writing file "bar" after a snapshot]
• Read bar: the file and its snapshot both point to the same
  chunk handle (reference count 2), so reads need no copying
• Write bar: the master sees the reference count is greater than
  1, has the chunkservers copy the chunk data, points the file at
  the new chunk handle (reference count 1 each), and then lets
  the write proceed on the copy
Google File System
• Motivations
• Design Overview
• System Interactions
• Master Operations
• Fault Tolerance
MASTER OPERATIONS
Namespace Management and Locking
Replica Placement
Creation, Rebalancing, Re-replication
Garbage Collection
Stale Replica Detection
Namespace Mgt. and Locking
• Allows multiple operations to be active and use locks over
  regions of the namespace
• Logically represents the namespace as a lookup table mapping
  full pathnames to metadata
• Each node in the namespace tree has an associated read-write
  lock
• Each master operation acquires a set of locks before it runs
Namespace Mgt. and Locking (cont.)
• To operate on /d1/d2/…/dn/leaf, a master operation acquires:
  – Read locks on the directory names
    /d1
    /d1/d2
    …
    /d1/d2/…/dn
  – Either a read lock or a write lock on the full pathname
    /d1/d2/…/dn/leaf
Namespace Mgt. and Locking (cont.)
• How this locking mechanism prevents a file /home/user/foo from
  being created while /home/user is being snapshotted to
  /save/user:

                        Read locks           Write locks
  Snapshot operation    /home, /save         /home/user, /save/user
  Creation operation    /home, /home/user    /home/user/foo

• The two operations both need /home/user (write lock vs. read
  lock), so the creation is serialized after the snapshot
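The lock-set rule can be sketched as follows; `lock_set` and `conflicts` are illustrative helpers (the real master also takes locks on the snapshot's destination paths, omitted here):

```python
def lock_set(path, write_last=True):
    """Locks an operation on `path` takes: read locks on every
    ancestor directory, plus a read or write lock on the full path."""
    parts = path.strip("/").split("/")
    ancestors = ["/" + "/".join(parts[:i]) for i in range(1, len(parts))]
    reads = set(ancestors)
    writes = {path} if write_last else set()
    if not write_last:
        reads.add(path)
    return reads, writes

def conflicts(a, b):
    # Two operations conflict if one write-locks a path the other touches.
    (ra, wa), (rb, wb) = a, b
    return bool(wa & (rb | wb) or wb & (ra | wa))

snapshot = lock_set("/home/user")      # write-locks /home/user
create = lock_set("/home/user/foo")    # read-locks /home/user
print(conflicts(snapshot, create))     # -> True: creation must wait
```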
MASTER OPERATIONS
Namespace Management and Locking
Replica Placement
Creation, Rebalancing, Re-replication
Garbage Collection
Stale Replica Detection
Replica Placement
• Traffic between racks is slower than within the same rack
• A replica is created for 3 reasons
  – Chunk creation
  – Chunk re-replication
  – Chunk rebalancing
• The master has a replica placement policy
  – Maximize data reliability and availability
  – Maximize network bandwidth utilization
  – Must spread replicas across racks
Chunk Creation & Rebalance
• Where to put the initial replicas?
  – Servers with below-average disk utilization
  – But not too many recent creations on a server
  – And must include servers across racks
• The master rebalances replicas periodically
  – Moves chunks for better disk space balance and load balance
  – Fills up new chunkservers
• The master prefers to move chunks out of crowded chunkservers
Chunk Re-replication
• The master re-replicates a chunk as soon as the number of
  available replicas falls below a user-specified goal
  – Chunkserver dies, is removed, etc.
  – Disk fails, is disabled, etc.
  – Chunk is corrupt
  – Goal is increased
• Factors affecting which chunk is cloned first
  – How far is it from the goal?
  – Live files vs. deleted files
  – Blocking a client
• Placement policy is similar to chunk creation
• The master limits the number of clonings per chunkserver and
  cluster-wide, to minimize the impact on client traffic
• Chunkservers throttle cloning reads
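The slide lists the factors but no formula; the sketch below invents weights purely to illustrate how the factors might combine into a cloning priority:

```python
def clone_priority(live_replicas, goal, file_deleted, blocking_client):
    """Toy ordering for which chunk to re-replicate first, using the
    three factors above. The weights are invented for illustration."""
    score = (goal - live_replicas) * 10  # farther from goal = more urgent
    if not file_deleted:
        score += 5                       # live files before deleted ones
    if blocking_client:
        score += 20                      # unblock client progress first
    return score

chunks = [
    ("a", clone_priority(1, 3, False, False)),  # lost 2 of 3 replicas
    ("b", clone_priority(2, 3, False, True)),   # blocking a client
    ("c", clone_priority(2, 3, True, False)),   # belongs to a deleted file
]
print(max(chunks, key=lambda c: c[1])[0])  # -> 'b'
```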
MASTER OPERATIONS
Namespace Management and Locking
Replica Placement
Creation, Rebalancing, Re-replication
Garbage Collection
Stale Replica Detection
Garbage Collection
• Chunks of deleted files are not reclaimed immediately
• Mechanism
  – The client issues a request to delete a file
  – The master logs the operation immediately, renames the file
    to a hidden name with a timestamp, and replies
  – The master scans the file namespace regularly
    • It removes the metadata of hidden files older than 3 days
  – The master scans the chunk namespace regularly
    • It removes the metadata of orphaned chunks
  – Each chunkserver sends the master a list of the chunk handles
    it holds in its regular HeartBeat message
    • The master replies with the chunks not in the namespace
    • The chunkserver is then free to delete those chunks
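The rename-then-reclaim mechanism can be sketched as below; `MasterGc` and the hidden-name format are illustrative assumptions, not GFS's actual naming scheme:

```python
HIDDEN_TTL = 3 * 24 * 3600  # hidden files older than 3 days are reclaimed

class MasterGc:
    """Lazy-deletion sketch: delete = rename to a hidden, timestamped
    name; periodic scans reclaim the metadata later."""
    def __init__(self):
        self.files = {}  # name -> set of chunk handles

    def delete(self, name, now):
        # Log + rename, then reply immediately; nothing is reclaimed yet.
        self.files[f".deleted.{now:.0f}.{name}"] = self.files.pop(name)

    def scan(self, now):
        for name in list(self.files):
            if name.startswith(".deleted."):
                ts = float(name.split(".")[2])
                if now - ts > HIDDEN_TTL:
                    del self.files[name]  # chunks become orphaned

master = MasterGc()
master.files["/foo"] = {"chunk-1", "chunk-2"}
master.delete("/foo", now=0)
master.scan(now=24 * 3600)        # after 1 day: still recoverable
print(len(master.files))           # -> 1
master.scan(now=4 * 24 * 3600)    # after 4 days: metadata reclaimed
print(len(master.files))           # -> 0
```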
Garbage Collection (cont.)
[Figure: a client issues "Delete /foo"; the master updates only
its metadata — the file is renamed to a hidden name, and the
chunks are reclaimed later]