SEMINAR REPORT

on
HADOOP
From www.techalone.com
TABLE OF CONTENTS
INTRODUCTION
Need for large data processing
Challenges in distributed computing: meeting Hadoop
COMPARISON WITH OTHER SYSTEMS
Comparison with RDBMS
ORIGIN OF HADOOP
SUBPROJECTS
Core
Avro
MapReduce
HDFS
Pig
THE HADOOP APPROACH
Data distribution
MapReduce: Isolated Processes
INTRODUCTION TO MAPREDUCE
Programming model
Types
HADOOP MAPREDUCE
Combiner Functions
HADOOP STREAMING
HADOOP PIPES
HADOOP DISTRIBUTED FILESYSTEM (HDFS)
ASSUMPTIONS AND GOALS
Hardware Failure
Streaming Data Access
Large Data Sets
Simple Coherency Model
"Moving Computation is Cheaper than Moving Data"
Portability Across Heterogeneous Hardware and Software Platforms
DESIGN
HDFS Concepts
Blocks
Namenodes and Datanodes
The File System Namespace
Data Replication
Replica Placement
Replica Selection
Safemode
The Persistence of File System Metadata
The Communication Protocols
Robustness
Data Disk Failure, Heartbeats and Re-Replication
Cluster Rebalancing
Data Integrity
Metadata Disk Failure
Snapshots
Data Organization
Data Blocks
Staging
Replication Pipelining
Accessibility
Space Reclamation
File Deletes and Undeletes
Decrease Replication Factor
Hadoop Filesystems
Hadoop Archives
Using Hadoop Archives
ANATOMY OF A MAPREDUCE JOB RUN
Hadoop is now a part of:
INTRODUCTION
Computing, in its purest form, has changed hands multiple times. Near the beginning,
mainframes were predicted to be the future of computing. Indeed, mainframes and
large-scale machines were built and used, and in some circumstances are still used in
much the same way today. The trend, however, turned from bigger and more expensive
machines to smaller and more affordable commodity PCs and servers.
Most of our data is stored on local networks, with servers that may be clustered and share
storage. This approach has had time to develop into a stable architecture, and it provides
decent redundancy when deployed correctly. A newer emerging technology, cloud computing,
has shown up demanding attention and is quickly changing the direction of the technology
landscape. Whether it is Google's unique and scalable Google File System or Amazon's
robust Amazon S3 cloud storage model, it is clear that cloud computing has arrived with
much to be gleaned from.

Cloud computing is a style of computing in which dynamically scalable and often virtualized
resources are provided as a service over the Internet. Users need not have knowledge of,
expertise in, or control over the technology infrastructure in the "cloud" that supports
them.
Need for large data processing
We live in the data age. It's not easy to measure the total volume of data stored
electronically, but an IDC estimate put the size of the "digital universe" at 0.18 zettabytes
in 2006, and is forecasting a tenfold growth by 2011 to 1.8 zettabytes.
Some areas that need large-scale data processing include:
• The New York Stock Exchange generates about one terabyte of new trade data per day.
• Facebook hosts approximately 10 billion photos, taking up one petabyte of storage.
• Ancestry.com, the genealogy site, stores around 2.5 petabytes of data.
• The Internet Archive stores around 2 petabytes of data, and is growing at a rate of 20
terabytes per month.
• The Large Hadron Collider near Geneva, Switzerland, will produce about 15 petabytes of
data per year.
The problem is that while the storage capacities of hard drives have increased massively
over the years, access speeds (the rate at which data can be read from drives) have not
kept up. One typical drive from 1990 could store 1370 MB of data and had a transfer speed
of 4.4 MB/s, so we could read all the data from a full drive in around five minutes. Almost
20 years later, one-terabyte drives are the norm, but the transfer speed is around 100
MB/s, so it takes more than two and a half hours to read all the data off the disk. This is a
long time to read all the data on a single drive, and writing is even slower. The obvious way
to reduce the time is to read from multiple disks at once. Imagine if we had 100 drives, each
holding one hundredth of the data. Working in parallel, we could read the data in under
two minutes. This shows the significance of distributed computing.
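The arithmetic behind these figures can be checked with a short Python sketch. It assumes decimal units (1 TB = 1,000,000 MB) and that the data is spread evenly over the drives; the function name read_seconds is only illustrative.

```python
# Back-of-the-envelope check of the drive figures quoted in the text.
MB_PER_TB = 1_000_000  # assumption: decimal units


def read_seconds(capacity_mb: float, rate_mb_per_s: float, drives: int = 1) -> float:
    """Time to read capacity_mb spread evenly over `drives` disks read in parallel."""
    return capacity_mb / drives / rate_mb_per_s


old_drive = read_seconds(1370, 4.4)            # 1990 drive: 1370 MB at 4.4 MB/s
new_drive = read_seconds(MB_PER_TB, 100)       # 1 TB drive at 100 MB/s
parallel = read_seconds(MB_PER_TB, 100, 100)   # the same terabyte over 100 drives

print(f"1990 drive: {old_drive / 60:.1f} minutes")
print(f"1 TB drive: {new_drive / 3600:.2f} hours")
print(f"100 drives: {parallel:.0f} seconds")
```

Under these assumptions the single terabyte drive needs 10,000 seconds, while 100 drives in parallel need 100 seconds, which matches the "under two minutes" estimate.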
Challenges in distributed computing: meeting Hadoop
Various challenges are faced while developing a distributed application. The first problem
to solve is hardware failure: as soon as we start using many pieces of hardware, the
chance that one will fail is fairly high. A common way of avoiding data loss is through
replication: redundant copies of the data are kept by the system so that in the event of
failure, there is another copy available. This is how RAID works, for instance, although
Hadoop's filesystem, the Hadoop Distributed Filesystem (HDFS), takes a slightly different
approach.
The second problem is that most analysis tasks need to be able to combine the data in
some way; data read from one disk may need to be combined with the data from any of
the other 99 disks. Various distributed systems allow data to be combined from multiple
sources, but doing this correctly is notoriously challenging. MapReduce provides a
programming model that abstracts the problem away from disk reads and writes,
transforming it into a computation over sets of keys and values.
This, in a nutshell, is what Hadoop provides: a reliable shared storage and
analysis system. The storage is provided by HDFS, and analysis by MapReduce.
There are other parts to Hadoop, but these capabilities are its kernel.
Hadoop is the popular open source implementation of MapReduce, a powerful
tool designed for deep analysis and transformation of very large data
sets. Hadoop enables you to explore complex data, using custom analyses tailored to your
information and questions. Hadoop is the system that allows unstructured data to be
distributed across hundreds or thousands of machines forming shared-nothing clusters,
and Map/Reduce routines to be executed on the data in that cluster. Hadoop has
its own filesystem, which replicates data to multiple nodes to ensure that if one node holding
data goes down, there are at least 2 other nodes from which to retrieve that piece of
information. This protects data availability against node failure, something which is
critical when there are many nodes in a cluster (in effect, RAID at the server level).
COMPARISON WITH OTHER SYSTEMS
Comparison with RDBMS
Unless you are dealing with very large volumes of unstructured data (hundreds of GBs,
TBs or PBs) and have large numbers of machines available, you will likely find the
performance of Hadoop running a Map/Reduce query much slower than a comparable SQL
query on a relational database. Hadoop uses a brute-force access method, whereas
RDBMSs have optimization methods for accessing data, such as indexes and read-ahead.
The benefits only really come into play when mass parallelism is achieved, or when the
data is unstructured to the point where no RDBMS optimizations can be applied to help
the performance of queries.
But as with all benchmarks, everything has to be taken into consideration. For example, if
the data starts life in a text file in the file system (e.g. a log file), the cost associated with
extracting that data from the text file, structuring it into a standard schema and
loading it into the RDBMS has to be considered. If you have to do that for 1000 or
10,000 log files, it may take minutes, hours or days (with Hadoop you still have
to copy the files to its file system). In some environments it may also be practically
impossible to load such data into an RDBMS, as data could be generated in such a volume
that a load process into an RDBMS cannot keep up. So while your query time may be
slower with Hadoop (speed improves with more nodes in the cluster), your access time
to the data may potentially be improved.
Also, as there aren't any mainstream RDBMSs that scale to thousands of nodes, at some
point the sheer mass of brute-force processing power will outperform the optimized, but
restricted in scale, relational access method. In our current RDBMS-dependent web
stacks, scalability problems tend to hit the hardest at the database level. For applications
with just a handful of common use cases that access a lot of the same data, distributed
in-memory caches such as memcached provide some relief. However, for interactive
applications that hope to reliably scale and support vast amounts of IO, the traditional
RDBMS setup isn't going to cut it. Unlike small applications that can fit their most active
data into memory, applications that sit on top of massive stores of shared content require
a distributed solution if they hope to survive the long-tail usage pattern commonly found
on content-rich sites.
Databases with lots of disks are also a poor fit for large-scale batch analysis, because seek
time is improving more slowly than transfer rate. Seeking is the process of moving the
disk's head to a particular place on the disk to read or write data. It characterizes the
latency of a disk operation, whereas the transfer rate corresponds to a disk's bandwidth. If
the data access pattern is dominated by seeks, it will take longer to read or write large
portions of the dataset than streaming through it, which operates at the transfer rate. On
the other hand, for updating a small proportion of records in a database, a traditional
B-Tree (the data structure used in relational databases, which is limited by the rate at
which it can perform seeks) works well. For updating the majority of a database, a B-Tree
is less efficient than MapReduce, which uses Sort/Merge to rebuild the database.
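The trade-off between seeking and streaming can be made concrete with a rough cost model in Python. All figures are assumptions chosen for illustration: a 10 ms average seek, the 100 MB/s transfer rate quoted earlier, a 1 TB dataset and 100-byte records. The model charges one seek per updated record and ignores caching and the transfer time of the record itself.

```python
# Rough cost model: in-place updates (seek-bound) versus a full sequential rewrite.
SEEK_SECONDS = 0.010        # assumption: average seek time of 10 ms
TRANSFER_MB_PER_S = 100.0   # transfer rate quoted earlier in the text
DATASET_MB = 1_000_000      # assumption: 1 TB dataset, decimal units
RECORD_BYTES = 100          # assumption: size of one record

TOTAL_RECORDS = DATASET_MB * 1_000_000 // RECORD_BYTES


def seek_update_seconds(fraction: float) -> float:
    """Update a fraction of the records in place, paying one seek per record."""
    return fraction * TOTAL_RECORDS * SEEK_SECONDS


def stream_rewrite_seconds() -> float:
    """Rewrite the whole dataset sequentially: one full read plus one full write."""
    return 2 * DATASET_MB / TRANSFER_MB_PER_S


for fraction in (0.000001, 0.01, 0.5):
    print(fraction, seek_update_seconds(fraction), stream_rewrite_seconds())
```

With these numbers, touching one record in a million is far cheaper by seeking, while updating even 1% of the records already costs more than streaming through and rewriting the entire dataset.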
Another difference between MapReduce and an RDBMS is the amount of structure in the
datasets that they operate on. Structured data is data that is organized into entities that
have a defined format, such as XML documents or database tables that conform to a
particular predefined schema. This is the realm of the RDBMS. Semi-structured data, on
the other hand, is looser, and though there may be a schema, it is often ignored, so it may
be used only as a guide to the structure of the data: for example, a spreadsheet, in which
the structure is the grid of cells, although the cells themselves may hold any form of data.
Unstructured data does not have any particular internal structure: for example, plain text
or image data. MapReduce works well on unstructured or semi-structured data, since it is
designed to interpret the data at processing time. In other words, the input keys and values
for MapReduce are not an intrinsic property of the data, but they are chosen by the person
analyzing the data. Relational data is often normalized to retain its integrity and remove
redundancy. Normalization poses problems for MapReduce, since it makes reading a
record a nonlocal operation, and one of the central assumptions that MapReduce makes is
that it is possible to perform (high-speed) streaming reads and writes.
            Traditional RDBMS            MapReduce
Data size   Gigabytes                    Petabytes
Access      Interactive and batch        Batch
Updates     Read and write many times    Write once, read many times
Structure   Static schema                Dynamic schema
Integrity   High                         Low
Scaling     Non-linear                   Linear
Hadoop, however, is not yet widely adopted. MySQL and other RDBMSs have
stratospherically more market share than Hadoop, but, as with any investment, it is the
future you should be considering. The industry is trending towards distributed systems,
and Hadoop is a major player.
ORIGIN OF HADOOP
Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text
search library. Hadoop has its origins in Apache Nutch, an open source web search engine,
itself a part of the Lucene project. Building a web search engine from scratch was an
ambitious goal, for not only is the software required to crawl and index websites complex
to write, but it is also a challenge to run without a dedicated operations team, since there
are so many moving parts. It's expensive too: Mike Cafarella and Doug Cutting estimated
that a system supporting a 1-billion-page index would cost around half a million dollars in
hardware, with a monthly running cost of $30,000. Nevertheless, they believed it was a
worthy goal, as it would open up and ultimately democratize search engine algorithms.
Nutch was started in 2002, and a working crawler and search system quickly emerged.
However, they realized that their architecture wouldn't scale to the billions of pages on the
Web. Help was at hand with the publication of a paper in 2003 that described the
architecture of Google's distributed filesystem, called GFS, which was being used in
production at Google. GFS, or something like it, would solve their storage needs for the
very large files generated as a part of the web crawl and indexing process. In particular,
GFS would free up time being spent on administrative tasks such as managing storage
nodes. In 2004, they set about writing an open source implementation, the Nutch
Distributed Filesystem (NDFS). In 2004, Google published the paper that introduced
MapReduce to the world. Early in 2005, the Nutch developers had a working MapReduce
implementation in Nutch, and by the middle of that year all the major Nutch algorithms
had been ported to run using MapReduce and NDFS.
NDFS and the MapReduce implementation in Nutch were applicable beyond the realm of
search, and in February 2006 they moved out of Nutch to form an independent subproject
of Lucene called Hadoop. At around the same time, Doug Cutting joined Yahoo!, which
provided a dedicated team and the resources to turn Hadoop into a system that ran at web
scale. This was demonstrated in February 2008 when Yahoo! announced that its production
search index was being generated by a 10,000-core Hadoop cluster. In April 2008, Hadoop
broke a world record to become the fastest system to sort a terabyte of data. Running on a
910-node cluster, Hadoop sorted one terabyte in 209 seconds (just under 3½ minutes),
beating the previous year's winner of 297 seconds. In November of the same year, Google
reported that its MapReduce implementation sorted one terabyte in 68 seconds. In May
2009, it was announced that a team at Yahoo! had used Hadoop to sort one terabyte in 62
seconds.
SUBPROJECTS

Although Hadoop is best known for MapReduce and its distributed filesystem (HDFS,
renamed from NDFS), the other subprojects provide complementary services, or build on
the core to add higher-level abstractions. The various subprojects of Hadoop include:
Core
A set of components and interfaces for distributed filesystems and general
I/O (serialization, Java RPC, persistent data structures).
Avro
A data serialization system for efficient, cross-language RPC, and persistent data storage.
(At the time of this writing, Avro had been created only as a new subproject, and no other
Hadoop subprojects were using it yet.)
MapReduce
A distributed data processing model and execution environment that runs on large
clusters of commodity machines.
HDFS
A distributed filesystem that runs on large clusters of commodity machines.
Pig
A data flow language and execution environment for exploring very large datasets. Pig
runs on HDFS and MapReduce clusters.
HBase
A distributed, column-oriented database. HBase uses HDFS for its underlying storage,
and supports both batch-style computations using MapReduce and point queries (random
reads).
ZooKeeper
A distributed, highly available coordination service. ZooKeeper provides primitives such
as distributed locks that can be used for building distributed applications.
Hive
A distributed data warehouse. Hive manages data stored in HDFS and provides a query
language based on SQL (which is translated by the runtime engine to MapReduce jobs)
for querying the data.
Chukwa
A distributed data collection and analysis system. Chukwa runs collectors that store data
in HDFS, and it uses MapReduce to produce reports. (At the time of this writing, Chukwa
had only recently graduated from a "contrib" module in Core to its own subproject.)
THE HADOOP APPROACH
Hadoop is designed to efficiently process large volumes of information by connecting
many commodity computers together to work in parallel. A theoretical 1000-CPU
machine would cost a very large amount of money, far more than 1,000 single-CPU or
250 quad-core machines. Hadoop will tie these smaller and more reasonably priced
machines together into a single cost-effective compute cluster.
Performing computation on large volumes of data has been done before, usually in a
distributed setting. What makes Hadoop unique is its simplified programming model, which
allows the user to quickly write and test distributed systems, and its efficient, automatic
distribution of data and work across machines, which in turn utilizes the underlying
parallelism of the CPU cores.
Data distribution
In a Hadoop cluster, data is distributed to all the nodes of the cluster as it is being loaded
in. The Hadoop Distributed File System (HDFS) will split large data files into chunks which
are managed by different nodes in the cluster. In addition to this, each chunk is replicated
across several machines, so that a single machine failure does not result in any data being
unavailable. An active monitoring system then re-replicates the data in response to system
failures which can result in partial storage. Even though the file chunks are replicated and
distributed across several machines, they form a single namespace, so their contents are
universally accessible.
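A toy Python model can illustrate why replicated chunks survive a single machine failure. It is only a sketch: the round-robin placement, the 64 MB chunk size and the node names are assumptions made for the example and do not describe HDFS's actual placement policy.

```python
# Toy model of splitting a file into chunks and replicating each chunk on several nodes.
def place_chunks(file_size_mb: int, chunk_mb: int, nodes: list, replicas: int = 3) -> dict:
    """Return {chunk_index: [nodes holding a copy]} using simple round-robin placement."""
    n_chunks = -(-file_size_mb // chunk_mb)  # ceiling division
    return {
        chunk: [nodes[(chunk + r) % len(nodes)] for r in range(replicas)]
        for chunk in range(n_chunks)
    }


def survives_failure(placement: dict, failed_node: str) -> bool:
    """True if every chunk still has at least one copy on a node other than failed_node."""
    return all(any(node != failed_node for node in holders)
               for holders in placement.values())


placement = place_chunks(200, 64, ["n1", "n2", "n3", "n4"])
print(placement)
print(survives_failure(placement, "n2"))
```

With three copies of every chunk, losing any one node leaves all four chunks of the 200 MB file readable; with a single copy per chunk the same failure would make part of the file unavailable.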
Data is conceptually record-oriented in the Hadoop programming framework. Individual
input files are broken into lines or into other formats specific to the application logic. Each
process running on a node in the cluster then processes a subset of these records. The
Hadoop framework then schedules these processes in proximity to the location of the
data/records, using knowledge from the distributed file system. Since files are spread
across the distributed file system as chunks, each compute process running on a node
operates on a subset of the data. Which data is operated on by a node is chosen based on
its locality to the node: most data is read from the local disk straight into the CPU,
alleviating strain on network bandwidth and preventing unnecessary network transfers.
This strategy of moving computation to the data, instead of moving the data to the
computation, allows Hadoop to achieve high data locality, which in turn results in high
performance.
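The idea of moving computation to the data can be sketched as a scheduling preference. The function below is hypothetical and much simpler than a real scheduler: it assumes the set of nodes holding a chunk is known, and it picks an idle node that already stores the chunk when one exists.

```python
# Minimal sketch of locality-aware task placement (not Hadoop's actual scheduler).
def pick_node(replica_holders: set, idle_nodes: list) -> tuple:
    """Return (node, is_local): prefer an idle node that already stores the chunk."""
    for node in idle_nodes:
        if node in replica_holders:
            return node, True      # data-local: read from the local disk
    return idle_nodes[0], False    # fallback: the chunk must cross the network


print(pick_node({"n1", "n3"}, ["n2", "n3"]))
print(pick_node({"n1"}, ["n2"]))
```

In the first call an idle node holding a replica exists, so the task runs where the data already is; in the second call no such node is idle and the data has to be transferred.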
MapReduce: Isolated Processes
Hadoop limits the amount of communication which can be performed by the processes,
as each individual record is processed by a task in isolation from the others. While this
sounds like a major limitation at first, it makes the whole framework much more reliable.
Hadoop will not run just any program and distribute it across a cluster. Programs must be
written to conform to a particular programming model, named "MapReduce."
In MapReduce, records are processed in isolation by tasks called Mappers. The output
from the Mappers is then brought together into a second set of tasks called Reducers,
where results from different mappers can be merged together.
Separate nodes in a Hadoop cluster still communicate with one another. However, in
contrast to more conventional distributed systems where application developers explicitly
marshal byte streams from node to node over sockets or through MPI buffers,
communication in Hadoop is performed implicitly. Pieces of data can be tagged with key
names which inform Hadoop how to send related bits of information to a common
destination node. Hadoop internally manages all of the data transfer and cluster topology
issues.
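How a key name can determine the destination is easiest to see with a hash partitioner. The Python sketch below is an illustration under assumptions: it uses a CRC32 hash and in-memory buckets to stand in for destination nodes, and the function names are hypothetical.

```python
# Sketch of key-based routing: every pair with the same key lands in the same bucket.
import zlib


def partition(key: str, num_reducers: int) -> int:
    """Map a key to a reducer index with a hash that is stable across runs."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(pairs, num_reducers: int) -> list:
    """Group (key, value) pairs into one dictionary per destination reducer."""
    buckets = [dict() for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[partition(key, num_reducers)].setdefault(key, []).append(value)
    return buckets


pairs = [("apple", 1), ("pear", 2), ("apple", 3)]
buckets = shuffle(pairs, 4)
print(buckets[partition("apple", 4)]["apple"])
```

Because the destination depends only on the key, user code never names a node or opens a connection; it only chooses the keys.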
By restricting the communication between nodes, Hadoop makes the distributed system
much more reliable. Individual node failures can be worked around by restarting tasks on
other machines. Since user-level tasks do not communicate explicitly with one another, no
messages need to be exchanged by user programs, nor do nodes need to roll back to
pre-arranged checkpoints to partially restart the computation. The other workers continue
to operate as though nothing went wrong, leaving the challenging aspects of partially
restarting the program to the underlying Hadoop layer.
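Restarting an isolated task elsewhere can be sketched in a few lines of Python. The workers here are plain functions standing in for machines, and a RuntimeError stands in for a node failure; both are assumptions made for the example.

```python
# Sketch of fault tolerance by re-execution: a failed task is simply run on another worker.
def run_task(task_input, workers):
    """Try the task on each (name, worker) in turn and return the first success."""
    for name, worker in workers:
        try:
            return name, worker(task_input)
        except RuntimeError:
            continue  # node failed: reschedule elsewhere, other tasks are unaffected
    raise RuntimeError("task failed on every worker")


def healthy_worker(text):
    return text.upper()


def broken_worker(text):
    raise RuntimeError("node down")


print(run_task("abc", [("n1", broken_worker), ("n2", healthy_worker)]))
```

Because the task depends only on its own input, rerunning it needs no checkpoint and no coordination with the tasks that succeeded.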
INTRODUCTION TO MAPREDUCE
MapReduce is a programming model and an associated implementation for processing
and generating large data sets. Users specify a map function that processes a key/value
pair to generate a set of intermediate key/value pairs, and a reduce function that merges
all intermediate values associated with the same intermediate key. Many real-world tasks
are expressible in this model.
This abstraction is inspired by the map and reduce primitives present in Lisp and many
other functional languages. Its designers at Google realized that most of their
computations involved applying a map operation to each logical "record" in the input in
order to compute a set of intermediate key/value pairs, and then applying a reduce
operation to all the values that shared the same key, in order to combine the derived data
appropriately. The use of a functional model with user-specified map and reduce
operations makes it possible to parallelize large computations easily and to use
re-execution as the primary mechanism for fault tolerance.
Programming model
The computation takes a set of input key/value pairs, and produces a set of output
key/value pairs. The user of the MapReduce library expresses the computation as two
functions: Map and Reduce. Map, written by the user, takes an input pair and produces a
set of intermediate key/value pairs. The MapReduce library groups together all
intermediate values associated with the same intermediate key I and passes them to the
Reduce function. The Reduce function, also written by the user, accepts an intermediate
key I and a set of values for that key. It merges together these values to form a possibly
smaller set of values. Typically just zero or one output value is produced per Reduce
invocation. The intermediate values are supplied to the user's reduce function via an
iterator. This makes it possible to handle lists of values that are too large to fit in memory.
MAP
map (in_key, in_value) -> (out_key, intermediate_value) list
Example: Upper-case Mapper
let map(k, v) = emit(k.toUpper(), v.toUpper())
("foo", "bar") --> ("FOO", "BAR")
("Foo", "other") --> ("FOO", "OTHER")
("key2", "data") --> ("KEY2", "DATA")
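The upper-case mapper can be written as a small Python generator. This is a sketch of the idea rather than Hadoop's API; the name upper_mapper is only illustrative.

```python
# Upper-case mapper: emits exactly one (out_key, intermediate_value) pair per input pair.
def upper_mapper(key: str, value: str):
    yield key.upper(), value.upper()


for pair in [("foo", "bar"), ("Foo", "other"), ("key2", "data")]:
    print(pair, "-->", list(upper_mapper(*pair)))
```

Writing the mapper as a generator mirrors the "list" in the signature: a mapper may emit zero, one or many pairs per input.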
REDUCE
reduce (out_key, intermediate_value list) -> out_value list
Example: Sum Reducer
let reduce(k, vals):
    sum = 0
    foreach int v in vals:
        sum += v
    emit(k, sum)
("A", [42, 100, 312]) --> ("A", 454)
("B", [12, 6, -2]) --> ("B", 16)
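The sum reducer translates directly to Python. As with the mapper, this is an illustrative sketch and not Hadoop's API; sum_reducer is a hypothetical name.

```python
# Sum reducer: receives one key with all of its values and emits a single total.
def sum_reducer(key, values):
    total = 0
    for v in values:   # values may be any iterable, including a lazy iterator
        total += v
    yield key, total


print(list(sum_reducer("A", [42, 100, 312])))
print(list(sum_reducer("B", [12, 6, -2])))
```

The reducer only iterates over its values once, so it works equally well when the framework hands it an iterator instead of a full list.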
Example 2:
Counting the number of occurrences of each word in a large collection of documents. The
user would write code similar to the following pseudo-code:

map(String key, String value):
    // key: document name
    // value: document contents
    for each word w in value:
        EmitIntermediate(w, "1");

reduce(String key, Iterator values):
    // key: a word
    // values: a list of counts
    int result = 0;
    for each v in values:
        result += ParseInt(v);
    Emit(AsString(result));
The map function emits each word plus an associated count of occurrences (just '1' in this
simple example). The reduce function sums together all counts emitted for a particular
word.
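The word count example can be run end to end with a single-process Python sketch. The driver run_mapreduce is a hypothetical stand-in for the framework: it groups intermediate values by key in memory, which a real cluster would do across machines. Unlike the pseudo-code, the counts are kept as integers rather than strings.

```python
# Single-process word count: map, group by key, then reduce.
from collections import defaultdict


def map_fn(doc_name, contents):
    for word in contents.split():
        yield word, 1


def reduce_fn(word, counts):
    yield word, sum(counts)


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Hypothetical in-memory driver standing in for the MapReduce library."""
    groups = defaultdict(list)
    for key, value in inputs:
        for k2, v2 in map_fn(key, value):
            groups[k2].append(v2)              # group intermediate values by key
    output = []
    for k2 in sorted(groups):                  # reducers see keys in sorted order
        output.extend(reduce_fn(k2, iter(groups[k2])))
    return output


docs = [("a.txt", "the quick fox"), ("b.txt", "the lazy dog the end")]
counts = dict(run_mapreduce(docs, map_fn, reduce_fn))
print(counts)
```

The user-written parts are only map_fn and reduce_fn; everything else belongs to the driver, which is the division of labour the programming model is built around.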
In addition, the user writes code to fill in a mapreduce specification object with the names
of the input and output files, and optional tuning parameters. The user then invokes the
MapReduce function, passing it the specification object. In Google's implementation, the
user's code is linked together with the MapReduce library (implemented in C++).
Programs written in this functional style are automatically parallelized and executed on a
large cluster of commodity machines. The run-time system takes care of the details of
partitioning the input data, scheduling the program's execution across a set of machines,
handling machine failures, and managing the required inter-machine communication. This
allows programmers without any experience with parallel and distributed systems to easily
utilize the resources of a large distributed system.
The issues of how to parallelize the computation, distribute the data, and handle failures
conspire to obscure the original simple computation with large amounts of complex code
to deal with these issues. As a reaction to this complexity, Google designed a new
abstraction that allows the simple computations to be expressed directly, while the messy
details of parallelization, fault tolerance, data distribution and load balancing are hidden
in a library.
Types
Even though the previous pseudo-code is written in terms of string inputs and outputs,
conceptually the map and reduce functions supplied by the user have associated
types:
map (k1,v1) -> list(k2,v2)
reduce (k2,list(v2)) -> list(v2)
I.e., the input keys and values are drawn from a different domain than the output keys and
values. Furthermore, the intermediate keys and values are from the same domain as the
output keys and values. Google's C++ implementation passes strings to and from the
user-defined functions and leaves it to the user code to convert between strings and
appropriate types.
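These signatures can be written down with Python type hints. The aliases Mapper and Reducer and the two small functions are hypothetical; they only illustrate that the input domain (k1, v1) may differ from the intermediate and output domain (k2, v2).

```python
# The map/reduce type signatures expressed as Python type hints.
from typing import Callable, Iterable, Iterator, Tuple, TypeVar

K1 = TypeVar("K1")
V1 = TypeVar("V1")
K2 = TypeVar("K2")
V2 = TypeVar("V2")

Mapper = Callable[[K1, V1], Iterable[Tuple[K2, V2]]]    # map (k1,v1) -> list(k2,v2)
Reducer = Callable[[K2, Iterator[V2]], Iterable[V2]]    # reduce (k2,list(v2)) -> list(v2)


def length_mapper(offset: int, line: str) -> Iterable[Tuple[str, int]]:
    """Input domain (int, str); intermediate domain (str, int)."""
    yield "chars", len(line)


def max_reducer(key: str, values: Iterator[int]) -> Iterable[int]:
    """Output values come from the same domain as the intermediate values."""
    yield max(values)


mapper: Mapper = length_mapper
reducer: Reducer = max_reducer
print(list(mapper(0, "hello")), list(reducer("chars", iter([5, 2, 9]))))
```

Here the input key is a line offset and the input value a line of text, while the intermediate pairs use a string key and an integer value.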
Inverted Index: The map function parses each document, and emits a sequence of
<word, document ID> pairs. The reduce function accepts all pairs for a given word, sorts the
corresponding document IDs and emits a <word, list(document ID)> pair. The set of all
output pairs forms a simple inverted index. It is easy to augment this computation to keep
track of word positions.
Distributed Sort: The map function extracts the key from each record, and emits a
<key, record> pair. The reduce function emits all pairs unchanged. The sorting itself is done
by the framework, which orders the intermediate pairs by key before they reach the reduce
function.
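The inverted index can be sketched in Python along the same lines. The in-memory grouping in build_index is a stand-in for the framework, and emitting each word once per document is a choice made for this example that the text does not specify.

```python
# Inverted index: map emits (word, document ID), reduce sorts the IDs for each word.
from collections import defaultdict


def index_map(doc_id, text):
    for word in set(text.split()):   # assumption: one pair per distinct word per document
        yield word, doc_id


def index_reduce(word, doc_ids):
    yield word, sorted(doc_ids)


def build_index(docs):
    """Hypothetical in-memory driver: group by word, then reduce each group."""
    groups = defaultdict(list)
    for doc_id, text in docs:
        for word, d in index_map(doc_id, text):
            groups[word].append(d)
    return dict(pair for word in groups for pair in index_reduce(word, groups[word]))


index = build_index([(2, "hadoop stores data"), (1, "hadoop runs jobs")])
print(index["hadoop"], index["data"])
```

Looking up a word in the resulting dictionary returns the sorted list of documents that contain it.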
HADOOP MAPREDUCE
Hadoop Map-Reduce is a software framework for easily writing applications which process
vast amounts of data (multi-terabyte data-sets) in parallel on large clusters (thousands of
nodes) of commodity hardware in a reliable, fault-tolerant manner.
A Map-Reduce job usually splits the input data-set into independent chunks which are
processed by the map tasks in a completely parallel manner. The framework sorts the
outputs of the maps, which are then input to the reduce tasks. Typically both the input and
the output of the job are stored in a file-system. The framework takes care of scheduling
tasks, monitoring them and re-executing the failed tasks.
Typically the compute nodes and the storage nodes are the same, that is, the Map-Reduce
framework and the Distributed FileSystem are running on the same set of nodes.
This configuration allows the framework to effectively schedule tasks on the nodes where
data is already present, resulting in very high aggregate bandwidth across the cluster.
A MapReduce job is a unit of work that the client wants to be performed: it consists of the input data, the MapReduce program, and configuration information. Hadoop runs the job by dividing it into tasks, of which there are two types: map tasks and reduce tasks. There are two types of nodes that control the job execution process: a jobtracker and a number of tasktrackers. The jobtracker coordinates all the jobs run on the system by scheduling tasks to run on tasktrackers. Tasktrackers run tasks and send progress reports to the jobtracker, which keeps a record of the overall progress of each job. If a task fails, the jobtracker can reschedule it on a different tasktracker. Hadoop divides the input to a MapReduce job into fixed-size pieces called input splits, or just splits. Hadoop creates one map task for each split, which runs the user-defined map function for each record in the split.
Having many splits means the time taken to process each split is small compared to the time to process the whole input. So if we are processing the splits in parallel, the processing is better load-balanced if the splits are small, since a faster machine will be able to process proportionally more splits over the course of the job than a slower machine. Even if the machines are identical, failed processes or other jobs running concurrently make load balancing desirable, and the quality of the load balancing increases as the splits become more fine-grained. On the other hand, if splits are too small, then the overhead of managing the splits and of map task creation begins to dominate the total job execution time. For most jobs, a good split size tends to be the size of an HDFS block, 64 MB by default, although this can be changed for the cluster (for all newly created files) or specified when each file is created.
Hadoop does its best to run the map task on a node where the input data resides in HDFS. This is called the data locality optimization. It should now be clear why the optimal split size is the same as the block size: it is the largest size of input that can be guaranteed to be stored on a single node. If the split spanned two blocks, it would be unlikely that any HDFS node stored both blocks, so some of the split would have to be transferred across the network to the node running the map task, which is clearly less efficient than running the whole map task using local data.
Map tasks write their output to local disk, not to HDFS. Map output is intermediate output: it is processed by reduce tasks to produce the final output, and once the job is complete the map output can be thrown away. So storing it in HDFS, with replication, would be overkill. If the node running the map task fails before the map output has been consumed by the reduce task, then Hadoop will automatically rerun the map task on another node to recreate the map output.
Reduce tasks don't have the advantage of data locality: the input to a single reduce task is normally the output from all mappers. In the figure below, a single reduce task is fed by all of the map tasks. Therefore the sorted map outputs have to be transferred across the network to the node where the reduce task is running, where they are merged and then passed to the user-defined reduce function. The output of the reduce is normally stored in HDFS for reliability. For each HDFS block of the reduce output, the first replica is stored on the local node, with other replicas being stored on off-rack nodes. Thus, writing the reduce output does consume network bandwidth, but only as much as a normal HDFS write pipeline consumes.
The dotted boxes in the figure below indicate nodes, the light arrows show data transfers on a node, and the heavy arrows show data transfers between nodes. The number of reduce tasks is not governed by the size of the input, but is specified independently.
Figure: MapReduce data flow with a single reduce task
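As a rough illustration of the split-size reasoning in this section, the number of map tasks for a given input can be estimated as below. This is a simplified sketch: it assumes one split per 64 MB block and treats the input as a single file, and the function name is invented for the example. Hadoop itself computes splits per input file.

```python
import math

BLOCK_SIZE_MB = 64  # default HDFS block size quoted in the text

def num_map_tasks(input_size_mb, split_size_mb=BLOCK_SIZE_MB):
    """One map task per split; the last split may be smaller than the others."""
    return math.ceil(input_size_mb / split_size_mb)

tasks_for_1gb = num_map_tasks(1024)    # 1 GB of input
tasks_for_100mb = num_map_tasks(100)   # 100 MB of input spans two blocks
```

Shrinking split_size_mb raises the task count, which improves load balancing up to the point where per-task overhead starts to dominate.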
When there are multiple reducers, the map tasks partition their output, each creating one partition for each reduce task. There can be many keys (and their associated values) in each partition, but the records for every key are all in a single partition. The partitioning can be controlled by a user-defined partitioning function, but normally the default partitioner, which buckets keys using a hash function, works very well. This diagram makes it clear why the data flow between map and reduce tasks is colloquially known as "the shuffle," as each reduce task is fed by many map tasks. The shuffle is more complicated than this diagram suggests, and tuning it can have a big impact on job execution time. Finally, it is also possible to have zero reduce tasks. This can be appropriate when you don't need the shuffle, since the processing can be carried out entirely in parallel.
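The hash partitioning described above can be sketched as follows. CRC32 is used here only as a convenient deterministic stand-in for the hash function, and the function names and sample records are assumptions for illustration rather than Hadoop's own partitioner.

```python
import zlib

def partition(key, num_reducers):
    """Map a key to a reducer index in the range [0, num_reducers)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers

def partition_map_output(pairs, num_reducers):
    """Split one map task's output into one bucket per reduce task."""
    buckets = [[] for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[partition(key, num_reducers)].append((key, value))
    return buckets

# Made-up map output; both records for key "a" must land in the same bucket.
buckets = partition_map_output([("a", 1), ("b", 1), ("a", 2), ("c", 1)], 3)
```

Because the bucket depends only on the key, every map task sends a given key to the same reduce task, which is what lets the reducer see all values for that key.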
Figure: MapReduce data flow with multiple reduce tasks
Figure: MapReduce data flow with no reduce tasks
Combiner Functions
Many MapReduce jobs are limited by the bandwidth available on the cluster, so it pays to minimize the data transferred between map and reduce tasks. Hadoop allows the user to specify a combiner function to be run on the map output; the combiner function's output forms the input to the reduce function. Since the combiner function is an optimization, Hadoop does not provide a guarantee of how many times it will call it for a particular map output record, if at all. In other words, calling the combiner function zero, one, or many times should produce the same output from the reducer.
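The requirement that a combiner may run zero, one, or many times without changing the result can be checked with a small sketch. The example assumes a job that finds the maximum value per key; the function names and the sample map outputs are invented for illustration.

```python
def combine_max(pairs):
    """Combiner: keep only the largest value per key within one map's output."""
    best = {}
    for key, value in pairs:
        best[key] = max(value, best.get(key, value))
    return list(best.items())

def reduce_max(all_pairs):
    """Reducer: largest value per key across all map outputs."""
    best = {}
    for key, value in all_pairs:
        best[key] = max(value, best.get(key, value))
    return best

# Hypothetical outputs of two map tasks.
map1 = [("1950", 0), ("1950", 20), ("1950", 10)]
map2 = [("1950", 25), ("1950", 15)]

without_combiner = reduce_max(map1 + map2)
with_combiner = reduce_max(combine_max(map1) + combine_max(map2))
combined_twice = reduce_max(combine_max(combine_max(map1)) + map2)
```

Taking a maximum works as a combiner because it is associative and commutative. A function such as a mean would not give the same answer if applied this way.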
HADOOP STREAMING
Hadoop provides an API to MapReduce that allows you to write your map and reduce functions in languages other than Java. Hadoop Streaming uses Unix standard streams as the interface between Hadoop and your program, so you can use any language that can read standard input and write to standard output to write your MapReduce program.
Streaming is naturally suited for text processing (although as of version 0.21.0 it can handle binary streams, too), and when used in text mode, it has a line-oriented view of data. Map input data is passed over standard input to your map function, which processes it line by line and writes lines to standard output. A map output key-value pair is written as a single tab-delimited line. Input to the reduce function is in the same format, a tab-separated key-value pair, passed over standard input. The reduce function reads lines from standard input, which the framework guarantees are sorted by key, and writes its results to standard output.
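The line-oriented contract described above can be illustrated with a word count written as two plain functions. This is a sketch under stated assumptions: in a real Streaming job each function would read sys.stdin and print its output, while the run_locally helper here is a hypothetical stand-in for the framework's sort step so the example can run on its own.

```python
from itertools import groupby

def mapper(lines):
    """Emit one tab-delimited 'word<TAB>1' line per word of input."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"

def reducer(sorted_lines):
    """Sum the counts for each key; relies on the input being sorted by key."""
    parsed = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for key, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(value) for _, value in group)}"

def run_locally(lines):
    """Hypothetical local driver: map, sort by key, then reduce."""
    return list(reducer(sorted(mapper(lines))))

result = run_locally(["to be or", "not to be"])
```

The reducer never needs to hold all keys in memory: because the input arrives sorted, it only has to notice when the key changes.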
HADOOP PIPES
Hadoop Pipes is the name of the C++ interface to Hadoop MapReduce. Unlike Streaming, which uses standard input and output to communicate with the map and reduce code, Pipes uses sockets as the channel over which the tasktracker communicates with the process running the C++ map or reduce function. JNI is not used.
HADOOP DISTRIBUTED FILESYSTEM (HDFS)
Filesystems that manage the storage across a network of machines are called distributed filesystems. Since they are network-based, all the complications of network programming kick in, making distributed filesystems more complex than regular disk filesystems. For example, one of the biggest challenges is making the filesystem tolerate node failure without suffering data loss. Hadoop comes with a distributed filesystem called HDFS, which stands for Hadoop Distributed Filesystem.
HDFS, the Hadoop Distributed File System, is a distributed file system designed to hold very large amounts of data (terabytes or even petabytes), and provide high-throughput access to this information. Files are stored in a redundant fashion across multiple machines to ensure their durability to failure and high availability to very parallel applications.
ASSUMPTIONS AND GOALS
Hardware Failure
Hardware failure is the norm rather than the exception. An HDFS instance may consist of hundreds or thousands of server machines, each storing part of the file system's data. The fact that there are a huge number of components and that each component has a non-trivial probability of failure means that some component of HDFS is always non-functional. Therefore, detection of faults and quick, automatic recovery from them is a core architectural goal of HDFS.
Streaming Data Access
Applications that run on HDFS need streaming access to their data sets. They are not general-purpose applications that typically run on general-purpose file systems. HDFS is designed more for batch processing than for interactive use by users. The emphasis is on high throughput of data access rather than low latency of data access. POSIX imposes many hard requirements that are not needed for applications that are targeted for HDFS. POSIX semantics in a few key areas have been traded to increase data throughput rates.
Large Data Sets
Applications that run on HDFS have large data sets. A typical file in HDFS is gigabytes to terabytes in size. Thus, HDFS is tuned to support large files. It should provide high aggregate data bandwidth and scale to hundreds of nodes in a single cluster. It should support tens of millions of files in a single instance.
Simple Coherency Model
HDFS applications need a write-once-read-many access model for files. A file once created, written, and closed need not be changed. This assumption simplifies data coherency issues and enables high-throughput data access. A Map/Reduce application or a web crawler application fits perfectly with this model. There is a plan to support appending writes to files in the future.
"Moving Computation is Cheaper than Moving Data"
A computation requested by an application is much more efficient if it is executed near the data it operates on. This is especially true when the size of the data set is huge. This minimizes network congestion and increases the overall throughput of the system. The assumption is that it is often better to migrate the computation closer to where the data is located rather than moving the data to where the application is running. HDFS provides interfaces for applications to move themselves closer to where the data is located.
Portability Across Heterogeneous Hardware and Software Platforms
HDFS has been designed to be easily portable from one platform to another. This facilitates widespread adoption of HDFS as a platform of choice for a large set of applications.
DESIGN
HDFS is a filesystem designed for storing very large files with streaming data access patterns, running on clusters of commodity hardware. Let's examine this statement in more detail:
Very large files
"Very large" in this context means files that are hundreds of megabytes, gigabytes, or terabytes in size. There are Hadoop clusters running today that store petabytes of data.
Streaming data access
HDFS is built around the idea that the most efficient data processing pattern is a write-once, read-many-times pattern. A dataset is typically generated or copied from source, then various analyses are performed on that dataset over time. Each analysis will involve a large proportion, if not all, of the dataset, so the time to read the whole dataset is more important than the latency in reading the first record.
Commodity hardware
Hadoop doesn't require expensive, highly reliable hardware to run on. It is designed to run on clusters of commodity hardware (commonly available hardware from multiple vendors) for which the chance of node failure across the cluster is high, at least for large clusters. HDFS is designed to carry on working without a noticeable interruption to the user in the face of such failure.
It is also worth examining the applications for which using HDFS does not work so well. While this may change in the future, these are areas where HDFS is not a good fit today:
Low-latency data access
Applications that require low-latency access to data, in the tens of milliseconds range, will not work well with HDFS. Remember that HDFS is optimized for delivering a high throughput of data, and this may be at the expense of latency. HBase is currently a better choice for low-latency access.
Lots of small files
Since the namenode holds filesystem metadata in memory, the limit to the number of files in a filesystem is governed by the amount of memory on the namenode. As a rule of thumb, each file, directory, and block takes about 150 bytes. So, for example, if you had one million files, each taking one block, you would need at least 300 MB of memory. While storing millions of files is feasible, billions is beyond the capability of current hardware.
Multiple writers, arbitrary file modifications
Files in HDFS may be written to by a single writer. Writes are always made at the end of the file. There is no support for multiple writers, or for modifications at arbitrary offsets in the file. (These might be supported in the future, but they are likely to be relatively inefficient.)
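The rule of thumb given under "Lots of small files" can be turned into a quick estimate. This is a back-of-the-envelope sketch: the 150-byte figure comes from the text, while the function name and parameters are assumptions for illustration.

```python
BYTES_PER_OBJECT = 150  # rule-of-thumb figure from the text

def namenode_memory_bytes(num_files, blocks_per_file=1, num_directories=0):
    """Estimate namenode memory used by file, block, and directory objects."""
    objects = num_files + num_files * blocks_per_file + num_directories
    return objects * BYTES_PER_OBJECT

one_million_files = namenode_memory_bytes(1_000_000)       # 300,000,000 bytes
one_billion_files = namenode_memory_bytes(1_000_000_000)   # 300,000,000,000 bytes
```

One million single-block files give two million objects, or about 300 MB, matching the figure in the text. At a billion files the same estimate reaches roughly 300 GB, which is why that scale is described as out of reach.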
HDFS Concepts
Blocks
A disk has a block size, which is the minimum amount of data that it can read or write. Filesystems for a single disk build on this by dealing with data in blocks, which are an integral multiple of the disk block size. Filesystem blocks are typically a few kilobytes in size, while disk blocks are normally 512 bytes. This is generally transparent to the filesystem user who is simply reading or writing a file of whatever length. However, there are tools for filesystem maintenance, such as df and fsck, that operate on the filesystem block level.
HDFS too has the concept of a block, but it is a much larger unit: 64 MB by default. As in a filesystem for a single disk, files in HDFS are broken into block-sized chunks, which are stored as independent units. Unlike a filesystem for a single disk, a file in HDFS that is smaller than a single block does not occupy a full block's worth of underlying storage. When unqualified, the term "block" in this report refers to a block in HDFS.
HDFS blocks are large compared to disk blocks, and the reason is to minimize the cost of seeks. By making a block large enough, the time to transfer the data from the disk can be made significantly larger than the time to seek to the start of the block. Thus the time to transfer a large file made of multiple blocks operates at the disk transfer rate. A quick calculation shows that if the seek time is around 10 ms and the transfer rate is 100 MB/s, then to make the seek time 1% of the transfer time, we need to make the block size around 100 MB. The default is actually 64 MB, although many HDFS installations use 128 MB blocks. This figure will continue to be revised upward as transfer speeds grow with new generations of disk drives. This argument shouldn't be taken too far, however. Map tasks in MapReduce normally operate on one block at a time, so if you have too few tasks (fewer than nodes in the cluster), your jobs will run slower than they could otherwise.
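The quick calculation above can be written out explicitly. The sketch below uses the figures quoted in the text (10 ms seek, 100 MB/s transfer, seek time equal to 1% of transfer time); the function and parameter names are assumptions for illustration.

```python
def block_size_mb(seek_time_s, transfer_rate_mb_s, seek_fraction):
    """Block size at which seek time is `seek_fraction` of the transfer time."""
    transfer_time_s = seek_time_s / seek_fraction   # 0.010 s / 0.01 = 1 s
    return transfer_rate_mb_s * transfer_time_s     # 100 MB/s * 1 s = 100 MB

size = block_size_mb(seek_time_s=0.010, transfer_rate_mb_s=100, seek_fraction=0.01)
```

Doubling the transfer rate doubles the block size needed to keep seeks at the same 1% share, which is the reason given for revising the figure upward over time.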
Having a block abstraction for a distributed filesystem brings several benefits. The first benefit is the most obvious: a file can be larger than any single disk in the network. Nothing requires the blocks from a file to be stored on the same disk, so they can take advantage of any of the disks in the cluster. In fact, it would be possible, if unusual, to store a single file on an HDFS cluster whose blocks filled all the disks in the cluster.
Second, making the unit of abstraction a block rather than a file simplifies the storage subsystem. Simplicity is something to strive for in all systems, but it is especially important for a distributed system in which the failure modes are so varied. The storage subsystem deals with blocks, simplifying storage management (since blocks are a fixed size, it is easy to calculate how many can be stored on a given disk) and eliminating metadata concerns (blocks are just a chunk of data to be stored; file metadata such as permissions information does not need to be stored with the blocks, so another system can handle metadata orthogonally).
Furthermore, blocks fit well with replication for providing fault tolerance and availability. To insure against corrupted blocks and disk and machine failure, each block is replicated to a small number of physically separate machines (typically three). If a block becomes unavailable, a copy can be read from another location in a way that is transparent to the client. A block that is no longer available due to corruption or machine failure can be replicated from its alternative locations to other live machines to bring the replication factor back to the normal level. (See the "Data Integrity" section for more on guarding against corrupt data.) Similarly, some applications may choose to set a high replication factor for the blocks in a popular file to spread the read load on the cluster.
Like its disk filesystem cousin, HDFS's fsck command understands blocks. For example, running:
% hadoop fsck / -files -blocks
will list the blocks that make up each file in the filesystem.
Namenodes and Datanodes
An HDFS cluster has two types of node operating in a master-worker pattern: a namenode (the master) and a number of datanodes (workers). The namenode manages the filesystem namespace. It maintains the filesystem tree and the metadata for all the files and directories in the tree. This information is stored persistently on the local disk in the form of two files: the namespace image and the edit log. The namenode also knows the datanodes on which all the blocks for a given file are located. However, it does not store block locations persistently, since this information is reconstructed from datanodes when the system starts. A client accesses the filesystem on behalf of the user by communicating with the namenode and datanodes.
The client presents a POSIX-like filesystem interface, so the user code does not need to know about the namenode and datanodes to function. Datanodes are the workhorses of the filesystem. They store and retrieve blocks when they are told to (by clients or the namenode), and they report back to the namenode periodically with lists of blocks that they are storing. Without the namenode, the filesystem cannot be used. In fact, if the machine running the namenode were obliterated, all the files on the filesystem would be lost since there would be no way of knowing how to reconstruct the files from the blocks on the datanodes. For this reason, it is important to make the namenode resilient to failure, and Hadoop provides two mechanisms for this.
The first way is to back up the files that make up the persistent state of the filesystem metadata. Hadoop can be configured so that the namenode writes its persistent state to multiple filesystems. These writes are synchronous and atomic. The usual configuration choice is to write to local disk as well as a remote NFS mount.
It is also possible to run a secondary namenode, which despite its name does not act as a namenode. Its main role is to periodically merge the namespace image with the edit log to prevent the edit log from becoming too large. The secondary namenode usually runs on a separate physical machine, since it requires plenty of CPU and as much memory as the namenode to perform the merge. It keeps a copy of the merged namespace image, which can be used in the event of the namenode failing. However, the state of the secondary namenode lags that of the primary, so in the event of total failure of the primary, data loss is almost guaranteed. The usual course of action in this case is to copy the namenode's metadata files that are on NFS to the secondary and run it as the new primary.
The File System Namespace
HDFS supports a traditional hierarchical file organization. A user or an application can create directories and store files inside these directories. The file system namespace hierarchy is similar to most other existing file systems; one can create and remove files, move a file from one directory to another, or rename a file. HDFS does not yet implement user quotas or access permissions. HDFS does not support hard links or soft links. However, the HDFS architecture does not preclude implementing these features.
The NameNode maintains the file system namespace. Any change to the file system namespace or its properties is recorded by the NameNode. An application can specify the number of replicas of a file that should be maintained by HDFS. The number of copies of a file is called the replication factor of that file. This information is stored by the NameNode.
Data Replication
HDFS is designed to reliably store very large files across machines in a large cluster. It stores each file as a sequence of blocks; all blocks in a file except the last block are the same size. The blocks of a file are replicated for fault tolerance. The block size and replication factor are configurable per file. An application can specify the number of replicas of a file. The replication factor can be specified at file creation time and can be changed later. Files in HDFS are write-once and have strictly one writer at any time.
The NameNode makes all decisions regarding replication of blocks. It periodically receives a Heartbeat and a Blockreport from each of the DataNodes in the cluster. Receipt of a Heartbeat implies that the DataNode is functioning properly. A Blockreport contains a list of all blocks on a DataNode.
Replica Placement
The placement of replicas is critical to HDFS reliability and performance. Optimizing replica placement distinguishes HDFS from most other distributed file systems. This is a feature that needs lots of tuning and experience. The purpose of a rack-aware replica placement policy is to improve data reliability, availability, and network bandwidth utilization. The current implementation for the replica placement policy is a first effort in this direction. The short-term goals of implementing this policy are to validate it on production systems, learn more about its behavior, and build a foundation to test and research more sophisticated policies.
Large HDFS instances run on a cluster of computers that is commonly spread across many racks. Communication between two nodes in different racks has to go through switches. In most cases, network bandwidth between machines in the same rack is greater than network bandwidth between machines in different racks.
The NameNode determines the rack ID each DataNode belongs to via a rack awareness process. A simple but non-optimal policy is to place replicas on unique racks. This prevents losing data when an entire rack fails and allows use of bandwidth from multiple racks when reading data. This policy evenly distributes replicas in the cluster, which makes it easy to balance load on component failure. However, this policy increases the cost of writes because a write needs to transfer blocks to multiple racks.
For the common case, when the replication factor is three, HDFS's placement policy is to put one replica on one node in the local rack, another on a different node in the local rack, and the last on a different node in a different rack. This policy cuts the inter-rack write traffic, which generally improves write performance. The chance of rack failure is far less than that of node failure; this policy does not impact data reliability and availability guarantees. However, it does reduce the aggregate network bandwidth used when reading data since a block is placed in only two unique racks rather than three. With this policy, the replicas of a file do not evenly distribute across the racks. One third of replicas are on one node, two thirds of replicas are on one rack, and the other third are evenly distributed across the remaining racks. This policy improves write performance without compromising data reliability or read performance.
The current, default replica placement policy described here is a work in progress.
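The three-replica policy described above can be sketched as a small selection function. This is an illustration of the policy as the text states it, not HDFS code: the cluster layout, node names, and the place_replicas function are all hypothetical, and the random choices stand in for whatever load-aware selection a real NameNode performs.

```python
import random

def place_replicas(writer, nodes_by_rack, rng=random):
    """Pick three DataNodes following the policy as described in the text.

    writer: (rack, node) of the writing client.
    nodes_by_rack: {rack: [node, ...]} for the whole cluster.
    """
    local_rack, local_node = writer
    # Second replica: a different node in the writer's rack.
    second = rng.choice([n for n in nodes_by_rack[local_rack] if n != local_node])
    # Third replica: any node in some other rack.
    remote_rack = rng.choice([r for r in nodes_by_rack if r != local_rack])
    third = rng.choice(nodes_by_rack[remote_rack])
    return [(local_rack, local_node), (local_rack, second), (remote_rack, third)]

# Hypothetical three-rack cluster.
cluster = {"rack1": ["n1", "n2", "n3"], "rack2": ["n4", "n5"], "rack3": ["n6"]}
replicas = place_replicas(("rack1", "n1"), cluster)
```

Only one of the three copies crosses a rack boundary, which is the write-traffic saving the text describes, while the block still survives the loss of either rack.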
Replica Selection
To minimize global bandwidth consumption and read latency, HDFS tries to satisfy a read request from a replica that is closest to the reader. If there exists a replica on the same rack as the reader node, then that replica is preferred to satisfy the read request. If an HDFS cluster spans multiple data centers, then a replica that is resident in the local data center is preferred over any remote replica.
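The preference order above (same rack, then same data center, then remote) can be expressed as a simple distance ranking. The sketch assumes each node is described by a dictionary with hypothetical "dc" and "rack" fields; the numeric distances are arbitrary labels chosen for the example.

```python
def distance(reader, replica):
    """0 = same rack, 1 = same data center, 2 = remote data center."""
    if reader["dc"] != replica["dc"]:
        return 2
    return 0 if reader["rack"] == replica["rack"] else 1

def choose_replica(reader, replicas):
    """Return the replica closest to the reader."""
    return min(replicas, key=lambda replica: distance(reader, replica))

reader = {"dc": "dc1", "rack": "r1"}
candidates = [
    {"node": "a", "dc": "dc2", "rack": "r9"},
    {"node": "b", "dc": "dc1", "rack": "r2"},
    {"node": "c", "dc": "dc1", "rack": "r1"},
]
closest = choose_replica(reader, candidates)
```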
Safemode
On startup, the NameNode enters a special state called Safemode. Replication of data blocks does not occur when the NameNode is in the Safemode state. The NameNode receives Heartbeat and Blockreport messages from the DataNodes. A Blockreport contains the list of data blocks that a DataNode is hosting. Each block has a specified minimum number of replicas. A block is considered safely replicated when the minimum number of replicas of that data block has checked in with the NameNode. After a configurable percentage of safely replicated data blocks has checked in with the NameNode (plus an additional 30 seconds), the NameNode exits the Safemode state. It then determines the list of data blocks (if any) that still have fewer than the specified number of replicas. The NameNode then replicates these blocks to other DataNodes.
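The Safemode exit condition can be sketched as a threshold check over the replica counts reported so far. The function names, block IDs, and threshold values below are assumptions for illustration, and the additional 30-second wait mentioned in the text is left out.

```python
def can_leave_safemode(reported, min_replicas, threshold):
    """reported: {block_id: replica count seen in Blockreports so far}."""
    if not reported:
        return True
    safe = sum(1 for count in reported.values() if count >= min_replicas)
    return safe / len(reported) >= threshold

def under_replicated(reported, target_replicas):
    """Blocks that still need more copies once Safemode has been left."""
    return sorted(b for b, count in reported.items() if count < target_replicas)

# Hypothetical state after some DataNodes have reported in.
reported = {"blk_1": 3, "blk_2": 1, "blk_3": 0, "blk_4": 2}

ready = can_leave_safemode(reported, min_replicas=1, threshold=0.75)
not_ready = can_leave_safemode(reported, min_replicas=1, threshold=0.999)
to_replicate = under_replicated(reported, target_replicas=3)
```

Note the two separate numbers: the minimum replica count decides whether a block is safe enough to leave Safemode, while the full replication factor decides which blocks are re-replicated afterwards.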
The Persistence of File System Metadata
The HDFS namespace is stored by the NameNode. The NameNode uses a transaction log called the EditLog to persistently record every change that occurs to file system metadata. For example, creating a new file in HDFS causes the NameNode to insert a record into the EditLog indicating this. Similarly, changing the replication factor of a file causes a new record to be inserted into the EditLog. The NameNode uses a file in its local host OS file system to store the EditLog. The entire file system namespace, including the mapping of blocks to files and file system properties, is stored in a file called the FsImage. The FsImage is stored as a file in the NameNode's local file system too.
The NameNode keeps an image of the entire file system namespace and file Blockmap in memory. This key metadata item is designed to be compact, such that a NameNode with 4 GB of RAM is plenty to support a huge number of files and directories. When the NameNode starts up, it reads the FsImage and EditLog from disk, applies all the transactions from the EditLog to the in-memory representation of the FsImage, and flushes out this new version into a new FsImage on disk. It can then truncate the old EditLog because its transactions have been applied to the persistent FsImage. This process is called a checkpoint. In the current implementation, a checkpoint only occurs when the NameNode starts up. Work is in progress to support periodic checkpointing in the near future.
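The checkpoint process can be illustrated by replaying a log of edits over a saved namespace image. This is a toy model: the dictionary-based image, the three operation names, and the tuple format of log records are assumptions made for the sketch and do not reflect the actual on-disk formats.

```python
def apply_edit(namespace, edit):
    """Apply one logged transaction to the in-memory namespace."""
    op, path, *args = edit
    if op == "create":
        namespace[path] = {"replication": args[0]}
    elif op == "set_replication":
        namespace[path]["replication"] = args[0]
    elif op == "delete":
        namespace.pop(path, None)
    return namespace

def checkpoint(fsimage, edit_log):
    """Replay every edit over a copy of the image; return (new image, empty log)."""
    namespace = {path: dict(meta) for path, meta in fsimage.items()}
    for edit in edit_log:
        apply_edit(namespace, edit)
    return namespace, []

fsimage = {"/a.txt": {"replication": 3}}
edit_log = [
    ("create", "/b.txt", 3),
    ("set_replication", "/a.txt", 2),
    ("delete", "/b.txt"),
]
new_image, new_log = checkpoint(fsimage, edit_log)
```

After the checkpoint the new image already reflects every logged change, so the log can start again from empty.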
The DataNode stores HDFS data in files in its local file system. The DataNode has no knowledge about HDFS files. It stores each block of HDFS data in a separate file in its local file system. The DataNode does not create all files in the same directory. Instead, it uses a heuristic to determine the optimal number of files per directory and creates subdirectories appropriately. It is not optimal to create all local files in the same directory because the local file system might not be able to efficiently support a huge number of files in a single directory. When a DataNode starts up, it scans through its local file system, generates a list of all HDFS data blocks that correspond to each of these local files, and sends this report to the NameNode: this is the Blockreport.
The Communication Protocols
All HDFS communication protocols are layered on top of the TCP/IP protocol. A client establishes a connection to a configurable TCP port on the NameNode machine. It talks the ClientProtocol with the NameNode. The DataNodes talk to the NameNode using the DataNode Protocol. A Remote Procedure Call (RPC) abstraction wraps both the ClientProtocol and the DataNode Protocol. By design, the NameNode never initiates any RPCs. Instead, it only responds to RPC requests issued by DataNodes or clients.
Robustness
The primary objective of HDFS is to store data reliably even in the presence of failures. The three common types of failures are NameNode failures, DataNode failures, and network partitions.
Data Disk Failure, Heartbeats and Re-Replication
Each DataNode sends a Heartbeat message to the NameNode periodically. A network partition can cause a subset of DataNodes to lose connectivity with the NameNode. The NameNode detects this condition by the absence of a Heartbeat message. The NameNode marks DataNodes without recent Heartbeats as dead and does not forward any new IO requests to them. Any data that was registered to a dead DataNode is not available to HDFS any more. DataNode death may cause the replication factor of some blocks to fall below their specified value. The NameNode constantly tracks which blocks need to be replicated and initiates replication whenever necessary. The necessity for re-replication may arise due to many reasons: a DataNode may become unavailable, a replica may become corrupted, a hard disk on a DataNode may fail, or the replication factor of a file may be increased.
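The heartbeat-based failure detection and the resulting re-replication work can be sketched as follows. The timeout value, timestamps, node names, and function names are all hypothetical; the text does not state how long the NameNode waits before declaring a DataNode dead.

```python
def dead_datanodes(last_heartbeat, now, timeout_s):
    """DataNodes whose most recent Heartbeat is older than the timeout."""
    return {dn for dn, seen in last_heartbeat.items() if now - seen > timeout_s}

def blocks_to_rereplicate(block_locations, dead, replication_factor):
    """Map each under-replicated block to the number of extra copies it needs."""
    needs = {}
    for block, holders in block_locations.items():
        live = [dn for dn in holders if dn not in dead]
        if len(live) < replication_factor:
            needs[block] = replication_factor - len(live)
    return needs

# Hypothetical timestamps in seconds; dn3 has been silent for 61 seconds.
last_heartbeat = {"dn1": 100.0, "dn2": 95.0, "dn3": 40.0, "dn4": 99.0}
dead = dead_datanodes(last_heartbeat, now=101.0, timeout_s=30.0)
needs = blocks_to_rereplicate(
    {"blk_1": ["dn1", "dn2", "dn3"], "blk_2": ["dn1", "dn2", "dn4"]},
    dead,
    replication_factor=3,
)
```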
Cluster Rebalancing
The HDFS architecture is compatible with data rebalancing schemes. A scheme might automatically move data from one DataNode to another if the free space on a DataNode falls below a certain threshold. In the event of a sudden high demand for a particular file, a scheme might dynamically create additional replicas and rebalance other data in the cluster. These types of data rebalancing schemes are not yet implemented.
Data Integrity
It is possible that a block of data fetched from a DataNode arrives corrupted. This corruption can occur because of faults in a storage device, network faults, or buggy software. The HDFS client software implements checksum checking on the contents of HDFS files. When a client creates an HDFS file, it computes a checksum of each block of the file and stores these checksums in a separate hidden file in the same HDFS namespace. When a client retrieves file contents, it verifies that the data it received from each DataNode matches the checksum stored in the associated checksum file. If not, then the client can opt to retrieve that block from another DataNode that has a replica of that block.
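The verify-then-fall-back behaviour described above can be sketched with a CRC32 checksum. CRC32 is used here only as an assumed, readily available checksum; the replica list, node names, and read_block function are likewise invented for the example.

```python
import zlib

def checksum(block):
    """Checksum of a block's bytes (CRC32 chosen as an assumption)."""
    return zlib.crc32(block)

def read_block(replicas, expected_checksum):
    """Return the first (datanode, data) pair whose data matches the checksum."""
    for datanode, data in replicas:
        if checksum(data) == expected_checksum:
            return datanode, data
    raise IOError("all replicas failed checksum verification")

original = b"hdfs block contents"
stored = checksum(original)  # computed when the file was written

# The first replica is corrupted, so the client falls back to the second.
replicas = [("dn1", b"hdfs block c0ntents"), ("dn2", original)]
source, data = read_block(replicas, stored)
```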
Metadata Disk Failure
The FsImage and the EditLog are central data structures of HDFS. A corruption of these files can cause the HDFS instance to be non-functional. For this reason, the NameNode can be configured to support maintaining multiple copies of the FsImage and EditLog. Any update to either the FsImage or EditLog causes each of the FsImages and EditLogs to get updated synchronously. This synchronous updating of multiple copies of the FsImage and EditLog may degrade the rate of namespace transactions per second that a NameNode can support. However, this degradation is acceptable because even though HDFS applications are very data intensive in nature, they are not metadata intensive. When a NameNode restarts, it selects the latest consistent FsImage and EditLog to use.
The NameNode machine is a single point of failure for an HDFS cluster. If the NameNode machine fails, manual intervention is necessary. Currently, automatic restart and failover of the NameNode software to another machine is not supported.
Snapshots
Snapshots support storing a copy of data at a particular instant of time. One usage of the snapshot feature may be to roll back a corrupted HDFS instance to a previously known good point in time. HDFS does not currently support snapshots but will in a future release.
Data Organization
Data Blocks
HDFS is designed to support very large files. Applications that are compatible with HDFS are those that deal with large data sets. These applications write their data only once but they read it one or more times and require these reads to be satisfied at streaming speeds. HDFS supports write-once-read-many semantics on files. A typical block size used by HDFS is 64 MB. Thus, an HDFS file is chopped up into 64 MB chunks, and if possible, each chunk will reside on a different DataNode.
&taging
3( | P a e
A client request to create a file does not reach the NameNode immediately. In fact,
initially the HDFS client caches the file data into a temporary local file. Application writes
are transparently redirected to this temporary local file. When the local file accumulates
data worth over one HDFS block size, the client contacts the NameNode. The NameNode
inserts the file name into the file system hierarchy and allocates a data block for it. The
NameNode responds to the client request with the identity of the DataNode and the
destination data block. Then the client flushes the block of data from the local temporary
file to the specified DataNode. When a file is closed, the remaining un-flushed data in the
temporary local file is transferred to the DataNode. The client then tells the NameNode
that the file is closed. At this point, the NameNode commits the file creation operation into
a persistent store. If the NameNode dies before the file is closed, the file is lost.
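
The staging behaviour described above can be modelled in a few lines. This is a toy sketch,
not the HDFS client: StagingClient and its attributes are hypothetical, the temporary local
file is replaced by an in-memory buffer, and a block is flushed as soon as one full block
has accumulated.

```python
class StagingClient:
    """Toy model of a client that stages writes locally before flushing blocks."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.buffer = bytearray()   # stands in for the temporary local file
        self.flushed = []           # blocks handed over to DataNodes, in order
        self.closed = False

    def write(self, data):
        """Application writes go to the local buffer first."""
        self.buffer.extend(data)
        # Flush each time a full block has accumulated.
        while len(self.buffer) >= self.block_size:
            self._flush(self.block_size)

    def _flush(self, size):
        # A real client would first ask the NameNode for a DataNode and a
        # destination block; this model only records the flushed bytes.
        self.flushed.append(bytes(self.buffer[:size]))
        del self.buffer[:size]

    def close(self):
        """Transfer any remaining un-flushed data, then mark the file closed."""
        if self.buffer:
            self._flush(len(self.buffer))
        self.closed = True
```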
This client-side staging approach has been adopted after careful consideration of the
target applications that run on HDFS. These applications need streaming writes to files.
If a client writes to a remote file directly without any client side buffering, the network
speed and the congestion in the network impact throughput considerably. This approach is
not without precedent. Earlier distributed file systems, e.g. AFS, have used client side
caching to improve performance. A POSIX requirement has been relaxed to achieve higher
performance of data uploads.
Replication Pipelining
When a client is writing data to an HDFS file, its data is first written to a local file as
explained in the previous section. Suppose the HDFS file has a replication factor of three.
When the local file accumulates a full block of user data, the client retrieves a list of
DataNodes from the NameNode. This list contains the DataNodes that will host a replica of
that block. The client then flushes the data block to the first DataNode. The first DataNode
starts receiving the data in small portions (4 KB), writes each portion to its local repository
and transfers that portion to the second DataNode in the list. The second DataNode, in turn,
starts receiving each portion of the data block, writes that portion to its repository and
then flushes that portion to the third DataNode. Finally, the third DataNode writes the data
to its local repository. Thus, a DataNode can be receiving data from the previous one in the
pipeline and at the same time forwarding data to the next one in the pipeline. Thus, the
data is pipelined from one DataNode to the next.
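
The portion-by-portion forwarding can be sketched as follows. This is a sequential
simulation under stated assumptions, not the DataNode protocol: pipeline_write is a
hypothetical function, each DataNode repository is represented by a bytearray, and the
real overlap of receiving and forwarding is not modelled.

```python
PORTION = 4 * 1024  # 4 KB portions, as described above


def pipeline_write(block, datanodes, portion=PORTION):
    """Send a block through a chain of DataNodes one portion at a time.

    datanodes is a list of bytearrays standing in for the local repositories.
    Returns the number of portions that travelled down the pipeline.
    """
    sent = 0
    for start in range(0, len(block), portion):
        chunk = block[start:start + portion]
        # Each node stores the portion, then hands it to the next node in the list.
        for repository in datanodes:
            repository.extend(chunk)
        sent += 1
    return sent
```

With a replication factor of three, the list holds three repositories, and every one of
them ends up with a complete copy of the block.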

Accessibility
HDFS can be accessed from applications in many different ways. Natively, HDFS provides
a Java API for applications to use. A C language wrapper for this Java API is also available.
In addition, an HTTP browser can also be used to browse the files of an HDFS instance.
Work is in progress to expose HDFS through the WebDAV protocol.
Space Reclamation
File Deletes and Undeletes
When a file is deleted by a user or an application, it is not immediately removed from
HDFS. Instead, HDFS first renames it to a file in the /trash directory. The file can be
restored quickly as long as it remains in /trash. A file remains in /trash for a configurable
amount of time. After the expiry of its life in /trash, the NameNode deletes the file from the
HDFS namespace. The deletion of a file causes the blocks associated with the file to be
freed. Note that there could be an appreciable time delay between the time a file is
deleted by a user and the time of the corresponding increase in free space in HDFS.
A user can undelete a file after deleting it as long as it remains in the /trash directory. If a
user wants to undelete a file that he/she has deleted, he/she can navigate the /trash
directory and retrieve the file. The /trash directory contains only the latest copy of the file
that was deleted. The /trash directory is just like any other directory with one special
feature: HDFS applies specified policies to automatically delete files from this directory.
The current default policy is to delete files from /trash that are more than 6 hours old. In
the future, this policy will be configurable through a well defined interface.
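
The default expiry policy can be expressed as a small function. This is an illustrative
sketch only: purge_trash and the dictionary of deletion times are hypothetical, and times
are plain seconds rather than anything HDFS actually stores.

```python
TRASH_LIFETIME_SECONDS = 6 * 60 * 60  # default policy quoted above: 6 hours


def purge_trash(trash, now, lifetime=TRASH_LIFETIME_SECONDS):
    """Split trash entries into those kept and those whose lifetime has expired.

    trash maps a file path to the time, in seconds, at which the file was
    moved into /trash. Files more than lifetime seconds old are expired.
    """
    kept = {path: t for path, t in trash.items() if now - t <= lifetime}
    expired = sorted(path for path in trash if path not in kept)
    return kept, expired
```

Only the expired files lose their blocks, which is why free space can appear well after
the user issued the delete.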
Decrease Replication Factor
When the replication factor of a file is reduced, the NameNode selects excess replicas
that can be deleted. The next Heartbeat transfers this information to the DataNode. The
DataNode then removes the corresponding blocks and the corresponding free space
appears in the cluster. Once again, there might be a time delay between the completion of
the setReplication API call and the appearance of free space in the cluster.
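
A minimal sketch of the selection step follows. The text does not say which replicas the
NameNode prefers to drop, so the rule used here (keep the first ones in the list) is purely
an assumption, and excess_replicas is a hypothetical name.

```python
def excess_replicas(holders, target):
    """Return the DataNodes whose replica can be deleted to reach the target factor.

    holders lists the DataNodes that currently store a replica of the block.
    Which replicas are dropped is an assumption: the first `target` are kept.
    """
    if target < 1:
        raise ValueError("replication factor must be at least 1")
    return holders[target:]
```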
Hadoop Filesystems
Hadoop has an abstract notion of filesystem, of which HDFS is just one implementation.
The Java abstract class org.apache.hadoop.fs.FileSystem represents a filesystem in
Hadoop, and there are several concrete implementations, which are described in the
following table.
Filesystem         URI scheme   Java implementation (all under org.apache.hadoop)

Local              file         fs.LocalFileSystem
    A filesystem for a locally connected disk with client-side checksums. Use
    RawLocalFileSystem for a local filesystem with no checksums.
HDFS               hdfs         hdfs.DistributedFileSystem
    Hadoop's distributed filesystem. HDFS is designed to work efficiently in
    conjunction with MapReduce.
HFTP               hftp         hdfs.HftpFileSystem
    A filesystem providing read-only access to HDFS over HTTP. (Despite its name,
    HFTP has no connection with FTP.) Often used with distcp.
HSFTP              hsftp        hdfs.HsftpFileSystem
    A filesystem providing read-only access to HDFS over HTTPS. (Again, this has no
    connection with FTP.)
HAR                har          fs.HarFileSystem
    A filesystem layered on another filesystem for archiving files. Hadoop Archives
    are typically used for archiving files in HDFS to reduce the namenode's memory
    usage.
KFS (CloudStore)   kfs          fs.kfs.KosmosFileSystem
    CloudStore (formerly Kosmos filesystem) is a distributed filesystem like HDFS or
    Google's GFS, written in C++.
FTP                ftp          fs.ftp.FTPFileSystem
    A filesystem backed by an FTP server.
S3 (native)        s3n          fs.s3native.NativeS3FileSystem
    A filesystem backed by Amazon S3.
S3 (block-based)   s3           fs.s3.S3FileSystem
    A filesystem backed by Amazon S3, which stores files in blocks (much like HDFS)
    to overcome S3's 5 GB file size limit.
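
The relationship between URI scheme and implementation class can be written down as a
lookup. This is a sketch for illustration, not how Hadoop resolves filesystems internally:
the function implementation_for and the example URIs are hypothetical, and the class names
are the ones listed in the table.

```python
from urllib.parse import urlparse

# URI scheme -> implementation class, relative to org.apache.hadoop,
# taken from the table of Hadoop filesystems.
FILESYSTEMS = {
    "file": "fs.LocalFileSystem",
    "hdfs": "hdfs.DistributedFileSystem",
    "hftp": "hdfs.HftpFileSystem",
    "hsftp": "hdfs.HsftpFileSystem",
    "har": "fs.HarFileSystem",
    "kfs": "fs.kfs.KosmosFileSystem",
    "ftp": "fs.ftp.FTPFileSystem",
    "s3n": "fs.s3native.NativeS3FileSystem",
    "s3": "fs.s3.S3FileSystem",
}


def implementation_for(uri):
    """Return the fully qualified class name that handles the URI's scheme."""
    scheme = urlparse(uri).scheme
    if scheme not in FILESYSTEMS:
        raise ValueError("no filesystem listed for scheme %r" % scheme)
    return "org.apache.hadoop." + FILESYSTEMS[scheme]
```

For example, a URI such as hdfs://namenode/user/data (a made-up location) maps to
org.apache.hadoop.hdfs.DistributedFileSystem.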
Hadoop Archives
HDFS stores small files inefficiently, since each file is stored in a block, and block
metadata is held in memory by the namenode. Thus, a large number of small files can eat
up a lot of memory on the namenode. (Note, however, that small files do not take up any
more disk space than is required to store the raw contents of the file. For example, a 1 MB
file stored with a block size of 128 MB uses 1 MB of disk space, not 128 MB.) Hadoop
Archives, or HAR files, are a file archiving facility that packs files into HDFS blocks more
efficiently, thereby reducing namenode memory usage while still allowing transparent
access to files. In particular, Hadoop Archives can be used as input to MapReduce.
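
The difference between disk usage and block count can be checked with a short
calculation. This is a simplified sketch: blocks_needed is a hypothetical helper, and the
packed case ignores any index data an archive keeps, so the numbers only illustrate the
trend.

```python
MB = 1024 * 1024


def blocks_needed(file_sizes, block_size=128 * MB):
    """Count the blocks used when every file is stored in its own block(s)."""
    # -(-a // b) is integer ceiling division; an empty file counts as zero blocks here.
    return sum(-(-size // block_size) for size in file_sizes)


small_files = [1 * MB] * 1000          # one thousand 1 MB files
disk_used = sum(small_files)           # disk space is just the raw contents
separate = blocks_needed(small_files)  # one block entry per small file
packed = blocks_needed([disk_used])    # same bytes packed back to back
```

Under these assumptions the files use 1000 MB of disk either way, but the namenode tracks
1000 blocks when the files are stored separately and only 8 when the same bytes are packed
together.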
Using Hadoop Archives
A Hadoop Archive is created from a collection of files using the archive tool. The tool runs
a MapReduce job to process the input files in parallel, so you need a running MapReduce
cluster to use it.
Limitations
There are a few limitations to be aware of with HAR files. Creating an archive creates a
copy of the original files, so you need as much disk space as the files you are archiving to
create the archive (although you can delete the originals once you have created the
archive). There is currently no support for archive compression, although the files that go
into the archive can be compressed (HAR files are like tar files in this respect). Archives are
immutable once they have been created. To add or remove files, you must recreate the
archive. In practice, this is not a problem for files that don't change after being written,
since they can be archived in batches on a regular basis, such as daily or weekly. As noted
earlier, HAR files can be used as input to MapReduce. However, there is no archive-aware
InputFormat that can pack multiple files into a single MapReduce split, so processing lots of
small files, even in a HAR file, can still be inefficient.
ANATOMY OF A MAPREDUCE JOB RUN
Four independent entities take part in a MapReduce job run:
• The client, which submits the MapReduce job.
• The jobtracker, which coordinates the job run. The jobtracker is a Java application
whose main class is JobTracker.
• The tasktrackers, which run the tasks that the job has been split into. Tasktrackers
are Java applications whose main class is TaskTracker.
• The distributed filesystem, which is used for sharing job files between the other
entities.
Hadoop is now a part of:
Amazon S3
Amazon S3 (Simple Storage Service) is a data storage service. You are billed monthly for
storage and data transfer. Transfer between S3 and Amazon EC2 is free. This makes use of
S3 attractive for Hadoop users who run clusters on EC2.
Hadoop provides two filesystems that use S3.
S3 Native FileSystem (URI scheme: s3n)
A native filesystem for reading and writing regular files on S3. The advantage of this
filesystem is that you can access files on S3 that were written with other tools.
Conversely, other tools can access files written using Hadoop. The disadvantage is
the 5 GB limit on file size imposed by S3. For this reason it is not suitable as a
replacement for HDFS (which has support for very large files).
S3 Block FileSystem (URI scheme: s3)
A block-based filesystem backed by S3. Files are stored as blocks, just like they are
in HDFS. This permits efficient implementation of renames. This filesystem requires
you to dedicate a bucket for the filesystem: you should not use an existing bucket
containing files, or write other files to the same bucket. The files stored by this
filesystem can be larger than 5 GB, but they are not interoperable with other S3
tools.
There are two ways that S3 can be used with Hadoop's MapReduce: either as a
replacement for HDFS using the S3 block filesystem (i.e. using it as a reliable distributed
filesystem with support for very large files), or as a convenient repository for data input to
and output from MapReduce, using either S3 filesystem. In the second case HDFS is still
used for the MapReduce phase. Note also that by using S3 as an input to MapReduce you
lose the data locality optimization, which may be significant.
FACEBOOK
Facebook's engineering team has posted some details on the tools it is using to analyze
the huge data sets it collects. One of the main tools it uses is Hadoop, which makes it
easier to analyze vast amounts of data.
Some interesting tidbits from the post:
Some of these early projects have matured into publicly released features
(like the Facebook Lexicon) or are being used in the background to improve
user experience on Facebook (by improving the relevance of search results,
for example).
Facebook has multiple Hadoop clusters deployed now, with the biggest
having about 2500 CPU cores and 1 petabyte of disk space. They are loading
over 250 gigabytes of compressed data (over 2 terabytes uncompressed)
into the Hadoop file system every day and have hundreds of jobs running
each day against these data sets. The list of projects that are using this
infrastructure has proliferated, from those generating mundane statistics
about site usage, to others being used to fight spam and determine
application quality.
Over time, we have added classic data warehouse features like partitioning,
sampling and indexing to this environment. This in-house data warehousing
layer over Hadoop is called Hive.
YAHOO!
Yahoo! recently launched the world's largest Apache Hadoop production
application. The Yahoo! Search Webmap is a Hadoop application that runs on a
Linux cluster of more than 10,000 cores and produces data that is now used in every
Yahoo! Web search query.
The Webmap build starts with every Web page crawled by Yahoo! and produces a
database of all known Web pages and sites on the internet and a vast array of data about
every page and site. This derived data feeds the Machine Learned Ranking algorithms at
the heart of Yahoo! Search.
Some Webmap size data:
Number of links between pages in the index: roughly 1 trillion links
Size of output: over 300 TB, compressed!
Number of cores used to run a single Map-Reduce job: over 10,000
Raw disk used in the production cluster: over 5 Petabytes
This process is not new. What is new is the use of Hadoop. Hadoop has allowed us to run
the identical processing we ran pre-Hadoop on the same cluster in 66% of the time our
previous system took. It does that while simplifying administration.