
11/30/2015

Hadoop Interview Questions | Setting up Hadoop Cluster | Edureka


Hadoop Interview Questions - Setting Up Hadoop Cluster

April 23, 2013 | Big Data and Hadoop, Interview Questions


Looking for Hadoop interview questions that are frequently asked by employers? Here is the second list of Hadoop interview questions, which covers setting up a Hadoop cluster.
Which are the three modes in which Hadoop can be run?
The three modes in which Hadoop can be run are:
1. Standalone (local) mode
2. Pseudo-distributed mode
3. Fully distributed mode
What are the features of Standalone (local) mode?
In standalone mode there are no daemons; everything runs in a single JVM. It has no DFS and uses the local file system. Standalone mode is suitable only for running MapReduce programs during development. It is one of the least used environments.
What are the features of Pseudo mode?
Pseudo mode is used both for development and in the QA environment. In Pseudo mode, all the daemons run on the same machine.
Can we call VMs pseudos?
No. A VM is a general-purpose virtual machine, whereas pseudo mode is a way of running Hadoop; the term is very specific to Hadoop.
What are the features of Fully Distributed mode?
Fully Distributed mode is used in the production environment, where we have 'n' number of machines forming a Hadoop cluster. Hadoop daemons run on a cluster of machines. There is one host on which the Namenode is running, another host on which a datanode is running, and then there are machines on which the tasktracker is running. We have separate masters and separate slaves in this distribution.
Does Hadoop follow the UNIX pattern?
Yes, Hadoop closely follows the UNIX pattern. Hadoop also has a 'conf' directory, as in the case of UNIX.
In which directory is Hadoop installed?
Cloudera and Apache have the same directory structure. Hadoop is installed in /usr/lib/hadoop-0.20/ (you can reach it with cd /usr/lib/hadoop-0.20/).
What are the port numbers of Namenode, jobtracker and tasktracker?
The default web UI port number for the Namenode is 50070, for the jobtracker 50030 and for the tasktracker 50060 (often shortened to '70', '30' and '60').
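As a quick illustration, the short numbers 70, 30 and 60 are read here as the last two digits of the Hadoop 1.x default web UI ports 50070, 50030 and 50060. The sketch below simply maps each daemon to that assumed default; the dictionary and the `web_ui_url` helper are hypothetical names, and a real cluster may override these ports in its configuration.

```python
# Default web UI ports for Hadoop 1.x daemons (assumed defaults; a cluster
# may override them in its configuration files).
DEFAULT_WEB_UI_PORTS = {
    "namenode": 50070,
    "jobtracker": 50030,
    "tasktracker": 50060,
}


def web_ui_url(daemon: str, host: str = "localhost") -> str:
    """Return the URL of a daemon's web interface on the given host."""
    port = DEFAULT_WEB_UI_PORTS[daemon.lower()]
    return f"http://{host}:{port}/"


for name in DEFAULT_WEB_UI_PORTS:
    print(name, web_ui_url(name))
```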
What is the Hadoop core configuration?
Hadoop core was originally configured by two XML files:
1. hadoop-default.xml
2. hadoop-site.xml
These files are written in XML format. They contain properties, each of which consists of a name and a value. These files no longer exist in current releases.
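To make the name/value structure concrete, here is a minimal sketch that reads such a file into a dictionary. It assumes the usual `<configuration>`/`<property>` layout of Hadoop configuration files; the sample property value and the `read_properties` helper are illustrative only.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the <configuration>/<property> layout used by
# Hadoop configuration files; the value shown is illustrative only.
SAMPLE_CONFIG = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>
"""


def read_properties(xml_text: str) -> dict:
    """Collect every <property> element into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value", default="")
        if name:
            props[name.strip()] = value.strip()
    return props


print(read_properties(SAMPLE_CONFIG))
```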
What are the Hadoop configuration files at present?
There are 3 configuration files in Hadoop:
1. core-site.xml
2. hdfs-site.xml
3. mapred-site.xml


These files are located in the conf/ subdirectory.
How to exit the vi editor?
To exit the vi editor, press ESC, type :q and then press ENTER.
What is a spill factor with respect to the RAM?
Spill factor is the fill level of the in-memory buffer after which its contents are written out (spilled) to temporary files on disk. The Hadoop temp directory is used for this.
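A small worked example may help. It assumes that "spill factor" refers to the Hadoop 1.x map-side sort buffer, whose size is set by io.sort.mb (default 100 MB) and whose spill threshold is set by io.sort.spill.percent (default 0.80). The `spill_threshold_mb` helper is a hypothetical name used only for this sketch.

```python
def spill_threshold_mb(buffer_mb: float = 100, spill_percent: float = 0.80) -> float:
    """Return how full the sort buffer (in MB) gets before a spill starts."""
    if not 0 < spill_percent <= 1:
        raise ValueError("spill_percent must be in (0, 1]")
    return buffer_mb * spill_percent


# With the assumed defaults, spilling starts at roughly 80 MB of buffered data.
print(spill_threshold_mb())
```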
Is fs.mapr.working.dir a single directory?
Yes, fs.mapr.working.dir is just one directory.
Which are the three main hdfs-site.xml properties?
The three main hdfs-site.xml properties are:
1. dfs.name.dir, which gives the location where the metadata will be stored and where DFS is located, on disk or on a remote location.
2. dfs.data.dir, which gives the location where the data is going to be stored.
3. fs.checkpoint.dir, which is for the secondary Namenode.
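The three properties above could be written into an hdfs-site.xml style document roughly as follows. This is only a sketch: the directory paths are placeholders rather than recommended locations, and `build_hdfs_site` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def build_hdfs_site(properties: dict) -> str:
    """Render {name: value} pairs as a Hadoop-style configuration document."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# The directory paths below are placeholders, not recommended locations.
example = build_hdfs_site({
    "dfs.name.dir": "/data/1/dfs/nn",
    "dfs.data.dir": "/data/1/dfs/dn",
    "fs.checkpoint.dir": "/data/1/dfs/snn",
})
print(example)
```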
How to come out of insert mode?
To come out of insert mode, press ESC, then type :q (if you have not written anything) OR :wq (if you have written anything in the file) and press ENTER.
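What is Cloudera and why is it used?
Cloudera provides a distribution of Hadoop. On the Cloudera VM, 'cloudera' is also the user created by default. Cloudera is a commercial company and not part of the Apache Software Foundation; its distribution packages Apache Hadoop and is used for data processing.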
What happens if you get a 'connection refused' Java exception when you type hadoop fsck /?
It could mean that the Namenode is not working on your VM.
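One way to confirm this is to test whether anything is accepting TCP connections on the Namenode's port. The sketch below is a generic check, not a Hadoop API; `port_is_open` is a hypothetical helper, and the host and port to probe (for example localhost and 50070, the browser port mentioned in this post) depend on your own setup.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts and unreachable hosts.
        return False
```

Calling `port_is_open("localhost", 50070)` on the VM would then return False when nothing is listening on that port, which matches the 'connection refused' symptom.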
We are using the Ubuntu operating system with Cloudera, but from where can we download Hadoop, or does it come by default with Ubuntu?
This is a default configuration of Hadoop that you have to download from Cloudera or from Edureka's Dropbox and then run on your systems. You can also proceed with your own configuration, but you need a Linux box, be it Ubuntu or Red Hat. There are installation steps present at the Cloudera location or in Edureka's Dropbox. You can go either way.
What does the jps command do?
The jps command lists the Java processes running on the machine, so it shows whether your Namenode, datanode, tasktracker, jobtracker, etc. are working or not.
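As a rough sketch of how that check can be automated, the snippet below scans jps-style output (one "<pid> <class name>" pair per line) for the Hadoop 1.x daemon names. The sample output and the `missing_daemons` helper are hypothetical; real process IDs, and the set of daemons you expect on a given node, will differ.

```python
EXPECTED_DAEMONS = {
    "NameNode", "DataNode", "SecondaryNameNode", "JobTracker", "TaskTracker",
}


def missing_daemons(jps_output: str, expected=EXPECTED_DAEMONS) -> set:
    """Return the expected daemon names that do not appear in jps output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split(None, 1)  # "<pid> <main class name>"
        if len(parts) == 2:
            running.add(parts[1].strip())
    return set(expected) - running


# Hypothetical output; real process IDs will differ.
SAMPLE = """4821 NameNode
4935 DataNode
5120 JobTracker
5342 Jps
"""

print(sorted(missing_daemons(SAMPLE)))
```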
How can I restart Namenode?
1. Run stop-all.sh and then run start-all.sh, OR
2. Type sudo hdfs (press ENTER), su - hdfs (press ENTER), /etc/init.d/ha (press ENTER) and then /etc/init.d/hadoop-0.20-namenode start (press ENTER).
What is the full form of fsck?
The full form of fsck is File System Check.
How can we check whether Namenode is working or not?
To check whether the Namenode is working or not, use the command /etc/init.d/hadoop-0.20-namenode status, or simply jps.
What does mapred.job.tracker do?
mapred.job.tracker is a configuration property rather than a command. It specifies which of your nodes is acting as the jobtracker, given as a host and port.
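For illustration, a mapred.job.tracker value of the form host:port (the same form as the localhost:8021 address mentioned in this post) can be split into its parts like this. `parse_job_tracker` is a hypothetical helper, not part of Hadoop.

```python
def parse_job_tracker(value: str) -> tuple:
    """Split a "host:port" property value into (host, port)."""
    host, sep, port = value.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {value!r}")
    return host, int(port)


print(parse_job_tracker("localhost:8021"))
```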
What does /etc/init.d do?
/etc/init.d is the directory where the scripts for daemons (services) are placed; these scripts are also used to see the status of the daemons. It is very Linux specific and has nothing to do with Hadoop.
How can we look for the Namenode in the browser?
To look for the Namenode in the browser, do not use localhost:8021; the port number to look for the Namenode in the browser is 50070.
How to change from SU to Cloudera?
To change from SU to Cloudera, just type exit.
Which files are used by the startup and shutdown commands?
The slaves and masters files are used by the startup and shutdown commands.
What does slaves consist of?
slaves consists of a list of hosts, one per line, that host datanode and tasktracker servers.
What does masters consist of?
masters contains a list of hosts, one per line, that are to host secondary Namenode servers.
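Since both files are plain lists with one host per line, reading them takes only a few lines. The sketch below assumes blank lines and '#' comments should be skipped; `read_hosts` and the host names in the usage line are hypothetical.

```python
def read_hosts(text: str) -> list:
    """Return host names from a slaves/masters style file, one per line."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if line:
            hosts.append(line)
    return hosts


print(read_hosts("slave1\n\n# spare machine\nslave2  # second rack\n"))
```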
What does hadoop-env.sh do?
hadoop-env.sh provides the environment for Hadoop to run. JAVA_HOME is set here.

Can we have multiple entries in the masters file?
Yes, we can have multiple entries in the masters file.
Where is the hadoop-env.sh file present?
The hadoop-env.sh file is present in the conf location.
In HADOOP_PID_DIR, what does PID stand for?
PID stands for 'Process ID'.
What does /var/hadoop/pids do?
It stores the PID.
What does the hadoop-metrics.properties file do?
hadoop-metrics.properties is used for reporting purposes. It controls the reporting for Hadoop. The default status is 'not to report'.
What are the network requirements for Hadoop?
The Hadoop core uses Secure Shell (SSH) to launch the server processes on the slave nodes. It requires a password-less SSH connection between the master and all the slaves and the secondary machines.
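To see why a password prompt would get in the way, consider a script that has to run a command on every slave. The sketch below only builds the ssh argument lists and does not execute them. `build_ssh_commands` and the host names are hypothetical; `BatchMode=yes` is a standard OpenSSH client option that makes ssh fail instead of asking for a password.

```python
def build_ssh_commands(hosts, remote_command):
    """Build one ssh argument list per host without running anything."""
    return [
        # BatchMode=yes makes ssh fail instead of prompting for a password,
        # which is what an unattended start script needs.
        ["ssh", "-o", "BatchMode=yes", host, remote_command]
        for host in hosts
    ]


for argv in build_ssh_commands(["slave1", "slave2"], "jps"):
    print(" ".join(argv))
```

Each argument list could then be handed to `subprocess.run`; with password-less SSH in place the commands complete without any interaction.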
Why do we need password-less SSH in a Fully Distributed environment?
We need password-less SSH in a Fully Distributed environment because when the cluster is live and running in a Fully Distributed environment, the communication is too frequent. The jobtracker should be able to send a task to a tasktracker quickly.
Does this lead to security issues?
No, not at all. A Hadoop cluster is an isolated cluster and generally has nothing to do with the internet. It has a different kind of configuration. We needn't worry about that kind of security breach, for instance, someone hacking through the internet, and so on. Hadoop has a very secure way to connect to other machines to fetch and to process data.
On which port does SSH work?
SSH works on port 22, though it can be configured. 22 is the default port number.
Can you tell us more about SSH?
SSH is nothing but secure shell communication. It is a protocol that works on port 22, and when you do an SSH, what you really require is a password.


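Why is a password needed in SSH localhost?
A password is required in SSH for security, and in a situation where password-less communication is not set up.
Do we need to give a password, even if the key is added in SSH?
Yes, a password can still be required even if the key is added in SSH, for example when the key itself is protected by a passphrase.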
What if a Namenode has no data?
If a Namenode has no data, it is not a Namenode. Practically, a Namenode will have some data.
What happens to the jobtracker when the Namenode is down?
When the Namenode is down, your cluster is OFF. This is because the Namenode is the single point of failure in HDFS.
What happens to a Namenode when the jobtracker is down?
When the jobtracker is down, it will not be functional but the Namenode will be present. So, the cluster is accessible if the Namenode is working, even if the jobtracker is not working.
Can you give us some more details about SSH communication between masters and the slaves?
SSH is a secure communication channel, set up here without passwords, over which data packets are sent to the slaves. It has a defined format in which data is sent across. SSH is not only used between masters and slaves but also between any two hosts.
What is formatting of the DFS?
Just like we do for Windows, DFS is formatted for proper structuring. It is not usually done, as it formats the Namenode too and erases its metadata.
Does the HDFS client decide the input split, or the Namenode?
No, the client does not decide. The input split is already specified in one of the configurations.
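As a simplified illustration of a configured split size at work, the sketch below estimates how many splits a file yields when the split size equals a 64 MB block (the Hadoop 1.x default block size). This is an approximation under that assumption, not the exact logic of any Hadoop input format, and `estimate_split_count` is a hypothetical helper.

```python
def estimate_split_count(file_size_bytes: int, split_size_bytes: int) -> int:
    """Estimate how many input splits a file of the given size produces."""
    if split_size_bytes <= 0:
        raise ValueError("split_size_bytes must be positive")
    # Integer ceiling division: a partial final chunk still needs a split.
    return -(-file_size_bytes // split_size_bytes)


MB = 1024 * 1024
print(estimate_split_count(200 * MB, 64 * MB))
```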
In Cloudera there is already a cluster, but if I want to form a cluster on Ubuntu, can we do it?
Yes, you can go ahead with this! There are installation steps for creating a new cluster. You can uninstall your present cluster and install the new cluster.
Can we create a Hadoop cluster from scratch?
Yes, we can do that too, once we are familiar with the Hadoop environment.
Can we use Windows for Hadoop?
Actually, Red Hat Linux or Ubuntu are the best operating systems for Hadoop. Windows is not used frequently for installing Hadoop, as there are many support problems attached to Windows. Thus, Windows is not a preferred environment for Hadoop.
Got a question for us? Please mention it in the comments section and we will get back to you.
About Priyanka (13 Posts)


http://www.edureka.co/blog/hadoopinterviewquestionshadoopcluster/


2 Comments

Awadhesh, 9 months ago:
Please update the questions for Hadoop 2.0 also. Some of the questions are specific to Hadoop 1.0, where tasktracker and jobtracker are mentioned.

Edureka Support (Mod), replying to Awadhesh, 8 months ago:
Hi Awadhesh, thanks for commenting. We will take your suggestion into consideration.
