Hadoop Interview Questions | Setting up Hadoop Cluster | Edureka
Hadoop Interview Questions: Setting Up Hadoop Cluster
April 23, 2013 | Big Data and Hadoop, Interview Questions
Looking for Hadoop interview questions that employers frequently ask? Here is the
second list of Hadoop interview questions, which covers setting up a Hadoop cluster.
Which are the three modes in which Hadoop can be run?
The three modes in which Hadoop can be run are:
1. Standalone (local) mode
2. Pseudo-distributed mode
3. Fully distributed mode
What are the features of Standalone (local) mode?
In standalone mode there are no daemons; everything runs in a single JVM. It has no DFS and
uses the local file system. Standalone mode is suitable only for running MapReduce programs
during development. It is one of the least used environments.
What are the features of Pseudo mode?
Pseudo mode is used both for development and in the QA environment. In Pseudo mode all the
daemons run on the same machine.
Can we call VMs pseudos?
No, VMs are not pseudos. A VM is a general virtualization concept, whereas pseudo mode is very specific to Hadoop.
What are the features of Fully Distributed mode?
Fully Distributed mode is used in the production environment, where we have n machines
forming a Hadoop cluster. The Hadoop daemons run on a cluster of machines. There is one host on
which the Namenode is running, another host on which the datanode is running, and then there are
machines on which the tasktracker is running. We have separate masters and separate slaves in this
distribution.
Does Hadoop follow the UNIX pattern?
Yes, Hadoop closely follows the UNIX pattern. Hadoop also has a conf directory, as in the case of
UNIX.
In which directory is Hadoop installed?
Cloudera and Apache have the same directory structure. Hadoop is installed in /usr/lib/hadoop-0.20/
(cd /usr/lib/hadoop-0.20/).
What are the port numbers of Namenode, jobtracker and tasktracker?
The default web UI port for the Namenode is 50070, for the jobtracker 50030 and for the tasktracker 50060 (often abbreviated to 70, 30 and 60).
What is the Hadoop core configuration?
Hadoop core was originally configured by two XML files:
1. hadoop-default.xml
2. hadoop-site.xml
Each file holds a set of properties, and each property consists of a name and a value.
These two files are no longer used.
What are the Hadoop configuration files at present?
There are 3 configuration files in Hadoop:
1. core-site.xml
2. hdfs-site.xml
3. mapred-site.xml
These files are located in the conf/ subdirectory.
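The name/value layout of these files can be read with a few lines of Python. This is only a minimal sketch: the sample content, the host and port in fs.default.name, and the temp directory path are placeholders made up for illustration, and the helper read_properties is a hypothetical name, not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# Hypothetical core-site.xml content; host, port and path are placeholders.
SAMPLE_CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop-example</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Return a dict mapping each property name to its value."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


if __name__ == "__main__":
    for name, value in read_properties(SAMPLE_CORE_SITE).items():
        print(name, "=", value)
```

To inspect a real file, the same function could be given the text of conf/core-site.xml read from disk.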
How to exit the Vi editor?
To exit the Vi editor, press ESC, type :q and then press ENTER.
What is a spill factor with respect to the RAM?
The spill factor is the size threshold after which your files move to a temp file. The Hadoop temp directory is used for
this.
Is fs.mapr.working.dir a single directory?
Yes, fs.mapr.working.dir is just one directory.
Which are the three main hdfs-site.xml properties?
The three main hdfs-site.xml properties are:
1. dfs.name.dir, which gives the location where the metadata will be stored and where the DFS is
located, on local disk or on a remote location.
2. dfs.data.dir, which gives the location where the data is going to be stored.
3. fs.checkpoint.dir, which is for the secondary Namenode.
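As a rough illustration, the three properties could be written out in the usual name/value form like this. The directory paths are invented for the example, and build_site_xml is a hypothetical helper, not a Hadoop tool; a real cluster would use whatever directories its administrator has chosen.

```python
import xml.etree.ElementTree as ET

# Hypothetical directories, chosen only for illustration.
HDFS_SITE = {
    "dfs.name.dir": "/data/example/name",
    "dfs.data.dir": "/data/example/data",
    "fs.checkpoint.dir": "/data/example/checkpoint",
}


def build_site_xml(properties):
    """Build a <configuration> document from a dict of name/value pairs."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_site_xml(HDFS_SITE))
```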
How to come out of the insert mode?
To come out of insert mode, press ESC, then type :q (if you have not written anything) or :wq (if
you have written anything in the file) and press ENTER.
What is Cloudera and why is it used?
Cloudera is a distribution of Hadoop. On the Cloudera VM, cloudera is also the user created by default. The Cloudera
distribution is built on Apache Hadoop and is used for data processing.
What happens if you get a "connection refused" Java exception when you type hadoop fsck /?
It could mean that the Namenode is not working on your VM.
We are using the Ubuntu operating system with Cloudera, but from where can we download
Hadoop, or does it come by default with Ubuntu?
This is a default configuration of Hadoop that you have to download from Cloudera or from
Edureka's Dropbox and then run on your systems. You can also proceed with your own configuration,
but you need a Linux box, be it Ubuntu or Red Hat. There are installation steps present at the Cloudera
location or in Edureka's Dropbox. You can go either way.
What does the jps command do?
The jps command lists the Java processes running on the machine, so you can check whether your Namenode, datanode, tasktracker, jobtracker, etc. are working or
not.
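A small sketch of how jps output could be checked programmatically is given below. The sample output and its PIDs are invented, the list of expected daemons assumes a Hadoop 1.x single-node setup, and both helper functions are hypothetical names.

```python
# Daemon names as jps typically prints them on a Hadoop 1.x node (assumed set).
EXPECTED_DAEMONS = [
    "NameNode",
    "SecondaryNameNode",
    "DataNode",
    "JobTracker",
    "TaskTracker",
]

# Made-up sample output; real PIDs and the set of processes will differ.
SAMPLE_JPS_OUTPUT = """4821 NameNode
4933 DataNode
5040 SecondaryNameNode
5312 Jps
"""


def running_daemons(jps_output):
    """Map each Java process name in jps output to its PID."""
    found = {}
    for line in jps_output.splitlines():
        parts = line.split()
        # Each useful line has the form "<pid> <name>".
        if len(parts) == 2 and parts[0].isdigit():
            found[parts[1]] = int(parts[0])
    return found


def missing_daemons(jps_output, expected=EXPECTED_DAEMONS):
    """Return the expected daemons that do not appear in the jps output."""
    found = running_daemons(jps_output)
    return [name for name in expected if name not in found]


if __name__ == "__main__":
    print("Missing:", missing_daemons(SAMPLE_JPS_OUTPUT))
```

With the sample above, the jobtracker and tasktracker are reported as missing, which is the kind of result you would look for when a daemon has failed to start.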
How can I restart Namenode?
1. Run stop-all.sh and then run start-all.sh, OR
2. Write sudo hdfs (press enter), su-hdfs (press enter), /etc/init.d/ha (press enter) and
then /etc/init.d/hadoop-0.20-namenode start (press enter).
What is the full form of fsck?
The full form of fsck is File System Check.
How can we check whether Namenode is working or not?
To check whether the Namenode is working or not, use the command /etc/init.d/hadoop-0.20-namenode
status, or simply jps.
What does the command mapred.job.tracker do?
mapred.job.tracker is a configuration property rather than a command. It specifies which of your nodes is acting as the jobtracker.
What does /etc/init.d do?
/etc/init.d is the directory where the scripts for daemons (services) are placed; these scripts can also be used to see the status of the daemons. It is
very Linux specific and has nothing to do with Hadoop.
How can we look for the Namenode in the browser?
If you have to look for the Namenode in the browser, you don't have to give localhost:8021; the port
number to look for the Namenode in the browser is 50070.
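Before opening the browser, a quick TCP check can tell whether anything is listening on that port. The sketch below uses only the Python standard library; is_port_open is a hypothetical helper, and localhost is assumed to be the machine running the Namenode.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 50070 is the Namenode web UI port mentioned in the answer above.
    print("Namenode web UI reachable:", is_port_open("localhost", 50070))
```

A False result here is consistent with the Namenode being down, but it can also mean a firewall is blocking the port, so treat it as a hint rather than proof.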
How to change from SU to Cloudera?
To change from SU to Cloudera, just type exit.
Which files are used by the startup and shutdown commands?
The slaves and masters files are used by the startup and shutdown commands.
What do slaves consist of?
Slaves consist of a list of hosts, one per line, that host datanode and tasktracker servers.
What do masters consist of?
Masters contain a list of hosts, one per line, that are to host secondary Namenode servers.
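Since both files are just one host per line, reading them takes only a few lines. The sketch below is an illustration: read_hosts is a hypothetical helper, the host names are invented, and skipping blank lines and # comments is an assumption about how the files are usually treated.

```python
def read_hosts(text):
    """Return the hosts listed one per line, skipping blanks and # comments."""
    hosts = []
    for line in text.splitlines():
        # Drop anything after a '#' and surrounding whitespace (assumed convention).
        line = line.split("#", 1)[0].strip()
        if line:
            hosts.append(line)
    return hosts


if __name__ == "__main__":
    # Invented contents of a slaves file.
    sample_slaves = "node1.example.com\nnode2.example.com\n\n# spare machine\nnode3.example.com\n"
    for host in read_hosts(sample_slaves):
        print(host)
```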
What does hadoop-env.sh do?
hadoop-env.sh provides the environment for Hadoop to run. JAVA_HOME is set here.
Can we have multiple entries in the master files?
Yes, we can have multiple entries in the master files.
Where is the hadoop-env.sh file present?
The hadoop-env.sh file is present in the conf directory.
In HADOOP_PID_DIR, what does PID stand for?
PID stands for Process ID.
What does /var/hadoop/pids do?
It stores the PIDs of the running Hadoop daemons.
What does the hadoop-metrics.properties file do?
hadoop-metrics.properties is used for reporting purposes. It controls the reporting for Hadoop.
The default status is not to report.
What are the network requirements for Hadoop?
The Hadoop core uses Secure Shell (SSH) to launch the server processes on the slave nodes. It
requires a password-less SSH connection between the master and all the slaves and the secondary
machines.
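One way to check that requirement from the master is to ask ssh to run a trivial command with password prompts disabled. This is a sketch under assumptions: the host and user names are invented, the helper names are hypothetical, and it relies on the OpenSSH client options BatchMode and ConnectTimeout being available on the machine.

```python
import subprocess


def ssh_check_command(host, user=None):
    """Build an ssh command that fails instead of prompting for a password."""
    target = f"{user}@{host}" if user else host
    return [
        "ssh",
        "-o", "BatchMode=yes",      # never prompt for a password or passphrase
        "-o", "ConnectTimeout=5",   # give up quickly on unreachable hosts
        target,
        "true",                     # trivial remote command
    ]


def has_passwordless_ssh(host, user=None):
    """Return True if ssh can run a command on host without prompting."""
    try:
        result = subprocess.run(
            ssh_check_command(host, user),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=15,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical slave host names.
    for slave in ["slave1.example.com", "slave2.example.com"]:
        print(slave, has_passwordless_ssh(slave))
```

Running the check against every host in the slaves file before starting the cluster can catch a missing key early.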
Why do we need a password-less SSH in a Fully Distributed environment?
We need a password-less SSH in a Fully Distributed environment because when the cluster is live
and running, the communication is too frequent for passwords to be entered each time. The jobtracker should be able to send a
task to a tasktracker quickly.
Does this lead to security issues?
No, not at all. A Hadoop cluster is an isolated cluster, and generally it has nothing to do with the
internet. It has a different kind of configuration. We needn't worry about that kind of security
breach, for instance, someone hacking in through the internet, and so on. Hadoop has a very secure way
to connect to other machines to fetch and to process data.
On which port does SSH work?
SSH works on port 22, though it can be configured. 22 is the default port number.
Can you tell us more about SSH?
SSH is nothing but secure shell communication. It is a protocol that works on port
22 by default, and when you do an SSH, what you normally require is a password.
Why is a password needed in SSH localhost?
A password is required in SSH for security, and in a situation where password-less communication is not
set up.
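Do we need to give a password, even if the key is added in SSH?
Once the public key has been added to the remote host's authorized keys, SSH no longer asks for the account password. You are still prompted if the private key is protected by a passphrase or if the key setup is incorrect.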
What if a Namenode has no data?
If a Namenode has no data, it is not a Namenode. Practically, a Namenode will have some data.
What happens to the jobtracker when the Namenode is down?
When the Namenode is down, your cluster is OFF. This is because the Namenode is the single point of failure
in HDFS.
What happens to a Namenode when the jobtracker is down?
When the jobtracker is down, it will not be functional, but the Namenode will still be present. So the cluster is
accessible if the Namenode is working, even if the jobtracker is not working.
Can you give us some more details about SSH communication between Masters and the Slaves?
Between masters and slaves, SSH is set up as a password-less secure channel over which data packets are sent to the slaves. It has
a defined format in which the data is sent across. SSH is not only used between masters and slaves but also
between any two hosts.
What is formatting of the DFS?
Just like we do for Windows, the DFS is formatted for proper structuring. It is not usually done, as it
formats the Namenode too.
Does the HDFS client decide the input split or the Namenode?
No, the client does not decide. The input split is already specified in one of the
configurations.
In Cloudera there is already a cluster, but if I want to form a cluster on Ubuntu can we do it?
Yes, you can go ahead with this! There are installation steps for creating a new cluster. You can
uninstall your present cluster and install the new cluster.
Can we create a Hadoop cluster from scratch?
Yes, we can do that also, once we are familiar with the Hadoop environment.
Can we use Windows for Hadoop?
Actually, Red Hat Linux and Ubuntu are the best operating systems for Hadoop. Windows is not used
frequently for installing Hadoop, as there are many support problems attached to Windows. Thus,
Windows is not a preferred environment for Hadoop.
Got a question for us? Please mention it in the comments section and we will get back to you.
About Priyanka (13 Posts)
2 Comments
Awadhesh (9 months ago): Please update the questions for Hadoop 2.0 also. Some of the questions are specific to
Hadoop 1.0, where Tasktracker and Jobtrackers are mentioned.
Edureka Support (Mod, 8 months ago, in reply to Awadhesh): Hi Awadhesh, thanks for commenting. We will take your suggestion into
consideration.