Remote Debugging of Hadoop Job with Eclipse
April 5, 2013 by Pravin Chavan in Hadoop.
Introduction:
When we create a MapReduce application in Java and run the generated jar on the Hadoop platform, we may need to debug that application remotely at runtime.
Hadoop runs in 3 modes:
1) Standalone
2) Pseudo-distributed
3) Fully distributed
It is possible to debug a Hadoop MapReduce app in all three modes.
It is also possible to debug the Mapper and Reducer tasks, which are executed in containers (running YarnChild.java, part of the Hadoop framework) when a job is submitted. For these, please refer to "Debugging Child Process in Hadoop" at the end of this post.
Scenario:
You have a virtual machine on which Hadoop is installed, and you want to debug the MapReduce app from Eclipse on Windows, on the same machine or on another machine.
To achieve this:
1) Modify the conf/hadoop-env.sh file in the Hadoop installation directory.
# cd /root/hadoop/hadoop-1.0.4/conf
Open hadoop-env.sh and add the following line:
export HADOOP_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5432 $HADOOP_OPTS"
jdwp is the Java Debug Wire Protocol.
suspend=y makes the JVM pause at startup and wait until a debugger is attached, so no code runs before your breakpoints are set.
address=<PORT> is the port on which the Hadoop JVM will listen for the debugger.
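The Hadoop scripts may source hadoop-env.sh more than once, in which case an unconditional export can put two jdwp options on the java command line. The following is a minimal sketch of a guard that could be used in place of the plain export line. The names add_jdwp_once and DEBUG_PORT are hypothetical helpers for illustration, not Hadoop settings.

```shell
# Hypothetical variable: the debug port, defaulting to the one used in this post.
DEBUG_PORT="${DEBUG_PORT:-5432}"

# add_jdwp_once OPTS PORT: print OPTS with the JDWP agent prepended,
# unless OPTS already contains a jdwp option.
add_jdwp_once() {
  case "$1" in
    *jdwp*)
      # Agent already present: leave the options unchanged.
      printf '%s\n' "$1"
      ;;
    *)
      # ${1:+ $1} appends the old options only when they are non-empty.
      printf '%s\n' "-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=$2${1:+ $1}"
      ;;
  esac
}

export HADOOP_OPTS="$(add_jdwp_once "$HADOOP_OPTS" "$DEBUG_PORT")"
```

Sourcing the file a second time then leaves HADOOP_OPTS as it was after the first pass.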
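2) Now run the job on Hadoop.
# hadoop jar /root/hadoop/app/WordCount.jar /root/hadoop/app/input/file1 /root/hadoop/app/output/file1
Because of suspend=y, the command prints "Listening for transport dt_socket at address: 5432" and waits for a debugger to attach.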
3) Now go to the Windows machine or other VM where Eclipse is installed.
a) You should have the same MapReduce project in your Eclipse workspace.
b) Right-click on the project > Debug As > Debug Configurations > Remote Java Application
i) Browse to the project in the workspace.
ii) In the Host field, specify the IP of the VM where you are running Hadoop.
iii) Set the port number equal to the port number set in the HADOOP_OPTS value in hadoop-env.sh.
We had set
HADOOP_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5432 $HADOOP_OPTS"
so in this case set port = 5432 in the debug configuration.
Click Debug to start debugging.
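To avoid a mismatch between the port in hadoop-env.sh and the one typed into Eclipse, the port can be read back from the option string on the Hadoop VM. This is a minimal sketch, assuming the address is a bare port number as in this post. The name jdwp_port is a hypothetical helper, not part of Hadoop.

```shell
# jdwp_port OPTS: print the port that follows "address=" in a JVM option string.
# Assumes a bare numeric port (no "host:" prefix), as used in this post.
jdwp_port() {
  printf '%s\n' "$1" | sed -n 's/.*address=\([0-9][0-9]*\).*/\1/p'
}

# Example with the option string from step 1; prints 5432.
jdwp_port "-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5432 -Xmx1g"
```

If the string has no address= part, the function prints nothing, which is a hint that the debug option was never applied.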
* Debugging Hadoop core components
1) Modify the file $HADOOP_HOME/etc/hadoop/yarn-env.sh. Add the following line.
YARN_OPTS="$YARN_OPTS -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=51234"
Add the following lines in the file $HADOOP_HOME/etc/hadoop/mapred-site.xml, inside the <configuration> block. This enables the YARN framework, so the job will run on YARN.
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
2) Execute the following commands:
$HADOOP_HOME/sbin/start-dfs.sh
$HADOOP_HOME/sbin/start-yarn.sh
Follow the same steps as we did for debugging a MapReduce job, using port 51234 in the debug configuration.
Debugging Child Process in Hadoop
1. Set the following property in mapred-site.xml:
<property>
<name>mapred.child.java.opts</name>
<value>-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5432</value>
</property>
2. Follow the steps above to debug.
3. You can also debug child processes (Mapper/Reducer) in a fully distributed Hadoop cluster, but you do not know in advance on which datanode the current Mapper/Reducer task is running. You need to find it by trying the IPs of the datanodes with the configured port (5432) in Eclipse remote debugging.
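Trying each datanode by hand in Eclipse can be slow, so the search in step 3 could be scripted first. The following is a minimal sketch, assuming bash (for its /dev/tcp redirection) and the coreutils timeout command are available on the machine running the check. The function names and the IP addresses in the usage line are hypothetical.

```shell
# probe_port HOST PORT: succeed if a TCP connection to HOST:PORT can be opened.
# Relies on bash's /dev/tcp redirection and the coreutils `timeout` command.
probe_port() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# find_debug_host PORT HOST...: print the first host that accepts a
# connection on PORT; return non-zero if none does.
find_debug_host() {
  port="$1"
  shift
  for host in "$@"; do
    if probe_port "$host" "$port"; then
      printf '%s\n' "$host"
      return 0
    fi
  done
  return 1
}

# Usage with hypothetical datanode addresses; substitute your own:
# find_debug_host 5432 192.168.1.11 192.168.1.12 192.168.1.13
```

This only checks that the port is open. It does not confirm that a JDWP agent is behind it, so the host it prints is a candidate to try in the Eclipse debug configuration.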
11 responses to Remote Debugging of Hadoop Job with Eclipse
1. Lorena, May 21, 2013 at 2:25 am
Hi: I'm running Hadoop 1.0.4 on Mac OS 10.6.8. If I add this line to hadoop-env.sh:
export HADOOP_OPTS=-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5432 $HADOOP_OPTS
and then start all services using bin/start-all.sh, I get the following error for each service (datanode, namenode, jobtracker, etc.):
ERROR: Cannot load this JVM TI agent twice, check your java command line for duplicate jdwp options. Error occurred during initialization of VM
Any suggestions?
Thanks in advance, Lorena
2. Pravin Chavan, May 21, 2013 at 6:16 am
You don't even need to restart the services; you can set HADOOP_OPTS and debug while the services are running. And when restarting, just comment out the HADOOP_OPTS line.
3. Jun Il Park, October 12, 2013 at 7:56 pm
Hey.. this seems not to be working in pseudo-distributed mode.
Should I set debug mode on the Hadoop JVM by using a command like
java -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8000
?
4. Pravin Chavan, October 14, 2013 at 6:54 am
See, we are setting this option on the Hadoop JVM (the JVM in which the map/reduce tasks run). These JVMs are created at runtime, for each task, while the job is running, so I don't think we can set remote debugging options from the command line. You need to set the following option in mapred-site.xml:
<property>
<name>mapred.child.java.opts</name>
<value>-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5432</value>
</property>
5. akashmahakode, October 20, 2013 at 4:49 am
Hi, I connect to my VM using a public key,
i.e. I ssh to my VM where Hadoop is installed as
ssh -i key.pem ubuntu@vm_ip. How can I provide the public key in Eclipse debug mode?
6. Pravin Chavan, October 21, 2013 at 9:24 am
Hey, for debugging you just need the IP address of the VM on which the job is running and the configured port number; you don't need to put the public key in Eclipse.
akashmahakode, October 21, 2013 at 9:40 am
But my VM does not allow authentication without a public key.
Pravin Chavan, October 21, 2013 at 9:45 am
See, what happens in remote debugging is that your VM listens on one port number. For that we don't need a public key; we don't need to log in to that VM.
Please post your exact errors and screenshots of what is happening.
7. akashmahakode, October 21, 2013 at 6:05 pm
Hi, thanks for your reply. It is working now.
8. akashmahakode, October 21, 2013 at 6:41 pm
What do you mean by debugging core components? I configured it and I'm using MapReduce 1. So, when my code reaches
status = jobSubmitClient.submitJob(
jobId, submitJobDir.toString(), jobCopy.getCredentials());
in JobClient.java, I am not able to see what's happening inside this submitJob() method. Is there anything that I missed?
9. Yarn exporter, December 16, 2013 at 12:26 pm
A round of applause for your blog post. Really great.