Control of a Robot by Voice Input
Artificial Intelligence ME768, Jan-Apr 2000
CONTROL OF A ROBOT BY VOICE INPUT
Submitted by
Gatram Pradeep (97131)
Shalabh Gupta (97319)
IIT Kanpur: February 2000
What are we doing
Our motivation behind it
The example: Our task
Sample Input Output
Past Work
Proposed methodology
Results
Conclusions
Applications
Links on the web
Source code
Bibliography
INTRODUCTION
Suppose you want to control a menu-driven system. What is the most
striking property that you can think of?
Well, the first thought that came to our mind is that the range of inputs in a
menu-driven system is limited. In fact, by using a menu all we are doing is
limiting the input domain space. This characteristic can be very
useful in implementing the menu in standalone systems. For example, think of
the Pine menu or a washing machine menu. How many distinct commands do
they require?
MOTIVATION
Last year we both participated in robocarromines (Techkriti '99). We were
using switches to control the various motions of our robots. Then Shalabh
participated in sumo fighting (Yantriki '99) using a fully autonomous version of
the robot NINJA. We felt the next logical step was either to go for some sort of
wireless control mechanism or to design a voice-based control system. We decided
to go for the latter.
Also, a dancing robot competition is being organized by Ingenuity cell at
Techkriti Millennium, in which the robots have to dance to the tune of the music
being played. This event was the one which got us to think about the concept of a
voice-controlled robot.
We are not aiming to build software which can recognize a lot of words.
Our basic idea is to develop some sort of menu-driven control for our robot,
where the menu is going to be voice driven. A recognition vocabulary of a few
words would do for such jobs. A person interacting with such a system
would not need to use his hands for routine jobs, which is what we wish to
achieve. This leads us to our main task in the project.
THE TASK
What we are aiming at is to control the robot Ninja using voice commands.
Ninja can do these basic tasks:
1. move forward
2. move back
3. turn right
4. turn left
5. hit
6. stop (stops doing the current job)
This can be considered as a small menu consisting of 6 commands. So
software which can recognize and distinguish the 6 commands from one another
will do the job. What needs to be developed is software which takes voice data as
input & outputs the matched command.
SAMPLE INPUT OUTPUT
INPUT (Speaker speaks)     OUTPUT (Robot does)
forward                    moves forward
back                       moves back
right                      turns right
left                       turns left
hit                        hits the coin
stop                       stops doing current task
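The table above can be read as a simple lookup from a recognized word to a robot action. The following is only an illustrative sketch: the names `COMMANDS` and `robot_action` are hypothetical and do not come from the project's source code.

```python
# Hypothetical command table built from the sample input/output above:
# recognized word -> what the robot does.
COMMANDS = {
    "forward": "moves forward",
    "back": "moves back",
    "right": "turns right",
    "left": "turns left",
    "hit": "hits the coin",
    "stop": "stops doing current task",
}


def robot_action(word):
    """Return the action for a recognized word, or None if it is not in the menu."""
    return COMMANDS.get(word.strip().lower())


if __name__ == "__main__":
    for spoken in ("forward", "Hit", "jump"):
        print(spoken, "->", robot_action(spoken))
```

Because the menu is closed, any word outside the six entries is simply rejected rather than guessed at.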
PAST WORK
A lot of work has been done earlier in the field of isolated word recognition.
Using a traditional recognizer, an accuracy of around 60% has previously been
obtained for both a 156 town name task and a 1108 road name task. Techniques
presented in [Azzopardi/Semnani_et_al:1998] have resulted in an accuracy of 90%
for an automated corporate directory system with 120,000 entries.
As an input method for rapidly spreading small portable information
devices and advanced robotics applications, speaker-independent
speech recognition technology which can be embedded on a single
DSP chip has been developed by [Hoshimi/Yamada_et_al:1998]. When the
newly proposed noise robustness method was tested with 100-isolated-word
vocabulary speech of 50 subjects, a recognition accuracy of 94.7% was obtained
under various noisy environments.
Software engineering for research and development in the area of signal
processing is by no means unimportant. A programming paradigm which allows
software components to be combined with each other in a way
that recalls the concept of hardware plug and play, without the need for
incorporating complex schedulers to control data flows, has been developed by
[Dutoit/Shroeter:1998].
Earlier similar work in a limited input domain was done using wireless control,
e.g. remote control of electrical switches (this is currently one of the Ingenuity
problems). We read a newspaper report about a year ago (The Hindu: Thursday
Science & Technology Section) about such a project. A suggested application was
for hospitalized patients, who are usually dependent on someone else to
switch the lights, fan, etc. on or off. But what if the patient's hands are broken?
Obviously a voice-based system ought to be used in such a case.
METHODOLOGY
We are taking the voice data from the microphone using a sound card. This
data is stored in an array. This array is passed on to a function which extracts
words from the array (i.e. spoken words are extracted & quiet periods are
dumped). These words are then sent to a function which extracts frequency as a
function of time. This is the frequency vector of the spoken word. This vector is
compared with reference vectors. The comparison is done using the standard
inner product of two vectors. One of the reference vectors would match (i.e. the
inner product in this case is greater than for the other 5). The command
corresponding to this reference vector is fed to Ninja. The electronic circuit
mounted on Ninja would then interpret the command & move it accordingly.
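The steps above (dump quiet periods, build a frequency vector, pick the reference with the largest inner product) can be sketched roughly as follows. This is not the project's code, and several details are assumptions: words are found with a mean-amplitude threshold per frame, frequency is estimated from zero crossings, the vector length of 39 is borrowed from the matrix size quoted in the results, and the inner product is normalised by the vector lengths so that a loud or high-pitched reference does not win automatically. The report does not say how any of these were done. The `chirp` helper is purely a synthetic stand-in for recorded speech.

```python
import math


def extract_words(samples, frame_len=80, threshold=0.02, min_frames=3):
    """Keep runs of loud frames as words and dump the quiet frames between them."""
    words, current = [], []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        if sum(abs(s) for s in frame) / frame_len > threshold:
            current.extend(frame)          # still inside a spoken word
        else:
            if len(current) >= min_frames * frame_len:
                words.append(current)      # word ended; very short blips are dropped
            current = []
    if len(current) >= min_frames * frame_len:
        words.append(current)
    return words


def frequency_vector(word, rate=8000, n_points=39):
    """Estimate frequency (Hz) in n_points equal time slices from zero crossings."""
    step = len(word) / n_points
    vec = []
    for i in range(n_points):
        chunk = word[int(i * step):int((i + 1) * step)]
        crossings = sum(1 for a, b in zip(chunk, chunk[1:]) if (a < 0) != (b < 0))
        duration = len(chunk) / rate
        vec.append(crossings / (2 * duration) if duration else 0.0)
    return vec


def inner(u, v):
    """Standard inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def match(vec, references):
    """Return the command whose reference has the largest normalised inner product."""
    def score(ref):
        norm = math.sqrt(inner(vec, vec) * inner(ref, ref))
        return inner(vec, ref) / norm if norm else 0.0
    return max(references, key=lambda cmd: score(references[cmd]))


def chirp(f0, f1, n_samples, rate=8000, amp=0.5):
    """Hypothetical test signal: a tone sweeping linearly from f0 to f1 Hz."""
    out, phase = [], 0.0
    for i in range(n_samples):
        out.append(amp * math.sin(phase))
        phase += 2 * math.pi * (f0 + (f1 - f0) * i / n_samples) / rate
    return out


if __name__ == "__main__":
    # Synthetic sweeps stand in for recorded reference words.
    references = {
        "forward": frequency_vector(chirp(300, 1200, 2400)),
        "back": frequency_vector(chirp(1200, 300, 2400)),
        "stop": frequency_vector(chirp(750, 750, 2400)),
    }
    recording = [0.0] * 800 + chirp(1200, 300, 2880, amp=0.3) + [0.0] * 800
    for word in extract_words(recording):
        print(match(frequency_vector(word), references))
```

Slicing every word into the same number of points means a word spoken slightly faster or slower still yields a vector of the same length, which is what makes a direct inner product possible.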
RESULTS
Data acquisition using a microphone & sound card has been successful.
Data acquired has been segmented into separate words & quiet periods are
being dumped.
Frequency vectors of the words have been generated.
The reference matrix has been generated using data acquired from 10
speakers in all. We used 12 sets of 6 vectors. But we got very bad results.
The probable reason was that the size of the matrix is 60x39. So around 1000
data sets might have resulted in a better performance with the matrix. But
generating such a huge amount of data was not possible for us.
We got around 85% recognition rate for a single user. The system did not
work for multiple users, i.e. it was user dependent. The performance was
best when reference files of one person only were included.
CONCLUSIONS
In this project we are getting a user-dependent isolated word recognition
system with a recognition accuracy of about 85% using six words. The accuracy
can be improved further and the system can be used for a larger number of words if
the noise conditions are improved during the training of the system.
We were getting a peak SNR (signal-to-noise ratio) of about 20 dB, whereas
in the best conditions an SNR of up to 35 dB can be obtained at 8 kHz sampling rate.
Also, the microphone used by us was not filtering out the bursts of air produced
when we speak words, which was adding a lot of noise to the input voice
signal.
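To put the two SNR figures in perspective, a decibel value can be converted back to a plain power ratio. The sketch below assumes the usual definition of SNR as the ratio of mean signal power to mean noise power, expressed as 10 log10 of that ratio; the report does not state how its peak SNR was measured, and the function names are hypothetical.

```python
import math


def snr_db(signal, noise):
    """SNR in dB from the mean power of a speech segment and of a noise-only segment."""
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    return 10.0 * math.log10(p_signal / p_noise)


def db_to_power_ratio(db):
    """Convert a decibel value back to a signal-to-noise power ratio."""
    return 10.0 ** (db / 10.0)


if __name__ == "__main__":
    print(db_to_power_ratio(20))   # power ratio at 20 dB
    print(db_to_power_ratio(35))   # power ratio at 35 dB
```

Under this definition 20 dB corresponds to a power ratio of 100 and 35 dB to a ratio of roughly 3160, so relative to the signal the noise power in our recordings was about 30 times higher than in the best case.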
But for a speaker-independent word recognition system, we cannot use the
technique discussed here. The frequency scales, speed of speaking words, and
signal power concentration on different syllables vary widely from speaker to
speaker (as depicted by the variations in the frequency and amplitude graphs for
the same words in the methodology section). Thus for speaker-independent systems,
we must use a better approach such as Markov chain modeling.
APPLICATIONS
We believe such a system would find a wide variety of applications. Menu-driven
systems such as email readers, household appliances like washing
machines and microwave ovens, and pagers and mobiles will become voice
controlled in future. Our project may find applications there because
the number of possible inputs is inherently limited. Using our software these can
be controlled through a network as well.
WEB LINKS
OSS
We have used OSS (Open Sound System) to read data from the sound card
on a Linux machine. This site has very good reference material for people
interested in using sound cards for their programs on Linux.
Signal Representation
This page has very good information about representation of speech signals
in discrete format.
Isolated Word Recognition
This page has links to lots of papers on speech recognition using isolated
words, just what we needed.
SOURCE CODE
BIBLIOGRAPHY
This proposal was prepared by Gatram Pradeep and Shalabh Gupta as a part of
the project component in the Course on Artificial Intelligence in Engineering in
the JAN semester of 2000.
(Instructor: Amitabha Mukerjee)
[COURSE WEB PAGE] [COURSE PROJECTS 2000 (local CC users)]
http://home.iitk.ac.in/~amit/courses/768/00/gatram/