
4/20/2015

Control of a Robot by Voice Input


Artificial Intelligence ME768 Jan-Apr 2000


CONTROL OF A ROBOT BY VOICE INPUT


Submitted by


Gatram Pradeep (97131)
Shalabh Gupta (97319)


IIT Kanpur: February 2000


What are we doing
Our motivation behind it
The example: Our task
Sample Input Output
Past Work
Proposed methodology
Results
Conclusions
Applications
Links on the web
Source code
Bibliography
INTRODUCTION
Suppose you want to control a menu driven system. What is the most
striking property that you can think of?
Well, the first thought that came to our mind is that the range of inputs in a
menu driven system is limited. In fact, by using a menu all we are doing is
limiting the input domain space. Now, this is one characteristic which can be
very useful in implementing the menu in stand-alone systems. For example, think
of the Pine menu or a washing machine menu. How many distinct commands do
they require?

MOTIVATION
Last year we both participated in robocarromines (Techkriti'99). We were
using switches to control the various motions of our robots. Then Shalabh
participated in sumo fighting (Yantriki'99) using a fully autonomous version of
the robot NINJA. We felt the next logical step was to either go for some sort of
wireless control mechanism or design a voice based control system. We decided
to go for the latter.
Also, a dancing robot competition is being organized by the Ingenuity cell at
Techkriti Millennium, in which the robots have to dance to the tune of the music
being played. This event was the one which got us thinking about the concept of
a voice controlled robot.
We are not aiming to build software which can recognize a lot of words.
Our basic idea is to develop some sort of menu driven control for our robot,
where the menu is going to be voice driven. A recognition vocabulary of a few
words would do for such jobs. A person interacting with such a system
would not need to use his hands for routine jobs, which is what we wish to
achieve. This leads us to our main task in the project.

THE TASK
What we are aiming at is to control the robot Ninja using voice commands.
Ninja can do these basic tasks:
1. move forward
2. move back
3. turn right
4. turn left
5. hit
6. stop (stops doing the current job)
This can be considered as a small menu consisting of 6 commands. So
software which can recognize and distinguish the 6 commands from one another
will do the job. The software to be developed therefore takes voice data as
input and outputs the matched command.

SAMPLE INPUT OUTPUT

INPUT (Speaker speaks)    OUTPUT (Robot does)

forward                   moves forward
back                      moves back
right                     turns right
left                      turns left
hit                       hits the coin
stop                      stops doing current task

PAST WORK
A lot of work has been done earlier in the field of isolated word recognition.
Using a traditional recognizer, an accuracy of around 60% has previously been
obtained for both a 156 town name task and a 1108 road name task. Techniques
presented in [Azzopardi/Semnani_et_al:1998] have resulted in an accuracy of 90%
for an automated corporate directory system with 120,000 entries.
As an input method for rapidly spreading small portable information
devices and advanced robotics applications, speaker independent speech
recognition technology which can be embedded on a single DSP chip has been
developed by [Hoshimi/Yamada_et_al:1998]. When the newly proposed noise
robustness method was tested with 100 isolated word vocabulary speech of
50 subjects, a recognition accuracy of 94.7% was obtained under various noisy
environments.
Software engineering for research and development in the area of signal
processing is by no means unimportant. A programming paradigm which allows
software components to be advantageously combined with each other in a way
that recalls the concept of hardware plug and play, without the need for
incorporating complex schedulers to control data flows, has been developed by
[Dutoit/Shroeter:1998].
Earlier similar work in a limited input domain was done using wireless
control, e.g. remote control of electrical switches (this is currently one of
the ingenuity problems). We read a newspaper report about a year ago (The Hindu:
Thursday Science & Technology Section) about such a project. A suggested
application was for hospitalized patients, who are usually dependent on someone
else to switch the lights, fan, etc. on and off. But what if the patient's hands
are broken? Obviously a voice based system ought to be used in such a case.

METHODOLOGY
We are taking the voice data from the microphone using a sound card. This
data is stored in an array. This array is passed on to a function which extracts
words from the array (i.e. spoken words are extracted & quiet periods are
dumped). These words are then sent to a function which extracts frequency as a
function of time. This is the frequency vector of the spoken word. This vector
is compared with the reference vectors. The comparison is done using the
standard inner product of two vectors. One of the reference vectors would match
(i.e. its inner product with the spoken word's vector is greater than that of
the other 5). The command corresponding to this reference vector is fed to
Ninja. The electronic circuit mounted on Ninja would then interpret the command
& move it accordingly.
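
The steps above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the project's actual source code:
the frame size, the energy threshold, the use of zero-crossing counts as the
frequency estimate, and the normalization of vectors before taking the inner
product are all assumptions made here, and the function names are hypothetical.
The test signals are synthetic tones, not recorded speech.

```python
import math

RATE = 8000   # samples per second (the report mentions 8 kHz sampling)
FRAME = 80    # samples per analysis frame; an assumed value (10 ms)


def segment_words(samples, threshold, frame=FRAME):
    """Keep runs of loud frames as words; drop the quiet frames between them."""
    words, current = [], []
    for start in range(0, len(samples) - frame + 1, frame):
        chunk = samples[start:start + frame]
        energy = sum(s * s for s in chunk) / frame
        if energy > threshold:
            current.extend(chunk)
        elif current:
            words.append(current)
            current = []
    if current:
        words.append(current)
    return words


def frequency_vector(word, n_slices, rate=RATE):
    """Estimate one frequency per time slice from its zero-crossing count.

    Assumption: the report does not say how frequency is extracted; zero
    crossings are used here only because they are simple to compute.
    """
    size = len(word) // n_slices
    if size < 2:
        raise ValueError("word is too short for the requested number of slices")
    vector = []
    for i in range(n_slices):
        part = word[i * size:(i + 1) * size]
        crossings = sum(1 for a, b in zip(part, part[1:]) if (a < 0) != (b < 0))
        vector.append(crossings * rate / (2 * size))
    return vector


def inner_product(u, v):
    """Standard inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(vector):
    """Scale to unit length (assumed step, so large vectors do not always win)."""
    norm = math.sqrt(inner_product(vector, vector))
    return [x / norm for x in vector] if norm else list(vector)


def match_command(vector, references):
    """Return the command whose reference has the largest inner product."""
    unit = normalize(vector)
    return max(references,
               key=lambda name: inner_product(unit, normalize(references[name])))


def tone(freq, n, rate=RATE):
    """Synthetic test signal: n samples of a sine wave at freq Hz."""
    return [math.sin(2 * math.pi * freq * t / rate) for t in range(n)]


if __name__ == "__main__":
    # Two made-up "words": one rising in pitch, one falling.
    references = {
        "forward": frequency_vector(tone(500, 400) + tone(1500, 400), 4),
        "back": frequency_vector(tone(1500, 400) + tone(500, 400), 4),
    }
    silence = [0.0] * 400
    recording = silence + tone(600, 400) + tone(1400, 400) + silence
    for word in segment_words(recording, threshold=0.01):
        print(match_command(frequency_vector(word, 4), references))
```

Note that with normalized vectors the match depends only on the shape of the
frequency contour, so two words with flat contours at different pitches would
not be told apart by this sketch.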

RESULTS
Data acquisition using a microphone & sound card has been successful.
Data acquired has been segmented into separate words & quiet periods are
being dumped.
Frequency vectors of the words have been generated.
The reference matrix has been generated using data acquired from 10
speakers in all. We used 12 sets of 6 vectors. But we got very bad results.
The probable reason was that the size of the matrix is 60x39. So around 1000
data sets might have resulted in a better performance with the matrix. But
generating such a huge amount of data was not possible for us.
We got around 85% recognition rate for a single user. The system did not
work for multiple users, i.e. it was user dependent. The performance was
best when reference files of one person only were included.

CONCLUSIONS
In this project we are getting a user dependent isolated word recognition
system with a recognition accuracy of about 85% using six words. The accuracy
can be improved further, and the system can be used for a larger number of
words, if the noise conditions are improved during the training of the system.
We were getting a peak SNR (signal to noise ratio) of about 20 dB, whereas
in the best conditions an SNR of up to 35 dB can be obtained at 8 kHz sampling
rate. Also, the microphone used by us was not filtering out the bursts of air
produced when we speak words, which was adding a lot of noise to the input
voice signal.
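
To put the two SNR figures in perspective, here is a small worked example. It
assumes the usual power-ratio definition of SNR in decibels, SNR = 10 log10
(signal power / noise power); the report does not state how its peak SNR was
measured, and the function names are hypothetical.

```python
import math


def snr_db(signal, noise):
    """SNR in dB from the mean power of speech samples and noise-only samples."""
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    return 10 * math.log10(p_signal / p_noise)


def power_ratio(db):
    """Signal-to-noise power ratio corresponding to a value in dB."""
    return 10 ** (db / 10)


if __name__ == "__main__":
    print(power_ratio(20))                    # power ratio at 20 dB
    print(power_ratio(35))                    # power ratio at 35 dB
    print(power_ratio(35) / power_ratio(20))  # improvement factor
```

Under this definition, 20 dB corresponds to a signal power 100 times the noise
power, and 35 dB to a ratio of roughly 3160, so the best case has about 30
times less noise power relative to the signal.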
But for a speaker independent word recognition system, we cannot use the
technique discussed here. The frequency scales, speed of speaking words, and
signal power concentration on different syllables vary widely from speaker to
speaker (as depicted by the variations in the frequency and amplitude graphs for
the same words in the methodology section). Thus for speaker independent
systems, we must use a better approach like Markov chain modeling etc.


APPLICATIONS
We believe such a system would find a wide variety of applications. Menu
driven systems such as e-mail readers, household appliances like washing
machines and microwave ovens, and pagers and mobiles etc. will become voice
controlled in future. Our project may find applications out there because
the number of possible inputs is inherently limited. Using our software these
can be controlled through a network as well.

WEB LINKS
OSS
We have used OSS (Open Sound System) to read data from the sound card
on a Linux machine. This site has very good reference material for people
interested in using sound cards for their programs on Linux.
Signal Representation
This page has very good information about representation of speech signals
in discrete format.
Isolated Word Recognition
This page has links to lots of papers on speech recognition using isolated
words, just what we needed.
SOURCE CODE
BIBLIOGRAPHY
This proposal was prepared by Gatram Pradeep and Shalabh Gupta as a part of
the project component in the Course on Artificial Intelligence in Engineering in
the JAN semester of 2000.
(Instructor: Amitabha Mukerjee)
[COURSE WEB PAGE] [COURSE PROJECTS 2000 (local CC users)]

http://home.iitk.ac.in/~amit/courses/768/00/gatram/
