Speech recognition for Asterisk

Speech recognition script for Asterisk that uses the Google speech API.
This AGI script uses Google's speech recognition engine to render speech to text and return it to the dialplan as an Asterisk channel variable. See README for a complete list of supported languages.

Dependencies
Perl: The Perl Programming Language
perl-libwww: The World-Wide Web library for Perl
flac: Free Lossless Audio Codec
Internet access in order to contact Google and get the speech data.

Install
To install, copy speech-recog.agi to your agi-bin directory. This is usually /var/lib/asterisk/agi-bin/. To make sure, check your /etc/asterisk/asterisk.conf file.
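As a rough aid for the check described above, the following Python sketch looks up the agi-bin directory in the text of an asterisk.conf file. It assumes the directory is set with the astagidir option, as in a stock asterisk.conf; the function name find_agi_dir and the fallback value are illustrative only and are not part of this project.

```python
import re

# Fallback taken from the usual location mentioned in the Install notes.
DEFAULT_AGI_DIR = "/var/lib/asterisk/agi-bin"


def find_agi_dir(conf_text, default=DEFAULT_AGI_DIR):
    """Return the astagidir value found in asterisk.conf text, or the default.

    Assumption: the option is written as 'astagidir => /path' (or with '=').
    """
    for raw_line in conf_text.splitlines():
        # Everything after ';' is a comment in Asterisk config files.
        line = raw_line.split(";", 1)[0].strip()
        match = re.match(r"astagidir\s*=>?\s*(\S.*)$", line)
        if match:
            return match.group(1).strip()
    return default
```

To use it, read /etc/asterisk/asterisk.conf into a string and pass that string to find_agi_dir; the result is the directory where speech-recog.agi should be copied.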

Usage
agi(speech-recog.agi,[lang],[timeout]): Records from the current channel until the pound key (#) is pressed or the timeout (10 seconds by default, -1 for no timeout) is reached. The recording is sent to Google's speech recognition service and the returned text string is assigned as the value of the channel variable 'utterance'.

The script sets the following channel variables:
status: Return status. 0 means success, non-zero values indicate different errors.
utterance: The generated text string.
confidence: A value between 0 and 1 indicating the probability of a correct recognition. Values greater than 0.95 usually mean that the resulting text is correct.
id: An id string that Google's engine returns, not very useful(?).

Asterisk dialplan examples:
In these examples the googletts.agi script is used for speech synthesis:
;;Simple speech recognition
exten => 1234,1,Answer()
exten => 1234,n,agi(speech-recog.agi,en-US)
exten => 1234,n,Verbose(1,The text you just said is: ${utterance})
exten => 1234,n,Verbose(1,The probability to be right is: ${confidence})
exten => 1234,n,Hangup()

;;Speech recognition demo:
exten => 1235,1,Answer()
exten => 1235,n,agi(googletts.agi,"Say something in English, when done press the pound key.",en)
exten => 1235,n(record),agi(speech-recog.agi,en-US)
exten => 1235,n,Verbose(1,Script returned: ${status} , ${id} , ${confidence} , ${utterance})
;Check return status:
exten => 1235,n,GotoIf($["${status}" = "0"]?success:fail)
;Check the probability of a successful recognition:
exten => 1235,n(success),GotoIf($["${confidence}" > "0.8"]?playback:retry)
;Playback the text:
exten => 1235,n(playback),agi(googletts.agi,"The text you just said was...",en)
exten => 1235,n,agi(googletts.agi,"${utterance}",en)
exten => 1235,n,goto(end)
;Retry in case speech recognition wasn't successful:
exten => 1235,n(retry),agi(googletts.agi,"Can you please repeat more clearly?",en)
exten => 1235,n,goto(record)
exten => 1235,n(fail),agi(googletts.agi,"Failed to get speech data.",en)
exten => 1235,n(end),Hangup()
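The branching in the 1235 demo can be summarized outside the dialplan. The Python sketch below is only an illustration of that logic, not code from this project: the function name next_step is hypothetical, the 0.8 threshold is the one used in the demo, and treating an unparseable confidence as a retry is an assumption made here.

```python
def next_step(status, confidence, threshold=0.8):
    """Map the script's channel variables to the labels used in the 1235 demo.

    Channel variables arrive as strings, so both values are parsed here.
    """
    # A status other than "0" means the script reported an error.
    if str(status).strip() != "0":
        return "fail"
    try:
        score = float(confidence)
    except (TypeError, ValueError):
        # Assumption: a missing or malformed confidence counts as low.
        return "retry"
    # Same comparison as the GotoIf in the demo: strictly greater than 0.8.
    return "playback" if score > threshold else "retry"
```

A stricter deployment could raise the threshold toward 0.95, the value the Usage notes associate with results that are usually correct.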

;;Voice dialing example
exten => 1236,1,Answer()
exten => 1236,n,agi(googletts.agi,"Please say the number you want to dial.",en)
exten => 1236,n(record),agi(speech-recog.agi,en-US)
exten => 1236,n,GotoIf($[$["${status}" = "0"] & $["${confidence}" > "0.8"]]?success:retry)
exten => 1236,n(success),goto(${utterance},1)
exten => 1236,n(retry),agi(googletts.agi,"Can you please repeat?",en)
exten => 1236,n,goto(record)
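The voice dialing example passes ${utterance} directly to goto, so the jump can only succeed when the recognized text is a valid extension. As an illustration only, the Python sketch below shows one way a helper could check the text before dialing. The function name to_extension is hypothetical, and the idea that the recognizer may return digits separated by spaces or dashes is an assumption, not documented behaviour of speech-recog.agi.

```python
import re


def to_extension(utterance):
    """Return a digits-only extension string, or None if the text is not a number."""
    # Assumption: spoken numbers may come back as "1 2 3 4" or "555-0100".
    compact = re.sub(r"[\s\-]+", "", utterance or "")
    # Accept only ASCII digits; anything else is rejected.
    return compact if re.fullmatch(r"[0-9]+", compact) else None
```

A caller would dial only when the result is not None and otherwise fall back to the retry prompt.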

Under the folder wolfram you can find a sample AGI script that, in combination with speech-recog.agi, sends queries to WolframAlpha and returns the answers as a dialplan variable that can be read back to the user. See wolfram/README for details. A dialplan example follows where you can dictate your question to WolframAlpha and listen to the answer on your phone.
;WolframAlpha query demo:
exten => 1237,1,Answer()
exten => 1237,n,agi(googletts.agi,"What is your question?",en)
;;Record the question and render it to text:
exten => 1237,n(record),agi(speech-recog.agi,en-US)
exten => 1237,n,GotoIf($[$["${status}" = "0"] & $["${confidence}" > "0.8"]]?success:retry)
;;Submit the question to wolfram:
exten => 1237,n(success),agi(wolfram.agi,"${utterance}")
;;Playback the answer:
exten => 1237,n,agi(googletts.agi,"${wolfram_answer}",en)
exten => 1237,n,goto(end)
;;Retry in case speech recognition wasn't successful:
exten => 1237,n(retry),agi(googletts.agi,"Can you please repeat more clearly?",en)
exten => 1237,n,goto(record)
exten => 1237,n(end),Hangup()

License
The speech-recog script for Asterisk is distributed under the GNU General Public License v2.

Authors
Lefteris Zafiris (zaf.000@gmail.com)

Download
You can get the latest stable version here. Development snapshots are available in either zip or tar formats. You can also clone the project with Git by running:
$ git clone git://github.com/zaf/asterisk-speech-recog

Links
GoogleTTS text to speech script for Asterisk
Asterisk Flite text to speech module
Asterisk e-Speak text to speech module

Get the source code on GitHub: zaf/asterisk-speech-recog
