Presentation Flow
Introduction
Related Work
Various Approaches
Various Modules
Working
Performance Parameters Of The System
Applications
Advantages and Disadvantages
Future Scope
Conclusion
References
Introduction
Introduction and Motivation
History
Traditional Method
Need?
Related Work
International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 Vol. 3, Issue 1, January-February 2013, pp. 253-258
International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com Vol. 2, Issue 2, Mar-Apr 2012, pp. 1126-1128
Literature Survey
Association for International Journal in Computer Science & Electronics Volume I Reference ID: aijcse2005.
Various Approaches
Template-Based Approaches
Knowledge-Based Approaches
Neural Network-Based Approaches
Hidden Markov Model (HMM)-Based Speech Recognition
Knowledge-based approach: expert knowledge about variations in speech is hand-coded into a system. The system uses a set of features from the speech, and the training system then generates a set of production rules automatically from the samples. This has the advantage of explicitly modeling variations in speech, but such expert knowledge is difficult to obtain and use successfully, so this approach was judged impractical and automatic learning procedures were sought instead.
Neural network-based approach: neural networks are capable of solving much more complicated recognition tasks, but they do not scale as well as Hidden Markov Models (HMMs) when it comes to large vocabularies. A neural network (NN) is an interconnected group of natural or artificial neurons that uses a mathematical or computational model for information processing.
A pattern representative is created using one or more test patterns, and HMM-based recognition is then used. Recognition, or pattern classification, is the process of comparing the unknown test pattern with each sound class reference pattern and computing a measure of similarity (distance) between the test pattern and each reference pattern. A Hidden Markov Model is used for speech recognition, which converts the speech to text.
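The comparison step described above can be sketched as a nearest-reference search. This is only an illustration: the Euclidean distance, the three-element feature vectors, and the names `euclidean_distance`, `classify`, and `references` are assumptions made for the example, not details taken from the presentation.

```python
import math


def euclidean_distance(a, b):
    """Distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(test_pattern, reference_patterns):
    """Return (label, distance) of the reference pattern closest to the test pattern."""
    best_label, best_distance = None, float("inf")
    for label, reference in reference_patterns.items():
        d = euclidean_distance(test_pattern, reference)
        if d < best_distance:
            best_label, best_distance = label, d
    return best_label, best_distance


# Hypothetical reference patterns, one per sound class.
references = {
    "yes": [1.0, 0.0, 0.5],
    "no": [0.0, 1.0, 0.5],
}

label, distance = classify([0.9, 0.1, 0.5], references)
print(label)  # yes
```

A smaller distance means a higher similarity, so the class whose reference pattern is closest to the test pattern is chosen.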
Various Modules
1. Speech Recognition
2. Speech Preprocessing
3. HMM Training
1. SPEECH RECOGNITION
Speech samples are obtained from the speaker in real time. Speech recognition requires a microphone. There is a need to store the samples of different users to make
2. SPEECH PREPROCESSING
Speech captured in real time contains background noise that must be removed to obtain noise-free speech signals. Preprocessing reduces the effort required in the following stages. The input to speech preprocessing is the speech signal, which is converted into speech frames that give a unique sample.
1. The system divides the speech signal into overlapped frames.
2. The system checks for voice activity using endpoint detection and energy threshold calculations.
3. The speech samples are then passed through a pre-emphasis filter.
4. The frames with voice activity are passed through a Hamming window, which minimizes signal discontinuities at the beginning and end of each frame.
5. The system performs autocorrelation analysis on each frame.
6. The system finds linear predictive coding (LPC) coefficients using the Levinson-Durbin algorithm.
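The preprocessing steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the presented system's implementation: the frame length, hop size, LPC order, pre-emphasis coefficient, and energy threshold are assumed values, all function names are hypothetical, and pre-emphasis is applied per frame for simplicity.

```python
import math


def split_frames(samples, frame_len, hop):
    """Overlapped frames: consecutive frames share frame_len - hop samples."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def energy(frame):
    return sum(x * x for x in frame)


def pre_emphasis(frame, alpha=0.95):
    """y[n] = x[n] - alpha * x[n-1]; alpha is an assumed typical value."""
    return [frame[0]] + [frame[n] - alpha * frame[n - 1]
                         for n in range(1, len(frame))]


def hamming(frame):
    """Taper the frame edges to reduce discontinuities."""
    span = max(len(frame) - 1, 1)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / span))
            for i, x in enumerate(frame)]


def autocorrelation(frame, max_lag):
    return [sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Predictor coefficients a[1..order] with x[n] ~ sum(a[k] * x[n-k])."""
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0:  # degenerate frame: stop refining
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= 1 - k * k
    return a[1:], error


def lpc_features(samples, frame_len=240, hop=80, order=10, energy_threshold=1e-3):
    """One LPC coefficient vector per frame that passes the energy check."""
    features = []
    for frame in split_frames(samples, frame_len, hop):   # step 1
        if energy(frame) < energy_threshold:              # step 2
            continue
        frame = hamming(pre_emphasis(frame))              # steps 3 and 4
        r = autocorrelation(frame, order)                 # step 5
        coeffs, _ = levinson_durbin(r, order)             # step 6
        features.append(coeffs)
    return features


# Synthetic input: 400 samples of silence followed by a sine tone.
signal = [0.0] * 400 + [math.sin(2 * math.pi * 0.05 * n) for n in range(800)]
features = lpc_features(signal)
print(len(features), len(features[0]))
```

The silent frames at the start are dropped by the energy check, so only frames that contain part of the tone produce coefficient vectors.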
3. HMM TRAINING
Training involves creating a pattern representative of the features of a class using one or more test patterns. A model commonly used for speech recognition is the HMM, a statistical model that describes an unknown system using an observed output sequence.
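To illustrate how an HMM relates an observed output sequence to a model, the sketch below scores a sequence against two already-trained discrete HMMs with the forward algorithm and picks the more likely one. Parameter estimation itself (for example with Baum-Welch) is not shown. The two models, their probabilities, and the names `forward_likelihood` and `models` are hypothetical values chosen for the example.

```python
def forward_likelihood(observations, start_prob, trans_prob, emit_prob):
    """P(observations | model) for a discrete HMM, via the forward algorithm."""
    n_states = len(start_prob)
    # alpha[s] = probability of the observations so far, ending in state s
    alpha = [start_prob[s] * emit_prob[s][observations[0]]
             for s in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[p] * trans_prob[p][s] for p in range(n_states))
            * emit_prob[s][obs]
            for s in range(n_states)
        ]
    return sum(alpha)


# Two hypothetical left-to-right word models over symbols 0 and 1.
start = [1.0, 0.0]
trans = [[0.5, 0.5],
         [0.0, 1.0]]
models = {
    "A": {"start": start, "trans": trans, "emit": [[0.9, 0.1], [0.1, 0.9]]},
    "B": {"start": start, "trans": trans, "emit": [[0.1, 0.9], [0.9, 0.1]]},
}

observed = [0, 0, 1, 1]
scores = {
    name: forward_likelihood(observed, m["start"], m["trans"], m["emit"])
    for name, m in models.items()
}
best = max(scores, key=scores.get)
print(best)  # A
```

In a recognizer built this way, each word or sound class would have its own model, and the class whose model gives the highest likelihood is reported as text.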
Working
Speech Recognition -> Speech Processing -> Noise Free Data -> HMM Training -> Text Storage
Performance Parameters Of The System
Word error rate: WER = (S + D + I) / N
where S is the number of substitutions, D is the number of deletions, I is the number of insertions, and N is the number of words in the reference.
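As a worked example of the word error rate, the sketch below computes the minimum number of substitutions, deletions, and insertions with a word-level edit distance and divides it by the number of reference words. The function name `word_error_rate` and the sample sentences are made up for illustration.

```python
def word_error_rate(reference, hypothesis):
    """WER = (S + D + I) / N, using the minimum word-level edit distance."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # dist[i][j] = minimum edits needed to turn ref[:i] into hyp[:j]
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        dist[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # match or substitution
            )
    return dist[len(ref)][len(hyp)] / len(ref)


# One deletion ("a"), one substitution ("john" -> "joan"), one insertion ("now").
wer = word_error_rate("send a message to john", "send message to joan now")
print(wer)  # 0.6
```

Here S = 1, D = 1, I = 1, and N = 5, so WER = 3 / 5 = 0.6.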
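Real-time factor: RTF = P / I
where P is the time taken to process the input and I is the duration of the input (not the insertion count used in the WER formula).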
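A small worked example of the real-time factor follows, assuming the usual reading of the formula in which P is the time taken to process the input and I is the duration of the input. The function name and the numbers are hypothetical.

```python
def real_time_factor(processing_seconds, input_seconds):
    """RTF = P / I; values below 1 mean faster than real time."""
    if input_seconds <= 0:
        raise ValueError("input duration must be positive")
    return processing_seconds / input_seconds


# A 4-second utterance recognized in 2 seconds.
rtf = real_time_factor(processing_seconds=2.0, input_seconds=4.0)
print(rtf)  # 0.5
```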
Applications
1. Sending Message
2. Email
3. Web Search
4. Voice Dial
5. Data Entry
Advantages
Natural way of interaction: it is not necessary to sit at a keyboard.
Faster data processing.
No training required for users.
Useful for people with physical disabilities.
Google's speech recognition engine can be used to save memory.
Disadvantages
If there is noise or some other sound in the room (e.g. the television or a kettle boiling), the number of errors will increase.
The microphone must be close to the user; more distant microphones (e.g. on a table or wall) tend to increase the number of errors.
Preprocessing is required after the speech is acquired.
The system requires voice training so that it recognizes the correct voice of that user.
A permanent Internet connection is needed.
Future Scope
Further work is planned to implement the speech recognition model for different languages.
Conclusion
References
[1] B. Raghavendhar Reddy, E. Mahender: Speech to Text Conversion using Android Platform, International Journal of Engineering Research and Applications (IJERA), ISSN: 2248-9622, Vol. 3, Issue 1, January-February 2013, pp. 253-258.
[2] Ms. Anuja Jadhav, Prof. Arvind Patil: Android Speech to Text Converter for SMS Application, IOSR Journal of Engineering, Mar. 2012, Vol. 2(3), pp. 420-423.
[3] Jagriti Chand: Sms Application Using Speech To Text Convertor In Android Mobiles, Volume I, Issue I, Reference Id: Aijcse2005.
[4] M. Bacchiani, F. Beaufays, J. Schalkwyk, M. Schuster, and B. Strope: Deploying GOOG-411: Early lessons in data, measurement, and testing. In Proceedings of ICASSP, pages 5260-5263, 2008.
Questions???
Thank You.