A. Muthamizh Selvan
Research Scholar (Ph.D.) Department of Computer Science & Engg. Bharathiar University Coimbatore - 641 046.
muthamizh@ieee.org
Speech = Sound
Sound is a wave (disturbance)
Sound needs a medium to travel
Sound vibrates the air like a slinky
What is a Wave?
A wave is a disturbance traveling through a medium, by which energy is transferred from one particle of the medium to another without causing any permanent displacement of the medium itself. The medium is simply the material through which the disturbance is moving. In other words, a wave transfers energy from one point to another without transferring material between the two points.
Example
What is Sound ?
Sound is the result of a mechanical disturbance of some object in a physical medium, such as air. This mechanical disturbance generates vibrations that can be represented as electrical signals by means of a device (for example, a microphone) that converts these vibrations into a time-varying voltage.
- Eduardo Reck Miranda, University of Plymouth
Sound is a wave motion propagated in an elastic medium, traveling in both transverse and longitudinal directions, producing an auditory sensation by the change of pressure at the ear.
Sound is a wave which is created by vibrating objects and propagated through a medium from one location to another.
Characteristics of Sound
Sound is a Mechanical Wave
Sound is a Longitudinal Wave
Sound is a Pressure Wave
Mechanical Wave
A sound wave that is transported through a medium via the mechanism of particle interaction is characterized as a mechanical wave.
Example: Tuning Fork
Longitudinal Wave
Longitudinal sound waves are waves in which the motion of the individual particles of the medium is in a direction parallel (and antiparallel) to the direction of energy transport.
Example: Slinky
Figure: Longitudinal sound wave. The energy moves left and right, and the coil moves left and right (along the direction of energy transport).
Pressure Wave
A sound wave consisting of a repeating pattern of high pressure and low pressure regions moving through a medium is referred to as a pressure wave.
Figure: Pressure wave (C = compressions, R = rarefactions)
The compressions are regions of high air pressure while the rarefactions are regions of low air pressure.
Types of Sound Wave Signal
Discrete Time Signal (e.g. Speech Signal)
Continuous Time Signal (e.g. Music Signal)
A. Muthamizh Selvan, Research Scholar (Ph.D.), DCSE, Bharathiar University, Coimbatore
Praat
Analog-to-Digital
Analog-to-digital conversion of a signal takes two steps:
Sampling
Quantization (Code Word Generation)
Sampling
The sound pressure level, when picked up by a microphone, becomes an electrical (analog) signal. An Analog-to-Digital Converter (ADC) converts the electrical signal into digital form.
Quantization
The digitized signal is quantized as a sequence of numbers (code words of 4 bits or 16 bits) that represents the shape of the electrical signal, which in turn represents the shape of the sound wave.
Sample Frequency
The sample frequency of a sound is equal to the number of cycles which occur every second ("cycles per second", "cps" or "Hz").
Sample Points are 001, 100, 101, 110, 010, 001, 001, 010
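The two steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used to produce the sample points on the slide: the 1 Hz test tone, the 8 Hz sampling rate, the 3-bit code word length and the function names `quantize` and `sample_and_quantize` are all assumptions chosen for the example.

```python
import math

def quantize(sample, bits):
    """Map a sample in [-1.0, 1.0] to an unsigned integer code word of `bits` bits."""
    levels = 2 ** bits
    # Scale [-1, 1] onto [0, levels - 1] and round to the nearest level.
    code = round((sample + 1.0) / 2.0 * (levels - 1))
    # Clamp in case the input lies slightly outside [-1, 1].
    return max(0, min(levels - 1, code))

def sample_and_quantize(freq_hz, rate_hz, n_samples, bits):
    """Sample a unit-amplitude sine tone and return its code words as bit strings."""
    codes = []
    for n in range(n_samples):
        t = n / rate_hz  # sampling instant in seconds
        x = math.sin(2 * math.pi * freq_hz * t)
        codes.append(format(quantize(x, bits), f"0{bits}b"))
    return codes

# Assumed example values: 1 Hz tone, 8 samples per second, 3-bit code words.
codes = sample_and_quantize(freq_hz=1.0, rate_hz=8.0, n_samples=8, bits=3)
print(codes)
```

With 3 bits there are only 8 possible code words, so the reconstructed wave is coarse; 16-bit code words give 65536 levels and a much closer match to the analog signal.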
Sampling Rate
The frequency of this sampling process is called the sampling frequency or sampling rate, and it is measured in Hertz (Hz).
Sampling Theorem
The sampling theorem states that, in order to accurately represent a sound digitally, the sampling rate must be at least twice the highest frequency contained in the signal.
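A short sketch of what the theorem implies in practice, under assumed example values. The function names `min_sampling_rate` and `alias_frequency` are hypothetical helpers introduced here; the second one uses the standard frequency-folding rule to show which frequency a tone appears to have when it is sampled too slowly.

```python
def min_sampling_rate(highest_freq_hz):
    """Lowest sampling rate the theorem allows for a given highest frequency."""
    return 2.0 * highest_freq_hz

def alias_frequency(freq_hz, rate_hz):
    """Apparent frequency of a pure tone after sampling at rate_hz."""
    folded = freq_hz % rate_hz
    # Frequencies above rate_hz / 2 fold back into the range [0, rate_hz / 2].
    return min(folded, rate_hz - folded)

# Assumed example: a signal whose highest frequency is 4000 Hz.
print(min_sampling_rate(4000.0))        # minimum rate in Hz
print(alias_frequency(3000.0, 8000.0))  # below half the rate: unchanged
print(alias_frequency(5000.0, 8000.0))  # above half the rate: folds back
```

In this example a 5000 Hz tone sampled at 8000 Hz becomes indistinguishable from a 3000 Hz tone, which is why the rate has to be chosen from the highest frequency in the signal.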
File formats of Sound
(.wav) adopted by Microsoft
(.voc) adopted by Creative Lab's Sound Blaster
(.snd and .au) originated by NeXT and Sun computers
(.aif) originated by Apple computers
(.avr) adopted by Atari and Apple computers
(.ils) New TIMIT database format
(.adf) CSRE software package format
(.adc) old TIMIT database format
Sound Examples
Highly Good Voice: 44100 samples per second results in a sound file of 669 kB
Good Voice: 11050 samples per second results in a sound file of 167 kB
Bad Voice: 5500 samples per second results in a sound file of 83 kB
Distorted Voice: 2250 samples per second results in a sound file of 35 kB
Highly Distorted Voice: 1125 samples per second results in a sound file of 17 kB
Voice with Noise: 11050 samples per second results in a sound file with noise
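The link between sampling rate and file size in the examples above can be sketched as follows. The slide does not state the recording length, sample width or channel count, so the values used here (16-bit mono, a 1-second clip) are assumptions, and `pcm_size_bytes` is a hypothetical helper; the output is not meant to reproduce the file sizes listed above, only to show that size grows in proportion to the sampling rate.

```python
def pcm_size_bytes(rate_hz, duration_s, bits_per_sample=16, channels=1):
    """Approximate size of uncompressed audio data, ignoring any file header."""
    bytes_per_sample = bits_per_sample // 8
    return int(rate_hz * duration_s * bytes_per_sample * channels)

# Sampling rates taken from the examples above; duration is an assumed 1 second.
for rate in (44100, 11050, 5500, 2250, 1125):
    size = pcm_size_bytes(rate, duration_s=1.0)
    print(f"{rate} samples per second: {size} bytes per second of audio")
```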
V, F, λ Relation
The sound wave properties Speed (V), Frequency (F) and Wavelength (λ) are not independent:
V = F λ
F = V / λ
λ = V / F
Frequency ( F )
The number of times the wavelength occurs in one second.
Frequency = Speed of Sound / Wavelength (F = V / λ)
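A small worked example of the relation between speed, frequency and wavelength. The speed of sound used here (343 m/s, roughly dry air at 20 °C) is an assumed value that the slides do not give, and the function names are hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value: dry air at about 20 degrees Celsius

def wavelength_m(freq_hz, speed_m_s=SPEED_OF_SOUND_M_S):
    """Wavelength = speed / frequency."""
    return speed_m_s / freq_hz

def frequency_hz(wavelength, speed_m_s=SPEED_OF_SOUND_M_S):
    """Frequency = speed / wavelength."""
    return speed_m_s / wavelength

# A 440 Hz tone has a wavelength of a little under 0.8 m at the assumed speed.
lam = wavelength_m(440.0)
print(f"wavelength of 440 Hz tone: {lam:.3f} m")
print(f"frequency recovered from that wavelength: {frequency_hz(lam):.1f} Hz")
```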
Example : Praat
Speech Processing
Speech Processing Applications
Text-To-Speech (TTS)
Spoken Language Recognition (Identification)
Speaker Recognition (Identification)
Speech-To-Text (Speech Recognition)
Spoken Language Recognition (Identification)
The recognizer gets an input utterance and performs the same signal preprocessing as the trainer. It then identifies the spoken language by extracting the feature vectors and comparing them to all the stored example feature vectors of the languages.
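The comparison step can be sketched as a nearest-neighbour search. This is a minimal sketch under stated assumptions, not the recognizer described in the slides: Euclidean distance is one possible comparison measure, the two-dimensional feature vectors are made-up toy values, and the names `euclidean`, `identify_language` and `stored_examples` are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify_language(features, stored):
    """Return the language whose stored example is closest to `features`.

    `stored` maps a language name to a list of example feature vectors.
    """
    best_lang, best_dist = None, float("inf")
    for lang, examples in stored.items():
        for example in examples:
            dist = euclidean(features, example)
            if dist < best_dist:
                best_lang, best_dist = lang, dist
    return best_lang

# Toy, made-up feature vectors standing in for the trainer's stored examples.
stored_examples = {
    "Tamil": [[0.9, 0.1], [0.8, 0.2]],
    "English": [[0.1, 0.9], [0.2, 0.7]],
}
print(identify_language([0.85, 0.15], stored_examples))
```

Real systems extract many feature vectors per utterance and usually compare against statistical models rather than raw stored examples, but the decision rule (pick the language with the best match) has the same shape.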
Speech Recognition:
http://www.dcs.shef.ac.uk/~stu/com326/sym.html
http://web1.mtnl.net.in/~nilami/dtw.html
Questions
Please