B. Pitch And Formants Analysis Results
Figure 2 illustrates the pitch variation obtained by applying the cepstrum analysis method to normal and pathological female voices (32 years old).
Figure 2-a: Pitch evolution for a normal female voice

[...] take the decision. To that end, we try to improve this idea with new methods that use the wavelet transform and neural networks.

III. NEW APPROACH FOR VOICE PATHOLOGY CLASSIFICATION
This work develops the basic idea presented in [5]. This paper proposes a technique that uses wavelet analysis to extract a feature vector from speech samples, which is then used as input to a Multilayer Neural Network classifier. Wavelet analysis provides a two-dimensional pattern of wavelet coefficients. The energy content of the wavelet coefficients at the various levels of scaling is used to form the feature vector of a speech sample. We attempt to use this feature vector as a diagnostic tool to identify pathological disorders in the voice. A three-layer feed-forward network with sigmoid activation is used for classification. The Generalized Back Propagation Algorithm (BPA) is used to train the network.
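The idea of turning per-level wavelet energies into a feature vector can be sketched as follows. This is only an illustration under stated assumptions: the paper does not name the wavelet family, so a Haar filter bank is used here as a stand-in, and the function names, the toy sine frame and the 8000 Hz sampling rate are hypothetical. The paper's own implementation was done in Matlab.

```python
import math

def haar_step(signal):
    """One level of a Haar analysis filter bank (assumed wavelet, not the paper's).

    Returns (approximation, detail) coefficients, each half the input length.
    """
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail

def wavelet_level_energies(signal, levels):
    """Energy of the detail coefficients at each decomposition level.

    Returns (energies, final_approximation).
    """
    energies = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_step(current)
        energies.append(sum(d * d for d in detail))
    return energies, current

# Hypothetical toy frame: 256 samples of a 200 Hz sine at an assumed 8000 Hz rate.
frame = [math.sin(2.0 * math.pi * 200.0 * n / 8000.0) for n in range(256)]
energies, approx = wavelet_level_energies(frame, levels=3)
print(energies)  # one energy value per level: the raw feature vector
```

Because the Haar transform is orthonormal, the level energies plus the energy of the final approximation add up to the energy of the original frame, which is a convenient sanity check for any implementation of this step.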
[Block diagram: Speech Acquirement, Multilayer Neural Network (MNN), Decision (Pathological / Healthy)]
[Figure residue: two plots; x-axes: time (ms) and frequency (Hz)]
2008 International Conference on Signals, Circuits and Systems
Figure 10: Wavelet Analysis of Normal Voice
[Plot: continuous transform, absolute coefficients; x-axis: time (or space) b, 1000 to 10000; y-axis: scale, 1 to 19]
Figure 11: Wavelet Analysis of Pathological Voice
[Plot: analyzed signal of pathological voice; x-axis: time (or space) b, 1000 to 10000; y-axis: scale, 1 to 19]

[...] term supervised learning is used. Alternatively, a network can be trained through self-guidance, according to the input. In both cases, the free parameters of the network, the weights and biases, adapt according to the measured data. The training can be gradual (incremental training), which means that the weights and biases are adapted every time [...]

[...] output layer, the outputs of the network are limited to a small range. Also, if a linear activation function is used in the output layer, the network outputs can take any real value.
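The remark on the output-layer activation (bounded outputs in one case, arbitrary real outputs with a linear function) can be illustrated with a single output neuron. This is a minimal sketch: the hidden activations, weights and bias are made-up numbers, and the sigmoid is assumed to be the bounded activation meant in the text.

```python
import math

def sigmoid(x):
    """Logistic sigmoid: maps any real input into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def linear(x):
    """Identity activation: the output is the weighted sum itself."""
    return x

def output_neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(net)

# Hypothetical hidden-layer activations and output weights (illustration only).
hidden = [0.2, 0.9, 0.5]
weights = [4.0, -6.0, 10.0]
bias = 3.0

bounded = output_neuron(hidden, weights, bias, sigmoid)
unbounded = output_neuron(hidden, weights, bias, linear)
print(bounded, unbounded)
```

With the sigmoid the result always stays strictly between 0 and 1, which suits a two-class decision; with the linear activation the same weighted sum is returned unchanged and can be any real number.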
B. Neural Network Design
In our study we use a multilayer neural network (MNN) with a single hidden layer between the input layer and the output layer. Every neuron of the hidden layer is connected to the neurons of the input layer and to those of the output layer, and there are no connections between neurons of the same layer. The activation functions used in this type of network are threshold (step) or sigmoid functions. This network is trained in a supervised manner according to the error-correction rule: for every input presented, a desired response is expected at the output, and the network keeps adjusting until it produces the correct output.

IV. SIMULATION RESULTS
The Matlab 7.0 platform is used to implement the neural network, which consists of three layers: an input layer, an output layer and a hidden layer (Figure 11). The input layer has as many neurons as the input vector has components. The input is the feature vector obtained from the wavelet decomposition. The hidden layer contains fifteen neurons, and the output layer contains a single neuron that gives the decision, pathological or normal (Figure 9).
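A network of the kind described here (one hidden layer of fifteen sigmoid neurons, a single sigmoid output neuron, supervised error-correction training) might be sketched as below. Several things are assumptions rather than the authors' method: the code is Python, not the Matlab 7.0 implementation; it uses plain per-sample backpropagation on a squared error, not necessarily the "Generalized" variant the paper mentions; and the four training vectors, the learning rate and the epoch count are invented for illustration only.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class ThreeLayerNet:
    """Input layer, one hidden layer (15 neurons by default), one output neuron."""

    def __init__(self, n_inputs, n_hidden=15, seed=0):
        rng = random.Random(seed)  # fixed seed so the sketch is reproducible
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                         for _ in range(n_hidden)]
        self.b_hidden = [0.0] * n_hidden
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b_out = 0.0

    def forward(self, x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w_hidden, self.b_hidden)]
        output = sigmoid(sum(w * h for w, h in zip(self.w_out, hidden)) + self.b_out)
        return hidden, output

    def train_step(self, x, target, lr=0.5):
        """One error-correction update (backpropagation) for a single sample."""
        hidden, output = self.forward(x)
        delta_out = (output - target) * output * (1.0 - output)
        for j, h in enumerate(hidden):
            # Hidden delta uses the output weight before it is updated.
            delta_h = delta_out * self.w_out[j] * h * (1.0 - h)
            self.w_out[j] -= lr * delta_out * h
            self.b_hidden[j] -= lr * delta_h
            for i, xi in enumerate(x):
                self.w_hidden[j][i] -= lr * delta_h * xi
        self.b_out -= lr * delta_out

def mean_squared_error(net, samples):
    return sum((net.forward(x)[1] - t) ** 2 for x, t in samples) / len(samples)

# Invented feature vectors (three normalized energies); 0 = normal, 1 = pathological.
samples = [
    ([0.70, 0.20, 0.10], 0.0),
    ([0.65, 0.25, 0.10], 0.0),
    ([0.30, 0.30, 0.40], 1.0),
    ([0.25, 0.35, 0.40], 1.0),
]

net = ThreeLayerNet(n_inputs=3)
error_before = mean_squared_error(net, samples)
for epoch in range(3000):
    for x, t in samples:  # weights are adapted after every presented sample
        net.train_step(x, t)
error_after = mean_squared_error(net, samples)
print(error_before, error_after)
```

A decision can then be read from the single output neuron, for example by calling it pathological when the output is at least 0.5; that threshold is also an assumption, since the paper does not state one.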
Figure 9: Voices classification model

A. Wavelet Coefficients
We apply the discrete and continuous wavelet transforms to the same word pronounced by two speakers of the same sex and age. One of these speakers is healthy whereas the other suffers from Alzheimer's [...]

C. Feature Extraction
A filter bank is used to extract the wavelet coefficients. The energy of every level is normalized against the total energy content of the signal:
$$E_N(i) = \frac{E_i}{E_T} \qquad (3)$$
where i = 1, 2, … and E_T is the total energy across all the levels:
$$E_T = \sum_i E_i \qquad (4)$$
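The normalization in equations (3) and (4) amounts to dividing each level energy by the sum of all level energies. A minimal sketch, with a hypothetical function name and made-up energy values:

```python
def normalize_energies(level_energies):
    """Divide each level energy E_i by the total energy E_T (eqs. (3) and (4))."""
    total = sum(level_energies)  # E_T, equation (4)
    if total == 0:
        raise ValueError("total energy is zero; cannot normalize")
    return [e / total for e in level_energies]  # E_i / E_T, equation (3)

# Made-up raw energies for three levels (illustration only).
raw = [4.0, 3.0, 1.0]
normalized = normalize_energies(raw)
print(normalized)  # [0.5, 0.375, 0.125]
```

The normalized values sum to 1, so the feature vector reflects how energy is distributed across levels and not how loud the recording is.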
[Figure 12 residue: decomposition tree nodes (2,0) and (2,1); outputs E1, E2, E3]
Figure 12: Three energy coefficients extraction

The obtained training curve is given in Figure 13.

[Figure 13 plot: training curve (blue) and goal (black); logarithmic vertical axis from 10^-6 to 10^1; horizontal axis: 202 epochs; plot title: "Performance is 9.96138e-006, Goal is 1e-005"]
Figure 13: Training curve with three coefficients

The results of neural network training and testing are grouped in Table I.

C. Seven Energy Coefficients
We use a wavelet filter bank to extract seven wavelet coefficients, then we calculate their corresponding energies. The results of neural network training and testing are grouped in Table III.

Table III: Neural Network Results With Seven Coefficients
Pronounced Word           Normal    Pathological
Training Number           20        20
Test Number               10        10
Correct Classification    10        10
Rate of Classification    100 %     100 %

We can then summarize these different results in the following diagrams (Figure 14).
[Figure 14 panels: "Pathological Voices Classification" and "Normal Voices Classification"; x-axis: Energy Coefficient Number (1 to 7); y-axis: Classification Rate (0 to 100)]
Figure 14: Voice classification by MNN results

VI. CONCLUSION
The goal of this work is to design a tool to assist clinicians in Tunisian hospitals. This tool allows the follow-up of patients who suffer from illnesses of vocal and neurological origin.
We presented in this paper a hardware and software interface for the digital processing of the patient's vocal signal, based on neural networks.
The multilayer neural network (MNN) classifier gives the correct classification, with a classification rate between 90% and 100%. We have demonstrated in this study that a feature vector based on wavelet coefficients is useful for the classification of normal and pathological speech data. At this preliminary level, the speech data is classified into two classes, normal or pathological.
The multilayer neural network (MNN) with the back propagation algorithm (BPA), used as a classifier, has proved to be more efficient and more precise than the time-frequency analysis method.
The MNN classifier represents a low-cost, accurate and automatic tool for pathological voice classification using the normalized energy of the wavelet coefficients. It is presented in this paper as a diagnostic tool to aid the physician and clinician in the analysis of speech disease.
Future work will therefore focus on recognizing the specific type of illness that causes the speech pathology.
Finally, this work has to be validated on a larger speech pathology database to increase the reliability of the results.

VII. REFERENCES
[1] V. Parsa and D. G. Jamieson, "Interactions between speech coders and disordered speech," Speech Communication, vol. 40, no. 7, pp. 365–385, 2003.
[2] S. B. Davis, "Acoustic characteristics of normal and pathological voices," Speech and Language: Advances in Basic Research and Practice, vol. 1, pp. 271–335, 1979.
[3] F. Plant, H. Kessler, B. Cheetham, J. Earis, "Speech Monitoring of Infective Laryngitis," Proceedings of ICSLP96, Philadelphia, pp. 749–752, 1996.
[4] M.N. Viera, F.R. McInnes, M.A. Jack, "Robust F0 and Jitter estimation in the Pathological voices," Proceedings of ICSLP96, Philadelphia, pp. 745–748, 1996.
[5] J. Nayak, P.S. Bhat, "Classification and analysis of speech abnormalities," ITBM-RBM 26 (2005) 319–327.
[6] S. Mallat, "A Theory for multiresolution signal decomposition: Wavelet representation," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 674–693, July 1989.
[7] B. Boyanov, S. Hadjitodorov, "Acoustic analysis of pathological voices: a voice analysis system for screening of laryngeal diseases," Proc. IEEE Engineering in Medical and Biology, (1997), vol. 16, no. 4, 74–82.
[8] J.J. Jiang, Yu Zhang, "Nonlinear dynamic analysis of speech from pathological subjects," Proc IEEE Electronics Letters, March (2002), vol. 38, no. 6.
[9] P. Yu, M. Ouaknine, J. Revis, and A. Giovanni, "Objective Voice Analysis for Dysphonic Patients: A Multiparametric Protocol Including Acoustic and Aerodynamic Measurements," Journal of Voice, vol. 15, no. 4, pp. 529–542, 2001.
[10] J. Wang, Jo. Cheolwoo, "Performance of Gaussian Mixture Model as a classifier for Pathological Voice," proceeding of the ASST in Auckland 2006, pp. 165–169.
[11] J. Kortelainen, K. Noponen, "Neural networks," Intelligent Systems 2005.
[12] S. Lotfi, C. Adnène, "A Speech Processing Interface for Analysis of Pathological Voices," in proceeding of ICTTA conference, Damascus 2006.
[13] S. Lotfi, B. Haythem, C. Adnène, "Interface d'analyse vocale a l'identification de certaines pathologies d'origine neurologique et vocale," in proceeding of JTM conference, Tunis 2007.
[14] A. Chérif, "Pitch detection and formant extraction of Arabic speech processing," Journal of Applied Acoustics, January 2001.
[15] A.M. Gaouda, M. Salama, A. Chikhani, and M. Sultan, "Application of wavelet analysis for monitoring dynamic performance in industrial plants," North American Power Symposium, Oct. 1997, Laramie, Wyoming.