Presented by
Nitesh Kumar Chaudhary
Department of Electronics & Communication Engineering
The LNM Institute Of Information Technology, Jaipur
Under the Supervision of
Dr. Navneet Upadhyay
Block Diagram
[Figure: block diagram. Blocks: Perceptual WPT, Teager Energy Operator, Critical Band Selection, Level-Dependent Thresholding, Inverse PWPT. Signals: noisy input X(n); W_{j,m}(k), t_{j,m}(k), M_{j,m}(k), L_{j,m}(k) and W_m(n) for m = 1...17; recovered clean output Y(n).]
The Wavelet Packet Transform (WPT) is one such time-frequency analysis tool.
It maps the signal into a domain that contains both time and frequency
information.
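As a rough illustration of how a wavelet packet tree is built, the sketch below splits every node into a low-pass and a high-pass half at each level. It assumes the Haar filter for brevity (the slides evaluate Daubechies filters), and the function names `haar_split` and `wavelet_packet` are hypothetical. Nodes are keyed as (level, index) in natural order, which is not the same as frequency order.

```python
import math

def haar_split(x):
    """Split an even-length sequence into Haar approximation and detail halves."""
    s = math.sqrt(2.0)
    approx = [(x[2 * k] + x[2 * k + 1]) / s for k in range(len(x) // 2)]
    detail = [(x[2 * k] - x[2 * k + 1]) / s for k in range(len(x) // 2)]
    return approx, detail

def wavelet_packet(x, max_level):
    """Return {(level, index): coefficients} for a full Haar packet tree."""
    tree = {(0, 0): list(x)}
    for level in range(max_level):
        for index in range(2 ** level):
            # Unlike the plain wavelet transform, every node is split again,
            # not only the approximation branch.
            approx, detail = haar_split(tree[(level, index)])
            tree[(level + 1, 2 * index)] = approx
            tree[(level + 1, 2 * index + 1)] = detail
    return tree

# Hypothetical test signal: 5 cycles of a sine over 64 samples.
signal = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
tree = wavelet_packet(signal, max_level=3)
leaves = [tree[(3, i)] for i in range(8)]
```

Because the Haar split is orthonormal, the total energy of the level-3 leaves equals the energy of the input, which is a quick sanity check on any packet decomposition.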
[Figure: Wavelet Decomposition. Wavelet packet tree with nodes from (0,0) down to (5,0)...(5,7); signal magnitude versus sample point (x 10^4) at each decomposition level; legend data1 to data32.]
Teager energy operator:
$\Psi[x(n)] = x^2(n) - x(n+1)\,x(n-1)$
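The Teager Energy Operator named in the block diagram can be sketched in a few lines. This assumes the standard discrete form, psi[n] = x[n]^2 - x[n+1] * x[n-1], evaluated on interior samples only; the function name `teager_energy` is hypothetical.

```python
import math

def teager_energy(x):
    """Discrete Teager energy, psi[n] = x[n]^2 - x[n+1] * x[n-1], for interior samples."""
    return [x[n] * x[n] - x[n + 1] * x[n - 1] for n in range(1, len(x) - 1)]

# For a pure tone A*cos(omega*n) the operator is constant: (A*sin(omega))^2.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(50)]
psi = teager_energy(tone)
expected = (amplitude * math.sin(omega)) ** 2
```

The constant output for a pure tone shows why the operator is useful here: it tracks both amplitude and frequency, so speech-dominated coefficients are emphasized relative to low-level noise.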
A time-adaptive threshold is computed for the wavelet coefficients, so that
time-varying noise is taken into account.
[Equation garbled in extraction; its terms cannot be recovered.]
[Figure: Wavelet Decomposition. Wavelet packet tree with nodes from (0,0) down to (5,0)...(5,7); signal magnitude versus sample point (x 10^4) at each decomposition level.]
Masking Construction:
For a selected band, the mask M_{j,m}(k) is obtained from t_{j,m}(k) using a
Hamming window.
W_m(n) is the inverse perceptual wavelet packet transform of M_{j,m}(k).
[The mask and decision equations are garbled in extraction and cannot be recovered.]
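The exact mask equation is not legible in the source, so the following is only a sketch of one plausible reading: the Teager energy of a band is smoothed with a normalized Hamming window. The smoothing-by-convolution step, the window length of 5, and the names `hamming` and `smooth` are all assumptions made for illustration.

```python
import math

def hamming(length):
    """Hamming window w[n] = 0.54 - 0.46 * cos(2*pi*n / (length - 1))."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]

def smooth(values, window):
    """Centered weighted average; samples outside the signal count as zero."""
    half = len(window) // 2
    total = sum(window)
    out = []
    for n in range(len(values)):
        acc = 0.0
        for k, w in enumerate(window):
            idx = n + k - half
            if 0 <= idx < len(values):
                acc += w * values[idx]
        out.append(acc / total)
    return out

# Hypothetical Teager energy of one band: silence, activity, silence.
teager = [0.0] * 10 + [1.0] * 10 + [0.0] * 10
mask = smooth(teager, hamming(5))
```

Smoothing turns the abrupt on/off edges of the Teager energy into gradual transitions, which is the usual motivation for windowing before a threshold decision.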
Here AWT(i) is the time-adaptive threshold value of frame i, and Frame(i) is defined as
Frame(i) = [V((i-1)*160 + 1), ..., V((i-1)*160 + 160)].
Noise is defined as Noise(n) = p * {E[V(2)(n)] + Mean(Frame(i))} / 2,
where E[V(k)(n)] is the mean of V(k)(n).
The voice-active regions are characterized by V(n) > AWT(i).
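The frame-wise decision rule can be sketched as follows. The code takes the per-frame thresholds as given, because the source does not fully specify how AWT(i) is computed; the function names and the sample values are hypothetical. Python indices start at 0, so frame i in the text corresponds to index i-1 here.

```python
FRAME_LEN = 160  # frame length taken from the Frame(i) definition

def split_frames(v, frame_len=FRAME_LEN):
    """Split v into consecutive full frames; a trailing partial frame is dropped."""
    count = len(v) // frame_len
    return [v[i * frame_len:(i + 1) * frame_len] for i in range(count)]

def voice_active(v, thresholds, frame_len=FRAME_LEN):
    """Flag a sample as voice-active when it exceeds the threshold of its frame."""
    flags = []
    for i, frame in enumerate(split_frames(v, frame_len)):
        flags.extend(sample > thresholds[i] for sample in frame)
    return flags

v = [0.1] * 160 + [0.9] * 160   # hypothetical V(n): a noise frame, then a speech frame
awt = [0.5, 0.5]                # hypothetical per-frame thresholds AWT(i)
flags = voice_active(v, awt)
```

In this toy input the first frame stays below its threshold and the second exceeds it, so only the second frame is marked voice-active.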
Level 3
Noise Signal of the 3rd Level of the Wavelet Tree
[Figure: panels for nodes (3,5), (3,6) and (3,7); signal amplitude versus frequency in Hz.]
Level 4
Denoised Signal of the 4th Level of the Wavelet Tree
[Figure: panels for nodes (4,4) to (4,9); amplitude versus frequency in Hz.]
Level 5
Noise Signal of the 5th Level of the Wavelet Tree
[Figure: panels for nodes (5,0) to (5,7); amplitude versus frequency in Hz.]
Evaluation
To verify the effectiveness of the proposed algorithms, we compare their speech-detection
and false-alarm probabilities.
The proposed methods are all evaluated by receiver operating characteristic (ROC)
curves, which show how well the VAD discriminates between noise-only and noisy
speech frames in terms of the probability of correct detection (Pd) and the probability of
false alarm (Pf).
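The defining equations for Pd and Pf are not legible in the source, so the sketch below assumes the usual definitions: Pd is the fraction of speech frames that are detected, and Pf is the fraction of noise-only frames that are wrongly flagged as speech. The function name `detection_rates` and the labels are hypothetical.

```python
def detection_rates(truth, decision):
    """Return (Pd, Pf) from per-frame boolean labels and VAD decisions."""
    speech = sum(1 for t in truth if t)
    noise = len(truth) - speech
    hits = sum(1 for t, d in zip(truth, decision) if t and d)
    false_alarms = sum(1 for t, d in zip(truth, decision) if not t and d)
    # Guard against an input with no speech frames or no noise frames.
    pd = hits / speech if speech else 0.0
    pf = false_alarms / noise if noise else 0.0
    return pd, pf

# Hypothetical labels: 4 speech frames followed by 4 noise-only frames.
truth = [True, True, True, True, False, False, False, False]
decision = [True, True, True, False, True, False, False, False]
pd, pf = detection_rates(truth, decision)  # Pd = 3/4, Pf = 1/4
```

Sweeping the decision threshold and plotting the resulting (Pf, Pd) pairs gives one point per threshold on an ROC curve.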
Performance Evaluation
[Figure: ROC curve; x-axis Pf (probability of false alarm); legend: shape-preserving, linear; annotation: 20.6710 dB.]
Table 1
Wavelet          Pd (%)    Pf (%)    CP time (s)
Daubechies 2     86.4      15.6      2.872
Daubechies 4     89.3      11.7      2.884
Daubechies 8     91.8      9.2       3.023
Daubechies 10    94.3      5.7       3.074
Daubechies 12    94.5      5.5       3.898
Daubechies 14    94.8      5.2       3.899
The CP time is the average PWPT processing time for a specific wavelet. Considering the
cost-performance (CP) ratio given in Table 1, the Daubechies wavelet filter of length 12,
which has the best CP ratio, is recommended for the proposed algorithm.
References:
S.-H. Chen, H.-T. Wu, Y. Chang, and T. K. Truong, Robust voice activity
detection using perceptual wavelet-packet transform and Teager energy operator, Pattern
Recognition Letters 28 (2007) 1327-1332.
I. Daubechies, Ten Lectures on Wavelets, CBMS-NSF conference series in applied
mathematics, SIAM, 1992.
D. L. Donoho and I. M. Johnstone, Ideal spatial adaptation via wavelet shrinkage,
Biometrika, vol. 81, pp. 425-455, 1994.
S. Mallat, A theory for multiresolution signal decomposition: The wavelet representation,
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 674-693,
July 1989.
M. Berouti, R. Schwartz, and J. Makhoul, Enhancement of speech corrupted by acoustic
noise, in Proc. IEEE ICASSP, Apr. 1979, pp. 208-211.
I. M. Johnstone and B. W. Silverman, Wavelet threshold estimators for data with correlated
noise, J. Roy. Stat. Soc. B 59, pp. 319-351, 1997.
G. D. Forney, Jr., Exponential error bounds for erasure, list, and decision feedback
schemes, IEEE Transactions on Information Theory, vol. 14, no. 2, pp. 206-220, Mar. 1968.