IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 14, NO. 2, MARCH 2006
Abstract—We propose a new algorithm for blind source separation (BSS), in which independent component analysis (ICA) and beamforming are combined to resolve the slow-convergence problem through optimization in ICA. The proposed method consists of the following three parts: (a) frequency-domain ICA with direction-of-arrival (DOA) estimation, (b) null beamforming based on the estimated DOA, and (c) integration of (a) and (b) based on algorithm diversity in both the iteration and frequency domains. The unmixing matrix obtained by ICA is temporally substituted by the matrix based on null beamforming through iterative optimization, and the temporal alternation between ICA and beamforming can realize fast- and high-convergence optimization. The results of the signal separation experiments reveal that the signal separation performance of the proposed algorithm is superior to that of the conventional ICA-based BSS method, even under reverberant conditions.

Index Terms—Beamforming, blind source separation, independent component analysis, microphone array.

I. INTRODUCTION

SOURCE separation for acoustic signals is the estimation of original sound source signals from the mixed signals observed in each input channel. This technique is applicable to the realization of noise-robust speech recognition and high-quality hands-free telecommunication systems. Methods of achieving the source separation can be classified into two groups: methods based on a single-channel input, and those based on multichannel inputs. As single-channel types of source separation, a method of tracking a formant structure [1], the organization technique for hierarchical perceptual sounds [2], and a method based on auditory scene analysis [3] have been proposed. On the other hand, as a multichannel type of source separation, the method based on array signal processing, e.g., a microphone array system, is one of the most effective techniques [4]. In this system, the directions of arrival (DOAs) of the sound sources are estimated and then each of the source signals is separately obtained using the directivity of the array. The delay-and-sum (DS) array and the adaptive beamformer (ABF) are the most conventional and widely used microphone arrays currently utilized for source separation and noise reduction.

Manuscript received July 26, 2002; revised December 21, 2004. This work was supported in part by CREST (Core Research for Evolutional Science and Technology) in Japan. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Walter Kellermann.

The authors are with the Graduate School of Information Science, Nara Institute of Science and Technology, Nara 630-0192, Japan (e-mail: sawatari@is.naist.jp).

Digital Object Identifier 10.1109/TSA.2005.855832

1558-7916/$20.00 © 2006 IEEE

Authorized licensed use limited to: Iran Univ of Science and Tech. Downloaded on May 19,2010 at 05:19:56 UTC from IEEE Xplore. Restrictions apply.

SARUWATARI et al.: BSS BASED ON A FAST-CONVERGENCE ALGORITHM 667

For the high-quality acquisition of audible signals, several microphone array systems based on the DS array have been implemented since the 1980s [5]. Recently, many DS array systems with talker localization have been implemented for hands-free telecommunications or speech recognition [6]–[9]. Although the DS array has a simple structure, it requires a large number of microphones to achieve high performance, particularly in low-frequency regions. Thus, the degradation of separated signals at low frequencies cannot be avoided in these array systems.

In order to further improve the performance using more efficient methods than the DS array, the ABF has been introduced [10]–[12]. The goal of the adaptive algorithm is to determine the optimum directions of the nulls under the specific constraint that the desired signal arriving from the look direction is not significantly distorted. This method can improve the signal-separation performance even with a small array in comparison with that of the DS array. The ABF, however, has the following drawbacks. (a) The look direction of each signal to be separated must be determined in the adaptation process. Thus, the DOAs of the separated sound source signals must be determined in advance. (b) The adaptation procedure should be performed during breaks in the target signal to avoid any distortion of separated signals. However, we cannot predict signal breaks in conventional use. These requirements arise from the fact that the conventional ABF is based on supervised adaptive filtering, and this significantly limits the applicability of the ABF to source separation in practical applications.

In recent years, alternative source-separation approaches have been proposed by researchers who do not use array signal processing but a specialized branch of information theory, i.e., information-geometry theory [13], [14]. Blind source separation (BSS) is the approach for estimating original source signals using only the mixed signals observed in each input channel, where the independence among the source signals is mainly used for the separation. This technique is based on unsupervised adaptive filtering [14], and provides us with extended flexibility in that the source-separation procedure requires no training sequences and no a priori information on the DOAs of the sound sources. The early contributory studies on BSS performed by Cardoso and Jutten [15], [16] used high-order statistics of the signals for measuring the independence. Comon clearly defined the term independent component analysis (ICA), and presented an algorithm that measures independence among the source signals [17]. In recent works on ICA-based BSS, several methods in which the complex-valued unmixing matrices are calculated in the frequency domain have
been proposed to deal with the arrival lags among the elements of the microphone array system [18]–[21]. The ICA-based BSS approach seems to be a very flexible and effective technique for source separation, but it has an inherent disadvantage in that there is difficulty with the slow convergence of nonlinear optimization [22].

To resolve this problem, in this paper, we describe a new algorithm for BSS in which ICA and beamforming are combined. The proposed method consists of the following three parts: (a) frequency-domain ICA with estimation of the DOA of the sound source, (b) null beamforming based on the estimated DOA, and (c) integration of (a) and (b) based on algorithm diversity in both the iteration and frequency domains. The temporal utilization of null beamforming through ICA iterations can realize fast- and high-convergence optimization. The results of the signal separation experiments reveal that the signal separation performance of the proposed algorithm is superior to that of the conventional ICA-based BSS method, and the utilization of null beamforming in ICA is effective for improving the separation performance and convergence, even under reverberant conditions.

In a similar context of a combination technique of BSS and beamforming, Parra et al. have proposed methods [23], [24] in which geometric beamforming is utilized as a specific spatial constraint in the conventional BSS. Indeed, their methods appear to be effective in separating the sound sources, particularly when the room reverberation is relatively short. However, unlike our proposed method, it is not clear that their methods can contribute toward an improvement of convergence in the filter updating. It is also worth mentioning that their methods have an inherent drawback in that all DOAs of the sources should be previously identified (or known [25]) to construct the geometric beamforming, and thus additional and redundant sensors are required. For example, in [23], regarding the separation of only two sources, they needed eight microphones which are mainly used for DOA estimation. This may prevent their methods from being applied to a conventional BSS problem where the number of sources is generally equal to that of sensors. On the contrary, our proposed method can still work in theory and practice under such a condition on the numbers of sources and sensors.

Several approaches to address the source permutation problem have been recently proposed as another possibility for the utilization of beamforming in the ICA framework [26]–[29]. It is indicated that spatial information is very useful for solving the source ordering ambiguity inherent in frequency-domain ICA. However, these approaches have nothing to do with ICA-based filter optimization itself (they are only concerned with the permutation problem), and cannot contribute to the improvement of convergence in ICA. As far as we know, there were no detailed studies on a direct application of the ICA-beamforming combination to the convergence improvement before our proposed method.

The rest of this paper is organized as follows. In Sections II and III, the formulation for the general BSS problems and the principle of the proposed method are explained. In Sections IV and V, the signal separation experiments are described. Following a discussion on the results of the experiments, we present our conclusions in Section VI.

Fig. 1. Configuration of a microphone array and signals.

II. DATA MODEL AND CONVENTIONAL BSS METHOD

A. Sound Mixing Model of Microphone Array

In this study, a straight-line array is assumed. The coordinates of the elements are designated d_1, ..., d_K, and the DOAs of multiple sound sources are designated θ_1, ..., θ_L (see Fig. 1).

Multiple mixed signals are observed at the microphone array, and these signals are converted into discrete time series via an A/D converter. By applying the discrete-time Fourier transform, we can express the observed signals, in which multiple source signals are linearly mixed with additive noise, as follows in the frequency domain:

X(f) = A(f) S(f) + N(f)    (1)

where X(f) is the observed signal vector, S(f) is the source signal vector, and A(f) is the mixing matrix; these are defined as

X(f) = [X_1(f), ..., X_K(f)]^T    (2)

S(f) = [S_1(f), ..., S_L(f)]^T    (3)

A(f) = [A_kl(f)],  k = 1, ..., K,  l = 1, ..., L    (4)

Also, N(f) is the additive noise term which generally represents, for example, an environment noise and/or a sensor noise.

We introduce the model for dealing with the arrival lags among each of the elements of the microphone array. In this case, A(f) is assumed to be complex-valued. Hereafter, for convenience, we consider only the relative lags among each of the elements with respect to the arrival time of the wavefront of each sound source, and neglect the pure delay between the microphone and sound source. Also, S(f) is regarded as being identical to the source signals observed at the origin. For example, by neglecting the effect of the room reverberation, we can rewrite the elements in the mixing matrix (4) as the following simple expression:

A_kl(f) = exp(j 2π f d_k sin θ_l / c)    (5)

where d_k sin θ_l / c is the arrival lag with respect to the lth source signal from the θ_l direction, which is observed at the kth microphone at the coordinate d_k. Also, c is the velocity of sound. If the effect of room reverberation is considered, the elements in the
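As a concrete illustration of the anechoic model in (1)–(5), the following sketch builds the mixing matrix for a two-element line array and mixes two source spectra at one frequency bin. The microphone spacing, frequency, and DOA values are illustrative assumptions, not values taken from the paper's experiments.

```python
import numpy as np

def anechoic_mixing_matrix(f, mic_pos, doas_deg, c=340.0):
    """K x L mixing matrix of (5): A_kl(f) = exp(j*2*pi*f*d_k*sin(theta_l)/c)."""
    d = np.asarray(mic_pos, dtype=float)[:, None]                   # K x 1 coordinates [m]
    theta = np.deg2rad(np.asarray(doas_deg, dtype=float))[None, :]  # 1 x L DOAs [rad]
    return np.exp(1j * 2.0 * np.pi * f * d * np.sin(theta) / c)

# Two microphones 4 cm apart (d_1 = 0 at the origin), two sources at -30 and 40 degrees.
A = anechoic_mixing_matrix(f=1000.0, mic_pos=[0.0, 0.04], doas_deg=[-30.0, 40.0])
S = np.array([1.0 + 0.0j, 0.5 - 0.2j])   # source spectra S(f) at this frequency bin
X = A @ S                                 # observed spectra: X(f) = A(f) S(f), with N(f) = 0
```

Because the model contains only pure delays, every element of A has unit modulus, and the row for the origin microphone (d = 0) is all ones.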
(18)

(19)

where the function in (18) ((19)) obtains the smaller (larger) value of the two estimated DOAs. We conduct this procedure on a specific ordering basis, such that the smaller DOA corresponds to the first sound source and the larger DOA corresponds to the second sound source, as depicted in Fig. 1.

[Step 4: Beamforming] Construct an alternative matrix for signal separation based on the null-beamforming technique, where the DOA information obtained in the ICA section is used. In the first row of this matrix, we assume that the look direction is θ_1 and the directional null is steered to θ_2 (see solid line in Fig. 4). Also, in the second row, the look direction is θ_2 and the directional null is steered to θ_1 (see broken line in Fig. 4). Under these assumptions, the unmixing matrix satisfies the following equations:

(20)

(21)

(22)

(23)

[Step 5] If the current iteration is the final iteration, go to step 6; otherwise, go back to step 2 and repeat the ICA iteration, inserting the matrix given by (23) into (14), with an increment of the iteration index.

[Step 6: Ordering and scaling] Using the DOA information obtained in step 3, we can detect and correct the source permutation and the gain inconsistency [26]. From the directivity patterns in all frequency bins, we collect the specific patterns in which the directional null is steered to the direction of the second source, θ_2. Also, we collect the other specific directivity patterns in which the directional null is steered to the direction of the first source, θ_1. By performing this procedure, we can resolve the permutation problem. The gain inconsistency problem is resolved by normalizing the directivity patterns according to the gain in each source direction after the classification. The resultant separated signals can be obtained as follows:

(24)
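One concrete construction that satisfies the step-4 requirements (unit gain in each look direction and an exact null toward the other source) is to invert the 2 × 2 steering matrix. This is a sketch consistent with the anechoic model of (5), not necessarily the exact form of (20)–(21), which are not recoverable from this copy; the array geometry and DOAs are illustrative assumptions.

```python
import numpy as np

def steering_vector(f, mic_pos, theta_deg, c=340.0):
    """Steering vector of a line array, matching the anechoic model of (5)."""
    d = np.asarray(mic_pos, dtype=float)
    return np.exp(1j * 2.0 * np.pi * f * d * np.sin(np.deg2rad(theta_deg)) / c)

def null_beamformer(f, mic_pos, theta1_deg, theta2_deg):
    """2 x 2 separation matrix: row 1 has unit gain toward theta1 and an exact
    null toward theta2; row 2 is the converse. Obtained by inverting the
    steering matrix -- one construction meeting the step-4 constraints."""
    A = np.stack([steering_vector(f, mic_pos, t) for t in (theta1_deg, theta2_deg)],
                 axis=1)                 # columns: a(theta1), a(theta2)
    return np.linalg.inv(A)              # W A = I => unit look gain, exact nulls

W = null_beamformer(f=1000.0, mic_pos=[0.0, 0.04], theta1_deg=-30.0, theta2_deg=40.0)
```

Since W is the exact inverse of the steering matrix, the first output passes a wavefront from θ_1 undistorted while completely rejecting one from θ_2, and vice versa for the second output.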
where the unmixing matrix is the final one obtained in (23), and the normalization term is the resultant directional gain for the lth output at the estimated lth source direction, which is given by

(25)

(26)

C. Extended Algorithm to the Case of K = L > 2

In this section, an extension to more than two sources with more than two sensors (i.e., K = L > 2) is described. Basically, the straightforward extension can be easily made by substituting L-dimensional vectors and L × L matrices for all of the two-dimensional vectors and 2 × 2 matrices in Section III-B. For example, (15) and (16) are rewritten with the extended unmixing matrix by

1) Make the whole set of detected null directions to be classified, as

(31)

where the size of the set is the total number of detected directional nulls.

2) Set the initial partitions, where the two terminal partition boundaries are fixed at the ends of the set throughout the algorithm.

3) Given the partitions, calculate the centroids as

(32)
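Steps 1)–3) above describe a one-dimensional partition-and-centroid (Lloyd-type) clustering of the null directions pooled over all frequency bins. The sketch below assumes a nearest-centroid assignment and mean-centroid update, since (31)–(32) themselves are not recoverable from this copy; the initialization and the toy angle values are illustrative assumptions.

```python
import numpy as np

def cluster_null_directions(thetas, L, iters=20):
    """1-D Lloyd-type clustering of pooled null-direction estimates (degrees)
    into L groups: assign each angle to the nearest centroid, then update each
    centroid to the mean of its group (step 3 above)."""
    thetas = np.sort(np.asarray(thetas, dtype=float))
    centroids = np.linspace(thetas[0], thetas[-1], L)   # assumed initialization
    for _ in range(iters):
        labels = np.argmin(np.abs(thetas[:, None] - centroids[None, :]), axis=1)
        for l in range(L):
            if np.any(labels == l):
                centroids[l] = thetas[labels == l].mean()
    return centroids

nulls = [-32.0, -29.5, -30.5, 39.0, 41.0, 40.5]   # toy null estimates pooled over bins
centroids = cluster_null_directions(nulls, L=2)   # two source-direction estimates
```

For the toy data the two centroids converge to the means of the two angle groups, which then serve as the DOA estimates for the L sources.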
Fig. 5. Layout of reverberant room used in simulation experiments (K = L = 2).

Fig. 6. Noise reduction rates for different iteration points in proposed method, conventional ICA, and iteratively optimized null beamformer. RT is 0 ms, and K = L = 2.

TABLE I
ANALYSIS CONDITIONS FOR SIGNAL SEPARATION

…requires 1500 frame shifts for 3-s data within one iteration, and this corresponds to 1500 iterations in the on-line algorithm (e.g., 100 off-line iterations correspond to 150 000 on-line iterations). The improvement of the efficiency remains an open problem.
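The iteration-count bookkeeping above can be checked in a few lines. The 8-kHz sampling rate and 16-sample frame shift used here are assumptions (the contents of Table I did not survive extraction), chosen so that 3 s of data yields the stated 1500 frame shifts.

```python
fs = 8000          # sampling rate [Hz] -- assumed; Table I is not recoverable
frame_shift = 16   # frame shift [samples] -- assumed
data_len_s = 3     # length of the input data [s]

shifts_per_offline_iteration = data_len_s * fs // frame_shift   # frame shifts per iteration
online_updates = 100 * shifts_per_offline_iteration             # 100 off-line iterations
```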
Fig. 12. Noise reduction rates for different iteration points in proposed method, conventional ICA, and iteratively optimized null beamformer. RT is 150 ms, and K = L = 2.

Fig. 13. Noise reduction rates for different iteration points in proposed method, conventional ICA, and iteratively optimized null beamformer. RT is 300 ms, and K = L = 2.

Fig. 14. Result of alternation between ICA and null beamforming through iterative optimization by the proposed algorithm. The black-box symbol indicates that null beamforming is used at the given iteration point and frequency bin. RT is 150 ms, and K = L = 2.

Fig. 15. Result of alternation between ICA and null beamforming through iterative optimization by the proposed algorithm. The black-box symbol indicates that null beamforming is used at the given iteration point and frequency bin. RT is 300 ms, and K = L = 2.

…conditions, NRRs are shown in Figs. 12 and 13. In addition, as a baseline algorithm, we performed experiments using the null beamformer with ideal DOA information, and we obtained an NRR of 6.5 dB when the RT is 150 ms and 5.7 dB when the RT is 300 ms.

The results reveal that the separation performances of the proposed algorithm are superior to those of the conventional ICA-based BSS method at every iteration point, even when considering the additional computational cost of the proposed algorithm. More specifically, compared with the conventional ICA, the proposed method can improve the NRR by about 5.7 dB at the 30-iteration point when the RT is 150 ms and by about 2.3 dB when the RT is 300 ms. These results are also more promising than those of simple null beamforming.

Figs. 14 and 15 show examples of alternation results between ICA and null beamforming through iterative optimization by the proposed algorithm when the RTs are 150 and 300 ms, respectively. As shown in Figs. 14 and 15, the proposed algorithm can function automatically as follows.

• Null beamforming is used for the acceleration of learning early in the iterations because the null-beamforming matrix is a rough approximation of the unmixing matrix.

• ICA is used after the early part of the iterations because it can update the unmixing matrix more accurately.

• The unmixing matrix obtained by ICA is substituted by the matrix based on null beamforming through all iteration points at particular frequency bins where the independence between the sources is low.

From these results, although null beamforming is not suitable for signal separation under the condition that direct sounds and their reflections exist, we can confirm that the temporal utilization of null beamforming for algorithm diversity through ICA iterations is effective for improving the separation performance and convergence.

D. Experimental Comparison With Alternative Combination Technique of BSS and Beamforming

As described in Section I, there are alternative approaches for combining BSS and geometric beamforming [23], [24]. Parra et al. [23] have proposed Geometric Source Separation (GSS) in which beamforming is utilized as a specific spatial constraint in the conventional BSS. The aim of this section is to discuss
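The automatic alternation behavior can be pictured as a per-bin, per-iteration choice between two candidate separation matrices. The paper's actual selection criterion is not recoverable from this copy, so the cost function below (residual cross-talk of the overall system, under a known toy mixing matrix) is purely an assumed stand-in used to illustrate the selection mechanism.

```python
import numpy as np

def select_unmixing(W_ica, W_bf, cost):
    """Per-frequency-bin algorithm diversity: keep whichever of the ICA-updated
    matrix or the beamforming matrix attains the lower cost."""
    use_bf = cost(W_bf) < cost(W_ica)
    return (W_bf if use_bf else W_ica), use_bf

def crosstalk_cost(W):
    """Toy stand-in cost: residual cross-talk of the overall system W A."""
    P = W @ A_true
    return np.linalg.norm(P - np.diag(np.diag(P)))

A_true = np.array([[1.0, 0.4], [0.3, 1.0]])   # illustrative 2 x 2 mixing matrix
W_bf = np.linalg.inv(A_true)                  # "beamformer": exact inverse, no cross-talk
W_ica = np.eye(2)                             # unconverged early ICA estimate
W, used_bf = select_unmixing(W_ica, W_bf, crosstalk_cost)
```

Early in the iterations the beamforming matrix wins the comparison, which mirrors the first bullet above; once the ICA estimate converges beyond the beamformer's accuracy, the selection flips.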
(34)

where the leading term is a normalization factor (the norm is the Frobenius norm), and the weight is given by the inverse of the condition number of the constraint matrix. The remaining terms are the cross-power spectra of the input and the output, respectively, which are calculated around the time (frame) index. The source DOAs should be estimated and given in advance via an appropriate external DOA estimator.

In this paper, we estimate the DOAs at three time instances with 1 s of data each, i.e., the total length of the input sound is 3 s, similarly to the previous experiments. The step-size parameter is set to the optimal value that provides the fastest and highest convergence. The rest of the experimental conditions are the same as those of the previous experiments (see Section IV-A). We introduced two types of GSSs which correspond to different DOA-estimation processes as follows.

GSS-Ideal: GSS with ideal DOAs for each of the sources, where we assume that accurate DOA information is previously known; consequently, this GSS is not blind. This will give the upper bound on the separation performance of GSS.

GSS-Estimated: Blind GSS driven by the estimated DOAs, where the DOA estimation is performed by looking at the output power of the DS array steered to various directions. The reason for choosing the DS array as a DOA estimator is that other DOA estimation methods, e.g., the MUSIC method, cannot be used in this case because K = L; indeed, MUSIC works only when K > L. In the DS array, if two directional peaks are detected, then we use the DOAs as they are. Otherwise, when only a single directional peak is detected, we introduce a DOA hypothesis in which the two DOAs for the two sources are heuristically given by combining the detected value and the initial assumption. For example, when the detected DOA is 25°, the two DOAs for GSS are set to the initial assumption and 25°; the converse single-peak case is handled in the same manner.

Fig. 16. Comparison among noise reduction rates for different iteration points in proposed method and Parra's Geometric Source Separation method (GSS-Ideal and GSS-Estimated). RT is 150 ms, and K = L = 2.

Fig. 17. Comparison among noise reduction rates for different iteration points in proposed method and Parra's Geometric Source Separation method (GSS-Ideal and GSS-Estimated). RT is 300 ms, and K = L = 2.

Figs. 16 and 17 show the NRR results under RT = 150 and 300 ms for different iterations, where the NRRs are the averages of 12 speaker combinations. From these figures, the following points are revealed.

• In both cases of RT = 150 and 300 ms, GSS-Ideal can achieve the source separation to some extent, but this result represents only a slight outperformance over the null beamformer. As compared to the proposed method, the separation performance is relatively low. This result indicates that the spatial constraint by the ideal beamformer might help the separation, but the separation performance is trapped around almost the same level as that of the beamformer
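The two per-frequency quantities described for (34) — a weight equal to the inverse of the condition number of the (estimated) mixing matrix, and a Frobenius-norm normalization — can be sketched as follows. The exact placement of these factors in (34) is not recoverable from this copy, so the function only illustrates how the quantities themselves are computed; the matrix values are illustrative assumptions.

```python
import numpy as np

def gss_factors(A_hat, Rxx):
    """Per-frequency quantities described for (34): a weight equal to the
    inverse condition number of the estimated mixing matrix A_hat, and the
    Frobenius norm of the input cross-power matrix Rxx."""
    s = np.linalg.svd(A_hat, compute_uv=False)
    weight = s[-1] / s[0]                   # = 1 / cond(A_hat), in (0, 1]
    norm = np.linalg.norm(Rxx, ord='fro')   # Frobenius norm of Rxx
    return weight, norm

A_hat = np.array([[1.0, 0.9], [0.9, 1.0]])  # nearly collinear columns -> small weight
w, n = gss_factors(A_hat, np.eye(2))
```

The weight downplays frequency bins where the steering columns are nearly collinear (ill-conditioned geometry), so the geometric constraint contributes less there.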
Fig. 21. Noise reduction rates for different iteration points in ICA under real recording condition where K = L = 2. RT is 200 ms, and the background noise level is 37 dB(A).

Fig. 22. Result of alternation between ICA and null beamforming through iterative optimization by the proposed algorithm. The black-box symbol indicates that null beamforming is used at the given iteration point and frequency bin. RT is 200 ms, and K = L = 2.

The levels of background noise and each of the sound sources measured at the array origin were 37 dB(A) and 60 dB(A), respectively. It should also be mentioned that all of the experimental apparatus may include possible sensor noise, environment noise, and/or nonlinear error produced in, for example, amplifiers.

B. Results

Fig. 21 shows NRR results to illustrate the performances of the proposed algorithm and the conventional BSS under a real environment. The NRRs are the averaged scores with respect to three configurations, Config. 1–Config. 3. The results reveal that the proposed algorithm outperforms the conventional ICA-based BSS method at every iteration point, even in the presence of the background noise. At the final 200-iteration point, a significant improvement of more than 4 dB can be obtained over the conventional ICA.

In Fig. 22, we show an example of alternation results between ICA and null beamforming through iterative optimization by the proposed algorithm. This figure demonstrates that the proposed algorithm can work properly and achieve the automatic diversity as described in Section IV-C.

VI. CONCLUSION

In this paper, we described a fast- and high-convergence algorithm for BSS where null beamforming is temporally used for algorithm diversity through ICA iterations. The simulation results of the signal separation experiments reveal that the signal separation performance of the proposed algorithm is superior to that of the conventional ICA-based BSS method, and the utilization of null beamforming in ICA is effective for improving the separation performance and convergence, even under reverberant conditions. More specifically, compared with the conventional method, the proposed method can improve the NRR by about 15.2 dB at the 30-iteration point when RT is 0 ms, by about 5.7 dB when RT is 150 ms, and by about 2.3 dB when RT is 300 ms. In addition, we have experimentally shown the superiority of the proposed method over Parra's combination approach for BSS and beamforming. The results of the BSS experiment with actual devices in a real acoustic environment demonstrate that the proposed method can work well even in the presence of background noise. Its advantage over the conventional ICA in the case of two sound sources has been illustrated.

This paper mainly discussed BSS algorithms with off-line learning; the extension to on-line applications has not been addressed. Further study on the on-line algorithm is an open problem, and this will be indispensable particularly when we deal with more realistic situations, e.g., moving sound sources and time-varying systems.

ACKNOWLEDGMENT

The authors are grateful to Dr. S. Makino and R. Mukai of NTT Co., Ltd. for their suggestions and discussions on this work. The authors thank S. Ukai of NAIST for his contribution to part of the experiments.

REFERENCES

[1] T. W. Parsons, “Separation of speech from interfering speech by means of harmonic selection,” J. Acoust. Soc. Amer., vol. 60, pp. 911–918, 1976.
[2] K. Kashino, K. Nakadai, T. Kinoshita, and H. Tanaka, “Organization of hierarchical perceptual sounds,” in Proc. 14th Int. Conf. Artificial Intelligence, vol. 1, 1995, pp. 158–164.
[3] M. Unoki and M. Akagi, “A method of signal extraction from noisy signal based on auditory scene analysis,” Speech Commun., vol. 27, pp. 261–279, 1999.
[4] G. W. Elko, “Microphone array systems for hands-free telecommunication,” Speech Commun., vol. 20, pp. 229–240, 1996.
[5] J. L. Flanagan, J. D. Johnston, R. Zahn, and G. W. Elko, “Computer-steered microphone arrays for sound transduction in large rooms,” J. Acoust. Soc. Amer., vol. 78, pp. 1508–1518, 1985.
[6] H. Wang and P. Chu, “Voice source localization for automatic camera pointing system in videoconferencing,” in Proc. ICASSP’97, Apr. 1997, pp. 187–190.
[7] K. Kiyohara, Y. Kaneda, S. Takahashi, H. Nomura, and J. Kojima, “A microphone array system for speech recognition,” in Proc. ICASSP’97, Apr. 1997, pp. 215–218.
[8] M. Omologo, M. Matassoni, P. Svaizer, and D. Giuliani, “Microphone array based speech recognition with different talker-array positions,” in Proc. ICASSP’97, Apr. 1997, pp. 227–230.
[9] H. F. Silverman and W. R. Patterson III, “Visualizing the performance of large-aperture microphone arrays,” in Proc. ICASSP’99, Mar. 1999, pp. 969–972.
[10] O. L. Frost, “An algorithm for linearly constrained adaptive array processing,” Proc. IEEE, vol. 60, pp. 926–935, 1972.
[11] L. J. Griffiths and C. W. Jim, “An alternative approach to linearly constrained adaptive beamforming,” IEEE Trans. Antennas Propagat., vol. 30, pp. 27–34, 1982.
[12] Y. Kaneda and J. Ohga, “Adaptive microphone-array system for noise reduction,” IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-34, pp. 1391–1400, 1986.
[13] T.-W. Lee, Independent Component Analysis. Norwell, MA: Kluwer, 1998.
[14] S. Haykin, Unsupervised Adaptive Filtering. New York: Wiley, 2000.
[15] J. F. Cardoso, “Eigenstructure of the 4th-order cumulant tensor with application to the blind source separation problem,” in Proc. ICASSP’89, 1989, pp. 2109–2112.
[16] C. Jutten and J. Herault, “Blind separation of sources part I: An adaptive algorithm based on neuromimetic architecture,” Signal Process., vol. 24, pp. 1–10, 1991.
[17] P. Comon, “Independent component analysis, a new concept?,” Signal Process., vol. 36, pp. 287–314, 1994.
[18] V. Capdevielle, C. Serviere, and J. Lacoume, “Blind separation of wide-band sources in the frequency domain,” in Proc. ICASSP’95, 1995, pp. 2080–2083.
[19] N. Murata and S. Ikeda, “An on-line algorithm for blind source separation on speech signals,” in Proc. 1998 Int. Symp. Nonlinear Theory and Its Application (NOLTA’98), vol. 3, Sep. 1998, pp. 923–926.
[20] P. Smaragdis, “Blind separation of convolved mixtures in the frequency domain,” Neurocomput., vol. 22, pp. 21–34, 1998.
[21] L. Parra and C. Spence, “Convolutive blind separation of nonstationary sources,” IEEE Trans. Speech Audio Processing, vol. 8, pp. 320–327, 2000.
[22] H. Saruwatari, S. Kurita, K. Takeda, F. Itakura, and K. Shikano, “Blind source separation based on subband ICA and beamforming,” in Proc. ICSLP2000, vol. 3, Oct. 2000, pp. 94–97.
[23] L. Parra and C. V. Alvino, “Geometric source separation: Merging convolutive source separation with geometric beamforming,” IEEE Trans. Speech Audio Processing, vol. 10, no. 6, pp. 352–362, 2002.
[24] C. Fancourt and L. Parra, “The generalized sidelobe decorrelator,” in Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2001, pp. 167–170.
[25] M. S. Pedersen, U. Kjems, K. B. Rasmussen, and L. K. Hansen, “Semi-blind source separation using head-related transfer functions,” in Proc. ICASSP 2004, vol. V, 2004, pp. 713–716.
[26] S. Kurita, H. Saruwatari, S. Kajita, K. Takeda, and F. Itakura, “Evaluation of blind signal separation method using directivity pattern under reverberant conditions,” in Proc. ICASSP 2000, vol. 5, Jun. 2000, pp. 3140–3143.
[27] H. Sawada, R. Mukai, S. Araki, and S. Makino, “A robust and precise method for solving the permutation problem of frequency-domain blind source separation,” in Proc. Int. Symp. Independent Component Analysis and Blind Signal Separation, 2003, pp. 505–510.
[28] W. Wang, J. A. Chambers, and S. Sanei, “A novel hybrid approach to the permutation problem of frequency domain blind source separation,” in Proc. Int. Conf. Independent Component Analysis and Blind Signal Separation, 2004, pp. 532–539.
[29] N. Mitianoudis and M. Davies, “Permutation alignment for frequency domain ICA using subspace beamforming methods,” in Proc. Int. Conf. Independent Component Analysis and Blind Signal Separation, 2004, pp. 669–676.
[30] H. Sawada, R. Mukai, S. Araki, and S. Makino, “Polar coordinate based nonlinear function for frequency domain blind source separation,” IEICE Trans. Fund., vol. E86-A, no. 3, pp. 590–596, 2003.
[31] S. Araki, R. Mukai, S. Makino, T. Nishikawa, and H. Saruwatari, “The fundamental limitation of frequency domain blind source separation for convolutive mixtures of speech,” IEEE Trans. Speech Audio Processing, vol. 11, no. 2, pp. 109–116, Mar. 2003.
[32] D. H. Johnson and D. E. Dudgeon, Array Signal Processing: Concepts and Techniques. Englewood Cliffs, NJ: Prentice-Hall, 1993.
[33] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression. New York: Kluwer Academic, 1998.
[34] T. Kobayashi, S. Itabashi, S. Hayashi, and T. Takezawa, “ASJ continuous speech corpus for research,” J. Acoust. Soc. Jpn., vol. 48, no. 12, pp. 888–893, 1992 (in Japanese).

Hiroshi Saruwatari (M’00) was born in Nagoya, Japan, on July 27, 1967. He received the B.E., M.E., and Ph.D. degrees in electrical engineering from Nagoya University in 1991, 1993, and 2000, respectively. He joined the Intelligent Systems Laboratory, SECOM Co., Ltd., Tokyo, Japan, in 1993, where he engaged in research and development on ultrasonic array systems for acoustic imaging. He is currently an Associate Professor at the Graduate School of Information Science, Nara Institute of Science and Technology. His research interests include array signal processing, blind source separation, and sound field reproduction. He is a member of the IEICE, the Japan VR Society, and the Acoustical Society of Japan.

Toshiya Kawamura received the B.E. degree in electrical engineering from Kinki University in 1999 and the M.E. degree in information science from Nara Institute of Science and Technology in 2001. His research interests include array signal processing and blind source separation. Mr. Kawamura is a member of the Acoustical Society of Japan.

Tsuyoki Nishikawa was born in Mie, Japan, on February 13, 1978. He received the B.E. degree in electrical engineering from Kinki University in 2000 and the M.E. degree in information science from Nara Institute of Science and Technology in 2002. He is currently a Ph.D. candidate at Nara Institute of Science and Technology. His research interests include array signal processing and blind source separation. Mr. Nishikawa is a member of the IEICE and the Acoustical Society of Japan.

Akinobu Lee was born in Kyoto, Japan, on December 19, 1972. He received the B.E. and M.E. degrees in information science, and the Ph.D. degree in informatics from Kyoto University, Kyoto, Japan, in 1996, 1998, and 2000, respectively. He is currently an Assistant Professor at the Graduate School of Information Science, Nara Institute of Science and Technology. His research interests include large vocabulary continuous speech recognition and spoken language processing. Dr. Lee is a member of the IEICE and the Acoustical Society of Japan.

Kiyohiro Shikano (M’84) received the B.S., M.S., and Ph.D. degrees in electrical engineering from Nagoya University in 1970, 1972, and 1980, respectively. He is currently a Professor at Nara Institute of Science and Technology (NAIST), where he directs the speech and acoustics laboratory. His major research areas are speech recognition, multimodal dialog systems, speech enhancement, adaptive microphone arrays, and acoustic field reproduction. From 1972, he worked at NTT Laboratories, where he was engaged in speech recognition research. During 1990–1993, he was the Executive Research Scientist at NTT Human Interface Laboratories, where he supervised the research of speech recognition and speech coding. During 1986–1990, he was the Head of the Speech Processing Department at ATR Interpreting Telephony Research Laboratories, where he directed speech recognition and speech synthesis research. Dr. Shikano received the IEEE Signal Processing Society 1990 Senior Award in 1991. He is a member of the IEICE, the IPSJ, the ASJ, and the Japan VR Society.