Abstract—A parallelization of the low-frequency multilevel fast multipole algorithm (MLFMA) for graphics processing units (GPUs) is presented. The implementation exhibits speedups between 10 and 30 compared to a serial CPU implementation of the algorithm. The error of the MLFMA on the GPU is controllable down to machine precision. Under the typical method-of-moments (MoM) error requirement of three correct digits, modern GPUs are shown to handle problems with up to 7.5 million degrees of freedom in dense matrix approximation.

Index Terms—CUDA, graphics processing unit (GPU), low-frequency fast multipole method, fast algorithms, multiscale modeling.

I. INTRODUCTION

... only publication on MLFMA parallelization for GPUs. In this letter, we describe a GPU implementation of the low-frequency MLFMA with the Helmholtz kernel [2] using double-precision arithmetic. A parallelization pattern similar to the one discussed in [7] is used. The methodology achieves speedups of over 30 compared to a conventional serial implementation of the low-frequency MLFMA on a CPU.

II. FIELD EXPANSIONS IN LOW-FREQUENCY MLFMA

Consider a spatial distribution of time-harmonic point sources located at , and having magnitudes , respectively. The field produced by such sources at observation point is given by [9]

(5)

where , and .

The MLFMA subdivides the computational domain recursively [9]. At th level of subdivision, the domain is split into cubes (boxes), , where level corresponds

Manuscript received October 19, 2009; manuscript revised December 10, 2009. Date of publication January 15, 2010; date of current version March 05, 2010. This work was in part supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).
The authors are with the Department of Electrical and Computer Engineering, University of Manitoba, Winnipeg, MB R3T5V6, Canada (e-mail: mcwikla@ieee.org).
Digital Object Identifier 10.1109/LAWP.2010.2040571
1536-1225/$26.00 © 2010 IEEE
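
To make the two ideas of Section II concrete, the following is a minimal Python sketch, not the authors' GPU code. It shows (a) the direct source-by-source summation of the field that the MLFMA is designed to accelerate, and (b) how a point can be mapped to its cube at a given subdivision level. Several details are assumptions, because the corresponding symbols are not recoverable from the text: the kernel is taken as exp(-jkR)/(4*pi*R) (sign convention and normalization assumed), level 0 is taken to be the single root cube, and each level is assumed to halve the cube edge so that level l holds 8**l cubes. The function names `direct_field` and `box_index` are hypothetical.

```python
import cmath
import math


def direct_field(sources, magnitudes, k, r):
    """Sum the contributions of all point sources at observation point r.

    Assumed kernel (not taken from the letter): exp(-1j*k*R) / (4*pi*R),
    where R is the distance from a source to the observation point.
    """
    total = 0j
    for src, q in zip(sources, magnitudes):
        dist = math.dist(src, r)
        if dist == 0.0:
            continue  # skip the singular self-term
        total += q * cmath.exp(-1j * k * dist) / (4.0 * math.pi * dist)
    return total


def box_index(point, level, origin=(0.0, 0.0, 0.0), size=1.0):
    """Return the integer (ix, iy, iz) of the cube containing `point`.

    Assumes level 0 is one root cube of edge `size` with corner `origin`,
    and that every level halves the edge (2**level cubes per axis).
    """
    n = 2 ** level
    idx = []
    for p, o in zip(point, origin):
        i = int((p - o) / size * n)
        idx.append(min(max(i, 0), n - 1))  # clamp points on the domain faces
    return tuple(idx)
```

The direct sum costs one kernel evaluation per source-observer pair, which is the quadratic cost the multilevel scheme avoids by grouping sources into the cubes returned by `box_index`. Clamping keeps points that lie exactly on the outer face of the root cube inside the last cube along each axis.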
Authorized licensed use limited to: Thangal Kunju Musaliar College of Engineering. Downloaded on June 26,2010 at 07:03:34 UTC from IEEE Xplore. Restrictions apply.
CWIKLA et al.: LOW-FREQUENCY MLFMA ON GRAPHICS PROCESSORS 9
10 IEEE ANTENNAS AND WIRELESS PROPAGATION LETTERS, VOL. 9, 2010
TABLE I
BENCHMARK PARAMETERS

TABLE II
BENCHMARKS RUN TIMES IN SECONDS AND SPEEDUP
    end while
  end for
end for