D2Q9 model
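For orientation, the D2Q9 velocity set can be written down in a few lines. This is a minimal sketch, not the implementation presented in these slides: the ordering of the directions, the names VELOCITIES, WEIGHTS, equilibrium and moments, and the use of the standard second-order equilibrium are assumptions made for illustration.

```python
# D2Q9: two dimensions, nine discrete velocities
# (one rest, four axis-aligned, four diagonal).
# The direction ordering is an assumption; codes differ here.
VELOCITIES = [
    (0, 0),
    (1, 0), (0, 1), (-1, 0), (0, -1),
    (1, 1), (-1, 1), (-1, -1), (1, -1),
]
WEIGHTS = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4
CS2 = 1 / 3  # squared lattice speed of sound


def equilibrium(rho, ux, uy):
    """Standard second-order equilibrium for all nine directions."""
    usq = ux * ux + uy * uy
    feq = []
    for (cx, cy), w in zip(VELOCITIES, WEIGHTS):
        cu = cx * ux + cy * uy
        feq.append(
            w * rho * (1 + cu / CS2 + cu * cu / (2 * CS2 * CS2) - usq / (2 * CS2))
        )
    return feq


def moments(f):
    """Recover density and velocity from the nine distributions."""
    rho = sum(f)
    ux = sum(fi * cx for fi, (cx, _) in zip(f, VELOCITIES)) / rho
    uy = sum(fi * cy for fi, (_, cy) in zip(f, VELOCITIES)) / rho
    return rho, ux, uy
```

Taking the moments of an equilibrium state returns the density and velocity it was built from, which is a quick consistency check on the weights and directions.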
16.03.2010 | Kostyantyn Kucher | Page 2
[Figure: coarse and fine grid]
[Figure: domain block, patch1, patch2, lattice node]
Implementation
Synchronization
[Sequence diagram. Participants: coarseP:SpTbPatch, fineP1:SpTbPatch, fineP2:SpTbPatch, g:Synchronizer, n:Synchronizer. Calls shown: calculate(), exchangeBlockData(), wait(), interpolation(). Time markers: Tc (once), Tf (twice).]
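The call sequence named in the synchronization diagram (calculate(), exchangeBlockData(), wait(), interpolation()) can be sketched with two barriers. This is an illustration under stated assumptions, not the presented implementation: the Patch class is a hypothetical stand-in for SpTbPatch that only counts calls, two fine steps (Tf) are assumed per coarse step (Tc), n_sync is assumed to couple the two fine patches, and g_sync is assumed to couple both grid levels and trigger the interpolation.

```python
import threading

REFINEMENT = 2  # assumption: two fine steps (Tf) per coarse step (Tc)


class Patch:
    """Hypothetical stand-in for SpTbPatch; it only counts calls."""

    def __init__(self, name):
        self.name = name
        self.calculations = 0
        self.exchanges = 0

    def calculate(self):
        self.calculations += 1

    def exchange_block_data(self):
        self.exchanges += 1


def simulate(coarse_steps):
    coarse = Patch("coarseP")
    fine = [Patch("fineP1"), Patch("fineP2")]
    interpolations = []

    def interpolation():
        # Barrier action: runs in exactly one thread per coarse step,
        # after all patches have arrived and before any is released.
        interpolations.append(len(interpolations))

    n_sync = threading.Barrier(len(fine))  # fine patches only
    g_sync = threading.Barrier(len(fine) + 1, action=interpolation)

    def run_coarse():
        for _ in range(coarse_steps):
            coarse.calculate()
            coarse.exchange_block_data()
            g_sync.wait()

    def run_fine(patch):
        for _ in range(coarse_steps):
            for _ in range(REFINEMENT):
                patch.calculate()
                patch.exchange_block_data()
                n_sync.wait()
            g_sync.wait()

    threads = [threading.Thread(target=run_coarse)]
    threads += [threading.Thread(target=run_fine, args=(p,)) for p in fine]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return coarse, fine, len(interpolations)
```

Calling simulate(3) advances the coarse patch three times and each fine patch six times, with one interpolation per coarse step. Using the barrier action for the interpolation keeps it serialized without an extra lock.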
Validation
The flow past a cylinder
Schäfer and Turek 1996
[Figure: flow domain with inlet and outlet; boundary label U=V=0 (three occurrences); dimension labels 4.2r, 4.0r, 1.95r, 4.0r, 40.0r, 1.95r]
Fine grid: 128 x 688, 256 x 1376 or 512 x 2752 lattice nodes
Coarse grid: 128 x 688, 256 x 1376 or 512 x 2752 lattice nodes
Validation
Re = 20
Table labels on the slide: Parameter, resolution 128 x 688, SBB, QBB, Cd, Cl

Computed values, in slide order:
  Cd           Cl
  5.5972526    0.0042797
  5.5807648    0.0073446
  5.5778068    0.0107031
  5.5745483    0.0102097

Reference values:
  Source               Cd             Cl
  Crouse               5.585-5.627    0.017-0.0119
  Schäfer and Turek    5.57-5.59      0.0104-0.011
Δx = 1.0, Δt = 1.0
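The coefficients Cd and Cl compared on this slide are commonly defined by normalizing the force on the cylinder with the mean inflow velocity and the cylinder diameter. The sketch below uses that common definition; the function and parameter names are hypothetical, and how the force itself is evaluated in the presented code is not stated on the slides.

```python
def reynolds_number(u_mean, diameter, viscosity):
    """Re = U * D / nu, all quantities in the same unit system."""
    return u_mean * diameter / viscosity


def force_coefficients(fx, fy, rho, u_mean, diameter):
    """Return (Cd, Cl) for a 2D cylinder.

    fx, fy : force components on the cylinder (per unit depth)
    rho    : fluid density
    u_mean : mean inflow velocity
    """
    scale = 2.0 / (rho * u_mean ** 2 * diameter)
    return fx * scale, fy * scale


def viscosity_for(re, u_mean, diameter):
    """Viscosity that yields the requested Reynolds number."""
    return u_mean * diameter / re
```

For example, in lattice units one would pick u_mean and diameter first and then obtain the viscosity for Re = 20 from viscosity_for(20, u_mean, diameter).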
Performance Analysis
Parallel Efficiency
[Figure: parallel efficiency (vertical axis, 0 to 100) versus number of threads (horizontal axis, 0 to 16); curves: coarse, fine]
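Parallel efficiency is usually computed from measured run times as the speedup divided by the number of threads. The sketch below uses that textbook definition; the function names and the timing values in the usage note are hypothetical and are not measurements from this talk.

```python
def speedup(t_serial, t_parallel):
    """Speedup S(n) = T(1) / T(n)."""
    return t_serial / t_parallel


def parallel_efficiency(t_serial, t_parallel, n_threads):
    """Efficiency in percent: 100 * T(1) / (n * T(n))."""
    return 100.0 * t_serial / (n_threads * t_parallel)
```

With a hypothetical single-thread time of 80 s and an 8-thread time of 12.5 s, parallel_efficiency(80.0, 12.5, 8) evaluates the ratio 80 / (8 * 12.5) as a percentage.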
Performance Analysis
Scale up
[Figure: scale-up versus number of threads (both axes 0 to 16); axis label: scaleup * number of threads; legend: Intel Xeon E5520 2.26GHz (uniform), Intel Xeon E5520 2.26GHz (non-uniform); curves: coarse, fine]
Performance Analysis
Limitation by Memory Bandwidth:
[Bar chart, vertical axis 0 to 120; series: FLOPS peak performance, %; Load of BW, %]
Configurations:
  Intel Core2 Quad Q8200 2.38GHz, DDR2-6400, 4 threads
  Intel Core2 Quad Q8200 2.38GHz, DDR2-6400, 4 threads
  2xIntel Xeon E5520 2.26GHz, DDR3-1066, 8 threads
  2xIntel Xeon E5520 2.26GHz, DDR3-1066, 2 threads
Bar values, in slide order: 29.9, 53.7, 75.2, 52.4, 73.6, 40.6, 44.8, 35.2
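A rough way to relate lattice update rates to memory bandwidth load is to count the bytes moved per node update. The sketch below is an estimate under explicit assumptions, not the method used for the chart: it assumes a D2Q9 node stores nine double-precision values, that each update reads and writes all nine, and that a write-allocate cache policy adds one further read per written value. The function names, the MLUPS metric (million lattice updates per second), and the numbers in the usage note are hypothetical.

```python
Q = 9                # distributions per D2Q9 node
BYTES_PER_VALUE = 8  # assumption: double precision


def bytes_per_update(write_allocate=True):
    """Memory traffic per lattice node update in bytes.

    One read and one write per distribution, plus one extra read
    per written value if the cache uses write-allocate (assumption).
    """
    transfers = 3 if write_allocate else 2
    return Q * BYTES_PER_VALUE * transfers


def bandwidth_load_percent(mlups, peak_bw_gb_s, write_allocate=True):
    """Share of the peak memory bandwidth consumed, in percent."""
    traffic_gb_s = mlups * 1e6 * bytes_per_update(write_allocate) / 1e9
    return 100.0 * traffic_gb_s / peak_bw_gb_s
```

For a hypothetical 50 MLUPS on a machine with 21.6 GB/s peak bandwidth, bandwidth_load_percent(50, 21.6) gives the load under the write-allocate assumption.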
Outlook
Support of MPI
Support of varying hierarchic setups
Acknowledgments
The German BMBF for funding the SKALB project (Lattice-Boltzmann-Methoden für skalierbare Multi-Physik-Anwendungen), reference ID 01IH08003E.
The DFG for funding under grant GE 1990/2-1 (Konsistente Multiskalenströmungsmechanik mit der kaskadierten Lattice Boltzmann Methode).