GPGPU Acceleration of the FDTD Calculation


Hideaki Taniyama, Takashi Shimokawabe*, Takayuki Aoki*, and Masaya Notomi
Optical Science Laboratory, *Tokyo Institute of Technology

 The finite-difference time-domain (FDTD) method for solving Maxwell's equations is widely used and is recognized as a powerful tool in the study of optical nanostructures. In general, an FDTD simulation requires a large amount of memory and a long computing time, which limits its use for large structures. Recently, general-purpose computing on graphics processing units (GPGPU), which offers massive parallelism and high memory bandwidth, has been adopted for acceleration in some supercomputers. We use a GPU as an accelerator to speed up the FDTD computation and overcome this limitation.
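 To make the structure of the time-integration loop concrete, the following is a minimal one-dimensional sketch of the leapfrog field update in normalized units. It is not the three-dimensional double-precision code used in this work; the function name `fdtd_1d`, the grid size, the Courant number, and the Gaussian source parameters are all illustrative assumptions.

```python
import math


def fdtd_1d(num_cells=200, num_steps=150, courant=0.5):
    """Leapfrog update of Ez and Hy on a 1D Yee grid (normalized units).

    All parameter values are illustrative assumptions, not settings
    taken from the 3D simulation described in the text.
    """
    ez = [0.0] * num_cells        # electric field at integer grid points
    hy = [0.0] * (num_cells - 1)  # magnetic field at half-integer points
    source = num_cells // 2       # hypothetical source position

    for n in range(num_steps):
        # Update H from the spatial difference of E.
        for i in range(num_cells - 1):
            hy[i] += courant * (ez[i + 1] - ez[i])

        # Update E from the spatial difference of H.
        # The two end cells stay at zero (perfect-conductor walls).
        for i in range(1, num_cells - 1):
            ez[i] += courant * (hy[i] - hy[i - 1])

        # Soft Gaussian source added to the electric field.
        ez[source] += math.exp(-(((n - 30.0) / 10.0) ** 2))

    return ez, hy


if __name__ == "__main__":
    ez, _ = fdtd_1d()
    print("peak |Ez| =", max(abs(v) for v in ez))
```

 Every cell is updated from its nearest neighbors only, and each update performs very little arithmetic per value loaded, which is why the method parallelizes well and tends to be limited by memory traffic rather than by floating-point throughput.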
 The time-integration step of the electromagnetic fields in the FDTD calculation requires a large number of memory accesses. The bottleneck of the FDTD calculation is the slow data transfer between main memory and the CPU. In contrast, the memory bandwidth between a GPU and its VRAM is high, which suggests that the computation can be greatly accelerated by using a GPU. To take full advantage of this high memory bandwidth, we must minimize data transfer between the CPU and the GPU, because such transfers take up much of the computation time and counteract the speedup provided by the GPU.
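 The effect of host-device transfers can be illustrated with a rough cost model. The sketch below assumes a purely memory-bound update in which each of the six double-precision field components is read once and written once per time step, and it ignores neighbor reads, material coefficients, and latency. The function names and the bandwidth figures in the usage example are illustrative placeholders, not measured values for the hardware used here.

```python
BYTES_PER_DOUBLE = 8
FIELD_COMPONENTS = 6  # Ex, Ey, Ez, Hx, Hy, Hz


def field_bytes(num_cells):
    """Memory occupied by all field components, in bytes."""
    return num_cells * FIELD_COMPONENTS * BYTES_PER_DOUBLE


def update_time(num_cells, memory_bandwidth):
    """Time for one step, assuming one read and one write per component.

    memory_bandwidth is in bytes per second.
    """
    return 2 * field_bytes(num_cells) / memory_bandwidth


def run_time(num_cells, num_steps, memory_bandwidth,
             link_bandwidth, steps_per_transfer):
    """Total time when all fields make a round trip between host and
    device once every `steps_per_transfer` steps (assumed model)."""
    compute = num_steps * update_time(num_cells, memory_bandwidth)
    num_transfers = num_steps // steps_per_transfer
    transfer = num_transfers * 2 * field_bytes(num_cells) / link_bandwidth
    return compute + transfer


if __name__ == "__main__":
    cells = 100 ** 3   # hypothetical grid
    steps = 1000
    device_bw = 100e9  # placeholder device memory bandwidth, bytes/s
    link_bw = 5e9      # placeholder host-device link bandwidth, bytes/s
    print("transfer every step:", run_time(cells, steps, device_bw, link_bw, 1))
    print("transfer once      :", run_time(cells, steps, device_bw, link_bw, steps))
```

 Under these assumptions, copying the fields across the link at every step costs far more than the update itself, whereas keeping the fields resident in VRAM and copying them once leaves the run time dominated by the on-device update.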
 We use CUDA 3.1, NVIDIA's development environment for GPGPU programming. For the GPUs, we use NVIDIA's Tesla C1060 and GeForce GTX 480, and for the CPU we use a Xeon W3580 (3.33 GHz). To estimate the acceleration achieved by our code, we simulate a photonic crystal slab cavity and calculate the confined electromagnetic field profile of the resonant mode (see Fig. 1) using a three-dimensional FDTD program. All calculations are performed in double precision [1]. Figure 2 shows the elapsed time for the CPU alone and for the CPU with a GPU at several simulation sizes. With a GPU, our program achieves a speedup of about 18 times over the CPU code. The speedup of about 2.5 times of the GTX 480 over the Tesla C1060 can be attributed to the differences in memory bandwidth and L1 cache.

[1] H. Taniyama et al., PIERS 2011, 1A9-K-14, March (2011).

Fig. 1. (a) Photonic crystal slab cavity and (b) field profile of the resonant mode.
Fig. 2. Elapsed time of the FDTD calculation for the CPU and for the CPU with a GPU.
