GPU Acceleration of Electromagnetic Solutions in FEKO
Recently, high-performance computing has been boosted by the parallel processing capabilities of multi-core GPUs (graphics processing units). Even fairly standard graphics cards found in typical engineering desktop PCs have computing capabilities exceeding those of high-end CPUs.
EM Software & Systems – S.A. (Pty) Ltd (EMSS) is proud to announce that, from FEKO Suite 6.0 (July 2010), GPU and mixed CPU/GPU processing will be integrated for the run-time critical phases of the solution process in some of the FEKO solvers.
For moderately sized to large Method of Moments (MoM) problems, the most run-time critical phase is the solution of the system of linear equations. Even though FEKO employs libraries that are highly optimised for the various CPUs in this phase, GPUs can beat this performance by more than one order of magnitude.
As an example, consider radar scattering from a metallic object at a single frequency. The following figures give the performance in GFLOPS (billions of floating-point operations per second) of the matrix solution phase alone (i.e. excluding the time for matrix setup, near- and far-field calculations, etc.) when solving this class of problem with the MoM. Measured performance is given for problem sizes ranging from 2000 to 20000 unknowns; a higher value means that a problem of a given size is solved more quickly.
Figure 1: GFLOPS performance (double precision) in FEKO for the MoM solution phase of solving the system of linear equations, using different Intel CPUs and NVIDIA graphics cards (GPUs).
Figure 2: GFLOPS performance (single precision) in FEKO for the MoM solution phase of solving the system of linear equations, using different Intel CPUs and NVIDIA graphics cards (GPUs).
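To make the GFLOPS figures easier to relate to run time, the following minimal sketch converts a measured solve time into GFLOPS. It assumes that the matrix solution phase is dominated by a dense LU factorisation of a complex N x N matrix, which costs roughly (8/3) N^3 real floating-point operations. The text does not state how FEKO counts operations, so the operation count, the function name lu_gflops and the example timing are illustrative assumptions, not measured values.

```python
def lu_gflops(num_unknowns, solve_seconds):
    """Estimate GFLOPS for a dense complex LU factorisation.

    Assumption: about (8/3) * N**3 real floating-point operations, i.e. the
    usual (2/3) * N**3 count for a real matrix times 4, because one complex
    multiply-add costs 8 real operations instead of 2.
    """
    if solve_seconds <= 0:
        raise ValueError("solve_seconds must be positive")
    flops = (8.0 / 3.0) * float(num_unknowns) ** 3
    return flops / solve_seconds / 1.0e9


# Hypothetical timing chosen only to show the arithmetic, not a FEKO result.
example_gflops = lu_gflops(20000, 100.0)
print(f"{example_gflops:.1f} GFLOPS")
```

Because the operation count grows with N^3, doubling the number of unknowns at the same GFLOPS rate increases the solution time by about a factor of eight.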
The advantage of GPU processing for the matrix solution phase is clear. The results also show that the single precision solution is about twice as fast as the double precision one. Note also that the FEKO implementation uses a block-based algorithm, which also allows MoM problems to be solved when the matrix memory exceeds the memory available on the GPU.
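The memory constraint is easy to quantify: a dense matrix with 20000 unknowns holds 4 x 10^8 complex entries, which is 6.4 GB in double precision (16 bytes per entry) and 3.2 GB in single precision. The sketch below illustrates the general idea of a block-based factorisation in plain Python. It is not FEKO's algorithm, which the text does not describe: it is a textbook right-looking blocked LU without pivoting, and the names blocked_lu and multiply_lu are hypothetical. The point is that each step touches only one column panel and one block of the trailing matrix, so in a GPU implementation only those pieces would need to be resident on the device at any time.

```python
def blocked_lu(a, block):
    """In-place right-looking blocked LU without pivoting (illustration only).

    `a` is a square matrix as a list of lists. On return it holds the unit
    lower-triangular factor L below the diagonal and U on and above it.
    """
    n = len(a)
    for k0 in range(0, n, block):
        k1 = min(k0 + block, n)
        # Step 1: factor the current column panel (columns k0..k1-1, all rows).
        for k in range(k0, k1):
            for i in range(k + 1, n):
                a[i][k] /= a[k][k]
                for j in range(k + 1, k1):
                    a[i][j] -= a[i][k] * a[k][j]
        # Step 2: update the row block to the right (forward substitution
        # with the unit lower-triangular diagonal block).
        for k in range(k0, k1):
            for i in range(k + 1, k1):
                for j in range(k1, n):
                    a[i][j] -= a[i][k] * a[k][j]
        # Step 3: update the trailing matrix with a block matrix product.
        # In a GPU code this product would be the bulk of the work.
        for i in range(k1, n):
            for j in range(k1, n):
                acc = 0j
                for k in range(k0, k1):
                    acc += a[i][k] * a[k][j]
                a[i][j] -= acc
    return a


def multiply_lu(lu):
    """Rebuild L * U from the packed factors to check the factorisation."""
    n = len(lu)
    out = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = 0j
            for k in range(min(i, j) + 1):
                l_ik = 1.0 if k == i else lu[i][k]
                acc += l_ik * lu[k][j]
            out[i][j] = acc
    return out


# Small diagonally dominant complex test matrix, so no pivoting is needed.
n = 5
a = [[(10 + 0j) if i == j else complex(1.0 / (1 + abs(i - j)), 0.1 * (i - j))
      for j in range(n)] for i in range(n)]
lu = blocked_lu([row[:] for row in a], block=2)
rebuilt = multiply_lu(lu)
max_error = max(abs(rebuilt[i][j] - a[i][j]) for i in range(n) for j in range(n))
print("max reconstruction error:", max_error)
```

A production solver would add pivoting for numerical stability and would overlap data transfers with computation; both are omitted here to keep the block structure visible.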