TY  - CONF
AU  - Kaczmarek, O.
AU  - Schmidt, C.
AU  - Steinbrecher, P.
AU  - Wagner, M.
TI  - Conjugate gradient solvers on Intel Xeon Phi and NVIDIA GPUs
IS  - arXiv:1411.4439
CY  - Hamburg
PB  - Deutsches Elektronen-Synchrotron, DESY
M1  - PUBDB-2015-05358
M1  - arXiv:1411.4439
M1  - DESY-PROC-2014-05/28
SP  - 157-162
PY  - 2015
AB  - Lattice Quantum Chromodynamics simulations typically spend most of the runtime in inversions of the Fermion Matrix. This part is therefore frequently optimized for various HPC architectures. Here we compare the performance of the Intel® Xeon Phi™ to current Kepler-based NVIDIA® Tesla™ GPUs running a conjugate gradient solver. By exposing more parallelism to the accelerator through inverting multiple vectors at the same time, we obtain a performance greater than 300 GFlop/s on both architectures. This more than doubles the performance of the inversions. We also give a short overview of the Knights Corner architecture, discuss some details of the implementation and the effort required to obtain the achieved performance.
T2  - GPU Computing in High-Energy Physics
CY  - 10 Sep 2014 - 12 Sep 2014, Pisa (Italy)
Y2  - 10 Sep 2014 - 12 Sep 2014
M2  - Pisa, Italy
KW  - lattice field theory (INSPIRE)
KW  - quantum chromodynamics (INSPIRE)
KW  - numerical calculations (INSPIRE)
KW  - multiprocessor: graphics (INSPIRE)
KW  - programming (INSPIRE)
KW  - performance (autogen)
KW  - accelerator (autogen)
LB  - PUB:(DE-HGF)8 ; PUB:(DE-HGF)15
DO  - 10.3204/DESY-PROC-2014-05/28
UR  - https://bib-pubdb1.desy.de/record/291386
ER  -