The 35th International Symposium on Lattice Field Theory

18-24 June 2017

Palacio de Congresos

Europe/Madrid timezone

Contribution Parallel

19/6/2017 16:40 - 17:00

Seminarios 8

Algorithms and Machines

Optimization of the Brillouin operator on the KNL architecture

Speakers

Dr. Stephan DURR

Primary authors

Dr. Stephan DURR (University of Wuppertal)

Files

Slides
- durr_lat17_talk-97-397.pdf

Content

Experiences with optimizing the matrix-times-vector application of the Brillouin operator on the Intel KNL processor are reported. Without any adjustments to the memory layout, performance figures of 300 Gflop/s in single precision (sp) and 230 Gflop/s in double precision (dp) are observed. These figures are obtained with Nc=3 colors, Nv=12 right-hand sides, and Nthr=256 threads on lattices of size 32^3*64, using exclusively OpenMP pragmas. Interestingly, the same routine also performs quite well on standard Intel Core i7 architectures. Time permitting, some observations on the much harder problem of optimizing the Wilson fermion matrix-times-vector application will be added.
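To illustrate the pragma-only approach with several right-hand sides, here is a minimal sketch in C. It is not the routine from the talk: the operator is a hypothetical one-dimensional periodic 3-point stencil standing in for the much larger four-dimensional Brillouin stencil, color and spin structure are omitted, and the function name `stencil_multi_rhs` and the layout (right-hand-side index running fastest) are assumptions made for this example. Only the value NV=12 is taken from the abstract.

```c
#include <assert.h>
#include <stddef.h>

#define NV 12  /* right-hand sides per site (Nv=12 in the abstract) */

/*
 * Hypothetical stand-in operator on a periodic 1D lattice of n sites:
 *   out[x] = 2*in[x] - in[x-1] - in[x+1]
 * applied to NV right-hand sides at once.
 * Assumed layout: element (x, v) is stored at index x*NV + v, so the
 * right-hand-side index v is contiguous in memory.
 */
void stencil_multi_rhs(size_t n, const float *restrict in, float *restrict out)
{
    assert(n > 0);

    /* Sites are independent, so they are distributed over threads. */
    #pragma omp parallel for
    for (long x = 0; x < (long)n; x++) {
        size_t xc = (size_t)x;
        size_t xm = (xc + n - 1) % n;  /* periodic backward neighbor */
        size_t xp = (xc + 1) % n;      /* periodic forward neighbor  */

        /* The contiguous loop over right-hand sides is offered to the
           compiler for vectorization. */
        #pragma omp simd
        for (int v = 0; v < NV; v++) {
            out[xc * NV + v] = 2.0f * in[xc * NV + v]
                             - in[xm * NV + v]
                             - in[xp * NV + v];
        }
    }
}
```

Compiled without OpenMP support, the pragmas are ignored and the function still gives the same result serially, which is one practical attraction of relying on pragmas alone. The choice of letting the right-hand-side index run fastest is only one possible layout; the abstract does not say which layout the actual routine uses.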