
Developing and building OpenMP programs


An excellent tutorial is available at http://www.idris.fr/formations/openmp/ (look for the English version).
Only a few tips and examples are given here, because the IDRIS tutorial is more than enough to learn OpenMP.

Resources :

Beware: calling random_number inside OpenMP code results in extremely slow execution. Only one random_number call (C/Fortran) can run at a time on a single socket (not on a single core!), so concurrent calls end up serialized. A good way to work around this issue is to use Marsaglia’s Ziggurat algorithm. An OpenMP example can be found here: http://people.sc.fsu.edu/~jburkardt/f_src/ziggurat_openmp/ziggurat_openmp.html
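The idea behind that workaround is that each thread owns its generator state, so threads never wait on a shared one. The sketch below illustrates this in C. It is not the Ziggurat algorithm from the link: it swaps in a small xorshift64* generator, and the function names and the seeding scheme are hypothetical choices made for this illustration. It makes no claim about the statistical independence of the per-thread streams.

```c
#include <stddef.h>
#include <stdint.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* One xorshift64* step: advances *state and returns 64 pseudo-random bits.
   The state must never be zero. */
static uint64_t xorshift64star(uint64_t *state)
{
    uint64_t x = *state;
    x ^= x >> 12;
    x ^= x << 25;
    x ^= x >> 27;
    *state = x;
    return x * 0x2545F4914F6CDD1DULL;
}

/* Fill out[0..n-1] with uniform doubles in [0, 1).
   Each thread keeps a private generator state, so nothing is shared
   between threads while numbers are drawn. */
void fill_uniform(double *out, size_t n, uint64_t seed)
{
    #pragma omp parallel
    {
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        /* Hypothetical seeding scheme: mix the thread id into the seed.
           The "| 1" guarantees a nonzero state. */
        uint64_t state =
            (seed + 0x9E3779B97F4A7C15ULL * (uint64_t)(tid + 1)) | 1ULL;

        #pragma omp for
        for (size_t i = 0; i < n; i++) {
            /* Keep the top 53 bits and scale by 2^-53. */
            out[i] = (double)(xorshift64star(&state) >> 11)
                     * (1.0 / 9007199254740992.0);
        }
    }
}
```

Calling fill_uniform(buffer, n, 42) from a main function fills the buffer in parallel. The values depend on the number of threads, because each thread draws from its own stream.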

Examples

(Many thanks to Adrien Cassagne, who translated the heat equation and conjugate gradient examples to C.)

Compile Examples

Using GCC

gcc -fopenmp myprogramme.c -o myprogramme.exe

The same command works with g++ (.cpp files) and gfortran (.f and .f90 files). Then set the desired number of threads (by default, the system uses the number of available logical cores). Here, we ask for 2 threads:

export OMP_NUM_THREADS=2

You can now launch the program as usual:

./myprogramme.exe

Note: OMP_NUM_THREADS must be set again in each new terminal/console, because export only affects the current shell session.
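To check that the build and the OMP_NUM_THREADS setting behave as described, a small test file can help. The following is a minimal sketch; the function names are hypothetical. The _OPENMP guards let the same file compile without -fopenmp, in which case the pragmas are ignored and the code runs on one thread.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Open a parallel region and return how many threads took part in it. */
int count_team_threads(void)
{
    int nthreads = 1;
    #pragma omp parallel
    {
#ifdef _OPENMP
        /* Only one thread writes the shared variable. */
        #pragma omp single
        nthreads = omp_get_num_threads();
#endif
    }
    return nthreads;
}

/* Sum the integers 1..n with a parallel loop and a reduction. */
long parallel_sum(long n)
{
    long total = 0;
    #pragma omp parallel for reduction(+:total)
    for (long i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}
```

Saved as myprogramme.c together with a main function that prints both results, this builds with the gcc command above. With OMP_NUM_THREADS=2, count_team_threads() would normally return 2, and parallel_sum(100) returns 5050 whatever the thread count.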

Using Intel

icc -openmp myprogramme.c -o myprogramme.exe

The same command works with icpc (.cpp files) and ifort (.f and .f90 files). Note that newer Intel compiler releases use -qopenmp in place of -openmp. Then set the desired number of threads (by default, the system uses the number of available logical cores). Here, we ask for 2 threads:

export OMP_NUM_THREADS=2

You can now launch the program as usual:

./myprogramme.exe

Note: OMP_NUM_THREADS must be set again in each new terminal/console, because export only affects the current shell session.

Binding threads

Get information

Information on the node:

> numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 8 9 10 11
node 0 size: 18423 MB
node 0 free: 17137 MB
node 1 cpus: 4 5 6 7 12 13 14 15
node 1 size: 18432 MB
node 1 free: 17479 MB
node distances:
node 0 1
0: 10 20
1: 20 10 

Bind threads

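Intel compiler only. The settings below pin 4 threads to logical CPUs 0, 1, 4 and 5, that is, two on each NUMA node in the numactl output above: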

export OMP_NUM_THREADS=4
export KMP_AFFINITY=verbose,granularity=fine,proclist=[0,1,4,5],explicit
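One way to check the resulting placement from inside a program is to ask each thread which logical CPU it is running on. The sketch below assumes Linux with glibc, since sched_getcpu() is a GNU extension. The function name record_thread_cpus is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* must come before any include to expose sched_getcpu() */
#endif
#include <sched.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Store in cpu_of_thread[t] the logical CPU on which thread t is running.
   Returns the number of entries written, at most max_threads. */
int record_thread_cpus(int *cpu_of_thread, int max_threads)
{
    int nthreads = 1;
    #pragma omp parallel
    {
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
        #pragma omp single
        nthreads = omp_get_num_threads();
#endif
        if (tid < max_threads) {
            /* Linux/glibc-specific call; returns -1 on failure. */
            cpu_of_thread[tid] = sched_getcpu();
        }
    }
    return nthreads < max_threads ? nthreads : max_threads;
}
```

With the KMP_AFFINITY settings above and a program built with the Intel compiler, the recorded CPUs would be expected to come from the list 0, 1, 4, 5. Without binding, the values may change from one call to the next as the operating system migrates threads.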