====== Developing and building OpenMP programs ======

An excellent tutorial can be found here (look for the English version): [[http://www.idris.fr/formations/openmp/]]

Only a few tips and examples are provided here, because the IDRIS tutorial is more than enough to learn OpenMP.

Resources:

  * [[https://software.intel.com/en-us/articles/using-kmp-affinity-to-create-openmp-thread-mapping-to-os-proc-ids]]
  * [[http://sepwww.stanford.edu/sep/claudio/Research/Prst_ExpRefl/ShtPSPI/intel/cce/10.1.015/doc/main_cls/mergedProjects/optaps_cls/common/optaps_openmp_thread_affinity.htm]]

Beware: using random_number in OpenMP will result in extremely slow code execution. Only one random_number call (C/Fortran) can run at a time on a single socket (not a core!), so concurrent calls from several threads are serialized. A good way to bypass this issue is to use Marsaglia’s Ziggurat algorithm. An OpenMP example can be found here: http://people.sc.fsu.edu/~jburkardt/f_src/ziggurat_openmp/ziggurat_openmp.html

===== Examples =====

(Many thanks to Adrien Cassagne, who translated the heat equation and the conjugate gradient examples to C.)

  * [[software:development:openmp:helloworld|Fortran and C: Hello World!]]
  * [[software:development:openmp:pr|Fortran and C: parallel region]]
  * [[software:development:openmp:variables|Fortran and C: shared and private variables]]
  * [[software:development:openmp:allocation|Fortran and C: memory allocation]]
  * [[software:development:openmp:atom|Fortran and C: atomic operations]]
  * [[software:development:openmp:exclu|Fortran and C: exclusive operations (single/master)]]
  * [[software:development:openmp:sections|Fortran and C: sections]]
  * [[software:development:openmp:workshare|Fortran and C: workshare on do_f/for_c loop]]
  * [[software:development:openmp:reduction|Fortran and C: reduction]]
  * [[software:development:openmp:subroutines|Fortran and C: subroutines]]
  * [[software:development:openmp:heat2D|Fortran and C: 2D heat equation example]]
  * [[software:development:openmp:cg|Fortran and C: conjugate gradient (1D heat equation) example]]

===== Compile Examples =====

==== Using GCC ====

  gcc -fopenmp myprogramme.c -o myprogramme.exe

The same command works with g++ (.cpp files) and gfortran (.f and .f90 files).

Then set the desired number of threads (by default, the runtime uses the number of available logical cores). Here, we request 2 threads:

  export OMP_NUM_THREADS=2

You can now launch the program as usual:

  ./myprogramme.exe

Note: OMP_NUM_THREADS is an environment variable, so you need to set it again in each new terminal/console you use.

==== Using Intel ====

  icc -openmp myprogramme.c -o myprogramme.exe

The same command works with icpc (.cpp files) and ifort (.f and .f90 files). Recent Intel compiler releases deprecate -openmp in favour of -qopenmp.

Then set the desired number of threads (by default, the runtime uses the number of available logical cores). Here, we request 2 threads:

  export OMP_NUM_THREADS=2

You can now launch the program as usual:

  ./myprogramme.exe

Note: OMP_NUM_THREADS is an environment variable, so you need to set it again in each new terminal/console you use.

===== Binding threads =====

==== Get information ====

Information on the node:

  > numactl --hardware
  available: 2 nodes (0-1)
  node 0 cpus: 0 1 2 3 8 9 10 11
  node 0 size: 18423 MB
  node 0 free: 17137 MB
  node 1 cpus: 4 5 6 7 12 13 14 15
  node 1 size: 18432 MB
  node 1 free: 17479 MB
  node distances:
  node   0   1
    0:  10  20
    1:  20  10

==== Bind threads ====

Intel compiler only:

  export OMP_NUM_THREADS=4
  export KMP_AFFINITY=verbose,granularity=fine,proclist=[0,1,4,5],explicit