Parallelization on GPU with Intel compiler

Wow… I have been using OpenMP heavily for 25 years, and only today did I discover that, inside a parallel region, all loop indices are private by default, even those of loops that are not parallelized!

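use omp_lib
use iso_fortran_env
implicit none
integer :: i, k
integer(int64) :: iloc, kloc

! Addresses of the original variables (loc is a common compiler extension)
iloc = loc(i)
kloc = loc(k)

!$omp parallel num_threads(2)
! A zero offset means the thread sees the original variable
print*, omp_get_thread_num(), "i:", loc(i)-iloc
print*, omp_get_thread_num(), "k:", loc(k)-kloc
! Sequential loop (no "omp do"): every thread runs it with k as its index
do k = 1, 10
   continue
end do
!$omp end parallel

end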

Output with gfortran (similar with ifx); a nonzero offset means the thread has its own copy of the variable:

           0 i:                    0
           0 k:                  -92
           1 i:                    0
           1 k:        -327285035964
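
A practical consequence, as a minimal sketch rather than anything definitive: since k is private, each thread should be able to run the whole sequential loop without racing on the index. The program name and the variables total and nthreads are my own additions for illustration, and num_threads(2) is only a request, so the actual team size is recorded inside the region.

```fortran
program seq_loop_index_demo
use omp_lib
implicit none
integer :: k, total, nthreads

k = -1          ! sentinel value set before the parallel region
total = 0
nthreads = 1

!$omp parallel num_threads(2) reduction(+:total)
!$omp single
nthreads = omp_get_num_threads()   ! team size actually granted
!$omp end single
! Sequential loop (no "omp do"): every thread runs all 10 iterations
! with its own private copy of k
do k = 1, 10
   total = total + 1
end do
!$omp end parallel

print *, "threads:", nthreads, " iterations counted:", total
print *, "k after the region:", k
end program seq_loop_index_demo
```

Compiled with OpenMP enabled (for example -fopenmp with gfortran), I would expect the counted iterations to be 10 times the number of threads. If k were shared instead, the threads would be updating one index concurrently and the count would no longer be predictable. I have not checked what the specification guarantees about the value of k after the region, so that last print is there only to look at.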