I recently came across an interesting code that I thought might be illuminating to improve Fortran standard specification about associate
statement in OpenMP/do concurrent
.
It seems that the Fortran standard is silent on whether the associate
structure is private
or shared
in OpenMP/do concurrent
do loops. I think it would make more sense for it to be private
, same as the block
statement.
The following code has different results in gfortran
and ifort/ifx
:
> cat main.f90
program main
use omp_lib
implicit none
integer :: ints(4) = [1, 2, 3, 4], i
!$omp parallel do schedule(dynamic)
do i = 1, 4
associate (x => ints(i))
print *, "loc = ", loc(x), ", x = ", x, ", omp_get_thread_num() = ", omp_get_thread_num()
end associate
end do
end program main
> gfortran -fopenmp main.f90 && . /a
loc = 140697567571968 , x = 1 , omp_get_thread_num() = 0
loc = 140697567571972 , x = 2 , omp_get_thread_num() = 1
loc = 140697567571980 , x = 4 , omp_get_thread_num() = 3
loc = 140697567571976 , x = 3 , omp_get_thread_num() = 6
> ifort /Qopenmp /Z7 main.f90 && . /a
loc = 140700606185484 , x = 4 , omp_get_thread_num() = 2
loc = 140700606185484 , x = 4 , omp_get_thread_num() = 3
loc = 140700606185484 , x = 4 , omp_get_thread_num() = 5
loc = 140700606185484 , x = 4 , omp_get_thread_num() = 1
As you can see from the storage address of x
, gfortran
considers x
to be private
(which makes more sense to me) and ifort/ifx
considers x
to be publicly shared (see the intel fortran forum for the same report).
What do you all think about this, and if you find the corresponding description in the Fortran standard, please let me know, thanks in advance.
ChatGPT let me know:
In OpenMP, the scope of the entity (in this case
x
) in theassociate
block isprivate
. This is because theassociate
block creates a new scope in Fortran that contains variables that can only be accessed within thatassociate
block, not externally. Therefore, each thread will have its ownassociate
block and its own x entity.
This design helps to ensure thread safety because each thread is manipulating its ownx
entity without affecting thex
’s of other threads, which avoids data contention problems common in parallel computing.
As a result, you can safely useassociate
statements in OpenMP loops.
In OpenMP, the data-sharing attributes (for example, whether they are
private
orshared
) of the associated name (in this case,x
) in theassociate
statement block are determined based on the data-sharing attributes of the original entity or expression. In your example,ints(i)
is an array element whose data sharing properties depend on the data sharing properties of the arrayints
. In your example, you don’t explicitly specify the data-sharing property of arrayints
, so by default, it isshared
.
As an aside, I have two other examples to provoke thought:
> cat demo1.f90
program main
use omp_lib
implicit none
integer :: ints(4) = [1, 2, 3, 4], i
!$omp parallel do schedule(dynamic) private(ints)
do i = 1, 4
associate (x => ints(i))
print *, "loc = ", loc(x), ", x = ", x, ", omp_get_thread_num() = ", omp_get_thread_num()
end associate
end do
end program main
> gfortran -fopenmp demo2.f90 && . /a
loc = 32374240 , x = 14234872 , omp_get_thread_num() = 7
loc = 28179940 , x = 0 , omp_get_thread_num() = 5
loc = 6290216 , x = 14226000 , omp_get_thread_num() = 0
loc = 26082796 , x = 32763 , omp_get_thread_num() = 4
> ifort /Qopenmp /Z7 demo2.f90 && . /a
loc = 20707964 , x = 0 , omp_get_thread_num() = 1
loc = 24967796 , x = 0 , omp_get_thread_num() = 2
loc = 29227632 , x = 0 , omp_get_thread_num() = 3
loc = 1373944 , x = 1 , omp_get_thread_num() = 0
> cat demo2.f90
program main
use omp_lib
implicit none
integer :: ints(4) = [1, 2, 3, 4], i
!$omp parallel do schedule(dynamic)
do i = 1, 4
associate (x => get_x(ints(i)))
print *, "loc = ", loc(x), ", x = ", x, ", omp_get_thread_num() = ", omp_get_thread_num()
end associate
end do
contains
integer function get_x(y) result(x)
integer, intent(in) :: y
x = y + 1
end function get_x
end program main
> gfortran -fopenmp demo2.f90 && . /a
loc = 29490688 , x = 2 , omp_get_thread_num() = 5
loc = 6290208 , x = 3 , omp_get_thread_num() = 0
loc = 23199232 , x = 4 , omp_get_thread_num() = 2
loc = 27393536 , x = 5 , omp_get_thread_num() = 4
> ifort /Qopenmp /Z7 demo2.f90 && . /a
loc = 1375832 , x = 5 , omp_get_thread_num() = 4
loc = 1375832 , x = 5 , omp_get_thread_num() = 3
loc = 1375832 , x = 5 , omp_get_thread_num() = 2
loc = 1375832 , x = 5 , omp_get_thread_num() = 0