I’ve encountered a behavior of DO CONCURRENT with the Intel compiler that I find unexpected. There is already a long thread on the Intel forum ("Re: do concurrent broken with openmp", Intel Community), so far without a conclusion. Here I summarize my current understanding.
The issue can be demonstrated with the help of the following example.
```fortran
program test_do_concurrent
  implicit none

  print*, b([1,2])

contains

  function b(a)
    integer, dimension(2,2) :: b
    integer, intent(in), dimension(2) :: a
    integer :: i,j

    do concurrent(i=1:2, j=1:2)
      b(i,j) = a(2) * i * j
    enddo
  end function b

end program test_do_concurrent
```
I would expect that it prints
2 4 4 8
and that is what I get from gfortran and ifort, unless I compile with ifort and `-qopenmp`, in which case I get
0 0 0 0
According to Intel support, this is OK because my code has unspecified behavior if `a` (more specifically `a(2)`) is not defined in the loop.
The relevant aspects of the standard are
11.1.7.5 Additional semantics for DO CONCURRENT constructs
- The locality of a variable that appears in a DO CONCURRENT construct is LOCAL, LOCAL_INIT, SHARED, or unspecified. A construct or statement entity of a construct or statement within the DO CONCURRENT construct has SHARED locality if it has the SAVE attribute. If it does not have the SAVE attribute, it is a different entity in each iteration, similar to LOCAL locality.
- A variable that has LOCAL or LOCAL_INIT locality is a construct entity with the same type, type parameters, and rank as the variable with the same name in the innermost executable construct or scoping unit that includes the DO CONCURRENT construct, and the outside variable is inaccessible by that name within the construct. The construct entity has the ASYNCHRONOUS, CONTIGUOUS, POINTER, TARGET, or VOLATILE attribute if and only if the outside variable has that attribute; it does not have the BIND, INTENT, PROTECTED, SAVE, or VALUE attribute, even if the outside variable has that attribute. If it is not a pointer, it has the same bounds as the outside variable. At the beginning of execution of each iteration,
- If a variable has unspecified locality,
• if it is referenced in an iteration it shall either be previously defined during that iteration, or shall not be defined or become undefined during any other iteration; if it is defined or becomes undefined by more than one iteration it becomes undefined when the loop terminates;
C1128: A variable-name that appears in a LOCAL or LOCAL_INIT locality-spec shall not have the ALLOCATABLE, INTENT (IN), or OPTIONAL attribute, shall not be of finalizable type, shall not be a nonpointer polymorphic dummy argument, and shall not be a coarray or an assumed-size array. A variable-name that is not permitted to appear in a variable definition context shall not appear in a LOCAL or LOCAL_INIT locality-spec.
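To make the quoted rule on unspecified locality concrete, here is a minimal sketch of how I read it. The names `factor`, `tmp`, and `x` are hypothetical and only serve as illustration; `factor` plays the role of `a(2)` in my example. This is my interpretation of the rule, not an authoritative one.

```fortran
program unspecified_locality_demo
  implicit none
  integer :: i
  integer :: factor      ! hypothetical name; plays the role of a(2)
  integer :: x(4)

  factor = 2             ! defined once, before the loop

  ! factor has unspecified locality here. It is referenced in every
  ! iteration but is not defined and does not become undefined in any
  ! iteration, so the second alternative of the quoted rule applies.
  do concurrent (i = 1:4)
    x(i) = factor * i
  end do

  print *, x             ! values: 2 4 6 8
  if (any(x /= [2, 4, 6, 8])) error stop 'unexpected result'

  ! By contrast, a loop like the following would not conform: tmp is
  ! referenced in iteration i = 1 without first being defined there,
  ! while iterations 2 to 4 define it.
  !
  !   do concurrent (i = 1:4)
  !     if (i > 1) tmp = i
  !     x(i) = tmp
  !   end do
end program unspecified_locality_demo
```

Under this reading, a variable that is only read inside the loop, like `factor` here or `a` in my function, already satisfies the rule without being defined in the iteration.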
The reasoning seems to be that `a` has unspecified locality and should therefore behave similarly to a variable with LOCAL locality, but this ignores that a variable with INTENT(IN) cannot have LOCAL locality (C1128). I agree with Intel support that the standard is not 100% clear here, but in my opinion the behavior violates the principle of least astonishment. The quoted paragraph on unspecified locality also does not say that the variable must become defined in the iteration; it only says that it should become defined at most once.
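One way to take the ambiguity out of the example might be to give `a` an explicit SHARED locality, which is allowed for INTENT(IN) variables since C1128 only restricts LOCAL and LOCAL_INIT. The following is a sketch under two assumptions: the compiler implements the Fortran 2018 locality specifiers, and the explicit locality actually changes what ifort does with `-qopenmp`. I have not verified the latter.

```fortran
program test_do_concurrent_shared
  implicit none
  integer :: r(2,2)

  r = b([1, 2])
  print *, r             ! values: 2 4 4 8
  if (any(r /= reshape([2, 4, 4, 8], [2, 2]))) error stop 'unexpected result'

contains

  function b(a)
    integer, dimension(2,2) :: b
    integer, intent(in), dimension(2) :: a
    integer :: i, j

    ! shared(a): inside the construct, a refers to the dummy argument
    ! of the enclosing function instead of having unspecified locality.
    do concurrent (i = 1:2, j = 1:2) shared(a)
      b(i,j) = a(2) * i * j
    end do
  end function b

end program test_do_concurrent_shared
```

Only `a` gets an explicit locality here; `b` is left as in the original so that the two versions differ in exactly one place.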