Does ifort ignore locality specifiers in do concurrent?

Does anybody know if ifort actually pays attention to the locality specifiers in do concurrent? Unless I’ve misunderstood the standard (quite possible), the following should not compile, but does:

module m_demo
      implicit none
contains

      function demo_f(a,b)
          real, contiguous, intent(in) :: a(:), b(:)
          real :: demo_f(size(a))

          do concurrent (integer :: i = 1:size(a)) default(none)
            demo_f(i) = sin(a(i)) + exp(b(i))
          end do
      end function

end module

If I use the -parallel -qopt-report=3 options to get ifort to automatically parallelize do concurrent, I see in the optimization report a section like:

remark #17109: LOOP WAS AUTO-PARALLELIZED
remark #17101: parallel loop shared={ } private={ } firstprivate={ ? i } lastprivate={ } firstlastprivate={ } reduction={ }

And I’ve noticed that even if I add shared() or local() or local_init() clauses to the do concurrent statement, it always puts all the variables in firstprivate in the optimization report.

Is this a bug, or are my expectations wrong? Thanks.

I cannot answer the OP’s question but will note that gfortran 12.0.0 20210718 from equation.com compiles the code if i is declared earlier and if default(none) is removed (what does that do?). So this compiles:

module m_demo
      implicit none
contains

      function demo_f(a,b)
          real, contiguous, intent(in) :: a(:), b(:)
          real :: demo_f(size(a))
          integer :: i
          do concurrent (i = 1:size(a))
            demo_f(i) = sin(a(i)) + exp(b(i))
          end do
      end function

end module

I would expect the compiler to require you to specify the locality of the other variables in the block. I suggest you report this to Intel. As far as I know, it does process locality specifiers. When I try your example in 2021.3, it tells me that the loop is not parallelized due to insufficient work.
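For illustration, here is a sketch of what the loop might look like with every locality spelled out, so that default(none) has nothing left to complain about. It has not been checked against any particular ifort release, and the names m_demo_locality, demo_g, r, t and try_locality are made up for this example. The index i is declared outside the loop, as in the gfortran variant above; the assumption here is that the index of the do concurrent construct does not itself need a locality specifier.

```fortran
module m_demo_locality
    implicit none
contains

    function demo_g(a, b) result(r)
        real, contiguous, intent(in) :: a(:), b(:)
        real :: r(size(a))
        real :: t        ! scratch value, private to each iteration via local(t)
        integer :: i

        ! default(none): every outer variable used in the block must be listed.
        ! a, b and r are shared across iterations; t is local to each one.
        do concurrent (i = 1:size(a)) default(none) shared(a, b, r) local(t)
            t = sin(a(i))
            r(i) = t + exp(b(i))
        end do
    end function

end module

program try_locality
    use m_demo_locality, only: demo_g
    implicit none
    ! Small driver so the sketch can be compiled and run on its own.
    print *, demo_g([0.0, 1.0], [0.0, 1.0])
end program
```

If default(none) is being enforced, dropping any one of a, b, r or t from the specifier lists in this version should be rejected at compile time.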

@sblionel True, I should have mentioned that in order to get it to auto-parallelize, I added !dir$ loop count min(512) directly above the loop.

But thanks for the sanity check - I’ll let Intel know and see what they say.
