I’m starting to learn OpenACC. Consider this loop:
!$acc parallel loop collapse(3)
do i = 1,N
  do j = 1,M
    do k = 1,L
      ...
    end do
  end do
end do
Suppose I want to replace this by a do concurrent loop, both to get the extra checks of DC, and in preparation for the day when GFortran supports offloading DC to GPU:
!$acc parallel loop collapse(???)
do concurrent (i = 1:N, j = 1:M, k = 1:L)
  ...
end do
My question is, should the OpenACC directive read collapse(3) or collapse(1)? I honestly have no idea. I suspect that this is simply unspecified. Is each compiler possibly going to make a different choice?
One has to look at the OpenACC specifications to see how they interact with “do concurrent”.
The 3.2 specification says:
A do concurrent is treated as if defining a loop for each index in the concurrent-header.
When do-loop is a do concurrent, the OpenACC loop construct applies to the loop for each index in the concurrent-header. The loop construct can describe what type of parallelism to use to execute all the loops, and declares all indices appearing in the concurrent-header to be implicitly private.
A tile and collapse clause may not appear on loop that is associated with do concurrent.
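So my interpretation is that in your case the collapse clause should simply be omitted: the last sentence quoted above forbids it on a do concurrent, and the effect of collapse(3) is implied, i.e. all 3 nested loops are parallelized. I also don’t think that collapse(1) or collapse(2) would make sense, since it would be ambiguous which indices they apply to.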
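Under this reading of the 3.2 text, the directive would be written with no collapse clause at all. Here is a minimal sketch; the array a, the loop body and the sizes are made up for illustration and are not from the original post. Without OpenACC enabled the directive is just a comment, so this compiles as standard Fortran. With OpenACC enabled, a compiler that does not yet implement this part of 3.2 may reject the directive.

```fortran
program dc_acc_sketch
  implicit none
  ! Hypothetical sizes and array, chosen only for illustration.
  integer, parameter :: N = 4, M = 3, L = 2
  integer :: i, j, k
  real :: a(N, M, L)

  ! No collapse clause: per the quoted OpenACC 3.2 text, the loop construct
  ! applies to the loop for each index in the concurrent-header.
  !$acc parallel loop
  do concurrent (i = 1:N, j = 1:M, k = 1:L)
    a(i, j, k) = real(i + j + k)
  end do

  print *, sum(a)
end program dc_acc_sketch
```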
Note that the 3.0 specification doesn’t say anything about “do concurrent”, so you need to check which version of the specification gfortran currently follows.
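One way to check this from within a program is the _OPENACC preprocessor macro, which the OpenACC specification defines as a value of the form yyyymm giving the release date of the supported version. The sketch below assumes the source file is run through the preprocessor (for example a .F90 suffix, or a flag such as -cpp with gfortran); the program name is arbitrary. Keep in mind that the macro only reports the version the compiler claims, and support for individual features such as do concurrent may still differ.

```fortran
program acc_version_sketch
  implicit none
#ifdef _OPENACC
  ! _OPENACC expands to an integer of the form yyyymm.
  print *, 'OpenACC enabled, _OPENACC =', _OPENACC
#else
  print *, 'OpenACC not enabled for this compilation'
#endif
end program acc_version_sketch
```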