I have a subroutine where I do (very similar) calculations along 2 axes. These 2 parts are independent from each other and contain nested loops, which can also be evaluated concurrently.
Is there a best practice or heuristic for choosing how to approach this kind of problem with OpenMP? When should one use sections (which are incompatible with nested parallel do)? When should one prefer parallel loops?
In my case the run times of the two versions are quite similar…
! method with **sections**
fedges = 0
!$omp parallel
!$omp sections
!$omp section
do j = 1, nc(2)
   call myweno(1)%reconstruct(varray(:, j), vl1, vr1)
   do i = 1, nc(1) - 1
      fedges(i, j, 1) = godunov(flux1, vr1(i), vl1(i + 1), &
                                [gx(1)%right(i), gx(2)%center(j)], t)
   end do
end do
!$omp section
do i = 1, nc(1)
   call myweno(2)%reconstruct(varray(i, :), vl2, vr2)
   do j = 1, nc(2) - 1
      fedges(i, j, 2) = godunov(flux2, vr2(j), vl2(j + 1), &
                                [gx(1)%center(i), gx(2)%right(j)], t)
   end do
end do
!$omp end sections
!$omp end parallel
! method with **parallel do**
fedges = 0
! vl*/vr* are scratch arrays refilled on every outer iteration, so each thread
! needs its own copy (assumes they are ordinary or allocatable arrays, not pointers)
!$omp parallel private(vl1, vr1, vl2, vr2)
!$omp do
do j = 1, nc(2)
   call myweno(1)%reconstruct(varray(:, j), vl1, vr1)
   do i = 1, nc(1) - 1
      fedges(i, j, 1) = godunov(flux1, vr1(i), vl1(i + 1), &
                                [gx(1)%right(i), gx(2)%center(j)], t)
   end do
end do
!$omp end do nowait
!$omp do
do i = 1, nc(1)
   call myweno(2)%reconstruct(varray(i, :), vl2, vr2)
   do j = 1, nc(2) - 1
      fedges(i, j, 2) = godunov(flux2, vr2(j), vl2(j + 1), &
                                [gx(1)%center(i), gx(2)%right(j)], t)
   end do
end do
!$omp end do
!$omp end parallel
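
For anyone who wants to time the two variants without the full solver, here is a minimal, self-contained sketch. Everything in it is a hypothetical stand-in: `sweep_demo`, `reconstruct` (in place of `myweno(k)%reconstruct`), and `flux` (in place of `godunov`) are invented for illustration and do no real WENO or Godunov work. The loops variant assumes the scratch arrays can safely be made `private`.

```fortran
! Hypothetical stand-in module: not the real solver, only the loop structure.
module sweep_demo
   implicit none
   integer, parameter :: dp = kind(1.0d0)
contains

   ! Stand-in for myweno(k)%reconstruct: fills left/right edge values of one line.
   pure subroutine reconstruct(line, vl, vr)
      real(dp), intent(in)  :: line(:)
      real(dp), intent(out) :: vl(:), vr(:)
      vl = line - 0.5_dp
      vr = line + 0.5_dp
   end subroutine reconstruct

   ! Stand-in for godunov: a simple two-point average.
   pure function flux(a, b) result(f)
      real(dp), intent(in) :: a, b
      real(dp) :: f
      f = 0.5_dp*(a + b)
   end function flux

   ! Variant 1: one section per axis, so at most two threads do useful work.
   subroutine sweep_sections(v, fedges)
      real(dp), intent(in)  :: v(:, :)
      real(dp), intent(out) :: fedges(:, :, :)
      real(dp) :: vl1(size(v, 1)), vr1(size(v, 1))
      real(dp) :: vl2(size(v, 2)), vr2(size(v, 2))
      integer :: i, j
      fedges = 0
      ! Scratch arrays may stay shared: each pair is touched by one section only.
      !$omp parallel sections private(i, j)
      !$omp section
      do j = 1, size(v, 2)
         call reconstruct(v(:, j), vl1, vr1)
         do i = 1, size(v, 1) - 1
            fedges(i, j, 1) = flux(vr1(i), vl1(i + 1))
         end do
      end do
      !$omp section
      do i = 1, size(v, 1)
         call reconstruct(v(i, :), vl2, vr2)
         do j = 1, size(v, 2) - 1
            fedges(i, j, 2) = flux(vr2(j), vl2(j + 1))
         end do
      end do
      !$omp end parallel sections
   end subroutine sweep_sections

   ! Variant 2: two worksharing loops; all threads share the iterations of each.
   subroutine sweep_loops(v, fedges)
      real(dp), intent(in)  :: v(:, :)
      real(dp), intent(out) :: fedges(:, :, :)
      real(dp) :: vl1(size(v, 1)), vr1(size(v, 1))
      real(dp) :: vl2(size(v, 2)), vr2(size(v, 2))
      integer :: i, j
      fedges = 0
      ! Scratch arrays must be private: several threads run each loop at once.
      !$omp parallel private(i, j, vl1, vr1, vl2, vr2)
      !$omp do
      do j = 1, size(v, 2)
         call reconstruct(v(:, j), vl1, vr1)
         do i = 1, size(v, 1) - 1
            fedges(i, j, 1) = flux(vr1(i), vl1(i + 1))
         end do
      end do
      !$omp end do nowait
      ! nowait is safe here: the second loop writes only fedges(:, :, 2).
      !$omp do
      do i = 1, size(v, 1)
         call reconstruct(v(i, :), vl2, vr2)
         do j = 1, size(v, 2) - 1
            fedges(i, j, 2) = flux(vr2(j), vl2(j + 1))
         end do
      end do
      !$omp end do
      !$omp end parallel
   end subroutine sweep_loops

end module sweep_demo
```

Calling `sweep_sections(v, f)` and `sweep_loops(v, f)` on the same `v`, with `f` dimensioned `(size(v, 1), size(v, 2), 2)`, should give identical results; wrapping each call in `omp_get_wtime()` from `omp_lib` is one way to compare them at different thread counts.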