Concerning the Intel Fortran Compiler Classic (ifort), this Intel thread from 2015 stated:
DO CONCURRENT allows the compiler to ignore any potential dependencies between iterations and to execute the loop in parallel. This can mean either SIMD parallelism (vectorization), which is enabled by default, or thread parallelism (auto-parallelization), which is enabled only by /Qparallel. This is independent of /Qopenmp, which does not enable auto-parallelization, it only enables parallelism through OpenMP directives. However, auto-parallelization with /Qparallel uses the same underlying OpenMP runtime library as /Qopenmp. The overhead for setting up and entering a parallel region is typically thousands of clock cycles, so auto-parallelization is usually worthwhile only for loops with a sufficiently large amount of work to amortize this overhead.
And in this Intel thread, @sblionel stated:
DO CONCURRENT does not “demand parallel” - it allows/requests it. As others have said, the semantics of DO CONCURRENT make it more likely that the loop can be parallelized correctly. If you’re not enabling auto-parallel, there is no benefit to DO CONCURRENT.
With the new Intel LLVM compiler (ifx), this has changed, again quoting @sblionel:
Just as a followup to my March 2022 reply, Intel’s LLVM-based ifx compiler does not support -parallel at all. It will (attempt to) parallelize DO CONCURRENT if you enable OpenMP, even if you don’t use OpenMP otherwise.
I have verified that the -fopenmp flag is not needed and inspected the compiler reports to verify that parallelization occurs. The executable produced on Linux has a dependency on OpenMP (GOMP) and pthreads, as stated by the GCC documentation for -ftree-parallelize-loops:
This option implies -pthread, and thus is only supported on targets that have support for -pthread.
I’m guessing they were referring to the new Intel LLVM compiler, as ifort was “end-of-life” already.
It is worth noting that in both ifort and gfortran, the respective auto-parallelization flags (-parallel or /Qparallel, and -ftree-parallelize-loops) also apply to regular do loops, provided the compiler's heuristics determine that parallelizing would be profitable. Using do concurrent instead of do is about intent: it tells the compiler that the iterations have no data dependencies on one another, so the loop can be executed concurrently and safely parallelized.
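To make the distinction concrete, here is a minimal sketch (my own illustration, not taken from any of the threads quoted above) of one loop that qualifies for do concurrent and one that does not. The program name, array names, and sizes are arbitrary choices.

```fortran
program do_concurrent_demo
  implicit none
  integer, parameter :: n = 1000000
  real, allocatable :: x(:), y(:), s(:)
  real :: a
  integer :: i

  allocate(x(n), y(n), s(n))
  a = 2.0
  x = 1.0
  y = 3.0

  ! Each iteration reads and writes only element i, so the iterations
  ! are independent and the loop may be declared concurrent.
  do concurrent (i = 1:n)
    y(i) = a * x(i) + y(i)
  end do

  ! A running sum reads the value produced by the previous iteration.
  ! That is a loop-carried dependency, so this must stay a regular do.
  s(1) = y(1)
  do i = 2, n
    s(i) = s(i - 1) + y(i)
  end do

  print *, y(1), y(n), s(2)
end program do_concurrent_demo
```

Assuming the file is saved as demo.f90 (a hypothetical name), it could be built with something like gfortran -O2 -ftree-parallelize-loops=4 demo.f90, or with ifx -qopenmp demo.f90. Whether the first loop is actually threaded still depends on the compiler's own profitability heuristics, so it is worth checking the optimization report rather than assuming it.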
On the topic of “how to convey to a compiler that a loop is safely parallelizable”, the flang documentation captures this well:
The best option seems to be the one that assumes that users who write DO CONCURRENT constructs are doing so with the intent to write parallel code.