I summarized some `do concurrent`-related information in this thread and thought it was worth reposting here:
Multi-threaded do concurrent (CPU)
| Compiler | Parallel flag | Information | Number of threads | Underlying implementation |
|---|---|---|---|---|
| gfortran | `-ftree-parallelize-loops=n` | `-fopt-info-loop` | using the parallel flag | OpenMP/pthreads |
| nvfortran | `-stdpar=multicore` | `-Minfo=stdpar,accel` | `ACC_NUM_CORES` | OpenACC |
| ifort (deprecated) | `-parallel` | `-qopt-report -qopt-report-phase=par` | `OMP_NUM_THREADS`, `-par-num-threads=n` | OpenMP |
| ifx | `-qopenmp` | `-qopt-report` | `OMP_NUM_THREADS` | OpenMP |
| CCE ftn (Cray/HPE) | `-h thread_do_concurrent` | ? | ? | ? |
| AMD flang | `-fopenmp` | ? | `OMP_NUM_THREADS` | OpenMP |
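
To try the flags from the table above, a small self-checking test case can help. The following is only a sketch: the file name `saxpy.f90`, the problem size, the thread count of 4, and the `-O2` optimization level are my own assumptions, and whether a given compiler actually parallelizes the loop should be confirmed with the "Information" flag for that compiler.

```fortran
! Example build lines, using the flags listed in the table
! (saxpy.f90, -O2 and the thread count of 4 are assumptions):
!   gfortran  -O2 -ftree-parallelize-loops=4 -fopt-info-loop saxpy.f90
!   nvfortran -stdpar=multicore -Minfo=stdpar,accel saxpy.f90
!   ifx       -qopenmp -qopt-report saxpy.f90
! Example run lines (thread count set as in the table):
!   ACC_NUM_CORES=4   ./a.out    (nvfortran)
!   OMP_NUM_THREADS=4 ./a.out    (ifx, AMD flang)
program saxpy_dc
  implicit none
  integer, parameter :: n = 1000000
  real, allocatable :: x(:), y(:)
  real :: a
  integer :: i

  allocate(x(n), y(n))
  a = 2.0
  x = 1.0
  y = 3.0

  ! Iterations are independent, so the compiler is allowed
  ! (but not required) to run them in parallel.
  do concurrent (i = 1:n)
    y(i) = a*x(i) + y(i)
  end do

  ! 2.0*1.0 + 3.0 is exactly representable, so compare exactly.
  if (any(y /= 5.0)) error stop 'saxpy_dc: unexpected result'
  print *, 'saxpy_dc: result verified for n =', n
end program saxpy_dc
```

The program gives the same answer whether or not the loop is threaded, so it only verifies correctness; the compiler's optimization report is what shows whether parallelization happened.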
The OpenMP environment variables can also be used to control processor affinity. This is also the case for nvfortran, which responds to `OMP_PROC_BIND` and `OMP_PLACES`, because OpenACC doesn’t have variables for thread-to-core binding.
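
Since thread counts and affinity are all controlled through the environment, it can be handy to log what a run actually saw. Here is a minimal sketch using the standard `get_environment_variable` intrinsic; the program name and output format are hypothetical, and it only reports the variables mentioned in this post, without checking whether the runtime honors them.

```fortran
program show_thread_env
  implicit none
  ! Variables named in the table and the affinity note.
  character(len=15), parameter :: names(4) = [character(len=15) :: &
       'OMP_NUM_THREADS', 'ACC_NUM_CORES', 'OMP_PROC_BIND', 'OMP_PLACES']
  character(len=256) :: val
  integer :: i, stat

  do i = 1, size(names)
    call get_environment_variable(trim(names(i)), val, status=stat)
    if (stat == 0) then
      ! Variable exists and fits in val.
      print '(a," = ",a)', trim(names(i)), trim(val)
    else if (stat == 1) then
      ! Variable does not exist.
      print '(a," is not set")', trim(names(i))
    else
      ! Truncated value or no environment support on this processor.
      print '(a," could not be read (status ",i0,")")', trim(names(i)), stat
    end if
  end do
end program show_thread_env
```

For example, running it as `OMP_PROC_BIND=close OMP_PLACES=cores ./a.out` lists those two values and reports the other two as not set, provided they are not already defined in the shell.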
Resources
- Number of threads in `do concurrent` loops | NVIDIA
- Accelerating Fortran DO CONCURRENT with GPUs and the NVIDIA HPC SDK | NVIDIA
- Does gfortran take advantage of DO CONCURRENT? | Stack Overflow
- When should I use DO CONCURRENT and when OpenMP? | Stack Overflow
- Using Fortran DO CONCURRENT for Accelerator Offload | Intel
- The Case for OpenMP* Target Offloading: Why ISO Fortran Is Not Enough for Heterogeneous Computing | Intel
- Transition to the Intel (R) Fortran Compiler | Intel
- DO CONCURRENT isn’t necessarily concurrent | LLVM (flang)
- Benchmarking Fortran DO CONCURRENT on CPUs and GPUs Using BabelStream | SC22
- Can Fortran’s ‘do concurrent’ replace directives for accelerated computing? | SC21
- Clarification on DO CONCURRENT | Fortran Discourse
PS: I’ve made this a Wiki post, so feel free to add missing information.