Can Fortran's 'do concurrent' replace directives for accelerated computing? (paper)

Beliavsky · October 23, 2021, 1:17am

Can Fortran’s ‘do concurrent’ replace directives for accelerated computing?

by Miko M. Stulajter, Ronald M. Caplan, and Jon A. Linker
arXiv 18 Oct 2021

Recently, there has been growing interest in using standard language constructs (e.g. C++'s Parallel Algorithms and Fortran’s do concurrent) for accelerated computing as an alternative to directive-based APIs (e.g. OpenMP and OpenACC). These constructs have the potential to be more portable, and some compilers already (or have plans to) support such standards. Here, we look at the current capabilities, portability, and performance of replacing directives with Fortran’s do concurrent using a mini-app that currently implements OpenACC for GPU-acceleration and OpenMP for multi-core CPU parallelism. We replace as many directives as possible with do concurrent, testing various configurations and compiler options within three major compilers: GNU’s gfortran, NVIDIA’s nvfortran, and Intel’s ifort. We find that with the right compiler versions and flags, many directives can be replaced without loss of performance or portability, and, in the case of nvfortran, they can all be replaced. We discuss limitations that may apply to more complicated codes and future language additions that may mitigate them. The software and Singularity containers are publicly provided to allow the results to be reproduced.

Beliavsky · November 16, 2021, 3:04am

It won the best paper award at the Eighth Workshop on Accelerator Programming Using Directives (WACCPD) @SC21.

CRquantum · November 16, 2021, 9:18am

May I ask: in order to make do concurrent really work, do I have to enable the /Qparallel flag or something similar?

Or is do concurrent just optimized automatically by the compiler?

I use do concurrent a lot. I know it should be able to use multiple threads and/or vectorize better, but I am not sure whether it really differs much from regular do loops in practice.

msz59 · November 16, 2021, 10:35am

See Table 3 in the cited paper and the detailed discussion of compiler flags in Section 3.3.
For the Intel compiler (I guess the /Qparallel you mention is an ifort option), they use the -fopenmp flag.
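
To make the flag question concrete, here is a minimal sketch of a loop written with do concurrent. The program name, array size, and values are made up for illustration and are not taken from the paper's mini-app. As far as I know, without a parallelization flag most compilers treat such a loop as an ordinary (possibly vectorized) serial loop. The flags in the comments are the ones I believe are relevant (nvfortran's -stdpar, gfortran's -ftree-parallelize-loops, and the -fopenmp flag mentioned above for ifort); check Table 3 of the paper for the exact combinations the authors tested.

```fortran
! Minimal do concurrent sketch (illustrative names and sizes, not from the paper).
! Possible builds, assuming reasonably recent compiler versions:
!   nvfortran -stdpar=gpu        demo.f90   (offload to an NVIDIA GPU)
!   nvfortran -stdpar=multicore  demo.f90   (multi-core CPU)
!   gfortran  -O3 -ftree-parallelize-loops=4 demo.f90
!   ifort     -fopenmp           demo.f90
program do_concurrent_demo
  implicit none
  integer, parameter :: n = 1000000
  real, allocatable :: x(:), y(:)
  real :: a
  integer :: i

  allocate(x(n), y(n))
  a = 2.0
  x = 1.0
  y = 3.0

  ! Each iteration reads and writes only element i, so the iterations
  ! are independent and the compiler may run them in any order.
  do concurrent (i = 1:n)
    y(i) = a * x(i) + y(i)
  end do

  ! 2.0*1.0 + 3.0 is exactly 5.0 in floating point, so an exact test is safe here.
  if (any(y /= 5.0)) error stop "unexpected result"
  print '(a)', "do concurrent result OK"
end program do_concurrent_demo
```

The source is the same for every compiler. Only the build line changes, and that is the portability argument the paper examines.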
