Under the Wire: Nearly HPC News (Aug 22) (hpcwire.com)
Fortran, Why Yes Fortran
I recently came across a paper about the portability of Fortran’s DO CONCURRENT on GPUs. For those who may not know (or have forgotten), the DO CONCURRENT construct was introduced in ISO Fortran 2008. It tells the compiler that the iterations of the DO CONCURRENT loop are independent and can therefore be executed in parallel (that means speed-up). Ideally, the compiler can also target GPUs as parallel acceleration devices.
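To make this concrete, here is a minimal sketch of a DO CONCURRENT loop (not taken from the paper or from HipFT; the array names and sizes are invented for illustration):

```fortran
program dc_demo
  implicit none
  integer, parameter :: n = 1000000
  integer :: i
  real :: a(n), b(n), c(n)

  a = 1.0
  b = 2.0

  ! The programmer asserts that iterations are independent; the
  ! compiler is then free to run them in parallel, potentially
  ! offloading to a GPU when built with standard-parallelism flags.
  do concurrent (i = 1:n)
     c(i) = a(i) + 2.0*b(i)
  end do

  print *, c(1)
end program dc_demo
```

With Nvidia's nvfortran, for example, a loop like this can be offloaded with the `-stdpar=gpu` option; other vendors expose similar switches in their compilers.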
As the paper points out, there is continuing interest in using standard languages like Fortran for accelerated computing in order to avoid hardware-specific vendor APIs. The DO CONCURRENT (DC) loop has been successfully demonstrated in Fortran codes on Nvidia platforms. However, DC support on other hardware has taken longer to arrive. Recently, both HPE (for AMD GPUs) and Intel (for Intel GPUs) have added DC GPU offload support to their compilers.
[Figure: Example DC loop in the HipFT code. It computes the diffusion operator matrix-vector product for internal grid points.]
The paper explores the current portability of DC across GPU vendors using a production solar surface flux evolution code, HipFT. The authors discuss implementation and compilation details, including when directive-based APIs for data movement are needed (or desirable) versus relying on a unified memory system. Performance results are provided for both data center and consumer platforms.
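For readers unfamiliar with the data-movement trade-off the authors discuss, here is a hedged sketch of what explicit movement can look like: a DC loop bracketed by OpenACC data directives, so arrays are copied to and from the device instead of being managed by unified memory. The routine and array names are invented for illustration, not drawn from HipFT:

```fortran
! Illustrative only: explicit host<->device data movement around a
! DC loop via OpenACC directives (an alternative to unified memory).
subroutine scale_field(x, n)
  implicit none
  integer, intent(in)    :: n
  real,    intent(inout) :: x(n)
  integer :: i

  !$acc enter data copyin(x)      ! copy x to the device once
  do concurrent (i = 1:n)
     x(i) = 2.0*x(i)              ! iterations run in parallel on device
  end do
  !$acc exit data copyout(x)      ! copy the result back to the host
end subroutine scale_field
```

With unified memory enabled, the two directives could simply be omitted and the runtime would migrate pages on demand; the paper examines when each approach is the better choice.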
A few points to consider. First, hiding hardware APIs has earned today’s second “Good Thing™” award. Hiding APIs makes codes more portable, and we all like being able to move our codes from one hardware platform to another.
Second, looking at the authors, the following companies contributed: Predictive Science Inc., Nvidia, AMD, Intel, and HPE. Talk about playing nice. This kind of cooperation is notable, and it benefits the HPC community in a big way.