Fortran projects running on GPUs in production

A case-insensitive grep (actually Windows findstr) of my list of Fortran codes on GitHub for “gpu” gives the following. Any project whose description in my list does not include “gpu”, as was the case for CaNS, will be missed.

Nbody6++GPU - Beijing version: N-body star cluster simulation code, by Rainer Spurzem and team. It is an offspring of Sverre Aarseth’s direct N-body codes.

POT3D: High Performance Potential Field Solver: computes potential field solutions to approximate the solar coronal magnetic field using observed photospheric magnetic fields as a boundary condition. A version of POT3D that includes GPU acceleration with both MPI+OpenACC and MPI+OpenMP was released as part of the Standard Performance Evaluation Corporation’s (SPEC) beta version of the SPEChpc™ 2021 benchmark suites.

Fluid Transport Accelerated Solver (FluTAS): modular, multiphysics code for multiphase fluid dynamics simulations. The code is written following a “functional programming” approach and aims to accommodate various independent modules. One of the main purposes of the project is to provide an efficient framework able to run both on many CPUs (MPI) and on many GPUs (MPI+OpenACC+CUDA Fortran).

IMEXLB-1.0: Lattice Boltzmann Method (LBM) proxy application code-suite for heterogeneous platforms (such as ThetaGPU). A ProxyApp, by definition, is a proxy for a full-fledged application code that simulates a wider array of problems.

MGLC: multi-GPU parallel implementation of the Lattice Boltzmann Method (LBM), using OpenACC to accelerate code on a single GPU and MPI for inter-GPU communication.

fft-overlap: efficient implementations of FFTs on multiple GPUs and across multiple nodes, with data transfer overlapped on multiple levels, by dappelha.

arrayfire-fortran: Fortran wrapper for ArrayFire, a general purpose GPU library.

Eigensolver_gpu: generalized eigensolver for symmetric/Hermitian-definite eigenproblems with functionality similar to the DSYGVD/X or ZHEGVD/X functions available within LAPACK/MAGMA, by Josh Romero et al. This solver has fewer dependencies on CPU computation than comparable implementations within MAGMA, which may be of benefit to systems with limited CPU resources or to users without access to high-performing CPU LAPACK libraries.

GraSPH: Smoothed-particle Hydrodynamics (SPH) program originally intended for simulations of bulk granular material as well as fluids, by Edward Yang. Src_CAF contains code intended to run in a multi-core configuration using the coarray features of Fortran 2008, and src_GPU contains code intended to run on a CUDA-enabled GPU.

CUDA Fortran: Fortran programming on GPU: a complete introduction for beginners, by Koushik Naskar

ExaTENSOR: basic numerical tensor algebra library for distributed HPC systems equipped with multicore CPU and NVIDIA (or AMD) GPU, by Dmitry I. Lyakh. The hierarchical task-based parallel runtime of ExaTENSOR is based on the virtual tensor algebra processor architecture, i.e. a software processor specialized to numerical tensor algebra workloads on heterogeneous HPC systems (multicore/KNL, NVIDIA or AMD GPU).

FGPU: code examples focusing on porting FORTRAN codes to run on DOE heterogeneous-architecture CPU+GPU machines, from LLNL. Their purpose is to provide both learning aids for developers and OpenMP and CUDA code examples for testing vendor compilers' capabilities.

GPU programming with OpenMP offloading: exercises and other material for a course, by Jussi Enkovaara et al.

gpu-tips: Fortran examples of CUDA and directives tips and tricks for IBM Power + NVIDIA systems, by dappelha

nbody-ifx-do-concurrent: N-body Fortran code port to test ifx (Intel Fortran) GPU offload of do concurrent, by Saroj Adhikari

Tensor Algebra Library Routines for Shared Memory Systems: Nodes equipped with multicore CPU, NVIDIA GPU, AMD GPU, and Intel Xeon Phi (TAL_SH): implements basic tensor algebra operations with interfaces to C, C++11, and Fortran 90+, by Dmitry I. Lyakh

BDpack: GPU-enabled Brownian dynamics package for simulation of polymeric solutions, by Amir Saadat. An associated paper is “Computationally efficient algorithms for incorporation of hydrodynamic and excluded volume interactions in Brownian dynamics simulations: A comparative study of the Krylov subspace and Chebyshev based techniques”, A. Saadat and B. Khomami, J. Chem. Phys., 140, 184903 (2014).

A Fortran Electronic Structure Programme (AFESP): project based on the Crawford Group’s C++ Programming Tutorial in Chemistry, but written in Fortran, by Kirk Pearce et al. The end goal of this project is to perform HF, MP2, CCSD, and CCSD(T) calculations, as in the original tutorial, but with additional support for multicore processors (modern CPUs, GPUs).

qe-gpu: GPU-accelerated Quantum ESPRESSO using CUDA Fortran, by Filippo Spiga
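
For anyone who wants to repeat the keyword search that produced this list, here is a minimal Python sketch of a case-insensitive substring filter. It is not the findstr command actually used. The function name `find_projects` and the sample entries are hypothetical, and the CaNS line in particular is a stand-in rather than the real description. It also shows why a project whose description lacks the keyword is missed.

```python
def find_projects(entries, keyword="gpu"):
    """Return the entries whose text contains keyword, ignoring case."""
    needle = keyword.lower()
    return [entry for entry in entries if needle in entry.lower()]


# Hypothetical sample entries; the CaNS text is a placeholder, not its real description.
entries = [
    "qe-gpu: GPU-accelerated Quantum ESPRESSO using CUDA Fortran",
    "CaNS: placeholder description that never mentions the keyword",
    "arrayfire-fortran: Fortran wrapper for ArrayFire, a general purpose GPU library",
]

matches = find_projects(entries)
for line in matches:
    print(line)
```

The filter only sees the description text, so the placeholder CaNS entry is skipped even if the project itself runs on GPUs. Searching for additional keywords such as "cuda" or "openacc" would be one way to catch more projects.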
