PSA: the mpi_f08 module is not instrumented for MPI tracing with nsys profile

Just another PSA: if you write code that uses MPI through the (recommended) mpi_f08 module, its calls are not instrumented for `nsys profile --stats=true -t mpi`. The old mpi module is…

I’m trying to write up a good workaround for this… so if you profile your code and don’t see any MPI activity (like I did earlier today), don’t despair!
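One obvious stopgap while I write that up: switch the files you want traced back to the legacy module. A minimal sketch, assuming the old-style interfaces cover the calls you make (note the explicit `ierr` argument the legacy bindings require):

```fortran
program mpi_trace_demo
   ! use mpi_f08   ! calls through this module are invisible to nsys -t mpi
   use mpi         ! the legacy module *is* interposed by nsys
   implicit none
   integer :: ierr, rank, nranks

   call MPI_Init(ierr)
   call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
   call MPI_Comm_size(MPI_COMM_WORLD, nranks, ierr)

   ! ... the MPI traffic you actually want to see in the profile ...

   call MPI_Finalize(ierr)
end program mpi_trace_demo
```

Run it as, e.g., `mpirun -np 2 nsys profile --stats=true -t mpi ./mpi_trace_demo` and the calls should show up in the MPI stats.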

Finally, if you want to try a GPU-aware MPI implementation of a very simple 3D saxpy, please see: learning_tools/fortran/mpi/dc_scatter.f90 at main · JorgeG94/learning_tools · GitHub

I’ve used `do concurrent` to offload to the GPU, OpenMP to handle the memory, and, with a GPU-aware MPI implementation, I am able to do peer-to-peer communication. I’ve only tested this with the NVIDIA compilers, but hey, this is nice :slight_smile:. The main goal of that code is to be a mini-app of GPU-aware MPI communication using standard parallelism. If you want to play around with the loop and optimize the code, please send a PR.
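For a flavor of the pattern, here is a stripped-down sketch rather than the actual dc_scatter.f90 (it broadcasts instead of scattering, and the array size and constant are placeholders). OpenMP data directives own the device copies, `use_device_addr` hands the device address to the GPU-aware MPI library, and the saxpy itself is a plain `do concurrent` loop, compiled with something like `nvfortran -stdpar=gpu -mp=gpu`:

```fortran
program dc_saxpy_sketch
   use mpi_f08
   implicit none
   integer, parameter :: n = 1024   ! placeholder size
   real :: x(n), y(n), a
   integer :: rank, i

   call MPI_Init()
   call MPI_Comm_rank(MPI_COMM_WORLD, rank)

   a = 2.0
   x = 1.0
   y = real(rank)

   ! OpenMP owns the device allocation and host<->device movement
   !$omp target data map(to: x) map(tofrom: y)

   ! do concurrent becomes a GPU kernel under -stdpar=gpu
   do concurrent (i = 1:n)
      y(i) = a * x(i) + y(i)
   end do

   ! hand the *device* address of y to the GPU-aware MPI library,
   ! so the transfer can go peer to peer between GPUs
   !$omp target data use_device_addr(y)
   call MPI_Bcast(y, n, MPI_REAL, 0, MPI_COMM_WORLD)
   !$omp end target data

   !$omp end target data

   call MPI_Finalize()
end program dc_saxpy_sketch
```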

See the post in the NVIDIA dev forum.


I wonder if it would work with vapaa, since it implements the mpi_f08 bindings as a thin layer over the C MPI API, which nsys does intercept.

I am trying to get it to work, but my code has an MPI_Scatterv, which is not in vapaa yet. I am writing a small test app to see if it works.
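Something along these lines, assuming staples such as MPI_Bcast and MPI_Allreduce are already covered by vapaa (a sketch, not the actual app):

```fortran
program vapaa_trace_check
   use mpi_f08   ! resolved by vapaa's module instead of the vendor's
   implicit none
   integer :: rank, val, total

   call MPI_Init()
   call MPI_Comm_rank(MPI_COMM_WORLD, rank)

   val = rank + 1
   ! a couple of easy-to-spot calls for the nsys MPI trace
   call MPI_Bcast(val, 1, MPI_INTEGER, 0, MPI_COMM_WORLD)
   call MPI_Allreduce(val, total, 1, MPI_INTEGER, MPI_SUM, MPI_COMM_WORLD)

   if (rank == 0) print *, 'total =', total
   call MPI_Finalize()
end program vapaa_trace_check
```

If `mpirun -np 2 nsys profile --stats=true -t mpi ./vapaa_trace_check` lists the two calls in the MPI summary, the vapaa route works.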


It works with vapaa!!
