PSA: the mpi_f08 module is not instrumented for MPI tracing with nsys profile

Just another PSA: if you write code that uses MPI through the (recommended) mpi_f08 module, its calls are not instrumented for `nsys profile --stats=true -t mpi`. The old mpi module is…

I’m trying to write up a good workaround for this… so if you profile your code and don’t see any MPI activity (like I did earlier today), don’t despair!
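One obvious stopgap while I write that up: switch the files you want traced back to the legacy module. A minimal sketch, assuming the old-style interfaces cover the calls you make (note the explicit `ierr` argument the legacy bindings require):

```fortran
program mpi_trace_demo
   ! use mpi_f08   ! calls through this module are invisible to nsys -t mpi
   use mpi         ! the legacy module *is* interposed by nsys
   implicit none
   integer :: ierr, rank, nranks

   call MPI_Init(ierr)
   call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
   call MPI_Comm_size(MPI_COMM_WORLD, nranks, ierr)

   ! ... the MPI traffic you actually want to see in the profile ...

   call MPI_Finalize(ierr)
end program mpi_trace_demo
```

Run it as, e.g., `mpirun -np 2 nsys profile --stats=true -t mpi ./mpi_trace_demo` and the calls should show up in the MPI stats.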

Finally, if you want to try a GPU-aware MPI implementation of a very simple 3D saxpy, please see: learning_tools/fortran/mpi/dc_scatter.f90 at main · JorgeG94/learning_tools · GitHub

I’ve used `do concurrent` to offload to the GPU, OpenMP to handle the memory, and, with a GPU-aware MPI implementation, I am able to do peer-to-peer communication. I’ve only tested this with the NVIDIA compilers, but hey, this is nice :slight_smile:. The main goal of that code is to be a mini-app of GPU-aware MPI communication using standard parallelism. If you want to play around with the loop and optimize the code, please send a PR.
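For a flavor of the pattern, here is a stripped-down sketch rather than the actual dc_scatter.f90 (it broadcasts instead of scattering, and the array size and constant are placeholders). OpenMP data directives own the device copies, `use_device_addr` hands the device address to the GPU-aware MPI library, and the saxpy itself is a plain `do concurrent` loop, compiled with something like `nvfortran -stdpar=gpu -mp=gpu`:

```fortran
program dc_saxpy_sketch
   use mpi_f08
   implicit none
   integer, parameter :: n = 1024   ! placeholder size
   real :: x(n), y(n), a
   integer :: rank, i

   call MPI_Init()
   call MPI_Comm_rank(MPI_COMM_WORLD, rank)

   a = 2.0
   x = 1.0
   y = real(rank)

   ! OpenMP owns the device allocation and host<->device movement
   !$omp target data map(to: x) map(tofrom: y)

   ! do concurrent becomes a GPU kernel under -stdpar=gpu
   do concurrent (i = 1:n)
      y(i) = a * x(i) + y(i)
   end do

   ! hand the *device* address of y to the GPU-aware MPI library,
   ! so the transfer can go peer to peer between GPUs
   !$omp target data use_device_addr(y)
   call MPI_Bcast(y, n, MPI_REAL, 0, MPI_COMM_WORLD)
   !$omp end target data

   !$omp end target data

   call MPI_Finalize()
end program dc_saxpy_sketch
```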

See the post in the NVIDIA dev forum.


I wonder if it would work with vapaa, since it implements the mpi_f08 bindings as a thin layer over the C MPI API, which nsys does intercept.

I am trying to get it to work, but my code has an MPI_Scatterv, which is not in vapaa yet. I am writing a small test app to see if it works.
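Something along these lines, assuming staples such as MPI_Bcast and MPI_Allreduce are already covered by vapaa (a sketch, not the actual app):

```fortran
program vapaa_trace_check
   use mpi_f08   ! resolved by vapaa's module instead of the vendor's
   implicit none
   integer :: rank, val, total

   call MPI_Init()
   call MPI_Comm_rank(MPI_COMM_WORLD, rank)

   val = rank + 1
   ! a couple of easy-to-spot calls for the nsys MPI trace
   call MPI_Bcast(val, 1, MPI_INTEGER, 0, MPI_COMM_WORLD)
   call MPI_Allreduce(val, total, 1, MPI_INTEGER, MPI_SUM, MPI_COMM_WORLD)

   if (rank == 0) print *, 'total =', total
   call MPI_Finalize()
end program vapaa_trace_check
```

If `mpirun -np 2 nsys profile --stats=true -t mpi ./vapaa_trace_check` lists the two calls in the MPI summary, the vapaa route works.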


It works with vapaa!!
