Coarray version of MPI_AlltoAll

han190 · July 20, 2022, 12:00am

Happy Tuesday, Fortraners!
I am refactoring a legacy, spaghetti-style code written in Fortran 77/90 into modern Fortran. The original code is parallelized with both MPI and OpenMP. I am trying to use as many “new features” as I can while modernizing the code. My wish is to replace MPI with coarray Fortran.

Now, the difficulty I am facing is that the original code involves many distributed matrix transpositions, and thus MPI_AlltoAll is used extensively. I did some research and found that coarrays may have the potential to outperform a regular MPI_AlltoAll by overlapping communication (Pekurovsky, 2012; Fiedler et al., 2013). I am not a computer science major, but from reading these papers my impression is that a co_alltoall is entirely possible but might require CS knowledge that I do not possess.

So here comes the question (or questions): Does a coarray version of MPI_AlltoAll (for example, co_alltoall) already exist? I tried to look for an open-source solution but had no luck. If not, is it worth implementing a co_alltoall, how hard would it be, and where should I start?

milancurcic · July 20, 2022, 12:48am

I don’t know if there’s a library, but Robert Numrich tackles this problem in section 5.7 of his “Parallel Programming with Co-Arrays” book.

han190 · July 20, 2022, 1:23am

I happen to have a copy of this book but have only read the first four chapters! Guess it’s time to continue.

nncarlson · July 20, 2022, 4:10am

@han190, thanks for mentioning those references. I’m not familiar with them and should take a look.

Before you plunge into replacing MPI with coarrays I would strongly advise that you create a representative test bed for your particular usage of MPI – focusing on MPI_AlltoAll to start – and confirm that coarrays are competitive with MPI, and portable across the compilers/platforms you use. My own experience is that coarrays are not even close to being competitive, for the most part. But your experience may be quite different.

My own test bed focused on MPI_Alltoallv used for a halo exchange. You might find the coarray replacements there instructive.
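For orientation, here is a minimal sketch of what the coarray side of such a test could look like. It is not taken from the test bed mentioned above or from any of the cited references; the program name, the buffer names `sendbuf` and `recvbuf`, the block size `blk`, and the staggered pull schedule are all assumptions made for illustration. Each image pulls one equal-sized block from every image, which mimics MPI_Alltoall with equal counts, and then checks the result itself.

```fortran
program co_alltoall_sketch
  ! Hypothetical sketch, not a tuned implementation: every image exchanges
  ! one block of `blk` integers with every image (itself included).
  implicit none
  integer, parameter :: blk = 4            ! block size (assumed for the demo)
  integer, allocatable :: sendbuf(:,:)[:]  ! column j is destined for image j
  integer, allocatable :: recvbuf(:,:)     ! column j receives data from image j
  integer :: me, np, j, step, src

  me = this_image()
  np = num_images()
  allocate(sendbuf(blk, np)[*])   ! allocating a coarray synchronizes all images
  allocate(recvbuf(blk, np))

  ! Tag every block with sender and destination so the result can be checked.
  do j = 1, np
     sendbuf(:, j) = 1000*me + j
  end do

  sync all   ! every send buffer is filled before any image reads from it

  ! Staggered schedule: at each step this image pulls from a different source,
  ! so the images do not all target the same remote image at the same time.
  do step = 0, np - 1
     src = mod(me - 1 + step, np) + 1
     recvbuf(:, src) = sendbuf(:, me)[src]   ! one-sided get from image src
  end do

  sync all   ! keep sendbuf intact until every image has finished reading

  ! Self-check: the block received from image j must carry the tag 1000*j + me.
  do j = 1, np
     if (any(recvbuf(:, j) /= 1000*j + me)) error stop "co_alltoall sketch: mismatch"
  end do
  if (me == 1) print '(a,i0,a)', "all-to-all verified on ", np, " images"
end program co_alltoall_sketch
```

With gfortran, `-fcoarray=single` builds a one-image version for a quick syntax check; multi-image runs need a coarray runtime such as OpenCoarrays, or another compiler with coarray support. Timing this loop against MPI_Alltoall for your real block sizes and image counts would be the actual test; the sketch makes no performance claim.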

han190 · July 20, 2022, 7:04am

@milancurcic Thanks again for directing me to the book! I will have to read chapters 9 and 10 thoroughly before I can draw any serious conclusions. BTW, is it possible for me to accept more than one reply as a solution? Currently I am only allowed to select one.

@nncarlson Yes, I read your post about coarrays and definitely understand your concern, but coarrays are a very attractive feature for me (and probably many other domain scientists) because they are standard-conforming and the syntax is just elegant. Personally, I am inclined to use them as long as the performance penalty is acceptable. I will come back to the testing part once I have my co_alltoall implemented. Thank you for your suggestions!

CRquantum · July 20, 2022, 8:10am

alltoall is an expensive operation (it seems to be more or less an $\mathcal{O}(n^2)$ operation, where $n$ is the number of CPU cores involved), especially if it has to occur frequently.

I mean, usually I would have a rank 0 core, which I call the boss core. It tries its best to distribute an equal amount of work to each core, then each core does its job independently and finally sends its results back to the rank 0 core. In such a case, you only need things like scatter, scatterv, gather, gatherv, or broadcast, which are all $\mathcal{O}(n)$ operations; there is no need to use alltoall, which is expensive.
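To make the counting explicit, here is a rough tally under the simplifying assumption that every block travels as its own point-to-point message (real MPI libraries often use smarter algorithms, so this is only a sketch). The symbol $M$ for the total number of messages among $n$ processes is introduced here for illustration:

$$
M_{\text{alltoall}} = n(n-1) = \mathcal{O}(n^2), \qquad M_{\text{scatter}} = M_{\text{gather}} = n-1 = \mathcal{O}(n)
$$

Note that the $n(n-1)$ messages of an alltoall are spread over all $n$ processes, so each individual process only sends and receives $n-1$ blocks.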

So, my personal naïve opinion is that you may want to check whether alltoall is absolutely necessary. If you are able to use things like scatter, scatterv, or broadcast instead of alltoall, that may give you more speedup than implementing a coarray version of alltoall.

Coarray implementations seem to be built more or less on top of MPI, so a coarray version of alltoall would presumably perform similarly to MPI’s intrinsic alltoall, if not better.
