Problem with MPI

Hi. I’ve been having problems with MPI since I updated my operating system to Ubuntu 22.04. I did a fresh install from scratch. After installing Ubuntu, I installed gfortran, lapack, mpi, mpich, libopenmpi, and other libraries.

This was some months ago, and I think MPI was working fine at that point; I used a hello-world program to test MPI. This is one of the test codes I use:

program test

   include 'mpif.h'

   call mpi_init(ierr)
   call mpi_comm_rank(mpi_comm_world, iam, ierr)
   call mpi_comm_size(mpi_comm_world, nproc, ierr)

   print *, 'iam=', iam, '  nproc=', nproc

   call mpi_finalize(ierr)

   stop
end

So, I think I checked it, and it worked properly at that point. Then I realized that I could install the Intel Fortran compiler for free. I don’t remember exactly what I was reading, but until then I had always used GNU Fortran, except when running my codes on a cluster. Since I use the Intel Fortran compiler on the cluster, and I saw that Intel had released its Fortran compiler for free for non-commercial use, I installed the Intel Fortran compiler and Intel oneAPI, and maybe something else; I think Intel MKL (Math Kernel Library) is included in oneAPI (I’m not totally sure what oneAPI is, but I installed it anyway).

I think the problems with MPI began after I installed the Intel compiler and these libraries. Now when I run the parallel test code, I get something like this:

$ mpirun -n 2 ./testparalel.x
iam= 0 nproc= 1
iam= 0 nproc= 1

This means that instead of creating a single MPI environment with two processes, mpirun is launching the same program twice as two independent single-process runs (which is not what it is supposed to do).

When working properly, it should give something like:
$ mpirun -n 2 ./testparalel.x
iam=0 nproc=2
iam=1 nproc=2

Any ideas on how to fix this problem?

Thank you in advance.

1 Like

Something resembling this happened to me not long ago. I had inadvertently compiled a library with one compiler and MPI library, and linked it with a program using a different compiler and MPI library. The symptom looked much like yours.

I suspect you are somehow (possibly without realising it) using gfortran to compile the program but using Intel MPI to launch it, or vice versa. Try which mpirun to see which launcher is being used, and ldd ./testparalel.x to see which dynamic libraries will be loaded at run time.
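For example (the paths below are only illustrative; yours will differ), I would check something like:

$ which mpirun                        # which launcher is first on your PATH
$ mpif90 --version                    # which compiler the wrapper on your PATH invokes
$ ldd ./testparalel.x | grep -i mpi   # which MPI runtime the executable is linked against

If which mpirun points into an Intel oneAPI directory while ldd shows the MPI libraries coming from the Ubuntu packages (or the other way round), that mismatch is the problem.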

5 Likes

I run into this quite often when I’m not careful enough and work with both GNU and Intel Fortran compilers and MPI libraries.

In addition to what Brad wrote, use mpif90 --show to see exactly what command the MPI wrapper is issuing. Intel builds provide a convenience wrapper, mpiifort, that invokes the correct compiler. On my Ubuntu 22.04 system with both GNU + OpenMPI and Intel oneAPI + Intel MPI builds installed, I use mpif90 to build with GNU and mpiifort to build with Intel.
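As a concrete sketch (the executable names are just my own convention), keeping the two stacks paired looks like this:

$ mpif90 test.f90 -o test_gnu.x       # GNU wrapper: gfortran plus the MPI it was built against
$ mpirun -n 2 ./test_gnu.x            # launch with the mpirun from that same installation
$ mpiifort test.f90 -o test_intel.x   # Intel wrapper: ifort plus Intel MPI
$ mpiexec -n 2 ./test_intel.x         # launch with Intel MPI's mpiexec

The important part is that the wrapper used to build and the launcher used to run come from the same MPI installation; which mpirun tells you which one you are about to use.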

3 Likes

Hi. Thank you both! That’s clearly the problem: I was compiling with gfortran and launching with Intel MPI.

Not related to your problem, but please stop using “mpif.h”. The mpi_f08 module is the right way to use MPI in Fortran.

See deprecate mpif.h · Issue #561 · mpi-forum/mpi-issues · GitHub for details. The F08 module is the only way to use MPI in a type-safe/type-checked manner.
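For reference, your hello-world with the F08 bindings would look something like this (a quick sketch; the ierror arguments are optional there, and argument types are checked at compile time):

program test
   use mpi_f08
   implicit none
   integer :: iam, nproc

   call MPI_Init()
   call MPI_Comm_rank(MPI_COMM_WORLD, iam)   ! MPI_COMM_WORLD is a type(MPI_Comm) constant
   call MPI_Comm_size(MPI_COMM_WORLD, nproc)

   print *, 'iam=', iam, '  nproc=', nproc

   call MPI_Finalize()
end program test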

6 Likes

Unfortunately, I faced issues twice when using mpi_f08, perhaps due to implementations not catching up with the standard. Once when trying to profile a code compiled with the Intel MPI library, using ARM MAP (no idea what the issue was, just that reverting to use mpi fixed the issue); and a second time because Cray MPICH doesn’t support the mpi_f08 module yet.

You can probably build the MPICH mpi_f08 yourself and link it against the Cray MPI C library, but if not, then it is further justification for me starting GitHub - jeffhammond/standalone_mpi_f08_module: An attempt to implement MPI Fortran 2018 support, which might be usable sometime next year.

2 Likes

Thank you very much for this effort, Jeff. I noticed it before and it seems quite useful to solve problems like those I had without reverting the whole code to use mpi.

1 Like

Hello everyone.

I’m working on a new system now, and I’m running into similar issues.

I’m compiling the code given above with: mpiifort test.f90 -o ptest.x

When I try to run the code, I obtain the following error:

$ mpiexec -n 8 ./test.x
[proxy:0:0@Kronos] HYD_spawn (…/…/…/…/…/src/pm/i_hydra/libhydra/spawn/intel/hydra_spawn.c:152): execvp error on file ./test.x (No such file or directory)
[proxy:0:0@Kronos] HYD_spawn (…/…/…/…/…/src/pm/i_hydra/libhydra/spawn/intel/hydra_spawn.c:152): execvp error on file ./test.x (No such file or directory)
[proxy:0:0@Kronos] HYD_spawn (…/…/…/…/…/src/pm/i_hydra/libhydra/spawn/intel/hydra_spawn.c:152): execvp error on file ./test.x (No such file or directory)
[proxy:0:0@Kronos] HYD_spawn (…/…/…/…/…/src/pm/i_hydra/libhydra/spawn/intel/hydra_spawn.c:152): execvp error on file ./test.x (No such file or directory)
[proxy:0:0@Kronos] HYD_spawn (…/…/…/…/…/src/pm/i_hydra/libhydra/spawn/intel/hydra_spawn.c:152): execvp error on file ./test.x (No such file or directory)
[proxy:0:0@Kronos] HYD_spawn (…/…/…/…/…/src/pm/i_hydra/libhydra/spawn/intel/hydra_spawn.c:152): execvp error on file ./test.x (No such file or directory)

Any idea what’s causing this?

Thank you in advance.

Looks like either the MPI library or your executable (I’m guessing the latter) is not mounted in the environment on the other systems. That is, if your executable is in /tmp, that is likely not a shared filesystem across the nodes of the cluster (each node has its own /tmp).

Edit: Disregard this. I was mistyping the name of the program :zipper_mouth_face:

Sorry for the bother.

Hello @everythingfunctional, thank you very much for your response. I’m running this on my notebook. The file is in the same directory where I compiled it and attempted to run it.

: which mpirun
/opt/intel/oneapi/mpi/2021.10.0//bin/mpirun
: which mpiexec
/opt/intel/oneapi/mpi/2021.10.0//bin/mpiexec
: which mpiifort
/opt/intel/oneapi/mpi/2021.10.0//bin/mpiifort

1 Like

No worries. I missed that little detail too. :face_with_spiral_eyes:

1 Like

Sorry if this sounds too basic, but your file is called ptest.x; shouldn’t it then be ./ptest.x instead of ./test.x?
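Assuming the compile command you posted, the run line would then be something like:

$ mpiexec -n 8 ./ptest.x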

1 Like