All MPI routines available even when not explicitly included in the `only` args list

I am trying to understand the solution I was provided to a problem I encountered while compiling an MPI program with the Intel compilers (see Solved: Error: Symbol ‘mpi_gather’ referenced at (1) not found in module ‘mpi’ - Intel Community). For a simple test program like,

! @file test_mpi.f90
program test_mpi
  use mpi, only: mpi_gather, mpi_init, mpi_finalize
end program

and attempting to compile with the Intel toolchain,

. /opt/intel/oneapi/setvars.sh 

which mpif90
# /opt/intel/oneapi/mpi/2021.14/bin/mpif90

mpif90 --version
# GNU Fortran (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
# Copyright (C) 2023 Free Software Foundation, Inc.
# This is free software; see the source for copying conditions.  There is NO
# warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

mpif90 test_mpi.f90

the resulting error is

Error: Symbol ‘mpi_gather’ referenced at (1) not found in module ‘mpi’

It turns out that the solution is simply to remove the mpi_gather routine from the only args list. However, mpi_gather is still usable in the above test program even though it is not explicitly imported, since no explicit interface exists for it in the module anyway (see the Intel community discussion linked at the beginning of my post for details). The more modern mpi_f08 module (your program must be compiled with mpiifx if you want to use it) shows the same behavior: routines like mpi_send, mpi_gather, etc. are all usable in the final binary despite never having been explicitly included via only: ....

Why and/or how is it that such routines are usable even though they are not explicitly included in the only list? This seems to violate expected behavior: if one sees a module file containing

use mpi, only: mpi_gather, mpi_init, mpi_finalize

then I would naturally expect only those names to be accessible in that file. However, this contract is not enforced.
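
For comparison, the contract is enforced for names the compiler must resolve at compile time. Here is a minimal non-MPI sketch (module m and functions f, g are hypothetical) showing the behavior I would have expected:

module m
  implicit none
contains
  integer function f()
    f = 1
  end function
  integer function g()
    g = 2
  end function
end module

program p
  use m, only: f
  implicit none
  print *, f()
  print *, g()   ! rejected: 'g' has no IMPLICIT type, as I would expect
end program

Yet a bare call to a routine omitted from the only list sails through, as the full example below shows.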

A full example of this behavior (routines not explicitly included are usable) can be shown in the below program:

! @file mpi_send_example.f90
!
! @usage
! mpif90 mpi_send_example.f90
! mpirun -n 2 ./a.out
program mpi_send_example
    use mpi, only: MPI_INTEGER, MPI_COMM_WORLD, MPI_STATUS_IGNORE
    implicit none

    integer :: ierr, rank, size
    integer :: tag, source, dest
    integer :: message

    ! Initialize MPI
    call MPI_Init(ierr)
    call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
    call MPI_Comm_size(MPI_COMM_WORLD, size, ierr)

    tag = 0
    source = 0
    dest = 1

    if (size < 2) then
        if (rank == 0) print *, "This program needs at least 2 MPI processes."
        call MPI_Finalize(ierr)
        stop
    end if

    if (rank == source) then
        message = 42
        print *, "Rank", rank, "sending message:", message, "to rank", dest
        call MPI_Send(message, 1, MPI_INTEGER, dest, tag, MPI_COMM_WORLD, ierr)

    else if (rank == dest) then
        call MPI_Recv(message, 1, MPI_INTEGER, source, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE, ierr)
        print *, "Rank", rank, "received message:", message, "from rank", source
    end if

    call MPI_Finalize(ierr)
end program mpi_send_example

Something is weird here. It looks like the Intel mpif90 is defaulting to gfortran instead of ifx. As far as I know, the Intel module files are not compatible with gfortran. Did you try using mpif90 -f90=ifx? As discussed in other threads, Fortran really needs a standardized module format, but the standards committee appears to have no interest in defining one. I am somewhat surprised that mpiifx gave you the same error. Remember, all the module files do is provide an explicit interface (i.e., an argument list) for the routines, so the compiler can check for argument errors. The actual code lives in the MPI libraries you link against, not in the module files, so the symbols are still visible to the linker. The routines are just treated as external procedures with implicit interfaces. Still, I'm not sure why mpiifx didn't work.
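
To see that mechanism in isolation, here is a minimal sketch (the file names and the routine fake_send are made up; fake_send stands in for a routine living in libmpifort):

! @file fakelib.f90 -- stands in for the MPI library: a plain external routine
subroutine fake_send(n)
  implicit none
  integer :: n
  print *, "fake_send got", n
end subroutine

! @file main.f90 -- no use statement, no interface, no external declaration
!
! @usage
! gfortran fakelib.f90 main.f90 && ./a.out
program main
  implicit none
  call fake_send(42)   ! accepted with an implicit interface; resolved by the linker
end program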

2 Likes

The interfaces also define function result types and allow keyword arguments. If there were any assumed-shape arguments, optional arguments, target arguments, allocatable arguments, etc. (all useful advanced Fortran features), the interfaces would enable those as well.

2 Likes

Using mpiifx throws the same error:

test_mpi.f90(3): error #6580: Name in only-list does not exist or is not accessible. [MPI_GATHER]
use mpi, only: mpi_gather, mpi_init, mpi_finalize
-----------------^
compilation aborted for test_mpi.f90 (code 1)

mpiifx --version
# ifx (IFX) 2025.0.4 20241205
# Copyright (C) 1985-2024 Intel Corporation. All rights reserved.

I think Intel’s version of MPI does not support MPI_GATHER, or whoever builds Intel’s MPI has dropped the ball and forgotten to add it to mpi.f90 or mpi_f08.f90. I looked at the mpif.h file that comes with Intel MPI (on my Linux box it’s in /opt/intel/oneapi/mpi/2021.15/include) and it’s not listed as an external function there. It has been part of MPI for as long as I can remember (it’s listed in my MPI V2 books). The MPI version is 2021.15 and the compiler (ifx) is 2025.1.1.
I don’t remember whether Intel MPI is based on Open MPI or MPICH.

EDIT

A little more detective work. The last version of MPI I built from scratch was a 2022 version of MPICH. MPI_GATHER is not mentioned in either mpif.h or the mpi.f90 module file, but it is listed in the mpi_f08.f90 module file.

1 Like

Follow-ups

I think Intel’s version of MPI does not support MPI_GATHER…

To this point, I wrote a small bash script to count the MPI routines for which no explicit interface is provided and which therefore cause use mpi, only: <routine> to fail at compile time: https://community.intel.com/t5/Intel-MPI-Library/Error-Symbol-mpi-gather-referenced-at-1-not-found-in-module-mpi/m-p/1721767/highlight/true#M12236. I have attached the list of these routines as a text file below.

mpi_routines_with_choice_buffer.txt (1.8 KB)

MPI_GATHER is not mentioned in either mpif.h or the mpi.f90 module file, but it is listed in the mpi_f08.f90 module file.

To this point, yes: something like

program test
  use mpi_f08, only: mpi_gather
end program

compiles with mpiifx without issue.

Further Questions

I believe your point about

Remember, all the module files do is provide an explicit interface (i.e., an argument list) for the routines, so the compiler can check for argument errors. The actual code lives in the MPI libraries you link against, not in the module files, so the symbols are still visible to the linker.

almost solves my conceptual problem, but I have one more question that I think will wrap up my understanding:

If the module files in the oneAPI MPI include directory define explicit interfaces for the purpose of compile-time error checking, and the linker uses the shared libraries (e.g., libmpifort.so) to resolve the symbols (e.g., all MPI routines) that the final binary needs, does this mean that the final binary can always use any symbol in a shared library, regardless of whether an explicit interface is provided in the interface files (i.e., .mod files)?

Even asking this question seems silly, but the behavior in my example above (see All MPI routines available even when not explicitly included in the `only` args list) is fundamentally puzzling to me: being able to call routines like mpi_send even when they are not listed in only violates the programming contract that I thought was the whole point of the use <module>, only: <symbols> syntax…

To round things out, here are the include and link flags that mpiifx actually uses; these are ancillary to the question itself, more just for personal reference.

$ mpiifx -show test_mpi.f90
ifx 'test_mpi.f90' 
-I"/opt/intel/oneapi/mpi/2021.14/include/mpi" 
-I"/opt/intel/oneapi/mpi/2021.14/include" 
-I"/opt/intel/oneapi/mpi/2021.14/include/mpi" 
-L"/opt/intel/oneapi/mpi/2021.14/lib" 
-L"/opt/intel/oneapi/mpi/2021.14/lib" 
-Xlinker 
--enable-new-dtags 
-Xlinker 
-rpath 
-Xlinker "/opt/intel/oneapi/mpi/2021.14/lib" 
-Xlinker 
-rpath 
-Xlinker "/opt/intel/oneapi/mpi/2021.14/lib" 
-lmpifort -lmpi -ldl -lrt -lpthread

Yes. The first MPI versions used implicit interfaces and were based on Fortran 77, and in that language model that is how all external subprograms are referenced. It was later, in 1997, that MPI-2 introduced explicit interfaces and the USE statement, but it did so in a backwards-compatible way so that existing codes could still use the implicit-interface conventions. The MPI-3 standard fully supports modern Fortran conventions, with the dedicated mpi_f08 module and the use of iso_c_binding.

As you have noticed, that convention has the feature that an external procedure can still be referenced even when it is not included in the ONLY list of the USE statement. MPI is not the only legacy library that works this way, and if you maintain your own legacy library with this convention, it will have this feature too. As noted in my previous reply, explicit interfaces also provide many other language features that are not available with implicit interfaces, so if your code needs or uses any of those, it must use the modern Fortran conventions rather than the legacy ones.

Another feature of the USE statement is that it overrides other symbols with the same names, for example symbols available through host association. Say you have your own mpi_send routine within your module. Normally, any reference to mpi_send within the module would be to your subroutine, but the USE statement would bring in the library's mpi_send, overriding yours. The ONLY clause lets you reverse that: it brings in only the symbols on the list, and if mpi_send is not on that list, your subroutine is referenced instead. On the other hand, if you do want to override your subroutine in a particular place, then adding mpi_send to the ONLY list lets you do that without affecting how the rest of your code works. Another variation on this is the rename clause,

use mpi, only: new_mpi_send=>mpi_send

This allows you to use both your own mpi_send and the standard mpi_send from the MPI library (under the new name) within the same namespace. So the ONLY clause gives the programmer the flexibility to pick, choose, and rename.
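
As a concrete non-MPI sketch of that pick/choose/rename flexibility (the modules liba and mine and their routines are made up):

module liba
  implicit none
contains
  subroutine send(msg)
    character(*), intent(in) :: msg
    print *, "liba send: ", msg
  end subroutine
end module

module mine
  use liba, only: lib_send => send   ! rename keeps the name 'send' free here
  implicit none
contains
  subroutine send(msg)               ! our own 'send'; no clash with liba's
    character(*), intent(in) :: msg
    call lib_send("[wrapped] " // msg)
  end subroutine
end module

program demo
  use mine, only: send
  call send("hello")   ! prints: liba send: [wrapped] hello
end program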

3 Likes

Adding to what @RonShepard said, the issues that early versions of MPI had with some Fortran 90 features (specifically assumed-shape arrays) were one of the major motivations for both extending iso_c_binding and introducing the ISO_Fortran_binding.h CFI functions. If I remember correctly, there were situations where passing an assumed-shape array to an MPI function would make the program hang and/or crash with a segmentation fault.

What was happening is that, without an interface, Fortran would do a copy-in to the MPI routine. Depending on the function you were using and other parameters like the EAGER_LIMIT and its associated protocols, MPI would either try to send the data directly or buffer the data into a separate memory space that would then be used to send the data to the receiver. The problem was that when MPI chose to send the data directly, it assumed the memory address was the one passed to it, not a copy. MPI was free to start a communication with this data and then return immediately, assuming the communication was still going on in the background. Unfortunately, when the MPI function returned to its caller, the temporary copy made on copy-in was deleted, leaving the background communication in limbo: the data it thought it was sending had disappeared before the communication was completed. Note that even some blocking sends, where you would think the data was always buffered, might actually be sent directly if the message size was small enough (i.e., below the EAGER_LIMIT).

The first time I saw this problem mentioned was in the paper by Kent Danielson and Raju Namburu, “Nonlinear dynamic finite element analysis on parallel computers using Fortran 90 and MPI”, presented at the 4th NASA National Symposium on Large-Scale Analysis and Design on High-Performance Computers and Workstations in Oct. 1997. A copy of Kent and Raju’s paper is available here:
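
That failure mode can be sketched in a few lines, assuming two ranks and the legacy mpi module (where MPI_Isend has no explicit interface). Whether it actually fails depends on the message size relative to the eager limit, which is part of what made the bug so hard to track down:

! @file isend_hazard.f90 -- illustrates the copy-in hazard described above
!
! @usage
! mpif90 isend_hazard.f90
! mpirun -n 2 ./a.out
program isend_hazard
    use mpi
    implicit none
    integer :: a(4,4), b(4), req, ierr, rank

    call MPI_Init(ierr)
    call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
    a = rank
    if (rank == 0) then
        ! a(1,:) is non-contiguous; with the implicit interface the compiler
        ! silently passes a hidden contiguous temporary to MPI_Isend
        call MPI_Isend(a(1,:), 4, MPI_INTEGER, 1, 0, MPI_COMM_WORLD, req, ierr)
        ! the temporary may already be freed here, while the nonblocking
        ! send is still reading it -- undefined behavior
        call MPI_Wait(req, MPI_STATUS_IGNORE, ierr)
    else if (rank == 1) then
        call MPI_Recv(b, 4, MPI_INTEGER, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE, ierr)
    end if
    call MPI_Finalize(ierr)
end program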

2 Likes

Even with an explicit interface, copy-in/copy-out sometimes occurs. Here is a small program that demonstrates some of the landmines:

program xxx
   integer :: b(3,3) = 1
   integer, target :: bt(3,3) = 2
   ! b(1,:) and bt(1,:) are non-contiguous row sections in the calls below
   call printx( 'b(1,2)=', b(1,2) )
   call legacy( b(1,:) )
   call modern( b(1,:) )
   call printx( 'bt(1,2)=', bt(1,2) )
   call legacy( bt(1,:) )
   call modern( bt(1,:) )
contains
   subroutine legacy(a)
      integer :: a(*)          ! assumed-size: presumed contiguous, so a copy is made
      call printx( 'legacy:', a(2) )
      return
   end subroutine legacy
   subroutine modern(a)
      integer, intent(inout), target :: a(:)   ! assumed-shape: strides allowed, no copy
      call printx( 'modern:', a(2) )
      return
   end subroutine modern
   subroutine printx( head, i )
      use, intrinsic :: iso_c_binding, only: c_loc
      use, intrinsic :: iso_fortran_env, only: int64
      character(*) :: head
      integer, target :: i
      ! print the value and the address of the element actually received
      write(*,'(a12,i4,i20)') head, i, transfer( c_loc(i), 1_int64)
      return
   end subroutine printx
end program xxx

$ gfortran --warn-all testx.f90 && a.out
     b(1,2)=   1          4333027340
     legacy:   1     105553121378404
     modern:   1          4333027340
    bt(1,2)=   2          4333027376
     legacy:   2     105553121378404
     modern:   2          4333027376

There are two kinds of addresses seen here: the small ones belong to the arrays in static storage, and the large ones to the temporary array copies on the stack. The legacy interface has an assumed-size dummy array that is presumed to be contiguous, so even with the explicit interface a copy of the non-contiguous actual argument is made. This occurs regardless of whether the actual argument has the target attribute (the difference between b(:,:) and bt(:,:)). The other landmine is that the temporary stack storage can be reused: in this example, the same stack memory is used for both calls to the subroutine legacy.

If the subroutine legacy had saved an internal pointer to the dummy argument in the first call, in order to work on the array asynchronously after the return, then that pointer would immediately become stale, and after the second call it would instead point to the second call's temp storage and its copy of the other array. Of course, a compiler could also allocate unique temp copies and flag the address as illegal once it is outside the current stack frame. Or, as they say, it could start WWIII.

The legal way to navigate this situation, which I think this sample code does for subroutine modern and the bt(:,:) actual argument, is for both the actual array and the assumed-shape dummy array to have the target attribute. That, of course, cannot be done with a legacy-style interface, whether implicit or explicit. I thought I could get gfortran to warn the programmer about possible mismatches with the right combination of compiler options, but I didn't find that combination with this code. Maybe someone knows the right incantation?
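
For reference, here is a stripped-down sketch of that target/target pattern (module keep and subroutine stash are made up): because both the actual argument and the assumed-shape dummy have the target attribute, a pointer saved inside the subroutine remains associated with the actual array after the call returns.

module keep
   implicit none
   integer, pointer :: saved(:) => null()
contains
   subroutine stash(a)
      integer, intent(inout), target :: a(:)
      saved => a            ! legal: a is a TARGET assumed-shape dummy
   end subroutine
end module

program demo
   use keep
   implicit none
   integer, target :: bt(3,3) = 2
   call stash( bt(1,:) )    ! actual argument also has TARGET
   print *, saved           ! still valid: prints 2 2 2
end program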

2 Likes

Exactly. I think the best way to handle these problems is to use the MPI F08 interfaces if available. I say "if" because the last time I built MPI from scratch, the F08 interfaces were optional and not the default. If you are using a version of MPI built by someone else, you need to verify that the F08 interfaces are available.
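
If in doubt, a one-file probe like the sketch below (the file name is made up) settles it: it compiles only if the build provides the mpi_f08 module, since derived-type handles like type(MPI_Comm) exist only there.

! @file f08_check.f90
!
! @usage
! mpiifx f08_check.f90   (or mpifort, depending on the MPI distribution)
program f08_check
    use mpi_f08, only: MPI_Init, MPI_Finalize, MPI_Comm
    implicit none
    type(MPI_Comm) :: comm   ! derived-type handle, an mpi_f08 feature
    call MPI_Init()
    call MPI_Finalize()
end program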

1 Like

The final binary can’t use just any symbol, because of name mangling: a module procedure gets a compiler-specific mangled name that a plain external reference will not match, so only genuinely external symbols resolve this way. You can use implicit none (type, external) to force explicit declaration of external procedures (those referenced without an interface).
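
A minimal sketch of that (the routine name report_status is made up):

program strict
    implicit none (type, external)
    external :: report_status   ! now required: externals must be declared
    call report_status()        ! without the line above, this call is rejected
end program

subroutine report_status()
    print *, "ok"
end subroutine

Note that implicit none (type, external) is a Fortran 2018 feature, so it needs a reasonably recent compiler.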