How to use MPI in a time-iteration loop to solve an unsteady problem?

Hi,

I would like to use the mumps solver to solve an unsteady problem.

The below program complains

./mumpstest 

Entering DMUMPS 5.6.2 with JOB, N, NNZ =   6           5             12
      executing #MPI =      1, without OMP

 =================================================
 MUMPS compiled with option -Dpord
 =================================================
L U Solver for unsymmetric matrices
Type of parallelism: Working host

 ****** ANALYSIS STEP ********

 Processing a graph of size:         5
 ... Structural symmetry (in percent)=   92
 Average density of rows/columns =    2
 ... No column permutation
 Ordering based on AMF 

Leaving analysis phase with  ...
 INFOG(1)                                       =               0
 INFOG(2)                                       =               0
 -- (20) Number of entries in factors (estim.)  =              25
 --  (3) Real space for factors    (estimated)  =              25
 --  (4) Integer space for factors (estimated)  =              30
 --  (5) Maximum frontal size      (estimated)  =               5
 --  (6) Number of nodes in the tree            =               1
 -- (32) Type of analysis effectively used      =               1
 --  (7) Ordering option effectively used       =               2
 ICNTL (6) Maximum transversal option           =               0
 ICNTL (7) Pivot order option                   =               7
 ICNTL(13) Parallelism/splitting of root node   =               0
 ICNTL(14) Percentage of memory relaxation      =              20
 ICNTL(15) Analysis by block effectively used   =               0
 ICNTL(18) Distributed input matrix (on if >0)  =               0
 ICNTL(58) Symbolic factorization option        =               2
 Number of level 2 nodes                        =               0
 Number of split nodes                          =               0
 RINFOG(1) Operations during elimination (estim)= 7.000D+01

 MEMORY ESTIMATIONS ... 
 Estimations with standard Full-Rank (FR) factorization:
    Total space in MBytes, IC factorization      (INFOG(17)):           0
    Total space in MBytes,  OOC factorization    (INFOG(27)):           0

 Elapsed time in analysis driver=       0.0006


****** FACTORIZATION STEP ********

 GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
 Number of working processes                =               1
 ICNTL(22) Out-of-core option               =               0
 ICNTL(35) BLR activation (eff. choice)     =               0
 ICNTL(37) BLR CB compression (eff. choice) =               0
 ICNTL(49) Compact workarray S (end facto.) =               0
 ICNTL(14) Memory relaxation                =              20
 INFOG(3) Real space for factors (estimated)=              25
 INFOG(4) Integer space for factors (estim.)=              30
 Maximum frontal size (estimated)           =               5
 Number of nodes in the tree                =               1
 ICNTL(23) Memory allowed (value on host)   =               0
           Sum over all procs               =               0
 Memory provided by user, sum of LWK_USER   =               0
 Effective threshold for pivoting, CNTL(1)  =      0.1000D-01
 Max difference from 1 after scaling the entries for ONE-NORM (option 7/8)   = 0.38D+00
 Effective size of S     (based on INFO(39))=                 2445

 Elapsed time to reformat/distribute matrix =      0.0002

 ** Memory allocated, total in Mbytes           (INFOG(19)):           0
 ** Memory effectively used, total in Mbytes    (INFOG(22)):           0

 Elapsed time for factorization                     =      0.0008

Leaving factorization with ...
 RINFOG (2) Operations in node assembly             = 0.000D+00
 ------ (3) Operations in node elimination          = 7.000D+01
 ICNTL  (8) Scaling effectively used                =               7
 INFOG  (9) Real space for factors                  =              25
 INFOG (10) Integer space for factors               =              30
 INFOG (11) Maximum front size                      =               5
 INFOG (29) Number of entries in factors            =              25
 INFOG (12) Number of off diagonal pivots           =               1
 INFOG (13) Number of delayed pivots                =               0
 INFOG (14) Number of memory compress               =               0

 Elapsed time in factorization driver               =      0.0013


 ****** SOLVE & CHECK STEP ********

 GLOBAL STATISTICS PRIOR SOLVE PHASE ...........
 Number of right-hand-sides                    =           1
 Blocking factor for multiple rhs              =           1
 ICNTL (9)                                     =           1
  --- (10)                                     =           0
  --- (11)                                     =           0
  --- (20)                                     =           0
  --- (21)                                     =           0
  --- (30)                                     =           0
  --- (35)                                     =           0
 ** Space in MBYTES used for solve                        :         0

 Leaving solve with ...
 Time to build/scatter RHS        =       0.000312
 Time in solution step (fwd/bwd)  =       0.000075
  .. Time in forward (fwd) step   =          0.000027
  .. Time in backward (bwd) step  =          0.000038
 Time to gather solution(cent.sol)=       0.000002
 Time to copy/scale dist. solution=       0.000000

 Elapsed time in solve driver=       0.0008
 Solution is   0.99999999999999856        1.9999999999999998        2.9999999999999996        3.9999999999999996        5.0000000000000018     

Entering DMUMPS 5.6.2 with JOB =  -2
      executing #MPI =      1, without OMP
*** The MPI_Init() function was called after MPI_FINALIZE was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[george-virtualbox:04044] Local abort after MPI_FINALIZE started completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!

I am unsure where to place the MPI_INIT, MPI_FINALIZE calls within my time-step and iteration loops.

Any advice?

This is my code

program main

   ! to build from commandline: mpif90 mumpstest.f90 -o mumpstest -I/usr/include -L/usr/lib/x86_64-linux-gnu -ldmumps
   ! to run from commandline: ./mumpstest

   implicit none

   integer, parameter      :: dp = kind(0.0D0)
   integer                 :: t, iter
   integer                 :: n
   integer                 :: nnz

   integer, allocatable    :: irn(:) ! list of row numbers for non-zero matrix entries
   integer, allocatable    :: jcn(:) ! list of column numbers for non-zero matrix entries
   real(dp), allocatable   :: A(:)
   real(dp), allocatable   :: rhs(:)

   n = 5
   nnz = 12

   allocate(irn(nnz))
   irn = [1,2,4,5,2,1,5,3,2,3,1,3]

   allocate(jcn(nnz))
   jcn = [2,3,3,5,1,1,2,4,5,2,3,3]

   allocate(a(nnz))
   a = [3._dp,-3._dp,2._dp,1._dp,3._dp,2._dp,4._dp,2._dp,6._dp,-1._dp,4._dp,1._dp]

   allocate(rhs(n))
   rhs = [20._dp,24._dp,9._dp,6._dp,13._dp]

   do t = 1, 3 ! time-step loop
      do iter = 1, 3 ! iteration loop
         rhs = t*rhs
         call solve(n, nnz, irn, jcn, A, rhs)
      end do
   end do

contains
   subroutine solve(n, nnz, irn, jcn, A, rhs)
      use iso_fortran_env, only: stdout=>output_unit
      use mpi_f08, only: MPI_COMM_WORLD, MPI_INIT, MPI_FINALIZE

      IMPLICIT NONE
      INCLUDE 'dmumps_struc.h'

      type(dmumps_struc)                     :: mumps_par
      integer                                :: ierr
      integer, parameter                     :: dp = kind(0.0D0)
      integer                                :: i
      integer, intent(in)                    :: n
      integer, intent(in)                    :: nnz
      integer, allocatable, intent(in)       :: irn(:) ! list of row numbers for non-zero matrix entries
      integer, allocatable, intent(in)       :: jcn(:) ! list of column numbers for non-zero matrix entries
      real(dp), allocatable, intent(in)      :: A(:)
      real(dp), allocatable, intent(inout)   :: rhs(:)

      CALL MPI_INIT(IERR)

      mumps_par%COMM = MPI_COMM_WORLD%MPI_VAL

      mumps_par%JOB = -1
      mumps_par%SYM = 0
      mumps_par%PAR = 1
      CALL DMUMPS(mumps_par)

      IF (mumps_par%INFOG(1) < 0) THEN
         WRITE(stdout,'(A,A,I6,A,I9)') " ERROR RETURN: ",&
            "  mumps_par%INFOG(1)= ", mumps_par%INFOG(1),&
            "  mumps_par%INFOG(2)= ", mumps_par%INFOG(2)
         GOTO 500
      END IF

      IF (mumps_par%MYID == 0) THEN
         mumps_par%n = n
         mumps_par%nnz = nnz
         allocate(mumps_par%irn, source=irn)
         allocate(mumps_par%jcn, source=jcn)
         allocate(mumps_par%a, source=a)
         allocate(mumps_par%rhs, source=rhs)
      END IF

      mumps_par%JOB = 6
      CALL DMUMPS(mumps_par)

      IF (mumps_par%INFOG(1) < 0) THEN
         WRITE(stdout,'(A,A,I6,A,I9)') " ERROR RETURN: ",&
            "  mumps_par%INFOG(1)= ", mumps_par%INFOG(1),&
            "  mumps_par%INFOG(2)= ", mumps_par%INFOG(2)
         GOTO 500
      END IF

      IF (mumps_par%MYID == 0) THEN
         WRITE(stdout,*) 'Solution is ',(mumps_par%RHS(I),I=1,mumps_par%N)
      END IF

      mumps_par%JOB = -2
      CALL DMUMPS(mumps_par)
      IF (mumps_par%INFOG(1) < 0) THEN
         WRITE(stdout,'(A,A,I6,A,I9)') " ERROR RETURN: ",&
            "  mumps_par%INFOG(1)= ", mumps_par%INFOG(1),&
            "  mumps_par%INFOG(2)= ", mumps_par%INFOG(2)
         GOTO 500
      END IF
500   CALL MPI_FINALIZE(IERR)
   end subroutine solve

end program main

To set up

sudo apt install libmumps-dev libmumps-headers-dev
mpif90 mumpstest.f90 -o mumpstest -I/usr/include -L/usr/lib/x86_64-linux-gnu -ldmumps
./mumpstest
1 Like

MPI_INIT and MPI_FINALIZE are the first and last thing to do at the program level scope.

You should move them outside of the solve subroutine and place MPI_INIT just after your last variable declaration within the program scope and MPI_FINALIZE just before your contains scope.

1 Like

Thank you!

1 Like