Generate same sequence of random numbers on a program with 1 or N tasks, using `random_number` and MPI

Dear all,

I think it is not possible to do this with only random_number and MPI, but I wanted to confirm here:

Is there any straightforward way of generating the same sequence of random numbers, in parallel, using Fortran’s random_number, and MPI? For instance, with two MPI tasks, having each task generating only half of the whole sequence? Something similar to this RANDOM_MPI code.

Thanks in advance!

1 Like

If you need the same sequence, why not initialise the PRNG with the same seed value for all MPI instances?

Thanks!

Because I want, e.g., task 1 to generate the first N/2 numbers, and task 2 to generate the last N/2. If I use the same seed and generate half of the numbers in each task, I get two identical sets. If I use two distinct seeds, the two subsets will differ but I won’t get the same as that computed with only one task.

If I use the same seed and have each task generating the same N numbers, and then picking only the relevant subset, it beats the purpose of generating them in parallel.

Could you use a different seed for each element of the array? Then you’re independent from the number of tasks and you always use the same seed between runs.

If it works, you could then further optimize it so that you don’t use one seed per array element, but per smallest possible domain decomposition. For example, if the smallest tile size is, say, 4, you could use one value of seed per 4 elements.

Take this with a grain of salt that it’s just an idea from the top of my head and I’ve never generated reproducible random numbers in parallel. It’s quite possible there are libraries that do this, it seems like it’d be useful.

Perhaps what you need are the “jump functions” that some PRNG’s provide. E.g. for the xoshiro/xoroshiro family:

All generators, being based on linear recurrences, provide jump functions that make it possible to simulate any number of calls to the next-state function in constant time, once a suitable jump polynomial has been computed. We provide ready-made jump functions for a number of calls equal to the square root of the period, to make it easy generating non-overlapping sequences for parallel computations [emphasis added], and equal to the cube of the fourth root of the period, to make it possible to generate independent sequences on different parallel processors.

Edit: this solution doesn’t work with the intrinsic random_number subroutine.

1 Like

Thanks!

Indeed, not being an expert in RNG, I actually tried this (1 seed per element) but it turned out that the sequence was far from random, but the integer seeds were also very close to each other. Later I can paste here the actual problem so people can see what I mean.

If the numbers represent a random sequence, how would task 1 know
its N/2 prn where generated from the same seeds or different seeds of the
N/2 prn of task 2? It seems you’re looking for the random_init intrinsic
subprogram.

1 Like

For the Fortran intrinsic random_number(), one could precompute the seeds of the RNG after generating N/2 variates, store those seeds, and pass them to the 2nd MPI task. This can be generalized to N MPI tasks. Here is a program showing how to use stored seeds to generate the next random variates in a sequence.

code
program xrandom_resume
implicit none
integer, parameter   :: dp = kind(1.0d0), n = 2
integer, allocatable :: seeds(:)
real(kind=dp)        :: x(n)
call put_random_seed(123)
call random_number(x)
call get_random_seeds(seeds)
print*,seeds
call random_number(x) ! 2nd batch of variates
print*,x
call random_seed(put=seeds)
call random_number(x) ! reproduce 2nd batch of variates
print*,x
!
contains
subroutine get_random_seeds(seeds)
integer, intent(out), allocatable :: seeds(:)
integer                           :: nseeds
call random_seed(size=nseeds)
allocate (seeds(nseeds))
call random_seed(get=seeds)
end subroutine get_random_seeds
!
subroutine put_random_seed(iadd) 
integer, allocatable :: seed(:) 
integer, intent(in)  :: iadd
integer              :: seed_size
integer, parameter   :: nmax = 33
integer, parameter   :: iran(nmax) = &
[993639,510370,867989,459062,250118,21367,615149,928396,60451,&
375304,451750,778177,920970,929490,559068,168877,198559,&
147494,138494,709469,770595,304923,951942,108290,651756,&
326105,825471,403004,732210,245191,968430,163348,865109]
call random_seed(size=seed_size)
if (seed_size < 1) then
   return
else if (seed_size <= nmax) then
   allocate (seed(seed_size))
   seed = iran(:seed_size)
   call random_seed(put=seed+iadd)
else
   write (*,*) "in put_random_seed, seed_size =",seed_size," need seed_size <= ",nmax,"STOPPING"
   stop
end if
end subroutine put_random_seed
end program xrandom_resume

Thank you!

Indeed, I looked into F2018’s random_init, but I was unable to understand if it would make this work.

Would anyone familiar with random_init be able to point me on how to generate the sequence in parallel in practice?

Thank you for the idea and the example!

This idea would make the sequence reproducible with changing number of MPI tasks. However, it would also serialize the operations, so it would not be really a concurrent (parallel) generation of random numbers.

Here is a reproducer of the problem. I want the distributed 3D array arr(:,:,:) have reproducible random numbers for the same global index ig,jg,kg, irrespective of the number of MPI tasks. Right now I just loop through the whole domain and pick the indexes pertaining to the local array arr.

Please don’t mind the MPI domain decomposition boilerplate code:

program p
  use mpi_f08
  implicit none
  integer, parameter, dimension(3) :: ng =[8,4,6]
  integer :: il,jl,kl,ig,jg,kg,n(3),lo(3)
  real :: rn
  integer, allocatable :: iseed(:)
  integer :: nrank,myid,dims(3),coords(3),rem(3)
  type(MPI_COMM) :: COMM_CART
  real, allocatable :: arr(:,:,:)
  !
  ! domain decomposition MPI boiler plate code
  !
  call MPI_INIT()
  call MPI_COMM_SIZE(MPI_COMM_WORLD,nrank)
  dims(:) = 0
  call MPI_DIMS_CREATE(nrank,3,dims)
  call MPI_CART_CREATE(MPI_COMM_WORLD,3,dims,[.false.,.false.,.false.],.true.,COMM_CART)
  call MPI_COMM_RANK(COMM_CART,myid)
  call MPI_CART_COORDS(COMM_CART,myid,3,coords)
  !
  ! determine local array extents
  !
  n = ng(:)/dims(:) ! local array size
  rem = mod(ng(:),dims(:))
  where(coords(:)+1 <= rem(:)) n(:) = n(:) + 1
  lo(:) = (coords(:)  )*n(:)
  where(coords(:)+1 >  rem(:))
    lo(:) = lo(:) +    rem(:)
  end where
  !
  ! random number generation
  ! goal -- the random number corresponding 
  ! to each global index ig,jg,kg must be independent
  ! of the number of MPI tasks
  !
  allocate(iseed(64))
  iseed(:) = 42
  call random_seed(put=iseed(:))
  allocate(arr(n(1),n(2),n(3)))
  do kg=1,ng(3)             ! global index kg
    kl = kg - (lo(3)-1)     ! local  index kl
    do jg=1,ng(2)           ! global index jg
      jl = jg - (lo(2)-1)   ! local  index jl
      do ig=1,ng(1)         ! global index ig
        il = ig - (lo(1)-1) ! local  index il
        !
        ! random_number called product(ng(:)) times to
        ! obtain the same sequence for all MPI tasks
        !
        call random_number(rn)
        !
        ! now I choose the values corresponding 
        ! to the local computational subdomain
        !
        if(il >= 1.and.il <= n(1) .and. &
           jl >= 1.and.jl <= n(2) .and. &
           kl >= 1.and.kl <= n(3) ) then
           arr(il,jl,kl) = rn
           print*,ig,jg,kg,rn
        endif
      enddo
    enddo
  enddo
  call MPI_FINALIZE()
end

and here’s how to test the program:

mpif90 prog.f90 # compile

mpirun -n 1 ./a.out | sort -k3,3 -k2,2 -k1,1 -n > out_1.log # run and sort outputs
mpirun -n 3 ./a.out | sort -k3,3 -k2,2 -k1,1 -n > out_3.log # run and sort outputs

diff out_1.log out_3.log # should be equal!