Domain decomposition solver with coarrays

Hello,
I’m experimenting with coarray feature and trying to resolve a problem by domain decomposition. The particular problem I want to solve has a domain that cannot be represented with a matrix, but can be represented ( in 2D ) as connected patches of quadrilaterals (matrix). For starting, I’m using as a global domain a square that can be represented as a matrix, but let’s just pretend it can’t. So it must divided in patches and every images will have his matrix on which perform calculations.
I qant to resolve the heat equations on that square and the serial version works as expected.
I’m in doubt on how perform updating (a coarray that contains common boundaries ?).
Problems however start before updating boundaries, just before copying data from the global domain, created and decomposed by image 1, to the image patches. This is the code :

program heat_parallel

use mod_linspace
implicit none

!---------------
! type defining a patch/block of the multi-block domain (a 2D/3D domain that 
! can be decomposed in quadrilateral/hexahedral blocks) 
!---------------

type :: domain_patch
real, allocatable, dimension(:,:) :: temperature_domain , x_domain , y_domain
end type

!---------------
! every image will have its own patch. A global coarray cannot be used because
! the entire domain is not generally structured ( domain <--> matrix mXn ) 
!---------------
type(domain_patch),codimension[*] :: decomposed_domain

!-----------------
! the overall domain before decomposition in this particular case is a square, so 
! for brevity is declared as a domain patch, but it's just a coincidence
!-----------------
type(domain_patch) :: total_domain

!----------------
! decomposition is done only along first dimension for simplicity
!---------------
integer, codimension[*] :: ni_size_block


!-----------------------
! variables related to the creation of the initial global domain to be decomposed
!-----------------------

integer :: ni=400
integer :: nj
integer :: niter=1000*200

real :: len_x_domain=1.0,x_start=0, temp_up_left = 50, temp_up_right=100
real :: len_y_domain=1.0,y_start=0, temp_down_left = 50, temp_down_right=100
real :: init_temp=0.0 , a

integer::i,j,k ,nii ,n_im , acc

character(len=50) :: err_string, filenm
integer :: ierror

logical :: if_print_arrays=.false.


nj=ni

n_im= num_images()


if (this_image().eq.1) then

    allocate(total_domain%temperature_domain(ni,nj))
    allocate(total_domain%x_domain(ni,nj))
    allocate(total_domain%y_domain(ni,nj))

    do j=1,nj
        total_domain%x_domain( :, j )=linspace( x_start, x_start+len_x_domain, ni  )
    end do

    do i=1,ni
        total_domain%y_domain( i,: )=linspace( y_start, y_start+len_y_domain, nj  )
    end do

!-----------------------------------------------
!   initializing temperature field
!-----------------------------------------------

    total_domain%temperature_domain=init_temp

!-----------------------------------------------
!   setting BC
!-----------------------------------------------
    
    total_domain%temperature_domain( :, nj )=linspace( temp_up_left, temp_up_right, ni  )

    total_domain%temperature_domain( :, 1 )=linspace( temp_down_left, temp_down_right, ni  )
    
    total_domain%temperature_domain( 1, : )=linspace( temp_down_left, temp_up_left, nj  )

    total_domain%temperature_domain( ni, : )=linspace( temp_down_right, temp_up_right, nj  )

!-----------------------------------------------
! simple domain decomposition
!-----------------------------------------------

    
    nii=int( ni/n_im )

    acc=1
    do i=1,n_im 

        ni_size_block[i] = nii

        !-----------------------
        ! allocating for each image its own block
        !-----------------------

        allocate(decomposed_domain[i]%temperature_domain(nii,nj))
        allocate(decomposed_domain[i]%x_domain(nii,nj))
        allocate(decomposed_domain[i]%y_domain(nii,nj))

        !------------------------
        ! copying from the related global data portion to each image patch 
        !------------------------

        decomposed_domain[i]%temperature_domain = total_domain%temperature_domain(acc:nii, :) 
        decomposed_domain[i]%x_domain = total_domain%x_domain(acc:nii, :) 
        decomposed_domain[i]%y_domain = total_domain%y_domain(acc:nii, :) 
        acc=acc+nii

    end do

endif

!-----------
! assuring every image has received data before writing 
! patches to VTK file for inspection
!-----------
sync all

!--------------
! deallocating global data 
!--------------

deallocate(total_domain%temperature_domain)
deallocate(total_domain%x_domain)
deallocate(total_domain%y_domain)


!-----------------------------------------------
!   writing VTK solution file for each patch
!-----------------------------------------------
!..........
!..........
end program heat_parallel

The module linspace is this.
When I try to compile :

   82 |             allocate(decomposed_domain[i]%temperature_domain(nii,nj))
      |                     1
Error: Allocate-object at (1) must be ALLOCATABLE or a POINTER

What is wrong ?
Every suggestion on other approaches on how to tackle a domain decomposition problem is well accepted.
Thanks

------------ EDIT -----------
I solved the problems and now the solver runs well with 1 image, but when I use more than 1 image by inspection of the solution (and of the print output) I can see that the sync all statement after every iteration is not working. I’m using OpenCoarrays with gfortran. Has anyone experienced similar issues?

Could you post the corrected code? Now I am trying to fix the problems you have already fixed :).

First, Thanks for your help.
The solution at the second problem (evidentiated on the EDIT section) was the derived type containing a coarray. That was causing a problem with sync all, apparently. After I got rid of the derived type, the sync all statement started to work, but now there seems to be another ptoblem, the copy of data from image 0 to the other images.

program heat_parallel_bug

use mod_linspace
implicit none

!---------------
! type defining a patch/block of the multi-block domain (a 2D/3D domain that 
! can be decomposed in quadrilateral/hexahedral blocks) 
!---------------

type :: domain_patch
real, allocatable, dimension(:,:) :: temperature_domain , x_domain , y_domain
end type

type :: connect
integer :: left_img=-1, right_img=-1
end type

!---------------
! every image will have its own patch. A global coarray cannot be used because
! the entire domain is not generally structured ( domain <--> matrix mXn ) 
!---------------
real, allocatable, dimension(:,:):: temperature_domain[:] , x_domain[:] , y_domain[:]


type(connect), codimension[*] :: connectivity

!-----------------
! the overall domain before decomposition in this particular case is a square, so 
! for brevity is declared as a domain patch, but it's just a coincidence
!-----------------
type(domain_patch) :: total_domain

!----------------
! decomposition is done only along first dimension for simplicity
!---------------
integer, codimension[*] :: ni_size_block, i_down , i_up


!-----------------------
! variables related to the creation of the initial global domain to be decomposed
!-----------------------

integer :: ni=20
integer :: nj
integer :: niter=0

real :: len_x_domain=1.0,x_start=0, temp_up_left = 50, temp_up_right=100
real :: len_y_domain=1.0,y_start=0, temp_down_left = 50, temp_down_right=100
real :: init_temp=0.0 , a

integer::i,j,k ,nii ,n_im , acc,conn_left, conn_right


character(len=50) :: err_string, filenm
integer :: ierror

logical :: if_print_arrays=.false.

real :: a1,a2,a3,a4


nj=ni

n_im= num_images()


if (this_image().eq.1) then

    allocate(total_domain%temperature_domain(ni,nj))
    allocate(total_domain%x_domain(ni,nj))
    allocate(total_domain%y_domain(ni,nj))

    write(6,11) this_image()
    11 format('total domain allocated by image ',i2)

    do j=1,nj
        total_domain%x_domain( :, j )=linspace( x_start, x_start+len_x_domain, ni  )
    end do

    do i=1,ni
        total_domain%y_domain( i,: )=linspace( y_start, y_start+len_y_domain, nj  )
    end do

!-----------------------------------------------
!   initializing temperature field
!-----------------------------------------------

    total_domain%temperature_domain=init_temp

!-----------------------------------------------
!   setting BC
!-----------------------------------------------
    
    total_domain%temperature_domain( :, nj )=linspace( temp_up_left, temp_up_right, ni  )

    total_domain%temperature_domain( :, 1 )=linspace( temp_down_left, temp_down_right, ni  )
    
    total_domain%temperature_domain( 1, : )=linspace( temp_down_left, temp_up_left, nj  )

    total_domain%temperature_domain( ni, : )=linspace( temp_down_right, temp_up_right, nj  )
    
    nii=int( ni/n_im )

    acc=1

    do i=1,n_im 
        ni_size_block[i] = nii+1
        i_down[i]=acc
        i_up[i]=acc+nii
        acc=acc+nii-1


        connectivity[i]%left_img=i-1
        connectivity[i]%right_img=i+1


    end do

    ni_size_block[n_im]=ni-i_down[n_im]+1
    i_up[n_im]=ni

endif

sync all

!-----------------------
! allocating for each image its own block
!-----------------------

allocate(temperature_domain(ni_size_block[this_image()],nj)[*])
allocate(x_domain(ni_size_block[this_image()],nj)[*])
allocate(y_domain(ni_size_block[this_image()],nj)[*])

write(6,22) this_image()
22 format('patch domain allocated by image ',i2)
sync all
write(6,*)
write(6,23) this_image(), i_down[this_image()], i_up[this_image()],ni_size_block[this_image()]
23 format('image ',i2 ' has i_down = ',i4, ' , i_up = ',i4,' and i_size = ',i4)


!------------------------
! copying from the related global data portion to each image patch 
!------------------------

if (this_image().eq.1) then

    do i=1,n_im
       
        write(6,93) this_image(), i,i_down[i], i_up[i]
        93 format('image ',i2,' copying to image ',i2, ' from i_down ',i3, ' and i_up ',i3)

    !----------------------------
    !--- printing the 4th domain's patch before sending it to image 4 
        if (i.eq.n_im) then
            write(6,41) this_image()
            41 format('image ',i2,' printing on screen 4th patch of total domain')
            print *, total_domain%temperature_domain(i_down[i]:i_up[i], :) 
            write(6,*)
        endif
    !---------------------------


        temperature_domain(:,:)[i] = total_domain%temperature_domain(i_down[i]:i_up[i], :) 
        x_domain(:,:)[i] = total_domain%x_domain(i_down[i]:i_up[i] , :) 
        y_domain(:,:)[i] = total_domain%y_domain(i_down[i]:i_up[i], :)
    enddo
    
    !--------------
    ! deallocating global data 
    !--------------

    deallocate(total_domain%temperature_domain)
    deallocate(total_domain%x_domain)
    deallocate(total_domain%y_domain)

endif

write(6,*) 'patch copied and syncronizing'
sync all

!-------------------------------
!--- verifying image 4 has received corrected data from image 0
if (this_image().eq.n_im) then
    write(6,32) this_image()
    32 format('image ', i2, ' printing on screen its temperature patch' )
    print *, temperature_domain(:,:)
    write(6,*)
endif
!----------------------------

write(6,33) this_image()
33 format ('image ',i2,' deallocating and exiting')
deallocate(temperature_domain)
deallocate(x_domain)
deallocate(y_domain)

end program heat_parallel_bug

This is the output of the 4th patch (if you run with 4 images, gfortran,OpenCoarrays) of the total_domain%temperature_domain :
.......numbers.....100.000000 81.5789490 84.2105255 86.8421021 89.4736786 92.1052628 94.7368469 97.3684235 100.000000
This is the output of the array temperature_domain of the 4th image (they should be egual) :
....numbers.....0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000

—edit–
Sorry, the code is starting to look messy, it was not supposed to cause such a headache.
–edit2–
This is the patch of the 4th image as seen by paraview, without any iterations done, just copying from image 1 to image 4 and then to .VTK file.


It seems the last row is copied from total_domain%array(0,0)

I am using the Intel Fortran compiler to see what is going on and it appears that only a part of the array is actually filled. Two consecutive runs do not give the same result and I see suspicious values like 4.5748191E-41 that usually indicate that a real variable or array element has not been set and therefore inherited whatever bit pattern was left over in the computer’s memory.

No time right now to dive into the code, but you may want to check that.

1 Like

Thanks, I’ve got no space left on my laptop for ifort so these are useful hints. I’m checking.

—edit—
Using these lines for initializing patch domains for every image

!-----------------------
! allocating for each image its own block
!-----------------------

allocate(temperature_domain(ni_size_block[this_image()],nj)[*])
allocate(x_domain(ni_size_block[this_image()],nj)[*])
allocate(y_domain(ni_size_block[this_image()],nj)[*])

temperature_domain=-100
x_domain=-100
y_domain=-100

and the output for temperature patch of image 4, after copying from image 0, is :
.....0.00000000 100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000 -100.000000
So only the last n rows are uninitialized (j start index and j end are right, because domain is splitted along columns). In addition, if I use 2 image results are correct. The problem is slightly present with 3 images and with 4 we have a disaster. That’s a mistery.