Communicating Polymorphic Data Between Images

Say image n has defined a polymorphic value (i.e. class(something)), and image m needs to access that information. How can image m get it?

You can’t have a coarray of class(something), because all images (in a team) must allocate that coarray at the same time and with the same type, but only one image knows what the type must be.

You could have a coarray of something with that class(something) as a component, but you still won’t be able to perform the communication because of the following constraints.

C917 (R911) Except as an actual argument to an intrinsic inquiry function or as the designator in a type parameter inquiry, a data-ref shall not be a coindexed object that has a polymorphic allocatable potential subobject component.

C918 Except as an actual argument to an intrinsic inquiry function or as the designator in a type parameter inquiry, if the rightmost part-ref is polymorphic, no other part-ref shall be coindexed.

Is this just something that isn’t possible? Have I missed something? Anybody have some ideas?

It’s not possible.

It would help to have a short sample code. I suspect the solution would be to first broadcast information that can be used to conditionally allocate the correct dynamic type on every image. It will feel a little clunky and all possible types will need to be known in advance, but I suspect it can be made to work.

I think the relevant restrictions are consistent with the ideas behind one-sided communication. This feels somewhat similar to having an allocatable component of a derived-type coarray. Any safe access to a remote component must first involve some user-provided that the remote component has been allocated. In general, one-side communication allows for access to remote data but does not generally provide for one image to direct computation on another image, e.g., directing another image to perform a memory allocation.

It might be worth considering allowing for such behavior in the future, but there will be associated performance penalties to consider. Requiring some extra work to do things that come with performance penalties is a way to keep users from naively writing poor-performing code.

I was toying with the idea of broadcasting a “heterogeneous” array. The full code example can be found here.

The basic idea is:

  1. create a list on one image
  2. broadcast it to all images
  3. have each image print it out

Like below, with some details omitted.

program main
    ...
    type(shape_list_t) :: list

    if (this_image() == 1) then
        list = ...
    end if

    call list%co_broadcast(1)

    call put_line("from image " // to_string(this_image()) // ": " // list%to_string())
end program

module shape_list_m
    ...
    type :: shape_list_t
        private
        type(shape_item_t), allocatable :: shapes(:)
    contains
        ...
        procedure, public :: co_broadcast => shape_list_co_broadcast
    end type
...
    subroutine shape_list_co_broadcast(a, source_image)
        class(shape_list_t), intent(inout) :: a
        integer, intent(in) :: source_image

        integer :: list_length

        if (this_image() == source_image) list_length = size(a%shapes)

        call co_broadcast(list_length, source_image)

        if (this_image() /= source_image) then
            if (allocated(a%shapes)) deallocate(a%shapes)
            allocate(a%shapes(list_length))
        end if

        block
            integer :: i

            do i = 1, list_length
                call a%shapes(i)%co_broadcast(source_image)
            end do
        end block
    end subroutine
end module

module shape_item_m
    ...
    type :: shape_item_t
        private
        class(shape_t), allocatable :: shape
    contains
        ...
        procedure, public :: co_broadcast => shape_item_co_broadcast
    end type
...
    subroutine shape_item_co_broadcast(a, source_image)
        class(shape_item_t), intent(inout) :: a
        integer, intent(in) :: source_image

        type(shape_item_t), allocatable :: tmp[:]

        allocate(tmp[*])
        select type (a)
        type is (shape_item_t)
            if (this_image() == source_image) tmp%shape = a%shape
            sync all
            if (this_image() /= source_image) a%shape = tmp[source_image]%shape
        class default
            error stop "Cannot broadcast types extended from shape_item_t"
        end select
    end subroutine
end module

I have an idea for a method that would use serialization/de-serialization to reduce the number of communications necessary, but yeah, the de-serialization would need to know all of the possible types in advance, which eliminates the possibility of having users extend it.

Considering first a heterogeneous scalar, can you please take a look at the following and post the program output using OpenCoarrays? Thanks,

Broadcast heterogeneous scalar

module shape_m
   type, abstract :: shape_t
   contains
      procedure(IArea), deferred :: Area
   end type
   abstract interface
      function IArea( this ) result(A)
         import :: shape_t
         class(shape_t), intent(in) :: this
         real :: A
      end function
   end interface
end module
module circle_m
   use shape_m, only : shape_t
   type, extends(shape_t) :: circle_t
      real :: r = 0.0
   contains
      procedure :: Area => Area_Circle
   end type
contains
   function Area_Circle( this ) result(A)
      class(circle_t), intent(in) :: this
      real :: A
      A = 3.14*this%r**2
   end function
   function construct_circle_t( r ) result(c)
      real, intent(in) :: r
      type(circle_t) :: c
      c%r = r
   end function 
end module
module triangle_m
   use shape_m, only : shape_t
   type, extends(shape_t) :: triangle_t
      real :: b = 0.0
      real :: h = 0.0
   contains
      procedure :: Area => Area_Triangle
   end type
contains
   function Area_Triangle( this ) result(A)
      class(triangle_t), intent(in) :: this
      real :: A
      A = 0.5*this%b*this%h
   end function
   function construct_triangle_t( b, h ) result(t)
      real, intent(in) :: b
      real, intent(in) :: h
      type(triangle_t) :: t
      t%b = b ; t%h = h
   end function 
end module
   use shape_m
   use circle_m
   use triangle_m
   class(shape_t), allocatable :: shape1, shape2
   allocate( circle_t :: shape1 )
   allocate( triangle_t :: shape2 )
   sync all
   if ( this_image() == 1 ) then
      shape1 = circle_t( r=1.0 )
      shape2 = triangle_t( b=1.0, h=1.0 ) 
      call co_broadcast( shape1, source_image=this_image() )
      call co_broadcast( shape2, source_image=this_image() )
   end if
   sync all
   print *, "On image ", this_image(), shape1%area(), shape2%area()
   stop
end program

I expect the following, is that correct?

On image 2 3.140000 0.5000000
On image 1 3.140000 0.5000000
On image 8 3.140000 0.5000000
On image 7 3.140000 0.5000000
On image 6 3.140000 0.5000000
On image 4 3.140000 0.5000000
On image 3 3.140000 0.5000000
On image 5 3.140000 0.5000000

I think your program is malformed. All images must call co_broadcast, not just one. From the first paragraph of the section on collective subroutines:

Successful execution of a collective subroutine performs a calculation on all the images of the current team and assigns a computed value on one or all of them. If it is invoked by one image, it shall be invoked by the same statement on all active images of its current team in segments that are not ordered with respect to each other; corresponding references participate in the same collective computation.

I also do no believe you need the sync all statements. Thus your program should be

   use shape_m
   use circle_m
   use triangle_m
   class(shape_t), allocatable :: shape1, shape2
   allocate( circle_t :: shape1 )
   allocate( triangle_t :: shape2 )
   if ( this_image() == 1 ) then
      shape1 = circle_t( r=1.0 )
      shape2 = triangle_t( b=1.0, h=1.0 ) 
   end if
   call co_broadcast( shape1, source_image=1)
   call co_broadcast( shape2, source_image=1)
   print *, "On image ", this_image(), shape1%area(), shape2%area()
   stop
end program

With those changes I would expect to see your output. However, this still assumes that all images know without communication what the dynamic type of a given entity will be. I.e. they will all perform exactly the same calculations before calling co_broadcast to determine the dynamic type, which seems inefficient at best, or you know what the dynamic type is going to be before even compiling (as in your above example), in which case what was the point of the polymorphism?

The long term goal is to be able to define a container whose contents are polymorphic, so that users can insert their own types into it, and have it be possible to broadcast (and hopefully do communication between specific images for) such a container. But so far it seems it is not possible to perform such communication without at least knowing ahead of time all of the types which may be communicated.

1 Like

Thanks, that makes sense. It’s unclear to me whether the current standard allows this.

I’m authoring an interp that has some bearing on the question, but doesn’t exactly address it directly. Based on the constraints above, I think any direct communication via coarrays is not possible with polymorphic components. But I think co_broadcast would allow for polymorphic components based on the description:

A becomes defined, as if by intrinsic assignment, on all images in the current team with the value of A on image SOURCE_IMAGE

Intrinsic assignment being the key phrase there. My interp specifically asks about whether allocatable components are (re)allocated, which I believe they should be since intrinsic assignment does that. And since polymorphic components are necessarily allocatable, I think that covers it.

I also think the above constraints could be relaxed by changing

Except as an actual argument to an intrinsic inquiry function or as the designator in a type parameter inquiry

to

In a variable definition context

because that is still sufficient to prevent one image from performing allocation of data on another image. That would allow my example to work as written.

An interp is a good idea.

Re: “what was the point of the polymorphism?”, I suppose an argument can be made the intended purpose of first version of the collective subroutine CO_BROADCAST is primarily support toward nonpolymorphic objects and perhaps objects with polymorphic components at any level that are uniformly heterogeneous across all images e.g., a collection of different shapes where it’s the same dynamic type in its component hierarchy on all images? With this in mind, I’m curious how OpenCoarrays with gfortran (I don’t have access to it) works with the following, can you please post the program output? Thanks,

Simple heterogeneous collection

module shape_m
   type, abstract :: shape_t
   contains
      procedure(IArea), deferred :: Area
   end type
   abstract interface
      function IArea( this ) result(A)
         import :: shape_t
         class(shape_t), intent(in) :: this
         real :: A
      end function
   end interface
end module
module circle_m
   use shape_m, only : shape_t
   type, extends(shape_t) :: circle_t
      real :: r = 0.0
   contains
      procedure :: Area => Area_Circle
   end type
contains
   function Area_Circle( this ) result(A)
      class(circle_t), intent(in) :: this
      real :: A
      A = 3.14*this%r**2
   end function
   function construct_circle_t( r ) result(c)
      real, intent(in) :: r
      type(circle_t) :: c
      c%r = r
   end function 
end module
module triangle_m
   use shape_m, only : shape_t
   type, extends(shape_t) :: triangle_t
      real :: b = 0.0
      real :: h = 0.0
   contains
      procedure :: Area => Area_Triangle
   end type
contains
   function Area_Triangle( this ) result(A)
      class(triangle_t), intent(in) :: this
      real :: A
      A = 0.5*this%b*this%h
   end function
   function construct_triangle_t( b, h ) result(t)
      real, intent(in) :: b
      real, intent(in) :: h
      type(triangle_t) :: t
      t%b = b ; t%h = h
   end function 
end module
module shape_item_m
   use shape_m
   type :: shape_item_t
      class(shape_t), allocatable :: shape
   end type
end module
module shape_list_m
   use shape_item_m
   use circle_m
   use triangle_m
   type :: shape_list_t
      type(shape_item_t), allocatable :: items(:)
   end type
contains
   function get_areas( this ) result(areas)
      type(shape_list_t), intent(in) :: this
      real :: areas(size(this%items))
      integer :: i
      areas = 0.0
      if ( size(this%items) > 0 ) then
         do i = 1, size(this%items)
            areas(i) = this%items(i)%shape%Area()
         end do
      end if
   end function
end module
program p
   use shape_list_m
   use circle_m
   use triangle_m
   type(shape_list_t) :: list
   allocate( list%items(3) )
   allocate( circle_t :: list%items(1)%shape )
   allocate( triangle_t :: list%items(2)%shape )
   allocate( circle_t :: list%items(3)%shape )
   if ( this_image() == 1 ) then
      list%items(1)%shape = circle_t( r=1.0 )
      list%items(2)%shape = triangle_t( b=1.0, h=1.0 )
      list%items(3)%shape = circle_t( r=2.0 )
   end if
   call co_broadcast( list, source_image=1 )
   print *, "On image ", this_image(), "; Areas = ", get_areas( list )
   stop
end program

On my Mac, not well. I’ll also try on Linux soon.

[Brads-MacBook-Pro:~/tmp/vipul_co_broadcast] caf main.f90 
[Brads-MacBook-Pro:~/tmp/vipul_co_broadcast] cafrun -n 4 ./a.out
 On image            1 ; Areas =    3.14000010      0.500000000       12.5600004    
 On image            3 ; Areas =    0.00000000       0.00000000       0.00000000    
 On image            2 ; Areas =    0.00000000       0.00000000                  NaN
 On image            4 ; Areas =    0.00000000       5.43703804E-43   0.00000000    
STOP 
STOP 
STOP 
STOP 
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD
with errorcode 0.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
[Brads-MacBook-Pro.local:02280] 3 more processes have sent help message help-mpi-api.txt / mpi-abort
[Brads-MacBook-Pro.local:02280] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

It appears Open Coarrays is not getting it right on Linux either.

[pop-os:~/tmp/vipul_co_broadcast] caf main.f90 
[pop-os:~/tmp/vipul_co_broadcast] cafrun -n 4 ./a.out
 On image            1 ; Areas =    3.14000010      0.500000000       12.5600004    
 On image            2 ; Areas =    0.00000000       0.00000000       0.00000000    
 On image            3 ; Areas =    0.00000000       0.00000000       0.00000000    
 On image            4 ; Areas =    0.00000000       0.00000000       0.00000000    
STOP 
application called MPI_Abort(MPI_COMM_WORLD, 0) - process 0
STOP 
application called MPI_Abort(MPI_COMM_WORLD, 0) - process 1
STOP 
application called MPI_Abort(MPI_COMM_WORLD, 0) - process 2
STOP 
application called MPI_Abort(MPI_COMM_WORLD, 0) - process 3

For what it’s worth, Intel doesn’t get it right either (on Windows or Linux).

C:\Users\brad\tmp\vipul_co_broadcast>ifort -Qcoarray -traceback main.f90
Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.2.0 Build 20210228_000000
Copyright (C) 1985-2021 Intel Corporation.  All rights reserved.

Microsoft (R) Incremental Linker Version 14.28.29914.0
Copyright (C) Microsoft Corporation.  All rights reserved.

-out:main.exe
-subsystem:console
-incremental:no
main.obj

C:\Users\brad\tmp\vipul_co_broadcast>.\main.exe
 On image            1 ; Areas =    3.140000      0.5000000       12.56000
forrtl: severe (157): Program Exception - access violation
In coarray image 2
Image              PC                Routine            Line        Source
main.exe           00007FF781A31791  MAIN__                     95  main.f90
main.exe           00007FF781A8ABBE  Unknown               Unknown  Unknown
main.exe           00007FF781A8AFA4  Unknown               Unknown  Unknown
KERNEL32.DLL       00007FF934367034  Unknown               Unknown  Unknown
ntdll.dll          00007FF936282651  Unknown               Unknown  Unknown

Thanks much for trying out these compilers and platforms. Would an explicit synchronization prior to CO_BROADCAST help mitigate these apparent compiler implementation issues?

I would be surprised if that had an effect. I would expect any necessary synchronization to happen within co_broadcast. But it may be worth trying.

Edit: Yep, tried adding sync all both before and after the co_broadcast, and saw no change in behavior with any of the compiler/platform combinations.

It may not directly help with your specific question here, but I am using OOP with coarrays (as coarray components) successfully myself and would point you to Aleksandar Donev’s paper, sections 3.1 and 3.1.1: http://caf.rice.edu/documentation/Aleksandar-Donev-Coarray-Rationale-N1702-2007-09.pdf , on page 6:

“…, one can use derived types to group allocatable co-arrays (data encapsulation) and even use inheritance and polymorphism together with coarrays. This is an important feature facilitating object-oriented programming and was one of the reasons for adding co-array components.”

So far, I did not try polymorphism with coarray components yet but plan to try (and use) it in the near future to further refine and implement parallel models. So far I’m using abstract classes and inheritance to implement parallel models successfully using coarray components with OpenCoarrays.

(More recently I did start to encapsulate access to coarray components to get them out of the parallel logic codes and thus, away from the OOP codes there. But I can confirm that using coarray components together with OOP does work.)

cheers
Michael Siehl

1 Like