Say image n has defined a polymorphic value (i.e. class(something)), and image m needs to access that information. How can image m get it?
You can’t have a coarray of class(something), because all images (in a team) must allocate that coarray at the same time and with the same type, but only one image knows what the type must be.
You could have a coarray of something with that class(something) as a component, but you still won’t be able to perform the communication because of the following constraints.
C917 (R911) Except as an actual argument to an intrinsic inquiry function or as the designator in a type parameter inquiry, a data-ref shall not be a coindexed object that has a polymorphic allocatable potential subobject component.
C918 Except as an actual argument to an intrinsic inquiry function or as the designator in a type parameter inquiry, if the rightmost part-ref is polymorphic, no other part-ref shall be coindexed.
Is this just something that isn’t possible? Have I missed something? Anybody have some ideas?
It would help to have a short sample code. I suspect the solution would be to first broadcast information that can be used to conditionally allocate the correct dynamic type on every image. It will feel a little clunky and all possible types will need to be known in advance, but I suspect it can be made to work.
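A minimal sketch of that approach (assuming the shape_t/circle_t/triangle_t hierarchy that appears later in this thread; broadcast_shape is a hypothetical helper, not part of any posted code): first broadcast an integer type tag, then let every other image allocate the matching dynamic type before the payload is communicated.

```fortran
module shape_broadcast_m
  use shape_m, only : shape_t
  use circle_m, only : circle_t
  use triangle_m, only : triangle_t
contains
  subroutine broadcast_shape(s, source_image)
    class(shape_t), allocatable, intent(inout) :: s
    integer, intent(in) :: source_image
    integer :: tag
    ! The source image maps its dynamic type to a tag; note that all
    ! possible types must be enumerated here, known in advance.
    if (this_image() == source_image) then
      select type (s)
      type is (circle_t)
        tag = 1
      type is (triangle_t)
        tag = 2
      class default
        error stop "broadcast_shape: unknown dynamic type"
      end select
    end if
    call co_broadcast(tag, source_image)  ! every image makes this call
    ! The other images can now allocate the correct dynamic type locally.
    if (this_image() /= source_image) then
      if (allocated(s)) deallocate(s)
      select case (tag)
      case (1)
        allocate(circle_t :: s)
      case (2)
        allocate(triangle_t :: s)
      end select
    end if
    ! Broadcasting the (now nonpolymorphic) payload would follow, e.g.
    ! via select type on each image; omitted in this sketch.
  end subroutine
end module
```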
I think the relevant restrictions are consistent with the ideas behind one-sided communication. This feels somewhat similar to having an allocatable component of a derived-type coarray: any safe access to a remote component must first involve some user-provided assurance that the remote component has been allocated. One-sided communication allows access to remote data, but it does not generally provide for one image to direct computation on another image, e.g., directing another image to perform a memory allocation.
It might be worth considering allowing for such behavior in the future, but there will be associated performance penalties to consider. Requiring some extra work to do things that come with performance penalties is a way to keep users from naively writing poor-performing code.
I was toying with the idea of broadcasting a “heterogeneous” array. The full code example can be found here.
The basic idea is:
- create a list on one image
- broadcast it to all images
- have each image print it out
Like below, with some details omitted.
program main
  ...
  type(shape_list_t) :: list
  if (this_image() == 1) then
    list = ...
  end if
  call list%co_broadcast(1)
  call put_line("from image " // to_string(this_image()) // ": " // list%to_string())
end program
module shape_list_m
  ...
  type :: shape_list_t
    private
    type(shape_item_t), allocatable :: shapes(:)
  contains
    ...
    procedure, public :: co_broadcast => shape_list_co_broadcast
  end type
  ...
  subroutine shape_list_co_broadcast(a, source_image)
    class(shape_list_t), intent(inout) :: a
    integer, intent(in) :: source_image
    integer :: list_length
    if (this_image() == source_image) list_length = size(a%shapes)
    call co_broadcast(list_length, source_image)
    if (this_image() /= source_image) then
      if (allocated(a%shapes)) deallocate(a%shapes)
      allocate(a%shapes(list_length))
    end if
    block
      integer :: i
      do i = 1, list_length
        call a%shapes(i)%co_broadcast(source_image)
      end do
    end block
  end subroutine
end module
module shape_item_m
  ...
  type :: shape_item_t
    private
    class(shape_t), allocatable :: shape
  contains
    ...
    procedure, public :: co_broadcast => shape_item_co_broadcast
  end type
  ...
  subroutine shape_item_co_broadcast(a, source_image)
    class(shape_item_t), intent(inout) :: a
    integer, intent(in) :: source_image
    type(shape_item_t), allocatable :: tmp[:]
    allocate(tmp[*])
    select type (a)
    type is (shape_item_t)
      if (this_image() == source_image) tmp%shape = a%shape
      sync all
      if (this_image() /= source_image) a%shape = tmp[source_image]%shape
    class default
      error stop "Cannot broadcast types extended from shape_item_t"
    end select
  end subroutine
end module
I have an idea for a method that would use serialization/de-serialization to reduce the number of communications necessary, but yeah, the de-serialization would need to know all of the possible types in advance, which eliminates the possibility of having users extend it.
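For what it’s worth, a rough sketch of that serialization idea (serialize_shape/deserialize_shape are hypothetical helpers of mine, reusing the circle_t/triangle_t types from this thread): each item packs a type tag plus its payload into a flat real buffer, so a single co_broadcast of the buffer could replace per-item communication. But the select constructs still have to enumerate every possible type, which is exactly what blocks user extension.

```fortran
! Sketch: flatten a shape into [tag, payload...] so a whole list can be
! sent with one co_broadcast of a single real buffer.
subroutine serialize_shape(item, buffer)
  use shape_m, only : shape_t
  use circle_m, only : circle_t
  use triangle_m, only : triangle_t
  class(shape_t), intent(in) :: item
  real, allocatable, intent(out) :: buffer(:)
  select type (item)
  type is (circle_t)
    buffer = [1.0, item%r]
  type is (triangle_t)
    buffer = [2.0, item%b, item%h]
  class default
    error stop "serialize_shape: unknown dynamic type"
  end select
end subroutine

subroutine deserialize_shape(buffer, item)
  use shape_m, only : shape_t
  use circle_m, only : circle_t
  use triangle_m, only : triangle_t
  real, intent(in) :: buffer(:)
  class(shape_t), allocatable, intent(out) :: item
  ! The tag-to-type mapping is fixed at compile time; users cannot
  ! register new types without editing this select case.
  select case (int(buffer(1)))
  case (1)
    item = circle_t(r=buffer(2))
  case (2)
    item = triangle_t(b=buffer(2), h=buffer(3))
  end select
end subroutine
```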
Considering first a heterogeneous scalar, can you please take a look at the following and post the program output using OpenCoarrays? Thanks,
Broadcast heterogeneous scalar
module shape_m
  type, abstract :: shape_t
  contains
    procedure(IArea), deferred :: Area
  end type
  abstract interface
    function IArea( this ) result(A)
      import :: shape_t
      class(shape_t), intent(in) :: this
      real :: A
    end function
  end interface
end module
module circle_m
  use shape_m, only : shape_t
  type, extends(shape_t) :: circle_t
    real :: r = 0.0
  contains
    procedure :: Area => Area_Circle
  end type
contains
  function Area_Circle( this ) result(A)
    class(circle_t), intent(in) :: this
    real :: A
    A = 3.14*this%r**2
  end function
  function construct_circle_t( r ) result(c)
    real, intent(in) :: r
    type(circle_t) :: c
    c%r = r
  end function
end module
module triangle_m
  use shape_m, only : shape_t
  type, extends(shape_t) :: triangle_t
    real :: b = 0.0
    real :: h = 0.0
  contains
    procedure :: Area => Area_Triangle
  end type
contains
  function Area_Triangle( this ) result(A)
    class(triangle_t), intent(in) :: this
    real :: A
    A = 0.5*this%b*this%h
  end function
  function construct_triangle_t( b, h ) result(t)
    real, intent(in) :: b
    real, intent(in) :: h
    type(triangle_t) :: t
    t%b = b ; t%h = h
  end function
end module
use shape_m
use circle_m
use triangle_m
class(shape_t), allocatable :: shape1, shape2
allocate( circle_t :: shape1 )
allocate( triangle_t :: shape2 )
sync all
if ( this_image() == 1 ) then
  shape1 = circle_t( r=1.0 )
  shape2 = triangle_t( b=1.0, h=1.0 )
  call co_broadcast( shape1, source_image=this_image() )
  call co_broadcast( shape2, source_image=this_image() )
end if
sync all
print *, "On image ", this_image(), shape1%area(), shape2%area()
stop
end program
I expect the following, is that correct?
On image 2 3.140000 0.5000000
On image 1 3.140000 0.5000000
On image 8 3.140000 0.5000000
On image 7 3.140000 0.5000000
On image 6 3.140000 0.5000000
On image 4 3.140000 0.5000000
On image 3 3.140000 0.5000000
On image 5 3.140000 0.5000000
I think your program is malformed. All images must call co_broadcast, not just one. From the first paragraph of the section on collective subroutines:
Successful execution of a collective subroutine performs a calculation on all the images of the current team and assigns a computed value on one or all of them. If it is invoked by one image, it shall be invoked by the same statement on all active images of its current team in segments that are not ordered with respect to each other; corresponding references participate in the same collective computation.
I also do not believe you need the sync all statements. Thus your program should be
use shape_m
use circle_m
use triangle_m
class(shape_t), allocatable :: shape1, shape2
allocate( circle_t :: shape1 )
allocate( triangle_t :: shape2 )
if ( this_image() == 1 ) then
  shape1 = circle_t( r=1.0 )
  shape2 = triangle_t( b=1.0, h=1.0 )
end if
call co_broadcast( shape1, source_image=1)
call co_broadcast( shape2, source_image=1)
print *, "On image ", this_image(), shape1%area(), shape2%area()
stop
end program
With those changes I would expect to see your output. However, this still assumes that all images know, without communication, what the dynamic type of a given entity will be. That is, either they all perform exactly the same calculations before calling co_broadcast to determine the dynamic type, which seems inefficient at best, or you know what the dynamic type is going to be before even compiling (as in your example above), in which case what was the point of the polymorphism?
The long term goal is to be able to define a container whose contents are polymorphic, so that users can insert their own types into it, and have it be possible to broadcast (and hopefully do communication between specific images for) such a container. But so far it seems it is not possible to perform such communication without at least knowing ahead of time all of the types which may be communicated.
Thanks, that makes sense. It’s unclear to me whether the current standard allows this.
I’m authoring an interp that has some bearing on the question, but doesn’t exactly address it directly. Based on the constraints above, I think any direct communication via coarrays is not possible with polymorphic components. But I think co_broadcast would allow for polymorphic components based on the description:
A becomes defined, as if by intrinsic assignment, on all images in the current team with the value of A on image SOURCE_IMAGE
Intrinsic assignment being the key phrase there. My interp specifically asks about whether allocatable components are (re)allocated, which I believe they should be since intrinsic assignment does that. And since polymorphic components are necessarily allocatable, I think that covers it.
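The reallocation behavior in question can be seen in plain, non-coarray code (a sketch using the public-component shape_item_t and circle_t types from the examples in this thread):

```fortran
! Intrinsic assignment of a derived type with an allocatable (here
! polymorphic) component allocates the variable's component with the
! dynamic type and value of the expression's component.
type(shape_item_t) :: src, dst
src%shape = circle_t(r=1.0)  ! auto-allocates src%shape as circle_t
dst = src                    ! dst%shape is (re)allocated, dynamic type circle_t
```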
I also think the above constraints could be relaxed by changing
Except as an actual argument to an intrinsic inquiry function or as the designator in a type parameter inquiry
to
In a variable definition context
because that is still sufficient to prevent one image from performing allocation of data on another image. That would allow my example to work as written.
An interp is a good idea.
Re: “what was the point of the polymorphism?”, I suppose an argument can be made that the intended purpose of the first version of the collective subroutine CO_BROADCAST is primarily to support nonpolymorphic objects, and perhaps objects with polymorphic components at any level that are uniformly heterogeneous across all images, e.g., a collection of different shapes where the dynamic type at each point in the component hierarchy is the same on all images. With this in mind, I’m curious how OpenCoarrays with gfortran (I don’t have access to it) handles the following; can you please post the program output? Thanks,
Simple heterogeneous collection
module shape_m
  type, abstract :: shape_t
  contains
    procedure(IArea), deferred :: Area
  end type
  abstract interface
    function IArea( this ) result(A)
      import :: shape_t
      class(shape_t), intent(in) :: this
      real :: A
    end function
  end interface
end module
module circle_m
  use shape_m, only : shape_t
  type, extends(shape_t) :: circle_t
    real :: r = 0.0
  contains
    procedure :: Area => Area_Circle
  end type
contains
  function Area_Circle( this ) result(A)
    class(circle_t), intent(in) :: this
    real :: A
    A = 3.14*this%r**2
  end function
  function construct_circle_t( r ) result(c)
    real, intent(in) :: r
    type(circle_t) :: c
    c%r = r
  end function
end module
module triangle_m
  use shape_m, only : shape_t
  type, extends(shape_t) :: triangle_t
    real :: b = 0.0
    real :: h = 0.0
  contains
    procedure :: Area => Area_Triangle
  end type
contains
  function Area_Triangle( this ) result(A)
    class(triangle_t), intent(in) :: this
    real :: A
    A = 0.5*this%b*this%h
  end function
  function construct_triangle_t( b, h ) result(t)
    real, intent(in) :: b
    real, intent(in) :: h
    type(triangle_t) :: t
    t%b = b ; t%h = h
  end function
end module
module shape_item_m
  use shape_m
  type :: shape_item_t
    class(shape_t), allocatable :: shape
  end type
end module
module shape_list_m
  use shape_item_m
  use circle_m
  use triangle_m
  type :: shape_list_t
    type(shape_item_t), allocatable :: items(:)
  end type
contains
  function get_areas( this ) result(areas)
    type(shape_list_t), intent(in) :: this
    real :: areas(size(this%items))
    integer :: i
    areas = 0.0
    if ( size(this%items) > 0 ) then
      do i = 1, size(this%items)
        areas(i) = this%items(i)%shape%Area()
      end do
    end if
  end function
end module
program p
  use shape_list_m
  use circle_m
  use triangle_m
  type(shape_list_t) :: list
  allocate( list%items(3) )
  allocate( circle_t :: list%items(1)%shape )
  allocate( triangle_t :: list%items(2)%shape )
  allocate( circle_t :: list%items(3)%shape )
  if ( this_image() == 1 ) then
    list%items(1)%shape = circle_t( r=1.0 )
    list%items(2)%shape = triangle_t( b=1.0, h=1.0 )
    list%items(3)%shape = circle_t( r=2.0 )
  end if
  call co_broadcast( list, source_image=1 )
  print *, "On image ", this_image(), "; Areas = ", get_areas( list )
  stop
end program
On my Mac, not well. I’ll also try on Linux soon.
[Brads-MacBook-Pro:~/tmp/vipul_co_broadcast] caf main.f90
[Brads-MacBook-Pro:~/tmp/vipul_co_broadcast] cafrun -n 4 ./a.out
On image 1 ; Areas = 3.14000010 0.500000000 12.5600004
On image 3 ; Areas = 0.00000000 0.00000000 0.00000000
On image 2 ; Areas = 0.00000000 0.00000000 NaN
On image 4 ; Areas = 0.00000000 5.43703804E-43 0.00000000
STOP
STOP
STOP
STOP
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD
with errorcode 0.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
[Brads-MacBook-Pro.local:02280] 3 more processes have sent help message help-mpi-api.txt / mpi-abort
[Brads-MacBook-Pro.local:02280] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
It appears OpenCoarrays is not getting it right on Linux either.
[pop-os:~/tmp/vipul_co_broadcast] caf main.f90
[pop-os:~/tmp/vipul_co_broadcast] cafrun -n 4 ./a.out
On image 1 ; Areas = 3.14000010 0.500000000 12.5600004
On image 2 ; Areas = 0.00000000 0.00000000 0.00000000
On image 3 ; Areas = 0.00000000 0.00000000 0.00000000
On image 4 ; Areas = 0.00000000 0.00000000 0.00000000
STOP
application called MPI_Abort(MPI_COMM_WORLD, 0) - process 0
STOP
application called MPI_Abort(MPI_COMM_WORLD, 0) - process 1
STOP
application called MPI_Abort(MPI_COMM_WORLD, 0) - process 2
STOP
application called MPI_Abort(MPI_COMM_WORLD, 0) - process 3
For what it’s worth, Intel doesn’t get it right either (on Windows or Linux).
C:\Users\brad\tmp\vipul_co_broadcast>ifort -Qcoarray -traceback main.f90
Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.2.0 Build 20210228_000000
Copyright (C) 1985-2021 Intel Corporation. All rights reserved.
Microsoft (R) Incremental Linker Version 14.28.29914.0
Copyright (C) Microsoft Corporation. All rights reserved.
-out:main.exe
-subsystem:console
-incremental:no
main.obj
C:\Users\brad\tmp\vipul_co_broadcast>.\main.exe
On image 1 ; Areas = 3.140000 0.5000000 12.56000
forrtl: severe (157): Program Exception - access violation
In coarray image 2
Image PC Routine Line Source
main.exe 00007FF781A31791 MAIN__ 95 main.f90
main.exe 00007FF781A8ABBE Unknown Unknown Unknown
main.exe 00007FF781A8AFA4 Unknown Unknown Unknown
KERNEL32.DLL 00007FF934367034 Unknown Unknown Unknown
ntdll.dll 00007FF936282651 Unknown Unknown Unknown
Thanks much for trying out these compilers and platforms. Would an explicit synchronization prior to CO_BROADCAST help mitigate these apparent compiler implementation issues?
I would be surprised if that had an effect. I would expect any necessary synchronization to happen within co_broadcast. But it may be worth trying.
Edit: Yep, tried adding sync all both before and after the co_broadcast, and saw no change in behavior with any of the compiler/platform combinations.
It may not directly help with your specific question here, but I am using OOP with coarrays (as coarray components) successfully myself and would point you to Aleksandar Donev’s paper, sections 3.1 and 3.1.1: http://caf.rice.edu/documentation/Aleksandar-Donev-Coarray-Rationale-N1702-2007-09.pdf , on page 6:
“…, one can use derived types to group allocatable co-arrays (data encapsulation) and even use inheritance and polymorphism together with coarrays. This is an important feature facilitating object-oriented programming and was one of the reasons for adding co-array components.”
So far I have not tried polymorphism with coarray components, but I plan to try (and use) it in the near future to further refine and implement parallel models. At present I'm using abstract classes and inheritance to implement parallel models successfully, with coarray components and OpenCoarrays.
(More recently I did start to encapsulate access to coarray components to get them out of the parallel logic codes and thus, away from the OOP codes there. But I can confirm that using coarray components together with OOP does work.)
cheers
Michael Siehl