How to best pass arguments of different rank to concrete methods of an abstract derived type?

I’m modeling a few concrete types that are based on the common abstract type. Each type has two methods, forward and backward. The methods expect arguments of different rank; for example, a concrete dense_layer type expects rank-1 arguments to its methods, while a concrete conv2d_layer type expects rank-3 arguments to its methods.

  type, abstract :: base_layer
    procedure(forward_interface), deferred :: forward
    procedure(backward_interface), deferred :: backward
  end type base_layer

  abstract interface

    subroutine forward_interface(self, input)
      import :: base_layer
      class(base_layer), intent(in out) :: self
      real, intent(in) :: input(..)
    end subroutine forward_interface

    subroutine backward_interface(self, input, loss)
      import :: base_layer
      class(base_layer), intent(in out) :: self
      real, intent(in) :: input(..)
      real, intent(in) :: loss(..)
    end subroutine backward_interface

  end interface

Initially, I thought about modeling this using the assumed-rank arrays and the select rank construct. For example, the forward method on the concrete dense_layer looks like this:

  subroutine forward(self, input)
    class(dense_layer), intent(in out) :: self
    real, intent(in) :: input(..)
    select rank(input)
        self % z = matmul(input, self % weights) + self % biases
        self % output = sigmoid(self % z)
      rank default
        print *, 'Warning: rank ', rank(input), ' is not valid'
    end select
  end subroutine forward

Here, only rank-1 argument is valid for this type, but I’m invoking all this select rank machinery only to be able to use input. OK, maybe not too bad, but it gets worse with the backward method which has two assumed-rank arguments:

  subroutine backward(self, input, loss)
    class(dense_layer), intent(in out) :: self
    real, intent(in) :: input(..)
    real, intent(in) :: loss(..)
    select rank(loss) 
        select rank(input)
            self % db = loss * sigmoid_prime(self % z)
            self % dw = matmul( &
              reshape(input, [size(input), 1]), &
              reshape(self % db, [1, size(self % db)]) &
          rank default
            print *, 'Rank ', rank(input), ' is not valid for input(..)'
        end select
      rank default
        print *, 'Rank ', rank(loss), ' is not valid for loss(..)'
    end select
  end subroutine backward

Notice that for each assumed-rank argument that I want to use, I need to wrap the algorithm in that many nested levels of select rank (or do I? I couldn’t find a different way to do it). At this point, assumed-rank arrays don’t seem like a good idea anymore. It feels like I’m shoehorning a language construct into a use-case where it’s not the best choice.

What are my alternatives? I thought about whether the deferred methods can be a generic, but I don’t see an obvious way to do that considering that the deferred method must have a specific interface.

A simpler solution may be to let my deferred methods expect the highest rank needed (in my case that would be 3), and then treat 1-d input in the algorithm itself, i.e. input(:,1,1).

In summary, I understand my options to be:

  1. Assumed-rank + select rank, where the array shape inquiry is cleanly separated from the algorithm, but at the cost of much boilerplate, and possibly not as well supported by compilers;
  2. Use rank-3 arguments for all concrete types, and sort out the shape on the caller-side and in the algorithm. The upside is that it needs less code and is well supported by compilers. The downside is that the array-shape logic would be mixed-in with the algorithm.

Currently option 2 seems more attractive to me.

What do you think about these options, and do you have any ideas for an alternative approach?

Thank you!

If there is only one set of argument ranks that make sense, you can test for all of them
at the beginning of the procedure. There is boilerplate code, but it is only linear in the number of assumed rank arguments, not exponential. Here is an example:

module m
implicit none
subroutine rank_one(a,b,c)
real, intent(in) :: a(..),b(..),c(..)
if (rank(a) /= 1 .or. rank(b) /= 1 .or. rank(c) /= 1) error stop "need rank(a) = rank(b) = rank(c) = 1"
select rank(a) ; rank(1)
select rank(b) ; rank(1)
select rank(c) ; rank(1)
   print*,"sum is",sum(a)+sum(b)+sum(c)
end select
end select
end select
end subroutine rank_one
end module m
program main
use m
implicit none
call rank_one([10.0,20.0],[30.0,40.0],[50.0,60.0])
end program main
! output: 
! sum is 210.000000
1 Like

That’s less boilerplate indeed. I avoid semicolons in Fortran but here it seems justified. It didn’t occur to me that rank default can be omitted.

What about making them all rank 1 and then using a contiguous pointer
with shape remapping and use the pointer variable locally.

subroutine backward(input, ni, nj, nk, ...)

  real, intent(in), target :: input(:)
  integer, intent(in)        :: ni, nj, nk
  real, pointer, contiguous :: iptr(:,:,:)
  iptr(1:ni,1:nj,1:nk) => input(:) 

I would trust this to work before assumed rank in any current compiler

1 Like

This seems like a symptom of a non-optimal design. The different concrete types have different algorithms and different interfaces, even if though you’re tempted to try and make them match. I would try and come up with some different type definitions; attach the methods to different types, turn the dependencies around, etc. If it’s hard to get the interfaces to be the same, that’s usually a red flag for me. It’s of course possible that this really is the best design, given other constraints and trade-offs, but it seems a bit awkward.

@rwmsu Thank you, that’s an interesting solution I didn’t consider (nor I knew that you could do that).

@everythingfunctional it’s quite possible that this is not the best design, thus my question. Getting the interfaces to be the same is easy with assumed-rank arrays, but I wasn’t sure whether all the boilerplate that comes with select rank is the necessary cost of the feature or my misuse of the feature.

Yes, I can define different methods with different interfaces as an alternative design. In that case, I’d need to move this boilerplate to the caller where I’d call different methods for each select type/type is() branch. I considered this early on but it led to more boilerplate than the current solution.

I’d try and think of even more radically different designs than just this kind of change. Sometimes the initial/obvious way of thinking about a problem does not lead to the best design.

More context will probably be helpful at this time. Here’s my design for the end-user API:

  • The user interacts with two types, network and layer
  • The user interacts with layer types only to instantiate them and pass them to the network constructor
  • network contains an arbitrary number of layer instances, which can be of different types, even within the same network.

Here’s a simple dense network instantiation:

  use nf, only: dense, input, layer, network
  implicit none
  type(layer), allocatable :: layers(:)
  type(network) :: densenet

  layers(1) = input(784)
  layers(2) = dense(layers(1), 10)
  layers(3) = dense(layers(2), 15)
  layers(4) = dense(layers(3), 10)
  densenet = network(layers)

To allow an array of layers of different types within a network, and to loop over them, I defined my types like this:

  type :: network
    type(layer), allocatable :: layers(:)
  end type network


  type :: layer
    class(base_layer), allocatable :: p
    procedure :: forward
    procedure :: backward
  end type layer

and the layer % forward and layer % backward methods invoke the concrete and assumed-rank methods layer % p % forward and layer % p % backward.

In summary:

  1. The end-user works with a network instance, which is a container for layer instances.
  2. One level down, layer is a container that supports any layer type, through a class(base_layer), allocatable data component.
  3. At the lowest level, each concrete type of base_layer implements its algorithms and layer-specific data.

Any ideas for design improvement very welcome.

Thanks all for the comments and ideas. I marked @rwmsu’s response as the solution because it directly addresses the question, even though I ended up not needing to use it.

After further experiments and arriving to a minimal code with three working examples of increasing complexity (approximating arbitrary (x)->(y) data relationship, approximating a transcendental function, and MNIST digit recognition), I came to the conclusion that forward and backward methods don’t need to be deferred, and thus don’t need matching interfaces. This is because every layer type has known (ahead of time) data rank that it works on, and this doesn’t change between applications. For example, a dense layer always works on 1-d data. A conv2d layer always works on 3-d data. For the same reason, I also don’t need assumed-rank arrays. Type guards at the caller ensure that the different layer instances are in a shape-conforming order.

For anybody interested, here’s the relevant PR.

1 Like