Function objects in Fortran

veryreverie · May 30, 2021, 8:34am

A feature I would love to see in Fortran is function objects, objects which can be called as if they are functions. I recently discovered that Fortran does not intrinsically have argument capture, and I think the workaround for this - writing a type which contains a procedure pointer and stores the captured variables explicitly - is clunky, because the type cannot be called like a function.

Proposed syntax

I would like to propose the syntax operator(()), which would be used in exactly the same way as e.g. operator(*) and operator(//). This would allow for the definition of more natural lambda types, e.g.

module lambda_module
  abstract interface
    function foo(input1,input2) result(output)
      integer, intent(in) :: input1,input2
      integer :: output
    end function
  end interface
  
  type :: Lambda
    procedure(foo), pointer :: func
    integer :: captured_variable
  contains
    generic, public :: operator(()) => call_Lambda
    procedure, private :: call_Lambda
  end type
  
  function call_Lambda(this,input) result(output)
    class(Lambda), intent(in) :: this
    integer, intent(in) :: input
    integer :: output
    output = this%func(input, this%captured_variable)
  end function
end module

and these lambda types might be used like

program test
  use lambda_module
  
  add_one = generate_lambda_add_one()
  write(*,*) add_one(3) ! Writes "4".
  
contains
  function generate_lambda_add_one() result(output)
    type(Lambda) :: output
    
    output = Lambda(add_integers, 1)
  contains
    function add_integers(input1,input2) result(output)
      integer, intent(in) :: input1,input2
      integer :: output
      output = input1+input2 
    end function
  end function
end program

Containers

I think this would also be helpful for writing containers which feel more “Fortranic”, so that container element access feels more like element access for intrinsic arrays, e.g.

type :: Array
  integer, allocatable :: contents(:)
contains
  generic, public :: operator(()) => access_Array
  procedure, private :: access_Array
end type

elemental function access_Array(this,index) result(output)
  class(Array), intent(in) :: this
  integer, intent(in) :: index
  integer :: output
  output = this%contents(index)
end function

type(Array) :: foo
foo = Array([2,4,8,16,32])
write(*,*) foo([1,5,3]) ! Writes "2 32 8"

Syntax clash

I think this would be a relatively small syntactical addition, which hopefully shouldn’t cause too many headaches for the compilers.

The main syntax issue I see is how to use arrays of function objects, e.g. if you have

  type(Array), allocatable :: foo(:)
  foo = [Array([1,2]), Array([3,4])]
  write(*,*) = foo(1)

then it’s unclear whether what is written should be the first element of foo, i.e. Array([1,2]), or the result of calling access_Array on foo elementally, i.e. [1,3].

For this I think I would favour leaning in to the clash between function call and array access syntax, and have

foo(1) return the first element of foo
foo(:,1) call access_Array on foo elementally.

Summary

So, what does everyone think? Is this an idea that’s been proposed before, and was it rejected for some reason? Is this even the right forum for these kinds of feature proposals? I’d love to hear your thoughts.

Beliavsky · May 30, 2021, 12:03pm

After you get feedback here you can submit a proposal at Issues · j3-fortran/fortran_proposals · GitHub and link to this thread.

veryreverie · May 30, 2021, 1:50pm

Ah, I didn’t know this existed, thank you!

And, somewhat unsurprisingly, this suggestion is already there.

Arjen · May 31, 2021, 6:58am

Have a look at my experiment with anonymous functions at flibs - a collection of Fortran modules / SVN / [r428] /trunk/experiments/lambda.f90.
(It is of course much more limited than what you would be able to do with what you suggest)

certik · May 31, 2021, 4:18pm

Yes, I was going to post here that this has already been proposed there. Thanks for finding it!

What would be the applications that you would like to be using this in?

veryreverie · June 1, 2021, 9:45am

In my own work, I’m more interested in the applications to containers than to Lambda functions, although this is partly because Lambda functions are so hard to write in Fortran that I steer well clear of them.

My research is about using crystal symmetry to accelerate free energy calculations, so I have lots of data structures which are sparse in very structured ways. The sparsity means that using dense arrays is horribly inefficient, so I use lots of ragged arrays, hash maps, sorted arrays and the like. Since Fortran doesn’t have these containers natively, I write my own, but they always end up being a lot more awkward to use than feels necessary.

For example, I regularly use 2-d ragged arrays defined like

type :: Array1d
  integer, allocatable :: elements(:)
end type

type :: Array2d
  type(Array1d), allocatable :: elements(:)
end type

type(Array2d) :: ragged_array

ragged_array = Array2d([ Array1d([1,2,3]), Array1d([4]), Array1d([5,6]) ])

I think the “least surprising” way of accessing the elements of ragged_array would be like ragged_array(i,j) or ragged_array(i,a:b) (or possibly even ragged_array(a:b,c:d), although this would presumably return a dense array). However, the best I can do at the moment is writing a getter which is called as ragged_array%elements(i,j) and which can’t take slices, or by accessing the components directly as ragged_array%elements(i)%elements(j) which is far more verbose than it needs to be, particularly if the ragged array is 3d or 4d or higher.

The situation gets worse if I don’t want the elements of the container to be directly accessible (e.g. because the container guarantees its elements are sorted, or because it uses C++ vector-style memory storage). Then the only way to access the elements is using a getter, and there is no way to use array slices at all.

ivanpribec · June 1, 2021, 10:41am

In some cases I’ve found using the associate construct helpful in such cases, e.g. a loop over all elements would like like:

associate(rows => ragged_array%elements)
  do i = 1, size(rows)
    associate(cols => rows(i)%elements)
       do j = 1, size(cols)
           ...
       end do
    end associate
  end do
end associate

Edit: Another idea is using a preprocessor liky fypp. If your containers follow a fixed-naming pattern an access macro might look something like:

#:def elem(*VARPOS)
#{for ARG in VARPOS}#%elements(${ARG}$)#{endfor}#
#:enddef get

do i = 1, nrows
  do j = 1, size(arr@{elem(i)}@)
    arr@{elem(i,j)}@ = arr@{elem(i,j)}@ + 1
  end do
end do

This will be expanded to:

do i = 1, nrows
  do j = 1, size(arr%elements(i))
    arr%elements(i)%elements(j) = arr%elements(i)%elements(j) + 1
  end do
end do

The downside is that your project now has a dependency, and the build process has one extra step. Also the syntax with @{ }@ is kind of ugly, even if it gets the job done.

Beliavsky · June 1, 2021, 12:09pm

You can use ELEMENTAL functions to get array sections in the first dimension and pass integer arguments to get sections in the second dimension. With Intel Fortran the output of

module m
implicit none
!
type :: Array1d
  integer, allocatable :: elements(:)
end type
!
type :: Array2d
  type(Array1d), allocatable :: elements(:)
end type
contains
!
subroutine display(mat)
type(Array2d), intent(in) :: mat
integer                   :: i
do i=1,size(mat%elements)
   write (*,"(1000i6)") i,mat%elements(i)%elements
end do
write (*,*)
end subroutine display
!
elemental function get(vec,j) result(k)
type(Array1d), intent(in) :: vec
integer      , intent(in) :: j
integer                   :: k
k = vec%elements(j)
end function get
!
elemental function slice(vec,j1,j2,stride) result(kvec)
type(Array1d), intent(in)           :: vec
integer      , intent(in)           :: j1,j2
integer      , intent(in), optional :: stride
type(Array1d)                       :: kvec
if (present(stride)) then
   kvec = Array1d(vec%elements(j1:j2:stride))
else
   kvec = Array1d(vec%elements(j1:j2))
end if
end function slice
!
end module m

program main
use m, only: Array2d,slice,display
implicit none
type(Array2d) :: ragged,ragged_slice
allocate (ragged%elements(3))
ragged%elements(1)%elements = [10,20]
ragged%elements(2)%elements = [100,200,300]
ragged%elements(3)%elements = [1000,2000,3000,4000]
call display(ragged)
ragged_slice%elements = slice(ragged%elements,1,2)
call display(ragged_slice)
ragged_slice%elements = slice(ragged%elements(2:3),1,3,2)
call display(ragged_slice)
end program main

is

     1    10    20
     2   100   200   300
     3  1000  2000  3000  4000
 
     1    10    20
     2   100   200
     3  1000  2000
 
     1   100   300
     2  1000  3000

GNU Fortran (GCC) 11.0.0 20200927 (experimental) from equation.com on Windows gives

     1    10    20
     2   100   200   300
     3  1000  2000  3000  4000

     1    10    20
     2   100   200
     3  1000  2000

     1   100   200
     2  1000  2000

which is different for the second instance of call display(ragged_slice). Is this a bug? Good old g95 says

Exception: Access Violation
At line 47 of file ragged.f90

but I think the program is valid.

veryreverie · June 1, 2021, 12:21pm

Thanks @ivanpribec and @Beliavsky, these are all good suggestions. I use associate a bit, and elemental functions a lot, and I should probably investigate a preprocessor other than cpp.

I guess my desire for operator(()) syntax is not to do anything that Fortran can’t already do (indeed, anything that could be done with operator(()) could also be done with a good old type-bound procedure, and as you point out slices can be emulated with arguments first, last and stride), but rather to be able to write containers which “feel like” containers, rather than like specialised classes. I see this as one step along the road towards Fortran having a standard container library whose containers can be used off the shelf without requiring a manual.

ivanpribec · June 1, 2021, 2:19pm

I certainly understand your desire. Overloading the operator() is a very common pattern in C++ (an example can be seen in my post on C++ MEX functions) and Python.

Most of the interpolator objects in scipy.interpolate provide a __call__ method to give the interpolants a function-like interface, e.g.

from scipy.interpolate import BarycentricInterpolator

y = BarycentricInterpolator([0,1,2],[0,1,4])
xi = np.linspace(0,2,50)
yi = y(xi)

For Fortran derived types I typically use the name %eval for this purpose. But as you mentioned earlier this is best considered separately from the case of container/array access.

Concerning containers, a project currently on hold is to provide a C++ container interface to Fortran arrays (Fortran/C++ compatibility library · Issue #325 · fortran-lang/stdlib · GitHub). This would make it easy to use C++ iterators and algorithms to operate on Fortran arrays. You could then implement your algorithm in C++.

certik · June 1, 2021, 2:46pm

For performance code, I would recommend not to use the allocatable of allocatables, I think that is very inefficient. For prototype code that is fine of course. I think being able to write your own arrays (with a custom internal implementation) is something that might be useful. Traditionally the idea was to add enough data structures into the Fortran language itself.

If I need to use sparse arrays, I just access them using subroutines. I think this is something worth investigating if there is a way to extend Fortran to overload the matrix operations and syntax, so that you can provide your own internal representation.

veryreverie · June 1, 2021, 4:00pm

I think I would disagree that an allocatable of allocatables is always less efficient than a single allocated array. Certainly in many cases a single allocatable will be faster, but I would argue that it depends on the details of the problem. For example, if each of the “leaf” allocatables is large, and an algorithm spends a reasonable amount of time in each, then the cost of jumping about in memory to access each leaf allocatable in turn is negligible. Or if each leaf allocatable needs updating regularly, and it is desirable to keep the memory of each leaf contiguous, then having an allocatable of allocatables allows each leaf allocatable to be re-allocated without having to either re-allocate the entire container.

I can definitely see the merits of this approach, but personally I would prefer an approach where the language is updated so that new data structures can easily be written as external libraries. Then users can get their containers from a standard library if they are available, or write their own if they need something more bespoke.

I agree. This sounds like it would improve the language immensely.

Topic		Replies	Views
Fortran function return value polymorphism Help	12	2039	August 3, 2021
__FUNCTION__ analogue for Fortran Help	7	1102	August 4, 2022
Calling different functions with an integer Help	3	271	December 11, 2023
Function not working Help	2	306	June 14, 2022
bind(C) philosophy: pointer-by-value or type-by-reference? Help	12	996	July 3, 2022

Function objects in Fortran

Proposed syntax

Containers

Syntax clash

Summary

Related topics