Automatic vs. allocatable arrays for function results

Consider two implementations of a function:

pure function ones1(n) result(res)
  integer, intent(in) :: n
  integer :: res(n)
  res = 1
end function

pure function ones2(n) result(res)
  integer, intent(in) :: n
  integer, allocatable :: res(:)
  allocate(res(n), source=1)
end function

The difference is how the result array is declared and initialized. In ones1, we use an automatic array and in ones2 an allocatable one.

Is there any practical advantage to using automatic arrays? Is there any performance benefit to either?

In the case of Intel compiler (but not GNU), the automatic array result is on the stack which can be exceeded with large arrays, so the user has to unlimit their stack size in the OS, or use the -heap-arrays flag. IMO this is a downside to automatic arrays approach.

The upside is that the code is simpler for the automatic approach than the allocatable one.

Which approach do you use and why? I’ve been using automatic arrays when possible, and allocatable otherwise.

4 Likes

A function returning an automatic array is more flexible, because in the caller, an array need not be allocatable to be set to the function result, although with allocation upon assignment, it can be. I use functions with an allocatable result when the size of the result does not trivially depend on the arguments. An example would be a function that returns the unique values of an array argument.

Edit: the first sentence above is wrong, as @milancurcic points out below.

2 Likes

I would say that’s specific to the Intel compiler and not a downside of the approach. I think for performance reason this is a great idea to use a stack, but one has to indeed use ulimit otherwise it just segfaults. I think a better user experience is to never segfault like this.

  • Is there a way to tell (for a compiler generated code) that a program is running out of stack space?

  • So in Debug mode at least it can put in checks and give an error message.

  • Can it be made to be performing?

In some sense, this is running out of memory. In practice people are used to ensuring that there is enough main memory. But not used to ensuring that ulimit -u (or whatever the option is) is called. For example, cannot the generated program check at the beginning how large the stack is, and give a nice error message if it is not big enough?

This leads to a more general question if the Fortran compiler can help with running out of memory. For example, the compiler could insert checks for automatic (and allocatable) arrays on stack, and then try to optimize such functions by inlining them or running the checks ahead of time as much as possible.

Btw, aren’t there compiler options (such as -fstack-arrays in gfortran) that put allocatable arrays on stack also? I thought there is no guarantee that an allocatable array will be on a heap.

1 Like

I don’t understand this. Why does the array need to be allocatable to be set to the result of ones2? It seems to me it doesn’t have to be.

program test
  integer :: a(10)
  a = ones2(10)
  print *, a
contains
  pure function ones2(n) result(res)
    integer, intent(in) :: n
    integer, allocatable :: res(:)
    allocate(res(n), source=1)
  end function
end program

My demo tests:
A website online, you can get different versions of the Fortran compiler, such as gfortran/ifort/ifx/flang: Compiler Explorer (godbolt.org)

Code: Array 1000*1000

program demo_1

    real :: stime, etime
    real(kind=8), allocatable :: A(:,:), B(:,:)

    call cpu_time(stime)
    A = empty_allocation(1000,1000)
    call cpu_time(etime)
    print "(A,ES10.3)", "etime - stime (seconds) : ", etime - stime !! 8.184E-03 (gfortran); 1.101E-02 (ifort)

    call cpu_time(stime)
    B = empty_automatic(1000,1000)
    call cpu_time(etime)
    print "(A,ES10.3)", "etime - stime (seconds) : ", etime - stime !! 8.000E-06 (gfortran); 9.549E-03 (ifort)

contains

    !> ALlocation.
    pure function empty_allocation(ndim1, ndim2) result(result)
        integer, intent(in) :: ndim1, ndim2
        real(kind=8), allocatable :: result(:,:)
        allocate(result(ndim1, ndim2))
    end function empty_allocation

    !> Automatic.
    pure function empty_automatic(ndim1, ndim2) result(result)
        integer, intent(in) :: ndim1, ndim2
        real(kind=8) :: result(ndim1, ndim2)
    end function empty_automatic

end program demo_1

Screenshot:

Loose conclusion

On the 1000*1000 matrix, it seems that gfortran is more efficient in the automatic scheme, while ifort has the same efficiency in the two schemes.

Code: continue to increase the matrix dimension, 3000*3000

program demo_2

    real :: stime, etime
    real(kind=8), allocatable :: A(:,:), B(:,:)

    call cpu_time(stime)
    A = empty_allocation(3000,3000)
    call cpu_time(etime)
    print "(A,ES10.3)", "etime - stime (seconds) : ", etime - stime !! 6.153E-02 (gfortran); 9.206E-02 (ifort)

    call cpu_time(stime)
    B = empty_automatic(3000,3000)
    call cpu_time(etime)
    print "(A,ES10.3)", "etime - stime (seconds) : ", etime - stime !! 1.100E-05 (gfortran); segmentation fault occurred (ifort)

contains

    !> ALlocation.
    pure function empty_allocation(ndim1, ndim2) result(result)
        integer, intent(in) :: ndim1, ndim2
        real(kind=8), allocatable :: result(:,:)
        allocate(result(ndim1, ndim2))
    end function empty_allocation

    !> Automatic.
    pure function empty_automatic(ndim1, ndim2) result(result)
        integer, intent(in) :: ndim1, ndim2
        real(kind=8) :: result(ndim1, ndim2)
    end function empty_automatic

end program demo_2

Screenshot:

Loose conclusion

With the default ifort/ifx setting, if you use the automatic scheme to encounter an array of 3000*3000, a segmentation error will be reported.

1 Like

Sorry, you are correct.

1 Like

A simple rule of thumb says always prefer the newer feature. In this case, allocatable arrays. You get orderly error handling in case of insufficient resources and you can complain loudly to the vendor if the performance is lacking because it is a current feature that is used a lot.

3 Likes

But the older a feature is, the more production codes there are using it, the more complaining there will be if it is broken, and the more confidence you can have that your compiler, even an old version, supports it. For example, the code below is not legal

do 1 i=1,3
   do 1 j=1,2
1      print*,i,j,i+j
end

since as gfortran explains,

continue.f90:2:13:

    2 |    do 1 j=1,2
      |             1
Warning: Fortran 2018 deleted feature: Shared DO termination label 1 at (1)
continue.f90:3:21:

    3 | 1      print*,i,j,i+j
      |                     1
Warning: Fortran 2018 deleted feature: DO termination statement which is not END DO or CONTINUE with label 1 at (1)

but as a practical matter you can rely on Fortran compilers supporting it (outside of strict mode) since so much code uses deleted or obsolescent features. One can help Fortran progress by using new features and reporting bugs, but that is a different issue.