Procedure with work arrays as optional arguments

Imagine a procedure that needs some work space to do its thing. If the user does not provide said work space, then the procedure allocates it internally. Otherwise, the work space passed by the user is employed.

What is the recommended pattern to implement this functionality? Is the approach below OK, or are there better options? Thanks.

module foo_mod
   implicit none

contains

   subroutine foo(n, res, work)
   !! A procedure that needs a work array 
      integer, intent(in) :: n
      !! problem dimension
      real, intent(out) :: res
      !! result
      real, intent(inout), optional, target :: work(:)
      !! user-supplied work space; intent(inout) is required in the 'true' application

      real, pointer :: lwork(:) ! local (internal) work space
      integer :: lenwork, i

      lenwork = length_work(n)
      
      if (present(work)) then
         if (size(work) == lenwork) then
            lwork => work
         end if   
      else
         allocate(lwork(lenwork))
      end if

      ! Core part of the code
      ! in this case, something silly where the work array is used
      lwork = 0
      do i = 1, lenwork
         lwork(i) = i
      end do
      res = sum(lwork)

      ! Pointer deallocation required (unlike allocatable arrays)
      if (present(work)) then
         lwork => null()
      else
         deallocate(lwork)
      end if

   end subroutine foo

   integer function length_work(n)
      integer, intent(in) :: n
      length_work = n**2 + n**3 ! some long formula
   end function

end module foo_mod

program test
   use foo_mod
   implicit none

   integer, parameter :: n=2
   real :: res
   real, allocatable :: work(:)
   
   ! call without 'work'
   call foo(n, res)
   print *, "without 'work':", res

   ! call with 'work'
   allocate(work(length_work(n)))
   call foo(n, res, work=work)
   print *, "with 'work':", res, work(size(work))

end program test

Looks ok to me. There are two comments:

  1. It seems “lwork” will remain uninitialized if user provides array “work” with incorrect size. I assume you will be handling such exception in a production code.
  2. It is not necessary to nullify “lwork” at the end of the subroutine, just deallocate it if needed. (Unless you prefer that “nullify” assignment in your code for esthetics reasons.)
2 Likes

Besides what @grofz said, you can see other examples from how we handled that in stdlib.

There are cases where a pointer is useful also for reshaping a 1D input array to a 2D one (for example, if LAPACK requires an input (lda, 1) array even for a single evaluation.

2 Likes

I think that that in some cases, it is preferable to use an extra local allocatable array to avoid potential memory leaks, which can occur if one forget to deallocate pointer at the end of subroutine.

subroutine foo(n, res, work)
      integer, intent(in) :: n
      real, intent(out) :: res
      real, intent(inout), optional, target :: work(:)
     
      real, allocatable, target :: loc_work(:)

      real, pointer :: lwork(:) ! 
      integer :: lenwork, i

      lenwork = length_work(n)
      
      if (present(work)) then
         if (size(work) == lenwork) then
            lwork => work
         end if   
      else
         allocate(loc_work(lenwork))
         lwork => loc_work
      end if
   ....
   end subroutine foo
1 Like

There are several approaches to consider.

I have done what you show and it works alright, but the programmer must be careful to conditionally deallocate all of the pointers before return, otherwise it will leak memory. If the local memory is allocatable, target, then you can skip the conditional pointer deallocations because the compiler will do that automatically. This is useful if there are many return statements in the code, and where it can be tricky to get all of the manual deallocations done for all possible execution paths.

Another thing to consider is that expressions with pointer arrays sometimes cannot be optimized due to the possible aliasing, so performance can suffer. A workaround for that is to write an interface routine that does the conditional pointer assignments, then call the actual work routine with the full argument list and where the dummy arguments are all just normal arrays (no target, pointer, or optional attributes). The expressions within the work routine can then be optimized. This approach has the advantage that the work routine can be called directly when the programmer wants to take care of all the workspace allocations (e.g. when the workspace arrays are reused many times), and the interface routine can be called when some or all of the workspace allocation is done by the interface routine.

You can also adopt the convention that all work arrays are allocatable, intent(inout). In this case, the routine reuses an array if it is allocated and of sufficient size, and it (re)allocates otherwise. This eliminates any possible memory leaks, but it restricts the actual arrays in the calling program to be allocatable.

2 Likes

This is the crux, and I had completely overlooked it… The code has several checking steps and an equal number of if (error) return statements spread across the main driving routine. There are “just” 4 of these, so, in theory, I could place the dealloc statements in a contained subroutine and call it before each return… But that feels a bit like tempting the devil. So, I guess I will go with @rsci’s suggestion and implement a local allocatable array, rather than allocating the pointer array directly .

The code is actually already structured in that way. These “preparation” operations take place in odr, which then hands over the job to dodcnt. See call graph: example5 – odrpack (hugomvale.github.io)

This was actually my first idea, but I figured out this approach might get me stuck. I want to have the possibility to call the procedure from Python, via the C-API, and I would not know how to make that work.

It works! :slight_smile: :partying_face:

1 Like

Perhaps this has already been suggested (I quickly scanned and didn’t see it). In such cases I prefer to have a private auxiliary procedure that performs the computation and which takes the workspace array as a required argument. Then a second public procedure where it is optional. If present, it is passed to the auxiliary, otherwise a local allocatable array is allocated and passed instead. No need for messing with pointers at all.

1 Like

For a single optional array, this would require two different call statements, one with the present optional array and the other with the local allocatable array. If there are two such arrays, then the selection of one of four call statements would be required. If there are N optional arrays, then selection of one of 2**N call statements would be required. The use of pointers for the optional arrays reduces all of that effort down to a single call statement.

This is a common problem when interfacing code from different languages. All of the interlanguage interoperablility must be restricted to some common intersection of the various languages. That means that fortran features such as allocatable arrays, optional arguments, assumed shape arrays, etc. must not be used. Those are the features that make modern fortran appealing.

Quite obviously there will need to be different calls depending on whether an argument is present or not. But for a single one it is really quite simple and much preferable imo to messing with pointers and all their baggage. If you feel differently, fine. It’s just another option.

It’s also obvious that it becomes unwieldy as the number of optional workspace arguments increase and so not such a good option in that case. However I’d suggest that an interface with multiple optional workspace arguments may not be a particularly good design.

If you look at the details of how LAPACK uses its single work(*) array, you will see that although it is a single dummy argument, it is used internally as multiple arrays (depending on the linear algebra algorithm). These multiple arrays are mixtures of vectors, 2D matrices, and so on. These effective arrays are implemented using offsets within that single work(*) array. Of course, this interface dates back to f77 conventions, and f77 did not have many of the modern features we are discussing here (allocatable arrays, pointers, optional arguments, etc.). With modern fortran, it is simpler, and arguably clearer, to have multiple workspace vectors and explicit 2D workspace matrices than to do the offset approach. For example, in my own codes I sometime define a derived type that has internally scalars, allocatable vectors, and allocatable matrices as its components, and I pass that collection of workspace arrays as a single argument to the subroutine. This works the same way as the f77-style LAPACK approach, but it uses modern fortran features.

1 Like

@HugoMVale ,

What is better is in the eyes of the caller, actually customer/consumer, of this who may be you yourself.

Note the notion of the customer/consumer managing work areas can be increasingly seen as passé, if not already.

You write, “Imagine a procedure that needs some work space to do its thing.” This “thing” you refer to is more often than not a class of actions e.g., linear algebra. Proceeding accordingly can provide your consumers a world of convenience and efficiency (fewer chances of errors).

module a_class_m

    private

    type, public :: a_class_m
         private
         real, allocatable :: work
    end type

    generic, public :: foo => foo_w, foo_nw

contains

subroutine foo( a_class, n, res, work)
   !! A procedure that needs a work array
      type(a_class_t), intent(inout) :: a_class   
      integer, intent(in) :: n
      !! problem dimension
      real, intent(out) :: res
      !! result
      real, intent(inout) :: work(:)
      ..
end subroutine

subroutine foo_nw( a_class, n, res )
   !! A procedure that needs a work array
      type(a_class_t), intent(inout) :: a_class   
      integer, intent(in) :: n
      !! problem dimension
      real, intent(out) :: res
      ..
      call foo_w( a_class, n, res, a_class%work )
      ..
   end subroutine
..
end module

program test
   use a_class_m
    type(a_class_t) :: a
   ..
   real, allocatable :: work(:)
   
   ! call without 'work'
   call foo(a, n, res)
   print *, "without 'work':", res

   ! call with 'work'
..
   call foo(a, n, res, work=work)
   

Increasingly the callers will start consuming the solution without bothering about managing their own work areas.