Accessing derived-data type array components via pointers

Problem:

Fortran’s OOP model makes it awkward to expose array components through accessor methods that return pointers. Returning a pointer to an array component requires either that the object passed to the accessor carries the target attribute, or that the component itself is declared as a pointer; both come with non-trivial trade-offs. There is no clean, zero-overhead way to hand out a reference to an internal array the way you might in C++.

I’ve come up with two reasonable approaches to work around this, and I’d love to hear from the community on best practices. Have you run into this pattern before? Which option do you prefer and why? Is there a third option I’m missing entirely?


Option 1: Allocatable array component + target attribute on the object

type, public :: mytype_t
  !! Any instantiated object must have the target or pointer attribute for the accessor function to work correctly.

  private
  real(RWP), dimension(:), allocatable :: myarray

contains

  procedure, pass(self), public :: get

end type mytype_t

contains

function get(self) result(myarray)

  implicit none
  class(mytype_t), target, intent(in) :: self
  real(RWP), dimension(:), pointer, contiguous :: myarray

  myarray => self%myarray

end function get

Pros:

  • No manual memory management

Cons:

  • The class is not truly self-contained: any object of type mytype_t de facto requires the target (or pointer) attribute, because otherwise the pointer returned by the accessor becomes undefined as soon as get returns. This compounds when mytype_t objects are nested inside other objects, because the target attribute must then be carried by the outermost object.
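To make the nesting issue concrete, here is a minimal sketch of Option 1 used from the caller's side. The init helper, the wrapper_t type, and the module and program names are hypothetical additions for illustration only; RWP is assumed to be real64.

```fortran
module option1_mod
  use, intrinsic :: iso_fortran_env, only: RWP => real64
  implicit none
  private

  type, public :: mytype_t
    private
    real(RWP), dimension(:), allocatable :: myarray
  contains
    procedure, pass(self), public :: init   ! hypothetical helper, not in the post
    procedure, pass(self), public :: get
  end type mytype_t

contains

  ! Fill the private component (automatic allocation on assignment).
  subroutine init(self, values)
    class(mytype_t), intent(inout) :: self
    real(RWP), intent(in) :: values(:)
    self%myarray = values
  end subroutine init

  function get(self) result(myarray)
    class(mytype_t), target, intent(in) :: self
    real(RWP), dimension(:), pointer, contiguous :: myarray
    myarray => self%myarray
  end function get

end module option1_mod

program option1_demo
  use, intrinsic :: iso_fortran_env, only: RWP => real64
  use option1_mod, only: mytype_t
  implicit none

  ! Hypothetical container type: mytype_t is nested one level down.
  type :: wrapper_t
    type(mytype_t) :: inner
  end type wrapper_t

  ! The target attribute has to be declared here, on the outermost object.
  ! Without it, p would become undefined once get returns.
  type(wrapper_t), target :: w
  real(RWP), pointer :: p(:)

  call w%inner%init([1.0_RWP, 2.0_RWP, 3.0_RWP])
  p => w%inner%get()

  if (.not. associated(p)) error stop "pointer is not associated"
  if (size(p) /= 3) error stop "unexpected size"
  print *, "values seen through the pointer:", p
end program option1_demo
```

The type definition itself never mentions wrapper_t, yet whoever declares w has to know about the accessor's requirement. That is the leak in encapsulation described above.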

Option 2: Pointer array component

type, public :: mytype_t

  private
  real(RWP), dimension(:), pointer, contiguous :: myarray => null()

contains

  procedure, pass(self), public :: get
  
  final :: finalize

end type mytype_t

contains

function get(self) result(myarray)

  implicit none
  class(mytype_t), intent(in) :: self  ! no target needed: the component is itself a pointer
  real(RWP), dimension(:), pointer, contiguous :: myarray

  myarray => self%myarray

end function get

subroutine finalize(self)

  implicit none
  type(mytype_t), intent(inout) :: self
  
  if (associated(self%myarray)) then
    deallocate (self%myarray)
  end if
  
end subroutine finalize

Pros:

  • Truly self-contained: instantiated objects require no target or pointer attribute on the caller’s side

Cons:

  • Pointer arrays can perform worse than allocatables, because the compiler has to assume they may alias other data and, unless they are declared contiguous, that they may be non-contiguous
  • Requires a finalizer to avoid memory leaks
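One more thing to keep in mind with Option 2 is that intrinsic assignment of a type with a pointer component copies the pointer association, not the data. The sketch below shows this; the init helper and the module and program names are hypothetical, RWP is assumed to be real64, and the finalizer is deliberately left out so that the sharing can be observed safely.

```fortran
module option2_mod
  use, intrinsic :: iso_fortran_env, only: RWP => real64
  implicit none
  private

  type, public :: mytype_t
    private
    real(RWP), dimension(:), pointer, contiguous :: myarray => null()
  contains
    procedure, pass(self), public :: init   ! hypothetical helper, not in the post
    procedure, pass(self), public :: get
  end type mytype_t

contains

  ! Allocate the pointer component and copy the values into it.
  subroutine init(self, values)
    class(mytype_t), intent(inout) :: self
    real(RWP), intent(in) :: values(:)
    if (associated(self%myarray)) deallocate (self%myarray)
    allocate (self%myarray(size(values)), source=values)
  end subroutine init

  function get(self) result(myarray)
    class(mytype_t), intent(in) :: self
    real(RWP), dimension(:), pointer, contiguous :: myarray
    myarray => self%myarray
  end function get

end module option2_mod

program option2_demo
  use, intrinsic :: iso_fortran_env, only: RWP => real64
  use option2_mod, only: mytype_t
  implicit none

  type(mytype_t) :: a, b            ! no target attribute needed here
  real(RWP), pointer :: pa(:), pb(:)

  call a%init([1.0_RWP, 2.0_RWP, 3.0_RWP])
  b = a                             ! shallow copy: b now points at a's data

  pa => a%get()
  pb => b%get()
  if (.not. associated(pa, pb)) error stop "expected a and b to share storage"
  print *, "a and b share the same array:", associated(pa, pb)
end program option2_demo
```

With the finalizer from the post in place, both a and b would try to deallocate the same target when they are finalized, so in practice this pattern probably also needs a defined assignment that performs a deep copy, or a convention that such objects are never copied.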

Questions:

  • Which of these two patterns do you reach for in practice, and why? Are there other approaches I haven’t considered?
  • Does marking a pointer as contiguous actually improve performance in practice? Specifically, does it let the compiler optimize as well as it can for allocatable arrays (e.g., for aliasing and stride-1 access), and is there any benchmark evidence that it closes the performance gap?
  • For those working on large codebases: does the self-containment of Option 2 outweigh the performance and memory-management overhead in your experience? I’m particularly interested in hearing from people who have used either pattern in performance-sensitive code. Any insights or references are welcome!

In my codes I expose the arrays directly. I don’t like accessors: I have found that the extra layer of indirection makes maintainability worse (for me), adds lines of code, and has the issues you mentioned.

I also prefer not to use getters and setters for the “simple” task of accessing or modifying a DDT member (unless some pre- or post-processing logic is required). Fortran is already verbose enough! :)

Neither!

The accessor pattern is for when you want to force the user of the derived type to only modify values through a setter (so you can verify/sanitize values). The getter should just return a copy of the data.

(Programmers of languages like Java/C#/PHP/etc. tend to use accessors for everything, but that’s due to a “just-in-case-we-might-need-them-later” mentality)

So returning a pointer to the data kind of defeats the purpose of the pattern… And it might also degrade performance.

You should just make the component public if the idea is to let the user change the value on its own. Otherwise, just provide a custom constructor (to initialize the value), a getter (to obtain a copy of the value), and, optionally, a setter (to override the current value).

Something like the following should make sense:

module mymod
    use, intrinsic :: iso_fortran_env, only: RWP => real64

    implicit none
    private

    type, public :: mytype_t
        private
        integer :: flags = 0
        real(RWP), allocatable :: myarray(:)
    contains
        ! the accessor pattern is better for when you need to intercept
        ! multiple things in an instance of your derived type, so TBP's names
        ! should be appropriate.

        procedure :: get_values
        procedure :: set_values

        ! notice that there's no `get_flags`
        procedure :: set_flags
        procedure :: has_flag
    end type

    interface mytype_t
        module procedure mytype_t_new
    end interface

contains
    pure function mytype_t_new(values, FLAGS) result(new)
        type(mytype_t) :: new
        real(RWP), intent(in) :: values(:)
        integer, optional, intent(in) :: FLAGS
        new%myarray = values
        if (present(FLAGS)) new%flags = FLAGS
    end function

    pure subroutine get_values(self, values)
        class(mytype_t), intent(in) :: self
        real(RWP), allocatable, intent(out) :: values(:)

        ! avoid runtime segfaults
        if (.not. allocated(self%myarray)) then
            allocate (values(1:0))
            return
        endif

        values = self%myarray
    end subroutine

    pure subroutine set_values(self, values, STAT)
        class(mytype_t), intent(inout) :: self
        real(RWP), intent(in) :: values(:)
        integer, optional, intent(out) :: STAT

        ! sanitize the values and return STAT /= 0 if they're invalid
        ! ...

        if (present(STAT)) STAT = 0
        self%myarray = values
    end subroutine

    pure subroutine set_flags(self, flags)
        class(mytype_t), intent(inout) :: self
        integer, intent(in) :: flags
        self%flags = ior(self%flags, flags)
    end subroutine

    pure function has_flag(self, flag) result(cond)
        logical :: cond
        class(mytype_t), intent(in) :: self
        integer, intent(in) :: flag
        cond = iand(self%flags, flag) == flag
    end function
end module mymod

Notice that, since there are no pointers involved, a final procedure is not needed.
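For completeness, a small driver along these lines should exercise the module. It assumes mymod from above is compiled together with it; the FLAG_A and FLAG_B constants and the program name are made up for the example.

```fortran
program mymod_demo
  use, intrinsic :: iso_fortran_env, only: RWP => real64
  use mymod, only: mytype_t
  implicit none

  ! Hypothetical bit flags; the module itself does not define any.
  integer, parameter :: FLAG_A = 1, FLAG_B = 2

  type(mytype_t) :: obj
  real(RWP), allocatable :: copy(:)
  integer :: stat

  ! The generic interface mytype_t resolves to the custom constructor.
  obj = mytype_t([1.0_RWP, 2.0_RWP], FLAGS=FLAG_A)
  call obj%set_flags(FLAG_B)

  ! The getter hands back a copy, so changing it leaves obj untouched.
  call obj%get_values(copy)
  copy(1) = -1.0_RWP

  ! The setter is the only way to change the stored values.
  call obj%set_values([3.0_RWP, 4.0_RWP, 5.0_RWP], STAT=stat)
  if (stat /= 0) error stop "set_values rejected the input"

  call obj%get_values(copy)
  if (size(copy) /= 3) error stop "unexpected size after set_values"
  if (.not. obj%has_flag(ior(FLAG_A, FLAG_B))) error stop "flags were not set"
  print *, "stored values:", copy
end program mymod_demo
```

Every call to get_values costs an allocation and a copy, which is the price paid for not exposing the internal storage.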

This is not something I have a lot of experience with, but I’ve wondered about the protected attribute. Seems like it removes the need for a getter, instead letting you directly use a member without the risk of modifying it. Is that a good design pattern or are there downsides?

Edit: Actually, dumb question, since protected only applies to module variables, not derived type members.

The public|private attributes control whether designators (i.e., variables, components, etc.) can be accessed from outside the module. The protected attribute controls whether a variable that is accessible from outside the module can also be modified there: a protected variable is read-only outside its own module.

So you can have things like

module mod1
    implicit none
    private
    integer, protected :: a    ! `a` is private by default, so `protected` is moot here
    real, protected, public :: b(10)
end module mod1
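A slightly fuller sketch of how this behaves from the outside could look as follows. The module, variable, and procedure names are invented for the example.

```fortran
module counter_mod
  implicit none
  private

  ! Readable from outside the module, but only modifiable inside it.
  integer, protected, public :: n_calls = 0
  public :: increment

contains

  subroutine increment()
    n_calls = n_calls + 1     ! allowed: we are inside the defining module
  end subroutine increment

end module counter_mod

program protected_demo
  use counter_mod, only: n_calls, increment
  implicit none

  call increment()
  call increment()

  if (n_calls /= 2) error stop "unexpected counter value"
  print *, "n_calls =", n_calls   ! reading a protected variable is fine

  ! n_calls = 10   ! uncommenting this should be rejected at compile time
end program protected_demo
```

So protected gives you a getter-free, read-only view, but only for module variables.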

In Fortran, the primary unit of access control is the module, which differs from other languages that center access around the class|struct. Fortran is not alone in this (e.g., in Go, the TitleCase access applies to the package, not the struct).

Also, Fortran has this thing about some attributes applying only to variables (e.g., protected, target), some applying only inside derived type definitions (e.g., pass, for type-bound procedures and procedure pointer components), and some applying to both (e.g., allocatable, pointer, public, …). In fact, allocatable was originally limited to arrays (rank > 0); allocatable scalars only arrived with Fortran 2003.

I think there was a proposal to extend protected so that it could apply to derived type components, but it didn’t take.