Accessing derived-data type array components via pointers

Problem:

Fortran’s OOP model makes it somewhat awkward to expose array components through accessor methods using pointers. Returning a pointer to an array component requires either the calling object to carry the target attribute, or the component itself to be declared as a pointer — both of which come with non-trivial trade-offs. There is no clean, zero-overhead way to simply hand out a reference to an internal array the way you might in C++.

I’ve come up with two reasonable approaches to work around this, and I’d love to hear from the community on best practices. Have you run into this pattern before? Which option do you prefer and why? Is there a third option I’m missing entirely?


Option 1: Allocatable array component + target attribute on the object

type, public :: mytype_t
  !! Any instantiated object must have the target or pointer attribute for the accessor function to work correctly.

  private
  real(RWP), dimension(:), allocatable :: myarray

contains

  procedure, pass(self), public :: get

end type mytype_t

contains

function get(self) result(myarray)

  implicit none
  class(mytype_t), target, intent(in) :: self
  real(RWP), dimension(:), pointer, contiguous :: myarray

  myarray => self%myarray

end function get

Pros:

  • No manual memory management

Cons:

  • The class is not truly self-contained — any object instantiated from mytype_t de facto requires the target (or pointer) attribute for the accessor to work correctly. This compounds when mytype_t objects are nested inside other objects, because the target attribute must propagate all the way up to the outermost object.
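To make the propagation issue concrete, here is a minimal sketch of an Option 1 call site. It assumes RWP is real64, and the init procedure and the outer_t wrapper type are hypothetical names added only for illustration. Note that target has to sit on the outermost variable, not on the nested component.

```fortran
module option1_demo_m
  use, intrinsic :: iso_fortran_env, only: RWP => real64  ! assumption: RWP is real64
  implicit none
  private

  type, public :: mytype_t
    private
    real(RWP), dimension(:), allocatable :: myarray
  contains
    procedure, public :: init   ! hypothetical helper, not in the original post
    procedure, public :: get
  end type mytype_t

  ! Hypothetical wrapper type, only here to illustrate nesting.
  type, public :: outer_t
    type(mytype_t) :: inner
  end type outer_t

contains

  subroutine init(self, n)
    class(mytype_t), intent(inout) :: self
    integer, intent(in) :: n
    integer :: i
    self%myarray = [(real(i, RWP), i = 1, n)]
  end subroutine init

  function get(self) result(myarray)
    class(mytype_t), target, intent(in) :: self
    real(RWP), dimension(:), pointer, contiguous :: myarray
    myarray => self%myarray
  end function get

end module option1_demo_m

program option1_demo
  use, intrinsic :: iso_fortran_env, only: RWP => real64
  use option1_demo_m, only: outer_t
  implicit none

  ! target goes on the outermost object; outer%inner inherits it as a subobject.
  type(outer_t), target :: outer
  real(RWP), pointer, contiguous :: p(:)

  call outer%inner%init(5)
  p => outer%inner%get()

  if (size(p) /= 5) error stop "unexpected size"
  print '(5(f6.2,1x))', p
end program option1_demo
```

If target is dropped from the declaration of outer, the code may still compile, but the pointer p is not guaranteed to remain usable after the call returns.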

Option 2: Pointer array component

type, public :: mytype_t

  private
  real(RWP), dimension(:), pointer, contiguous :: myarray => null()

contains

  procedure, pass(self), public :: get
  
  final :: finalize

end type mytype_t

contains

function get(self) result(myarray)

  implicit none
  class(mytype_t), intent(in), target :: self
  real(RWP), dimension(:), pointer, contiguous :: myarray

  myarray => self%myarray

end function get

subroutine finalize(self)

  implicit none
  type(mytype_t), intent(inout) :: self
  
  if (associated(self%myarray)) then
    deallocate (self%myarray)
  end if
  
end subroutine finalize

Pros:

  • Truly self-contained — instantiated objects require no target or pointer attribute on the caller’s side

Cons:

  • Pointer arrays generally have worse performance than allocatables due to the possibility of non-contiguity and aliasing
  • Requires finalizer to avoid memory leaks

Questions:

  • Which of these two patterns do you reach for in practice, and why? Are there other approaches I haven’t considered?
  • Does marking something as contiguous actually improve performance in practice—specifically, does it let the compiler optimize it as well as allocatable arrays (e.g., for aliasing and stride-1 access), and is there any benchmark evidence showing it removes the performance gap?
  • For those working on large codebases: does the self-containment of Option 2 outweigh the performance and memory-management overhead in your experience? I’m particularly interested in hearing from people who have used either pattern in performance-sensitive code. Any insights or references are welcome!

In my codes I expose the arrays directly. I don’t like accessors: I found that this extra layer of indirection makes maintainability worse (for me), adds lines of code, and has the issues you mentioned.

I also prefer not to use getters and setters for the “simple” task of accessing or modifying a DDT member (unless some pre- or post-processing logic is required). Fortran is already verbose enough! :slight_smile:

Neither!

The accessor pattern is for when you want to force the user of the derived type to only modify values through a setter (so you can verify/sanitize values). The getter should just return a copy of the data.

(Programmers of languages like Java/C#/PHP/etc. tend to use accessors for everything, but that’s due to a “just-in-case-we-might-need-them-later” mentality)

So returning a pointer to the data kind of defeats the purpose of the pattern… And it might also degrade performance.

You should just make the component public if the idea is to let the user change the value on its own. Otherwise, just provide a custom constructor (to initialize the value), a getter (to obtain a copy of the value), and, optionally, a setter (to override the current value).

Something like the following should make sense:

module mymod
    use ISO_FORTRAN_ENV, RWP => REAL64

    implicit none
    private

    type, public :: mytype_t
        private
        integer :: flags = 0
        real(RWP), allocatable :: myarray(:)
    contains
        ! the accessor pattern is better for when you need to intercept
        ! multiple things in an instance of your derived type, so TBP's names
        ! should be appropriate.

        procedure :: get_values
        procedure :: set_values

        ! notice that there's no `get_flags`
        procedure :: set_flags
        procedure :: has_flag
    end type

    interface mytype_t
        module procedure mytype_t_new
    end interface

contains
    pure function mytype_t_new(values, FLAGS) result(new)
        type(mytype_t) :: new
        real(RWP), intent(in) :: values(:)
        integer, optional, intent(in) :: FLAGS
        new%myarray = values
        if (present(FLAGS)) new%flags = FLAGS
    end function

    pure subroutine get_values(self, values)
        class(mytype_t), intent(in) :: self
        real(RWP), allocatable, intent(out) :: values(:)

        ! avoid runtime segfaults
        if (.not. allocated(self%myarray)) then
            allocate (values(1:0))
            return
        endif

        values = self%myarray
    end subroutine

    pure subroutine set_values(self, values, STAT)
        class(mytype_t), intent(inout) :: self
        real(RWP), intent(in) :: values(:)
        integer, optional, intent(out) :: STAT

        ! sanitize the values and return STAT /= 0 if they're invalid
        ! ...

        if (present(STAT)) STAT = 0
        self%myarray = values
    end subroutine

    pure subroutine set_flags(self, flags)
        class(mytype_t), intent(inout) :: self
        integer, intent(in) :: flags
        self%flags = ior(self%flags, flags)
    end subroutine

    pure function has_flag(self, flag) result(cond)
        logical :: cond
        class(mytype_t), intent(in) :: self
        integer, intent(in) :: flag
        cond = iand(self%flags, flag) == flag
    end function
end module mymod

Notice that, since there are no pointers involved, a final procedure is not needed.

This is not something I have a lot of experience with, but I’ve wondered about the protected attribute. Seems like it removes the need for a getter, instead letting you directly use a member without the risk of modifying it. Is that a good design pattern or are there downsides?

Edit: Actually, dumb question, since protected only applies to module variables, not derived type members.

The public|private attributes are about accessing designators (i.e., variables, components, etc.) from outside the module. The protected attribute is about whether a variable can be modified from outside the module: a protected variable can be read there, but not changed.

So you can have things like

module mod1
    implicit none
    private
    integer, protected :: a    ! here, `protected` is moot
    real, protected, public :: b(10)
end module mod1

In Fortran, the primary program unit when it comes to access is the module, which differs from other languages that center access around the class|struct. And it’s not the only one (e.g., in Go, the TitleCase access applies to the package, not the struct).

Also, Fortran has this thing about some attributes only applying to variables (e.g., protected, target), some applying only to derived type procedure components and bindings (e.g., pass), and some applying to both (e.g., allocatable, pointer, public, …). In fact, originally (before Fortran 2003), allocatable required rank > 0.

I think there was a proposal to extend protected so that it could apply to derived type components, but it didn’t take.

Thank you all for your responses.

I’d like to provide some additional context for why I’m considering the accessor pattern here. As a side note, I’ve always regarded Fortran’s explicitness and verbosity as strengths rather than weaknesses. The requirement to declare everything upfront is something I genuinely appreciate, as it makes code intent clear and unambiguous.

The use case I have in mind involves data that is owned and managed by one class and consumed in a read-only fashion by downstream classes. This aligns with a standard OOP encapsulation pattern: the owning class serves as the single source of truth, and external code should not be able to modify its internal state directly. Making the component public would violate that guarantee. Returning a copy is safe, but for large or frequently accessed arrays, the associated performance cost is prohibitive.

A pointer-based accessor is therefore intended to approximate the behavior of returning a const reference in C++. It allows read access to the underlying data without incurring the cost of copying. However, unlike a true const reference, Fortran does not prevent the caller from modifying the target through the returned pointer, which remains an important caveat.
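A minimal sketch of that caveat, assuming RWP is real64. The constructor mytype_t_new and the first_value helper are hypothetical names used only to make the effect observable. The accessor takes self as intent(in), yet the caller can still write through the returned pointer.

```fortran
module constref_demo_m
  use, intrinsic :: iso_fortran_env, only: RWP => real64  ! assumption: RWP is real64
  implicit none
  private
  public :: RWP

  type, public :: mytype_t
    private
    real(RWP), allocatable :: myarray(:)
  contains
    procedure :: get
    procedure :: first_value   ! hypothetical helper for the check below
  end type mytype_t

  interface mytype_t
    module procedure mytype_t_new
  end interface

contains

  function mytype_t_new(values) result(new)
    real(RWP), intent(in) :: values(:)
    type(mytype_t) :: new
    new%myarray = values
  end function mytype_t_new

  function get(self) result(p)
    class(mytype_t), target, intent(in) :: self
    real(RWP), pointer, contiguous :: p(:)
    p => self%myarray
  end function get

  pure function first_value(self) result(v)
    class(mytype_t), intent(in) :: self
    real(RWP) :: v
    v = self%myarray(1)
  end function first_value

end module constref_demo_m

program constref_demo
  use constref_demo_m, only: mytype_t, RWP
  implicit none

  type(mytype_t), target :: owner
  real(RWP), pointer, contiguous :: view(:)

  owner = mytype_t([1.0_RWP, 2.0_RWP, 3.0_RWP])
  view => owner%get()

  ! Intended as a read-only view, but nothing stops this assignment.
  view(1) = -1.0_RWP

  ! The owner's private state has changed behind its back.
  if (owner%first_value() /= -1.0_RWP) error stop "expected the write to go through"
  print '(a,f6.2)', 'first value is now ', owner%first_value()
end program constref_demo
```

So the read-only contract here is a convention between owner and consumer, not something the compiler enforces.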

As an aside—though not directly related to the accessor question—in my experience with Fortran OOP in computational fluid dynamics, the language’s object-oriented features tend to work best when used as an abstraction layer over a fundamentally procedural structure (input → computation → output). Lower-level classes are often stateless or nearly so, with only a few components fixed after initialization, while mid- and high-level classes primarily aggregate objects and manage temporary data. I’m interested to hear whether others in scientific computing have converged on a similar pattern, or if you’ve found alternative approaches that scale more effectively.

Maybe the problem is in trying to write C++ in any other language?

You can have a read-only look at private values through a callback that implements whatever you need:

module basemod
    use ISO_FORTRAN_ENV, RWP => REAL64

    implicit none
    private

    public :: RWP

    type, public :: mytype_t
        private
        real(RWP), allocatable :: myarray(:)
    contains
        procedure :: access_privates
    end type

    interface mytype_t
        module procedure mytype_t_new
    end interface

    abstract interface
        subroutine i_privates_cb(myarray)
            import
            real(RWP), intent(in) :: myarray(:)
        end subroutine
    end interface
    public :: i_privates_cb

contains
    pure function mytype_t_new(array) result(new)
        type(mytype_t) :: new
        real(RWP), intent(in) :: array(:)
        new%myarray = array
    end function

    subroutine access_privates(self, cb)
        class(mytype_t), intent(in) :: self
        procedure(i_privates_cb) :: cb

        if (.not. allocated(self%myarray)) then
            call cb([real(RWP) ::])
            return
        endif

        call cb(self%myarray)
    end subroutine
end module basemod

module myimpl
    use basemod

    implicit none
    private

    public :: process

contains
    subroutine process()
        type(mytype_t) :: t

        call t%access_privates(cb)

        t = mytype_t([real(RWP) :: (i, integer :: i = 1, 100)])
        call t%access_privates(cb)

    contains
        subroutine cb(array)
            real(RWP), intent(in) :: array(:)

            if (size(array) == 0) then
                print'(/a)','there''s no data'
                return
            endif

            print'(/a)','array is:'
            print'(10(f8.4,:,1x))',array
        end subroutine
    end subroutine
end module myimpl

use myimpl

implicit none

call process()
end

My gfortran version still doesn’t like the declaration of i within the implied-do, but ifx is fine with it:

$ ifx readonly-access.f90 && ./a.out 

there's no data

array is:
  1.0000   2.0000   3.0000   4.0000   5.0000   6.0000   7.0000   8.0000   9.0000  10.0000
 11.0000  12.0000  13.0000  14.0000  15.0000  16.0000  17.0000  18.0000  19.0000  20.0000
 21.0000  22.0000  23.0000  24.0000  25.0000  26.0000  27.0000  28.0000  29.0000  30.0000
 31.0000  32.0000  33.0000  34.0000  35.0000  36.0000  37.0000  38.0000  39.0000  40.0000
 41.0000  42.0000  43.0000  44.0000  45.0000  46.0000  47.0000  48.0000  49.0000  50.0000
 51.0000  52.0000  53.0000  54.0000  55.0000  56.0000  57.0000  58.0000  59.0000  60.0000
 61.0000  62.0000  63.0000  64.0000  65.0000  66.0000  67.0000  68.0000  69.0000  70.0000
 71.0000  72.0000  73.0000  74.0000  75.0000  76.0000  77.0000  78.0000  79.0000  80.0000
 81.0000  82.0000  83.0000  84.0000  85.0000  86.0000  87.0000  88.0000  89.0000  90.0000
 91.0000  92.0000  93.0000  94.0000  95.0000  96.0000  97.0000  98.0000  99.0000 100.0000

And again, no pointers were harmed during the process, :laughing:.

Firstly, welcome to the forum!

Secondly, all of the code examples that have been given so far in this thread (including your original two options) show OO anti-patterns that should be avoided.

And thirdly, …

you may wish to look up the “Tell, Don’t Ask” Principle. Your code examples (and the other code examples of the present thread) are violating this principle.

That’s very clever, even if the code doesn’t “read linearly” anymore. Saving the post :+1: . Solution fits under the fundamental theorem of software engineering.

The callback can also add the contiguous attribute to address the issue with,
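A rough sketch of what a callback with the contiguous attribute could look like, adapted from the callback example earlier in the thread. It assumes RWP is real64, and the accumulate callback is a hypothetical consumer. Since contiguous is part of the dummy argument's characteristics, it has to appear in both the abstract interface and the actual callback.

```fortran
module contig_cb_m
  use, intrinsic :: iso_fortran_env, only: RWP => real64  ! assumption: RWP is real64
  implicit none
  private
  public :: RWP

  type, public :: mytype_t
    private
    real(RWP), allocatable :: myarray(:)
  contains
    procedure :: access_privates
  end type mytype_t

  interface mytype_t
    module procedure mytype_t_new
  end interface

  abstract interface
    subroutine i_privates_cb(myarray)
      import :: RWP
      ! contiguous tells the compiler the callback sees stride-1 data
      real(RWP), intent(in), contiguous :: myarray(:)
    end subroutine i_privates_cb
  end interface
  public :: i_privates_cb

contains

  pure function mytype_t_new(array) result(new)
    real(RWP), intent(in) :: array(:)
    type(mytype_t) :: new
    new%myarray = array
  end function mytype_t_new

  subroutine access_privates(self, cb)
    class(mytype_t), intent(in) :: self
    procedure(i_privates_cb) :: cb
    ! An allocatable array is contiguous, so no copy-in is needed here.
    if (allocated(self%myarray)) call cb(self%myarray)
  end subroutine access_privates

end module contig_cb_m

program contig_cb_demo
  use contig_cb_m
  implicit none

  type(mytype_t) :: t
  real(RWP) :: total
  integer :: i

  total = 0.0_RWP
  t = mytype_t([(real(i, RWP), i = 1, 10)])
  call t%access_privates(accumulate)

  if (total /= 55.0_RWP) error stop "unexpected sum"
  print '(a,f6.1)', 'sum = ', total

contains

  ! Hypothetical consumer; its interface must match i_privates_cb.
  subroutine accumulate(array)
    real(RWP), intent(in), contiguous :: array(:)
    total = total + sum(array)
  end subroutine accumulate

end program contig_cb_demo
```

Whether this actually closes any performance gap is something I have not measured.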

@andris I share your interest in this, so I asked an LLM (here ChatGPT) a few more questions. The replies themselves were interesting to me, but the “answer” seems to be that there is no robust way of hiding implementation details (in a way similar to C++). I can ask more questions in a follow-up session, so please let me know if that would be useful.

I’d recommend reading Allen Holub’s following article on the topic, instead:

There, you will also find the following citation of a statement by Kent Beck and Ward Cunningham that illustrates the actual problem (emphasis mine):

The most difficult problem in teaching object-oriented programming is getting the learner to give up the global knowledge of control that is possible with procedural programs, and rely on the local knowledge of objects to accomplish their tasks. Novice designs are littered with regressions to global thinking: gratuitous global variables, unnecessary pointers, and inappropriate reliance on the implementation of other objects.

It’s annoying the page has aggressive cookies, but thanks for sharing. I found a follow-up to that article, which is also interesting:

Thanks for reminding me about Bugayenko’s article.

He’s absolutely right in that objects are not data entities, and that trying to use them in this fashion leads to all the troubles that are mentioned in this thread – simply because the programmer can’t give up his/her procedural programming mindset.

The callback solution of @jwmwalrus reminds me a bit of the Apple Accelerate Quadrature library. Many numerical quadrature libraries expect the user to provide a function which evaluates a scalar integrand. This “cripples” the use of SIMD parallelism. But in Accelerate, the function expects a “vector callback”.
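To illustrate the idea in Fortran terms, here is a minimal sketch of a quadrature routine that takes a vector callback. This is not the Accelerate API: the names vector_integrand, midpoint_rule and square are hypothetical, RWP is assumed to be real64, and a simple composite midpoint rule stands in for a real quadrature scheme.

```fortran
module vector_quad_m
  use, intrinsic :: iso_fortran_env, only: RWP => real64  ! assumption: RWP is real64
  implicit none
  private
  public :: RWP, midpoint_rule

  abstract interface
    ! The integrand receives all nodes at once and fills all values at once.
    subroutine vector_integrand(x, y)
      import :: RWP
      real(RWP), intent(in)  :: x(:)
      real(RWP), intent(out) :: y(:)
    end subroutine vector_integrand
  end interface
  public :: vector_integrand

contains

  function midpoint_rule(f, a, b, n) result(integral)
    procedure(vector_integrand) :: f
    real(RWP), intent(in) :: a, b
    integer, intent(in) :: n
    real(RWP) :: integral
    real(RWP) :: h, x(n), y(n)
    integer :: i

    h = (b - a)/n
    x = [(a + (i - 0.5_RWP)*h, i = 1, n)]
    call f(x, y)          ! a single call evaluates every node
    integral = h*sum(y)
  end function midpoint_rule

end module vector_quad_m

program vector_quad_demo
  use vector_quad_m
  implicit none

  real(RWP) :: res

  res = midpoint_rule(square, 0.0_RWP, 1.0_RWP, 1000)

  ! The exact integral of x**2 over [0, 1] is 1/3.
  if (abs(res - 1.0_RWP/3.0_RWP) > 1.0e-6_RWP) error stop "quadrature check failed"
  print '(a,f10.7)', 'integral = ', res

contains

  subroutine square(x, y)
    real(RWP), intent(in)  :: x(:)
    real(RWP), intent(out) :: y(:)
    y = x**2              ! whole-array expression, easy to vectorize
  end subroutine square

end program vector_quad_demo
```

The point of the design is that the library calls the user code once per batch of nodes rather than once per node, which leaves the loop over nodes inside the callback where the compiler can vectorize it.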

The Apple documentation only shows the Swift API, but at least last time I checked there was also an (Objective-)C API, which can be combined with Fortran callbacks using C interop.

I’ve mentioned this idea on Discourse before: Iso_c_binding: pass an array from c to fortran (Edit: python interop content) - #26 by ivanpribec

I also asked ChatGPT about the “getters/setters” topic, and the reply seems to be “it depends”. (In my case, I often define property routines with additional checks or related computations, but I guess the OP’s goal may be a bit different).

For “Option2” in the first post (of OP):

I guess this “option 2” also has the same cons as “option 1”, because the declaration class(mytype_t), target... requires the instantiated object to have TARGET…? I’ve tried flang-22 on mac with slightly simplified code:

module test_m
implicit none

type :: mytype_t
  real, pointer :: myarray(:) => null()
contains
  procedure :: get
end type

contains

function get(self) result(myarray)

  class(mytype_t), intent(in), target :: self
  real, pointer :: myarray(:)

  myarray => self%myarray

end function

end module

program main
    use test_m, only: mytype_t
    implicit none

    type(mytype_t) :: m   !! w/o target
    !! type(mytype_t), target :: m   !! w/ target

    real, pointer :: p(:)

    p => m% get()

end program

which gives:

$ flang-22 -pedantic test.f90 
./test.f90:32:10: warning: Any pointer associated with TARGET dummy 
argument 'self=' during this call must not be used afterwards, 
as 'm' is not a target [-Wnon-target-passed-to-target]
      p => m% get()
           ^^^^^^^^

I guess CompilerExplorer can also be used for compilation with various compilers as well as flang-22.

(FYI, I have mixed feelings about this “TARGET” attribute, particularly because it is required for the entire derived type object and everything that contains it (rather than for individual components), which seems like an excessive requirement, but I guess it is probably a different topic…)

What happens if you remove the target attribute from the dummy argument self in the getter function? Do you still get a warning? You should still get valid results as the function result is associated with an actual target through the pointer assignment.
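For reference, a sketch of that variant, based on the simplified test above. The init procedure is a hypothetical addition so that the pointer component has something to point at, and the finalizer is omitted for brevity. Because the component is itself a pointer, the pointer assignment in get picks up the component's own target and does not need self to be a target.

```fortran
module notarget_m
  implicit none
  private

  type, public :: mytype_t
    private
    real, pointer :: myarray(:) => null()
  contains
    procedure :: init   ! hypothetical helper, not in the original test
    procedure :: get
  end type mytype_t

contains

  subroutine init(self, n)
    class(mytype_t), intent(inout) :: self
    integer, intent(in) :: n
    if (associated(self%myarray)) deallocate (self%myarray)
    allocate (self%myarray(n), source=1.0)
  end subroutine init

  function get(self) result(myarray)
    class(mytype_t), intent(in) :: self   ! no target attribute here
    real, pointer :: myarray(:)

    ! self%myarray is a pointer, so this associates the result with
    ! whatever the component points at, not with self.
    myarray => self%myarray
  end function get

end module notarget_m

program notarget_demo
  use notarget_m, only: mytype_t
  implicit none

  type(mytype_t) :: m   ! declared without target
  real, pointer :: p(:)

  call m%init(3)
  p => m%get()

  if (.not. associated(p)) error stop "pointer not associated"
  if (size(p) /= 3) error stop "unexpected size"
  print '(3(f4.1,1x))', p
end program notarget_demo
```

Whether a particular compiler still emits a diagnostic for this version is worth checking on your own setup.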

Thanks for mentioning it! I truly didn’t know that the setter/getter pattern was considered so harmful, but it does make sense as it breaks encapsulation.

Exactly. Now you can rethink your design in a way that is consistent with an object-oriented approach.

That is, you should move calculations/functions, that operate on your array but are presently embedded in other classes, into the original class that holds this array as a field.

In this way, you won’t have to break encapsulation. You’ll also get maximal cohesion within that class (as all the data and the functions operating on that data, which are all tightly coupled to each other, will reside in this same class). At the same time, this will minimize the inter-class coupling, and hence the required amount of communication with other classes.
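As a rough illustration of that advice, here is a minimal sketch in which the operations live in the class that owns the array, so no accessor is needed. The type field_t and the procedures scale_by and l2_norm are hypothetical names, and RWP is assumed to be real64.

```fortran
module field_m
  use, intrinsic :: iso_fortran_env, only: RWP => real64  ! assumption: RWP is real64
  implicit none
  private
  public :: RWP

  type, public :: field_t   ! hypothetical owning class
    private
    real(RWP), allocatable :: values(:)
  contains
    procedure :: scale_by   ! "tell" the object to change itself
    procedure :: l2_norm    ! ask for a result, not for the raw data
  end type field_t

  interface field_t
    module procedure field_new
  end interface

contains

  pure function field_new(values) result(new)
    real(RWP), intent(in) :: values(:)
    type(field_t) :: new
    new%values = values
  end function field_new

  pure subroutine scale_by(self, factor)
    class(field_t), intent(inout) :: self
    real(RWP), intent(in) :: factor
    if (allocated(self%values)) self%values = factor*self%values
  end subroutine scale_by

  pure function l2_norm(self) result(norm)
    class(field_t), intent(in) :: self
    real(RWP) :: norm
    norm = 0.0_RWP
    if (allocated(self%values)) norm = norm2(self%values)
  end function l2_norm

end module field_m

program tell_dont_ask_demo
  use field_m
  implicit none

  type(field_t) :: f

  f = field_t([3.0_RWP, 4.0_RWP])
  call f%scale_by(2.0_RWP)          ! values become [6, 8] inside the object

  if (abs(f%l2_norm() - 10.0_RWP) > 1.0e-12_RWP) error stop "unexpected norm"
  print '(a,f8.4)', 'norm = ', f%l2_norm()
end program tell_dont_ask_demo
```

The array never leaves field_t, so neither target nor pointer components are needed. Whether this scales to a case with many downstream consumers of the same data is a separate design question.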

Sometimes, a problem is only a problem because it is viewed from the wrong perspective.