Arena allocator for real arrays

UPDATE: I had Gemini copy formatting from my dp implementation and added support for integer(kind=c_int32_t) (i32), integer(kind=c_int64_t) (i64), and real(kind=c_float) (sp). To support this, init64 accepting 64-bit integer array sizes had to be separately named. Otherwise the all-optional-arguments signatures are ambiguous between i32 and i64 versions.

I recently started working on macOS and discovered that you cannot simply run ulimit -s unlimited to pretend the stack is as large as system RAM.

This forced me to convert a bunch of routines with large internal working arrays over to allocatable arrays. Of course I hate allocating memory, so I wrote a simple arena allocator to handle real variables: GitHub - dacarnazzola/farena: simple Fortran arena allocator for real arrays

It is set up to provide contiguous space for real(kind=c_double) variables, but that can be reconfigured at the top by pointing wp at something else. It supports arrays of rank 1 through 15, and will take either 32-bit or 64-bit integer dimensions for 1D requests. For any rank higher than 1, I only wrote interfaces for 32-bit inputs.
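For readers unfamiliar with the idea, here is a minimal sketch of how an arena of this kind can work. It is not the farena source: the module name tiny_arena, the type tiny_arena_type, and the procedures get1d/get2d are hypothetical, and the sketch assumes default-integer sizes and a single real kind. One backing array is allocated once, and each request is served by pointer bounds remapping onto the next unused slice.

```fortran
module tiny_arena
    use, intrinsic :: iso_c_binding, only: c_double
    implicit none
    private
    public :: tiny_arena_type, wp

    integer, parameter :: wp = c_double   ! working precision (assumed, like wp in the post)

    type :: tiny_arena_type
        private
        real(wp), allocatable :: buf(:)   ! single backing allocation
        integer :: used = 0               ! number of elements handed out so far
    contains
        procedure :: init
        procedure :: get1d
        procedure :: get2d
        procedure :: reset
    end type tiny_arena_type

contains

    subroutine init(self, n)
        class(tiny_arena_type), intent(inout) :: self
        integer, intent(in) :: n          ! capacity in elements
        if (allocated(self%buf)) deallocate(self%buf)
        allocate(self%buf(n))
        self%used = 0
    end subroutine init

    subroutine get1d(self, dim1, ptr)
        ! TARGET on self so that ptr may legally point into self%buf
        class(tiny_arena_type), intent(inout), target :: self
        integer, intent(in) :: dim1
        real(wp), pointer, contiguous, intent(out) :: ptr(:)
        if (.not. allocated(self%buf)) error stop 'tiny_arena: not initialized'
        if (self%used + dim1 > size(self%buf)) error stop 'tiny_arena: out of space'
        ptr(1:dim1) => self%buf(self%used+1:self%used+dim1)
        self%used = self%used + dim1
    end subroutine get1d

    subroutine get2d(self, dim1, dim2, ptr)
        class(tiny_arena_type), intent(inout), target :: self
        integer, intent(in) :: dim1, dim2
        real(wp), pointer, contiguous, intent(out) :: ptr(:,:)
        integer :: n
        n = dim1*dim2
        if (.not. allocated(self%buf)) error stop 'tiny_arena: not initialized'
        if (self%used + n > size(self%buf)) error stop 'tiny_arena: out of space'
        ! rank-2 view remapped onto a rank-1 slice of the backing array
        ptr(1:dim1, 1:dim2) => self%buf(self%used+1:self%used+n)
        self%used = self%used + n
    end subroutine get2d

    subroutine reset(self)
        ! "frees" everything at once; previously returned pointers must not be reused
        class(tiny_arena_type), intent(inout) :: self
        self%used = 0
    end subroutine reset

end module tiny_arena

program tiny_arena_demo
    use tiny_arena, only: tiny_arena_type, wp
    implicit none
    type(tiny_arena_type), target :: ws
    real(wp), pointer, contiguous :: a(:), b(:,:)

    call ws%init(1000)
    call ws%get1d(10, a)
    call ws%get2d(3, 4, b)
    a = 1.0_wp
    b = 2.0_wp
    print *, size(a), shape(b)
    call ws%reset()
end program tiny_arena_demo
```

The sketch does no alignment or per-request bookkeeping; resetting the counter is the only way to release space, which is the usual trade-off of an arena.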

Wouldn’t it be better if you could just use local arrays and the compiler would use an arena behind the scene to do the same thing?

Yes.

We have an old issue for this feature here: Efficient implementation of stack arrays using a custom allocator · Issue #2657 · lfortran/lfortran · GitHub. I think we’ll be able to implement it very soon.

Aha, right on, that’s awesome. LFortran keeps bringing so many niceties to Fortran, keep up the great work!

Looking at your example,

program main
use, intrinsic :: iso_fortran_env, only: rk => real64
use, non_intrinsic :: farena, only: arena_type
implicit none

    type(arena_type) :: workspace

    real(rk), pointer :: arr1d(:), arr2d(:,:), arr3d(:,:,:)

    call workspace%init() ! default initialization provides 1 GB

    call workspace%gptr(10, arr1d)
    call random_number(arr1d)
...

I thought the target attribute on the workspace variable was necessary?

If I compile the code with flang -pedantic (see Compiler Explorer), I get a bunch of warnings:

/app/example.f90:14:5: warning: Any pointer associated with TARGET dummy argument 'self=' during this call must not be used afterwards, as 'workspace' is not a target [-Wnon-target-passed-to-target]
      call workspace%gptr(10, arr1d)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Honestly I am not sure about that one… Previously I had basically never used pointers in Fortran, and asking AI about it indicated to me that the usage shown here was fine. The routines that actually produce the pointer have the dummy argument marked target as expected.

Normally my debug/“safe” compiler options come from gfortran, and this string compiles + runs as expected without any warnings: gfortran -O0 -g -Wall -Wextra -Werror -std=f2018 -pedantic -fmax-errors=1 -fcheck=all -fbacktrace farena.f90 main.f90

I will have to defer to the standards readers on this one, I truly do not know what the “correct” answer is supposed to be.

Checking through all the options in Compiler Explorer, with no flags the only compilers with issues seem to be nvfortran and lfortran. nvfortran doesn’t support array ranks >8 it seems, and lfortran produces this error:

Compiler stderr
warning: `--generate-object-code` is deprecated and will be removed in a future release; use `--separate-compilation` instead.
semantic error: Only a pointer variable can be associated with another variable.
 →  /app/example.f90:76:13
   |
76 |             ptr(1_i64:int(dim1, kind=i64)) => self%rdata(start_ii:stop_ii)
   |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Note: Please report unclear, confusing or incorrect messages as bugs at

That is particularly strange since self is declared with target, and ptr has pointer. Dunno.

I think it’s related to this issue: Pointer attribute for instance argument in type bound procedures · Issue #354 · j3-fortran/fortran_proposals · GitHub

As @aradi explained there,

The problem here is, that if the user forgets to define the target attribute for the list instance, there is no warranty, that the returned pointer is valid and there will be no compiler error (or even warning).

My understanding has been that this attribute is valid only for the scope of the procedure. But I’m happy to hear other opinions on this.

Quoting from section 8.5.18 (J3/24-007)

The TARGET attribute specifies that a data object may have a pointer associated with it (10.2.2). An object without the TARGET attribute shall not have a pointer associated with it.

If an object has the TARGET attribute, then all of its nonpointer subobjects also have the TARGET attribute.

The way I like to think of this is that, unless there is a visible TARGET in the same scope as the pointer, anyone reading the code might as well assume the pointer is associated with an anonymous target. On the other hand, a declaration such as

    type(arena_type), TARGET :: workspace
    real(rk), pointer :: arr1d(:), arr2d(:,:), arr3d(:,:,:)

serves as a reminder to the reader that there is an association in place here. I do admit that the rules are not very intuitive; especially if you come from a C background, it is the total opposite… I saw a good quote on this at the LLVM Discourse (thanks to @nv_epshteyn):

“Fortran is almost, but not quite, entirely unlike C.”

Edit: I wonder if this is a paraphrasing of Chapter 2 in Michael Wolfe’s book “High Performance Compilers for Parallel Computing”. This book has a nice explanation of pointer aliasing:

Fortran 90 pointers are more restricted, in that they can point only to objects of the same type. Moreover, only variables with the target attribute can have their address taken, so the alias sets for dereferences pointers do not get so large. Fortran 90 also defines a special type of pointer variable, called an allocatable variable, which can be defined only by an allocate statement; an allocatable variable cannot be an alias for any other declared variable.

Applying this reasoning to the example above, and assuming no other visible pointers or targets, a compiler could attempt alias analysis between arr1d, arr2d and arr3d. But the analysis would not be complete, due to the missing target attribute on workspace. If it were present, the alias analysis could perhaps correctly infer that the arrays all alias the private derived type component workspace%data_dp or workspace%data_sp (depending on the kind value rk).

Several good points. I think the linked GitHub post describes exactly this situation. They were talking about returning a pointer as a view into something else, which is very similar to what I’m doing here. Because the arena is meant to replace the need for direct allocate statements, mine is meant to be a mutable reference, so it is even more important to make clear to the compiler where the potential targets are.

I have added target to the example, and expanded the primary farena.f90 module to support 32 and 64-bit integer + real spaces now. I also added the contiguous attribute to both the returned output pointers and the example code. That one should be required: any consumer of a pointer returned from arena_type_object%gptr(dim1, dim2, …, dimN, output_pointer) should need to declare output_pointer as contiguous or face a compilation error.

Hopefully that helps the compiler still know it’s safe to generate vectorized code even with target called out on the arena itself.

Any time there are pointers and targets, there is a potential to suppress vectorized code generation. Sometimes the compiler can see that no actual aliases can occur and can then optimize the code, but sometimes it can’t. The programmer can sometimes hide the possible aliases from the compiler, and that can help optimization. One way to do that is to pass the pointers and targets to a subroutine, and the dummy argument associations (without pointer or target attributes) then tell the compiler that aliases are not possible. I think you can do the same kind of thing now with ASSOCIATE blocks too. Another possibility is that some compilers allow options or directives for the programmer to tell the compiler that aliases cannot occur; these approaches are, of course, not portable.
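As a rough illustration of the subroutine approach, the sketch below carves two pointer views out of one pool and hands them to a kernel whose dummy arguments carry neither pointer nor target. The names kernels, axpy, and pool are made up for this example. Inside axpy the compiler is entitled to assume that x and y do not overlap, since y is modified; it remains the caller's responsibility that this is actually true.

```fortran
module kernels
    use, intrinsic :: iso_fortran_env, only: rk => real64
    implicit none
contains
    ! Plain explicit-shape dummies: no POINTER, no TARGET.
    ! The usual Fortran argument rules let the compiler treat x and y
    ! as non-overlapping here, which keeps the loop easy to vectorize.
    subroutine axpy(n, a, x, y)
        integer, intent(in) :: n
        real(rk), intent(in) :: a
        real(rk), intent(in) :: x(n)
        real(rk), intent(inout) :: y(n)
        integer :: i
        do i = 1, n
            y(i) = y(i) + a*x(i)
        end do
    end subroutine axpy
end module kernels

program alias_demo
    use, intrinsic :: iso_fortran_env, only: rk => real64
    use kernels, only: axpy
    implicit none
    real(rk), allocatable, target :: pool(:)
    real(rk), pointer, contiguous :: x(:), y(:)

    allocate(pool(200))
    x => pool(1:100)      ! two disjoint views into the same pool
    y => pool(101:200)
    x = 1.0_rk
    y = 2.0_rk

    ! In this scope x and y are pointers and could alias as far as the
    ! compiler knows; inside axpy that possibility is hidden.
    call axpy(size(x), 0.5_rk, x, y)
    print *, y(1)
end program alias_demo
```

Passing overlapping views to such a kernel would violate the argument association rules, so this pattern only helps when the views really are disjoint, as they are for separate requests from an arena.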

Here is one way to look at this situation. Fortran does not define how argument association works for normal subroutine calls. So in principle, it could be through addresses or it could be copy-in/copy-out of temp arrays, or it could be through address association of those temp arrays. So if you are wondering if a pointer assignment might be valid after a subprogram call, just ask yourself if the compiler would be allowed to do one of those copy-in/copy-out types of association. If the answer is yes, then that means that the pointer is undefined.

This is not the way it is described in the standard. The standard backs away from such low-level details and rewords things in an abstract way. But so far, this way of looking at argument association is consistent with the standard text. That might change in the future; some compiler might find a way to make these things different. But at least right now, that is a good way to keep track of when pointers are and aren’t defined after subprogram return.

Perhaps ironically, adding the contiguous attribute to a dummy argument can sometimes trigger copy-in/copy-out association when it would not have occurred otherwise. One would think that this would occur only when the actual argument is not contiguous and the dummy argument is, but the compiler might decide to do this anyway given, e.g., some appropriate set of compiler options or optimization directives. So if you are using the contiguous attribute to help with vectorization optimizations, but also expecting local pointers to remain defined upon return, you may run into problems due to the conflict between those two goals. I think there are cases where you can combine those and make it work in a standard conforming way, but you also need to beware. You may be laying landmines that are safe for you to navigate, but will be difficult for the next programmer who needs to modify the code; and of course, that next programmer may be you six months from now.

When the compiler sees an actual argument with the pointer or target attribute, and the dummy argument also has the pointer or target attribute, then it knows that copy-in/copy-out is not allowed. That combination is how the programmer tells the compiler to use the same actual memory locations. That particular combination is when local pointer assignments remain defined after the subprogram return. I think, but I’m not certain, that is the only combination the programmer has to ensure that the pointer is defined afterwards. Maybe there are some other corner cases that are also covered somehow, but I can’t think of one right now.
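A small sketch of that combination, using made-up names (view_mod, first_half): the dummy src is an assumed-shape array with target, and the actual argument a also has target, so the pointer set inside the routine is expected to remain associated with a after the call. The second call, left commented out, shows the case flang warned about earlier in the thread, where the actual argument lacks target and the pointer should not be used afterwards.

```fortran
module view_mod
    implicit none
contains
    ! src is TARGET and assumed-shape, so with a TARGET actual argument
    ! no copy-in/copy-out is allowed and view can outlive the call.
    subroutine first_half(src, view)
        real, intent(inout), target :: src(:)
        real, pointer, intent(out) :: view(:)
        view => src(1:size(src)/2)
    end subroutine first_half
end module view_mod

program target_demo
    use view_mod, only: first_half
    implicit none
    real, target :: a(10)
    real :: b(10)            ! note: no TARGET
    real, pointer :: v(:)

    a = 1.0
    b = 2.0

    call first_half(a, v)    ! TARGET actual + TARGET dummy
    print *, associated(v, a(1:5))

    ! call first_half(b, v)  ! b is not a target: v would be undefined
    !                        ! after return and must not be referenced
end program target_demo
```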