Understanding Fortran pointers

I’m trying to fully understand pointers in Fortran and what it’s the exact behaviour of them. I’m reading “Modern Fortran explained (2018)” and also going through https://www.tutorialspoint.com/fortran/fortran_pointers.htm but they’re still unclear to me.

I thought that pointers are just objects that save the memory address of an actual variable when associated to it. That is, if point is this a pointer object and targ is a target variable then point => targ makes that when using point in any calculation, print or whatever operation in the program it will exactly behave as targ (and with behave I simply mean that it will just use the value of targ but now using point, of course assuming both have same data type). Following this reasoning my expectation will be that point is linked to targ and it’s numerical value will be associated always with targ, so I I change the value of targ when I use point I will get the same value as in targ. What I will not expect is that if I change point it will also change the value of targ. Let me use the following example (basically the same as in the link above).

What I understand:

  • Example 1:

    program pointers
      implicit none
      integer, pointer :: point
      integer, target :: targ
    
      targ = 3
      point => targ
    
      print'(" point = ", g0)', point
      print'(" targ  = ", g0)', targ
    end program pointers
    

    Output:

    point = 3
    targ  = 3
    

    Even though this makes sense to me, what it’s actually happening here? Is point only getting the memory address of targ or it is actually storing the numerical value of targ thus duplicating memory usage? I think is the former but not sure.

What I don’t understand:

  • Example 2: Same as Example 1 but doing:

    program pointers
      implicit none
      integer, pointer :: point
      integer, target :: targ
    
      point => targ
      point = 3
    
      print'(" point = ", g0)', point
      print'(" targ  = ", g0)', targ
    end program pointers
    

    Output:

    point = 3
    targ  = 3
    

    Here is when I start to not understand. Why targ is getting the value of point? In principle I will expect that, although point is linked to targ, changing the value of point will not alter targ's. What is happening here? Actually, following this weird thought that I have about what a pointer does, what does it means to use the operation = with a pointer? I believe that using this operation there’s a real assignment of the value 3 to the variable point and that now there’s an actual variable point occupying the needed memory of a default integer storing the value 3.

  • Example 3: although I think it is enough for the moment, maybe another fundamental thing that I don’t understand. What is the meaning of using allocate over a pointer in general, but in particular to a scalar variable (either integer or real)? In the book I’m reading mentioned above it says “[referring to a pointer] … or may be given fresh storage using the allocate statement…”. What does it means to give fresh storage? And particularly to a scalar pointer as for example allocate (point) being point of type integer/real?

Sorry for being so verbose when I think my question is clear with just the second paragraph.

Thanks in advance!

PS. Is there a way to keep track or get logs of the actually memory usage of variables and pointers and see the values and links of them?

1 Like

Note POINTER is an ATTRIBUTE of an object in Fortran, a “characteristic” that allows that object to serve as an ALIAS in a way of another object that has either the TARGET or the pointer attributes.

ALLOCATE statement on an object that has the pointer attribute allows you to allocate memory for an “anonymous” object that has the TARGET attribute. The pointer object gets associated with this “anonymous” object and the pointer object becomes a de facto “alias” for it.

8 Likes

When a pointer points to a target it does not duplicate the memory usage. A real array with 3*10**8 elements is close to memory limit of my PC, at least with the gfortran default options, but a program that has several pointers pointing to an array of this size

! demonstrate pointers pointing to an array are not stored separately
implicit none
integer, parameter :: n = 3*10**8
real   , target    :: x(n)
real   , pointer   :: p1(:),p2(:),p3(:)
call random_number(x)
p1 => x
p2 => x
p3 => x
print*,x(1),p1(1),p2(1),p3(1),x(n),p1(n),p2(n),p3(n)
end

runs on my PC, giving output such as

0.766056478 0.766056478 0.766056478 0.766056478 0.222064555 0.222064555 0.222064555 0.222064555

Gfortran does not compile a program with n twice as large as above.

implicit none
integer, parameter :: n = 6*10**8
real   , target    :: x(n)
call random_number(x)
print*,x(n)
end

saying

c:/equation/bin/../lib/gcc/x86_64-w64-mingw32/11.0.0/../../../../x86_64-w64-mingw32/lib/../lib/crt2.o:crtexe.c:(.text+0x1ed): relocation truncated to fit: R_X86_64_PC32 against symbol `__imp_Sleep' defined in .idata$5 section in c:/equation/bin/../lib/gcc/x86_64-w64-mingw32/11.0.0/../../../../x86_64-w64-mingw32/lib/../lib/libkernel32.a(dvhos01202.o)
c:/equation/bin/../lib/gcc/x86_64-w64-mingw32/11.0.0/../../../../x86_64-w64-mingw32/lib/../lib/crt2.o:crtexe.c:(.text+0x275): relocation truncated to fit: R_X86_64_PC32 against symbol `__imp_SetUnhandledExceptionFilter' defined in .idata$5 section in c:/equation/bin/../lib/gcc/x86_64-w64-mingw32/11.0.0/../../../../x86_64-w64-mingw32/lib/../lib/libkernel32.a(dvhos01189.o)
c:/equation/bin/../lib/gcc/x86_64-w64-mingw32/11.0.0/../../../../x86_64-w64-mingw32/lib/../lib/crt2.o:crtexe.c:(.text+0x2a5): relocation truncated to fit: R_X86_64_PC32 against symbol `__mingw_winmain_hInstance' defined in COMMON section in c:/equation/bin/../lib/gcc/x86_64-w64-mingw32/11.0.0/../../../../x86_64-w64-mingw32/lib/../lib/crt2.o
c:/equation/bin/../lib/gcc/x86_64-w64-mingw32/11.0.0/../../../../x86_64-w64-mingw32/lib/../lib/crt2.o:crtexe.c:(.text+0x2ff): relocation truncated to fit: R_X86_64_PC32 against symbol `__mingw_winmain_lpCmdLine' defined in COMMON section in c:/equation/bin/../lib/gcc/x86_64-w64-mingw32/11.0.0/../../../../x86_64-w64-mingw32/lib/../lib/crt2.o
c:/equation/bin/../lib/gcc/x86_64-w64-mingw32/11.0.0/../../../../x86_64-w64-mingw32/lib/../lib/crt2.o:crtexe.c:(.text+0x455): relocation truncated to fit: R_X86_64_PC32 against symbol `__imp_GetStartupInfoA' defined in .idata$5 section in c:/equation/bin/../lib/gcc/x86_64-w64-mingw32/11.0.0/../../../../x86_64-w64-mingw32/lib/../lib/libkernel32.a(dvhos00630.o)
c:/equation/bin/../lib/gcc/x86_64-w64-mingw32/11.0.0/../../../../lib/libgfortran.a(compile_options.o): In function `_gfortran_set_options':
/home/gfortran/gcc-home/workshop/gcc/objdir64/x86_64-w64-mingw32/libgfortran/../../../gcc-11-20200927-mingw/libgfortran/runtime/compile_options.c:149:(.text$_gfortran_set_options+0xc): relocation truncated to fit: R_X86_64_PC32 against symbol `_gfortrani_compile_options' defined in COMMON section in c:/equation/bin/../lib/gcc/x86_64-w64-mingw32/11.0.0/../../../../lib/libgfortran.a(compile_options.o)
/home/gfortran/gcc-home/workshop/gcc/objdir64/x86_64-w64-mingw32/libgfortran/../../../gcc-11-20200927-mingw/libgfortran/runtime/compile_options.c:151:(.text$_gfortran_set_options+0x1a): relocation truncated to fit: R_X86_64_PC32 against symbol `_gfortrani_compile_options' defined in COMMON section in c:/equation/bin/../lib/gcc/x86_64-w64-mingw32/11.0.0/../../../../lib/libgfortran.a(compile_options.o)
/home/gfortran/gcc-home/workshop/gcc/objdir64/x86_64-w64-mingw32/libgfortran/../../../gcc-11-20200927-mingw/libgfortran/runtime/compile_options.c:153:(.text$_gfortran_set_options+0x28): relocation truncated to fit: R_X86_64_PC32 against symbol `_gfortrani_compile_options' defined in COMMON section in c:/equation/bin/../lib/gcc/x86_64-w64-mingw32/11.0.0/../../../../lib/libgfortran.a(compile_options.o)
/home/gfortran/gcc-home/workshop/gcc/objdir64/x86_64-w64-mingw32/libgfortran/../../../gcc-11-20200927-mingw/libgfortran/runtime/compile_options.c:155:(.text$_gfortran_set_options+0x36): relocation truncated to fit: R_X86_64_PC32 against symbol `_gfortrani_compile_options' defined in COMMON section in c:/equation/bin/../lib/gcc/x86_64-w64-mingw32/11.0.0/../../../../lib/libgfortran.a(compile_options.o)
/home/gfortran/gcc-home/workshop/gcc/objdir64/x86_64-w64-mingw32/libgfortran/../../../gcc-11-20200927-mingw/libgfortran/runtime/compile_options.c:157:(.text$_gfortran_set_options+0x46): relocation truncated to fit: R_X86_64_PC32 against symbol `_gfortrani_compile_options' defined in COMMON section in c:/equation/bin/../lib/gcc/x86_64-w64-mingw32/11.0.0/../../../../lib/libgfortran.a(compile_options.o)
/home/gfortran/gcc-home/workshop/gcc/objdir64/x86_64-w64-mingw32/libgfortran/../../../gcc-11-20200927-mingw/libgfortran/runtime/compile_options.c:159:(.text$_gfortran_set_options+0x56): additional relocation overflows omitted from the output
collect2.exe: error: ld returned 1 exit status

Intel Fortran and g95 give similar results to gfortran for the two programs on Windows.

3 Likes

Yeah, pointers are hard. There’s multiple, very subtly different things going on behind the scenes that one needs to be cognizant of and remember to handle or you’ll end up with either memory leaks or “spooky action at a distance” which you’ve already stumbled across.

First, a pointer points to a thing, it is not a thing itself. In order to point a pointer at a thing, one uses pointer assignment (i.e. pointer => thing). After that, pointer and thing refer to the same memory. Using one or the other in an expression or as an argument to a procedure is exactly equivalent at that point. That’s why your p = 3 changes the value of targ too.

One can also allocate new things through a pointer, that nothing else refers to. This is where memory leaks become possible though. If you allocate something through a pointer, if you don’t deallocate it or point a new pointer at it before pointing the original pointer at something new, there will no longer be any way to refer to the allocated memory, and thus no way of deallocating it.

There’s also another potential pitfall. If you point a pointer at a thing, and then leave the scope of where that thing is defined, you’re pointer is now pointing at something that no longer exists, and trying to refer to it is invalid. I.e. in the following example, after the call to the subroutine, the pointer is pointing at invalid memory.

program main
  implicit none
  integer, pointer :: p
  call sub(p)
  print *, p ! this is invalid and will either crash or print garbage
contains
  subroutine sub(p_)
    integer, pointer :: p_

    integer, target :: x

    x = 42
    p_ => x
  end subroutine
end program

Given the above pitfalls, I tend to avoid pointers at (almost) all cost. The semantics aren’t so hard to understand, but the usage is really hard to get right. You have to keep in your head at all times too many things to avoid getting it wrong, including:

  • data ownership (who’s supposed to free this memory?)
  • the call stack (how long is the thing I’m pointing at valid?)
  • aliasing (how many things refer to this data, and when and where might they change it?)
  • not mixing up assignment and pointer assignment (i.e. don’t use => when you mean = and vice versa, which is a really easy typo to miss and causes very subtle bugs)
6 Likes

That’s actually insightful, thanks. So the ALIAS explanation makes sense to me and allows me to understand the Example 1 I mentioned. However, with this explanation is still unclear to me why Example 2 returns the same value for the target when only assigned to the pointer. I see the assignment point => targ as point is now the alias of targ. The behaviour of Example 2 seems to me more like one wants to say point <=> targ (I know this is meaningless). I mean, that the assignment seems bidirectional.

And thanks again because this actually solves my doubt on the allocate issue.
EDIT
Well, maybe not. Because I can understand that you can allocate the “anonymous” target through the pointer when the target is an allocatable array, but whats the point for a scalar variable? What I get is that allocating a scalar pointer is pretty much the same as associate it by the => assignment. It this the case?

Kind of glad to hear that :sweat_smile:

That’s indeed what I wanted to hear :ok_hand:

Messy…

Ok, I guess we arrived to the same conclusion. Just gave up with them time ago but wanted to really understand them. The idea is great when you think on optimising memory usage but seems that allocating and deallocating is an easier and safer practice.

The really nice thing about Fortran is that, unlike C, you very rarely need to use pointers - the vast amount of typical functionality can be achieved without them.

I don’t avoid Fortran pointers, rather I only use them where their unique functionality is needed (e.g. C interoperability).

3 Likes

Same thing for me: I have never used or needed them except for C interoperability. But we all have different kinds of programming style and programming needs, so it’s just a personal statement.

2 Likes

I’ve used pointers in the past to swap the addresses of arrays of old and new values for the integration of a PDE. The pattern looks something like this:

integer, parameter :: n = 100000     ! number of discretization points
integer :: step, numsteps

real, pointer, contiguous :: fnew(:,:) => null()
real, pointer, contiguous :: fold(:,:) => null()

numsteps = 10000
allocate(fnew(n,9),fold(n,9))

! ... init fnew and fold ...

do step = 1, numsteps
  
   ! ... update fnew using values of fold ...

  swap: block
    real, pointer :: ftmp(:,:) => null()
    ftmp => fnew
    fnew => fold
    fold => ftmp
  end block swap
end do

Preferably, the update is done in a kernel subroutine which accepts an assumed-size array and “hides” the pointer attribute of the array in the calling program. This raises the chances the compiler will generate optimized code since it does not need to worry about aliasing.

The alternative way to do this without pointers is by adding an extra dimension to the array and swapping the integer indexes:

integer, parameter :: n = 100000
integer :: pold, pnew, step, numsteps
real, pointer :: fwork(:,:,:) => null()

numsteps = 10000
allocate(fwork(n,9,2))
pold = 1
pnew = 2

do step = 1, numsteps

  ! ... update fwork(:,:,pnew) using values of fwork(:,:,pold) ....

  swap: block
    integer :: ptmp
    ptmp = pnew
    pnew = pold
    pold = ptmp
  end block swap
end do
4 Likes

Like many others, I rarely use pointers. But then I learned Fortran before being exposed to C. Also, because pointers are effectively aliases, their use can be detrimental to optimization. Which, in the end, is why most people avoid them, There are a couple of exceptions where I find pointers to be useful:

Associating a simply-named pointer with a target that has a long name makes subsequent programming easier to both type and read.

Pointer assignment allows bounds remapping - the pointer can have a different shape than the target (which has to be simply contiguous for this to be permitted). The new bounds cannot specify an object that is larger than the target. This sort of shape-change was common for explicit-shape array dummy arguments and was expanded to work with direct pointer assignment as well. (See the earlier comment by pmk regarding the similarity of pointer association and argument association.)

1 Like

I would say, no. The point of allocating scalars is that Fortran scalars could be megabytes in size.

character(len=1000000), allocatable:: big

If this is an entity you might not need in a subroutine, making it allocatable will save resources. It will also probably allocate it on the heap rather than the stack (but these are not Fortran concepts but Fortran-compiler practicalities).

There are two ways of thinking about Fortran concepts. One is to imagine a normal machine with a cpu, registers, a stack, addressable memory, I/O, instruction pointer, condition flags etc. Another is to remain abstract. The Fortran standard does not even mention the word “memory” or “address” (except in the C interoperability sections). The deepest understanding comes from doing both these approaches. If you come from assembly language or C, you are firmly in the first mindset. Arrays are just a memory location (an address) with some implied address-step to get to the next element. Fortran arrays are much richer than that. Imagine you store a 10x10 matrix in memory, in Fortran order, so that the first element (where your address points) is row,col=(1,1), the next is row,col=(2,1), the eleventh is row,col=(1,2), etc. How would you pass the diagonal of the matrix as a one-dimensional array to another procedure? In assembly/C, you cannot. You must also pass another quantity: the stride. In Fortran, you can just pass an array using the POINTER attribute. See the example in Modern Fortran Explained (re edition , f2018), 7.15 Pointer assignment. Fortran is really a much higher level language than C.

Assembly/C works with addresses: you can create them out of thin air and proceed to write very nice code or shoot your foot off. Fortran has sliced things finer: you get ALLOCATABLE which is safer than C’s addresses (you can’t magic them up, they get deallocated automatically, no memory leaks possible). And for the remaining 10-20% of use cases, you get POINTER with which you can still shoot your foot off but the compiler (if you ask nicely and it is a sufficiently good compiler) can help you notice when you are about to do that and terminate the program instead. Plus, you get much more functionality out of a POINTER than you get from a memory address.

I like to think of ALLOCATABLE as a POINTER “jailed” by procedure boundaries and of a POINTER as a “wild” ALLOCATABLE that escapes and is long-lived but is not always safe to
reference (ALLOCATABLE reference can always be protected by an IF(ALLOCATED(var)) THEN/ENDIF). Note that there is no safe way to do this with a POINTER, because POINTER can be in one of 3 states: undefined, not-associated and associated. You can ask IF(ASSOCIATED(var)) but that already assumes var is in one of the last 2 states, otherwise it’s foot-off time.

Notice that Fortran does not have syntax for “dereferencing” POINTER, there is no equivalent to “*p=20;”. Normal assignments (“=”) always dereference, i.e., mean modify the target of the POINTER (and if the pointer has no target, your program is wrong and is about to have your foot off). Pointer assignment (“=>”) mean the POINTER (on the left) has a new target (on the right, which could also be a POINTER). So, when you wrote “pt => tgt” you made tgt the target of POINTER pt, so the subsequent “pt = 3” means change the value of the target of pt (your target variable, here tgt) to 3. (I changed the variable names slightly to make things clearer, hopefully).

6 Likes

Yeah, at the moment me too.

This is a beautiful example to use pointers. Thanks for sharing!

Yes! I’m not C programmer but my approach to understand pointer has always been from that perspective and I think that was the error…

Totally right. Is clearer in that example.

Great. This is the second important thing I needed to know.

Sure, pretty insightful explanation!