Automatic finalization of derived types

ivanpribec · November 18, 2021, 5:16pm

Say I have a derived type which owns a C array. The purpose of doing this is to guarantee the C memory will be suitably finalized. Below are my C memory management routines and the Fortan wrapper:

// resource.c

#include "stdlib.h"
#include <stdio.h>

double* allocate_resource(int n) {
    double* ptr;
    /* Array of n doubles. */
    ptr = malloc(n * sizeof(double));
    return ptr;
}

void free_resource(double* ptr)  {
    free(ptr);
}

! main.f90

module test

  use, intrinsic :: iso_c_binding
  implicit none
  private
  
  integer, parameter, public :: dp = c_double
  
  public :: assign_resource
  public :: resource_owner
  
  type :: resource_owner
    type(c_ptr) :: ptr = c_null_ptr
    real(c_double), pointer :: a(:) => null()
  contains
    procedure :: print
    final :: finalize
  end type

  !
  ! C routines that allocate and free memory (resource.c)
  !
  interface
    function allocate_resource(n) bind(c,name="allocate_resource")
      import c_int, c_ptr
      integer(c_int), intent(in), value :: n
      type(c_ptr) :: allocate_resource
    end function
    subroutine free_resource(ptr) bind(c,name="free_resource")
      import c_ptr
      type(c_ptr), intent(in), value :: ptr
    end subroutine
  end interface

contains

  subroutine assign_resource(owner,n)
    type(resource_owner), intent(out) :: owner
    integer(c_int), intent(in) :: n
    print *, "initialize(owner,n)"
    owner%ptr = allocate_resource(n)
    call c_f_pointer(owner%ptr, owner%a, [n])
    if (.not. associated(owner%a)) then
      print *, "Resource allocation failed"
      stop
    end if
  end subroutine

  subroutine print(this)
    class(resource_owner), intent(in) :: this
    print*, "print(this)"
    print*, this%a
  end subroutine

  subroutine finalize(this)
    type(resource_owner) :: this
    print *, "finalize(this)"
    if (associated(this%a)) then
      nullify(this%a)
      call free_resource(this%ptr)
      this%ptr = c_null_ptr
    end if
  end subroutine

end module

program main
  use test
  implicit none
  
  type(resource_owner) :: owner

  call assign_resource(owner,4)
  owner%a = [real(dp) :: 1.0, 2.0, 3.0, 4.0]
  call owner%print()

  ! owner is not finalized automatically in main program block

end program

Compiling is done with the commands:

gcc -c resource.c
gfortran -o main main.f90 resource.o

What I see as output is

>main.exe
 finalize(this)
 initialize(owner,n)
 print(this)
   1.0000000000000000        2.0000000000000000        3.0000000000000000        4.0000000000000000

I would expect another call to finalize, just before the program exits. Is it acceptable to leave the C memory not freed?

A simple solution exists, which is to place everything in the main program block into a subroutine:

program main
  
  call main_with_finalization()

contains 

  subroutine main_with_finalization()
    use test
    implicit none
  
    type(resource_owner) :: owner

    call assign_resource(owner,4)
    owner%a = [real(dp) :: 1.0, 2.0, 3.0, 4.0]
    call owner%print()

  end subroutine
  
end program

With this modification I can see two calls to finalize in the output, giving me the assurance that the C memory has been freed.

This brings me to two questions:

Why is the behavior of the main program unit with respect to finalization different from subprograms?
Should placing the “driver” code in a subprogram be considered best practice?

themos · November 18, 2021, 7:19pm

28 integer(c_int), intent(in),value :: n

themos · November 18, 2021, 7:31pm

owner never goes out of scope if declared in PROGRAM, hence never finalised.

7.5.6.4 Entities that are not finalized

If image execution is terminated, either by an error (e.g. an allocation failure) or by execution of a stop-stmt, error-stop-stmt, or end-program-stmt, entities existing immediately prior to termination are not finalized.
NOTE 1
A nonpointer, nonallocatable object that has the SAVE attribute is never finalized as a direct consequence of the execution of a RETURN or END statement.

ivanpribec · November 18, 2021, 7:38pm

Thanks for fixing the mistake and bringing up the relevant standard section.

Unfortunately, it doesn’t explain the reasoning why finalization is not needed in this case. Resources on programming in C are always paranoid about memory leaks and dangling pointers.

I’ve found this StackOverflow issue that gives me a partial answer: Is leaked memory freed up when the program exits?

My current understanding is we just don’t care about finalization at this point, because it will be the duty of the operating system to clean up once the process terminates anyways?

themos · November 18, 2021, 7:58pm

I believe that is indeed the rationale. Fortran originated in machines that rebooted after finishing a program. The operating system kernel has taken over that hardware job.

ivanpribec · November 18, 2021, 8:21pm

I’ve marked your reply as the solution to question #1.

After looking at a few more threads on memory leaks, there are cases where not finalizing a derived type could be a problem: c++ - What are the long term consequences of memory leaks? - Stack Overflow, quoting,

Much more serious leaks include those:

[…]

where the program can request memory — such as shared memory — that is not released, even when the program terminates

[…]

running on an operating system that does not automatically release memory on program termination. Often on such machines if memory is lost, it can only be reclaimed by a reboot, an example of such a system being AmigaOS.

FortranFan · November 18, 2021, 9:50pm

Yes, that’s essentially the thought process though in Fortran parlance, it will be considered “left up to the processor”.

Another way to look at this is via the lens of the abominable “implicit SAVE”: the standard states, “A variable … declared in the scoping unit of a main program … implicitly has the SAVE attribute.”

And as pointed out upthread, an object with the SAVE attribute is never finalized due to END (or RETURN) statement.

FortranFan · November 18, 2021, 9:58pm

Dunno about “best practice” yet, but another option to consider with Fortran 2008 and later revision supporting compilers is to place such code in a BLOCK construct:

module m
   type :: t
   contains
      final :: final_t
   end type
contains
   impure elemental subroutine final_t( this )
      type(t), intent(inout) :: this
      print *, "In final_t"
   end subroutine 
end module
   use m
   block
      type(t) :: a
   end block
end

C:\Temp>a.exe
In final_t

ivanpribec · November 18, 2021, 10:55pm

Which rule applies then to allocatable arrays? (Note 1 above explicitly states non-allocatable objects.)

AFAIK, allocatable arrays declared in the main program also aren’t deallocated.

FortranFan · November 19, 2021, 4:15am

Again, the SAVE aspect comes into play here as well. The automatic deallocation of allocatable objects is for unsaved local variables in procedures, provided they are not a function result.

And situations with no deallocation imply no finalization.

plevold · November 19, 2021, 2:01pm

This is unfortunately very difficult to get right generally. I don’t think you need to be concerned about leaked memory not being cleaned up after the program terminating, but depending on how you use type(resource_owner) code similar to the following example could cause very real problems:

type(resource_owner) :: owner
call assign_resource(owner,4)
owner%a = [real(dp) :: 1.0, 2.0, 3.0, 4.0]
! Now we have one Fortran variable pointing to one C array
block
    type(resource_owner) :: owner_copy
    owner_copy = owner
    ! Now we have two Fortran variables of type(resource_owner), but both
    ! point to the same C array since type(c_ptr) :: ptr has been copied
end block

When execution returns from the inner (block) scope owner_copy has gone out of scope and finalization is triggered. This causes the C array to be deallocated. This means that owner%ptr is now invalid because of what has happened to owner_copy! Not good!

I think there’s two possible solutions to this:

Overload the assignment operator for the type and have it copy the C array. Now each instance can own its own C array
If the C data is something more complex than an array which is not easily copied you might be able to have a type(c_ptr) which points to a C++ shared_ptr which keep track of the number of instances pointing to the data and deallocate it once the last instance goes out of scope. A raw C pointer stored in a Fortran type pointing to a C++ shared_ptr, now that’s complexity for you right here!

To make matters worse I’ve tried the latter (I’ll see if I can dig up the code) and I had some issues with Intel Fortran and gfortran handling the finalization slightly different. The result was that I could not make this work in gfortran without a memory leak, but I’m not sure if that was beacuse of a compiler bug or not.

ivanpribec · November 19, 2021, 2:06pm

A solution in the spirit of your second solution, is to do reference counting on the Fortran side. There are a few resources on this:

A reference counting implementation in Fortran 95/2003
GitHub - sourceryinstitute/smart-pointers: An object-oriented, extensible reference-counting utility for Fortran

I’d definitely like to explore extending the assignment operator.

plevold · November 19, 2021, 2:08pm

Good point. That’s at least one less language to think about!

FortranFan · November 19, 2021, 10:02pm

plevold:

! Now we have one Fortran variable pointing to one C array
block
    type(resource_owner) :: owner_copy
    owner_copy = owner
    ! Now we have two Fortran variables of type(resource_owner), but both
    ! point to the same C array since type(c_ptr) :: ptr has been copied
end block
… I think there’s two possible solutions to this:

Overload the assignment operator for the type and have it copy the C array. Now each instance can own its own C array

Not a solution per se, but as a design consideration it may be worthwhile to consider a “method” (TBP) to clone:

block
   type(resource_owner) :: owner_copy
   call owner%clone( owner_copy ) !<-- the cloning method does the needful with the "data" (C array, etc.)
   ..
   ! Exit from block "cleans" up the clone
end block

ivanpribec · November 27, 2021, 11:22pm

I was just reading the book Code Craft: The practice of writing excellent code by Pete Goodliffe, and found the following item (pg. 13, Chapter 1) which rang a bell:

Handle Memory (and Other Precious Resources) Carefully

Be thorough and release any resource that you acquire during execution. Memory is the example of this cited most often, but it is not the only one. Files and thread locks are other precious resources that we must use carefully. Be a good steward.

Don’t neglect to close files or release memory because you think that the OS will clean up your program when it exits. You really don’t know how long your code will be left running, eating up all file handles or consuming all the memory. You can’t even be sure that the OS will cleanly release your resources—some OSes don’t.

There is a school of thought that says, “Don’t worry about freeing memory until you know your program works in the first place; only then add all the relevant releases.” Just say no. This is a ludicrously dangerous practice. It will lead to many, many errors in your memory usage; you will inevitably forget to free memory in some places.

rouson · November 30, 2021, 8:58am

This advice makes perfectly good sense in the context of the original posting: managing memory allocated in C. However, I wouldn’t necessarily apply this same advice in other contexts. For example, if the Fortran program allocates memory via an allocatable variable that doesn’t have the save attribute explicitly or implicitly, I would follow the advice of my book co-author Jim Xia, who was on the IBM compiler test team and the Fortran standard committee. Jim advised, “Let the compiler do its job.” If the compiler is obligated to free the memory, as is the case for allocatable variables, then it’s not clear that there will generally be an advantage to the programmer doing it proactively and I can imagine some possible disadvantages from a performance perspective.

rouson · November 30, 2021, 9:10am

In my final sentence, I should have written “as is the case for allocatable variables that don’t have the save attribute implicitly or explicitly…” As noted in an earlier message in this thread, compiler implementations of finalization can be uneven. In particular, GFortran still doesn’t get some aspects of finalization quite right (as tracked by a GCC meta-issue) so it might be better to not force the issue unless one is trying to address a specific problem that has arisen when main program variables aren’t finalized.

ivanpribec · November 30, 2021, 11:50am

Thanks @rouson for the extra clarification. I almost never use manual deallocation, except with pointer variables.

As indicated earlier by @FortranFan, allocatable variables in the main program scope have an implicit save. It’s here where I am perplexed, whether I should just leave it to the operating system kernel, or do it myself as a good habit (even if only to prevent false positive memory leaks).

Consider the program:

program main
  implicit none
  real, allocatable :: a(:)
  allocate(a(3))
  a = 1
  print *, a
end program

$ gfortran -o main main.f90
$ ./main
   1.00000000       1.00000000       1.00000000

Now running the application through valgrind you get

$ valgrind --leak-check=full ./main
==4854== Memcheck, a memory error detector
==4854== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==4854== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==4854== Command: ./main
==4854== 
   1.00000000       1.00000000       1.00000000    
==4854== 
==4854== HEAP SUMMARY:
==4854==     in use at exit: 12 bytes in 1 blocks
==4854==   total heap usage: 22 allocs, 21 frees, 13,596 bytes allocated
==4854== 
==4854== 12 bytes in 1 blocks are definitely lost in loss record 1 of 1
==4854==    at 0x4C31B0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4854==    by 0x1089CA: MAIN__ (in /home/ipribec/Desktop/main)
==4854==    by 0x108AE8: main (in /home/ipribec/Desktop/main)
==4854== 
==4854== LEAK SUMMARY:
==4854==    definitely lost: 12 bytes in 1 blocks
==4854==    indirectly lost: 0 bytes in 0 blocks
==4854==      possibly lost: 0 bytes in 0 blocks
==4854==    still reachable: 0 bytes in 0 blocks
==4854==         suppressed: 0 bytes in 0 blocks
==4854== 
==4854== For counts of detected and suppressed errors, rerun with: -v
==4854== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

However if you add a deallocate statement at the end of the main program scope, valgrind insteads outputs:

$ valgrind --leak-check=full ./main
==4923== Memcheck, a memory error detector
==4923== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==4923== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==4923== Command: ./main
==4923== 
   1.00000000       1.00000000       1.00000000    
==4923== 
==4923== HEAP SUMMARY:
==4923==     in use at exit: 0 bytes in 0 blocks
==4923==   total heap usage: 22 allocs, 22 frees, 13,596 bytes allocated
==4923== 
==4923== All heap blocks were freed -- no leaks are possible
==4923== 
==4923== For counts of detected and suppressed errors, rerun with: -v
==4923== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Topic		Replies	Views
Finalization of array of derived type Help	6	533	February 24, 2023
Final procedure for polymorphic base type Help	4	576	August 10, 2022
Managing derived types with pointers Help	13	1259	September 9, 2023
Automatic array of derived type with allocatable component GNU	2	227	August 15, 2024
Gfortran-13/14 finalization: issue, called from deallocated memory? GNU	8	529	June 20, 2024

Automatic finalization of derived types

Handle Memory (and Other Precious Resources) Carefully

Related topics