Recursive subroutine stack overflow?

Dear all,

I’m stuck with a weird issue on Windows with MSYS2-gfortran (No problems on macOS and UNIX systems).

I have a 3D tree data structure that is built recursively; it also calls a pure recursive sorting routine internally. The sorting routine is used to keep track of point IDs between the scattered dataset and the tree ordering. Both routines have the bare minimum of local variables and use no local arrays:

From gdb, I find that for mildly large datasets (~10^5 points, at tree depth ~3,000), this causes a segfault on Windows and I’m worried I may be hitting the Windows stacksize which seems to be 1MB by default.

Both the tree and the sorting routines would be counterintuitive to write again in a non-recursive way, so I’m looking for ways I could work around this. Are there ways/coding strategies this could be controlled with that do not involve playing with compiler flags? Appreciate any hints!

@FedericoPerini ,

Have you looked at /STACK option with the Microsoft linker that I presume is ultimately employed by this program? Please see here.

As you have realized, the default on Windows at 1 MB is too low to be practical for simulations. If you’re on 64-bit OS and working on a 64-bit target, trying out significantly larger stack sizes via the above parameter may prove useful.

Thank you @FortranFan! I’m experimenting with -fmax-stack-var-size=XYZ as suggested by @shahmoradi in -frecursive .vs. fmax-stack-var-size .vs. -unlimit -s , but now I’m getting an internal compiler error elsewhere with the associate construct:

   48 |       associate(watch=>settings%timers%diffusion,s=>settings)
      |                                                             1
internal compiler error: Segmentation fault
libbacktrace could not find executable to open
Please submit a full bug report, with preprocessed source (by using -freport-bug).

It must have to do with the fact that associate creates stack variables?

Not sure, an ICE is painful, a compiler bug nonetheless.

Since these issues you are facing have to do with Windows OS, I suggest trying out Intel oneAPI processor (s), say IFORT, with Microsoft linker. That might give you hints you can apply with gfortran.

You’re right, though it looks like my issue (highlighted by Windows’ tiny stack space) is part of a bigger picture: N.M. Maclaren - Re: Dealing with default recursive procedures in Fortran (gnu.org)). My understanding is:

  • recursive procedure put local variables in the stack
  • you can use fmax-stack-var-size, but that will put larger variables in static memory (SAVE), and that is not thread safe
  • F2018 wants all routines recursive by default, so gfortran will in the future have to move these variables to the heap, to avoid many of these issues that would arise everywhere

I’ve found a workable solution going all-in on Fortran 77 (actually F90) i.e. creating a “working array” that contains ALL variables (including scalars) needed by the recursive function, and supplying that as a dummy variable:

    ! A working array for the recursive procedure, to avoid stack overflow
    type, private :: tree_work
        integer :: npoints,j,m,nsame,left_ubound,right_lbound,msave
        type(vector) :: box
    end type tree_work

    ! build binary tree
    recursive subroutine new_tree_node([...],work)
        type(tree_work), intent(inout) :: work
        ! NO LOCAL VARIABLES AT ALL!
    end subroutine new_tree_node

Not great, but works!

1 Like

If you are looking for an effective way to change the stack size, the following may help.
The gfortran documentation for changing the stack size on windows can be difficult to find.
I got this from documentation of how to provide options to linking.

The following .bat file has worked for me to set the stack to 500 MBytes.
In this build, the text file “load_gf.txt” lists all .o files to be included

rem  all .o files are listed in load_gf.txt

set program=%1
set tce=load_gf.tce
set load=gfortran

set stack_options=-Wl,-stack,536870912,-Map=%program%.map

del %program%.map
del %program%.exe

now >> %tce%

%load% @load_gf.txt -fopenmp -fstack-arrays %stack_options% -o %program%.exe >>%tce%  2>&1

dir %program%.* /od >>%tce%

notepad %tce%

I think the stack size provided is rounded to the nearest memory page size. From memory the resulting size is reported in the .map file.

Use of environment variables for compile and link options can clarify the build text.

1 Like

If you are able to exploit tail recursion optimization in your code, you may be able to process larger data sets than at present without running out of stack. This kind of tail recursion elimination has turned out to be quite useful with quicksort. It requires far less additional code than to implement and maintain your own stack management code and thereby do away with recursion altogether.

2 Likes

Recently, I encountered a recursive function which had this problem:

! Add two integers using bit-wise operations (carry-and-add)
pure recursive function iadd(a,b) result(c)
   integer, intent(in) :: a, b
   integer :: c
   c = merge(a, iaddr(ieor(a,b),shiftl(iand(a,b),1)), b == 0)   ! WARNING (!)
end function

Apparently with some compilers, the .false. source operand can be evaluated before the mask is checked, causing the procedure to segfault.

The solution was to rewrite the routine with an if block:

pure recursive function iadd(a,b) result(c)
   integer, intent(in) :: a, b
   integer :: c
   if (b == 0) then
       c = a
       return
   end if
   c = iaddr(ieor(a,b),shiftl(iand(a,b),1))
end function

In C, and since F2023 also in Fortran, you could use the ternary-if:

int iaddr(int a, int b) {
    return b == 0 ? a : iaddr(a ^ b, (a & b) << 1);
}

Recursion isn’t a particularly efficient way of doing c = a + b, moreover the iterative algorithm requires much less stack space:

pure function iadd(a,b)
   integer, value :: a, b
   integer :: carry, iadd
   do while(b /= 0)
      carry = iand(a,b)
      a = ieor(a,b)
      b = shiftl(carry,1)
   end do
   iadd = a
end function