Gfortran bug or am I missing something?

Hey guys, I’ve been working with Fortran about 6 years now but only started playing with it for some personal projects recently. Today I came across a very strange bug, which seems to only affect gfortran. I have a simple math module which implements sum, average, and variance calculations as below:

math module with assumed rank functions
module math
implicit none
private

    public :: sum_reduce
    public :: avg
    public :: var

    contains

        function var(x) result(y)
        implicit none
            real, intent(in) :: x(..)
            real :: y
            select rank (x)
                rank (0)
                    y = 0.0
                rank (1)
                    write(*,*) 'size(inner): ',size(x - avg(x))
                    y = sum_reduce((x - avg(x))**2)/real(size(x))
                rank default
                    error stop '- VAR only defined for scalar or vector <x>'
            end select
        end function var

        pure function avg(x) result(y)
        implicit none
            real, intent(in) :: x(..)
            real :: y
            select rank (x)
                rank (0)
                    y = x
                rank (1)
                    y = sum_reduce(x)/real(size(x))
                rank default
                    error stop '- AVG only defined for scalar or vector <x>'
            end select
        end function avg
    
        pure recursive function sum_reduce(x) result(y)
        implicit none
            real, intent(in) :: x(..)
            real :: y
            integer :: n
            select rank (x)
                rank (0)
                    y = x
                rank (1)
                    n = size(x)
                    if (n.gt.2) then
                        y = sum_reduce(x(1:(n/2))) + sum_reduce(x(((n/2)+1):n))
                    else
                        y = sum(x)
                    end if
                rank default
                    error stop '- SUM_REDUCE only defined for scalar or vector <x>'
            end select
        end function sum_reduce

end module math

My test program is as follows:

test program
program main
use math
implicit none

    integer, parameter :: n = 2000000

    real :: x(n), x_sum, x_avg, x_var

    call random_number(x)
    write(*,'(a,i0,2(a,e13.6))') 'x has ',size(x),' elements, minval: ',minval(x),', maxval: ',maxval(x)

    x_sum = sum_reduce(x)
    write(*,*) 'x_sum: ',x_sum
    
    x_avg = avg(x)
    write(*,*) 'x_avg: ',x_avg
    
    x_var = var(x)
    write(*,*) 'x_var: ',x_var

end program main

Which yields output:

assumed rank variance calculation fails with array out of bounds
fpm: Entering directory '/home/tyranids/projects/math-error'
Project is up to date
x has 2000000 elements, minval:  0.119209E-06, maxval:  0.100000E+01
 x_sum:    1000636.81    
 x_avg:   0.500318408    
 size(inner):      2000000
At line 51 of file app/math.f90
Fortran runtime error: Index '2000000' of dimension 1 of array '__tmp_REAL_4_rank_1' outside of expected range (1999999:0)

Error termination. Backtrace:
#0  0x14d0046b4ad0 in ???
#1  0x14d0046b5649 in ???
#2  0x14d0046b5c46 in ???
#3  0x5575e14e7c7f in __math_MOD_sum_reduce
	at app/math.f90:51
#4  0x5575e14e8433 in __math_MOD_var
	at app/math.f90:20
#5  0x5575e14e7739 in MAIN__
	at app/main.f90:18
#6  0x5575e14e77ff in main
	at app/main.f90:2
<ERROR> Execution failed for object " math-error "
<ERROR>*cmd_run*:stopping due to failed executions
STOP 1

fpm appears to run with array bounds checking enabled by default, which is what caught this. I can confirm that with ifort, even with -check all, this does not happen. Why is gfortran indexing the temporary array it creates starting from 0? From the debug print added in var, I can see that the size of the arguments provided to sum_reduce is the expected 2000000. I can also see that sum_reduce works correctly for both the direct call and the one from avg, both print the expected value earlier than the call to var, and neither trips the array out of bounds check.

After additional investigation, I have copied the math module to a new math2 version, which uses all assumed shape arrays of dimension 1, since that is my currently failing use case. With this new code, I get no errors and it works as expected. Is there a known problem with chaining assumed rank procedures together in gfortran, but not Intel’s compilers?

math module with assumed shape arrays, seems fine with all compilers
module math2
implicit none
private

    public :: sum_reduce
    public :: avg
    public :: var
    public :: std

    contains

        pure function std(x) result(y)
        implicit none
            real, intent(in) :: x(:)
            real :: y
            y = sqrt(var(x))
        end function std

        pure function var(x) result(y)
        implicit none
            real, intent(in) :: x(:)
            real :: y
            integer :: n
            n = size(x)
            if (n.gt.1) then
                y = sum_reduce((x - avg(x))**2)/(real(n) - 1.0)
            else
                y = 0.0
            end if
        end function var

        pure function avg(x) result(y)
        implicit none
            real, intent(in) :: x(:)
            real :: y
            integer :: n
            n = size(x)
            if (n.gt.1) then
                y = sum_reduce(x)/real(n)
            else
                y = x(1)
            end if
        end function avg
    
        pure recursive function sum_reduce(x) result(y)
        implicit none
            real, intent(in) :: x(:)
            real :: y
            integer :: n
            n = size(x)
            if (n.gt.2) then
                y = sum_reduce(x(1:(n/2))) + sum_reduce(x(((n/2)+1):n))
            else
                y = sum(x)
            end if
        end function sum_reduce

end module math2

==========
I originally thought it was coming from here. Upon further investigation, this was found to not be the case. The following code implements a simple box-muller transform to produce normal random numbers with mean v_mu and standard deviation v_sig:

box-muller transform for normal random numbers
subroutine normrnd(vals, n, v_mu, v_sig)
implicit none
    real, parameter :: twopi = 2.0*acos(-1.0)
    real, intent(out) :: vals(*)
    integer, intent(in) :: n
    real, intent(in) :: v_mu, v_sig
    real :: u(2), r
    integer :: i
    do i=1,n                                  
        if (mod(i,2).eq.1) then
            call random_number(u)
            do while (u(1).eq.0.0)
                call random_number(u(1))
            end do
            r = v_sig*sqrt(-2.0*log(u(1)))
            vals(i) = r*cos(twopi*u(2)) + v_mu
        else
            vals(i) = r*sin(twopi*u(2)) + v_mu
        end if
    end do
end subroutine normrnd
1 Like

A complete reproducer with a little main program that calls this in a loop would be very helpful. I am not sure what criteria you are using for “undefined”. The array is dimensioned with an asterisk but as intent(out) so if n is not the size of the array, part of it really could be undefined although most compilers are probably pretty much ignoring that; but technically everything in an intent(out) is undefined except for what you explicitly set so if N is not the size of input array values outside of the range 1:N could be in trouble; and if you are thinking it is “undefined” because a result is negative that would be likely that ifort/ifx would not do that but that gfortran would with or without the -flto.

If not a reproducer, adding checks in the routine to look for something “undefined” before you return it and printing the input values and stopping if something wrong is detected might give a clue, even if the clue is the values only look bad from the calling location instead of from inside
the procedure.

program main
   implicit none
   real, parameter :: twopi = 2*3.14159265358979323846264338327950288419716939937510582097494459230
   integer, parameter :: sz = 10000
   real :: values(sz)
   real :: v_mu=2.0, v_sig=0.50
   integer :: i, j
   do j = 1, 10000
      values = -huge(0.0)
      call normrnd(values, sz, v_mu, v_sig)
      write(*,*)values
      do i = 1, sz   ! check for whatever you think looks bad ...
         if (values(i) .le. 0.0) then
            write (*, *) 'pass', j, 'index', i, 'value', values(i)
         end if
      end do
   end do
contains
   YOUR ROUTINE with a check on exit for "bad" values
end program main

where this is just junk; but you would add real values for v_mu, v_sig, n, and a check for whatever you think is “undefined” (which might just be a value you did not expect?). If the problem cannot be reproduced with a little program it might be an indication something else is causing values to see if they go bad at some other point; and compile your real program with all the debug switches it can stand.Could be a problem with link time optimization of course.

1 Like

Thank you for your reply. I’ve investigated the error further and located the actual source of the error. For now, I believe the original box-muller routine is fine.

The OP has been updated with my current findings. In short, gfortran does not seem to like chaining together too many assumed-rank functions, but the assumed-shape array version works as expected. I do not understand why this is the case, or why gfortran seemingly chooses to switch to 0-based array indexing in my pure function var but not for avg or sum_reduce.

I would like to suggest a different approach to declare, and eventually assign a value to the twopi parameter. That is – constraining the example to this parameter:

program main
  use fortran_iso_env, only : qp => real128

  implicit none
  real(kind=qp), parameter :: twopi = 2 * 3.14159265358979323846264338327950288419716939937510582097494459230_qp

  print *, "Check again Fortran's precision of twopi: ", twopi
end program main

This both explicit and portable (compiler independent) definition of quadruple / long double precision covers about half of the decimals shown. If it is not necessary to carry this many decimals in the computation, real64 can be a compromise between better precision (compared to real32), and resources to provide (compared to real128).

Thank you for your reply, but this is unrelated to the issue of the OP - namely that gfortran appears to be using 0-based indexing on a temporary array created in the assumed rank function var. Oddly, it doesn’t seem to use this behavior in the assumed rank functions avg or sum_reduce. I am looking for explanations about this, or any advice regarding something I’ve done that made the compiler generate this behavior. The same routines work fine in both ifort and ifx.

Hi @tyranids

Could you try to initialize x(:) in some deterministic way (rather than using random_number), for example, as

    !! call random_number(x)

    do i = 1, n
        x( i ) = 1.0 / i
    enddo

or similar, so that we can compare results among different computers / environments more easily? (But for such initialization, we need to be careful about possible overflow also…)

Also, I think it will be useful to make a “reference” code that does not utilize the assumed-rank feature but performs the same calculation (to get the “correct” result). For example, we can make assumed-rank and assumed-shape versions. Comparison of the two codes may reveal more info. (I tried a bit, but the results differ between two approaches…)

PS. I got the same out-of-bounds run-time error when I attached -fcheck=all with gfortran-10, 11, and 12.

@septc, I have already done as you suggested with regards to an assumed shape array version. It is posted in the OP in the drop down about math2 module using assumed shape arrays. That appears to work correctly. I am also fairly confident that the issue is gfortran using 0 based indexing on a temporary array for the assumed rank version. Or at least, that’s what the error message says.

I am wondering if there are additional semantics about assumed rank that I am currently unaware of my ignorance about. For example why do assumed rank sum_reduce and avg have no issues, but var, which calls both of those, does?

@tyranids Ohhh, sorry!! I haven’t read the post until the last part (while moving to my terminal to test the first code… :sweat_smile: ) Yes, that was what I was asking in the above post.

In my test, gfortran-10,11,12 seem to give wrong results when using assumed-ranks, while correct results with assumed-shape arrays. When -fcheck=all is attached, all versions show out-of-bounds errors (same as OP). I guess this is a compiler bug, but I cannot say for sure because I am not familar with this feature (sorry for not helpful…). Also, to report the issue, I guess it might be useful if the code is further simplified somehow.

I will make a more simple example, but at least from what I have now, the assumed rank code seems to work for a basic function, even 2 of them chained together, but not 3 levels. I don’t think inlining has anything to do with it because this happens even at -O0.

I can in fact achieve the same error with the following code:

all-in-one small example
program main
implicit none

    integer, parameter :: n = 10

    integer :: i
    real :: inputs(n), output

    do i=1,n
        inputs(i) = real(i)
    end do
    write(*,*) 'inputs: ',inputs

    output = f1(inputs)
    write(*,*) 'f1(inputs): ',output

    output = f2(inputs)
    write(*,*) 'f2(inputs): ',output

    output = f3(inputs)
    write(*,*) 'f3(inputs): ',output

    contains

        pure function f1(x) result(y)
        implicit none
            real, intent(in) :: x(..)
            real :: y
            select rank (x)
                rank (1)
                    y = sum(x(1:size(x)))
                rank default
                    error stop '- F1'
            end select
        end function f1

        pure function f2(x) result(y)
        implicit none
            real, intent(in) :: x(..)
            real :: y
            select rank (x)
                rank (1)
                    y = f1(x)
                rank default
                    error stop '- F2'
            end select
        end function f2

        pure function f3(x) result(y)
        implicit none
            real, intent(in) :: x(..)
            real :: y
            select rank (x)
                rank (1)
                    y = f1(x - f2(x))
                rank default
                    error stop '- F3'
            end select
        end function f3

end program main

Interestingly, I get no error if f1 instead uses the intrinsic sum(x) without specifying the bounds. If there way a way to get array bounds used by an input, then I could solve the problem I suppose. Still, it is odd that gfortran chooses to internally create 0-indexed arrays, while neither ifort nor ifx do this, and none of them emit any warnings or errors that this may be happening.

If you do not make the function pure and add a write statement just above the sum it verifies it is a bug; as you can query the bounds …

  write(*,*)'f1 x=',x,'lbound=',lbound(x,dim=1),'ubound=',ubound(x,dim=1),'size=',size(x)

shows it at least does not lie about the bug, but that is a bug. You could just do “sum(x)” or
query and use the bounds

i=lbound(x,dim=1)
j=ubound(x,dim=1)
sum(i:j)

but it actually shows the bounds have become 0 and size(x)-1; which is wrong. gfortran compiles and runs without debug switches but of course gets the wrong answer (it of course should be -495).

I had changed the example a bit to use all integers and such, but basically it actually printed bounds of 0 and 9:

inputs:  1 2 3 4 5 6 7 8 9 10
f1 x= 1 2 3 4 5 6 7 8 9 10 lbound= 1 ubound= 10 size= 10
 y1=          55 y=          55 T
f1(inputs):  55 PASSED
f1 x= 1 2 3 4 5 6 7 8 9 10 lbound= 1 ubound= 10 size= 10
 y1=          55 y=          55 T
f2(inputs):  55 PASSED
f1 x= 1 2 3 4 5 6 7 8 9 10 lbound= 1 ubound= 10 size= 10
 y1=          55 y=          55 T
f1 x= -54 -53 -52 -51 -50 -49 -48 -47 -46 -45 lbound= 0 ubound= 9 size= 10
 y1=        -495 y=      115864 F
f3(inputs):  115864 FAILED
1 Like

After a little testing this is starting to feel familiar. I think I reported something like this.
A simplified version that I think is hitting the same bug shows that it is that the bounds are incorrect when an expression is passed (versus an array) inside the select rank but not outside of it in general. So I do not think it is the nesting level. Oddly I have not found the bugzilla report so maybe I did not submit it, but this is a shorter reproducer I think. Still testing, but right now using:

program main
implicit none
character(len=*),parameter :: g='(*(g0,1x))'
integer, parameter :: n = 10
integer :: i, inputs(n)=[(i,i=n,1,-1)] 
   print g, 'f3(inputs): ', merge('PASSED','FAILED',f1(inputs).eq.55)
   ! if an expression is passed bounds are wrong rank
   print g, 'f3(inputs): ', merge('PASSED','FAILED',f1(inputs+0).eq.55 )
contains
function f1(x) result(y)
integer, intent(in) :: x(..)
integer :: y
   print g, 'F1: before select rank:   lbound=',lbound(x,dim=1),'ubound=',ubound(x,dim=1),'size=',size(x),'rank=',rank(x)
   select rank (x)
   rank (1)
      print g,'F1: in select rank:x=',x,'lbound=',lbound(x,dim=1),'ubound=',ubound(x,dim=1),'size=',size(x),'rank=',rank(x)
      y=sum(x(1:size(x)))
   rank default; error stop '- f1'
   end select
end function f1
end program main
/a.out
f3(inputs): F1: before select rank:   lbound= 1 ubound= 10 size= 10 rank= 1
F1: in select rank:x= 10 9 8 7 6 5 4 3 2 1 lbound= 1 ubound= 10 size= 10 rank= 1
 PASSED
f3(inputs): F1: before select rank:   lbound= 1 ubound= 10 size= 10 rank= 1
F1: in select rank:x= 10 9 8 7 6 5 4 3 2 1 lbound= 0 ubound= 9 size= 10 rank= 1
 FAILED

need to put OUTPUT back so not doing I/O in a function, but I think the underlying problem is when tmp arrays are created when an expression is passed.

Yes, in my testing earlier I found that the variance calculation worked if I had a local variable set inner=x-avg(x) and then called sum_reduce(inner) everything was fine. I have submitted a request for an account on the bugzilla, since apparently I need one of those to even submit bug reports.

It is also possible that this is already fixed in newer versions. I believe the current is 13.something, and I only have 11.3 installed on this machine.

combining everything in this thread so far, the error seems limited to expressions as arguments for assumed rank procedures in gfortran, the same procedure as assumed shape array works fine
program main
implicit none

    integer, parameter :: n = 10
    integer :: i, output
    integer :: input(n) = [(i, i=1,n)]

    write(*,*) 'input has ',size(input),' elements: ',input
    write(*,*) 'lbound(input: ',lbound(input),', ubound(input): ',ubound(input)
    write(*,*) 'intrinsic sum(input): ',sum(input)

    write(*,*) 'about to call `output = f_as(input)`'
    output = f_as(input)
    write(*,*) 'received output: ',output

    write(*,*) 'about to call `output = f_as(input+0)`'
    output = f_as(input+0)
    write(*,*) 'received output: ',output

    write(*,*) 'about to call `output = f_ar(input)`'
    output = f_ar(input)
    write(*,*) 'received output: ',output

    write(*,*) 'about to call `output = f_ar(input+0)`'
    output = f_ar(input+0)
    write(*,*) 'received output: ',output

    contains

        !! ASSUMED-RANK argument x
        function f_ar(x) result(y)
        implicit none
            integer, intent(in) :: x(..)
            integer :: y
            select rank (x)
                rank (1)
                    write(*,*) 'f_ar :: x has ',size(x),' elements: ',x
                    write(*,*) 'f_ar :: lbound(x): ',lbound(x),', ubound(x): ',ubound(x)
                    write(*,*) 'f_ar :: intrinsic sum(x): ',sum(x)
                    y = sum(x(1:size(x)))
                rank default
                    error stop '- f_ar'
            end select
        end function f_ar

        !! ASSUMED-SHAPE argument x
        function f_as(x) result(y)
        implicit none
            integer, intent(in) :: x(:)
            integer :: y
            write(*,*) 'f_as :: x has ',size(x),' elements: ',x
            write(*,*) 'f_as :: lbound(x): ',lbound(x),', ubound(x): ',ubound(x)
            write(*,*) 'f_as :: intrinsic sum(x): ',sum(x)
            y = sum(x(1:size(x)))
        end function f_as

end program main

That looks to be it. Title implied it was just about pointers. I was surprised more of these did not fail even with -O0, for reference.

program main
   implicit none
   integer,parameter :: const(4) = [1, 2, 3, 4]
   integer,save :: arr(4) = [1, 2, 3, 4]
   print *, f1([1, 2, 3, 4])
   print *, f1([1, 2, 3, 4] + 0)
   print *, f1(const)
   print *, f1(const + 0)
   print *, f1(arr)
   print *, f1(([1,2,3,4]))  
   print *, f1((const))      
   print *, f1(arr + 0)      ! fails
   print *, f1((arr))        ! fails
contains
   function f1(x) result(bounds)
      integer, intent(in) :: x(..)
      integer :: bounds(4)
      bounds(1:2) = [lbound(x), ubound(x)]
      select rank (x)
      rank (1); bounds(3:4) = [lbound(x), ubound(x)]
      rank default; error stop 'F1: rank not 1'
      end select
   end function f1
end program main