Why a simple code, Intel Fortran -O3 -O2 gives wrong results?

Dear all, I have met a bug, and the code which has extremely simplified, less than 60 lines.
30 lines versions by @Euler-37 here

In the main program, I call subroutine test01, in it I output an array xout.
xout is non-zero.
However, with Intel Fortran OneAPI, with
-O3 -O2 flags,
it simply gives zeros.
With O0 the result is correct.
No matter windows or linux.

However, gfortran has no problem.

Note than, at line 31,

	integer ( kind = i8 ), parameter :: n = 10, np = 10

If I set np <=8, the result is correct.
However if np>8, the result becomes all zero which is incorrect,
However, if I add some write statement, such as at line 17, 43, 45. The result become correct again.

Can anyone point out what could be the problem?
Thank you very much indeed!

The code is extremely simple as below,

module constants
	implicit none
	integer, public, parameter :: i8=selected_int_kind(15)
	integer, public, parameter :: r8=selected_real_kind(15,9)
	real(kind=r8), public, parameter :: zero=0.0_r8,one=1.0_r8
end module constants
module stochastic_rk
use constants  
implicit none
contains    
subroutine rk4_ti_step_mod3 ( x, xstar ) 
implicit none
real(kind=r8), intent(in) :: x 
real(kind = r8), intent(out) :: xstar
real(kind = r8) :: ks, xs 
xstar = x 
!write(6,*) 'x', x 
xs = x 
ks = - xs * 0.1_r8
xstar = xstar + ks
return
end subroutine rk4_ti_step_mod3 
end module stochastic_rk    
module tests 
implicit none
contains
subroutine test01
use constants
use stochastic_rk
implicit none
integer ( kind = i8 ), parameter :: n = 10, np = 10
integer ( kind = i8 ) :: i, itot, istart, istep, j, k
real ( kind = r8 ) :: x(0:n)
real (kind = r8) :: xout(5,np)
itot = n
istart = itot/5
istep = istart
do j = 1, np
    i = 0
    x(i) = 20.0_r8
    do i = 1, n
        call rk4_ti_step_mod3 ( x(i-1), x(i) ) ! check n=10 bug.
        !write ( *, '(2x,i8,2x,f14.6,2x,g14.6)' ) i, x(i-1), x(i)
    end do
    !write (6,'(''x = '',t20, 5(f15.7,1x))') x(istart:itot:istep)
    xout(1:5:1,j) = x(istart:itot:istep)
enddo    
write (6,'(5(f15.7,1x))') xout
return
end subroutine test01
end module tests    
program main
use tests
use stochastic_rk
implicit none  
call test01
stop ('Program end normally.')
end program main
1 Like

@CRquantum I tested in GFortran, it seems to work. I don’t see anything wrong at first sight.

What you can do is try to simplify the code to produce a minimal example that still reproduces the bug. Once the code becomes just a few lines, it will be much easier to see if it is a bug in a compiler.

P.S. I edited the title of this post and removed the word “absolutely”. If you wouldn’t mind using less charged words, it can help towards having a more productive discussion.

2 Likes

Thank you very much!

Yes, I stated in the post, gfortran has no problem.
It only happens using intel Fortran with -O2, or -O3 flag.

I have tried to make it short, currently already < 60 lines.

it is likely intel fortran just did not do anything when call rk4_ti_step_mod3 ( x(i-1), x(i) ) .

do i = 1, n
    call rk4_ti_step_mod3 ( x(i-1), x(i) ) ! check n=10 bug.
    !write ( *, '(2x,i8,2x,f14.6,2x,g14.6)' ) i, x(i-1), x(i)
end do

Try to make it 10 lines if possible. That would make it easier on others to test, and if it is a bug in the Intel compiler, you can then use it as a reproducer for them to fix.

2 Likes
module stochastic_rk
    use constants  
    implicit none
contains    
    subroutine rk4_ti_step_mod3 ( x, xstar ) 
        implicit none
        real(kind=r8), intent(in) :: x 
        real(kind = r8), intent(out) :: xstar
        real(kind = r8) :: ks, xs 
        !xstar = x !DELETE
        !write(6,*) 'x', x 
        xs = x 
        ks = - xs * 0.1_r8
        xstar = x + ks
        !xstar = xstar + ks ! CHANGE
        !write(*,*)x,xstar
        return
    end subroutine rk4_ti_step_mod3 
end module stochastic_rk  

The bug come from this subroutine ,but i don’t know the reason.And if print something ,the bug will disappear.

2 Likes

Thank you my man! @Euler-37
Yes I know. I have managed according to @certik suggestion, even simpler version here, 30 lines.
He suggested 10 lines, that is a little bit too tough. 10 lines there is almost nothing I can do :sweat_smile:

module tests 
implicit none
contains
subroutine test01
integer, parameter :: n = 10, np = 10
integer :: i, itot, istart, istep, j, k
real  :: x(0:n), xout(5,np)
itot = n
istart = itot/5
istep = istart
do j = 1, np
    i = 0
    x(i) = 20.0
    do i = 1, n
        call rk4_ti_step_mod3 ( x(i-1), x(i) ) ! check n=10 bug.
        !write ( *, '(2x,i8,2x,f14.6,2x,g14.6)' ) i, x(i-1), x(i)
    end do
    !write (6,'(''x = '',t20, 5(f15.7,1x))') x(istart:itot:istep)
    xout(1:5:1,j) = x(istart:itot:istep)
enddo    
write (6,'(5(f15.7,1x))') xout
return
end subroutine test01
subroutine rk4_ti_step_mod3 ( x, xstar ) 
real :: x, xstar
xstar = x 
!write(6,*) 'x', x 
xstar = xstar + x
return
end subroutine rk4_ti_step_mod3
end module tests    
program main
use tests
implicit none  
call test01
stop ('Program end normally.')
end program main

In this above version, there is just one module called tests.

However if I move the subroutines to the main program. That is, use no module, is bug is gone. I am a little bit confused. Please see below, this code does not have problem.

program main
implicit none  
call test01
stop ('Program end normally.')
contains
subroutine test01
integer, parameter :: n = 10, np = 10
integer :: i, itot, istart, istep, j, k
real  :: x(0:n), xout(5,np)
itot = n
istart = itot/5
istep = istart
do j = 1, np
    i = 0
    x(i) = 20.0
    do i = 1, n
        call rk4_ti_step_mod3 ( x(i-1), x(i) ) ! check n=10 bug.
        !write ( *, '(2x,i8,2x,f14.6,2x,g14.6)' ) i, x(i-1), x(i)
    end do
    !write (6,'(''x = '',t20, 5(f15.7,1x))') x(istart:itot:istep)
    xout(1:5:1,j) = x(istart:itot:istep)
enddo    
write (6,'(5(f15.7,1x))') xout
return
end subroutine test01
subroutine rk4_ti_step_mod3 ( x, xstar ) 
real :: x, xstar
xstar = x 
!write(6,*) 'x', x 
xstar = xstar + x
return
end subroutine rk4_ti_step_mod3
end program main
module tests
    implicit none
contains
    subroutine test01
        integer :: i
        real  :: x(0:4)
        x(0) = 1.0
        do i = 1, 4
            call rk4_ti_step_mod3 ( x(i-1), x(i) ) ! check n=10 bug.
        end do
        write (*,*)x
    end subroutine test01
    subroutine rk4_ti_step_mod3 ( x, xstar )
        real :: x, xstar
        xstar = x
        xstar = xstar + x
        !xstar = x + x
    end subroutine rk4_ti_step_mod3
end module tests

program main
    use tests
    implicit none
    call test01
end program main

I have changed the code like this one.

Thank you man @Euler-37 ! Yours is even better!
Again, it is strange, without using module, the result is correct,

program main
implicit none
call test01
contains
subroutine test01
integer :: i
real :: x(0:4)
x(0) = 1.0
do i = 1, 4
call rk4_ti_step_mod3 ( x(i-1), x(i) ) ! check n=10 bug.
end do
write (,)x
end subroutine test01
subroutine rk4_ti_step_mod3 ( x, xstar )
real :: x, xstar
xstar = x
xstar = xstar + x
!xstar = x + x
end subroutine rk4_ti_step_mod3
end program main

It looks like some over-optimized things for SSE , but i am not familiar with ASM.

1 Like

I am guessing it’s the dependency between loops that might confuse the optimizer.

It seems @CRquantum that you should report the bug to Intel.

1 Like

Thank you @certik , I have posted a similar one on Intel’s forum.
If Dr. Fortran see the post and verified this, I think he would be able to submit the report.

In my another link,

Dr. Fortran has verified this issue and have submitted an report to Intel.

1 Like

Yes, I reported your previous issue because it interested me and was straightforward. It used to be my job to investigate and report everything that came in to the forum, but that ended almost five years ago. As the famous Polish saying goes, “Nie mój cyrk, nie moje małpy”. Retirement gives me the luxury of choosing when and if to get involved.

Your latest problem has been properly analyzed by other forum members, and appears to be a simple inline optimization error. It is the responsibility of Intel employees to pick up on that.

8 Likes

I see @Euler-37 did this already, so this is very similar but note if you remove the “redundant” line it will work:


module tests 
contains
subroutine bug
    real  :: x(11)=20.0
    do i = 1, 10
       call double ( x(i), x(i+1) )
    enddo
    write (*,'(*(g0,1x))') x
end subroutine bug
subroutine double ( x, xstar ) 
    xstar = x  ! remove this line
    xstar = x + x
end subroutine double
end module tests    
program main
    use tests
    call bug()
end program main

I could reproduce the problem with ifort 2021.3.0 20210609 but not with ifx.

1 Like

PS: could not quite get down to ten lines cutting every corner; so went for minimum characters:

 ifort bug2.f90;./a.out;ifort -O0 bug2.f90;./a.out
      1.000000       2.000000       2.000000       2.000000       2.000000    
   1.000000       2.000000       4.000000       8.000000       16.00000    

$ cat bug2.f90
subroutine b()
real::x(5)=2
do i=1,4
call d(x(i),x(i+1))
enddo
print*,x
end 
subroutine d(x,r) 
r=x
r=x+x
end 
call b
end

:slight_smile:

2 Likes

Thank you for your comments, Dr. Fortran!
Frankly, your name basically equals to Fortran!
Cannot remember how many times google Fortran stuff and your name jumped into my eyes!
What is more remarkable is that, basically all of your answers bingo and beat the problem to death!

Ultimate regards! :grinning:

@CRquantum, thanks for the kind words, but please see The Real Doctors of Fortran - Doctor Fortran (stevelionel.com).

4 Likes

PS: I have a huge script whose purpose originally was for tuning performance more than for detecting bugs that essentially empirically tries a lot of compiler options. If the bug involves something like clobbering memory the results can almost be random and misleading; other times it is very informing. When I tried it on your problem it seemed more likely it was misleading so I had not mentioned it.
But since the majority of behavior changes involved inlining I thought you might want to pass on a section of the results after all:

+ ifort -O2 bug2.f90                          2 4 4 4 4
+ ifort -O2 bug2.f90 -finline-functions       2 4 4 4 4
+ ifort -O2 bug2.f90 -fno-inline-functions    2 4 8 16 32
+ ifort -O2 bug2.f90 -finline                 2 4 8 16 32
+ ifort -O2 bug2.f90 -fno-inline              2 4 8 16 32
+ ifort -O2 bug2.f90 -Os                      2 4 8 16 32

+ ifort -O2 bug2.f90 -finline-limit=0    2 4 8 16 32
+ ifort -O2 bug2.f90 -finline-limit=1    2 4 8 16 32
+ ifort -O2 bug2.f90 -finline-limit=10   2 4 8 16 32
+ ifort -O2 bug2.f90 -finline-limit=20   2 4 8 16 32
+ ifort -O2 bug2.f90 -finline-limit=30   2 4 8 16 32
+ ifort -O2 bug2.f90 -finline-limit=100  2 4 4 4 4

why it seems to be misleading at first was because adding both -finline and -fno-inline appeared to turn off the bug; and because the bug disappeared with -finline at small values but appeared at larger ones. That is odd because both turning on a switch and disabling a switch that is supposed to do with compiler directives that are not even used in the example turned the bad answers off; and because numbers far bigger than the size of the procedures in the code all appeared to turn the bug off; but then a bigger still value made it re-appear. Anyway, I had second thoughts so here are the results in case you want to pass it on.

1 Like

Thank you very much indeed for your detailed analysis! I really appreciate that!
You are absolutely right, in the real code I observe the similar thing as you described.
But you level is way above me and you explain thing much more clearer than me, thank you so much! :+1:
Hopefully intel could fix this issue soon.