A piece of code that causes LLVM Flang to generate NaN/Inf randomly

N.B.:

  1. I am well aware that -ffast-math can cause un-mathematical results. Whatever the results might be, they should be deterministic.

  2. The behavior of the code is indeed quite complex, more complex than it may appear, varying across different combinations of the dimension, printing, compilation flags, and operating system. This is why I did not provide a minimal example; in fact, I was not able to, because the pattern was unclear to me.

1 Like

I would argue the precise opposite. Returning the same answer is likely to make users think that this is the “correct” answer. And before you know it, they will be demanding to get that same answer with other flags. Better to give them garbage, since that is what they asked for.

Interesting. Let’s see what other compiler developers and Fortran programmers think about this point.

In case it is not obvious, numbers like 7.0E-45 are denormal floating point values. They are smaller than tiny() for the real32 kind (the default real kind), but they can have a binary representation if gradual underflow is enabled. I think the compiler is allowed to set them to zero at compile time, and also the run time behavior can vary depending on which exceptions are enabled. Further, the reciprocal of a number like that will overflow, so depending on what exceptions are enabled, which floating point flags are set, and how the expression is evaluated, the result could be huge(), INF, or NaN.

If I were a numerical analyst, I might want a specific kind of behavior with numbers like this. If I’m an applications programmer, I probably don’t care much exactly what the results are, but I would want to know that something isn’t normal.

3 Likes

FWIW, on my mac (M1),

gfortran-15 -ffpe-trap=invalid -O2 test_div_flang.f90

gives results with no NaN / Inf, while

gfortran-15 -ffpe-trap=invalid -O2 -ffast-math test_div_flang.f90

gives the following runtime error:

Program received signal SIGILL: Illegal instruction.

Backtrace for this error:
#0  0x100b6a103
#1  0x100b69083
#2  0x1947ad6a3
#3  0x1007a50e7
#4  0x1007a50e7
zsh: illegal hardware instruction  ./a.out

which seems to suggest that something like NaN or Inf was sent to some “instruction” (at the assembly level?) and detected somewhere. I wonder if a similar flag is available for LLVM flang also…? (I was not able to find such an option via flang --help).

In the case of flang, the “random” results of the program might be explained if such an “illegal” value was sent to some instruction and caused some strange behavior (e.g., memory corruption)…?

1 Like

One of the optimizations nuked the division; we never load anything into the descriptor: we see it allocate the space (56 bytes) and save the pointer in the descriptor, but never populate it with results before the call.

        movl    $56, %edi
        callq   malloc@PLT
        movq    %rax, %rbp
        movq    %rax, 440(%rsp)
        movq    $4, 448(%rsp)
        movq    %r12, 456(%rsp)
        movq    $1, 464(%rsp)
        movq    $14, 472(%rsp)
        movq    $4, 480(%rsp)
        leaq    440(%rsp), %rsi
        movq    %r13, %rdi
        callq   _FortranAioOutputDescriptor@PLT
3 Likes

As someone who has used -ffast-math in finite element calculations for many years, I disagree with your post.
When using floating point calculations, there is never a “correct answer”. We are asking for an answer with an acceptable round-off.
When this is not the case, there are many reasons for the answer being unacceptable, even when using -ffast-math. In all cases I have found the problem being a poor numerical modelling approach.
If the modelling approach is improved, then the errors due to -ffast-math are never significant.
I have only observed problems with -ffast-math when the floating point values are unusual as a result of a poor modelling approach.
I find this criticism of -ffast-math a gross exaggeration for practical usage.

In structural FE analysis, where localised excessive round-off can occur, there is typically sufficient redundancy in the structural model equations that these round-off errors are not significant. I don’t work in turbulence analysis, where a butterfly can change the results. Practical analysis of round-off in large systems of equations is very difficult to assess.

3 Likes

Yes, I’ve been using -ffast-math with great success also, and never had any issues. I started this thread here about it: Can one design coding rules to follow so that `-ffast-math` is safe?, and I link a document there about some rules to follow that make using -ffast-math “safe”.

1 Like

I admit it was an exaggeration and kind of click-bait. The point I was trying to get across is that unless you have evidence to the contrary, -ffast-math output should be treated as garbage, because there is no numerical analyst hiding inside the compiler (yet), and we know that for every innocuous-looking transformation there is an example where radically different outputs are possible. You may have convinced yourself that already-completed calculations A to Y were not unduly affected by the transformations involved, but you cannot be sure that the Z calculation that you will do tomorrow will also be unaffected. In the early days, when computers still had names ending in -AC, people had no idea what the output would be; that is why they were using a computer in the first place. These days, almost everyone has a pretty good idea of what the output should be, roughly so many picobarns, Angstroms, dollars, degrees Kelvin. The game has changed from “I wonder what the answer is” to “I wonder if I can get a plausible answer faster”. The principle of “thou shalt not fool thyself” requires a tool that reminds you of the instability inherent in cutting corners.

Having said all that, the actual bug that I see in LLVM-flang is that the runtime routine Fortran::decimal::ConvertToDecimal<24> hits a reference to an uninitialised variable, as valgrind reveals (after compiling with -O2 -ffast-math). This could be how different results end up in the output. But please don’t wait for “noise” like that before you suspect that your output digits may be fiction.

2 Likes

I am not aware of this characterisation of the performance. Surely there is better information on the conditions under which -ffast-math calculations fail.
I thought that denormalized floating point numbers were one cause, but other cases are not identified.

Do you see reproducible results when you run the executable under a debugger?

You always test in Debug mode (no -ffast-math) and get the answer you want. Then you enable Release mode (and yes, also -ffast-math) and see if things change and whether the accuracy is acceptable to you. The document I linked above talks about the techniques you can use to always (in my experience) get an acceptable answer out of -ffast-math as currently implemented in compilers.

1 Like

If LLVM Flang people “fixed” the problem above so that the program produces the same output for each run, but takes a bit longer to do so, would the customer be happy?

For commercial software development, it might, yes, because the developers already have to deal with too many issues from their own code and answer questions about why results are different (even if better or faster). If the only explanation for a difference is some randomness produced by a compiler option such as -ffast-math, then it gets really complicated. Reproducibility is important to build trustworthiness.

3 Likes

Nonreproducibility is usually caused by accessing some undefined or uninitialized memory location, and then using that location somehow (as an array index, or a pointer, or in a floating point operation, etc.). It is unclear to me in this thread how the fast-math compiler option might cause that to occur.

2 Likes

I agree that things should be reproducible with -ffast-math also, unless there is some optimization that just fundamentally is not deterministic? Which one?

1 Like

A programming error (by the user, or the compiler writer) can cause undefined references leading to nonreproducibility.

But so can simple addition operations, if there are enough of them to be executed in parallel and combined in a non-reproducible order. Enforcing the same order takes synchronization which costs run-time, and some people would be happy enough with a different sum each time the program runs (because they have analysed the effect the variation will produce downstream).

I am one of those people. However, that’s typically a parallel sum, correct? On a single core, why would it give different results each time? A parallel code indeed will give different results each run, but you can run it on a single core to make it reproducible.

Because it does not have to. A REDUCE intrinsic function with ORDERED=.FALSE. is not required to return the same value each time, because the order of reducing array elements is left unspecified.

Yes, but in practice a compiler will pick some particular (unspecified) order, and once the binary is built, I would think each run would still be deterministic.