Some compilers print Inf or Infinity with this program; others won’t compile it, reporting a floating-point overflow. Which is right?
program nearesthuge
   print *, nearest(huge(1.0), 1.0)
end program nearesthuge
Without digging into the standard, I would say this is invalid code, so the compiler can do whatever it wants.
In general, I’d agree with @PierU that it is implementation-dependent. huge() returns the largest model value, whereas nearest() returns the next machine-representable value larger than the argument. Even though the real model is not the same as the typical machine model (we’re adding a note to F202Y to say so), in most cases there will be no such representable value.
This means that it is up to the programmer to avoid that argument pair to the nearest() function, right?
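One way a programmer might guard against it at runtime, as a minimal sketch (the program name is illustrative, and it assumes the compiler returns Inf rather than refusing to compile, as most of those discussed here do):

program guard_nearest
   use, intrinsic :: ieee_arithmetic, only : ieee_is_finite
   implicit none
   real :: y
   y = nearest(huge(1.0), 1.0)   ! may yield Inf at the top of the range
   if (ieee_is_finite(y)) then
      print *, 'got a representable value:', y
   else
      print *, 'nearest() stepped past huge():', y
   end if
end program guard_nearest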
It seems like the situation is a little different for the other extreme, nearest(tiny(1.0),-1.0). Most machines have denormal numbers, so there are machine-representable numbers smaller than tiny(1.0), and a programmer might intentionally want some of them, while knowing that he is not guaranteed to generate them that way.
@RonShepard is right: the other extreme is different. This program to test it compiled and ran with gfortran, g95, ifort, ifx and AMD flang, with various outputs. LFortran gave an internal compiler error and said
'Nearest` intrinsic is not yet implemented for runtime values
The program:
program nearestsmall
   implicit none
   real :: x
   integer :: i
   x = tiny(x)
   do i = 1, digits(x)+1
      if (x == 0) exit
      x = nearest(x, -x)
      print "(A,I3,E14.7)", 'In loop i,x =', i, x
      x = x/2.0
   end do
   print "(A,2E14.7)", 'After loop x, tiny*epsilon =', x, tiny(x)*epsilon(x)
end program nearestsmall
I think printing the values with a B32.32 format makes these kinds of edge conditions really informative. Seeing the bit pattern of huge(1.0) and understanding the integer nature of the exponent and how the fraction is normalized into a decimal value makes it all make sense, except perhaps how exceptional numbers like Inf/-Inf, the different NaN values, and negative zero are usually implemented. It can lead to a better understanding of why overflow often results in a negative number, why so many 32-bit functions like COS and SIN are so smooth with smaller values but can look like random number generators when used with large values, why adding one to a big number often produces no change in the value, and so on.
On most machines nowadays, when you look at the bit pattern of huge(0.0) and then add one to the fraction as a binary number, you might expect the carry to eventually flip the sign bit and otherwise start you over, so that the value would become a negative zero. But on most machines you find you get Infinity instead, which is educational. Keep incrementing the bits (in a modulo fashion, as you just discovered) and, if your platform supports it, you find out how negative zero IS represented, and so on.
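Here is a minimal sketch of that bit-bumping experiment, assuming 32-bit IEEE reals and using transfer() to reinterpret the bit pattern (the program name and output labels are just illustrative):

program bump_bits
   use, intrinsic :: iso_fortran_env, only : int32, real32
   implicit none
   real(real32)   :: x
   integer(int32) :: bits
   x = huge(1.0_real32)
   bits = transfer(x, bits)              ! reinterpret the float as its raw bits
   print "(A,B32.32,A,G0)", 'huge(1.0)    = ', bits, '  value = ', x
   bits = bits + 1_int32                 ! bump the stored pattern by one
   x = transfer(bits, x)
   print "(A,B32.32,A,G0)", 'huge + 1 bit = ', bits, '  value = ', x   ! Inf on IEEE machines
end program bump_bits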
In the implementations I have seen, it is pretty clear that NEAREST is just adding or subtracting 1 to or from the binary bit pattern of the float value. But working near the edge of the range, as with HUGE(), usually makes you want to switch everything to 64-bit :>.
These kinds of examples can be great tools for explaining floating-point math versus conventional math, but I find it easier to discuss and understand when looking at the bit patterns. That is where ideas like representable numbers versus arbitrary constants suddenly become clear, and the proper fear of floating-point math can be instilled in those previously exposed only to conventional math or arbitrary-precision computation.
So then take a small float and start dividing it by two. When you realize the result stops changing, you look at the bit pattern and suddenly EPSILON makes sense.
Then you start thinking how maybe Planck’s constant is so similar to that, and you remember those guys talking about how the world is really a simulation, and you start thinking – wow! quantum behavior really does sort of look like what happens when you use floating point math on a computer …
This works almost identically with gfortran and nagfor on arm64 hardware and macOS. Both step down through the range of denormal numbers and finally end up at 0.0, which is both a model number and representable. The only difference was that nagfor reported an underflow condition at the end of the program.
program demo_epsilon
   use, intrinsic :: iso_fortran_env, only : dp=>real64, sp=>real32
   implicit none
   real(kind=dp) :: t
   real(kind=dp) :: my_dp_eps
   ! starting with a value of 1, keep dividing the value
   ! by 2 until no change is detected. Note that with
   ! infinite precision this would be an infinite loop.
   ! 1/2 of a non-zero number is always smaller than the number, right?
   my_dp_eps = 1.0d0
   INFINITE: do
      my_dp_eps = my_dp_eps/2.0d0
      t = 1.0d0 + my_dp_eps
      if (t <= 1.0d0) exit
   enddo INFINITE
   my_dp_eps = 2.0d0*my_dp_eps
   print *, my_dp_eps
   print *, 'weird coincidence, that is the same as ', epsilon(0.0d0)
   print *, 'congratulations! You calculated the epsilon value of a machine the hard way!'
   print '(b64.64)', my_dp_eps
end program demo_epsilon
So I think understanding this one is fun too!
Such loops were a common sight in the days before the epsilon(...) and other query intrinsics were introduced. I have one in a code I’m refactoring:
C
C     CALCULATE ROUGH ESTIMATE OF UNIT ROUND-OFF ERROR FOR CHECKING
C
      TWOU = 0.1D0
   40 TEMP = 1.0D0 + TWOU
      IF (1.0D0.EQ.TEMP) THEN
         TWOU = TWOU*2.0D0
      ELSE
         TWOU = TWOU*0.5D0
         GO TO 40
      END IF
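For comparison, a minimal modern sketch: the epsilon() query intrinsic returns essentially what this loop estimates, TWOU being roughly twice the unit round-off u:

program roundoff
   implicit none
   double precision :: twou
   twou = epsilon(1.0d0)   ! what the legacy GO TO loop computes the hard way
   print *, 'rough unit round-off estimate (2u):', twou
end program roundoff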
I noticed a report about this among some old Bell Labs material (Selected Bell Labs Tech Reports; some pretty famous names in there):
- N. L. Schryer, CSTR #89, A Test of Computer’s Floating-Point Arithmetic Unit, Bell Labs, February 1981. Also in Sources and Development of Mathematical Software, ed. W. Cowell, Prentice-Hall.
Unfortunately the link is broken. I was going through the works of N. L. Schryer in search of something else. He wrote a very nice piece about software for solving PDEs in one dimension (his tone of writing is very clean and comprehensible, with a pinch of humor).
To me, nearest( tiny(1.0), -1.0 ) shall return 0.0, otherwise it’s not standard conforming.
Using B32.32 (thanks @urbanjost) I changed the program to this:
program nearestsmall
   implicit none
   real :: x
   integer :: i
   x = tiny(x)
   do i = 1, digits(x)+1
      if (x == 0) exit
      x = nearest(x, -x)
      print "(2X,A,I3,E14.6,A,B32.32)", 'In loop i,x =', i, x, ' = ', x
      x = x/2.0
   end do
   print "(12X,A,B32.32)", 'tiny(x)*epsilon(x) = ', tiny(x)*epsilon(x)
end program nearestsmall
The Intel compilers were interesting: ifort 2021.13.1 20240703 gave
In loop i,x = 1 0.117549E-37 = 00000000011111111111111111111111
tiny(x)*epsilon(x) = 00000000000000000000000000000001
and ifx 2025.0.0 20241008 gave
In loop i,x = 1 0.117549E-37 = 00000000011111111111111111111111
In loop i,x = 2 0.587747E-38 = 00000000001111111111111111111111
In loop i,x = 3 0.293873E-38 = 00000000000111111111111111111111
In loop i,x = 4 0.146937E-38 = 00000000000011111111111111111111
In loop i,x = 5 0.734683E-39 = 00000000000001111111111111111111
In loop i,x = 6 0.367341E-39 = 00000000000000111111111111111111
In loop i,x = 7 0.183670E-39 = 00000000000000011111111111111111
In loop i,x = 8 0.918341E-40 = 00000000000000001111111111111111
In loop i,x = 9 0.459163E-40 = 00000000000000000111111111111111
In loop i,x = 10 0.229575E-40 = 00000000000000000011111111111111
In loop i,x = 11 0.114780E-40 = 00000000000000000001111111111111
In loop i,x = 12 0.573832E-41 = 00000000000000000000111111111111
In loop i,x = 13 0.286846E-41 = 00000000000000000000011111111111
In loop i,x = 14 0.143353E-41 = 00000000000000000000001111111111
In loop i,x = 15 0.716064E-42 = 00000000000000000000000111111111
In loop i,x = 16 0.357331E-42 = 00000000000000000000000011111111
In loop i,x = 17 0.177965E-42 = 00000000000000000000000001111111
In loop i,x = 18 0.882818E-43 = 00000000000000000000000000111111
In loop i,x = 19 0.434403E-43 = 00000000000000000000000000011111
In loop i,x = 20 0.210195E-43 = 00000000000000000000000000001111
In loop i,x = 21 0.980909E-44 = 00000000000000000000000000000111
In loop i,x = 22 0.420390E-44 = 00000000000000000000000000000011
In loop i,x = 23 0.140130E-44 = 00000000000000000000000000000001
tiny(x)*epsilon(x) = 00000000000000000000000000000001
I found a copy of that report at https://telecomarchive.s3.us-east-2.amazonaws.com/docs/bsp-archive/Letters%20and%20Memos/CSTR/CSTR%2089.pdf
Some compilers have the option to flush denormal numbers to zero. That is apparently what the ifort compiler did here, while ifx did not. Also, it may be possible to set this behavior at runtime with the IEEE library routine IEEE_SET_UNDERFLOW_MODE().
I think this is the difference between model numbers and hardware numbers.
Yep, and I was wrong. Although tiny() refers to the model numbers, nearest() refers to the hardware numbers. Consequently, nearest( tiny(1.0), -1.0 ) won’t be 0.0 on machines that handle denormalized numbers.
Regarding this one, all compilers tested on godbolt.org (that is, gfortran, ifx, flang, nvfortran, lfortran) compile the code successfully and print Inf/Infinity. Compilers usually do not check/trap floating-point overflows by default. Using ifx -fpe0 traps the fp overflow here, but gfortran -ffpe-trap=overflow doesn’t, which I believe is a bug.
I modified the nearestsmall program a little to see how the IEEE functionality works.
program ieee_nearestsmall
   use ieee_arithmetic, only: IEEE_SET_UNDERFLOW_MODE
   implicit none
   call IEEE_SET_UNDERFLOW_MODE (GRADUAL=.false.)
   call loop()
   call IEEE_SET_UNDERFLOW_MODE (GRADUAL=.true.)
   call loop()
contains
   subroutine loop()
      real :: x, xn
      integer :: i
      x = tiny(x)
      do i = 0, digits(x)+1
         xn = nearest(x, -1.0)
         print "(2X,A,I3,E14.6,2(A,B32.32))", 'In loop i,x =', i, x, ' = ', x, ' xn = ', xn
         if (x == 0) exit
         x = x * 0.5
      enddo
      print "(12X,A,B32.32)", 'tiny(x)*epsilon(x) = ', tiny(x)*epsilon(x)
      return
   end subroutine loop
end program ieee_nearestsmall
$ gfortran nearestsmall.f90 && a.out
In loop i,x = 0 0.117549E-37 = 00000000100000000000000000000000 xn = 00000000011111111111111111111111
In loop i,x = 1 0.000000E+00 = 00000000000000000000000000000000 xn = 10000000000000000000000000000001
tiny(x)*epsilon(x) = 00000000000000000000000000000001
In loop i,x = 0 0.117549E-37 = 00000000100000000000000000000000 xn = 00000000011111111111111111111111
In loop i,x = 1 0.587747E-38 = 00000000010000000000000000000000 xn = 00000000001111111111111111111111
In loop i,x = 2 0.293874E-38 = 00000000001000000000000000000000 xn = 00000000000111111111111111111111
In loop i,x = 3 0.146937E-38 = 00000000000100000000000000000000 xn = 00000000000011111111111111111111
In loop i,x = 4 0.734684E-39 = 00000000000010000000000000000000 xn = 00000000000001111111111111111111
In loop i,x = 5 0.367342E-39 = 00000000000001000000000000000000 xn = 00000000000000111111111111111111
In loop i,x = 6 0.183671E-39 = 00000000000000100000000000000000 xn = 00000000000000011111111111111111
In loop i,x = 7 0.918355E-40 = 00000000000000010000000000000000 xn = 00000000000000001111111111111111
In loop i,x = 8 0.459177E-40 = 00000000000000001000000000000000 xn = 00000000000000000111111111111111
In loop i,x = 9 0.229589E-40 = 00000000000000000100000000000000 xn = 00000000000000000011111111111111
In loop i,x = 10 0.114794E-40 = 00000000000000000010000000000000 xn = 00000000000000000001111111111111
In loop i,x = 11 0.573972E-41 = 00000000000000000001000000000000 xn = 00000000000000000000111111111111
In loop i,x = 12 0.286986E-41 = 00000000000000000000100000000000 xn = 00000000000000000000011111111111
In loop i,x = 13 0.143493E-41 = 00000000000000000000010000000000 xn = 00000000000000000000001111111111
In loop i,x = 14 0.717465E-42 = 00000000000000000000001000000000 xn = 00000000000000000000000111111111
In loop i,x = 15 0.358732E-42 = 00000000000000000000000100000000 xn = 00000000000000000000000011111111
In loop i,x = 16 0.179366E-42 = 00000000000000000000000010000000 xn = 00000000000000000000000001111111
In loop i,x = 17 0.896831E-43 = 00000000000000000000000001000000 xn = 00000000000000000000000000111111
In loop i,x = 18 0.448416E-43 = 00000000000000000000000000100000 xn = 00000000000000000000000000011111
In loop i,x = 19 0.224208E-43 = 00000000000000000000000000010000 xn = 00000000000000000000000000001111
In loop i,x = 20 0.112104E-43 = 00000000000000000000000000001000 xn = 00000000000000000000000000000111
In loop i,x = 21 0.560519E-44 = 00000000000000000000000000000100 xn = 00000000000000000000000000000011
In loop i,x = 22 0.280260E-44 = 00000000000000000000000000000010 xn = 00000000000000000000000000000001
In loop i,x = 23 0.140130E-44 = 00000000000000000000000000000001 xn = 00000000000000000000000000000000
In loop i,x = 24 0.000000E+00 = 00000000000000000000000000000000 xn = 10000000000000000000000000000001
tiny(x)*epsilon(x) = 00000000000000000000000000000001
This is on arm64 hardware with gfortran on macOS. When gradual underflow is turned off (the first call), nearest() still returns a denormal number, but the subsequent multiplication by 0.5 flushes to zero as expected. Then notice that nearest(0.0,-1.0) returns the negative of the smallest denormal number. That all seems reasonable, but I think nearest(0.0,-1.0) could also return -tiny(1.0) here and still be conforming. That is, I think nearest() is allowed to look at the current underflow mode if it wanted to, and return the value consistent with that. The final tiny(x)*epsilon(x) value also prints the denormal value rather than flushing to 0.0. That might be an incorrect result, or it might be just the difference between compile-time and run-time evaluation of that expression. Maybe someone with more experience with these intrinsics could comment on this feature. Certainly, on a machine that does not support gradual underflow at all, I would expect these results all to flush to 0.0.
With gradual underflow turned on, the full range of single-bit denormal values is printed. When x has the smallest denormal value, nearest(x,-1.0) and the multiplication both return 0.0. As before, nearest(0.0,-1.0) returns the negative of the smallest denormal number, which I think is the expected result here.
edit: The nagfor compiler on arm64 hardware gives the same results as gfortran, but with a warning at the end of the run that a floating-point underflow occurred. In particular, when gradual underflow is turned off, nagfor also returns denormal numbers for both the nearest(0.0,-1.0) expression and for the final product expression. The nagfor compiler also requires transfer() to write the B32.32 fields.
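For reference, a minimal sketch of that portable form; since B editing of a real item is an extension, the real is reinterpreted as an integer of the same size first (the program name is illustrative):

program b32_portable
   use, intrinsic :: iso_fortran_env, only : int32
   implicit none
   real :: x
   x = tiny(x)
   ! transfer() yields an integer with the same bit pattern, which B32.32 accepts
   print "(B32.32)", transfer(x, 0_int32)
end program b32_portable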
I think historically a programmer might expect floating point overflows to be trapped and to result in a program abort, while floating point underflows would be expected to silently flush to zero. Of course, that behavior might be modified with compile time options or with run time modifications of the environment.
Here is another program I found instructive, testing what compilers do with nearest(0.0,1.0) under each ieee_set_underflow_mode(gradual) possibility.
program ieee_nearestzero
   use ieee_arithmetic, only: ieee_set_underflow_mode
   implicit none
   real :: x
   integer :: i
   do i = 1, 2
      call ieee_set_underflow_mode(i==1)
      x = nearest(0.0, 1.0)
      print "(A)", 'Underflow '//merge('gradual','sudden ',i==1)
      print "(A,B32.32,A,L2)", 'nearest(0.0,1.0) = ', x, &
         ' = tiny*epsilon?', x==tiny(0.0)*epsilon(0.0)
   end do
end program ieee_nearestzero
I’m curious if any compilers return tiny(1.0) for this nearest() value. I’ve looked at three compilers now, and they all return a denormal value regardless of the underflow setting. I wonder if that is the intended result?
Same result as @RonShepard got, for the 4 compilers that would compile it. LFortran hasn’t implemented ieee_set_underflow_mode yet.