which shows that my statement would not have been correct. Does anyone know why those last two columns are not the same?
BTW, if you replace the 3.0 with 2.0 in the program, then the two columns are the same. So the equality is sometimes true, but I had thought it was always true, at least for positive x.
When you started with x = 1.0 and kept dividing by 2.0, all the values of x were exactly representable, and nearest(x,1.0)-x, spacing(x) and epsilon(1.0)*x were all equal for each x. When you changed the divisor to 3.0, exact representability was lost.
Compared to the other two, epsilon() is different in that it has meaning only when addition or subtraction is performed, and its value depends on the internal algorithms the processor uses to do that addition. The values returned by the other two functions are completely determined by the representation that has been chosen for floating point numbers.
Working backwards, we could infer that any widely marketed processor will have its addition/subtraction algorithms implemented (in hardware or microcode) so as to guarantee the equality of the three expressions involving x that I noted above, when the x in them is exactly representable.
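That claim is easy to check empirically. Here is a minimal sketch (not from the original test program) that verifies the three expressions agree bit for bit while x is an exact power of two:

program check_equal
   implicit none
   integer :: i
   real :: x
   x = 1.0
   do i = 1, 10
      ! for x an exact power of two, all three expressions should agree exactly
      write(*,*) nearest(x,1.0) - x == spacing(x) .and. &
                 spacing(x) == epsilon(1.0)*x
      x = x / 2.0
   end do
end program check_equal

Every line should print T; switching the divisor to 3.0 leaves the first comparison true but makes the second one fail once the low-order bits of x become nonzero.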
Every single possible floating point number of any KIND is an exactly represented number. I know floating point numbers are often considered as approximations to real numbers with error bars, but in fact they are each an exact rational number with an infinitely precise value.
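To make that concrete, here is a small sketch (the quoted values assume IEEE single precision) that decomposes a real into an exact integer significand times a power of two using the fraction(), exponent(), and digits() intrinsics:

program exact_rational
   implicit none
   real :: x
   x = 1.0 / 3.0
   ! x = m * 2**e exactly; for IEEE single precision this prints
   ! 11184811 and -25, i.e. x is exactly the rational 11184811/2**25
   write(*,*) int(fraction(x) * 2.0**digits(x)), exponent(x) - digits(x)
end program exact_rational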
To answer my own question, the difference is that when dividing by 2.0 all of the lower-order bits of every x are zero, and when dividing by 3.0 they are not. I added the nearest() expression and changed to a Z edit descriptor:
program xxx
   implicit none
   integer :: i
   real :: x
   x = 1.0
   do i = 1, 10
      ! print x, then the bit patterns (Z editing) of the three expressions
      write(*,'(es12.5,3z10.8)') x, nearest(x,1.0)-x, x*epsilon(x), spacing(x)
      x = x / 3.0
   enddo
end program xxx
So now it is clear what is happening. Two of those columns are equivalent, and their low-order bits are all zero, while the other expression has nonzero low-order bits. I now realize that I should be more careful in my coding about exactly which expression I use in my tests for convergence, near-equality, and so on. They are in fact not the equivalent tests I had presumed them to be.
When the 3.0 is replaced with 2.0, all three columns are the same and the low-order bits are zero everywhere.
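To show the kind of test I mean, here is a sketch of a near-equality comparison written in terms of spacing(); the function name and the ulps tolerance are illustrative, not from any particular library:

logical function nearly_equal(a, b, ulps)
   implicit none
   real, intent(in) :: a, b
   integer, intent(in) :: ulps
   ! compare within a given number of spacings (ulps) of the larger magnitude;
   ! spacing() tracks the actual grid of representable numbers near the values,
   ! whereas epsilon()*x can differ from that grid in the low-order bits
   nearly_equal = abs(a - b) <= real(ulps) * spacing(max(abs(a), abs(b)))
end function nearly_equal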
As an aside, the reason I used 3.0 in that test code is that the binary representation of 1.0/3.0 has the form
0.010101...
for however many bits are in the mantissa. This was used in the f77 days as one way to compute epsilon() before that function was part of the standard; the expression used was (1.0 - 3.0*(1.0/3.0)). You can work it out by hand:
x    = 0.010101
2x   = 0.101010   (shift x left one bit)
3x   = 0.111111   (add the two rows above)
1    = 1.000000
1-3x = 0.000001   (subtract 3x from 1)
When you do this, you get either epsilon() or 2*epsilon(), depending on whether there is an even or an odd number of mantissa bits (counting the hidden bit, if it is there).
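Written out as a small program (a sketch, with an abs() added because the rounded result can land on either side of 1.0, and an intermediate variable to discourage the compiler from folding the whole expression at compile time):

program f77_epsilon
   implicit none
   real :: third, e_old
   third = 1.0 / 3.0              ! rounds the infinite 0.0101... bit pattern
   e_old = abs(1.0 - 3.0*third)   ! the old f77-era estimate of epsilon
   write(*,*) 'old trick :', e_old
   write(*,*) 'epsilon() :', epsilon(1.0)
end program f77_epsilon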
There were a few things that stopped this from working. One was that some compilers would evaluate the constant expression at compile time with extended precision and round the result to 0.0 rather than to epsilon. Another was that some computers, such as the Cray, did not do floating point arithmetic correct to the last bit, so you would sometimes get 0.0 on those machines even when the expression was evaluated at run time. The third, which began to appear in the 1980s, was that the compiler would evaluate it at run time using 80-bit extended precision registers, in which case the epsilon that was computed was the one for the 80-bit register rather than for the 32-bit or 64-bit real. When that happened, your convergence tests might never be satisfied, because your actual calculation was performed with lower precision than the convergence tolerance.
So having the modern Fortran intrinsics epsilon(), nearest(), and spacing(), which give reliable results for all KINDs, is a really nice feature of the language.
Why is “how to test” a question for you? Are you considering something more complicated than keeping it simple whilst being thorough? In the context of Fortran libraries and programs, if the author(s) and/or refactorer(s) focus first on simple unit tests for each of the important program units in the program/library, a lot of ground gets covered. The focus can then move on to the key packages (or modules) in the library, followed by integration testing where all the packages/modules are combined.
As to what to test: ultimately everything, but one can start by thinking of the program like machinery and focusing on testing every “moving part”, i.e., the subprogram arguments, working toward their effect on the customer's (the caller's) “independent” and “dependent” variables and whether the latter are determined as expected. A small sketch of such a test follows.
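Here is an entirely hypothetical illustration of one such unit test, checking a single caller-visible result against a tolerance (the inline computation stands in for a library routine; in real code you would USE its module):

program test_hypot_case
   implicit none
   real :: got, expected
   ! hypothetical unit under test: a 3-4-5 right-triangle case
   got = sqrt(3.0**2 + 4.0**2)
   expected = 5.0
   ! tolerance scaled by spacing() of the expected value, per the discussion above
   if (abs(got - expected) <= 4.0*spacing(expected)) then
      write(*,*) 'PASS: 3-4-5 case'
   else
      write(*,*) 'FAIL: got', got, 'expected', expected
      stop 1
   end if
end program test_hypot_case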
I think that you and I are writing from different contexts. Floating point notation existed for decades before computers came into widespread use, as we can see in physics and chemistry textbooks of the early 20th century. Almost always, the radix was 10 (ten).
Floating point numbers can be written with different radices. The rational number 1/10 is written as a base-10 floating point number, 1E-1 or 1.0E-1. This notation was used on many electronic calculators in the 1970s, some of which performed real arithmetic in base-10.
On a modern computer, we still use base-10 for the literal numbers in our program source and in the lines of text that our programs read and write with formatted I/O. It is the process of converting base 10 → 2 → 10 that frequently introduces approximations.
In a Fortran program, the programmer writes “1E-1” and usually thinks of it as the exact rational number 1/10. There is no exact representation for this rational number in base-2 floating point.
We often see complaints from beginners that their short program printed 0.09999999 when they expected 0.1.
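You can see that rounding directly by printing the stored value with more digits than the default formats show. A small sketch (the quoted digits assume IEEE single precision):

program tenth
   implicit none
   real :: x
   x = 0.1
   ! the nearest single-precision number to 1/10 is exactly
   ! 13421773/2**27 = 0.100000001490116119384765625
   write(*,'(f0.20)') x
end program tenth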
If you read my sentence, you can see that I do understand your perspective. Floating point numbers, which you take to include hand-written scientific notation, are used in that way, with significant digits and error bars. But my point was that this has nothing to do with the differences in the columns of numbers. I was not comparing the exact mathematical value of 1/3 to its floating point approximation. I was instead taking an exact number represented in floating point, however it was obtained, applying the Fortran intrinsics epsilon(), nearest(), and spacing(), and noting a difference that might be important when testing for convergence or near-equality.
Another way of explaining the second and fourth columns in the output table that you showed is to note that the numbers in them are always 2 raised to an integer exponent: the explicit part of the significand is all zeros.
For the range of x that you selected, those exponents range from -23 to -38.
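Equivalently (a small sketch), those exponents can be read off with the exponent() intrinsic, since for a normal x the value of spacing(x) is exactly 2**(exponent(x)-digits(x)):

program power_of_two
   implicit none
   integer :: i
   real :: x
   x = 1.0
   do i = 1, 10
      ! print the exponent of spacing(x) and confirm the identity holds
      write(*,*) exponent(x) - digits(x), &
                 spacing(x) == 2.0**(exponent(x) - digits(x))
      x = x / 3.0
   end do
end program power_of_two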