Fix comparing float number with integer issue

Dear all,

I have an old code (not written by me). There is a piece

do i=1,10
  if (AAA(i) == -99) then  ! real number compare with integer, bad!
    ......
  else
    .....
  endif
end do

where AAA array is a float number, real type. Sometimes the code do

AAA(j)=-99

Therefore it will do the

 AAA(i) == -99

check shown in the the if then else statement. Which is bad. It is a real number compare with integer issue.

I wonder, is there a simply and safe fix for this issue?

Is the below fix correct?

if ( (AAA(i) + 99) < epsilon(AAA(i))  ) then  

Thanks much in advance!

Two points to consider:

  • It appears that -99.0 is a sentinel value. For example, if AAA is an array of students’ ages, the -99.0 for a particular index i could signify that for that student the age has not yet been entered.

  • The real number -99.0 is a small integer, and can be represented exactly as a 32-bit real, so you need not take the trouble to use epsilon(), etc.

1 Like

You should keep in mind why comparisons between reals or between reals and integers are frowned upon: in many cases the values are not exact because of rounding issues. Like:

x = 10.0 * 0.1
y = 1.0
if ( x == y ) then
   write(*,*) 'x and y are equal'
endif

It is almost certain that x and y in this example are not equal, because of rounding issues. The eternal matter of binary versus decimal arithmetic.
But as @mecej4 says, -99 is a small value that can be represented exactly. In that case there is no problem at all comparing a real that has been set to this value at some point and the literal value. It is for tools, however, much easier to flag every such comparison than to examine if it is indeed a hazard. And in some cases it is impossible to do at compile time.

Conclusion: in this case is the comparison is completely valid.

The only thing you might want to change is: put the -99 in a parameter instead of using a literal value. That way you can indicate its role via its name rather than have the reader guess it is a sentinel value.

1 Like

Thanks @mecej4, @Arjen ! Yeah, the integer -99 is just to indicate if a record exist or not.
I mean in fact the integer -99 is in a txt file, AAA(j) will read in this -99 as real number, since AAA is real type.
Then I am worried that if this -99 real number will cause problem. But like you said, -99 compare with float number is OK. Then problem solved LOL.

Eh @Arjen do you mean do things like below?

real, parameter :: NA = -99
...

if (AAA(i) == NA) then
...
endif
...

Seems great.

Exactly that :slight_smile:

1 Like

You don’t want to play with machine precision to avoid cases where real → integer conversion is done by int which rounds to the LOWEST integer.

You want to use nint instead. Plus, you can improve the coding since you’re refactoring:

elemental logical function IS_FLAGGED(this)
   real, intent(in) :: this ! note this is real
   integer, parameter :: FLAG = -99 ! could be in a module
   IS_FLAGGED = nint(this)==FLAG
end function IS_FLAGGED

so you can do

where (IS_FLAGGED(AAA)) do_something
3 Likes

It is almost correct. epsilon is always positive, but AAA(i)+99. may be negative, and in that case the comparison won’t do what it is supposed to do.
Furthermore, I wouldn’t rely on the fact -99.0 can be represented exactly as a 32-bit real number. What if tomorrow, for whatever reason, another value needs to be used as a flag (by you or another colleague using the same code)? Will you remember to pick a “small integer” then, or to modify the comparison? I would probably forget to do that. A code relying on assumptions that may or may not be true tomorrow is not future-proof and therefore prone to painful debugging sometime in the future.

I don’t know the context of this code snippet, but if -99.0 serves just as a flag, and you are absolutely sure it will never change, why not compare like that: if AAA(i)<0.) then ... (provided AAA(i) is positive and not too close to zero - as in students’ ages)? This will eliminate any issue caused by the value of the flag - it is enough to be negative.
If AAA(i) is not always positive, using epsilon is the safest, future-proof solution. As @Arjen pointed out, it is better to define a parameter for the flag value (say, flag=-99.). Also do not forget that AAA(i)-flag may be negative, so the comparison should be:
if ( abs(AAA(i)-flag) < epsilon(AAA(i)) ) then...

1 Like

It seems like there are lots of possibilities in these situations, and the best option depends on things unstated in the original post. For example, how many “special” values are there in the input file? If -99 is the only one, then that simplifies things. If there are intended to be an arbitrary number, and perhaps future ones will be added to the input file and the current code is expected to adapt to those as necessary, then things get more complicated.

One suggestion is to use “real, parameter :: flag = -99.” or something similar. There are a few possible problems with this. What if the real kind is changed in the future? You may have problems with how the parameters are defined at compile time. Small integer values are usually alright, but there is no guarantee that this will always hold.

Others suggest to compare with a tolerance using something like “abs(value-flag)<=tol”. That can work too, but what exactly is the tolerance? How precisely are the values and the flag set? Should tol be something like 0.5, or should it be some small multiple of value*epsilon().

What if values near -99. are all supposed to be related, but distinct. That is, what if there are -99.0, -99.1, -99.2 special values that are all supposed to be treated the same at one point in the code, but differently is other parts of the code? Then something like int(value)==-99 might be appropriate for the general comparison, but other comparisons, e.g. with a tolerance or as nint(10*value)==-991, are appropriate elsewhere.

Something like “nint(value)==flag” can also work. But to be future proof, you must know that the special values will always be integers that are small enough so that no two values will map to the same integer with nint(). That is an unstated assumption in the original post.

For the exact comparison of real values, you might also be faced with the prospect that the i/o library converts values the same way as the compile time constants are evaluated. For example, if instead of -99, the input value were specified as -.99e2, then you might not end up with exactly the same floating point bit pattern, there might be some small numerical differences. This is the reason I might avoid the direct real and integer comparison in the code. Instead, I might localize all such comparisons into a function (either logical, or integer with just the values -1,0, and +1). Then if you need to refine the function later to account for some corner cases, it is just in one place, not scattered throughout the code.

3 Likes

@RonShepard provides a good discussion of why some thought should go into the selection of special/magic values such as -99, which is more of a signature with a cryptic meaning than the value itself. The author of a code that uses such special values must make a good effort to document the purpose and usage of the special value. Otherwise, it can become a big mystery, as @JacobWilliams demonstrated in a recent thread here.

1 Like

This is a good hint. Fixing the comparison to -99 only tackles the symptom but not the root of the problem.

To me, the most sensible approach would be to flag missing values with an “N/A”. This is easy to understand and as far as I know, pandas and similar tools will handle it in ASCII files correctly. @CRquantum this “big solution” is of course only possible if you have time for more severe refactoring and breaking backward compatibility can be managed.

1 Like

@CRquantum , you’ve a lot of great advice here from highly experienced and diverse backgrounds to consider!

To build on the comments thus far and to offer you a twist, you may want to definitively refactor this code to bring in named constant(s) instead of continuing with “magic numbers” and perhaps also look at reading in the data first as text and do string-based evaluations for “valid” records before proceeding to load the data into floating-point variables:

...
    character(len=*), parameter :: SENTINEL_nn = "-99"  !<-- nn is simply an identifier suffix presuming there are more such values
...
   if ( index(record_field, SENTINEL_nn) == 0 ) then ! read the "data" as floating-point values
1 Like

Thank you all! @Arjen @FedericoPerini @Pap @RonShepard @mecej4 @MarDie @FortranFan ! I did not expected to received so many useful tip! Awesome!

There are two steps.

Step 1. Data file preparation. This data file is a csv file.

Step 2. Fortran read in the data file and then do the calculation.

The Step 1.
is done by the user using any software or programming language. Like Excel, Python, R, etc. This preparation of data file does not need to be done in Fortran. It is the user’s responsibility to check their data file and provide the correct data csv file with correct format.

In fact, the data csv file is basically some medical data. There are only two types of columns, numeric columns and character columns. Numeric ones, such as, the blood pressure, or some drug concentration at each time slot like i=1, 2, 3, … etc. Or weight, height, age, etc. Those are definitely non-negative numbers, no need to worry. For character type, they can be the patient ID, race, gender, etc.

For any missing data, the user will either leave a mark at that place, the mark is character dot, eg, “.” Or leave a -99 there. Note that, if it is a valid data like -99, the user must write -99 in float format such as -99.0 you know something like that. So if it is just exactly -99 it just means the data is missing.

The Step 2.
The old F77 code, creates arrays for numeric columns. Such as the real type array AAA mentioned. It treats the place with “.” or -99 just as -99. Then it do the AAA(i) == -99 comparison. Anyway it just use this sloppy -99 magic number and lucky.

Now for me. In my code, actually, I always read any data as character first. Then depending on the content of the character, I convert them into the correct data type. This can prevent such -99 sloppy issues.

For example, I actually have two arrays,

real, allocatable, :: AAA( : )
logical, allocatable, :: AAA_valid( : )

I read in the data as character, if the character is " . " or “-99”, I set AAA_valid( i) = .false. Else, I set AAA(i) as the number converted from the character. So I do things like

do  i=1, 100
  if (AAA_valid(i)) then
     ... ! treat AAA(i) as a number some do calculation.
  else
     ... ! I know AAA(i) is a missing data so I do something else.
endif
enddo

There can be more elegant way, but my way is to create logical arrays like AAA_valid( : ) to help the numeric array AAA( : ).

Most statistics packages I’ve used or had to work with offer the option of setting something to a missing value and -99 or -999 are very common choices. I would guess that the idea came from there.

1 Like

Years ago, I had to work with a custom library using -4711 for a flag (the number presumably taken from the old fragrance), so I got used to it and I am still using that value whenever I need a flag. People can use anything for that purpose. That’s why I think if ( abs(AAA(i)-flag) < epsilon(AAA(i)) ) then... is a generic solution that should work no matter what. Using an integer (or logical) index vector is also a good solution, but what if AAA is huge? Is it worth sacrificing a big chunk of memory just for an index vector?

1 Like

I have just a couple of comments abouit this. First, if AAA(i) is > 2 in magnitude, then the above test will never be satisfied except for exact equality. In this case, you might as well just test for exact equality. Similarly, for small AAA(i) and flag values, the test will be satisfied regardless of how much the two values differ. This is because the fortran EPSILON() function returns its result based on the KIND of its argument, not its value. Almost always, the programmer wants a relative difference, not an absolute difference, in these kinds of tests. In this case, the comparison needs to be something like flag*EPSILON(flag). Maybe the programmer really wants something like 8*flag*EPSILON(flag) to allow the test to succeed when a few of the least significant bits differ. This is in contrast to the NEAREST() function, which does depend on its arguments value and not just its kind.

As for the last part, this is why a bit-string data type, or a 1-bit logical type, or a 1-bit integer type, would be so useful in fortran. There is no need to waste 32x (or these days even 64x) space with this kind of logical, true/false, yes/no, 0/1 information. Programmers have been wanting this to be added to fortran since the 1970s, even before f77, but especially afterwards, during that period when there were no standard bit operators in the language for a portable workaround. Even now, with those bit operators, it would still be simpler to just have a bit string indexed by the integer i.

2 Likes

Well, it seems like people have been using the sign bit for decades as a way to do it (when only positive numbers are involved of course): wanna flag some stuff? Set it positive/negative. Want to pass that property after some operation? then do result = sign(f(x),x). It’s the same way as having a 1-bit flag in the 32-bit representation that returns a true/false condition.

1 Like

Don’t forget what we originally wanted here was checking if AAA(i)==flag. In the general case, this won’t happen with real numbers, hence the inequality test. I don’t see why there is problem if AAA(i)> 2 orders of magnitude. If AAA is, say, 10000 and flag=99 the test obviously returns .false. and that’s exactly what we wanted in this case. if AAA(i) happens to be exactly equal to flag (or not exactly equal, but very slightly different, as when you manually set AAA=99. but you actually get 98.9999999) then the left hand side of the inequality test is zero or a very-very small positive number. In both cases, inequality test returns .true. which, again, is what we wanted to return.

Using epsilon(flag) instead of epsilon(AAA(i)) doesn’t make much sense. Both 'flag` and AAA should be reals of the same kind. If not, the compiler will promote one of them anyway (issuing a warning as well with the proper compilation flags, clearly telling you this is wrong, they should be of the same kind).

Furthermore, I never said epsilon is the crucial part; of course we can modify the right-hand side of the inequality, depending on what are the values stored in AAA. There was a time I had to deal with matrices holding values in the [1e10,1e12] range (and scaling down was not a plausible option). In such cases, comparing to epsilon(AAA) is obviously overkill (but wil still work). We could as well modify the right-hand side as you recommended.

The point is, I was looking for a generic solution.

1 Like

Yes, if AAA and FLAG are the same kind, then EPSILON() will return the same value, so it doesn’t matter which argument is used. In fact, the value is probably determined by the compiler at compile time, not at run time, because the value of the argument is irrelevant. But when FLAG>2.0, then the difference cannot be smaller than 2*EPSILON except for exact equality. If FLAG were 10000, then the differences cannot be smaller than about 10000*EPSILON.

In your example of numbers in the 1e10 range, the raw EPSILON() test will be looking for differences that are less than 1.19e-7 (in ieee 32-bit binary). The nearest number that differs from 1e10 will have a difference of about 1.19e+3 (=1e10*EPSILON) for that kind. There are no floating point numbers near to 1e10 that differ by 1.19e-7 or less except for exact equality (where the difference is zero). So the comparison against the unscaled EPSILON() value is exactly the same as the original test for equality for numbers in that range. Nothing is accomplished by the more complicated looking code. On the other hand, suppose your numbers were in 1e-10 range. Now every comparison of the numbers in that range will pass the test, even those with slightly different exponents. For example (1e-10-1e-11) will satisfy the equality test even though they have no significant digits at all in common. That probably isn’t what is wanted in the test for near equality to produce; it could be, but probably not.

On the other hand, if AAA and FLAG have different kinds, then of course it does make a difference which you use in the argument, and it is very likely to be a mistake to use the wrong one. For example, if the code is doing 64-bit floating point calculations, but comparing to the 32-bit EPSILON value, then the code is likely to get false positive tests for near equality. Or if doing 32-bit floating point calculations, but the comparison is to the 64-bit EPSILON, then it is likely to get false negatives.

As I stated above, when comparing to -99.0, it is possible that the programmer wants anything within 0.5 to test for equality. That is, he wants to distinguish FLAG=-99.0 from -98.0 and -97.0, but that coarse test is all that is required. That kind of test has nothing to do with EPSILON.

Here is some code to experiment with:

program nearestx
   implicit none
   real :: x
   x = 1e10
   write(*,*) 'epsilon(x)=', epsilon(x)
   write(*,*) x-nearest(x,-1.0), nearest(x,1.0)-x
   write(*,'(3z10)') nearest(x,-1.0), x, nearest(x,1.0)
end program nearestx

 epsilon(x)=   1.19209290E-07
   1024.00000       1024.00000    
  501502F8  501502F9  501502FA

Note that the actual differences, which are powers of 2 for large numbers like this, are slightly different than the X*EPSILON(X) estimate. It should be clear from the hex values why that happens.