Loop variable reaching integer `huge` causes infinite loop

Apologies as this may have been discussed before.

Consider the following tiny program:

program test_huge
    use iso_fortran_env, only: int8
    integer(int8) :: i
    do i=-huge(i),huge(i)
       print "('i=',i10,' b=',b8.8)", i,i
    end do
end
~                                                      

With gfortran 12.2 on Mac, this causes an infinite loop that never ends. I guess there is a problem with huge(i)=127, because if use the same bounds with an implicit loop, I get:

./test_huge.f90:3:62:

    3 |         integer(int8), parameter :: a(*) = [(i,i=-huge(i),huge(i))]
      |                                                              1
Warning: DO loop at (1) is undefined as it overflows [-Wundefined-do-loop]

Shouldnā€™t a 1-byte signed integer value be in the [-128, 127] range?

EDIT: godbolt shows no issues with other compilers, so this may just be GNU-specific.

1 Like

Yes, going up to huge() in a loop can cause problems, depending on how the compiler manages the loop bounds. See this thread:

3 Likes

There have been several discussions about this in the Fortran newsgroup and other forums. The counter on a DO is incremented before being tested so at the end of the loop the value would be 128, causing an overflow.

The longest related discussion is probably
https://groups.google.com/g/comp.lang.fortran/c/14EQelifFHs/m/gzXCDUhCBwAJ

2 Likes

Note that -huge(i) will not return the smallest possible integer:

print*, -huge(1)    ! -> -2147483647
print*, -huge(1)-1  ! -> -2147483648
print*, -huge(1)-2  ! -> Error: Arithmetic overflow
1 Like

Ah @urbanjost @vmagnin @ivanpribec,

thank you for your answers - I had missed that discussion with the Forum search.

The reason I found this issue is that I wanted a parameter array that tabulated a value for each possible integer(int8) value.

So is it safe to say that the only deterministic way to do it is to use a loop variable thatā€™s got a larger storage size? Something like:

integer(int16) :: i
integer(int8) :: j

do i=-int(huge(j),kind=int16),int(huge(j), kind=int16)
   ! do something
end do

There are some additional subtleties here as well. The standardā€™s description of loop counters (and discussions about it seemed to indicate some compilers implement it this way), is that the number of iterations is actually calculated and used as the ā€œcounterā€, i.e. the loop index and bounds are not actually used in the test for whether the loop is finished. This means that any loops that would execute more than huge(i) times are not possible. I.e.

do i = -huge(i)/2, huge(i)/2 + 1
   ! do stuff
end do

might still be an infinite loop depending on compiler implementation.

1 Like

Most compilers I tested donā€™t do that if I recall correctly. It was only one older version of nag which seemed to calculate the number of iterations first.

What that circa means in practice is, ā€œall loops in Fortran should never involve more than half the unsigned integer capacity of that variableā€™s storage sizeā€. wow, thatā€™s a pretty steep limitation (Imagine looping int32 pointers, that only gives you access to ~2e9 elements instead of ~4e9)

Hereā€™s the relevant sections of the standard:

11.1.7.4 Execution of a DO construct
11.1.7.4.1. Loop initiation
When the DO statement is executed, the DO construct becomes active. If loop-control is
[ , ] do-variable = scalar-int-expr1 , scalar-int-expr2 [ , scalar-int-expr3 ]
the following steps are performed in sequence.

  1. The initial parameter m1 , the terminal parameter m2 , and the incrementation parameter m3 are of type integer with the same kind type parameter as the do-variable. Their values are established by evaluating scalar-int-expr1 , scalar-int-expr2 , and scalar-int-expr3 , respectively, including, if necessary, conversion to the kind type parameter of the do-variable according to the rules for numeric conversion (Table 10.9). If scalar-int-expr3 does not appear, m3 has the value 1. The value of m3 shall not be zero.
  2. The DO variable becomes defined with the value of the initial parameter m1.
  3. The iteration count is established and is the value of the expression (m2 āˆ’ m1 + m3 )/m3 , unless that value is negative, in which case the iteration count is 0.

If loop-control is omitted, no iteration count is calculated. The effect is as if a large positive iteration count, impossible to decrement to zero, were established. If loop-control is [ , ] WHILE (scalar-logical-expr), the effect is as if loop-control were omitted and the following statement inserted as the first statement of the block:

IF (.NOT. (scalar- logical-expr )) EXIT

11.1.7.4.3 The execution cycle
The execution cycle of a DO construct that is not a DO CONCURRENT construct consists of the following steps performed in sequence repeatedly until termination.

  1. The iteration count, if any, is tested. If it is zero, the loop terminates and the DO construct becomes inactive. If loop-control is [ , ] WHILE (scalar-logical-expr), the scalar-logical-expr is evaluated; if the value of this expression is false, the loop terminates and the DO construct becomes inactive.
  2. The block of the loop is executed.
  3. The iteration count, if any, is decremented by one. The DO variable, if any, is incremented by the value of the incrementation parameter m3.

Except for the incrementation of the DO variable that occurs in step (3), the DO variable shall neither be redefined nor become undefined while the DO construct is active.

The conclusion of the conversation that I remember from a committee meeting (IIRC) was that it doesnā€™t say how the iteration count is stored (i.e. it could be a different kind), so it was recommended that processors (compilers) simply use a storage mechanism for it that would be sufficient to store and count down to zero from huge(i)*2 + 1. Iā€™m guessing most of them do now.

1 Like

Thank you for the quotes, the standard is very clear and your explanation 100% matches that. Regarding your last comment, thatā€™s probably where gfortran is still lagging behind (other compilers, latest versions, run that loop just fine on godbolt).

Is this what you are looking for?

   use, intrinsic :: iso_fortran_env, only : I1 => int8, I4 => int32
   integer(I4), parameter :: n(*) = [( i, integer(I4) :: i = -huge(1_i1), huge(1_i1) )]
   print *, n
end
Program response
C:\temp>ifx /standard-semantics p.f90
Intel(R) Fortran Compiler for applications running on Intel(R) 64, Version 2023.0.0 Build 20221201
Copyright (C) 1985-2022 Intel Corporation. All rights reserved.

Microsoft (R) Incremental Linker Version 14.34.31937.0
Copyright (C) Microsoft Corporation.  All rights reserved.

-out:p.exe
-subsystem:console
p.obj

C:\temp>p.exe
 -127 -126 -125 -124 -123 -122
 -121 -120 -119 -118 -117 -116
 -115 -114 -113 -112 -111 -110
 -109 -108 -107 -106 -105 -104
 -103 -102 -101 -100 -99 -98
 -97 -96 -95 -94 -93 -92
 -91 -90 -89 -88 -87 -86
 -85 -84 -83 -82 -81 -80
 -79 -78 -77 -76 -75 -74
 -73 -72 -71 -70 -69 -68
 -67 -66 -65 -64 -63 -62
 -61 -60 -59 -58 -57 -56
 -55 -54 -53 -52 -51 -50
 -49 -48 -47 -46 -45 -44
 -43 -42 -41 -40 -39 -38
 -37 -36 -35 -34 -33 -32
 -31 -30 -29 -28 -27 -26
 -25 -24 -23 -22 -21 -20
 -19 -18 -17 -16 -15 -14
 -13 -12 -11 -10 -9 -8
 -7 -6 -5 -4 -3 -2
 -1 0 1 2 3 4
 5 6 7 8 9 10
 11 12 13 14 15 16
 17 18 19 20 21 22
 23 24 25 26 27 28
 29 30 31 32 33 34
 35 36 37 38 39 40
 41 42 43 44 45 46
 47 48 49 50 51 52
 53 54 55 56 57 58
 59 60 61 62 63 64
 65 66 67 68 69 70
 71 72 73 74 75 76
 77 78 79 80 81 82
 83 84 85 86 87 88
 89 90 91 92 93 94
 95 96 97 98 99 100
 101 102 103 104 105 106
 107 108 109 110 111 112
 113 114 115 116 117 118
 119 120 121 122 123 124
 125 126 127

C:\temp>

1 Like

I think the quoted section 11.1.7.4 above explains how the trip count is evaluated and incremented. If you want to write portable code, then you should keep your code consistent with that, even if some compilers allow you to violate those constraints.

Another problem with the original code is the value of the index after the loop. Any do loop of the form do i = m1, m2, m3 where m2=huge(i) cannot be standard conforming. Consider a simple example

do i = huge(i)-1, huge(i)

which should only execute twice. This is clearly not a problem with the trip count value. After loop execution, the vaue of i should be huge(i)+1, which cannot occur. As a practical matter, the loop may in fact execute the correct number of times, and if the value of i is never referenced afterwards, then the violation may be benign. But in a technical sense, that loop violates the standard.

You give one possible solution to the problem by using do loop parameters with a different KIND. Another workaround is something like

i = -huge(i)
do
   ! loop code here
   if ( i == huge(i) ) exit
   i = i + 1
enddo

You might be able to do the same thing with a do while loop, but it takes some care to stay within the range of the model integers. This likely will not to be as efficient as a do loop, but that is why do loops are defined with trip counts, to make them very efficient to execute in the first place.

Another related question comes to mind. Suppose one had an array expression

...array(-huge(i):huge(i))...

Is that array range expression required to work, or does it have the same kind of index range restrictions as the equivalent do loop?

1 Like

That requirement is a little ambiguous. For an 8-bit integer, what are the values? The model numbers have the range -huge(i) to huge(i). But a twos-complement machine has a slightly larger range, -huge(i)-1 to huge(i). A fortran compiler is allowed to give you access to that extra value, but I think it is not required to do so.

1 Like

Yep thatā€™s what I reverted to using. Nice in-place declaration of the loop variable!

Iā€™ll go off on a tangent here, in C++ there is a lot of confusion among programmers on what type to chose for loop variables, signed or unsigned:

for (int i = 0; i < N; ++i) { /* ... */ }
// vs
for (size_t i = 0; i < v.size(); ++i) { /* ... */ }

The second form is often chosen because the sizes returned by STL containers are unsigned and compilers will issue a warning (false positive) about inconsistent types otherwise. In many cases range-based for loops, iterators, or STL algorithms can be used to get rid of the explicit loops entirely.

To keep the language simple, proposals from Bjarne Stroustrup suggest that signed integers (int) are the most natural loop variables (follow the links in this older post). In Fortran, where arrays can have negative lower bounds, they make sense too.

The smaller range (1 lost bit) feels like youā€™ve lost a lot of space, but if youā€™re problem is already close to 4 billion items, you might as well use a larger type and save yourself the trouble later if/when the size of your data increases.

I think it would be good if compilers could warn about this, perhaps at runtime with some debugging options on, since this behavior is indeed surprising.

1 Like

With gfortran use -Wall. Example with the original example in this thread:

gfc looper.f90 gfc -Wall looper.f90
looper.f90:4:25:

4 |     do i=-huge(i),huge(i)
  |                         1

Warning: DO loop at (1) is undefined as it overflows [-Wundefined-do-loop]

What is surprising is how many people do not know of or use -Wall with gfortran.

2 Likes

I agree if all the debug switches were on by default it would be great, and as much work that goes into adding such checks to a compiler they are very underused. Perhaps if a de-facto standard emerged (like -L, -D, -I switches) like -debug that turned on all the basic debug switches it would help users (new users in particular) to take advantage of features like -Wall on gfortran.

Note that fpm(1) does exactly that, defaulting to debug mode on all the supported compilers.

In this case even getting the warning might not make it obvious why it is happening and the subtle details about how a work-around might be compiler-dependent as discussed here though. In particular, I would guess the majority of new users would have little idea of why there was an overflow even if they got that message; so I still like seeing discussions like this (which might even result in a language change) even though I was already familiar with it.

But is is surprising when users encounter bugs that developers do not know there were switches that would have detected the errors during development, and how little literature there is to explain when and why to use all those compiler switches. If you want to make a mini-book for fortran.lang describing that I think that would be great. In the meantime, fpm(1) seems the closest there is to something that defaults to using all those features to auto-detect common errors.

If you look at other compilers it seems like they run this edge case where the loop variable overflows only after the loop is run, with no warnings. Would it make sense to contribute a patch to gfortran to handle it too?

Are you talking about the index overflow or the trip count computation?

If it is the index overflow, then this is definitely a code error. If that loop index, which overflows at loop exit, is referenced later, it would almost certainly result in a silent and difficult to debug error. So I think a better approach is to warn the programmer, early at compile time if possible, and at run time otherwise, of the error, so it can be addressed.

The question then is when the programmer knows that the overflow is benign, then what options should be available to him? Should there be a way to designate that this overflow is benign and to ignore it, say with a compiler directive or something equivalent at that point in the code?

If it is the trip count computation, then this has other implications. If the language is required to test every trip count computation at run time for possible overflow, then that could adversely impact the computational efficiency. It might be nice to have compiler options to enable these tests, while debugging for example, but perhaps the default should be to not have them.