I started following the comp.lang.fortran newsgroup in the early 1990s, when f77 was still the dominant dialect and f90 compilers were not yet common, and this issue was discussed often even then. There was a newsgroup FAQ document, and I think this was one of the topics covered there too. I remember arguments on both sides that seemed to make sense, but I think the final conclusion was that the fortran standard (f77 at that time) allowed the compiler maximum flexibility to do it either way. This was more or less explicit when the issue involved function evaluations within a single expression, such as the `foo()+foo()` in your example. Another numerical example is an expression involving `n*foo()` when `n` is either a literal zero or a variable with the value zero. I think things were a little less certain when the function evaluations were not in a single expression but spread over several statements; there the issue was how much “dead code” the compiler was allowed to recognize and eliminate. In addition to numerical examples, there are also some common situations involving logical expressions with function references.
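To make the ambiguity concrete, here is a small Python sketch (illustration only; `foo` is a hypothetical function, and Python itself always evaluates every call) of a function whose side effect makes the number of evaluations observable. A Fortran compiler would be free to fold `foo()+foo()` into `2*foo()` and call the function once, or to skip the call in `n*foo()` entirely when `n` is zero, and the side effect would then differ:

```python
# Hypothetical function whose side effect (a call counter) makes the
# number of evaluations observable. Python evaluates every call; a
# Fortran compiler folding foo()+foo() into 2*foo() would not.
calls = 0

def foo():
    """Return a constant, but count how many times we were called."""
    global calls
    calls += 1
    return 1.0

total = foo() + foo()   # what the programmer wrote
print(calls)            # 2 here; a folding compiler could legally make this 1

calls = 0
n = 0
product = n * foo()     # mathematically zero regardless of foo()
print(calls)            # 1 here; a compiler could skip the call entirely
```

The point is that both behaviors are defensible: the result `total` and `product` are the same either way, but any side effect of `foo` is not.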
There is also the issue of “mathematical” equivalence, which is a little unclear in some of these cases. Consider the example given above with a uniform random number function `ran()`. Is `2*ran()` mathematically equivalent to `ran()+ran()`? `2*ran()` produces a uniform random number in the range `0. <= x < 2.` However, `ran()+ran()` produces a number in that same range but with a triangular distribution. [Think of the distribution of values produced from two dice.] Is that difference in distributions a “mathematical” or a “numerical” one?
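A quick simulation makes the difference visible (a Python sketch standing in for the Fortran `ran()`; the names and sample size are arbitrary). Both expressions have mean 1, but `2*ran()` has variance 4/12 = 1/3 while `ran()+ran()` has variance 2/12 = 1/6, which is the triangular distribution showing up:

```python
# Simulate 2*ran() (uniform on [0,2)) versus ran()+ran() (triangular
# on [0,2), like the sum of two dice). Means agree; variances do not.
import random

random.seed(12345)          # fixed seed so the comparison is repeatable
N = 200_000

doubled = [2 * random.random() for _ in range(N)]
summed = [random.random() + random.random() for _ in range(N)]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

print(variance(doubled))    # close to 1/3, the uniform case
print(variance(summed))     # close to 1/6, the triangular case
```

So the two expressions agree as real-valued functions of "a uniform random number" only if one pretends the two calls return the same value, which is exactly what a folding compiler would be assuming.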
Often, the difference isn’t so much the function result, but the side effects (as in some of the examples in this discussion).
One thing I remember about those discussions was that it was not going to be possible to please everyone, or even to please one programmer all the time. Sometimes you want the functions to be always evaluated (for their side effects, etc.), and sometimes you want the compiler to optimize away the unnecessary functions and to produce efficient code.
My personal takeaway from this is: if you always want the side effects, then use a subroutine, not a function; and if you want the absolute optimal code, then hand-optimize it yourself rather than expecting the compiler to read your mind.
I think this is probably correct. However, only the semantics of the instruction's result were standardized; the MERGE intrinsic is not required to emulate the Cray hardware exactly. Thus a compiler is free either to evaluate all the arguments and then mask, or to use the mask to evaluate the arguments conditionally, depending on what hardware is available, what optimization options are in effect, and so on.
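The two strategies can be sketched in Python (the names `merge_eager` and `merge_lazy` are hypothetical, and this is scalar rather than vector code; it only models the evaluation order, not Fortran's actual MERGE). The classic place the difference bites is a guarded division like `MERGE(1.0/y, 0.0, y /= 0.0)`:

```python
# Two evaluation strategies a compiler might choose for a merge of the
# form merge(tsource, fsource, mask). Names are illustrative only.

def merge_eager(tsource, fsource, mask):
    # Both arguments arrive already evaluated, as in an
    # evaluate-everything-then-mask strategy; any failure in either
    # argument has already happened before we get here.
    return tsource if mask else fsource

def merge_lazy(tsource, fsource, mask):
    # Arguments are passed unevaluated (as thunks) and only the
    # selected one is computed, as in a branch-based strategy.
    return tsource() if mask else fsource()

y = 0.0

# Evaluate-then-mask: 1.0/y is computed even though the mask is False.
try:
    result = merge_eager(1.0 / y, 0.0, y != 0.0)
except ZeroDivisionError:
    result = "raised"

# Conditional evaluation: the guarded expression is never computed.
safe = merge_lazy(lambda: 1.0 / y, lambda: 0.0, y != 0.0)
```

Both are faithful to "the result is tsource where the mask is true, fsource elsewhere"; they differ only in whether the unselected argument gets evaluated, which is exactly the latitude described above.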