Minus not being part of the definition of a literal number - a thought

Some while ago (I could not find the thread), @sblionel pointed out that the minus sign is not part of the definition of a literal number as used in the standard. I just realised that this fact makes a practical difference, and that this difference might be construed as the reason for the definition:

Consider the formula -2.0**2. By not making the minus sign a part of the literal, the value will be -(2.0**2) = -4.0, and that is how I would understand it from mathematics. If the minus sign had been part of the literal, then it would have been (-2.0)**2 = +4.0, quite contrary to my intuition. For languages like C that do not have an exponentiation operation, this subtlety in the definition of literals does not play a role.
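To make the two readings concrete, here is a minimal sketch; the program and variable names are made up for illustration. Under the standard's precedence rules, `a` and `b` should both come out as -4.0 and `c` as +4.0.

```fortran
program unary_minus_precedence
   implicit none
   real :: a, b, c

   a = -2.0**2      ! ** binds tighter than unary minus: -(2.0**2)
   b = -(2.0**2)    ! the same expression with the grouping spelled out
   c = (-2.0)**2    ! parentheses force the negation to happen first

   print *, a, b, c
end program unary_minus_precedence
```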


I think you meant -2.0**2?

That is what he typed, but I added the backticks indicating a formula to make it display correctly.

Thanks, I did not realise that asterisks are seen as formatting characters.

Some Fortran compilers admit, as an extension to standard Fortran, the use of two adjacent arithmetic operators. The current Intel compiler, IFX, will compile the following program, and you will see warnings about the extension only if you specify an option such as /stand:f95.

program nonUnary
   implicit none
   integer i,j,k,m
   real x,y,z
   i = 3
   j = 2+-i*i
   k = 2+-3*-3
   m = 2+-i**2
   print *,i,j,k,m
   x = 2.0**-3
   y = 2.0**-i
   z = 1.0+-x**-3
   print *,x,y,z
end

Not being aware that this extension is active can make for interesting debugging sessions.
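For comparison, here is a sketch of how the same program might be written in standard-conforming form, with every signed operand wrapped in parentheses. The grouping shown is only my assumption about what the programmer intended; it is not a statement about how IFX or any other compiler resolves the adjacent operators, and the program name is made up.

```fortran
program unaryWithParens
   implicit none
   integer :: i, j, k, m
   real :: x, y, z

   i = 3
   j = 2 + (-(i*i))         ! assumed reading of 2+-i*i
   k = 2 + ((-3)*(-3))      ! assumed reading of 2+-3*-3
   m = 2 + (-(i**2))        ! assumed reading of 2+-i**2
   print *, i, j, k, m

   x = 2.0**(-3)            ! a signed exponent needs parentheses
   y = 2.0**(-i)
   z = 1.0 + (-(x**(-3)))   ! assumed reading of 1.0+-x**-3
   print *, x, y, z
end program unaryWithParens
```

With the parentheses in place there is nothing left for a compiler to interpret, so a standards check such as /stand:f95 should have nothing to flag in these expressions.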

In C, it would be without ambiguity:

pow(-2.0, 2)

or

-pow(2.0, 2)

This should be part of the Fortran wringer tests: klausler/fortran-wringer-tests on GitHub ("A collection of non-portable Fortran usage, standard-conformant or otherwise").

One programming rule I learned when I took my first Fortran class over 50 years ago: when in doubt, parentheses are your friends.


Done, thanks; please see newly added file.

I think this is a separate issue. Even if literal constants were allowed to have a leading sign, there could still be a unary minus operator (which negates what follows) and a binary minus operator (subtraction of the two operands). The semantics of an expression like -2.0**2 could still be defined to be -(2.0**2)=-4.0, so all that could be the same. The precedence rules might need to be modified in a few other places too, but that would still be possible.

As mentioned above, one could always use parentheses to make the programmer’s intentions clear, -(2.0**2) or (-2.0)**2 as appropriate. Unlike some other languages, Fortran has always been required to honor parentheses.

I guess there are many corner cases where these things get hard to generalize. I kind of get your stance here, but one could equally be surprised by:

a = 2 -2**2
b = -2**2
c = 2 - b
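Wrapped into a complete program, the three statements can be checked directly. This is only a sketch: the integer declarations and the program name are my assumptions, since the snippet above does not declare anything. Under the standard's rules I would expect a = -2, b = -4 and c = 6.

```fortran
program minus_surprise
   implicit none
   integer :: a, b, c

   a = 2 -2**2    ! binary minus: 2 - (2**2) = -2
   b = -2**2      ! unary minus applied after the power: -(2**2) = -4
   c = 2 - b      ! 2 - (-4) = 6

   print *, a, b, c
end program minus_surprise
```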

Other languages have had the same discussion and settled on their own interpretations, e.g. the Nim language:

The fact that the unary minus - in a number literal like -1 is considered to be part of the literal is a late addition to the language. The rationale is that an expression -128'i8 should be valid and without this special case, this would be impossible – 128 is not a valid int8 value, only -128 is.

For the unary_minus rule there are further restrictions that are not covered in the formal grammar. For - to be part of the number literal the immediately preceding character has to be in the set {' ', '\t', '\n', '\r', ',', ';', '(', '[', '{'}. This set was designed to cover most cases in a natural manner.

In the following examples, -1 is a single token:

echo -1
echo(-1)
echo [-1]
echo 3,-1
"abc";-1

In the following examples, -1 is parsed as two separate tokens (as - 1):

echo x-1
echo (int)-1
echo [a]-1
"abc"-1

The suffix starting with an apostrophe ('\'') is called a type suffix. Literals without a type suffix are of an integer type unless the literal contains a dot or E|e in which case it is of type float. This integer type is int if the literal is in the range low(int32)..high(int32), otherwise it is int64. For notational convenience, the apostrophe of a type suffix is optional if it is not ambiguous (only hexadecimal floating-point literals with a type suffix can be ambiguous).

Taken from here: Nim Manual