Why are parens required around format strings?

While reading a thread about the G format descriptor, I found this comment:

Can someone explain why this would be hard to change? Usually the concern is backward compatibility. But if the surrounding parentheses were made optional, wouldn’t all existing valid format statements remain valid?

Incidentally, what tag should I apply to a post like this, a general question about the language? It’s not a request for Help, not a Language Enhancement, etc.

3 Likes

It seems like a language enhancement to me. That’s a good idea; the extra parentheses seem superfluous to me as well, and I think about it every time I type them.

6 Likes

AFAIK the outermost parentheses represent the looping behaviour. When you run out of formatters, e.g. in print "(i0)", 1, 2, it prints 1 on one line and then, because there are no more formatters, the last pair of parentheses is repeated as long as needed, adding a newline and 2.
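As a minimal sketch of that looping behaviour (the program name is just an illustration, not something from this thread), the following contrasts a format that runs out of descriptors with one that does not:

```fortran
program reversion_demo
  implicit none
  ! One edit descriptor, two output items: the format is exhausted after
  ! the first item, so a new record is started and the format is reused.
  ! Prints 1 and 2 on separate lines.
  print "(i0)", 1, 2
  ! Two descriptors, two items: both values fit on a single record.
  ! Prints "1 2" on one line.
  print "(i0, 1x, i0)", 1, 2
end program reversion_demo
```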

If the outer parentheses were optional and left out, would this mean that if there are no other parentheses, this loop doesn’t happen?

Besides, you can concatenate format strings, since additional parentheses inside don’t interfere, e.g.,

character(*), parameter :: fmt_i = "(i0)"
character(*), parameter :: fmt_g = "(g0)"
print fmt_i, 42
print fmt_g, 3.14
print "(" // fmt_i // ", x, " // fmt_g // ")", 42, 3.14

Admittedly, you have to manually add the outer parentheses, but IMHO this isn’t that much of a problem. Most of the time you have to concatenate something in between anyway.

3 Likes

Adding the outer parentheses to allow concatenation did occur to me, but concatenation wasn’t really my interest in this question. I was more wondering why any parentheses are necessary in the first place. What would fundamentally prevent a compiler from performing the looping behavior if there were no parentheses, say, by adding implied parentheses to the whole string? To me the outer parens just seem like visual clutter, and one more opportunity for errors. And since it’s a format string, it’s often annoyingly a runtime error instead of compile time.
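To make the "implied parentheses" idea concrete, here is a rough user-level approximation in today's Fortran. The module fmt_helpers and the function paren are hypothetical names made up for this sketch; nothing like them exists in the standard, and this only shows what a compiler would effectively have to do, not how any compiler does it:

```fortran
module fmt_helpers
  implicit none
contains
  ! Hypothetical helper: wraps a bare edit-descriptor list in the
  ! outer parentheses that the language currently requires.
  pure function paren(body) result(fmt)
    character(*), intent(in) :: body
    character(:), allocatable :: fmt
    fmt = "(" // body // ")"
  end function paren
end module fmt_helpers

program implied_parens
  use fmt_helpers, only: paren
  implicit none
  ! The format is an ordinary character expression, so a function
  ! reference is allowed here.
  print paren("i0, 1x, g0"), 42, 3.14
  ! The looping still happens: 1 and 2 are printed on separate lines.
  print paren("i0"), 1, 2
end program implied_parens
```

Since the format is built at run time, any mistake in it is still only caught at run time, which is the same drawback mentioned above.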

IMHO Fortran already has too much “implied/implicit behaviour”. I would therefore be in favour of looping format strings only if the parentheses are present.
A decision would then have to be made whether to truncate the output or raise an error (at compile time or at runtime) if there are more output values than formatters.

@certik how costly would it be to prototype this in LFortran?

1 Like

Didn’t the IBM G and H Fortran 66 compilers have format strings programmable at runtime? The string was in an integer array, as I recall. Then the final paren was the only indication the runtime code had for the end of the format string.

Yes. In Fortran 66, the format in a read/write statement could be either a statement label (which with some compilers could also be an ASSIGNed statement label) or an array. With an array you would use Hollerith constants to do something like:

  INTEGER IFMT(5)
  DATA IFMT /4H(12H,4HHELL, 4HO WO, 4HRLD!, 1H)/
  ...
  WRITE (6, IFMT)

So yes, the parentheses at the beginning and end helped the run-time library identify the bounds in memory of the format spec.

This will only work if there is no looping over the format statement. If there is looping, then it will be the last of those inner format strings that will get looped over rather than the whole format string being looped over. To give a simple example, the write statement

write(*,'(a,(2i2))') 'a=', a(1:5)

would result in

a= 1 2
 3 4
 5

because it is the inner (2i2) that is looped over rather than the outer (). So sometimes the parentheses do matter.
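For comparison, here is a small self-contained sketch of the same effect. The program name and the array values 1 to 5 are my own assumptions, and the label is written as a string literal inside the format so that both variants are valid on every pass through the format:

```fortran
program reversion_inner
  implicit none
  integer, parameter :: a(5) = [1, 2, 3, 4, 5]

  ! No nested group: the whole format is reused, so the label repeats.
  ! Output:
  ! a= 1 2
  ! a= 3 4
  ! a= 5
  write(*, '("a=",2i2)') a

  ! Nested group: only the inner (2i2) is reused after the first record.
  ! Output:
  ! a= 1 2
  !  3 4
  !  5
  write(*, '("a=",(2i2))') a
end program reversion_inner
```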

One could also read that array from a file, and if so, then the outer parentheses were required then too. Another way to specify the array on many compilers was with ENCODE/DECODE statements, which were replaced in f77 with standard internal read/write statements.

Although there were a number of things Fortran 77 should have had but didn’t, the character data type alone, and the associated deletion of all the Hollerith madness, made Fortran 77 a massive improvement over Fortran 66.

I guess it was the same for the Unix guys developing C, after finding B (which, like BCPL, only supported integers and addresses) inadequate.

You’re right, those parentheses do matter!
But it doesn’t make a big difference to wrap the format string in additional outer parentheses:

print "((" // fmt_str // "))", arr

The parentheses are needed, of course, in the FORMAT statement. As has been discussed, they were there to prevent running off the end of a format contained in an array. While I could imagine adding the option to not provide the parens in an I/O control list, that would need some tweaking of the rules about format reversion (the “looping” mentioned here; see Doctor Fortran in “Revert! Revert! The End (of the format) is Nigh!” at stevelionel.com).

My view is that this change would be more trouble than it is worth. It does not add any functionality and is therefore “syntactic sugar”, something we generally try to avoid, unless the benefit seems clearer than it does here.

I had forgotten about this, and it does seem to be a complication to simply removing the enclosing parens as a requirement.

Thanks to everyone for the replies.