Improvement to 'expecting endif'

juhalappi · July 27, 2022, 1:17pm

I have written a precompiler which generates use statements and makes indentations. Recently I added a feature: if there are improperly nested if-thens or dos, the precompiler tells on what line an if-then or do was opened when it is not properly closed with endif/enddo. This would be a useful feature in the compiler itself, especially when the compiler notices at end subroutine that there are if-thens or dos which are not properly closed. In long subroutines the opening if-then or do can be very far from the bottom. These features should not be difficult to implement, since I, as a nonprofessional programmer, could do them in my precompiler.

whuhn · July 27, 2022, 1:48pm

This is harder than you'd think: nested if statements with else blocks leading to ambiguities are one of the standard textbook examples of ambiguous grammars, and that is for well-formed, should-be-valid statements; cf. the Wikipedia article on the dangling else.

juhalappi · July 27, 2022, 5:22pm

As far as I can understand, there is no ambiguity in Fortran's if()then … elseif()then … else … endif syntax or do … enddo syntax (or in the old-fashioned do 100 … 100 continue syntax). My precompiler has two integer vectors iflines() and dolines() and two character*40 lists iflabels() and dolabels(). nif is the number of open ifs and ndo is the number of open dos. If on line 147 there is
if(a.gt.b)then

I set nif=nif+1; iflines(nif)=147; iflabels(nif)='if(a.gt.b)'

When I encounter endif I replace the endif line with
'endif !'//iflabels(nif)(1:len_trim(iflabels(nif)))//' 147'
and update nif: nif=nif-1

Similarly for do and enddo.

In places where gfortran writes 'expecting endif at line xx' or 'expecting enddo at line xx', my precompiler writes the same thing, but it also writes
'there are ', ndo, ' open dos at lines ', dolines(1:ndo)
'there are ', nif, ' open if-thens at lines ', iflines(1:nif)

If the if-thens and dos are mixed up, I understand that it is not possible to tell exactly how they are mixed up, but the above messages help to see it. The comments at endif and enddo telling which if-then or do is being closed are also useful, as they may show that an endif or enddo is closing a different if-then or do than was intended. They are useful even when there are no problems. The generated indentations are useful too, as I am too lazy to put them in place when there is exciting code writing going on.

I cannot understand why the compiler could not also write how many open dos or if-thens there are and on what lines they started. Even if the parser does not need iflines() and dolines(), and nif and ndo, they could be added without any interference with the parsing process.
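The bookkeeping described in this post can be sketched roughly as follows. This is only an illustration, not the actual precompiler: the program name open_blocks, the helper opens_if, the limit maxnest and the embedded sample source are all made up, and the line matching is deliberately naive (lowercase only, no labeled or named dos, no continuation lines).

```fortran
program open_blocks
  implicit none
  integer, parameter :: maxnest = 100          ! assumed nesting limit
  integer :: iflines(maxnest), dolines(maxnest)
  integer :: nif, ndo, i
  character(len=40) :: src(7), line

  ! Hypothetical input with one endif missing.
  src(1) = 'subroutine demo(a, b, n)'
  src(2) = 'do i = 1, n'
  src(3) = 'if (a .gt. b) then'
  src(4) = 'if (b .gt. 0) then'
  src(5) = 'endif'
  src(6) = 'enddo'
  src(7) = 'end subroutine'

  nif = 0
  ndo = 0
  do i = 1, size(src)
    line = adjustl(src(i))
    if (opens_if(line)) then
      nif = nif + 1                  ! sketch: no overflow check against maxnest
      iflines(nif) = i               ! remember where this if-then was opened
    else if (line(1:3) == 'do ') then
      ndo = ndo + 1
      dolines(ndo) = i               ! remember where this do was opened
    else if (line == 'endif' .or. line == 'end if') then
      if (nif > 0) nif = nif - 1     ! closes the most recently opened if-then
    else if (line == 'enddo' .or. line == 'end do') then
      if (ndo > 0) ndo = ndo - 1     ! closes the most recently opened do
    end if
  end do

  ! At the end of the subprogram, report whatever is still open.
  print '(a,i0,a,*(1x,i0))', 'there are ', ndo, ' open dos at lines', dolines(1:ndo)
  print '(a,i0,a,*(1x,i0))', 'there are ', nif, ' open if-thens at lines', iflines(1:nif)

contains

  ! True if the line starts with "if" and ends with "then".
  logical function opens_if(text)
    character(len=*), intent(in) :: text
    integer :: n
    n = len_trim(text)
    opens_if = .false.
    if (n >= 6) opens_if = text(1:2) == 'if' .and. text(n-3:n) == 'then'
  end function opens_if

end program open_blocks
```

With this made-up input the sketch ends with one if-then still open and reports line 3 for it, because the single endif is matched against the most recently opened if-then on line 4. It cannot know which of the two if-thens the programmer meant to leave unclosed, but the reported line narrows down where to look.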


RonShepard · July 27, 2022, 6:56pm

I think this is correct. The “endif” resolves the ambiguity. Other languages resolve it with brackets of some kind, like {}.

On the other hand, I’ve used languages that allowed the programmer to end lines of code, or even code blocks, without the matching right bracket. These were interpreted languages where the language was trying to be “user friendly”. When the end of line or end of file was reached, the language would just add as many } as required to empty the stack.

BTW, that is what you are doing with your arrays and the nif and ndo variables. You are pushing your nesting level information onto a stack.

fortran4r · July 27, 2022, 7:17pm

I have been saying this for a long time, end if/do/… should just be end. If you compare Fortran codes with Julia and Matlab codes, a fair proportion of the verbosity comes from these things after ends. Making them optional will make Fortran codes much more concise and cleaner. This is such a low-hanging fruit to pick, but people here are still against it. It is almost impossible to sell such verbose syntax to younger generations.

oscardssmith · July 27, 2022, 8:47pm

IMO, that’s a really minor issue. Sure fixing it would be an improvement, but Fortran has way bigger issues at the moment (half precision numbers, generics, simple arbitrary precision, meta-programming, string support, better autodiff, better symbolic math, etc). Focusing too much on these cosmetic issues IMO risks overshadowing the bigger and harder issues.

ivanpribec · July 27, 2022, 9:04pm

I’m young (< 30) and it has never bothered me (I first met Fortran at the age of 22, having only a little previous experience in Python). I’m afraid the verbose and clunky syntax is here to stay.

At least the Fortran end statements are not functions, like they are in CMake:

if(<condition>)
  <commands>
elseif(<condition>) # optional block, can be repeated
  <commands>
else()              # optional block
  <commands>
endif()

I was never able to see what those are for; it turns out there is no purpose remaining:

Per legacy, the else() and endif() commands admit an optional <condition> argument. If used, it must be a verbatim repeat of the argument of the opening if command.

MarDie · July 27, 2022, 9:37pm

I also find that the end xxx increases readability. I sometimes even use the labeled form for more clarity. For me, it is in line with "explicit is better than implicit", which is part of the Zen of Python.

But I must admit I can't be considered young.

juhalappi · July 28, 2022, 4:11am

I started programming in the mid-seventies using Fortran 66 and punched cards. Fortran 66 did not have if-then structures, and do loops were always executed at least once. Thus the code was a terrible mess of if()goto's. Fortran 77 was a big improvement. In Fortran 90 the most useful things to me were dynamic allocation and derived data types, which made it possible to have vectors whose components are vectors of different sizes and types.

About verbosity: I recently had to use Matlab, and one thing I didn't like in it was that end did not show what it was ending. A long time ago I used Mathematica quite a lot. Initially I didn't like writing long function names. But I soon learned that it was very nice never to have to look up in the manual (this was before the web) what the shortcut name for this or that function was. If you know a mathematical concept, you know the name of the corresponding function in Mathematica. Verbosity is especially useful when returning to old code. Tomorrow I will start writing more comments as well; unfortunately I am too busy today to start writing them.

Fortran is still a very useful language for writing mathematical algorithms.


martin · July 28, 2022, 10:44am

Verbosity like in end if/else/associate/block/type/etc… statements, with names (type, subroutine etc) if available, is really helpful. In particular if it comes for free. In emacs, I never write more than “end” and then hit TAB.
I would consider it an issue, if fortran did not have this verbosity.

Beliavsky · July 28, 2022, 2:26pm

I use Emacs. Would it be possible to modify the Fortran mode so that pressing TAB after end gave end do ! i, if i is the loop variable?

In VBA you can write

for i = 1 to 5
   ...
next i

I have suggested that Fortran allow

do i=1,5
   ...
end do i

You can currently write

do_i: do i=1,5
   ...
end do do_i

but this is more verbose, and multiple loops over a single variable in the same procedure cannot use this labeling.
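For reference, a minimal sketch of the construct-name form that Fortran already supports (the names outer and inner are made up for illustration). The compiler checks that the name on each end do matches the name on the opening do, and the same names can be used with cycle and exit:

```fortran
program named_loops
  implicit none
  integer :: i, j

  outer: do i = 1, 3
    inner: do j = 1, 3
      ! Skip the rest of the inner loop and continue with the next i.
      if (j > i) cycle outer
      print *, i, j
    end do inner      ! writing "end do outer" here is a compile-time error
  end do outer

end program named_loops
```

Unlike a trailing comment, the name after end do cannot silently get out of sync with the loop nesting.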

RonShepard · July 28, 2022, 5:23pm

In this case of do loops, the verbosity serves a good purpose. I do think it should be optional, but it would be nice to add the loop index and to have the compiler enforce it in the language (i.e. it is not simply ignored as a comment).

I sometimes do this enddo labeling with comments, but the compiler does not enforce consistency, so if the enddo statements get out of sync with the loop nesting, those incorrect comments are worse than having no comment at all.

JohnCampbell · July 29, 2022, 2:41am

I want a dislike !!
I do not think this is a good idea. My question would be end what ?
My impression is that END by itself is limited to being the end of a program/subroutine/function, although I do not use this abbreviation.

My personal programming style is to avoid using END without a following descriptor, so have END FUNCTION function_name. I find this helpful when scanning code.

drikosev · July 29, 2022, 4:27am

Hello,

IMHO, it’s very difficult!

PGI Fortran (and possibly the newer NVIDIA compiler), for example, supported decent error recovery. I don't know about the Intel Fortran compiler.

Another Fortran front-end (dummy parser) that supports error recovery is here:

The BNF grammar of the command 'fcheck' documents part of it in the error production unexpected-end-stmt. Yet, there is additional hand-coded logic. Even so, the result isn't impressive enough to make the author (me) feel satisfied with it.

Regards,
Ev. Drikos

martin · July 29, 2022, 4:59am

This should not be that difficult, provided you know some elisp. The language specific code in f90.el can be complemented. I would directly edit this file (do not forget to compile it or delete the f90.elc). Or check, whether there might be hooks or similar. I am not that proficient with elisp/emacs.

A quick look gives f90-beginning-of-block in f90.el, which might be a good starting point.

FedericoPerini · July 29, 2022, 6:58am

I also do this for each construct. I remember ifort complaining when I put subroutines/functions inside a module, that ended with end instead of end subroutine or end function. I don’t know whether this is a feature of the standard, or it’s just something that ifort was enforcing.

juhalappi · July 29, 2022, 10:10am

My precompiler (dirty), which gives decent error messages for mixed-up if-thens and dos and adds comments telling what is being ended, is available at github.com/juhalappi/J . The main motivation for the precompiler was to generate use statements for variables in modules. I have found it very useful to have a prefix for such variables (I have prefixes j_ jlp_ o1_ o2_ and o3_), so the name of a variable tells whether it is local or not.

My J is a script-based program which has two programming levels, input programming and interpreted programming. Input programming generates text lines from possibly nested include files, using its own loops and control structures. It can handle variable names having indexes in any part of the name. The generated code then goes to the parser, which generates code that is executed immediately or packed into functions.

One feature of my software is a program which generates LaTeX code for the manual from comments in the source files and from an additional text file. This way a function needs to be described only once, in the source file. The same program also writes a script file with which a user can run all the examples in the manual by just typing the name of the example at the prompt.

J can be used for many purposes (e.g. for matrix computations and as an interface to Gnuplot), but the main interest is in the linear programming algorithm, which efficiently utilizes the special structure of forest management planning problems. It would be nice if a professional software developer would look at my solutions. The software still contains many bugs.


Beliavsky · July 29, 2022, 12:13pm

Thanks for sharing your interesting project. There is already a J programming language, an APL descendant. Maybe you could call your project Juha to avoid confusion.

RonShepard · July 29, 2022, 7:56pm

It was part of the standard for a while, but now I think it has been made optional.

The statement “END” by itself originally had only one meaning in fortran up to f77. Then in f77, they added ENDIF (spaces being ignored, of course). Then in f90, they added ENDDO, END SELECT, END TYPE, END MODULE, and maybe some others too, and also required END statements within modules to have FUNCTION or SUBROUTINE as a qualifier. Then later, f2008 maybe, they made that optional in order to facilitate moving legacy subprograms into modules. That worked because “END” by itself still had only that one meaning. So now, if ENDDO, ENDIF, and those other END statements were allowed to not have qualifiers, it would probably lead to inconsistent grammar. I don’t know for sure, but it seems likely. In any case, with everything needing its own type of END statement except for the END of a subprogram, where it is optional, it is easier for humans to keep track of things. Also, I often make mistakes copying blocks of code from one place to another, and when the compiler encounters an ENDIF when it was expecting an ENDDO, it can tell me immediately what is the problem. Without those qualifiers, the compiler could not do that, and in really unlucky cases, it might even end up compiling the code in a way that I did not intend without knowing there was an error.
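A rough sketch of how the qualifiers make this kind of diagnosis possible: a single stack remembers what kind of block is open and where it started, so a mismatched END can be reported together with the opening line. Everything here is illustrative (the names check_nesting, push, pop, kind_stack and line_stack, the sample input, and the choice to pop the stack even on a mismatch); it is not how any particular compiler works.

```fortran
program check_nesting
  implicit none
  integer, parameter :: maxnest = 100        ! assumed nesting limit
  character(len=2) :: kind_stack(maxnest)    ! 'if' or 'do'
  integer :: line_stack(maxnest)             ! line where each block was opened
  integer :: top, i
  character(len=40) :: src(5), line

  ! Hypothetical input: the enddo and endif are swapped.
  src(1) = 'do i = 1, n'
  src(2) = 'if (a .gt. b) then'
  src(3) = 'a = b'
  src(4) = 'enddo'
  src(5) = 'endif'

  top = 0
  do i = 1, size(src)
    line = adjustl(src(i))
    if (line(1:3) == 'do ') then
      call push('do', i)
    else if (line(1:2) == 'if' .and. index(line, 'then') > 0) then
      call push('if', i)
    else if (line == 'enddo' .or. line == 'end do') then
      call pop('do', i)
    else if (line == 'endif' .or. line == 'end if') then
      call pop('if', i)
    end if
  end do

contains

  subroutine push(what, lineno)
    character(len=*), intent(in) :: what
    integer, intent(in) :: lineno
    if (top >= maxnest) stop 'nesting too deep for this sketch'
    top = top + 1
    kind_stack(top) = what
    line_stack(top) = lineno
  end subroutine push

  subroutine pop(what, lineno)
    character(len=*), intent(in) :: what
    integer, intent(in) :: lineno
    if (top == 0) then
      print '(a,i0,3a)', 'line ', lineno, ': end', what, ' with nothing open'
    else if (kind_stack(top) /= what) then
      ! The qualifier on the END statement disagrees with the innermost open block.
      print '(a,i0,5a,i0)', 'line ', lineno, ': found end', what, &
        ', expected end', kind_stack(top), ' for the block opened at line ', line_stack(top)
    end if
    if (top > 0) top = top - 1     ! assumed recovery: close the innermost block anyway
  end subroutine pop

end program check_nesting
```

If every block ended with a bare END, the pop routine would have nothing to compare against, and swapped or misplaced END statements like the ones in this sample would go unnoticed as long as the counts balanced.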

juhalappi · July 30, 2022, 10:20am

Thanks. We must rethink the name, even though my J has been around for quite a long time. Having the same name gets in the way in web searches. This might be a good time to change the name, as I am now trying to get more visibility for the software.
Juha

