No reserved keywords: Why?

In contrast to many other languages, Fortran does not have any reserved keywords. Is that decision still considered today as good practice or is it just a legacy from being the first high level language?

Anyways, what are the pros and cons?
I have

  • :white_check_mark: New functionality can be added to the language without breaking backward compatibility (see how this is handled in Rust: Rust 2018 is here… but what is it? - Mozilla Hacks - the Web developer blog)
  • :white_check_mark: Functions (e.g. bessel_yo(X)) can be replaced if one has a faster implementation.
  • :x: Parsing Fortran code must be very difficult
  • :x: Code can become obfuscated:
     program main
     implicit none
    
     print*, if(.true.)            ! 42.000
     print*, 'real is ',real()     ! 'real is god'
    
     contains
    
     function if(question) result(answer)
       logical :: question
       real :: answer
    
       answer = 42.0
     end function
    
     function real() result(answer)
       character(len=3) :: answer
    
       answer = 'god'
     end function
    

One reason for not reserving keywords could be the large amount of intrinsic procedures and subroutines which compensate for the lack of a standard library.

I would be interested in getting your opinion in the poll or, even better, as a response.

Which handling of reserved keywords do you prefer?

  • no reserved keywords
  • only statements (e.g. if, do)
  • statements and intrinsic procedures

0 voters

Note: I don’t think reserving intrinsic features but not statements would be a sensible choice.

2 Likes

IMO, a practical case without reserved words:

type something
  ...
contains
   procedure :: read => read_something
end type something

contains

  subroutine read_something(me,...)
  ...
  ....

 type(something) :: thing
 call  thing%read(...)

2 Likes

I’m afraid that for a a language that is as old as Fortran the answers to many “Why…” questions may be impossible to obtain now.

Yes, the job of a compiler is more difficult when keywords are not reserved. However, Fortran compilers handle this issue in stride. The following fixed form program is an example.

      PROGRAMPAVG;IMPLICITNONE;INTEGERI,N;PARAMETER(N=5);REALX(N),SUM;DO
     1I=1,N;X=2.5*I-0.1;ENDDO;SUM=0.0;DOI=1,N;SUM=SUM+X(I);ENDDO;PRINT*,
     2'mean is ',SUM/N;END

On the other hand, if keywords and function names are reserved, it is difficult for programmers to remember the reserved names unless the list of such names is short. For comparison, see the list of reserved words in Unisys Extended Algol, which has over 500 of those. Cobol, likewise, depending on the version, has about 500 reserved keywords.

A pragmatic approach would be: make up our own list of “reserved” names, and studiously avoid using those names in new Fortran programs that we write. There are ways of checking a program for such names and to change those names to something else. For example, we could change the name SUM in a Fortran 77 program to SUM0 in order to avoid a clash with the FTN90+ intrinsic function sum().

4 Likes

May be compilers can provide a facility to be used with projects that like to have reserved keywords, say a conffg file of some sort? LFortan may have this in mind

2 Likes

A related fortran_proposals thread is Pros and Cons of making keywords reserved.

3 Likes

Yes, I list all the keywords in the proposal that @Beliavsky linked. It’s quite long and some of the words I would like to actually use as variable names such as name, local, value, entry. What is worse, with every new standard revision, more “keywords” get added, thus rendering existing programs incorrect if keywords were reserved.

Yes, it is a major pain to parse. You either have to use a backtracking parser (LFortran, new Flang), or you have to have a very strong lexer that essentially preparses the code to figure out for sure if a given token is a keyword or a variable/function/etc name (I believe you have to essentially know if you are in a subroutine, keep track of if statement nesting etc. all in the lexer; I think NAG does something like this). Both approaches are not ideal and slower than it would otherwise be.

Unfortunately the solution is to either make all keywords reserved or none – if we only do some, that would still require the backtracking parser.

Regarding code style, I do believe one should not use if as a variable name. But using some of the less common keywords such as value as a variable name I think is fine, I believe I’ve done that.

Yes, a compiler can easily warn against this. Literally yesterday I’ve added the following warnings to see if it would make sense:

program expr2
implicit none
integer :: x
x = (2+3)*5
print *, x
if (x .eq. 25) then
    print *, 25
endif
end program

this gives:

$ lfortran examples/expr2.f90 
style suggestion: Use '==' instead of '.eq.'
 --> examples/expr2.f90:6:7
  |
6 | if (x .eq. 25) then
  |       ^^^^ help: write this as '=='

style suggestion: Use 'end if' instead of 'endif'
 --> examples/expr2.f90:8:1
  |
8 | endif
  | ^^^^^ help: write this as 'end if'

In color it looks like this:

You can turn warnings off with --no-warnings. We’ll make it more configurable in the future.

Btw, the endif warning I first did in the tokenizer, if it sees endif to simply warn. And then our tests revealed that this warning was false positive in cases where endif was used as a variable name… Precisely due to the issue in this thread. So I added the warning into the parser. There shouldn’t be any false positives.

Anyway, let me know which other warnings you would like to get.

5 Likes

sorry, I’ve missed it because I was just searching on discourse.

I also use value as a variable name but that breaks syntax highlighting. An LFortran-based language server could fix this.

1 Like

Some of the non-reserved keywords, such as allocate and deallocate, are like subroutines – they perform an action, change their “arguments”, and do not return a value. If a derived type has allocatable components, it is natural to define allocate and deallocate subroutines for it, and I don’t think lines such as

call allocate(dt)
call deallocate(dt)

will confuse the reader. For other keywords such as if, endif, do, and enddo, I don’t see a rationale for the user defining variables or procedures with the same name.

1 Like

We do not have to praise the peculiarities of Fortran, but it is helpful to be aware of them. The following program expands on @Beliavsky’s example of using allocate as a user defined subroutine.

program main
    integer, allocatable :: ia(:)
    !
    call allocate(ia,4)
    print *,ia
    deallocate(ia)
    allocate(ia(3))
    print *,size(ia)

contains

   subroutine allocate(iw,n)
      integer, intent(in) :: n
      integer, allocatable :: iw(:)
      allocate (iw(n))
      iw = [(i*2, i=1,n)]
      return
   end subroutine
end program

It is worth noting that the keyword call serves to disambiguate between the intrinsic verb allocate and the user provided subroutine allocate.

[Caution: Do not see this post as promoting the view that anyone should use keywords in an ambiguous way.

And, as we know from experience, it is too much to expect the syntax highlighter of Discourse to know which instances of “allocate” should not be highlighted.]

3 Likes

We don’t have have to make hypothetical examples. We decided to “overload” open in stdlib:

2 Likes

Thanks. The open function should provide the equivalent of the fopen function in C, in a convenient manner. I followed the link to the specs of the function, and read the description.

This use of the word “specifies” struck me as odd:

" u : Shall be a scalar of type integer that specifies the unit number associated with the file filename ."

As I understand it, the variable u will contain whatever unit number the library chooses to assign to it. That does not give the user the ability to specify the unit number. Before the function call, u is either undefined or contains some previously assigned value, and neither affects the returned value of u.

3 Likes

Indeed, the spec doc is wrong here and that paragraph should be removed (u is not an argument). Do you mind submitting a PR with the fix?

1 Like

The facility per the standard thus far to help with such situations is to provide the INTRINSIC attribute rather than go with keywords:

   blk1: block
      print *, "norm2 of [1,2] = ", norm2( [ 1.0, 2.0 ] )
   end block blk1
   print *
   blk2: block
      intrinsic :: norm2 !<-- Note the attribute
      print *, "norm2 of [1,2] = ", norm2( [ 1.0, 2.0 ] )
   end block blk2
contains
   real function norm2( x )
      real, intent(in) :: x(:)
      real, parameter :: fact = 1000.0
      ! Floor to 3 decimal digits for illustration
      norm2 = floor( sqrt( sum( x*x ) )*fact ) / fact
   end function
end 

norm2 of [1,2] = 2.2360001

norm2 of [1,2] = 2.2360680

2 Likes

For some reason, the fixed-form program example you posted earlier reminds me of the The International Obfuscated C Code Contest (IOCCC). I wonder if that’s also a thing for Fortran :laughing:

1 Like

Apologies for bumping this thread, but I was wondering what is the reasoning behind the following inconsistency.

I seem to be allowed to name variable with type names, for example

real :: real              ! Allowed
type(integer) :: integer  ! Allowed
real = 42.
integer = 37
print *, real, integer
end

bot not with derived-type names:

type :: foo
   integer :: bar
end type
type(foo) :: foo  ! BOOM!
foo = foo(42)
end

Here are the error messages returned by ifort and gfortran:

$ ifort test_foo.f90 
test_foo.f90(4): error #6406: Conflicting attributes or multiple declaration of name.   [FOO]
type(foo) :: foo
-------------^
test_foo.f90(5): error #6478: A type-name must not be used as a variable.   [FOO]
foo = foo(42)
^
compilation aborted for test_foo.f90 (code 1)
$ gfortran test_foo.f90 
test_foo.f90:4:16:

    1 | type :: foo
      |           2     
......
    4 | type(foo) :: foo
      |                1
Error: Symbol ‘foo’ at (1) also declared as a type at (2)
test_foo.f90:5:4:

    5 | foo = foo(42)
      |    1
Error: Derived type ‘foo’ cannot be used as a variable at (1)

Addendum: ironically, in C++ it appears to be the opposite. Intrinsic types are reserved keywords, but it’s perfectly okay to use the same name for your struct/class and your instance.

#include <iostream>

struct Foo {
  int bar;
};

int main(int argc, char const *argv[])
{
  //int int;     // ERROR
  Foo Foo{42};   // OKAY
  
  std::cout << Foo.bar << '\n';
  return 0;
}

In some cases when using C++ templates, the programmer must help the compiler by adding the typename declaration.

2 Likes

I can see why it should be disallowed in a module.
USE MOD_M, ONLY : FOO ! ambiguous

1 Like

I don’t think the semantics are unambiguous in the case of arrays

type :: foo
  integer :: bar
end type foo
type(foo) :: foo(3)
foo(1) = foo(2)
end

Considering that it is also possible to get into the same situation using real (this actually compiles)

type(real) :: real(3)
real(1) = real(2)
end

which shadows the intrinsic to “create” a real value.

1 Like

As oft discussed, FORTRAN and later Fortran developed without the notion of reserved keywords, good or bad.

Instead the language evolution led to the concept of global and local identifiers that then place restrictions on names of identifiers in a program and a scope.

Now the reason(s) as to why are nearly impossible to ascertain, however as things stand, the points to note are:

  1. intrinsic types are neither global or local identifiers. This then permits anomalous declarations such as type(integer) :: integer.
  2. nonintrinsic types and named variables are local identifiers of Class(1). The standard states, “Within its scope, a local identifier of one class shall not be the same as another local identifier of the same class.” Thus is ruled out declarations such as type(foo) :: foo. As mentioned upthread, the semantics of USE association likely played a big role in this during the development of Fortran 90 standard when nonintrinsic types and USE association both got introduced.
2 Likes

In this case, I think the standard disallows the usage per its listing of the intrinsic procedures to be class (1) local identifiers and it includes REAL as an intrinsic procedure.

There is no numbered syntax rule or a constraint requiring a conforming processor to detect and report the violation, so the onus lies on the program author.