Variables with the same names as intrinsic functions, statements, and basic types

Fortran does not have reserved words. Based on a small program it appears that

(1) intrinsic functions cannot be used when variables with the same name are in scope
(2) variables with the names of statements such as PRINT or basic types such as REAL
can be used, and those statements and types are still available even while those variables
are in scope.

I would never name a variable sum, real or print, of course. Could Fortran syntax be simplified if certain words were reserved? I am not advocating that the standard be changed but wonder if a pseudo-Fortran with reserved words and simpler syntax could be created, along with a program to convert it to standard Fortran. If you knew that print, if, endif, do, and enddo were not variable names maybe it be possible to write code such as

do i=1,5
   if mod(i,2)==0
      print i,"is even"

Here is a legal program that uses badly named variables.

module m
implicit none
real function mean(n,x)
integer n
real x(n)
real sum ! here sum is a variable name
integer i
sum = 0.0
do i=1,n
   sum = sum + x(i)
end do
mean = sum/n
! line below is invalid -- here sum is a variable
! that does not take a subscript or argument
! mean = sum(x)/n
end function mean
end module m
program main
use m, only: mean
implicit none
integer, parameter :: n = 3
real   , parameter :: real(n) = [4.0,9.0,16.0]
! real is both a basic type and a variable name
logical, parameter :: print = .true.
! below sum refers to the intrinsic function
if (print) print*,mean(n,real),sum(real)/n
end program main

9.666667 9.666667


Dr Fortran spake thus

We have tools (within fpt) which report Fortran keywords and intrinsic functions and subroutines used for variables. In analysing many millions of lines of code we have found:

Exactly 1 example of a primary keyword used for a variable which caused a problem;
No auxiliary keywords used for variables which caused problems
Several hundreds of intrinsic function names which caused problems when used for variables

The worst cases are intrinsic function names used for arrays - the favourite is INDEX, closely followed by MAX and SUM. The problem here is that every new edition of the Fortran standard adds more intrinsic functions and subroutines, and they should. There is no way to know that a variable name will not become intrinsic in a future standard So I think we have to live with a sub-optimal situation.

I am also in favour of this because it significantly hinders the development of Fortran tools. I would as much as advocate that modifier keywords should be reserved by the standard, although I very well understand the nightmare this would unleash.

Here are my main grievances with non-reserved keywords

1. Syntax highlighting

Syntax highlighting is commonly provided by Textmate files using Oniguruma Regular Expressions, in which one defines scopes that are used to match Fortran expressions. These scopes can then be included to one another to form the final syntax highlighting scope.

It is very important that the scopes’ regexs match only what they are intended and nothing else, poorly written regexs will result in never closing scopes, which depending on the case will lead to a complete failure of the syntax highlighting for the entire Fortran source code.

Moreover, these regexs are static and hence cannot be modified by the extension at launch. Moreover, modifying the syntax files will affect the syntax highlighting of all Fortran code across all projects.

So why is that an issue?

Imagine you have the following code

  ! Normal function
  function fun( arg )
    integer, intent(inout) :: arg
  end function fun

  ! Argument list and argument declaration are incorrectly highlighted
  function fun_with_fun(   function )
    integer, intent(inout) :: function
  end function fun_with_fun

  ! Argument list and argument declaration are incorrectly highlighted
  subroutine sub_with_fun( function )
    integer, intent(inout) :: function
  end subroutine sub_with_fun

As you can see the argument function has been erroneously highlighted in the last two cases. It turns out you need to do 2 things to address this:

  1. while matching for a function's/subroutine's definition, make sure that that definition is not part of an argument list i.e. surrounded by parenthesis.
  2. while matching your argument declarations make sure that type, keywords, ... :: function never triggers the regex for the a Fortran function

In terms of regex both of these are doable with non-fixed width look-behinds, the only problem is that Oniguruma regex does not support that e.g. (?<!::[ ]*)). VS Code, Atom and probably more editors do not necessarily abide by the strict rules of Oniguruma, so realistically this could be implemented and has been part of Modern Fortran’s syntax highlighting solution since version 2.4.0. The only problem; non-fixed width look-behinds have terrible performance for large files and long lines to the point that they will cause your editor to crash, see issue 309 in Modern Fortran, to which currently I don’t have a solution.

All that is to say that if function, subroutine, etc. were reserved keywords none of these would be issues and writing syntax highlighting files for Fortran would become trivial.

2. Language Server implementations

Similarly to syntax highlighting a Language Server needs to parse Fortran code. Source code is parsed into an Abstract Syntax Tree (AST) which is then used to efficiently retrieve information for the original code e.g. the hover signature of a function.

To be able to handle cases like the code snippet above or like this example

program main
integer :: i
integer, allocatable :: sqrt(:)

do i = 1, 5
    sqrt(i) = i**2
end do

print*, sqrt(sqrt) ! this is not the intrinsic sqrt, but an array indexing itself
end program

One has to code to extremely complex routines and/or write spaghetti code which not only is a pain to write but also a nightmare to maintain.

There are a few more issues like autocomplete result relevance, that would be resolved if Fortran used a larger set of reserved keywords, but I think the 2 points I raised cover most of my “complains”.


Personally, I’d support if compilers or other tools like fpm, provided an optional flag which would effectively introduce reserved keywords (now they only issue warnings that user variable overwrites an intrinsic function). For maximum flexibility it could allow the user to modify the list of reserved keywords (e.g. for some legacy code).

One edge-case that I have ambivalent feelings about are greek letters, which are often mathematical variables, but may also be functions. Examples are gamma in Fortran, or lambda in Python. In those cases I find reserved keywords somewhat annoying, e.g. having to write lmbda or lambda_ everywhere because lambda is reserved. In Julia which allows using Unicode characters in source code, you can avoid the issue by using the actual characters, γ and λ. This arguably brings the code closer to math notation, however I’m not entirely convinced by the idea. I can barely remember any keyboard shortcuts, and having to remember how to input special characters from a standard keyboard is just another burden for me.

1 Like