Universal good practice style guide for Fortran

In the context of developing DAMASK, our Fortran-based crystal plasticity simulation software, we are building a hook for the pre-commit utility to enforce that our style guidelines are followed in every commit.

In addition to the entries in the Fortran stdlib style guide, we are gathering suggestions for what constitutes good style. We would also like to ask the community whether anything in our current list is missing or could be improved:

  1. Consistent use of end do and end if
  2. Repeat function/subroutine/module name after end
  3. if( => if (
  4. Capitalize all HDF5 prefixes (h5a => H5A)
  5. Forbid white space flexibility in between end and action
  6. Comment indentation verification
  7. Verify empty lines between functions
  8. Verify empty lines between variable declarations and code
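As a rough illustration of what items 1, 2, 3, 7 and 8 would accept, here is a minimal sketch. The module and function names are made up for the example and are not taken from DAMASK.

```fortran
module demo_style
  implicit none
  private
  public :: clamp_positive

contains

  ! Hypothetical helper: returns x if positive, otherwise zero.
  pure function clamp_positive(x) result(y)
    real, intent(in) :: x
    real :: y

    ! item 8: empty line between declarations and code
    if (x > 0.0) then          ! item 3: "if (" rather than "if("
      y = x
    else
      y = 0.0
    end if                     ! item 1: "end if", not "endif"

  end function clamp_positive  ! item 2: name repeated after end

end module demo_style
```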

Just my opinion, but the “look and feel” of a style is less important than making sure whatever is adopted by your organization is consistently (and rigorously) applied to your entire codebase. Adherence to a defined style should be one of the first things checked by any peer review process required for committing code to the project.

We find style guides concerned with whitespace and capitalization merely a distraction not worth spending much time on, other than as suggestions. Almost anything that can be applied automatically with a simple filtering program is generally not worth more than that.

The most important thing is to ensure the code is standard-conforming, as free of unused variables and dead code as possible, and not numerically sensitive. The compilers can be your best asset here, although how much they help varies widely. Non-standard usage should only be accepted if explicitly reviewed as necessary. Code should contain a certain amount of comments, and user and developer documentation and test cases (hopefully verified as thorough with a coverage tool) are important. Also important are some type of indentation, some measure of how descriptive variable names are (but do not discourage short names in complex mathematical expressions; if names appear in long lines, allow them to be short), and reducing the use of deprecated features.

Deciding whether to indent one or four characters, whether to keep to an 80-column width for source code, and whether to have a space between end and if wastes programmers' time and just irritates people.

If it really bothers someone, buy something like SPAG. Utilities like that allow you to specify almost all the formatting options but also allow for refactoring older code, a much more valuable pursuit for those dealing with legacy software.

Descriptive coding can be good but becomes horrible to read when done religiously with complex formulas. Some people believe their code is so readable it does not need comments. They are wrong. Comments including references should be required. It is trivial to remove comments; most editors can be set to make them invisible or fold them. It is very hard to automate adding them.

As an aside, one of our recruiters told me he brings this topic up, and if someone is avid about rigid stylistic standards that do not affect performance he considers it a major negative. If they talk about readability, reproducibility, verification, … then he considers it a plus. If they do not believe they should have to pass any criteria or have their code reviewed at all, a big negative.

So as with most things, it is a balancing act. Provide guidelines, but policing non-functional ones is not particularly productive, and there is no single answer anyway. If I want my code viewable on a phone display on some web forum it should probably be terse and narrow and use minimal indentation. If I want someone to use long descriptive variable names I should then expect long code lines.

My personal tastes are to always indent, but by no more than three characters, since more reduces readability; to always have a closing name on an end statement; and that putting a space inside endif and elseif is ugly. I work with someone who thinks endif and elseif are horrors that the compiler should reject, capitalizes standard keywords and indents eight characters. We get along great because we learned long ago through a lot of experience that it does not matter, and is no more important than what your favorite flavor of ice-cream or color is.

Have your developers develop with every compiler warning flag you have; tests should include running with high optimization and justifying if incorrect results are produced; train them on how to generate test cases and the perils of floating point computation. As part of the review process have them provide everything someone else needs to build the code; … throw in styling suggestions but do not make it into a Holy War.

I generally agree with the comments here about making a fuss over endif vs. end if and so on. However, there is another aspect of this kind of source code formatting that has some more practical consequences.

When there are several coders working on a project, including open source projects, the style conventions for the source code in the repository (managed with tools such as git and svn) become a significant feature of the code. A programmer is no longer allowed to download the code, reformat it to his preferred style, make modifications, and then upload that code back to the repository. Other programmers will see all of the dozens/hundreds/thousands of style changes mixed in with the actual modified code, basically nullifying that feature of the code repository process. What the other programmers want to see is just the modified code. This means that the programmer must adhere to whatever style conventions apply to that code and minimize any other changes.

One possible solution to this is to use automatic code formatting for the repository code. Programmers are then allowed to freely modify their downloaded versions, make any changes, and then upload their code through the automatic formatting filter. Different versions of the repository code will then always ignore any local formatting conventions that a programmer might have and reveal only the essential modifications (with the repository style conventions).

With that approach, however, it is not practical for a programmer to compare his local version with the repository version, since that would show all of the nonessential changes. Yet such a comparison is an important step when working with codes shared through repositories (such as git and svn).

What other approaches are there to address this general problem?

I have a style guide, but I agree with others that it isn’t nearly as important as some make it out to be. The more important aspects of “style” are to make sure that the code itself expresses as much as possible, in as succinct a way as possible, that there are comments to say the rest (typically references, background info, why it’s written that way), and that somebody other than the author judges its “readability”.

I skimmed the referenced guides, which are definitely worth reading, but was surprised not to see (admittedly I might have missed it) any discussion of kind suffixes on constants, or of trimming trailing white-space and expanding tabs, which are easy to automate.

It is very easy in standard free-format Fortran (although preprocessor directives can complicate things) to extract comments from code, run them through spell-checkers, and review them. As was mentioned very nicely, comments are not checked by the compiler, so they are prone to accumulating errors.

I know of strong camps over whether variables should be declared only one per line or in tightly related groups versus minimizing the number of declarations. In either case, asking people to look at their variable names and change them to be self-describing where appropriate can be useful.

Often the compiler can flag unused variables, which are almost always something to remove. As mentioned, if the compiler writers took the time to produce a warning message about something, its cause is almost always worth considering for elimination.
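To make the unused-variable point concrete, here is a small sketch with a deliberately unused declaration. The program and variable names are made up. With gfortran, for instance, compiling with -Wall would be expected to report it; other compilers have their own flags and wording.

```fortran
program unused_demo
  implicit none
  integer :: i, total
  integer :: leftover   ! hypothetical: declared but never referenced

  ! Sum the integers 1..10; "leftover" plays no part in the result.
  total = 0
  do i = 1, 10
    total = total + i
  end do
  print *, total
end program unused_demo
```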

I did not see order and grouping of variable declarations. The most popular style locally is parameters first in order declared in the procedure declaration unless required by the compiler to be in a different order, then local variables, sorted first by type and then alphabetically.

So if those are really not in the referenced guides, that is a few more topics that can generate interesting discussions. Some, such as kind suffixes on constants, can actually affect results, so many of those are not considered just “style” preferences (?)
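A minimal sketch of how a missing kind suffix can change a result, assuming the default real kind has less precision than real64 (true for common compilers unless an option promotes default reals). The alias dp is just a local name chosen for the example.

```fortran
program kind_suffix_demo
  use, intrinsic :: iso_fortran_env, only: dp => real64
  implicit none
  real(dp) :: a, b

  a = 0.1        ! default-real constant, widened to dp only on assignment
  b = 0.1_dp     ! constant evaluated in dp precision from the start

  print *, 'a     =', a
  print *, 'b     =', b
  print *, 'a - b =', a - b   ! nonzero when default real is narrower than dp
end program kind_suffix_demo
```

Both variables are declared with the same kind, so the difference comes only from the constant, which is why this tends to be treated as more than a formatting preference.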

I was confused by Urbanjost’s “parameters first in order declared in the procedure declaration unless required by the compiler to be in a different order” until I realised that those parameters were not the usual Fortran constants (where a compiler may require a different order, e.g. with

integer:: i
integer,parameter:: isquared(10) = [(i**2,i=1,10)]

) but what the Fortran standard calls arguments of a procedure. The good practice style guide for Fortran could require things to be called by their correct names.

A good style guide per current standard could also require no duplication of magic numbers like 10 and perhaps in situ declarations of implied-do indices!

integer, parameter :: isquare(*) = [( i**2, integer :: i = 1, 10 )]

The British Met Office used to publish a Fortran style guide that I thought was a good place to start if you were putting together your own guide for your organization. It was focused on Fortran 90, if I remember correctly. However, I can’t find a link to the PDF anymore. Does anyone have a copy?

Not a PDF, but I think it is here for Fortran 95.

Almost forgot to mention Clerman and Spector’s excellent book, Modern Fortran: Style and Usage.

Some books still use uppercase for Fortran keywords and procedure names.

Something I rarely see discussed for style is the use of kind parameters for integers. Using _wp and the like for reals is the norm, but it is rare for integers. Admittedly the need for a kind other than the default for integers is much less common than that for a non-default real, but if the code is full of _wp / _r4 / _r8 ..., why not _int4 / _i8 / _ikind ...?

That raises the question of using the kind in definitions e.g.

integer([kind=]int8), parameter :: hundred = 100_int8
integer([kind=]int4) :: array(5_int4)
array(1_int4) = -1_int4
array(2_int4) = 0_int4
! ...

Similarly for character, where the default (I assume selected_char_kind("default") is the same as selected_char_kind("ascii") in most implementations?) probably works most of the time. I’ve seen some uses such as ascii_"foo", but it’s rare. Similarly:

character([len=]10_int4, [kind=]ascii) :: line
line = ascii_"asdf--1234"
if (line(3_int4 : 5_int4) == "bar") then
!...

Adding suffixes for all double/quad precision gets annoying sometimes, but not doing the same for integers seems inconsistent (to me). Of course, like any other point brought up here, it’s all a matter of preference in the end.

PS: Maybe some of these are not standard conforming?
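For what it is worth, here is a sketch of the snippets above with the optional [kind=] and [len=] notation spelled out. It assumes int4 and ascii are user-defined named constants (they are not intrinsic names, so they are defined locally here) and that the compiler supports an ASCII character kind. Under those assumptions I would expect a standard-conforming compiler to accept it.

```fortran
program kind_everywhere
  use, intrinsic :: iso_fortran_env, only: int8, int32
  implicit none
  ! Assumed names: int4 and ascii are defined here, not provided by the language.
  integer, parameter :: int4  = int32
  integer, parameter :: ascii = selected_char_kind('ascii')
  integer(kind=int8), parameter :: hundred = 100_int8
  integer(kind=int4) :: array(5_int4)
  character(len=10_int4, kind=ascii) :: line

  array = 0_int4
  array(1_int4) = -1_int4
  line = ascii_"asdf--1234"   ! the character kind goes in front of the literal
  print *, hundred, array(1_int4), line(3_int4:5_int4)

  ! Compare the default and ASCII character kinds on this compiler.
  print *, 'default:', selected_char_kind('default'), ' ascii:', ascii
end program kind_everywhere
```

The last line prints both kind values, which gives a quick way to check on a given compiler whether the default and ASCII kinds coincide.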

One reason for this is that up until f90, there was only one standard integer type, and even if programmers went beyond the standard and declared types as integer*2, integer*8, and so on, there was no way to specify constants of those types. F90 introduced the KIND system, which addressed the constant notation issue, but still there was only one integer KIND required by the standard (the default). Now (as of f2008, I think) the standard requires an integer KIND with at least 18 decimal digits. That could be the default KIND, but on all compilers I use, INT32 is the default (unless a compiler option changes that) and INT64 is the extended KIND that meets the 18-digit requirement. So now there are, more or less, requirements for those two KINDS and a nice KIND system that allows interconversions and constants to be specified. I am finding more and more cases where nondefault integer KINDS are used and where I want to control exactly when the conversions are done, so I’m using constants with KIND qualifiers more now than, say, 10 or 20 years ago.
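A small sketch for checking this on a given compiler. It prints the default integer kind next to int32, int64 and the kind returned by selected_int_kind(18), and shows a constant that needs a suffix when the default integer is 32 bits. The names i18 and big are chosen for the example only.

```fortran
program int_kind_demo
  use, intrinsic :: iso_fortran_env, only: int32, int64
  implicit none
  integer, parameter :: i18 = selected_int_kind(18)
  integer(int64) :: big

  print *, 'default integer kind:  ', kind(0)
  print *, 'int32, int64:          ', int32, int64
  print *, 'selected_int_kind(18): ', i18
  print *, 'huge(0), huge(0_int64):', huge(0), huge(0_int64)

  ! Without the suffix this literal would be a default integer, and it
  ! does not fit if the default integer is 32 bits.
  big = 3000000000_int64
  print *, 'big =', big
end program int_kind_demo
```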
