Fortitude: a Fortran linter

Has anyone tested this new project?

2 Likes

I was thinking of posting about it. It seems useful and already has 17 stars on GitHub although it is relatively new. For a sample program

real :: pi = 3.14159265358979311599796346854
double precision :: pi_d = 3.14159265358979311599796346854
print*,pi
print*,factorial(3), factorial(4)
contains
integer function factorial(n)
integer :: n
integer :: i, ifac=1
do i=2,n
   ifac = ifac*i
end do
factorial = ifac
end function factorial
end

fortitude check pi.f90 says

warning: pi.f90:1:1: P021 real has implicit kind
  |
1 | real :: pi = 3.14159265358979311599796346854
  | ^^^^ P021
2 | double precision :: pi_d = 3.14159265358979311599796346854
3 | print*,pi
  |

warning: pi.f90:1:1: T001 program missing 'implicit none'
  |
1 | real :: pi = 3.14159265358979311599796346854
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ T001
2 | double precision :: pi_d = 3.14159265358979311599796346854
3 | print*,pi
  |

warning: pi.f90:1:14: P001 real literal 3.14159265358979311599796346854 missing kind suffix
  |
1 | real :: pi = 3.14159265358979311599796346854
  |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ P001
2 | double precision :: pi_d = 3.14159265358979311599796346854
3 | print*,pi
  |

warning: pi.f90:2:1: P011 prefer 'real(real64)' to 'double precision' (see 'iso_fortran_env')
  |
1 | real :: pi = 3.14159265358979311599796346854
2 | double precision :: pi_d = 3.14159265358979311599796346854
  | ^^^^^^^^^^^^^^^^ P011
3 | print*,pi
4 | print*,factorial(3), factorial(4)
  |

warning: pi.f90:2:28: P001 real literal 3.14159265358979311599796346854 missing kind suffix
  |
1 | real :: pi = 3.14159265358979311599796346854
2 | double precision :: pi_d = 3.14159265358979311599796346854
  |                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ P001
3 | print*,pi
4 | print*,factorial(3), factorial(4)
  |

warning: pi.f90:7:12: T031 function argument 'n' missing 'intent' attribute
  |
5 | contains
6 | integer function factorial(n)
7 | integer :: n
  |            ^ T031
8 | integer :: i, ifac=1
9 | do i=2,n
  |

warning: pi.f90:8:15: T051 'ifac' is initialised in its declaration and has no explicit `save` or `parameter` attribute
   |
 6 | integer function factorial(n)
 7 | integer :: n
 8 | integer :: i, ifac=1
   |               ^^^^^^ T051
 9 | do i=2,n
10 |    ifac = ifac*i
   |


-- Number of errors: 7
-- For more information, run:

    fortitude explain [ERROR_CODES]
1 Like

It looks like it is using this tree-sitter-based Fortran parser: stadelmanma/tree-sitter-fortran (Fortran grammar for tree-sitter) on GitHub.

1 Like

We've been discovered! :smiley: We're currently still in a fairly beta phase and maybe not quite ready for full production use yet, although I do think it is already a useful tool. The current push is to get auto-fixes working before we start shouting a bit more about it.

It is indeed built on the tree-sitter grammar, so it is very fast. I just ran fortitude check on a large, popular package, and it checked 473 files, finding 38,000 warnings, in 5 seconds. Tree-sitter grammars are also error-tolerant, which means we'll be able to check as you type, especially once we get LSP integration.

We're heavily inspired by the Python linter ruff, and in fact are building on that too.

Current goals:

  • auto-fixes (development version can make suggestions, but not apply them)
  • multiple output formats (JSON, GitHub, etc)
  • more checks

Longer term goals:

  • topiary integration for formatting (someone has started looking at this just this week)
  • fixed-form checking and auto-conversion
  • preprocessor compatibility

Feedback and contributors are very welcome :slight_smile:

9 Likes

@zedthree thanks for implementing this!

I was curious about the speed of the tree-sitter parser, so I compiled fortitude with "cargo build -r". I used this 100,000-line benchmark file: bench3.f90 (GitHub). (We should figure out a better one by taking some real project and concatenating all files into just one file, in the right order.)

On Ubuntu 22.04 in WSL on a Surface 5 laptop, I got the following with fortitude:

$ time ../../../../../fortitude/target/release/fortitude check bench3.f90

fortitude: 1 files scanned.
All checks passed!


real    0m0.651s
user    0m0.621s
sys     0m0.031s

and LFortran:

$ time ./lfortran --backend=wasm bench3.f90 -o a.wasm --time-report
Codegen Time report:
ASR -> wasm:    96
Save:           1
Total:         97
Allocator usage of last chunk (MB): 16.1908
Allocator chunks: 1

Time report:
File reading:    2
Src -> AST:     47
AST -> ASR:     40
ASR -> wasm:     98
Total:         187

real    0m0.198s
user    0m0.178s
sys     0m0.020s

And Flang 19 (from conda):

$ time flang-new -fsyntax-only bench3.f90
real    0m3.390s
user    0m3.069s
sys     0m0.192s
$ time flang-new bench3.f90
real    0m18.782s
user    0m27.811s
sys     0m2.551s

Note that Flang seems to compile in parallel (that's why the "user" time is larger than "real"). LFortran and GFortran compile on a single core in this benchmark.

And GFortran 13.2.0 from conda:

$ time gfortran bench3.f90
real    0m18.657s
user    0m17.633s
sys     0m0.602s

Would you say that Fortitude parsing and running the checks would be equivalent to LFortran or Flang parsing and doing semantic analysis?

| Compiler | Parsing and semantics (s) | Full compilation (s) | Lines / s (parsing / full) |
|---|---|---|---|
| Fortitude | 0.651 (7.3x) | - | 153,625 / - |
| LFortran | 0.089 (1.0x) | 0.198 (1.0x) | 1,123,707 / 505,101 |
| Flang | 3.390 (38.1x) | 18.782 (94.9x) | 29,501 / 5,324 |
| GFortran | - | 18.657 (94.22x) | - / 5,360 |
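For anyone who wants to check the arithmetic in the table, here is a minimal Python sketch that recomputes the ratios and lines-per-second figures from the timings posted above. The 0.089 s LFortran front-end time is the sum of the "File reading", "Src -> AST" and "AST -> ASR" entries in its time report (2 + 47 + 40 ms). The line count of 100,010 is an assumption: it is back-computed from the table and is not stated in the post, which only says "100,000 lines". The names `timings`, `slowdown` and `lines_per_second` are made up for this illustration.

```python
# Timings in seconds, copied from the posts above.
# "front_end" = parsing plus semantic checks; "full" = full compilation.
timings = {
    "Fortitude": {"front_end": 0.651, "full": None},
    "LFortran":  {"front_end": 0.089, "full": 0.198},
    "Flang":     {"front_end": 3.390, "full": 18.782},
    "GFortran":  {"front_end": None,  "full": 18.657},
}

# ASSUMPTION: line count back-computed from the table, not stated in the post.
BENCH_LINES = 100_010


def slowdown(tool, phase, baseline="LFortran"):
    """Ratio of a tool's time to the baseline's time for one phase, or None."""
    t = timings[tool][phase]
    return None if t is None else t / timings[baseline][phase]


def lines_per_second(tool, phase):
    """Whole lines processed per second (truncated), or None if not measured."""
    t = timings[tool][phase]
    return None if t is None else int(BENCH_LINES / t)


for tool in timings:
    for phase in ("front_end", "full"):
        ratio = slowdown(tool, phase)
        if ratio is not None:
            rate = lines_per_second(tool, phase)
            print(f"{tool:10s} {phase:9s} {ratio:5.1f}x {rate:>10,} lines/s")
```

Under that assumed line count, the truncated rates agree with the table; the ratios do not depend on the line count at all.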

Above, LFortran was compiling to WebAssembly, which is a nice reference as it shows what is possible, and why we are developing a new compiler.

@zedthree if you want to collaborate on our very fast parser and maybe use it in Fortitude, let me know! Our parser to AST is beta quality, meaning it is expected to work for most codes. It's error resilient (we are still making improvements to the resiliency).

6 Likes

Nice, that is very fast! Yes, it's probably roughly equivalent to AST + semantics, although we probably don't do as much as a real compiler.

If you have rust bindings, it would definitely be interesting to try and use lfortran in fortitude, especially if we can then do some more in-depth semantic checks.

The main advantage of tree-sitter parsers is that they're very lightweight to build (just a standalone C library), with bindings in many languages. There's a language-agnostic ecosystem around them too; for example, various editors can use tree-sitter grammars for syntax highlighting.

Interestingly, I see much less of a difference on my machine (although lfortran is still 2x faster!):

$: hyperfine "tree-sitter parse --quiet bench3.f90"
Benchmark 1: tree-sitter parse --quiet bench3.f90
  Time (mean ± σ):     473.5 ms ±   4.1 ms    [User: 442.6 ms, System: 29.5 ms]
  Range (min … max):   469.7 ms … 483.9 ms    10 runs
 
$: hyperfine "lfortran --backend=wasm bench3.f90 -o a.wasm --time-report"
Benchmark 1: lfortran --backend=wasm bench3.f90 -o a.wasm --time-report
  Time (mean ± σ):     287.5 ms ±   5.5 ms    [User: 254.5 ms, System: 32.7 ms]
  Range (min … max):   280.8 ms … 298.1 ms    10 runs

and also with just --show-ast:

$: hyperfine "lfortran bench3.f90 --show-ast > /dev/null"
Benchmark 1: lfortran bench3.f90 --show-ast > /dev/null
  Time (mean ± σ):     202.8 ms ±   2.4 ms    [User: 165.8 ms, System: 37.9 ms]
  Range (min … max):   199.6 ms … 209.1 ms    14 runs

A comparison on a more realistic case shows very little difference:

$: wc -l gs2/src/init_g.f90 
4574 gs2/src/init_g.f90

$: hyperfine "tree-sitter parse gs2/src/init_g.f90 --quiet"
Benchmark 1: tree-sitter parse gs2/src/init_g.f90 --quiet
  Time (mean ± σ):      36.8 ms ±   1.0 ms    [User: 33.4 ms, System: 3.7 ms]
  Range (min … max):    35.3 ms …  42.1 ms    75 runs
 
$: hyperfine "lfortran gs2/src/init_g.f90 --show-ast > /dev/null"
Benchmark 1: lfortran gs2/src/init_g.f90 --show-ast > /dev/null
  Time (mean ± σ):      40.2 ms ±   1.0 ms    [User: 30.0 ms, System: 10.3 ms]
  Range (min … max):    38.7 ms …  45.5 ms    67 runs

(--show-ast needed here to avoid building the .mod files)

I also test the tree-sitter grammar against a corpus of ~160 projects, with the following results:

Total parses: 19211; successful parses: 19045; failed parses: 166; success percentage: 99.14%; average speed: 6155 bytes/ms

(some of those parsing failures are due to invalid Fortran!)

Just based on that bench3.f90 file, it looks like lfortran manages about 8000 bytes/ms. Not much in it!

Yes, you have to be careful what to compare against. In your benchmark you ran lfortran --backend=wasm bench3.f90 -o a.wasm --time-report, which creates a WASM binary on disk (so it does SRC->AST, then AST->ASR, various ASR passes, and ASR->WASM, and saves the file to disk). You compared that against tree-sitter parse --quiet bench3.f90, which presumably just does the equivalent of SRC->AST. On that basis tree-sitter should be several times faster than lfortran on this benchmark, but it is 2x slower.

On smaller codes you also get various other overheads, for example the --show-ast walks the whole AST and prints it, so if you are not printing something similar in tree-sitter parse, the best way is to read the timing from --time-report just for SRC->AST.

Are you worried that LFortran is not faster than tree-sitter for smaller codes? If so, send me the init_g.f90 file and I can have a look. A good benchmark is to emit some warnings in both fortitude and LFortran, ideally the same warnings, and benchmark how long it takes end to end on some actual code.

We currently don't have Rust bindings, but my understanding is that it is very easy to call C++ from Rust.

The main advantage of collaborating on the same parser and semantics is that it gets reused in all our projects. Thanks to LFortran, we guarantee that the parsing and semantics are all correct, because that is what the compiler uses. There are a lot of corner cases in parsing (including fixed-form) that we are fixing as we discover them, so you can reuse all this work. We can make the parser independent from LFortran, either to AST, or even to ASR (with semantics). It's just a few files and they don't depend on the rest of the compiler. So, for example, there could be a Rust Cargo package that just builds the parser and uses it (LFortran source code is small, so we can add some cmake option to only build the parser, and the cargo package would turn this option on).

This is definitely not a warning that I want to see…

Idem… The implicit kind suffix is fine for an assignment to a default real variable.

I disagree here: double precision can really be on purpose, to mean "higher precision than the default real", or to write interfaces to external procedures that are defined with double precision arguments or results. real(real64) is not exactly the same.

Most members of this forum know that

real :: pi = 3.14159265358979311599796346854

defines a single precision variable that is assigned a single precision constant, but people new to Fortran don't, so I think the warnings are OK. (In Python/NumPy, R, and Matlab, floats are double precision by default.)
One can select or ignore individual rules or whole groups with --select and --ignore.
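To make the single-precision point concrete for readers coming from Python, here is a small sketch that rounds the same literal to IEEE single precision using only the standard library. It assumes that the Fortran default real is IEEE binary32, which is typical but not guaranteed by the standard; `to_single` is a helper name invented for this illustration.

```python
import struct


def to_single(x):
    """Round a Python float (IEEE double) to the nearest IEEE single and back."""
    return struct.unpack("f", struct.pack("f", x))[0]


literal = 3.14159265358979311599796346854  # Python stores this as a double
pi_single = to_single(literal)             # what a binary32 'real' can hold

print(f"double: {literal:.17g}")
print(f"single: {pi_single:.17g}")
print(f"difference: {abs(literal - pi_single):.3e}")
```

The two values agree only to about 7 significant digits, which is the information lost when a default real constant initialises a variable, whatever the kind of that variable.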

3 Likes

@Beliavsky

I modified the example program you posted and got interesting results.

real             :: pi   = 3.14159265358979311599796346854
double precision :: pi_d = 3.14159265358979311599796346854

print*, pi_d,pi
print*, dble(pi), 4*atan(1.0d0)
print*, factorial(3), factorial(4)

contains
integer function factorial(n)
integer :: n
integer :: i, ifac=1
do i=2,n
   ifac = ifac*i
end do
factorial = ifac
end function factorial
end

There are clearly problems with this code, as the Fortran standard will not support what the original author appears to have intended.

These problems include:

  1. pi and pi_d are both initialised by a default real constant. Gfortran Ver 12 lets this author down, as it truncates the real constant, while Silverfrost FTN95 warns they "would be truncated under ISO", but retains the real64 value.
  2. using "integer :: i, ifac=1" will not be compiled by Fortran in the way the author has intended, as ifac is not initialised at each call.
  3. With ifac as a default integer, factorial(n) overflows for n >= 13.
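Points 2 and 3 can be mimicked in a few lines of Python. This is only a model of the Fortran behaviour, not Fortran itself: a module-level variable stands in for the implicitly saved `ifac`, the two calls are assumed to be evaluated left to right (Fortran does not promise an order inside the print list), and the default integer is assumed to be 32 bits. All names are invented for this sketch.

```python
import math

INT32_MAX = 2**31 - 1  # largest value of a 32-bit default integer (assumed)

_ifac = 1  # stands in for the implicitly saved Fortran local 'ifac'


def factorial_saved(n):
    """Mimics the posted function: ifac is initialised once and never reset."""
    global _ifac
    for i in range(2, n + 1):
        _ifac *= i
    return _ifac


def factorial_reset(n):
    """What was presumably intended: start from 1 on every call."""
    ifac = 1
    for i in range(2, n + 1):
        ifac *= i
    return ifac


# Same call sequence as in the sample program.
first, second = factorial_saved(3), factorial_saved(4)
print(first, second)                           # saved state leaks into call 2
print(factorial_reset(3), factorial_reset(4))  # 3! and 4! as intended

# Largest n whose factorial still fits in a 32-bit signed integer.
largest_safe = max(n for n in range(1, 30) if math.factorial(n) <= INT32_MAX)
print(largest_safe)
```

In the model the second call returns 144 rather than 24, because it starts from the 6 left behind by the first call, and 12 is the last argument whose factorial fits in 32 bits.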

While there are examples of compiler or scanner warnings in this thread, I think the warnings presented in the thread miss the key points of Formula Translation.

These are code smells rather than outright errors, and certainly the default warnings may not be for everyone :slight_smile:

At the moment, warnings can be selected or ignored on a whole-project basis (as @Beliavsky mentions), but the intention is to have both per-file settings and per-line ignores through comments, something like ! allow(double-precision), so that you could have a warning enabled in general except for the specific places where something is required for compatibility.

fortitude explain shows the rationale behind each rule:

$ fortitude explain P011
# P011: double-precision

## What it does
Checks for use of 'double precision' and 'double complex' types.

## Why is this bad?
The 'double precision' type does not guarantee a 64-bit floating point number
as one might expect. It is instead required to be twice the size of a default
'real', which may vary depending on your system and can be modified by compiler
arguments. For portability, it is recommended to use `real(dp)`, with `dp` set
in one of the following ways:

- `use, intrinsic :: iso_fortran_env, only: dp => real64`
- `integer, parameter :: dp = selected_real_kind(15, 307)`

For code that should be compatible with C, you should instead use
`real(c_double)`, which may be found in the intrinsic module `iso_c_binding`.

We very much welcome community input and feedback on rules (and their explanations), and what would be a sensible default set. Obviously not everything will be suitable for every project, and that's fine.

1 Like

Then the warnings should be different. Instead of "warning: pi.f90:1:1: P021 real has implicit kind", it could be something like "warning: the default real in Fortran is most of the time 32 bits, with only 6 significant digits".

Regarding the literal constants without an explicit kind, I maintain that the warning should be issued only if the constant is used in an expression where a cast to a higher precision occurs, e.g.:

integer, parameter :: dp = kind(1d0)
real     :: pi = 3.14159265358979311599796346854   ! no casting, no warning
real(dp) :: pi = 3.14159265358979311599796346854   ! casting, warning
! or
real     :: x
real(dp) :: y 
x = 42.42 * x   ! no casting, no warning
y = 42.42 * y   ! casting, warning
1 Like

Having to add comments to disable warnings about perfectly valid features looks super-heavy to me…

The standard requires the double precision type to have at least 10 significant digits and a range of at least 37:
[image: excerpt from the Fortran standard]
Technically this cannot fit into 32 bits. And in practice it will always be 64 bits (or more).

For portability, it is recommended to use `real(dp)`, with `dp` set
in one of the following ways:

- `use, intrinsic :: iso_fortran_env, only: dp => real64`
- `integer, parameter :: dp = selected_real_kind(15, 307)`

Well, "portability" is probably not the right word here. As a matter of fact, double precision is 100% portable, which is, strictly speaking, not the case for the two suggested solutions.

  • The iso_fortran_env may not be available (compilers are not required to provide it)
  • selected_real_kind(15, 307) may return a negative number, as the standard doesn't require that such a kind exist.

Moreover:

  • in contrast to double precision, real64 has no requirement on the precision/range. It is just "some floating point stored on 64 bits". It could be a 32 bit (effective) floating point, with 32 bits unused, without violating anything.
  • real64 assumes that the 16/32/64/128 scale is here "forever". But who knows?
  • if the intent is "I want a floating point number with at least this precision and this range", then the 2nd suggestion is the way to go, much better IMO than the first suggestion (but the warning explicitly suggests the first one).

LFortran gives you this by default:

$ lfortran a.f90
warning: Assuming implicit save attribute for variable declaration
  --> a.f90:11:15
   |
11 | integer :: i, ifac=1
   |               ^^^^^^ help: add explicit save attribute or parameter attribute or initialize in a separate statement


Note: Please report unclear, confusing or incorrect messages as bugs at
https://github.com/lfortran/lfortran/issues.
...
3 Likes

Congratulations, I think this is a really interesting and helpful tool.
Two comments from the perspective of a non-computer scientist using fortitude in a Windows 11 environment with Python 3.13.0:
I can see some effort has been put into using colour in the output, and this works well at the command line but is not so good if one "redirects" output to a file, e.g. fortitude check > fortitude.out
Can this "redirection" be detected so the colour escape characters do not appear in fortitude.out? Or perhaps there could be a command line switch such as "-no_colour".

I followed the instructions for a fortitude.toml file but I get an error for the line select = ["S", "T"]
ERROR: Unknown rule codes ["S", "T"]

Great work, and I have a new perspective on my code already
Norman

4 Likes

I propose that there could be a badge. Fortitude sets out to improve the quality of the code style, with an intent similar to that of, for instance, black for Python. A number of projects on GitHub use a badge to show that they pass such a set of criteria, including black itself.

Playing a bit with the parameters shields.io provides to set up such a badge, with the RGB tuple (115,79,150) of the Fortran logo (source) as read out by Inkscape in hand, and given that shields.io already knows about the Fortran logo, I came up with the definition

![Static Badge](https://img.shields.io/badge/fortitude-%2520?style=plastic&logo=fortran&logoColor=115%2C79%2C150&labelColor=purple&color=green&link=https%3A%2F%2Fgithub.com%2Fageweke%2Ffortitude)

which visually resolves to

Static Badge

Green would indicate "passing", or "after the application of fortitude". (I agree, compared with other green badges this named green is not so good yet, and could/should be improved further.*) However, in addition to this indicator, the HTML that the page exports for copy-and-paste points to

https://img.shields.io/badge/fortitude-%2520?style=plastic&logo=fortran&logoColor=115%2C79%2C150&labelColor=purple&color=green&link=https%3A%2F%2Fgithub.com%2Fageweke%2Ffortitude

i.e. it includes a link back to the repository of Fortitude, and hence potentially attracts additional interest in your project too.

* Given their previous work on Fortran-related logos, I presume @jacobwilliams or @vmagnin are better placed to have an eye on it.

2024-11-16_proposal_shield_fortitude.zip.txt (67.6 KB)

3 Likes

I was not involved in the Fortran logo design, but I have used its purple color on several occasions and I can confirm it is RGB = #734f96 = (115, 79, 150), as you used in your badge.

I agree that fortitude sounds great (haven't tried it yet), but it's akin to eslint, so having a badge for it is too subjective.

In the JavaScript realm, what some projects do is have a badge for the well-known set of eslint rules the project adheres to, e.g., "standard", "airbnb", etc.

As some of the comments in this thread have shown, fortitude defaults are somewhat opinionated, so at least one set of rules truly compatible with the latest Fortran standard is required.

2 Likes

Thanks Norman! The first issue with colour will be fixed in the next release (hopefully this coming week); the second issue with selecting whole categories is actually a new feature in the next release! We've not got a docs website sorted yet, so we have this slightly annoying issue of the README being ahead of the latest published release.

@nbehrnd A badge would be nice, though I fear we'd be getting a little ahead of ourselves to have one already! :smiley:

@jwmwalrus Yes, the current default set is just "everything" at the mo. We're likely to remain pretty opinionated (that's sort of the job of a linter, after all), but we definitely appreciate we'll need to evolve both the default set, as well as our categories.

1 Like

@zedthree I am having a lot of fun with fortitude and it is improving my code no end. I have violated T042 (life, the universe and everything…) a lot in my code, so I tried "explain T042", and the example corrected code is wrong:

program example
  character(len=3) :: short_text
  call set_text(short_text)
  print*, short_text
contains
  subroutine set_text(text)
    character(len=:), allocatable, intent(out) :: text
    text = "longer than 3 characters"
  end subroutine set_text
end program

In the main program, the recommended declaration of short_text should instead read


  character(len=:), allocatable :: short_text

Keep up the great work :heart_eyes:

2 Likes