Has anyone tested this new project?
I was thinking of posting about it. It seems useful and already has 17 stars on GitHub although it is relatively new. For a sample program
real :: pi = 3.14159265358979311599796346854
double precision :: pi_d = 3.14159265358979311599796346854
print*,pi
print*,factorial(3), factorial(4)
contains
integer function factorial(n)
integer :: n
integer :: i, ifac=1
do i=2,n
ifac = ifac*i
end do
factorial = ifac
end function factorial
end
`fortitude check pi.f90` says
warning: pi.f90:1:1: P021 real has implicit kind
|
1 | real :: pi = 3.14159265358979311599796346854
| ^^^^ P021
2 | double precision :: pi_d = 3.14159265358979311599796346854
3 | print*,pi
|
warning: pi.f90:1:1: T001 program missing 'implicit none'
|
1 | real :: pi = 3.14159265358979311599796346854
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ T001
2 | double precision :: pi_d = 3.14159265358979311599796346854
3 | print*,pi
|
warning: pi.f90:1:14: P001 real literal 3.14159265358979311599796346854 missing kind suffix
|
1 | real :: pi = 3.14159265358979311599796346854
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ P001
2 | double precision :: pi_d = 3.14159265358979311599796346854
3 | print*,pi
|
warning: pi.f90:2:1: P011 prefer 'real(real64)' to 'double precision' (see 'iso_fortran_env')
|
1 | real :: pi = 3.14159265358979311599796346854
2 | double precision :: pi_d = 3.14159265358979311599796346854
| ^^^^^^^^^^^^^^^^ P011
3 | print*,pi
4 | print*,factorial(3), factorial(4)
|
warning: pi.f90:2:28: P001 real literal 3.14159265358979311599796346854 missing kind suffix
|
1 | real :: pi = 3.14159265358979311599796346854
2 | double precision :: pi_d = 3.14159265358979311599796346854
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ P001
3 | print*,pi
4 | print*,factorial(3), factorial(4)
|
warning: pi.f90:7:12: T031 function argument 'n' missing 'intent' attribute
|
5 | contains
6 | integer function factorial(n)
7 | integer :: n
| ^ T031
8 | integer :: i, ifac=1
9 | do i=2,n
|
warning: pi.f90:8:15: T051 'ifac' is initialised in its declaration and has no explicit `save` or `parameter` attribute
|
6 | integer function factorial(n)
7 | integer :: n
8 | integer :: i, ifac=1
| ^^^^^^ T051
9 | do i=2,n
10 | ifac = ifac*i
|
-- Number of errors: 7
-- For more information, run:
fortitude explain [ERROR_CODES]
It looks like it is using this tree-sitter-based Fortran parser: stadelmanma/tree-sitter-fortran (Fortran grammar for tree-sitter) on GitHub.
We've been discovered! We're currently still in a fairly beta phase and maybe not quite ready for full production use yet, although I do think it is already a useful tool. The current push is to get auto-fixes working before we start shouting a bit more about it.
It is indeed built on the tree-sitter grammar, so it is very fast. I just ran `fortitude check` on a large, popular package, and it checked 473 files, finding 38,000 warnings, in 5 seconds. Tree-sitter grammars are also error-tolerant, which means we'll be able to check as you type, especially once we get LSP integration.
We're heavily inspired by the Python linter `ruff`, and in fact are building on that too.
Current goals:
- auto-fixes (development version can make suggestions, but not apply them)
- multiple output formats (JSON, GitHub, etc)
- more checks
Longer term goals:
- topiary integration for formatting (someone has started looking at this just this week)
- fixed-form checking and auto-conversion
- preprocessor compatibility
Feedback and contributors are very welcome
@zedthree thanks for implementing this!
I was curious about the speed of the tree-sitter parser, so I compiled fortitude with `cargo build -r`. I used this 100,000-line benchmark file: bench3.f90 · GitHub. (We should figure out a better one by taking some real project, and concatenating all files into just one file, in the right order.)
On Ubuntu 22.04 in WSL on a Surface 5 laptop, I got with fortitude:
$ time ../../../../../fortitude/target/release/fortitude check bench3.f90
fortitude: 1 files scanned.
All checks passed!
real 0m0.651s
user 0m0.621s
sys 0m0.031s
and LFortran:
$ time ./lfortran --backend=wasm bench3.f90 -o a.wasm --time-report
Codegen Time report:
ASR -> wasm: 96
Save: 1
Total: 97
Allocator usage of last chunk (MB): 16.1908
Allocator chunks: 1
Time report:
File reading: 2
Src -> AST: 47
AST -> ASR: 40
ASR -> wasm: 98
Total: 187
real 0m0.198s
user 0m0.178s
sys 0m0.020s
And Flang 19 (from conda):
$ time flang-new -fsyntax-only bench3.f90
real 0m3.390s
user 0m3.069s
sys 0m0.192s
$ time flang-new bench3.f90
real 0m18.782s
user 0m27.811s
sys 0m2.551s
Note that Flang seems to compile in parallel (that's why the "user" time is larger than "real"). LFortran and GFortran compile on a single core in this benchmark.
And GFortran 13.2.0 from conda:
$ time gfortran bench3.f90
real 0m18.657s
user 0m17.633s
sys 0m0.602s
Would you say that Fortitude parsing and running the checks would be equivalent to LFortran or Flang parsing and doing semantic analysis?
Compiler | Parsing and semantics (s) | Full compilation (s) | Lines / s (parsing / full) |
---|---|---|---|
Fortitude | 0.651 (7.3x) | - | 153,625 / - |
LFortran | 0.089 (1.0x) | 0.198 (1.0x) | 1,123,707 / 505,101 |
Flang | 3.390 (38.1x) | 18.782 (94.9x) | 29,501 / 5,324 |
GFortran | - | 18.657 (94.2x) | - / 5,360 |
Above, LFortran was compiling to WebAssembly, which is a nice reference as it shows what is possible, and why we are developing a new compiler. (The 0.089 s LFortran figure is file reading + Src -> AST + AST -> ASR from the time report: 2 + 47 + 40 = 89 ms.)
@zedthree if you want to collaborate on our very fast parser and maybe use it in Fortitude, let me know! Our parser to AST is beta quality, meaning it is expected to work for most codes. It's error resilient (we are still making improvements to the resiliency).
Nice, that is very fast! Yes, it's probably roughly equivalent to AST + semantics, although we probably don't do as much as a real compiler.
If you have Rust bindings, it would definitely be interesting to try and use lfortran in fortitude, especially if we can then do some more in-depth semantic checks.
The main advantage of tree-sitter parsers is that they're very lightweight to build (just a standalone C library), with bindings in many languages. There's a language-agnostic ecosystem around them too; for example, various editors can use tree-sitter grammars for syntax highlighting.
Interestingly, I see much less of a difference on my machine (although lfortran is still 2x faster!):
$: hyperfine "tree-sitter parse --quiet bench3.f90"
Benchmark 1: tree-sitter parse --quiet bench3.f90
Time (mean ± σ): 473.5 ms ± 4.1 ms [User: 442.6 ms, System: 29.5 ms]
Range (min … max): 469.7 ms … 483.9 ms 10 runs
$: hyperfine "lfortran --backend=wasm bench3.f90 -o a.wasm --time-report"
Benchmark 1: lfortran --backend=wasm bench3.f90 -o a.wasm --time-report
Time (mean ± σ): 287.5 ms ± 5.5 ms [User: 254.5 ms, System: 32.7 ms]
Range (min … max): 280.8 ms … 298.1 ms 10 runs
and also with just `--show-ast`:
$: hyperfine "lfortran bench3.f90 --show-ast > /dev/null"
Benchmark 1: lfortran bench3.f90 --show-ast > /dev/null
Time (mean ± σ): 202.8 ms ± 2.4 ms [User: 165.8 ms, System: 37.9 ms]
Range (min … max): 199.6 ms … 209.1 ms 14 runs
Comparison with a more real-world case shows very little difference:
$: wc -l gs2/src/init_g.f90
4574 gs2/src/init_g.f90
$: hyperfine "tree-sitter parse gs2/src/init_g.f90 --quiet"
Benchmark 1: tree-sitter parse gs2/src/init_g.f90 --quiet
Time (mean ± σ): 36.8 ms ± 1.0 ms [User: 33.4 ms, System: 3.7 ms]
Range (min … max): 35.3 ms … 42.1 ms 75 runs
$: hyperfine "lfortran gs2/src/init_g.f90 --show-ast > /dev/null"
Benchmark 1: lfortran gs2/src/init_g.f90 --show-ast > /dev/null
Time (mean ± σ): 40.2 ms ± 1.0 ms [User: 30.0 ms, System: 10.3 ms]
Range (min … max): 38.7 ms … 45.5 ms 67 runs
(`--show-ast` is needed here to avoid building the `.mod` files)
I also test the tree-sitter grammar against a corpus of ~160 projects, with the following results:
Total parses: 19211; successful parses: 19045; failed parses: 166; success percentage: 99.14%; average speed: 6155 bytes/ms
(some of those parsing failures are due to invalid Fortran!)
Just based on that `bench3.f90` file, it looks like lfortran is about 8000 bytes/ms. Not much in it!
Yes, you have to be careful what to compare against. In your benchmark you compared `lfortran --backend=wasm bench3.f90 -o a.wasm --time-report`, which creates a WASM binary on disk (so it does SRC->AST, then AST->ASR, various ASR passes, and ASR->WASM, and saves the file to disk), against `tree-sitter parse --quiet bench3.f90`, which presumably just does an equivalent of `SRC->AST`. So `tree-sitter` should be several times faster than lfortran on this benchmark, but it is 2x slower.
On smaller codes you also get various other overheads. For example, `--show-ast` walks the whole AST and prints it, so if you are not printing something similar in `tree-sitter parse`, the best way is to read the timing from `--time-report` just for `SRC->AST`.
Are you worried that LFortran is not faster than tree-sitter for smaller codes? If so, send me the `init_g.f90` file and I can have a look. A good benchmark is to emit some warnings in both fortitude and LFortran, ideally the same warnings, and benchmark how long it takes end to end on some actual code.
We currently don't have Rust bindings, but my understanding is that it is very easy to call C++ from Rust.
The main advantage of collaborating on the same parser and semantics is that it gets reused in all our projects. Thanks to LFortran, we guarantee that the parsing and semantics are all correct, because that is what the compiler uses, and there are a lot of corner cases in parsing (including fixed-form) that we are fixing as we discover them, so you can reuse all this work. We can make the parser independent from LFortran, either to AST, or even to ASR (with semantics). It's just a few files and they don't depend on the rest of the compiler. So, for example, there could be a Rust Cargo package that just builds the parser and uses it (LFortran source code is small, so we can add some CMake option to only build the parser, and the Cargo package would turn this option on).
This is definitely not a warning that I want to see…
Idem… The implicit kind suffix is fine for an assignment to a default real variable.
I disagree here: `double precision` can really be on purpose, to mean "higher precision than the default real", or to write interfaces to external procedures that are defined with `double precision` arguments or results. `real(real64)` is not exactly the same.
Most members of this forum know that
real :: pi = 3.14159265358979311599796346854
defines a single precision variable that is assigned to a single precision constant, but people new to Fortran don't, so I think the warnings are OK. (In Python/NumPy, R, and Matlab, floats are double precision by default.)
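For newcomers, the effect can be seen with a small sketch like the one below. The program and the names `pi_default`, `pi_suffixed`, and `dp` are made up for illustration. The idea is that the kind of a literal is fixed by its suffix, not by the variable it initialises, so on a compiler that reads the unsuffixed literal as default real the two values typically differ by roughly 1e-7. Some compilers may keep the extra digits as an extension, in which case the difference would be zero.

```fortran
program literal_kind_demo
  use, intrinsic :: iso_fortran_env, only: dp => real64
  implicit none
  ! Unsuffixed literal: a default-real constant, converted to real(dp) afterwards.
  real(dp), parameter :: pi_default = 3.14159265358979311599796346854
  ! Suffixed literal: evaluated as real(dp) from the start.
  real(dp), parameter :: pi_suffixed = 3.14159265358979311599796346854_dp

  print *, 'without suffix: ', pi_default
  print *, 'with _dp suffix:', pi_suffixed
  print *, 'difference:     ', abs(pi_default - pi_suffixed)
end program literal_kind_demo
```

This is the situation the P001 warning in the first post points at: the declaration looks like double precision, but the digits beyond single precision may already be gone.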
One can select or ignore individual rules or whole groups with `--select` and `--ignore`.
I modified the example program you posted and got interesting results.
real :: pi = 3.14159265358979311599796346854
double precision :: pi_d = 3.14159265358979311599796346854
print*, pi_d,pi
print*, dble(pi), 4*atan(1.0d0)
print*, factorial(3), factorial(4)
contains
integer function factorial(n)
integer :: n
integer :: i, ifac=1
do i=2,n
ifac = ifac*i
end do
factorial = ifac
end function factorial
end
There are clearly problems with this code, as the Fortran standard will not support what the original author appears to have intended.
These problems include:
- pi and pi_d are both initialised by a default real constant. Gfortran Ver 12 lets this author down, as it truncates the real constant, while Silverfrost FTN95 warns they "would be truncated under ISO", but retains the real64 value.
- using "integer :: i, ifac=1" will not be compiled by Fortran in the way the author has intended, as ifac is not initialised at each call.
- with ifac as a default integer, factorial(n) runs out of luck at n >= 13 (13! no longer fits in the usual 32-bit default integer).
While there are examples of compiler or scanner warnings in this thread, I think the warnings presented in the thread miss the key points of Formula Translation.
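The second and third points in the list above can be sketched as follows. This is only an illustration: the module and the names `factorial_saved` and `factorial_fixed` are invented here, and the 64-bit result is one possible fix among several. The calls are placed in separate statements so that the order of evaluation is not in question.

```fortran
module factorial_demo
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
contains
  ! As in the thread: the initialiser gives ifac the save attribute,
  ! so ifac is set to 1 only once, not on every call.
  integer function factorial_saved(n)
    integer, intent(in) :: n
    integer :: i, ifac = 1
    do i = 2, n
      ifac = ifac*i
    end do
    factorial_saved = ifac
  end function factorial_saved

  ! One possible fix: assign in an executable statement and use a
  ! 64-bit result, which holds factorials up to 20!.
  function factorial_fixed(n) result(fac)
    integer, intent(in) :: n
    integer(int64) :: fac
    integer :: i
    fac = 1_int64
    do i = 2, n
      fac = fac*i
    end do
  end function factorial_fixed
end module factorial_demo

program show_implicit_save
  use factorial_demo
  implicit none
  integer :: first, second
  first = factorial_saved(3)    ! 6
  second = factorial_saved(4)   ! 144, because ifac still holds 6
  print *, first, second
  print *, factorial_fixed(3), factorial_fixed(4)   ! 6 and 24
  print *, factorial_fixed(13)                      ! 6227020800
end program show_implicit_save
```

The saved version returns 144 for the second call because the loop starts from the 6 left over by the first call, which is the behaviour the T051 warning is hinting at.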
These are code smells rather than outright errors, and certainly the default warnings may not be for everyone.
At the moment, warnings can be selected or ignored on a whole-project basis (as @Beliavsky mentions), but the intention is to have both per-file settings and per-line ignores through comments, something like `! allow(double-precision)`, so that you could have a warning enabled in general except for the specific places where something is required for compatibility.
`fortitude explain` shows the rationale behind each rule:
$ fortitude explain P011
# P011: double-precision
## What it does
Checks for use of 'double precision' and 'double complex' types.
## Why is this bad?
The 'double precision' type does not guarantee a 64-bit floating point number
as one might expect. It is instead required to be twice the size of a default
'real', which may vary depending on your system and can be modified by compiler
arguments. For portability, it is recommended to use `real(dp)`, with `dp` set
in one of the following ways:
- `use, intrinsic :: iso_fortran_env, only: dp => real64`
- `integer, parameter :: dp = selected_real_kind(15, 307)`
For code that should be compatible with C, you should instead use
`real(c_double)`, which may be found in the intrinsic module `iso_c_binding`.
We very much welcome community input and feedback on rules (and their explanations), and what would be a sensible default set. Obviously not everything will be suitable for every project, and that's fine.
Then the warnings should be different. Instead of "warning: pi.f90:1:1: P021 real has implicit kind", it could be something like "warning: the default real in Fortran is most of the time 32 bits, with only 6 significant digits".
Regarding the literal constants without an explicit kind, I maintain that the warning should be issued only if it is used in an expression where a casting to a higher precision occurs, e.g.:
integer, parameter :: dp = kind(1d0)
real :: pi = 3.14159265358979311599796346854 ! no casting, no warning
real(dp) :: pi_d = 3.14159265358979311599796346854 ! casting, warning
! or
real :: x
real(dp) :: y
x = 42.42 * x ! no casting, no warning
y = 42.42 * y ! casting, warning
Having to add comments to disable warnings about perfectly valid features looks super-heavy to me…
The standard requires the double precision type to have at least 10 significant digits and a range of at least 37.
Technically this cannot fit into 32 bits. And in practice it will always be 64 bits (or more).
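To see what a given compiler actually provides, the standard inquiry functions can be used, as in this small sketch (the program and variable names are arbitrary). On common compilers and hardware one would typically expect something like 6 digits / range 37 / 32 bits for default real and 15 / 307 / 64 for double precision, but the values are processor-dependent and compiler options can change them.

```fortran
program double_precision_properties
  implicit none
  real :: r
  double precision :: d

  ! precision(), range() and storage_size() are inquiry functions:
  ! they depend only on the type and kind of their argument.
  print *, 'default real:     digits =', precision(r), &
           ' range =', range(r), ' bits =', storage_size(r)
  print *, 'double precision: digits =', precision(d), &
           ' range =', range(d), ' bits =', storage_size(d)
end program double_precision_properties
```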
For portability, it is recommended to use `real(dp)`, with `dp` set
in one of the following ways:
- `use, intrinsic :: iso_fortran_env, only: dp => real64`
- `integer, parameter :: dp = selected_real_kind(15, 307)`
Well, "portability" is probably not the right word here. As a matter of fact `double precision` is 100% portable, which is, strictly speaking, not the case for the 2 suggested solutions.
- The `iso_fortran_env` module may not be available (compilers are not required to provide it)
- `selected_real_kind(15, 307)` may return a negative number, as the standard doesn't require that such a kind exist.
Moreover:
- in contrast to `double precision`, `real64` has no requirement on the precision/range. It is just "some floating point stored on 64 bits". It could be a 32 bit (effective) floating point, with 32 bits unused, without violating anything.
- `real64` assumes that the 16/32/64/128 scale is here "forever". But who knows?
- if the intent is "I want a floating point number with at least this precision and this range", then the 2nd suggestion is the way to go, much better IMO than the first suggestion (but the warning explicitly suggests the first one).
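As a rough sketch of how the options listed above compare on a particular compiler, the kind values can simply be printed side by side. The names `dp_req` and `dp_dbl` are made up for this example, and the numbers printed are processor-dependent, so nothing here says what any given compiler will report.

```fortran
program kind_selection_check
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none
  integer, parameter :: dp_req = selected_real_kind(15, 307)
  integer, parameter :: dp_dbl = kind(1.0d0)

  if (dp_req < 0) then
    ! Negative results are status codes, not usable kind values.
    print *, 'no matching real kind, selected_real_kind returned', dp_req
  else
    print *, 'selected_real_kind(15, 307) =', dp_req
  end if
  print *, 'real64 =', real64, '  kind(1.0d0) =', dp_dbl
  print *, 'request matches double precision:', dp_req == dp_dbl
end program kind_selection_check
```

Checking the sign of `dp_req` before using it in a declaration is the guard against the negative-return case mentioned in the list.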
LFortran gives you this by default:
$ lfortran a.f90
warning: Assuming implicit save attribute for variable declaration
--> a.f90:11:15
|
11 | integer :: i, ifac=1
| ^^^^^^ help: add explicit save attribute or parameter attribute or initialize in a separate statement
Note: Please report unclear, confusing or incorrect messages as bugs at
https://github.com/lfortran/lfortran/issues.
...
Congratulations, I think this is a really interesting and helpful tool.
Two comments from the perspective of a non-computer scientist using fortitude in a Windows 11 environment with Python 3.13.0:
I can see some effort has been put into using colour in the output, and this works well at the command line but is not so good if one redirects output to a file, e.g. `fortitude check > fortitude.out`. Can this redirection be detected so that the colour escape characters do not appear in fortitude.out? Or perhaps there could be a command line switch such as "-no_colour"?
I followed the instructions for a fortitude.toml file but I get an error for the line `select = ["S", "T"]`:
ERROR: Unknown rule codes ["S", "T"]
Great work, and I have a new perspective on my code already.
Norman
I propose that there could be a badge. Fortitude is set to increase the quality of the code style, with a similar intent as, for instance, black for Python. A couple of projects on GitHub use a badge to show that a project passes such a set of criteria, including black itself.
Now playing a bit with the parameters shields.io provides to set up such a badge, with the RGB (115,79,150) tuple of the Fortran logo (source) that Inkscape reads out in hand, and given that shields.io already knows about the Fortran logo, I came up with the definition of
![Static Badge](https://img.shields.io/badge/fortitude-%2520?style=plastic&logo=fortran&logoColor=115%2C79%2C150&labelColor=purple&color=green&link=https%3A%2F%2Fgithub.com%2Fageweke%2Ffortitude)
which visually resolves to a small badge.
Green would indicate "passing", or "after the application of fortitude". (I agree, compared with other green badges this named green is not so good yet, and could/should be improved further.*) However, in addition to this indicator, the copy-paste of the HTML the page equally exports to
https://img.shields.io/badge/fortitude-%2520?style=plastic&logo=fortran&logoColor=115%2C79%2C150&labelColor=purple&color=green&link=https%3A%2F%2Fgithub.com%2Fageweke%2Ffortitude
i.e. it includes a link back to the repository of Fortitude, and hence potentially attracts additional interest in your project too.
* Given their previous work on Fortran related logos, I presume @jacobwilliams or @vmagnin better have an eye on it.
I was not involved in the Fortran logo design, but I have used its purple color on several occasions and I can confirm it is RGB = #734f96 = (115, 79, 150), as you used in your badge.
I agree that `fortitude` sounds great (haven't tried it yet), but it's akin to `eslint`, so having a badge for it is too subjective.
In the JavaScript realm, what some projects do is have a badge for the well-known set of `eslint` rules the project adheres to, e.g., "standard", "airbnb", etc.
As some of the comments in this thread have shown, `fortitude` defaults are somewhat opinionated, so at least one set of rules truly compatible with the latest Fortran standard is required.
Thanks Norman! The first issue with colour will be fixed in the next release (hopefully this coming week); the second issue with selecting whole categories is actually a new feature in the next release! We've not got a docs website sorted yet, so we have this slightly annoying issue of the README being ahead of the latest published release.
@nbehrnd A badge would be nice, though I fear we'd be getting a little ahead of ourselves to have one already!
@jwmwalrus Yes, the current default set is just "everything" at the mo. We're likely to remain pretty opinionated (that's sort of the job of a linter, after all), but we definitely appreciate we'll need to evolve both the default set, as well as our categories.
@zedthree I am having a lot of fun with fortitude and it is improving my code no end. I have violated T042 (life, the universe and everything…) a lot in my code, so I tried "explain T042", and the example corrected code is wrong:
program example
character(len=3) :: short_text
call set_text(short_text)
print*, short_text
contains
subroutine set_text(text)
character(len=:), allocatable, intent(out) :: text
text = "longer than 3 characters"
end subroutine set_text
end program
In the main program, the declaration of short_text should instead read
character(len=:), allocatable :: short_text
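With that one-line change applied, the example would look roughly like this. It is only a sketch of the suggestion: `implicit none` and the extra `len` print are additions for illustration, and whether this is what the `fortitude explain T042` text is meant to show is an assumption.

```fortran
program example
  implicit none
  ! Deferred length, so the actual argument matches the allocatable dummy.
  character(len=:), allocatable :: short_text

  call set_text(short_text)
  print *, short_text
  print *, len(short_text)   ! 24
contains
  subroutine set_text(text)
    character(len=:), allocatable, intent(out) :: text
    text = "longer than 3 characters"
  end subroutine set_text
end program example
```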
Keep up the great work