I’ve recently started working on a large, old Fortran codebase. It is of the style wherein all variables are either subroutine-local or stored in common blocks. Of course, all variables are implicitly typed, as well. I’d like to move toward something where parameters are either passed in or stored in modules. However, it would be useful to have an automatic way of determining (1) all of the variables which are implicitly declared/used in a given subroutine, and (2) whether they are read from, written to, or both (basically the intent). Are there any tools or libraries which would be useful for extracting this? FORD does some of these things, but I haven’t figured out how to extract quite what I’m looking for. In principle, given an AST or ASR from, e.g., LFortran, it should be possible to extract it as well.
Welcome to the forum.
If a subroutine accesses variables in a module mod, but there is no use mod statement in the subroutine, and if the implicit none statement is present, the compiler should complain about the undeclared variables. One could write a program to process the error messages from the compiler to get their names. You can declare a module variable to be protected. If a subroutine that accesses the module variable still compiles, the variable is effectively intent(in) for that subroutine.
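A minimal sketch of that post-processing step, in Python. The message pattern is an assumption modeled on gfortran's "has no IMPLICIT type" wording, and `undeclared_names` is a hypothetical helper name. Other compilers phrase the error differently, and a compiler that stops after a fixed number of errors will only yield a partial list.

```python
import re

# Assumed message shape, modeled on gfortran's wording; adjust for other compilers.
# The name may be wrapped in ASCII or typographic quotes depending on the locale.
UNDECLARED = re.compile(
    r"Symbol\s+['‘]?(\w+)['’]?\s+at\s+\(\d+\)\s+has no IMPLICIT type"
)


def undeclared_names(compiler_output: str) -> list[str]:
    """Return undeclared names, lower-cased, in first-seen order, without duplicates."""
    seen = {}
    for match in UNDECLARED.finditer(compiler_output):
        # Fortran names are case-insensitive, so normalise before de-duplicating.
        seen.setdefault(match.group(1).lower(), None)
    return list(seen)


# Hypothetical captured compiler output, for illustration only.
sample_errors = """\
Error: Symbol 'x' at (1) has no IMPLICIT type
Error: Symbol 'NPTS' at (1) has no IMPLICIT type
Error: Symbol 'x' at (1) has no IMPLICIT type
"""
print(undeclared_names(sample_errors))
```

In practice the input would be the captured stderr of a compile run on a copy of the subroutine with implicit none added.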
@Beliavsky Thanks for the response and the welcome.
I guess I wasn’t clear: the code currently has neither modules nor implicit none statements. Almost all variables are implicit (with explicit implicit statements) and are either local or in common blocks. Most subroutines do not have any dummy arguments.
I’m looking for a tool which will help extract lists of variables and their types, as well as whether they’re read from, written to, or both. I have done a little bit by throwing an implicit none into a subroutine and then repeatedly addressing individual compiler errors, but that’s maddeningly tedious… I’m not sure I can handle that for tens of thousands of lines of code.
There might be two different things going on here: I’m trying to track down what variables exist in each subroutine, but also I’m trying to determine which subroutines mutate which global state. (And of course, since everything is in common blocks, everything is global state…)
I don’t think that parsing the compiler output will be sufficient, either – the compiler usually bails after enough errors, so you don’t ever get the full set of name errors.
I do not know of a tool that does all of that.
If the variables are declared in the common blocks the same way in all the subroutines, then there is a straightforward procedure to convert those to module variables. If the common block variables are declared differently in the various subroutines, then this procedure does not work in a straightforward way, and more analysis and testing will be required.
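To find out which of the two cases applies, one could start with a rough scan like the Python sketch below. It is a heuristic under stated assumptions, not a parser: each COMMON statement is assumed to sit on a single line and name exactly one block, and only the member lists are compared, not the types. The helper names `common_variants` and `inconsistent_blocks` are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: one named block per statement, no continuation lines.
COMMON = re.compile(
    r"^\s*common\s*/\s*(\w+)\s*/\s*(.+)$", re.IGNORECASE | re.MULTILINE
)


def common_variants(source: str) -> dict[str, set[str]]:
    """Map each common block name to the distinct member lists seen for it."""
    variants = defaultdict(set)
    for name, members in COMMON.findall(source):
        members = members.split("!")[0]  # drop a trailing free-form comment
        normalized = re.sub(r"\s+", "", members).lower()
        variants[name.lower()].add(normalized)
    return dict(variants)


def inconsistent_blocks(source: str) -> list[str]:
    """Names of blocks that are spelled out in more than one way."""
    return sorted(
        name for name, seen in common_variants(source).items() if len(seen) > 1
    )


# Hypothetical legacy source, for illustration only.
sample_source = """\
      subroutine a
      common /state/ x, y, n
      end
      subroutine b
      common /state/ x, y,n
      end
      subroutine c
      common /state/ x, z, n
      common /grid/ h
      end
"""
print(inconsistent_blocks(sample_source))
```

Blocks that never show up in the result are candidates for the straightforward conversion; the rest need a closer look by hand.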
First, before you do anything else, set up a testing suite. This should ideally test every subroutine directly, but if that is not practical, then it should at least indirectly test every subroutine. You want to avoid the situation where modified code is never executed and tested because of the way the program flow depends on the input. So try to make sure all your modifications are actually tested somehow.
Ok, for the common block replacement, first create a module and put the common block in it, with all the variables declared appropriately. Then go through your subroutines and replace the common blocks with a use statement. This can be done one-at-a-time, or it can be done in batches. Run your test suite after each substitution to verify that everything works. After all the common blocks have been eliminated in the subroutines, then you can eliminate the common block in the module, leaving just the module variables and their declarations. If your code used INCLUDE or #include to reference the common blocks in the first place, then that helps things a little. But neither of those were standard in f77 and before, so it is typical to see multiple common block declarations scattered throughout the legacy code.
As for the subroutine argument lists, that is a tedious task. As you have noted, you can replace the implicit declaration with implicit none in each subroutine, and then make explicit variable declarations that are consistent with the old implicit declarations. The intent declarations require even more manual effort, particularly getting the intent(inout) and intent(out) ones correct. It isn’t always obvious from looking at just one subroutine what the appropriate declaration is; sometimes you need to look more globally.
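Working out which type an explicit declaration must have in order to match the old implicit rules is mechanical, so it can be scripted. The Python sketch below is one way to do it, under assumptions: it starts from Fortran's default rule (names beginning with I through N are integer, everything else is real) and handles only IMPLICIT statements with a single type and one parenthesised letter list. The function names are hypothetical.

```python
import re

# Assumption: one type spec followed by one letter list per IMPLICIT statement.
IMPLICIT = re.compile(r"^\s*implicit\s+(.+?)\s*\(([^()]*)\)\s*$", re.IGNORECASE)


def default_rules() -> dict[str, str]:
    """Fortran's default implicit typing: I through N integer, all else real."""
    return {
        letter: ("integer" if "i" <= letter <= "n" else "real")
        for letter in "abcdefghijklmnopqrstuvwxyz"
    }


def apply_implicit(rules: dict[str, str], statement: str) -> dict[str, str]:
    """Return a copy of rules updated by one IMPLICIT statement."""
    match = IMPLICIT.match(statement)
    if match is None:
        raise ValueError(f"unsupported IMPLICIT statement: {statement!r}")
    type_spec = " ".join(match.group(1).lower().split())
    updated = dict(rules)
    for item in match.group(2).lower().replace(" ", "").split(","):
        first, _, last = item.partition("-")
        last = last or first  # a single letter is a one-letter range
        for code in range(ord(first), ord(last) + 1):
            updated[chr(code)] = type_spec
    return updated


def implicit_type(rules: dict[str, str], name: str) -> str:
    """Type that the implicit rules assign to an undeclared name."""
    return rules[name[0].lower()]


rules = apply_implicit(default_rules(), "      implicit double precision (a-h, o-z)")
for name in ("x", "npts", "Energy"):
    print(name, "->", implicit_type(rules, name))
```

Names that already have an explicit type declaration in the subroutine are not covered by the implicit rules and would need to be excluded before generating new declarations.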
Welcome to Fortran archeology! I think that fpt (http://simconglobal.com) will do what you need. It will declare all undeclared objects, show all the interfaces (arguments and through COMMON) and provide a lot of other documentation. It will also check for many classes of errors.
@Jcollins fpt looks like it might do what I want!
- Add Declarations - declaring all undeclared variables
- Add IMPLICIT NONE - enforcing IMPLICIT NONE, removing existing IMPLICIT statements
and more…
@Jcollins I haven’t had the pleasure of working with fpt, but it seems like an amazing piece of work built up over a long time. As someone interested in tooling for Fortran and its semantics, would it be possible some time to speak directly with you about its techniques?
@gak - yes, of course. Please e-mail me at john.collins@simconglobal.com and we can arrange a call. I suggest a zoom call so that we can share screens.
Best wishes,
John
@pjfasano we are close to compiling all of SciPy with LFortran to ASR (our intermediate representation) that has all the types and interfaces explicitly figured out. It would not be difficult to write a Fortran backend to convert ASR back to (modern) Fortran. People have contributed a Julia backend for example. Right now my main focus is on compiling more codes. But if you or anyone want to contribute the backend, let me know, I’ll help.
Another program that performs static code analysis on Fortran 77 programs is FTNCHEK. I used it to great advantage many years ago. I believe it does everything that you mentioned.
You might want to check plusFORT, a tool that was introduced here by their developers. It is commercial software, but I had the opportunity to try it since people in this Discourse got a free license for a limited amount of time. It was actually quite good, considering how hard what it tries to do is. If I had to convert massive legacy codes to a more modern Fortran, I would consider buying a license.
Among other things, plusFORT automatically adds intent attributes to procedure argument declarations, which is one of the things you want to do. It is also very configurable in what it does and how. But I bet the legacy code you have is also full of goto statements, which are the worst problem with old codes, causing spaghetti hell. Restructuring the code to minimize gotos is also a thing plusFORT tries to do.