Parsing contents of .mod files using Python

gnikit · April 16, 2022, 11:57pm

While developing the fortls Language Server, I have always been troubled by how to deal with precompiled module files (.mod files). In contrast to normal modules, the source code of the external module is not required to be present. This makes extracting information like public interfaces out of .mod files extremely complicated. Further complicating matters, .mod file contents vary between compilers.

In gfortran at least, .mod files are just an optimised and gzipped version of the Abstract Syntax Tree (AST) structure the compiler normally generates. Doing:

zcat modulefile.mod

will output the AST in plain text. However, the problem is that gfortran optimises away the PUBLIC/PRIVATE attributes that are normally attached to the AST nodes, so the format of the extracted information (and maybe the information itself) is not very helpful.
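The zcat step can also be done from Python with the standard library. This is only a minimal sketch: read_mod_text is a hypothetical helper name, and it assumes the file is either gzip-compressed (detected by the gzip magic bytes) or already plain text. It only recovers the text; it does not interpret it.

```python
import gzip
from pathlib import Path

# First two bytes of every gzip stream.
GZIP_MAGIC = b"\x1f\x8b"


def read_mod_text(path):
    """Return the textual contents of a .mod file.

    Hypothetical helper: decompresses the file when it starts with the
    gzip magic bytes, otherwise returns the bytes decoded as-is.
    """
    raw = Path(path).read_bytes()
    if raw.startswith(GZIP_MAGIC):
        raw = gzip.decompress(raw)
    # Replace undecodable bytes instead of failing on unexpected content.
    return raw.decode("utf-8", errors="replace")
```

Calling read_mod_text("modulefile.mod") should give the same text that zcat prints for a gzip-compressed file.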

One can generate the normal AST and compare the differences:

gfortran -fdump-fortran-original mysrc.f90
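For anyone wanting to capture that dump from Python, here is a rough sketch. It assumes gfortran is on PATH and that the dump is written to standard output. The -c flag is my own addition so that the compiler stops before linking, and the function names are hypothetical.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def build_dump_command(source, compiler="gfortran"):
    """Build the command line that asks gfortran for its original parse tree."""
    # "-c" (compile only, no link) is an assumption added for this sketch.
    return [compiler, "-c", "-fdump-fortran-original", str(Path(source).resolve())]


def dump_parse_tree(source, compiler="gfortran"):
    """Run the compiler and return whatever it prints on standard output."""
    if shutil.which(compiler) is None:
        raise FileNotFoundError(f"{compiler} was not found on PATH")
    # Compile inside a scratch directory so the .o/.mod by-products are discarded.
    with tempfile.TemporaryDirectory() as workdir:
        result = subprocess.run(
            build_dump_command(source, compiler),
            cwd=workdir,
            capture_output=True,
            text=True,
            check=True,
        )
    return result.stdout
```

The returned text can then be diffed against the decompressed .mod contents.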

Making matters worse, AST representations also differ between compilers, so for fortls to be able to parse any .mod file it would need to parse every compiler's AST representation in Python, which is unlikely to happen.

I would be interested in hearing your ideas about how one could parse some of the contents of a .mod file.

Are there any ways of getting the public information + attributes and types that do not involve parsing the compiler-generated AST?
Are there any Python APIs (GCC, Intel, LLVM, etc.) that could be used to obtain the ASTs in Python?
If not, would it be a good idea to create some?

rwmsu · April 17, 2022, 2:45pm

One solution to the incompatible module file problem, which plagues anyone who wants to write libraries or applications usable with multiple compilers without making the user build a separate version for each compiler he/she chooses just to get compatible module files, is for the Fortran community in general, and the standards committee in particular, to define a standard module file format in some kind of markup-like language (maybe XML, etc.) that can be read and/or generated as an option alongside the default vendor formats. This could be controlled by compiler flags, etc. A precompile step (hidden from the user) would probably be needed, but I’m willing to sacrifice a little compile time for the flexibility that transportable module files would give you.

just my 2 cents

gnikit · April 17, 2022, 6:15pm

Yes that would be ideal, a strategy similar to C++'s headers would be nice. I think the closest Fortran has to this are submodules.

As for adding the .mod file generation to the standard, I suspect that would be a hard sell to the committee members. Other than allowing for compiler agnostic Language Servers to provide information for external modules, I don’t necessarily see a lot of other applications.

Maybe it will make the lives of compiler Devs easier (in the long run) and improve the available tooling for the language and maybe it won’t. My guess is that it would require a disproportionately greater amount of work to standardise and implement when compared to the potential perceived benefits.

If any J3 committee members want to share their thoughts on this it would be great.

rfarmer · April 17, 2022, 7:12pm

My library gfort2py (rjfarmer/gfort2py on GitHub, a library to allow calling Fortran code from Python) parses most of a gfortran mod file. The in-development version (gfort2py/module_parse.py on the dev2 branch) does a better job of tracking all the data present, even if I don’t do anything with it.

Mod files are definitely not the way to go if you need to support multiple compilers. In that case all you can do is parse the fortran source code (like f2py does). Anything else will be entirely compiler dependent.
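To give a flavour of the source-parsing route, here is a deliberately naive sketch that scans free-form Fortran for module names and standalone PUBLIC/PRIVATE statements. It is not how f2py or fortls work; scan_module_access is a hypothetical name, and the regexes ignore continuation lines, attribute-style declarations such as "integer, public :: x", PRIVATE statements inside derived types, and "!" characters inside strings.

```python
import re

# "module foo" on its own line, but not "module procedure ..." or "end module".
MODULE_RE = re.compile(r"^module\s+(?!procedure\b)(\w+)$", re.IGNORECASE)
# Standalone access statements: "private", "public :: a, b", "public a".
ACCESS_RE = re.compile(r"^(public|private)\b\s*(?:::)?\s*(.*)$", re.IGNORECASE)


def scan_module_access(source):
    """Map module name -> default accessibility and explicitly listed names."""
    modules = {}
    current = None
    for raw in source.splitlines():
        # Naive comment stripping; breaks on '!' inside character literals.
        line = raw.split("!", 1)[0].strip()
        match = MODULE_RE.match(line)
        if match:
            # Fortran module entities are public unless stated otherwise.
            current = {"default": "public", "public": [], "private": []}
            modules[match.group(1).lower()] = current
            continue
        if current is None:
            continue
        match = ACCESS_RE.match(line)
        if match:
            keyword = match.group(1).lower()
            names = [n.strip().lower() for n in match.group(2).split(",") if n.strip()]
            if names:
                current[keyword].extend(names)
            else:
                # A bare "private" or "public" sets the module default.
                current["default"] = keyword
    return modules
```

A real tool would need a proper Fortran parser, but this shows why the source route is compiler independent: nothing here depends on any .mod format.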

everythingfunctional · April 18, 2022, 1:39pm

This can’t (I believe) actually be true. It would prevent submodules from working correctly. When compiling a module, it cannot be known that a submodule will not be used. And because a submodule has access to private module entities, they must always be included in a *.mod file. I’ll admit that a submodule wouldn’t be of any use without any interface blocks, but the presence of an interface block also doesn’t mean that there will necessarily be a submodule either.

everythingfunctional · April 18, 2022, 2:15pm

ok, now try and compile the following:

submodule (foo) foo_s
  implicit none
contains
  function baz()
    integer baz
    baz = bah()
  end function
end submodule

It of course wouldn’t be useful for anything, as nothing can see baz, but it’s still valid Fortran.

gnikit · April 18, 2022, 2:19pm

I think it would help if one could compare with the unoptimised AST. Basically, some of these UNKNOWN entries correspond to the visibility attribute:

Namespace: A-Z: (UNKNOWN 0)
procedure name = foo
  symtree: 'bah'         || symbol: 'bah'          
    type spec : (INTEGER 4)
    attributes: (PROCEDURE PRIVATE MODULE-PROC  FUNCTION IMPLICIT_PURE)
    result: bah
  symtree: 'bar'         || symbol: 'bar'          
    type spec : (INTEGER 4)
    attributes: (PROCEDURE PUBLIC MODULE-PROC  FUNCTION IMPLICIT_PURE)
    result: bar
  symtree: 'foo'         || symbol: 'foo'          
    type spec : (UNKNOWN 0)
    attributes: (MODULE )

  code:
CONTAINS

  Namespace: A-Z: (UNKNOWN 0)
  procedure name = bah
    symtree: 'bah'         || symbol: 'bah' from namespace 'foo'

    code:
    ASSIGN foo:bah 1
    

CONTAINS

  Namespace: A-Z: (UNKNOWN 0)
  procedure name = bar
    symtree: 'bah'         || symbol: 'bah' from namespace 'foo'
    symtree: 'bar'         || symbol: 'bar' from namespace 'foo'

    code:
    ASSIGN foo:bar bah[[()]]
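A dump in this shape is regular enough to scrape for visibility. The following is a small sketch, assuming the layout shown above (a symtree line followed by an attributes line for each symbol); the function names are hypothetical, and the dump format is compiler-internal output that may change between gfortran versions.

```python
import re

SYMTREE_RE = re.compile(r"symtree:\s*'([^']+)'")
ATTRS_RE = re.compile(r"attributes:\s*\(([^)]*)\)")

SAMPLE = """\
  symtree: 'bah'         || symbol: 'bah'
    type spec : (INTEGER 4)
    attributes: (PROCEDURE PRIVATE MODULE-PROC  FUNCTION IMPLICIT_PURE)
  symtree: 'bar'         || symbol: 'bar'
    type spec : (INTEGER 4)
    attributes: (PROCEDURE PUBLIC MODULE-PROC  FUNCTION IMPLICIT_PURE)
  symtree: 'foo'         || symbol: 'foo'
    type spec : (UNKNOWN 0)
    attributes: (MODULE )
"""


def symbol_attributes(dump_text):
    """Map each symbol name to the attribute list of its first definition."""
    symbols = {}
    current = None
    for line in dump_text.splitlines():
        match = SYMTREE_RE.search(line)
        if match:
            current = match.group(1)
            continue
        match = ATTRS_RE.search(line)
        if match and current is not None:
            # Keep the first attribute list seen; nested namespaces only
            # reference symbols ("from namespace ...") without attributes.
            symbols.setdefault(current, match.group(1).split())
            current = None
    return symbols


def visibility(attrs):
    """Return 'PUBLIC', 'PRIVATE', or None when neither flag is present."""
    for flag in ("PUBLIC", "PRIVATE"):
        if flag in attrs:
            return flag
    return None


report = {name: visibility(attrs) for name, attrs in symbol_attributes(SAMPLE).items()}
# report == {'bah': 'PRIVATE', 'bar': 'PUBLIC', 'foo': None}
```

This only works when the source is available to produce the dump, so it does not solve the external .mod case by itself.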
