While developing the fortls Language Server I have always struggled with how to deal with precompiled module files (.mod files). In contrast to normal modules, the source code of the external module is not required to be present. This makes extracting information like public interfaces out of .mod files extremely complicated. Further complicating matters, .mod file contents vary between different compilers.
In gfortran at least, .mod files are just an optimised and gzipped version of the Abstract Syntax Tree (AST) structure the compiler normally generates. Decompressing a .mod file with any gzip-compatible tool will output the AST in plain text. However, the problem is that gfortran optimises away the PUBLIC/PRIVATE attributes that are normally attached to the AST nodes, so the format of the extracted information (and maybe the information itself) is not very helpful.
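As a rough illustration of the decompression step in Python (the language fortls is written in), here is a minimal sketch. The function name `read_gfortran_mod` is made up for this example and is not part of fortls. It assumes the file comes from gfortran and falls back to plain text when the gzip magic bytes are missing, since older gfortran releases wrote uncompressed module files.

```python
import gzip
import sys
from pathlib import Path

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def read_gfortran_mod(path):
    """Return the text stored in a gfortran .mod file (hypothetical helper).

    Assumption: the file was written by gfortran. Compressed files are
    inflated first; anything else is treated as already being plain text.
    """
    raw = Path(path).read_bytes()
    if raw.startswith(GZIP_MAGIC):
        raw = gzip.decompress(raw)
    # Undecodable bytes are replaced rather than raising an error.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python read_mod.py mymodule.mod
    print(read_gfortran_mod(sys.argv[1]))
```

This only gets you the raw text. Making sense of the compiler-specific structure inside it is the hard part discussed here.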
One can generate the normal AST and compare the differences:
gfortran -fdump-fortran-original mysrc.f90
Making matters worse, AST representations also differ between compilers, so for fortls to be able to parse any .mod file it would need a Python parser for every compiler's AST representation, which is unlikely to happen.
I would be interested in hearing your ideas about how one could parse some of the contents of a .mod file.
Are there any ways of getting the public information + attributes and types that do not involve parsing the compiler-generated AST?
Are there any Python APIs (GCC, Intel, LLVM, etc.) that could be used to obtain the ASTs in Python?
The incompatible module file problem plagues anyone who wants to write libraries and/or applications that can be used with multiple compilers without forcing the user to compile a different version for each compiler he/she chooses to use, just to have compatible module files. One solution is for the Fortran community in general, and the standards committee in particular, to define a standard module file format in some kind of markup-like language (maybe XML etc.) that can be read and/or generated as an option alongside the default vendor formats. This could be controlled by compiler flags etc. A precompile step (hidden from the user) would probably be needed, but I’m willing to sacrifice a little compile time for the flexibility that transportable module files would give you.
Yes that would be ideal, a strategy similar to C++'s headers would be nice. I think the closest Fortran has to this are submodules.
As for adding the .mod file generation to the standard, I suspect that would be a hard sell to the committee members. Other than allowing compiler-agnostic Language Servers to provide information for external modules, I don’t necessarily see a lot of other applications.
Maybe it will make the lives of compiler developers easier (in the long run) and improve the available tooling for the language, and maybe it won’t. My guess is that standardising and implementing it would require a disproportionate amount of work compared to the perceived benefits.
If any J3 committee members want to share their thoughts on this it would be great.
Mod files are definitely not the way to go if you need to support multiple compilers. In that case all you can do is parse the Fortran source code (like f2py does). Anything else will be entirely compiler-dependent.
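To make the source-parsing route a bit more concrete, here is a minimal Python sketch that scans free-form Fortran source for module-level PUBLIC/PRIVATE statements. It is not how f2py or fortls actually do it, and the name `scan_module_visibility` is invented for this example. It is deliberately naive: it ignores continuation lines, fixed-form source, `!` inside strings, and attributes given on declarations such as `integer, public :: x`.

```python
import re

# "module foo" on a line by itself (so not "module procedure", "module function f(x)", ...).
MODULE_RE = re.compile(r"^module\s+(?!procedure\b)(\w+)\s*$", re.IGNORECASE)
END_MODULE_RE = re.compile(r"^end\s*module\b", re.IGNORECASE)
# Start of a derived-type definition, but not a declaration like "type(foo) :: x".
TYPE_START_RE = re.compile(r"^type\b(?!\s*\()", re.IGNORECASE)
TYPE_END_RE = re.compile(r"^end\s*type\b", re.IGNORECASE)
ACCESS_RE = re.compile(r"^(public|private)\b\s*(?:::)?\s*(.*)$", re.IGNORECASE)


def scan_module_visibility(source):
    """Map each module name to its default accessibility and listed names."""
    modules = {}
    current = None   # record of the module whose specification part we are in
    in_type = False  # True while inside a derived-type definition
    for raw in source.splitlines():
        line = raw.split("!", 1)[0].strip()  # naive comment stripping
        if not line:
            continue
        m = MODULE_RE.match(line)
        if m:
            current = {"default": "public", "public": [], "private": []}
            modules[m.group(1).lower()] = current
            in_type = False
            continue
        if current is None:
            continue
        if in_type:
            # PRIVATE inside a type refers to its components, so skip the body.
            if TYPE_END_RE.match(line):
                in_type = False
            continue
        if END_MODULE_RE.match(line) or line.lower() == "contains":
            current = None  # access statements only occur before CONTAINS
            continue
        if TYPE_START_RE.match(line):
            in_type = True
            continue
        m = ACCESS_RE.match(line)
        if m:
            keyword, names = m.group(1).lower(), m.group(2)
            if not names:
                current["default"] = keyword  # bare PUBLIC or PRIVATE statement
            else:
                current[keyword].extend(
                    n.strip().lower() for n in names.split(",") if n.strip()
                )
    return modules
```

A real tool would need a proper Fortran parser, but even this much shows why working from source is compiler-independent: everything it relies on is defined by the language, not by a vendor's module file format.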
The claim that gfortran optimises away the PUBLIC/PRIVATE attributes can’t (I believe) actually be true. It would prevent submodules from working correctly. When compiling a module, it cannot be known that a submodule will not be used. And because a submodule has access to private module entities, they must always be included in a *.mod file. I’ll admit that a submodule wouldn’t be of any use without any interface blocks, but the presence of an interface block doesn’t mean that there will necessarily be a submodule either.