Multiple Fortran support in Debian/Ubuntu

Hi, I’m Alastair McKinstry, a Debian/Ubuntu developer and I’m adding multiple Fortran-compiler support to Debian. I’d appreciate your feedback.

The current choice of open-source compilers appears to be gfortran, flang and lfortran. In practice flang ships two compilers, flang-new and flang-to-external-fc; the latter uses an external Fortran compiler (typically gfortran) to do the compilation after optimisation.

The challenge is that different compilers create incompatible libraries and also incompatible pre-compiled mod files.

The solution I’m proposing (partly implemented since 2017) is to install module files in (LIBDIR)/fortran/(fortran-moddir), where (fortran-moddir) is gfortran or flang (so far).

Libraries then get suffixes: e.g. the package spherepack produces a runtime library libsphere.so.0, which is installed as (LIBDIR)/libsphere-gfortran.so.0, while the devel symlink to it is (LIBDIR)/fortran/gfortran/libsphere.so

I had started this work in 2019-2020 with flang-7. This was a standalone flang, since obsoleted. Flang has since been merged into llvm proper and flang-new-17 looks stable enough to work with.

So I would like opinions on this, in particular with future plans for flang and lfortran.

Thanks


Greetings from a fellow packager in Slackware and Slackware-Current compatible distributions. I actually left Debian shortly before systemd was adopted (but that’s another story.)

In my opinion, placing mod files in LIBDIR makes no sense. What I do with my Fortran libraries is put mod files in /usr/include (or /usr/local/include in FreeBSD.) This is where they belong, since mod files are basically equivalent to header files - you just don’t have to write them yourself, the Fortran compiler does that for you. The fact they are binary files doesn’t mean they should be in LIBDIR. Would you put header files of any other language in LIBDIR just because they happen to be in binary form?
Other developers of Fortran libraries do the same, and we never even discussed it (see, e.g., gtk-fortran.) /usr/include is just the obvious home for mod files.

As for the incompatibility between different compilers, the solution is rather simple: a different directory in /usr/include for each compiler, like /usr/include/gfortran, /usr/include/lfortran. Even if only one compiler was available, I would do the same so that the include directory won’t be cluttered. Whenever LFortran is ready for stable packaging, I’ll certainly do that in the distributions I am working on. Now, Flang is kind of a mess because two versions exist - but again, the right place for their mod files, if any, is in a /usr/include subdirectory.

Now, lib files (.so, .a and links to them) obviously belong to /usr/lib (or /usr/lib64, I’m not quite sure Debian-based distros distinguish between those two nowadays.) And naming is not a problem, libgfortran.*, liblfortran.*, etc.


When you compile a file with one or more modules in it, you get a .mod file for each module along with the .o file. The .mod files must be available at compile time when other fortran files are compiled, and the .o file must be available at link time. The .o file can be moved into a normal library file (.a or .so, etc.) but it still must be available at link time. If you have multiple fortran compilers (or multiple versions of a compiler), then both the .mod files and the .o files for each compiler must be separated and made accessible.

I would assume that renaming library names breaks many build systems, so using different directories might be the better option.
Aren’t there any “lessons learned” from C++? The situation of having at least two mature compilers has existed there for longer.

Debian/Ubuntu supports multiarch, so instead of /usr/lib or /usr/lib64, LIBDIR is eg. /usr/lib/aarch64-linux-gnu/ on my laptop. There can be multiple architecture directories on the same system to enable cross-compilation, emulators etc.

Ditto for INCDIR, but when I started INCDIR was not multiarch, I think. That is a minor change to make: the location could be INCDIR, but it needs to be multiarch, as module files can be binary and differ between architectures.

The challenge with library names is not libgfortran.* vs liblfortran.* (the runtime libraries) but libmpi_usempif08.* etc - libraries of Fortran code, compiled by different Fortran compilers and incompatible with each other. It’s about how we build a full stack of packages with different compilers and have them installed as necessary.

What happens here is that libmpi_usempif08 compiled for gfortran is built and installed as $(LIBDIR)/libmpi_usempif08-gfortran.so.40 with an SONAME of libmpi_usempif08-gfortran.so.40. There is a symlink to this in $(LIBDIR)/fortran/gfortran/lib called libmpi_usempif08.so, so I can find just the gfortran versions of libraries by including this directory on my library path when compiling - the binaries produced will then look for a shared lib named libmpi_usempif08-gfortran.so.40 at runtime and find it in $(LIBDIR).
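The naming scheme described above can be sketched in a few lines of Python. This is only an illustration of the path arithmetic, not the actual Debian packaging code; the function name fortran_lib_paths and its parameters are made up for the example, and the multiarch LIBDIR is the aarch64 one mentioned earlier in the thread.

```python
from pathlib import PurePosixPath


def fortran_lib_paths(libdir, stem, soversion, compiler):
    """Return (runtime path, SONAME, devel symlink path) for one library.

    Hypothetical helper: it only mirrors the layout described in the post.
    """
    libdir = PurePosixPath(libdir)
    # The runtime library carries the compiler name as a suffix.
    soname = f"{stem}-{compiler}.so.{soversion}"
    runtime = libdir / soname
    # The devel symlink keeps the upstream name, in a per-compiler directory.
    devel = libdir / "fortran" / compiler / "lib" / f"{stem}.so"
    return runtime, soname, devel


runtime, soname, devel = fortran_lib_paths(
    "/usr/lib/aarch64-linux-gnu", "libmpi_usempif08", 40, "gfortran"
)
print(runtime)  # /usr/lib/aarch64-linux-gnu/libmpi_usempif08-gfortran.so.40
print(soname)   # libmpi_usempif08-gfortran.so.40
print(devel)    # /usr/lib/aarch64-linux-gnu/fortran/gfortran/lib/libmpi_usempif08.so
```

Build systems keep linking against the unsuffixed name found through the per-compiler directory, while the suffixed SONAME is what ends up recorded in the binary.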

libgfortran.* files are usually placed in LIBDIR “raw” (not in a subdirectory.) I totally agree this is a bad habit, presumably adopted to save developers one line in their makefiles, LDFLAGS=/usr/lib/gfortran - or a change in PATH, or some kind of config (like SDL does.)
This practice is very widely adopted, and that is the reason LIBDIR is cluttered. A quick look in my “daily driver” shows LIBDIR currently contains 4565 items, only 196 of those being subdirectories. The rest is dynamic or static libraries, links to them, and a few shell scripts - all thrown right into LIBDIR. This is a mess, but it’s basically impossible to fix it, because it is a widely accepted mess.
Strangely enough, this is not the case with header files, though. /usr/include is put in order (for the most part,) with a separate subdirectory for each library (but with notable exceptions even there.)

Usually packagers can’t move lib files to a subdirectory, even if that would objectively put things in order. In my case, I’m sure such a change would be rejected, even if I recompile all other libraries affected by the change. It will be considered “userspace breakage”, therefore an abomination that should be blatantly denounced. :laughing:
I seriously doubt the situation is different in any other distribution, especially distros following FHS. At least it wasn’t different in any distro I tried. And it’s not just a GNU/Linux mess. The exact same LIBDIR mess is found in FreeBSD, Haiku, you name it. Putting every library file in LIBDIR is a bad practice, sure, but an old one. Year after year it stops being a bad practice and becomes a “feature”.

So we can’t do much about gfortran (at least, I can’t) - but we can do something with new compilers added in distributions.

Ok, I got it now… the mighty multiarch. I do remember now that Debian supported multiarch even back when I was a Debian user - but I never used multiarch myself. In the distributions I am involved in now, the general principle is “it’s there, not enabled by default, not recommended”. So I’m afraid I can’t help much with it. For 32-bit Fortran libs (or any 32-bit libs for that matter,) I just rebuild packages natively, in the 32-bit version of the distribution. Even if multiarch was encouraged, I would do the same, to be honest.

I had a closer look at .mod files. On my systems, most of them are in INCDIR, with the notable exception of gfortran itself (of course.) Its .mod files are in LIBDIR/gcc/.../finclude/. Presumably they thought “they are binary files, so LIBDIR it is” (I disagree, and I am not the only one.) But even then, they distinguished them from normal lib files, so they placed them in a separate subdirectory called finclude. That alone should tell you we are talking about header files here, just in binary form.

Another exception is plplot. Its .mod files are in LIBDIR/fortran/modules/plplot/. A weird choice indeed. In fact this is just a plplot convention; there is nothing else in LIBDIR/fortran/ except their stuff.
The rest of the .mod files are either not related to Fortran at all, or just placed in INCDIR.

Essentially, gfortran .mod files are gzipped plain files. If you decompress them (you might need to change the suffix to .gz first,) you will get a plain text file. And if you have a closer look at it, you will see it’s basically a header file, just not in the format you are probably familiar with from C header files. The problem is that it’s not standardised. How exactly a .mod file is formatted and saved depends on the compiler. For instance, LFortran creates .mod files totally different from gfortran’s. But the principle is the same: the equivalent of header files.

It would be nice if packagers finally agree to place Fortran .mod files in their proper place, or at least one place, whatever that would be. Their obvious home should clearly be INCDIR, if you ask me.


Placing everything in LIBDIR happens because of RPATH issues. If I place something in /usr/lib/mydir/ then the executable or library that uses it needs either RPATH hard-coded in the binary or $LD_LIBRARY_PATH set at runtime to contain possibly hundreds of entries. As it currently stands, distro packages (debs or rpms) can be installed somewhere other than /usr (e.g. /usr/local) and just work with $PATH and $LD_LIBRARY_PATH set, which hard-coding RPATH would break.

My concern is how to get all the software to work together in a distribution. We will see ever deeper library stacks as software matures - at last check I had some cases of shared libraries stacked 7 deep. People work against the top-level library and leave the rest to “the distribution” - containerised workflows typically do this. The dirty secret of containerisation is that it leaves the heavy lifting to the distro without that being obvious.

A good example of how to work is hdf5 in Debian. There are three implementations of hdf5: serial, openmpi, mpich which can be co-installed. So on my system:

ls /usr/lib/aarch64-linux-gnu/hdf5/openmpi/
include    libhdf5_cpp.a      libhdf5_fortran.so  libhdf5_hl_cpp.so     libhdf5_hl.so
lib        libhdf5_cpp.so     libhdf5_hl.a        libhdf5hl_fortran.a   libhdf5.settings
libhdf5.a  libhdf5_fortran.a  libhdf5_hl_cpp.a    libhdf5hl_fortran.so  libhdf5.so

These are symlinks, so:

/usr/lib/aarch64-linux-gnu/hdf5/openmpi/libhdf5_cpp.so -> ../../libhdf5_openmpi_cpp.so

and I can build against hdf5 openmpi with --with-hdf5=$LIBDIR/hdf5/openmpi in most applications.

The aim is to implement a similar structure for Fortran libraries.

It’s important to remember that multiple versions of a compiler may be co-installed, e.g. flang-new-16 and flang-new-18, and they may have different mod formats too.

This was easier with just gfortran and classic flang. Currently the issues are what to name the directories and how to handle lfortran.

flang-new-* seems to have a stable mod structure, with v1 in the first-line header. I can use that to put modules in the right directories (e.g. $LIBDIR/fortran/flang-mod-1).

flang-to-external-fc-* produces gfortran-mod files (compressed; the version number can be extracted from the header).
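Sorting module files into directories by their header could look roughly like the Python sketch below. It rests on two assumptions about the first line of each format, which should be checked against the real compilers: that a decompressed gfortran module begins with GFORTRAN module version '<N>', and that a flang-new module begins with !mod$ v<N>. The function name fmoddir_for is hypothetical.

```python
import gzip
import re

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def fmoddir_for(data: bytes) -> str:
    """Guess the module directory name from the raw bytes of a .mod file."""
    if data.startswith(GZIP_MAGIC):
        # Assumed gfortran layout: gzipped text with the version on line 1.
        first = gzip.decompress(data).split(b"\n", 1)[0].decode("ascii", "replace")
        m = re.match(r"GFORTRAN module version '(\d+)'", first)
        if m:
            return f"gfortran-mod-{m.group(1)}"
        raise ValueError("gzipped, but not a recognised gfortran module")
    # Assumed flang-new layout: plain text, header such as '!mod$ v1 sum:...'.
    first = data.split(b"\n", 1)[0].decode("ascii", "replace")
    m = re.match(r"!mod\$ v(\d+)\b", first)
    if m:
        return f"flang-mod-{m.group(1)}"
    raise ValueError("unrecognised module file format")


# Synthetic samples standing in for real module files.
gfortran_like = gzip.compress(b"GFORTRAN module version '15' created from m.f90\n")
flang_like = b"!mod$ v1 sum:0123456789abcdef\nmodule m\nend\n"
print(fmoddir_for(gfortran_like))
print(fmoddir_for(flang_like))
```

Since flang-to-external-fc emits gfortran-format files, a header-based check like this would file its output under the gfortran directory, which is exactly the interoperability question raised in the issues that follow.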

Issues:
(1) Are module files produced directly from gfortran and flang-to-external-fc really interoperable? If I have stuff with both gfortran and flang-to-external-fc on my system, should I put the latter in a different directory, eg $LIBDIR/fortran/flangext ?
(2) Ditto libraries?
(3) Are module files guaranteed to be binary-compatible across all archs?
(4) What to do about lfortran? At the moment the format seems tied to the exact compiler version, which is very brittle and requires everything to be rebuilt for every compiler version.

You can find a few posts on this topic in an older thread:


A useful thread. Also some summaries of previous work it references.

https://fortranwiki.org/fortran/show/Library+distribution

Some points:

  • The directory containing module files should be labelled “fmoddir”. This is effectively a convention already.
  • Where relevant, pkg-config files should include fmoddir as a variable. While pkg-config does not have a clean syntax for it, you can do, e.g.:
pkg-config --variable=fmoddir eckit

to retrieve it.

  • The current Fortran wiki recommends something like:
    /usr/lib/fortran/gfortran-4.1 . My proposal presumes LIBDIR is multiarch on Debian, and also that the fmoddir is named after the module format version, i.e. gfortran-mod-15. So the path is $LIBDIR/fortran/gfortran-mod-15.

  • There is also a symlink such as gfortran-13 -> gfortran-mod-15 for each supported compiler, as gfortran-10 and later have the same module file format. This allows a new compiler version to support the full stack with just the addition of a symlink in the common case of no format change.
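The symlink arrangement in the last point can be tried out in a scratch directory. The Python sketch below is illustrative only: add_compiler_alias is a made-up helper, the tree lives in a temporary directory rather than a real $LIBDIR, and demo.mod is a placeholder file.

```python
import os
import tempfile
from pathlib import Path


def add_compiler_alias(fortran_root: Path, compiler: str, moddir: str) -> Path:
    """Create fortran_root/<compiler> as a relative symlink to <moddir>."""
    (fortran_root / moddir).mkdir(parents=True, exist_ok=True)
    link = fortran_root / compiler
    if link.is_symlink():
        link.unlink()  # replace a stale alias, e.g. after a format change
    link.symlink_to(moddir)  # relative target keeps the tree relocatable
    return link


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "fortran"
    (root / "gfortran-mod-15").mkdir(parents=True)
    (root / "gfortran-mod-15" / "demo.mod").write_text("placeholder\n")
    # Supporting another compiler version is just one more alias.
    for compiler in ("gfortran-12", "gfortran-13"):
        add_compiler_alias(root, compiler, "gfortran-mod-15")
    print(os.readlink(root / "gfortran-13"))
    print((root / "gfortran-12" / "demo.mod").exists())
```

Only a change of module format needs a new real directory and a rebuild of the stack; otherwise the existing modules are reused through the alias.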

There is discussion of not shipping mod files at all, just source: this isn’t realistic for a distribution as the stack grows. The lfortran case, where the mod file is valid Fortran that includes the signatures of all elements, is pretty close to this though.

The aim in the design is that when a component in the distribution has multiple alternatives to support, it should be possible to do:
--with-feature=$LIBDIR/XXX/$VARIANT for it all to work.
So there is a symlink tree as per the hdf5 example upthread.

Similarly in the (Debian) packaging, it should be possible to build for a different Fortran just by setting FC=xxx as much as possible; ie someone who has a commercial compiler can get an (Intel / Arm, etc) netcdf library etc by doing

$ FC=ifx dpkg-buildpackage

in the unpacked netcdf library source (from Debian) and be able to install the generated debs.

One complication to be aware of is that .mod files are not just per-compiler, they are per-compiler-configuration.

Examples of things that can produce incompatibilities in .mod files:

  • what option is in effect determining the KIND numbering system
  • what option is in effect determining the precision/range of the default integer/real variables
  • what option is in effect determining the level of runtime checking requested

Every compiler makes its own choices about .mod files.


Yes, this is why $fmoddir has a version configuration (eg gfortran-mod-15, based on the internal version number).

They are mostly static though; the format has not changed for gfortran for 4 versions (I think). More problematic is lfortran, which changes its internal version number with each release. Its module file appears, however, to be a minimised Fortran file with just the necessary signatures, and hopefully can be treated as unchanging.

pkg-config also needs to be supported.

This needs typically:

  • @FMODDIR@ to be set in configure/cmake;
  • The variable fmoddir=@FMODDIR@ then set in the first stanza of the pc file
  • The pc file to be in a per-Fortran directory: $LIBDIR/fortran/$variant/pkgconfig
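As a rough illustration of the three points above, here is a Python sketch that renders a .pc file with fmoddir in the first stanza. The render_pc helper, the version string and the Libs value are invented for the example; in a real package the file would come from a configure or CMake template with @FMODDIR@ substituted.

```python
def render_pc(name, version, libdir, fmoddir_name, libs):
    """Render a minimal pkg-config file that exposes an fmoddir variable."""
    lines = [
        # First stanza: variable definitions.
        "libdir=" + libdir,
        "fmoddir=${libdir}/fortran/" + fmoddir_name,
        "",
        # Second stanza: package metadata and flags.
        "Name: " + name,
        "Description: " + name + " (Fortran modules in ${fmoddir})",
        "Version: " + version,
        "Cflags: -I${fmoddir}",
        "Libs: " + libs,
    ]
    return "\n".join(lines) + "\n"


pc_text = render_pc(
    name="eckit",
    version="0.0.0",                    # placeholder version
    libdir="/usr/lib/aarch64-linux-gnu",
    fmoddir_name="gfortran-mod-15",
    libs="-L${libdir} -leckit",         # placeholder link line
)
print(pc_text)
```

With such a file installed under $LIBDIR/fortran/$variant/pkgconfig and that directory added to PKG_CONFIG_PATH, a build can query the module directory without hard-coding it.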

Do you mean compile-time configuration, not just compiler version? E.g. can mod files be incompatible between two configurations of gfortran-13 on the same architecture?

I can’t answer definitively for gfortran, but it looks like the contents of a .mod file generated by gfortran (12.1.0) vary when, say, the option -fdefault-integer-8 is used.

Ok, that’s reasonable.
If the module file for a package changes, the library will be different too.