Module dependencies in cmake not tracked after a module is updated

Hi all, I am migrating a Fortran code from a Makefile to CMake. I have run into some issues with tracking module dependencies. More precisely, CMake works well when I compile from scratch, since it compiles the sources in the right order, starting with the modules and then the routines that use them.

However, when I modify a module, running make only compiles that module, and NOT the routines that use it. Since the project consists of a large number of sources, I would like to avoid recompiling everything from scratch.

In the Makefile-based build I use the makedepf90 tool, but I have learned that it is no longer maintained.

So, is there a way to keep module dependencies up to date in a CMake project? Attached is the CMakeLists.txt I have used.
CMakeLists.txt (677 Bytes)
Thank you.

Recompilation cascades are a problem with CMake and/or certain compilers and/or certain build tools. There’s an old discussion on the topic here: Compilation time vs. C++ - #7 by certik

You could use Fortran submodules to avoid recompilation cascades, but that would require quite a bit of refactoring.

Do you know if fpm also suffers from that?

A very quick test indicates that yes, fpm has the same problem:

src/foo.f90:

module foo_mod
    implicit none

    private
    public foo

contains

    subroutine foo()

        write(*,*) 'foo!'
    end subroutine foo
end module

src/bar.f90:

module bar_mod
    use foo_mod, only: foo
    implicit none

    private
    public bar

contains

    subroutine bar()
        call foo()
    end subroutine
end module

app/main.f90:

program main
  use bar_mod, only: bar
  implicit none

  call bar
end program main

fpm run builds all sources the first time (obviously). If I change write(*,*) 'foo!' to write(*,*) 'foo' and re-run fpm run, then all sources are recompiled. In theory it would only be necessary to recompile foo.f90 and then re-link the library and executable.

I believe OP is actually complaining about the opposite of recompilation cascades. It seems the change they make does not trigger a recompilation cascade when they expect it to. It’s possible, maybe even likely, the recompilation cascade is not necessary because no “public” aspect of the module changed (i.e. procedure interfaces, public variables, etc.). In that case all that is necessary is to recompile the single source file and re-link. It is possible that CMake somehow detects this, but I’m not a CMake expert so I could be wrong. I believe fpm goes solely by timestamps, so can’t tell that a *.mod file didn’t actually change. The original (Haskell) version of fpm tried to detect this by keeping a hash of the *.mod files, but even then it would be thwarted by some compilers including time-stamp info in them.

TLDR: OP’s case may not have actually needed to recompile everything, and fpm may be recompiling more than is necessary.
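The difference between a timestamp check and a content-hash check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the file name `foo_mod.mod`, its contents, and the helper names `digest` and `interface_changed` are hypothetical, and this is not how fpm or CMake is actually implemented. It also shows why a compiler that embeds a time stamp in the file would defeat the check: any byte that differs changes the hash.

```python
import hashlib
import tempfile
from pathlib import Path


def digest(path):
    """Return a SHA-256 hash of the file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def interface_changed(mod_file, cache):
    """Compare the file's hash with the one recorded in `cache`, then update the cache.

    Returns True when the file is new or its bytes differ from the last recorded state.
    """
    key = str(mod_file)
    new = digest(mod_file)
    old = cache.get(key)
    cache[key] = new
    return new != old


with tempfile.TemporaryDirectory() as build_dir:
    mod = Path(build_dir) / "foo_mod.mod"   # hypothetical module file
    cache = {}

    mod.write_text("public: foo()\n")       # stand-in for real compiler output
    first = interface_changed(mod, cache)   # never seen before: dependents must build

    mod.write_text("public: foo()\n")       # rewritten with identical bytes, newer mtime
    second = interface_changed(mod, cache)  # hash unchanged: dependents can be reused

    mod.write_text("public: foo(), baz()\n")
    third = interface_changed(mod, cache)   # interface changed: dependents must rebuild

    print(first, second, third)
```

A purely timestamp-based tool would treat the second write as a change as well, which is where the unnecessary recompilation comes from.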

Thanks to all. So, the two solutions at the moment are:

  1. a dependency generator, e.g. makedepf90, together with a Makefile or another build tool. Given that makedepf90 is no longer maintained, are there other similar Fortran dependency tools?

  2. using submodules, but this requires a huge redesign of the code.

The CMake module dependency generator is actually derived from makedepf90. You can see for yourself here: makedepf90 · Search · GitLab
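To make concrete what a module dependency generator does in principle, here is a rough Python sketch. It is not the algorithm used by makedepf90 or CMake; the regular expressions are simplified assumptions that ignore continuation lines, preprocessor directives and submodules, and the function names `scan`, `compile_order` and `dependents` are made up for this example. The sources are abbreviated versions of the foo/bar/main files from earlier in the thread.

```python
import re
from graphlib import TopologicalSorter  # Python 3.9+

# "module foo_mod", but not "module procedure/subroutine/function ..."
MODULE_RE = re.compile(
    r"^\s*module\s+(?!procedure\b|subroutine\b|function\b)(\w+)",
    re.IGNORECASE | re.MULTILINE,
)
# "use foo_mod", "use :: foo_mod", "use, intrinsic :: iso_fortran_env"
USE_RE = re.compile(
    r"^\s*use\b(?:\s*,\s*\w+\s*)?(?:\s*::)?\s*(\w+)",
    re.IGNORECASE | re.MULTILINE,
)


def scan(sources):
    """Map each file to the set of files whose modules it uses."""
    provides = {}  # module name -> file that defines it
    for name, text in sources.items():
        for mod in MODULE_RE.findall(text):
            provides[mod.lower()] = name
    deps = {}
    for name, text in sources.items():
        used = {m.lower() for m in USE_RE.findall(text)}
        # Modules with no providing file (e.g. intrinsic ones) are skipped.
        deps[name] = {provides[m] for m in used if m in provides} - {name}
    return deps


def compile_order(deps):
    """Return the files ordered so that every module is compiled before its users."""
    return list(TopologicalSorter(deps).static_order())


def dependents(deps, changed):
    """Return all files that use `changed` directly or indirectly."""
    found, stack = set(), [changed]
    while stack:
        current = stack.pop()
        for name, needed in deps.items():
            if current in needed and name not in found:
                found.add(name)
                stack.append(name)
    return found


sources = {
    "app/main.f90": "program main\n  use bar_mod, only: bar\n  call bar\nend program main\n",
    "src/bar.f90": "module bar_mod\n  use foo_mod, only: foo\nend module\n",
    "src/foo.f90": "module foo_mod\n  implicit none\nend module\n",
}
deps = scan(sources)
print(compile_order(deps))
print(sorted(dependents(deps, "src/foo.f90")))
```

The first result is the order needed for a build from scratch; the second is the set of files a build tool would consider for recompilation after `src/foo.f90` changes.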

I’d also point out the following on GLOB_RECURSE in the file command:

Note: We do not recommend using GLOB to collect a list of source files from your source tree. If no CMakeLists.txt file changes when a source is added or removed then the generated build system cannot know when to ask CMake to regenerate.

As @everythingfunctional has said, it is possible that recompiling was not necessary for your change. You can inspect what happens in the build phase by using make VERBOSE=1.

Good point. I routinely do this in my CMake build systems to avoid having to list sources:

file(GLOB_RECURSE sources CONFIGURE_DEPENDS LIST_DIRECTORIES False
        "${CMAKE_CURRENT_SOURCE_DIR}/src/*")

I don’t think it plays well with the IDE generators in CMake like for Visual Studio, but as long as one sticks to Makefiles, Ninja, etc. I’ve found it works very well. The downside is a slight performance overhead, especially on Windows.

I made a small test with a simple chain:

program main → module b → module a

where the arrow → stands for “uses”. If I add a private module variable to a, it causes recompilation of a, but the objects of b and main are reused.

ivan@thinkpad:~/fortran/cmake_example01/build$ make clean
ivan@thinkpad:~/fortran/cmake_example01/build$ make      # fresh build
[ 20%] Building Fortran object CMakeFiles/ab.dir/a.f90.o
[ 40%] Building Fortran object CMakeFiles/ab.dir/b.f90.o
[ 60%] Linking Fortran static library libab.a
[ 60%] Built target ab
[ 80%] Building Fortran object CMakeFiles/main.dir/main.f90.o
[100%] Linking Fortran executable main
[100%] Built target main
ivan@thinkpad:~/fortran/cmake_example01/build$ make   # after adding a private variable in a
Scanning dependencies of target ab
[ 20%] Building Fortran object CMakeFiles/ab.dir/a.f90.o
[ 40%] Linking Fortran static library libab.a
[ 60%] Built target ab
[ 80%] Linking Fortran executable main
[100%] Built target main

Well done. That did the trick: make now recompiles all the sources that use a modified module.

In an existing medium or large project I find that adding new files doesn’t happen very often so it doesn’t incur much effort.

The following video may also be relevant, starting at the slide “Don’t use file(GLOB) in project”:

I’m confused by what you mean with recompiling. The line shown by @plevold adds the CONFIGURE_DEPENDS flag, which requires CMake 3.12. The description of this parameter is:

New in version 3.12: If the CONFIGURE_DEPENDS flag is specified, CMake will add logic to the main build system check target to rerun the flagged GLOB commands at build time. If any of the outputs change, CMake will regenerate the build system.

From my understanding this has nothing to do with tracking module dependencies, but simply with the fact that your Makefiles are regenerated whenever the set of matched files changes.

As already mentioned by @plevold, it also incurs a build overhead:

Note: We do not recommend using GLOB to collect a list of source files from your source tree. If no CMakeLists.txt file changes when a source is added or removed then the generated build system cannot know when to ask CMake to regenerate. The CONFIGURE_DEPENDS flag may not work reliably on all generators, or if a new generator is added in the future that cannot support it, projects using it will be stuck. Even if CONFIGURE_DEPENDS works reliably, there is still a cost to perform the check on every rebuild.

So the solution you accepted goes against the teachings of the CMake documentation.

You are right. I was looking for a solution and did not care that it went against the rules of thumb. I have checked the compilation more carefully and verified that the dependency resolution works even when using only the directive

file(GLOB_RECURSE sources src/*.f90)

When f90 was introduced, with modules, it became an essential part of writing a large program to specify the interdependence of files within the project. Modules broke the convention that the object files could be created in any order; they require a specific order of compilation for interdependent files. Unfortunately, the mechanism to specify these interdependencies was not included in the Fortran standard itself.

So external mechanisms, such as makefiles, are required.

But the Fortran standard did not standardize how the information from modules was formatted or used. So tools such as make were limited to using only file modification dates. This in turn results in compilation cascades and in unnecessary compilations during the build step.
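The modification-date rule itself is simple, which is what makes the cascade easy to see. Below is a minimal Python sketch of it, assuming a target is considered stale when it is missing or older than any prerequisite. The file names and the helper names `out_of_date` and `touch` are hypothetical, and explicit timestamps are set with `os.utime` so the outcome does not depend on the clock.

```python
import os
import tempfile


def out_of_date(target, prerequisites):
    """Timestamp rule: rebuild if the target is missing or older than any prerequisite."""
    if not os.path.exists(target):
        return True
    target_time = os.path.getmtime(target)
    return any(os.path.getmtime(p) > target_time for p in prerequisites)


def touch(path, mtime):
    """Create the file if needed and set an explicit modification time."""
    with open(path, "a"):
        pass
    os.utime(path, (mtime, mtime))


with tempfile.TemporaryDirectory() as build_dir:
    src = os.path.join(build_dir, "bar.f90")      # source that uses foo_mod
    mod = os.path.join(build_dir, "foo_mod.mod")  # module file written when foo is compiled
    obj = os.path.join(build_dir, "bar.o")

    touch(src, 1000)
    touch(mod, 1000)
    touch(obj, 2000)
    fresh = out_of_date(obj, [src, mod])  # object is newer than both inputs

    # foo.f90 is recompiled and the .mod file is rewritten, even if its
    # contents are identical to before.
    touch(mod, 3000)
    stale = out_of_date(obj, [src, mod])  # the rule now demands a rebuild of bar.o

    print(fresh, stale)
```

Because the rule looks only at dates, it cannot tell a rewritten but identical module file from one whose interface really changed.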

One solution to this problem is to standardize the information and the format in which that information is stored within the *.mod files. This way, the essential information can be used to determine when compilations are actually required, and the nonessential information such as time stamps and programmer name can be ignored.

BTW, I have always thought that make was a terribly defined tool for its purpose. But for whatever reasons, that is the standard way these things were done. It is part of POSIX. So don’t mistake my advocacy for make as approval of its oddball conventions (e.g. tab characters, which you cannot distinguish from spaces onscreen).

The CMake developers themselves don’t recommend globbing for sources, but it seems to be used quite a bit in the wild. CMake is a powerful tool, but I prefer the fpm and Cargo philosophy of a build system that stays out of the way. Globbing is a pragmatic way to get closer to that when I’m stuck with CMake.

C++ is in a similar situation now that C++20 introduced modules. I found the following talk very relevant also to the state of Fortran:

From the talk by Daniel Pfeifer I linked above, the GLOB flag is useful primarily in script mode (i.e. files ending in .cmake) for tasks that don’t involve build targets. You can interpret such files with cmake -P <script-file>. For example you could write a script to symlink a list of files (LIST_DIRECTORIES False) to a single directory or to copy a list of Fortran namelist files to the target install directory.

I found a few threads on Stack Exchange which list some of the pros and cons of using GLOB for collecting source file lists:

One of the exceptions for using GLOB listed is

For setting up a CMakeLists.txt files for existing projects that don’t use CMake.
Its a fast way to get all the source referenced (once the build system’s running - replace globbing with explicit file-lists).

which is exactly what @artu72 is doing.

Another comment indicates GLOB as a potential vulnerability for a supply chain attack. A harmful actor could secretly add a file to the source directory which would later get installed as a shared library. A supply chain attack targeting a build system was used in the 2020 United States federal government data breach.

To sum up, an explicit file listing, or one generated by an external script, is recommended over globbing in CMake. Is that right? From that perspective, an old-style Makefile seems more useful to me.

@artu72 Sorry, I somehow missed your latest question. To summarize the discussion @ivanpribec and I had: the CMake developers recommend against using globbing for file listings; however, it is frequently used for this purpose in the wild because it makes the development process much smoother.

CMake is a very flexible and powerful tool, but very few decisions are made for you. You have to consider the pros and cons and make your own decision. This is in stark contrast to other build systems like fpm (Fortran) and Cargo (Rust) where standardization is prioritized instead of supporting every conceivable project layout. This talk - or perhaps rant - goes into some detail on the topic (some C++ knowledge might be beneficial to get anything out of it): CppCon 2017: Isabella Muerte “There Will Be Build Systems: I Configure Your Milkshake” - YouTube
