Compilation time vs. C++

I think the amount of template code used in a C++ project will drive compilation times up. Fortran doesn’t have templating and that makes the compilation stage much simpler.

Where you will often see a huge disadvantage for Fortran, though, is when recompiling after making minor changes in a large application:

In C/C++ one would typically separate the interface (header file) from the implementation (source file), which means that as long as you only change the implementation, you only need to recompile that particular file before linking.

If you write a Fortran module, the interface information is bundled together with the implementation, so a change in one file will typically trigger recompilation of that module and of all files which depend on it (and the files depending on those files, and so on…). For large projects, this adds up very quickly! The solution is to use Fortran submodules: keep the interfaces to the different components of your application in the modules and move the implementations into submodules.
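
To make the interface/implementation split concrete, here is a minimal sketch. All names (geometry_mod, geometry_impl, circle_area) are made up for illustration, and the three program units would normally live in separate files, as the comments indicate.

```fortran
! geometry.f90 (hypothetical file): interface only.
! This is the part that dependents see through the .mod file.
module geometry_mod
    implicit none
    private
    public :: circle_area

    interface
        ! Only the signature is declared here
        module function circle_area(radius) result(area)
            real, intent(in) :: radius
            real :: area
        end function
    end interface
end module

! geometry_impl.f90 (hypothetical file): implementation only.
submodule(geometry_mod) geometry_impl
    implicit none
contains
    module function circle_area(radius) result(area)
        real, intent(in) :: radius
        real :: area
        real, parameter :: pi = 3.14159265

        area = pi * radius**2
    end function
end submodule

! main.f90 (hypothetical file): depends on the module, not on the submodule.
program demo
    use geometry_mod, only: circle_area
    implicit none

    write(*,*) 'Area for radius 2: ', circle_area(2.0)
end program
```

With this layout, editing the body of circle_area only touches the submodule file. Whether the files that use geometry_mod are then really left alone still depends on how your compiler and build system track the module files.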

Is that just an implementation detail? Would it not be possible for a compiler to compare the interface built from a changed module with the previous module file and leave the old one in place if the interface had not changed? That would allow one to write dependencies based on the mod files instead of the source files. It would be a “trick”, but would it prevent the compilation cascade?

I have made this exact point multiple times. Unnecessary compilation cascades are a failure of the build systems and compilers, not the language itself.

In fact gfortran + cmake already fix this. You can try it yourself, just add a print statement to some function in a module, and dependent modules will not be recompiled. So this is fixed in practice.

Yes, the compilation speed of C++ depends on how many templates you use, as well as how many header files you include with templates and compile time computation.

Indeed, gfortran with cmake mostly avoids recompilation if interfaces are left unchanged. Sometimes it still recompiles, probably because some internal data within the interfaces can still change (such as the registers used for argument passing).

However, there are some limits to modules and compilation times. Top-level modules, even if they do not directly use lots of modules, still indirectly import a huge amount of data (at least in gfortran). For a larger project this easily approaches or surpasses 1 MB for one .mod file, and this is well-compressed data. Decompression alone eats 30% or more of the time for such top-level modules in gfortran, if I remember correctly. Organising the data in search trees and the like takes another big chunk of time. Compilation itself was peanuts compared to symbol organisation. (I did some profiling of gfortran, as we saw compilation times of more than 30 s for top-level modules, even with little and harmless code.)

Ifort is somewhat better in that respect, but it cannot avoid recompilation cascades and is almost useless for the development stage of larger projects. Unfortunately, nobody at Intel seems to have cared (for more than 10 years or so) to deal with the time-stamp problem in the mod files, which leads to the recompilation cascades.

With ninja logs it was easy to find bottlenecks in parallel compilation. Using submodules in those cases did help to some extent. However, there is a caveat. Take, for example, a class in a module which uses some other modules internally to do the computational work. These internal modules should only appear in the use section of the submodule header, not of the module header itself, thus breaking the dependency chain and allowing for fast compilation. However, any private method must be included in the class definition in the module interface. Any derived types from these internal modules that appear as arguments in these private methods are now exposed, and the compilation dependency chain is no longer broken as was originally intended. In most OOP situations, this means that most use statements must already appear in the module header… My takeaway was that OOP+submodules is useless (from the viewpoint of compilation cascades, but also for hiding some of the internal stuff).
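
A minimal sketch of that caveat, with made-up names (solver_mod, solver_state_t, model_mod, model_t), assuming one internal helper module and one class with a single private method:

```fortran
! Hypothetical internal helper module doing the computational work
module solver_mod
    implicit none

    type :: solver_state_t
        integer :: iterations = 0
    end type
end module

module model_mod
    ! This use statement is forced into the module header (and not just the
    ! submodule), because the private binding "advance" below has a
    ! solver_state_t argument in its interface.
    use solver_mod, only: solver_state_t
    implicit none
    private
    public :: model_t

    type :: model_t
        integer :: steps = 0
    contains
        procedure :: run
        procedure, private :: advance
    end type

    interface
        module subroutine run(this)
            class(model_t), intent(inout) :: this
        end subroutine

        module subroutine advance(this, state)
            class(model_t), intent(inout) :: this
            type(solver_state_t), intent(inout) :: state
        end subroutine
    end interface
end module

submodule(model_mod) model_impl
    implicit none
contains
    module subroutine run(this)
        class(model_t), intent(inout) :: this
        type(solver_state_t) :: state

        call this%advance(state)
        this%steps = state%iterations
    end subroutine

    module subroutine advance(this, state)
        class(model_t), intent(inout) :: this
        type(solver_state_t), intent(inout) :: state

        state%iterations = state%iterations + 1
    end subroutine
end submodule

program demo
    use model_mod, only: model_t
    implicit none
    type(model_t) :: model

    call model%run()
    write(*,*) 'steps = ', model%steps
end program
```

Here everything that uses model_mod also depends, indirectly, on solver_mod, even though solver_state_t is only needed by a private method.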

That’s a major win for gfortran! I really hope Intel will start doing this as well!

Could something like the code below work for you in this case? It’s a bit of ceremony, but this approach completely hides the implementation details from the module with the interface specification. The downside is that you have to rely on polymorphism, but if you’re already using that it might not be such a big deal.

mytype.f90:

module mytype_mod
    implicit none

    private
    public mytype_t
    public mytype_factory

    type, abstract :: mytype_t
        private
    contains
        procedure(public_sub), deferred :: public_sub
    end type

    interface
        subroutine public_sub(this)
            import mytype_t
            class(mytype_t), intent(inout) :: this
        end subroutine

        module function mytype_factory(i) result(this)
            integer, intent(in) :: i
            class(mytype_t), allocatable :: this
        end function
    end interface

end module

mytype_impl.f90:

submodule(mytype_mod) mytype_impl
    implicit none

    type, extends(mytype_t) :: mytype_impl_t
        integer :: i
    contains
        procedure :: public_sub => public_sub_impl
        procedure :: private_sub
    end type

contains

    module function mytype_factory(i) result(this)
        integer, intent(in) :: i
        class(mytype_t), allocatable :: this

        allocate(this, source=mytype_impl_t(i))
    end function


    subroutine public_sub_impl(this)
        class(mytype_impl_t), intent(inout) :: this

        write(*,*) 'This is public sub for mytype_impl_t with i  = ', this%i
        call this%private_sub()
    end subroutine


    subroutine private_sub(this)
        class(mytype_impl_t), intent(inout) :: this

        write(*,*) 'This is private sub for mytype_impl_t with i = ', this%i
    end subroutine

end submodule

main.f90:

program main
    use mytype_mod, only: mytype_t, mytype_factory
    implicit none

    class(mytype_t), allocatable :: mytype

    mytype = mytype_factory(42)
    call mytype%public_sub()
end program

When run, it gives me the following output:

 This is public sub for mytype_impl_t with i  =  42
 This is private sub for mytype_impl_t with i =  42

I think it works with Intel too, but not always, or not with all versions, I can’t remember right now.

It’s implemented in CMake, which parses the mod files and determines whether they have changed.

I just tested this with CMake 3.20.3 and Ifort 2021.4 and I wasn’t able to avoid recompilation cascades. I did the test on Windows though, so in theory it might be that this only works on Linux.

Worked beautifully with gfortran 10.3 though! The issue @martin pointed out about adding a private type-bound procedure also applies to gfortran, though. This is not really an interface change, but it still causes a recompilation cascade. To avoid this, I think one has to use a pattern like the one I previously suggested.

I’ve never known CMake + Ifort + Windows + Visual Studio to avoid any compilation cascades. I would also be interested to know more about what compilers/platforms this works on. Is it documented anywhere?

Yes, I think ifort stores and updates a timestamp in its .mod files which means they change even when the interfaces do not, hence clever build systems can’t rely on them to avoid recompilation cascades.

See for example CMake issue #17524, “Ninja: re-compilation cascades in Fortran builds” (CMake GitLab, kitware.com).

I have had very mixed experiences with ifort on this subject. I now use submodules extensively, and I have found that ifort will often try to do much more compilation than is necessary. But, at your own risk, if only the implementation has changed in a submodule, ifort will happily accept just that module being recompiled (by hand, F7 in VS) and then relinked. I am not sure this is recommended, but it has not yet let me down. (OK, I might do a complete recompile every day or so.)

Just to be contrarian, I’ll note that some modern Fortran codes use modules in such a way as to require a completely sequential build, whereas C++ codes can be built in parallel (e.g. make -j$(nproc)). I have been told HIRAM is such a code, but haven’t verified it.

Obviously, large Fortran codes can be designed in such a way to avoid build serialization due to modules, and an old-school design like NWChem has ample build parallelism, but I have found that Fortran modules get in the way far more often than C++ headers.

I could not agree more. My own codes are quite serial but are slowly getting better. I do try to write new modules to be as independent as possible of all the rest but, as I say, ifort will tolerate individual modules being compiled and the program then being relinked, provided, of course, that the interfaces have not changed. Season’s Greetings!

Indeed, individual C++ files are slow to compile, but otherwise the build is trivially parallelizable. Often the bottleneck is linking executables and libraries.

I wonder how this will change with C++ modules?

Some day I’d like to test out Mold to try to speed up the linking part. I think it looks very interesting. It should work with a Fortran codebase as well.

Is this also true of more recent versions of C++ (i.e. C++11 to C++17)? There has been a lot of change to the C++ language over the past 10 years, so some of this might be out of date.

That’s been my (subjective) experience with recent C++. It seems to compile pretty quickly if the code is “simple”. But once you start using templates a lot, compilation starts slowing down.

Thanks for the suggestion, I have thought along these lines as well. That is a lot of boilerplate code and indirection for little gain. With current tooling, navigating OOP code is already a challenge sometimes, even if it is kept simple with small and shallow class hierarchies.

On a side note, a few years ago somebody suggested a two-pass compile step for gfortran. The first pass creates the mod files and needs to respect dependencies. The second pass really compiles the code and can be done fully in parallel. However, with my profiling numbers of gfortran in mind, this might not be that fast, as compiling and code generation were the smaller part in larger projects. Handling the huge symbol tables was what killed the performance, and that would not be much different with such a two-pass approach.

The speed of compilation is essential in my opinion, and I have spent a lot of time designing LFortran to be as fast as possible. The results for the code that LFortran can already compile are very encouraging (from the speed of compilation perspective), but we will have to see once we bring it from alpha to beta; then we can try it on real codes.

For Fortran, it seems the compilation is usually fast enough. The slowdown is in linking.