Defining a derived type over many files

nicholaswogan · October 16, 2021, 6:23pm

In C++ you often define a class over two file types: header files, and .cpp files. The header files define variables, and function arguments (e.g. void myfunc(int a, int b);), while the .cpp files contain the code for what the functions do.

This is nice for when you have big classes with lots of functions. You can see overall structure in the header file, and you can organize functions over many different .cpp files. No single file gets too long and unwieldy.

My question is, what is the best way to emulate this organization in Fortran, distributing the definition of a single derived type over many files? Using a single file can be really unwieldy sometimes.

Here is what I’m currently thinking. Analogous to a class defined in a C++ header, you can define an abstract types in Fortran, with deferred subroutines:

module myobject_abs
  implicit none
  
  type, abstract :: mytype_abs
    integer :: a
  contains
    procedure (mytype_printa), deferred :: printa 
    procedure (mytype_add2a), deferred :: add2a
  end type
  
  abstract interface
     subroutine mytype_printa(self)
        import mytype_abs
        class (mytype_abs),intent(in) :: self
     end subroutine
     
     subroutine mytype_add2a(self, b)
        import mytype_abs
        class (mytype_abs),intent(inout) :: self
        integer, intent(in) :: b
     end subroutine
  end interface
  
end module

Then you define the subroutines in separate modules (or the same module). This is analogous to .cpp files:

module myobject_subs1
  use myobject_abs, only: mytype_abs
  implicit none  
contains
  
  subroutine printa(my)
    class(mytype_abs), intent(in) :: my
    print*,my%a
  end subroutine
  
end module

module myobject_subs2
  use myobject_abs, only: mytype_abs
  implicit none
contains
  
  subroutine add2a(my, b)
    class(mytype_abs), intent(inout) :: my
    integer, intent(in) :: b
    my%a = my%a + b
  end subroutine
  
end module

Finally, you knit everything together in a final module

module myobject
  use myobject_abs, only: mytype_abs
  implicit none
  
  type, extends(mytype_abs) :: mytype
  contains
    procedure :: printa => mytype_printa
    procedure :: add2a => mytype_add2a
  end type
  
contains
  
  subroutine mytype_printa(self)
    use myobject_subs1, only: printa
    class(mytype), intent(in) :: self
    call printa(self)
  end subroutine
  
  subroutine mytype_add2a(self, b)
    use myobject_subs2, only: add2a
    class(mytype), intent(inout) :: self
    integer, intent(in) :: b
    call add2a(self, b)
  end subroutine
  
end module

Maybe there is a better solution?

FortranFan · October 16, 2021, 7:00pm

@nicholaswogan , the gist of what you are looking for is possible to some extent already though you may suffer from the usual Fortrannic issues with verbosity and some duplication and in some cases, having to make things PUBLIC that you may not want otherwise. Here SUBMODULEs and procedures with MODULE prefix can help, I suggest you look into them.

However in your illustration, the section with “define the subroutines in separate modules (or the same module)” doesn’t make sense since you show your myobject_abs as ABSTRACT and the bound procedures as deferred.

shahmoradi · October 16, 2021, 7:47pm

I concur with FortranFan’s recommendation and I highly recommend taking the submodule path with Fortran procedure implementations. Surely it hurts emotionally to separate the procedure interface (to appear in the parent module) from the procedure implementation (to appear in the submodule), but as soon as your project reaches any level of seriousness, you will realize what a development-stage timer-saver submodule is. Suppose your project’s build time is 15 minutes or more (which is not unrealistic at all). Then making a single letter change to the interior of one procedure in a basic module will initiate a whole compilation cascade, wasting another 15 minutes of the developers time. With submodules, that will not happen.

FortranFan · October 16, 2021, 7:53pm

Here’s one option that you can look into for now, as to whether it’s a “better solution” is up to the beholder!

So note with SUBMODULEs, the idea is to separate the interface from the implementation. This can help with 1) interface (contract) driven development that has benefits, 2) to “hide” the implementations in case the source for the module needs to be shared with parties who have no such “need to know”, and 3) prevent compilation cascades in situations of complex dependencies with the code in those implementations.

File 1: say a.f90

module a_m

   type, abstract :: a_t
      integer :: a = 0
   contains
      procedure :: printa_default
      procedure(printa_default), deferred :: printa 
      procedure :: add2a_default
      procedure(add2a_default), deferred :: add2a
   end type

   interface
      module subroutine printa_default( self )
         class(a_t), intent(in) :: self
      end subroutine
      module subroutine add2a_default(self, b)
         class(a_t), intent(inout) :: self
         integer, intent(in) :: b
      end subroutine
   end interface

end module

File 2: say printa_sm.f90

submodule(a_m) printa_sm
contains
   module subroutine printa_default( self )
      class(a_t), intent(in) :: self
      print *, self%a
   end subroutine
end submodule

File 3: say add2a_sm.f90

submodule(a_m) add2a_sm
contains
   module subroutine add2a_default(self, b)
      class(a_t), intent(inout) :: self
      integer, intent(in) :: b
      self%a = self%a + b
   end subroutine
end submodule

File 4: say e.f90

module e_m
   use a_m, only : a_t
   type, extends(a_t) :: e_t
   contains
      procedure :: printa => printa_e
      procedure :: add2a => add2a_e
   end type
contains
   subroutine printa_e( self )
      class(e_t), intent(in) :: self
      call self%printa_default()
   end subroutine 
   subroutine add2a_e( self, b )
      class(e_t), intent(inout) :: self
      integer, intent(in) :: b
      call self%add2a_default( b )
   end subroutine 
end module

certik · October 16, 2021, 8:53pm

I will note that with a good compiler and a build system it should never happen that if you change an internal of a subroutine that the whole project recompiles… But of course I know that is what often happens. Although I thought cmake is quite clever about this and does not trigger a cascade if the mod file did not change (i.e., the public API didn’t change).

Regarding 15 minutes or more to build the whole project, which I know very well happens. I feel that Fortran compilers are not taking advantage of how simple the language really is, and I think they should compile much faster. On a very synthetic benchmark, LFortran can compile roughly 250,000 Fortran lines per second to machine code, on a single core on my laptop; but only via the direct x86 backend. If we use the LLVM backend, it drops to about 13,000 lines per second. Say your big Fortran project has 10M lines of code, so with 13,000 lines/s that corresponds to about 13 minutes. But if it could compile at 250,000 lines/s, the whole project will compile in just 40s. On 8 cores it could drop down to 5s (I doubt that can be achieved in practice, as for these big projects a lot of time is spent linking, which the above estimate ignores: the x86 backend does linking, but I still think it will get slower for a huge project). These are obviously just back of the envelope calculations, we have to see how LFortran will perform in practice, but I feel much faster compilation can completely change how people develop, and that is why I spent so much time designing LFortran to compile very fast; and even after we can compile everything via the LLVM backend, our main focus currently, we would then have to improve the x86 backend to take advantage of the speed (and write an ARM backend too). But we will, that is my plan.

nicholaswogan · October 16, 2021, 10:35pm

@FortranFan Thanks! Submodules are perfect. This accomplishes exactly what I want:

module myobject
  implicit none
  private
  
  public :: mytype
  
  type :: mytype
    integer :: a = 2
  contains
    procedure :: printa => mytype_printa
    procedure :: add2a => mytype_add2a
  end type
  
  interface
    
    module subroutine mytype_printa(self)
      class (mytype),intent(in) :: self
    end subroutine
    
    module subroutine mytype_add2a(self, b)
      class (mytype),intent(inout) :: self
      integer, intent(in) :: b
    end subroutine
    
  end interface
  
end module

and in two other files

submodule(myobject) myobject_subs1
  implicit none  
contains
  
  subroutine mytype_printa(self)
    class(mytype), intent(in) :: self
    print*,self%a
  end subroutine
  
end submodule

submodule(myobject) myobject_subs2
  implicit none
contains
  
  subroutine mytype_add2a(self, b)
    class(mytype), intent(inout) :: self
    integer, intent(in) :: b
    self%a = self%a + b
  end subroutine
  
end submodule

oscardssmith · October 17, 2021, 12:58am

why is the llvm backend so much slower? is it doing more optimization?

shahmoradi · October 17, 2021, 1:37am

All true. I have noticed that CMake/Make has become clever enough to distinguish at least comment changes from code changes. I have not paid attention to whether it can recognize interface change from implementation change. That would be an easy test though. It is quite easy to create long compilation times with interprocedural optimization enabled, all needed appears to be a few thousand tests each of which works with some hierarchical derived type/class in some other module.
update: I think I know the answer. CMake/Make currently cannot recognize interface change from implementation change based on my observations within a basic module.

certik · October 17, 2021, 5:13am

I tested LLVM with no optimization. (So the resulting executable was roughly as fast as with the LFortran’s x86 backend.) I don’t know why it is so slow. I noticed other languages also discovered the same issue. I believe Zig and Jai. My experience with LLVM is that it is really good for low to medium level optimizations. But the price is slow compilation even without optimizations.

NormanKirkby · October 17, 2021, 6:30pm

As a side issue, I have started to put each abstract interface into a separate module with no submodule implementations, I even conclude the name of the module with _ADT (for abstract data type) I find this makes the code organisation very clear.

BTW the comments above about the performance of LFortran are an eye opener, thanks

Topic		Replies	Views
Compilation time vs. C++ Help	22	1770	January 10, 2022
How to conceal derived-type components?	8	717	September 22, 2022
The recompilation cascade issue and its solutions Help	7	833	November 13, 2022
Question about interfaces and submodules	6	2355	January 19, 2021
Define a function/subroutine inside a derived type Help	9	579	October 20, 2021

Defining a derived type over many files

Related topics