Circular dependency of refactored library on monolithic model code

I am working on a Fortran codebase (a monolithic model for weather/climate simulations) and want to refactor part of its source code into a fully external library called libfeature. The reason is that the institute I am working at could then make consolidated code contributions to libfeature rather than maintaining a bunch of forks of the main model codebase. The model codebase currently has an externals/ directory with many externals (e.g., libmodelmath), which would be an appropriate location for libfeature. Note that the model can be configured to enable or disable such externals with

./configure --enable-feature

However, libfeature depends on a lot of core source code in the model, e.g., mo_grid.f90. This ends up creating a circular dependency when trying to extract feature. An example source tree of the current model is in graphic (a) below.

If I attempt to extract feature into a new library and place it in the externals folder as libfeature, it would inevitably depend on the model codebase (now named desired-model). So if the model codebase source tree now looks like graphic (b) above, it is clear that a circular dependency arises in

desired-model/externals/libfeature/dependencies

since a dependency of libfeature is desired-model.

Are there any Fortran (or C/C++ for that matter) codebases that encounter and solve a similar issue that I might use as a reference to inform my refactoring? I’m finding this circular dependency problem to be challenging to reason about and plan, so any other advice would be very much appreciated!

I list here some things I have also considered about this refactoring process:

(1) For the Release build of libfeature, dependencies/desired-model should not be compiled since this would be redundant. The parent desired-model will already compile e.g., src/mo_grid.f90 so it wouldn’t make sense to also have externals/libfeature/dependencies/desired-model/src/mo_grid.f90 be compiled.

(2) For the Debug build of libfeature, system tests would (I think) necessitate building and running dependencies/desired-model, since one would want to verify that a full simulation still works after making changes to libfeature. Under such a test, dependencies/desired-model/externals/libfeature should point to the in-development version of libfeature (i.e., the one in the current directory) instead of using the usual .gitmodules logic to download a different version of libfeature (i.e., one that is stable and in a remote repo).

(3) libfeature developers might need to make changes to desired-model/src by adding statements like

#ifdef LIBFEATURE
    CALL foo()
#endif

This does not seem like something to be too concerned about because (a) pushing to submodule repositories is streamlined (see this stackoverflow), and (b) there is precedent in the desired-model codebase where such preprocessor directives are used to add other enabled features but not libfeature yet. For example, in the actual codebase, there is an external library called ART and its functionality is enabled via

#ifdef __ICON_ART
    CALL art_rad_aero_interface( .... )
#endif

This is what I mean by precedent. Obviously, this is not necessarily a good practice in terms of maintainability, but so it goes.

Note, the above simplified problem is based on a real problem with the publicly available icon-model codebase. For reference, libfeature corresponds to icon-model/src/upper_atmosphere and grid corresponds to icon-model/src/shr_horizontal


Is the circular dependency an implementation dependency or an interface dependency? How does the circular dependency manifest?

Is it a circular module dependency (interface dependency) ?

! mo_feature.f90
module feature
  use grid
! ...
end module
! mo_grid.f90
module grid
#ifdef LIBFEATURE
   use feature, only: foo
#endif
! ...
end module

Edit: you can find some previous discussion of circular dependencies,


As @ivanpribec’s answer already indicated, your question is too generic to be answered in any specific manner.

Generally speaking, circular dependencies are always issues of design, and can always be avoided through suitable refactoring.

In case they are circular implementation dependencies, they can always be reduced to circular interface dependencies (by programming to [i.e. depending on] interfaces rather than implementations).

In case they are circular interface dependencies, they can always be eliminated by breaking the involved interfaces down into smaller ones.

“Programming to interfaces” can be done using a suitable object-oriented approach.
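As a minimal sketch of what programming to an interface could look like in Fortran 2003 and later: the library depends only on an abstract type, and the model supplies the concrete implementation. All names here (mo_grid_api, t_grid_api, t_model_grid, feature_work) are hypothetical and not taken from the actual codebase.

```fortran
! Hypothetical sketch: all module, type, and procedure names are illustrative.
module mo_grid_api                 ! small interface-only module
  implicit none
  private
  public :: t_grid_api

  ! Abstract type: the only thing the library needs to know about a grid.
  type, abstract :: t_grid_api
  contains
    procedure(n_cells_if), deferred :: n_cells
  end type t_grid_api

  abstract interface
    integer function n_cells_if(self)
      import :: t_grid_api
      class(t_grid_api), intent(in) :: self
    end function n_cells_if
  end interface
end module mo_grid_api

module mo_feature                  ! would live in the library
  use mo_grid_api, only: t_grid_api
  implicit none
  private
  public :: feature_work
contains
  ! Works on any grid that implements the abstract interface.
  integer function feature_work(grid) result(work)
    class(t_grid_api), intent(in) :: grid
    work = 2 * grid%n_cells()
  end function feature_work
end module mo_feature

module mo_model_grid               ! would stay in the model
  use mo_grid_api, only: t_grid_api
  implicit none
  private
  public :: t_model_grid

  type, extends(t_grid_api) :: t_model_grid
    integer :: ncells = 0
  contains
    procedure :: n_cells => model_n_cells
  end type t_model_grid
contains
  integer function model_n_cells(self)
    class(t_model_grid), intent(in) :: self
    model_n_cells = self%ncells
  end function model_n_cells
end module mo_model_grid

program demo
  use mo_feature, only: feature_work
  use mo_model_grid, only: t_model_grid
  implicit none
  type(t_model_grid) :: grid

  grid%ncells = 10
  if (feature_work(grid) /= 20) error stop "unexpected result"
  print *, "feature_work =", feature_work(grid)
end program demo
```

In this arrangement mo_feature and mo_model_grid both depend on mo_grid_api, but neither depends on the other.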

For anyone interested in the topic, I would recommend reading Robert C. Martin’s book “Clean Architecture”, which also contains a chapter on the “Acyclic Dependencies Principle” and on how to organize code so as to avoid circular dependencies.


Welcome to Discourse @jaredfrazier!

I am very familiar with this issue. There are tools that can help you detect when you hit a circular dependency (fpm is one of them), but unfortunately you will still have to tackle this gradually and carefully.

Old Fortran codes did not have inter-dependencies because they only used intrinsic types, and shared data was stored in COMMON blocks. A common block’s layout is redeclared in every program unit that uses it and is identified only by its name, so no dependency across source files is introduced (it’s a massively parallel build).

Now, modules introduce complex dependencies, and it’s the software architect’s job to figure out how to keep code as independent as possible. So you will probably want to start from the bottom up, moving the most basic functionality (I/O, basic data structures, etc.) to independent modules, and then begin separating more and more high-level modules from the rest of the code.

So this whole refactoring will take time and effort, but it will eventually be very much worth it!


I have read both good and bad things about this book (especially if you need to keep performance in mind), so one should take some of the principles with a grain of salt.

But thanks for bringing up the ADP, because there is a page on Wikipedia about it – Acyclic dependencies principle - Wikipedia – which links to an archived PDF (38.9 KB) containing a report Martin wrote for C++ Report.


@ivanpribec The design principles that Martin discusses in this book are all about the organization of code on package- (i.e. coarse) and class-level (i.e. intermediary) scales, and are thus orthogonal to questions of performance.

There’s no need to take these principles with a grain of salt.

After having had a look into the code base in question, I presume that the option of using an object-oriented approach to achieve the desired decoupling might not appear too attractive to the OP effortwise, as he’s working with an essentially procedural code base.

An option that one can then try is to create a new code component (i.e. a module or a set of modules) and to move all of the code that the OP’s “feature” and “model” both depend upon into this new component, in order to break the circular dependency (as described in Martin’s archived PDF that @ivanpribec linked to above).
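A rough sketch of that layout, assuming the shared code is a small grid type: the names mo_shared_grid, t_grid, grid_spacing, and feature_timestep are made up for illustration. The shared module sits at the bottom, and both the feature and the model use it without using each other.

```fortran
! Hypothetical sketch: names are illustrative, not from the model.
module mo_shared_grid              ! new common component, built on its own
  implicit none
  private
  public :: t_grid, grid_spacing

  type :: t_grid
    integer :: ncells = 0
    real    :: length = 0.0
  end type t_grid
contains
  pure real function grid_spacing(grid)
    type(t_grid), intent(in) :: grid
    grid_spacing = grid%length / real(grid%ncells)
  end function grid_spacing
end module mo_shared_grid

module mo_feature                  ! library: depends only on the common component
  use mo_shared_grid, only: t_grid, grid_spacing
  implicit none
  private
  public :: feature_timestep
contains
  pure real function feature_timestep(grid, speed)
    type(t_grid), intent(in) :: grid
    real,         intent(in) :: speed
    feature_timestep = grid_spacing(grid) / speed
  end function feature_timestep
end module mo_feature

program model                      ! model: depends on both, nothing depends on it
  use mo_shared_grid, only: t_grid
  use mo_feature,     only: feature_timestep
  implicit none
  type(t_grid) :: grid

  grid = t_grid(ncells=100, length=1000.0)
  print *, "time step:", feature_timestep(grid, 50.0)
end program model
```

The dependency graph is then a simple chain: the common component can be built and tested alone, the library needs only the common component, and the model needs both.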


I got the book mixed up with “Clean Code”, which has been criticized in some circles, so consider my earlier comment void.

Procedures could be used as arguments in F77 and presumably earlier too. F2003 introduced procedure pointers. As Rob Pike notes,

I argue that clear use of function pointers is the heart of object-oriented programming. Given a set of
operations you want to perform on data, and a set of data types you want to respond to those operations, the
easiest way to put the program together is with a group of function pointers for each type. This, in a nutshell, defines class and method. The O-O languages give you more of course — prettier syntax, derived
types and so on — but conceptually they provide little extra.

So perhaps some clever application of procedure arguments could introduce the desired decoupling. One could even use sequence types to avoid the need for sharing interfaces via modules. But in general I wouldn’t recommend this.

Edit: after peeking into the code, I agree with the idea of creating a new common code component, which can be used independently (i.e. does not require building of the full desired-model).


Several years ago I also started a refactoring process to split a large monolithic codebase into smaller modular bricks, and one of the first problems I had was how to break certain circular dependencies. Procedure pointers were the magic bullet that enabled it. They made it possible to abstract away procedures whose signature I needed to know but whose implementation lived elsewhere: within the (newly created) library I could either leave a null procedure pointer or provide a “default” behavior, and then change the behavior by pointing to the actual procedure in the application that depends on the library.
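A minimal sketch of that pattern, under the assumption that the library only needs one integer-valued query from the application. The names mo_feature_hooks, set_n_cells_hook, and model_n_cells are hypothetical.

```fortran
! Hypothetical sketch: every name below is illustrative.
module mo_feature_hooks            ! would live in the library
  implicit none
  private
  public :: set_n_cells_hook, feature_n_cells

  ! The library only knows the signature, not the implementation.
  abstract interface
    integer function n_cells_if()
    end function n_cells_if
  end interface

  ! Null until the application registers its own procedure.
  procedure(n_cells_if), pointer :: n_cells_hook => null()
contains
  subroutine set_n_cells_hook(hook)
    procedure(n_cells_if) :: hook
    n_cells_hook => hook
  end subroutine set_n_cells_hook

  integer function feature_n_cells() result(n)
    if (associated(n_cells_hook)) then
      n = n_cells_hook()           ! behavior supplied by the application
    else
      n = 1                        ! default behavior inside the library
    end if
  end function feature_n_cells
end module mo_feature_hooks

module mo_model_grid               ! would stay in the application
  implicit none
  private
  public :: model_n_cells
contains
  integer function model_n_cells()
    model_n_cells = 42
  end function model_n_cells
end module mo_model_grid

program demo
  use mo_feature_hooks, only: set_n_cells_hook, feature_n_cells
  use mo_model_grid,    only: model_n_cells
  implicit none

  if (feature_n_cells() /= 1) error stop "expected library default"
  call set_n_cells_hook(model_n_cells)
  if (feature_n_cells() /= 42) error stop "expected application hook"
  print *, "n_cells via hook:", feature_n_cells()
end program demo
```

The library module has no use statement for any application module, so it can be compiled and tested before the application exists.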


@ivanpribec and @kkifonidis I’ve looked closer at my simplified example and realized that I may need to re-frame the question to more accurately reflect my problem.

I think the root of the issue is not that there is some sort of circular dependency, since actually mo_feature.f90 does use mo_grid.f90 but mo_grid.f90 does not use mo_feature.f90. So I realize that it is incorrect to call my problem a circular dependency issue since dependency between those modules is acyclic. Rather the root is more that there is tight coupling between mo_feature.f90 and other modules in desired-model/src. So I should probably rename my post something related to tight coupling, which is more of a design problem than a Fortran specific problem, but nevertheless the insight from Fortran users has been useful as this is the lang for the codebase. I’ll think about this some more and definitely take into consideration the advice and pointers you all have given! I realize now my problem is more generic than I thought, though the suggestions arising from my post such as procedure pointers are very good tips :))


I am part way down a similar track. I think the tight-coupling problem is mainly a design issue as you say, but making the coupling explicit helped me. I have been busy adding ‘only’ clauses to all ‘use’ statements, and patterns have emerged which helped see the bigger picture.
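For illustration, a small sketch of what that looks like. The module contents are invented; the point is that the only clause documents exactly which names cross the module boundary.

```fortran
! Hypothetical sketch: module contents are illustrative.
module mo_grid
  implicit none
  integer, parameter :: n_cells = 4
  real :: cell_area(n_cells)   = 1.0
  real :: cell_height(n_cells) = 2.0
end module mo_grid

module mo_feature
  ! Before: "use mo_grid" silently imported every public name.
  ! After: the coupling is limited to the two names listed here.
  use mo_grid, only: n_cells, cell_area
  implicit none
  private
  public :: total_area
contains
  real function total_area()
    total_area = sum(cell_area(1:n_cells))
  end function total_area
end module mo_feature

program demo
  use mo_feature, only: total_area
  implicit none
  print *, "total area:", total_area()
end program demo
```

Once every use statement is written this way, searching the source for a module name gives a quick list of which entities each client actually needs.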