Weather and climate modeling codes from Fortran to C++

@milancurcic, I got into this discussion after reading about EAMxx. Apparently, they intend to rewrite the entire atmospheric physics in C++ (see https://science.osti.gov/-/media/ascr/ascac/pdf/meetings/202309/Bertagna_ASCAC_092823_EAMxx.pdf page 18). They also say they may achieve better performance.
Do you have any update/additional info on this?
Thank you very much!

1 Like

I don’t but the slides that you linked seem both recent and descriptive. What info are you looking for? Unless you’re in contact with one of the team members, the best way to stay up to date with the progress is probably to follow the GitHub repositories of the project.

This activity - a very high-resolution climate model written in C++ around the Kokkos framework - won a Gordon Bell Prize for Climate Modeling in 2023.

1 Like

These are a newbie’s 2 cents, but it is not surprising that most codes are migrating towards C++. The user base for C++ is much wider, and in my humble opinion, you would all have nightmares looking at the source code of those tools that are still written in Fortran. I use NEMO on a daily basis, which is a fairly big project, and I plan to start using Croco this year (a standalone branch of ROMS). These codes are awful when it comes to modifying the source, and even worse if you want to modernize them. If you look at NEMO, as an example, and you want to reuse its management of the domain, you have no way to do it other than reading every single line and trying to extract its behaviour, or going to speak directly with the developer, because that part of the code is not documented. This is because every single person who started developing these tools found no standard approach to rely on, and basically reinvented the wheel, every single time. Croco, as another example, is now being ported to GPU with OpenACC, but still uses FORTRAN 77 standards.

Most of these codes are, however, fairly simple (I am mainly referring to finite-difference codes), and the most cumbersome aspect is probably domain decomposition and communication. Besides this aspect, which usually changes from project to project, I think that if you want to give Fortran a chance, the only way is to implement libraries that handle the mathematical operations on a subdomain efficiently: tensor operations, finite differences (and, why not, automatic differentiation for those), linear algebra, all implemented in such a way that a shared-memory paradigm can be exploited (so multithreaded CPU or GPU acceleration is possible within the domain). Without this, and a fair amount of advertising, everyone will be forced to start from scratch every single time, and it will always be more convenient to start from scratch with a language that already has a lot of libraries.
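As a purely hypothetical sketch of the kind of building block I mean: a centered difference written with do concurrent, so a compiler is free to run it multithreaded or offload it to a GPU (nvfortran’s -stdpar=gpu is one such option; exact flags vary by compiler).

! Hypothetical sketch: a shared-memory-friendly finite-difference kernel.
! DO CONCURRENT leaves the compiler free to parallelize or offload the loop nest.
program ddx_demo
   implicit none
   integer, parameter :: nx = 64, ny = 32
   real, parameter :: dx = 0.1
   real :: f(nx, ny), dfdx(nx, ny)
   integer :: i, j

   call random_number(f)
   dfdx(1, :)  = 0.0                        ! boundary rows set to zero for simplicity
   dfdx(nx, :) = 0.0
   do concurrent (j = 1:ny, i = 2:nx-1)     ! interior points: centered difference in x
      dfdx(i, j) = (f(i+1, j) - f(i-1, j)) / (2.0*dx)
   end do
   print *, 'max |df/dx| =', maxval(abs(dfdx))
end program ddx_demo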

I am thinking about starting (in a couple of months) from @milancurcic’s implementation of the shallow water model from his book, using MPI for the domain decomposition, using NEMO-like geographic fields, and trying to understand what the needs of an ocean model are, and potentially hiding an OpenMP/OpenACC implementation behind an API.

TL;DR: we need a complete, high-performance, well-documented, well-advertised Fortran version of NumPy, TensorFlow, and linalg. Without these, the migration from old Fortran to modern Fortran is going to be as difficult as migrating to another language, without all the benefits of the larger user base and documentation of the new language.

6 Likes

100% agreed!

The stdlib should cover a spectrum close to NumPy+SciPy. Just recently I was reading this issue Improve user documentation (feedback from students) · Issue #697 · fortran-lang/stdlib · GitHub and found out that there is an effort to create documentation for stdlib that looks very promising. I think it would be worth pushing that forward.

3 Likes

@kimala

I read your lines along the lines of «too bad the present code base’s style is too archaic, and rewriting to modernize it is a considerable investment». It can be tedious if done manually, though for instance the Fortran Wiki’s entry Modernizing Old Fortran, and the compilation of old programming idioms curated by Arjen Markus and Brad Richardson, help to identify those idioms and port them to contemporary Fortran.

Please note that in April 2021 (link to a video recording), John Collins presented fpt, the Fortran partner tool, as offering a programmatic approach to this problem, especially with larger projects in mind. The tool analyzes the submitted source code to report, e.g., inconsistencies and errors, and optionally provides an interactive translation into the new (i.e., free) source form. Distributed by SimCon, its broad spectrum of functionality is listed here; it could plausibly be helpful here.

IMO, this is true of many scientific codes in C++ too. If you look at the slide deck linked by @Fabio, they mention this concern:

C++ syntax is much heavier than Fortran, which can be a barrier for field scientists. Good software design is crucial to separate concerns, increase productivity, and make code maintenance manageable

Certainly, there are elements of Fortran 77, such as short variable names, implicit typing, extensive use of common blocks, go to and other obsolete language elements, which make reading and modifying code very challenging, especially when this “dialect” of Fortran has to be “picked up on the job” and the pool of software developers consists heavily of graduate students.

When it comes to HPC nowadays, the main advantages C++ has are more options for portable GPU offloading and a bigger user base. The latter has several important consequences: more programmers, better tooling, better educational courses, and so forth.

Isolation of user communities is one of the problems we’re trying to address with Fortran-Lang. But the success ultimately depends on the time and effort of contributors to develop the missing tools jointly. If there are parts of NEMO or Croco that could be reused, why not think about contributing them to stdlib?

Is this a manual port, or are they using a source-to-source translation approach with something like RefactorF4Acc? As a side note, I think it’s important to differentiate between the F77 standard and F77 programming practices. It’s possible to write F77-like code which is still F2008 or even F2018 standard-conformant.
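To make the distinction concrete, here is a contrived sketch (not taken from any real code) written entirely with F77-era practices. None of these features has been deleted from the standard, so a recent gfortran should still accept it with -std=f2018, possibly with obsolescence warnings:

C     F77 practices that are still standard-conformant today:
C     fixed source form, implicit typing, COMMON, labeled DO, GO TO
      SUBROUTINE ADV(N)
      COMMON /FLD/ U(1000), DT
      IF (N .GT. 1000) GO TO 20
      DO 10 I = 1, N
         U(I) = U(I) + DT*U(I)
   10 CONTINUE
   20 RETURN
      END

      PROGRAM RUNADV
      COMMON /FLD/ U(1000), DT
      DT = 0.1
      U(1) = 1.0
      CALL ADV(1)
      WRITE(*,*) U(1)
      END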

1 Like

Speaking of porting to other languages, Matt Godbolt (the author of Compiler Explorer) gave an amusing keynote talk on “C++'s Superpower” at CPPP 2021 (a French C++ conference). Here are a few screenshots answering the question of what C++'s superpower is:

[three screenshots from the talk]

The fact that Fortran allows the old and new to live together means you can incrementally improve your programs, making them safer, easier to debug, more understandable, and generally easier to work with.
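A minimal sketch of that incremental path (all names made up): keep a legacy routine untouched, but give it an explicit interface in a new module, so modern code calling it gets compile-time argument checking while the old source stays as it is; the internals can then be modernized whenever convenient.

! step77.f -- untouched legacy source
      SUBROUTINE STEP(N, DT, U)
      INTEGER N
      REAL DT, U(N)
      DO 10 I = 1, N
         U(I) = U(I) + DT*U(I)
   10 CONTINUE
      END

! step_api.f90 -- new code wrapping the old routine
module step_api
   implicit none
   interface
      subroutine step(n, dt, u)          ! explicit interface to the legacy routine
         integer, intent(in) :: n
         real, intent(in)    :: dt
         real, intent(inout) :: u(n)
      end subroutine step
   end interface
end module step_api

program demo
   use step_api
   implicit none
   real :: u(5) = 1.0
   call step(size(u), 0.1, u)            ! argument mismatches now caught at compile time
   print *, u
end program demo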

3 Likes

If I recall correctly, Croco is partially developed by the French navy, so I’m not fond of the idea of having to run away from France :rofl:

Jokes and spy stories aside, I think what should be learned from those codes is not the actual coding but rather the methods implemented. What would be great for stdlib to have is an “ocean/atmosphere” package capable of handling geographical fields and netCDF I/O, with possible compatibility with managers like XIOS. (And an easy way to install these dependencies, because this is truly the most difficult part of it.)
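As a rough sketch of the netCDF I/O part with the existing netcdf-fortran bindings (the file name, variable name, and grid size below are made up, and error checking is omitted):

! Hypothetical example: read a 2-D surface field with the netcdf-fortran API.
program read_field
   use netcdf
   implicit none
   integer :: ncid, varid, ierr
   real, allocatable :: tos(:,:)

   allocate(tos(360, 180))                                   ! grid size assumed known
   ierr = nf90_open("ocean_surface.nc", NF90_NOWRITE, ncid)
   ierr = nf90_inq_varid(ncid, "tos", varid)                 ! "tos": sea surface temperature
   ierr = nf90_get_var(ncid, varid, tos)
   ierr = nf90_close(ncid)
   print *, "mean field value:", sum(tos) / size(tos)
end program read_field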

I was an experienced C++ programmer who spent a great deal of time in the wars a decade or two ago.
The scientists did not want to move to C++, and they had their reasons. Looking back, some were rather valid. The early obsession with OO was a concern for these very data-oriented coders. OO is a disaster for data (and just about everything else): it fragments the cache in terrible ways. An “OO solves everything” mindset was part and parcel of early C++.
These ingenious weather codes in F77 were the epitome of efficiency and speed, pure array brilliance.
Unfortunately, the weather guys failed to understand the idea of a hybrid C++ data-oriented programming methodology, i.e., arrays where arrays thrive, but objects and generics at a higher level to make your code more flexible and extensible.
So they missed the boat. They made the decision to stay with Fortran for their large infrastructure projects and so they stagnated horribly. Talented programmers mocked their backwardness.
Places like Sandia, on the other hand, migrated their data-oriented finite element codes to hybrid C++ architectures, and they have a thriving software microcosm.
The Fortran guys then became involved with things like F90, which, IMHO, was a disaster: slow, array copies, opaqueness, and complication. All the bad things from OO and a de-emphasis on the good and simple things in F77.
Many of us who understood software and modern scientific code architecture finally left the stagnant field and never looked back.

1 Like

@nickels thanks for the feedback and welcome!

I generally agree with your sentiment. I’ll add two clarifications: First, Sandia has large teams to maintain their C++ codes. Generally, if you have a team, you can easily afford a large C++ code base and it will deliver for you. However, if you have a small team or are a single developer, C++ is hard. Fortran, on the other hand, can be used by a single person and it will deliver for you, if you need numerics. (And I use C++ daily.)

Second, regarding F90, I think it was initially slower, but it seems to me that modern compilers can compile it to code as fast as F77.

C++ is fine, but whether it can become as easy, safe, and high-performance as Fortran for numerical array computing is not clear to me; right now it still has a long way to go.

3 Likes

It is rather disappointing to me that creating a Fortran equivalent to Eigen remains somewhat impossible at this point, or at least, has not been done and widely accepted as the go-to standard.

1 Like

Thanks for the friendly reply!
I don’t know if this is appropriate for here, but what would really be an improvement for me is if Fortran found its way, in some form, into a default compiler for Visual Studio. I’ve got an old airplane model in Fortran that I need to use (searching led me here), but it’s just so hard to deal with Fortran integration on Windows (it’s mostly ABI and standard-library clashes). We had a Levenberg-Marquardt optimization code a while back in Fortran that had the same issue. Linux is a choice, but it only gets one so far!
Some of the BLAS-type things at Sandia were in Fortran under the C++ covers!
This cross-language usage can be very powerful.

2 Likes

The stdlib team is waiting for you :slight_smile: !

2 Likes

It is rather disappointing to me that creating a Fortran equivalent to Eigen remains somewhat impossible at this point, or at least, has not been done and widely accepted as the go-to standard.

BLAS/LAPACK.

To quote from Eigen’s FAQ:

Eigen has an incomparably better API than BLAS and LAPACK

Here is an example of the generalised eigenvalue decomposition from our code:

integer m,info
real(8) vl,vu
! workspace and output arrays sized per the ZHEGVX documentation
integer iwork(5*n),ifail(n)
real(8) w(n),rwork(7*n)
complex(8) work(2*n)
! itype=1, jobz='V' (compute eigenvectors), range='I' (eigenvalue indices 1..nv), uplo='U'
call zhegvx(1,'V','I','U',n,A,n,B,n,vl,vu,1,nv,evaltol,m,w,v,n,work,2*n,rwork,iwork,ifail,info)

Here’s the nearly equivalent code using Eigen:

GeneralizedSelfAdjointEigenSolver<MatrixXd> es(A,B);
cout << "The eigenvalues of the pencil (A,B) are:" << endl << es.eigenvalues() << endl;
cout << "The matrix of eigenvectors, V, is:" << endl << es.eigenvectors() << endl << endl;

Now if you’ve been brought up on MATLAB or Python then the Eigen interface may seem better, although I can’t see it myself. However, the LAPACK interface (with its antiquated call statement) is easier when you have to deal with details, which is so often the case with large codes. In this case we need to compute a subset of eigenvectors within an integer range. I’m sure this can be done in Eigen, but the desired range still has to be typed in somewhere, and you end up with the same amount of coding (or more) to be done.

1 Like

And the Wizard of BLAS, Jack Dongarra, and his group are working on the MAGMA and PLASMA projects to extend LAPACK to multi-core and GPU systems as well as MPI. They supposedly have Fortran interfaces, but the latest version of MAGMA appears to be missing a Python script needed to compile the Fortran interfaces.

https://icl.utk.edu/magma/

https://icl.utk.edu/projectsfiles/plasma/html/doxygen/

Edit:

The latest MAGMA development version on Bitbucket appears to have the files that are missing from the 2.7.2 tar files. See

https://bitbucket.org/icl/magma/src/master/

2 Likes

There is a useful (although not often updated) Fortran 95 interface, which enables you to get by with

   Use Lapack_95
   ...
   call hegvx(A, B, w)

The Lapack-95 interfaces take care of converting the Fortran-95 argument list to the much longer F77 argument list and back, as well as allocating/deallocating the work arrays needed.

2 Likes

A word of warning about using the MKL version of LAPACK95: I found that the argument lists of some of the routines did not correspond to the original NETLIB version (same arguments, just in different positions in the argument list). I strongly suggest you use keywords with all arguments if you want to later use the NETLIB version of LAPACK95.
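For the call shown above, that would look like this (a sketch; keyword names as in the NETLIB LAPACK95 documentation, worth double-checking against the module you actually use):

! Passing every argument by keyword keeps the call working even if a vendor's
! LAPACK95 module orders the positional arguments differently from NETLIB's.
call hegvx(A=A, B=B, W=w)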

Yes, we are figuring out a more modern Fortran API for linear algebra as part of stdlib. @FedericoPerini is leading the effort.

2 Likes

I always saw the main point of Eigen as providing matrix and vector classes, which are otherwise missing in C++.

Looking at some of the showcases of the Eigen API, Fortran has equivalents for most of them:

// performing row/column operations on a matrix
m.row(i) += alpha * m.row(j);
m.col(i) = alpha * m.col(j) + beta * m.col(k);
m.row(i).swap(m.row(j));
m.col(i) *= factor;

I’d say the overloaded in-place arithmetic operators are the nice thing here. In Fortran you’d use array expressions (or alternatively reach for BLAS; see the sketch further below), and hopefully the compiler won’t generate temporary arrays:

M(i,:) = M(i,:) + alpha*M(j,:)
M(:,i) = alpha * M(:,j) + beta * M(:,k)

tmp = M(j,:)
M(j,:) = M(i,:)
M(i,:) = tmp

M(:,i) = factor*M(:,i)

Swapping rows can be accomplished with a helper subroutine:

call swap(a(1,:),a(2,:))

contains
    elemental subroutine swap(a,b)
        real, intent(inout) :: a, b
        real :: t
        t = a
        a = b
        b = t
    end subroutine
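
The BLAS route mentioned above would look roughly like this (a sketch assuming double precision and an external reference BLAS, linked with e.g. -lblas):

! Sketch: the same row/column operations via level-1 BLAS calls.
! Rows of a column-major array are addressed with a stride equal to the leading dimension.
program blas_rowcol
   implicit none
   integer, parameter :: nrow = 4, ncol = 5, i = 1, j = 2
   real(8), parameter :: alpha = 2.0d0, factor = 3.0d0
   real(8) :: M(nrow, ncol)
   external :: daxpy, dswap, dscal

   call random_number(M)
   call daxpy(ncol, alpha, M(j,1), nrow, M(i,1), nrow)   ! M(i,:) = M(i,:) + alpha*M(j,:)
   call dswap(ncol, M(i,1), nrow, M(j,1), nrow)          ! swap rows i and j
   call dscal(nrow, factor, M(1,i), 1)                   ! M(:,i) = factor*M(:,i)
   print *, M(i,:)
end program blas_rowcol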

Computing various reductions is easier in Fortran than in Eigen:

// computing sums with Eigen
result = m.sum(); // returns the sum of all coefficients in m
result = m.row(i).sum();
result = m.rowwise().sum(); // returns a vector of the sums in each row

// sum of the cubes of the i-th column of a matrix:
result = m.col(i).array().cube().sum(); // array() returns an array-expression

! computing sums in Fortran
result = sum(m)
result = sum(m(i,:))
result = sum(m,dim=2)
result = sum(m(:,i)**3)

The factorization syntax and solvers in Eigen are very convenient:

result = m.lu().solve(right_hand_side);            // using partial-pivoting LU
result = m.fullPivLu().solve(right_hand_side);     // using full-pivoting LU
result = m.householderQr().solve(right_hand_side); // using Householder QR
result = m.ldlt().solve(right_hand_side);          // using LDLt Cholesky

Fortran does not allow chaining calls on a function result, so this would become a two-step process:

type(matrix) :: M
M = matrix([...])
call M%lu(inplace=.true.)
result = M%solve(right_hand_side)

Nevertheless, thanks to Eigen’s powerful template-based design, it’s relatively easy to use from Fortran too. Here’s how to do it using F2018:

// eigen.cpp
#include <ISO_Fortran_binding.h> // F2018
#include <Eigen/Dense>

using namespace Eigen;

// Convert descriptor to matrix
template<typename T>
auto fmat(CFI_cdesc_t *M) -> Map<Matrix<T,Dynamic,Dynamic>> {
    return {static_cast<T *>(M->base_addr), M->dim[0].extent, M->dim[1].extent};
}

// Convert descriptor to column vector
template<typename T>
auto fvec(CFI_cdesc_t *x) -> Map<Matrix<T,Dynamic,1>> {
    return {static_cast<T *>(x->base_addr), x->dim[0].extent};
}

template<typename T>
void solve_lu(CFI_cdesc_t *A, CFI_cdesc_t *b)
{
    auto A_ = fmat<T>(A);
    auto b_ = fvec<T>(b);
    b_ = A_.lu().solve(b_);
}

extern "C" {

void sslvlu(CFI_cdesc_t *A, CFI_cdesc_t *b) { solve_lu<float>(A, b); }
void dslvlu(CFI_cdesc_t *A, CFI_cdesc_t *b) { solve_lu<double>(A, b); }

} // extern "C"

! test_eigen.f90

module eigen
use, intrinsic :: iso_c_binding, only: c_double, c_float
implicit none
interface solve
   subroutine sslvlu(A,b) bind(c)
      import c_float
      real(c_float), intent(in), contiguous :: A(:,:)
      real(c_float), intent(inout), contiguous :: b(:)
   end subroutine
   subroutine dslvlu(A,b) bind(c)
      import c_double
      real(c_double), intent(in), contiguous :: A(:,:)
      real(c_double), intent(inout), contiguous :: b(:)
   end subroutine
end interface
end module

program main
use eigen
implicit none
real(c_float) :: A(3,3)
real(c_float) :: b(3), x(3)
A = reshape([1,-1,2,-2,3,-5,3,-1,5],[3,3])
b = [9,-6,17]
x = b
call solve(A,x)
print *, "Solution: ", x
print *, "Norm: ", norm2(matmul(A,x) - b)
end program

Commands to run the example:

$ g++-13 -c -O2 eigen.cpp -I/usr/include/eigen3/
$ gfortran -Wall -O2 test_eigen.f90 eigen.o -lstdc++
$ ./a.out
 Solution:    1.00000000      -1.00000000       2.00000000    
 Norm:    0.00000000    

In this manner you can wrap all of Eigen’s dense matrix decompositions in an evening.

4 Likes