Fortran Arrays in C++

This is a bit outside the usual scope of this forum, and I apologise if it is inappropriate. Working as research software engineers, we often encounter people who have problems that would probably be best implemented in Fortran, but who want to work in another language (usually C++) because Fortran is “old fashioned”. Where possible we encourage them to use Fortran anyway, but sometimes we have to work in C++. To make our lives easier we have created a C++ library that implements Fortran’s approach to arrays and array operations at a performance comparable to both Fortran and C++ native arrays. I present it here in the hope that it will be useful to other people who find themselves in a similar situation.

https://github.com/csbrady-warwick/FARpp

Feedback, comments, suggestions and things that are missing are definitely welcomed!


I think three features that would be nice are:

  • cast to Fortran array descriptor CFI_cdesc_t *
  • multidimensional array subscript operator A[i,j,k] (since C++23)
  • cast to C++23 mdspan

The Fortran array descriptor would allow you to interoperate with Fortran codes using assumed-shape array arguments. There is a little example here: C interoperability with assumed-shape arrays - #12 by ivanpribec


Thanks for the feedback. We are considering keeping the array subscript operator free in case we implement something like coarrays in future. I’ll certainly look into the other two though.


That’s a cool idea. I recently discovered a C++ project which does something similar: GitHub - jwbuurlage/Bulk: A modern interface for implementing bulk-synchronous parallel programs.

Keeping operator() for subscripting sounds reasonable, as you can always get [] via the mdspan if you really want it. Then you could always do:

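far::Array<int,2> a(3,3);
auto a_view = far::make_mdspan(a);  // proposed helper, does not exist in FAR++ yet
a_view[i,j] = 4;                    // C++23 multidimensional subscript; int value to match Array<int,2>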
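To make the operator() point concrete, here is a minimal sketch of a rank-2 array that is subscripted as a(i,j) in the Fortran style (1-based, column-major), leaving operator[] unused. `Array2D` and its members are hypothetical names chosen for illustration; this is not FAR++'s API or implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration, not the FAR++ API: a rank-2 array with
// Fortran-style 1-based, column-major indexing through operator().
template <typename T>
class Array2D {
public:
    Array2D(std::size_t n1, std::size_t n2)
        : n1_(n1), n2_(n2), data_(n1 * n2) {}

    // a(i, j) with 1 <= i <= n1 and 1 <= j <= n2, as in Fortran.
    T& operator()(std::size_t i, std::size_t j) {
        assert(i >= 1 && i <= n1_ && j >= 1 && j <= n2_);
        return data_[(i - 1) + (j - 1) * n1_];
    }
    const T& operator()(std::size_t i, std::size_t j) const {
        assert(i >= 1 && i <= n1_ && j >= 1 && j <= n2_);
        return data_[(i - 1) + (j - 1) * n1_];
    }

    // Extent along dimension 1 or 2.
    std::size_t extent(int dim) const { return dim == 1 ? n1_ : n2_; }

    // Raw storage, first index varying fastest.
    const T* data() const { return data_.data(); }

private:
    std::size_t n1_, n2_;
    std::vector<T> data_;
};
```

With this, `Array2D<double> a(3, 3); a(2, 1) = 4.0;` writes to the second element in memory, which matches Fortran's column-major layout.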

Interesting, thanks. Please don’t make it too good :)

The manual says

By analogy with Fortran, deallocating an unallocated array will cause a std::runtime_error to be raised and trying to explicitly allocate an array when it is already allocated will also cause a std::runtime_error. The restriction on allocation only applies to explicit allocation with the allocate command. Allocation caused by assigning a different sized FAR++ array will always allocate the destination to the correct size.

The need to write

if (allocated(x)) deallocate(x)
allocate(x(n))

is something Fortran programmers have grumbled about. In your C++ class, could you add an argument to your allocate method so that it works even if the array is already allocated?

Hi, and thanks for the feedback. I did implement “reallocate”, which reallocates in one line if you know that you have an allocated array, but now that you mention it, it would be nice not to have to check the allocation status before reallocation. It would be better to have either an extra optional “reallocateIfNeeded” parameter to allocate or a special “alwaysAllocate” function. I will look into the options.
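The two behaviours under discussion (a strict allocate that throws, and an allocate that also works on an already-allocated array) could look roughly like the sketch below. `Allocatable1D` and the `reallocate_if_needed` parameter are made-up names for illustration, loosely following the “reallocateIfNeeded” idea above; none of this is FAR++'s actual interface.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>

// Hypothetical illustration, not the FAR++ API: a rank-1 allocatable array
// that follows the rules quoted from the manual, plus an opt-in flag.
template <typename T>
class Allocatable1D {
public:
    bool allocated() const { return static_cast<bool>(data_); }
    std::size_t size() const { return size_; }

    // Explicit allocation. By default an already-allocated array is an error;
    // passing reallocate_if_needed = true frees the old storage first.
    void allocate(std::size_t n, bool reallocate_if_needed = false) {
        if (allocated()) {
            if (!reallocate_if_needed)
                throw std::runtime_error("allocate: array is already allocated");
            deallocate();
        }
        data_ = std::make_unique<T[]>(n);  // value-initialises the elements
        size_ = n;
    }

    // Deallocating an unallocated array is an error, per the quoted rule.
    void deallocate() {
        if (!allocated())
            throw std::runtime_error("deallocate: array is not allocated");
        data_.reset();
        size_ = 0;
    }

    T& operator()(std::size_t i) {  // 1-based index
        assert(allocated() && i >= 1 && i <= size_);
        return data_[i - 1];
    }

private:
    std::unique_ptr<T[]> data_;
    std::size_t size_ = 0;
};
```

A defaulted flag keeps the strict behaviour for existing callers, while `x.allocate(n, true)` replaces the check-deallocate-allocate sequence with a single call.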


Thanks for the link! I will definitely look into that.

I have looked into the source, and under the hood it seems to use plenty of templates and constexprs that likely evaluate to raw pointers, so the amazing performance results seem completely believable. Does that mean that the only real advantage of Fortran over C++ has just vanished? Do we now have the same array functionality at the same performance in a language that is otherwise superior to Fortran in every aspect?

TIOBE ranks C++ 2nd and C 4th. C is still very popular, and the main advantage of C over C++ is that it is smaller, so it is feasible for a programmer to learn most of it. Modern Fortran is bigger than Fortran 77 but is still a smaller language than C++.

C++ has had the same performance as Fortran for at least the past 20 years, but you have to work harder to get it. The advantage of Fortran is that the language is simpler, so it is easier to write fast code and easier for the compiler to optimize it.


I will admit that this was something that we did think about, because Fortran is a language that we both really appreciate, recommend first for any suitable project and generally really like. After some thinking, we decided that there were several reasons why this library wasn’t something that we should just keep to ourselves:

  1. There is already a substantial pool of people who want to move to C++ just because it is seen as more “modern” than Fortran. If you already feel that drive then there are a great many things that you can point to in order to justify your preference without this library: Kokkos, SYCL and the template system, to name just a few. If people were already looking to start a project in C++ then the existing infrastructure is strong enough that they wouldn’t look at a problem and say “No, I’ll do it in Fortran”, and if they were looking to write it in Fortran, this library isn’t good enough for them to say “No, I’ll do it in C++”.

  2. Our library is not as good as Fortran. While the core “I just want to use arrays” is fairly elegant, there are a lot of rather ugly edges, particularly around elemental functions and component selection, that are forced on us by C++’s different idiom.

  3. If we can do it, so can other people. There is clearly a demand for scientific/technical tools in C++ (see the above-mentioned Kokkos and SYCL), and while we like to think that we are fairly good C++ programmers by academic standards, there are undoubtedly people better than us working on these other projects. We already know of two projects just at our home institution where people have chosen to work in C++ and have implemented the particular functions and array sizes that they want from Fortran in C++, so this is already going on. If there is a demand for this, it will be supplied sooner or later. If there is no demand then this library’s existence is not a matter for concern. In some senses, it is better that this is provided by people who actually like Fortran and want people to use it rather than people who prefer C++ and want people to move over.

  4. I hope that one might wind up with a bit of two-way traffic here. One of the reasons why we are so explicit in the documentation that this library pretty much just implements what Fortran can do already is that we hope that it will encourage people to think more about how there is an entire language that is specifically designed to do this! I know students who have been made to learn C++ but don’t particularly like the language. If you show them that Fortran’s approach is better than C++’s, then they might want to move over for the other advantages. Another thing is that people who don’t use it still see Fortran as being FORTRAN77, and think that it is an old-fashioned language. Seeing that replicating its features in C++ needs 12,000 lines of heavily templated code might change some minds.

  5. This is just a random library by two people, which should mean that it isn’t really an option for actually serious work. We aren’t a national laboratory, and we aren’t even branding it with the name of the university that we work at (since they aren’t supporting the project), so while we want to make this available to the community because we think it will be useful, I fully expect it to become one of the many libraries that has no real effect on the world.


Absolutely. Brady and I really like and value Fortran, and would turn to it even though we are decent C++ devs, because it is so much easier to write performant code. That’s one of the things we teach about it first and foremost.


The lack of standard multi-dimensional arrays has been a long-standing problem for C and C++ programmers, forcing them to invent dozens of array and tensor containers. I assembled a list at some point:

C:

C++:

Each library caters to different use cases. Some are just array containers, some offer extensive linear algebra (BLAS/LAPACK wrappers), and some even offer nice tensor syntax. There are downsides compared to built-in arrays:

  • you need to install and configure the library
  • if you write an algorithm tied to one array container, it ain’t directly reusable from other libraries (most libraries have a way to wrap an existing pointer)
  • some libraries make extensive use of template meta-programming, which can be hard to develop and leads to long compilation times
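The "wrap an existing pointer" escape hatch mentioned in the list above can be sketched as a small non-owning view. An algorithm written against the view can then be fed by any container that exposes a pointer and its extents. `View2D` and `sum_all` are hypothetical names for illustration and do not come from any of the libraries discussed here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration: a non-owning, column-major rank-2 view over
// memory that some other container (or a Fortran routine) owns.
template <typename T>
struct View2D {
    T* ptr;
    std::size_t n1, n2;

    T& operator()(std::size_t i, std::size_t j) const {  // 0-based
        assert(i < n1 && j < n2);
        return ptr[i + j * n1];
    }
};

// An algorithm written against the view rather than a specific container.
template <typename T>
T sum_all(const View2D<T>& v) {
    T total{};
    for (std::size_t j = 0; j < v.n2; ++j)
        for (std::size_t i = 0; i < v.n1; ++i)  // unit stride innermost
            total += v(i, j);
    return total;
}
```

For example, `View2D<double> v{storage.data(), 2, 3};` views a six-element `std::vector<double>` as a 2 x 3 array without copying, and writes through `v(i, j)` land in the original storage.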

I think having a Fortran-inspired library like FAR is a good place for Fortran programmers to start with modern C++ when they have reasons to do so.

It should also be easy to interoperate with existing Fortran code if needed, using the advanced interoperability features in Fortran 2018. I did an example of Eigen wrappers before: Weather and climate modeling codes from Fortran to C++ - #41 by ivanpribec


The Armadillo C++ library has multidimensional array access with x(i,j,k) as in Fortran, although the lower bounds are 0. It is used in many R packages. When will it be preferable to use FAR++ over Armadillo?

Armadillo primarily sees itself as a linear algebra package with some array-like features, and it does rather show if you try to use it for general numerical computation. It only supports arrays of rank <= 3, which is surprisingly limiting, and its syntax is quite annoying if you are thinking of it as an array library. You can’t assign a section of an array by assignment, and if you forget and try to do it anyway then it only fails at runtime. Finally, and most awkwardly, Armadillo has real performance issues when you access arrays by index rather than using array slices or whole-array operations. One of the early tests for FAR that is still in there was a simple implementation of Jacobi iteration, and it runs at almost the same speed using both slice and index notation. The same test in Armadillo is slightly slower but very much comparable when using slice notation, but takes about 1.3 times as long when using indices. Armadillo has real strengths as a linear algebra package, but some real weaknesses as an array package.
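For readers unfamiliar with the kind of test mentioned above, a Jacobi iteration in index notation looks roughly like the following. This is a plain C++ sketch over std::vector, not the FAR++ or Armadillo test code, and `jacobi_sweep` and `jacobi_solve` are hypothetical names. It says nothing about the timings quoted above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical illustration: one Jacobi sweep for Laplace's equation written
// with explicit indices over a column-major n1 x n2 grid stored in a vector.
// Returns the largest change made to any interior point.
double jacobi_sweep(const std::vector<double>& u, std::vector<double>& u_new,
                    std::size_t n1, std::size_t n2) {
    assert(u.size() == n1 * n2 && u_new.size() == n1 * n2);
    auto at = [n1](std::size_t i, std::size_t j) { return i + j * n1; };
    double max_change = 0.0;
    for (std::size_t j = 1; j + 1 < n2; ++j) {
        for (std::size_t i = 1; i + 1 < n1; ++i) {  // unit stride innermost
            u_new[at(i, j)] = 0.25 * (u[at(i - 1, j)] + u[at(i + 1, j)] +
                                      u[at(i, j - 1)] + u[at(i, j + 1)]);
            max_change = std::max(max_change,
                                  std::fabs(u_new[at(i, j)] - u[at(i, j)]));
        }
    }
    return max_change;
}

// Repeat sweeps until the update is smaller than tol; returns the sweep count.
int jacobi_solve(std::vector<double>& u, std::size_t n1, std::size_t n2,
                 double tol, int max_sweeps) {
    std::vector<double> u_new = u;  // copy carries over the fixed boundary values
    for (int sweep = 1; sweep <= max_sweeps; ++sweep) {
        double change = jacobi_sweep(u, u_new, n1, n2);
        std::swap(u, u_new);  // u now holds the latest iterate
        if (change < tol) return sweep;
    }
    return max_sweeps;
}
```

In an array library the inner double loop would collapse to a single slice expression (the average of four shifted sections), which is the "slice notation" variant the comparison refers to.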

To answer the question as asked, Armadillo is better if you are doing linear algebra with some array-like operations, FAR++ is better if you are really working with arrays as a more general concept.


Thanks for the informative answer. Maybe you could discuss the pros and cons of FAR++ vs. Armadillo and other C++ algebra packages in the Readme of your project, in case others have the same question.

Encouraged by this article, I wrote my own very simple wrapper for arrays in C++. My convolution benchmark ran 15% faster in C++ than in Fortran, despite using objects with an overloaded () operator to emulate arrays. Of course, this all got nicely inlined by a modern compiler and vectorized efficiently. One more nail in the coffin I guess :)

I’ve been writing an electronic structure code in Fortran for the past 22 years. And it’s an excellent language for this type of purpose.

The main reason is that I can hold up the mathematics next to the Fortran code and immediately see the connection. After all, FORTRAN originally stood for FORmula TRANSlation.

Add to this the fact that even a novice can write very fast code and make it parallel, so that it can run on a cluster.

Furthermore, I can run code written in the late 1950s on a modern Fortran compiler. Thanks to the work of the Fortran Standards committee, backwards compatibility is strongly enforced. Thus I can be fairly confident that our code will still compile and run decades from now.

If Fortran didn’t exist, then I would most likely have written our code in C and not C++. The reason is that I want simple code which does complicated things, not the other way around. Linus Torvalds, however, puts it better than I can.


I think that is the thing: as jkd2022 says, there is more to Fortran than pure speed. Fortran is easier to learn, easier to use for writing most programs, and so on. A big part of my job is teaching new PhD students scientific programming and we do it in Fortran, because it means that we don’t have to teach them anything like as much just about the language. They can concentrate on good programming style, testing and verification and other language-independent skills.

It is also worth saying that I suspect that you are only able to get faster than Fortran by not having some capabilities that Fortran has. Very few people use every feature of Fortran’s arrays, but I think that pretty much every feature is used by someone, so the only way to achieve those speed benefits is for everyone to write their own array library that implements the subset of Fortran array features that they use. That is not a good idea in many ways, and a lot of working scientific programmers really couldn’t do that even if it was. FAR++, Blitz++, Armadillo and other libraries offering truly comparable features are typically as fast as Fortran at best!


Agreed. Subtle bugs are less likely in a more restricted language, and I reckon far more compute time gets ‘wasted’ due to things that plain don’t work than things which are 10% slower than they could be. I mean, people put plain-Python codes on clusters that are 10 times slower than they could be and don’t seem too worried about it!