"A Perspective on Sustainable Computational Chemistry Software Development and Integration": doubtful comments about Fortran

I came across this article (A Perspective on Sustainable Computational Chemistry Software Development and Integration) in an ACS journal, and I don’t agree with the following passage (moving away from Fortran??):

From the perspective of sustainability, the computer language a module is written in tends to be less important than the languages for which it provides APIs for. For example, C-bindings exist for many Fortran libraries, and an increasing number of C/C++ libraries also provide Python bindings. Generally speaking, computational science is moving away from Fortran. While software written in Fortran is likely to persist for some time, it is our present recommendation that developers prioritize providing C/C++ and Python APIs regardless of the language in which the module is written. C/C++ retains critical roles in the software implementation ecosystem as the most widely used languages for low-level implementation.

Ironically, many of the software packages mentioned in this paper make heavy use of Fortran and are still in active development. Many people and colleagues around me use Fortran (and quite a few of them like the modern features of Fortran).


I am surprised that an article about sustainability does not cite the article by Pereira et al.
Fortran scored pretty well in that study.


I am a coauthor of that paper, and I was an advocate for fortran in general at the workshop. Most of the quantum chemistry legacy codes mentioned are actually mixed fortran plus C/C++, with old f77 style fortran interfaced to new developments in the other languages. I think I was the only developer who is routinely using modern fortran features. The general feeling among the other coauthors is to maintain the old legacy code with minimal effort, and to focus new efforts on the new languages. Of course I pointed out that the new (f90+) features in fortran offer many advantages (modules, derived types, allocatable arrays, etc.) and that those new features are, for the most part, not interoperable with other languages. If you want to use them, then it is best to use them in a fortran development environment.

Many of the participants were university professors. Their research model is based on a flow of grad students and postdocs, and these days, they typically have no prior experience programming in fortran, while they do have experience with other languages. A grad student or a postdoc might only do productive work for a year before they graduate or move on to the next job. It is difficult for them to learn the science, learn a new programming language, and learn the quantum chemistry application code in order to do their research during the limited time they have.

Another systemic problem is that funding agencies in chemistry typically do not fund programming efforts. They fund applications, and any programming efforts must be done on the side in order to accomplish that goal. A student or a postdoc cannot publish a paper about modernizing a legacy code, so there is little incentive to do that kind of work. There is funding and incentive for students to program modern supercomputers, such as the upcoming exascale machines. However, most of the tools on those machines are not fortran based; they are written in other languages. That is why so many legacy codes are still f77: there has not been, and probably never will be, any major funding to modernize legacy codes.

That is the state of affairs in quantum chemistry. I think the wording in the final report was a fair representation of the discussions at the workshop. There was no animosity toward fortran at the meeting; it seemed to be more the idea that modern computing is going in a different direction, and the choice is either to swim with the current or to swim against it.


@davidpfister maybe Pereira et al. was not cited because it was not convenient.

@RonShepard totally agree. While fortran is mandatory in the physics curriculum at my university, other professors basically teach f77 lightly seasoned with a little modern fortran. People teach what they know. I was taught f77 twenty years ago, but I have converted my codes to modern fortran and now stick to modern fortran only; this took an effort that probably not everyone is willing to make. Now I try to force my students to use modern fortran and, in fact, some of them have been offered jobs right after graduation because of their fortran skills. I always tell them that nowadays everybody knows python, but not fortran. Fortran skills can now make the difference.

I guess. I just wanted to give a counter-reference showing that Fortran can be sustainable. But of course it comes at the price of dealing with legacy code, which seemed to be the pain point in that study.

I will try to find a reference on the impact of programming languages in Astrophysics, which I checked a while ago. Python did not finish very well at all. Fortran and C did.

This is the key point: why do grad students and postdocs manage to accomplish so much in so little time? Because the ecosystem in other languages is much stronger, and people are not ashamed to stand on the shoulders of others and use libraries: in Python, with numpy, scipy, pandas, pytorch and matplotlib, you can already do so much. In C++, the standard library and Boost also offer many features. This makes the language AND its ecosystem indistinguishable in a sense… Fortran feels like it is not there yet. The efforts behind fpm, stdlib and many others go in this direction, but it also needs active users to encourage others.

IMHO more examples and richer tutorials are needed, showing how easy it is to go from zero to something thanks to the Fortran ecosystem (not only the language).


Their research model is based on a flow of grad students and postdocs, and these days, they typically have no prior experience programming in fortran, while they do have experience with other languages.

My experience is that most students know neither Fortran nor C++. They usually know some Python. In my personal experience, all my students could pick up both Fortran and C++ quickly enough to be productive. Fortran is easier to pick up than C++.


My experience is that most students know neither Fortran nor C++. They usually know some Python. In my personal experience, all my students could pick up both Fortran and C++ quickly enough to be productive. Fortran is easier to pick up than C++.

Same here: students know neither language. However, they have often had one or two courses where C/C++ was taught, which makes them believe that they know these languages. But realistically, students learn proper coding and software development during their PhD, and then there is no reason not to choose Fortran.


Another one to be sent to the ACS.

I think this summarizes most problems in any field of computational science. Take environmental fluid mechanics: most codes nowadays are old, terribly developed, terribly documented and full of patches and workarounds. And looking at the numerics behind them, they are not even that high order. In the field of oceanography, for example, it’s very difficult to find a finite element code; everything is finite differences. This is not because finite elements are unsuitable for fluid mechanics, though; it’s because the funding comes for/after the results of the paper, not for the implementation of the tools to achieve those results. If you look at industry, on the other hand, you will see that the programs they use have high-order schemes and a lot of fancy functionality, because those codes are commercial and cost a lot of money, and there is a huge investment in developing those tools.

Now, given that I know that people here are not paid to develop tools with fortran, and that it’s terribly easy to say what we should do without actually doing it, let me say the following, at the risk of being off topic and obnoxious:
I truly think that this community should stop trying to “keep fortran relevant” and think of a way to “make fortran an interesting option”, by joining efforts not only in providing a set of interchangeable high performance tools that are easy to use in everyday programming life (as you guys are doing) but also in providing routines and documentation to learn advanced programming with them and discover how deep the rabbit hole is. The Fortran-lang/learn webpage has a good quickstart, but what about a tutorial on creating wrappers, APIs and more advanced stuff? Every single tool out there has documentation of the kind “you need to do this operation, so you call this function”, but imagine if we had a website where we provide not only this quick information but also a walkthrough of the implementation that explains why it is coded in a certain way. This could be the strength of fortran, because there is no one out there teaching fortran anymore, and yet people have to learn it.
Take linear algebra as an example: how many times a week does someone ask how to write wrappers for blas and lapack? And every time, the answer addresses the problem of a single user in a single post. Wouldn’t it be great if the fortran-lang/learn webpage had a walkthrough for this kind of problem? It could be a prototype task, like solving a linear system, for which the pseudocode usually does not involve more than 15 lines and consists mostly of basic linear algebra operations. There could be a description of how to make general wrappers for these operations, the best practices, and a step-by-step solution to the task, from low difficulty (i.e. a beginner code) to high performance, with a blas implementation, an openMP/openACC implementation, or a do concurrent one, each behind the same API, so that one not only has the choice but can also see how to treat each approach and learn the best practices. We would probably be the only ones with this kind of teaching power.
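
To make the "several implementations behind the same API" idea a bit more concrete, here is a rough sketch of what the skeleton of such a walkthrough could look like. It is written in Python only for brevity; a Fortran version would presumably hide the variants behind a generic interface in a module. All names here (`solve`, `solve_naive`, `solve_lapack`, `BACKENDS`) are made up for illustration, and the library-backed variant assumes NumPy is installed.

```python
from typing import Callable, Dict, List

Matrix = List[List[float]]
Vector = List[float]


def solve_naive(a: Matrix, b: Vector) -> Vector:
    """Beginner version: Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Choose the row with the largest pivot to limit round-off error.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate the entries below the pivot.
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - s) / m[row][row]
    return x


def solve_lapack(a: Matrix, b: Vector) -> Vector:
    """Library version: numpy.linalg.solve, which relies on LAPACK's gesv."""
    import numpy as np  # optional dependency, imported only when requested

    return np.linalg.solve(np.array(a, dtype=float),
                           np.array(b, dtype=float)).tolist()


# Hypothetical registry: every variant has the same signature.
BACKENDS: Dict[str, Callable[[Matrix, Vector], Vector]] = {
    "naive": solve_naive,
    "lapack": solve_lapack,
}


def solve(a: Matrix, b: Vector, backend: str = "naive") -> Vector:
    """Single entry point; the caller picks the implementation by name."""
    return BACKENDS[backend](a, b)


if __name__ == "__main__":
    print(solve([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0]))
```

A reader can start from `solve(a, b)` and only later switch to `solve(a, b, backend="lapack")` without touching the rest of the program, which is the kind of progression from beginner code to tuned library calls described above.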

Sorry for the rant, it’s Sunday and I have nothing else to do but let my mind wander free.

There are efforts in this direction within stdlib which I think would be even more productive as one could play with a first high-level, general purpose API and then try to go down the Rabbit hole if needed or just too curious. You can take a look at:

Then of course, the documentation, tutorials and accessible teaching materials are paramount.
Another initiative to help in that direction is mentioned here:

@FedericoPerini, funded by the Sovereign Tech Fund, recently started the fortran-lapack project to modernize LAPACK and BLAS:

The following refactorings are applied:

  • All datatypes and accuracy constants standardized into a module (stdlib-compatible names)
  • Free format, lower-case style
  • implicit none(type, external) everywhere
  • BLAS modularized into a single-file module
  • LAPACK modularized into a single-file module
  • All procedures prefixed (with stdlib_, currently).
  • preprocessor-based OpenMP directives retained.

and I assume easy-to-use interfaces to the modernized Lapack will be built.


I took linear algebra as an example due to its ubiquity. I was aware of the start of the project because I was following the initial discussion, but I kind of faded out due to other obligations and missed the updates. It is nice to see that my point is starting to be proven wrong, even though it is still the case for too many other topics.

PS: thanks @Beliavsky for the edits on my english, I wrote that post before my morning coffee :smiley:

The sad thing is that much of this ecosystem is built on Fortran. Even the performant parts of pandas are Fortran. Why did people put decades of effort into converting blas/lapack to C? Because languages are taught in CS and they don’t know Fortran. My students teach themselves Fortran to be able to extend workhorse codes in my field. I’ve seen Julia people, including founders of the language, laugh at Fortran, seemingly ignorant of where most hpc cycles are going. Fortran is on hpc because it is performant, appropriate for scientific coding, and backwards compatible.

As a side note, a large HPC system I use is being crippled by I/O from thousands of python jobs because that is how grad students using python know how to scale their workflows.


I think there are several reasons. One of the early reasons was that in the 1980s every unix mainframe, minicomputer, workstation, or personal computer came with a free (or cheap) C compiler, but the fortran compiler on that machine was typically a relatively expensive additional package. Thus, utilities like f2c were developed, initially to allow fortran codes to be maintained and compiled on these computers. It was then a small step to just convert the codes to C and maintain them in that language. A related issue was that development tools in C became more popular than development tools in fortran. For example, there were meta codes such as GotoBLAS and OpenBLAS written in C that could automatically optimize and tune codes in C. Because of the lack of a standard macro preprocessor in fortran, such developments were more natural in C.

This was also during a dark period of fortran history, when new standards incorporating new features were stymied by disagreements among vendors and users about the direction of the language and its capabilities. This was the fortran 8x period, when the language almost died.

Those are the three legs of the triad. If any of those gets broken, the language probably disappears, quickly, not just on hpc but from everywhere.