"A Perspective on Sustainable Computational Chemistry Software Development and Integration": doubtful comments about Fortran

snano · January 13, 2024, 2:16am

I came across this article (A Perspective on Sustainable Computational Chemistry Software Development and Integration) in an ACS journal and I don’t agree with the following (moving away from Fortran??):

From the perspective of sustainability, the computer language
a module is written in tends to be less important than the
languages for which it provides APIs for. For example, C-
bindings exist for many Fortran libraries, and an increasing
number of C/C++ libraries also provide Python bindings.
Generally speaking, computational science is moving away from
Fortran. While software written in Fortran is likely to persist for
some time, it is our present recommendation that developers
prioritize providing C/C++ and Python APIs regardless of the
language in which the module is written. C/C++ retains critical
roles in the software implementation ecosystem as the most
widely used languages for low-level implementation.

Ironically, many of the software packages mentioned in this paper make heavy use of Fortran and are still in active development. Many people and colleagues around me use Fortran (and many of them like the modern features of Fortran).

davidpfister · January 13, 2024, 8:41am

I am surprised that an article about sustainability does not cite the article by Pereira et al.
Fortran scored pretty well in that study.

RonShepard · January 13, 2024, 9:30am

I am a coauthor of that paper, and I was an advocate for fortran in general at the workshop. Most of the quantum chemistry legacy codes mentioned are actually mixed fortran plus C/C++, with old f77 style fortran interfaced to new developments in the other languages. I think I was the only developer who is routinely using modern fortran features. The general feeling among the other coauthors is to maintain the old legacy code with minimal effort, and to focus new efforts on the new languages. Of course I pointed out that the new (f90+) features in fortran offer many advantages (modules, derived types, allocatable arrays, etc.) and that those new features are, for the most part, not interoperable with other languages. If you want to use them, then it is best to use them in a fortran development environment.

Many of the participants were university professors. Their research model is based on a flow of grad students and postdocs, and these days, they typically have no prior experience programming in fortran, while they do have experience with other languages. A grad student or a postdoc might only do productive work for a year before they graduate or move on to the next job. It is difficult for them to learn the science, learn a new programming language, and learn the quantum chemistry application code in order to do their research during the limited time they have.

Another systemic problem is that funding agencies in chemistry typically do not fund programming efforts. They fund applications, and any programming efforts must be done on the side in order to accomplish that goal. A student or a postdoc cannot publish a paper about modernizing a legacy code, so there is little incentive to do that kind of work. There is funding and incentive for students to program modern supercomputers, such as the upcoming exascale machines. However, most of the tools on those machines are not fortran based; they are in other languages. That is why so many legacy codes are still f77: there has not been, and probably never will be, any major funding to modernize legacy codes.

That is the state of affairs in quantum chemistry. I think the wording in the final report was a fair representation of the discussions at the workshop. There was no animosity toward fortran at the meeting, it seemed to be more the idea that modern computing is going in a different direction, and the choice is either swim with the current or swim against it.
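To make the interoperability point above concrete, here is a minimal sketch of the classic `iso_c_binding` route: the modern routine takes an assumed-shape array, which C cannot call directly this way, so a thin `bind(C)` shim with an explicit length is exposed instead. The module and procedure names (`stats_core`, `mean`, `c_mean`) are made up for illustration and do not come from any of the codes discussed here.

```fortran
module stats_core
   use, intrinsic :: iso_c_binding, only: c_double, c_int
   implicit none
   private
   public :: mean, c_mean

contains

   ! Modern Fortran interface: assumed-shape array, no explicit length.
   pure function mean(v) result(m)
      real(c_double), intent(in) :: v(:)
      real(c_double) :: m
      m = sum(v) / real(size(v), c_double)
   end function mean

   ! C-callable shim; the matching C prototype would be
   !   double c_mean(const double *v, int n);
   function c_mean(v, n) result(m) bind(C, name="c_mean")
      integer(c_int), value :: n
      real(c_double), intent(in) :: v(n)
      real(c_double) :: m
      m = mean(v)   ! forward to the modern routine
   end function c_mean

end module stats_core

program demo_mean
   use, intrinsic :: iso_c_binding, only: c_double, c_int
   use stats_core, only: c_mean
   implicit none
   real(c_double) :: v(4)

   v = [1.0_c_double, 2.0_c_double, 3.0_c_double, 6.0_c_double]
   ! Self-check: the mean of 1, 2, 3, 6 is 3.
   if (abs(c_mean(v, 4_c_int) - 3.0_c_double) > 1.0e-12_c_double) error stop "unexpected mean"
   print *, c_mean(v, 4_c_int)
end program demo_mean
```

The shim is the extra layer that has to be written and maintained for every routine exposed to C, C++ or Python, which is part of why mixing modern Fortran features with other languages takes effort.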

conradoat · January 13, 2024, 9:54am

@davidpfister maybe Pereira et al. was not cited because it was not convenient.

@RonShepard totally agree. While fortran is mandatory in the physics curriculum in my university, other professors teach basically f77 slightly seasoned with very little modern fortran. People teach what they know. I was taught f77, twenty years ago, but I have converted my codes to modern fortran and now I stick to modern fortran only, but this has been an effort that probably not all people are willing to make. Now I try to force my students to use modern fortran and, in fact, some of them have been offered jobs right after graduation because of their fortran skills. I always tell them that nowadays everybody knows python, but not fortran. Skills in fortran now can make the difference.

davidpfister · January 13, 2024, 9:59am

I guess. I just wanted to give a counter reference that showed that Fortran can be sustainable. But of course it comes at the price of dealing with legacy code, which seemed to be the pain point in that study.

conradoat · January 13, 2024, 10:02am

I will try to find a reference on the impact of programming languages in Astrophysics, which I checked a while ago. Python did not finish very well at all. Fortran and C did.

hkvzjal · January 13, 2024, 10:04am

This is the key point, and why do grad/post-doc students manage to accomplish so much in so little time? Because the ecosystem in other languages is much stronger, and people are not ashamed to stand upon the shoulders of others and use libraries: say for Python with numpy, scipy, pandas, pytorch and matplotlib you can already do so much. In C++ the std and Boost also offer many features. This makes the language AND its ecosystem indistinguishable in a sense… Fortran feels like it is not there yet; the efforts behind fpm, stdlib and many others go in this direction, but it also needs active users to encourage others.

IMHO more examples and richer tutorials are needed, showing how easy it is to go from zero-to-something thanks to the Fortran ecosystem (not only the language).

certik · January 14, 2024, 1:47am

Their research model is based on a flow of grad students and postdocs, and these days, they typically have no prior experience programming in fortran, while they do have experience with other languages.

My experience is that most students know neither Fortran nor C++. They usually know some Python. In my personal experience all my students could pick up both Fortran and C++ quite quickly to be productive. Fortran is easier to pick up than C++.

MarDie · January 14, 2024, 8:44am

My experience is that most students know neither Fortran nor C++. They usually know some Python. In my personal experience all my students could pick up both Fortran and C++ quite quickly to be productive. Fortran is easier to pick up than C++.

Same here, students know neither language. However, they often had one or two courses where C/C++ was taught, which makes them believe that they know these languages. But realistically, students learn proper coding and software development during their PhD, and then there is no reason for not choosing Fortran.

conradoat · January 14, 2024, 9:50am

Another one to be sent to the ACS.

kimala · January 14, 2024, 12:36pm

I think this summarizes most problems in any field of computational sciences. Take environmental fluid mechanics: most codes nowadays are old, terribly developed, terribly documented and full of patches and workarounds. And looking at the numerics behind them, they are not even that high order. In the field of oceanography, for example, it’s very difficult to find a finite element code; everything is finite differences. This is not because finite elements are not suitable for fluid mechanics, though; it’s because the funding comes for/after the results of the paper, not for the implementation of the tools to achieve those results. If you look at industry, then, you will see that the programs they use have high order schemes and a lot of fancy functionalities, because those codes are commercial and cost a lot of money, and there is a huge investment in developing those tools.

Now, given that I know that everyone here is not paid to develop tools with fortran, and it’s terribly easy to say what we should do without actually doing it, let me say the following, with the risk of being off topic and obnoxious:
I truly think that this community should stop trying to “keep fortran relevant” and think of a way to “make fortran an interesting option”, by joining efforts not only in providing a set of interchangeable high performance tools that are easy to use in everyday programming life (as you guys are doing) but also routines and documentation to learn advanced programming with them and discover how deep the rabbit hole is. The Fortran-lang/learn webpage has a good quickstart, but what about a tutorial to create wrappers, APIs and more advanced stuff? Every single tool out there has documentation of the form “you need to do this operation, so you call this function”, but imagine if we had a website where we provide not only this quick information but also a walkthrough of the implementation to explain why it is coded in a certain way. This could be the strength of fortran, because there is no one out there teaching fortran anymore, and yet people have to learn it.
Take linear algebra as an example: how many times a week does someone ask how to write wrappers for blas and lapack? And every time, one addresses the problem of the single user in their single post. Wouldn’t it be great if the fortran-lang/learn webpage had a walkthrough for this kind of problem? It could be a prototype task, like a linear system resolution, for which the pseudocode usually does not involve more than 15 lines and is mostly basic linear algebra operations. There could be a description of how to make general wrappers for these operations, the best practices, and step by step the solution to the task, from low difficulty (i.e. a beginner code) to high performance, with a blas implementation or an openMP/openACC implementation, or a do concurrent, each behind the same API, so one not only has the choice but can also see how to treat each approach and learn the best practices. We would probably be the only ones with this kind of teaching power.
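As a rough idea of what the first, beginner-level step of such a walkthrough might look like, here is a minimal sketch: one generic name `solve` in front of a plain Gaussian elimination with partial pivoting. The names `linsolve_sketch` and `solve_naive` are invented for this post and are not stdlib's API; a LAPACK-backed or OpenMP backend could later be added behind the same generic interface without touching caller code.

```fortran
module linsolve_sketch
   use, intrinsic :: iso_fortran_env, only: dp => real64
   implicit none
   private
   public :: dp, solve

   ! One public name; further backends could be listed here later.
   interface solve
      module procedure solve_naive
   end interface solve

contains

   ! Beginner-level backend: Gaussian elimination with partial pivoting.
   function solve_naive(a, b) result(x)
      real(dp), intent(in) :: a(:, :), b(:)
      real(dp) :: x(size(b))
      real(dp) :: w(size(b), size(b) + 1), row(size(b) + 1)
      integer :: n, k, i, p

      n = size(b)
      if (size(a, 1) /= n .or. size(a, 2) /= n) error stop "solve: shape mismatch"
      w(:, 1:n) = a        ! work on an augmented copy, inputs stay untouched
      w(:, n + 1) = b

      do k = 1, n
         p = k - 1 + maxloc(abs(w(k:n, k)), dim=1)   ! row of the largest pivot
         if (w(p, k) == 0.0_dp) error stop "solve: singular matrix"
         if (p /= k) then
            row = w(k, :)
            w(k, :) = w(p, :)
            w(p, :) = row
         end if
         do i = k + 1, n   ! eliminate column k below the pivot
            w(i, k:) = w(i, k:) - (w(i, k) / w(k, k)) * w(k, k:)
         end do
      end do

      do k = n, 1, -1      ! back substitution
         x(k) = (w(k, n + 1) - dot_product(w(k, k + 1:n), x(k + 1:n))) / w(k, k)
      end do
   end function solve_naive

end module linsolve_sketch

program demo_solve
   use linsolve_sketch, only: dp, solve
   implicit none
   real(dp) :: a(2, 2), b(2), x(2)

   a = reshape([2.0_dp, 1.0_dp, 1.0_dp, 3.0_dp], [2, 2])
   b = [3.0_dp, 5.0_dp]
   x = solve(a, b)
   ! Self-check: the residual a*x - b should vanish up to rounding.
   if (maxval(abs(matmul(a, x) - b)) > 1.0e-12_dp) error stop "residual too large"
   print *, x
end program demo_solve
```

The caller only ever writes `x = solve(a, b)`; which implementation sits behind that name is a detail the tutorial could then vary step by step.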

Sorry for the rant, it’s sunday and I have nothing else to do than let my mind wander free.

hkvzjal · January 14, 2024, 1:46pm

There are efforts in this direction within stdlib which I think would be even more productive as one could play with a first high-level, general purpose API and then try to go down the Rabbit hole if needed or just too curious. You can take a look at:

github.com/fortran-lang/stdlib

Support for linear algebra

opened 02:25PM - 04 Dec 23 UTC

perazz

idea

Hello stdlib developers,

I'm opening this issue to summarize and coalesce upcoming efforts to integrate linear algebra operations in stdlib, in particular:

1) Accessible interfaces for common linear algebra operations

We want stdlib to be able to solve linear algebra tasks by wrapping against libraries like BLAS, LAPACK, SCALAPACK, with a user-friendly interface. I think this means:

- To get the best possible user-level API (almost no inputs at all), maintaining familiarity the syntax that other libraries also provide (scipy, etc.) but have an expert interface with all control knobs to tune the algorithms (e.g. via settings stored in derived types). Thanks to Fortran arrays, If this is done well, I believe stdlib could have the best linear algebra API out there;
- Wrapping against BLAS and LAPACK means that the low-level API should probably not be changed (so that down the road, stdlib could just link against external BLAS libraries or platform-dependent frameworks), but a reference implementation must be provided anyways. Should we aim at maintaining our own Modernized BLAS/LAPACK at fortran-lang, or just automate the download process from netlib? I would personally like the idea to develop a Modernized version once and forall (those reference implementations will almost never change anymore), although those are huge repos that would require significant work.

2) Support for common IO formats for matrices and tensors

- To allow efficient working with multidimensional arrays, we should support easy serialization/deserialization and conversion among formats. This could include formats like NPZ, matrixmarket, etc. An easily extendible API should be provided to have plugins for other formats which might or might not fit in the scope of stdlib.
- I believe an equally important task is to define derived types that are capable of storing temporary information which is not strictly matrix data (e.g. matrix factorization, or working arrays for the matrix solvers), to avoid unnecessary overhead in case of repeated algebra operations. This means that "simple" array storage may need to be replaced with matrix derived types in those cases.

Because there are plenty of options in defining these APIs, It is crucial to the success of this task that as much feedback as possible is given, so I would like to encourage all ideas - and criticisms - to be discussed on this issue, so that we can come up with the best possible version. I am also opening a discussion page on the [Fortran-lang Discourse](https://fortran-lang.discourse.group/t/linear-algebra-support-in-stdlib/6917) that we can use for more verbose discussions.

Thank you,
Federico

cc @certik @awvwgk @fortran-lang/stdlib @fortran-lang/admins

### Linked issues

Linear algebra (BLAS/LAPACK) #1 #10 #67 #450 #476 -> Regarding dense algebra, I've started a discussion at https://github.com/fortran-lang/stdlib/issues/450

Sparse algebra #38

github.com/fortran-lang/stdlib

The first step in facilitating BLAS integration into stdlib: add a solve function

opened 09:05AM - 06 May 23 UTC

zoziha

idea

### Motivation

Fortran arrays have a natural advantage in linear algebra, but unfortunately, stdlib currently does not integrate any high-level function implementation based on BLAS, LAPACK interfaces, such as `det`, `solve`, `inv`. For Fortran users, this is frustrating. Obviously we can use LAPACK directly, but stdlib's motivation is to be the math library that can catch up with `numpy`, and linear algebra is essential. (I understand that this may not be a purely technical issue, it may involve consensus and norms.)

I'd like to start by trying to link `openblas` in stdlib (`numpy` gives preference to `openblas` and `mkl`) and show you that we are enthusiastic about BLAS. Maybe it won't succeed directly, but I'll still try to implement `linalg.solve` like `numpy`.

`solve` is not computationally efficient: There are two array assignments before `_gesv` is called, which takes a bit of time for large arrays, but is good for ease of use. Also, since this is the first time LAPACK is encapsulated, there is no assertion on the return value `info`, so discussion is welcome.

### Prior Art

- [numpy.linalg.solve](https://numpy.org/doc/stable/reference/generated/numpy.linalg.solve.html)
- [scipy.linalg.solve](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.solve.html#scipy.linalg.solve)

### Additional Information

_No response_

github.com/fortran-lang/stdlib

Add solve based on OpenBLAS

fortran-lang:master ← zoziha:blas-1

opened 09:16AM - 06 May 23 UTC

zoziha

+284 -1

- [x] Use `pkg-config` to find `openblas` and define `USE_OPENBLAS`;
- [x] `BUILD_OPENBLAS` is off by default because stdlib has few BLAS-based functions yet.
- [x] Routines based on BLAS and LAPACK often only support single or double precision floating-point types;
- [x] Add `BLAS_common.fypp` to set the BLAS related common variables;
- [x] Add a `solve` function to `stdlib_linalg`, and test it;
- [ ] Test it in CI (MINGW✔, Linux-gfortran✔, Linux-ifort, macOS-gfortran);
- [ ] Discusses and handles the `_gesv` return value `info`.

This PR is an attempt to irrationally generate enthusiasm for BLAS integration into stdlib. Close #709 .

### Syntax

```fortran
x = [[stdlib_linalg(module):solve(interface)]](a,b)
```

Then of course, the documentation, tutorials and accessible teaching materials are paramount.
Another initiative to help in that direction is mentioned here:

github.com/fortran-lang/stdlib

Improve user documentation (feedback from students)

opened 01:30PM - 02 Mar 23 UTC

Carltoffel

documentation

I'm currently giving a Fortran lecture with twelve students, and today I introduced modules, fpm dependencies and finally the stdlib. I didn't tell them much about the stdlib, instead I asked them to try it out by themselves. Most of them directly found the documentation, but they seemed to be a bit underwhelmed by it. I'm not up-to-date with the progress of stdlib... sorry if I mention duplicate/obvious things.

Problems that occurred:

- missing or incomplete documentation
- not clear enough what *public* functions a module has (one student tried to use xoshiro256ss). I think it would be best to have a toggle for private functions
- navigating the docu: when searching in the documentation, it's easier to find the source file, which might be `.fypp` instead of the actual documentation which is a bit scary for students

After the initial frustration and confusion, they found many useful modules, e.g.:

- error_stop and check
- string type
- ascii
- bitsets
- io (npy)
- arange
- optval
- sorting
- stats

I think overall the routines convinced them, but the documentation definitely needs some improvement. That shouldn't be surprising.

Beliavsky · January 14, 2024, 2:00pm

@FedericoPerini, funded by the Sovereign Tech Fund, recently started the fortran-lapack project to modernize LAPACK and BLAS:

The following refactorings are applied:

All datatypes and accuracy constants standardized into a module (stdlib-compatible names)

Free format, lower-case style

implicit none(type, external) everywhere

BLAS modularized into a single-file module

LAPACK modularized into a single-file module

All procedures prefixed (with stdlib_, currently).

preprocessor-based OpenMP directives retained.

and I assume easy-to-use interfaces to the modernized Lapack will be built.
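For readers who have not seen this style of refactoring, here is a minimal sketch of what the listed items could look like on a single BLAS-like routine: kinds collected in a module, free format and lower case, `implicit none(type, external)`, and a prefixed name inside a module. It is only an illustration under my own assumptions; `sketch_blas`, `sketch_daxpy`, `ilp` and `dp` are placeholder names and this is not code taken from the fortran-lapack project.

```fortran
module sketch_blas
   use, intrinsic :: iso_fortran_env, only: int32, real64
   implicit none(type, external)
   private
   public :: ilp, dp, sketch_daxpy

   ! Kinds collected in one place (names chosen for illustration).
   integer, parameter :: ilp = int32
   integer, parameter :: dp = real64

contains

   ! dy := dy + da*dx, keeping the classic BLAS argument list intact.
   pure subroutine sketch_daxpy(n, da, dx, incx, dy, incy)
      integer(ilp), intent(in) :: n, incx, incy
      real(dp), intent(in) :: da
      real(dp), intent(in) :: dx(*)
      real(dp), intent(inout) :: dy(*)
      integer(ilp) :: i, ix, iy

      if (n <= 0 .or. da == 0.0_dp) return
      ix = 1
      iy = 1
      if (incx < 0) ix = (1 - n)*incx + 1   ! negative strides start from the end
      if (incy < 0) iy = (1 - n)*incy + 1
      do i = 1, n
         dy(iy) = dy(iy) + da*dx(ix)
         ix = ix + incx
         iy = iy + incy
      end do
   end subroutine sketch_daxpy

end module sketch_blas

program demo_axpy
   use sketch_blas, only: ilp, dp, sketch_daxpy
   implicit none(type, external)
   real(dp) :: x(3), y(3)

   x = [1.0_dp, 2.0_dp, 3.0_dp]
   y = [10.0_dp, 20.0_dp, 30.0_dp]
   call sketch_daxpy(3_ilp, 2.0_dp, x, 1_ilp, y, 1_ilp)
   ! Self-check: y should now hold 12, 24, 36.
   if (any(abs(y - [12.0_dp, 24.0_dp, 36.0_dp]) > 1.0e-12_dp)) error stop "unexpected result"
   print *, y
end program demo_axpy
```

Because the routine lives in a module, callers get an explicit interface and argument mismatches are caught at compile time, while the low-level argument list stays the same as in reference BLAS.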

kimala · January 14, 2024, 2:52pm

I took linear algebra as an example due to its ubiquity. I was aware of the start of the project because I was following the initial discussion, but I kind of faded out due to other obligations and missed the updates. It is nice to see that my point is starting to be proven wrong, even though it is still the case for too many other topics.

PS: thanks @Beliavsky for the edits on my English, I wrote that post before my morning coffee

jcwright · January 15, 2024, 3:23pm

The sad thing is that much of this ecosystem is built on Fortran. Even the performant parts of pandas are Fortran. Why did people put decades of effort to convert blas/lapack to c? Because languages are taught in CS and they don’t know Fortran. My students teach themselves Fortran to be able to extend workhorse codes in my field. I’ve seen Julia people including founders of the language laugh at Fortran seemingly ignorant of where most hpc cycles are going. Fortran is on hpc because it is performant, appropriate for scientific coding, and backwards compatible.

As a side note, a large HPC system I use is being crippled by I/O from thousands of Python jobs because that is how grad students using Python know how to scale their workflows.

RonShepard · March 7, 2024, 5:05pm

I think there are several reasons. One of the early reasons was that in the 1980s every unix mainframe, minicomputer, workstation, or personal computer came with a free (or cheap) C compiler, but the fortran compiler on that machine was typically a relatively expensive additional package. Thus, utilities like f2c were developed to initially allow fortran codes to be maintained/compiled on these computers. It was then a small step to just convert the codes to C and maintain them in that language. A related issue was that development tools in C became more popular than development tools in fortran. For example, there were meta codes such as GotoBLAS and OpenBLAS written in C that could automatically optimize and tune codes in C. Because of the lack of a standard macro preprocessor in fortran, such developments were more natural in C.

This was also during a dark period of fortran history where new standards incorporating new features were stymied by disagreements among vendors and users about the direction of the language and its capabilities. This was the fortran 8x period when the language almost died.

Those are the three legs of the triad. If any of those gets broken, the language probably disappears, quickly, not just on hpc but from everywhere.
