The counter-intuitive rise of Python in scientific computing

In fact, when I learned to program in Fortran on the IBM 026 keypunch, the commercial version didn’t have an equals sign or parentheses at all. We had to type the following:

# for =
% for (
¤ for )

The mind adapts. :slight_smile:

(The New Mexico school I attended was too poor to buy the IBM 029 keypunch, which was introduced about 8 years before I started learning to program in high school.)

1 Like

I’ve been collecting Fortran code examples. I have one project that is 11+ million lines of Fortran, and several others that are over 1 million lines of code. I have several single Fortran files that are over 100,000 lines of Fortran (the largest is 576,989 lines).

The notion that the language would (more than rarely) evolve in incompatible ways is kind of terrifying. (Maybe it’s a tooling issue, but determining the dependencies on a semantic change to Fortran on an 11 million line program is more than a bit daunting.)

3 Likes

From my anecdotal experience, most of this CPU time comes from legacy codes, not from recently started projects. Fortran is very underrepresented in the two “mega trends”: artificial intelligence with neural networks, and GPU computing.

Of course, the situation is more complex than 50 years ago, when there was literally no alternative to Fortran in scientific computing. Still, the fact that Python, with its very poor native performance, became a serious competitor to Fortran shows that the Fortran language and its ecosystem lack relevant features that Python offers.

4 Likes

This is very likely true; nonetheless, it’s a driver that keeps some people using Fortran. When you are dealing with a code in a given language, you tend to continue with that language for the new developments in the project. Without these codes this would not happen, and Fortran would be dead for good.

My other point was that Python was virtually non-existent in these statistics. The main competitor to Fortran was rather C++.

Sure, but it has very little to do with the obsolescent features you were pointing out above. Regarding GPU computing, there is AFAIK no equivalent to CUDA in the Fortran world, and it’s not because of the implicit mapping or because Fortran uses ( ) instead of [ ] for arrays!

It has been discussed many times, and there are many reasons for the success of Python, which are much more important than the few obsolescent features of Fortran.

1 Like

CUDA Fortran has been available since at least 2009 in the PGI compiler. Support for CUDA is also coming to the new flang (LLVM Flang Begins Seeing NVIDIA CUDA Fortran Support - Phoronix). Unfortunately, the pgfortran compiler (now nvfortran) is stuck at the level of F2003 support. This was also one of the gripes LLNL had (An evaluation of risks associated with relying on Fortran for mission critical codes for the next 15 years):

Even when a technology provider has an in-house Fortran compiler team that supports advanced hardware, such as the case for the NVIDIA nvfortran compiler, the lack of timely and robust support for “modern Fortran” has proven a major hurdle.

1 Like

This confirms the “compilers that are lagging behind [the standard]” statement… Introducing plenty of shiny new features in the standard doesn’t help much if most compilers do not implement them.

I don’t know a lot about nvfortran, but to me the approach seems quite different from CUDA: it enables offloading some computations to the GPU, but not writing kernels. But maybe I’m wrong?

You can write kernels, just like in the C++ version. Here’s an example doing some large computations on up to 512 A100 GPUs.

The code can be found here: GitHub - copmat/LBcuda: CUDA Fortran code to solve Lattice Boltzmann equations of a multi-component flow interacting with rigid spherical particles.

The CUDA Fortran manual can be found here: CUDA Fortran Programming Guide Version 23.11 for ARM, OpenPower, x86. But you need to have NVIDIA hardware, and be willing to use language extensions other compilers don’t support. Abstracting the computational kernels behind well-defined interfaces is key if you want to later switch to different hardware.
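For readers who haven’t seen it, a CUDA Fortran kernel looks much like its CUDA C++ counterpart. Here is a minimal sketch (the module, kernel, and variable names are made up for illustration and are not taken from LBcuda; this compiles only with nvfortran or another compiler supporting the CUDA Fortran extensions):

```fortran
! Sketch of a saxpy kernel in CUDA Fortran (an NVIDIA extension).
module saxpy_mod
  use cudafor          ! CUDA Fortran runtime definitions
contains
  attributes(global) subroutine saxpy(n, a, x, y)
    integer, value :: n
    real,    value :: a
    real           :: x(n), y(n)   ! dummy arrays live in device memory
    integer :: i
    ! Global thread index (CUDA Fortran indices are 1-based)
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) y(i) = a * x(i) + y(i)
  end subroutine saxpy
end module saxpy_mod
```

Host code would allocate arrays with the `device` attribute and launch the kernel with the chevron syntax, e.g. `call saxpy<<<grid, tBlock>>>(n, 2.0, x_d, y_d)`. Hiding such launches behind well-defined interfaces, as noted above, is what makes a later port to other hardware tractable.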

2 Likes

I can’t comment on whether the standard is to blame here, or whether compiler vendors can’t find skilled people to do the work, or simply find the return on investment too low. So many new programming languages have appeared in the last decade that it’s hard to keep up.

What I witnessed at PASC is that compiler vendors like to hear success stories about their technologies (software and hardware). Promoting the achievements of your (CUDA/OpenACC/OpenMP/…) Fortran applications, and the scientific or business value they bring, helps the technology advocates at these companies make a case to their management to keep investing.

3 Likes

To me, the elephant in the room with respect to the rise of Python in scientific computing is not the basics of the language itself but the ecosystem that rapidly grew around it. People want stuff for free. They don’t want to waste their time writing their own bespoke software for a critical task if that capability already exists and is easy to incorporate into their application. The same can be said, to a certain extent, for C++ and now Julia.

Had we had visionaries in the standards committees and the compiler vendor community 20 years ago who would have pushed things like an interactive version of Fortran, intrinsic containers for lists, queues, etc., and a universal package system, instead of the various “pet projects” of some standards committee members that currently pollute the language, Fortran wouldn’t be in the shape it’s in now. Instead, Fortran has been forced to play catch-up with newer, more modern, and more “sexy” languages because of that lack of vision and, more importantly, leadership. While I’m encouraged by things like LFortran and stdlib, the realist in me says it’s probably too little, too late. Something like stdlib should have been an intrinsic part of the language decades ago.

Why has there never been an attempt to formulate a carefully thought-out plan and path that would encourage developers to upgrade their codes to use newer language constructs while still maintaining backwards compatibility for the things that matter? (In my experience, the backwards-compatibility issues that some folks obsess over only make up a relatively small fraction of a code base.) Why isn’t the capability to refactor old code as automatically as possible built into compilers, or at least provided as a set of free, vendor-supplied standalone refactoring tools?

All of this leads up to my final questions. Why doesn’t Fortran have an ecosystem built around it like Python’s? Why does any change in Fortran standards, compilers, etc. proceed at a glacial pace compared to other languages?
I’ll leave it to younger and more agile minds to figure out whether there is a way to save Fortran (I think there is a chance we can, if the powers that be just get their heads out of the sand). However, I don’t think anything of substance can be done until we rethink Fortran from the ground up.

Just my 2 cents

6 Likes

And now the missing question: what have we done, all of us, as a contribution to all of this? The rise of Python is the result of a vast community effort; nothing can happen without that. It’s too easy to always blame the same people, i.e. the committee members, for all the problems in the Fortran world.

5 Likes

Actually, none of the changes you propose would be a great problem for me, and only the use of [ ] for arrays would affect my code at all. But I fear that a compiler that required such source code would be about as popular as Perl 6, and the bulk of Fortran users and their code would stick with existing conventions.

Python has been successful not because of any greatness in the language, but because it was useful for teaching and at least adequate for many larger projects. People hate to learn anything new. The fact that it is terrible for computationally heavy work has been papered over by the easy availability of server capacity at AWS, etc. The standard’s tendency to avoid anything that might be thought of as OS-dependent has not helped, even though the OS dependence was hardly immutable in many cases.

1 Like

@PierU, unfortunately Fortran has, to a certain extent, been a slave to the whole standards committee process and the agendas of the commercial compiler vendors for decades. In my opinion, that is the major factor inhibiting the growth of Fortran, because it DOESN’T allow for broader user participation in deciding what Fortran should be and how it should evolve into the future. It’s only recently that something like a Fortran user community akin to Python’s has actually existed. For all of the committee’s assurances that they are listening to the user community at large, I don’t see that in their work product. Look at what’s in F2023, compare it with all the suggestions that have been posted on the j3-fortran/proposals site over the last several years, and tell me how many of the suggestions that didn’t originate with a committee member actually made it into the standard. I’m sorry, but as far as I can tell (again, based on their work product), the committee doesn’t exist to serve the broader user community, only the needs of a handful of committee members and compiler vendors.

The great strength of Python and Julia is that they are NOT tied to some quasi-legal international organization with rigid rules and processes that inhibit the relatively fast implementation of new ideas and features that are beneficial to ALL programmers. That gives those languages greater freedom to experiment with new concepts and new methods, then keep the ones that work and reject the things that don’t (even if that destroys backwards compatibility for some developers).

My problem with the committee is not so much with individual members but with the idea that the need for rigid standards is as great now as it was 50 years ago, when I wrote my first Fortran program. In the era when all computers were “big iron” taking up several hundred square feet of floor space, and compilers were specific to each vendor’s hardware, standards were important. Today, when almost all vendors use commodity processors (x86, ARM, POWER), it’s not uncommon for an HPC center to offer several different compilers. The DoD Cray systems I used to use had Intel, PGI/NVIDIA, and gcc/gfortran compilers in addition to Cray’s native compilers. People now choose which compiler to use based on the features they need, and care less whether those features conform to some arbitrary standard. When you have more than one option to compile with, it’s the one that offers the most features, is the most reliable, and gives you the most performant code that people will choose.

I’m not calling for the total elimination of the committee, just a rethinking of what its real purpose is, how it operates, and an honest discussion about who it actually serves today.

3 Likes

When discussing the large footprint of legacy Fortran code still in use and its role in the use of Fortran today, it’s important to distinguish when Fortran is being used because one wants to, or because one has to (often begrudgingly).

In this diagram, it’s much easier to move toward the right than it is to move upward.

9 Likes

My two cents as a Fortran developer on Windows.
I have always preferred strongly typed, compiled languages. But I must admit that whenever I need to script something, I readily use Python. And it’s not due to the syntax but rather to features like:

  • pip,
  • it’s cross-platform (i.e., it works even on Windows),
  • IDEs.

Considering all this, it is not so counter-intuitive that Python is being used so much.
If I were to set up a Fortran dev environment today on Windows (and I exclude the VS+Intel option), I would probably install MinGW, VSCode, and fortls. Then, since I do not want to reimplement a string module for the 100th time, I would probably use stdlib, which leads to installing fpm, Python, and fypp. By the time I got all this running with the right paths and environment variables, a good Python dev would already have written a new library.

I am no expert on the Fortran Committee, but I gather that they cannot be held responsible for this. That being said, do I complain that there is no intrinsic string type that can handle UTF-8? Yes, I do, every day. But that does not stop me from coding in Fortran. I love the language despite its flaws.
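To be fair, Fortran 2003 did standardize an optional ISO 10646 (UCS-4) character kind, which gfortran implements; it is just a long way from an ergonomic UTF-8 string type. A small sketch (it compiles only on compilers where `selected_char_kind('ISO_10646')` does not return -1):

```fortran
program ucs4_demo
  use iso_fortran_env, only: output_unit
  implicit none
  ! Query the optional UCS-4 kind; compilers lacking it return -1,
  ! which would make the declaration below a compile-time error.
  integer, parameter :: ucs4 = selected_char_kind('ISO_10646')
  character(kind=ucs4, len=2) :: s
  ! Build the two-character string "pi squared" from Unicode code
  ! points, avoiding any source-file encoding issues.
  s = char(int(z'03C0'), ucs4) // char(int(z'00B2'), ucs4)
  ! Reconnect stdout so the runtime emits UTF-8 on output.
  open(output_unit, encoding='UTF-8')
  write(*,*) s
end program ucs4_demo
```

Everything beyond this (validation, normalization, slicing by grapheme) is left to the programmer, which is exactly the gap being complained about here.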

6 Likes

“slave” is an extremely negative term, as if all these people had an evil agenda.

Fortran has for decades been a language primarily used by institutions and companies, with proprietary/commercial compilers, with success. So it is absolutely no surprise that the language has been driven by these people, and that’s perfectly normal.

And it should not be an excuse: neither the committee nor the classical vendors have ever prevented the community from developing alternative compilers, a standard library, or whatever.

I can’t answer; I would have to browse all the open issues on fortran-proposals. Nonetheless:

  • A lot of the open issues are just throw-in ideas; some of them can be interesting, but to go further they would need to be better formalized.
  • It seems that fortran-proposals was set up in 2019. Given the usual timeframe for a revision of the standard, and the time needed between a first throw-in idea and adoption, with many back-and-forths in between, it’s again no surprise if not a lot (if anything) went into F2023.
  • And again, a bottleneck seems to be the number of people who do the actual work.

C and C++ are also tied to a committee and an ISO standard, with the same rules and processes as Fortran, and that does not prevent a relatively fast evolution.

As I said above, a major difference is that Python, and now Julia, have had a wide and organized community for a long time. In contrast, when was fortran-lang.org created? Or stdlib? Only a couple of years ago. And ask the gfortran developers: they will tell you that they would like many more people to contribute.

Maybe I’m not part of the so-called “people”, but in my codes I do care about portability.

4 Likes

I would dare to say that the opposite is actually part of what makes certain features not evolve enough. While, industrially speaking, one can pick a given compiler as the reference, it is common to have a second backup compiler that verifies portability “just in case”. Because of that, certain features of the language go unused: if the second reference compiler does not support a feature well, then portability is not guaranteed, so to avoid lock-in the feature is not used, leading to > no publications > why should the feature be supported anyway? Two big examples: PDTs and coarrays. Hopefully the latter might get some more traction and not end up with the same fate!? :pray:

It’s not enough not to “actively prevent the community from developing…”; what is needed is to “actively encourage the community to develop…”. That’s a huge difference.

It’s like trying to get up a hill in a car. It’s not enough to take my foot off the brake; that will not get me up the hill. You need to actively press the gas.

Somebody has to lead these efforts. The best way is to go help with stdlib and fpm, or with GFortran, Flang, or LFortran. With modern tooling and a community around it, most of the issues above will be resolved.

6 Likes

@certik, after reading the post about Hare here, where they propose hosting blogs on their site, I had the idea that the fortran-lang.org site could host a Blog section fed with simple .md files!? Sometimes people here create very useful posts that would deserve a mini blog/tutorial page but that get lost in threads after some time.

2 Likes

The idea of hosting blogs was mentioned a few years ago (I did not find the thread), but it has not happened (yet).

In fact, it was discussed several times, especially in this thread:

3 Likes

I will add one more, slightly more specific reason, not yet mentioned, why Python is being considered a viable solution for scientific computing. For finite element analysis, Python has some amazing libraries that have a very low barrier to entry and scale from your laptop to 100k+ cores (or even GPUs) with minimal effort from the developer. I am referring to the following two projects:

Both libraries have an extremely low barrier to entry and allow scientists of any background to go from an equation to a numerical simulation in minutes. They also don’t suffer from the usual performance bottlenecks associated with Python applications, since they offload the heavy lifting to PETSc. There is a lot more to these packages that is not relevant to this discussion, but my point is that we face fierce competition in the scientific computing game, and as a community we should not ignore it but actively engage with it.

4 Likes