While Fortran has long been the butt of the joke in IT departments the world over, in a curious twist of fate, it has seen a dramatic resurgence over the last few years. While the reasons for this are not exactly obvious (at least to us!), a few possible explanations are:
It has relatively simple syntax, and maps well to the structure of mathematical formulae.
It produces performant code, especially in combination with parallelisation frameworks like OpenMP.
Due to the large amount of Fortran code in scientific computing, and the promised performance gains through GPUs, it became an attractive target for supporting GPU computations.
As such, there was renewed vigour in the Fortran compiler space, and several important developments happened in short succession. For example, PGI / NVIDIA open sourced a version of their compiler called pgfortran with a new backend based on LLVM (more on that below), which later turned into what’s now known as “classic” Flang. In the process of trying to upstream this into LLVM itself, the compiler got rewritten completely under the name f18, which later turned into “new” Flang that eventually got merged into LLVM itself. Pretty much at the same time, another group started developing a Fortran compiler based on LLVM: LFortran.
Right when LFortran can compile about 50% of all Fortran code in SciPy. I was hoping with good compiler support (Flang today, LFortran soon) they would keep Fortran, but looks like they are not interested.
So Fortran will lose SciPy also, after many other major projects. But the way I see it is that Fortran is seeing a resurgence and quite a lot of small projects are started all the time, as documented by @Beliavsky in this thread: New Fortran projects.
I’m the author of the article, and also a maintainer (one of many!) of both SciPy and conda-forge. I had actually come here to share the blog post I had written, and was pleasantly surprised it was there already!
To make this more explicit: all the difficulties that I was concerned with were about the lack of usable Fortran compilers on windows, which is a platform we simply cannot afford to neglect.
I know that you have interacted a lot with key SciPy maintainers on this, so I’m not telling you anything new, but for the benefit of more casual readers:
The problems of Fortran in SciPy development (as opposed to distribution) are of a different kind: we have loads of ancient Fortran spaghetti code that’s almost impossible to change, because there are still bugs even in that old code, but touching it is iffy, especially since none of the remaining regular maintainers is an expert in Fortran.
For relatively easy functions, it’s much more natural for us to have these things (re)written in Cython, which has a much lower barrier to entry in terms of maintenance. It’s IMO quite unrealistic that this will be done in a year, though; the speed depends entirely on people putting in the work to rewrite things, and the lowest energy state is “no change”.
Based on the discussions that I’m aware of (though I’m only speaking for myself!), I believe that “modern” Fortran with knowledgeable and active maintainers (e.g. replacing MINPACK with its modern version or cobyla with PRIMA) would definitely still have a shot. The maintenance bit is the key aspect here – someone that maintains the Fortran code. The chances are probably way higher if it’s a separate project (and ideally there’s someone who helps us to wrap it correctly), than trying to have the code live in the SciPy repo[1].
For Fortran code beyond what’s currently in SciPy, it’s likely that the above model would have to prove itself for a while before inclusion of new functionality based on Fortran would be considered. Still, the context is not so much about Fortran itself (quoting from the issue you linked):
I have to repeat this very often, this is not a crusade against fortran, but dormant and unmaintained F77 code.
So in summary, having the windows side of dealing with Fortran become unstuck is great news. It removes a huge pain point for SciPy and opens the door to new opportunities in the future. Just bear in mind that people involved might need some time to turn the scars from the problems of the past into the joy of using modern Fortran[2].
The gcc compilers, including gfortran, do run on Windows. I’ve read the article but still don’t understand why gfortran cannot be used to build SciPy on Windows.
Because running a compiler induces an ABI. You can parse Fortran code and generate a library from it with gfortran on windows, but that library needs to survive in an ecosystem of existing libraries with the native ABI, and in an ecosystem built around the distribution of binary artefacts, it’s all about the ABI.
Historically, you could not use gfortran in a way that was compatible on windows. That has been changing recently with mingw-w64 making it possible to target the UCRT, but this is still more complicated (you need a lot of “non-Windows-native” things that you’re dragging along to run gfortran) than having a compiler that’s made for native use.
I’ve been contributing myself towards LFortran compiling SciPy and ensuring LFortran is a joy to use. I need others to take what we have and help out. For example, compiling to webassembly, see here: Use LFortran to compile any Fortran to WASM · Issue #684 · emscripten-forge/recipes · GitHub. So far nobody has volunteered. LFortran is ready, we can start with what we can already compile (50% of SciPy already).
This leads me to:
I think the truth is that it’s both. Nobody wants old F77 in fixed-form with implicit typing and goto statements. That has nothing to do with modern Fortran as a language, so that part is not a crusade against Fortran.
However, it is also true that SciPy just does not want Fortran in general, as you wrote yourself just a paragraph above, since it doesn’t have people willing to maintain it or learn it. In this sense, it is a crusade against the language itself.
The second part is about 50% psychological, due to:
While Fortran has long been the butt of the joke in IT departments the world over, …
however, we decided to fix it. To fix it, we decided to fix ALL the technical issues with modern Fortran. All of them. You name a problem, and I will tell you exactly either how we are fixing it, or what our plan is.
Assuming the technical issues are going to be fixed, the other part is to build the Fortran community so that there are enough people who like Fortran and are ok maintaining it, including in SciPy. We are doing what we can here too and making progress. Fixing the technical aspects helps: I know several completely new Fortran users who started using modern Fortran recently because they like it technically. We just need more new users. We are making good progress, as manifested by your own words:
…, in a curious twist of fate, it has seen a dramatic resurgence over the last few years.
But we need a little more time.
And a lot of this is just psychological, realizing that “it is ok to use Fortran”, as long as all technical issues are minor.
The lack of Windows compiler is just one problem. There are other issues using Fortran in SciPy that were raised in previous discussions (Using reserved words as variables - #23 by ilayn), such as:
The SciPy developers and maintainers generally aren’t familiar with Fortran and prefer C++ for new development (the number of students exposed to Python and C++ is bigger than Fortran, so who can blame them)
The Fortran routines in SciPy are fixed-form, use implicit typing, and use an “old” programming style that makes them difficult to read and extend
NumPy (and hence SciPy) uses C array layout by default; this means that temporary copies are sometimes needed when Fortran routines are to be called.
Sparse algebra codes in Fortran use 1-based indexing, whereas the SciPy sparse matrices use 0-based indexing (in principle this is addressable nowadays in Fortran through custom array bounds, but it would require work that not many are willing to invest)
Meanwhile, processors have become faster, the Python interpreter has improved, and JIT compilation has flourished, meaning that algorithms are often “fast enough” even when implemented directly in Python.
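To make the array-layout and indexing points above concrete, here is a minimal pure-Python sketch. It is not how NumPy, SciPy, or f2py handle this internally; the helper names (`c_offset`, `f_offset`, `to_fortran_order`, `csr_to_one_based`) are made up for illustration. It only shows why a row-major buffer has to be copied into column-major order before a routine expecting Fortran layout can use it, and what shifting CSR index arrays from 0-based to 1-based looks like.

```python
def c_offset(i, j, nrows, ncols):
    """Flat offset of element (i, j) in row-major (C) order."""
    return i * ncols + j


def f_offset(i, j, nrows, ncols):
    """Flat offset of element (i, j) in column-major (Fortran) order."""
    return j * nrows + i


def to_fortran_order(flat_c, nrows, ncols):
    """Copy a row-major buffer into a new column-major buffer.

    This is the kind of temporary copy a wrapper has to make when the
    caller holds C-ordered data and the callee expects Fortran order.
    """
    flat_f = [None] * (nrows * ncols)
    for i in range(nrows):
        for j in range(ncols):
            flat_f[f_offset(i, j, nrows, ncols)] = flat_c[c_offset(i, j, nrows, ncols)]
    return flat_f


def csr_to_one_based(indptr, indices):
    """Shift 0-based CSR index arrays to the 1-based convention."""
    return [p + 1 for p in indptr], [c + 1 for c in indices]


# The 2x3 matrix [[1, 2, 3], [4, 5, 6]] stored row by row (C order).
a_c = [1, 2, 3, 4, 5, 6]
a_f = to_fortran_order(a_c, 2, 3)
print(a_f)  # [1, 4, 2, 5, 3, 6], i.e. stored column by column

# CSR structure of [[10, 0], [0, 20]] with 0-based indices.
print(csr_to_one_based([0, 1, 2], [0, 1]))  # ([1, 2, 3], [1, 2])
```

In both cases the conversion is mechanical, but it costs an extra pass over the data (and extra memory for the copy), which is the overhead the list items above refer to.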
Anyways, I found the Quansight article interesting. Thanks for sharing it. Even cooler now that I know the author @h-vetinari is also part of our community.
Speaking for myself as a former academic user, I had access to both MATLAB and Mathematica at university, and hence no incentive to contribute to SciPy in that period. Now I mainly work with compiled languages, and only use Python for plotting and scripting.
It’s hard to say what the reason is for the insufficient investment in open-source Fortran compilers. Here are just a few of my speculations:
Lack of Fortran users able and willing to contribute to open-source compiler development; Fortran users are generally domain experts first, and programmers second. We tend to take tools such as compilers for granted.
Fortran users tend to be part of enterprises and their needs are met by the proprietary compilers from numerous vendors including Intel, NAG, NVIDIA, Silverfrost, previously also Lahey and Absoft.
Fortran stakeholders (users, educators, and even the committee) are poor communicators. In other programming language communities, much more effort (and money) goes toward dissemination, education, conferences, and even advertising, frankly. Essentially, they make themselves heard, whereas Fortran users often don’t even mention what language they work in for fear of being someone’s joke.
Just as an example regarding the last point, both Bjarne Stroustrup (C++) and Guido van Rossum (Python) appeared on the Lex Fridman channel. Bjarne regularly appears at C++ conferences, talking about how C++ is important, and permeates all domains of computing, as do speakers from a bunch of other companies (Adobe, Bloomberg, NVIDIA, Apple, Google, etc.). (I found a recent talk at Strange Loop, “The Economics of Programming Languages”, enlightening.)
In contrast, look at the pages of WRF, ICON, or ECMWF, some of the main operational weather forecast models in daily use. It takes a dozen clicks before you find out these things are implemented in Fortran. It’s similar with Ansys CFX, CalculiX (in use at MTU Aero Engines). I suspect (but don’t know for sure) that STAR-CCM+ (from Siemens) and LS-DYNA (Ansys) are also in Fortran. AVBP is written in Fortran and in use by the industry including Safran Group, Air Liquide, Total, and Airbus. OpenFAST is a wind turbine simulation tool in Fortran from the DoE. OpenRadioss for shock and impact problems is also Fortran.
Essentially, the weather, transport, and energy sectors are all heavy users of Fortran (which isn’t surprising given the maturity of these industries), but few realize this. However, little of the money trickles back into Fortran compiler development.
Apart from the DoE (that has funded flang development), Siemens (that has funded GCC development through their subsidiary Mentor), or the (German) Sovereign Tech Fund (that has funded LFortran development), I’m not aware of any direct support from enterprises toward gfortran, flang, or LFortran. (Just for clarity, I mean direct support from parties that are not also chip vendors.) Hopefully, we can make this change in the future.
Edit: I believe that Intel and AMD contribute to GCC occasionally to add support for their new architectures. Not sure if they contribute to the Fortran frontend in terms of language features and bug fixes.
It’s a chicken and egg problem, because you first have to show people the utility of something before you can expect them to invest time into a longterm pay-off. It’s also not like SciPy maintainers (or FOSS projects in general) are short on things on their to-do lists, so adding any new language[1] that maintainers need to be fluent in is an enormous hurdle, and basically never happens. Modern Fortran has an advantage there because at least one could integrate something like PRIMA without introducing new infrastructure.
To be clear, I have absolutely enormous respect for your work – what you’ve pulled off already is stunning.
I also don’t make the decisions about what gets used. Over the very long term, I agree that it is psychological and probably boils down to popularity (not just Fortran, but everything in this space – Python, the libraries, the compilers, the build tools, etc.).
If you can manage to consistently attract new users and some percentage of them get actively involved in various places where they’d like to see Fortran, then the rest will follow. Though overcoming all the ingrained FUD will probably take a while.
Well, the resurgence already speaks to your success so far – good luck that it keeps going like that!
@h-vetinari indeed. Another analogy I like to use is harvesting fruits: the existing Fortran trees are very old and we are expecting to pick fruits, but have not planted any new trees in several decades. So until new trees are planted and they bear fruit, we are depending on the old trees which have many technical issues that are unfixable. The Flang tree was planted in 2012 I believe, and only recently it started giving fruits for SciPy. The LFortran tree was publicly planted in 2019, fruits are almost ready. A mature project like SciPy however requires well established production trees with regular fruits.
So while we are not quite there yet, I think we will have some tasty new fruits soon in Fortran!
This is an interesting statement. From my experience, building software from source in general is more difficult on Windows, at least when the software depends on other libraries. That is why a lot of libraries for science and engineering are difficult to use on Windows. As far as I know, the situation for web development is similar, but I’m not working in this area.
So I ask myself why people care so much about Windows compatibility when it comes to science and engineering even though Windows plays absolutely no role in high performance computing. Since the arrival of the Windows Subsystem for Linux, there is an alternative that makes almost all the established tools from Linux/Unix available on Windows.
Windows matters because somewhere between 1/3 and 2/3 of students (depending on school) have windows laptops. Windows plays no role in HPC, but given the power of a modern laptop, HPC is fairly niche.
Building for windows is an {et,inf}ernal struggle, that’s true, but despite being of no importance in HPC, it’s of tremendous importance in universities, companies, and a large set of the user (and even developer) population being targeted by modern programming languages. In terms of user numbers (though not in terms of many other metrics of course!) HPC is the niche, not windows.
5 years ago, windows support in the Python ecosystem was spotty or non-existent, but over that time, expectations have changed. Common libraries are expected to provide builds for windows.
That’s a non-starter in many managed deployments, aside from being (depending on circumstances) up to an order of magnitude slower than native software.
I’ll speak from my side of the pond. Actually windows plays quite a role here, but since it is mainly on the industrial side, there’s less publicity… For many, many years already, one has been able to get, with a little investment, a pretty nice workstation with a couple of high-performance NUMA processors giving you, say, 32 cores! It is not HPC in the sense of running on big clusters, but you can do pretty nice things with that for engineering and scientific purposes without having to leave the comfortable environment of a Windows desktop… And when you are ready to make the leap to HPC for bigger computations, you can send the simulation to a Linux cluster from the windows desktop without leaving the OS, in the end not caring much about the fact that it is Linux.
You can actually make the parallel between any compiled language and Python. Python became so popular because for the end user the learning curve is just minimal compared with all the quirks related to build systems and so on. Most of the time people don’t know about all of the heavy lifting that happened behind…
And just as I’m guessing many here and outside do, you go to python for testing ideas and once you have something workable you move back to the compiled language. Users of compiled software can test their ideas first on their windows workstations and then move to a Linux cluster at the end of the design process.
In the end, all of these problems boil down to the learning curve (language, OS). How fast can one go from nothing to something? How many things is one required to know and “master” before having a first result available?
Yes, WSL is quite nice. I started using it a few years back with WSL1 and moved to WSL2 already from the experimental version. But it is not free of its own quirks, and it demands a fair level of experience. Migration between frameworks takes a lot of time (years) when talking about industrial use of software. Every change has to be heavily validated before you can even start thinking about using this new framework in production; once it is running, you don’t change it unless there’s a veeeery good reason.
In my opinion the reason is that Microsoft does not care because science and engineering is a niche for them. Wildly guessing, only 5% of Windows PCs have Python or a compiler installed.
I agree. I started software development for science and engineering 15 years ago and switched to Linux simply because it made life so much easier and since then I avoided all the trouble that comes with building software on Windows.
I did not know that. Probably WSL is then only a solution for web developers, which is probably the bigger community for development on Windows.
@h-vetinari: Many thanks for the heroic efforts of maintaining SciPy. I really appreciate that.
@h-vetinari I have made a modern-fortran port of FITPACK that could replace the one in scipy.interpolate. If there is any interest, and an ISO C ABI could work, please let me know and I could write the C interface to all classes.
Normally I would say a working PR is the most convincing argument, but in this case, I think it’s best to start with an RFC issue (coupled with a post to the dev mailing list) that lays out:
what issues the new fitpack solves compared to the old one (probably helpful to refer to existing issues there and point out how they’d get solved or be easily solvable).
why it is well-maintained, and arguments why that state of affairs will continue
how to integrate this with SciPy (e.g. through a submodule)
why this adds no substantial maintenance effort for SciPy (e.g.: has been tested successfully to compile with all the compilers we use across various platforms)
how this relates to other Fortran efforts in SciPy (Minpack, Cobyla, …)
The last two points may sound like an unfair hurdle, and are not strictly speaking necessary, but given that this would be the first Fortran module in a while[1], and the first “modern” Fortran module ever, such an effort will probably set the tone for the next few such modules that people have already been discussing, so you might want to collaborate with @certik et al. to ensure that the initial proposal is well-aligned and solid.
In the other thread on this topic, a scipy maintainer also talked about a lot of bugs in the legacy F77 codes. This surprised me, because my experience is the opposite: I am using some legacy codes (both some famous public ones and some proprietary), and although I would not say they are bug-free (no software is completely bug-free) the number of bugs I could find over the years is absolutely minimal (I mean something like one every 5 years). It’s true that when it happens it can be a pain to fix them properly (possibly spaghetti code (*), and the main problem: nobody really knows the code). Sometimes a quick and dirty fix can do. But it’s really uncommon overall: the point is that these famous legacy codes have been heavily used over the years, and virtually all the initial bugs have been fixed in a more or less distant past, so that the remaining ones are uncommon and generally not critical. I’m sure that many people here share the same experience. The advantage with a “dormant code” is that no new bug is introduced …
When asked to point to these “lot of bugs” (open issues or whatever), this maintainer never really answered.
I fully get (almost) all your other concerns, such as the lack of a native compiler on Windows, but I am a bit skeptical about the “lot of bugs” concern.
(*) but not all F77 codes are spaghetti… Some old F77 codes are well written and readable (within the limits of what F77 allowed at that time, of course). And I sometimes see modern codes that are nonsense and unreadable/unmaintainable by anyone but the one who has written it…
Thank you for your advice. I am no scipy user (shout out to you and all the developers on a tremendous tool!) and that’s why I would hope that someone with expert knowledge of it will at some point take an interest in doing what you’re suggesting.
To be fair, these are all problems that we fortran developers also face all the time (why change something that works? Especially if changing it may break things? How to interface to non-Fortran code?).
But it seems that, to survive, 50+ year old Fortran code needs to be modernized anyway, and the tiny Fortran community is already doing tremendous work (mostly for free) on exactly that.
If I was a scipy developer, I would also be very upset that I need to ditch robust code just because “there are no more compilers”, so I understand it may be a better trade-off to rewrite the same code in C or Python rather than modernizing Fortran (as it wouldn’t solve the compiler issue). At the end of the day, people keep reinventing wheels all the time.
However, tbh that would be a disservice to a language (and its community) that’s kept scipy’s core going for 20 years.