The difficulties of using Fortran in SciPy

I didn’t say lots of bugs, just bugs. As one example, a few bugs (1, 2, 3) in corner-cases of hyp2f1 were lying around for 4-6 years until they got fixed over ~10 PRs (1, 2, 3, 4, 5, 6, 7, 8, 9, maybe more…). Another related bug is still unfixed.

If you look at the PRs that finally got merged, you can see that a lot of effort went into them (literature research, testing, accuracy comparisons etc.). That’s because someone knowledgeable dedicated themselves to the problems of that single function, but it should be clear that this was not a simple operation, resp. why the vast amount of bugs are not so lucky in attracting someone to overcome such a high barrier to fixing them.

3 Likes

In the end, pragmatism wins. As I wrote in the article: when these libraries were started, Fortran was the obvious choice to provide a lot of already-implemented functionality. Over time (before the recent resurgence), there was a dramatic lack of Fortran developers, at least those in the intersection with interest in Python. So without knowledgeable caretakers, the code fell out of maintenance, and now it’s impossible to fix without extraordinary effort.

I personally can both respect the shoulders of giants we get to stand on, without having particular reverence for the code that was used at the time. Just because people stopped looking at it over the last decade doesn’t mean it’s gold-plated – like any code it contains bugs (both discovered and undiscovered), and the only way to keep the quality high is to actively keep maintaining it.

Again, people on average overwhelmingly choose pragmatism as the fastest way to solve their issues with the tools they have at hand. This cuts both ways: if something’s good enough, it doesn’t get touched anymore, but if a better solution appears, it may “suddenly” get dropped unceremoniously. But IMO there’s no entity to render (dis)service to, only people doing the work in front of them. The fact that conditions change over time is no judgement on the work of our forebears.

3 Likes

The Fortran Discourse is here to revert shame to pride:

  • we are proud to compute with a language that is designed for numerical computing,
  • we are proud to use a language that can use parallel computing features,
  • we are proud to use an energy efficient language (compiled languages could soon regain users as we are facing problems such as energy cost and associated environmental problems, each clock cycle could become again precious like it was a few decades ago),
  • we are proud to contribute to its ecosystem,
  • we are proud of its long history,
  • we are proud because we don’t follow fashion,
  • etc.

They should instead show a “Fortran inside” logo, as some companies do with their trademark. Any artist here to design such a logo?

4 Likes

I agree with you 100% and there is no judgement or blame in saying that ditching Fortran is a disservice to it, but just the observation that Fortran-in-Python is one of the few shiny examples of Fortran’s usefulness left, and if that is lost, it may be hard to recover from that both in terms of public perception and actual usage

7 Likes

I liked the article but it missed a rather important perspective (in my admittedly biased perspective). The SciPy team didn’t really co-ordinate much with f2py (used for generating wrappers) whose build system -c flags were also becoming obsolete with 3.12. since they only used preconstructed wrapper files (this is the recommended practice, so no harm in that)… I ended up cranking out the 3.12 meson backend in a week to ensure f2py can continue to be used with 3.12 and corresponding NumPy releases (and am still cleaning up the docs this week). There were a bunch of changes relating to meson support in SciPy as well (generating more uniform wrappers) and other things.

A lot of the codes (Fortran) in SciPy are very well tested over many years. Ad-hoc python ports (which seem to be in vogue) are IMO not the best idea. Passing SciPy’s testsuite / visual inspection falls short of reproducing the original scientific contexts and publications.

As I understand it, the problem with Fortran code in SciPy is a lack of maintainers. Which would suggest that if the community here is willing to port and maintain their codes within SciPy, much like other bits of SciPy (e.g. HiGHs for linear solvers, fast_matrix_market for the .mtx files) it shouldn’t be required that the existing SciPy maintainers necessarily know and understand all the Fortran code (as also noted by @h-vetinari here).

2 Likes

ISO_C bindings are acceptable now that they are supported in f2py (please let me know / open an issue on NumPy and ping me if otherwise). I spoke to Albert (in charge of the scipy special rewrite) and he was positive about adding Fortran (modern) codes as long as f2py generates wrappers SciPy can use.

1 Like

I do not understand this. Already some 10 years ago I used to interface C++ library with a C interface in the X-Plane flight simulator (Visual C++ compiled) with MinGW-gfortran-compiled Fortran code and hit no issues whatsoever.

I know you didn’t :slight_smile: … but this other maintainer did, so I just wanted to clarify this point with you, as it participates to the general FUD against Fortran.

It’s actually a behavior that is quite common and not limited at all to issues around Fortran. I often observe that people who want to add a feature to <whatever existing> prefer forking <whatever> or even fully rewriting a <my_whatever> because they don’t want to invest time in learning how <whatever> is designed and/or in interacting with other developers. When rewriting, they generally severely underestimate the amount of work needed to fully re-implement everything, every detail, every secondary feature… Codes are rarely just raw and straightforward implementations of a theoretical method, they almost always end up containing a lot of bell & whistles that are actually important. The result of this is multiple similar tools that are doing more or less the same things, but not exactly, etc…

2 Likes

One way that the Fortran community could help to foster the Fortran-in-Python workflow, making it as easy as possible would be (IMO) to push this How to build a shared library with fpm · fortran-lang/fpm · Discussion #655 · GitHub fpm (thanks to all the developers and maintainers of fpm :pray:) is already making it so easy to work in Fortran, making it possible to build shared objects from the fpm build would make the whole thing very smooth.

btw, thanks @h-vetinari I’m also a user of scipy and this library has been a great source of help and inspiration!!

6 Likes

Jane and I have been involved in commercial training for over 20 years. The majority of the people we provide training for use Windows. A common secenrio is development on a Windows laptop or desktop pc for a large part of their work, with with access to a cluster running Linux with MPI for other parts of their work.

hyper-v and wsl both enable programmers/developers to develop for both Windows and Linux from a Windows pc.

As someone who has been developing software for over 50 years I don’t find Windows any better or worse than any of the other systems I’ve used - CDC, IBM, Cray, ICL, Dec (VMS and openVMS), Data General, HP, FPS, Harris, Apple, Linux (SuSE, Redhat, Ubuntu).

2 Likes

Here are just the ones I closed by rewriting iterative solvers (cg, cgs, bicg, bicgstab, gmres, qmr) and still open ones. There are currently about 45 fortran related issues (spread across many fortran code pieces) that we are not able to address for years.

1, 2, 3, 4, 5, 6, 7, 8, 9, 10,

There are more similar but minor requests that we deemed as unplanned which hurts the image of SciPy and reduces trust to it.

If you are interested here is the progress of rewriting cdflib

cdflib_cython

Actual fortran code is in the cdflib folder in the same directory. You can compare the go to jungle especially the subroutine ones. And I’ll leave it to your judgement ipmpar.f and spmpar.f files. My favorite is definitely dzror.f. Just chef’s kiss to maintain (took me at least 10 hour to even understand the logic and I’m supposed to be doing this on my free time).

These have even more subtle bugs that we don’t even know how to debug. Certain floating point numbers just don’t work and annoy people greatly. So when I say there are bugs it comes out of actual experience not feelings.

I understand the underlying implication that I am probably exaggerating and there can’t be this many bugs. But get to the popularity of SciPy and every nano edge is hit eventually by someone somewhere.

1 Like

The PR is a nice representative one: https://github.com/scipy/scipy/pull/18488/. I am surprised that the performance is identical — my experience is that Python is usually a lot slower for such algorithms, unless the algorithm is dominated by matmul or similar heavy array operations. It would be worth looking closely at what is going on. Take the GMRES or CGS solvers here: https://github.com/ilayn/scipy/blob/99a57bcc15a2fe83267eb0764591080f07a3ff67/scipy/sparse/linalg/_isolve/iterative.py#L570. These can’t be faster than Fortran for small matrices. They are dominated by matvec, but for small enough matrices Fortran should be faster. So in the benchmarks, we should be able to see this, but we don’t. Well, for CGS it is slower in Python as expected. But GMRES is faster — is it using the same algorithm?

Just to give an example of why Fortran loses its appeal quickly when glued to Python as underlying low level compiled code.

As I gave the example above I am rewriting the the renowned cdflib in Cython. I picked a function from this psi_fort.f as the candidate wrapped with f2py. compared to my own version called psi from https://github.com/ilayn/scipy/blob/remove_cdflib/scipy/special/_cdflib.pyx The fortran code is in

Here are the timings

>>> %timeit cypsi(10)
42.4 ns ± 0.798 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
>>> %timeit fortpsi(10)
82.3 ns ± 1.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Now I’m no expert so if there is a secret gfortran flag that would make this faster I don’t know. But when Fortran does things in its own way and insists on doing it that way, its appeal only applies to folks who are using it natively. Because all the Fortran lightning speed, which even I appreciate, is eaten up by the glue. But the glue is unfortunately needed to keep things compatible. It’s nobody’s fault if things are fundamentally incompatible.

This is not an architectural argument on how a language should be designed. This is basically a factual statement that we have while using Fortran with any other language. Like I am quoted above, I can’t care less about what Fortran does with its syntax, I don’t have that much free time to feel strong about a language I didn’t use for a very long time.

I’m strictly addressing the incompatibility issues about going against the grain and insisting on it (not as a hyperbole but backwards-compatibility wise). So row-major order is not something I like, instead it is what I have to use and how much performance I’m leaving on the table just to pass things to fortran. And thus why we are not keen on keeping such code around.

I don’t claim that I did a good job but the main trick is to keep things in-place for numpy (not generating new arrays but using infix operators +=. *= as much as possible) and you are skirting around the NumPy C speed with a little of Python overhead.

They can because of the f2py overhead. There is this concept called “reverse communication” that is you pass the result back to the caller with the arguments to be computed again, say, matvecs or preconditioning details and each back and forth eats up considerable amount.

Since we can’t pass functions as first class citizens the communication is done over f2py bridge.

EDIT: Oh and ignore the CGS and GMRES stats, there probably I’m measuring the failure speed. Somehow benchmark is broken on those two. The rest however are identical pretty much

As stated in the comments of the routine, it’s based on reverse communication, which was a way to implement generic solvers at times where OOP/callbacks etc were not available or too restrictive. I didn’t go into the details of the code, but it took me a few minutes to get the overall logic of the routine. And I’m not supercodeman… I tried once to write a reverse communication code, just out of curiosity, and could see that it was… tricky to say the least. And actually, given the limited control statements available in F77 to do that, I find this code relatively well structured.

1 Like

Well then if you have time, I’d appreciate the pseudo- or modern goto-less version if you can quickly cook it up.

I was wondering about that. Yes, this indeed is slower if you add the f2py and inefficient interface. I thought it was passing in the arrays directly via a pointer. If it is calling back into Python for each matrix operation, then the pure Python/NumPy can be faster.

This ties to our previous discussion of being able to operate on NumPy arrays directly in Fortran without any overhead, which has to be solved.

I’m a bit confused about this, of course F2PY can be used inefficiently, but one of the main points of it is being able to communicate between NumPy and F2PY… (we do have specialized routines to generate C code for working with NumPy arrays) The reverse communication problem is a design issue (it came up in HiGHs as well) and not an interface issue, i.e. if you must support reverse communication then it is generally always a copy.

That being said, it would always be best to see code where these things happen, (i.e. not just an example here in the post but an issue in NumPy highlighting the slowdown due to F2PY).

there is already a solution for that: numpy.ctypeslib posted something on this regard in the previous thread:

I’ve used the ctypeslib for crunching big arrays and saw basically the same cpu time running the shared library called from python compared to a compiled executable.

1 Like