A SciPy dev here. (Speaking for myself only, not the project, so personal opinions below).
Here’s a pragmatic high-level view on issues with Fortran libs in SciPy. There are four major concerns:
- spaghetti code
- monolithic architecture
- issues with Fortran / Python glue
- build woes
Axel’s post stressed the last item, and the first one was discussed both here and elsewhere, so I’m going to focus on 2 and 3.
The second item is the main concern. Together with 1., it blocks not only fixing bugs but also adding features, perf improvements, porting to new platforms etc. Vendored bits of LINPACK are one part of it. As just one other example, consider the issue "`scipy.interpolate` needs flexible ways of constructing smoothing splines and to influence the number of knots (including manual selection of number)" (Issue #2579, scipy/scipy on GitHub).
A blocker here is that FITPACK mixes together library-specific algorithms for knot placement with bespoke QR-type manipulations of the design matrix. So even starting to experiment requires working within the monolith, and nobody has had the nerve to do it for almost a decade. Yet another example is `quad_vec`, which is a pure Python version of (parts of) QUADPACK, written to address a common request of being able to integrate vector-valued functions.
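For readers who have not used it, here is a minimal sketch of what `quad_vec` offers. The integrand `f` and the integration limits are made up for illustration; only `scipy.integrate.quad_vec` itself comes from SciPy.

```python
import numpy as np
from scipy.integrate import quad_vec


def f(x):
    # Illustrative vector-valued integrand: returns an array of shape (3,).
    return np.array([np.sin(x), np.cos(x), x**2])


# quad_vec integrates all components at once and returns the result
# together with an error estimate.
result, error_estimate = quad_vec(f, 0.0, np.pi)

# Analytic values for comparison: 2, 0 and pi**3 / 3.
expected = np.array([2.0, 0.0, np.pi**3 / 3])
print(result, error_estimate)
```

With a scalar-only integrator, the same computation would need one call per component, re-evaluating any shared work in the integrand each time.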
What we want in the long run is to break these monoliths into manageable parts.
The third item is also a concern. Historically, the Fortran-to-Python glue (Fortran-to-C API, if you will) has been a source of at least as many issues as the algorithmic code itself. Of course, f2py does a great job, mostly. We do have a mixture of f2py and manual wrappers, and while this has mostly been ironed out over time, the remaining issues are few but require a disproportionate amount of effort. These include both the joys of runaway C pointers / aliasing (of course), and things like integer widths and associated overflows, etc.
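To make the integer-width point concrete, here is a small sketch using only the standard library. It is not SciPy code: the helper names and the `n * n` workspace size are hypothetical, chosen to mimic a size that is computed in Python and then passed through a 32-bit integer argument of a wrapped routine.

```python
import ctypes


def workspace_size_int32(n):
    # Hypothetical helper: the size n*n squeezed through a 32-bit integer,
    # as a wrapper with a default-width Fortran INTEGER argument would do.
    # ctypes does no overflow checking, so the value silently wraps.
    return ctypes.c_int32(n * n).value


def workspace_size_int64(n):
    # The same size passed through a 64-bit integer.
    return ctypes.c_int64(n * n).value


n = 50_000
wrapped = workspace_size_int32(n)
exact = workspace_size_int64(n)

print(wrapped)  # -1794967296
print(exact)    # 2500000000
```

The Python side never sees an error here; the negative size only surfaces later, inside the compiled routine, which is part of why such bugs take a disproportionate effort to track down.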
We have of course heard many great suggestions on how to improve the glue story. f2py is getting better, there’s Cython, there’s ISO_C_BINDING and so on — what’s lacking is the last mile of actually demonstrating an end-to-end integration to even start considering testing on the relevant range of platforms.
Which brings me to the next issue: the elbow grease / the last mile. We’ve heard multiple times about the modernized MINPACK and PRIMA and other great efforts. These look great, but they do not get integrated into SciPy. I’m sure there are reasons, but the fact remains: nobody has so far stepped in to actually do the integration work. From the SciPy perspective, we cannot even start evaluating them, testing them, or looking at specific technical details; they just sit somewhere. Maybe they are used somewhere else already (superb if so!), but not in SciPy, and so far there is barely any activity on that.
So I wonder what SciPy can do to encourage the Fortran community to take the lead on integrating their efforts? I fully realize it’s a big ask, especially given that SciPy itself is evaluating multiple options, ranging from “rewrite it all” to “autoconvert” (yikes) to “use modernized Fortran versions” to “break the monoliths, consider Fortran kernels”. Active participation of the Fortran community would be of tremendous help.
Cheers,
Evgeni