The difficulties of using Fortran in SciPy

ilayn · November 9, 2023, 7:03pm

Yes but I have no idea how I can do such analysis. No idea what that thousands of lines code does internally. But the example above is not a carefully crafted one, typically if I write something in Cython for C-contiguous array, it is almost always faster than the equivalent fortran code. It’s not f2py fault but it has to interject itself to do the adapter work. Not sure why it came out as if I blame f2py for it.

Take any mid/big NumPy array create a slice with every other row and column. Pass that to fortran to see the speed for noncontiguous array pass to fortran. Do the same ops with NumPy and typically NumPy will win because it understands its memory. This is not a criticism it’s an incompatibility.

hkvzjal · November 9, 2023, 7:17pm

@ilayn indeed, the slicing issue is a major one and I remember one of your examples in which you showed the problem of calling a Fortran function to work on a slice of the array with something of the sorts:

fortran_fun( A[10:20,10:20] )

Which does imply a copy of the slice of memory. A “minimal” interface change could solve the efficiency issue:

fortran_fun( A , [10,20] , [10,20] )

Meaning: pass the address of the memory and the sections of the slice you want to work as argument. The caller Fortran function can understand this and internally do the operations on the slice without copying the slice for the sake of seeing it as a contigous region.

certik · November 9, 2023, 7:40pm

One problem with non-contiguous arrays (slicing) even in pure Fortran (or any other language) is that the operations on them (such as matmul) are typically slower than on contiguous arrays. Not always, but often times a temporary copy of non-contiguous array to a contiguous one actually makes things faster.

I don’t have a clear opinion on this yet, but for my fast codes I usually only use contiguous arrays.

For interactive use and prototyping, being able to use slices is great, so the language should allow it (and both Fortran and NumPy does), but for production when performance matters, it seem it’s not that great.

hkvzjal · November 9, 2023, 7:51pm

Totally true @certik !! I also find it annoying that while both Fortran and Numpy allow working on slices quite naturally, the gluing obliges either a copy or a change of user interface.

And you are right, even in pure Fortran one can see big differences if copying sparsely slices in case it is needed.

I’m just guessing that (this is pure speculation) already passing the whole array + the slices as individual arguments could show some major improvements in the binding (even if the Fortran code works on the slices but knowing exactly the jumps it has to make). From there it could be worked out how to push even further.

h-vetinari · November 9, 2023, 10:17pm

If that’s the case, then the Fortran community should join the effort in helping us maintain this. It might mean you have to learn some Python, but we’ll be happy to provide guidance on that.

The article is already monstrously long, so I had to edit things down to the core story wherever possible. f2py is important, but wouldn’t solve the actual compilation situation in itself. Thanks for fixing this and making it ready for Python 3.12!

Absence of proof is not proof of absence. You could have been lucky enough to not use the parts of the interface that differ between the two C libraries, for example. It’s easy to break the ABI even with the same compiler, e.g. MKL’s LAPACK uses the G77 ABI, and if you use gfortran’s default, things will break.

RonShepard · November 9, 2023, 10:20pm

Another twist to this is that many matrix operations are performed on subblocks of the input matrix (or matrices). The possibly noncontiguous subblocks are copied to either local contiguous memory for the purpose of localizing it within level-1 cache, or the subblock is copied directly to something like vector registers or gpu memory. The floating point operations are then done using whatever optimal hardware is available, and the results are then copied back to the noncontiguous output array subblocks. Sometimes making a local contiguous copy of the entire array at the beginning is redundant with this low-level subblocking, adding yet another layer of memory copy overhead, sometimes it facilitates it.

certik · November 9, 2023, 10:49pm

@RonShepard yes, that’s exactly right.

PierU · November 9, 2023, 11:21pm

Here is an attempt. I made minimal modernization, just getting rid of the labels and GOTO’s and converting to free-form. There would be a bit more to do…

Actually, most of the goto’s were simulations of IF/THEN/ELSE blocks. Some other ones were more tricky.

The code structure remains non conventional because of the reverse communication constraints. And IMO the main issue with the routine is not the GOTO’s, but the SAVE statement that makes all the local variables static, in order to keep the values between calls. Hence the routine is not thread-safe. At the time the code was written the only alternative would have been to pass all these variables in the argumens. With current Fortran the right way would be to put them in a type and pass a single argument, and enclose everything in a module.

I have quickly tested it, but more tests would be needed.

SUBROUTINE dzror(status,x,fx,xlo,xhi,qleft,qhi)
!**********************************************************************
IMPLICIT NONE

   ! Scalar Arguments
   DOUBLE PRECISION fx,x,xhi,xlo,zabstl,zreltl,zxhi,zxlo
   INTEGER status
   LOGICAL qhi,qleft

   ! Save statement !!! makes the routine non thread safe !!!
   SAVE

   ! Local Scalars
   DOUBLE PRECISION a,abstol,b,c,d,fa,fb,fc,fd,fda,fdb,m,mb,p,q, &
                    reltol,tol,w,xxhi,xxlo,zx
   INTEGER ext
   CHARACTER(16) phase
   LOGICAL first,qrzero

   ! Intrinsic Functions
   INTRINSIC abs,max,sign

   ! Statement Functions
   DOUBLE PRECISION ftol

   ! Statement Function definitions
   ftol(zx) = 0.5D0*max(abstol,reltol*abs(zx))

   ! Executable Statements ..

   ! Fake infinite loop to simulate jumps to specific phases
   ! phase = 'XXX' ; RETURN 
   !    exits the routine to collect a new function value, and come back at phase "XXX"
   ! PHASE = 'XXX' [; CYCLE]
   !    equivalent to jump to the phase "XXX"
   !    CYCLE can be omitted if "XXX" is located after in the code and if this is the
   !    last instruction of the current phase
   CONTROL_LOOP: DO     
      
      IF (phase == 'START') THEN
         xlo = xxlo
         xhi = xxhi
         b = xlo
         x = xlo
         ! GET-FUNCTION-VALUE
         status = 1
         phase = 'PHASE1';  RETURN
      END IF
      IF (phase == 'PHASE1') THEN
         fb = fx
         xlo = xhi
         a = xlo
         x = xlo
         ! GET-FUNCTION-VALUE
         phase = 'PHASE2'; RETURN
      END IF
      IF (phase == 'PHASE2') THEN
         ! Check that F(ZXLO) < 0 < F(ZXHI)  or
         !            F(ZXLO) > 0 > F(ZXHI)
                      
         IF (fb.LT.0.0D0 .AND. fx.LT.0.0D0) THEN
            qleft = fx .LT. fb
            qhi = .FALSE.
            status = -1
            RETURN   ! termination
         END IF 

         IF (fb.GT.0.0D0 .AND. fx.GT.0.0D0) THEN
            qleft = fx .GT. fb
            qhi = .TRUE.
            status = -1
            RETURN   ! termination
         END IF

         fa = fx

         first = .TRUE.
         phase = 'PHASE2_B'
      END IF
      
      IF (phase == 'PHASE2_B') THEN   
         c = a
         fc = fa
         ext = 0
         phase = 'PHASE2_C'
      END IF
      
      IF (phase == 'PHASE2_C') THEN   
         IF (abs(fc).LT.abs(fb)) THEN
            IF (abs(fc).LT.abs(fb)) THEN
               IF (c.NE.a) THEN
                  d = a
                  fd = fa
               END IF
               a = b
               fa = fb
               xlo = c
               b = xlo
               fb = fc
               c = a
               fc = fa
            END IF
         END IF
         tol = ftol(xlo)
         m = (c+b)*.5D0
         mb = m - b
         IF (.NOT. (abs(mb).GT.tol)) THEN
            phase = 'PHASE3_B'
         ELSE
            IF (ext.GT.3) THEN
               w = mb
            ELSE
               tol = sign(tol,mb)
               p = (b-a)*fb
               IF (first) THEN
                  q = fa - fb
                  first = .FALSE.
               ELSE
                  fdb = (fd-fb)/ (d-b)
                  fda = (fd-fa)/ (d-a)
                  p = fda*p
                  q = fdb*fa - fda*fb
               END IF
               IF (p.LT.0.0D0) THEN
                  p = -p
                  q = -q
               END IF
               IF (ext.EQ.3) p = p*2.0D0
               IF ((p*1.0D0).EQ.0.0D0 .OR. p.LE.(q*tol)) THEN
                  w = tol
               ELSE
                  IF (p.LT. (mb*q)) THEN
                     w = p/q
                  ELSE
                     w = mb
                  END IF
                  CONTINUE
               END IF
               CONTINUE
            END IF
            d = a
            fd = fa
            a = b
            fa = fb
            b = b + w
            xlo = b
            x = xlo
            ! GET-FUNCTION-VALUE
            phase = 'PHASE3' ; RETURN
         END IF
      END IF
      
      IF (phase == 'PHASE3') THEN
         fb = fx
         IF ((fc*fb).GE.0.0D0) THEN
            phase = 'PHASE2_B' ; CYCLE CONTROL_LOOP
         END IF
         IF (w.EQ.mb) THEN
            ext = 0
         ELSE
            ext = ext + 1
         END IF
         
         phase = 'PHASE2_C' ; CYCLE CONTROL_LOOP
      END IF
      
      IF (phase == 'PHASE3_B') THEN
         xhi = c
         qrzero = (fc.GE.0.0D0 .AND. fb.LE.0.0D0) .OR. &
                  (fc.LT.0.0D0 .AND. fb.GE.0.0D0)
         IF (qrzero) THEN
            status = 0
         ELSE
            status = -1
         END IF
         RETURN   ! termination
      END IF

      EXIT CONTROL_LOOP
      
   END DO CONTROL_LOOP    

ENTRY dstzr(zxlo,zxhi,zabstl,zreltl)
!**********************************************************************
   ! Reset all saved variables to known values
   a = 0d0
   abstol = 0d0
   b = 0d0
   c = 0d0
   d = 0d0
   fa = 0d0
   fb = 0d0
   fc = 0d0
   fd = 0d0
   fda = 0d0
   fdb = 0d0
   m = 0d0
   mb = 0d0
   p = 0d0
   q = 0d0
   reltol = 0d0
   tol = 0d0
   w = 0d0
   xxhi = 0d0
   xxlo = 0d0
   zx = 0d0
   ext = 0
   phase = 'START'
   first = .FALSE.
   qrzero = .FALSE.

   ! Set initial values
   xxlo = zxlo
   xxhi = zxhi
   abstol = zabstl
   reltol = zreltl
   RETURN

   ! The following stop statement can never be reached (??)
   STOP '*** EXECUTION FLOWING INTO FLECS PROCEDURES ***'

END

ilayn · November 9, 2023, 11:42pm

Ah thank you very much. The reason why I mentioned the maintenance is the way this thing is called from other places. I agree with the SAVE bonanza but I already accepted my fate. My issue was that I wasn’t able to parse all that i99999 business.

Still this is used in a few subroutines such as cdfnbn.f and from that I think I’ll lose some days on this one…

But anyways, that’s on me. Thanks again.

RonShepard · November 10, 2023, 1:00am

I dislike reverse communication implementations for this reason. It makes even simple algorithms complicated and obscure, difficult to maintain, and difficult to modify. Even f77 provided a straightforward way to do this with procedure arguments, I wonder why people didn’t use that facility. Now in modern fortran, it is even easier to do with internal procedures as arguments, and most compilers even can check that the interfaces are all consistent.

ivanpribec · November 10, 2023, 1:26am

One valid reason to use reverse communication is in mixed-language environments, as explained by this write-up. For example in Microsofts Component Object Model (COM) you could have inter-process communication among objects (procedures) implemented in different languages.

In direct communication, the procedure argument must be compiled into the executable statically or loaded dynamically. (I’m not sure the latter was possible in F77, and even now since F2003 it can only be achieved via C bindings.)

PierU · November 10, 2023, 8:17am

True, in F77 you could already pass a procedure as argument, but the procedure interface was then enforced by the solver. So, it was up to the user of the solver to write a wrapper to its actual function in order to fit the expected interface, and for complex functions (complexity, not complex numbers) the wrapper could be quite obscure. Typically the expected interface was something like:

real function f(x,rpar,ipar)
   real x   ! or x(n) for multivariable functions
   real rpar(*)
   integer ipar(*)
end function

with the rpar and ipar arrays designed to store all possible parameters needed to evaluate the function.

With this scheme, the pressure was put on the user of the solver. Reverse communication is more difficult to write and maintain, but once written it’s more user friendly, you don’t have to wrap the functions you are solving.

Anyway, modern Fortran allows now writing solvers that are both easy to write and easy to use, without the need of reverse communication.

feenberg · November 10, 2023, 11:39am

I don’t understand the problem with libraries. Is it difficult to make a library from fortran source in Windows? I have only done it once, but had no difficulty. Is the problem dependencies in the Fortran code? I don’t see why the functions suitable for SciPy to call on Fortran for (such as matrix manipulations) should have dependencies. I know that it is difficult to mix Fortran I/O with another language. Is that the problem? I know that inI recall that IMSL libraries used to advertise as a feature that they were I/O free, just to address this problem.

ev-br · November 11, 2023, 11:56am

A scipy dev here. (Speaking for myself only, not the project, so personal opinions below).

Here’s a pragmatic high-level view on issues with Fortran libs in SciPy. There are four major concerns:

spaghetti code
monolithic architecture
issues with fortran / python glue
build woes

Axel’s post stressed the last item, and the first one was discussed both here and elsewhere, so I’m going to focus on 2 and 3.

The second item is the main concern. Together with 1., it blocks not only fixing bugs but also adding features, perf improvements, porting to new platforms etc. Vendored bits of LINPACK are one part of it. As just one other example, consider `scipy.interpolate` needs flexible ways of constructing smoothing splines and to influence the number of knots (including manual selection of number) · Issue #2579 · scipy/scipy · GitHub.
A blocker here is that FITPACK mixes together library-specific algorithms for knot placement with bespoke QR-type manipulations of the design matrix. So to even start experimenting requires working within the monolith and nobody has the nerve to do it for almost a decade. Yet another example is quad_vec which is a pure python version of (parts of) QUADPACK, written to address a common request of being able to integrate vector-valued functions.
What we want in the long run is to break these monoliths into manageable parts.

The third item is also a concern. Historically, the Fortran-to-python glue (Fortran-to-C API, if you will) is a source of at least as many issues as the algorithmic code itself. Of course, f2py does a great job — mostly. We do have a mixture of f2py and manual wrappers, and while all this has been mostly ironed out over time, remaining issues are few but require disproportionate amount of effort. This is both joys of runaway C pointers / aliasing (of course), and things like integer widths and associated overflows etc etc etc.
We have of course heard many great suggestions on how to improve the glue story. f2py is getting better, there’s Cython, there’s ISO_C_BINDING and so on — what’s lacking is the last mile of actually demonstrating an end-to-end integration to even start considering testing on the relevant range of platforms.

Which brings me to the next issue: the elbow grease / the last mile. We’ve heard multiple times about the modernized MINPACK and PRIMA and other great efforts. These look great — but they do not get integrated into SciPy. I’m sure there are reasons, but the fact remains: nobody has so far stepped in to actually do the integration work. From the SciPy perspective, we cannot even start evaluating them, testing them, or looking at specific technical details; they just sit somewhere. Maybe they are used somewere else already (superb if so!), but not in SciPy and so far there is barely any activity on that.

So I wonder what SciPy can do to encourage the Fortran community to take a lead on integrating their efforts? I fully realize it’s a big ask, especially given that SciPy itself is evaluating multiple options, ranging from “rewrite it all” to “autoconvert” (yikes) to “use modernized Fortran versions” to “break the monoliths, consider Fortran kernels”. An active participation of the Fortran community would be of tremendous help.

Cheers,

Evgeni

certik · November 11, 2023, 11:35pm

@ev-br thank you for the comment and welcome to the Forum.

I think this pretty much summarizes it. In the end the technology is chosen by those that are doing the work. I am trying to help as much as I can, but right now I still need to focus on just compiling the Fortran parts, not the wrappers. My plan for the wrappers is to automate them with LFortran itself. But it still requires a lot of work. If anybody here can help, that would be awesome.

steppi · November 11, 2023, 11:55pm

Hi,

I’m little late to the party; another SciPy dev here. I agree with @ev-br that it would be great to see more involvement from the Fortran community, and wanted to give a concrete idea of what steps could be taken in special, the submodule I help maintain.

I think most of the bug reports in special come from a library we vendor as specfun. This is not William J. Cody’s specfun that appears in Netlib. It instead comes from the book Computation of Special Functions by Shanjie Zhang and Jianming Jin. This code was written for teaching purposes and contemporaneous reviews, 1, 2, speak to it not being intended for production use. There just aren’t many options for permissively licensed implementations of special functions of a complex variable.

The story here is a bit simpler on the Fortran/Python glue side though. We just need functions with scalar inputs and scalar output which then get plugged into NumPy’s ufunc machinery for vectorization, which in the Python world just means the function will operate over the values of an array without Python overhead.

@rgoswami mentioned I’d be positive about seeing modern Fortran in special. As Ralf Gommers mentioned here, we’re prioritizing the top 50 or so functions to write in C++ for interoperability reasons. There are 264 ufuncs in special though of which 94 have their scalar kernels written in Fortran. If some Fortran devs want to help replace some of our old buggy F77 special function implementations with clean modern Fortran versions, and help maintain this code, I think that would be great; I just want to see special get into better shape, and would appreciate the help. As a starting point, it would be nice to see this issue about integrating Arnie Van Buren’s high quality modern Fortran implementations of the Spheroidal Wave Functions into SciPy get attention; I was already planning on looking into it early next year. Anyone, ping me on that issue if your interested in helping out.

certik · November 12, 2023, 12:09am

@steppi, welcome!

I think this is the core of the issue. The modern SciPy devs are willing to rewrite to C++ themselves, they are not asking the C++ community to help. But if it were to stay in Fortran, the Fortran community must maintain it. However that does not scale. We can’t maintain the interface of Fortran packages for all libraries like SciPy.

Note that the original SciPy authors didn’t have a problem using Fortran code about 23 years ago.

So the question is what has changed?

What does scale is that we can maintain the library itself, say modern Minpack. But SciPy devs have to help integrating it. We can improve the tooling, such as compilers, build system etc.

For example, let’s take specfun, the library you are maintaining. The default there is that unless somebody from here volunteers, the library will get rewritten to C++. The writing on the wall is that if somebody does volunteer and help, the minute they step down, the code will again get rewritten. In such an atmosphere, it’s hard to find volunteers to help with some temporary short-term solutions.

I’ll give you another example. Take LFortran. I truly believed that if we can compile SciPy, that SciPy would use it. But the feedback we got at CI: test `scipy/special/cdflib` by Pranavchiku · Pull Request #2743 · lfortran/lfortran · GitHub is that it would be only temporary, until all Fortran is removed. My reading is that in general you don’t want Fortran. And that’s ok!

In that case, let’s not waste time trying to make something work that is not meant to work. I think it is important to be frank.

We will continue improving the Fortran ecosystem, and I personally will not stress about SciPy anymore. We will finish LFortran to fully compile it, since we need it for the quality of LFortran itself. You will continue on your plan to remove Fortran. And that is ok.

steppi · November 12, 2023, 12:37am

I think this is the core of the issue. The modern SciPy devs are willing to rewrite to C++ themselves, they are not asking the C++ community to help. But if it were to stay in Fortran, the Fortran community must maintain it. However that does not scale. We can’t maintain the interface of Fortran packages for all libraries like SciPy.

special doesn’t need an interface, once we have one working case of a scalar kernel written in modern Fortran, all that would need to be supplied are scalar subroutine like this one, and I could take care of plugging it into the ufunc infrastructure. The only help I’d need honestly is to have someone I could ping to help review PRs and ask about bug reports, since I’m not well versed in Fortran.

The rewrite in C++ is more of stars aligning sort of thing, because I’m getting some expert help reviewing my PRs from a Professor who is an expert in both C++ and special functions. Soon he’s going to have his students help as well. Also, I am a C/C++ developer and would consider myself at least somewhat a member of that community.

For the past 1.5 years or so, I think I’ve been the only active maintainer of special. I don’t think it’s at all feasible that every single special function in SciPy will get rewritten in C++. I don’t have the time or will to do it. If I get hit by a bus, I’m not sure much of anything will get rewritten at all. If people want to submit modern Fortran PRs to special, I will do my best to review them. I will still plan to look into integrating Van Buren’s spheroidal wave functions implementations. @rgoswami, would you be willing to help me vet that code when I get to it?

ilayn · November 12, 2023, 12:42am

That’s pretty much due to the maintainers’ C/C++ affinity. We can handle some level of C-fu. Not the case with Fortran.

Speaking for myself, I had enough of not being able to do anything about such code sometimes not even possible to debug without setting up a complete build-chain.

Once I sat down and started reading the various parts of the F77 codebase we have, I was not happy with the code I saw. In 20+ years I think we just took it for granted as rigorous and battle tested. Then our meson switch reminded our folks once again how difficult it is to build things with old F77 code.

Also I think I made the most noise with META: FORTRAN Code inventory · Issue #18566 · scipy/scipy · GitHub so things started moving. I still think I was too naive to jump into it but it’s too late to turn back Some might argue it’s a sunken cost fallacy.

That’s indeed very important. The reason why I am rewriting stuff like a madman is that, if there would have been an interest in the last 35 years (I think the oldest codepiece is from 1975), somebody would have rewritten these code. But nobody even bothered so no cavalry coming. And we are not experts of these domains. Hence I would assume that the situation will not change. However if we air this codebase in a relatively more familiar state, people strangely enough flock around and we see PRs coming out of the blue sky rewriting functions. Most probably they got hurt in their own work.

A community is not just the LFortran devs but an active userbase who writes everything into that language. Python, Julia, Javascript folks all do this and it happens naturally. There is noone to blame here. I hope I am being sufficiently transparent about my intentions that I need to be able to maintain the code that is not suitable for SciPy level code. So far we weren’t able.

Like I mentioned, if LFortran chose to be the fully backwards compatible all-round Fortran compiler then indeed being able to compile SciPy is a mega milestone as it has lots and lots of strange old code in it.

However, please note that again if our F77 codebase is maintained say like HiGHs or PocketFFT or any other library we currently vendor, then we would have not question the bespoke code every Monday.

This ties back into the my quote from above, the main issue is that we have F77 code in not-so-production state. We have to do something about it. We also have build tooling problems. Lfortran can help the latter and not the first. The first is equally important.

We have our BLAS/LAPACK dependency anyways hence Fortran dependency is not going anywhere. Once we can clean the F77 code then we can start ingesting maintained modern fortran code such as PRIMA and we can use lfortran. But keeping the old dormant F77 code just to switch to lfortran does not sound like a viable strategy to me. I want to understand if we come across as dismissive because that’s not our intention at all.

Plus, there is a whole fortran world waiting to be recompiled hence I don’t think we are really a big player in the grand scheme of Fortran things.

ilayn · November 12, 2023, 1:04am

This is not what we want to imply at all. Take LAPACK. Once LAPACK devs reignited the maintenance on GitHub we started to discover bugs in ?getrf or other central pieces. Think about how many times getrf being used daily. So if LAPACK still finds bugs in 40 year old code then maintenance is an indispensable part of any codebase. I still have bug reports like BUG: 4x4 test example for sy/hevr with INFO=2 · Issue #466 · Reference-LAPACK/lapack · GitHub
It is incredible that we still hit eigenvalue bugs after being slammed by the whole number crunching world for decades.

When we really mean by maintenance is that some people actually looking at the code and knowing what they are looking at. In that sense we are just the providers and not really maintainers. We wrap the code and hope for the best. So if people maintain some Fortran library and keep it in a relatively good shape we will not rewrite because there is no one there.

We rewrite things because the code is 45 years old and we can’t even decipher what it does to respond to bug reports and there is nowhere to turn for help.

Topic		Replies	Views
Conda toolchain for Fortran on Windows	9	1726	May 4, 2023
Scipy is replacing Fortran with C	19	994	January 22, 2025
SciPy is moving to the Meson build system	3	581	August 16, 2021
SciPy: "there is interest in adopting this (PRIMA) implementation, but little appetite for taking on more Fortran code"	28	11654	February 2, 2024
Let's get fortran-lang/minpack into SciPy	11	2267	July 1, 2022

The difficulties of using Fortran in SciPy

Related topics