The difficulties of using Fortran in SciPy

ivanpribec · November 12, 2023, 1:28am

Just curious, have you (or Quansight) given any consideration into consulting a professional?

For example, Polyhedron Solutions Ltd. has been developing the plusFORT suite of tools since 1986. It includes the tool SPAG that can analyze and restructure spaghetti Fortran codes. (cc @apple3feet)

There is also SimCon which provides the fpt static analysis tool, and has been in development since 1995. (cc @Jcollins)

Lastly, you have the newer team at Archaelogic (cc @everythingfunctional)

Perhaps, instead of spending days untangling GOTO’s, spending a a few hours to learn how to use the tools they offer, could solve all your problems.

rgoswami · November 12, 2023, 3:01am

Absolutely, I’d be happy to look over Fortran (or other) PRs to SciPy in general, and to vet the Var Buren code when you get to it

rgoswami · November 12, 2023, 3:05am

In general, for community maintained projects I think its unfair to expect professional (paid) consultation as a valid way forward. There are several reasons for this (IMO):

The maintainers are largely volunteers (notwithstanding partial support from supportive employers like Quansight)
Even if someone was working full-time on SciPy (can’t think of anyone), if they are already senior / well seasoned C++ / C / Python programmers, it would not make sense to pay for professional help for these situations

At the end of the day, funding for open source is in no way (as I see it) stable enough to form long-term contracts of support with professionals (unlike, for example, some academic collaborations or HPC centers).

To be clear by “these situations” I mean where the code can either be rewritten by existing maintainers, or a payment can be made to be modernized into a form where the maintainers wouldn’t be able to understand it best anyway, just because the existing implementation is in Fortran. The point where it reaches “pay to maintain” or rewrite is where Fortran in Python becomes a lost cause.

ivanpribec · November 12, 2023, 3:08am

Quoting Reynolds, NATO Conference on Software Engineering 1969:

The difficulty with obscure bugs is that the probability of their appearance increases with time and with
wider use.

Many of the problems SciPy appears to have are symptoms of bigger problems in software engineering, that some programmers were aware of already a long time ago. Also the dubious state of Fortran routines in SciPy today, is a deja vu of what it was 50 years ago, in 1974, when the author of specfun, W. Cody wrote,

There are many challenging and important problems for the numerical analyst in this work. Until these challenges are met, our libraries will continue to be filled with random collections of software of questionable quality. (emphasis mine)

There are more articles on these topics, here are just a few:

The number of citations these papers have is painfully low, meaning that sadly, very few seem to have read them and learned their lessons.

As long as we (volunteers, college students, …) keep rewriting software manually into every possible language, without proper expertise in the domain of these codes, or any formal training in software engineering, I don’t think the situation will get better. Dijkstra put it as follows:

To the economic question “Why is software so expensive?” the equally economic answer could be “Because it is tried with cheap labour.” Why is it tried that way? Because its intrinsic difficulties are widely and grossly underestimated. So let us concentrate on “Why is software design so difficult?”. One of the morals of my answer will be that with inadequately educated personnel it will be impossible, with adequately educated software designers it might be possible, but certainly remain difficult.

ivanpribec · November 12, 2023, 3:28am

I was thinking of it more as a one-time thing, just to “mechanically” remove the gotos, common blocks, and other legacy Fortran construct which harm readability and make it harder to pinpoint bugs. The problem of understanding the domain of an algorithm is always there, irrespective of the programming language it is written in.

steppi · November 12, 2023, 3:50am

Thanks @rgoswami. I’ll ping you when I start working on that one. I guess I shouldn’t be asking the Fortran community, as in the language developers, for help with special, just any modern Fortran developers who might be interested in participating.

@ivanpribec, thanks for the list of papers. I’ve read Cody’s but not the other two and will make sure to read them all. I’m sure you’re familiar with it, but pointing it out for others, history.siam.org is another great resource for historical information surrounding numerical computing. I mentioned earlier, but our specfun is not the same as Cody’s. Cody was a brilliant numerical analyst and his specfun is of high quality but limited in scope.

I might be kind of an oddball case, but I do have some proper expertise in numerical computation of special functions, and so does Irwin Zaid who is working with me. We’re trying to get things right with special and are not just trying to rewrite things blindly for the sake of doing it, which is understandably probably most people’s expectation.

rgoswami · November 12, 2023, 4:16am

For this, I would admit that a brief skimming of the resources you had listed left me unclear on what the licensing terms would be (e.g. I could probably score some tools with my academic affiliations but I’d be hesitant to use the resulting code in a PR). I think it is perhaps a good first step to “keep Fortran-Lang in the loop” as is happening now, and perhaps over time joint proposals can be made for funding towards such one time endeavors. e.g. if there were a regular contributor from Fortran-Lang to SciPy, perhaps a small development grant could be found to fund such work… but such ideas (from non-contributors, even professionals) are likely to not be looked at favorably by governance (for the same reason we in Fortran-Lang don’t solicit Python or C professionals).

certik · November 12, 2023, 4:42am

@steppi LFortran can compile specfun and all SciPy tests pass. I discovered a bug there myself during my Ph.D. (I think it was the hypergeometric one) 13 years ago and had to write my own special function that is more accurate (https://github.com/certik/hfsolver/blob/b4c50c1979fb7e468b1852b144ba756f5a51788d/src/special_functions.f90#L189). But I did it in modern Fortran, not C++.

@ilayn Yes, we tested Lapack just today, looks like we get to 12% already (Compile Lapack · Issue #1232 · lfortran/lfortran · GitHub). You are right that Fortran Lapack is here to stay for a few more years at the very least (probably much longer).

Why do I care about Fortran in SciPy?

The SciPy package is very high profile. People care. In some circles, SciPy is the only reason people care about Fortran. So losing that is actually a huge deal that Fortran (and SciPy) people don’t even realize. For example thanks to SciPy, we got funding for LFortran (to compile it), and thanks to SciPy, Flang got packaged in conda-forge and many meson issues resolved. I believe also SciPy was the first high profile result of Flang that I’ve noticed on Twitter that people celebrated. For LFortran, if I mention that we compiled one more modern Fortran package, we get a few likes. If I mention that we are ready for testing it with SciPy for WASM, we get about 3x more likes. So SciPy has been helping Fortran tremendously.

However, for this to work, it must be mutual. What can Fortran do for SciPy?

Compared to C++, I would say speed of compilation, safety in Debug mode, speed in Release mode. I think that Fortran is very close to Python in terms of syntax, array support and easiness of use. Fortran was made for numerics.

Harper · November 12, 2023, 5:00am

I have never used SciPy but I did use Zhang and Jin’s book in 2012. Its MCQLN.FOR results were OK for Q_n(z) with integer n, complex z, if z was far from the cut from -1 to 1, but not near the cut (e.g. (0.6D0, 0.1D0)). When I complained to the authors they sent a bug fix. Complex Legendre functions are a pain! I also had to change their 0.0 and 1.0 into 0D0 and 1D0 because everything else was in double precision.

h-vetinari · November 12, 2023, 11:16am

Edit: I write “we” a lot, but it’s really just my personal take; I’m not speaking for the other SciPy maintainers – though I think I’m reflecting the broad consensus position.

There was a long time of the Fortran community (at least on the FOSS side) being basically non-existent, while C++ had a thriving community. This differential got starkly reflected in the contributor population, which eventually translated into the maintainer population, etc. I get that it’s mostly a rhetorical question, but it’s really not like this is due to the capriciousness of the current SciPy maintainers – there’s simply a massive lack of Fortran expertise.

In fact, getting rid of any (non-Python) language in our repo would be a huge maintenance win, as long as functionality, performance, accuracy and maintainability does not decrease. That too is just a natural consequence of how much complexity we burden ourselves with. There are great libraries written in R and Julia, but we’re not going to include those because every extra language just has a maintenance impact that can hardly be overstated. For the same reason, it’s unlikely that we’d get anywhere with “why don’t you learn Fortran then?” – we can understand and lightly modify the code we’ve inherited, but becoming an expert is very a different thing.

While I agree that gratuitous rewrites are bad, this misses the point IMO. We have code no-one is able to maintain and 10+ year old bugs that cannot get fixed. That is a serious project health issue, and the last years have shown that these things (with very few exceptions) do not get fixed either by the community or the available maintainers.

Even if mistakes will be made in a rewrite, the key point is that we gain the ability to iterate on the code going forward, which is what we’re really after. So unless sustainable alternatives for the functionality in question appear, the only available path forward is to rewrite^[1].

I get the dynamics here – you’re fighting impossible-seeming odds on a shoestring budget^[2], so it’s natural to be disheartened by bad news that further pessimize the odds. But even though several SciPy maintainers are individually sympathetic to your efforts, the health of the Fortran ecosystem is not something that matters to our users, and so it doesn’t figure very highly in which decisions are good for our long-term health.

I think we’ve all been engaging very frankly – it’s not like I want to be the bearer of bad news to people who care a lot about Fortran (in fact, I think the work you’re doing is amazing and I want you to succeed!) – which is exactly why I am not sugarcoating things.

Help us fix these bugs and unblock necessary improvements^[3]. Whether that’s through new stand-alone modules or through helping maintain the Fortran code in the SciPy repo is secondary. No-one from SciPy will blame the Fortran community if you don’t have the resources to do so, but likewise we shouldn’t be blamed if we find ways to solve these issues that don’t involve Fortran.

or deprecate and eventually remove without replacement, which is obviously less attractive ↩︎
metaphorical; I presume almost no-one is paid to work on this ↩︎
“help us” clearly does not mean: “do all the work” – IOW, of course we’re willing to do our part. ↩︎

ilayn · November 12, 2023, 12:31pm

This time our problem is that we have folks who want to do something about things but the code does not allow for a meaningful entry point. Even the hypot example Cody gives evolved further. So it’s not just the classical issues. We have much better testing facilities and understanding of what a good test writing should look like. While very valuable the lessons are, not relevant in this particular rewrite vs binding subject.

I know those articles which I enjoyed in the past but cannot anymore since they are paywalled (another problem of numerical software).
In turn, here is another you might also like (also getting implemented in reference BLAS as we speak)

Anderson E. (2017), Algorithm 978: Safe Scaling in the Level 1 BLAS, ACM Trans Math Softw 44:1–28

and it has a full test suite.

For a single use, spending a hefty amount of the funding does not seem like a viable option to me. And I think the license does not apply here since I’ll put the result in a multi-million use package. Not necessarily a “single user” use. But I don’t know. Regardless, the distinction of

deciphering F77 code (goto infested code that in my opinion needs to go away)
building F77 code (Lfortran is the key here)
maintaining F77 code (General community and interest in removing antics)

is crucial here. In this post we are, admittedly necessarily, using these three interchangeably. Since no one is rewriting these things, that should tell something which I failed to convey in these exchanges. Either people are moving away or people are not using them. Neither are good news.

If it is that easy to untangle, those software companies you linked to can do this out of charity for a few “the most popular” Fortran libraries and someone can convert the result to fortran 2008 and there you go extending the life of the library another 3 decades. You might say why would anybody do that. And that is exactly what I mean about the state of fortran community. This would have been done already if it was Rust, Python, Go, Julia etc. (Rust has 3 independent BLAS project attempts already). I mean no offense just plain comparison. An interested community springs out countless initiatives about the most obscure details. I mean we have the original sympy author among us.

That was what I meant previously but somehow I think I am not eloquent enough to put things into words about it.

mecej4 · November 12, 2023, 1:44pm

See this post by @apple3feet (John Appleyard, the author of SPAG), where he indicates that he may be favorably inclined to grant a free license for an opensource project that involves modernizing legacy Fortran code.

themos · November 12, 2023, 6:29pm

I am not convinced this is a Fortran-from-Python issue. NAG is ready to sell anybody huge tracts of its numerical libraries with Python interfaces, and all that numerical code is written in Fortran. The issue seems to be that there is no community equivalent to the NAG enterprise (which started as a community project before becoming a commercial not-for-profit company), and no ideas how to build such a thing.

certik · November 13, 2023, 5:53am

You might like LPython then! Same technology as LFortran, same speed.

Regarding the other comments, I generally agree, I don’t have anything else to add. As I said, nobody planted new Fortran trees for a very long time. We have finally planted several in the last several years, but we have to wait for the fruits to come. But they will!

Beliavsky · November 13, 2023, 1:25pm

The g95 compiler was first released in 2000 and was developed through 2012. The gfortran compiler, an offshoot of g95, is still developed and is one of the most important Fortran compilers. I think they should be recognized as important “trees”, without which the language would likely be moribund.

certik · November 13, 2023, 3:06pm

The GFortran compiler was planted in 2000 (or possibly even earlier, that is when g95 was released). It takes about 10 years to develop a new compiler from scratch to reach production. Absolutely, GFortran is the “old” tree that I am talking about. Without it, we would be finished long time ago. But GFortran by itself can’t fix all the Fortran’s problems.

Jcollins · November 14, 2023, 7:32pm

To give an idea of what can be done entirely automatically, here is a fragment of “Hunt the Wumpus” from the early 17th Century.

C *****************************************************************************
C shoot.f                         17-May-86                John Christopherson
C *****************************************************************************

      SUBROUTINE SHOOT

      INCLUDE 'ints.inc'

      INTEGER ACAVE

      ACAVE = CAVE
      DO 340 I = 1,5
      DO 230 J = 1,3
      IF (SEQ(I) .EQ. PASS(J,ACAVE)) GOTO 260
230   CONTINUE
240   CONTINUE
      J = 1+3*RAND()
      IF (J .GT. 3) GOTO 240
      ACAVE = PASS(J,ACAVE)
      GOTO 280
260   CONTINUE
      ACAVE = SEQ(I)
280   CONTINUE
      IF (ACAVE .EQ. WUMP) THEN
      WRITE(6,300)
300   FORMAT(/,'CONGRATULATIONS',
     1 ' - YOU HAVE SLAIN THE WUMPUS !!!'/)
      WUMP = 0
      GOTO 500
      ELSEIF (ACAVE .EQ. CAVE) THEN
      WRITE(6,320)
320   FORMAT(/,'YOU SHOT YOURSELF IN THE FOOT',
     1 ' AND ARE FORCED TO RETIRE !!!'/)
      WUMP = 0
      GOTO 500
      ENDIF
340   CONTINUE

      ARRW = ARRW+1
      IF (ARRW .LT. 3) THEN
      WRITE(6,420)3-ARRW
420   FORMAT('MISSED - you have',I2,' arrows left')
      ELSE
      WRITE(6,440)
440   FORMAT(/,'You have NO ARROWS LEFT ',
     1 'AND ARE FORCED TO RETIRE !!!'/)
      WUMP = 0
      ENDIF

500   RETURN

      END
type or paste code here

This is converted entirely automatically to:

! *****************************************************************************
! shoot.f                         17-May-86                John Christopherson
! *****************************************************************************
!
SUBROUTINE shoot
!
!
! ****************************************************************************
!
!
        USE fpt_module_kinds
!
        IMPLICIT NONE
!
! ****************************************************************************
!
        INCLUDE 'ints.i90'
!
        INTEGER(KIND=ki4)arrow_cave
        INTEGER(KIND=ki4) :: i
        INTEGER(KIND=ki4) :: j
        REAL(KIND=kr4) :: rand
!
        arrow_cave=your_cave
        DO i=1,5
           DO j=1,3
              IF (arrow_path(i)==passage(j,arrow_cave)) THEN
                 GOTO 260
              ENDIF
           ENDDO
240        CONTINUE
           j=1+3*rand()
           IF (j>3) THEN
              GOTO 240
           ENDIF
           arrow_cave=passage(j,arrow_cave)
           GOTO 280
260        CONTINUE
           arrow_cave=arrow_path(i)
280        CONTINUE
           IF (arrow_cave==wumpus_cave) THEN
              WRITE (6,"(/,'CONGRATULATIONS',                                          &
               & ' - YOU HAVE SLAIN THE WUMPUS !!!'/)")
              wumpus_cave=0
              GOTO 500
           ELSEIF (arrow_cave==your_cave) THEN
              WRITE (6,"(/,'YOU SHOT YOURSELF IN THE FOOT',                            &
               & ' AND ARE FORCED TO RETIRE !!!'/)")
              wumpus_cave=0
              GOTO 500
           ENDIF
        ENDDO
!
        arrow=arrow+1
        IF (arrow<3) THEN
           WRITE (6,"('MISSED - you have',I2,' arrows left')")3-arrow
        ELSE
           WRITE (6,"(/,'You have NO ARROWS LEFT ',                                    &
            & 'AND ARE FORCED TO RETIRE !!!'/)")
           wumpus_cave=0
        ENDIF
!
500     CONTINUE
        RETURN
!
END

The fpt script which specified the changes is:

! wumpus.fsp  25-Oct-23  John Collins

! File handline
% input directory: ../original_source/wumpus
% infer input code layout from file name extension
% output directory ../fpt_output/wumpus
!!! % keep layouts
!!! % keep file name extensions
% primary input file name extension: ".f"
% include input file name extension: ".inc"
% primary output file name extension: .f90
% include output file name extension: .i90
% overwrite changed files

! Look in modified source for changed files
% check modified source

% change do continue to do enddo
% remove labels from enddo statements
% remove labels from executable statements
% change if to if-then
% specify implicit none
% specify numeric kinds
% change relational operators to symbolic form
% remove format statements used fewer than 3 times

! Suppress the diagnostics which mark the changes
% suppress diagnostic 2255 4243 4495

! Code formatting
% free format
% no column format
% output code line length: 132
% page width: 140
% write continuation character in column 88
% lower case symbols
% space before "::"
% space after "::"

! More meaningful names
% rename wump wumpus_cave
% rename cave your_cave
% rename arrw arrow
% rename caven cave_alias
% rename pass passage
% rename seq arrow_path
% rename bats bat_cave
% rename pits pit_cave
% rename acave arrow_cave

! The wumpus primary source files
cavnum.f
inipos.f
input.f
intro.f
move.f
passag.f
play.f
shoot.f
wumpus.f

! End of wumpus.fsp

To do this you need fpt v4.2-j (or later) because we only added the FORMAT handling very recently. Please mail me if you want to try it (or if you want to play Hunt the Wumpus)

John

Jcollins · November 14, 2023, 7:33pm

Apologies - the text handler in the forum truncated the examples. How do I stoo; that?

rgoswami · November 14, 2023, 7:39pm

This is fantastic! However, in the context of this discussion, would it be OK (licensing / attribution) to use it on SciPy’s existing Fortran code? If so, I’m willing to try to push a PR through SciPy.

Jcollins · November 14, 2023, 7:58pm

No problem in cutting a free licence for use on SciPy. fpt is simply a tool - there are no licensing implications for the software it is run on - ownership stays with the original owners.

Happy if we can make a contribution.

Topic		Replies	Views
SciPy: "there is interest in adopting this (PRIMA) implementation, but little appetite for taking on more Fortran code"	28	11658	February 2, 2024
Using reserved words as variables Help	61	2158	September 4, 2023
"A Perspective on Sustainable Computational Chemistry Software Development and Integration": doutbtful comments about Fortran Advocacy	15	1017	March 7, 2024
About f2py, IPC and Wrappers	13	1792	September 15, 2021
Scipy is replacing Fortran with C	19	1032	January 22, 2025

The difficulties of using Fortran in SciPy

Related topics