Just curious, have you (or Quansight) given any consideration into consulting a professional?
For example, Polyhedron Solutions Ltd. has been developing the plusFORT suite of tools since 1986. It includes the tool SPAG that can analyze and restructure spaghetti Fortran codes. (cc @apple3feet)
There is also SimCon which provides the fpt static analysis tool, and has been in development since 1995. (cc @Jcollins)
In general, for community maintained projects I think its unfair to expect professional (paid) consultation as a valid way forward. There are several reasons for this (IMO):
The maintainers are largely volunteers (notwithstanding partial support from supportive employers like Quansight)
Even if someone was working full-time on SciPy (canât think of anyone), if they are already senior / well seasoned C++ / C / Python programmers, it would not make sense to pay for professional help for these situations
At the end of the day, funding for open source is in no way (as I see it) stable enough to form long-term contracts of support with professionals (unlike, for example, some academic collaborations or HPC centers).
To be clear by âthese situationsâ I mean where the code can either be rewritten by existing maintainers, or a payment can be made to be modernized into a form where the maintainers wouldnât be able to understand it best anyway, just because the existing implementation is in Fortran. The point where it reaches âpay to maintainâ or rewrite is where Fortran in Python becomes a lost cause.
The difficulty with obscure bugs is that the probability of their appearance increases with time and with
wider use.
Many of the problems SciPy appears to have are symptoms of bigger problems in software engineering, that some programmers were aware of already a long time ago. Also the dubious state of Fortran routines in SciPy today, is a deja vu of what it was 50 years ago, in 1974, when the author of specfun, W. Cody wrote,
There are many challenging and important problems for the numerical analyst in this work. Until these challenges are met, our libraries will continue to be filled with random collections of software of questionable quality. (emphasis mine)
There are more articles on these topics, here are just a few:
The number of citations these papers have is painfully low, meaning that sadly, very few seem to have read them and learned their lessons.
As long as we (volunteers, college students, âŚ) keep rewriting software manually into every possible language, without proper expertise in the domain of these codes, or any formal training in software engineering, I donât think the situation will get better. Dijkstra put it as follows:
To the economic question âWhy is software so expensive?â the equally economic answer could be âBecause it is tried with cheap labour.â Why is it tried that way? Because its intrinsic difficulties are widely and grossly underestimated. So let us concentrate on âWhy is software design so difficult?â. One of the morals of my answer will be that with inadequately educated personnel it will be impossible, with adequately educated software designers it might be possible, but certainly remain difficult.
I was thinking of it more as a one-time thing, just to âmechanicallyâ remove the gotos, common blocks, and other legacy Fortran construct which harm readability and make it harder to pinpoint bugs. The problem of understanding the domain of an algorithm is always there, irrespective of the programming language it is written in.
Thanks @rgoswami. Iâll ping you when I start working on that one. I guess I shouldnât be asking the Fortran community, as in the language developers, for help with special, just any modern Fortran developers who might be interested in participating.
@ivanpribec, thanks for the list of papers. Iâve read Codyâs but not the other two and will make sure to read them all. Iâm sure youâre familiar with it, but pointing it out for others, history.siam.org is another great resource for historical information surrounding numerical computing. I mentioned earlier, but our specfun is not the same as Codyâs. Cody was a brilliant numerical analyst and his specfun is of high quality but limited in scope.
I might be kind of an oddball case, but I do have some proper expertise in numerical computation of special functions, and so does Irwin Zaid who is working with me. Weâre trying to get things right with special and are not just trying to rewrite things blindly for the sake of doing it, which is understandably probably most peopleâs expectation.
For this, I would admit that a brief skimming of the resources you had listed left me unclear on what the licensing terms would be (e.g. I could probably score some tools with my academic affiliations but Iâd be hesitant to use the resulting code in a PR). I think it is perhaps a good first step to âkeep Fortran-Lang in the loopâ as is happening now, and perhaps over time joint proposals can be made for funding towards such one time endeavors. e.g. if there were a regular contributor from Fortran-Lang to SciPy, perhaps a small development grant could be found to fund such work⌠but such ideas (from non-contributors, even professionals) are likely to not be looked at favorably by governance (for the same reason we in Fortran-Lang donât solicit Python or C professionals).
The SciPy package is very high profile. People care. In some circles, SciPy is the only reason people care about Fortran. So losing that is actually a huge deal that Fortran (and SciPy) people donât even realize. For example thanks to SciPy, we got funding for LFortran (to compile it), and thanks to SciPy, Flang got packaged in conda-forge and many meson issues resolved. I believe also SciPy was the first high profile result of Flang that Iâve noticed on Twitter that people celebrated. For LFortran, if I mention that we compiled one more modern Fortran package, we get a few likes. If I mention that we are ready for testing it with SciPy for WASM, we get about 3x more likes. So SciPy has been helping Fortran tremendously.
However, for this to work, it must be mutual. What can Fortran do for SciPy?
Compared to C++, I would say speed of compilation, safety in Debug mode, speed in Release mode. I think that Fortran is very close to Python in terms of syntax, array support and easiness of use. Fortran was made for numerics.
I have never used SciPy but I did use Zhang and Jinâs book in 2012. Its MCQLN.FOR results were OK for Q_n(z) with integer n, complex z, if z was far from the cut from -1 to 1, but not near the cut (e.g. (0.6D0, 0.1D0)). When I complained to the authors they sent a bug fix. Complex Legendre functions are a pain! I also had to change their 0.0 and 1.0 into 0D0 and 1D0 because everything else was in double precision.
Edit: I write âweâ a lot, but itâs really just my personal take; Iâm not speaking for the other SciPy maintainers â though I think Iâm reflecting the broad consensus position.
There was a long time of the Fortran community (at least on the FOSS side) being basically non-existent, while C++ had a thriving community. This differential got starkly reflected in the contributor population, which eventually translated into the maintainer population, etc. I get that itâs mostly a rhetorical question, but itâs really not like this is due to the capriciousness of the current SciPy maintainers â thereâs simply a massive lack of Fortran expertise.
In fact, getting rid of any (non-Python) language in our repo would be a huge maintenance win, as long as functionality, performance, accuracy and maintainability does not decrease. That too is just a natural consequence of how much complexity we burden ourselves with. There are great libraries written in R and Julia, but weâre not going to include those because every extra language just has a maintenance impact that can hardly be overstated. For the same reason, itâs unlikely that weâd get anywhere with âwhy donât you learn Fortran then?â â we can understand and lightly modify the code weâve inherited, but becoming an expert is very a different thing.
While I agree that gratuitous rewrites are bad, this misses the point IMO. We have code no-one is able to maintain and 10+ year old bugs that cannot get fixed. That is a serious project health issue, and the last years have shown that these things (with very few exceptions) do not get fixed either by the community or the available maintainers.
Even if mistakes will be made in a rewrite, the key point is that we gain the ability to iterate on the code going forward, which is what weâre really after. So unless sustainable alternatives for the functionality in question appear, the only available path forward is to rewrite[1].
I get the dynamics here â youâre fighting impossible-seeming odds on a shoestring budget[2], so itâs natural to be disheartened by bad news that further pessimize the odds. But even though several SciPy maintainers are individually sympathetic to your efforts, the health of the Fortran ecosystem is not something that matters to our users, and so it doesnât figure very highly in which decisions are good for our long-term health.
I think weâve all been engaging very frankly â itâs not like I want to be the bearer of bad news to people who care a lot about Fortran (in fact, I think the work youâre doing is amazing and I want you to succeed!) â which is exactly why I am not sugarcoating things.
Help us fix these bugs and unblock necessary improvements[3]. Whether thatâs through new stand-alone modules or through helping maintain the Fortran code in the SciPy repo is secondary. No-one from SciPy will blame the Fortran community if you donât have the resources to do so, but likewise we shouldnât be blamed if we find ways to solve these issues that donât involve Fortran.
or deprecate and eventually remove without replacement, which is obviously less attractive âŠď¸
metaphorical; I presume almost no-one is paid to work on this âŠď¸
âhelp usâ clearly does not mean: âdo all the workâ â IOW, of course weâre willing to do our part. âŠď¸
This time our problem is that we have folks who want to do something about things but the code does not allow for a meaningful entry point. Even the hypot example Cody gives evolved further. So itâs not just the classical issues. We have much better testing facilities and understanding of what a good test writing should look like. While very valuable the lessons are, not relevant in this particular rewrite vs binding subject.
I know those articles which I enjoyed in the past but cannot anymore since they are paywalled (another problem of numerical software).
In turn, here is another you might also like (also getting implemented in reference BLAS as we speak)
Anderson E. (2017), Algorithm 978: Safe Scaling in the Level 1 BLAS, ACM Trans Math Softw 44:1â28
and it has a full test suite.
For a single use, spending a hefty amount of the funding does not seem like a viable option to me. And I think the license does not apply here since Iâll put the result in a multi-million use package. Not necessarily a âsingle userâ use. But I donât know. Regardless, the distinction of
deciphering F77 code (goto infested code that in my opinion needs to go away)
building F77 code (Lfortran is the key here)
maintaining F77 code (General community and interest in removing antics)
is crucial here. In this post we are, admittedly necessarily, using these three interchangeably. Since no one is rewriting these things, that should tell something which I failed to convey in these exchanges. Either people are moving away or people are not using them. Neither are good news.
If it is that easy to untangle, those software companies you linked to can do this out of charity for a few âthe most popularâ Fortran libraries and someone can convert the result to fortran 2008 and there you go extending the life of the library another 3 decades. You might say why would anybody do that. And that is exactly what I mean about the state of fortran community. This would have been done already if it was Rust, Python, Go, Julia etc. (Rust has 3 independent BLAS project attempts already). I mean no offense just plain comparison. An interested community springs out countless initiatives about the most obscure details. I mean we have the original sympy author among us.
That was what I meant previously but somehow I think I am not eloquent enough to put things into words about it.
See this post by @apple3feet (John Appleyard, the author of SPAG), where he indicates that he may be favorably inclined to grant a free license for an opensource project that involves modernizing legacy Fortran code.
I am not convinced this is a Fortran-from-Python issue. NAG is ready to sell anybody huge tracts of its numerical libraries with Python interfaces, and all that numerical code is written in Fortran. The issue seems to be that there is no community equivalent to the NAG enterprise (which started as a community project before becoming a commercial not-for-profit company), and no ideas how to build such a thing.
You might like LPython then! Same technology as LFortran, same speed.
Regarding the other comments, I generally agree, I donât have anything else to add. As I said, nobody planted new Fortran trees for a very long time. We have finally planted several in the last several years, but we have to wait for the fruits to come. But they will!
The g95 compiler was first released in 2000 and was developed through 2012. The gfortran compiler, an offshoot of g95, is still developed and is one of the most important Fortran compilers. I think they should be recognized as important âtreesâ, without which the language would likely be moribund.
The GFortran compiler was planted in 2000 (or possibly even earlier, that is when g95 was released). It takes about 10 years to develop a new compiler from scratch to reach production. Absolutely, GFortran is the âoldâ tree that I am talking about. Without it, we would be finished long time ago. But GFortran by itself canât fix all the Fortranâs problems.
To give an idea of what can be done entirely automatically, here is a fragment of âHunt the Wumpusâ from the early 17th Century.
C *****************************************************************************
C shoot.f 17-May-86 John Christopherson
C *****************************************************************************
SUBROUTINE SHOOT
INCLUDE 'ints.inc'
INTEGER ACAVE
ACAVE = CAVE
DO 340 I = 1,5
DO 230 J = 1,3
IF (SEQ(I) .EQ. PASS(J,ACAVE)) GOTO 260
230 CONTINUE
240 CONTINUE
J = 1+3*RAND()
IF (J .GT. 3) GOTO 240
ACAVE = PASS(J,ACAVE)
GOTO 280
260 CONTINUE
ACAVE = SEQ(I)
280 CONTINUE
IF (ACAVE .EQ. WUMP) THEN
WRITE(6,300)
300 FORMAT(/,'CONGRATULATIONS',
1 ' - YOU HAVE SLAIN THE WUMPUS !!!'/)
WUMP = 0
GOTO 500
ELSEIF (ACAVE .EQ. CAVE) THEN
WRITE(6,320)
320 FORMAT(/,'YOU SHOT YOURSELF IN THE FOOT',
1 ' AND ARE FORCED TO RETIRE !!!'/)
WUMP = 0
GOTO 500
ENDIF
340 CONTINUE
ARRW = ARRW+1
IF (ARRW .LT. 3) THEN
WRITE(6,420)3-ARRW
420 FORMAT('MISSED - you have',I2,' arrows left')
ELSE
WRITE(6,440)
440 FORMAT(/,'You have NO ARROWS LEFT ',
1 'AND ARE FORCED TO RETIRE !!!'/)
WUMP = 0
ENDIF
500 RETURN
END
type or paste code here
This is converted entirely automatically to:
! *****************************************************************************
! shoot.f 17-May-86 John Christopherson
! *****************************************************************************
!
SUBROUTINE shoot
!
!
! ****************************************************************************
!
!
USE fpt_module_kinds
!
IMPLICIT NONE
!
! ****************************************************************************
!
INCLUDE 'ints.i90'
!
INTEGER(KIND=ki4)arrow_cave
INTEGER(KIND=ki4) :: i
INTEGER(KIND=ki4) :: j
REAL(KIND=kr4) :: rand
!
arrow_cave=your_cave
DO i=1,5
DO j=1,3
IF (arrow_path(i)==passage(j,arrow_cave)) THEN
GOTO 260
ENDIF
ENDDO
240 CONTINUE
j=1+3*rand()
IF (j>3) THEN
GOTO 240
ENDIF
arrow_cave=passage(j,arrow_cave)
GOTO 280
260 CONTINUE
arrow_cave=arrow_path(i)
280 CONTINUE
IF (arrow_cave==wumpus_cave) THEN
WRITE (6,"(/,'CONGRATULATIONS', &
& ' - YOU HAVE SLAIN THE WUMPUS !!!'/)")
wumpus_cave=0
GOTO 500
ELSEIF (arrow_cave==your_cave) THEN
WRITE (6,"(/,'YOU SHOT YOURSELF IN THE FOOT', &
& ' AND ARE FORCED TO RETIRE !!!'/)")
wumpus_cave=0
GOTO 500
ENDIF
ENDDO
!
arrow=arrow+1
IF (arrow<3) THEN
WRITE (6,"('MISSED - you have',I2,' arrows left')")3-arrow
ELSE
WRITE (6,"(/,'You have NO ARROWS LEFT ', &
& 'AND ARE FORCED TO RETIRE !!!'/)")
wumpus_cave=0
ENDIF
!
500 CONTINUE
RETURN
!
END
The fpt script which specified the changes is:
! wumpus.fsp 25-Oct-23 John Collins
! File handline
% input directory: ../original_source/wumpus
% infer input code layout from file name extension
% output directory ../fpt_output/wumpus
!!! % keep layouts
!!! % keep file name extensions
% primary input file name extension: ".f"
% include input file name extension: ".inc"
% primary output file name extension: .f90
% include output file name extension: .i90
% overwrite changed files
! Look in modified source for changed files
% check modified source
% change do continue to do enddo
% remove labels from enddo statements
% remove labels from executable statements
% change if to if-then
% specify implicit none
% specify numeric kinds
% change relational operators to symbolic form
% remove format statements used fewer than 3 times
! Suppress the diagnostics which mark the changes
% suppress diagnostic 2255 4243 4495
! Code formatting
% free format
% no column format
% output code line length: 132
% page width: 140
% write continuation character in column 88
% lower case symbols
% space before "::"
% space after "::"
! More meaningful names
% rename wump wumpus_cave
% rename cave your_cave
% rename arrw arrow
% rename caven cave_alias
% rename pass passage
% rename seq arrow_path
% rename bats bat_cave
% rename pits pit_cave
% rename acave arrow_cave
! The wumpus primary source files
cavnum.f
inipos.f
input.f
intro.f
move.f
passag.f
play.f
shoot.f
wumpus.f
! End of wumpus.fsp
To do this you need fpt v4.2-j (or later) because we only added the FORMAT handling very recently. Please mail me if you want to try it (or if you want to play Hunt the Wumpus)
This is fantastic! However, in the context of this discussion, would it be OK (licensing / attribution) to use it on SciPyâs existing Fortran code? If so, Iâm willing to try to push a PR through SciPy.
No problem in cutting a free licence for use on SciPy. fpt is simply a tool - there are no licensing implications for the software it is run on - ownership stays with the original owners.