Resistance to modernization

From C++ to Fortran? Yes, I was thinking of doing that as well using Clang, at least for a subset. But after LFortran works.

1 Like

You’re certainly not the only one who does this. But, I consider this to be an anti-pattern that the Fortran community needs to get away from. For one thing, what you end up with is a fork that is unique to your project. What happens if you want to update something in the Fortran 77 code? You have the update and nobody else does. All the work all the people do to write these interface modules would be better spent updating the underlying code. SciPy is chock full of forks of legacy Fortran codes that aren’t the same as other forks floating around and can’t easily be used by anybody else. It’s just a disaster. And these ancient legacy codes were not free of bugs. Importing ODEPACK should be as easy as adding a line to your fpm.toml with the version you want to use. And then you are done. That’s where we need to get to as a community.

Look at the modernized Minpack and Quadpack for what we can do if we try.

4 Likes

Our experience is mostly the opposite.

For every instance where a modernized Fortran codebase ran into a failure where the legacy code worked ok, there are countless instances where the refactored code performed more robustly and faster than the older code and moreover the updated codebase was ready for parallelization.

It has been our experience, far too often, that the legacy code is chock full of bugs. In many cases the legacy codebase came with no test suite whatsoever, let alone unit tests. The so-called users learnt over time what works and what does not, so these users - often the subject matter experts (SMEs) - remain in their “sandboxes” with the legacy programs. They would have to perform all kinds of manipulations, often with the help of younger staff who come through the org - interns, co-op students, new employees, etc. - who used other, highly up-to-date packages in MATLAB, Python, R, etc. to overcome the limitations of the original legacy FORTRAN codebase.

Many of the younger staff eventually wisen up and migrate and transform the whole legacy codebase into their own incarnations, often in modern C++.

All along, we have found many occasions where the above-mentioned SMEs are, on the inside, absolutely “terrified” of the changes. All they know in terms of coding is the legacy FORTRAN style with COMMONs and EQUIVALENCEs, so they become highly intransigent and resistant to change. They follow the “anti-pattern” vis-a-vis Lord Kelvin: only words and more words, quips and complaints against the changes, but rarely, if ever, any numbers or quantifiable results to show how the earlier legacy apps did better in any meaningful aspect. Management simply waits for retirements and other forms of attrition, but all along it is Fortran that gathers all the ill-will.

4 Likes

Correct, but I beg to disagree with the second part. It would be better to “standardize” such “interface modules”, as I call them (or “forks”, if you will), into something like a library, or rather a collection of libraries. I was actually thinking of doing something like that in the past, but I didn’t. The reason is that I insist on stripping the underlying packages, like SLATEC or whatever, down to the absolute minimum needed, which differs from task to task.
Again, such an interface library would be much easier and quicker to make and, most importantly, far less prone to bugs than “updating” spaghetti code. Don’t get me wrong, though: I have to congratulate people willing to modernize legacy code. I just think the result does not justify the time needed. It would be better to spend much less time on a much safer module such as the one I described earlier and, better still, to share such modules with the community.

I can understand older programmers, retired or close to retirement, who refuse to change, and I don’t think they deserve such treatment in such strong words. What would you do if you were in your 60s or 70s and were told you had to forget everything you have known and worked with for decades in favor of something new, and therefore “better”? And no, I am not defending them because I am one of them. Once I started learning Fortran 90 I had a blast, but I doubt I would have done the same had I been older. The problem is not older staff, but rather younger programmers who were told Fortran is still FORTRAN, even though that stopped being true long before they were born.

I’m sorry, but I would consider migration to modern C++ the exact opposite of “wisen up”. You make the mistake of thinking “younger staff” are always right by definition, and that is far from true. More often than not, what I see is younger staff arrogance, usually based on misinformation.

5 Likes

That is not our experience at all. The issues I was referring to had to do with a small but highly influential staff, often on account of seniority, marking a set of FORTRAN codebases as “perfect” or “not to be modified” based on reasons that are the same as or similar to those in the original post.

This is even after being presented with many actual use cases that are not satisfied by those codebases.

Then, even after showing that the modernization retained the core IP (e.g., fundamental engineering formulae) in code very similar to, or an exact replica of, the original fixed-form FORTRAN, that the algorithms in pseudocode were the same as the original while the implementations simply used modern constructs such as SELECT CASE, named DO loops with CYCLE, and EXIT instead of GOTOs, and that the results agree with the original to a separately analyzable precision, there is still no acceptance.
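As a hypothetical illustration of the kind of one-to-one modernization described above (the routine and data are made up, not from any of the codebases mentioned), here is a legacy GOTO-to-label pattern next to its CYCLE equivalent:

```fortran
program goto_to_cycle
   implicit none
   integer :: i, n
   real :: s, x(5)

   n = 5
   x = [1.0, 0.0, 2.0, 0.0, 4.0]
   s = 0.0

   ! Legacy fixed-form original, shown as a comment:
   !       DO 10 I = 1, N
   !          IF (X(I) .EQ. 0.0) GO TO 10
   !          S = S + 1.0/X(I)
   !    10 CONTINUE

   ! One-to-one modernization: CYCLE replaces the jump to the loop label,
   ! and the algorithm is unchanged line for line.
   do i = 1, n
      if (x(i) == 0.0) cycle
      s = s + 1.0/x(i)
   end do

   print *, s   ! 1 + 1/2 + 1/4 = 1.75
end program goto_to_cycle
```

The point is exactly the one made above: the pseudocode is identical, only the control-flow constructs are modern.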

The challenges are emotional and there is no fighting those.

I disagree: starting around C++14 and moving on to C++17 and now C++20, it has really turned a corner. C++ remains a difficult language to master and it has its own problems, but those in the know can write apps and especially libraries that are very efficient and fast, more readily extensible for multithreaded execution and parallel computing which is highly important, succinct (perhaps too much so) in some ways (though with the bloat of curly brackets), and which can be made to conform to an ISO IEC standard too.

1 Like

Probably related to this, but as far as I have experienced, one of the biggest reasons is the lack of incentive or motivation (for students and postdoctoral fellows). Very briefly, spending a lot of time and effort on “modernizing” legacy codes does not usually lead to a paper (= academic achievement), except in very rare situations, so I think it is usually not advised by supervisors as a “project”. Also, students and postdoctoral fellows are often struggling to learn new methods and techniques (in their field) and do not have much room to do anything beyond, e.g., adding new code for experimenting with new ideas (as far as I have experienced, again…).

I can understand that a modern version of the legacy code (if one exists) could be easier to modify, but to get there, one often needs to change the structure of the program itself rather than doing a “one-to-one” modernization. I guess the former takes much more time and energy than the latter (because of the new code design it requires).

2 Likes

Bogus or not; for many Fortran codes that’s the reality. Take a look at NumPy, the Python package for array computations used among other things in the discovery of gravitational waves, and the first image of a black hole. According to Google Trends, NumPy surpassed Fortran in searches somewhere around 2015:

(Google Trends chart: search interest for NumPy vs. Fortran)

(Edit: I’m not sure how trustworthy the Google trend for Julia is. The peaks appear in 2012, which was the date of the Julia release…)

The numpy.linalg module uses a vendored version of LAPACK dubbed lapack_lite, which is patched to remove “new” features just so it is compatible with f2c:

```diff
 !                 Skip any trailing zeros.
                   DO LASTV = N, I+1, -1
-                     IF( V( LASTV, I ).NE.ZERO ) EXIT
+                     IF( V( LASTV, I ).NE.ZERO ) GO TO 15
                   END DO
+   15             CONTINUE
```

Now you could take this as a success, old F77 is still helping scientists. But it’s not the case. Some of the troubles faced by the Python community in using Fortran are discussed here: Releasing (or not) 32-bit Windows wheels - #13 by ev-br - Contributor & Development Discussion - Scientific Python. Here are a few of the most chilling quotes from there:

Philosophical musing: how many person-years would it take to get rid of Fortran in SciPy completely? That would be a better outcome; the most work but it’d get rid of a ton of outdated and badly written code that no one wants to maintain. Many birds with one stone …

Fortran on Windows is just never-ending pain, and responsible for our worst packaging issues. It was also the worst problem for getting things to work on macOS M1

For Fortran as we have it in SciPy though (F77 mostly), there’s just no pros at all beyond “we already have the code”, and many cons.

3 Likes

Just to add my two cents (pence?): I am currently modernising a code written in F77 that models stellar atmospheres. I’m basically porting it to F90 by getting rid of the COMMON blocks and replacing them with modules, as well as replacing precision declarations with SELECTED_REAL_KIND etc. The only downside I’ve seen thus far is that the numerical results are slightly different (a 1e-7 residual compared to the original code). I’m also working to make a lot of the arrays allocatable. Why am I doing this? Well, for one, I learn what the code does on a fundamental level. And two, I can make it far more space savvy. Yes, technically I could just write more subroutines and add more COMMON blocks, but since everything is statically allocated prior to runtime, it puts unneeded memory pressure on my Mac.
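For readers who haven’t done this kind of port, a minimal sketch of the COMMON-to-module refactoring described above (the names `atmos_mod`, `temp`, `pres`, `nlayer` are invented for illustration, not from the actual code):

```fortran
! Legacy version, repeated verbatim in every routine that touched the data:
!       COMMON /ATMOS/ TEMP(100), PRES(100), NLAYER

! Modern version: declared once in a module, USE'd where needed,
! with the arrays made allocatable so sizes are set at run time.
module atmos_mod
   implicit none
   integer, parameter :: wp = selected_real_kind(15)   ! replaces hard-wired precision
   integer :: nlayer
   real(wp), allocatable :: temp(:), pres(:)
end module atmos_mod

program demo
   use atmos_mod
   implicit none

   nlayer = 100
   allocate (temp(nlayer), pres(nlayer))   ! memory claimed only when needed
   temp = 0.0_wp
   pres = 0.0_wp
   print *, size(temp), size(pres)
end program demo
```

The allocatable arrays are what relieve the static-memory pressure mentioned above: nothing is reserved until the problem size is known.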

I also agree with what some other people are saying with regard to future generations. Modern Fortran is something that people should understand and want to write. When legacy code includes Hollerith constants and EQUIVALENCE statements, it doesn’t surprise me that people might want to steer clear!

3 Likes

I find this claim to be very misleading.

Most well-tested legacy codes have few bugs, and those are mostly due to previously untested data cases.
Modern Fortran compilers, unfortunately, appear to have a significant number of bugs, mostly related to the new features, and as @whuhn eloquently put it, “battletested programmer hears the word “modernization”, they reach for their gun.”

I wonder how many significant codes are using F2018 features ?

So let’s get back to reality and be a bit more honest about where most of the bugs are!

Regarding refactoring, I should add that all the code I compile includes IMPLICIT NONE and !$OMP PARALLEL DO DEFAULT (NONE) … so that all variables are fully identified.
I find this change introduces few new bugs when done carefully.
(gfortran does provide good adherence to !$OMP DEFAULT (NONE), an area where compiler checking can otherwise be a problem.)
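A minimal sketch of that style, assuming a trivial loop body for illustration: with DEFAULT (NONE) every variable must be declared (via IMPLICIT NONE) and must have its sharing class stated explicitly, so nothing slips through unnoticed.

```fortran
program omp_default_none
   implicit none
   integer :: i, n
   real, allocatable :: a(:), b(:)

   n = 1000
   allocate (a(n), b(n))
   b = 1.0

! default(none) forces every variable used in the region to be listed;
! forgetting one is a compile-time error rather than a silent race.
!$omp parallel do default(none) shared(a, b, n) private(i)
   do i = 1, n
      a(i) = 2.0*b(i)
   end do
!$omp end parallel do

   print *, sum(a)   ! 2000.0
end program omp_default_none
```

Compile with `gfortran -fopenmp`; dropping, say, `n` from the shared list makes the compiler reject the directive, which is exactly the "fully identified" property described above.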

I am quite interested in this point. Could you or anyone elaborate on why whole-array constructs are more difficult to optimize than loops? Is this claim applicable to other whole-array operations such as matmul and broadcasting?

When saying loops are easier to optimize than whole-array constructs, I guess there is a hidden assumption that the loops are well written (e.g., respecting the column-major order), if not well tuned according to the hardware.

However, ideally, shouldn’t the compilers do that kind of tuning for us, as has been the case in MATLAB since the last century? Shouldn’t the programmers focus more on the mathematics and logic of the algorithms, which are better formulated using whole-array operations? As long as the whole-array operations are not “stupidly written” (e.g., solving a linear system Ax=b by taking the inverse of A), shouldn’t we expect the compilers (machines) to optimize the computation for us (humans) in most cases, leaving us to focus on the more creative part of the job?

Surely, there must be someone who teaches the compilers how to optimize array operations in the first place. However, why should the programmers teach their own compilers for their own code & hardware manually in an ad-hoc fashion? Shouldn’t we let the professionals (compiler designers) do the professional thing except for rather special/rare cases? I know that it poses huge difficulties for the professionals, but isn’t this the way we make advances in science and technology in general?

Take matmul as an example. What are the major difficulties if we want compilers to optimize it? Is there some stupidity in writing A = matmul(B, C) so that compilers are incapable of rescuing this piece of code?
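To make the comparison concrete, here are the two formulations side by side in a self-contained sketch; the intent is identical and only the level of abstraction differs:

```fortran
program matmul_vs_loops
   implicit none
   integer, parameter :: n = 3
   real :: a(n,n), b(n,n), c(n,n), a_loop(n,n)
   integer :: i, j, k

   call random_number(b)
   call random_number(c)

   ! Whole-array form: one line; the compiler/runtime chooses the method
   ! (possibly dispatching to an optimized library).
   a = matmul(b, c)

   ! Explicit loops, ordered j-k-i so the innermost index walks
   ! column-major memory contiguously.
   a_loop = 0.0
   do j = 1, n
      do k = 1, n
         do i = 1, n
            a_loop(i,j) = a_loop(i,j) + b(i,k)*c(k,j)
         end do
      end do
   end do

   print *, maxval(abs(a - a_loop))   ! ~0, up to rounding
end program matmul_vs_loops
```

Whether the one-liner or the hand-ordered loops run faster is exactly the compiler-quality question being asked above.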

What is wrong if I claim that the following two comparisons are similar?

Explicit loops vs. whole-array constructs/operations
Addition in assembly code vs. addition by +

Sorry if my questions do not make much sense. I am not one of “the professionals” when talking about compilers, code optimization, or even programming. Thanks.

1 Like

In my experience, this claim is difficult to justify.

My recent coding approach is, where appropriate, to replace the inner loop with array (vector) syntax.
My reasoning is that this may improve the likelihood of the compiler recognizing AVX/SIMD opportunities. That is also possibly difficult to justify, but my experience shows no worse outcome.
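A sketch of what "replace the inner loop with array syntax" means in practice (a made-up matrix-vector example, not taken from the poster's code):

```fortran
program inner_loop_vectors
   implicit none
   integer, parameter :: n = 4, m = 4
   real :: a(n,m), x(m), y(n)
   integer :: i, j

   call random_number(a)
   call random_number(x)
   y = 0.0

   do j = 1, m
      ! Inner-loop version:
      !    do i = 1, n
      !       y(i) = y(i) + a(i,j)*x(j)
      !    end do
      ! Array-syntax version of the same inner loop; the contiguous
      ! column access is explicit, which may help SIMD recognition:
      y = y + a(:,j)*x(j)
   end do

   print *, y
end program inner_loop_vectors
```

The outer loop stays; only the unit-stride inner dimension is expressed as an array operation.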

This is a change from earlier F77 wrapper styles. The most surprising thing I have found recently (over the last 10 years) with the ifort and gfortran optimising compilers is that the F77 wrapper approach of converting array sections to vector functions can be slower than n-dimensional array syntax. This certainly was not the case last century(!), when these approaches provided significant gains in legacy codes.

In many cases, the optimisation coding styles need to be reviewed as compilers and hardware evolve.

I am finding that the memory-to-cache interface is the new efficiency frontier.

1 Like

Well, if you think that way, I can imagine people wondering what you are doing here instead of being in other forums, talking about C++… excellence. But I am not asking, because it’s not my business anyway.

At any rate, I am against resistance to modernization as you described it (not in general), but I am even more against the “wisen up” migration of legacy FORTRAN code to… C++ instead of Fortran. I explained the reasons elsewhere and I’m not going to repeat myself. If you want to live in a language that has half-baked matrix support (basically good for nothing), that’s perfectly fine by me.

2 Likes

Not mentioned so far is my case. I still need FORmula TRANslation to do my job quickly and reliably, using good libraries and special functions. A few hours of coding, then efficient runs on HPC: that is my way, and for it I need some modern features. I do not write code for every hardware on the planet or every precision I could imagine. I am interested in modern features, but I adopt them carefully and rather slowly, by evolution.

I think part of the issue for some people is that they look at the modern features and see things that are either poorly thought out and implemented (see SELECT TYPE etc.) or poorly implemented in some compilers, and think “why should I waste my time with this”. I think some (maybe many) share my perception that the standards committee wastes time implementing things that are pet projects of some committee members, without giving much real thought to what features are really needed in real-world codes in the subject domains that Fortran used to dominate (CFD, finite elements, computational physics and chemistry, etc.).

As others have stated, array processing is probably Fortran’s greatest strength, but Fortran’s intrinsic array processing ability (in terms of intrinsic functions) severely lags MATLAB. Why are there no intrinsic set functions like unique, union, intersect, etc.? Why are there no intrinsic sort, sortrows, tensor_product functions? Why are there no intrinsic linear algebra functions that mirror what MATLAB has? Beyond the things in MATLAB that should be in Fortran, why are there no intrinsic ADTs (container classes) like lists, unordered (hash) maps, dynamic arrays (i.e., the C++ vector class), stacks, queues, etc.? While I think templates are a useful addition, I would have little need for them if we had intrinsic containers. I assume the committee’s answer to the lack of these features is “write them yourself”, as though they think the compiler developers’ time is more valuable than mine.

Again, while I consider myself firmly in the modernist camp and have advocated just forking the modern portions of modern Fortran into a separate language, I can also understand people looking at the new and improved features in Fortran and saying to themselves “I see nothing here that is going to help me write a better code”.

9 Likes

Well said. Totally agree.

Recall that a major portion of Fortran users are researchers and scientists, but not professional programmers. They have their own job. Programming is only their way of finishing (a small part of) their job, NOT the job itself.

When scripting languages like MATLAB, Python, and Julia provide sufficient tools for solving normal users’ daily problems satisfactorily, literally nobody will use Fortran. Not to mention that these languages can very often outperform Fortran out of the box on basic tasks, e.g., matrix multiplication, let alone do things that Fortran cannot do intrinsically, e.g., solving linear systems, factorizing matrices, and calculating eigenvalues.

Surely, Fortran can be much more efficient if you are willing to spend time on tuning your code/compiler. Surely, those scripting languages can handle the aforementioned basic tasks only because of the Fortran (FORTRAN) libraries underneath (N.B.: the necessity of Fortran/FORTRAN may be overestimated here. See the remark by @oscardssmith below). Surely, without Fortran (FORTRAN), those languages are useless. However, the situation is no better (if not much worse) for Fortran — without efficient and robust intrinsic procedures that can deal with the aforementioned basic tasks, Fortran is (even more) useless to most researchers and scientists, the original target users of Fortran.

7 Likes

This overestimates the necessity of Fortran for Julia’s success. I would be fairly surprised if there is any Fortran remaining in Julia by the time Fortran 202Y comes out. Currently, Julia only calls Fortran for BLAS and LAPACK, and there are already initial experiments showing better performance from pure Julia versions (see Octavian.jl and LinearSolve.jl).

3 Likes

For me, this is exactly the reason why programmers who code for the future should STOP optimizing their code in an ad-hoc fashion.

Just write things in array operations that describe the mathematics/logic properly, and then let the compilers optimize the code for you. This has been the case in MATLAB since the last century. Let professionals do the professional things. This is how we make advances in science and technology in general. Don’t do jobs that should be done by compilers/compiler developers. Otherwise, there will (only) be two possibilities:

1.) compilers will never learn how to handle basic array operations in a decent way because they are not motivated/requested; or

2.) compilers do make advances in basic array operations, but your (over-)optimized code will not benefit from these advances, if not become suboptimal.

Remember, the “optimisation coding styles” will change as compilers and hardware evolve, but the correct array operations that describe the mathematics/logic of your algorithm will not (or less likely).

1 Like

I agree with this. Same for SciPy (except they will rewrite it all in C or C++). Julia is showing what can be done if you have enthusiastic and forward-thinking users and stewards of the language.

Fortran people: If it ain’t broke, don’t fix it. If we get rid of implicit typing we might drive away users.
Julia people: We will rewrite 40 years of numerical linear algebra in Julia. It will take a while but the end result will be better than can possibly be imagined in FORTRAN 77.

7 Likes

While modernization is great, it would be even better if more and more new code were written in modern Fortran.

3 Likes