Why is C used in many optimization libraries?

fortran4r · January 28, 2022, 5:56pm

This is a side question. C also does not have templates, but many great optimization libraries such as CPLEX and NLopt are written in it. Why didn’t the same happen with Fortran?

jacobwilliams · January 28, 2022, 6:19pm

Lots of reasons I guess. You don’t need templates for most of this. C has dynamic memory and structures. You do need that and FORTRAN 77 didn’t have it, so that’s when we lost some projects. By the time Fortran got those things, it was too late. For more complicated codes, you need classes, so we lost some codes to C++. By the time Fortran got those, it was too late. Now, very complicated codes need templates, automatic differentiation, etc., and by the time Fortran gets that (if ever) it will be too late. Meanwhile, the Fortran ecosystem continues to be poisoned by the existence of all this FORTRAN 77 code that is still floating around and people are still using.
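
To make the "dynamic memory and structures" point concrete, here is a minimal sketch of what C offers out of the box. The names `problem_t`, `problem_create` and `problem_destroy` are hypothetical and not taken from any real library. The struct bundles a problem size with bound arrays whose length is only known at run time.

```c
#include <stdlib.h>

/* Hypothetical problem description, for illustration only. */
typedef struct {
    size_t n;       /* number of variables, chosen at run time */
    double *lower;  /* lower bounds, length n */
    double *upper;  /* upper bounds, length n */
} problem_t;

/* Allocate a problem with n > 0 variables, each bounded by [lo, hi].
   Returns NULL if any allocation fails. */
problem_t *problem_create(size_t n, double lo, double hi)
{
    problem_t *p = malloc(sizeof *p);
    if (p == NULL) return NULL;

    p->n = n;
    p->lower = malloc(n * sizeof *p->lower);
    p->upper = malloc(n * sizeof *p->upper);
    if (p->lower == NULL || p->upper == NULL) {
        /* free(NULL) is a no-op, so partial failure is handled too */
        free(p->lower);
        free(p->upper);
        free(p);
        return NULL;
    }

    for (size_t i = 0; i < n; i++) {
        p->lower[i] = lo;
        p->upper[i] = hi;
    }
    return p;
}

/* Release everything owned by the problem. Safe to call with NULL. */
void problem_destroy(problem_t *p)
{
    if (p == NULL) return;
    free(p->lower);
    free(p->upper);
    free(p);
}
```

A caller would write something like `problem_t *p = problem_create(3, -1.0, 2.0);` and later `problem_destroy(p);`, without having to declare maximum sizes or pass workspace arrays around.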

Beliavsky · January 29, 2022, 1:22pm

The Fortran ecosystem benefits from the existence of all the FORTRAN 77 code online, because
(1) Some people, still more familiar with FORTRAN than modern Fortran, will use them as is.
(2) Developers of R, Matlab/Octave, Python/Scipy etc. have decades of experience interfacing with such codes.
(3) We can provide modern Fortran interfaces to such codes.
(4) We can translate such codes to modern Fortran, as you and others have done.

jacobwilliams · January 29, 2022, 2:20pm

I tend to disagree with the thinking that all the ancient FORTRAN 77 code is good for Fortran in some way. No young person who sees this code wants anything to do with Fortran. It reinforces the perception of Fortran as some kind of ancient legacy machine code for math. I think it is very harmful.

I also think that it is inevitable that all of that FORTRAN 77 code in SciPy is going to be removed. It’s just a matter of time. It’s already happening with fftpack. No one wants to work on it. And for good reason, it’s a scene of horrors. Some of it has bugs from decades ago (recent bugs were found in QUADPACK).

I’m also not a big fan of modern interfaces to old FORTRAN 77 code, for the same reasons. Also, when you do that, the namespace of your project is still polluted with all those subroutine names, and you can still call them (with no interface checking) since nothing requires you to use the module. So if you forget, you are back in 1977 without realizing it. I have been bitten by this before.

Modernization is the way to go!

Beliavsky · January 29, 2022, 2:38pm

The FORTRAN 77 code could be converted to free source form with a tool and put in a module along with modern Fortran wrappers. The module can be declared PRIVATE with only the wrappers given the PUBLIC attribute, so that the FORTRAN code is inaccessible outside the module.

I am wary of rewriting code and introducing more bugs, but people will differ on this, based on how good they are at programming in general and how well they understand the algorithms in a domain.

jacobwilliams · January 29, 2022, 2:45pm

Oh I agree totally! That’s a fine approach. Then you have it in good shape for further modernizations if necessary.

ivanpribec · January 29, 2022, 2:59pm

If you are wary of rewriting and inadvertently introducing bugs, but afraid of polluting the namespace with external subroutines, it is technically perfectly fine to leave the code fixed-form and just introduce a module around it, and preferably add the type and intent declarations.

The key difference I see in modernization is getting rid of non-standard constructs, manual workspace arrays and other oddities.

jacobwilliams · January 29, 2022, 3:59pm

But note: I’ve seen that there are some constructs in the old FORTRAN 77 standalone routines that aren’t allowed when you put that routine in a module. So sometimes some minor mods are still necessary. So you might as well fix those. And now that you are editing the code, you might as well convert it to free-form first so you don’t have to deal with that nonsense. Then you might as well fix some duplicated code that isn’t needed anymore (e.g. constants defined over and over again that can just be moved to be module parameters). Then you are well on your way…

zaikunzhang · January 30, 2022, 4:18am

jacobwilliams:

I tend to disagree with the thinking that all the ancient FORTRAN 77 code is good for Fortran in some way. No young person who sees this code wants anything to do with Fortran. It reinforces the perception of Fortran as some kind of ancient legacy machine code for math. I think it is very harmful.

I also think that it is inevitable that all of that FORTRAN 77 code in SciPy is going to be removed. It’s just a matter of time. It’s already happening with fftpack. No one wants to work on it. And for good reason, it’s a scene of horrors. Some of it has bugs from decades ago (recent bugs were found in QUADPACK).

I’m also not a big fan of modern interfaces to old FORTRAN 77 code, for the same reasons. Also, when you do that, the namespace of your project is still polluted with all those subroutine names, and you can still call them (with no interface checking) since nothing requires you to use the module. So if you forget, you are back in 1977 without realizing it. I have been bitten by this before.

Modernization is the way to go!

I have almost the same opinions. When legacy packages can no longer be understood or maintained (let alone extended), the legacy may do more harm than good. They must be modernized, or else they will be abandoned when nobody wants to work on or look at them anymore.

wyphan · January 30, 2022, 5:09am

fftpack

I worked on code that embedded a modified version of FFTPACK5 in its source code directory. One of the first things I did to the FFT interface in that code was to switch it to FFTW. That saved so much headache from trying to compile the code with a modern Fortran compiler. IIRC starting from GFortran 10 the unmodified code won’t compile anymore due to the improper access of COMPLEX arrays as REAL.

Ashok · January 30, 2022, 5:16am

There is a project called f2j, which converts FORTRAN 77 programs to Java classes.
The main motivation for developing it was to get BLAS and LAPACK running on the Java Virtual Machine.
The code can be found on Netlib under java/f2j.
It is written in C.
If FORTRAN 77 can be converted to Java, it must certainly be possible to convert it to modern Fortran.

certik · January 30, 2022, 5:20am

Yes, all such Fortran code is on the way out, not just in SciPy. Nobody wants to work with it. The way forward is to use modern Fortran.

Where I differ with Jacob is that rewriting introduces bugs. For example the big rewrite here: https://github.com/fortran-lang/fftpack/pull/18 introduced several bugs. We caught some in the review, but how many are there left that we didn’t catch?

We also didn’t do any benchmarking besides what I have done for larger arrays in the PR. How do we know we didn’t slow things down for small arrays? Is that not important?

So yes, we have to modernize, but we have to be careful about introducing bugs (we need good tests) and performance (we need good benchmarks).

mecej4 · January 30, 2022, 3:02pm

Sometimes, the old code itself may contain bugs that never surfaced on the mainframes and minicomputers on which it was developed and tested. If we convert it or rewrite the old code, the converted code may still contain that bug, and a modern compiler or a new application can make that bug come alive. If that happens, it can take quite a bit of work to assign or shift blame, and it is likely that the people who did the conversion will be blamed, because “nobody had seen such errors for 40 years in that software”.

An example of this was discussed in a comp.lang.fortran thread three years ago. EISPACK was published in the mid-1970s, and the bugs were observed in 2018 by Jack Wolfman.

jacobwilliams · January 30, 2022, 4:20pm

Yes, agree 100%. Ideally, a big refactor should be accompanied by testing and benchmarks.

jacobwilliams · January 30, 2022, 4:28pm

Absolutely. The Fortran giants of the 1980s made mistakes just like we do. Some of these old codes have bugs that have not been noticed, and those bugs carry over to all the languages that the old FORTRAN 77 has been translated into (see the recent minor bugs found in QUADPACK). One of the bugs that @certik mentioned, which was introduced in the FFTPACK refactor, was caused by a comment that was incorrect and had been incorrect for decades. It’s one of the reasons we need to resurrect these old codes from their Netlib graves and provide a single definitive place where they are being worked on and improved, and where people can go with bug reports. The Fortran-lang GitHub is the natural place for that I think.

Jweber · January 30, 2022, 8:45pm

I am not personally involved in the development of a large-scale library, but I guess one factor might be that C is kind of the lingua franca of programming, in the sense that a lot of programming languages provide standardized interfaces to C.

I know that it is also possible to interface Fortran code, as is done for R or SciPy. But as far as I understand, this has to be done either in a non-standardized way or by using C interoperability in both Fortran and the other programming language.
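
As an illustration of what "interfacing to C" tends to look like, here is a minimal sketch of a solver entry point that uses only plain C types: a function pointer for the objective and a `void *` for the caller's data. The names `objective_fn`, `minimize_golden` and `shifted_square` are hypothetical, and golden-section search is used only because it is short; it assumes the objective is unimodal on [a, b] and that tol > 0.

```c
#include <stddef.h>

/* Objective callback: receives x and an opaque pointer to caller data. */
typedef double (*objective_fn)(double x, void *user_data);

/* Golden-section search for the minimizer of f on [a, b].
   Assumes f is unimodal on the interval and tol > 0.
   Hypothetical interface, for illustration only. */
double minimize_golden(objective_fn f, void *user_data,
                       double a, double b, double tol)
{
    const double invphi = 0.6180339887498949; /* (sqrt(5) - 1) / 2 */
    double c = b - invphi * (b - a);
    double d = a + invphi * (b - a);
    double fc = f(c, user_data);
    double fd = f(d, user_data);

    while (b - a > tol) {
        if (fc < fd) {
            /* minimum lies in [a, d]; old c becomes the new d */
            b = d;
            d = c;
            fd = fc;
            c = b - invphi * (b - a);
            fc = f(c, user_data);
        } else {
            /* minimum lies in [c, b]; old d becomes the new c */
            a = c;
            c = d;
            fc = fd;
            d = a + invphi * (b - a);
            fd = f(d, user_data);
        }
    }
    return 0.5 * (a + b);
}

/* Example objective: (x - shift)^2, with shift passed via user_data. */
double shifted_square(double x, void *user_data)
{
    double shift = *(const double *)user_data;
    return (x - shift) * (x - shift);
}
```

With `double shift = 1.5;`, the call `minimize_golden(shifted_square, &shift, 0.0, 4.0, 1e-8)` returns a value close to 1.5. Because the signature involves nothing but doubles and pointers, a foreign-function interface in another language can call it directly. The standardized Fortran route to the same kind of interface is `bind(C)` together with the `iso_c_binding` module.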

MarDie · January 30, 2022, 9:14pm

This is very important. The libraries are not only written in the programming language of our (grand)parents, they were also developed with the tools that were available 30 years ago. That means: no internet, no version control, no virtual machines for testing, …

If you take this historic perspective, the fear of changing/touching something and staying backward compatible at all costs can be understood. But it is simply not how software engineering is done today.

Certainly, but given the means that they had, the software quality is very impressive. I even fail to use LAPACK with its short function/variable names. Still, I think that with modern tools even a crowd of less skilled programmers can do a good job of modernizing legacy code.

Interestingly, there is a trend in systems programming to move away from C simply because it is very hard to write bug-free code in it. Rust is much hyped as a replacement. In science and engineering it is not even necessary to change the language: one can gradually move to modern Fortran to get checks that prevent out-of-bounds access or type mismatches. Choosing C with its 1001 possibilities to shoot yourself in the knee is a very strange choice.

certik · January 30, 2022, 10:44pm

@jacobwilliams I think we want the same thing. The reason I am so “conservative” to change these old libraries is that the only tests we have are the actual 40 years worth of usage of the codes. So what is really needed, as you said, are:

Good benchmarks (for fftpack: small and large arrays of various sizes, to exercise all the kernels of radix 2, 3, 4 and 5)
Good tests

And then as long as we did not slow things down or break anything, we should refactor and rewrite using more modern Fortran as much as we want!

I have contributed benchmarks to go along with the PR, but my benchmarks are not sufficient. I need help from others to contribute benchmarks per my guidelines above. Otherwise it is way too easy to slow things down without even realizing it, and we should not do that under any circumstance.

Indeed. There are several languages that aim at replacing C (among other goals, such as also replacing C++):
