Backwards compatibility in different programming languages

The thought behind the term namespace is to relate to the fact that MODULEs in Fortran already provide some degree of namespacing, but not quite everything that might be sought in more modern programs; the new functionality intends to build on the existing one. Now of course, MODULEs do more than namespaces: as a program unit, they can also drive certain common semantics of the contained entities and subprograms. The new thingy tries to further integrate this across a set of modules.

So one can call it something else, say supermodule or foo or whatever. The idea is the same: offer practitioners the kind of features in @certik’s popular proposal, which is written from the consumption (use/import) point of view, but do so with further organization on the side of the modules themselves, whilst also allowing finer granularity in namespacing and including control over semantic overrides, so as to overcome the legacy that is otherwise unchangeable (for otherwise the sky will come falling down), such as implicit mapping and the default KINDs of constants, etc.

Fortran already has submodules, a feature that is well received by the teams I work with. My sense is the above thingy, whatever it be named, will be well received too, particularly if it offers genuinely useful capabilities. The bottom line is that implicit mapping and implied SAVE are extremely disliked. If the term supermodules better conveys what is intended here than calling it namespaces, then think of it that way.

However, there is no need to get hung up on the naming at this stage simply because it was referred to as namespace (please excuse any sense of a pun) in the first posting.

This could be…

Frankly, I’ve never perceived these two as a big problem among regular programmers.

For the Fortran newbies around me (there are not a lot of them, but we have a lot of Fortran codes, we regularly hire young people, and they have to deal with Fortran sooner or later) the implicit mapping rules are indeed surprising, but they generally understand the historical reason and quickly get used to putting implicit none everywhere. As for the implied saves, I warn them about how misleading they can be, and they also get it. These are actually non-problems for them; they understand that all languages have, to some degree, some weird aspects. They are much more critical about other aspects of the language, like the lack of nicely packaged libraries (which doesn’t mean good libraries, btw) for about everything, or the lack of genericity…

I mean, let’s not fight the wrong battle. Implicit mapping and implied save have issues, I agree, but these are minor compared to other challenges facing Fortran.

I request readers to note that the amount of new code being written now in modern Fortran, or even being contemplated, let alone considered, is for all practical purposes zero, even if it occasionally rises above the numerical epsilon. This is a fact even in technical computing domains that should otherwise be a forte for Fortran. An indirect measure of this is the latest programming language survey from IEEE Spectrum, where Fortran is virtually a dead language with a score around 0.59, compared to Python at 100 and C++ at about 88. IEEE Spectrum practically covers the interests of the largest group of engineers interested in coding floating-point calculations on modern computers, something that really took off only when IBM FORTRAN I first introduced the REAL type in a higher-level programming language back in the 1950s:

Whilst Fortran has many, many challenges that also require big language evolution, around generics and the handling of error termination of programs (whether initiated by a Fortran processor or by means other than a Fortran processor, particularly with FUNCTION subprograms), a couple of significant irritants for newcomers are indeed implicit mapping and implied SAVE.

The nasty semantics in the Fortran standard around implicit mapping and implied SAVE do indeed become the proverbial straw that broke the camel’s back, with many newcomers not bothering further with Fortran, and especially with managers, powers-that-be, and decision makers resolving not to select Fortran as the language of choice even for the scientific and technical computing components of new application development. The effects are all there to see in the IEEE Spectrum link above.

To dismiss these two detrimental features as effectively “mere straw”, and thus inform all newcomers to simply put up with them forever, only further hurts the language.

Moreover, resolving these two issues is really low-hanging fruit that can be plucked toward a cleanup of the language for the sake of posterity. It requires little if any effort on the part of the standard bearers, the compiler implementors, or the users. And the timeline involved is a decade-plus, perhaps a 15-20 year window, considering the earliest the changes can now be considered is the 203X standard revision; that is already far longer than the roughly ten years the Python community allowed for the Python 2 → 3 migration. And do not forget that Python rose to the very top among practitioners following such a backwardly incompatible transition, so what lesson does Python’s brave move impart here?

There is really no need to make a big deal about this in terms of community resistance to the request, particularly if you are using implicit none everywhere already. This is not a “battle”, for heaven’s sake, least of all the “wrong battle”; these issues can be addressed quickly and taken care of, and the language can move on to the big challenges. Even silence would be preferable to blowing it up as a “battle”.

The “old codes” are already broken in several other ways with respect to the current standard, or are not being compiled at all in any statistically significant circumstances by any modern processors, or already employ implicit statements fully, or at times partially, with a nonstandard IMPLICIT REAL*8(A-H, O-Z) statement when they could have made it a non-concern for everyone else with the full IMPLICIT REAL*8(A-H, O-Z), INTEGER(I-N) statement. But even otherwise, every Fortran processor, nonconforming as they usually are “off the shelf”, will support all such “old codes”. There is no risk whatsoever, nor any downside.
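
To make the distinction concrete, a minimal sketch (the REAL*8 form is nonstandard, as noted):

    ! Partial: letters I-N still rely on the language's default mapping.
    IMPLICIT REAL*8 (A-H, O-Z)

    ! Full: the entire mapping is spelled out, so this code would be
    ! unaffected if the default implicit mapping were ever removed.
    IMPLICIT REAL*8 (A-H, O-Z), INTEGER (I-N)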

To hold the language hostage to the indulgences and indolence of past practitioners and current maintainers, at the expense of future practitioners, is simply not right.

I don’t usually like breaking changes, but I tend to favor dropping implicit mapping. It’s a small thing, it hinders adoption, and I’ve come to think it should always have been a compiler directive.

Initially I thought a simpler solution would be to just add implicit all, which would enable type inference for all variables. But:

Type annotations are very useful, and there will inevitably be code released with no type annotations at all. OK, so if the code has implicit typing, we could require the user to pass --generate-type-annotations (which generates them in place) before compiling; that way, all code that has ever been run would have type annotations.

But then why add implicit all? Why not simply add a compiler directive and let users omit implicit none? Isn’t implicit none really a compiler directive, not a language construct?

Then I realized that CMake was only released in 2000, while the IMPLICIT statement was added in FORTRAN 77. If CMake or fpm had been a thing back then, maybe it would not be part of the language.

That’s probably true, but do you really, seriously, think that the implicit mapping default and the implied saves bear any responsibility for this, beyond an epsilon² one?

Sorry but this doesn’t correspond to any real world observation. There are much more important reasons why newcomers ignore Fortran, and which have been extensively discussed.

Fortran’s implicit rules have always caused trouble, but trouble that’s easily overcome. A colleague overseas told me decades ago that he couldn’t process my data until he remembered that my first name begins with J, and that he had chosen a name beginning with J for an array of real numbers mostly between -1.0 and +1.0, declaring it with a DIMENSION statement instead of a REAL statement. Both were valid f77 and both are still valid f2018, though ‘Modern Fortran Explained’ advises against using the former, for good reasons.
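
For readers who have not been bitten by this, a minimal sketch of the pitfall (names hypothetical): with only a DIMENSION statement, the implicit rules make a J-named array default INTEGER, so real values are silently truncated.

    program jtrouble
       dimension jdata(3)    ! no type given: J falls in I-N, so JDATA is INTEGER
       real :: rdata(3)      ! explicit declaration: RDATA is REAL
       jdata(1) = 0.7        ! silently truncated to 0
       rdata(1) = 0.7
       print *, jdata(1), rdata(1)   ! prints 0 and 0.7000000
    end program jtrouble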

2 Likes

I guess what I mean is: the standard should require compilers to have an --implicit-none flag, and build systems should set it by default. If you are compiling very old code you will just get a very clear error.
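
Something close to this already exists in practice: gfortran, for instance, has an -fimplicit-none flag. A minimal sketch of the kind of clear error meant here (file and program names hypothetical; the exact message wording varies by compiler version):

    ! old.f90 -- legacy code relying on implicit typing
    program old
       i = 42       ! no declaration: I is implicitly INTEGER
       print *, i
    end program old

    ! Compiled as: gfortran -fimplicit-none old.f90
    ! gfortran rejects it with something like:
    !   Error: Symbol 'i' at (1) has no IMPLICIT type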

Any programming language has to be taught, and learnt, because beyond the syntax there are many things to know. Any Fortran course nowadays should state that implicit none (or actually implicit <whatever>) should be part of any module/routine/main program, and that an initialization in a declaration implies a save. And once you know, you know.

I never formally learnt C (let alone C++), and I don’t practice often enough. When I have to write or modify pieces of C/C++ code, I regularly make mistakes because these languages do not have the same underlying logic as Fortran. My first reaction is often “wtf is this language!”, but no, it’s actually just me. The same goes for Python, actually: Python looks easy, but there are some important things to know that cannot be guessed.

Let me try to give a broader view on backward compatibility, beyond the “implicit none: yes or no” discussion that ends many threads in this forum. Let’s start with the statement: we live in a world where backward incompatible changes are ubiquitous:

  1. Streaming has replaced video tapes and CDs
  2. HDMI and DisplayPort have replaced VGA
  3. Steam engines have been replaced by electric machines

So it seems that the advantages of using a new technique often outweigh the obvious benefits of backward compatible solutions. It can also be observed that in many cases a change in technology (whether abrupt or not) leads to a decline of old players and the rise of new ones. BASF, for example, had a relevant market share in music cassettes but is not at all a player in video streaming.

Coming back to Fortran: I’m teaching in a master’s program called “Mathematical Engineering”, and about 80% of the students prefer C++ over Fortran. For them, Fortran is the video tape of programming languages. If students trained to write code for scientific and engineering applications don’t use Fortran, who else should use it?

My understanding of the current state of Fortran is the following: avoiding backward incompatible changes at almost all costs has relieved the pressure on code owners to modernize their software. This is certainly a good thing in the short or medium term because it reduces maintenance costs. In the long term, however, it results in legacy code that does not follow current best practices. This legacy code can’t be modernized anymore and will eventually be replaced by something completely new. In many cases, this re-implementation is not in Fortran. We have a backward incompatible change.

Of course, the question is: what is wrong with old code that works? Besides the actual improvements to the language, the way of developing software has changed since Fortran was punched into cards: unit tests and version control reduce the cost of adjusting code to new requirements. To me, the problem of legacy code is not that it uses implicit typing and common blocks. To me, the problem is that no one dares to modernize the code because the consequences are unpredictable.

To make the connection to the initial list of breaking changes: these are all hardware examples. On the software side, the situation is technically much easier because one can install multiple versions of the same software at the same time. And that would IMHO also be my solution for Fortran: I don’t see the problem with breaking changes between different standard versions as long as the compilers still support older versions. In other words: use -std=f77 to indicate that you have lost control over your code and ignore 30 years of advancement in software engineering, or show with -std=f2018 that you have a set of unit tests that enables you to re-factor your code if required.

My conclusion is the following: backward incompatible changes in a versioned language nudge developers towards continuous changes, while enforcing backward compatibility gives a wild card for leaving code untouched.

4 Likes

We all agree on this. But really, what does it have to do with the default implicit mapping and so on? Have you ever heard a student telling you “Oh, now that they have deprecated/deleted <whatever already deprecated/deleted feature>, I can make Fortran my preferred language”?

I am, on the contrary, convinced that the standard has little impact on that. As long as old codes are useful to a large enough group of users, the compiler vendors are under pressure to continue supporting all possibly deleted features that are used by these codes.

They see code as in LAPACK and associate this with Fortran. Even enforcing the 2008 standard allows one to write code that fits on a punch card.

I didn’t say that compiler vendors should not support compiling old code. I just say that it would make sense if old code were compiled with a compiler, or a compiler flag, for old code, while new code needs to be compiled with compilers for newer code. Speaking in terms of the example: if you want to use an old screen with a VGA connector, go ahead and attach it to your 2000s PC. But don’t expect your 32" 8K screen to support it. This is because the screen industry realized that sticking to old standards hampers progress.

The Fortran attitude of “write once, compiles forever” gives IMHO the wrong impression that code ‘just’ needs to run. In reality, it needs to be adapted to new requirements. Again, LAPACK is the prime example:

  • on a modern monitor, there is space for descriptive names, and the restriction to cryptic, six-letter abbreviations is not needed.
  • letting the compiler check whether one calls a routine with the right arguments is less error prone than cross-checking manually against the documentation.
  • checking for EBCDIC encoding (see Resistance to modernization)
  • not to miss the obligatory implicit none: saving a few statements might have been beneficial for punch cards, but “explicit is better than implicit” does not only hold for Python.

I totally agree that LAPACK is still useful, but it would be much more useful if it had been modernized. In my opinion, continuous modernization is the best approach because it divides the work into small, manageable packages. An evolving standard that is always based on the best approach known at the time of writing, instead of sticking to old approaches that have been proven wrong, supports this modernization process, IMHO.

3 Likes

Interesting read from Rust: What are editions? - The Edition Guide

3 Likes

The way LAPACK is being modernized is via a rewrite in C++ - <T>LAPACK.

3 Likes

From a programming language user’s point of view, editions in Rust feel like a very good way of breaking backwards compatibility. I think there are things to be learnt here for other languages.

Another approach, though maybe not as obvious from the outside, is the Java/JVM ecosystem: there are multiple “newer” languages, like Scala and Kotlin, which compile to JVM bytecode. They are completely different languages from Java, and their syntax is largely incompatible. Because they all run on the JVM, however, they can seamlessly interoperate with Java. This makes introducing a completely new programming language into legacy Java systems relatively easy. See for example https://kotlinlang.org/docs/java-interop.html

1 Like

This is absolutely true; we’ve had lots of real feedback on this from teams who have been adversely impacted by inadvertent mistakes by developers, almost all of whom have to code in multiple languages, who have little to no background in Fortran, and who face tremendous challenges in dealing with software components that include Fortran.

To reiterate, the highly detrimental features of implicit mapping and implied save have become the straw that broke the camel’s back because they are easy to explain to a third/neutral party. And there have been real issues with type safety, or the lack thereof, due to overlooked implicit none statements, especially in INTERFACE blocks, and also severe problems with thread safety due to implied SAVE in Fortran subprograms.
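
A minimal sketch of the INTERFACE-block hazard mentioned above (routine and argument names hypothetical): an interface body has its own scope, so a host’s implicit none does not reach into it, and an undeclared dummy argument silently falls back to implicit mapping.

    interface
       subroutine scale_vector(v, factor)
          real(8) :: v(:)
          ! "factor" is never declared, and the host's implicit none does
          ! not apply here: it becomes default REAL under implicit mapping,
          ! even if the actual routine declares it real(8) -- a silent
          ! type mismatch the compiler will not flag.
       end subroutine scale_vector
    end interface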

Fixing these two issues by the 203X standard revision is a small project, both technically and in terms of effort, and the effect is absolutely inconsequential from a backwards compatibility point of view.

Retaining these two dirty semantics in the language means Fortran will rightly be perceived as not conducive to writing type-safe and thread-safe code. That is not good for Fortran.

It will be far better for modern Fortran to finally remove these two impediments from the language, at least by 203X. It is not at all a big ask.

To not do so is a tremendous disservice to many current practitioners and all newcomers to the language.

People who have never learnt Fortran are making mistakes when writing pieces of Fortran: what a surprise!!

1 Like

For Fortran to survive it has to be easier to use than every other language that is as fast. Every sharp edge makes that harder to achieve.

3 Likes

I’ve written Fortran for quite a few years now and know many of the pitfalls. I can avoid them, but I’ve helped out Fortran newcomers who have fallen into pitfalls from outdated language constructs, and said things like “no, integer :: i = 0 doesn’t do what you think”.
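
For those who have not hit it, a minimal sketch of that pitfall (routine name hypothetical): the initialization happens once, and the implied SAVE makes the variable persist across calls.

    subroutine count_calls()
       integer :: i = 0   ! initialized once; implied SAVE: i persists across calls
       i = i + 1
       print *, i
    end subroutine count_calls

    program demo
       call count_calls()   ! prints 1
       call count_calls()   ! prints 2 -- i was NOT reset to 0
    end program demo

The per-call behavior most newcomers expect requires separating the declaration from the assignment: integer :: i on one line, then i = 0 as an executable statement.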

Now, what happens if we’re going to start a new project from scratch and somebody asks me which programming language I think we should use? Do I say they should use Fortran, and prepare for yet another round of explaining why the code doesn’t do what they expect it should? Or would I recommend a language with fewer pitfalls? While I think modern Fortran has a lot of good characteristics, I’m honestly not sure I’d recommend it every time, even if it would theoretically be a good fit.

In my opinion, deleting obsolete features (though with a mechanism for backwards compatibility) isn’t just about making the language easier for newcomers. It could also tip the scale in favour of Fortran for more projects which would otherwise choose other languages.

6 Likes

My point was: if the compilers de facto continue supporting deleted features (and they do), then there’s still nothing that pushes developers to regularly update their codes, whatever the standard says.

About Rust, I can read in the link you gave: “Ever since the 1.0 release, the rule for Rust has been that once a feature has been released on stable, we are committed to supporting that feature for all future releases.” If you translate that to a language that is ruled by a standard, it means that features are never deleted from the standard.

About the first two points: they could have been easily solved starting 25 years ago, just by releasing a standard wrapper module (or an include file with interfaces) for BLAS/LAPACK, without touching the old codes at all. Instead of calling DGEMV() you would call blas_general_matrix_vector(), with all the arguments checked, and this one would call DGEMV(); see the sketch below. Why it didn’t happen is the question, and the answer won’t involve implicit none, the implied saves, or any other obsolescent feature.
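
A minimal sketch of such a wrapper (the module and routine names follow the post and are illustrative; only the call to the classic DGEMV is real):

    module blas_wrappers
       implicit none
    contains
       ! Checked, explicit-interface wrapper computing y := alpha*A*x + beta*y
       ! by delegating to the legacy DGEMV, which stays untouched.
       subroutine blas_general_matrix_vector(a, x, y, alpha, beta)
          real(8), intent(in)    :: a(:,:), x(:)
          real(8), intent(inout) :: y(:)
          real(8), intent(in)    :: alpha, beta
          external :: dgemv
          if (size(x) /= size(a,2) .or. size(y) /= size(a,1)) &
             error stop "blas_general_matrix_vector: shape mismatch"
          call dgemv('N', size(a,1), size(a,2), alpha, a, size(a,1), &
                     x, 1, beta, y, 1)
       end subroutine blas_general_matrix_vector
    end module blas_wrappers

Because the wrapper has an explicit interface, every call gets compile-time argument checking, which is exactly the benefit described above.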

1 Like

I believe you… But exactly the same thing happens to me: as a perpetual C/C++ newbie I have fallen into pitfalls when writing some C, because I was thinking “The Fortran Way”. In all languages there are pitfalls that can be misleading when you are not aware of them; Fortran is definitely not a special case.

One of the bottom lines is: in Fortran there is nothing like a variable initialization on declaration at runtime. Every time you see an initialization on a declaration, it happens at compile time, which implies that the variable is static: this is true for local variables and for module variables. This is consistent. People tend to think otherwise because they are used to the C behavior, but Fortran is not C; the paradigms are not exactly the same.

You should recommend a language that fits the project AND that is mastered by the people who will write the code. This can imply some training for them. Would you recommend using C++ with people who have never seriously learnt C++? (It actually happens, and it results in garbage code.)