Implied save behavior for defined-and-initialized variables

Greetings everyone;

I know I pop up and disappear sporadically here, but once again thank you for your patience. This time I am rewriting another F77 code, the well-known monolith that is ARPACK, for SciPy. I have some limited expertise on this subject, and I could typically follow the algorithm since I have already translated PROPACK. Hence, on paper, things should be easier; however, I wasted a good chunk of an entire week on the behavior discussed in the title.

I found this through an invaluable nugget given in

then searched here and then found

and also

I can see that there is a lot of debate around this, and after wasting a very frustrating week on this subject, I can tell why from firsthand experience. The bug I was hunting involved variables in SAVE and COMMON blocks, which sent me in the wrong direction: I assumed something was going wrong with those variables somewhere else, without knowing that implicit save is a thing in Fortran. In fact, it was this implicit save that was making things go awry. Seemingly, there is no end to these things in F77; even the most innocent definitions are landmined.

However, let us not have any further discussions, since the links above have ample amount.

Could I ask for a definite explanation of this behavior or the rules of the game in a single place for posterity in case someone also has the luck of hitting this (in case they figure it out in the first place)? If any, please include extra gotchas that you are aware of.

And specifically: Gotchas - Fortran Programming Language

I've checked that page too (after I figured things out), but that static analogy is not entirely correct. Or rather, it is only correct for a variable inside a C function.

In a translation unit, static makes the declaration local to the file so that you don't pollute the namespace. Alternatively, if defined outside a function, it is still initialized to zero; that is to say, static int myvar; as a standalone statement makes myvar equal to 0 implicitly. Also, in C the user does this explicitly by adding a clear static keyword, so you actually need to do extra work to turn this behavior on.

This conditional behavior change of static is not good or ergonomic (it points to "static" storage in one case and to namespace in the other, and both are unintuitive), but it is at least googlable and clear in its utility. I can't say that is the case for this bizarre syntax in F77. Am I understanding correctly that there is nothing else to it other than this SAVE behavior?

You're right, the text should be modified.

If you're still speaking about

real [, save] :: x = 0.5

where x is not declared as a parameter elsewhere, then it's not valid F77. This syntax was introduced in F90, and the F77 equivalent (which is still valid) was:

real x
data x /0.5/

The variable is static and initialized to a value at compile time.

It is a little more complicated than that. If the variable is not modified, then in f77 the above was sufficient to guarantee that the value of x would remain 0.5 on subsequent calls to that subprogram. However, if the variable was modified within the subprogram, its value on subsequent calls was undefined. One might imagine that the variable would either be reinitialized to 0.5 on subsequent calls, or that it retained its last value on subsequent calls, but the f77 standard was silent on that and just declared that its value was undefined. In practice, with overlay linkers a compiler might have even supported both conventions. In order for the modified value to be retained on subsequent calls the variable needed to be saved, e.g.

save x

or one of the other ways to indicate that it was a saved entity. In f77, a separate statement of some kind was required, it was not possible to add the attribute to its declaration; that syntax came later with f90.
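The explicit form described above can be sketched as follows. This is only an illustration, assuming a hypothetical subroutine `tick` that counts its own calls; it is written in free form for convenience, but the three specification statements inside `tick` are the F77-style ones (type declaration, `save`, `data`).

```fortran
program explicit_save_demo
   implicit none
   integer :: i

   do i = 1, 3
      call tick()
   end do

contains

   ! Hypothetical helper: counts how many times it has been called.
   subroutine tick()
      integer calls
      save calls          ! explicit SAVE: the modified value survives between calls
      data calls /0/      ! initialized once, before execution starts

      calls = calls + 1
      print *, 'tick has been called', calls, 'time(s)'
   end subroutine tick

end program explicit_save_demo
```

With the `save` statement present, the count runs 1, 2, 3 and the intent is visible to the reader, rather than depending on how a particular compiler treats initialized variables.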

The problem was that many f77 programmers did not save the value, yet they expected the modified value to be retained on subsequent calls anyway, because that is how their particular compiler treated the variable. Then the fortran standard committee decided, against many valid arguments otherwise, to make implicit save for data statements standard (along with the equivalent initialization within the declaration statement). This made many previously nonconforming codes standard conforming, but at the expense of clarity to the programmer and at the expense of simple semantics for recursive and parallel subprograms. There are also unintended performance aspects of this decision. What the committee should have done was to make the explicit save attribute required for any entity that was initialized, either on the declaration statement or in a data statement, and require compilers to print error messages when save was not explicit. This would have required modifications of the old nonstandard codes in order to compile and run correctly with new fortran compilers (which I think would have been a good thing).
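To make the recursion point concrete, here is a small sketch. The function names `sum_shared` and `sum_local` are made up for illustration. The first one initializes its accumulator in the declaration, which implies SAVE, so every recursion level and every later call shares one copy; the second assigns at execution time, so each invocation gets its own variable.

```fortran
program implied_save_recursion
   implicit none
   integer :: r

   r = sum_shared(3)
   print *, 'shared accumulator, 1st call:', r   ! 6
   r = sum_shared(3)
   print *, 'shared accumulator, 2nd call:', r   ! 12, stale state from the 1st call

   r = sum_local(3)
   print *, 'local accumulator,  1st call:', r   ! 6
   r = sum_local(3)
   print *, 'local accumulator,  2nd call:', r   ! 6

contains

   ! Initialization in the declaration implies SAVE: acc is never reset.
   recursive function sum_shared(n) result(total)
      integer, intent(in) :: n
      integer :: total
      integer :: acc = 0

      acc = acc + n
      if (n > 1) then
         total = sum_shared(n - 1)
      else
         total = acc
      end if
   end function sum_shared

   ! Assignment at execution time: acc is a fresh variable per invocation.
   recursive function sum_local(n) result(total)
      integer, intent(in) :: n
      integer :: total
      integer :: acc

      acc = n
      if (n > 1) acc = acc + sum_local(n - 1)
      total = acc
   end function sum_local

end program implied_save_recursion
```

The usual way to avoid the trap is the second form: declare the variable without an initializer and assign it in the executable part, keeping `= value` in declarations for parameters and for variables that are meant to be saved.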

However, now that implicit save has been in the language for over 20 years (I think it was introduced in f2003), I doubt that this mistake can be corrected without causing many other problems. So at this point, I think it is something that fortran programmers must deal with in perpetuity. This does demonstrate an important aspect of language design, that it is much easier to add something to the language than it is to remove it.

These are very illuminating remarks, so thank you both. As an outsider, I find it difficult to understand why fortran has the tendency to make things implicit, as if that will make things convenient or reduce the typing, when it actually makes the code exponentially more difficult to read. But anyway, I don't have a dog in that race, so once again thanks for the explanations. Now I know where to place more print statements.

Mostly because of the history. Fortran was the first high-level language, almost 70 years ago. In that context, it was a huge improvement in terms of readability over assembly. Nonetheless, I guess that reducing the amount of typing was still desirable at that time, when punch cards were used.

Then, the strong emphasis on backward compatibility has prevented changing these basic rules along the years. Instead, the implicit none statement appeared, to at least state at the beginning of the program units that all variables must be explicitly declared.

About the "implied save" introduced in F90, almost everybody agrees today that it was a mistake. Things like that happen in all languages.

Arf, it had always been unclear to me, thanks for the clarification.

@ilayn modern Fortran is explicit except a few corner cases and those are tackled by compiler warnings and errors. Old Fortran code is often not, as you found out.

Poor design in hindsight from a time when each character was laborious to write out/include.

Vehemently defended until the end of time by those that want "if it compiled once it should compile and run forever" to remain true at all costs. Implied save, implicit none, double precision pi=acos(-1.0), the list goes on of remaining gotchas and poor design decisions that will seemingly never be corrected.
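For readers who have not met the third item: the literal `-1.0` is a default real, so `acos` is evaluated in single precision and only then widened to double. A minimal sketch, assuming a compiler where default real is the usual 4-byte type (the kind name `dp` is just a local convention):

```fortran
program pi_precision_demo
   implicit none
   integer, parameter :: dp = kind(1.0d0)
   real(dp) :: pi_single, pi_double

   ! The argument is default real, so acos runs in single precision;
   ! the result is converted to double only on assignment.
   pi_single = acos(-1.0)

   ! A double precision argument selects the double precision acos.
   pi_double = acos(-1.0_dp)

   print *, 'pi from acos(-1.0)    :', pi_single
   print *, 'pi from acos(-1.0_dp) :', pi_double
   print *, 'difference            :', abs(pi_double - pi_single)
end program pi_precision_demo
```

Under that assumption the difference is on the order of 1e-7, far larger than double precision rounding, and nothing in the declaration hints at the loss.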

The first two are already fixed by LFortran (warning for first, error for second). The third is on our TODO list. And so on, for every item on your list. :slight_smile:

Upthread there is a blurb, "the strong emphasis on backward compatibility has prevented changing these basic rules along the years".

But where was "the strong emphasis on backward compatibility" when the standard committee nonchalantly set the far more dangerous precedent that allows a Fortran standard revision to change the behavior of conforming programs?

Consider the silly example below:

   character(len=:), allocatable :: s
   allocate( character(len=10) :: s )
   write( s, fmt=* ) "Hello"
   print *, "len(s) after write: ", len(s)
   print *, "Expected length is 10 with a program conforming to standards earlier than 2023,"
   print *, "but 5 starting Fortran 2023."
end
  • program response using one current processor
C:\temp>gfortran -ffree-form p.f -o p.exe

C:\temp>p.exe
 len(s) after write:           10
 Expected length is 10 with a program conforming to standards earlier than 2023,
 but 5 starting Fortran 2023.

C:\temp>

I was the only person at the J3 meeting in Fall 2020 earnestly requesting the voting members on the committee to factor in backward compatibility around the semantics of an otherwise good feature in the 2023 revision. None of the voting members paid heed.

This change actually affected one data processing application among the teams I work with in industry. This application, as structured by a group of people across different departments, had a Fortran layer working with some data, where the program depended on internal WRITEs not altering the string length (the equivalent of that of s in the example above). There was a significant cost of change to refactor this code so that it would not break starting with Fortran 2023.

So when it came to a situation involving some "mere mortal Fortran practitioners", i.e., the teams I work with, the Fortran committee did not care one bit about backward compatibility; their response effectively was, "go pound sand".

And with the context in this thread, countless users are actually HARMED by the utterly nonsensical semantics around "implied SAVE" behavior, and yet the very same cast of characters on the committees will refuse to do anything to help with the situation, so "conveniently" hiding behind "backward compatibility".

Seriously, what gives?

This is why I repeatedly inquire, "For whom Fortran, for what!?"

Yes, the committee is not consistent, but it's not done with malice. I think it's due to the process: if nobody is working on some new big backwards incompatible feature, then the response is "we need to be backwards compatible" (rightfully in my opinion), but if established people at the committee are submitting papers for some feature, then it can break backwards compatibility (it's questionable if it is worth it in my opinion).

But you can't fix that from here, so either run for a chair, or worry about things you can change. :slight_smile:

Mmm, I didn't know that. I would have expected write to not change the length, but if you had just done s='Hello' then that would reallocate to length 5.

Very nice. Another confirmation that the new compiler is in good hands.

I have absolutely no background in the fortran committee issues but if I understood correctly from the other thread you linked to, I can offer you some perspective from my professional domain, industrial automation, for which Astroturfing is the norm.

There are industrial protocols, almost all closed source, where you pay ridiculous money upfront as a company just to download the standard PDF; don't get me started on the cost of getting the compatibility certificate. In fact this is quite common; DALI, BACnet, OPC-UA and so on are very old and very ... <insert some negative connotation here>. All of them say that they are open, not always quite understanding what open really means.

These old programming languages (C/C++/Fortran/COBOL/JS...) are from the same era, when typically everything that needs consensus is run by a committee. Sometimes this is very nice in that it provides a common direction and sense; sometimes, as in the programming languages, it is an anachronistic practice, left over from the past, driving many insane; cough... C++ ... cough. But I think in your case you would be interested to listen to Brendan Eich talking about the ECMA TC39 committee: My TXJS talk (Twitter remix) - Brendan Eich, from 2011. And JS is no fortran. You are dealing with giant companies and people's internet, so the stakes are much higher. I like how Python is driven without a committee, but even that has its own issues. So this is not a simple problem to deal with.

However, what you cannot do is stick your oar in. The committee, by its definition and function, does not listen to you or even hear you (you = non-committee members); otherwise it cannot function. So it is a very unproductive situation for you to keep pounding at the door. There are two options for you to take: one is, like @certik mentioned, to get on the committee and stir up the soup from inside, OR make your own work better (brand new language / fork off of Fortran / just keep using it while avoiding the bad parts of it).

Then again, I have absolutely no dog in this race, hence I'll take my win, which is learning about the weird "implicit save" behavior, and run away :slight_smile:

I was not aware of this feature. It is described in section 12.4 of the f2023 standard, and it is mentioned in the compatibility sections 4.3.3, 4.3.4, and 4.3.5. The reallocation also occurs for the IOMSG and ERRMSG arguments. It is allowed for the variable to be unallocated when the write statement is executed, something that was not allowed before f2023.

Was this feature ever discussed here before the standard was adopted?

I must admit I don't like the "new" auto (re-)allocate feature in modern Fortran.
My problem is with arrays as well as character strings.

This re-allocate (which is identified at compile time) can be unexpected, so I think there should be a compiler option to report when auto-reallocate occurs.

This may not be "backward compatible" with F90/95 use of allocatable arrays, where auto re-allocate was not available. We need a warning!

In Intel Fortran there is -standard-realloc-lhs, which is affected by -standard-semantics, to control this behavior :slightly_smiling_face: See also this post by Steve Lionel.

I'm unsure about this, but I think the reallocation during an internal write follows the same semantics as reallocation during assignment. That is, if the shapes match exactly, then no reallocation occurs, and if the shapes are inconsistent, then reallocation does occur. As discussed here previously, the recent fortran standards are unclear on this detail, but if the programmer is familiar with f90/f95 semantics, and if he assumes backward compatibility of the newer standards, then he can arguably conclude that reallocation does not occur when the shapes match exactly. The remaining inconsistency is when the lhs is allocated to be longer than the rhs expression. F90/f95 semantics would assign the rhs to the leftmost characters of the lhs and blank fill the remaining characters, while F2003 (and later) semantics would reallocate the lhs.
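The long-lhs/short-rhs case for plain assignment can be sketched like this, assuming a compiler with F2003 reallocation on assignment enabled (the default in current gfortran; other compilers may need an option). The variable `s` mirrors the earlier example in this thread.

```fortran
program realloc_on_assignment
   implicit none
   character(len=:), allocatable :: s

   allocate(character(len=10) :: s)

   ! Substring on the lhs: no reallocation, f90/f95-style blank padding.
   s(:) = 'Hello'
   print *, 'after s(:) = ...  len =', len(s), ' [' // s // ']'   ! len = 10

   ! Whole deferred-length variable on the lhs: F2003 reallocates to fit the rhs.
   s = 'Hello'
   print *, 'after s = ...     len =', len(s), ' [' // s // ']'   ! len = 5
end program realloc_on_assignment
```

The first assignment leaves the length at 10 with trailing blanks; the second shrinks the variable to length 5.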

Especially now with f2023 when this functionality is being extended to new situations (internal write), this ambiguity in the standard should be addressed. Specifically, it should somehow be stated explicitly that reallocation does not occur when the shapes match for the lhs and rhs side during assignment or when the shapes match for the formatted string and for the internal file in the case of a write statement. This ambiguity has both performance implications and also implications regarding the association of pointers and of the lhs lower/upper bounds.

There are some cases that can be identified at compile time (e.g. when the character string is initially unallocated), but in general the shape mismatch must be detected at run time.

My understanding is that the reallocation on assignment (in f2003) and this newer one related to internal write statements and to the IOMSG and ERRMSG arguments are all compatible with f90/f95 semantics except for the case described above (long lhs/short rhs). This is because these new reallocation situations (unallocated lhs, short lhs/long rhs) were not allowed in these older standards, and it was up to the programmer to ensure that they did not occur.

I don't have access right now to ifort or ifx. Does this compiler option also cover the internal write situations (i.e. the internal file string and the IOMSG and ERRMSG arguments)?

I really like the new auto re-allocate feature, and look forward to using it. It fixes an inconsistency in the language with respect to how allocatables work. It has also been a minor PITA to have to assign 'big enough' character strings for the error messages and such.

If the automatic reallocation isn't wanted, use s(:) instead of s in the above example, just like one does on the LHS of assignment statements.
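A minimal sketch of that suggestion, reusing `s` from the earlier example. Since `s(:)` is a substring rather than the allocatable variable itself, the internal write should fill the existing 10 characters whichever standard revision the compiler follows; treat that as my reading of the advice above, not as something checked against an F2023 compiler.

```fortran
program internal_write_section
   implicit none
   character(len=:), allocatable :: s

   allocate(character(len=10) :: s)

   ! Writing to the substring s(:) targets the existing storage,
   ! so the record is blank padded to the allocated length.
   write(s(:), fmt='(a)') 'Hello'

   print *, 'len(s) after write: ', len(s)   ! 10
   print *, '[' // s // ']'
end program internal_write_section
```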
