Catch-23: The New C Standard Sets the World on Fire

I found this opinion piece in ACM Queue to be an interesting look at what happens when a new standard breaks old programs. Nobody wants Fortran to do that - do they? :thinking:

Catch-23: The New C Standard Sets the World on Fire - ACM Queue

Developers should also note that C23 has drifted further from C++ than the earlier C standards. The notion that C is (mostly) a subset of C++ is further from reality than ever before.

Sadly, missed opportunities and incompatibilities with C++ aren't the worst aspects of the new standard. C23 transforms decades of perfectly legitimate programs into Molotov cocktails.

"The root problem is the failure of a standard to standardize."

That is among the key points of the stated opinion piece.

Since roughly 1978 the actual practice of Fortran has been toward implicit none as default, yet the Fortran standard fails to standardize on that practice.

My recommendation for the Fortran standard is communication-driven standardization, à la documentation-driven development.

Some changes, if pursued, can be communicated well in advance: standard revision X will make change A. With good communication across the Community, incendiary articles such as the one under discussion will be written well before the standard revision, and the "fires" will have long burned out by the time of eventual publication.

TL;DR: the article in the original post is no reason for the Fortran standard to shy away from positive changes; with good communication, progress can be made.

Indeed, it seems some of the key points in the article are:

  • "Don't break existing programs"
  • Progress means draining swamps and fencing off tar pits

  • Major disappointments of inaction involve the pillar of C programming: pointers.

  • C23 fails to correct misguidance dating to the earliest version of the standard.

So "not breaking old programs" is of course necessary, but not sufficient.

I have read this sentence a few times, and I do not understand what it means.

For example, I use implicit none routinely, but I do not favor changing the default implicit typing rules (which would break legacy codes). I don't think I'm unusual in that respect. So tell me, explicitly, in what way does my actual practical use of implicit none encourage that change in the Fortran standard? And if it somehow does encourage such a change, how can I continue to use implicit none in a way that does not get interpreted in that unintended way?

Adding implicit none is not much of a trouble, from my perspective. But forgetting to add it everywhere can cause intricate bugs that only some compilers like gfortran can detect. I find implicit typing a helpful feature for quick testing. But it should not be the default behavior, in my opinion.

Let's not start another implicit-none-battle here. The article mentioned in OP concerns mainly undefined behavior. AFAIK the omission of implicit none in a Fortran program does not lead to any UB.

BTW, making realloc(ptr,0) UB, instead of an equivalent of free(ptr), is indeed a very stupid idea. As the author aptly notices, it will break even existing dynamically linked executables once the updated system libraries implement the change. That's scary.

@msz59 I agree, but let's not call ideas stupid. I am sure they had their reasons, even if we disagree. I am sure it is quite similar to the Fortran Standard Committee, which also has reasons, even if I might disagree. It would be good to figure out the reason behind this decision of the C committee. I think the above feedback should have been received and addressed by the committee before they released the standard.

The probable reason (mentioned in the article) is another strange inconsistency of the C standard, which leaves the result of allocating a zero-sized block implementation-defined:

If the size of the space requested is zero, the behavior is implementation-defined: either a null pointer is returned to indicate an error, or the behavior is as if the size were some nonzero value, except that the returned pointer shall not be used to access an object.

That indeed makes realloc(ptr,0) somewhat ambiguous. But then, instead of fixing one problem, the committee has introduced another, more serious one. You can call the idea as you please but IMHO it is ... not wise :slight_smile:

In general, this C feature of zero-size memory allocation does not match the Fortran case well (or at all). Modern Fortran does allow zero-length character strings and arrays (F77 and earlier did not), and in this case they are natural and useful and I would not want that feature to be changed.

Neither does Fortran have realloc working as effectively as in C. That was discussed in another topic and, IMHO, is unfortunate.

So this thread is, of course, highly OT. For those (still) interested, I'd recommend a presentation (an 18-minute video) referenced as the last item in the OP article's bibliography:

  1. Yodaiken, V. 2021. How ISO C became unusable for operating systems development. Proceedings of the 11th Workshop on Programming Languages and Operating Systems (PLOS '21).

Interesting view on C as opposed to other programming languages.

There's nothing about "undefined behavior" in the original post, but there's a take on "breaking old programs" and that's what I refer to via the implicit none example.

To state the obvious, I am a long-time, sustaining advocate for two changes to the Fortran language and standard:

  1. Remove implicit mapping, which would effectively make implicit none the default.

  2. Delete implied SAVE when it comes to definition upon declaration of local objects in subprograms, meaning only that conforming processors be required to detect and report when the explicit SAVE attribute is absent.

Besides these two, there have been a few other requests - not from me - but not all that frequent or sustained of late.

There is no official and little Community support for the above two items, in the name of backward compatibility: the argument is that they would break old programs, programs that are already broken with respect to the rest of the standard or were always broken to begin with!

And now the article on C23 and the realloc(ptr,0) situation is being used as a warning shot to the community not to request breaking changes.

My point is only that the C23 example does *not*, and need not, apply to the two Fortran cases above, and definitely not to the first, where "undefined behavior" is not relevant.

There is no resumption of the implicit none "battle" here unless someone else also wants to be incendiary, like the original post.

Looks like this opinion paper can also be used to support my point of view :wink: :

Progress means draining swamps and fencing off tar pits, ...

or (a quoted quote):

Standards are not some kind of holy book that has to be revered. Standards too need to be questioned.
- Linus Torvalds on C standards

IMHO, questioning means revising design decisions if needed instead of keeping ('revering') them for the sake of backward compatibility.

Also interesting:

Compile old code as C23 only for good reason and only after verifying that it doesn't run afoul of any constriction in the new standard. If you need new C23 features, consider quarantining C23 code in separate translation units;

which translates for Fortran to

Compile old code as F2023 only for good reason and only after verifying that it doesn't run afoul of any constriction in the new standard. If you need new F2023 features, consider quarantining F2023 code in separate translation units;

which in turn means: Compile old code according to old standards and new codes according to the standard version that is best suited for the problem you want to solve.

I agree, with just one caveat: this is not a very good hint for people new to Fortran (or C or any language in a similar situation).

A book "Modern C" including C23, freely available under licence CC BY-NC-ND:

Jens Gustedt. Modern C. Manning, In press, 9781617295812. ⟨hal-02383654v2⟩

The first C program in the book, reformatted slightly, is

#include <stdlib.h>
#include <stdio.h>

int main(void)
{
   double A[5] = {[0] = 9.0, [1] = 2.9, [2] = 3.0E+25, [3] = .00007};
   for (size_t i = 0; i < 5; i++)
      printf("element %zu is %g, \tits square is %g\n", i, A[i], A[i] * A[i]);
}

and a similar Fortran code is

implicit none
integer, parameter :: dp = kind(1.0d0)
integer :: i
double precision :: a(5) = [9.0_dp, 2.9_dp, 3.0e25_dp, .00007_dp, 0.0_dp]
do i=1, 5
   print*,"element", i, "is", a(i), ", its square is", a(i)**2
end do
end

The output of the C program is

element 0 is 9,         its square is 81
element 1 is 2.9,       its square is 8.41
element 2 is 3e+25,     its square is 9e+50
element 3 is 7e-05,     its square is 4.9e-09
element 4 is 0,         its square is 0

and of the Fortran program is

 element           1 is   9.0000000000000000      , its square is   81.000000000000000     
 element           2 is   2.8999999999999999      , its square is   8.4100000000000001     
 element           3 is   3.0000000000000001E+025 , its square is   9.0000000000000003E+050
 element           4 is   6.9999999999999994E-005 , its square is   4.8999999999999992E-009
 element           5 is   0.0000000000000000      , its square is   0.0000000000000000

The C output does look neater than Fortran's. I wonder what format string would make the Fortran output look closer to that of C. The C printf statement is harder to read than the Fortran print statement. A beginner would wonder what the #include statements do in C, why int main(void) follows, and why the Fortran code has implicit none, defines dp, and uses it repeatedly. I think C giving A[4] a value of 0.0 because it was not set is a misfeature. It's better to fill unset variables with nonsense, as Fortran compilers can do, to alert the user of a probable bug. Only the Fortran output reveals round-off error. Whether that is good is arguable.

Fortran does have the G edit descriptor, which selects a fixed or exponential representation based on the value's magnitude, similar in spirit to C's %g. Note that list-directed output gives NO control over the form; C, of course, has no list-directed output at all.