Flaw with typed enumerators in F202X

zedthree · May 10, 2022, 11:22am

Is the specification for the new enumerator type fixed for F202X? I’ve just been reading the proposed spec and while it’s missing a few features that would be incredibly useful (user-defined values, IO, character conversion), it also has a massive flaw: name scope. The missing features could always be added in later revisions, but the current proposal for the scope of names (“class one names”) would hamstring it on arrival.

It’s very possible that I have misunderstood the status of the current design, in which case I would be very happy to be corrected.

From John Reid’s great explainer on the new features of F202X:

enumeration type :: colour
  enumerator :: red, orange, green
end type
type(colour) light
:
if (light==red) ...
…
The names of the types and their constants have exactly the same scoping rules as for other types and constants; for example, a procedure that contains the above declaration must not contain a real variable named red.

With this current rule for the scope of the names, it means it would be impossible to have two enumerators with the same name in the same scope:

module colours
  enumeration type :: additive_primary_colours
    enumerator :: red, green, blue
  end type

  enumeration type :: subtractive_primary_colours
    enumerator :: red, yellow, blue
  end type
end module colours

Here the red and blue enumerators would clash and so this wouldn’t compile. Basically every other language since C has had some sort of namespacing for enums. C++11 introduced enum class for exactly this purpose; C#, Java, Swift, etc. have all had namespaced/scoped enums from the beginning. They are a Good Idea.
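For comparison, here is a minimal sketch of how the same two colour sets coexist in C++11 using enum class. The type names and the is_red helper are made up for illustration; nothing here is taken from the Fortran proposal.

```cpp
#include <cassert>

// Scoped enumerations: each enumerator lives in its own type's scope,
// so `red` and `blue` can appear in both without clashing.
enum class AdditivePrimary { red, green, blue };
enum class SubtractivePrimary { red, yellow, blue };

// Hypothetical helpers: the overload is chosen by the enumeration type,
// not by the enumerator name.
constexpr bool is_red(AdditivePrimary c) { return c == AdditivePrimary::red; }
constexpr bool is_red(SubtractivePrimary c) { return c == SubtractivePrimary::red; }

// Checked at compile time. A cross-type comparison such as
// AdditivePrimary::red == SubtractivePrimary::red would not compile.
static_assert(is_red(AdditivePrimary::red), "additive red is red");
static_assert(!is_red(SubtractivePrimary::yellow), "yellow is not red");
```

The qualifier is the whole point: AdditivePrimary::red and SubtractivePrimary::red are different names of different types, which is what class one names in Fortran would rule out.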

From what I can tell by reading through the proposals on the J3 site, the initial proposals did include namespacing the enumerators, but this seems to have been removed in J3/21-120r3.

The relevant comments on that document are:

(13) I note that enumerators as class one names is not only simpler
but what WG5 asked us to do in the first place.
(13) Fortran already has basic namespace management (USE ONLY and
renaming). Some ideas for extensions to that have already been
floated (for a future revision, possibly F202y); we should not
preempt that with a complicated feature here.

For the first comment, simpler is not always better. C’s enum is simpler than C++'s enum class, but enum class solves lots of problems with enum.
And I think that the second comment only applies to importing enumerators into another scope. As it stands, it would not be possible to declare two enums with clashing names in the same module, and I think that is a significant flaw in the current design and one that would be impossible to fix later.

Does anyone know if the current design is set in stone, or if it’s still possible to alter it?

plevold · May 10, 2022, 11:56am

Agree 100%. My list of requirements for a usable enum language feature contains some additional items (believe I’ve posted about this in an earlier thread as well):

1. Namespaces (like you say).
2. Variants of different types, not just integers. See Haskell’s |, Java’s sealed classes, Rust’s enum and others.
3. Exhaustive select statements - omitting a variant is a compile error unless type default is present.

I see absolutely no use for an enum/sum type/tagged union that does not support these items as a minimum.
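For readers who have not used variants of different types or exhaustive selection, here is a minimal C++17 sketch of both. The names Converged, Diverged, Aborted, SolveResult and describe are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical variants, each carrying a payload of a different type.
struct Converged { int iterations; };
struct Diverged  { double residual; };
struct Aborted   { std::string reason; };

using SolveResult = std::variant<Converged, Diverged, Aborted>;

// Visitor with one overload per alternative. Deleting any one overload makes
// the std::visit call below ill-formed, so a forgotten variant is a compile
// error rather than a warning.
struct Describe {
    std::string operator()(const Converged& c) const {
        return "converged after " + std::to_string(c.iterations) + " iterations";
    }
    std::string operator()(const Diverged&) const { return "diverged"; }
    std::string operator()(const Aborted& a) const { return "aborted: " + a.reason; }
};

std::string describe(const SolveResult& r) {
    return std::visit(Describe{}, r);
}
```

Calling describe(Converged{3}) yields "converged after 3 iterations". Adding a fourth alternative to SolveResult without extending Describe stops the build, which is the behaviour the third item asks for.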

zedthree · May 10, 2022, 12:03pm

I could see your second item being introduced in a later standard, which would be fine, and I would be fine if your third point were only implemented as a compiler warning.

There are a few other incredibly useful enum features that I think are missing, but again, they could be introduced later. I certainly understand that the committee only has finite resources and so getting a simpler feature in sooner is desirable, but I don’t think it would be possible to introduce namespaced/scoped enums later without breaking existing code, so it absolutely needs to get in now.

plevold · May 10, 2022, 12:34pm

Different types could be introduced later, but that runs the risk of making the syntax very unergonomic.

I don’t find a compiler warning to be enough for exhaustiveness checking. It’s far too easy to ignore when adding a new variant to an existing enum.

Requiring exhaustiveness is a great tool for ensuring correctness at compile time which I think is something that Fortran should strive for to provide an attractive alternative to Julia and Python. Getting the same type of quality assurance from an optionally typed language like Julia is more or less impossible.

Adding this at a later stage would either break existing source code or require two alternative syntaxes which is why I’m very sad to see that it’s not included now.

FortranFan · May 10, 2022, 2:06pm

@zedthree, @plevold,

Please note I serve as an alternate, non-voting member to INCITS J3 and I have been requesting and advocating for a proper enumeration type facility in Fortran 202X since 2018, right around when the work began on Fortran 202X.

My own vision and the use cases for Fortran based on industry experiences as to how a related set of named constants are preferably consumed in scientific and engineering codes are summarized in this paper.

In the above paper, usually referred to as 19-229, you will find that the use cases and requirements pretty much cover the needs the two of you have listed upthread. For example, @zedthree, the name collision you show with the colours module is precisely what I wanted Fortranners not to encounter in their codes, and thus I added a requirement:

My simple goal is to minimize “magic values” from appearing in Fortran code because that leads to costly mistakes and to employ mnemonics as much as possible when it comes to working with constant values that belong to a group or a set.

The above paper was submitted to J3 committee and it is drafted entirely from the user perspective, especially in industry. I failed miserably and was unable to influence the design even a bit.

So my proposal was clearly seen by the voting members of the committee as entirely B.S. but I also then failed to see anything better being proposed that could be much of any use in actual coding practice with Fortran 202X. What eventually got voted in left me aghast, that is:

I am of the opinion the enumeration type feature in Fortran 202X is mostly flawed.
It is a tremendous disservice to the actual users of Fortran to have two deeply flawed and highly confusing features, enum type in addition to enumeration type, that serve rather similar use cases and yet have very different semantics and syntax.
I have long been thinking about whether it is better for Fortran to get “something”, however poorly designed, into the language starting with 202X, which will take 10+ years after publication to gain compiler support, only to consider improvements and extensions in 202Y and 203X that may or may not happen and that, if they do, will take an additional 10+ years before users can consider using them in their codes. Or whether it is better to do it right this time. I’m inclined toward the latter.
Given the previous point, I have been contemplating whether to start a petition with the users that reaches WG5 to withdraw the enumeration type feature from Fortran 202X, considering how useless it is at this point.
Outside of a strong external influence on WG5, e.g., via one or more national bodies, say BCS in the UK, or similar in Germany or Japan, etc., it shall indeed remain the case that “the current design is set in stone.”

So there you have it, one take on the situation.

mecej4 · May 10, 2022, 3:32pm

@FortranFan, this response is not directly related to the topic of enumerators, but is intended to record an example that I came across recently, in which a straightforward solution did not present itself.

In a post on the Intel MKL forum, a user posed the following question:

" According to documentation, the function should return one of a small number of values such as VSL_ERROR_OK, VSL_STATUS_OK, etc. I still haven’t been able to locate the definitions of those parameters."

These codes are, of course, defined in header files and include files provided by the compiler/library vendor (Intel). It is easy enough to write statements in your code to compare the return code with one or two symbolic integers if you know them. However, there are many possible return codes, and how does the user catch the error number and use that to print the name of the error? The user wrote, “I’m having trouble discovering what the value of 2 indicates, an error of some kind I’m assuming.”

Do any of the proposed additions to the Fortran standard, including yours, address this particular issue?

Thanks.
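To make the question concrete, this is roughly what the requested lookup from a raw return code to a printable name looks like in C++. It is only a sketch: the Status codes and their values are hypothetical and are not taken from any vendor header.

```cpp
#include <cassert>
#include <string>

// Hypothetical status codes; names and values are illustrative only.
enum class Status : int { ok = 0, bad_argument = -1, no_memory = -2 };

// Map a raw integer return code to a printable name. Codes that are not
// enumerators of Status are reported as unknown rather than accepted.
std::string status_name(int raw) {
    switch (static_cast<Status>(raw)) {
        case Status::ok:           return "ok";
        case Status::bad_argument: return "bad_argument";
        case Status::no_memory:    return "no_memory";
    }
    return "unknown(" + std::to_string(raw) + ")";
}
```

Even here the name table is written by hand. The point of asking for IO support on enumerators is that the language would supply this mapping itself.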

FortranFan · May 10, 2022, 3:48pm

@mecej4, thank you for the question, which is “right on the money” in terms of the use cases I (or rather my peers) had in mind.

But I am afraid the current feature in Fortran 202X will not support our use case all that well.

Similar to the case you bring up with VSL_* codes (constants), please take a look at this link at NAG numeric library documentation site for a method to find the solution for a nonlinear system of equations:
https://www.nag.com/numeric/cl/nagdoc_fl26.2/html/c05/c05qdf.html

Please take note of the various dummy arguments: IREVCM, MODE, through IFAIL.

This method is a common enough need in typical engineering practice. Moreover, the style of the subroutine (its API) is rather similar to a lot of other existing Fortran codes for other technical needs, meaning integers with magic values are used in a type-unsafe fashion.

Ideally we would have liked to modernize such legacy APIs and use an enumeration type for arguments such as IREVCM through IFAIL, in a way where the code values are retained (IFAIL = 5 carries a special meaning in the above case), because the specific values of the codes are also handled elsewhere, often outside of a Fortran context.
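As an illustration of what “retaining the code values” means, here is a small C++17 sketch of a typed wrapper around a legacy integer failure flag. The enumerator names and their meanings are invented for the example and do not describe the NAG routine; only the idea of pinning specific integer values is taken from the text above.

```cpp
#include <cassert>
#include <optional>

// Hypothetical failure codes for a legacy solver. The values are pinned so
// that they still match whatever the other side of the interface expects.
enum class Fail : int { none = 0, bad_input = 1, too_many_evals = 2, no_progress = 5 };

// Typed value -> legacy integer, for passing back across the old interface.
constexpr int to_int(Fail f) { return static_cast<int>(f); }

// Legacy integer -> typed value. Only known codes are accepted; anything
// else comes back empty instead of becoming a meaningless enumerator.
std::optional<Fail> to_fail(int raw) {
    const Fail f = static_cast<Fail>(raw);
    switch (f) {
        case Fail::none:
        case Fail::bad_input:
        case Fail::too_many_evals:
        case Fail::no_progress:
            return f;
    }
    return std::nullopt;
}
```

With this in place, new code works with Fail values only, while the integer 5 still crosses the boundary unchanged.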

I believe my paper 19-229 tried to address what you bring up as well as the needs of others, but it failed.

septc · May 11, 2022, 12:04pm

Because I am not familiar with enum (though I often see it used extensively in other languages), I’ve just read a few documents about enum class in C++ and found various explanations of the above issue with “unscoped” enum, e.g.,

Improving Enumeration Types [N1513=03-0096]
(https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1513.pdf)

Strongly Typed Enums (revision 3)
(https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2347.pdf)

Sec.2.3 of the above PDF says (the bold font is by me):

2.3.Problem 3: Scope
C++ enums are not strongly scoped. That is, the enumerators of an enum are exported to the scope in which the enum is defined. This is a relict from the earliest days of C where scoping was very weak. In the case of enumerators, there are nasty implications. In particular:
• It is not legal for two enumerations in the same scope to have enumerators with the same name. For example:

enum E1 { Red };
enum E2 { Red }; // error

• The name of an enumerator exists in the enclosing scope, which can cause surprising results. For example:

namespace NS1 {
 enum Color { Red, Orange, Yellow, Green, Blue, Violet };
};
namespace NS2 {
 enum Alert { Green, Yellow, Red };
};
using namespace NS1;
NS2::Alert a = NS2::Green;
bool armWeapons = ( a >= Yellow ); // ok; oops

The current workaround is not to use the enum and instead write a class wrapper (as in §2.1).

I guess similar pitfalls can occur when modules are used instead of C++ namespace above. In that case, we may need a “workaround” to separately define every enum in separate modules (to avoid name collision) and use xxx, only: <enum> to import only one enum into the local scope.
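The wrapper workaround mentioned in the quote can be sketched as follows. This is my own minimal version of the pre-C++11 idiom, not the code from §2.1 of the paper; kAlertThreshold and should_arm are hypothetical names.

```cpp
#include <cassert>

// Pre-C++11 idiom: nest each unscoped enum in its own struct so that the
// enumerators are reached through a qualifier instead of leaking into the
// enclosing scope. Both sets can then reuse Red, Yellow and Green.
struct Color { enum Type { Red, Orange, Yellow, Green, Blue, Violet }; };
struct Alert { enum Type { Green, Yellow, Red }; };

// The threshold is now unambiguously the Alert one.
constexpr Alert::Type kAlertThreshold = Alert::Yellow;

constexpr bool should_arm(Alert::Type a) { return a >= kAlertThreshold; }
```

This plays the same role as putting every enum in its own module. Note that it only fixes the name clash: the nested enums are still unscoped, so a comparison such as Alert::Green >= Color::Yellow still compiles in C++17, usually with a warning.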

I haven’t read the relevant docs (linked in the above posts) yet, so my understanding may be wrong on many points, but if the issue is well known from many other languages, why repeat the same issue from the beginning…? Is it not possible to just require qualifiers (like color%red and alert%red in the case of the above example)? (For possible concern, @sblionel)

FortranFan · May 11, 2022, 1:12pm

All of the concerns raised thus far in this thread, especially the work by Miller, Sutter and Stroustrup toward C++, were communicated to J3 Fortran committee via emails back in 2018 and at meetings during 2019 but to no avail.

However, as things stand with the workings, the effective default stance is there is nothing really all that much to learn from these newer languages, from C++, C#, D, Julia, Python, Swift, Rust, etc. as far as Fortran should be concerned.

This is the case even as each and every one of these languages find greater and greater beneficial applications in scientific and technical programming, the supposed focus of Fortran.

An attitude most detrimental to productive practice of modern Fortran prevails. I personally think a lot of this is due to the tight / tiny budgets toward Fortran compilers at commercial vendors that leads to doing the smallest or the simplest of feature enhancements. But reality bites and it is the practitioners of Fortran who have to suffer. Simple does not always translate to what is good.

septc · May 11, 2022, 1:54pm

I remember that the following proposal (linked above) had been posted several times on forums etc

Use cases and formal requirements for enumeration types (2019-October-08)
(https://j3-fortran.org/doc/year/19/19-229.txt)

and also the “reply” (linked above also)

Simpler enumeration types specs and syntax (2021-March-02)
https://j3-fortran.org/doc/year/21/21-120r3.txt

The first sentence of the reply is

Over the last year it seems that our approach to “true enumeration types” has become too complicated, and unnecessarily so.

I partly share the feeling that it is “too complicated” (in short, the request seems to ask for too much at once). But looking at the first post of this thread (unscoped vs scoped), the final result really seems to be an oversimplification and could lead to unrecoverable situations. This is also against the recent direction of modern Fortran, i.e., making the syntax more explicit and robust against possible errors.

FortranFan · May 11, 2022, 5:04pm

It was with papers 19-230r2 and 19-231r2 that the specific comment about “too complicated” was made. So the sentiment applies even more strongly to 19-229, I suppose. But that’s a shame.

There are major, major concerns with such a narrow line of thinking re: “too complicated”. And again, it comes back to “For whom Fortran?”

Is Fortran only meant for some individual coders who have essentially complete autonomy as to the kind of coding they intend to pursue with the “tools” of their interest, and many of whom have their own ever-sliding, ad hoc, discriminatory and inconsistent scale as to what is simple or “too complicated”, what they like or find acceptable, etc.?
Or does Fortran have a wider, open tent?

Think for a moment: reliable programming toward any scientific or engineering aim is an extremely complicated exercise.

Also, advancing needs require advanced solutions. That was the very premise of FORTRAN with Backus and team at IBM in the early 1950s.

In contrast to Fortran’s rich legacy, or to where things now stand with every programming language whose advanced features are finding both acceptance and value in scientific and technical programming, paper 19-229 is nothing complicated; it is only detailed, and on one of the simplest features.
Circa 202X, for a basic facility which will only be available around year 2030 or later, the very notion of going skittish and wimpy at the hint of detail, and settling instead for a half-baked facility that will only be of limited use, really, really looks bad for Fortran.

I agree entirely on this.

certik · May 11, 2022, 9:14pm

I pretty much agree with @FortranFan. I will point out that this has been discussed at length at the committee (if I remember correctly, I also advocated to keep the enumerators local, not global), and the main argument seems to be to keep things simple in order to get at least something in.

However, as pointed out in this thread, adding the simplest possible feature might not lead to what we want in the long term, and it will prohibit adding a feature that does what we want later. It would be better not to do it at all and to wait until we can do it right. For more details, see the thread “Cost of adding (any) new feature to the Fortran language”; note that @FortranFan was slightly opposed there.

The solution in my opinion is to prototype this correctly in a compiler and take time to do this right. And get the corner cases ironed out, and then write a proposal. The other part of the solution is to refuse to standardize anything that does not have a prior compiler prototype. (That was part of my platform when I ran for the WG5 and later J3 leadership, but I did not get the job. If I decide to run again, I think this will be part of my platform again, as I think it has to be done.)

Finally I would mention that compilers do not need to follow the standard. In this particular case, it seems we can have a compiler option that would disable global use of enumerators, and introduce an extension to use it locally, and if the community prefers, we can have it even as the default. But since this will be in the standard, some codes will inevitably use it, so we also need to support what is in the standard, with some simple option that would do exactly what is in the standard.

zedthree · May 12, 2022, 2:58pm

@certik I agree with many of your points here. Pragmatically, what is it we can do, specifically for enumeration type? I genuinely think this needs to either be removed or changed to class 2 names.

oscardssmith · May 12, 2022, 3:20pm

The easy solution would be to just implement a different feature in the open source compilers. The proprietary compilers probably won’t be updated for 10 years anyway, so if everyone using enums in Fortran uses the better version that isn’t standard-compatible, it would significantly discourage use of the broken standard-compatible version.

certik · May 12, 2022, 4:39pm

@zedthree go ahead and write this up as a paper that proposes a fix to the already “approved” enumerations in F2023. You can send it as a PR against the j3-fortran/fortran_proposals repository on GitHub (“Proposals for the Fortran Standard Committee”), and we can iterate on it there. Then we can submit it to the committee for the next meeting.

I just want to be frank and share that I think the chance that it succeeds is low, but that is the process. All you need is to convince enough committee members that your fix is better. In order to even have a discussion, I suggest you write a paper on how you think it should be done. If you have time, you can help us prototype this in LFortran; I am happy to help there also. Your paper can serve other compilers as an idea for an alternate implementation.

zedthree · May 12, 2022, 4:53pm

@certik Thanks! I guess this would ideally need to be done before the July 18th meeting?

plevold · May 12, 2022, 6:07pm

@FortranFan thank you for the insights and for your efforts towards minimizing the use of magic numbers in Fortran applications. FWIW I’m happy to sign your petition should you go forward with it.

Fortran enumerators, as currently proposed, are a special case of the more general concept of sum data types in type theory (the mathematics behind types in programming languages). See, for example, the Wikipedia article “Algebraic data type” for an introduction.

Historically this was a concept only seen in functional programming languages. Lately it has become a feature most take for granted in recent not strictly functional programming languages as well:

Julia: Type Unions
TypeScript: Union Types
Swift: Enumerations with Associated Values
Kotlin: Sealed Classes
Rust: Enums

Many “old timers” still under active development also seem to be catching up:

Java: Sealed Classes
C++: Variants

Given that the Fortran committee has been working on enumeration types lately I find it very odd, perhaps even worrying, that they are either unaware or totally ignorant of the advances on the topic in other programming languages. Note that the usual “competitors” to Fortran like Julia and C++ are a part of that list.

If anyone is interested in what sum types can do for us beyond reducing the use of magic numbers, I highly recommend any of Scott Wlaschin’s talks. For example this section is a very practical introduction to the subject.

certik · May 12, 2022, 8:38pm

Yes, it should be submitted for the July 18th meeting, so you should submit it to the GitHub repository I linked above as soon as you can, so that we can iterate on it as a community to create a strong proposal.

milancurcic · May 12, 2022, 8:45pm

@zedthree, and in addition to what @certik wrote, the paper should be uploaded to J3 a few weeks before the meeting, to give enough time to people to read and digest it, so, ideally early July.

zedthree · May 13, 2022, 2:16pm

I’ve made a PR. Suggestions and comments very much appreciated. It’s a rough first draft, but I have very little free time for the next month or so, so it would be good to get eyes on it now.
