Generics in Fortran 202Y: Petition to WG5?

everythingfunctional · July 19, 2024, 10:27pm

This is basically ranked-choice voting. I doubt the poll in Discourse supports it, but maybe another tool might? Otherwise you’d have to have respondents just post their votes and manually enter them into a spreadsheet. When we did the vote live we gave a handful of workable options and voted on each one as “acceptable, unacceptable or undecided”.

I disagree. [] is workable because the compiler will know that the preceding entity is a template procedure, and [] are always paired, so you’ll know where the list starts and ends even in nested situations. With <>, those symbols are not always paired, so in certain situations it will be ambiguous whether a > ends the list or is being used as an operator. I’ll note that C++ seems to have solved that problem, but I don’t know what constraint they have that prevents the ambiguous cases. From what I’ve heard they may have just said “ambiguous cases are invalid, but it’s not the compiler’s job to tell you why/how it’s ambiguous and it’s your fault if it catches your computer on fire”. I don’t think that’s how we should leave things, so I’d say leave <> off the list since even if everybody voted that as their favorite it’s not workable.
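The pairing argument can be made concrete with a small sketch. The Python below is not taken from any compiler: `matching_close` is a hypothetical helper, the `swap[...]` line only assumes what the proposed bracket syntax might look like, and string literals and comments are ignored for brevity. It shows that simple depth counting finds the end of a nested `[]` list, while the same counting has nothing to hold on to when `<` is an ordinary comparison.

```python
def matching_close(text, open_index, open_ch="[", close_ch="]"):
    """Return the index of the delimiter that closes the one at open_index."""
    if text[open_index] != open_ch:
        raise ValueError("open_index does not point at an opening delimiter")
    depth = 0
    for i in range(open_index, len(text)):
        if text[i] == open_ch:
            depth += 1
        elif text[i] == close_ch:
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("no matching closing delimiter")


# Hypothetical instantiation line using the proposed [] syntax, with nesting.
bracket_line = "call swap[pair[real, integer]](x, y)"
start = bracket_line.index("[")
end = matching_close(bracket_line, start)
template_args = bracket_line[start + 1:end]  # text between the outer [ and ]

# Here < is a plain comparison, so there is no partner > to find.
comparison_line = "if (a < b) call work(a)"
try:
    matching_close(comparison_line, comparison_line.index("<"), "<", ">")
    angle_pairs_found = True
except ValueError:
    angle_pairs_found = False
```

A real parser would of course work on tokens rather than characters; the point is only that paired delimiters allow this kind of local reasoning and unpaired ones do not.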

I’d add just plain () as an option as well.

certik · July 19, 2024, 11:05pm

Ok, I can just add a poll for each option separately, then we can manually add up the votes. That might work.
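Adding up the separate polls by hand could look roughly like the sketch below. Everything in it is an assumption made for illustration: the ballots are invented placeholders (not real poll results), and ranking by most “acceptable” votes with fewest “unacceptable” votes as the tie-breaker is just one possible rule that nobody in this thread has agreed on.

```python
from collections import Counter

CHOICES = ("acceptable", "unacceptable", "undecided")


def tally(ballots):
    """Count, per syntax option, how many respondents picked each choice."""
    totals = {}
    for ballot in ballots:
        for option, choice in ballot.items():
            if choice not in CHOICES:
                raise ValueError(f"unknown choice {choice!r} for {option!r}")
            totals.setdefault(option, Counter())[choice] += 1
    return totals


def rank(totals):
    """Most 'acceptable' votes first; ties broken by fewer 'unacceptable' votes."""
    return sorted(
        totals,
        key=lambda opt: (-totals[opt]["acceptable"], totals[opt]["unacceptable"]),
    )


# Placeholder ballots, invented for illustration only.
example_ballots = [
    {"{}": "acceptable", "[]": "acceptable", "()": "undecided"},
    {"{}": "acceptable", "[]": "unacceptable", "()": "unacceptable"},
    {"{}": "undecided", "[]": "acceptable", "()": "unacceptable"},
]
example_totals = tally(example_ballots)
example_ranking = rank(example_totals)
```

If the top-ranked option turns out to be unworkable after prototyping, the next entry in the ranking is the fallback.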

You disagree that “there might be an issue with parsing []”?

That’s why a prototype is needed. In a Bison-based parser (NAG, LFortran, probably others) you generally do NOT know that the preceding entity is a template procedure, since the parser is bottom-up (“inside-out”): typically you parse expressions in a rule of their own, and then you use that to parse the function. For “top-down” parsers, like recursive descent, you do know you are in a template procedure and can thus change the parsing rules as needed. In this particular case, I think it might be possible to get it working reasonably well in Bison, but I won’t know for sure until we try it. It seems you are saying there cannot be an issue. Ok, well, I’ll believe it when I see it.

Yes, the <> indeed has the issues you raised. So we can replace it with just () – however that might also have a parsing issue. Must be tried as well.

So to move forward: we know <> is going to be very tough to make work, so let’s exclude it. The other ones we do not know; they might work, they might not. We know {} for sure works, since we have a prototype. So we can just communicate that {} is prototyped, and that the rest can probably be made to work, and if not, then of course we’ll have to go with the second best option; but the poll is mostly about preferences among these various options. (If for example most people choose [] by far, then we can try to prototype it and make it work.)

How about this:

For each syntax option for instantiating (and declaring) templates, select “acceptable, unacceptable or undecided”:

Option: `{}`
* [ ] acceptable
* [ ] unacceptable
* [ ] undecided

Option: `^()` for instantiation, `()` for declaration
* [ ] acceptable
* [ ] unacceptable
* [ ] undecided

Option: `[]`
* [ ] acceptable
* [ ] unacceptable
* [ ] undecided

Option: `()`
* [ ] acceptable
* [ ] unacceptable
* [ ] undecided

Other:
* [ ] Search for another option (please comment below)


There is a compiler prototype for `{}`, the other options have not been prototyped yet, so we do not 100% know if there is an issue. The poll is to figure out the preferred syntax, and if the most popular syntax turns out to be unworkable, we'll have to go with the second most popular, etc. The `<>` option was not included, because that for sure has many parsing issues due to the less-than `<` operator. Below are examples for each of the options.

Full example for `{}`:
...
Full example for `^()`:
...
Full example for `[]`:
...
Full example for `()`:
...

ivanpribec · July 20, 2024, 5:53am

I might be wrong, but PDTs (parameterized derived types) have a similar syntax.

type :: foo(n, p)
   integer, len :: n
   integer, kind :: p
end type

type(foo(5,dp)) :: a

a = foo(5,dp)()  ! Structure-constructor

I’m not sure if it can appear in specification statements, probably yes.

Btw, C++ also had a problem with (): see “Most vexing parse” on Wikipedia.

certik · July 20, 2024, 10:34am

@everythingfunctional is that why the committee went with a = foo^(5,dp)() to distinguish it from PDT?

FortranFan · July 20, 2024, 1:51pm

I have a strong doubt this is a legitimate concern with Fortran, particularly in the context around “instantiation” with Generics.

My vote will be for <..>.

Given the computer science education and how templates are taught in CS and the actual practice in industry given the influence of languages such as C++, C#, Java, etc, <..> makes most sense to me.

Fortran sample code for Generics with <..> looks most elegant to me compared to all the alternatives.

certik · July 20, 2024, 2:39pm

Ok, we’ll put it back in. I think we just need to separate preferable syntax and its implementation.

Here is one example to consider for parsing:

I would even one-up it:

print *, tmpl_func<a, b>(c), e<d, f>(e, f)

This can be either:

first = tmpl_func<a, b>(c)
second = e<d, f>(e, f)
print *, first, second

or

bc = b > (c) ! same as `b > c`
ed = e < d
print *, tmpl_func<a, bc, ed, f>(e, f)

This seems pretty mind-bending.

FortranFan · July 20, 2024, 3:55pm

These look like “conceptual” issues, rather than practical or legitimate concerns.

Meaning, a la explicit interface starting Fortran 90, the parser processing a program scope shall know a reference to tmpl_func involves a generic template and thus shall expect the <..>(..) sequence to follow.

Syntactically there are similarities here with objects of types with type parameters (character, PDT, etc) with double sequence of parenthesis that are processable based on explicit definitions.

Thus it ain’t like any other expression in Fortran with the relational operators < and >.

With suitable employment around a modern Fortrannic paradigm of explicit everything, things like this should be readily workable.

certik · July 20, 2024, 4:22pm

This means the parser has to understand semantics (have a symbol table in this case) and you do not want to do that. C++ requires that, and the parser is very complex and slow due to that. Most other languages do not require that, Fortran currently doesn’t, so it is a lot faster to parse. Speed of compilation is essential (for some users).

But let’s say we do what you suggest. It still doesn’t disambiguate the expression above. Which of the two cases should it parse it to? Both are equally valid, as far as I can tell.

Machalot · July 21, 2024, 4:20am

Since multi-character combinations like ^() are under consideration, wouldn’t some grouping like (< >) or [< >] be unambiguous? Those aren’t valid uses of the < and > operators in any other context. It also has the advantage of symmetry over the lopsided ^().

There are already many other examples of multi character symbols in the language: ** (/ /) // == => <= /= :: and now .. for assumed rank.
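As a rough illustration of how a lexer could treat `(<` and `>)` as single tokens, in the same way it already treats `(/` and `/)`, here is a toy tokenizer in Python. It is only a sketch: `tokenize` and the sample line are hypothetical, it only recognizes the delimiters when the two characters are adjacent, and it does not settle whether the idea holds up for the full language (fixed form, where spaces are ignored, would need separate thought).

```python
import re

# The two-character delimiters are listed first so that they win over the
# single characters "(", "<", ">" and ")" (longest match first).
TOKEN_RE = re.compile(r"\s*(\(<|>\)|[A-Za-z_]\w*|[<>(),])")


def tokenize(line):
    """Split a line into names, punctuation, and the (< and >) delimiters."""
    line = line.rstrip()
    tokens, pos = [], 0
    while pos < len(line):
        match = TOKEN_RE.match(line, pos)
        if match is None:
            raise SyntaxError(f"unexpected input near column {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


# Hypothetical line: an instantiation with (< >) next to an ordinary comparison.
sample_tokens = tokenize("tmpl_func(<a, b>)(c) > (d)")
```

In `sample_tokens` the instantiation delimiters arrive as `(<` and `>)`, while the later comparison is still a lone `>` followed by `(`, so a parser working on these tokens never has to guess which role a `>` plays.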

eelis · July 21, 2024, 7:31am

I think when we write ^(), it looks terrible because of, as you said, the multiple characters and the asymmetry, but I think one should rather consider ^ as a binary “instantiate with” operator that takes a template and arguments and maps that to a function or a subroutine. Then, I think the whole syntax makes sense and is not terrible anymore. So in the end, there is always some sort of symmetry. Writing it as T^() does more justice to the syntax.

certik · July 21, 2024, 12:46pm

Great idea. We can start with (< >), similarly to the old (/ /). That way no new characters are introduced to the language, which was one of the concerns against {}. Then later (/ /) was replaced with just [] (new characters to the language). And we can similarly later replace (< >) with just {} (new characters to the language).

Also a great idea. This should have been clearly communicated above, instead of you “reverse engineering” it with your physics-based intuition (I am glad you did!). The idea here is that we are introducing a new operator ^ to instantiate a template. Ok. Couple questions:

Are spaces allowed around ^?
Can one use parentheses for each operand?
Can the operator be user overloaded?
Can it be chained (partial instantiation)?
What else can we do with this operator — when introducing a new operator, let’s think broadly and generally what exactly the scope of it is, how it can be possibly extended in the future, and so on.
Does it make sense to also introduce an operator for applying function arguments? So you first instantiate with ^, then apply function arguments with another operator.
It feels we should not introduce an operator to apply function arguments. In the same way, perhaps the instantiation operation does not warrant introducing a new operator?

It seems this operator is not as general as + or *, it’s more similar to %, which is more restrictive what you can do with it.

AniruddhaDas · July 21, 2024, 2:40pm

Is it possible to tell what happened to the detached template procedures as presented in 24-106? I really like the idea presented there; maybe the same could be extended to non-generic code as well.

eelis · July 22, 2024, 6:08am

Great questions, thanks! I don’t have answers to those.

But you got me thinking that if ^ were truly a binary operator, one would need to see how it operates with different things (overloading, as you said). Some natural ways come to mind:

T^real (singular type, no parens)

T^[real, integer] (list of two types)

But one would have to introduce a list that can contain types. Then one could argue for operating with a dict, T^{}, but the whole concept of dicts does not exist. What about then replacing the dict with a derived type, T^DT? I see that this derived type would not be a regular derived type, but something that contains types as values (same for the dict).

Then another problem that I see with a true binary operator, when testing overloading, in this context is that it is runtime, whereas the template operator should be a compile time operator.

I am also thinking about the possibility of T1 ^ T2, but not sure if it makes sense. One would probably have to define what the operator means in each case.

So the operator would be compiletime and require the introduction of types as parameter values. Thus, it may be that the concept remains a mind trick to see some inner beauty in the syntax notation of T^().

Btw, in this context T^[] feels more natural. (Could that be one option in the poll, @certik?)

everythingfunctional · July 22, 2024, 2:01pm

That works fine. Your points in the discussion are good, so if you want to leave the various options in and try prototyping the winner(s), that would be a good way to go.

There wasn’t really a “technical” reason to add it. It was purely an aesthetic one to make it easier for humans to notice the template instantiation compared to all the other uses of ().

Possibly, but they might also require backtracking in the parser. Someone who’s actually written a Fortran parser would have to say whether that’s already needed in any other cases.

Yes. The syntax works in fixed-form, where spaces are ignored.

No, no and no. These taken together would nearly be a Turing complete, compile-time language, which we’re trying to avoid.

The fact that these questions come up now is a reason I didn’t like adding the extra symbol in the first place. It doesn’t have any semantic meaning, but its presence suggests that there is some.

We decided that with standalone template procedures this wasn’t as crucial, and didn’t have the bandwidth to pursue it for F202Y. So far we think we’ve left the door open for it in the next revision.

This idea already exists in module procedures and submodules. That concept doesn’t quite work for templates, because for procedure calls you only need the interface, but for template instantiation you actually need the entire contents of the template.

The ideas discussed here have been explored in some other languages. The term to search is “higher kinded types”. I don’t think it would be good to introduce that level of complexity into Fortran.

FortranFan · July 23, 2024, 3:19pm

To all other Community members interested in enhancing Fortran:

There is an urgent need to form a Global Fortran Foundation.
Community must wrest the development of Fortran away from WG5 and place it with the Global Fortran Foundation. WG5 can work on ISO standardization, if it so chooses, but language development is clearly not their competence.
Global Fortran Foundation shall recognize and champion all open source Fortran processors including LFortran (LLVM) and GCC/gfortran.
Global Fortran Foundation shall develop a completely open workflow and procedures for Fortran advancement proposals, say FAP similar to Python PEP, that includes support and funding for prototype development and inclusion in the official language version as determined by the Community via this workflow.

Note that without something like this, the Community can wait decades and decades and no meaningful advancement shall occur. Leave it to WG5 and J3, and >98% of the proposals will be treated as dead-on-arrival, regardless of an overwhelming majority of the Community getting behind a proposal.

certik · July 23, 2024, 7:45pm

@FortranFan I moved your post here. That way we have the workflow ideas in one place, and other threads can focus on technical discussion and we are not mixing the two. If you want to start some other thread dedicated to it, I am happy to move all such posts there too. Until then I figured this thread will do.

eelis · July 24, 2024, 12:59pm

Thank you for the interesting point, @mfurquan. However, I am not sure I understand what you mean. How would a “consistent OOP” solve the introduction of compile time types as values (higher order types)?

mfurquan · July 25, 2024, 5:32am

There were several inconsistencies in the post. Hence, I deleted it. Nevertheless, everything is an object in pure OOP languages, hence making a list of types is naturally allowed. Now, how is that possible with static typing is another question.

everythingfunctional · July 29, 2024, 8:15pm

Based on some feedback here, I’ve written a paper to suggest some alternative syntax for advanced instantiation of standalone template procedures. The suggestions in the paper are:

INSTANTIATE tmpl_proc(...) => new_name
INSTANTIATE tmpl_proc(...) <= new_name
new_name => INSTANTIATE tmpl_proc(...)
procedure(), parameter :: new_name = tmpl_proc^(...)

If you’d like to read the paper and make any suggestions or comments, you can do so here:

certik · August 4, 2024, 8:32pm

certik:

I would even one-up it:

print *, tmpl_func<a, b>(c), e<d, f>(e, f)

This can be either:

first = tmpl_func<a, b>(c)
second = e<d, f>(e, f)
print *, first, second

or

bc = b > (c) ! same as `b > c`
ed = e < d
print *, tmpl_func<a, bc, ed, f>(e, f)

This seems pretty mind-bending.

@FortranFan I figured out how this could be parsed. You would need to use the semantics while parsing, and you know if tmpl_func is a generic function and how many type parameters it is expecting, as well as if e is a generic function or just an integer (say). When you parse this line, you discover e, and if it is an integer, then the first parse tree above is not allowed, so it must be parsed as the second option. If e is a generic function, then this must be parsed as the first tree.

So this can indeed be parsed. But the price for it seems to be that you need to build semantic information as you parse, and then use it to determine what kind of parse tree to create. This is exactly what C++ parsing also requires, and why the parser is so complicated and slow. I would advise against requiring this kind of parser.
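To make the cost concrete, here is a toy recursive-descent parser in Python whose result depends on a symbol table passed in from outside. It is a sketch under several assumptions, not how any Fortran compiler works: the `templates` dictionary (name mapped to number of template parameters) stands in for real semantic analysis, the grammar covers only names, parentheses, `<`, `>` and commas, and the rule that the last template argument may not contain a top-level `>` is introduced here just to avoid backtracking.

```python
import re

TOKEN_RE = re.compile(r"\s*([A-Za-z_]\w*|[<>(),])")


def tokenize(line):
    line = line.rstrip()
    tokens, pos = [], 0
    while pos < len(line):
        match = TOKEN_RE.match(line, pos)
        if match is None:
            raise SyntaxError(f"unexpected input near column {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens, templates):
        self.tokens = tokens
        self.pos = 0
        self.templates = templates  # toy symbol table: name -> template parameter count

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, token):
        if self.peek() != token:
            raise SyntaxError(f"expected {token!r}, found {self.peek()!r}")
        self.pos += 1

    def items(self):
        """expr (',' expr)*"""
        result = [self.expr()]
        while self.peek() == ",":
            self.pos += 1
            result.append(self.expr())
        return result

    def expr(self, allow_gt=True):
        """primary [('<' | '>') primary]; '>' is left alone when it must close a list."""
        left = self.primary()
        op = self.peek()
        if op == "<" or (op == ">" and allow_gt):
            self.pos += 1
            return (op, left, self.primary())
        return left

    def primary(self):
        token = self.peek()
        if token == "(":
            self.pos += 1
            inner = self.expr()
            self.expect(")")
            return inner
        if token is None or not (token[0].isalpha() or token[0] == "_"):
            raise SyntaxError(f"expected a name, found {token!r}")
        self.pos += 1
        if token in self.templates and self.peek() == "<":
            # Semantic information decides: this '<' opens a template argument list.
            self.pos += 1
            count = self.templates[token]
            targs = []
            for i in range(count):
                # Assumed rule: the last argument may not contain a top-level '>'.
                targs.append(self.expr(allow_gt=(i < count - 1)))
                if i < count - 1:
                    self.expect(",")
            self.expect(">")
            self.expect("(")
            args = self.items()
            self.expect(")")
            return ("inst", token, targs, args)
        return token


def parse(line, templates):
    parser = Parser(tokenize(line), templates)
    result = parser.items()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected {parser.peek()!r}")
    return result


LINE = "tmpl_func<a, b>(c), e<d, f>(e, f)"
both_templates = parse(LINE, {"tmpl_func": 2, "e": 2})  # first reading
e_is_variable = parse(LINE, {"tmpl_func": 4})           # second reading
```

With both names registered as two-parameter templates, the result is two instantiations; with only `tmpl_func` registered (taking four parameters), the same tokens become one instantiation whose arguments include `b > (c)` and `e < d`. The token stream is identical in both cases, which is exactly why the parser cannot decide without the symbol table.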
