New generic procedures feature

certik · December 4, 2024, 6:51pm

Here is the “requirement”:

type(integer(*), real(*), complex(*)) :: a, b, ...

Here is the “usage”:

  if (a > b) then

The compiler checks that “a > b” satisfies the requirement. The requirement says the types can be one of: integer, real, complex. It checks them in order, and finds that this works for integer and real, but not complex. So it gives a very nice error message why the operation cannot be done. No instantiation needed.
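As a rough illustration in today's standard Fortran, the check described above can be mimicked by writing the specifics by hand. This is only a sketch: the module name `compare_demo` and the function names are hypothetical, and it uses the existing generic interface rather than the proposed requirement syntax.

```fortran
module compare_demo
  implicit none
  private
  public :: is_greater

  ! Hand-written stand-in for the set of types a requirement would list.
  interface is_greater
    module procedure is_greater_int, is_greater_real
  end interface

contains

  logical function is_greater_int(a, b) result(res)
    integer, intent(in) :: a, b
    res = a > b          ! valid: ">" is defined for integer operands
  end function

  logical function is_greater_real(a, b) result(res)
    real, intent(in) :: a, b
    res = a > b          ! valid: ">" is defined for real operands
  end function

  ! A complex version is deliberately absent: the statement
  !   res = a > b
  ! with complex a and b is rejected at compile time, because the
  ! standard defines only == and /= for complex operands.
end module compare_demo
```

A reference such as `is_greater(2, 1)` is resolved to the integer specific while compiling, which is the same kind of compile-time decision the requirement check would make.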

Yes, the above requirement syntax only allows integer, real, complex, but not type(my_type).

If you read the “traits” proposal (linked above), it talks about this. Also compare this to other languages that allow this specification. It’s part of a generic feature.

The intention is clear that it was meant to work “generically” in the above sense (yes for integer, no for type(my_type)).

Again, I very strongly recommend the generic subgroup to figure out how to deliver a consistent and unified approach to generics, not two separate proposals.

jwmwalrus · December 4, 2024, 7:11pm

But what would you do with things like:

subroutine some_sub(a, b) bind(C, NAME = 'f_call')
    ...
    x = generic_name(a, b)
    ...
end subroutine

If generic_name is in a library, and the main function is on the C side, then the specific procedure won’t be instantiated at runtime and the program may compile but certainly crash.

The way I see it, the auto-generic subprograms feature is just procedure name overloading (so no dynamic polymorphism), and all the combinations should be generated at compile time.

The templates feature, on the other hand, might be in the same category as coarrays, requiring a Fortran main program unit so that things can be properly initialized/instantiated.

certik · December 4, 2024, 7:37pm

jwmwalrus:

But what would you do with things like:
subroutine some_sub(a, b) bind(C, NAME = 'f_call')
    ...
    x = generic_name(a, b)
    ...
end subroutine
If generic_name is in a library, and the main function is on the C side, then the specific procedure won’t be instantiated at runtime and the program may compile but certainly crash.

I don’t see a problem here: the compiler knows at the line x = generic_name(a, b) at compile time what the argument types are for a and b, so it instantiates generic_name as needed, exactly once in this case.
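A minimal sketch of that compile-time resolution, written with the existing generic interface so that it compiles today. The module name `wrap_demo`, the two specifics, their `a + b` bodies, and the extra output argument `x` are assumptions made for illustration; only `some_sub`, `generic_name`, and `f_call` come from the example above.

```fortran
module wrap_demo
  use, intrinsic :: iso_c_binding, only: c_double, c_int
  implicit none
  private
  public :: generic_name, some_sub

  interface generic_name
    module procedure generic_name_int, generic_name_double
  end interface

contains

  function generic_name_int(a, b) result(x)
    integer(c_int), intent(in) :: a, b
    integer(c_int) :: x
    x = a + b            ! placeholder body
  end function

  function generic_name_double(a, b) result(x)
    real(c_double), intent(in) :: a, b
    real(c_double) :: x
    x = a + b            ! placeholder body
  end function

  ! Callable from C as: void f_call(double a, double b, double *x);
  subroutine some_sub(a, b, x) bind(C, name='f_call')
    real(c_double), value :: a, b
    real(c_double), intent(out) :: x
    ! a and b are declared real(c_double), so this reference is resolved
    ! to generic_name_double while compiling some_sub, not at run time.
    x = generic_name(a, b)
  end subroutine
end module wrap_demo
```

The C caller only ever sees the symbol `f_call`; the choice of specific has already been made by the Fortran compiler.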

Why do you expect this to crash?

jwmwalrus · December 4, 2024, 8:12pm

I should have been more clear:

Library containing generic_name is compiled with a Fortran compiler. The generic_name passes the requirements, but the specific code is not yet generated.
Executable invoking f_call is compiled with a C/C++ compiler, which doesn’t need to know that it has to generate extra code for generic_name according to Fortran semantics.
Code might not crash (since it’s a subroutine and argument values might be okay), but its behavior won’t be the expected one.

That “instantiate-only-the-few-needed” is akin to just-in-time compilation (“just-in-time instantiation”?), which imposes a burden on the final runtime.

I know that calling, e.g., C# code from C is possible, but you need to know the specifics about COM objects initialization from unmanaged code. The same thing applies here —i.e., knowing the specifics about “just-in-time instantiation”, which varies from one compiler combination to another and, imho, defeats the purpose of C/Fortran interoperability.

The procedure name overloading feature (generic interface) was actually introduced in Fortran 33 years ago, with the obvious drawback of requiring manual name mangling and the use of an include statement. What’s being proposed is just a nice upgrade to that same feature.
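For reference, the existing feature looks roughly like this. It is a sketch with hypothetical names (`swap_demo`, `swap_integer`, `swap_real`); the duplicated bodies are what an include file or a preprocessor is usually used to avoid.

```fortran
module swap_demo
  implicit none
  private
  public :: swap

  ! One generic name, resolved at compile time to a hand-named specific.
  interface swap
    module procedure swap_integer, swap_real
  end interface

contains

  subroutine swap_integer(a, b)
    integer, intent(inout) :: a, b
    integer :: tmp
    tmp = a; a = b; b = tmp
  end subroutine

  ! Same body again; only the declared type differs.
  subroutine swap_real(a, b)
    real, intent(inout) :: a, b
    real :: tmp
    tmp = a; a = b; b = tmp
  end subroutine
end module swap_demo
```

Every specific exists in the compiled module whether or not any caller uses it, which is the behavior being argued for here.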

certik · December 4, 2024, 8:59pm

@jwmwalrus I think you are missing a step where a Fortran compiler must compile the some_sub function at some point. At this very point it will instantiate generic_name.

I think you are also asking how the compiler can instantiate. One way is to save the generic function into the .mod file, so that it is available when needed; that’s what LFortran does, for example.

The code will not crash and the function call will be the expected one.

In the above design there is no JIT at runtime and no runtime overhead. It’s as fast as if you manually wrote the specific function yourself in the Fortran code.

Let me know if this clarifies the issue.

jwmwalrus · December 4, 2024, 9:25pm

Fair enough.

That’s my point. You wrote the auto-generic, so all the combinations must be in the library, because, as I said, the auto-generic subprograms feature is just an upgrade of a feature that has existed in Fortran for over three decades. It’s kind of late for a unifying approach (even in some other languages, function overloading and generics/templates are separate things).

In other words, the auto-generic subprograms feature should imply instantiation of all combinations. Just like the upcoming templates feature requires explicit use of the instantiate statement (or implicit through simplified templates).

These things aside, it’s funny that the interface keyword was overloaded so much, that they introduced the generic keyword, which is now overloaded as well (at least in concept).

jwmwalrus · December 4, 2024, 10:50pm

@certik I guess the approach I used didn’t help. Here’s another angle:

This code will go into libfuncs.so:

generic subroutine generic_name(a, b)
    ...
end subroutine

And this code will go into libwrap.so

subroutine some_sub(a, b) bind(C, NAME = 'f_call')
    ...
    x = generic_name(a, b)
    ...
end subroutine

Even if I use nm in libfuncs.so to get the specific name, and I know the types of the arguments for that specific combination, libfuncs.so is an incomplete library that cannot be used on its own. Am I right?

certik · December 4, 2024, 10:55pm

That’s what I initially thought above, as well as what @everythingfunctional confirmed, but it is in conflict with the actual proposal which is very clear that it should be on demand at compile time:

Ad-hoc specialization shall be performed at compile time, that is,
there will be no trace of the non-chosen specializations in the
generated anonymous specific.

So it would be good to understand what exactly this proposal is proposing.

Regarding your last example of libfuncs.so, the answer depends on which way the proposal is meant:

if all versions should be instantiated (I think that’s not what the proposal says), then libfuncs.so would have all the versions, and then you can call it from libwrap.so later.
if it should be instantiated on demand (I think that’s what the proposal says), then libfuncs.so doesn’t have an implementation of generic_name, rather the actual interface and generic (i.e., uninstantiated) implementation are in libfuncs.mod, then when you use this module in a separate libwrap.f90, it will get instantiated on demand for you, and when you create libwrap.so, it will contain both some_sub as well as instantiated generic_name (with some unique name, so that it doesn’t clash).

So both ways can be implemented.

For the lapack use case, if the second approach is used, it would mean you can’t easily create an lapack.so library, as all the implementations would be in a .mod file and it gets inlined into your application. That has pros and cons.

Another issue is what happens if you have a global generic function, not in a module. Fortran compilers traditionally do not create .mod files in that case.

The template generics have exactly the same issue, so it has to be solved anyway. But all these things make the generic feature almost identical to template generics, it really is the same thing.

everythingfunctional · December 4, 2024, 11:31pm

The current papers outlining the requirements and syntax do not say anything about whether “unused versions” are compiled or not. The closest thing it says is

As implied by the above, such a generic procedure implicitly defines an anonymous set of specific procedures, one for every combination of type, kind type parameter value, and rank.

This is from an older paper, but even still it is talking about branches inside the generic procedure, not about instantiation at the call site. For instance in the following

generic subroutine s(x)
  type(real, integer) :: x
  ...
  select type (x)
  type is (real)
  ...
  type is (integer)
  ...
  end select
  ...
end subroutine

the specific procedures generated should not actually contain a branch and the unused code.

These are disallowed.

No they don’t. If a template isn’t available in the current scope (through use or host association), you can’t instantiate it.

If you wanted to do what the generic procedures feature does with templates, you absolutely could, but it would be much more verbose and a bit less intuitive.

jwmwalrus · December 4, 2024, 11:43pm

I mentioned nm and knowing the types of the arguments, implying that a .mod file might not be involved in the invocation.

~~So it all really comes down to the fact that the “ad-hoc specialization” specification shouldn’t be there in the first place —or maybe it should change “shall” by “could”?~~

The specialization is probably okay in LFortran (since it knows there’s some main program involved and code can be safely discarded), but for a *.f90 -> *.o, then *.o -> a.out approach, that might not work.

certik · December 4, 2024, 11:44pm

I see. So if the generic function gets compiled into an .so file, all versions will get generated, correct?

jwmwalrus · December 4, 2024, 11:54pm

Yes, otherwise the .so is incomplete and kind of useless.

Although the incomplete shared object approach is not totally invalid —e.g., one of the reasons it’s difficult to build fully static executables with the GNU stack is because libc.so.6 is incomplete and actually loads other “complementary” libraries on demand.

everythingfunctional · December 5, 2024, 5:26pm

When the proposal has talked about “ad-hoc specialization”, it is not talking about the call sites. It is talking about conditional blocks within the procedure. Without addressing this, something like the following example would not be valid, because the argument is not polymorphic.

generic subroutine s(x)
  type(real, integer) :: x
  select type (x)
  ...
end subroutine

jwmwalrus · December 5, 2024, 5:54pm

Thanks for the clarification. I edited the post to scratch that.

So it seems auto-generic subprograms is indeed just an upgrade/improvement of the old feature —so no deferred instantiation.

septc · December 5, 2024, 6:09pm

If one wants to make a “generic” routine as defined above (= a set of statically overloaded routines in the F90 sense) for type-bound procedures shared by different types, does it look something like this?

generic subroutine some_method(self)
  class(Mytype1, Mytype2) :: self
  select type (self)
  ...
end subroutine

Then, the meaning of select type seems ambiguous… (i.e., does it mean compile-time selection of conditional blocks by the compiler, or run-time dynamic type selection of self?)
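For contrast, this is what select type means in current standard Fortran: a run-time selection on the dynamic type of a polymorphic object. The sketch uses hypothetical names (`dispatch_demo`, `describe`) and does not settle which meaning the proposal intends for a `class(...)` dummy in a generic subroutine.

```fortran
module dispatch_demo
  implicit none
  private
  public :: describe

contains

  ! x is unlimited polymorphic, so its dynamic type is only known at run time.
  function describe(x) result(label)
    class(*), intent(in) :: x
    character(len=:), allocatable :: label
    select type (x)
    type is (real)
      label = "real"
    type is (integer)
      label = "integer"
    class default
      label = "other"
    end select
  end function
end module dispatch_demo
```

Here a single compiled procedure contains all branches, unlike the generated specifics discussed earlier in the thread, which are meant to keep only the branch that applies.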

RonShepard · December 5, 2024, 6:11pm

It seems that deferred instantiation does not quite fit into the current compilation model of source file, object file, library file, and then a load step to produce the executable file. When the compiler sees a reference to a generic procedure that has not yet been compiled/instantiated, where should the specific compiled object code go? Should it reach into the library and modify or add the new specific code? Should it reach into the previously compiled object file and place the new code there? Should it compile the specific version and place it into the current object file? None of these options fit the current model. Are there other possibilities? And this refers just to the limited cases where the compiler knows the types of the arguments at compile time; there are also situations where the compiler cannot know those types, so what is supposed to happen then?

septc · December 5, 2024, 6:22pm

It seems to me that the generic subroutine above is a set of compiler-generated specific routines (with their specific names being anonymous and not accessible to users), so essentially a subset of what fypp can do today…? I think it will be useful to save the effort of creating a lot of similar routines that differ only in types that are known at the definition of the routines (if one does not use tools like fypp). On the other hand, does it mean that, if one wants to write a “true” generic routine (in the same sense as other languages), the only way is to use the very verbose version rather than the more compact ones proposed recently? (I feel the downside of the verbose version is code readability, i.e., the generic code is very unreadable (at least to me) in its current form.)

Also, in the proposed generic subroutine, are the names of the specific routines kept internal and not “exported” to the user? (So, it is not possible to pass a particular instantiation of the routine to other procedures?)

RonShepard · December 5, 2024, 7:43pm

A programmer might want to pass a reference to a generic routine through the argument list or he might want to assign a procedure pointer to the routine. A question then arises of when is the generic resolved to its specific version. Is it done during the actual/dummy argument association step or during the procedure pointer assignment, or is it delayed somehow until the procedure is referenced based on its argument types at that later time? If it is done at the later time, then when would the procedure actually be compiled/instantiated? Would it be done at the pointer assignment step, or later when it is referenced in an expression with its arguments? If a pointer assignment occurs, but is then never referenced in an expression, then the compiled code would never need to be generated. Can a compiler know that at compile time?

septc · December 5, 2024, 7:54pm

Just a quick reply (only very partial): I’m referring here to the generic subroutine proposed above, which instantiates every combination of arguments at the definition of the routine (i.e., not “delayed”). I think it is similar to “explicit instantiation” of templates or generics in other languages to make a shared library, but here done automatically for all combinations of dummy argument types (but my understanding may be wrong…)

RonShepard · December 5, 2024, 7:59pm

Even in this case, there is still the question of what a pointer assignment does. Does it point to the generic, which can then be resolved later once its argument types are known, or does it point to one of the specific routines? Both types of assignments might be useful to a programmer.
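For comparison, with the existing generic interface a procedure pointer can only be associated with a specific procedure, so the choice is made at the pointer assignment. The sketch below uses hypothetical names (`pointer_demo`, `twice`, `real_fn`, `apply_via_pointer`); how the proposed generic subroutines would behave, given that their specifics are anonymous, is the open question.

```fortran
module pointer_demo
  implicit none
  private
  public :: twice, apply_via_pointer

  interface twice
    module procedure twice_real, twice_integer
  end interface

  ! Interface that the pointer target has to match.
  abstract interface
    function real_fn(x) result(y)
      real, intent(in) :: x
      real :: y
    end function
  end interface

contains

  function twice_real(x) result(y)
    real, intent(in) :: x
    real :: y
    y = 2.0 * x
  end function

  function twice_integer(x) result(y)
    integer, intent(in) :: x
    integer :: y
    y = 2 * x
  end function

  function apply_via_pointer(x) result(y)
    real, intent(in) :: x
    real :: y
    procedure(real_fn), pointer :: p
    ! The target must be a specific procedure; "p => twice" is not
    ! permitted because twice is only a generic name.
    p => twice_real
    y = p(x)
  end function
end module pointer_demo
```

Because the specific has to be named at the assignment, the anonymous specifics of the proposed feature would need some other way to be selected.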
