Prototype implementation of Fortran generics in LFortran

I can get why it’s designed like that, nonetheless it’s terribly verbose and obfuscated. One really need a mechanism to specify that a type is necessarily among a subset, so that one can really write y = a + b*x and not y = plus(a,times(b,x)) in the case one want the template to work for numeric types only. And some select type capability in the templated routines.

1 Like

In the templated subprogram, it is not even y = plus(a,times(b,x)), it is

y = F1( a, F2(b,x) ) 

where F1 and F2 are templated “interfaces” masquerading as “requirements” elsewhere.

This takes obfuscation to another level.

The users of Fortran, who are domain practitioners in science, engineering, etc. and who have stuck with or have come to Fortran for its simple formula translation and plain language style of expressing code, will be entirely lost with something like this.

Honestly this really calls for going back to the drawing board.

1 Like

I think that is how it works. The y = plus(a,times(b,x)) version allows also user-defined derived types.

I think the y = a + b*x version can be used for intrinsic integer and complex data types, but not intrinsic logical data types. The logical version would also require the obscure looking y = plus(a,times(b,x)) version with the appropriate definition of plus and times.

One other general question is when is the code actually compiled? Is it when an instantiate statement is encountered? Does every instantiate statement result in a new compilation? If not, then where is the compiled code placed? Is it in the calling object file, or is it somehow appended to the referenced module object file after it has already been compiled once?

Zig’s comptime is an excellent example of a powerful system with syntax not too different from the main language. Documentation - The Zig Programming Language This approach enables generic programming but is even more powerful. Adding comptime parameters alone to Fortran would enable a lot of the generic programming desires to limit code duplication and cluttered modules + interface definitions.

1 Like

Yes, I think the current design will allow adding y = a + b*x as an option for y = plus(a,times(b,x)).

We have created the following document precisely to answer these questions: https://github.com/j3-fortran/generics/blob/main/theory/comparison/comparison.md, let me know if it answers it.

The code is compiled when instantiated. So the templated function is stored in a mod file, no object code is generated, since these generics are a compile time feature.

I think the code should be simplified, as I mentioned many times in this thread. The underlying mechanism how it works I think is good.

1 Like

Let’s hope you’re right !

1 Like

Let’s say that a project has three files, a_mod.F90, b.F90, and c.F90. The a_mod.F90 has the template code in a module, and both of the other two files USE that module and they instantiate the function with the same argument types and kinds.

This means that a_mod.F90 must be compiled before the other two files are compiled. That compilation creates a_mod.mod and a_mod.o.

Then b.F90 is compiled, and that compilation produces b.o. Is the compiled function within b.o, or was it written back into a_mod.o?

Then c.F90 is compiled, and it produces c.o. Where is the compiled function that was instantiated within c.F90? Is it in c.o? Did it look into a_mod.o and see the previously compiled function, and then skipped its own recompilation? Is there some way that it should know to look in b.o to see if the compiled function is there?

If b.F90 has multiple instantiations of the function (all with the same argument types and kinds). then are there multiple functions compiled and placed in b.o, or is the compiler supposed to know how to make a single compilation and then reference it?

Or is none of this specified, and it is left up to the various compilers as a “quality of implementation” kind of issue?

1 Like

Excellent questions. This will likely depend on the compiler.

LFortran has two modes of compilation:

  • Separate compilation (enabled with --generate-object-code): each file is compiled separately and a .mod and .o files are produced, which are then linked at the end.
  • Global compilation (the default): only .mod file is produced for each module, but not compilation to .o happens. The .mod file stores the intermediate representation (ASR) for all the code in that module. Then when compiling the main program, all the ASR code is loaded to memory and compiled at once.

Right now we only worked on global compilation for templates. So each instantiation is reused, as needed.

In the separate compilation model, we will have to make some choices, such as instantiate what is needed for the given file using unique names, and the instantiation will not be reused in other files. Alternatively we would have to communicate somehow how to reuse it so that it links correctly.

1 Like

This could be tricky with the typical build systems. Say you are using make, which looks at file modification dates to determine which files must be compiled. If the compilation of b.F90 causes a_mod.o to change, then that could trigger some subsequent unnecessary compilations, say of c.F90. Then if c.F90 also modifies a_mod.o, perhaps for some other template function, then that might cause b.F90 to be recompiled, ad infinitum.

1 Like

@RonShepard indeed. Even without templates we realized that if you have enough memory to fit the whole project in, it’s better to use global compilation (as described above), which is faster to compile and faster to run (since you can inline functions from modules across the whole project). In other words, the interaction of the build system and the compiler should be much tighter than on the file level basis, as is traditionally done.

1 Like

@septc nice. Yes, the Go example, especially the last one, shows that you can have “strong concepts” and very simple code. What don’t you like about:

func mysum[T int | float64](x T, y T) T {
	return x + y
}

that you would rather like to (sometimes) write it as:

func mysum[T](x T, y T) T {
	return x + y
}

It seems it is almost the same code, but first is “strong concept”, while the second is “no concept”. Given that the code is almost the same, it seems better to always enforce “strong concepts”, as does Go. Note that Go indeed does not compile the last example, with the message:

./prog.go:5:13: syntax error: missing type constraint

Indeed Go enforces strong concepts.

I think every modern language that has generics does it, so I think we should too in Fortran.

But simplifying the syntax, so that it is as pleasant to use as Go, is something we still need to do.

The first code seems preferable because you are specifying that T shall be of type int or float64 (I think, I don’t know Go). If that is the case, then I would say that function definition is strictly superior to the second. Specifying data type is not the problem. Having to specify 3 layers of interface blocks, requires blocks, template blocks, etc is ridiculous. A template definition should be minimally more complex than any normal subprogram definition.

1 Like

@tyranids I agree 100%. I think our goal should be to always specify “types for templates” (=“strong concepts”) and keeping the code similar to the Go code, that we all seem to agree is readable and short. So far @FortranFan’s example above is the simplest, but we might be able to do even better. How about this:

function mysum(x, y) result(z)
template :: T
    requires :: T is (integer .or. real(dp))
end
type(T), intent(in) :: x, y
type(T) :: z
z = x + y
end function

Or perhaps just this:

function mysum[integer .or. real(dp) :: T](x, y) result(z)
type(T), intent(in) :: x, y
type(T) :: z
z = x + y
end function

or

function mysum(x, y) result(z)
template, integer .or. real(dp) :: T
type(T), intent(in) :: x, y
type(T) :: z
z = x + y
end function

The last one is probably the most “Fortranic”. It would be nice to be explicit somehow on the “function” line that T is type parameter though. Ideas?

Maybe something like:

function mysum{T}(x, y) result(z)
template, integer .or. real(dp) :: T
type(T), intent(in) :: x, y
type(T) :: z
z = x + y
end function

To cover every numerical type and kind, we could do “numerical” and do:

function mysum{T}(x, y) result(z)
template, numerical :: T
type(T), intent(in) :: x, y
type(T) :: z
z = x + y
end function
1 Like

I am always writing my functions like this, function name(args) result(val_out), since I think it is most clear even for simple functions of intrinsic types like real and integer. Since that leaves you having to specify the type of result(…) inside the function it doesn’t really bother me.

To be honest I like the idea of defining a template type and having the real/integer/numeric and applicable kinds as its own block. That way you can have it once in a module and use it for all subprograms within that module.

Alternatively, if there was a way to have it integrated with the normal declaration you’d have already, like type(T), intent(in), numeric :: xin, that works for me too. Minimizing duplication is nice, but we do always need those type declarations for functions and subroutines so this tells you everything you need to know in one line.

I also agree that Fortran should take advantage of {} and maybe also <>, since these are brackets we are not yet using.

1 Like

Yes, I think template, numerical :: T can be either defined in the function for brevity, or lifted into the module level, and even expand it to the full form as needed, so that you can reuse it.

Note that <> often introduces challenges in the parser since we already use < and > for comparison, so things like <x>y> can quickly get complicated. But using {} as Julia should be fine.

It’s still not completely clear in my mind if these “simpler” templates with “strong concepts” miss any feature from the current generics proposals. @everythingfunctional, let me know what you think. I think we need to allow this simpler syntax as “syntactic sugar” to the full form, to make it easier for people to use for simpler things, while still allowing the full form for all the features.

It seems like it would be useful to be able to specify that some deferred type includes all possible kinds of some intrinsic type. Say all possible integer kinds. The tricky thing about fortran is that the kind values are specified in, for example, the integer_kinds(:) array, but the length of that array and the values of the elements are compiler dependent. Thus it is not possible to specify them individually in the source code in a portable way.

I think the way the deferred type T is specified is that once the type and kind are set in the instantiation, it then applies everywhere it is used. I think that is the right way to specify types in fortran. This is in contrast to the way the class() is specified in type bound procedures. It would be very nice if one could specify that several arguments must all be the same type in a type bound procedure, but that’s not the way it works. That seems unfortran-like to me. That is why all the select_type() and same_type_as() stuff is required. Things must be done at run time that would better be done at compile time. I hope this template proposal gets those declarations right.

2 Likes

Maybe

requires bin_op(T,operator(+))
requires bin_op(T,operator(*))

not only for intrinsic real\integer\complex ,but also user’s offload.

I wholeheartedly agree with the sentiment that we need to be pushing as much checking as possible to compile time. If any language has demonstrated a method to successfully perform X, Y, or Z checking at compile time and those concepts are remotely relevant in Fortran, we should require similar behavior. Run time checks destroy performance, and push potential bugs down to runtime as well when they might have been caught at compile in a better system.

1 Like

Absolutely, 100% agree as well.

I think this syntax is the closest to what you’d like to have in a scientific code!

In my mind, the cleanest for templates involving numeric kinds would be:

elemental template function myrealsum(x,y) result(z)
   real(*), intent(in) :: x,y
   real(kind(x)) :: z
   z = x+y
end function myrealsum

elemental template function my_generic_sum(x,y) result(z)
   numeric(*), intent(in) :: x,y
   numeric(type(x)) :: z
   z = x+y
end function my_generic_sum
3 Likes