KART: a recommended design element toward Generics in Fortran

Introduction

  • Enhanced facility toward Generics in Fortran is a major focus of Fortran 202Y standard revision with several items on the worklist,
  • In response to Community feedback and also push from another national body, WG5 has also decided to, “Investigate other mechanisms for simplifying the use of templates.”
  • Among the most frequently expressed needs toward Generics by the user Community include
    a. Generic subprograms e.g., add ability to author a procedure such as FINDLOC in Fortran that works on any supported type of the language
    b. Generic containers e.g., add ability to parameterize a derived type in Fortran with any supported type such as a list type (linked list, trees, etc.) or stack, etc.
  • TKR semantics around Type (T), Kind (K), and Rank (K) is a fundamental aspect of generic resolution in Fortran,
  • Increasingly Attribute (A) is proving extremely important; Fortran 2008 recognized this and allowed ALLOCATABLE and POINTER attributes in generic resolution.

Proposal:
KART aspect around Kind (K), Attribute (A), Rank (R), and Type (T) is recommended as an integral element of the design toward enhanced Generics in Fortran starting with Fortran 202Y.
Premise:

  • Fortran language introduces an entity block similar to interface block in current language standard (c.f. 3.87 interface block, Terms and Definitions, Fortran 2023),
  • A generic entity block and a specific entity block are part of this introduction
  • KART is a basic builiding element of the entity block
  • The entity block
    1. provides a structured manner of collections of KIND, ATTRIBUTE, RANK, and TYPE of the entity, similar to an INTERFACE block
    2. serves toward the generic resolution,
    3. follows the generics-by-substitution design philosophy,
    4. builds on existing facilities in the language toward generics

Examples:
NOTE: syntax employed in the examples is notional only, especially the use of carats (<..>). The Community feedback shall permit better syntax development.

  1. Generic “accumulator” method toward intrinsic numeric types with all their supported kinds
module m
   generic, entity :: T
      type => intrinsic_numeric 
   end generic
contains
   subroutine accumulate<T>( x, a )
      <T>, intent(inout) :: x
      <T>, intent(in)    :: a
      x = x + a
   end subroutine
end module
  • Usage
..
use m
..
real :: r
..
call accumulate<real>( r, 1.0 ) 
  1. Generic Container for a stack of any type but with the ALLOCATABLE attribute
module stack_m
   generic, entity :: T
       type => *
       attribute => allocatable
   end generic
   type :: stack_t<T>
      <T> :: node
      type(stack_t), allocatable :: next
   end type
contains
   ..
end module

Next steps: a lot needs to be developed with great effort, but hopefully this initial post can lead toward influencing the language development with support of KART semantics as an integral element of the design of Generics in Fortran.

1 Like

I’m not sure I understand this proposal. Could you perhaps explain its goals and what the code examples do?

@meow464 and any other interested readers:

Great question by @moew464 , let me try to explain further.

I will break this up into 2 threads, one for a subprogram and another for a “container” solution (i.e., around a derived type). I will try to keep it simple, but please bear with me if it comes as too simple.

Note I focus along 2 threads because in terms of use cases in Fortran toward Generics communicated by users in a variety of forums and teams have converged as such:

  1. a need to conveniently author generic subprograms, often what serves generic libraries that Fortranners can offer to diverse users and applications,
  2. a need to conveniently author generic containers, again what works with item 1 as part of generic libraries. Collections, lists, stacks, trees, etc. often are the aspects around such needs.

My recommendation is the work on proper Generics in Fortran using this decade (say Fortran 202Y standard targeted around 2028/29 timeline) to retain laser-like focus on the above 2 use cases.

The first example in the original post is toward the first case.

For the example, readers please refer to this link on an accumulator factory for some background. Now say, you have authored an accumulate function for default REAL in Fortran as follows:

   subroutine accumulate( x, a )
      real, intent(inout) :: x
      real, intent(in)    :: a
      x = x + a
   end subroutine 

Now, say you want to extend to work with any numeric intrinsic type and all the supported kinds by a processor in a conforming manner, meaning no use of macros and such.

You know the drill: it’s a highly tedious, verbose process with unfortunate code duplication and which may not be portable because the language and its standard does not yet offer a convenient way to work with all supported kinds.

A programmer might do:

module m
   ..
   use, intrinsic :: iso_fortran_env, only : RKS => real_kinds, IKS => integer_kinds,  &
                                             CKS => complex_kinds
   ..
   generic :: accumulate => accumulate_1, accumulate_2, ..
contains
   subroutine accumulate_1( x, a )
      real(RKS(1)), intent(inout) :: x
      real(RKS(1)), intent(in)    :: a
      x = x + a
   end subroutine 
   subroutine accumulate_2( x, a )
      integer(IKS(1)), intent(inout) :: x
      integer(IKS(1)), intent(in)    :: a
      x = x + a
   end subroutine 
   subroutine accumulate_3( x, a )
      complex(CKS(1)), intent(inout) :: x
      complex(CKS(1)), intent(in)    :: a
      x = x + a
   end subroutine
   .. 
end module 

and as explained in many places, the combinatorial explosion is untenable.

But now in this simple case, given the legacy of Fortran, there are two aspects to the desired genericity:

  1. the solution must address all numeric types e.g., LOGICAL is insensible
  2. all the kinds of all the numeric types too must be addressed.

Now, in other similar situations toward a library solution such as the “accumulator” example here, genericity along all suitable RANKs and also the ATTTRIBUTEs, whether an ALLOCATABLE or POINTER or nonallocatable, nonpointer, is needed.

KART semantics is meant to cover all such scenarios. Additionally the use of an entity block, analogously to an interface block, is intended to ease the solution for the consumers.

Thus, using notional syntax, consider an entity block:

   generic, entity :: E
      type => intrinsic_numeric 
   end generic

that, say, implies by default, any kind, attribute, rank as long as it is an intrinsic_numeric type.

Note, in general, the facility of the entity block can be made comprehensive to cover a variety of the use cases around genericity, say unlimited genericity, or kind genericity or rank genericity only, or TKR genericity, or type and attribute genericity, etc.

For the simple accumulate case, a programmer may then author the generic subprogram simply as

..
   subroutine accumulate<E>( x, a )
      <E>, intent(inout) :: x
      <E>, intent(in)    :: a
      x = x + a
   end subroutine
..

and the consumer (the customer/caller, the practitioner) can simply do in situ instantiation as

call accumulate<real>( r, 1.0 ) 

Or, in more complex scenarios provide a specific entity, analogous to specific interface, to consume the generic subprogram.

However, some of the basic aspects of the Generics design that Fortranners must demand include:

  1. compactness - no needless verbosity,
  2. natural extension and considerable improvement over current standard Fortran, meaning the code must feel a natural part of the language, not another language (that may or may not be Turing-complete) that is force-fit into Fortran simply to introduce Generics. Macros and FYPP type of preprocessing suffer from this and that’s why they are options that are better avoided.
  3. It shall be feasible to take existing code and refactor that to make it generic. You can see it above with the accumulate routine. The changes are minimal.
  4. To build on the points 1 thru’ 3 above, the generic code shall be feasible stand-alone and also as MODULE subprograms. Meaning, it shall not require the generic code to be hosted in another scope such as TEMPLATE body as is currently worked upon by J3 - that looks bad.
1 Like

@FortranFan do the KART generics for the accumulate<E>( x, a ) case above check the generic code inside, that is, will the compiler check that the <E> type provided by the user will support +? What will happen if the user provides a string for E that does not support +? Will the compiler have to go into the generic subroutine (which can be arbitrarily nested) and try to figure out if the user type supports all the operations?

In other words, does KART support “strong concepts” (strict restrictions)?

I have a couple of basic questions about this.

First, why is <real> required in this syntax? Don’t the (presumably real) arguments provide the compiler enough information about the type and kind?

Second, where does the object code for the compilation end up? Is it in the object file where the reference occurs, or is it in the object file that corresponds to the subroutine source code? If it is the former, then does that mean that multiple identical compilations will occur when the subroutine is referenced from different files? If it is the latter, then does the modification/creation date of the remote object file get updated with each new instantiation? That would create havoc with the normal build process which depends on modification dates (e.g. with make or cmake).

Yes, that’s indeed and absolutely the idea re: both your questions:

  • the entity block is intended to inform the processor of the general semantics involved in the generic entity instances, say with accumulate in the example that the entity is restricted to a numeric type. Then whilst processing the specific instructions e.g., x = x + a, the operator + and the assignment = are validated as conforming based on properties of the specific instances of the generic entities x and a.
  • Say there were an instruction involving the length-type parameter (e.g., len(E)), the processor shall be required to detect and report the nonconforming instruction.

Thus strong concepts are at work.

1 Like

@RonShepard ,

With question 1, in the simple case for illustration, yes the processor can deduce:

  • However, I suspect there shall be a need to distinguish a generic interface as available currently in the current standard relative to a generic subprogram in a future revision, .
  • As ideas develop further though, perhaps a more compact syntax can be introduced.
  • But for now, my recommendation will be to retain the in situ instantiation notation i.e., call accumulate<real>( r, 1.0 )

Re: “where does the object code for the compilation end up?,” the feature development has to proceed in a manner that is absolutely consistent with the legacy with Fortran up until the current standard that the details left up to the processor up until now remain processor-dependent:

  • However, as a feature, as shown in the original post, it is fully expected the USE association and similar semantics shall extend to the entity block (say E in the above example) and the generic subprogram (say accumulate in the example).

Practically one might expect a processor to encode sufficient details toward Generics in the MOD file but complete the processing of the caller module/program at compile-time “instantiation”.

I recall reading about a solution to this problem about 50 years ago. An academic modified his Fortran compiler to append a string to each function or subroutine external symbol a string reflecting the types of the arguments. Then if a program called a function or subroutine with mismatched arguments, there would be a linker error (“external symbol XXX not found”) pointing to the problem call. Simple, but effective. Doesn’t this solve half the present problem too?

Is duplication of code really such a serious problem? While many may use generic subroutines, I don’t expect many to write them, and writers can use a macro processor to generate the alternative versions.

Code duplication is absolutely an issue. It creates trouble for maintenance, correctness, extensibility, and portability. The language currently offers support for multiple integer and real kinds, yet no way to write routines for all supported kinds in a compiler-independent way.

If the language cannot write generic libraries without relying on external macro processors or additional code generation steps as part of already complicated compilation processes, that is an issue. Being able to write native libraries in a language strengthens the ecosystem and gets people excited to use the language and tools, while contributing themselves to improve the ecosystem for everyone.

5 Likes

It seems to me that what you rationalize here within your KART acronym is broader than generics and applies to OOP in general.
IMHO, what is missing in Fortran OOP is a base object that would be at the root of any type (including intrinsic types). You find such objects in Java or .NET and they contain pretty much what you describe in KART.

1 Like