Implementing fypp like pre-processing functionality in fpm

In the short run we should simply make fpm call fypp in PATH to pre-process .fypp files automatically. That might end up being the only thing we will do for .fypp files.

A separate question is what it would take to rewrite fypp or something similar in functionality into Fortran, so that we can link it with fpm, so that fpm is standalone (does not depend on Python). The main target is stdlib. Let’s look at the usage in stdlib. Below I collected the main uses cases. Did I miss any?

We can use these to design a pre-processor that is possible to implement in Fortran. If these are all or most of the use cases, they can be separated into two independent kinds:

  • generic over any type (either one or a combination of integer, real, complex, logical) and kind
  • generic over any array rank

The first one is I think quite simple to implement (i.e., design a syntax, preferably as close to fypp as possible) and implement in Fortran. The second one is more complicated, as one has to do things like select subarrays and handle declarations properly, but still it seems very much doable.

Both of these main use cases seem to 100% fit as a subset into the generics effort in the Fortran Standards Committee. So in fact we should design this well as a community and use stdlib and other codes as examples of real world usage. Then the generics subgroup at the committee can take this and design some syntax for the language. Our pre-processor would work as a prototype for this. (The generics effort in Fortran is wider, we are also designing how to do “templates/traits”, but that is a separate problem that we do not seem to need here.)

Use Cases

1: Loop over all integer, real, complex and logical kinds

Loop over all integer and real types and kinds:

#:set IR_KINDS_TYPES = INT_KINDS_TYPES + REAL_KINDS_TYPES
...
    #:for k1, t1 in IR_KINDS_TYPES
    elemental function clip_${k1}$(x, xmin, xmax) result(res)
        ${t1}$, intent(in) :: x
        ${t1}$, intent(in) :: xmin
        ${t1}$, intent(in) :: xmax
        ${t1}$ :: res

        res = max(min(x, xmax), xmin)
    end function clip_${k1}$
    #:endfor

To loop over complex and logical kinds also: stdlib/stdlib_optval.fypp at 492543d0086ff3821ecf2ee8f0a9cf46a945a7ac · fortran-lang/stdlib · GitHub

#:set KINDS_TYPES = REAL_KINDS_TYPES + INT_KINDS_TYPES + CMPLX_KINDS_TYPES + &
  & [('l1','logical')]

This is by far the most common use case. This one would be easy (I think) to implement in Fortran.

2: Loop over all array ranks

Example: stdlib/stdlib_stats_moment_scalar.fypp at 492543d0086ff3821ecf2ee8f0a9cf46a945a7ac · fortran-lang/stdlib · GitHub

  #:for k1, t1 in INT_KINDS_TYPES
    #:for rank in REDRANKS
      #:set RName = rname("moment_mask_scalar",rank, t1, k1, 'dp')
      module function ${RName}$(x, order, dim, center, mask) result(res)
        ${t1}$, intent(in) :: x${ranksuffix(rank)}$
        integer, intent(in) :: order
        integer, intent(in) :: dim
        real(dp), intent(in) :: center
        logical, intent(in) :: mask${ranksuffix(rank)}$
        real(dp) :: res${reduced_shape('x', rank, 'dim')}$

        if (dim >= 1 .and. dim <= ${rank}$) then
          res = sum(( real(x, dp) - center)**order, dim, mask) / count(mask, dim)
        else
          call error_stop("ERROR (moment): wrong dimension")
        end if

      end function ${RName}$
    #:endfor
  #:endfor

This use case also seems possible to do in Fortran.

3. Different code path for complex type

Example: stdlib/stdlib_stats_cov.fypp at 492543d0086ff3821ecf2ee8f0a9cf46a945a7ac · fortran-lang/stdlib · GitHub

  #:for k1, t1 in RC_KINDS_TYPES
    #:set RName = rname("cov",2, t1, k1)
    module function ${RName}$(x, dim, mask, corrected) result(res)
...
       ${t1}$ :: res(merge(size(x, 1), size(x, 2), mask = 1<dim)&
                          , merge(size(x, 1), size(x, 2), mask = 1<dim))
...
          #:if t1[0] == 'r'
            res = matmul( transpose(center), center)
          #:else
            res = matmul( transpose(conjg(center)), center)
          #:endif
...
    end function

This could be done by introducing a function in the pre-processor such as:

          #:if is_real(t1)
            res = matmul( transpose(center), center)
          #:else
            res = matmul( transpose(conjg(center)), center)
          #:endif

4. rname

Example: stdlib/stdlib_stats_cov.fypp at 492543d0086ff3821ecf2ee8f0a9cf46a945a7ac · fortran-lang/stdlib · GitHub

    #:set RName = rname("cov",2, t1, k1)
    module function ${RName}$(x, dim, mask, corrected) result(res)

I actually don’t know what this does. I assume it creates the name of the function somehow. This seems minor and possible to do in Fortran also.

5. Loop over each dimension up to a rank, and select_subarray

Example: stdlib/stdlib_stats_moment.fypp at 492543d0086ff3821ecf2ee8f0a9cf46a945a7ac · fortran-lang/stdlib · GitHub

          #:for fi in range(1, rank+1)
          case(${fi}$)
            if (present(center)) then
              do i = 1, size(x, ${fi}$)
                res = res + (x${select_subarray(rank, [(fi, 'i')])}$ - center)**order
              end do
            else
              allocate(mean_, source = mean(x, ${fi}$))
              do i = 1, size(x, ${fi}$)
                res = res + (x${select_subarray(rank, [(fi, 'i')])}$ - mean_)**order
              end do
              deallocate(mean_)
            end if
          #:endfor
7 Likes

Leading up to the 2008 standard, there was a draft standard with a full blown lexical macro specification a la Lisp (or more recently Julia). It was dropped in later revisions. I don’t have the link handy now, but I will track it down this evening if someone doesn’t beat me to it.

Something like that is probably too involved for a simple preprocessor. But my memory is that the specification included several examples relevant to generic programming, both type and rank. It would be useful to study and compare with the fypp's approach.

Edit:

Pages 45-53

3 Likes

But please, let’s do not burry it within fpm. It should be a library, used by fpm with an optional command-line interface, so that also non-fpm based projects can use it…

2 Likes

Oh, that’s very interesting and a much more integrated approach than fypp (although kinda too verbose for my taste). However, we have to decide, whether we want to have “smart” (context sensitive) or “dumb” (context insensitive) macros in general. Fypp does the latter, this is where the clumsy syntax comes from. It parses only its own directives, but never the Fortran code itself. Therefore, it needs a robust syntax to make sure, it only manipulates what it is supposed to.

The example in the draft

DEFINE MACRO my_generic_interface(typename,array_of_kinds)
  MACRO INTEGER :: i, kind
  INTERFACE my_generic_procedure
  MACRO DO i=1, SIZE(array_of_kinds)
    ! Necessary in order to evaluate kind to an integer:
    MACRO kind = array_of_kinds(i)
    MACRO IF (kind>0) THEN
      SUBROUTINE MySpecificProcedure_%%kind(X)
        typename(kind), INTENT(IN) :: X  <-- That is tricky IMO
      END SUBROUTINE
    MACRO END IF
  MACRO END DO
  END INTERFACE
END MACRO my_generic_interface

needs more intelligence to decide, whether typename(kind) must be substituted with the macro arguments or not (as it is not in a line starting with the MACRO keyword). It could also occur in following lines:

! dummy comment typename(kind) should be not replaced (or should it?)

write(*,*) "typename(kind) should not be substituted here either"

write(*,*) "What about this?&
typename(kind)&
&"  

If one starts to parse the Fortran code as well, the pre-processor should be rather part (or use parts) of a compiler in order for the parsing capabilities.

Of course. This would be a standalone fpm package. The fpm itself would just depend on it and use it.

I think the “dumb” approach is the way to go. However, since this is a pre-processor for Fortran, the pre-processor should know and have built-in support for knowing the kinds, types, features for any rank array declarations etc. But it would not parse any Fortran statements, from that perspective it would be “dumb”. Pretty much fypp is almost there, it’s just the Python evaluation that is an issue, but if fypp has more built-in support for most of the frequent tasks that currently are done in Python, then the Fortran version can just be a subset of fypp.

Regarding the macro proposal, I don’t like it at all, it is verbose and much less readable than fypp to me. I talked with several Fortran users who also don’t like it.

Did anybody ever use COCO? I’ve not studied the syntax but I think there was an implementation somewhere written in Fortran right? Maybe that could be useful?

2 Likes

I don’t think CoCo has a looping construct. It does let you define a parameterized text block and then expand it with actual arguments (search that page for “fermion” to find an example). You could use that to manually loop over types, say, but it will get out of hand for procedures that are fully type-rank-kind generic.

CoCo is GPL, so there’s nothing stopping us from using its source as a starting point, maybe extending it with a loop directive.

2 Likes

Unfortunately the license does stop us to link it with fpm and distribute fpm with CoCo built-in. However, Dan Nagle (the author) might be willing to relicense it.

@certik’s idea of a preprocessor in Fortran itself is extremely relevant and important given other discussions c.f. Proposal: moving fftpack under fortran-lang - #29 by aradi

Fortran has a few crucial gaps when it comes to Generics but yet is high-level enough that the need for preprocessing to achieve generic algorithms (and containers) can be limited to particular aspects only. And it is conceivable to do with a Fortran processor-based facility that is both aware of the Fortran language semantics and syntax and can employ it effectively.

Please note, that in cases like Proposal: moving fftpack under fortran-lang - #29 by aradi, having the preprocessor in Fortran would only give any advantage over the current situation, if the preprocessor was always compiled with exactly the same compiler as the project. Otherwise, it won’t have access to the compiler-dependent Fortran internals, like the real_kinds array. Probably every compilation of a preprocessor dependent project would then have to start with the compilation of the preprocessor first. Doable, but extra effort when building projects.

1 Like