What is a pure function?

The “heavy restriction” becomes that it is possible only to have intent(in) polymorphic arguments to pure/simple procedures. That seems like quite a heavy hammer just because “well you don’t know if the dynamic type might have an impure final subroutine”. The fix is relatively straightforward. Add a type attribute (i.e. simplefinal) that dictates the type and all types that extend it must have simple final subroutines. Those types may then be used as polymorphic arguments or function results from pure/simple procedures.

1 Like

@everythingfunctional , I’m trying to understand why is it such a “heavy hammer”? What exactly are the use cases, besides consumer convenience that is, where the PURE/SIMPLE procedures can do with polymorphic received arguments that are not INTENT(IN)?

Separately, “The fix is relatively straightforward” is arguably a “heavy” assertion that can be disputed.

rojff. Nearly that entire library could be pure if not for the polymorphism. So if such a feature existed, it would be quite beneficial.

erloff could also be entirely pure with such a procedure.

Any heterogeneous container can’t be used in pure contexts, but could if such a feature were added.

I could go on, but I think maybe you get the point.

2 Likes

This would be SO useful. The current Fortran standard actually exposes a way to execute impure code inside a pure procedures because the rules around polymorphic arguments to pure procedures aren’t very well though out. I posted about it on the J3-proposals github but it hasn’t received much interest yet: Polymorphic intent(out) arguments in pure subroutines · Issue #287 · j3-fortran/fortran_proposals · GitHub

1 Like

@everythingfunctional ,

Sorry but I don’t get your point. I absolutely understand and relate to the fact you have some libraries with procedures that you would like to be all labeled as PURE. And I hinted at earlier, I can also see that you, as both a library author and someone who is also a consumer, find it convenient to work with polymorphic actual arguments - some of your tests with above libraries even show that. But these are not compelling use cases in any way in the Fortran world, particularly from everything-is-a-cost, have limited-budget / limited-resources, therefore need-to-focus-on-performance-features-only, that’s-only-what-Fortran-is-all-about mindset.

The kind of libraries you list above are of interest in the so-called preprocessing / postprocessing phase, before and after the number-crunching. This is the space that the compiler vendors, either explicitly with throw-it-in-the-user’s-face attitude or subtly, will argue (i.e., depending on which user(s) is asking and how they they ask - a la “no soup for you” ), is better done using means other than Fortran. The need for PURE procedures in such libraries is nice but not must-have. The need for polymorphic received arguments with INTENT(OUT) / INTENT(INOUT) attributes (thus ALLOCATABLE) in PURE procedures in such situations is even less compelling,

Well that’s a bit condescending. I figured you of all people would be sympathetic to making Fortran easier to use.

Also, I take exception to your assertion that Fortran users would not find error handling or heterogeneous lists to be useful in computational projects. I absolutely have projects where they are useful.

3 Likes

Heterogeneous lists in Fortran would have made my PhD so much easier. An extremely useful feature to have if you want to experiment with different finite element discretisations in unstructured geometries.

1 Like

Let’s discuss heterogeneous lists in Heterogeneous list.

@everythingfunctional ,

Please note I am not in any way trying to convey any of the use cases mentioned here - heterogeneous lists, error handling, etc. - are not useful. To the contrary, I have not long asserted such needs are crucial to the larger application of Fortran in scientific and technical computing.

Instead of usefulness, I am trying to understand and in the process, posing a question as a challenge to get to the bottom of the need for polymorphic arguments with INTENT(OUT) attribute in a PURE subroutine.

Look, I completely the understand the inclination to author library procedures that are all SIMPLE and ELEMENTAL (with, say, array computing taken as the hallmark of Fortran); if that is not viable, then ELEMENTAL which implies also PURE; and short of that, PURE subprograms.

But the Fortran ecosystem is not playing all; there are too many resource constraints and the chain is only as strong as its weakest link (the compiler vendor that seeks to do the least with standard extensions).

The result is a certain adhocism with what gets in to a standard revision and what doesn’t - it seems to depend on who is asking for what and how so. This is not good. A vision that addresses “For whom Fortran, for what” and which might then imply a “wider tent” of practitioners whose needs are considered and then working toward a coordinated set of actions that enhance the language standard toward a broader application in all aspects of scientific and technical computing appears lacking.

A tacitly held position which could be noticed, especially among the vendors, thru’ Fortran 2023 and leading up to Fortran 202Y is a considerable lack of enthusiasm for features / improvements that are not in the performance-sensitive aspects of computing and also, if there are alternatives to any requests. One can see it in the responses by subgroup chairs to proposals for Fortran 202Y. I am disappointed with that position.

Considerable heterogeneous lists: the Fortran 2003 standard mostly started to support them and Fortran 2008 and to an extent, 2018 revisions tied up some loose ends. The gap as noted here with polymorphic arguments with INTENT(OUT) in a PURE subroutine. So yes, that can be fixed if that makes it to the top of things-to-do. I am trying to get to whether this should have identified and enumerated reasons that would assign it a priority and thus move up the list, as opposed to gut-feel / adhocism.

I’m trying to understand how exactly this works. Is this discussion only for subroutines and not also functions? Is the idea that the compiler does not know the return type+kind of the argument at compile time, but is only known at run time?

The discussion does apply to function results as well (pure functions are only allowed intent(in) arguments).

In a sense, yes. With function results, the compiler does not know if the dynamic type of the return value will have an impure finalizer or not, and since the return value will necessarily be finalized (or at least deallocated if it turns out not actually to be finalizable), the effect of calling the pure function could lead to impure code being executed, even if the function itself doesn’t actually do anything impure.

With intent(out) subroutine arguments, the situation is in some ways actually the reverse. The compiler does not know whether the dynamic-type of the actual argument will have an impure finalizer, and it will necessarily be finalized prior to execution of the subroutine. Thus, calling the pure subroutine could lead to impure code being executed, even if the subroutine itself doesn’t actually do anything impure.

1 Like

@FortranFan @RonShepard a use case would be to return an error from a subroutine:

    subroutine sqrt_inplace(x, error)
        real, intent(inout) :: x
        class(error_t), allocatable, intent(out) :: error

        if (x <= 0.0) then
            error = negative_value_error_t()
            return
        end if
        x = sqrt(x)
    end subroutine

You could then identify the exact error using select type afterwards. Complete example here. The example would of course be much more interesting if multiple different types or errors could occur.

For now you have to declare error as intent(inout) in a pure subroutine for no good reason.

1 Like

Unless I’ve missed something, I believe intent(inout) is not allowed either, for basically the same reasons. The line

could case an impure final subroutine to be executed if the dynamic-type of error happens to have such a procedure on entry to the subroutine.

AFAIK the standard only mentions intent(out):

But you’re absolutely right that it enables execution of impure code inside a pure procedure. That’s why I opened this issue.

For a more concrete use case, where performance should be considered important, lets consider something like a data-frames implementation. A csv file could be considered as a matrix (2-d array) of values, where each may be of a different type (i.e. real, integer, string, etc.). It should be possible to split the file into lines, and then the lines into entries, and process every entry independently (and thus potentially in parallel) into the appropriate dynamic types to put into a data structure for further processing. Something like the following should be possible,

type(string), allocatable :: tokens(:, :)
type(entry), allocatable :: entries(:, :)
integer :: i, j

tokens = tokenize(file_contents)
allocate(entries, mold=tokens)
do concurrent (i = 1:size(tokens,dim=1), j = 1:size(tokens,dim=2))
  entries(i,j)%val = to_entry(tokens(i,j))
end do

but it’s not because to_entry needs to return something polymorphic, and thus can’t be pure. It’s entirely possible that this is a very large file, and that this step alone represents a significant fraction of the processing time for my application. So this absolutely could be an enabling factor for improved performance in many use cases.

2 Likes

Ok this clause makes me a little uncomfortable answering, since I don’t know if I can claim this is not just a “convenience” but here you have a barebones example that has bugged me some time ago.

Let’s say you have a derived type and you want to define an algebra for it: you’ll need overload for +, for -, maybe some products etc.

Since Fortran does not currently support sealed types (so yes, probably in some cases the best fix would be that, but current is not there, so whatever), if you want to define such overloads as type bound operators (which is paramount from a software design perspective: you want the user to import the algebra with a use, only: derived_type_name statement alone, without the fear that he/she could forget about the operators) you are forced to declare the input arguments as class(derived_type_name) and not type(derived_type_name). Then you remain with the problem of how to declare the return value of the function, right?

As Brad says

functions are affected as much as subroutines from the impossibility to return polymorphic objects.

What I would very much like is a way to enforce that in function add(a, b) result(c)

  • a and b are of the same type (still polymorphic, but coupled)
  • c would be emitted with the same type

Any time I would need to (re)define what should happen to additional fields in a inheriting type I would just override these operators. Hence marking the procedure as non_overridable is not an option, right?

So what should I do? Return everytime the parent type? That is surely not what I want, in many cases. Define all these operators outside the derived type? Sure, but this poses the API issues delined above… and again, it would mean to give up polymorphism.

What remains… giving up the operator API and resorting to a type bound subroutine? Ok but then, again, I have to give up its purity! Now we can do impure elemental subroutines (I totally would want algebra operations to be elemental!) but I don’t expect them to be really optimized if not declared as pure, what’s the point then, if they are just compiled as loops over the arrays?

I had commented along these lines on a somewhat related issue in the j3 proposals repository:

The main question was if something like this would work in F2023

pure function polymorphic_sum(A,B) result(C)
   class(some_class),intent(in) :: A
   type(typeof(A)),intent(in)   :: B
   type(typeof(A)) :: C
   C = [some implementation of A + B]
end function       

One answer is that the user shouldn’t be responsible for asserting purity. That should be the compiler’s job. Users are really bad at this type of thing. They will put a print in and not remove the pure. They will forget it in places where it should be annotated, etc.

If you do that it won’t compile (been there many times, as you say it would happen all the time, for normal humans at least).

Here I’m not sure I get what you mean. I don’t think that annotating a procedure as pure (or simple) is necessary for the compiler to take action: if it can infer the information by its own I believe a good compiler would use that anyway, to optimize or whatever. E.g. I’m pretty sure that if you request parallelism with compiler flags you would get parallelized not only do concurrent loops, but also regular do loops that the compiler can tell are eligible.

The point of annotating is more about asking help to the compiler (“please remind me if I do something stupid, I intend this to be pure, so if you find otherwise just yell at me”). Or giving a hint to go deeper, if not inferable by standard analysis (“you might have missed it, but I’m promising that this is pure and, again, if I’m lying hit me hard with that error”).

2 Likes

@oscardssmith I think the idea is that as a user you want to ensure you didn’t put a print statement there by a mistake, or reading a global variable by a mistake, so you put “simple” in there and the compiler will catch any such mistake for you.

3 Likes