What is a pure function?

What do you consider to be a pure function?

In your mind, the result of a pure function

  • does not depend on global variables
  • can depend on global variables
  • both / depends
  • no opinion

0 voters

I discovered today that opinions differ on that. Wikipedia has a nice summary: Pure function - Wikipedia. Essentially everybody agrees that pure functions should have no side effects (such as writing to global variables), the property 2. on the wiki page. But people disagree if pure functions should have the property that “arguments fully determine the output” (the function does not depend on external state such as reading global variables), the property 1. on the wiki page.

In Fortran, the “pure” attribute is just the property 2., but not 1. The new “simple” attribute will be both 1. and 2. The property 2. is enough to allow an optimizer to optimize the function out, or move it outside a loop. The property 1. seems to be needed for parallel loops, memoization and other applications.

5 Likes

Generally one requirement for purity is that all runs with the same input return the same output. This is very clearly violated if the function refers to mutable global state.

3 Likes

Nevertheless the Wikipedia link above brings to the table very concrete examples of such “violations”.

See specifically this snippet:

Ignoring what happens with the pure keyword in Fortran, as here it would be just a self reference, it still remains a fair amount of notable players using the word differently.

To be clear: I’m surprised too, god, why do naming conventions always seem to be so painful to agree on? Sometimes I feel like it is a curse inherent in human nature :smiling_face_with_tear:

2 Likes

Another property that is closely related but non-obvious is guaranteed termination. While you can say that an infinite loop is pure, a compiler that tries to use that information will have a very bad day. As of Julia 1.8, Julia’s compiler tracks a number of useful properties for functions including consistency (same output for same input), nothrow (no errors), termination, and effect free (as well as a few others). We used to have a @pure macro that was similarish to Fortran’s pure but we discovered that no one could agree on what it meant and that it caused way more problems than it solved. Splitting the effects up into much more strictly defined bits has made it a lot better since each one has very clear semantics.

5 Likes

A definition that may have broader accepted unique meaning, implying both properties 1 and 2, as defined in the original post, may be Referential transparency - Wikipedia

An expression is called referentially transparent if it can be replaced with its corresponding value (and vice-versa) without changing the program’s behavior.

3 Likes

Before 1.0, right? Having the chance of getting naming right (or else pruning away names that generate too much debate to make a definitive choice without long lived grudges…) is probably one of the greatest advantages of a pre-1.0 language :slight_smile:

@oscardssmith yes, it’s actually LFortran why I am asking this question, for the same reasons as Julia I am trying to figure out what to track for each function. Indeed it looks like tracking separately the property 1. (same output for same input) and property 2. (no side effects) is the way to go. How does Julia determine the “termination”? I thought that is very hard (impossible in general?). Fortran doesn’t have throws, so that’s good. The one other possible property is printing: you want to print from pure functions for debugging (and possibly logging), but obviously that makes them non-pure.

@certik Determining termination is really easy if you don’t do a good job of it. It’s not a problem for the compiler to think that something doesn’t terminate if it does, so the current algorithm is that anything that contains a goto or a loop is considered to not terminate. In the future we might be able to teach it that loops over fixed sized collections terminate but so far we’re extremely conservative. It turns out that just tracking things that obviously terminate has some real benefits since it lets you bring computations like trig and exponentials to compile time.

5 Likes

@rgba technically @pure still exists but it’s deprecated and we’ve gone through the entire package ecosystem and removed almost all of the uses of it. Most of them were relics from years ago when the compiler wasn’t as smart as it is now.

2 Likes

I see. Yes, I think probably even loops over small number of iterations can be done. Clever. I was wondering how to determine which user functions to allow running at compile time. This makes sense. Maybe the property could be called “100% sure this quickly terminates”. :slight_smile: Do you know where in the Julia compiler this is done and tracked?

I know Idris and Agda also perform some degree of termination analysis but not much besides that. Those languages have way more restrictive type systems than Fortran and Julia and thus better positioned. Maybe we could benefit from Fortran being used mostly for numerical software to derive “good enough” heuristics?

Edit: from my limited understanding of the subject, if you can infer bounds to integers used in loops that’s a good start.

2 Likes

@everythingfunctional seems like the kind of person who would be interested in this.

3 Likes

if I agree on that, it also makes sense that a mutable global state could be seen as part of the input of the pure function (like an hidden argument).

2 Likes

I voted both / depends since I wasn’t sure whether the term “variable” implies that it may be constant (parameter) or not. Now I’ve learned about referential transparency, thanks! :heart:

1 Like

Welcome semarie!
If you frame it like that, you could also say that every side effect is just a hidden output argument.

1 Like

julia/effects.jl at master · JuliaLang/julia · GitHub has the data structures and docs. the actual code is intermingled throughout abstract interpretation (where Julia does type inference)

1 Like

Welcome @semarie to the forum and thanks for all your help with LFortran!

Indeed, we can, and as @Carltoffel said, we can also put side effects into implicit output arguments, and we arrive at something like Monads: https://en.wikipedia.org/wiki/Monad_(functional_programming), now @everythingfunctional would really like this thread. :slight_smile: It’s my understanding that is how Haskell makes everything pure in a clean way.

@Sideboard great point — indeed global or parent scope constant variables are fine, they are essentially just like local values. I should have made it more clear.

Thanks @oscardssmith, exactly what I was looking for.

1 Like

Thanks for the ping @meow464 , yeah, I’ve got a little bit of input on the topic.

Most languages that consider themselves “purely functional” (i.e. all functions are pure by default) use the property 1 definition. And most use the term “referential transparency”. They usually do have some backdoors/escape hatches for debugging purposes (i.e. I want to stick a print statement here), but they also have designed ways of doing I/O properly/safely. I think the new simple designation gets Fortran pretty close, but until we get a way of returning polymorphic outputs from them, it also add’s some heavy restrictions. I have a proposal out there to allow this, but the committee hasn’t really considered it yet.

2 Likes

Yep. I haven’t quite worked out a convenient/user-friendly way of modelling Monads in Fortran yet, but it’s been rattling around in my head for a bit. I think it might be doable with intent(inout) derived types with type-bound subroutines, but I think the polymorphism in pure/simple procedures is still necessary.

1 Like

@everythingfunctional , can you please elaborate on the “heavy restrictions” aspect in the first quoted comment and related to the “necessary” in the second one?

Sure, with a general philosophy of “delighting the customer” that might extend to seeing consumer convenience as paramount if Fortran were viewed as a product, the above would make sense. Meaning, there are scenarios where it is rather “convenient” for the consumer to work with a polymorphic object and employ different concrete instantiations of said object in code for compact, feature-rich libraries as opposed to, say, different instances somehow of the library that work on specific concrete types. An important design decision the former option can be for library authors. And the view can be that Fortran shall be general-purpose programming language to allow all such options to the library author. But then rubber starts to hit the road hard, particularly given the legacy of both FORTRAN and Fortran 90 thru’ 2023 and the existing semantics and compiler concerns including their budgets and resources all come to the fore.

This leads to the question as to where does one make the compromise in Fortran? Particularly if someone sees some language design aspect as “heavy restrictions” / “necessary” and someone else (compiler implementors) tend to think the opposite …