Attribute for "pure" procedures that do I/O

All you need to do to achieve the above goal is to define PURE as an empty preprocessor macro, or even better yet, redefine it as impure,

    write(*,*) "function returns:", square(2.)
contains
    PURE function square(x) result(xSq)
        real, intent(in) :: x
        real :: xSq
        xSq = x**2
#ifdef PURE
        write(*,*) "impure function print of xSq: ", xSq
#endif 
    end
end

Compile with

ifort -fpp -DPURE= 

to get impure definition or with

ifort -fpp -DPURE=pure

to get a pure function definition. We have been using this approach for quite some time in our codebase, and it works great for purity-performance checks or debugging.
The above example is just an illustration, one can write a more elegant solution that also works with elemental.

2 Likes

I think this would be useful for logging and especially for debugging. Sure, the “right” thing to do would be to attach a debugger, but there are also cases where I actually would prefer print based debugging. For example if the problematic procedure is invoked many times before the issue occurs. This is often the case when developing simulation software which typically repeat the same code many times with slightly altered input.

As others have pointed out this is possible to do with a little bit of cheating so I whipped together a quick library for it: GitHub - plevold/fortran-debug-utils: Some debug utilities for Fortran
Example:

pure subroutine some_pure_sub()
    ! Do regular things here...
    ! Then some debug output:
    block
        use debug_utils, only: dbg
        call dbg('Hello from some_pure_sub' // new_line('c')  &
                // 'This can be useful for debugging pure procedures')
    end block
end subroutine

I have some other ideas to make it even more useful, but that will be for another day…

I would be extremely unhappy with the compiler/runtime if it actually did that out of my control regardless whether the procedure was pure or not. Luckily I don’t think that is or will ever be the case.

3 Likes

Thanks for doing this @plevold. IMO your “debug-utils” would be a very valuable contribution to stdlib (maybe in stdlib_error to complement the already useful subroutine check),

2 Likes

Thanks, @epagone. That is certainly an interesting idea. I think I will keep the separate repo for a bit as I have some ideas I’d like to test which might not be very “stable” on the first go… You are more than welcome to copy the code into stdlib in the meantime if you wish though.

Also, I do wonder if this actually fits well into a separate library. Paranoid users can then comment out that particular dependency in their fpm.toml file once done with debugging. If they’ve accidentally forgot to remove some debug code this will be caught at compile time.

Many thanks for making your code available for integration into stdlib. Unfortunately, I vaguely recall a few instances where there has been already significant discussion in places on “debug” and “release”-mode features of stdlib, so your (legitimate) doubt applies to other parts of the library and it has been identified. This issue and the integration with related features already implemented (e.g. the procedure check) make it difficult to simply copy and “drop-in” the code into stdlib.

I understand that you might prefer to experiment in a separate repository (actually I think it’s a good idea) but I was suggesting that, if you want, you might adapt your routines to work with stdlib as they mature more.

You should be aware that compiler makes a plethora of nasty things with your code, especially when high optimization levels are requested – doing such things is actually the principal way of code optimization.

“Pure” keyword by itself does not amend what the procedure is actually doing – just omit “pure” and you will get the same behavior. As Fortran has never been a language for computer science theoreticians (no need to introduce functional purity for the sake of the functional purity itself), the primary intent of this language feature is to help the optimizing compiler to optimize better by explicitly allowing a number of nasty tricks to be applied – enjoying the explicit assertion that the procedure depends only its the arguments and changes nothing in the world except the “returning values” (be it the value returned by function or dedicated parameter(s) of the subroutine), which, in turn, allows to invoke it in arbitrary order instead of following the order suggested by the author of the program.

1 Like

Sure, optimization will do a lot of “nasty” things to the code, but at least Intel will as far as I can see not introduce multithreading unless explicitly asked to do so with the -parallel flag. This is good because for many scenarios, parallelization by running multiple processes might be more relevant.

You can’t rely on this. Actually, you can’t rely on:

  1. the assumption that you or somebody else will always compile your code with the specific compiler;
  2. the assumption that you or somebody else will never use your subprogram in various parallelization schemes, including the most weird ones.

Again, pure subprograms are introduced for the possibility to use them safely in parallelizations and control flow rearrangements, avoiding (by design) various issues like race conditions, “heisenbugs” and other creepy stuff you never want to deal with.

It is interesting question if some kind of “IO monad” could be implemented in Fortran – if anybody ever decides to go really functional. Otherwise, returning sign of error or error message via argument of pure subroutine, or as a field of user-defined class, or via NaN value, followed by logging outside the scope of the pure subprogram, could be the options. Indeed, it is inconvenient, but nobody forces to use pure functions if they are inconvenient in the specific case.

1 Like

Unfortunately, that does not always work. As soon as you have a do concurrent loop, which allows calls to pure procedures only, removing the pure attribute from the called procedure (just to be able to print/write debug information) would result in code which might be (should be) rejected by the compiler…

1 Like

Fair point, never thought of it, although I have practically not encountered such conflicts among the ~1700 instances that we have so far had in the library.

Interestingly, pure functions in the D language seems to allow impure operations if they are written in the debug statement and the -debug option is used for compilation.

A pure function can perform impure operations in statements that are in a ConditionalStatement controlled by a DebugCondition .

DebugStatements have relaxed semantic checks in that pure, @nogc, nothrow and @safe checks are not done.

So something like…

import std;
pure int foo( int n )
{
    debug writeln( "n = ", n );  // valid if the -debug option is attached
    //writeln( "n = ", n );  // always error (pure function cannot call impure function)
    return n * 2;
}
void main()
{
    writeln( "foo = ", foo( 100 ) );
}

(Test with online compiler)

The Nim language also seems to have an exceptional treatment of “debug write” for pure procedures (= func), according to this manual page.

As a special semantic rule, the built-in debugEcho pretends to be free of side effects so that it can be used for debugging routines marked as noSideEffect.

An example similar to the above code may be…

func foo( n: int ): int =    # pure procedure
    debugEcho "n = ", n      # valid
    # echo "n = ", n         # always error
    return 2 * n

echo "foo = ", foo( 100 ) 

(Test with online compiler)

2 Likes

As much as I would like this, as I often would like to do print debugging on my pure procedures, I think it would be better as a compiler flag, such as --allow_pure_print or something.

I very much like @rwmsu’s proposal that a file may be opened with the PURE attribute and may then be written to, but not read by a program. Writes to such a file would then not violate the principles of PURE procedures.

There is a problem with ELEMENTAL procedures. Should the WRITE take place for every instance or only once in an elemental call? And would we need new syntax to choose? Elemental procedures can invoke other elemental procedures. Would a WRITE(pure_unit,format,ONCE…) construct write only once for a call of the outer elemental routine when calls are nested? This would not be easy to implement.

1 Like

The principles of pure procedures include no side effects, which is why the SAVE attribute is prohibited. This definition is not specific to Fortran, per Pure function - Wikipedia. Writing to a file has a clear side-effects, both to the filesystem itself and to the system buffers used to implement file operations.

In any case, if folks want a non-conforming but practical workaround, use C. Fortran compilers cannot introspect C code and thus have no way to observe the violation, so this implementation is warning-free. It’s also possible to have the C function print_error implemented as a no-op by default and then LD_PRELOAD a shared-library containing the version with the debug output in it at runtime when necessary. In this case, the subroutine is PURE except when debugging, which seems minimally evil.

// print_error.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ISO_Fortran_binding.h>

void print_error(int code, CFI_cdesc_t * message)
{
    char * buffer = calloc(message -> elem_len + 1, 1);
    memcpy(buffer, message -> base_addr, message -> elem_len);
    printf("code=%d, message=%s\n", code, buffer);
}
! pure-error.F90
module m

    interface
        pure subroutine print_error(code, message) bind(C,name="print_error")
            use iso_c_binding, only : c_int, c_char
            implicit none
            integer(c_int), intent(in), value :: code
            character(kind=c_char), dimension(..), intent(in) :: message
        end subroutine print_error
    end interface

    contains

    pure subroutine f(i)
        implicit none
        integer, intent(in) :: i
        if (i.eq.100) then
            call print_error(i,"bad things have happened")
        end if
    end subroutine f
end module m

program main
    use m
    implicit none
    integer :: i, n
    n = 1000
    do concurrent (i=1:n)
        call f(i)
    end do
end program main
% gcc-13 -I/opt/homebrew/Cellar/gcc/13.2.0/lib/gcc/current/gcc/aarch64-apple-darwin22/13/include  -c print_error.c && gfortran-13 pure-error.F90 print_error.o && ./a.out
code=100, message=bad things have happened

My handling of strings of strings in this code may not be correct but that is not the important part of the example.

3 Likes

I always wondered why this is allowed. Is the idea that we are “promising” that the bind(C) function is pure, and if we are lying, it’s our own problem?

The code above might work sometimes by accident, but it seems the code is fundamentally broken: the compiler assumes that print_error is pure, i.e., it has no side effects. Given that it is a subroutine and all arguments are intent(in), by definition this function is a no-op, it can’t possibly have any effect on anything. So I think the compiler is free to optimize it out.

If I am wrong, please let me know.

Short answer, yes.

Not entirely true. If it were a Fortran procedure it could

pure subroutine print_error(code, message)
  integer, intent(in) :: code
  character(len=*), intent(in) :: message
  if (code /= 0) error stop message
end subroutine

So just because a subroutine is pure (or simple) and only has intent(in) arguments does not mean it can be optimized out.

2 Likes

I see — it can have a side effect of stopping the program using error stop. Besides error stop and stop, can it have any other side effects?

No, and in fact stop is not allowed in pure procedures, only error stop is allowed.

2 Likes