ErrorFx: Fortran exception library

aradi · October 14, 2021, 4:19pm

Dear all,

I’ve uploaded to Github a small library

which provides (and demonstrates) exception like error handling in Fortran. We have discussed about the concepts already earlier in several forums, I’ve just now summarized my ideas in form of a publicly available minimal library.

Apart of providing the core functionality for exception like error handling, it contains several examples (throwing, propagating and catching errors, stopping due to an uncaught error, building error class hierarchies, etc.) as well as a description of the general concepts.

I hope, you’ll find it useful, comments are welcome.

Best regards,

Bálint

everythingfunctional · October 14, 2021, 6:32pm

This seems like the next logical step in the evolution of the integer status code that seems all too common in Fortran. error stopping for uncaught exceptions is a nice touch, and propagating the errors upwards seems pretty natural. Two minor critiques.

First, catching the exception feels complicated and error prone. I wonder if improvements could be made to that aspect of the API. Maybe something like

subroutine do_stuff()
  type(fatal_error_t), allocatable :: error

  ...
  call dangerous_stuff(..., error)
  if (allocated(error)) call error%catch(handler)
  ...
contains
  subroutine handler
    ! do stuff to handle the error condition
    ! Note we have access to variables in the host scope
    ! so we can set them as necessary
  end subroutine
end subroutine

So you’re fypp macro could now be a single line too: @catch_error(error, handler).

Second, a way of combining errors to pass back would be potentially useful too. I.e. something like

subroutine lots_of_danger(..., errors)
  type(fatal_error_t), allocatable, intent(out) :: errors

  type(fatal_error_t), allocatable :: first_error, second_error

  call first_thing(..., first_error)
  call second_thing(..., second_error)
  if (any([allocated(first_error), allocated(second_error)])) then
    errors = first_error + second_error
    return
  end if
end subroutine

All in all, I think it shows promise.

certik · October 14, 2021, 8:42pm

@aradi I only read the README so far and what I see is awesome. It’s exactly in the direction I think error handling should go. Great idea to also prototype a nice syntax via fypp. Later on I can help prototype in LFortran to have a natural Fortran syntax, and we can write a paper for 2Y based on this.

Now, couple major things.

In Rust natively and you can emulate it in C++, I started using the paradigm of Result<T> where T is the actual result of a function, say an integer, or std::string or whatever; if an error occurs, then Result contains the error and you have to query it if an error occurred. If it did, then you can extract the error structure, which corresponds to your fatal_error class. Here is an example: src/lfortran/codegen/fortran_evaluator.cpp · a211fc2e07fb30b6483a16f8e9304bfaacc7de90 · lfortran / lfortran · GitLab. In there you can see that I query the ok member, and if true, use the result member with the result type T, otherwise I return the error member up. The advantage of this approach is that you just return the result from a function, which is the most common case. It’s just that sometimes an error can occur, and so you either return the result or an error. How would we do this in Fortran?
Based on the previous point, Rust has a question mark ? operator which inspects the Result and if it contains an error, propagates it up (returns it), otherwise it returns the type T. This is the most common case — you want to propagate errors up, and otherwise you just want to use the result and not worry about anything else. With this syntax sugar, you can use functions returning Result<T> as if they were returning T directly, you just append ? at the end when you call them, and never worry about anything else. And errors will never be silenced, but rather propagated up.
Zig has a try keyword for precisely the same operation: Documentation - The Zig Programming Language
Finally, in C++ you can create a macro TRY that does the same thing, here is an example from another project: serenity/mount.cpp at fb7b4caa574e8d4ad72657525590c5076d97a65a · SerenityOS/serenity · GitHub, here is the definition of TRY: serenity/Try.h at 2d4650714f7b09e305b5cbb84c46191503be27dd · SerenityOS/serenity · GitHub. I am planning to create the same macro in LFortran also. This is a good idiom.
Your @:propagate_error(error) almost does the same thing, but because the error is an argument as opposed to a return type, it’s not quite the same.

Building on the last point, currently you have:

subroutine routine_propagating_error(..., error)
  :
  type(fatal_error), allocatable, intent(out) :: error
  :
  call routine_with_possible_error(..., error)
  ! If error happend, we propagate it upwards, otherwise we continue
  @:propagate_error(error)
  print "(a)", "Apparently no error occured"
  :
end subroutine routine_propagating_error

Instead, can we do something like:

function routine_propagating_error(...) result(error)
  :
  type(fatal_error), allocatable :: error
  :
  @:TRY(routine_with_possible_error(...))
  print "(a)", "Apparently no error occured"
  :
end subroutine routine_propagating_error

That would be aligned with Zig, Rust and TRY in C++.

If we can figure out how to compine an error with the result type (see the first point above), then we can do this:

integer :: i
i = @:TRY(routine_with_possible_error(...))

And it will simply return the integer result, or it will propagate an error up.

Examples where this could be used is the open function in stdlib: stdlib/stdlib_io.fypp at 089f325c63177672ca08d7676e74bcd2fae22ce3 · fortran-lang/stdlib · GitHub. If the file to be opened does not exist, it would return an error, otherwise it would return the unit. You would use it as:

integer :: u
u = @TRY(open("something.txt"))

Finally, what would be a good Fortran syntax for this? Well, the two obvious choices are either the try keyword::

integer :: u
u = try open("something.txt")

or the question mark operator:

integer :: u
u = open("something.txt")?

It’s unfortunate that we have literally just decided to use ? for a ternary operator (and yes, Rust does not have a ternary operator as far as I know). But maybe the syntax would not clash. The try seems more fortranic anyway.

everythingfunctional · October 15, 2021, 3:11am

If you like Rust’s Result<T>, you’ll like the fallible pattern I’ve come up with and my use case for generics. It is inspired by Rust’s Result<T>, and Haskell’s Either. generics/forwarding_errors.f90 at main · j3-fortran/generics · GitHub

But I also like the neat shorthand that Rust has for getting at intermediate results when you don’t want to have to put the next step(s) in their own function. Getting this built into the language might be an interesting proposition, but likely wouldn’t be doable until whatever’s after 202Y (203Z?).

certik · October 15, 2021, 5:08am

Indeed, also in the direction of what I am thinking.

Are you talking about ? or something else?

everythingfunctional · October 15, 2021, 1:35pm

Yes, although IIRC ? has some context specific semantics. I believe it will forward the error in a function that also returns a Result<T>, but will panic in a function that does not. I may be misremembering though. I.e.

Result<int> safely(x, y) {
  quotient = safe_divide?(x, y); // forwards the error if it occurs
  ...
}

int unsafely(x, y) {
  quotient = safe_divide?(x, y); // panics (stops the program) if an error occurs
  ...
}

P.S. My Rust is a bit rusty so forgive any syntax errors

aradi · October 15, 2021, 2:21pm

Thanks for all the comments, the interest is really encouraging.

@everythingfunctional The idea of simplifying the error catching by passing an error handler routine to catch() is magnificent. Although, it is not applicable under all circumstances (e.g. if you wanted to return from the subroutine during the error handling), it can shorten the error handling considerably. I’ve extended my implementation accordingly. Now one can do

subroutine main()

  type(fatal_error), allocatable :: error

  call routine_with_possible_error(..., error)
  call catch(error, error_handler)
  :

contains

  subroutine error_handler(error)
    type(fatal_error), intent(in) :: error
    ! Do whatever is needed to resolve the error
    print "(a,a,a,i0,a)", "Fatal error found: '", error%message, "' (code: ", error%code, ")"
  end subroutine error_handler

end subroutine main

The error handler routine only contains the error handling code, no deactivation/deallocation boilerplate. And in case of an exception class hierarchy, it even makes the clumsy select type construct superfluous:

subroutine main()

  class(fatal_error), allocatable :: error

  call routine_throwing_error(..., error)
  call catch_io_error_class(error, handle_io_error)
  call catch_linalg_error_class(error, handle_linalg_error)

contains

  ! Handler for io error
  subroutine handle_io_error(error)
    class(io_error), intent(in) :: error
    print "(2a)", "IO Error found: ", error%message
  end subroutine handle_io_error

  ! Handler for linalg error
  subroutine handle_linalg_error(error)
    class(linalg_error), intent(in) :: error
    print "(2a)", "Linear algebra error found: ", error%message
  end subroutine handle_linalg_error

As for chaining the errors, I think, that would add significant complexity. Also, I don’t see at the moment, how you would catch/deactivate the chained errors. Would a catch deactivate all of them? If not, how do you select, which ones get deactivated. And what happens, if different exception classes are added?

I think, it is easier and more robust to extend the base type and provide in the extended type a container for all the data, you want to extract (e.g. a parser_error containing all the locations, where the parser had problems to understand the content of a file).

aradi · October 15, 2021, 2:42pm

@certik The result type sounds like a great solution for returning a fatal_error from within a function. One would have to define for each possible return type a corresponding result type which can also accommodate the error as well. For example for integer one would have:

type :: result_integer
  integer, allocatable :: value
  type(fatal_error), allocatable :: error
end type

A function would then return result_integer instead of integer. If no error occurred, the value field would be allocated and error unallocated, otherwise the opposite. Forgetting to deactivate error would again lead to a stop, if went out of scope.

I just don’t know yet, how we can “fish out” the value in the most comfortable way, in case no error happened. I’ll play with that a little bit.

certik · October 15, 2021, 2:42pm

I think that’s exactly how it works. And that is what I like about it. That might be tricky to emulate with a macro or a preprocessor, but I am happy to prototype this into LFortran. From a user perspective, when a function returns a Result, you just append ? and don’t worry about it. And errors will not be silenced, and you can handle them later if you wish, or just leave it as is. Things will be robust.

certik · October 15, 2021, 2:52pm

Why not the TRY macro, that will first check if error is allocated, and if so, either panic, or return it. Perhaps TRY can just panic, and a new macro FTRY can return it (F stands for a function, and you can only call FTRY from a function that also returns a Result — one issue is that typically the calling function might return a Result of a different type — to make it work with a macro, we can create generic procedures to fill any result with an error using the same syntax that you can then call from a macro). And if there is no error, it just gives back the value as an expression. I would not even “allocate” the integer:

type :: result_integer
  integer :: value
  type(fatal_error), allocatable :: error
end type

I guess the TRY would not be a macro, but a generic function:

integer function try(fn_result) result(r)
type(result_integer), intent(in) :: fn_result
if (allocated(fn_result.error)) error stop "PANIC"
r = fn_result%value
end function

And you use it like:

integer :: i
i = try(some_function(5))

I think that will work.

But how would you do the FTRY macro or a function? It needs to return from a function on error, but return an expression otherwise. Not sure how to do that at the moment. How does this work in SerenityOS (1, 2)?

aradi · October 15, 2021, 3:13pm

The problem with this approach is, that although try() could panic in case of an error, I don’t see how one could propagate the error upwards instead. I would like to have a solution, where stop error is really only triggered, if an active error went out of scope.

As for serenity, C/C++ is easier, as you can have actual code on the RHS of an expression as it is the case for their TRY macro. That, we can not do in Fortran (un)fortunately.

As for the result_* type. I would suggest to keep the value field in general as allocatable for two reasons:

One might be tempted to handle the error in the type, and then read out the value field, which was not explicitely set in the called function. If it is unallocated, this can not happen.
If the value field contains a big array, one could override assignment the assignment (and use move_alloc) to avoid the copying of that big array as int would be the case with myresult = function_returning_a_result_type()

certik · October 15, 2021, 3:18pm

Can you explain the code, I don’t understand it unfortunately.

#define TRY(expression)                               \
    ({                                                \
        auto _temporary_result = (expression);        \
        if (_temporary_result.is_error())             \
            return _temporary_result.release_error(); \
        _temporary_result.release_value();            \
    })

I assume the _temporary_result.release_value(); is the value, correct? You can just leave it in ({ value }) and it will become an expression? I didn’t know that.

aradi · October 15, 2021, 3:49pm

I just had to look it up more carefully myself. Apparently, it is a “statement expression”, which is a GNU extension to C++ as explained here…

rcs · October 16, 2021, 12:14pm

Hi,

the code below reduces the error catching to a one liner and allows for propagation

module mod_x
  implicit none
contains
  function exception(success,msgio,msgi,stat) result(y)
    logical, intent(in) :: success
    character(:), allocatable, intent(inout) :: msgio
    character(*), intent(in) :: msgi
    integer, intent(in) :: stat
    logical :: y
    y=success
    if(.not.success) then
      if(.not.allocated(msgio)) then
        msgio=msgi
      else
        msgio=msgio//" "//msgi
      end if
    end if
  end function exception
  function funa(a,b,stat,msg) result(x)
    integer, intent(in) :: b
    integer, intent(inout) :: a,stat
    character(:), allocatable :: msg
    logical :: x
    x=.true.
    if(b==0) then
      stat=1;msg="error devision by zero";x=.false.
    else
      a=a/b
    end if
  end function funa
  function funb(a,b,stat,msg) result(x)
    integer, intent(in) :: b
    integer, intent(inout) :: a,stat
    character(:), allocatable :: msg
    logical :: x
    if(exception(funa(a,b,stat,msg),msg,"error in funa",stat)) return
  end function funb
end module mod_x
program test
  use mod_x
  implicit none
  integer :: x,y,stat
  character(:), allocatable :: msg
  logical :: z
  x=4;y=0
  if(.not.funb(x,y,stat,msg)) then
    write(*,*) msg;deallocate(msg)
  else
    write(*,*) "ok"
  end if
  x=4;y=2
  if(.not.funb(x,y,stat,msg)) then
    write(*,*) msg
  else
    write(*,*) x,"ok"
  end if
end program test

It comes at the cost of defining everything as a function and having to carry status variables, either naked (in this case an integer and a character) or encapsulated through the interface. I used a similar setup which required two lines in very large libraries. However, it appears that the general requirement of carrying status variables through interfaces cannot be avoided if the error must be propagated … so far a away from the comfortable try{}catch(){} facility of the competitor

Further be aware that any error handling function will slow down the program if it cannot be inlined. That is, checking +100Mio array elements for validity is substantially slower when using an error handler compared to a simple if branch if the error handler cannot be inlined. Inlining doesn’t always work even with ifort and the ipo flag set.

certik · October 16, 2021, 12:35pm

Ok, now that makes sense. Well, we can make an LFortran extension to Fortran in a similar way, to allow to write such a TRY macro. I don’t yet have a good strategy how to maintain such language extensions in the master branch, but doing it on a separate branch is free and I am happy to do it anytime. Just let me know.

certik · October 16, 2021, 1:26pm

Also the Elm language uses the same Result<T> approach to error handling. It has a good background and explanation here: Error Handling · An Introduction to Elm

aradi · October 16, 2021, 1:44pm

This is indeed a very compact approach, thanks for sharing. I have a few observations:

It uses functions with intent(inout) arguments, which I usually try to avoid for many different reasons (can not be made pure, could have undesirable side effects, etc.)
This pattern does not enforce, that an error condition is either handled or propagated. You could just write haserror = func(...) somewhere deep in the code, and then completely ignore the content of the haserror variable…

The type/class based exception approach on the other hand can be used in pure routines without side effects and also stops the coder (literally) from ignoring a possible error.

I think, in general we have to distinguish between two different scenarios of unsuccessfulness

A given routine wants to signalize unsuccessful execution to the caller (e.g. a search routine returns -1 for a position to signalize that the item was not found in an ordered container). To me, this is not really an error (at least not a unrecoverable/fatal one), but part of the normal operation of that routine.
A given routine wants to signalize an unrecoverable error, which must be propagated upwards over several levels up to the entry point of the library, because the library can not handle this error (e.g. somewhere deep in the library a matrix, which depends on the arguments passed to the library, became singular and can not be inverted).

I would use the exception-like mechanism only for the 2nd scenario, as an alternative for just simply calling stop error within the library.

rcs · October 16, 2021, 1:59pm

Hi.

The approach certainly leaves it up to programmer to propagate the error or not … might be a philosophical question whether propagation should be enforced or not. For some applications division by zero or wrong array dimension are non-recoverable, for other they are.

From my understanding pure works only if the function returns a type as @aradi and @certik have suggested. While this looks compfy I think it terms of handling and HPC it is a nightmare. One would need types for all sort of types a function can return, including arrays. And assigning function results directly into array position or sections appears not to be possible.

WTR to error stop … I have to admit I absolutely hate it. This is because it tells you where the code failed and why, but not how the program got there. Especially in large libraries/projects the latter is, imho, very important. A consistently evaluated return variable provides a remote call stack, even for the non-debugging versions, which allows to advise on workarounds until the code is repaired.

certik · October 16, 2021, 2:23pm

Some compilers can print a stacktrace. Is not that a good enough solution? I agree that without a stacktrace it is useless.

Regarding that, I agree that we might not be able to figure out a comfortable solution for Fortran and its HPC usage. It seems that a comfortable solution would require a language extension, but that is precisely what we can do, as this is something that can be considered for 2Y. So that is why we are brainstorming this to see if there is a way.

rcs · October 16, 2021, 2:34pm

For ifort and libraries built from many files I didn’t even get a stack trace in full debug mode. When shipping fully optimized commercial executable I strip all symbols anyway. Then sometimes I get a ring from a client that the program has given up printing an unreadable error message … the call trace.
That usually happens because I didn’t raise the flood gates against garbage input high enough. However, the call trace allows me to provide advise on what not to do until the code is repaired or catches the garbage input correctly.

Topic		Replies	Views
Fortran error handling including stacktrace generation Announcements	8	1629	November 7, 2022
An experiment with exceptions	11	1268	May 4, 2022
Modernising old libraries - error handling	40	1928	April 4, 2022
Exceptions proposal Language enhancement	52	2894	March 7, 2023
Idioms for exception handling	2	322	April 6, 2024

ErrorFx: Fortran exception library

Related topics