I recently read this blog post:
@everythingfunctional especially will enjoy it (examples in Haskell!). It's worth the read, and it led me to the following idea regarding error handling. I think there are two ways to handle errors in functions:
- Weaken the result (Rust `Result`, C++ `optional`, Haskell `Maybe`)
- Strengthen the input arguments (create a special type so that the type system rejects the erroneous input at compile time)
The second approach is what the blog post above argues for, and it is also what this proposal for Fortran does: Assert by wclodius2 · Pull Request #177 · j3-fortran/fortran_proposals · GitHub. The first approach is what the various proposals in the Exceptions proposal discussion try to address.
And I think we need both. Let me give examples.
Example where the second approach is better
This example was given in the “exceptions thread”:
pure function mean(vals)
    real, intent(in) :: vals(:)
    real :: mean
    mean = sum(vals) / size(vals)
end function
The issue is how to handle the case where `size(vals) == 0`. Either approach can be used here: the first approach weakens the result, by changing `real :: mean` to `real, optional :: mean` or by throwing an exception. However, in this case I think the second approach is better:
pure function mean(vals)
    real, intent(in) :: vals(:)
    require :: size(vals) > 0
    real :: mean
    mean = sum(vals) / size(vals)
end function
This strengthens the type of the function "mean" to include this requirement. The compiler then refuses to compile your code unless it can ensure at compile time that the argument `vals` has nonzero size. So `a = mean([1., 2., 3.])` will compile and is guaranteed to always work, while `a = mean([])` will not compile. `a = mean(x)` will compile only if `x` is declared with the requirement `require :: size(x) > 0` in the parent function. If it is not, and it cannot be inferred that `x` has nonzero size, the call will not compile. You then propagate this requirement all the way up. For example, if you read data from a file into an array that is declared with this requirement, the compiler can ensure that the requirement is checked (at runtime) at the point where the file is read.
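To make the "propagate it up" idea a bit more concrete, here is a rough sketch of how one can approximate it in today's Fortran, which has no `require` statement. The names `nonempty_mod`, `nonempty_t` and `make_nonempty` are made up for illustration. The size check runs once, at construction, and `mean` itself does no checking. Note this is weaker than the proposal: a variable of type `nonempty_t` that was never passed through a successful `make_nonempty` still holds an unallocated array, and the compiler will not catch that.

```fortran
module nonempty_mod
    implicit none
    private
    public :: nonempty_t, make_nonempty, mean

    ! Hypothetical wrapper type: the component is private, so code outside
    ! this module can only fill it through make_nonempty.
    type :: nonempty_t
        private
        real, allocatable :: vals(:)
    end type

contains

    ! The single place where the size check runs (at runtime).
    subroutine make_nonempty(vals, res, ok)
        real, intent(in) :: vals(:)
        type(nonempty_t), intent(out) :: res
        logical, intent(out) :: ok
        ok = size(vals) > 0
        if (ok) res%vals = vals
    end subroutine

    ! No check here: correct provided x came from a successful make_nonempty.
    pure function mean(x)
        type(nonempty_t), intent(in) :: x
        real :: mean
        mean = sum(x%vals) / size(x%vals)
    end function

end module
```

The caller does `call make_nonempty(data, x, ok)` right after reading the data, handles `ok == .false.` there, and everything downstream just takes a `nonempty_t`.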
This approach even works for bounds checking for things like CSR arrays:
function csr_matvec(Ap, Aj, Ax, x) result(y)
    ! Compute y = A*x for CSR matrix A and dense vectors x, y
    integer, intent(in) :: Ap(:), Aj(:)
    real(dp), intent(in) :: Ax(:), x(:)
    require :: all(1 <= Ap) .and. all(Ap < size(Ax)) .and. all(1 <= Aj) .and. all(Aj <= size(x))
    real(dp) :: y(size(Ap)-1)
    integer :: i
    do i = 1, size(Ap)-1
        y(i) = dot_product(Ax(Ap(i):Ap(i+1)-1), x(Aj(Ap(i):Ap(i+1)-1)))
    end do
end function
The requirements are not 100% complete, but you get the idea: it is possible to specify exactly the conditions under which the function never fails bounds checking. The compiler in ReleaseSafe mode (all optimizations on, all bounds checking still on) then does not have to bounds check inside the function; it only has to check the requirements (at compile time).
If the CSR array is read from a file, it might still be relatively cheap (although not free) to check if the arrays satisfy the requirement in ReleaseSafe and Debug mode, and one can ignore it in Release mode. The beauty of this approach is that it shifts all these bounds checks from inner loops to the main program, and thus ReleaseSafe could run at the same speed as Release, while guaranteeing it will not segfault. The compiler could even ensure that every time you index into an array, there is an appropriate requirement set, so that it can guarantee at compile time that no bounds errors will happen (it is not clear to me if this is possible every time, perhaps only for functions which you declare with “enforce requirements”).
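As a sketch of what "check once in the main program" could look like without any new language feature, here is a validation function one would call right after reading the CSR arrays from a file. The names `csr_check_mod` and `csr_is_valid` are hypothetical, and the set of conditions is my own attempt at a complete list for 1-based CSR (where row i owns entries `Ap(i)` to `Ap(i+1)-1`); it takes the sizes of `Ax` and `x` rather than the arrays themselves to keep the example independent of the real kind.

```fortran
module csr_check_mod
    implicit none
    private
    public :: csr_is_valid

contains

    ! Returns .true. if csr_matvec-style indexing cannot go out of bounds.
    ! Assumes 1-based CSR: row i owns entries Ap(i) .. Ap(i+1)-1 of Aj and Ax.
    pure function csr_is_valid(Ap, Aj, nnz, ncols) result(ok)
        integer, intent(in) :: Ap(:), Aj(:)
        integer, intent(in) :: nnz    ! size(Ax)
        integer, intent(in) :: ncols  ! size(x)
        logical :: ok
        integer :: n
        n = size(Ap)
        ok = .false.
        if (n < 1) return
        if (size(Aj) /= nnz) return
        if (Ap(1) < 1) return
        if (Ap(n) > nnz + 1) return
        ! Row pointers must be non-decreasing so every slice is well formed.
        if (any(Ap(2:n) < Ap(1:n-1))) return
        ! Every column index must be a valid index into x.
        if (any(Aj < 1 .or. Aj > ncols)) return
        ok = .true.
    end function

end module
```

The cost is a few passes over `Ap` and `Aj`, done once per matrix instead of once per inner-loop access.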
Example where the first approach is better
pure function solve(A, b) result(x)
    real, intent(in) :: A(:,:), b(:)
    real :: x(size(b))
    x = ... ! call Lapack to solve the system A*x = b.
end function
Here we could insert a requirement `require :: "inv(A) exists"`, but clearly determining whether the solve will succeed is as much work as doing the solve itself; it is not a simple condition to check like in the previous cases. For example, if you read the matrix from a file, it is not easy to check that it satisfies the condition without actually doing the solve, which is expensive. I can see some simple applications where this might be OK: you pay the price up front, then you don't have to worry about any error checking later and it is guaranteed to work. But in general this approach will not work here.
Rather, we need to weaken the result, for example something along these lines:
pure function solve(A, b) result(x)
    real, intent(in) :: A(:,:), b(:)
    real, optional :: x(size(b))
    x = ... ! call Lapack to solve the system A*x = b.
    ! If lapack returns an error, simply return "None"
end function
Or raise an exception, which at some level is equivalent: it is a way for the function to return "None". But then this new "return state" must be handled in the caller. So there is extra work to be done, and it is slower than the second approach. However, since `solve` is an expensive operation (which is the very thing that makes the second approach infeasible), it is not a big deal to do these extra checks at runtime.
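Since a function result cannot actually be `optional` in current Fortran, here is one way the "weakened result" could be sketched today: a result type carrying an `ok` flag next to the solution. The names `solve_mod` and `solve_result_t` are made up, and the Gaussian elimination is only a stand-in for the LAPACK call so that the example is self-contained; a real code would map LAPACK's error status onto `ok` instead of testing for an exactly zero pivot.

```fortran
module solve_mod
    implicit none
    private
    public :: solve_result_t, solve

    ! Hypothetical "weakened" result: x is only meaningful when ok is .true.
    type :: solve_result_t
        logical :: ok = .false.
        real, allocatable :: x(:)
    end type

contains

    ! Gaussian elimination with partial pivoting, standing in for LAPACK.
    pure function solve(A, b) result(res)
        real, intent(in) :: A(:,:), b(:)
        type(solve_result_t) :: res
        real :: M(size(b), size(b)+1), row(size(b)+1)
        integer :: n, k, p, i
        n = size(b)
        res%ok = .false.
        if (size(A, 1) /= n .or. size(A, 2) /= n) return
        M(:, 1:n) = A
        M(:, n+1) = b
        do k = 1, n
            ! Bring the largest remaining entry of column k onto the diagonal.
            p = k - 1 + maxloc(abs(M(k:n, k)), dim=1)
            if (abs(M(p, k)) <= tiny(1.0)) return  ! singular: report failure
            row = M(k, :)
            M(k, :) = M(p, :)
            M(p, :) = row
            do i = k + 1, n
                M(i, :) = M(i, :) - (M(i, k) / M(k, k)) * M(k, :)
            end do
        end do
        ! Back substitution on the upper-triangular system.
        allocate(res%x(n))
        do k = n, 1, -1
            res%x(k) = (M(k, n+1) - dot_product(M(k, k+1:n), res%x(k+1:n))) / M(k, k)
        end do
        res%ok = .true.
    end function

end module
```

The caller writes `r = solve(A, b)` and must test `r%ok` before touching `r%x`. Nothing in the language forces that test, which is exactly the gap the compile-time enforcement discussed here would close.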
Conclusion
It seems to me that if the requirements are cheap to compute, then the second approach is better, since it eliminates any error handling at runtime. If the requirements are costly to compute, the first approach is better, and we need to handle the error at runtime: but ideally the compiler would still enforce at compile time that we always handle the error state.