Fortran Best Practice Minibook

To avoid the PDT wrapping it is also possible to perform the operations the other way round (Fortran calls a C routine):

interface
  ! void c_fill_points(int n, float (*xyz)[])
  subroutine c_fill_points(n,xyz) bind(c)
    integer(c_int), value :: n
    real(c_float), intent(inout) :: xyz(3,*)
  end subroutine
end interface

! void f_init_pdt_and_process(int n)
subroutine f_init_pdt_and_process(n) bind(c)
  integer, value :: n
  type(point_t(n)) :: points
  call c_fill_xyz(n,points%xyz) ! filled on C side
  call process_points(points)
end subroutine

which now has the benefit all memory management is done automatically on the Fortran side. This however might not be an option if the caller in C wants to keep a copy of the xzy array around.

1 Like

Following up on the the comments of @themos, one can write a procedure with explicit shape arrays and also a wrapper with assumed shape arrays. The wrapper should be called for safety unless there is a material speed penalty or one is calling from C. Here is an example based on a code of John Burkardt.

subroutine qr_solve_wrap(a,b,x)
!! QR_SOLVE_WRAP solves a linear system in the least squares sense by calling QR_SOLVE_BASE
real(kind=dp), intent(in)  :: a(:,:) ! (m,n) LHS matrix
real(kind=dp), intent(in)  :: b(:)   ! (m)   right hand side
real(kind=dp), intent(out) :: x(:)   ! (n)   least squares solution
integer                    :: m,n
m = size(b)
n = size(x)
if (size(a,1) /= m .or. size(a,2) /= n) then
   write (*,*) "in qr_solve_wrap, size(b), size(x), shape(a)=",size(b),size(x),shape(a), &
   "need shape(a) = [size(b),size(x)], STOPPING"
   stop
end if
call qr_solve_base(m,n,a,b,x)
end subroutine qr_solve_wrap
!
subroutine qr_solve_base(m,n,a,b,x)
!! QR_SOLVE_BASE solves a linear system in the least squares sense.
!  Licensing: This code is distributed under the GNU LGPL license.
!  Modified: 15 April 2012
!  Author: John Burkardt
!  Parameters:
!    Input, integer :: M, the number of rows of A.
!    Input, integer :: N, the number of columns of A.
!    Input, real(kind=dp) A(M,N), the matrix.
!    Input, real(kind=dp) B(M), the right hand side.
!    Output, real(kind=dp) X(N), the least squares solution.
integer      , intent(in)  :: m      ! number of rows of A
integer      , intent(in)  :: n      ! number of columns of A
real(kind=dp), intent(in)  :: a(m,n) ! LHS matrix
real(kind=dp), intent(in)  :: b(m)   ! right hand side
real(kind=dp), intent(out) :: x(n)   ! least squares solution
integer                    :: ind,itask,jpvt(n),kr,lda
real(kind=dp)              :: a_qr(m,n),qraux(n),r(m),tol,work(n)
a_qr(1:m,1:n) = a(1:m,1:n)
lda = m
tol = epsilon(tol)/maxval(abs(a_qr(1:m,1:n)))
itask = 1
call dqrls(a_qr,lda,m,n,tol,kr,b,x,r,jpvt,qraux,work,itask,ind) ! call Linpack
end subroutine qr_solve_base
2 Likes

As I alluded to in my comments earlier, the best practices for Fortran hopefully take into consideration what is best first and foremost for code in Fortran by library authors and by the consumers. With such a consideration, as things stand now, the use of dummy arguments that are explicit-shape arrays would fail to make the cut as a best practice.

Yes, we all agree that currently the best advice is to use assumed-shape, not explicit-shape arrays.

However, I am interested in discussing the future of Fortran.

I still believe it can be done without false positives, I haven’t seen any counter example yet.

I can think of a couple of cases that I would want to be warned about, but are technically standard conforming.

subroutine do_work(data, size)
  integer, intent(in) :: size
  real, intent(inout) :: data(size)
  ...
end subroutine

real :: work_array(1000000)
...
call do_work(work_array(5000), 5000)

That looks like a rank mismatch to me, and I’d like it if the compiler warned me about it. I know of programmers who’ve used this style/feature extensively and would disagree. They say they know what they’re doing, their code works, stop warning me about it.

I’ve also seen stuff like

subroutine do_work(data, size, how_many_to_access)
  integer, intent(in) :: size, how_many_to_access
  real, intent(inout) :: data(size)

  do i = 1, how_many_to_access
    ! do something with data(i)
  end do
end subroutine

real :: data(10)

call do_work(data(5), 10, 5)

Now that is definitely a shape mismatch, but because the procedure does not actually access any elements beyond the bounds of the actual argument, it is technically standards conforming. To somebody unwilling to spend time updating their “working” code, they don’t want to see warnings about that.

I’m sure there will be an option to turn warnings off for the above examples, but if they are on by default, you will almost certainly hear from the people who disagree about them. Of course, I could always offer my services helping them to update their codes :wink: .

Yes, I think the idea is to have something like -Wargument-shape-mismatch or similar to enable it, and it would be included with -Wall or similar. I like all warnings off by default, but allow for full control to enable warnings, including this one.

Rank and shape mismatches for explicit shape (and assumed size) dummies are conforming.

Your second example is non-conforming, because by storage association, data(5) has six elements but the dummy needs 10. My reply to @milancurcic above applies. It’s an error regardless of the executable statements in do_work.

1 Like

Any chance you could point to the place in the standard that says this? I’ve kind of looked at it from a “if a tree falls in the woods” kind of perspective. If nothing attempts to access the missing entities, does it matter that they aren’t there?

Like I said, I’d like it if compilers warned about this stuff, but there are those who prefer to live dangerously.

@everythingfunctional the requirement is quoted in this post: Fortran Best Practice Minibook - #51 by themos

1 Like

And the standard tries to support the sentiment of those coders given the notion they are an appreciable fraction of those who worked with (legacy) FORTRAN prior to Fortran 90. Thus the sequence association semantics in the standard places the onus on the programmer that the size of the dummy argument be equal to less than the actual argument but it does not require the processor to diagnose when the size of dummy array exceeds the actual. Now a compiler may decide to offer an option to check for size overflow but then “those” coders will likely not use the option anyway.

As @themos said, this is perfectly valid. The way I personally read this is as:

call do_work(work_array(5000:5000+5000-1), 5000)

And the usual bounds checking of work_array(5000:5000+5000-1) applies, in this case it will pass the check.

I see work_array(5000) as a syntactic sugar for work_array(5000:5000+5000-1). Both should (eventually…) be checked by a compiler and thus safe.

Regarding your second example:

That is equivalent to:

call do_work(data(5:5+10-1), 10, 5)

Which would indeed fail the bounds check for data(5:5+10-1), even if technically the out of bounds is never accessed inside.

Ok, so this second example is one example where such a check would produce what could be argued is a “false positive”. Thanks for that.

As @themos said, this second example is non conforming and should be an error regardless of whether you actually access the (out of bounds) array or not inside the routine.

I might have seen code like that too. If that is what some code is using (and I know they are there), then indeed you might want to turn the checks off. However, such a code you don’t want to have, you really want to fix that.

The above example can be trivially fixed by replacing real, intent(inout) :: data(size) with real, intent(inout) :: data(how_many_to_access). However, if the routine was written like this:

subroutine do_work(data, size)
  integer, intent(in) :: size
  real, intent(inout) :: data(size)

  do i = 1, some_complex_calculation()
    ! do something with data(i)
  end do
end subroutine

Then it seems the most natural fix is to realize that size in this case is meaningless and essentially emulating assumed-shape arrays and so we might as well write it like this:

subroutine do_work(data)
  real, intent(inout) :: data(:)

  do i = 1, some_complex_calculation()
    ! do something with data(i)
  end do
end subroutine

And then all checks will work again. This is a case where assumed-shape arrays should be used, because we do not (easily) know the upper size of the array data that we need in the subroutine. So we give up runtime bounds checking (“sequence association checking”) at the call site, and rather just get the bounds checking inside the subroutine.

In couple years, once compilers check every explicit-shape arrays thoroughly, I will recommend:

  • If you do not easily know the upper bound inside the subroutine, use assumed-shape array. The second example above is a prime use case. The compiler can only check mismatches inside the subroutine.
  • If you know the upper bound inside the subroutine, use explicit-shape. The compiler can then check mismatches at the call site.

Note that in my integer, dim :: n proposal above, 1D arrays like the second example above are actually equivalent in both approaches, that is, the assumed-shape case above and this “extended Fortran”:

subroutine do_work(data)
  integer, dim :: n
  real, intent(inout) :: data(n)

  do i = 1, some_complex_calculation()
    ! do something with data(i)
  end do
end subroutine

are equivalent, as there will never be an error at the call site, any array you pass in will go through. You will only get bounds check errors inside the subroutine. If you pass n as an argument to the subroutine, then that’s different, the compiler would (in the future!) then check it at the call site to ensure consistency. The same if you either have a multi-dimensional array or more arrays and you reuse n multiple times, then the compiler would (in the future!) also check this.

The Standard says

If the dummy argument is not of type character with default or C character kind, and the actual argument is an array element designator, the element sequence consists of that array element and each element that follows it in array element order.

Consequently, that should correspond to:

call do_work(work_array(5000:), 5000)
1 Like

What is the meaning of passing a longer array to a function that expects a shorter array? Isn’t that equivalent to call do_work(work_array(5000:5000+5000-1), 5000)? It seems to me it is.

The call to do_work can be made in the absence of an explicit interface, a la F77. The compiler in that case has no knowledge of do_work’s interface or internals. It does not know how the corresponding dummy is dimensioned so it cannot check for size violations in the caller. But it must ensure that the right thing happens when the corresponding dummy is a scalar or one of the F77 supported arrays (explicit-shape (n,n) and assumed-size (n,*)). I should note here that the “opposite” case, i.e. passing an array to a scalar dummy, is a violation.

If you want to “think like the compiler”, you would put the runtime check in the callee, after having arranged, somehow (hint:in the caller), to have access to all the sizes of all actual argument arrays. The callee, do_work, sees the dimensions of the dummies, and can calculate all the required sizes and compare with the sizes of the actual arguments. The -C=calls option of nagfor does this by putting the actual argument sizes of the current call in a global structure (which the callee then retrieves). We need one such structure per thread of execution.

1 Like

@themos thanks! While we are on this topic, do you know if there is anything in the standard that prescribes how the compiler must implement explicit-shape and assumed-shape arrays internally?

In particular, my understanding is that most compilers implement explicit-shape as a pointer, while assumed-shape using a descriptor.

Are compilers free to for example implement explicit-shape using an array descriptor also?

Are compilers free to implement assumed-shape arrays by passing the dimensions+strides as extra arguments (thus no array descriptor in the traditional way)?

I am just trying to understand what freedom there is for the implementers.

The Fortran standard as s rule tries to allow as much freedom for the implementor as possible, freedom that is typically used to adopt the highest performance approach for the implementation. Because C interoperability uses array descriptors, in some contexts array descriptors will have to be defined. In other contexts you could implement dimensions plus strides as extra arguments, but loading and unloading the arguments on the stack will have a performance hit and I don’t see any benefits. Constructing and deconstructing the array descriptor for explicit shape arrays will also have a performance hit, but will allow some run time error checking that could be useful for an “interpretive” implementation…

I’ve had to go back to an older email exchange with Malcolm Cohen to remind myself of pertinent facts. I can’t say that I can give chapter and verse for everything but the situation is complicated. The short version is: do what other successful compilers are doing, otherwise you will have a myriad cases to consider, implement and test.

Malcolm says that there was a compiler that “passed some (not all) assumed-size arrays by descriptor, not by address.”

The main conceptual stumbling block that we seem to have is that you regard explicit-shape arrays as more “advanced” or “content-rich” or “compiler-helpful” whereas they were in fact an F77 feature and predate any serious attempt in the language to catch errors at compile-time (or run-time). I don’t think they can be “rehabilitated” as long as Fortran supports F77 conforming codes. New statements or attributes will probably be needed. And if you are using explicit-shape arrays, you probably need a compiler with nagfor’s -C=calls capability at some point in the development cycle, I should think, in order to check what you think the compiler should be checking.

You might be dismayed to learn that the situation is worse than you thought. Consider an explicit interface for a solver routine that has a dummy procedure argument (its job is to evaluate a user function) with interface block

Subroutine f(n,x,fx)
Integer, Intent(In):: n
Real, Intent(In)::x(n,*)
Real, Intent(Out):fx(n)
End Subroutine f

and you think that you can pass the following procedure as f to the routine through explicit interface:

Subroutine actual(n,x,fx)
!edited
Integer, Intent(In):: n
Real, Intent(In)::x(n,n)  ! explicit-shape, not assumed-size
Real, Intent(Out):fx(n)
End Subroutine actual

Well, that’s illegal. With explicit interfaces, the rules of argument association are more stringent and characteristics of the procedure must match. And yes, whether a dummy is assumed-size or explicit-shape counts as a characteristic (15.3.2.2 Characteristics of dummy data objects) and this mismatch is non-conforming.

I realize this has not answered your question fully but a full answer would require checking many clauses in the standard. Perhaps you can ask Malcolm at the next meeting because he has spent more than 3 decades thinking about exactly this stuff!

1 Like

Guys, the minibook is precious! Thanks for the effort. Some comments and suggestions.

When you talk about best practices, one should definitely consider a variant for codes that are parallelized with MPI.

I’d also like to know your opinion on precompiler directives. Any guidance when it is justified to use them?

Perhaps it may be useful to mark different sections of the minibook by their improtance, from merely aesthetic (typing conventions) to fundamental (precision and similar, issues that may lead to inaccurate or incorrect code) and to good-programming ones (how to make code that is already accurate and correct, better optimized and easier to use and reuse).

2 Likes

You mean the C preprocessor?

1 Like

Yes, that’s what I meant. I find them super useful and use them a lot in the code. Is it something too noncanonical to be mentioned in the context of the Fortran practices?

1 Like