Fortran Best Practice Minibook

Indeed. And I believe I had to turn off GFortran bounds checking (or some other option that I use for modern codes) to get code like this to work.

I agree with this. Given the current Fortran standard and the state of Fortran compilers, explicit-shape arrays as dummy arguments fail to make the cut as a “best practice”.

2 Likes

I think that’s a different issue pertaining to legacy FORTRAN and compiler extensions, not standard Fortran.

Does this declaration give n the save attribute (making the procedure impure)?

No, it would be a runtime dimension. No save.

1 Like

I have been a strong advocate of explicit shape dummy arguments in the past for the same reason as yours. But only recently I decided to convert most, if not all, procedures to assumed-shape dummy arguments with contiguous attribute. This makes C interoperability likely more complex, but the generic interfaces for arguments of different shapes become very nice and elegant if the focus is on the Fortran side and usage alone. My only worry is that I occasionally see bizarre runtime seg-faults with some compilers (Intel, in one ase) when some procedures have assumed-shape dummy arguments. This may well be a bug in the user’s code and implementation. But the fact that it only appears when a specific compiler is used and it goes away when explicit shape procedure argument is used, makes the alternative hypothesis more likely.

1 Like

@FortranFan yes the C interface loses a lot of information about the array to be passed. It could also be declared something like float A[n][n] which communicates more to the programmer, but wouldn’t come with any actual safety. const for intent(in) could be added as well.

But ultimately, my original question was what is wrong with this style, and that question has been decisively answered: it gives a false sense of security, because compilers don’t actually check the bounds, even with check flags in use. I’m convinced it’s not a best practice, though admittedly, I will likely continue to use it when writing Fortran kernels which exist only to be called from C. I appreciate everyone’s input.

3 Likes

@shahmoradi especially in light of the discussion here, I definitely agree that assumed-shape is the way to go for Fortran-focused development. Regarding the segmentation faults on ifort, perhaps try -check all. This will add bounds-checks, which might help reveal whether the user’s code is to blame.

1 Like

The same here, I use assumed-shape dummy arguments. That is the recommendation in my original “best practices”, as well as the new one. So I agree on this with @everythingfunctional and @FortranFan and you.

While this is the best practice today, I am starting to very seriously question if this is the best way long term. As a user, the explicit-shape are so much more readable (from the explicit dimensions I can tell right away what each dimension is doing, say X(Nelem,Nquad) I know right away that the first index is element number, the second index is the quadrature, while if you see X(:,:) I have to go read the documentation what each dimension is supposed to do), and they are with the spirit of Fortran — no runtime overhead (equal or lower overhead than assumed-shape), much more possible error checking at compile time, what’s not to like? There might be needed some more checks in compilers (but those should happen anyway), and possible extension of Fortran to make some things easier. We already have to use explicit-shape as function return values. So they are not going away.

The strided (non-contiguous) arrays are very rare in my codes. But things like call sub(A(n:)) are common. However, that’s already allowed with explicit-shape. The only current annoyance is to pass the n integer argument, but my above proposal for Fortran extension of integer, dim :: n = size(X,1) can alleviate this. Compilers can internally pass n as a hidden argument, just like they already do for strings.

The compilers need to be improved to check this, no matter the “best practice”. I consider that just a “temporary bug”. Is there any other argument against explicit-shape?

4 Likes

EDIT

Sorry, I realised that my message does not make much sense without a bit more context (this is yet another reminder to self that writing messages in a hurry is never a good idea).

The assumed-shape approach reported below is IMO a better practice when there is a private procedure (typically as type bound procedure) because the public caller should have already made the consistency checks (or use an explicit-shape form) on the inputs to be passed further. Although it can be argued that it would be more considerate to consistency-check in the private procedure again (I’ve done that in the past) I prefer a more lean approach focussing my efforts on different goals in the two routines (sanity checks in the public routine and clear and efficient algorithm implementation in the private routine).

Obviously, this does not suggest a good practice that is valid in general, considering different situations, but it is usually hard to do. According to my experience (that might be limited, of course) this public caller and private “receiver” setup is very common in practice for programs that go beyond the prototyping stage (and very typical of type bound procedures). I agree with @certik that in the future if stricter bound checks by the compilers and a feature like dim become available, the explicit-shape route is optimal. Probably this should be the medium-long term goal. However, I would not suggest this as a good practice “here and now”.

I have edited the original message to better reflect these considerations.

Edited original message.

For Fortran-centred coding, I currently do the following most of the times. Why not?

public  :: caller
private :: sub

subroutine caller(in, out)   ! or use explicit-shape
  real(dp), intent(in)                :: in(:,:)
  real(dp), allocatable, intent(out)  :: out(:,:)
  
  if (size(in,1) /= size(in,2)) error stop
  ! ...
  call sub(in, out)
  ! ...
end subroutine caller

subroutine sub(X, Y)
  real(dp), intent(in)                :: X(:,:)
  real(dp), allocatable, intent(out)  :: Y(:,:)
  integer :: n

  n = size(X,1)
  allocate(Y, mold=X)   
  ! or allocate(Y(n)) depending on what `sub` is doing and 
  ! how you want to "document" it with Fortran statements.
  ! No need for these checks:
  ! if (size(X,1) /= size(X,2)) error stop
  ! if (size(Y,1) /= size(Y,2)) error stop
  ! if (size(X,1) /= size(Y,1)) error stop
  ! ...
end subroutine sub

or

subroutine whatever(A)
  real, intent(in) :: A(:,:)
  integer :: n
  
  n = size(A,1)
  ! ...

Is there a problem with the overhead in returning an allocatable array in the first example?

The current IMO “best practice” in this case would be:

  • use an explicit interface (which I guess it’s obvious for any modern code);
  • use assumed-shape arrays with checks for the public caller;
  • for the private “receiver” procedure
    • use assumed shape arrays specifying intent(in) for input arguments and allocatable (with intent(out)) for output arguments;
    • declare integer variables for the dimensions of the arrays and set them correctly as the first execution statement (for convenience and to clarify or “document” the code)
      Example:
    subroutine quadrature(X)
      real, intent(in)  :: X(:,:)
      integer :: Nelem, Nquad
      
      Nelem = size(X,1)
      Nquad = size(X,2)
      ! ...
    
    • if there are output arrays (declared allocatable) allocate them immediately below the integer variables at the above point so that they are still visually close to the declaration statements, choosing the form that better “documents” what the procedure is doing (e.g. allocate(Y, mold=X) vs allocate(Y(n)) above). (It could be debated if it’s good practice or not to check the outcome of the allocation or let the program fail if there’s a problem, but I guess this is for another discussion.)

With this approach

  • the problematic example showed by @Beliavsky is automatically resolved;
  • although slightly more verbose, the purpose of the dimensions is still clear, if the declaration of the relevant integers is done as first thing after the declaration of variables (no need to go check the docs);
  • there is no need to add another feature to the language, like the proposed dim that, still would be nice to have because it unlocks an effective and compact use of explicit-shape arrays, if also compilers will be able to strictly check consistency automatically.

Am I missing something? Yes, I was missing a lot of context, apologies.

1 Like

Just a friendly reminder that in C, unless n is a constant expression a declaration such as float A[n][n] is not permitted.

Also, with arrays of rank greater than one in Fortran as shown in the example with A(n,n) upthread, the interoperability with C always requires further thought and close attention. An element of an array of rank greater than one in C is not exactly the same as the element in Fortran which, unlike C, has the same type as the array object itself.

Thus there are a lot of details involved that makes it particularly challenging if the goal is a minibook toward best practices

@FortranFan,

It is permitted in a function argument list, as of C99 (and indeed outside of that context, as a variable-length array, but those shouldn’t be used for a variety of reasons). In your comment “an element of an array of rank greater than one in C is not exactly the same as the element in Fortran”, are you referring to the row vs column major ordering issue, or something else?

I’ve found it easy to deal with the row vs column major issue - just reverse the shape. What in fortran would be A(m,n) is A[n][m] in C.

There is no general restriction on rank agreement for argument association of arrays. In the version of Standard that I can read (J3/18-007r1 PDF), at 15.5.2.4 “Ordinary dummy arguments”, para 16 states

16 If a dummy argument is an assumed-shape array, the rank of the actual argument shall be the same as the rank of the dummy argument, and the actual argument shall not be an assumed-size array.

and paras 2 and 3 state

2 The dummy argument shall be type compatible with the actual argument.

3 The kind type parameter values of the actual argument shall agree with the corresponding ones of the dummy argument.

So, it is for the specific case of assumed-shape dummies (which is being deprecated by some people here) that ranks must match (and the compiler is not obliged to report a violation, unless I missed the constraint, although 3 compilers I tried do report the violation).

Can your compiler catch this problem with explicit-shape arrays?

program foo
  implicit none
  integer i
  real :: x(10)
  x = [ (i,i=1,size(x)) ]
  call s(x,2)       
  call s(x(1),2)   ! sequence association
  call s(x(3),2)   ! sequence association   
  call s(x(8:),2)  ! bounds violation at the print statement
contains
  subroutine s(a,n)
    integer, intent(in)::n
    real :: a(n,n)
    print *,a(n,2)
  end subroutine s
end program foo
5 Likes

The problem with explicit-shape dummy arguments comes with code like

subroutine sub(n, X, Y)
integer, intent(in) :: n
real, intent(in) :: X(n,n)
real, intent(out) :: Y(n,n)
...
end subroutine

allocate(A(5,5), B(5,5))
...
call sub(4, A, B)

(Almost) no compiler is going to catch that with even the most strict checking options turned on. In fact, it’s standards conformant, but probably isn’t what you wanted. And when the definition of sub, the allocations of A and B, and the call to sub are sufficiently far apart, even when you notice that something is not right in your program, it’s going to be very hard to find that bug, because you probably won’t even be looking for that kind of mistake. For most people, when they see that definition of sub, they think, “oh, the shapes are explicit, I guess the compiler will tell me if I pass in something that doesn’t match it,” but that assumption is false.

Furthermore, you can even pass in something of the wrong rank and you won’t get an error.

allocate(C(16), D(2, 2, 4))
...
call sub(4, C, D)

will compile and execute with no warnings or errors. In fact, it’s perfectly standards conformant.

In my mind explicit shape dummy arguments are one of the biggest gotchas in Fortran.

5 Likes

Everything I said or proposed above (as an idea to consider, not something set in stone) is assuming that compilers can check every explicit-shape usage, including both examples that @themos and @everythingfunctional posted. If this assumption is not true, then there is nothing to discuss, indeed, explicit-shape should not be used unless necessary.

So it seems most people are assuming that just because compilers currently do not check these things, it must mean that they cannot check it. I am arguing that they should check it. Hopefully in about half a year or so, I should be able to answer that question if it is possible to check this or not. And after that, we can revisit this discussion.

I agree that until compilers can reliably check this, the usage is unsafe and thus should not be used.

2 Likes

I do wish compilers would, even were required to, check that the shape of actual arguments matched explicit shape dummy arguments, but I worry they can’t. Before allocatable and assumed shape were added to the language, people made use of this feature as the most convenient way of “emulating” run-time sized arrays. The examples that @themos and I gave are valid, standards conforming Fortran (aside from the runtime bounds violation in the last call in @themos example). I’ve seen lots of old code written that way.

I don’t think it would necessarily be a good idea for compilers to start rejecting that previously valid code. You’d start seeing lots of complaints from developers and maintainers of legacy software about how their code has worked just fine for decades, and they shouldn’t have to change it. As much as I’d be in favor of such a change, I’m not convinced it would be a winning battle to fight.

@themos, can you explain why is this a bounds violation?

As I understand it, a bounds violation happens when you subscript an array with an index that exceeds the bounds of the array. I don’t think that happens in this example.

Your actual argument evaluates to [8, 9, 10] (size 3). In the subroutine, the dummy argument is declared as a(2,2) (size 4). I expect that its values in memory are [8, 9, 10, undefined]. So referencing element a(2,2) is still within bounds, though its value is undefined (compiler dependent).

Am I incorrect?

@milancurcic, think of the actual argument, the array x. S is instructed to access the fourth element counting x(8) as the first. 8, 9, 10, boom! It’s not that there is an entity there that has not been defined due to a programmer error, there is NO valid program entity there at all. That feels like a bounds violation to me. I can only fix it by making the array larger, not by assigning a value to a valid program data entity.

1 Like

I agree that it feels like a bounds violation, and I had doubts about it while participating in this thread. What makes me think that it’s not a bounds violation is that:

  • The three compilers that I currently have access to (GNU, Intel OneAPI, XLF) do not document this scenario as part of their bounds-checking.
  • The same compilers happily build and run the program
  • As I understand it, a program with a bounds violation is not standard-conforming; but your example program is (or am I wrong about this?)

@milancurcic

  • I guess you need to add one more compiler to your arsenal😀.

  • compilers vary, some produce fast code, some help out in other ways, what can I say?

  • the example is not standard-conforming.