Good programming practices: Should I use the `dimension` attribute?

Hello, Fortran Community! I am currently writing a subroutine in Fortran that works with arrays. Among other dummy variables, there is an input array y0 and an output one y1. Part of the header of my subroutine looks like this:

integer, parameter :: rp = kind(1.0d0)
real(kind = rp), intent(in) :: x0, y0(:), h, smin, smax
real(kind = rp), intent(out) :: y1(size(y0))

I am wondering, in the spirit of good programming practices/preferred programming styles, is it preferred to declare the arrays y0 and y1 with the dimension attribute? Like this:

integer, parameter :: rp = kind(1.0d0)
real(kind = rp), intent(in) :: x0, h, smin, smax
real(kind = rp), dimension(:), intent(in) :: y0
real(kind = rp), dimension(size(y0)), intent(out) :: y1

The first one seems shorter and more elegant to me, but the latter seems more clear.

On a second related matter, considering that y1 has the same shape as y0, would it be considered correct to declare it as follows?

real(kind = rp), intent(out) :: y1(:)

Or should I use the y1(size(y0)) version?

Thank you very much in advance for your answers!

1 Like

Kargl’s response is a good explanation of the Occam’s razor that one should follow not just in coding, but in everyday logic. To extend his response, you may want to use it if it leads to more concise code, for example, when you have many declarations with the same shape in the same place.

Occam’s razor states the simpler solution is preferable - not the shortest. What “simpler” means is everyone’s interpretation of the term I guess, but to me the dimension attribute makes the code more clear, therefore I use it religiously. Yes, it may require a few additional lines of code, since arrays of different dimensions must be declared separately, but that’s exactly the point: clearly distinguish them.
Granted, it is a minor issue (if you can call it an “issue” at all), but wrong solutions to minor issues tend to accumulate, eventually making the code harder to read and understand. And good luck understanding your own code, if it was written half a year ago, and it wasn’t written in a clear manner.

Last but definitely not least, following the F standard is always a good idea in my opinion. Although not very common nowadays, F is a subset of modern Fortran that eliminates old or deprecated features of the language, as well as everything that might confuse the compiler or make the code harder to understand. It also makes data hiding imperative. The F standard enforces many rules. To name a few: arrays must be declared with the dimension attribute, functions must always have the result attribute (even if they are not recursive), you cannot use external functions but should use interface instead, modules must always be declared as private with public procedures clearly declared as such, etc. Everything not following the rules is treated as a syntax error without mercy. Personally, I like that style. Not only it makes the code easier to read, it also eliminates mixing old-style Fortan with modern Fortran - a very common and annoying programming style (especially by new programmers).

4 Likes

I recommend against the use of y1(size(y0)) as it doesn’t do what you think. An explicit shape dummy argument triggers element association, meaning that the actual argument gets associated element-wise with the dummy array regardless of if the actual is the same shape or even rank. This leaves open possibilities of all kinds of bugs. I’d recommend

real(kind=rp), intent(out) :: y1(:)

if (size(y1) /= size(y0)) error stop "arguments of different size"
3 Likes

Generally I agree but would advocate against religious use. There are, and quite often, situations when one declares arrays obviously related but of different shape. Say, linear equations system, with the main and augmented arrays, the RHS and solution vectors, all defined by single value N:

REAL :: A(N,N), AA(N,N+1), B(N), X(N)

looks simpler and clearer to me than three separate declarations with DIMENSION attribute.

1 Like

As you can gather from the responses so far, it’s totally up to you. FWIW, we cover this item in the stdlib style guide, which recommends not using dimension in general and using it in exceptional cases.

1 Like

Hello! Thank you very much for your answer, which I find very interesting. However, I am not completely sure if I understand it correctly. Would you mind elaborating, perhaps with an example?

Thank you!

Hello, everybody!

Thank you very much for your answers, and sorry for the late reply. From what I can gather from your answers, I believe the best way to proceed is to use the stdlib style guide, as suggested by @milancurcic, therefore, not using the dimension attribute. However, here is a situation where I believe that using it is unavoidable: Consider a Runge-Kutta method that requires a lot of partial evaluations of the RHS of the differential equation y'=f(x,y)—let’s say, from f_0, all the way up to f_{9}—, and therefore we need to declare the variables f0, …, f9 as having the same size as y. Doing that with

real(dp) :: f0(size(y)), f1(size(y)), f2(size(y)), f3(size(y)), f4(size(y)), f5(size(y)), &
            f6(size(y)), f7(size(y)), f8(size(y)), f9(size(y))

is not as good as

real(dp), dimension(size(y)) :: f1, f2, f3, f4, f5, f6, f7, f8, f9

However, mixing both styles in some piece of code would also be a bad practice.

How would you proceed in cases like this?

allocate(f1(n), f2(n), f3(n), ..., f9(n)
won’t compile but
allocate(f1(n), f2(n), f3(n), ..., f9(n))
will. I’m glad to see I’m not the only one who falls into that trap.

Whether the arrays are allocatable or not I use DIMENSION when (declaring if there are a lot of arrays with the same bounds., but seldom if not. However my code is usually read only by me and compilers.

Using allocatables will put those arrays in the heap. If they are automatic arrays, they will likely be on the stack, which should be more efficient. If this routine is called within a tight loop, then that could be significant. Otherwise, it would not make any difference.

On the other hand, if the same arrays can be used multiple times, say collected together in a derived type and passed as an argument to the routine, then that would be even better.

@dsejas , have you considered what is advised by @RonShepard here with a derived type?

A derived type in a module that can include your solver code for the differential equations along with helper subprograms is likely the “good programming practice” to pursue here, even more important than the stylistic aspect of how multiple arrays are declared.

As to the latter, a derived type which is parameterized by a length-type parameter can further simplify and streamline the declarations and give you the assurance of consistency with shape of the variables toward f0 up to f9 in the evaluations while not having to pass around all the variables as arguments in the solver procedures:

   type :: rkmeth_t(n)
      integer, len :: n
      ..
      real(..) :: y(n)
      ..
      real(..) :: f0(n)
      ..
      real(..) :: f9(n)
      ..
   end type
..
   subroutine solve( .., rkmeth, .. )
      ..
      type(rkmeth_t(n=*)), intent(inout) :: rkmeth

      ..
      rkmeth%f0(1) = .. 
1 Like

I was thinking of a derived type with allocatable arrays in my original post. In that case, the arrays would live on the heap rather than the stack, but since they would only be allocated once per run, that heap overhead would not be significant. However, the parameterized data type approach also has some advantages. Does anyone know if the PDT arrays are allocated typically on the heap or the stack in the common implementations? Also, PDTs were introduced in f2003, but their implementation was slow in the various compilers. Even now, 19 years later, there are still some portability problems with using them. This is a simple case, so it is probably portable and relatively bug-free. It would be good to know if any popular compilers now do NOT support this kind of application of that feature.

As an example

program example
  implicit none

  real :: x(4)

  x = [1.0, 2.0, 3.0, 4.0]
  call square(x)
  print *, x
contains
  subroutine square(y)
    real, intent(inout) :: y(3)

    do i = 1, 3
      y(i) = y(i)*y(i)
    end do
  end subroutine
end program

Is perfectly valid and standards conforming, but violates the principle of least surprise for the line call square(x). One would naively expect this to square every value in the x array, but it only affects the first 3. This is surprising, and I’ve seen these kinds of mistakes made all the time, where the sizes of some arrays are assumed to be the same throughout a program, but when later changed, not all procedures got updated. These kinds of bugs are really hard to track down.

1 Like

That program is not standard-conforming because it dioes not say integer:: i or integer i in either the main line or the subroutine. On fixing that and compiling and running the program I was surprised that neither gfortran -Warray-bounds nor ifort -check bounds gave a warning. Thank you for pointing out why one should always declare array arguments with : or * not explicit integers.

I prefer not to use dimension. A separate issue is that overriding dimension as in the code below should be avoided.

implicit none
real, dimension(5) :: x,y(2)
print*,size(x),size(y) ! 5 2
end