'reshape' may behave inconsistently

I have tagged this as “compilers” but it could be a hole in the language spec. Consider the appended program, in that with gfortran the reshape and pack result in the lower bound of the assigned-to array to be 0 in the x to x and x to y cases, but 1 for in the x to z case; I would expected the lower bound to be 1 in every case.

Put another way my impression is that at least in gfortran if reshape and pack do not require a copy the lower bound of the assigned-to array is not changed, but it is changed if the operation requires a copy.\

Is this is according to the language specification? Is this a bug in the compiler?

program main
  character(len=*), parameter::     f1 = "(A,I2,I2,A,A16,$)"
  character(len=*), parameter::     f2 = "(A,I2)"

  integer, allocatable::        x(:)
  integer, allocatable::        y(:)
  integer, allocatable::        z(:)

  block
    integer::                     nbc
    integer,dimension(-1:1)::     b,c

    nbc           = size(b)+size(c)

    b             = 2
    c             = 3

    allocate(x(0:nbc-1))
    allocate(y(0:nbc-1))
    allocate(z(2:nbc))

    x(:)          = [b,c];                call px("x(:)=[b,c]")
    x             = [b,c];                call px("x=[b,c]")
    print *
  end block

  x             = [x];                    call px("x=[x]")
  x             = pack(x,.true.);         call px("x=pack(x)")
  x             = reshape(x,[size(x)]);   call px("x=reshape(x)")
  x             = reshape(x(:),[size(x)]);call px("x=reshape(x(:))")
  x             = reshape([x],[size(x)]); call px("x=reshape([x])")
  print *

  y             = [x];                    call py("y=[x]")
  y             = pack(x,.true.);         call py("y=pack(x)")
  y             = reshape(x,[size(x)]);   call py("y=reshape(x)")
  y             = reshape(x(:),[size(x)]);call py("y=reshape(x(:))")
  y             = reshape([x],[size(x)]); call py("y=reshape([x])")
  print *

  z             = [x];                    call pz("z=[x]")
  z             = pack(x,.true.);         call pz("z=pack(x)")
  z             = reshape(x,[size(x)]);   call pz("z=reshape(x)")
  z             = reshape(x(:),[size(x)]);call pz("z=reshape(x(:))")
  z             = reshape([x],[size(x)]); call pz("z=reshape([x])")
  print *

  z             = [1,2,3];                call pz("z=[1,2,3]")
  z(:)          = [4,5,6];                call pz("z(:)=[4,5,6]")
  print *

  deallocate(z)
  deallocate(y)
  deallocate(x)
contains

  ! The array to print must not be passed as an argument

  subroutine px(m)
    character(len= *), intent(in):: m
    print f1,'bounds: ',lbound(x),ubound(x)," ",m; print *,"; x: ",x
  end subroutine px

  subroutine py(m)
    character(len= *), intent(in):: m
    print f1,'bounds: ',lbound(y),ubound(y)," ",m; print *,"; y: ",y
  end subroutine py

  subroutine pz(m)
    character(len= *), intent(in):: m
    print f1,'bounds: ',lbound(z),ubound(z)," ",m; print *,"; z: ",z
  end subroutine pz

end program main
1 Like

Welcome to the forum!

I have not studied your source code in any detail, but the lower bound of an array is a bit tricky. This may be exaggerated by the automatic reallocation. As you noticed in the comment, you should not pass the array as an argument indeed. My guess is that:

  • Without reallocation, the lower bound is preserved.
  • With reallocation you get the default lower bound.

This does not necessarily mean a hole in the language specification or an error in the compiler, merely several features working together to put you on the wrong foot.

That is indeed what the test shows (at least in gfortran) but that seems to me rather undesirable because whether reallocation happens or not depends on circumstances which may depend on the value of some runtime variable. So:

  • with reshape and pack the lower bound may change unpredictably so one should always use lboundexplicitly to index an array.
  • Since in general it is hard to know whether an array has been the result of a reshape or a pack that means that all array references should be based on lbound. Unless I guess lower bounds are always set or defaulted to be 1 but again how can one be sure especially if using libraries.

I guess that this has not become a common issue because use of reshape and pack may be quite rare still and because off-by-1 errors will be the most common and are often hard to notice.

Just noticed in the related links this one:

Which pretty much comes to the same conclusion even without reshape or pack. One of the comments on that topic is the table in comment 12:

The only case (I think) where the lower bound is not under your control is for an allocatable, intent([in]out) dummy array.

1 Like

Yes. This has been my experience. When I’m trying to implement some algorithm from C code or just some pseudo-code in a book or report that uses zero-based indexing, I’ve found that its lot easier if I declare the dummy arguments as:

 subroutine asub(arg1, arg2)
     real, intent(in) :: arg1(0:)
     real, intent(out) :: arg2(0:)  

This allows me to use either the standard Fortran 1-based lower bound or allocate with lower bound as 0: for the actual arguments

1 Like

Regarding your initial post, what is somehow inconsistent is not the behavior of reshape, but of the assignment to an allocatable array.

When the (re)allocation on assignment has been introduced (F2003), a choice had to be made for the lower bound of the left-hand side after the (re)allocation:

  • keeping the existing one (which doesn’t exist if the LHS is unallocated)
  • inheriting the one from the right-hand side
  • setting it to 1

None of the choices could be fully consistent with the legacy behavior of the assignment when the LHS and the RHS have the same shape. And changing the legacy behavior was not acceptable (a silent change of behavior is the worst case of backward compatibility break).

There is also the related issue in this case of whether reallocation occurs when the LHS and RHS have the same shape. This has been discussed here before (and in comp.lang.fortran for even longer), and different people read the text of the standard in different ways. In section 10.2.1.3 it states:

“If the variable is an allocated allocatable variable, it is deallocated if expr is an array of different shape, any corresponding length type parameter values of the variable and expr differ, or the variable is polymorphic and the dynamic type or any corresponding kind type parameter values of the variable and expr differ.”

I have highlighted the part relevant to this discussion. The way I read that sentence, it does not say what occurs when the shape of the LHS and RHS are the same, it only states what occurs when they differ. Others read that “if” as if it says “if and only if”, which of course means something different. So if my reading is correct, when the LHS and RHS shapes are the same, a compiler is free to either reallocate or not. There are two different ways this can affect the LHS array. The first way is that its memory locations change before and after the assignment. The programmer can tell if this happens, for example, by looking at c_loc() of an array element. The second way is for the LHS array bounds to change before and after the assignment. One might further suppose that those two unspecified behaviors are independent, namely that both storage and bounds could change, storage could be the same but bounds could change, storage could change but the bounds could be the same, or neither storage nor bounds change. As @PierU mentions, there was a defined behavior for this kind of assignment prior to f2003 (namely the LHS bounds and the storage are kept unchanged), so one might argue that the current behavior should be consistent with that legacy behavior, even if that prior behavior is not specified in the current standards (or in any of the standards since f2003 for that matter). However, even if that is a weak legacy argument, I think it is probably the best one for compilers to adopt, even newer compilers that do not have a pre-f2003 legacy.

This issue has been with us for over two decades now and it has not been corrected in any of the standards (since and including f2003, and in particular the current f2023 standard), so I think it is just something that we will have to live with.

1 Like