Is it standard-conforming?

The problem to solve in Fortran was to declare an allocatable array, then pass it to an internal procedure and fill it there with some sequentially obtained values. The student did it almost right but forgot to add allocatable attribute to the dummy arg in the procedure , so the code did not work:

program t
  integer, allocatable :: tab(:)
  ! [...]
  call sub(tab)
  ! [...]
subroutine sub(t)
  integer :: t(:)      ! lacks allocatable attrib!
  integer :: n
  ! [...]
  t = (/ t, n /)
  ! [...]
end subroutine sub
end program t

What does surprise me is that no compiler (gfortran -Wall, ifort -check all) give any warning here, not to mention an error. On the other hand, if one writes

integer :: tab(0)
tab = [ tab, 0 ]

ifort gives error #6366: The shapes of the array expressions do not conform.
gfortran: Error: Different shape for array assignment at (1) on dimension 1 (1 and 2)

Any hints?

The declaration, without the allocatable attribute, within the subroutine in itself is alright. Within the subroutine, the dummy argument is just a regular array (of size 0). Thus, there is an array size mismatch in the assignment. This is the programmer’s responsibility, the compiler is not required to catch this error, but compilers sometimes do so. Apparently with this code, gfortran does not warn for the error, but ifort does. Note that the tab(:) array in your last case is not allocatable. If it were, then it would be legal code with allocation on assignment.

I don’t see ifort giving any warning/error on the original code

I am guessing n is initialized before appending it to t. Just a remark though, t=[t,n] is perfectly valid but not recommended, unless it is done only a few times. If not, there are better ways to append stuff to an array (see, e.g., this) and I think F2023 makes it even better. But since this is student’s work t=[t,n] is more than ok.


My mistake, I did not read your post carefully enough.

Extending an array one element at a time for a final length of N requires N allocations and (N*(N+1))/2 memory accesses. One approach to improve this is to extend the array by blocks rather than one element at a time. That puts a factor of 1/blocksize in front of those values. If an ever-increasng block size is used, then that puts a log(N) prefactor in front of those two operation counts. With those approaches, it can still be expensive. Another approach is to use a different data structure, such as a linked list, initially, and then optionally copy the elements from the linked list to the final array. If done one element at a time, that requires N allocation/deallocations and only 2N memory accesses. The linked list allocations can also be done in blocks, which decreases the allocations by a factor of 1/blocksize. If you never actually need the array, and you can work with sequential access of the elements thereafter, then you can just use the linked list data structure and never copy to the array. Another useful data structure is a binary search tree (BST). This allows efficient one-by-one insertion; each insertion step requires log(N) effort, sequential access is N effort, and random access is N*log(N) effort. There are many situations where the BST approach is optimal.

edit: I think most of this is possible (now) with f2003, I’m unsure what new features of f2023 brings for this.

Since the array is being reallocated in the subroutine, the dummy arg does need to be allocatable afaik.

@RonShepard My “How?” question was referring to the citation from @Pap: “I think F2023 makes it even better”. I can’t remember a feature of F2023 that would help here, but I don’t know all the new features in detail…

1 Like

I edited my reply above to agree with your statement. I think there is an automatic allocation of character strings when using internal i/o in f2023. I guess that could be used in some situations to simplify the code. To implement this, the i/o library probably uses the linked-list approach internally. The i/o library can do this a little more efficiently than a programmer because, among other reasons, it can use stack allocations rather than heap allocations within the temporary linked list data structure. A fortran programmer using allocatable arrays is almost certainly limited to heap access.


   subroutine s1( a )
      integer, intent(inout) :: a(:)
      a = [ a, 2 ] !<-- assumed shape array: check on shape conformance demanding; few, if any, processors do so
   end subroutine 
   subroutine s2( a )
      integer, intent(inout) :: a(0)
      a = [ 1 ]    !<-- explicit shape array: check on shape conformance inexpensive, likely ALL Fortran 90 and later processors detect and report this
   end subroutine 

The cases in both the subprograms above do not conform since the standard states var = expr shall conform in shape. But I don’t recall the standard including any numbered constraints that require the processor to detect and diagnose the non-conformance (I’m going by memory here and typing with a mental compiler on my child’s tab so I may be wrong).

My hunch is the processors since the days of Fortran 90 have detected and reported the latter since they find it straightforward and it made sense to do for their user base following that landmark standard revision which introduced intrinsic assignment that also supported arrays.

However the first case with the assumed-shape received argument may not be diagnosed at all by most processors, or if at all they do, they may do so only with an additional explicit request for a check at run-time (e.g., a compiler option that comes into effect at run-time).

My guess is that few processors enforce this, as it was a common workaround for the missing assumed size in early days. I.e.

call do_stuff(arr, 10)
subroutine do_stuff(arr, n)
  integer :: arr(1)
  integer :: n, i

  do i = 1, n
    arr(i) = ...
  end do
end subroutine

I believe this is and always was not standard conforming, but that doesn’t mean it was uncommon.

Accessing elements outside the bounds is only part of the problem. The array expression a = [ a, 2 ] when a(:) is not allocatable is the same as a(1:n) = [ a(1:n), 2], which has no practical interpretation. The compiler is not free to interpret it as a(1:n+1) = [ a(1:n), 2], which would be more like the do-loop situation where it is only the bounds on the lhs that is being violated.