Allocatable arrays and lower bounds

Two valuable hints from @pmk (priv.comm.), plus some my additions:

  1. A pointer dummy argument with INTENT(IN) can be associated with a non-pointer actual argument (see 15.5.2.7) as if it were a pointer assignment statement. For arrays, the bounds are “preserved” only if the target is a whole array (i.e. array component or array name without further qualification, see 3.156, 9.5.2), otherwise lbounds are set to 1.
  2. One can combine the power and flexibility of assumed-shaped array dummy args with the idea of explicit-shape/adjustable-array to get original bounds in the procedure, w/o using allocatable/pointer dummy args
real :: tab(0:9,0:19)
real, allocatable :: atab(:,:)
allocate(atab(-1:5,-2:8))
call foo(tab, lbound(tab))     ! prints   0   0
call foo(atab, lbound(atab))   ! prints  -1  -2
...
contains
subroutine foo(arr, lbs)
  integer, intent(in) :: lbs(2)
  real, intent(in out) :: arr(lbs(1):,lbs(2):) ! assumed-shape with forced lbounds
  print *, lbound(arr)
  ...
end
2 Likes

Thanks – here is an illustrative program.

module pointer_arg_mod
implicit none
character (len=*), parameter :: fmt = "(*(1x,i3))"
contains
subroutine print_bounds(v)
integer, intent(in), pointer :: v(:)
print fmt,lbound(v),ubound(v)
v = 42 ! allowed to modify the pointee
! deallocate (v) -- illegal
end subroutine print_bounds
end module pointer_arg_mod
!
program test_pointer_arg
use pointer_arg_mod, only: print_bounds,fmt
implicit none
integer, target :: v(-2:5)
call print_bounds(v)     ! -2 5
call print_bounds(v(3:)) !  1 3
print fmt,v !   42  42  42  42  42  42  42  42
end program test_pointer_arg

Alltogether, the rules for array bounds “inheritance” when passed to procedures, seem to me a bit complicated, if not messy.

3 Likes

I don’t think they are messy. Pointer and Allocatable dummy arguments inherit the bounds of the actual argument, others just inherit the extent (number of elements). The reason for this is that you can pass non-contiguous or vector-subscripted arguments to the non-pointer/allocatable dummies, and those don’t have definable lower bounds. It’s a convenience that you can assume the lower bound is 1 no matter what the actual looks like (in the “other” case, which is much more common.)

4 Likes

Another way to put it: if we were designing a language from scratch, how would we design the array passing and what array properties would we implement?

In my mind and experience with other languages, it would be something very similar if not identical to Fortran. I do agree with @msz59 that it feels sometimes a bit inconsistent or confusing to newcomers and even to me now. But overall the way Fortran does it is not too bad, it works pretty well in practice and in production. And Fortran is not designed from scratch in 2022, but had to make this all work with backwards compatibility in mind, and many of these array features are from the very beginnings of Fortran, and the rest was added around 1990, so it’s actually quite amazing how well it works. It’s close to ideal, I don’t know if it is ideal or if it could be made more consistent and simpler in a new language from scratch. (Yes, we are not designing a new language, but it helps me to think about “what if”, to see if there might be a few improvements we can do even for Fortran.)

3 Likes

MATLAB has some nice features, some more related to matrices, than others. I like being able to assign multiple values on the LHS, like “[a,b,c]=[10,20,30]”. . Everyone hates all the strings having to be the same length in a CHARACTER array constructor, and several compilers already allow that as an extension. Some will argue, but it was part of de-facto Fortran to be able to pass scalars to array arguments and to pass arrays of one shape to functions where the argument was of another shape; and I have seen multiple comments about just being able to say something like A=B when A and B are the same size but not the same shape. If A is a vector, which is a common case, A=[B] works, so there is some of that already. Passing different shapes still works without interfaces. It was so common it is very often an issue in “modernizing” old code when you almost always want to place the routines in modules. Even libraries like BLAS still contain routines like DCOPY, which is used in many other libraries for copying data between differently shaped arrays. Not having simple things like .inv. and .dot. standard, so you could say something like .inv.A or B.dot.C, which would allow for mathematical expressions that mapped more directly to standard equation syntax would be nice, and so on. MATLAB, which in some ways was “what Fortran should be doing” for many years, has several other nice syntax features I would not mind seeing in Fortran even today.

I am sure you know the string array constructors in Fortran do not require fixed equal lengths for all elements (in practice, the compiler will make it all equal). For the sake of others who may get the message wrongly, here is how it would be done,

character(:), allocatable :: str(:)
str = [character(10) :: "Fortran", "is", "a", "modern", "language."]

I have used MATLAB for as long as I have known Fortran. It’s great software. I still teach it occasionally and use it on a daily basis. But it has some fundamental design flaws. In my opinion, those flaws are where they decided to go away from Fortran conventions. The worst of all is the strictly 1-based indexing. Next in line is that everything is a matrix; no scalar, no vector. An immediate consequence is that a vector can have two different forms that are not the same. This has created significant confusion in my work. The traces of the design by non-Fortran (Java/C++) developers behind some of the bizarre rules in MATLAB are easily detectable as if there is an internal tug-of-war in MATLAB organization between those who want to push MATLAB toward Java/C++ syntax and those who prefer(red) Fortran syntax. Example: the horrendous treatment of optional function arguments in MATLAB. Still, with all the problems, it’s great useful software.

I am familar with that, but hate the syntax and specifying the length. Things like adding a new word to an array with a constructor are particularly ugly. Yeah, MATLAB did some odd things and has it’s own backward compatibility support issues that will probably keep them there, but that sounds familiar :slight_smile: . I somtimes abuse that syntax with the A descriptor when I am feeling too lazy to create a format with a CHARACTER variable and want to change string lengths to make compact colums, like finding the minimum width I need and then doing something more complicated, but like “WRITE(*,’(5(A,1X))’) [character(len=mywidth) :: stringarray]” and many other uses, where mywidth=maxval(len_trim(string_array))

1 Like

With allocatable array of deferred length strings, as in @shahmoradi’s example, the syntax is OK. But

  character(12) :: str(5)
  str = [character(12) :: "Fortran", "is", "a", "modern", "language."]

seems absurdly redundant. And, w/o character(12) in the constructor, it is an error. I can hardly imagine the reason for that requirement. Fortran has always allowed scalar assignment of a string to a variable of a different length. Why can’t this be done in array constructors?

I recall some discussion of allowing different character lengths in an array constructor, and as you say, some compilers allow this as an extension. Unfortunately, I don’t remember the arguments against adding this to the language. If you did this, you might also ask for:

[3.0,42,(-1,0.5)]

what should that do? In the end it was felt that keeping the “same type and type parameters” restriction in place was for the best.

I also find the current [character (len=5) :: "one","seven"] syntax awkward, so I wrote a hack, discussed in a GitHub fortran-lang/stdlib issue that lets you write

s = c("one","seven")

as in R to get an array of up to 10 character variables with LEN 100.

module util_mod
implicit none
contains
function c(x1,x2,x3,x4,x5,x6,x7,x8,x9,x10) result(vec)
! return character array containing present arguments
character (len=*)  , intent(in), optional    :: x1,x2,x3,x4,x5,x6,x7,x8,x9,x10
character (len=100)            , allocatable :: vec(:)
character (len=100)            , allocatable :: vec_(:)
integer                                      :: n
allocate (vec_(10))
if (present(x1))  vec_(1)  = x1
if (present(x2))  vec_(2)  = x2
if (present(x3))  vec_(3)  = x3
if (present(x4))  vec_(4)  = x4
if (present(x5))  vec_(5)  = x5
if (present(x6))  vec_(6)  = x6
if (present(x7))  vec_(7)  = x7
if (present(x8))  vec_(8)  = x8
if (present(x9))  vec_(9)  = x9
if (present(x10)) vec_(10) = x10
n = count([present(x1),present(x2),present(x3),present(x4),present(x5), &
           present(x6),present(x7),present(x8),present(x9),present(x10)])
! print*,"n=",n !! debug
allocate (vec(n))
if (n > 0) vec = vec_(:n)
end function c
end module util_mod
!
program xc
use util_mod, only: c
implicit none
print "(*(a7))", c("one")
print "(*(a7))", c("one","two")
print "(*(a7))", c("one","two","three")
print "(*(a7))", c("one","two","three","four","five","six","seven","eight","nine","ten")
end program xc

output:

one    
one    two    
one    two    three  
one    two    three  four   five   six    seven  eight  nine   ten
1 Like

Intuitively I would expect it to promote everything to complex, and that default promotion would be defined in standard.

Even though INT(), REAL(), … are not elemental I have (so far) not found a case where I used that syntax with numeric values (ie. [TYPE(KIND) :: V1, V2, V3 …] but find I HAVE to use it constantly with strings, even though it is intuitively clearer that an INTEGER and DOUBLEPRECISION are different types, not so clear when it is just two text strings of different length. I could see an argument being if you actually have different CHARACTER types like UNICODE and ASCII and …??? , where “promotion” might seem less clear being one of the reasons to require it.

1 Like

Simple-minded thinking can be of help here in that assignments in Fortran can be viewed as RHS → LHS and that the expr on RHS effectively needs to be evaluated fully and independently of LHS. This comes into play especially with constructors, whether they be type constructors of derived type and/or those with length-type parameters and extending into arrays of them.

The suggested “canonical” approach in modern Fortran then toward working with “strings”, derived types with length-type parameters whilst alleviating the perceived absurdity with the need to duplicate the length-type parameters and object shapes is to employ as much as possible

  1. the named constant facility with the PARAMETER attribute,
  2. and/or the ALLOCATABLE attribute

Some examples:

! A "store" for the array of string literals 
   character(len=*), parameter :: s(*) = [ character(12) :: "Fortran", "is", "a", "modern", "language."]
   ..
   character(len=12) :: str(5)
   ..
   str = s
   ..

Or

   ..
   character(len=:), allocatable :: s(:)
   ..
   s = [ character(12) :: "Fortran", "is", "a", "modern", "language." ]
1 Like

I like this too. But it has the side-effect of the left-hand side of the equation impacting the right-hand side, which goes against Fortran’s philosophy, I believe. In MATLAB, the LHS can change the behavior of the RHS, which is a strange but sometimes handy feature.
Another possible remedy would be to require the compiler to automatically set the length type parameter of all elements to the largest needed. With this rule, the resizing of elements will happen twice: once for array construction and then assign the value to the LHS.

Would coercion to the stronger type/kind be a sensible convention in such cases? After all, a programming language is just a collection of conventions. I fully agree that devising good conventions requires extreme care, and the Fortran committee should be praised for that level of care in their work.
Another similar but more annoying issue is that string concatenation requires strings to be all of the same kind. But why not set a rule here for coercion to the stronger higher kind in string concatenation? Is there a reason for not supporting this, or is it more of an oversight or lack of the need?

Fortran doesn’t have implicit conversion/coercion between character kinds. (10.1.5.1p4 for concatenation, 10.2.1.2p1 for assignment).

Well, that’s not a reason then. Honestly have not used anything but ASCII except for a few experiments. Was thinking it would be really complicated. Maybe finally caving in and using a user-defined type is a better way to go for me; there are a lot of issues with lengths and accidental truncations and constantly having to use TRIM() and such that are pet peeves that are mostly resolvable that way. I know there are some things coming like the AT descriptor in formats but it is not quite enough, although a lot of the TRIM() calls are on WRITE()s; I like that upcoming change quite a bit.

Unfortunately in Fortran, there isn’t really promotion or coercion. Effectively, every sub-expression is evaluated completely independently of anything with respect to any enclosing expression, including as @FortranFan pointed out that the LHS does not influence the RHS. Instead, the standard numeric operators and assignment are effectively generic interfaces such that integer + real has a specific procedure that gets called. It is inside these procedures that the “coercion” happens.