Just say no to non-default lower bounds

Yes, I believe this may be misleading to some.
I guess the basic rule is that “in Fortran, first the RHS is evaluated, then it is assigned to the LHS”.

Contrary to cases like real(8) :: x = 1.0 where opinions may be more nuanced, I think the rule for allocatable assignment is clear enough.

  • Is the LHS allocated with the same shape? Then perform the assignment only (the bounds do not change).
  • Is it not allocated? Then allocate it, copying the bounds from the RHS, and assign.
  • Is it allocated, but with a different size? Then it is reallocated (F2003+), and the second case applies.
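The three cases can be checked with a small program. This is only a sketch: the program and variable names are made up for illustration, and the expected values in the comments assume F2003 (re)allocation on assignment is enabled, which is the default in current compilers.

```fortran
program alloc_assign_rules
   implicit none
   integer, allocatable :: a(:), b(:)

   allocate(a(0:2))
   a = [10, 20, 30]                 ! same shape: a keeps its bounds 0:2

   ! b is not allocated: it is allocated with the bounds of a, then assigned.
   b = a
   print *, lbound(b), ubound(b)    ! expected: 0 2

   ! b is allocated with the same shape: assignment only, bounds untouched.
   deallocate(b)
   allocate(b(5:7))
   b = a
   print *, lbound(b), ubound(b)    ! expected: 5 7

   ! b is allocated with a different size: reallocated, bounds taken from a.
   deallocate(b)
   allocate(b(5:6))
   b = a
   print *, lbound(b), ubound(b)    ! expected: 0 2
end program alloc_assign_rules
```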

I’ve introduced many new people to Fortran and every single one of them (myself included!) has struggled with knowing what the lowest index is (when it is preserved, when not, etc.). So I would say it’s definitely not intuitive. But one can learn the rules.

In my codes I mostly use 1, and sometimes 0, depending on the problem. Most math problems naturally start from 1, but a significant minority naturally start from 0. An even smaller minority start from some negative value (say m=-l, .., l in spherical harmonics; kappa in the radial Dirac equation; and other examples given by others above).

Is it worth allowing non-default lower bounds? As @rouson showed, there are many land mines with that, unfortunately. Is there a way to fix that in Fortran?


@rouson,

“Funny” you should think and write that!

You will be the first to note that in many, many domains, so many powers-that-be, based on feedback from so many senior technical leaders, are “officially declaring all use of” Fortran itself “as evil forevermore”!!

And that is due to “byzantine rules around” all kinds of semantics in the language standard and its processors that “are sufficiently complicated and nuanced as to make seemingly simple code modifications problematic”.

You will note the “evil” complications and nuances, etc. all begin with implicit mapping and on and on the list goes. The complication with the ALLOCATABLE attribute, reallocation on assignment when shapes differ, and the effect on bounds is just one item on a long list.

It is the J3 and WG5 committees who should take note of all these problems including what you raise here but they don’t.

There is a drastic need for out-of-the-box thinking to address the problems, e.g., J3 is a mere contractor, so it subcontracts to others, or WG5 gets a new contractor, or ISO / IEC itself acknowledges other contributor groups besides WG5 to standardize the language. Something has gotta give.

Or everyone can start “throwing the baby out with the bathwater” and get into differing definitions of what constitutes the “baby”. @rouson posits “non-default lower bounds” as one, for some it may be some other feature also, for others the language standard itself, and for many others all of Fortran!!!

I wholeheartedly agree with each comment suggesting that non-default bounds make certain algorithms feel more natural to code than the default lower bound of 1. In particular, I very much agree that having bounds going from say -N/2 to N/2-1 makes it really easy to translate analytical expressions related to Fast Fourier Transforms into code. The algorithm that inspired my post is forward and back propagation for a neural network in which the number of nodes in each layer is stored in an integer array n. For such a scheme, translating common explanations of back-propagation into code makes a lower bound of 0 feel most natural.
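To make the layer-count case concrete, here is a minimal sketch. The module name, the function count_weights, and the layer sizes are all hypothetical; only the idea of an integer array n indexed from 0 comes from the post. The point is that an assumed-shape dummy has to restate the lower bound, because otherwise it starts at 1 inside the procedure.

```fortran
module layers_m
   implicit none
contains
   ! Hypothetical helper: number of weights in a fully connected network,
   ! where n(l) is the number of nodes in layer l and layer 0 is the input.
   integer function count_weights(n) result(total)
      integer, intent(in) :: n(0:)   ! restate the lower bound; n(:) would start at 1
      integer :: l
      total = 0
      do l = 1, ubound(n, 1)
         total = total + n(l-1)*n(l)
      end do
   end function count_weights
end module layers_m

program layers_demo
   use layers_m, only: count_weights
   implicit none
   integer :: n(0:2) = [4, 3, 2]     ! made-up layer sizes
   print *, count_weights(n)         ! expected: 18  (4*3 + 3*2)
end program layers_demo
```

Declaring the dummy as n(:) instead would still compile, but every index in the loop would then be off by one relative to the math.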

So the key question here relates to maintenance and documentation. One person getting the code to work at one moment in time is no problem. In that regard, non-default bounds are great. To make my post less provocative, I’d say, “Anywhere non-default bounds are used, be kind to future maintainers of the code by including a comment that highlights some of the obvious pitfalls by informing the reader of a few common scenarios in which the bounds are not preserved.” But that would have been too long for the title of a post. :slight_smile:


Interesting, that didn’t cross my mind (and I don’t readily see it) as a use case in my implementation.

Consider “Non-default lower bounds sometimes considered harmful”. :slight_smile:


Indeed, and I don’t find them that complicated, once one gets the rationale behind them. I struggled a bit too at the beginning, but no differently than with any other new feature that has to be learnt.

However, the allocation on assignment may add a bit of confusion…

It’s more an “as if”, because as @RonShepard has pointed out, it seems that nothing in the standard prevents the compiler from “reallocating anyway” in the first case. I expect the bounds of c to be preserved anyway, though:

Even if the compiler chooses to reallocate behind the scenes, it should normally restore the original bounds of b at the end.


I think the Fortran standard is ambiguous about this feature too. If reallocation occurs, then I think the compiler is allowed to copy the bounds from the RHS. As I said before, if the programmer wants to keep the original storage and the original bounds, then the assignment should be written as b(:)=a. The compiler is then required to do what you want. The alternative, b=a when the sizes match, is ambiguous.

The other possibility, b=a(:), has not been discussed here, but I think it is also ambiguous when the sizes match. In this case, the RHS lower bound is 1, regardless of lbound(a) (because it is an expression, not an array reference), so the lower bound of b after the assignment could be either the original lower bound of b, or 1, but not lbound(a), which might also surprise a Fortran programmer.
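Here is a small sketch of the two spellings in the cases where I am sure of the outcome; the names are illustrative only. With b(:)=a the LHS is a section, so it is never reallocated and keeps its bounds (the shapes must conform). With b=a(:) and a size mismatch, b is reallocated and picks up the lower bound 1 of the section a(:).

```fortran
program section_forms
   implicit none
   integer, allocatable :: a(:), b(:)

   allocate(a(0:2))
   a = [1, 2, 3]

   ! Section on the LHS: no reallocation, the bounds of b are kept.
   allocate(b(5:7))
   b(:) = a
   print *, lbound(b), ubound(b)    ! expected: 5 7

   ! Section on the RHS with a different size: b is reallocated,
   ! and lbound(a(:)) is 1, not lbound(a).
   deallocate(b)
   allocate(b(5:6))
   b = a(:)
   print *, lbound(b), ubound(b)    ! expected: 1 3
end program section_forms
```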

Regarding the storage allocation, there are two important aspects of this. One is the run time efficiency. Heap allocation is an expensive operation, so one might want the compiler to avoid unnecessary reallocations as much as possible just for this reason. The other involves pointer associations involving the LHS array. If reallocation does not occur, then any previous pointer associations would remain valid after the assignment. If reallocation occurs, then those pointer associations would be invalid after the assignment. This issue of pointer associations also arises with move_alloc(), where the programmer takes control of if/when a reallocation occurs. In this case, pointer associations get transferred from the from= argument to the to= argument and remain valid afterwards.
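The move_alloc() behaviour can be sketched as follows. The names src, dst, and p are made up, and both arrays carry the TARGET attribute, which is what allows the pointer association to follow the data. The expected values are what I understand the standard to require.

```fortran
program move_alloc_demo
   implicit none
   integer, allocatable, target :: src(:), dst(:)
   integer, pointer :: p(:)

   allocate(src(-1:1))
   src = [1, 2, 3]
   p => src                          ! p is associated with the whole of src

   call move_alloc(from=src, to=dst)

   print *, allocated(src)           ! expected: F
   print *, lbound(dst), ubound(dst) ! expected: -1 1  (bounds travel with the data)
   print *, associated(p, dst)       ! expected: T    (association follows to dst)
end program move_alloc_demo
```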


@rouson,

Just remembered something which I have found to trip up some Fortranners: it can give you an axiom to go with your table:

The lower bound of a zero-sized array is always unity.

The result with LBOUND intrinsic for array dimensions that have zero extents is always unity.

P.S.> The above correction made on 11-July-2023 is based on the comment threads below.


Good one! Is upper bound always zero in this case?

Yes. Since it is a zero-sized array, the upper bound is zero also.

program zero
   integer :: a(-3:-4)
   write(*,*) lbound(a), ubound(a)
end program zero

$ nagfor zero.f90 && a.out
NAG Fortran Compiler Release 7.1(Hanzomon) Build 7114
[NAG Fortran Compiler normal termination]
 1 0

:slight_smile:

@rouson,

To make it truly provocative and to totally trigger quite a few compiler implementors, you know what I would title the post?!

Design a container type (class) with “blade guards” such as a PDT if your consumers employ non-default lower bounds with arrays!

Here’s an example to consider with a parameterized derived type (PDT) “blade guard”:

module m
   integer, parameter :: LB = -1   !<-- An arbitrary lower bound
   type :: t(k,n)
      integer, kind :: k = LB
      integer, len :: n
      integer :: v(k:k+n-1)
   end type 
end module
   use m, only : t
   type(t(n=:)), allocatable :: b, c
   allocate( t(n=3) :: b )
   b%v = [ 42, 43, 44 ]
   allocate( t(n=2) :: c )
   print *, "Originally: lbound(c%v) = ", lbound(c%v), "; size(c%v) = ", size(c%v)
   c = b
   print *, "Following assignment: lbound(c%v) = ", lbound(c%v), "; size(c%v) = ", size(c%v)
   print *, "Following assignment: c%v = ", c%v, "; expected is ", b%v
end
C:\temp>ifort /free /standard-semantics p.f
Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.9.0 Build 20230302_000000
Copyright (C) 1985-2023 Intel Corporation.  All rights reserved.

Microsoft (R) Incremental Linker Version 14.34.31937.0
Copyright (C) Microsoft Corporation.  All rights reserved.

-out:p.exe
-subsystem:console
p.obj

C:\temp>p.exe
 Originally: lbound(c%v) =  -1 ; size(c%v) =  2
 Following assignment: lbound(c%v) =  -1 ; size(c%v) =  3
 Following assignment: c%v =  42 43 44 ; expected is 42 43 44
1 Like
$ gfortran blade.f90 && a.out
 Originally: lbound(c%v) =           -1 ; size(c%v) =            2
 Following assignment: lbound(c%v) =           -1 ; size(c%v) =            3
 Following assignment: c%v =           44           0           0 ; expected is           44           0           0

I think this is the usual problem that gfortran has with PDTs.

However, I would point out also that array components of derived types (normal ones, not PDTs) keep their lower bounds when the derived type is an argument. So instead of passing the array itself (where the lower bound changes), pass the information as a derived type, and there are no surprises. Of course, all of the other features about allocation on assignment and move_alloc() still apply to the array itself.
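A minimal sketch of that difference, with hypothetical names (wrap_m, wrapped_t, show_bare, show_wrapped): the same array is passed once directly to an assumed-shape dummy and once inside a derived type.

```fortran
module wrap_m
   implicit none
   type :: wrapped_t
      integer, allocatable :: v(:)
   end type wrapped_t
contains
   subroutine show_bare(v)
      integer, intent(in) :: v(:)        ! assumed shape: lower bound is 1 here
      print *, "bare array:   lbound = ", lbound(v)
   end subroutine show_bare

   subroutine show_wrapped(w)
      type(wrapped_t), intent(in) :: w   ! the component keeps its own bounds
      print *, "wrapped array: lbound = ", lbound(w%v)
   end subroutine show_wrapped
end module wrap_m

program wrap_demo
   use wrap_m
   implicit none
   type(wrapped_t) :: w

   allocate(w%v(-2:2))
   w%v = 0

   call show_bare(w%v)      ! expected lbound: 1
   call show_wrapped(w)     ! expected lbound: -2
end program wrap_demo
```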


You know the ALLOCATABLE version of the data container model of the PDT I show above will fail to retain the lower bound following intrinsic assignment, so why confuse readers by bringing up this attribute?

In defense of what is now much more complicated: it was once a simple and useful feature in pre-F90 Fortran. It got a lot more complicated with the addition of Fortran pointers, ASSOCIATE, allocatable variables, automatic allocation, user-defined types, contained procedures, use association, … Things got less intuitive as the old feature was integrated into the new ones. The only original question was how to retain the non-default range when passing arrays. I admit I now avoid the feature, but when I do use it I tend to combine it with user-defined types, for several reasons, and I find it irritating how non-intuitive it has become alongside newer features, at least for me. The number of comments I write as reminders to MYSELF around every such use, even in code I never expect to pass on to anyone, makes me think something went wrong, although I admit it never bothered me enough to try to think of a clean solution.

Overloaded operators are another powerful feature that I avoid because it can cause confusion.

Am I the only one that is more bothered about the implicit array re-allocation than anything else? I can see more use in custom array bounds than in automatic re-allocation. Also bounds are explicitly defined and can be found with less indirection.

I think the behavior of the first assignment is fine. Since the right side array fits in the left side array, just copy the values in the same order starting at the first index of the left side array.

But if an array is too small to fit the assigned data I would expect an error. If C had not been allocated yet I could at least understand an implicit allocation. But if the indexes are already defined explicitly in the allocation, why the hell would I want the bounds to change with an assignment? Honestly I would rather have the array sliced to fit (with a check-bounds warning if possible) than this.
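For anyone who wants that stricter behaviour today, one possible workaround is a small wrapper. This is just a sketch with made-up names (fixed_assign_m, assign_fixed), not an established idiom. Because the dummy dst is not allocatable, the assignment inside can never reallocate, and a size mismatch stops the program instead of silently changing the bounds.

```fortran
module fixed_assign_m
   implicit none
contains
   ! Hypothetical helper: assign without ever reallocating the destination.
   subroutine assign_fixed(dst, src)
      integer, intent(inout) :: dst(:)   ! not allocatable here, so no reallocation
      integer, intent(in)    :: src(:)
      if (size(src) /= size(dst)) error stop "assign_fixed: size mismatch"
      dst = src
   end subroutine assign_fixed
end module fixed_assign_m

program guard_demo
   use fixed_assign_m, only: assign_fixed
   implicit none
   integer, allocatable :: b(:), c(:)

   allocate(b(3))
   b = [42, 43, 44]
   allocate(c(-1:1))

   call assign_fixed(c, b)
   print *, lbound(c), c                 ! expected: -1 42 43 44
end program guard_demo
```

Allocating c with two elements instead would hit the error stop rather than resizing c.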


@RedHatTurtle, I think I agree with you, as I use both Gfortran and a Fortran 95 compiler. For the case of auto-allocate their behaviour is different, which is worrying when using old F90/95 code.
In many cases, auto allocate syntax is not allowed in F95, but I have had some code where the auto-allocate is unexpected for allocatable arrays in F95 code.
As for use of non-default bounds, when using legacy code, a common error can be a zero array subscript. This error can be more easily managed for a zero lower bound array.

Allocation on assignment was introduced in F2003, so a standard-conforming F95 compiler should produce a runtime error when attempting such a thing.

There are some examples where this is not the case.
Essentially, pre F2003 code can have different results due to auto-allocation.

c = b, where c is either not allocated or allocated with a different size than b, was illegal code before F2003, and thus had undefined behavior.