Size of an unallocated array

I have faced the same problem like others. I have tried to use the size function to a not-yet-allocated or recently deallocated array which was giving a positive integer; a misleading behaviour.

After identifying the pitfall I searched and found that it is not by the book. It is not eligible to ask the size of an allocatable array which is not yet allocated or it is deallocated.

Wouldn’t be more safe and prudent if

  • compiler raised an error like in so many other cases
  • size was set to a negative number following the error/status triple method; negative, zero, positive value.

It looks like there is a hole (size trap) in the middle of a one-way road. The (politicians but not engineers in) Town Hall has just put a sign in the beginning/entrance (fortran documentation) of the road that there is a ‘danger’ and leave it unsafe there for years. New people entering either from the beginning of the road without paying attention to the sign (read the doc but did not pay attention to this detail) or from a small street after the beginning (people with basic knowledge) fall inside the hole.

Don’t you believe that it would be more safe and prudent if the Town Hall

  • put a rigid frame (setting the size to a negative integer) around the hole, or
  • closed the hole permanently (compiler alarm)?

Tia

3 Likes

Regarding a compiler error, maybe one could be issued for simple cases, but other cases would not be detectable at compile time. I made the second suggestion at the GitHub forum for Fortran proposals, where it was suggested that the user create a generic function with the desired behavior:

module mySizeModule
 contains
  pure integer function mySize(x)
    class(*), dimension(..), intent(in), optional :: x
    mySize = -1
    if (present(x)) mySize = size(x)
  end function
end module

use mySizeModule
integer, allocatable :: a(:), b(:)
allocate(a(10))
print *, mySize(a), mySize(b)
end
3 Likes

@Beliavsky

I made the second suggestion at the GitHub forum for Fortran proposals, where it was suggested that the user create a generic function with the desired behavior:

So, one good man in the town wrote on a piece of hard paper “Pay attention! Here is the hole!” and post it in situ. I suspect (politicians in) the Town Hall are happy and very proud that they give to a citizen the opportunity to prove how good community they are! :wink: :smiley:

Talfyt and the workaround.

I’m sure there is, but I also 100% agree I don’t understand the reasoning behind these intrinsic size, lbound, etc. functions not being valid when an array is not allocated. That said,

  • in gfortran you can turn on checks for that by running gfortran -fcheck=pointer

  • my coding practice is to ALWAYS wrap a size intrinsic with a merge:

s = merge(size(x),0,allocated(x))

I know this is ugly, non-conformant, blabla. But for me, using an external function just for size(x)!? not an option. And it works great because I’m always sure I will never fail into that mistake again…

4 Likes

@FedericoPerini I will never fail into that mistake again, too. But other people will fail and unfortunately they will waste time to find out what is wrong. Talfyt and workaround, too.

2 Likes

Maybe it is better for a program to fail at run-time than to silently continue with strange results. What would you have lbound return when an array is not allocated? You could say -1, but in Fortran that is a valid lower bound for an array. I would not advise using merge as shown in your line of your code, since a zero-sized array is valid and sometimes useful and is distinct from an unallocated array.

@Beliavsky In fact, before sending us your module, I used to deallocate the array and right next I reallocate it as a zero size array. At least, I was sure that array was empty and size was zero.

Maybe it is better for a program to fail at run-time than to silently continue with strange results.

It is the safest way! And, as an engineer, I say “SAFETY FIRST!” :wink: :grinning:

1 Like

Yeah but how about performance? Calling a function from another module in another file will take time. One can instead define their size definition by saying I.e. size=-1, l bound=0, ubound=-1 (all three consistent) and even extend that to multiple dimensions.

1 Like

Is testing the allocated status such a big deal?

It’s one more thing to remember to avoid undefined behavior. Some enjoy the balancing act of staying on the tightrope over the various pitfalls of UB found in certain languages. Others would rather see that their tools were designed to make such unpleasant surprises impossible in the first place. I would guess that OP falls into the latter category since they posted about it.

@PierU No, it is not a big deal at all. The problem is not the status of the array, it is the mismatch between status and size.

Further, the next concern is: “How many things of this kind I shall discover in future which will waste my time?”

Moreover, I could number of things for which you could ask the same question if they were not already available?

Finally, Town Hall could also ask “Is looking downwards such a big deal to avoid the hole?”

Sometimes it is. If a dummy array is not declared allocatable, then it is illegal to inquire its allocation status, even if the actual argument is allocatable. When writing library routines with allocatable arguments, each routine must check for allocation status before doing anything, accessing its values, testing for bounds and size, reallocating, etc. The programmer cannot assume that the arguments are always available in the expected state.

Many of these persnickety situations involving allocatable arrays could be avoided if it were possible to declare an allocatable array to be in an allocated state. Presently, it is only possible to declare an allocatable array (or any allocatable entity, scalar, derived type component, etc.) to be in an unallocated state.

If it were possible to declare the array to be allocated, possibly with initial values, then many programs could be written with no tests for allocation status and with no explicit deallocate() or allocate() statements anywhere. The array would be declared in an allocated state, and the only changes to that would be through assignment statements which might change the size or bounds of the allocated array, but not its allocation status. This could all be done in a backwards compatible way, using new syntax that does not conflict with any previous codes.

However, this should not be made the new default. Sometimes it is very useful to have arrays that can have both allocated and unallocated status.

4 Likes

One could say that

real, allocatable :: x(:), y(0), z(3)

is equivalent to the current

real, allocatable :: x(:), y(:), z(:) 
allocate (y(0),z(3))

Could that work?

If the variables are declared in a procedure, do you allow the dimensions to be set by a procedure argument, as with automatic arrays? Then

subroutine foo(n)
integer, intent(in) :: n
real, allocatable :: x(n)

would be allowed.

1 Like

@FedericoPerini ,

Dunno about ugly but note you can create trouble for yourself by using MERGE and possibly set off a ton of “discourse” online by bringing up this troublesome intrinsic!!.

Starting with Fortran 2023 processors, you will be able to consider the safer alternative

s = ( allocated(x) ? size(x) : xx )  !<-- where 'xx' is a suitable `expr`, hopefully one that does not evaluate to zero in any and all circumstances as that can be misleading also
2 Likes

I am not sure about this. Fortran doesn’t do short circuit evaluations. A compiler vendor might but that’s not conforming nor portable across compilers and their versions. Had to chase a few such bugs myself when we updated our compiler.

There are probably reasons behind this behavior:

  • consistency: if inquiry fonctions like size(), lbound(), etc, were authorised on unallocated arrays, then (as already mentioned) what should lbound() return? Any number is a valid lower bound… The same for ubound() So you may wrongly conclude that the array is allocated. So size() would be the only one having a defined behavior on a unallocatable array… but then what would the point of allocated() if size() was also a way to test the allocated status?
  • efficiency: the Fortran standard generally doesn’t require runtime checks. This is intentional. We could imagine many of them, and they would help tracking down bugs, but they come at a price in terms of performances. However, during the development phase, the compilers propose various options to enable some runtime checks.

From my experience, using an unallocated array as if it was allocated (accessing elements, deallocating it…) sooner or later result in a crash (often segmentation violation)

I would say that it is illegal to pass an unallocated array if the dummy argument is not allocatable, so there’s nothing to really test in the routine.

There is no such thing as the size of an unallocated array. There’s a big difference between “this bucket is empty” and “I don’t have a bucket”.

Now, to put it in terms of the Fortran language and the intrinsic procedure size, it’s interface is something like,

pure function size(arr)
  real, intent(in) :: arr(:)
  integer :: size
end function

which does not take an allocatable array. It is illegal to pass an unallocated argument to a procedure who’s argument is not allocatable. And if the code is invalid, it’s allowed to return whatever it wants (or set your computer on fire). You really are asking “How big is this thing I don’t have?”

3 Likes

Consider this example:

program xxx
   real, allocatable :: a(:)
   call sub( a )
contains
   subroutine sub(a)
      real, intent(in) :: a(:)
      write(*,*) 'Hello World'
      write(*,*) 'size=', size(a)
      return
   end subroutine sub
end program xxx

If you are writing sub(), and someone else will write call sub(a) in the future, then how do you protect yourself from accessing the unallocated array? You can’t inquire its allocation status, because the dummy array is not allocatable. You can’t make the dummy argument allocatable because you want it to work with all kinds of arrays. So as I said, sometimes it is difficult to make all the pieces fit.

In your program, the array was allocated with those bounds, (1:0), so of course that is what the ubound() and lbound() functions should return. The question at hand is what should those functions return if the array had not been allocated.

Here is the basic truth.

The message to the Fortran practitioners is generally RTFM , though the thingy corresponding to “M” here is often difficult for a practitioner to get at, or it can be unclear, or be missing with the details. The net effect is an unfriendly environment for practitioners.

Practitioners are not paramount in the Fortran language development, Fortran is a prisoner to the budget constraints and a whole lot of other resource related issues of compiler vendors. Sheer failure of imagination, when it comes to delighting the practitioner, has pervaded the environment, that is until @certik came along.

And so this is where the vision and approach by @certik with LFortran truly stands out in terms of consistently caring for the practitioners and striving to serve their needs.

With the situation highlighted in this thread by @FLNewbiee with a reference to an unallocated allocatable object even in the context of intrinsics such as SIZE and the concerns and needs regarding that which have also been raised by some other readers here, if at all something different is to be done, the best bet is working with @certik et al. in the LFortran space and see how that can drive some positive charge in the practical and efficient use of Fortran with some solutions that might be built into the compiler itself as an performant extension rather than as the user defined function thingy suggested by a compiler writer at Nvidia in the @Beliavsky thread at the J3 GitHub site which is akin to throwing a bone to a …