Size of an unallocated array

FortranFan · November 1, 2022, 12:37am

everythingfunctional:

Now, to put it in terms of the Fortran language and the intrinsic procedure size, it’s interface is something like,
pure function size(arr)
  real, intent(in) :: arr(:)
  integer :: size
end function
which does not take an allocatable array .

Agree that this is an accurate assessment of where things are at the moment. But that is just a product of what is perceived to be achievable given the limited resources.

One can see it as a sheer failure of imagination that such an interface would be the case. Compilers can do better, imagine what FORTRAN I compiler archived or how variadic arguments became supported way back. Thus the Fortran standard bearers can do much better for the practitioners if they can get their act together, one aspect of which will be to engage more with the Community to figure out ways (crowd sourcing Fortranners globally for example for various tasks) to get more work done faster and advance the language.

certik · November 1, 2022, 1:05am

@FLNewbiee, excellent design questions about Fortran. Here is my answer:

I think the best approach is to give a runtime error, since the operation is not defined.

Here is how you can ensure the compiler gives a runtime error with GFortran:

$ cat a.f90                      
integer, allocatable :: a(:), b(:)
allocate(a(10))
print *, size(a), size(b)
end
$ gfortran -fcheck=pointer  a.f90
$ ./a.out 
At line 3 of file a.f90
Fortran runtime error: Allocatable argument 'b' is not allocated

Error termination. Backtrace:
#0  0x1024b3c07
#1  0x1024b47a7
#2  0x1024b4be3
#3  0x1020c3d13
#4  0x1020c3d8f

(The stacktrace doesn’t work on my macOS, I haven’t had time to figure out how to fix it, I compiled GFortran using Spack.)

There are multiple approaches, it seems in this case size should just give a runtime error.

But I want to comment on the general approach: I recommend to think about Fortran in two modes: Debug build and Release build.

Debug build: build quickly, all checks enabled, so the Fortran code either will give a compiler error at compile time, or a runtime error. Ideally it can’t segfault or give any kind of undefined behavior.
Release build: can build slowly, but runs as quickly as possible. Checks are disabled, so it can segfault or produce bad numbers. If that happens, switch back to Debug mode and you’ll get a compile time or runtime error.

One can also think of a third mode: ReleaseSafe, which will run as fast as possible, but it will leave all checks on. That might be needed if you run the code as part of a webbrowser (let’s say), but for typical numerical codes you don’t need it.

Currently the compilers do not expose these modes easily, as a user you need to provide the appropriate options by hand. We are trying to fix that with fpm which has the two modes with some decent default options.

RonShepard · November 1, 2022, 1:46am

It occurred to me after I posted this that there is an obscure feature of fortran that could be used to address some of this. Consider the following:

program xxx
   real, allocatable :: a(:)
   call sub( a )
contains
   subroutine sub(b)
      real, intent(in), optional :: b(:)
      write(*,*) 'present(b)=', present(b)
      if(present(b)) write(*,*) 'size=', size(b)
      return
   end subroutine sub
end program xxx`

This compiles to give the output:

$ gfortran unalloc.f90 && a.out
 present(b)= F

So in this way, the programmer of sub() can protect against accessing an unallocated array. Note that dummy array b(:) is not allocatable, so it will work with all kinds of actual arguments (arrays, pointers, allocatable arrays, constants, expressions, etc.). Perhaps a possible problem is that the subroutine must have an explicit interface. That’s not much of a restriction, but it is an important one.

However, one needs to be quite fluent with fortran to understand how this works. It is an obscure feature, one that looks like a bug if you don’t know the trick.

FortranFan · November 1, 2022, 2:45am

As rediscovered upthread with the OPTIONAL attribute of the received argument and which was also posted at the J3 GitHub site when @Beliavsky opened a thread there over a year ago, the “interface” to SIZE intrinsic with the default integer result can be envisioned as follows should the standard decide to support the case of unallocated allocatable objects:

module intrinsic_m
   ..
contains
   simple function size( x ) result( r )
      type(*), intent(in), optional :: x(..)  !<-- any type, any rank
      integer :: r
      r = xx  !<--  = -1?  pick your "poison" in terms of result of unallocatated objects
      if ( present(x) ) r = ..
   end function
   ..
end module

mecej4 · November 1, 2022, 5:18am

Doesn’t the trick depend on the compiler having one and the same internal representation (“descriptor”) for two different entities, namely,

an optional argument that is not present, and
an allocatable array that is not currently allocated?

Current compilers may make the trick work, but I think depending on their continuing to do so is risky. For instance, a future implementation may provide three states for an allocatable array: (1) never allocated (2) currently allocated and (3) previously allocated and subsequently deallocated.

RonShepard · November 1, 2022, 5:35am

It is required by the standard to work. Of course, that could change in the future with some language change that is not backwards compatible. I forget when this became part of the standard. F2008 maybe?

RonShepard · November 1, 2022, 7:31am

I doubt that is true, and I also doubt that there is any data anywhere to know what users think about this question.

Most users who use allocatable arrays use the allocation status as a normal part of the data information. For example, if you build a tree structure with allocatable nodes, then the allocation status is the natural way to denote branch termination. There is no need to keep a separate flag, as is typical in other languages that use pointers for these kinds of data structures.

PierU · November 1, 2022, 9:22am

As I wrote earlier, such a call is illegal Fortran, and I don’t have to protect my routine against an illegal call. Yes, your “trick” with optional works, but in the general case this can be a too strong constraint that can have its own pitfalls (position of the argument, or sending the message that this argument is optional while this is just a programming trick)

The “M” can simply be “asking on a forum”: this is neither unclear nor unfriendly.

We may also notice that for many users, depending on their background, the concept of an “empty array” (or “zero-sized array”) is not granted. A classical mistake that can be seen on forums is allocate(x(0)) then “why can’t I access x(0)??” At some point you have to learn (whatever the way you learn) the characterics of the language you pretend developing with.

Beliavsky · November 1, 2022, 10:33am

Could the size intrinsic be extended with an optional noalloc argument that specifies the return value when the argument is not allocated? This would not change the behavior of current code but allow more concise syntax that some people want. One could then write

real, allocatable :: x(:)
n = size(x,noalloc=-1)

plevold · November 1, 2022, 11:20am

@Beliavsky as it is currently undefined behaviour to query the size of an unallocated array it should be sufficient to define it’s behaviour for an unallocated array (for example to return 0) if that would be desirable.

I agree that there’s an important difference between a zero sized array an an unallocated one and I have used this in my code on occasions. But being able to query allocation status and array size with one function would indeed be handy. If one would need to treat an unallocated and a zero sized array differently one could always use allocated to separate the two of them.

certik · November 1, 2022, 2:17pm

@kargl summarized nicely what the standard says. As far as the standard goes:

And the idea (as I understand it) is that compilers should ensure that nonconforming code will give either a compile time or a runtime error (in Debug mode). And GFortran does, in this case.

So there is no problem, as long as we have good compilers.

FortranFan · November 1, 2022, 4:19pm

@certik, you may want to clarify further. Note “Debug mode” (or a debugging compiler) can mean differently for different readers. In the GCC space, -g compiler option may mean that. But that is inadequate for “a runtime error (in Debug mode)” you mention.

Thus, depending on the practitioner perspective, it may not be a “there is no problem” viewpoint.

   integer, allocatable :: x(:)
   print *, "size(x) = ", size(x)
end

C:\temp>gfortran -g p.f90 -o p.exe

C:\temp>p.exe
 size(x) =           41

One can always the position that one can’t make everyone “happy”, the users need to “read” something - standard whatever i.e., give them the RTFM response in different ways, and that may be ok to some.

But it is up to the user - @FLNewbiee, etc. in this instance - who get to decide whether “there is no problem” and ultimately vote with their feet.

certik · November 1, 2022, 5:25pm

I clarified above (Size of an unallocated array - #26 by certik) what I mean by a Debug mode. Let me know if it is not clear.

FortranFan · November 1, 2022, 6:21pm

@certik, the -fcheck=pointer option you refer to is seen as a code generation convention which the compiler permits when the debug option (-g) is not in effect and when optimizations are used. That is what made me think your use of the phrase “Debug mode” may be confusing for some other readers.

Please refer to this page at gcc.gnu.org for gfortran:
Code Gen Options (The GNU Fortran Compiler)

So with optimizations in effect and no debug setting i.e., colloquially the Release mode:

C:\temp>gfortran -O3 -fcheck=pointer p.f90 -o p.exe

C:\temp>p.exe
At line 2 of file p.f90
Fortran runtime error: Allocatable argument 'x' is not allocated

Error termination. Backtrace:

Could not print backtrace: libbacktrace could not find executable to open
#0  0xade8fb2a
#1  0xade87131
#2  0xade8327b
#3  0xade615a9
#4  0xadeabb2f
#5  0xade613bd
#6  0xade614f5
#7  0xff277033
#8  0xff5a26a0
#9  0xffffffff

RonShepard · November 1, 2022, 8:02pm

I think this question, stated in that way, is simple enough. x(0) is the same as x(1:0) in this context because the lower bound is the implied default of 1.

However, there is a little more to this. What about allocate(x(2:1))? That is also a zero size array. What should lbound(x) and ubound(x) be in this case? I was surprised at the answer, and I’m still not convinced that the standard is right on this one.

certik · November 1, 2022, 8:54pm

@FortranFan not sure I follow. My understanding of the -fcheck options (I actually just use -fcheck=all) is that it generates runtime code for all kinds of checks. It is independent of whether or not you use optimizations like -O3. So roughly speaking:

Debug: -fcheck=all
Release: -O3
ReleaseSafe: -O3 -fcheck=all

There are more options to consider, and we should have a dedicated page for that, but roughly speaking this is the start.

Update: if you want the full options, I use the following:

Debug: -Wall -Wextra -Wimplicit-interface -Wno-unused-function -fPIC -fmax-errors=1 -g -fcheck=all -fbacktrace -ffpe-trap=invalid,zero,overflow -finit-real=snan -finit-integer=-99999999.
Release: -Wall -Wextra -Wimplicit-interface -Wno-unused-function -fPIC -fmax-errors=1 -O3 -march=native -ffast-math -funroll-loops.

RonShepard · November 1, 2022, 10:40pm

Well, that is an answer, but it’s not a very good one. A better approach is to recognize the illegal call and tell the other programmer what is wrong. The optional trick can be used to do that at runtime. The compiler flag that results in the runtime pointer error also does that. An even better approach would be to tell the other programmer at compile time. In this case, I’m not sure that is possible.

JohnCampbell · November 2, 2022, 5:31am

I have always been annoyed at the definition of SIZE in the Fortran standard.
It is typical of the IT stereotype of “F… you”

SIZE should never have been default integer type, but an integer consistent with the memory addressing model. Now iso_c_binding is part of the Fortran standard, why is C_SIZE_T not used ?
(Silverfrost FTN95 reserves integer kind=7 for this type, which changes between 32-bit and 64-bit depending on the /64 compile option)

SIZE (array) should return a valid array size if available, not overflow as a negative number.
Negative numbers should be reserved for special cases, such as not allocated or not associated, similar to stat= functionality.

This has been complained about for years, especially since x64, but also a problem with Win32, but never adequately addressed.

Why has the standard committee refused to respond to this functionality problem ?

JohnCampbell · November 2, 2022, 7:13am

@kargl
But, why should I control the kind of an intrinsic function ?
That should be the role of the compiler ! especially for default operation.

Sorry, but default compiler performance should be to provide sensible/practical information, not some smart “F… you” got-ya.

ivanpribec · November 2, 2022, 7:34am

How often do you query the size? In a typical application I’d guess the bottleneck are the operations performed on the array elements, and not the size/allocation status query of the array.

Topic		Replies	Views
Arbitrary sized array Help	19	1393	June 13, 2020
Changing size of an array within a loop. Good practice? Help	11	2180	March 22, 2023
Bug or undefined behaviour in passing zero-sized arrays as optional argument? Help	3	580	November 16, 2021
Inquire status of (non)allocatable arrays	16	883	November 5, 2023
Allocatable arrays and lower bounds	38	3921	December 12, 2024

Size of an unallocated array

Related topics