Agree that this is an accurate assessment of where things are at the moment. But that is just a product of what is perceived to be achievable given the limited resources.
One can see it as a sheer failure of imagination that such an interface would be the case. Compilers can do better, imagine what FORTRAN I compiler archived or how variadic arguments became supported way back. Thus the Fortran standard bearers can do much better for the practitioners if they can get their act together, one aspect of which will be to engage more with the Community to figure out ways (crowd sourcing Fortranners globally for example for various tasks) to get more work done faster and advance the language.
@FLNewbiee, excellent design questions about Fortran. Here is my answer:
I think the best approach is to give a runtime error, since the operation is not defined.
Here is how you can ensure the compiler gives a runtime error with GFortran:
$ cat a.f90
integer, allocatable :: a(:), b(:)
allocate(a(10))
print *, size(a), size(b)
end
$ gfortran -fcheck=pointer a.f90
$ ./a.out
At line 3 of file a.f90
Fortran runtime error: Allocatable argument 'b' is not allocated
Error termination. Backtrace:
#0 0x1024b3c07
#1 0x1024b47a7
#2 0x1024b4be3
#3 0x1020c3d13
#4 0x1020c3d8f
(The stacktrace doesnât work on my macOS, I havenât had time to figure out how to fix it, I compiled GFortran using Spack.)
There are multiple approaches, it seems in this case size should just give a runtime error.
But I want to comment on the general approach: I recommend to think about Fortran in two modes: Debug build and Release build.
Debug build: build quickly, all checks enabled, so the Fortran code either will give a compiler error at compile time, or a runtime error. Ideally it canât segfault or give any kind of undefined behavior.
Release build: can build slowly, but runs as quickly as possible. Checks are disabled, so it can segfault or produce bad numbers. If that happens, switch back to Debug mode and youâll get a compile time or runtime error.
One can also think of a third mode: ReleaseSafe, which will run as fast as possible, but it will leave all checks on. That might be needed if you run the code as part of a webbrowser (letâs say), but for typical numerical codes you donât need it.
Currently the compilers do not expose these modes easily, as a user you need to provide the appropriate options by hand. We are trying to fix that with fpm which has the two modes with some decent default options.
It occurred to me after I posted this that there is an obscure feature of fortran that could be used to address some of this. Consider the following:
program xxx
real, allocatable :: a(:)
call sub( a )
contains
subroutine sub(b)
real, intent(in), optional :: b(:)
write(*,*) 'present(b)=', present(b)
if(present(b)) write(*,*) 'size=', size(b)
return
end subroutine sub
end program xxx`
This compiles to give the output:
$ gfortran unalloc.f90 && a.out
present(b)= F
So in this way, the programmer of sub() can protect against accessing an unallocated array. Note that dummy array b(:) is not allocatable, so it will work with all kinds of actual arguments (arrays, pointers, allocatable arrays, constants, expressions, etc.). Perhaps a possible problem is that the subroutine must have an explicit interface. Thatâs not much of a restriction, but it is an important one.
However, one needs to be quite fluent with fortran to understand how this works. It is an obscure feature, one that looks like a bug if you donât know the trick.
As rediscovered upthread with the OPTIONAL attribute of the received argument and which was also posted at the J3 GitHub site when @Beliavsky opened a thread there over a year ago, the âinterfaceâ to SIZE intrinsic with the default integer result can be envisioned as follows should the standard decide to support the case of unallocated allocatable objects:
module intrinsic_m
..
contains
simple function size( x ) result( r )
type(*), intent(in), optional :: x(..) !<-- any type, any rank
integer :: r
r = xx !<-- = -1? pick your "poison" in terms of result of unallocatated objects
if ( present(x) ) r = ..
end function
..
end module
Doesnât the trick depend on the compiler having one and the same internal representation (âdescriptorâ) for two different entities, namely,
an optional argument that is not present, and
an allocatable array that is not currently allocated?
Current compilers may make the trick work, but I think depending on their continuing to do so is risky. For instance, a future implementation may provide three states for an allocatable array: (1) never allocated (2) currently allocated and (3) previously allocated and subsequently deallocated.
It is required by the standard to work. Of course, that could change in the future with some language change that is not backwards compatible. I forget when this became part of the standard. F2008 maybe?
I doubt that is true, and I also doubt that there is any data anywhere to know what users think about this question.
Most users who use allocatable arrays use the allocation status as a normal part of the data information. For example, if you build a tree structure with allocatable nodes, then the allocation status is the natural way to denote branch termination. There is no need to keep a separate flag, as is typical in other languages that use pointers for these kinds of data structures.
As I wrote earlier, such a call is illegal Fortran, and I donât have to protect my routine against an illegal call. Yes, your âtrickâ with optional works, but in the general case this can be a too strong constraint that can have its own pitfalls (position of the argument, or sending the message that this argument is optional while this is just a programming trick)
The âMâ can simply be âasking on a forumâ: this is neither unclear nor unfriendly.
We may also notice that for many users, depending on their background, the concept of an âempty arrayâ (or âzero-sized arrayâ) is not granted. A classical mistake that can be seen on forums is allocate(x(0)) then âwhy canât I access x(0)??â At some point you have to learn (whatever the way you learn) the characterics of the language you pretend developing with.
Could the size intrinsic be extended with an optional noalloc argument that specifies the return value when the argument is not allocated? This would not change the behavior of current code but allow more concise syntax that some people want. One could then write
@Beliavsky as it is currently undefined behaviour to query the size of an unallocated array it should be sufficient to define itâs behaviour for an unallocated array (for example to return 0) if that would be desirable.
I agree that thereâs an important difference between a zero sized array an an unallocated one and I have used this in my code on occasions. But being able to query allocation status and array size with one function would indeed be handy. If one would need to treat an unallocated and a zero sized array differently one could always use allocated to separate the two of them.
@kargl summarized nicely what the standard says. As far as the standard goes:
And the idea (as I understand it) is that compilers should ensure that nonconforming code will give either a compile time or a runtime error (in Debug mode). And GFortran does, in this case.
So there is no problem, as long as we have good compilers.
@certik, you may want to clarify further. Note âDebug modeâ (or a debugging compiler) can mean differently for different readers. In the GCC space, -g compiler option may mean that. But that is inadequate for âa runtime error (in Debug mode)â you mention.
Thus, depending on the practitioner perspective, it may not be a âthere is no problemâ viewpoint.
One can always the position that one canât make everyone âhappyâ, the users need to âreadâ something - standard whatever i.e., give them the RTFM response in different ways, and that may be ok to some.
But it is up to the user - @FLNewbiee, etc. in this instance - who get to decide whether âthere is no problemâ and ultimately vote with their feet.
@certik, the -fcheck=pointer option you refer to is seen as a code generation convention which the compiler permits when the debug option (-g) is not in effect and when optimizations are used. That is what made me think your use of the phrase âDebug modeâ may be confusing for some other readers.
I think this question, stated in that way, is simple enough. x(0) is the same as x(1:0) in this context because the lower bound is the implied default of 1.
However, there is a little more to this. What about allocate(x(2:1))? That is also a zero size array. What should lbound(x) and ubound(x) be in this case? I was surprised at the answer, and Iâm still not convinced that the standard is right on this one.
@FortranFan not sure I follow. My understanding of the -fcheck options (I actually just use -fcheck=all) is that it generates runtime code for all kinds of checks. It is independent of whether or not you use optimizations like -O3. So roughly speaking:
Debug: -fcheck=all
Release: -O3
ReleaseSafe: -O3 -fcheck=all
There are more options to consider, and we should have a dedicated page for that, but roughly speaking this is the start.
Update: if you want the full options, I use the following:
Well, that is an answer, but itâs not a very good one. A better approach is to recognize the illegal call and tell the other programmer what is wrong. The optional trick can be used to do that at runtime. The compiler flag that results in the runtime pointer error also does that. An even better approach would be to tell the other programmer at compile time. In this case, Iâm not sure that is possible.
I have always been annoyed at the definition of SIZE in the Fortran standard.
It is typical of the IT stereotype of âF⌠youâ
SIZE should never have been default integer type, but an integer consistent with the memory addressing model. Now iso_c_binding is part of the Fortran standard, why is C_SIZE_T not used ?
(Silverfrost FTN95 reserves integer kind=7 for this type, which changes between 32-bit and 64-bit depending on the /64 compile option)
SIZE (array) should return a valid array size if available, not overflow as a negative number.
Negative numbers should be reserved for special cases, such as not allocated or not associated, similar to stat= functionality.
This has been complained about for years, especially since x64, but also a problem with Win32, but never adequately addressed.
Why has the standard committee refused to respond to this functionality problem ?
How often do you query the size? In a typical application Iâd guess the bottleneck are the operations performed on the array elements, and not the size/allocation status query of the array.