Accessing zero-length characters

Suppose we want to pass a empty string ‘’ through code, but various places in the code are all ready hard-coded to assume a length 1:N. Is it valid for N=0, such that we access the string as string(1:0)? Consider this example:

module t
    implicit none

    contains

    subroutine test(x,y)
        integer, intent(in) :: y
        character(len=*), intent(in) :: x

        write(*,*) 'Test output:"',x(1:y),'" len=',len(x)

    end subroutine test

end module t


program tt
    use t
    implicit none

    call test('    ',1)
    call test(' ',1)
    call test(' ',0)
    call test('',0)

end program tt

I get this output (with warnings and checking on):

gfortran -Wall -fcheck=all teststr.f90 
./a.out 
 Test output:" " len=           4
 Test output:" " len=           1
 Test output:"" len=           1
 Test output:"" len=           0

If I try to access as:

call test('',1)

Then i get an array out of bounds for accessing x(1:1) for a 0 length string.

It feels wrong to access the string as (1:0) as my upper bound < the lower bound without also specifying a stride, but it seems to work?

It is the same as accessing a zero-sized array. The (1:0) section specification precisely means that you select zero elements (or characters). Zero-sized arrays (zero-length strings) make life a lot easier as there are less corner cases that require special treatment.

Yes, the standard defines this much more specifically than in the past; but I have found a lot of compilers that do not fully implement it all. One difference I see between compilers is whether the test for zero length is done first or if in the string STR(N:M) does N have to always be at least one so I would assume N has to be at least one to avoid the issue for now myself.

! 9.4.1     Substrings
! A substring is a contiguous portion of a character string (7.4.4).
! R908     substring              is  parent-string ( substring-range )
! R909     parent-string          is  scalar-variable-name
! or array-element
! or coindexed-named-object
! or scalar-structure-component
! or scalar-constant
! R910     substring-range       is  [ scalar-int-expr ] : [ scalar-int-expr ]
! C908     (R909) parent-string shall be of type character.
! 
! The value of the first scalar-int-expr in substring-range is the
! starting point of the substring and the value of the second one is the
! ending point of the substring. The length of a substring is the number of
! characters in the substring and is MAX (l −f +1, 0), where f and l are
! the starting and ending points, respectively. Let the characters in
! the parent string be numbered 1, 2, 3, ..., n, where n is the length of
! the parent string. Then the characters in the substring are those from
! the parent string from the starting point and proceeding in sequence
! up to and including the ending point. If the starting point is greater
! than the ending point, the substring has length zero; otherwise, both
! the starting point and the ending point shall be within the range 1, 2,
! ..., n. If the starting point is not specified, the default value is
! 1. If the ending point is not specified, the default value is n.
! 
! Examples of character substrings are:
program testit
implicit none
character(len=*),parameter :: all='(*(g0))'
character(len=10) :: b(2)=['abcdefghij','ABCDEFGHIJ']
type struct
   character(len=:),allocatable :: name
end type struct
type(struct) :: p
character(len=:),allocatable :: ID
integer :: N,M

   print all, 'A) ',B(1)(1:5)     ! array element as parent string
   p=struct('XYZ')
   print all, 'B) ',P%NAME(1:1)   ! structure component as parent string
   ID='123456789'
   print all, 'C) ',ID(4:9)       ! scalar variable name as parent string
   N=1
   M=4
   !not implemented in a significant number of compilers
   !print all, 'D) ',’0123456789’(N:M) ! character constant as parent string

   print all,  'E) ',id(3:2)
   print all,  'G) ',id(1:-10)
   ! argueable
   ! start index is less than one
   print all,  'H) ',id(0:-10)
   !print all,  'I) ',id(0:3) 
   ! should fail
   !print all,  'J) ',id(-20:-10)
   ! non-intuitive
   !print all,  'F) ',id(3:2:-1) ! not allowed but seems like it should be

end program testit

There is a stride, it is just an implicit stride of 1.

One other thing that works, when it seems like it maybe it shouldn’t, is that a zero-length character may be accessed by any zero-length substring without error. For example, zc(1:0), zc(2:1), zc(2:0), and so on are all allowed, no matter how the zero-length character zc is declared. As long as the result is zero-length, the actual bound values don’t matter.

A similar thing is true also for accessing zero-length array slices in size-zero arrays. Also related to this is lbound(za)==1 and ubound(za)==0 for any zero-length array za or for any zero-length array slice, no matter how za is declared.

integer :: za(3:2)
write(*,*) 'lbound=', lbound(za), ubound(za)

will always print 1 and 0 for the two bounds, regardless of how za is actually declared.